28,000 – 32,000 PLN
net / month
B2B
Full-time: 100%
NEW TQ0102140 Senior Data Scientist/AI Engineer (Reinforcement Learning)
TeamQuest
100% remote (Warsaw)
Data Science
Senior
Min. 5 years of experience
Who are we looking for?
Responsibilities:
- Design and deploy RL environments for large-scale agent evaluation and reinforcement learning experiments.
- Create pipelines for task generation, dynamic datasets, and scripted environments with controlled complexity and stochasticity.
- Develop validators and reward models to automatically evaluate trajectories and assess model inference.
- Collaborate with infrastructure and systems engineers to ensure scalability and reproducibility, and to instrument environments for detailed telemetry.
- Design API interfaces and orchestration structures for running, resetting, and evaluating agents in various environments.
- Optimize environment performance, reward logging, and reproducibility in distributed configurations.
We offer:
- Attractive salaries
- Possibility of full remote work
- Participation in interesting projects
What will you be doing?
Requirements:
- Over 5 years of experience in software engineering with Python.
- At least 3 years of experience as a Data Scientist or Machine Learning/Environment Engineer.
- Working hours from 2:00 PM to 10:00 PM.
- Practical knowledge of AI frameworks and tooling (LangChain, LangGraph, MCP server).
- Extensive practical experience working with AI tools, including prompt engineering and vibe coding.
Additional advantages:
- Knowledge of Codex or Claude Code.
- Experience integrating AI into existing systems.
- Understanding of RL concepts: reward modeling, environment dynamics, verifiability, evaluation, and agent interaction loops.
- Knowledge of tools, metrics, and data pipelines for evaluating RL.
- Ability to plan your own work independently.
What benefits will you receive?
Medical package
Sports package
Where and how will you work?
City center, Warsaw
Work mode: Flexible working hours
Office hours: 7:00-20:00
Work model
On-site
Hybrid
100% remote
Who are we?
Our client is a rapidly growing company specializing in delivering modern cloud solutions and Kubernetes-based applications aimed at enhancing operational efficiency and reducing costs for businesses. Established in 2021, the company has quickly gained recognition in the market by creating advanced SaaS platforms supported by data engineering and machine learning. With offices in San Jose (USA) and Warsaw (Poland), our client collaborates with renowned partners such as Devtron and Tigera to offer corporate clients and startups from the USA and Europe robust, scalable solutions that support digital transformation, improve operational efficiency, and stimulate innovation. The company is currently seeking talented IT professionals ready to work on top-level projects, offering excellent working conditions and the opportunity for development in an international environment.





