01 Scope of responsibilities
About the company: US-based AI startup focused on building the next generation of training data for LLMs. The team partners with top AI labs to create realistic RL environments where models encounter research and engineering challenges, iterate, and learn from feedback, pushing AI closer to its full potential.
Project: Design and build reinforcement learning environments to teach LLMs advanced reasoning and modern ML concepts. You will work on realistic feedback loops in which models encounter research and engineering problems and iterate on solutions.
What you will do:
- Build and maintain RL/ML environments for LLM training
- Write robust, production-quality Python code (not just notebooks)
- Deploy and run environments in Docker with a focus on reliability and iteration speed
- Analyze model performance and respond to feedback efficiently
- Collaborate with research teams to translate papers and ideas into RL problems
