01 Scope of responsibilities

About the company: US-based AI startup focused on building the next generation of training data for LLMs. The team partners with top AI labs to create realistic RL environments where models encounter research and engineering challenges, iterate, and learn from feedback, pushing AI closer to its full potential.

Project: Design and build reinforcement learning environments to teach LLMs advanced reasoning and modern ML concepts. Candidates will work on realistic feedback loops where models encounter research and engineering problems and iterate on solutions.

What you will do:

Build and maintain RL/ML environments for LLM training
Implement robust, production-quality Python code (not just notebooks)
Deploy and run environments in Docker with focus on reliability and iteration speed
Analyze model performance and respond to feedback efficiently
Collaborate with research teams to translate papers and ideas into RL problems

02 Requirements

6 must-have · 1 language

Must-have

Python: Expert
Docker: Advanced
LLM: Advanced
Reinforcement Learning: Basic

Required languages

English: Advanced

03 Candidate profile

Client: US startup
Recruitment process: 2 meetings with hiring managers, followed by a phone screen with our recruiter and a technical test
Fully remote work

Skills:

Strong Python (engineering-quality)
Docker and production mindset
Understanding of LLMs and their limitations
Ability to meet throughput expectations
Advanced English (C1/C2) and ≥4 hours overlap with US time zones

Nice-to-have:

Deep knowledge of transformer internals and LLM training/inference
Experience with inference libraries (vLLM, SGLang, etc.)
CUDA or Pallas kernel development experience
Publications or open-source contributions in active DL/ML research
Experience building interactive RL environments and RL-based learning systems

What's in it for you?

Fully remote, flexible work schedule with some overlap with US time zones
Direct impact on how LLMs learn
Collaboration with top AI researchers and labs
Exposure to cutting-edge RL and ML projects

04 About the company

Verita HR

80 · San Francisco

Work for the largest bank in Europe, which operates in more than 65 countries around the world, giving us access to over 90% of all world trade flows. Don’t hesitate to apply and create the future of banking with us!

Who are we?

Verita HR is an international company providing recruitment support within the #Fintech, #Finance and #Banking markets in EMEA. We connect the most innovative organizations with the best people in the market. We conduct systematic market research, which allows our Digital Teams to stay a step ahead of the competition.


05 Location

City center, San Francisco

Work arrangement:
Project-based work

Office hours: 00-24

Work model

On-site

Hybrid

100% remote

RL Environments Engineer
