---
title: RL Environments Engineer
company: Verita HR
category: Data Science
subcategory: Data Science
experience_level: Senior
work_mode: remote
location: San Francisco
employment_type: B2B
salary_min: 35000
salary_max: 74000
salary_currency: PLN
salary_period: month
technologies: [Python, Docker, LLM, ML, Reinforcement Learning, AI]
posted: 2026-04-20
valid_through: 2026-06-21
url: "https://solid.jobs/offer/31024/verita-hr-rl-environments-engineer"
---

# RL Environments Engineer — Verita HR

## Key information

- **Company:** Verita HR
- **Location:** Centrum, San Francisco
- **Work mode:** 100% remote
- **Salary:** 35k–74k PLN net/month (B2B)
- **Employment type:** B2B
- **Workload:** 100% (full-time)
- **Working hours:** Project-based work
- **Experience level:** Senior
- **Minimum experience:** 5 months
- **Category:** Data Science
- **Specialization:** Data Science
- **Publication date:** 2026-04-20
- **Active until:** 2026-06-21

## Technologies and skills

**Required:**

- Python: expert
- Docker: advanced
- LLM: advanced
- ML: advanced
- Reinforcement Learning: basic
- AI: basic

## Languages

- English: advanced

## Job description

**About the company:** US-based AI startup focused on building the next generation of training data for LLMs. The team partners with top AI labs to create realistic RL environments where models encounter research and engineering challenges, iterate, and learn from feedback, pushing AI closer to its full potential.

**Project:** Design and build reinforcement learning environments to teach LLMs advanced reasoning and modern ML concepts. Candidates will work on realistic feedback loops where models encounter research and engineering problems and iterate on solutions.

**What you will do:**

- Build and maintain RL/ML environments for LLM training
- Implement robust, production-quality Python code (not just notebooks)
- Deploy and run environments in Docker with focus on reliability and iteration speed
- Analyze model performance and respond to feedback efficiently
- Collaborate with research teams to translate papers and ideas into RL problems

## Who we are looking for

**Client:** US startup

**Recruitment process:** 2 meetings with hiring managers, followed by a phone screen with our recruiter and a technical test

**Work mode:** Fully remote work

**Skills:**

- Strong Python (engineering-quality)
- Docker and production mindset
- Understanding of LLMs and their limitations
- Ability to meet throughput expectations
- Advanced English (C1/C2) and ≥4 hours overlap with US time zones

**Nice-to-have:**

- Deep knowledge of transformer internals and LLM training/inference
- Experience with inference libraries (vLLM, SGLang, etc.)
- CUDA or Pallas kernel development experience
- Publications or open-source contributions in active DL/ML research
- Experience building interactive RL environments and RL-based learning systems

**What's in it for you?**

- Fully remote, flexible work schedule with some overlap with US time zones
- Direct impact on how LLMs learn
- Collaboration with top AI researchers and labs
- Exposure to cutting-edge RL and ML projects

## Locations

- Centrum, San Francisco, USA

## About the company: Verita HR

**Company size:** 80

**Website:** https://veritahr.com

Work for the largest bank in Europe, which operates in more than 65 countries around the world, giving us access to over 90% of all world trade flows. Don’t hesitate to apply and create the future of banking with us!

**Who are we?** Verita HR is an international company providing recruitment support within the #Fintech, #Finance and #Banking markets in EMEA. We connect the most innovative organizations with the best people in the market. We conduct systematic market research, which allows our Digital Teams to be a step ahead of the competition.

## Apply

Apply at: https://solid.jobs/offer/31024/verita-hr-rl-environments-engineer

---

*Source: https://solid.jobs/offer/31024/verita-hr-rl-environments-engineer · Generated: 2026-05-25T17:56:36Z*
