A hybrid role at OpenAI, based in San Francisco, CA. Develop RLHF and post-training methods for multimodal models.
How Sydicom helps: we read this listing’s requirements and tune your CV and cover letter to the keywords its ATS (Ashby) is scanning for, wherever you are, then help you apply.
Original listing text, shown exactly as published by the company.
We are looking for a Research Engineer / Scientist to join the Future of Computing Research team to work on RLHF and post-training for personalized, multimodal AI systems.
This role will focus on building the learning and evaluation foundations that help models become more context-aware, adaptive, and useful over time. You will work on problems such as reward modeling, preference learning, long-horizon evaluation, and policy improvement for systems that must make high-quality behavioral decisions in realistic user settings. The work is deeply product-grounded: success is not just higher benchmark performance, but better model behavior in real-world use.
The ideal candidate is excited about pushing beyond one-turn assistant behavior toward systems that improve through feedback, learn from richer signals, and are trained against meaningful notions of user value. Internally, that maps closely to the need for careful reward design, feedback loops, and evaluation frameworks that test whether interventions are actually beneficial over longer horizons.
This role is based in San Francisco, CA. We use a hybrid work model of four days in the office per week and offer relocation assistance to new employees.
OpenAI
OpenAI is an American artificial intelligence (AI) research organization headquartered in San Francisco, consisting of OpenAI Group PBC, a for-profit public benefit corporation (PBC), partially controlled by OpenAI Foundation, a nonprofit. OpenAI developed the generative pre-trained transformer (GPT) family of large language models, the DALL-E series of text-to-image models, and the Sora series of text-to-video models, which have influenced industry research and commercial applications. Its release of ChatGPT in November 2022 has been credited with catalyzing the AI boom and widespread interest in generative AI.
Source: Wikipedia