Original listing text, shown exactly as published by the company.
Responsibilities
- Serve as the product bridge between Cohere's safety research teams and North, ensuring that findings from model evaluations, red-teaming, and behavioral research translate into product-level guardrails, controls, and safeguards.
- Own the safety product roadmap for Cohere and North, prioritizing features based on research findings, observed misuse patterns, evolving threat vectors, and customer requirements.
- Partner with modeling teams to scope and interpret safety evaluations — understanding how Cohere’s underlying models behave across adversarial inputs, edge cases, and high-stakes use cases.
- Define and drive evaluation frameworks for assessing how safety properties hold up as models and product capabilities evolve, ensuring regressions surface before they reach customers.
- Coordinate the development of guardrails and intervention mechanisms — working across research, engineering, and policy to determine where and how safety controls should be implemented within North's product layer.
- Monitor the AI safety research landscape — from prompt injection and jailbreaks to emerging misuse patterns in agentic systems — and ensure North's roadmap reflects what the research is surfacing.
- Build processes for scaling safety review as North's surface area grows, including how new features get assessed for safety risk before launch.
Requirements
- 5+ years of product management or research operations experience, with meaningful time working alongside research or ML teams at a technology or AI company.
- Technical depth sufficient to engage credibly with safety researchers: you don't need to run evals yourself, but you need to understand what they mean and ask the right questions.
- Genuine interest in AI safety and model behavior, including the real-world implications of deploying LLMs in enterprise contexts.
- Comfortable operating in ambiguity — safety research surfaces unexpected findings, and this role requires good judgment about what to act on and how fast.
- Able to work across researchers, engineers, and product teams and keep everyone aligned without flattening the nuance of what the research is actually saying.
- Strong written communicator who can translate complex model behavior findings for non-technical audiences and knows when something needs to be escalated urgently.
Nice-to-Haves
- Hands-on experience with LLM evaluation, red-teaming, safety benchmarking, or behavioral research.
- Familiarity with AI-specific threat vectors: prompt injection, jailbreaks, RAG poisoning, or misuse patterns in agentic systems.
- Background in trust and safety, content policy, or a research-adjacent operational role at a technology company.
- Experience building zero-to-one processes in research or safety contexts.
- Prior exposure to agentic AI systems and the unique safety challenges introduced by tool use, multi-step reasoning, and autonomous execution.
If any of the above doesn't line up exactly with your experience, we still encourage you to apply.
We strive to create an inclusive work environment for all; we welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.
We may use AI-enabled tools to screen and assess applicants against the criteria for this position. This helps our recruiters identify potentially qualified candidates, but it doesn't limit the applications our recruiters may review or consider.