Original listing text, shown exactly as published by the company.
Responsibilities
- Build API and integration test coverage for Java/Spring routers and database workflows.
- Design and implement Playwright E2E tests for critical user journeys.
- Create stable mocking and fixture strategies (MSW, test fixtures, seeded test data).
- Set up CI quality gates for linting, unit/integration/E2E tests, coverage, and reporting.
- Expand coverage to untested UI and workflow areas and prevent regressions.
- Build an LLM evaluation discipline with golden datasets and measurable quality gates.
- Track model output reliability, detect drift, and gate releases on quality/stability thresholds.
- Improve test maintainability, remove flakiness, and optimize runtime.
LLM QA Tooling
- Use LangChain4j for JVM-native LLM application support, including agents, RAG, and guardrails.
- Use Langfuse Java SDK and OpenTelemetry for trace capture, prompt/version tracking, and experiment comparison.
- Use the OpenAI Java SDK for structured outputs, function calling, webhook verification, and response validation.
- Use Promptfoo in CI when you need language-agnostic regression and red-team checks around the Java stack.
- Use Java schema validation and guardrail patterns with Jackson, Swagger annotations, and LangChain4j guardrails.
- Use OpenTelemetry-based traces and Langfuse datasets as the reproducible source of truth for release gates.
Qualifications
- 3+ years in QA automation for production systems.
- Strong Java backend testing skills.
- Strong frontend testing skills.
- Experience building and maintaining Playwright/Cypress/Selenium E2E suites.
- API testing experience (REST, schema validation, auth/error scenarios).
- Experience with SQL/PostgreSQL testing and test data management.
- CI/CD pipeline experience (GitHub Actions, Azure Pipelines, or Jenkins).
- Practical understanding of testing non-deterministic LLM outputs.
Nice to have
- Familiarity with Java-compatible LLM evaluation and regression tooling such as Promptfoo and Langfuse, plus trace-based observability and output validation via API/OpenTelemetry integrations.
- Performance and benchmark testing experience.
- WebSocket testing experience.
- Security and dependency scanning awareness.
- Experience in legal-tech, fintech, or other regulated domains.