Original listing text, shown exactly as published by the company.
What you will do
This is a cross-team role, and you will have the full support of leadership and engineering in carrying out your responsibilities - it’s not all down to you, but you will show the rest of us what good looks like.
- Evangelise SRE & DRE across engineering
- Lead the charge on building out a framework for data quality that will provide our customers with strong guarantees about the fidelity of our data, as well as support our marketing and revenue functions
- SRE as a function defines and owns the on-call process:
  - Quickly establishing a strong working knowledge of our systems
  - Commanding incidents
  - Running mop-ups
  - Ensuring follow-up actions are completed to your schedule
  - Evaluating and improving our existing E2E on-call process
- Take part in the on-call rotation, one week every 4–5 weeks (24x7x365 coverage)
- Evaluate, manage and maintain our existing solutions for monitoring, alerting, paging, response, documentation
- Report on uptime, availability, performance, etc. across our product suite
- Write post-mortems for both internal and external consumption
- Represent our SRE & DRE function on sales calls with tier-one enterprise financial institutions
- Work with product, sales and customer service to define SLAs for different products and use cases
- Work with internal product teams to define SLOs for internal consumption and measurement
- Work with our engineering teams directly to embed DRE practices
You will be a great fit here if you
- Thrive in high-pressure situations, and are able to make tough decisions quickly
- Fail fast, own the failure, and encourage a blame-free engineering culture
- Are an inspiring thought leader, and are able to take others with you on a journey
- Aren’t afraid to get your hands dirty and dig into code across myriad technologies
- Understand the importance of reliability in enterprise finance systems
- Have strong opinions based on your experience that you evolve over time as you learn from others
- Are fluent in AI and agentic workflows
Our ideal candidate has
- Proven experience leveling up the quality and reliability of large datasets, not just services and APIs
- Experience leading site reliability for a high-volume SaaS product
- Supported distributed systems in AWS
- The presence and empathy required to hold teams to account
- Defined SLAs / SLOs, both internal and client-facing
- Offered post-mortems to enterprise clients (verbal and written)
Bonus Points for
- Having a genuine interest in the crypto ecosystem and being behind the mission of the company
- Working knowledge of Kubernetes and the challenges it presents