Original listing text, shown exactly as published by the company.
The Role
As a Senior Platform Engineer, you are a champion for DevOps and SRE culture and industry best practice within Megaport. You will work alongside talented team members in multiple timezones, ensuring that systems are secure, maintainable and available. Outside the team, you will engage with stakeholders in requirements analysis and demonstrations. Technically, you will be very hands-on, continually evolving your skills through a mix of peer reviews and research. Ultimately, your obsession is customer success and ensuring company goals are met.
What You Will Be Doing
- Improving production reliability and system resilience within an SRE-scoped team
- Championing high standards of work and industry best practices
- Communicating with teams and stakeholders at all stages
- Bringing fresh ideas to the table and encouraging others
- Diving into complex technical problems with a can-do attitude
- Working across numerous technologies in a fast-changing industry
- Participating in on-call rotation, incident response, and blameless post-incident reviews
- Writing code, handling alerts, improving solutions, and supporting others
- Playing a crucial role in the success of your company and team
What We Are Looking For
- 5+ years administering Linux systems and related infrastructure in production environments
- A collaborative SRE mindset, with familiarity with SLIs/SLOs/SLAs, error budgets, blast radius, and blameless postmortems
- A focus on automation, reducing toil, and preventing problem recurrence
- A track record of writing runbooks that work for the broader team, not just yourself
- Strong Kubernetes and broader ecosystem fundamentals
- Cloud infrastructure experience; AWS strongly preferred and bare-metal is a bonus
- Strong tool development skills - Bash, plus Python or Go preferred, or a similar language
- Infrastructure-as-code tooling experience - Terraform preferred
- CI/CD and version control, GitHub preferred
- Database experience - one of Postgres, Cassandra, or ClickHouse preferred
- Experience operating a production observability stack (metrics, logs, traces), with an eye for signal over noise
- Comfortable working on live production infrastructure, with strong troubleshooting instincts and ownership of incident response
- A history of continual professional development
- A self-directed style suited to an async, globally distributed team, and comfort picking up adjacent work when the situation calls for it