About the role
We are looking for a Site Reliability Engineer to join our Platform SRE team. In this role, you will build and operate the infrastructure, tools, and "paved roads" that empower our developers to deliver scalable, secure, and reliable software with speed and confidence.
You'll work across the entire stack, from infrastructure automation and observability to developer enablement and system reliability. You will be a key collaborator with software engineering and security teams, helping to evolve our Infrastructure as Code (IaC), enhance CI/CD pipelines, and scale our internal developer platform. We value pragmatism and engineering excellence, primarily using Python, Go, and AWS to reduce toil and build self-service capabilities.
Responsibilities
- Infrastructure Automation: Design, build, and scale production environments using AWS and Terraform.
- System Reliability: Improve the resilience and operability of our platform through failure-based testing and automated recovery strategies.
- Developer Enablement: Design and implement reusable platform components and self-service tools to streamline the developer experience.
- Observability: Implement and maintain robust observability practices, including system metrics, distributed tracing, and SLO management.
- Mentorship & Standards: Guide junior engineers, uphold high infrastructure quality, and contribute to the team's evolving best practices.
- Collaboration: Participate in technical design discussions, sharing feedback and adapting strategies based on team input and evolving requirements.
Skills and Qualifications
- Experience: 4+ years in SRE, DevOps, Cloud Engineering, or Software Development roles.
- Cloud Proficiency: Hands-on experience operating and scaling production environments within AWS.
- Infrastructure as Code: Strong expertise with Terraform for managing complex cloud infrastructure.
- Programming: Proficiency in Go or Python, with experience building production-grade automation, tooling, or libraries.
- Containers & Orchestration: Experience with ECS or Kubernetes.
- CI/CD: Familiarity with modern deployment tools, specifically GitHub Actions.
- Communication: Strong written and verbal skills with a knack for evangelizing reliability best practices across the organization.
Bonus Points
- Experience troubleshooting complex distributed systems in a high-traffic production environment.
- Exposure to event streaming systems such as Kafka or Kinesis.
- Experience contributing to Internal Developer Platforms (IDP) or automating self-service infrastructure workflows.
- Familiarity with systems security, compliance requirements, or infrastructure hardening.
Compensation
As a remote-first organization, our compensation reflects the cost of labor across several Canadian geographic markets.
In Alberta, British Columbia, and Ontario, the base salary for this position ranges from $127,800/year up to $159,750/year.
For all other locations, the base salary for this position ranges from $115,000/year up to $143,775/year.
Consistent with applicable laws, an employee's pay within this range is based on a number of factors, which include but are not limited to relevant education, skills, job-related knowledge, qualifications, work experience, credentials, and/or geographic location. Your Recruiter can share more on the specific target salary range for your location during the interview process. Coalition, Inc. reserves the right to modify this range as needed.
Vacancy Status: This posting is for an existing vacancy.
Use of AI: We utilize AI-assisted tools to help us organize and review the high volume of applications we receive. However, our human recruiting team makes all final hiring decisions and interview selections, valuing personal connection over algorithms.
Application Updates: Consistent with our commitment to transparency, if you are interviewed for this role, we will provide a status update on your application within 45 days of your final interview.
Perks
- 100% medical, dental, and vision coverage
- Flexible PTO
- Annual home office stipend and WeWork access
- Mental and physical wellness programs like Headspace, Lumino, and more!
- Competitive compensation and opportunity for advancement