Site Reliability Engineer

OutSolve
Full-time
Remote

About Us
OutSolve is a leading HR compliance solutions provider, helping organizations navigate complex regulatory requirements with confidence. Our products and services span the HR compliance lifecycle, including Form I-9 verification, employment nondiscrimination analytics, pay equity, workplace reporting, and labor law posters. As we expand our offerings, we are looking for a skilled Site Reliability Engineer to join our team and support the rapid growth of our HR compliance platform.

About the Role

We’re seeking a highly skilled Site Reliability Engineer (SRE) to join our engineering team and help ensure the reliability, scalability, and performance of our systems. As an SRE, you’ll blend software engineering with systems engineering to build and maintain resilient infrastructure, automate operations, and drive continuous improvement across our platform.

You’ll be instrumental in designing fault-tolerant systems, implementing proactive monitoring, and streamlining incident response. This role is ideal for someone who thrives in high-ownership environments and enjoys solving complex challenges with code.

Key Responsibilities

  • Design, implement, and maintain scalable, reliable infrastructure across our environments
  • Develop automation for deployment, monitoring, and incident response using infrastructure-as-code and scripting tools
  • Collaborate with development and operations teams to define SLAs/SLOs and improve system performance
  • Build observability into systems through metrics, logging, and tracing
  • Lead root cause analysis and post-mortems for production incidents
  • Optimize alerting workflows to reduce noise and improve signal quality
  • Champion reliability best practices across engineering teams
  • Contribute to capacity planning, disaster recovery, and performance tuning
  • Maintain documentation and runbooks for operational processes and incident response
  • Support evolving systems by performing the manual operational tasks that critical features depend on until those features are automated or fully built
  • Identify opportunities to reduce manual toil through automation, tooling, and process improvement
  • Collaborate with engineering teams to translate recurring manual workflows into scalable, reliable solutions
  • Document interim procedures and ensure visibility into temporary workarounds while long-term fixes are in development

Required Skills & Experience

  • Strong programming/scripting skills (e.g., Node.js, Python, Bash, SQL)
  • Experience with cloud platforms (e.g., Azure) and containers (e.g., Docker)
  • Proficiency with monitoring and observability tools (e.g., Azure Monitor)
  • Familiarity with CI/CD pipelines and automation tools (e.g., GitHub Actions, Terraform)
  • Solid understanding of networking, security, and system architecture
  • Experience with incident management and on-call rotations
  • Excellent problem-solving and communication skills
  • Working knowledge of Jira and Confluence for issue tracking, documentation, and cross-team collaboration

What We Offer

  • 100% remote work environment with flexible scheduling
  • Competitive compensation and benefits package
  • Mission-driven culture focused on empowering organizations through compliance excellence
  • Professional development opportunities in a growing company at the intersection of compliance and technology

Equal Opportunity at OutSolve

OutSolve is proud to be an equal opportunity employer. We consider all qualified applicants without regard to race, color, religion, sex, national origin, disability, or protected veteran status.

Applicants who need a reasonable accommodation to search for a position or submit an application are invited to email jobs@outsolve.com or call 888-414-2410. Equal Employment Opportunity posters are available upon request.