DevOps Engineer
Responsible for creating, modifying, maintaining, and supporting infrastructure and individual system components that enable business functions and IT Services. Balance availability, security, efficiency, and functional requirements to help provide an optimized production service. Identify and manage existing and emerging risks that stem from business activities, and ensure these risks are escalated so they can be measured, monitored, and controlled.
RESPONSIBILITIES:
- Identifies and manages existing and emerging risks that stem from business activities and the job role.
- Ensures risks associated with business activities are effectively identified, measured, monitored, and controlled.
- Follows written risk and compliance policies and procedures for business activities.
- Designs scalable infrastructure with minimal oversight, implements system change with consistency and autonomy, automates service delivery and maintenance tasks, and builds monitoring and tooling for systems.
- Resolves complex production issues independently by troubleshooting complex systems
- Analyzes complex outages to identify opportunities for improvement
- Experiments with new patterns and technologies
- Understands and consistently exercises engineering best practices, concepts, and patterns
- Understands the customer and identifies solutions and ideas for the customer with some assistance
- Evaluates potential solutions with some oversight
- May begin mentoring junior engineers.
- Liaises with architecture, development, DevOps automation/infrastructure/product delivery, and support teams to help drive customer ticket resolution and SLA compliance with these organizations
- Creates, maintains, and improves service observability tools that generate meaningful insights and action items
- Partners with DevOps Engineering leadership to set and review SLOs, SLAs, and SLIs, and continuously improves the on-call incident response process
QUALIFICATIONS:
- 8 years of infrastructure experience demonstrating depth of technical understanding within one or more specific disciplines or technologies
- Strong experience operating globally distributed, mission-critical systems and designing for high availability and performance
- Strong experience implementing cloud applications in Azure, AWS, or GCP
- Advanced understanding of Kubernetes and microservices/distributed systems architecture
- Strong experience with Infrastructure as Code, such as Terraform, Ansible, Chef, or Puppet
- Strong experience in infrastructure (databases, Active Directory, DNS, firewalls, networking) in Azure and on-premises datacenters
ADDITIONAL REQUIREMENTS:
- Shell scripting, such as Bash
- Java experience
- Writing automation jobs
- Experience administering a multi-tenant application, such as UrbanCode Deploy
- Puppet / Linux Administration
- REST API experience
- Batch job experience