We are adding a Cloud Operations Engineer to the team, reporting to the Director of Platform Engineering & Operations. In this full-time position, you will play a critical role in eliminating vulnerabilities and optimizing uptime for our enterprise clients. You will implement resilient solutions that use automation to detect faults, recover automatically when a fault is detected, and prevent previously identified faults from recurring.
Responsibilities
- Address root causes and identify solutions. You will focus on problem-solving, automation, and ensuring a sustained focus on engineering. You will:
  - Architect and implement monitoring, alarm, and logging solutions.
  - Identify issues proactively and mitigate them to improve the customer experience.
  - Audit, test, and review solutions to ensure we deliver a resilient, monitored, highly secure, and complete solution.
- Forecast demand and plan capacity. You will maintain clear visibility into the demand for, and usage of, AWS resources, and you will plan and execute the efficient use of those resources.
- Optimize through automation. You will eliminate manual, repetitive, tactical work that has no enduring value. You will automate solutions, services, and processes to make them more sustainable and scalable, contributing to the continuous improvement of operations so that deployments are managed and maintained efficiently.
- Plan and deploy upgrades, configuration changes, and security patches.
- Drive continuous improvement. You will research and implement best practices in DevOps. You will explore and evaluate new and emerging software tools and technologies.
- Be on call outside of business hours, on a weekly rotation.
- Take ownership not only of your deliverables but of the platform as a whole, and drive its resiliency with an SRE mindset.
- Be hands-on. You will:
  - Assist in the configuration and support of customer environments.
  - Write code for deployments, optimizations, and various tools.
  - Troubleshoot and resolve escalated software and infrastructure issues, acting as a customer-facing escalation point.
  - Review new tools and software prior to implementation.
Qualifications
- Education, qualifications, and experience: You have a Bachelor's degree in Computer Science, Computer Engineering, or Software Engineering, preferably with certification in AWS, Ansible, and Terraform, and you have worked with the DevOps tools Git and Jenkins.
- Exceptional communication and problem-solving skills.
- Experience with collaboration tools like Jira, Confluence, and Jira ServiceDesk.
- You're driven, collaborative, and motivated. You thrive on developing solutions to open-ended business problems.
- Ability to write code.
- Passion for automation and efficiency.
- Ability to work autonomously and as part of a team.
- Proficient in English.
- 40 hours per week, with a weekly on-call rotation.
Our Commitment to You
- Flexible work environment
- Competitive compensation (OTE: Base Salary plus bonus)
- Benefits package
- Learning and Development
- Career Progression
- Culture: a one-team environment founded on respect and collaboration, where we do not take shortcuts and are customer-obsessed