
Lead DevOps / SRE (remote)

Keelvar
Full-time
Remote, Germany, and Spain


Role


Keelvar is looking for a Lead DevOps/SRE to join our growing Engineering & Product team. We are proud to be a remote-first company, and this role can be based remotely in Ireland, Germany, Spain, or the UK.

Reporting directly to one of our engineering managers, you will join our team of SREs, who work closely with our cross-functional teams. Each team works on a specific product or product area and includes software engineers, data scientists, product managers, and UI designers.





Responsibilities


  • Provide strategic leadership in defining, planning, implementing, iterating, and maintaining Keelvar’s cloud infrastructure in AWS, ensuring alignment with the company’s goals and scaling requirements.

  • Mentor and guide the SRE team in fostering a continuous deployment ecosystem that enables the product engineering teams to release changes to customers quickly, reliably, and sustainably through automated deployment pipelines.

  • Collaborate closely with engineering leadership and cross-functional teams to drive initiatives that enhance the availability, performance, and resilience of critical services.

  • Lead efforts in infrastructure and application security by partnering with product engineering teams and the security and compliance team to incorporate DevSecOps principles and implement secure infrastructure, enforce defence-in-depth principles, and advocate for best practices in security.

  • Oversee and enhance production system monitoring to ensure optimal availability, latency, and overall system health, and provide strategic recommendations for improvements.

  • Oversee and prioritise tickets and incoming requests to the SRE team, ensuring timely response to incidents, operational tasks, and support requests from product and engineering teams. Develop processes to triage, track, and resolve issues efficiently, and maintain clear communication with stakeholders regarding ticket status and resolution timelines.

  • Develop, test, and evolve disaster recovery plans to ensure business continuity, leading periodic drills to prepare for system failures or disasters.

  • Drive the design and implementation of monitoring and alerting strategies, ensuring application SLAs and SLOs are properly defined, met, and exceeded.

  • Identify and lead initiatives for technical, operational, and process improvements, enabling continuous optimisation of SRE practices and team efficiency.

  • Establish and enforce DevOps best practices across the organisation, promoting a culture of continuous integration, continuous delivery, and high system resilience.

  • Provide expert guidance in load testing, performance profiling, and capacity planning to support product scalability.

  • Maintain comprehensive documentation for infrastructure, processes, and procedures to facilitate knowledge sharing and onboarding.

  • Stay at the forefront of industry advancements by actively researching and implementing emerging SRE and cloud computing best practices to improve our processes and infrastructure.





Your profile


  • 7+ years of experience in SRE or DevOps, with at least 2 years in a leadership or senior engineering capacity.

  • Proven track record in managing and scaling cloud infrastructure, preferably within AWS, with hands-on expertise in automation, infrastructure as code, and cloud-native architectures.

  • Strong technical background in CI/CD pipeline development, with experience building and optimising automated deployment pipelines in agile, product-focused environments.

  • Proficient in DevSecOps principles, with a deep understanding of security best practices in infrastructure and application deployment.

  • Experienced in managing monitoring, logging, and alerting systems for high-availability applications, with a strong knowledge of defining and meeting SLAs and SLOs.

  • Advanced skills in infrastructure-as-code tools such as Pulumi, and familiarity with container orchestration platforms such as AWS ECS or Kubernetes.

  • Programming skills in one or more languages, such as Python or shell scripting, to support automation and tooling.

  • Solid understanding of networking, database management, and system performance tuning, with the ability to diagnose and resolve complex issues.

  • Proven ability to mentor and guide SRE/DevOps teams, fostering a collaborative, continuous-learning environment.

  • You thrive in an environment of mutual respect, openness and collaboration. You enjoy getting things done at a quick pace.

  • You have a sense of ownership, and you enjoy taking initiative and leading projects through to completion.





Why us?


Here at Keelvar, we are proud to be a remote-first organisation and we offer some great perks.

  • Competitive salary with a Series B-backed, fast-growing organisation
  • 25 days of holiday, increasing to 26 after 3 years and to 27 after 5 years, plus your birthday off on us
  • Flexible working hours with a positive approach to work-life balance
  • An inclusive, collaborative, innovative culture
  • Generous leave offerings including Wellbeing days
  • Technology that enables you to perform to your best

If you really like the sound of the role but don't match every listed criterion exactly, we still want to hear from you. You could be the exact fit for this or any of our other roles.

We are also a diverse group and we intend to continue to attract and retain diverse talent in our organisation. We're committed to an inclusive and diverse Keelvar. We do not discriminate based on gender, ethnicity, sexual orientation, religion, civil or family status, age, disability, or race.



Position ID


1081285




About us


This is an exciting opportunity to join a cutting-edge company that is disrupting its industry. Currently optimising over $100bn in procurement spend for the world's largest companies, Keelvar is more than just a software company. Keelvar is an evolution of how companies work.

Our technology is unparalleled in its space. We are on a fast-paced journey to herald a new era of SaaS 3.0. Using AI, machine learning, and game theory to build intelligent systems that optimise and automate the procurement sourcing process, we save our customers millions of dollars every year and help their suppliers find the best customers for them. Many of the world’s top blue-chip companies use Keelvar to aid negotiations; they set high standards that we relish achieving because it helps us be the best at what we do.

We believe we can change the world and have fun doing it. We are a hard-working team who love what we do. We believe that a culture of curiosity, experimentation, and risk-taking is the key to finding breakthrough approaches - and we don’t settle for conventional approaches.

We strive for excellence, challenging ourselves and each other, with independent thinking, a lot of focus, and plenty of collaboration. In our eyes, the bigger the challenge, the bigger the reward. We’re not content with just equipping users with good tools; we want to help our customers achieve success and excellence, and sometimes this requires lateral or unconventional thinking. We want you to share your knowledge readily and learn every day. We like to ask questions, and answer questions when we can. We invite you to a workplace that is inclusive and celebrates diversity. We support everyone in being themselves, feeling empowered and inspired to make a difference.

If you are passionate about how technology is changing the world of work and want to work with a great team, this is the role for you.