Dear applicants, please note that applications missing either salary expectations or an active LinkedIn profile will not be considered.
We are looking for a Senior DevOps Engineer to own the infrastructure that keeps a fast-growing applied AI and data analytics platform running at enterprise scale. You will architect, build, and maintain the cloud systems that Fortune 500 retailers trust with their most critical data and decision-making. This is a hands-on, high-ownership role: you will work closely with backend and ML engineers to ensure the platform is secure, scalable, and relentlessly reliable. A key part of the role is taking over infrastructure leadership from the current DevOps lead over a transition period, then owning that domain long-term. Over time, you will also be involved in growing the DevOps team, though the immediate priority is hands-on technical execution.
Details
Schedule: Full-time
Location: Hybrid (NYC)
Salary: $160K-$200K
Type of collaboration: Full-time employment
The platform connects an organization's entire data landscape (internal systems, social media trends, industry reports, consumer behavior signals) into a single coherent intelligence layer. It drives 8-figure improvements in gross margins for Fortune 500 retailers. As enterprise onboarding ramps up, the infrastructure must scale to match. The team believes in GitOps, infrastructure as code, and building systems that let engineers move fast without breaking things. You will be the person who makes that real: standing up infrastructure, defining reliability targets, automating manual processes, and owning production incidents end-to-end. Former technical founders are a strong fit: what you've built matters more than tenure.
You have
4-7+ years of industry experience, with at least 4 years in hands-on DevOps/infrastructure roles
4+ years managing cloud infrastructure in production (AWS strongly preferred)
2+ years of production Kubernetes experience (EKS preferred)
Deep proficiency in Terraform/Terragrunt and infrastructure-as-code principles
Strong scripting skills in Python and Bash, with a habit of automating everything possible
Experience with CI/CD platforms (GitHub Actions, GitLab CI, Jenkins, or similar)
Nice to have
Security-first mindset with experience implementing compliance frameworks (SOC 2, GDPR)
Systems-level, architectural, and strategic thinking: the ability to articulate reliability targets, scaling policies, and rollback criteria regardless of specific tooling
Experience supporting ML/AI workloads and GPU infrastructure
Familiarity with service mesh architectures and advanced networking
Experience with multi-tenant enterprise SaaS infrastructure
Background in cost optimization and FinOps practices
What you will do
Architect and build secure, scalable cloud infrastructure using IaC (Terraform/Terragrunt) on AWS
Design and maintain robust CI/CD pipelines; convert manual processes into automated, repeatable workflows
Own production reliability: set up observability stacks, define SLIs and SLOs, and lead incident response
Implement and champion GitOps workflows for all infrastructure deployments
Build compliance-ready infrastructure with IAM best practices and secrets management
Optimize cloud costs while maintaining performance and reliability at scale
Create developer tooling and documentation that accelerates engineering workflows across the team
Mentor team members on DevOps best practices and guide infrastructure strategy
Take over infrastructure leadership over a structured transition period; own this domain long-term
Interview process
Application reviewed by a human
Intro call with the recruiting team
Technical conversation with the infrastructure lead
On-site: meet the team