Staff DevOps Engineer

Luxury Presence
Full-time
Remote
United States
AWS

We’re hiring a Staff DevOps Engineer to be the connective tissue between our Infrastructure (INFRA) and Developer Experience (DevEx) teams. You’ll own the systems, automation, and guardrails that let engineers ship quickly and safely—from ephemeral/preview environments to production delivery. You’ll work across EKS, core AWS services, and key vendors (e.g., Confluent Cloud) to create a reliable, secure, and cost-aware platform. You’ll also lead CI/CD optimization and introduce modern release gating so changes meet quality and security bars before they land.
This role blends depth in Kubernetes/AWS and CI/CD with enough breadth to understand application needs, data stores, networking, and observability. You’ll collaborate closely with INFRA, DevEx, and product engineering to reduce friction, increase confidence, and accelerate delivery.

What You’ll Do

    • Own ephemeral & preview environments: Partner with INFRA to design, build, and scale automated, short-lived environments (per PR/feature) including safe data strategies (seed/snapshot/masking), TTL policies, cost controls, and consistent app/infra templates.
    • CI/CD at scale: Streamline and standardize pipelines (e.g., CircleCI → Argo CD) for services and jobs; speed up builds/tests with caching, parallelism, and flake reduction; maintain artifact/versioning strategies across our monorepos.
    • Release gating & progressive delivery: Partner with QA to implement quality gates (tests, coverage deltas, policy checks).
    • Observability & reliability: Partner with INFRA to level up metrics, logs, and traces (Datadog/OpenTelemetry); define health checks and deployment KPIs; contribute to on-call readiness, incident response, and post-incident improvements.
    • Vendor integration: Assist Product Engineering with building robust integrations for external services (e.g., Confluent Cloud/Kafka) with secure networking, credentials, and monitoring; document best practices as reusable templates.
    • Developer experience: Contribute to internal tooling and documentation that make the right way the easy way, such as CLIs, scaffolds, templates, and playbooks for environment creation, deploys, and debugging.
    • Measure & iterate: Track DORA metrics (lead time, deploy frequency, change failure rate, MTTR); set targets and deliver continuous improvements.

What We’re Looking For

    • 6–10+ years in DevOps/Platform/SRE roles building and operating production systems at scale.
    • Expertise with Kubernetes (EKS) and AWS (IAM, VPC, ECR, SSM/Secrets Manager, CloudWatch, S3, SQS, Lambda, RDS/Aurora).
    • Strong IaC chops (Terraform preferred) and GitOps workflows (Argo CD or similar).
    • Proven track record building ephemeral/preview environments and standardizing app/infra templates across many services.
    • CI/CD mastery (CircleCI or similar) including caching/parallelism, artifact management, test reliability, and pipeline observability.
    • Experience with release strategies (canary/blue-green, automated rollbacks) and quality gates.
    • Observability fundamentals (Datadog/Prometheus/Grafana, OpenTelemetry); ability to define SLIs/SLOs and wire them to delivery decisions.
    • Excellent cross-team communicator who can translate platform constraints into developer-friendly solutions and documentation.
    • Comfort in an AI-augmented engineering culture; enthusiasm for automation and building tools that eliminate toil.

Tech Stack

    • Infrastructure: AWS, EKS, Terraform, Argo CD, Docker
    • CI/CD: CircleCI, Argo CD (GitOps)
    • Messaging: Kafka (Confluent Cloud)
    • Observability: Datadog, OpenTelemetry
    • Languages/Apps: Node.js/TypeScript microservices, Python jobs, React front-ends

How You’ll Succeed Here

    • You simplify complex systems with good defaults and strong documentation.
    • You bias toward automation, reproducibility, and measurable outcomes.
    • You collaborate across INFRA, DevEx, and product teams to enable safe speed.