Responsibilities:
- Design, build and support cloud environments to create digital products
- Monitor and assess the performance of applications in a cloud environment to ensure solutions are available
- Create, test and implement safeguards to maintain data integrity and protect against unauthorized access

General Skills:
- Experience in one of the leading cloud platforms, such as AWS, Azure or Google Cloud
- Experience in maintaining complex Linux cloud environments (for example CentOS, Ubuntu, or CoreOS) to support modern web technologies such as LAMP, MEAN, Drupal and Elasticsearch
- Experience setting up development environments and workflows using tools such as JIRA, Confluence, Maven and Jenkins, or similar
- Experience in programming and scripting languages such as Python, Bash, PHP, Java and JavaScript (Node.js)
- Experience with version control and configuration management tools such as Git, Ansible, Chef and Puppet in support of continuous integration
- Knowledge of container-based virtualization technology like Docker
- Integration experience in building and using APIs
- Experience applying industry web, architectural and security standards and best practices
- Experience in mobile device management for various models of mobile phones and tablets
Requirements
Experience and Skill Set Requirements:
Must Haves:
- Design, provision, and manage AWS infrastructure including VPCs, subnets, security groups, IAM policies, EC2, ECS, EKS, RDS, S3, Route 53, and CloudFront.
- Architect multi-account AWS environments following AWS Well-Architected Framework principles.
- Manage AWS cost optimization strategies including Reserved Instances, Savings Plans, and rightsizing.
- Develop, maintain, and refactor Terraform modules and configurations for all cloud infrastructure.
- Author and maintain Ansible playbooks, roles, and collections for server configuration, application deployment, and compliance enforcement.
- Operate and administer Red Hat OpenShift Service on AWS (ROSA) clusters, including cluster upgrades, node scaling, and add-on management.
- Design and maintain CI/CD pipelines (GitLab CI, Azure DevOps Service) for infrastructure and application delivery.

Skill Set Requirements:
Cloud Infrastructure & AWS:
- Design, provision, and manage AWS infrastructure including VPCs, subnets, security groups, IAM policies, EC2, ECS, EKS, RDS, S3, Route 53, and CloudFront.
- Architect multi-account AWS environments following AWS Well-Architected Framework principles.
- Manage AWS cost optimization strategies including Reserved Instances, Savings Plans, and rightsizing.
- Implement and maintain CloudTrail, Config, GuardDuty, Security Hub, and AWS Organizations SCPs.

Infrastructure as Code - Terraform/Terraform Cloud:
- Develop, maintain, and refactor Terraform modules and configurations for all cloud infrastructure.
- Manage Terraform Cloud workspaces, remote state backends, variable sets, and team access policies.
- Enforce IaC standards including module versioning, input/output conventions, and documentation.
- Implement drift detection and remediation workflows using Terraform Cloud run tasks and policy-as-code (Sentinel or OPA).
- Lead Terraform code review processes and mentor junior team members on best practices.

Configuration Management - Ansible:
- Author and maintain Ansible playbooks, roles, and collections for server configuration, application deployment, and compliance enforcement.
- Manage Ansible inventories across dynamic cloud environments using AWS dynamic inventory plugins.
- Integrate Ansible automation with CI/CD pipelines for repeatable and auditable deployments.
- Use Ansible Vault for secrets management and ensure secure handling of credentials.
- Develop idempotent, well-tested automation that reduces manual toil and configuration drift.

Container Platform - OpenShift ROSA:
- Operate and administer Red Hat OpenShift Service on AWS (ROSA) clusters, including cluster upgrades, node scaling, and add-on management.
- Define and enforce OpenShift RBAC, NetworkPolicies, and SecurityContextConstraints (SCCs).
- Manage Operators, Helm charts, and Kustomize overlays for workload deployment on ROSA.
- Ensure cluster hardening against CIS benchmarks and organizational security policies.

CI/CD Pipelines:
- Design and maintain CI/CD pipelines (GitLab CI, Azure DevOps Service) for infrastructure and application delivery.
- Implement GitOps workflows using ArgoCD for declarative, auditable deployments to OpenShift ROSA.
- Integrate security scanning tooling (SAST, container scanning, dependency auditing) into pipeline gates.
- Champion shift-left testing principles, ensuring infrastructure changes are validated before promotion to production.
- Maintain pipeline-as-code standards with versioned, peer-reviewed pipeline definitions.

Security & Compliance:
- Serve as a key contributor to the team's security posture, embedding security controls throughout the infrastructure and CI/CD lifecycle.
- Implement secrets management solutions (AWS Secrets Manager) and enforce least-privilege access.
- Support vulnerability management processes by triaging findings from infrastructure and container scanning tools.
- Participate in incident response and post-mortem processes, ensuring remediation actions are tracked and resolved.

Observability & Reliability:
- Build and maintain end-to-end observability solutions using AWS CloudWatch.
- Define and track SLOs and SLIs for critical platform services and workloads.
- Lead on-call incident response for platform-level issues, conducting RCAs and driving permanent fixes.
- Produce and maintain runbooks and architectural decision records (ADRs).