Senior Site Reliability Engineer - Req # 2116

Cesit
Full-time
Remote
India, India

CES has 26+ years of experience delivering Software Product Development, Quality Engineering, and Digital Transformation Consulting Services to global SMEs and large enterprises. CES has been delivering services to leading Fortune 500 companies across sectors including Automotive, AgTech, Bio Science, EdTech, FinTech, Manufacturing, Online Retail, and Investment Banking. These are long-term relationships of more than 10 years, nurtured not only by our commitment to timely delivery of quality services but also by our investments and innovations in their technology roadmaps. As an organization, we are in an exponential growth phase with a consistent focus on continuous improvement, a process-oriented culture, and a true partnership mindset with our customers. We are looking for qualified and committed individuals to play an exceptional role and support our accelerated growth.
You can learn more about us at: http://www.cesltd.com/

Ideal Candidate: 
Has working experience in both Windows and Linux
5+ years in Ansible, Terraform, AWS infrastructure, EKS / k8s, and Dynatrace (or any monitoring tool such as Prometheus / Grafana)

Key Skills and Competencies Required
  • 5+ years of extensive experience with Infrastructure as Code (IaC) and Desired State Configuration (DSC) tools such as Terraform, CDK, and Chef
  • 5+ years of experience packaging, deploying, and managing containerized workloads running in common PaaS solutions (e.g., Docker, Kubernetes)
  • 5+ years of expertise in managing AWS infrastructure at scale, including the following services: EC2, S3, Elastic Load Balancing, Lambda, Route 53, ECS, SQS, CloudWatch
  • Prior experience working in a DevOps or SRE environment
  • Highly experienced with automation and scripting using languages such as: PowerShell, Ruby, Go, Python, Bash
  • Large-scale monitoring and reporting experience using ELK stack, Dynatrace and/or New Relic (or other APM), Nagios
  • Experience with IIS management, troubleshooting, and performance monitoring
  • Experience managing web farms in a high-traffic SaaS environment
  • Strong analytical and problem-solving skills including robust troubleshooting skills with a focus on preventative and proactive actions
  • Extensive experience with .NET application architecture components (caching, content delivery, high availability, load balancing, etc.)
  • Understanding of the Software/Application Development Life Cycle process and experience implementing and maintaining CI/CD technologies, including TeamCity, Octopus Deploy, GitHub, Jenkins, Codefresh, etc.
  • Knowledge of or experience with most of the following technologies:
    Active Directory, SSL, FTP, F5 BIG-IP, T-SQL, MongoDB, MySQL, SQL Server, Nagios, Git, TeamCity, Octopus Deploy, Codefresh, Chef, Salt, Docker, Kubernetes, Kafka, AWS, Linux / Windows Server Administration, Bash, Apache

Responsibilities 
  • Drive focused initiatives that improve operational efficiency and scalability of the platform and applications
  • Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization
  • Identify and drive opportunities to improve automation for the company; scope and create automation for deployment, management, and visibility of our services
  • Understand modern software security and secure software systems with cloud-based infrastructure
  • Provide full-stack diagnostics and determine root cause of internal problems
  • Analyse operational performance to support improvements to critical system metrics and KPIs
  • Examine all areas of infrastructure and applications for improvement and suggest changes, rather than wait for direction
  • Safeguard application information against accidental or unauthorized damage, modification, or disclosure
  • Build and maintain redundant systems and procedures for high availability and disaster recovery
  • Develop integrated workflows for our support teams
  • Own the customer experience – think and act in ways that put our customers first, provide them a great digital experience, and make them promoters of our products and services
  • Respond to and help troubleshoot incidents

Personal Attributes 
  • Be an enthusiastic learner, user, and advocate of our technologies
  • Have a desire to win as a team – make big things happen by working together and being open and willing to try new ideas
  • Strong interpersonal and communication skills (written, verbal, and virtual) with the ability to work in a team-oriented, collaborative environment
  • Must have a high degree of personal integrity and the ability to maintain strict confidentiality
  • Must uphold, safeguard, and promote the organization’s values and philosophy relating particularly to corporate ethics, integrity, and priorities
  • Ability to work without supervision on short-term projects
  • Strong drive; self-motivated and logical, with keen attention to detail