Lead Site Reliability Engineer

JPMorganChase
Full-time
On-site
Jersey City, New Jersey, United States
$152,000 - $215,000 USD yearly
Description

Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability.

As a Lead Site Reliability Engineer at JPMorgan Chase within Consumer and Community Bank, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on the technical and business issues facing them. You lead resiliency design reviews, break complex problems into digestible work for other engineers, act as a technical lead for medium to large-sized products, and provide advice and mentoring to other engineers.

Job responsibilities

  • Lead the design, implementation, and maintenance of observability solutions across the organization.
  • Develop and enforce best practices for observability, including logging, monitoring, tracing, and alerting.
  • Write and maintain code in Java or a similar language, Python, and Angular or a similar framework to build and enhance observability tools and platforms. Automate repetitive tasks to improve system reliability and developer productivity.
  • Implement and drive adoption of SRE principles to improve system reliability, availability, and performance.
  • Design and implement monitoring and alerting strategies to proactively identify and resolve issues.
  • Ensure that observability tools provide actionable insights and are aligned with business objectives.
  • Work closely with cross-functional teams to integrate observability practices into the software development lifecycle.
  • Mentor and guide junior engineers, fostering a culture of learning and continuous improvement.
  • Lead projects related to observability initiatives, ensuring timely delivery and alignment with strategic goals. Communicate effectively with stakeholders to provide updates and gather requirements.
  • Function effectively in an agile environment, managing or contributing to the backlog, tracking velocity, and reporting on project landings.

Required qualifications, capabilities, and skills

  • Formal training or certification in software engineering concepts with 5+ years of applied experience.
  • Strong understanding of SRE principles and practices.
  • Advanced knowledge of observability tools and platforms (e.g., Dynatrace, Splunk, Grafana).
  • Extensive experience in a similar SRE or observability role.
  • Proven track record of implementing and managing observability solutions in complex environments.
  • Excellent communication skills, both verbal and written, with the ability to convey complex technical concepts to non-technical stakeholders.
  • Collaborative mindset, with the ability to work effectively with diverse teams and stakeholders.
  • Strong analytical and problem-solving skills, with the ability to troubleshoot complex systems and drive root cause analysis.
  • Ability to communicate data-based solutions with complex reporting and visualization methods.
  • Drive to self-educate and evaluate new technology.

Preferred qualifications, capabilities, and skills

  • Proficiency in languages and tools such as Java, Python, Angular, and Terraform (nice to have).