
Site Reliability Engineer II - Network Operations

Bank of America
Full-time
On-site
Plano, Texas, United States
$108,000 - $161,900 USD yearly

Job Description:

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.

Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being an inclusive workplace, attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and making an impact in the communities we serve.

Bank of America is committed to an in-office culture with specific requirements for office-based attendance, while allowing an appropriate level of flexibility for our teammates and businesses based on role-specific considerations.

At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!
 

This job is responsible for partnering with engineering and technology teams to implement measures as prescribed by lead/senior SRE engineers. Key responsibilities include ensuring appropriate instrumentation, tooling, ticketing, alerting and on call routines are in place for key services, identifying root causes of issues through production triage efforts, and suggesting code enhancements to technology teams to automate services and improve reliability and efficiency. Job expectations include using software development skills to improve efficiency and to address gaps in reliability.

Responsibilities:

  • Develops and maintains reliability scripts, tools and libraries, leverages them for common instrumentation, automation, and operational needs, and uses them when mentoring Site Reliability Engineers (SREs) on reliability practices and established tools/capabilities
  • Collaborates with Development and Infrastructure teams to understand technical solutions and implement monitoring capabilities outlined in the application and system monitoring designs put forward by the SRE Lead
  • Partners to implement code changes to make use of common reliability libraries and tools and helps Application Production Services and Application Development teammates understand how to use them
  • Identifies vulnerabilities and opportunities for reliability improvement, such as investigating low level error rates and 'noise' in monitoring, and defines solutions to reduce manual support effort and/or improve system reliability
  • Engages as a subject matter expert in major incident triage efforts and failure scenario modelling, and works with the Problem Manager to diagnose root causes for major incident/problem management investigations
  • Participates regularly in an on-call rotation with Production Support teammates to learn more about reliability issues affecting their portfolio
  • Collaborates with Engineering and Production Services teams to understand technical solutions and define strategies for network automation to accelerate time-to-market and reduce operational inefficiencies
  • Develops and maintains a catalog of automation services that can be leveraged for common instrumentation, automation, and operational needs to identify and remediate network events
  • Provides next-level escalation support for production triage efforts
  • Manages a continuous integration / continuous delivery (CI/CD) pipeline for network development and testing
  • Makes complex modifications to existing software applications and modules in accordance with high-level specifications, application support, and industry standards
  • Provides technical guidance and mentorship to junior members of the team
  • Remains flexible in providing on-call rotation support, including off-hours support during weeknights and weekends on a regular basis and on holidays as required

Required Qualifications:

  • Strong software development and design skills (Python), understanding of SDLC, agile methodologies and tooling
  • Strong programming skills and development experience: Python and frameworks such as Django, Flask, Jinja, SQLAlchemy
  • Strong communication skills to explain strategy, clarify requirements, and facilitate successful testing and adoption
  • Experience with Git, Jira, Jenkins, and continuous build systems with automated testing (unit and end-to-end testing)
  • Experience with Infrastructure as Code and Object-Oriented Analysis and Design (OOA & OOD)
  • OS Platforms: Cisco, RHEL
  • Automation: Configuration via Ansible

Desired Qualifications:

  • Experience with workflow tools/frameworks such as Pronghorn
  • Network Systems: Routers, Switches, Firewalls, Load Balancers, F5 GTM, LTM, IPAM, DDI, Avi, SDN

Skills:

  • Analytical Thinking
  • Automation
  • Collaboration
  • Production Support
  • Result Orientation
  • Application Development
  • Architecture
  • Influence
  • Project Management
  • Solution Design
  • Adaptability
  • DevOps Practices
  • Risk Management
  • Solution Delivery Process
  • Stakeholder Management

Shift:

1st shift (United States of America)

Hours Per Week: 

40

Pay Transparency details

US - NJ - Jersey City - 101 Hudson St - 101 Hudson (NJ2101)

Pay and benefits information

Pay range

$108,000.00 - $161,900.00 annualized salary, offers to be determined based on experience, education and skill set.

Discretionary incentive eligible

This role is eligible to participate in the annual discretionary plan. Employees are eligible for an annual discretionary award based on their overall individual performance results and behaviors, the performance and contributions of their line of business and/or group, and the overall success of the Company.

Benefits

This role is currently benefits eligible. We provide industry-leading benefits, access to paid time off, resources and support to our employees so they can make a genuine impact and contribute to the sustainable growth of our business and the communities we serve.