Senior Site Reliability Engineer

First Horizon
Full-time
On-site
Plano, Texas, United States

No sponsorship will be provided for this role.

Location: On site at location listed in posting.

Weekly Schedule: Monday-Friday, 9am-5pm

We are seeking a Senior Site Reliability Engineer who will be the guardian of our Azure infrastructure reliability. This role focuses on building comprehensive observability platforms, implementing intelligent monitoring systems, and proactively identifying issues before they impact production. You will create the tools and automation that predict, detect, and prevent problems rather than simply reacting to them. Your primary mission is ensuring our Azure infrastructure and applications never surprise us with failures.

The ideal candidate has deep expertise in Azure Monitor, Application Insights, Log Analytics, and KQL, combined with strong scripting skills in Python or PowerShell. You should have 5-7+ years of experience implementing observability platforms and a proven track record of preventing incidents through proactive monitoring and automation. You'll work with technologies like Prometheus, Grafana, OpenTelemetry, and Azure services (AKS, App Services, Azure SQL, Cosmos DB) while building self-healing automation and predictive analytics tools that keep our systems healthy.

Key Responsibilities:

  • Design and implement a comprehensive observability stack across all Azure resources and applications
  • Build intelligent alerting systems with anomaly detection and predictive capabilities to prevent incidents
  • Create self-healing automation and auto-remediation tools that resolve issues without human intervention
  • Develop internal monitoring platforms, dashboards, and CLI tools for engineering teams
  • Write KQL queries and analyze metrics/logs to identify optimization opportunities and predict failures
  • Implement continuous resource monitoring for Azure quotas, costs, security posture, and service health
  • Build capacity forecasting and trend analysis tools to prevent resource exhaustion
  • Reduce alert noise while improving coverage and actionability of monitoring systems
  • Participate in a light on-call rotation (the prevention-focused approach reduces reactive incidents)

About Us

First Horizon Corporation is a leading regional financial services company, dedicated to helping our clients, communities and associates unlock their full potential with capital and counsel. Headquartered in Memphis, TN, the banking subsidiary First Horizon Bank operates in 12 states across the southern U.S. The Company and its subsidiaries offer commercial, private banking, consumer, small business, wealth and trust management, retail brokerage, capital markets, fixed income, and mortgage banking services. First Horizon has been recognized as one of the nation's best employers by Fortune and Forbes magazines and a Top 10 Most Reputable U.S. Bank. More information is available at www.FirstHorizon.com.

Benefit Highlights

  • Medical with wellness incentives, dental, and vision
  • HSA with company match
  • Maternity and parental leave
  • Tuition reimbursement
  • Mentor program
  • 401(k) with 6% match
  • More: FirstHorizon.com/First-Horizon-National-Corporation/Careers/Our-Benefits
