Staff DevOps Engineer

BrightAI
Full-time
Remote
United States

 

The BrightAI platform transforms business results by digitizing a physical environment, making it intelligent and controllable, and unlocking unimagined possibilities for growth and social good. We built the only platform that integrates all the key digital technologies needed for a truly end-to-end digital transformation: IoT + AI + Cloud + Mobile + tailored hardware.

We are a high-growth company looking for teammates who want to be key contributors to changing the way businesses are run. This is an incredible opportunity to do work that is disrupting industries. Be a part of scaling a business by increasing the number of devices, events, applications, services, and traffic that result in measurable success for our customers. We have the best and brightest minds in AI, IoT, Cloud and Mobile, who have built leading companies in those spaces (Microsoft, Amazon Alexa, SmartThings, Samsung).

We are looking for a Staff DevOps Engineer to join us in a fast-paced setting and build next-generation clouds and applications that leverage the latest technologies in AWS, Kafka, Cloud Native, Containers, Terraform, CI/CD, AI and ML. Our tech stack is a loosely coupled, event-based architecture built on top of Kafka, with many microservices and IoT devices, that processes tens of millions of events a day. As a member of the team, you will play an integral role in shaping the architecture, processes and features needed to maintain and release a high-scale cloud platform alongside our engineering teams.

Responsibilities

  • Building the tools, processes and support that let teams self-manage in an owner-operator model
  • Establishing and supporting infrastructure-as-code best practices and patterns for teams
  • Managing and supporting engineering teams’ software releases
  • Working with cloud software engineers to define repeatable processes and patterns that accelerate the software development lifecycle and ensure confidence and stability in continuous deployment to production environments
  • Staying on top of the latest cloud trends and topics
  • Prototyping unproven concepts to inform final implementations
  • Automating everything possible, leveraging CI/CD to automate repeatable tasks and quality checks
  • Managing multiple AWS accounts and providing unified operational support
  • Building the common dashboards, monitoring and alerts used by the business

Skills and Expertise

  • Passion for continuous learning and for understanding DevOps, site reliability, high availability and release engineering best practices
  • BS/MS in Computer Science or equivalent practical experience
  • 4+ years of experience with programming languages such as TypeScript, JavaScript, Python and Go
  • Deep experience with AWS cloud infrastructure and services, including EKS, EC2, VPC, Route53, S3, RDS (Postgres), MSK, Lambda, ECS, DocumentDB
  • In-depth knowledge of Kubernetes, including experience with deploying, managing, and scaling clusters
  • Experience with configuration management and infrastructure as code: Terraform, Kustomize, Helm, CloudFormation, Ansible
  • Extensive networking experience
  • Experience with Linux and shell scripting
  • Experience with Datadog, SumoLogic, OpsGenie, Prometheus, Grafana, or similar for reporting and alerts
  • Experience with high availability deployments across multiple regions and AZs
  • Experience with setting up and maintaining CI/CD pipeline tooling
  • Experience working within an Agile environment
  • Experience with managing multiple AWS accounts
  • Familiarity with leveraging Reserved Instances
  • Experience with AWS Cost Explorer and managing usage

Bonus

  • Previous startup experience
  • Experience growing large teams, with the ability to describe the issues that arise with growth
  • Experience operating and monitoring large Kafka clusters
  • Experience with cloud development using TypeScript, JavaScript, Python or Go

 
