Senior Software Engineer - DevOps and Infrastructure

Cornelis Networks, Inc.
3 days ago
Full-time
Remote
United States

Cornelis Networks delivers the world's highest-performance scale-out networking solutions for AI and HPC datacenters. Our differentiated architecture seamlessly integrates hardware, software, and system-level technologies to maximize the efficiency of GPU, CPU, and accelerator-based compute clusters at any scale. Our solutions drive breakthroughs in AI and HPC workloads, empowering our customers to push the boundaries of innovation. Backed by top-tier venture capital and strategic investors, we are committed to innovation, performance, and scalability, solving the world's most demanding computational challenges with our next-generation networking solutions.


We are a fast-growing, forward-thinking team of architects, engineers, and business professionals with a proven track record of building successful products and companies. As a global organization, our team spans multiple U.S. states and six countries, and we continue to expand with exceptional talent in onsite, hybrid, and fully remote roles.


We are seeking an experienced Senior Software Engineer to enhance Cornelis Networks' development and testing infrastructure. This role focuses on the design and maintenance of onsite and cloud infrastructure, including CPU and GPU-accelerated systems, automation, and CI/CD pipelines. The ideal candidate will have deep Linux systems expertise and hands-on experience managing server hardware and software stacks in enterprise IT, cloud, or HPC environments.


Responsibilities

  • Design, build, and maintain CI/CD pipelines using GitHub Actions, including custom workflows, across multiple Linux distributions (RHEL, Rocky, SLES, Ubuntu).
  • Integrate HPC/AI workload and Cornelis software build and test stages into CI pipelines, including driver compatibility checks and GPU-accelerated test suites.
  • Work with software teams to validate and test new packages and ensure test hardware environments are provisioned with the correct package versions.
  • Collaborate with developers to help debug, interpret, and communicate testing results, ensuring issues are triaged and resolved efficiently.
  • Monitor pipeline health, troubleshoot failures, and continuously improve build reliability and speed.
  • Administer and maintain onsite Linux-based development and test servers across a heterogeneous multi-distro environment.
  • Perform OS provisioning, patching, and lifecycle management, including kernel module and driver management.
  • Manage hardware and software stacks for both CPU and GPU systems, including installation, upgrade, and troubleshooting of AMD (ROCm) and NVIDIA (CUDA/driver) environments.
  • Maintain driver and package compatibility matrices to ensure consistent environments across development, CI, and test infrastructure.
  • Manage test hardware environments to ensure they are provisioned with the correct packages and configurations for ongoing test campaigns.



Minimum Qualifications

  • 5+ years of experience in DevOps or infrastructure engineering in a Linux-based environment.
  • Strong proficiency with Git and GitHub Actions, including designing and maintaining custom CI/CD workflows.
  • Deep Linux systems mastery across RHEL/Rocky, SLES, and Ubuntu, including package management (RPM, DEB, zypper, dnf, apt), systemd, kernel module management, and system troubleshooting.
  • Hands-on experience managing server systems, including installation and maintenance of drivers, software, and firmware across multiple Linux distributions.
  • Proficiency in Linux scripting for automation and tooling.
  • Strong troubleshooting and debugging skills across hardware, OS, driver, and application layers.
  • Excellent communication skills and ability to work effectively in a remote, collaborative environment.


Preferred Qualifications

  • Experience with ReFrame or similar HPC regression testing frameworks.
  • Experience with HPC, high-performance networking, or RDMA technologies (InfiniBand, Omni-Path, Ethernet RDMA).
  • Familiarity with kernel driver development or kernel module packaging (DKMS, kmod).
  • Experience managing multi-GPU server configurations (e.g., DGX, MI300X-class systems, or similar).
  • Knowledge of GPU monitoring and management tools (nvidia-smi, rocm-smi, DCGM, GPU operator frameworks).
  • Experience with infrastructure-as-code tools such as Ansible.
  • Familiarity with monitoring and observability stacks (Grafana, Loki, Prometheus).
  • Proficiency in Python for automation and tooling.
  • Familiarity with C/C++.
  • Experience in a networking, semiconductor, or systems software company.


Location: This is a remote position for employees residing within the United States.


We offer a competitive compensation package that includes equity, cash, and incentives, along with health and retirement benefits. Our dynamic, flexible work environment provides the opportunity to collaborate with some of the most influential names in the semiconductor industry.


At Cornelis Networks, your base salary is only one component of your comprehensive total rewards package. Your base pay will be determined by factors such as your skills, qualifications, experience, and location relative to the hiring range for the position. Depending on your role, you may also be eligible for performance-based incentives, including an annual bonus or sales incentives.


In addition to your base pay, you'll have access to a broad range of benefits, including medical, dental, and vision coverage, as well as disability and life insurance, a dependent care flexible spending account, accidental injury insurance, and pet insurance. We also offer generous paid holidays, 401(k) with company match, and Open Time Off (OTO) for regular full-time exempt employees. Other paid time off benefits include sick time, bonding leave, and pregnancy disability leave.


Cornelis Networks does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. Cornelis Networks is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, disability status, genetic information, protected veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.