Cloud Engineer

Tencent
Full-time
On-site
Singapore, Singapore

Responsibilities:

About Tencent
Tencent is an Internet-based platform company founded in Shenzhen, China, in 1998. We use technology to enrich the lives of Internet users and assist the digital upgrade of enterprises. Our mission is "Value for Users, Tech for Good". We embrace a culture of teamwork and creativity and are driven by our values: Integrity, Proactivity, Collaboration, and Creativity.


We are rapidly expanding our international operations and are looking for top talent to propel us forward. Combining the results-oriented nature of a start-up with the resources of a profitable and leading Internet company, Tencent offers a unique opportunity for aspiring individuals to thrive.

About WeChat
With over 1.3 billion users worldwide, WeChat is changing the mobile landscape by connecting people, services, and businesses in China and around the world. The WeChat team in Singapore is responsible for managing and growing our core product, including messaging and social networking, for users around the world.

Join the WeChat team and play an impactful role in keeping people around the world connected, helping redefine how people use their mobile devices to communicate and interact online, and better understanding user behaviour and preferences.
  • Ensure site reliability by managing the deployment, scaling, and maintenance of new and existing online services that connect over a billion users around the world
  • Leverage your engineering skills while working directly with developers in order to help test and diagnose issues with newly deployed services, infrastructure resources, or code before and after they reach the production environment
  • Manage high-severity incidents and incidents that impact end users by focusing on service monitoring, alerts, and rapid recovery
  • Use stress testing to help measure, tune, and optimize system performance and reliability for a wide variety of services
  • Develop and maintain automation tools/systems to help eliminate repetitive manual operations and ensure better site reliability
  • Produce and maintain documentation and standard operating procedures (SOPs) to more efficiently and reliably handle regular operations in conjunction with colleagues around the world

Requirements:

  • Bachelor's or higher degree in Computer Science, Computer Engineering, or related fields
  • Prior work experience in Cloud Engineering, Site Reliability Engineering (SRE), or DevOps for a major, public-facing internet service
  • Hands-on experience with at least one of the following programming languages: Bash, Go, or Python
  • Good command of the Linux environment, with a deep understanding of the Linux operating system, including the kernel, memory, processes, threads, static/shared libraries, IPC, RPC, and signals
  • Understanding of standard networking protocols such as HTTP, DNS, SSL, TCP/IP, and ICMP
  • Experience in large-scale distributed environments and familiarity with distributed systems concepts, including the CAP theorem and microservices
  • Experience with container technologies such as Docker and Kubernetes
  • Experience with monitoring tools like Prometheus and Zabbix
  • Demonstrated strong sense of ownership, reliability, and integrity
  • Passion for eliminating repetitive manual processes via automation