Site Reliability Engineer
LTM
Job Description
LTM is proud to be an equal opportunity employer. We are committed to equal employment opportunity regardless of race, ethnicity, nationality, gender identity, gender expression, language, age, sexual orientation, religion, marital status, veteran status, socioeconomic status, disability or any other characteristic protected by applicable law.
We are hiring for a Site Reliability Engineer (SRE) with Gen AI.
Experience: 5 to 19 Years
Location: Pan India
Shift: General
Notice Period: Early joiners
Kindly share your updated resume at sneha.munagekar@ltm.com

Role Summary
We are seeking an SRE Platform Engineer to build, scale, and operate reliable, secure, and automated platform services. You will focus on improving system reliability, reducing operational toil, and enabling developer productivity through strong engineering practices. You should also be willing to work on COE initiatives and on the internal intellectual property and products that the team builds to fulfil different client requirements.

Key Responsibilities
- Build and operate highly available, scalable platform services
- Implement SRE practices: SLIs, SLOs, error budgets, and automation
- Manage and scale Kubernetes-based workloads in cloud environments
- Develop and maintain CI/CD pipelines and Infrastructure as Code
- Own production reliability; participate in on-call, incident response, and RCA (if required)
- Enhance observability using metrics, logs, and traces

Required Skills
- 5-16 years of experience in SRE / Platform / DevOps / Cloud Engineering / AI
- Hands-on experience with Kubernetes, Docker, Linux
- Strong scripting/programming skills (Python, Go, Bash)
- Experience with AWS / Azure / GCP
- Familiarity with Terraform / IaC, monitoring, and alerting tools

Nice to Have
- Experience with internal developer platforms, service mesh, or FinOps
- Cloud or Kubernetes certifications

Regards,
Sneha