As a Senior DevOps Engineer and, more importantly, a Senior Principal Engineer on our team, you will take ownership of mission-critical deliverables across multiple projects. You will not only design and implement resilient, scalable systems but also lead a high-performing DevOps team. Your focus will be on creating infrastructure that powers world-class user experiences.
Your Role Includes
- Leading a small but impactful team of engineers through day-to-day operations and long-term strategic goals.
- Designing and building highly available, scalable, and modular infrastructure systems using modern DevOps practices.
- Spearheading initiatives around automation, monitoring, incident management, and CI/CD enablement.
- Architecting robust, secure, and cost-efficient deployments in hybrid cloud environments (with AWS as a primary focus).
- Establishing and improving Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
- Managing and enhancing system observability using modern telemetry stacks (e.g., ELK, Prometheus, Grafana).
- Driving incident response and root cause analysis processes, and leading efforts to automate fault recovery and issue prevention.
- Providing on-call support with a focus on reducing alert fatigue and improving incident outcomes.
- Contributing to infrastructure roadmap planning, performance tuning, and security best practices.
You Might Be a Fit If You Have
- 4–6 years of hands-on experience in DevOps, SRE, or Infrastructure Engineering roles.
- Solid Linux/Unix administration background with strong shell scripting and debugging skills.
- Deep experience working with Kubernetes (deployment, scaling, high availability, troubleshooting).
- Proven ability to manage scalable infrastructure on AWS. Familiarity with GCP is a bonus.
- Strong grasp of networking concepts like DNS, routing, VPCs, subnets, firewalls, and VPNs.
- Fluency in Infrastructure-as-Code tools (Terraform is essential).
- Experience with configuration management systems like Ansible.
- Proven expertise in building, managing, and improving CI/CD pipelines with tools like Jenkins, GitLab CI, or ArgoCD.
- Strong knowledge of observability and monitoring stacks such as Prometheus, ELK, Telegraf, and Grafana.
- Working familiarity with databases and data platforms like MongoDB, DynamoDB, Redis, Kafka, or Cassandra.
- Bonus: exposure to big data tools and pipelines such as Dataproc or Redshift.
Your Toolbelt (a.k.a. Soft Skills)
- You're a self-starter who takes initiative and drives outcomes without micromanagement.
- You think strategically but execute pragmatically. You’re as comfortable leading architecture discussions as you are writing Terraform scripts.
- You embrace ambiguity and adjust quickly in fast-changing environments.
- You collaborate well with engineers, product managers, and stakeholders; your communication is clear, honest, and impactful.
We’d Love to Hear More About
- Real-world problems you’ve solved, from outages to cost optimization to scaling challenges.
- Your leadership style: how you motivate teams, mentor junior engineers, and drive alignment across functions.
- Any open-source projects, blogs, or talks you’ve contributed to. Show us your passion!
Why Join Xtelify?
- Truly Agile Culture. We’re nimble, iterative, and fast. Your voice matters in the product and platform direction.
- Scale and Impact. We operate in the dynamic entertainment and OTT domain: big problems, even bigger opportunities.
- Tech-First Approach. Work with a forward-thinking team using cutting-edge cloud-native tools and platforms.
- Hybrid Cloud Exposure. Be part of our transition to a hybrid cloud model, where performance, resilience, and cost efficiency go hand-in-hand.
- Top-Tier Engineering Team. Collaborate with passionate technologists who love solving real-world problems.
Ready to help shape the future of cloud-native infrastructure in the entertainment space? Apply now and let’s build something great together.