Site Reliability Engineer - Data Platform

Bengaluru, Karnataka, India
Jun 19, 2024
Jun 19, 2025
Onsite
Full-Time
3 Years
Job Description

As a Site Reliability Engineer (SRE) specializing in the Data Platform, you will be instrumental in ensuring the reliability, scalability, and performance of our data infrastructure. Collaborating closely with cross-functional teams, you will design, implement, and maintain robust systems that support our data-driven initiatives. The ideal candidate will excel in automating infrastructure provisioning, possess strong scripting skills, and have a deep understanding of managing high-density data environments.

Key Responsibilities

  1. Configuration Management. Use Terraform or Ansible to automate the provisioning of infrastructure components in both cloud and on-premises environments.
  2. Infrastructure Management. Manage and optimize our Cloudera-based infrastructure for optimal performance, high availability, and scalability. This includes monitoring system health, troubleshooting issues, and conducting routine maintenance.
  3. Data Security and Compliance. Implement and enforce security best practices to protect data integrity and confidentiality, ensuring compliance with relevant regulations such as GDPR, HIPAA, and DPR.
  4. Performance Optimization. Continuously improve the performance, efficiency, and cost-effectiveness of the data infrastructure. Identify bottlenecks, tune configurations, and apply best practices for resource utilization.
  5. Capacity Planning. Monitor resource utilization trends and plan for future capacity needs. Proactively identify and address potential capacity constraints.
  6. Backup and Disaster Recovery. Implement robust backup and disaster recovery strategies to ensure data protection and business continuity. Regularly test and maintain backup and recovery procedures.
  7. Patches & Upgrades. Apply patches and perform platform upgrades following advisories from Cloudera, InfoSec, and Compliance teams.
  8. Documentation and Knowledge Sharing. Create comprehensive documentation for configurations, processes, and procedures related to the Data Platform. Foster continuous learning and improvement by sharing knowledge and best practices with team members.
  9. Collaboration and Communication. Collaborate effectively with cross-functional teams including data engineers, developers, and IT operations. Communicate project status, issues, and resolutions clearly and promptly.

Qualifications

  • Bachelor's degree in Computer Science, Engineering, or related field.
  • Proficiency in Linux system administration, shell scripting, and networking concepts.
  • 3 to 8 years of experience in Infrastructure Automation.
  • Hands-on experience with infrastructure-as-code and configuration management tools (e.g., Terraform, Ansible).
  • Strong scripting skills (e.g., Python, Bash) for automation and troubleshooting.
  • Experience with monitoring and logging solutions (e.g., Prometheus, Grafana, ELK stack).
  • Knowledge of networking principles and protocols (TCP/IP, UDP, DNS, DHCP, etc.).
  • Experience managing *nix-based machines (e.g., Ubuntu, Fedora, Red Hat) and strong working knowledge of Unix tools.
  • Excellent communication skills and the ability to collaborate effectively with cross-functional teams.
  • Strong analytical, problem-solving, and troubleshooting skills.

Good To Have

  • Exposure to cloud platforms like Azure or AWS.
  • Understanding of distributed computing principles and experience with Hadoop ecosystem technologies (HDFS, MapReduce, YARN, Hive, Spark, etc.).
  • Familiarity with Open Data Lake components such as Ozone, Iceberg, Spark, Flink, etc.
  • Knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes, OpenShift).

This job description outlines a challenging role where you'll contribute significantly to maintaining and enhancing the reliability and performance of our data infrastructure. If you're passionate about infrastructure automation, data security, and working with cutting-edge technologies in a collaborative environment, we encourage you to apply!