Site Reliability Engineer

Hyderabad, Telangana, India
May 24, 2024
May 24, 2025
Remote
Contract
2 Years
Job Description

Our client is seeking a Site Reliability Engineer (SRE) to play a pivotal role in ensuring the seamless operation and scalability of our systems. Working at the intersection of software engineering and operations, the SRE is responsible for building robust, fault-tolerant systems that uphold our organization's reliability and efficiency standards.

Responsibilities

  1. Issue Detection. Vigilantly monitor systems to detect and troubleshoot potential issues before they escalate.
  2. Failure Handling. Develop automated mechanisms to swiftly address and recover from system failures, minimizing downtime and disruption.
  3. Disaster Recovery Planning. Create comprehensive disaster recovery plans to safeguard against unforeseen events and ensure business continuity.
  4. System Maintenance. Employ proactive measures to keep systems operational and reliable, implementing updates and optimizations as needed.
  5. Preventive Measures. Identify and address weaknesses in systems to prevent future disruptions, enhancing overall system resilience.

Qualifications

  1. Development Proficiency. Strong background in software development, particularly with applications written in .NET. Proficiency in scripting languages like Python is advantageous.
  2. Automation Skills. Experience with automation tools such as Ansible or Terraform, with a preference for Ansible. Ability to streamline processes through automation.
  3. Adaptability. Comfortable working in a dynamic environment, including supporting 24x7 operations if required.
  4. Cloud Expertise. Familiarity with cloud platforms, particularly AWS, although experience with any cloud provider is valuable.
  5. Containerization Knowledge. Proficiency in Kubernetes and Docker, with the ability to deploy and manage containerized applications effectively.

Why Join Us

  1. Impactful Work. Contribute to the development and maintenance of critical systems that drive our organization's success.
  2. Collaborative Environment. Work alongside talented individuals in a supportive and collaborative team environment.
  3. Continuous Learning. Access opportunities for professional growth and skill development, keeping pace with the latest technologies and industry trends.

If you are passionate about ensuring the reliability and scalability of complex systems, we invite you to join our team as a Site Reliability Engineer. Apply now to be part of our journey towards operational excellence.