Machine Learning Engineer - Oracle Cloud

Bengaluru, Karnataka, India
Sep 05, 2024
Sep 05, 2025
Onsite
Full-Time
4 Years
Job Description

In this role, you'll help design and build a cloud service that supports data scientists, machine learning engineers, and software developers throughout their machine learning development and deployment lifecycle. Your work will involve creating interactive notebooks, enabling distributed machine learning, and implementing robust monitoring and analytics for machine learning models.

Responsibilities

  1. Develop Advanced Data Science Solutions. Build accelerated data science components and machine learning solutions on the modern Infrastructure as a Service (IaaS) building blocks of Oracle Cloud Infrastructure (OCI).
  2. Design Robust Systems. Design and implement distributed, scalable, and fault-tolerant software systems using technologies such as Dask, Spark, Horovod, TensorFlow, and PyTorch.
  3. Full Software Lifecycle Participation. Engage in the entire software development lifecycle, including development, testing, continuous integration, and production operations.
  4. Feature Development. Create new features for the data science platform using a range of classical machine learning and deep learning frameworks, as well as Generative AI and large language models (LLMs).
  5. Customer and ISV Collaboration. Work closely with customers and Independent Software Vendors (ISVs) to troubleshoot and optimize data science solutions, covering aspects from data wrangling to model deployment.
  6. Documentation and Mentorship. Design, develop, and document robust software components while mentoring new employees and guiding junior engineers to enhance project quality.
  7. On-Call Participation. Share on-call rotations with your team to support the service in production.

Qualifications

  1. Experience. 4 to 7 years of professional experience in machine learning and software development.
  2. Educational Background. B.Tech/B.E. or M.S. in Computer Science, Artificial Intelligence/Machine Learning, or a related field.
  3. DevOps Skills. Familiarity with containerization (e.g., Docker), the Linux/UNIX shell, and package management (e.g., Conda).
  4. Programming Proficiency. Python proficiency at an intermediate level or higher, and familiarity with standard software engineering practices (unit testing, mocking, logging, debugging, version control).
  5. Cloud Experience. Experience with cloud platforms such as AWS, GCP, or Azure is highly desirable.
  6. Distributed ML Frameworks. Experience with distributed machine learning frameworks like TensorFlow and PyTorch.
  7. SQL Skills. Experience with SQL is highly desirable.
  8. Java Development. Java development experience is a significant plus.
  9. Communication Skills. Excellent verbal and written communication abilities.