We’re looking for an experienced Data Engineer with a strong background in Big Data technologies and cloud-based data platforms. In this role, you will design, develop, and optimize data pipelines that support large-scale data ingestion, transformation, and analytics. You will work with modern data frameworks and cloud ecosystems (especially AWS) to help our clients derive meaningful insights and drive data-led decisions.
You’ll be a key contributor to the delivery of high-performance data platforms, with a focus on scalable architecture, clean code, and efficient data processing workflows using Spark, Python, Scala, and cloud-native services. This is a hands-on technical role where your problem-solving skills, analytical mindset, and ability to work in fast-paced environments will be essential.
Key Responsibilities
- Design and build robust, scalable, and efficient data pipelines for ingesting, processing, and transforming structured and unstructured data from a variety of sources (files, databases, streams).
- Develop and maintain ETL/ELT processes using Big Data technologies such as Apache Spark, PySpark, Scala, and Python (a brief illustrative sketch follows this list).
- Work with modern cloud-based tools and services, primarily within the AWS ecosystem (including Amazon EMR, AWS Glue, Amazon Redshift, Amazon S3, and AWS Lambda), to build resilient, high-performing data platforms.
- Develop real-time data streaming solutions using Apache Kafka and implement event-driven architectures where applicable.
- Collaborate with cross-functional teams including data scientists, architects, and business analysts to deliver end-to-end data solutions.
- Ensure data quality, integrity, and consistency across all pipelines and data stores.
- Write optimized SQL queries for complex data extraction and reporting.
- Contribute to performance tuning, debugging, and ongoing improvements of existing data workflows.
- Follow best practices for code versioning, CI/CD, testing, and documentation.
- Support DevOps initiatives and cloud-native deployments as part of agile development teams.
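To give a concrete flavour of this work, below is a minimal PySpark batch ETL sketch. It is purely illustrative: the bucket names, paths, and column names (order_id, amount, order_ts) are hypothetical placeholders rather than a prescribed client setup.

```python
# Minimal PySpark ETL sketch: ingest raw JSON from S3, clean it, write curated Parquet.
# All paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Ingest: read raw, semi-structured JSON files from a landing zone
raw = spark.read.json("s3://example-raw-bucket/orders/")

# Transform: deduplicate, filter out invalid records, derive a partition column
cleaned = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("amount") > 0)
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write partitioned Parquet to a curated zone for downstream analytics
(cleaned.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-curated-bucket/orders/"))
```

Writing the curated layer as partitioned Parquet keeps downstream analytical queries efficient; on an actual engagement a job like this would typically run on Amazon EMR or AWS Glue rather than a local Spark session.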
Required Technical Skills and Experience
- 5–7+ years of experience in Data Engineering, with a strong foundation in data management, data lakes, data warehouses, and Lakehouse architectures.
- At least 4 years of experience working with Big Data technologies, specifically Apache Spark with Python or Scala.
- Proven experience building scalable data solutions on cloud platforms, especially AWS.
- Strong hands-on experience with Amazon EMR, AWS Glue, Amazon Redshift, Amazon S3, and Amazon DynamoDB.
- Solid knowledge of streaming data platforms and message brokers such as Kafka (see the streaming sketch after this list).
- Strong SQL skills and experience working with large data sets and query optimization.
- Familiarity with HDFS, Hive, HBase, and NoSQL databases.
- Ability to translate business requirements into scalable and efficient technical solutions.
- Excellent problem-solving and debugging skills, with a passion for clean and maintainable code.
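As a sketch of the streaming side of the role, here is a minimal Spark Structured Streaming consumer for a Kafka topic. The broker address, topic name, and S3 locations are assumptions for illustration, and running it requires the Spark Kafka connector package on the classpath.

```python
# Minimal Spark Structured Streaming sketch: consume a Kafka topic and persist to S3.
# Broker, topic, and paths below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Source: subscribe to a Kafka topic as an unbounded streaming DataFrame
events = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "events")
         .load()
)

# Kafka delivers keys/values as bytes; cast the value to string before parsing
parsed = events.select(F.col("value").cast("string").alias("payload"))

# Sink: append to S3 as Parquet, with checkpointing for fault-tolerant output
query = (
    parsed.writeStream
          .format("parquet")
          .option("path", "s3://example-stream-bucket/events/")
          .option("checkpointLocation", "s3://example-stream-bucket/checkpoints/events/")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```

The same pattern extends to the event-driven architectures mentioned above, where parsed events would be routed to downstream consumers instead of a file sink.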
Preferred Skills and Certifications
- AWS certification (e.g., AWS Certified Data Analytics – Specialty, AWS Certified Solutions Architect).
- Databricks or Cloudera Spark Developer Certification.
- Experience with AWS Lambda, Step Functions, and serverless data architectures.
- Exposure to CI/CD pipelines, Docker, and infrastructure as code tools.
- Understanding of data governance, security, and compliance best practices.
Why Join IBM Consulting?
- A collaborative and inclusive workplace culture that supports continuous learning and professional growth.
- Opportunities to work with cutting-edge technologies and world-class clients across industries.
- Flexibility to explore new roles, skills, and career paths.
- A culture that values your ideas, initiative, and the unique perspective you bring to the table.
Ready to Make an Impact?
Join IBM Consulting and be part of a global team that’s solving tomorrow’s problems today. Apply now and help our clients turn data into business value.