Offer summary

Qualifications:

Minimum 10 years of experience in large-scale data pipeline development.

Proficiency with AWS services such as S3, Glue, Lambda, and Redshift.

Hands-on experience with Databricks, Apache Spark, and SQL.

Strong understanding of data modeling, architecture, and CI/CD practices.

Key responsibilities:

Design and develop scalable data pipelines on AWS cloud infrastructure.

Implement data processing workflows using Databricks and Apache Spark.

Build and manage data orchestration workflows with Apache Airflow.

Collaborate with cross-functional teams to meet data analytics and reporting needs.

Job description

Tiger Analytics is a fast-growing advanced analytics consulting firm. Our consultants bring deep expertise in Data Science, Machine Learning, and AI. We are the trusted analytics partner for multiple Fortune 500 companies, enabling them to generate business value from data. Our business value and leadership have been recognized by various market research firms, including Forrester and Gartner. We are looking for top-notch talent as we continue to build the best global analytics consulting team in the world.

As a Lead Data Engineer, you will be responsible for designing, building, and maintaining scalable data pipelines on AWS cloud infrastructure. You will work closely with cross-functional teams to support data analytics, machine learning, and business intelligence initiatives. The ideal candidate will have strong experience with AWS services, Databricks, and Apache Airflow.

Key Responsibilities:

Design, develop, and deploy end-to-end data pipelines on AWS cloud infrastructure using services such as Amazon S3, AWS Glue, AWS Lambda, and Amazon Redshift.
Implement data processing and transformation workflows using Databricks, Apache Spark, and SQL to support analytics and reporting requirements.
Build and maintain orchestration workflows using Apache Airflow to automate data pipeline execution, scheduling, and monitoring.
Lead the migration of legacy data systems to modern cloud-based architectures.
Develop and maintain CI/CD pipelines for data workflows.
Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver scalable data solutions.
Optimize data pipelines for performance, reliability, and cost-effectiveness, leveraging AWS best practices and cloud-native technologies.

Requirements

10+ years of experience building and deploying large-scale data processing pipelines in a production environment.
Hands-on experience in designing and building data pipelines on AWS cloud infrastructure.
Strong proficiency in AWS services such as Amazon S3, AWS Glue, AWS Lambda, and Amazon Redshift.
Experience leading the design, development, and optimization of large-scale data pipelines and data lakehouse architectures using Databricks.
Experience architecting and implementing batch and real-time streaming solutions with Apache Spark on Databricks.
Hands-on experience with Apache Airflow for orchestrating and scheduling data pipelines.
Solid understanding of data modeling, database design principles, SQL, and Spark SQL.
Experience with version control systems (e.g., Git) and CI/CD pipelines.
Excellent communication skills and the ability to collaborate effectively with cross-functional teams.
Strong problem-solving skills and attention to detail.

Benefits
This position offers an excellent opportunity for significant career development in a fast-growing and challenging entrepreneurial environment with a high degree of individual responsibility.