Proficiency in Python and SQL for Big Data platforms.
Experience in implementing DWH/ETL solutions with multiple data sources, especially SAP.
At least 3 years of experience developing and deploying data pipelines on cloud-native ETL tools such as CDAP, IICS, AWS Glue, Azure Synapse, Snowflake, and GCP Composer.
Experience migrating on-premises workloads to GCP CDF or other cloud services.
Requirements:
Develop and deploy data pipelines using cloud-native ETL tools.
Implement complex data transformations for large enterprise clients.
Troubleshoot and optimize GCP services for performance.
Collaborate in Agile Scrum teams to deliver data solutions.
● Must be able to code in Python and SQL on Big Data platforms
● Must have extensive experience implementing DWH/ETL solutions involving multiple data
sources (SAP specifically) and complex transformations for large enterprise customers, preferably Fortune 1000
● Must have 3+ years of experience developing and deploying data pipelines on cloud-native ETL tools, e.g. CDAP, IICS, AWS Glue, Azure Synapse, Snowflake, GCP Composer, etc.
● Must have prior experience migrating on-premises workloads to GCP CDF or other cloud-native services
● Must be proficient in troubleshooting and performance tuning of GCP services
● Must have executed projects using Agile Scrum methodology and be familiar with all processes
involved in Scrum
● Good to have experience with cloud deployment of pipelines and orchestration tools
(Airflow, Composer).
● Good to have hands-on working knowledge of JDBT, REST APIs, Hive, and Java
● Should have experience designing data models that serve multiple applications
from the same underlying model (common schemas across multiple scenarios).
● Should have extensive knowledge of large-scale data processing concepts and