This is a remote position.
Location: Remote, anywhere in the US
Key Responsibilities:
Data Ingestion and Integration:
· Develop and maintain data ingestion processes to collect data from various sources.
· Integrate data from different platforms and databases into a unified data lake.
Data Processing:
· Create data processing jobs using Hive and PySpark for large-scale data transformation.
· Optimize data processing workflows to ensure efficiency and performance.
Data Pipeline Development:
· Design and implement ETL pipelines to move data from raw to processed formats.
· Monitor and troubleshoot data pipelines, ensuring data quality and reliability.
Data Modeling and Optimization:
· Develop data models for efficient querying and reporting using Hive.
· Implement performance tuning and optimization strategies for Hadoop and Spark.
Data Governance:
· Implement data security and access controls to protect sensitive information.
· Ensure compliance with data governance policies and best practices.
Collaboration:
· Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and provide ongoing data support.
Qualifications:
Preferred Qualifications: