Requirements:
Bachelor's degree in Computer Science, Data Engineering, or a related field.
Proficiency in Python, SQL, and Java for data manipulation and application development.
Experience with cloud services, particularly AWS, and data warehousing concepts.
Familiarity with ETL processes, tools such as Snowflake and Databricks, and various data formats.
Key responsibilities:
Develop and maintain data pipelines and ETL processes for data integration.
Implement big data solutions and analytics using cloud platforms and tools.
Provide technical support, including troubleshooting and monitoring of data systems.
Collaborate with teams to understand data needs and ensure data quality across various sources (a minimal check is sketched below).
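As a minimal illustration of the data-quality responsibility above, the sketch below runs a few basic PySpark checks. The source path and key column (student_id) are hypothetical assumptions for illustration, not details from this posting.

# Hypothetical data-quality sketch (PySpark); path and column names are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

# Assumed input location; replace with the real source table or path.
df = spark.read.parquet("s3://example-bucket/raw/enrollments/")

# Basic checks: row count, null keys, duplicate keys.
total = df.count()
null_keys = df.filter(F.col("student_id").isNull()).count()
dup_keys = df.groupBy("student_id").count().filter(F.col("count") > 1).count()

print(f"rows={total}, null student_id={null_keys}, duplicated student_id={dup_keys}")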
Navitas Partners LLC is a diversity-led business headquartered in NJ, operating as a dynamic IT professional services and workforce solutions company. We believe creating the best solutions in human resource services means always going above and beyond, and people are our most important asset. Our “DNA” invokes core values of knowing, trusting, and serving our relationships. The better we know our clients and candidates, the better our relationships, and the better we match needs and exceed expectations. We want our clients’ experience with us to reflect a transparent, professional, and driven relationship.
At Navitas Partners we strive for excellence in people and grow with you to become a true extension of your HR-specific business requirements, while remaining sensitive to your price and business needs.
Certified Diversity Employer: SBE • WOSB • WBE • MBE • NMSDC
NAICS: 541511, 541512, 541513, 541519, 54164, 518210, 811212, 561320
Job Title: Data Engineer
Duration: 12+ Months
Location: 100% Remote (All over India)
Roles/Requirements ("Data Engineer")
This is a truly unique opportunity for a data engineer on a Learning & Analytics project. As part of this team, you will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be the construction and maintenance of our data pipeline, ETL processes, and data warehouse. The Data Engineer will also be responsible for data quality and for understanding the data needs of our various data sources in order to anticipate and scale our systems.
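As a rough, non-authoritative illustration of the pipeline work described above, here is a minimal PySpark ETL sketch: it reads a raw CSV extract, applies a simple transformation, and writes partitioned Parquet to a warehouse staging area. The bucket, paths, and column names (for example s3://example-bucket and EventTime) are assumptions for illustration, not details from this posting.

# Illustrative ETL step (PySpark); paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-staging").getOrCreate()

# Extract: read a raw CSV drop from the landing zone (assumed location).
raw = spark.read.option("header", True).csv("s3://example-bucket/landing/events/")

# Transform: normalize column names, cast timestamps, derive a load date.
clean = (raw
    .withColumnRenamed("EventTime", "event_time")
    .withColumn("event_time", F.to_timestamp("event_time"))
    .withColumn("load_date", F.current_date()))

# Load: write partitioned Parquet to the staging area the warehouse reads from.
clean.write.mode("overwrite").partitionBy("load_date").parquet(
    "s3://example-bucket/staging/events/")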
Must-haves:
Candidates must be willing to learn, work well independently, be open to feedback, and be enthusiastic, with demonstrated technical aptitude, skills, and abilities. They will support the NYU Returns Project, which uses student and vaccine data to manage individuals' access to the university.
Current technology includes (but is not limited to):
Mulesoft API Management
Change Data Capture (CDC)
Data ingestion through preparation, exploration, and consumption using cloud and big data platforms
Integrate data from a variety of data sources (data warehouses, data marts) into on-premises or cloud-based data structures using Snowflake
Develop and implement streaming, data lake, and analytics big data solutions
Integrate data from multiple data sources; knowledge of various ETL techniques and frameworks using Snowflake or Databricks
Create applications using Change Data Capture tools (see the sketch after this list)
Technical support (including troubleshooting and monitoring)
Technical Analyst and Project Management Support
Application Performance and System Testing Analyst
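A minimal sketch of the kind of CDC-driven load referenced above: captured changes in a staging table are applied to a Snowflake target with a MERGE statement via the snowflake-connector-python package. The connection parameters, database, schema, table names, and change-flag column (op) are placeholders, not details from this posting.

# Illustrative CDC-style upsert into Snowflake; all names and credentials are placeholders.
import os
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="STAGING",
)

# Apply captured inserts, updates, and deletes from the CDC staging table in one pass.
merge_sql = """
MERGE INTO analytics.core.customers AS tgt
USING analytics.staging.customers_cdc AS src
    ON tgt.customer_id = src.customer_id
WHEN MATCHED AND src.op = 'D' THEN DELETE
WHEN MATCHED THEN UPDATE SET tgt.email = src.email, tgt.updated_at = src.updated_at
WHEN NOT MATCHED AND src.op <> 'D' THEN
    INSERT (customer_id, email, updated_at) VALUES (src.customer_id, src.email, src.updated_at)
"""

cur = conn.cursor()
try:
    cur.execute(merge_sql)
finally:
    cur.close()
    conn.close()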
"Ideal” candidates will have the following experience, knowledge, skills or abilities:
Utilize ETL processes to build data repositories using Snowflake, Python, etc.; integrate data into a data lake using Spark, PySpark, Hive, Kafka Streaming, etc.
Application development, including cloud development experience, preferably using AWS services, especially S3, API Gateway, Redshift, Lambda, etc.
Must have expertise in the following:
Python/R, SQL, Java
Navigational flow, working with Notebooks, scheduling, integrating with AWS for cloud storage
Working with different file formats (Hive, Parquet, CSV, JSON, Avro, etc.) and compression techniques
Integrating PySpark with different data sources, for example Snowflake, Oracle, Postgres, MySQL, etc. (a sketch follows this list)
Experience with AWS services such as Redshift, S3, RDS, Athena, DynamoDB, and Aurora, as well as the ability to connect to data through web APIs and REST APIs
Knowledge of business intelligence tools, enterprise reporting, report development, data modeling, data warehouse architecture, and data warehousing concepts
Comfortable with AWS cloud (S3, EC2, EMR, Redshift, etc.)
Ability to collaborate with colleagues across different schools/locations
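A short sketch of the PySpark integration skills listed above: reading several of the named file formats plus a JDBC source (Postgres), then landing a joined result in S3. Every path, host, credential, and table name here is a placeholder, and the Avro read assumes the spark-avro package is on the classpath.

# Illustrative multi-source PySpark reads; all locations and credentials are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("multi-source-read").getOrCreate()

# File formats named above: Parquet, CSV, JSON, Avro.
parquet_df = spark.read.parquet("s3://example-bucket/curated/orders/")
csv_df = spark.read.option("header", True).csv("s3://example-bucket/landing/orders.csv")
json_df = spark.read.json("s3://example-bucket/landing/events.json")
avro_df = spark.read.format("avro").load("s3://example-bucket/landing/events.avro")

# JDBC read from Postgres (driver jar and real credentials assumed to be configured).
pg_df = (spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://example-host:5432/analytics")
    .option("dbtable", "public.customers")
    .option("user", "etl_user")
    .option("password", "REDACTED")
    .load())

# Join and land the result back to S3 as Parquet for downstream consumption.
result = parquet_df.join(pg_df, "customer_id", "left")
result.write.mode("overwrite").parquet("s3://example-bucket/curated/orders_enriched/")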
Required profile
Spoken language(s):
English