Qualifications
Bachelor’s or master’s degree in Computer Science, Information Technology, Engineering, or a related field.
8+ years of experience in data engineering or a related field, including leadership experience.
Expert-level proficiency in Spark and strong coding skills in Python and Scala.
Solid knowledge of Azure Databricks and experience with SQL for large datasets.
At Cummins, we empower everyone to grow their careers through meaningful work, building inclusive and equitable teams, coaching, development and opportunities to make a difference. Across our entire organization, you'll find engineers, developers, and technicians who are innovating, designing, testing, and building. You'll also find accountants and marketers, as well as manufacturing, quality, and supply chain specialists who are working with technology that's just as innovative and advanced.
From your first day at Cummins, we’re focused on understanding your talents, current skills and future goals – and creating a plan to get you there. Your journey begins with planning your development and connecting to diverse experiences designed to spur innovation. From our internships to our senior leadership roles, we attract, hire and reward the best and brightest from around the world and look to them for new ideas and fresh perspectives. Learn more about #LifeAtCummins at cummins.com/careers.
Leads projects for the design, development, and maintenance of a data and analytics platform. Ensures efficient processing, storage, and availability of data for analysts and other consumers. Collaborates with key business stakeholders, IT experts, and subject-matter experts to plan, design, and deliver optimal analytics and data science solutions. Works on one or more product teams simultaneously.
Note: Although the role is categorized as Remote, it follows a hybrid work model.
Key Responsibilities
Design and automate the deployment of distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
Develop frameworks for continuous monitoring and troubleshooting of data quality and integrity issues.
Implement data governance processes for metadata management, data access, and retention policies for internal and external users.
Provide guidance on building reliable, efficient, scalable, and high-quality data pipelines with monitoring and alerting mechanisms, using ETL/ELT tools or scripting languages (a minimal sketch follows this list).
Design and implement physical data models to define database structures and optimize performance through indexing and table relationships.
Optimize, test, and troubleshoot data pipelines.
Develop and manage large-scale data storage and processing solutions using distributed and cloud-based platforms such as data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, and DynamoDB.
Utilize modern tools and architectures to automate common, repeatable, and tedious data preparation and integration tasks.
Drive automation in data integration and management by modernizing the data management infrastructure.
Ensure the success of critical analytics initiatives by employing agile and DevOps practices such as Scrum and Kanban.
Coach and mentor less experienced team members.
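To make the pipeline-related responsibilities above concrete, here is a minimal sketch of an ingest-transform-publish job with a simple data-quality gate, assuming a PySpark environment such as Azure Databricks. The paths, column names, and 5% threshold are hypothetical placeholders, not part of the role description.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ingest-and-transform").getOrCreate()

# Ingest: read raw JSON events from a landing zone (hypothetical path).
raw = spark.read.json("/mnt/landing/events/")

# Transform: parse timestamps and drop rows missing a primary key.
clean = (
    raw.withColumn("event_ts", F.to_timestamp("event_ts"))
       .filter(F.col("event_id").isNotNull())
)

# Data-quality gate: fail loudly, so a job monitor can alert, rather than
# silently publishing bad data. The 5% threshold is illustrative.
n_raw, n_clean = raw.count(), clean.count()
if n_raw > 0 and (n_raw - n_clean) / n_raw > 0.05:
    raise ValueError(f"Data-quality check failed: {n_raw - n_clean} rows dropped")

# Publish: write to a curated zone as Parquet, partitioned by event date.
(clean.withColumn("event_date", F.to_date("event_ts"))
      .write.mode("overwrite")
      .partitionBy("event_date")
      .parquet("/mnt/curated/events/"))

Failing the job on a quality breach, rather than logging and continuing, is a common way to wire a pipeline into existing monitoring and alerting.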
Technical Skills:
Expert-level proficiency in Spark, including optimization, debugging, and troubleshooting Spark jobs (see the sketch after this list).
Solid knowledge of Azure Databricks for scalable, distributed data processing.
Strong coding skills in Python and Scala for data processing.
Experience with SQL, especially for large datasets.
Knowledge of data formats such as Iceberg, Parquet, ORC, and Delta Lake.
Experience developing CI/CD processes.
Deep understanding of Azure data services (e.g., Azure Blob Storage, Azure Data Lake Storage, Azure SQL Data Warehouse, Synapse Analytics).
Familiarity with data lakes, data warehouses, and modern data architectures.
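As a brief illustration of the Spark, Databricks, SQL, and Delta Lake skills listed above, the following sketch reads a Delta table, aggregates it with Spark SQL, and writes the result back. It assumes a Databricks-style cluster with Delta Lake available; the paths, table, and column names are hypothetical.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("skills-sketch").getOrCreate()

# Read a Delta Lake table from cloud storage (hypothetical mount path).
orders = spark.read.format("delta").load("/mnt/datalake/orders")
orders.createOrReplaceTempView("orders")

# SQL over a large dataset: the aggregation is expressed declaratively
# and executed by Spark's distributed engine.
daily = spark.sql("""
    SELECT order_date, COUNT(*) AS n_orders, SUM(amount) AS revenue
    FROM orders
    GROUP BY order_date
""")

# Write the aggregate back as a Delta table, replacing earlier output.
daily.write.format("delta").mode("overwrite").save("/mnt/datalake/daily_orders")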
Competencies
System Requirements Engineering - Translates stakeholder needs into verifiable requirements, establishing acceptance criteria and assessing the impact of requirement changes.
Collaborates - Builds partnerships and works collaboratively with others to meet shared objectives.
Communicates effectively - Develops and delivers clear, audience-specific communications.