Qualifications
Bachelor’s or master’s degree in Computer Science, Information Technology, Engineering, or a related field.
8+ years of experience in data engineering or a related field, including leadership experience.
Expert-level proficiency in Spark and strong coding skills in Python and Scala.
Solid knowledge of Azure Databricks and experience with SQL for large datasets.
At Cummins, we empower everyone to grow their careers through meaningful work, building inclusive and equitable teams, coaching, development and opportunities to make a difference. Across our entire organization, you'll find engineers, developers, and technicians who are innovating, designing, testing, and building. You'll also find accountants and marketers, as well as manufacturing, quality, and supply chain specialists who are working with technology that's just as innovative and advanced.
From your first day at Cummins, we’re focused on understanding your talents, current skills and future goals – and creating a plan to get you there. Your journey begins with planning your development and connecting to diverse experiences designed to spur innovation. From our internships to our senior leadership roles, we attract, hire and reward the best and brightest from around the world and look to them for new ideas and fresh perspectives. Learn more about #LifeAtCummins at cummins.com/careers.
Leads projects for the design, development, and maintenance of a data and analytics platform. Ensures efficient processing, storage, and availability of data for analysts and other consumers. Collaborates with key business stakeholders, IT experts, and subject-matter experts to plan, design, and deliver optimal analytics and data science solutions. Works on one or more product teams simultaneously.
Note: Although the role is categorized as Remote, it follows a hybrid work model.
Key Responsibilities
Design and automate the deployment of distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
Develop frameworks for continuous monitoring and troubleshooting of data quality and integrity issues.
Implement data governance processes for metadata management, data access, and retention policies for internal and external users.
Provide guidance on building reliable, efficient, scalable, and high-quality data pipelines with monitoring and alerting mechanisms, using ETL/ELT tools or scripting languages (a minimal sketch follows this list).
Design and implement physical data models to define database structures and optimize performance through indexing and table relationships.
Optimize, test, and troubleshoot data pipelines.
Develop and manage large-scale data storage and processing solutions using distributed and cloud-based platforms such as data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, and DynamoDB.
Utilize modern tools and architectures to automate common, repeatable, and tedious data preparation and integration tasks.
Drive automation in data integration and management by modernizing the data management infrastructure.
Ensure the success of critical analytics initiatives by employing agile and DevOps practices such as Scrum and Kanban.
Coach and mentor less experienced team members.
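To make the pipeline-related responsibilities above concrete, here is a minimal sketch of an ingest-transform-publish job with a simple data-quality gate, assuming a PySpark environment such as Azure Databricks. The paths, column names, and 5% threshold are hypothetical placeholders, not part of the role description.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ingest-and-transform").getOrCreate()

# Ingest: read raw JSON events from a landing zone (hypothetical path).
raw = spark.read.json("/mnt/landing/events/")

# Transform: parse timestamps and drop rows missing a primary key.
clean = (
    raw.withColumn("event_ts", F.to_timestamp("event_ts"))
       .filter(F.col("event_id").isNotNull())
)

# Data-quality gate: fail loudly, so a job monitor can alert, rather than
# silently publishing bad data. The 5% threshold is illustrative.
n_raw, n_clean = raw.count(), clean.count()
if n_raw > 0 and (n_raw - n_clean) / n_raw > 0.05:
    raise ValueError(f"Data-quality check failed: {n_raw - n_clean} rows dropped")

# Publish: write to a curated zone as Parquet, partitioned by event date.
(clean.withColumn("event_date", F.to_date("event_ts"))
      .write.mode("overwrite")
      .partitionBy("event_date")
      .parquet("/mnt/curated/events/"))

Failing the job on a quality breach, rather than logging and continuing, is a common way to wire a pipeline into existing monitoring and alerting.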
Technical Skills:
Expert-level proficiency in Spark, including optimization, debugging, and troubleshooting Spark jobs (see the sketch after this list).
Solid knowledge of Azure Databricks for scalable, distributed data processing.
Strong coding skills in Python and Scala for data processing.
Experience with SQL, especially for large datasets.
Knowledge of data formats such as Iceberg, Parquet, ORC, and Delta Lake.
Experience developing CI/CD processes.
Deep understanding of Azure data services (e.g., Azure Blob Storage, Azure Data Lake Storage, Azure SQL Data Warehouse, Synapse Analytics).
Familiarity with data lakes, data warehouses, and modern data architectures.
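As a brief illustration of the Spark, Databricks, SQL, and Delta Lake skills listed above, the following sketch reads a Delta table, aggregates it with Spark SQL, and writes the result back. It assumes a Databricks-style cluster with Delta Lake available; the paths, table, and column names are hypothetical.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("skills-sketch").getOrCreate()

# Read a Delta Lake table from cloud storage (hypothetical mount path).
orders = spark.read.format("delta").load("/mnt/datalake/orders")
orders.createOrReplaceTempView("orders")

# SQL over a large dataset: the aggregation is expressed declaratively
# and executed by Spark's distributed engine.
daily = spark.sql("""
    SELECT order_date, COUNT(*) AS n_orders, SUM(amount) AS revenue
    FROM orders
    GROUP BY order_date
""")

# Write the aggregate back as a Delta table, replacing earlier output.
daily.write.format("delta").mode("overwrite").save("/mnt/datalake/daily_orders")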
Competencies
System Requirements Engineering - Translates stakeholder needs into verifiable requirements, establishing acceptance criteria and assessing the impact of requirement changes.
Collaborates - Builds partnerships and works collaboratively with others to meet shared objectives.
Communicates effectively - Develops and delivers clear, audience-specific communications.