Shamrock Trading Corporation is the parent company for a family of brands in transportation services, finance and technology. Headquartered in Overland Park, KS, Shamrock is frequently recognized among the “Best Places to Work” in Kansas City and Chicago and was most recently recognized as one of America’s top 100 “Most Loved Workplaces” by Newsweek. We also have offices in Atlanta, Chicago, Dallas, Ft. Lauderdale, Houston, Laredo, Nashville, Philadelphia and Phoenix.
With an average annual revenue growth of 25% over several decades, Shamrock’s success is attributed to three key factors: hiring the best people, cultivating long-term relationships with our customers and continually evolving in the marketplace.
Responsibilities
Shamrock Trading Corporation is looking for a Data Engineer who wants to apply their expertise in data warehousing, data pipeline creation and support, and analytical reporting by joining our Data Services team. This role is responsible for gathering and analyzing data from several internal and external sources, designing a cloud-focused data platform for analytics and business intelligence, and reliably providing data to our analysts. It requires a strong understanding of data mining and analytical techniques. An ideal candidate will have strong technical capabilities, business acumen, and the ability to work effectively with cross-functional teams. Responsibilities include but are not limited to:
Develop & Maintain Scalable Data Pipelines: Build, optimize, and maintain ETL/ELT pipelines using Databricks, Apache Spark, and Delta Lake.
Optimize Data Processing: Implement performance tuning techniques to improve Spark-based workloads.
Cloud Data Engineering: Work with AWS services (S3, Lambda, Glue, Redshift, etc.) to design and implement robust data architectures.
Real-time & Streaming Data: Develop streaming solutions using Kafka and Databricks Structured Streaming.
Data Quality & Governance: Implement data validation, observability, and governance best practices using Unity Catalog or other tools.
Cross-functional Collaboration: Partner with analysts, data scientists, and application engineers to ensure data meets business needs.
Automation & CI/CD: Implement infrastructure-as-code (IaC) and CI/CD best practices for data pipelines using tools like Terraform, dbt, and GitHub Actions.
Qualifications
Bachelor’s degree in computer science, data science, or a related technical field, or equivalent practical experience.
2-5+ years of experience in data engineering, with a focus on cloud-based platforms.
Strong hands-on experience with Databricks (including Spark, Delta Lake, and MLflow).
Experience building and maintaining AWS-based data pipelines; our current stack includes AWS Lambda, Docker/ECS, MSK, Airflow, Databricks, and Unity Catalog.
Development experience utilizing two or more of the following:
Python (Pandas/NumPy, Boto3, SimpleSalesforce)
Databricks (PySpark, pySQL, DLT)
Apache Spark
Kafka and the Kafka Connect ecosystem (schema registry and Avro)
Familiarity with CI/CD for data pipelines and infrastructure as code (Terraform, dbt).
Strong SQL skills for data transformation and performance tuning.
Understanding of data security and governance best practices.
Enthusiasm for working directly with customer teams (business units and internal IT).
Preferred Qualifications
Proven experience with relational and NoSQL databases (e.g., Postgres, Redshift, MongoDB).
Experience with version control (Git) and peer code reviews.
Familiarity with data lakehouse architectures and optimization strategies.
Familiarity with data visualization techniques using tools such as Grafana, Power BI, Amazon QuickSight, and Excel.
Benefits Package
At Shamrock we hire bright, ambitious people and give them the tools they need to be successful. By investing in training and development, we aim to offer employees a long-term career with ongoing opportunities for advancement. Shamrock also offers a premier set of benefits for employees and their families:
Medical: Fully paid healthcare, dental and vision premiums for employees and eligible dependents
Work-Life Balance: Competitive PTO and paid leave policies
Financial: Generous company 401(k) contributions and employee stock ownership after one year
Wellness: Onsite gym and discounted membership to select fitness centers. Jogging trails available at Overland Park offices
#LI-NB1 #LI-Remote
Required Profile
Industry: Truck & road transport
Spoken language(s): English