Senior Data Engineer (AI)

Work set-up: Full Remote, fully flexible
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Minimum 6 years of software development experience.
  • Proficiency in SQL and Python programming.
  • Experience with big data technologies such as Hadoop, Spark, Kafka, and NoSQL databases.
  • Knowledge of the Azure cloud platform and data pipeline deployment for AI/ML workloads.

Key responsibilities:

  • Develop and maintain large-scale analytics solutions processing terabytes of data.
  • Design and manage data ingestion, streaming, and batch processing workflows.
  • Optimize performance of SQL queries and data flows.
  • Support deployment and monitoring of AI/ML models and data pipelines.

Bonapolia Scaleup https://www.bonapolia.com/
11 - 50 Employees

Job description

We are looking for a motivated Senior Data Engineer (AI) who is willing to dive into a new project with a modern stack. If you’re driven by a curiosity to learn and a desire to produce meaningful results, please apply!

About Our Customer

You will work with the 6th largest privately owned organization in the United States. The customer is one of the Big Four accounting organizations and the largest professional services network in the world by revenue and number of professionals. The company provides audit, tax, consulting, enterprise risk, and financial advisory services, with 263,900 professionals globally.

About the Project

As a Data Engineer, you’ll join a cross-functional development team working on GenAI solutions for digital transformation across Enterprise Products.
The prospective team you will be working with is responsible for the design, development, and deployment of innovative enterprise technology, tools, and standard processes to support the delivery of tax services. The team focuses on delivering comprehensive, value-added, and efficient tax services to our clients. It is a dynamic team of professionals with varying backgrounds in tax, technology development, change management, and project management. The team consults and executes on a wide range of initiatives involving process and tool development and implementation, including training development, engagement management, tool design, and implementation.

Project Tech Stack
Azure Cloud, Microservices Architecture, .NET 8, ASP.NET Core services, Python, Mongo, Azure SQL, Angular 18, Kendo, GitHub Enterprise with Copilot

Requirements

  • 6+ years of hands-on experience in software development
  • Experience coding in SQL and Python, with solid CS fundamentals including data structure and algorithm design
  • Hands-on implementation experience with a combination of the following technologies: Hadoop, MapReduce, Kafka, Hive, Spark, SQL and NoSQL data warehouses
  • Experience with the Azure cloud data platform
  • Experience working with vector databases (Milvus, Postgres, etc.)
  • Knowledge of embedding models and retrieval-augmented generation (RAG) architectures
  • Understanding of LLM pipelines, including data preprocessing for GenAI models
  • Experience deploying data pipelines for AI/ML workloads, ensuring scalability and efficiency
  • Familiarity with model monitoring, feature stores (Feast, Vertex AI Feature Store), and data versioning
  • Experience with CI/CD for ML pipelines (Kubeflow, MLflow, Airflow, SageMaker Pipelines)
  • Understanding of real-time streaming for ML model inference (Kafka, Spark Streaming)
  • Knowledge of data warehousing design, implementation, and optimization
  • Knowledge of data quality testing, automation, and results visualization
  • Knowledge of BI report and dashboard design and implementation (Power BI)
  • Experience supporting data scientists and complex statistical use cases is highly desirable
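For candidates less familiar with the RAG requirement above, the core retrieval step can be sketched in a few lines of plain Python. This is a minimal illustration only, with toy hand-made vectors standing in for real embedding-model output; the document IDs and function names are hypothetical.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, store, top_k=2):
    # Rank stored (doc_id, vector) pairs by similarity to the query
    # and return the IDs of the top_k closest documents.
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Toy 3-dimensional "embeddings" standing in for model output.
store = [
    ("tax-faq", [1.0, 0.0, 0.0]),
    ("audit-guide", [0.0, 1.0, 0.0]),
    ("risk-memo", [0.7, 0.7, 0.0]),
]
print(retrieve([1.0, 0.1, 0.0], store))
```

In a production RAG pipeline, the brute-force scan above is replaced by an approximate nearest-neighbor index in a vector database such as Milvus or Postgres with pgvector, and the retrieved documents are passed to the LLM as context.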

  • English level: Intermediate+

Responsibilities

  • Build, deploy, and maintain mission-critical analytics solutions that process terabytes of data quickly at big-data scale
  • Contribute design, code, and configurations; manage data ingestion, real-time streaming, batch processing, and ETL across multiple data storages
  • Performance-tune complicated SQL queries and data flows

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s):
English

Other Skills

  • Teamwork
  • Communication
  • Problem Solving
