Data Engineer (AI)

Requirements:

  • Hands-on experience deploying LLM-based applications, including RAG or similar retrieval systems
  • Proven experience deploying systems on AWS or Azure (AWS preferred)
  • 5+ years of data and analytics engineering in cloud environments
  • Expertise in SQL, Python, and schema design with experience in data cataloging and governance tools

Roles & Responsibilities

  • Design, build, and maintain scalable data pipelines for structured and unstructured data ingestion, transformation, and processing
  • Architect, build, and deploy LLM-based solutions on cloud platforms, including prompt pipelines, embeddings, vector databases, and evaluation workflows
  • Design and implement RAG systems end-to-end, including document ingestion, chunking/embedding, indexing, retrieval, grounding, and model integration
  • Work closely with clients to gather requirements, provide technical guidance, and present solutions and implementation plans

Job description

About Data Society Group
At Data Society Group, we provide the highest quality, leading-edge, industry-tailored data and AI training and solutions for Fortune 1000 companies and federal, state, and local governmental organizations. We partner with our clients to educate, equip, and empower their workforces with the skills they need to achieve their goals and expand their impact. We are empowering the workforces of the future, supporting engineers and scientists as they train on the most complex AI solutions and machine learning skills.

Role Overview
We are seeking a capable and resourceful Data Engineer with expertise in cloud-based text-focused AI systems to join our technology and solutions team. In this role, you will be a key individual contributor, applying your expertise to build robust, scalable, and complex data and AI solutions for our external clients. You will work within a cross-functional team, collaborating closely with UX Designers, Engineers, and Project Managers to translate client requirements into high-quality technical deliverables.

Note: Due to the confidential nature of our work with federal government clients, this role requires the ability to pass a United States federal government Public Trust background check and is open exclusively to U.S. citizens located within the United States.

Responsibilities

  • Design, build, and maintain scalable data pipelines for structured and unstructured data ingestion, transformation, and processing.
  • Architect, build, and deploy LLM-based solutions on cloud platforms, including prompt pipelines, orchestration layers, embeddings, vector databases, and evaluation workflows.
  • Design and implement RAG systems end-to-end, including document ingestion, chunking/embedding, indexing, retrieval, grounding, and model integration.
  • Architect and enforce data models, governance, cataloging and schema design to support both analytics and AI workloads.
  • Build and optimize cloud-native data architectures to support compute, storage, and orchestration for high-throughput, production-grade AI workloads.
  • Implement reliable and efficient ETL patterns, leveraging best practices for data quality, lineage, versioning, and cataloging.
  • Instrument observability and monitoring for data pipelines, including latency, error rates, and schema drift, with alerting and automated remediation where possible.
  • Implement monitoring, observability, and performance optimization for data and AI systems.
  • Operate effectively within Agile workflows and contribute to sprint planning, estimation, backlog refinement, and continuous improvement.
  • Work closely with clients to gather requirements, provide technical guidance, and present solutions and implementation plans.
  • Communicate complex technical information to both technical and non-technical stakeholders.
  • Work cross-functionally with UX, engineering, and PM teams to deliver client-facing solutions.
  • Translate complex technical needs into clear development requirements and implementation plans.
  • Stay current with emerging technologies and recommend improvements to our engineering practices, architecture patterns, and cloud ecosystem.

Qualifications

  • Hands-on experience deploying LLM-based applications, including RAG or similar retrieval systems.
  • Proven experience deploying systems on AWS or Azure (AWS preferred).
  • Strong understanding of embeddings, chunking strategies, retrieval optimization, and evaluation.
  • 5+ years of data and analytics engineering in cloud environments.
  • Expertise in SQL, Python, and schema design with experience in data cataloging and governance tools.
  • Demonstrated experience building robust and maintainable data architectures, including real-time or streaming pipelines.
  • Experience working in Agile / Scrum development processes.
  • Excellent communication skills and ability to work cross-functionally with non-technical teams.

Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice.

This position is remote within the US but based out of the Washington, DC area, with travel to client sites in DC if needed.
