Senior Data Scientist

Requirements:

  • Hands-on experience building applications or workflows powered by large language models (LLMs)
  • 5+ years of relevant experience in Data Science, Machine Learning, or AI-focused roles
  • Strong curiosity for emerging AI technologies and the ability to evaluate and adopt them responsibly
  • Academic background in a quantitative discipline such as Statistics, Mathematics, Computer Science, Engineering, Economics, or a related field

Roles & Responsibilities

  • Design, develop, and deploy GenAI-powered internal applications, copilots, and workflow accelerators
  • Build reusable AI components, including retrieval pipelines, structured prompting patterns, orchestration workflows, and evaluation harnesses
  • Develop and maintain statistical and machine learning models to support automation, optimization, forecasting, and classification use cases
  • Document methodologies, assumptions, and implementation details to ensure transparency and reproducibility

Job description

Business Area:

IT

Seniority Level:

Mid-Senior level

Job Description: 

At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world's largest enterprises.

At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower organizations to transform complex data into clear and actionable outcomes. Join us in our mission to harness the power of data.

We are seeking a talented and curious Senior Data Scientist to join our fast-paced, data-driven organization. In this role, you will design and deliver AI-powered systems and applications that accelerate decision-making and enhance operational excellence.

You will combine strong statistical foundations, advanced programming expertise, and modern Generative AI techniques to build scalable, production-ready solutions. This is a builder-focused role. You will move beyond analysis to develop internal copilots, AI-enabled workflows, and reusable platform components that embed intelligence directly into business processes.

Our work empowers leadership and operational teams by creating measurable, AI-enabled capabilities. We seek a thoughtful and pragmatic innovator who is enthusiastic about GenAI, disciplined experimentation, and building durable internal AI infrastructure.

To succeed in this role, you will demonstrate technical depth, intellectual curiosity, and a strong builder mindset:

  • Generative AI & LLM Engineering: Hands-on experience working with large language models (LLMs) and modern AI tooling. This includes prompt design, structured output generation, retrieval-augmented generation (RAG), evaluation strategies, and workflow automation. Ability to translate GenAI capabilities into reliable, enterprise-ready solutions that integrate with existing systems and data sources.

  • AI Application Development: Experience rapidly prototyping and iterating on internal applications, copilots, or AI-enabled workflow tools. Comfortable evolving prototypes into maintainable, production-grade solutions. Familiarity with modern development frameworks (e.g., Streamlit, Gradio, FastAPI, or similar) is beneficial.

  • Platform-Oriented Thinking: Demonstrated ability to design reusable components such as shared prompt libraries, retrieval pipelines, evaluation frameworks, and standardized integration patterns that enable scalable AI adoption.

  • Data Science & Machine Learning Expertise: Proficiency in Python (or R) for data preparation, feature engineering, statistical modeling, and machine learning. Experience with core data science libraries (e.g., Pandas, NumPy, scikit-learn) and a solid understanding of supervised and unsupervised learning methods.

  • Strong Mathematical and Statistical Foundation: Deep understanding of probability, statistical inference, experimentation, and quantitative reasoning to ensure model robustness and reliability.

  • SQL & Data Fluency: Strong understanding of relational databases and the ability to quickly learn new schemas and data environments. Comfortable writing efficient, production-grade SQL to support modeling, experimentation, and AI-enabled applications.

  • Exceptional Communication Skills: Ability to translate complex business challenges into technical solutions and clearly communicate findings, trade-offs, and recommendations to both technical and non-technical stakeholders.

  • Collaborative Development Experience: Experience working in collaborative environments such as Cloudera Data Science Workbench, Jupyter, Zeppelin, or similar platforms.

  • GitHub Proficiency: Experience using version control to support collaboration, code review, documentation, and long-term maintainability.

You will apply rigorous analytical thinking and modern AI capabilities to design, build, and scale high-impact solutions.

As a Senior Data Scientist, you will:

  • Design, develop, and deploy GenAI-powered internal applications, copilots, and workflow accelerators.

  • Build reusable AI components, including retrieval pipelines, structured prompting patterns, orchestration workflows, and evaluation harnesses.

  • Design retrieval strategies that connect LLMs to trusted internal knowledge sources, ensuring grounded and reliable outputs.

  • Develop and maintain statistical and machine learning models to support automation, optimization, forecasting, and classification use cases.

  • Implement evaluation and validation frameworks to measure quality, accuracy, and consistency of AI-driven systems.

  • Partner cross-functionally to identify high-value opportunities for AI enablement across the organization.

  • Create reusable datasets, feature pipelines, and experimentation frameworks to support iterative development.

  • Uphold high standards for quality, reliability, and responsible AI practices.

  • Contribute to peer review processes to ensure technical rigor and maintainability.

  • Document methodologies, assumptions, and implementation details to ensure transparency and reproducibility.

  

We are excited if you have (Required Experience):

  • Hands-on experience building applications or workflows powered by large language models (LLMs).

  • Evidence of a builder mindset through shipped AI tools, internal platforms, or automation solutions.

  • Demonstrated experience applying machine learning techniques in production or enterprise environments.

  • 5+ years of relevant experience in Data Science, Machine Learning, or AI-focused roles.

  • Strong curiosity for emerging AI technologies and the ability to evaluate and adopt them responsibly.

  • Academic background in a quantitative discipline such as Statistics, Mathematics, Computer Science, Engineering, Economics, or a related field.

You may also have (Preferred Qualifications):

  • Experience with vector databases, embedding models, or semantic retrieval systems.

  • Experience designing internal AI platforms or shared enablement frameworks.

  • Familiarity with API-driven architectures and integrating AI capabilities into enterprise systems.

  • Exposure to responsible AI practices, governance frameworks, or model lifecycle management.

This role is not eligible for immigration sponsorship.

What you can expect from us:

  • Generous PTO Policy 

  • Support for work-life balance with Unplugged Days

  • Flexible WFH Policy 

  • Mental & Physical Wellness programs 

  • Phone and Internet Reimbursement program 

  • Access to Continued Career Development 

  • Comprehensive Benefits and Competitive Packages 

  • Paid Volunteer Time

  • Employee Resource Groups

EEO/VEVRAA

#LI-MH2

#LI-Remote
