[Job - 22203] Senior Data Scientist (NLP), Colombia

Remote: Full Remote

Offer summary

Qualifications:

  • Solid experience as a Data Scientist with a focus on NLP projects.
  • Proficiency in Python and libraries like NLTK, spaCy, and Gensim.
  • Strong understanding of NLP techniques such as sentiment analysis and topic extraction.
  • Experience with sequence-to-sequence models and the data science pipeline.

Key responsibilities:

  • Conduct data exploration and ensure data quality for NLP contexts.
  • Define and implement NLP models to meet business outcomes.
  • Develop ontologies and contribute to agentic intelligence systems.
  • Train and validate models while documenting processes for stakeholders.

CI&T XLarge http://www.ciandt.com
5001 - 10000 Employees

Job description

We are tech transformation specialists, uniting human expertise with AI to create scalable tech solutions.
With over 6,500 CI&Ters around the world, we’ve built partnerships with more than 1,000 clients over our 30-year history. Artificial Intelligence is our reality.

We are seeking a highly skilled Senior Data Scientist with a strong focus on Natural Language Processing (NLP) to drive AI initiatives within the American health industry. This role emphasizes the development of agentic intelligence systems, building Retrieval-Augmented Generation (RAG) frameworks, and creating ontologies that enhance business outcomes through advanced data-driven insights.

Responsibilities:
Conduct thorough data exploration to validate requirements for NLP contexts and ensure data quality.
Perform NLP pre-processing tasks, including tokenization, lexical analysis, syntactic analysis, semantic analysis, and pragmatic analysis.
Define and implement optimal NLP models that align with expected business outcomes.
Contribute to building agentic intelligence systems and RAG frameworks that enhance data-driven decision-making.
Develop and manage ontologies to support effective data utilization and enhance understanding across teams.
Train and validate models using rigorous experimentation to evaluate and enhance their performance.
Document model development processes, methodologies, and results for both internal and external stakeholders.
Engage in text classification and sentiment analysis, employing both traditional machine learning classifiers and deep learning models.
Continuously evaluate and improve NLP model performance through systematic experimentation and analysis.

Requirements for this challenge:
Solid experience as a Data Scientist, specifically in NLP projects.
Proficiency in programming with Python, particularly using libraries such as NLTK, spaCy, and Gensim.
Strong understanding of NLP techniques, including but not limited to: Topic Extraction, Summarization, Categorization, and Sentiment Analysis.
Demonstrated experience with sequence-to-sequence models for tasks such as machine translation, text summarization, and question answering.
Familiarity with advanced topic modeling techniques, including Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF).
Data Science Pipeline:
Comprehensive understanding of the entire data science pipeline, including data gathering, preprocessing, model development, validation, and deployment.
Experience with quantitative model evaluation metrics, such as accuracy, precision, recall, F1-score, and ROUGE score.
Ethical Considerations:
Awareness of ethical considerations in NLP, including biases in data and models, privacy concerns, and potential societal impacts.
Problem Solving and Creativity:
Strong critical thinking skills with the ability to troubleshoot issues and creatively apply different NLP techniques to solve real-world problems.
Communication Skills:
Advanced oral and written communication skills in English, with the ability to clearly convey complex concepts to diverse audiences.
Experience working on international projects, demonstrating adaptability and cultural awareness.
Collaboration and Innovation:
Ability to work collaboratively with cross-functional teams to define the best NLP models aligned with business objectives.
Commitment to leveraging state-of-the-art techniques for handling, analyzing, and visualizing large datasets.


Nice to Have:
Experience with Databricks.
Familiarity with Transformers, BERT, and Named Entity Recognition (NER).
Background in data engineering and MLOps, including knowledge of Azure ML and Azure DevOps.
Knowledge of data protection regulations (e.g., CCPA, HIPAA), handling of personally identifiable information (PII), and related best practices.


CI&T is an equal-opportunity employer. We celebrate and appreciate the diversity of our CI&Ters’ identities and lived experiences. We are committed to building, promoting, and retaining a diverse, inclusive, and equitable company and culture focused on creating a better tomorrow.

At CI&T, we recognize that innovation and transformation only happen in diverse, inclusive, and safe work environments. Our teams are most impactful when people from all backgrounds and experiences collaborate to share, create, and hear ideas.
Before applying for our opportunities, please review the Conflict of Interest Policy on our website.

We strongly encourage candidates from diverse and underrepresented communities to apply for our vacancies.

Our benefits include:

- Premium Healthcare
- Meal voucher
- Maternity and parental leave
- Mobile services subsidy
- Sick pay
- Life insurance
- CI&T University   
- Colombian Holidays
- Paid Vacations
And many others. 


Collaboration is our superpower, diversity unites us, and excellence is our standard. 
We value diverse identities and life experiences, fostering a diverse, inclusive, and safe work environment. We encourage candidates from diverse and underrepresented groups to apply for our open positions.

Required profile

Spoken language(s):
English

Other Skills

  • Creativity
  • Collaboration
  • Communication
  • Problem Solving
