University Job Title
Data Scientist
Bargaining Unit
None - Not included in the union (Yale Union Group)
Time Type
Full time
Duration Type
Regular
Compensation Grade
Administration & Operations
Compensation Grade Profile
Senior Manager; Senior Program Leader (26)
Searchable Job Family
Research Res Support, Research/Support
Total # of hours to be worked:
37.5
Work Week
Standard (M-F equal number of hours per day)
Work Location
Central Campus
Worksite Address
120 High Street
New Haven, CT 06511
Work Model
Remote
Position Focus
Yale University’s Cultural Heritage Collections and Scholarly Communication division is seeking a Data Scientist to work with our knowledge graph and cutting-edge technologies to improve access to and understanding of the University's collections. This represents a unique opportunity to apply AI, natural language processing, and computational analysis techniques to a vast corpus of open knowledge, in an innovative and collaborative environment. This role will provide the expertise necessary to advance the core mission of enabling research, teaching and learning to be performed with the collections. To do this, we are seeking someone with a strong understanding of machine learning technologies, data analysis platforms and techniques, and with a track record of data engineering skills and practical accomplishments.
The position reports to the Senior Director for Digital Cultural Heritage, within the Office of the Vice-Provost. The position will work closely with colleagues in Yale's museums, libraries, archives, central Information Technology Services (ITS), and with faculty and research projects as time allows. Engagement with the cultural heritage community through conferences, meetings and research projects is also anticipated as a core function.
Primary Responsibilities:
- Build machine learning based workflows to improve content in the knowledge graph, including via Generative AI, natural language processing, and other information science techniques. Improvements will include accurate reconciliation of entities, restructuring of textual or semi-structured content into a robust knowledge graph, and the integration of multi-modal AI techniques leveraging image, audio and video content in conjunction with data.
- Perform sophisticated data analysis to inform stakeholders of patterns, gaps and trends in collection data, and to assist with research projects using the knowledge graph.
- Assist with query optimization, data transformation pipeline debugging and optimization, and other advanced data engineering processes.
- Applying an engineering mindset to data science, build test suites and validation routines to prove the efficiency and accuracy of the machine learning based pipelines developed. This is not a research position, but one dedicated to practical, demonstrable improvements.
- Work closely with software engineers and content matter experts in the collecting units and central IT to assist with the quality and understanding of data, services and enrichments.
Please note: The Principal Responsibilities are generic in nature; the information contained above in this Position Focus is most relevant to this position. Additionally, a cover letter is required with submission of application to be considered for this position.
Essential Duties
1. Extract huge volumes of data from multiple internal and external sources.
2. Conduct undirected research and frame open-ended industry questions.
3. Employ sophisticated analytics programs, machine learning and statistical methods to prepare data for use in predictive and prescriptive modeling.
4. Thoroughly clean and prune data to discard irrelevant information.
5. Explore and examine data from a variety of angles to determine hidden weaknesses, trends and/or opportunities.
6. Devise data-driven solutions to the most pressing challenges.
7. Invent new algorithms to solve problems and build new tools to automate work.
8. Communicate predictions and findings to management and IT departments through effective data visualizations and reports.
9. Utilize real-time data streams to generate predictive and prognostic analytical outputs.
Required Education And Experience
Bachelor’s degree in computer science, mathematics or a related subject and six years of experience, or an equivalent combination of education and experience.
Required Skill/Ability 1
5 or more years of software development and/or data management experience.
Required Skill/Ability 2
Expertise in Python, or demonstrated ability to translate experience from an equivalent language.
Required Skill/Ability 3
Experience with graph-based data models and machine learning techniques.
Required Skill/Ability 4
Experience with cultural heritage and/or higher education organizations.
Required Skill/Ability 5
Familiarity with cultural heritage descriptive standards and practices. A keen attention to detail is critical.
Preferred Education, Experience And Skills
Master’s/Ph.D. in Information Science or related discipline. Excellent verbal and written communication skills, including when interacting with non-technical audiences. Experience with development of knowledge management/semantic systems, SPARQL and JSON-LD.
Drug Screen
No
Health Screening
No
Background Check Requirements
All candidates for employment will be subject to pre-employment background screening for this position, which may include motor vehicle, DOT certification, drug testing and credit checks based on the position description and job requirements. All offers are contingent upon the successful completion of the background check. For additional information on the background check requirements and process visit "Learn about background checks" under the Applicant Support Resources section of Careers on the It's Your Yale website.
COVID-19 Vaccine Requirement
Required
The University maintains policies pertaining to COVID-19. All faculty, staff, students, and trainees are required to comply with these policies, which may be found here:
https://covid19.yale.edu/health-guidelines
Posting Disclaimer
The intent of this job description is to provide a representative summary of the essential functions that will be required of the position and should not be construed as a declaration of specific duties and responsibilities of the particular position. Employees will be assigned specific job-related duties through their hiring departments.
EEO Statement
University policy is committed to affirmative action under law in employment of women, minority group members, individuals with disabilities, and protected veterans. Additionally, in accordance with Yale’s Policy Against Discrimination and Harassment, and as delineated by federal and Connecticut law, Yale does not discriminate in admissions, educational programs, or employment against any individual on account of that individual’s sex, sexual orientation, gender identity or expression, race, color, national or ethnic origin, religion, age, disability, status as a special disabled veteran, veteran of the Vietnam era or other covered veteran.
Inquiries concerning Yale’s Policy Against Discrimination and Harassment may be referred to the Office of Institutional Equity and Accessibility (OIEA).
Note
Yale University is a tobacco-free campus.