Senior Data Engineer - GCP

Remote: Full Remote
Contract: 3-month consulting contract
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Bachelor's degree in Computer Science or Engineering
  • 8+ years of experience in data engineering, including 3+ years on Google Cloud Platform
  • Expertise in Python and SQL
  • Familiarity with SDLC tools and ETL processes

Key responsibilities:

  • Design and implement complex data workflows in GCP
  • Architect and optimize large-scale data solutions in BigQuery
  • Develop and implement data governance policies
  • Establish data quality frameworks and compliance processes
  • Participate in Agile meetings for project tracking

Job description

About Fusemachines

Fusemachines is a 10+ year-old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, PhD, an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world.

About the role

This is a remote consulting position (3-month contract) in the Media-AdTech industry, responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization, and Advanced Analytics).

We are seeking an experienced Google Cloud Data Engineer with deep expertise in Cloud Composer and BigQuery to design, build, and maintain data processing systems using Google Cloud Platform (GCP). The ideal candidate will combine strong technical expertise in orchestration and data warehousing with a thorough understanding of data governance principles.


Qualifications & Experience

  • Bachelor's degree in Computer Science, Engineering, or a similar field from a top-tier school.
  • 8+ years of experience in data engineering roles, including 3+ years on Google Cloud Platform generating large datasets from diverse data sources in the Media industry.
  • Expertise in Python for efficient data integration, storage, and manipulation.
  • Expert-level SQL skills, including advanced queries, complex data modeling, and database design.
  • Familiarity with SDLC tools: Jira, GitHub, CI/CD pipelines, and Artifact Registry.
  • Proficient in data integration from APIs, databases, flat files, and event streaming.
  • Experience designing and maintaining ETL processes using Cloud Composer for workflow orchestration.
  • Experience with distributed data technologies: Spark/PySpark, DBT/Dataform, and Kafka.
  • Advanced expertise in:
    • Google Cloud Composer, including custom operator development and complex DAG design (see the sketch after this list)
    • Google BigQuery, including advanced SQL, optimization techniques, and best practices
    • Data governance frameworks and their implementation
  • Strong proficiency and experience with Google Cloud Pub/Sub, Google Cloud Storage, Dataflow, Cloud Spanner, Vertex AI, and Google Cloud SQL.
  • Experience with Terraform, Kubernetes, and data monitoring tools for pipeline optimization.
  • Familiarity with regulatory requirements (GDPR, CCPA, HIPAA).
  • Demonstrated experience implementing data governance policies and procedures.
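
To give a concrete flavor of the Cloud Composer expertise described above, the sketch below shows a minimal custom operator and fan-out DAG. It is illustrative only: the operator name (MediaIngestOperator), the DAG ID, and the source and table names are hypothetical placeholders, and the code assumes Airflow 2.x as shipped with current Cloud Composer releases.

    # Minimal sketch of a custom operator and a fan-out DAG; all names here
    # (MediaIngestOperator, daily_media_ingest, table IDs) are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.models.baseoperator import BaseOperator


    class MediaIngestOperator(BaseOperator):
        """Hypothetical operator that loads one source's data into BigQuery."""

        template_fields = ("source", "target_table")  # enable Jinja templating

        def __init__(self, source: str, target_table: str, **kwargs):
            super().__init__(**kwargs)
            self.source = source
            self.target_table = target_table

        def execute(self, context):
            # A real implementation would call GCS/BigQuery hooks here.
            self.log.info("Loading %s into %s", self.source, self.target_table)


    with DAG(
        dag_id="daily_media_ingest",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        # Fan out one ingest task per upstream media data source.
        for source in ("ad_server", "impressions", "clickstream"):
            MediaIngestOperator(
                task_id=f"ingest_{source}",
                source=source,
                target_table=f"analytics.raw_{source}",
            )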

Certifications preferred: 

Google Cloud Professional Data Engineer certification

Key Responsibilities:

  • Design and implement complex data workflows using Google Cloud Composer (Apache Airflow), including custom operators and advanced orchestration patterns.
  • Architect and optimize large-scale data solutions in BigQuery, including performance tuning and cost optimization.
  • Develop and implement comprehensive data governance policies and procedures, including: 
    • Data classification, cataloging, and lineage
    • Access control and security policies
    • Data retention and archival strategies
    • Compliance monitoring and reporting
  • Create and optimize BigQuery schemas, partitioning strategies, and query performance (see the sketch after this list).
  • Design and maintain ETL processes using Cloud Composer for workflow orchestration.
  • Establish data quality frameworks and compliance processes to ensure data accuracy.
  • Actively participate in Agile meetings, contributing to planning, resource allocation, and project tracking.
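
As an illustration of the BigQuery schema and partitioning work called out above, here is a minimal sketch using the google-cloud-bigquery Python client. The dataset, table, and column names (analytics.ad_events and its fields) are hypothetical; the point is that partitioning by event date and clustering on common filter columns lets queries prune data, which is where most of the performance tuning and cost savings come from.

    # Sketch: create a partitioned, clustered table, then run a query that
    # prunes partitions. Dataset/table/column names are hypothetical.
    from google.cloud import bigquery

    client = bigquery.Client()

    # Partition by event date and cluster by the columns most often filtered
    # on, so queries scan only the partitions and blocks they actually need.
    client.query(
        """
        CREATE TABLE IF NOT EXISTS analytics.ad_events (
          event_ts     TIMESTAMP,
          campaign_id  STRING,
          publisher_id STRING,
          impressions  INT64
        )
        PARTITION BY DATE(event_ts)
        CLUSTER BY campaign_id, publisher_id
        """
    ).result()

    # Filtering on the partitioning column lets BigQuery skip whole partitions,
    # so this scans one week of data instead of the full table.
    rows = client.query(
        """
        SELECT campaign_id, SUM(impressions) AS total_impressions
        FROM analytics.ad_events
        WHERE DATE(event_ts) BETWEEN '2024-01-01' AND '2024-01-07'
        GROUP BY campaign_id
        """
    ).result()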

Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws.

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s): English
