
Data Engineer

Key Facts

Remote From: 
Full time
Mid-level (2-5 years)
English

Other Skills

  • Communication
  • Time Management
  • Analytical Thinking
  • Detail Oriented
  • Problem Solving

Requirements

  • 3+ years of experience in Data Engineering, Backend Engineering, or Data Infrastructure roles
  • Strong proficiency in Python and SQL
  • Experience with at least one modern data warehouse (Snowflake, Redshift, BigQuery)
  • Hands-on experience with orchestration tools such as Airflow or Prefect

Roles & Responsibilities

  • Build, maintain, and optimize ETL/ELT pipelines using Python, SQL, or Scala, and orchestrate workflows with Airflow, Prefect, Dagster, or similar tools
  • Design and optimize cloud data warehouses (Snowflake, BigQuery, Redshift) with scalable schemas and performance tuning
  • Implement data quality and governance measures including validation checks, lineage tracking, and documentation (dbt, Great Expectations), ensuring compliance with GDPR/HIPAA as applicable
  • Develop and manage real-time streaming pipelines (Kafka, Kinesis, Pub/Sub) for low-latency data and event-driven architectures

Job description

Job Title: Data Engineer

Position Type: Full-Time, Remote
Working Hours: U.S. client business hours (with flexibility for pipeline monitoring, deployments, and data refresh cycles)

About the Role

Our client is seeking a Data Engineer to design, build, and maintain scalable data infrastructure and reliable data pipelines that power analytics, reporting, and operational decision-making across the business.

This role requires strong software engineering fundamentals, deep experience with modern data stacks, and a passion for building clean, reliable, and high-performance data systems. The Data Engineer will ensure data flows seamlessly from source systems into warehouses, dashboards, and downstream applications while maintaining high standards for quality, governance, and scalability.

The ideal candidate is analytical, detail-oriented, and comfortable working across engineering, analytics, and business teams to deliver trustworthy and actionable data.

Responsibilities

Pipeline Development & Data Integration

• Build, maintain, and optimize ETL/ELT pipelines using Python, SQL, or Scala
• Orchestrate workflows using Airflow, Prefect, Dagster, or similar orchestration tools
• Ingest structured and unstructured data from APIs, SaaS platforms, databases, files, and streaming systems
• Develop scalable connectors and automated ingestion workflows
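For candidates gauging fit, the work described above can be pictured as a minimal extract-transform-load step in plain Python. This is an illustrative sketch only: in a real role each step would run as an orchestrated task (e.g. in an Airflow or Prefect workflow), and the field names here are assumptions, not from any actual system.

```python
# Minimal ETL sketch. In production, each function would typically be a task
# in an Airflow/Prefect/Dagster workflow; names and fields are illustrative.

def extract(raw_rows):
    """Simulate pulling raw records from an API or file export; drop empty payloads."""
    return [row for row in raw_rows if row]

def transform(rows):
    """Normalize field names and types into an analytics-ready shape."""
    return [
        {"user_id": int(row["id"]), "email": row["email"].strip().lower()}
        for row in rows
    ]

def load(rows, target):
    """Append transformed rows to a target store (here, a plain list)."""
    target.extend(rows)
    return len(rows)

if __name__ == "__main__":
    raw = [{"id": "1", "email": " Ada@Example.com "}, {}, {"id": "2", "email": "b@x.io"}]
    warehouse = []
    loaded = load(transform(extract(raw)), warehouse)
    print(loaded, warehouse[0]["email"])
```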

Data Warehousing & Modeling

• Manage and optimize cloud data warehouses such as Snowflake, BigQuery, or Redshift
• Design scalable schemas using star and snowflake modeling techniques
• Implement partitioning, clustering, indexing, and performance optimization strategies
• Build clean, analytics-ready datasets for business intelligence and reporting use cases

Data Quality, Governance & Reliability

• Implement validation checks, anomaly detection, logging, and monitoring to ensure data integrity
• Enforce naming conventions, lineage tracking, and documentation standards using tools such as dbt or Great Expectations
• Maintain audit-ready data processes and ensure compliance with GDPR, HIPAA, or industry-specific requirements
• Monitor pipeline health and proactively resolve failures or inconsistencies
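The validation work above can be sketched in plain Python as a simple batch check. In practice these checks would usually be formalized as Great Expectations suites or dbt tests; the field names and null-rate threshold below are illustrative assumptions.

```python
# Sketch of batch validation checks of the kind a tool like Great Expectations
# or dbt tests would formalize. Field names and thresholds are illustrative.

def validate_batch(rows, required_fields, max_null_rate=0.0):
    """Return a list of human-readable failures for a batch of records."""
    failures = []
    total = len(rows)
    if total == 0:
        return ["batch is empty"]
    for field in required_fields:
        nulls = sum(1 for r in rows if r.get(field) is None)
        if nulls / total > max_null_rate:
            failures.append(f"{field}: {nulls}/{total} null values exceed allowed rate")
    return failures

if __name__ == "__main__":
    batch = [{"order_id": 1, "amount": 9.5}, {"order_id": None, "amount": 3.0}]
    print(validate_batch(batch, ["order_id", "amount"]))
```

A failing check like this would typically halt the pipeline or alert on call, preventing bad records from reaching downstream reporting.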

Streaming & Real-Time Data Processing

• Build and manage real-time data pipelines using Kafka, Kinesis, Pub/Sub, or similar platforms
• Support low-latency ingestion and event-driven architectures for time-sensitive applications
• Monitor streaming infrastructure and optimize throughput and reliability

Collaboration & Analytics Enablement

• Partner closely with analysts, data scientists, and business stakeholders to deliver reliable datasets
• Support dashboard and reporting initiatives across Tableau, Looker, or Power BI
• Translate business requirements into scalable data solutions and models
• Maintain clear technical documentation for pipelines, schemas, and workflows

Infrastructure, DevOps & Automation

• Containerize data services using Docker and manage deployments through Kubernetes when applicable
• Automate deployments using CI/CD pipelines such as GitHub Actions, Jenkins, or GitLab CI
• Manage cloud infrastructure using Terraform, CloudFormation, or similar Infrastructure-as-Code tools
• Continuously optimize performance, scalability, reliability, and cloud costs

What Makes You a Perfect Fit

• Passionate about building clean, reliable, and scalable data systems
• Strong debugging and problem-solving mindset with high attention to detail
• Balance of software engineering discipline and analytical thinking
• Comfortable working cross-functionally with technical and non-technical stakeholders
• Proactive communicator who takes ownership of data quality and reliability

Required Experience & Skills

• 3+ years of experience in Data Engineering, Backend Engineering, or Data Infrastructure roles
• Strong proficiency in Python and SQL
• Experience with at least one modern data warehouse (Snowflake, Redshift, BigQuery)
• Hands-on experience with orchestration tools such as Airflow or Prefect
• Strong understanding of ETL/ELT pipelines, data modeling, and data transformation workflows
• Familiarity with cloud platforms such as AWS, GCP, or Azure

Preferred Experience & Skills

• Experience with dbt for data modeling and transformation management
• Streaming and event-driven data pipeline experience (Kafka, Kinesis, Pub/Sub)
• Experience with cloud-native data services such as AWS Glue, GCP Dataflow, or Azure Data Factory
• Familiarity with Docker, Kubernetes, Terraform, or CI/CD workflows
• Background in regulated industries such as healthcare, fintech, or enterprise SaaS
• Experience optimizing warehouse costs and query performance at scale

What Does a Typical Day Look Like?

A Data Engineer’s day revolves around maintaining reliable pipelines, improving data quality, and enabling teams with scalable access to trustworthy data. You will:

• Monitor pipeline health and troubleshoot failed jobs in Airflow or related orchestration systems
• Build and maintain ingestion pipelines for APIs, SaaS platforms, and operational databases
• Optimize SQL queries and warehouse performance to improve efficiency and reduce cloud costs
• Collaborate with analysts and data scientists to provide curated datasets for reporting and modeling
• Implement validation checks and monitoring to prevent downstream data quality issues
• Document data models, transformations, and workflows to ensure scalability and maintainability

In essence: you ensure the organization has accurate, timely, and reliable data powering operational, analytical, and strategic decisions.

Key Metrics for Success (KPIs)

• Pipeline uptime ≥ 99%
• Data freshness maintained within agreed SLAs
• Zero critical data quality issues reaching downstream reporting systems
• Improved warehouse query performance and cost optimization
• Timely delivery of scalable and reliable datasets
• Positive feedback from analysts, data scientists, and business stakeholders

Interview Process

• Initial Phone Screen
• Video Interview with Pavago Recruiter
• Technical Assessment (e.g., build a small ETL pipeline or optimize a SQL query)
• Client Interview with Engineering/Data Team
• Offer & Background Verification

#DataEngineer #ETL #DataPipelines #BigQuery #Snowflake #Redshift #Airflow #Python #SQL #CloudData #AnalyticsEngineering #DataInfrastructure #RemoteWork #DataEngineeringJobs
