Data Engineering is a cornerstone of Plaid's data-focused vision, playing a pivotal role in fostering a data-driven culture and advancing our goal of delivering trustworthy, truly differentiated data products powered by the largest network of user financial profiles. Supporting that vision means scaling our data systems while keeping our data correct and complete. We provide golden datasets and tooling to teams across engineering, product, and business, helping them explore our data quickly and safely to get the insights they need, which ultimately helps Plaid serve our customers more effectively. Plaid also will not be successful if we can't move quickly, so we build the data systems and tools that enable everyone at Plaid to be data-driven, making analytics easy, obvious, and proactive across the company.
Data Engineers heavily leverage SQL and Python to build data workflows that integrate with our Golang applications. We use tools like dbt, Airflow, Redshift, Atlan, and Retool to orchestrate data pipelines and define workflows. We work with data scientists, product analytics, business intelligence, software engineers, product managers, and many other teams to build Plaid's data strategy and a data-first mindset. Our engineering culture is IC-driven: we favor bottom-up ideation and empowerment of our incredibly talented team. We are looking for engineers and technical leaders who are motivated by creating impact for our consumers and customers, growing together as a team, shipping the MVP, and leaving things better than we found them.
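To give a flavor of the SQL-plus-Python work described above, here is a minimal, hypothetical sketch of the kind of rollup a dbt model might express, run here against an in-memory SQLite database purely for illustration. The table and column names (raw_transactions, user_id, amount) are invented and not part of Plaid's actual schemas.

```python
import sqlite3

# Invented example data; in practice this would live in a warehouse
# such as Redshift, populated by upstream pipelines.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_transactions (user_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_transactions VALUES (?, ?)",
    [("u1", 10.0), ("u1", 5.0), ("u2", 7.5)],
)

# A "golden dataset"-style aggregation: one clean, documented row per user,
# of the sort a dbt model would materialize for downstream consumers.
GOLDEN_SPEND_SQL = """
SELECT user_id,
       COUNT(*)    AS txn_count,
       SUM(amount) AS total_spend
FROM raw_transactions
GROUP BY user_id
ORDER BY user_id
"""
rows = conn.execute(GOLDEN_SPEND_SQL).fetchall()
print(rows)  # [('u1', 2, 15.0), ('u2', 1, 7.5)]
```

In a real pipeline the SQL would live in a version-controlled dbt model and an orchestrator like Airflow would schedule it; the sketch only shows the shape of the transformation.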
Drive Data Standards and Culture @ Plaid: You will play a high-impact role in defining and promoting data standards and fostering a strong data-driven culture across Plaid.
Deliver and Scale Golden Datasets: You will own the creation and future iterations of Golden Datasets for all major Plaid data models, driving their adoption across key data use cases to empower teams with reliable and actionable insights.
Shape Unified Data Architecture: You will collaborate with other data practitioners and leaders to design a unified data architecture and conventions, making data at Plaid more accessible and intuitive. You’ll also enhance data exploration tools to enable efficient and effective usage of data across the organization.
Mentor and Build Technical Excellence: You will mentor junior engineers on the team, providing guidance on designs and implementations. You’ll play a key role in fostering a strong technical culture, empowering the team to deliver high-quality, scalable solutions.
Responsibilities
- Serving as the primary technical DRI, defining and executing the long-term technical roadmap to build and sustain a data-driven culture at Plaid.
- Working closely with senior leadership and executives to shape Plaid's data engineering strategy and roadmap, ensuring alignment with the company's data-focused product goals and overall vision.
- Developing a deep understanding of Plaid's products and strategy to inform the design of Golden Datasets, set data usage principles, and ensure data is structured for maximum impact and usability.
- Delivering well-documented datasets with clearly defined quality metrics, uptime guarantees, and demonstrable usefulness.
- Leading critical data engineering projects that foster collaboration across teams, driving innovation and efficiency throughout the company.
- Mentoring engineers, operations, and data practitioners on best practices for data organization.
- Advocating for the adoption of emerging industry tools and practices, evaluating their fit and implementing them at the right time to keep Plaid at the forefront of data engineering.
- Owning core SQL and Python data pipelines that power our data lake and data warehouse, ensuring their reliability, scalability, and efficiency.
- Upholding Plaid's commitment to data quality and privacy, embedding these principles into every aspect of data work.

Qualifications
- 10+ years of experience in data engineering, with a proven track record of building scalable data systems and pipelines.
- Experience working with massive datasets (500 TB to petabytes) and developing robust data models and pipelines on top of them.
- Experience leading major data modeling, schema design, and data architecture efforts in past roles.
- Deep appreciation for schema design and the ability to evolve analytics schemas on top of unstructured data.
- Advanced knowledge of SQL as a flexible, extensible tool, and experience with modern data tooling such as dbt, Mode, and Airflow.
- Hands-on experience with performant data warehouses and lakes such as Redshift, Snowflake, and Databricks.
- Experience building and maintaining both batch and real-time pipelines using technologies like Spark and Kafka.
- Excitement about exploring new technologies and building proofs of concept that balance technical innovation with user experience and adoption.
- Willingness to get into the technical details to manage, deploy, and optimize low-level data infrastructure.
- A champion for data privacy and integrity who always acts in the best interest of consumers.