Job description
Job Title - Data Engineer
Location - Remote
Duration - 12+ Months
Rate - DOE
U.S. Citizens and those authorized to work in the U.S. are encouraged to apply. We are unable to sponsor at this time.
Job Description
The Data Engineer is a team member who will help enhance and maintain the Instant Ink Business Intelligence system. You will drive your work to completion with hands-on development responsibilities, and partner with the Data Engineering leaders to implement data engineering pipelines and build solutions that provide trusted and reliable data to customers.
Responsibilities
Design and implement distributed data processing pipelines using Spark, Python, SQL and other tools and languages prevalent in the Big Data/Lakehouse ecosystem.
Analyze designs and determine the coding, programming, and integration activities required based on general objectives.
Review and evaluate designs and project activities for compliance with architecture, security, and quality guidelines and standards.
Write and execute complete testing plans, protocols, and documentation for the assigned portion of a data system or component; identify defects and create solutions for issues with code and integration into the data system architecture.
Collaborate and communicate with the project team regarding project progress and issue resolution.
Work with the data engineering team on all phases of larger and more complex development projects, and engage with external users on business and technical requirements.
Collaborate with peers, engineers, data scientists, and the project team.
Typically interact with high-level Individual Contributors, Managers, and Program Teams on a daily or weekly basis.
Requirements
Bachelor's or Master's degree in Computer Science, Information Systems, Engineering or equivalent.
6+ years of relevant experience with detailed knowledge of data warehouse technical architectures, infrastructure components, ETL/ELT, and reporting/analytic tools.
3+ years of experience with cloud-based data warehouses such as Redshift and Snowflake.
3+ years of experience in Big Data distributed ecosystems (Hadoop, Spark, Hive, and Delta Lake).
3+ years of experience with workflow orchestration tools such as Airflow.
3+ years of experience with Big Data distributed systems such as Databricks, AWS EMR, and AWS Glue.
Experience using monitoring tools and frameworks such as Splunk, Grafana, and CloudWatch.
Experience with container management frameworks such as Docker, Kubernetes, and ECR.
3+ years of experience working with multiple Big Data file formats (Parquet, Avro, Delta Lake).
Experience working with CI/CD tools such as Jenkins and Codeway, and source control tools such as GitHub.
Strong coding experience in languages such as Python, Scala, and Java.
Knowledge and Skills
Fluent in relational database systems and in writing complex SQL.
Fluent in complex, distributed and massively parallel systems.
Strong analytical and problem-solving skills with ability to represent complex algorithms in software.
Strong understanding of database technologies and management systems.
Strong understanding of data structures and algorithms.
Knowledge of database architecture testing methodology, including execution of test plans, debugging, and testing scripts and tools.