Help build and integrate the next-generation data platform
Analyse and aggregate data into new data models for use by the rest of the company
Write complex ETL in SQL/Python/PySpark to generate reports from a variety of sources.
Make sure that data is secure, reliable and easily accessible across the company by leveraging the latest technologies
Build tools for automation, monitoring and alerting of the data pipelines.
Write Infrastructure as Code using Terraform
Collaborate in design and problem-solving sessions.
Suggest improvements and introduce best practices into the team
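The ETL responsibilities above can be illustrated with a minimal Python + SQL sketch: extract raw rows, aggregate them into a new data model, and load the result into a reporting table. Table and column names here are invented for the example, not part of the actual platform.

```python
import sqlite3

# Minimal ETL sketch using an in-memory SQLite database.
# "orders" stands in for a raw source table; "sales_by_region"
# is the aggregated data model exposed to the rest of the company.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EU", 10.0), ("EU", 5.0), ("US", 7.5)],
)

# Transform + load: aggregate per region into a new reporting table.
conn.execute(
    """CREATE TABLE sales_by_region AS
       SELECT region, SUM(amount) AS total, COUNT(*) AS n_orders
       FROM orders GROUP BY region"""
)

report = {
    region: total
    for region, total, _ in conn.execute(
        "SELECT region, total, n_orders FROM sales_by_region ORDER BY region"
    )
}
print(report)  # {'EU': 15.0, 'US': 7.5}
```

In production the same shape of job would typically run in PySpark against S3/Glue sources rather than SQLite, but the extract/transform/load structure is the same.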
Requirements
This is a mid/senior-level role requiring strong skills in Python, SQL and data modelling
Good experience developing, architecting and scaling data services
Strong ownership and self-drive from the first line of code to the impact on the end user. If you like autonomy and being in the driver’s seat, we would love to have you in our team!
Strong product mindset and the ability to understand product context and develop success metrics.
Redshift, Glue, S3, Athena, Step Functions, Lambda, SQL/Python/PySpark, Tableau / Power BI
Previous experience introducing best practices or processes into a team and a strong desire to help others succeed
Good architectural design skills and the ability to discuss the merits and trade-offs of any particular design approach and technology
Bonus: Experience working with an AWS data stack (AWS Glue, Lambda, S3, Athena)
Bonus: Experience using PostgreSQL
Bonus: Experience using Apache Spark for data processing
Bonus: Experience with data-streaming middleware (e.g., Kafka, RabbitMQ)
Bonus: Experience using Terraform for Infrastructure as Code
Experience with the Redshift database: query optimization for better performance and managing partitions of large tables
Must have experience working with AWS tools and services (Glue, EMR, Athena, Step Functions)
Data pipelining skills: data blending using Python and SQL
Develop & support ETL pipelines
Develop data models that are optimized for business understandability and usability
Develop and optimize data tables using best practices for partitioning, compression, parallelization, etc.
Develop and maintain metadata, data catalogue, and documentation for internal business customers
Help internal business customers troubleshoot and optimize SQL and ETL solutions to solve reporting problems
Work with internal business customers and partner technical teams to gather and document requirements for data publishing and consumption
Experience with documentation: able to create technical specifications and translate existing technical designs into data transformations
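The table-optimization point above (partitioning in particular) can be sketched with a small stdlib-only example that writes records into Hive-style `year=`/`month=` partition directories, the on-disk layout that Athena, Glue and Spark can prune at query time. The paths and field names are illustrative only; a real pipeline would use PySpark's `partitionBy` writing Parquet to S3.

```python
import csv
import tempfile
from pathlib import Path


def group_by_partition(rows):
    """Bucket (year, month, value) rows by their (year, month) partition key."""
    buckets = {}
    for year, month, value in rows:
        buckets.setdefault((year, month), []).append((year, month, value))
    return sorted(buckets.items())


def write_partitioned(rows, root: Path) -> list:
    """Write each partition's rows to a Hive-style year=YYYY/month=MM directory."""
    written = []
    for (year, month), group in group_by_partition(rows):
        part_dir = root / f"year={year}" / f"month={month:02d}"
        part_dir.mkdir(parents=True, exist_ok=True)
        out = part_dir / "part-000.csv"
        with out.open("w", newline="") as fh:
            csv.writer(fh).writerows(group)
        written.append(out)
    return written


root = Path(tempfile.mkdtemp())
files = write_partitioned([(2024, 1, "a"), (2024, 1, "b"), (2024, 2, "c")], root)
print([p.relative_to(root).as_posix() for p in files])
# ['year=2024/month=01/part-000.csv', 'year=2024/month=02/part-000.csv']
```

Because each partition lives in its own directory, a query filtered on year/month only has to read the matching directories instead of scanning the full table.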