Bachelor's degree in IT, Computer Science, Software Engineering, Business Analytics, or equivalent
2+ years of experience with Scala, Spark, Hadoop (Security, Spark on YARN, architectural knowledge), HBase and Hive
2+ years of experience with RDBMS (MySQL/Postgres/MariaDB) and 1+ year of CI/CD experience
Nice to have: Kafka, Spark Streaming, Apache Phoenix, Memcached/Redis caching, Spark ML, and FP with Scala (cats/scalaz)
Responsibilities:
Design, develop, and maintain data pipelines using Hadoop ecosystem components (Spark, Hive, HBase) and ensure data quality
Build and optimize data models and schemas in Hive/HBase and integrate with RDBMS
Implement CI/CD pipelines for data applications and manage cloud-based deployments
Collaborate with data scientists, analysts, and stakeholders to deliver analytics-ready datasets and reporting
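As a rough illustration only (not part of the requirements), the data-quality aspect of the first responsibility might look like the minimal Scala sketch below. The `Event` record and its validation rules are hypothetical, and a real pipeline would apply such checks to Spark Datasets rather than in-memory collections:

```scala
// Hypothetical record shape; in practice this would mirror a Hive/HBase schema.
case class Event(id: String, userId: String, amountCents: Long)

object DataQuality {
  // Keep only records that satisfy basic integrity rules
  // (non-empty keys, non-negative amounts). The same predicate
  // could be passed to Dataset[Event].filter in a Spark job.
  def validate(events: Seq[Event]): Seq[Event] =
    events.filter(e => e.id.nonEmpty && e.userId.nonEmpty && e.amountCents >= 0)
}
```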
Job description
Experience (Must have):
a) Scala: minimum 2 years of experience
b) Spark: minimum 2 years of experience
c) Hadoop: minimum 2 years of experience (security, Spark on YARN, architectural knowledge)
d) HBase: minimum 2 years of experience
e) Hive: minimum 2 years of experience
f) RDBMS (MySQL/Postgres/MariaDB): minimum 2 years of experience
g) CI/CD: minimum 1 year of experience

Experience (Good to have):
a) Kafka
b) Spark Streaming
c) Apache Phoenix
d) Caching layer (Memcached/Redis)
e) Spark ML
f) FP (Scala cats/scalaz)

Qualifications:
Bachelor's degree in IT, Computer Science, Software Engineering, Business Analytics, or equivalent, with at least 2 years of experience in big data systems such as Hadoop as well as cloud-based solutions.