AWS Databricks Engineer

Remote: 
Full Remote
Experience: 
Senior (5-10 years)

Offer summary

Qualifications:

2+ years AWS Databricks experience, 5+ years in data engineering or analytics, Bachelor's/Master's in related field, Strong written and verbal communication, Self-motivated with problem-solving ability.

Key responsibilities:

  • Lead enterprise solutions & advise team
  • Develop data ingestion and processing
  • Implement data strategies with SMEs
  • Document technical specifications & test cases

Job description

This is a remote position.


Requirements

      Strong experience as an AWS Data Engineer; AWS Databricks experience is a must.

      Expert proficiency in Spark Scala, Python, and PySpark is a plus

      Must have data migration experience from on prem to cloud

      Hands-on experience with Kinesis for processing and analyzing streaming data, and with AWS DynamoDB

      In-depth understanding of AWS cloud, AWS data lake, and analytics solutions.

      Expert-level, hands-on experience designing and developing applications on Databricks, Databricks Workflows, AWS Managed Airflow, and Apache Airflow is required.

      Extensive hands-on experience implementing data migration and data processing using AWS services: VPC/SG, EC2, S3, AutoScaling, CloudFormation, LakeFormation, DMS, Kinesis, Kafka, NiFi, CDC processing, Amazon S3, EMR, Redshift, Athena, Snowflake, RDS, Aurora, Neptune, DynamoDB, CloudTrail, CloudWatch, Docker, Lambda, Spark, Glue, SageMaker, AI/ML, API GW, etc.

      Hands-on experience with the industry technology stack for data management, ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.

      Knowledge of different programming and scripting languages

      Good working knowledge of code versioning tools (such as Git, Bitbucket, or SVN)

      Hands-on experience using Spark SQL with various data sources such as JSON, Parquet, and key-value pairs

      Experience preparing data for Data Science and Machine Learning.

      Experience preparing data for use in SageMaker and AWS Databricks.

      Demonstrated experience preparing data and automating and building data pipelines for AI use cases (text, voice, image, IoT data, etc.).

      Good to have programming language experience with .NET or Spark/Scala

      Experience in creating tables, partitioning, bucketing, loading and aggregating data using Spark Scala, Spark SQL/PySpark

      Knowledge of AWS/Azure DevOps processes like CI/CD as well as Agile tools and processes including Git, Jenkins, Jira, and Confluence

      Working experience with Visual Studio, PowerShell Scripting, and ARM templates.

      Strong understanding of data modeling and of defining conceptual, logical, and physical data models.

      Experience with big data, analytics, information analysis, and database management in the cloud

      Experience with IoT, event-driven, and microservices architectures in the cloud, and with private and public cloud architectures, their pros and cons, and migration considerations.

      Ability to remain up to date with industry standards and technological advancements that will enhance data quality and reliability to advance strategic initiatives

      Basic experience with or knowledge of agile methodologies

      Working knowledge of RESTful APIs, OAuth2 authorization framework and security best practices for API Gateways

Responsibilities:

         Work closely with team members to lead and drive enterprise solutions, advising on key decision points on trade-offs, best practices, and risk mitigation

         Manage data related requests, analyze issues, and provide efficient resolution. Design all program specifications and perform required tests

         Design and develop the data ingestion layer using Glue, AWS Managed Airflow, and Apache Airflow, and the processing layer using Databricks.

         Work with the SMEs to implement data strategies and build data flows.

         Write code for all modules according to the required specifications.

         Monitor all production issues and inquiries and provide efficient resolution.

         Evaluate all functional requirements, map documents, and troubleshoot all development processes

         Document all technical specifications and associated project deliverables.

         Design all test cases to provide support to all systems and perform unit tests.

 

Qualifications:

      2+ years of hands-on experience designing and implementing multi-tenant solutions using AWS Databricks for data governance, data pipelines for near real-time data warehouse, and machine learning solutions.

      5+ years of experience in software development, data engineering, or data analytics using Python, PySpark, Scala, Spark, Java, or equivalent technologies.

      Bachelor's or Master's degree in Big Data, Computer Science, Engineering, Mathematics, or a similar area of study, or equivalent work experience

      Strong written and verbal communication skills

      Ability to manage competing priorities in a fast-paced environment

      Ability to resolve issues

      Self-motivated and able to work independently

      Nice to have:

-        AWS Certified: Solutions Architect Professional

-        Databricks Certified Associate Developer for Apache Spark



Salary

0 - 3000000 INR (Per Year)


Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s):
English