5+ years of experience in system engineering or software development
3+ years of engineering experience, including ETL-type work with databases and Hadoop platforms.
Skills
Hadoop General: Deep knowledge of distributed file system concepts, MapReduce principles, and distributed computing. Knowledge of Spark and the differences between Spark and MapReduce. Familiarity with encryption and security in a Hadoop cluster.
Data Management / Data Structures: Must be proficient in technical data management tasks, i.e. writing code to read, transform, and store data.
XML/JSON knowledge
Experience working with REST APIs (see the sketch after this list)
Spark: Experience launching Spark jobs in client mode and cluster mode (see the sketch after this list). Familiarity with the property settings of Spark jobs and their performance implications.
Application Development: Familiarity with HTML, CSS, and JavaScript, and basic design/visual competency.
SCC/Git: Must be experienced in the use of source code control systems such as Git.
ETL: Experience developing ELT/ETL processes, including loading data from enterprise-scale RDBMS systems such as Oracle, DB2, MySQL, etc. (see the JDBC sketch after this list).
Authorization: Basic understanding of user authorization (Apache Ranger preferred).
Programming: Must be able to code in Python, or be an expert in at least one high-level language such as Java, C, or Scala.
SQL: Must be an expert in manipulating database data using SQL. Familiarity with views, functions, stored procedures, and exception handling (see the sketch after this list).
AWS: General knowledge of the AWS stack (EC2, S3, EBS, …).
IT Process Compliance: SDLC experience and formalized change controls.
Experience working in DevOps teams based on Agile principles (e.g. Scrum)
ITIL knowledge (especially incident, problem and change management)
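For the REST API items above, a minimal sketch of consuming a REST endpoint in Python with the requests library; the URL and query parameter are placeholders, not a real service, and the sketch assumes the endpoint returns a JSON array:

```python
import requests

# Placeholder endpoint; assumes it returns a JSON array of items.
resp = requests.get(
    "https://api.example.com/v1/items",
    params={"limit": 10},
    timeout=10,
)
resp.raise_for_status()  # turn HTTP 4xx/5xx responses into exceptions
for item in resp.json():
    print(item)
```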
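For the Spark item, a minimal PySpark sketch assuming a working YARN cluster and a PySpark install; the file name etl_job.py and all property values are illustrative only. The spark-submit lines in the trailing comments show the client/cluster deploy-mode distinction:

```python
# etl_job.py -- illustrative file name.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("spark-properties-sketch")
    # Example property settings with direct performance impact
    # (values here are placeholders, not recommendations):
    .config("spark.executor.memory", "4g")
    .config("spark.executor.cores", "2")
    .config("spark.sql.shuffle.partitions", "200")
    .getOrCreate()
)

df = spark.range(1_000_000)  # stand-in dataset
print(df.selectExpr("sum(id) AS total").first())
spark.stop()

# Client mode (driver runs on the submitting host):
#   spark-submit --master yarn --deploy-mode client etl_job.py
# Cluster mode (driver runs inside the cluster):
#   spark-submit --master yarn --deploy-mode cluster etl_job.py
```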
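For the ETL item, one common pattern is a partitioned JDBC read from an RDBMS into Spark. This sketch assumes a reachable MySQL instance with its JDBC driver on the Spark classpath; every connection detail is a placeholder:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdbms-extract-sketch").getOrCreate()

# All connection details below are placeholders.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://db.example.com:3306/sales")
    .option("dbtable", "orders")
    .option("user", "etl_user")
    .option("password", "...")
    # A partitioned read parallelizes extraction from a large table:
    .option("partitionColumn", "order_id")
    .option("lowerBound", "1")
    .option("upperBound", "10000000")
    .option("numPartitions", "8")
    .load()
)

# Read, transform, store -- the basic ETL shape.
(orders
    .filter("status = 'SHIPPED'")
    .write.mode("overwrite")
    .parquet("/data/orders_shipped"))
```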
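For the SQL item, stored procedures are engine-specific, so this sketch uses SQLite (which supports views but not stored procedures) purely to show a view plus exception handling around database work:

```python
import sqlite3

con = sqlite3.connect(":memory:")
try:
    con.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    con.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
    # A view encapsulates a reusable query:
    con.execute("CREATE VIEW big_orders AS SELECT * FROM orders WHERE amount > 10")
    for row in con.execute("SELECT * FROM big_orders"):
        print(row)
except sqlite3.Error as exc:  # exception handling around database access
    print(f"database error: {exc}")
finally:
    con.close()
```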