
Data Pipeline Operations Engineer

Remote: Full Remote
Experience: Mid-level (2-5 years)
Work from: Maryland, United States

Offer summary

Qualifications:

Strong Linux command line skills, proficiency in Python and SQL, experience with Airflow or similar tools.

Key responsibilities:

  • Manage the weekly scanning process
  • Perform quality checks on ingested data
SixMap, Inc. · Computer Hardware & Networking Startup · 11-50 employees · https://www.sixmap.io/

Job description

We are seeking a detail-oriented and technically skilled Data Pipeline Operations Engineer to manage and execute our weekly scanning process. This critical role ensures the timely flow of customer data through our research, scanning, and UI ingest pipeline. The ideal candidate has a mix of programming, database, and Linux system administration skills to handle the various steps in the scanning workflow.

SixMap is the leading Automated Cyber Defense Platform for continuous threat exposure management (CTEM) across today’s largest, most complex and dynamic enterprise and government environments. With zero network impact and zero agents, SixMap automatically discovers all Internet-facing assets across IPv4 and IPv6 to deliver the most comprehensive external attack surface visibility. The platform identifies vulnerabilities, correlates proprietary and open-source threat intelligence, and provides actionable insights to defend against imminent threats with supervised proactive response capabilities. The SixMap team brings deep intelligence community expertise and best practices to the defense of both U.S. Federal agencies and Fortune 500 corporations.

Responsibilities
    • Manage the weekly scanning process, ensuring customer data progresses through research, scanning, and UI ingest phases according to defined SLAs
    • Prepare input files and kick off processes on the scanning cluster via Airflow (an illustrative sketch follows this list)
    • Monitor and troubleshoot jobs, adjusting parameters like rate files as needed to optimize runtimes
    • Perform data ingest into production databases using SQL and Python
    • Clear data artifacts and caches in between ingest cycles
    • Execute post-ingest data refresh routines
    • Perform quality checks on ingested data to validate contractual obligations are met
    • Identify process bottlenecks and suggest or implement improvements to the automated tooling to increase speed and reliability
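
For illustration only, here is a minimal sketch of the kind of weekly, Airflow-orchestrated scan-and-ingest workflow described above. The DAG name, task names, file paths, and shell commands are hypothetical placeholders rather than SixMap's actual pipeline, and the sketch assumes Airflow 2.4+:

```python
# Illustrative sketch only: a minimal Airflow DAG for a weekly scan -> ingest
# workflow of the kind described above. All names, paths, and commands are
# hypothetical placeholders, not SixMap's actual tooling.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def ingest_scan_results(**context):
    """Placeholder for the SQL/Python ingest step.

    A real implementation would load validated scan output into the
    production database, then run post-ingest refresh and quality checks.
    """
    print("Ingesting scan results (placeholder)")


with DAG(
    dag_id="weekly_scan_pipeline",      # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@weekly",                 # weekly scanning cadence
    catchup=False,
) as dag:
    # Prepare the input files consumed by the scanning cluster.
    prepare_inputs = BashOperator(
        task_id="prepare_input_files",
        bash_command="python prepare_inputs.py --out /data/scan_targets.txt",
    )

    # Kick off the scan; `run_scan.sh` stands in for whatever submission
    # command the scanning cluster actually uses.
    run_scan = BashOperator(
        task_id="run_scan",
        bash_command="bash run_scan.sh /data/scan_targets.txt",
    )

    # Load scan results into the production database.
    ingest = PythonOperator(
        task_id="ingest_results",
        python_callable=ingest_scan_results,
    )

    prepare_inputs >> run_scan >> ingest
```

In practice the ingest step would also clear data artifacts and caches between cycles and trigger the post-ingest refresh routines listed above.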

Requirements

  • Required Skills:
    • Strong Linux command line skills
    • Experience with Airflow or similar workflow orchestration tools
    • Python programming proficiency
    • Advanced SQL knowledge for data ingest, refresh, and validation
    • Ability to diagnose and resolve issues with long-running batch processes
    • Excellent attention to detail and problem-solving skills
    • Good communication to coordinate with other teams
    • Flexibility to handle off-hours work when needed to meet SLAs

  • Preferred Additional Skills:
    • Familiarity with network scanning tools and methodologies
    • Experience optimizing database performance
    • Scripting skills to automate routine tasks
    • Understanding of common network protocols and services
    • Knowledge of AWS services like EC2

Benefits

    • Competitive compensation packages, including equity
    • Employer paid medical, dental, vision, disability & life insurance
    • 401(k) plans
    • Flexible Spending Accounts (health & dependents)
    • Unlimited PTO
    • Remote Working Options

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Industry: Computer Hardware & Networking
Spoken language(s): English

Other Skills

  • Detail Oriented
  • Flexibility
  • Communication
  • Problem Solving
