Senior Engineer - Data/ML Platforms SRE
As a Senior Reliability Engineer, you will play a critical role in ensuring the robustness, availability, and performance of our cutting-edge Data Engineering and Machine Learning Platforms. You'll collaborate closely with cross-functional teams, including platform developers and infrastructure experts, to enhance the reliability and resilience of our modern platforms. If you're passionate about pushing the boundaries of technology and thrive in a dynamic environment, this role is for you. You will help drive our insurance business transformation as we transition from a traditional IT model to a tech organization with engineering excellence as its mission, while co-creating a culture of psychological safety and continuous improvement.
Responsibilities :
Reliability Enhancements
- Design, develop, and implement software solutions that enhance the reliability and fault tolerance of our modern data and machine learning platforms
- Collaborate with software engineers to create robust, scalable, and efficient platforms
- Proactively identify and address potential reliability bottlenecks and performance issues
Automation and Monitoring :
- Develop and maintain automated processes for deployment, scaling, and maintenance of platforms
- Build effective monitoring systems to detect anomalies, performance degradation, and capacity issues
- Implement proactive measures to prevent incidents
Incident Response and Troubleshooting :
- Participate in on-call rotations to respond to incidents promptly
- Investigate and resolve storage-related incidents, ensuring minimal impact on services
- Conduct post-incident reviews to learn from incidents and improve system reliability
Capacity Planning and Scaling :
- Collaborate with infrastructure teams to plan for storage and compute capacity needs
- Scale storage systems efficiently to accommodate growing demands
- Optimize resource utilization while maintaining high availability
Documentation and Knowledge Sharing:
- Document processes, procedures, and best practices
- Share knowledge with colleagues to foster a culture of continuous improvement
- Mentor junior engineers
Qualifications :
- Bachelor’s degree in Computer Science, Information Systems, or equivalent education or work experience
- Minimum of 5 years of experience in Data Engineering pipeline-related roles
- Experience in Big Data ecosystem : ETL, tooling of Big Data Platform (Apache Spark, Airflow), Datalake, Synapse or Snowflake
- Experience in Machine Learning ecosystem : training models, inference, experimentation, and pipelines infrastructure.
- Proficiency in modern on-prem object storage technologies (Ceph, MinIO) and their cloud equivalents (AWS S3, Azure Blob Storage, Google Cloud Storage)
- Experience with infrastructure automation, tooling, and configuration management frameworks (e.g., Puppet, Chef, Ansible, Terraform, Pulumi, etc.)
- Fluency in SQL and NoSQL
- Knowledge of CS data structures and algorithms.
- Fluency and specialization in at least two modern languages such as Java, Python, or Go, including object-oriented design.
- Experience with Prometheus, Loki, and Grafana.
- Experience with container orchestration platforms (Kubernetes or Docker Swarm).
- Experience with Linux and the open source ecosystem
- Self-driven with an analytical, first-principles mindset
- Ability to take a complex challenge and deliver quality simple solutions
- Effective communication skills for cross-functional collaboration.
Annual Salary
$82,000.00 - $185,000.00
The above annual salary range is a general guideline. Multiple factors are taken into consideration to arrive at the final hourly rate or annual salary offered to the selected candidate. Factors include, but are not limited to, the scope and responsibilities of the role, the selected candidate’s work experience, education and training, and the work location, as well as market and business considerations.
At this time, GEICO will not sponsor a new applicant for employment authorization for this position.
Benefits:
As an Associate, you’ll enjoy our Total Rewards Program* to help secure your financial future and preserve your health and well-being, including:
- Premier Medical, Dental and Vision Insurance with no waiting period**
- Paid Vacation, Sick and Parental Leave
- 401(k) Plan
- Tuition Reimbursement
- Paid Training and Licensures
*Benefits may be different by location. Benefit eligibility requirements vary and may include length of service.
**Coverage begins on the date of hire. Must enroll in New Hire Benefits within 30 days of the date of hire for coverage to take effect.
The equal employment opportunity policy of the GEICO Companies provides for a fair and equal employment opportunity for all associates and job applicants regardless of race, color, religious creed, national origin, ancestry, age, gender, pregnancy, sexual orientation, gender identity, marital status, familial status, disability or genetic information, in compliance with applicable federal, state and local law. GEICO hires and promotes individuals solely on the basis of their qualifications for the job to be filled.
GEICO reasonably accommodates qualified individuals with disabilities to enable them to receive equal employment opportunity and/or perform the essential functions of the job, unless the accommodation would impose an undue hardship to the Company. This applies to all applicants and associates. GEICO also provides a work environment in which each associate is able to be productive and work to the best of their ability. We do not condone or tolerate an atmosphere of intimidation or harassment. We expect and require the cooperation of all associates in maintaining an atmosphere free from discrimination and harassment with mutual respect by and for all associates and applicants.