SRE Support and Automation Engineers

Benefits: extra holidays, extra parental leave
Work set-up: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
  • At least 4 years of professional experience in software engineering, preferably in backend or platform teams.
  • Proficiency in programming languages such as Java, Go, or Python.
  • Experience with automation scripting and incident management, and familiarity with cloud platforms and container orchestration.

Key responsibilities:

  • Monitor the health of critical services and proactively address issues.
  • Collaborate with teams to develop solutions ensuring high availability and performance.
  • Work with partner teams to resolve technical issues and develop SOPs.
  • Act as Incident Commander during major incidents and improve monitoring tools.

Astreya Large https://www.astreya.com/
1001 - 5000 Employees

Job description

SRE Support and Automation Engineers

India (Bangalore)

Job Summary

The Site Reliability Engineering (SRE) team bridges the gap between software development and operations. Our mission is to build systems, tools, and platforms that keep our services fast, available, and reliable at global scale. The SRE team works closely with product engineering teams to design, build, and operate resilient applications that power the commerce experiences of millions.

Astreya is looking for Software Engineers with a passion for Reliability, Scalability, and Performance: engineers who bring both a developer’s mindset and a systems-thinking approach.

Key Responsibilities:

  • Proactive Monitoring: Continuously monitor the health of eBay's critical services to identify and address potential issues before they escalate.
  • Solution Development: Collaborate with Architecture, Engineering, and Operations teams to develop solutions that ensure high site availability, reliability, and performance.
  • Collaborative Problem Solving: Work closely with partner teams to resolve recurring technical issues, onboard new alerts, and develop high-quality Standard Operating Procedures (SOPs).
  • Enhance Monitoring Tools: Build and improve tools for monitoring and mitigating site incidents, and conduct reliability audits and tests to strengthen eBay’s reliability and incident management capabilities.
  • Incident Management: Act as Incident Commander to drive resolution of major incidents, manage alarms, and ensure effective communication with leadership and partner teams.

Qualifications/Skills:

  • Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
  • 4+ years of professional experience in software engineering, ideally in backend or platform teams.
  • Proficiency in one or more programming languages (e.g., Java, Go, Python).
  • Experience writing automation scripts to eliminate repetitive manual tasks.
  • Strong incident management and leadership skills, with excellent technical triage and troubleshooting abilities, especially during crises.
  • Familiarity with cloud platforms, container orchestration (e.g., Kubernetes), and infrastructure-as-code tools.
  • Experience with observability stacks (e.g., Prometheus, Grafana, ELK, OpenTelemetry).
  • Strong interpersonal and communication skills to thrive in fast-paced, dynamic environments.

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s): English

Other Skills

  • Troubleshooting (Problem Solving)
  • Leadership
  • Collaboration
  • Communication
  • Problem Solving
