
Senior SRE/Deployment Engineer

Remote: Full Remote
Contract:
Salary: 121 - 210K yearly
Experience: Senior (5-10 years)
Work from:

Offer summary

Qualifications:

  • 5+ years of experience with production systems
  • Proficient in Linux and scripting tools (Bash/Python/Ansible)
  • Familiar with AWS, GCP, Azure, Kubernetes, and Docker
  • Experience with infrastructure services and tools
  • Knowledge of monitoring applications such as Prometheus and Grafana

Key responsibilities:

  • Develop and maintain deployment options for Comet
  • Identify and resolve infrastructure bugs quickly
  • Collaborate with customers on deployment needs
  • Work with cross-functional teams for system advancements
Comet Computer Software / SaaS Startup https://www.comet.com/
51 - 200 Employees
HQ: New York

Job description

About Comet

Our mission is to help every organization drive business value from AI. Strongly positioned at the forefront of the Gen AI revolution, Comet powers AI developers to accelerate development and execution. From the data scientist tracking training runs to the enterprise team deploying hundreds of models to production to individuals and teams developing generative AI applications, Comet is the platform used by the most innovative builders in the industry. 

Comet is backed by more than $63 million in venture capital funding and powers some of the best machine-learning teams in the world, including Netflix, Uber, Etsy, and Mobileye. We are a remote-first company with offices in New York City (USA) and Tel Aviv (Israel).

The role:

We are seeking a talented Senior SRE/Deployment Engineer to join our team and help build, deploy and maintain our products across various platforms, including multi-cloud, on-premises, and bare-metal deployments.

Responsibilities:
  • Develop and maintain all deployment options for Comet, including multi-cloud, on-premises, and bare-metal deployments, whether on a single Linux server or on containerization technologies such as Kubernetes
  • Quickly identify and resolve infrastructure bugs, ensuring high system availability and reliability
  • Work closely with customers to understand their deployment needs and provide effective support for deploying and maintaining Comet on their infrastructure
  • Drive system advancements by collaborating with cross-functional teams, including development and support, to ensure seamless integration and successful deployment of new features and updates
Requirements:
  • Proficient in Linux system internals, scripting, and configuration management tools (Bash/Python/Ansible)
  • 5+ years of experience running production systems in the cloud (AWS, GCP, or Azure) and using containerization technologies such as Kubernetes and Docker to deploy and manage applications
  • Familiarity with cloud-based infrastructure services such as EC2, RDS, S3, and VPC, and with related tools such as CloudFormation and Terraform
  • Experience with monitoring applications such as Prometheus, Grafana, or the ELK stack
  • Excellent verbal and written communication skills for collaborating effectively with team members and clients
  • Passionate about troubleshooting and investigating in unfamiliar environments
  • Proven customer support and customer-facing experience, capable of assisting clients with varying levels of technical expertise

What We Offer:

  • Competitive base salary: $170K - $210K, based on proven experience, skills, and location.
  • Competitive benefits package.
  • Flexible working hours and remote work options.
  • Opportunities for professional growth and development.
  • A collaborative and innovative work environment.
  • The chance to work with cutting-edge technologies and projects.
  • This role will be fully remote in the USA, working with a global team (large presence in the US, Tel Aviv, and Europe); some flexibility with work hours is required.

Comet is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees without regard to race, religion, color, sex, gender identity, gender expression, sexual orientation, national origin, ancestry, citizenship status, uniform service member status, marital status, pregnancy, age, medical condition, physical or mental disability, genetic information/characteristics, and any other characteristic protected by State or Federal law.

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Computer Software / SaaS
Spoken language(s): English

Other Skills

  • Collaboration
  • Troubleshooting (Problem Solving)
  • Verbal Communication Skills
