Job description

Description

We are looking for a Tech Ops - Production Support & Reliability Lead.

Front-line production support for Braviant's AWS multi-account stack: monitor systems, triage alerts, execute runbooks, and escalate cleanly to developers. This is a defensive ownership role, not a developer role, despite "Lead" in the title.

Stack:

AWS - VPC, ECS, Lambda (SAM/CloudFormation), IAM, NAT, security groups
PostgreSQL on Amazon RDS (~15 instances)
Datadog + CloudWatch (APM, logs, alerting)
Java microservices / API-heavy app stacks
Jira (ITSM) + Slack (ops channels)
Nice-to-have: AWS data services (Glue, S3, Athena, EventBridge), Metaplane

Requirements

Must-have:

3+ years production support / SRE / NOC / ops engineering
Hands-on AWS - EC2/ECS, VPC networking, IAM
Operational PostgreSQL / RDS - slow query reading, basic tuning, vacuum awareness
Incident triage across infra + app layers
Structured incident response (ITIL, NIST, or equivalent)
SLA management in a ticketed environment (Jira or similar)
Strong written English for escalation + post-incident write-ups

Nice-to-have:

Datadog / CloudWatch fluency
AWS data services (Glue, S3, Athena, EventBridge)
Basic IaC (CloudFormation, SAM, Terraform)
Financial services or other regulated-environment background
AWS SysOps Administrator or Solutions Architect cert
Scripting / automation
