Key Facts

Remote From: Brazil

Full time

Senior (5-10 years)

English

Hard Skills

Other Skills

• Communication
• Teamwork
• Problem Solving

Sky Systems, Inc. (SkySys)

Information Technology & Services

About Sky Systems, Inc. (SkySys)

Sky Systems, Inc. (DBA SkySys) is a technology consulting firm based in Research Triangle Park, North Carolina, United States. SkySys specializes in Recruitment & Staffing, 24/7 On-Site & Remote Services, Managed Services (MSP), Contact Center Solutions (Cisco, Avaya, Genesys), Web Solutions, and other services, and is a Cisco Select Certified Partner and Dell Technology Partner. SkySys currently works with clients across the United States and Canada. Our list of clients includes top Fortune 500 companies in various industries: Financial Services, Banking, Pharmaceutical, IT Service Providers, Healthcare, Oil & Gas, Government, Consulting and Outsourcing, Telecommunications, Insurance, Aerospace, Semiconductors, and many more.

Company type: Startup

Industry: Information Technology & Services

Founded: 2018

Company size: 11 - 50

Job description

Role: Platform Engineer / DevOps Engineer
Position Type: Full-Time Contract (40 hrs/week)
Contract Duration: 12 Months
Work Hours: EST/CST
Location: 100% Remote (candidates can work from anywhere in LATAM countries)

WHAT YOU'D BE DOING:

• Operate and improve platform tools so product teams can ship reliably: triage tickets, fix build issues, and handle routine service requests (access, secrets, environment setup).
• Maintain and extend self-service workflows (templates, golden paths) by updating docs, examples, and guardrails under guidance from senior engineers.
• Perform day-to-day Kubernetes operations: deploy/update Helm charts, manage namespaces, diagnose rollout issues, and follow runbooks for incident response.
• Support CI/CD pipelines (e.g., GitLab CI): keep pipelines green, add/adjust jobs, implement basic quality gates, and help teams adopt safer deploy strategies (blue/green, canary).
• Monitor and operate the observability stack using Prometheus, Alertmanager, and Thanos; maintain alert rules, dashboards, and SLO/SLA indicators; help reduce alert noise and improve signal quality.
• Assist with service instrumentation across the core observability pillars—tracing, logging, and metrics—with hands-on OpenTelemetry usage (collectors/SDKs) and related telemetry tooling.
• Contribute to and improve documentation: runbooks, FAQs, onboarding guides, and standard operating procedures.
• Participate in an on-call rotation as needed with a well-defined escalation path; assist during incidents, post small fixes, and capture learnings in docs.
• Help with cost- and performance-minded housekeeping: right-size workloads, prune unused resources, and automate routine tasks where appropriate.

WE'RE LOOKING FOR SOMEONE WHO HAS:

• 6+ years in a platform/SRE/DevOps or infrastructure role, with a strong bias toward automation and support.
• Experience operating Kubernetes (or similar) and core ecosystem tools (Helm, Docker, Ingress NGINX, Argo Rollouts basics).
• Hands-on CI/CD experience (preferably GitLab CI): writing/modifying jobs, artifacts, environments, and basic deployment strategies.
• Scripting ability in Bash or Python (Go a plus) to automate repetitive tasks and improve runbooks.
• Familiarity with AWS fundamentals (e.g., IAM, EC2/EKS, S3, CloudWatch/CloudTrail, Parameter Store/Secrets Manager).
• Practical understanding of monitoring/observability (dashboards, logs, alerts) and how to use them for triage and remediation, including Prometheus/Alertmanager/Thanos and OpenTelemetry basics.
• Comfortable working from tickets (Jira/ServiceNow), following change-management practices, and communicating clearly with stakeholders.

HIGHLY PREFERRED CANDIDATES ALSO HAVE:

• Terraform experience for infrastructure as code (managing AWS, Kubernetes add-ons, and platform components).
• API integration experience (Java, Python, or Go) to build small internal tools or glue code.
• Deeper Linux fundamentals and container runtime basics for effective debugging and performance tuning.
• Exposure to insurance/financial services environments, including awareness of compliance and operational controls.

MUST HAVE:

• At least mid-level (not too junior).

• AWS Cloud Experience
– Hands-on AWS experience is required
– Must clearly appear on the resume

• Kubernetes (Critical)
– Strong Kubernetes experience is mandatory
– Must have:
▪ Operating and maintaining clusters
▪ Some experience building Kubernetes environments
– Manager explicitly stated:
"If you don't know how to build, you can't operate."

• Infrastructure as Code (Terraform)
– Terraform experience explicitly mentioned as important
– Expected to understand IaC concepts

• Python (Hard Requirement)
– Used for:
▪ Automation
▪ Scripting
▪ Internal tooling
– Skill level: mid-level, not advanced application development

• CI/CD Pipeline Ownership
– Hands-on experience with building, maintaining, and enhancing CI/CD pipelines
– Tools:
▪ GitLab CI (preferred)
▪ Jenkins / GitHub Actions acceptable
– Focus on practical ownership, not tool brand

• Platform / DevOps Engineering Background
– Must come from:
▪ Platform Engineering or
▪ Strong DevOps / Infrastructure background
– Hands-on work with infrastructure, pipelines, and cloud operations
– Not a consulting or advisory role

• Observability (Conceptual Strength Required)
– Must understand observability principles and architecture
– Tools interchangeable (Prometheus, Thanos, Grafana, Datadog, New Relic)
– Focus on concepts + implementation

• Linux Fundamentals
– Solid working knowledge required

• Mid-Size Company Experience
– Preference for candidates from modern, mid-size engineering organizations

NICE TO HAVE:

• Security exposure in CI/CD pipelines and infrastructure workflows
• Multi-cloud exposure (Azure or GCP acceptable as long as the candidate also has AWS experience)
• Go or additional scripting languages (Python still mandatory)
• Experience supporting large engineering groups (~100–150 engineers)
