Role: Linux Server Manager
Starting Date: Immediately
Contract: Long term
Work Place: São Paulo/SP (100% remote)
Work Time: 9:00 AM to 6:00 PM
Language: Advanced English is mandatory
Position Overview
We are seeking a highly experienced Server Manager to join the Enhanced Operation Service (EOS) team within the Enterprise Cloud Services Delivery organization.
In this role, you will act as a trusted advisor, ensuring the stability, performance, and continuous optimization of end-to-end service delivery for strategic customers throughout their cloud transformation journey.
You will be part of a global engagement, working within a 24x7 operational model supporting a Tech Mahindra client. This is a fully remote position with an initial assignment of at least one year. Standard working hours are Monday to Friday, 9:00 AM to 6:00 PM; however, flexibility is required as schedules may evolve based on operational needs.
Key Responsibilities
This role combines operational excellence with proactive service improvement.
• Incident and Problem Management: Lead and support the resolution of Major Incidents, manage service request failures, and perform root cause analysis (RCA) for outages and performance degradation.
• Service and Change Management: Execute complex service and change requests, including activities involving extended downtime windows and long-running operations.
• Performance and Stability Optimization: Identify, propose, and drive initiatives to enhance system performance, operational stability, and environment standardization.
• Advanced Technical Support: Provide expert-level support in Linux and infrastructure domains, with strong troubleshooting capabilities across servers, storage, and network connectivity.
• Continuous Improvement: Enhance Standard Operating Procedures (SOPs) through automation and define corrective action plans to consistently meet or exceed KPIs.
• Orchestration and Stakeholder Management: Coordinate activities across multiple internal teams and external cloud service providers to ensure seamless and efficient service delivery.
Core Technical Requirements
• Linux (Mandatory): Strong hands-on experience administering enterprise Linux distributions such as SUSE, Red Hat, or Ubuntu, with proven expertise in system, disk, and performance troubleshooting.
• Clustering and High Availability (Mandatory – Flexible): 2–3 years of experience with High Availability (HA) architectures. Familiarity with Pacemaker is preferred; experience with alternatives such as Red Hat HA is also acceptable.
• Cloud Platforms (Mandatory): Practical experience with at least one major public cloud provider (AWS, Azure, or GCP). Exposure to multi-cloud environments is a plus.
• Networking: Advanced troubleshooting capabilities in Linux and cloud environments, including TCP/IP, DNS, LDAP, NAT, firewalls, and connectivity diagnostics.
• Automation: Experience with scripting languages (Shell, Python, Go, etc.) and configuration and automation tools such as Ansible or Chef.
Qualifications and Experience
• Professional Experience: 8–10 years of experience in IT infrastructure and server operations.
• Education: Bachelor’s degree in Computer Science, Engineering, IT Management, or a related field.
• Language: Fluent English is mandatory, as all communication, documentation, and customer interaction are conducted in English.
• Soft Skills: Strong customer orientation, analytical and solution-driven mindset, autonomy in problem-solving, and the ability to proactively acquire new knowledge.
Further information will be provided during the technical interview.