Bachelor's Degree in Computer Science, Design, or related field.
5-7 years of experience in a Site Reliability Engineering role or similar.
Deep technical expertise in AWS, containerization, monitoring, and automation tools.
Strong communication skills in English, both written and verbal.
Key responsibilities:
Ensure system reliability and monitor system health.
Build and improve platform infrastructure and applications.
Collaborate with development teams to enhance services and release processes.
Optimize system performance and drive continuous improvement.
NTD Software is a Mexican company located in Guadalajara, Jalisco, known as "the Silicon Valley of Mexico." We help both startups and large companies by finding the right people to join their teams and by creating digital solutions using the latest or well-established programming languages and tools. Our expertise is in building software from the ground up and expanding our clients' existing teams, which allows us to work with businesses globally.
We are looking for a Senior Site Reliability Engineer with strong experience in AWS, system monitoring, and infrastructure automation. The role involves maintaining and improving the reliability and performance of a cloud-based lending platform used by mid-market and large financial institutions.
The ideal candidate will have a solid background in systems engineering and software development, be comfortable working across teams, and take ownership of operational stability and tooling improvements.
Responsibilities:
Take ownership of learning the software in depth: its functions, how it fulfills clients' needs, and how clients use the product.
Oversee systems to ensure reliability for customers.
Monitor distributed systems and notify the appropriate people of any potential issues.
Run the production environment by monitoring availability and taking a holistic view of system health.
Build software and systems to manage platform infrastructure and applications.
Improve reliability, quality, and time-to-market of our suite of software solutions.
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
Partner with development teams to improve services through rigorous testing and release procedures.
Technical Skills:
Bachelor's Degree (B.A.) in Computer Science, Design, or an equivalent four-year degree, or equivalent related experience.
5-7 years of proven experience in a Site Reliability role or similar experience.
Excellent oral and written communication skills in English, including facilitating group presentations and consulting.
Possess deep technical experience with AWS, containerization technologies, automated deployment frameworks, monitoring, logging, alerting, system internals, networking, databases, distributed systems, and service-oriented architecture.
Demonstrate hands-on technical leadership and business impact in combining software engineering skills with systems engineering skills to solve complex automation and reliability challenges.
Experience working with infrastructure and application monitoring tools such as New Relic, Sumo Logic, Pingdom (uptime monitoring), CloudTrail, and CloudWatch Insights, and with AWS provisioning and deployment tools such as CloudFormation, CodePipeline, and CodeDeploy.
Extensive working knowledge of managing AWS and Linux OS.
Experience working with MSSQL and MySQL in cloud-based environments, as well as demonstrable knowledge of and experience with AWS database services, e.g., Aurora, MySQL.
Experience working with NoSQL database technologies (ideally DynamoDB).
Experience working with pipeline automation scripting and tooling, e.g., Jenkins, Terraform.
Knowledge of and experience with programming languages (e.g., C++, Java, PHP) and cloud platforms (e.g., AWS).
Ability to learn new languages and technologies strongly preferred.
Broad understanding of the lending industry, with the ability to become a subject matter expert on the job.
Soft Skills:
A strong sense of ownership.
Excellent written and verbal communication and interpersonal skills.
Able to effectively collaborate with technical and business partners.
Can take on full projects from beginning to end.
Problem solver.
Team Player.
Advanced English level.