Minimum of 5 years' experience with AWS and AI/ML services.
Proficiency in infrastructure automation tools like CloudFormation and Terraform.
Strong coding skills in Python, Bash, or similar languages.
Bachelor's or Master's degree in Computer Science or a related field.
Key responsibilities:
Build and enhance AI/ML infrastructure and applications on AWS.
Automate infrastructure deployment and optimize existing systems.
Collaborate with development teams to develop reusable cloud patterns.
Develop monitoring, automation, and operational tools for ML platforms.
JSR Tech Consulting provides contract consulting services for business and technology leaders in financial services, pharmaceuticals and healthcare. Our principals have close to 30 years of experience in supplemental services built on many long-term relationships. We can take on requirements of any kind: filling a single strategic position or supporting multimillion-dollar projects.
JSR is also a certified Women-Owned Business Enterprise and an advocate of Disability:IN, a national non-profit supporting disability inclusion in the workplace and the supply chain.
The position is a combination of GenAI (70%) and MLOps (30%).
Candidate should have at least one AWS certification.
Terraform and CloudFormation are a big plus.
As a Cloud Engineer within the AWS AI/ML platform team, you will have the opportunity to work one-on-one with application and infrastructure developers to build and enhance the AI/ML infrastructure and application patterns that power mission-critical applications, ensuring that they're engineered for high availability, durability, and resiliency. You will be part of an agile team that combines various backgrounds, experiences, and perspectives to solve complex problems within AWS and beyond.
Responsibilities:
Focus on optimizing existing systems, building infrastructure, and eliminating work through automation.
Influence application and security architecture and design across multi-cloud and hybrid cloud platforms.
Peer-review infrastructure-as-code (AWS CloudFormation, Python, Terraform, or similar).
Partner with application and infrastructure teams to develop reusable cloud patterns.
Deploy and troubleshoot infrastructure code.
Partner with the Site Reliability Engineering (SRE) team to conduct post-incident reviews and root cause analysis, and build monitoring and automation to prevent future incidents.
Identify opportunities to build self-service capabilities and automate infrastructure and application deployments.
Develop tools and best practices for platform development, developer productivity, automation (MLOps, CI/CD, A/B testing), and production operations.
Design, develop, and deliver critical components, frameworks, services, and products using AWS SageMaker, Bedrock, Lambda, and container technologies in AWS.
Expertise in LLMs, agentic frameworks, and agentic AI.
Develop processes, model monitoring, and a governance framework for successful ML model operationalization.
Define standards for engineering and operational excellence for running best-in-class ML platforms and continue to improve ML platforms to keep up with the latest innovations.
Assist in gathering and analyzing non-functional requirements and translating them into technical specifications for robust, scalable, supportable solutions that work well within the overall system architecture.
Technical Qualifications:
The cloud is a rapidly changing world, with the major players announcing new features almost on a daily basis. A successful Cloud Engineer doesn't need to know everything about everything but instead keeps a pulse on new developments and emerging paradigms to identify areas where they can continuously improve their skill sets.
Ability to debug, optimize code, and automate routine tasks.
A systematic problem-solving approach coupled with a strong sense of ownership and drive.
Ability to quickly pick up newly released cloud services and understand where they would be appropriate for business applications.
Experience with infrastructure automation tools such as Puppet, Ansible, CloudFormation, or Terraform.
Working knowledge of pipeline-automation tools such as Jenkins, CodePipeline, Azure DevOps, or other comparable tools.
Experience using Git for source control management.
Ability to proficiently write code in Python, Node.js, Bash (shell), PowerShell, or other similar languages.
Experience using Docker within container orchestration platforms such as AWS ECS, EKS, Google Anthos, or others.
Comfortable in a Linux environment.
Understanding of foundational AWS services such as VPCs, EC2, S3, RDS, Auto Scaling Groups, CloudWatch Logs, etc.
In-depth knowledge of security and IAM within AWS, including the management and operation of Security Groups, KMS Keys, VPC NACLs, and SCPs.
Familiar with ETL and big data tool-chains such as those provided by Hadoop/EMR, Glue, Spark, Impala, or similar.
Understanding of relational database systems and how applications interact with them.
Familiarity with one or more log and event aggregation and monitoring systems such as Splunk, Elasticsearch (ELK), Prometheus, Grafana, or similar.
Qualifications:
5+ years of experience with Amazon Web Services (AWS) and AI/ML services.
Experience in working in an Agile/Scrum-focused organization.
Strong verbal and written communication skills; comfortable with translating technical problems to non-technical audiences.
MS/BS degree in Information Technology, Computer Science, related technical field, or equivalent practical experience.
Preferred Qualifications:
One or more Associate or Professional-level AWS certificates.
Prior experience on a DevOps, DevSecOps, SRE, or UNIX/Linux sysadmin team.
Required profile
Experience
Level of experience: Senior (5-10 years)
Spoken language(s):
English