Cloud Infrastructure Engineer (Open LMS) Hungary, Remote at LTG

Requirements

  • Strong experience with AWS services in production
  • Proficiency in authoring and maintaining Terraform modules
  • Proficiency in authoring and maintaining Puppet modules
  • Deep Linux systems knowledge (Ubuntu)

Roles & Responsibilities

  • Designing, building, and maintaining AWS infrastructure using Terraform
  • Writing and maintaining Puppet modules to configure and manage fleets of EC2 instances
  • Maintaining and extending Python-based automation and tooling for platform operations
  • Operating and improving distributed service discovery and configuration management

Job description

Role Description

We are looking for a Senior Cloud Infrastructure Engineer to join our team and help build, scale, and evolve our multi-tenant SaaS hosting platform on AWS. Our platform dynamically provisions, manages, and scales hundreds of Moodle LMS instances for education clients — powered by custom orchestration tooling, distributed service discovery, and infrastructure as code.

This is a hands-on infrastructure role. You'll work across the full stack — from Terraform modules and Puppet manifests to Python automation and observability pipelines. The platform is not containerised — there is no Kubernetes here — so we're looking for someone who understands Linux systems deeply and can reason about distributed systems problems from first principles.

You'll have real ownership and influence over the platform's architecture and direction as we continue to grow and evolve the infrastructure.

What You'll Be Doing

  • Designing, building, and maintaining AWS infrastructure using Terraform (EC2, RDS, S3, SQS, Lambda, ALB, ElastiCache, Route 53, VPC networking)
  • Writing and maintaining Puppet modules to configure and manage fleets of EC2 instances across multiple auto-scaling groups
  • Maintaining and extending Python-based automation and tooling that supports platform operations
  • Operating and improving distributed service discovery and configuration management (etcd)
  • Managing and tuning a multi-tier caching strategy (Varnish, Redis/Valkey, PHP OPcache)
  • Running and scaling our observability stack (Prometheus, Grafana, Loki, Fluentd, PagerDuty) and participating in on-call rotations
  • Evaluating and implementing distributed storage solutions as the platform evolves
  • Improving deployment workflows and release processes
  • Collaborating with internal teams on API contracts, integration patterns, and operational tooling
  • Participating in incident response, root cause analysis, and platform reliability improvements

Skills and Aptitudes

  • Strong experience with AWS services in production — particularly EC2, RDS, S3, SQS, Lambda, ALB, ElastiCache, Route 53, IAM, and VPC networking
  • Proficiency in authoring and maintaining Terraform modules for production infrastructure
  • Proficiency in authoring and maintaining Puppet modules (or equivalent agent-based configuration management) for fleet management
  • Solid Python skills — you'll be writing and maintaining production daemons, not just scripts
  • Deep Linux systems knowledge (Ubuntu) — comfortable with Apache/Nginx, PHP-FPM, Varnish, systemd, filesystem mounts, and networking fundamentals
  • Understanding of distributed systems concepts: consensus, leader election, distributed locking, eventual consistency, and the tradeoffs involved
  • Proficiency in building and maintaining observability pipelines (Prometheus, Grafana, Loki, or equivalent) in production
  • Comfortable working in a GitLab-based CI/CD workflow
  • Clear communicator who can document architectural decisions and explain technical tradeoffs to both technical and non-technical stakeholders

Additionally, a Top Candidate Will Exhibit One or More of the Following Preferred Qualifications

  • Hands-on experience with distributed storage systems such as Ceph, GlusterFS, JuiceFS, CubeFS, or AWS EFS — particularly in the context of migration or evaluation
  • Familiarity with etcd (or similar distributed key-value stores like Consul or ZooKeeper) including watch APIs, TTL-based locking, and cluster operations
  • Experience with Varnish and VCL, especially dynamic backend routing or multi-tenant configurations
  • Working knowledge of PHP — not to build applications, but to understand and maintain integration scripts that bridge infrastructure and application layers
  • Background in multi-tenant SaaS platform design — particularly database-per-tenant models on shared infrastructure
  • Familiarity with Moodle LMS or education technology platforms
  • Experience with secrets management solutions (AWS Secrets Manager, HashiCorp Vault, Parameter Store) and automated credential rotation
  • Experience designing zero-downtime deployment strategies for VM-based (non-containerised) environments

Open LMS is an equal employment opportunity/affirmative action employer and considers qualified applicants for employment without regard to race, gender, age, color, religion, national origin, marital status, disability, sexual orientation, or any other protected factor.
