Staff Site Reliability Engineer

Work set-up: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Extensive experience in SRE, DevOps, or Systems Engineering.
  • Proficiency in cloud infrastructure, especially AWS, including networking and security.
  • Strong scripting skills in Python, Bash, and Linux tools.
  • Deep understanding of UNIX/Linux internals and container orchestration with Kubernetes.

Key responsibilities:

  • Design, implement, and operate container infrastructure using Kubernetes.
  • Develop and maintain automated CI/CD pipelines with best deployment practices.
  • Lead the adoption of Infrastructure as Code principles using Terraform.
  • Monitor, troubleshoot, and optimize the platform's reliability, scalability, and efficiency.

Addepar SME https://addepar.com/
501 - 1000 Employees

Job description

Who We Are

Addepar is a global technology and data company that helps investment professionals provide the most informed, precise guidance for their clients. Hundreds of thousands of users have entrusted Addepar to empower smarter investment decisions and better advice over the last decade. With client presence in more than 50 countries, Addepar’s platform aggregates portfolio, market and client data for over $7 trillion in assets. Addepar’s open platform integrates with more than 100 software, data and services partners to deliver a complete solution for a wide range of firms and use cases. Addepar embraces a global flexible workforce model with offices in New York City, Salt Lake City, Chicago, London, Edinburgh, Pune, and Dubai.

The Role

We are looking to add a highly experienced and impactful colleague to the organization to drive the transformation of Addepar’s Production Engineering and SRE team. This role focuses on evolving our platform towards enabling high-level declarative infrastructure orchestration and its operations. This platform closely integrates our Compute, Network, and Storage control planes, allowing us to develop highly efficient and fast-to-iterate-on services tailored to various product areas within the company, abstracting our developers from the nuances of underlying infrastructure.

The ideal candidate will play a leading, staff-level role in implementing, maintaining, and strategically evolving Addepar’s Production Infrastructure. You will bring a robust combination of experience leading innovative solutions across functional teams and extensive hands-on development experience in AWS/cloud, Linux/Unix, networking, advanced scripting, containerization, Kubernetes, Terraform, Information Security, deep debugging, and comprehensive monitoring/observability. This includes designing, deploying, monitoring, automating, and optimizing all operational aspects of Addepar’s platform with a focus on reliability, scalability, and efficiency.

What You’ll Do
  • Lead the design, implementation, and operationalization of container infrastructure using Kubernetes (k8s), ensuring high availability, performance, and security
  • Architect, build, and maintain advanced, automated CI/CD pipelines using Jenkins, ArgoCD, AWS CodeBuild/Pipeline, GitHub Actions, or similar, establishing best practices for deployment strategies (e.g., blue/green, canary)
  • Drive the adoption and evangelism of Infrastructure as Code (IaC) principles using Terraform, scaling the Addepar Platform across regions while optimizing cost and operational efficiency
  • Develop deep application-level knowledge to proactively inform and influence infrastructure requirements and constraints for Developers, QA, and Management, including implementing sophisticated dashboards for Cost and Inventory management, performance analysis, and capacity planning
  • Perform advanced monitoring and troubleshooting of our infrastructure and application stack using a wide array of logging/monitoring tools, driving root cause analysis and implementing preventative measures
  • Initiate and lead collaborations with cross-functional teams to identify and resolve complex application or infrastructure issues, serving as a technical subject matter expert
  • Serve as a primary on-call responder for critical incidents, demonstrating strong problem-solving skills under pressure and contributing to post-incident reviews to improve system resilience

Who You Are
  • Extensive progressive experience in the SRE/DevOps/Systems Engineer field, with a track record of taking on increasing responsibility
  • Expert-level understanding of Cloud Infrastructure fundamentals (AWS preferred), including advanced networking, security, and managed services
  • Exceptional programming/scripting skills in various common languages (Python, Bash, and general Linux tools are essential; Java is a strong plus), with an emphasis on building scalable, maintainable automation and tools
  • Broad and deep expertise with UNIX/BSD/Linux internals (Ubuntu preferred), including performance tuning, kernel-level debugging, and advanced system administration
  • Extensive Containerization experience with k8s (KOPS, EKS, ECS preferred), including cluster management, custom resource definitions (CRDs), and advanced deployment strategies
  • Demonstrable experience leading initiatives with infrastructure-as-code tools such as Terraform in complex, multi-account environments
  • Proficiency with comprehensive monitoring, logging, and alerting tools such as Prometheus, Grafana, Sentry, Sumo Logic, or advanced AWS cloud-native tools, with a focus on observability strategy
  • Excellent interpersonal and communication skills to effectively collaborate with multifunctional teams, articulate complex technical concepts, and influence outcomes
  • Demonstrable experience writing and contributing to significant systems automation tooling or open-source projects is a strong plus
  • Exposure to industry practices in financial services is a plus

Our Values
  • Act Like an Owner: Think and operate with intention, purpose and care. Own outcomes.
  • Build Together: Collaborate to unlock the best solutions. Deliver lasting value.
  • Champion Our Clients: Exceed client expectations. Our clients’ success is our success.
  • Drive Innovation: Be bold and unconstrained in problem solving. Transform the industry.
  • Embrace Learning: Engage our community to broaden our perspective. Bring a growth mindset.

In addition to our core values, Addepar is proud to be an equal opportunity employer. We seek to bring together diverse ideas, experiences, skill sets, perspectives, backgrounds and identities to drive innovative solutions. We commit to promoting a welcoming environment where inclusion and belonging are held as a shared responsibility.

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

PHISHING SCAM WARNING: Addepar is among several companies recently made aware of a phishing scam involving con artists posing as hiring managers recruiting via email, text and social media. The imposters are creating misleading email accounts, conducting remote “interviews,” and making fake job offers in order to collect personal and financial information from unsuspecting individuals. Please be aware that no job offers will be made from Addepar without a formal interview process. Additionally, Addepar will not ask you to purchase equipment or supplies as part of your onboarding process. If you have any questions, please reach out to TAinfo@addepar.com.

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s): English

Other Skills

  • Problem Solving
  • Collaboration
  • Communication
