Bachelor’s degree in computer programming, computer science, or a related field. 5+ years of experience in a DevOps or Site Reliability Engineer role. Proficient in Kubernetes and Docker with production experience. Ability to write Bash and/or Python scripts.
Key responsibilities:
Design, build, test, deploy, and automate stable/scalable services for internal and end users.
Create and manage CI/CD pipelines for automated testing and deployment.
Monitor services and drive performance tuning while maintaining a fault-tolerant infrastructure.
Provide multi-tier support to engineering and non-engineering stakeholders.
Hi! We're Firework. Firework is an ecommerce tech startup that helps brands create & host interactive, shoppable, one-to-one, and short video experiences on their websites. Follow us to learn about developments in ecommerce, tech, and marketing.
Firework is the world’s leading unified video commerce platform that empowers its global partners to personalize the customer experience and engagement at scale. Firework bridges the offline and online for a robust omnichannel immersive brand experience cultivating a deeper emotional human connection between our partners and their end consumers. We are customer-centric and inspired to win together offering total solutions with endless possibilities to help our customers increase purchases and conversions using the power of video. At the heart, we are a global and diverse team of “SuperSpark” creators, entrepreneurs, life-long learners, and data geeks driven by the future of authenticity to transform commerce. Firework has raised over $235M to date, with its latest Series B round led by SoftBank Vision Fund 2. Come reimagine the online customer experience with us.
Summary
Our engineering team is growing! We’re looking for a talented DevOps Engineer to join our global team and build scalable systems that will shape the future of our cloud infrastructure for our customer-facing and internal systems.
What You'll Be Doing
Work across multiple functional teams to assess, design, build, and maintain a highly fault-tolerant, elastic infrastructure of tools and automation in the cloud.
Create deployments, services, and other resources on Kubernetes clusters.
Design, build, test, deploy, and automate stable/scalable services for the internal engineering team and end users.
Champion a flawless Service Level Agreement (SLA), aiming for the five-nines (99.999%) availability target.
Be available on-call during your shift to handle any P0 incidents and help bring the systems back online.
Create and manage CI/CD pipelines for automated testing, deployment, and any other use cases.
Continuously monitor all the services and drive performance tuning.
Maintain and improve our existing software engineering tools with upgrades and installations.
Integrate secure solutions and compliance management, including identity and access management (IAM) and role-based access control (RBAC) systems.
Debug, troubleshoot, and resolve system-level scale, performance, and automation problems.
Provide multi-tier levels of support to engineering and non-engineering stakeholders.
Check in code to GitHub repositories and perform code reviews for your fellow team members.
What You Should Have
Bachelor’s degree in computer programming, computer science, or a related field.
5+ years of experience in a DevOps or Site Reliability Engineer role.
Mix of consumer technology and SaaS technology is ideal.
Production experience working with and maintaining Kubernetes deployments and services.
Kubernetes (k8s) and Docker production experience.
Experience building out continuous integration and continuous deployment pipelines.
Able to write Bash and/or Python scripts.
Ability to own and be responsible for the projects you will be working on.
We'll Be Excited If You Have
Experience working with AWS cloud infrastructure and its various services.
Fluent in Terraform/Terragrunt and writing Infrastructure as Code (IaC).
Experience with and a thorough understanding of Linux operating systems.
Experience with high-traffic monitoring systems.
Implementation of logging (Grafana/Prometheus), telemetry (New Relic), and tracing is ideal.
Experience with Nginx deployments.
Experience working closely with SQL and NoSQL databases and executing zero-downtime database upgrades.
Excellent eye for security and creating bulletproof secure systems.
Excellent and effective verbal, written, and interpersonal communication skills.
Comfortable with fast-paced change: ability to demonstrate comfort with ambiguity, adapt quickly and be effective in new situations in a highly dynamic setting.
Data-driven but also imaginative and intuitive in coming up with ideas and solutions.
Proven ability to balance multiple priorities in a collaborative team environment.
This role may be remote in Latin America or Canada or may be hybrid in our San Mateo office. To determine a successful candidate’s starting pay, we carefully consider a variety of factors, including primary work location, an evaluation of the candidate’s skills and experience, market demands, and internal parity. Candidates may receive more information from the talent partner.
Don’t hold back
We understand some candidates may see the above and not apply because they don’t meet all the qualifications. We encourage you to apply anyway; we often find talented candidates who fit other opportunities we have, and we look for potential, not just what you did in the past. As an equal employment opportunity employer, we are a diverse team that strives for an inclusive environment for all. We prohibit discrimination and harassment of any kind based on race, color, sex, religion, sexual orientation, national origin, age, disability, genetic information, pregnancy, or any other protected characteristic as outlined by federal, state, or local laws.