Offer summary

Qualifications:

12+ years of experience in cloud platform engineering, DevOps, or site reliability engineering with a focus on automation.

Proficiency in PowerShell scripting and Infrastructure as Code using Bicep.

Strong understanding of CI/CD processes and experience with YAML pipelines in Azure DevOps.

In-depth knowledge of Microsoft 365 platform and Azure-native services.

Key responsibilities:

Lead investigation and resolution of critical incidents in Azure and Microsoft 365 automation workflows.

Debug and optimize PowerShell, Bicep, and .NET components within automated provisioning workflows.

Collaborate with product owners to introduce new automation use cases and conduct post-incident reviews.

Mentor L1 and L2 engineers and stay updated with changes in Azure and Microsoft 365 APIs.

Job description

Key Responsibilities:

Lead investigation and resolution of critical, recurring, or high-impact incidents across Azure and Microsoft 365 automation workflows.

Deep-dive into PowerShell, Bicep, and YAML scripts to identify logic errors, misconfigurations, or scalability limitations within automated provisioning workflows.

Debug and optimize .NET (C#) components within Azure Functions or related application layers used in workflow orchestration.

Analyze usage patterns and telemetry data from Azure Monitor, Application Insights, and Log Analytics to identify systemic issues or opportunities for automation enhancement.

Implement fixes and design improvements to automation logic that reduce manual intervention and improve workflow reliability (e.g., auto-remediation scripts, retry logic).

Own and evolve the automation framework for Teams and SharePoint Online (SPO) lifecycle operations, including create/delete, external sharing restrictions, and role/ownership changes.

Collaborate with product owners and architects to introduce new automation use cases or extend existing workflows.

Conduct post-incident reviews (PIRs) for high-severity incidents, drive root cause analysis (RCA), and implement corrective actions.

Mentor L1 and L2 engineers, conduct knowledge-sharing sessions, and support onboarding of new team members.

Stay up to date with changes in Azure, Microsoft 365 APIs, and automation tooling (PowerShell modules, Bicep schema updates, etc.).

Provide guidance on architecture and best practices for automation reliability.

Required Skills & Experience:

12+ years of experience in cloud platform engineering, DevOps, or site reliability engineering (SRE) roles with a focus on automation and operational excellence.

Proficiency in PowerShell scripting, including writing reusable modules, automation logic, and error handling for production workloads.

Extensive experience with Infrastructure as Code using Bicep, including authoring, debugging, and deploying templates for complex Azure resources.

Strong understanding of CI/CD processes and YAML pipelines, with hands-on experience in automating build/release workflows in Azure DevOps.

Proficiency in .NET (C#), especially for debugging Azure Functions or working on backend components integrated into M365 automation flows.

In-depth knowledge of the Microsoft 365 platform, including API usage, Teams and SharePoint Online provisioning, governance, and permissions management.

Proven ability to troubleshoot and optimize Azure-native services such as API Management, Azure Functions, Storage, Service Bus, Key Vault, and Container Apps.