Description

  • Hands-on experience with at least one major public cloud platform (Azure, AWS, or GCP).
  • Strong understanding of cloud infrastructure and application runtime components, including compute, storage, networking, IAM, and their impact on operational stability and application performance.
  • Proven experience in operating, supporting, and stabilizing cloud-hosted applications, with a primary focus on availability, reliability, performance tuning, and incident management.
  • Solid understanding and practical application of Site Reliability Engineering (SRE) principles, including SLIs, SLOs, error budgets, proactive monitoring, alerting, and post-incident problem management.
  • Strong DevOps experience, including designing, building, and supporting CI/CD pipelines for application deployment, configuration management, and environment consistency.
  • Hands-on experience with DevOps tools and pipelines for build, release, and deployment automation, along with troubleshooting pipeline failures and improving deployment reliability.
  • Experience using Infrastructure as Code (IaC), primarily Terraform, for provisioning, maintaining, and supporting cloud environments, including drift detection and remediation.
  • Knowledge of high availability, disaster recovery, backup, and restore strategies that ensure resilience and minimal downtime for customer applications.
  • Experience with monitoring, logging, and observability solutions, and ability to perform root cause analysis across application, platform, and infrastructure layers.
  • Good understanding of cloud security fundamentals, governance controls, and access management, ensuring secure and compliant operations in line with enterprise standards.

What You’ll Need To Succeed (Required Skills)

  • 8-12 years of hands-on experience in SRE, DevOps, or production engineering roles.
  • Experience building and maintaining CI/CD pipelines using Azure DevOps, including environment setup and secure handling of deployment artifacts.
  • Proficiency in cloud-native platforms and in Infrastructure as Code (IaC) concepts and tools.
  • Strong background in AWS (or GCP/Azure), containerization (Docker), and container orchestration (Kubernetes).
  • Proficiency in at least one programming language (Python, Go, or similar) for automation and tooling.