Staff Site Reliability Engineer
Americana Restaurants
Job Description
We are hiring a Senior Site Reliability / Platform Engineer to own reliability, performance, and scalability of large-scale, cloud-native ecommerce platforms.

Key Responsibilities
Own end-to-end reliability, availability, scalability & performance of production systems.
Define & govern SLOs, SLIs, and error budgets.
Lead 24/7 on-call, incident response, RCA & preventive actions.
Implement automation, self-healing, and resilient architecture patterns.
Architect and deliver secure, scalable cloud-native platforms.
Own capacity planning & performance forecasting.
Drive architecture reviews, chaos engineering & resilience improvements.
Maintain runbooks, IaC diagrams, SOPs & incident playbooks.
Lead best practices in CI/CD, release management & deployment automation.
Ensure strong cloud security posture & compliance.
Mentor engineers & promote operational excellence.

Tech Stack
Cloud: Azure
Containers: Kubernetes
Messaging/Data: Kafka, Aerospike, MongoDB Atlas, in-memory DBs
CI/CD: Jenkins, Azure DevOps, GitHub Actions, Argo CD
IaC: Terraform, ARM Templates, Ansible, Packer
Monitoring: New Relic, Prometheus, Grafana, Azure Monitor
Security: WAF, DDoS, Azure Front Door
Languages/OS: Linux, Python
Networking: DNS, NAT, Routing, Subnetting

Location & Work Mode
Mohali (Onsite)
5 Days Working
UAE-Based Organization