Production Platform Operations Lead

21 April 2022




Role purpose: Lead the team responsible for running and operating the staging and production environments where our Software Engineering Squads deploy their workloads. The role combines people management with system operation: it is responsible for keeping the environments at optimal levels of availability while maintaining strong team cohesion. It deals directly with incidents, vulnerabilities, capacity planning and disaster recovery, and is responsible only for the environments, not for the applications that the squads deploy into them.


Key accountabilities and decision ownership
• Define team focus and priorities
• Keep stakeholders updated on ongoing issues and incidents, and own root cause analysis documentation
• Report on system availability, stability and defects
• Resolve team impediments by interacting with external parties and escalating where needed
• Control costs and budgets for the production and staging platform
• Ensure that both production and staging comply with the cybersecurity posture
• Own contracts and vendors related to the staging and production environments
• Perform technical activities such as troubleshooting, leading by example


Core competencies, knowledge and experience
• Understanding of the software development life cycle (SDLC)
• Ability to influence through reasoning
• At least 3 years of proven experience in a similar or equivalent role
• Experience deploying and/or operating software in production
• Knowledge of client-server architecture


Must have skills / professional qualifications
• Networking essentials (IP, DNS, TCP/UDP)
• Web architecture
• Leadership and organizational skills
• Outstanding communication skills
• Problem-solving aptitude
• Fluent English (reading and listening)


Desired technical skills
• AWS Cloud
• Containers and Kubernetes
• Infrastructure as Code (with Terraform)
• Amazon Linux
• Microservices architecture


Number of direct reports: 6


Key performance indicators
• Platform availability
• Security compliance of owned environments
• Mean time to recover (MTTR) and mean time to detect (MTTD) platform incidents, including root cause analysis documentation




Location: Maputo

