Job Description
Leadership & Team Management
Lead, mentor, and coach a team of 10 infrastructure engineers, fostering a culture of continuous improvement, high performance, and accountability.
Conduct regular one-on-one meetings, performance reviews, and career development planning for team members.
Manage team scheduling, resource allocation, and workflow to ensure continuous operations and high service availability.
Drive the adoption of best practices in operations, incident management, change control, and documentation.
Infrastructure Operations Management
Own and manage the full scope of infrastructure operations for client environments, spanning on-premises data centers and public cloud platforms (Azure and Alibaba Cloud).
Ensure stability, reliability, and security of all infrastructure services (e.g., networking, storage, virtualization, compute, databases).
Maintain a deep operational knowledge of these functions to guide the team and act as a final escalation point for complex technical issues.
Oversee the Incident Management process, driving timely resolution and effective post-incident reviews.
Communication & Stakeholder Engagement
Serve as the primary Technical Point of Contact for clients and internal stakeholders regarding daily infrastructure operations, technical issues, and system performance.
Provide clear, concise, and professional communication (written and verbal) in English for operational updates, incident reports, and technical recommendations.
Must have
5+ years of progressive experience in IT Infrastructure, with at least 2 years in a dedicated management or team lead capacity.
Excellent English communication skills (both written and verbal) are mandatory for frequent client interaction and technical documentation.
Good operational knowledge of enterprise-level infrastructure management, including compute, storage, networking, and virtualization.
Hands-on experience with, and a good understanding of, cloud platforms.
Demonstrated technical knowledge of core infrastructure operations functions (e.g., monitoring methodologies, backup/recovery, capacity planning) to effectively oversee the team's work.
Experience with ITIL processes, particularly Incident, Change, and Problem Management.
Nice to have
Relevant certifications (e.g., Azure, GCP, Alibaba Cloud).
Experience in a client-facing or Managed Service Provider (MSP) environment.
Experience with automation tools and concepts (e.g., scripting, Infrastructure as Code).