Job Purpose:
The DevOps Team Lead sits at the intersection of technical expertise, operational reliability, and project delivery. This role is responsible for leading a team of Systems/Platform engineers to design, implement, and maintain secure, scalable, and highly available infrastructure across AWS, Azure, Google Cloud, and on-premises environments.
The position owns the end-to-end application delivery platform (CI/CD, Kubernetes, GitLab, ArgoCD, Helm), observability stack, and continuous ISO/IEC 27001 compliance within the team, ensuring timely delivery of high-quality infrastructure services that support business objectives.
Key Responsibilities
Infrastructure & IaC Management
- Lead the design, implementation, and maintenance of infrastructure across AWS, Azure, Google Cloud, and on-premises servers.
- Champion Infrastructure as Code (IaC) practices using tools such as Terraform, Terragrunt, CloudFormation, or equivalent to provision, configure, and manage infrastructure in a repeatable and auditable way.
- Ensure environments are standardized, secure, cost-optimized, and aligned with architecture and security guidelines.
Application Delivery & Platform Engineering
- Own and evolve the application delivery platform using GitLab CI, ArgoCD, Helm charts, and Kubernetes.
- Design and maintain CI/CD pipelines to support reliable, frequent, and automated application deployments across environments.
- Establish best practices and guardrails for Kubernetes cluster configuration, namespace management, Helm chart management, and deployment strategies (e.g., blue/green, canary).
- Collaborate closely with development teams to ensure smooth, predictable, and observable releases.
Monitoring, Logging & Alerting
- Lead the design, implementation, and continuous improvement of the observability stack, including Prometheus, Thanos, Alertmanager, Grafana, Kibana, and Elasticsearch.
- Define and maintain monitoring standards, SLOs/SLIs, dashboards, and alerting rules to ensure early detection and rapid resolution of incidents.
- Ensure logs, metrics, and traces are consistently collected, stored, and accessible for troubleshooting, performance tuning, and capacity planning.
Compliance & Information Security (ISO/IEC 27001)
- Lead the implementation, documentation, and continuous maintenance of the ISO/IEC 27001 Information Security Management System (ISMS) within the team.
- Ensure infrastructure, platforms, and operational processes adhere to information security policies, controls, and audit requirements.
- Collaborate with Information Security, Risk, and Compliance stakeholders to support audits, risk assessments, and corrective actions.
- Promote a culture of security and compliance awareness within the team and across collaborating functions.
Team Leadership & People Management
- Lead, mentor, and develop a team of Systems/Platform engineers; provide regular feedback, support career growth, and foster a high-performance culture.
- Plan and prioritize team workload, ensuring timely delivery of projects, BAU tasks, and incident resolution.
- Promote knowledge sharing, documentation, and cross-training to reduce single points of failure.
Collaboration
- Work closely with software development, security, network, and service desk teams to ensure infrastructure and platforms meet business and operational requirements.
- Translate business needs into technical solutions, set expectations, and communicate clearly on progress, risks, and timelines.
- Participate in architecture and design discussions, contributing infrastructure and operations perspectives.
Reliability, Incident & Problem Management
- Oversee incident response, including triage, communication, and coordination with relevant teams to minimize downtime and impact.
- Drive root cause analysis (RCA) and implement corrective and preventive actions for recurring issues.
- Continuously improve operational processes, runbooks, and standard operating procedures.
Skills & Qualifications:
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field; an advanced degree is a plus.
- 5+ years of hands-on experience in systems, platform, or infrastructure engineering, with at least 2 years in a technical leadership or team lead role.
- Strong communication skills in English, both written and verbal, with the ability to explain complex technical topics to non-technical stakeholders.
- Demonstrated ability to provide high-quality customer service, manage expectations, and build strong relationships with internal stakeholders.
- Proven experience leading and mentoring technical teams.
Knowledge & Experience:
- Deep expertise in managing and configuring public cloud environments (AWS required; Azure and Google Cloud strongly preferred).
- Strong experience with Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or equivalent.
- Proven experience designing and maintaining CI/CD pipelines, ideally with GitLab CI; familiarity with other CI tools is a plus.
- Hands-on experience with Kubernetes, ArgoCD, and Helm charts for application deployment and configuration management.
- Solid understanding of networking concepts within cloud and containerized environments (VPCs, subnets, security groups, ingress/egress, load balancers).
- Strong background in Linux administration, system hardening, patch management, and performance optimization.
- Practical experience with observability stacks: Prometheus, Thanos, Alertmanager, Grafana, Kibana, and Elasticsearch (or equivalent tools).
- Proven experience implementing, operating, or maintaining ISO/IEC 27001 controls and processes within an organization.
- Experience with configuration management/automation tools (e.g., Ansible, Rancher, or equivalent).
- Relevant cloud certifications (e.g., AWS Certified Solutions Architect, Azure Administrator, Google Professional Cloud Architect) are an advantage.
(*) BONUSES & REWARDS
Competitive Salary
13th Month Salary & Performance Bonus
Employee of the Year Award
(*) TRAINING & DEVELOPMENT
In-house & Overseas Training
Full reimbursement for international technical certifications
Global career opportunities
(*) ANNUAL PAID LEAVES
Vacation Leave: 14 days per year
Medical Leave: 6 days per year
1 extra seniority day for every 3 years of service
(*) HEALTHCARE
Annual Routine Check-up
Premium Healthcare Insurance (Generali)
Comprehensive Insurance
(*) WELLNESS AND LEISURE ACTIVITIES
Annual Team Building
Soccer & Badminton Club and Sports activities
Entertainment activities: Music band, Karaoke & PlayStation time
Celebrations of special events: Birthdays, Christmas, New Year/Year-end Party
(*) PERKS
Fruit Days Twice a Month
Unlimited snacks & beverages