About the Role
Manage and optimize enterprise multi-region Kubernetes infrastructure (50+ microservices) across Asia, US, Canada, EU using AWS Transit Gateway, Cloud Map, and GitHub Actions.
Required Skills
AWS
- Multi-account Organizations, Transit Gateway (multi-region peering, route tables)
- AWS Cloud Map for cross-cluster service discovery
- EKS multi-cluster management (4+ regions)
- VPC networking, IAM with OIDC
Kubernetes
- Multi-cluster EKS (5+ clusters, 27 nodes, 368 cores)
- Cross-cluster networking via Transit Gateway + Cloud Map
- Helm, service mesh, AWS Load Balancer Controller
- Resource optimization and scaling
Infrastructure as Code
- Terraform (modular architecture, state management)
- ArgoCD for GitOps
- GitHub Actions CI/CD pipelines with AWS OIDC
- Reusable workflows for multi-region deployment
Monitoring
- Prometheus, Grafana, Loki, VictoriaMetrics
- Multi-cluster observability and alerting
Key Responsibilities
- Design Transit Gateway architecture with prod/non-prod isolation
- Implement AWS Cloud Map for cross-cluster service discovery
- Manage 5 EKS clusters across multiple regions
- Build GitHub Actions pipelines for infrastructure/app deployment
- Deploy monitoring stack and optimize cluster performance
- Create reusable Terraform modules and Helm charts
- Finalize cloud migration and decommission legacy infrastructure
- Optimize AWS costs and resource utilization post-migration
Requirements
- 5+ years DevOps/SRE experience
- 3+ years production multi-cluster Kubernetes
- 3+ years AWS architecture
- 2+ years GitHub Actions
- Experience with Transit Gateway, AWS Cloud Map
- Scripting: Bash, Python
Tech Stack
- AWS: Transit Gateway, Cloud Map, EKS, VPC
- K8s: 5 clusters (Asia, US-Central, US-West, Canada, EU)
- CI/CD: GitHub Actions, ArgoCD
- Monitoring: Prometheus, Grafana, Loki, Rancher
- IaC: Terraform, Helm
Nice to Have
- AWS certifications (Solutions Architect, DevOps Professional)
- Service mesh (Istio/Linkerd)
- AI/ML workloads (GPU, Kubeflow, Ray)
- Database migrations
- Container security (Trivy, Snyk)
- Multi-cloud migration experience
What We Offer
- Competitive salary
- Flexible work arrangements, including hybrid options
- 13th-month bonus in accordance with company policy
- Comprehensive health, dental, and vision insurance for the employee and one dependent
- Opportunity to shape the future of AI technology
- Collaborative and innovative work environment
Location: District 1-HCM
Working model: Hybrid
Type: Full-time