Manabie

Senior DevOps Engineer

  • Posted a month ago

Job Description

Responsibilities:

  • Design, develop, and maintain backend services using Golang, ensuring high code quality, performance, and maintainability.
  • Participate in system design and architectural discussions, including service boundaries, data flow, scalability, and reliability trade-offs.
  • Build, deploy, and operate applications on Google Kubernetes Engine (GKE):
      • Package and deploy applications using Helm charts
      • Configure resources, autoscaling, health checks, and rollout/rollback strategies
      • Troubleshoot production issues related to performance, stability, networking, and resource usage
  • Manage cloud infrastructure (GCP) using Terraform (Infrastructure as Code):
      • Create, maintain, and review Terraform modules
      • Ensure consistent and reliable environments across development, staging, and production
  • Improve system reliability, observability, and security:
      • Implement and use logging, metrics, tracing, and alerting
      • Participate in incident response, root cause analysis, and post-incident improvements
  • Collaborate closely with product, DevOps, and engineering teams to deliver secure, production-ready solutions.

Requirements and Qualifications:

  • 6+ years of experience in backend or platform engineering in production environments.
  • Solid understanding of distributed systems concepts such as scalability, reliability, retries, timeouts, and consistency.
  • Strong hands-on experience with GKE / Kubernetes, including:
      • Core Kubernetes resources (Deployments, Services, Ingress, ConfigMaps, Secrets)
      • Deploying and managing applications using Helm charts
      • Debugging and operating production workloads
  • Strong understanding of GCP core services, including IAM, VPC, Subnets, Cloud NAT, VPN, Load Balancing, Cloud DNS, Cloud Logging, Cloud Monitoring, and Cloud Run.
  • Practical experience with Terraform for infrastructure provisioning and management.
  • Experience with CI/CD pipelines and cloud-native application operations.
  • Strong proficiency in Golang, including:
      • Concurrency (goroutines, channels), context handling, and error management
      • Building and maintaining APIs (REST and/or gRPC)
      • Writing clean, testable, and maintainable code
  • Strong problem-solving skills and the ability to work with complex systems.
  • Good communication skills and a strong sense of ownership.

Optional (Nice to have):

  • Basic knowledge of AI systems and GenAI fundamentals, including AI agents, RAG architectures, and LLM-based services.
  • Familiarity with AI infrastructure concepts:
      • Model inference services
      • GPU-based workloads
      • Scaling, latency, and cost trade-offs
  • Experience with service mesh (e.g., Istio)
  • Familiarity with observability tools (Prometheus, Grafana, Cloud Monitoring)
  • Good understanding of cloud and application security, including:
      • IAM and access control (GCP IAM, Kubernetes RBAC)
      • Secrets management and secure configuration
      • Secure service-to-service communication (mTLS)
      • Container and Kubernetes security best practices

Job ID: 140984851