
vsol vn solutions

Site Reliability Engineer (Senior/Lead)

  • Posted 2 days ago

Job Description

VSOL provides top-notch services while strictly adhering to international standards. We remain in the public eye as experts in the next big technologies.

I. Senior SRE

Requirements: 

• Over 4 years of experience with cloud environments and containerization technologies, including designing and implementing scalable, resilient infrastructure solutions using cloud platforms and Kubernetes.

• Experience with monitoring and logging tools such as ELK Stack. 

• Demonstrated excellence in network management, advanced troubleshooting, and system optimization, with a focus on enhancing efficiency and reducing downtime. 

• Solid background in IT, with advanced expertise in network engineering and system administration.

• Familiarity with site reliability practices; any experience with GIS platforms is a plus.

• Strong skills in scripting and automation; proficiency with Python and Bash is a big plus.

• Good knowledge of GitOps tools (e.g., Argo CD, FluxCD).

• Knowledge of security frameworks and compliance standards.

Qualifications: 

• Bachelor's degree in Computer Science, Information Technology, or a related field. 

• Cisco Certified Network Associate (CCNA) is a plus. 

• Certified Kubernetes Administrator (CKA) is a plus. 

• Written and spoken English communication skills at CEFR B1 level or above. 

II. SRE Lead

Key Skills and Competencies

• 6+ years of experience in SRE, DevOps, or technical operations roles, with at least 2 years in a lead or senior individual contributor capacity managing 24/7 production environments.

• Proven experience as an Incident Manager or incident command lead for major production incidents — including RCA facilitation, stakeholder communication, and post-incident review processes.

• Strong hands-on experience with Linux/Unix system administration, networking fundamentals (TCP/IP, DNS, firewalls, routing, load balancing), and hybrid cloud/on-premises environments.

• Observability and monitoring: deep experience building and operating stacks using Prometheus, Grafana, ELK Stack, Datadog, or equivalent tools.

• Scripting and automation expertise in Python and Bash (essential); Go is a plus.

• Infrastructure as Code: proficiency with Terraform and Ansible or equivalent tools.

• Container orchestration: strong knowledge of Kubernetes (CKA certification preferred) and Docker.

• Cloud platforms: GCP (preferred), AWS, or Azure; experience with hybrid on-premises and cloud environments.

• Good knowledge of GitOps tools (e.g., Argo CD, FluxCD).

• Basic understanding of AI agents, LLM-based automation (e.g., LangChain, AutoGen, or equivalent frameworks), and AIOps tooling for anomaly detection and intelligent alerting; hands-on experience is a strong differentiator.

• Solid understanding of ITIL v4 practices: incident management, problem management, change management, and continual service improvement; ITIL v4 Foundation certification is required; Managing Professional (MP) level is a strong plus.

• Client-facing communication: able to present service health reports, SLA performance, and technical updates clearly to stakeholders; English proficiency at CEFR B2 or above.

• Ability to guide and mentor team members technically; comfortable making operational decisions and providing direction in high-pressure incident situations.

• Knowledge of security frameworks and compliance standards relevant to managed services environments.

Qualifications

• Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field (or equivalent experience).

• ITIL v4 Foundation certification — required; Managing Professional (MP) or Strategic Leader (SL) track is a strong plus.

• Certified Kubernetes Administrator (CKA) — preferred.

• AWS / GCP / Azure professional certification — a plus.

• Cisco Certified Network Associate (CCNA) — a plus.

• Written and spoken English communication skills at CEFR B2 level or above.

Location: District 2

Note: This position may require international travel or onsite engagement in the UAE (United Arab Emirates) and KSA (Kingdom of Saudi Arabia) for continuous periods of 3 to 6 months. Candidates must accept this requirement as a condition of the position.

Job ID: 146446589
