AI Engineer (AI Lab)

greennode

Ho Chi Minh, Vietnam

1-3 Years

Posted 3 hours ago

Job Description

We are looking for an AI Engineer to join our AI Lab team, focusing on building production-grade AI agents.

You will design and implement RAG pipelines, Agentic RAG architectures, tool-using agents, and MCP-based agent servers that can reason, plan, and execute tasks reliably in real-world systems.

This role is engineering-focused, not research-only.

Key Responsibilities

Agent & RAG Engineering (Core)

Design, build, and operate RAG pipelines: retrieval, reranking, grounding, citation
Implement Agentic RAG systems with multi-step reasoning, planning, and tool execution
Build tool-using agents (API, DB, search, internal services) with robust error handling
Design and implement MCP servers to expose tools and services to LLM agents
Develop multi-agent workflows (planner–executor, router, verifier, reviewer patterns)

System Integration & Reliability

Integrate agents with vector databases (FAISS, Qdrant, Milvus, Pinecone, etc.)
Define and measure agent KPIs: accuracy, retrieval quality, tool success rate, latency, stability, hallucination rate
Improve agent reliability via prompt design, tool schemas, guardrails, and retries
Support experiment tracking, evaluation pipelines, and internal dashboards (e.g. W&B)

Model Usage & Inference

Work with LLMs / SLMs for inference (OpenAI, Claude, Gemini, Qwen, LLaMA, Mistral, and other open-source LLMs)
Apply embedding models effectively for semantic search and grounding
Optimize inference cost, latency, and throughput

Requirements

Must-Have

Bachelor's degree in Computer Science, AI, or related fields, or equivalent experience
At least 1 year of hands-on experience building RAG pipelines or AI agents
Strong Python skills and experience building backend systems
Solid understanding of LLMs, embeddings, RAG, and agent architectures
Hands-on experience with agent frameworks (LangChain, LangGraph, LlamaIndex, Flowise, CrewAI, etc.)
Experience with vector databases and retrieval systems
Familiarity with tool calling / function calling in LLMs
Ability to think in terms of systems and failure modes (timeouts, retries, partial failures)
Comfortable reading technical documentation and designing APIs

Nice-to-Have

Experience designing or operating MCP servers or similar agent tool layers
Knowledge of agent orchestration patterns (ReAct, planner–executor, multi-agent)
Familiarity with AI Gateway / inference routing / auth (rate limit, quota, logging)
Experience with model serving (vLLM, TGI, Triton Inference Server)
Understanding of fine-tuning techniques (LoRA, DPO, distillation) and data prep
Exposure to enterprise AI concerns: security, access control, observability