We are looking for an AI Engineer to join our AI Lab team and build production-grade AI agents.
You will design and implement RAG pipelines, Agentic RAG architectures, tool-using agents, and MCP-based agent servers that can reason, plan, and execute tasks reliably in real-world systems.
This role is engineering-focused, not research-only.
Key Responsibilities
Agent & RAG Engineering (Core)
- Design, build, and operate RAG pipelines: retrieval, reranking, grounding, citation
- Implement Agentic RAG systems with multi-step reasoning, planning, and tool execution
- Build tool-using agents (API, DB, search, internal services) with robust error handling
- Design and implement MCP servers that expose tools and services to LLM agents
- Develop multi-agent workflows (planner–executor, router, verifier, reviewer patterns)
System Integration & Reliability
- Integrate agents with vector databases (FAISS, Qdrant, Milvus, Pinecone, etc.)
- Define and measure agent KPIs: accuracy, retrieval quality, tool success rate, latency, stability, hallucination rate
- Improve agent reliability via prompt design, tool schemas, guardrails, and retries
- Support experiment tracking, evaluation pipelines, and internal dashboards (e.g. W&B)
Model Usage & Inference
- Work with LLMs / SLMs for inference (OpenAI, Claude, Gemini, Qwen, LLaMA, Mistral, and other open-source LLMs)
- Apply embedding models effectively for semantic search and grounding
- Optimize inference cost, latency, and throughput
Requirements
Must-Have
- Bachelor's degree in Computer Science, AI, or related fields, or equivalent experience
- At least 1 year of hands-on experience building RAG pipelines or AI agents
- Strong Python skills and experience building backend systems
- Solid understanding of LLMs, embeddings, RAG, and agent architectures
- Hands-on experience with agent frameworks (LangChain, LangGraph, LlamaIndex, Flowise, CrewAI, etc.)
- Experience with vector databases and retrieval systems
- Familiarity with tool calling / function calling in LLMs
- Ability to think in terms of systems and failure modes (timeouts, retries, partial failures)
- Comfortable reading technical documentation and designing APIs
Nice-to-Have
- Experience designing or operating MCP servers or similar agent tool layers
- Knowledge of agent orchestration patterns (ReAct, planner–executor, multi-agent)
- Familiarity with AI gateways, inference routing, and auth (rate limiting, quotas, logging)
- Experience with model serving (vLLM, TGI, Triton Inference Server)
- Understanding of fine-tuning techniques (LoRA, DPO, distillation) and data prep
- Exposure to enterprise AI concerns: security, access control, observability