Role overview:
- You'll focus on ensuring the quality of AI data and Computer Vision / Vision-Language Models (VLMs) prior to their deployment in real-world products.
- The role is key to building a robust, accurate, and scalable data-model pipeline that supports AI and Robotics systems.
Responsibilities:
- Ensure the quality of multimodal datasets (image, video, text, captions, instructions) for VLM training and evaluation, collaborating with internal teams and external data vendors.
- Define and enforce data quality standards for VLMs, covering visual accuracy, textual correctness, vision-language alignment, semantic consistency, and bias/noise detection.
- Review and audit data pre- and post-annotation, identifying systemic issues (such as misaligned captions, hallucinated descriptions, weak prompts, and inconsistent labeling).
- Design evaluation datasets and test protocols for VLM use cases, including VQA, instruction following, multimodal reasoning, and human-robot interaction scenarios.
- Evaluate VLM performance through quantitative metrics and qualitative behavior analysis, focusing on correctness, consistency, robustness, latency, and stability in real deployments.
- Conduct error analysis and provide clear recommendations for data improvement, prompt refinement, model retraining, and release (go/no-go) decisions.
Qualifications:
- At least 3 years of experience, with a strong analytical mindset and high attention to data quality and model behavior.
- Experience working with large-scale datasets and structured evaluation workflows.
- Experience working with AI data pipelines, Computer Vision, VLMs, or multimodal datasets.
- Hands-on experience with annotation or review tools (e.g. Label Studio, CVAT, or custom review systems).
- Experience working with external data vendors or crowdsourced annotation teams.
- Familiarity with evaluating AI systems beyond raw metrics, focusing on semantic correctness and user-facing behavior.
- Solid understanding of how modern VLMs work (vision encoder + language model, alignment, prompting).
- Ability to collaborate with AI engineers, researchers, and data vendors.
- Ability to read and understand technical documentation and research papers in English.
Preferred Qualifications:
- Basic Python skills for data analysis/evaluation scripting.
- Understanding of MLOps / model lifecycle management.
- Exposure to robotics, embodied AI, or human-robot interaction use cases.
Personality/Attitude:
- Proactive, dedicated, business-oriented, responsible, and willing to learn.
- Good communication skills, creative problem-solving skills and attention to detail.