Senior AI Engineer

Qualgo Technologies Vietnam

Ho Chi Minh, Vietnam

5-7 Years

Job Description

Company Overview:

Qualgo is an R&D center specializing in cybersecurity products and solutions. We are on a mission to build a trusted cyberspace where individuals and businesses can thrive with confidence.

Job Summary:

As a Senior AI Engineer on our product development team, you will bridge the gap between data science and production software. You will be responsible for taking state-of-the-art NLP/LLM models and deploying them into high-performance, real-time environments.
Because our product intercepts and analyzes communications, you will face unique engineering challenges around latency, scale, and privacy. You will not just be calling third-party LLM APIs. You will architect how our AI operates safely, securely, and instantly, with a strong emphasis on edge computing and on-device ML to help preserve user privacy.
This role is ideal for someone who is strong in AI deployment and systems engineering, and who can work closely with Data Scientists to bring advanced models into real-world products.

Key Responsibilities:

Deploy, optimize, and integrate NLP/LLM models into high-performance, real-time production environments.
Build and maintain AI inference pipelines for low-latency, privacy-sensitive communication analysis use cases.
Design and implement AI systems that can run efficiently across cloud, edge, and on-device environments.
Optimize model performance for production through quantization, pruning, distillation, graph optimization, and runtime tuning.
Convert, benchmark, and deploy models using production-oriented inference frameworks and toolchains.
Work closely with Data Scientists to take models from experimentation to robust production deployment.
Evaluate architecture tradeoffs across latency, privacy, model quality, cost, and hardware constraints.
Design reliable testing, profiling, monitoring, and observability workflows for AI inference systems.
Contribute to system design decisions for privacy-preserving AI, hybrid on-device/cloud inference, and model lifecycle management.
Collaborate with backend, product, and app engineering teams to integrate AI capabilities into end-user experiences.

Qualifications:

Education: Bachelor's degree, Master's degree, or Ph.D. in Computer Science, Artificial Intelligence, Machine Learning, Electrical Engineering, or a related field.
5+ years of experience in AI Engineering, Applied Machine Learning, ML Systems, or a similar role.
Strong programming skills in Python and C++.
Strong experience deploying machine learning models into production environments.
Experience with edge computing, embedded AI, or on-device ML.
Hands-on experience with model optimization techniques such as quantization, pruning, distillation, or hardware-aware optimization.
Experience with production inference frameworks and runtimes such as ONNX Runtime, TensorRT, TFLite, Core ML, or similar technologies.
Strong understanding of performance optimization for inference systems, including latency, throughput, memory footprint, and hardware utilization.
Experience building reliable, scalable AI services or runtime components for production systems.
Ability to work closely with Data Scientists and software engineers to productionize advanced NLP/LLM models.

Nice to have:

Experience with platforms such as NVIDIA Jetson, Google Coral, or similar edge AI environments.
Experience deploying AI models on iOS or Android devices.
Familiarity with mobile development languages such as Swift, Objective-C, Kotlin, or Java.
Experience in privacy-sensitive domains such as trust & safety, fraud detection, cybersecurity, or communication intelligence.
Exposure to multilingual NLP systems or small-model deployment for constrained environments.

What we offer: