Meng Foong

AI Engineer

I build multilingual AI systems for Southeast Asia.

I build AI systems that handle languages most models ignore. Fine-tuning Whisper for Malay-English-Mandarin code-switching is the kind of problem that can't be solved by calling an API; it takes hands-on ML work with real conversation data.

I've shipped 4 production systems in under 2 years: a speech intelligence platform, a WhatsApp customer service tool, a marketing automation SaaS, and a financial OCR pipeline. I'm most interested in the infrastructure side — self-hosting models, designing async processing pipelines, and making AI systems reliable at production scale.

I'm looking for AI engineering roles where I can work on harder problems with better engineers.

August 2025 — March 2026

AI Engineer → Technical Lead → Product Owner · MAMPU AI

  • Shipped production speech intelligence platform in 6 weeks with 3-person team; client signed enterprise contract on delivery
  • Built self-hosted STT + LLM pipeline (Whisper v3 Large, Qwen 2.5-13B) for multilingual audio → structured analytics
  • Fine-tuned Whisper using distillation for Malay-English-Mandarin code-switching — 5% WER improvement on internal eval set
  • Designed async processing with Redis job queues to handle 30-120s inference times without API timeouts
  • Built WhatsApp Business API platform (Ruby on Rails + Docker) for internal customer service operations
  • Promoted to own product roadmap post-launch, managing team of 7 through subsequent feature releases

May 2024 — May 2025

Full-Stack Developer Intern · Pilot Multimedia

  • Migrated legacy wholesale credit rating system to React.js and Node.js
  • Built financial report OCR pipeline using IBM Docling, achieving 81% data extraction accuracy
  • Developed text classification model using OpenAI embeddings for chart-of-accounts mapping across 100+ financial documents
  • Automated ETL workflows for 420+ client financial statements, reducing manual processing time by 67%

Speech Intelligence Platform (Featured)

Converts multilingual audio conversations (Malay, English, Mandarin) into structured analytics. Self-hosted Whisper v3 + open-source LLM pipeline — no third-party API dependency.

Shipped in 6 weeks, client signed enterprise contract. Fine-tuned Whisper with distillation for code-switching languages. Async Redis queue architecture handles 30-120s inference per item.

React · Express.js · Whisper v3 · Qwen 2.5-13B · Redis · Supabase

AI Marketing SaaS

Marketing system that auto-generates ad copy from market trend analysis, with performance-based auto-publish and auto-pause logic.

Automated 100+ ad campaigns via the Facebook Graph API, with a 25% improvement in ad engagement and a 30% reduction in wasted ad spend.

React · OpenAI · Facebook API
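The performance-based auto-pause logic above can be sketched as threshold rules over per-campaign stats. This is a minimal sketch: the metric names and the `min_clicks` / `max_cpa` thresholds are illustrative placeholders, not the production values.

```python
from dataclasses import dataclass

@dataclass
class CampaignStats:
    campaign_id: str
    spend: float       # total ad spend so far
    clicks: int
    conversions: int

def decide_action(stats: CampaignStats,
                  min_clicks: int = 100,
                  max_cpa: float = 15.0) -> str:
    """Return 'keep' or 'pause' based on cost per acquisition (CPA).

    Thresholds here are hypothetical examples, not production tuning.
    """
    if stats.clicks < min_clicks:
        return "keep"   # not enough data to judge yet
    if stats.conversions == 0:
        return "pause"  # meaningful spend, zero conversions
    cpa = stats.spend / stats.conversions
    return "pause" if cpa > max_cpa else "keep"
```

A scheduler would run this over each active campaign's Graph API insights and call the pause endpoint for any campaign that trips the rule.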

WhatsApp Customer Service Platform

Internal platform enabling multi-agent customer service through a single WhatsApp Business account with conversation routing.

10+ staff managing 300+ conversations/month through one shared account. Containerized Rails backend reduced deployment time by 60%.

Ruby on Rails · Docker · WhatsApp API

Financial OCR Pipeline

Document extraction system that reads financial statements via IBM Docling OCR and classifies line items to chart-of-accounts using OpenAI embeddings.

81% extraction accuracy across 420+ financial statements. Automated ETL reduced manual processing time by 67%.

Python · IBM Docling · OpenAI · Pandas
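The embedding-based chart-of-accounts mapping reduces to nearest-neighbor search over embedded account labels. A minimal sketch, with a toy bag-of-words stub standing in for the real embedding API; `VOCAB` and the `ACCOUNTS` entries are hypothetical, not the actual chart of accounts.

```python
import math

# Stub "embedding": a bag-of-words vector over a tiny vocabulary.
# In production this would be an embedding API call instead.
VOCAB = ["rent", "office", "salary", "wages", "interest", "loan"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical chart-of-accounts labels with representative descriptions.
ACCOUNTS = {
    "Rent Expense": "office rent",
    "Payroll": "salary wages",
    "Finance Costs": "loan interest",
}

def classify(line_item: str) -> str:
    """Map an OCR'd statement line item to the most similar account."""
    vec = embed(line_item)
    return max(ACCOUNTS, key=lambda acct: cosine(vec, embed(ACCOUNTS[acct])))
```

Swapping the stub for real embeddings keeps the same shape: embed each account description once, embed each incoming line item, take the argmax over cosine similarity.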
Case Study

Speech Intelligence Platform

Architecture decisions, tradeoffs, and what I learned shipping a multilingual STT + LLM pipeline in 6 weeks

Details generalized to respect client confidentiality.

The Problem

An enterprise client needed to extract actionable insights from hundreds of audio conversations monthly. Manual review was expensive, inconsistent, and couldn't scale. The system needed to handle multilingual conversations (Malay, English, Mandarin code-switching) with high accuracy — a challenge most off-the-shelf STT solutions fail at.

  • Time to Ship: 6 weeks (concept to enterprise contract)
  • Models Deployed: 2 (self-hosted Whisper v3 + Qwen 2.5-13B)
  • STT Improvement: +5% WER reduction after fine-tuning
  • Team: 3 engineers, end-to-end delivery
  • Languages: 3 (Malay, English, Mandarin code-switching)
1. Speech-to-Text Pipeline

I chose to self-host Whisper v3 Large over using a third-party API — this gave us cost control at scale, data privacy compliance, and the ability to fine-tune. I applied distillation methods to improve transcription accuracy by 5% for multilingual code-switching conversations.
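The 5% figure is a word error rate (WER) delta on an internal eval set. WER is word-level edit distance divided by reference length; a minimal sketch of the metric:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = min edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

For a code-switched pair like `wer("saya nak order", "saya mahu order")`, one substitution over three reference words gives 1/3; the reported improvement is the drop in this number averaged over the eval set.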

2. LLM Analytics Engine

I selected and deployed a self-hosted open-source LLM for structured analytics extraction — converting raw transcripts into structured JSON outputs including sentiment, key topics, and action items. Self-hosting eliminated per-token API costs at production scale.
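Getting structured JSON reliably out of an LLM means validating the raw output and failing loudly so the job queue can retry. A sketch of that validation step with a stubbed model response; the schema keys here are assumptions for illustration, not the production schema.

```python
import json

# Hypothetical prompt template for the analytics extraction step.
PROMPT = """Extract from the transcript below, as JSON with keys
"sentiment" (positive|neutral|negative), "topics" (list of strings),
and "action_items" (list of strings).

Transcript:
{transcript}"""

REQUIRED_KEYS = {"sentiment", "topics", "action_items"}

def parse_analytics(raw: str) -> dict:
    """Validate raw model output into the expected schema.

    Raises on malformed responses so the surrounding job queue
    can retry rather than persist garbage.
    """
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if data["sentiment"] not in {"positive", "neutral", "negative"}:
        raise ValueError("bad sentiment value")
    return data

# Stub standing in for the self-hosted LLM call:
fake_response = ('{"sentiment": "negative", "topics": ["billing"], '
                 '"action_items": ["call back customer"]}')
result = parse_analytics(fake_response)
```

Failing fast at the parse step is what makes retries cheap: a bad generation costs one more inference pass, not a corrupted dashboard.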

3. Full-Stack System Design

I designed the end-to-end architecture: React frontend for real-time dashboards, Express.js API layer, Redis for async job queues (critical since inference takes 30-120s per item), and Supabase for auth and persistence. Deployed on Railway with CI/CD via GitHub Actions.
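The async pattern can be sketched in-process. In production the queue and status store live in Redis; here `queue.Queue` and a dict stand in to show the same submit-then-poll shape, with the 30-120s inference step stubbed out.

```python
import queue
import threading
import uuid

# Stand-ins for a Redis list (jobs) and status/result hashes.
jobs = queue.Queue()
status: dict[str, str] = {}
results: dict[str, str] = {}

def submit(audio_path: str) -> str:
    """API handler: enqueue and return a job id immediately.

    The HTTP request never waits on inference, so no timeout.
    """
    job_id = str(uuid.uuid4())
    status[job_id] = "queued"
    jobs.put((job_id, audio_path))
    return job_id

def worker() -> None:
    """Background worker: pulls jobs and runs the slow inference step."""
    while True:
        job_id, audio_path = jobs.get()
        status[job_id] = "processing"
        # Stand-in for the 30-120s STT + LLM pipeline:
        results[job_id] = f"transcript of {audio_path}"
        status[job_id] = "done"
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()
job = submit("call_001.wav")
jobs.join()  # a real client would poll GET /jobs/{id} instead
```

Redis adds what the in-process version lacks: persistence across restarts, multiple worker processes, and retry semantics when a worker dies mid-job.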

Key Technical Decisions

I chose self-hosted STT over third-party APIs

Cost predictability at scale, data privacy requirements, and the ability to fine-tune for underserved multilingual use cases

I chose an open-source LLM over GPT-4

Eliminated per-token costs, gave full control over prompt engineering and potential fine-tuning, and met data residency requirements

I designed async processing with Redis queues

AI inference takes 30-120s per item — synchronous processing would cause API timeouts. Queues enabled batch processing and retry logic

I selected Supabase over custom PostgreSQL

Built-in auth, real-time subscriptions for live dashboards, and row-level security — reduced backend development by ~2 weeks on a tight deadline

What I Learned

  • Fine-tuning STT models for code-switching languages requires careful dataset curation — I found that synthetic data performed poorly compared to real conversation samples, which shifted my approach mid-project.
  • A 6-week deadline forced ruthless prioritization. I learned to ship the core AI pipeline first and iterate on the UI — perfect is the enemy of shipped.
  • Self-hosting AI models is cost-effective but operationally complex. I built monitoring for inference latency and GPU memory early, which saved us from production incidents later.

Languages

Python · JavaScript · TypeScript · SQL

AI / ML

LLM Integration · RAG · STT · Fine-Tuning · OCR · Embeddings

Frontend

React · Next.js · TanStack Query

Backend

Node.js · Express.js · Ruby on Rails · REST APIs

AI Models

Whisper v3 Large · Qwen 2.5-13B · OpenAI API

DevOps & Tools

Docker · Redis · Railway · Supabase · GitHub Actions · CI/CD · Sentry · Git

Looking for AI engineering roles where I can work on multilingual NLP, speech systems, or LLM infrastructure. Open to Malaysia, Singapore, or remote.

Say Hello