KUANG-MING CHEN

AI Research Engineer @ TSMC

AI Research Engineer specializing in Post-Training & Agentic Workflows.


About Me

I am an AI Research Engineer at TSMC, specializing in Post-Training and Agentic Workflows. I hold a Master of Science in Electrical and Computer Engineering from the University of Washington, where I was a member of the Information Processing Lab (IPL) under the supervision of Prof. Jenq-Neng Hwang. I also hold a Bachelor of Science in Mechanical Engineering from National Taiwan University, where I was supervised by Prof. Hung-yi Lee.

My research and engineering work focuses on Large Language Models (LLMs), Vision-Language Models (VLMs), and intelligent agentic systems. I have a strong track record in model alignment, 3D visual grounding, and efficient language transfer, with multiple publications in top-tier venues such as ACL, CVPR, and IROS. Recently, I secured 1st place in the ICCV 2025 AI City Challenge (Track 3) hosted by NVIDIA.


Professional Experience

TSMC (Taiwan Semiconductor Manufacturing Company) Jan. 2026 – Present

AI Research Engineer (Post-Training & Agentic Workflows) | USA

  • Multi-Modal Document Understanding: Leading research on document intelligence for semiconductor manufacturing, focusing on structured and semi-structured documents. Fine-tuning vision-language models for robust parsing, layout reasoning, and cross-modal QA with 98% accuracy.
  • Enterprise Agent System: Architecting an agentic BI system utilizing tool-augmented LLMs for policy compliance checks, focusing on reliability, explainability, and policy-consistent reasoning.
  • Optimization: Utilizing speculative decoding and vLLM to reduce latency for real-time agent responses.

SportsBox AI Dec. 2024 – Jan. 2026

Machine Learning Engineer (LLM & 3D Vision) | USA

  • Golf Swing Analysis: Trained and optimized 2D/3D human pose estimation models for real-time motion analysis and professional coaching feedback. Product is live on iOS/Android supporting 100k+ users.
  • Agentic Feedback: Designed a multi-modal agent system (LLM + Motion LMs) to generate personalized coaching feedback based on 3D pose data.
  • Production Alignment: Post-trained LLMs via SFT to mimic specific coaching personas, ensuring feedback is technically accurate and encouraging.

ASUS Oct. 2023 – May 2024

AI Researcher Intern (LLM Efficiency) | Taiwan

  • InstructionCP: Authored a paper on a fast language-transfer method, allowing Llama models to acquire Chinese capabilities with only 0.1B tokens of data.
  • Benchmark Engineering: Established the internal standard for evaluating LLM reasoning in Traditional Chinese for the company’s 2024 model releases.

Academia Sinica Jul. 2023 – Sep. 2023

Research Intern (Knowledge Localization) | Taiwan

  • Model Privacy & Knowledge: Investigated “knowledge neurons” within Transformers to isolate and update specific facts without retraining.
  • Adapter Research: Designed a novel adapter architecture for temporal knowledge updates, improving model accuracy by 40% on daily-changing news data.

Selected Publications

  • Modeling LLM Agent Reviewer Dynamics in an Elo-Ranked Review System
    Hsiang-Wei Huang, Junbin Lu, Kuang-Ming Chen, Jenq-Neng Hwang. Submitted to ACL ARR 2026.

    Study of agent reviewer dynamics using Elo ratings and memory to improve decision accuracy in peer review systems.

  • FormCraft: A Three-Level Benchmarking Approach to Form Intelligence
    Jr-Jen Chen*, Kuang-Ming Chen*, et al. Submitted to CVPR 2026.

    A framework for evaluating MLLMs on form understanding across content modality, layout structure, and semantic relation. Collaboration with Google DeepMind.

  • TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action
    Jen-Hao Cheng, ..., Kuang-Ming Chen, et al. Submitted to COLM 2026.

    Two-stage training framework for fine-grained video temporal understanding using the curated VER dataset (1M instances).

  • 3D Visual Grounding with Reasoning LLMs
    Hsiang-Wei Huang*, Kuang-Ming Chen*, et al. Accepted by CVPR 2025, 3D-LLM Workshop.

    Proposes a 3D visual grounding pipeline using Llama-3.1-8B-Instruct to outperform SoTA zero-shot methods.

  • InstructionCP: A fast approach to transfer Large Language Models into target language
    Kuang-Ming Chen, Hung-yi Lee. Accepted by ACL 2025, SIGTYP Workshop.

    A data-efficient method for transferring LLMs to new language domains using minimal target-language datasets.

  • Chat Vector: A Simple Approach to Equip LLMs With New Language Chat Capabilities
    Shih-Cheng Huang, Pin-Zu Li, Kuang-Ming Chen, et al. Accepted by ACL 2024.

    A novel approach for acquiring conversational and RLHF capabilities in previously unexplored language domains.

  • Compressing Transformer-based self-supervised models for speech processing
    Tzu-Quan Lin, Kuang-Ming Chen, et al. Accepted by IEEE ASRU 2025.

    Methods for compressing speech SSL models, achieving up to 50% parameter reduction while maintaining performance.

Full list on Google Scholar »


Talks & Challenges

1st Place, ICCV 2025 AI City Challenge (Track 3) 2025

Warehouse Spatial Intelligence | Hosted by NVIDIA

Proposed a data-efficient LLM agent system with advanced spatial reasoning for complex QA in warehouse environments. Secured 1st place globally against 10+ international teams.


Teaching

Teaching Assistant National Taiwan University

  • Introduction to Generative AI: Helped 1400+ students fine-tune 7B LLMs for poem generation.
  • Programming Training Class: Taught fine-tuning LLMs (1.3B) with patent data using LoRA to non-specialists.
  • Machine Learning: Designed grading pipelines for 700+ students and created assignments for regression, classification, and RL.

Other Projects

Autonomous Data Analyst Agent [Code]

  • Developed a 14B Llama-based model that reasons over database metadata via deterministic Python execution.
  • Achieved 82% accuracy on TabMWP (vs. 20% for the base model).

6-axis Robot Arm - Battery Storage

Used TMRobot and OpenCV to automatically detect and replace electric motorcycle batteries.

Stock Price Prediction [Code]

Trained an LSTM model using TensorFlow to predict Taiwan stock prices with 90% accuracy.


Education

University of Washington Sep. 2024 – Dec. 2025

MS in Electrical and Computer Engineering | Washington, USA

Coursework: Deep Learning for Big Visual Data, LLMs from Transformers to ChatGPT, TinyML.

National Taiwan University Sep. 2019 – Jul. 2024

BS in Mechanical Engineering | Taiwan

Related coursework: Machine Learning (A+), DL for Human Language Processing (A), Applied DL (A).


Technical Skills

  • Post-Training & Alignment: GRPO, RLHF, PPO, DPO, SFT, Process Supervision, Constitutional AI.
  • Agentic Architectures: Multi-agent systems, Tool-use (Computer Use), MCP, RAG.
  • Languages & Engineering: Python (Expert), C++, PyTorch, DeepSpeed, vLLM, Docker, Kubernetes.
  • Software & Tools: MATLAB, AutoCAD, Inventor, LabVIEW, Arduino.