Applied AI Engineer | OCR, RAG, LLM Evaluation, Edge AI

Applied AI engineer for OCR, RAG, evaluation, and edge AI systems.

100+ document templates benchmarked | 7 peer-reviewed publications | Ph.D. Candidate, HKUST | 8+ years in applied AI | AWS Certified AI Practitioner | Hong Kong-based

I help teams evaluate, build, and harden AI systems that have to work under real constraints: private data, noisy documents, inconsistent labels, and non-technical end users. My strongest work sits at the boundary between experimentation and production: OCR/LLM evaluation, retrieval quality, regression harnesses, and edge-model feasibility.

Recent delivery includes replacing Azure Document OCR with a local pipeline across 100+ invoice and vendor templates, adding schema-drift checks before deployment, and advising embedded-camera vision choices across Tiny/MobileViT, INT8, and accelerator options. At HKUST, I led research products that became usable systems, from NarrativeHive and Orchid to HK-GenSpeech and SeaSense.

Serkan Kumyol

What I Bring

Evaluation before deployment

I design benchmarks, regression checks, and failure criteria so teams can compare models on evidence instead of intuition.

Private-data document AI

I build OCR, extraction, and retrieval workflows for messy documents where security, controllability, and auditability matter.

Research translated into product

I take ideas from papers and labs and turn them into interfaces, demos, and internal tools that stakeholders can actually use.

Selected Work

A hiring-manager cut of the work: recent delivery, systems that survived real user studies or evaluation gates, and projects that show how I operate under ambiguity, technical constraints, and stakeholder scrutiny.

Client Delivery

Document Intelligence + Evaluation | 2026

Private OCR/LLM Evaluation for Invoice Processing

Benchmarked 100+ invoice and vendor templates to replace a cloud OCR dependency with a local privacy-preserving pipeline, then delivered the regression and deployment recommendations used for vendor selection.

OCR, LLM Evaluation, vLLM, Ollama

Published Research Product

Generative Narrative Systems | 2025

Orchid

An LLM-driven narrative authoring system validated with 100+ users and later published at ACM Creativity and Cognition, showing my ability to turn research into a usable interactive system.

LLM Systems, User Validation, ChromaDB, React

Research Platform

Multi-Agent Systems + RAG | 2025

NarrativeHive

A multi-agent LLM social simulation with persistent memory and local-model support, built to explore coherence, retrieval, and agent behavior over long-running interactions.

Multi-Agent, RAG, Memory Systems, Next.js

Applied Research Prototype

Speech AI + Healthcare | 2025

HK-GenSpeech

A speech-based cognitive-screening prototype built with Whisper, PyTorch, torchaudio, ffmpeg, and React Native to turn generative scene prompts into clinically useful speech elicitation.

Whisper, PyTorch, torchaudio, ffmpeg

Experience

Feb 2026 – Present

Independent AI Consultant at 2084 Futures Limited

Edge-AI feasibility for embedded camera vision: compared Tiny/MobileViT + INT8 against Coral/Hailo/IMX500 under real-time surveillance constraints; delivered build-vs-buy guidance and deployment recommendations.

Nov 2025 – Jan 2026

AI Systems Engineer at dRoW Limited

Built private OCR/LLM evaluation pipelines with vLLM, Ollama, and MLflow to benchmark 100+ vendor/invoice templates without moving sensitive data off-prem. Added regression checks for schema drift and unstable outputs; delivered the evaluation report adopted for production vendor selection.

Sep 2017 – Oct 2025

Applied AI Researcher & Technical Lead at HKUST (HLTC/XRIM Labs)

Led applied AI R&D across narrative systems, multimodal interfaces, and speech tooling. Built NarrativeHive, Orchid, SeaSense, and HK-GenSpeech; mentored 30+ student engineers per semester and translated research goals into stakeholder-facing demos and shippable technical scopes.

Mar 2016 – Aug 2017

Software Engineer at Arskom Group

Migrated a monolithic satellite billing platform to a modular Python/Django architecture, improving billing accuracy by ~40%. Used Jenkins and pytest to reduce system error rates by ~75% and stabilize weekly releases.

Earlier Career Foundation

Quantitative Linguistics Analyst, METU

Built annotation pipelines and ANOVA-based statistical analysis for language research, with publication-ready reporting.

Freelance Web Developer

Delivered client websites end-to-end, building the product instinct and full-stack execution discipline that later carried into applied AI work.

Publications

7 peer-reviewed publications spanning NLP, multimodal systems, speech processing, and AI evaluation. These track the same arc as the projects: turning research problems into measured, reproducible results.

Morphological Segmentation and Bayesian Models for Turkish Language Processing

Kumyol, Serkan

2016

Modeling morpheme triplets with a three-level hierarchical Dirichlet process

Kumyol, Serkan and Can, Burcu

2016 International Conference on Asian Language Processing (IALP), 2016

Allomorphs and binary transitions reduce sparsity in Turkish semi-supervised morphological processing

Can, Burcu and Kumyol, Serkan and Bozsahin, Cem

Proceedings of the First Conference on Turkic Computational Linguistics (TurCLing), 2016

HK-GenSpeech: A Generative AI Scene Creation Framework for Speech Based Cognitive Assessment

Yong, Vi Jun Sean and Kumyol, Serkan and Low, Pau Le Lisa and Leung, Suk Wai Winnie and Braud, Tristan

Proceedings of the 2025 Conference of the International Speech Communication Association (INTERSPEECH 2025), 2025

Myokey: Surface electromyography and inertial motion sensing-based text entry in AR

Kwon, Young D and Shatilov, Kirill A and Lee, Lik-Hang and Kumyol, Serkan and Lam, Kit-Yung and Yau, Yui-Pan and Hui, Pan

2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), 2020

Efficient Bilingual Generalization from Neural Transduction Grammar Induction

Yan, Yuchen and Wu, Dekai and Kumyol, Serkan

Proceedings of the 16th International Conference on Spoken Language Translation, 2019

Orchid: A Creative Approach for Authoring LLM-Driven Interactive Narratives

Wu, Zhen and Kumyol, Serkan and Wong, Shing Yin and Hu, Xiaozhu and Tong, Xin and Braud, Tristan

Proceedings of the 2025 Conference on Creativity and Cognition, 2025

Technical Skills

Languages

  • Python
  • C++
  • JavaScript/TypeScript
  • SQL
  • Bash

ML/LLM

  • PyTorch
  • TensorFlow
  • DyNet
  • Hugging Face
  • scikit-learn
  • LangChain/LangGraph
  • RAG Workflows
  • vLLM
  • Ollama

Data & Web

  • React
  • Next.js
  • Node.js
  • PostgreSQL/MySQL
  • Vector Search

Infra

  • Docker
  • GitHub Actions
  • Jenkins
  • AWS (EC2, S3, Lambda, SageMaker)
  • GCP
  • Azure
  • Linux/Unix

Other Tools

  • MLflow
  • Kafka
  • Whisper
  • torchaudio
  • ffmpeg
  • Vision Transformers
  • INT8 Quantization
  • Jupyter
  • Benchmark design
  • Regression testing

Education

Training that crosses boundaries: machine learning, cognitive science, and education. This shaped how I build systems and how I explain them, whether to students, collaborators, or clients trying to make build-vs-buy decisions.

Ph.D. in Computer Science (Machine Translation and AI Systems)

Hong Kong University of Science and Technology (HKUST)

Research at the intersection of machine learning, narrative systems, multimodal interaction, and MLOps. Focus: LLMs, multi-agent systems, speech processing, and UI/UX for interactive AI. Concurrent applied AI project delivery throughout the PhD.

2017 – Expected Aug 2026

M.Sc. in Cognitive Science (Artificial Intelligence and NLP)

Middle East Technical University (METU)

Studied language, learning, and perception, with thesis work on unsupervised morphological learning in Turkish using probabilistic models (Hierarchical Dirichlet Process). Focus on Bayesian methods and NLP.

2012 – 2016

B.Sc. in Computer Systems & Education

Gazi University

Built a foundation in computing while training in how to teach technical material clearly, structurally, and at the right level for the audience.

2008 – 2012

Credentials & Leadership

AWS Certified AI Practitioner

Formal cloud-AI credential aligned with production deployment and model-infrastructure decision making.

2024

President, RoastMasters HK Toastmasters Club

Led a 50+ member club, organized evaluations and events, and sharpened stakeholder communication under live-feedback conditions.

2023

3rd Place, Best Speech Evaluation

External signal for clear, structured feedback and executive-facing communication.

Get in touch

Want to collaborate or talk AI?

I'm open to research collaborations, applied AI consulting, and conversations about evaluation, deployment, and the craft of building systems that work.