AI Engineer & Data Scientist · Indonesia

Muhammad Nuril Huda

Building practical AI systems from data, models, and real-world workflows.

I turn messy data and manual workflows into AI products that people actually use. My evaluation background means I test model behavior before I trust it.

  • M.Cs. Computer Science, UGM
  • Founder, Aureum
  • LLM Evaluation · Alignerr & Outlier
  • Python · FastAPI · PySpark · Docker · AWS · GCP

About

Ambiguous problems in. Working systems out.

I work where AI engineering, data science, and product execution meet. Give me an operational problem that arrives half-formed, and I'll turn it into a structured system: define the workflow, reason about the data, evaluate how the model behaves, then ship something people can actually use.

I've worked on LLM evaluation, AI products, forecasting pipelines, backend RAG applications, and medical computer vision research. The common thread: I care whether the system actually helps someone decide or act, not whether the demo looks good.

What I build

Four kinds of systems.

LLM & GenAI Systems

Practical AI workflows using RAG, prompt orchestration, structured outputs, and model evaluation patterns that prioritize correctness and usefulness.

AI Product Engineering

Backend-driven AI applications with Python, FastAPI, Docker, relational workflows, caching, and cloud deployment.

Data Science & Forecasting

Data pipelines, analytics workflows, and forecasting systems using Python, SQL, PySpark, SARIMA/LSTM, and stakeholder-ready reporting.

Computer Vision & Medical AI

Research and project experience in thermal imaging, deep learning, image enhancement, face/deepfake detection, and video intelligence pipelines.

Featured work

Selected systems & projects.

AI products, LLM evaluation, data pipelines, and applied computer vision.

Founder · AI Product

Aureum — AI Financial Copilot

I founded and built Aureum, an AI financial copilot that lets users manage personal finances and weigh investment decisions through plain conversation. I designed the AI orchestration, financial analytics, market intelligence pipelines, and the underlying cloud infrastructure, keeping recommendations transparent and the product non-custodial.

  • AI orchestration
  • Financial analytics
  • Market intelligence
  • Cloud infrastructure

GenAI · Computer Vision · Media

Ayclip — AI Video Clipping Pipeline

AI-powered video clipping architecture that turns long-form video into short-form clips: YOLOv11 + MediaPipe + ByteTrack for face tracking and 9:16 auto-reframing, Claude-based semantic highlight detection, Whisper speech-to-text with word-level subtitle synchronization, and asynchronous processing on Docker and AWS ECS/S3.

  • Python
  • FastAPI
  • Celery
  • YOLOv11
  • MediaPipe
  • ByteTrack
  • Whisper
  • FFmpeg
  • Docker
  • AWS ECS/S3
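
To illustrate the 9:16 auto-reframing step, here is a minimal sketch, not Ayclip's actual code. It assumes a tracker already yields one horizontal face center per frame, and that a full-height crop window is slid sideways to follow it. The function names, the smoothing factor `alpha`, and the sample coordinates are all hypothetical.

```python
def smooth_centers(centers, alpha=0.2):
    """Exponentially smooth per-frame face centers to reduce crop jitter."""
    smoothed = []
    previous = None
    for center in centers:
        # The first frame has nothing to blend with, so take it as-is.
        previous = center if previous is None else alpha * center + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


def reframe_crop(frame_w, frame_h, face_cx, aspect_w=9, aspect_h=16):
    """Return (x, y, w, h) of a full-height 9:16 crop centered on face_cx."""
    # Width of a full-height window at the target aspect ratio,
    # capped at the frame width for sources that are already narrow.
    crop_w = min(frame_w, frame_h * aspect_w // aspect_h)
    # Center on the face, then clamp so the window stays inside the frame.
    x = int(face_cx) - crop_w // 2
    x = max(0, min(x, frame_w - crop_w))
    return x, 0, crop_w, frame_h


if __name__ == "__main__":
    # Hypothetical face centers from a tracker on a 1920x1080 source.
    for cx in smooth_centers([960, 1000, 1400, 1900]):
        print(reframe_crop(1920, 1080, cx))
```

The resulting rectangle could be handed to a cropping tool such as FFmpeg. Smoothing the centers first keeps the window from jumping when detections flicker between frames.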

LLM Evaluation

Alignerr — LLM Evaluation & Code Review

I evaluate AI-generated STEM, coding, ML, and data science reasoning. In practice that means reviewing code for logic errors and edge cases, writing structured feedback for LLM training workflows, and auditing annotation quality against rubrics.

  • Python
  • Rubric evaluation
  • RLHF/SFT
  • QA workflows

RAG · Backend AI

Hospital Chatbot Appointment System

A medical chatbot that took a manual appointment workflow and made it self-service. Under the hood: a LangChain RAG pipeline, document retrieval, multi-intent handling, and automated doctor assignment on a FastAPI backend.

  • Python
  • FastAPI
  • Docker
  • LangChain
  • RAG
  • Caching
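
As a rough illustration of multi-intent handling and automated doctor assignment, the sketch below uses plain keyword matching and a least-booked rule. Both are stand-ins chosen for readability, not the method used in the real system, and the intent names, keywords, and `Doctor` fields are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical intents and trigger keywords; a real system would likely
# use a classifier or an LLM instead of substring matching.
INTENT_KEYWORDS = {
    "book_appointment": ("book", "appointment", "schedule"),
    "ask_information": ("hours", "where", "cost", "insurance"),
}


def detect_intents(message):
    """Return every intent whose keywords appear in the message."""
    text = message.lower()
    return [
        intent
        for intent, keywords in INTENT_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


@dataclass
class Doctor:
    name: str
    specialty: str
    booked: int  # number of appointments already assigned


def assign_doctor(doctors, specialty):
    """Pick the least-booked doctor in a specialty, or None if there is none."""
    candidates = [d for d in doctors if d.specialty == specialty]
    if not candidates:
        return None
    return min(candidates, key=lambda d: d.booked)


if __name__ == "__main__":
    roster = [
        Doctor("Dr. A", "cardiology", booked=3),
        Doctor("Dr. B", "cardiology", booked=1),
        Doctor("Dr. C", "dermatology", booked=0),
    ]
    print(detect_intents("Can I book a visit and what are your opening hours?"))
    print(assign_doctor(roster, "cardiology"))
```

A single message can carry more than one intent, so the detector returns a list and each intent can then be routed to its own handler.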

AI Assistant · DS Tooling

DeciSense — AI Assistant for Data Science Workflows

Python assistant that automates data science workflows: dataset intake, validation, profiling, task inference, planning, modeling, evaluation, and reporting. Simple inspectable logic first, LLM assistance second.

  • Python
  • Data validation
  • Profiling
  • ML planning
  • LLM explanations
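
"Simple inspectable logic first" can be pictured with a task-inference rule like the one below. This is a hedged sketch rather than DeciSense's code: the function name, the `max_classes` threshold, and the exact rules are assumptions. The point is that every decision comes with a plain-language reason a user can check.

```python
def infer_task(target_values, max_classes=20):
    """Guess the ML task from a target column; return (task, reason)."""
    values = [v for v in target_values if v is not None]
    if not values:
        return "unknown", "target column has no non-missing values"

    distinct = len(set(values))
    # bool is a subclass of int in Python, so exclude it explicitly.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )

    if not numeric:
        return "classification", f"non-numeric target with {distinct} distinct labels"
    # Whole-number targets with few distinct values are treated as class codes.
    whole = all(float(v).is_integer() for v in values)
    if whole and distinct <= max_classes:
        return "classification", f"integer-coded target with {distinct} distinct values"
    return "regression", f"numeric target with {distinct} distinct values"


if __name__ == "__main__":
    print(infer_task(["yes", "no", "yes"]))
    print(infer_task([1.5, 2.7, 3.1]))
    print(infer_task([0, 1, 1, 0]))
```

A rule like this is easy to audit and override. An LLM explanation can then be layered on top of the returned reason instead of replacing the rule.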

Data Science · Forecasting

Transportation Demand Forecasting

Forecasting pipelines built on PySpark and GCP that join taxi, flight, and weather data. The work improved prediction accuracy by 15% with SARIMA and LSTM models and cut pipeline runtime by 90%.

  • Python
  • PySpark
  • GCP
  • SARIMA
  • LSTM

Case-study details for each system are available on request.

Capability matrix

Technical capability, grouped by function.

AI Engineering & LLM Systems

  • Agentic AI & multi-agent systems
  • MCP (Model Context Protocol)
  • RAG · hybrid retrieval · reranking
  • Vector databases
  • LangChain
  • Context & prompt engineering
  • Structured outputs & tool calling
  • LLM evaluation & LLM-as-judge
  • RLHF/SFT evaluation
  • Guardrails & AI safety
  • Prompt caching & cost optimization

Backend & AI Product Engineering

  • Python
  • FastAPI
  • REST APIs
  • SQL
  • Caching
  • Docker
  • Git
  • Celery
  • Workflow automation

Data Engineering & Analytics

  • PySpark
  • ETL pipelines
  • Batch processing
  • Schema design
  • BigQuery
  • Looker Studio
  • Dashboard design
  • Reporting automation

Data Science & Machine Learning

  • Scikit-learn
  • NumPy
  • Pandas
  • Time-series forecasting
  • ARIMA
  • ARIMAX
  • SARIMA
  • SARIMAX
  • LSTM
  • GRU
  • Statistical analysis
  • Clustering
  • Classification
  • Regression
  • Feature engineering
  • Model evaluation

Computer Vision & Media AI

  • OpenCV
  • CNN classification
  • Medical image analysis
  • Thermal imaging
  • Deepfake detection
  • YOLOv11
  • MediaPipe
  • ByteTrack
  • Whisper
  • FFmpeg

Cloud & Deployment

  • AWS ECS
  • AWS S3
  • AWS Bedrock
  • AWS Lambda
  • CloudFront
  • GCP
  • BigQuery
  • Compute Engine
  • Vertex AI
  • Streamlit
  • Static hosting

AI-Accelerated Engineering

Most of this site, including the recruiter chatbot in the corner, was built by pair-programming with AI agents. Working this way is how I ship faster than a one-person team usually can.

  • AI coding agents (Claude Code)
  • Spec-driven development
  • AI code review & auditing
  • Eval-driven development
  • LLM training data quality
  • Human-in-the-loop workflows
  • Rapid AI prototyping

Product, Research & Collaboration

  • Business problem framing
  • Technical documentation
  • Stakeholder reporting
  • Async collaboration
  • Fast iteration
  • Startup leadership
  • Research writing

Experience

Seven roles in five years.

  1. Jan 2026 — Present

    Founder · Aureum

    Building an AI financial copilot end to end: AI orchestration, financial analytics, market intelligence pipelines, and cloud infrastructure.

  2. Mar 2026 — Present

    AI Engineer · Ayclip

    Designing AI video clipping architecture: CV pipelines, semantic highlight detection, ASR/subtitles, and asynchronous media processing on AWS.

  3. Oct 2024 — Present

    AI Evaluation Consultant · Alignerr

    Evaluating AI-generated STEM/coding/ML solutions, reviewing code quality, and auditing annotation outputs for rubric adherence and reasoning quality.

  4. May 2024 — Oct 2025

    AI Trainer & LLM Evaluator · Outlier

    Reviewed and annotated AI responses for RLHF/SFT across general, technical, and CS domains; assessed instruction-following, truthfulness, and reasoning.

  5. Jan 2024 — Jul 2024

    Data Scientist Intern · Data Glacier

    Built PySpark/GCP pipelines integrating taxi, flight, and weather data; SARIMA & LSTM forecasting improved accuracy 15% and cut runtime 90%.

  6. 2023

    Data Analyst Fellow · GoTo Impact Foundation

    Designed dashboards and automated reporting for student performance tracking; translated findings for non-technical stakeholders.

  7. Aug 2021 — Feb 2022

    AI Engineer Apprentice · Orbit Future Academy

    Led a team of four building a deepfake detection system with CNN architectures (MTCNN, InceptionResNetV1), from preprocessing to evaluation.

Research & education

Medical computer vision research.

Master's thesis · Universitas Gadjah Mada

Early detection of diabetic foot ulcers using thermogram and temperature data integration

97.06% reported classification accuracy

Combined plantar thermogram imaging with temperature data to catch diabetic foot ulceration risk earlier. The work covered image enhancement, feature integration, and model evaluation.

Published in SINTECHCOM Journal, Feb 2025 · DOI 10.59190/stc.v5i2.273

Education

M.Cs. Computer Science

Universitas Gadjah Mada · GPA 3.79/4.00

Education

S.Kom. Informatics

Universitas Muhammadiyah Malang · GPA 3.86/4.00

Book

Java Itu Mudah

DIVA Press, 2020 · ISBN 9786023918980

Contact

Let's build something practical.

Interested in AI engineering, GenAI, LLM systems, data science, or applied AI product work? Reach me by email or LinkedIn. I'm open to roles and collaborations.

nurilhuda3333@gmail.com