AI Engineer & Data Scientist — Indonesia
Building practical AI systems from data, models, and real-world workflows.
I turn messy data and manual processes into AI-powered products — combining engineering judgment, model evaluation, data reasoning, and product execution.
About
I work at the intersection of AI engineering, data science, and product execution. My strength is turning unclear operational problems into structured systems: define the workflow, reason about the data, evaluate model behavior, and ship something people can actually use.
My background spans LLM evaluation, AI product development, forecasting pipelines, backend-driven RAG applications, and medical computer vision research. I care about practical usefulness: AI that gives clear explanations, handles real constraints, and supports better decisions.
What I build
Practical AI workflows using RAG, prompt orchestration, structured outputs, and model evaluation patterns that prioritize correctness and usefulness.
Backend-driven AI applications with Python, FastAPI, Docker, relational workflows, caching, and cloud deployment.
Data pipelines, analytics workflows, and forecasting systems using Python, SQL, PySpark, SARIMA/LSTM, and stakeholder-ready reporting.
Research and project experience in thermal imaging, deep learning, image enhancement, face/deepfake detection, and video intelligence pipelines.
Featured work
AI products, LLM evaluation, data pipelines, and applied computer vision.
Founder · AI Product
Founded and built an AI-powered financial copilot that helps users manage personal finances and make better investment decisions through natural conversation. The work spans AI orchestration, financial analytics, market intelligence pipelines, and cloud infrastructure, with transparent, non-custodial recommendations.
GenAI · Computer Vision · Media
AI-powered video clipping architecture that turns long-form video into short-form clips: YOLOv11 + MediaPipe + ByteTrack for face tracking and 9:16 auto-reframing, Claude-based semantic highlight detection, Whisper speech-to-text with word-level subtitle synchronization, and asynchronous processing on Docker and AWS ECS/S3.
LLM Evaluation
Evaluating AI-generated STEM, coding, ML, and data science reasoning: reviewing code for logic errors and edge cases, producing structured feedback for LLM training workflows, and auditing annotation quality against rubrics.
RAG · Backend AI
Backend-driven medical chatbot combining a LangChain RAG pipeline, document retrieval, multi-intent handling, and automated doctor assignment — turning a manual appointment workflow into an AI-assisted service.
AI Assistant · DS Tooling
Python-based assistant automating data science workflows: dataset intake, validation, profiling, task inference, planning, modeling, evaluation, and reporting — simple inspectable logic first, LLM assistance second.
Data Science · Forecasting
Forecasting pipelines integrating taxi, flight, and weather datasets on PySpark and GCP with SARIMA and LSTM models — improving prediction accuracy by 15% and cutting pipeline runtime by 90%.
Case-study details for each system are available on request.
Capability matrix
I build with AI, not just build AI: agent-assisted development is how I ship faster than my team size would suggest.
Experience
Jan 2026 — Present
Building an AI financial copilot end to end: AI orchestration, financial analytics, market intelligence pipelines, and cloud infrastructure.
Mar 2026 — Present
Designing AI video clipping architecture: CV pipelines, semantic highlight detection, ASR/subtitles, and asynchronous media processing on AWS.
Oct 2024 — Present
Evaluating AI-generated STEM/coding/ML solutions, reviewing code quality, and auditing annotation outputs for rubric adherence and reasoning quality.
May 2024 — Oct 2025
Reviewed and annotated AI responses for RLHF/SFT across general, technical, and CS domains; assessed instruction-following, truthfulness, and reasoning.
Jan 2024 — Jul 2024
Built PySpark/GCP pipelines integrating taxi, flight, and weather data; SARIMA & LSTM forecasting improved accuracy 15% and cut runtime 90%.
2023
Designed dashboards and automated reporting for student performance tracking; translated findings for non-technical stakeholders.
Aug 2021 — Feb 2022
Led a team of four building a deepfake detection system with CNN architectures (MTCNN, InceptionResNetV1), from preprocessing to evaluation.
Research & education
Master's thesis · Universitas Gadjah Mada
97.06% reported classification accuracy
Combined plantar thermogram imaging with temperature data to detect diabetic foot ulceration risk early — image enhancement, feature integration, and rigorous model evaluation.
Published in SINTECHCOM Journal, Feb 2025 · DOI 10.59190/stc.v5i2.273
Education
Universitas Gadjah Mada · GPA 3.79/4.00
Education
Universitas Muhammadiyah Malang · GPA 3.86/4.00
Book
DIVA Press, 2020 · ISBN 9786023918980
Contact
Interested in AI engineering, GenAI, LLM systems, data science, or applied AI product work? Reach me directly by email or LinkedIn — I'm open to roles and collaborations.
nurilhuda3333@gmail.com