AI Systems, LLMOps & Agentic Workflows
AI Systems · LLMOps · RAG · Agents
Designing, evaluating, and monitoring LLM-powered systems for reliable knowledge work.
Focus
I build and evaluate LLM-powered systems that combine retrieval, orchestration, multimodal understanding, automated evaluation, and monitoring. The goal is not just to call a model API, but to design systems that can be tested, improved, and trusted in real workflows.
Core Capabilities
- Design and deployment of LLM-powered AI systems, including RAG, agents, and multi-agent orchestration.
- Multimodal ML systems for text, image, and embedding-based understanding.
- Hallucination detection, drift monitoring, automated evaluation, and regression testing with tools such as LangSmith and Langfuse.
- Agent architectures with LangChain, LangGraph, CrewAI, and Microsoft AutoGen.
- LLMOps and model lifecycle management: evaluation, monitoring, versioning, prompt management, and release discipline.
- Vector search with databases such as FAISS and Pinecone, and retrieval frameworks such as LlamaIndex.
- LLM fine-tuning workflows, including parameter-efficient methods (PEFT) such as LoRA and QLoRA.
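To make the retrieval step behind RAG concrete, here is a minimal sketch in pure Python: cosine-similarity search over a toy corpus of pre-computed embeddings. The document ids and vectors are hypothetical; in a real system the embeddings come from an embedding model and live in a store such as FAISS or Pinecone.

```python
import math

# Toy corpus with hypothetical pre-computed embeddings.
CORPUS = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.8, 0.3],
    "doc_c": [0.0, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def retrieve(query_vec, k=2):
    """Return the ids of the k corpus entries most similar to the query vector."""
    ranked = sorted(CORPUS, key=lambda d: cosine(query_vec, CORPUS[d]), reverse=True)
    return ranked[:k]

print(retrieve([1.0, 0.0, 0.1]))  # → ['doc_a', 'doc_b']
```

The retrieved ids would then be mapped back to document chunks and placed in the model's context; everything downstream of this step (re-ranking, prompt assembly, citation) is where most of the engineering and evaluation effort goes.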
How I Present This Work
AI systems work should be demonstrated through artifacts:
- Architecture diagrams.
- Evaluation datasets and test suites.
- Retrieval quality metrics.
- Failure mode analysis.
- Monitoring plans.
- Reproducible notebooks or small deployable demos.
- Clear tradeoffs around cost, latency, quality, and risk.
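As one example of such an artifact, a regression test suite for grounding can be sketched as below. The `is_grounded` heuristic and the example cases are hypothetical stand-ins: a naive token-overlap check playing the role that an LLM-as-judge or NLI-based evaluator would fill in a real harness such as LangSmith or Langfuse.

```python
def is_grounded(answer: str, context: str, threshold: float = 0.6) -> bool:
    """Naive grounding check: the share of answer tokens that also appear
    in the retrieved context must meet the threshold. A stand-in for
    LLM-as-judge or NLI-based hallucination checks."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return False
    overlap = len(answer_tokens & context_tokens) / len(answer_tokens)
    return overlap >= threshold

# Hypothetical regression cases: (retrieved context, model answer, expected verdict).
CASES = [
    ("the invoice total is 42 euros", "the invoice total is 42 euros", True),
    ("the invoice total is 42 euros", "shipping arrives next tuesday", False),
]

for context, answer, expected in CASES:
    assert is_grounded(answer, context) is expected
print("all regression cases passed")
```

Checked into version control and run on every prompt or model change, a suite like this turns "the system seems fine" into a pass/fail signal, which is the release discipline the capabilities above refer to.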