Agentic research assistant with guardrails

Agents

LangGraph

LangChain

LLMOps

A project template for an LLM agent workflow with task routing, tool use, evaluation, and human oversight.

Published

April 26, 2026

System Goal

Design an AI research assistant that can decompose a request, retrieve relevant information, call tools, draft an answer, evaluate its own output, and hand off uncertain cases for human review.

Architecture

Possible implementation paths:

LangGraph for stateful agent workflows.
LangChain for tool calling and retrieval components.
CrewAI or Microsoft AutoGen for multi-agent coordination experiments.
A judge or evaluator node for groundedness, completeness, and risk checks.
Trace logging with LangSmith or Langfuse.

Reliability Questions

Which tasks can the agent complete autonomously?
Which tasks require human confirmation?
What tool calls are allowed?
How is the system evaluated before release?
How are hallucinations, stale context, and prompt regressions detected?

Evaluation Plan

Create an evaluation set with:

Simple requests.
Ambiguous requests.
Retrieval-heavy requests.
Tool-use requests.
Adversarial or misleading requests.
Expected refusal or escalation cases.

Notebook Plan

notebooks/agentic-research-assistant/01-workflow-design.ipynb
notebooks/agentic-research-assistant/02-tool-calling.ipynb
notebooks/agentic-research-assistant/03-evaluation-set.ipynb
notebooks/agentic-research-assistant/04-tracing-and-regression-tests.ipynb