Syed Mehdi
AI

Customer Support AI Agent with RAG & LLMOps

Production-style RAG agent with hybrid retrieval, tracing, and automated evaluation.

How it works
01 User Query
FastAPI

A support question arrives over a FastAPI endpoint, containerised with Docker for consistent deployment.

Answer accuracy: 80%+
Eval set: 200 queries
Deploy: Dockerised
Stack
Cohere API, ChromaDB, FastAPI, Docker, LangChain, LLMOps
The Problem

Support teams waste hours answering repetitive questions whose answers are scattered across documentation, and an LLM that answers without grounding in that documentation tends to hallucinate.

Objective

Build a grounded, observable agent that answers from a knowledge base, cites its sources, and has measurable accuracy.

Approach

Built a RAG agent on the Cohere API and ChromaDB with semantic chunking and hybrid retrieval. Added response tracing and observability to surface hallucination patterns, then applied metadata filtering and prompt versioning to improve grounding. Deployed as a Dockerised FastAPI service with automated evaluation and cost monitoring.
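The write-up does not say how the keyword and semantic signals are combined, so the following is only a minimal sketch of one common option: filter by metadata first, rank the remaining chunks two ways, then merge the rankings with reciprocal rank fusion. Everything here is an assumption for illustration: the `CHUNKS` data, the `product` metadata field, and the function names are hypothetical, the term-overlap ranker stands in for a real keyword scorer such as BM25, and the character-trigram cosine stands in for embedding similarity from a vector store.

```python
import math
from collections import Counter

# Hypothetical knowledge-base chunks; text and metadata are made up for illustration.
CHUNKS = [
    {"id": "c1", "text": "Reset your password from the account settings page", "product": "web"},
    {"id": "c2", "text": "Refunds are processed within five business days", "product": "billing"},
    {"id": "c3", "text": "Password rules require twelve characters", "product": "web"},
    {"id": "c4", "text": "Update your billing address before requesting a refund", "product": "billing"},
]


def keyword_rank(query, chunks):
    """Rank chunk ids by number of exact query-term matches (toy stand-in for BM25)."""
    terms = set(query.lower().split())
    scored = [(sum(1 for t in c["text"].lower().split() if t in terms), c["id"]) for c in chunks]
    return [cid for _, cid in sorted(scored, key=lambda s: (-s[0], s[1]))]


def trigrams(text):
    """Bag of character trigrams, used as a crude vector representation."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def vector_rank(query, chunks):
    """Rank chunk ids by trigram cosine (toy stand-in for embedding similarity)."""
    qv = trigrams(query)
    scored = [(cosine(qv, trigrams(c["text"])), c["id"]) for c in chunks]
    return [cid for _, cid in sorted(scored, key=lambda s: (-s[0], s[1]))]


def hybrid_search(query, chunks, product=None, k=60, top_n=2):
    """Metadata filter, then reciprocal rank fusion of the two rankings."""
    # Metadata filtering: only chunks tagged with the requested product are candidates.
    pool = [c for c in chunks if product is None or c["product"] == product]
    fused = {}
    for ranking in (keyword_rank(query, pool), vector_rank(query, pool)):
        for rank, cid in enumerate(ranking, start=1):
            # Each ranker contributes 1 / (k + rank); k damps the effect of top ranks.
            fused[cid] = fused.get(cid, 0.0) + 1.0 / (k + rank)
    ordered = sorted(fused.items(), key=lambda s: (-s[1], s[0]))
    return [cid for cid, _ in ordered][:top_n]


results = hybrid_search("how do I reset my password", CHUNKS, product="web")
```

Fusing ranks rather than raw scores avoids having to normalise two scoring scales against each other, which is one reason this scheme is a frequent default; whether it matches the project's actual retrieval setup is not stated in the write-up.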

Challenges

Diagnosing where hallucinations entered the chain, then tuning chunking and metadata filters to fix grounding without hurting latency.
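Locating where a bad answer originates is easier when every stage of the chain leaves a record. The sketch below is an assumption about what such tracing could look like, not the project's actual tooling: `Trace`, `answer_with_trace`, and `suspect_stage` are hypothetical names, and the `retrieve` and `generate` callables stand in for the real retriever and LLM call. It records each stage's output and latency, then applies one simple heuristic: if retrieval returned nothing, the answer cannot be grounded, so retrieval is the first place to look.

```python
import time
import uuid
from contextlib import contextmanager


class Trace:
    """Collects one span per pipeline stage for a single request."""

    def __init__(self, query):
        self.trace_id = uuid.uuid4().hex
        self.query = query
        self.spans = []

    @contextmanager
    def span(self, stage):
        record = {"stage": stage, "output": None}
        start = time.perf_counter()
        try:
            yield record
        finally:
            # Latency is recorded even if the stage raises.
            record["ms"] = (time.perf_counter() - start) * 1000
            self.spans.append(record)


def answer_with_trace(query, retrieve, generate):
    """Run retrieve then generate, recording the output of each stage."""
    trace = Trace(query)
    with trace.span("retrieve") as s:
        chunks = retrieve(query)
        s["output"] = chunks
    with trace.span("generate") as s:
        answer = generate(query, chunks)
        s["output"] = answer
    return answer, trace


def suspect_stage(trace):
    """Heuristic only: flag retrieval when it produced no context for the answer."""
    for s in trace.spans:
        if s["stage"] == "retrieve" and not s["output"]:
            return "retrieve"
    return None


# Stub stages for illustration: retrieval finds nothing, the generator answers anyway.
answer, trace = answer_with_trace(
    "Can I pay by cheque?",
    retrieve=lambda q: [],
    generate=lambda q, chunks: "Yes, cheques are accepted.",
)
```

Because the per-stage latency sits in the same record as the output, a change to chunking or filters can be checked for both grounding and speed from the same trace.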

Results

80%+ answer accuracy on a 200-query evaluation set, with traceable, cost-monitored responses.
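The write-up does not describe how an answer was judged correct, so the following is a minimal sketch of one possible automated check, assuming a phrase-match rule: an answer passes if it contains every expected phrase for its question. `EvalCase`, `accuracy`, and `stub_agent` are hypothetical names, and the four cases are made up to show the mechanics; they are not drawn from the 200-query set and their score says nothing about the reported result.

```python
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    question: str
    # Phrases a correct answer is expected to include (assumed scoring rule).
    must_contain: list = field(default_factory=list)


def is_correct(answer, case):
    """Pass only if every expected phrase appears, ignoring case."""
    text = answer.lower()
    return all(phrase.lower() in text for phrase in case.must_contain)


def accuracy(answer_fn, cases):
    """Fraction of cases whose answer passes the phrase check."""
    if not cases:
        raise ValueError("evaluation set is empty")
    passed = sum(is_correct(answer_fn(c.question), c) for c in cases)
    return passed / len(cases)


# Made-up cases for illustration only.
CASES = [
    EvalCase("How do I reset my password?", ["account settings"]),
    EvalCase("How long do refunds take?", ["five business days"]),
    EvalCase("Can I export my data?", ["csv"]),
    EvalCase("Is there a mobile app?", ["ios", "android"]),
]


def stub_agent(question):
    """Stand-in for the deployed agent; returns canned answers."""
    canned = {
        "How do I reset my password?": "Open Account Settings and choose Reset password.",
        "How long do refunds take?": "Refunds usually arrive within a week.",
        "Can I export my data?": "Yes, export to CSV from the dashboard.",
        "Is there a mobile app?": "Yes, there is an iOS app.",
    }
    return canned.get(question, "I don't know.")


score = accuracy(stub_agent, CASES)  # 0.5 for these made-up cases
```

Phrase matching is cheap and deterministic, which suits repeated automated runs, but it can miss correct paraphrases; a stricter or model-graded rubric would slot into `is_correct` without changing the rest.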

What's Next

Multi-agent routing for specialised domains and continuous evaluation in CI.
