#002 Top 10 RAG ANTI-PATTERNS

Studying anti-patterns is the best way for mid-level experts to learn system design

Sep 14, 2025

🚫 RAG Anti-Patterns with Business Consequences

1) 🧱 Giant, unstructured chunks

Tech issue: Long blobs of text embedded as single vectors, so each embedding blurs many topics together.
Business problem: Customer support bot returns vague “policy-like” answers instead of exact clauses.
Impact: Call center costs rise, customers escalate to human agents.
Fix: Semantic chunking, metadata tagging, controlled overlap.
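
A minimal sketch of chunking with metadata and controlled overlap. It uses sentence boundaries as a crude stand-in for semantic boundaries; the function name `chunk_sentences`, the `window`/`overlap` defaults and the metadata keys are illustrative assumptions, not a reference implementation.

```python
import re


def chunk_sentences(doc: str, source: str, window: int = 4, overlap: int = 1) -> list[dict]:
    """Split `doc` into overlapping sentence windows, each tagged with metadata."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than window")
    # Crude sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
    step = window - overlap
    chunks: list[dict] = []
    for start in range(0, len(sentences), step):
        chunks.append({
            "text": " ".join(sentences[start:start + window]),
            "source": source,           # metadata for filtering and citations
            "chunk_id": len(chunks),
            "first_sentence": start,
        })
        if start + window >= len(sentences):
            break  # this window already reaches the end of the document
    return chunks
```

With `window=3, overlap=1`, each chunk repeats the last sentence of the previous one, so a clause that straddles a boundary still appears whole in at least one chunk.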

2) 📚 Context stuffing

Tech issue: Feeding the LLM 10–20 chunks, most of them irrelevant.
Business problem: Compliance chatbot cites outdated or contradictory rules in regulated industries (banking, insurance).
Impact: Legal liability, fines, brand reputation hit.
Fix: Retrieve broadly → re-rank → feed top 3–6 relevant chunks only.
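
The retrieve → re-rank → trim step can be sketched as below. `token_overlap` is a toy scorer standing in for a real re-ranker such as a cross-encoder, and `top_k` and `min_score` are assumed values that would need tuning on real data.

```python
def token_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens that appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def rerank(query: str, candidates: list[str], score_fn=token_overlap,
           top_k: int = 4, min_score: float = 0.3) -> list[str]:
    """Score a broad candidate set, drop weak matches, keep the best `top_k`."""
    scored = [(score_fn(query, c), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]
```

The `min_score` cut matters as much as `top_k`: if only two chunks are relevant, the LLM should see two, not four.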

3) 🎯 Single-vector myopia

Tech issue: Dense retrieval only, misses keywords/numbers.
Business problem: Financial advisor bot ignores “Form 1099” or a specific product ID.
Impact: Wrong tax advice or wrong SKU recommendations.
Fix: Hybrid retrieval (dense + keyword/BM25).
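
One common way to combine a dense ranking with a keyword/BM25 ranking is reciprocal rank fusion (RRF). The sketch below assumes each retriever has already returned an ordered list of document IDs; the IDs in the usage example and the constant `k=60` (a conventional default) are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc IDs; a doc scores 1 / (k + rank) per list."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical results from a dense retriever and a keyword retriever.
dense = ["tax_overview", "form_1099_guide", "retirement_faq"]
keyword = ["form_1099_guide", "sku_table"]
fused = reciprocal_rank_fusion([dense, keyword])
```

Here `form_1099_guide` rises to the top because both retrievers found it, even though the dense retriever alone ranked it second.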

4) 🔄 No query rewriting

Tech issue: User asks vague question → retrieval misses.
Business problem: Customer searches “car accident claim process” but bot fails to link to “motor insurance settlement procedure.”
Impact: Frustration → churn → loss of renewals.
Fix: Query rewriting/expansion with synonyms, HyDE-style hypothetical docs.
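
A minimal sketch of synonym-based query expansion, assuming a hand-maintained domain synonym map (the `SYNONYMS` entries below are made up for the insurance example). HyDE-style rewriting needs an LLM call and is not shown here.

```python
# Hypothetical domain synonym map; in practice this comes from domain experts or logs.
SYNONYMS = {
    "car": ["motor", "vehicle"],
    "accident": ["collision"],
    "claim": ["settlement"],
    "process": ["procedure"],
}


def expand_terms(query: str, synonyms: dict[str, list[str]] = SYNONYMS) -> list[str]:
    """Return the query terms plus their synonyms, in order, without duplicates."""
    terms: list[str] = []
    for word in query.lower().split():
        for term in [word, *synonyms.get(word, [])]:
            if term not in terms:
                terms.append(term)
    return terms
```

Expanding "car accident claim process" this way adds "motor", "settlement" and "procedure", which gives a keyword retriever a chance to reach the "motor insurance settlement procedure" page.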

5) 🧪 Eval/production mismatch

Tech issue: Eval dataset doesn’t match production queries.
Business problem: Sales enablement bot performs well in demo, but fails with real sales reps asking complex cross-product questions.
Impact: Lost deals, wasted sales enablement investment.
Fix: Build golden dataset from real user logs + hard negatives.
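
A sketch of how such a golden set might be scored. The record fields (`query`, `relevant_ids`, `hard_negative_ids`) and the `retrieve` callable are assumptions for illustration; `retrieve` stands in for whatever retriever is under test and returns ranked document IDs.

```python
from typing import Callable


def evaluate(golden: list[dict], retrieve: Callable[[str], list[str]], k: int = 5) -> dict:
    """Report hit rate@k and how often a hard negative outranks every relevant doc."""
    hits = 0
    fooled = 0
    for ex in golden:
        ranked = retrieve(ex["query"])[:k]
        relevant_pos = [i for i, d in enumerate(ranked) if d in ex["relevant_ids"]]
        negative_pos = [i for i, d in enumerate(ranked) if d in ex["hard_negative_ids"]]
        if relevant_pos:
            hits += 1
        # "Fooled": a hard negative appears above all relevant documents.
        if negative_pos and (not relevant_pos or min(negative_pos) < min(relevant_pos)):
            fooled += 1
    n = len(golden)
    if n == 0:
        return {"hit_rate": 0.0, "hard_negative_wins": 0.0}
    return {"hit_rate": hits / n, "hard_negative_wins": fooled / n}
```

Tracking `hard_negative_wins` separately shows whether failures come from missing documents or from look-alike documents winning, which are different problems with different fixes.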

6) 🧾 Missing provenance (no citations)

Tech issue: LLM answers but doesn’t link sources.
Business problem: Healthcare assistant suggests a dosage with no reference.
Impact: Zero trust from doctors; product adoption blocked.
Fix: Always show citations + source confidence score.
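
One way to enforce this is to make citations part of the answer format, so an answer without sources cannot be emitted at all. In the sketch below the `Source` fields and the output layout are assumptions, and the retrieval score is used as a rough proxy for source confidence.

```python
from dataclasses import dataclass


@dataclass
class Source:
    doc_id: str
    title: str
    score: float  # retrieval score, used here as a rough confidence proxy


def format_answer(answer_text: str, sources: list[Source], max_sources: int = 3) -> str:
    """Append a numbered source list to the answer; refuse if there are no sources."""
    if not sources:
        raise ValueError("refusing to emit an answer without sources")
    top = sorted(sources, key=lambda s: s.score, reverse=True)[:max_sources]
    lines = [answer_text, "", "Sources:"]
    for i, s in enumerate(top, start=1):
        lines.append(f"[{i}] {s.title} ({s.doc_id}), retrieval score {s.score:.2f}")
    return "\n".join(lines)
```

Raising on an empty source list pushes the "no evidence" case into the abstain/fallback path instead of letting an unsourced answer reach the user.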

7) 🧊 Static indexes forever

Tech issue: Knowledge base never refreshed.
Business problem: HR chatbot still quotes “old maternity leave policy.”
Impact: Employee dissatisfaction, potential legal exposure.
Fix: Incremental ingestion, freshness weighting, cache busting.
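
A sketch of two of these pieces: change detection for incremental ingestion (via a content hash) and freshness weighting (via exponential decay). The function names and the 180-day half-life are assumptions; cache busting is not shown.

```python
import hashlib


def needs_reindex(doc_id: str, text: str, seen_hashes: dict[str, str]) -> bool:
    """Return True (and record the new hash) only if the document content changed."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if seen_hashes.get(doc_id) == digest:
        return False
    seen_hashes[doc_id] = digest
    return True


def freshness_weighted(score: float, age_days: float, half_life_days: float = 180.0) -> float:
    """Halve a document's retrieval score every `half_life_days` of age."""
    return score * 0.5 ** (age_days / half_life_days)
```

Decay only demotes old documents. A superseded policy should also be removed or flagged at ingestion time, since a strong match on an outdated page can still outrank a weaker match on the current one.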

8) 🧮 Ignoring structure & entities

Tech issue: Treats structured data like flat text.
Business problem: Retail inventory bot reports the wrong stock count or mismatches SKUs.
Impact: Wrong procurement orders, stockouts, excess carrying cost.
Fix: Entity/graph-aware retrieval; tabular embeddings.
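
A simple form of entity-aware retrieval is to detect identifiers in the query and answer from the structured store directly, bypassing embeddings. The SKU format in `SKU_PATTERN` and the `INVENTORY` records below are invented for illustration; this is exact-match routing, not graph retrieval or tabular embeddings.

```python
import re
from typing import Optional

# Assumed SKU format: 2-4 capital letters, a dash, 3-6 digits (e.g. TSH-1042).
SKU_PATTERN = re.compile(r"\b[A-Z]{2,4}-\d{3,6}\b")

# Hypothetical inventory table; in practice this is a database query.
INVENTORY = {
    "TSH-1042": {"name": "T-shirt, blue", "in_stock": 37},
    "MUG-220": {"name": "Mug, white", "in_stock": 0},
}


def answer_stock_query(query: str, inventory: dict = INVENTORY) -> Optional[dict]:
    """Look up any SKUs mentioned in the query; None means fall back to text retrieval."""
    skus = SKU_PATTERN.findall(query)
    if not skus:
        return None
    # Unknown SKUs map to None so the caller can say "not found" instead of guessing.
    return {sku: inventory.get(sku) for sku in skus}
```

Stock counts come from an exact lookup, so the bot cannot return the count for a near-neighbour SKU the way a pure embedding search can.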

9) ⚠️ No guardrails for “I don’t know”

Tech issue: LLM hallucinates instead of abstaining.
Business problem: Banking bot “invents” a loan interest rate.
Impact: Regulatory breach, lawsuits, reputational damage.
Fix: Confidence thresholds, abstain/fallback strategy.
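
A minimal sketch of a threshold-based abstain step. `generate` is a hypothetical callable standing in for the LLM call, and `min_score` and `min_hits` are assumed values that need calibrating against a labelled set.

```python
from typing import Callable

FALLBACK = "I don't know based on the documents I have. Routing you to a human agent."


def answer_or_abstain(retrieved: list[tuple[str, float]],
                      generate: Callable[[list[str]], str],
                      min_score: float = 0.75, min_hits: int = 1) -> str:
    """Call the generator only when enough chunks clear the confidence threshold."""
    confident = [(text, score) for text, score in retrieved if score >= min_score]
    if len(confident) < min_hits:
        return FALLBACK  # abstain instead of letting the model improvise
    return generate([text for text, _ in confident])
```

The generator is never called on weak evidence, so a loan rate cannot be invented from chunks that barely matched the question.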

10) 🚦 One-shot pipelines

Tech issue: Single retrieve→generate step.
Business problem: Corporate research assistant fails on “Compare competitor X’s sustainability policy vs ours.”
Impact: Executives base strategy on incomplete answers.
Fix: Multi-hop/agentic retrieval, scratchpad reasoning.
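
A sketch of the retrieval half of a multi-hop pipeline: each sub-question gets its own retrieval pass and the evidence is collected in a scratchpad. It assumes the question has already been decomposed into `sub_queries` (by an LLM or by rules) and that `retrieve` is whatever retriever the system uses; both are placeholders.

```python
from typing import Callable


def multi_hop(sub_queries: list[str], retrieve: Callable[[str], list[str]],
              max_hops: int = 4) -> dict:
    """Run one retrieval per sub-query and record which hops found no evidence."""
    scratchpad = []
    for sub_query in sub_queries[:max_hops]:
        scratchpad.append({"sub_query": sub_query, "evidence": retrieve(sub_query)})
    missing = [step["sub_query"] for step in scratchpad if not step["evidence"]]
    return {"scratchpad": scratchpad, "missing": missing}
```

For a "compare X vs ours" question, a non-empty `missing` list signals that one side of the comparison has no evidence, so the answer should say so rather than compare against nothing.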

✅ Business-Lens Checklist

Trust: Citations + abstention to avoid hallucinations.
Compliance: Keep KB fresh; enforce data-level access control.
Customer experience: Query rewriting for natural phrasing; chunking for precise answers.
Operational cost: Reduce escalations to humans by improving precision.
Revenue impact: Sales/research bots must handle real-world, multi-hop queries reliably.
