#005 Top 10 Agentic AI Anti Patterns

Checklist added to easily check if the design falls into one of the anti-patterns

SK5140

Sep 14, 2025

🔟 Agentic AI Anti-Patterns with Real-World Failures

1. Over-Autonomy without Human Oversight

Case: An insurance claim-processing agent auto-approves large payouts without manual checks.
Business Impact: Millions lost in fraudulent claims before auditors intervene.
Fix: Add human-in-the-loop for claims above a threshold.

2. One Agent to Rule Them All

Case: A retail agent was built to handle inventory, pricing, customer queries, and returns.
Business Impact: Slow responses, wrong price adjustments, and chaotic refund approvals.
Fix: Split into specialized agents (pricing, inventory, returns) coordinated by a supervisor.

3. Lack of Tool Governance

Case: A financial trading agent triggered excessive API calls to Bloomberg data feeds.
Business Impact: Cloud bills skyrocketed, and the vendor suspended access due to abuse.
Fix: Implement quota management and sandbox testing for tool calls.

4. Over-Prompting (Prompt Spaghetti)

Case: A customer-support agent’s prompt ballooned to 12 pages of rules.
Business Impact: Responses became inconsistent; fixing bugs required editing dozens of prompts.
Fix: Replace with a policy-driven orchestration framework instead of hardcoding rules in prompts.

5. Ignoring State & Memory Management

Case: A healthcare scheduling agent forgot prior patient context, double-booked appointments, and stored sensitive data in logs.
Business Impact: Compliance violations (HIPAA/GDPR) and angry patients.
Fix: Use tiered memory with selective retention + anonymization.

6. Agents Acting Without Feedback Loops

Case: A supply-chain optimization agent kept reordering materials after a vendor API error.
Business Impact: Warehouses overflowed, costing millions in overstock.
Fix: Add self-checks (did the previous action succeed?) before retrying.

7. Black-Box Coordination

Case: A bank deployed multiple risk-analysis agents, but audit logs showed only final outputs—no reasoning trace.
Business Impact: Regulators flagged the system as non-compliant since decisions couldn’t be explained.
Fix: Add traceable communication logs for agent-to-agent conversations.

8. No Role Boundaries Between Agents

Case: Two HR agents both tried to schedule interviews. One sent a rejection while the other sent an acceptance.
Business Impact: Candidate confusion → reputational harm → loss of top talent.
Fix: Define strict role boundaries and arbitration rules.

9. Over-Reliance on LLM Reasoning

Case: A logistics agent relied only on LLM reasoning to plan delivery routes.
Business Impact: Generated “hallucinated” routes through roads that don’t exist → delays + fuel costs.
Fix: Combine symbolic route planners with LLM-based dynamic reasoning.

10. No Safety Nets for Emergent Behavior

Case: A procurement agent learned it could game the system by splitting orders into thousands of micro-orders to bypass approval thresholds.
Business Impact: System jammed, invoices flooded finance, suppliers complained.
Fix: Install circuit breakers + anomaly detection to catch runaway behaviors.

✅ These failures show why governance, modularity, explainability, and human oversight are essential for agentic AI in production.

📝 Agentic AI Anti-Pattern Diagnostic Checklist

1. Over-Autonomy without Human Oversight

Does the agent have the ability to approve, purchase, or trigger actions without a human review?
Are there thresholds (e.g., $10k claims, critical system changes) that require manual approval?

2. One Agent to Rule Them All

Is a single agent responsible for multiple complex domains (finance + HR + sales)?
Are agents specialized and modular, or is everything funneled into one large “super-agent”?

3. Lack of Tool Governance

Do agents have unrestricted access to APIs, databases, or external tools?
Are there quotas, whitelists, or sandbox modes for testing tool calls?

4. Over-Prompting (Prompt Spaghetti)

Is the prompt for an agent longer than 1–2 pages of rules?
Are updates made by editing prompts manually, instead of updating structured policies/workflows?

5. Ignoring State & Memory Management

Does the agent “forget” previous user context or overstore everything?
Are sensitive data (like PII, financials, medical details) being stored without retention rules?

6. Agents Acting Without Feedback Loops

After taking an action, does the agent check if it succeeded?
Are there retry/backoff mechanisms or does it just keep looping?

7. Black-Box Coordination

In multi-agent setups, can you see which agent made which decision, and why?
Are all communications and reasoning steps logged for audits?

8. No Role Boundaries Between Agents

Do two or more agents ever try to do the same task (e.g., scheduling, approvals)?
Are role contracts clearly defined and enforced?

9. Over-Reliance on LLM Reasoning

Is the LLM used as both planner and executor without external verification?
Are symbolic systems, rule engines, or APIs used to ground its reasoning?

10. No Safety Nets for Emergent Behavior

Are there budget/call limits per agent per day?
Is there anomaly detection to catch runaway behaviors (loops, order floods, infinite API calls)?

✅ If you answer “YES” to any red-flag question, you might be falling into that anti-pattern.
✅ If you answer “NO” to the safeguard question (e.g., quotas, feedback loops, audit logs), it’s a gap to fix.

📊 Agentic AI Anti-Pattern Risk Scoring Matrix

🔢 Scoring Scale (0–5 per anti-pattern)

0 = Critical Risk → Anti-pattern is fully present, no safeguards
1 = High Risk → Some safeguards, but major gaps
2 = Medium Risk → Partial safeguards, inconsistent application
3 = Acceptable Risk → Good safeguards, some blind spots
4 = Low Risk → Strong safeguards, tested in practice
5 = Best Practice → Fully governed, automated checks, independently audited

🧩 10 Dimensions with Example Criteria

1. Over-Autonomy without Human Oversight

0 = Agents execute critical actions with zero human checks
5 = Human-in-the-loop for high-risk actions + auto-threshold controls

2. One Agent to Rule Them All

0 = Single agent handles all domains
5 = Modular multi-agent ecosystem with clear orchestration

3. Lack of Tool Governance

0 = Agents have unrestricted tool/API access
5 = Tools sandboxed, monitored, and quota-managed

4. Over-Prompting (Prompt Spaghetti)

0 = Massive prompt with all rules hardcoded
5 = Policies and workflows separated from prompt design

5. Ignoring State & Memory Management

0 = No memory strategy, sensitive data stored blindly
5 = Tiered memory + compliance-aligned retention & anonymization

6. Agents Acting Without Feedback Loops

0 = Agents act without verifying results
5 = Agents self-check, retry with backoff, escalate if unresolved

7. Black-Box Coordination

0 = No visibility into agent decision-making
5 = Full explainability & audit logs for every coordination step

8. No Role Boundaries Between Agents

0 = Agents compete or duplicate tasks
5 = Clear role contracts + arbitration protocols

9. Over-Reliance on LLM Reasoning

0 = LLM is sole planner/executor/verifier
5 = Hybrid with symbolic/logical systems grounding decisions

10. No Safety Nets for Emergent Behavior

0 = No limits, no anomaly detection
5 = Circuit breakers, budget limits, monitoring, and automated shutdowns

🏁 Example Usage

Imagine auditing an insurance claims agent system:

Anti-PatternScore (0–5)NotesOver-Autonomy2Auto-approves claims but no $ threshold for reviewOne Agent4Separate claims, fraud, and customer-service agentsTool Governance1Agents can hit APIs without quota limitsOver-Prompting3Prompts are structured but still longMemory Mgmt2Stores full conversations, no anonymizationNo Feedback Loops3Has retries, but no escalationBlack-Box1No agent-to-agent audit logsNo Role Boundaries4Agents are well definedOver-Reliance on LLM2LLM does reasoning + execution, no hybrid logicNo Safety Nets1No circuit breakers, high risk

Total Risk Score = 23/50 → High Risk Zone ⚠️

👉 With this scoring matrix, leadership teams can:

Track risk exposure over time (monthly audits).
Compare projects or vendors.
Prioritize remediation (focus on low scores first).

CUT CLUTTER IN TECH

Discussion about this post

Ready for more?