MCP Explained: The USB Standard for AI Tools

MCP stands for Model Context Protocol. In practice it’s a standard way for an AI application (a host) to talk to tools and data sources (servers). A host (Cursor, Claude Desktop, or a custom app) discovers and calls MCP servers over a defined transport (e.g. stdio or HTTP). Each server exposes tools (actions the model can request), resources (read-only data the model can pull in), and optionally prompts (templates). So instead of every app inventing its own plugin system, MCP gives you one protocol: like USB for AI tools.

From the host’s point of view you add a server by config (name, transport, args). The host then fetches the server’s capability list: what tools exist, what arguments they take, what resources are available. When the model wants to call a tool, the host sends the request to the right server and passes the result back to the model. The model never talks to the server directly; the host is the broker. That keeps auth, rate limits, and safety in one place.
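The host-as-broker idea can be sketched in a few lines. This is an illustrative toy, not the real MCP wire protocol (which is JSON-RPC 2.0 over the transport); ToyServer, ToyHost, and dispatch are hypothetical names.

```python
class ToyServer:
    """Stands in for an MCP server: advertises tools and handles calls."""
    def __init__(self, name, tools):
        self.name = name
        self.tools = tools  # dict: tool name -> callable

    def list_tools(self):
        return list(self.tools)

    def call_tool(self, tool, args):
        return self.tools[tool](**args)


class ToyHost:
    """The host owns the registry; the model never touches servers directly."""
    def __init__(self):
        self.servers = {}

    def add_server(self, server):
        self.servers[server.name] = server

    def capabilities(self):
        # What the host fetches at startup: which tools exist, per server.
        return {name: s.list_tools() for name, s in self.servers.items()}

    def dispatch(self, server_name, tool, args):
        # Auth, rate limits, and safety checks would live here.
        return self.servers[server_name].call_tool(tool, args)


host = ToyHost()
host.add_server(ToyServer("files", {"read_file": lambda path: f"<contents of {path}>"}))
caps = host.capabilities()                                  # {'files': ['read_file']}
result = host.dispatch("files", "read_file", {"path": "notes.txt"})
```

The point of the shape is the single choke point: every tool call goes through dispatch, which is where policy lives.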

Servers can be local (a process on your machine) or remote (HTTP). They can wrap existing APIs, file systems, or custom logic. The protocol is transport-agnostic so you can run the same server over stdio in dev and HTTP in production. Tool schemas are JSON-based so they’re easy to generate and validate.

MCP doesn’t solve everything: you still have to build and secure the servers, and the model still has to choose the right tool and arguments. But it solves the “how do we plug tools in?” problem in a way that’s reusable across hosts and models. That’s why it’s spreading: one protocol, many apps and servers.

Expect more MCP servers in the wild and tighter integration in major AI products. The value is in the ecosystem, not any single implementation.

nJoy πŸ˜‰

Agent Memory: Short-Term Context, Long-Term State, and the Gap Between

Agents have two kinds of “memory”: the context window (short-term) and everything else (long-term). Short-term is what you send in each request: the conversation so far, maybe a summary of older turns, plus any retrieved docs or tool results. That’s limited (e.g. 128K tokens) and expensive. Long-term would be a persistent store: facts about the user, past decisions, or project state that survives across sessions. Today most agents don’t have real long-term memory; they get a fresh context each time, or a hand-built summary that you inject.

The gap shows up when you want an agent that remembers your preferences, what it did last week, or the current state of a long project. Without long-term memory, you have to tell it again or rely on RAG over past transcripts. That works up to a point, but retrieval isn’t the same as “knowing”: the model might not get the right chunk, or might contradict what it “remembered” before. True long-term memory would mean the agent updates a store (e.g. a knowledge graph or structured DB) and reads from it at the start of each run; that is still an open design problem.

Short-term is also a design choice: do you keep every message, or do you summarise old turns to save space? Summarisation loses detail; keeping everything hits context limits. Many systems use a sliding window plus a running summary. Tool results can be truncated or summarised too, so the model sees “the answer was X” instead of a 10K-character dump.
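The sliding-window-plus-summary idea is a few lines of bookkeeping. A minimal sketch, assuming an OpenAI-style message list; summarise is a stand-in for what would be an LLM call in a real system.

```python
def summarise(messages):
    # Hypothetical stand-in: a real system would call a model here.
    return "Summary of %d earlier messages" % len(messages)


def build_context(history, window=4):
    """Keep the last `window` messages verbatim; fold the rest into a summary."""
    if len(history) <= window:
        return history
    older, recent = history[:-window], history[-window:]
    return [{"role": "system", "content": summarise(older)}] + recent


history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
context = build_context(history)
# context[0] is the running summary; the last 4 turns are kept verbatim.
```

The tradeoff from the paragraph above is visible in the code: everything in `older` survives only as whatever summarise preserves.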

Until we have standard, reliable long-term memory, agents will remain best at single-session or well-scoped tasks. Progress will come from better retrieval, better summarisation, and eventually learned or hybrid memory that the agent can read and update safely.

Expect more work on agent memory architectures and on grounding agents in external state (databases, docs) as a stand-in for true long-term memory.

nJoy πŸ˜‰

Multi-Agent Systems: Coordination, Trust, and Failure Modes

Multi-agent systems use more than one agent (or role) to accomplish a task. One might do research, another write code, another review; they hand off work or debate until they converge. The appeal is specialisation and checks-and-balances: different “brains” for different steps, and one can catch another’s mistake. The cost is coordination: who does what, when do you stop, and how do you avoid loops or contradictions.

Coordination patterns vary. You can have a supervisor that assigns subtasks to worker agents and merges results. You can have a debate: two agents argue for different answers and a third decides. You can have a pipeline: agent A’s output is agent B’s input. The common thread is that each agent has a clear role and a defined interface (input/output or shared state), and the system has a protocol for handoffs and termination.
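The supervisor pattern above can be sketched with workers as plain callables; in a real system each worker would be an LLM call with its own role prompt, and the names here (research, write, review, supervisor) are invented for illustration.

```python
def research(task):
    return f"facts about {task}"          # role 1: gather material


def write(task, facts):
    return f"draft on {task} using {facts}"  # role 2: produce a draft


def review(draft):
    return f"approved: {draft}"           # role 3: accept or reject


def supervisor(task, max_rounds=3):
    facts = research(task)                # handoff 1: researcher -> writer
    draft = write(task, facts)            # handoff 2: writer -> reviewer
    for _ in range(max_rounds):           # bounded loop: a guardrail, not trust
        verdict = review(draft)
        if verdict.startswith("approved"):
            return verdict
    return "escalate to human"            # termination protocol when no convergence
```

Note that the guardrails from the next paragraph (max rounds, a defined escalation path) are part of the supervisor itself, not an afterthought.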

Trust is tricky. You’re still relying on LLMs to follow the protocol, stay in role, and not hallucinate or contradict each other. In practice you need guardrails: max steps, validation of handoffs, and sometimes human approval for high-stakes steps. Failure modes include infinite loops (agents keep delegating), conflicting answers (no clear winner), and one agent undoing another’s work.

Multi-agent setups are most useful when the task naturally splits (e.g. research + synthesis + writing) or when you want redundancy (e.g. two agents propose, one adjudicates). For many use cases a single capable agent with good tools is simpler and easier to debug.

Expect more frameworks and patterns for multi-agent coordination, and clearer guidance on when the extra complexity pays off.

nJoy πŸ˜‰

Tool Use and Function Calling: From Simple APIs to Full Autonomy

Tool use is how an LLM goes from “text in, text out” to “text in, call an API, get data, then text out.” The model is given a list of tools (name, description, parameters) and can request a call; your code runs the call and returns the result to the model. That’s the basis of search assistants, code runners, and custom integrations. Function calling is the same idea with a more formal name: the model outputs a structured call (function name + arguments) instead of free text, which makes parsing and execution reliable.

Most APIs (OpenAI, Anthropic, Google) support a tools or functions parameter: you pass a schema (e.g. JSON Schema for the arguments), and the model responds with something like tool_calls: [{ "name": "get_weather", "arguments": { "city": "London" } }]. You execute that, then send the result back in a follow-up message. The model can then call another tool or give a final answer. The loop is: user message → model (maybe with tool_calls) → you run tools → you send tool results → model again.
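That loop fits in a short function. A sketch with a stubbed model: the message shapes loosely follow the OpenAI-style tool-calling format, but stub_model, get_weather, and the exact dict keys are assumptions for illustration, not a real API client.

```python
import json


def get_weather(city):
    return {"city": city, "temp_c": 14}   # stand-in for a real weather API

TOOLS = {"get_weather": get_weather}


def stub_model(messages):
    # First turn: request a tool call. Second turn: answer from the result.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "get_weather",
                                "arguments": {"city": "London"}}]}
    result = json.loads([m for m in messages if m["role"] == "tool"][-1]["content"])
    return {"content": f"It is {result['temp_c']}C in {result['city']}."}


def run(user_message, max_steps=5):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):            # step limit guards against loops
        reply = stub_model(messages)
        if "tool_calls" not in reply:
            return reply["content"]       # final answer, no more tools
        for call in reply["tool_calls"]:
            result = TOOLS[call["name"]](**call["arguments"])
            messages.append({"role": "tool", "content": json.dumps(result)})
    return "step limit reached"
```

Swapping stub_model for a real API call is the only change needed to make this a working assistant; the loop structure stays the same.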

Good tool design matters: clear names, concise descriptions, and well-scoped parameters. Too many tools or vague descriptions and the model gets confused; too few and it can’t do the job. You also need to handle errors (tool failed, timeout) and sometimes rate limits or auth. Passing back “Tool failed: …” as the observation lets the model retry or explain the failure to the user.

From simple APIs (one or two tools) to full autonomy (dozens of tools, multi-step plans), the same pattern holds: the model decides when and how to call; your code enforces safety and runs the call. That separation keeps the LLM in the “reasoning” role and keeps dangerous or privileged actions under your control.

Expect more standardisation (e.g. OpenAPI-based tool discovery) and better models that follow tool schemas more reliably.

nJoy πŸ˜‰

The ReAct Pattern: Think, Act, Observe – How Agents Reason

ReAct is short for Reasoning + Acting. It’s a pattern for getting LLMs to use tools in a structured way: the model outputs a thought (reasoning), then an action (e.g. call this tool with these args); you run the action and feed the result back as the observation. The model then reasons again and either takes another action or gives a final answer. That alternation of reasoning and acting reduces the chance the model “hallucinates” a tool result or skips a step.

In practice you prompt the model with a format like “Thought: … Action: tool_name(args). Observation: …” and parse the output to extract the action. You execute the action (API call, code run, search), append “Observation: <result>” to the conversation, and send it back. The model’s next turn is another Thought and possibly another Action. You keep going until the model outputs “Final Answer: …” or you hit a step limit.
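The parse-execute-append discipline can be shown with a stubbed model. The regex, the search tool, and stub_model are all illustrative assumptions; a real loop would call an LLM API each turn and needs more defensive parsing.

```python
import re


def search(query):
    return "Paris" if "capital" in query else "no results"

ACTIONS = {"search": search}


def stub_model(transcript):
    # Pretends to be the LLM: acts first, answers once it has an observation.
    if "Observation:" not in transcript:
        return 'Thought: I need to look this up.\nAction: search("capital of France")'
    return "Thought: I have the answer.\nFinal Answer: Paris"


def react(question, max_steps=5):
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        turn = stub_model(transcript)
        transcript += "\n" + turn
        if "Final Answer:" in turn:
            return turn.split("Final Answer:")[1].strip()
        match = re.search(r'Action: (\w+)\("([^"]*)"\)', turn)
        if match:                          # execute the action, feed back the result
            tool, arg = match.groups()
            transcript += f"\nObservation: {ACTIONS[tool](arg)}"
    return None                            # step limit hit without a final answer
```

The transcript grows every turn, which previews the token-cost problem discussed below: every Thought, Action, and Observation stays in context.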

ReAct doesn’t require special training; it’s a prompting and parsing discipline. Many agent frameworks (LangChain, AutoGPT-style setups, and custom loops) implement something like it. The key is that the model “sees” the result of each action before deciding the next one, so it can recover from mistakes and chain multiple tools.

The downside is token cost: every thought and action stays in the context, so long tasks can blow up context length. You often summarise or truncate old steps. Also, the model can still output malformed actions or get stuck in loops, so you need timeouts and retries.

ReAct is one of the main patterns that make agents practical today. Expect it to evolve into more structured formats (e.g. JSON actions) and tighter integration with function-calling APIs.

nJoy πŸ˜‰

What Is an AI Agent? Beyond the Chatbot

An AI agent is a system that doesn’t just answer one question and stop: it takes a goal, chooses actions (often by calling tools or APIs), observes what happened, and loops until the goal is met or it gives up. That’s a different beast from a chatbot that runs one prompt and returns one reply. Agents are why you can say “summarise my inbox and draft three replies” and the system actually opens the inbox, reads, calls the LLM, writes drafts, and shows them to you.

The core loop is simple: state (goal, context) → decide (what to do next) → act (run a tool or ask the LLM) → observe (get tool output or model output) → update state → repeat. The hard part is making “decide” and “act” robust: the model has to pick the right tool, pass the right arguments, handle errors, and not loop forever. That’s where frameworks and patterns like ReAct, tool-calling APIs, and guardrails come in.
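The loop above as a skeleton, with decide and act injected as callables and a step limit as the guardrail. All names here are illustrative; in practice decide would be an LLM call and act a tool dispatcher.

```python
def agent_loop(goal, decide, act, max_steps=10):
    state = {"goal": goal, "history": [], "done": False, "result": None}
    for _ in range(max_steps):            # hard cap: never loop forever
        action = decide(state)            # pick the next step (an LLM in practice)
        observation = act(action)         # run a tool or a model call
        state["history"].append((action, observation))  # update state
        if action == "finish":
            state["done"], state["result"] = True, observation
            break
    return state


# Toy policy: gather once, then finish.
decide = lambda s: "finish" if s["history"] else "gather"
act = lambda a: "goal met" if a == "finish" else "some data"
state = agent_loop("demo", decide, act)
```

Keeping the full history in state is also what makes the agent observable: you can replay every decision and tool result when debugging.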

Agents can be single-LLM (one model that reasons and picks tools) or multi-agent (several specialised models or roles that hand off to each other). They can have access to search, code execution, file systems, or custom APIs. The unifying idea is autonomy within a defined scope: you set the goal and the rules; the agent does the steps.

Failure modes are real: the model might call the wrong tool, misparse output, or spin on a task without making progress. Production agents need timeouts, step limits, and human-in-the-loop checkpoints for anything irreversible. The best agents are also observable: you can see the reasoning and tool calls so you can debug and improve them.

Agents are where LLMs move from “helpful assistant” to “autonomous worker.” Expect more tooling, better tool schemas, and clearer patterns for when to use agents versus a single call.

nJoy πŸ˜‰

Quantization Explained: Running 70B Models on Consumer Hardware

A 70B model in full 16-bit precision needs about 140 GB of VRAM. Almost no consumer card has that. Quantization reduces the bit width of the weights (and sometimes activations) so the same model fits in far less memory and runs faster. 8-bit cuts memory in half with a small quality drop; 4-bit (e.g. GPTQ, AWQ, or GGUF Q4_K_M) gets you to roughly a quarter of the size, so a 70B model can run on a 24 GB GPU or a high-end Mac. You’re trading a bit of numerical precision for accessibility.

The math is simple in principle: map float16 weights to a small set of integers (e.g. 0–15 for 4-bit), store those, and at runtime dequantize on the fly or use integer kernels. The art is in how you choose the mapping (per-tensor, per-group, or per-channel) and whether you calibrate on data (GPTQ, AWQ) to minimise error where it matters most. GGUF is a file format that stores quantized weights and metadata so that llama.cpp and others can load them without re-running the quantizer.
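Per-group min/max quantization to 4 bits fits in a few lines of pure Python. This is the naive version of the mapping; real schemes (GPTQ, AWQ, Q4_K_M) pick the mapping more cleverly, but the store-ints-plus-scale idea is the same.

```python
def quantize_group(weights):
    """Map floats to ints 0..15 plus (scale, offset) for dequantization."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0          # 16 levels for 4 bits; avoid div by 0
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo


def dequantize_group(q, scale, lo):
    return [qi * scale + lo for qi in q]


weights = [0.12, -0.33, 0.07, 0.91, -0.5, 0.0, 0.25, -0.1]
q, scale, lo = quantize_group(weights)
restored = dequantize_group(q, scale, lo)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
# Rounding to the nearest level bounds the error by half a step (scale / 2).
```

Smaller groups mean a tighter scale per group and lower error, at the cost of storing more scale/offset metadata; that tradeoff is exactly what formats like Q4_K_M tune.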

In practice you download a pre-quantized model (e.g. from Hugging Face), load it in vLLM, Ollama, or llama.cpp, and run. You might see a small drop in coherence or reasoning on hard tasks; for most chat and tool use it’s fine.

New formats and methods (e.g. 3-bit, mixed precision) will keep pushing the frontier. If you’re on a single machine, quantization is what makes 70B and beyond possible. If you’re in the cloud, it’s what makes those models cheap to serve.

Quantization is the key that unlocked running 70B and larger models on consumer hardware; the next step is making those quantized models even faster and more accurate.

nJoy πŸ˜‰

Continuous Batching: Why Your AI API Is Faster Than You Think

Traditional batch inference works like a bus: you wait until every passenger (request) is ready, then you run one big forward pass. When requests have different lengths or finish at different times, the bus still waits for the slowest. That wastes GPU time and inflates latency. Continuous batching fixes that by treating the batch as fluid: new requests join as soon as there’s room, and requests leave as soon as they’ve produced their last token. So at each step you’re decoding for a set of “active” sequences, not a fixed batch. Throughput goes up and tail latency goes down.

Under the hood, the server maintains a batch of in-flight sequences. Each step: run one decode for every sequence in the batch, append the new token to each, check for EOS or stop conditions, remove finished sequences, and add new ones from the queue. The batch shape changes every step. That requires dynamic shapes and careful memory handling, which is where PagedAttention and similar schemes help. vLLM and TGI both use continuous batching; it’s a big reason they can serve many users at once without turning into a queue.
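The per-step mechanics can be simulated in a few lines. This toy ignores memory and kernels entirely; serve and the (request_id, tokens_to_generate) shape are invented for illustration, with a fixed token budget standing in for decoding until EOS.

```python
from collections import deque


def serve(requests, max_batch=3):
    """requests: list of (request_id, tokens_to_generate). Returns finish order."""
    queue = deque(requests)
    active = {}                            # request_id -> tokens still needed
    finished = []
    while queue or active:
        # Admit new requests as soon as there is room (no waiting for a full batch).
        while queue and len(active) < max_batch:
            rid, need = queue.popleft()
            active[rid] = need
        # One decode step: every active sequence produces one token.
        for rid in list(active):
            active[rid] -= 1
            if active[rid] == 0:           # EOS: the sequence leaves immediately
                del active[rid]
                finished.append(rid)
    return finished


# Short requests finish and free their slots while long ones keep decoding.
order = serve([("a", 2), ("b", 5), ("c", 1), ("d", 1)])
```

Note how "c" and "d" complete without ever waiting for "b": that is the tail-latency win the paragraph above describes.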

For you as a user of an API, it means the server isn’t waiting for other people’s long answers before starting yours. For you as an operator, it means the GPU stays busy and you can set tighter latency targets.

The only downside is implementation complexity and the need for kernels that support variable-length batches. Once that’s in place, continuous batching is the default for any serious serving setup.

Expect continuous batching to become the norm everywhere; the next improvements will be around prioritisation, fairness, and better memory reuse.

nJoy πŸ˜‰

The Inference Framework Showdown: vLLM vs TGI vs Ollama vs llama.cpp

There are four main ways to run LLM inference today, each aimed at a different use case. vLLM is the performance king for multi-user APIs: PagedAttention, continuous batching, and an OpenAI-compatible server. You run it on a GPU server, point clients at it, and scale by adding more replicas. Hugging Face Text Generation Inference (TGI) is in the same league, also batching and an API, with strong support for Hugging Face models and built-in tooling. Choose vLLM when you want maximum throughput and flexibility; choose TGI when you’re already in the HF ecosystem and want a one-command deploy.

Ollama is the “just run it” option on a Mac or PC. You install one binary, run ollama run llama3, and get a local chat and an API. It handles model download, quantization, and a simple server. No batching to speak of (it’s one request at a time), but for dev and personal use that’s fine. llama.cpp is the library underneath many local runners: C++, CPU and GPU, minimal dependencies, and the reference for quantization (GGUF, Q4_K_M, etc.). You use llama.cpp when you’re embedding inference in an app or need maximum control and portability.

Rough rule of thumb: API product or multi-user service → vLLM or TGI. Local tinkering and demos → Ollama. Custom app, embedded, or research → llama.cpp.

The landscape is still moving: new entrants, mergers of ideas (e.g. speculative decoding everywhere), and more focus on latency and cost. Picking one stack now doesn’t lock you in forever, but understanding the tradeoffs helps you ship without over-engineering or under-provisioning.

nJoy πŸ˜‰

vLLM and PagedAttention: The Architecture That Scaled AI APIs

Early LLM APIs hit a wall: GPUs had plenty of compute, but memory for the “KV cache” (the key and value tensors that attention reuses across decode steps) fragmented and wasted space. Different requests had different sequence lengths, so pre-allocating a worst-case block per request either ran out of memory or left big holes. vLLM’s insight was to borrow an idea from operating systems: paging. Treat the KV cache as a pool of fixed-size blocks and allocate only what each request actually needs. That one change drove huge gains in throughput and made long-context serving practical.

PagedAttention works like virtual memory. The logical KV cache for a sequence is split into blocks. A request allocates blocks as it generates; when the request ends, blocks go back to a pool. No more worst-case allocation per request, and no fragmentation in the same way. The attention kernel then has to gather from these non-contiguous blocks, which vLLM does efficiently. The result is that you can pack many more concurrent requests onto one GPU, and latency stays predictable.
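The block-pool bookkeeping can be sketched without any tensors. A toy allocator in the spirit of PagedAttention; BlockPool and its methods are invented names, and the 16-token block size is an assumption modelled on vLLM's commonly used default.

```python
BLOCK_TOKENS = 16  # tokens stored per block (vLLM commonly uses 16)


class BlockPool:
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))
        self.tables = {}    # seq_id -> list of block ids (the "page table")
        self.lengths = {}   # seq_id -> tokens generated so far

    def append_token(self, seq_id):
        table = self.tables.setdefault(seq_id, [])
        n = self.lengths.get(seq_id, 0)
        if n % BLOCK_TOKENS == 0:          # current block full: allocate another
            if not self.free:
                raise MemoryError("pool exhausted; preempt or queue the request")
            table.append(self.free.pop())
        self.lengths[seq_id] = n + 1

    def release(self, seq_id):
        # Finished sequence: all of its blocks go straight back to the pool.
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)


pool = BlockPool(num_blocks=8)
for _ in range(20):                        # 20 tokens -> 2 blocks (16 + 4)
    pool.append_token("req-1")
```

Because allocation happens one block at a time as the sequence grows, the worst case wasted per sequence is one partially filled block, instead of a whole worst-case-length reservation.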

The rest of vLLM is built around this: continuous batching (add new requests and decode steps as they’re ready), CUDA graphs and custom kernels to reduce overhead, and an OpenAI-compatible API so existing clients and apps work. It’s open source and has become the default choice for many teams serving Llama, Mistral, and similar models in production.

You do pay a small cost: the gather step and block management add some complexity and a bit of overhead compared to a single-request path. For short, single-user use cases, a simpler engine might be fine. For multi-tenant APIs and high utilisation, PagedAttention is what makes the numbers work.

Expect more variants: different block sizes, eviction policies for “infinite” context, and tighter integration with quantized and speculative decoding.

nJoy πŸ˜‰