Streamable HTTP and SSE: Building Remote MCP Servers

stdio works beautifully for local tools, but the moment you want to share an MCP server across a network – serving multiple clients, deploying to a container, integrating with a third-party host over the internet – you need HTTP transport. The MCP specification defines Streamable HTTP as the canonical remote transport: HTTP POST for client-to-server requests, Server-Sent Events (SSE) for server-to-client streaming. This lesson covers the protocol mechanics, the implementation pattern, and the hard-won lessons about making streaming work reliably in production.

Streamable HTTP: POST for requests, SSE for streaming responses – the MCP remote transport standard.

The Streamable HTTP Protocol

The Streamable HTTP transport uses a single HTTP endpoint (typically /mcp) and the following request flow:

  • Client to server: HTTP POST with Content-Type: application/json containing the JSON-RPC message(s). Once the session is initialised, the client must include the session ID header (mcp-session-id) on every subsequent request.
  • Server to client (immediate response): HTTP 200 with Content-Type: application/json containing the JSON-RPC response, used for a simple request–response exchange with no streaming.
  • Server to client (streaming): HTTP 200 with Content-Type: text/event-stream (SSE). The server keeps the connection open and pushes events as they are produced. This is used for long-running tools, progress notifications, and sampling requests.
  • Server to client (unsolicited): HTTP GET to the MCP endpoint opens a standalone SSE stream that the server uses to push unsolicited notifications (resource updates, tool list changes, etc.).
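The SSE leg of this flow is plain text framing: each event is an optional `id:` line, an `event:` line, one or more `data:` lines, and a terminating blank line. A minimal sketch of that framing (an illustrative helper, not part of the SDK, which does this for you):

```javascript
// Frame a JSON-RPC message as a Server-Sent Event. Illustrative helper only -
// the SDK transports handle this framing internally.
function toSseEvent(message, eventId) {
  const lines = [];
  if (eventId !== undefined) lines.push(`id: ${eventId}`); // enables stream resumability
  lines.push('event: message');
  // SSE requires one `data:` line per line of payload text
  for (const line of JSON.stringify(message).split('\n')) {
    lines.push(`data: ${line}`);
  }
  return lines.join('\n') + '\n\n'; // a blank line terminates the event
}

const frame = toSseEvent({ jsonrpc: '2.0', id: 1, result: { ok: true } }, 7);
console.log(frame);
```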

// Client side: use the Streamable HTTP transport
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamableHttp.js';

const client = new Client(
  { name: 'my-http-host', version: '1.0.0' },
  { capabilities: {} }
);

const transport = new StreamableHTTPClientTransport(
  new URL('https://my-mcp-server.example.com/mcp')
);

await client.connect(transport);

const tools = await client.listTools();
console.log('Available tools:', tools.tools.map(t => t.name));

“The HTTP with SSE transport uses Server-Sent Events for server-to-client streaming while using HTTP POST for client-to-server communication. This allows servers to stream results and send notifications to clients.” – MCP Documentation, Transports

Building a Streamable HTTP Server

The MCP SDK provides a StreamableHTTPServerTransport that handles all the protocol mechanics. You attach it to any HTTP server framework – Express, Hono, Fastify, or Node’s built-in http module.

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import { z } from 'zod';
import express from 'express';
import crypto from 'node:crypto'; // explicit import so crypto.randomUUID() works on older Node versions

const app = express();
app.use(express.json());

const server = new McpServer({ name: 'remote-server', version: '1.0.0' });

server.tool(
  'get_weather',
  'Gets current weather for a city',
  { city: z.string().describe('City name') },
  async ({ city }) => {
    const data = await fetchWeather(city);
    return {
      content: [{ type: 'text', text: `${city}: ${data.temp}Β°C, ${data.condition}` }],
    };
  }
);

// Session management: one transport per client session
const sessions = new Map();

app.post('/mcp', async (req, res) => {
  const sessionId = req.headers['mcp-session-id'];

  let transport;
  if (sessionId && sessions.has(sessionId)) {
    transport = sessions.get(sessionId);
  } else {
    // New session
    transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: () => crypto.randomUUID(),
      onsessioninitialized: (id) => sessions.set(id, transport),
    });
    await server.connect(transport);
  }

  await transport.handleRequest(req, res);
});

app.get('/mcp', async (req, res) => {
  const sessionId = req.headers['mcp-session-id'];
  const transport = sessions.get(sessionId);
  if (!transport) return res.status(404).send('Session not found');
  await transport.handleRequest(req, res);
});

app.delete('/mcp', async (req, res) => {
  const sessionId = req.headers['mcp-session-id'];
  const transport = sessions.get(sessionId);
  if (transport) await transport.close(); // tear down any open SSE stream
  sessions.delete(sessionId);
  res.status(200).send('Session terminated');
});

app.listen(3000, () => console.error('MCP HTTP server running on :3000'));
HTTP session management: each client gets a session ID, mapped to its own transport instance.

SSE Streaming in Practice

When a tool produces results progressively (e.g. a long-running data processing job), the server can stream intermediate progress via SSE notifications before sending the final result:

server.tool(
  'process_large_dataset',
  'Processes a large dataset with progress streaming',
  { dataset_id: z.string(), chunk_size: z.number().default(1000) },
  async ({ dataset_id, chunk_size }, { server: serverInstance }) => {
    const dataset = await loadDataset(dataset_id);
    const totalRows = dataset.length;
    let processed = 0;

    for (let i = 0; i < dataset.length; i += chunk_size) {
      const chunk = dataset.slice(i, i + chunk_size);
      await processChunk(chunk);
      processed += chunk.length;

      // Stream progress via SSE notification
      await serverInstance.server.notification({
        method: 'notifications/progress',
        params: {
          progressToken: dataset_id,
          progress: processed,
          total: totalRows,
        },
      });
    }

    return {
      content: [{
        type: 'text',
        text: `Processed ${processed} rows from dataset ${dataset_id}`,
      }],
    };
  }
);

Failure Modes with Streamable HTTP

Case 1: No Session Management - One Transport for All Clients

Creating a single global transport instance and sharing it across all HTTP requests corrupts all sessions. Each client connection needs its own transport instance.

// WRONG: Single global transport - all sessions corrupt each other
const globalTransport = new StreamableHTTPServerTransport({ ... });
await server.connect(globalTransport);

app.post('/mcp', async (req, res) => {
  await globalTransport.handleRequest(req, res); // All clients share state - WRONG
});

// CORRECT: Per-session transport instances (as shown above)

Case 2: SSE Connection Not Kept Alive

SSE connections must be kept open by the server for the duration of the session. Intermediate proxies (nginx, load balancers, CDNs) may buffer responses or close idle connections. Set appropriate headers and configure proxy timeouts.

// When using Express with SSE, set headers to prevent buffering
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('X-Accel-Buffering', 'no'); // Nginx: disable buffering
res.setHeader('Connection', 'keep-alive');

// For nginx: proxy_read_timeout 3600s; proxy_buffering off;
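Beyond headers and proxy timeouts, a common belt-and-braces measure is a periodic SSE comment: lines starting with `:` are ignored by clients but keep intermediaries from treating the connection as idle. A sketch (the helper name is ours, not the SDK's):

```javascript
// Keep-alive sketch: SSE comment lines (starting with ':') are ignored by
// clients but stop proxies from closing an idle connection.
function startSseHeartbeat(res, intervalMs = 15_000) {
  res.write(': keep-alive\n\n'); // send one comment immediately
  const timer = setInterval(() => res.write(': keep-alive\n\n'), intervalMs);
  timer.unref?.(); // do not keep the process alive just for the heartbeat
  res.on?.('close', () => clearInterval(timer)); // stop when the client disconnects
  return timer;
}
```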

What to Check Right Now

  • Test with curl - send a raw HTTP POST to your server: curl -X POST http://localhost:3000/mcp -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"initialize",...}'
  • Verify SSE with the browser - open your /mcp GET endpoint in a browser with DevTools open. The Network tab should show the SSE stream with events appearing in real time.
  • Configure nginx for SSE - in any production deployment, add proxy_buffering off and proxy_read_timeout 3600s to your nginx location block for the MCP endpoint.
  • Implement session cleanup - sessions that are never explicitly terminated will accumulate. Add a TTL or a periodic cleanup job to the sessions Map.
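The last check can be sketched directly against a sessions Map like the one in the server example (the TTL value and helper names are illustrative):

```javascript
// TTL sweep sketch for the sessions Map (names and TTL are illustrative).
const SESSION_TTL_MS = 30 * 60 * 1000; // 30 minutes of inactivity

const sessions = new Map(); // sessionId -> { transport, lastSeen }

function touchSession(id, transport) {
  sessions.set(id, { transport, lastSeen: Date.now() });
}

function sweepSessions(now = Date.now()) {
  for (const [id, entry] of sessions) {
    if (now - entry.lastSeen > SESSION_TTL_MS) {
      entry.transport.close?.(); // let the transport tear down any open SSE stream
      sessions.delete(id);
    }
  }
}

// Sweep once a minute; unref() keeps the timer from holding the process open.
setInterval(() => sweepSessions(), 60_000).unref();
```

Call touchSession on every request a session makes, so only genuinely idle sessions get swept.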

nJoy πŸ˜‰

MCP stdio Transport: The Local Standard and When to Use It

The transport layer is what carries JSON-RPC messages between client and server. MCP defines multiple transports, and choosing the right one for your use case is the first architectural decision you make when building a server. The stdio transport – using standard input and standard output – is the right choice for local, on-machine server processes, and it is the most widely deployed transport in the MCP ecosystem today. This lesson covers what it is, how it works, when to use it, and when not to.

stdio transport: the host launches the server as a subprocess and communicates over stdin/stdout pipes.

How stdio Transport Works

With stdio transport, the host launches the MCP server as a child process. JSON-RPC messages are sent to the server via its stdin and received from the server via its stdout. Each message is delimited by a newline character. The server’s stderr is typically forwarded to the host’s logs for debugging. The server process lives for as long as the client needs it and is terminated when the client disconnects or the host exits.

This is a well-understood pattern in Unix tooling – it is how shells pipe data between commands (cat file | grep pattern | wc -l). MCP adopts it for the same reason: simplicity, no network setup required, OS-managed process isolation, and easy integration with any host that can launch subprocesses.
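The newline framing is simple enough to sketch. This is an illustrative reader, roughly what StdioServerTransport does internally when stdin chunks split a message mid-stream:

```javascript
// Minimal newline-delimited JSON-RPC reader (illustrative; the SDK's
// StdioServerTransport implements this buffering for you).
function createMessageReader(onMessage) {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk; // stdin chunks can split a message anywhere
    let newline;
    while ((newline = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line) onMessage(JSON.parse(line));
    }
  };
}

const seen = [];
const feed = createMessageReader(msg => seen.push(msg));
feed('{"jsonrpc":"2.0","id":1,"met'); // a partial chunk is buffered, not parsed
feed('hod":"ping"}\n');               // the newline completes the message
// seen now holds one parsed message with method 'ping'
```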

// Server side: connect to stdio transport
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';

const server = new McpServer({ name: 'my-server', version: '1.0.0' });
// ... register tools, resources, prompts ...

const transport = new StdioServerTransport();
await server.connect(transport);
// Server is now listening on stdin, writing to stdout

// Client side: launch server as subprocess
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

const client = new Client({ name: 'my-host', version: '1.0.0' }, { capabilities: {} });

const transport = new StdioClientTransport({
  command: 'node',                    // The command to launch the server
  args: ['./server.js'],              // Arguments
  env: {                              // Environment variables for the subprocess
    ...process.env,
    DATABASE_URL: process.env.DATABASE_URL,
  },
  cwd: '/path/to/project',           // Working directory (optional)
});

await client.connect(transport);
// The transport has launched server.js as a subprocess
// and established stdin/stdout communication

“The stdio transport is ideal for local integrations and command-line tools. It allows processes to communicate through standard input and output streams, making it simple to implement and easy to debug.” – MCP Documentation, Transports

stdio in Configuration Files

Most MCP hosts (Claude Desktop, VS Code extensions, Cursor) use a configuration file that lists servers with their launch commands. The host reads this file and launches each server as a stdio subprocess when needed. Understanding this format is essential for distributing your MCP server.

// claude_desktop_config.json format
{
  "mcpServers": {
    "my-database-server": {
      "command": "node",
      "args": ["/path/to/db-server.js"],
      "env": {
        "DATABASE_URL": "postgresql://localhost:5432/mydb"
      }
    },
    "my-file-server": {
      "command": "npx",
      "args": ["@myorg/mcp-file-server"],
      "env": {}
    }
  }
}
The standard MCP server configuration format used by Claude Desktop, VS Code, and other hosts.

stdio vs HTTP Transport: When to Use Each

Factor               stdio                     HTTP/SSE
------               -----                     --------
Deployment           Local machine only        Local or remote
Multiple clients     One client per process    Many concurrent clients
Network setup        None required             Ports, TLS, CORS
Security isolation   OS process isolation      Network + auth required
Sharing              Not shareable             Shareable across team/internet
State persistence    Lives with host process   Independent lifetime
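Distilled into a rule of thumb (an illustrative helper, not an API):

```javascript
// Illustrative decision helper distilling the comparison above.
function chooseTransport({ remote = false, manyClients = false, shared = false } = {}) {
  // Any networked requirement pushes you to Streamable HTTP;
  // otherwise stdio's zero-setup subprocess model wins.
  return remote || manyClients || shared ? 'streamable-http' : 'stdio';
}

console.log(chooseTransport());                      // 'stdio'
console.log(chooseTransport({ manyClients: true })); // 'streamable-http'
```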

Failure Modes with stdio

Case 1: Writing to stdout from Server Code

The most common stdio failure. Anything written to stdout by the server process becomes part of the JSON-RPC stream and corrupts the protocol. Use stderr for all logging.

// WRONG: console.log goes to stdout and corrupts the JSON-RPC stream
console.log('Server started');
console.log('Processing request...');

// CORRECT: Use stderr for all server-side output
console.error('Server started');
process.stderr.write('Processing request...\n');

// OR: Use the MCP logging notification capability
await server.server.sendLoggingMessage({ level: 'info', data: 'Server started' });

Case 2: Blocking the Event Loop in stdio Server

stdio servers run in a single Node.js process. If a tool handler blocks the event loop (a synchronous file read, a tight computation loop), every other request to the server queues up and times out. Always use async I/O in tool handlers.

// WRONG: Synchronous file read blocks event loop
server.tool('read_large_file', '...', { path: z.string() }, ({ path }) => {
  const content = fs.readFileSync(path); // BLOCKS the event loop
  return { content: [{ type: 'text', text: content }] };
});

// CORRECT: Async I/O
server.tool('read_large_file', '...', { path: z.string() }, async ({ path }) => {
  const content = await fs.promises.readFile(path, 'utf8'); // Non-blocking
  return { content: [{ type: 'text', text: content }] };
});

What to Check Right Now

  • Run your server through cat – a quick sanity check: echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","clientInfo":{"name":"test","version":"1.0"},"capabilities":{}}}' | node server.js. You should see a JSON-RPC response on stdout and any logs on stderr.
  • Check for stdout pollution – search your server code for console.log and replace with console.error. Any package that logs to stdout will also cause issues.
  • Use the Inspector as a stdio test harness – npx @modelcontextprotocol/inspector node server.js gives you a complete GUI client for your stdio server.
  • Handle SIGTERM gracefully – when the host terminates your server, it sends SIGTERM. Handle it to close database connections and flush logs: process.on('SIGTERM', cleanup).
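The last check, sketched out (the resources being closed are hypothetical placeholders; substitute your own):

```javascript
// Graceful-shutdown sketch for a stdio server. The commented-out resources
// are hypothetical - close whatever your server actually holds open.
let shuttingDown = false;

async function cleanup() {
  if (shuttingDown) return; // guard against SIGTERM and SIGINT both firing
  shuttingDown = true;
  console.error('[server] shutting down - closing resources'); // stderr, never stdout
  // await db.close();   // e.g. close connection pools (hypothetical)
  // await logFlush();   // e.g. flush buffered logs (hypothetical)
  process.exit(0);
}

process.on('SIGTERM', cleanup);
process.on('SIGINT', cleanup);
```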

nJoy πŸ˜‰

MCP Roots: Filesystem and URI Boundaries

A server that can read any file anywhere on the filesystem is a security disaster waiting to happen. Roots are MCP’s answer to the containment problem: a mechanism for clients to tell servers exactly which filesystem paths and URIs they are permitted to access. It is not just a security feature – it is also a scoping feature. Roots let the host say “this AI assistant is allowed to work with the files in this project directory”, giving the server a clear operational boundary without restricting it to a fixed list of resources.

Roots: the host tells the server which paths it is allowed to work with, enforcing operational boundaries.

What Roots Are

A root is a URI that defines a boundary of the client’s environment that the server may access. A root most commonly represents a directory on the filesystem (file:///Users/alice/my-project), but it can also be any URI scheme meaningful to the server (https://api.mycompany.com/v1, git://my-org/my-repo). The server should limit its operations to within the URIs provided as roots.

Roots flow from client to server: the client announces its roots when the server requests them via the roots/list method. The client can also notify the server when roots change via the roots/list_changed notification.

// Client: declare roots capability
const client = new Client(
  { name: 'my-ide', version: '1.0.0' },
  {
    capabilities: {
      roots: {
        listChanged: true,  // Client will notify when roots change
      },
    },
  }
);

// Client: respond to roots/list requests from the server
import { ListRootsRequestSchema } from '@modelcontextprotocol/sdk/types.js';

client.setRequestHandler(ListRootsRequestSchema, async () => ({
  roots: [
    {
      uri: 'file:///Users/alice/my-project',
      name: 'My Project',
    },
    {
      uri: 'file:///Users/alice/shared-libs',
      name: 'Shared Libraries',
    },
  ],
}));

// Notify servers when the workspace changes (e.g. user opens a different project)
await client.sendRootsListChanged();
Roots create operational zones: the server is guided to stay within declared URIs and avoid everything else.

Server-Side Roots Usage

On the server side, you request the current roots at startup or whenever you need to know the operational scope. Use roots to validate that requested resource URIs fall within allowed boundaries before accessing them.

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';
import path from 'node:path';
import fs from 'node:fs/promises';
import { fileURLToPath } from 'node:url';

const server = new McpServer({ name: 'fs-server', version: '1.0.0' });

// Helper: check if a path is within any declared root
async function isWithinRoots(targetPath, serverInstance) {
  const { roots } = await serverInstance.listRoots();
  const fileRoots = roots
    .filter(r => r.uri.startsWith('file://'))
    .map(r => fileURLToPath(r.uri));

  const normalised = path.resolve(targetPath);
  return fileRoots.some(root => {
    const resolvedRoot = path.resolve(root);
    // Append the separator so /workspace does not also match /workspace-evil
    return normalised === resolvedRoot || normalised.startsWith(resolvedRoot + path.sep);
  });
}

server.tool(
  'read_file',
  'Reads a file from within the allowed workspace roots',
  { file_path: z.string().describe('Path to the file to read') },
  async ({ file_path }, { server: serverInstance }) => {
    // Check the path is within declared roots before reading
    const allowed = await isWithinRoots(file_path, serverInstance);
    if (!allowed) {
      return {
        isError: true,
        content: [{
          type: 'text',
          text: `Access denied: ${file_path} is outside the allowed workspace roots.`,
        }],
      };
    }

    const content = await fs.readFile(file_path, 'utf8');
    return { content: [{ type: 'text', text: content }] };
  }
);

“Roots represent URI boundaries that define the scope of client access. Servers SHOULD use roots as guidance for what resources and operations to offer, respecting the boundaries set by the client.” – MCP Specification, Roots

Roots for Non-Filesystem URIs

Roots are not limited to file paths. Any URI scheme can be a root, which allows hosts to scope server access to particular API endpoints, repository namespaces, or any other URI-addressed resource space.

// API roots example
client.setRequestHandler(ListRootsRequestSchema, async () => ({
  roots: [
    { uri: 'https://api.mycompany.com/v1/projects/42', name: 'Project 42 API' },
    { uri: 'https://api.mycompany.com/v1/users/me', name: 'My User API' },
  ],
}));

// The server checks that any API call it makes is within these URIs:
async function isApiAllowed(endpoint, serverInstance) {
  const { roots } = await serverInstance.listRoots();
  // Boundary-aware: .../projects/42 must not also match .../projects/421
  return roots.some(r => endpoint === r.uri || endpoint.startsWith(r.uri + '/'));
}

Failure Modes with Roots

Case 1: Server Ignoring Roots Entirely

Roots are advisory in the current spec – the protocol does not force enforcement on the server. This means a badly implemented server can simply ignore the roots and access anything it wants. In a security-conscious deployment, the host should use OS-level sandboxing (chroot, Docker volumes, seccomp filters) to enforce the boundaries that roots only hint at.

// RISKY: Server trusts roots only, no OS enforcement
// A malicious or buggy server can bypass this
const allowed = await isWithinRoots(userProvidedPath, serverInstance);
if (allowed) await fs.readFile(userProvidedPath); // Only guarded by protocol hint

// SAFER: Add OS-level enforcement too
// Run the server process in a Docker container with volume mounts limited to the root dirs:
// docker run --volume /Users/alice/my-project:/workspace:ro my-mcp-server

Case 2: Not Handling roots/list_changed

If the user changes the active workspace (opens a different project, switches repositories), the client sends roots/list_changed. If the server caches the roots from startup and ignores this notification, it will use stale root information for all subsequent operations.

// Handle roots change notifications
import { RootsListChangedNotificationSchema } from '@modelcontextprotocol/sdk/types.js';

server.server.setNotificationHandler(
  RootsListChangedNotificationSchema,
  async () => {
    // Invalidate cached roots so the next check re-fetches them
    cachedRoots = null;
    console.error('[server] Roots changed - refreshing scope');
  }
);

What to Check Right Now

  • Declare roots capability on your clients – if you build a host that has a concept of a workspace or project, declare roots and implement the handler. This is what makes your server integration “workspace-aware”.
  • Validate paths against roots in every file-touching tool – add the isWithinRoots check to every tool that reads or writes files. Do this before any fs.readFile or fs.writeFile call.
  • Test path traversal attempts – try passing ../../../etc/passwd to a file-reading tool and verify the roots check catches it.
  • Combine roots with OS isolation – in production, run server processes in containers with volume mounts restricted to the declared roots. Advisory protocol constraints are not a substitute for OS-level isolation.
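The traversal test from the checklist can be exercised against a dependency-free sketch of the same boundary check (POSIX paths assumed; the server example earlier uses node:path for the real thing):

```javascript
// Dependency-free, POSIX-only sketch of the roots boundary check.
function resolvePosix(base, p) {
  const start = p.startsWith('/') ? [] : base.split('/').filter(Boolean);
  const parts = [...start];
  for (const seg of p.split('/')) {
    if (!seg || seg === '.') continue;
    if (seg === '..') parts.pop(); // '..' climbs upward - this is what traversal abuses
    else parts.push(seg);
  }
  return '/' + parts.join('/');
}

function withinRoot(rootDir, requestedPath) {
  const root = resolvePosix('/', rootDir);
  const target = resolvePosix(root, requestedPath);
  // Boundary-aware prefix check: /workspace must not match /workspace-evil
  return target === root || target.startsWith(root + '/');
}

console.log(withinRoot('/workspace', 'src/index.js'));      // true
console.log(withinRoot('/workspace', '../../etc/passwd'));  // false
console.log(withinRoot('/workspace', '/workspace-evil/x')); // false
```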

nJoy πŸ˜‰

MCP Elicitation: Asking the User for Input from Inside a Server

Some tools cannot complete without asking the user a question. “Which account should I debit?” “What is the date range for this report?” “Are you sure you want to delete all 200 records?” These are not questions an LLM should guess at. They require explicit, structured input from the human in the loop. Elicitation is MCP’s mechanism for this – it lets a server, while handling a request, ask the user for information through the client’s UI, wait for the answer, and then continue. It is a synchronous human-in-the-loop pattern baked into the protocol.

Elicitation: the server pauses tool execution, asks the user a structured question, and resumes with the answer.

How Elicitation Works

When a server needs user input, it sends an elicitation/create request to the client. The request includes a message (explaining what information is needed) and a JSON schema describing the expected response format. The client presents this to the user in whatever UI is appropriate – a dialog box, a prompt, a form – and returns the user’s structured response. The server receives the answer and continues processing.

Elicitation is a client capability – the server can only use it if the client declared elicitation support during initialisation. If the client does not support elicitation, the server must handle the lack of user input gracefully (skip the step, use defaults, or return an error with a clear explanation).

// Client: declare elicitation support
const client = new Client(
  { name: 'my-host', version: '1.0.0' },
  {
    capabilities: {
      elicitation: {},  // Required for server-side elicitation
    },
  }
);

// Client: implement the elicitation handler
import { ElicitRequestSchema } from '@modelcontextprotocol/sdk/types.js';

client.setRequestHandler(ElicitRequestSchema, async (request) => {
  const { message, requestedSchema } = request.params;

  // Show UI to user - implementation is host-specific
  const userResponse = await showElicitationDialog(message, requestedSchema);

  if (userResponse === null) {
    // User dismissed/cancelled
    return { action: 'cancel' };
  }

  return {
    action: 'accept',
    content: userResponse,  // Must match requestedSchema
  };
});
Elicitation schemas: flat JSON schemas that define the exact structure of the user’s expected answer.

Server-Side Elicitation Usage

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';

const server = new McpServer({ name: 'payment-server', version: '1.0.0' });

server.tool(
  'process_payment',
  'Processes a payment transaction with user confirmation',
  {
    amount: z.number().positive().describe('Amount in USD'),
    recipient: z.string().describe('Recipient name or email'),
  },
  async ({ amount, recipient }, { server: serverInstance }) => {
    // Elicit confirmation before processing
    const confirmation = await serverInstance.elicitInput({
      message: `You are about to send $${amount.toFixed(2)} to ${recipient}. Please confirm the payment account:`,
      requestedSchema: {
        type: 'object',
        properties: {
          account_id: {
            type: 'string',
            description: 'Your account ID to debit (format: ACC-XXXXXXXX)',
          },
          confirmed: {
            type: 'boolean',
            description: 'Confirm you want to proceed with this payment',
          },
        },
        required: ['account_id', 'confirmed'],
      },
    });

    if (confirmation.action !== 'accept') {
      // Covers both 'cancel' (dialog dismissed) and 'decline' (user said no)
      return { content: [{ type: 'text', text: 'Payment cancelled by user.' }] };
    }

    if (!confirmation.content.confirmed) {
      return { content: [{ type: 'text', text: 'Payment declined - user did not confirm.' }] };
    }

    const result = await processPayment({
      amount,
      recipient,
      accountId: confirmation.content.account_id,
    });

    return { content: [{ type: 'text', text: `Payment ${result.transactionId} processed successfully.` }] };
  }
);

“Elicitation allows servers to request additional information from users during tool execution. This enables interactive workflows where user input is needed to complete tasks, while maintaining a clear separation between the AI model and the human oversight layer.” – MCP Specification, Elicitation

Elicitation Schema Constraints

The schema for an elicitation request is deliberately restricted compared to full JSON Schema. This is intentional – the schema must be renderable by any client UI, which means it cannot be arbitrarily complex. The spec defines a “flat” schema: a single object with primitive properties (string, number, boolean, or enum). No nested objects, no arrays, no $ref references.

// VALID elicitation schema - flat, primitive properties only
{
  type: 'object',
  properties: {
    name: { type: 'string', description: 'Your full name' },
    age: { type: 'number', description: 'Your age in years' },
    agree_to_terms: { type: 'boolean', description: 'Do you agree to the terms?' },
    plan: { type: 'string', enum: ['basic', 'pro', 'enterprise'], description: 'Choose a plan' },
  },
  required: ['name', 'agree_to_terms'],
}

// INVALID: Nested objects not allowed in elicitation schemas
{
  type: 'object',
  properties: {
    address: {
      type: 'object',  // This will fail - nested objects not permitted
      properties: { street: { type: 'string' } },
    },
  },
}
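A quick validator for the flat-schema constraint can catch invalid schemas before they reach a client. This is our own sketch (clients may enforce the constraint differently):

```javascript
// Illustrative check that a schema satisfies the flat elicitation constraint:
// a single object whose properties are all primitives or enums.
function isFlatElicitationSchema(schema) {
  if (!schema || schema.type !== 'object' || typeof schema.properties !== 'object') {
    return false;
  }
  return Object.values(schema.properties).every(prop => {
    if (Array.isArray(prop.enum)) return true; // enum of primitive choices
    return ['string', 'number', 'integer', 'boolean'].includes(prop.type);
  });
}

console.log(isFlatElicitationSchema({
  type: 'object',
  properties: { name: { type: 'string' }, age: { type: 'number' } },
})); // true

console.log(isFlatElicitationSchema({
  type: 'object',
  properties: { address: { type: 'object', properties: {} } },
})); // false - nested object
```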

Failure Modes with Elicitation

Case 1: Not Handling cancel Action

Users can always cancel an elicitation. If your tool handler does not check for the cancel action, it will try to proceed with undefined data and crash or produce garbage output.

// BAD: No cancel check
const response = await serverInstance.elicitInput({ ... });
await processData(response.content.value); // Will throw if action was 'cancel'

// GOOD: Always check the action
const response = await serverInstance.elicitInput({ ... });
if (response.action !== 'accept') {
  return { content: [{ type: 'text', text: 'Action cancelled.' }] };
}
await processData(response.content.value);

Case 2: Using Elicitation When a Tool Parameter Would Suffice

Elicitation is for input the server cannot know at design time – confirmation of a specific action, a password, runtime context. If the information can be a tool argument, make it a tool argument. Elicitation adds a round-trip to the user and breaks the automated flow.

// WRONG: Using elicitation for something that should be an arg
server.tool('delete_record', '...', { id: z.string() }, async ({ id }, ctx) => {
  const confirm = await ctx.server.elicitInput({ message: 'Confirm deletion?', ... });
  // Should this really need an interactive prompt? Or is --confirm a better pattern?
});

// BETTER: Use tool annotations + let the host handle confirmation
server.tool(
  'delete_record',
  'Permanently deletes a record',
  { id: z.string(), confirm: z.boolean().describe('Set true to confirm permanent deletion') },
  { destructiveHint: true },  // ToolAnnotations: hints that this tool is destructive
  async ({ id, confirm }) => {
    if (!confirm) return { isError: true, content: [{ type: 'text', text: 'Set confirm=true to proceed.' }] };
    await db.delete(id);
    return { content: [{ type: 'text', text: `Deleted ${id}` }] };
  }
);

What to Check Right Now

  • Map your interactive flows – identify any workflow in your application that requires user input mid-execution. These are elicitation candidates.
  • Keep schemas flat – validate your elicitation schemas against the spec constraints: flat object, primitive values only, no nested objects or arrays.
  • Always handle cancel and decline – every elicitation can result in cancel (user dismissed) or decline (user responded negatively). Handle all three outcomes.
  • Check client support first – before calling elicitInput, verify the client declared the elicitation capability (getClientCapabilities()?.elicitation on the server side). If the client does not support it, fall back to tool-argument-based confirmation.
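The cancel/decline/accept handling from these checks can be centralised in one small helper (the helper name is ours, for illustration):

```javascript
// Normalise the three elicitation outcomes so tool handlers can switch on
// a single value (helper name is illustrative, not from the SDK).
function interpretElicitation(response) {
  if (response.action === 'cancel') return 'cancelled';  // user dismissed the dialog
  if (response.action === 'decline') return 'declined';  // user answered negatively
  if (response.action === 'accept') return 'accepted';
  return 'declined'; // treat anything unexpected as a refusal
}

console.log(interpretElicitation({ action: 'accept', content: { confirmed: true } })); // 'accepted'
console.log(interpretElicitation({ action: 'cancel' }));                               // 'cancelled'
```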

nJoy πŸ˜‰

MCP Sampling: Server-Initiated LLM Calls and Recursive AI

Here is the mind-bending part of MCP: servers can ask the LLM for help. In the standard model, the flow is one-way – host calls LLM, LLM calls tool, tool runs on server, result goes back. Sampling reverses one arrow. It lets a server, while handling a request, ask the host’s LLM to generate text – and then use that generated text in its response. This is recursive AI, and it is what enables genuinely intelligent MCP servers that reason about their own actions.

Sampling: the server requests an LLM inference from the client, enabling server-side reasoning loops.

The Sampling Flow

Sampling works as follows: a server handling a tool call decides it needs to “think” before it can respond. It sends a sampling/createMessage request to the client. The client receives this, shows the pending sampling request to the user (or approves it automatically based on policy), then calls the actual LLM API, and returns the result to the server. The server uses the result to complete its work and returns the final tool response to the original caller.

The critical point: the server does not know which LLM the client is using. It just asks for “a language model response” and gets back generated text. This maintains provider-agnosticism even for server-side reasoning.

// Client configuration to enable sampling
const client = new Client(
  { name: 'my-host', version: '1.0.0' },
  {
    capabilities: {
      sampling: {},  // Must declare this to receive sampling requests from servers
    },
  }
);

// Client must handle incoming sampling requests
import { CreateMessageRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import OpenAI from 'openai';

client.setRequestHandler(CreateMessageRequestSchema, async (request) => {
  const { messages, maxTokens, temperature } = request.params;

  // Here the host calls its actual LLM
  const openai = new OpenAI();
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: messages.map(m => ({
      role: m.role,
      content: typeof m.content === 'string' ? m.content : m.content.text,
    })),
    max_tokens: maxTokens ?? 1000,
    temperature: temperature ?? 0.7, // ?? (not ||) so an explicit 0 is respected
  });

  return {
    role: 'assistant',
    content: { type: 'text', text: response.choices[0].message.content },
    model: 'gpt-4o',
    stopReason: 'endTurn',
  };
});
MCP server using sampling to reason about its own tool execution with request loop diagram dark
A server using sampling to analyse data before returning a structured response.

Server-Side Sampling Usage

On the server side, you request sampling through the server’s sampling capability. Here is a server that uses sampling to classify user intent before deciding which database to query:

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';

const server = new McpServer({ name: 'smart-search', version: '1.0.0' });

server.tool(
  'intelligent_search',
  'Searches across databases, routing the query based on intent',
  { query: z.string().describe('The search query') },
  async ({ query }, { server: serverInstance }) => {
    // Use sampling to classify the query intent
    const classification = await serverInstance.createMessage({
      messages: [{
        role: 'user',
        content: {
          type: 'text',
          text: `Classify this search query into one of: products, users, orders, docs.\nQuery: "${query}"\nRespond with only the category name.`,
        },
      }],
      maxTokens: 10,
    });

    const category = classification.content.text.trim().toLowerCase();

    // Route to the appropriate search function
    let results;
    switch (category) {
      case 'products': results = await searchProducts(query); break;
      case 'users': results = await searchUsers(query); break;
      case 'orders': results = await searchOrders(query); break;
      default: results = await searchDocs(query);
    }

    return { content: [{ type: 'text', text: JSON.stringify(results) }] };
  }
);

“Sampling allows servers to request LLM completions through the client, enabling sophisticated agentic behaviors while maintaining security through human oversight. The client retains control over which model is used and what requests are permitted.” – MCP Documentation, Sampling

Sampling Parameters

The sampling/createMessage request supports model preferences and sampling parameters. These are preferences, not requirements – the client may choose to ignore them if they conflict with its policy or available models.

const response = await serverInstance.createMessage({
  messages: [{ role: 'user', content: { type: 'text', text: 'Summarise in one sentence.' } }],
  maxTokens: 100,
  temperature: 0.3,           // Lower = more deterministic
  modelPreferences: {
    hints: [{ name: 'claude-3-5-haiku' }], // Preferred model - client may ignore
    costPriority: 0.8,         // 0-1: prefer cheaper models
    speedPriority: 0.9,        // 0-1: prefer faster models
    intelligencePriority: 0.2, // 0-1: prefer smarter models
  },
  systemPrompt: 'You are a concise summariser.',
});

Failure Modes with Sampling

Case 1: Using Sampling for Every Decision

Sampling adds latency and cost. Using it for decisions that can be made with deterministic code (string matching, regex, a simple lookup) is wasteful. Reserve sampling for genuinely ambiguous situations where LLM understanding adds real value.

// WASTEFUL: Sampling for something a regex handles
const isEmail = await serverInstance.createMessage({
  messages: [{ role: 'user', content: { type: 'text', text: `Is "${input}" an email address? Yes or No.` } }],
  maxTokens: 5,
});

// BETTER: Just use a regex
const isEmail = /^[^@]+@[^@]+\.[^@]+$/.test(input);

Case 2: Infinite Sampling Loops

If a server uses sampling and the LLM response triggers another tool call that uses sampling again, you can create infinite loops. Always set a maximum recursion depth and terminate if exceeded.

// Guard against recursion depth
async function toolHandler({ query }, context, depth = 0) {
  if (depth > 3) {
    return { isError: true, content: [{ type: 'text', text: 'Max reasoning depth exceeded.' }] };
  }
  const classification = await context.server.createMessage({ /* ... */ });
  if (needsMoreInfo(classification)) {
    return toolHandler({ query: refineQuery(query) }, context, depth + 1);
  }
  return finalResponse(classification);
}

What to Check Right Now

  • Declare sampling on your client – if you want servers to be able to use sampling, your client must declare capabilities: { sampling: {} }. Without this, sampling requests from servers will be rejected.
  • Implement a sampling handler – if you build a host application, implement the CreateMessageRequestSchema handler. An unimplemented handler will cause all sampling requests to fail silently.
  • Show sampling requests to users – the spec emphasises human oversight. Production hosts should surface pending sampling requests to users and allow approval/rejection.
  • Cap sampling depth – any server that uses sampling recursively must have a maximum depth limit. Without it, one malformed query can run up unbounded costs.
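
The oversight points above can be enforced with a policy gate that the host runs before forwarding any sampling request to its LLM. The sketch below is host-side logic only, not SDK API; the policy shape and limits are assumptions for illustration.

```javascript
// Hypothetical host-side policy gate, run before a sampling request is
// forwarded to the LLM. Nothing here is part of the MCP SDK.
function shouldApproveSampling(params, policy) {
  // Reject requests that exceed the host's token budget
  if ((params.maxTokens ?? 0) > policy.maxTokensPerRequest) {
    return { approved: false, reason: 'maxTokens exceeds policy limit' };
  }
  // Rough character-based guard against oversized prompts
  const totalChars = (params.messages ?? [])
    .map(m => (typeof m.content === 'string' ? m.content : m.content?.text ?? ''))
    .join('').length;
  if (totalChars > policy.maxPromptChars) {
    return { approved: false, reason: 'prompt too large' };
  }
  // Everything else falls through to interactive user approval
  return { approved: true, requiresUserConfirmation: policy.alwaysConfirm };
}

const policy = { maxTokensPerRequest: 1000, maxPromptChars: 20000, alwaysConfirm: true };
const verdict = shouldApproveSampling(
  { messages: [{ role: 'user', content: { type: 'text', text: 'hi' } }], maxTokens: 10 },
  policy
);
```

A gate like this keeps one misbehaving server from draining your token budget while still allowing legitimate sampling through.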

nJoy πŸ˜‰

MCP Prompts: Reusable Templates and Workflow Fragments

Most MCP developers learn about tools and resources and stop there, treating prompts as a nice-to-have. This is a mistake. Prompts are the mechanism that turns a raw capability server into a polished, user-facing product. They let you bake your best workflows into the server itself, expose them through any MCP-compatible host, and guarantee that users get the same high-quality prompt structure regardless of which host they use. Think of prompts as the “saved queries” of the AI world.

MCP prompts diagram showing prompt template with arguments resolving to rendered messages on dark background
Prompts: named, parameterised message templates that clients surface to users.

What Prompts Are and Why They Matter

An MCP prompt is a named, reusable prompt template that the server exposes for clients to use. When a client calls prompts/get with a prompt name and arguments, the server returns a list of messages ready to be sent to an LLM. The messages can reference resources (to inject dynamic content), contain multi-turn conversation history, and include both user and assistant roles.

The key difference from tools: prompts are human-initiated workflows. A user explicitly selects a prompt from the host UI (“Code Review”, “Summarise Document”, “Translate to French”). Tools are model-initiated – the LLM decides to call them based on context. Prompts are the programmatic equivalent of slash commands.

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';

const server = new McpServer({ name: 'dev-assistant', version: '1.0.0' });

// Simple prompt with arguments
server.prompt(
  'code_review',
  'Review code for quality, security, and best practices',
  {
    code: z.string().describe('The code to review'),
    language: z.string().describe('Programming language (e.g. javascript, python, rust)'),
    focus: z.enum(['security', 'performance', 'style', 'all']).default('all')
      .describe('What aspect to focus the review on'),
  },
  async ({ code, language, focus }) => ({
    messages: [
      {
        role: 'user',
        content: {
          type: 'text',
          text: `Please review the following ${language} code with a focus on ${focus}:\n\n\`\`\`${language}\n${code}\n\`\`\`\n\nProvide specific, actionable feedback with examples.`,
        },
      },
    ],
  })
);

“Prompts enable servers to define reusable prompt templates and workflows that clients can easily surface to users and LLMs. They provide a way to standardize and share common LLM interactions.” – MCP Documentation, Prompts

Prompts with Resource Embedding

Prompts can embed resources directly into messages. When the server returns a message with a resource content block, the client reads the resource and injects its content into the conversation context before sending it to the LLM.

server.prompt(
  'analyse_file',
  'Analyse the contents of a file',
  { file_uri: z.string().describe('The URI of the file to analyse') },
  async ({ file_uri }) => ({
    messages: [
      {
        role: 'user',
        content: [
          {
            type: 'text',
            text: 'Please analyse the following file and provide a summary of its contents, structure, and any notable patterns:',
          },
          {
            type: 'resource',
            resource: { uri: file_uri }, // Client resolves this URI and injects content
          },
        ],
      },
    ],
  })
);

// Multi-turn prompt with context
server.prompt(
  'debug_error',
  'Debug an error with context',
  {
    error_message: z.string(),
    stack_trace: z.string().optional(),
    context: z.string().optional().describe('Additional context about what you were doing'),
  },
  async ({ error_message, stack_trace, context }) => ({
    messages: [
      {
        role: 'user',
        content: { type: 'text', text: 'I am getting the following error:' },
      },
      {
        role: 'user',
        content: {
          type: 'text',
          text: `Error: ${error_message}${stack_trace ? `\n\nStack trace:\n${stack_trace}` : ''}${context ? `\n\nContext: ${context}` : ''}`,
        },
      },
      {
        role: 'assistant',
        content: { type: 'text', text: 'I can help debug this. Let me analyse the error...' },
      },
      {
        role: 'user',
        content: { type: 'text', text: 'What is causing this error and how do I fix it?' },
      },
    ],
  })
);
MCP prompt messages structure showing user assistant message roles with text and resource content blocks
Prompt messages: multi-turn conversations with user/assistant roles and embedded resource content.

Failure Modes with Prompts

Case 1: Putting LLM Logic Inside the Prompt Handler

A prompt handler should assemble and return messages. It should not call an LLM. Calling an LLM inside a prompt handler breaks the separation between prompt construction (server’s job) and prompt execution (host’s job). It also makes your server non-deterministic and slow.

// WRONG: Calling an LLM inside the prompt handler
server.prompt('summarise', '...', { text: z.string() }, async ({ text }) => {
  const openai = new OpenAI();
  const summary = await openai.chat.completions.create({ ... }); // WRONG
  return { messages: [{ role: 'user', content: { type: 'text', text: summary } }] };
});

// CORRECT: Return the prompt; let the host's LLM execute it
server.prompt('summarise', '...', { text: z.string() }, async ({ text }) => ({
  messages: [{
    role: 'user',
    content: { type: 'text', text: `Please summarise the following text in 3 bullet points:\n\n${text}` },
  }],
}));

Case 2: Hardcoding Content That Should Be a Resource Reference

If your prompt inlines large amounts of data (a whole document, a database dump), the data will not be updated when the underlying source changes and the prompt will grow stale. Reference a resource URI instead, letting the client fetch fresh content at prompt execution time.

// BAD: Hardcoded data goes stale
server.prompt('analyse_policy', '...', {}, async () => ({
  messages: [{ role: 'user', content: { type: 'text', text: ENTIRE_POLICY_TEXT_INLINED } }],
}));

// GOOD: Resource reference - always fresh
server.prompt('analyse_policy', '...', {}, async () => ({
  messages: [{
    role: 'user',
    content: [
      { type: 'text', text: 'Please analyse our current company policy for compliance issues:' },
      { type: 'resource', resource: { uri: 'docs://company/policy-current' } },
    ],
  }],
}));

What to Check Right Now

  • Identify your power workflows – what are the 3-5 most common things your users ask the AI to do? Each one is a prompt candidate.
  • Test prompts in the Inspector – the Inspector shows prompts in a dedicated tab. Fill in arguments and render the messages to verify the output before integrating with an LLM.
  • Use resource references for dynamic content – never inline large or frequently-changing data in prompt text. Reference it by URI.
  • Notify on changes – if your prompts change (updated templates, new prompts added), send notifications/prompts/list_changed so clients can refresh their prompt catalogues.
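
Once the host has fetched rendered messages from prompts/get, it still has to map them into whatever shape its LLM API expects. The flattening rules below are assumptions for a sketch (resource blocks are assumed already resolved to text); adapt them to the content types your host actually supports.

```javascript
// Hypothetical host-side helper: flatten MCP prompt messages into the
// { role, content } string pairs an OpenAI-style chat API expects.
function toChatMessages(mcpMessages) {
  return mcpMessages.map(m => {
    const blocks = Array.isArray(m.content) ? m.content : [m.content];
    const text = blocks
      .map(b => {
        if (b.type === 'text') return b.text;
        if (b.type === 'resource') return b.resource?.text ?? ''; // assumed pre-resolved
        return ''; // this sketch skips images and unknown block types
      })
      .join('\n');
    return { role: m.role, content: text };
  });
}

const chat = toChatMessages([
  { role: 'user', content: { type: 'text', text: 'I am getting the following error:' } },
  { role: 'user', content: [{ type: 'text', text: 'Error: ECONNREFUSED' }] },
]);
// chat is now in a shape suitable for a chat-completion call
```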

nJoy πŸ˜‰

MCP Resources: Exposing Static and Dynamic Data to AI Models

Tools do things. Resources provide things. This distinction matters more than it sounds. A tool executes code with side effects – it searches, writes, sends, deletes. A resource is a read-only window into data – it gives the model (or the user) access to content without triggering any action. The resources primitive is MCP’s answer to the question: “how do I give the AI access to my data without writing a bespoke data-access tool every time?”

MCP resources diagram showing URI-addressed content blocks flowing from server to client
Resources: URI-addressed content that servers expose for reading by clients and AI models.

What Resources Are and How They Work

Every MCP resource has a URI – a unique identifier that the client uses to request it. The URI can follow any scheme: file://, db://, https://, custom-scheme://. The server defines what URIs exist and what they return. The client requests a URI and gets back content blocks (text or binary).

Resources come in two forms: direct resources (static items with known URIs that the server lists upfront) and resource templates (URI patterns with parameters, for dynamic resources where the set of possible URIs is not fixed).

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { ResourceTemplate } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';
import fs from 'node:fs/promises';

const server = new McpServer({ name: 'file-server', version: '1.0.0' });

// Direct resource - static, known URI
server.resource(
  'config',
  'config://app/settings',
  { description: 'The application configuration', mimeType: 'application/json' },
  async (uri) => {
    const config = await fs.readFile('./config.json', 'utf8');
    return { contents: [{ uri: uri.href, mimeType: 'application/json', text: config }] };
  }
);

// Resource template - dynamic, parameterised URI
server.resource(
  'user-profile',
  new ResourceTemplate('users://{userId}/profile', { list: undefined }),
  { description: 'User profile by ID' },
  async (uri, { userId }) => {
    const user = await db.getUser(userId);
    if (!user) throw new Error(`User ${userId} not found`);
    return {
      contents: [{
        uri: uri.href,
        mimeType: 'application/json',
        text: JSON.stringify(user, null, 2),
      }],
    };
  }
);

“Resources represent any kind of data that an MCP server wants to make available to clients. This can include file contents, database records, API responses, live system data, screenshots, images, log files, and more.” – MCP Documentation, Resources

Resource Content Types

Resources return content blocks with either text or blob (binary) content. Text resources are the most common – JSON, Markdown, plain text, CSV, code. Binary resources use base64-encoded data.

// Text resource
return {
  contents: [{
    uri: uri.href,
    mimeType: 'text/markdown',
    text: '# Product Manual\n\nThis product does X...',
  }],
};

// Binary resource (e.g. an image or PDF)
const imageBuffer = await fs.readFile('./logo.png');
return {
  contents: [{
    uri: uri.href,
    mimeType: 'image/png',
    blob: imageBuffer.toString('base64'),
  }],
};

// Multiple content items (e.g. a resource that returns several files)
return {
  contents: [
    { uri: 'file:///src/main.js', mimeType: 'text/javascript', text: mainJsContent },
    { uri: 'file:///src/utils.js', mimeType: 'text/javascript', text: utilsJsContent },
  ],
};
MCP resource templates showing URI pattern matching with parameters extracted and passed to handler
Resource templates: URI patterns like users://{userId}/profile resolve to dynamic content.

Resource Subscriptions

If a resource changes over time, the server can support subscriptions. Clients subscribe to a URI and receive notifications when its content changes. This is useful for live data: a log file that grows, a database record that updates, a sensor reading that changes.

// Server with subscription support
const server = new McpServer(
  { name: 'live-data-server', version: '1.0.0' },
  { capabilities: { resources: { subscribe: true } } } // capabilities go in the options argument
);

server.resource(
  'live-metrics',
  'metrics://system/cpu',
  { description: 'Live CPU usage percentage' },
  async (uri) => {
    const usage = await getCpuUsage();
    return {
      contents: [{ uri: uri.href, mimeType: 'text/plain', text: `${usage}%` }],
    };
  }
);

// When the data changes, notify subscribers:
setInterval(async () => {
  server.server.notification({
    method: 'notifications/resources/updated',
    params: { uri: 'metrics://system/cpu' },
  });
}, 5000); // every 5 seconds

Failure Modes with Resources

Case 1: Returning Mutable Data from Resources

Resources are semantically read-only. If your resource handler has side effects (incrementing a counter, logging access, triggering a build), you are violating the contract. Clients may cache resource responses and re-use them without re-fetching. Side effects in resource handlers lead to missed triggers and hard-to-reproduce bugs.

// BAD: Side effect in a resource handler
server.resource('report', 'reports://quarterly', {}, async (uri) => {
  await markReportAsViewed(userId); // Side effect - will not fire on cached reads
  return { contents: [{ uri: uri.href, text: reportContent }] };
});

// GOOD: Side effects belong in tools
server.tool('mark_report_viewed', '...', { report_id: z.string() }, async ({ report_id }) => {
  await markReportAsViewed(report_id);
  return { content: [{ type: 'text', text: 'Marked as viewed.' }] };
});

Case 2: Using Resources When Tools Are the Right Primitive

Resources are for pre-existing data the AI reads passively. If the data requires parameters that affect what is returned, the access has query semantics, or you need to aggregate data from multiple sources on the fly – that is a tool, not a resource.

// Ambiguous: is this a resource or a tool?
// If it takes user query parameters and runs a search algorithm -> Tool
// If it returns a fixed, addressable document -> Resource

// RESOURCE: Fixed, URI-addressable content
server.resource('user-manual', 'docs://user-manual', {}, handler);

// TOOL: Dynamic query with parameters
server.tool('search_docs', '...', { query: z.string() }, handler);

What to Check Right Now

  • Identify your read-only data sources – any data your AI needs to read but not modify is a resource candidate: config files, user profiles, product catalogues, documentation.
  • Use resource templates for parameterised access – if you have N users with profiles, use users://{userId}/profile rather than registering N individual resources.
  • Enable subscriptions for live data – if any of your resources update frequently, implement subscription support so clients can receive push notifications rather than polling.
  • Test resource listing – call resources/list from the Inspector and verify all your direct resources appear with correct URIs and descriptions.
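
On the client side, each item returned by resources/read carries either a `text` field or a base64 `blob` field. A small helper like the one below can normalise both into usable values; the function name is mine for illustration, not SDK API.

```javascript
// Hypothetical client-side helper: normalise resources/read content items.
// Text items pass through; blob items are base64-decoded into a Buffer.
function decodeResourceContent(item) {
  if (typeof item.text === 'string') {
    return { uri: item.uri, mimeType: item.mimeType, kind: 'text', value: item.text };
  }
  if (typeof item.blob === 'string') {
    return {
      uri: item.uri,
      mimeType: item.mimeType,
      kind: 'binary',
      value: Buffer.from(item.blob, 'base64'),
    };
  }
  throw new Error(`Resource item ${item.uri} has neither text nor blob content`);
}

const textItem = decodeResourceContent({
  uri: 'config://app/settings', mimeType: 'application/json', text: '{"debug":false}',
});
const binItem = decodeResourceContent({
  uri: 'file:///logo.png', mimeType: 'image/png',
  blob: Buffer.from('PNGDATA').toString('base64'),
});
```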

nJoy πŸ˜‰

MCP Tools: Defining, Validating, and Executing LLM-Callable Functions

Tools are the heart of MCP. When people say “the AI can use tools”, they mean it can call functions exposed through this primitive. Tools are what let an AI model search your database, send an email, read a file, call an API, or run a command. Everything else in MCP is scaffolding around this core capability. This lesson covers the full tool API: defining schemas, validation, error handling, streaming, annotations, and the failure modes that will destroy a production system if you do not anticipate them.

MCP tools architecture diagram showing tool definition schema validation handler and response on dark background
The anatomy of an MCP tool: name, description, input schema, and async handler returning content blocks.

The Tool Definition API

A tool in MCP has four required components: a name (unique identifier, snake_case by convention), a description (what the tool does – this is what the LLM reads to decide when to use it), an input schema (a Zod object shape describing what arguments the tool takes), and a handler (an async function that receives validated arguments and returns a result).

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';

const server = new McpServer({ name: 'my-server', version: '1.0.0' });

server.tool(
  'search_products',                    // name
  'Search the product catalogue',       // description
  {                                     // input schema (Zod object shape)
    query: z.string().min(1).max(200).describe('Search terms'),
    category: z.enum(['electronics', 'clothing', 'books']).optional()
      .describe('Optional category filter'),
    max_price: z.number().positive().optional()
      .describe('Maximum price in USD'),
    limit: z.number().int().min(1).max(50).default(10)
      .describe('Number of results to return'),
  },
  async ({ query, category, max_price, limit }) => {  // handler
    const results = await db.searchProducts({ query, category, max_price, limit });
    return {
      content: results.map(p => ({
        type: 'text',
        text: `${p.name} - $${p.price} (${p.category})\n${p.description}`,
      })),
    };
  }
);

The description is the most important field for LLM usability. It is what the model reads when deciding whether to use this tool. Write it as if explaining to a smart colleague what the function does, when to use it, and what it returns. Vague descriptions cause the model to either misuse the tool or avoid it entirely.

“Tools are exposed to the client with a JSON schema for their inputs. Clients SHOULD present tools to the LLM with appropriate context about what the tool does and when to use it.” – MCP Documentation, Tools

Content Types and Rich Responses

Tool handlers return an object with a content array. Each item in the array is a content block. MCP supports three content types: text, image, and resource.

// Text content (most common)
return {
  content: [{ type: 'text', text: 'The result as a string' }],
};

// Multiple text blocks (e.g. separate sections)
return {
  content: [
    { type: 'text', text: '## Summary\nHere is what I found...' },
    { type: 'text', text: '## Details\nFull results below...' },
  ],
};

// Image content (base64-encoded)
const imageData = fs.readFileSync('./chart.png').toString('base64');
return {
  content: [{
    type: 'image',
    data: imageData,
    mimeType: 'image/png',
  }],
};

// Resource reference (URI to a resource the client can read)
return {
  content: [{
    type: 'resource',
    resource: { uri: 'file:///data/report.pdf', mimeType: 'application/pdf' },
  }],
};

// Mixed content (text + image)
return {
  content: [
    { type: 'text', text: 'Here is the sales chart for Q1:' },
    { type: 'image', data: chartBase64, mimeType: 'image/png' },
  ],
};
MCP tool content types diagram showing text image and resource content blocks with example structures
The three tool content types: text, image (base64), and resource (URI reference).

Tool Annotations

MCP supports optional annotations on tools that hint to clients about the tool’s behaviour. These help hosts make better security and UX decisions before invoking a tool. Annotations are hints, not enforceable constraints – a well-behaved host should respect them, but the protocol does not validate them at runtime.

server.tool(
  'delete_file',
  'Permanently deletes a file from the filesystem',
  { path: z.string().describe('Absolute path to the file') },
  {
    // Tool annotations - the spec-defined hint names end in "Hint"
    annotations: {
      destructiveHint: true, // Hint: cannot be undone; compliant hosts should confirm with the user
    },
  },
  async ({ path }) => {
    await fs.promises.unlink(path);
    return { content: [{ type: 'text', text: `Deleted: ${path}` }] };
  }
);

// Read-only tool annotation
server.tool(
  'read_file',
  'Reads a file from the filesystem',
  { path: z.string() },
  { annotations: { readOnlyHint: true } }, // Hint: no side effects
  async ({ path }) => {
    const content = await fs.promises.readFile(path, 'utf8');
    return { content: [{ type: 'text', text: content }] };
  }
);

Failure Modes in Tool Design

Case 1: Vague Tool Descriptions Causing Misuse

When the description is too vague, the LLM will either call the wrong tool, pass wrong arguments, or skip the tool when it should use it. This causes subtle, hard-to-debug failures in production.

// BAD: Vague description - what does "process" mean?
server.tool('process', 'Process some data', { data: z.string() }, handler);

// GOOD: Specific description with context and return value
server.tool(
  'summarise_text',
  'Summarises a long text to under 100 words. Use when the user asks for a summary or when text exceeds 2000 characters and needs to be condensed. Returns: a concise summary string.',
  { text: z.string().min(1).describe('The text to summarise') },
  handler
);

Case 2: Throwing Errors Instead of Returning isError

Throwing an uncaught error from a tool handler causes the server to return a JSON-RPC error (protocol-level failure). The LLM sees this as a system failure, not a domain error. For domain errors – “user not found”, “quota exceeded”, “invalid file type” – return isError: true so the LLM can reason about the failure.

// BAD: Protocol error - LLM cannot reason about this
async ({ user_id }) => {
  const user = await db.findUser(user_id);
  if (!user) throw new Error('User not found'); // JSON-RPC error - not helpful to LLM
}

// GOOD: Domain error - LLM can adjust response
async ({ user_id }) => {
  const user = await db.findUser(user_id);
  if (!user) return {
    isError: true,
    content: [{ type: 'text', text: `No user found with ID ${user_id}. Check if the ID is correct.` }],
  };
  return { content: [{ type: 'text', text: JSON.stringify(user) }] };
}

Case 3: Missing Zod .describe() on Input Fields

Every Zod field in a tool’s input schema should have a .describe() call. The description appears in the JSON Schema that gets sent to the LLM. Without it, the model has to guess what the field means from its name alone – which leads to wrong values being passed.

// BAD: No descriptions - LLM must guess what max_items means
{ query: z.string(), max_items: z.number(), include_archived: z.boolean() }

// GOOD: Descriptions guide the LLM to pass correct values
{
  query: z.string().describe('Search query - supports AND, OR, NOT operators'),
  max_items: z.number().int().min(1).max(100).describe('Maximum results to return (1-100)'),
  include_archived: z.boolean().default(false).describe('Set to true to include archived items in results'),
}

Dynamic Tool Registration

Tools do not have to be registered at server startup. You can register tools dynamically and notify connected clients:

// A small registry wrapper for registering tools at runtime
const toolRegistry = new Map();

function registerTool(name, description, schema, handler) {
  server.tool(name, description, schema, handler);
  toolRegistry.set(name, { name, description });
  // Notify connected clients that the tool list changed
  server.server.notification({ method: 'notifications/tools/list_changed' });
}

// Call this at any point after the server is connected
registerTool(
  'new_dynamic_tool',
  'A tool added at runtime',
  { input: z.string() },
  async ({ input }) => ({ content: [{ type: 'text', text: `Got: ${input}` }] })
);

“Servers MAY notify clients when the list of available tools changes. Clients that support the tools.listChanged capability SHOULD re-fetch the tool list when they receive this notification.” – MCP Documentation, Tools

What to Check Right Now

  • Audit your tool descriptions – for each tool you build, ask: if an LLM read only the name and description, would it know exactly when to use this tool and what it returns? If not, rewrite the description.
  • Add .describe() to every Zod field – do this as a rule, not an afterthought. The descriptions are part of the tool API surface.
  • Test isError handling – build a tool that deliberately returns isError: true with an informative message. Test it with the Inspector to see what the LLM would receive.
  • Check your annotation hints – mark every destructive tool (delete, update, send) with destructiveHint: true. This lets compliant hosts ask for confirmation before executing.

nJoy πŸ˜‰

Your First MCP Server and Client in Node.js

Theory becomes knowledge when you type it. This lesson builds a complete, working MCP server and a complete, working client, from a blank directory to a running system with tool calling. By the end, you will have a tangible artefact – code you wrote, running on your machine – that embodies every concept from the first four lessons. Everything after this lesson builds on this foundation.

MCP first server and client complete project structure dark diagram showing server client tools files
The complete first project: a server with three tools and a client that discovers and calls them.

What We Are Building

We will build a “text tools” MCP server – a server that exposes three tools for working with text: word_count (counts words in a string), reverse_text (reverses a string), and extract_keywords (returns unique words above a minimum length). These are deliberately simple tools – the complexity will come later. The goal right now is to write the wiring, understand what each piece does, and verify the whole thing works end to end.

We will also build a client that connects to the server, discovers its tools, and calls each one. In later lessons, the client will call an LLM and route tool calls from model output. Here, the client calls tools directly so you can see the raw MCP protocol working without an LLM in the middle.

Final project structure:

mcp-text-tools/
  package.json
  .env
  server.js      # MCP server with three tools
  client.js      # MCP client that calls the tools

Building the Server

Start with the package setup:

mkdir mcp-text-tools && cd mcp-text-tools
npm init -y
npm pkg set type=module
npm install @modelcontextprotocol/sdk zod

Now write server.js:

// server.js
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

const server = new McpServer({
  name: 'text-tools',
  version: '1.0.0',
});

// Tool 1: Count words in a string
server.tool(
  'word_count',
  'Counts the number of words in a text string',
  { text: z.string().min(1).describe('The text to count words in') },
  async ({ text }) => {
    const count = text.trim().split(/\s+/).filter(Boolean).length;
    return {
      content: [{ type: 'text', text: `Word count: ${count}` }],
    };
  }
);

// Tool 2: Reverse a string
server.tool(
  'reverse_text',
  'Reverses the characters in a text string',
  { text: z.string().min(1).describe('The text to reverse') },
  async ({ text }) => ({
    content: [{ type: 'text', text: text.split('').reverse().join('') }],
  })
);

// Tool 3: Extract unique keywords above a minimum length
server.tool(
  'extract_keywords',
  'Extracts unique keywords from text, filtered by minimum character length',
  {
    text: z.string().min(1).describe('The text to extract keywords from'),
    min_length: z.number().int().min(2).max(20).default(4)
      .describe('Minimum keyword length in characters'),
  },
  async ({ text, min_length }) => {
    const words = text
      .toLowerCase()
      .replace(/[^a-z0-9\s]/g, '')
      .split(/\s+/)
      .filter(w => w.length >= min_length);
    const unique = [...new Set(words)].sort();
    return {
      content: [{ type: 'text', text: unique.join(', ') || '(none found)' }],
    };
  }
);

// Start the server on stdio transport
const transport = new StdioServerTransport();
await server.connect(transport);
console.error('text-tools MCP server running on stdio');

Note the logging pattern: console.error is used for server-side messages (not console.log) because the stdio transport uses stdout for protocol messages. Anything written to stdout must be valid JSON-RPC; log to stderr for anything human-readable.
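To see why a stray log line is fatal, here is a minimal sketch (a hypothetical helper, not SDK code) of how stdio framing works — newline-delimited JSON parsed line by line:

```javascript
// Hypothetical sketch of stdio framing: each line on stdout must parse as a
// complete JSON-RPC message, so the reader splits on newlines and JSON.parses.
function parseStdioLines(raw) {
  return raw.split('\n').filter(Boolean).map((line) => JSON.parse(line));
}

const clean = '{"jsonrpc":"2.0","id":1,"result":{"tools":[]}}\n';
const messages = parseStdioLines(clean); // parses into one message object

// A console.log('Server started!') on the server injects a non-JSON line
// into the same stream, and JSON.parse throws - the connection breaks.
let corrupted = false;
try {
  parseStdioLines('Server started!\n' + clean);
} catch {
  corrupted = true; // SyntaxError from the stray log line
}
```

This is the whole reason for the stderr rule: stderr is simply not part of the framed stream, so anything written there cannot corrupt the protocol.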

MCP Inspector showing text-tools server with three tools listed and word-count tool call result
The MCP Inspector showing the text-tools server with all three tools discoverable and callable.

Testing with the Inspector First

Before writing the client, test the server with the MCP Inspector:

npx @modelcontextprotocol/inspector node server.js

Open the URL it prints (usually http://localhost:5173). You should see all three tools listed. Click word_count, enter some text in the text field, and click Run. You should get back a result like Word count: 7. If you do, the server is working correctly. If not, check the error panel for the JSON-RPC response.

Building the Client

Now write client.js – a host that connects to the server, lists tools, and calls each one:

// client.js
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

// Create the client
const client = new Client(
  { name: 'text-tools-host', version: '1.0.0' },
  { capabilities: {} }
);

// Create the transport - this will launch server.js as a subprocess
const transport = new StdioClientTransport({
  command: 'node',
  args: ['server.js'],
});

// Connect (performs the full MCP handshake)
await client.connect(transport);
console.log('Connected to text-tools server');

// Step 1: Discover what tools the server has
const { tools } = await client.listTools();
console.log('\nAvailable tools:');
for (const tool of tools) {
  console.log(`  ${tool.name}: ${tool.description}`);
  console.log(`    Input schema:`, JSON.stringify(tool.inputSchema, null, 4));
}

// Step 2: Call word_count
console.log('\n--- Calling word_count ---');
const result1 = await client.callTool({
  name: 'word_count',
  arguments: { text: 'The quick brown fox jumps over the lazy dog' },
});
console.log('Result:', result1.content[0].text);

// Step 3: Call reverse_text
console.log('\n--- Calling reverse_text ---');
const result2 = await client.callTool({
  name: 'reverse_text',
  arguments: { text: 'Hello, MCP World!' },
});
console.log('Result:', result2.content[0].text);

// Step 4: Call extract_keywords
console.log('\n--- Calling extract_keywords ---');
const result3 = await client.callTool({
  name: 'extract_keywords',
  arguments: {
    text: 'The Model Context Protocol is an open protocol for AI tool integration',
    min_length: 5,
  },
});
console.log('Result:', result3.content[0].text);

// Clean up
await client.close();
console.log('\nDone. Connection closed.');

Run the client:

node client.js

Expected output:

Connected to text-tools server

Available tools:
  word_count: Counts the number of words in a text string
    Input schema: { ... }
  reverse_text: Reverses the characters in a text string
    Input schema: { ... }
  extract_keywords: Extracts unique keywords from text...
    Input schema: { ... }

--- Calling word_count ---
Result: Word count: 9

--- Calling reverse_text ---
Result: !dlroW PCM ,olleH

--- Calling extract_keywords ---
Result: context, integration, model, protocol

Done. Connection closed.

Common First-Project Failures

Case 1: Logging to stdout from a stdio Server

This is the most common first-day mistake. With StdioServerTransport, stdout is the JSON-RPC pipe. If you write anything to stdout that is not valid JSON-RPC, the client will fail to parse it and the connection will break in confusing ways.

// WRONG: stdout output from a stdio server breaks the protocol
console.log('Server started!'); // This goes to stdout - corrupts the pipe

// CORRECT: use stderr for all server-side logging
console.error('Server started!'); // stderr is safe - not part of the protocol

// Or use the MCP logging capability (covered in Lesson 6)
server.server.sendLoggingMessage({ level: 'info', data: 'Server started' });

Case 2: Not Awaiting client.connect()

If you forget to await client.connect(), your subsequent tool calls will race with the initialisation handshake and fail with protocol errors.

// WRONG
client.connect(transport);
const tools = await client.listTools(); // Fails: handshake not complete

// CORRECT
await client.connect(transport);
const tools = await client.listTools(); // Safe

Case 3: Tool Handler Throwing Without isError

When a tool handler throws an exception, the server catches it and returns an error response. But if you want to signal a user-visible error (as opposed to a protocol error), you should return a result with isError: true rather than throwing. Throwing causes a JSON-RPC error response; returning with isError: true returns a normal result that the LLM can read and reason about.

// OK for protocol failures (server bug, network error)
throw new Error('Database connection failed');

// BETTER for user-visible errors the LLM should handle
return {
  isError: true,
  content: [{ type: 'text', text: 'No results found for that query.' }],
};
// The LLM will receive this as tool output and can adjust its response accordingly.

“Tools can signal that a tool call failed by including isError: true in the result. This allows the LLM to reason about the failure and potentially retry or adjust its approach, rather than treating the tool failure as a protocol error.” – MCP Documentation, Tools
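From the host's point of view the two failure modes surface differently: a thrown handler error comes back as a JSON-RPC error, while an isError result resolves as ordinary tool output. A hypothetical helper (not SDK code) sketching how a host might route the latter:

```javascript
// Hypothetical host-side routing: isError results are ordinary tool output
// that gets fed back to the LLM; everything else is treated as success.
function classifyToolResult(result) {
  const text = result.content?.[0]?.text ?? '';
  return result.isError
    ? { kind: 'tool-error', text } // the LLM sees this and can adjust
    : { kind: 'success', text };
}

const failed = {
  isError: true,
  content: [{ type: 'text', text: 'No results found for that query.' }],
};
const outcome = classifyToolResult(failed);
```

Either way the text reaches the model; the difference is that a tool-error result stays inside the normal conversation flow instead of aborting the request.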

What to Check Right Now

  • Run the full project – build the text-tools server and client from this lesson. Do not copy-paste; type it. The act of typing catches misunderstandings that reading does not.
  • Inspect it with the Inspector – run npx @modelcontextprotocol/inspector node server.js before running the client. Verify all three tools appear and work.
  • Add a fourth tool – practice the pattern by adding uppercase_text as a fourth tool. Register it, implement the handler, test with the Inspector, then verify your client discovers it automatically.
  • Read the error – deliberately introduce a bug (typo in a field name, missing argument) and read the JSON-RPC error response. Understanding error messages now saves hours later.

nJoy πŸ˜‰

Node.js Dev Environment for MCP: SDK, Zod, ESM, and Tooling

Setting up a dev environment is the least glamorous part of any course, but it is also the part where the most time gets silently destroyed. This lesson sets up the Node.js MCP development environment properly, once, so you never have to think about it again. We cover the SDK, Zod for schema validation, ESM module configuration, the MCP Inspector, and the small quality-of-life tools that make the workflow fast. Every code example in this course starts from this base.

Node.js MCP development environment setup diagram showing package structure and tooling on dark background
The complete MCP Node.js dev environment: SDK, Zod, ESM, and the Inspector.

Node.js Version and ESM Setup

This course requires Node.js 22 or higher. Node.js 22 is the current LTS release and it ships several features we use throughout the course: native --env-file support (no more dotenv package), the stable node:test built-in test runner, and improved native fetch. Check your version:

node --version
# Should print v22.x.x or higher
# If not: nvm install 22 && nvm use 22
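A quick runtime sanity check for those features — feature detection rather than version parsing (a sketch; the variable names are illustrative):

```javascript
// Feature-detect the Node.js capabilities this course relies on.
const major = Number(process.versions.node.split('.')[0]);
const hasNativeFetch = typeof fetch === 'function'; // stable since Node 18
if (major < 22) {
  console.error(`Node ${major} detected - upgrade to 22+ for --env-file and node:test`);
}
console.error(`Node ${major}, native fetch available: ${hasNativeFetch}`);
```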

All code in this course uses ESM (ECMAScript Modules) – the import/export syntax. This is the modern Node.js module system and the MCP SDK is distributed as ESM. To use ESM in Node.js, add "type": "module" to your package.json. Here is the base package.json for every project in this course:

{
  "name": "my-mcp-project",
  "version": "1.0.0",
  "type": "module",
  "description": "MCP server / client",
  "engines": { "node": ">=22" }
}

With "type": "module", all .js files in your project are treated as ESM. You can use import and export freely. You cannot use require() directly (use createRequire from node:module if you ever need to load a CJS module from an ESM file). File extensions must be explicit in import paths: ./server.js, not ./server.

Installing the MCP SDK and Zod

Two packages cover everything you need to build and run MCP servers and clients:

npm install @modelcontextprotocol/sdk zod

@modelcontextprotocol/sdk is the official MCP implementation. It provides McpServer (for building servers), Client (for building clients), all transport implementations, and the full type definitions. It is the only MCP-specific dependency you need.

zod is a schema validation library. In MCP, it is used to define the input schemas for tools. When you register a tool on an MCP server, you pass a Zod schema that describes what arguments the tool accepts. The SDK uses this schema to generate the JSON Schema that gets advertised to clients, and to validate incoming tool call arguments before your handler runs. Zod v4 is required (v3 has a different API for .describe() on fields).

// Zod schema for a tool that searches a database
import { z } from 'zod';

const SearchSchema = {
  query: z.string().min(1).max(500).describe('The search query string'),
  limit: z.number().int().min(1).max(100).default(10).describe('Max results to return'),
  category: z.enum(['posts', 'users', 'products']).optional().describe('Filter by category'),
};

// The SDK converts this to JSON Schema for the tool manifest:
// { query: { type: 'string', minLength: 1, maxLength: 500, description: '...' }, ... }
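For intuition, the derived JSON Schema for SearchSchema above looks roughly like this (a sketch only; the exact output depends on the SDK and Zod versions):

```javascript
// Rough shape of the JSON Schema the SDK would advertise for SearchSchema.
// Illustrative only - exact output varies across SDK/Zod versions.
const searchInputSchema = {
  type: 'object',
  properties: {
    query: { type: 'string', minLength: 1, maxLength: 500, description: 'The search query string' },
    limit: { type: 'integer', minimum: 1, maximum: 100, default: 10, description: 'Max results to return' },
    category: { type: 'string', enum: ['posts', 'users', 'products'], description: 'Filter by category' },
  },
  required: ['query'], // limit has a default, category is optional
};
```

This is what the client (and ultimately the LLM) sees when it lists tools, which is why the .describe() strings matter: they are the only documentation the model gets.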
Node.js MCP project structure showing package.json server.js client.js and .env files on dark background
Standard MCP project structure for this course.

Project Structure Convention

Every project in this course follows this directory structure:

my-mcp-project/
  package.json          # "type": "module", dependencies
  .env                  # API keys and config (never committed)
  .gitignore            # includes .env and node_modules
  server.js             # MCP server entry point (or servers/ for multiple)
  client.js             # MCP client / host entry point
  tools/                # One file per tool for larger servers
    search.js
    fetch.js
  resources/            # One file per resource type
    database.js

For API keys, use Node.js 22’s native --env-file flag instead of the dotenv package. This keeps the dependency count low and the setup obvious:

# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AIza...
DATABASE_URL=postgresql://localhost:5432/mydb

# Run server with env file loaded natively
node --env-file=.env server.js

# Or in package.json scripts
{
  "scripts": {
    "start": "node --env-file=.env server.js",
    "dev": "node --watch --env-file=.env server.js"
  }
}

The --watch flag (Node.js 18+) restarts the process when files change. No nodemon required.
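Inside the server, variables loaded this way appear on process.env like any other environment variable (the names match the sample .env above; the fallback is illustrative):

```javascript
// Values loaded via --env-file land on process.env like any other env var.
// DATABASE_URL matches the sample .env above; the fallback is illustrative.
const dbUrl = process.env.DATABASE_URL ?? 'postgresql://localhost:5432/mydb';
console.error(`Using database: ${dbUrl}`); // stderr, per the stdio logging rule
```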

The MCP Inspector

The MCP Inspector is an official tool for testing and debugging MCP servers interactively. It is the most important development tool in your MCP workflow. You can use it without installing anything:

npx @modelcontextprotocol/inspector node server.js

This opens a web UI at http://localhost:5173 (or similar). From the Inspector you can:

  • See all tools, resources, and prompts the server exposes
  • Call any tool with custom arguments and see the raw response
  • Browse resources by URI
  • Render prompts with template arguments
  • Watch all JSON-RPC messages in the network panel in real time

The Inspector is the fastest way to verify that your server is working correctly before integrating it with an LLM. Always test with the Inspector first.

Common Environment Failures

Case 1: Using CJS require() in an ESM Project

With "type": "module" in package.json, all .js files are ESM. Using require() will throw ReferenceError: require is not defined in ES module scope.

// WRONG in an ESM project
const { McpServer } = require('@modelcontextprotocol/sdk/server/mcp.js');

// CORRECT
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';

If you need to import a CJS module from ESM (rare), use dynamic import or createRequire:

import { createRequire } from 'node:module';
const require = createRequire(import.meta.url);
const someCjsModule = require('some-cjs-package');

Case 2: Missing File Extensions in Import Paths

Unlike bundlers (webpack, Vite), Node.js ESM requires explicit file extensions in relative import paths. Omitting the extension causes a Cannot find module error.

// WRONG
import { myTool } from './tools/search';

// CORRECT
import { myTool } from './tools/search.js';

Case 3: Using Zod v3 with SDK v1

The MCP SDK v1 peer-depends on Zod v4 (not v3). Zod v3 and v4 have different APIs for field descriptions. If you have Zod v3 installed, the .describe() calls on schema fields will behave differently and tool descriptions may be missing from the manifest.

# Check which Zod version you have
npm list zod

# Install Zod v4 explicitly
npm install zod@^4.0.0

“The TypeScript SDK requires Node.js 18 or higher. Node.js 22+ is recommended for native .env file support and the stable built-in test runner.” – MCP TypeScript SDK, README

What to Check Right Now

  • Create a scratch project – run mkdir mcp-scratch && cd mcp-scratch && npm init -y && npm pkg set type=module && npm install @modelcontextprotocol/sdk zod. This is the baseline for Lesson 5.
  • Verify zod version – run npm list zod. It should show 4.x.x. If not, npm install zod@latest.
  • Test the Inspector – run npx @modelcontextprotocol/inspector --help to verify it is reachable. No install needed; it runs from the npm cache.
  • Add node_modules and .env to .gitignore – these are the two most important things to exclude. Run printf "node_modules/\n.env\n" > .gitignore (printf, unlike plain echo, interprets the \n escapes).

nJoy πŸ˜‰