
Building Custom MCP Servers: Extend AI Agents with Domain-Specific Tools

Byte Smith · 14 min read

Most teams using AI coding agents start with the tools that come out of the box. That gets you surprisingly far: file access, terminal commands, web search, documentation lookup. But eventually you hit the wall. Your agents cannot query your internal databases. They cannot check your ticket system. They cannot trigger your deployment pipeline or look up customer context from your proprietary API.

That is the gap custom MCP servers fill. Instead of waiting for someone else to build an integration for your internal systems, you build it yourself. You define the tools, control the permissions, enforce the security boundaries, and give your agents exactly the capabilities they need to be useful inside your organization’s specific workflows.

This is not theoretical. Teams that connect agents to internal tools see a qualitative shift in what those agents can accomplish. An agent that can read your database schema, query recent deployments, and check the status of a Jira ticket is fundamentally more useful than one that can only see files on disk. The compound value comes from giving agents access to the same context your engineers use when making decisions.

If you are already familiar with how MCP works at a high level from our overview of AI coding agents and MCP, this guide goes deeper. It covers architecture decisions, security patterns, production concerns, and a reference implementation you can fork and adapt.

MCP architecture recap

Before building, it helps to have the architecture clear. MCP follows a client-server model with three layers:

  • Host: the application the user interacts with (an IDE like VS Code, Claude Desktop, or a custom agent runner)
  • MCP Client: lives inside the host, manages connections to one or more MCP servers
  • MCP Server: exposes capabilities to the client, connects to your backend systems

The server is where your code lives. It receives requests from the client, executes logic against your tools and data sources, and returns structured results.

Transport options

MCP supports two primary transport mechanisms:

  • stdio: the server runs as a subprocess of the host. Simple, no network configuration, ideal for local development and single-user setups.
  • Streamable HTTP (formerly HTTP/SSE): the server runs as a standalone HTTP service. Required for multi-user deployments, remote access, and production environments where you need proper authentication and load balancing.

For local development and testing, stdio is the fastest path. For anything you plan to deploy to a team, you want HTTP transport with proper auth in front of it.
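From the host's side, the two transports differ only in how the server is reached. The sketch below follows the common mcpServers configuration convention used by hosts like Claude Desktop; the server names, paths, and URL are hypothetical, and the exact format depends on your host:

```json
{
  "mcpServers": {
    "local-dev": {
      "command": "node",
      "args": ["dist/server.js"]
    },
    "team-server": {
      "url": "https://mcp.internal.example.com/mcp"
    }
  }
}
```

The stdio entry launches the server as a subprocess of the host; the HTTP entry points at a standalone deployed service.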

Core primitives

MCP servers expose three types of capabilities:

  • Tools: functions the agent can call (query a database, create a ticket, run a build). These are the most common and most powerful primitive.
  • Resources: read-only data the agent can access (database schemas, configuration files, documentation). Think of these as context the agent can pull in before deciding what to do.
  • Prompts: reusable prompt templates that guide the agent toward specific workflows.

For most custom server projects, you will spend 90% of your time defining tools and the security boundaries around them.
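To make the tool primitive concrete, here is a minimal tool descriptor of the kind a server returns from its tool-listing handler. The name/description/inputSchema shape follows the MCP specification; the ticket tool itself is a hypothetical example:

```typescript
// A tool is described by a name, a natural-language description the model
// reads when deciding what to call, and a JSON Schema for its inputs.
const getTicketTool = {
  name: "get_ticket",
  description:
    "Fetch a single ticket by ID, including status, assignee, and comments.",
  inputSchema: {
    type: "object",
    properties: {
      ticketId: {
        type: "string",
        description: "Ticket identifier, e.g. PROJ-123",
      },
    },
    required: ["ticketId"],
  },
};
```

The description field does real work here: it is the only signal the agent has for deciding when this tool applies, so write it the way you would document an API for a new teammate.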

If you want hands-on experience connecting to existing MCP servers before building your own, start with our tutorial on setting up MCP-powered coding agents or extending GitHub Copilot with MCP tools.

Choosing your MCP server pattern

Not every MCP server needs the same level of complexity. The right pattern depends on what you are exposing and the risk profile of those operations.

Read-only data access

This is the simplest and lowest-risk pattern. Your server exposes tools that query databases, fetch API responses, or search documentation, but never modify anything.

Examples:

  • querying a read-only replica of your production database
  • searching internal documentation or knowledge bases
  • fetching deployment status from your CI/CD system
  • looking up customer context from a CRM API

Read-only servers are the best place to start. They deliver immediate value with minimal risk, because the worst case is the agent sees data it should not, which you control through authorization and output filtering.

Stateful operations

Once you move beyond reads, the stakes increase. Stateful operations include creating tickets, updating records, triggering deployments, or modifying configuration.

These require careful permission design:

  • explicit allowlisting of which operations are permitted
  • confirmation workflows for destructive actions
  • audit logging for every write operation
  • rollback capabilities where possible

The key principle is that your MCP server should enforce stricter boundaries than you would for a human user. An agent can make mistakes faster and at higher volume than a person. Your server is the enforcement layer.
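A minimal sketch of that enforcement layer, combining two of the bullets above: destructive tools require an explicit confirmation flag from the caller, and every attempt, allowed or not, lands in the audit log. The names here are illustrative, not from the reference implementation:

```typescript
interface AuditEntry {
  timestamp: string;
  tool: string;
  args: unknown;
  outcome: "executed" | "blocked";
}

const auditLog: AuditEntry[] = [];

// Destructive tools only run when the caller passes confirm: true explicitly.
// Every attempt is audited, including the ones we block.
function guardWrite(
  tool: string,
  args: { confirm?: boolean },
  destructive: boolean,
): boolean {
  const allowed = !destructive || args.confirm === true;
  auditLog.push({
    timestamp: new Date().toISOString(),
    tool,
    args,
    outcome: allowed ? "executed" : "blocked",
  });
  return allowed;
}
```

The confirmation flag forces the agent to state its intent twice, which turns a single hallucinated parameter into a blocked call instead of a deleted record.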

Multi-tool composition

The most useful MCP servers expose a cohesive set of related tools, typically five to ten, that work together as a toolkit. A database server might expose list_tables, get_schema, query_database, and explain_query. A project management server might expose search_tickets, get_ticket, create_ticket, update_status, and add_comment.

The composition matters because agents reason better when tools are logically grouped and well-documented. A server with three focused tools is more useful than one with thirty loosely related ones.

Decision framework

Start read-only. Prove the value. Then add write operations one at a time, each with its own permission check and audit trail. This is not about being cautious for its own sake. It is about building trust in the system incrementally so you can move faster later.

Security architecture for MCP servers

Security is not a bolt-on concern for MCP servers. It is the core design constraint. Your MCP server sits between an AI agent and your internal systems. If the server is permissive, the agent inherits that permissiveness, and the blast radius of a mistake or attack scales accordingly.

Authentication

How the server verifies who is making a request:

  • API keys: simplest option. Good for internal tools, single-tenant setups, and development. Store keys in environment variables, never in code. Rotate regularly.
  • OAuth2: the right choice when your MCP server needs to act on behalf of specific users with their permissions. More complex to implement but necessary for multi-user production deployments.
  • mTLS: mutual TLS for service-to-service communication. Use this when your MCP server is called by other services rather than directly by agent hosts.

For most teams starting out, API key authentication with per-key permission scoping is the right balance of security and simplicity.

Authorization

Authentication tells you who is calling. Authorization tells you what they are allowed to do.

Effective MCP server authorization includes:

  • Per-tool permissions: not every authenticated user should have access to every tool. A read-only API key should not be able to call write tools.
  • Role-based access: map API keys or OAuth tokens to roles (reader, writer, admin) and enforce role checks in each tool handler.
  • Tenant isolation: if your server serves multiple teams or customers, ensure queries are scoped to the caller’s data. Never rely on the agent to add the right WHERE clause.
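Per-tool role checks reduce to a small lookup that every tool handler consults before doing any work. This is a sketch under assumed names (the tool-to-role mapping and role set are hypothetical); note that unknown tools are denied by default rather than allowed:

```typescript
type Role = "reader" | "writer" | "admin";

// Minimum role required to call each tool. Anything not listed is denied.
const TOOL_MIN_ROLE: Record<string, Role> = {
  query_database: "reader",
  create_ticket: "writer",
  trigger_deploy: "admin",
};

const ROLE_RANK: Record<Role, number> = { reader: 0, writer: 1, admin: 2 };

function isAuthorized(role: Role, tool: string): boolean {
  const required = TOOL_MIN_ROLE[tool];
  if (required === undefined) return false; // deny by default
  return ROLE_RANK[role] >= ROLE_RANK[required];
}
```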

Input validation

Every input from an agent should be treated as untrusted. Agents can hallucinate parameters, and a compromised prompt can attempt injection attacks.

import { z } from "zod";

// Example allowlist; in practice this comes from configuration
const ALLOWED_TABLES = ["users", "projects", "deployments"];

const QueryToolSchema = z.object({
  table: z
    .string()
    .refine((t) => ALLOWED_TABLES.includes(t), {
      message: "Table not in allowlist",
    }),
  columns: z
    .array(z.string())
    .max(20, "Too many columns requested"),
  where: z
    .record(z.string(), z.union([z.string(), z.number()]))
    .optional(),
  limit: z
    .number()
    .int()
    .min(1)
    .max(100)
    .default(25),
});

Key validation patterns:

  • schema enforcement on every tool input using Zod or a similar library
  • SQL injection prevention through parameterized queries (never string concatenation)
  • parameter sanitization to strip or reject dangerous patterns
  • table and column allowlisting to prevent access to sensitive data

Output filtering

What comes back from your backend systems may contain data the agent should not see or return to the user.

  • PII redaction: mask email addresses, phone numbers, social security numbers, and other sensitive fields before returning results
  • Sensitive column exclusion: define columns that are never returned regardless of the query
  • Row-level security: filter results based on the caller’s permissions

This matters because MCP tool responses become part of the agent’s context, and that context may be visible to the end user or logged in ways you do not fully control.
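A redaction pass can be a single function applied to every row before it leaves the server. The column names and masking rules below are illustrative, a sketch rather than the reference implementation's exact logic:

```typescript
// Columns that are dropped entirely, regardless of what the query asked for.
const SENSITIVE_COLUMNS = new Set(["ssn", "password_hash"]);

// Simple email matcher for masking string values.
const EMAIL_RE = /\b[\w.+-]+@[\w-]+\.[\w.]+\b/g;

function redactRow(row: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [col, value] of Object.entries(row)) {
    if (SENSITIVE_COLUMNS.has(col)) continue; // never returned
    out[col] =
      typeof value === "string"
        ? value.replace(EMAIL_RE, "[redacted-email]")
        : value;
  }
  return out;
}
```

Because this runs server-side, it holds even when the agent asks for `SELECT *` or a prompt injection tries to talk the model into requesting sensitive fields.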

Rate limiting

Agents can make requests much faster than humans. Without rate limits, a misconfigured agent can hammer your database or exhaust your API quotas in minutes.

Implement per-user and per-tool rate limits. Return clear, structured errors when limits are hit so the agent can back off intelligently rather than retrying in a tight loop.
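A fixed-window limiter keyed by API key and tool is enough to start with; the limits and window below are arbitrary example values, and a production version would likely live in Redis rather than process memory:

```typescript
interface Window {
  count: number;
  resetAt: number;
}

const WINDOW_MS = 60_000; // 1-minute windows
const MAX_CALLS = 30; // per key, per tool, per window

const windows = new Map<string, Window>();

// Returns false when the caller is over the limit; the handler should then
// return a structured "rate limited, retry after N seconds" error.
function checkRateLimit(apiKey: string, tool: string, now = Date.now()): boolean {
  const key = `${apiKey}:${tool}`;
  const w = windows.get(key);
  if (!w || now >= w.resetAt) {
    windows.set(key, { count: 1, resetAt: now + WINDOW_MS });
    return true;
  }
  if (w.count >= MAX_CALLS) return false;
  w.count += 1;
  return true;
}
```

Keying on both the API key and the tool name means an agent stuck in a loop on one tool cannot starve its own legitimate calls to the others.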

For a deeper treatment of security patterns for agent-connected systems, see our guide to securing agentic AI applications and API security best practices for AI-integrated apps.

If your MCP server is used in coding agent workflows, also see our guide to securing AI coding agent pipelines, which covers detection, policy controls, and risk-tiered review gates at the CI/CD level.

Production concerns

Getting a tool to work locally is the easy part. Making it reliable, observable, and maintainable in production is where the real engineering happens.

Error handling

Agents cannot read stack traces. They need structured error responses that describe what went wrong and what, if anything, can be done about it.

import { McpError, ErrorCode } from "@modelcontextprotocol/sdk/types.js";
import {
  ValidationError,
  AuthorizationError,
  QueryTimeoutError,
} from "../utils/errors.js";

function handleToolError(error: unknown): never {
  if (error instanceof ValidationError) {
    throw new McpError(
      ErrorCode.InvalidParams,
      `Invalid input: ${error.message}. Check the tool schema for valid parameters.`
    );
  }

  if (error instanceof AuthorizationError) {
    throw new McpError(
      ErrorCode.InvalidRequest,
      `Permission denied: your API key does not have access to this operation.`
    );
  }

  if (error instanceof QueryTimeoutError) {
    throw new McpError(
      ErrorCode.InternalError,
      `Query timed out after ${error.timeoutMs}ms. Try a more specific query with fewer results.`
    );
  }

  // Never expose internal details
  throw new McpError(
    ErrorCode.InternalError,
    "An unexpected error occurred. Contact the platform team if this persists."
  );
}

The error messages should help the agent self-correct. “Invalid table name” is better than “Error.” “Table not in allowlist, valid tables are: users, projects, deployments” is better still.

Logging and audit trails

Every MCP tool invocation should produce a structured log entry that captures:

  • timestamp
  • authenticated identity (API key ID, user, or service)
  • tool name and input parameters
  • response summary (success/failure, row count, duration)
  • any validation failures or blocked operations

This is not optional for production use. When something goes wrong, and it will, you need to reconstruct exactly what the agent asked for and what the server returned. Treat MCP server logs with the same rigor you apply to API gateway logs or database audit trails.
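A single helper can enforce that every invocation produces one structured JSON line with those fields. This is a sketch with assumed field names; one detail worth noting is that for stdio transports logs must go to stderr, because stdout carries the MCP protocol messages themselves:

```typescript
interface ToolLogEntry {
  timestamp: string;
  identity: string; // API key ID, user, or service
  tool: string;
  params: unknown;
  success: boolean;
  durationMs: number;
  rowCount?: number;
}

function logInvocation(entry: Omit<ToolLogEntry, "timestamp">): string {
  const full: ToolLogEntry = { timestamp: new Date().toISOString(), ...entry };
  const line = JSON.stringify(full);
  // stderr keeps stdout clean for stdio transports, where stdout carries
  // the MCP protocol stream; in production, ship this to your log pipeline.
  process.stderr.write(line + "\n");
  return line;
}
```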

Testing

MCP servers need three layers of testing:

  • Unit tests for individual tool handlers: given these inputs, does the handler return the right output and enforce the right constraints?
  • Integration tests with a mock MCP client: does the full request/response cycle work correctly, including auth, validation, and error handling?
  • Security tests: does the server correctly reject unauthorized requests, block SQL injection attempts, and mask sensitive data?

The MCP SDK provides utilities for creating test clients, which makes integration testing straightforward.
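The unit-test layer is framework-agnostic: given a handler and its constraints, assert the constraints hold. Here is a sketch for a hypothetical table-listing handler, checking that internal tables never leak even when the database reports them:

```typescript
// Hypothetical handler under test: only allowlisted tables are ever returned.
const ALLOWED_TABLES = ["users", "projects", "deployments"];

function handleListTables(available: string[]): string[] {
  return available.filter((t) => ALLOWED_TABLES.includes(t));
}

// Unit test: a system catalog table reported by the database must be filtered.
const result = handleListTables(["users", "pg_shadow", "projects"]);
if (result.includes("pg_shadow")) throw new Error("allowlist not enforced");
if (result.length !== 2) throw new Error("allowlisted tables dropped");
```

The security-test layer is the same idea with adversarial inputs: injection payloads in parameters, unauthorized keys, and requests for masked columns.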

Deployment

Containerize your MCP server from the start. A multi-stage Docker build keeps the image small, and Docker Compose lets you run the server alongside its dependencies (database, cache, etc.) for local development.

Production deployments should include:

  • health check endpoints
  • graceful shutdown handling
  • environment-based configuration (no hardcoded secrets)
  • resource limits (memory, CPU)
  • horizontal scaling for HTTP transport deployments
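A hypothetical docker-compose.yml for local development shows the shape: the server built from the local Dockerfile, its database dependency, environment-based configuration, and a memory limit. The service names, credentials, and variable names here are illustrative:

```yaml
services:
  mcp-server:
    build: .
    environment:
      DATABASE_URL: postgres://mcp:mcp@db:5432/mcp
      API_KEYS: dev-key-1
    depends_on:
      - db
    mem_limit: 512m
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: mcp
      POSTGRES_PASSWORD: mcp
      POSTGRES_DB: mcp
```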

Versioning

As your MCP server evolves, you will add tools, change schemas, and modify behavior. Agents that depend on your server will break if you change tool schemas without coordination.

Best practices:

  • treat tool schemas as a public API contract
  • add new tools rather than modifying existing ones when possible
  • use semantic versioning for your server
  • document breaking changes and provide migration guidance
  • consider running multiple server versions in parallel during transitions

The mcp-enterprise-starter reference implementation

To make all of this concrete, we built mcp-enterprise-starter, a reference implementation that demonstrates every pattern discussed in this guide.

Architecture

The repo is structured as a TypeScript MCP server that connects to PostgreSQL:

mcp-enterprise-starter/
  src/
    server.ts              # MCP server setup and transport config
    tools/
      query-database.ts    # Safe database query tool
      list-tables.ts       # Table listing tool
      get-schema.ts        # Schema inspection tool
    resources/
      schema.ts            # Database schema as MCP resource
    middleware/
      auth.ts              # API key authentication
      validation.ts        # Input validation and sanitization
      rate-limit.ts        # Per-user rate limiting
    utils/
      db.ts                # PostgreSQL connection pool
      sanitize.ts          # Query sanitization helpers
      errors.ts            # Structured error types
  tests/
  docker-compose.yml
  Dockerfile

Server setup

The server initialization wires together transport, authentication, and tool registration:

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { requireAuth } from "./middleware/auth.js";
import { checkRateLimit } from "./middleware/rate-limit.js";
import { queryDatabaseTool, handleQueryDatabase } from "./tools/query-database.js";
import { listTablesTool, handleListTables } from "./tools/list-tables.js";
import { getSchemaTool, handleGetSchema } from "./tools/get-schema.js";

const server = new Server(
  { name: "mcp-enterprise-starter", version: "1.0.0" },
  { capabilities: { tools: {}, resources: {} } },
);

// Register tool listing
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [queryDatabaseTool, listTablesTool, getSchemaTool],
}));

// Handle tool calls with auth and rate limiting
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;
  // stdio is effectively single-user: the key is supplied via the host's env
  // config, falling back to the first configured key for local development
  const apiKey = process.env.MCP_API_KEY || process.env.API_KEYS?.split(",")[0];
  const authCtx = requireAuth(apiKey);
  checkRateLimit(authCtx.apiKey);

  switch (name) {
    case "query_database":
      return await handleQueryDatabase(args);
    case "list_tables":
      return await handleListTables();
    case "get_schema":
      return await handleGetSchema(args);
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
});

// Start with stdio transport for local dev
const transport = new StdioServerTransport();
await server.connect(transport);

Safe database queries

The query_database tool demonstrates how to give an agent useful database access without giving it unrestricted SQL execution:

  • queries are built from validated components with parameterized values, never passed through as raw SQL
  • table and column names are validated against an allowlist
  • all values are passed as parameterized query arguments
  • results are limited by default with a configurable maximum
  • sensitive columns are masked in the output
  • destructive keywords (DROP, DELETE, UPDATE, ALTER) are blocked entirely for read-only keys
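The constraints above can be sketched as a small query builder. This assumes the table and column names have already passed the allowlist check (identifiers cannot be bound as parameters, so they must be validated by name), while every value is bound as a numbered parameter in PostgreSQL style:

```typescript
// Build a parameterized SELECT from pre-validated identifiers.
// Values never enter the SQL string; they travel separately as parameters.
function buildSelect(
  table: string,
  columns: string[],
  where: Record<string, string | number>,
  limit: number,
): { text: string; values: (string | number)[] } {
  const values: (string | number)[] = [];
  const clauses = Object.entries(where).map(([col, val]) => {
    values.push(val);
    return `${col} = $${values.length}`; // $1, $2, ... placeholders
  });
  const whereSql = clauses.length ? ` WHERE ${clauses.join(" AND ")}` : "";
  return {
    text: `SELECT ${columns.join(", ")} FROM ${table}${whereSql} LIMIT ${limit}`,
    values,
  };
}
```

A driver like node-postgres then receives `text` and `values` separately, so even a hostile value such as `'; DROP TABLE users; --` is just an inert string parameter.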

Auth middleware

The authentication layer maps API keys to permission sets:

import { McpToolError } from "../utils/errors.js";

export interface AuthContext {
  apiKey: string;
  permissions: "read" | "read-write";
}

function getApiKeys(): Map<string, AuthContext> {
  const keys = new Map<string, AuthContext>();
  const envKeys = process.env.API_KEYS || "";

  for (const key of envKeys.split(",").map((k) => k.trim()).filter(Boolean)) {
    keys.set(key, { apiKey: key, permissions: "read" });
  }
  return keys;
}

export function requireAuth(apiKey: string | undefined): AuthContext {
  if (!apiKey) {
    throw new McpToolError("auth_error", "Missing API key.", false);
  }
  const validKeys = getApiKeys();
  const context = validKeys.get(apiKey);
  if (!context) {
    throw new McpToolError("auth_error", "Invalid API key.", false);
  }
  return context;
}

API keys are loaded from the API_KEYS environment variable as a comma-separated list (e.g., dev-key-1,dev-key-2). Each key maps to a permission level. For production systems, you would extend this with per-key tool restrictions, role-based access, or replace it with OAuth2 entirely.

How to use it

The quickest path from clone to working server:

  1. Clone the repo and copy .env.example to .env
  2. Run docker compose up to start PostgreSQL with seeded sample data and the MCP server
  3. Copy the provided mcp-config.json into your Claude Desktop or VS Code MCP configuration
  4. Start asking your agent questions: “What tables are available?”, “Show me the schema for the projects table”, “Find all users in the engineering department”

The sample database includes realistic tables (users, departments, projects) with enough data to demonstrate query building, filtering, and sensitive column masking.
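For orientation, the provided host configuration looks roughly like this (the exact mcp-config.json in the repo may differ; the shape follows the mcpServers convention, and MCP_API_KEY matches one of the keys configured in the server's API_KEYS list):

```json
{
  "mcpServers": {
    "mcp-enterprise-starter": {
      "command": "node",
      "args": ["dist/server.js"],
      "env": { "MCP_API_KEY": "dev-key-1" }
    }
  }
}
```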

Adapting for your own systems

The repo is designed to be forked and modified. To connect it to your own internal tools:

  1. Replace the database connection with your data source
  2. Define new tools in src/tools/ following the existing patterns
  3. Update the table and column allowlists in your configuration
  4. Add your own API keys and permission scopes
  5. Write tests for your specific tool logic

The patterns for auth, validation, error handling, and logging carry over regardless of what backend you connect to. That is the point of the reference implementation: it gives you a production-quality shell that you fill with your own domain logic.

What is next

If you want to build and deploy a custom MCP server step by step, follow our companion tutorial: Build, Secure, and Deploy a Custom MCP Server. It walks through every stage from scaffolding to containerized deployment with working code at each step.

If you have not set up MCP-powered agents yet, start with the fundamentals in our earlier setup guides before building your own server.

The broader trajectory for MCP is moving toward multi-server orchestration, where a single agent session connects to multiple MCP servers simultaneously, each providing a different capability domain. Server registries and discovery protocols are emerging to make that coordination manageable. Teams that build well-structured, secure MCP servers now will be in the strongest position as that ecosystem matures.

The fundamental shift is straightforward: your internal tools are no longer just for humans. They are becoming capabilities that AI agents can use, under your control, with your security boundaries, inside your workflows. Building a custom MCP server is how you make that real.

Get the MCP Enterprise Starter Kit →