# Deploying MCP Servers on AWS Lambda with MCPEngine

April 23, 2025 · Simba Khadder

*MCP servers running on Lambda sending data down to LLM agents*

Model Context Protocol (MCP) is quickly becoming the standard for enabling LLMs to call external tools. It's built around clean, declarative tool definitions, but most current implementations fall short of being production-ready. Every official MCP server in the Anthropic repo, for instance, runs locally and communicates over stdio. Even the few that support HTTP rely on Server-Sent Events (SSE) for streaming. SSE introduces stateful behavior: it requires persistent TCP connections, complicates retries, and is ultimately incompatible with stateless environments like AWS Lambda. We've written more about these limitations and how we've addressed them with MCPEngine.

AWS Lambda offers instant scalability, no server management, and efficient, event-driven execution. We built native support for it into MCPEngine so that MCP tools can run cleanly and reliably in serverless environments. MCPEngine is an open-source implementation of MCP that supports streamable HTTP alongside SSE, making it compatible with Lambda. It also includes first-class support for authentication, packaging, and other capabilities needed to build and deploy production-grade MCP servers.

This post walks through building three progressively more realistic examples:

1. A stateless MCP server with a single tool
2. A stateful version using RDS and a context handler
3. An authenticated version using OIDC (Google or Cognito)

All three run entirely on Lambda, don't require a custom agent, and are MCP-spec compliant.

## 1. Build and Deploy a Stateless Weather MCP API

You can follow along with the full project on GitHub.

### 1.1 Defining the MCP Server

We'll start with a single tool called `get_weather`. It takes a city name and returns a canned string response. There's no state or external API call -- just enough to validate end-to-end behavior with a live LLM.

Install the Python SDK:

```bash
pip install "mcpengine[cli,lambda]"
```

Create a file called `app.py`:

```python
from mcpengine import MCPEngine

engine = MCPEngine()

@engine.tool()
def get_weather(city: str) -> str:
    """Return the current weather for a given city."""
    return f"The weather in {city} is sunny and 72°F."

handler = engine.get_lambda_handler()
```

What this does:

* `@engine.tool()` registers the function with the MCP manifest. The function name (`get_weather`) becomes the tool name exposed to the LLM.
* The docstring is exposed in the manifest and shown to the LLM during tool selection.
* `handler` is the AWS Lambda-compatible entry point. No glue code is required: Lambda is configured to look for a global named `handler`, and MCPEngine handles the request lifecycle.

### 1.2 Deploying to Lambda

You can deploy this manually or use Terraform to automate setup.
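Either way, what you deploy is a container image whose entry point is the `handler` object above. To make the moving parts concrete, here's a rough sketch of what a Lambda handler for MCP does. This is not MCPEngine's actual internals, just an illustration; the response shape follows the MCP spec's `tools/call` result format:

```python
import json

# Conceptual sketch only -- MCPEngine implements all of this for you.
# Lambda invokes handler(event, context) with the Function URL's HTTP
# request encoded in `event`; the handler decodes the JSON-RPC body,
# routes "tools/call" to the registered tool, and returns a response.
TOOLS = {
    "get_weather": lambda args: f"The weather in {args['city']} is sunny and 72°F.",
}

def handler(event, context):
    rpc = json.loads(event["body"])
    if rpc.get("method") == "tools/call":
        text = TOOLS[rpc["params"]["name"]](rpc["params"]["arguments"])
        body = {
            "jsonrpc": "2.0",
            "id": rpc["id"],
            "result": {"content": [{"type": "text", "text": text}]},
        }
        return {"statusCode": 200, "body": json.dumps(body)}
    return {"statusCode": 400, "body": "unsupported method"}
```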
#### Option 1: Terraform

If you want to skip most of the boilerplate, we provide Terraform scripts that:

* Create an ECR repository to host the image
* Provision the Lambda function and IAM roles
* Expose it via a Function URL

You can run them from the project directory:

```bash
terraform apply
```

Grab the ECR repository URL and Lambda function name from the Terraform output:

```bash
export REPOSITORY_URL=$(terraform output -raw repository_url)
export FUNCTION_NAME=$(terraform output -raw lambda_name)
```

Then build, tag, and push the image:

```bash
docker build --platform=linux/amd64 --provenance=false -t mcp-lambda:latest .
docker tag mcp-lambda ${REPOSITORY_URL}:latest
docker push ${REPOSITORY_URL}:latest
```

Finally, update the Lambda function with the new image:

```bash
aws lambda update-function-code \
  --function-name ${FUNCTION_NAME} \
  --image-uri ${REPOSITORY_URL}:latest
```

The application is now running. Once you're done with it, you can tear everything down with:

```bash
terraform destroy
```

#### Option 2: From Scratch

If you prefer to deploy manually:

**Step 1: Dockerize the server**

```dockerfile
FROM public.ecr.aws/lambda/python:3.12

# Set working directory in the container
WORKDIR /var/task

# Copy application code
COPY . .

# Install dependencies
RUN pip install --no-cache-dir .

# The Lambda entry point: module path of the handler defined in app.py
CMD ["app.handler"]
```

Then:

```bash
docker build --platform=linux/amd64 --provenance=false -t mcp-lambda .
```

**Step 2: Push to ECR**

```bash
docker tag mcp-lambda:latest <account-id>.dkr.ecr.<region>.amazonaws.com/mcp-lambda:latest
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/mcp-lambda:latest
```

**Step 3: Create an execution role for Lambda**

```bash
aws iam create-role \
  --role-name lambda-container-execution \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": { "Service": "lambda.amazonaws.com" },
        "Action": "sts:AssumeRole"
      }
    ]
  }'

aws iam attach-role-policy \
  --role-name lambda-container-execution \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
```

**Step 4: Deploy to Lambda**

```bash
aws lambda create-function \
  --function-name mcp-lambda \
  --package-type Image \
  --code ImageUri=<account-id>.dkr.ecr.<region>.amazonaws.com/mcp-lambda:latest \
  --role arn:aws:iam::<account-id>:role/lambda-container-execution
```

**Step 5: Enable a Function URL**, and add an allow-all permission to call it:

```bash
aws lambda create-function-url-config \
  --function-name mcp-lambda \
  --auth-type NONE

aws lambda add-permission \
  --function-name mcp-lambda \
  --function-url-auth-type NONE \
  --action lambda:InvokeFunctionUrl \
  --statement-id PublicInvoke \
  --principal '*'
```

### 1.3 Connecting via Claude

Once deployed, you can connect to the server from any compatible LLM client. For example, to connect from Claude:

```bash
mcpengine proxy --mode http --claude
```

Open Claude, and your tool should appear at the bottom of the chat bubble. When you ask something like "What's the weather in Tokyo?", Claude will:

* Select the tool based on its manifest description
* Prompt the user to authorize the request
* Call the endpoint with the tool name and arguments

That's it. You now have a fully deployed, Lambda-hosted MCP server responding to real LLM calls over HTTP.

## 2. Building and Deploying a Stateful Slack-like MCP API

You can follow along with the full project on GitHub.

Stateless tools are useful for demos, but most real applications need to persist data. In this section, we'll extend the minimal MCP server to include state. Specifically, we'll build a basic Slack-like message board that stores and retrieves messages from a relational database.
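To fix ideas before bringing in a database, here's the behavior we're after, sketched as plain in-memory Python (illustration only; the tool names match what we build below):

```python
# In-memory stand-in for the message board we're about to build.
messages: list[tuple[str, str]] = []

def post_message(username: str, text: str) -> str:
    messages.append((username, text))
    return "Message posted."

def get_messages() -> list[str]:
    # Most recent first, capped at 10 -- mirrors the SQL query we'll write.
    return [f"{u}: {t}" for u, t in reversed(messages[-10:])]
```

On Lambda, an in-memory version like this would lose its data on every cold start, which is exactly why the real implementation pushes state out to RDS.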
This version uses:

* Postgres on Amazon RDS for persistent storage
* A context handler to manage a database connection
* Two tools: one to post a message, one to list messages

The goal is not to build a full chat system, just to show how you can add state to an MCP server without giving up stateless infrastructure like Lambda.

### 2.1 Defining the Data Model

We'll store each message as a row in a single table. For simplicity, all messages go into the same global timeline. The schema looks like this:

```sql
CREATE TABLE messages (
    id SERIAL PRIMARY KEY,
    username TEXT NOT NULL,
    text TEXT NOT NULL,
    timestamp TIMESTAMP DEFAULT now()
);
```

`post_message()` will insert into this table, and `get_messages()` will return the most recent entries.

### 2.2 Managing Connections with a Context Handler

You shouldn't open database connections inside your tool functions. Instead, MCPEngine provides a context system: you define a setup function that runs as the server boots, and MCPEngine makes the result available to your tools as `ctx`. In this case, the context handler will:

* Open a connection to RDS
* Attach it to the request context (`ctx.db`)
* Clean the connection up when the server shuts down

This keeps your tools focused on business logic, not lifecycle management.

### 2.3 Implementing the Tools & Context

Assuming `ctx.db` is a valid psycopg2 connection, the tools look like this:

```python
from mcpengine import Context

@engine.tool()
def post_message(ctx: Context, username: str, text: str) -> str:
    """Post a message to the global timeline."""
    with ctx.db.cursor() as cur:
        cur.execute(
            "INSERT INTO messages (username, text) VALUES (%s, %s)",
            (username, text),
        )
        ctx.db.commit()
    return "Message posted."

@engine.tool()
def get_messages(ctx: Context) -> list[str]:
    """Get the most recent messages."""
    with ctx.db.cursor() as cur:
        cur.execute(
            "SELECT username, text FROM messages "
            "ORDER BY timestamp DESC LIMIT 10"
        )
        return [f"{row[0]}: {row[1]}" for row in cur.fetchall()]
```

Add the context handler:

```python
import os
from contextlib import asynccontextmanager

import psycopg2

@asynccontextmanager
async def app_lifespan():
    conn = psycopg2.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASS"],
        dbname=os.environ["DB_NAME"],
    )
    try:
        yield {"db": conn}
    finally:
        conn.close()
```

Then pass this lifespan to the MCPEngine constructor:

```python
engine = MCPEngine(
    lifespan=app_lifespan,
)
```

MCPEngine runs the lifespan as the server boots to open the connection, and attaches it as context to every request that comes in. When the server shuts down, it runs the cleanup (everything after the `yield` statement).

### 2.4 Deploying the Server

We recommend using Terraform here, since this version involves provisioning an RDS instance, IAM roles, and security groups. If you prefer to deploy manually, you can use the Terraform script as a reference.

```bash
terraform apply
```

This will:

* Create the database
* Set up a Lambda function with the correct environment variables

Grab the ECR repository URL and Lambda function name from the Terraform output:

```bash
export REPOSITORY_URL=$(terraform output -raw repository_url)
export FUNCTION_NAME=$(terraform output -raw lambda_name)
```

Then build, tag, and push the image:

```bash
docker build --platform=linux/amd64 --provenance=false -t mcp-lambda:latest .
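# A note on the build flags (worth double-checking against your setup):
#   --platform=linux/amd64   Lambda functions default to x86_64, so build
#                            an amd64 image even on an ARM (Apple Silicon)
#                            machine.
#   --provenance=false       BuildKit provenance attestations turn the
#                            image into an OCI image index, which Lambda
#                            can't pull; this keeps a plain single-image
#                            manifest.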
docker tag mcp-lambda ${REPOSITORY_URL}:latest
docker push ${REPOSITORY_URL}:latest
```

Finally, update the Lambda function with the new image:

```bash
aws lambda update-function-code \
  --function-name ${FUNCTION_NAME} \
  --image-uri ${REPOSITORY_URL}:latest
```

When you're done, you can tear down the resources with:

```bash
terraform destroy
```

### 2.5 Connecting and Testing

Once deployed, connect Claude again using:

```bash
mcpengine proxy --claude --mode http
```

Open Claude and you should now see two tools: `post_message` and `get_messages`. You can prompt Claude to send or retrieve messages. You can also connect from another Claude window, use the same tools, and confirm that messages are shared -- even across users and cold starts.

## 3. Adding Authentication with Google SSO

You can follow along with the full project on GitHub.

The tools we've built so far work, but they're open: anyone can call them, anyone can impersonate any username, and there's no mechanism for verifying identity. That might be fine for testing, but it's not acceptable in anything that resembles a production system.

MCPEngine supports token-based authentication using standard OpenID Connect (OIDC). That means you can integrate with any identity provider that issues JWTs, including Google, AWS Cognito, Auth0, or your internal auth stack. In this section, we'll secure our existing tools using Google as the identity provider. We'll:

* Register a Google OAuth app
* Modify our MCP server to require valid tokens
* Pass the token through Claude (or any other client)
* Read the authenticated user from the request context

### 3.1 Creating a Google OAuth App

First, set up an OAuth client in Google Cloud:

1. Go to the Google Cloud Console
2. Select or create a project
3. Navigate to APIs & Services > Credentials
4. Click Create Credentials > OAuth client ID
5. Set the application type to Web application
6. Add an authorized redirect URI (you can use http://localhost for local testing)
7. Save the client and note the Client ID

That's all you need for server-side token validation: the Client ID and the standard Google issuer (https://accounts.google.com).

### 3.2 Updating MCPEngine for Auth

To enable auth, you need to:

1. Set `idp_config` when constructing the engine:

```python
from mcpengine import MCPEngine, GoogleIdpConfig

engine = MCPEngine(
    lifespan=app_lifespan,
    idp_config=GoogleIdpConfig(),
)
```

This tells MCPEngine to use Google's public JWKS endpoint to verify incoming tokens.

2. Restrict access to tools using `@engine.auth()`:

```python
@engine.auth()
@engine.tool()
def post_message(text: str, ctx: Context) -> str:
    """Post a message to the global timeline."""
    # Only runs if the token is valid
    ...
```

If a request doesn't include a valid token, it's rejected automatically. If it does, user info is available through the context.

### 3.3 Updating the Client

When calling a protected tool from a client, you need to pass a valid Google-issued ID token. Claude handles this automatically once it sees that the tool requires authentication.

When you install the tool in Claude, add the client ID and client secret:

```bash
mcpengine proxy --claude --client-id <client-id> --client-secret <client-secret> --mode http
```

This tells Claude to request a token from Google using your registered client ID. When the user grants permission, Claude includes that token in every call to your MCP server. You don't need to verify anything manually; MCPEngine handles token validation and decoding internally.
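Under the hood, that validation is standard OIDC. For illustration only -- MCPEngine does this for you -- the equivalent check looks roughly like this with PyJWT:

```python
# Rough sketch of the OIDC validation MCPEngine performs internally --
# you don't write this yourself. It fetches Google's public signing keys
# and verifies the token's signature, issuer, and audience.
import jwt  # PyJWT: pip install "pyjwt[crypto]"
from jwt import PyJWKClient

GOOGLE_JWKS = PyJWKClient("https://www.googleapis.com/oauth2/v3/certs")

def validate_token(token: str, client_id: str) -> dict:
    signing_key = GOOGLE_JWKS.get_signing_key_from_jwt(token)
    # Raises if the token is expired, tampered with, or issued for a
    # different audience; otherwise returns claims like sub and email.
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=client_id,                    # your OAuth Client ID
        issuer="https://accounts.google.com",  # the standard Google issuer
    )
```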
### 3.4 Deploying the Updated Server

You don't need to change the Dockerfile or tool definitions -- just make sure you:

* Pass the IdP configuration (the issuer URL, either in code or via environment variables)
* Rebuild and push your Docker image
* Redeploy to Lambda with the new version

```bash
docker build --platform=linux/amd64 --provenance=false -t mcp-lambda .
docker tag mcp-lambda ${REPOSITORY_URL}:latest
docker push ${REPOSITORY_URL}:latest

aws lambda update-function-code \
  --function-name mcp-lambda \
  --image-uri ${REPOSITORY_URL}:latest
```

### 3.5 Confirming It Works

Once deployed:

* Claude should prompt the user to authenticate with Google before calling a protected tool
* Calls only succeed if a valid token is present
* You can access the user's identity via `ctx`

## Recap

With two changes -- adding `idp_config` to the engine and decorating tools with `@engine.auth()` -- we've added working authentication to our MCP server. Google handles the user login, Claude handles the token flow, and MCPEngine handles verification and exposes identity to your tool code.

## Where to go from here

At this point, we've deployed three working MCP servers on AWS Lambda:

1. A stateless one that returns a weather response
2. A stateful one that uses RDS to persist and retrieve chat messages
3. A secure one that uses Google SSO for authentication

The authenticated example is closest to a real-world use case. It's minimal, but it demonstrates that you can build something stateful, Lambda-native, and MCP-compliant without ever running a server or maintaining sticky connections.

We used Claude as the client here, but the interface is fully standard MCP. You can just as easily connect from another LLM or orchestrator (see the client sketch at the end of this post). This opens the door to agentic systems. For example, you could:

* Spin up a simulated product manager and engineer
* Drop them into the same chatroom
* Let them exchange ideas, log tickets, and revise specs -- all in the open

None of this requires any special integrations. Just tools, schemas, and tokens.

As of today, MCPEngine is the only Python implementation of MCP that supports built-in authentication. In the next post, we'll walk through more complex authentication patterns, including scoped access, restricting tools to specific users, and surfacing identity inside tool logic.
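As promised, here's what a minimal programmatic client looks like, using the official MCP Python SDK's SSE transport. The import paths and the `/sse` endpoint path are assumptions to verify against your SDK version and deployment:

```python
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

# Your Lambda Function URL; the /sse suffix is an assumption -- use
# whichever endpoint your MCPEngine deployment exposes.
SERVER_URL = "https://<your-function-url>.lambda-url.us-east-1.on.aws/sse"

async def main() -> None:
    async with sse_client(SERVER_URL) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Call the message-board tools from Part 2 directly,
            # with no Claude in the loop.
            await session.call_tool(
                "post_message", {"username": "bot", "text": "hello"}
            )
            result = await session.call_tool("get_messages", {})
            print(result.content)

asyncio.run(main())
```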