Deploy a Secure and Production-Ready EKS MCP Server to Manage EKS Clusters Using Natural Language

AI agents today open up fascinating prospects, especially when they are given access to external tools via APIs. This is where their full potential lies: we can now move from simple text generation to the execution of concrete actions on systems such as Cloud infrastructures.
In this article, I’ll explore how to deploy and administer Cloud infrastructures, such as Kubernetes clusters, in natural language, while guaranteeing security and industrialization.
Introduction
It all started with a question:
How can we deploy and administer Cloud infrastructures, such as Kubernetes clusters, in natural language, while guaranteeing security and industrialization?
I set out to answer it with a demonstration: deploying an MCP server capable of administering an AWS EKS cluster in natural language, while respecting the security and architecture constraints of an enterprise information system.
What is the Model Context Protocol (MCP)?
The Model Context Protocol (MCP) is a protocol proposed by Anthropic, which defines a standardized way for LLMs to interact with tools or environments.
In other words, this protocol bridges the gap between the conversational capabilities of LLMs and concrete actions on external systems such as APIs, databases, or, in this case, Kubernetes clusters.
Source: descope.com
Why a Secure Version of the EKS MCP Server?
In this project, I wanted to explore how an MCP server could be deployed remotely and securely in an enterprise environment. So I set out to deploy an MCP server capable of administering an EKS (Elastic Kubernetes Service) cluster on demand, via an MCP client (Claude Desktop).
A few weeks ago, I discovered that AWS had released a new MCP server: Amazon EKS MCP Server. It offers an MCP interface with tools for administering EKS clusters directly via an LLM.
However, in its original version, this server was designed for local use in stdio mode and integrates no authentication mechanism. As a result, it is difficult to use in enterprise cloud environments, where security and traceability are essential.
Key Modifications Made
So I set about modifying the server’s source code to:
- Enable remote execution by activating the HTTP Server-Sent Events (SSE) communication mode (a sketch follows this list)
- Define health checks so that it can be exposed via an ALB
- Add an authentication layer based on OAuth 2.0, to restrict and secure access to the server
- Add additional functions, notably to manage DNS records on Route53
- Maintain compatibility with standard MCP clients, such as Claude Desktop
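To make the first two changes concrete, here is a minimal sketch of what they can look like, assuming the server is built on the Python MCP SDK's FastMCP class; the port and health route are my own choices for illustration, not the original server's:

from mcp.server.fastmcp import FastMCP
from starlette.requests import Request
from starlette.responses import PlainTextResponse

# Hypothetical sketch: host, port and route are assumptions, not the original code.
mcp = FastMCP("eks-mcp-server", host="0.0.0.0", port=8080)

@mcp.custom_route("/health", methods=["GET"])
async def health(_: Request) -> PlainTextResponse:
    # The ALB target group probes this route to mark the Fargate task healthy.
    return PlainTextResponse("OK")

if __name__ == "__main__":
    # "sse" replaces the default stdio transport with HTTP Server-Sent Events.
    mcp.run(transport="sse")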
The ultimate goal is simple: to manage EKS clusters in natural language, securely.
In concrete terms, this means being able to send instructions such as:
- “Deploy a new service X on my high-availability cluster and expose it”
- “What’s the status of my cluster? What optimizations could be made?”
- “Why isn’t my deployment Y working?”
…from a client like Claude Desktop, and see these actions executed on an AWS infrastructure, without writing a single line of YAML or touching kubectl.
Solution Architecture
While chatting with my colleague Pierre-Ange, I learned that AWS has published a best practice guide for securely deploying MCP servers on their infrastructure. This guide can be accessed on the following GitHub repo: aws-solutions-library-samples/guidance-for-deploying-model-context-protocol-servers-on-aws
Rather than reinventing the wheel, I took inspiration from the architecture proposed by AWS, adapting it to my own needs for the deployment of my modified EKS MCP Server.
Secure EKS MCP Server Architecture
Key Components
The architecture includes:
- ECS Fargate to easily deploy MCP servers and provide horizontal scalability, enabling multiple MCP clients to connect to the service
- OAuth 2.0 authentication via Cognito and an authentication server (MAS) to offload and pool the MCP servers' authentication mechanisms, with a DynamoDB table to store tokens
- CloudFront distribution with a WAF (Web Application Firewall) in front of an ALB to securely expose MCP servers and make them reachable from MCP clients
- Least privilege IAM permissions to restrict what the server can do on the AWS account
- An EKS Auto Mode cluster, which provides built-in controllers for networking, storage and node management
To find out more about EKS Auto Mode, I wrote an article a few months ago on this new EKS cluster management mode: EKS Auto Mode: What You Need to Know Before Making the Move
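My deployment provisions this cluster with Terraform, but for illustration, here is a hedged boto3 equivalent of creating an EKS Auto Mode cluster; all names, ARNs and subnet IDs below are placeholders:

import boto3

eks = boto3.client("eks", region_name="eu-west-1")

# Placeholder names and ARNs; Auto Mode delegates compute, storage and
# load balancing to AWS-managed controllers.
eks.create_cluster(
    name="demo-cluster",
    roleArn="arn:aws:iam::123456789012:role/eks-cluster-role",
    resourcesVpcConfig={"subnetIds": ["subnet-aaaa", "subnet-bbbb"]},
    accessConfig={"authenticationMode": "API"},
    computeConfig={
        "enabled": True,
        "nodePools": ["general-purpose", "system"],
        "nodeRoleArn": "arn:aws:iam::123456789012:role/eks-node-role",
    },
    kubernetesNetworkConfig={"elasticLoadBalancing": {"enabled": True}},
    storageConfig={"blockStorage": {"enabled": True}},
)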
Setting up OAuth 2.0 Authentication
Once the architecture described above had been implemented with Terraform, I had to iterate several times on the configuration of the modified MCP server, in particular on the management of OAuth 2.0 authentication.
MCP Client Configuration
First of all, to enable the Claude Desktop MCP client to connect to my remote MCP server, an MCP server entry must be defined in its configuration file (claude_desktop_config.json for Claude Desktop), specifying the server's address and the fact that it is reached remotely (via mcp-remote).
Here’s what my configuration file looks like:
{
  "mcpServers": {
    "eks": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://eks-mcp-server.example.com/sse"
      ]
    }
  }
}
Authentication Flow
The authentication process follows these steps:
- When the Claude Desktop MCP client is launched, the MCP server returns an HTTP 401 code, indicating that authentication is required for access
- The default web browser automatically opens
- The user is redirected to the OAuth authorization URL defined on the server, which in turn redirects to a login page managed by AWS Cognito
This follows the MCP protocol specifications defined in the MCP Authorization documentation.
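To illustrate the first step, here is a rough sketch of the server-side check that triggers this flow, assuming a Starlette-based server and Cognito-issued JWTs; the JWKS URL is a placeholder and claim validation is omitted for brevity:

import jwt
from jwt import PyJWKClient
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import JSONResponse

# Placeholder Cognito user pool JWKS endpoint.
JWKS = PyJWKClient(
    "https://cognito-idp.eu-west-1.amazonaws.com/eu-west-1_XXXXXXXXX/.well-known/jwks.json"
)

class OAuthMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        token = request.headers.get("Authorization", "").removeprefix("Bearer ").strip()
        try:
            # Verify the JWT signature against the Cognito user pool's keys.
            key = JWKS.get_signing_key_from_jwt(token).key
            jwt.decode(token, key, algorithms=["RS256"],
                       options={"verify_aud": False})  # claim checks omitted here
        except jwt.PyJWTError:
            # The 401 is what tells the MCP client to start the OAuth flow.
            return JSONResponse({"error": "unauthorized"}, status_code=401)
        return await call_next(request)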
OAuth 2.0 Authentication Flow Diagram
For this demonstration, I used AWS Cognito as IdP (Identity Provider), with a User Pool containing my user (email/password). However, it is perfectly possible (and even recommended) to connect to the company’s IdP, such as an Entra ID, for example.
Once authenticated, the token is automatically sent back to the MCP client. The client can then exchange data with the MCP server, and take advantage of the server’s functionalities.
User Experience
Concretely, on the user side:
- I open Claude Desktop, my browser opens and redirects me to a login page managed with AWS Cognito:
- I log in with my email address and password:
- I return to Claude Desktop and confirm that the connection to my MCP EKS server is functional:
Managing Permissions to Administer an EKS Cluster
Now that Claude Desktop is securely connected to my MCP server, there's one more important step: give the MCP server the permissions it needs to administer the EKS cluster.
The MCP server is deployed on a Fargate task (ECS). To enable it to interact with the AWS API, this task is assigned an IAM role with all the permissions required to administer an EKS cluster.
At the same time, I configured an EKS Access Entry that grants the IAM role of my ECS task administrator rights on the Kubernetes cluster. In a real deployment this role should be scoped down to least privilege, but for this demonstration I've granted EKS administrator rights for simplicity's sake.
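In my Terraform code this is just a couple of resources; expressed with boto3 for illustration, it amounts to the following (cluster name and role ARN are placeholders):

import boto3

eks = boto3.client("eks", region_name="eu-west-1")

# Register the ECS task role as a Kubernetes principal on the cluster...
eks.create_access_entry(
    clusterName="demo-cluster",
    principalArn="arn:aws:iam::123456789012:role/eks-mcp-server-task-role",
)
# ...then attach the AWS-managed cluster-admin access policy to it.
eks.associate_access_policy(
    clusterName="demo-cluster",
    principalArn="arn:aws:iam::123456789012:role/eks-mcp-server-task-role",
    policyArn="arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy",
    accessScope={"type": "cluster"},
)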
Let’s Play! 🕹️
Once these prerequisites are in place, my EKS MCP server is authorized to interact with the EKS cluster. I can also provide context up front to guide my LLM's decision-making. From there, I can formulate instructions for Claude in natural language.
Example 1: Deploying a Video Game (Sonic 🦔🎮)
“Deploy Sonic game based on this docker image: dazdaz/sonic. Container is listening on port 8080”
This action is translated by the LLM into a sequence of Kubernetes API calls, executed by the MCP EKS server:
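For the curious, the agent's tool calls boil down to something like this sketch with the official Kubernetes Python client; the resource names and replica count are assumptions on my part:

from kubernetes import client, config

config.load_kube_config()  # the MCP server itself authenticates via its IAM role

apps = client.AppsV1Api()
core = client.CoreV1Api()

# A Deployment running the requested image on port 8080...
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="sonic", labels={"app": "sonic"}),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "sonic"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "sonic"}),
            spec=client.V1PodSpec(containers=[client.V1Container(
                name="sonic",
                image="dazdaz/sonic",
                ports=[client.V1ContainerPort(container_port=8080)],
            )]),
        ),
    ),
)
# ...and a LoadBalancer Service to expose it.
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="sonic"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"app": "sonic"},
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)
apps.create_namespaced_deployment(namespace="default", body=deployment)
core.create_namespaced_service(namespace="default", body=service)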
And after a few minutes:
Example 2: Cluster Status Analysis and Recommendations
Another example:
“What’s the status of my cluster? What optimizations could be made?”
I’ll leave you to admire the result, which, I confess, blew me away with its accuracy and speed of analysis the first time:
Note that I’m only using a foundation model (Claude Sonnet 4), which means the result could be even better with a custom model, a fine-tuned model, or RAG, yielding an EKS-expert model with better knowledge of its environment.
Conclusion
To conclude this article, we have shown that it is possible today to industrialize agentic solutions based on MCP servers to manage Cloud infrastructures, in this case on AWS.
Using the example of an EKS MCP server and integrating it into a secure and resilient AWS architecture, we have shown that it is not only feasible to manage Kubernetes clusters in a conversational way, but also to do so while respecting the security, operability and scalability standards required by enterprises.
The automation and simplified interaction offered by this type of solution via natural language and AI agents pave the way for a new era of infrastructure management, where complexity is abstracted and performance is enhanced.
Areas for Improvement
To go a step further and get closer to business practices, areas for improvement would include:
- Deploying specialized MCP servers to interact with tools like Terraform (IaC), ArgoCD (GitOps) or an SCM like GitHub or GitLab, to keep track of the code and configurations generated and applied by AI agents
- Tracing, auditing and versioning the operations carried out by AI agents, to guarantee ongoing compliance
- Enhancing agent performance via specialized models, using approaches such as fine-tuning, integration of customized proprietary models, or Retrieval-Augmented Generation (RAG), to obtain an LLM capable of making precise decisions based on the actual state of the global infrastructure and its history