MCP Image Recognition Server: Smart Insights & AI Precision

MCP Image Recognition Server provides image analysis through the Anthropic and OpenAI vision APIs.


About MCP Image Recognition Server

What is MCP Image Recognition Server: Smart Insights & AI Precision?

This server uses the Anthropic and OpenAI vision APIs to describe images, with optional text extraction via Tesseract OCR. Version 0.1.2 improves OCR error handling and adds test coverage for the OCR functionality.

Key Features of MCP Image Recognition Server: Smart Insights & AI Precision

Multi-Provider Flexibility: Choose primary providers (Claude Vision/GPT-4 Vision) with automatic fallback options
Format Agnostic: Handles JPEG, PNG, GIF, and WebP images supplied as base64 data or as file paths
Intelligent Text Extraction: Optional OCR capability for document analysis
Customizable Models: Leverage OpenRouter to access over 300 models via OpenAI API format
Production-Ready: Docker support and configurable logging for enterprise deployments


How to use MCP Image Recognition Server: Smart Insights & AI Precision?

Install dependencies in a Python 3.8+ environment
Configure .env with API keys and preferred providers
Run the server and call its MCP tools:

describe_image for base64-encoded image data plus a MIME type

describe_image_from_file for a path to an image file

Set log verbosity with LOG_LEVEL (DEBUG, INFO, WARNING, or ERROR)

Use cases of MCP Image Recognition Server: Smart Insights & AI Precision

Document Automation

Extract text from invoices and receipts with OCR, alongside an AI-generated description of each image

Quality Control

Support product image inspections with AI-generated image descriptions

Content Management

Generate descriptive metadata for image libraries at scale

MCP Image Recognition Server FAQ


Do I need Tesseract installed?

Only for the OCR feature. Tesseract must be installed separately; see the official Tesseract documentation.

Can I use custom models?

Yes, through OpenRouter: point OPENAI_BASE_URL at OpenRouter and set OPENAI_MODEL to any of the more than 300 third-party models it offers

How to handle API limits?

Set FALLBACK_PROVIDER in .env so that requests can fall back to a second provider when the primary provider fails, for example on a rate limit

What's the response format?

Each tool returns a detailed text description of the image, with extracted text included when OCR is enabled

Content

MCP Image Recognition Server

An MCP server that provides image recognition capabilities using Anthropic and OpenAI vision APIs. Version 0.1.2.

Features

Image description using Anthropic Claude Vision or OpenAI GPT-4 Vision
Support for multiple image formats (JPEG, PNG, GIF, WebP)
Configurable primary and fallback providers
Base64 and file-based image input support
Optional text extraction using Tesseract OCR
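
The primary/fallback behavior listed above can be pictured with a minimal sketch. This is not the server's actual code: the `Provider` type and the `describe_with_fallback` function are hypothetical names introduced for illustration, and the sketch assumes that any exception from the primary provider triggers the fallback.

```python
from typing import Callable, Optional

# Hypothetical type: a provider takes (image_bytes, mime_type) and
# returns a text description of the image.
Provider = Callable[[bytes, str], str]


def describe_with_fallback(
    image: bytes,
    mime_type: str,
    primary: Provider,
    fallback: Optional[Provider] = None,
) -> str:
    """Try the primary provider; if it raises, try the fallback when one is set."""
    try:
        return primary(image, mime_type)
    except Exception as primary_error:
        if fallback is None:
            # No fallback configured: re-raise the original error unchanged.
            raise
        try:
            return fallback(image, mime_type)
        except Exception as fallback_error:
            # Report both failures so neither is silently lost.
            raise RuntimeError(
                f"primary failed ({primary_error}); "
                f"fallback failed ({fallback_error})"
            ) from fallback_error
```

Passing the providers in as callables keeps the retry logic independent of any particular API client, which also makes it easy to exercise with stub functions.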

Requirements

Python 3.8 or higher
Tesseract OCR (optional) - Required for text extraction feature
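
Because Tesseract is optional, it can help to check whether it is reachable before enabling OCR. The sketch below is an illustration rather than the server's own check: `find_tesseract` and `ocr_available` are hypothetical helpers, and they assume the executable is named `tesseract` unless `TESSERACT_CMD` says otherwise.

```python
import os
import shutil
from typing import Mapping, Optional


def find_tesseract(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the path to a Tesseract executable, or None if none is found."""
    custom = env.get("TESSERACT_CMD", "").strip()
    if custom:
        # shutil.which accepts either a bare command name or a full path.
        return shutil.which(custom)
    return shutil.which("tesseract")


def ocr_available(env: Mapping[str, str] = os.environ) -> bool:
    """OCR is usable only when ENABLE_OCR is true and Tesseract can be located."""
    enabled = env.get("ENABLE_OCR", "false").strip().lower() == "true"
    return enabled and find_tesseract(env) is not None
```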

Installation

Clone the repository:

git clone https://github.com/mario-andreschak/mcp-image-recognition.git
cd mcp-image-recognition

Create and configure your environment file:

cp .env.example .env
# Edit .env with your API keys and preferences

Build the project:

build.bat

Usage

Running the Server

Start the server using Python:

python -m image_recognition_server.server

Alternatively, start the server using the batch script:

run.bat server

Start the server in development mode with the MCP Inspector:

run.bat debug

Available Tools

describe_image

* Input: Base64-encoded image data and MIME type
* Output: Detailed description of the image

describe_image_from_file

* Input: Path to an image file
* Output: Detailed description of the image
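
Preparing the input for describe_image might look like the following sketch. The argument names `image` and `mime_type` are assumptions, since the tool's exact schema is not stated here, and `build_describe_image_args` is a hypothetical helper. The MIME type is taken from the file extension using the four supported formats.

```python
import base64
from pathlib import Path

# File extensions for the four supported image formats.
MIME_BY_SUFFIX = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def build_describe_image_args(path: str) -> dict:
    """Read an image file and return base64 data plus its MIME type."""
    suffix = Path(path).suffix.lower()
    mime_type = MIME_BY_SUFFIX.get(suffix)
    if mime_type is None:
        raise ValueError(f"unsupported image extension: {suffix!r}")
    data = Path(path).read_bytes()
    return {
        "image": base64.b64encode(data).decode("ascii"),  # assumed argument name
        "mime_type": mime_type,  # assumed argument name
    }
```

When the image is already on disk next to the server, describe_image_from_file avoids the encoding step entirely.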

Environment Configuration

ANTHROPIC_API_KEY: Your Anthropic API key.
OPENAI_API_KEY: Your OpenAI API key.
VISION_PROVIDER: Primary vision provider (anthropic or openai).
FALLBACK_PROVIDER: Optional fallback provider.
LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR).
ENABLE_OCR: Enable Tesseract OCR text extraction (true or false).
TESSERACT_CMD: Optional custom path to Tesseract executable.
OPENAI_MODEL: OpenAI Model (default: gpt-4o-mini). Can use OpenRouter format for other models (e.g., anthropic/claude-3.5-sonnet:beta).
OPENAI_BASE_URL: Optional custom base URL for the OpenAI API. Set to https://openrouter.ai/api/v1 for OpenRouter.
OPENAI_TIMEOUT: Optional custom timeout (in seconds) for the OpenAI API.
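
One way to read and validate these variables is sketched below. It is not the server's actual loader: the `Settings` class and `load_settings` function are hypothetical, the INFO default for `LOG_LEVEL` is an assumption, and the rule that each provider in use needs its own API key is inferred from the variable descriptions above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_PROVIDERS = {"anthropic", "openai"}


@dataclass(frozen=True)
class Settings:
    vision_provider: str
    fallback_provider: Optional[str]
    log_level: str
    enable_ocr: bool
    openai_model: str
    openai_base_url: Optional[str]
    openai_timeout: Optional[float]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, rejecting invalid values."""
    provider = env.get("VISION_PROVIDER", "").strip().lower()
    if provider not in VALID_PROVIDERS:
        raise ValueError("VISION_PROVIDER must be 'anthropic' or 'openai'")
    fallback = env.get("FALLBACK_PROVIDER", "").strip().lower() or None
    if fallback is not None and fallback not in VALID_PROVIDERS:
        raise ValueError("FALLBACK_PROVIDER must be 'anthropic' or 'openai'")
    # Assumption: every provider in use needs its matching API key.
    for name in filter(None, (provider, fallback)):
        key_var = f"{name.upper()}_API_KEY"
        if not env.get(key_var):
            raise ValueError(f"{key_var} is required when using {name}")
    timeout = env.get("OPENAI_TIMEOUT", "").strip()
    return Settings(
        vision_provider=provider,
        fallback_provider=fallback,
        log_level=env.get("LOG_LEVEL", "INFO").upper(),  # INFO default is assumed
        enable_ocr=env.get("ENABLE_OCR", "false").strip().lower() == "true",
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        openai_base_url=env.get("OPENAI_BASE_URL") or None,
        openai_timeout=float(timeout) if timeout else None,
    )
```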

Using OpenRouter

OpenRouter allows you to access various models using the OpenAI API format. To use OpenRouter, follow these steps:

Obtain an API key from OpenRouter.
Set OPENAI_API_KEY in your .env file to your OpenRouter API key.
Set OPENAI_BASE_URL to https://openrouter.ai/api/v1.
Set OPENAI_MODEL to the desired model using the OpenRouter format (e.g., anthropic/claude-3.5-sonnet:beta).
Set VISION_PROVIDER to openai.
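
The steps above amount to four entries in the .env file. As a small illustration, the hypothetical helper below renders them as text; the function name `openrouter_env` is invented for this sketch, and the API key shown in the usage line is a placeholder.

```python
def openrouter_env(
    api_key: str,
    model: str = "anthropic/claude-3.5-sonnet:beta",
) -> str:
    """Render the .env lines needed to route OpenAI-format calls to OpenRouter."""
    settings = {
        "OPENAI_API_KEY": api_key,
        "OPENAI_BASE_URL": "https://openrouter.ai/api/v1",
        "OPENAI_MODEL": model,
        "VISION_PROVIDER": "openai",
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    # Placeholder key; substitute the key issued by OpenRouter.
    print(openrouter_env("your-openrouter-api-key"), end="")
```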

Default Models

Anthropic: claude-3.5-sonnet-beta
OpenAI: gpt-4o-mini
OpenRouter: Use the anthropic/claude-3.5-sonnet:beta format in OPENAI_MODEL.

Development

Running Tests

Run all tests:

run.bat test

Run specific test suite:

run.bat test server
run.bat test anthropic
run.bat test openai

Docker Support

Build the Docker image:

docker build -t mcp-image-recognition .

Run the container:

docker run -it --env-file .env mcp-image-recognition

License

MIT License - see LICENSE file for details.

Release History

0.1.2 (2025-02-20): Improved OCR error handling and added comprehensive test coverage for OCR functionality
0.1.1 (2025-02-19): Added Tesseract OCR support for text extraction from images (optional feature)
0.1.0 (2025-02-19): Initial release with Anthropic and OpenAI vision support
