MCP Browser Automation Server: Automate Tasks & Monitor via Console

Effortlessly automate browser tasks, capture screenshots, and monitor via console logs with MCP Browser Automation Server – boosting productivity for developers and testers.


About MCP Browser Automation Server

What is MCP Browser Automation Server?

This server enables programmatic control over web browsers via REST APIs, allowing automated tasks such as form submissions, element interactions, and real-time console monitoring. It simplifies web automation workflows by abstracting browser operations into easy-to-use endpoints.

How to Use It

Start by installing dependencies and launching the server. Core workflows involve:

  • Creating a session through the session-creation endpoint (POST /session/create)
  • Issuing commands through that session's HTTP endpoints (the ones documented below all use POST)
  • Handling asynchronous responses for actions like page loads
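
The steps above can be sketched in Python roughly as follows. This is a minimal, standard-library-only illustration, not the project's own client: the endpoint paths are the ones listed under API Endpoints, while `BASE_URL`, `post`, and `run_workflow` are hypothetical names, and the sketch assumes each endpoint accepts an empty POST body.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:8000"  # address given in the Usage section


def post(path: str, **params: str) -> bytes:
    """POST to an endpoint with URL-encoded query parameters; return the body."""
    query = urllib.parse.urlencode(params)
    url = f"{BASE_URL}{path}" + (f"?{query}" if query else "")
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def run_workflow(target_url: str, send: Callable[..., bytes] = post) -> str:
    """Create a session, load one page, close the session; return the session ID."""
    session_id = json.loads(send("/session/create"))["session_id"]
    # Whether this call blocks until the page has loaded is not documented;
    # the sketch assumes it does.
    send(f"/session/{session_id}/navigate", url=target_url)
    send(f"/session/{session_id}/close")
    return session_id
```

The `send` parameter exists only so the sequence can be exercised with a stand-in function when no server is running.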

MCP Browser Automation Server Features

Key Features

  • Multi-browser support (Chrome/Firefox/Edge)
  • Element selectors (XPath/CSS)
  • Screen capture with region cropping
  • JavaScript execution injection
  • Automated wait conditions
  • Network traffic inspection

Real-World Use Cases

Common applications include:

  • Payment gateway testing
  • SEO content monitoring
  • Competitor price tracking
  • Accessibility compliance checks
  • User journey simulation

MCP Browser Automation Server FAQ

FAQ

Does it require a specific OS?
It runs on Linux, macOS, and Windows, with Docker support.

How are errors handled?
The server returns standardized JSON error objects with HTTP status codes.

Can I persist sessions?
Sessions persist until they are explicitly closed or the server is restarted.

What authentication options exist?
Basic auth, API keys, and OAuth2 are supported.
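
The error-handling answer above does not say which fields the JSON error objects contain, so a client can only rely on the status code and whatever body comes back. The sketch below shows one way to surface both without assuming a schema; `summarize_error` and `post_or_raise` are hypothetical helper names, and the code uses only the Python standard library.

```python
import json
import urllib.error
import urllib.request


def summarize_error(status: int, body: bytes) -> str:
    """Turn an error response into a one-line message without assuming a schema."""
    text = body.decode("utf-8", errors="replace")
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON after all: fall back to the raw text
        return f"HTTP {status}: {text.strip()}"
    # Re-serialize compactly so whatever fields the server sent stay visible
    return f"HTTP {status}: {json.dumps(payload, sort_keys=True)}"


def post_or_raise(url: str) -> bytes:
    """POST to `url`; on a 4xx/5xx response raise RuntimeError with the details."""
    request = urllib.request.Request(url, data=b"", method="POST")
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        raise RuntimeError(summarize_error(err.code, err.read())) from err
```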

MCP Browser Automation Server

A simple but powerful browser automation server that allows you to control browsers, take screenshots, and monitor console logs through a REST API.

Features

  • Create browser sessions
  • Navigate to URLs
  • Take screenshots (full page or specific elements)
  • Click elements
  • Fill form inputs
  • Monitor console logs in real time through a WebSocket
  • Close sessions

Installation

  1. Clone this repository:
git clone https://github.com/weir1/mcp-browser-automation.git
cd mcp-browser-automation
  2. Create a virtual environment and activate it (the activation command shown is for Windows; on Linux/macOS use source venv/bin/activate instead):
python -m venv venv
.\venv\Scripts\Activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Install Playwright browsers:
playwright install

Usage

  1. Start the server:
python server.py

The server will start on http://localhost:8000

API Endpoints

Create a new session

POST /session/create
Response: { "session_id": "..." }

Navigate to a URL

POST /session/{session_id}/navigate?url=https://example.com
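
Because the target address is itself passed as a query parameter, any `?`, `&`, or `=` inside it should be percent-encoded, otherwise the server may read them as separate parameters of the navigate call. A minimal sketch of building the request URL safely, where `BASE_URL` and `navigate_url` are illustrative names:

```python
import urllib.parse

BASE_URL = "http://localhost:8000"  # address given in the Usage section


def navigate_url(session_id: str, target: str) -> str:
    """Build the navigate endpoint URL with `target` percent-encoded."""
    query = urllib.parse.urlencode({"url": target})
    return f"{BASE_URL}/session/{session_id}/navigate?{query}"


built = navigate_url("abc123", "https://example.com/search?q=mcp&page=2")
print(built)
# http://localhost:8000/session/abc123/navigate?url=https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Dmcp%26page%3D2
```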

Take a screenshot

POST /session/{session_id}/screenshot?name=screenshot1&selector=.my-element

If selector is omitted, the server takes a full-page screenshot.
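
The optional selector can be handled by adding the parameter only when it is given. The sketch below is one way to do that; `screenshot_url` and `save_screenshot` are hypothetical helpers, and `save_screenshot` assumes the response body is the image itself, as the Python example later in this README does.

```python
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # address given in the Usage section


def screenshot_url(session_id: str, name: str, selector: Optional[str] = None) -> str:
    """Build the screenshot URL; leaving out `selector` requests the full page."""
    params = {"name": name}
    if selector is not None:
        params["selector"] = selector
    return f"{BASE_URL}/session/{session_id}/screenshot?{urllib.parse.urlencode(params)}"


def save_screenshot(url: str, path: str) -> None:
    """POST to the screenshot URL and write the response body to `path`."""
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=60) as response, open(path, "wb") as f:
        f.write(response.read())
```

For example, `save_screenshot(screenshot_url(session_id, "hero", ".my-element"), "hero.png")` would capture a single element, while dropping the third argument captures the whole page.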

Click an element

POST /session/{session_id}/click?selector=.my-button

Fill an input

POST /session/{session_id}/fill?selector=input[name="username"]&value=myuser
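
Selectors such as `input[name="username"]` contain brackets and quotes, which are not safe to place in a URL unencoded. The sketch below combines the fill and click endpoints into an ordered list of request URLs for a login form. The password and submit-button selectors are hypothetical and depend on the page being automated, as are the helper names `action_url` and `login_steps`.

```python
import urllib.parse
from typing import List

BASE_URL = "http://localhost:8000"  # address given in the Usage section


def action_url(session_id: str, action: str, **params: str) -> str:
    """Build a session action URL with all parameters percent-encoded."""
    query = urllib.parse.urlencode(params)
    return f"{BASE_URL}/session/{session_id}/{action}?{query}"


def login_steps(session_id: str, username: str, password: str) -> List[str]:
    """URLs to POST, in order, to fill and submit a (hypothetical) login form."""
    return [
        action_url(session_id, "fill", selector='input[name="username"]', value=username),
        action_url(session_id, "fill", selector='input[name="password"]', value=password),
        action_url(session_id, "click", selector='button[type="submit"]'),
    ]
```

Each URL in the returned list would then be sent as a POST request, one after another.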

Monitor console logs

WebSocket /session/{session_id}/console
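
A script often needs console output for a bounded period rather than indefinitely. The sketch below derives the WebSocket address from the HTTP base URL and collects messages until a deadline passes. It relies on the third-party `websockets` package and assumes each message is JSON, as the Python example later in this README does. The helper names are illustrative, and serving over `https`/`wss` is an assumption the README does not confirm.

```python
import asyncio
import json
from typing import List
from urllib.parse import urlsplit, urlunsplit


def console_ws_url(base_url: str, session_id: str) -> str:
    """Derive the console WebSocket URL from the server's HTTP base URL."""
    parts = urlsplit(base_url)
    scheme = "wss" if parts.scheme == "https" else "ws"
    return urlunsplit((scheme, parts.netloc, f"/session/{session_id}/console", "", ""))


async def collect_console(ws_url: str, seconds: float) -> List[object]:
    """Collect console messages for roughly `seconds`, then disconnect."""
    import websockets  # third-party; imported here so the URL helper works without it

    messages: List[object] = []
    loop = asyncio.get_running_loop()
    deadline = loop.time() + seconds
    async with websockets.connect(ws_url) as ws:
        while (remaining := deadline - loop.time()) > 0:
            try:
                raw = await asyncio.wait_for(ws.recv(), timeout=remaining)
            except asyncio.TimeoutError:
                break  # deadline reached with no further messages
            messages.append(json.loads(raw))
    return messages
```

A call such as `asyncio.run(collect_console(console_ws_url("http://localhost:8000", session_id), 5))` would gather about five seconds of console output.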

Close a session

POST /session/{session_id}/close
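
Since sessions stay open until they are closed or the server restarts, a script that fails midway can leave a browser session behind. One way to guard against that is a context manager that always calls the close endpoint. This is a sketch with hypothetical names (`post`, `browser_session`), assuming the endpoints accept an empty POST body.

```python
import json
import urllib.request
from contextlib import contextmanager
from typing import Callable, Iterator

BASE_URL = "http://localhost:8000"  # address given in the Usage section


def post(path: str) -> bytes:
    """Send an empty-bodied POST to `path` and return the response body."""
    request = urllib.request.Request(f"{BASE_URL}{path}", data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


@contextmanager
def browser_session(send: Callable[[str], bytes] = post) -> Iterator[str]:
    """Yield a new session ID and close the session on exit, even after an error."""
    session_id = json.loads(send("/session/create"))["session_id"]
    try:
        yield session_id
    finally:
        send(f"/session/{session_id}/close")
```

Inside `with browser_session() as session_id:` the session is available for navigate, click, and fill calls, and it is closed when the block exits for any reason.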

Example Usage with Python

import asyncio
import json

import requests
import websockets

BASE_URL = "http://localhost:8000"

# Create a session and fail early if the server reports an error
response = requests.post(f"{BASE_URL}/session/create")
response.raise_for_status()
session_id = response.json()["session_id"]

# Navigate to a URL (params= URL-encodes the query string)
requests.post(f"{BASE_URL}/session/{session_id}/navigate", params={"url": "https://example.com"})

# Take a full-page screenshot and save the response body to disk
response = requests.post(f"{BASE_URL}/session/{session_id}/screenshot", params={"name": "example"})
with open("screenshot.png", "wb") as f:
    f.write(response.content)

# Monitor console logs until interrupted (Ctrl+C)
async def monitor_console():
    async with websockets.connect(f"ws://localhost:8000/session/{session_id}/console") as ws:
        while True:
            message = await ws.recv()
            print(json.loads(message))

# asyncio.run() replaces the deprecated get_event_loop().run_until_complete() pattern
asyncio.run(monitor_console())

License

MIT
