Fetcher MCP
An MCP server for fetching web page content using the Playwright headless browser.
Advantages
JavaScript Support: Unlike traditional web scrapers, Fetcher MCP uses Playwright to execute JavaScript, so it can handle dynamic web content and modern web applications.
Intelligent Content Extraction: The built-in Readability algorithm automatically extracts the main content from web pages, removing ads, navigation, and other non-essential elements.
Flexible Output Format: Supports both HTML and Markdown output, making it easy to integrate with downstream applications.
Parallel Processing: The fetch_urls tool fetches multiple URLs concurrently, which speeds up batch operations.
Resource Optimization: Automatically blocks unnecessary resources (images, stylesheets, fonts, media) to reduce bandwidth usage and improve performance.
Robust Error Handling: Comprehensive error handling and logging keep the server running reliably even when it encounters problematic web pages.
Configurable Parameters: Fine-grained control over timeouts, content extraction, and output formatting to suit different use cases.
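The parallel-processing and error-handling points above can be sketched roughly as follows. This is not the server's actual implementation: `fetchAll`, `FetchResult`, and the injected `fetchOne` function are hypothetical names used only to illustrate fetching several URLs concurrently while isolating per-URL failures.

```typescript
// Hypothetical sketch: fetch several URLs concurrently and keep going when one fails.
// `fetchOne` is injected so the sketch does not depend on any real browser API.

type FetchResult =
  | { url: string; ok: true; content: string }
  | { url: string; ok: false; error: string };

async function fetchAll(
  urls: string[],
  fetchOne: (url: string) => Promise<string>
): Promise<FetchResult[]> {
  // Start every fetch immediately; each task catches its own error so a
  // single bad page cannot reject the whole batch.
  const tasks = urls.map(async (url): Promise<FetchResult> => {
    try {
      const content = await fetchOne(url);
      return { url, ok: true, content };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { url, ok: false, error: message };
    }
  });
  // Promise.all preserves input order, so results line up with `urls`.
  return Promise.all(tasks);
}
```

Because every task resolves to a `FetchResult` rather than rejecting, a caller can report failed URLs alongside the successful ones instead of losing the whole batch.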
Quick Start
Run directly with npx:
npx -y fetcher-mcp
Debug Mode
Run with the --debug option to show the browser window for debugging:
npx -y fetcher-mcp --debug
MCP Configuration
Configure this MCP server in Claude Desktop:
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
On Windows: %APPDATA%/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "fetcher": {
      "command": "npx",
      "args": ["-y", "fetcher-mcp"]
    }
  }
}
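If the config file already lists other servers, the entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge. `withFetcherServer` and the two interfaces are hypothetical helpers, and passing `--debug` through `args` is an assumption based on the command-line usage shown earlier, not a documented config option.

```typescript
// Hypothetical helper: add the "fetcher" entry to a Claude Desktop config object
// without discarding servers or settings that are already present.

interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

function withFetcherServer(
  config: ClaudeDesktopConfig,
  debug: boolean = false
): ClaudeDesktopConfig {
  const args: string[] = ["-y", "fetcher-mcp"];
  if (debug) {
    // Assumption: the CLI flag can be appended to the npx arguments.
    args.push("--debug");
  }
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      fetcher: { command: "npx", args },
    },
  };
}

// Prints the same structure as the JSON above.
console.log(JSON.stringify(withFetcherServer({}), null, 2));
```

The function returns a new object instead of mutating its input, so the original config can still be written back unchanged if something goes wrong.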
Features
Tips
Handling Special Website Scenarios
Dealing with Anti-Crawler Mechanisms
Wait for Complete Loading: For websites using CAPTCHA, redirects, or other verification mechanisms, include in your prompt:
Please wait for the page to fully load
This will use the waitForNavigation: true parameter.
A prompt that requests a specific page-loading timeout adjusts both the timeout and navigationTimeout parameters accordingly.
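As a rough illustration of what such prompts translate to, the sketch below builds an arguments object for a slow or verification-heavy page. The parameter names `waitForNavigation`, `timeout`, and `navigationTimeout` come from this README. The `url` field, the millisecond unit, the choice to give both timeouts the same value, and the `slowPageArgs` helper itself are assumptions made for the example.

```typescript
// Hypothetical sketch: arguments for a page that redirects or loads slowly.

interface SlowPageArgs {
  url: string;               // assumed field name
  waitForNavigation: boolean;
  timeout: number;           // assumed to be milliseconds
  navigationTimeout: number; // assumed to be milliseconds
}

function slowPageArgs(url: string, timeoutSeconds: number): SlowPageArgs {
  if (!(timeoutSeconds > 0)) {
    throw new RangeError("timeoutSeconds must be a positive number");
  }
  const ms = Math.round(timeoutSeconds * 1000);
  return {
    url,
    waitForNavigation: true, // wait out redirects or verification pages
    timeout: ms,
    navigationTimeout: ms,   // assumption: both timeouts share one value
  };
}

console.log(slowPageArgs("https://example.com", 60));
```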
Content Retrieval Adjustments
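Prompts can also change what is returned:
Full page as HTML, skipping content extraction: sets extractContent: false and returnHtml: true.
Full page instead of only the extracted main content: sets extractContent: false.
HTML output: sets returnHtml: true.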
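The way the two flags combine can be summarized in a small sketch. The flag names `extractContent` and `returnHtml` come from this README. The defaults used here (extraction on, Markdown output) are assumptions, and `describeOutput` is a hypothetical helper, not part of the server.

```typescript
// Hypothetical sketch: describe what a given flag combination would return.

interface ContentOptions {
  extractContent?: boolean;
  returnHtml?: boolean;
}

function describeOutput(options: ContentOptions = {}): string {
  // Assumed defaults: extract the main content and return it as Markdown.
  const extract = options.extractContent ?? true;
  const html = options.returnHtml ?? false;
  const scope = extract ? "main content" : "full page";
  const format = html ? "HTML" : "Markdown";
  return scope + " as " + format;
}

console.log(describeOutput({ extractContent: false, returnHtml: true }));
console.log(describeOutput({ extractContent: false }));
console.log(describeOutput({ returnHtml: true }));
```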
Debugging and Authentication
Enabling Debug Mode
A prompt that asks for debug mode sets debug: true for that request, even if the server was started without the --debug flag.
Using Custom Cookies for Authentication
Manual login sets debug: true or uses the --debug flag, keeping the browser window open so you can log in.
Interacting with Debug Browser: When debug mode is enabled:
- The browser window remains open
- You can manually log into the website using your credentials
- After login is complete, content will be fetched with your authenticated session
Enable Debug for Specific Requests: Even if the server is already running, you can enable debug mode for a specific request:
Please enable debug mode for this authentication step
This sets debug: true for this specific request only, opening the browser window for manual login.
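The interaction between the server-wide --debug flag and the per-request debug parameter might be resolved along these lines. Both function names are hypothetical, and the rule that an explicit per-request value always wins (including an explicit false) is an assumption. This README only states that a request can turn debug mode on when the flag is absent.

```typescript
// Hypothetical sketch: decide whether to show the browser window for one request.

function serverDebugFromArgv(argv: string[]): boolean {
  // True when the server process was started with --debug.
  return argv.indexOf("--debug") !== -1;
}

function effectiveDebug(serverDebug: boolean, requestDebug?: boolean): boolean {
  // Assumption: a per-request value overrides the server flag when present;
  // otherwise the server-wide setting applies.
  return requestDebug ?? serverDebug;
}

const startedWithDebug = serverDebugFromArgv(["build/index.js"]);
console.log(effectiveDebug(startedWithDebug, true)); // request opts in to debug mode
console.log(effectiveDebug(startedWithDebug));       // falls back to the server flag
```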
Development
Install Dependencies
npm install
Install Playwright Browser
Install the browsers needed for Playwright:
npm run install-browser
Build the Server
npm run build
Debugging
Use MCP Inspector for debugging:
npm run inspector
You can also enable visible browser mode for debugging:
node build/index.js --debug
License
Licensed under the MIT License