WebSearch: Advanced Search & Seamless Content Extraction

WebSearch empowers Claude with instant, precise web access via MCP Server, delivering advanced search & seamless content extraction for unmatched efficiency.

About WebSearch

What is WebSearch: Advanced Search & Seamless Content Extraction?

WebSearch is a powerful Python-based tool designed to streamline advanced web search and content extraction tasks. Leveraging the Firecrawl API, it offers robust capabilities for retrieving structured data, automating content aggregation, and integrating AI-driven analysis. The platform supports markdown/HTML output formats and seamlessly connects with AI models via the Model Context Protocol (MCP) for enhanced workflow efficiency.

How to Use WebSearch: Advanced Search & Seamless Content Extraction?

  1. Install dependencies and configure your environment using the provided setup guide.
  2. Obtain API keys from OpenAI and Firecrawl to unlock full functionality.
  3. Integrate with your preferred AI platform (e.g., Claude for Desktop) via MCP protocol specifications.
  4. Execute search queries or content extraction tasks through command-line interfaces or API endpoints.
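
The search call in step 4 can be sketched as a small script. The endpoint path and payload fields below are assumptions modeled on Firecrawl's public REST API, not taken from this project's code:

```python
import json
import os
import urllib.request

# Assumed Firecrawl search endpoint; check the Firecrawl docs for the current path.
FIRECRAWL_SEARCH_URL = "https://api.firecrawl.dev/v1/search"

def build_search_request(query: str, limit: int = 5) -> urllib.request.Request:
    """Build an authenticated POST request; the payload fields are illustrative."""
    payload = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        FIRECRAWL_SEARCH_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {os.environ.get('FIRECRAWL_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Example (requires a valid key and network access):
#   with urllib.request.urlopen(build_search_request("model context protocol")) as resp:
#       print(json.load(resp))
```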

WebSearch Features

Key Features of WebSearch: Advanced Search & Seamless Content Extraction

  • Intelligent Query Processing: Advanced natural language search capabilities with contextual understanding.
  • Automated Content Harvesting: Extract structured data from web pages at scale using Firecrawl's scraping engine.
  • AI-Powered Analysis: Direct integration with OpenAI models for real-time content interpretation and summarization.
  • Format Flexibility: Output results in markdown, HTML, or JSON formats for seamless workflow integration.
  • Enterprise-Ready Security: Secure API key management and encrypted data transmission.

Use Cases of WebSearch: Advanced Search & Seamless Content Extraction

Common applications include:

  • Market research data aggregation
  • Competitor analysis through automated website monitoring
  • Content curation for newsrooms or social media platforms
  • Legal discovery and document retrieval systems
  • Academic research data collection with citation tracking

WebSearch FAQ

FAQ for WebSearch: Advanced Search & Seamless Content Extraction

How do I obtain API keys?
Firecrawl keys are available by registering on the Firecrawl site. OpenAI keys can be created from the API keys section of the OpenAI platform after signing up.
What platforms are supported?
Runs on Linux, macOS, and Windows (natively or through WSL); the installation guide below covers each platform. Full MCP integration supports all major AI development environments.
Can I customize extraction rules?
Yes - XPath and CSS selector support allows creation of custom data mapping configurations.
What error handling features exist?
Includes automatic retries for transient failures, IP rotation for anti-scraping protection, and detailed logging for troubleshooting.
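
The automatic-retry behavior described above can be sketched with a small decorator. This is a stand-alone illustration of retry-with-exponential-backoff, not the project's actual implementation:

```python
import functools
import time

def retry(attempts: int = 3, base_delay: float = 0.1):
    """Retry a callable on any exception, doubling the delay each attempt."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts - 1:
                        raise  # out of attempts: surface the error
                    time.sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, ...
        return wrapper
    return decorator

@retry(attempts=3, base_delay=0.01)
def flaky_fetch(state={"calls": 0}):
    """Simulates a transient failure that succeeds on the third call."""
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient network error")
    return "ok"
```

With the settings above, `flaky_fetch()` fails twice, is retried, and returns `"ok"` on the third attempt.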

Content

WebSearch - Advanced Web Search and Content Extraction Tool


A powerful web search and content extraction tool built with Python, leveraging the Firecrawl API for advanced web scraping, searching, and content analysis capabilities.

🚀 Features

  • Advanced Web Search : Perform intelligent web searches with customizable parameters
  • Content Extraction : Extract specific information from web pages using natural language prompts
  • Web Crawling : Crawl websites with configurable depth and limits
  • Web Scraping : Scrape web pages with support for various output formats
  • MCP Integration : Built as a Model Context Protocol (MCP) server for seamless integration

📋 Prerequisites

  • Python 3.8 or higher
  • uv package manager
  • Firecrawl API key
  • OpenAI API key (optional, for enhanced features)
  • Tavily API key (optional, for additional search capabilities)

🛠️ Installation

  1. Install uv:
# On Windows (using pip)
pip install uv

# On Unix/MacOS
curl -LsSf https://astral.sh/uv/install.sh | sh

# Add uv to PATH (Unix/MacOS)
export PATH="$HOME/.local/bin:$PATH"

# Add uv to PATH (Windows - add to Environment Variables)
# Add: %USERPROFILE%\.local\bin
  2. Clone the repository:
git clone https://github.com/yourusername/websearch.git
cd websearch
  3. Create and activate a virtual environment with uv:
# Create virtual environment
uv venv

# Activate on Windows
.\.venv\Scripts\activate.ps1

# Activate on Unix/MacOS
source .venv/bin/activate
  4. Install dependencies with uv:
# Install dependencies from pyproject.toml and the lockfile
uv sync
  5. Set up environment variables:
# Create .env file
touch .env

# Add your API keys
FIRECRAWL_API_KEY=your_firecrawl_api_key
OPENAI_API_KEY=your_openai_api_key
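
Loading the `.env` file is usually handled by a library such as python-dotenv; for illustration, a minimal stdlib-only loader for the `KEY=value` format above might look like this:

```python
import os

def load_env(path: str = ".env") -> dict:
    """Minimal .env loader: KEY=value lines; blanks and '#' comments are skipped."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    os.environ.update(values)  # make the keys visible to the rest of the process
    return values
```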

🎯 Usage

Setting Up With Claude for Desktop

Instead of running the server directly, you can configure Claude for Desktop to access the WebSearch tools:

  1. Locate or create your Claude for Desktop configuration file:
* Windows: `%APPDATA%\Claude\claude_desktop_config.json`
* macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
  2. Add the WebSearch server configuration to the mcpServers section:
{
  "mcpServers": {
    "websearch": {
      "command": "uv",
      "args": [
        "--directory",
        "D:\\ABSOLUTE\\PATH\\TO\\WebSearch",
        "run",
        "main.py"
      ]
    }
  }
}
  3. Make sure to replace the directory path with the absolute path to your WebSearch project folder.

  4. Save the configuration file and restart Claude for Desktop.

  5. Once configured, the WebSearch tools will appear in the tools menu (hammer icon) in Claude for Desktop.
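
A common failure mode at this point is a typo in the JSON or a wrong project path. A short sanity-check script for the configuration (an illustrative sketch, not part of the project) could look like:

```python
import json
from pathlib import Path

def check_mcp_config(config_text: str, server_name: str = "websearch") -> list:
    """Return a list of problems found in a Claude for Desktop MCP config."""
    problems = []
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    server = config.get("mcpServers", {}).get(server_name)
    if server is None:
        return [f"no '{server_name}' entry under mcpServers"]
    args = server.get("args", [])
    if "--directory" in args:
        # The argument after --directory should be an existing project folder.
        directory = args[args.index("--directory") + 1]
        if not Path(directory).is_dir():
            problems.append(f"project directory does not exist: {directory}")
    return problems
```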

Available Tools

  1. Search

  2. Extract Information

  3. Crawl Websites

  4. Scrape Content

📚 API Reference

Search

  • query (str): The search query
  • Returns: Search results in JSON format

Extract

  • urls (List[str]): List of URLs to extract information from
  • prompt (str): Instructions for extraction
  • enableWebSearch (bool): Enable supplementary web search
  • showSources (bool): Include source references
  • Returns: Extracted information in specified format

Crawl

  • url (str): Starting URL
  • maxDepth (int): Maximum crawl depth
  • limit (int): Maximum pages to crawl
  • Returns: Crawled content in markdown/HTML format

Scrape

  • url (str): Target URL
  • Returns: Scraped content with optional screenshots
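
The four tools above share a common call shape. The helper below mirrors the documented parameter names as a validation sketch; the record layout is an assumption for illustration, not the project's actual wire format:

```python
def make_tool_call(tool: str, **params) -> dict:
    """Build a tool-call record, validating parameters against the API reference."""
    allowed = {
        "search": {"query"},
        "extract": {"urls", "prompt", "enableWebSearch", "showSources"},
        "crawl": {"url", "maxDepth", "limit"},
        "scrape": {"url"},
    }
    if tool not in allowed:
        raise ValueError(f"unknown tool: {tool}")
    unexpected = set(params) - allowed[tool]
    if unexpected:
        raise ValueError(f"unexpected parameters for {tool}: {sorted(unexpected)}")
    return {"tool": tool, "params": params}

# Example: make_tool_call("crawl", url="https://example.com", maxDepth=2, limit=10)
```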

🔧 Configuration

Environment Variables

The tool requires certain API keys to function. We provide a .env.example file that you can use as a template:

  1. Copy the example file:
# On Unix/MacOS
cp .env.example .env

# On Windows
copy .env.example .env
  2. Edit the .env file with your API keys:
# OpenAI API key - Required for AI-powered features
OPENAI_API_KEY=your_openai_api_key_here

# Firecrawl API key - Required for web scraping and searching
FIRECRAWL_API_KEY=your_firecrawl_api_key_here

Getting the API Keys

  1. OpenAI API Key:
* Visit [OpenAI's platform](https://platform.openai.com/)
* Sign up or log in
* Navigate to API keys section
* Create a new secret key
  2. Firecrawl API Key:
* Visit [Firecrawl's website](https://docs.firecrawl.dev/)
* Create an account
* Navigate to your dashboard
* Generate a new API key

You can verify your setup by running a test search; if everything is configured correctly, you should receive a JSON response with search results.

Troubleshooting

If you encounter errors:

  1. Ensure all required API keys are set in your .env file
  2. Verify the API keys are valid and have not expired
  3. Check that the .env file is in the root directory of the project
  4. Make sure the environment variables are being loaded correctly
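
Steps 1–4 above can be automated with a short check script. The variable names match this README; the placeholder detection is an illustrative assumption:

```python
import os

REQUIRED_KEYS = ["FIRECRAWL_API_KEY"]
OPTIONAL_KEYS = ["OPENAI_API_KEY", "TAVILY_API_KEY"]

def check_environment(env=os.environ) -> list:
    """Return human-readable problems with the configured API keys."""
    problems = []
    for key in REQUIRED_KEYS:
        value = env.get(key, "")
        if not value:
            problems.append(f"{key} is not set (check your .env file)")
        elif value.startswith("your_"):
            problems.append(f"{key} still holds the placeholder value")
    for key in OPTIONAL_KEYS:
        if not env.get(key):
            problems.append(f"{key} is not set (optional; some features disabled)")
    return problems
```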

🤝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Firecrawl for their powerful web scraping API
  • OpenAI for AI capabilities
  • The MCP community for the protocol specification

📬 Contact

José Martín Rodriguez Mortaloni - @m4s1t425 - [email protected]


Made with ❤️ using Python and Firecrawl
