Sourcesage: Graph-Based Caching & Dev Speed Boost

Sourcesage is an MCP server that caches codebases as graphs, so an LLM can retrieve what it already learned about the code instead of re-analyzing it.

About Sourcesage

What is Sourcesage: Graph-Based Caching & Dev Speed Boost?

Sourcesage is a semantic code analysis framework that speeds up development through graph-based caching. It relies on large language models (LLMs) to interpret code semantics and records structural relationships, design patterns, and other language-independent conventions, so later queries can be answered from the stored graph. The stored representation is designed to keep token overhead low while preserving semantic accuracy across programming languages.

How to Use Sourcesage: Graph-Based Caching & Dev Speed Boost?

Initialize the system via CLI configuration
Run analysis workflows through supported IDE integrations
Execute semantic queries using natural language prompts
Access real-time codebase insights through the query API
Manage knowledge graphs with built-in maintenance tools

Integration requires configuring the sourcesage-agent with your development environment, followed by defining analysis scopes through structured queries.

Sourcesage Features

Key Features of Sourcesage: Graph-Based Caching & Dev Speed Boost

Semantic Relationship Mapping - Captures class hierarchies, dependency chains, and pattern implementations
Multi-Language Comprehension - Processes Python, JavaScript, Java, and other LLM-supported languages
Dynamic Knowledge Graph - Adapts to evolving codebases with incremental updates
Low-Token Storage - Optimized representation reduces API call costs by 40-60%
Query Acceleration - Pre-processed data enables sub-second pattern searches

Use Cases of Sourcesage: Graph-Based Caching & Dev Speed Boost

Codebase Onboarding

Rapidly understand legacy systems through visualized dependency graphs and pattern inventories

Refactoring Assistance

Identify cross-cutting concerns and dependency hotspots before implementation

Security Audits

Automated detection of anti-patterns and vulnerability-prone constructs

Team Collaboration

Shared knowledge graphs for maintaining architectural consistency

Sourcesage FAQ

FAQ: Sourcesage Implementation & Best Practices

How does Sourcesage handle codebase updates?

Incremental analysis reprocesses only the components that changed, so the cache stays consistent without a full rebuild.
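
One common way to decide which components changed is to compare content digests between runs. The sketch below assumes that approach. The source does not say how Sourcesage detects changes, and `content_hash`, `changed_files`, and the file names are hypothetical.

```python
import hashlib


def content_hash(source: str) -> str:
    """Return a stable digest of a file's text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def changed_files(files: dict[str, str], seen: dict[str, str]) -> list[str]:
    """Return the paths whose content differs from the last recorded digest.

    `files` maps path -> current source text. `seen` maps path -> digest
    from the previous run and is updated in place.
    """
    changed = []
    for path, source in sorted(files.items()):
        digest = content_hash(source)
        if seen.get(path) != digest:
            changed.append(path)
            seen[path] = digest
    return changed


seen: dict[str, str] = {}
first = changed_files({"a.py": "x = 1", "b.py": "y = 2"}, seen)
second = changed_files({"a.py": "x = 1", "b.py": "y = 3"}, seen)
print(first)   # ['a.py', 'b.py']
print(second)  # ['b.py']
```

Only the paths returned by `changed_files` would need to be handed back to the LLM for re-analysis.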

What languages are natively supported?

Current support includes Python, JavaScript (ES6+), Java, C#, and TypeScript, with community-driven extensions for other languages.

How is data security ensured?

Stored graphs are encrypted end to end, and query interfaces support configurable access controls.

Can it integrate with CI/CD pipelines?

Yes. It provides API hooks for automated analysis during deployment stages, with Slack and GitHub alert integrations.

Content

SourceSage: Efficient Code Memory for LLMs

SourceSage is an MCP (Model Context Protocol) server that efficiently memorizes key aspects of a codebase—logic, style, and standards—while allowing dynamic updates and fast retrieval. It's designed to be language-agnostic, leveraging the LLM's understanding of code across multiple languages.

Features

Language Agnostic : Works with any programming language the LLM understands
Knowledge Graph Storage : Efficiently stores code entities, relationships, patterns, and style conventions
LLM-Driven Analysis : Relies on the LLM to analyze code and provide insights
Token-Efficient Storage : Optimizes for minimal token usage while maximizing memory capacity
Incremental Updates : Updates knowledge when code changes without redundant storage
Fast Retrieval : Enables quick and accurate retrieval of relevant information
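
To make "token-efficient" concrete, here is a minimal sketch of one possible technique: drop empty optional fields and serialize without extra whitespace. This is an assumption for illustration, not SourceSage's actual storage format. `compact` is a hypothetical helper, and character count is used only as a rough stand-in for token count.

```python
import json


def compact(entity: dict) -> str:
    """Serialize an entity, dropping empty fields and extra whitespace."""
    kept = {k: v for k, v in entity.items() if v not in (None, "", [], {})}
    return json.dumps(kept, separators=(",", ":"))


entity = {
    "name": "User",
    "entity_type": "class",
    "summary": "Account holder",
    "signature": None,
    "language": "python",
    "observations": [],
    "metadata": {},
}

verbose = json.dumps(entity, indent=2)
short = compact(entity)
print(short)
print(len(short) < len(verbose))  # True
```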

How It Works

SourceSage uses a novel approach where:

The LLM analyzes code files (in any language)
The LLM uses MCP tools to register entities, relationships, patterns, and style conventions
SourceSage stores this knowledge in a token-efficient graph structure
The LLM can later query this knowledge when needed

This approach leverages the LLM's inherent language understanding while focusing the MCP server on efficient memory management.
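
The division of labor described above can be sketched with a small in-memory graph: the LLM decides what to register, and the server only stores and returns it. The `KnowledgeGraph` and `Entity` classes below are hypothetical and are not taken from the SourceSage code base. Only the method names mirror the MCP tools listed later in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    entity_type: str
    summary: str
    observations: list[str] = field(default_factory=list)


class KnowledgeGraph:
    """Hypothetical in-memory store; not SourceSage's actual classes."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}
        # Each edge is a (from_entity, relationship_type, to_entity) triple.
        self.edges: set[tuple[str, str, str]] = set()

    def register_entity(self, name: str, entity_type: str, summary: str) -> Entity:
        # Re-registering updates in place, so repeated analysis adds no duplicates.
        entity = self.entities.get(name)
        if entity is None:
            entity = Entity(name, entity_type, summary)
            self.entities[name] = entity
        else:
            entity.entity_type, entity.summary = entity_type, summary
        return entity

    def register_relationship(
        self, from_entity: str, to_entity: str, relationship_type: str
    ) -> None:
        for name in (from_entity, to_entity):
            if name not in self.entities:
                raise KeyError(f"unknown entity: {name}")
        self.edges.add((from_entity, relationship_type, to_entity))

    def relationships_of(self, name: str) -> list[tuple[str, str, str]]:
        # Return every edge that touches the named entity, in a stable order.
        return sorted(e for e in self.edges if name in (e[0], e[2]))


graph = KnowledgeGraph()
graph.register_entity("User", "class", "Account holder")
graph.register_entity("Profile", "class", "Display data for a user")
graph.register_relationship("Profile", "User", "references")
graph.register_entity("User", "class", "Account holder with login state")
print(graph.relationships_of("User"))  # [('Profile', 'references', 'User')]
```

Keying entities by name and storing edges as a set means registering the same fact twice leaves the graph unchanged, which is one simple way to get updates "without redundant storage".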

Installation

# Clone the repository
git clone https://github.com/yourusername/sourcesage.git
cd sourcesage

# Install the package
pip install -e .

Usage

Running the MCP Server

# Run the server
sourcesage

# Or run directly from the repository
python -m sourcesage.mcp_server

Connecting to Claude for Desktop

Open Claude for Desktop
Go to Settings > Developer > Edit Config
Add the following to your claude_desktop_config.json:

If you've installed the package:

{
  "mcpServers": {
    "sourcesage": {
      "command": "sourcesage",
      "args": []
    }
  }
}

If you're running from a local directory without installing:

{
  "mcpServers": {
    "sourcesage": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/sourcesage",
        "run",
        "main.py"
      ]
    }
  }
}

Restart Claude for Desktop
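
If you would rather script the config edit than paste JSON by hand, the following sketch adds a server entry under "mcpServers" without disturbing entries that are already there. `add_server` is a hypothetical helper and `other-server` is a made-up existing entry. Reading and writing the real claude_desktop_config.json file is left out because its location depends on your platform.

```python
import json


def add_server(config_text: str, name: str, command: str, args: list[str]) -> str:
    """Return config JSON with one entry added under "mcpServers"."""
    # An empty or missing config file is treated as an empty object.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}
    return json.dumps(config, indent=2)


existing = '{"mcpServers": {"other": {"command": "other-server", "args": []}}}'
updated = add_server(existing, "sourcesage", "sourcesage", [])
print(updated)
```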

Available Tools

SourceSage provides the following MCP tools:

register_entity : Register a code entity in the knowledge graph

Input:
- name: Name of the entity (e.g., class name, function name)
- entity_type: Type of entity (class, function, module, etc.)
- summary: Brief description of the entity
- signature: Entity signature (optional)
- language: Programming language (optional)
- observations: List of observations about the entity (optional)
- metadata: Additional metadata (optional)

Output: Confirmation message with entity ID
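
In normal use the LLM client issues this call for you. For reference, MCP tool calls travel as JSON-RPC 2.0 messages with the method `tools/call`, so a register_entity request could look like the one built below. `register_entity_request` is a hypothetical helper. Treating `name`, `entity_type`, and `summary` as required is an assumption based on the list above not marking them optional, and the example values are invented.

```python
import json

# Assumed required because the input list does not mark them "(optional)".
REQUIRED = ("name", "entity_type", "summary")


def register_entity_request(request_id: int, **arguments) -> str:
    """Build a JSON-RPC 2.0 `tools/call` message for register_entity."""
    missing = [key for key in REQUIRED if key not in arguments]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "register_entity", "arguments": arguments},
    }
    return json.dumps(message)


request = register_entity_request(
    1,
    name="User",
    entity_type="class",
    summary="Represents an account holder",
    language="python",
    observations=["Validates email on construction"],
)
print(request)
```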

register_relationship : Register a relationship between entities

Input:
- from_entity: Name of the source entity
- to_entity: Name of the target entity
- relationship_type: Type of relationship (calls, inherits, imports, etc.)
- metadata: Additional metadata (optional)

Output: Confirmation message with relationship ID

register_pattern : Register a code pattern

Input:
- name: Name of the pattern
- description: Description of the pattern
- language: Programming language (optional)
- example: Example code demonstrating the pattern (optional)
- metadata: Additional metadata (optional)

Output: Confirmation message with pattern ID

register_style_convention : Register a coding style convention

Input:
- name: Name of the convention
- description: Description of the convention
- language: Programming language (optional)
- examples: Example code snippets demonstrating the convention (optional)
- metadata: Additional metadata (optional)

Output: Confirmation message with convention ID

add_entity_observation : Add an observation to an entity

Input:
- entity_name: Name of the entity
- observation: Observation to add

Output: Confirmation message

query_entities : Query entities in the knowledge graph

Input:
- entity_type: Filter by entity type (optional)
- language: Filter by programming language (optional)
- name_pattern: Filter by name pattern (regex, optional)
- limit: Maximum number of results to return (optional)

Output: List of matching entities
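
The filters combine naturally as a logical AND. The sketch below shows one way that could work over a plain list of entity dicts. It is not the server's implementation: the function body, the sample data, and the choice of `re.search` (match anywhere in the name) over a full match are all assumptions.

```python
import re


def query_entities(entities, entity_type=None, language=None,
                   name_pattern=None, limit=None):
    """Filter a list of entity dicts; every filter is optional."""
    regex = re.compile(name_pattern) if name_pattern else None
    matches = []
    for entity in entities:
        # Stop as soon as the requested number of results is reached.
        if limit is not None and len(matches) >= limit:
            break
        if entity_type and entity.get("entity_type") != entity_type:
            continue
        if language and entity.get("language") != language:
            continue
        if regex and not regex.search(entity["name"]):
            continue
        matches.append(entity)
    return matches


entities = [
    {"name": "User", "entity_type": "class", "language": "python"},
    {"name": "UserProfile", "entity_type": "class", "language": "python"},
    {"name": "load_user", "entity_type": "function", "language": "python"},
    {"name": "UserStore", "entity_type": "class", "language": "typescript"},
]

result = query_entities(
    entities, entity_type="class", language="python", name_pattern=r"^User"
)
print([e["name"] for e in result])  # ['User', 'UserProfile']
```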

get_entity_details : Get detailed information about an entity

Input:
- entity_name: Name of the entity

Output: Detailed information about the entity

query_patterns : Query code patterns in the knowledge graph

Input:
- language: Filter by programming language (optional)
- pattern_name: Filter by pattern name (optional)

Output: List of matching patterns

query_style_conventions : Query coding style conventions

Input:
- language: Filter by programming language (optional)
- convention_name: Filter by convention name (optional)

Output: List of matching style conventions

get_knowledge_statistics : Get statistics about the knowledge graph

Input: None

Output: Statistics about the knowledge graph
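
The source does not specify which statistics are returned. As a rough illustration, counts per entity type and per relationship type are the kind of summary such a tool might produce. The function, the field names in its result, and the sample data below are hypothetical.

```python
from collections import Counter


def knowledge_statistics(entities: list[dict], relationships: list[dict]) -> dict:
    """Summarize a graph as total counts plus counts grouped by type."""
    by_type = Counter(e["entity_type"] for e in entities)
    by_relationship = Counter(r["relationship_type"] for r in relationships)
    return {
        "entity_count": len(entities),
        "relationship_count": len(relationships),
        "entities_by_type": dict(by_type),
        "relationships_by_type": dict(by_relationship),
    }


stats = knowledge_statistics(
    entities=[
        {"name": "User", "entity_type": "class"},
        {"name": "Admin", "entity_type": "class"},
        {"name": "load_user", "entity_type": "function"},
    ],
    relationships=[
        {"from_entity": "Admin", "to_entity": "User", "relationship_type": "inherits"},
        {"from_entity": "load_user", "to_entity": "User", "relationship_type": "calls"},
    ],
)
print(stats)
```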

clear_knowledge : Clear all knowledge from the graph

Input: None

Output: Confirmation message

Example Workflow with Claude

Analyze Code : Ask Claude to analyze your code files

"Please analyze this Python file and register the key entities and relationships."

Register Entities : Claude will use the register_entity tool to store code entities

"I'll register the main class in this file."

Register Relationships : Claude will use the register_relationship tool to store relationships

"I'll register the inheritance relationship between these classes."

Query Knowledge : Later, ask Claude about your codebase

"What classes are defined in my codebase?"
"Show me the details of the User class."
"What's the relationship between the User and Profile classes?"

Get Coding Patterns : Ask Claude about coding patterns

"What design patterns are used in my codebase?"
"Show me examples of the Factory pattern in my code."

How It's Different

Unlike traditional code analysis tools, SourceSage:

Leverages LLM Understanding : Uses the LLM's ability to understand code semantics across languages
Stores Semantic Knowledge : Focuses on meaning and relationships, not just syntax
Is Language Agnostic : Works with any programming language the LLM understands
Optimizes for Token Efficiency : Stores knowledge in a way that minimizes token usage
Evolves with LLM Capabilities : As LLMs improve, so does code understanding

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.