semble — What is it?

MinishLab/semble is a Python code search library that retrieves relevant code snippets quickly and accurately for developers and agents, using far less time and far fewer tokens than a grep+read workflow.

⭐ 4,406 Stars · 🍴 175 Forks · Language: Python · License: MIT · Author: MinishLab
Source: README

Why it matters

Semble's main appeal is a large reduction in token usage (approximately 98% fewer tokens, per the README) and in search time compared with grep+read, which makes it attractive for developers and agents that need efficient code search. Its distinguishing technical choice is a hybrid search approach that aims to combine the speed of grep with the accuracy of a code-specialized transformer.

Source: README, Benchmarks

Core Features

Fast and Accurate Code Search

Semble reports an average indexing time of ~250 ms and a query response time of ~1.5 ms, with accuracy comparable to code-specialized transformer models.

Source: README
Token-Efficient

Semble returns only the relevant code snippets, using approximately 98% fewer tokens than grep+read, which is particularly beneficial for resource-constrained environments.

Source: README
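
To make the "98% fewer tokens" figure concrete, the arithmetic can be sketched as below. The token counts are hypothetical illustration values, not measurements from the README or its benchmarks, and `token_savings` is a helper invented for this example.

```python
def token_savings(baseline_tokens: int, retrieved_tokens: int) -> float:
    """Return the fraction of tokens saved relative to a baseline workflow."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1.0 - retrieved_tokens / baseline_tokens


# Hypothetical numbers: a grep+read pass that loads 50,000 tokens of files
# versus a snippet-only result that returns 1,000 tokens.
baseline = 50_000
snippets = 1_000
print(f"{token_savings(baseline, snippets):.0%} fewer tokens")
```

Under these assumed counts the saving is 98%, which is the order of magnitude the README claims.
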
Zero Setup

Semble operates on CPU without the need for API keys, GPUs, or external services, making it easy to deploy and use without additional infrastructure.

Source: README
MCP Server Support

Semble can run as an MCP server, allowing agents like Claude Code, Cursor, Codex, and OpenCode to directly search any codebase, enhancing the efficiency of code exploration for these agents.

Source: README
Local and Remote Codebases

Semble supports both local and remote codebases, allowing users to search code from local directories or directly from git repositories.

Source: README

Architecture

Semble's architecture is not documented in the analyzed materials; what follows is inferred from the code tree and dependency files. The layout appears modular, with separate concerns for indexing, searching, and the agent-facing interface. The dependencies point to a hybrid retrieval design: bm25s for lexical (BM25) scoring, model2vec for static embeddings, vicinity for nearest-neighbor vector search, tree-sitter and tree-sitter-language-pack for parsing source files, and pathspec for gitignore-style path filtering. How these pieces are combined is not confirmed by the sources.

Source: Code tree + dependency files
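
As a rough illustration of what a hybrid search can look like, the sketch below fuses a lexical ranking with a similarity ranking using reciprocal rank fusion. This is not Semble's implementation: the BM25 scorer is a plain textbook version, the character-trigram cosine is only a stand-in for a real embedding model, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumerics (so save_pretrained -> save, pretrained)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Textbook BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def trigram_vector(text: str) -> Counter:
    """Stand-in for an embedding: counts of character trigrams."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Fuse several rankings: each document earns 1 / (k + rank) per ranking."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return [doc_id for doc_id, _ in fused.most_common()]


def hybrid_search(query: str, docs: list[str]) -> list[int]:
    """Return document indices ordered by the fused lexical + similarity ranking."""
    if not docs:
        return []
    lexical = bm25_scores(query, docs)
    q_vec = trigram_vector(query)
    dense = [cosine(q_vec, trigram_vector(d)) for d in docs]
    by_lexical = sorted(range(len(docs)), key=lambda i: -lexical[i])
    by_dense = sorted(range(len(docs)), key=lambda i: -dense[i])
    return reciprocal_rank_fusion([by_lexical, by_dense])


docs = [
    "def save_pretrained(model, path): write model weights to disk",
    "def authenticate(user, password): check login credentials",
    "def load_config(path): read settings from a yaml file",
]
# The save_pretrained snippet (index 0) ranks first for this query.
print(hybrid_search("save model to disk", docs))
```

Rank fusion is used here because it needs no score calibration between the two rankers; a real system might instead weight or normalize the raw scores.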

Project Knowledge Graph

[Figure: knowledge graph. Center: project (semble). Inner ring: core features (Fast and Accurate Code Search, Token-Efficient, Zero Setup, MCP Server Support, Local and Remote Codebases). Outer ring: key dependencies (model2vec, vicinity, numpy, bm25s, pathspec). Auto-generated from core_features and tech_stack.key_deps.]

Tech Stack

Language: Python
Framework: Not specified
Key dependencies: model2vec, vicinity, numpy, bm25s, pathspec, tree-sitter, tree-sitter-language-pack
Deployment: Local, CPU-based
Source: Dependency files + code tree

Quick Start

pip install semble
uv tool install semble

# In AGENTS.md

## Code Search

Use `semble search` to find code by describing what it does or naming a symbol/identifier, instead of grep:

```bash
semble search "authentication flow" ./my-project
semble search "save_pretrained" ./my-project
semble search "save model to disk" ./my-project --top-k 10
```

Use `semble find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result):

```bash
semble find-related src/auth.py 42 ./my-project
```

`path` defaults to the current directory when omitted; git URLs are accepted. If `semble` is not on `$PATH`, use `uvx --from "semble[mcp]" semble` in its place.
Source: README Installation/Quick Start
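
For scripts that want to call the CLI from Python, the `semble search` invocation shown in the Quick Start can be wrapped with `subprocess`. This is a minimal sketch that uses only the subcommand, positional arguments, `--top-k` flag, and `uvx` fallback quoted from the README; the helper names are hypothetical, and because the CLI's output format is not described in the source, the wrapper returns raw text rather than parsing it.

```python
import shlex
import shutil
import subprocess


def build_search_command(query: str, path: str = ".", top_k=None) -> list[str]:
    """Build the argv list for `semble search`, falling back to uvx if needed."""
    if shutil.which("semble"):
        base = ["semble"]
    else:
        # Fallback quoted in the README for when semble is not on $PATH.
        base = ["uvx", "--from", "semble[mcp]", "semble"]
    cmd = base + ["search", query, path]
    if top_k is not None:
        cmd += ["--top-k", str(top_k)]
    return cmd


def run_search(query: str, path: str = ".", top_k=None) -> str:
    """Run the search and return stdout as unparsed text (format is an unknown here)."""
    result = subprocess.run(
        build_search_command(query, path, top_k),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show the command that would be executed, without running it.
print(shlex.join(build_search_command("save model to disk", "./my-project", top_k=10)))
```

Passing the query as a single list element avoids shell quoting problems with multi-word queries.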

Use Cases

Semble is suitable for developers and agents working with codebases of any size, particularly those requiring efficient code search capabilities. It is useful in scenarios such as debugging, code review, or when building tools that require code understanding, such as AI agents or code search engines.

Source: README

Strengths & Limitations

Strengths

  • Significantly reduces the time and tokens required for code search compared to traditional methods.
  • Easy to deploy and use without additional infrastructure.
  • Supports both local and remote codebases.

Limitations

  • Limited information on the underlying architecture and algorithms.
  • May require additional setup for certain agents like Claude Code or Codex.
Source: Synthesis of README, code structure and dependencies

Latest Release

v0.1.7 (2026-05-12): Fixed savings aggregation and bumped version.

Source: GitHub Releases

Verdict

MinishLab/semble is a promising project for those seeking an efficient and accurate code search solution. Its unique combination of speed and accuracy, along with its ease of deployment, makes it a valuable tool for developers and agents working with codebases of any size.

Transparency Notice
This page is auto-generated by AI (a large language model) from the following public materials: GitHub README, code tree, dependency files and release notes. Analyzed at: 2026-05-22 10:09. Quality score: 85/100.

Data sources: README, GitHub API, dependency files