Development Guide

Quake is a high‑performance vector search engine written in C++ with Python bindings. It supports adaptive, real‑time updates and query‑adaptive search: you specify a recall target, and the index automatically adjusts its search scope to meet it. This guide introduces Quake's design, coding standards, and contribution workflow.
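To make the idea of a recall target concrete, here is a minimal pure-Python sketch. It is not Quake's mechanism or API: it uses plain offline nprobe tuning against brute-force ground truth on a toy partitioned index, and every name in it (`build_partitions`, `search`, `smallest_nprobe_for`) is hypothetical. It only illustrates how widening the search scope trades work for recall.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_partitions(vectors, centroids):
    """Assign each vector id to the partition of its nearest centroid."""
    parts = [[] for _ in centroids]
    for vid, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
        parts[nearest].append(vid)
    return parts

def search(query, vectors, centroids, parts, k, nprobe):
    """Scan only the nprobe partitions whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [vid for c in order[:nprobe] for vid in parts[c]]
    return sorted(candidates, key=lambda vid: sq_dist(query, vectors[vid]))[:k]

def recall_at_k(found, truth):
    """Fraction of the true nearest neighbors that the search returned."""
    return len(set(found) & set(truth)) / len(truth)

def smallest_nprobe_for(target, queries, vectors, centroids, parts, k):
    """Return the smallest nprobe whose mean recall@k reaches the target."""
    all_ids = range(len(vectors))
    # Brute-force ground truth: the exact k nearest neighbors of each query.
    truths = [sorted(all_ids, key=lambda vid: sq_dist(q, vectors[vid]))[:k]
              for q in queries]
    for nprobe in range(1, len(centroids) + 1):
        recalls = [recall_at_k(search(q, vectors, centroids, parts, k, nprobe), t)
                   for q, t in zip(queries, truths)]
        if sum(recalls) / len(recalls) >= target:
            return nprobe
    return len(centroids)

# Toy data: 200 random 2-D points, 8 partitions seeded from the data itself.
rng = random.Random(0)
vectors = [[rng.random(), rng.random()] for _ in range(200)]
centroids = rng.sample(vectors, 8)
parts = build_partitions(vectors, centroids)
queries = [[rng.random(), rng.random()] for _ in range(20)]

nprobe = smallest_nprobe_for(0.9, queries, vectors, centroids, parts, k=5)
print("partitions to scan for 90% recall:", nprobe)
```

A lower target is satisfied by scanning fewer partitions, and scanning every partition always gives full recall, which is the trade-off a recall target lets the caller control.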

Overview of the Architecture

Quake’s design is split into two major layers:

  1. The C++ Core. Implements the heavy lifting: index construction, query processing, vector updates, and dynamic maintenance (splits, merges, and reassignments). It lives in the src/cpp/ directory. Key components include:

    • QuakeIndex: The central class that coordinates building, searching, and maintaining the index.

    • PartitionManager: Manages dynamic partitions (or “clusters”) of vectors.

    • MaintenancePolicy: Encapsulates the rules for when and how to split or merge partitions to meet recall targets.

    • QueryCoordinator: Distributes search queries across partitions and aggregates results.

    • Bindings: C++ functionality is exposed to Python via pybind11 (located in src/cpp/bindings).

  2. The Python Layer. Provides user-friendly wrappers, dataset loaders, and utility functions for integrating with PyTorch and other ML workflows. It is located in the src/python/ directory; Sphinx (with autodoc) extracts the docstrings from this code to build the documentation.
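As a rough illustration of what a maintenance rule for splits and merges can look like, the sketch below uses fixed size thresholds. This is an assumption made purely for illustration: the guide does not describe the rules MaintenancePolicy actually applies, and `ToyMaintenancePolicy`, `min_size`, and `max_size` are hypothetical names, not part of Quake.

```python
from dataclasses import dataclass

@dataclass
class ToyMaintenancePolicy:
    """Hypothetical size-based rule; not Quake's MaintenancePolicy."""
    min_size: int  # partitions smaller than this are merge candidates
    max_size: int  # partitions larger than this are split candidates

    def plan(self, partition_sizes):
        """Map each partition id to 'split', 'merge', or 'keep'."""
        actions = {}
        for pid, size in partition_sizes.items():
            if size > self.max_size:
                actions[pid] = "split"
            elif size < self.min_size:
                actions[pid] = "merge"
            else:
                actions[pid] = "keep"
        return actions

policy = ToyMaintenancePolicy(min_size=10, max_size=100)
actions = policy.plan({0: 250, 1: 40, 2: 3})
print(actions)  # {0: 'split', 1: 'keep', 2: 'merge'}
```

Keeping the decision ("what to do") separate from the execution ("how to move vectors") mirrors the split between a policy class and a partition manager described above.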

The flowchart below shows how Quake's main components and classes interact.

flowchart TD
    subgraph C++_Core["C++ Core"]
        QI[QuakeIndex]
        PM[PartitionManager]
        MP["MaintenancePolicy"]
        QC[QueryCoordinator]
        DIL[DynamicInvertedLists]
        IP[IndexPartition]
        B["Bindings (pybind11)"]

        MO[MaintenancePolicyParams]
        SO[SearchParams]
        IB[IndexBuildParams]
    end

    subgraph Python_Layer["Python Layer"]
        PA["Quake Python API"]
        UT["Utility Modules & Helpers"]
        DS["Dataset Loaders"]
        WG["Workload Generator"]
    end

    %% Connections within C++ Core
    QI --> PM
    QI --> MP
    QI --> QC
    PM --> DIL
    DIL --> IP

    B --> QI
    B --> MO
    B --> SO
    B --> IB

    %% Expose C++ Core to Python
    PA --> B

    %% Python Layer structure
    PA --> UT
    PA --> DS
    PA --> WG

    %% Define custom styles
    classDef coreStyle fill:#f9f,stroke:#333,stroke-width:2px;
    classDef pythonStyle fill:#bbf,stroke:#333,stroke-width:2px;

    %% Assign styles to nodes
    class QI,PM,MP,QC,DIL,IP,MO,SO,IB coreStyle;
    class PA,UT,DS,WG pythonStyle;
    

Directory Structure

Familiarize yourself with the layout of the repository:

.
├── CMakeLists.txt              # CMake build configuration
├── README.md                   # High-level project description
├── docs/                       # Documentation sources (RST files, Sphinx config)
│   ├── index.rst
│   ├── install.rst
│   └── development_guide.rst    <-- This guide
├── src/
│   ├── cpp/                    # C++ source, headers, and third‑party submodules
│   │   ├── include/            # Public headers (API)
│   │   ├── src/                # Implementation files
│   │   ├── bindings/           # Python bindings via pybind11
│   │   └── third_party/        # External dependencies (e.g., Faiss, SimSIMD)
│   └── python/                 # Python modules and utilities
├── test/                      # Unit and integration tests (C++ and Python)
├── setup.py / setup.cfg       # Python packaging files
└── .gitmodules                # Git submodule configuration

Contribution Workflow & Coding Standards

We expect all contributors to follow these practices:

Git Workflow & PRs

  • Branching: Create feature branches from the main branch.

  • Pull Requests: Submit clear PRs with detailed descriptions and links to related issues.

  • Code Reviews: Expect direct feedback; clarity and correctness are our top priorities.

Coding Standards

  • C++: Follow the Google C++ Style Guide.

  • Python: Adhere to PEP 8.

  • Docstrings & Comments: Every class and function should be documented. Clear inline comments and comprehensive docstrings help both human readers and our automated documentation tools.
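As one possible shape for a documented function, here is a small sketch. The helper `l2_normalize` is hypothetical and not part of Quake, and the Google-style docstring layout is an assumption (Sphinx can parse it through the `sphinx.ext.napoleon` extension); follow whatever layout the surrounding code already uses.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit L2 norm.

    Args:
        vector: Sequence of floats with at least one non-zero value.

    Returns:
        A new list of floats pointing in the same direction as the
        input, with an L2 norm of 1.0.

    Raises:
        ValueError: If the vector has zero norm.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    # Divide every component by the norm; the input is left unmodified.
    return [x / norm for x in vector]

unit = l2_normalize([3.0, 4.0])
print(unit)
```

The docstring states the contract (inputs, outputs, failure mode), while the inline comment explains a detail of the implementation that the signature alone does not reveal.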

Testing

  • C++ Tests: Located in test/cpp/; build them via CMake (e.g. make quake_tests), then run them with ctest or by invoking the quake_tests binary directly.

  • Python Tests: Located in test/python/; run them with pytest.

  • When Adding Features: Always add tests covering new functionality and ensure tests are clear and reflect real usage scenarios.
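A test written in this spirit might look like the sketch below. The helper `top_k` is hypothetical and defined inline only to keep the example self-contained; a real test under test/python/ would import the code under test from the installed quake package instead. pytest collects functions whose names start with `test_` and treats a failing `assert` as a test failure.

```python
def top_k(distances, k):
    """Hypothetical helper under test: indices of the k smallest distances."""
    return sorted(range(len(distances)), key=lambda i: distances[i])[:k]

def test_top_k_returns_nearest_first():
    # Typical usage: ask for the two closest items out of four.
    distances = [0.9, 0.1, 0.5, 0.3]
    assert top_k(distances, k=2) == [1, 3]

def test_top_k_handles_k_larger_than_input():
    # Edge case: requesting more results than exist returns everything, sorted.
    assert top_k([0.2, 0.1], k=10) == [1, 0]
```

Each test name says what behavior it checks, and each covers one scenario, so a failure points directly at the broken behavior.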

Workflow

  1. Clone and Set Up:

git clone https://github.com/marius-team/quake.git
cd quake
git submodule update --init --recursive

  2. Create a Feature Branch:

git checkout -b feature/my-feature

  3. Create and Activate the Conda Environment:

This installs the dependencies needed to build Quake. See Installation for more details.

conda env create -f environments/ubuntu-latest/conda.yaml
conda activate quake-env

  4. Build the Code & Bindings:

C++ Build (optional if you only want to work on Python code):

mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j$(nproc) bindings

Python Build (run from the repository root):

pip install .

  5. Run Tests:

C++ Tests:

Build the tests and run them (assuming you are in the build/ directory):

make -j$(nproc) quake_tests
test/cpp/quake_tests --gtest_filter='*' # use filters to run specific tests

Python Tests:

Quake must be installed with pip before the Python tests can run. Run them with pytest from the repository root:

pytest test/python/

  6. Make Changes and Submit a PR:

After making changes, commit them and push your branch. Then open a PR against the main branch.

Conclusion

This guide is a living document. As Quake evolves, update it to reflect improvements and new practices. Our goal is to keep the codebase and its documentation clear, correct, and easy to contribute to.