
LongProbe


Sub-second RAG regression testing for production pipelines



Overview

"Did my last commit break retrieval?" Now you know in seconds.

LongProbe is a sub-second RAG regression harness. Define your Golden Questions once, run longprobe check on every commit, and get an exact diff of which document chunks were lost in your latest change, before your users notice.

Think pytest --watch for your RAG pipeline.

Why LongProbe?

Every RAG developer faces the same silent killer: you refactor your chunking strategy, upgrade LangChain, or add a new document, and retrieval quality silently degrades. DeepEval and RAGChecker are heavyweight evaluation frameworks meant for batch analysis, not for fast regression checks in a dev loop.

LongProbe gives you instant feedback:

  • ⚡ Sub-second checks on small golden sets
  • 🔍 Exact diffs showing which chunks were lost/gained
  • 📊 Recall scores with per-question breakdown
  • 💾 Baseline tracking to catch regressions over time
  • 🧪 pytest integration for existing test suites
  • 🔌 Pluggable adapters for any vector store
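
To make the recall score and the lost/gained diff concrete, here is a minimal Python sketch of how they could be computed per question. It is an illustration, not LongProbe's actual implementation: the QuestionResult class, the chunk_diff function, and the sample chunk IDs are all hypothetical names introduced for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionResult:
    """Retrieval outcome for one golden question (hypothetical structure)."""

    question: str
    required: frozenset[str]   # chunk IDs that must be retrieved
    retrieved: frozenset[str]  # chunk IDs the pipeline actually returned

    @property
    def recall(self) -> float:
        # Fraction of required chunks that were retrieved.
        # Assumption: a question with no required chunks counts as a pass.
        if not self.required:
            return 1.0
        return len(self.required & self.retrieved) / len(self.required)

    @property
    def missing(self) -> frozenset[str]:
        # Required chunks that the pipeline failed to return.
        return self.required - self.retrieved


def chunk_diff(before: QuestionResult, after: QuestionResult) -> dict[str, list[str]]:
    """Required chunks lost or gained between two runs of the same question."""
    found_before = before.required & before.retrieved
    found_after = after.required & after.retrieved
    return {
        "lost": sorted(found_before - found_after),
        "gained": sorted(found_after - found_before),
    }


# Illustrative data: the second run no longer retrieves "policy-7".
required = frozenset({"policy-3", "policy-7"})
baseline = QuestionResult(
    "What is the refund window?", required, frozenset({"policy-3", "policy-7", "faq-1"})
)
current = QuestionResult(
    "What is the refund window?", required, frozenset({"policy-3", "faq-1"})
)

print(current.recall)                 # 0.5
print(chunk_diff(baseline, current))  # {'lost': ['policy-7'], 'gained': []}
```

Only required chunks appear in the diff here, so extra retrieved chunks such as "faq-1" do not count as gains. That is a design choice of this sketch.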

Quick Example

# Install
pip install longprobe

# Initialize
longprobe init

# Define your golden questions in goldens.yaml
# Configure your vector store in longprobe.yaml

# Run checks
longprobe check

# Save baseline
longprobe baseline save --label v1.0

# Compare after changes
longprobe diff --baseline v1.0
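
For intuition about the baseline save and diff steps above, the following Python sketch stores labeled retrieval results in SQLite and compares a later run against them. The table layout and the function names (open_store, save_baseline, load_baseline, diff_against_baseline) are assumptions made for this example. They are not LongProbe's actual schema or Python API.

```python
from __future__ import annotations

import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a baseline store. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS baselines ("
        " label TEXT NOT NULL,"
        " question TEXT NOT NULL,"
        " chunk_ids TEXT NOT NULL,"  # JSON-encoded list of chunk IDs
        " PRIMARY KEY (label, question))"
    )
    return conn


def save_baseline(conn: sqlite3.Connection, label: str, results: dict[str, list[str]]) -> None:
    """Store the retrieved chunk IDs for each question under a label."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO baselines VALUES (?, ?, ?)",
            [(label, q, json.dumps(sorted(ids))) for q, ids in results.items()],
        )


def load_baseline(conn: sqlite3.Connection, label: str) -> dict[str, set[str]]:
    rows = conn.execute(
        "SELECT question, chunk_ids FROM baselines WHERE label = ?", (label,)
    )
    return {question: set(json.loads(ids)) for question, ids in rows}


def diff_against_baseline(
    conn: sqlite3.Connection, label: str, current: dict[str, list[str]]
) -> dict[str, dict[str, list[str]]]:
    """Report chunks lost or gained per question; unchanged questions are omitted."""
    baseline = load_baseline(conn, label)
    report: dict[str, dict[str, list[str]]] = {}
    for question, ids in current.items():
        before, after = baseline.get(question, set()), set(ids)
        lost, gained = sorted(before - after), sorted(after - before)
        if lost or gained:
            report[question] = {"lost": lost, "gained": gained}
    return report


conn = open_store()
save_baseline(conn, "v1.0", {"What is the refund window?": ["policy-3", "policy-7"]})
report = diff_against_baseline(conn, "v1.0", {"What is the refund window?": ["policy-3", "faq-1"]})
print(report)
# {'What is the refund window?': {'lost': ['policy-7'], 'gained': ['faq-1']}}
```

This sketch only inspects questions present in the current run, so a question that exists in the baseline but was removed from the golden set is not reported.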

Part of the Long Suite

LongProbe is part of the EnDevSols Long Suite of RAG tools, which together cover the full RAG pipeline from ingestion to production monitoring.

Features

Core Capabilities

  • ⚡ Sub-second checks on small golden sets
  • 📋 Golden Questions + Required Chunks defined in simple YAML
  • 🔍 Three match modes: exact ID, text substring, semantic similarity
  • 📊 Recall Score with per-question breakdown
  • 🔄 Regression diff: exactly which chunks were lost/gained
  • 💾 SQLite baseline store: compare against any previous run
Developer Experience

  • 🧪 pytest plugin: integrate into existing test suites
  • 🖥️ Beautiful CLI with Rich tables, JSON, and GitHub Actions output
  • 👀 Watch mode: auto re-run on file changes
  • 🏗️ CI/CD ready: fails pipeline on regression

Integrations

  • 🔌 Pluggable adapters: LangChain, LlamaIndex, Chroma, Pinecone, Qdrant
  • 🌐 HTTP adapter: test any RAG API
  • 🐍 Python API: programmatic access to all features
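
As a rough picture of what a pluggable adapter means in practice, the sketch below defines a small retriever interface and a toy in-memory implementation. The names RetrieverAdapter, InMemoryAdapter, and recall_for are hypothetical and do not come from LongProbe's Python API. The point is only that a regression check can depend on one narrow interface while each vector store supplies its own implementation.

```python
from __future__ import annotations

from typing import Protocol


class RetrieverAdapter(Protocol):
    """Hypothetical interface: return the IDs of the top_k retrieved chunks."""

    def retrieve(self, question: str, top_k: int) -> list[str]: ...


class InMemoryAdapter:
    """Toy adapter that ranks chunks by word overlap with the question."""

    def __init__(self, chunks: dict[str, str]) -> None:
        self._chunks = chunks

    def retrieve(self, question: str, top_k: int) -> list[str]:
        terms = set(question.lower().split())
        ranked = sorted(
            self._chunks.items(),
            key=lambda item: len(terms & set(item[1].lower().split())),
            reverse=True,
        )
        return [chunk_id for chunk_id, _ in ranked[:top_k]]


def recall_for(
    adapter: RetrieverAdapter, question: str, required: set[str], top_k: int = 5
) -> float:
    """Recall of the required chunk IDs for one question, via any adapter."""
    retrieved = set(adapter.retrieve(question, top_k))
    return len(required & retrieved) / len(required) if required else 1.0


adapter = InMemoryAdapter(
    {
        "a": "refunds are accepted within 30 days",
        "b": "shipping takes five days",
        "c": "contact support by email",
    }
)
print(adapter.retrieve("how many days for refunds", top_k=1))                # ['a']
print(recall_for(adapter, "how many days for refunds", {"a", "c"}, top_k=2))  # 0.5
```

An adapter for a hosted vector store or an HTTP endpoint would implement the same retrieve method and leave recall_for unchanged.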

Next Steps

  • Quick Start: get up and running in 5 minutes
  • User Guide: learn how to define golden questions and configure LongProbe
  • Demos: see LongProbe in action with live demos
  • API Reference: detailed API documentation for Python integration

Community & Support

License

LongProbe is released under the MIT License.