Observability & Hallucination Detection

LongTrainer v1.3.0 introduces native integration with LongTracer — an observability and hallucination detection SDK for RAG pipelines.

When enabled, LongTracer automatically captures performance spans, latencies, and token counts for every LLM call and retriever query. It can also run CitationVerifier, a hybrid STS + NLI claim verification engine, to detect hallucinated responses in real time. Verification is controlled by tracer_verify, which defaults to True.

Installation

LongTracer is an optional dependency. Install it with:

pip install "longtrainer[tracer]"

This installs longtracer with its LangChain, LangGraph, and MongoDB adapters.

Note

With the default "mongo" backend, LongTracer requires a running MongoDB instance. It reuses the same MongoDB endpoint you pass to LongTrainer and stores trace data in a separate longtracer database.

Quick Start

Enable tracing with a single flag:

from longtrainer import LongTrainer

trainer = LongTrainer(
    mongo_endpoint="mongodb://localhost:27017/",
    enable_tracer=True,
)

That's it. All RAG, Agent, Vision, and Structured output responses will now be traced automatically.

Configuration

Parameter	Type	Default	Description
`enable_tracer`	`bool`	`False`	Enable LongTracer integration
`tracer_backend`	`str`	`"mongo"`	Trace storage backend (`"mongo"`, `"sqlite"`, `"memory"`)
`tracer_verbose`	`bool`	`False`	Print per-span summaries to console
`tracer_verify`	`bool`	`True`	Run CitationVerifier for hallucination detection
`tracer_threshold`	`float`	`0.5`	Confidence threshold for hallucination flagging (0.0–1.0)

Full Example

from longtrainer import LongTrainer

trainer = LongTrainer(
    mongo_endpoint="mongodb://localhost:27017/",
    enable_tracer=True,
    tracer_backend="mongo",
    tracer_verbose=True,       # Print spans to console
    tracer_verify=True,        # Enable hallucination detection
    tracer_threshold=0.5,      # Flag claims below 50% confidence
)

How It Works

LongTracer instruments all five response paths in LongTrainer:

RAG Responses (`get_response`)

Automatically captures:

Retrieval span — documents retrieved, latency
LLM span — prompt, model, response, latency
Claim verification — splits the response into individual claims and checks each against retrieved source documents using NLI
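
The claim-verification step above can be pictured with a toy sketch. This is not CitationVerifier: the sentence splitter and the token-overlap score are crude stand-ins for the real STS + NLI models, and split_claims, support_score, and flag_unsupported are hypothetical names. Only the overall shape (split the response, score each claim against the sources, flag anything below the threshold) is taken from the description.

```python
import re


def split_claims(response: str) -> list[str]:
    """Naive sentence splitter standing in for real claim extraction."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(claim: str, sources: list[str]) -> float:
    """Fraction of claim tokens found in the best-matching source.

    A lexical proxy only; the real engine is described as STS + NLI.
    """
    claim_tokens = _tokens(claim)
    if not claim_tokens or not sources:
        return 0.0
    return max(len(claim_tokens & _tokens(src)) / len(claim_tokens) for src in sources)


def flag_unsupported(response: str, sources: list[str], threshold: float = 0.5):
    """Return (claim, score) pairs whose score falls below the threshold."""
    scored = [(c, support_score(c, sources)) for c in split_claims(response)]
    return [(c, s) for c, s in scored if s < threshold]


sources = ["The Eiffel Tower is in Paris and opened in 1889."]
response = "The Eiffel Tower is in Paris. It was painted green by Napoleon."
flagged = flag_unsupported(response, sources, threshold=0.5)
print(flagged)
```

The threshold here plays the same role as tracer_threshold: a claim is flagged when its support score is below it.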

Agent Responses (`get_response` with `agent_mode=True`)

Uses LongTracerAgentHandler, which automatically:

Creates a root trace for the entire agent invocation
Captures each tool call as a child span
Logs the final response
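
The three steps above amount to building a small trace tree. The sketch below shows that shape in plain Python. It is an illustration only: Span and AgentTraceSketch are hypothetical names, and the real LongTracerAgentHandler may store its data quite differently.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Span:
    """One node in a trace tree (hypothetical structure, for illustration)."""
    name: str
    parent_id: Optional[str] = None
    data: dict = field(default_factory=dict)
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class AgentTraceSketch:
    """Mimics the three steps: root trace, tool-call child spans, final response."""

    def __init__(self, query: str) -> None:
        # Step 1: one root span for the whole agent invocation.
        self.root = Span(name="agent_invocation", data={"query": query})
        self.children = []

    def on_tool_call(self, tool_name: str, tool_input: Any, tool_output: Any) -> Span:
        # Step 2: each tool call becomes a child of the root span.
        span = Span(
            name=f"tool:{tool_name}",
            parent_id=self.root.span_id,
            data={"input": tool_input, "output": tool_output},
        )
        self.children.append(span)
        return span

    def on_finish(self, response: str) -> None:
        # Step 3: the final response is logged on the root span.
        self.root.data["response"] = response


trace = AgentTraceSketch("What is 2 + 2?")
trace.on_tool_call("calculator", "2 + 2", "4")
trace.on_finish("2 + 2 equals 4.")
```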

Streaming Responses (`get_response(stream=True)` / `aget_response`)

Traces are captured the same way as standard responses. The trace root is closed in a finally block after the stream completes.
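
The finally-block pattern can be sketched with a plain generator. This is a minimal illustration, not LongTrainer's code: traced_stream and closed_traces are hypothetical names, and a list stands in for the trace store.

```python
from typing import Iterable, Iterator

closed_traces = []  # stand-in for a trace store


def traced_stream(chunks: Iterable[str], trace_id: str) -> Iterator[str]:
    """Yield chunks unchanged and close the trace when the stream ends."""
    collected = []
    try:
        for chunk in chunks:
            collected.append(chunk)
            yield chunk
    finally:
        # Runs on normal exhaustion, on an exception, and when the
        # consumer closes the generator early.
        closed_traces.append(f"{trace_id}:{''.join(collected)}")


answer = "".join(traced_stream(["Hel", "lo"], trace_id="t1"))
```

Because the cleanup lives in finally, the trace is closed even if the caller stops reading partway through the stream.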

Vision Responses (`get_vision_response`)

Since vision pipelines don't use LCEL chains, tracing is handled post-hoc:

A retrieval span captures the context documents
An LLM span captures the vision model response
A grounding span runs CitationVerifier against the retrieved sources (if tracer_verify=True)

Structured Output (`get_response(schema={...})`)

Since invoke_structured bypasses the LCEL chain, manual span wrapping captures:

The structured output call as a single span
The result status and a preview of the response
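
Manual span wrapping of this kind is often written as a context manager. The sketch below assumes nothing about LongTracer's internals: manual_span, recorded_spans, and fake_invoke_structured are hypothetical names, and the span is just a dictionary.

```python
import time
from contextlib import contextmanager

recorded_spans = []  # stand-in for the trace backend


@contextmanager
def manual_span(name: str, preview_chars: int = 80):
    """Record one span with status, latency, and a truncated result preview."""
    span = {"name": name, "status": "ok", "preview": None}
    start = time.perf_counter()
    try:
        yield span
    except Exception as exc:
        span["status"] = f"error: {type(exc).__name__}"
        raise
    finally:
        span["latency_ms"] = (time.perf_counter() - start) * 1000
        if span.get("result") is not None:
            span["preview"] = str(span["result"])[:preview_chars]
        recorded_spans.append(span)


def fake_invoke_structured(question: str) -> dict:
    """Placeholder for the real structured-output call."""
    return {"answer": "42", "question": question}


with manual_span("structured_output") as span:
    span["result"] = fake_invoke_structured("meaning of life?")
```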

Lightweight Tracing (Without NLI)

If you want observability (spans, latencies, token counts) without the ~500MB NLI model download, disable verification:

from longtrainer import LongTrainer

trainer = LongTrainer(
    enable_tracer=True,
    tracer_verify=False,  # Spans only — no NLI model download
)

Warning

Setting tracer_verify=False skips CitationVerifier only on the Vision and Structured output paths. On the RAG and Agent paths, the LongTracer callback handler manages verification internally, and tracer_verify does not override it.

Graceful Degradation

LongTracer is a fully optional dependency. The following failure scenarios are handled without interrupting the response:

Scenario	Behavior
`enable_tracer=False` (default)	Zero `longtracer` imports — zero overhead
`enable_tracer=True` + `longtracer` not installed	Prints a warning, continues normally
`enable_tracer=True` + runtime error in tracer	Catches exception, logs warning, continues
Tracer crash mid-response	`try/except` in `finally` blocks ensures response is still returned
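
The behaviors in the table follow a common optional-dependency pattern, sketched below under stated assumptions. load_optional_tracer and safe_trace are hypothetical helpers, not LongTrainer functions, and the sketch uses logging where the table says a warning is printed.

```python
import importlib
import logging

logger = logging.getLogger("tracer_sketch")


def load_optional_tracer(enabled: bool, module_name: str = "longtracer"):
    """Return the tracer module, or None when disabled or unavailable."""
    if not enabled:
        return None  # no import attempted, so no overhead
    try:
        return importlib.import_module(module_name)
    except ImportError:
        logger.warning("%s is not installed; continuing without tracing", module_name)
        return None


def safe_trace(tracer, fn, *args, **kwargs):
    """Call a tracer hook, swallowing errors so the response path survives."""
    if tracer is None:
        return None
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        logger.warning("tracer error ignored: %s", exc)
        return None
```

Keeping the import inside the function is what makes the disabled case free: the module is never touched unless tracing is requested.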

Trace Storage

Traces are stored in a separate MongoDB database called longtracer (not longtrainer_db). LongTracer automatically sets the MONGODB_URI environment variable from your mongo_endpoint.

You can query traces directly:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client["longtracer"]

# Find all traces for a specific bot
traces = db.runs.find({"inputs.bot_id": "bot-xxx"})
for trace in traces:
    print(trace["inputs"]["chat_type"], trace["outputs"])

Architecture Notes

Single Project: All traces go to the LongTracer "default" project. Bot and chat identification is done via metadata fields (bot_id, chat_id, chat_type) on each trace.
Singleton Init: LongTracer.init() is called exactly once at LongTrainer.__init__() time. Subsequent calls are no-ops.
Private API Usage: The integration uses tracer._safe_update_run() (a private method) to inject metadata into agent traces. This is wrapped in try/except to prevent crashes if the internal API changes.
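
The last two notes can be illustrated with a short sketch. TracerSketch and inject_metadata are hypothetical stand-ins, not the real LongTracer API, and the single-argument call to _safe_update_run is an assumption made for illustration, since its actual signature is not documented here.

```python
class TracerSketch:
    """Hypothetical stand-in; not the real LongTracer class."""

    _initialized = False
    init_count = 0
    config = {}

    @classmethod
    def init(cls, **config) -> bool:
        """Initialize once; later calls return False and change nothing."""
        if cls._initialized:
            return False
        cls._initialized = True
        cls.init_count += 1
        cls.config = dict(config)
        return True


def inject_metadata(tracer, metadata: dict) -> bool:
    """Call a private hook defensively and report whether it succeeded."""
    try:
        # Assumed call shape; a private method may change or disappear.
        tracer._safe_update_run(metadata)
        return True
    except Exception:
        return False


first = TracerSketch.init(backend="mongo")
second = TracerSketch.init(backend="sqlite")  # no-op: already initialized
```

If the private method is renamed or removed, inject_metadata returns False instead of raising, which matches the crash-prevention intent described above.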