Validiti Provenance

Records first. Your LLM second.

Ask a natural-language question. Provenance answers from the records you control — citations, evidence, and stance — and hands that grounded context to your LLM via API or MCP. Your LLM composes the final answer using only what the records actually say. Same engine also catches hallucinations in existing LLM output, claim by claim.

Live demo running · sealed installer at launch

Why this changes the math

Most "AI grounding" tools charge you per token, per query, or per seat — and lock you to their LLM, their vector store, their pricing whim. Provenance is a flat per-machine license that runs on your hardware. Three things stop scaling against you the moment you install it.

Your LLM, your call

OpenAI on Monday. Anthropic on Tuesday. Local llama on Thursday. Provenance hands the same grounded prompt to whatever you point it at — no embedding model lock-in, no preferred-vendor tax, no migration cost when you switch.

Your records, your machine

The records never leave your install. Validiti never sees the question, the records, or the answer. Pharma sensitivity, attorney-client privilege, regulated-industry data sovereignty — all stay where they need to.


Your bill, your terms

Flat per-machine, per-month. Verify a thousand claims or a million — same price. No per-token meter, no per-query gotcha, no surprise overage. Predictable line item, indistinguishable from any other software seat.

Two ways to use it

Same engine, two flows. Start your workflow at either end.

Ask. Then your LLM answers.

You ask Provenance a question in plain language. Provenance returns the verified evidence from your records — citations, stance, gaps. That grounded context becomes the prompt for the LLM you choose. Your LLM never invents what isn't in the records.

PRIMARY FLOW
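A sketch of what that hand-off could look like in code. The evidence shape here (stance, citation, excerpt fields) is an illustrative assumption, not the published API; the point is that the prompt your LLM receives contains nothing but the question and the records.

```python
def build_grounded_prompt(question, evidence):
    """Turn Provenance-style evidence records into an LLM prompt that
    contains only what the records actually say.

    `evidence` is a list of dicts with illustrative field names:
    {"stance": ..., "citation": ..., "excerpt": ...}.
    """
    lines = [f"Question: {question}", "", "Evidence (answer only from these records):"]
    for i, rec in enumerate(evidence, 1):
        lines.append(f"[{i}] ({rec['stance']}) {rec['citation']}: {rec['excerpt']}")
    lines += ["", "If the evidence does not cover part of the question, say so."]
    return "\n".join(lines)
```

Because the LLM only ever sees this string, swapping models changes nothing about the grounding.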

Verify. After the LLM has answered.

Already have an LLM draft? Paste it in. Provenance returns the same content with every claim labeled — VERIFIED, PARTIAL, NO SOURCE — checked against the same records. Catches hallucinated drug names, fictitious citations, unsupported facts.

SECONDARY FLOW

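One way a pipeline could act on those per-claim labels before a draft ships. The claim shape and label strings below are illustrative assumptions; Provenance's actual response format may differ.

```python
LABELS = ("VERIFIED", "PARTIAL", "NO SOURCE")

def gate_draft(labeled_claims, block_on=("NO SOURCE",)):
    """Decide whether an LLM draft is safe to ship, given per-claim labels.

    `labeled_claims` uses an illustrative shape:
    [{"text": ..., "label": ...}, ...]. Any claim whose label is in
    `block_on` flags the draft for human attention.
    """
    flagged = [c for c in labeled_claims if c["label"] in block_on]
    return {"ship": not flagged, "flagged": flagged}
```

A stricter pipeline could add "PARTIAL" to `block_on`; the gate logic stays one line.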

What it does

Six things every other "grounded LLM" approach pretends to do. Provenance actually does them, in milliseconds, on your own hardware.


Answers from your records

Ask a natural-language question. Provenance searches the records you registered, returns the actual evidence — citations, stance, gaps in coverage — and a structured prompt your LLM can consume. The answer is grounded before the LLM sees it.

Pipes to any LLM

API or MCP. Validiti Accelerate, OpenAI, Anthropic, or a local model — your call. Provenance returns the verified context; your LLM uses its full context window to compose the final answer. We never lock you to one model.
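In code, that vendor independence can be as small as this: each LLM is a plain callable, and the grounded prompt never changes when you swap one for another. The wiring below is a sketch, not the product's API or MCP surface.

```python
def make_llm_router(backends):
    """`backends` maps a vendor name to a plain callable: prompt -> answer.

    Swapping vendors is swapping one string; the grounded prompt that
    Provenance produced is handed over unchanged.
    """
    def route(vendor, grounded_prompt):
        if vendor not in backends:
            raise KeyError(f"unknown backend: {vendor}")
        return backends[vendor](grounded_prompt)
    return route
```

In practice each callable would wrap an OpenAI, Anthropic, or local-model client; the router never needs to know which.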

Your records, your machine

Provenance runs locally. Point it at the records you trust — pharmacovigilance archives, your case-law set, your internal documentation, your peer-reviewed literature. Validiti never sees the question, the records, or the answer.

Verifies LLM output too

Already have a draft from an LLM? Paste it in. Provenance labels every claim — VERIFIED, PARTIAL, NO SOURCE — against the same records. The label tracks the entity, not the prose: a made-up name in plausible writing is still labeled NO SOURCE.

Returns in milliseconds

Each query searches records covering hundreds of thousands of documents in roughly 200 ms. A full draft of a few paragraphs verifies end-to-end in single-digit seconds. Your LLM round-trip is on top of that.

Sealed deployment

Sealed binary, machine-bound, every internal package verified end-to-end. A tampered install refuses to start. Same Titus runtime defense that protects every Validiti SKU.

See it work — right now

The live demo is a hosted instance running Provenance against three open medical record sets (FDA adverse-event reports, drugs, conditions). Both flows are live — ask a question, or paste an LLM draft to verify.

Live demonstration · validiti.com/provenance-demo

Try the verify flow first: a medical paragraph that mentions Metformin (real), Aspirin (real), and Validitomab (a made-up drug that doesn't exist) returns three labels in under three seconds — VERIFIED, VERIFIED, NO SOURCE. The made-up name has no escape route. Then try the ask flow with the same record sets and watch the grounded answer come back with citations attached.

Open the live demo →
public · medical record sets · same engine that ships in the sealed binary

Three workflows, three grounded answers

Two ask-mode scenarios and one verify-mode scenario. All three reproducible in the live demo.

Pharmacovigilance research — ask mode

ASK · medical
"What does the literature say about metformin and lactic acidosis in elderly patients with reduced renal function?"
EVIDENCE

14 adverse-event records returned, all with metformin + acidosis co-occurrence in patients over 65 with eGFR < 30. Citations attached to each.

EVIDENCE

3 review records flagged — none of them contradict the association; one notes baseline lactate as the differentiating factor.

YOUR LLM

Composes the final answer from those citations. Your LLM's context window holds all 17 records plus your prompt — it works only from what's there.

Pharmacovigilance summary — verify mode

VERIFY · medical
"Metformin can cause lactic acidosis in rare cases. Aspirin is associated with gastrointestinal bleeding. Validitomab is approved for chronic fatigue syndrome."
VERIFIED

Metformin + lactic acidosis — strong support in adverse-event records.

VERIFIED

Aspirin + gastrointestinal bleeding — well-attested across the record set.

NO SOURCE

Validitomab isn't in any medical record set — the LLM made it up. Caught before the draft ships.

Case-law research — ask mode

ASK · legal
"Find precedent for the four-prong test in digital privacy claims, drawing on Fourth Amendment reasoning."
EVIDENCE

Katz v. United States, 389 U.S. 347 (1967) — reasonable expectation of privacy doctrine. Verbatim opinion attached.

EVIDENCE

Smith v. Maryland, 442 U.S. 735 (1979) — third-party doctrine. Verbatim opinion attached.

YOUR LLM

Composes the answer from the cases that exist in your set. No fabricated citations possible — if a case isn't in the records, it doesn't reach the LLM. Your brief never quotes a case Provenance didn't pull.

How fast

Real numbers from the running engine, measured on a 4-vCPU box with no GPU.

Per claim
~200 ms
end-to-end label, against a record set of 100K+ documents
Three claims
~600 ms
a paragraph of typical LLM output, fully labeled
Full draft
single-digit seconds
a multi-paragraph report with dozens of claims, returned with every label attached
Record set size
millions
scales linearly with the records you bring — no re-training, no re-indexing on the LLM side

Latency is measured from text-in to labels-out, including reading the records the customer registered. Your numbers will depend on how big your record sets are and what hardware you run on, but the per-claim cost stays in the same range.

Versus the alternatives

Three things people try when they want to catch hallucinations. Here's what each one actually delivers.

Approach | What it actually does | Where it falls apart
"Use a second LLM to fact-check" | A second model reads the first model's output and gives a confidence score. | Both models hallucinate. The "checker" can be just as wrong as the original. No verifiable record set.
RAG with citation insertion | The LLM retrieves passages from a vector store and weaves them into the answer. | The retrieval may miss; the LLM may still mis-attribute or invent claims that look like they came from the retrieved passage.
Manual review by a domain expert | A human reads every line and approves it. | Slow, expensive, doesn't scale. Most drafts go out without it.
Validiti Provenance | Splits the LLM output into claims. Labels each one against records you control. Returns the original text annotated. Same engine answers a fresh question from records and hands a grounded prompt to your LLM. | Not a model. Not a vector store. Not a guess. The label is the truth or the record set's silence on it — your call.

And the bill — to verify 1 million claims

Approach | Cost to verify 1M claims | Pricing model
Per-token LLM fact-check (GPT-4 / Claude) | $50 – $500 | $3 – $15 / M tokens · grows with every call
Vector DB + per-call grounding | $200 – $1,500 | DB hosting + per-query inference fees
Specialized eval / guardrails SaaS | $5,000 – $25,000 / mo | per-seat enterprise contracts, no per-call cap
Validiti Provenance · Personal | $19 / mo flat | first 90 days free · no per-token, no per-query
Validiti Provenance · Enterprise | $1,100 ($1,000 floor + $100 metered) | $1,000/mo floor + $0.0001 per claim · published meter

We're not saying don't use RAG or human review. We're saying you should know which claims actually have support and which don't, before either of those steps — without paying per token to find out.
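The Enterprise meter is simple enough to sanity-check in a few lines. A sketch of the bill math using the published rates, with the function name and rounding-to-cents behavior as illustrative choices, not product API:

```python
def monthly_enterprise_bill(claims_verified, questions_asked=0,
                            floor=1_000.00, per_claim=0.0001, per_question=0.001):
    """Published-meter math: flat monthly floor plus metered usage,
    rounded to cents."""
    bill = floor + claims_verified * per_claim + questions_asked * per_question
    return round(bill, 2)
```

A million claims verified in a month lands at $1,100: the $1,000 floor plus $100 of metered usage.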

Pricing

Pick a tier. Your LLM stays yours. No per-token gotchas. Foundation is free for verified educational institutions; Personal is free for the first 90 days.

All tiers ship the same engine — same speed, same labeling logic, same sealed deployment. Higher tiers add team features for organizations running multiple machines.

Foundation

free · forever
Verified educational institutions only — universities, medical and law schools, K-12 with CTE programs.
  • unlimited machines
  • unlimited record sets
  • Instructor + student access
  • Semester audit log
  • Email support

A medical school teaching evidence-based research. A law school clinic running grounded case-law search. An undergrad program where every student needs to verify their own work.

Education tier · contact

Pro

$29 /mo
Small teams. Shared registry, priority support.
  • 10 machines
  • Everything in Personal
  • Shared record-set registry
  • Priority support

A regulated-industry editorial team — pharma, legal, compliance, technical writing. Every member runs Provenance against the same canonical record sets.

Launching soon

Studio

$349 /mo
Mid-market. SLA, SSO, air-gap option, audit retention.
  • 100 machines
  • Everything in Pro
  • SSO (SAML / OIDC)
  • Audit retention + export
  • SLA + air-gap option

A regional pharma editorial group. A law firm with multiple practice groups. Compliance audit-ready out of the box.

Launching soon

Enterprise

$1,000 /mo floor
Published meter — $0.0001 / claim verified, $0.001 / question asked.
  • unlimited machines
  • Everything in Studio
  • Multi-region deployments
  • Air-gapped, no phone-home
  • Dedicated SLA

Multi-region pharma, large law firms, regulated agencies. Self-serve published meter, no sales call. Verify a million claims a month for $100 over the floor.

Launching soon

Platform

$1,000–2,000 /mo
For SaaS vendors embedding Provenance natively. Per-downstream-seat meter.
  • + $3 / downstream seat / mo
  • White-label option
  • Embedded API + MCP
  • Custom branding + co-marketing
  • Dedicated integration support

A legal research platform offering Provenance as a native verification feature. A pharma document system embedding it for every author. A regulatory compliance suite shipping it to every customer seat.

Launching soon

Every tier ships sealed. Every internal package verified end-to-end. Same Titus runtime defense as every other Validiti SKU. The same engine that runs the live demo runs in your install. No per-token fees. No per-query fees. No vendor lock-in.

Built-in guarantees

The reason Provenance can be trusted is that the answer doesn't come from a model — it comes from records you control. We don't see your records, your text, or the labels.

At the labeling layer — what makes the verdict trustworthy

  • The records are yours. Validiti never ships you a record set. You point Provenance at the data you trust — pharmacovigilance, case law, internal docs, peer-reviewed literature. The label is grounded in your data, not ours.
  • NO SOURCE means it isn't in your records. When the entity in a claim isn't in any of the record sets you registered, the label is NO SOURCE — full stop. There's no fallback that pretends a non-existent thing is verified.
  • Anchors track the entity. Common-word matches don't override missing entities. A made-up drug surrounded by real-sounding medical prose is still labeled NO SOURCE — the prose can't rescue the entity.
  • Labels are deterministic. Same text, same record set, same label. No model temperature, no random seeds, no drift between calls.

At the install layer — what makes the deployment trustworthy

  • Sealed binary — internal packages verified end-to-end, machine-bound on activation, runtime watcher built in.
  • Refuses to bind to anything but localhost. Your text and your records never leave the machine.
  • The same Titus runtime defense that ships in every Validiti SKU. A tampered install refuses to start.
  • One package install. No Java runtime, no agent fleet, no infrastructure dependency. Drops onto a host with one command.
  • Free trial, paid tiers, and Enterprise all run the same engine. We don't downgrade the labels by tier.
  • U.S.-headquartered, U.S. only at launch. Single legal jurisdiction.