Four benchmarks. Each measures one number on a public corpus with a reference harness. Either you produce the number, or you don't. Nothing to interpret. Nothing to argue. The harness ships with Series 1; the methodology is published below today.
Three constraints make any benchmark reproducible by a stranger on the internet. We follow them. Anyone is welcome to follow them with us.
A single scalar metric — queries per second, records per second, bytes on disk, p99 milliseconds. No composite indices. No weighted averages. The number either matches the spec or it doesn't.
Wikipedia. Federal bills XML. Common Crawl. The TPC-* dataset family. All free to download. All cited inline. No proprietary data, no NDA tarballs, no "trust our private benchmark."
One signed binary plus one YAML config. Reads the corpus, prints the scalar, exits. Anyone can re-run it on commodity hardware. The harness is the harness — no tuning arguments, no "but you ran it wrong."
One process. One commodity box. Index the corpus, then sustain queries at p99 < 10 ms. The published scalar is queries per second at that p99 ceiling. Field includes the names every enterprise RFP defaults to: Elasticsearch, Splunk, ClickHouse FTS, PostgreSQL FTS.
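As a rough illustration of how a scalar like this could be derived, the sketch below computes a nearest-rank p99 for each offered load and reports the highest load whose p99 stays under the 10 ms ceiling. It is not the Validiti harness: the function names, the `runs` structure, and the sample latencies are all assumptions made up for this example.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a list of latencies in milliseconds."""
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.99 * n), which avoids floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def best_qps_under_ceiling(runs, ceiling_ms=10.0):
    """Return the highest offered load (queries/second) whose p99 is under the ceiling.

    `runs` is a hypothetical mapping: offered load -> latencies observed at that load.
    Returns None when no load level meets the ceiling.
    """
    passing = [qps for qps, latencies in runs.items() if p99(latencies) < ceiling_ms]
    return max(passing) if passing else None


# Made-up sample data: 100 latency samples per load level.
runs = {
    1000: [2.0] * 99 + [4.0],
    2000: [3.0] * 98 + [9.0, 9.5],
    4000: [5.0] * 98 + [12.0, 30.0],
}

print(best_qps_under_ceiling(runs))
```

With these sample values, the 4000 queries-per-second level is rejected because its p99 exceeds 10 ms, so the reported scalar is the next level down.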
Published Elastic benchmark on similar workloads. Latency budget routinely 10× the spec.
Source: Elastic's own Rally benchmarks against enwiki.
SPL search at < 10 ms p99 is not a posture Splunk publishes. Their tuning advice starts at 200 ms.
Source: Splunk's own performance reference.
Strong at columnar aggregation. Their FTS is bolt-on, not their native muscle.
Source: ClickBench + ClickHouse community benches.
Conservative floor. Apollo at 141k records already prints single-digit-millisecond responses publicly. Wikipedia number at full scale prints when the sealed harness lands.
Floor: harness publishes the verified scalar
https://kr0n0z.com/apollo is not a tuned demo. It is the same engine that ships in the harness, just pointed at a smaller corpus. Wikipedia-scale numbers print when the harness lands.
Stream 10 million signed records into a tamper-evident chain. The scalar is sustained records per second. Field includes the answers every regulated-records buyer is told to consider: Hyperledger Fabric, AWS QLDB, Git LFS with signing, Postgres with a signing trigger.
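For readers unfamiliar with the term, a tamper-evident chain can be sketched generically as a hash chain in which each record's digest covers the previous digest. The example below is a textbook construction, not the Validiti record format, which this page does not describe. The function names are hypothetical, and an HMAC stands in for a real signature scheme purely to keep the sketch inside the standard library.

```python
import hashlib
import hmac
import time

GENESIS = b"\x00" * 32  # placeholder digest that precedes the first record


def append_record(chain, payload: bytes, key: bytes) -> None:
    """Append a record whose digest covers the previous digest plus the payload."""
    prev = chain[-1]["digest"] if chain else GENESIS
    digest = hashlib.sha256(prev + payload).digest()
    # HMAC is an assumption here; a production system would likely use
    # an asymmetric signature instead.
    tag = hmac.new(key, digest, hashlib.sha256).digest()
    chain.append({"payload": payload, "digest": digest, "tag": tag})


def verify_chain(chain, key: bytes) -> bool:
    """Recompute every digest and tag; any edited record breaks verification."""
    prev = GENESIS
    for record in chain:
        digest = hashlib.sha256(prev + record["payload"]).digest()
        tag = hmac.new(key, digest, hashlib.sha256).digest()
        if digest != record["digest"] or not hmac.compare_digest(tag, record["tag"]):
            return False
        prev = digest
    return True


key = b"demo-key"
chain = []
start = time.perf_counter()
for i in range(10_000):
    append_record(chain, b"record-%d" % i, key)
elapsed = time.perf_counter() - start

print(f"{len(chain) / elapsed:.0f} records/second (toy figure, not a benchmark)")
print(verify_chain(chain, key))   # intact chain
chain[42]["payload"] = b"edited"
print(verify_chain(chain, key))   # chain after one record was altered
```

The records-per-second line shows the shape of the scalar only. The figure it prints depends on the machine and says nothing about any product in the field.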
Consensus is the floor — that's the design, and that's the ceiling on throughput.
Source: IBM & Hyperledger published Fabric benchmarks.
Optimized for ledger correctness, not throughput. Per-document journal cost.
Source: AWS QLDB service quotas + customer benches.
Was never intended for this load. Sits in the field because regulated buyers still ask.
Source: git verify-pack timings on commodity hardware.
Conservative floor. Tamper-evidence is inherent to the record format — the chain doesn't pay a per-record bookkeeping pass.
Floor: harness publishes the verified scalar
Ingest the corpus. Measure the on-disk footprint. The scalar is output bytes ÷ input bytes; smaller is better. Field includes the de facto general-purpose compressors and the columnar/warehouse formats that records buyers compare against: zstd, gzip, parquet+snappy, Snowflake compression.
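The ratio itself is simple arithmetic. The sketch below computes it for gzip, one of the compressors named in the field, using Python's standard library. The sample records are invented for the example and are far more repetitive than a real corpus, so the printed ratio is illustrative only.

```python
import gzip


def compression_ratio(raw: bytes, compressed: bytes) -> float:
    """Output bytes divided by input bytes; smaller is better."""
    if not raw:
        raise ValueError("input corpus is empty")
    return len(compressed) / len(raw)


# Made-up records-like sample: newline-delimited JSON with repeated fields.
records = b"".join(
    b'{"id": %d, "status": "active", "region": "us-east"}\n' % i
    for i in range(10_000)
)

packed = gzip.compress(records, compresslevel=9)
ratio = compression_ratio(records, packed)

print(f"{len(records)} bytes in, {len(packed)} bytes out, ratio {ratio:.3f}")
```

A ratio of 0.25 means the output occupies a quarter of the input, which corresponds to what vendors often quote as "4× compression".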
State-of-the-art general-purpose compressor. Excellent at arbitrary bytes. Not records-aware.
Source: Facebook zstd published ratios on text corpora.
Excellent for analytics OLAP. Doesn't exploit record-class field redundancy.
Source: Apache Parquet community benchmarks on records data.
Compression ratio is not directly published. Customer reports show 3-4× on records-class data.
Source: Snowflake customer-published TCO tear-downs.
Conservative floor — better than zstd -19 by ~1.8× on records-class corpora. Records workload is exactly what the substrate is built for.
Floor: harness publishes the verified scalar
Same corpus as VTPS-01, sharded across 100 commodity nodes connected by a normal LAN. The scalar is p99 query latency for a cross-shard query. Field includes the search clusters that every "search at scale" pitch defaults to: Elasticsearch cluster, Solr Cloud, Splunk distributed.
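In a conventional scatter-gather design, a cross-shard query finishes only when its slowest shard responds, so the query-level tail is worse than any single shard's tail. If each of 100 shards independently answers within some budget 99% of the time, all of them do so only about 0.99^100 ≈ 37% of the time. The simulation below illustrates the effect. The lognormal latency model, its parameters, and the independence of shards are assumptions chosen for the example, not measurements of any system named here.

```python
import random


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def simulate(shards, queries=5_000, seed=7):
    """Return (single-shard p99, fan-out p99) under an assumed latency model."""
    rng = random.Random(seed)
    single, fanout = [], []
    for _ in range(queries):
        # Assumed model: independent lognormal latency per shard (arbitrary units).
        shard_latencies = [rng.lognormvariate(0.0, 0.5) for _ in range(shards)]
        single.append(shard_latencies[0])
        # The fan-out query waits for its slowest shard.
        fanout.append(max(shard_latencies))
    return percentile(single, 99), percentile(fanout, 99)


single_p99, fanout_p99 = simulate(shards=100)
print(f"single-shard p99: {single_p99:.2f}")
print(f"100-shard fan-out p99: {fanout_p99:.2f}")
print(f"P(all 100 shards within their own p99): {0.99 ** 100:.3f}")
```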
Cross-shard fan-out is a coordinator-bound operation. p99 is shard-of-stragglers.
Source: Elastic's own published cluster benchmarks.
Same architectural shape as Elasticsearch — same tail-of-stragglers ceiling.
Source: Apache Solr community benches.
Not optimized for sub-second federation. SIEM-class workload, not OLTP-class.
Source: Splunk indexer cluster tuning guide.
Conservative floor — 10× better than Elasticsearch cluster's high-end published number. Federation is inherent to the record format; no coordinator-straggler tax.
Floor: harness publishes the verified scalar
The floors above are our internal-bench results, dated and conservative. The harness is what makes them publicly reproducible: a sealed signed Validiti binary that reads the corpus, prints the scalar, and exits. No source modifications, no tuning arguments, no "you ran it wrong." When it lands, every gold floor on this page either gets confirmed as a green verified scalar, or we eat the difference publicly. We don't expect to eat anything.
One command per fight. Each invocation reads the corpus, runs the procedure, prints the scalar (and the secondary scalars), and writes a JSON report you can share verbatim.
The corpus URLs are in the YAML config that ships with the binary. No proprietary data, no internal mirrors. The hardware floor is documented per fight; cluster fights tell you the fleet shape.
Until Series 1 ships, every benchmark binary is gated — consistent with every Validiti download. The fights stand. The numbers come the day the harness does. Subscribe below to be on the notification list.
The Validiti numbers above are conservative floors measured on the corpora and procedures published in each fight, on commodity hardware matching the documented floor. They are not aspirational targets, they are not marketing rounding — they are bounds we expect to clear when the sealed harness runs in public. We chose floors instead of best-observed numbers so the eventual public verification only surprises in one direction.
VTPS, RCDR, SACT: 8-core / 32 GB / NVMe, single host. FNQ: 100 of the same nodes connected by commodity Ethernet. No tuned kernels, no InfiniBand, no datacenter-class hardware.
Every floor on this page is set at least 1.5× more conservatively than the worst run we've recorded on the relevant corpus. The harness will print the actual scalar; the difference is upside, not exposure.
Each floor is dated, and it is what we'll defend as of that date. If a competitor publishes a better number for a metric we floored low on, the floor stands; the eventual harness run will speak for itself.
If you ship a database, a search engine, a ledger, a warehouse, or a SIEM and you believe your published numbers are honest — run our harness when it ships. Post your scalar. We will post ours. The reader can read both. That is the entire contest.
Same corpus, same hardware floor, same procedure. Publish the scalar. We will link to your number on this page, sub-second after you publish it.
The corpora here are the public stand-ins. The harness will accept a YAML pointer to your own corpus too — same procedure, same scalar. Don't take any vendor's word for it.
Our scalars must reproduce. If yours don't match within a documented tolerance, we want to know. The reproducibility floor is the whole point.
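A reproduction check of this kind can be as small as a relative-tolerance comparison. The sketch below is one way to express it. The function name and the 5% tolerance are assumptions for illustration, since this page does not state the documented tolerance.

```python
def reproduces(published: float, observed: float, rel_tol: float) -> bool:
    """True when the observed scalar is within rel_tol of the published one.

    The tolerance is measured relative to the published value.
    """
    if published == 0:
        raise ValueError("published scalar must be non-zero")
    return abs(observed - published) <= rel_tol * abs(published)


# Hypothetical numbers: a published scalar of 1000 and two re-runs.
print(reproduces(published=1000.0, observed=1040.0, rel_tol=0.05))  # inside 5%
print(reproduces(published=1000.0, observed=1100.0, rel_tol=0.05))  # outside 5%
```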
Conference main stage, customer floor, regulator's lab, vendor RFP bake-off, press demo, on-camera. Name the venue and we will be there. The numbers print or they don't. We do not pick the judge because there is no judge. The scalar is the scalar. If you ship a database, a search engine, a ledger, a warehouse, or a SIEM and you want your published number next to ours, send it to contact@validiti.com.