QIS Component Deep Dive

Routing by Similarity: Eight Battle-Tested Paths That Already Work

These mechanisms power torrents, recommendations, blockchains. No one applies them to private, real-time insight sharing at quadratic scale. QIS does—starting with survival.

By Christopher Thomas Trevethan • January 15, 2026

QIS Component Series — Steps 3 & 4 of 5
Step 1: Data Aggregation → Step 2: Defining Similarity → Steps 3 & 4: Routing + Outcome Packets → Step 5: Synthesis → Capstone: Every Component Exists

Why Steps 3 & 4 Are One Article

This is what makes QIS groundbreaking: routing and insight retrieval happen in a single operation. Define expert similarity → route to your exact cohort → receive outcome packets back. One trip. No shared compute. No separate fetch. No waiting.

The outcome packet—what worked, what failed, what happened—comes back with the routing response. DHT's FIND_VALUE returns stored data. Vector queries return metadata. Route directly to insight. Take insight. Done.

Read more: One Round Trip →

QIS isn't locked to one routing method.

The core is simple: expert defines similarity → route query there → receive outcome packets (not raw data) → synthesize locally.

Eight mechanisms do this today. All exact-capable. All scalable. All light enough for a phone (or a plugged-in laptop if needed). All O(log N) or better. All tested in production at massive scale.

No one applies them to real-time health outcomes. They power torrents, recommendations, blockchains.

QIS just redirects the traffic to survival.

All Eight Methods at a Glance

# | Method                | Routing               | 1K Packets | 1M Packets | Best For
1 | DHT (Kademlia)        | Hash → bucket         | 3–5s       | 2–3 min    | Decentralized, proven
2 | Vectors (HNSW/FAISS)  | ANN distance=0        | 3–5s       | 90s–2 min  | Fuzzy matching
3 | Registries            | ID → shard            | 2–4s       | 60–90s     | Expert ID mapping
4 | Gossip Overlays       | Epidemic spread       | 2–4s       | 45–70s     | High redundancy
5 | Skip Lists            | Ordered skip          | 3–5s       | 90s–2 min  | Range queries
6 | Content-Addressable   | CID → providers       | 3–5s       | 60–90s    | Verifiable content
7 | Topic Trees (MQTT)    | Subscribe → multicast | 2–4s       | 40–60s     | Real-time IoT
8 | Central Vector DB     | Cluster query         | 1–3s       | 20–40s     | Speed at scale

Conservative numbers (2026 5G, ~100–300 Mbps real-world). Known limits. Proven backups. Pick one—or combine.

1. DHT (Distributed Hash Table)
Kademlia / libp2p / IPFS — Hash-based exact routing • Decentralized

How It Works

Your semantic fingerprint gets hashed (SHA-256) to produce a 256-bit key. This key is an "address" in a shared address space. Each node knows k neighbors (k=20 in libp2p). To find your bucket, ask the neighbor closest to that address, repeat until you arrive. O(log N) hops.

Here's the key: you're not limited to one bucket. The network allocates a hash prefix for each similarity definition—that prefix IS the problem space. All buckets under that prefix belong to that similarity. Query bucket 0, then 1, then 2... walk the prefix space until no more buckets exist. The prefix boundary tells you when you're done. No per-bucket constraints—the similarity definition determines how much address space is allocated, and you query all of it.
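
To make the hop logic concrete, here's a minimal Python sketch of an XOR-distance lookup. The toy network uses random peer tables rather than Kademlia's structured k-buckets, so convergence is only approximate, and the fingerprint string is illustrative; the hop rule itself, though, is the one libp2p performs:

```python
import hashlib
import random

def key_for(fingerprint: str) -> int:
    # Semantic fingerprint -> 256-bit Kademlia key.
    return int.from_bytes(hashlib.sha256(fingerprint.encode()).digest(), "big")

# Toy network: 1,000 nodes with random 256-bit IDs, each knowing 20 peers.
random.seed(0)
node_ids = [random.getrandbits(256) for _ in range(1000)]
peers = {nid: random.sample(node_ids, 20) for nid in node_ids}

def iterative_find(target: int, start: int) -> int:
    """Kademlia-style lookup: hop to the known peer whose ID is
    XOR-closest to the target until no peer is closer. O(log N) hops."""
    current = start
    while True:
        best = min(peers[current], key=lambda nid: nid ^ target)
        if best ^ target >= current ^ target:
            return current          # arrived: this node owns the bucket
        current = best

target = key_for("colorectal_stage3_kras+_msi-")   # illustrative fingerprint
owner = iterative_find(target, start=node_ids[0])
```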

Production Proof

  • BitTorrent Mainline DHT: 16–28 million concurrent nodes (IEEE P2P 2013)
  • IPFS: Kademlia-based content routing for millions of files
  • Ethereum discv5: Node discovery across global network
  • libp2p: Powers Filecoin, Polkadot, and dozens of blockchain networks

Performance

  • 1,000 packets: 3–5 seconds
  • 100,000 packets: 8–12 seconds
  • 1 million packets: 2–3 min (phone)
  • Routing complexity: O(log N)

The QIS Pivot

BitTorrent routes to movie chunks. IPFS routes to file providers. QIS routes to outcome packets from patients exactly like you. Same infrastructure. Different payload. For the deep dive, see DHT: The Quiet Engine Already Running the Internet.

Known Limit

Phone radio and battery strain at ~500 MB sustained transfer. Solution: plug in a laptop/tablet—same code, Ethernet or Wi-Fi.

Backup/Fan-out

Bucket gossip or leader batching. Any node caches 20 outcomes. Subtree walk for spillover. Parallel streams → 1M in ~7–10s on laptop.

2. Distributed Vector Index
HNSW / FAISS on P2P overlay — ANN distance routing • Decentralized

How It Works

Your semantic fingerprint becomes a vector in high-dimensional space (128–1024 dimensions). Query routes via graph traversal—HNSW (Hierarchical Navigable Small World) builds a multi-layer graph where each layer is sparser than the last. Start at the top layer, greedily move toward the query vector, drop down, repeat. Exact match (distance≈0) lands in cluster of identical vectors. Pull attached metadata (outcome packets). Synthesize locally.
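
A minimal sketch using FAISS's HNSW index. The random vectors stand in for real semantic fingerprints from Step 2, and parameters like 32 links per node and the distance threshold are illustrative assumptions:

```python
import numpy as np
import faiss  # pip install faiss-cpu

dim = 128
index = faiss.IndexHNSWFlat(dim, 32)          # multi-layer HNSW graph, 32 links/node

# Stand-in data: 10,000 random fingerprints (real ones come from Step 2).
fingerprints = np.random.rand(10_000, dim).astype("float32")
index.add(fingerprints)

# Route your own fingerprint through the graph; identical vectors
# land at distance ~0.
query = fingerprints[42:43]
distances, ids = index.search(query, 50)
exact_cohort = ids[0][distances[0] < 1e-6]    # the distance=0 cluster
# exact_cohort keys into your local outcome-packet metadata store.
```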

Production Proof

  • FAISS (Meta): 1.5 trillion vectors indexed internally, 8.5× faster than previous best
  • Spotify: Song recommendations via vector similarity
  • ChatGPT: RAG retrieval for knowledge augmentation
  • Google Photos: Image similarity search

Performance

  • 1,000 packets: 3–5 seconds
  • 100,000 packets: 7–10 seconds
  • 1 million packets: 90s–2 min (phone)
  • Phone index limit: ~2M vectors (PQ)

The QIS Pivot

Spotify asks "what song next?" QIS asks "what worked for patients exactly like me?" Same cosine similarity. Same HNSW graph. Different question. For the deep dive, see Vectors: From Central Servers to Phone Swarms.

Known Limit

Phone index tops out around 2 million vectors with product quantization (500 MB–1 GB RAM). Beyond that: shard across devices or move to a laptop.

Backup/Fan-out

5 leader nodes cache full cluster metadata. Gossip heartbeat sync. Partial results returned if some leaders are unreachable.

3. Registries (Expert ID Mapping)
Sharded KV / Knowledge Graph — Direct ID routing • Decentralized

How It Works

Expert-curated template produces a fixed ID (e.g., "colorectal_stage3_kras+_msi-" → ID 4732). ID routes directly to shard/leader that owns that ID range. Leader returns all registered outcome packets for that ID. No graph traversal, no hash lookups—direct addressing. Simpler than DHT when you have well-defined categories.
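
A sketch of the direct-addressing step in Python. The hash-derived ID and the shard interface are illustrative assumptions; a real registry would assign IDs from the expert-curated ontology instead:

```python
import hashlib

NUM_SHARDS = 64  # illustrative shard count

def template_id(template: str) -> int:
    # Stand-in ID assignment: hash the expert-curated template string.
    # A real registry hands out IDs from its ontology.
    return int.from_bytes(hashlib.sha256(template.encode()).digest()[:4], "big")

def shard_for(tid: int) -> int:
    # O(1) direct addressing: ID range -> owning shard, no graph traversal.
    return tid % NUM_SHARDS

tid = template_id("colorectal_stage3_kras+_msi-")
print(f"ID {tid} -> shard {shard_for(tid)}")
# Assumed leader interface: shards[shard_for(tid)].lookup(tid) returns
# every registered outcome packet for that template ID.
```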

Production Proof

  • DNS: The original registry—routes names to IPs at global scale
  • Consul/etcd: Service discovery for microservices (millions of lookups/sec)
  • Knowledge Graphs: Google's Knowledge Graph, Wikidata (billions of entities)
  • Medical ontologies: SNOMED CT, ICD-10 (hundreds of thousands of codes)

Performance

  • 1,000 packets: 2–4 seconds
  • 100,000 packets: 6–9 seconds
  • 1 million packets: 60–90 seconds
  • Routing: O(1) to shard

The QIS Pivot

DNS routes "google.com" to an IP. Medical registries route ICD codes to billing. QIS routes "Stage 3 KRAS+ colorectal" to every outcome packet from matching patients. Same registry pattern. Different payload.

Known Limit

Leader hotspot if one ID is very popular. Solution: shard + parallel ping across replicas.

Backup/Fan-out

3 mirrored leaders per shard. Gossip or consensus sync. Automatic failover.

4. Gossip Overlays
Epidemic Protocols / Scuttlebutt — Viral propagation • Decentralized

How It Works

Nodes tell random peers which publishers they know. Each round, information "infects" new nodes exponentially—like a virus spreading through a population. After O(log N) rounds, information reaches all nodes with high probability. Each node caches recent publishers, so a query finds what it needs through accumulated gossip. Natural redundancy: everyone holds a slice.
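
A toy simulation of the epidemic round count. The fanout and network sizes are arbitrary; real gossip protocols add membership views and anti-entropy, which this sketch omits:

```python
import math
import random

def gossip_rounds(n_nodes: int, fanout: int = 3, seed: int = 0) -> int:
    """Each round, every informed node pushes to `fanout` random peers.
    Returns rounds until the whole network is informed."""
    random.seed(seed)
    informed = {0}                     # one publisher starts the epidemic
    rounds = 0
    while len(informed) < n_nodes:
        rounds += 1
        targets = [random.randrange(n_nodes)
                   for _ in range(fanout * len(informed))]
        informed.update(targets)
    return rounds

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} nodes: {gossip_rounds(n)} rounds (log2 N ~ {math.log2(n):.0f})")
```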

Production Proof

  • Amazon Dynamo: Gossip for membership and failure detection
  • Apache Cassandra: Cluster state propagation via gossip
  • Bitcoin: Transaction and block propagation across network
  • Scuttlebutt (SSB): Social network with pure gossip replication

Performance

  • 1,000 packets: 2–4 seconds
  • 100,000 packets: 5–8 seconds
  • 1 million packets: 45–70 seconds
  • Propagation: O(log N) rounds

The QIS Pivot

Cassandra gossips cluster state. Bitcoin gossips transactions. QIS gossips outcome packets from survival patterns. Same epidemic algorithm. Different epidemic: insights instead of viruses.

Known Limit

Churn: 50% of nodes offline at any time is common. Eventually consistent—may serve stale data during propagation.

Backup/Fan-out

Natural—every node caches a slice. High redundancy built-in. Partial vote fallback if some nodes unavailable.

5. Skip Lists
Distributed Sorted Sets / Skip Graphs — Ordered overlay • Decentralized

How It Works

Nodes arranged in sorted order by similarity score. Multi-level structure: bottom level is complete sorted list, higher levels "skip" progressively more nodes (like express lanes). Query starts at top level, skips as far as possible, drops down, repeats. O(log N) search. Unlike DHT, preserves ordering—so you can find "all patients with similarity score between 0.8 and 0.95" (range queries).
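
The range-query behavior in miniature, using a sorted list with binary search as a stand-in for the skip list's express-lane levels. The scores and packet IDs are made up:

```python
import bisect

# Assumed: (similarity_score, packet_id) pairs, kept sorted by score.
# A distributed skip graph offers the same ordered access across nodes.
entries = sorted([(0.73, "pkt_a"), (0.81, "pkt_b"), (0.88, "pkt_c"),
                  (0.91, "pkt_d"), (0.97, "pkt_e")])

def range_query(lo: float, hi: float):
    # O(log N + k): binary-search both bounds, return the k results between.
    left = bisect.bisect_left(entries, (lo, ""))
    right = bisect.bisect_right(entries, (hi, "\uffff"))
    return entries[left:right]

print(range_query(0.8, 0.95))
# -> [(0.81, 'pkt_b'), (0.88, 'pkt_c'), (0.91, 'pkt_d')]
```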

Production Proof

  • Redis: Sorted sets (ZSET) use skip lists internally
  • RocksDB: Default Memtable implementation
  • Java ConcurrentSkipListMap: Thread-safe sorted maps in production systems
  • LevelDB: In-memory sorted structure

Performance

  • 1,000 packets: 3–5 seconds
  • 100,000 packets: 8–12 seconds
  • 1 million packets: 90s–2 min
  • Range queries: O(log N + k)

The QIS Pivot

Redis uses skip lists for leaderboards. Discord uses them for member ordering. QIS uses them for ordered outcome retrieval by similarity score. Want the 100 most similar patients? Skip list finds them in order.

Known Limit

Long walk lengths across very large lists. Laptop recommended for 1M+ traversals.

Backup/Fan-out

Duplicate lists (2–3 chains). Ping fastest. Natural fault tolerance through redundancy.

6. Content-Addressable Storage
IPFS / Swarm — Hash-verified content • Decentralized

How It Works

Hash the semantic fingerprint (expert-defined similarity) → get a Content ID (CID). The similarity definition IS the address. DHT routes to "providers" who store outcome packets for that similarity CID. Query by similarity → get all matching outcome packets. The CID is deterministic: same similarity definition always maps to the same address. Optional signatures verify individual packet integrity after retrieval.
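
The deterministic-addressing step, sketched in Python. The hex digest stands in for a real multihash CID, and the field names in the similarity definition are illustrative:

```python
import hashlib
import json

def similarity_cid(definition: dict) -> str:
    """Deterministic content ID: canonicalize the similarity definition,
    then hash it. Same definition -> same address, every time.
    (Hex digest shown; IPFS uses a multibase/multihash CID format.)"""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

cid = similarity_cid({
    "cancer": "colorectal", "stage": 3, "kras": "+", "msi": "-",
})
# Assumed interface: dht.find_providers(cid), then fetch outcome packets
# from each provider; verify each packet by recomputing its hash.
```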

Production Proof

  • IPFS: Millions of CIDs, powers NFT metadata storage
  • Filecoin: Incentivized storage network built on IPFS
  • Cloudflare: IPFS gateway since 2018
  • Wikipedia mirror: Entire archive on IPFS

Performance

  • 1,000 packets: 3–5 seconds
  • 100,000 packets: 7–10 seconds
  • 1 million packets: 60–90 seconds
  • Verification: hash recompute

The QIS Pivot

IPFS addresses content by its hash. QIS addresses similarity definitions by their hash—then stores outcome packets at that address. Same deterministic addressing. Same decentralized routing. Different key: similarity instead of content.

Known Limit

Provider availability—if no one hosts the CID, it's gone. Research shows centralization in practice.

Backup/Fan-out

Multi-provider pinning. Natural replication when multiple nodes store same CID. Fallback to pinning services.

7. Topic Trees (MQTT/PubSub)
Hierarchical Publish-Subscribe — Real-time multicast • Centralized

How It Works

Subscribe to a topic path: "/lung/stage3a/egfr+". Broker routes all messages matching that topic to all subscribers. Wildcards supported: "/lung/+/egfr+" matches any stage with EGFR+. Hierarchical topics enable coarse-to-fine filtering. Publishers post outcome packets to their topic. Subscribers receive in real-time. Lightweight protocol—2-byte header minimum, designed for constrained devices.
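
A sketch with the paho-mqtt package (1.x callback style). The broker host is a placeholder, and the outcome-packet payload is illustrative:

```python
import json
import paho.mqtt.client as mqtt  # pip install "paho-mqtt<2" (1.x callback style)

def on_message(client, userdata, msg):
    # Every message on a subscribed topic is one outcome packet.
    userdata.append(json.loads(msg.payload))   # collect for local synthesis

packets = []
client = mqtt.Client(userdata=packets)
client.on_message = on_message
client.connect("broker.example.org", 1883)     # placeholder broker host

# Wildcard subscribe: any stage, EGFR+ (coarse-to-fine filtering).
client.subscribe("/lung/+/egfr+")

# Publishers post outcome packets to their exact similarity topic:
client.publish("/lung/stage3a/egfr+", json.dumps(
    {"worked": "regimen A", "failed": "regimen B", "context": "..."}))

client.loop_start()                            # background network loop
```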

Production Proof

  • EMQX: 100 million concurrent connections benchmarked
  • TBMQ: 3 million messages/second on single node, 100M connections
  • HiveMQ: "Millions of connections, billions of messages" (enterprise IoT)
  • AWS IoT Core: MQTT backbone for millions of devices

Performance

  • 1,000 packets: 2–4 seconds
  • 100,000 packets: 5–8 seconds
  • 1 million packets: 40–60 seconds
  • Latency: single-digit ms

The QIS Pivot

Today's IoT publishes raw sensor readings to central databases for analytics. That's data collection, not quadratic insight. QIS publishes outcome packets—what worked, what failed, under what conditions—to similarity-defined topics. Subscribers synthesize locally. The infrastructure is identical; the pattern is fundamentally different. Works for any domain: healthcare outcomes, industrial failures, agricultural yields, financial signals.

Known Limit

Broker overload if topics are too broad. Solution: shard topics across a broker cluster.

Backup/Fan-out

Broker cluster (3+ nodes). Auto-failover. QoS levels for delivery guarantee.

8. Central Vector Database
Pinecone / Milvus / Weaviate — Managed ANN at scale • Centralized

How It Works

One cluster holds the full index. Query exact vector V (distance=0). Engine (IVF-PQ, HNSW, or DiskANN) returns matching IDs with attached metadata. Server batches or CDN-streams outcome packets. Fastest option—paid infrastructure handles all complexity. Trade decentralization for speed and simplicity.
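
The query pattern, sketched with a placeholder client. The class, endpoint, and response shape below are hypothetical stand-ins, not Pinecone's, Milvus's, or Weaviate's actual SDKs:

```python
import numpy as np

class VectorDBClient:
    """Hypothetical managed vector-DB client; the class and method names
    are placeholders for whichever real SDK you choose."""
    def __init__(self, endpoint: str, api_key: str):
        self.endpoint, self.api_key = endpoint, api_key
    def query(self, vector, top_k: int, include_metadata: bool = True):
        return []   # stub: a real client returns scored matches

db = VectorDBClient("https://qis.example-vectordb.io", api_key="...")

# Exact-match query: send your fingerprint, keep the distance=0 cluster.
fingerprint = np.random.rand(128).astype("float32").tolist()
results = db.query(vector=fingerprint, top_k=1000)
packets = [r["metadata"] for r in results if r["distance"] < 1e-6]
# The server batches or CDN-streams the attached outcome packets back.
```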

Production Proof

  • Pinecone: 1.4 billion vectors, 5,700 QPS, 26ms P50 latency (Dec 2025)
  • Milvus: 40K+ GitHub stars, 10,000+ production deployments, tens of billions of vectors
  • Weaviate: Hybrid search with GraphQL API
  • Enterprise users: Salesforce, PayPal, eBay, NVIDIA (Milvus)

Performance

  • 1,000 packets: 1–3 seconds
  • 100,000 packets: 4–7 seconds
  • 1 million packets: 20–40 seconds
  • Proven scale: billions of vectors

The QIS Pivot

ChatGPT uses vector DB for RAG retrieval. Enterprise uses it for semantic search. QIS uses it for outcome packet retrieval from patients exactly like you. Same infrastructure. Same query pattern. Different question.

Known Limit

Single vendor and single trust point. Lower privacy (the central operator sees every query). Rare buckets (<50 matches) risk re-identification—add a minimum-N guard.

Backup/Fan-out

Built-in replication + multi-region. Fork clusters: ping both, fastest wins.

The scaling law holds across all eight. N agents create N(N-1)/2 synthesis opportunities regardless of which routing mechanism you choose. The architecture determines efficiency, latency, and trust model. The math determines intelligence scaling. Pick the architecture that fits your constraints—the quadratic benefit follows.
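
For concreteness: 1,000 agents yield 1,000 × 999 / 2 = 499,500 pairwise synthesis opportunities; 1 million agents yield nearly 500 billion.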

While this article uses healthcare examples, the core is as broad as it gets. If you can define the problem and have data sources, you can get quadratic insight. Medicine, manufacturing, agriculture, finance, logistics, energy, research—any domain where similar cases hold answers. QIS enables real-time, scalable, private insight across almost all of them. Precision everything.

Phone Reality: What Actually Works

Every method above has phone/laptop performance estimates. Here's the reality check:

Conservative 2026 Phone Performance

Operation                   | Phone (5G)    | Laptop (Wi-Fi)
100,000 packets (~48 MB)    | 8–12 seconds  | 2–5 seconds
1 million packets (~488 MB) | 2–3 minutes   | 20–40 seconds
Local index search          | 2–10 ms       | 1–5 ms
Synthesis (median vote)     | ~100 ms on 1M | ~50 ms on 1M
Battery for 500 MB transfer | ~1–2% drain   | N/A (plugged in)

Phone hits a wall at ~500 MB sustained? Plug in laptop/tablet. Same code. Ethernet/Wi-Fi. Seconds instead of minutes.

The more networks exist, the tighter expert similarity gets. Buckets shrink from millions to hundreds. Pulls drop to blinks. Treatments tailor. Lives extend.

Why These Methods Aren't Saving Lives Already

Every method above is battle-tested. Every one powers critical infrastructure.

None of them route health outcomes.

BitTorrent routes movie chunks. IPFS routes NFT metadata. MQTT routes temperature readings. Cassandra gossips cluster state. Pinecone routes ChatGPT retrieval.

No one asked: What if we routed treatment outcomes instead?

QIS asks that question. The routing mechanisms already exist. The scale is already proven. The only thing missing was the insight to redirect them toward survival.

The Challenge

Show me the pull that takes hours on hospital Wi-Fi.
Show me the bucket without backup or spillover.
Show me the phone that dies before the vote finishes (or just plug in a laptop—same code).
Show me the method that fails at billion scale (Pinecone doesn't. EMQX doesn't. Milvus doesn't).

Can't?

Then routing by similarity is ready.

Scaling isn't the problem. Waiting is.

The only question left: which one do you build first?

Central for speed. Distributed for sovereignty. Hybrid for today. All eight work. All eight scale. All eight can save lives.

Next: Step 5 — Synthesis →

Build With Us


Subscribe on Substack • DHT Deep Dive • Vectors Deep Dive • Interactive Demo