QIS Component Series — Steps 3 & 4 of 5
Step 1: Data Aggregation →
Step 2: Defining Similarity →
Steps 3 & 4: Routing + Outcome Packets →
Step 5: Synthesis →
Capstone: Every Component Exists
Why Steps 3 & 4 Are One Article
This is what makes QIS groundbreaking: routing and insight retrieval happen in a single operation. Define expert similarity → route to your exact cohort → receive outcome packets back. One trip. No shared compute. No separate fetch. No waiting.
The outcome packet—what worked, what failed, what happened—comes back with the routing response. DHT's FIND_VALUE returns stored data. Vector queries return metadata. Route directly to insight. Take insight. Done.
QIS isn't locked to one routing method.
The core is simple: expert defines similarity → route query there → receive outcome packets (not raw data) → synthesize locally.
Eight mechanisms do this today. All exact-capable. All scalable. All light enough for a phone (or a plugged-in laptop if needed). All O(log N) or better. All tested in production at massive scale.
No one applies them to real-time health outcomes. They power torrents, recommendations, blockchains.
QIS just redirects the traffic to survival.
All Eight Methods at a Glance
| # | Method | Routing | 1K Packets | 1M Packets | Best For |
|---|---|---|---|---|---|
| 1 | DHT (Kademlia) | Hash → bucket | 3–5s | 2–3 min | Decentralized, proven |
| 2 | Vectors (HNSW/FAISS) | ANN distance=0 | 3–5s | 90s–2 min | Fuzzy matching |
| 3 | Registries | ID → shard | 2–4s | 60–90s | Expert ID mapping |
| 4 | Gossip Overlays | Epidemic spread | 2–4s | 45–70s | High redundancy |
| 5 | Skip Lists | Ordered skip | 3–5s | 90s–2 min | Range queries |
| 6 | Content-Addressable | CID → providers | 3–5s | 60–90s | Verifiable content |
| 7 | Topic Trees (MQTT) | Subscribe → multicast | 2–4s | 40–60s | Real-time IoT |
| 8 | Central Vector DB | Cluster query | 1–3s | 20–40s | Speed at scale |
Conservative numbers assuming real-world 2026 5G throughput of ~100–300 Mbps. Known limits. Proven backups. Pick one—or combine.
Method 1: DHT (Kademlia)
How It Works
Your semantic fingerprint gets hashed (SHA-256) to produce a 256-bit key. This key is an "address" in a shared address space. Each node knows k neighbors (k=20 in libp2p). To find your bucket, ask the neighbor closest to that address, repeat until you arrive. O(log N) hops.
Here's the key: you're not limited to one bucket. The network allocates a hash prefix for each similarity definition—that prefix IS the problem space. All buckets under that prefix belong to that similarity. Query bucket 0, then 1, then 2... walk the prefix space until no more buckets exist. The prefix boundary tells you when you're done. No per-bucket constraints—the similarity definition determines how much address space is allocated, and you query all of it.
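The lookup above can be sketched in a few lines. This is a toy model, not libp2p's API: the neighbor table is a plain dict, and `route` simply hops to whichever known peer is closest to the key under Kademlia's XOR metric.

```python
import hashlib

def key_for(fingerprint: str) -> int:
    """SHA-256 of a semantic fingerprint -> 256-bit integer key (the 'address')."""
    return int.from_bytes(hashlib.sha256(fingerprint.encode()).digest(), "big")

def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: bitwise XOR, compared as an unsigned integer."""
    return a ^ b

def route(query_key: int, known_nodes: list, neighbors: dict) -> int:
    """Greedy lookup: repeatedly hop to the neighbor closest to the key.
    Terminates when no neighbor is closer -- that node owns the bucket."""
    current = min(known_nodes, key=lambda n: xor_distance(n, query_key))
    while True:
        closer = min(neighbors.get(current, [current]),
                     key=lambda n: xor_distance(n, query_key))
        if xor_distance(closer, query_key) >= xor_distance(current, query_key):
            return current
        current = closer
```

With k=20 neighbors per node and well-distributed IDs, each hop halves the remaining distance, which is where the O(log N) hop count comes from.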
Production Proof
- BitTorrent Mainline DHT: 16–28 million concurrent nodes (IEEE P2P 2013)
- IPFS: Kademlia-based content routing for millions of files
- Ethereum discv5: Node discovery across global network
- libp2p: Powers Filecoin, Polkadot, and dozens of blockchain networks
The QIS Pivot
BitTorrent routes to movie chunks. IPFS routes to file providers. QIS routes to outcome packets from patients exactly like you. Same infrastructure. Different payload. For the deep dive, see DHT: The Quiet Engine Already Running the Internet.
Known Limit
Phone radio and battery hit limits around ~500 MB of sustained transfer. Solution: plug in a laptop or tablet—same code, over Ethernet or Wi-Fi.
Backup/Fan-out
Bucket gossip or leader batching. Any node caches 20 outcomes. Subtree walk for spillover. Parallel streams → 1M in ~7–10s on laptop.
Method 2: Vectors (HNSW/FAISS)
How It Works
Your semantic fingerprint becomes a vector in high-dimensional space (128–1024 dimensions). Query routes via graph traversal—HNSW (Hierarchical Navigable Small World) builds a multi-layer graph where each layer is sparser than the last. Start at the top layer, greedily move toward the query vector, drop down, repeat. Exact match (distance≈0) lands in cluster of identical vectors. Pull attached metadata (outcome packets). Synthesize locally.
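The greedy descent on one layer can be shown in miniature. This is a sketch, not a real HNSW index: the proximity graph is hand-built, and a production index repeats this descent across progressively denser layers.

```python
import math

def cosine_dist(a, b) -> float:
    """1 - cosine similarity: 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def greedy_search(graph, vectors, entry, query) -> int:
    """One HNSW layer: move to whichever neighbor is closest to the query;
    stop when no neighbor improves on the current node."""
    current = entry
    while True:
        best = min(graph[current], key=lambda n: cosine_dist(vectors[n], query))
        if cosine_dist(vectors[best], query) >= cosine_dist(vectors[current], query):
            return current
        current = best
```

An exact match lands on a node whose vector has distance ≈ 0 to the query; its attached metadata is the outcome packet.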
Production Proof
- FAISS (Meta): 1.5 trillion vectors indexed internally, 8.5× faster than previous best
- Spotify: Song recommendations via vector similarity
- ChatGPT: RAG retrieval for knowledge augmentation
- Google Photos: Image similarity search
The QIS Pivot
Spotify asks "what song next?" QIS asks "what worked for patients exactly like me?" Same cosine similarity. Same HNSW graph. Different question. For the deep dive, see Vectors: From Central Servers to Phone Swarms.
Known Limit
Phone can index ~2 million vectors with product quantization (500 MB–1 GB RAM). Beyond that: shard across devices or move to a laptop.
Backup/Fan-out
5 leader nodes cache full cluster metadata. Gossip heartbeat sync. Partial results if some leaders are unreachable.
Method 3: Registries
How It Works
Expert-curated template produces a fixed ID (e.g., "colorectal_stage3_kras+_msi-" → ID 4732). ID routes directly to shard/leader that owns that ID range. Leader returns all registered outcome packets for that ID. No graph traversal, no hash lookups—direct addressing. Simpler than DHT when you have well-defined categories.
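The direct addressing step is just a range-table lookup. The shard boundaries and leader names below are hypothetical; the point is that an ID resolves to its owner in one binary search, with no hops.

```python
import bisect

# Hypothetical 4-shard layout: each leader owns a contiguous ID range.
SHARD_STARTS = [0, 2000, 4000, 6000]
LEADERS = ["leader-a", "leader-b", "leader-c", "leader-d"]

def leader_for(template_id: int) -> str:
    """Direct addressing: binary-search the range table. O(log shards),
    no graph traversal, no hash lookup."""
    i = bisect.bisect_right(SHARD_STARTS, template_id) - 1
    return LEADERS[i]
```

The example ID 4732 from the text falls in the [4000, 6000) range and resolves to that range's leader, which then returns every registered outcome packet for the ID.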
Production Proof
- DNS: The original registry—routes names to IPs at global scale
- Consul/etcd: Service discovery for microservices (millions of lookups/sec)
- Knowledge Graphs: Google's Knowledge Graph, Wikidata (billions of entities)
- Medical ontologies: SNOMED CT, ICD-10 (hundreds of thousands of codes)
The QIS Pivot
DNS routes "google.com" to an IP. Medical registries route ICD codes to billing. QIS routes "Stage 3 KRAS+ colorectal" to every outcome packet from matching patients. Same registry pattern. Different payload.
Known Limit
Leader hotspot if one ID is very popular. Solution: shard + parallel ping across replicas.
Backup/Fan-out
3 mirrored leaders per shard. Gossip or consensus sync. Automatic failover.
Method 4: Gossip Overlays
How It Works
Nodes gossip publishers they know to random peers. Each round, information "infects" new nodes exponentially—like a virus spreading through a population. After O(log N) rounds, information reaches all nodes with high probability. Each node caches recent publishers. Query finds what you need through accumulated gossip. Natural redundancy: everyone has a slice.
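The exponential spread is easy to see in simulation. This is a toy push-gossip model, assuming uniform random peer selection and no churn; the fanout of 3 is illustrative.

```python
import random

def gossip_rounds(n_nodes: int, fanout: int = 3, seed: int = 0) -> int:
    """Simulate push gossip: each informed node tells `fanout` random peers
    per round. Returns the number of rounds until all nodes are informed."""
    rng = random.Random(seed)
    informed = {0}            # node 0 publishes the outcome packet
    rounds = 0
    while len(informed) < n_nodes:
        for _node in list(informed):          # snapshot: new nodes gossip next round
            informed.update(rng.randrange(n_nodes) for _ in range(fanout))
        rounds += 1
    return rounds
```

Because the informed set roughly multiplies each round, full coverage of 1,000 nodes takes only a handful of rounds, which is the O(log N) behavior the section describes.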
Production Proof
- Amazon Dynamo: Gossip for membership and failure detection
- Apache Cassandra: Cluster state propagation via gossip
- Bitcoin: Transaction and block propagation across network
- Scuttlebutt (SSB): Social network with pure gossip replication
The QIS Pivot
Cassandra gossips cluster state. Bitcoin gossips transactions. QIS gossips outcome packets from survival patterns. Same epidemic algorithm. Different epidemic: insights instead of viruses.
Known Limit
Churn (50% of nodes offline is common). Eventually consistent—data may be stale during propagation.
Backup/Fan-out
Natural—every node caches a slice. High redundancy built-in. Partial vote fallback if some nodes unavailable.
Method 5: Skip Lists
How It Works
Nodes arranged in sorted order by similarity score. Multi-level structure: bottom level is complete sorted list, higher levels "skip" progressively more nodes (like express lanes). Query starts at top level, skips as far as possible, drops down, repeats. O(log N) search. Unlike DHT, preserves ordering—so you can find "all patients with similarity score between 0.8 and 0.95" (range queries).
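The range-query payoff looks like this. A real skip list reaches the start of the range in O(log N) via its express lanes; here `bisect` on a sorted list stands in for that search so the semantics stay visible in a few lines.

```python
import bisect

def range_query(entries, lo, hi):
    """entries: list of (similarity_score, packet_id) tuples sorted by score.
    Returns every entry with lo <= score <= hi, already in order."""
    left = bisect.bisect_left(entries, (lo, ""))
    right = bisect.bisect_right(entries, (hi, "\uffff"))
    return entries[left:right]
```

This is the query a hash-based DHT cannot answer directly: "all patients with similarity score between 0.8 and 0.95" comes back as one contiguous, ordered slice.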
Production Proof
- Redis: Sorted sets (ZSET) use skip lists internally
- RocksDB: Default Memtable implementation
- Java ConcurrentSkipListMap: Thread-safe sorted maps in production systems
- LevelDB: In-memory sorted structure
The QIS Pivot
Redis uses skip lists for leaderboards. Discord uses them for member ordering. QIS uses them for ordered outcome retrieval by similarity score. Want the 100 most similar patients? Skip list finds them in order.
Known Limit
Walk length grows on very large lists. Laptop recommended for traversals over 1M entries.
Backup/Fan-out
Duplicate lists (2–3 chains). Ping fastest. Natural fault tolerance through redundancy.
Method 6: Content-Addressable
How It Works
Hash the semantic fingerprint (expert-defined similarity) → get a Content ID (CID). The similarity definition IS the address. DHT routes to "providers" who store outcome packets for that similarity CID. Query by similarity → get all matching outcome packets. The CID is deterministic: same similarity definition always maps to the same address. Optional signatures verify individual packet integrity after retrieval.
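Deterministic addressing reduces to canonical serialization plus a hash. The sketch below uses sorted-key JSON so that the same similarity definition always produces the same CID regardless of field order (the field names are illustrative; real IPFS CIDs also carry multihash/multibase prefixes, omitted here).

```python
import hashlib
import json

def similarity_cid(definition: dict) -> str:
    """Deterministic content ID: canonical JSON -> SHA-256 hex digest.
    The same similarity definition always maps to the same address."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```

Any node can recompute the CID independently and route to the providers storing outcome packets under it—no registry, no coordination.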
Production Proof
- IPFS: Millions of CIDs, powers NFT metadata storage
- Filecoin: Incentivized storage network built on IPFS
- Cloudflare: IPFS gateway since 2018
- Wikipedia mirror: Entire archive on IPFS
The QIS Pivot
IPFS addresses content by its hash. QIS addresses similarity definitions by their hash—then stores outcome packets at that address. Same deterministic addressing. Same decentralized routing. Different key: similarity instead of content.
Known Limit
Provider availability—if no one hosts the CID, it's gone. Research shows centralization in practice.
Backup/Fan-out
Multi-provider pinning. Natural replication when multiple nodes store same CID. Fallback to pinning services.
Method 7: Topic Trees (MQTT)
How It Works
Subscribe to a topic path: "/lung/stage3a/egfr+". Broker routes all messages matching that topic to all subscribers. Wildcards supported: "/lung/+/egfr+" matches any stage with EGFR+. Hierarchical topics enable coarse-to-fine filtering. Publishers post outcome packets to their topic. Subscribers receive in real-time. Lightweight protocol—2-byte header minimum, designed for constrained devices.
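The wildcard matching a broker performs can be sketched directly. This follows MQTT's semantics—`+` matches exactly one level, `#` matches everything below—though it simplifies by normalizing leading slashes, which strict MQTT treats as a distinct empty level.

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """MQTT-style topic filter: '+' matches one level, '#' matches the rest."""
    p = pattern.strip("/").split("/")
    t = topic.strip("/").split("/")
    for i, seg in enumerate(p):
        if seg == "#":                 # multi-level wildcard: match all below
            return True
        if i >= len(t):                # pattern longer than topic
            return False
        if seg != "+" and seg != t[i]: # literal segment must match exactly
            return False
    return len(p) == len(t)
```

So a subscription to "/lung/+/egfr+" receives packets published to "/lung/stage3a/egfr+" and "/lung/stage1b/egfr+", but not to an EGFR- topic.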
Production Proof
- EMQX: 100 million concurrent connections benchmarked
- TBMQ: 3 million messages/second on a single node, 100M connections
- HiveMQ: "Millions of connections, billions of messages" (enterprise IoT)
- AWS IoT Core: MQTT backbone for millions of devices
The QIS Pivot
Today's IoT publishes raw sensor readings to central databases for analytics. That's data collection, not quadratic insight. QIS publishes outcome packets—what worked, what failed, under what conditions—to similarity-defined topics. Subscribers synthesize locally. The infrastructure is identical; the pattern is fundamentally different. Works for any domain: healthcare outcomes, industrial failures, agricultural yields, financial signals.
Known Limit
Broker load spikes if topics are too broad. Solution: shard topics across a broker cluster.
Backup/Fan-out
Broker cluster (3+ nodes). Auto-failover. QoS levels for delivery guarantee.
Method 8: Central Vector DB
How It Works
One cluster holds the full index. Query exact vector V (distance=0). Engine (IVF-PQ, HNSW, or DiskANN) returns matching IDs with attached metadata. Server batches or CDN-streams outcome packets. Fastest option—paid infrastructure handles all complexity. Trade decentralization for speed and simplicity.
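The query contract reduces to: exact vector in, attached metadata out. The brute-force stand-in below makes that contract concrete; a hosted engine (IVF-PQ, HNSW, DiskANN) answers the same question over billions of vectors instead of a list.

```python
def query_exact(index, query_vec, tol=1e-9):
    """index: list of (vector, metadata) pairs. Returns metadata for every
    stored vector within `tol` squared distance of the query (distance ~= 0)."""
    return [meta for vec, meta in index
            if sum((a - b) ** 2 for a, b in zip(vec, query_vec)) <= tol]
```

The returned metadata objects are the outcome packets; synthesis still happens locally on the querying device.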
Production Proof
- Pinecone: 1.4 billion vectors, 5,700 QPS, 26ms P50 latency (Dec 2025)
- Milvus: 40K+ GitHub stars, 10,000+ production deployments, tens of billions of vectors
- Weaviate: Hybrid search with GraphQL API
- Enterprise users: Salesforce, PayPal, eBay, NVIDIA (Milvus)
The QIS Pivot
ChatGPT uses vector DB for RAG retrieval. Enterprise uses it for semantic search. QIS uses it for outcome packet retrieval from patients exactly like you. Same infrastructure. Same query pattern. Different question.
Known Limit
Single vendor and single point of trust. Lower privacy (the central operator sees all queries). Rare buckets (<50 matches) risk re-identification—add a minimum-N guard.
Backup/Fan-out
Built-in replication + multi-region. Fork clusters: ping both, fastest wins.
The scaling law holds across all eight. N agents create N(N-1)/2 synthesis opportunities regardless of which routing mechanism you choose. The architecture determines efficiency, latency, and trust model. The math determines intelligence scaling. Pick the architecture that fits your constraints—the quadratic benefit follows.
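The scaling law is one line of arithmetic, worth seeing with numbers:

```python
def synthesis_opportunities(n: int) -> int:
    """N agents -> N(N-1)/2 unique pairs available for synthesis."""
    return n * (n - 1) // 2
```

1,000 agents yield 499,500 pairs; doubling to 2,000 agents yields 1,999,000—roughly 4× the opportunities for 2× the agents. That quadratic gap is the same regardless of which of the eight routing mechanisms carries the packets.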
While this article uses healthcare examples, the core is as broad as it gets. If you can define the problem and have data sources, you can get quadratic insight. Medicine, manufacturing, agriculture, finance, logistics, energy, research—any domain where similar cases hold answers. QIS enables real-time, scalable, private insight across almost all of them. Precision everything.
Phone Reality: What Actually Works
Every method above has phone/laptop performance estimates. Here's the reality check:
Conservative 2026 Phone Performance
| Operation | Phone (5G) | Laptop (Wi-Fi) |
|---|---|---|
| 100,000 packets (~48 MB) | 8–12 seconds | 2–5 seconds |
| 1 million packets (~488 MB) | 2–3 minutes | 20–40 seconds |
| Local index search | 2–10 ms | 1–5 ms |
| Synthesis (median vote) | ~100 ms on 1M | ~50 ms on 1M |
| Battery for 500 MB transfer | ~1–2% drain | N/A (plugged in) |
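A back-of-envelope check shows how conservative the table is. Raw link time for the 1M-packet pull (assuming ~488 MB and the article's 100–300 Mbps real-world 5G figure):

```python
def transfer_seconds(megabytes: float, mbps: float) -> float:
    """Raw transfer time: megabytes * 8 bits/byte / link rate in Mbit/s.
    Ignores protocol overhead, radio variability, and battery throttling."""
    return megabytes * 8 / mbps
```

At the low end (100 Mbps), 488 MB moves in about 39 seconds of pure link time, so the table's 2–3 minute phone estimate leaves generous headroom for overhead and real-world radio conditions.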
Phone hits a wall at ~500 MB sustained? Plug in laptop/tablet. Same code. Ethernet/Wi-Fi. Seconds instead of minutes.
The more networks exist, the tighter expert similarity gets. Buckets shrink from millions to hundreds. Pulls drop to blinks. Treatments tailor. Lives extend.
Why These Methods Aren't Saving Lives Already
Every method above is battle-tested. Every one powers critical infrastructure.
None of them route health outcomes.
BitTorrent routes movie chunks. IPFS routes NFT metadata. MQTT routes temperature readings. Cassandra gossips cluster state. Pinecone routes ChatGPT retrieval.
No one asked: What if we routed treatment outcomes instead?
QIS asks that question. The routing mechanisms already exist. The scale is already proven. The only thing missing was the insight to redirect them toward survival.
The Challenge
Name one reason these eight mechanisms can't route outcome packets by similarity.
Can't?
Then routing by similarity is ready.
Scaling isn't the problem. Waiting is.
The only question left: which one do you build first?
Central for speed. Distributed for sovereignty. Hybrid for today. All eight work. All eight scale. All eight can save lives.