Senior software engineer blogging about software systems, computing history, and practical engineering.

QRP in the Real World

This is a guide to Query Routing Protocol (QRP) as it exists in GTK-Gnutella compatible clients. It deviates from the QRP spec because the spec does not reflect what is used in the real network.

src/core/qrp.c implements both QRP update handling and query target selection.

The old QRP spec describes a general distance-vector scheme where route tables carry hop counts and can be propagated across the graph. That is not the protocol that survived in today's Gnutella.

For protocol implementors:

  1. Use Gnutella message function 0x30 for QRP updates. QRP messages are never routed and must be sent with hops 0 and TTL at most 1.
  2. Use QRP between leaves and ultrapeers, and optionally between directly connected ultrapeers. Leaf-to-ultrapeer QRP lets the ultrapeer decide which leaves should see a query. Ultrapeer-to-ultrapeer QRP is only a last-hop filter: it tells a neighbor whether this ultrapeer/leaf cluster may answer, not whether content exists multiple hops behind that neighbor.
  3. Treat tables as presence filters, not hop-count route tables. A set bit means the peer or its leaves may have something matching a hash.
  4. Send RESET before the first table, then PATCH sequences. Later updates can be patch-only if the table size has not changed.
  5. Support 4-bit signed patches and zlib. Accept 8-bit patches for compatibility with the original QRP format and old clients, but do not prefer them for modern GTK-Gnutella-style presence tables because they are larger and carry no useful hop-distance information. Support 1-bit XOR patches only when the peer advertises the QRP1 vendor feature or on G2 QHT.
  6. Do not require every query word to match when there are three or more words. GTK-Gnutella follows LimeWire's approach: two thirds of words are enough.
  7. Do not use the full-propagation algorithm in the original QRP paper. It creates stale information and black holes in the real network model.

Where QRP Is Used

Leaf to Ultrapeer

A leaf sends its own QRP table to its ultrapeers. The ultrapeer uses that table to decide which leaves should receive a query. Leaves are shielded from most query traffic, but still answer normally when they receive a query.

GTK-Gnutella does not send queries to a leaf that has not supplied a table. This differs from the old suggestion that a receiver should flood while a table is being rebuilt. For leaves, no table means no QRP-routed query traffic.

Ultrapeer to Ultrapeer

Directly connected ultrapeers may exchange aggregate QRP tables when X-Ultrapeer-Query-Routing: 0.1 is advertised by an ultrapeer peer.

This is last-hop QRP only:

  • If a query has TTL greater than 1, it is sent to ultrapeers without using the peer's QRP table.
  • If a query has TTL equal to 1, an ultrapeer that supports inter-UP QRP is sent the query only if its table matches.
  • Ultrapeers without an inter-UP table are sent the query as usual.
  • TTL 0 is used internally to mean leaves only.

The aggregate table sent by a GTK-Gnutella ultrapeer is the OR of:

  • its own local table, and
  • the QRP tables from searchable leaves.

Leaves whose hops-flow prevents full searchability are excluded from the inter-UP aggregate table. In GTK-Gnutella terms, a leaf whose hops_flow is below NODE_LEAF_MIN_FLOW is not a good source for remote last-hop advertising, because neighboring ultrapeers may send queries that this ultrapeer cannot reliably pass along at the required hop depth. Excluding those leaves avoids advertising content that is only partially reachable.

Dynamic Querying

Dynamic querying is the part of GTK-Gnutella that decides how many ultrapeers to query, at what TTL, and when to stop. QRP is one of its inputs, not a standalone route computation.

There are two QRP decisions during a dynamic query. First, GTK-Gnutella uses QRP for the local leaf fan-out: a local ultrapeer forwards the query to matching local leaves immediately, using TTL 0 internally so the target builder selects leaves and excludes neighboring ultrapeers. Second, it uses inter-UP QRP to choose and filter remote ultrapeers when the dynamic query leaves the local cluster.

For a query that originated from a leaf, GTK-Gnutella first routes it to other local leaves whose QRP tables match, excluding the source leaf. It then launches the network dynamic query toward ultrapeer neighbors. For a local search started on an ultrapeer, the same thing happens without a source leaf: local leaves are queried first, then the dynamic query scheduler starts selecting ultrapeer neighbors.

Initial dynamic-query probes are deliberately narrow. GTK-Gnutella tries to find a small set of safe-to-probe ultrapeers whose QRP tables already match the query, then sends low-TTL probes to estimate popularity before spending more network traffic. This is why QRP is especially useful for popular queries: the client can often learn that the query is common before broadcasting it broadly.

Later dynamic-query rounds keep a candidate list of ultrapeers. The list is sorted primarily by output queue pressure, so less clogged peers are preferred. When queue pressure is close, QRP matches break the tie. This is a preference, not an absolute rule, because a query with TTL greater than 1 can still find results beyond an ultrapeer's directly advertised cluster.

The hard QRP filter appears at TTL 1. If the next query would be a last-hop query and the target ultrapeer supports inter-UP QRP but its table does not match, GTK-Gnutella skips that ultrapeer. At TTL greater than 1, GTK-Gnutella does not use the inter-UP table as a denial signal, because the table does not describe nodes farther away.

Capability Discovery

GTK-Gnutella's Gnutella handshakes use:

  • X-Query-Routing: 0.1
  • X-Query-Routing: 0.2
  • X-Ultrapeer-Query-Routing: 0.1

GTK-Gnutella supports QRP up to 0.2 and rejects higher QRP versions on modern leaf/ultrapeer connections. It advertises 0.2 where possible and falls back to 0.1 for older peers.

Do not infer 1-bit patch support from X-Query-Routing: 0.2. GTK-Gnutella learns 1-bit Gnutella QRP support from the vendor-message feature QRP1/1, which is exchanged through the features-supported vendor message. Without that feature, send 4-bit patches even if the peer negotiated QRP 0.2. The 1-bit format is an XOR patch with infinity 1, so a peer that does not explicitly implement it can misapply the update instead of merely ignoring an optimization. G2 QHT is the exception: GTK-Gnutella's G2 builder only emits 1-bit QHT patches.

Wire Format

QRP update messages use Gnutella function 0x30.

All QRP update messages:

  • are local to the TCP connection,
  • use hops 0,
  • use TTL 1 or less,
  • are not forwarded by normal message routing.

RESET

Payload:

byte 0      variant = 0
bytes 1-4   table_length, little-endian uint32
byte 5      infinity

table_length must be a power of two. GTK-Gnutella accepts infinity >= 1. For its normal Gnutella 4-bit tables it sends infinity 2. For 1-bit tables it sends infinity 1.

The receiver discards any previous table for that node when a RESET is accepted.

PATCH

Payload:

byte 0      variant = 1
byte 1      seq_no, 1-based
byte 2      seq_size
byte 3      compressor: 0 = none, 1 = zlib
byte 4      entry_bits
bytes 5..   patch data

seq_no is expected to increase by one for every message in a sequence. seq_size must remain stable within the sequence. The sender cannot use more than 255 messages because the fields are one byte.

GTK-Gnutella sends patch chunks of at least 512 bytes where possible and adjusts chunk size so the sequence fits in 255 messages. It may send up to five patch messages in one send step, but pauses when the peer's output queue is filling. If the connection is clogged it limits QRP update traffic.

GTK-Gnutella compresses a patch with zlib only when the compressed form is smaller than the raw patch. This is zlib data, not gzip.

Patch Semantics

GTK-Gnutella stores tables internally as one bit per slot. Even when a patch uses 4 or 8 entry bits, the receiver interprets the result as presence:

  • present means the slot is set,
  • absent means the slot is clear.

For 4-bit and 8-bit patches:

  • zero means no change,
  • a negative signed value means set/present,
  • a positive non-zero value means clear/absent.

For 4-bit patches the high nibble is the lower table index, then the low nibble. This matches the old QRP examples.

For 1-bit patches:

  • the patch is XOR data,
  • 0 means no change,
  • 1 means flip the slot.

GTK-Gnutella can receive 1-, 2-, 4-, and 8-bit patch entries. It sends 4-bit patches by default and 1-bit patches to G2 or QRP1 peers. The 2-bit case exists for receive-side tolerance, not as a common wire format.

If an incremental patch would be larger than a full patch against an empty table, GTK-Gnutella sends the full patch instead and precedes it with RESET.

Table Size

The original documents talk about 64 KB tables and hop-count entries. In GTK-Gnutella, the meaningful unit is slots.

Local table generation starts at 214 slots and can grow to 221 slots. The target is sparse:

  • at most about 1 percent of slots filled while searching for a size,
  • less than 20 percent insertion conflict ratio if possible.

After the table is built, it is compacted to one bit per slot.

An ultrapeer's inter-UP table is capped at 131072 slots. If the merged local-plus-leaf table is larger, GTK-Gnutella shrinks it before sending to other ultrapeers. This deliberately accepts more false positives to keep inter-UP traffic bounded.

Incoming tables larger than 221 slots are shrunk by a power-of-two factor. Shrinking uses OR semantics: if any source slot in the collapsed range is present, the destination slot is present. This preserves the do not create false negatives goal at the cost of more false positives.

What Goes Into a Table

GTK-Gnutella does not simply hash exact filenames.

For shared files and partial files, it takes the canonical file name, splits it into words, tracks unique words, expands aliases, and then hashes prefix forms of those words.

Important details:

  • Words shorter than 3 characters are ignored.
  • A word contributes the full word plus up to five shorter prefixes.
  • Prefix truncation stops at length 3.
  • Duplicate words are stored once in the QRP table.
  • Aliases can add additional words.
  • The table contains words, not the full filenames.

The prefix rule is why QRP tables are intentionally more full than a strict keyword table. It prevents obvious prefix-search false negatives.

Hashing

For ASCII words, GTK-Gnutella follows the QRP hash test vectors from the old spec:

  • fold the word into a 32-bit value by XORing little-endian 4-byte chunks,
  • multiply by 0x4F1BBCDC,
  • take the high bits bits for a table of 2^bits slots.

GTK-Gnutella asserts the old test cases at startup, for example:

hash("ebcklmenq", 13) == 3527
hash("ndflalem", 16) == 37658
hash("7777a88a8a8a8", 10) == 342

For non-ASCII, the real behavior is GTK-Gnutella-specific. Inputs are canonicalized and lowercased before hashing, then the hash routine decodes UTF-8 characters and folds the low byte of each code point. Code points above U+FFFF are first represented as UTF-16 surrogate values. Do not expect byte-for-byte UTF-8 hashing to match GTK-Gnutella for non-ASCII.

Matching a Query Against a Table

This is the largest practical difference from the paper.

GTK-Gnutella used to require every query word to match. In 2014 it changed to match LimeWire behavior:

  • one or two query words: every word must match,
  • three or more query words: about two thirds must match.

The code test is:

3 * hit / word >= 2

with integer division. In effect:

  • 3 words need 2 hits,
  • 4 words need 3 hits,
  • 5 words need 4 hits,
  • 6 words need 4 hits.

Query words shorter than 3 characters are not included in the query hash vector. A query hash vector can hold at most 128 entries.

URNs are OR-ed. If a matching URN slot is present, the query is forwarded. For legacy compatibility, if no URN matches but enough regular words match, GTK-Gnutella may still forward the query. This matters because some old clients included SHA1 URNs in QRP tables and some did not.

What's New? searches bypass normal QRP word matching. They are sent only to peers that advertise support for the feature, and not to leaves known to have an empty QRT.

Target Selection

When GTK-Gnutella routes a query as an ultrapeer, it builds a target list. The source node is excluded.

For leaves:

  • duplicates can be routed with leaves disabled, since leaves already saw the first copy,
  • no received QRT means skip the leaf,
  • empty QRT means skip the leaf,
  • bad GUID and unusable firewalled cases are skipped,
  • QRP match is required,
  • very full tables may be randomly throttled when the leaf queue is backed up,
  • flow-controlled leaves get only about half of non-SHA1 matching queries and no SHA1 queries.

For ultrapeers:

  • TTL 0 means do not send to ultrapeers,
  • TTL greater than 1 means send without QRP filtering,
  • TTL 1 means use inter-UP QRP if a table is present,
  • no inter-UP table means send, because the peer is treated as non-QRP-aware.

Before sending a query to leaves, GTK-Gnutella may raise an internal TTL 0 to TTL 1 because a Gnutella message cannot be sent with TTL 0.

Receive-Side Validation

GTK-Gnutella is strict about sequence integrity:

  • PATCH before any known table is rejected.
  • Wrong seq_no is rejected.
  • Changing seq_size mid-sequence is rejected.
  • Changing entry_bits mid-sequence is rejected.
  • Non-power-of-two RESET table length is rejected.
  • Infinity less than 1 is rejected.
  • Patch overflow and incomplete patch coverage are rejected.
  • zlib errors are rejected.

Many QRP sequence failures are closed with BYE code 413, Invalid QRP Sequence.

GTK-Gnutella is more tolerant about some historical edges than a new protocol would be. For example, receive-side code can process 2-bit patches and treats unknown compressor values as plain data unless decompression had already been selected for the sequence. Compatible clients should not rely on that tolerance. Send compressor 0 or 1 and keep all sequence metadata stable.

RESET Does Not Mean Flood Leaves

The original paper suggested forwarding all queries while a reset table is being patched. GTK-Gnutella does not do that for leaves. When a RESET is received from a leaf, the old leaf table is discarded. Until the new PATCH sequence completes and installs the table, the leaf has no QRT and is skipped for QRP-routed queries.

For ultrapeers, the opposite default applies: if no inter-UP QRT is available, the peer is treated as not QRP-aware and receives last-hop queries.

Interoperability Checklist

For a Gnutella client that wants to work with GTK-Gnutella:

  • Advertise X-Query-Routing: 0.1 or 0.2 on leaf/ultrapeer connections.
  • Advertise X-Ultrapeer-Query-Routing: 0.1 only if an ultrapeer connection can exchange aggregate last-hop tables.
  • Send QRP updates as function 0x30, hops 0, TTL 1.
  • Always send a RESET before the first PATCH for a connection.
  • Use power-of-two table sizes.
  • Prefer 4-bit patches unless QRP1 was advertised.
  • Use zlib, not gzip, when compressor is 1.
  • Keep PATCH sequence metadata stable.
  • Treat tables as presence filters, not exact indexes.
  • Match the two-thirds word rule for three or more words.
  • For inter-UP QRP, use it only at TTL 1.
  • Expect false positives and design for them. Do not create false negatives.

Things the Old Spec Says That You Should Not Do

  • Do not propagate distance-vector QRP tables through the whole Gnutella graph.
  • Do not store meaningful hop distances in deployed leaf/UP QRP tables.
  • Do not assume all query words must match for longer queries.
  • Do not assume RESET temporarily means send everything to a leaf.
  • Do not assume X-Query-Routing: 0.2 alone means 1-bit patches.
  • Do not use the obsolete experimental function code 0x20.
  • Do not expect non-ASCII hashes to be portable unless the same canonicalization and character folding are used.