Ultimate Blockchain Compression w/ trust-free lite node [combined summary]



Individual post summaries: Click here to read the original discussion on the bitcoin-dev mailing list

Published on: 2012-06-21T22:02:27+00:00


Summary:

In a conversation between Mike Koss and an unknown individual, the topic of discussion was pruning spent transactions from an old block and commitments to the state of unspent transactions. The goal was to enable memoryless nodes to fully validate without trust and initialize new pruned nodes securely without exposing them to the old chain history. This would require adopting a new data structure for managing trees of unspent transactions in a secure, scalable, and DOS resistant manner.The discussion began with the question of whether it made sense to adopt a more complex data structure than the Merkle tree for the Bitcoin protocol. The Merkle tree allows replacing unneeded transactions by their hash and possibly a whole subtree if they fall within a common node of the tree. However, if a lite client only cares about retaining a single transaction in a block, it would only need O(log2(T)) Merkle hashes plus the transaction it cares about.There was a debate on whether a raw trie structure or a PATRICIA tree/trie would be ideal for this purpose. A trie has a deterministic structure and constant time access to elements, but a malicious actor could exploit it by constructing unbalanced segments. Self-balancing search trees like red-black trees or 2-3-4 trees don't suffer from this problem, but they may still have implementation errors.Alan Reiner suggested that a PATRICIA tree/trie would be ideal due to its deterministic structure and increased space efficiency. However, satisfactory implementations of this structure were hard to find. Mark Friedenbach argued that tries are dependent on the distribution of input data for balancing, and a malicious actor could exploit this. Peter Todd pointed out that any trie structure is dependent on the distribution of input data for balancing and suggested looking into Judy structures.In another discussion about app developers updating their RB tree code, Gregory Maxwell emphasized the importance of comprehensive tests and a well-specified algorithm to prevent disagreements in implementation. Alan Reiner expressed concern about different implementations due to the infinite ways of constructing a binary tree, leading to different root hashes. Maxwell suggested using a trie configuration which is history-independent and has only one way to construct the tree given an unspent-TxOut list.On June 19, 2012, Alan Reiner raised concerns about an app developer updating their RB tree code, causing disagreements in the correct root hash. He emphasized the importance of comprehensive tests and a well-specified algorithm to prevent disruptive issues. Reiner suggested that a PATRICIA tree/trie would have been ideal for this situation due to its completely deterministic structure.In a forum thread, the issue of using a tree-structure with multiple valid configurations for the same set of unspent-TxOuts was raised. Using a binary tree or red-black tree could result in significant portions of the network disagreeing on the correct root. A raw trie structure would solve these issues but may not be space-efficient. A PATRICIA tree/trie would be ideal as it has a deterministic structure and is more space-efficient, although it is not a trivial implementation.Andrew Miller proposed that any recursive data structure can be 'Merkle-ized' by adding a hash of the child node alongside each link/pointer. This allows verification of data for each node as the structure is traversed. The storage burden can be delegated to an untrusted helper server, making it easier for a lite-client to insert and rebalance its tree.In a discussion about Merkle trees, Peter Todd suggested discarding vertexes that cause the tree to become unbalanced and setting a depth of imbalance to prevent this. Andrew Miller proposed a simpler solution for balancing binary search trees, such as AVL trees or Red-Black trees. Miller also mentioned that any recursive data structure can be 'Merkle-ized', allowing verification of data for each node while traversing the structure. The storage burden can be delegated to an untrusted helper server as long as the lite-client knows the root hash.Overall, the discussion revolved around finding the best data structure for managing trees of unspent transactions in a secure and scalable manner. Various options were considered, including raw trie structures and PATRICIA tree/tries. The importance of comprehensive tests and well-specified algorithms was emphasized to avoid disagreements in implementation. Additionally, the concept of 'Merkle-izing' recursive data structures was discussed as a way to verify data integrity while traversing the structure.


Updated on: 2023-08-01T03:42:33.712756+00:00