Properties of an ideal PoW algorithm & implementation [combined summary]

Individual post summaries:

Properties of an ideal PoW algorithm & implementation Bram Cohen 2017-04-19 17:43:03+00:00
Properties of an ideal PoW algorithm & implementation Tim Ruffing 2017-04-19 11:08:15+00:00
Properties of an ideal PoW algorithm & implementation praxeology_guy 2017-04-18 19:14:05+00:00
Properties of an ideal PoW algorithm & implementation Natanael 2017-04-18 10:34:04+00:00

Click here to read the original discussion on the bitcoin-dev mailing list

Published on: 2017-04-19T17:43:03+00:00

Summary:

The author suggests using a repeated hashing method with blake2b to prevent lossy implementations and avoid relying on a single crypto primitive. Blake2b is conservative due to its large block size, which prevents asicboost-style attacks and eliminates the need for building logic for multiple blocks. Memory hard functions are effective but fail catastrophically when they do, limiting their use to hardware from a single vendor. The best PoW function for commodity mining hardware is repeatedly hashing with blake2b ten or a hundred times. However, the author believes that hard forking the PoW function is not a good idea.Natanael discusses the idea of proving optimal implementation for hash functions in an email thread. While it is a good idea in theory, it seems impossible to prove currently. He also suggests avoiding internal state reuse through measures such as "whitening" and pre-hashing to prevent ASICBOOST type optimizations. A slower hash function or iterating the hash function multiple times could be used practically. PoW verification will still be fast enough, and memory-hard functions are currently the best option for PoW functions.The cost of producing a chip depends on the number of metal layers required for routing interconnects in a particular area. Fewer layers mean quicker and easier manufacturing, and fewer patentable costs. A paper discussing factors impacting chip design costs has been linked, although its validity cannot be vouched for. To minimize asicboost-like optimizations, there needs to be early nonce mixing with variable-length input that has near-constant work. The entirety of the input should be mixed with the nonce data as soon as possible to prevent unexpected optimizations. A hash algorithm with more linear computation time versus input size would be ideal. It would consist of a first-stage Merkle tree hash to pre-lossy-mix-compress the variable length input stream to the size of the second-stage state vector. Each bit of input should have about equal influence on each of the output bits. Then there would be multi-round mixing of the second stage, which would be significantly more work than the first stage.In an email dated April 18, 2017, Natanael proposed that the best option for changing proof-of-work (PoW) is an algorithm that is moderately processing heavy and resists partial state reuse. The algorithm should have an existing reference implementation for hardware that is provably close in performance to the theoretical ideal implementation. For miners, cost primarily depends on correctly computed hashes per joule, which is related to the number of transistor activations per computed hash. An implementation must be near optimal with a minimum number of necessary transistor activations per computed hash. To prevent ASICBOOST-style optimization, the implementation should not allow much internal state reuse and the PoW step should always be the most expensive part of creating a complete block candidate. Any proof of an implementation being near optimal must consider the possibility of designs that deliberately allow errors to reduce the total count of transistor activations. The algorithm should be reasonably easy to verify, have adjustable difficulty, cryptographic strength, predictable and constant PoW computation performance, no significant reusable state, no meaningful precomputation possible, and rely only on transistors for implementation. The mining PoW should be highly parallelizable, with minimal or no gain from batch computation, and performance scaling should be linear with increased chip size and cycle speed. Overall, the algorithm should have a constant verification speed, be reasonably fast even on slow hardware, have no hidden shortcuts, and be reasonably compact in implementation with small inputs and outputs.

Updated on: 2023-08-01T20:30:55.007540+00:00