Published on: 2018-05-30T03:10:04+00:00
In a recent email exchange on the bitcoin-dev mailing list, Jim Posen shared his analysis of selecting a good P value to reduce total data downloaded using Golomb-Rice filters on mainnet. He determined that the optimal B parameter for different numbers of filter elements watched by the client ranges from 13 to 23. In particular, for C = 10, B = 13 is optimal; for C = 100, B = 16 is optimal; for C = 1,000, B = 20 is optimal; and for C = 10,000, B = 23 is optimal. Posen also attached some results and offered to share the CSV and raw notebook if people were interested.The author conducted an analysis on selecting a good P value to reduce total data downloaded considering both filters themselves and blocks in the case of false positive matches, using data from mainnet. The quantity it minimizes is based on the number of filter elements per block (N), Golomb-Rice coding parameter (B), and the number of filter elements watched by the client (C). The main result shows that for different values of C, the optimal value of B ranges from 13 to 23. Any value of B in the range 16 to 20 seems reasonable, with M=1.4971 * 2^B for optimal compression. The selection of the parameter depends on the target number of elements that a client may watch. Gregory Maxwell via bitcoin-dev corrected a previous statement regarding configuration and stated that M=784931 and rice parameter 19 should be used. The author also provided some results and offered to share the CSV and raw notebook if people are interested.In an email on May 25, 2018 at 6:42 PM, Gregory Maxwell wrote about a configuration error in which the values for M and the rice parameter were incorrect. According to Maxwell, if the configuration was correct, then M should have been equal to 784931 and the rice parameter B should have been 19. The mistake was due to a paste error, as Maxwell had initially provided the values M=1569861 and B=19.In a recent communication on Bitcoin-dev, Pieter Wuille proposed optimal parameter selection for the Golomb Coded Sets in BIP158. He suggested that if an FP rate of exactly 1 in 2^20 is required, then the Rice parameter should be 19 instead of 20. If not, then an FP rate of 1 in a 1.4971*2^B should be picked. Pieter's approximations were used to analyze what parameters minimize total communications for lite wallet scanning. The result showed that M=784931 resulted in a lower total data rate than M=1569861. When using all-scripts inputs and outputs (but no txids), it is estimated that roughly 7500 bits will be set. Therefore, M=1569861 and rice parameter 19 should be used to achieve the actual optimal FP rate for total data transferred. It is noted that different clients will have different monitoring requirements and thus, it makes sense to pick a parameter with optimal compression rather than optimal-data-transfer-for-a-specific-key-count.Pieter spent some time working on the optimal parameter selection for the Golomb Coded Sets proposed in BIP158. The details of the work can be found on his GitHub page. He found that if an FP rate of exactly 1 in 2^20 is desired, the Rice parameter should be 19 instead of 20. If a different FP rate is preferred, such as 1 in a 1.4971*2^B, then M=784931 B=19 or M=1569861 B=20 would be good choices.
Updated on: 2023-08-01T23:18:58.646865+00:00