Base-32 error correction coding [combined summary]

Individual post summaries:

Base-32 error correction coding Jim Paris 2014-02-24 18:29:31+00:00
Base-32 error correction coding Mark Friedenbach 2014-02-21 02:41:05+00:00

Click here to read the original discussion on the bitcoin-dev mailing list

Published on: 2014-02-24T18:29:31+00:00

Summary:

This document proposes a BIP (Bitcoin Improvement Proposal) for a human-friendly base-32 serialization with error correction encoding. The proposed format aims to be useful for helpdesk-related data and future wallet backup & paper wallet serialization formats.The proposed base-32 encoding differs from the default hexadecimal or base58 encodings by including the full range of alphanumeric digits, except for the characters 0, l, v, and 2. These characters are excluded as they can easily be confused with other characters in fonts or handwriting. The proposed encoding also includes automatic correction of up to one transcription error per 31 coded digits, which corresponds to 130 bits of payload data. This means that seamless recovery from up to two transcription errors is possible as long as they occur in separate halves of the coded representation.However, Jim suggests that it may be possible to do better than just correcting single transcription errors. He proposes that the transposition of two adjacent characters or the insertion or deletion of a single character would be very common errors. To address this, he suggests interleaving the two halves of the coded representation to correct transpositions.To implement the error correction encoding, the BIP specifies the usage of Phil Zimmermann's z-base-32 encoding alphabet. This alphabet provides better resistance to transcriptive errors compared to the RFC 3548 standard. The RFC 3548 coder is used for z-base-32, but unnecessary '=' padding characters are stripped before performing the alphabet substitution.In addition to the error correction code transformation of base-32 data, the BIP also specifies a padding scheme for extending numbers or bit vectors of any length to a multiple of 5 bits suitable for base-32 encoding.The error correction encoding described in the BIP utilizes cyclic redundancy check polynomial division. This method requires 5 error correction digits per 26 digits of input, which is slightly less optimal than the theoretical optimum of 4 digits. However, this approach is much easier to implement correctly compared to non-patented error correction codes.While providing an error correction coder for base-32 data is interesting and useful in contexts where base-32 is already deployed, many applications involve encoding integers that are powers of two in length. To address this, the BIP provides a standard scheme for the encoding of any sized bitstring into a multiple of 5 bits in length, suitable for direct encoding into base-32.In summary, this BIP proposes a human-friendly base-32 serialization with error correction encoding. The proposed format includes automatic correction of transcription errors and provides better resistance to transcriptive errors compared to existing standards. It also specifies a padding scheme for extending numbers or bit vectors to a suitable length for base-32 encoding. This proposal aims to be useful for helpdesk-related data and future wallet backup & paper wallet serialization formats.

Updated on: 2023-08-01T07:41:24.513990+00:00