BIP 39: Add language identifier strings for wordlists



Summary:

The proposal suggests enhancing the BIP 39 wordlist set by specifying a canonical native-language string, in addition to a short ASCII language code, to identify each wordlist. Currently, languages are identified only by their English names; the proposal would supplement these with strings vetted and recommended by native speakers. This would make it easier to identify languages in user interface options or menus, and would promote consistency between implementations, which could matter if a user creates a mnemonic in one implementation and restores a wallet from that mnemonic in another.

The proposal further suggests that appropriate strings be specified in bitcoin:bips/bip-0039/bip-0039-wordlists.md as part of the process for accepting new wordlists, and that such strings be ascertained for the existing wordlists, preferably from the people involved in the original pull requests for them. The author also notes that the short identifiers for Chinese, "zh-CN" and "zh-TW", are imprecise at best, and asks whether there are any standardized or customary short ASCII language IDs, similar to ISO 3166-1 alpha-2, that are purely linguistic and not tied to present-day political boundaries.

The author provides an example of the strings used in easyseed.c for English, Simplified and Traditional Chinese, French, Italian, Japanese, Korean, and Spanish. However, the author notes that they do not know all of these languages and "monkey-pasted" the language-native strings from a popular wiki site, so the strings may not be accurate or sensible. The author therefore intends to leave the vetting of appropriate strings to native speakers or experts in the respective languages. Prior wordlist additions in PRs #92, #100, #114, #130, #152, #306, #570, and #621 are referenced for further context.
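As an illustration of the identification scheme described above, the following Python sketch pairs each wordlist with an English name, a short ASCII code, and a native-language string. It is not taken from easyseed.c or from the BIP. The names WordlistLanguage, WORDLIST_LANGUAGES, find_by_code, and menu_labels are hypothetical; the codes other than "zh-CN" and "zh-TW" are assumed here to be ISO 639-1 codes; and the native strings are unvetted placeholders of exactly the kind the proposal wants native speakers to review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WordlistLanguage:
    """Hypothetical record tying one wordlist to its identifiers."""
    english_name: str  # how wordlists are identified today
    short_code: str    # short ASCII identifier (assumed values, see below)
    native_name: str   # native-language string (unvetted placeholder)


# Placeholder table. Only "zh-CN" and "zh-TW" come from the proposal;
# the other codes and all native strings are assumptions for illustration.
WORDLIST_LANGUAGES = (
    WordlistLanguage("English", "en", "English"),
    WordlistLanguage("Chinese (Simplified)", "zh-CN", "简体中文"),
    WordlistLanguage("Chinese (Traditional)", "zh-TW", "繁體中文"),
    WordlistLanguage("French", "fr", "Français"),
    WordlistLanguage("Italian", "it", "Italiano"),
    WordlistLanguage("Japanese", "ja", "日本語"),
    WordlistLanguage("Korean", "ko", "한국어"),
    WordlistLanguage("Spanish", "es", "Español"),
)


def find_by_code(code: str) -> Optional[WordlistLanguage]:
    """Look up a wordlist by its short ASCII code, ignoring case."""
    wanted = code.lower()
    for lang in WORDLIST_LANGUAGES:
        if lang.short_code.lower() == wanted:
            return lang
    return None


def menu_labels() -> List[str]:
    """Build labels for a language menu, one per wordlist."""
    return [f"{lang.native_name} ({lang.short_code})" for lang in WORDLIST_LANGUAGES]
```

If two implementations shared one such table, a user who picked a language by its native name when creating a mnemonic would see the same label when restoring it elsewhere, which is the consistency benefit the proposal describes.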


Updated on: 2023-06-12T23:27:58.806263+00:00