Published on: 2021-07-28T08:25:52+00:00
In an email exchange between Rusty and Carla, they discussed the implementation of more specific error codes in the Lightning Network protocol. They suggested adding TLV (Type-Length-Value) fields after the data field to achieve this and also talked about how to support nodes that have not upgraded. Rusty proposed a straw proposal which included types such as erroneous_message, erroneous_fieldnum, and suggested_value. The new error message provides an error code that tells exactly what has gone wrong, and metadata pointing to the HTLC with an invalid signature.Carla suggested adding an error_code enum where timeout is one of the cases. Rusty also proposed adding the TLV to warnings. The context then shifts to fee updates and a misguided feature bit. The author suggests that if Alice requires a feature bit which Bob doesn't offer, it is on Bob to comply. The author also notes that all issues with the exception of timeouts can be expressed in the form "problem is this message, this field" and suggests having a special TLV case for timeouts in the message.Rusty believes that the spec indicates that senders/recipients of errors "MUST" fail the channel referenced in the error, but he thinks this isn't practical. He adds that the vast majority of errors are "contact your developer, peer says we did something illegal". The email exchange also includes a link to the relevant GitHub page.In a recent spec meeting, Carla Kirk-Cohen raised the idea of adding more specific error codes to the Lightning Network protocol. It was suggested that error codes could be added by using TLV fields after the data field, and nodes that haven't been upgraded could include the error code in the data field or introduce a feature bit. Older nodes should ignore extra fields, and all defined types are now assumed to have an optional TLV appended.While not every possible error can be enumerated, there are many cases in the spec where explicit error codes could be introduced. However, some errors are simply "your implementation is broken" and don't provide anything actionable. A straw proposal was made to include TLV fields under the `tlv_stream` with types such as erroneous_message, erroneous_fieldnum, and suggested_value. This new kind of error provides an error code that tells exactly what has gone wrong and metadata pointing to the HTLC (Hashed Time Lock Contract) with an invalid signature.Carla proposes to re-purpose the existing error message (17) in the spec to allow for more structured errors and create a path for standardized, enriched errors going forward. Instead of an arbitrary string, the error message should contain an error code and optional metadata, which will enrich the context of the error. This upgrade will provide better debugging information, clearer information for the end-user, reduced risk of leaking information, and more fine-grained error handling based on the error code.To achieve this, TLV fields can be added after the data field since non-ASCII values are not allowed in the error string itself. Upgraded nodes will have an improved quality of life, while older nodes will remain unaffected. The new kind of error provides an error code that tells exactly what has gone wrong and metadata pointing to the HTLC with an invalid signature.Carla believes that the majority of implementations don't abide by the instruction in the spec indicating that senders/recipients of errors MUST fail the channel referenced in the error. With standardized sets of errors and reasonable handling, we can clear up some of the ambiguity around errors and make it easier for everyone to follow.Carla provides a list of candidates for error codes, including signature problems, funding process, channel state machine, fee updates, connection level, and gossip. Links to various sections in the Lightning Network Lightning-rfc GitHub repository are provided to give readers more context.
Updated on: 2023-07-31T23:24:56.397555+00:00