Fat Errors [combined summary]



Original discussion: lightning-dev mailing list

Published on: 2022-11-10T15:24:28+00:00


Summary:

Joost Jager has proposed a solution to the long-standing problem of gaps in error attribution in the Lightning Network. The problem arises when a node on a payment route hides itself by returning random data as the failure message, which makes it difficult for the sender to pinpoint the source of the failure. Joost's solution is a new failure message format that includes HMACs added by each node on the route. These HMACs commit each node to the failure message and prevent a node from modifying it without revealing its involvement.

The new failure message format consists of three parts: the message, payloads, and HMACs. The message is the standard onion failure message. The payloads are a fixed-length array that lets each node add data to be returned to the sender. The HMACs are also a fixed-length array, filled in by each node as the failure message travels back to the sender. Each HMAC corresponds to a presumed position in the path.

To keep the size of the failure message manageable, each hop deletes some of the previous HMACs that are no longer useful. This reduces the number of HMACs and the overall size of the failure message. Joost has developed a Go implementation of the new failure message format and shared it on GitHub. His goal is to make the Lightning Network more reliable by preventing nodes from hiding themselves and by allowing senders to pinpoint delays down to a pair of nodes.

A malicious node can still modify a byte of the failure message. In that case the sender can determine the pair of nodes that the offending node is part of. The cost of the scheme is a larger failure message. Rusty Russell suggests using 16 hops as the limit, which would result in smaller payloads and HMACs blocks.

For backwards compatibility, the algorithm that nodes should run to generate or transform the failure message needs to be signaled via a TLV onion field.
Intermediate nodes also need to advertise their capability to transform the new format through a feature bit. It is also suggested to add the same payloads and HMACs blocks to the `update_fulfill_htlc` message to address delays in the path.

Overall, Joost Jager's proposed solution aims to improve error attribution in the Lightning Network by introducing a new failure message format that includes HMACs added by each node on the route. This would help prevent nodes from hiding themselves and allow senders to pinpoint delays down to a pair of nodes. However, the size of the failure message and the remaining potential vulnerabilities still need to be addressed.

The Lightning Network, a protocol for faster and cheaper transfers, has been found to have two vulnerabilities. The first involves gaps in error attribution: certain nodes on the route can hide themselves by returning random data as the failure message. This makes it difficult for the sender to determine where the failure occurred and to penalize the responsible node(s). To address this weakness, a new failure message format has been proposed.

The proposed solution allows each node on the route to append a timestamp and an HMAC (Hash-based Message Authentication Code) to the failure message without revealing its position in the path. The new failure message consists of three parts: `message`, `payloads`, and `hmacs`. This lets the sender verify the integrity of the message and determine the longest chain of HMACs until it encounters a `hop_payload` with `is_final` set. If any byte of the failure message is modified, the sender can identify a pair of nodes that the offending node is part of.

The second vulnerability involves delayed failure messages that can be exploited by an attacker to steal funds. A proposed solution is to increase the size of the failure message to include HMACs for all possible routes up to 27 hops.
Alternatively, reducing the maximum number of hops to 10 would decrease the message size significantly. This would also help reduce latency and capital lock-up.

To ensure backwards compatibility, nodes need to signal what algorithm they should run to generate or transform the failure message via a TLV (Type-Length-Value) onion field. Intermediate nodes also need to advertise their capability to transform the new format through a feature bit. Additionally, successes can also be delayed, so it may be an option to add the same `payloads` and `hmacs` blocks to the `update_fulfill_htlc` message.

The downside of the proposed scheme to address the first vulnerability is its size. With a maximum of 27 hops, the `hmacs` block contains 378 HMACs (27 + 26 + ... + 1) of 32 bytes each, for a total of about 12 KB. However, this size is seen as justified to fix the vulnerability in Lightning. If failures become rarer in the future, the size becomes less relevant to the overall operation of the network.

In conclusion, the Lightning Network has vulnerabilities related to error attribution and delayed failure messages.
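
The size figures quoted for the different hop limits can be checked with a few lines of arithmetic. The sketch below assumes, as the 378 figure for 27 hops implies, that a limit of n hops requires n + (n - 1) + ... + 1 HMACs of 32 bytes each. It counts only the `hmacs` block, not the `message` or `payloads` parts, and the function names are hypothetical rather than taken from Joost's Go implementation.

```python
HMAC_SIZE = 32  # bytes per HMAC, as stated in the summary


def hmac_count(max_hops: int) -> int:
    """Number of HMACs in the hmacs block: max_hops + (max_hops - 1) + ... + 1."""
    return max_hops * (max_hops + 1) // 2


def hmacs_block_size(max_hops: int) -> int:
    """Size in bytes of the hmacs block alone (message and payloads excluded)."""
    return hmac_count(max_hops) * HMAC_SIZE


if __name__ == "__main__":
    # Hop limits mentioned in the discussion: 27 (current), 16 and 10 (suggested).
    for hops in (27, 16, 10):
        print(hops, hmac_count(hops), hmacs_block_size(hops))
```

Under this assumption, a 27-hop limit gives 378 HMACs and 12,096 bytes (about 12 KB), a 16-hop limit gives 136 HMACs and 4,352 bytes, and a 10-hop limit gives 55 HMACs and 1,760 bytes.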


Updated on: 2023-08-01T00:46:59.886099+00:00