[RFC] Simplified (but less optimal) HTLC Negotiation [combined summary]



Individual post summaries: Click here to read the original discussion on the lightning-dev mailing list

Published on: 2021-05-04T05:03:35+00:00


Summary:

In a recent exchange between Rusty Russell and Matt Corallo, they discussed the idea of a turn-taking protocol for splice transactions in Bitcoin. Corallo suggests that this would be less work than debugging based on logs and a new state machine. Russell argues that the proposal is a subset of the existing state machine.The proposal involves adding an "update_noop" message which requests your turn and has no other effects. If you want to send an update, you send it if it's your turn or wait for either a `yield`, or a different update if it's not your turn. If you receive an update when it's your turn you ignore it if you've already sent an update, otherwise, you send `yield`.On April 27, 2021, Rusty Russell added a draft to Github for the lightning-rfc pull #867. The implementation will require update_fee messages to be on their own, which will eliminate any debate on the channel's state when applied. Matt Corallo expressed his liking of this requirement. Russell suggests that if they do not do turns for splicing, they can make rules around pausing other HTLC updates generic for future use and use them for update_fee in a simpler-to-make-compulsory change. This is similar to the close requirement, except that it requires all HTLCs to be absent.Rusty Russell has uploaded a draft for updating bitcoin's lightning network protocol to Github. This is significant as the state machine of the protocol has historically had many bugs associated with it. Russell, therefore, proposes to make a subset of the existing state machine mandatory for use in order to reduce complexity and catch more bugs.Matt Corallo responds, suggesting that simplifying the state machine where possible is nice but replacing it with fresh code is probably not the most obvious decrease in complexity. He also suggests using a fuzzer which aggressively tests message-non-delivery-and-resending production desync bugs. Russell adds that he prefers to avoid having bugs rather than trying to catch them all.Later in the thread, Russell outlines requirements from the draft which relate to splice negotiation, such as that the sender must not send another splice message while a splice is being negotiated and the receiver must respond with a splice message of its own if it has not already.In a recent email exchange, Rusty Russell and Matt Corallo discussed the issue of bugs in the Lightning Network protocol. Russell expressed a desire to eliminate bugs rather than simply catching them through testing, while Corallo suggested that fuzzing could effectively catch many of these bugs. The conversation then turned to a proposal for a simplified version of the existing state machine, which would allow for splicing updates at any time. Both sides would need to send a splice message to synchronize, and no other channel updates could be sent until after the negotiation was complete. While the protocol was initially flawed and needed revision, the hope is that this simplified process will reduce the number of bugs in the system.Matt Corallo, a Bitcoin Core developer, suggested using a fuzzer to catch message-non-delivery-and-resending production desync bugs. A fuzzer has been used for several years and it tests precisely these types of bugs, but it forced several rewrites of parts of the state machine when it initially landed. Though it quickly exhausted the bug fruit, it still catches other classes of bugs occasionally. The state machine here is not that big and simplifying it where possible is nice, but ripping things out to replace them with fresh code (which would need similar testing) is probably not the most obvious decrease in complexity. The protocol has historically had more bugs than anything else, and they literally found another one in feerate negotiation since the last c-lightning release.Rusty Russell, a developer at Blockstream, revisited the protocol because it makes things like splicing easier. The current draft requires stopping changes while splicing is being negotiated, which is not entirely trivial. With the simplified method, you don't have to wait at all. Instead, both sides have to send a splice message to synchronize, and they can only do so once all in-flight changes have cleared. You have to resolve simultaneous splice attempts by using "highest feerate" tiebreak by node_id, and keep track of this stage while you clear in-flight changes. There are several requirements from the draft which relate to this, including that the sender must not send another splice message while a splice is being negotiated, must not send a splice message after sending uncommitted changes, and must not send other channel updates until splice negotiation has completed. Similarly, the receiver must respond with a `splice` message of its own if it has not already, must not reply with `splice` until all commitment updates are resolved by both peers, must use the higher of the two `funding_feerate_perkw` as the feerate for the splice, and must not send other channel updates until splice negotiation has completed.Rusty Russell prefers alternation over reflection.


Updated on: 2023-07-31T23:10:19.750318+00:00