Achieving Zero Downtime Splicing in Practice via Chain Signals [combined summary]



Individual post summaries: Click here to read the original discussion on the lightning-dev mailing list

Published on: 2022-07-02T00:29:53+00:00


Summary:

The discussion between Olaoluwa Osuntokun and Rusty Russell revolves around the reliability of signaling a splice in progress during routing. While waiting for reliable gossip is important, it may not be foolproof, and waiting for a certain number of blocks before closing channels can be wishful thinking. Olaoluwa suggests that if a close looks a certain way, it is a splice, and the channel size can be updated accordingly. However, Rusty questions the need to spam the chain and the reliability of waiting for the latest gossip. Instead, he suggests treating every close as signaling "stay tuned for the gossip message" and marking a channel as a zombie if no message is received after two weeks. In an email exchange, Laolu Osuntokun proposed the addition of a signal to detect splicing on the Lightning Network's gossip network. Lisa Neigut expressed confusion as to why such a signal was being proposed, stating that it runs counter to the goal of making lightning's onchain footprint indistinguishable from any other onchain usage. However, Laolu argued that adding a reliable signal would not necessarily add extra bytes to the chain or reveal additional information for public channels. In his model of gossip v2, verifiers don't care if the advertised UTXO is actually a channel or not and aren't watching the chain for spends. With this in mind, splicing on the gossip network wouldn't be an explicit event. Instead, re-organizing by a little corner of the channel graph isn't necessarily tied to making a series of on-chain transactions.Laolu acknowledged that waiting N blocks before closing channels is "good enough" for most cases but suggested the addition of a signal as a belt-and-suspenders approach. Lisa brought up the potential for lost routing fees in the case of a communication failure, but Laolu argued that the severity of such a failure is minimal. They also discussed the appropriate number of blocks to wait before closing channels, with Lisa suggesting 15 or 144 blocks and Laolu referencing a previous mention of ~an hour for gossip to propagate and suggesting 12 blocks as a reasonable first estimate. Finally, they noted that Alex Myer's minisketch project may improve gossip reconciliation efficiency and reduce the need for robust signaling in the future.In a recent discussion, it was suggested that if a chain close is seen along with a gossip message indicating a splice, the gossip should be propagated urgently and widely. However, adding urgency to gossip messages can be difficult to enforce as there is a cost associated with closing a channel which cuts down on the ability to send out urgent messages frequently. Additionally, spamming gossip with splices is expensive. This means that changing the behavior of the secondary information system may be a better approach than embedding the info into the chain. Lisa Neigut expressed confusion as to why on-chain signals are being proposed because it runs counter to the goal of making Lightning's on-chain footprint indistinguishable from any other on-chain usage. She questioned whether there was another reason for suggesting an on-chain detectable signal aside from infallibility. While the severity of a comms failure is minimal with the potential for lost routing fees, she did acknowledge that more robust signaling could be added later if needed.Regarding the "wait N blocks before you close your channels" solution, it was noted that it takes about an hour for gossip to propagate. As such, waiting 12 blocks (or two hours) is a reasonable first estimate. This wait time may be adjusted once real-world experience is gained. It was also mentioned that Alex Myer's minisketch project may improve gossip reconciliation efficiency, making gossip reliability less of an issue.The move to taproot/gossip v2 aims to make lightning's on-chain footprint indistinguishable from any other on-chain usage, which runs counter to the proposal of adding an on-chain detectable signal. The severity of a comms failure is minimal, and the suggested solution of waiting for a certain number of blocks before channel closure may not be foolproof. It takes approximately an hour for gossip to propagate, and 12 blocks is a reasonable estimate for waiting time. More robust signaling can be added later if lost routing fees become a significant problem. Alex Myer's minisketch project may also improve gossip reconciliation efficiency, making gossip reliability less of an issue.The conversation between Rusty and Olaoluwa Osuntokun discusses the issue of reliable gossip in routing. Rusty argues that relying on a wait time is simpler than waiting for reliable gossip, but Olaoluwa counters that there is no guarantee of message propagation time. He also suggests that ratelimiting may prevent some nodes from receiving all updates. Olaoluwa proposes using an on-chain signal to indicate when a splice is in progress, to ensure that routers do not lose out on potential fee revenue and sends/receivers continue to use channels.


Updated on: 2023-08-01T00:33:49.076989+00:00