Published on: 2023-04-06T16:14:41+00:00
During the discussion of the `announcement_signatures` and `commit_sig` cases, it was identified that storing `last_short_channel_id` helps determine which announcement messages can be ignored and which should be treated as errors until `revoke_and_ack` is sent or received. This is not a protocol issue, but handling the detail correctly is crucial to avoid breaking or force closing the channel during the splice lock process. One proposal is to allow a certain class of 'stale' messages for a period of time, until either mutual_splice_locked or a successful `commitment_signed`/`revoke_and_ack` round trip in either direction.
In a post to the Lightning-dev mailing list, Dustin Dettmer highlighted a race condition in the `splice_locked` workflow. The issue arises if any channel activity occurs between sending and receiving `splice_locked`. It is an edge case that implementations need to handle correctly, not a protocol issue. To address it, Dustin proposed temporarily storing two items: `last_short_channel_id` (the pre-splice short channel id) and `splice_await_commitment_success` (a boolean flag). After sending and receiving `splice_locked`, `last_short_channel_id` should be set to the pre-splice short channel id and `splice_await_commitment_success` should be set to true. If an `announcement_signatures` message is received with an scid matching `last_short_channel_id`, it should be ignored and the channel connection should not be aborted. Similarly, if a `commitment_signed` message is received with the tlv `splice_info->splice_channel_id` set to something other than the successfully confirmed splice channel_id, the message should be ignored. Once `revoke_and_ack` is successfully sent or received, `last_short_channel_id` and `splice_await_commitment_success` should be reset, and normal validation of `announcement_signatures` and `commitment_signed` should resume.
This solution resolves the race condition while keeping message validation strict and without introducing new fields.
The race condition was discovered while testing the `splice_locked` workflow and requires immediate attention. It occurs when channel activity happens after `splice_locked` is sent but before it is received, so it can be triggered frequently. To reproduce it, a test case can have a node send payments continuously during the splice-locking process. Two messages are affected: `commitment_signed` and `announcement_signatures`. Announcements are handled in the same way as commitments. When `splice_locked` is sent, Node B considers the channel spliced (Chan 106). However, if a `commitment_signed` message then arrives with `splice_channel_id` set to something other than the successfully confirmed splice channel_id, the message should be ignored. Once `revoke_and_ack` is successfully sent or received, `last_short_channel_id` and `splice_await_commitment_success` should be reset, and normal validation of `announcement_signatures` and `commitment_signed` should resume. With this in place, the race condition is resolved and validation stays strict, without adding fields to either message.
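The bookkeeping described above can be sketched as a small per-channel state tracker. This is only an illustration of the proposal, not code from any implementation: the class name `SpliceRaceState`, the `Action` enum, the method names, and the use of plain strings for ids are all assumptions made for the example. It also assumes that a `commitment_signed` without the splice TLV goes through normal validation, which the summary does not state explicitly.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    PROCESS = "process"  # hand the message to normal (strict) validation
    IGNORE = "ignore"    # stale pre-splice message: drop it, keep the channel open


@dataclass
class SpliceRaceState:
    """Hypothetical per-channel tracker for the two temporary items."""

    confirmed_channel_id: str  # channel_id of the successfully confirmed splice
    last_short_channel_id: Optional[str] = None
    splice_await_commitment_success: bool = False

    def on_splice_locked_exchanged(self, pre_splice_scid: str) -> None:
        # Called once splice_locked has been both sent and received.
        self.last_short_channel_id = pre_splice_scid
        self.splice_await_commitment_success = True

    def on_announcement_signatures(self, scid: str) -> Action:
        # An announcement for the pre-splice scid is stale, not an error.
        if self.last_short_channel_id is not None and scid == self.last_short_channel_id:
            return Action.IGNORE
        return Action.PROCESS

    def on_commitment_signed(self, splice_channel_id: Optional[str]) -> Action:
        # Assumption: a missing splice TLV (None) is validated normally.
        if (
            self.splice_await_commitment_success
            and splice_channel_id is not None
            and splice_channel_id != self.confirmed_channel_id
        ):
            return Action.IGNORE
        return Action.PROCESS

    def on_revoke_and_ack(self) -> None:
        # A revoke_and_ack sent or received ends the tolerance window.
        self.last_short_channel_id = None
        self.splice_await_commitment_success = False
```

In this sketch `Action.PROCESS` does not mean the message is accepted, only that it is no longer exempt from the usual checks, so a mismatched message arriving after `on_revoke_and_ack()` would be rejected by normal validation as before.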
Updated on: 2023-08-01T01:08:18.088785+00:00