Release OutputSweeper::pending_sweep flag on future drop #4598

Merged
TheBlueMatt merged 1 commit into lightningdevkit:main from tnull:2026-05-output-sweeper-pending-sweep-guard on May 6, 2026

@tnull

tnull commented May 5, 2026

Contributor

`regenerate_and_broadcast_spend_if_necessary` used `pending_sweep: AtomicBool` as a single-runner gate but only cleared the flag with an unconditional `store(false)` *after* the inner future resolved. If the caller's future was dropped while the inner await was `Pending` (which `tokio::time::timeout`, `futures::select!`, manual `JoinHandle::abort`, etc. all do), the reset never ran, leaving the flag stuck `true` and every subsequent call to the function short-circuiting with `Ok(())`.

Because `OutputSweeper` is what claims `SpendableOutputDescriptor`s back to the user's wallet after channel closure (including HTLC outputs with time-bounded recovery deadlines), a stuck flag turns into fund-loss exposure: time-sensitive HTLC sweeps simply stop happening, while every other code path keeps queueing new outputs to sweep, until the process is restarted.

Replace the trailing `store(false)` with an RAII `PendingSweepGuard` whose `Drop` impl always releases the flag, covering normal return, error, and cancellation alike.

Co-Authored-By: HAL 9000
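
For readers who want to see the pattern in isolation, here is a minimal std-only sketch of the guard described above. It is not the code from this PR. `Sweeper`, `sweep`, `NoopWake`, `poll_once` and `sweep_runs_after_cancellation` are hypothetical names, the `bool` in the return type exists only to make the outcome observable, and the `Relaxed` failure ordering on the CAS is an assumption.

```rust
use std::future::{self, Future};
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Illustrative guard (the PR's real `PendingSweepGuard` may differ):
/// clears the flag whenever it goes out of scope.
struct PendingSweepGuard<'a>(&'a AtomicBool);

impl Drop for PendingSweepGuard<'_> {
    fn drop(&mut self) {
        // Runs on normal return, on early error return, and when the
        // enclosing future is dropped mid-await.
        self.0.store(false, Ordering::Release);
    }
}

/// Hypothetical stand-in for the sweeper; only the gate is modelled.
struct Sweeper {
    pending_sweep: AtomicBool,
}

impl Sweeper {
    /// `Ok(true)` if this call did the work, `Ok(false)` if it was skipped
    /// because another call held the gate. (Assumption: the bool is only
    /// here so the sketch can observe which branch was taken.)
    async fn sweep<F: Future<Output = ()>>(&self, inner: F) -> Result<bool, ()> {
        if self
            .pending_sweep
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            return Ok(false);
        }
        // No await between the CAS and the guard, so no cancellation gap.
        let _guard = PendingSweepGuard(&self.pending_sweep);
        inner.await;
        Ok(true)
    }
}

/// Waker that does nothing, so a future can be polled by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker))
}

/// Cancels a sweep mid-await and reports whether a later sweep still runs.
fn sweep_runs_after_cancellation() -> bool {
    let sweeper = Sweeper { pending_sweep: AtomicBool::new(false) };

    // Inner work never completes: poll once, then drop (i.e. cancel).
    let mut cancelled = Box::pin(sweeper.sweep(future::pending::<()>()));
    assert!(poll_once(cancelled.as_mut()).is_pending());
    assert!(sweeper.pending_sweep.load(Ordering::Acquire));
    drop(cancelled);

    // The guard released the flag, so this call is not short-circuited.
    let mut next = Box::pin(sweeper.sweep(future::ready(())));
    let outcome = poll_once(next.as_mut());
    outcome == Poll::Ready(Ok(true))
}
```

Calling `sweep_runs_after_cancellation()` from a `main` or a `#[test]` exercises the cancellation path. With a trailing `store(false)` in place of the guard, the second `sweep` would be expected to return `Ok(false)` instead.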

@ldk-reviews-bot

ldk-reviews-bot commented May 5, 2026

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@ldk-claude-review-bot

Collaborator

The implementation is correct and clean. Key verification points:

  1. The `_guard` binding (not `_`) ensures the guard lives until end of scope, covering the `.await`.
  2. The guard is created only after the `compare_exchange` succeeds (early return on failure), so there's no risk of a spurious release.
  3. No await points between `compare_exchange` and guard construction, so no cancellation gap.
  4. Memory orderings are correct: `Acquire` on the CAS success pairs with `Release` on the guard's drop.
  5. The test correctly simulates cancellation by polling once against a `PendingKVStore` that blocks on write, then dropping the future.

No issues found.
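
The first point in the list above is easy to get wrong, so here is a small sketch of the difference between the two bindings. It assumes a hypothetical `FlagGuard` type and helper functions that are not part of the PR: a named binding such as `_guard` keeps the value alive until the end of the scope, whereas `let _ = ...` drops it at the end of that statement.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical guard that clears a flag when dropped.
struct FlagGuard<'a>(&'a AtomicBool);

impl Drop for FlagGuard<'_> {
    fn drop(&mut self) {
        self.0.store(false, Ordering::Release);
    }
}

/// Returns the flag value observed while the guard is bound to `_guard`.
fn flag_seen_with_named_guard(flag: &AtomicBool) -> bool {
    flag.store(true, Ordering::Release);
    // Named binding: the guard lives until the function returns.
    let _guard = FlagGuard(flag);
    // Still `true` here; the guard clears it only after this load.
    flag.load(Ordering::Acquire)
}

/// Returns the flag value observed after binding the guard to `_`.
fn flag_seen_with_underscore(flag: &AtomicBool) -> bool {
    flag.store(true, Ordering::Release);
    // `_` does not bind: the guard is dropped at the end of this statement.
    let _ = FlagGuard(flag);
    // Already `false` here, so the "critical section" is unprotected.
    flag.load(Ordering::Acquire)
}
```

With a fresh `AtomicBool`, the first function returns `true` and leaves the flag cleared on exit, while the second returns `false` because the flag was released before the work began.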

@TheBlueMatt TheBlueMatt left a comment

Collaborator

Honestly not sure it's worth committing the test for this, but whatever.

@codecov

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 77.64706% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.25%. Comparing base (1a26867) to head (6394d18).
⚠️ Report is 13 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lightning/src/util/sweep.rs | 77.64% | 18 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4598      +/-   ##
==========================================
- Coverage   86.84%   86.25%   -0.59%     
==========================================
  Files         161      159       -2     
  Lines      109260   109351      +91     
  Branches   109260   109351      +91     
==========================================
- Hits        94882    94319     -563     
- Misses      11797    12416     +619     
- Partials     2581     2616      +35     

| Flag | Coverage Δ |
|---|---|
| fuzzing-fake-hashes | ? |
| fuzzing-real-hashes | ? |
| tests | 86.25% <77.64%> (+0.03%) ⬆️ |

@TheBlueMatt TheBlueMatt merged commit b1ea3f9 into lightningdevkit:main May 6, 2026
23 of 25 checks passed
@tnull

tnull commented May 6, 2026

Contributor Author

> Honestly not sure it's worth committing the test for this, but whatever.

Yeah, true, I was also on the fence whether to include it or not.

@TheBlueMatt

Collaborator

The asyncification of this API happened in 0.2, so no need to backport to 0.1.

@TheBlueMatt

Collaborator

Backported to 0.2 in #4683.

TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this pull request Jun 23, 2026
v0.2.3 - Jun 18, 2026 - "Through the Loupe"

API Updates
===========

 * `DefaultMessageRouter` will now always generate blinded message paths that
   provide no privacy (where our node is the introduction node) for nodes with
   public channels. This works around an issue which will appear for any nodes
   with LND peers that enable onion messaging - such peers will refuse to
   forward BOLT 12 messages from unknown third parties, which most BOLT 12
   payers rely on today (lightningdevkit#4647).
 * Explicit `amount_msats` of 0 is rejected in BOLT 12 `Offer`s; `OfferBuilder`
   now maps 0-amounts to an amount of `None` (lightningdevkit#4324).

Bug Fixes
=========

 * `Features::supports_zero_conf` no longer clears the `ZeroConf` features and
   `Features::requires_zero_conf` now correctly reports required, rather than
   supported, status (lightningdevkit#4517).
 * Fixed a case where a channel could become stuck if an MPP payment was
   claimed while `ChannelMonitorUpdate`s for some parts were still being
   completed asynchronously, further channel updates (e.g. forwarding another
   payment) were pending, and the node restarted (lightningdevkit#4520).
 * The presence of unconfirmed transactions actually no longer causes
   `ElectrumSyncClient` to spuriously fail to sync (lightningdevkit#4590).
 * LSPS1, LSPS2, and LSPS5 persistence will no longer get stuck and refuse to
   persist again after a single failure from the KVStore (lightningdevkit#4597, lightningdevkit#4282).
 * Dropping the future returned by
   `OutputSweeper::regenerate_and_broadcast_spend_if_necessary` no longer
   results in future calls to the same method being spuriously ignored (lightningdevkit#4598).
 * Used async-receive offers are no longer refreshed on every timer tick once
   their refresh time is reached (lightningdevkit#4672).
 * `FilesystemStore::list_all_keys` will no longer fail if there are stale
   intermediate files lying around from a previous unclean shutdown (lightningdevkit#4618).
 * When forwarding an HTLC while in a blinded path with proportional fees over
   200%, LDK will no longer spuriously allow a forward that pays us 1 msat too
   little in fees (lightningdevkit#4697).
 * Fixed a rare case where a channel could get stuck on reconnect when using
   both async `ChannelMonitorUpdate` persistence and async signing (lightningdevkit#4684).
 * If we had exactly zero balance in a zero-fee-commitment channel, the
   counterparty was able to splice all of their balance out, violating the
   reserve requirements they'd otherwise be forced to keep (lightningdevkit#4580).
 * Providing an `Event::HTLCIntercepted` to the `LSPS2ServiceHandler` twice no
   longer results in spuriously opening a channel early (lightningdevkit#4656).
 * `Event::PaymentSent::fee_paid_msat` is no longer `None` in cases where
   `ChannelManager::abandon_payment` was called before the payment ultimately
   completes anyway (lightningdevkit#4651).
 * `AnchorDescriptor::previous_utxo` now provides the correct `script_pubkey`
   for non-zero-commitment-fee anchor channels (lightningdevkit#4669).
 * Syncing a `ChainMonitor` using the `Confirm` trait will no longer write some
   full `ChannelMonitor`s to disk several times per block (lightningdevkit#4544).
 * `OMDomainResolver` now correctly accounts for failed queries when rate
   limiting, ensuring we continue to respond to queries after failures (lightningdevkit#4591).
 * Calling `ChannelManager::send_payment_with_route` without a `route_params`
   and with an invalid `Route` will no longer panic (lightningdevkit#4707).
 * `LSPS2ServiceHandler::channel_open_failed` now correctly fails intercepted
   HTLCs rather than allowing them to fail just before expiry (lightningdevkit#4677).
 * `StaticInvoice::is_offer_expired` was corrected to check offer, rather than
   static invoice, expiry (lightningdevkit#4594).
 * `lightning-custom-message`'s handling of `peer_connected` events now ensures
   that sub-handlers will see a `peer_disconnected` event if a different
   sub-handler refused the connection by `Err`ing `peer_connected` (lightningdevkit#4595).
 * Replay protection for LSPS5 signatures now detects replays which are only
   different in the encoded signature's case (lightningdevkit#4701).
 * When `lightning-liquidity` is configured in the background processor, there
   is no longer a stream of `Persisting LiquidityManager...` log spam (lightningdevkit#4246).
 * Incomplete MPP keysend payments will no longer see their HTLCs held until
   expiry (lightningdevkit#4558).
 * `InvoiceRequestBuilder` will no longer accept a `quantity` of `0` for a
   BOLT 12 `Offer`, allowing any quantity up to a bound (lightningdevkit#4667).
 * `lightning-custom-message` handlers that return `Ok(None)` when asked to
   deserialize a message in their defined range no longer cause panics (lightningdevkit#4709).
 * Several spurious debug assertions were fixed (lightningdevkit#4537, lightningdevkit#4618, lightningdevkit#4026).

Security
========

0.2.3 fixes several underestimates of the anchor reserves required to ensure we
can reliably close channels, several denial-of-service vulnerabilities and a
sanitization issue.
 * `Bolt11Invoice::recover_payee_pub_key` no longer panics if called on an
   invoice which set an explicit public key, rather than relying on public key
   recovery. Note that this method is called from
   `PaymentParameters::from_bolt11_invoice` (lightningdevkit#4717).
 * Maliciously-crafted unpayable invoices which have overflowing feerates will
   no longer cause an `unwrap` failure panic (lightningdevkit#4716).
 * Parsing an `LSPSDateTime` which is before 1970 no longer panics. This is
   reachable when parsing messages from counterparties (lightningdevkit#4715).
 * `possiblyrandom` did not properly generate random data except when it was
   explicitly configured to. By default this meant LDK was vulnerable to various
   HashDoS attacks (lightningdevkit#4719).
 * `OMNameResolver` will no longer panic when looking up payment instructions
   which include unicode characters at the start of a TXT record (lightningdevkit#4718).
 * When using the `anchor_channel_reserves` module to calculate reserves
   required to pay for fees when closing anchor channels, zero-fee-commitment
   channels were not considered. This could allow a counterparty to open many
   channels, leaving us unable to properly force-close (lightningdevkit#4592).
 * The `anchor_channel_reserves` module overestimated the value of `Utxo`s in
   the wallet by ignoring the `TxIn` cost to spend them (lightningdevkit#4670).
 * `PrintableString` did not properly sanitize unicode format characters,
   allowing an attacker to corrupt the rendering of logs or UI (lightningdevkit#4593, lightningdevkit#4605).
 * RGS data is now limited in how large of a graph it is able to cause a client
   to store in memory. Note that RGS data is still considered a DoS vector in
   general and you should only use semi-trusted RGS data (lightningdevkit#4713).
 * Counterparty-provided strings in failure messages are no longer logged in
   full, reducing the ability of such a counterparty to spam our logs (lightningdevkit#4714).
 * Reading a corrupted `ChannelManager` or `ProbabilisticScorer` can no longer
   cause us to allocate large amounts of memory (lightningdevkit#4712).

Thanks to Project Loupe for reporting most of the issues fixed in this release.

Conflicts resolved in:
 * lightning/src/chain/channelmonitor.rs
 * lightning/src/events/mod.rs
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/mod.rs
 * lightning/src/ln/offers_tests.rs
 * lightning/src/ln/onion_utils.rs