Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Improve assumeutxo state representation #30214

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

ryanofsky
Copy link
Contributor

This PR contains the first part of #28608, which tries to make assumeutxo code more maintainable, and improve it by not locking cs_main for a long time when the snapshot block is connected, and by deleting the snapshot validation chainstate when it is no longer used, instead of waiting until the next restart.

The changes in this PR are just refactoring. They make Chainstate objects self-contained, so for example, it is possible to determine what blocks to connect to a chainstate without querying ChainstateManager, and to determine whether a Chainstate is validated without basing it on inferences like &cs != &ActiveChainstate() or GetAll().size() == 1.

The PR also tries to make assumeutxo terminology less confusing, using "current chainstate" to refer to the chainstate targeting the current network tip, and "historical chainstate" to refer to the chainstate downloading old blocks and validating the assumeutxo snapshot. It removes uses of the terms "active chainstate," "usable chainstate," "disabled chainstate," "ibd chainstate," and "snapshot chainstate" which are confusing for various reasons.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 31, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK TheCharlatan
Approach ACK fjahr

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #30610 (validation: do not wipe utxo cache for stats/scans/snapshots by sipa)
  • #30598 (assumeutxo: Drop block height from metadata by fjahr)
  • #30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)
  • #30221 (wallet: Ensure best block matches wallet scan state by achow101)
  • #29680 (wallet: fix unrelated parent conflict doesn't cause child tx to be marked as conflict by Eunovo)
  • #29652 (wallet: Avoid potentially writing incorrect best block locator by ryanofsky)
  • #29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
  • #28616 (Show transactions as not fully confirmed during background validation by Sjors)
  • #26593 (tracing: Only prepare tracepoint arguments when actually tracing by 0xB10C)
  • #26022 (Add util::ResultPtr class by ryanofsky)
  • #25722 (refactor: Use util::Result class for wallet loading by ryanofsky)
  • #25665 (refactor: Add util::Result failure values, multiple error and warning messages by ryanofsky)
  • #24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/25656143496

@fjahr
Copy link
Contributor

fjahr commented Jun 8, 2024

Concept ACK

The following test code never checked anything because the if statement was
always false:

    if (cs != &chainman_restarted.ActiveChainstate()) {
        BOOST_CHECK_EQUAL(cs->m_chain.Height(), 109);
    }

Also, the height of the background chainstate it was intending to check is 110,
not 109. Fix both problems by rewriting the check.
Make Chainstate objects aware of what block they are targeting. This makes
Chainstate objects more self contained, so it's possible to determine what
blocks should be downloaded and connected to them without needing to query
ChainstateManager or look at other Chainstate objects.

The motivation for this change is to make code more readable, so it only
requires knowing about Chainstate targets, not reasoning about the assumeutxo
snapshot validation sequence. This change also enables simplifications to the
ChainstateManager interface in subsequent commits, and could make it easier to
implement new features like creating new Chainstate objects to generate UTXO
snapshots or index UTXO data.
Get rid of m_disabled/IsUsable members. Instead of marking chains disabled for
different reasons, record chainstate validity and validation progress
explicitly and use that information directly.
Move duplicate code from ChainstateManager::ActivateSnapshot and
ChainstateManager::ActivateExistingSnapshot methods to a new
ChainstateManager::AddChainstate method.

The "AddChainstate" method name doesn't mention snapshots even though it is
only used to add snapshot chainstates now, because it becomes more generalized
in a later commit in this PR ("refactor: Add ChainstateManager::m_chainstates
member")
Remove hardcoded references to m_ibd_chainstate and m_snapshot_chainstate so
MaybeCompleteSnapshotValidation function can be simpler and focus on validating
the snapshot without dealing with internal ChainstateManager states.

This is a step towards being able to validate the snapshot outside of
ActivateBestChain loop so cs_main is not locked for minutes when the snapshot
block is connected.
Use to simplify code determining the chainstate leveldb paths.
CurrentChainstate() is basically the same as ActiveChainstate() except it
requires cs_main to be locked when it is called, instead of locking cs_main
internally.

The name "current" should also be less confusing than "active" because multiple
chainstates can be active, and CurrentChainstate() returns the chainstate
targeting the current network tip, regardless of what chainstates are being
downloaded or how they are used.
ValidatedChainstate() accessor replaces GetChainstateForIndexing() with no
change in behavior.
Change ChainstateRole parameter passed to wallets and indexes. Wallets and
indexes need to know whether chainstate is historical and whether it is fully
validated. They should not be aware of the assumeutxo snapshot validation
process.
IsSnapshotActive() method is only called one place outside of tests and
asserts, and is confusing because it returns true even after the snapshot is
fully validated.

The documentation which said this "implies that a background validation
chainstate is also in use" is also incorrect, because after the snapshot is
validated, the background chainstate gets disabled and IsUsable() would return
false.
IsSnapshotValidated() is only called one place outside of tests, and is use
redundantly in some tests, asserting that a snapshot is not validated when a
snapshot chainstate does not even exist. Simplify by dropping the method and
checking Chainstate validity field directly.
SnapshotBlockhash() is only called two places outside of tests, and is used
redundantly in some tests, checking the same field as other checks. Simplify by
dropping the method and using the m_from_snapshot_blockhash field directly.
Use to replace m_active_chainstate, m_ibd_chainstate, and m_snapshot_chainstate
members.
Deduplicate code looping over chainstate objects and calling
ActivateBestChain().
Just use m_chainstates array instead.
@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 9, 2024

🐙 This pull request conflicts with the target branch and needs rebase.

@fjahr
Copy link
Contributor

fjahr commented Aug 25, 2024

Thanks for addressing my feedback @ryanofsky ! I hope we can merge #29553 soon and then I will get back to this with another full review. But it does look very good to me already.

I'm especially interested if you have ideas on splitting up documentation and code changes into separate PRs.

I'm unsure if it's really needed because I find clean refactoring commits like those here can be reviewed quicker than other changes, so I would be fine to keep it as is just based on volume of the change. But I did think about it and if that is feasible maybe splitting along the lines of everything that touches ChainstateRole and the rest (in whatever order you think works better) could be an idea. If you feel like splitting it that's fine for me but I think I would prefer to keep it as one PR as I have already passed everything so I am biased towards fewer fundamental changes ;)

Ah, and since the first is a fix you could just open that as a separate PR now. That should be an easy review and merge that doesn't conflict with anything else. The docs changes it feels like it's better to keep them along the code since you need to get your head into the topic either way. I often feel like reviewing docs and code side by side challenges your understanding and might lead to higher quality review.

@TheCharlatan
Copy link
Contributor

Concept ACK

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants