Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bulk element operation tracking + event #14032

Merged
merged 5 commits into from
Dec 13, 2023

Conversation

brandonkelly
Copy link
Member

@brandonkelly brandonkelly commented Dec 13, 2023

Description

Adds the concept of bulk element operation tracking, giving plugins an opportunity to do things only after a bulk operation is fully completed, fetching all affected elements at once.

This could be useful for expensive actions like sitemap generation, static page generation, updating Algolia indexes, etc.

use craft\elements\Entry;
use craft\events\BulkOpEvent;
use craft\services\Elements;
use yii\base\Event;

Event::on(Elements::class, Elements::EVENT_AFTER_BULK_OP, function(BulkOpEvent $event) {
    // Fetch all the entries that were affected
    $entries = Entry::find()
        ->inBulkOp($event->key)
        ->status(null)
        ->all();
    // ...
});

All built-in element (multi-)save and deletion actions will ensure that a bulk operation is active, even if only impacting a single element. So plugins could potentially stop listening to Elements::EVENT_AFTER_SAVE_ELEMENT/Element::EVENT_AFTER_SAVE altogether, and could go all-in on EVENT_AFTER_BULK_OP.

Plugins that are performing bulk operations on multiple elements can define their own bulk operations like so:

$bulkKey = Craft::$app->elements->beginBulkOp();

// save/delete elements
// ...

Craft::$app->elements->endBulkOp($bulkKey);

There’s also a new BaseBatchedElementJob queue job class, which extends BaseBatchedJob, but ensures that all of the individual batches are wrapped in a single bulk element operation.

Related issues

Copy link

linear bot commented Dec 13, 2023

@khalwat
Copy link
Contributor

khalwat commented Dec 13, 2023

I had some objections, but in thinking it through, if I just listen on Elements::EVENT_AFTER_BULK_OP then I can "do the thing" then because it will only be triggered at the end of the save (whether it's one element or thousands).

So I don't need to have an event before the bulk operation starts (though I suspect someone, somewhere may want an Elements::EVENT_BEFORE_BULK_OP too).

Obviously, this would only be possible when refactoring things for Craft 5, but it looks great to me, thanks!

(he says, in a vacuum, having never used it in the real world)

@brandonkelly
Copy link
Member Author

brandonkelly commented Dec 13, 2023

Yeah exactly. Keeping track of elements as things happen felt like the right approach for a couple reasons:

  • That will include any nested elements that are modified as a result of the primary elements being saved, which wouldn’t have been known at the outset.
  • If there are thousands, keeping a list of modified element IDs wouldn’t really be helpful, since querying for elements by thousands of IDs would be horrible for query performance (and possibly beyond resource limits).

So I don't need to have an event before the bulk operation starts (though I suspect someone, somewhere may want an Elements::EVENT_BEFORE_BULK_OP too).

Yeah maybe. Added it just in case. But in general I suspect ElementQuery::inBulkOp() will suffice.

@khalwat
Copy link
Contributor

khalwat commented Dec 13, 2023

I assume Craft is then going to move to use this for the resave elements stuff, propagation method changes, etc., and in FeedMe, and other bulk element operations ya?

@brandonkelly
Copy link
Member Author

Yes, everything in core already uses it in this PR, and we would start using it in Feed Me when porting it to Craft 5.

@khalwat
Copy link
Contributor

khalwat commented Dec 13, 2023

I wonder if this can be leveraged by the template {% cache %} mechanism, too, to coalesce the cache breaking that happens on element changes. Or maybe it already does in some manner?

Anyway, nifty, well done!

@brandonkelly brandonkelly force-pushed the feature/acc-349-bulk-element-operation-tracking branch from b70b60a to cec68f9 Compare December 13, 2023 16:27
@bencroker
Copy link
Contributor

This looks great, will definitely be taking advantage of it in Blitz 5, thanks @brandonkelly and @khalwat for raising this!

[ci skip]
@brandonkelly brandonkelly merged commit 4c3f727 into 5.0 Dec 13, 2023
@brandonkelly brandonkelly deleted the feature/acc-349-bulk-element-operation-tracking branch December 13, 2023 16:36
@brandonkelly
Copy link
Member Author

Thanks for taking a look @khalwat and @bencroker! Merged 🚀

@janhenckens
Copy link
Contributor

This is a great addition, something I can use in the v5 version of Scout too 👍

@michaelrog
Copy link

michaelrog commented Dec 13, 2023

plugins could potentially stop listening to Elements::EVENT_AFTER_SAVE_ELEMENT/Element::EVENT_AFTER_SAVE altogether, and could go all-in on EVENT_AFTER_BULK_OP.

This sounds like a nice direction, but I'm confused on the event naming. It sounds like these events are going to be a wrapper for all element saves, not just bulk ones.

If I'm reading this right, this is Craft's way of saying:

  • "I'm about to save some number of elements."
  • "I'm done saving some number of elements; You can use this key to find the elements that changed."

Eh, I think I misunderstood. This is only for bulk operations. i.e. This functionality does not cover normal run-of-the-mill atomic element saves/deletes.

So then, is there a way for me to tell, from Elements::EVENT_AFTER_SAVE_ELEMENT or Elements::EVENT_AFTER_PROPAGATE, whether an individual save/propagate action is part of a bulk operation?

@michaelrog
Copy link

michaelrog commented Dec 13, 2023

Also, curious how this would work in multi-site use cases...

Currently, we can listen for Element::EVENT_AFTER_PROPAGATE to do third-party stuff once an element is fully saved and propagated to other sites. Importantly, we can tell which site is relevant for this operation, because the initiating element is accessible directly from the event.

By contrast, these new events don't have the actual elements attached; Instead, they carry a key for querying those elements after-the-fact. We presumably need that key to facilitate querying all the site-elements that changed during an operation. Capturing element ID alone would not be sufficient... right?

@michaelrog
Copy link

michaelrog commented Dec 13, 2023

Keeping track of elements as things happen felt like the right approach for a couple reasons:

  • That will include any nested elements that are modified as a result of the primary elements being saved, which wouldn’t have been known at the outset.

This feels like it might get tricky. A batch operation might save elements of many types, some of which may be third-party... So, how do I know, listening to Elements::EVENT_AFTER_BULK_OP, what types of elements changed?

For example:

Event::on(Elements::class, Elements::EVENT_AFTER_BULK_OP, function(BulkOpEvent $event) {
  // Fetch all the entries that were affected
  $entries = Entry::find()
    ->inBulkOp($event->key)
    ->status(null)
    ->all();
});

How do I know, from the event, that I need to build an Entry query here?

@brandonkelly
Copy link
Member Author

Eh, I think I misunderstood. This is only for bulk operations. i.e. This functionality does not cover normal run-of-the-mill atomic element saves/deletes.

@michaelrog Your first take was correct. As I said:

All built-in element (multi-)save and deletion actions will ensure that a bulk operation is active, even if only impacting a single element. So plugins could potentially stop listening to Elements::EVENT_AFTER_SAVE_ELEMENT/Element::EVENT_AFTER_SAVE altogether, and could go all-in on EVENT_AFTER_BULK_OP.

By contrast, these new events don't have the actual elements attached; Instead, they carry a key for querying those elements after-the-fact. We presumably need that key to facilitate querying all the site-elements that changed during an operation. Capturing element ID alone would not be sufficient... right?

Correct, this is only concerned with which elements were updated, regardless of site. I considered tracking the site, but decided against it since most element saves will propagate across all sites anyway.

How do I know, from the event, that I need to build an Entry query here?

You would just need to check each individual element type you care about. Or continue listening for EVENT_AFTER_ELEMENT_SAVE and track which element types are being touched from there – but considering you would need to do that potentially across multiple requests (in the case of batched queue jobs), that may be more trouble than it’s worth.

@michaelrog
Copy link

michaelrog commented Dec 13, 2023

Your first take was correct.

In that case, naming these as "bulk" does feel a bit too specific... It feels conceptually weird that every element save/delete action is definitionally a bulk action, whether or not multiple elements are involved. This feels more analogous to "transactions" and I wish I could think of term that felt more precise, to make documentation/teaching about this a bit less cumbersome.

...

I see where bulk queue jobs are amended to beginBulkOp() and endBulkOp(), but I don't see where this plays into ordinary element saving. i.e. Where is the service actually tracking the individual elements that change?

Edit: I am a silly goose, and Large diffs are not rendered by default.

...

You would just need to check each individual element type you care about.

The challenge is, I may not know ahead-of-time which element types I care about, especially if 3p plugins are involved. So then do I have to check every possible element type?

I guess I could try to query the elements_bulkops table using the key, join it into the elements table to get type, and select distinct types from that. That seems like a hoop to jump... but I guess I don't have a good suggestion for an alternative... 🤔

@khalwat
Copy link
Contributor

khalwat commented Dec 14, 2023

So if we're hung up on naming... transaction might work? Sort of mirroring Yii2's yii\db\Connection::beginTransaction() / yii\db\Connection::endTransaction() so instead we'd have:

  • Events - Elements::EVENT_BEFORE_ELEMENT_TRANSACTION & Elements::EVENT_AFTER_ELEMENT_TRANSACTION
  • Methods - $transactionKey = Craft::$app->elements->beginElementTransaction(); & Craft::$app->elements->endElementTransaction($transactionKey);
  • Query param - ->inElementTransaction($event->key)

It probably more accurately describes what's going on this way... but honestly, the naming doesn't bother me terribly, as long as the functionality is there.

@brandonkelly
Copy link
Member Author

Trust me, 90% of the time I spent on this change was trying to think of a better name for it than “bulk operation”.

“Transaction” is definitely the closest thing conceptually. But I don’t want to overload that word.

@michaelrog
Copy link

It occurs to me that deleted elements may be difficult to work with in this context:

Soft-deleted elements can still be queried using inBulkOp()... But hard-deleted elements aren't registered at all. (The elementId fk prevents them from existing in elements_bulkops in the first place.)

e.g. If I hard-delete a bunch of entries, and now need to remove those entries from an external search index, there's no native mechanism for me to batch that, right?

(I could see this being especially frustrating since Delete For Site behavior for multi-site entries is inconsistent between hard-/soft-delete.)

So it seems like the best practice will still be to listen to EVENT_AFTER_DELETE_ELEMENT for elements being deleted, rather than trying to handle them in bulk... Am I understanding this correctly?

@brandonkelly
Copy link
Member Author

@michaelrog yeah, no way to keep track of hard deletions after-the-fact, including site-specific hard-deletions, so if you care about those, you would still want to listen to EVENT_AFTER_DELETE and EVENT_AFTER_DELETE_FOR_SITE.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants