Skip to content

[Blinker] Signals (sync events) #5359

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
Jun 9, 2021
Merged

[Blinker] Signals (sync events) #5359

merged 25 commits into from
Jun 9, 2021

Conversation

smotornyuk
Copy link
Member

@smotornyuk smotornyuk commented Apr 28, 2020

Add support of signals using blinker

Rationale:

  1. Devs often need some kind of hook into CKAN lifecycle. While interfaces work just fine(yes, they really cool when you've used to core concepts), no one will like hundreds of them. First of all, once the interface is published, you cannot add/remove its method, unless all extensions are using inherit=True. The second problem is quite clumsy construction::

     for plugin in plugins.PluginImplementations(plugins.IYetAnotherInterface):
       plugin.something_just_happened_and_you_can_react_to_this(event_details)
    
  2. PubSub(observer, events, whatever you call it) makes the system more expressive. While interfaces are proactive(you are writing the code, that checks the existence of another code, and then performs some actions), events are reactive(you are writing code, that "reacts" to some changes and does nothing if there is no any activity). Even though those two ideas almost the same in byte-code, it makes a huge difference for the person who is writing/reading code.

  3. CKAN Interfaces (at least, from my point of view) are "extension points". It's a tool for changing default behavior. While they can be used for side effects(IDomainObject.notification), writing a new interface only for sending some kind of notifications about activity inside CKAN core sounds a bit wrong for me. I would rather watch logs and call some function when a particular pattern is found there.

This PR solves mentioned issues in a following way:

  1. tk.signals(ckan.lib.signals). Send notification to all the observers (if there are none of them, blinker does nothing, so it's very cheap operation) at any moment. tk.signals.SIGNAL_NAME.send(data, **context) and everyone will react to your signal. This functionality requires predefined signals. If you are doing it from an extension, call tk.signals.ckanext.signal('SIGNAL_NAME').send(data, **context) - this construction works with both, existing and new signals. I won't write all details here, as they already described inside the blinker's documentation.
  2. ISignal. This new interface contains a single method get_signal_subscriptions which collects all the subscriptions from extensions and connects them to signals. It is better than creating listeners using blinker's decorators because we can attach/detach listeners when plugin loaded and unloaded(and it's crucial for testing).
  3. Listeners(subscriptions) are just functions that are called whenever the corresponding signal is sent. If there are no listeners for signal, sending the signal is a no-op. If there is no signal for the listener - the listener never will be called, but you won't get an error.
  4. Listeners can modify arguments(like IDatasetFormv::update_package_schema), can return value ( like IFacets::dataset_facets) and can perform some side effects (like IDomainObjectModification::notify). But for the first two cases, one MUST use Interfaces. For side effects, I'm suggesting signals. The main reason why you should avoid using signals for the first two cases - order of listeners execution is undefined, while interfaces are controlled by the order of plugins in the config file.
  5. Inside extensions, when you just want to notify about some activity, you can use signals. Once again, they are very cheap, so you can send tons of signals(but it would be better to send them only when something interesting is about to happen). Think about signals, as some kind of logs with callbacks. When you want to make your plugin more extendible - use interfaces.

As you can see, signals are not replacing interfaces. They just take a part of responsibility from interfaces and are completely optional. You don't have to use them, nothing will be changed from your previous experience.

Yes, if you are wondering, why blinker:

  1. it seems, that it's most popular / most lightweight library that works with signals
  2. Flask can use blinker too

PS I'm going to add a dozen signals here and there through the codebase and create a new page in docs while this PR in draft state. If there are any other suggestions - you're welcome.

@smotornyuk smotornyuk changed the title [Blink] Signals (sync events) [Blinker] Signals (sync events) Apr 28, 2020
@smotornyuk smotornyuk marked this pull request as ready for review May 4, 2020 13:29
@smotornyuk
Copy link
Member Author

Check new doc page for additional details

@amercader amercader self-assigned this May 5, 2020
@wardi wardi self-assigned this May 5, 2020
@wardi
Copy link
Contributor

wardi commented May 5, 2020

before_action/after_action would be better implemented as a plugin with an interface like chained_action so we can wrap calls to all actions with our own code.

This plugin interface would let us modify/capture parameters and return values like the code here but would also allow maintaining state across calls (setting a variable before the call and checking it after) and capturing/modifying exceptions raised by the actions.

@smotornyuk
Copy link
Member Author

Yes, completely agree. But here signal can be used alongside with interface. As we are using flask, i've tried to follow its ideas, and exactly the same situation described in the third paragraph of Flask docs (i've even copied it into our new documentation page):

... You will notice that there are signals that appear to do the same thing like some of the builtin decorators do (eg: request_started is very similar to before_request()). However, there are differences in how they work. The core before_request() handler, for example, is executed in a specific order and is able to abort the request early by returning a response. In contrast all signal handlers are executed in undefined order and do not modify any data.

Copy link
Member

@amercader amercader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a first quick look at this and I like it. The ability to subscribe to a notification before or after an action without worrying about commit state is great. We should make clearer that signal listeners should not modify the data passed though. I think the documentation needs to be clearer, specially where you describe how to register listeners with ISignal as it took me a while to understand what was expected.
I can help with this and proof reading the main docs.

Would it make sense to have a signal emitted on out before_request and after_request Flask handlers? or is that built-in in Flask?

I'll like to give this a more thorough look but it's looking good so far.

"""Action was successfully completed.
"""

register_blueprint = ckan.signal(u'register_blueprint')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of when a signal like this might be useful? Not saying it isn't, just want to understand better

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say it pretty useless - I was just searching for someplace to add signals, so there is no problem in removing this. But, someone, who loves metaprogramming, can create new auth functions for every new entity type. In general-use extension(like ckanext-security), you can't predict, what exact dataset types will be used on the concrete portal. But let's imagine, you want to create ACL tables based on auth functions:

Permission Anonymous User Member Editor Admin Sysadmin
package_create no yes yes yes yes yes
package_delete no yes yes yes yes yes

As time passes, you want to make it more flexible. And, if one portal has dataset types dataset and report, you want this table to look like

Permission Anonymous User Member Editor Admin Sysadmin
dataset_create no yes yes yes yes yes
report_create no yes yes yes yes yes
dataset_delete no yes yes yes yes yes
report_delete no yes yes yes yes yes

and for second portal, that has only collection dataset type, you want

Permission Anonymous User Member Editor Admin Sysadmin
collection_create no yes yes yes yes yes
collection_delete no yes yes yes yes yes

We already have a prepare_blueprint method inside IDataset/IGroupForm interfaces, but, as one can see from name of the method, it should be used for manipulations with the newly created blueprint. And using it for side effects violates SOLID principles. I'm ok with it - sometimes i'm doing much worse things :), but there are a lot of people out there who are strongly against such solutions. So, subscribing to custom blueprint creation for side-effect will make much more sense for them

@wardi
Copy link
Contributor

wardi commented May 11, 2020

I feel pretty strongly that before_ and after_ signals are an anti-pattern. Code that is called in a random order and relies on modifying mutable parameters passed also seems like a bad idea.

Let's add plugin interfaces that work like chained_action does for these cases instead. They are better in many ways:

  • order is predictable
  • values are modified explicitly by passing new values up the chain
  • local variables can be used around the call to implement things like resource management or time measurement
  • no special value required to indicate when not to call the original (just don't call it)
  • exceptions can be captured/modified

@smotornyuk
Copy link
Member Author

@amercader , flask already has before/after_request, but it makes sense to re-export them from ckan.lib.signals.ckan, so that developers don't have to know implementation details and can assume that signal is completely under CKAN control.

@wardi , I agree with what you said about new interface, but you probably misread previous explanations(or, I just used wrong grammar :) )

  • those before/after shouldn't be considered as two parts of a single mechanism. they are completely different events. I just can't find any better name for reflecting the moment, when those events occur.
  • signals NEVER should be used for modifying mutable parameters. We have interfaces for such kind of stuff(or, in this case, we have to create a new interface). Signals are for side-effects, and one should only use them for logic, that does not depend on execution order. In this case, the good use-case for signals would be computation of total execution time for particular action during a single request(if you want to use before/after together) or saving all uncommon package_search calls with the corresponding username in order to analyze user-preferences and make some kind of suggestion based on user's behavior(yep, you can also use IPackageController.before_search for this)

Signals are not a replacement for Interfaces. In this case, I think that we can have both - interface and signal at the same time(like in the case of register_blueprint signal, that we were discussing with @amercader above). But while Interfaces are more like hooks that are used for changing behavior, signals are subscribers that are doing only additional work, not affecting original workflow.

PS I don't have any particular reasons to use those before/after_action and I'll remove them. But I want to make sure, that you see my vision of signals and don't have a feeling, like signals are "something similar to interfaces".

@wardi
Copy link
Contributor

wardi commented May 11, 2020

thanks @smotornyuk I may have misunderstood, I believed you were suggesting before_action as a good place to modify data_dict. removing before/after_action from this PR sounds good.

👍 to re-exporting flask signals so extensions have one place to register everything.

@smotornyuk
Copy link
Member Author

  • Removed action signals
  • Updated documentation. @amercader , I've added more examples and now docs should be more clear and ready for your review/improvements.

@wardi
Copy link
Contributor

wardi commented May 19, 2020

FYI the signal points we added to our site are:

  • successful logins
  • invalid login attempts
  • password reset requests
  • account creations
  • calls to datastore_upsert for created and updated records
  • calls to datastore_delete for removed records
  • new datasets created
  • datasets updated
  • datasets published by an admin (marked public)
  • datasets deleted
  • resources added
  • resources updated
  • resources deleted
  • errors on bulk upload to datastore (with ckanext-recombinant Excel templates)
  • downloads of bulk Excel templates (ckanext-recombinant)

I look forward to this getting merged so we can add these to core or our extensions with blinker instead of the client side screen-scraping we're doing right now.

@amercader
Copy link
Member

Another good candidate for a signal is whenever an activity is created. I've found many occasions when users want to export the general activity on a CKAN instance to a third party platform to perform detailed usage analysis. Having a signal that just gives you the created activity which you can then just forward to whatever tool you are using in the background would be great.

@amercader
Copy link
Member

@smotornyuk I had a go at reviewing and making clearer the docs: https://gist.github.com/amercader/9c3524af1ef78bd2062cd136d954614f Feel free to add whatever you consider to this PR (I couldn't as it's not on a branch on this repo)

Do you think it's worth adding a couple more signals as part of this PR? My favourites so far from Ian's list and mine are:

  • Successful login
  • User created
  • Activity created

Of course these can all be added later after more discussion.

Code-wise I'm happy with it. The dict returned by the interface method is a bit weird, with the keys being callables but it is well documented.

One thing I don't fully get is what is the benefit of using ISignals instead of just registering the signal with the decorator. Can you expand on that?

@smotornyuk
Copy link
Member Author

I also like suggested signals, so I'll try to add them to current PR during next week.

Nothing stops you from registering the signals globally, using native blinker's API(currently, we are doing so in all the local extensions, used by our clients), but in such a way, signals will always be active. This may be a problem in two cases:

  1. You have installed the extension, that is listening to some signals, but you don't need to have it enabled. Instead, you want to import some class from this extension, extend it, and expose it as a feature of your own extension. For example, I often create inheritors of CKANHarvester, but the original class itself is not used, so ckan_harvester not added to enabled plugins. But, all the global subscriptions will be created as soon as you import files directly or it will be imported as a dependency of some other module. And in my case, all the theoretical subscriptions of ckan_harvester will be active even if I didn't enable it.
  2. Tests. The same as for plugins - some tests must be executed without all the listeners enabled. So you need to have some switchers for listeners.

So, it's possible to use the decorator-way described in blinker documentation as long as you are sure you won't regret about it. But I would recommend avoiding this practice for general-use extensions.

@smotornyuk
Copy link
Member Author

smotornyuk commented May 22, 2020

@amercader , i've merged your changes.

And added the following signals:

  • successful/failed login
  • user created
  • request/perform password reset
  • datastore upsert/delete

After a quick look, I decided that it would be better to leave activity/resource/dataset signals out of this PR, because there are quite a lot of decisions about the concrete moments when they need to be sent and what exact payload should be provided.

So, if there no additional comments after newest changes, this PR is ready to go

@wardi wardi added this to the CKAN 3.0 milestone Jun 26, 2020
@amercader
Copy link
Member

I had a fresh look at this @smotornyuk and it still looks great. Really excited about the doors this opens

@amercader amercader merged commit 0ceebd1 into ckan:master Jun 9, 2021
@ccancellieri
Copy link

Ciao, very good work thanks for your effort.

I'm going to implement the IDomainObjectModification over my 2.8.x compatible plugin, so I can't use ISignal.
I'm doing that to notificate the deletion of a children to a parent metadata view.

IDomainObjectModification.notify_after_commit is deprecated:

have you planned to deprecate also https://docs.ckan.org/en/latest/extensions/plugin-interfaces.html#ckan.plugins.interfaces.IDomainObjectModification.notify

or it's ok to use (not having ISignal) ?

@smotornyuk
Copy link
Member Author

smotornyuk commented Sep 6, 2021

As far as i remember, notify_after_commit doesn't work in 2.8. You can either use IDomainObjectModification.notify or more low-level ISession interface. ISession is still available in v2.9 and will be migrated to events only in the next major CKAN release(which won't happen any soon and we can implement some sort of adapter from the old ISession to new events)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

4 participants