-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(send_kcidb.py): Set processed_by_kcidb_bridge flag after processing node #919
Conversation
a10ab12
to
a263d96
Compare
src/send_kcidb.py
Outdated
# TBD: job nodes might have child nodes, mark them as processed as well | ||
node['sent_kcidb'] = True | ||
try: | ||
self._api.node.update(node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the safe place to update the node is in _send_revision
method after data validation is done with kcidb.io.SCHEMA.is_valid(revision)
. As for some reason, if data is invalid, the service would still set the kcidb_sent
flag to True
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might move there, but may data become valid later?
Because if i will not flag node as processed, it will keep trying to resend it (and fail, because data is invalid).
Maybe better we just report invalid data somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Data becomes invalid if incorrect type of data is stored in node or there is some parsing issue while generating KCIDB-compatible data dictionary in the bridge service.
That's why the same data won't be valid again without any changes to service or node model.
Maybe we can have an error log for invalid data. Also, we can have a dict
field in Node
model for KCIDB related processing information with 2 flags i.e. kcidb_sent
and processed_by_kcidb_bridge
, and use both the fields for a submission retry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then maybe i will change name to processed_by_kcidb_bridge (but it feels a bit long), it will be more correct such way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/send_kcidb.py
Outdated
|
||
# mark the node as processed, sent_kcidb field to True | ||
# TBD: job nodes might have child nodes, mark them as processed as well | ||
node['sent_kcidb'] = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the field name is kcidb_sent
from PR kernelci/kernelci-core#2746.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, sorry for mistake, i am still thinking how it should look and drafting where i will put rest of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, np. I'd be happy to help in case you find something unclear in the bridge service.
4369323
to
0d479c1
Compare
Requires: kernelci/kernelci-core#2746 After node is being processed, set flag to indicate we sent data to kcidb. Signed-off-by: Denys Fedoryshchenko <[email protected]>
0d479c1
to
f44708a
Compare
node = self._find_unprocessed_node() | ||
|
||
if not node: | ||
node, is_hierarchy = self._api_helper.receive_event_node(context['sub_id']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will miss PubSub events if this is conditional as this will only process real-time event data if no unprocessed nodes are found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As i recall we tested before together, active pubsub subscription will queue events and wont drop them.
If it doesnt(if i missed something), then it need to be redesigned, because even in current code, it means it might miss event, if other event submission going too slow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. This needs to be verified.
Data in node might be invalid, and not really sent to kcidb | ||
This is workaround, until we improve event handling in kernelci-pipeline/api | ||
""" | ||
node['processed_by_kcidb_bridge'] = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we were going to have 2 flags, i.e. one is for successful submission and another one is for processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for submission not yet, as it will be next step of handling errors, which is unclear if happening at all
child_nodes = self._api.node.find({'parent': node['id']}) | ||
if child_nodes: | ||
for child in child_nodes: | ||
self._node_processed_recursively(child) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of querying API, we should be getting node IDs from revision
data as there might be some cases when not all the child nodes are submitted to KCIDB.
So, mark sent_kcidb
to True
only if the node was actually submitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment we mark just "processed by kcidb bridge", to workaround missing events issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. So, sent_to_kcidb
would be a different flag for successful submission. Got it.
Requires: kernelci/kernelci-core#2746
After node is being processed, set flag to indicate we sent data to kcidb.