@ikreymer ikreymer commented Oct 6, 2023

An alternate fix for #1176, superseding #1218 and #1249.

Computes the execution time entirely in the operator:

  • Detecting all pod exits and adding the difference between terminated.state.finishedAt and terminated.state.startedAt to a new crawljob execTime counter.
  • For crawl cancelation / crawl job deletion, setting a new cancelation key for fast cancelation (to work with "Fast cancelation + remove time counter", browsertrix-crawler#406), adding up <current time> - running.state.startedAt for each running pod, then deleting all running pods. The term interrupt combined with the redis state should result in crawler pods exiting right away.
  • Normal crawl finish/stop waits for all pods to reach the Completed state, then adds up the times. Crawl cancelation also adds up <current time> - running.state.startedAt in the finalizer.
  • Total execTime added to db crawl object if not already set, then incremented in org month group.
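
The accounting described in the list above can be sketched roughly as follows. This is a minimal illustration, not the operator code from this PR: the function names (`parse_ts`, `container_exec_seconds`, `total_exec_seconds`) are hypothetical, and the input dicts are assumed to look like the `running` / `terminated` container states Kubernetes reports, with `startedAt` and `finishedAt` as UTC timestamps.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse a Kubernetes-style UTC timestamp such as 2023-10-06T07:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def container_exec_seconds(state: dict, now: datetime, canceled: bool) -> int:
    """Seconds of execution time to count for one container state (hypothetical helper)."""
    terminated = state.get("terminated")
    if terminated and terminated.get("startedAt") and terminated.get("finishedAt"):
        # Exited container: count finishedAt - startedAt.
        delta = parse_ts(terminated["finishedAt"]) - parse_ts(terminated["startedAt"])
        return int(delta.total_seconds())

    running = state.get("running")
    if canceled and running and running.get("startedAt"):
        # Cancelation path: the pod is about to be deleted, so count
        # <current time> - startedAt instead of waiting for it to finish.
        return int((now - parse_ts(running["startedAt"])).total_seconds())

    # Still running and not canceled: nothing to add yet.
    return 0


def total_exec_seconds(states: list, now: datetime, canceled: bool = False) -> int:
    """Sum execution time across all container states (hypothetical helper)."""
    return sum(container_exec_seconds(state, now, canceled) for state in states)
```

With one pod that ran from 07:00:00 to 07:10:00 and a second still running since 07:05:00, a normal pass at 07:20:00 counts only the exited pod, while a cancelation at the same moment also counts the running pod's elapsed time. A real operator would additionally need to avoid counting the same exit twice across reconcile passes, which this sketch leaves out.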

tw4l and others added 2 commits October 5, 2023 21:40
- rename isNewCrash -> isNewExit, crashTime -> exitTime
- keep track of exitCode
- add execTime counter, increment when state has a 'finishedAt' and 'startedAt' state
- ensure pods are complete before deleting
- set redis :canceled key to immediately cancel crawl
- delete crawl pods to ensure pod exits immediately
- in finalizer, don't wait for pods to complete - add currentTime - pod.status.running.startedAt times for all existing pods
@ikreymer ikreymer force-pushed the issue-1776-execution-time-in-operator branch from 9ed49ed to e56c49e Compare October 6, 2023 07:23
@tw4l tw4l left a comment

Tested with cancelled crawls, sending SIGINTs to restart crawler pods, and scale at 1 and higher. Working well in all cases. Nice work!

@ikreymer ikreymer merged commit 5cad9ac into main Oct 10, 2023
@ikreymer ikreymer deleted the issue-1776-execution-time-in-operator branch October 10, 2023 00:45