Skip to content

feat: add execute now to task queues #577

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: main
Choose a base branch
from
Open

feat: add execute now to task queues #577

wants to merge 18 commits into from

Conversation

twuebi
Copy link
Contributor

@twuebi twuebi commented Nov 27, 2024

Being able to move the date of cleanup to now() means potentially not paying for 7 days of storage. Depending on how we wanna do recursive delete, this can also serve as a building block for it.

  • add interface to trait & implement for pg & existing queus
  • expose on http api
  • extend tests

@twuebi twuebi force-pushed the tp/task-cancellation branch 3 times, most recently from 3bfb408 to 416e76a Compare December 9, 2024 15:23
Base automatically changed from tp/task-cancellation to main December 11, 2024 16:11
Copy link

cla-assistant bot commented Dec 19, 2024

CLA assistant check
All committers have signed the CLA.

@twuebi twuebi changed the title feat: add execute now to task queues (WIP) feat: add execute now to task queues Jan 8, 2025
@@ -861,6 +864,7 @@ pub mod v1 {
#[utoipa::path(
post,
tag = "warehouse",
// FIXME: deleted_tabulars should be kebab-case, this would be a breaking change
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets document the kebab-case endpoint, and add the underscore endpoint undocumented additional for 1 minor release version.

tag = "warehouse",
path = "/management/v1/warehouse/{warehouse_id}/deleted-tabulars/reschedule",
responses(
(status = 204, description = "Tabular dropped successfully"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs an update

#[utoipa::path(
post,
tag = "warehouse",
path = "/management/v1/warehouse/{warehouse_id}/deleted-tabulars/reschedule",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking more about a task centric API, similar to what we discussed here:
https://docs.google.com/document/d/1sX2iBO6eL_8XCC7cznppiZWy8K44xQ-NOPhYZlHO-ic/edit?tab=t.0

But I like this approach too.

Eventually we should have a task view in Lakekeeper, where all tasks show up. This is super useful for maintenance, so that users don't have to browse all namespaces to see what ran successfully, what failed and what needs attention.

As such, we will be adding more APIs around: Retry, reschedule, cancel, listing etc.
Most of those APIs will probably use task_ids as the main identifier, which is why this API could also be useful.
I am unsure if we want to maintain both APIs.

The longer I think about it the more I think we should focus on the task centric APIs to avoid spreading task management APIs across very diverse endpoints.
Maybe we should have another call to decide if we should tackle /task now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went for tabular_ids instead of task_ids to avoid messy authorization, how'd you want to deal with that?

We have per table / view permissions for drop & undrop. tasks exist on warehouse level so it'd be a warehouse permission. That'd fragment task permissions across warehouse and the entity the task pertains to, I'm somewhat ok with that but it may be unexpected to be able to undrop but not undrop-now or inversely to be able to reschedule a drop but not being able to drop in the first place.

@@ -230,6 +230,14 @@ pub struct UndropTabularsRequest {
pub targets: Vec<TabularIdentUuid>,
}

#[derive(Deserialize, Debug, ToSchema)]
#[serde(rename_all = "kebab-case")]
pub struct RescheduleSoftDeletionRequest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we move this to the /task API, we could use a generic reschedule request that is not specific for SoftDeletions

.reschedule_tabular_expiration(TaskFilter::TaskIds(task_ids), reschedule_to)
.await?;

// TODO: emit event
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we missing an event-type for this?

@c-thiel c-thiel removed this from the Release 0.7.0 milestone Feb 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants