Skip to content

TCB_SetSchedParams: Change to match API reference #1312

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

alwin-joshy
Copy link
Contributor

The implementation of TCB_SetSchedParams did not
follow the API reference, as it would fail if:

a. The TCB already had a scheduling context, even
if this was the same as the one being passed in.
b. The scheduling context was bound to a TCB, even
if this was the TCB that was invoked.

This commit allows TCB_SetSchedParams allows the
invoked TCB's already bound scheduling context to
be passed in as described the API reference.

Closes: #1311

@lsf37 lsf37 added the verification Needs formal verification input/change, or is motivated by verification label Aug 23, 2024
@lsf37 lsf37 requested a review from michaelmcinerney August 23, 2024 08:27
@lsf37
Copy link
Member

lsf37 commented Aug 23, 2024

We'll need to check what the formal spec says and how much work it would be to change it if it is the same as the code and not the same as the manual.

@lsf37 lsf37 added the MCS issues about the mixed-criticality system config label Aug 23, 2024
Copy link
Contributor

@Indanz Indanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is good, commit message could be improved a bit. Just drop the last bit, it doesn't add anything you haven't said or implied yet:

This commit allows TCB_SetSchedParams allows the
invoked TCB's already bound scheduling context to
be passed in as described the API reference.

@alwin-joshy alwin-joshy force-pushed the fix_set_sched_params branch from 9bc940e to bf2581f Compare August 23, 2024 09:44
@lsf37
Copy link
Member

lsf37 commented Aug 26, 2024

We'll need to check what the formal spec says and how much work it would be to change it if it is the same as the code and not the same as the manual.

The spec does the same as the code at the moment, so this would need re-verification. @michaelmcinerney is currently on leave, but when he is back he might be able to say if this one is quick and easy or the opposite.

@michaelmcinerney
Copy link
Contributor

We'll need to check what the formal spec says and how much work it would be to change it if it is the same as the code and not the same as the manual.

The spec does the same as the code at the moment, so this would need re-verification. @michaelmcinerney is currently on leave, but when he is back he might be able to say if this one is quick and easy or the opposite.

This should be a very straightforward update to the proofs, maybe just a day or so.

@lsf37
Copy link
Member

lsf37 commented Sep 3, 2024

Turns out this is a bit more complex in the proofs. We specifically introduced this case 6 years ago in ba78e3b (and forgot to update the manual), because we were hitting lots of special casing when SC/TCB were already bound, even if it is just overwritten to be the same. Michael is looking into what would need to be done.

Current thinking is that we should make use of the updateFlags parameter to explicitly skip the update in this case. This should give us the behaviour the API reference has without needing more cases deeper down in the proof, only in the decode function, which is less critical.

@Indanz
Copy link
Contributor

Indanz commented Sep 3, 2024

I agree that not passing thread_control_sched_update_sc makes more sense when tcb->tcbSchedContext == sc (which implies sc->scTcb == tcb).

@alwin-joshy
Copy link
Contributor Author

Will squash the new commit with the first if it's okay

Copy link
Contributor

@Indanz Indanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the code is correct, it's somewhat ugly. If it doesn't make verification harder, I'd prefer an outer if-check that checks if the SC is changed. If it is, it does the existing checks and also adds thread_control_sched_update_sc to the flags (which need a new local var that can be passed at the end).

@michaelmcinerney
Copy link
Contributor

I have been able to get the abstract invariants working with something analogous to the C in this PR as it is currently, but removing the check on line 1874 performed just before schedContext_bindTCB, given that we now know that we will be performing the bind in this case. So I think this behaviour will be fine for verification.

But my initial thoughts on this were that the old error behaviour was much more clear, and that if the TCB is bound to a sched context and we do not want to unbind them, then it is strange that the user would need to provide a cap to this sc just to end up performing a no-op. I would prefer to have the user pass in a bool as to whether they would like the sc changed. However, I'm not sure whether it's reasonable to expect the user to always know whether the TCB is bound, and if so, what sc it is bound to. I think the user does need to know this, but others in the verification team aren't sure. Introducing this user-provided bool would be a change to the API, so maybe we do not want to do that, but maybe it is a way of simplifying the code, in a way.

@michaelmcinerney
Copy link
Contributor

Adding a bool to the API would require an RFC, and it's not worth it. So if we can go ahead with Indan's suggestion of an outer if, then I should be able to get the proofs sorted within a couple of days, and then this can be merged.

@Indanz
Copy link
Contributor

Indanz commented Sep 6, 2024

But my initial thoughts on this were that the old error behaviour was much more clear,

The old behaviour was to always fail the call if the thread already had an SC bound, is that what you are referring to?

Sure it is clear, but it would make the syscall useless to call more than once during init. All the parameters can be configured with other system calls, so perhaps that's not a problem, but why have such a system call at all then?

and that if the TCB is bound to a sched context and we do not want to unbind them, then it is strange that the user would need to provide a cap to this sc just to end up performing a no-op.

Yes, very strange. I think the only purpose of seL4_TCB_SetSchedParams is to create tasks with slightly less system calls.

I would prefer to have the user pass in a bool as to whether they would like the sc changed. However, I'm not sure whether it's reasonable to expect the user to always know whether the TCB is bound, and if so, what sc it is bound to. I think the user does need to know this, but others in the verification team aren't sure. Introducing this user-provided bool would be a change to the API, so maybe we do not want to do that, but maybe it is a way of simplifying the code, in a way.

For passive tasks it is not reasonable to expect the user to know the current SC.

My personal preference would be to get rid of seL4_TCB_SetSchedParams altogether, both on MCS and non-MCS. The function can be emulated with multiple calls for backwards API compatibility.

But if we're doing such big breaking changes, the whole init functions could be cleared up. Currently there is way too much overlap between seL4_TCB_Configure, seL4_TCB_SetSpace and seL4_TCB_SetSchedParams. The only thing missing is an easy way to change or set fault_ep.

@alwin-joshy
Copy link
Contributor Author

But if we're doing such big breaking changes, the whole init functions could be cleared up. Currently there is way too much overlap between seL4_TCB_Configure, seL4_TCB_SetSpace and seL4_TCB_SetSchedParams. The only thing missing is an easy way to change or set fault_ep.

Agreed, the only reason I stumbled on this at all is because I wanted to change the priority and fault_ep of a thread and set_sched_params seemed like the easiest of the three to use in my case.

@michaelmcinerney
Copy link
Contributor

The old behaviour was to always fail the call if the thread already had an SC bound, is that what you are referring to?

Yes, that's what I was referring to. I suppose that before, it was a much less useful call since it can't be called for TCBs that did have a bound SC, but my objection was really just to providing the cap to the SC and then getting a no-op.

But if we're doing such big breaking changes, the whole init functions could be cleared up. Currently there is way too much overlap between seL4_TCB_Configure, seL4_TCB_SetSpace and seL4_TCB_SetSchedParams. The only thing missing is an easy way to change or set fault_ep.

I hadn't thought about that. That could be interesting. Given that CRefine is still in progress and that the ccorres rules for these functions will need to be updated, if there is a significant simplification that could be made, then it might be worth it. We would need to make an RFC, so it could end up just taking too much time, but @lsf37 might have some thoughts on it.

@lsf37
Copy link
Member

lsf37 commented Sep 7, 2024

But if we're doing such big breaking changes, the whole init functions could be cleared up. Currently there is way too much overlap between seL4_TCB_Configure, seL4_TCB_SetSpace and seL4_TCB_SetSchedParams. The only thing missing is an easy way to change or set fault_ep.

I hadn't thought about that. That could be interesting. Given that CRefine is still in progress and that the ccorres rules for these functions will need to be updated, if there is a significant simplification that could be made, then it might be worth it. We would need to make an RFC, so it could end up just taking too much time, but @lsf37 might have some thoughts on it.

In my opinion we should definitely not do bigger breaking changes without properly strong reasons -- we have no way to even estimate how much impact that would have on current users who'd all need to update their code.

@corlewis
Copy link
Member

corlewis commented Sep 9, 2024

This doesn't seem quite right to me yet. The current code doesn't update the flags in the null_cap case, which means that the syscall can't be used to unbind an sc.

I guess there's multiple ways to do this, but how about moving the flags update to after the scCap switch statement, with something like this?

if (tcb->tcbSchedContext != sc) {
    update_flags |= thread_control_sched_update_sc;
}

That should also make it a no-op when trying to unbind a thread that is already unbound, which to me feels like the appropriate generalisation of the change we're already making.

All of this would also let us remove the second half of the if statements at

seL4/src/object/tcb.c

Lines 1876 to 1879 in 08cf7ee

if (sc != NULL && sc != target->tcbSchedContext) {
schedContext_bindTCB(sc, target);
} else if (sc == NULL && target->tcbSchedContext != NULL) {
schedContext_unbindTCB(target->tcbSchedContext);
which I think would make the proofs slightly simpler.

The implementation of TCB_SetSchedParams did not follow the API
reference, as it would fail if:

a. The TCB already had a scheduling context, even if this was the same
   as the one being passed in.
b. The scheduling context was bound to a TCB, even if this was the TCB
   that was invoked.

Signed-off-by: Alwin Joshy <[email protected]>
We no longer need to check this as the previous commit changed
decodeSetSchedParams to only pass the thread_control_sched_update_sc
flag if these conditions are true.

Signed-off-by: Alwin Joshy <[email protected]>
@alwin-joshy
Copy link
Contributor Author

Thanks for picking up on that, I think you're right. I've updated the flags update as you suggested and changed the checks in tcb.c in a separate commit.

@lsf37 lsf37 self-assigned this Feb 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MCS issues about the mixed-criticality system config verification Needs formal verification input/change, or is motivated by verification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

seL4_TCB_SetSchedParams: inconsistency between API reference and implementation
6 participants