Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix and simplify inference timing logic #47711

Merged
merged 2 commits into from
Nov 26, 2022
Merged

Fix and simplify inference timing logic #47711

merged 2 commits into from
Nov 26, 2022

Conversation

pchintalapudi
Copy link
Member

@pchintalapudi pchintalapudi commented Nov 25, 2022

Follow-up to #46825
Should fix #47710

@pchintalapudi pchintalapudi marked this pull request as ready for review November 25, 2022 23:58
@brenhinkeller brenhinkeller added the bugfix This change fixes an existing bug label Nov 26, 2022
src/julia.h Outdated
@@ -1940,6 +1940,7 @@ typedef struct _jl_task_t {
void *stkbuf; // malloc'd memory (either copybuf or stack)
size_t bufsz; // actual sizeof stkbuf
uint64_t inference_start_time; // time when inference started
unsigned int timing_inference; // whether inference is currently timing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use another 4 bytes for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we believe that inference and codegen aren't going to exceed 65536 recursions, we can drop all of the recursion trackers to 2 bytes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But yes I think the issue was that I wasn't initializing the inference start time to 0, so junk times ended up being recorded. I'll adjust it to rely on the reentrant inference tracker again.

@pchintalapudi
Copy link
Member Author

@DilumAluthge given that last time this failure was intermittent, how many times do we want to rerun the tests here to be sure of the fix?

@DilumAluthge
Copy link
Member

It was all green the first time. Let's do a second run, and if that is all green, let's go ahead and merge it and see how it does on master.

@giordano
Copy link
Contributor

Would it be possible to conceive a test which reproducibly triggers the bug, instead of hoping to hit the intermittent situation?

@DilumAluthge
Copy link
Member

I'm seeing a failure in Pkg, but no failure in misc.

@DilumAluthge DilumAluthge merged commit 88a0627 into master Nov 26, 2022
@DilumAluthge DilumAluthge deleted the pc/fix-timers branch November 26, 2022 09:58
@DilumAluthge
Copy link
Member

Would it be possible to conceive a test which reproducibly triggers the bug, instead of hoping to hit the intermittent situation?

Yeah, this would be a good thing to add in a follow-up PR.

@pchintalapudi
Copy link
Member Author

Since #46825 is being backported to 1.9, we should also backport this fix too.

@pchintalapudi pchintalapudi added the backport 1.9 Change should be backported to release-1.9 label Nov 29, 2022
KristofferC pushed a commit that referenced this pull request Dec 8, 2022
* Fix and simplify inference timing logic

* Reduce task struct size

(cherry picked from commit 88a0627)
@KristofferC KristofferC removed the backport 1.9 Change should be backported to release-1.9 label Dec 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix This change fixes an existing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failure in the misc testset on x86_64-w64-mingw32: Expression: after_comp >= before_comp
6 participants