Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch free_at_safepoint linked list (back) to a vector #1806

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

MasterDuke17
Copy link
Contributor

This way we don't need to alloc every time something is added, and actually performing the free of the elements should be faster since we don't have to pointer chase through a linked list.

Heaptrack reports ~11m fewer allocation functions called and CORE.c's stage parse seems to be about 1s faster. However, this is not safe. Adding to the list with just a bare MVM_VECTOR_PUSH seems to cause problems in random spectests. But wrapping it in uv_mutex_lock(&(tc->instance->mutex_free_at_safepoint)); ...; uv_mutex_unlock(&(tc->instance->mutex_free_at_safepoint)); is incredibly slow. Is there a safe-but-faster way?

This way we don't need to alloc every time something is added, and
actually performing the free of the elements should be faster since we
don't have to pointer chase through a linked list.
@MasterDuke17 MasterDuke17 requested review from jnthn and niner May 19, 2024 01:27
@niner
Copy link
Contributor

niner commented May 19, 2024

I think this is a good case for per-thread lists (or arrays). MVM_alloc_safepoint is only called on the GC orchestrator thread at a point where all other threads are just waiting. So it'd be safe to go through all thread contexts and process their free lists.

The only thing I'm worried about with MVM_VECTOR is that it only grows and never shrinks. Thus we could be paying the memory cost for the rest of the program runtime after just one extreme situation. Therefore I'd destroy and re-initialize those vectors in MVM_alloc_safepoint if they are larger than some sane value (maybe 64?), e.g. if (MVM_VECTOR_SIZE(tc->free_at_safepoint) > 64) { MVM_VECTOR_DESTROY(tc->free_at_safepoint); MVM_VECTOR_INIT(tc->free_at_safepoint, 64); }

@jnthn
Copy link
Member

jnthn commented May 19, 2024

Is there a safe-but-faster way?

The thread safety is exactly why it was a linked list.

@MasterDuke17
Copy link
Contributor Author

I think this is a good case for per-thread lists (or arrays). MVM_alloc_safepoint is only called on the GC orchestrator thread at a point where all other threads are just waiting. So it'd be safe to go through all thread contexts and process their free lists.

I'll try that.

The only thing I'm worried about with MVM_VECTOR is that it only grows and never shrinks. Thus we could be paying the memory cost for the rest of the program runtime after just one extreme situation. Therefore I'd destroy and re-initialize those vectors in MVM_alloc_safepoint if they are larger than some sane value (maybe 64?), e.g. if (MVM_VECTOR_SIZE(tc->free_at_safepoint) > 64) { MVM_VECTOR_DESTROY(tc->free_at_safepoint); MVM_VECTOR_INIT(tc->free_at_safepoint, 64); }

When compiling CORE.c, the size of the free_at_safepoint linked list reached a max of 22k. Of the 1982 times MVM_alloc_safepoint was called, only 116 of them had a list size of less than 100. So if we do this I think the sane value would have to be much higher than 64. But I'm not sure what cost we'd be paying even without destroying the vector? Sure it only grows, but the MVM_VECTOR_POPs decrement its index. So after the MVM_alloc_safepoint the next MVM_VECTOR_PUSHes will start filling it in back at the beginning. It seems to me that this way we're reusing the memory and reducing churn?

@MasterDuke17
Copy link
Contributor Author

Is there a safe-but-faster way?

The thread safety is exactly why it was a linked list.

That's what I suspected, but thanks for confirming. I wonder if an MVM_ATOMIC_VECTOR would be useful in more places than just here?

@jnthn
Copy link
Member

jnthn commented May 20, 2024

I wonder if an MVM_ATOMIC_VECTOR would be useful in more places than just here?

Trouble is that lock-free dynamic arrays are vastly more complex beasts. There's one paper here describing an approach, but seems that design is vulnerable to the ABA problem, although that probably isn't a concern for the use case mentioned here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants