You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For qsort, the size of the data is unknown at build time. I found that memcpy can influence the performance significantly. So I made some specialization for small data sizes:
However, when data size is large (e.g. std::array<long long, 16>), I can see there is still a huge gap between driftsort and glibc's implementation. See benchmark_qsort_with_large_data_size portion of the data in the above link.
Another complication is that, for qsort_r interface, we do not know the alignment of the data, while the comparison function may be invoked on scratch pad. So what I have to do is
size_t alignment = guess_alignment(element_size, data);
BlobComparator<Comp> comp{element_size, alignment, compare};
// Insertion sort is always fine because we never call comparison on temporary// spaceconstexprsize_t MAX_LEN_ALWAYS_INSERTION_SORT = 20;
constexprsize_t MAX_ALIGNMENT = 32;
if (DRIFTSORT_LIKELY(length <= MAX_LEN_ALWAYS_INSERTION_SORT)) {
BlobPtr v = comp.lift(data);
returnsmall::insertion_sort_shift_left(v, length, 1, comp);
}
// unfortunately, due to qsort_r's restriction, we cannot really know the// alignment of over-aligned types, so we have to fall back to slow path.// This is similar to the behavior of glibc's qsort.if (DRIFTSORT_UNLIKELY(alignment > MAX_ALIGNMENT))
return trivial_heap_sort(data, length, comp);
This does not affect the performance here but I would like to know if you have more ideas on improving the sorting algorithm for qsort scenario.
Thanks!
P.S. although the code is heavily fuzzed, I have not guarantee that it resembles the driftsort exactly.
The text was updated successfully, but these errors were encountered:
I'm surprised you picked driftsort for this purpose. All the qsort implementations I am aware of are in-place and unstable. I'd expect it to be a hard sell to suggest switching to an implementation that allocates. Plus there are various optimizations not done or done specifically because of the stability requirement Orson and I designed driftsort in mind with.
There are various optimizations in driftsort and ipnsort that only really make sense if the compiler can inline the comparison function, which isn't possible in C without PGO. I've written previously about the performance impacts of it here.
If you want to get good test coverage I suggest you fork this repository and attach your implementation to it, so it can leverage the existing tests and benchmarks.
Benchmarks are tricky, for me to evaluate and comment on your findings I would have to dig through your benchmark code and implementation, and frankly I don't have the time for that currently. I suggest you fork this repository and attach your implementation to it, c_std_sys.cpp should be a good starting point see this commit for a list of associated changes. From there you can follow the top-level README to run and visualize benchmarks. In this familiar framework it would be easier for me to reason about performance characteristics.
Outside Rust, I am trying to port driftsort as libc's qsort implementation. See https://github.com/SchrodingerZhu/qsort-driftsort.
For qsort, the size of the data is unknown at build time. I found that memcpy can influence the performance significantly. So I made some specialization for small data sizes:
In this way, it beats glibc's implementation in most cases with small data sizes. See performance number:
https://docs.google.com/spreadsheets/d/1p5Ltrq-HyVY4M_lKTWRD0pauJgT34yrxwWx6ZQnrit8/edit?gid=329046775#gid=329046775
However, when data size is large (e.g.
std::array<long long, 16>
), I can see there is still a huge gap between driftsort and glibc's implementation. Seebenchmark_qsort_with_large_data_size
portion of the data in the above link.Another complication is that, for
qsort_r
interface, we do not know the alignment of the data, while the comparison function may be invoked on scratch pad. So what I have to do isThis does not affect the performance here but I would like to know if you have more ideas on improving the sorting algorithm for
qsort
scenario.Thanks!
P.S. although the code is heavily fuzzed, I have not guarantee that it resembles the driftsort exactly.
The text was updated successfully, but these errors were encountered: