
fix: subsequent requests cannot be sent until 'num_concurrent_requests' requests have all finished in non-block mode #59

Merged 2 commits into ray-project:main on Dec 9, 2024

Conversation

llsj14 (Contributor) commented on Jul 1, 2024

Issues: #43, #56

Summary

  • In non-blocking mode, subsequent requests cannot be sent until all in-flight requests have finished.
  • Fixing the request launcher directly was challenging because of its dependency on Ray, so I instead used multiple threads and request launchers, each holding one client and handling a single request at a time (see the sketch below).

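A minimal sketch of that thread-per-launcher idea (illustrative only; `make_client`, `client.send_request`, and `run_nonblocking` are assumed names, not the actual llmperf API):

```python
import queue
import threading

def _launch_one(client, prompts, results, lock):
    # Each thread owns a single client and sends one request at a time,
    # so a slow response only blocks this thread, not the whole batch.
    while True:
        try:
            prompt = prompts.get_nowait()
        except queue.Empty:
            return
        metrics = client.send_request(prompt)  # hypothetical client call
        with lock:
            results.append(metrics)

def run_nonblocking(make_client, all_prompts, num_concurrent_requests):
    prompts = queue.Queue()
    for p in all_prompts:
        prompts.put(p)
    results, lock = [], threading.Lock()
    threads = [
        threading.Thread(
            target=_launch_one,
            args=(make_client(), prompts, results, lock),
        )
        for _ in range(num_concurrent_requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because each thread blocks only on its own request, a new request is dispatched as soon as any in-flight one completes, rather than waiting for the whole batch of `num_concurrent_requests` to drain.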
fix: subsequent requests cannot be sent until 'num_concurrent_requests' requests have all finished in non-blocking mode

Signed-off-by: Sungjae Lee <[email protected]>
llsj14 force-pushed the fix/get-next-ready branch from d75db69 to 570f780 on July 3, 2024
jouDance commented on Nov 10, 2024

Hey, was there any progress on this issue? It seems detrimental to the correct operation of llmperf, yet it isn't getting any attention.
@gracehonv

llsj14 (Contributor, Author) commented on Nov 12, 2024

I think it might be better to move this into a separate file, as an overlay, rather than modifying token_benchmark_ray.py directly.

In any case, I have been using llmperf with this commit applied and haven't hit any errors.

gracehonv (Contributor) commented

I made a commit 5 months ago that moves prompt construction outside the send loop so the benchmark isn't slowed down. I didn't realize it affected the num_concurrent_requests mode. Let me know if I can help in some way.
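
A minimal sketch of that idea (the helpers here are stand-ins, not the actual llmperf code):

```python
import random

def build_prompt(mean_tokens, stddev_tokens):
    # Stand-in for llmperf's prompt sampler (illustrative only).
    n = max(1, int(random.gauss(mean_tokens, stddev_tokens)))
    return "hello " * n

def send_request(prompt):
    # Stand-in for the actual client call; only its timing matters here.
    pass

# Pay the prompt-construction cost up front...
prompts = [build_prompt(550, 150) for _ in range(100)]

# ...so the timed loop does nothing but dispatch requests.
for prompt in prompts:
    send_request(prompt)
```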

llsj14 (Contributor, Author) commented on Nov 13, 2024

@gracehonv
I rebased my code onto your commit, and it didn't slow down.

gracehonv (Contributor) commented

It looks like your commit prepares all of the prompts before launching the concurrent requests, so there shouldn't be any slowdown. Thanks for fixing!

cpwan commented on Dec 7, 2024

[Figure: number of concurrent requests over time (y-axis: concurrent requests). Left: the current token_benchmark_ray; right: with this fix.]

Thanks for the fix!

llsj14 (Contributor, Author) commented on Dec 8, 2024

@cpwan,
Thank you for your analysis and visualization related to this PR.

llsj14 (Contributor, Author) commented on Dec 8, 2024

@avnishn @rickyyx
Would it be possible to get this PR reviewed, please? Thank you.

rickyyx (Collaborator) left a comment

Thank you!

rickyyx merged commit f1d6bed into ray-project:main on Dec 9, 2024