Reimplement Interruptible #417

hms · 2024-11-18T23:28:05Z

Details for each of the commits are in the separate commit messages

Some minor housekeeping with .gitignore, .rubocop.ymlt and solid_queue.gemspec
Proposed update to the tests to silence successful tests from outputting backtraces (keep the test output clean where possible and only output for the unexpected
Proposed reimplementation of the Interruptible concern to help mitigate some mysql database connectivity issues.

rosa

Thanks a lot for this PR! I need a little bit of time to review it and I'm on-call this week, so I hope to get to it next week 🙏

rosa · 2024-12-02T11:47:11Z

Thanks again, @hms! Would you mind elaborating a bit more on:

Proposed reimplementation of the Interruptible concern to help mitigate some mysql database connectivity issues.

What where the connectivity issues you found? Was this only in tests, or did you observe these in a production environment?

hms · 2024-12-02T16:28:27Z

@rosa

Hi Rosa,

I don't run MySQL in production. The issues I've come across are strictly in a testing environment. What I get are random disconnections (and miles long backtraces) at the driver level that are frequently paired with broken pipe errors. Extra rescues on the pipes side didn't help. It's always more database errors than pipe errors, leading me to assume the root cause was the database driver.

Since you folks are MySQL based and haven't seen any of these issues, I'm sure this is some weird combination of Mac, my local builds via brew, and/or how I'm configured to build. I did some digging, and the mysql2 gem had a window of time with similar issues some years ago right after an undocumented MySQL upgrade on the Oracle side, so maybe I'm a cannery in the coal mine?

Relative to a justification for the change, I do think using a pure Ruby solution as long as it's 100% in compliance with the requirements is worth considering. My understanding for why the self-pipe is to avoid signal processing races that can be triggered by the anything using one of the signal aggregation APIs (database drivers) and signal traps. The Thread::Queue makes those guarantees with the same performance profile and a very simply to understand implementation.

hms

rosa · 2024-12-02T16:31:44Z

My understanding for why the self-pipe is to avoid signal processing races that can be triggered by the anything using one of the signal aggregation APIs (database drivers) and signal traps.

Yes, exactly! Thanks for the extra justification; that's super helpful. I do like this implementation much more, so I'm going to run this in production for a couple of days and see if we run into anything, and then I'll merge afterwards. Thank you!

rosa · 2024-12-02T16:32:11Z

Oops, would you mind rebasing against the last main and resolving the conflicts? Sorry!

hms · 2024-12-02T16:33:49Z

No problem. But I won't be able to do for a couple of hours. Just about to step out.

hms

rosa · 2024-12-02T16:34:43Z

Oh, of course, no rush at all! 🙏

* Adds to RVM .ruby-gemset to .gitignore for RVM and gemset holdouts. * Updates Gemfile.lock to include arm64-darwin-24 (weird that it's missing) * Adds 4 development dependencies to quiet deprecation warnings * Updates rubocop to allow Ruby 3.3 syntax, matching .ruby-version

SolidQueue has excellent built-in error reporting. While this is fantastic for SQ users, it is less than ideal for testing SolidQueue because any test that deliberately uses or triggers an exception produces voluminous error reporting. This error reporting is hugely valuable when the exception is not expected, but distracting and of limited value for expected use-cases, especially when the test confirms the correct outcomes via assertions. This commit adds: * A generic test-specific Exception class: ExpectedTestError This allows testing for specific exceptions while retaining all error reporting infrastructure for unexpected exceptions. * Two helper methods for silencing on_thread_error output These methods accept an Exception or Array(Exception) and simply does not call the output mechanism if the exception passed to on_thread_error matches. This way, any unexpected error during test still reports in a highly visible manner while the exceptions being tested are validated via assertions. * Replaces the stock on_thread_error with one that ignores ExpectedTextError. Updated several tests from using the ruby stock RuntimeError to ExpectedTestError. * Configures tests to run with YJIT enabled This is to test under likely production deployment configuration, not for performance reasons. Note: With the very recent reporting on M4's crashing on Mac's with YJIT enabled, we might want to either defer this change or add a conditional to opt in until the problem is resolved.

* Replaces a little Unix cleverness with a standard Ruby class. This pushes the responsibiity for meeting the SQ requirements from SQ to stock Ruby * Delivers equivelent performance, identical API, and API behaviors with the original implementation (see note below on Futures) * Mostly fixes a *platform / version dependent* issue with MySQL (see below) * Meets 100% of SQ's functional requirements: * interruptible_sleep: a potentially blocking operation interruptible via either a "wake_event" (possibly requested prior to entering interruptible_sleep) or blocking until a timeout. * wake_up / interrupt: a Signal#trap and thread-safe method that does not require user-level synchronization (with the risk of not fully understanding all of the complexities required) code that either interrupts an inflight-interruptible_sleep or enqueues the event to processed in the invocation of interruptible_sleep * Interruptible's API is trivially reproduceable via Thread::Queue * interruptible_sleep => Queue.pop(timeout:) where pushing anything into the queue acts as the interrupt event and timeout is reliable without any extra code or exception handling. * wake_up / interrupt => Queue.push(Anything) is thread, fiber, and Signal.trap safe (can be called from anywhere) and captures all wake_up events whenever requested, automaticall caching any "event" not processed by a currently executing interruptible_sleep matching existing functionality exactly. Why the Future in #interruptible_sleep? While Thread::Queue micro benchmarks as having the same performance on the main thread Vs. any form of a sub-thread (or Fiber) and self-pipe, when running the SQ test suite we see a 35% slow down Vs. the original self-pipe implenentation. One assumes this slowdown would manifest in production. By moving the just the Queue#pop into a separate thread via Concurrent::Promises.future we get +/- identical performance to the original self-pipe implementation. I'm assuming this root causes to Ruby main-thread only housekeeping and/or possibly triggering a fast/slow path issue. Why a Future Vs. Thread#new for each interruptible_sleep call? Every other threaded operation in SQ is implemented using Concurrent Ruby. Using a Future is for code and architectual consistency. There is no difference in performance or functionality between the two. MySQL *only* issues: There seems to be a *platform specific* or *version specific* problem with MySQL database connectivity and/or broken self-pipes leading to randomly failing tests and a stream of distracting backtraces *even with successful* tests. Adding to the complexity sometimes, the lost database connection can self-heal -- HOWEVER -- this takes time and given how much of the test suite has time based assertions, leads to additional random test failures. These, or similar, issues have been observed in the past when changes to the MySQL client library forced changes in the mysql2 gem. With the Thread::Queue based implementation of the Interruptible concern, the random failures and amount of spurious output are dramatically improved (but not eliminated).

hms · 2024-12-02T19:11:41Z

@rosa

Rebased.

rosa · 2024-12-03T17:34:55Z

This is working great so far, so I'm going to merge it. Thanks a lot @hms!

rosa · 2024-12-03T17:35:40Z

Oops, so sorry, conflicts again! It needs rebasing again 🙈 Also, I realised I missed a comment for a small change, that I'll add now.

lib/solid_queue/processes/interruptible.rb

solid_queue.gemspec

* Streamlined a comment in the code * Removed unneeded gemspec developer_dependencies that we get handled via the upgrade to Rails 7.1.4.1 * Added a gemspec version dependency required for keyword argument support * Rebase to main

hms · 2024-12-03T21:18:53Z

@rosa

Rebased.

Thank you for all of your feedback and help to get this finished.

rosa · 2024-12-04T13:00:19Z

Thank you!

rosa reviewed Nov 21, 2024

View reviewed changes

hms mentioned this pull request Nov 24, 2024

Implements TimerTasks using Promises. #425

Closed

hms added 3 commits December 2, 2024 09:04

hms force-pushed the interruptible branch from 0e88a8e to a152f26 Compare December 2, 2024 18:42

rosa reviewed Dec 3, 2024

View reviewed changes

lib/solid_queue/processes/interruptible.rb Outdated Show resolved Hide resolved

rosa reviewed Dec 3, 2024

View reviewed changes

solid_queue.gemspec Outdated Show resolved Hide resolved

Addresses PR feedback

17c4cd5

* Streamlined a comment in the code * Removed unneeded gemspec developer_dependencies that we get handled via the upgrade to Rails 7.1.4.1 * Added a gemspec version dependency required for keyword argument support * Rebase to main

rosa merged commit 4f29fc8 into rails:main Dec 4, 2024
4 checks passed

hms deleted the interruptible branch December 18, 2024 20:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reimplement Interruptible #417

Reimplement Interruptible #417

hms commented Nov 18, 2024

rosa left a comment

rosa commented Dec 2, 2024

hms commented Dec 2, 2024

rosa commented Dec 2, 2024

rosa commented Dec 2, 2024 •

edited

Loading

hms commented Dec 2, 2024

rosa commented Dec 2, 2024

hms commented Dec 2, 2024

rosa commented Dec 3, 2024

rosa commented Dec 3, 2024

hms commented Dec 3, 2024

rosa commented Dec 4, 2024

Reimplement Interruptible #417

Reimplement Interruptible #417

Conversation

hms commented Nov 18, 2024

rosa left a comment

Choose a reason for hiding this comment

rosa commented Dec 2, 2024

hms commented Dec 2, 2024

rosa commented Dec 2, 2024

rosa commented Dec 2, 2024 • edited Loading

hms commented Dec 2, 2024

rosa commented Dec 2, 2024

hms commented Dec 2, 2024

rosa commented Dec 3, 2024

rosa commented Dec 3, 2024

hms commented Dec 3, 2024

rosa commented Dec 4, 2024

rosa commented Dec 2, 2024 •

edited

Loading