Exceeding max-concurrent-recoveries triggers circuit breaker #6106

Open
@lucavice

Description

Version Information
Version of Akka.NET? 1.4.40
Which Akka.NET Modules? Akka.Cluster.Sharding 1.4.40, Akka.Persistence.SqlServer 1.4.35

Describe the bug
In certain situations, temporarily exceeding the max-concurrent-recoveries parameter triggers a circuit breaker that prevents Akka Persistence from persisting any further events for as long as the circuit breaker stays open.

See sequence of logged events here:
[image: screenshot of the logged event sequence]

I have been unable to reproduce this problem reliably, as it seems to happen fairly randomly on our production instance (a few times per day).
Setting max-concurrent-recoveries to 1 locally and forcing multiple actors to recover at once does not seem to trigger the issue, so it must be caused by a combination of factors.
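
For reference, a minimal sketch of the kind of local override described above. It assumes the standard `akka.persistence.max-concurrent-recoveries` key; the exact configuration used in the local test is not reproduced here, so treat this as illustrative only.

```hocon
akka.persistence {
  # Assumption: only this setting is overridden; everything else stays at its default.
  # The default is 50. With a value of 1, a second actor recovering at the same
  # time has to wait until the first recovery has finished.
  max-concurrent-recoveries = 1
}
```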

We can't find the root cause of the error that triggers the circuit breaker. The logged OpenCircuitException carries no further information, and it is the only error that appears in the log (hundreds of times while the circuit breaker is open).

To Reproduce
I don't have reliable steps to trigger the problem.
I would appreciate hints on what I could try in order to better understand the underlying problem and come up with a strategy to reproduce it reliably. This may be caused entirely by a programming error on my side, but I'm a bit lost as to what to look for.

Environment
Windows on .NET 6
