
"Shutting down myself" caused by an error that occurred in a remote node. #7113

Open
ingted opened this issue Mar 3, 2024 · 2 comments

ingted commented Mar 3, 2024

Version Information
Version of Akka.NET? 1.5.0
Which Akka.NET Modules? Akka Remote

Describe the bug

  1. Two actors are created, one in node A (port 64609) and one in node B (port 64640).
  2. actor_in_a tells actor_in_b; actor_in_b processes the message and tells back.
  3. However, the generated response message cannot be serialized by Hyperion, which causes the error "Failed to write message to the transport" in node B:
AssociationError [akka.tcp://[email protected]:64609] <- akka.tcp://[email protected]:64640: Error [Failed to write message to the transport] []
  4. Then A runs into a disassociation issue with an unknown node on port 64643 (I didn't create it):
Association between local [tcp://[email protected]:64643] and remote [tcp://[email protected]:64609] was disassociated because the ProtocolStateActor failed: Unknown
  5. Then B disassociates:
Association with remote system akka.tcp://[email protected]:64640 has failed; address is now gated for 5000 ms. Reason is: [Akka.Remote.EndpointException: Failed to write message to the transport
 ---> Hyperion.ValueSerializers.UnsupportedTypeException: No coercion operator is defined between types 'CefBrowser*' and 'System.Object'.
   at Hyperion.ValueSerializers.UnsupportedTypeSerializer.WriteManifest(Stream stream, SerializerSession session)
   at lambda_method305(Closure, Stream, Object, SerializerSession)
   at Hyperion.ValueSerializers.ObjectSerializer.WriteValue(Stream stream, Object value, SerializerSession session)
   at Hyperion.Extensions.StreamEx.WriteObject(Stream stream, Object value, Type valueType, ValueSerializer valueSerializer, Boolean preserveObjectReferences, SerializerSession session)
   at lambda_method299(Closure, Stream, Object, SerializerSession)
   at Hyperion.ValueSerializers.ObjectSerializer.WriteValue(Stream stream, Object value, SerializerSession session)
   at Hyperion.Extensions.StreamEx.WriteObject(Stream stream, Object value, Type valueType, ValueSerializer valueSerializer, Boolean preserveObjectReferences, SerializerSession session)
   at Hyperion.SerializerFactories.EnumerableSerializerFactory.<>c__DisplayClass10_0.<BuildSerializer>b__1(Stream stream, Object o, SerializerSession session)
   at Hyperion.ValueSerializers.ObjectSerializer.WriteValue(Stream stream, Object value, SerializerSession session)
   at lambda_method76(Closure, Stream, Object, SerializerSession)
   at Hyperion.ValueSerializers.ObjectSerializer.WriteValue(Stream stream, Object value, SerializerSession session)
   at lambda_method72(Closure, Stream, Object, SerializerSession)
   at Hyperion.ValueSerializers.ObjectSerializer.WriteValue(Stream stream, Object value, SerializerSession session)
   at lambda_method74(Closure, Stream, Object, SerializerSession)
   at Hyperion.ValueSerializers.ObjectSerializer.WriteValue(Stream stream, Object value, SerializerSession session)
   at Hyperion.Serializer.Serialize(Object obj, Stream stream, SerializerSession session)
   at Hyperion.Serializer.Serialize(Object obj, Stream stream)
   at Akka.Serialization.HyperionSerializer.ToBinary(Object obj)
   at Akka.Remote.MessageSerializer.Serialize(ExtendedActorSystem system, Address address, Object message)
   at Akka.Remote.EndpointWriter.WriteSend(Send send)
   --- End of inner exception stack trace ---
   at Akka.Remote.EndpointWriter.PublishAndThrow(Exception reason, LogLevel level, Boolean needToThrow)
   at Akka.Remote.EndpointWriter.WriteSend(Send send)
   at Akka.Remote.EndpointWriter.<Writing>b__27_0(Send s)
   at lambda_method64(Closure, Object, Action`1, Action`1, Action`1)
   at Akka.Actor.ReceiveActor.OnReceive(Object message)
   at Akka.Actor.UntypedActor.Receive(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Actor.ActorCell.Invoke(Envelope envelope)]
  6. Then nodes A and B disassociate:
Disassociated [akka.tcp://[email protected]:64640] -> akka.tcp://[email protected]:64609
Disassociated [akka.tcp://[email protected]:64609] <- akka.tcp://[email protected]:64640
  7. Finally, A and B shut down (the seed node has port 9000):

For node B <= Shutting down myself

Message [AckIdleCheckTimer] from [akka://cluster-system/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fcluster-system%4010.28.199.143%3A64640-2/endpointWriter#1537073490] to [akka://cluster-system/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fcluster-system%4010.28.199.143%3A64640-2/endpointWriter#1537073490] was not delivered. [1] dead letters encountered. If this is not an expected behavior then [akka://cluster-system/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fcluster-system%4010.28.199.143%3A64640-2/endpointWriter#1537073490] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'. Message content: Akka.Remote.EndpointWriter+AckIdleCheckTimer

Cluster Node [akka.tcp://[email protected]:64609] - Marking node(s) as UNREACHABLE [Member(address = akka.tcp://[email protected]:64640, Uid=1558551789 status = Up, role=[ShardNode,ShardAnalyticServiceNode,petabridge.cmd,10.28.199.143], upNumber=3, version=12.8.202)]. Node roles [ShardNode,ShardAnalyticServiceNode,petabridge.cmd,10.28.199.143]

"Couldn't establish a causal relationship between "remote" gossip and "local" gossip -
Remote[Gossip(members = [Member(address = akka.tcp://[email protected]:9000, Uid=1028805500 status = Up, role=[dd,singletonRole,SeedNode,petabridge.cmd], upNumber=1, version=7.1.460), Member(address = akka.tcp://[email protected]:64609, Uid=942161684 status = Up, role=[ShardNode,ShardAnalyticServiceNode,petabridge.cmd,10.28.199.143], upNumber=2, version=1.0.0), Member(address = akka.tcp://[email protected]:64640, Uid=1558551789 status = Up, role=[ShardNode,ShardAnalyticServiceNode,petabridge.cmd,10.28.199.143], upNumber=3, version=12.8.202)], overview = GossipOverview(seen=[UniqueAddress: (akka.tcp://[email protected]:9000, 1028805500), UniqueAddress: (akka.tcp://[email protected]:64640, 1558551789)], reachability=Reachability([akka.tcp://[email protected]:64640 -> UniqueAddress: (akka.tcp://[email protected]:64609, 942161684): Unreachable [Unreachable] (1)])), version = VectorClock(0DA4CAFA080D3226573233D2547D1AC0->6, 3EBA3B1B1C91D00A7301186C5FF6E40C->1)] -
Local[Gossip(members = [Member(address = akka.tcp://[email protected]:9000, Uid=1028805500 status = Up, role=[dd,singletonRole,SeedNode,petabridge.cmd], upNumber=1, version=7.1.460), Member(address = akka.tcp://[email protected]:64609, Uid=942161684 status = Up, role=[ShardNode,ShardAnalyticServiceNode,petabridge.cmd,10.28.199.143], upNumber=2, version=1.0.0), Member(address = akka.tcp://[email protected]:64640, Uid=1558551789 status = Up, role=[ShardNode,ShardAnalyticServiceNode,petabridge.cmd,10.28.199.143], upNumber=3, version=12.8.202)], overview = GossipOverview(seen=[UniqueAddress: (akka.tcp://[email protected]:64609, 942161684)], reachability=Reachability([akka.tcp://[email protected]:64609 -> UniqueAddress: (akka.tcp://[email protected]:64640, 1558551789): Unreachable [Unreachable] (1)])), version = VectorClock(06163C12B3D0EBEA1063AC304EC6A2FE->1, 0DA4CAFA080D3226573233D2547D1AC0->6)] -
merged them into [Gossip(members = [Member(address = akka.tcp://[email protected]:9000, Uid=1028805500 status = Up, role=[dd,singletonRole,SeedNode,petabridge.cmd], upNumber=1, version=7.1.460), Member(address = akka.tcp://[email protected]:64609, Uid=942161684 status = Up, role=[ShardNode,ShardAnalyticServiceNode,petabridge.cmd,10.28.199.143], upNumber=2, version=1.0.0), Member(address = akka.tcp://[email protected]:64640, Uid=1558551789 status = Up, role=[ShardNode,ShardAnalyticServiceNode,petabridge.cmd,10.28.199.143], upNumber=3, version=12.8.202)], overview = GossipOverview(seen=[], reachability=Reachability([akka.tcp://[email protected]:64609 -> UniqueAddress: (akka.tcp://[email protected]:64640, 1558551789): Unreachable [Unreachable] (1)][akka.tcp://[email protected]:64640 -> UniqueAddress: (akka.tcp://[email protected]:64609, 942161684): Unreachable [Unreachable] (1)])), version = VectorClock(06163C12B3D0EBEA1063AC304EC6A2FE->1, 0DA4CAFA080D3226573233D2547D1AC0->6, 3EBA3B1B1C91D00A7301186C5FF6E40C->1)]"

Received gossip where this member has been downed, from [akka.tcp://[email protected]:9000]

Message [BackoffTimer] from [akka://cluster-system/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fcluster-system%4010.28.199.143%3A64640-2/endpointWriter#816387495] to [akka://cluster-system/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fcluster-system%4010.28.199.143%3A64640-2/endpointWriter#816387495] was not delivered. [8] dead letters encountered. If this is not an expected behavior then [akka://cluster-system/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fcluster-system%4010.28.199.143%3A64640-2/endpointWriter#816387495] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'. Message content: Akka.Remote.EndpointWriter+BackoffTimer

Cluster Node [akka.tcp://[email protected]:64609] - Node has been marked as DOWN. Shutting down myself

For Node A <= Shutting down myself

Cluster Node [akka.tcp://[email protected]:64640] - Marking node(s) as UNREACHABLE [Member(address = akka.tcp://[email protected]:64609, Uid=942161684 status = Up, role=[ShardNode,petabridge.cmd,ShardAnalyticServiceNode,10.28.199.143], upNumber=2, version=1.0.0)]. Node roles [ShardNode,petabridge.cmd,ShardAnalyticServiceNode,10.28.199.143]

Cluster Node [akka.tcp://[email protected]:64640] - Receiving gossip from [UniqueAddress: (akka.tcp://[email protected]:9000, 1028805500)]

Received gossip where this member has been downed, from [akka.tcp://[email protected]:9000]

Cluster Node [akka.tcp://[email protected]:64640] - Node has been marked as DOWN. Shutting down myself
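
The dead-letter log lines above name two settings, `akka.log-dead-letters` and `akka.log-dead-letters-during-shutdown`, and the gating message reports a 5000 ms gate. A minimal HOCON sketch of where these would be set is below. The values are illustrative assumptions, not the configuration used in this cluster, and `akka.remote.retry-gate-closed-for` is assumed to be the setting behind the 5000 ms gate.

```hocon
akka {
  # Illustrative values only (assumptions, not the reporter's configuration).
  # Both keys are named in the dead-letter log messages above.
  log-dead-letters = 10
  log-dead-letters-during-shutdown = off

  remote {
    # Assumed to be the setting behind "address is now gated for 5000 ms".
    retry-gate-closed-for = 5 s
  }
}
```

These settings would most likely only change log noise and reconnect timing; they are not expected to stop the nodes from being marked as DOWN.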

To Reproduce
If needed, I will provide it in a small project.
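
One possible way to surface the same serialization failure in a small reproduction might be Akka.NET's testing-only `akka.actor.serialize-messages` switch, which asks the actor system to serialize and deserialize messages even for local sends. This is an assumption about how a repro could be set up, not something taken from the project in this report, and it presumes Hyperion is bound as the serializer for the message type involved.

```hocon
akka {
  actor {
    # Testing-only switch (assumed usage for a repro, not from the original report):
    # forces messages through serialization on local sends as well, so a
    # non-serializable field would fail without needing a second node.
    serialize-messages = on
  }
}
```

This only helps locate the offending message type; it does not reproduce the cluster-level shutdown described above.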

Expected behavior
Errors that occurred in node B should not shut down node A.

Actual behavior
Node A also shuts down with "Shutting down myself".

Environment
I am running on Windows with .NET 7.

ingted commented Mar 3, 2024

This time it is different from #2903.
Now the disassociation causes both nodes to shut themselves down...

image

image

ingted commented Mar 3, 2024

The error itself is expected, and we can certainly avoid triggering it... anyway... T_T|||

Aaronontheweb added this to the 1.5.18 milestone on Mar 5, 2024
Aaronontheweb modified the milestones: 1.5.18, 1.5.19 on Mar 12, 2024
Aaronontheweb modified the milestones: 1.5.19, 1.5.20 on Apr 15, 2024
Aaronontheweb modified the milestones: 1.5.20, 1.5.21 on Apr 29, 2024
Aaronontheweb modified the milestones: 1.5.21, 1.5.22, 1.5.23 on May 28, 2024
Aaronontheweb modified the milestones: 1.5.23, 1.5.24, 1.5.25 on Jun 6, 2024
Aaronontheweb modified the milestones: 1.5.25, 1.5.26 on Jun 14, 2024
Aaronontheweb modified the milestones: 1.5.26, 1.5.27 on Jun 27, 2024
Aaronontheweb modified the milestones: 1.5.27, 1.5.28 on Jul 25, 2024
Aaronontheweb modified the milestones: 1.5.28, 1.5.29 on Sep 4, 2024
Aaronontheweb modified the milestones: 1.5.29, 1.5.30 on Oct 1, 2024
Aaronontheweb added this to the 1.5.31 milestone on Oct 4, 2024
Aaronontheweb modified the milestones: 1.5.31, 1.5.32 on Nov 14, 2024
Aaronontheweb modified the milestones: 1.5.32, 1.5.33 on Dec 4, 2024
Aaronontheweb modified the milestones: 1.5.33, 1.5.34 on Dec 24, 2024
Aaronontheweb modified the milestones: 1.5.34, 1.5.35 on Jan 7, 2025