FIRListenerRegistration will failed while connected Router disconnected from Internet and then connect back. #2987
Comments
Hi @TIEmerald; could you elaborate on what you mean by "failed"? I tested out your project, and I think it's working as expected. (I tested on an emulator, and faked "pulling the router cable" by turning off the network connection on my Mac.) When the network is in a bad state, Firestore does indicate in the logs that there is a network error, but once the network state is restored, it should reconnect and continue working properly.
Nope... You cannot fake "pulling the router cable" by turning off the network connection on your Mac. You have to literally pull the router cable.
And "failed" means that if I change anything in the collection or document being listened to, our project won't notice it...
Also, "pulling the router cable" is meant to simulate an unstable Internet connection from the router.
@TIEmerald Thanks for the added information. The SDK should initiate reconnect attempts (subject to exponential backoff) once it detects the connection has been closed. If you pull the network cable, it may very well take longer for the client to be notified that the connection has been closed, and so recovery could take longer. But it should still reconnect and get appropriate events. Can you try waiting longer and see if the changes eventually arrive? Otherwise, can you enable logging and share the output? Thanks!
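The "reconnect attempts (subject to exponential backoff)" mentioned above can be sketched as follows. This is a minimal, generic illustration rather than the Firestore SDK's actual implementation: the function name `backoff_delays` and every default value (initial delay, growth factor, cap, jitter) are assumptions chosen for the example.

```python
import random


def backoff_delays(attempts, initial=1.0, factor=1.5, max_delay=60.0,
                   jitter=0.5, rng=None):
    """Return the wait (in seconds) before each reconnect attempt.

    All default values are illustrative assumptions, not the SDK's settings.
    """
    rng = rng or random.Random()
    delays = []
    base = initial
    for _ in range(attempts):
        # Randomize each wait by +/- (jitter * base) so that many clients
        # do not all retry at the same moment after an outage.
        spread = base * jitter
        delays.append(max(0.0, base + rng.uniform(-spread, spread)))
        # Grow the base delay geometrically, capped at max_delay.
        base = min(base * factor, max_delay)
    return delays


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays(6), start=1):
        print(f"attempt {attempt}: wait {delay:.2f}s")
```

Note that backoff only starts once the client has detected that the connection is closed, which is exactly the step that turns out to be slow in this issue.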
Hi @mikelehen, I have tested it one more time; here are the records: ( ̄∇ ̄) ~~ Then I unplugged and re-plugged the cable from the connected router around 9:35-9:36 am. After that, once my computer had reconnected to the Internet, I changed the value in the Firebase/Firestore database: The Firebase value changed immediately, but not the Firestore one... I waited until 9:41 am and it still had not changed... Even now it has not changed... And here are the logs I copied from the console:
@TIEmerald Thanks very much for the screenshots and logs. This makes it very clear what you're observing and it's definitely not what I'd expect. From the logs, the Firestore SDK hasn't been notified that the connection has failed and so we haven't initiated any retry attempts. I still suspect that if you wait long enough, it will eventually reconnect. So if you're willing, I'd appreciate it if you'd try waiting ~15 minutes and see if it ever recovers. Obviously this would be unacceptable performance but it would at least verify that we're dealing with a timeout issue rather than a full bug. In the meantime, I'll see if I can reproduce, maybe using the Network Link Conditioner in OSX. Unfortunately this is likely something that would need to be addressed in the gRPC layer (the underlying transport that we use to communicate with the backend), so it may not be something we can fix super easily (at least in the short-term)...
Thanks, @mikelehen, I appreciate your efforts. Also... I am pretty sure waiting ~15 minutes won't help here... While writing my last comment, I noticed that even at 9:51 am, which is around 15 minutes after my test, the Firestore listener still had not updated. To be honest, I am not sure whether we could reproduce it with Network Link Conditioner... But my workaround is: whenever I detect that the Internet connection is off, I recreate the same listener when the Internet is back.
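The workaround described above (recreate the listener when connectivity returns) can be modeled roughly as below. This is a language-neutral sketch in Python, not Firestore API code: `ListenerManager`, `attach`, and `on_connectivity_changed` are hypothetical names, and `attach` stands in for whatever call registers the listener and hands back a way to remove it.

```python
class ListenerManager:
    """Re-creates a listener each time connectivity goes offline -> online."""

    def __init__(self, attach):
        # `attach` (hypothetical) registers the listener and returns a
        # zero-argument function that removes it again.
        self._attach = attach
        self._remove = attach()
        self._online = True

    def on_connectivity_changed(self, online):
        # Only act on an offline -> online transition.
        came_back = online and not self._online
        self._online = online
        if came_back:
            # Drop the possibly stale listener and register a fresh one.
            self._remove()
            self._remove = self._attach()


if __name__ == "__main__":
    events = []

    def attach():
        events.append("attach")
        return lambda: events.append("remove")

    manager = ListenerManager(attach)
    manager.on_connectivity_changed(False)  # connection lost
    manager.on_connectivity_changed(True)   # connection restored
    print(events)
```

This approach depends entirely on the app noticing the outage. When the device stays connected to the router and only the router's uplink drops, the app may never see an offline transition, which matches the later report in this thread that the workaround did not fully resolve the problem.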
@TIEmerald Interesting. Could you do the following and collect the logs for me?
I am wondering if when you get data for the new listener, you also get the missing event for the first listener. It really shouldn't be possible for you to "miss" events and then have a new listener work correctly. So I'm wondering if the reason recreating the listener works for you is because it triggers us to notice the connection has died and perform reconnect logic. Meanwhile I will try to reproduce this myself today. Thanks!
@TIEmerald Good news. I was able to reproduce this using Network Link Conditioner. I will follow up with the gRPC team to see if I can get to the bottom of it. Thank you for the report. I’ll update here as I have news to share.
Cheers @mikelehen, looking forward to your updates~~ I didn't get a chance to collect the logs today... Let me see if I can find time later today. About the "because it triggers us to notice the connection has died and perform reconnect logic" part: there is actually one thing I found very interesting in our production application... Besides recreating the listener, I could also bring the listener back by calling a Cloud Function from the same device that updated the parent document of the collection I was listening to. That is, suppose I have a document D, which has a collection C like this: My application A has a listener L on the collection C. If I disconnect the router and connect it back, the listener L fails, which is the issue I was describing... After that, if I call a Cloud Function on the Firebase server from the same application A, and this Cloud Function updates a property on D, the failed listener L starts working again... Very strange... It might be helpful, so I'm leaving it here... Feel free to ignore it if it's too complicated... Thanks anyway.
Thanks for the added info. Don’t worry about collecting the logs. I think I have everything that I need since I can reproduce it now.
Quick question here: hi @mikelehen, would you mind telling me how you reproduced it with Network Link Conditioner? With the "100% Loss" profile? It seems I failed to reproduce it with Network Link Conditioner.
@TIEmerald What I've been doing is starting the listener, enabling 100% packet loss in Network Link Conditioner, waiting 2.5 minutes, and then disabling Network Link Conditioner to restore full network connectivity. When I do that, about 50% of the time the listener doesn't recover. I don't know how important it is to wait 2.5 minutes. But I tried 60, 90, 120, 150 seconds and 150 was the first time it reproduced for me. 🤷‍♂️
Hmm... strange... It might have something to do with low-level Internet communication... 🤷‍♂️ ... I might keep testing with my re-plug approach... It's kind of annoying that my temporary solution didn't fully resolve the issue our customers are facing... That could be because our Internet connection detection logic is not perfect, or something else, I don't know... I think I might try relying on the Firebase listener to work around this temporarily... Still, thanks a lot, mate~
Hi @mikelehen, we encountered a similar issue today while testing our application with Xcode. Scenario: Our device was connected to the router the whole time, and we didn't notice the device disconnecting from the Internet at all. We found that the Firestore listener failed in the middle of our testing, and then we tried to reset the listener manually. Logs: Captured Logs for Firestore Listener failed.txt This image will help you understand the scenario: (I grepped all lines with WatchStream)
Further experiment: I then experimented further with my simple project, using my re-plug method to simulate the issue.
@TIEmerald Thanks for the added info. I think that's probably expected. Removing the network cable results in packet loss and TCP streams can tolerate some amount of packet loss and still recover. So depending on how long and various timing aspects, the TCP stream may just recover (no stream error) or it may fail and close (resulting in a stream error). I've forwarded the info on this case to the gRPC team. I'm also looking into whether we can avoid it ourselves by enabling a keepalive in gRPC... which I believe will help detect failures much more quickly / predictably. I'll keep you updated.
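Why a keepalive makes failure detection "more quickly / predictably" can be illustrated with a toy timing model. This is a simplified sketch under stated assumptions, not gRPC's keepalive implementation: it assumes an otherwise idle stream, a network that silently drops every packet from `loss_start` onward, and no OS-level TCP timeouts. The function name and parameters are hypothetical.

```python
import math


def detection_time(loss_start, keepalive_interval=None, keepalive_timeout=None):
    """Return when the client notices a silently dead, idle connection.

    Times are in seconds since the stream was opened. Returns None when,
    in this simplified model, the failure is never noticed.
    """
    if keepalive_interval is None:
        # Idle stream without keepalive: the client sends nothing, so no
        # send ever fails and nothing signals that the peer is unreachable.
        return None
    # Pings go out at interval, 2 * interval, ... The first ping sent at or
    # after loss_start is dropped and its acknowledgement never arrives.
    ping_index = max(1, math.ceil(loss_start / keepalive_interval))
    first_lost_ping = ping_index * keepalive_interval
    # The client gives up once the ping has gone unanswered for the timeout.
    return first_lost_ping + keepalive_timeout


if __name__ == "__main__":
    print(detection_time(25))          # no keepalive
    print(detection_time(25, 10, 5))   # ping every 10 s, 5 s ack timeout
```

In this model the detection delay after the outage begins is bounded by the keepalive interval plus the timeout, whereas without keepalive there is no bound at all. Once the failure is detected, the normal reconnect logic can take over.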
This brings us to parity with Android (https://github.com/firebase/firebase-android-sdk/blob/47d41b9dc17cd95a7799f672f5cc14f1747642ec/firebase-firestore/src/main/java/com/google/firebase/firestore/remote/Datastore.java#L114) and addresses #2987.
*EDIT:* (thanks @paulb777) I should have said "will go out in a future iOS release" rather than "next". :-) I think this should be addressed by #3029 which will go out in the next iOS release. You can pick it up earlier by following the instructions at https://github.com/Firebase/firebase-ios-sdk#accessing-firebase-source-snapshots With that change, my listener recovered within a few seconds of turning off packet loss. It could take longer, depending on various timing of retry attempts, etc. But it should never take more than ~1 minute. Thanks again for reporting this and all the help to track it down.
Actually, we've already branched and are stabilizing the next release. This bug fix will be on the M49 release which will be the one after that.
Thanks very much @mikelehen, you are a legend. I will give it a try today.
Hi @mikelehen 😂 I cannot pick it up by following the instructions at https://github.com/Firebase/firebase-ios-sdk#accessing-firebase-source-snapshots. Would you mind helping me and telling me what's wrong here? Thanks
Unfortunately, there's a difference between subspecs of the That is, you want
[REQUIRED] Step 2: Describe your environment
[REQUIRED] Step 3: Describe the problem
As described in the title, the created listener (FIRListenerRegistration) fails if the router the device is connected to is disconnected from the Internet and then reconnected. (I mean the device stays connected to the router the whole time, but the router itself is disconnected from the Internet.)
Steps to reproduce:
Unplug the Internet cable from the router and then plug it back in...
Relevant Code:
Here is a very simple project I used for testing... I found that the Firebase listener doesn't have this issue, but the Firestore one does.
Testing Procject For Firestore.zip