-
Notifications
You must be signed in to change notification settings - Fork 894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose API to clear offline persistence (in IndexedDB). #449
Comments
There is not currently (other than poking into IndexedDB directly to delete the data). |
Can we make it a feature request then? |
I edited it into a feature request. Feel free to add more detail if you have more specific use cases in mind, etc. |
I also need this. When a user logs out we need to clear local caches of private data in our application. |
For any given instance of Firestore we have two kinds of data:
We could potentially clear either or both of these. If user makes a change but signs out before the client has finished sending it currently they'd end up with a persistent pending write, which we'd resume when that specific user signed in again. Are you intending that this kind of data would be cleared as well? |
At logoff time we'd want to know that there are pending writes. If online we'd wait for them to complete before logging off, then we'd delete cached documents. If offline we'd warn the user and if the user chooses to continue with logging off, we'd delete the pending writes and the cached documents. |
Same thing in my app, I'd just as a user if he wants to discard changes or not. Probably the best solution would be to expose all of these things separately:
Last two can be of course merged into one method with some options. |
Makes sense. We've discussed adding some way to detect if there are pending writes or wait to flush pending writes in order to make Separately we're working on multi-tab support for offline persistence such that active tabs can see each other's pending writes, even while disconnected. We'll need to investigate how this intersects with that work. If signing out in one tab doesn't sign out the others then nuking and paving IndexedDb out from under the other tabs isn't going to end well. |
I think that this could also lead to other issues. Let's say that we have an app with offline persistence. User A logs in, and fetches her/his profile from the database. We have security rules: match /profiles/{uid} { allow read: if request.auth.uid == uid }. The problem is that if User A logs out, User B will have access to all local data of that user. Whenever there's local data from another user, security rules wont work because the first response always comes from all cached data. This could lead to data breaches or wrong results while being offline. |
I repeat this advice in every discussion related to this but I want to be clear: if you’re at all sensitive to the disclosure of cached information within the same operating system user account: do not enable persistence. The Firestore client that exists today assumes that operating system user == actual user. Protecting within that boundary is a hard problem that we have not attempted to solve. We don’t believe clearing IndexedDB on sign-out is sufficient to implement this: it’s trivially defeated by a user forgetting to do it, the browser crashing, or an attacker installing a browser extension to read the contents of the database while another user is signed in. Clearing IndexedDB is still useful, especially for testing, and we are hoping to add this feature (PRs welcome, of course). However for sensitive applications like you describe we’re implementing an LRU aspect to our in-memory cache. This will improve perceived performance to be on par with having persistence enabled without requiring the user to remember to sign out or any of the other risks of disclosure. |
@wilhuff I agree with you in terms of sensitive data. However, I see this issue has other implications. In my opinion, it's a good idea to make cache data specific to each user. Clearing IndexedDB seems to be a simple solution to fix all the issues mentioned here as well. |
If two users use the same device with different operating system accounts, everything works as intended: separate browser caches and operating system permissions on them isolate the users. In this scenario clearing LevelDB doesn't help (or hurt). If two users use the same device using the same operating system account/browser then separating the cache data within LevelDB doesn't actually protect you: it's trivial to get at the other user's data within the same IndexedDB. If you're willing to work around this by clearing the IndexedDB data in between sessions, just don't write it: disable persistence. |
Just FYI-
This won't actually be the case for most apps. Since security rules do not act as filters (they either allow or deny a query in its entirety), as long as your app doesn't perform broken queries (that will be rejected by the backend due to security rules), users won't see any locally-cached data that they don't have access to... I repeat Gil's warning that you can't rely on this to protect sensitive data (you should disable persistence instead), but you can expect your app to behave offline just fine even if multiple users have used the app, resulting in various locally-cached data... in fact, sharing the cache improves the offline experience and general performance of the app since there's a greater chance that data requested by the app will already be available locally. |
@mikelehen @wilhuff Thanks for your feedback. |
+1 |
We need this feature since we need to be able to explicitly remove any data stored on the device by our PWA. Our data is allowed to survive several logins by several users, but at some point we need to clear everything (for all users) if any user explicitly requests this. Our devices (usually laptops) are used at trusted locations by trusted users, but day by day the trusted locations and trusted users might change. |
We have been running on Firestore for about one year now and are seeing massive performance issues related to the IndexedDB storing large cached data sets and therefor urgently need this feature. |
Just keep reading this many times..
This look dangerous to us! According to our usage for SDK, we heavily use cached data and most of time load only updated docs according to FirestoreServerStampTime.. so clearing old docs without notification will make problem to us... thanks |
I'm 99.9% sure they wouldn't clear documents where |
Correct. We wouldn't clear documents with unsynced changes. The idea is that we'd clear documents that you previously listened to but which you haven't listened to recently. And we wouldn't clear documents unless your total cache size had exceeded some semi-large threshold (e.g. 10MB of document data, though this is TBD). |
(((Please))) take in your consideration to make this (auto clear method) optional; when we define the firestore instance.Our case; -making this (auto clear method) without option will threat us to drop our project on Firestore. @mikelehen What really we need is to have a new feature to clear a specific collection on cache..
THANKS |
@mohshraim When you say "get the maxStampTime of cached docs" and "combine it with local cached data", how are you getting the local cached data? If you're doing a get() or onSnapshot() or whatever, then that will bump the "last used" time of the data, and so the SDK will try not to clear it. How big is your cache? 1k items is probably plenty small so you wouldn't be affected by garbage collection at all. As for whether it's optional, we're still looking into what controls to expose. cc/ @gsoltis |
Sorry for long reply! I use
Thanks for clearing this. This will be more safe on main definitions like items and patients, but not on the other collections like (patientVisit) that we call it when user want to view patient history. We have developed huge library Above the Amazing Firestore SDK. we test it with full offline environment for several hours with heavy operations and its working smoothly... Firestore is the best Cloud Service with FullOffline 🥇 🥇 🥇 (Currently); so adding this feature without optional control will remove Firestore from what many developers looking for, like us. 😢 Note we was working on PouchDB before decide moving to Firestore. and we reach 40MB of cached data with good performance also, but not that good features as Firebase. Please make it optional! 🙏and maybe export the clearCache function on client SDK so we call when we need to. Thanks |
So we have clients who are heavily affected by this issue. We cleared indexeddb yesterday at a client, where the size was 25MB. The app was starting to be extremely sluggish (we're talking 10-15 seconds between clicks), but became responsive again after. We need information on the structure inside indexeddb, because we HAVE to start clearing out data manually. We basically only need to clear one collection (order collection) where pending writes is false. |
Here's an idea in the mean time: How about supporting a persistence mode where the entity being saved is only persisted while |
This will likely cause size problems:
It seems that documents that have been deleted to stay in indexeddb, but flagged as Can anyone shed light on why this logic is required? Why reference documents that doesn't exist anymore? |
Alright, so I ended up with this.
It should be said that this is built to clear subcollections by name, and not root collections. Example:
Note that the function does NOT clear queued writes (they are stored in a different data-store called 'mutations'). So even if the above cleared all orders locally, any changes made to orders would not get lost. It's been tested in various offline/online scenarios, but I welcome more tests, and general feedback in case I messed something up. WARNING |
Hey @larssn I hope that @mikelehen have time to audit your method code to feed us if it will make any hidden bug. And if firestore team can enhance it and adopt it in SDK -that's will be great- Thanks |
I'm afraid I really can't endorse this code or code like it. The original version that cleared entries from Additionally, making changes to the persisted cache while the SDK is running is dangerous and may break assumptions in the SDK, leading to undefined behavior. Finally, it's really not safe to write code that makes any assumptions about the schema used by the SDK. The schema is an internal implementation detail and subject to change from one release to the next, breaking any code you've written. So you would need to re-audit your code every time you upgrade. The chance for bugs leading to cache corruption is high. The closest thing to an approach I can endorse would be to delete the entire cache, before initializing the SDK. This is guaranteed to be safe, since it's equivalent to just running your app on a fresh device. I'm sorry we don't have a better supported option at this time, but we're actively working on garbage collection for the cache which will prevent it from growing beyond a specified threshold. And further cache controls are on the roadmap. For now, any direct modification to the SDK's persisted data is done at your own risk and may break at any time. Sorry! |
@mikelehen Thanks for taking the time to write all that. What you're writing makes sense, and you're right: it's a high risk solution; but seeing the performance of IndexedDB, we couldn't sit on our hands any longer. Due to the fact that we're writing to a live cache (albeit in a transaction), we had to take out the deletion of We actually haven't observed any cache corruption yet. After a cache cleaning, and then reloading the app, your SDK seems to correctly get the missing entities; but thanks for your feedback, I'll keep an eye on it. Is there any milestone for the garbage collection? It's been talked about for a long time now. |
Here's the version that deletes from
This function should not be run while firestore is initialized since it can cause internal assertion errors that breaks the DB layer completely. So you should either: Again, use at own risk. |
Thanks @larssn! As for the timeline for garbage collection, I can't communicate any specific timeline, but it's making good progress. The bulk of it has been implemented on iOS (firebase/firebase-ios-sdk#1600) and is currently being ported to web. That said, we still need to define and expose the API that actually enables it which will take some additional time (all new APIs have to go through some extra process before being exposed). So no concrete timeline, but I promise it's coming. :-) |
@mikelehen Thank you. We'll hold our breath for now :) |
@mikelehen apologies for my confusion as a noob, but by saying users won't see any locally-cached data that they don't have access to- do you mean access restricted by security rules? Because I am seeing cached data returned despite data being restricted by security rule (and error callback called as well). I tried two scenarios- Scenario 1:
Scenario 2:
So I am only confused with the quoted part.
|
@asifbd Can you share your security rules that should be restricting access? I'm confused why your listener error handler isn't getting called if the user doesn't have access to all of the documents under |
@mikelehen listener error handlers do get called, in both scenarios. But at the same time, cached data is also returned in snapshot.
My security rules involve several custom functions, just pasting the relevant portion-
|
@asifbd Ahh, okay. Then that is behaving as expected. Security rules don't act as filters, so in order to prevent your listener from being canceled (and the error callback called), you'll need to adjust the query you're listening to so that it passes your security rules. Once you do that, the results of the query (from cache) will presumably no longer include data that the user doesn't have access to. Note that I'm not saying that this means your app is more secure. A malicious user could drop into the JS console and manually do queries to access whatever data is in the cache. But in terms of app correctness (showing the user only their own documents), the queries you issue have to be allowed by your security rules anyway (i.e. the user must have access to perform the query), and so your app should behave correctly and only show users data that they have access to. |
@mikelehen Any news on this ? We're also starting to see our app slowing down gradually as time goes by... Just deleting the indexedDB shows real improvement on queries performance but this is not a viable solution. Thanks |
Thanks @mikelehen . I'm specifically not looking for an API to clear persistence... Anything that can fix the performance problem should be good ! I know you can't say anything about timeline but... Are we days, months, years from getting this garbage collection feature ? |
@sambegin LRU Garbage collection is coming very soon. Pretty much all of the groundwork is in place, it just needs to be wired together. I would expect maybe one or two more PRs then it will be ready! |
GC is now available |
|
Please expose a way to clear offline persistence (in IndexedDB). It would be useful to clean up when the user signs out.
The text was updated successfully, but these errors were encountered: