Description
Hi all! I spent some time debugging this surprising behavior of ndb's in-memory cache today. I don't know if it's intentional or not, but it definitely caught me off guard, and I suspect it's not what you want. When I put an entity, then modify it in memory without putting it again, then get it, the returned entity includes the modifications that were never put.
Example code below. I expect this happens because the in-memory cache stores and returns the exact entities it's given. The impact here is also probably not huge, since contexts are generally per request or otherwise local and short-lived, and this usage pattern is probably unusual. Still though, it's a nasty surprise. Should the cache maybe make copies of entities on both put and get to prevent this? It'd add overhead, but I wonder if it would be worth it to avoid subtle, difficult-to-debug bugs.
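To make the copy-on-put-and-get idea concrete, here's a rough sketch in plain Python. It is not ndb's actual cache code: AliasingCache, CopyingCache, and demo are hypothetical names made up for illustration, a dict stands in for the entity, and it assumes deep copies would be acceptable.

```python
import copy


class AliasingCache:
    """Toy cache that stores and returns the exact objects it is given."""

    def __init__(self):
        self._entries = {}

    def put(self, key, entity):
        self._entries[key] = entity

    def get(self, key):
        return self._entries.get(key)


class CopyingCache(AliasingCache):
    """Same toy cache, but it deep-copies on the way in and on the way out."""

    def put(self, key, entity):
        super().put(key, copy.deepcopy(entity))

    def get(self, key):
        return copy.deepcopy(super().get(key))


def demo(cache):
    entity = {"a": "asdf"}      # stand-in for Foo(id='x', a='asdf')
    cache.put("x", entity)      # stand-in for f.put()
    entity["a"] = "qwert"       # modified in memory, never put again
    return cache.get("x")["a"]  # stand-in for Foo.get_by_id('x').a


print(demo(AliasingCache()))  # qwert: the unstored change leaks through
print(demo(CopyingCache()))   # asdf: the cached copy is isolated
```

The copy on put protects the cache from later changes to the caller's object, and the copy on get protects it from changes to whatever it hands back. The cost is two copies per cached entity.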
Environment details
MacOS and Linux, Python 3.9.16, ndb 2.1.1. Other libs:
google-api-core 2.11.0
google-auth 2.15.0
google-cloud-appengine-logging 1.3.0
google-cloud-audit-log 0.2.5
google-cloud-core 2.3.2
google-cloud-datastore 2.11.0
google-cloud-error-reporting 1.9.1
google-cloud-logging 3.5.0
google-cloud-ndb 2.1.1
google-cloud-tasks 2.13.1
googleapis-common-protos 1.59.0
Steps to reproduce
- Store an entity
- Modify the entity
- Don't put it again
- get the entity
- It has the unstored modifications
Code example
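from google.cloud.ndb import Client, Model, StringProperty

class Foo(Model):
    a = StringProperty()

with Client().context():
    f = Foo(id='x', a='asdf')
    f.put()
    print(id(f))
    f.a = 'qwert'  # modified in memory, never put again
    got = Foo.get_by_id('x')  # comes back from the in-memory cache
    print(got)
    print(id(got))

The output shows that got includes the modification of a = 'qwert' that was never written to the datastore, and that it's the same object in memory as f: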
4393984640
Foo(key=Key('Foo', 'x'), a='qwert')
4393984640
If I add use_cache=False to the get, or call context.clear_cache() before it, I get the entity as it was stored, with a = 'asdf' and without the unstored modification.
Stack trace
N/A