[00:00] <wgrant> That is the case.
[00:00] <wgrant> So that makes complete sense.
[00:00] <wgrant> The question is whether your case is the same.
[00:01] <wgrant> It's the same cycle, but is it also caused by the object getting lost due to the reset.
[00:01] <cjwatson> Does http://paste.ubuntu.com/14059967/ help?
[00:01] <wgrant> It's a set.
[00:01] <wgrant> It certainly shouldn't.
[00:01] <cjwatson> Oh, so it is
[00:02] <wgrant> In the garbage runs there is no case in which unhook is called from within MutableValueVariable.
[00:02] <wgrant> Even if it were only added once, it would still leak.
[00:02]  * wgrant looks for any reset hooks.
[00:03] <wgrant> Ah
[00:03] <wgrant> There are none.
[00:03] <cjwatson> invalidate probably won't cause this, since it does _mark_autoreload
[00:03] <wgrant> Right, exactly.
[00:03] <wgrant> abort does invalid, which does AutoReload.
[00:03] <wgrant> Which is why this normally works.
[00:03] <wgrant> We reset between tests, but normally abort or commit first.
[00:04] <wgrant> Look at Store.reset -- it just clears the dicts.
[00:04] <cjwatson> Yeah
[00:04] <cjwatson> With a "beware" comment at the top
[00:04] <wgrant> Hm, but the object itself is presumably destroyed.
[00:04] <wgrant> Its destructor could tell its variables to GTFO, presumably.
[00:04] <wgrant> But ew __del__
[00:05] <wgrant> I wonder how slow it would be to fire hooks on them all as they are reset...
[00:05] <cjwatson> I was wondering whether the tp_traverse implementation might be incomplete
[00:06] <wgrant> Hmm.
[00:07] <wgrant> Isn't the cycle legitimate?
[00:07] <wgrant> Or does PostgresConnection not refer back to the event_system?
[00:08] <wgrant> What would the tp_traverse bug be?
[00:09] <cjwatson> Missing MutableVariableValue._event_system, possibly.  Just trying without C extensions to narrow it down
[00:09] <wgrant> Ah, possibly.
[00:10] <cjwatson> OK, still garbage from that, so probably a red herring.
[00:15] <wgrant> (Pdb) p gc.get_referents(gc.garbage[0])[0]['_event'] is gc.get_referrers(gc.garbage[0])[1]['_event']
[00:15] <wgrant> True
[00:15] <wgrant> gc.garbage[0] is the PostgresConnection
[00:15] <wgrant> So looks legit to me.
[00:15] <wgrant> Unless we can remove a reference or use a weakref, it's uncollectable.
[00:16] <wgrant> Hm, can we weakref the hook, I wonder.
[00:16] <wgrant> Presumably if the object is still known by the store, its variable will still be alive.
[00:16] <wgrant> We only need to detect changes if we're possibly going to flush them to the store, so if we're no longer in the store it's irrelevant.
[00:16] <wgrant> Right?
[00:16] <cjwatson> Yeah, and quite a few other hooks are weakreffed
[00:16] <wgrant> Oh are they
[00:17] <wgrant> So they are
[00:17] <cjwatson> Well
[00:17] <cjwatson> Their args are
[00:17] <cjwatson> Maybe not a perfect example
[00:17] <wgrant> Hmmm actually.
[00:18] <wgrant> This may not be valid.
[00:18] <wgrant> What if I have the object, query 10 million other objects so it's evicted from Storm's cache, then change the mutable value, delete my reference, then flush.
[00:19] <cjwatson> It's a little hard to see how to get away without a reset hook.
[00:20] <wgrant> Heh and even the reset hook isn't sufficient here, I don't think.
[00:20] <wgrant> Oh, unless it keeps weakrefs to even evicted objects.
[00:20] <wgrant> In which case it would.
[00:20] <wgrant> s/unless it/unless the Store/
[00:20] <cjwatson> Cache eviction could call a hook too.
[00:20] <cjwatson> But starts to sound rather slow.
[00:21] <wgrant> It'd also be a weird hook
[00:21] <wgrant> Basically a "leak pls" hook.
[00:21] <cjwatson> Or a "make changes not work" hook
[00:21] <wgrant> Because we can't unhook _detect_changes just because the object is evicted; it may be alive elsewhere.
[00:21] <cjwatson> Yeah
[00:22] <wgrant> Ah
[00:22] <wgrant> _alive is a WeakValueDictionary.
[00:22] <wgrant> _cache is separate.
[00:22] <wgrant> So a reset hook may work.
[00:23] <wgrant> It also can't be any slower than an abort.
[00:23] <cjwatson> I have to go to sleep, so will pick this up in the morning if you haven't figured it out by then ...
[00:23] <wgrant> Oh of course it needs _alive. Even if it's evicted from the cache it has to use the same object if it's reretrieved by key.
[00:23] <wgrant> I might end up wrapping our stores, will see.
[00:24] <wgrant> Or just replacing all our reset calls, short-term.
[00:24] <cjwatson> The possible slowness would be in having to go and look for hooks on ten million objects.
[00:24] <wgrant> Right, but abort and commit already do that.
[00:24] <wgrant> And we already do that for every test.
[00:24] <cjwatson> Outlawing reset would probably be a reasonable stopgap.
[00:24] <wgrant> It's terrible, but already done.
[00:25] <cjwatson> Well, outlawing it without an immediately previous abort/commit
[00:26] <cjwatson> There aren't too many of those
[00:26] <cjwatson> Anyway, sleeeeep
[00:27] <wgrant> Yup,.
[00:27] <wgrant> Night, thanks for investigating.
[00:58] <wgrant> Trying to work out why I need that Milestone construction in there.
[00:58] <wgrant> The test doesn't leave garbage unless I've created something else referring to that object since the last commit.
[01:03] <wgrant> http://pastebin.ubuntu.com/14060983/ on top of trunk is minimal so far.
[05:35] <wgrant> cjwatson: I suspect that http://paste.ubuntu.com/14071248/ will be sufficient.
[05:35] <wgrant> It turns out that object-deleted often doesn't get called at all, because it's only invoked by a weakref callback.
[05:35] <wgrant> And the objects can sometimes outlive the weakref callbacks.
[05:35] <wgrant> s/weakref callbacks/weakrefs/
[05:36] <wgrant> (obviously the callback would never be invoked anyway in the uncollectable circular case, but removing the hooks also seems to eliminate at least one case of uncollectability. so hopefully it's enough)
[11:11] <cjwatson> wgrant: I'm trying a full test run with that patch now.  Shall I land it if it checks out?
[11:11] <wgrant> cjwatson: Or even before it checks out, if you're so inclined.
[11:11] <wgrant> The Storm test suite passes with it.
[11:11] <wgrant> And it's not obviously terribly wrong.
[11:12] <cjwatson> Can you commit it to our Storm branch and I'll sort out an sdist build and an LP landing?
[11:12] <cjwatson> I agree it seems at least plausible.
[11:13] <wgrant> We may need to detach a couple of extra hooks in the deletion hook, but this fixes at least one case.
[11:13] <wgrant> If it goes well I'll also merge storm trunk tomorrow, as it fixes that restricted librarian "Processing failed" thing that we run into occasionally.
[11:16] <cjwatson> Oh?  What was the change there?
[11:16] <cjwatson> r466?
[11:19] <wgrant> If that's the thing about making "cannot send data" ish a DisconnectionError, yes.
[11:19] <cjwatson> Yeah, that
[11:19] <wgrant> Pushed up to revision 408.
[11:19] <wgrant> 3 tags updated.
[11:19] <wgrant> oh no bzr what did you do
[11:20] <cjwatson> wgrant: http://paste.ubuntu.com/14072599/ if that helps
[11:20] <wgrant> I was hoping you'd say that.
[11:21] <wgrant> 0.20                 free.ekanayaka@canonical.com-20130628093022-ztq82k68kepzznxx
[11:21] <cjwatson> Be glad I didn't reflexively "bzr up" first.
[11:21] <wgrant> And two lds tags
[11:21] <wgrant> Hm.
[11:21] <wgrant> I wonder if I tried a merge into it at some point and it remembered the tags.
[11:21] <wgrant> Harmless, anyway.
[11:21] <cjwatson> Presumably those are ghosts with respect to our branch, but in the same repository.
[14:41] <cjwatson> Total: 19377 tests, 0 failures, 0 errors in 195 minutes 26.711 seconds.
[14:41] <cjwatson> good sign
[16:16] <cjwatson> Yay, buildbot happy again.
[16:16] <cjwatson> wgrant: thanks!
[21:47] <wgrant> Oh good, buildbot works.
[22:02] <cjwatson> Something has to (probably).
[22:02] <wgrant> Indeed...