[10:27] <wgrant> cjwatson: I guess those PostgresConnection leaks aren't a coincidence :/
[10:29] <cjwatson> wgrant: Yeah, although if I'm not mistaken I was seeing them before my psycopg2 upgrade as well
[10:30] <cjwatson> wgrant: Indeed, builds 897 and 898 both have a bunch
[10:30] <cjwatson> r17870 and r17871 respectively
[10:31] <wgrant> Yeah, they happened occasionally.
[10:31] <wgrant> But two in a row straight after an upgrade is slightly worrying.
[10:31] <wgrant> Hopefully just a coincidence.
[10:31]  * cjwatson tries forcing once more for luck
[10:57] <cjwatson> LayerIsolationError: Test left uncollectable garbage
[10:57] <cjwatson> [<storm.databases.postgres.PostgresConnection object at 0x15d376ec>] (referenced from [(<storm.databases.postgres.PostgresConnection object at 0x15d376ec>,), [<storm.databases.postgres.PostgresConnection object at 0x15d376ec>], {'_connection': <storm.databases.postgres.PostgresConnection object at 0x15d376ec>, '_order': {}, '_dirty': {}, '_implicit_flush_block_count': 0, '_sequence': 0, '_event': <storm.variables.EventSystem object at ...
[10:57] <cjwatson> isn't that saying that it's referenced from itself?
[10:57] <cjwatson> ... 0x15d3790c>, '__provides__': <zope.interface.Provides object at 0x1132a30c>, '_lp_store_initialized': True, '_register_for_txn': False, '_cache': <storm.cache.GenerationalCache object at 0xe000d6c>, '_database': <lp.services.webapp.adapter.LaunchpadDatabase object at 0x1132e74c>, '_alive': <WeakValueDictionary at 391910156>}])
[10:57] <wgrant> cjwatson: Referenced from a tuple, a list, and a dict.
[10:59] <cjwatson> oh right
[11:34] <cjwatson> wgrant: no luck reproducing this locally so far
[12:03] <wgrant> cjwatson: Hm, you've run a big chunk of the tests?
[12:29] <cjwatson> I tried running one test that happened to fail a couple of hundred times, and tried running lp.blueprints.browser.  I'll try a bigger chunk
[12:52] <cjwatson> gc.get_referents(*gc.garbage) is returning [{'_closed': True, '_database': <lp.services.webapp.adapter.LaunchpadDatabase object at 0x1107174c>, '_event': <storm.variables.EventSystem object at 0x172c95ac>, '_raw_co
[12:52] <cjwatson> nnection': None}, <class 'storm.databases.postgres.PostgresConnection'>]
[13:01] <cjwatson> so the cycle could either be via an event hook, or something in LaunchpadDatabase (but neither that nor any of its superclasses hold any non-trivial references that I can see)
[16:21] <cjwatson> ... ok, that did reproduce locally, once in the entire test suite.  now to try the whole thing again with -D :-/
[16:21] <cjwatson> (and hopefully that won't perturb it away)
[22:19] <wgrant> cjwatson: Ah, I apparently actually dug into a reproducible version of this in my Product.access_policies changes a few months ago. A MutableValueVariable held a reference through _event_system, but I changed the approach used in the code rather than debugging all the way.
[22:25] <cjwatson> Curious, since MVV's hooks don't seem to do anything that should produce circularity
[22:26] <cjwatson> I mean, it only adds methods with zero args as hooks
[22:27] <cjwatson> At some point I may manage to get lucky and catch this in pdb.
[22:30] <wgrant> cjwatson: If you want to bet that the two PostgresConnection leaks aren't unrelated, it may be worth reinstating my broken garbo job and poking at that test.
[22:30] <wgrant> "∘ product-aps-set broke buildbot spectacularly.
[22:30] <wgrant> ‣ garbage: {{{[<storm.databases.postgres.PostgresConnection object at 0x15d673ec>]}}}
[22:30] <wgrant> "
[22:31] <wgrant> cjwatson: http://paste.ubuntu.com/14057795/ reproduced it consistently.
[22:31] <cjwatson> OK.  I'm running --layer=DatabaseFunctionalLayer at the moment with pdb.set_trace() inserted just before the exception; if that doesn't catch it (or maybe even if it does) then I'll try that.
[22:31] <wgrant> Specifically setting the access_policies list.
[22:32] <cjwatson> Though it would certainly be nice to have a shorter and more reliable reproducer, so yes ...
[22:32] <cjwatson> Won't get to it before tomorrow morning though
[22:32] <wgrant> Well, the problem is that we don't know if they're actually related.
[22:32] <cjwatson> Indeed.
[22:32] <wgrant> And we won't know until the garbo one is fixed and buildbot passes :/
[22:33] <wgrant> Feel free to hand over any notes you have and I'll continue poking.
[22:33] <cjwatson> I don't yet have meaningful notes other than what I've mentioned here, since my reproducer was so long and unreliable.
[22:34] <cjwatson> Just running with http://paste.ubuntu.com/14057865/
[22:34] <cjwatson> (But I'm sure you could have figured that much out)
[22:34] <cjwatson> I've been multitasking with working on getting my git-build-recipe tests passing
[22:35] <cjwatson> Currently at 119/135 ...
[22:44] <wgrant> cjwatson: 119/135 tests complete, or 119/135 runs succeeded/
[23:24] <cjwatson> wgrant: 119/135 tests succeeded
[23:25] <cjwatson>  lp.registry.tests.test_distro_webservice.TestGetBranchTips.test_same_results> /home/cjwatson/src/canonical/launchpad/lp-branches/more-layer-isolation-info/lib/lp/testing/layers.py(473)testTearDown()
[23:25] <cjwatson> -> BaseLayer.flagTestIsolationFailure(
[23:25] <cjwatson> oho
[23:25] <wgrant> really
[23:25] <wgrant> It's an interesting test to break on. It's a slightly weird method.
[23:26] <cjwatson> Hesitant to ascribe too much significance to the particular test.
[23:26] <wgrant> Indeed.
[23:28] <cjwatson> (Pdb) p obj['_event']._hooks
[23:28] <cjwatson> {'register-transaction': set([]), 'flush': set([(<bound method JSONVariable._detect_changes of <storm.variables.JSONVariable object at 0x182407ec>>, ())])}
[23:29] <wgrant> Oho
[23:29] <wgrant> Very relevant, then.
[23:29] <cjwatson> So *could* be, but how's that circular
[23:29] <cjwatson> I guess that bound method refers to that variable
[23:29] <wgrant> Yep.
[23:29] <wgrant> Bound methods hold a reference to their object, funnily enough.
[23:30] <wgrant> But I'm not sure why this wouldn't happen all the time, unless it sometimes gets unregistered.
[23:30] <cjwatson> (Pdb) gc.get_referents(meth.im_self)
[23:30] <cjwatson> [<storm.variables.EventSystem object at 0x1b102f0c>, <class 'storm.variables.JSONVariable'>, {u'include_long_descriptions': True, u'backports_not_automatic': False}, Undef, (Undef, u'{"include_long_descriptions": true, "backports_not_automatic": false}'), <storm.properties.PropertyColumn object at 0xdcb19cc>, <storm.variables.EventSystem object at 0x1e15914c>]
[23:31] <cjwatson> So I guess my publisher_options change could have tickled it?
[23:31] <cjwatson> Maybe made it more probable, at least, since DistroSeries is everywhere
[23:31] <wgrant> It is not impossible.
[23:31] <wgrant> I did think about that when reviewing it, but buildbot seemed happy...
[23:32] <cjwatson> The only things with __del__ are Result and Connection, unless there's something inside psycopg2
[23:34] <wgrant> Right, but it's probably something like PostgresConnection holding a reference to the event_system so it can fire events on commit or similar.
[23:35] <cjwatson> There'd have to be a reference back to the connection somewhere for it to be relevantly cyclic, unless I misunderstand the gc
[23:36] <wgrant> Indeed, but the variable or event_system probably knows the store which knows the connection, or something like that.
[23:36] <cjwatson> ah
[23:36] <cjwatson> (Pdb) gc.get_referents(meth.im_self)[6]._hooks
[23:36] <cjwatson> {'stop-tracking-changes': set([(<bound method JSONVariable._stop_tracking of <storm.variables.JSONVariable object at 0x182407ec>>, ())]), 'changed': set([(<bound method Store._variable_changed of <storm.store.Store object at 0x1b10294c>>, ()), (<bound method Relation._break_on_local_diverged of <storm.references.Relation object at 0x15e3b66c>>, ({<storm.references.Relation object at 0x126da46c>: {}, 'sequence': 1, 'primary_vars': ...
[23:37] <cjwatson> ... (<storm.variables.IntVariable object at 0x19a5487c>,), <storm.references.Relation object at 0x1237eacc>: {}, 'invalidated': True, <storm.references.Relation object at 0x126da74c>: {}, 'store': <storm.store.Store object at 0x1b10294c>, <storm.references.Relation object at 0x121ecb8c>: {}},))]), 'start-tracking-changes': set([(<bound method JSONVariable._start_tracking of <storm.variables.JSONVariable object at 0x182407ec>>, ())]), ...
[23:37] <cjwatson> ... 'object-deleted': set([(<bound method JSONVariable._detect_changes_and_stop of <storm.variables.JSONVariable object at 0x182407ec>>, ())]), 'resolve-lazy-value': set([(<bound method Store._resolve_lazy_value of <storm.store.Store object at 0x1b10294c>>, ())]), 'flushed': set([])}
[23:37] <wgrant> Right.
[23:37] <cjwatson> there's the store
[23:37] <cjwatson> so these must normally be unhooked or nothing would work
[23:37] <wgrant> Yeah, maybe _stop_tracking isn't being called somewhere along the line.
[23:37]  * wgrant tries to revive the reproducible case.
[23:39] <wgrant> Oh I have vague memories of this.
[23:39] <wgrant> I could only reproduce it with a reset in some circumstances, I think.
[23:42] <cjwatson> I'm a bit suspicious of the object-deleted implementation in the C extensions.
[23:43] <cjwatson> That's at least sometimes how _stop_tracking is called.
[23:44] <wgrant> Confirmed that r17587 plus that test I pasted reliably generates garbage
[23:44] <cjwatson> This JSONVariable is only referred to by its bound methods
[23:45] <wgrant> Yep
[23:48] <cjwatson> I wonder if it's possible to end up with the same hook attached multiple times?
[23:48] <cjwatson> Via multiple MutableValueVariable.get calls
[23:51] <wgrant> It indeed seems to hook twice, hmmm.
[23:51] <wgrant> But then my unhook log isn't triggered at all.
[23:52] <wgrant> Oh, probably unhooked once in set, which I didn't log.
[23:53] <wgrant> cjwatson: Oh, it's a set anyway.
[23:53] <wgrant> Unhook is never called AFAICS.
[23:54] <cjwatson> It's not called via object-deleted -> _detect_changes_and_stop -> _stop_tracking?
[23:54] <wgrant> Nope.
[23:54] <wgrant> Logging more and tweaking the test to not leave garbage, to see how it's called normally.
[23:55] <wgrant> It is possible, of course, that it only leaks on reset.
[23:55] <wgrant> And it depends on test ordering.
[23:56] <cjwatson> reset (or invalidate) could cause multiple gets by clearing the cache
[23:56] <wgrant> Interesting.
[23:56] <wgrant> If I remove the reset at the end of the test, it actually gets unhooked by a new set() call.
[23:57] <wgrant> I guess it gets set to AutoReload when the transaction is aborted at the end of the test.