wgrant | cjwatson: I guess those PostgresConnection leaks aren't a coincidence :/ | 10:27 |
---|---|---|
cjwatson | wgrant: Yeah, although if I'm not mistaken I was seeing them before my psycopg2 upgrade as well | 10:29 |
cjwatson | wgrant: Indeed, builds 897 and 898 both have a bunch | 10:30 |
cjwatson | r17870 and r17871 respectively | 10:30 |
wgrant | Yeah, they happened occasionally. | 10:31 |
wgrant | But two in a row straight after an upgrade is slightly worrying. | 10:31 |
wgrant | Hopefully just a coincidence. | 10:31 |
* cjwatson tries forcing once more for luck | 10:31 | |
cjwatson | LayerIsolationError: Test left uncollectable garbage | 10:57 |
cjwatson | [<storm.databases.postgres.PostgresConnection object at 0x15d376ec>] (referenced from [(<storm.databases.postgres.PostgresConnection object at 0x15d376ec>,), [<storm.databases.postgres.PostgresConnection object at 0x15d376ec>], {'_connection': <storm.databases.postgres.PostgresConnection object at 0x15d376ec>, '_order': {}, '_dirty': {}, '_implicit_flush_block_count': 0, '_sequence': 0, '_event': <storm.variables.EventSystem object at ... | 10:57 |
cjwatson | isn't that saying that it's referenced from itself? | 10:57 |
cjwatson | ... 0x15d3790c>, '__provides__': <zope.interface.Provides object at 0x1132a30c>, '_lp_store_initialized': True, '_register_for_txn': False, '_cache': <storm.cache.GenerationalCache object at 0xe000d6c>, '_database': <lp.services.webapp.adapter.LaunchpadDatabase object at 0x1132e74c>, '_alive': <WeakValueDictionary at 391910156>}]) | 10:57 |
wgrant | cjwatson: Referenced from a tuple, a list, and a dict. | 10:57 |
cjwatson | oh right | 10:59 |
cjwatson | wgrant: no luck reproducing this locally so far | 11:34 |
wgrant | cjwatson: Hm, you've run a big chunk of the tests? | 12:03 |
cjwatson | I tried running one test that happened to fail a couple of hundred times, and tried running lp.blueprints.browser. I'll try a bigger chunk | 12:29 |
cjwatson | gc.get_referents(*gc.garbage) is returning [{'_closed': True, '_database': <lp.services.webapp.adapter.LaunchpadDatabase object at 0x1107174c>, '_event': <storm.variables.EventSystem object at 0x172c95ac>, '_raw_co | 12:52 |
cjwatson | nnection': None}, <class 'storm.databases.postgres.PostgresConnection'>] | 12:52 |
cjwatson | so the cycle could either be via an event hook, or something in LaunchpadDatabase (but neither that nor any of its superclasses hold any non-trivial references that I can see) | 13:01 |
cjwatson | ... ok, that did reproduce locally, once in the entire test suite. now to try the whole thing again with -D :-/ | 16:21 |
cjwatson | (and hopefully that won't perturb it away) | 16:21 |
=== BradCrittenden is now known as bac | ||
wgrant | cjwatson: Ah, I apparently actually dug into a reproducible version of this in my Product.access_policies changes a few months ago. A MutableValueVariable held a reference through _event_system, but I changed the approach used in the code rather than debugging all the way. | 22:19 |
cjwatson | Curious, since MVV's hooks don't seem to do anything that should produce circularity | 22:25 |
cjwatson | I mean, it only adds methods with zero args as hooks | 22:26 |
cjwatson | At some point I may manage to get lucky and catch this in pdb. | 22:27 |
wgrant | cjwatson: If you want to bet that the two PostgresConnection leaks aren't unrelated, it may be worth reinstating my broken garbo job and poking at that test. | 22:30 |
wgrant | "∘ product-aps-set broke buildbot spectacularly. | 22:30 |
wgrant | ‣ garbage: {{{[<storm.databases.postgres.PostgresConnection object at 0x15d673ec>]}}} | 22:30 |
wgrant | " | 22:30 |
wgrant | cjwatson: http://paste.ubuntu.com/14057795/ reproduced it consistently. | 22:31 |
cjwatson | OK. I'm running --layer=DatabaseFunctionalLayer at the moment with pdb.set_trace() inserted just before the exception; if that doesn't catch it (or maybe even if it does) then I'll try that. | 22:31 |
wgrant | Specifically setting the access_policies list. | 22:31 |
cjwatson | Though it would certainly be nice to have a shorter and more reliable reproducer, so yes ... | 22:32 |
cjwatson | Won't get to it before tomorrow morning though | 22:32 |
wgrant | Well, the problem is that we don't know if they're actually related. | 22:32 |
cjwatson | Indeed. | 22:32 |
wgrant | And we won't know until the garbo one is fixed and buildbot passes :/ | 22:32 |
wgrant | Feel free to hand over any notes you have and I'll continue poking. | 22:33 |
cjwatson | I don't yet have meaningful notes other than what I've mentioned here, since my reproducer was so long and unreliable. | 22:33 |
cjwatson | Just running with http://paste.ubuntu.com/14057865/ | 22:34 |
cjwatson | (But I'm sure you could have figured that much out) | 22:34 |
cjwatson | I've been multitasking with working on getting my git-build-recipe tests passing | 22:34 |
cjwatson | Currently at 119/135 ... | 22:35 |
wgrant | cjwatson: 119/135 tests complete, or 119/135 runs succeeded/ | 22:44 |
cjwatson | wgrant: 119/135 tests succeeded | 23:24 |
cjwatson | lp.registry.tests.test_distro_webservice.TestGetBranchTips.test_same_results> /home/cjwatson/src/canonical/launchpad/lp-branches/more-layer-isolation-info/lib/lp/testing/layers.py(473)testTearDown() | 23:25 |
cjwatson | -> BaseLayer.flagTestIsolationFailure( | 23:25 |
cjwatson | oho | 23:25 |
wgrant | really | 23:25 |
wgrant | It's an interesting test to break on. It's a slightly weird method. | 23:25 |
cjwatson | Hesitant to ascribe too much significance to the particular test. | 23:26 |
wgrant | Indeed. | 23:26 |
cjwatson | (Pdb) p obj['_event']._hooks | 23:28 |
cjwatson | {'register-transaction': set([]), 'flush': set([(<bound method JSONVariable._detect_changes of <storm.variables.JSONVariable object at 0x182407ec>>, ())])} | 23:28 |
wgrant | Oho | 23:29 |
wgrant | Very relevant, then. | 23:29 |
cjwatson | So *could* be, but how's that circular | 23:29 |
cjwatson | I guess that bound method refers to that variable | 23:29 |
wgrant | Yep. | 23:29 |
wgrant | Bound methods hold a reference to their object, funnily enough. | 23:29 |
wgrant | But I'm not sure why this wouldn't happen all the time, unless it sometimes gets unregistered. | 23:30 |
cjwatson | (Pdb) gc.get_referents(meth.im_self) | 23:30 |
cjwatson | [<storm.variables.EventSystem object at 0x1b102f0c>, <class 'storm.variables.JSONVariable'>, {u'include_long_descriptions': True, u'backports_not_automatic': False}, Undef, (Undef, u'{"include_long_descriptions": true, "backports_not_automatic": false}'), <storm.properties.PropertyColumn object at 0xdcb19cc>, <storm.variables.EventSystem object at 0x1e15914c>] | 23:30 |
cjwatson | So I guess my publisher_options change could have tickled it? | 23:31 |
cjwatson | Maybe made it more probable, at least, since DistroSeries is everywhere | 23:31 |
wgrant | It is not impossible. | 23:31 |
wgrant | I did think about that when reviewing it, but buildbot seemed happy... | 23:31 |
cjwatson | The only things with __del__ are Result and Connection, unless there's something inside psycopg2 | 23:32 |
wgrant | Right, but it's probably something like PostgresConnection holding a reference to the event_system so it can fire events on commit or similar. | 23:34 |
cjwatson | There'd have to be a reference back to the connection somewhere for it to be relevantly cyclic, unless I misunderstand the gc | 23:35 |
wgrant | Indeed, but the variable or event_system probably knows the store which knows the connection, or something like that. | 23:36 |
cjwatson | ah | 23:36 |
cjwatson | (Pdb) gc.get_referents(meth.im_self)[6]._hooks | 23:36 |
cjwatson | {'stop-tracking-changes': set([(<bound method JSONVariable._stop_tracking of <storm.variables.JSONVariable object at 0x182407ec>>, ())]), 'changed': set([(<bound method Store._variable_changed of <storm.store.Store object at 0x1b10294c>>, ()), (<bound method Relation._break_on_local_diverged of <storm.references.Relation object at 0x15e3b66c>>, ({<storm.references.Relation object at 0x126da46c>: {}, 'sequence': 1, 'primary_vars': ... | 23:36 |
cjwatson | ... (<storm.variables.IntVariable object at 0x19a5487c>,), <storm.references.Relation object at 0x1237eacc>: {}, 'invalidated': True, <storm.references.Relation object at 0x126da74c>: {}, 'store': <storm.store.Store object at 0x1b10294c>, <storm.references.Relation object at 0x121ecb8c>: {}},))]), 'start-tracking-changes': set([(<bound method JSONVariable._start_tracking of <storm.variables.JSONVariable object at 0x182407ec>>, ())]), ... | 23:37 |
cjwatson | ... 'object-deleted': set([(<bound method JSONVariable._detect_changes_and_stop of <storm.variables.JSONVariable object at 0x182407ec>>, ())]), 'resolve-lazy-value': set([(<bound method Store._resolve_lazy_value of <storm.store.Store object at 0x1b10294c>>, ())]), 'flushed': set([])} | 23:37 |
wgrant | Right. | 23:37 |
cjwatson | there's the store | 23:37 |
cjwatson | so these must normally be unhooked or nothing would work | 23:37 |
wgrant | Yeah, maybe _stop_tracking isn't being called somewhere along the line. | 23:37 |
* wgrant tries to revive the reproducible case. | 23:37 | |
wgrant | Oh I have vague memories of this. | 23:39 |
wgrant | I could only reproduce it with a reset in some circumstances, I think. | 23:39 |
cjwatson | I'm a bit suspicious of the object-deleted implementation in the C extensions. | 23:42 |
cjwatson | That's at least sometimes how _stop_tracking is called. | 23:43 |
wgrant | Confirmed that r17587 plus that test I pasted reliably generates garbage | 23:44 |
cjwatson | This JSONVariable is only referred to by its bound methods | 23:44 |
wgrant | Yep | 23:45 |
cjwatson | I wonder if it's possible to end up with the same hook attached multiple times? | 23:48 |
cjwatson | Via multiple MutableValueVariable.get calls | 23:48 |
wgrant | It indeed seems to hook twice, hmmm. | 23:51 |
wgrant | But then my unhook log isn't triggered at all. | 23:51 |
wgrant | Oh, probably unhooked once in set, which I didn't log. | 23:52 |
wgrant | cjwatson: Oh, it's a set anyway. | 23:53 |
wgrant | Unhook is never called AFAICS. | 23:53 |
cjwatson | It's not called via object-deleted -> _detect_changes_and_stop -> _stop_tracking? | 23:54 |
wgrant | Nope. | 23:54 |
wgrant | Logging more and tweaking the test to not leave garbage, to see how it's called normally. | 23:54 |
wgrant | It is possible, of course, that it only leaks on reset. | 23:55 |
wgrant | And it depends on test ordering. | 23:55 |
cjwatson | reset (or invalidate) could cause multiple gets by clearing the cache | 23:56 |
wgrant | Interesting. | 23:56 |
wgrant | If I remove the reset at the end of the test, it actually gets unhooked by a new set() call. | 23:56 |
wgrant | I guess it gets set to AutoReload when the transaction is aborted at the end of the test. | 23:57 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!