[00:15] <cjwatson> [6~[6~ [5~/wg 61
[00:15] <cjwatson> argh, lag
[01:43] <lifeless> argh
[01:43] <lifeless> set_up_tacfile_logging
[01:44] <wgrant> That's not the problem here, AFAICT, but yeah.
[01:44] <wgrant> I commented it out, still borked.
[01:44] <lifeless> well
[01:44] <wgrant> It's evil, though.
[01:44] <lifeless> I'm yak shaving on the stdio thing
[01:44] <wgrant> Ah
[01:44] <lifeless> its that or yakshave on a manhole
[01:45] <lifeless> so
[01:45] <lifeless> the channel is set to pretend to be non-errors
[01:46] <lifeless> which is why that function isn't the cause of the looping
[01:46] <wgrant> Ah
[01:46] <lifeless> channel = logging.StreamHandler(log.StdioOnnaStick())
[01:46] <lifeless> thats what is connected to the python logging system
[01:47] <lifeless> note further that the python logging stuff *also* needs a oops handler attached (with special gymnastics) because set_up_tacfile_logging turns errors into plain messages
[01:47] <lifeless> special gymnastics because its calling 'normal' oops code in a twisted appserver.
[01:47] <lifeless> climb through, if you dare
[01:48] <lifeless> aaaaaaah
[01:48] <lifeless> -> execute_zcml_for_scripts()
[01:48] <lifeless> (Pdb)
[01:48] <lifeless> and thats where my prompt disappears
[01:53] <lifeless> it may just be horrendously slow under pdb.
[01:53] <lifeless> ah yes
[01:53] <lifeless> aieee
[01:53] <lifeless> someone in twisted had a daft day
[01:53] <lifeless> 623  -> def startLoggingWithObserver(observer, setStdout=1):
[01:54] <lifeless> there appears to be no way to control that.
[01:55] <wgrant> Heh
[01:57] <lifeless> Twisted-11.1.0-py2.6-linux-i686.egg/twisted/application/app.py(228)start()
[01:57] <lifeless> the use of -n doesn't stop this happening
[01:57] <lifeless> so, interactive debugging -> nah, we can't do that
[01:57] <lifeless> now, with that hacked in, lets see
[01:59] <lifeless> oh joy
[01:59] <lifeless> and then we re-setup logging doing that again as well
[01:59] <wgrant> We have to be sure :)
[01:59] <lifeless> via ib/lp/services/sshserver/service.py(174)startService()
[01:59] <lifeless> yay.
[02:00] <lifeless> now I have pdb, I am all powerful
[02:00] <lifeless> 12K oopses written
[02:00] <lifeless> then it stopped
[02:01] <lifeless> 2 oopses this time
[02:02] <lifeless> now 1
[02:02] <lifeless> heisenbug
[02:04] <poolie> lifeless, if we depend on python-fixtures is that likely to be released etc?
[02:04] <poolie> backported even?
[02:04] <lifeless> useless backtrace though
[02:04] <lifeless> poolie: hi released for sure
[02:04] <lifeless> poolie: I have no idea about backports to official ubuntu; it is in a ppa building for a wide set of releases
[02:04] <poolie> i guess adding dependencies is a muscle that should be worked
[02:04] <poolie> it hurts :)
[02:07] <lifeless> wgrant: https://bugs.launchpad.net/launchpad/+bug/901498/comments/5
[02:07] <_mup_> Bug #901498: poppy-sftp OOPSes infinitely <oops> <Launchpad itself:Triaged> < https://launchpad.net/bugs/901498 >
[02:08] <lifeless> wgrant: :!ls -l /var/tmp/lperr/2011-12-08/ | wc -l
[02:08] <lifeless> 5
[02:08] <lifeless> wgrant: suggestions on making it break solicited
[02:08] <wgrant> lifeless: That normally degrades into infinite recursion, I believe.
[02:08] <lifeless> of course, it might be the setStdout thing
[02:08] <lifeless> wgrant: hasn't for me
[02:08] <wgrant> Even when you run without your hacks and without -n?
[02:09] <lifeless> its the setStdout parameter to startLogging
[02:11]  * lifeless runs with 2>/dev/null
[02:11] <lifeless> right
[02:12] <lifeless> our logging is wired up to stderr
[02:12] <lifeless> and we redirect stderr to be an OOPS
[02:12] <lifeless> our *python* logging is wired up to stderr
[02:14] <wgrant> Ah
[02:14] <lifeless> we have logging wired up to stderr for scripts
[02:14] <lifeless> where the stderr goes to lp-error-reports
[02:14] <lifeless> for 'help me' reporting
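A minimal sketch of the feedback loop described above, assuming that writing an error report itself prints something to stderr. The names `OopsWriter`, `RedirectedStderr`, and `count_reports` are hypothetical and model the wiring only; they are not Launchpad, Twisted, or python-oops code, and the reentrancy guard is one possible way to break such a loop, not necessarily the fix proposed in the bug comments.

```python
class OopsWriter:
    """Hypothetical report writer; not Launchpad's real OOPS code."""

    def __init__(self, guard, cap=20):
        self.guard = guard    # ignore errors raised while already reporting
        self.cap = cap        # hard stop so the unguarded run terminates
        self.busy = False
        self.reports = 0
        self.stderr = None    # set once the redirect exists

    def report(self, text):
        if self.guard and self.busy:
            return
        if self.reports >= self.cap:
            return
        self.reports += 1
        self.busy = True
        try:
            # Assumption: writing a report prints a diagnostic to stderr.
            self.stderr.write("wrote report %d" % self.reports)
        finally:
            self.busy = False


class RedirectedStderr:
    """Stand-in for a stderr that has been pointed at the report writer."""

    def __init__(self, writer):
        self.writer = writer

    def write(self, text):
        # Every write to stderr becomes a new error report.
        self.writer.report(text)


def count_reports(guard):
    writer = OopsWriter(guard)
    writer.stderr = RedirectedStderr(writer)
    writer.stderr.write("first failure")
    return writer.reports
```

Without the guard a single failure keeps generating reports until the artificial cap stops it; with the guard the same failure produces exactly one report.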
[02:16] <lifeless> wgrant: analyzed https://bugs.launchpad.net/launchpad/+bug/901498/comments/7
[02:16] <_mup_> Bug #901498: poppy-sftp OOPSes infinitely <oops> <Launchpad itself:Triaged> < https://launchpad.net/bugs/901498 >
[02:19] <lifeless> oh wow. we export a function from a package.
[02:19] <lifeless> bah
[02:19] <lifeless> function from a module with the same name as the module
[02:19] <lifeless> lib/canonical/launchpad/scripts/__init__.py line 31
[02:24] <lifeless> ok, this is crazy. going in loops :(
[02:26] <spm> while poppy=aaaaaa; do sudo crazy --user=lifeless ; sleep 5 ; done
[02:27] <poolie> lifeless, i don't suppose you would be open to reconsidering https://code.launchpad.net/~mbp/launchpad/show-timeline/+merge/80166 to not consolidate it with ++profile++ for now?
[02:27] <poolie> the profile code is a bit hardcoded
[02:27] <poolie> i feel a bit averse to refactoring it
[02:27] <poolie> at least until i know if this is actually useful
[02:29] <lifeless> mmm
[02:29] <lifeless> If you want to land it, and then either refactor or remove it in january sometime, thats fine
[02:30] <lifeless> I don't want deliberate duplication in this area; it is as you note already clumsy, and duplication around it makes that worse.
[02:30] <lifeless> so I'm *not* ok with landing it and then forgetting about it.
[02:30] <poolie> ok, deal
[02:30] <poolie> i promise to at minimum delete it
[02:30] <lifeless> ok
[02:30] <lifeless> thank you
[02:31] <poolie> hopefully it will actually be useful and we can unify them
[02:31] <poolie> i will add an integration smoke test
[02:41] <lifeless> wgrant: feedback wanted
[02:41] <lifeless> https://bugs.launchpad.net/launchpad/+bug/901498/comments/8
[02:41] <_mup_> Bug #901498: poppy-sftp OOPSes infinitely <oops> <Launchpad itself:Triaged> < https://launchpad.net/bugs/901498 >
[02:41] <lifeless> I think I have a full handle on it now
[02:45] <wgrant> lifeless: That sounds reasonable, I think.
[02:46] <wgrant> But I don't know twistd.
[02:58] <lifeless> StevenK will rejoice
[02:58] <wgrant> Oh?
[02:58] <lifeless> he has a branch to delete set_up_tacfile_logging
[02:59] <lifeless> I suggested that he needed to figure out this set of intricate pain to move forward and it stalled
[02:59] <lifeless> having figured it out, it should be fairly straightforward to make a patch
[02:59] <wgrant> Ah
[03:03] <StevenK> lifeless: What's your plan, then?
[03:03] <lifeless> StevenK: https://bugs.launchpad.net/launchpad/+bug/901498/comments/8
[03:03] <_mup_> Bug #901498: poppy-sftp OOPSes infinitely <oops> <Launchpad itself:Triaged> < https://launchpad.net/bugs/901498 >
[03:05] <StevenK> Hmm, I can't recall that branch
[03:05]  * StevenK looks
[03:06] <StevenK> Found it.
[03:09] <wgrant> lifeless: Does one just use https://lp-oops.canonical.com/admin/oops/prefix/ to assign appinstances to prefixes?
[03:09] <lifeless> yes
[03:09]  * wgrant fixes them up.
[03:09] <lifeless> per the docco wiki page
[03:09] <lifeless> its a bit horrible
[03:09] <wgrant> Horrible? This is Django!
[03:09] <lifeless> a bit, not a lot
[03:10] <lifeless> just some of the manipulation gets clumsy
[03:10] <StevenK> lifeless: So I should resurrect my branch?
[03:10] <wgrant> Oh goody.
[03:10] <wgrant> We have 'production' and 'lpnet' appinstances.
[03:10] <lifeless> StevenK: I think so, especially if you're up to tweaking the other bits too
[03:10] <lifeless> wgrant: production was scripts etc
[03:10] <lifeless> wgrant: lpnet is just appservers
[03:11] <lifeless> IIRC
[03:11] <wgrant> Probably.
[03:13] <StevenK> lifeless: I would, but I have no idea how to.
[03:14] <wgrant> lifeless: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-04319168ea720529358ae77eb667fea9
[03:14] <lifeless> zomg links that work :P
[03:14] <wgrant> YES
[03:14] <lifeless> StevenK: I can give you pointers
[03:15] <lifeless> wgrant: epic fail traceback
[03:16] <lifeless> wgrant: the no URL thing is interesting
[03:16] <wgrant> Yes.
[03:16] <lifeless> wgrant: no req variables either
[03:16] <wgrant> It probably happened outside the request.
[03:16] <wgrant> It's in the disconnection.
[03:16] <lifeless> wgrant: can you file a bug for this? it managed to keep the timeline though.
[03:16]  * StevenK points lifeless to this whole annual leave thing.
[03:16] <lifeless> StevenK: are you there now ?
[03:17] <lifeless> StevenK: you could link your branch in as a starting point (just in a comment I mean)
[03:19] <wgrant> Hmm.
[03:19] <wgrant> Our DB needs pruning.
[03:19] <lifeless> wgrant: oops DB?
[03:19] <StevenK> lifeless: Comment 9
[03:19] <wgrant> So many obsolete unreferenced prefixes.
[03:19] <wgrant> Yeah.
[03:19] <lifeless> wgrant: oh, prefix pruning.
[03:19] <lifeless> wgrant: wait till we have the new configs streamline
[03:19] <wgrant> 139 unreferenced prefixes.
[03:19] <wgrant> SSO and crap.
[03:19] <wgrant> Yeah.
[03:23] <lifeless> poolie: the linaro thing is a sadness
[03:23] <StevenK> Which thing?
[03:26] <lifeless> zygmund? I forget the spelling - moved a linaro project to github, unilaterally.
[03:27] <lifeless> with a bit of a rant about usability
[03:27] <lifeless> some true aspects
[03:30] <poolie> yeah
[03:30] <poolie> some seems fairly easily fixable :/
[03:32] <wgrant> Yes.
[03:32] <wgrant> But the problem is they've been fairly easily fixable for 4 years.
[03:32] <wgrant> And they're still not fixed.
[03:37] <lifeless> a few good men
[03:38] <StevenK> I'd suggest they get escalated, but what good will that do.
[03:41] <poolie> wgrant,  they are actually a bit more easily fixable now
[03:41] <wgrant> (I don't actually know what they are)
[03:42] <lifeless> YHBTHANDHTH
[03:42] <nigelb> ...
[03:45] <poolie> you have been trolled etc
[03:46] <lifeless> Wgrant sometimes needs the reciprocal branded on his forehead :)
[03:46] <wgrant> I was being quite serious :)
[03:46] <lifeless> except you didn't know the specific issues
[03:47] <poolie> featurefixture assumes every defined feature is true :(
[03:47] <poolie> can be fixed
[03:47] <poolie> s//flag
[03:47] <lifeless> so you were at minimum applying /some/ hyperbole
[03:47] <wgrant> Some, yes.
[03:47] <wgrant> But it was a reasonable assumption.
[03:48] <mwhudson> lifeless: i was pretty unhappy he did that
[03:49] <lifeless> mwhudson: I am too
[03:49] <wgrant> So am I. Doesn't mean I don't see his points as valid :)
[03:50] <lifeless> mwhudson: he could at least have offered some patches
[03:51] <mwhudson> my upsetness is probably from a different direction though
[03:52] <lifeless> mwhudson: sharable?
[03:52] <mwhudson> lifeless: bad timing, and we're supposed to be a team dammit
[03:52] <poolie> exactly
[03:53] <lifeless> mwhudson: What made the timing bad? Team thing yes, I agree - thats kindof where my point was...
[03:53] <lifeless> mwhudson: though I think that Linaro being a separate org makes that a little less clear.
[03:53] <mwhudson> lifeless: the project is a bunch of scripts we use for deployment
[03:53] <mwhudson> lifeless: he moved it <12 hours before we used them for the first time for a production deployment
[03:53] <lifeless> oh wow
[03:58] <lifeless> mwhudson: who reports to whom in this structure?
[03:58] <mwhudson> zyga and i both report to paul larson
[04:14] <poolie> short review anyone? https://code.launchpad.net/~mbp/launchpad/featurefixture-from-request/+merge/84887
[04:16] <wgrant> lifeless: Did you have a fix for https://lp-oops.canonical.com/?oopsid=OOPS-5c5df29093442e212197300c3b8d2f8a?
[04:16] <lifeless> bah, slid off plate
[04:16] <lifeless> let me shelve my DB changes and I'll get on it
[04:16] <lifeless> wgrant: you're landing the soyuz start tweaks?
[04:17] <lifeless> wgrant: was there a bug for this ?
[04:17] <lifeless> ah yes, 898638 is it
[04:17] <wgrant> lifeless: They landed a few hours ago.
[04:17] <wgrant> lifeless: Also, your (qa)staging de-oops-prefix-isation made various staging OOPSes show up on prod.
[04:18] <lifeless> wgrant: win
[04:18] <wgrant> Because there are prefixes configured in schema-lazr.conf, for some probably not very good reason.
[04:18] <wgrant> eg. CB and CID
[04:18] <wgrant> And CIW
[04:18] <lifeless> HAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAAHAHHAAHA
[04:18] <wgrant> And APPORTBLOB
[04:18] <wgrant> Anyway.
[04:18] <lifeless> caretokill ?
[04:18] <wgrant> Not really.
[04:18] <wgrant> Might as well just kill off all prefixes on Monday.
[04:18] <wgrant> :)
[04:18] <wgrant> Actually.
[04:19] <lifeless> well, all script ones
[04:19] <wgrant> I guess we can sensibly just drop those now, can't we.
[04:19] <wgrant> Since we're using the new datedir-repo on prod everywhere.
[04:19] <lifeless> set to none in the schema
[04:19] <wgrant> Yeah, just was thinking it wasn't safe everywhere yet.
[04:19] <lifeless> wgrant: we haven't updated the prod config yet
[04:19] <wgrant> But it is.
[04:19] <wgrant> They don't need to be unique any more, and reporter defaults to something sensible for scripts now.
[04:20] <lifeless> yes, that part of it definitely
[04:20] <wgrant> I'll remove all the prefix defaults now.
[04:21] <lifeless> http://pastebin.com/3Q7v6YVD
[04:22] <wgrant> aaaa
[04:22] <wgrant> but ok
[04:22] <lifeless> how do you propose to address it ?
[04:23] <wgrant> I don't have any better ideas.
[04:25] <lifeless> whee
[04:25] <lifeless>   File "/usr/lib/python2.6/urllib.py", line 1222, in quote
[04:25] <lifeless>     res = map(safe_map.__getitem__, s)
[04:25] <lifeless> UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
[04:25] <lifeless> close enough to demonstrating the problem IMO
[04:25] <lifeless> and with the fix
[04:37] <lifeless> wgrant: lp:~lifeless/launchpad/bug-898638
[04:37] <poolie> lifeless, what is an 'extra oops message'?
[04:38] <poolie> nm
[04:38] <lifeless> wgrant: review -  https://code.launchpad.net/~lifeless/launchpad/bug-898638/+merge/84891
[04:38] <lifeless> poolie: have a look in errorlog.py, should be fairly obvious
[04:38] <poolie> like         with globalErrorUtility.oopsMessage(oops_message): ?
[04:39] <lifeless> something like that yes
[04:39] <lifeless> I don't know if there is a good API for 'and I am not raising right now'
[04:39] <lifeless> I hope there is
[04:39] <lifeless> otherwise I'm sending you off yak shaving again
[04:39] <lifeless> (which wasn't my intent)
[04:40] <wgrant> lifeless: Approved
[04:41] <lifeless> we need a longpoll hook for changes to the MP other than diff
[04:45] <lifeless> has stubs stuff landed again ?
[04:45]  * lifeless checks
[04:46] <wgrant> It has, yes.
[04:58] <lifeless> wgrant: https://code.launchpad.net/~lifeless/python-oops-tools/prune/+merge/84892
[04:58] <wgrant> I need to remember to not hit refresh.
[05:06] <lifeless> wgrant: diff is there :P
[05:07] <poolie> lifeless, according to some documentation, oops messages are not supported in the webapp because they're not thread safe
[05:08] <lifeless> poolie: hmm, so I believe we have per-thread utilities
[05:08] <lifeless> poolie: so if I wrote that, I was misguided.
[05:10] <wgrant> lifeless: k
[05:10] <wgrant> r=me
[05:29] <poolie> where are oopses from make run now put?
[05:29] <wgrant> /dev/null unless you have python-oops-tools set up, or kill rabbit.
[05:30] <poolie> i guess i'm really asking, what's the easiest way to look at them
[05:31] <lifeless> poolie: run up oops-tools locally.
[05:31] <wgrant> It's actually pretty easy nowadays.
[05:31] <lifeless> poolie: give me 5 minutes to land a switch to the LP sourcedep branch and it will be even easier.
[05:31] <wgrant> Although it would be nice if you didn't have to create a special postgres cluster and blah.
[05:31] <lifeless> wgrant: patches gratefully.
[05:31] <poolie> srsly
[05:31] <poolie> ok thanks
[05:31] <wgrant> poolie: If you kill rabbit and cause an oops, it will appear in /var/tmp/lperr as before.
[05:32] <lifeless> wgrant: is there an exchange by default ?
[05:32] <lifeless> wgrant: w/no exchange they should be written to disk regardless
[05:32] <wgrant> lifeless: Hmm, that's true.
[05:33] <wgrant> Perhaps the dev rabbit is more persistent than I thought it was.
[05:33] <lifeless> poolie: I'd expect by default they arrive in /var/tmp/lperr/
[05:34] <lifeless> poolie: if you have oopstools you can glue it in very easily. I reference an email describing this from https://dev.launchpad.net/QA/OopsToolsSetup
[05:34] <lifeless> the precise link is
[05:34] <lifeless> https://lists.launchpad.net/launchpad-dev/msg08183.html
[05:34] <lifeless> (but see https://dev.launchpad.net/QA/OopsToolsSetup#Deploying_locally_.28e.g._devpad.29 anyhow)
[05:35] <lifeless> sadface
[05:35] <lifeless>  bzr resolve --all download-cache/
[05:35] <lifeless> bzr: ERROR: If --all is specified, no FILE may be provided
[05:35] <wgrant> Yeah, that sucks :/
[05:35] <lifeless>  bzr resolve --all -d download-cache/
[05:35] <lifeless> bzr: ERROR: no such option: -d
[05:35] <lifeless>  cd download-cache/ && bzr resolve --all && cd -
[05:35] <lifeless> /home/robertc/oops-tools
[05:37] <poolie> :/
[05:39] <lifeless> man, I wanted to get so much more done today
[05:39] <lifeless> E-TIME
[05:39] <lifeless> of course, having rogue services chew up GBs of queue memory is a good excuse.
[05:41] <wgrant> Indeed.
[05:41] <wgrant> 14M oopses from one service is quite the exceptional circumstance.
[05:42] <lifeless> wgrant: you're missing the 700K it processed.
[05:42] <lifeless> wgrant: that graph was depth, not volume
[05:43] <wgrant> lifeless: True.
[05:43] <wgrant> lifeless: Although some of that 700000 was from after the peak.
[05:43] <wgrant> But before I hacked it to skip them.
[05:43] <lifeless> wgrant: not really
[05:44] <lifeless> wgrant: it had 9 hours of constant chewing on them
[05:44] <wgrant> True.
[05:44] <lifeless> wgrant: and ~20 minutes of quad consumer
[05:44] <wgrant> Yeah
[05:44] <lifeless> so thats 9*60=540 vs 80
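The comparison above can be read as consumer-minutes of processing time: roughly 9 hours with one consumer against roughly 20 minutes with four. A small sketch of that arithmetic follows; the variable names are illustrative and the figures are the approximate ones quoted in the conversation.

```python
# Approximate figures from the conversation; names are illustrative only.
single_consumer_hours = 9
quad_consumer_minutes = 20
quad_consumer_count = 4

# Consumer-minutes spent chewing through the queue in each phase.
single_phase = single_consumer_hours * 60 * 1
quad_phase = quad_consumer_minutes * quad_consumer_count

# Share of total processing effort that happened in the quad-consumer phase.
quad_share = quad_phase / (single_phase + quad_phase)
```

On these numbers the quad-consumer phase accounts for about 13% of the total consumer-minutes, which supports the point that most of the processed reports predate it.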
[05:47] <poolie> stub, this is for https://code.launchpad.net/~mbp/launchpad/666765-features-no-reasons/+merge/84050
[05:47] <poolie> any better ideas welcome
[05:47] <lifeless> poolie: does the request flags controller know the scope values ?
[05:48] <lifeless> poolie: if so, adding an oops hook, or extending attach_request, would be clean.
[05:48] <poolie> yep
[05:48] <poolie> i don't understand how the existing use of it is really safe
[05:48] <poolie> i guess it's mostly stateless
[05:48] <lifeless> its probably not
[05:48] <poolie> in the way it's used in the webapp
[05:48] <poolie> ok that should work
[05:48] <lifeless> this is old and crufty code mostly facelifted
[05:50] <stub> poolie: I think at the moment all data is attached to the request, which is effectively thread local and gets destroyed at the end of the request. The utility certainly should be stateless, or failing that use thread local storage.
[05:50] <lifeless> certainly the amqp publisher has TLS built in
[05:51] <lifeless> and the datedir repo publish function doesn't alter global state (though the old one did). All hail clean code.
[05:51] <poolie> right except it has this concept of 'messages' which misled me
[05:51] <stub> I attached something to the OOPS reports via request the other day but can't remember what it was now... :-/
[05:51] <poolie> pulling it from the request at the time it's fired looks like it will work
[05:51] <lifeless> stub: the previous oops
[05:51] <poolie> and is even slightly clean
[05:51]  * stub wanders off on his walking frame to grab a cup of tea
[05:51] <stub> ahh... that's right
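A minimal sketch of the "stateless, or failing that use thread local storage" advice above, using Python's standard `threading.local`. The `MessageStore` class and `collect_per_thread` helper are hypothetical and are not the Launchpad error utility; they only illustrate how per-thread state keeps one request's messages out of another's.

```python
import threading


class MessageStore:
    """Hypothetical utility whose mutable state lives in thread-local storage."""

    def __init__(self):
        self._local = threading.local()

    def add(self, message):
        # Each thread lazily gets its own list on first use.
        if not hasattr(self._local, "messages"):
            self._local.messages = []
        self._local.messages.append(message)

    def messages(self):
        return list(getattr(self._local, "messages", []))


def collect_per_thread(store, names):
    """Run one worker thread per name and record what each thread can see."""
    seen = {}

    def worker(name):
        store.add(name)
        seen[name] = store.messages()

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return seen
```

Each worker sees only the message it added, even though all of them share a single `MessageStore` instance.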
[05:54] <lifeless> poolie:  trunk of python-oops-tools is now using the lp sourcedeps cache.
[05:55] <lifeless> I'm going to mail the list a tl;dr summary in a second
[06:48] <poolie> all my lp branches failed with that soyuz apparently spurious failure
[06:48] <wgrant> Yeah, it's become extremely common lately.
[06:48] <wgrant> Possibly 100% common.
[06:49] <wgrant> I have two branches in ec2 now that I'm going to try to catch.
[06:59] <jtv> wgrant: I've been trying to Q/A a change to PackageCloner.packageSetDiff… any idea how I exercise it?  I thought I could initialize a distroseries but haven't had much luck with that.
[06:59] <jtv> (And what is a SetDiff?)
[07:02] <wgrant> It's a diff of a package set. Note that a package set is not to be confused with a packageset.
[07:03] <wgrant> Initialising a new distroseries in the same distribution should use the cloner.
[07:03] <wgrant> But the easiest way is to use populate-archive to create a copy archive.
[07:03] <wgrant> That uses the cloner.
[07:04] <jtv> And specifically packageSetDiff?
[07:06] <wgrant> Hmm.
[07:06] <wgrant> I'm not sure that's ever been used.
[07:06] <wgrant> The cloner was initially going to be able to do progressive copies, IIRC.
[07:06] <wgrant> But that never got finished AFAIK.
[07:06] <poolie> ok, night all
[07:06] <wgrant> At least never used anywhere.
[07:07] <jtv> night poolie
[07:07] <jtv> wgrant: then I guess qa-untestable.
[07:10] <wgrant> jtv: Probably.
[07:17] <jtv> wgrant: Thanks.  By the way, it runs on cocoplum, right?
[07:17] <jtv> The package cloner in general?
[07:19] <wgrant> Yeah.
[07:19] <wgrant> Mostly.
[07:22] <jtv> Mostly..?
[07:22] <jtv> “They mostly come at night.  Mostly.”
[07:22]  * jtv flashes wgrant a terrified look
[07:23] <wgrant> It would occasionally be run from loganberry, probably, and will soon be on bilimbi.
[07:26] <jtv> I'm filing NDT & cocoplum rollouts then.
[07:26] <wgrant> Objection.
[07:26] <jtv> ?
[07:26] <wgrant> We can't roll out to cocoplum until next week.
[07:26] <wgrant> poppy is a bit fucked
[07:26] <wgrant> Took down all of LP this morning.
[07:26] <jtv> But that's germanium innit?
[07:26] <wgrant> By overloading rabbit with 14 million OOPSes in 10 hours.
[07:26] <jtv> whoopie
[07:26] <wgrant> poppy is both cocoplum and germanium.
[07:26] <jtv> Is that the new logging of GPG errors?
[07:27] <wgrant> A bug in the new OOPS infrastructure.
[07:27] <wgrant> Which causes the first error to create an infinite loop.
[07:27] <wgrant> Of OOPSes.
[07:27] <jtv> I shouldn't have joked about that the other day.
[07:28] <wgrant> https://lpstats.canonical.com/graphs/ProductionRabbitOopsQueue/20111208/20111209/
[07:28] <lifeless> wgrant: that should be on LPS in the cherrypick section I suspect
[07:28] <wgrant> lifeless: It's in the cowboy section.
[07:28] <lifeless> cool cool
[07:28] <wgrant> I'll add germanium as well.
[07:30] <lifeless> also  https://lpstats.canonical.com/graphs/AppServerRequestLpnet/20111208/20111209/ which we have no current clue about
[07:30] <jtv> wgrant: is the upshot that rolling out to cocoplum doesn't work today?  Or that doing so would break something, and if so, how?  Would it override a cowboy, for instance?
[07:30] <wgrant> jtv: Rolling out to cocoplum will work for a few hours, and then cause all production services to hang.
[07:31] <wgrant> Bug #901498
[07:31] <_mup_> Bug #901498: poppy-sftp OOPSes infinitely <oops> <Launchpad itself:Triaged> < https://launchpad.net/bugs/901498 >
[07:31] <jtv> So it would break something.
[07:31] <jtv> That's something I hate even more than just not being able to do it.
[07:31] <lifeless> https://bugs.launchpad.net/launchpad/+bug/901498
[07:31] <_mup_> Bug #901498: poppy-sftp OOPSes infinitely <oops> <Launchpad itself:Triaged> < https://launchpad.net/bugs/901498 >
[07:32] <wgrant> lifeless: Well, it's nothing to worry about.
[07:32] <wgrant> lifeless: Data collection on loganberry probably just hung for a while.
[07:32] <wgrant> lifeless: Note that there's a gap, and then the next point is high -- it's the sum of the missing data.
[07:32] <jtv> wgrant: argh.  As usual I didn't see the moin warning until I hit the button.  :(
[07:32] <jtv> wgrant: I'm afraid I cross-edited LPS.
[07:33] <jtv> Very sorry.
[07:33] <jtv> I still want a red background with a skull-and-crossbones motif when that happens.
[07:34] <wgrant> jtv: It's a far too subtle warning.
[07:34] <wgrant> I think it's actually more subtle than the warning that other people will be warned.
[07:34] <lifeless> restyle it
[07:34] <lifeless> we have a branch with the theme
[07:34] <wgrant> lifeless: wiki.c.c?
[07:34] <wgrant> jtv: What'd you change? Added a deployment request?
[07:34] <wgrant> Ah, yeah.
[07:34] <wgrant> No conflict.
[07:35]  * wgrant saves.
[07:35] <lifeless> wgrant: yes, pretty sure. check with webops.
[07:35] <jtv> No, edited my existing deployment request.
[07:35] <jtv> Poppy oops loop.  Now there's something to say quickly.
[07:35] <lifeless> or as I like to say
[07:35] <lifeless> poops
[07:37] <jtv> well if you had to say "poppy oops loop" a few times, chances you'd have ended up saying it anyway
[07:37] <jtv> chances *are
[07:37] <jtv> This keyboard lets my fingers fall too far behind my brain.
[07:40] <wgrant> jtv: I've amended your suspended request further, to request germanium as well.
[07:40] <wgrant> Because it's nice to keep them together.
[07:40] <wgrant> They have the same downtime constraints, so there's no reason to let them diverge.
[07:41] <jtv> I was wondering about that.  Is there any particular reason why we don't deploy those as a coherent set?
[07:41] <wgrant> Because I only just altered LPS to recommend that that be done.
[07:42] <wgrant> there's probably not much benefit in adding a special alias for them.
[07:42] <wgrant> And we can't really merge the nodowntime-but-not-nodowntime targets into a single set, since they require codehosting.
[07:42] <wgrant> (codehosting, mailman, librarian are nodowntime-but-not-nodowntime)
[07:44] <jtv> nodowntime-but-not-nodowntime..?
[07:45] <wgrant> Able to be deployed without downtime, but not in nodowntime because their upgrades require handholding.
[07:45] <jtv> Should we have a plus-zero-downtime set?
[07:46] <wgrant> Heh
[07:53] <lifeless> mailman needs to be divorced from LP
[07:53] <lifeless> made into a separate consumable thing, not part of the same tree
[07:53] <wgrant> Yes.
[07:53] <lifeless> that will eliminate one of those
[07:54] <lifeless> I've run that broad plan past elmo for comment, and he is (provisionally) cool with it
[08:53] <adeuring> good morning
[08:56] <jtv> morning adeuring
[08:56] <adeuring> hi jtv
[08:56] <mrevell> Hey
[08:57] <jtv> hi mrevell
[09:39] <micahg> given Bug #788819, do I need a second bug to be able to subscribe to all team branches?
[09:39] <_mup_> Bug #788819: want to subscribe to all merge proposals project wide <code-review> <feature> <notification> <Launchpad itself:Triaged> < https://launchpad.net/bugs/788819 >
[10:17] <lifeless> ocr: https://code.launchpad.net/~lifeless/python-oops-amqp/bug-901497/+merge/84915 plox
[10:59] <lifeless> stub: around? should we catchup?
[11:03] <lifeless> stub: I'll try to catch you tomorrow. EHALT now.
[11:04] <lifeless> allenap: ditto for you (re the bug I was talking to julian about)
[11:04] <bigjools> lifeless: allenap is not around reliably today
[12:08] <stub> lifeless: night
[12:21] <rick_h_> morning everyone
[12:24] <nigelb> bigjools: Seen IND vs WI match?
[12:26] <bigjools> nigelb: yes, amazing
[12:26] <nigelb> :)
[12:26] <nigelb> The office seems to be in party mode :P
[14:07] <sinzui> jcsackett, can you check your email to verify you were notified that you were assigned a bug? I am verifying that email is working...you can ignore the bug
[14:08] <jcsackett> sinzui: this was within say the last hour? (i have a misbehaving bug mail filter, so i'll need to search)
[14:09] <sinzui> It was 15 minutes ago
[14:10] <jcsackett> sinzui: i see no email.
[14:10] <sinzui> okay
[14:43] <jml> hey, is there something wrong with the builders. the queue seems pretty big.
[14:43] <jml> or is this just usual flux
[14:44] <bigjools> jml: probably a result of the librarian outage the other day
[14:47] <flacoste> bigjools: that and we were missing some builders
[14:47] <flacoste> or are
[14:48] <flacoste> back saturday i think
[14:48] <bigjools> flacoste: nothing to do with that unfortunately
[15:05] <flacoste> oops, don't list active feature flags right?
[15:22]  * deryck goes afk for early lunch/errands, back in hour
[15:30] <bigjools> jml: remember when we wrote some poppy tests and you duplicated them across ftp and sftp by using "multiply_tests" in the loader?
[15:30] <jml> bigjools: yes.
[15:30] <bigjools> jml: do you know how to run only one scenario?
[15:30] <jml> bigjools: yeah, just pick the right regex
[15:30] <bigjools> the test name that it prints when running is not recognised by -t
[15:31] <jml> test.py test_poppy <regex>
[15:31] <bigjools> well finding the regex is proving hard :)
[15:31] <bigjools> for example:
[15:31] <bigjools>  lp.poppy.tests.test_poppy.TestPoppy.test_bad_gpg_on_changesfile(sftp)
[15:31] <jml> bigjools: don't use -t, use the two regex way of selecting tests
[15:31] <bigjools> ah, ISWYM
[15:32] <bigjools> jml: didn't work :(
[15:32] <bigjools> unless ( and ) need escaping
[15:32] <jml> probably they do
[15:32] <jml> I'm guessing you just want ftp
[15:32] <bigjools> aha there we go
[15:32] <bigjools> how'd you guess :)
[15:32] <bigjools> that did it, thanks jml
[15:32] <jml> you could probably also change the name of the scenario
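A short sketch of why the parentheses in the scenario suffix need escaping, assuming the test runner matches its pattern against the printed test name with regular-expression search semantics (how the Launchpad test runner actually applies the pattern is not shown in this log). The `selects` helper and the `(ftp)` test id are illustrative; the `(sftp)` id is the one quoted above.

```python
import re

SFTP_ID = "lp.poppy.tests.test_poppy.TestPoppy.test_bad_gpg_on_changesfile(sftp)"
FTP_ID = "lp.poppy.tests.test_poppy.TestPoppy.test_bad_gpg_on_changesfile(ftp)"


def selects(pattern, test_id):
    """Return True if the pattern would select this test id (search semantics)."""
    return re.search(pattern, test_id) is not None


# Unescaped, "(sftp)" is a capture group, so the literal parentheses never match.
unescaped = "test_bad_gpg_on_changesfile(sftp)"

# re.escape turns the name into a pattern that matches it literally.
escaped = re.escape("test_bad_gpg_on_changesfile(sftp)")

# A bare "ftp" also matches inside "sftp"; anchoring on the escaped
# opening parenthesis selects only the ftp scenario.
ftp_only = r"\(ftp\)"
```

With these patterns, `escaped` selects the sftp test, `unescaped` selects nothing, and `ftp_only` picks the ftp scenario without also catching sftp.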
[15:35] <adeuring> jcsackett: could you please review this MP: https://code.launchpad.net/~adeuring/launchpad/history-model/+merge/84973 ?
[15:35] <jcsackett> adeuring: sure.
[15:35] <adeuring> jcsackett: thanks!
[15:54] <benji> I'm confused.  I didn't realize we still used the release-critical-only mode of PQM any more, yet my branch just got rejected because of it.
[16:32] <flacoste> benji: there were two buildbot failures earlier on
[16:33] <jcsackett> adeuring: r=me, sorry for the delay.
[16:33] <flacoste> unless a testfix was landed or a rebuild requested
[16:33] <adeuring> jcsackett: thanks :)
[16:33] <flacoste> we are in testfix mode
[16:33] <flacoste> benji: so we are not using release-critical anymore
[16:33] <flacoste> but testfix yes
[16:33] <benji> flacoste: let me double-check, but I'm pretty sure the rejection email said release-critical and not testfix
[16:34] <flacoste> benji: the message might be wrong
[16:34] <flacoste> i think we are still in testfix
[16:34] <flacoste> because nothing was done with the db_lp failures
[16:36] <benji> flacoste: heh, my tendency to read right to left combined with the fact that the regex has *both* testfix and release-critical in it bit me
[16:36] <flacoste> what was the issue with "True != False: memcached is live but should not be."
[16:36] <flacoste> these errors happened both on lp and db_lp?
[16:36] <flacoste> was it deemed transient?
[16:36] <flacoste> should we trigger a rebuild on db_lp?
[16:37] <flacoste> or the same fix that got applied to lp should be to db_lp?
[16:42] <deryck> rick_h_, hey.  can you just pastebin me the queries you want run, in order. Just the queries, no extra text.
[16:42] <rick_h_> deryck: sure thing, sec
[16:44] <rick_h_> deryck: http://paste.mitechie.com/show/466/
[16:58] <gary_poster> flacoste, I came back in for only part of a conversation you were having with someone, and no-one replied.  I'm going to assume that we still need someone to investigate the issues that benji raised in email, and the buildbot failures.
[16:58] <gary_poster> I'll arrange for it to be tackled.
[17:00] <gary_poster> benji, could you try reverting the new germinate locally as Francis suggested?  I'm not sure how to do it, and if you are not either, I bet there are many people who could help you.
[17:00] <deryck> rick_h_, I don't follow the "repeat" comment below. Do I need to change an id and run them again?
[17:00] <rick_h_> deryck: yes please, I want to then get the same info for the parent messages of those two
[17:00] <gary_poster> I guess that the memcache error on buildbot is about some memcache that started and shouldn't have.
[17:00] <deryck> rick_h_, what is the parent message?
[17:00] <rick_h_> parent field in the results
[17:01] <rick_h_> message.parent
[17:01] <gary_poster> I'll try to investigate that
[17:01] <rick_h_> should be either null, or ids of messages as well
[17:01] <rick_h_> deryck: ^^
[17:01] <benji> gary_poster: I haven't upgraded it manually so I doubt I have the newest version, let me check
[17:03] <deryck> rick_h_, I don't get anything for the first results.
[17:03] <deryck> 0 rows
[17:03] <gary_poster> benji, it  would be upgraded automatically when you upgrade stuff because we have Launchpad's PPA as a source
[17:04] <rick_h_> deryck: http://paste.mitechie.com/show/468/ ?
[17:04] <gary_poster> I guess you mean you didn't update launchpad dependencies, you just installed the missing dependency?
[17:04] <gary_poster> If that's the case you should definitely update all your dependencies before you try again
[17:04] <benji> gary_poster: by "when you upgrade stuff" do you mean running a blanket "apt-get upgrade"?  I haven't done that lately.
[17:04] <deryck> rick_h_, https://pastebin.canonical.com/56886/
[17:05] <rick_h_> deryck: ok, thanks
[17:06] <deryck> adeuring, you linked a branch to bug 898200, which is the dupe. bug 295214 is the master bug.
[17:06] <_mup_> Bug #898200: Can't sort bug list by customized fields <bug-columns> <Launchpad itself:Triaged> < https://launchpad.net/bugs/898200 >
[17:06] <_mup_> Bug #295214: ordering options should match data that is displayed in bug listings <bug-columns> <confusing-ui> <lp-bugs> <Launchpad itself:In Progress by adeuring> < https://launchpad.net/bugs/295214 >
[17:06] <deryck> adeuring, well, sorry, maybe you linked both?  skimming through mail now. ;)
[17:07] <adeuring> deryck: right, I linked both
[17:07] <deryck> adeuring, ok, cool.  Sorry for the noise.
[17:07] <adeuring> np
[17:09] <rick_h_> deryck: one more please: http://paste.mitechie.com/show/469/
[17:10] <benji> gary_poster (and flacoste): neither updating germinate nor using an older version makes my errors (http://paste.ubuntu.com/763881/) go away
[17:10] <cjwatson> gary_poster: I'd be surprised if the new germinate could have broken anything, since prior to https://code.launchpad.net/~cjwatson/launchpad/refactor-cron-germinate/+merge/84624 (which hasn't landed yet), it wasn't actually tested
[17:11] <gary_poster> holy smoke, what happened yesterday on buildbot?  The currently running build has over 88 revisions, which start from over a week ago
[17:11] <cjwatson> it certainly has nothing to do with builds
[17:11] <cjwatson> as in buildmaster
[17:13] <gary_poster> cjwatson, germinate not being a cause makes sense in general.  There does seem to be a difference between ec2 and buildbot results though, and deb dependencies would explain it.  That said, something appears to be seriously screwy here.
[17:13] <gary_poster> in buildbot
[17:13] <gary_poster> and our devel
[17:14] <cjwatson> sure - just trying to help cut out probable red herrings
[17:15] <cjwatson> code not run in any tests has a low probability of breaking tests :-)
[17:15] <gary_poster> cjwatson, cool, thx.  when you said builds are not at fault, that's in response to looking at benji's pastebin (email)?  http://paste.ubuntu.com/763881/
[17:15] <gary_poster> it looks build-y but I don't know that system.
[17:26] <deryck> rick_h_, https://pastebin.canonical.com/56889/
[17:30] <deryck> mrevell, I've seen so much mail about long bug titles and one line vs. two, I can't even follow it any more.
[17:31] <deryck> mrevell, I feel like we're trying to solve too many problems.  What are we trying to solve?  Are we trying to get everything on one line, or trying to make better use of the space?
[17:34] <mrevell> Hey deryck. My priority is this: we should offer a one-line view that gives the same info you can get from the currently live non-beta bug listings. If there's a way that we can also cater to wider screen users by putting additional data to the right-hand of their wide page, rather than on a second line, then that'd be a nice bonus. I'm sorry this has become a confused issue over the past few days.
[17:35] <mrevell> I believe we've solved that first part.
[17:36] <deryck> mrevell, no worries, I just want to be clear. it feels a bit like we're talking about ideas, just to be talking now. And I was losing *why* we're discussing this. :)
[17:36] <deryck> mrevell, my impression from the check point call was that we had done as well as good on this, and we'd come back to this if we could, but were not really trying to.
[17:36] <deryck> mrevell, so I could have completely misunderstood, sorry.
[17:38] <mrevell> deryck, Sorry, I should have given a clearer direction to the mailing list threads. The discussion I was hoping for was about whether it was possible to use people's widescreens more effectively, while not forcing horizontal scrolling or too much wrapping on narrower screen users. I agree, the discussion seems to have strayed. Got time for a quick call?
[17:40] <deryck> mrevell, yeah, that might be useful.  I'm sorry. I think the discussion is happening across two different threads, too.
[17:40] <deryck> mrevell, skype?
[17:40] <mrevell> sure
[17:41] <deryck> mrevell, ready when you are.
[17:57] <lifeless> bigjools: you can use load-list to run one scenario precisely
[17:57] <bigjools> lifeless: never mind that, help me with logging!
[17:57] <bigjools> about to run out of hair
[17:58] <lifeless> sure, give me 5 to do some essential I-just-woke-up-stuff
[17:58] <lifeless> and you'll have my full attention, FWTW
[17:58] <bigjools> TMI
[17:59] <lifeless> mrevell: I *love* your attitude
[17:59] <mrevell> You do? :)
[17:59] <lifeless> mrevell: totally.
[17:59] <lifeless> mrevell: re the thing poolie forwarded.
[18:00] <lifeless> mrevell: your response has made my day, and it has only just started
[18:01] <cjwatson> gary_poster: that wasn't what I meant actually; sorry to confuse.  What I said was that germinate is not involved in anything to do with the buildmaster
[18:01] <cjwatson> or what I meant to say
[18:01] <cjwatson> gary_poster: germinate is loosely-coupled to the publisher, rather than to builders
[18:02] <mrevell> lifeless, Ah, cool :)
[18:02] <lifeless> mrevell: FYI: I rarely do sarcasm online; it doesn't transcribe well enough to avoid confusion.
[18:03] <lifeless> mrevell: so you can take emphatic positive statements as being what you see
[18:03] <mrevell> Great :)
[18:03] <bigjools> lifeless: I need to eat, I'll be back online in ~1hr
[18:03] <lifeless> bigjools: ok. I'll look at your notes on the bug
[18:03] <lifeless> bigjools: and repage it all in in prep.
[18:03] <deryck> rick_h_, I took a look at your MP again.  Generally it looks great.  Can we chat via mumble or Skype about something with it?
[18:03] <bigjools> lifeless: I solved the problem - sorta.  :)  You'll see ...
[18:03] <rick_h_> deryck: sure thing
[18:03] <lifeless> bigjools: I fear the pastch
[18:03] <deryck> rick_h_, mumble or Skype?
[18:03] <cjwatson> gary_poster: ah, Benji suggests python-lpbuildd in https://lists.launchpad.net/launchpad-dev/msg08641.html, which makes a lot more sense as that package is actually related
[18:04] <rick_h_> deryck: mumble
[18:04] <deryck> ack
[18:15] <deryck> rick_h_, see if Skype is better?
[18:15] <rick_h_> deryck: ok
[18:20] <flacoste> gary_poster: i asked myself the same question about the 88 revisions build
[18:20] <mrevell> Night all
[18:20] <flacoste> and didn't find a satisfactory answer
[18:20] <flacoste> there are like several testfix in there
[18:21] <lifeless> flacoste: https://bugs.launchpad.net/python-oops-amqp/+bug/901449 is looking to me like its the longpoll code failing
[18:21] <_mup_> Bug #901449: Rabbit failure when sending an OOPS seems to hang the producer <rabbit> <python-oops-amqp:Triaged> < https://launchpad.net/bugs/901449 >
[18:22] <flacoste> benji: i think your failure is bug #779367
[18:22] <_mup_> Bug #779367: spurious failure in test_builder.TestSlave <critical-analysis> <spurious-test-failure> <test-system> <Launchpad itself:Triaged> < https://launchpad.net/bugs/779367 >
[18:23] <flacoste> lifeless: how so?
[18:23]  * benji looks.
[18:23] <lifeless> flacoste: unreliablesession is a longpoll thing
[18:24] <lifeless> flacoste: the idea of 'big oopses' is a distraction. The issue was volume not size.
[18:24] <flacoste> lifeless: i got that, so poppy was the source of the volume, which killed rabbit
[18:25] <flacoste> and the app server dying is because of a bug in unreliablesession?
[18:25] <lifeless> flacoste: I *think* that unreliablesession is being hooked in automatically in the appservers; and it has (at least one) bug in handling rabbit issues (see comment 2)
[18:25] <benji> flacoste: it might be, comment #3 is exactly the same as mine
[18:26] <lifeless> flacoste: I think it may be. Doesn't mean we should stop looking
[18:26] <lifeless> flacoste: there is also bug https://bugs.launchpad.net/python-oops-amqp/+bug/901497 which could indeed affect LP appservers during rabbit issues
[18:26] <_mup_> Bug #901497: close_ignoring_EPIPE can trigger IOError <python-oops-amqp:Triaged> < https://launchpad.net/bugs/901497 >
[18:26] <lifeless> I've just pushed a release for that
[19:02] <bac> hi flacoste, trunk for lazr.uri is an old bzr format and it breaks dailydeb.  can you upgrade it?  for some reason that branch, unlike most other LAZR projects, is owned by PQM not ~lazr-developers
[19:03] <bac> flacoste: btw, lazr.uri and lazr.restfulclient need to be backported for the natty SRU of launchpadlib
[19:06] <rick_h_> deryck: ping, I've got the start working, but not sure if the look is going to pan out. http://uploads.mitechie.com/lp/limited_textinput.png
[19:07] <rick_h_> deryck: before I go with it more, what do you think, do we want to keep going with it?
[19:07] <flacoste> bac: if it's owned by PQM, it will need to be done by a losa
[19:07] <deryck> rick_h_, well, if we can drop the scrolling and overflow, have the outer-styled element expand as the stuff overflows, and drop the border around the input, then yeah, that could work. ;)
[19:08] <deryck> rick_h_, but if that leads to cascading issues, then I think it's not worth pursuing.
[19:08] <bac> flacoste: any reason to not change the ownership at the same time?
[19:08] <flacoste> bac: probably not
[19:08] <bac> will do
[19:09] <rick_h_> deryck: ok, yea didn't want to get too crazy changing all the css styles/etc. More meant the space on the side that appears and the fact that it'll cause more text to hit the default max_height setting of the resizing widget
[19:09] <rick_h_> which brings up scroll bars
[19:10] <rick_h_> deryck: I can remove the max height setting when used from the inline edit, but not sure if that would bring up other issues
[19:11] <deryck> rick_h_, I leave that to your judgement. Sorry, just don't know the ramifications of the max height restrictions without working on it more myself.
[19:11] <rick_h_> deryck: ok, sounds good.
[19:13] <lifeless> bac: most lazr trunks *should* be pqm or tarmac managed (same user) - we just haven't got a round tuit
[19:14] <bac> lifeless: oh, i thought this one was the outlier in the wrong direction
[19:14] <bigjools> o/ lifeless
[19:14] <lifeless> bigjools: oh hai
[19:14] <lifeless> bigjools: do you want voice ?
[19:15] <bigjools> lifeless: if I had remembered to bring my headset in....
[19:16] <bigjools> lifeless: the more I see of Twisted logging, the more I shake my head in disbelief
[19:19] <bigjools> lifeless: I got as far as noticing that the twisted stuff was monkey patching sys.stdout and then I quit in disgust
[19:21] <lifeless> bigjools: right
[19:21] <lifeless> that happens from self.logger.start(application) deep in twisted
[19:21] <lifeless> we have to work with that
[19:22] <lifeless> it framed the situation for my proposed solution
[19:22] <lifeless> (sorry for the slight latency replying, helping lynne with meds for cynthia)
[19:23] <bigjools> no worries
[19:23] <bigjools> lifeless: so I expect this is part of the problem with sys.stdout not working in the stream handler
[19:24] <bigjools> but what I can't fathom is the duplication if I use stdioonnastick
[19:24] <lifeless> the looping ?
[19:24] <bigjools> no
[19:24] <bigjools> it prints each message about 20 times
[19:25] <bigjools> *bizarre*
[19:25] <lifeless> I did not see that
[19:25] <bigjools> apply the diff in my paste
[19:25] <bigjools> use the log.stdioonnastick option
[19:25] <lifeless> ok, so RotatableLogFileObserver fits in as a replacement observer
[19:25] <lifeless> we want that to be the fallback observer for the OopsObserver
[19:26] <bigjools> yes
[19:28] <lifeless> so, we don't want errors going to stdout/stderr because nothing is watching those streams on the daemon
[19:28] <bigjools> tbh I was trying to rip out all use of the python logger and use twisted's
[19:28] <lifeless> that would be ideal
[19:28] <lifeless> but
[19:28] <bigjools> but
[19:28] <bigjools> Hooks has some debug logging
[19:29] <lifeless> we use an unknown amount of LP code today
[19:29] <lifeless> someone might add a logging output in there somewhere and screw us
[19:29] <lifeless> so we have to handle it
[19:29] <bigjools> right
[19:29] <lifeless> if we get one solid recipe
[19:29] <lifeless> we should be set
[19:29] <bigjools> I have had an unpleasant day working all this out
[19:29] <lifeless> I bet
[19:30] <lifeless> So, I think we need to do:
[19:30] <lifeless> logging.basicConfig(stream=sys.stdout)
[19:31] <lifeless> before all the twisted application stuff kicks in - e.g. where setup_tacfile_logging happens is fine.
[19:31] <lifeless> the fiddling with channels etc isn't needed
[19:31] <bigjools> ok
[19:31] <lifeless> we can pass level=X to that function too
[19:31] <lifeless> http://docs.python.org/library/logging.html#logging.basicConfig
[19:32] <lifeless> using getLogger('poppy-sftp') should be fine
[19:32] <bigjools> *should* :)
[19:32] <bigjools> but wasn't
[19:32] <lifeless> once everything else is sorted I mean
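A minimal sketch of the python logging half of the suggestion above, not the actual poppy change. It assumes Python 3.8+ (for `force=True`, which the chat does not mention), writes to an in-memory stream so the result can be inspected, and picks `INFO` as an arbitrary level. The helper name `configure_python_logging` is hypothetical; only `logging.basicConfig` and the `'poppy-sftp'` logger name come from the discussion.

```python
import io
import logging
import sys


def configure_python_logging(stream=sys.stdout, level=logging.INFO):
    """Send all python logging output to a single stream.

    force=True replaces any handlers already on the root logger;
    without it, basicConfig does nothing if handlers already exist.
    """
    logging.basicConfig(stream=stream, level=level, force=True)


# Demonstrate against an in-memory stream instead of the real stdout.
buf = io.StringIO()
configure_python_logging(stream=buf, level=logging.INFO)

logger = logging.getLogger('poppy-sftp')
logger.info("upload accepted")
logger.debug("filtered out at INFO level")

captured = buf.getvalue()
```

In a daemon this would be called once, before the twisted application setup runs, with the default `stream=sys.stdout`.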
[19:32] <lifeless> we need to change the OOPSObserver setup
[19:33] <lifeless> its participating in the loop
[19:33] <bigjools> I am really worried this will screw over the  build manager
[19:33] <lifeless> does the build manager call setup_tacfile_logging ?
[19:33] <bigjools> no
[19:34] <lifeless> we probably want what the builddmanager has
[19:34] <lifeless> to fix the separate bug that poppy log rotation is terrible
[19:34] <bigjools> it uses Rotatable...
[19:34] <bigjools> right
[19:34] <lifeless> we had 50G of poppy logs or something
[19:34] <lifeless> it was taking a while to rsync after each rotation
[19:34] <bigjools> yeah it logs excessively anyway
[19:35] <lifeless> so yeah, the builddmanager could be affected
[19:35] <lifeless> but, we have an end goal that makes sense for both
[19:35] <lifeless> application.addComponent(
[19:35] <lifeless>     RotatableFileLogObserver(options.get('logfile')), ignoreClass=1)
[19:35] <lifeless> thats what builddmanager does
[19:36] <lifeless> that won't give OOPS integration
[19:36] <lifeless> it probably needs to be
[19:36] <lifeless> application.addComponent(OOPSObserver(oops_config, RotatableFileLogObserver(options.get('logfile')), ignoreClass=1))
[19:36] <lifeless> and may need a tweak to the OOPSObserver to pass down the rotation event. I don't know.
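The wrapping idea in the lines above can be sketched without twisted at all. This is an illustration under assumptions, not the real OOPSObserver: the class names `CollectingObserver` and `ErrorReportingObserver` are hypothetical, and the only thing borrowed from twisted is the convention that a log event is a dict with an `isError` key. Whether the real observer forwards every event to its fallback is also an assumption.

```python
class CollectingObserver:
    """Stand-in for a file-backed observer; it just collects events."""

    def __init__(self):
        self.events = []

    def __call__(self, event):
        self.events.append(event)


class ErrorReportingObserver:
    """Hypothetical wrapper: report errors, then forward to a fallback."""

    def __init__(self, report, fallback):
        self.report = report      # callable that records an error report
        self.fallback = fallback  # observer that receives every event

    def __call__(self, event):
        if event.get('isError'):
            self.report(event)
        # Forward unconditionally so the log file still sees everything.
        self.fallback(event)


reports = []
fallback = CollectingObserver()
observer = ErrorReportingObserver(reports.append, fallback)

observer({'isError': 0, 'message': ('service started',)})
observer({'isError': 1, 'message': ('something broke',)})
```

With this shape, only one observer is registered with the application, and error reporting and file logging cannot disagree about which events they saw.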
[19:36] <bigjools> simples :)
[19:37] <lifeless> that + a logging.basicConfig(stream=stdout, level=X)
[19:37] <bigjools> RFLO installs its own signal handler
[19:37] <lifeless> should give us everything we need
[19:37] <bigjools> ok I am going to experiment now
[19:37] <lifeless> this function: set_up_oops_reporting
[19:38] <lifeless> in loggingsupport.py
[19:38] <lifeless> is what creates the loop
[19:38] <lifeless> (because it turns stderr output into oopses
[19:38] <lifeless> and writes a msg on oops
[19:38] <lifeless> and joins twisted msg() output to python logging
[19:38] <bigjools> well I managed to stop the loop without changing that
[19:38] <lifeless> which is joined to stderr by default
[19:38] <lifeless> sure, you can break it at a number of points.
[19:38] <bigjools> and I've seen looping in poppy before your changes
[19:39] <lifeless> here, one sec
[19:44] <lifeless> http://pastebin.com/LV3QtjT9
[19:44] <lifeless> this is a sketch
[19:44] <lifeless> we have techdebt all around
[19:44] <lifeless> e.g. in the SSHServerService which self-configures oops reporting
[19:44] <lifeless> rather than letting it be configured for it
[19:45] <lifeless> you may want to add a flag to stop calling the setup rather than changing what it calls, etc etc
[19:47] <bigjools> lifeless: yes that will affect a few things
[19:48] <lifeless> yup
[19:48] <bigjools> this, b-m and the branch server iirc
[19:48] <lifeless> won't affect b-m
[19:48] <lifeless> AFAICT from daemons/buildd-manager.tac
[19:49] <lifeless> and manager.py
[19:49] <lifeless> it only grabs the rotatablefileobserver from loggingsupport
[19:49] <bigjools> sorry not b-m
[19:49] <lifeless> we *should* change the buildd-manager to get oopses from it, but we should get poppy and the bzr service sorted first.
[19:50] <bigjools> well this will fix the bzr server I hope
[19:50] <bigjools> unless ...
[19:50] <lifeless> right, once the code fallout is done :)
[19:51] <lifeless> basically the SSHService can't do its own oops config, because its being used in cooperation with other services, but logging-glued-in oops configuration is global.
[19:51] <lifeless> so it has to be pulled out of there.
[19:51] <lifeless> it could have its own oops config object, for direct raising of soft reports.
[19:51] <lifeless> but thats a totally different question.
[19:53] <bigjools> right, separate bug
[19:53] <bigjools> I shall fix this first
[19:54] <lifeless> yeah
[19:54] <lifeless> this is contained to:
[19:54] <lifeless>  - stop sshservice configuring logs or oops
[19:54] <lifeless>  - delete all the accumulated cruft around that
[19:54] <lifeless>  - return an observer with rotation which we can use anywhere
[19:54] <lifeless>  - and setup python logging to just stdout
[19:54] <lifeless> -fin
[19:56] <lifeless> oh, and it looks like a trivial 'delete set_up_logging_for_script' - just doing a full grep to be sure
[19:57] <lifeless> aieee
[19:57] <lifeless> supermirror-pull
[19:58] <lifeless> that should be an easy fix - migrate across to the set_up_oops_reporting API and call t.python.log.startLoggingWithObserver(oops_observer)
[19:59] <lifeless> or move it to the end of the set_up_logging_for_script call
[19:59] <lifeless> either will work AFAICT
[20:00] <lifeless> bigjools: you use 'longpoll' right as the tag?
[20:01] <bigjools> lifeless: yes
[20:01] <lifeless> FYI https://bugs.launchpad.net/launchpad/+bug/901844
[20:01] <bigjools> prob need to make it official
[20:01] <_mup_> Bug #901844: unreliablesession service causing oops during after-request processing <fallout> <longpoll> <oops> <Launchpad itself:Triaged> < https://launchpad.net/bugs/901844 >
[20:05] <bigjools> lifeless: so in your patch you've removed startLoggingWithObserver
[20:06] <lifeless> right
[20:06] <lifeless> twistd calls that itself
[20:06] <lifeless> if the application thing that the buildd-manager uses works, it should be enough
[20:07] <bigjools> yeah
[20:07] <lifeless> I haven't traced through to see whats happening
[20:07] <bigjools> I keep thinking that it won't but that's the txlongpoll stuff .... it's a TAP not a TAX
[20:07] <bigjools> TAC
[20:12] <bigjools> lifeless: I like your optimism of expecting "options" to be available in the tac file :)
[20:12] <lifeless> bigjools: see buildd-manager.tac
[20:13] <bigjools> ah different options
[20:16] <james_w> would someone please review my branch? https://code.launchpad.net/~james-w/launchpad/binaries-created-since/+merge/85022
[20:16] <james_w> jcsackett, are you still on call?
[20:17] <jcsackett> james_w: i am.
[20:17] <jcsackett> you have something for me?
[20:18] <james_w> https://code.launchpad.net/~james-w/launchpad/binaries-created-since/+merge/85022 if you would be so kind
[20:18] <jcsackett> i would be happy to.
[20:18] <abentley> jcsackett: could you please review https://code.launchpad.net/~abentley/launchpad/batch-dealising/+merge/85023 ?
[20:21] <jcsackett> abentley: sure. it's next.
[20:22] <abentley> jcsackett: tia
[20:23] <bigjools> lifeless: what happens if set_up_oops_reporting is called twice?
[20:24] <lifeless> you'll have two signal handlers installed ?
[20:24] <lifeless> seems like a bad idea
[20:30] <flacoste> lifeless: fwiw, googlebot doesn't execute javascript
[20:30] <flacoste> http://code.google.com/web/ajaxcrawling
[20:30] <flacoste> is the convention they suggest to make ajax site indexable
[20:30] <bigjools> lifeless: just a theoretical question.  Anyway, your change has the same effect as my branch - nothing from the python log ends up in the twisted log
[20:30] <flacoste> lifeless: and sorry for the out-of-context info :-)
[20:32] <lifeless> flacoste: interesting
[20:33] <flacoste> yeah, more server work
[20:33] <flacoste> which is what we hoped to get rid of
[20:34] <rick_h_> deryck: I've pushed changes to the inline editor, appreciate your feedback on if it gets the look you were thinking of
[20:36] <lifeless> bigjools: hmm
[20:36] <bigjools> lifeless: this is why I have no hair left today :)
[20:36] <lifeless> bigjools: so, I suggest you push this current experiment somewhere
[20:36] <lifeless> bigjools: I will dig while you sleep
[20:36] <lifeless> bigjools: and hand-back your fri morning.
[20:36] <lifeless> bigjools: its well past your EOD now :)
[20:37] <jcsackett> james_w: is the interface addition of created_since_date to add a field on a form in the webapp? i don't see anything connecting it back, but i'm unfamiliar with the soyuz bits of the webapp and could see it just being automatically handled with that addition. just want to make sure that's the case.
[20:37] <bigjools> lifeless: yes, I am 2 whiskies in to EOD
[20:37] <james_w> jcsackett, I want it for the webservice, I don't think there's really anywhere it would fit in the webapp either
[20:38] <james_w> jcsackett, I have an API client that needs the parameter to make it a billion times more efficient
[20:38] <lifeless> flacoste: whats the bug # for the linaro-cannot-upload-releases bug? I can't seem to find it again..
[20:38] <jcsackett> james_w: ah.
[20:39] <james_w> lifeless, https://bugs.launchpad.net/launchpad/+bug/194558
[20:39] <_mup_> Bug #194558: Project file uploads time-out but don't OOPS <escalated> <linaro> <Launchpad itself:Triaged by gmb> < https://launchpad.net/bugs/194558 >
[20:39] <lifeless> james_w: thank you!
[20:39] <flacoste> what james_w says :-)
[20:39] <lifeless> bigjools: so yeah, pastebin or push to a branch, and make it my problem
[20:40] <lifeless> bigjools: I'll get the basics hanging together properly.
[20:40] <bigjools> lifeless: http://pastebin.ubuntu.com/764210/
[20:40] <bigjools> I've actually been using it on a branch where I've added extra gpg debugging to poppy
[20:41] <bigjools> good way to test the logging :)
[20:41] <jcsackett> james_w: r=me.
[20:41] <jcsackett> abentley: looking at yours now.
[20:41] <james_w> jcsackett, merci
[20:41] <james_w> now to see if I can actually land it
[20:41] <lifeless> bigjools: cool, I saw the MP for that branch
[20:42] <lifeless> bigjools: btw, I don't think we should log errors for these gpg failures if its a sigfail - thats a user problem (analogous to a 404). But lets worry about that *after* we sort out germanium
[20:44] <bigjools> lifeless: right - this is purely for debugging since we have NFI why it fails
[20:44] <bigjools> there's a "source" debug property that'll tell us where the error originated
[20:44] <lifeless> bigjools: ah, I don't mean your change, I mean the existing code
[20:45] <bigjools> lifeless: the logging comes from the FTP handler in Twisted :/
[20:45] <lifeless> something, perhaps in twisted itself, is logging an isError
[20:45] <lifeless> we'll want to add an oops filter to mask those from being oopses I think. eventually.
[20:45] <bigjools> the writer handler just tells the parent class there was a failure and it does the rest
[20:51] <lifeless> yup
[20:51] <lifeless> flacoste: thanks for the link
[20:53] <jcsackett> abentley: just to make sure i'm reading this all right, the purpose of this branch is to avoid reloading stuff and just grab the data we've already got, right?
[20:54] <abentley> jcsackett: right.
[20:54] <jcsackett> abentley: awesome.
[20:54] <deryck> rick_h_, sorry, missed the ping somehow.  looking now.
[20:54] <jcsackett> abentley: okay, given that my understanding is on the mark, this looks good to me. r=me.
[20:54] <abentley> jcsackett: thanks.
[20:57] <rick_h_> deryck: cool, I'm off to run around, but feel free to leave any notes and I'll check it out in the morning
[20:58] <deryck> rick_h_, sounds good.  Have a nice evening.
[21:04] <abentley> deryck: I'm having trouble reproducing 900398 locally.  I've set a bug to incomplete, and it still says "There are currently no open bugs."  Any tips?
[21:06] <deryck> abentley, ah, expirable.  yeah, you'll need to back date a bug and back date when it was marked incomplete, by poking at db or in console.
[21:07] <abentley> deryck: How old does it need to be?
[21:09] <deryck> abentley, hmmm, that's a good question.  my impulse was >60 but not sure what the expireable view shows, since those should auto expire.
[21:09]  * deryck looks at some things....
[21:09] <abentley> deryck: that's why I find it confusing.  I thought all incomplete bugs were "expirable", and if they were >60 they'd be expired.  Well, invalid.
[21:10] <wgrant> abentley: Assigned bugs or those with multiple tasks don't expire, IIRC.
[21:11] <abentley> wgrant: Ah, that could be it.
[21:12] <wgrant> abentley: I believe there are other exceptions as well. They used to be documented, but I'm not sure if it's up to date.
[21:14] <abentley> wgrant: hmm.  I set https://bugs.launchpad.dev/ubuntu/+source/linux-source-2.6.15/+bug/10 to incomplete, and it claims it will expire, but doesn't show in the list.
[21:14] <_mup_> Bug #10: It says "displaying matching bugs 1 to 8 of 8", but there is 9 <lp-bugs> <Launchpad itself:Invalid> < https://launchpad.net/bugs/10 >
[21:14] <deryck> abentley, also the project has to have expiry enabled for that view to work.
[21:14] <deryck> abentley, and I think expiry is off by default
[21:14] <abentley> deryck: Yes, I made sure that was checked.
[21:15] <deryck> ok, hmmm.  can't be a dupe, can't be assigned, can't be targeted to milestone
[21:15]  * deryck is thinking of every criteria
[21:16] <deryck> abentley, ah.  and it is a config param for how old it needs to be.  config.malone.days_before_expiration
[21:17] <deryck> abentley, and as I read the method to get the list, the bugs have to be created than that value.
[21:17] <deryck> s/created/greater/
[21:17] <abentley> deryck: It's particularly weird, because lp says it will expire on the bug page, but doesn't list it in the listing.
[21:17] <wgrant> Oh, so +expirablebugs or whatever it is only makes sense when expiry is disabled?
[21:17] <wgrant> Because otherwise everything there should already have expired...
[21:18] <deryck> well, expiring has to be enabled for that page to have any bugs.  but yeah, I agree it doesn't make sense why a bug has to be ready to expire before it shows on that page.
[21:19] <wgrant> "An Incomplete bug can remain in this list indefinitely, so long as the bug is regularly updated."
[21:19] <abentley> I guess that would explain why it's a custom view, instead of searching on status=Incomplete.
[21:19] <wgrant> That suggests it doesn't have the timeout.
[21:19] <abentley> wgrant: I think that means that updates reset the timeout.
[21:19] <wgrant> But then it wouldn't be on the list.
[21:19] <wgrant> So it wouldn't stay on the list indefinitely.
[21:20] <abentley> wgrant: Maybe the page doesn't know what it actually lists?
[21:20] <deryck> the view uses findExpirableBugs which does: Bug.date_last_updated < CURRENT_TIMESTAMP AT TIME ZONE 'UTC' - interval '%s days'
[21:20] <deryck> where the config value is the %s
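The cutoff in that query can be restated as a small Python check. This is only a sketch of the date comparison: the function name `is_listed` is hypothetical, the 60-day value stands in for `config.malone.days_before_expiration` (deryck's guess earlier, not a confirmed setting), and the other expiry criteria discussed above (assignee, dupes, milestones) are left out.

```python
from datetime import datetime, timedelta, timezone


def is_listed(date_last_updated, min_days_old, now=None):
    """True if the bug was last updated more than min_days_old days ago."""
    now = now or datetime.now(timezone.utc)
    return date_last_updated < now - timedelta(days=min_days_old)


# Fixed reference time so the results do not depend on when this runs.
now = datetime(2011, 12, 8, tzinfo=timezone.utc)
updated_yesterday = now - timedelta(days=1)
updated_long_ago = now - timedelta(days=90)

# With an assumed 60-day threshold, a freshly touched bug is not listed.
listed_recent = is_listed(updated_yesterday, 60, now)
listed_old = is_listed(updated_long_ago, 60, now)
# With a threshold of 0 days, anything updated before "now" is listed.
listed_recent_zero = is_listed(updated_yesterday, 0, now)
```

This is why a bug that was only just set to Incomplete does not show up in the listing while the threshold is the configured number of days.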
[21:20] <lifeless> have we finished the migration to the new ENUM as well ?
[21:20] <wgrant> Yeah, the docs seem wrong.
[21:21] <wgrant> That makes the view pretty useless.
[21:21] <deryck> indeed
[21:21] <abentley> deryck: do we want to fix this view or kill it?
[21:21] <wgrant> Oh.
[21:21] <deryck> you can only view them on that view when they're old enough to expire but before the script has actually run :)
[21:21] <wgrant> But findExpirableBugTasks takes min_days_old as an argument.
[21:21] <flacoste> wtf, i get this email back from PQM: http://pastebin.ubuntu.com/764259/
[21:22] <flacoste> what is that supposed to mean
[21:22] <wgrant> flacoste: Check the attachment.
[21:22] <wgrant> flacoste: It will say the regex doesn't match.
[21:22] <flacoste> ah!
[21:22] <flacoste> attachments!
[21:22] <flacoste> that must be new
[21:22] <flacoste> wgrant: thanks!
[21:22] <wgrant> deryck: Perhaps we should change BugTaskExpirableListingView to show everything older than 0 days?
[21:23] <deryck> wgrant, yeah, I think so.
[21:23] <deryck> abentley, ^^.  Just fix it to show anything that could expire when the time comes.
[21:23] <wgrant> I guess for years it made sense to show only the stuff that was really expirable -- because they never actually expired.
[21:23] <abentley> deryck: Okay.
[21:23] <deryck> That's my understanding of what this view is meant to show.
[21:23] <deryck> wgrant, heh. yeah.
[21:23] <abentley> wgrant: older than 0 days == everything?
[21:23] <flacoste> why are we still in testfix mode?
[21:24] <wgrant> abentley: I think so.
[21:24] <flacoste> since both lp and db_lp are building happily?
[21:24] <abentley> flacoste: because the last build attempt failed, and no one has landed a testfix.
[21:24] <wgrant> flacoste: I'm not sure. It may be because the last devel build blew up really impressively, and buildbot apparently lost its memory.
[21:24] <wgrant> abentley: That doesn't normally matter.
[21:24] <flacoste> right
[21:24] <wgrant> Normally as long as both are either successful or building, it's happy.
[21:24] <flacoste> yep
[21:25] <wgrant> But buildbot-poller probably doesn't handle results that end in an exception, I guess.
[21:25] <flacoste> but the memory aspect might confuse buildbot-poller :-/
[21:25] <flacoste> or something like that
[21:25] <wgrant> abentley, deryck: Hah
[21:25] <abentley> wgrant: I could have sworn I saw that behaviour yesterday when I forced a build.
[21:25] <wgrant> That's a recent change.
[21:25] <wgrant> 11057.8.2
[21:25] <wgrant>   modify can_expire to use the days_before_expiration config option
[21:25] <wgrant>      expirable_bugtasks = list(bugtaskset.findExpirableBugTasks(
[21:25] <wgrant> -        0, getUtility(ILaunchpadCelebrities).janitor))
[21:25] <wgrant> +        config.malone.days_before_expiration, getUtility(ILaunchpadCelebrities).janitor))
[21:26] <deryck> ah interesting.
[21:26] <abentley> wgrant: thanks.
[21:27] <deryck> That's a bdmurray change.  I'm sure he and I talked about it, but don't recall now why we did that.
[21:27] <wgrant>   [r=leonardr][ui=none][bug=595124] unexport IBug.can_expire which is
[21:27] <wgrant>         confusing instead create and export IBug.isExpirable which can
[21:27] <wgrant>         accept a custom number of days.
[21:27] <abentley> It was proposed here: https://code.launchpad.net/~brian-murray/launchpad/595124/+merge/28543
[21:27] <wgrant> Is the landing message.
[21:27] <wgrant> How odd.
[21:27] <wgrant> It seems unrelated.
[21:27] <lifeless> ui=none is rather stale
[21:28] <deryck> heh, I even discussed this and approved then.
[21:28] <deryck> well, I unapprove my approval. ;)
[21:28] <wgrant> deryck: It made sense back then :)
[21:28] <wgrant> It was still switched off.
[21:28] <abentley> deryck: lol
[21:28] <wgrant> So it was useful for expirable-bugs to show what *would* happen.
[21:29] <deryck> yeah, and as it is now, we'll never actually see them.
[21:30] <deryck> unless the expiry script barfs. ;)
[21:30] <wgrant> Yep.
[21:30] <wgrant> revert revert revert
[21:33] <wgrant> baaaah
[21:33] <wgrant> **********************************************************************
[21:33] <wgrant> Can't use pdb.set_trace when running a layer as a subprocess!
[21:33] <wgrant> **********************************************************************
[21:33] <wgrant> How nice.
[21:35] <wgrant> Ah, OK, lp-buildd problem is clear now.
[21:35]  * wgrant fixes.
[21:35]  * wgrant was wrong to blame poolie :(
[21:36] <abentley> deryck, wgrant: thanks.  Can now reproduce the issue.
[21:37] <nigelb> wgrant: winpdb might help.
[21:37] <nigelb> statik showed me that trick. It helped when I was debugging LP :)
[21:38] <poolie> glad it wasn't me
[21:38] <poolie> wgrant, what was it?
[21:38] <wgrant> poolie: We added a new config option last week.
[21:38] <wgrant> poolie: It's mandatory.
[21:38] <wgrant> poolie: The config file is installed by python-lpbuildd, but the migrator is in launchpad-buildd.
[21:38] <wgrant> So upgrading python-lpbuildd doesn't upgrade the config file.
[21:39] <poolie> ah
[21:39] <poolie> kind of my fault for not separating them better perhaps
[21:39] <wgrant> Meh.
[21:40] <poolie> i don't know if making it mandatory but added by upgrading makes sense, but i don't know the specifics
[21:40] <wgrant> I guess I'll move it, alter the migrator to also migrate for 111 if the option isn't in the file, and release a new one.
[21:40] <wgrant> poolie: It's how all the path configuration is done, unfortunately.
[21:40] <wgrant> It's all pretty ancient and bad.
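The fix wgrant describes (have the migrator add the new mandatory option when it is missing from an existing config file) can be sketched roughly as below. This is only an illustration, not the launchpad-buildd migrator: the function name `ensure_option`, the section and option names, and the default value are all hypothetical, and it uses Python 3's standard `configparser`.

```python
import configparser
import io


def ensure_option(config_text, section, option, default):
    """Return config text in which `option` exists in `section`.

    Hypothetical helper: an existing value is left untouched; the
    default is only written when the option is absent.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(section):
        parser.add_section(section)
    if not parser.has_option(section, option):
        parser.set(section, option, default)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Because the helper only writes the default when the option is absent, running it again on an already-migrated file leaves the existing value alone, which is the property an upgrade-time migrator needs.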
[21:44] <lifeless> poolie: http://lists.linaro.org/pipermail/linaro-dev/2011-December/009059.html suggests a place where git hosting would be valuable to LP users
[21:45] <poolie>  yep
[21:46] <lifeless> abentley: what's the expiry policy on your ajax batch cache?
[21:46] <abentley> lifeless: None.
[21:46] <poolie> i think the clearest case for it is as a replacement for git.kernel.c.c. and git.l.c
[21:46] <lifeless> abentley: so users have to refresh the page to see new results?
[21:46] <abentley> lifeless: right.
[21:47] <lifeless> abentley: you might like to have a bug noting this - one of the things we're trying to do long term is make LP more responsive, and having to refresh will conflict with that
[21:48] <abentley> lifeless: the irony is, of course, that not loading is a great way of making Launchpad more responsive.
[21:49] <lifeless> abentley: well, it's a way. I don't think it gets used much at all - e.g. see poolie's data that users use 'next' very rarely.
[21:50] <abentley> lifeless: For users that don't use the cache, its expiry policy doesn't matter :-)
[21:50] <lifeless> agreed
[21:51] <lifeless> just noting that this is a side effect of the cache - part of the expected overhead of having a cache
[21:51] <lifeless> It is a significant difference to the non-ajax behaviour where next/prev give live data.
[21:52] <abentley> lifeless: So if they're not using the cache, then it neither makes launchpad more responsive nor forces them to reload.
[21:52] <lifeless> It would be nice to have the user subscribe to the bugs visible in the batch, so they can be removed from the batch live
[21:53] <lifeless> abentley: that's true. But we know some users do use the cache. And we haven't drilled into where that usage is distributed
[21:53] <lifeless> I forget what the percentage was
[21:53] <lifeless> call it N. It may be that N% of users use next/prev all the time.
[21:53] <abentley> lifeless: Yes, I'd love to see that data by user rather than by hit.
[21:55] <abentley> Anyhow, I will be happy to contribute to the effort to make pages update live.
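The trade-off in this exchange (a batch cache with no expiry policy keeps serving old next/prev results until the page is reloaded) can be illustrated with a small sketch. This is not Launchpad's client-side cache, which is JavaScript; the class name `BatchCache`, the `max_age` parameter, and the injectable `clock` are assumptions made for the example.

```python
import time


class BatchCache:
    """Hypothetical cache of result batches keyed by batch position.

    max_age=None models "no expiry policy": entries never go stale.
    A numeric max_age (seconds) makes old entries count as misses.
    """

    def __init__(self, max_age=None, clock=time.monotonic):
        self.max_age = max_age
        self.clock = clock
        self._entries = {}

    def put(self, key, batch):
        # Remember when the batch was stored so its age can be checked.
        self._entries[key] = (self.clock(), batch)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, batch = entry
        if self.max_age is not None and self.clock() - stored_at > self.max_age:
            # Too old: drop it so the caller fetches live data instead.
            del self._entries[key]
            return None
        return batch
```

With `max_age=None` a stored batch is returned for as long as the cache object lives, which corresponds to the behaviour being discussed; any finite `max_age` trades some responsiveness for fresher data.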
[21:57] <wgrant> poolie: Ah, actually, the config file I had was crufty. python-lpbuildd doesn't use the system config file at all. But there's a test-specific config file in the tree which wasn't updated.
[22:06] <lifeless> abentley: I think for now a bug noting that this is the situation is appropriate (because it's not obvious to users that this will be happening)
[22:06] <lifeless> abentley: I can file it if you like
[22:06] <abentley> lifeless: There's no spinner when retrieving from cache, so it should be obvious, but feel free.
[22:07] <lifeless> abentley: folk in europe working with small batches probably won't register the spinner anyhow :)
[22:11] <lifeless> abentley: bug 901892
[22:11] <_mup_> Bug #901892: bug search cache makes next/prev return old data <bug-columns> <Launchpad itself:Triaged> < https://launchpad.net/bugs/901892 >
[22:11] <lifeless> deryck: ^ FYI too
[22:11] <lifeless> abentley: it's a little entertaining to me that 7 years on we're still talking about cache implementations
[22:11] <lifeless> :)
[22:12] <deryck> ack, thanks
[22:14] <deryck> later on, everyone.
[22:17] <flacoste> need to run for the rest of the evening
[22:17] <flacoste> cu later
[22:19] <lifeless> ciao
[22:35] <mtaylor> lifeless: so - I set up launchpad translations a while back for nova, and they seem to be working, even though nova doesn't have a .pot file in the repo (since it generates one through distutils-extra)
[22:36] <mtaylor> lifeless:  I was starting to try to set up the same thing for glance, but I'm not sure I did it right - who should I bug with questions?
[22:37] <lifeless> mtaylor: if there is a CHR listed in #launchpad, start there
[22:37] <lifeless> mtaylor: failing that, just ask there - there are folk around
[22:37] <mtaylor> ok. cool. thanks
[22:38] <lifeless> mtaylor: failing that, open a ticket at l.n/launchpad