[00:06] <jelmer> hmm, a branch name from StevenK that doesn't make me go "WTF?"
[00:06]  * jelmer is disappointed
[00:06] <StevenK> jelmer: Which one?
[00:07] <jelmer> StevenK: refactor-imports-redux
[00:07] <StevenK> If it doesn't make you go WTF, "Diff against target: 11295 lines (+1298/-1451), 531 files modified" will
[00:08] <wgrant> sinzui: Hm, so just launchpadstatistic, librarian, logintoken and temporaryblobstorage left.
[00:08] <lifeless> poolie: we do more than 1M pages a day, we'd blow past their taster-account in no time ;)
[00:08] <wgrant> sinzui: I have a branch from a couple of weeks back for temporaryblobstorage.
[00:08] <StevenK> wgrant: sinzui was going to tackle logintoken
[00:23] <poolie> lifeless, how can i get the raw form of an oops?
[00:23] <poolie> or anyone
[00:33] <lifeless> from where
[01:17] <StevenK> wallyworld__: O hai. https://code.launchpad.net/~stevenk/launchpad/productset-all-lies/+merge/86314
[01:17] <wallyworld__> StevenK: looking now
[01:19] <wallyworld__> StevenK: any tests to amend?
[01:19] <StevenK> Not that I could see
[01:19] <wallyworld__> ec2 will tell us i guess
[01:19] <StevenK> Tempted to just -vvm registry
[01:19] <lifeless> sure there is a test to add ?
[01:20] <StevenK> I'd be surprised if ProductSet:+all wasn't tested by some doctest
[01:20]  * StevenK runs registry tests
[01:20] <wallyworld__> StevenK: i've +1'ed it but it would be cool if there were a doctest that could be added to
[01:20] <wallyworld__> or whatever
[01:59] <lifeless> poolie: :/
[02:31] <poolie> lifeless, ?
[02:31] <poolie> i found it on disk on devpad
[02:31] <poolie> why the frownie?
[02:32] <lifeless> I may have misinterpreted your answer to my reply to your advert for bson-dump
[02:33] <lifeless> ELONGCONTEXT
[02:33] <poolie> oh
[02:33] <poolie> i agree it would be good to do
[02:33] <poolie> i don't know why i didn't put it elsewhere in the first place
[02:33] <poolie> it was a while ago
[02:33] <poolie> perhaps all the external oops stuff seemed too much in flux?
[02:33] <poolie> or there were too many options for where to put it, so i took a lame default
[02:34] <lifeless> I felt, apparently wrongly, that you were being a bit uhm, 'well I've done it, nyar'.
[02:35] <lifeless> the perils of low bandwidth comms
[02:35] <poolie> ah, not really
[02:35] <lifeless> poolie: I'd really like to delete utilities/*
[02:35] <poolie> feeling a bit "omg so few days before holidays etc"
[02:35] <poolie> if you tell me a specific place to move it to that will help
[02:35] <lifeless> heh, fair enough.
[02:36] <poolie> i guess, something that knows about bson encoding and will be installed for all developers
[02:36] <poolie> i think splitting stuff is good but a minor consequence is that 'where do i do this' gets a bit harder
[02:36] <lifeless> I'd put it either in oops-datedir-repo or oops-tools itself
[02:39] <lifeless> its not urgent to move it
[02:40] <lifeless> if you're busy with other stuff in the holiday lead up, just ignore it.
[03:00] <poolie> i'll move it to oops-tools
[03:22] <StevenK> from bzrlib.plugins.builder.recipe import RecipeParseError
[03:22] <StevenK>     ImportError: No module named builder.recipe
[03:22]  * StevenK peers
[03:30] <jtv> StevenK, wgrant: I'm sorry to hear that I broke buildmaster again.  Never expected there'd be no missed spots at all, but didn't expect this many either.
[03:30] <StevenK> Did Gavin land the fix?
[03:31] <StevenK> jtv: The test coverage of buildd-master is just *horrid*.
[03:32] <StevenK> Ah, reverted in r14552.
[03:32] <StevenK> But marked with the bugs, and not [incr]. Sigh.
[03:32] <jtv> Should that be incr?
[03:33] <jtv> I completely forgot about that tag.
[03:33] <StevenK> jtv: You're rolling back the code, so I guess the next step is to fix the three bugs and land it again.
[03:34] <jtv> Well I'm not rolling anything back personally; I have to go back to clearing out the house.
[03:34] <jtv> But yes, I'm afraid that's the process.
[03:34] <StevenK> jtv: If so, our process says the 3 bugs should be closed. Except they won't be fixed.
[03:35] <jtv> Oh.
[03:35] <wgrant> So, PQM's been whinging about a conflict for 6 hours now.
[03:35] <StevenK> jtv: The qa-tagger will tag them needstesting, they'll get marked untestable, and rolled out.
[03:35] <wgrant> Is someone going to fix that at some point?
[03:36] <StevenK> I'm trying to sort out ImportError: No module named builder.recipe
[03:36] <jtv> StevenK: the bit you said about qa-tagger is what will happen regardless, no?
[03:36] <StevenK> jtv: Yes, but if it was marked [incr], the qa-tagger won't slam the bugs to Fix Committed.
[03:37] <jtv> Ah, now the pieces come together.
[03:38] <jtv> But I thought you said the rollback should be [incr], not the fixes themselves?
[03:39] <StevenK> jtv: Right. The rollback will be marked 'as part of this bug's fix', and then when the fixes land properly, the bugs should hit Fix Committed.
[03:39] <jtv> But you said the 3 bugs should be closed, without being fixed..?
[03:42] <StevenK> jtv: No, I said that's what was likely to happen due to the lack of [incr].
[03:45] <jtv1> StevenK: you said "if so, the process says the 3 bugs should be closed."  What was the "if so" referring to?
[03:46] <StevenK> jtv1: I can see we are talking past each other. I explained what would likely happen, and then shifted to talking about what should have happened instead.
[03:47] <jtv1> Ah, I think I get it now.  Thanks.
[04:01] <StevenK> Is checkwatches safe to run on qas?
[04:04] <wgrant> StevenK: Not really, no.
[04:04] <wgrant> StevenK: And it wouldn't be a very useful test anyway.
[04:05] <StevenK> wgrant: Okay. Safer to qa-untestable my checkwatches branch?
[04:05] <wgrant> I think so.
[04:10] <StevenK> wgrant: Looking at db-devel versus stable
[04:17] <poolie> lifeless, hm putting this in with the daemon seems not quite right
[04:20] <StevenK> wgrant: PQM silenced. Hopefully.
[04:28] <poolie> i'll put it in python-oops
[04:31] <wgrant> poolie: Doesn't it belong in oops-datedir-repo?
[04:31] <wgrant> I didn't think python-oops knew about BSON.
[04:32] <poolie> it mentions it in the docs but it doesn't use it in the code
[04:32] <wgrant> It doesn't depend on bson.
[04:33] <wgrant> That's all in datedir-repo/amqp
[04:33] <poolie> but it does not seem like you should need the repo code to inspect an oops file
[04:33] <poolie> i could make a new package
[04:33] <wgrant> Why not?
[04:33] <wgrant> python-oops doesn't do serialisation.
[04:33] <poolie> it seems like overkill for what is basically one line of code
[04:33] <wgrant> "oops file" is a concept that's only part of datedir-repo.
[04:34] <poolie> there are two potentially separate aspects
[04:34] <poolie> serializing as bson
[04:34] <poolie> and writing into per-date directories
[04:34] <poolie> you could reasonably have the first without the second
[04:35] <poolie> indeed if you just download one oops you probably will
[04:35] <wgrant> Sure, but python-oops deliberately doesn't know about serialisation like that.
[04:35] <wgrant> That's left to the repository implementations: datedir-repo and amqp.
[04:36] <poolie> amqp has its own separate serialization?
[04:37] <wgrant> It's BSON. I believe it uses datedir-repo's BSON serializer.
[04:37] <poolie> foo
[04:37] <wgrant> All roads lead to datedir-repo :)
[04:37] <poolie> python-oops says in the readme it defines a serialization
[04:38] <poolie> though i suppose it is ambiguous what 'the oops project' means
[04:39] <poolie> so that's why i just put it in utilities/.
[04:39] <wgrant> I think python-oops' docs are out of date.
[04:39] <wgrant> datedir-repo was extracted in r9
[04:43] <poolie> hm, so
[04:44] <poolie> i don't know
[04:44] <poolie> having the format be separate from the serialization seems good
[04:44] <poolie> having no comment at all about what serialization is used seems dumb
[04:44] <poolie> in practice multiple trees assume it is bson
[04:44] <wgrant> No.
[04:45] <wgrant> Multiple repository implementations use BSON.
[04:45] <wgrant> datedir-repo has an option to write out rfc822 as well.
[04:45] <wgrant> And it will read it perfectly happily.
[04:46] <wgrant> amqp could be changed to use pickles if you were sufficiently misguided, without affecting datedir-repo.
[04:46] <poolie> true
[04:46] <poolie> so there's no reason this should live in one of them rather than the other
[04:46] <wgrant> Well.
[04:46] <wgrant> I think it makes sense in datedir-repo.
[04:46] <wgrant> Since amqp's bson doesn't ever hit the disk as a file.
[04:47] <wgrant> It's purely encoding as it goes into rabbit, and decoding as it comes out.
[04:47] <wgrant> (it's then usually handed off to datedir-repo, where it's reencoded and written out into a file)
[04:47] <poolie> yeah i see
[04:47] <wgrant> So I think this script belongs in datedir-repo.
[04:48] <poolie> and if python-oops-tools offered an option to download it, it would get reserialized again there
[04:48] <wgrant> Possibly.
[04:48] <wgrant> But maybe not.
[04:48] <wgrant> I think oops-tools is pretty tied to datedir-repo.
[04:49] <wgrant> Whereas amqp/datedir-repo/oops are very nicely separated.
[04:49] <wgrant> They actually have sensible interfaces, and work within them!
[04:50] <poolie> you know what, i'll just make it separate
[04:51] <wgrant> I think datedir-repo :) But ok.
[04:53] <wgrant> StevenK: Did you run that through ec2?
[04:53] <StevenK> wgrant: Which? The imports branch?
[04:53] <wgrant> Yes.
[04:53] <StevenK> Yeah, I did
[04:53] <wgrant> Hm.
[04:53] <StevenK> Why?
[04:54] <wgrant> A naive global format-imports should have broken stuff unless you were very lucky.
[04:54] <wgrant> Due to lp.codehosting's side-effects.
[04:54] <wgrant> Although I guess it is alphabetically early.
[04:54] <StevenK> There were 4 failures on ec2, which I fixed before lp-landing
[04:54] <wgrant> So it may be OK.
[04:59] <StevenK> wgrant: Still nervous?
[05:00] <wgrant> StevenK: Slightly.
[05:01] <StevenK> wgrant: I can forward you the failure mail if it will allay your concerns.
[05:08] <poolie> and there are at least two different python modules called 'bson'
[05:08] <wgrant> poolie: Yes :/
[05:08] <wgrant> And at least one of them is very buggy.
[05:08] <wgrant> (the one we use)
[05:12] <poolie> :)
[05:29] <poolie> wgrant, lifeless, https://code.launchpad.net/~mbp/python-oops-datedir-repo/bsondump/+merge/86338
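A minimal sketch of what a bsondump-style helper along the lines of the merge proposal above might look like; the real implementation is whatever landed in the linked branch. It assumes the standalone 'bson' module that provides loads(), and as noted further down the log there are at least two Python modules by that name.

    #!/usr/bin/env python
    """Dump BSON-serialised OOPS reports in a human-readable form."""
    import pprint
    import sys

    import bson  # assumption: the standalone module exposing loads()


    def main(argv):
        for filename in argv[1:]:
            with open(filename, 'rb') as report_file:
                report = bson.loads(report_file.read())
            pprint.pprint(report)
        return 0


    if __name__ == '__main__':
        sys.exit(main(sys.argv))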
[09:08] <bigjools> good morning
[09:09] <AutoStatic> Good morning
[09:11] <danhg> Morning all
[09:22] <AutoStatic> Some colleagues have asked me if I could set up an in-house Launchpad server so they could use it for their projects. They're probably only going to use the bug tracking, blueprint and repository functionality. I'm wondering though if Launchpad isn't a bit overkill then. What's your advice? I already set up a bug tracker for them (MantisBT), a Wiki for their blueprints, and setting up a repo is not much work either.
[09:23] <StevenK> bigjools: Hai. Will you have a chance to do your QA today?
[09:24] <bigjools> StevenK: hopefully! I got a bit blindsided yesterday
[09:24] <StevenK> Yes, that's why I didn't bug you then. :-)
[09:24] <bigjools> I have a theory about poppy
[09:25] <StevenK> It is horribly, horribly broken and needs to die?
[09:25] <bigjools> well you wrote it :)
[09:25]  * bigjools just hacked on the FTP bit
[09:26] <StevenK> Better than continuing to use Zope's horrible excuse for an FTP server.
[09:26] <StevenK> bigjools: What is your theory?
[09:27] <bigjools> StevenK: the ssh checks connect to the appservers to get the authorisation
[09:28] <bigjools> when we have FDT, the XMLRPC connection fails
[09:28] <bigjools> after that, it continues to fail forever until restarted
[09:28] <bigjools> not sure why, but meh, Twisted
[09:28] <bigjools> the swap death was caused by someone using a loop to connect
[10:54] <jml> anyone developing on precise?
[10:54] <jml> AutoStatic: I'd recommend *not* running Launchpad locally.
[10:55] <AutoStatic> jml: Yeah, we figured that out too: https://answers.launchpad.net/launchpad/+faq/920
[10:55] <jml> AutoStatic: it's pretty huge and the operational cost is non-trivial, even at low scale.
[10:56] <bigjools> allenap: in the tests in your branch, it's probably worth refactoring the bit that sets properties on objects in a r/w transaction
[10:56] <jml> AutoStatic: cool.
[10:56] <allenap> bigjools: Erm, which bit?
[10:56] <jml> AutoStatic: so, I'm not 100% sure what your question is then :)
[10:57] <allenap> bigjools: Like in test_handleStatus_OK_sets_build_log?
[10:57] <bigjools> allenap: line 72/83 of the diff
[10:57] <bigjools> allenap: I suspect we'll need to do that a lot more in the future
[10:58] <allenap> bigjools: I don't know what a better way would be. I could instead enter read-only mode in each test individually (via a fixture) I guess.
[10:58] <bigjools> allenap: I was thinking just a test helper
[10:58] <bigjools> like setattr
[10:58] <bigjools> but does the whole transactionny thing
[10:59] <allenap> bigjools: With the removeSecurityProxy thing too I assume.
[10:59] <bigjools> allenap: no, the caller can do that
[10:59] <AutoStatic> jml: Well, I got an instance running locally here and my question was more or less a stepping stone to some other questions
[11:00] <allenap> bigjools: Okay, I think I have a cool way to do that.
[11:00] <bigjools> allenap: of course :)  cheers
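A minimal sketch of the kind of helper being discussed here, setattr plus "the whole transactionny thing"; allenap's actual version is in the paste linked later in the log. The helper name is made up, and per bigjools the caller is expected to have already removeSecurityProxy'd the object.

    import transaction


    def set_attributes_and_commit(obj, **attributes):
        """Set attributes on obj and commit, so the changes survive the
        test switching into read-only mode afterwards."""
        for name, value in attributes.items():
            setattr(obj, name, value)
        transaction.commit()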
[11:01] <AutoStatic> jml: So I'm going to wipe out that local launchpad and convince my colleagues that they should look for something else
[11:02] <jml> AutoStatic: ok.
[11:08] <jml> bwahahaha
[11:09] <jml> Python 2.6
[11:09] <jml> sorry.
[11:09] <jml> good luck with that.
[11:16] <gmb> Argh. My connection drops out for ten minutes and when I get back bigjools has done the review I was doing. It's going to be one of those someone-else-does-all-the-work OCR days, is it?
[11:17] <gmb> (Also, he did a better job of it)
[11:17] <gmb> (Which galls)
[11:18] <allenap> jml: I've had a bash at getting Launchpad built on Precise, but I lost interest (it was late). Seems like the cool kids are using a schroot (which I am) or an LXC.
[11:18] <allenap> (running Lucid)
[11:19] <bigjools> gmb: shurely shome mishtake :)
[11:20] <nigelb> What's the firefighting section about?
[11:20] <nigelb> (in the topic)
[11:21] <bigjools> if we're in the middle of an incident
[11:21] <nigelb> ah. It makes topic. Nice.
[11:21]  * bigjools just added a million people on G+ and may live to regret it
[11:23]  * nigelb just searched on G+ for "bigjools"
[11:23] <nigelb> Dammit.
[11:23] <allenap> bigjools, gmb: Thank you both for the reviews :)
[11:24] <bigjools> nae prob
[11:35] <allenap> bigjools: Fwiw, this is what I did to factor out the things you suggested: http://paste.ubuntu.com/776193/
[11:37] <bigjools> allenap: not so much a refactoring as a rewriting :)
[11:38] <allenap> bigjools: Well, I'm already using it in my next branch, and will probably in the one after that :)
[11:38] <bigjools> heh
[11:39] <jml> allenap: I'm not suggesting you should actually make this change now, but it might be more re-usable as a Fixture.
[11:40] <allenap> bigjools: How do I go about QAing the revert I did? Or do we just say it's fine because it's approximately already on cesium.
[11:40] <allenap> ?
[11:41] <allenap> jml: Yeah, you're right. If it causes enough friction I'll change it.
[11:41] <bigjools> allenap: untestable
[11:42] <allenap> bigjools: Cool.
[11:47] <cjwatson> gmb: any further thoughts on my QA suggestions for https://code.launchpad.net/~mvo/launchpad/maintenance-check-precise/+merge/82125 ?
[11:49] <gmb> cjwatson: No, no further thoughts (sorry, meant to reply the other day but forgot after a reboot). Could you take care of QAing it for me? I'll make sure it lands today or tomorrow.
[11:49] <cjwatson> modulo holiday, yes I can
[11:49] <gmb> Excellent, thanks.
[11:52] <rick_h__> ./topic
[12:21] <jml> hmm.
[12:22] <jml> so I have a clean lucid schroot for building packages. Can I somehow leverage that to make an schroot dedicated to hacking on Launchpad?
[12:40] <cjwatson> you could copy the source directory and add a new entry in /etc/schroot/chroot.d/ for it
[12:41] <cjwatson> and drop the unioniness
[12:41] <cjwatson> I use a 'lucid-lp' schroot
[12:42] <jml> cjwatson: thanks.
[12:42] <jml> (also, my next laptop will have an SSD)
[12:56] <jml> hmm. I should probably do something like this for each Canonical-deployed project I work on.
[13:49] <jml> bzrlib.errors.ConnectionReset: Connection closed: Unexpected end of message. Please check connectivity and permissions, and report a bug if problems persist.
[13:49] <jml> got this trying to fetch bzr-git w/ update-sourcecode
[13:50] <jml> never mind.
[14:00] <al-maisan> jml: the /etc/resolv.conf in your chroot might be out of date..?
[14:01] <rick_h__> gmb: got a sec for review? https://code.launchpad.net/~rharding/launchpad/sort_labels_894744/+merge/86287
[14:01] <al-maisan> jml: try "sudo cp /etc/resolv.conf <path-to-chroot>/etc/resolv.conf" and see whether that helps
[14:01] <cjwatson> benji: I noticed that in the three branches of mine you reviewed yesterday, you left an Approved comment but didn't set the MP to Approved; was that deliberate?
[14:02] <benji> cjwatson: generally the MP initiator sets it to approved; sometimes they might be getting a DB review or a UI approval too
[14:03] <benji> I set the other one to approved because I was landing it and the machinery won't land unapproved branches.
[14:03] <cjwatson> oh, I didn't know that, my reviewer's always done it for me before
[14:03] <cjwatson> probably because I've always explicitly asked for landings :)
[14:07] <cjwatson> benji: ah, and I can't set the MP to Approved because I'm not in ~launchpad
[14:07] <cjwatson> benji: any chance of landings for always-index-d-i and sign-installer, then, if you have a chance?  It might be best to leave new-python-apt for a bit as it collides with https://code.launchpad.net/~mvo/launchpad/maintenance-check-precise/+merge/82125 and this way I do the merge rather than making somebody else do it
[14:07] <benji> heh, well that would make it harder
[14:09] <benji> cjwatson: sure, I'll start the landing of those in a bit
[14:10] <cjwatson> great, thank you
[14:13] <gmb> rick_h__: Sure thing; looking now.
[14:13] <rick_h__> gmb: ty much
[14:40] <gmb> rick_h__: Approved.
[14:46] <rick_h__> gmb: awesome, thanks
[16:30] <benji> I'd appreciate it if some kind soul would review this branch: https://code.launchpad.net/~benji/launchpad/bug-903532/+merge/86426
[16:30] <benji> if that kind soul has some translations knowledge, it would be even better
[16:49] <sinzui> benji, I can take it
[16:50] <benji> sinzui: cool, thanks
[16:52] <sinzui> benji, r=me
[16:53] <benji> sinzui: thanks
[20:42] <lifeless> gary_poster: I'm around for a bit if you want to talk oopses more
[20:43] <gary_poster> thanks lifeless on call
[21:44] <wallyworld> sinzui: jcsackett: can we mumble now?
[21:44] <sinzui> yes
[21:45] <wallyworld> sinzui: fucking mumble is doing its thing again where it consumes all my cpu. i have to reboot
[22:04] <james_w> anyone want to take a look at https://code.launchpad.net/~james-w/launchpad/bpph-binary-file-urls/+merge/86470 ?
[22:10] <poolie> o/ james_w
[22:10] <poolie> hi all
[22:11] <james_w> hi poolie
[22:38] <dobey> hey poolie
[22:53] <huwshimi> On the deployable revisions page it says "Revision 14556 can be deployed: orphaned". Does that mean I can't qa it?
[22:58] <lifeless> either it has no bug linked, or the bug has been closed already
[22:58] <lifeless> if its the latter, you can reopen the bug
[23:14] <huwshimi> lifeless: Will it get picked up by the qa tagger etc. then?
[23:15] <huwshimi> lifeless: Should it be Fix Committed or will any status other than Fix Released do?
[23:21] <poolie> ok now i've played with juju it is annoying me that launchpad doesn't use it
[23:23] <jelmer> poolie: :)
[23:26] <poolie> jelmer, i just talked to flacoste about bug 795025
[23:26] <_mup_> Bug #795025: no way to gracefully disconnect clients and shut down the bzr server <canonical-losa-lp> <hpss> <launchpad> <ssh> <Bazaar:Fix Released by jameinel> <Launchpad itself:Triaged> < https://launchpad.net/bugs/795025 >
[23:26] <poolie> istm there is a safer way to do it
[23:26] <poolie> which is to have a signal to tell the processes to just stop listening
[23:26] <poolie> then we can start a new one
[23:34] <jelmer> poolie: will that work with haproxy?
[23:34] <poolie> i think so?
[23:34] <poolie> haproxy will detect that it's down?
[23:34] <jelmer> I haven't looked at it, so not exactly sure how its communication with services works
[23:35] <jelmer> ah, so... so we shut the existing one down and then when haproxy starts another one that's using the new code?
[23:35] <poolie> i think it's some combination of: seeing if the port is listening, plus pinging a separate http port that reports on the status
[23:35] <poolie> more precisely:
[23:36] <poolie> we tell the existing one "stop accepting connections", and it closes its listening socket
[23:36] <poolie> and then haproxy notices it's down, i guess
[23:36] <jelmer> that makes sense
[23:37] <poolie> and then we start a new instance listening on the same port, which will be running the new code
[23:37] <poolie> then the old process can either exit by itself when all the connections are done
[23:37] <poolie> or the sysadmins can kill it if they want
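A minimal sketch of the "stop accepting connections" step poolie describes, assuming a Twisted service like the codehosting front end; the signal choice and names are illustrative, not the actual Launchpad code. Closing only the listening socket is what lets haproxy mark the instance as down while existing sessions run to completion.

    import signal

    from twisted.internet import reactor


    def install_graceful_stop(listening_port):
        """On SIGHUP, stop accepting new connections but keep serving the
        sessions that are already open."""
        def stop_accepting(signum, frame):
            # Schedule stopListening() on the reactor; it closes the
            # listening socket without touching already-accepted
            # connections, so haproxy sees the port go down and sends new
            # clients to the freshly started instance.
            reactor.callFromThread(listening_port.stopListening)

        signal.signal(signal.SIGHUP, stop_accepting)


    # Usage sketch:
    #   port = reactor.listenTCP(5022, codehosting_factory)
    #   install_graceful_stop(port)
    #   reactor.run()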
[23:44] <lifeless> poolie: uhm, thats not sufficient
[23:44] <poolie> because?
[23:45] <lifeless> because having the old code running for several weeks will play havoc with things like upgrading xmlrpc verbs
[23:45] <poolie> ..?
[23:45] <poolie> like, removing old verbs on internal xmlrpc that the old code uses?
[23:46] <lifeless> yes, or rearranging things; things that you would normally do a server change, change client, cleanup old code sequence
[23:46] <lifeless> this depends on being confident that the client is deployed
[23:46] <poolie> mm
[23:47] <lifeless> not to mention that we would like to free disk space from old deploys.
[23:47] <poolie> so to keep the coupling loose we would want to avoid requiring those things to happen in too short a time window
[23:47] <poolie> anyhow, after that time, we can just kill the old processes
[23:47] <poolie> the client should cope
[23:48] <lifeless> right, we can allow a few hours for the old processes to gracefully go away, which is what the current plan aims at
[23:48] <lifeless> we don't want to interrupt someones 6 hour epic initial push, after all.
[23:48] <poolie> right
[23:49] <poolie> so my plan is
[23:49] <lifeless> we don't want idle heavyweight processes hanging around indefinitely either, which means a way of killing them while idle, which implies the client coping
[23:49] <poolie> i think we can do this in two steps
[23:49] <poolie> 1- move new connections on to the new process
[23:50] <poolie> or rather, accept new connections from the new process
[23:50] <poolie> 2- boot off existing clients
[23:50] <poolie> 2 is a bit messy because
[23:50] <poolie> some clients won't cope well
[23:50] <poolie> and it will take unbounded time to get there
[23:50] <poolie> and it's just generally more risky
[23:50] <lifeless> mmm
[23:51] <lifeless> remember we have some fixed paths on disk for the front-end-to-forking-service IPC calls, and we also have N front-end and N forking services to restart
[23:51] <lifeless> doing 1 without waiting for 2 is more complex and doesn't really buy us anything
[23:51] <lifeless> we're still not done-done until 2 has happened
[23:52] <poolie> so doing only 1 will let us bump codehosting from every fdt deploy
[23:52] <poolie> that seems highly worthwhile
[23:52] <lifeless> no, it won't.
[23:52] <poolie> why?
[23:53] <lifeless> codehosting isn't in fdt anyhow, it's a nodowntime-with-handholding deploy
[23:53] <lifeless> the handholding is because of 2
[23:53] <lifeless> solve the handholding problem and it can move to nodowntime
[23:54] <lifeless> the constraints are that we must be safe to delete the deploy directory after the deploy.
[23:54] <lifeless> well, there are probably more, but thats the key one I see.
[23:54] <poolie> what specifically is the problem
[23:54] <poolie> ok
[23:55] <lifeless> the problem today is that the nodowntime deploy pauses for hours because we can't interrupt bzr safely, so we wait until there are only a few clients connected, then manually check that they are all CI servers and whatnot
[23:55] <lifeless> and then interrupt them ungracefully
[23:56] <lifeless> the deploy process is 'upgrade instance 1, upgrade instance 2' - serialised - which gets us no downtime
[23:56] <lifeless> during the deploy, the symlink for the active tree is updated, and after that we assume we can delete the tree at any point
[23:57] <lifeless> a few trees are kept around, but when we do multiple deploys in a day, there is no fixed window for when a tree will be deleted
[23:57] <poolie> sure
[23:58] <lifeless> we probably need to rejigger a few things, and having a quick-stop-listening step is fine with me as long as we don't set ourselves up for messy failures that we need to ignore / whatever.
[23:58] <poolie> so the handholding is that
[23:58] <poolie> they want to delete the tree when the processes using it have finished
[23:59] <poolie> however, that is always going to take a while, unless we're prepared to just abruptly kill connections