jelmer | hmm, a branch name from StevenK that doesn't make me go "WTF?" | 00:06 |
---|---|---|
* jelmer is disappointed | 00:06 | |
StevenK | jelmer: Which one? | 00:06 |
jelmer | StevenK: refactor-imports-redux | 00:07 |
StevenK | If it doesn't make you go WTF, Diff against target: 11295 lines (+1298/-1451) 531 files modified will | 00:07 |
wgrant | sinzui: Hm, so just launchpadstatistic, librarian, logintoken and temporaryblobstorage left. | 00:08 |
lifeless | poolie: we do more than 1M pages a day, we'd blow past their taster-account in no time ;) | 00:08 |
wgrant | sinzui: I have a branch from a couple of weeks back for temporaryblobstorage. | 00:08 |
StevenK | wgrant: sinzui was going to tackle logintoken | 00:08 |
poolie | lifeless, how can i get the raw form of an oops? | 00:23 |
poolie | or anyone | 00:23 |
lifeless | from where | 00:33 |
StevenK | wallyworld__: O hai. https://code.launchpad.net/~stevenk/launchpad/productset-all-lies/+merge/86314 | 01:17 |
wallyworld__ | StevenK: looking now | 01:17 |
wallyworld__ | StevenK: any tests to amend? | 01:19 |
StevenK | Not that I could see | 01:19 |
wallyworld__ | ec2 will tell us i guess | 01:19 |
StevenK | Tempted to just -vvm registry | 01:19 |
lifeless | sure there is a test to add ? | 01:19 |
StevenK | I'd be surprised if ProductSet:+all wasn't tested by some doctest | 01:20 |
* StevenK runs registry tests | 01:20 | |
wallyworld__ | StevenK: i've +1'ed it but it would be cool if there were a doc test that could be added to | 01:20 |
wallyworld__ | or whatever | 01:20 |
lifeless | poolie: :/ | 01:59 |
poolie | lifeless, ? | 02:31 |
poolie | i found it on disk on devpad | 02:31 |
poolie | why the frownie? | 02:31 |
lifeless | I may have misinterpreted your answer to my reply to your advert for bson-dump | 02:32 |
lifeless | ELONGCONTEXT | 02:33 |
poolie | oh | 02:33 |
poolie | i agree it would be good to do | 02:33 |
poolie | i don't know why i didn't put it elsewhere in the first place | 02:33 |
poolie | it was a while ago | 02:33 |
poolie | perhaps all the external oops stuff seemed too much in flux? | 02:33 |
poolie | or there were too many options for where to put it, so i took a lame default | 02:33 |
lifeless | I felt, apparently wrongly, that you were being a bit uhm, 'well I've done it, nyar'. | 02:34 |
lifeless | the perils of low bandwidth comms | 02:35 |
poolie | ah, not really | 02:35 |
lifeless | poolie: I'd really like to delete utilities/* | 02:35 |
poolie | feeling a bit "omg so few days before holidays etc" | 02:35 |
poolie | if you tell me a specific place to move it to that will help | 02:35 |
lifeless | heh, fair enough. | 02:35 |
poolie | i guess, something that knows about bson encoding and will be installed for all developers | 02:36 |
poolie | i think splitting stuff is good but a minor consequence is that 'where do i do this' gets a bit harder | 02:36 |
lifeless | I'd put it either in oops-datedir-repo or oops-tools itself | 02:36 |
lifeless | its not urgent to move it | 02:39 |
lifeless | if you're busy with other stuff in the holiday lead up, just ignore it. | 02:40 |
poolie | i'll move it to oops-tools | 03:00 |
StevenK | from bzrlib.plugins.builder.recipe import RecipeParseError | 03:22 |
StevenK | ImportError: No module named builder.recipe | 03:22 |
* StevenK peers | 03:22 | |
jtv | StevenK, wgrant: I'm sorry to hear that I broke buildmaster again. Never expected there'd be no missed spots at all, but didn't expect this many either. | 03:30 |
StevenK | Did Gavin land the fix? | 03:30 |
StevenK | jtv: The test coverage of buildd-master is just *horrid*. | 03:31 |
StevenK | Ah, reverted in r14552. | 03:32 |
StevenK | But marked with the bugs, and not incr. Sigh. | 03:32 |
jtv | Should that be incr? | 03:32 |
jtv | I completely forgot about that tag. | 03:33 |
StevenK | jtv: You're rolling back the code, so I guess the next step is fix the three bugs and land it again. | 03:33 |
jtv | Well I'm not rolling anything back personally; I have to go back to clearing out the house. | 03:34 |
jtv | But yes, I'm afraid that's the process. | 03:34 |
StevenK | jtv: If so, our process says the 3 bugs should be closed. Except they won't be fixed. | 03:34 |
jtv | Oh. | 03:35 |
wgrant | So, PQM's been whinging about a conflict for 6 hours now. | 03:35 |
StevenK | jtv: The qa-tagger will tag them needstesting, they'll get marked untestable, and rolled out. | 03:35 |
wgrant | Is someone going to fix that at some point? | 03:35 |
StevenK | I'm trying to sort out ImportError: No module named builder.recipe | 03:36 |
jtv | StevenK: the bit you said about qa-tagger is what will happen regardless, no? | 03:36 |
StevenK | jtv: Yes, but if it was marked incr, the qa-tagger won't slam the bugs to Fix Committed. | 03:36 |
jtv | Ah, now the pieces come together. | 03:37 |
jtv | But I thought you said the rollback should be [incr], not the fixes themselves? | 03:38 |
StevenK | jtv: Right. The rollback will be marked 'as part of this bug's fix', and then when the fixes land properly, the bugs should hit Fix Committed. | 03:39 |
jtv | But you said the 3 bugs should be closed, without being fixed..? | 03:39 |
StevenK | jtv: No, I said that's what was likely to happen due to the lack of incr. | 03:42 |
jtv1 | StevenK: you said "if so, the process says the 3 bugs should be closed." What was the "if so" referring to? | 03:45 |
StevenK | jtv1: I can see we are talking past each other. I explained what would likely happen, and then shifted to talking about what should have happened instead. | 03:46 |
jtv1 | Ah, I think I get it now. Thanks. | 03:47 |
StevenK | Is checkwatches safe to run on qas? | 04:01 |
wgrant | StevenK: Not really, no. | 04:04 |
wgrant | StevenK: And it wouldn't be a very useful test anyway. | 04:04 |
StevenK | wgrant: Okay. Safer to qa-untestable my checkwatches branch? | 04:05 |
wgrant | I think so. | 04:05 |
StevenK | wgrant: Looking at db-devel versus stable | 04:10 |
poolie | lifeless, hm putting this in with the daemon seems not quite right | 04:17 |
StevenK | wgrant: PQM silenced. Hopefully. | 04:20 |
poolie | i'll put it in python-oops | 04:28 |
wgrant | poolie: Doesn't it belong in oops-datedir-repo? | 04:31 |
wgrant | I didn't think python-oops knew about BSON. | 04:31 |
poolie | it mentions it in the docs but it doesn't use it in the code | 04:32 |
wgrant | It doesn't depend on bson. | 04:32 |
wgrant | That's all in datedir-repo/amqp | 04:33 |
poolie | but it does not seem like you should need the repo code to inspect an oops file | 04:33 |
poolie | i could make a new package | 04:33 |
wgrant | Why not? | 04:33 |
wgrant | python-oops doesn't do serialisation. | 04:33 |
poolie | it seems like overkill for what is basically one line of code | 04:33 |
wgrant | "oops file" is a concept that's only part of datedir-repo. | 04:33 |
poolie | there are two potentially separate aspects | 04:34 |
poolie | serializing as bson | 04:34 |
poolie | and writing into per-date directories | 04:34 |
poolie | you could reasonably have the first without the second | 04:34 |
poolie | indeed if you just download one oops you probably will | 04:35 |
wgrant | Sure, but python-oops deliberately doesn't know about serialisation like that. | 04:35 |
wgrant | That's left to the repository implementations: datedir-repo and amqp. | 04:35 |
poolie | amqp has its own separate serialization? | 04:36 |
wgrant | It's BSON. I believe it uses datedir-repo's BSON serializer. | 04:37 |
poolie | foo | 04:37 |
wgrant | All roads lead to datedir-repo :) | 04:37 |
poolie | python-oops says in the readme it defines a serialization | 04:37 |
poolie | though i suppose it is ambiguous what 'the oops project' means | 04:38 |
poolie | so that's why i just put it in utilities/. | 04:39 |
wgrant | I think python-oops' docs are out of date. | 04:39 |
wgrant | datedir-repo was extracted in r9 | 04:39 |
poolie | hm, so | 04:43 |
poolie | i don't know | 04:44 |
poolie | having the format be separate from the serialization seems good | 04:44 |
poolie | having no comment at all about what serialization is used seems dumb | 04:44 |
poolie | in practice multiple trees assume it is bson | 04:44 |
wgrant | No. | 04:44 |
wgrant | Multiple repository implementations use BSON. | 04:45 |
wgrant | datedir-repo has an option to write out rfc822 as well. | 04:45 |
wgrant | And it will read it perfectly happily. | 04:45 |
wgrant | amqp could be changed to use pickles if you were sufficiently misguided, without affecting datedir-repo. | 04:46 |
poolie | true | 04:46 |
poolie | so there's no reason this should live in one of them rather than the other | 04:46 |
wgrant | Well. | 04:46 |
wgrant | I think it makes sense in datedir-repo. | 04:46 |
wgrant | Since amqp's bson doesn't ever hit the disk as a file. | 04:46 |
wgrant | It's purely encoding as it goes into rabbit, and decoding as it comes out. | 04:47 |
wgrant | (it's then usually handed off to datedir-repo, where it's reencoded and written out into a file) | 04:47 |
poolie | yeah i see | 04:47 |
wgrant | So I think this script belongs in datedir-repo. | 04:47 |
poolie | and if python-oops-tools offered an option to download it, it would get reserialized again there | 04:48 |
wgrant | Possibly. | 04:48 |
wgrant | But maybe not. | 04:48 |
wgrant | I think oops-tools is pretty tied to datedir-repo. | 04:48 |
wgrant | Whereas amqp/datedir-repo/oops are very nicely separated. | 04:49 |
wgrant | They actually have sensible interfaces, and work within them! | 04:49 |
poolie | you know what, i'll just make it separate | 04:50 |
wgrant | I think datedir-repo :) But ok. | 04:51 |
wgrant | StevenK: Did you run that through ec2? | 04:53 |
StevenK | wgrant: Which? The imports branch? | 04:53 |
wgrant | Yes. | 04:53 |
StevenK | Yeah, I did | 04:53 |
wgrant | Hm. | 04:53 |
StevenK | Why? | 04:53 |
wgrant | A naive global format-imports should have broken stuff unless you were very lucky. | 04:54 |
wgrant | Due to lp.codehosting's side-effects. | 04:54 |
wgrant | Although I guess it is alphabetically early. | 04:54 |
StevenK | There were 4 failures on ec2, which I fixed before lp-landing | 04:54 |
wgrant | So it may be OK. | 04:54 |
StevenK | wgrant: Still nervous? | 04:59 |
wgrant | StevenK: Slightly. | 05:00 |
StevenK | wgrant: I can forward you the failure mail if it will allay your concerns. | 05:01 |
poolie | and there are at least two different python modules called 'bson' | 05:08 |
wgrant | poolie: Yes :/ | 05:08 |
wgrant | And at least one of them is very buggy. | 05:08 |
wgrant | (the one we use) | 05:08 |
poolie | :) | 05:12 |
poolie | wgrant, lifeless, https://code.launchpad.net/~mbp/python-oops-datedir-repo/bsondump/+merge/86338 | 05:29 |
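The merge proposal above is the upshot of the discussion: a small dump tool living in oops-datedir-repo. A minimal sketch of what such a tool might look like, assuming the standalone `bson` package's `loads()` (one of the two modules named `bson` that poolie mentions just above); the real code is in the linked branch:

```python
#!/usr/bin/env python
"""Pretty-print a BSON-serialized OOPS report.

A sketch only, under assumed APIs: it uses the standalone `bson`
package (not pymongo's module of the same name), which provides
loads()/dumps().
"""
import pprint
import sys

import bson


def main(argv):
    if len(argv) != 2:
        print >> sys.stderr, "usage: bsondump OOPS-FILE"
        return 1
    with open(argv[1], 'rb') as oops_file:
        report = bson.loads(oops_file.read())
    pprint.pprint(report)
    return 0


if __name__ == '__main__':
    sys.exit(main(sys.argv))
```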
bigjools | good morning | 09:08 |
AutoStatic | Good morning | 09:09 |
danhg | Morning all | 09:11 |
=== almaisan-away is now known as al-maisan | ||
AutoStatic | Some colleagues have asked me if I could set up an in-house Launchpad server so they could use it for their projects. They're probably only going to use the bug tracker, blueprint and repository functionality. I'm wondering though if Launchpad isn't a bit overkill then. What's your advice? I already set up a bugtracker for them (MantisBT), a Wiki for their blueprints, and setting up a repo is not much work either. | 09:22 |
StevenK | bigjools: Hai. Will you have a chance to do your QA today? | 09:23 |
bigjools | StevenK: hopefully! I got a bit blindsided yesterday | 09:24 |
StevenK | Yes, that's why I didn't bug you then. :-) | 09:24 |
bigjools | I have a theory about poppy | 09:24 |
StevenK | It is horribly, horribly broken and needs to die? | 09:25 |
bigjools | well you wrote it :) | 09:25 |
* bigjools just hacked on the FTP bit | 09:25 | |
StevenK | Better than continuing to use Zope's horrible excuse for an FTP server. | 09:26 |
StevenK | bigjools: What is your theory? | 09:26 |
bigjools | StevenK: the ssh checks connect to the appservers to get the authorisation | 09:27 |
bigjools | when we have FDT, the XMLRPC connection fails | 09:28 |
bigjools | after that, it continues to fail forever until restarted | 09:28 |
bigjools | not sure why, but meh, Twisted | 09:28 |
bigjools | the swap death was caused by someone using a loop to connect | 09:28 |
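bigjools's theory, restated: poppy's SSH authorization checks go over XML-RPC to the appservers, and once a call fails during fast downtime, some cached connection state keeps every later call failing until the process is restarted. A hedged sketch of one shape a fix could take, assuming Twisted's XML-RPC client; the endpoint URL and method name here are made up:

```python
# Caching a single Proxy lets one outage poison every later check;
# constructing a fresh Proxy per authorization check means the
# service recovers as soon as the appservers come back.
from twisted.web.xmlrpc import Proxy


def check_authorization(username):
    # Fresh Proxy per call: no connection state survives an
    # appserver outage.  URL and method name are hypothetical.
    proxy = Proxy('http://xmlrpc-private.example.internal:8087/authserver')
    d = proxy.callRemote('getUserAndSSHKeys', username)
    # Treat any failure as "not authorized" rather than wedging.
    d.addErrback(lambda failure: None)
    return d
```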
=== gmb changed the topic of #launchpad-dev to: https://dev.launchpad.net/ | On call reviewer: gmb | Firefighting: - | Critical bugtasks: 3*10^2 | ||
jml | anyone developing on precise? | 10:54 |
jml | AutoStatic: I'd recommend *not* running Launchpad locally. | 10:54 |
AutoStatic | jml: Yeah, we figured that out too: https://answers.launchpad.net/launchpad/+faq/920 | 10:55 |
jml | AutoStatic: it's pretty huge and the operational cost is non-trivial, even at low scale. | 10:55 |
bigjools | allenap: in the tests in your branch, it's probably worth refactoring the bit that sets properties on objects in a r/w transaction | 10:56 |
jml | AutoStatic: cool. | 10:56 |
allenap | bigjools: Erm, which bit? | 10:56 |
jml | AutoStatic: so, I'm not 100% sure what your question is then :) | 10:56 |
allenap | bigjools: Like in test_handleStatus_OK_sets_build_log? | 10:57 |
bigjools | allenap: line 72/83 of the diff | 10:57 |
bigjools | allenap: I suspect we'll need to do that a lot more in the future | 10:57 |
allenap | bigjools: I don't know what a better way would be. I could instead enter read-only mode in each test individually (via a fixture) I guess. | 10:58 |
bigjools | allenap: I was thinking just a test helper | 10:58 |
bigjools | like setattr | 10:58 |
bigjools | but does the whole transactionny thing | 10:58 |
allenap | bigjools: With the removeSecurityProxy thing too I assume. | 10:59 |
bigjools | allenap: no, the caller can do that | 10:59 |
AutoStatic | jml: Well, I got an instance running locally here and my question was more or less a stepping stone to some other questions | 10:59 |
allenap | bigjools: Okay, I think I have a cool way to do that. | 11:00 |
bigjools | allenap: of course :) cheers | 11:00 |
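The helper being discussed is roughly setattr() plus transaction handling. A sketch under assumed names (allenap pastes the real version further down): set attributes on a model object inside a read/write transaction, committing so the change sticks once the test drops back to read-only; per bigjools, the caller handles removeSecurityProxy() itself.

```python
# Sketch of the test helper under discussion; the name is assumed,
# and allenap's paste below is the real thing.  The caller is
# expected to have applied removeSecurityProxy() already.
import transaction


def set_attributes(obj, **attributes):
    """setattr() each keyword on obj, then commit the transaction."""
    for name, value in attributes.items():
        setattr(obj, name, value)
    transaction.commit()
```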
AutoStatic | jml: So I'm going to wipe out that local launchpad and convince my colleagues that they should look for something else | 11:01 |
jml | AutoStatic: ok. | 11:02 |
=== matsubara-afk is now known as matsubara | ||
jml | bwahahaha | 11:08 |
jml | Python 2.6 | 11:09 |
jml | sorry. | 11:09 |
jml | good luck with that. | 11:09 |
gmb | Argh. My connection drops out for ten minutes and when I get back bigjools has done the review I was doing. It's going to be one of those someone-else-does-all-the-work OCR days, is it? | 11:16 |
gmb | (Also, he did a better job of it) | 11:17 |
gmb | (Which galls) | 11:17 |
allenap | jml: I've had a bash at getting Launchpad built on Precise, but I lost interest (it was late). Seems like the cool kids are using a schroot (which I am) or an LXC. | 11:18 |
allenap | (running Lucid) | 11:18 |
bigjools | gmb: shurely shome mishtake :) | 11:19 |
nigelb | What's the firefighting section about? | 11:20 |
nigelb | (in the topic) | 11:20 |
bigjools | if we're in the middle of an incident | 11:21 |
nigelb | ah. It makes topic. Nice. | 11:21 |
* bigjools just added a million people on G+ and may live to regret it | 11:21 | |
* nigelb just searched on G+ for "bigjools" | 11:23 | |
nigelb | Dammit. | 11:23 |
allenap | bigjools, gmb: Thank you both for the reviews :) | 11:23 |
bigjools | nae prob | 11:24 |
allenap | bigjools: Fwiw, this is what I did to factor out the things you suggested: http://paste.ubuntu.com/776193/ | 11:35 |
bigjools | allenap: not so much as a refactoring as a rewriting :) | 11:37 |
allenap | bigjools: Well, I'm already using it in my next branch, and will probably in the one after that :) | 11:38 |
bigjools | heh | 11:38 |
jml | allenap: I'm not suggesting you should actually make this change now, but it might be more re-usable as a Fixture. | 11:39 |
allenap | bigjools: How do I go about QAing the revert I did? Or do we just say it's fine because it's approximately already on cesium. | 11:40 |
allenap | ? | 11:40 |
allenap | jml: Yeah, you're right. If it causes enough friction I'll change it. | 11:41 |
bigjools | allenap: untestable | 11:41 |
allenap | bigjools: Cool. | 11:42 |
cjwatson | gmb: any further thoughts on my QA suggestions for https://code.launchpad.net/~mvo/launchpad/maintenance-check-precise/+merge/82125 ? | 11:47 |
gmb | cjwatson: No, no further thoughts (sorry, meant to reply the other day but forgot after a reboot). Could you take care of QAing if for me? I'll make sure it lands today or tomorrow. | 11:49 |
cjwatson | modulo holiday, yes I can | 11:49 |
gmb | Excellent, thanks. | 11:49 |
rick_h__ | ./topic | 11:52 |
jml | hmm. | 12:21 |
jml | so I have a clean lucid schroot for building packages. Can I somehow leverage that to make an schroot dedicated to hacking on Launchpad? | 12:22 |
cjwatson | you could copy the source directory and add a new entry in /etc/schroot/chroot.d/ for it | 12:40 |
cjwatson | and drop the unioniness | 12:41 |
cjwatson | I use a 'lucid-lp' schroot | 12:41 |
jml | cjwatson: thanks. | 12:42 |
jml | (also, my next laptop will have an SSD) | 12:42 |
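For reference, a hypothetical /etc/schroot/chroot.d/lucid-lp stanza along the lines cjwatson describes: a plain directory chroot (the union overlay dropped) pointing at a copy of the package-building chroot's source directory. The paths and user names are assumptions:

```
[lucid-lp]
description=Lucid chroot for hacking on Launchpad
type=directory
directory=/srv/chroot/lucid-lp
users=jml
root-users=jml
profile=default
```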
jml | hmm. I should probably do something like this for each Canonical-deployed project I work on. | 12:56 |
jml | bzrlib.errors.ConnectionReset: Connection closed: Unexpected end of message. Please check connectivity and permissions, and report a bug if problems persist. | 13:49 |
jml | got this trying to fetch bzr-git w/ update-sourcecode | 13:49 |
jml | never mind. | 13:50 |
al-maisan | jml: the /etc/resolv.conf in your chroot might be out of date..? | 14:00 |
rick_h__ | gmb: got a sec for review? https://code.launchpad.net/~rharding/launchpad/sort_labels_894744/+merge/86287 | 14:01 |
al-maisan | jml: try "sudo cp /etc/resolv.conf <path-to-chroot>/etc/resolv.conf" and see whether that helps | 14:01 |
cjwatson | benji: I noticed that in the three branches of mine you reviewed yesterday, you left an Approved comment but didn't set the MP to Approved; was that deliberate? | 14:01 |
benji | cjwatson: generally the MP initiator sets it to approved, sometimes they might be getting a DB review too or a UI approval | 14:02 |
benji | I set the other one to approved because I was landing it and the machinery won't land unapproved branches. | 14:03 |
cjwatson | oh, I didn't know that, my reviewer's always done it for me before | 14:03 |
cjwatson | probably because I've always explicitly asked for landings :) | 14:03 |
cjwatson | benji: ah, and I can't set the MP to Approved because I'm not in ~launchpad | 14:07 |
cjwatson | benji: any chance of landings for always-index-d-i and sign-installer, then, if you have a chance? It might be best to leave new-python-apt for a bit as it collides with https://code.launchpad.net/~mvo/launchpad/maintenance-check-precise/+merge/82125 and this way I do the merge rather than making somebody else do it | 14:07 |
benji | heh, well that would make it harder | 14:07 |
benji | cjwatson: sure, I'll start the landing of those in a bit | 14:09 |
cjwatson | great, thank you | 14:10 |
gmb | rick_h__: Sure thing; looking now. | 14:13 |
rick_h__ | gmb: ty much | 14:13 |
=== frankban_ is now known as frankban | ||
gmb | rick_h__: Approved. | 14:40 |
rick_h__ | gmb: awesome, thanks | 14:46 |
=== al-maisa` is now known as almaisan-away | ||
=== aldeka_ is now known as aldeka | ||
=== matsubara is now known as matsubara-lunch | ||
=== almaisan-away is now known as al-maisan | ||
=== gmb changed the topic of #launchpad-dev to: https://dev.launchpad.net/ | On call reviewer: - | Firefighting: - | Critical bugtasks: 3*10^2 | ||
=== matsubara-lunch is now known as matsubara | ||
benji | I'd appreciate it if some kind soul would review this branch: https://code.launchpad.net/~benji/launchpad/bug-903532/+merge/86426 | 16:30 |
benji | if that kind soul has some translations knowledge, it would be even better | 16:30 |
=== al-maisan is now known as almaisan-away | ||
sinzui | benji, I can take it | 16:49 |
benji | sinzui: cool, thanks | 16:50 |
sinzui | benji, r=me | 16:52 |
benji | sinzui: thanks | 16:53 |
=== cjwatson_ is now known as cjwatson | ||
=== matsubara_ is now known as matsubara | ||
=== matsubara is now known as matsubara-afk | ||
=== almaisan-away is now known as al-maisan | ||
=== al-maisan is now known as almaisan-away | ||
lifeless | gary_poster: I'm around for a bit if you want to talk oopses more | 20:42 |
gary_poster | thanks lifeless, on a call | 20:43 |
wallyworld | sinzui: jcsackett: can we mumble now? | 21:44 |
sinzui | yes | 21:44 |
wallyworld | sinzui: fucking mumble is doing its thing again where it consumes all my cpu. i have to reboot | 21:45 |
james_w | anyone want to take a look at https://code.launchpad.net/~james-w/launchpad/bpph-binary-file-urls/+merge/86470 ? | 22:04 |
poolie | o/ james_w | 22:10 |
poolie | hi all | 22:10 |
james_w | hi poolie | 22:11 |
dobey | hey poolie | 22:38 |
huwshimi | On the deployable revisions page it says "Revision 14556 can be deployed: orphaned". Does that mean I can't qa it? | 22:53 |
lifeless | either it has no bug linked, or the bug has been closed already | 22:58 |
lifeless | if its the latter, you can reopen the bug | 22:58 |
huwshimi | lifeless: Will it get picked up by the qa tagger etc. then? | 23:14 |
huwshimi | lifeless: Should it be Fix Committed or will any status other than Fix Released do? | 23:15 |
poolie | ok now i've played with juju it is annoying me that launchpad doesn't use it | 23:21 |
jelmer | poolie: :) | 23:23 |
poolie | jelmer, i just talked to flacoste about bug 795025 | 23:26 |
_mup_ | Bug #795025: no way to gracefully disconnect clients and shut down the bzr server <canonical-losa-lp> <hpss> <launchpad> <ssh> <Bazaar:Fix Released by jameinel> <Launchpad itself:Triaged> < https://launchpad.net/bugs/795025 > | 23:26 |
poolie | istm there is a safer way to do it | 23:26 |
poolie | which is to have a signal to tell the processes to just stop listening | 23:26 |
poolie | then we can start a new one | 23:26 |
jelmer | poolie: will that work with haproxy? | 23:34 |
poolie | i think so? | 23:34 |
poolie | haproxy will detect that it's down? | 23:34 |
jelmer | I haven't looked at it, so I'm not exactly sure how its communication with services works | 23:34 |
jelmer | ah, so... so we shut the existing one down and then when haproxy starts another one that's using the new code? | 23:35 |
poolie | i think it's some combination of: seeing if the port is listening, plus pinging a separate http port that reports on the status | 23:35 |
poolie | more precisely: | 23:35 |
poolie | we tell the existing one "stop accepting connections", and it closes its listening socket | 23:36 |
poolie | and then haproxy notices it's down, i guess | 23:36 |
jelmer | that makes sense | 23:36 |
poolie | and then we start a new instance listening on the same port, which will be running the new code | 23:37 |
poolie | then the old process can either exit by itself when all the connections are done | 23:37 |
poolie | or the sysadmins can kill it if they want | 23:37 |
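A minimal sketch of the stop-listening step poolie is describing, assuming a Twisted-based codehosting service; the signal choice and names are illustrative. Closing only the listening socket makes haproxy's port check mark the instance down while established sessions drain, so a replacement running the new code can bind the same port:

```python
import signal

from twisted.internet import reactor


def install_graceful_stop(listening_port):
    """Stop accepting new connections on SIGUSR1.

    listening_port is the IListeningPort returned by
    reactor.listenTCP(); existing connections are left alone, and
    the process can exit (or be killed later) once they finish.
    """
    def stop_listening(signum, frame):
        # Close only the listening socket, from the reactor thread.
        reactor.callFromThread(listening_port.stopListening)

    signal.signal(signal.SIGUSR1, stop_listening)
```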
lifeless | poolie: uhm, thats not sufficient | 23:44 |
poolie | because? | 23:44 |
lifeless | because having the old code running for several weeks will play havoc with things like upgrading xmlrpc verbs | 23:45 |
poolie | ..? | 23:45 |
poolie | like, removing old verbs on internal xmlrpc that the old code uses? | 23:45 |
lifeless | yes, or rearranging things; things that you would normally do a server change, change client, cleanup old code sequence | 23:46 |
lifeless | this depends on being confident that the client is deployed | 23:46 |
poolie | mm | 23:46 |
lifeless | not to mention that we would like to free disk space from old deploys. | 23:47 |
poolie | so to have loose coupling we would want to not require those things to happen in too short a time window | 23:47 |
poolie | anyhow, after that time, we can just kill the old processes | 23:47 |
poolie | the client should cope | 23:47 |
lifeless | right, we can allow a few hours for the old processes to gracefully go away, which is what the current plan aims at | 23:48 |
lifeless | we don't want to interrupt someones 6 hour epic initial push, after all. | 23:48 |
poolie | right | 23:48 |
poolie | so my plan is | 23:49 |
lifeless | we don't want idle heavyweight processes hanging around indefinitely either, which means a way of killing them while idle, which implies the client coping | 23:49 |
poolie | i think we can do this in two steps | 23:49 |
poolie | 1- move new connections on to the new process | 23:49 |
poolie | or rather, accept new connections from the new process | 23:50 |
poolie | 2- boot off existing clients | 23:50 |
poolie | 2 is a bit messy because | 23:50 |
poolie | some clients won't cope well | 23:50 |
poolie | and it will take unbounded time to get there | 23:50 |
poolie | and it's just generally more risky | 23:50 |
lifeless | mmm | 23:50 |
lifeless | remember we have some fixed paths on disk for the frontend-forking IPC calls, and we also have N front-end and N forking services to restart | 23:51 |
lifeless | doing 1 without waiting for 2 is more complex and doesn't really buy us anything | 23:51 |
lifeless | we're still not done-done until 2 has happened | 23:51 |
poolie | so doing only 1 will let us bump codehosting from every fdt deploy | 23:52 |
poolie | that seems highly worthwhile | 23:52 |
lifeless | no, it won't. | 23:52 |
poolie | why? | 23:52 |
lifeless | codehosting isn't in fdt anyhow, its a nodowntime-with-handholding deploy | 23:53 |
lifeless | the handholding is because of 2 | 23:53 |
lifeless | solve the handholding problem and it can move to nodowntime | 23:53 |
lifeless | the constraints are that we must be safe to delete the deploy directory after the deploy. | 23:54 |
lifeless | well, there are probably more, but thats the key one I see. | 23:54 |
poolie | what specifically is the problem | 23:54 |
poolie | ok | 23:54 |
lifeless | the problem today is that the nodowntime deploy pauses for hours because we can't interrupt bzr safely, so we wait until there are only a few clients connected then manually check that they are all CI servers and whatnot | 23:55 |
lifeless | and then interrupt them ungracefully | 23:55 |
lifeless | the deploy process is 'upgrade instance 1, upgrade instance 2' - serialised - which gets us no downtime | 23:56 |
lifeless | during the deploy, the symlink for the active tree is updated, and after that we assume we can delete the tree at any point | 23:56 |
lifeless | a few trees are kept around, but when we do multiple deploys in a day, there is no fixed window for when a tree will be deleted | 23:57 |
poolie | sure | 23:57 |
lifeless | we probably need to rejigger a few things, and having a quick-stop-listening step is fine with me as long as we don't set ourselves up for messy failures that we need to ignore / whatever. | 23:58 |
poolie | so the handholding is that | 23:58 |
poolie | they want to delete the tree when the processes using it have finished | 23:58 |
poolie | however, that is always going to take a while, unless we're prepared to just abruptly kill connections | 23:59 |