[02:22] <skaet> Thanks to whoever posted the images.... :)
[02:33] <skaet> cjwatson,  please get with Daviey after he has a chance to look at https://bugs.launchpad.net/ubuntu/+source/eucalyptus/+bug/813266;  we may need to respin the server images again.
[02:33] <ubot4> Launchpad bug 813266 in eucalyptus (Ubuntu) "eucalyptus fails to start instances (affects: 1) (heat: 6)" [Critical,New]
[07:32] <cjwatson> lamont: thanks
[07:34] <cjwatson> grah, bugs
[07:35] <cjwatson> skaet: plus the ones jibel pointed to - looks to me like respinning everything is near-certain :-/
[07:36] <cjwatson> oh, bah, I'm an idiot, it was fine on my disk but I did a partial commit
[07:37] <cjwatson> ev: can I have a wubi/lucid r192 build, please?
[08:04] <cjwatson> Daviey: let me know when you're around, for that eucalyptus bug?
[08:04] <cjwatson> so.  wubi fixed but waiting on ev being around for a new build.  d-i fixed by infinity last night.  eucalyptus waiting for Daviey.
[08:04] <cjwatson> I guess that means I can respin alternates
[08:15] <cjwatson> oh, argh.  somebody respun already but didn't mention it here, after the bug was mentioned here.  PLEASE don't do that.
[08:15] <infinity> cjwatson: We already re-spun.
[08:15] <cjwatson> or at least didn't mention clearly why.
[08:15] <infinity> cjwatson: (where "we" is skaet)
[08:15] <cjwatson> so that'd be a waste of my time then.
[08:15] <infinity> cjwatson: I think the conversation ended up split between -release and -testing.  Sorry.
[08:16] <cjwatson> ctrl-c'ing and unpublishing.
[08:17] <cjwatson> yeah, I think people do need to remember that conversations spread across multiple channels and queries and stuff are very difficult to follow for people in different timezones, and that it needs extra clarification
[09:03] <Daviey> cjwatson: Sorry, just seen your ping in scrollback.
[09:12] <Daviey> cjwatson: You know we might need to re-spin the kernel if we want that bug fixed?
[09:14] <cjwatson> whee!
[09:14] <cjwatson> want to talk me through it?
[09:14] <cjwatson> (that would definitely mean delaying 10.04.3.)
[09:21] <Daviey> cjwatson: We saw the same issue during the Maverick cycle.
[09:21] <Daviey> kernel bug #588861,
[09:21] <ubot4> Launchpad bug 588861 in linux (Ubuntu Maverick) (and 4 other projects) ""pad block corrupted" error when trying to register an image with 2.6.34 kernel (affects: 1) (heat: 7)" [High,Fix released] https://launchpad.net/bugs/588861
[09:21] <Daviey> I suspect the bug has now reached lucid kernel stable updates?
[09:22] <cjwatson> do you already have a kernel developer on it?
[09:23] <Daviey> no
[09:23] <Daviey> apw: How are you this fine day?
[09:24] <apw> Daviey, ok thanks, you ?
[09:24] <Daviey> apw: not so bad.. :).. do you think 588861 has been introduced into Lucid?
[09:25] <cjwatson> apw: (that was code for "you're about to not be so good")
[09:25] <Daviey> lol.. yes. sorry.
[09:27] <apw> Daviey, which version of lucid you testing ?
[09:28] <Daviey> 10.04.3 candidate.
[09:28] <Daviey> so, 2.6.32-33.70
[09:30] <Daviey> There is a minimal test java tool that can prove this.
[09:30] <Daviey> https://bugs.launchpad.net/ubuntu/+source/eucalyptus/+bug/588861/comments/10
[09:30] <ubot4> Launchpad bug 588861 in linux (Ubuntu Maverick) (and 4 other projects) ""pad block corrupted" error when trying to register an image with 2.6.34 kernel (affects: 1) (heat: 7)" [High,Fix released]
[09:31] <apw> Daviey, ok the current source in lucid seems to have the issue yes
[09:32] <Daviey> apw: assuming you are git blaming, do you know what kernel release this was introduced?
[09:33] <apw> Daviey, and i assume you are saying that your tests show it affected too ?
[09:33] <Daviey> apw: Well we are seeing the issue on bug 813266.
[09:33] <ubot4> Launchpad bug 813266 in linux (Ubuntu) (and 1 other project) "eucalyptus fails to start instances (affects: 1) (heat: 6)" [Undecided,Incomplete] https://launchpad.net/bugs/813266
[09:33] <apw> Daviey, am looking now
[09:35] <apw> Daviey, but if that is true then the bug was introduced in v2.6.23-rc1 ... and lucid has always had it
[09:35] <Daviey> hmm.
[09:36] <apw> Daviey, so i have to assume something else has changed perhaps even java to tickle this
[09:36] <Daviey> ah!
[09:36] <Daviey> openjdk-6 was updated 5 weeks ago on Lucid
[09:37] <apw> Daviey, is it easy to test with the previous jdk to confirm this is what has triggered the issue
[09:37] <apw> Daviey, in parallel i will get you a kernel with the patch to test too, presuming you can re-test yes ?
[09:37] <Daviey> apw: wilco
[09:40] <Daviey> (The test rigs cannot access launchpadlibrarian due to firewall.. *awesome*)
[09:45] <cjwatson> respinning lucid Ubuntu/Kubuntu desktop (ISO only, livefs remains the same) for Wubi fix
[09:52] <cjwatson> the openjdk-6 change was a security update :-(
[09:52] <Daviey> apw: I had hoped to see a reference to sendfile() in the openjdk security update.
[09:52] <Daviey> Not there. :/
[09:52] <cjwatson> so not easy to just revert
[09:52] <Daviey> cjwatson: Haven't yet confirmed that is a culprit for opening it.
[09:52]  * cjwatson nods
[09:53] <cjwatson> do you need a full list of changed packages?
[09:53] <apw> Daviey, will let you know when the replacement kernels are ready for testing
[09:53] <Daviey> I'm trying to remember exactly what happened.. i seem to remember openjdk said they were doing the right thing, and it was purely a kernel issue.
[09:53] <Daviey> apw: ack
[09:54] <Daviey> cjwatson: I'm pretty confident it's an issue that openjdk demonstrated in the kernel.
[09:55] <cjwatson> well, http://paste.ubuntu.com/648102/ if it's helpful
[09:55] <cjwatson> note there were two openjdk-6 changes in there
[09:55] <Daviey> cjwatson: groovy, is that 10.04.2 -> .3 ?
[09:55] <cjwatson> yeah
[09:56] <cjwatson> https://launchpad.net/ubuntu/+source/openjdk-6/6b20-1.9.7-0ubuntu1~10.04.1 was the other change
[09:56] <Daviey> cjwatson: unrelated, i'd quite like the lpapi script you used to generate that :)
[09:56] <cjwatson> well, technically it's all publications to -updates since 2011-02-18, but I think that comes out about right
[09:57] <cjwatson> http://paste.ubuntu.com/648105/ - it's a cut-down version of something else so there's some cruft in there
[09:57] <Daviey> ta
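Editor's sketch (not the script from the paste above, whose contents are not reproduced in this log): one way to list "all publications to -updates since 2011-02-18" with launchpadlib. The helper names `changed_since` and `fetch_lucid_updates`, the consumer name, and the sample records are assumptions made for illustration; the Launchpad calls are only reached if `fetch_lucid_updates` is invoked with network access.

```python
from datetime import datetime, timezone


def changed_since(publications, since):
    """Return sorted (source, version) pairs published on or after `since`.

    `publications` is a list of dicts with the keys "source", "version"
    and "date_published" (a timezone-aware datetime, or None if pending).
    """
    return sorted(
        (p["source"], p["version"])
        for p in publications
        if p["date_published"] is not None and p["date_published"] >= since
    )


def fetch_lucid_updates(since):
    """Fetch lucid-updates source publications from Launchpad (needs network).

    Assumption: anonymous read-only access is sufficient for this query.
    """
    from launchpadlib.launchpad import Launchpad

    lp = Launchpad.login_anonymously("updates-since-sketch", "production")
    ubuntu = lp.distributions["ubuntu"]
    lucid = ubuntu.getSeries(name_or_version="lucid")
    pubs = ubuntu.main_archive.getPublishedSources(
        distro_series=lucid, pocket="Updates", created_since_date=since
    )
    return [
        {
            "source": p.source_package_name,
            "version": p.source_package_version,
            "date_published": p.date_published,
        }
        for p in pubs
    ]


if __name__ == "__main__":
    cutoff = datetime(2011, 2, 18, tzinfo=timezone.utc)
    for source, version in changed_since(fetch_lucid_updates(cutoff), cutoff):
        print(source, version)
```

Keeping the date filter separate from the network fetch means the selection logic can be checked offline against hand-written records.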
[10:05] <cjwatson> Ubuntu desktop reposted with wubi fix
[10:13] <apw> Daviey, http://people.canonical.com/~apw/lp588861-lucid/
[10:19] <Daviey> apw: thanks!
[10:19] <Daviey> It seems openjdk might not be responsible btw.
[10:20] <apw> Daviey, it's possible some other kernel change has exposed this one
[10:22] <Daviey> The odd thing, is that the minimal test app we had for proving this in Maverick passes OK.
[10:29] <apw> Daviey, hmmm odd indeed
[10:30] <apw> Daviey, well let me know if the kernels help at all, hard to know if they will given the evidence
[10:31] <Daviey> i'm still trying to prove how it is broken, once i have that - i'll try your kernel.
[10:35] <apw> Daviey, it would be good to know if this is the fix, as likely there will be some serious hoop jumping to get this out
[10:46] <Daviey> apw: trying the kernel now.
[11:10] <Daviey> apw: It's not looking good, as in i believe i'm still seeing it with your kernel
[11:10] <Daviey> bah.. it didn't boot into your kernel
[11:10] <Daviey> scrub that
[11:25] <Daviey> apw / cjwatson: Confirming that the candidate kernel resolves the issue for me.  I've asked jamespage to reproduce the failed test, then re-confirm the kernel.
[11:26] <Daviey> where candidate kernel is apw's.
[12:05] <apw> Daviey, so that begs the question as to why it worked before
[12:10] <Daviey> apw: It might still be related to the openjdk update 5 weeks ago.. although, kinda surprised nobody else noticed it.
[12:11]  * apw doesn't like unexplained regressions
[12:11] <Daviey> if we had a snapshots archive, this would be much easier to determine.
[12:29] <pitti> Daviey: LP has all previous package source and binary versions
[12:30]  * Laney imagines lp-bisect
[12:34] <Daviey> pitti: Well i'm fully aware of that.. but doesn't have an archive interface.
[12:58] <astraljava> Hi gang, no alternate images for Ubuntu Studio nor Xubuntu. Is the installer broken?
[13:07] <jibel> are the results from the previous desktop builds still valid (livefs remained the same) ?
[13:08] <cjwatson> jibel: yeah, livefs is the same
[13:08] <cjwatson> astraljava: for oneiric?
[13:08] <cjwatson> missing images have nothing to do with a broken installer anyway
[13:08] <jibel> ok thanks
[13:08] <jibel> will move them
[13:09] <cjwatson> astraljava: anyway, it's a persistent and annoying type of CD build failure which should go away once we get faster hardware
[13:10] <cjwatson> I'm trying a rebuild, which may or may not fix it
[13:11] <astraljava> cjwatson: Yes, oneiric. Okay, thanks for the info!
[13:15] <jibel> kubuntu desktop 20110720.1 posted to the tracker
[13:16] <cjwatson> Daviey: so this sounds like we're going to have to delay, or else not release the server images; minimum kernel build time is about 11.5 hours, and then we have to rebuild d-i and then the images
[13:16] <cjwatson> jibel: oh, thanks, my bad
[13:18] <cjwatson> skaet: ^- what do you think?  (the wubi problem is resolved)
[13:18] <cjwatson> Daviey: was this actually tested properly for 10.04.2, I wonder?
[13:23] <hggdh> cjwatson: yes, it was
[13:24] <hggdh> (although 'properly' sounds a stretch, if this bug was present there)
[13:24] <cjwatson> oddness
[13:30] <skaet> cjwatson,  tempted to go with what we have for the rest of the images, and then push out the server image when we have it at this point. (assuming no new regressions are found).    Am thinking that smoser's cloud images need to be looked at to see if they can use the current kernel - am thinking if we're respinning image for server,  we probably want that kernel used as well for cloud images.
[13:33] <apw> skaet, if we're respinning the kernel what level of QA are you expecting on the changes
[13:34] <skaet> apw,  that's why I'm tempted to not put the server images out now,  so we can get some QA and cert on it.
[13:35] <skaet> Is the candidate kernel the same except for one patch,  as the one we're shipping,   or is it picking up a set of changes?
[13:37] <apw> skaet, i was assuming we'd just want to do the single commit to minimise risk
[13:37] <skaet> apw,  indeed yes,  that is the preference.
[13:41]  * skaet has to step away for a bit, back later. 
[13:43] <Daviey> cjwatson: yeah, this would most certainly have been noticed for 10.04.2
[13:43] <Daviey> it's 100% failure rate afaiks
[13:43] <Daviey> jamespage is doubly reproducing to confirm.
[13:44] <cjwatson> skaet: ok, not sure I know what the cloud image situation is
[13:44] <cjwatson> Daviey: do you know if anyone's actually started building cloud images for 10.04.3?
[13:46] <Daviey> cjwatson: Oh yes. https://uec-images.ubuntu.com/lucid/20110719/ <--- but i don't know if they have been QA'd.
[13:46]  * Daviey checks.
[13:46] <cjwatson> so those want posted to the tracker, right?
[13:49] <cjwatson> Daviey: posted
[14:17] <jamespage> Daviey, cjwatson: I reproduced the issue and then tried out apw's new kernel which fixed the problem - I'll update the ticket to reflect
[14:24] <apw> Daviey, did you manage to work out what triggered the exposure of this bug
[14:25] <Daviey> apw: not yet
[14:26] <apw> and is the decision to spin the kernel and hold at least -server for it?
[14:34] <Daviey> skaet / cjwatson: so the Cloud images have been QA'd it seems.  Previously, those images haven't followed the same release cadence as the distro point releases.
[14:35] <Daviey> They get updated when a new kernel is pushed.
[14:35] <cjwatson> ok, I don't mind that
[14:35] <Daviey> That being said, they can be branded 10.04.3 - but a health warning should state to check it is the most recent image.
[14:48] <apw> skaet, cjwatson, do we have a decision as to direction yet
[14:49] <skaet> apw, working it...
[14:50] <skaet> yes,  spin the kernel.   We'll hold at least server to pick it up.
[14:50] <skaet> Daviey, apw,  ^^
[14:51] <Daviey> I don't think it's a big deal, but is this the first time desktop vs server has had different kernel releases on a release?
[14:52] <skaet> Daviey,  are you planning on respinning cloud images to pick up new kernel,  or going out with the ones you have?
[14:52] <Daviey> skaet: I don't think that is essential
[14:54] <cjwatson> Daviey: I'm not certain, but it would surprise me if it were the first time
[14:54] <cjwatson> It doesn't particularly worry me for a point release
[15:01] <Daviey> skaet: the cloud image stamping of 10.04.3 isn't really a big deal, as the images are refreshed with higher freq. anyway.
[15:02] <skaet> Daviey,  ok.
[15:39] <cjwatson> astraljava: looks like that rebuild attempt worked better, for both xubuntu and ubuntustudio
[16:36] <apw> Daviey, i assume you need this spun for -ec2 as well yes ?
[16:37]  * apw pokes Daviey
[16:48] <Daviey> apw: nah
[16:49] <Daviey> it's a nice to have.. but we are only currently seeing this with euca.
[16:49] <Daviey> If people are running euca in ec2, they have larger concerns.
[16:49] <Daviey> If something like hadoop was exposing the same bug, sure.. but we've had no reports.
[16:50] <Daviey> apw: So a nice to have, but no rush.
[17:26] <cjwatson> is somebody dealing with uploading this kernel then?
[17:27] <Daviey> apw is ^^
[17:28] <cjwatson> skaet: I have to go out now, but AFAIK everything except server is good, unless QA find other showstoppers
[17:28] <cjwatson> nothing to hand off from my side
[17:29] <skaet> cjwatson,  thanks.  Working server issue right now on the side.
[17:29]  * cjwatson nods
[17:29] <apw> cjwatson, Daviey, the kernel is being uploaded to the ckt PPA 'now'
[17:29] <skaet> apw, thanks
[17:30] <apw> cjwatson, Daviey, skaet, indeed it is in and building on all architectures already
[18:21] <Daviey> apw: rocking!