=== inuka_desk_ is now known as inuka_desk
=== jayc__ is now known as jayc_
[05:05] i dont understand what this means, http://www.news.com/8301-13580_3-9867657-39.html
[05:17] zul: ping
=== jayc__ is now known as jayc_
[05:56] hmh, resume from hibernate failed, ie. it booted normally and not from the file
=== asac_ is now known as asac
=== macd_ is now known as macd
=== Whoopie_ is now known as Whoopie
[10:45] xivulon: Hi, I've been looking at your notes for the Wubi issues
[10:46] just added a new one!
[10:46] was waiting for you in fact...
[10:47] let me know if I need to clarify anything
[10:48] xivulon: I managed to screw up my /home today playing around with this - I was offline for a while sorting my machine out :-(
[10:49] sorry to hear that :(
[10:49] xivulon: It's all part of the fun :-)
[10:49] You are invited to explain the fun part to my wife...
[10:50] xivulon: I looked at the notes - the problem is umounting / and then /home - can one actually do this in practice?
[10:50] xivulon: I was thinking of using pivot_root but I haven't much experience with this, so I am not sure if this is a possible way of doing things
[10:50] unmounting is not a problem except that it will be the last thing you can do (without a ramdisk)
[10:51] see my last comment on that
[10:51] ..but with a ramdisk, one could umount /?
[10:51] I would assume so
[10:51] but I have never tried that
[10:52] ..me neither. I think we may need some expertise from someone like Colin Watson.
[10:53] eheh...
[10:53] But I think it will lead to a sane shutdown that will ensure / and /home are flushed and umounted properly
[10:53] ..otherwise one will always see this kind of corruption of /
[10:54] if one can do this then the vm dirty hacks are no longer required
[10:54] I'd think that if we could remount /host ro (safely) it would be good enough
[10:55] see my umountroot attachment for that, unfortunately ntfs-3g complains about remounting...
[10:55] cking the vm dirty hacks are still required in case someone hard-reboots
[10:56] xivulon: yep - true about the hard-reboot - I overlooked that detail.
[10:56] In fact based on user feedback all the problems of fs corruption were due to hard reboots
[10:57] I haven't seen any complaint about fs corruption because of normal reboot
[10:57] I'd guess that remounting root ro + the sysctl hacks is enough in most cases (and maybe some fsck help...)
[10:57] xivulon: I managed it :-) I started removing a huge tree of source and then rebooted - and I got a bit of a mess
[10:58] that is with the sysctl hacks?
[10:59] cking was disconnected
[11:00] xivulon: what happened there?
[11:00] there was a netsplit, too
[11:00] ..a netsplit?
[11:00] IRC networks consist of many servers, in general, which are connected in a graph; if a pair of servers lose their connection, that's a netsplit
[11:00] ah.. I am enlightened
[11:00] which results in the users on the network being effectively partitioned in terms of their ability to talk to each other
[11:01] cking missed replies after my last comment
[11:01] didn't miss cjwatson explanation though :)
[11:01] xivulon: yep, I kind of missed the last few lines
[11:02] 10:56 cking: I think you misunderstand the order of events slightly, though
[11:02] 10:56 cking: the NTFS filesystem in question is actually /host, not /home
[11:02] 10:56 cking: and Wubi is a little odd here - the way it works is that / is actually a loop-mounted filesystem *on top of* an NTFS filesystem that is then moved to /host
[11:02] 10:57 cking: so the filesystems are the other way round from what you'd expect
[11:02] in case that was missed
[11:02] thanks
[11:03] .. I keep on writing /home when I mean /host - it's a constant brain type of mine
[11:03] ^type^typo. I need coffee.
[11:03] the crux though is that /host cannot be remounted r/o IMHO
[11:03] I suspect that you cannot actually unmount / and then expect to be able to unmount /host
[11:03] because you won't have a reference to /host any more
[11:03] cking did you experience fs corruption when rebooting (rc6.d) and with the sysctl hacks in place?
[11:04] you'd have to move all the mount points around, and frankly, I do not think that is a viable approach at this point
[11:04] was /host to be corrupted or / and did corruption involve the journal?
[11:04] xivulon: yep
[11:04] we need something pretty simple and foolproof, and playing Tetris with your mount points isn't that :)
[11:05] cjwatson: I'm not sure if there is a solution that is simple and foolproof.
[11:05] not sure if we can try fixing /host (ntfs-3g) remount
[11:06] xivulon: one could do that - but wasn't there a big risk that NTFS could get corrupted?
[11:06] that was my understanding from last szaka comment
[11:07] did ask him some more info by mail but he didn't reply
[11:07] if the NTFS filesystem is never remounted read-only or unmounted (as is the case at the moment, I understand), why does that affect the kernel's ability to writeback changes to the ext3 /?
[11:07] or rather, what is breaking the kernel's ability to writeback changes?
[11:08] cjwatson: from the way I see it, we can either force disk writebacks using the vm flush hacks - but this does not necessarily guarantee a fully sync'd fs.. or..
[11:08] IIRC the kernel always syncs just before reboot
[11:09] .. we can fool around with the umounting. The former may help when systems power down in an uncontrolled way, the latter makes sure the filesystems are coherently umounted at shutdown
[11:10] cjwatson: true.
[11:10] ...and yet you ended up with a corrupted fs
[11:10] ..my concern is that the /host is not being properly written back to.
[11:12] I believe /host is not being umounted which worries me.
[11:12] it certainly is not
[11:12] or you would lose /
[11:12] and hence /sbin/reboot
[11:13] remounting it read-only ought to be as good as unmounting from the perspective of avoiding corruption, but that's been problematic due to ntfs-3g bugs
[11:14] cking, on a side note, you also suggested using sync mount option on the loopfile, I assume you meant the /host fs
[11:16] xivulon: yes, if it's possible on ntfs. The sync is not required on the loop file since it's mounted ro at the shutdown. However, the sync,dirsync and commit=1 could help on power outages I suppose.
[11:17] but isn't it host that has to sync automatically? the fact that the loopfile syncs to host does not help if host does not sync to disc, correct?
[11:18] xivulon: true - yep, it's clear I haven't thought that one through.
[11:20] xivulon: I suppose the outstanding questions are: can ntfs-3g be fixed in time to enable a ro remount, and if so, can one avoid NTFS corruption?
[11:21] xivulon: ..also, do the vm dirty tweaks do anything useful in reality?
[11:21] yes, that would be the best avenue, I had that bug open for quite some time...
[11:21] but did not bother too much because had zero bug reports about fs corruption... so far...
[11:22] xivulon: ..thirdly.. does the ntfs-3g support sync and commit=1 or are these only applicable to ext3?
[11:22] cking I know that in wubi 7.04, where we did not have them, we had more reports about people having fs corruption after hard reboot
[11:22] that is based on a completely subjective measure
[11:22] xivulon: ..yes but it's comforting to hear this.
[11:22] might be due to better user education also (added a few faqs)
[11:23] I would not know how to test that in an objective way
[11:23] for ntfs-3g we would need to ask szaka
[11:23] I know nothing about its internals
[11:25] but if you see #186117 that didn't go too far
[11:25] xivulon: that's the best path to take if he is available.
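(Aside on the "vm dirty tweaks" mentioned above: the log never states which values Wubi sets, so the following is only a minimal sketch for inspecting the current writeback knobs. It assumes the standard Linux files under /proc/sys/vm; the function name read_vm_dirty_knobs and the choice of these four knobs are illustrative, not taken from the bug report.)

```python
import os

# Standard Linux writeback sysctls; which of these Wubi actually tunes
# is not stated in the log, so this list is an assumption.
KNOBS = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_dirty_knobs(base="/proc/sys/vm"):
    """Return {knob: int value} for every knob readable under `base`."""
    values = {}
    for name in KNOBS:
        path = os.path.join(base, name)
        try:
            with open(path) as handle:
                values[name] = int(handle.read().strip())
        except (OSError, ValueError):
            # Knob missing or unreadable on this kernel: skip it.
            continue
    return values


if __name__ == "__main__":
    for knob, value in sorted(read_vm_dirty_knobs().items()):
        print(f"vm.{knob} = {value}")
```

(Comparing the output before and after applying the sysctl hacks would at least confirm that the settings took effect, though not that pdflush honours them.)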
[11:26] xivulon: Meanwhile I will google around to see if there are any other ways to make sure the filesystem is flushed other than using the vm dirty hacks,
[11:27] ..one never knows if there is something better.
[11:30] cjwatson as plan b couldn't we still try to go for /sbin/reboot on ramdisk (but leaving current behaviour whenever that is not strictly needed)?
[11:35] it's very risky, you need to copy over any libraries it needs as well; it needs /proc and /dev; and who knows what else you might have to play catch-up with
[11:35] I have a suspicion changing ntfs-3g would be safer
[11:35] cking still other route might be to ensure that ntfs-3g does actually empty its cache whenever the sync command is issued. I think.
[11:36] if / is ro and synced properly, rebooting should not be too drastic even if /host is rw
[11:36] xivulon: an extra sync can only help
[11:37] xivulon: I recall that twiddling with (one of) the vm dirty options may also force a sync.
[11:37] cking you might want to try to reproduce fs corruption with new initramfs + sysctl + umountroot, as per thread
[11:38] as per bug attachments...
[11:38] yep. Will do.
[11:38] * cking need to reboot to try this.. rebooting ..
[11:38] FS corruption of the host is probably far more important than the guest corruption. And for the latter I'd focus on journal loss.
[11:41] xivulon: NTFS + ntfs-3g's behaviour is not really well known in this scenario. That's my concern. I'd hate to see Wubi users hacked off because their NTFS is screwed up
[11:42] ..so sync sync sync and then a fixed ntfs-3g mount -o remount may be the best way forward.
[11:43] ..one can never sync enough.
[11:43] :-)
[11:43] only comfort I can give you is that with almost 1 million downloads all the ntfs corruption claims were due to hard reboots. And I have no evidence that ntfs-3g makes things any worse
[11:43] 1 million(!!)
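(Aside on "sync sync sync": from user space, the usual way to ask for data to reach the disk is flush plus fsync on the file and on its directory. The sketch below assumes a Linux system; write_durably is a hypothetical helper name. Whether a FUSE filesystem such as ntfs-3g under a loop mount actually pushes the data to the platter on fsync is exactly the open question in this discussion, so this shows the request, not a guarantee.)

```python
import os


def write_durably(path, data):
    """Write bytes to `path` and ask the kernel to flush them to storage."""
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()               # push Python's buffer to the kernel
        os.fsync(handle.fileno())    # ask the kernel to write the file out

    # Also fsync the containing directory so the new entry is persisted.
    # Opening a directory read-only like this is assumed to work (Linux).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

(On a nested setup, each layer has to honour the request in turn: the loop file, the filesystem holding it, then the disk.)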
[11:43] since you would still risk fs corruption when hard rebooting
[11:46] I discussed that with szaka more than once, and he always asserted that the chances of fs corruption are not made any worse by the use of ntfs-3g
[11:47] yep we are close to a million
[11:48] but of course now that wubi is official, host corruption would make very bad press...
[11:55] cking: rebooting ..
=== Lure_ is now known as Lure
[12:36] lamont: I did do some basic tests yesterday for #206113 and commented on the bug let me know if you need anything else
[13:30] my most recent dist-upgrade (-15.27 et al) killed sound...
[13:35] lamont: does it also lead to a long boot time?
[13:35] dunno
[13:35] I wasn't watching...
[13:35] and actually, I think it's related to session management, not to the kernel upgrade...
[13:36] lamont: hmm.. how so?
[13:36] I think I saw this before, now I just need to remember how I got told to fix it...
[13:40] [ 54.214217] iTCO_wdt: Found a ICH5 or ICH5R TCO device (Version=1, TCOBASE=0xf860)
[13:41] not seeing where I figured it out before, istr it was simply muted hard somewhere
[13:41] * lamont afk for about 15 min
[14:12] amitk: so I take back what I said... let's blame the kernel for now... debugging ideas?
[14:17] lamont: sound-applet reports no mixers?
[14:18] lamont, in case you missed it, did the tests you asked, see comment in 206113
[14:18] xivulon: yeah - thanks
[14:18] let me know if you need anything else
[14:18] sure
[14:19] amitk: bringing the system fully current to now and rebooting (new lrm), and then I think I'll need something more basic in the way of instructions...
[14:19] * lamont iz sound neub
[14:28] amitk: anything I should tweak before I reboot
[14:28] ?
[14:28] Open Volume Control --> No volume control GStreamer plugins and/or devices found
[14:31] lamont: just note if you reboot takes longer than usual
[14:32] is -16 known to not boot? i cant find a report on it
[14:33] pwnguin: on what?
-16 is still not widely distributed as meta was just uploaded a few minutes ago.
[14:33] ah
[14:34] on my laptop, toshiba tecra m7. -generic
[14:34] pwnguin: is this a regression?
[14:34] yea
[14:34] on the other hand, i dont have -meta
[14:34] from -15 to -16?
[14:34] yep
[14:34] im on -15 right now
[14:34] pwnguin: meta won't affect the boot.
[14:35] rtg: i'd feel better about not missing part of the package with meta though
[14:35] pwnguin: at what point in the boot process does it stop?
[14:35] after turning off the graphical boot, hardware detection is the last message
[14:36] i booted into recovery mode
[14:36] and
[14:36] the last two modules giving messages were iwl and cs
[14:36] pwnguin, wait for a few minutes
[14:36] it should continue booting
[14:36] unsurprisingly, that didn't fix either of my issues
[14:36] rtg: I see this too, I was working on Classmate when I hit this
[14:37] a few minutes?
[14:37] thats a hell of a timeout ;)
[14:37] pwnguin: I am trying to determine if you are seeing the same problem as me ;)
[14:37] well
[14:37] lemme reboot it i guess
[14:37] amitk: everything since -15 has been custom binary, except for the mmc timeout.
[14:37] heh
[14:38] the mmc timeout
[14:38] interesting; i upgraded because i wanted to test that
[14:38] rtg: yeah.. I saw the changelog, and even the mmc timeout was 10ms, not 2 minutes
[14:38] better go through the git log and make sure.
[14:38] It is if it was 2ms before and not 2min
[14:39] brb, rebooting after upgrade.
[14:39] It was just a one liner changing mmc_delay from 2 to 10
[14:39] its supposed to be ms
[14:39] it could be seconds
[14:39] i never found that particular function
[14:40] It is an inline function in one of the headers
[14:40] core.h:static inline void mmc_delay(unsigned int ms)
[14:41] pwnguin: can you confirm that the boot continues after a few minutes?
albeit w/o sound
[14:41] on a related note, the sdhci developer's asked me to retry with MMC_DEBUG
[14:41] amitk: not yet
[14:41] give it a minute or so
[14:42] pwnguin: just wait a few more please, that'll tell me that it is reproducible behaviour
[14:43] pwnguin: What does Alt+F1 show?
[14:43] doh
[14:43] nothing much
[14:43] F2?
[14:43] I forget which one it is
[14:43] kinit: no resume, doing normal boot
[14:43] 8
[14:43] loading manual drivers
[14:43] i forgot to set verbose
[14:44] pwnguin: care to disable "quiet splash" in the kernel cmdline on the next boot?
[14:44] indeed
[14:44] thanks
[14:45] hmm. now to see what changed in lum
[14:45] amitk: my boot seems to proceed at normal-ish pace
[14:46] lets see. theres hda_codec at 30 saying it cant find my audio. iwl at 31 saying there's 11 tunable channels for bg, 13 for a, and several I/O port probes for module cs
[14:46] rtg: 607ab6f78fa5d51b4dab72d218455a858c499c6f in lum
[14:47] if you want logs, im afraid i dont have a serial port =(
[14:47] rtg: smb_tp: that commit looks like it will affect every sound driver
[14:47] amitk: is that the culprit?
[14:48] huzah
[14:48] fail
[14:48] rtg: I am going to revert and try on the classmate
[14:49] amitk: just rename snd_core to something, then try.
[14:49] amitk: after 2 minutes of no activity, it wrote some new text
[14:50] it seems to still be stuck though
[14:50] pwnguin: I'll bet it'll boot eventually :)
[14:52] Hm, might be bad if register all will eventually call register...
[14:52] heh
[14:52] well, it doesn't seem to be spinning
[14:52] rtg: crap.. just realised soundcore.ko comes from kernel, not ubuntu directory in /lib/modules :)
[14:53] amitk: I was just looking to see where that file gets compiled.
[14:54] rtg: that seems to improve things a lot
[14:54] I'm going to revert the commit if you have no objections
[14:55] amitk: can you rebuild lum with that patch reverted?
[14:55] rtg: yeah..
doing that now
[14:55] cking in your tests did you try mounting /host with sync & co?
[14:55] amitk: removing soundcore indicates it is the sound sub-system, but isn't proof positive.
[14:56] would it help to have 1 sec sleep before rebooting?
[14:58] xivulon: I couldn't see to create any filesystem corruptions whatsoever.
[14:58] amitk: I started a new release in LUM
[14:58] xivulon: what is interesting is looking at /proc/vmstat | grep dirty
[14:59] rtg: I've been seeing this since morning, but thought it was due to my usb patch for classmate suspend
[14:59] ..even with the aggressive vm dirty tweaks the data is not getting written back that quickly
[15:00] ..one can see the writes by doing a bit of an ugly hack: sudo echo 1 > /proc/sys/vm/block_dump and looking at dmesg
[15:00] pwnguin: you are running -15?
[15:00] no
[15:01] -16
[15:01] you compiled yourself?
[15:01] nope =(
[15:01] xivulon: ..even with the most aggressive vm dirty settings one can observe writes being flushed a few seconds after the initial write...
[15:02] i noticed -16 in the repos
[15:02] rtg: ^^
[15:02] xivulon: ..which could explain why some users are seeing corruptions even when we are forcing writebacks to occur as soon as possible.
[15:02] amitk: I saw that.
[15:02] xivulon: ..because ASAP is not really that ASAP :-(
[15:02] hmm
[15:02] pwnguin: do you also have -16 LUM installed?
[15:03] is that an ntfs-3g / fuse implementation issue?
[15:03] I only see -16 LRM, no LUM or kernel
[15:03] cking what is your suggestion based on that?
[15:03] well my mirror wins i guess
[15:03] ohh and I see -16 linux-libc-dev
[15:03] xivulon: no. I initially thought it was all the layering in the ntfs-3g, fuse, loopback etc.. but one can observe this on a normally mounted filesystem.
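(Aside on watching /proc/vmstat for dirty pages: a minimal sketch of how the lag described above could be observed over time, assuming the standard Linux counters nr_dirty and nr_writeback. The function names are illustrative. Separately, note that in `sudo echo 1 > /proc/sys/vm/block_dump` the redirection is performed by the unprivileged shell, so that form only works from a root shell; `echo 1 | sudo tee /proc/sys/vm/block_dump` is the usual workaround.)

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style "name value" lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def dirty_counters(text):
    """Return only the page counters relevant to pending writeback."""
    wanted = ("nr_dirty", "nr_writeback")
    parsed = parse_vmstat(text)
    return {name: parsed[name] for name in wanted if name in parsed}


if __name__ == "__main__":
    import time

    # Sample once a second for a few seconds after a burst of writes.
    for _ in range(5):
        try:
            with open("/proc/vmstat") as handle:
                print(dirty_counters(handle.read()))
        except OSError:
            print("/proc/vmstat not available on this system")
            break
        time.sleep(1)
```

(If nr_dirty stays non-zero for several seconds after the writes stop, that matches the "ASAP is not really that ASAP" observation.)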
[15:04] amitk: Got that installed already
[15:04] when you say "couldn't see to create any filesystem corruptions whatsoever" do you mean you couldn't create corruptions or that you were simply not testing that
[15:04] so it's a kernel issue...
[15:04] xivulon: so it appears what the kernel documentation says and what is actually happening differ.
[15:05] not too pleased to hear that
[15:05] xivulon: as for the corruptions: I hammered the system with loads of creat's and unlinks and forced a reboot and everything was OK...
[15:05] smb_tp: no side effects?
[15:06] xivulon: and also I repeated this several times and pulled the plug and got the normal fsck'ing fixes but I did not get any major failures
[15:06] cking that is in line with lack of reports on the matter. was this with the new initramfs / umountroot or in a plain vanilla case?
[15:06] xivulon: it was using the new scripts from the bug report
[15:07] amitk: None I noted, but I guess this only affects programs compiled on that host. and I didn't compile any user-space
[15:07] well that is good news
[15:07] for once...
[15:07] rebooting now....
[15:07] xivulon: I left some notes on the bug report so one can see what I mean about the write backs being a little bit too lazy for my liking
[15:08] could it be that the mount options (sync & co) interfere with sysctl?
[15:09] xivulon: no. when I saw the less than perfect performance I tried it on a vanilla configuration to do a sanity check - i.e. no insane sync,dirsync,commit=1 flags.
[15:09] cking: would you suggest we keep mount options and/or sysctl settings?
[15:10] ..and I saw the same "problem". We are probably seeing the pdflush pushing out all it can before it gets rescheduled or something.
I need to shove a whole load of debug in to see why it is a little less aggressive than expected
[15:11] so I guess you'll need some more time for the tests then we can reassess
[15:11] I would not think though that in the light of what you say the new initrd/umountroot should make much difference
[15:11] xivulon: well, closer inspection shows that the mount options are ignored by ntfs - so we just fall back to the sysctl's.
[15:11] since /host is still not remounted ro (at least in my tests)
[15:12] xivulon: agreed
[15:13] xivulon: I basically need to figure out why pdflush is not being totally true to the vm dirty knobs
[15:13] rtg: revert works and sound is back
[15:13] sure, keep in mind that we'll need to do a feature freeze exception for each change, if something is not strictly necessary (proven to be so) we should skip it
[15:13] rtg: I've pushed the revert to lum
[15:13] that applies to initrd, sysctl and umountroot
[15:15] xivulon: yep. Well I think the real issue is that we want to flush blocks out asap and I need to figure out why the pdflush does not quite match the behaviour as described on the lid
[15:15] ^on the lid^in the docs^
[15:15] ..and then I'm a little worried about tinkering with the pdflush daemon this late in a code freeze :-(
[15:16] ..but it could be something obvious once I've got my head around it.
[15:17] xivulon: any ideas on anything else we could do?
[15:17] not really
[15:21] xivulon: OK I investigate pdflush and the vm dirty knobs
[15:22] s/I/I will/
[15:36] cking, maybe doublecheck whether initramfs/umountroot make any difference
[15:38] xivulon: I think most productive thing is to see if we can get pdflush to really do aggressive writebacks to limit loopback'd fs corruption
[15:39] xivulon: I hope it is something obvious
[15:40] agree
[15:41] by the way, wouldn't something like that be relevant also for vm?
[15:43] that too is a nested fs and hence subject to the same vulnerabilities
[15:43] I am surprised I am the only one running into this
[15:51] at least for the power-loss part (normal reboot is very different of course)
[16:20] amitk: I verified the cx88 revert. it definitely makes a difference.
[16:21] rtg: half the patch was mutexes, it had to ;)
[16:21] amitk: I looked through it when I applied it. everything looked balanced, _and_ it came from upstream.
[16:22] rtg: I guess it needs another part from upstream
[16:23] amitk: could be. now I'm gonna have to figure out what the real cx88 crash was .
=== jayc__ is now known as jayc_
[17:33] i have just updated to -16 kernel and it hangs
[17:34] nxvl: LUM is currently percolating with a fix. wait for linux-ubuntu-modules-2.6.24 16.22
[17:34] ok
[17:34] so it was a known issue?
[17:34] heh
[17:34] only recently
[17:35] nxvl: it became a known issue fairly quickly.
[17:35] yes of course it has no more than one day uploaded
[17:35] but if you already have what you need to fix it
[17:35] i'm ok
[17:35] 16.22?
[17:35] https://launchpad.net/ubuntu/+source/linux-ubuntu-modules-2.6.24/2.6.24-16.22
[17:36] i have linux-headers at 16.30
[17:36] is that not kept in sync?
[17:37] pwnguin: The LUM package version is not related to the headers package version (except for the ABI number)
[17:38] ok
[17:39] so, fix is released and we only need to wait until it reach the mirrors?
[17:39] yep
[17:39] not booting is a fairly high priority bug ;)
[17:40] especially when someone like amitk can reproduce it
[17:41] ok thanks!
[17:58] btw
[17:58] i forgot to report something else
[17:58] i have a lenovo T61
[17:58] with hotplug DVD-rom
[17:59] but when I unplug it the system hangs
[18:40] rtg, i was thinking this morning, could you see to it that LBM is shipped as available on the DVD image to install?
[18:40] in its pool directory or similar
[18:41] mario_limonciell: thats a question for cjwatson
[18:42] okay i'll ask him
[18:52] rtg: are there are traces from the bootup problems (WRT #212960 in l-u-m -16)?
[18:52] amitk: ^
[18:52] are there, sheesh. can't type.
[18:53] crimsun: there don't seem to be. I'm looking at it now. Frank's original patch works, but the one Leann attached does not (it locks up)
[18:53] crimsun: I'm trying to decide if it is really the root of the problem.
[18:54] rtg: I don't think it is, after closer inspection.
[18:54] crimsun: can there really be that much list activity going on? Or is the use of mutexes just jostling the thread timing.
[18:55] crimsun: as far as I can remember, PCI device discovery is single threaded, right?
=== inuka_desk_ is now known as inuka_desk
[19:09] rtg: there shouldn't be much activity, and it is single-threaded IIRC.
[19:15] crimsun: that's what I thought. I'm not sure Frank's analysis is correct (about the lists getting corrupted).
[19:17] I can't catch an AP with my AR5008 madwifi card since the two last -14 and -15 kernel
[19:17] still no problem with the -12 kernel and the same userland as the others new
[19:22] hm what is creating the symlink between cpufreq directories on two core systems initially?
[19:22] i had thought it'd be powernowd doing it, but I don't see evidence of that
[19:23] any idea ? for the regression on madwifi
[19:23] rtg: Hm, I might be stupid but in the report snd_card_register_all is called and later snd_device_new. If thats both in core, how should that work?
[19:24] I should be more precise as it is a AR5XX8 chipset i had to use the madwifi svn and not the module in the restricted package
[19:24] it works fine on feisty/gutsy and hardy since the 2.6.24-12 kernel
[19:29] smb_tp: it clearly must work because its not a problem for everyone. I'm still looking at the cx88 driver. However, right now its lunch time.
[19:30] pepie34: you are on your own 'cause I haven't pulled the madwifi branch that supports the AR5008. I've been waiting until they make it official.
[19:30] rtg: Enjoy. I was just thinking of the same codepath with mutexes
[19:38] rtg i just want to warn them that since the -14 the madwifi-svn does not work and you will get all the mac user really angry
[19:50] amitk: so will this new kernel make my soundz werk>?
[20:24] * pwnguin does a zomg sd works again dance
[20:24] rtg: thanks
[20:49] lamont: You don't need sound.
[20:49] infinity: do too
[20:50] lamont: The kernel team disagrees.
[20:54] lamont: we've gone out of our way to wreck only _your_ audio support.
[20:55] rtg: I feel so much more productive without the noise-canceling music
[20:56] lamont: I thought you had audio once upon a time?
[21:23] hi, can someone please look at the FFe in bug #185634 and give a comment? thanks!
[21:23] Launchpad bug 185634 in ubuntu-meta "uvcvideo: iSight firmware loading does not work" [Medium,Confirmed] https://launchpad.net/bugs/185634
[23:56] is there something in -16 which broke nvidia from loading? bug 215778
[23:56] Launchpad bug 215778 in linux-restricted-modules-2.6.24 "2.6.24-16.30 kernel update - nvidia: module license 'NVIDIA' taints kernel" [Undecided,Confirmed] https://launchpad.net/bugs/215778
[23:57] lrm debdiff looked good, so it shouldn't be due to the dh_strip fix