[12:49] <rtg> all ext4 crack-heads follow Ted ==> https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/175
[12:49] <ubot3> Malone bug 317781 in linux "Ext4 data loss" [High,Fix released] 
[14:41] <dandel> rtg, you there?
[14:42] <dandel> i need to find out what i am supposed to do to set... CONFIG_ACPI_DEBUG and does this require a kernel compile?
[14:43] <rtg> dandel: I expect it will
[14:43] <dandel> ack.
[14:44] <dandel> now what do i need to install to compile 2.6.29rc8 or 2.6.29rc7?
[14:44]  * dandel is following the maintainers instructions but doesn't like kernel compiles.
[14:45] <rtg> dandel: build-essential fakeroot, etc. See https://wiki.ubuntu.com/KernelMaintenance
[14:48] <dandel> hmm
[14:48] <dandel> makedumpfile is missing
[14:49] <dandel> oh... hardy is incomplete lol
[14:52] <dandel> ok... got kernel source and whatnot installed, now what?
[14:52] <rtg> dandel: if you use 'debuild -b' then all build assumptions are checked. It'll tell you what's missing
[14:52] <rtg> of course, you must first install devscripts
[14:52] <dandel> how do i get that?
[14:53] <rtg> the usual way, 'apt-get install devscripts'
[14:53] <dandel> already got it
[14:53] <dandel> all i want to do is rebuild the acpi module with debug options.
[14:55] <rtg> dandel: you can build a binary deb for just the flavour you want, 'fakeroot debian/rules build-debs flavours=generic'. Other than that, there aren't many short cuts
[14:55] <rtg> build-debs ==> binary-debs
[14:56] <dandel> going to take 5 or 6 min to get the extracted src
[14:57] <dandel> oh wait, it finished already lol
[14:58] <dandel> debian/rules not found.
[15:01] <lool> amitk: w0àt
[15:01] <lool> that was a woot, but mistype, anyway
[15:01] <lool> amitk: NEON can cause babbage to deadlock due to hw bugs
[15:01] <lool> amitk: Could you please turn CONFIG_NEON off in imx51?
[15:02] <lool> amitk: This will be fixed in hardware updates; we don't know when we'll get them, but since we target the babbage development for now and the release anyway, we shouldn't take the risk; I don't think we'll get the new hardware soonish and others might run into it
[15:02] <lool> amitk: This is recommended by Dave Martin from ARM
[15:02] <lool> rtg: ^ You might be interested as well
[15:04] <rtg> lool: your 'Add HWCAP_NEON to the ARM hwcap.h file' won't work worth a hoot if you disable CONFIG_NEON
[15:04] <lool> rtg: Crap
[15:04] <lool> lskdgkjlqzhkgl hlq
[15:05] <lool> rtg: That makes sense; but it's kind of a stupid situation
[15:05] <rtg> lool: its all protected with '#ifdef CONFIG_NEON'
[15:06] <lool> rtg: Yes, I didn't think of that
[15:06] <lool> rtg: I've asked dave_m to join here; he requested both changes which are contradictory
[15:06] <rtg> lool: I'll let you and amitk duke it out.
[15:07] <lool> rtg: Thanks
[15:07]  * amitk would like to take this opportunity to reiterate - We shouldn't apply crack in such a hurry w/o testing
[15:08]  * rtg feels like he's been bitch slapped :)
[15:09] <dandel> hmm... where do i find the ubuntu upstream git repo for 2.6.29rc8
[15:10] <rtg> dandel: there isn't one. apw has a mainline build script that synthesizes a debian build from the upstream kernel git repo
[15:11] <apw> dandel, yep, in essence we rip the source tree out of our repository and graft in the official tree from linus' repo
[15:11] <apw> so the official source for the kernel is his code at that level
[15:12] <apw> we then push our kernel configuration in and build packages from it
[15:12] <apw> there are source packages for the trees in there
[15:12] <dandel> blah... i like binary build... a maintainer asked me to set a param and i don't want to build the whole kernel to do it.
[15:13] <dandel> apw, look at this... http://bugzilla.kernel.org/show_bug.cgi?id=12873 
[15:13] <ubot3> bugzilla.kernel.org bug 12873 in Config-Interrupts "ACPI_IRQ not set" [Normal,Needinfo] 
[15:13] <dandel> if i could figure how to set config_acpi_debug without having to build the whole kernel it'd be great.
[15:14] <apw> dandel, that's pretty core, so i think you'd have to rebuild most things to set that anyhow
[15:14] <apw> have you checked if it is set already?
[15:14] <dandel> it's not
[15:14] <dandel> not in the one that's in mainline tree
[15:15] <apw> hrm, no it's not in our standard configs, so it'd not be
[15:15] <dandel> 2.6.29-020629rc8-generic does not have that option set.
[15:15] <mjg59> dandel: You can't
[15:15] <mjg59> dandel: acpi is an integral part of the kernel. Enabling debug will force most of the kernel to be rebuilt.
[15:15] <apw> you are pretty much forced to rebuild
[15:15] <dandel> ><;
[15:16] <dandel> can i get a acpi debug enabled build then?
[15:16] <apw> i take it it has to be 2.6.29-rc7
[15:16] <dandel> 2.6.29rc8 has same dmesg output
[15:17] <dandel> i don't think it'll matter as long as i get the log up and point to which version of kernel it is.
[15:17] <apw> dandel, i take it you are not keen to build one
[15:18] <dandel> i haven't done it before, and last few times i royally screwed up the machine doing it.
[15:18] <apw> hrm
[15:18] <dandel> kernel builds take up a lot of space, and i don't have that to spare.
[15:18]  * apw has a look
[15:19] <dandel> last i checked, it takes about 5 to 6gb to build the kernel.
[15:19] <apw> i'd say about 1gb, but not tiny
[15:19] <apw> whats your platform?  32bit 64 bit?
[15:19] <dandel> 32
[15:20]  * rtg gloats about his 4 spindle 2TB build server
[15:20] <dandel> with the dmesg of... 2.6.29-020629rc8-generic ... should have been obvious.
[15:20] <dandel> i believe the 64 bit version has a version of something like... 2.6.29-020629rc8-amd64
[15:20] <apw> nope: 2.6.28-8-generic
[15:21] <apw> and i can shave 50% off by asking
[15:21] <lool> dave_m_: Hey
[15:21] <lool> amitk: Dave Martin == dave_m_ above
[15:21] <lool> amitk: it's not crap and it's useful
[15:22] <lool> rtg: ^
[15:22] <dandel> apw, at least i knew enough to check where the bugs were, which is better than 90% of people who just install to run programs.
[15:22] <apw> indeed
[15:22] <lool> amitk, rtg: At least at the source level, we can rebuild the pristine jaunty kernel when new hardware comes out
[15:22] <apw> dandel, is there an ubuntu bug for this?
[15:22] <lool> amitk, rtg: Now, according to dave_m_, the NEON code used by the kernel itself wont trigger the hang
[15:22] <dandel> yes
[15:23] <apw> what's the #, can tie the kernels to that
[15:23] <lool> amitk, rtg: So it should be safe to keep it enabled IIUC; problematic NEON code triggering the hang is in ffmpeg and in pixman, so in userspace
[15:23] <dandel> bug # 338701
[15:23] <dave_m_> lool: I would need to double-check that.
[15:23] <lool> amitk, rtg: CONFIG_NEON or not doesn't change the fact that these could deadlock the platform
[15:23] <rtg> lool: works for me
[15:23] <lool> dave_m_: please do
[15:23] <apw> bug #338701
[15:23] <ubot3> Malone bug 338701 in linux "acpi_irq is not set properly." [Undecided,Incomplete] https://launchpad.net/bugs/338701
[15:24] <dandel> it has a sister bug namely bug #294323
[15:24] <ubot3> Malone bug 294323 in linux "Special Function keys broken after upgrade ( Toshiba Satilite P305D, 2.6.27 kernel) (dup-of: 261318)" [Undecided,New] https://launchpad.net/bugs/294323
[15:24] <ubot3> Malone bug 261318 in linux "Regression: new Toshiba Laptop Support (tlsup) driver breaks Toshiba hotkeys; input device does not support 'kbd' input handler" [High,Fix released] https://launchpad.net/bugs/261318
[15:24] <lool> rtg, amitk: So what dave_m_ was also proposing is to check with FSL how we can identify the problematic boards; would you be ok to merge a patch disabling/enabling NEON based on the board?
[15:25] <lool> rtg, amitk: The hwcaps changes and the NEON disablement are really orthogonal; it's just unfortunate that the only *hardware* that we have is broken; would we have working hardware (which we'll likely have later), or another supported NEON platform (e.g. beagleboard), we wouldn't face this contradictory situation
[15:26] <dandel> although, that bug got semi-fixed, but the suspend/shutdown, power plug status and battery updates were all messed up
[15:26] <lool> rtg, amitk: I think if we get confirmation that NEON in the kernel works, we don't need to do anything to the config and you can just ignore my request to disable it; I'm sorry I only learnt later today that NEON was causing deadlocks on iMX51
[15:27] <rtg> lool: you and amitk work it out. I've already been smacked once today.
[15:27] <lool> dave_m_: In all cases, kernel patch or not, I think it's useful to recognize problematic hardware so that we can at least assist people coming with random hangs
[15:27] <dave_m_> Apologies, it was me not realising that iMX51 is the only official target with NEON support at present.
[15:27] <lool> rtg: I'm sorry about that :-(
[15:27] <lool> rtg: Didn't know about the hardware issues earlier today, this is news to me
[15:29] <dave_m_> Just running a kernel with NEON support built in doesn't cause problems. So maybe we can split the discussion into kernel and userspace.
[15:30] <dandel> apw, mind walking me through building just enough of the kernel to boot up to the console for a dmesg report?
[15:30] <lool> dave_m_: So we don't need to disable CONFIG_NEON?
[15:30] <apw> dandel, i was just working on a build for you now
[15:31] <dandel> oh, thanks :D
[15:31] <dave_m_> CONFIG_NEON by itself is not a problem.  It's only if some app tries to use NEON that the problems can occur.
[15:32] <JayFo> pgraner, finally here now. stupid restricted network
[15:32] <JayFo> I'll tell you later
[15:32] <pgraner> JayFo: lol, glad to have ya
[15:32] <JayFo> :)
[15:32] <JayFo> thanks
[15:33] <dandel> hmm... actually, i wonder how severe my bug really is... lol. (besides knocking out the power management on my laptop)
[15:33] <dandel> i had to debate between 2.6.24 and 2.6.27+ because 2.6.27 had proper cpu throttling on my laptop.
[15:38] <dandel> oh, how should i report bugs with the ubuntu hibernate, because the windows based installer and such for ubuntu leads to unusable hibernate.
[15:39] <apw> all bugs should go into launchpad
[15:39] <dandel> if i hibernate i get a long long dmesg log due to the fact it's trying to fit 3gb of ram into a 1gb swap.
[15:40] <dandel> it's an installer bug from what i can tell.
[15:40] <apw> in what sense?  that it should have made a bigger swap?
[15:41] <dandel> it should let me do a custom partition table for starters
[15:41] <apw> sounds like an installer bug yes
[15:41] <dandel> 8.04.2 autoconfigured everything.
[15:41] <dandel> and thus i couldn't even tell it that the swap needed to be at least 3gb
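The mismatch dandel describes (3gb of RAM against 1gb of swap) can be spotted by comparing MemTotal and SwapTotal in /proc/meminfo. A minimal sketch, assuming the usual "Key:   value kB" layout of that file; the helper names and the sample numbers are made up for illustration, and "swap at least as large as RAM" is a conservative rule of thumb for hibernate rather than a hard kernel requirement:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def swap_covers_ram(text):
    """True if SwapTotal is at least MemTotal (rule of thumb, see lead-in)."""
    info = parse_meminfo(text)
    return info.get("SwapTotal", 0) >= info.get("MemTotal", 0)


# Hypothetical sample resembling a 3gb RAM / 1gb swap machine.
SAMPLE = """MemTotal:        3096000 kB
MemFree:          512000 kB
SwapTotal:       1048572 kB
"""

if __name__ == "__main__":
    print(swap_covers_ram(SAMPLE))
```

On a real machine the same check would read the file instead of the sample, e.g. swap_covers_ram(open("/proc/meminfo").read()).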
[15:42] <dandel> 8.10 is the same, as for 9.0x i couldn't even get the windows based installer to even run.
[15:42] <apw> both of those sound like they need reporting
[15:43] <dandel> if i knew which part of the launchpad to put it in i would have
[15:44] <dandel> anyways, i had to blacklist the ath5k driver due to the fact it can't even work right ><; doesn't detect my wap which is less than 3 feet away.
[15:47] <amitk> lool: rtg: I still stick by my initial statement. It shouldn't have gone in, in the first place. There are several people who have the board that could test it (even in my absence). cooloney is in the China TZ and could've helped. The reason I am bitching is that it causes extra work. Apply-upload-test-explode-revert-upload isn't my ideal workflow.
[15:47] <lool> amitk: What does?  The NEON hwcap?
[15:48] <amitk> lool: yeah
[15:48] <lool> amitk: Why so?
[15:48] <cooloney> amitk, lool sorry, i don't get the whole story
[15:49] <dandel> hmm... i assume kernel panics are critical bugs right? (i have a friend who on one of the newer kernels gets panics every 5 min on his laptop.)
[15:49] <amitk> lool: because of all the discussion that has been going on. Instead, the patch could've been emailed, tested and only then applied.
[15:49] <lool> amitk: I currently need it to demonstrate NEON support in ffmpeg in jaunty; yes, it's a new feature
[15:49] <lool> amitk: The patch is ok
[15:49] <lool> amitk: It's not wrong in any case
[15:49] <amitk> lool: you claimed it was
[15:49] <lool> No
[15:50] <amitk> 17:01 < lool> amitk: NEON can cause babbage to deadlock due to hw bugs                                                                                                         gnomefreak    
[15:50] <amitk> 17:01 < lool> amitk: Could you please turn CONFIG_NEON off in imx51?     
[15:50] <lool> amitk: That's unrelated to the patch I sent
[15:50] <lool> amitk: The patch I sent is to enable NEON 
[15:50] <lool> *hwcaps*
[15:50] <lool> Not CONFIG_NEON
[15:50] <lool> amitk: CONFIG_NEON is currently turned on already
[15:51] <lool> amitk: I think you're mixing the two, they are completely orthogonal
[15:51] <gnomefreak> what did i do?
[15:52] <lool> amitk: Does this clarify?
[15:54] <lool> amitk: I logically need NEON support in all software bits, but the hardware is broken; all NEON support around is correct, and even the binary kernel is ok if you get a new babbage board which doesn't deadlock in some code present in ffmpeg and pixman
[15:57] <dave_m_> lool: Is it reasonable to enable the infrastructure for NEON support (CONFIG_NEON, HWCAP_NEON and optionally ld.so hwcap support), but to avoid software which uses NEON for this release? It still feels valuable, because people can develop against the infrastructure when suitable platforms are available.
[15:59] <amitk> lool: perhaps I was mixing them up. In which case apologies.
[16:00] <lool> amitk: No problem, I do agree with you that the hurry was a bad idea, but it's close to beta and I wanted to meet our goals, even if the technical changes are understood only so late   :-/
[16:01] <lool> amitk: I feel bad to have put everybody on the nerves about this; I'll do my best to make everything as smooth as possible; concerning NEON we don't need any other change in the tree now; the patches which rtg merged this morning were correct and useful
[16:02] <lool> No need to change CONFIG_NEON either, that part was wrong from me
[16:02] <rtg> lool: s'ok, I had not gotten around to it yet anyway
[16:02] <amitk> lool: ack
[16:03] <lool> rtg: Do keep your hwcaps changes though  ;-)
[16:03] <rtg> lool: both patches are in the repo
[16:03] <lool> rtg: Saw them, that's great, thanks a lot
[16:33] <dandel> apw, how's the build going along?
[16:33] <apw> its building now, i had some tooling issues, not tried to build a mainline kernel locally since they were automated and i've broken it along the line
[16:34] <dandel> oh fun.
[18:05] <apw> dandel, the kernel images you needed should be uploaded in a couple of minutes, link in the bug
[18:22] <dandel> apw, ok, i'll get em in about 15 min, just to make sure they are completed.
[18:22] <apw> they are done my end
[18:23] <dandel> k
[18:26] <dandel> i'll put up the log asap.
[18:27] <dandel> just have to wait for initramfs.
[18:28] <dandel> libc 2.6 vs libc 2.7 lol... nice little complaint lol.
[18:34] <Keybuk> rtg: around?
[18:34] <rtg> Keybuk: I'm a square
[18:34] <Keybuk> a square will do fine
[18:34] <Keybuk> need an opinion on bug #296710
[18:34] <ubot3> Malone bug 296710 in linux "warning: ehci_hcd loaded AFTER uhci_hcd and ohci_hcd" [Undecided,Confirmed] https://launchpad.net/bugs/296710
[18:35] <Keybuk> I've been doing some investigation, and it turns out that the three drivers have no overlapping module aliases
[18:35] <Keybuk> in fact, they each export exactly one
[18:35] <Keybuk> ehci_hcd  pci:v*d*sv*sd*bc0Csc03i20*
[18:35] <Keybuk> ohci_hcd  pci:v*d*sv*sd*bc0Csc03i10*
[18:35] <Keybuk> uhci_hcd  pci:v*d*sv*sd*bc0Csc03i00*
[18:35] <rtg> right, which is one of the reasons I've been confused why link order makes any difference
[18:35] <rtg> or load order
[18:36] <Keybuk> now, because they don't overlap - there's nothing we can do in userspace
[18:36] <Keybuk> modules.order won't help
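The "no overlap" point can be checked by treating the three aliases as shell-style globs against a PCI modalias string. A rough sketch using Python's fnmatch; the modalias layout follows the kernel's pci:v...d...sv...sd...bc..sc..i.. form, the vendor/device IDs and zeroed subsystem IDs are placeholders for illustration, and the prog-if values (00 UHCI, 10 OHCI, 20 EHCI) are read off the aliases Keybuk pasted:

```python
from fnmatch import fnmatchcase

# The three aliases quoted in the channel.
ALIASES = {
    "ehci_hcd": "pci:v*d*sv*sd*bc0Csc03i20*",
    "ohci_hcd": "pci:v*d*sv*sd*bc0Csc03i10*",
    "uhci_hcd": "pci:v*d*sv*sd*bc0Csc03i00*",
}


def pci_modalias(vendor, device, subvendor, subdevice, base_class, subclass, prog_if):
    """Build a PCI modalias string from numeric IDs."""
    return "pci:v%08Xd%08Xsv%08Xsd%08Xbc%02Xsc%02Xi%02X" % (
        vendor, device, subvendor, subdevice, base_class, subclass, prog_if)


def matching_modules(modalias):
    """Return the modules whose alias glob matches this modalias."""
    return sorted(name for name, pattern in ALIASES.items()
                  if fnmatchcase(modalias, pattern))


if __name__ == "__main__":
    # Placeholder IDs: a USB host controller (class 0C, subclass 03), prog-if 00.
    usb1_host = pci_modalias(0x8086, 0x27C8, 0, 0, 0x0C, 0x03, 0x00)
    # Same class, prog-if 20.
    usb2_host = pci_modalias(0x8086, 0x27CC, 0, 0, 0x0C, 0x03, 0x20)
    print(usb1_host, matching_modules(usb1_host))
    print(usb2_host, matching_modules(usb2_host))
```

Each host controller matches exactly one module, so no single modalias lookup ever expands to more than one of the three drivers.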
[18:36] <dandel> apw, no log change.
[18:36] <Keybuk> if we find the USB-1 hub first, [uo]hci_hcd will be loaded before ehci_hcd
[18:36] <apw> dandel, you should have a .config with it in /boot
[18:36] <Keybuk> and that can happen if the laptop has a USB-1 hub, and someone plugs in a USB-2 pccard
[18:36] <apw> can you confirm the entry is in there
[18:37] <Keybuk> but I don't understand why there's a warning in the kernel anyway?
[18:37] <rtg> Keybuk: can we modprobe it in initramfs ?
[18:37] <Keybuk> if the modaliases don't overlap
[18:37] <Keybuk> then why does it matter?
[18:37] <Keybuk> rtg: if that's the kind of fix we need, build it into the kernel!
[18:37] <rtg> Keybuk: lemme look at the code.
[18:37] <Keybuk> k,
[18:37] <Keybuk> bbiab tea
[18:38] <dandel> it is set
[18:38] <dandel> but, config_acpi_debug_func_trace is not
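The check apw asked for (is the option in the .config shipped in /boot?) can be scripted. A small sketch, assuming the standard kernel config syntax of "CONFIG_FOO=y" for enabled options and "# CONFIG_FOO is not set" for disabled ones; the function name and the sample text are illustrative only:

```python
def config_state(config_text, option):
    """Return the option's value ('y', 'm', ...), 'n' if explicitly not set,
    or None if the option does not appear at all."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == "# %s is not set" % option:
            return "n"
    return None


# Illustrative sample mirroring what dandel reported for the test build.
SAMPLE = """CONFIG_ACPI=y
CONFIG_ACPI_DEBUG=y
# CONFIG_ACPI_DEBUG_FUNC_TRACE is not set
"""

if __name__ == "__main__":
    for opt in ("CONFIG_ACPI_DEBUG", "CONFIG_ACPI_DEBUG_FUNC_TRACE"):
        print(opt, config_state(SAMPLE, opt))
```

Matching on option + "=" keeps CONFIG_ACPI_DEBUG from being confused with CONFIG_ACPI_DEBUG_FUNC_TRACE.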
[18:39] <dandel> i'll put the log on the main kernel.
[18:40] <apw> i thought they only asked for the former
[18:40] <apw> but yep stuff it up and see what they ask for
[18:40] <apw> next
[18:42] <dandel> done, and should i place it on the launchpad too?
[18:43] <dandel> apw, i think you're on intrepid :)
[18:43] <apw> dandel, yeah put it on there too may as well, helps keep us informed
[18:43] <apw> dandel, why so?
[18:44] <dandel> laptop is set to hardy
[18:44] <apw> i built those images in an intrepid chroot, but the machine was running jaunty
[18:45] <dandel> it's got a long string of error due to that ><;
[18:46] <dandel> however, that's long after the irq is disabled.
[18:48] <apw> dandel, odd
[18:48] <dandel> here's the log... http://launchpadlibrarian.net/23949351/dmesg_2.6.29-020629rc8-generic.log
[18:48] <dandel> line by line dump as follows... [    2.524473] Pid: 1, comm: swapper Not tainted 2.6.29-020629rc8-generic #1
[18:49] <dandel> that's where it starts... i won't paste much more since it's over 15 lines.
[18:49] <dandel> it ignores my dsdt in the bios
[18:50] <dandel> however, that's not new.
[18:51] <apw> [    0.000000] Unknown boot option `acpi_debug.layer=0x44': ignoring
[18:51] <apw> [    0.000000] Unknown boot option `acpi_debug.level=0x08000004': ignoring
[18:51] <apw> doesn't look like the debug is turned on in that boot to me
[18:53] <dandel> i set the parameters :D
[18:55] <dandel> pfft
[18:56] <dandel> i found it
[18:56] <dandel> he should have said... acpi.debug_level
[18:56] <apw> dandel, yep, just about to say the very same
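The two "Unknown boot option" lines above come from the parameter names: the kernel expects module.parameter, and the debug knobs belong to the acpi module, hence acpi.debug_layer and acpi.debug_level rather than acpi_debug.layer and acpi_debug.level. A throwaway sketch that rewrites a command line accordingly; the helper name is made up for illustration:

```python
# Mistyped name -> name the kernel actually recognises.
RENAMES = {
    "acpi_debug.layer": "acpi.debug_layer",
    "acpi_debug.level": "acpi.debug_level",
}


def fix_acpi_debug_args(cmdline):
    """Rewrite mistyped ACPI debug parameters, leaving other arguments alone."""
    fixed = []
    for arg in cmdline.split():
        name, sep, value = arg.partition("=")
        fixed.append(RENAMES.get(name, name) + sep + value)
    return " ".join(fixed)


if __name__ == "__main__":
    print(fix_acpi_debug_args(
        "ro acpi_debug.layer=0x44 acpi_debug.level=0x08000004"))
```

The values themselves (0x44 and 0x08000004) are the ones from the log and are passed through unchanged.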
[18:57] <dandel> yay... now the spam begins.
[18:58] <dandel> ev_gpe_detect
[18:58] <dandel> and ev_fixed_event_detect are large parts
[18:58] <dandel> 80 seconds of spam b4 it got done
[18:59] <rtg> Keybuk: I wonder what the downside would be to build in ehci? I can't see a hardware reason for the warning, just vague warning in google articles that bad things could happen, or that existing USB 1.0 devices will get disconnected when a 2.0 device is inserted.
[18:59]  * dandel adds new log shortly.
[19:00] <IntuitiveNipple> rtg: I was able to reproduce that on a PC here at one time. Would it help if I figured out which one so we can work on an affected system?
[19:01] <rtg> IntuitiveNipple: can you remember how you did it? Insert USB 1.0 device, then a 2.0 HUB ?
[19:01] <dandel> uhh... apw... log is too long
[19:02] <dandel> it cut out huge parts of it
[19:02] <apw> heh nice
[19:02] <IntuitiveNipple> rtg: I *think* it was just the 'default' boot-up ... I didn't *know* there was a problem until an external USB2 hub with 160G USB2 drive began behaving slowly... then the slowness 'went away' so I never really confirmed the issue 
[19:02] <dandel> [  190.652052]    evgpe-0444 [00] ev_gpe_detect         : Read GPE Register at GPE18: Status=01, Enable=80 to [  214.452038]    evgpe-0444 [00] ev_gpe_detect         : Read GPE Register at GPE18: Status=01, Enable=80
[19:02] <dandel> that takes 1.1k lines
[19:03] <rtg> IntuitiveNipple: so, its likely you either have a 1.0 hub built in to the laptop, or only have 1.0 devices connected initially.
[19:03] <dandel> hmm.
[19:03] <IntuitiveNipple> dandel: apw: FADT: X_PM1a_EVT_BLK.bit_width (16) does not match PM1_EVT_LEN (4)   ... not necessarily related but...
[19:04] <IntuitiveNipple> rtg: Yes, it has both USB1 and USB2 hubs... USB1 seem to be used for the internal USB devices
[19:04] <dandel> hmm.... i better copy the source file in /var
[19:04] <IntuitiveNipple> rtg: (four lspci lines follow)
[19:04] <IntuitiveNipple> 00:1d.0 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 [8086:27c8] (rev 02)
[19:04] <IntuitiveNipple> 00:1d.1 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 [8086:27c9] (rev 02)
[19:04] <IntuitiveNipple> 00:1d.2 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 [8086:27ca] (rev 02)
[19:04] <IntuitiveNipple> 00:1d.3 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 [8086:27cb] (rev 02)
[19:04] <IntuitiveNipple> 00:1d.7 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller [8086:27cc] (rev 02)
[19:05] <rtg> IntuitiveNipple: hmm, so it _did_ load the ehci driver, but just not in the right order. correct?
[19:05] <IntuitiveNipple> rtg: actually, yes, I think this is affected... I've just checked /etc/initramfs-tools/modules and found:
[19:05] <IntuitiveNipple> ehci_hcd
[19:05] <IntuitiveNipple> uhci_hcd
[19:06] <rtg> IntuitiveNipple: well, thats what fixed it
[19:06] <IntuitiveNipple> So, I must have thought I needed to fix it. I have dinner now but afterwards I'll change that and do some tests for you
[19:06] <dandel> eww... corrupted low memory is in the log
[19:06] <rtg> IntuitiveNipple: I'll build a test kernel with ehci built in
[19:07] <IntuitiveNipple> rtg: I can do that here. I'll get back to you later when I've had time to play
[19:07] <rtg> IntuitiveNipple: I think it already is built in.
[19:07] <rtg> oh, nm
[19:08] <IntuitiveNipple> modinfo ehci_hcd
[19:08] <IntuitiveNipple> filename:       /lib/modules/2.6.28-9-generic/kernel/drivers/usb/host/ehci-hcd.ko
[19:08] <rtg> right, I was looking at ARM configs
[19:08] <IntuitiveNipple> gotta go - dinner cooling :D
[19:12] <dandel> apw, i got it... but compressing it takes first step
[19:14] <dandel> which is more useful... syslog messages or kern.log
[19:16] <Keybuk> rtg: yeah the warning confuses me
[19:16] <Keybuk> maybe it's to do with how the kernel decides it's a USB 2.x hub?
[19:16] <Keybuk> if you have a USB 2.x hub with only USB 1.x devices plugged in, does it look like a USB 1.x hub ?
[19:16] <Keybuk> so ohci/uhci would claim it
[19:17] <rtg> Keybuk: AFAIK
[19:17] <Keybuk> whereas the ehci_hcd driver knows better?
[19:17] <Keybuk> that sounds rather shaky though
[19:17] <Keybuk> and almost like they shouldn't have three drivers ;)
[19:17] <dandel> apw, want to look through em?
[19:18] <rtg> Keybuk: on the other hand, it doesn't make sense that a 1.0 device would cause uhci to load. The driver that's loaded is, after all, based on PCI ID.
[19:18] <IntuitiveNipple> right... back. Will rebuild initrd without the module load-order
[19:18] <Keybuk> rtg: indeed
[19:18] <mjg59> Load order is important
[19:18] <rtg> IntuitiveNipple: mighty quick dinner :)
[19:18] <Keybuk> mjg59: but *why* is it important? :)
[19:18] <Keybuk> that's what we're trying to understand
[19:18] <mjg59> If uhci loads before ehci then it'll chirp 1.0 at the device
[19:19] <mjg59> Then when ehci loads the device will already be bound
[19:19] <Keybuk> mjg59: but given uhci has no device table overlap with ehci - why does uhci even try?
[19:19] <IntuitiveNipple> rtg: fast eater :D
[19:19] <mjg59> Keybuk: I don't understand
[19:19] <Keybuk> surely it should dismiss the pci device because it has the wrong pci ids
[19:19] <mjg59> Keybuk: No
[19:19] <Keybuk>  ehci_hcd  pci:v*d*sv*sd*bc0Csc03i20*
[19:19] <Keybuk>  ohci_hcd  pci:v*d*sv*sd*bc0Csc03i10*
[19:19] <Keybuk>  uhci_hcd  pci:v*d*sv*sd*bc0Csc03i00*
[19:19] <mjg59> Keybuk: ehci doesn't support USB 1 at all
[19:19] <mjg59> Keybuk: You have to have uhci or ohci fallback
[19:20] <Keybuk> doesn't support USB 1 hubs or USB 1 devices on USB 2 hubs?
[19:20] <mjg59> Things get complicated with 1 devices on 2 hubs
[19:20] <mjg59> At that point I think you're supposed to speak 1 in 2 framing
[19:20] <Keybuk> what I don't get is why, when you load uhci_hcd, it does anything to the PCI device that it doesn't claim
[19:20] <Keybuk> surely it should ignore it (just like it doesn't try and speak USB 1 to your sound card :p)
[19:20] <mjg59> Which is loading first? uhci or ehci?
[19:21] <Keybuk> let's say you have a machine
[19:21] <Keybuk> it has two PCI devices
[19:21] <Keybuk> one of which is a USB 2.x hub (i20)
[19:21] <Keybuk> the other is a USB 1.x hub (i00)
[19:21] <mjg59> No, don't use the word hub here
[19:21] <Keybuk> what's the right word?
[19:21] <mjg59> host
[19:21] <Keybuk> ok
[19:21] <Keybuk> so we have in slot 1 a USB 1.x host (i00)
[19:21] <Keybuk> and in slot 2 a USB 2.x host (i20)
[19:22] <Keybuk> we'll probably load uhci_hcd first
[19:22] <mjg59> Right
[19:22] <Keybuk> because of the ...i00* match
[19:22] <mjg59> At that point the USB ports will be powered up
[19:22] <Keybuk> why does that module claim the device in slot 2 ?
[19:22] <mjg59> It doesn't
[19:22] <mjg59> ehci will also be loaded
[19:22] <mjg59> But
[19:22] <mjg59> When uhci binds, it'll power up the ports
[19:22] <Keybuk> ehci will only be loaded because there's a USB 2.x host as well
[19:22] <mjg59> The USB device at the other end will then send a chirp
[19:22] <Keybuk> the ports of which host?
[19:22] <mjg59> They're the same ports
[19:22] <Keybuk> or are you saying those two hosts share a single hub?
[19:23] <mjg59> As I said, the word "hub" is not meaningful here
[19:23] <Keybuk> sorry, single port (or set of ports)
[19:23] <mjg59> But yes, the two hosts share the same physical ports
[19:23] <Keybuk> ahhh
[19:23] <Keybuk> so the two PCI devices do not map 1:1 to the ports on the back
[19:23] <mjg59> The USB device will then chirp and only get a USB 1 device
[19:23] <Keybuk> the ports on the back are shared by both PCI devices
[19:23] <mjg59> s/device/response/
[19:23] <Keybuk> one of which is for a USB 1.x legacy driver and legacy OS
[19:23] <Keybuk> the other is for a USB 2.x OS ?
[19:23] <mjg59> ehci will then load, but will not reenumerate the devices
[19:23] <mjg59> Because they're already bound to uhci
[19:24] <Keybuk> what happens if you load ehci, and you have only USB 1.x hosts?
[19:24] <mjg59> They don't work
[19:24] <Keybuk> but then if you load uhci ?
[19:24] <Keybuk> do they work then?
[19:24] <mjg59> Yes
[19:24] <Keybuk> neat, thanks for the explanation
[19:24] <Keybuk> so building in ehci would solve the issue ;)
[19:24] <mjg59> They'll chirp, ehci will refuse to talk to them and then uhci will reenumerate them
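mjg59's explanation can be condensed into a toy model: the two hosts share the same physical ports, whichever driver binds first gets first go at the attached devices, ehci refuses 1.x devices, and ehci does not reenumerate anything already bound to uhci. The sketch below only illustrates that reasoning; it is not how the kernel or the hardware implements port handoff, and every name in it is invented:

```python
def enumerate_ports(load_order, devices):
    """Toy model of host driver load order on shared ports.

    load_order: driver names in the order they bind.
    devices: dict of port -> "high" (USB 2.0 capable) or "full" (USB 1.x only).
    Returns dict of port -> (driver, speed the device ends up running at).
    """
    bound = {}
    for driver in load_order:
        for port, capability in devices.items():
            if port in bound:
                continue  # already bound, never reenumerated in this model
            if driver == "uhci_hcd":
                # uhci only speaks 1.x, so everything it picks up runs at 1.x.
                bound[port] = ("uhci_hcd", "1.x")
            elif driver == "ehci_hcd" and capability == "high":
                # ehci refuses 1.x devices and leaves them for uhci.
                bound[port] = ("ehci_hcd", "2.0")
    return bound


DEVICES = {"port1": "high", "port2": "full"}

if __name__ == "__main__":
    print(enumerate_ports(["uhci_hcd", "ehci_hcd"], DEVICES))
    print(enumerate_ports(["ehci_hcd", "uhci_hcd"], DEVICES))
```

With uhci first the 2.0-capable device on port1 is stuck at 1.x; with ehci first it runs at 2.0 and the 1.x device still lands on uhci, which is the ordering the kernel warning is about.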
[19:25] <rtg> mjg59: so, it seems that the real solution is to build them both into the kernel, making sure ehci is linked first, right?
[19:25] <Keybuk> rtg: ehci is first in the link order
[19:25] <Keybuk> though it's tempting to leave uhci and ohci as modules since they're "less common"
[19:25] <mjg59> Yeah, for a reason
[19:25] <dandel> apw, the log is up, however, i had to compress it with gz so it would be a manage-able download.
[19:25] <mjg59> Keybuk: Built-in USB peripherals like fingerprint readers are often USB 1
[19:25] <Keybuk> mjg59: the link order doesn't pass down to userspace though, since though it's in modules.order, we never probe an alias that expands to both modules
[19:25] <Keybuk> mjg59: ahh, interesting
[19:26] <mjg59> Because it's cheaper to produce
[19:26] <rtg> Keybuk: plenty of old mice that are 1.0
[19:26] <Keybuk> hmm
[19:26] <Keybuk> it only matters for the host though?
[19:26] <Keybuk> ehci can talk to a 1.x mouse on a port served by 2.x host?
[19:27] <mjg59> Keybuk: No
[19:27] <Keybuk> no?
[19:27] <Keybuk> ahh
[19:27] <mjg59> ehci can't speak USB 1 wire protocol
[19:27] <Keybuk> so to support a 1.x mouse, you have to have a 1.x host?
[19:27] <mjg59> Yes
[19:27] <Keybuk> got it
[19:27] <mjg59> Or plug it into a 2.0 hub which is plugged into a 2.0 host
[19:27] <mjg59> The hub does the reframing in that case
[19:27] <Keybuk> right, the 2.0 hub has a 1.0 host in it, and knows how to transmit that
[19:27] <JanC> 1.1 works on a 2.0 host, 1.0 doesn't work on 2.0 host, I think?
[19:28] <rtg> uhci takes over when a 1.0 peripheral is inserted (I think)
[19:28] <Keybuk> hmm
[19:28] <Keybuk> when you insert a new device
[19:28] <Keybuk> if you have both ehci and uhci loaded
[19:28] <Keybuk> how do they determine which one wins?
[19:28] <mjg59> There's some magic in that case
[19:28] <mjg59> But ehci always wins
[19:29] <Keybuk> and that magic doesn't work on first init unless ehci is loaded first?
[19:29] <Keybuk> I guess you have to know to switch the magic on ;)
[19:29] <mjg59> I believe that once ehci has loaded, the root hub is then willing to respond to the 2.0 chirp
[19:29] <mjg59> It's handled at a level below the OS
[19:30] <Keybuk> ah, ok
[19:30] <rtg> Keybuk: building these modules into the kernel is essentially the same as the initramfs solution.
[19:30] <Keybuk> rtg: yes, except it works when you don't have an initramfs ;)
[19:30] <rtg> correct
[19:31] <mjg59> The strongest argument for building ohci and uhci in is so you have a working keyboard
[19:31] <mjg59> Since they're basically always USB 1.x
[19:31] <Keybuk> mjg59: yes, I was thinking about that ;)
[19:31] <IntutiveNipple-S> The test PC has just booted... will post the dmesg to the bug report
[19:33] <mjg59> Keybuk: This is also a case where suspend/resume ordering is important - if you resume the 1.x host before the 2.0 one, all your dual-speed devices fall back to 1.x
[19:34] <Keybuk> ah, interesting
[19:34] <Keybuk> it definitely sounds like just building these in is the right solution
[19:34] <Keybuk> I can't think of any con of doing so
[19:34] <mjg59> The only reason this currently works is that ehci always comes after uhci in the PCI tree, and we resume devices backwards...
[19:34] <rtg> mjg59: are there no OHCI controllers on 64 bit platforms?
[19:35] <mjg59> rtg: You can get them on PCI cards
[19:35] <apw> Keybuk, i thought someone said they needed to blacklist them for something?
[19:35] <Keybuk> apw: only reason I can think of would be power?
[19:35] <apw> how will you guarantee init order in these two?
[19:36] <Keybuk> apw: kernel link order
[19:36] <apw> i think that might have been the reason given thinking on it
[19:36] <apw> that you can guarantee it with a probe/probe sequence and not by building in
[19:36] <apw> if we can now do that then we win
[19:37] <Keybuk> ?
[19:37] <Keybuk> I didn't understand that
[19:37] <mjg59> apw: It's guaranteed that drivers will be initialised in the order that they're linked into the kernel
[19:37] <mjg59> As long as ehci binds first, you won't have a problem
[19:37] <mjg59> Of course, xhci will have to link before ehci
[19:37] <apw> Keybuk, i think my memory is that undefined init order was a reason given to me to not build them in
[19:38] <apw> it sounds like we are relying on and assuming link order as init order so
[19:38] <Keybuk> apw: the init order in the kernel is very very definitely defined ;)
[19:38] <apw> building them in sounds like it's a win
[19:38] <mdz> what is the maximum amount of physical RAM supported by the 32-bit -generic kernel?
[19:38] <apw> 4GB
[19:38] <apw> but there is no guarantee you can put 4GB in any one machine
[19:38] <Keybuk> mjg59: xhci replaces ehci though, right?  I remember reading that it's supposed to be compatible with 2.0 up
[19:38] <apw> and see it with a 32 bit kernel
[19:38] <mjg59> Keybuk: Oh, maybe
[19:39] <apw> some machines place it elsewhere to avoid making io holes in it
[19:39] <mjg59> Keybuk: But if it's compatible you'll need to build xhci in so that ehci can't bind to it
[19:39] <mdz> apw: hmm
[19:39] <Keybuk> mjg59: right, but this makes sense too
[19:39] <Keybuk> otherwise you then have to remember to load x-before-h-except-after-o-unless-using-e
[19:39] <apw> it is common for 4 gb machines to place memory at 0, 1, 2 and 4 gb
[19:39] <mjg59> It's guaranteed that you *can't* get 4GB on 32-bit without PAE
[19:39] <apw> and you would only see the first three with a non-PAE kernel, i.e. with our 32-bit kernels
[19:40] <mjg59> Since PCI has to go somewhere
[19:40] <apw> mjg59, it's guaranteed you won't get 100% of the 4GB yes, but some may surround the io space
[19:40] <mdz> would it be fair to say that it does NOT support 4GB or more?
[19:40] <mjg59> Keybuk: It's starting to sound like a rap star
[19:40] <rtg> mdz: absolutely correct
[19:40] <mdz> s/fair/accurate/
[19:40] <Keybuk> mjg59: MChci!
[19:40] <apw> close enough for a release note style statement yes
[19:42] <Keybuk> it would certainly be more accurate to say that "x86 supports machines of less than 4GB of memory" than "x86 supports machines of up to 4GB of memory"
[19:43] <mjg59> A PAE kernel obviously gives you support for more, limited by your chipset
[19:43] <Keybuk> but PAE kills kittens, apparently
[19:43] <IntutiveNipple-S> Keybuk: can we blacklist a driver in initrd ?
[19:43] <rtg> mjg59: presumably, on those machines you couldn't stuff more than 4GB
[19:43] <Keybuk> IntutiveNipple-S: yes, same way as you do after - just run update-initramfs
[19:44] <mjg59> rtg: 945 only has 32 bits of data lines on the memory controller, for instance
[19:44] <IntutiveNipple-S> Keybuk, so it's enough to add to /etc/modprobe.d/blacklist ?
[19:44] <Keybuk> rtg: the problem aiui is that once you enable PAE, you cease supporting chipsets without it
[19:44] <mjg59> But even with PAE, you can't get the full 4GB
[19:44] <mjg59> Keybuk: Chips, not chipsets
[19:44] <Keybuk> IntutiveNipple-S: and run update-initramfs, yes
[19:44] <IntutiveNipple-S> Keybuk, OK, going to add ehci_hcd to the blacklist to see if the physical external ports are shared by the controllers
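The blacklist step discussed above can be sketched roughly as follows. This is a minimal sketch, not a packaged tool: `blacklist_module` and its optional second argument are hypothetical names introduced here for illustration, and the default path is simply the one mentioned in the conversation (newer module tools generally expect a `.conf` suffix on files in `/etc/modprobe.d/`).

```shell
#!/bin/sh
# Hypothetical helper: append "blacklist <module>" to a modprobe.d file,
# but only if that exact line is not already present.
blacklist_module() {
    module=$1
    conf=${2:-/etc/modprobe.d/blacklist}   # path as mentioned in the chat
    # -x: match the whole line, -F: fixed string, -q: quiet
    if ! grep -qxF "blacklist $module" "$conf" 2>/dev/null; then
        printf 'blacklist %s\n' "$module" >> "$conf"
    fi
}

# Typical use, as root:
#   blacklist_module ehci_hcd
#   update-initramfs -u    # rebuild the initramfs so it sees the blacklist
```

Calling the helper twice leaves a single entry, so it should be safe to re-run before each `update-initramfs`.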
[19:44] <mjg59> You can run a PAE kernel on a core 2 with a 945, even though it's only got 4GB of address space
[19:45] <rtg> Keybuk: I'm thinking about a new PAE flavour for Karmic, and drop 32 bit server
[19:45] <Keybuk> rtg: I'd make PAE the default and have a -crusty flavour - but that's me, always forward never looking back at the people crying in my wake ;)
[19:45] <rtg> can't make it the default. too many non-PAE CPUs out there. VIA for example.
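Whether a given CPU could run a PAE kernel at all can be checked from userspace. The sketch below assumes an x86 machine whose `/proc/cpuinfo` has a `flags` line; `has_pae` and its optional file argument are hypothetical names, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the first "flags" line of a
# cpuinfo-style file lists the "pae" CPU feature as a whole word.
has_pae() {
    cpuinfo=${1:-/proc/cpuinfo}
    grep -m1 '^flags' "$cpuinfo" | grep -qw pae
}

# Typical use:
#   if has_pae; then echo "CPU supports PAE"; else echo "no PAE"; fi
```

This only reports CPU support; as noted in the discussion, the chipset still limits how much memory is actually addressable.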
[19:49] <IntutiveNipple-S> we need a way to 'stop' external HDDs on USB ... when it is unplugged I can hear the emergency retract
[19:50] <Keybuk> IntutiveNipple-S: eject them
[19:50] <Keybuk> eject /dev/sdN
[19:50] <IntutiveNipple-S> I tried that ... didn't help. Will re-investigate when I get time
[19:51] <IntutiveNipple-S> OK, the other machine has decided to do an fsck... twiddles thumbs
[19:51] <rtg> IntutiveNipple-S: it's time for ext4 :)
[19:52] <IntutiveNipple-S> rtg, most of the volumes on it are ext4... I suspect this one might have remained ext3 for backwards compatibility with Hardy though
[19:52] <IntutiveNipple-S> There's 11 volumes mounted on it, that helps cut down on boot delays from one big fsck
[19:53] <rtg> IntutiveNipple-S: sounds like a hell of a big server
[19:53] <IntutiveNipple-S> laptop
[19:54] <IntutiveNipple-S> I just split all the logical work areas into separate LVMs
[19:54] <rtg> or, just a big drive with lots 'o volumes
[19:54] <IntutiveNipple-S> so, /home then /home/all then /home/all/Project /home/all/SourceCode /home/tj/ and so forth
[19:55] <IntutiveNipple-S> music, videos, etc., on separate LVMs too
[19:55] <IntutiveNipple-S> right the laptop has started. Confirmed ehci_hcd isn't loaded
[19:56] <rtg> I just built a server with 4 500GB disks using raid 0 and ext4. It's significantly faster than a single spindle.
[19:57] <Keybuk> heh
[19:58] <IntutiveNipple-S> Yeah, I've got a dell with PERC and RAID-5 in the garage, and a soft md raid-5 with 5 disks in another dev machine. makes a hell of a difference
[19:58] <mjg59> It also turns out that fsck is much faster if you can run as many threads as you have spindles
[19:59] <IntutiveNipple-S> USB1 transfer is happening via port 1
[19:59] <IntutiveNipple-S> just need to check which host is in use
[20:00] <Keybuk> mjg59: something we're vaguely working on now that fsck is moving into util-linux-ng is to have a mount daemon in there that handles the fsck and mount stuff based on either udev or dbus
[20:01] <mjg59> There's threaded patches for ext3 fsck. And probably ext4.
[20:01] <Keybuk> indeed
[20:01] <Keybuk> though then you need a flag point where you know that /dev is full
[20:02] <mjg59> And SGI implemented it for xfs at some point (their Final Solution)
[20:10] <IntutiveNipple-S> Is there an easy way with /sys/ to show the USB host and target path to show which device is connected via which port on which host?
[20:10] <IntutiveNipple-S> I've been digging but can't find a clear way to show the linkage
[20:17] <Keybuk> $ for DEVICE in /sys/bus/usb/devices/*; do readlink $DEVICE; done
[20:18] <Keybuk> the PCI id part of the path will change
[20:18] <IntutiveNipple-S> beautiful, thanks. I was trying something similar but starting from /sys/block/ (to show the disk-to-usb relationship)
[20:19] <IntutiveNipple-S> If i can get something similar from the block/ path it will make it easy to show which host/port the device in port 1 is on for all permutations of the drivers
[20:20] <IntutiveNipple-S> I've got "readlink /sys/block/sdb/device" == "../../../7:0:0:0" but not been able to get the clear path shown like your code does
[20:20] <Keybuk> just do a readlink on /sys/block/sdX itself
[20:20] <Keybuk> $ for DEVICE in /sys/block/*; do readlink $DEVICE; done
[20:24] <IntutiveNipple-S> got it! I'd gone one directory too far :)
[20:24] <IntutiveNipple-S> readlink /sys/block/sdb
[20:24] <Keybuk> ;)
[20:24] <IntutiveNipple-S> I was working inside the sdb/ directory off the device link
[20:25] <IntutiveNipple-S> ended up with lots of relative stuff, not realising the sdb/ has symlinked me someplace else in /sys/ already
[20:25] <Keybuk> that works too
[20:25] <Keybuk> it'll just take you higher up in the same tree
[20:25] <IntutiveNipple-S> this is what i wanted... one line shows the linkage now
[20:25] <Keybuk> /sys/block/sda will point at something like /sys/devices/BUS/DEVICE/SCSI HOST/SCSI TARGET/SCSI ID/block/sda
[20:26] <Keybuk> /sys/block/sda/device will point at the physical device
[20:26] <Keybuk> which is obviously the SCSI ID
[20:26] <Keybuk> /sys/block/sda will point at something like /sys/devices/BUS/DEVICE/SCSI HOST/SCSI TARGET/SCSI ID
[20:26] <Keybuk> err
[20:26] <Keybuk> /sys/block/sda/device will point at something like /sys/devices/BUS/DEVICE/SCSI HOST/SCSI TARGET/SCSI ID
[20:26] <Keybuk> the device symlinks are entirely needless these days
[20:26] <Keybuk> since if they point at anything other than an ancestor in the tree (.. or ../.. usually) they're bogus
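The readlink approach above can be wrapped into a small helper that prints the full sysfs path for one block device, which makes the host and port part of the path visible on a single line. This is a sketch under assumptions: `sysfs_path` is a hypothetical name, and the optional second argument (an alternative sysfs root) exists only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# Hypothetical helper: resolve /sys/block/<dev> to its real location in the
# device tree, e.g. .../usbN/<port>/.../hostN/targetN:0:0/N:0:0:0/block/<dev>
sysfs_path() {
    dev=$1
    sysfs=${2:-/sys}
    # -f follows the symlink and prints the canonical absolute path
    readlink -f "$sysfs/block/$dev"
}

# Typical use:
#   sysfs_path sdb
#   for d in /sys/block/*; do sysfs_path "${d##*/}"; done
```

As noted in the chat, the PCI id portion of the printed path differs from machine to machine.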
[20:34] <IntutiveNipple-S> I know what we could do with. A patch to ehci_hcd reporting the number of ports it is handling, like uhci_hcd reports
[20:36] <IntutiveNipple-S> lol... been slapping the bluetooth mouse because it stopped working... then realised I'd just rmmod uhci_hcd... and the BT module is on an internal USB1 port :D
[20:38] <IntutiveNipple-S> matters just got worse! when using the touchpad, the mouse cursor gets stuck on X screen 1 if it ventures there, which it now has, but another bug in Xserver means all apps started from screen 1 end up on screen 0 so got no terminal to reload uhci_hcd  ... going home!
[20:39] <IntutiveNipple-S> ssh saves the day :)
[21:13] <ernstp> I keep getting timeout exceptions from the ata subsystem in Jaunty, never happened on Intrepid
[21:13] <ivoks> has anyone reported problems with latest hardy kernel on sparc machines?
[21:13] <ernstp> could it be a kernel bug or will everyone blame my hardware?
[21:14] <ernstp> http://paste.ubuntu.com/132196/
[21:15] <ernstp> happens with different bios settings, different sata ports
[21:15] <ivoks> did you check out smart?
[21:15] <ernstp> only my ext4 root filesystem during big dist-upgrades, but that's probably because it's such a heavy load
[21:19] <ernstp> ivoks: no errors ever
[21:46] <IntuitiveNipple> ernstp: that doesn't look great. Can you post a bug report and attach the system's /var/log/dmesg and /var/log/kern.log containing the error messages?
[21:46] <ernstp> IntuitiveNipple: yeah, I decided to give a bugreport a shot
[21:47] <ernstp> IntuitiveNipple: only got dmesg though, bit tricky with the logs with the read-only filesystem :-(
[21:48] <IntuitiveNipple> hmmm, that bad? how about using netconsole or a serial console to capture the error log?
[21:49] <ernstp> I have other filesystems so I have done dmesg > /file
[21:49] <ernstp> so I've got that
[21:50] <IntuitiveNipple> sometimes dmesg doesn't contain the error reports... but if it does, then we don't need kern.log
[21:52] <ernstp> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/343919
[21:52] <ubot3> Malone bug 343919 in linux "[jaunty] ata timeout exception with data loss" [Undecided,New] 
[21:53] <ernstp> right :-)
[21:58] <IntuitiveNipple> Does it only affect the one drive? The Samsung SP2504C ?
[22:02] <ernstp> IntuitiveNipple: yeah, and only that filesystem
[22:03] <IntuitiveNipple> Do you have smartd installed?
[22:03] <IntuitiveNipple> as in smartmontools
[22:04] <ernstp> I did a smartctl --all on the disk, no errors ever
[22:05] <ernstp> IntuitiveNipple: but I'll install that
[22:06] <ernstp> IntuitiveNipple: oh, I did have it installed
[22:07] <IntuitiveNipple> yeah, I just saw an article about it causing the issue, but that was some time ago now
[22:09] <ernstp> IntuitiveNipple: yikes, it can do that kind of stuff... ?
[22:09] <ernstp> IntuitiveNipple: well I'll uninstall it then, see if it happens again
[22:14] <IntuitiveNipple> can you paste this report to the bug? lspci -nn
[22:16] <ernstp> IntuitiveNipple: write ernstp: so I see when you write ;-)
[22:17] <ernstp> IntuitiveNipple: http://paste.ubuntu.com/132216/
[22:23] <ernstp> IntuitiveNipple: gonna try running with the mainline kernel 2.6.28.7 for a while
[22:23] <ernstp> IntuitiveNipple: see if it still happens
[22:25] <IntuitiveNipple> Don't worry, it will :)
[22:25] <IntuitiveNipple> Stick with the Ubuntu kernel so we can debug it.
[22:28] <ernstp> IntuitiveNipple: kk
[22:29] <ernstp> IntuitiveNipple: this looks similar.. https://bugs.launchpad.net/ubuntu/+bug/315572
[22:29] <ubot3> Malone bug 315572 in ubuntu "alert! /dev/disk/by-uuid/be80cf42-e6f2-466c-bb73-7d664956a334 does not exist" [Undecided,New] 
[22:30] <ernstp> ok, gotta sleep now, cya