=== bluefoxicy [n=bluefox@c-68-33-112-13.hsd1.md.comcast.net] has joined #ubuntu-kernel | ||
bluefoxicy | I want oops insurance | 04:58 |
---|---|---|
bluefoxicy | when my kernel oopses I want it to file a bug and mail the full oops output to every kernel dev so I KNOW they're fixing it :D | 04:59 |
infinity | bluefoxicy: Every time I commit a fix for something, I want it instantly forced on every user's machine, and I want them to be forced to test it immediately. (Also, we can't always get what we want) | 05:30 |
bluefoxicy | infinity: it was a joke | 05:31 |
infinity | So was mine. Sort of. ;) | 05:31 |
bluefoxicy | infinity: although instant testing would be nice ;) | 05:31 |
=== j_ack [n=nico@p508D940A.dip0.t-ipconnect.de] has joined #ubuntu-kernel | ||
=== CataEnry [n=cataenry@host152-2.pool8261.interbusiness.it] has joined #ubuntu-kernel | ||
dilinger | infinity: i'd like every user to accompany their bug reports w/ patches that solve the problem. also, a pony. i want a pony. | 10:05 |
Mithrandir | dilinger: what colour? | 10:09 |
=== CataEnry [n=cataenry@host152-2.pool8261.interbusiness.it] has joined #ubuntu-kernel | ||
dilinger | blue | 10:11 |
=== dilinger works on making x86_64's printk_address print something reasonable looking | ||
dilinger | [ 123.737130] Call Trace: [ffffffff802e0e15] schedule_timeout+0x25/0xd0 | 10:49 |
dilinger | [ 123.737143] [ffffffff8014b803] prepare_to_wait+0x23/0x80 | 10:49 |
dilinger | [ 123.737148] [ffffffff802dc383] unix_stream_recvmsg+0x2c3/0x5d0 | 10:49 |
dilinger | [ 123.737155] [ffffffff8014b550] autoremove_wake_function+0x0/0x30 | 10:49 |
dilinger | hey look, it's an almost readable backtrace, just like i386 and sparc64 | 10:50 |
dilinger | ugh, is that intentional? | 11:19 |
dilinger | dmesg output: | 11:19 |
dilinger | [ 82.375220] [<ffffffff8019adc0>] __pollwait+0x0/0xf0 | 11:19 |
dilinger | syslog (kern.log): | 11:19 |
dilinger | Feb 24 05:15:42 throat kernel: [ 82.375220] [__pollwait+0/240] __pollwait+0x0/0xf0 | 11:19 |
=== jane_ [n=JaneW@dsl-146-140-66.telkomadsl.co.za] has joined #ubuntu-kernel | ||
mjg59 | BenC: Why the change to CONFIG_2GB? | 12:16 |
doko | is it intended for a server setup that the hard disk spin down after activity? and if yes, that period seems to be too short | 01:17 |
=== fabbione [i=fabbione@gordian.fabbione.net] has joined #ubuntu-kernel | ||
BenC | doko: that would be one of the startup scripts doing that, or it's your bios | 01:31 |
BenC | mjg59: mainly to allow laptops with 1GB of ram to still be able to suspend/resume | 01:31 |
doko | BenC: no, did start sometime in the dapper cycle | 01:31 |
BenC | doko: has to be a startup script then...not sure which one though | 01:32 |
mjg59 | BenC: Eh? | 01:32 |
mjg59 | BenC: Suspend/resume should work with highmem... | 01:32 |
BenC | mjg59: I've had reports that it doesn't, also read it somewhere | 01:32 |
mjg59 | BenC: How weird | 01:32 |
mjg59 | BenC: It has the side effect of breaking sbcl | 01:32 |
BenC | yeah, it seems to be breaking more than it fixes | 01:33 |
Mithrandir | when I get "recursive die() failure, output suppressed", that's bad, right? | 01:33 |
fabbione | hey BenC | 01:33 |
BenC | hey fabbione | 01:33 |
mjg59 | BenC: Suspend to RAM or suspend to disk? | 01:33 |
BenC | fabbione: I did silo last night, just forgot to upload new upstream tarball | 01:33 |
Mithrandir | our current live CD seems exceedingly unhappy in vmware. | 01:33 |
BenC | mjg59: can't recall, let me check | 01:33 |
fabbione | BenC: please pull from my tree. changes: more sun4v love, config changes for sparc to move sunhme, sungem and esp from mod to inline to spare people a few errors in d-i | 01:34 |
BenC | fabbione: those d-i problems were supposed to be fixed, I filed bugs :/ | 01:34 |
fabbione | BenC: i am talking about modprobe errors during install | 01:35 |
BenC | fabbione: did you also make scsi/sd built-in as well? | 01:35 |
fabbione | no i didn't | 01:35 |
BenC | what sort of modprobe errors? | 01:35 |
fabbione | only sunhme, sungem and esp | 01:35 |
fabbione | it's problem with the drivers | 01:35 |
fabbione | when the hw is not there, they exit badly | 01:35 |
fabbione | and d-i catch the error and show a big fat red screen to the user | 01:35 |
fabbione | = scary | 01:35 |
fabbione | bu | 01:36 |
fabbione | BUT | 01:36 |
fabbione | if they are compiled in, hw is still recognized | 01:36 |
fabbione | and there is no need to modprobe them | 01:36 |
fabbione | = works | 01:36 |
BenC | gotcha | 01:36 |
fabbione | BenC: i also added the sparc ABI files for 16.22 and 16.23 | 01:37 |
fabbione | and removed the abi checker | 01:38 |
fabbione | sorry i meant the sparc.ignore | 01:38 |
BenC | ok | 01:38 |
fabbione | now it takes me about 20 minutes to build :) | 01:38 |
BenC | how's the DC sparc hw coming? | 01:39 |
BenC | 20 minutes? my machine can do it in 10 :) | 01:40 |
BenC | btw, someone is giving me an e3.5k | 01:41 |
BenC | not sure where I'll put it, or why I need it, but it's free | 01:41 |
infinity | mjg59: swsusp is the one that supposedly doesn't work with > 1GB of RAM (according to reports on user lists and forums, etc), though it's worked in the past on my laptop with 2GB of RAM.. | 01:41 |
fabbione | BenC: 20 minutes including udebs and without ccache? | 01:42 |
fabbione | BenC: 20 is full dpkg-buildpackage :) | 01:42 |
infinity | BenC: Don't suppose they want to spend ridiculous sums to ship it to Australia? | 01:42 |
fabbione | BenC: the machines are installed in a rack.. i think elmo crashed before installing anything on them | 01:42 |
BenC | fabbione: full udebs, but ccache, yes | 01:42 |
fabbione | kill ccache :) | 01:43 |
fabbione | and we can talk ;) | 01:43 |
BenC | infinity: maybe I can sneak it through customs into germany in my duffle bag | 01:43 |
infinity | BenC: You must have gotten a larger duffle bag since last we met... | 01:43 |
BenC | I think It was 30 minutes without ccache | 01:43 |
fabbione | Feb 24 04:40:30 sunrise udevd[2686] : get_netlink_msg: unable to receive kernel netlink message: No buffer space available | 01:43 |
fabbione | HMM | 01:44 |
fabbione | kernel or udev issue? | 01:44 |
BenC | fabbione: AHA! someone else sees it as well | 01:44 |
BenC | that's been fucking up my e3k for weeks now | 01:44 |
infinity | Blame davem. | 01:44 |
BenC | it has to be kernel, I haven't seen it anywhere else | 01:44 |
BenC | someone the netlink foo is returning ENOBUFS but I can't see how | 01:45 |
fabbione | BenC: AHHHHH | 01:45 |
BenC | s/someone/somewhere/ | 01:45 |
fabbione | crap | 01:45 |
fabbione | i will blame davem :) | 01:45 |
infinity | Or, blame davem's girlfriend. She's too distracting. | 01:46 |
BenC | infinity: s/duffle bag/empty cargo hold/ | 01:46 |
fabbione | because it looks like udevplug is triggering that thing exactly when it needs to probe the network driver | 01:46 |
BenC | fabbione: for me it occurs when it needs to load sd_mod, which means I have to manually bring the machine past initrd stage | 01:46 |
fabbione | ok.. | 01:47 |
fabbione | so it's random.. | 01:47 |
fabbione | good to know | 01:47 |
fabbione | david will love to fix that :) | 01:47 |
BenC | not really, for me sd_mod is the first thing it is trying to do | 01:47 |
fabbione | the netdriver is not the first | 01:48 |
fabbione | it's way past that i think | 01:48 |
fabbione | but i collected all the infos together | 01:48 |
BenC | it's probably the first thing that requires a kernel event | 01:48 |
fabbione | could be | 01:50 |
fabbione | BenC: btw.. silo is up | 01:53 |
fabbione | uploaded this morning | 01:53 |
=== Keybuk [n=scott@213-78-32-60.ppp.onetel.net.uk] has joined #ubuntu-kernel | ||
BenC | yeah, saw that | 02:11 |
fabbione | cool | 02:12 |
fabbione | hey Keybuk | 02:12 |
fabbione | i was just waiting for you | 02:12 |
fabbione | Keybuk: http://people.ubuntu.com/~fabbione/sparc/ | 02:12 |
fabbione | this is the error i am getting on sparc | 02:12 |
fabbione | there is lspci, the error in syslog and the strace of both udevd and udevplug | 02:13 |
fabbione | each time i run udevplug i can reproduce the error | 02:13 |
fabbione | what i see at each reboot is network not loaded (e1000) | 02:13 |
fabbione | but BenC for instance has no sd_mod | 02:13 |
=== BenC concurs | ||
fabbione | BenC: i also fixed hw-detect and Mithrandir did kbd-chooser | 02:14 |
fabbione | the latter will make that annoying error disappear when you don't have a keyboard installed | 02:15 |
BenC | so hw-detect should find my sbus devices now? | 02:15 |
fabbione | BenC: only after you will upload the new kernel... | 02:15 |
fabbione | that has esp built in | 02:15 |
BenC | well that's no help, new kernel will make my sbus modules built-in :P | 02:15 |
Keybuk | cute, that error message isn't listed in the recv() manpage | 02:16 |
Keybuk | ENOBUFS | 02:16 |
fabbione | BenC: i will look at hw-detect in debian and see what they have | 02:16 |
BenC | Keybuk: that error is from the netlink core | 02:16 |
Keybuk | BenC: how do we avoid that error? | 02:16 |
fabbione | BenC: otherwise yeah. i guess that's the solution :/ | 02:16 |
BenC | Keybuk: I couldn't figure out why it happened | 02:16 |
BenC | netlink code is so confusing | 02:16 |
Keybuk | could it be that the kernel overflowed the netlink buffer space | 02:17 |
BenC | fabbione: they have a working libdetect that correctly descends secondary sbus busses :) | 02:17 |
BenC | Keybuk: seems that it somehow does | 02:17 |
fabbione | BenC: what package is that? | 02:18 |
BenC | fabbione: detect or libdetect or something like that | 02:19 |
BenC | from what I remember, we don't use the same thing they do to detect devices | 02:19 |
fabbione | BenC: oh you mean discover? | 02:19 |
BenC | yeah, that's it | 02:19 |
fabbione | they did get rid of it | 02:19 |
fabbione | and the fix for the double sbus was merged in breezy under my "heavy" pressure | 02:20 |
Keybuk | BenC: what's BUFFER_SIZE in lib/kobject_uevent.c ? | 02:20 |
Keybuk | (that's the size allowed for a single uevent) | 02:21 |
BenC | Keybuk: no idea, but there is no ENOBUFS in that code, so I think it's elsewhere | 02:21 |
Keybuk | it could be that it's trying to generate a uevent that's too big | 02:22 |
Keybuk | or it could be that it's queuing too many uevents, and udevd isn't getting enough cpu time to slurp them all | 02:22 |
BenC | the BUFFER_SIZE overrun case returns ENOMEM | 02:22 |
fabbione | it's probably the latter | 02:22 |
fabbione | too many events | 02:22 |
BenC | net/netlink/genetlink:ctrl_build_msg() returns ENOBUFS | 02:23 |
BenC | and so does net/netlink/af_netlink:netlink_overrun() | 02:24 |
BenC | it's probably the second one generating the error | 02:24 |
Keybuk | that function's called in a few places | 02:24 |
fabbione | Keybuk: is it possible to build udevplug to wait let say half second between sending each event? | 02:25 |
fabbione | that would exclude the "too many events at once" | 02:25 |
Keybuk | fabbione: you'd be in the ten-minutes-to-boot area with that delay | 02:25 |
fabbione | Keybuk: i don't need to run it at boot :) | 02:25 |
fabbione | i can test it in userland | 02:25 |
fabbione | i get the same error | 02:25 |
fabbione | in both situations | 02:26 |
fabbione | 10 minutes.. no problem ;) | 02:26 |
Keybuk | run "udevplug -s" to test if that's the case | 02:26 |
Keybuk | that waits for the previous event to be processed before sending the next | 02:26 |
fabbione | what does -s do? | 02:26 |
fabbione | ok | 02:26 |
fabbione | sure i can do | 02:26 |
fabbione | in a few minutes.. | 02:26 |
fabbione | i am enjoying this extremely cleaned up d-i | 02:26 |
Keybuk | BenC: could be caused by nlmsg_new not returning a message, though I can't find that function | 02:27 |
BenC | do_one_broadcast() also has a few failure points | 02:27 |
=== Keybuk looks in the header files | ||
Keybuk | static inline struct sk_buff *nlmsg_new(int size) | 02:27 |
Keybuk | { | 02:27 |
Keybuk | return alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); | 02:27 |
Keybuk | } | 02:27 |
BenC | netlink_broadcast_deliver() seems like a likely candidate | 02:28 |
Keybuk | yeah, there's a whole bunch of them there | 02:29 |
BenC | it tries to push one to the queue | 02:29 |
Keybuk | if only we could tag it so we knew which one it was | 02:29 |
BenC | if fabio's test doesn't prove anything I'll start sprinkling some printk's to find out where it is failing | 02:29 |
Keybuk | ah, that pushes into an actual socket rcvbuf ... so if that was full, then it wouldn't fit | 02:29 |
BenC | right | 02:29 |
Keybuk | interesting that this has only affected sparc so far though | 02:30 |
fabbione | hmmm | 02:30 |
BenC | hey, can you increase udev's socket bufsize? | 02:30 |
fabbione | Keybuk: it might be a cpu speed issue? | 02:30 |
Keybuk | I'd've thought an amd64 would show up more | 02:30 |
BenC | Keybuk: sparc64 is slower | 02:30 |
Keybuk | ahh, of course, sparc is slower so udevd gets less time because the kernel is using it all | 02:30 |
BenC | it's odd that it happens on a 6-way system though | 02:30 |
Keybuk | BenC: remind me how to do that | 02:30 |
BenC | my sparc64 has 6 cpu's and 6gigs of ram :/ | 02:31 |
Keybuk | it's a setsockopt isn't it? | 02:31 |
fabbione | "mine" only 32 CPUs and 16GB of ram | 02:31 |
=== fabbione ducks | ||
=== pappan [n=ppadman@bgepxyout-02.asiapac.hp.net] has joined #ubuntu-kernel | ||
Keybuk | fabbione: cat /proc/sys/net/core/rmem_max /proc/sys/net/core/rmem_default | 02:31 |
fabbione | Keybuk: you will have to wait.. i am in the middle of testing parted.. just a few minutes | 02:32 |
BenC | Keybuk: maybe setsockopt(), let me check | 02:32 |
fabbione | rebooting now... | 02:33 |
BenC | fabbione: maybe we have just too much memory/cpu and it's confusing udev :) | 02:33 |
=== fabbione goes for a break while POST | ||
fabbione | BenC: possibly | 02:33 |
=== Keybuk wonders what SO_RCVBUFFORCE is | ||
fabbione | BenC: these udev hackers and their laptops | 02:33 |
BenC | lol | 02:33 |
BenC | Keybuk: IIRC, you can set the buffer to a local one | 02:33 |
BenC | SO_RCVBUF maybe | 02:35 |
Keybuk | it already sets that SO_RCVBUFFORCE thing | 02:37 |
BenC | what does that do? | 02:38 |
Keybuk | dunno | 02:39 |
fabbione | re | 02:40 |
Keybuk | ah | 02:40 |
Keybuk | got it, sets the rcvbuf and forces it over the maximum if necessary | 02:40 |
Keybuk | const int buffersize = 16 * 1024 * 1024; | 02:41 |
=== mgalvin [n=mgalvin@ubuntu/member/mgalvin] has joined #ubuntu-kernel | ||
fabbione | Keybuk: ok.. now i am without network | 02:41 |
Keybuk | setsockopt(uevent_netlink_sock, SOL_SOCKET, SO_RCVBUFFORCE, &buffersize, sizeof(buffersize)); | 02:41 |
Keybuk | so that forces the rcvbuf of that socket to 16MB | 02:41 |
fabbione | root@sunrise:~# cat /proc/sys/net/core/rmem_max /proc/sys/net/core/rmem_default | 02:41 |
fabbione | 131071 | 02:41 |
fabbione | 124928 | 02:41 |
Keybuk | which is roughly 16,000 uevents | 02:41 |
BenC | Keybuk: maybe that's broken on sparc | 02:41 |
fabbione | BenC: or events are bigger? | 02:42 |
Keybuk | events are fixed at 1024 in the kernel and in udev | 02:42 |
Keybuk | fabbione: try "udevplug -s -v | tee events.txt" then count how many lines you get :) | 02:42 |
fabbione | Keybuk: udevplug -s is running | 02:43 |
Keybuk | ah, ok | 02:43 |
fabbione | i can do that later again :) | 02:43 |
fabbione | the error did always appear before.. | 02:43 |
fabbione | so one run more or less won't change my life | 02:44 |
Keybuk | ok | 02:45 |
BenC | Keybuk: are you checking the return value of setsockopt()? | 02:45 |
Keybuk | see whether it appears this run first | 02:45 |
Keybuk | BenC: no... | 02:46 |
BenC | it may be failing | 02:46 |
Keybuk | BenC: looking at the code, it can only fail with -EPERM | 02:46 |
BenC | yeah, but it could also be something as stupid as a sign extension or 32/64 value that is getting junked and causing it to be set at minimum bufsiz | 02:47 |
BenC | but that wouldn't error out | 02:47 |
BenC | amd64 isn't doing this in 32-bit | 02:47 |
fabbione | Keybuk: it looks like that the run did not generate the error, but it still doesn't bring up the network | 02:47 |
fabbione | BenC: amd64 doesn't need memory to be aligned at 64 bit | 02:48 |
fabbione | that can cause issues | 02:48 |
fabbione | like it was with apt in breezy | 02:48 |
BenC | no, I mean amd64 isn't pushing this through a compat layer | 02:48 |
fabbione | yes i understand | 02:48 |
fabbione | are we? | 02:48 |
fabbione | yes | 02:48 |
fabbione | i think.. | 02:48 |
=== fabbione is tired and confused | ||
fabbione | Keybuk: i have the events file.. | 02:49 |
Keybuk | fabbione: just "wc -l" it | 02:50 |
fabbione | wc -l events.txt | 02:50 |
fabbione | 0 events.txt | 02:50 |
=== zul [n=chuck@ubuntu/member/zul] has joined #ubuntu-kernel | ||
zul | heyl | 02:50 |
fabbione | ? | 02:50 |
fabbione | hey zul | 02:50 |
BenC | looks like it just pushes it through as a compat, no translation | 02:50 |
Keybuk | fabbione: you ran with -v ? | 02:51 |
fabbione | udevplug -s -v | tee events.txt | 02:51 |
Keybuk | weird | 02:51 |
fabbione | exactly as you wrote it | 02:51 |
Keybuk | dunno why that didn't give you anything | 02:51 |
Keybuk | does it without the | tee ? | 02:51 |
Keybuk | ie just udevplug -s -v ? | 02:51 |
fabbione | yeah | 02:51 |
BenC | Keybuk: doesn't give me anything on my amd64 box either | 02:51 |
fabbione | udevplug -s -v | wc -l | 02:52 |
Keybuk | fabbione: is /sys mounted? :) | 02:52 |
BenC | ah, are you in a chroot? | 02:52 |
fabbione | Keybuk: you joking right? it's ubuntu running system 100% | 02:52 |
fabbione | no chroot | 02:52 |
Keybuk | find /sys -name uevent | 02:52 |
fabbione | find /sys -name uevent | wc -l | 02:53 |
fabbione | 730 | 02:53 |
Keybuk | BenC: it works just fine on mine, I get a huge number of /sys lines printed | 02:53 |
BenC | I do now that /sys is mounted | 02:53 |
BenC | I need a serial console to my sparc so I can test this stuff too | 02:54 |
Keybuk | fabbione: what about just "udevplug -v" does that print anything? | 02:54 |
BenC | I ran udevplug under linux32 in an i386 chroot on my amd64, which should use the same codepath as sparc, and it was just fine | 02:54 |
fabbione | Keybuk: checking in a sec | 02:55 |
Keybuk | BenC: yeah, I've just done the same | 02:55 |
Keybuk | quest scott# time udevplug -v | wc -l | 02:55 |
Keybuk | 773 | 02:55 |
Keybuk | udevplug -v 0.01s user 0.03s system 2% cpu 1.685 total | 02:55 |
fabbione | hmm | 02:55 |
fabbione | it's taking too long | 02:55 |
=== doko [n=doko@dslb-084-059-085-222.pools.arcor-ip.net] has joined #ubuntu-kernel | ||
Keybuk | fabbione: check you don't have an empty /dev/.udev/queue directory (just sudo rmdir it) | 02:56 |
fabbione | Keybuk: yeah that's where i was sticking my nose :) | 02:56 |
Keybuk | that's a common failure mode of previous udevd | 02:56 |
fabbione | time udevplug -v | wc -l | 02:56 |
fabbione | 735 | 02:56 |
fabbione | real 0m1.093s | 02:56 |
fabbione | user 0m0.048s | 02:56 |
fabbione | sys 0m0.164s | 02:56 |
Keybuk | could also explain why -s doesn't work (it's also waiting for that to go away first, and just times out after three minutes) | 02:56 |
Keybuk | ok | 02:56 |
fabbione | so now | 02:56 |
fabbione | let's try again the -s | 02:57 |
Keybuk | well, you don't have any more events than my amd64, slightly less in fact | 02:57 |
fabbione | i got the error in syslog | 02:57 |
Keybuk | those should take only 735,000 bytes of memory to hold in the kernel | 02:57 |
Keybuk | BenC: do you know of a way to find out the size of a socket from userspace? | 02:57 |
Keybuk | lsof? | 02:57 |
fabbione | hmmm this is interesting | 02:58 |
fabbione | Keybuk: if i run udevplug -v -s | 03:00 |
fabbione | it gets to /sys/class/vc/vcsa | 03:00 |
fabbione | /sys/devices/pci0000:02 | 03:00 |
fabbione | and it stalls there | 03:00 |
Keybuk | probably aborts with SIGALRM :) | 03:00 |
fabbione | no no | 03:01 |
fabbione | udevplug is still running | 03:01 |
fabbione | i had to ctrl+c | 03:01 |
Keybuk | odd | 03:01 |
Keybuk | what's in /dev/.udev/queue? | 03:01 |
fabbione | no queue | 03:01 |
Keybuk | hmm | 03:01 |
Keybuk | strace it | 03:01 |
=== fabbione tries again | ||
fabbione | it's polling queue | 03:03 |
fabbione | probably udev is still processing | 03:03 |
Keybuk | does queue still exist? | 03:03 |
fabbione | yes | 03:03 |
fabbione | but it's empty | 03:04 |
Keybuk | ah | 03:04 |
Keybuk | did you rmdir it first? | 03:04 |
fabbione | yes | 03:04 |
=== fabbione will do again to be sure | ||
fabbione | like i said | 03:06 |
fabbione | udevplug did generate the queue | 03:06 |
fabbione | and waiting for it to disappear | 03:06 |
fabbione | but it's empty | 03:06 |
BenC | Keybuk: getsockopt() | 03:06 |
fabbione | Keybuk: udevd is doing nothing | 03:06 |
Keybuk | that's weird | 03:07 |
Keybuk | that suggests the kernel never gave the event to udevd | 03:07 |
fabbione | well it did | 03:07 |
Keybuk | can you run "udevmonitor -e" as well? | 03:07 |
fabbione | otherwise i would have no / ;) | 03:07 |
Keybuk | if it had given the event to udevd, udevd would have done something and removed the queue directory | 03:07 |
fabbione | ok one sec.. | 03:08 |
fabbione | i am setting up a slightly better env | 03:09 |
fabbione | like 20 xterm | 03:09 |
Keybuk | udevmonitor should give you a UEVENT and UDEV for each things udevplug prints (with -s) | 03:09 |
fabbione | root@sunrise:/dev/.udev# ls -asl | 03:10 |
fabbione | total 0 | 03:10 |
fabbione | 0 drwxr-xr-x 4 root root 80 Feb 24 06:09 . | 03:10 |
fabbione | 0 drwxr-xr-x 14 root root 13920 Feb 24 06:05 .. | 03:10 |
fabbione | 0 drwxr-xr-x 2 root root 520 Feb 24 2006 db | 03:10 |
fabbione | 0 drwxr-xr-x 2 root root 40 Feb 24 06:09 failed | 03:10 |
fabbione | root@sunrise:~# udevmonitor -e | 03:10 |
fabbione | udevmonitor prints the received event from the kernel [UEVENT] | 03:10 |
fabbione | and the event which udev sends out after rule processing [UDEV] | 03:10 |
fabbione | ok? | 03:10 |
fabbione | do we agree that it is ok? | 03:10 |
Keybuk | ok | 03:10 |
fabbione | i can see the events | 03:10 |
Keybuk | that's a good starting point | 03:11 |
=== pappan [n=ppadman@bgepxyout-02.asiapac.hp.net] has joined #ubuntu-kernel | ||
Keybuk | "udevplug -s -v" on that ... for each thing it prints you should see a UEVENT and then a UDEV for it | 03:11 |
Keybuk | if you get no UEVENT, then that's bad | 03:11 |
Keybuk | if you get a UEVENT and no UDEV, that's even worse | 03:11 |
fabbione | UEVENT[1140790262.767691] add@/class/vc/vcsa | 03:11 |
fabbione | UDEV [1140790262.770291] add@/class/vc/vcsa | 03:11 |
fabbione | ok? | 03:11 |
Keybuk | ok | 03:11 |
Keybuk | and udevplug printed /sys/class/vc/vcsa as well? | 03:11 |
fabbione | it didn't get that far??? | 03:12 |
fabbione | /sys/class/tty/tty4 | 03:12 |
fabbione | it stopped a few letters before | 03:12 |
fabbione | make that 40 | 03:12 |
fabbione | it did skip 4 | 03:12 |
fabbione | meh 0 | 03:12 |
fabbione | there is a queue | 03:13 |
fabbione | and it is empty | 03:13 |
Keybuk | ok, what was the last thing udevplug printed? | 03:13 |
fabbione | yes | 03:13 |
Keybuk | what was? | 03:13 |
fabbione | /sys/class/tty/tty4 | 03:13 |
Keybuk | ok | 03:13 |
fabbione | now | 03:13 |
Keybuk | what was the last UEVENT/UDEV combo printed? | 03:13 |
fabbione | ok one second dude | 03:14 |
fabbione | when udevplug was at /sys/class/tty/tty4 | 03:14 |
=== mxpxpod [n=BryanFor@unaffiliated/mxpxpod] has joined #ubuntu-kernel | ||
fabbione | udevmonitor was printing the vcsa | 03:14 |
Keybuk | that's a bit weird | 03:14 |
fabbione | there was a queue and it was empty | 03:14 |
fabbione | now one more thing | 03:14 |
fabbione | i did remove the queue | 03:14 |
fabbione | and saw it recreated | 03:14 |
fabbione | more events did pass | 03:14 |
fabbione | and udevplug did finish | 03:15 |
Keybuk | (btw, worth noting that "udevplug -s" is not very well tested) | 03:15 |
Keybuk | it could be just a general bug with it | 03:15 |
fabbione | so ok.. what do you want me to test next? | 03:15 |
Keybuk | hmm | 03:15 |
Keybuk | so udevplug completed normally | 03:15 |
Keybuk | and you didn't get that error | 03:15 |
fabbione | nope | 03:15 |
Keybuk | nope it didn't complete normally? | 03:16 |
fabbione | but now i don't know if it would have loaded the module | 03:16 |
fabbione | nope = no error | 03:16 |
fabbione | i think the problem is here: | 03:16 |
Keybuk | right, so this suggests that the netlink buffer doesn't overflow if you go slowly | 03:16 |
fabbione | tty4 was way before than /sys/class/vc/vcsa | 03:16 |
Keybuk | how do you know? :) | 03:17 |
fabbione | (in the print from udevplug) | 03:17 |
fabbione | the print order? | 03:17 |
fabbione | now | 03:17 |
fabbione | listen up | 03:17 |
Keybuk | did udevplug never do vcsa before then? | 03:17 |
fabbione | no it didn't | 03:17 |
fabbione | it did it after | 03:17 |
fabbione | if you let me :) | 03:17 |
fabbione | udevplug was printing tty4 - udevmonitor was at vcsa | 03:18 |
fabbione | the line after vcsa in udevplug is /sys/devices/pci0000:02 | 03:18 |
fabbione | the same where it was hanging a long time before | 03:18 |
fabbione | now... | 03:18 |
Keybuk | yeah, I get the same behaviour here (though udevplug actually prints that ... I suggest your stdout buffers weren't flushed <g>) | 03:19 |
fabbione | could it be a bug in /sys parsing of that device? | 03:19 |
Keybuk | this is just a "-s" bug | 03:19 |
fabbione | next test... | 03:20 |
fabbione | queue was never deleted tho | 03:20 |
fabbione | if i run only udevplug | 03:20 |
fabbione | i can see all the events and the error | 03:20 |
fabbione | that happens very early | 03:20 |
pappan | is there kernel debugging tool in ubuntu | 03:20 |
fabbione | almost at the beginning | 03:20 |
Keybuk | right, because udevd had finished processing the event, and was waiting for the next ... where udevplug had made the queue directory and then ended up waiting on it | 03:21 |
pappan | i am facing a problem with reboot in my laptop | 03:21 |
fabbione | but if i run it normally, the queue dir disappear | 03:21 |
Keybuk | I'll have to debug that, but it's reasonably safe race :) | 03:22 |
Keybuk | so ignore that for now | 03:22 |
fabbione | Keybuk: so do you think the bug is from udev itself? | 03:22 |
Keybuk | fabbione: no, I think this is a kernel bug | 03:22 |
fabbione | i am not really worried about the message itself | 03:23 |
Keybuk | sending the events slowly seems to not produce ENOBUFS | 03:23 |
Keybuk | sending them at normal speed produces it | 03:23 |
Keybuk | so, BenC, can we get some printk()s to find out which -ENOBUFS that is? | 03:23 |
fabbione | it's only annoying that it doesn't bring up the ethernet | 03:23 |
fabbione | anyway i need to take off for a while now | 03:23 |
fabbione | Keybuk: thanks a lot dude | 03:23 |
BenC | Keybuk: yeah, let me get my sparc back up | 03:23 |
fabbione | later guys | 03:24 |
zul | toodles | 03:25 |
Keybuk | damn, that -s bug is entirely consistent at the first pci device | 03:29 |
Keybuk | tickle: uevent: '/sys/devices/pci0000:00/uevent' | 03:30 |
Keybuk | make_queue: directory: '/dev/.udev/queue' | 03:30 |
Keybuk | create_path: stat '/dev/.udev' | 03:30 |
Keybuk | wait_for_queue: directory: '/dev/.udev/queue' | 03:30 |
Keybuk | oh | 03:30 |
Keybuk | because tickling /sys/devices/pci0000:00/uevent DOES NOTHING | 03:30 |
Keybuk | BenC: kernel bug! kernel bug! kernel bug! :) | 03:31 |
BenC | Keybuk: stop picking on me! :) | 03:31 |
Keybuk | (this is irrelevant to the ENOBUFS error, it's just amusing to find more errors along the way to debugging that one) | 03:32 |
BenC | I think ENOBUFS is a red herring | 03:32 |
Keybuk | you do? | 03:33 |
BenC | more than likely our problem is more related to uevents not getting where they should | 03:33 |
Keybuk | yeah, but if the socket buffer is full, they won't get there | 03:33 |
Keybuk | the fact that pci0000:00 doesn't generate a uevent is only important when using "-s" where it waits patiently for the event ... during normal booting it's irrelevant | 03:34 |
BenC | I can't see generic sockets getting full...lots of things would be broken | 03:34 |
Keybuk | aye | 03:34 |
Keybuk | udevd used to do it quite regularly until they increased the size to 16MB | 03:34 |
BenC | were the symptoms the same? | 03:34 |
Keybuk | yeah, I think so | 03:35 |
BenC | I just can't see 16k uevents occurring, even on a sparc :) | 03:35 |
Keybuk | we know there's only 730 events | 03:35 |
BenC | so filling it would take a lot of effort | 03:35 |
Keybuk | which is less than my amd64 | 03:35 |
Keybuk | I'm wondering whether it's actually that there's an event bigger than 1K | 03:35 |
Keybuk | an event is just an env buffer, after all | 03:35 |
BenC | true...guess I can put some debug to see what the event size is | 03:36 |
BenC | or just check for > 1k | 03:36 |
Keybuk | I wonder ... | 03:39 |
Keybuk | the fact udev wants the socket buffer to be 16MB is just a hint about how big it should never grow past | 03:39 |
Keybuk | it doesn't mean it can actually grow that big, the kernel might not have any free memory to grab | 03:39 |
Keybuk | so it may actually be effectively smaller than the 730K needed to do the job | 03:39 |
=== __keybuk [n=scott@82.108.80.242] has joined #ubuntu-kernel | ||
=== j_ack [n=nico@p508D940A.dip0.t-ipconnect.de] has joined #ubuntu-kernel | ||
=== Keybuk [n=scott@82.108.80.245] has joined #ubuntu-kernel | ||
=== CataEnry [n=cataenry@host152-66.pool876.interbusiness.it] has joined #ubuntu-kernel | ||
=== BenC [n=bcollins@debian/developer/bcollins] has joined #ubuntu-kernel | ||
=== shaya [n=spotter@user-0ccembr.cable.mindspring.com] has joined #ubuntu-kernel | ||
shaya | anyone home? | 07:35 |
=== cmvo [n=cmvo@62.225.11.174] has joined #ubuntu-kernel | ||
shaya | just filed a bug, can help try to debug it if that would help? | 07:36 |
=== cmvo [n=cmvo@62.225.11.174] has left #ubuntu-kernel ["Konversation] | ||
=== cjb [n=cjb@islay.ra.phy.cam.ac.uk] has joined #ubuntu-kernel | ||
cjb | Sorry, stupid question: dmesg/lspci dumps for lkml, should they go inline or attached? | 11:07 |
cjb | (The only examples I can find were all attached, so I wonder if there's some differing standard for dmesg as opposed to patches.) | 11:07 |
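
On the 02:41 to 03:06 thread about the udevd receive buffer: the log shows udevd calling `setsockopt(..., SO_RCVBUFFORCE, ...)` without checking the result, and BenC later names `getsockopt()` as the way to read the size back. A minimal sketch of doing both, assuming a Linux system whose headers define `SO_RCVBUFFORCE`. The helper names `request_rcvbuf` and `query_rcvbuf` are hypothetical and are not taken from udev.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return the receive-buffer size the kernel reports for fd, or -1 on error.
 * On Linux the reported value is typically larger than the one requested,
 * because the kernel adds room for its own bookkeeping. */
static int query_rcvbuf(int fd)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) < 0)
        return -1;
    return size;
}

/* Ask for a receive buffer of `bytes`, reporting failures instead of
 * ignoring them. Returns the size the kernel reports afterwards, or -1. */
static int request_rcvbuf(int fd, int bytes)
{
#ifdef SO_RCVBUFFORCE
    /* Needs CAP_NET_ADMIN; may exceed /proc/sys/net/core/rmem_max. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE, &bytes, sizeof(bytes)) == 0)
        return query_rcvbuf(fd);
    fprintf(stderr, "SO_RCVBUFFORCE: %s\n", strerror(errno));
#endif
    /* Unprivileged fallback: the kernel silently caps this at rmem_max. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0) {
        fprintf(stderr, "SO_RCVBUF: %s\n", strerror(errno));
        return -1;
    }
    return query_rcvbuf(fd);
}
```

Calling `request_rcvbuf(fd, 16 * 1024 * 1024)` on the uevent socket and logging the returned value would show whether the 16MB request actually took effect on a given machine, which is the question raised at 02:45.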