=== medberry is now known as med_out
=== Quintasan_ is now known as Quintasan
[07:38] Morning +*
[07:39] morning pqr s/p/s s/q/m /s/r/b
[07:39] hmm, that will actually replace the r in morning as well
[07:40] smb, anyway, I'm tracing down a 3.0 regression for a few HDAs and I believe I found the upstream commit that resolves the problem, but that patch is quite large.
[07:41] smb, would you prefer a large upstream patch or something smaller that I crafted myself?
[07:42] diwic, From the point of view of a stable maintainer probably rather something crafted with the note why that was done.
[07:43] It is just better to decide what something does that way. It is a bit of a gap between rather preferring upstream vs. I can understand it...
[07:44] smb, in this case I tend to agree
[07:44] smb, you can search for commit 3af9ee6b83 for reference
[07:45] should be in 3.1
[07:47] diwic, Hm, yeah. Most of it looks not too bad actually. But the whole block in alc662_auto_fill_dac_nids looks a bit scary at least at first glance
[07:48] smb, yeah, it's mainly the stuff in alc_auto_look_for_dac I need here
[07:51] smb, hmm, and actually that stuff is yet again rewritten in a later commit
[07:51] smb, better craft something then probably
[07:54] diwic, It is always a tough decision. There have been bigger changes accepted in the past. Just then be prepared to get asked how well this was tested. It may make sense if you can say, there was just no other way to get this done and I did test it at least on the variety of hw I got that is affected.
[07:55] Unfortunately when larger changes were done, they usually _had_ some hidden pitfalls in certain cases in them as well. Which tends to lower the acceptance of big changes again.
[07:58] smb, yeah, and tbh I'm a little afraid that would happen if we applied 3af9ee as is
[08:00] diwic, If even you are afraid, that is a bad sign. :)
[08:22] gah!
so I bisected my video problem, found the patch that broke it, but reverting it in newer kernels doesn't restore my display :(
[08:22] just my luck
[08:23] Sounds like one of those days
[08:26] yeah
[09:05] jjohansen, Still around?
[09:05] yep
[09:06] smb: ?
[09:06] jjohansen, I wonder whether you could have a peek at that sru Tim sent around (about igb). I have that is-the-stove-really-out nagging feeling about spinlock without any disabling
[09:07] If someone else would review and say it looks okish I just would ack it as well
[09:07] smb: sure
[09:14] urg, this is when I wish for my big monitor so I can run two instances of meld side by side
[09:18] smb: you're right about the is-the-stove-really-out nagging feeling, I have it too. I have the original up and this and am doing a full on proper comparison
[09:19] and eval
[09:20] jjohansen, Ah ok. Yeah, somehow it feels to me that at least a spinlock_bh variant may be required
[09:20] hrmmm, that is a frightening thought :)
[09:21] But it's just that uneasy feeling I usually have about not turning off anything. Somehow I converted to use spinlock alone only if I had nesting... But that is maybe just overcautious
[09:21] * jjohansen wishes the original patch hadn't crammed 4 or 5 different things into it
[09:22] of course but being overcautious is safe
[09:24] Exactly. Rather be safe than sorry and in the bug report I think was a version with irqsave and restore which has been dropped because the original patch did not use it. Though the original patch completely changed things...
[09:24] well I am not so worried about what is done in the original patch as patches around it
[09:25] the original didn't change things enough in the locking sense for that worry
[09:29] In that case I did not look that much on what was done upstream (taking the statement about it being complicated and not all needed to get the gist of what is needed).
Mostly what is done looks ok (make the update happen all the time and lock on the callers). And also most callers seem to be things that are triggered from userspace only. Just the watchdog function I am not sure. And whether it may interrupt a caller from user-space in an unfortunate way
[09:31] Still I did not want to get that patch forgotten just because I am a code worrier... :-P
[09:39] smb: heh no worries, we will get it taken care of
[10:17] smb: so I am less than thrilled with this patch
[10:18] jjohansen, Oh, ok. So what are your worries there?
[10:19] (actually if you could just reply in the thread, then you'd have to type it only once..)
[10:19] smb: in the context of locking _bh, or irq I think it is okay, but I am not entirely convinced the locking is doing much good
[10:20] smb: basically, it does prevent simultaneous writers, but it does nothing for readers
[10:20] jjohansen, Ah, ok... Hm, missed that aspect then
[10:22] smb: look at igb_get_stats in igb_main.c
[10:22] jjohansen, Probably did not look into all accesses. At least the hunks seemed to change read (in those cases) into a write (update)
[10:22] in the original patch a copy is made after the update and the copy is what is returned to the readers
[10:22] so they have a private copy
[10:22] a snapshot
[10:23] here we are returning the live structure
[10:23] now the question is do we care?
[10:24] the locking seems to have been mostly added for 64bit accesses on 32bit machines
[10:24] Hm, doing the snapshot ensures that the stats are coherent. I guess updates just happening more often would not be completely what is desired...
[10:24] so right now I am trying to figure out if we care
[10:25] if it's only for consistency across 64 bit stats then we may not care
[10:25] if it's for snapshot consistency we do
[10:25] Well otoh locking sounds reasonable just because now there may be two readers causing parallel updates, won't they?
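The difference discussed above, between handing readers a private snapshot and returning the live structure, can be sketched in user space. This is only a Python analogy and not the igb driver code: `NetStats`, `Adapter`, and the method names are hypothetical, and a `threading.Lock` stands in for the kernel spinlock.

```python
import copy
import threading
from dataclasses import dataclass


@dataclass
class NetStats:
    """Hypothetical stand-in for a driver's statistics block."""
    rx_packets: int = 0
    rx_bytes: int = 0


class Adapter:
    """Hypothetical device object holding one shared stats block."""

    def __init__(self):
        self._lock = threading.Lock()
        self._stats = NetStats()

    def update_stats(self, packets, nbytes):
        # Writers serialize on the lock so two updates cannot interleave.
        with self._lock:
            self._stats.rx_packets += packets
            self._stats.rx_bytes += nbytes

    def get_stats_live(self):
        # Returns the shared object: later updates show through to the caller.
        return self._stats

    def get_stats_snapshot(self):
        # Copies under the lock, so all fields come from the same moment.
        with self._lock:
            return copy.copy(self._stats)


adapter = Adapter()
adapter.update_stats(1, 100)
live = adapter.get_stats_live()
snap = adapter.get_stats_snapshot()
adapter.update_stats(1, 50)

print(live.rx_packets, live.rx_bytes)  # 2 150
print(snap.rx_packets, snap.rx_bytes)  # 1 100
```

The lock on the writer side protects the updates in both variants. Only the snapshot variant additionally guarantees that the fields a reader sees belong together.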
[10:26] possibly, the writers certainly need locking of some form
[10:26] but I am not familiar enough with the code to say if multiple writers can occur at that point
[10:26] but the locking won't hurt
[10:27] iirc it was changing the code so reading the stats for example with ethtool would update them at that time
[10:27] And I guess at least in theory that could happen in parallel
[10:27] it really comes down to the readers
[10:28] right
[10:28] I think the writers can be parallel, I am just not positive
[10:28] so, I am good with the locking on the writer side
[10:29] it's the readers I am trying to figure
[10:29] * jjohansen really wishes the original patch had been split up
[10:29] Ok, let me have a look at this again in meld...
[10:32] wtf
[10:32] I am not sure why I care... There was a code warrior putting that into master-next already it seems...
[10:33] o_O
[10:36] jjohansen, Just questioning my sanity: do you see it too?
[10:37] hrmm, let me pull and see
[10:39] ah gah my linux-next tree hasn't been updated to git hub
[10:41] oops
[10:50] * smb -> lunch
[11:02] smb: so I don't believe we care. We aren't updating 64 bit values, and I don't see a reason for rx and tx stats to be a snapshot, that is if rx gets incremented while reading tx it doesn't really matter
[11:04] jjohansen, Ah ok, so basically the net stats are not really verifiable in the first place. If it's only rx / tx then surely they are ok as long as the value itself is consistent
[11:04] Would be different when there was something like irqs as well which would or would not be consistent with rx or tx
[11:05] well it's a little more than rx or tx, look in include/linux/netdevice.h for net_device_stats
[11:06] eg you could get rx_bytes and rx_packets to be from different snapshots
[11:07] but the lack of consistency there is a very minor problem
[11:07] Yeah, and probably was there before...
[11:07] yep
[12:22] ogasawara, hello! I'm just pinging you to make bug 849564 more visible.
:) Very reproducible and pretty annoying regression. I'd be glad to help test kernels
[12:22] Launchpad bug 849564 in linux "System76 Lemur: Does not suspend on second attempt" [Undecided,Confirmed] https://launchpad.net/bugs/849564
[12:24] jjohansen, tyler hicks is the ecryptfs man right ?
[12:25] yep
[12:25] I keep meaning to sit down with him but haven't yet
[12:27] jjohansen, heard any reports about encrypted swap corrupting stuff ?
[12:27] * smb <- back
[12:27] https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/745836
[12:27] Ubuntu bug 745836 in ecryptfs-utils "ecryptfs encrypted swap corrupts application stack/heap [was: soffice.bin SIGSEGV cppu::throwException()]" [Critical,Confirmed]
[12:27] smb, ^^ you may be interested in the bottom of this
[12:27] i assume swap is actually dmcrypt
[12:27] wow
[12:27] apw: no
[12:28] apw: so swap is not encrypted using ecryptfs
[12:28] apw: it's just dmcrypt
[12:28] apw, Thought you were the bottom man... :-P
[12:28] * smb looking
[12:28] right ... i assume the users are confused
[12:28] apw: ecryptfs-setup-swap sets it up
[12:29] apw: you assume correctly, I think :-)
[12:30] smb, would you have time to poke this nest of worms ?
[12:31] apw, Not sure I really want to...
[12:31] jjohansen, if you do talk to tyler, tell him that people are reporting that the 'crap at the end of ecryptfs files' bug is not fixed even where the fixes are applied
[12:31] smb, heh now that i can understand :/
[12:31] smb, i could swap you for working out why we have power problems if you like
[12:32] apw: yeah, that is weird
[12:32] apw, I probably take the other land-mine then I guess
[12:33] apw, At least things exploding is more realistic
[12:37] smb, it sounds like they have a reproducer in a vm if you scroll back a few comments
[12:38] Yeah. Will try the recipe and see what I get
[13:04] apw, smb: Hi there.
[13:05] Sweetshark, Hi, just reading through that bug
[13:05] * Sweetshark is Bjoern Michaelsen.
And yes, it seemed to be quite reproducible in the end.
[13:06] * smb Realized
[13:06] Sweetshark, How small was the small RAM?
[13:07] 512M?
[13:07] * apw is surprised LO would start at all in that
[13:08] smb: 512MB was enough here.
[13:08] Sweetshark, was it reproducible on oneiric ?
[13:09] and then open Firefox, imagesearch for large images and open 20 in tabs to make it swap. (Unless you kernelguys have something less pragmatic).
[13:09] apw: Haven't tried yet. I just got it reproducible today.
[13:09] * jjohansen -> waves good night
[13:09] jjohansen, night
[13:09] Sweetshark, heh i think i'd probably just boot the VM with less ram :)
[13:10] jjohansen, no bed bugs ....
[13:10] Also got some program done by someone to eat memory... Would try that too
=== kentb-out is now known as kentb
=== med_out is now known as medberry
[13:34] as for this affecting oneiric: it is highly likely as there are stacktraces for it: bug 854197 for example
[13:34] Launchpad bug 854197 in libreoffice "soffice.bin crashed with SIGSEGV in QPaintEngine::drawImage() (dup-of: 745836)" [Undecided,Incomplete] https://launchpad.net/bugs/854197
[13:34] Launchpad bug 745836 in ecryptfs-utils "encrypted swap corrupts application stack/heap [was: soffice.bin SIGSEGV cppu::throwException()]" [Critical,Confirmed] https://launchpad.net/bugs/745836
[13:35] or bug 801592 (which is closer to the original "cppu::throwException" stacktrace.)
[13:36] arggh, pasted the wrong bugs.
[13:38] * Sweetshark better tries himself.
[13:41] Sweetshark, I think at least the initial report of that was 32bit. Since you already spent quite a time with those, were there 64bit cases as well?
[13:43] smb: yes.
[13:43] smb: Actually I personally reproduced on natty 64-Bit.
[13:44] Ah good. So that does not matter
[14:16] * ogasawara back in 20
[14:19] apw, kirkland, Sweetshark: I looked into bug 745836 last night and have a gut feeling that it is an inode race condition that I fixed recently.
I haven't had a chance to update the bug with my findings yet.
[14:19] Launchpad bug 745836 in ecryptfs-utils "encrypted swap corrupts application stack/heap [was: soffice.bin SIGSEGV cppu::throwException()]" [Critical,Confirmed] https://launchpad.net/bugs/745836
[14:20] I started to backport the patches last night but ran out of steam
[14:20] I think I've just got one left to do
[14:21] tyhicks, Just wondering, if it is encrypted home/swap it is dm-crypt not ecryptfs...
[14:21] tyhicks, though they claim near the bottom that removing ecryptfs from the equation and the problem is still there
[14:22] * tyhicks missed that
[14:22] tyhicks, Sweetshark it was you who claimed to have tested with just dmcrypt swap and normal home right ?
[14:23] I can reproduce this easily in my vm... let me give it a shot on an unencrypted home
[14:23] tyhicks, thanks
[14:24] tyhicks, or just replace the swap by an unencrypted one
[14:35] apw smb: I couldn't reproduce it with encrypted swap and unencrypted home
[14:35] smb: you wanted me to test unencrypted home with unencrypted swap?
[14:37] tyhicks, Interesting, in the bug report it is claimed with encrypted home and without encrypted swap it cannot be reproduced either...
[14:37] tyhicks, I am still installing a vm, so if you want to go for a try, sure
[14:37] smb: I'm thinking that this is the eCryptfs inode race condition that I fixed upstream recently
[14:38] smb: encrypted swap could add just enough latency into everything that the race condition exposes itself more often
[14:38] Ah, that may well be
[14:39] tyhicks, So since you were already quite close to have the patches ready (and probably a test kernel too), I think the best approach would be to give those a try first
[14:44] smb: Ah, I just noticed that the fixes I had started backporting are already in oneiric
[14:46] smb: Sweetshark says that it is highly likely that oneiric is affected, too, but I'm not seeing any reports against oneiric
[14:46] tyhicks, Sweetshark thought there were bug reports for oneiric too. But then said something about wrong ones and I am not sure with what kernel version those would have been opened anyway
[14:47] smb: I just got a vanilla oneiric beta2 vm ready. Trying to reproduce.
[14:48] Sweetshark: great - that would give me a good idea about whether or not this patchset I'm backporting is going to fix it
[14:48] Sweetshark, tyhicks Can I let the two of you check into this. It feels like tyhicks has the best lead for the fix anyways
[14:48] smb: works for me
[14:49] smb: if I determine that eCryptfs is not at fault, I'll let you know
[14:49] Thanks a bunch. Yes, sounds good
[14:49] * apw idly notes that 'thanks a bunch' is normally negative
[14:50] oops
[14:50] * smb meant it the positive way
[14:50] thanks a lot! is normally possiblem t
[14:50] possible, thanks a lot. is often not :)
[14:50] isn't english great
[14:51] * smb probably has to live with every second word he says being an insult, about drinking or sex without knowing
[14:55] tyhicks: NOT reproducible in oneiric beta2 with encrypted home and swap (or at least not easily reproducible).
[14:56] Sweetshark: woohoo...
I think I might be on the right rack
[14:56] s/rack/track/
[14:56] Sweetshark: thank you!
[14:58] of course tests can't prove the absence of bugs ;)
[15:04] Oh, I should probably get some quick confirmation on the natty kernel source that I'm using (I've never worked directly on an ubuntu kernel before now)
[15:05] Is the master branch of git://kernel.ubuntu.com/ubuntu/ubuntu-natty.git what I should be backporting against?
[15:07] tyhicks, either master or master-next
[15:08] master-next is the branch we'll actually apply the SRU patch against
[15:12] tyhicks: what are the chances that Libreoffice itself is at fault here? I assumed pretty slim and closed as invalid. Since you have an idea what the root cause is, do you agree that this is valid to assume?
[15:17] Sweetshark: Tough to say just yet
[15:17] tyhicks: k
[15:18] well, if you find anything fishy from LO on this please reopen on libreoffice.
[15:18] Sweetshark: at first glance, my reaction was "Oh, eCryptfs is doing something non-POSIX-like and libreoffice should maybe handle that better" but then as I looked into it more, I got the feeling that eCryptfs is at fault
[15:19] tyhicks: k thx
[15:20] ogasawara, the binary package is: linux-image-extra-3.0.0-9-virtual
[15:20] as an example from my test builds
[15:20] apw: cool, was just about to check
[15:20] apw: so we're fine there
[15:20] the theory behind the packaging changes is that you could pick any subset as a separate package
[15:21] so you could pull out staging for example
[15:21] though we don't do it
[15:21] bug #818177
[15:21] Launchpad bug 818177 in udev "HP DL380G5 root disk mounted read-only on boot and boot fails" [High,Confirmed] https://launchpad.net/bugs/818177
[15:32] * smb drops off
[15:49] mjg59, do you know of any way to get a trace of the AML method calls being made by a driver under Windows?
[15:49] sforshee: Nope
[15:49] rats
[16:44] sforshee, ask alexhung
[16:46] mjg59, nice response to microsoft's fluff
[16:47] cking: Thanks!
[16:47] cking, already have, but I'm exploring a couple of alternate avenues
[16:47] I'm glad it's getting a whole load of attention
[16:48] sforshee, pity you can't slap in systemtap eh?
[16:48] cking, I found some kind of sample for tracing ioctls (which is how AML methods appear to be invoked on windows)
[16:48] heh
[16:49] cking, windows also seems to have support for something called filter drivers, which I think can be placed transparently between a driver and the hardware
[16:49] sforshee, that's interesting to know, so is the windbg approach a dead end?
[16:50] cking, it looked like it required some special hardware, so I'm looking at software-only solutions first
[16:51] sforshee, you could hack out the DSDT, put in some extra AML to do port 0x80 writes and then boot the machine with BITS to pre-load the tables into memory - and then chain boot Windows.
[16:51] you need a port 0x80 debug card ;-) and patience
[16:52] oh, SIGCHILD, gotta attend to my kids..
[16:52] cking, that's an interesting idea, but these are netbooks so installing a port 0x80 debug card is going to be problematic :)
[16:52] sforshee, yep, I've always found they don't work well
[16:53] the mini PCI-e ones have never worked for me
[16:53] sforshee: Alternative is to figure out where ACPI debug statements go on Windows and fill the AML with a pile of them
[16:54] mjg59, thanks, I'll look into that as well
[17:29] * sforshee -> dr appt
[17:51] * tgardner -> lunch
[17:54] * apw calls it beer time, laters
[18:12] Hmm. Is there a reason why xenbus_probe_frontend is modular in several of our configs?
[18:12] It means that other Xen drivers (e.g. xen-netfront) don't get autoloaded
[18:19] sforshee, be sure to write up your findings once you've figured out a workable methodology ;-)
[18:20] cjwatson, I think the backends are built-in for DomU clients. Aren't the front-end drivers for Dom0 ? or do I have it backwards.
[18:28] * ogasawara lunch
=== _ruben_ is now known as _ruben
[18:41] tgardner: I think you have it backwards
[18:42] tgardner: I'm on a domU client at the moment - loading xenbus_probe_frontend is enough for it to find its network interface
[18:43] and without that, the installer fails due to having no network interfaces; I could work around this in the installer, but it seems more correct to build this in unless there's a reason not to
[19:05] tgardner: I've filed bug 857662 with a more complete rationale
[19:05] Launchpad bug 857662 in linux "Should xenbus_probe_frontend be built-in?" [Undecided,New] https://launchpad.net/bugs/857662
[19:05] cjwatson, ack
[19:06] cjwatson, are you using the server or virtual ?
[19:06] explained in the bug
[19:09] ... and corrected the description to match reality
[19:11] cjwatson, so you think CONFIG_XEN_XENBUS_FRONTEND=y will solve all your problems ? -virtual works as desired ?
[19:12] I have no remotely straightforward way of trying -virtual - I don't control the host
[19:12] I have tested that 'modprobe xenbus_probe_frontend' makes everything work
[19:13] cjwatson, ok
[19:13] and I guess I sort of feel that -generic should be a superset of -virtual, normally? at least for "does it work"-ness
[19:14] ISTR -virtual originally being just a make-it-smaller flavour
[19:14] cjwatson, actually, -server and -virtual are much more closely related (in amd64)
[19:15] oh, is the -virtual config generated somehow?
[19:16] I noticed
[19:16] debian.master/config/amd64/config.flavour.generic:21:CONFIG_XEN_XENBUS_FRONTEND=m
[19:16] debian.master/config/amd64/config.flavour.virtual:21:CONFIG_XEN_XENBUS_FRONTEND=y
[19:16] and inferred that it wasn't a simple subset any more
[19:16] cjwatson, it was originally based on -server
[19:16] * jjohansen -> lunch
[19:16] oh, I see, well, -server has CONFIG_XEN_XENBUS_FRONTEND=m too
[19:16] I wouldn't object to switching the xen images over to -virtual in P if it's the kernel team's opinion that that would work better
[19:17] and if you think CONFIG_XEN_XENBUS_FRONTEND=y across the board is too risky for Oneiric then I can add workarounds
[19:17] cjwatson, we tailored -virtual for use in xen environments
[19:17] but I thought it made sense to ask first
[19:17] I'm still evaluating CONFIG_XEN_XENBUS_FRONTEND=y across the board
[19:17] it's likely OK
[19:22] cjwatson, it _looks_ ok, but I think I've gotta test it on some bare metal first.
[19:25] tgardner: lemme know how the tests go. I'll hold off on the upload for a bit so we can add the config change.
[19:26] ogasawara, it'll take 30 mins or so
[19:26] tgardner: ack
[19:34] cjwatson, ogasawara: the only way to build in XEN_XENBUS_FRONTEND is by enabling one of the front end device drivers, e.g., CONFIG_XEN_BLKDEV_FRONTEND=y or CONFIG_XEN_NETDEV_FRONTEND=y. This is the reason we only enabled those drivers for -virtual.
[19:35] I'll add my rationale to the bug
[19:51] tgardner: oh, drat
[19:52] that seems like a bug in the upstream config system
[19:52] since you can never get autoloading that way
[19:52] cjwatson, I'm not saying it _won't_ work, I'm just a little reluctant at this late stage.
[19:53] yeah, understood
[19:53] switching to -virtual will be fine for P, and I'll work around it in the installer for Natty/Oneiric
[19:54] cjwatson, cool, much appreciated.
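The per-flavour comparison above was done by eye on grep output. A minimal sketch of the same check in Python follows, assuming only the two grep lines quoted in the log as input; the function name `option_by_flavour` and the regular expression are illustrative and not part of any Ubuntu kernel tooling.

```python
import re

# The two lines quoted in the log. A real check would grep the whole
# debian.master/config directory instead of using a fixed string.
GREP_OUTPUT = """\
debian.master/config/amd64/config.flavour.generic:21:CONFIG_XEN_XENBUS_FRONTEND=m
debian.master/config/amd64/config.flavour.virtual:21:CONFIG_XEN_XENBUS_FRONTEND=y
"""

# Matches "<path>/config.flavour.<flavour>:<line number>:<OPTION>=<value>".
LINE_RE = re.compile(
    r"config\.flavour\.(?P<flavour>[\w-]+):\d+:(?P<option>CONFIG_\w+)=(?P<value>\S+)$"
)


def option_by_flavour(grep_output, option):
    """Map each flavour name to the value it sets for the given option."""
    result = {}
    for line in grep_output.splitlines():
        match = LINE_RE.search(line)
        if match and match.group("option") == option:
            result[match.group("flavour")] = match.group("value")
    return result


settings = option_by_flavour(GREP_OUTPUT, "CONFIG_XEN_XENBUS_FRONTEND")
modular = sorted(name for name, value in settings.items() if value == "m")

print(settings)  # {'generic': 'm', 'virtual': 'y'}
print(modular)   # ['generic']
```

Flavours listed in `modular` are the ones where the frontend is built as a module, which is the case where xenbus_probe_frontend had to be loaded by hand with modprobe.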
[19:55] solutions that don't involve kernel changes are fine, I just have a little Keybuk voice in the back of my head going "we should just change the kernel" sometimes ;-)
[19:55] I guess I'm the anti-keybuk
[19:55] ogasawara, fire away
[19:56] tgardner: I'm firing up test boots and will then pull the trigger
[20:36] * tgardner -> EOW
[21:18] ogasawara, heya
[21:18] bryceh: hi what's up
[21:18] ogasawara, I want to build a kernel for a bug reporter to test, with an upstream patch applied
[21:19] bryceh: want me to build it for ya?
[21:19] ogasawara, oh that'd be great
[21:19] bryceh: sure, just point me to the patch.
[21:19] it's comment #7 on https://bugs.freedesktop.org/show_bug.cgi?id=41059
[21:19] bryceh: I assume oneiric kernel? and do they need amd64 or i386?
[21:19] Freedesktop bug 41059 in Driver/intel "XRANDR operations very slow unless (phantom) HDMI1 disabled" [Major,New]
[21:20] ogasawara, amd64
[21:21] bryceh: once I've got it, you want me to just post the url to the test kernel in that bug?
[21:21] ogasawara, yep that'd be perfect
[21:21] bryceh: ack, will do
[21:23] thanks ogasawara!
[21:25] bryceh: hi! what do you say - are 120 ms per DP port probe ok? HDMI2 and HDMI3 in comparison are much faster.
[21:30] htorque, keith packard's been reworking the DP probing code lately
[21:31] we've seen some horrible times, particularly eDP, of 500-1000ms in some cases
[21:31] so compared with that 120 ms doesn't seem too bad
[21:31] however my expectation would be that it should be much lower than that
[21:32] bryceh: okay, thanks for your answer! :)
[21:32] Given the recent trend of hardware with lots of different ports, what concerns me is the sum of doing these probes in serial. I wonder why they're not done in parallel; is that even possible?
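The concern about the sum of serial probes can be put into numbers with a small sketch. Only the 120 ms per DP port figure comes from the log. The port names and the count of three are assumptions for illustration, and the parallel case is an idealized lower bound, since the log does not establish whether the hardware or driver allows overlapping probes.

```python
# Per-port probe times in milliseconds. 120 ms is the figure reported in the
# log; the three port names are hypothetical.
probe_ms = {"DP1": 120, "DP2": 120, "DP3": 120}


def serial_probe_time(times):
    """Total time when probes run one after another."""
    return sum(times.values())


def parallel_probe_time(times):
    """Idealized total when all probes overlap fully: the slowest one wins."""
    return max(times.values())


serial_total = serial_probe_time(probe_ms)
parallel_total = parallel_probe_time(probe_ms)

print(serial_total)    # 360
print(parallel_total)  # 120
```

The serial total grows with every added port, while the idealized parallel total stays at the slowest single probe.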
[21:33] htorque, keithp has a code branch with reworked DP probing that we had a couple people test and they found it significantly improved boot speed
[21:34] I've communicated this to Intel, which they were glad to hear and hopefully they'll have that new code finished, QA'd, and available in time for LTS
[21:34] I am not confident it's something we'd want to risk SRUing for oneiric but guess we can decide at some point when the code is ready
[21:35] bryceh: the 1.6 sec. from HDMI1 are definitely worse - especially if you don't have HDMI ports :P
[21:35] do you have a link to that branch?
[21:37] no but I can dig it up, one sec
[21:37] bryceh: found this: http://kernel.ubuntu.com/~sarvatt/macbook-air/
[21:38] aha yeah that's it
[21:44] bryceh: from 120 ms per port to 50 ms per port. nice! :)
[21:47] htorque, most excellent :-)
=== kentb is now known as kentb-out
=== kamalmostafa is now known as kamal