=== smb` is now known as smb
=== cwillu_ is now known as cwillu
[09:57] ro
[09:57] / ro
[09:57] [ 6827.767493] ata1.00: exception Emask 0x0 SAct 0xffffff SErr 0x880000 action 0x6 frozen
[09:58] [ 6827.767497] ata1: SError: { 10B8B LinkSeq }
[09:58] [ 6827.767499] ata1.00: failed command: WRITE FPDMA QUEUED
[09:58] [ 6827.767516] ata1.00: cmd 61/08:00:90:4f:4a/00:00:06:00:00/40 tag 0 ncq 4096 out
[09:58] [ 6827.767516] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[09:58] [ 6827.767517] ata1.00: status: { DRDY }
[09:58] [ 6827.767518] ata1.00: failed command: WRITE FPDMA QUEUED
[09:58] [ 6827.767521] ata1.00: cmd 61/68:08:c0:b9:47/00:00:0d:00:00/40 tag 1 ncq 53248 out
[09:58] [ 6827.767521] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[09:58] [ 6827.767522] ata1.00: status: { DRDY }
[09:58] ...
[09:58] [ 6828.122829] ata1.00: device reported invalid CHS sector 0
[09:58] [ 6828.122836] sd 0:0:0:0: [sda]
[09:58] [ 6828.122837] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[09:58] [ 6828.122838] sd 0:0:0:0: [sda]
[09:58] [ 6828.122839] Sense Key : Aborted Command [current] [descriptor]
[09:58] [ 6828.122841] Descriptor sense data with sense descriptors (in hex):
[09:58] [ 6828.122842] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[09:58] [ 6828.122846] 00 00 00 00
[09:58] [ 6828.122848] sd 0:0:0:0: [sda]
[09:58] [ 6828.122850] Add. Sense: No additional sense information
[09:58] [ 6828.122851] sd 0:0:0:0: [sda] CDB:
[09:58] [ 6828.122852] Write(10): 2a 00 06 4a 4f 90 00 00 08 00
[09:58] [ 6828.122856] end_request: I/O error, dev sda, sector 105533328
[09:58] [ 6828.122858] quiet_error: 3 callbacks suppressed
[09:59] [ 6828.122859] Buffer I/O error on device sda1, logical block 13191410
[09:59] [ 6828.122860] lost page write due to I/O error on sda1
[09:59] ...
[09:59] [ 6828.123406] Buffer I/O error on device sda1, logical block 9537386
[09:59] [ 6828.123407] Buffer I/O error on device sda1, logical block 9537387
[09:59] [ 6828.123409] EXT4-fs warning (device sda1): ext4_end_bio:250: I/O error writing to inode 3812130 (offset 995328 size 8192 starting block 9537644)
[09:59] [ 6828.123411] ata1: EH complete
[09:59] [ 6828.123655] EXT4-fs (sda1): Remounting filesystem read-only
[09:59] [ 6828.123658] EXT4-fs error (device sda1) in ext4_free_blocks:4700: Journal has aborted
[09:59] eh
[09:59] [ 6828.123667] journal commit I/O error
[09:59] [ 6828.123697] EXT4-fs error (device sda1) in ext4_dirty_inode:4610: Journal has aborted
[09:59] [ 6828.123740] EXT4-fs (sda1): ext4_da_writepages: jbd2_start: 9223372036854775807 pages, ino 9213275; err -30
[09:59] [ 6828.123768] EXT4-fs error (device sda1): ext4_journal_start_sb:370: Detected aborted journal
[09:59] [ 6828.123803] EXT4-fs (sda1): ext4_da_writepages: jbd2_start: 9223372036854775807 pages, ino 9213273; err -30
[09:59] [ 6828.123844] EXT4-fs error (device sda1) in ext4_orphan_add:2420: Journal has aborted
[09:59] [ 6828.124019] EXT4-fs error (device sda1) in ext4_orphan_add:2420: Journal has aborted
[09:59] [ 6828.124021] EXT4-fs error (device sda1) in ext4_orphan_add:2420: Journal has aborted
[09:59] [ 6828.124083] EXT4-fs error (device sda1) in ext4_reserve_inode_write:4483: Journal has aborted
[09:59] [ 6828.124158] EXT4-fs error (device sda1): ext4_journal_start_sb:370: Detected aborted journal
[09:59] [ 6828.124258] EXT4-fs error (device sda1) in ext4_reserve_inode_write:4483: Journal has aborted
[09:59] ok, need to reboot and fsck
[10:02] ppisati, bitten again?
[10:02] yes
[10:03] haswell and intel ssd
[10:03] ppisati, what kind of workload?
[10:03] i need to find if there's a new firmware for that ssd now
[10:03] which SSD?
[10:03] ppisati: smart logs can tell you where and why those WRITE FPDMA QUEUED errors happened
[10:03] cking: nothing
[10:03] cking: editing files, terminal with mutt, irssi and some sshs
[10:03] cking: chrome was open too
[10:04] ohsix: but i guess i need smartd running, right?
[10:04] that's what I saw on my Intel 520
[10:04] ppisati, If I were around I would first suggest to see whether the "Buffer I/O errors" are within the drive/partition space
[10:05] smb, you on vacation?
[10:05] ppisati: nope, smartctl -l error /dev/device
[10:05] cking, It's "no-work-friday"
[10:05] smb, you're as bad as me
[10:05] the device stores a bunch of logs, an error log is one of them
[10:06] ohsix: SMART Error Log not supported
[10:06] ohsix, not sure if it is supported on some intel SSDs
[10:06] oh, that's what I get too
[10:07] smb: it mentioned sda1, so it should be within partition bounds
[10:07] bummer
[10:08] there may be an alternate report, smartctl -x lists everything
[10:08] ppisati, In theory yes, though I am not 100% sure which boundaries are exactly checked at which level, but yes. Though the initial error was some dma write failing. And then some weird thing about invalid chs sector 0...
[10:09] hello, does anyone know how to use debian/rules to run make bzimage?
[10:10] after successfully building a kernel I wanted to make a few corrections
[10:10] cking, I am not really working, really. ;) I was just around to complain about jockey being broken now and the steam client steaming about it a bit. And apport not generating bug reports but only updates on errors.u.c which is hard to check for success or not
[10:10] ppisati, so you got hit by a link sequence error and I suspect then the SSD just popped offline
[10:11] but it seems as if build/rules binary-generic does not care about changes I make to .c files because make bzImage is never rerun
[10:13] it seems like running build/rules clean ; build/rules binary-generic is a very poor way of recompiling the kernel, especially when I barely add a single printk
[10:15] cking: so more of an ATA bug then, but the ctrl communicates with the ssd so it could be any of them actually
[10:15] ppisati, so it looks like an issue on the link, you got 10b8b and a LinkSeq error, so it looks like the SATA link got all weird
[10:16] mIKEjONE2, You may try to remove the build stamp file in debian/stamps (I believe) and skip the clean part
[10:16] cking: this is the haswell box, could be a bug in the ctrl then
[10:16] cking: or even silicon
[10:16] ppisati, Has that box an easy-swap drive bay like mine?
[10:17] smb: yes
[10:17] ppisati, You could replace the ssd by another disk to rule out the ssd
[10:17] ppisati, I got the same issue on Ivybridge with an Intel 520 SSD. So it may be SSD related or H/W related on the chipset, or both, or who knows
[10:17] i bet on "who knows"
[10:17] ppisati, but I didn't see the issue with the same SSD on a Core2 laptop
[10:17] smb: i'll try to update the fw, in case it happens again i'll swap the disk
[10:18] (but it's nice to have a fast ssd... :) )
[10:18] ppisati, lemme rig up a spare Intel SSD on one of my SDP's and soak test it and see if we can characterise the bug
[10:18] when it doesn't crap out...
[10:18] so how do you guys recompile the kernel after modifying some of the source files?
[10:20] smb: that worked pretty well :) thanks
[10:20] I'm kind of surprised there's no better solution :/
[10:21] mIKEjONE2, We have a quick build box we share, so it is not really an issue, and recompiling the whole just makes sure everything really is done freshly and we are also a bit paranoid about that. ;)
[10:26] smb: I'm running this on a 6core i7 with 32GB of RAM + an SSD and it still takes 20mins :/
[10:31] mIKEjONE2, yep, 11 million lines of code, does take a while to build and package :-(
[10:31] there is always a good time to get another cup of coffee/tea or beer (depending on time of day) :)
[10:31] smb, I use build time to catch up on LKML
[10:32] cking, I knew I was doing something "wrong"...
[10:36] cking: well, not really, if you're building a bzimage without debian/rules framework, and you change a single file you don't have to rerun make mrproper and start from scratch
[10:37] make is smart enough to figure that out
[10:38] not quite sure why the debian/rules framework is intended for developers when its build capabilities are so crippled
[10:43] mIKEjONE2, use: rm debian/stamps/stamp-build-generic and then fakeroot debian/rules binary-generic, that will save a complete rebuild
[10:45] ppisati, which intel SSD do you have?
[10:45] cking: dmesg says
[10:45] [ 4.066991] scsi 0:0:0:0: Direct-Access ATA INTEL SSDSC2CW24 400i PQ: 0 ANSI: 5
[10:46] 520?
[10:46] 400i is latest 520 fw
[10:46] same as mine 520
[10:46] cking: yea smb suggested that as well, and it works
[10:47] I'm just a little grumpy because I spent an hour on 3 rebuilds because I didn't know that trick :/
[10:47] :-(
[10:49] alternatively I just removed the touch $@ for the build-stamp rule from debian/rules.d/2-binary-arch.mk, this way I don't have to manually rm the file :D
[10:50] oh, that's way less of a hack, nice
=== ricotz_ is now known as ricotz
[14:43] Hi, I've sent patch to kernel-team@lists.ubuntu.com yesterday, but it did not show up in the archives. I am being told that I have to be subscribed to the mailing list, is that right?
[15:29] is it expected that "linux-image-generic" is listed as one of the packages I can remove with 'apt-get autoremove'?
[16:03] and after an entire day of printk and debugging i finally found why we can poke at iomem space safely in Q (contrary to what happened to P)... ohhhhhhh....
[16:18] ppisati, nice
[16:38] * ppisati calls it a day/week and heads to the gym for some ignorant weight lifting! :)
=== yofel_ is now known as yofel