[07:32] <alkisg> Hi, I think I've found a bug in overlayfs, could someone please check that I'm not doing something wrong, before I file it upstream?
[07:32] <alkisg> As root: cd $(mktemp -d); mkdir lower upper work root; truncate -s 4G lower/sparse; mount -t overlay -o workdir=work,upperdir=upper,lowerdir=lower overlay work; truncate -s 0 work/sparse; umount work
[07:32] <alkisg> The error: truncate: cannot open ‘work/sparse’ for writing: Value too large for defined data type
[07:34] <alkisg> I.e. the truncate function call fails on big files when the underlying file system is overlayfs
[07:36] <apw> alkisg, what you are doing looks valid to my eye
[07:36] <alkisg> Thank you apw, should I send a mail to the unionfs kernel ML?
[07:36] <apw> alkisg, to the maintainers of overlayfs in the first instance, as in what's in MAINTAINERS for fs/overlay
[07:37] <alkisg> Thank you very much
[07:38] <mjg59> alkisg: https://github.com/docker/docker/issues/11700 ?
[07:38] <mjg59> We ran into that
[07:41] <apw> alkisg, seems like a copy up issue, as it is failing in open, opening it for write (if i am reading the trace right)
[07:41] <alkisg> mjg59: looking into that...
[07:42] <alkisg> apw, it fails for 2200M, it succeeds for 2000M, but it takes a very long time to truncate it, like it's trying to copy the file contents which imho it shouldn't
[07:42] <apw> alkisg, it is absolutely copying it up
[07:42] <apw> as truncate is "open for write, then ftruncate(N) on the fd"
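The sequence apw describes can be sketched in Python as below. This is only an illustration of the open-then-ftruncate pattern, not the coreutils source; the helper name `truncate_via_fd` is made up for this note. On overlayfs, the point made in the discussion is that the open for writing is the step that forces the copy-up, before the new size is ever seen.

```python
import os
import tempfile


def truncate_via_fd(path, size):
    """Resize `path` by opening it for writing, then calling ftruncate on the fd."""
    # Opening for write comes first; per the discussion, on overlayfs this
    # open is what triggers the copy-up of a file that lives on lower.
    fd = os.open(path, os.O_WRONLY)
    try:
        os.ftruncate(fd, size)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "sparse")
    with open(path, "wb") as f:
        f.truncate(1 << 20)          # create a 1 MiB sparse file
    truncate_via_fd(path, 0)
    print(os.path.getsize(path))     # prints 0
```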
[07:43] <mjg59> alkisg: Presumably failing at 2G
[07:43] <alkisg> Right
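A quick arithmetic check of the two sizes alkisg tried, as a sketch. It assumes GNU `truncate` semantics, where the `M` suffix means 1024*1024 bytes, and it assumes the limit involved is a signed 32-bit file offset. That second assumption is an inference from the 2G boundary and the "Value too large for defined data type" (EOVERFLOW) message, not something confirmed in the discussion.

```python
MIB = 1024 * 1024          # `truncate -s 2000M` uses M = 1024*1024 bytes
INT32_MAX = 2**31 - 1      # largest value a signed 32-bit offset can hold


def fits_in_32bit_offset(size_bytes):
    """True if a file of this size can be described by a signed 32-bit offset."""
    return size_bytes <= INT32_MAX


for mib in (2000, 2200):
    size = mib * MIB
    print(mib, size, fits_in_32bit_offset(size))
# prints:
# 2000 2097152000 True
# 2200 2306867200 False
```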
[07:44] <alkisg> Hmm so when I want to empty some files, `truncate -s 0` isn't really an efficient way to do it... :)
[07:45] <alkisg> I thought the truncate function would special-case 0 there
[07:46] <apw> alkisg, the problem is not truncate, but the fact that to use it you have to open the file
[07:47] <alkisg> Hmm and also overlayfs doesn't use blocks, lists etc, so it wouldn't be able to actually "truncate" a file in half without copying it first
[07:48] <alkisg> I.e. mark the rest of the blocks as unused etc
[07:50] <apw> alkisg, right, though at the time we open it all we are saying is "open this so i can scribble on the current contents"
[07:50] <apw> and in that case all it can do is make sure it is on upper
[07:50] <alkisg> Gotcha
[07:51] <apw> that it fails on 2G+ is broken, but it will always be _slooow_
[07:52] <alkisg> Right, so I'll change the ltsp code so that it doesn't use truncate, but the bug still needs to be reported upstream
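One way to empty a file without ever opening the existing file for writing is to create a fresh empty file next to it and rename it over the original. The sketch below is an assumption about how that could look; it is not taken from the LTSP fix, and the helper name `empty_by_replacing` is hypothetical. The expectation that this avoids the copy-up on overlayfs follows from the reasoning above (the old file is never opened for write), but it was not tested in this discussion.

```python
import os
import stat
import tempfile


def empty_by_replacing(path):
    """Replace `path` with a new empty file, keeping its mode and, if allowed, owner."""
    st = os.stat(path)
    directory = os.path.dirname(os.path.abspath(path))
    # Create the replacement in the same directory so the rename stays
    # within one file system.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        os.fchmod(fd, stat.S_IMODE(st.st_mode))
        try:
            os.fchown(fd, st.st_uid, st.st_gid)
        except PermissionError:
            pass  # not privileged enough to restore the owner; keep our own
        os.close(fd)
        fd = None
        os.replace(tmp, path)  # atomically swap the empty file into place
    except BaseException:
        if fd is not None:
            os.close(fd)
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Note that this changes the inode, so hard links and already-open file descriptors keep pointing at the old contents; whether that matters depends on the file being emptied.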
[07:52] <alkisg> I couldn't an official source for the maintainer though
[07:52] <alkisg> I think it's miklos, but I don't see his name in overlay.txt...
[07:53] <alkisg> s/an/find/...
[07:54] <apw> it is miklos yes, the main MAINTAINERS file in the top level lists them all
[07:56] <apw> alkisg, to: miklos cc:linux-unionfs
[07:56] <alkisg> ty!
[09:18] <alkisg> Reported to http://permalink.gmane.org/gmane.linux.file-systems.union/408 and fixed in LTSP, https://bugs.launchpad.net/ltsp/+bug/1494660. Thank you guys.
[09:18] <apw> alkisg, i also see someone has worked up a fix for the fundamental issue, though if you can avoid the copy up ... all the better
[09:19] <alkisg> We avoided it, but e.g. `useradd` fails on live booted PCs because it's using truncate (lastlog == sparse file of 3 GB on a system that uses LDAP)
[09:20] <alkisg> So we'll keep an eye on that in case we find other side effects until overlayfs is fixed upstream
[09:20] <apw> alkisg, right, i am saying someone on linux-unionfs looks to have a simple fix
[09:20] <alkisg> I understood. Ah, could you give me the link?
[09:23] <alkisg> Ah, a reply to mine, ok
[09:23] <alkisg> :)
[14:46] <psivaa> cking: hello, about that failing health-check test (which passed in today's run btw), do you think it would be beneficial to retain the old host for any debugging?
[14:46] <psivaa> We're having to hand over the machine to the IS, but can retain it for another week
[14:48] <cking> psivaa, i don't think we're going to gain much by keeping hold of the old host
[14:48] <cking> so let it go
[14:48] <psivaa> cking: ack, thank you
[15:35] <TJ-> apw: ping re: cryptsetup
[15:40] <apw> TJ-, i am blank, remind me
[15:45] <TJ-> apw: You signed off a couple of recent package uploads; I was wondering if you're semi-maintaining it ?
[15:45] <apw> i might :)
[15:46] <TJ-> apw: I work with LUKS/dm-crypt/cryptsetup extensively, and there are a couple of related issues showing with Wily which lead to non-boot regressions
[15:46] <apw> TJ-, sounds bad
[15:47] <apw> i think i just merged it, so if you want to change it that's all good
[15:47] <TJ-> apw: do you have 5 minutes to discuss/bounce ideas?
[15:50] <apw> sure go ahead
[15:53] <TJ-> apw: I've used LUKS with keyscript= in crypttab for several years. The basic script simply takes the KEYFILE passed to it and reads it. That's worked fine with Upstart. Now, with systemd-cryptsetup, that is broken because it doesn't support keyscript.
[15:53] <apw> TJ-, "you don't want to do it like that"
[15:53] <apw> good old systemd and its "i do everything you should use and nothing you do" support
[15:54] <TJ-> apw: systemd-cryptsetup does support finding/loading the keyfile directly, however, so if it doesn't require any special steps that will still work
[15:55] <TJ-> apw: systemd-cryptsetup complains about unknown option keyscript= though, regardless. If the user takes notice of that and removes the keyscript= option from cryptsetup it breaks the initramfs cryptroot script causing failure to unlock the container with the rootfs
[15:56] <TJ-> apw: Currently the cryptsetup initramfs hook script knows it's a problem and will report "cryptsetup: WARNING: target LUKS_OS uses a key file, skipped"
[15:57] <TJ-> apw: That doesn't help the user though. I'm wondering if we shouldn't have the script simply assume the crypttab 'initramfs' option and do the cryptsetup install to the initrd.img regardless, rather than having no cryptdisk support at all
[15:58] <TJ-> apw: The hook script doesn't know if the LUKS slots may also contain a passphrase which the user can type instead of the key-file, but it does know cryptsetup is required to unlock the container with the rootfs in
[16:00] <apw> TJ-, you mean we are keying off that command line option to work out what file to put in the initrd
[16:00] <apw> TJ-, but systemd tells us to get rid of it?
[16:01] <TJ-> apw: we're keying off the crypttab entry for the container, and not installing cryptsetup at all if there is a key-file defined but no keyscript
[16:02] <TJ-> apw: systemd-cryptsetup complains each time it sees 'keyscript=' which could lead to users (like me!) trying to tidy things up
[16:02] <apw> TJ-, so i tend to agree if we know root needs crypt to unlock we should always install that
[16:03] <apw> TJ-, even if it won't work, it definitely won't work without
[16:03] <TJ-> apw: It can be over-ridden in crypttab using the 'initramfs' option, but it's not obvious
[16:03] <TJ-> apw: OK, I'll work up a patch and a bug report and let you know when it's tested
[16:03] <apw> thanks.
[16:04] <apw> anything that gets them a step nearer to working is good, if root is in a container that needs it, it should be in the initrd even if we don't have enough info to actually open it
[16:05] <TJ-> Yeah... with keyscript= it's not a problem, so for users upgrading they should be OK, but for new users it could be an issue
[16:08] <TJ-> Getting the required support into systemd-cryptsetup is going to be a big job; upstream wants to use the kernel key-ring and the PasswordAgents API
[16:09] <apw> TJ-, whatever is hard for everyone i am sure
[16:09] <TJ-> apw: makes sense, but a lot of work :)
[16:10] <TJ-> PA means easy support for SmartCards and other devices without hackish scripts
[16:43] <TJ-> apw: got a fix in, cryptroot::get_device_opts() expands the existing warning, recommends a pass-phrase, and doesn't return an error, so the device is included. "cryptsetup: WARNING: target LUKS_OS uses a key file, but no keyscript is set. Please ensure there is also a typed pass-phrase set"
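The behaviour TJ- describes for the fixed hook can be modelled roughly as below. This is a Python sketch of the decision only, not the actual initramfs hook script. The function names and the example crypttab entry are hypothetical; the field order (name, source device, key file, options), with "none" meaning no key file, follows the usual crypttab layout, and the warning text is the one quoted above.

```python
def parse_crypttab_line(line):
    """Split one crypttab line into (name, source, keyfile, options), or None."""
    fields = line.split()
    if len(fields) < 2 or fields[0].startswith("#"):
        return None  # blank line, comment, or incomplete entry
    name, source = fields[0], fields[1]
    keyfile = fields[2] if len(fields) > 2 else "none"
    options = fields[3].split(",") if len(fields) > 3 else []
    return name, source, keyfile, options


def check_root_target(line):
    """Return (include, warning) for a crypttab entry that holds the rootfs."""
    parsed = parse_crypttab_line(line)
    if parsed is None:
        return False, None
    name, _source, keyfile, options = parsed
    has_keyscript = any(opt.startswith("keyscript=") for opt in options)
    warning = None
    if keyfile != "none" and not has_keyscript:
        # Warn, but still include the device instead of skipping it.
        warning = ("cryptsetup: WARNING: target %s uses a key file, but no "
                   "keyscript is set. Please ensure there is also a typed "
                   "pass-phrase set" % name)
    return True, warning


# Hypothetical entry: key file configured, no keyscript= option.
include, warning = check_root_target("LUKS_OS UUID=0000 /etc/keys/os.key luks")
print(include)
print(warning)
```

The design point from the discussion is the return value: the entry is reported as included even when the warning fires, since a root container definitely cannot be unlocked if cryptsetup is missing from the initrd.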
[17:00] <jdstrand> cking: thanks for the super-quick turnaround for thermald! :)
[17:01] <cking> jdstrand, no problem, I had the fix already, so it was a no-brainder
[17:01] <cking> *brainer
[17:01]  * jdstrand nods
[17:01] <jdstrand> still though-- very nice :)
[17:01] <jdstrand> cking: hope you enjoy your weekend :)
[17:01] <cking> cheers, thanks, you too
[17:02] <cking> jdstrand, and thanks for testing it :-)
[17:02] <jdstrand> I think it was literally the least I could do considering the rapid turnaround :)
[17:03] <jdstrand> but, you're welcome :)
[17:16] <TJ-> apw: There are other similar bugs (e.g. bug 1487127) I'll squash whilst I'm at it, so they can be rolled up into one update. This issue is bug 1494851