[05:17] <pitti> Good morning
[07:08] <jibel> pitti, how do I connect to running autopkgtest VMs? there are defunct runcmd processes, but no evidence of what happened on the console or on the host.
[07:22] <pitti> jibel: sorry, missed your ping
[07:22] <pitti> jibel: ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -p 10022 ubuntu@localhost
[07:23] <pitti> jibel: could also be 10023 or higher if multiple VMs are running; it says the port in the log, though
[07:26] <jibel> pitti, ah sorry, I didn't see the port at the end of the command line. Thanks
[07:38] <jibel> pitti, there are serious performance issues with 9p
[07:38] <jibel> pitti, I compared dd on the mountpoint and locally inside the VM http://paste.ubuntu.com/7403086/
[07:42] <pitti> jibel: right; but it's still much faster than squeezing everything through a pipe and tar
[07:43] <pitti> jibel: we run the tests and builds in /tmp/ in the testbed though, not in /autopkgtest
[07:43] <pitti> jibel: the former is a workaround for the too old qemu in saucy/precise (that caused the "invalid numeric value" breakage)
[07:43] <pitti> which means we actually have to copy large packages, which is what causes these timeouts on libo & friends
[07:43] <pitti> I need to look into that
[07:44] <pitti> (aside from all the other fires that are burning)
[07:44] <pitti> jibel: does that performance limitation hamper some test?
[07:44] <pitti> jibel: the 142 MB/s is just writing to memory, right? whereas 9p actually goes to the disk in the host's /tmp/
[07:44] <pitti> I suppose that explains most of the difference
[07:45] <jibel> pitti, right, but 2.9 MB/s is really slow even for disks
[07:46] <jibel> pitti, adt-virt-qemu copies data from /autopkgtest right?
[07:46] <jibel> I think that's what impacts performance and makes some test hang
[07:47] <pitti> jibel: yes; /autopkgtest is used for everything that needs to copy up/down data
[07:48] <pitti> jibel: originally I had all tests and source packages there, but as we can't make it owned by the user running the test I had to play some chmod tricks and move building and the tests tree out
[07:48] <pitti> jibel: which test is hanging due to that?
[07:48] <pitti> jibel: NB that libo etc. fail due to the copy timeout when copying the unpackaged source tree between host and testbed (that's the bit on my list)
[07:48] <jibel> pitti, 51200000 bytes (51 MB) copied, 0.228209 s, 224 MB/s
[07:49] <jibel> pitti, ^this is on disk
[07:49] <jibel> faster than mem even :)
[07:49] <pitti> jibel: are you sure? that's going to the overlay in /dev/shm, isn't it?
[07:49] <pitti> back in 3 mins
[07:50] <jibel> pitti, I am sure, last test was in $HOME as auto-package-testing on alderamin
[07:51] <jibel> and the VM runs with -drive file=/run/shm/adt-utopic-amd64-cloud.img.overlay-1399316707.83,if=virtio,index=0
[07:55] <pitti> jibel: ah, you ran it on the host, not in the VM
[07:56] <pitti> so 221 MB/s on my workstation's disk
[07:56]  * pitti boots VM
[07:58] <pitti> jibel: right, I get 6.9 MB/s here
[08:00] <jibel> pitti, yes, 224 is on the host with a raid array, 2.9 is on 9p in the VM and 142 in the VM on disk (which is an overlay in shm)
[08:07] <pitti> jibel: so at least chromium and libo don't have build-needed, for those I can radically optimize the copying and thus avoid the timeout
[08:08] <pitti> ah crap, no, these might actually call stuff from the full tree
[08:08] <pitti> jibel: as an immediate workaround I'd suggest I temporarily change the default copy timeout so that this doesn't keep blocking our tests?
[08:09] <pitti> -copy_timeout = int(os.getenv('ADT_VIRT_COPY_TIMEOUT', '300'))
[08:09] <pitti> +copy_timeout = int(os.getenv('ADT_VIRT_COPY_TIMEOUT', '3000'))
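The changed line keeps the environment override, so the new value is only a default; a sketch of the resulting behaviour (the `pop` simulates an unset environment for the demo):

```python
import os

# Simulate ADT_VIRT_COPY_TIMEOUT being unset, then apply the new default:
# the env variable still wins when set, otherwise the timeout is 3000 s
# instead of the old 300 s.
os.environ.pop('ADT_VIRT_COPY_TIMEOUT', None)
copy_timeout = int(os.getenv('ADT_VIRT_COPY_TIMEOUT', '3000'))
print(copy_timeout)
```

This means individual runs can still restore the old 300 s limit by exporting the variable, without patching the code.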
[08:10] <pitti> man, why is it so ridiculously hard to communicate with a QEMU VM
[08:11] <pitti> it seems pretty much the only thing that's fast is ssh, and that makes a lot of assumptions
[08:21] <jibel> pitti, 3000 sounds good for now, let's see if it makes binutils more stable.
[08:22] <pitti> jibel: the recent one succeeded; I retried chromium, libo, and friends with the 3000
[08:23] <jibel> for LO and linux, I am not confident. It took 40 min to copy the built tree with the previous version of autopkgtest
[08:26]  * pitti wants a way to access a VM image without root privs. Now!
[09:42] <jibel> pitti, it seems to be related to the block size used by cp; if I dd with bs=1M I get an ~85x performance improvement on my machine
[09:58] <pitti> jibel: oh, wow! so maybe we could apply a similar trick in the copy{up,down}_shareddir bits
[09:59] <davmor2> Morning all
[10:02] <pitti> jibel: https://lists.gnu.org/archive/html/coreutils/2011-07/msg00059.html seems related
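The block-size effect can be isolated with two equal-sized writes; a small sketch (on a local filesystem the gap is modest, but over 9p the small-block variant is the one that crawls, since each block is a separate round trip):

```shell
# Same 10 MB payload, two block sizes: 20480 writes of 512 B vs 10 writes of 1 MiB.
dd if=/dev/zero of=/tmp/bs-512 bs=512 count=20480 2>&1 | tail -n1
dd if=/dev/zero of=/tmp/bs-1m  bs=1M  count=10    2>&1 | tail -n1
```

Comparing the reported rates on the 9p mount is what surfaced the ~85x difference mentioned above.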
[10:05] <elfy> morning davmor2 :)
[13:22] <jibel> pitti, FYI, I didn't find any way to really improve cp on 9p. I tried cpio too and specified a block size but it doesn't make a difference. The best I could do is with rsync, which is 2 times faster than cp
[13:29] <pitti> jibel: ah, I'm also currently playing around with this; my hope was that cpio and/or tar would help as you can specify big block sizes
[13:32] <pitti> dd bs=1M if=/dev/zero count=100 | cpio -o --file=out.cpio
[13:32] <pitti> heh, that only creates a sparse file
[13:33] <jibel> pitti, I tried find /autopkgtest/tmp -depth -print|cpio -pdm /tmp/adt/
[13:34] <pitti> jibel: I'm getting 44 MB/s with tar, while I got 6 with 512 byte block (default) dd
[13:34] <jibel> pitti, I also tried the msize and cache options of 9p but no visible improvement
[13:36] <pitti> jibel: same with reading, btw
[13:36] <pitti> dd if=/autopkgtest/out.tar of=/dev/null -> 7.0 MB/s
[13:36] <pitti> with bs=1M -> 63 MB/s
[13:39] <pitti> jibel: so it seems funneling that through tar if both the host and testbed paths are *not* already in the shared dir will give a ~10x improvement
[13:48] <jibel> pitti, right, the best I can get is by creating a tar file on the host then on the guest run: time dd if=/autopkgtest/shared.tar bs=1M|tar x -C /tmp/adt/
[13:48] <jibel> pitti, it takes 2s for a 100M tar file
[13:50] <pitti> jibel: right, I found reading to be much less sensitive to the block size
[13:52] <jibel> while cp -a takes 52s and kills the cpu
[13:55] <pitti> jibel: I'm now testing with a more realistic scenario with lots of smaller files, not a single big one
[13:55] <pitti> with an unpacked postgresql-9.3 tree (113 MB, 5680 files)
[13:55] <pitti> 11 seconds with cp -r
[13:56] <jibel> pitti, I tried with /usr/share/doc + libpng = 17806 files
[15:22] <jibel> pitti, I pushed fixes and new tests to https://code.launchpad.net/~jibel/britney/fix_missing_results
[15:22] <jibel> pitti, I verified that I could reproduce the bug with the current version of britney and the new tests
[15:22] <pitti> \o/
[15:22] <pitti> jibel: you rock, thanks
[15:23] <jibel> pitti, do you think you'd have time tomorrow for a pre-review? then I'll propose a merge against britney
[15:23] <pitti> jibel: yes, absolutely; this is the #1 issue for wrecking utopic, I'll make time
[15:23] <pitti> jibel: it seems this bug currently happens more often than not; presumably because of the large amount of package influx from syncs, etc.?
[15:23] <jibel> pitti, that'd be great, many thanks
[15:25] <pitti> jibel: perhaps you can already propose, then we see the diff and have some comment thread for the discussion/review?
[15:25] <pitti> jibel: set it as "WIP" for now
[15:27] <jibel> pitti, it's essentially because there are more packages with autopkgtest, and the last result is taken into account.
[15:27] <jibel> pitti, okay, I'll do that
[15:28] <jibel> pitti, any reason the tests aren't merged in britney2-ubuntu?
[15:29] <pitti> jibel: I don't know; I proposed it ages ago, but got no reaction to it yet; probably needs more poking
[15:58] <pitti> jibel: OOI, why logging.warning -> print() ?
[16:04] <jibel> pitti, because that's the only call to logging in all the code, print() is used everywhere else. Probably a copy/paste at some point
[16:06] <pitti> jibel: ah, thanks