[02:16] <cwillu_at_work> qemu-arm-static seems to be acting weird all of a sudden
[08:15] <lag> Morning mythripk
[08:17] <hrw> morgen
[08:19] <Taalas> morgen
[08:21] <lag> Mornin' all
[08:42] <mythripk> morning lag
[10:32] <ogra> lool, btw, my ext2 image probs were caused by the FS running out of inodes
[10:32] <ogra> formatting with a higher default value should solve it
[10:34] <lag> ogra: Any luck?
[10:35] <ogra> archive is out of sync
[10:35] <ogra> i'm waiting for it to test the fix i have
[10:35] <lag> How do you mean 'out of sync'?
[10:35] <lag> Out of sync with what?
[10:35] <ogra>   computer-janitor: Depends: python-fstab (>= 1.2) but it is not installable
[10:35] <ogra>   libmailtools-perl: Depends: libtimedate-perl but it is not installable
[10:35] <lool> ogra: it's quite surprising TBH; do you resize the fs at some point?
[10:36] <lool> (I saw the chat and agree it's a number of inode problem)
[10:36] <ogra> lool, nope only later on first boot
[10:36] <ogra> i create an empty file with dd and format it, then loop mount it and cp -ax
[10:36] <ogra> that's all i do
[10:37] <ogra> it might be the way dd creates the file if you use count=0 and seek=<$imagesize>
[10:38] <lool> ?
[10:38] <ogra> i know that creates a file with holes (which e.g. swapon complains about)
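For reference, the image-creation steps ogra describes (a dd-created file with holes, then formatted with a higher inode count) can be sketched roughly as below. The 64 MiB size, the inode count of 65536, and the temporary file are illustrative assumptions, not values from the discussion; `-N` is the mke2fs option for requesting a number of inodes.

```shell
#!/bin/sh
# Sketch only: size and inode count are assumed values, not from the chat.

img=$(mktemp)          # hypothetical image path
size_mb=64             # assumed image size in MiB

# count=0 writes no data; seek extends the file, leaving one large hole
# (a sparse file), which is the "file with holes" mentioned above.
dd if=/dev/zero of="$img" bs=1048576 count=0 seek="$size_mb" 2>/dev/null

apparent_bytes=$(wc -c < "$img" | tr -d ' ')   # logical size of the file
allocated_kb=$(du -k "$img" | cut -f1)          # space actually allocated
echo "apparent=${apparent_bytes} bytes, allocated=${allocated_kb} KiB"

# Format with an explicit inode count instead of the default ratio.
# Guarded because mke2fs often lives in /sbin and may not be on PATH.
if command -v mke2fs >/dev/null 2>&1; then
    mke2fs -q -F -t ext2 -N 65536 "$img" || echo "mke2fs failed" >&2
fi

rm -f "$img"
```

Comparing the apparent size with the allocated size shows whether the file really is sparse on the filesystem in use.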
[14:39] <lool> Anybody at GUADEC?
[14:50] <rsavoye> lool: nope, blew it off to get work done...
[15:18] <cwillu_at_work> Check my logic:  the qemu vm issues exist in the -static builds as well;  -static + chroot works better for at least partially unknown reasons
[15:19] <rsalveti> cwillu_at_work: you mean, with rootstock?
[15:19] <cwillu_at_work> rsalveti, yes
[15:19] <rsalveti> yep, had a huge debugging day yesterday
[15:20] <cwillu_at_work> oh, really?
[15:20] <rsalveti> with full vm things are slower, seg faults and hangs
[15:20] <cwillu_at_work> too bad I missed it, because the -static is really hurting me right now :)
[15:20] <rsalveti> user emulation sucks with programs that request info from /proc
[15:20] <rsalveti> like the stupid mono package
[15:20] <cwillu_at_work> I'm getting mysterious "method http died" messages, and if I shuffle things around, the mysterious deaths move to pip
[15:21] <cwillu_at_work> I've been running fine for months, and then yesterday my images just started failing
[15:21] <cwillu_at_work> it's almost as if some update to a package I was installing broke things
[15:22] <rsalveti> cwillu_at_work: hm, what package failed?
[15:22] <cwillu_at_work> rsalveti, as far as I can tell, no package failed
[15:22] <rsalveti> hm
[15:22] <cwillu_at_work> it's just that some process will randomly die after aptitude finishes
[15:23] <rsalveti> cwillu_at_work: oh, ok
[15:23] <cwillu_at_work> (installing packages in the chroot)
[15:23] <rsalveti> cwillu_at_work: what distro version are you trying to bootstrap?
[15:23] <cwillu_at_work> lucid
[15:23] <cwillu_at_work> debootstrap finishes fine
[15:24] <rsalveti> cwillu_at_work: I'm planning to change to user mode emulation when running with root, like what you're doing, and also add native arm support
[15:24] <cwillu_at_work> I could push things out to first boot, but I'd really prefer not to
[15:24] <rsalveti> today, I mean
[15:24] <cwillu_at_work> which, rootstock?
[15:24] <rsalveti> yep, that sucks
[15:24] <rsalveti> cwillu_at_work: yep
[15:24] <cwillu_at_work> sec
[15:25] <rsalveti> full vm doesn't work, lots of bugs, and user mode emulation works fine for most of the cases
[15:25] <rsalveti> then if you still can't create the rootfs, do it on arm
[15:27] <cwillu_at_work> so, yesterday, did you figure anything out re: triggering it?
[15:29] <persia> How much of the vm issues can be attributed to issues with the kernel targets?  Might we just be seeing something odd there, especially with -updates things could be loosely tested for VM targets.
[15:29] <cwillu_at_work> chroot isn't using the arm kernel though
[15:29] <rsalveti> yep, just full vm
[15:29] <rsalveti> persia: with full vm I'm getting the same behavior with different kernel
[15:29] <cwillu_at_work> but that's something I haven't checked:  whether I'm using -updates as my source
[15:30] <rsalveti> cwillu_at_work: first, if you use maverick's qemu, you'll get the unsupported syscall for pselect again
[15:30] <rsalveti> and a huge backlog
[15:30] <cwillu_at_work> I don't follow
[15:30] <rsalveti> then if you install anything related with mono, it'll hang
[15:31] <rsalveti> cwillu_at_work: this was fixed for lucid, but we have a regression for maverick
[15:31] <cwillu_at_work> I'm not targeting maverick :p
[15:31] <rsalveti> apt-get uses pselect, and this syscall is implemented in lucid
[15:31] <rsalveti> happens if you're using maverick as the host :-)
[15:31] <cwillu_at_work> not doing that either
[15:31] <rsalveti> cwillu_at_work: also, I get a seg fault while installing humanity-icon-theme
[15:32]  * cwillu_at_work repeats himself:
[15:32] <cwillu_at_work> same package set worked fine a week ago :)
[15:32] <persia> The lucid case should be fairly different from the maverick case, but the lucid VM kernels come from the linux source package, which doesn't see much careful testing on updates except for i386 and amd64, usually.
[15:33] <cwillu_at_work> persia, we're not using the vm kernels at all though
[15:33] <cwillu_at_work> persia, qemu-arm-static doesn't require one
[15:33] <rsalveti> cwillu_at_work: but for me rootstock is working fine for most of the basic cases
[15:33] <rsalveti> cwillu_at_work: what packages are you requesting rootstock to install?
[15:33] <cwillu_at_work> rsalveti, it was for me too, up until a week ago :)
[15:33] <cwillu_at_work> sec
[15:34] <cwillu_at_work> I'm just going to pastebin my version
[15:34] <rsalveti> ok
[15:34] <persia> cwillu_at_work: Sorry then: I thought the issue was comparison of -static to the VM case.  Ignore me :)
[15:34] <cwillu_at_work> persia, well, it kinda is, but I'm focusing on the parts where -static fails in the same way :)
[15:35] <cwillu_at_work> http://pastebin.com/1iynGS4j
[15:35] <cwillu_at_work> line 620
[15:36] <cwillu_at_work> you should be able to run that locally if you remove the git calls
[15:36] <rsalveti> ok, will try
[15:36] <rsalveti> just a sec
[15:36] <cwillu_at_work> if I don't download the packages first, aptitude will die while installing firefox (i.e., second invocation)
[15:36] <cwillu_at_work> as written, it makes it through to the pip calls
[15:37] <cwillu_at_work> ugh, which are commented out in this version :p
[15:37] <cwillu_at_work> you might need an empty modules.d directory in the working dir
[15:38] <cwillu_at_work> given locally cached downloads, it takes about 20-25 minutes to run
[15:48] <cwillu_at_work> rsalveti, it seems like anything which touches the network dies after that point
[15:49] <cwillu_at_work> if I remove the pip calls and the aptitude update call at the end, it finishes
[15:55] <cwillu_at_work> rsalveti, actually, there's an odd thing:
[15:56] <cwillu_at_work> I split up the installer into multiple files as you noticed, which are each called in the same chroot, but different invocations of it
[15:56] <cwillu_at_work> (that was done to try to isolate things after they went weird yesterday)
[16:00] <cwillu_at_work> ....
[16:00] <cwillu_at_work> hmm, it could be the other arm chroot I had open to build packages was breaking things?
[16:03] <persia> That should have no effect.  I've routinely had multiples open (via schroot) without any apparent effect.  Mind you, that sample may not be large enough to prove a negative.
[16:03] <cwillu_at_work> persia, arm chroots?
[16:04] <persia> arm-on-amd64 foreign schroots
[16:04] <cwillu_at_work> okay
[16:04] <persia> Err, armel-on-amd64
[16:05] <cwillu_at_work> yep
[16:06] <cwillu_at_work> well, I'm rerunning without the other chroots open
[16:06] <persia> I'd run that test a few times, as there's supposed to be some separation.  If you can demonstrate a convincing effect, then we clearly need to do something more advanced with LXC
[16:07] <rsalveti> cwillu_at_work: sorry, will look at it now, was doing other things
[16:07] <cwillu_at_work> np
[16:08] <cwillu_at_work> persia, I've run 40-50 builds in the last day, and hundreds over the last few months
[16:08] <cwillu_at_work> I haven't established that the extra chroot was the cause though, if that's what you're asking
[16:08] <cwillu_at_work> but whatever changed is consistent
[16:09] <persia> cwillu_at_work: I figured, but I doubt you have data on which of them were run with a simultaneous build chroot active (although I'd be happy to know otherwise).
[16:09] <cwillu_at_work> I could figure it out (both have start and end timelogs)
[16:09] <persia> Are you targeting -updates?  I really think that's the most likely source of regression.
[16:09] <cwillu_at_work> I checked, I'm not
[16:10] <persia> -security ?
[16:10] <cwillu_at_work> take a look at the pastebin I posted
[16:10] <persia> How about on the host?
[16:10] <cwillu_at_work> I haven't applied updates this week yet
[16:10] <cwillu_at_work> at it worked on friday
[16:10] <cwillu_at_work> s/at/and/
[16:10] <persia> Ugh.  phase-of-the-moon problem :(
[16:10] <cwillu_at_work> :)
[16:11] <cwillu_at_work> MIRROR="http://repository:3142/ports.ubuntu.com/ubuntu-ports"
[16:11] <cwillu_at_work> REAL_MIRROR="http://ports.ubuntu.com/ubuntu-ports"
[16:11] <cwillu_at_work> COMPONENTS="main universe"
[16:11] <persia> Right.  That should be the same as it was at release.
[16:12] <cwillu_at_work> the reason I mention the other chroot isn't so much that I had builds running at the same time (that was one of the earlier things I checked), but rather that I never closed the chroot itself
[16:12] <cwillu_at_work> that's the test I'm running right now
[16:12] <persia> That really shouldn't have an effect
[16:13] <cwillu_at_work> you've said this :)
[16:13] <cwillu_at_work> I'll know in ten minutes
[16:15] <persia> Well, a simultaneously active chroot could have an effect if the chroot boundaries are insufficient, leading to a need to do more with LXC, but an inactive chroot is about the same whether chroot() has been called on it or not.
[16:18] <cwillu_at_work> it had /proc, /sys/, dev and so forth mounted inside it
[16:19] <persia> But no files open, right?
[16:19] <cwillu_at_work> whatever a shell would have, yes
[16:19] <cwillu_at_work> here's the thing though:
[16:19] <cwillu_at_work> (have you looked at the pastebin yet? :p)
[16:19] <persia> I wouldn't expect a shell to have enough open to make a difference, but maybe
[16:19] <cwillu_at_work> I split the script that runs in the chroot into four pieces
[16:19] <persia> yes
[16:19] <cwillu_at_work> to debug this
[16:19] <cwillu_at_work> each script is run in a separate chroot, sequentially
[16:20] <cwillu_at_work> Now, I can rerun rootstock, and it'll get up to the same point each time (i.e., first download works, next thing to touch the network dies)
[16:20] <cwillu_at_work> but... the next thing to touch the network is in a different chroot
[16:20] <cwillu_at_work> and it still dies
[16:20] <rsalveti> hm, weird
[16:20] <cwillu_at_work> ...even though re-running rootstock doesn't
[16:21] <cwillu_at_work> I'm pretty sure this will reduce to a config change that I forgot I made or something silly like that, but even so, I don't think I'm doing anything that _should_ be broken :)
[16:21] <persia> Very odd.
[16:22] <persia> At least it's isolated enough that it can be debugged, so once it's known, it can be made to never happen.
[16:23] <cwillu_at_work> I'm secretly hoping that this is the same trigger as the grief in qemu, but I'm not sure how that plays into what rsalveti said earlier about lucid patches
[16:24] <cwillu_at_work> moments away from knowing
[16:24] <cwillu_at_work> nope, that wasn't it :p
[16:24] <cwillu_at_work> damn
[16:25]  * persia is glad the LXC integration isn't actually required, as that has looked painful the last few investigations
[16:25] <cwillu_at_work> LXC?
[16:25] <rsalveti> cwillu_at_work: I'm running it here, will let you know if it worked or not
[16:25] <persia> http://lxc.sourceforge.net/
[16:25] <cwillu_at_work> k
[16:26] <cwillu_at_work> rsalveti, I don't think there's too many hardcoded dependencies on my environment
[16:26] <persia> Basically, one can create even more segregation than with a regular chroot, which we haven't (quite) needed for anything yet, but I keep expecting it when people start talking about issues with multiple simultaneous chroots.
[16:26] <rsalveti> cwillu_at_work: I removed most of the stuff I could easily identify
[16:26] <cwillu_at_work> ah, k
[16:26] <rsalveti> mirror, rsync, etc
[16:27] <cwillu_at_work> the rsync might be of interest
[16:29] <cwillu_at_work> (of modules.d, at least)
[16:35] <cwillu_at_work> pulling up a shell inside the chroot after it starts dying
[16:35] <cwillu_at_work> I'd like to confirm that it's network related activity
[16:46] <rsalveti> cwillu_at_work: yep, worked fine
[16:54] <cwillu_at_work> okay, bash prompt up
[16:54] <cwillu_at_work> yep, definitely something weird on this box
[16:58] <cwillu_at_work> ifconfig eth0 shows information, ifconfig dies with ": error fetching interface information: Device not found"
[16:58] <cwillu_at_work> with no indication of which device it's looking for
[17:00] <cwillu_at_work> host repository
[17:00] <cwillu_at_work> qemu: Unsupported syscall: 250
[17:00] <cwillu_at_work> errno2result.c:111: unable to convert errno to isc_result: 38: Function not implemented
[17:00] <cwillu_at_work> socket.c:3851: epoll_create failed: Function not implemented
[17:00] <cwillu_at_work> /usr/bin/host: isc_socketmgr_create: unexpected error
[17:03] <persia> That's exceedingly annoying.  Is that unique to one install, or replicable?
[17:05] <cwillu_at_work> not sure yet;  copying the files over to my desktop to try it there
[17:10] <cwillu_at_work> strace doesn't work :D
[17:11] <persia> strace-in-chroot or strace-on-qemu?
[17:11] <cwillu_at_work> chroot
[17:11] <lool> qemu can't emulate ptrace right now
[17:11] <lool> and it would probably be hard
[17:11] <cwillu_at_work> which is the package for binfmt?
[17:11] <persia> Well, the issue is in qemu, if you're getting "Function not implemented".  Try stracing that.
[17:12] <cwillu_at_work> binfmt-support?
[17:12] <persia> that's the base, but it's pluggable.
[17:12] <persia> You want to edit the binfmt entry for armel binaries to call strace, or run host strace attaching to a PID.
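persia's second suggestion, attaching the host's strace to the running emulator, might look like the sketch below. The process name pattern and the output file are assumptions; `-f`, `-p`, and `-o` are standard strace options. Attaching generally needs root or matching ptrace permissions on the host.

```shell
#!/bin/sh
# Sketch: build host-side strace commands for running qemu-arm-static processes.
# The guest program cannot be straced from inside the chroot (qemu does not
# emulate ptrace), but the emulator itself is an ordinary host process.

pattern="qemu-arm-static"          # assumed process name
out="/tmp/qemu-strace.log"         # hypothetical output file

# Build the strace command line for one PID.
strace_cmd() {
    printf 'strace -f -p %s -o %s\n' "$1" "$out"
}

# Print one command per matching process rather than attaching blindly.
if command -v pgrep >/dev/null 2>&1; then
    for pid in $(pgrep -f "$pattern"); do
        strace_cmd "$pid"
    done
fi
```

Printing the commands first makes it easy to pick the one process of interest before attaching to it.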
[17:17] <cwillu_at_work> attached
[17:19] <cwillu_at_work> will be a moment, hit ctrl-c in the prompt once too many times
[18:05]  * cwillu_at_work cries
[18:29] <cwillu_at_work> persia, did you want an strace of qemu when a "host google.ca" fails?
[18:32] <cwillu_at_work> persia, rsalveti, http://pastebin.com/8agEskqT
[18:32] <cwillu_at_work> http://pastebin.com/6EfBPFLu is the shell output
[18:50] <cwillu_at_work> there's another thing I didn't notice before:
[18:51] <cwillu_at_work> my build environment is an unpacked copy of the output of my rootstock
[18:53] <cwillu_at_work> which I haven't regenerated in a few weeks
[18:53] <cwillu_at_work> same problems in it though
[19:01] <cwillu_at_work> er, not quite
[19:01] <cwillu_at_work> host dies in the same way, pip and apt seem to work fine :/
[19:16] <cwillu_at_work> and, success
[19:16] <cwillu_at_work> bouncing everything through a local proxy on 127.0.0.1 works
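The workaround of routing traffic through a proxy on 127.0.0.1 could be set up along these lines. This is a sketch under assumptions: the port number and the scratch file are hypothetical, since the chat does not say which proxy or port was used. `Acquire::http::Proxy` is apt's proxy setting, and pip, like many command-line tools, honours the `http_proxy` environment variable.

```shell
#!/bin/sh
# Sketch: point apt and pip at a proxy listening on the loopback address.

proxy_host="127.0.0.1"
proxy_port="3128"                       # assumed port; not stated in the chat
proxy_url="http://${proxy_host}:${proxy_port}"

# Many command-line tools, pip included, honour this environment variable.
export http_proxy="$proxy_url"

# apt can also read its proxy from configuration. Write the snippet to a
# scratch file here; inside the chroot it would go under /etc/apt/apt.conf.d/.
apt_snippet=$(mktemp)
printf 'Acquire::http::Proxy "%s";\n' "$proxy_url" > "$apt_snippet"
cat "$apt_snippet"
```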