[00:04] <hallyn> apw: thanks - that worked!
[00:08] <manjo> anyone know if ports.ubuntu is having issues 
[00:11] <manjo> looks like some of the ports mirrors might be down... oh well 
[00:17] <hallyn> apw: the patch is not quite right as it should grab another reference to the namespace, but it doesn't seem to blow up and does fix the bug, so i'll send a proper patch tonight - thanks again
[00:24] <hallyn> or does it not matter?
[00:25] <hallyn> are we guaranteed that the cgns will stick around for the duration of the mount due to task->mntns ?
[00:25] <hallyn> that actually may be the case but i need to think it through
[00:44] <manjo> apw, shit! 
[00:44] <manjo> (initramfs) echo "81780AD9-068C-4A80-A795-56856973B8F9" | tr '[:upper:]' '[:lower:]'
[00:44] <manjo> 81780AD9-068C-4A80-A795-56856973B8F9
[00:45] <manjo> apw, so the tr trick won't work in the initramfs environment
[00:52] <manjo> apw, (initramfs) echo "81780AD9-068C-4A80-A795-56856973B8F9" | awk '{print tolower($0)}'
[00:52] <manjo> 81780ad9-068c-4a80-a795-56856973b8f9
[00:52] <manjo> that works
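A minimal sketch of the lowercasing step manjo tested above, wrapped as a shell function. The function name `to_lower` is hypothetical and not taken from the patch on the bug; the sketch only assumes that `awk` is present in the environment, as it was in manjo's initramfs shell.

```shell
# to_lower: print the first argument in lowercase using awk's tolower().
# "to_lower" is an illustrative name, not part of initramfs-tools.
to_lower() {
    printf '%s\n' "$1" | awk '{ print tolower($0) }'
}

# Example call with the UUID from the log:
#   to_lower "81780AD9-068C-4A80-A795-56856973B8F9"
```

This avoids the `tr '[:upper:]' '[:lower:]'` form, which left the string unchanged in the initramfs shell shown above.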
[01:17] <manjo> apw, revised patch attached to the bug with real system boot test results 
[02:41] <jsalisbury> lamont, I posted the last bunch of test kernels in a directory tree.  Location and how they are organized posted to the bug.
[02:45] <lamont> jsalisbury: awesome
[02:45] <lamont> I'll smash through the bisect tomorrow
[07:27] <tjaalton> do you prefer an oldskool pull request or a merge request via lp? (for xenial)
[09:12] <apw> tjaalton, old skool pull request is prolly easier for people as that is what they are used to
[09:14] <tjaalton> ok, I'll do that then
[11:19] <tjaalton> apw: enjoy! :)
[11:22] <apw> manjo, i've uploaded an alternative fix (to avoid adding the first awk dependency) to ppa:apw/ubuntu/initramfs-tools-test, if you could test that for me then i can get it uploaded for real later today
[11:22] <apw> tjaalton, that is highly doubtful :/
[11:23] <tjaalton> hehe
[11:28] <apw> tjaalton, pththththth
[12:18] <apw> hallyn, yo ... did you bottom out that cgroups fix?  i don't see an email
[14:27] <HerrAmeise> anyone install dist-upgrade 4.2.0-30 this morning and have trouble booting?
[14:27] <HerrAmeise> i had to revert back to 4.2.0-27 in order to get it working
[14:28] <apw> bjf, kamal ^
[14:28] <apw> HerrAmeise, what were your symptoms?
[14:28] <apw> you are certainly the first report I have seen of issues with that update
[14:29] <HerrAmeise> Unity would not start up
[14:29] <HerrAmeise> so it went through the normal boot sequence
[14:29] <HerrAmeise> and then just hung forever
[14:30] <HerrAmeise> I couldn't even open a terminal
[14:30] <apw> HerrAmeise, ugg, sounds bad.  could you file a bug against the kernel "ubuntu-bug linux" from your working kernel
[14:30] <HerrAmeise> so I had to go to grub and boot with a different kernel version
[14:30] <apw> and put whatever details you have in it, and tell us the bug# here
[14:30] <HerrAmeise> yup no problem
[14:30] <apw> it is also worth looking in syslog to see if there was anything reported when it broke
[14:30] <HerrAmeise> yep i can do that one sec
[14:31] <apw> i am assuming you received the kernel via a simple update (update manager or apt-get dist-upgrade sort of thing)
[14:32] <HerrAmeise> yea i did apt-get dist-upgrade
[14:32] <HerrAmeise> didn't build it myself or anything crazy
[14:32] <apw> and its hard to not get all the pieces you need as a complete set when its done that way
[14:33] <apw> (so that eliminates one possibility)
[14:33] <apw> once you have a bug, i'll paste some initial kernels to test to try and figure out which of the three updates is at fault
[14:33] <bjf> apw, news to me. but -30 only has been in -updates for less than 24 hrs.
[14:34] <apw> yep, that indeed
[14:34] <HerrAmeise> apw: ok, I am also running this on a VM at work if that is relevant
[14:34] <HerrAmeise> VMware Workstation 10
[14:35] <apw> HerrAmeise, VMs would be a common test subject but mostly in KVM
[14:42] <apw> HerrAmeise, got a bug # yet ?
[15:06] <stgraber> apw: did our uploads fix your kernel adt problem?
[15:07] <apw> stgraber, yes ... final one just went green in the last 15m
[15:07] <apw> (for xenial)
[15:07] <stgraber> good!
[15:19] <apw> stgraber, and we're seeing ppc64el lxc ADT failures on trusty regardless of kernel by the looks of it, is this expected ?
[15:19] <apw> stgraber, and if not i'll file you another bug :)
[15:20] <HerrAmeise> apw: is there any other way to report bugs other than through Apport?
[15:21] <HerrAmeise> it's really a PITA
[15:21] <apw> in theory you can ask apport to file the info to a blob you can move to another machine and submit from there
[15:22] <HerrAmeise> and its not just an application crash
[15:25] <HerrAmeise> btw i upgraded the kernel to 4.2.0-30 on my 32-bit Ubuntu VM and the same thing happened
[15:25] <HerrAmeise> so definitely able to replicate the error
[15:25] <HerrAmeise> first one was 64-bit
[15:25] <apw> it is as likely a vmware related issue as anything
[15:27] <HerrAmeise> true
[15:28] <HerrAmeise> i'll try natively when i get home tonight
[15:29] <apw> HerrAmeise, the first debugging steps are to try -28 and -29 to see which of 28,29,30 are the first broken one
[15:29] <apw> https://launchpad.net/ubuntu/+source/linux/4.2.0-29.34
[15:30] <apw> https://launchpad.net/ubuntu/+source/linux/4.2.0-28.33
[15:30] <apw> binary packages for those are in the librarian ^
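The debugging step apw describes, booting each build in order and noting the first one that fails, can be sketched as a small shell helper. The function name `first_broken` and the `version=result` argument format are hypothetical, and the good/bad values in the example call are placeholders, not results reported in this log.

```shell
# first_broken: print the first version whose recorded result is "bad".
# Arguments are "version=result" pairs, oldest version first.
# Returns non-zero if no version is marked bad.
first_broken() {
    for entry in "$@"; do
        version=${entry%%=*}   # text before the first "="
        result=${entry#*=}     # text after the first "="
        if [ "$result" = "bad" ]; then
            echo "$version"
            return 0
        fi
    done
    return 1
}

# Example call with placeholder results:
#   first_broken 4.2.0-27=good 4.2.0-28=good 4.2.0-29=bad 4.2.0-30=bad
```

With only three candidate builds a linear pass is enough; a longer range would be better served by a proper bisect.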
[15:37] <HerrAmeise> 33674
[15:38] <HerrAmeise> oops sorry wrong window
[15:41] <stgraber> apw: the latest ppc64el failure on trusty indicates a DC network failure
[15:41] <stgraber> unable to reach the gpg network and cloud-images.ubuntu.com
[15:42] <apw> stgraber, i'll ask for them again and see if it goes away then
[15:42] <stgraber> the rest of your results look good so I'd say your kernel is fine, it's just the test runner having some network difficulties
[15:43] <stgraber> I know that IS changed the squid proxy IP recently, could be that the ppc64el VMs don't have the right firewall rule or something
[15:43] <stgraber> if it fails again, we'll involve pitti
[15:43] <apw> stgraber, ack, will let you know
[15:55] <hallyn> apw: sorry, no, had some technical difficulties.  will try to send it out this morning
[15:56] <apw> hallyn, the pending fixes which break you are time sensitive, so i would like to get to a place where i have a plan to add something or rip something and upload at the latest tomorrow
[15:57] <hallyn> apw: should i send a patch first upstream or first to ubuntu-kernel@?
[15:58] <apw> hallyn, either is fine, if you are confident in the fix i can apply it while upstream grinds on it
[15:58] <apw> we're still in a reasonably flexible period so we can rip it and replace it if upstream has a better idea
[15:59] <apw> as i assume my other option is to rip the other thing you applied which causes the issue
[15:59] <apw> the cgn i think it was
[15:59] <hallyn> apw: http://paste.ubuntu.com/15181046/   <- does that mean anything to you?
[16:00] <hallyn> (it kinda blocks me when the server hosting all my work keeps hanging with crap like that - from 3.13 to 3.16 and now on 4.2)
[16:00] <hallyn> if it might be hw then i'll just $%)(*$%)($ switch
[16:01] <apw> i wonder if that could be the fuse cve i was just reviewing
[16:01] <hallyn> oh
[16:01] <sforshee> hallyn: do you hit the WARN_ON at the end of the cgroup_mount after your change?
[16:02] <hallyn> which warn_on?
[16:02] <sforshee> hallyn: was also thinking it might make sense to make kernfs to use sget_userns with init_user_ns always, haven't had time to really think it through yet though
[16:02] <sforshee> the one in the if (pinned_sb) block
[16:03] <hallyn> sforshee: i don't think so
[16:03] <apw> sforshee, you know fuse well, is fuse_fill_write_pages() used on the write or read path ?
[16:04] <apw> it ought to be write, but hey, nothing is clear in this world
[16:04] <hallyn> sforshee: we cannot have cgroup mount use init_user_ns always, if you end up doing the comparison for the sb.
[16:04] <sforshee> apw: write it would seem, invoked by fuse's write_iter callback
[16:05] <apw> sforshee, thanks, not that cve then hallyn 
[16:05] <apw> sforshee, that stack trace might be something you grok better than i: http://paste.ubuntu.com/15181046/
[16:05] <hallyn> apw: trying to decide whether to halt my world to have the people hosting the server check the hw for 10 hours
[16:06] <sforshee> apw: looks to me like the kernel is hung waiting for userspace to respond
[16:06] <apw> hallyn, if it is always that, it looks to my shallow knowledge of fuse that it is waiting on a userspace provider, and isn't interruptible
[16:06] <apw> sforshee, ok you see the same as me
[16:06] <hallyn> ok, thx :)
[16:06] <apw> hallyn, so i'd not be keen to blame h/w but whatever crap is mounted on fuse
[16:06] <apw> is that lxcfs :)
[16:07] <hallyn> hm, could be.
[16:07] <apw> sforshee, do we really hand things to uspace and wait in an uninterruptible way for it to respond, that sounds mad to me
[16:07] <hallyn> in that case the only thing i can think is that the kernel builds make oom happen killing it (bc nothing else was exercising fuse),
[16:07] <hallyn> except i've got 42g ram
[16:08] <apw> hallyn, then you'd have an oom in there
[16:09] <apw> hallyn, and it looks to start right there, with something not it before
[16:10] <sforshee> hallyn: going back to cgfs ... prior to my changes it was going to reuse an existing superblock. Now it still wants to but sget refuses because it's a different userns. Is that right?
[16:15] <sforshee> apw: I think fuse does wait in an uninterruptible way to respond. But there is some kind of abort connection sysfs node to break those waits.
[16:32] <sforshee> hallyn: it does seem to me that it will be possible to hit that WARN_ON(new_sb) if using kernfs_mount_ns. Does that represent some real problem?
[16:33] <sforshee> hallyn: also I'm not sure what you were getting at wrt using init_user_ns always
[16:33] <sforshee> in effect that's what's happening before we have s_user_ns. But like I said I need to think it through some more.
[16:38] <hallyn> sforshee: just to  make sure, you see why i need something like it right?
[16:39] <hallyn> looking at fs/sysfs/mount.c, i think i just need to grab/release the ns (i suppose sb release could be done lazily)
[16:39] <sforshee> hallyn: I think so. Previously you ended up reusing a superblock, but now sget returns EBUSY because you're in a different userns.
[16:39] <hallyn> right
[16:40] <sforshee> and by passing a namespace you force the kernfs test function to not match the old superblock
[16:40] <sforshee> but there seems to be something inherent to this code that expects to reuse the superblock in some cases
[16:40] <hallyn> is there?  i was wondering that but didn't see it,
[16:40] <sforshee> just look at pinned_sb
[16:41] <hallyn> is there a simple way we could re-use it?
[16:41] <hallyn> without hardcoding cgroupfs in sget_userns
[16:42] <sforshee> well, that's where the though of doing sget_userns(..., &init_user_ns) in kernfs_mount_ns came from
[16:42] <sforshee> *thought
[16:42] <sforshee> or I was also considering whether maybe that check only makes sense for device-backed mounts
[16:44] <hallyn> yeah it doesnt make sense for i.e. sysfs or proc, right?  you can't get another userns's mount there
[16:44] <hallyn> devpts?
[16:45] <hallyn> i wonder whether this will quietly break containers doing 'mount -t devpts' without -o newinstance
[16:46] <hallyn> well we pass file_system_type in, so i guess we *could* filter on cgroupfs;  how would we tell whether it's blockdev-backed in sget?
[16:47] <sforshee> you can't mount devpts from !init_user_ns without newinstance it appears
[16:47] <hallyn> oh, good
[16:48] <sforshee> there's a flag in the fs type that says whether or not it's device backed
[16:48] <sforshee> FS_REQUIRES_DEV
[16:49] <cristian_c> jsalisbury: hi
[16:49] <hallyn> so if (type->flags & FS_REQUIRES_DEV && user_ns != old->s_user_ns) ?
[16:49] <lamont> jsalisbury: finally read what you set up for me.  you have outdone expectations, thanks.
[16:50] <sforshee> hallyn: yeah, something like that. Still thinking though.
[16:50] <lamont> I'll smash through those sometime before the end of lunch today, expect an update in something like 2-4 hours.  I'm assuming that our final kernel is top-of-branch, with the identified commit reverted?
[16:53] <sforshee> hallyn: so essentially s_user_ns is used for 2 things. First is translating ids for the backing store, which doesn't apply to pseudo filesystems.
[16:54] <sforshee> the second is for privileges towards the superblock. For cgroups/sysfs do we really want root in the userns to have privileges towards the superblock?
[16:55] <sforshee> though if they aren't they can't remount
[16:58] <hallyn> sforshee, well in the case of cgfs cgroup_mount() guards remount, but i'm not sure about proc and sysfs
[16:58] <hallyn> i would assume so
[17:00] <hallyn> else that would have been an issue already since we allow mount on those
[17:01] <hallyn> sforshee: it seems we can assume it's safe if it's not dev-backed and it already has FS_USERNS_MOUNT
[17:02] <sforshee> hallyn: assume what's safe? Skipping the check in sget_userns?
[17:03] <hallyn> sforshee: yes
[17:05] <sforshee> hallyn: I think at minimum it's probably okay as an interim fix while we decide what the best fix is
[17:10] <xnox> bjf, what's the minimal amount of maas functionality required to run kernel-maas testing for s390x?
[17:11] <xnox> as far as i can tell that enablement is currently stagnating, and i'm wondering if that can be expedited somehow.
[17:13] <bjf> xnox, right now i only deal with bare-metal via maas 
[17:14] <xnox> bjf, and it needs to boot to ssh right?
[17:14] <bjf> xnox, yes
[17:15] <xnox> bjf, and you do use the powercycle functions in maas right? e.g. poweroff/on/reboot?
[17:15] <bjf> xnox, yup
[17:15] <xnox> bjf, ok, cool.
[17:17] <manjo> apw, I had trouble installing from the PPA due to dependency issues .. posted it to the bug
[17:17] <manjo> apw, https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1548120
[17:19] <apw> manjo, those deps are right
[17:19] <apw> manjo, they are internal deps making sure the bits are at the same version from the same source package
[17:21] <manjo> apw, hmm I have xenial-proposed enabled ..
[17:22] <manjo> apw, should I disable that 1st ? 
[17:22] <apw> manjo, nope shouldn't matter
[17:22] <apw> manjo, did you add the ppa or download the .debs ?
[17:22] <manjo> added your repo
[17:23] <manjo> etc/apt/sources.list.d/apw-ubuntu-initramfs-tools-test-xenial.list
[17:23] <manjo> using apt-add-repo
[17:23] <apw>     initramfs-tools-bin_0.122ubuntu5~rc1_amd64.deb (81.7 KiB)
[17:23] <apw>     initramfs-tools-core_0.122ubuntu5~rc1_all.deb (116.8 KiB)
[17:24] <apw>     initramfs-tools_0.122ubuntu5~rc1_all.deb (84.0 KiB)
[17:24] <apw> well that PPA has all three of those in there
[17:24] <apw> so your deps should be found from the PPA
[17:24] <manjo> initramfs-tools:
[17:24] <manjo>   Installed: (none)
[17:24] <manjo>   Candidate: 0.122ubuntu5~rc1
[17:24] <manjo>   Version table:
[17:24] <manjo>      0.122ubuntu5~rc1 500
[17:24] <manjo>         500 http://ppa.launchpad.net/apw/initramfs-tools-test/ubuntu xenial/main arm64 Packages
[17:24] <manjo>  Candidate: 0.122ubuntu5~rc1
[17:24] <manjo>   Version table:
[17:24] <manjo>      0.122ubuntu5~rc1 500
[17:24] <manjo>         500 http://ppa.launchpad.net/apw/initramfs-tools-test/ubuntu xenial/main arm64 Packages
[17:25] <manjo> apw, ah -bin comes from ports 
[17:25] <apw> OH its bloody arm64
[17:25] <manjo>   Candidate: 0.122ubuntu4
[17:25] <manjo>   Version table:
[17:25] <manjo>  *** 0.122ubuntu4 500
[17:25] <manjo>         500 http://ports.ubuntu.com/ubuntu-ports xenial-proposed/main arm64 Packages
[17:25] <apw> hang on
[17:25] <manjo> ok ☺ 
[17:25] <manjo> apw, welcome to my world 
[17:29] <apw> manjo, ok in about 10m there will be a ~rc2 in there built for arm64 as well
[17:30] <manjo> cool 
[17:30] <manjo> will post results to the bug 
[17:47] <sforshee> hallyn: are you already preparing a patch for the cgfs issue or should I go ahead and make one?
[17:48] <hallyn> sforshee: I'm looking at the code, but i'm not sure i'm doing it right
[17:48] <hallyn> (adding an extra helper fn which tries to get the logic right of whether we need to check the user_ns)
[17:49] <sforshee> sounds more complicated than what I was thinking ...
[17:49] <hallyn> lemme finish this up and pastebin it and you can tell me i'm wrong :)
[17:49] <sforshee> sounds good
[17:51] <hallyn> sforshee: http://paste.ubuntu.com/15181923/ ?
[17:54] <hallyn> seems to be building anyway
[17:55] <sforshee> hallyn: I think that should work, but the FS_USERNS_MOUNT seems unnecessary
[17:56] <hallyn> sforshee: so the idea of that is that any virtual fs which we cannot currently mount in a userns will be restricted to your own userns...
[17:56] <hallyn> agreed not sure if it's needed, and it may be reckless...  might break things...
[17:56] <hallyn> but i wasn't certain that it would be safe otherwise
[17:56] <hallyn> no shouldn't break things - it just allows what you had designed to work the way you meant it to for those filesystems :)
[17:56] <sforshee> but if you can't mount it in a user_ns then both namespaces are always init_user_ns anyway
[17:57] <hallyn> sforshee: oh, you don't relax the need for FS_USERNS_MOUNT, right
[17:57] <hallyn> so you're right, shouldn't be necessary so we don't need the helper really
[17:58] <hallyn> sforshee: so i'll just build and test http://paste.ubuntu.com/15181976/
[17:58] <hallyn> if you can do a cleaner version of that then even better
[17:58] <sforshee> hallyn: or even http://paste.ubuntu.com/15181975/
[18:00] <hallyn> right
[18:00] <hallyn> do you mind sending that in then? :)
[18:00] <sforshee> you want to test it first?
[18:01] <hallyn> is building my version enough or do you want me to do your verbatim version?
[18:03] <sforshee> hallyn: I'm probably going to build my version either way before I send it, so I can just give you that version to test
[18:03] <sforshee> I hate to send patches without build testing, typos can be easy to overlook
[18:10] <hallyn> builds taking forever
[18:22] <kamal> HerrAmeise, apw, bjf: So far, I'm unable to reproduce Herr's issue.  In a 15.10 amd64 qemu vm, I can successfully boot linux-image-4.2.0-{27,28,29,30}-generic with no apparent trouble.
[18:24] <bjf> kamal, thanks for looking at that
[18:35] <hallyn> sforshee...  and still building.  if you get a build, so long as you can 'lxc launch cxenial x1' and x1 boots (gets >3 processes and gets an ip address) then it's good
[18:37] <sforshee> hallyn: mine's still building too
[18:45] <lamont> jsalisbury: back to you.
[18:45] <lamont> and thanks muchly for making it less painful
[18:45] <jsalisbury> lamont, thanks, I'll take a look
[18:46] <lamont> it's also good to have my gut confirmed. :D
[18:47] <jsalisbury> lamont, I'll build a test kernel with that commit reverted, if it fixes the bug, I'll get in touch with the patch author
[18:49] <lamont> ack.  scream when
[18:52] <sforshee> hallyn: http://people.canonical.com/~sforshee/cgns/, updating my vm to test it now
[18:53] <lamont> jsalisbury: fwiw, http://global.shuttle.com/products/productsDetail?productId=1480
[18:56] <hallyn> sforshee: success!
[18:57] <hallyn> i think
[18:57] <hallyn> yeah.  dunno why there is a 'lxc' cgroup there, but ...
[19:00] <sforshee> hallyn: seems to work for me too
[19:00] <sforshee> I'll send it
[19:06] <sforshee> apw, hallyn: patch sent
[19:08] <tjaalton> umm yeah disregard the pull request, I'll send a new one
[19:11] <hallyn> sforshee: awesome, thx
[19:30] <cristian_c> jsalisbury: hi, again