[05:54] <jfh> good morning
[07:49] <andrewc> jfh, good morning!
[07:49] <andrewc> jfh, sorry to hear that you're having difficulty getting online :-(
[07:50] <andrewc> jfh, as well as your mail, you could also try asking for assistance on the "canonical-sysadmin" channel on freenode...
[07:51] <jfh> good morning andrewc - well, I hope I can figure that out soon ... seems to be an issue with the Sign-in through Canonical ... let's see ...
[07:51] <jfh> good point - will try that, too
[09:59] <zachman> :D
[10:46] <cpaelzer> jfh: welcome to the dark side
[10:47] <jamespage> o/
[10:47] <cpaelzer> hi jamespage
[10:49] <cpaelzer> mihajlov: borntraeger: hi, I hope you have a good weekend soon, but we have a question regarding libvirt/kvm/openstack on s390
[10:49] <cpaelzer> I hope it is one of the former two so we can get a quick solution without asking the OS guys :-)
[10:50] <cpaelzer> jamespage here in the channel hit this bug just a few minutes ago
[10:50] <cpaelzer> https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1564831
[10:50] <borntraeger> cpaelzer, ?
[10:50] <cpaelzer> mihajlov: borntraeger: and I wondered what/why you don't hit that with z/KVM+OS
[10:50] <cpaelzer> the title is rather misleading IMHO
[10:50] <cpaelzer> libvirtd[21610]: this function is not supported by the connection driver:  cannot update guest CPU data for s390x architecture
[10:51] <cpaelzer> borntraeger: mihajlov: that is closer to where things might start to break
[10:51] <borntraeger> cpaelzer, I would assume that this is about the "not yet available" cpu model support
[10:52] <borntraeger> cpaelzer, mihajlov : but there was a workaround in libvirt for that
[10:52] <jamespage> http://libvirt.org/git/?p=libvirt.git;a=blobdiff;f=src/cpu/cpu_s390.c;h=23a7f9d8d38a00dc9c673d224f797cf8a17aa5d1;hp=f9d7e216aec847df321d7c7d3a050415ee8550fd;hb=59403018893cf2c4f9a6f5145e387cefbd44399a;hpb=b789db36ae1cb5a48986c3b9e3bfb64131367872
[10:52] <jamespage> looks relevant but we appear to have that in the libvirt version in xenial - just double checking
[10:54] <jamespage> yah - confirmed in 1.3.1
[10:55] <cpaelzer> jamespage: hmm - to be sure is that OS against libvirt/KVM ?
[10:55] <cpaelzer> or containers anywhere in between?
[10:55] <borntraeger> cpaelzer, jamespage , mihajlov its certainly a message from libvirt
[10:56] <borntraeger> cpaelzer, jamespage, mihajlov , but I have not seen it here
[10:56] <cpaelzer> jamespage: could you identify the exact (api) call it made to trigger that?
[10:57] <jamespage> cpaelzer, borntraeger: actually yes there is a container in the way here
[10:57] <jamespage> I think that's the cause of the problem...
[10:58] <jamespage> empty /proc/cpuinfo is not helping I suspect
[11:04] <cpaelzer> jamespage: do you want to give it a try without containers just with KVM ?
[11:06] <xnox> jamespage, is missing /proc/cpuinfo an lxc/lxd bug, given that it needs to emulate/whitelist/synthesise it or some such?
[11:06] <jamespage> xnox, yes I think so
[11:07] <jamespage> cpaelzer, not just yet
[11:08] <xnox> jamespage, i guess a manual provider can be mixed into the thing... ?
[11:11] <jamespage> xnox, figured out how to bind mount the hosts cpuinfo into the container...
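For reference, the bind-mount trick jamespage mentions can be done with an LXD disk device. This is a hypothetical sketch, not the exact commands from the log; the container name and the device name "cpuinfo" are placeholders:

```shell
# Hypothetical sketch: expose the host's /proc/cpuinfo inside an LXD container
# by adding it as a disk device (device name "cpuinfo" is arbitrary).
share_cpuinfo() {
    container="$1"
    lxc config device add "$container" cpuinfo disk \
        source=/proc/cpuinfo path=/proc/cpuinfo
}
# Usage (needs a running container), e.g.:
#   share_cpuinfo juju-machine-0
#   lxc exec juju-machine-0 -- head -n 3 /proc/cpuinfo
```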
[11:11] <xnox> ^_^
[11:11] <jamespage> xnox, getting a lot of "Failed to allocate directory watch: Too many open files"
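That "directory watch" error usually comes from exhausted inotify limits rather than file-descriptor limits. Note that the limit is accounted per UID, so containers sharing a UID with the host count against the same budget, which may explain seeing it in both places. A common workaround (the value 1024 is an example, not from the log):

```shell
# "Too many open files" from directory-watch allocation typically means the
# per-user inotify instance limit was hit. Inspect the current limits:
cat /proc/sys/fs/inotify/max_user_instances
cat /proc/sys/fs/inotify/max_user_watches
# To raise the instance limit (example value; requires root):
#   sysctl -w fs.inotify.max_user_instances=1024
```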
[11:17] <jamespage> xnox, cpaelzer: lack of /proc/cpuinfo is a problem for LXD, but does not appear to be the cause of this...
[11:20] <jamespage> xnox, cpaelzer: trying a trick to add the host machine to the deployment, but just hit the wall with the 2G root disk size...
[11:21] <xnox> jamespage, what's your host? you should be able to activate e.g. additional drives and add them to the vgroup.
[11:21] <xnox> jamespage, btw i can reboot s1lp7 and give it to you as well, as an additional resource it should have ~100GB large rootfs.
[11:22] <jamespage> xnox, my problem is that all of the control plane IP addresses are on the local bridge and not generally accessible...
[11:22] <jamespage> 2016-04-01 11:21:49 INFO install E: Write error - write (28: No space left on device)
[11:22] <jamespage> not unexpected...
[11:47] <cpaelzer> xnox: does d-i in guided partitioning try to create a swap disk as large as memory?
[11:47] <cpaelzer> xnox: on james's 40G-memory system it split the available ~41G disk into 38.x swap and 2G root
[11:48] <cpaelzer> xnox: s390 is the land of small disks and (sometimes) a lot of memory
[11:48] <cpaelzer> xnox: there should/could be a cap on the swap size
[11:50] <xnox> cpaelzer, jamespage: this is a classic d-i/partman bug. there are no caps, just multiples.
[11:50] <xnox> deactivate swap, remove it, enlarge partition, enlarge rootfs....
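xnox's recovery steps, sketched for a plain-partition layout (device names are hypothetical examples; an LVM setup would use lvremove/lvextend instead):

```shell
# Hypothetical sketch of "deactivate swap, remove it, enlarge partition,
# enlarge rootfs" for a layout with root on sda2 and swap on sda3.
reclaim_swap() {
    swapoff /dev/sda3                    # deactivate swap
    sed -i '\|/dev/sda3|d' /etc/fstab    # stop activating it at boot
    # ...delete sda3 and grow sda2 with parted/fdisk, then:
    resize2fs /dev/sda2                  # grow the root filesystem to fill it
}
```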
[11:50] <cpaelzer> xnox: already done that
[11:50] <cpaelzer> I just wanted to avoid the next one running into it
[11:51] <cpaelzer> xnox: classic means the bug exists and is open?
[11:51] <xnox> https://bugs.launchpad.net/ubuntu/+source/partman-auto/+bug/1032322
[11:51] <cpaelzer> great, thanks
[11:52] <xnox> first opened in 2012-08-02 but it has been around since forever, and typically reported by installer testers in e.g. qemu vms with like
[11:52] <xnox> "i gave it 16GB of ram and 8GB rootfs disk"
[11:53] <xnox> thinking about it.
[11:53] <xnox> cpaelzer, does it even make sense to have swap on lpar / z/VM?
[11:53] <cpaelzer> xnox: don't get this started in a public channel
[11:53] <cpaelzer> nooooo
[11:53] <cpaelzer> you did it
[11:53] <cpaelzer> this is like vim and emacs
[11:53] <xnox> lpars should be big enough, and z/VM can over-commit x2 RAM
[11:54] <cpaelzer> it can overcommit up to whatever you can accept performance wise and often I've seen 2-3x
[11:54] <xnox> but on z/VM only, not on LPAR, right?
[11:54] <cpaelzer> even on kvm it works reasonably most of the time although there could be some improvements
[11:54] <cpaelzer> LPAR is only partitioning, no overcommit for memory
[11:55] <xnox> thinking about it, maybe there should be a safe guard that e.g. swap cannot be more than 10% of total disk space, regardless of the sizing relative to RAM
[11:55] <cpaelzer> IMHO the Host should swap not the guests
[11:55] <xnox> or maybe the 200% should be from the smallest of (ram, disk) sizes
[11:55] <cpaelzer> but there are quite a lot of cases where that alone is not the truth
[11:55] <cpaelzer> I think I have seen some logic that groups into three categories by ram size
[11:56] <cpaelzer> ram <2G, try 2*ram
[11:56] <cpaelzer> else swap = ram size
[11:56] <cpaelzer> but
[11:56] <xnox> can one at all hibernate lpar & z/VM? because on servers, hibernate on emergency power shutdown is a poor man's choice for redundant power.
[11:56] <cpaelzer> never go over 64G
[11:56] <cpaelzer> and never go over x% of the disks
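The heuristic cpaelzer describes above can be sketched as follows. This is not actual partman-auto code, and the 10%-of-disk cap stands in for the unspecified "x% of the disks"; all sizes are in MB:

```shell
# Sketch of the swap-sizing heuristic from the discussion (illustrative only):
# 2x RAM for small machines, 1x RAM otherwise, capped at 64G and at a
# fraction of the disk.
swap_size_mb() {
    ram="$1"; disk="$2"
    if [ "$ram" -lt 2048 ]; then
        swap=$((ram * 2))          # ram < 2G: try 2*ram
    else
        swap="$ram"                # else swap = ram size
    fi
    cap_ram=$((64 * 1024))         # never go over 64G...
    cap_disk=$((disk / 10))        # ...or over x% (here 10%) of the disk
    [ "$swap" -gt "$cap_ram" ] && swap="$cap_ram"
    [ "$swap" -gt "$cap_disk" ] && swap="$cap_disk"
    echo "$swap"
}
swap_size_mb 40960 41984           # james's case: 40G RAM, ~41G disk
# prints 4198, i.e. ~4G capped by the disk instead of the 38G d-i chose
```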
[11:56] <xnox> why 64G? why not 65G? why not 63G?
[11:56] <cpaelzer> xnox: suspend and resume is implemented
[11:57] <cpaelzer> arbitrary choice, like the old 2x, why not 1.8x
[11:57] <xnox> suspend&resume is not hibernate&thaw. E.g. swap is not needed for suspend, as RAM remains powered/active.
[11:57] <cpaelzer> if science people are involved we could suggest a smooth scaling formula no one would understand :-)
[11:57] <xnox> 2x is reasonable to have a good chance at hibernate, when things have overcommitted ram.
[11:58] <xnox> because one needs to dump all of ram to swap to hibernate, plus whatever got overcommitted/spilled over to swap.
[11:58] <cpaelzer> ah you mean to disk
[11:58] <xnox> yes, hibernate.
[11:58] <cpaelzer> never cared too much about that, I'd have to check whether that works as well
[11:58] <cpaelzer> hca: ^^ ?
[11:58] <xnox> is there hibernate on lpar / z/vm -> if not, i'll just remove swap from default recipes full stop, and people can install swapfile package to add swap.
[11:59] <cpaelzer> wait for hca's answer
[11:59] <cpaelzer> but then power failure is so low end
[11:59] <cpaelzer> I mean most cpu calculations are done twice for quantum effects of random particles
[11:59] <cpaelzer> power failure - pffff
[11:59] <xnox> i think mainframe deployments have better power failure mode handling than other architectures.
[12:00] <cpaelzer> well - eventually they have way better handling, but then this is (sadly) one of the things the business/finance people cut costs on
[12:00] <cpaelzer> it works the same without that battery pack, well then ...
[12:01] <xnox> >_<
[12:12] <xnox> cpaelzer, reading all the bug reports it's like "high memory system -> too large swap" and "swap not large enough to hibernate"
[12:12] <xnox> the most reasonable comment is from superm1
[12:12] <xnox> https://bugs.launchpad.net/ubuntu/+source/partman-auto/+bug/576790
[12:13] <xnox> e.g. it should be possible to have a flag to essentially "skip swap" and calculate that for "impractical scenarios" e.g. RAM >> root disk (high memory system)
[12:13] <xnox> with a threshold as to what a high memory system is
[12:13] <xnox> and be able to preseed that key.
[12:15] <xnox> imho "high-memory" is anything where RAM >> 10% of total disk space
[12:15] <xnox> (specifically 10% of the /usr partition)
[12:15] <xnox> well... no.
[12:16]  * xnox needs to look at partman-auto to see if it has total disk size numbers available
[12:35] <cpaelzer> I'm ok with almost any limit, as the hard part is creating the infrastructure not defining the exact ratio/size of the limit
[13:59] <jamespage> cpaelzer, borntraeger: OK so after looping around and re-deploying with the compute node directly on an LPAR running Ubuntu Xenial, I still see the same problem
[14:24] <cpaelzer> jamespage: so you now run without Containers just KVM&Openstack
[14:24] <cpaelzer> ?
[14:24] <jamespage> cpaelzer, well the control plane bits are still in containers but the hypervisor is not
[14:24] <cpaelzer> ok
[14:25] <cpaelzer> didn't xnox already say it worked for him? maybe he has the workaround you need
[14:27] <xnox> not with latest nova generated libvirt config for our cloud image
[14:27] <jamespage> xnox, yeah - I suspect this is a break in nova's use of libvirt but not 100% sure yet...
[14:28] <xnox> so we will need to debug the generated libvirt config i guess.
[14:28] <xnox> jamespage, is the one you pasted on the bug report accurate?
[14:28] <xnox> most recent
[14:28] <jamespage> xnox, yes
[14:29] <xnox> cool, i'll give it a poke in a few.
[14:29] <xnox> need to finish a few things up, and have a call, and then will be able to look into it.
[15:46] <jamespage> xnox, having a punt at setting the cpu-mode flags for nova to host-passthrough
[15:46] <jamespage> xnox, we do the same for ppc64el
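For context, the host-passthrough setting jamespage mentions lives in nova's libvirt driver configuration. A sketch of what that looks like (verify the section and key against the docs for the deployed OpenStack release):

```ini
# /etc/nova/nova.conf on the compute node (hypothetical excerpt)
[libvirt]
cpu_mode = host-passthrough
```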
[15:47] <jamespage> xnox, have you hit this "too many open files" warning/error on s390x?  I think it's actually impacting my deployment
[15:47] <jamespage> I see it on the host and in containers as well..
[16:27] <mpavone> Hi, I have added a comment to https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1564831 regarding an instance not starting on s390x