=== poster is now known as Poster
=== notdaniel is now known as notdaniel_
[09:01] jamespage: no backporting by cloud-archive of new DPDK further back than Xenial, right?
[09:10] cpaelzer: correct
[09:10] any earlier pockets are closed now
[09:10] good, thanks jamespage
[09:10] there was a fix to recent dpdk to build on older kernels (<=3.10, I think)
[09:10] no need to rush it in then
=== Benjamin_ is now known as nmjnb
=== petevg is now known as petevg_afk
=== vamiry_ is now known as vamiry
[12:59] jamespage: coreycb: ICYMI nova 15.0.1 is out with high-impact fixes for ocata, pls update
=== rxc is now known as Guest14564
[14:04] frickler: on my list for as soon as it appeared
[14:04] frickler: working on that next
=== poster is now known as Poster
[15:28] any good tools to understand why I'm having poor network performance on a server?
[15:29] There are four NICs, and throughput is choppy, even on a NIC which is unused for anything else, so I'm guessing that it's something with the scheduler.
[15:29] The server is a little busy, load avg of ~16
[15:29] but there are 24 cores
=== BrianBlaze420 is now known as BrianbBlaze420
=== BrianbBlaze420 is now known as BrianBlaze420
[17:14] Hi there. I'm having trouble executing a script at startup.
[17:14] I have added it to cron as a @reboot job; not working.
[17:14] Can anyone shed some light on how to add it to rc.local?
[17:29] stgraber: Hey Stephane, I'm hitting this bug (https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1576341); do you know what its status is? I basically can't run iscsid inside the container (failed to mlockall, exiting...). I added the lxc.aa_profile=unconfined as suggested in the bug and restarted the container, but no success, I still get the same error
[17:29] Launchpad bug 1576341 in systemd (Ubuntu) "fails in lxd container" [High,Confirmed]
[17:30] stgraber: any idea?
[17:33] erlon: did you verify the running container is unconfined?
[17:34] erlon: what version of open-iscsi?
[17:34] nacc: the latest, 2.0.873
[17:34] nacc: hmm, one point worth mentioning: this container is running inside a KVM machine
[17:35] nacc: not sure if that can be a problem
[17:35] erlon: specific version -- is this on ubuntu 16.04?
[17:35] open-iscsi 2.0.873+git0.3b4b4500-14ubuntu3.3 amd64
[17:35] nacc: yes
[17:36] erlon: right, it doesn't have the fix, afaict
[17:36] erlon: only zesty does
[17:36] erlon: (well, the open-iscsi fix)
[17:36] erlon: you can try making the same change to the service file locally
[17:36] nacc: reALLY?
[17:37] oops
[17:37] caps
[17:37] erlon: i see no tasks for xenial, and the last comment is for the zesty version (per changelog and rmadison)
[17:38] nacc: you mean configuring the iscsid.service in the container?
[17:38] erlon: let me see if i can find the change verbatim to show you
[17:38] nacc: ok
[17:40] https://www.irccloud.com/pastebin/kAgDBIav/
[17:40] erlon: http://paste.ubuntu.com/24183873/
[17:41] erlon: interesting -- i don't know enough about why mlockall is failing then; it would seem that if it were unconfined, it shouldn't have issues with it
[17:42] nacc: as it seems, that fix just blocks iscsi from running inside confined containers, right?
[17:43] nacc: ConditionVirtualization=!container, i.e. 'if this is a container, don't run'
[17:44] erlon: i think it prevents it from running in containers, period?
[17:44] nacc: yes,
[17:45] hallyn: --^
[17:45] hallyn: should that change be SRU'd?
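For reference, a minimal sketch of the two steps discussed above: checking that the container really is unconfined, and mirroring the zesty service-file change locally via a drop-in. The unit name and drop-in path are assumptions; the ConditionVirtualization line is quoted from the discussion, and the paste links hold the verbatim change.

    # inside the container: the AppArmor label PID 1 runs under
    cat /proc/1/attr/current    # expect "unconfined" if lxc.aa_profile took effect

    # mirror the zesty change with a local drop-in (unit name assumed):
    mkdir -p /etc/systemd/system/iscsid.service.d
    cat > /etc/systemd/system/iscsid.service.d/override.conf <<'EOF'
    [Unit]
    ConditionVirtualization=!container
    EOF
    systemctl daemon-reload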
[17:46] nacc: but checking here, I have run iscsid inside a container running on a baremetal machine, and it does not hit the same problem
[17:46] erlon: same version of the container (16.04)?
[17:47] nacc: I believe yes, checking
[17:48] nacc: very same version
[17:48] iscsid -f --version
[17:48] iscsid version 2.0-873
[17:49] erlon: the iscsid version isn't very helpful
[17:49] need the package versions
[17:49] `apt policy open-iscsi`
[17:49] dpkg -l | grep iscsi
[17:49] ii open-iscsi 2.0.873+git0.3b4b4500-14ubuntu3.3 amd64 iSCSI initiator tools
[17:49] https://www.irccloud.com/pastebin/4VZlxypy/
[17:50] https://www.irccloud.com/pastebin/BYzncru8/
[17:51] nacc: ^ this last one is from inside the virtualized container
[17:51] erlon: ack
[17:52] erlon: well, i'm a bit stumped myself. It does seem like hallyn's fix in 17.04 is to prevent open-iscsi.service from running in containers at all. But I'm not sure why that's acceptable -- i guess you would need to do it manually
[17:52] I'm having a horrible time on 16.04 with a cups server
[17:52] nacc: hmm, but I can't run it even manually
[17:52] when run with cupsd -f, it works fine
[17:53] nacc:
[17:53] erlon: right, that's probably why it was disabled
[17:53] root@juju-5efd81-1-lxd-1:/home/ubuntu# iscsid -f
[17:53] iscsid: failed to mlockall, exiting...
[17:53] but when run as a service, or manually, using cupsd -l
[17:53] it times out after 60 seconds and exits
[17:53] erlon: as you don't actually have CAP_IPC_LOCK
[17:53] nacc: I mean calling the binary directly
[17:54] erlon: right, the systemd service just calls the binary
[17:54] erlon: so rather than have it fail all the time in containers, hallyn disabled it
[17:54] nacc: but inside the baremetal container it runs fine
[17:54] frickler: just published the 15.0.1 update for nova through to zesty; that will auto-backport in the next hour, and then I'll promote and test to the UCA (that might be AM tomorrow)
[17:54] erlon: are the two hosts (baremetal and KVM) the same Ubuntu?
[17:55] if you want it early, it will be in ppa:ubuntu-cloud-archive/ocata-staging
[17:55] nacc: I believe yes, just double checking
[17:56] patdk-wk: had the same issue, went back to 14.04, couldn't figure it out
[17:56] erlon: and same version of lxd in both?
[17:56] it's odd
[17:56] erlon: and kernel (since there are hwe kernels now)
[17:56] only happening on this one server; on the other one it works fine
[17:56] but I think that has something to do with avahi stuff keeping it alive from hitting the 60sec timeout
[17:57] patdk-wk: yeah, I think so, because one workaround was to add a "ping" cronjob
[17:58] ya, as long as I keep hitting the web interface to configure and do setup, it keeps running
[17:58] as soon as I'm done, it's dead
[17:58] nacc: HOST -> "Ubuntu 16.04.2 LTS" -> KVM -> "Ubuntu 16.04.1 LTS" -> LXC -> "Ubuntu 16.04.2 LTS"
[17:58] I did some tracing in systemd thinking it was a socket problem, and found some stuff was indeed missing, but couldn't figure out what needed to be fixed
[17:58] patdk-wk: but imho the problem is there, with systemd and the "on-demand" stuff
[17:58] erlon: so the KVM and baremetal systems are different?
[17:59] erlon: rather than using codenames, it's probably better to pastebin things like `uname -a` and `apt policy lxd`
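A compact version of the comparison nacc is asking for; run this on each layer (baremetal host, KVM guest, and inside each container) and diff the output. The exact package list is an assumption:

    uname -a                  # kernel version (hwe stacks can differ between 16.04.1 and .2)
    lsb_release -rd           # point release
    apt policy lxd open-iscsi # installed/candidate package versions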
[17:59] hey DammitJim
[17:59] DammitJim: how did that work out?
[18:00] epoll_wait(4, [], 65536, 1000) = 0
[18:00] epoll_wait(4, [], 65536, 60000) = 0
[18:00] epoll_ctl(4, EPOLL_CTL_DEL, 9, 0x7ffe5adefb70) = 0
[18:00] epoll_ctl(4, EPOLL_CTL_DEL, 10, 0x7ffe5adefb70) = 0
[18:00] close(9) = 0
[18:00] close(10) = 0
[18:00] nacc: hold on, just correcting myself: by KVM I mean the lxc running under it
[18:00] right after the return from that 60000ms epoll_wait, cupsd starts to shut down
[18:00] a socket ping should keep it alive :( but annoying
[18:00] yeah
[18:01] nacc: so yes, the LXC container where iscsid works (16.04.1) is different from the LXC running under KVM (16.04.2)
[18:01] doesn't happen on my laptop, but that has the full gui installed on it, not just a cups server
[18:02] nacc: it just does not make sense -- the later version should be the one that works, not the 16.04.1
[18:03] ● cups.service - CUPS Scheduler
[18:03]    Loaded: loaded (/lib/systemd/system/cups.service; enabled; vendor preset: enabled)
[18:03]    Active: inactive (dead) since Wed 2017-03-15 13:32:05 EDT; 31min ago
[18:03]      Docs: man:cupsd(8)
[18:03]   Process: 13103 ExecStart=/usr/sbin/cupsd -l (code=exited, status=0/SUCCESS)
[18:03]  Main PID: 13103 (code=exited, status=0/SUCCESS)
[18:04] is there a difference between the zfs.ko that comes with the kernel and the zfs-dkms package? both are the same version, afaict.
[18:04] I increased the virtual hard disk.. how do I realize the new size in the VM? pvs still shows the old size
[18:04] ducasse, yes
[18:04] I increased the size of the virtual disk..
[18:06] drab, I told one of the other managers and he said we'll talk about it
[18:06] thanks for following up
[18:09] ah.. it took a while for pvresize to take effect.. now I see the larger size
[18:10] axisys, good luck with resizing
[18:10] DammitJim: done
[18:10] I still have to document how I do that, to standardize it in our company
[18:10] awesome
[18:11] patdk-wk: i assume i should use the zfs.ko in the kernel, as the zfs metapackage doesn't drag in zfs-dkms?
[18:11] either is fine
[18:11] for ubuntu kernels you shouldn't need zfs-dkms, iiuc
[18:11] dkms is for when you use a kernel that does not have it
[18:11] my /dev/sdb was a PV.. so once I changed the size of the disk in vmware, I had to run pvresize.. and it took a little time to reflect
[18:11] patdk-wk: ok, thanks
[18:11] or if you want to use a different version than the one in the kernel
[18:12] i see. perfect :)
[18:12] jamespage: great, thx
[18:27] erlon: so i'm not sure where we stand -- it's sort of hard to tell which version is which (as 16.04.1 and 16.04.2 are the same version of lxc and open-iscsi; the difference is potentially in the kernel/X stacks)
[20:26] drab, this is not really the *right* fix
[20:27] but editing the systemd cups.service unit file to change the -l (run on demand) to -f has fixed the issue
[20:33] patdk-wk: oh, good point, better than the ping or sticking with 14.04, I guess
[20:33] thanks
[20:33] it seems like -F works too
[20:33] testing that, as it seems like a better solution
[20:34] at least till the real reason is solved; this on-demand stuff, for a server, heh
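Looking back at the LVM exchange at 18:04: a sketch of that resize flow, assuming the PV sits directly on /dev/sdb as stated. The SCSI rescan step is an assumption about why pvresize appeared to lag behind the hypervisor-side resize:

    # after growing the virtual disk on the hypervisor side:
    echo 1 > /sys/class/block/sdb/device/rescan   # ask the kernel to re-read the device size
    pvresize /dev/sdb                             # grow the PV to match the new device size
    pvs                                           # should now report the larger size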
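And a sketch of the cups stop-gap described just above, done as a drop-in (path assumed) rather than editing the shipped unit; -f keeps cupsd in the foreground instead of the exit-on-idle behaviour of -l:

    # /etc/systemd/system/cups.service.d/override.conf
    [Service]
    ExecStart=
    ExecStart=/usr/sbin/cupsd -f

    # then: systemctl daemon-reload && systemctl restart cups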
[20:34] while on the topic, I don't know your setup, but you wouldn't happen to have quite a few different systems there and a reasonable way to centralize printing?
[20:34] I wrote to the cups ML, but no response for some reason
[20:34] the client.conf thing they say is deprecated: "you must run a local cups connecting to the remote one"
[20:35] I get that in principle, but in practice it's a mess ime
[20:35] that is how I do it
[20:35] ok, maybe I'm doing something wrong; here's the big problem for me
[20:35] I set up a cups printer that advertises airprint and google print
[20:35] set up cups-browsed
[20:36] printer goes down; the queue is still up, because cups doesn't actually check the printer
[20:36] user goes to machineA, tries to print, doesn't work, tries again, doesn't work, a third time, doesn't... now machineA has 3 jobs in the local queue
[20:36] then the user thinks "I know, there's a problem with machine A", goes to machine B, and does the same thing
[20:37] now I have two machines with 3 jobs each of the same thing in the queue
[20:37] I go fix the printer and bam, 6 copies of the same thing get printed
[20:37] set up your cups server with cups-browsed and BrowseLocalProtocols cups
[20:37] then on the client machines install both again, but with the cups-browsed setting BrowseRemoteProtocols cups
[20:37] it will automatically add the cups server's printers locally
[20:37] with client.conf that doesn't happen, because jobs only exist on the remote cups, which I can easily monitor to spot the duplicated jobs and remove them before I restart the printer
[20:38] the problem isn't finding/configuring the printer
[20:38] it's job management
[20:38] specifically stuck jobs
[20:38] when something is wrong with the printer, local queues ime are a nightmare
[20:38] because I have no clue which machines they are on, and purging them is a pain
[20:38] yes, but the job won't be on the local machines then
[20:40] mmmh, I hit that situation quite a bit, which is why we switched to client.conf
[20:40] but maybe there was something else wrong
[20:41] dunno, I never have to worry about queues myself
[20:41] I mainly use that feature for my laptop and macs
[20:41] they show up on the lan; the printer just appears and is usable
[20:41] no local config setup or anything needed
[20:41] I can't imagine it would get stuck in a local queue when doing that
[20:42] as it should always be able to go to the remote queue
[20:42] or the printer would *vanish* from the local machine
[20:43] I see
[20:43] err, above I said "printer has a problem"; the problem case is really "remote cups has a problem"
[20:43] which is when the job is cached locally
[20:44] in the other case you're right, the job would be gone from the local queue even if not printed
[20:44] I'll take a look at the cups-browsed thing you mentioned, thanks
[20:44] if the printer did vanish, that'd stop the problem altogether
[21:13] ok, not sure -F works :(
[21:19] ok, found the real reason
[21:19] it's because I have no shared printers configured
[21:21] adding this, as in the bug report I found, solves the issue
[21:21] ListenStream=631
[21:21] bug #1598300
[21:21] bug 1598300 in cups (Ubuntu Xenial) "CUPS web interface stops responding after a while" [Undecided,Fix committed] https://launchpad.net/bugs/1598300
[21:33] ya, that looks like it fixed it up
[21:33] upgrading to the cups -proposed package
[21:33] that is damned annoying
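A sketch of that workaround from bug 1598300, applied as a socket drop-in (the file path is an assumption): keeping ListenStream=631 in cups.socket means systemd holds the web-interface port open and re-spawns on-demand cupsd instead of letting it stay dead:

    # /etc/systemd/system/cups.socket.d/override.conf
    [Socket]
    ListenStream=631

    # then: systemctl daemon-reload && systemctl restart cups.socket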
[21:40] patdk-wk: good catch. the upstream patch seemed unresolved though, i.e. they don't accept the ListenStream=631
[21:40] different issue, kind of
[21:41] the default in ubuntu is to have the web interface enabled
[21:41] then cups should never exit
[21:41] the patch fixes that
[21:41] so then it becomes a non-issue
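For reference, the server/client split patdk-wk describes at 20:37 boils down to roughly this; a sketch assuming cups-browsed's standard config file, not a verified setup:

    # on the print server, in /etc/cups/cups-browsed.conf:
    BrowseLocalProtocols cups    # announce this server's queues on the network

    # on each client, in /etc/cups/cups-browsed.conf:
    BrowseRemoteProtocols cups   # auto-create local queues for the server's printers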