=== CyberJacob is now known as zz_CyberJacob
[01:17] Hey guys, I can't seem to find the documentation on manually configuring an existing DHCP server. Lots of places point to http://maas.ubuntu.com/docs2.0/configure.html#manual-dhcp
[01:17] but it seems to no longer be on the page.
[01:43] nevermind, all good
[04:27] Bug #1583891 opened: clean up boot-resources before syncing images as well as after
=== mup_ is now known as mup
=== mup_ is now known as mup
=== zz_CyberJacob is now known as CyberJacob
[10:23] hello!
[10:23] My maas server can install ubuntu 16 on my nodes but when I choose ubuntu 14 it says kernel image not found
[10:23] and I have added the right images
[10:23] is this a bug or something?
[10:24] cause I am trying to install a local cluster and I need the 14.04 version
[13:14] Bug #1584047 opened: [1.9.3] maas-dhcp failure while/after upgrading to 1.9.3: apparmor_parser: Unable to replace "/usr/sbin/dhcpd". Permission denied; attempted to load a profile while confined?
[13:59] Bug #1584047 changed: [1.9.3] maas-dhcp failure while/after upgrading to 1.9.3: apparmor_parser: Unable to replace "/usr/sbin/dhcpd". Permission denied; attempted to load a profile while confined?
=== mup_ is now known as mup
[16:02] Hi. I'm getting "Failed commissioning" on a host. Can someone help me determine why it failed? I have a couple hosts with the same hardware spec that work okay. Here is the console log: http://pastebin.com/eeaMUvPs
[16:02] I can paste more relevant sections if required.
[16:04] I see this message.. but I'm not sure if it's the root cause or not.. and I don't know what it means: May 20 15:41:27 controller-3 [CLOUDINIT] handlers.py[WARNING]: failed posting event: start: modules-final/config-final-message: running config-final-message with frequency always
[16:05] Bug #1584120 opened: maas doesn't seem to like authenticating proxy URLs
[17:12] shewless, let me check
[17:27] kiko: awesome thanks. I have more logs if you want.. but the rest of the logs didn't really look meaningful
[17:28] shewless,
[17:28] May 20 15:41:26 controller-3 [CLOUDINIT] util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/user_data.sh [1]
[17:28] shewless, are you supplying your own user_data?
[17:29] if not, could you get that file into a pastebin?
[17:29] [1]#012Traceback (most recent call last):#012 File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 715, in runparts#012 subp(prefix + [exe_path], capture=False)#012 File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1704, in subp#012 cmd=args)#012cloudinit.util.ProcessExecutionError: Unexpected error while running command.#012Command: ['/var/lib/cloud/instance/scripts/user_data.sh']#012Exit code: 1#012Reason: -#
[17:29] dout: ''#012Stderr: ''
[17:29] smoser, any hint on the above?
[17:30] kiko: I am not supplying any user_data (at least not on purpose)
[17:31] hmmm!
[17:31] shewless, can you get the output of that user_data.sh?
[17:31] shewless, are you on 1.9.3?
[17:32] if so, you can pause commissioning and get access to that file to see what is running
[17:32] kiko, cloud-init is just reporting that the code maas fed it to run exited non-zero
[17:32] by selecting a special option
[17:32] I'm on 2.0.0
[17:32] I'm not sure how to get the contents of user_data.sh.
[17:32] shewless, you may be able to ssh into the instance.
[17:33] Is there a "default" login?
[17:33] shewless, using your registered ssh key with the ubuntu user
[17:33] oh... commissioning. i'm not sure what/whose ssh keys are in there.
[17:33] kiko, during commissioning ?
[17:33] smoser, I think we changed commissioning to add the keys if you select an option when you commission
[17:33] whose keys ?
[17:33] when you click on commission, there are three radiobuttons
[17:33] I assume the user who triggers it?
[17:33] err checkboxes
[17:34] right. when you explicitly commission i guess.
[17:34] isn't commissioning always explicit?
[17:34] * kiko <- clueless
[17:34] and i guess even when you just accept a node, then *someone* did the accept.
[17:34] oh, when you accept does it trigger commissioning automatically?
[17:34] i think so :). i might ask someone on the maas team to be sure though ;)
[17:35] but yeah, you should be able to ssh in, shewless. and then /var/log/cloud-init-output.log might have something useful in it.
[17:36] Is there an easy way for me to determine what IP was assigned to this box?
[17:36] Don't see it in DNS
[17:36] smoser, why don't we ship that back to maas by default?
[17:36] feels like we have everything needed to do so
[17:36] that's a great question
[17:37] it flashes by the console IIRC
[17:37] ooh.. I found it.. (just guessing at IPs around the range that had been assigned)
[17:38] last line in cloud-init-output.log is more of the same: 2016-05-20 15:41:27,375 - handlers.py[WARNING]: failed posting event: finish: modules-final: FAIL: running modules for final
[17:38] shewless, can you apt-get install pastebinit
[17:38] pastebinit < /var/log/cloud-init-output.log
[17:38] and
[17:38] pastebinit < /var/lib/cloud/instance/scripts/user_data.sh
[17:39] you can probably also just *run* that user_data.sh script
[17:39] it's going to do the same thing this time. and will probably fail similarly
[17:39] yeah
[17:39] what smoser said too :)
[17:39] you can add a set -x to the top if you want more verbosity
[17:39] lol okay. Just gotta get this puppy some internet access
[17:39] sh -x /var/lib/cloud/instance/scripts/user_data.sh 2>&1 | tee out.log
[17:39] pastebinit out.log
[17:40] shewless, well, you can just collect those over ssh and move them back and forth to you
[17:40] but, yeah. the internets make things easier
[17:41] cloud-init-output.log: http://paste.ubuntu.com/16528310
[17:42] user_data.sh: http://paste.ubuntu.com/16528340
[17:42] result of user_data.sh execution coming up
[17:43] BTW pastebinit is AWESOME
[17:43] :)
[17:43] no kidding yeah
[17:43] it is. and it's even inside 16.04 images by default
[17:44] yes I'm using 16.04 and noticed that it was already installed
[17:46] result of user_data.sh execution: http://paste.ubuntu.com/16528508
[17:46] some clock skew and HTTP request failures...
[17:48] hey. i have to work on some other things... kiko this is squarely maas code that is running here
[17:49] smoser, what do you sniff might be happening looking at that output?
[17:49] it is not impossible that clock skew is involved.
[17:49] request to http://172.20.0.1:5240/MAAS/metadata//2012-03-01/maas-commissioning-scripts failed. sleeping 1.: HTTP Error 401: OK
[17:49] you might have errors on the other end too
[17:49] shewless, how wrong is the system clock on that machine?
[17:50] shewless, 401 is unauthorized
[17:50] if I type "date" it's bang on.. not sure how to check
[17:50] I have commissioned other hosts so it seems weird if it would be an authorization problem
[17:51] kiko, i can't help without much more investigation really.
[17:52] smoser, that's fine
[17:52] thanks
[17:52] shewless, is this the only host that fails?
[17:52] kiko: yes
[17:52] shewless, if date is bang on then that's not the problem
[17:53] I have 4 hosts commissioned and deployed successfully. 2 of which are the same hardware spec as this one that is failing
[17:53] well, if it dhcp'd and got date from an ntp source, it might be fixed now.
[17:53] but had possibly been a problem.
[17:53] should I check the bios?
[17:53] i think if you reboot that system, it should set the hardware clock on the way down
[17:53] so that next time it might work
[17:54] smoser, ah, but our dhcp clients are brokenly not updating ntp, see bug in that spec I filed
[17:54] it's also possible clock is not related and ipmi stuff is failing.
[17:54] I think it's unrelated
[17:54] the real problem
[17:54] I think
[17:54] is this
[17:54] request to http://172.20.0.1:5240/MAAS/metadata//2012-03-01/maas-commissioning-scripts failed. sleeping 1.: HTTP Error 401: OK
[17:54] kiko, well, maybe no
[17:54] shewless, if you wget that URL does it fail?
[17:54] because maas might just be saying "go away, you're not commissioning now"
[17:54] that is a weird date btw
[17:54] that's an oauth'ed resource
[17:55] that's the api version of the maas metadata service
[17:55] it's not changed since then
[17:55] interesting
[17:57] if I wget that URL it does fail
[17:58] HTTP request sent, awaiting response... 401 UNAUTHORIZED Username/Password Authentication Failed.
[17:58] as user "ubuntu"
[17:59] that being said if I run the same wget on a successfully deployed system it fails in the same way.. not sure if that is relevant
[18:00] well
[18:00] it's interesting to say the least
[18:00] shewless, "ntpdate clock.ubuntu.com"?
[18:02] can't find host clock.ubuntu.com (couldn't ping it either)
[18:04] kiko: did you mean ntp.ubuntu.com?
[18:07] kiko: my maas server is the wrong timezone.. not sure if that matters
[18:07] would think the other nodes would have failed though
[18:07] shewless, timezone and clock have nothing to do with each other
[18:07] kk
[18:08] somewhat counterintuitively
[18:08] clock is always utc
[18:08] kiko: okay.. I fixed that anyways (change maas server to be UTC like all the other nodes)
[18:09] shewless, did ntpdate show a major update?
[18:09] or a minor one?
[18:09] kiko: 20 May 18:09:35 ntpdate[4678]: adjust time server 91.189.89.199 offset -0.007157 sec
[18:10] I think that's minor
[18:10] shewless, is the maas server also synced?
[18:10] i.e. ntpdate from the maas server?
[18:11] kiko: on the maas server: 20 May 18:10:56 ntpdate[30075]: adjust time server 91.189.89.199 offset 0.000407 sec
[18:11] shewless, okay, so clock skew is not the problem
[18:11] shewless, re-run the script and echo $?
[18:11] if it's zero, then this is a red herring
[18:12] kiko: brb. I will do that.. but when I run user_data.sh it does say "+ return 0"
[18:12] kiko: so does that mean it's a red herring?
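One caveat on the `echo $?` check: after `sh -x ... | tee out.log`, `$?` reports the exit status of `tee`, the last command in the pipeline, not of the script (unless the shell has a `pipefail` option and it is set). A minimal sketch that keeps the trace and still returns the script's own status; `run_traced` and its arguments are hypothetical names, not anything MAAS or cloud-init provides:

```shell
# Run a script under `sh -x`, save the trace to a log file, print it,
# and return the script's own exit status.
# run_traced is an illustrative helper, not part of MAAS or cloud-init.
run_traced() {
    script=$1
    log=$2
    # Redirect instead of piping through tee, so that $? on the next
    # line is the status of the script itself.
    sh -x "$script" >"$log" 2>&1
    status=$?
    cat "$log"
    return "$status"
}
```

With that, `run_traced /var/lib/cloud/instance/scripts/user_data.sh out.log; echo $?` would print the status cloud-init saw, and `out.log` can still go to pastebinit.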
[18:12] I /think/ so
[18:12] but something is failing on this machine
[18:12] boo.. what next? :)
[18:12] brb
[18:13] 2016-05-20 15:41:26,777 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/user_data.sh [1]
[18:13] that's the only hint
[18:13] it says it failed to run it
[18:14] it's very strange
[18:26] shewless, the fastest thing I have is to try and compare a working commissioning run with a failing one
[18:26] to see if it's a red herring or not
[19:33] Bug #1357086 opened: [2.0b5] Machine Finished Commissioning, it powered off, but power status show's "on"
[20:04] kiko: I'm attempting to commission another system (that's previously worked in the past). I'll let you know
[20:08] kiko: not sure if it's related but on the "failed" device I see an error on the console: "blk_update_request: I/O error, dev fd0, sector 0"
[20:08] I don't see that on my "working" device
[20:13] I saw that but found it odd
[20:13] why is it trying to write to /dev/fd0?
[20:14] I have no idea! That's why I ignored it at first.. there isn't a fd0 device
[20:14] so the user_data.sh execution looks pretty similar. HTTP Error 401 is still present
[20:15] on the "working" system
[20:15] anything else I should check? It looks like the "failed system" was VERY close to working in terms of logging
[20:16] is the only difference the fd0 warning?
[20:16] if so, see if there's a BIOS entry for floppy you can disable?
[20:16] that's the only difference I've noticed
[20:16] let me look at the bios
[20:17] http://askubuntu.com/questions/213512/buffer-i-o-error-on-device-fd0-logical-block-0-error
[20:17] when you run blkid it apparently triggers that
[20:20] the floppy was enabled in the bios. I disabled it and am trying to commission again... I'm not sure if it's enabled in the "working" node or not
[20:21] shewless, if it works, could you file a bug describing the failure to commission and the fd0 error and BIOS fix?
[20:21] kiko: I can. Where would I file the bug?
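The "compare a working run with a failing one" step can be partly automated. A rough sketch, assuming both console logs have been saved to files and that lines may carry kernel-style `[   12.345678]` timestamp prefixes (an assumption about the log format; `only_in_failing` is a hypothetical helper name):

```shell
# Print the lines that appear only in the failing log.
# Kernel-style "[   12.345678] " prefixes are stripped first so that
# timing differences alone do not show up as differences.
only_in_failing() {
    working=$1
    failing=$2
    tmpdir=$(mktemp -d) || return 1
    sed 's/^\[ *[0-9][0-9.]*] *//' "$working" | sort -u >"$tmpdir/working"
    sed 's/^\[ *[0-9][0-9.]*] *//' "$failing" | sort -u >"$tmpdir/failing"
    # comm -13 hides lines unique to the first file and lines common to
    # both, leaving only the lines unique to the second file.
    comm -13 "$tmpdir/working" "$tmpdir/failing"
    rm -rf "$tmpdir"
}
```

Run as `only_in_failing working-console.log failing-console.log`, this would surface a line such as the fd0 I/O error if it is present only on the failing host. Lines that differ only in hostnames, PIDs or other per-run values will also be listed, so the output still needs a human read.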
[20:21] launchpad.net/maas/+filebug
[20:21] okay.. against maas
[20:23] shewless, I'd be surprised if we care that much about blkid
[20:23] but..
[20:23] one hint is that blkid does not appear in that user script
[20:24] kiko: the commissioning works after the floppy drive was disabled in the BIOS... now that is really strange :)
[20:28] Bug is submitted: https://bugs.launchpad.net/maas/+bug/1584211
[20:29] kiko: thanks again.. all of my servers are commissioned now.. phew!
[20:29] shewless, it's a bug
[20:29] anyone know if I can use ubuntu-vm-builder to create a VM with 2 nics ?
[20:31] hey shewless, how's your install coming along?
[20:31] kiko: did I screw it up? I think I submitted it as a bug
[20:34] terje: I have maas working great. I'm currently exploring a set of ansible scripts that we have in house to deploy OpenStack. I took a look at the conjure-up tool but I'm not sure if it's right for me. I want to be able to install a "HA" controller setup and add things like LDAP authentication
[20:34] what version of MAAS are you using?
[20:34] 2.0.0
[20:34] on Ubuntu 16.04
[20:34] gotcha
[20:34] cool
[20:36] Bug #1584206 opened: [2.0b5] machine failed to deploy: insufficient free space
[20:36] Bug #1584211 opened: Commissioning fails when BIOS reports floppy drive, but there is none installed
[20:37] terje: if you have any hints for getting an HA setup using conjure-up I'd have another look :)
[20:39] so, I've had a hell of a time getting stuff working.
[20:40] :(
[20:40] I had a working 16.04 + maas 2.0 but never got openstack working there
[20:40] so I bagged it and went to 14.04 + maas 1.9.2
[20:41] oh that's no good. did 14.04 and maas 1.9.2 help?
[20:41] it's essentially unusable.
[20:41] but I do have kind of a cool setup
[20:41] what is?
[20:41] 1.9.2 I can't get working at all
[20:41] terje, hmm, I just deployed openstack with autopilot and maas at a customer site
[20:41] terje, why does 1.9.2 fail for you?
[20:42] on those versions, incidentally
[20:42] here's my setup
[20:42] kiko: does autopilot do controller HA?
[20:42] I have a physical server loaded with 16.04. I have deployed a VM here, 14.04.
[20:43] shewless, yes
[20:43] once the VM is deployed, I login and run this script
[20:43] https://github.com/jmcdice/ubuntu-os-cloud/blob/master/maas/maas-trusty-install.sh
[20:43] the maas-dhcp server never starts
[20:44] this is where I am stuck
[20:44] terje, okay so far..
[20:45] there is an error in /var/log/upstart/maas-dhcpd.log
[20:45] /var/lib/maas/dhcpd.conf does not exist. Aborting.
[20:45] maas-dhcpd stop/pre-start, process 676
[20:46] on 2.0.0 you need to add a subnet to the right fabric and then enable DHCP. I think 1.9.2 is a lot different though
[20:46] kiko: I tried to run "openstack-install" but it says "command not found"
[20:46] hints?
[20:46] I think that's being done
[20:47] in this script, see configure_private() https://github.com/jmcdice/ubuntu-os-cloud/blob/master/maas/maas-trusty-install.sh
[20:48] kiko: do you have a document you follow which will help me follow?
[20:48] terje, you know, the maas install is pretty straightforward. one sec
[20:49] kiko: I'm trying to make this a repeatable process. If you could have a look at the script above and let me know what I'm missing that would be really helpful.
[20:50] https://maas.ubuntu.com/docs1.9/install.html#pkg-install
[20:50] terje, gotcha. let me think.
[20:51] terje, there has to be some error in your install that we're ignoring
[20:51] kiko: I guess conjure-up is supposed to be used instead of autopilot in maas 2.0.0?
[20:51] or a race condition somewhere
[20:51] shewless, you can use both
[20:51] I'll start a fresh VM and start over
[20:51] terje, let me explain
[20:52] terje, apt-get install maas should leave you with everything running
[20:52] ok
[20:52] terje: good luck! let me know how it goes.
[20:52] terje, is that error, the dhcpd error, happening after the first install, or after you reconfigure?
[20:52] kiko: I'm out for the weekend. Thanks again for the help
[20:52] see ya shewless, probably monday
[20:52] :)
[20:52] thanks shewless -- sorry it was hard to discover that problem, but we'll get the bug nailed so others won't be inconvenienced
[20:53] kiko: I'm going to follow this doc precisely and get back to you
[20:53] just happy to have it solved (at least for me).. easy workaround :)
[20:56] terje, okay, but answer my question too ;-)
[20:57] after the first install
[20:57] shewless, it was funny that you found the only thing that couldn't possibly be problem but what :-)
[20:57] s/what/was/ damn
[20:57] terje, so if you comment out configure_maas, configure_private and import_images it still fails?
[20:57] if so it's a bug (possibly a race when installing)
[20:58] if I only run install_maas() dhcpd is not running.
[20:58] but I'll have to check and see if that error is there
[20:58] I'll have a new fresh trusty VM up here in a couple of minutes and I can start over.
[20:59] but that's not right.. dhcpd has to be running after apt-get install maas concludes
[20:59] if it isn't, it's a bug
[20:59] the install has to have failed somewhere
[20:59] are you checking the return value of apt-get install maas?
[20:59] no
[20:59] but I certainly can.
[20:59] it pulls in a ton of deps
[20:59] brb
[20:59] I bet it's failing
[20:59] k
[21:00] happy to share a screen if you like.. :)
[21:28] hey kiko, so yea
[21:28] after install_maas() I have the error
[21:28] /var/lib/maas/dhcpd.conf does not exist. Aborting.
[21:46] terje, so the question is why is the package install failing
[21:46] apt-get install maas should not fail
[21:47] if it's failing it's a bug
[21:47] we're detecting something wrong in your system
[22:07] ok, I'll run it again and capture the install log
[22:33] kiko: http://sprunge.us/OAcb
[22:33] maas install log
[22:36] return code was 0
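Since the install returned 0 while the dhcpd error still showed up, an exit-status check alone does not catch this case. A minimal sketch of an install step that checks both the command's status and a file it is expected to leave behind; `check_step` is a hypothetical helper, and whether `/var/lib/maas/dhcpd.conf` should exist straight after the package install is the open question in the discussion above, not something this sketch settles:

```shell
# Run a command, then confirm that a file it is expected to produce exists.
# check_step is an illustrative helper; nothing in it is specific to MAAS.
check_step() {
    expected=$1
    shift
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with status $status: $*" >&2
        return "$status"
    fi
    if [ ! -e "$expected" ]; then
        echo "command succeeded but $expected is missing" >&2
        return 1
    fi
    return 0
}
```

In an install script this could be called as `check_step /var/lib/maas/dhcpd.conf apt-get install -y maas`, so the script stops at the first step whose expected result is absent instead of carrying on to the configure functions.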