[01:17] <tx> Hey guys, I can't seem to find the documentation on manually configuring an existing DHCP server. Lots of places point to http://maas.ubuntu.com/docs2.0/configure.html#manual-dhcp
[01:17] <tx> but it seems to no longer be on the page.
[01:43] <tx> nevermind, all good
[04:27] <mup> Bug #1583891 opened: clean up boot-resources before syncing images as well as after <MAAS:New> <https://launchpad.net/bugs/1583891>
[10:23] <ricos> hello!
[10:23] <ricos> My maas server can install ubuntu 16 on my nodes but when I choose ubuntu 14 it says kernel image not found
[10:23] <ricos> and I have added the right images
[10:23] <ricos> is this a bug or something?
[10:24] <ricos> cause I am trying to install a local cluster and I need the 14.04 version
[13:14] <mup> Bug #1584047 opened: [1.9.3] maas-dhcp failure while/after upgrading to 1.9.3: apparmor_parser: Unable to replace "/usr/sbin/dhcpd".  Permission denied; attempted to load a profile while confined? <oil> <MAAS:New> <https://launchpad.net/bugs/1584047>
[13:59] <mup> Bug #1584047 changed: [1.9.3] maas-dhcp failure while/after upgrading to 1.9.3: apparmor_parser: Unable to replace "/usr/sbin/dhcpd".  Permission denied; attempted to load a profile while confined? <oil> <MAAS:Won't Fix> <https://launchpad.net/bugs/1584047>
[16:02] <shewless> Hi. I'm getting "Failed commissioning" on a host. Can someone help me determine why it failed? I have a couple hosts with the same hardware spec that work okay.  Here is the console log: http://pastebin.com/eeaMUvPs
[16:02] <shewless> I can paste more relevant sections if required.
[16:04] <shewless> I see this message.. but I'm not sure if it's the root cause or not.. and I don't know what it means: May 20 15:41:27 controller-3 [CLOUDINIT] handlers.py[WARNING]: failed posting event: start: modules-final/config-final-message: running config-final-message with frequency always
[16:05] <mup> Bug #1584120 opened: maas doesn't seem to like authenticating proxy URLs <amd64> <apport-bug> <xenial> <maas (Ubuntu):New> <https://launchpad.net/bugs/1584120>
[17:12] <kiko> shewless, let me check
[17:27] <shewless> kiko: awesome thanks. I have more logs if  you want.. but the rest of the logs didn't really look meaningful
[17:28] <kiko> shewless,
[17:28] <kiko> May 20 15:41:26 controller-3 [CLOUDINIT] util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/user_data.sh [1]
[17:28] <kiko> shewless, are you supplying your own user_data?
[17:29] <kiko> if not, could you get that file into a pastebin?
[17:29] <kiko> [1]#012Traceback (most recent call last):#012  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 715, in runparts#012    subp(prefix + [exe_path], capture=False)#012  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1704, in subp#012    cmd=args)#012cloudinit.util.ProcessExecutionError: Unexpected error while running command.#012Command: ['/var/lib/cloud/instance/scripts/user_data.sh']#012Exit code: 1#012Reason: -#012Stdout: ''#012Stderr: ''
[17:29] <kiko> smoser, any hint on the above?
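The `#012` sequences in the traceback kiko pasted are syslog's escaped form of a newline (octal 012). A minimal sketch for turning such a line back into a readable traceback; the function name `decode_syslog_newlines` is made up for illustration and is not part of MAAS or cloud-init:

```shell
#!/bin/sh
# Hypothetical helper: replace every literal "#012" (syslog's escape for
# a newline, octal 012) with a real line break. Reads stdin, writes stdout.
decode_syslog_newlines() {
    awk '{ gsub(/#012/, "\n"); print }'
}

# Example use on a saved log line:
#   decode_syslog_newlines < traceback.txt
```

Piping the pasted line through this helper prints the `Exit code`, `Reason`, `Stdout` and `Stderr` fields on separate lines.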
[17:30] <shewless> kiko: I am not supplying any user_data (at least not on purpose)
[17:31] <kiko> hmmm!
[17:31] <kiko> shewless, can you get the output of that user_data.sh?
[17:31] <kiko> shewless, are you on 1.9.3?
[17:32] <kiko> if so, you can pause commissioning and get access to that file to see what is running
[17:32] <smoser> kiko, cloud-init is just reporting that the code maas fed it to run exited non-zero
[17:32] <kiko> by selecting a special option
[17:32] <shewless> I'm on 2.0.0
[17:32] <shewless> I'm not sure how to get the contents of user_data.sh.
[17:32] <smoser> shewless, you may be able to ssh into the instance.
[17:33] <shewless> Is there a "default" login?
[17:33] <kiko> shewless, using your registered ssh key with the ubuntu user
[17:33] <smoser> oh... commissioning. i'm not sure what/whose ssh keys are in there.
[17:33] <smoser> kiko, during commissioning?
[17:33] <kiko> smoser, I think we changed commissioning to add the keys if you select an option when you commission
[17:33] <smoser> whose keys?
[17:33] <kiko> when you click on commission, there are three radiobuttons
[17:33] <kiko> I assume the user who triggers it?
[17:33] <kiko> err checkboxes
[17:34] <smoser> right. when you explicitly commission i guess.
[17:34] <kiko> isn't commissioning always explicit?
[17:34]  * kiko <- clueless
[17:34] <smoser> and i guess even when you just accept a node, then *someone* did the accept.
[17:34] <kiko> oh, when you accept does it trigger commissioning automatically?
[17:34] <smoser> i think so :). i might ask someone on the maas team to be sure though ;)
[17:35] <smoser> but yeah, you should be able to ssh in, shewless. and then /var/log/cloud-init-output.log might have something useful in it.
[17:36] <shewless> Is there an easy way for me to determine what IP was assigned to this box?
[17:36] <shewless> Don't see it in DNS
[17:36] <kiko> smoser, why don't we ship that back to maas by default?
[17:36] <kiko> feels like we have everything needed to do so
[17:36] <kiko> that's a great question
[17:37] <kiko> it flashes by the console IIRC
[17:37] <shewless> ooh.. I found it.. (just guessing at IPs around the range that had been assigned)
[17:38] <shewless> last line in cloud-init-output.log is more of the same: 2016-05-20 15:41:27,375 - handlers.py[WARNING]: failed posting event: finish: modules-final: FAIL: running modules for final
[17:38] <kiko> shewless, can you apt-get install pastebinit
[17:38] <kiko> pastebinit < /var/log/cloud-init-output.log
[17:38] <kiko> and
[17:38] <kiko> pastebinit < /var/lib/cloud/instance/scripts/user_data.sh
[17:39] <smoser> you can  probably also just *run* that user_data.sh script
[17:39] <smoser> its going to do the same thing this time. and will probably fail similarly
[17:39] <kiko> yeah
[17:39] <kiko> what smoser said too :)
[17:39] <kiko> you can add a set -x to the top if you want more verbosity
[17:39] <shewless> lol okay. Just gotta get this puppy some internet access
[17:39] <smoser>  sh -x /var/lib/cloud/instance/scripts/user_data.sh 2>&1 | tee out.log
[17:39] <smoser> pastebinit out.log
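One thing to keep in mind with the `sh -x ... | tee out.log` pipeline: in a plain POSIX shell, `$?` after a pipeline is the exit status of the last command (`tee`), not of the script. A sketch that keeps both the log and the script's own status; `run_logged` is a hypothetical helper name, and it writes the log first and prints it afterwards instead of streaming through `tee`:

```shell
#!/bin/sh
# Hypothetical helper: run_logged SCRIPT LOGFILE
# Traces SCRIPT with "sh -x", stores stdout+stderr in LOGFILE, prints the
# log, and returns the script's own exit status (not that of tee or cat).
run_logged() {
    status=0
    sh -x "$1" > "$2" 2>&1 || status=$?
    cat "$2"
    return "$status"
}

# Example use:
#   run_logged /var/lib/cloud/instance/scripts/user_data.sh out.log
#   echo "exit status: $?"
```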
[17:40] <smoser> shewless, well, you can just collect those over ssh and move them back and forth to you
[17:40] <smoser> but, yeah. the internets make things easier
[17:41] <shewless> cloud-init-output.log: http://paste.ubuntu.com/16528310
[17:42] <shewless> user_data.sh: http://paste.ubuntu.com/16528340
[17:42] <shewless> result of user_data.sh execution coming up
[17:43] <shewless> BTW pastebinit is AWESOME
[17:43] <smoser> :)
[17:43] <kiko> no kidding yeah
[17:43] <smoser> it is. and its even inside 16.04 images by default
[17:44] <shewless> yes I'm using 16.04 and noticed that it was already installed
[17:46] <shewless> result of user_data.sh execution: http://paste.ubuntu.com/16528508
[17:46] <shewless> some clock skew and HTTP request failures...
[17:48] <smoser> hey. i have to work on some other things... kiko this is squarely maas code that is running here
[17:49] <kiko> smoser, what do you sniff might be happening looking at that output?
[17:49] <smoser> it is not impossible that clock skew is involved.
[17:49] <kiko> request to http://172.20.0.1:5240/MAAS/metadata//2012-03-01/maas-commissioning-scripts failed. sleeping 1.: HTTP Error 401: OK
[17:49] <smoser> you might have errors on the other end too
[17:49] <kiko> shewless, how wrong is the system clock on that machine?
[17:49] <kiko> shewless, 401 is unauthorized
[17:50] <shewless> if I type "date" it's bang on.. not sure how to check
[17:50] <shewless> I have commissioned other hosts so it seems weird if it would be an authorization problem
[17:51] <smoser> kiko, i cant help without much  more investigation really.
[17:52] <kiko> smoser, that's fine
[17:52] <kiko> thanks
[17:52] <kiko> shewless, is this the only host that fails?
[17:52] <shewless> kiko: yes
[17:52] <kiko> shewless, if date is bang on then that's not the problem
[17:53] <shewless> I have 4 hosts commissioned and deployed successfully.  2 of which are the same hardware spec as this one that is failing
[17:53] <smoser> well, if it dhcp'd and got date from an ntp source, it might be fixed now.
[17:53] <smoser> but had possibly been a problem.
[17:53] <shewless> should I check the bios?
[17:53] <smoser> i think if you reboot that system, it should set the hardware clock on the way down
[17:53] <smoser> so that next time it might work
[17:54] <kiko> smoser, ah, but our dhcp clients are brokenly not updating ntp, see bug in that spec I filed
[17:54] <smoser> its also possible clock is not related and ipmi stuff is failing.
[17:54] <kiko> I think it's unrelated
[17:54] <kiko> the real problem
[17:54] <kiko> I think
[17:54] <kiko> is this
[17:54] <kiko> request to http://172.20.0.1:5240/MAAS/metadata//2012-03-01/maas-commissioning-scripts failed. sleeping 1.: HTTP Error 401: OK
[17:54] <smoser> kiko, well, maybe no
[17:54] <kiko> shewless, if you wget that URL does it fail?
[17:54] <smoser> because maas might just be saying "go away, you're not commissioning now"
[17:54] <kiko> that is a weird date btw
[17:54] <smoser> thats an oauthed' resource
[17:55] <smoser> thats the api version of the maas metadata service
[17:55] <smoser> its not changed since then
[17:55] <kiko> interesting
[17:57] <shewless> if I wget that URL it does fail
[17:58] <shewless> HTTP request sent, awaiting response... 401 UNAUTHORIZED  Username/Password Authentication Failed.
[17:58] <shewless> as user "ubuntu"
[17:59] <shewless> that being said if I run the same wget on a successfully deployed system it fails in the same way.. not sure if that is relevant
[18:00] <kiko> well
[18:00] <kiko> it's interesting to say the least
[18:00] <kiko> shewless, "ntpdate clock.ubuntu.com"?
[18:02] <shewless> can't find host clock.ubuntu.com (couldn't ping it either)
[18:04] <shewless> kiko: did you mean ntp.ubuntu.com?
[18:07] <shewless> kiko: my maas server is the wrong timezone.. not sure if that matters
[18:07] <shewless> would think the other nodes would have failed though
[18:07] <kiko> shewless, timezone and clock have nothing to do with each other
[18:07] <shewless> kk
[18:08] <kiko> somewhat counterintuitively
[18:08] <kiko> clock is always utc
[18:08] <shewless> kiko: okay.. I fixed that anyways (change maas server to be UTC like all the other nodes)
[18:09] <kiko> shewless, did ntpdate show a major update?
[18:09] <kiko> or a minor one?
[18:09] <shewless> kiko: 20 May 18:09:35 ntpdate[4678]: adjust time server 91.189.89.199 offset -0.007157 sec
[18:10] <shewless> I think that's minor
[18:10] <kiko> shewless, is the maas server also synced?
[18:10] <kiko> i.e. ntpdate from the maas server?
[18:11] <shewless> kiko: on the maas server: 20 May 18:10:56 ntpdate[30075]: adjust time server 91.189.89.199 offset 0.000407 sec
[18:11] <kiko> shewless, okay, so clock skew is not the problem
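The check done by eye here (is the `ntpdate` offset "minor"?) can be sketched as a small helper. The name `offset_is_minor` and the default threshold of 1 second are assumptions for illustration only; nothing in this log states what skew MAAS actually tolerates:

```shell
#!/bin/sh
# Hypothetical helper: offset_is_minor "NTPDATE_LINE" [MAX_SECONDS]
# Exit 0 if the absolute offset is <= MAX_SECONDS (default 1, an assumed
# threshold), 1 if it is larger, 2 if no "offset" field was found.
offset_is_minor() {
    printf '%s\n' "$1" | awk -v max="${2:-1}" '
        {
            # the value follows the word "offset" in ntpdate output
            for (i = 1; i < NF; i++)
                if ($i == "offset") { off = $(i + 1); found = 1 }
        }
        END {
            if (!found) exit 2
            if (off < 0) off = -off
            exit ((off <= max) ? 0 : 1)
        }'
}

# Example use:
#   offset_is_minor "$(ntpdate ntp.ubuntu.com)" && echo "clock looks fine"
```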
[18:11] <kiko> shewless, re-run the script and echo $?
[18:11] <kiko> if it's zero, then this is a red herring
[18:12] <shewless> kiko: brb. I will do that.. but when I run user_data.sh it does say "+ return 0"
[18:12] <shewless> kiko: so does that mean it's a red herring?
[18:12] <kiko> I /think/ so
[18:12] <kiko> but something is failing on this machine
[18:12] <shewless> boo.. what next? :)
[18:12] <shewless> brb
[18:13] <kiko> 2016-05-20 15:41:26,777 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/user_data.sh [1]
[18:13] <kiko> that's the only hint
[18:13] <kiko> it says it failed to run it
[18:14] <kiko> it's very strange
[18:26] <kiko> shewless, the fastest thing I have is to try and compare a working commissioning run with a failing one
[18:26] <kiko> to see if it's a red herring or not
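Comparing a working run with a failing one is easier once the timestamps are stripped, since otherwise every line differs. A rough sketch, assuming both logs use the `YYYY-MM-DD HH:MM:SS,mmm - ` prefix seen in the cloud-init lines quoted in this log; the helper names are made up:

```shell
#!/bin/sh
# Hypothetical helpers for diffing two cloud-init logs without timestamps.

# Drop a leading "2016-05-20 15:41:26,777 - " style prefix from each line.
strip_timestamps() {
    sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:,]+ - //'
}

# compare_runs GOOD_LOG BAD_LOG
# Prints a diff of the two logs; returns diff's status (0 same, 1 different).
compare_runs() {
    a=$(mktemp)
    b=$(mktemp)
    strip_timestamps < "$1" > "$a"
    strip_timestamps < "$2" > "$b"
    status=0
    diff "$a" "$b" || status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Example use:
#   compare_runs working-node.log failing-node.log
```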
[19:33] <mup> Bug #1357086 opened: [2.0b5] Machine Finished Commissioning, it powered off, but power status show's "on" <MAAS:Confirmed> <https://launchpad.net/bugs/1357086>
[20:04] <shewless> kiko: I'm attempting to commission another system (that's previously worked in the past). I'll let you know
[20:08] <shewless> kiko: not sure if it's related but on the "failed" device I see an error on the console: "blk_update_request: I/O error, dev fd0, sector 0"
[20:08] <shewless> I don't see that on my "working" device
[20:13] <kiko> I saw that but found it odd
[20:13] <kiko> why is it trying to write to /dev/fd0?
[20:14] <shewless> I have no idea! That's why I ignored it at first.. there isn't a fd0 device
[20:14] <shewless> so the user_data.sh execution looks pretty similar. HTTP Error 401 is still present
[20:15] <shewless> on the "working" system
[20:15] <shewless> anything else I should check? It looks like the "failed system" was VERY close to working in terms of logging
[20:16] <kiko> is the only difference the fd0 warning?
[20:16] <kiko> if so, see if there's a BIOS entry for floppy you can disable?
[20:16] <shewless> that's the only difference I've noticed
[20:16] <shewless> let me look at the bios
[20:17] <kiko> http://askubuntu.com/questions/213512/buffer-i-o-error-on-device-fd0-logical-block-0-error
[20:17] <kiko> when you run blkid it apparently triggers that
[20:20] <shewless> the floppy was enabled in the bios. I disabled it and am trying to commission again... I'm not sure if it's enabled in the "working" node or not
[20:21] <kiko> shewless, if it works, could you file a bug describing the failure to commission and the fd0 error and BIOS fix?
[20:21] <shewless> kiko: I can. Where would I file the bug?
[20:21] <kiko> launchpad.net/maas/+filebug
[20:21] <shewless> okay.. against maas
[20:23] <kiko> shewless, I'd be surprised if we care that much about blkid
[20:23] <kiko> but..
[20:23] <kiko> one hint is that blkid does not appear in that user script
[20:24] <shewless> kiko: the commissioning works after the floppy drive was disabled in the BIOS... now that is really strange :)
[20:28] <shewless> Bug is submitted: https://bugs.launchpad.net/maas/+bug/1584211
[20:29] <shewless> kiko: thanks again.. all of my servers are commissioned now.. phew!
[20:29] <kiko> shewless, it's a bug
[20:29] <terje> anyone know if I can use ubuntu-vm-builder to create a VM with 2 nics ?
[20:31] <terje> hey shewless, how's your install coming along?
[20:31] <shewless> kiko: did I screw it up? I think I submitted it as a bug
[20:34] <shewless> terje: I have maas working great. I'm currently exploring a set of ansible scripts that we have in house to deploy openstack.  I took a look at the conjure-up tool but I'm not sure if it's right for me. I want to be able to install a "HA" controller setup and add things like LDAP authentication
[20:34] <terje> what version of MAAS are you using?
[20:34] <shewless> 2.0.0
[20:34] <shewless> on Ubuntu 16.04
[20:34] <terje> gotcha
[20:34] <terje> cool
[20:36] <mup> Bug #1584206 opened: [2.0b5] machine failed to deploy: insufficient free space <MAAS:New> <https://launchpad.net/bugs/1584206>
[20:36] <mup> Bug #1584211 opened: Commissioning fails when BIOS reports floppy drive, but there is none installed <MAAS:New> <https://launchpad.net/bugs/1584211>
[20:37] <shewless> terje: if you have any hints for getting an HA setup using conjure-up I'd have another look :)
[20:39] <terje> so, I've had a hell of a time getting stuff working.
[20:40] <terje> :(
[20:40] <terje> I had a working 16.04 + maas 2.0 but never got openstack working there
[20:40] <terje> so I bagged it and went to 14.04 + maas 1.9.2
[20:41] <shewless> oh that's no good. did 14.04 and maas 1.9.2 help?
[20:41] <terje> it's essentially unusable.
[20:41] <terje> but I do have kind of a cool setup
[20:41] <shewless> what is?
[20:41] <terje> 1.9.2 I can't get working at all
[20:41] <kiko> terje, hmm, I just deployed openstack with autopilot and maas at a customer site
[20:41] <kiko> terje, why does 1.9.2 fail for you?
[20:42] <kiko> on those versions, incidentally
[20:42] <terje> here's my setup
[20:42] <shewless> kiko: does autopilot do controller HA?
[20:42] <terje> I have a physical server loaded with 16.04. I have deployed a VM here, 14.04.
[20:43] <kiko> shewless, yes
[20:43] <terje> once the VM is deployed, I login and run this script
[20:43] <terje> https://github.com/jmcdice/ubuntu-os-cloud/blob/master/maas/maas-trusty-install.sh
[20:43] <terje> the maas-dhcp server never starts
[20:44] <terje> this is where I am stuck
[20:44] <kiko> terje, okay so far..
[20:45] <terje> there is an error in /var/log/upstart/maas-dhcpd.log
[20:45] <terje> /var/lib/maas/dhcpd.conf does not exist.  Aborting.
[20:45] <terje> maas-dhcpd stop/pre-start, process 676
[20:46] <shewless> on 2.0.0 you need to add a subnet to the right fabric and then enable DHCP. I think 1.9.2 is a lot different though
[20:46] <shewless> kiko: I tried to run "openstack-install" but it says "command not found"
[20:46] <shewless> hints?
[20:46] <terje> I think that's being done
[20:47] <terje> in this script, see configure_private() https://github.com/jmcdice/ubuntu-os-cloud/blob/master/maas/maas-trusty-install.sh
[20:48] <terje> kiko: do you have a document you follow which will help me follow?
[20:48] <kiko> terje, you know, the maas install is pretty straightforward. one sec
[20:49] <terje> kiko: I'm trying to make this a repeatable process. If you could have a look at the script above and let me know what I'm missing that would be really helpful.
[20:50] <kiko> https://maas.ubuntu.com/docs1.9/install.html#pkg-install
[20:50] <kiko> terje, gotcha. let me think.
[20:51] <kiko> terje, there has to be some error in your install that we're ignoring
[20:51] <shewless> kiko: I guess conjure-up is supposed to be used instead of autopilot in maas 2.0.0?
[20:51] <kiko> or a race condition somewhere
[20:51] <kiko> shewless, you can use both
[20:51] <terje> I'll start a fresh VM and start over
[20:51] <kiko> terje, let me explain
[20:52] <kiko> terje, apt-get install maas should leave you with everything running
[20:52] <terje> ok
[20:52] <shewless> terje: good luck! let me know how it goes.
[20:52] <kiko> terje, is that error, the dhcpd error, happening after the first install, or after you reconfigure?
[20:52] <shewless> kiko: I'm out for the weekend. Thanks again for the help
[20:52] <terje> see ya shewless, probably monday
[20:52] <terje> :)
[20:52] <kiko> thanks shewless -- sorry it was hard to discover that problem, but we'll get the bug nailed so others won't be inconvenienced
[20:53] <terje> kiko: I'm going to follow this doc precisely and get back to you
[20:53] <shewless> just happy to have it solved (at least for me).. easy workaround :)
[20:56] <kiko> terje, okay, but answer my question too ;-)
[20:57] <terje> after the first install
[20:57] <kiko> shewless, it was funny that you found the only thing that couldn't possibly be problem but what :-)
[20:57] <kiko> s/what/was/ damn
[20:57] <kiko> terje, so if you comment out configure_maas, configure_private and import_images it still fails?
[20:57] <kiko> if so it's a bug (possibly a race when installing)
[20:58] <terje> if I only run install_maas() dhcpd is not running.
[20:58] <terje> but I'll have to check and see if that error is there
[20:58] <terje> I'll have a new fresh trusty VM up here in a couple of minutes and I can start over.
[20:59] <kiko> but that's not right.. dhcpd has to be running after apt-get install maas concludes
[20:59] <kiko> if it isn't, it's a bug
[20:59] <kiko> the install has to have failed somewhere
[20:59] <kiko> are you checking the return value of apt-get install maas?
[20:59] <terje> no
[20:59] <terje> but I certainly can.
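Checking the return value in an install script can be sketched like this. `run_step` is a hypothetical wrapper, not something taken from terje's script, and the `apt-get` invocation in the comment is only an example of how it might be called:

```shell
#!/bin/sh
# Hypothetical helper: run_step NAME COMMAND [ARGS...]
# Runs COMMAND, reports success or failure, and returns COMMAND's status so
# the calling script can stop instead of carrying on after a failed step.
run_step() {
    name=$1
    shift
    if "$@"; then
        echo "ok: $name"
    else
        status=$?
        echo "FAILED: $name (exit $status)" >&2
        return "$status"
    fi
}

# Example use in an install script:
#   run_step "install maas" sudo apt-get install -y maas || exit 1
#   run_step "dhcpd.conf present" test -f /var/lib/maas/dhcpd.conf || exit 1
```

A zero status from the package install only says the package manager finished cleanly; the second example line checks separately for the file that the `maas-dhcpd` log complains about.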
[20:59] <terje> it pulls in a ton of deps
[20:59] <kiko> brb
[20:59] <kiko> I bet it's failing
[20:59] <terje> k
[21:00] <terje> happy to share a screen if you like.. :)
[21:28] <terje> hey kiko, so yea
[21:28] <terje> after install_maas() I have the error
[21:28] <terje> /var/lib/maas/dhcpd.conf does not exist.  Aborting.
[21:46] <kiko> terje, so the question is why is the package install failing
[21:46] <kiko> apt-get install maas should not fail
[21:47] <kiko> if it's failing it's a bug
[21:47] <kiko> we're detecting something wrong in your system
[22:07] <terje> ok, I'll run it again and capture the install log
[22:33] <terje> kiko: http://sprunge.us/OAcb
[22:33] <terje> maas install log
[22:36] <terje> return code was 0