/srv/irclogs.ubuntu.com/2016/05/20/#maas.txt

=== CyberJacob is now known as zz_CyberJacob
[01:17] <tx> Hey guys, I can't seem to find the documentation on manually configuring an existing DHCP server. Lots of places point to http://maas.ubuntu.com/docs2.0/configure.html#manual-dhcp
[01:17] <tx> but it seems to no longer be on the page.
[01:43] <tx> nevermind, all good
[04:27] <mup> Bug #1583891 opened: clean up boot-resources before syncing images as well as after <MAAS:New> <https://launchpad.net/bugs/1583891>
=== mup_ is now known as mup
=== mup_ is now known as mup
=== zz_CyberJacob is now known as CyberJacob
[10:23] <ricos> hello!
[10:23] <ricos> My maas server can install ubuntu 16 on my nodes but when I choose ubuntu 14 it says kernel image not found
[10:23] <ricos> and I have added the right images
[10:23] <ricos> is this a bug or something?
[10:24] <ricos> cause I am trying to install a local cluster and I need the 14.04 version
[13:14] <mup> Bug #1584047 opened: [1.9.3] maas-dhcp failure while/after upgrading to 1.9.3: apparmor_parser: Unable to replace "/usr/sbin/dhcpd".  Permission denied; attempted to load a profile while confined? <oil> <MAAS:New> <https://launchpad.net/bugs/1584047>
[13:59] <mup> Bug #1584047 changed: [1.9.3] maas-dhcp failure while/after upgrading to 1.9.3: apparmor_parser: Unable to replace "/usr/sbin/dhcpd".  Permission denied; attempted to load a profile while confined? <oil> <MAAS:Won't Fix> <https://launchpad.net/bugs/1584047>
=== mup_ is now known as mup
[16:02] <shewless> Hi. I'm getting "Failed commissioning" on a host. Can someone help me determine why it failed? I have a couple hosts with the same hardware spec that work okay. Here is the console log: http://pastebin.com/eeaMUvPs
[16:02] <shewless> I can paste more relevant sections if required.
[16:04] <shewless> I see this message.. but I'm not sure if it's the root cause or not.. and I don't know what it means: May 20 15:41:27 controller-3 [CLOUDINIT] handlers.py[WARNING]: failed posting event: start: modules-final/config-final-message: running config-final-message with frequency always
[16:05] <mup> Bug #1584120 opened: maas doesn't seem to like authenticating proxy URLs <amd64> <apport-bug> <xenial> <maas (Ubuntu):New> <https://launchpad.net/bugs/1584120>
[17:12] <kiko> shewless, let me check
[17:27] <shewless> kiko: awesome thanks. I have more logs if you want.. but the rest of the logs didn't really look meaningful
[17:28] <kiko> shewless,
[17:28] <kiko> May 20 15:41:26 controller-3 [CLOUDINIT] util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/user_data.sh [1]
[17:28] <kiko> shewless, are you supplying your own user_data?
[17:29] <kiko> if not, could you get that file into a pastebin?
[17:29] <kiko> [1]#012Traceback (most recent call last):#012  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 715, in runparts#012    subp(prefix + [exe_path], capture=False)#012  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1704, in subp#012    cmd=args)#012cloudinit.util.ProcessExecutionError: Unexpected error while running command.#012Command: ['/var/lib/cloud/instance/scripts/user_data.sh']#012Exit code: 1#012Reason: -#
[17:29] <kiko> dout: ''#012Stderr: ''
[17:29] <kiko> smoser, any hint on the above?
[17:30] <shewless> kiko: I am not supplying any user_data (at least not on purpose)
[17:31] <kiko> hmmm!
[17:31] <kiko> shewless, can you get the output of that user_data.sh?
[17:31] <kiko> shewless, are you on 1.9.3?
[17:32] <kiko> if so, you can pause commissioning and get access to that file to see what is running
[17:32] <smoser> kiko, cloud-init is just reporting that the code maas fed it to run exited non-zero
[17:32] <kiko> by selecting a special option
[17:32] <shewless> I'm on 2.0.0
[17:32] <shewless> I'm not sure how to get the contents of user_data.sh.
[17:32] <smoser> shewless, you may be able to ssh into the instance.
[17:33] <shewless> Is there a "default" login?
[17:33] <kiko> shewless, using your registered ssh key with the ubuntu user
[17:33] <smoser> oh... commissioning. i'm not sure what/whose ssh keys are in there.
[17:33] <smoser> kiko, during commissioning?
[17:33] <kiko> smoser, I think we changed commissioning to add the keys if you select an option when you commission
[17:33] <smoser> whose keys?
[17:33] <kiko> when you click on commission, there are three radiobuttons
[17:33] <kiko> I assume the user who triggers it?
[17:33] <kiko> err checkboxes
[17:34] <smoser> right. when you explicitly commission i guess.
[17:34] <kiko> isn't commissioning always explicit?
[17:34] * kiko <- clueless
[17:34] <smoser> and i guess even when you just accept a node, then *someone* did the accept.
[17:34] <kiko> oh, when you accept does it trigger commissioning automatically?
[17:34] <smoser> i think so :). i might ask someone on the maas team to be sure though ;)
[17:35] <smoser> but yeah, you should be able to ssh in, shewless. and then /var/log/cloud-init-output.log might have something useful in it.
[17:36] <shewless> Is there an easy way for me to determine what IP was assigned to this box?
[17:36] <shewless> Don't see it in DNS
[17:36] <kiko> smoser, why don't we ship that back to maas by default?
[17:36] <kiko> feels like we have everything needed to do so
[17:36] <kiko> that's a great question
[17:37] <kiko> it flashes by on the console IIRC
[17:37] <shewless> ooh.. I found it.. (just guessing at IPs around the range that had been assigned)
[17:38] <shewless> last line in cloud-init-output.log is more of the same: 2016-05-20 15:41:27,375 - handlers.py[WARNING]: failed posting event: finish: modules-final: FAIL: running modules for final
[17:38] <kiko> shewless, can you apt-get install pastebinit
[17:38] <kiko> pastebinit < /var/log/cloud-init-output.log
[17:38] <kiko> and
[17:38] <kiko> pastebinit < /var/lib/cloud/instance/scripts/user_data.sh
[17:39] <smoser> you can probably also just *run* that user_data.sh script
[17:39] <smoser> it's going to do the same thing this time. and will probably fail similarly
[17:39] <kiko> yeah
[17:39] <kiko> what smoser said too :)
[17:39] <kiko> you can add a set -x to the top if you want more verbosity
[17:39] <shewless> lol okay. Just gotta get this puppy some internet access
[17:39] <smoser> sh -x /var/lib/cloud/instance/scripts/user_data.sh 2>&1 | tee out.log
[17:39] <smoser> pastebinit out.log
[17:40] <smoser> shewless, well, you can just collect those over ssh and move them back and forth to you
[17:40] <smoser> but, yeah. the internets make things easier
[17:41] <shewless> cloud-init-output.log: http://paste.ubuntu.com/16528310
[17:42] <shewless> user_data.sh: http://paste.ubuntu.com/16528340
[17:42] <shewless> result of user_data.sh execution coming up
[17:43] <shewless> BTW pastebinit is AWESOME
[17:43] <smoser> :)
[17:43] <kiko> no kidding yeah
[17:43] <smoser> it is. and it's even inside 16.04 images by default
[17:44] <shewless> yes I'm using 16.04 and noticed that it was already installed
[17:46] <shewless> result of user_data.sh execution: http://paste.ubuntu.com/16528508
[17:46] <shewless> some clock skew and HTTP request failures...
[17:48] <smoser> hey. i have to work on some other things... kiko this is squarely maas code that is running here
[17:49] <kiko> smoser, what do you sniff might be happening looking at that output?
[17:49] <smoser> it is not impossible that clock skew is involved.
[17:49] <kiko> request to http://172.20.0.1:5240/MAAS/metadata//2012-03-01/maas-commissioning-scripts failed. sleeping 1.: HTTP Error 401: OK
[17:49] <smoser> you might have errors on the other end too
[17:49] <kiko> shewless, how wrong is the system clock on that machine?
[17:50] <kiko> shewless, 401 is unauthorized
[17:50] <shewless> if I type "date" it's bang on.. not sure how to check
[17:50] <shewless> I have commissioned other hosts so it seems weird if it would be an authorization problem
[17:51] <smoser> kiko, i can't help without much more investigation really.
[17:52] <kiko> smoser, that's fine
[17:52] <kiko> thanks
[17:52] <kiko> shewless, is this the only host that fails?
[17:52] <shewless> kiko: yes
[17:52] <kiko> shewless, if date is bang on then that's not the problem
[17:53] <shewless> I have 4 hosts commissioned and deployed successfully. 2 of which are the same hardware spec as this one that is failing
[17:53] <smoser> well, if it dhcp'd and got date from an ntp source, it might be fixed now.
[17:53] <smoser> but had possibly been a problem.
[17:53] <shewless> should I check the bios?
[17:53] <smoser> i think if you reboot that system, it should set the hardware clock on the way down
[17:53] <smoser> so that next time it might work
[17:54] <kiko> smoser, ah, but our dhcp clients are brokenly not updating ntp, see bug in that spec I filed
[17:54] <smoser> it's also possible clock is not related and ipmi stuff is failing.
[17:54] <kiko> I think it's unrelated
[17:54] <kiko> the real problem
[17:54] <kiko> I think
[17:54] <kiko> is this
[17:54] <kiko> <kiko> request to http://172.20.0.1:5240/MAAS/metadata//2012-03-01/maas-commissioning-scripts failed. sleeping 1.: HTTP Error 401: OK
[17:54] <smoser> kiko, well, maybe no
[17:54] <kiko> shewless, if you wget that URL does it fail?
[17:54] <smoser> because maas might just be saying "go away, you're not commissioning now"
[17:54] <kiko> that is a weird date btw
[17:54] <smoser> that's an oauth'ed resource
[17:55] <smoser> that's the api version of the maas metadata service
[17:55] <smoser> it's not changed since then
[17:55] <kiko> interesting
[17:57] <shewless> if I wget that URL it does fail
[17:58] <shewless> HTTP request sent, awaiting response... 401 UNAUTHORIZED  Username/Password Authentication Failed.
[17:58] <shewless> as user "ubuntu"
[17:59] <shewless> that being said if I run the same wget on a successfully deployed system it fails in the same way.. not sure if that is relevant
[18:00] <kiko> well
[18:00] <kiko> it's interesting to say the least
[18:00] <kiko> shewless, "ntpdate clock.ubuntu.com"?
[18:02] <shewless> can't find host clock.ubuntu.com (couldn't ping it either)
[18:04] <shewless> kiko: did you mean ntp.ubuntu.com?
[18:07] <shewless> kiko: my maas server is the wrong timezone.. not sure if that matters
[18:07] <shewless> would think the other nodes would have failed though
[18:07] <kiko> shewless, timezone and clock have nothing to do with each other
[18:07] <shewless> kk
[18:08] <kiko> somewhat counterintuitively
[18:08] <kiko> clock is always utc
[18:08] <shewless> kiko: okay.. I fixed that anyways (change maas server to be UTC like all the other nodes)
[18:09] <kiko> shewless, did ntpdate show a major update?
[18:09] <kiko> or a minor one?
[18:09] <shewless> kiko: 20 May 18:09:35 ntpdate[4678]: adjust time server 91.189.89.199 offset -0.007157 sec
[18:10] <shewless> I think that's minor
[18:10] <kiko> shewless, is the maas server also synced?
[18:10] <kiko> i.e. ntpdate from the maas server?
[18:11] <shewless> kiko: on the maas server: 20 May 18:10:56 ntpdate[30075]: adjust time server 91.189.89.199 offset 0.000407 sec
[18:11] <kiko> shewless, okay, so clock skew is not the problem
[18:11] <kiko> shewless, re-run the script and echo $?
[18:11] <kiko> if it's zero, then this is a red herring
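A minimal sketch of the "re-run it and echo $?" step, assuming a POSIX shell on the node. The wrapper name `run_and_report` is hypothetical; the script path in the usage comment is the one quoted earlier in the log.

```shell
# Hypothetical wrapper: run any command, print its exit status, and pass
# that status back to the caller.
run_and_report() {
    rc=0
    "$@" || rc=$?
    echo "exit status: $rc"
    return "$rc"
}

# Example (path taken from the log):
# run_and_report sh -x /var/lib/cloud/instance/scripts/user_data.sh
```

A printed status of 0 would point to the cloud-init warning being a red herring, as kiko suggests; anything else means the script itself is failing.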
[18:12] <shewless> kiko: brb. I will do that.. but when I run user_data.sh it does say "+ return 0"
[18:12] <shewless> kiko: so does that mean it's a red herring?
[18:12] <kiko> I /think/ so
[18:12] <kiko> but something is failing on this machine
[18:12] <shewless> boo.. what next? :)
[18:12] <shewless> brb
[18:13] <kiko> 2016-05-20 15:41:26,777 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/user_data.sh [1]
[18:13] <kiko> that's the only hint
[18:13] <kiko> it says it failed to run it
[18:14] <kiko> it's very strange
[18:26] <kiko> shewless, the fastest thing I have is to try and compare a working commissioning run with a failing one
[18:26] <kiko> to see if it's a red herring or not
[19:33] <mup> Bug #1357086 opened: [2.0b5] Machine Finished Commissioning, it powered off, but power status show's "on" <MAAS:Confirmed> <https://launchpad.net/bugs/1357086>
[20:04] <shewless> kiko: I'm attempting to commission another system (that's previously worked in the past). I'll let you know
[20:08] <shewless> kiko: not sure if it's related but on the "failed" device I see an error on the console: "blk_update_request: I/O error, dev fd0, sector 0"
[20:08] <shewless> I don't see that on my "working" device
[20:13] <kiko> I saw that but found it odd
[20:13] <kiko> why is it trying to write to /dev/fd0?
[20:14] <shewless> I have no idea! That's why I ignored it at first.. there isn't a fd0 device
[20:14] <shewless> so the user_data.sh execution looks pretty similar. HTTP Error 401 is still present
[20:15] <shewless> on the "working" system
[20:15] <shewless> anything else I should check? It looks like the "failed system" was VERY close to working in terms of logging
[20:16] <kiko> is the only difference the fd0 warning?
[20:16] <kiko> if so, see if there's a BIOS entry for floppy you can disable?
[20:16] <shewless> that's the only difference I've noticed
[20:16] <shewless> let me look at the bios
[20:17] <kiko> http://askubuntu.com/questions/213512/buffer-i-o-error-on-device-fd0-logical-block-0-error
[20:17] <kiko> when you run blkid it apparently triggers that
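One quick way to compare the failing and working nodes for this symptom is to filter their kernel logs for floppy I/O errors. The helper below is a sketch; the name `fd0_errors` is made up here, and the pattern is only meant to match the console message quoted above and similar ones.

```shell
# Hypothetical helper: read kernel log text on stdin and print only the
# I/O error lines that mention the floppy device fd0.
fd0_errors() {
    grep -E 'I/O error.*fd0'
}

# Example on a node (dmesg may need root):
# dmesg | fd0_errors
```

No output on the working node and one or more lines on the failing node would match what shewless saw on the consoles.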
[20:20] <shewless> the floppy was enabled in the bios. I disabled it and am trying to commission again... I'm not sure if it's enabled in the "working" node or not
[20:21] <kiko> shewless, if it works, could you file a bug describing the failure to commission and the fd0 error and BIOS fix?
[20:21] <shewless> kiko: I can. Where would I file the bug?
[20:21] <kiko> launchpad.net/maas/+filebug
[20:21] <shewless> okay.. against maas
[20:23] <kiko> shewless, I'd be surprised if we care that much about blkid
[20:23] <kiko> but..
[20:23] <kiko> one hint is that blkid does not appear in that user script
[20:24] <shewless> kiko: the commissioning works after the floppy drive was disabled in the BIOS... now that is really strange :)
[20:28] <shewless> Bug is submitted: https://bugs.launchpad.net/maas/+bug/1584211
[20:29] <shewless> kiko: thanks again.. all of my servers are commissioned now.. phew!
[20:29] <kiko> shewless, it's a bug
[20:29] <terje> anyone know if I can use ubuntu-vm-builder to create a VM with 2 nics?
[20:31] <terje> hey shewless, how's your install coming along?
[20:31] <shewless> kiko: did I screw it up? I think I submitted it as a bug
[20:34] <shewless> terje: I have maas working great. I'm currently exploring a set of ansible scripts that we have in house to deploy openstack. I took a look at the conjure-up tool but I'm not sure if it's right for me. I want to be able to install a "HA" controller setup and add things like LDAP authentication
[20:34] <terje> what version of MAAS are you using?
[20:34] <shewless> 2.0.0
[20:34] <shewless> on Ubuntu 16.04
[20:34] <terje> gotcha
[20:34] <terje> cool
[20:36] <mup> Bug #1584206 opened: [2.0b5] machine failed to deploy: insufficient free space <MAAS:New> <https://launchpad.net/bugs/1584206>
[20:36] <mup> Bug #1584211 opened: Commissioning fails when BIOS reports floppy drive, but there is none installed <MAAS:New> <https://launchpad.net/bugs/1584211>
[20:37] <shewless> terje: if you have any hints for getting an HA setup using conjure-up I'd have another look :)
[20:39] <terje> so, I've had a hell of a time getting stuff working.
[20:40] <terje> :(
[20:40] <terje> I had a working 16.04 + maas 2.0 but never got openstack working there
[20:40] <terje> so I bagged it and went to 14.04 + maas 1.9.2
[20:41] <shewless> oh that's no good. did 14.04 and maas 1.9.2 help?
[20:41] <terje> it's essentially unusable.
[20:41] <terje> but I do have kind of a cool setup
[20:41] <shewless> what is?
[20:41] <terje> 1.9.2 I can't get working at all
[20:41] <kiko> terje, hmm, I just deployed openstack with autopilot and maas at a customer site
[20:41] <kiko> terje, why does 1.9.2 fail for you?
[20:42] <kiko> on those versions, incidentally
[20:42] <terje> here's my setup
[20:42] <shewless> kiko: does autopilot do controller HA?
[20:42] <terje> I have a physical server loaded with 16.04. I have deployed a VM here, 14.04.
[20:43] <kiko> shewless, yes
[20:43] <terje> once the VM is deployed, I login and run this script
[20:43] <terje> https://github.com/jmcdice/ubuntu-os-cloud/blob/master/maas/maas-trusty-install.sh
[20:43] <terje> the maas-dhcp server never starts
[20:44] <terje> this is where I am stuck
[20:44] <kiko> terje, okay so far..
[20:45] <terje> there is an error in /var/log/upstart/maas-dhcpd.log
[20:45] <terje> /var/lib/maas/dhcpd.conf does not exist.  Aborting.
[20:45] <terje> maas-dhcpd stop/pre-start, process 676
[20:46] <shewless> on 2.0.0 you need to add a subnet to the right fabric and then enable DHCP. I think 1.9.2 is a lot different though
[20:46] <shewless> kiko: I tried to run "openstack-install" but it says "command not found"
[20:46] <shewless> hints?
[20:46] <terje> I think that's being done
[20:47] <terje> in this script, see configure_private() https://github.com/jmcdice/ubuntu-os-cloud/blob/master/maas/maas-trusty-install.sh
[20:48] <terje> kiko: do you have a document you follow which will help me follow?
[20:48] <kiko> terje, you know, the maas install is pretty straightforward. one sec
[20:49] <terje> kiko: I'm trying to make this a repeatable process. If you could have a look at the script above and let me know what I'm missing that would be really helpful.
[20:50] <kiko> https://maas.ubuntu.com/docs1.9/install.html#pkg-install
[20:50] <kiko> terje, gotcha. let me think.
[20:51] <kiko> terje, there has to be some error in your install that we're ignoring
[20:51] <shewless> kiko: I guess conjure-up is supposed to be used instead of autopilot in maas 2.0.0?
[20:51] <kiko> or a race condition somewhere
[20:51] <kiko> shewless, you can use both
[20:51] <terje> I'll start a fresh VM and start over
[20:51] <kiko> terje, let me explain
[20:52] <kiko> terje, apt-get install maas should leave you with everything running
[20:52] <terje> ok
[20:52] <shewless> terje: good luck! let me know how it goes.
[20:52] <kiko> terje, is that error, the dhcpd error, happening after the first install, or after you reconfigure?
[20:52] <shewless> kiko: I'm out for the weekend. Thanks again for the help
[20:52] <terje> see ya shewless, probably monday
[20:52] <terje> :)
[20:52] <kiko> thanks shewless -- sorry it was hard to discover that problem, but we'll get the bug nailed so others won't be inconvenienced
[20:53] <terje> kiko: I'm going to follow this doc precisely and get back to you
[20:53] <shewless> just happy to have it solved (at least for me).. easy workaround :)
[20:56] <kiko> terje, okay, but answer my question too ;-)
[20:57] <terje> after the first install
[20:57] <kiko> shewless, it was funny that you found the only thing that couldn't possibly be problem but what :-)
[20:57] <kiko> s/what/was/ damn
[20:57] <kiko> terje, so if you comment out configure_maas, configure_private and import_images it still fails?
[20:57] <kiko> if so it's a bug (possibly a race when installing)
[20:58] <terje> if I only run install_maas() dhcpd is not running.
[20:58] <terje> but I'll have to check and see if that error is there
[20:58] <terje> I'll have a new fresh trusty VM up here in a couple of minutes and I can start over.
[20:59] <kiko> but that's not right.. dhcpd has to be running after apt-get install maas concludes
[20:59] <kiko> if it isn't, it's a bug
[20:59] <kiko> the install has to have failed somewhere
[20:59] <kiko> are you checking the return value of apt-get install maas?
[20:59] <terje> no
[20:59] <terje> but I certainly can.
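Checking the return value in an install script could look roughly like the sketch below. The function name `install_and_verify` is hypothetical, and treating /var/lib/maas/dhcpd.conf as something a finished install must leave behind is an assumption drawn only from the maas-dhcpd error quoted above, not from MAAS documentation.

```shell
# Hypothetical helper: run an install command, report a non-zero exit
# status, and then confirm that an expected file exists afterwards.
install_and_verify() {
    expected_file=$1
    shift
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "install command failed with exit status $rc" >&2
        return "$rc"
    fi
    if [ ! -e "$expected_file" ]; then
        echo "install returned 0 but $expected_file is missing" >&2
        return 1
    fi
    echo "install ok, $expected_file present"
}

# Example (the expected file is an assumption, see above):
# install_and_verify /var/lib/maas/dhcpd.conf sudo apt-get install -y maas
```

Keeping the two checks separate distinguishes a package install that fails outright from one that exits 0 but still leaves the DHCP configuration missing.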
[20:59] <terje> it pulls in a ton of deps
[20:59] <kiko> brb
[20:59] <kiko> I bet it's failing
[20:59] <terje> k
[21:00] <terje> happy to share a screen if you like.. :)
[21:28] <terje> hey kiko, so yea
[21:28] <terje> after install_maas() I have the error
[21:28] <terje> /var/lib/maas/dhcpd.conf does not exist.  Aborting.
[21:46] <kiko> terje, so the question is why is the package install failing
[21:46] <kiko> apt-get install maas should not fail
[21:47] <kiko> if it's failing it's a bug
[21:47] <kiko> we're detecting something wrong in your system
[22:07] <terje> ok, I'll run it again and capture the install log
[22:33] <terje> kiko: http://sprunge.us/OAcb
[22:33] <terje> maas install log
[22:36] <terje> return code was 0
