[04:00] <mbruzek> hello bigjools
[04:00] <mbruzek> Do you have a minute to help with a maas question?
[04:00] <mbruzek> /etc/maas/dhcp.conf is not being populated and I cannot start the maas-dhcp-server
[07:23] <bigjools> rvba: I looked at the power error log thing
[07:24] <bigjools> the only solution for this, apart from a python rewrite, is to write the generated template to a file and execute it
[07:25] <rvba> bigjools: can't we use our tiny wrapper that we use everywhere to execute shell scripts and specialize the way errors are built?
[07:25] <bigjools> what tiny wrapper?
[07:27] <rvba> Or rather, can't we tweak how we build the PowerActionFail error (http://paste.ubuntu.com/8316602/)?
[07:28] <rvba> That's the wrapper I had in mind, it's only a wrapper (so to speak) around the error.
[07:29] <bigjools> oh and CI is failing :(
[07:29] <bigjools> juju bootstrap
[07:29] <rvba> Damn.
[07:29] <rvba> Consistently?
[07:29]  * rvba checks
[07:29] <bigjools> yes
[07:29] <bigjools> rvba: no we cannot tweak it, the error text is returned from check_output
[07:33] <rvba> bigjools: well, the error we get from check_output has all the information we need stored on it.  Only its __str__ method produces garbage in our case.
[07:33] <rvba> bigjools: subprocess.py http://paste.ubuntu.com/8316633/
[07:33] <bigjools> rvba: it's not garbage, check output does something like "command: error text"
[07:33] <bigjools> and command in our case is the template
[07:34] <bigjools> it's abusing the check_output stuff really
[07:34] <rvba> Yeah, I know; but for power command the template is too much information to put in an event :).
[07:34] <rvba> commands*
[07:34] <bigjools> what I might do is use subprocess.communicate and grab stderr separately
[07:35] <bigjools> although the power templates (I'm looking at AMT) produce  way more output than required
[07:35] <rvba> Exactly
[07:35] <rvba> bigjools: it looks to me that the problem is line 6 here http://paste.ubuntu.com/8316645/
[07:35] <rvba> Because we call the exception's __str__ method.
[07:36] <rvba> And for errors generated by power templates failing, it's guaranteed that it contains the whole template.
[07:36] <bigjools> rvba: we can cheat and grab e.output I suppose
[07:36] <bigjools> but the template needs fixing
[07:37] <rvba> Indeed.
[07:38] <rvba> bigjools: but yeah, we could use e.output and e.returncode and not use e.cmd (as e.__str__ does).
[07:38] <bigjools> rvba: oh god look at what it's doing
[07:38] <bigjools> line 6 does the __str__
[07:38] <bigjools> then line 10 does e.output *again*
[07:39] <rvba> heh, true :)
[07:39] <bigjools> I'll propose a change
[07:39] <bigjools> easier than I thought
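A minimal sketch of the approach rvba and bigjools settle on above: build the error from `e.output` and `e.returncode` instead of `str(e)`, which embeds the whole command (here, the rendered template). `PowerActionFail` is named in the chat, but the class body, `run_power_script`, and the message format below are assumptions for illustration, not the actual MAAS code.

```python
import subprocess


class PowerActionFail(Exception):
    """Hypothetical stand-in for the error type mentioned in the chat."""


def run_power_script(script):
    """Run a rendered shell template; on failure report only its output.

    `script` is the full template text. It is deliberately kept out of
    the error message, unlike str(CalledProcessError), which includes it.
    """
    try:
        # stderr is merged into stdout so that e.output carries the error text.
        return subprocess.check_output(
            ["/bin/sh", "-c", script], stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as e:
        output = e.output.decode("utf-8", "replace").strip()
        # Use e.returncode and e.output only; e.cmd holds the whole template.
        raise PowerActionFail(
            "power script failed with return code %d: %s"
            % (e.returncode, output))
```

With this shape, an event log entry would carry the exit status and whatever the script printed, but not the script itself. Capturing stderr separately, as bigjools suggests at 07:24, would be a variation on the same idea.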
[07:40] <bigjools> in the meantime, please review my branch :)
[07:40] <rvba> I'd love to.
[07:41] <rvba> Already approved by Dr. jtv it seems.
[07:42] <jtv> Sorry.  :-P
[07:42] <jtv> And by the way I never finished that PhD...
[07:43] <bigjools> aha
[07:46] <rvba> jtv: I know… it was more of an honorific title.  For some reason I was thinking about 'Doc' in "Back to the Future". ;)
[07:47] <jtv> I'll just take that as a compliment.  :)
[08:24] <bigjools> rvba: I'm the one with unruly grey hair, not jtv
[08:25] <jtv> I don't know how to break this to you but... it's not unthinkable that I'm slightly weirder.
[10:15] <caribou> jtv: I'm preparing the SRU for bug #1346703
[10:15] <ubot5`> bug 1346703 in maas (Ubuntu) "/var/log/maas/rsyslog has incorrect permission" [Medium,In progress] https://launchpad.net/bugs/1346703
[10:16] <jtv> Hi caribou.  Excellent.
[10:16] <caribou> should I do a debdiff against trusty-proposed or another MP ?
[10:16] <caribou> I would say it depends on who sponsors my upload
[10:17] <jtv> caribou: I don't recall having done any backports on the packaging branch myself, so not sure.
[10:18] <caribou> jtv: I'll attach a debdiff & add sponsors & SRU team; this is what I usually do
[10:18] <caribou> jtv: if needed I'll change it later no big deal
[10:18] <jtv> You might ask bigjools, but I think it's past the end of his day now.
[10:19] <gmb> allenap, rvba, jtv: https://code.launchpad.net/~gmb/maas/enlist-mscm-to-RPC/+merge/234279 and https://code.launchpad.net/~gmb/maas/enlist-uscm-to-RPC/+merge/234285 need reviewing when you’ve got a sec.
[10:19]  * jtv has a sec
[10:19] <rvba> gmb: I'll review your other branch.
[10:19] <caribou> jtv: will do, I should be able to catch him later
[10:19] <gmb> jtv, rvba: ta
[10:21] <jtv> Race conditions ftw
[10:21] <jtv> At last I find out what happens when you try to claim a review that someone else has just claimed.
[10:36] <rvba> allenap: I wonder if what's seeing in the lab is not two bugs combined.
[10:36] <rvba> what we are*
[10:36] <rvba> allenap: I see two errors:
[10:37] <rvba> maas-integration.TestMAASIntegration.test_check_nodes_declared ... ERROR
[10:37] <rvba> Or
[10:37] <rvba> maas-integration.TestMAASIntegration.test_juju_bootstrap ... ERROR
[10:38] <allenap> rvba: Where are you seeing that? On the run I kicked off test_check_nodes_declared is ok...
[10:39] <rvba> allenap: we landed a bunch of branches since the first failure.
[10:39] <rvba> And when I go through them, I see two different types of failures.
[10:40] <rvba> allenap: I wonder if you haven't reverted one problem, only to get the failure from the other problem.
[10:40] <rvba> allenap: makes sense?
[10:42] <allenap> rvba: Ah ha, yes :)
[14:34] <gmb> allenap, rvba: Another branch for you: https://code.launchpad.net/~gmb/maas/useful-noconnectionerrors/+merge/234319
[14:35] <rvba> gmb: I'll take it.
[14:36] <gmb> Ta
[14:39] <rvba> gmb: question for you on the MP.
[14:45] <blake_r> allenap: https://bugs.launchpad.net/maas/+bug/1368269
[14:49] <roaksoax> rvba: ^^
[14:51] <rvba> roaksoax: blake_r: yeah, ugly bug. The exception is a bit confusing though.  allenap will probably have an idea.
[14:51] <blake_r> rvba: yeah rpc error
[14:57] <gmb> rvba: Good point! Updated and pushed.
[14:58] <gmb> rvba: hang on; test in the wrong place… fixing
[14:58] <rvba> gmb: is it worth changing src/maasserver/rpc/regionservice.py:getClientFor as well?
[14:59] <gmb> rvba: Definitely. I hadn’t spotted that one.
[15:09] <roaksoax> gmb: so all the probe-and-enlist are finished already right?
[15:34] <gmb> roaksoax: Yes, they’re finished now.
[15:35] <gmb> rvba: I’ve updated that branch again.
[15:35] <roaksoax> gmb: awesome!
[15:36] <gmb> rvba: Oh, you already approved. Ta :)
[15:36] <gmb> roaksoax: Indeed :).
[15:39] <blake_r> rvba: https://bugs.launchpad.net/maas/+bug/1368269
[15:39] <blake_r> rvba: that breaks juju bootstrap
[15:39] <blake_r> rvba: that is what you're seeing in CI
[15:39] <blake_r> rvba: i have a fix, for the enlistment issue
[15:40] <rvba> blake_r: okay, well sleuthed.  allenap will have a look at this in a bit.
[15:41] <roaksoax> rvba: heh...so this wasn't related to blake_r 's branches after all
[15:41] <blake_r> roaksoax: enlistment was!
[15:41] <rvba> roaksoax: well, yes and no.
[15:41] <roaksoax> hehe ok :)
[15:42] <roaksoax> ok, let's get this fixed asap since we are releasing today
[15:43] <blake_r> rvba: https://code.launchpad.net/~blake-rouse/maas/fix-enlistment/+merge/234326
[15:43] <blake_r> rvba: one liner!
[15:44] <rvba> blake_r: which means you're missing a test :)
[15:44] <blake_r> rvba: naw, it's twisted!
[15:44] <blake_r> rvba: haha!
[15:44] <rvba> blake_r: do we really want to ignore all failures like that?
[15:44] <blake_r> rvba: we do for windows boot method
[15:45] <rvba> I mean, don't you want to only silence No Content errors?
[15:46] <blake_r> rvba: i want to silence all errors, because if windows boot method can't be used, that is fine
[15:46] <blake_r> rvba: this is only used for the deprecated windows install, that is not supported anymore
[15:46] <blake_r> rvba: it is unrelated to curtin
[15:46] <blake_r> rvba: we might remove it
[15:47] <rvba> blake_r: okay, makes sense;  probably worth a comment in the code though :)
[15:50] <blake_r> rvba: okay added comment
[15:51] <rvba> Ta
[21:09] <plars> matsubara: got a sec? trying to sort out a maas issue
[21:09] <matsubara> plars, yep
[21:09] <matsubara> what's up?
[21:10] <plars> matsubara: I have an install here on trusty that I haven't messed with in a while, but it was previously working.  When I powered it back up to try something I tried to go to the /MAAS page on my server and got a 500 error
[21:10] <plars> matsubara: so I updated to the latest in trusty and rebooted, still no luck
[21:10] <plars> matsubara: I'm now on the one in ppa:maas-maintainers/stable, but it's doing the same to me
[21:10] <matsubara> plars, are you using any version from the PPAs?
[21:10] <plars> matsubara: sec and I'll post the oops
[21:10] <matsubara> thanks
[21:10] <plars> matsubara: previously I wasn't, but I tried the ppa one as a last effort
[21:11] <plars> matsubara: http://paste.ubuntu.com/8322024/
[21:12] <matsubara> the only thing I see in that pastebin is a $
[21:13] <matsubara> plars, ^
[21:13] <plars> matsubara: hmm
[21:13] <plars> sec
[21:13] <plars> matsubara: try http://paste.ubuntu.com/8322034/
[21:15] <matsubara> Do you have the full traceback for that oops in /var/log/maas/oops? Are there any other tracebacks in /var/log/maas/maas.log or /var/log/maas/celery.log (assuming you are using 1.6 from the stable PPA)
[21:16] <matsubara> plars, ^
[21:19] <matsubara> plars, also worth checking if all services for maas are running: maas-pserv, maas-txlongpoll, maas-cluster-celery and maas-region-celery
[21:20] <matsubara> plars, rabbitmq-server too
[21:20] <plars> matsubara: sec, phone
[21:21] <matsubara> but there's probably a more informative traceback somewhere in /var/log/maas or /var/log/apache2/
[21:34] <plars> matsubara: don't see anything that looks like a real traceback, but still looking, one moment
[21:49] <matsubara> plars, ok. Another thing, did you upgrade from 1.5? If you did you'll likely have to re-import the boot images (unrelated to the 500 error you're seeing, just a heads up)
[21:49] <plars> matsubara: good to know, but I can't even get that far at the moment :)
[21:49] <plars> matsubara: found some possible stuff http://paste.ubuntu.com/8322192/
[21:50] <plars> matsubara: that's from error.log
[21:52] <matsubara> plars, it's taking forever to load that pastebin, is it a huge paste?
[21:53] <plars> matsubara: yes
[21:53] <plars> matsubara: I can chop it up if you like
[21:54] <plars> matsubara: there's a lot of 'error: [Errno 113] No route to host' in it
[21:54] <plars> I'm not sure which host, perhaps the node that's turned off at the moment?
[21:54] <plars> matsubara: try http://paste.ubuntu.com/8322257/ for a short one
[21:55] <matsubara> plars, is rabbitmq-server running? I'd say the region controller is trying to connect to it but failing
[21:55] <matsubara> plars, when you upgraded, did maas restart the services after the update?
[21:55] <plars> matsubara: yes, it's running
[21:56] <plars> matsubara: the first update to trusty latest, for certain it did, I even rebooted the whole box to be sure
[21:56] <plars> matsubara: I'm not sure what all services need to be restarted, I trusted the package update to take care of that but I can reboot again after the upgrade to the ppa version
[21:57] <matsubara>  maas-pserv, maas-txlongpoll, maas-cluster-celery and maas-region-celery should be all up
[21:57] <matsubara> as well as rabbitmq-server
[21:57] <beisner> plars, matsubara - i think the one time i had 500 issue in maas, it ended up being an mq auth issue.   i think there were also some bugs where the rabbitmq pwd was reset during an upgrade.
[21:57]  * beisner thinks back
[21:57] <matsubara> beisner, good point
[22:04] <matsubara> plars, another thing worth checking is the config files in /etc/maas/ and see if DEFAULT_MAAS_URL and MAAS_URL look sane
[22:05] <matsubara> as in, are they pointing to the URL/ip you'd expect MAAS to be running?
[22:07] <plars> matsubara: hmm, no, one of them points at localhost/MAAS
[22:09] <matsubara> plars, where are they pointing to? I'd expect to be one of the IPs for that machine.
[22:10] <plars> matsubara: it is, but....
[22:11] <plars> matsubara: when grepping through there I think I may have found the problem
[22:11] <plars> matsubara: somehow the celery broker url seems to be pointing to the wrong IP
[22:15] <matsubara> plars, there's code in the package to auto detect the default route for the given system and use that IP address as the DEFAULT_MAAS_URL which in turn the MAAS_URL would infer its value from. The package would respect the values if they're set into the debconf db but if the configs were changed manually directly in the file they might be overwritten.
[22:16] <matsubara> plars, but if you can reproduce the issue or describe what you did, I think it's worth filing a bug. It's helpful to have this kind of upgrade feedback.
[22:19] <plars> matsubara: seems to be working now, somehow I think I just had a bad ip. Thanks!
[22:19] <matsubara> plars, cool! You're welcome.
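The misdirected broker URL plars found at 22:11 is the kind of thing a quick script can confirm. The sketch below pulls the host and port out of an AMQP-style URL and tries a TCP connection to it. The function names, the example URL, and the default port of 5672 (the usual AMQP port) are assumptions for illustration; this is not part of MAAS, and the real setting name and file location should be taken from the installed config.

```python
import socket
from urllib.parse import urlsplit

DEFAULT_AMQP_PORT = 5672  # assumed default when the URL names no port


def broker_host_port(broker_url):
    """Return (host, port) parsed from a broker URL such as amqp://u:p@host:5672//vhost."""
    parts = urlsplit(broker_url)
    return parts.hostname, parts.port or DEFAULT_AMQP_PORT


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "No route to host", refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical URL; substitute the value found in the config under /etc/maas/.
    host, port = broker_host_port("amqp://user:password@192.0.2.10:5672//vhost")
    print(host, port, "reachable" if can_connect(host, port) else "unreachable")
```

An "unreachable" result for an address that does not belong to the machine would be consistent with the repeated `[Errno 113] No route to host` lines in the error log.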