#maas 2012-09-03
<lifeless> bigjools: which bug # for clarity ?
<bigjools> lifeless: https://bugs.launchpad.net/juju/+bug/945505
<ubot5> Ubuntu bug 945505 in juju "Use ipAddress instead of dnsName now that txaws supports it" [High,In progress]
<bigjools> esp see comment 8
<bigjools> dunno if that's all aws-specific though
<bigjools> (on second viewing)
<lifeless> bigjools: its all AWS specific
<bigjools> then we're all good
<lifeless> bigjools: more details in mail
<rvba> mrevell: want to talk about the hexr stuff?
<mrevell> rvba, Sure! Was just on another call. I'll set up a hangout
<melmoth> hmm, at which stage is dnsmasq supposed to have set a new entry for the new host so that its name is resolvable ?
<melmoth> i have just bootstrapped juju on a maas environment, it s using the only node i defined right now: zookeeper.mydomain.com (enlisted manually)
<melmoth> and while the system is being installed, i cannot resolve its name.
<melmoth> so, juju will not be able to log in it, will it ?
<melmoth> i see the entry for the node in  /var/lib/misc/dnsmasq.leases, but still, the name cannot be resolved
<rbasak> allenap: I don't understand https://bugs.launchpad.net/maas/+bug/1044393/comments/4
<ubot5> Ubuntu bug 1044393 in MAAS "Tests fail with "HTTPError: HTTP Error 503: Service Unavailable"" [Low,Triaged]
<rbasak> allenap: is this against the wrong bug?
<melmoth> i tried the workaround mentioned in https://bugs.launchpad.net/ubuntu/+source/maas/+bug/1043121 , but i still have the same behaviour
<ubot5> Ubuntu bug 1043121 in maas (Ubuntu) "deployed node cannot be looked up with dnsmasq on MAAS" [Undecided,New]
<melmoth> anyone know exactly at which stage dnsmasq is supposed to resolve the name ?
<allenap> rbasak: Nope, correct bug. I can reproduce the error, and that, I think, is the root cause.
<allenap> Well, on the way to the root cause.
<rbasak> allenap: ok, I've tested and commented on the bug
<rbasak> allenap: (your step to reproduce doesn't work for me)
<allenap> rbasak: Interesting. I hate bugs.
<allenap> like this.
<melmoth> is there a way to bootstrap on a node of my choice (the same way i can jitsu deploy-to a specific node) ?
<jtv> You can pass a "name" constraint that must match the node's hostname.
<melmoth> hmmm
<allenap> melmoth: In 12.10 there will be much richer placement/hardware constraint support. Obviously that's not useful right now, but if you're just evaluating then it may be worth coming back to it then.
 * melmoth wonders if 'juju bootstrap --constraints "instance-name=zookeeper" ' is already supposed to work
<melmoth> may be maas-name=zookeeper
<melmoth> juju bootstrap  --constraints "maas-name=zookeeper.mydomain.com"
<melmoth> works ! \o/
<allenap> roaksoax: Hello! I have some questions about packaging. To start with: is there an easy way to attempt packaging with my current branch, instead of with whatever's at the top of the changelog?
<allenap> roaksoax: I've tried `bzr builddeb --export-upstream=...` but I'm just waving my arms in the dark really.
<roaksoax> allenap: your branch on lp:~allenap etc etc or a different revision?
<allenap> roaksoax: I want to be able to use a local branch, just the one I'm working on. I'm trying to write a script to build my current branch and run tests on it.
<roaksoax> allenap: uhmm maybe try modifying debian/rules
<roaksoax> allenap: the other one is that you create an .orig.tar.gz yourself
<allenap> roaksoax: So, walk me through creating the .orig.tar.gz. I can bzr export the branch into a tarball, but what do I call it so that the builddeb finds it and uses it?
<roaksoax> allenap: what i would do is simply tar zxvf .orig.tar.gz && cp -r debian maas-<whatever> and then: cd maas-<whatever> && debuild -S -sa
<roaksoax> allenap: note that the tarball has to be the same version as what it is in the changelog
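The naming question allenap raises can be sketched in Python. This is a minimal illustration, assuming the usual Debian convention of `<source>_<upstream-version>.orig.tar.gz`, where the upstream version is the changelog version with any epoch and Debian revision stripped. The function name and the sample version string are hypothetical and are not taken from the MAAS packaging.

```python
def orig_tarball_name(source, changelog_version, compression="gz"):
    """Return the .orig tarball name matching a debian/changelog version.

    Hypothetical helper: strips the epoch (text before the first ":") and
    the Debian revision (text after the last "-") to get the upstream part.
    """
    version = changelog_version
    if ":" in version:
        version = version.split(":", 1)[1]
    if "-" in version:
        version = version.rsplit("-", 1)[0]
    return "%s_%s.orig.tar.%s" % (source, version, compression)


# Illustrative version string only, not a real MAAS release.
print(orig_tarball_name("maas", "0.1+bzr1234+dfsg-0ubuntu1"))
```

A tarball exported with `bzr export` would need this name, placed in the parent directory of the unpacked source, for the version in the changelog to line up with it.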
<roaksoax> allenap:btw.. how can I add kernel parameters to a system?
<roaksoax> allenap: i don't want to modify provisioningserver/pxe/config.template
<allenap> roaksoax: File a bug :)
<allenap> We will move those templates to somewhere easier to customise once installed, but... time.
<roaksoax> allenap: heh.. ok.. I thought we had agreed that it was important for us to be able to modify kernel parameters for the system
<roaksoax> allenap: as in, being able to add kernel parameters from the API/WebUI
<allenap> It is. What's your use case?
<roaksoax> and IIRC, that was a request we made
<roaksoax> allenap: right now not my particular case, but for example, sabdlf just lost serial interface output due to not having the kernel parameters. And really there's no easy way to add them without having to modify those config files
<allenap> roaksoax: Find or file a bug, add a comment and mark it critical.
<roaksoax> allenap: will do
<allenap> roaksoax: I'd like to give you a better answer, but we don't have any slack right now.
<roaksoax> allenap: yeah we are all like that
<roaksoax> allenap: Anyways, if there's nothing else I can help you with, I'll go resume my holiday :)
<allenap> roaksoax: Ah, I didn't realise you were away. Sorry about that, have a good one :)
<roaksoax> allenap: hehe no worries, either way I have to do school work so I'm kinda around
#maas 2012-09-04
<rbasak> smoser: ping. Need some ephemeral image debugging help when you get in please. For some reason it's not configuring DNS right even though IP-Config lists correct info (can't resolve ports.ubuntu.com later). I can't reproduce by booting the machine into a normal installation. Is there a way to get a shell onto the system via the console and/or stop it shutting down?
<jtv> Daviey: got a moment to talk about how we run dhcpd?
<Daviey> jtv: Hmm, i can try.. kinda knee deep atm
<jtv> Thanks.  It's just that /etc/default/isc-dhcp-server seems to be the only place where we can set which interfaces it should serve on.
<Daviey> rbasak: are you using an apt proxy?
<jtv> And so my only option to set that _if I stay within MAAS code_ is to rewrite that file, which is all sorts of ugly.
<rbasak> Daviey: no. I've got a bit further though. Looks like cloud-init is not writing a resolv.conf at all. Some conflict with resolvconf and the nature of my image perhaps.
<Daviey> jtv: Well, that is the proper way... but you can supply your own upstart job with a config, if that makes more sense
<Daviey> rbasak: resolvconf awesomeness !
<jtv> Daviey: I'm not sure it'd be good to have two upstart jobs for the same daemon side by side.  Otherwise, tempting.
<jtv> Obviously I want to avoid just appending endless lines... is there a standard reusable way to rewrite a single line with some kind of I-wrote-this marker comment in front?
<rbasak> augeas is the standard answer. Not sure how well adopted that is in Ubuntu. And it's probably overkill
<jtv> Don't know what that is, but the name certainly carries an air of overkill!
<rbasak> It's a generic mechanism to understand different file formats and change them programatically and losslessly
<Daviey> rbasak: yeah, we never really quite got into the augeas vibe
<jtv> Something else then?
<jtv> It's a relatively simple problem, but obviously not unique.
<rbasak> I would use a comment as a marker and combine that with sed
<rbasak> I think that's the usually done thing
<Daviey> jtv: Well... having /etc/default/isc-dhcp-server - ENABLED=False.. then writing a /etc/maas/isc-dhcp-server is another option, no?
<jtv> Not _particularly_ nice to have the comment on the same line, but...
<jtv> But then we're back to writing that file.  :)
<rbasak> Usually it's the immediately preceding line
<rbasak> Which notes that it's an automatic thing and not to mess with the line below
 * jtv has not mastered inter-line editing in sed
<Daviey> well, ## Begin MAAS entries\n .. ### End MAAS entries ?
<jtv> A comment like that is exactly what I had in mind though.
<jtv> I thought sed got horribly involved when you had to edit across lines.
<rbasak> It's not too bad.
<Daviey> smoser would fight for awk here :)
<rbasak> If you use Daviey's start and end markers, then:
<rbasak> hmm, maybe too much to do in irc
<rbasak> You may have a point
 * jtv thought so :)
<jtv> It's not _very_ much code but given how evil corruption in /etc would be, I'd much prefer something that's already in widespread use.
<jtv> Failing that, I can just write some python myself.
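The marker-comment approach discussed above could look roughly like this in Python. It is only a sketch: the exact marker text, the function name and the sample file content are assumptions made for illustration, and it is not the code MAAS ended up with.

```python
import re

# Assumed marker lines, modelled on the begin/end comments suggested above.
BEGIN = "## Begin MAAS entries"
END = "## End MAAS entries"


def replace_marked_block(text, body):
    """Replace the block between the markers, or append one if absent."""
    block = "%s\n%s\n%s\n" % (BEGIN, body.rstrip("\n"), END)
    pattern = re.compile(
        r"^%s\n.*?^%s\n?" % (re.escape(BEGIN), re.escape(END)),
        re.DOTALL | re.MULTILINE,
    )
    if pattern.search(text):
        # A function replacement avoids backslash interpretation in body.
        return pattern.sub(lambda match: block, text, count=1)
    if text and not text.endswith("\n"):
        text += "\n"
    return text + block


original = 'INTERFACES=""\n'
updated = replace_marked_block(original, 'INTERFACES="eth0"')
print(updated)
```

Because the whole block is regenerated each time, repeated runs do not append endless lines, and anything outside the markers is left untouched.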
<rbasak> It's against debian policy for packages to change other packages conffiles
<rbasak> isn't it?
<rbasak> How would that work for maas packaging then?
<rbasak> I think the separate upstart job would be cleaner and more policy-compliant
<jtv> But if it requires editing /etc/default/isc-dhcp-server to disable the original upstart job, it doesn't make things much better.
<rbasak> Luckily the dhcp server installs disabled
<jtv> Ideally we'd have some kind of conf.d hook here.
<jtv> Oh, yes, of course it does!
<jtv> *facepalm*
<jtv> Oh dear.  But that's in the config, innit?
<jtv> And we write to the config.
<rbasak> Have an entirely separate config. /etc/maas/dhcp
<jtv> Yeah, that's what springs to mind but it's a lot of weight to carry around.
<rbasak> The maas dhcp upstart job would fire up the daemon to use that, with separate pidfiles, lockfiles, leases files and everything
<rbasak> It is
<rbasak> Unfortunately there's no accepted mechanism for packages to stack up automatically in the way that maas needs to sit higher in the stack and manipulate dhcp
<rbasak> So right now that's the only clean way to do it in packaging I think
<rbasak> But who am I to say this anyway. I'm neither a Debian nor an Ubuntu anything
<jtv> And yet you cite debian policy.
<jtv> At least it's giving me a much better view of the problem, so I thank you for that.
<rbasak> You're right about the conf.d hooks btw. That's the normal way of packages getting hooks into other packages
<rbasak> I suppose /etc/default/isc-dhcp-server could have something like [ -d /etc/default/isc-dhcp-server.d ] && for f in $(run-parts --list /etc/default/isc-dhcp-server.d); do . $f; done
<rbasak> That could go in as an Ubuntu delta to the dhcp package, and then maas would just need to manipulate /etc/default/isc-dhcp-server.d/maas
<rbasak> I think that would be clean and policy compliant, but it would be unusual
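If such a `.d` directory existed, the MAAS side might be as small as the following Python sketch. It assumes the fragment only needs to set the `INTERFACES` shell variable that the packaged defaults file uses; the helper names and the temporary-file prefix are hypothetical.

```python
import os
import tempfile


def render_interfaces_fragment(interfaces):
    """Render a shell fragment that sets INTERFACES for the dhcpd init job."""
    return 'INTERFACES="%s"\n' % " ".join(interfaces)


def write_atomically(path, content):
    """Write content to path via a temporary file and an atomic rename."""
    directory = os.path.dirname(path) or "."
    # The dotted prefix keeps the temporary file out of run-parts listings.
    fd, temp_path = tempfile.mkstemp(dir=directory, prefix=".maas-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        os.chmod(temp_path, 0o644)
        os.replace(temp_path, path)
    except BaseException:
        if os.path.exists(temp_path):
            os.unlink(temp_path)
        raise


print(render_interfaces_fragment(["eth0", "eth1"]), end="")
```

Writing through a temporary file in the same directory means a reader never sees a half-written fragment. By default run-parts skips file names containing a dot, so the temporary file should not be sourced by accident.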
<Daviey> ^ that seems better actually
<jtv> But it'd have to be in Precise as well.
<jtv> Daviey: any chance of getting anything like that into Precise?
<jtv> As well as Quantal?
<Daviey> jtv: well quantal is easy.. precise is so messed up with this massive change, what is more pain. :)
<jtv> *cough*
<jtv> Who might be able to do such a thing?
<Daviey> jtv: I honestly can't see the massive change going into precise at the moment...
<Daviey> that run-parts style thing, is /possible/
<jtv> Wait... which is the massive change then?  You mean writing our own config is the impossible one?
<Daviey> no, the massive change.. as in, pulling everything back :)
<jtv> I don't understand that expression.
<Daviey> jtv: The plan to SRU trunk maas back to precise.. I have reservations abou it being possible.
<jtv> Oh!  That one.
<jtv> Sorry, I had no idea you were talking about that.
<Daviey> not saying it can
<jtv> Well, one thing at a time then: do you know how we might achieve a run-parts extension on /etc/default/isc-dhcp-server?
<Daviey> not saying it can't/won't be done.. i just have reservations
<Daviey> jtv: Well, we won't achieve it until it's required.. does that make sense?
<Daviey> ie, we won't update the precise version.. on the off-chance it might be needed in a few months.
<Daviey> Does that make sense?
<jtv> Of course.
<jtv> The only other solution that's come up is to write our own upstart job and our own dhcpd config.
<jtv> It's a lot of weight to carry around, basically just to work around this relatively standard piece of config machinery not being there.
<Daviey> jtv: right
<jtv> rvba: you wrote the dhcp config files, right?
<rvba> jtv: no, Julian did iirc.
<jtv> Ah
<jtv> Oh, maybe the problem you had was with the dns config.
<rvba> Can I help with something?
<jtv> The problem where you had to write into the config file.
<rvba> Yeah, that's with the DNS config file.
<jtv> Well, there's only one place where we can specify what interfaces dhcpd should service.  And it's in /etc/default/, so we're not really supposed to change it.
<jtv> Alternatives include:
<jtv>  * Doing it anyway.
<rvba> (that's what we did with the DNS stuff)
<jtv>  * Getting a "runparts" into the packaged version of that file.
<jtv>  * Creating our own upstart job, and using our own config that isn't in the usual /etc/dhcpd or wherever it normally is.
<rvba> (That might be a problem if you add AppArmor into the mix)
<jtv> Oh dear yes, there's that too.
<rvba> Let's hangout briefly if that's ok  with you, the decisions we made for the DNS stuff are still fresh in my memory so maybe the two of us can figure out the best way to do this.
<rvba> jtv: ^
<jtv> Sure, thanks.  I don't have much time, but I have some.
<jtv> I'm creating a hangout.
<jtv> rvba: inviting...
<jtv> https://plus.google.com/hangouts/_/c5f8c770e7c02a1703b226632493d84bb508ad53?authuser=0&hl=en-GB
<jtv> Daviey: looks like my only realistic option is to write the existing config file anyway.  Thanks for  sparring with me!
 * jtv is off
<Daviey> \o/
<smoser> rbasak, what happened above?
<rbasak> smoser: I got a bit further
<smoser> i'd prefer the separate upstart job myself.
<rbasak> It seems that resolv.conf isn't being populated for some reason
<smoser> and i dont really think its a lot of additional things to carry. its 1 file.
<rbasak> Oh
<smoser> oh. yeah, that too. the above comments were wrt dhcp
<smoser> but rbasak yeah, i suspect you have clock issues and dhcp :)
<rbasak> The dhcp thing is between Daviey and jtv I guess
<smoser> or the eth0 naming bug we agreed existed.
<rbasak> I'm not sure about clock. The rest of DHCP seems to work fine. It's just resolv.conf that's not being populated
<rbasak> Could it be a race between eth0 and eth1?
<smoser> rbasak, where is your image?
<smoser> what does /etc/resolv.conf look like in the pristine image?
<rbasak> It's a symlink
<smoser> because if you're using something older on 12.04, that was busted, and i'm not sure how it would react (although it seems to work fine on intel...)
<smoser> rbasak, ok. then i think i know.
<smoser> ah. i know.
<rbasak> I can look up the target if you need
<smoser> https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1031065
<ubot5> Ubuntu bug 1031065 in cloud-init (Ubuntu) "cloud-init-nonet runs 'start networking' explicitly" [Medium,Confirmed]
<smoser> i suspect it is that.
<smoser> and you can verify by removing the 'start networking' line from cloud-init-nonet.
<smoser> its a race condition.
<smoser> interestingly, i came upon that bug when I was incorrectly not specifying 'ro' on the kernel command line
<rbasak> Verify by removing it from where? In the image?
<smoser> so you might want to verify that you're doing that as it massages the race differently :)
<smoser> in the image, yes.
<smoser> so, you *should* boot with 'ro' on the kernel command line, and that *might* "fix" your problem here.
<rbasak> What's the plan for the armhf image for precise in general?
<roaksoax> rbasak: can you give us a short summary of the dhcp issue?
<roaksoax> err
<roaksoax> sorry
<roaksoax> rvba:
<roaksoax> ^^
<smoser> rbasak, we'll get you one if you need one.
<rbasak> I definitely need one
<smoser> i made a daily of the images available, and my plan was to get those down.
<rbasak> Critical for MAAS on ARM - in time for 12.10
<smoser> but due to many blocking bugs in quantal maas, i've not been able to sufficiently test the images to move them to 'release'.
<smoser> i've verified they're good for precise usage.
<rbasak> I believe MAAS trunk is essentially working now
<rbasak> Not sure about the packaging
<rvba> roaksoax: the dhcp issue that Jeroen was having?
<roaksoax> rvba: yeah, did you guys reach consensus on how it should be fixed?
<rvba> roaksoax: we basically don't have a choice here: with support for a proper conf.d directory, the only solution is to write to the config file directly. Same as what we've done for the DNS config.
<rvba> s/with support/without support/
<rvba> roaksoax: can you think of another way to do this?
<roaksoax> rvba: so I asked the security team about this the other day
<roaksoax> rvba: and this is the response: "Perhaps it would be easier for MAAS to have a helper that spawns the"
<roaksoax> dhcp and bind daemons itself? This would allow specifying different
<roaksoax> configuration files, in a location that is accessible by the maas
<roaksoax> rvba: which I believe is probably the best way to mess up with files, without messing up with files (if you know what i mean)
<rvba> We ruled that out because Apparmor restrict where, say, the named daemon can read its config file quite seriously.
<rvba> restricts*
<roaksoax> rvba: right, but we can always ship apparmor profiles
<rvba> Right, that would solve the Apparmor problem.
<rvba> But this means adding conf.d support for the apparmor profiles.
<roaksoax> rvba: not really no
<rvba> Which, IIRC, isn't available yet.
<rvba> ah?
<roaksoax> rvba: it is the same as what happened with cobbler/maas-provisiion
<roaksoax> rvba: we just ship an apparmor profile which is also the same as the original one
<roaksoax> rvba: but we add the differences to it
<roaksoax> rvba: so we can just copy the dhcp apparmor profile and those two would live there
<roaksoax> ie. /etc/apparmor.d/usr.sbin.dhcpd and /etc/apparmor.d/usr.sbin.dhcpd-maas
<roaksoax> or similar
<roaksoax> and the -maas profile includes the files we are touching
<rvba> I can't say I know Apparmor very well but I suppose this means that the profiles are "cumulative"?  I mean that we would have to provide a /etc/apparmor.d/usr.sbin.named-maas which would sort of "extend" the default /etc/apparmor.d/usr.sbin.named
<rvba> Is that right?
<roaksoax> rvba: I don't know actually if it would work that way
<roaksoax> rvba: we should ask jdstrand
<roaksoax> rvba: cause otherwise we ship the same exact profile, plus the location of the maas config files
<roaksoax> rvba: we will simply have 2 different profiles installed
<rbasak> smoser: definitely booting with ro. I'll try dropping the "start networking".
<rvba> roaksoax: if using the apparmor profile tricks works, this would also mean duplicating all the startup scripts to control our custom dns/dhcp services.  Is this really something we want to do?
<roaksoax> rvba: what do you mean by duplicating all the startup scripts? Isn't it just the upstart job?
<rvba> roaksoax: yeah, the upstart jobs.
<roaksoax> rvba: yeah, so only 2 upstarts jobs
<roaksoax> 1 for DHCP and 1 for DNS
<roaksoax> but for now we can just try this out with DHCP
<smoser> rvba, i +1 roaksoax's solution.  this is the right way especially for dhcp. we can probably find some way to get around bind, though.
<smoser> using groups i think.
<smoser> but that might be more trouble than its worth also
<allenap> roaksoax: Is python-txtftp published to a PPA somewhere?
<allenap> Or, where's it coming from for precise?
<roaksoax> allenap: for precise its on maas-trunk
<roaksoax> allenap: are you planing any fixes?
<allenap> roaksoax: Where's the maas-trunk PPA?
<roaksoax> allenap: err there's no precise package for python-txtftp
<roaksoax> we dont want it
<roaksoax> because we wont be able to have it in the archives
<roaksoax> allenap: python-txtftp is installed as part of maas
<allenap> roaksoax: debian/control says that python-django-maas depends on python-txtftp. How will that work on precise then?
<roaksoax> allenap: there's a different precise branch
<allenap> Ah, right.
<roaksoax> allenap: the precise packaging branch is stacked on top of the quantal one
<rbasak> smoser: ping
<rbasak> smoser: I've tried the workaround you suggested, but now cloud-init seems to fail completely. It doesn't even attempt apt-get update
<rbasak> cloud-init-nonet gave up waiting for a network device.
<rbasak> Then it lists eth0 as configured correctly, and eth1 as unconfigured this time
<rbasak> What is cloud-init-nonet doing here, anyway?
<smoser> rbasak, cloud-init-nonet ensuring that network comes up before cloud-init.conf runs
 * rbasak doesn't understand why this isn't called cloud-init-net
<smoser> rbasak, i can help if you can point me at something i can look at.
<smoser> but i'm almost certain you're seeing the bug that i pointed you at.
<rbasak> smoser: I've tried the workaround you suggested - I removed "start networking"
<smoser> combined with the other bug we discussed (which does not have a launchpad bug for) of eth0 not being the device that was pxe booted from.
<rbasak> Behaviour has changed now
<rbasak> It definitely has booted off eth0
<rbasak> smoser: is it significant that it lists eth1 _before_ eth0?
<smoser> rbasak, well are the mac's right?
<smoser> did it boot off of eth0 ?
<rbasak> Yes
<smoser> then its not significant
<rbasak> smoser: am I right in thinking that this bug isn't arch specific, and not specific to having two interfaces either then?
<smoser> i dont think its specific to 2 nics.
<smoser> i dont think its arch specific
<smoser> if it is arch specific, it is not really because of arch, but rather because of some config somewhere that makes an assumption
<smoser> or because of race condition
<rbasak> I think this is release critical for maas
<smoser> that just happens because arm has different bottlenecks.
<smoser> well, yes, its clearly critical
<rbasak> OK, where do we need to go from here?
<smoser> rbasak, other things stopped me from poking further at bug 1031065 after i found that the bug we were working around by adding 'start networking' was not fixed otherwise (ie, we still needed 'start networking' to correctly boot under lxc).
<ubot5> Launchpad bug 1031065 in cloud-init (Ubuntu) "cloud-init-nonet runs 'start networking' explicitly" [Medium,Confirmed] https://launchpad.net/bugs/1031065
<smoser> i ran out of time the day i was poking at it and haven't been back.
<smoser> it is possible that if we debug that further, the root of the problem would also be the root of your issue.
<smoser> its also possible they're unrelated.
<rbasak> I understand that you have too much to do in too little time. I didn't mean to imply that you were slacking! I'm just not sure how to proceed right now and this issue completely blocks me
<smoser> rbasak, i'm not complaining.
<smoser> rbasak, so there are 3 issues here.
<smoser> a.) 1031065 documents the fact that cloud-init should not have 'start networking' as it does. but removal of that breaks booting under lxc. we need to fix that.
<smoser> b.) your issue, which seems unrelated to me, but may be the core cause of a.
<smoser> c.) nothing in the ephemeral images is going to force 'eth0' to be "pxe booted interface". however we assume that.
<smoser> i dont have a bug opened for 'c', but i'd call that critical too
<rbasak> As an aside, there's no support for ipappend in U-Boot right now
<smoser> rbasak, right. so i'm not sure how we'd solve that for arm, but the solution is possible for intel
<rbasak> I want to propose that rather than getting that through, we instead have the MAAS dynamic TFTP server just supply the MAC address of the node it is responding for in the kernel command line, if that will work
<smoser> rbasak, the tftp server is a IP application, no?
<smoser> would it necessarily (without arp hackery) know the mac of the client?
<smoser> and if it could figure that out, i suspect it would still break any case where the client was not on the same network
<rbasak> I think arp hackery may be needed to make that work, but I think that would be cleaner than ipappend
<rbasak> I'd like to assume that the tftp server is on the same network if I am permitted to do that
<rbasak> It seems unclean to me for the node to boot and re-dhcp and then assume that the pxelinux supplied IP is the same as the one on the correct interface that it dhcp'd
<rbasak> Oh
<rbasak> The TFTP currently does know the mac of the client
<rbasak> it's in the pxelinux.cfg/01-<mac>
<rbasak> that it tried to fetch
<rbasak> Only catch is that if I have it fall back to default for arch detection then that will break
<rbasak> (without keeping some state which is horrible)
<smoser> tftp does work generically over ip
<smoser> so mac cannot actually be assumed i dont think. unless its part of the tftp protocol
<smoser> (ie, inside the packet)
<rbasak> Oh wait - you're using ipappend 2?
<rbasak> for the mac directly?
<smoser> rbasak, i believe we use http://www.syslinux.org/wiki/index.php/SYSLINUX#IPAPPEND_flag_val_.5BPXELINUX_only.5D
<smoser> at least previously cobbler used that, and the installer knows how to handle that.
<rbasak> OK, just checked. It's ipappend 2 which adds bootif=<mac>
<smoser> so yeah, 2
<rbasak> So a workaround for U-Boot would be to supply bootif=<mac> from maas if it knows it
<smoser> rbasak, yes. that would work.
<smoser> except where it is not known.
<rbasak> yeah
<smoser> but, as you say, that might not be a requirement.
<rbasak> like enlistment, without storing state from the previous miss :-/
<smoser> and, i'm not certain that its *not* in a tftp request
<smoser> although my argument about it being IP breaks that too
<rbasak> Just checked. TFTP doesn't include it
<rbasak> but it is supplied by pxelinux in a previous pxelinux.cfg/01-<mac> (what will be a) miss
<rbasak> would you mind if we define a missing bootif to mean eth0?
<smoser> rbasak, ok. i just see what you were saying about 01-MAC now.
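The 01-MAC idea can be illustrated with a short Python sketch. It assumes that a PXELINUX-style client asks for `pxelinux.cfg/01-` followed by its dash-separated MAC address, and that `ipappend 2` would add a `BOOTIF=` parameter in the same `01-...` form. The function name and the sample MAC are made up, and nothing here is taken from the MAAS TFTP server.

```python
import re

# Matches e.g. "pxelinux.cfg/01-00-16-3e-aa-bb-cc" (01 = Ethernet type).
_MAC_CONFIG = re.compile(
    r"^(?:.*/)?pxelinux\.cfg/01-((?:[0-9a-f]{2}-){5}[0-9a-f]{2})$",
    re.IGNORECASE,
)


def bootif_from_request(path):
    """Return a BOOTIF kernel parameter for a MAC-specific config request.

    Returns None when the path carries no MAC (for example the "default"
    fallback), which is the enlistment case discussed above.
    """
    match = _MAC_CONFIG.match(path)
    if match is None:
        return None
    return "BOOTIF=01-" + match.group(1).lower()


print(bootif_from_request("pxelinux.cfg/01-00-16-3e-aa-bb-cc"))
print(bootif_from_request("pxelinux.cfg/default"))
```

This only covers requests that name the MAC. A client that falls through to `default` still gets no boot interface hint unless the server keeps state from the earlier miss.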
<smoser> rbasak, well we can just make the fall through case "do nothing"
<smoser> but "eth0" is completely arbitrary. i think we're to the point in the kernel now that upgrades probably consistently order network adapter names on the same bus consistently
<smoser> but hard coding eth0 basically implies/enforces wiring in a specific way. which sucks. but we dont have a lot of other options.
<smoser> rbasak, how does pxe work?
<rbasak> that's a bit of an open question!
<smoser> it dhcp's , uses that IP to then do a tftp i guess.
<rbasak> Yes
<rbasak> I think the NIC it uses is hardware-defined
<rbasak> The first one on the case
<rbasak> (probably)
<rbasak> I've never seen/heard of a real server trying to PXE off a second nic, but I don't usually PXE them so I may be wrong there
<smoser> what is our tftp server?
<rbasak> It's one that's now built into maas
<rbasak> (some twisted thing)
<smoser> can you g+ really quick?
<rbasak> Sure
<roaksoax> allenap: still around?
#maas 2012-09-05
<Daviey> smoser: RE: eth0.. not only is network adapters now consistently named.. you now can't guarantee they are called eth*.  biosdevname is now enabled.
<rbasak> Daviey: would you object to: if bootif is supplied (on Intel via "ipappend 2"), then use that. Otherwise, if eth0 exists, then use that. Otherwise, fail.
<rbasak> Incidentally, internally the netlink api does have an ordering. It is possible to ask the kernel for the name of the "first" interface. But what that is is undefined of course.
<Daviey> well.. sounds reasonable.
<Daviey> not quite worked out the ramifications tbh
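rbasak's proposed rule (use bootif if supplied, otherwise eth0 if it exists, otherwise fail) can be written down as a small Python sketch. The function names, the shape of the `interfaces` mapping and the sample MACs are assumptions for illustration. Failing when a supplied bootif matches no interface is one possible reading of the rule, not something agreed in the discussion.

```python
import re


def normalise_mac(value):
    """Lower-case a MAC and drop a leading "01" hardware-type octet."""
    parts = re.split(r"[-:]", value.strip().lower())
    if len(parts) == 7 and parts[0] == "01":
        parts = parts[1:]
    return ":".join(parts)


def choose_boot_interface(cmdline, interfaces):
    """Pick the boot interface name from a kernel command line.

    interfaces maps interface names to MAC addresses (hypothetical input).
    """
    bootif = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key.lower() == "bootif":
            bootif = value
    if bootif is not None:
        wanted = normalise_mac(bootif)
        for name, mac in sorted(interfaces.items()):
            if normalise_mac(mac) == wanted:
                return name
        raise LookupError("no interface has MAC %s" % wanted)
    if "eth0" in interfaces:
        return "eth0"
    raise LookupError("no BOOTIF given and no eth0 present")


nics = {"eth0": "00:16:3e:00:00:01", "eth1": "00:16:3e:00:00:02"}
print(choose_boot_interface("ro BOOTIF=01-00-16-3e-00-00-02", nics))
print(choose_boot_interface("ro quiet", nics))
```

With biosdevname in play the eth0 fallback would simply not fire on machines whose interfaces have other names, so those machines would depend on bootif being supplied.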
<roaksoax> jtv: howdy
<roaksoax> jtv: so what's the status of MAAS DHCP?
<roaksoax> rvba: ^^
<rvba> roaksoax: hi.  jtv just put up for review two branches to add the writing of the interface config file.  We decided to start with that (i.e. do the same as for the DNS config) and get it working;  if it is possible to do the solution you suggested yesterday (fix the apparmor profile and have our own dhcp daemon) then what it will require on the upstream side will be very simple as all the config writing stuff
<rvba> will already be in place.
<roaksoax> rvba: ok, so those two branches are the ones that actually write the config to /etc/dhcp/dhcpd.conf?
<rvba> roaksoax: no, that is already done.  The config file I'm talking about is /etc/default/isc-dhcp-server where the interface is specified.
<rvba> The interface that the DHCP server should listen to that is.
<roaksoax> rvba: ah I see, cause the version I released yesterday to quantal doesn't seem to be writing a dhcp config file
<rvba> roaksoax: Jeroen will know probably better but I'll try and do some testing about that today.
<roaksoax> rvba: ok cool
<roaksoax> rvba: oh btw... was wondering if you guys were planning on working on the quantal support for maas for this cycle?
<rvba> roaksoax: yep, it's scheduled.
<roaksoax> rvba: is there an ETA?
<rvba> roaksoax: Not that I know of.
<roaksoax> rvba: ok thanks ;)
<rvba> welcome :)
<rvba> roaksoax: btw, this is bug 1013146 so if you have insights on how it should be fixed, don't hesitate to add comments on the bug.
<ubot5> Launchpad bug 1013146 in MAAS "MAAS currently only supports Ubuntu version 12.04 to be installed on the nodes." [High,Triaged] https://launchpad.net/bugs/1013146
<roaksoax> rvba: will do, started looking at it yesterday but didn't really dig deep into it
<rvba> roaksoax: ok, cool
<roaksoax> rvba: so how do you think is best to enumerate the releases?
<roaksoax> development, stable, lts, others?
<rvba> roaksoax: should we simply name the releases using the result from 'distro-info --supported'
<rvba> ?
<roaksoax> rvba: right, but are we looking to support all of the releases?
<roaksoax> rvba: or are we looking to support precise and up
<rvba> roaksoax: I thought the plan was to support precise and up yeah
<roaksoax> rvba: so how do you think we should be enumerating them here: http://paste.ubuntu.com/1187350/
<roaksoax> how can we identify them
<rvba> (I'm currently looking into the dhcp config writing pb btw)
<rvba> roaksoax: how about storing the first release we support (precise) and then get the result of 'distro-info --supported' and parse that?
<rvba> Do we really want to classify the releases like that (devel, lts) ?
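rvba's suggestion of storing the first supported release and parsing the `distro-info --supported` output might look like the Python sketch below. It assumes the command prints one codename per line, oldest first. The sample text is illustrative rather than captured output, and the function name is hypothetical.

```python
def releases_from(distro_info_output, first_supported="precise"):
    """Return the codenames from first_supported onwards.

    Expects the text printed by the command, one codename per line,
    ordered oldest first (an assumption of this sketch).
    """
    names = [
        line.strip()
        for line in distro_info_output.splitlines()
        if line.strip()
    ]
    if first_supported not in names:
        raise ValueError("%s is not in the list" % first_supported)
    return names[names.index(first_supported):]


# Illustrative input only, not real command output.
sample = "lucid\nnatty\noneiric\nprecise\nquantal\n"
print(releases_from(sample))
```

Keeping the parsing separate from running the command makes the cut-off logic easy to test without the distro-info package installed.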
<roaksoax> rvba: if not, how can we make the attributes of the class be the release name itself?
<roaksoax> rvba: it is not that we can do exactly as we do with the Architecture
<roaksoax> that we can easily identify
<roaksoax> rvba: unless for now we handle them like that, and then we find a better way
<roaksoax> just to test quantal and precise
<rvba> roaksoax: sounds good to me.  What we need now is a quick way to test things with quantal.  We can always refine on how we get the list of the releases later.
<roaksoax> indeed
<rvba> roaksoax: were you looking for something like this? http://paste.ubuntu.com/1187365/
<rvba> ?
<roaksoax> rvba: ah yes!! thanks!
<rvba> welcome :)
<rvba> roaksoax: I think I found why the dhcp config is not being written: see bug 1046397.
<ubot5> Launchpad bug 1046397 in maas (Ubuntu) "The DHCP config file does not get written." [Undecided,New] https://launchpad.net/bugs/1046397
<roaksoax> rvba: awesome!!
<roaksoax> thanks
<rvba> roaksoax: I'll let you triage that bug and, if you think my suggestion is right, then the fix should be pretty straightforward.
<roaksoax> rvba: seems the right fix to me
<roaksoax> rvba: so now the only "issue" would be how to correctly display the problem
<roaksoax> err
<roaksoax> the release name
<roaksoax> as in "Precise Pangolin (12.04 - LTS)"
<roaksoax> or similar
<roaksoax> allenap: still around?
* fjlacoste changed the topic of #maas to: 5 weeks until Final Freeze | Discussion of upstream development of Ubuntu's Metal as a Service (MAAS) tool | MAAS jenkins: https://jenkins.qa.ubuntu.com/job/maas-trunk/
<robbiew> RoAkSoAx: should a 'sudo dpkg-reconfigure maas' also restart apache2 afterwards?
<roaksoax> robbiew: it should yes!
<robbiew> hmm...didn't for me...but maybe it was a fluke
 * robbiew tests again
<roaksoax> robbiew: it won't
<roaksoax> I removed that
<roaksoax>  but will add it again
<robbiew> lol
<roaksoax> just checked :)
<robbiew> ;)
 * robbiew checks the box for today for "contribution to opensource"...and goes to lunch
<robbiew> :P
<roaksoax> hahaha
<roaksoax> have a good one
<robbiew> thx
<roaksoax> cause
 * roaksoax is still contributing :)
<robbiew> I admire such over achievers
<roaksoax> lol
<Daviey> robbiew: RMS is proud of you today.
<allenap> roaksoax: Hi there, can I help? I have just a couple of minutes though :-/
<allenap> roaksoax: If you do need me, email and I'll try and reply this evening, or get me tomorrow before about 1630 UTC. Have a good evening!
<roaksoax> allenap: no worries, enjoy your evening
#maas 2012-09-06
<robotfuel> I've uninstalled maas and now I can't re-install it. I get an error message about password authentication failed for "maas" when I try.
<jtv> smoser: would you be around for some help with omshell by any chance?
<jtv> roaksoax maybe?
<lifeless> bigjools: hi
<lifeless> bigjools: shall we dance ?
<jtv> Oh hi lifeless
<jtv> He may still be lunching.
<lifeless> kk
<bigjools> lifeless: I iz back
<lifeless> cool
<lifeless> video ?
<bigjools> yarp, hang on
<bigjools> jtv, want to join?
<jtv> ?
<bigjools> the call
<bigjools> about architecture
<jtv> uh sure
<jtv> plus?
<jtv> bigjools: stuck again with dhcpd.  Error reporting is minimal.  I get an I/O error whenever I try a host with its IP address as the name.
<jtv> Which could mean just about anything.  There's certainly no attempt at I/O.
<jtv> At least, not at the level that strace would see.
<jtv> The server receives my omshell request, and replies with an I/O error.  That's pretty much all that strace sees.
<jtv> (Surprisingly, dhcpd checks for errors from fprintf/fputs/fputc by seeing if they set errno -- but ignoring return value.  I thought those functions were allowed to set a nonzero errno on successful runs, in which case the application should ignore it.)
<bigjools> jtv: paste session
<jtv> bigjools: http://paste.ubuntu.com/1188377/
<jtv> Problem is, dhcpd can decide that the leases file isn't sane.  The logging in the source doesn't look very helpful, but I might try getting at it.
<bigjools> jtv: you need quotes around the name value
<bigjools>                                                               
<bigjools> > set
<bigjools> argh
<bigjools> set name = "192.168.35.70"
<jtv> Grrr.  It parses an IP address quite happily.
<bigjools> yes
<jtv> Didn't see the quotes in the template, either.
<bigjools> then it's buggy :/
<jtv> So not clear where they come from in our Omshell class.
<jtv> But then how could it possibly work?
<jtv> Oh well, at least I managed to reproduce the problem with a quoted name.  Just not with the way we seem to do it in the Omshell code.
<jtv> Aren't you getting Omshell errors in your Q/A setup, with manage_dhcp enabled?
<bigjools> omshell wasn;t getting called the last time I QAed
<bigjools> you fixed it recently
<jtv> That'd do it.
<jtv> Shall I just file a bug then, so that we at least check this?
<bigjools> we also need
<bigjools> set hardware-type = 1
<bigjools> which is missing ATM
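The two fixes discussed above (quoting the name value and adding the hardware type) can be sketched as a small helper that renders omshell input. This is only an illustration, not MAAS's actual Omshell class: the function name, the default server address and port, the key placeholders, and the sample MAC address are all assumptions. Only the omshell statements themselves come from the session above or from standard omshell usage.

```python
def render_add_host(ip_address, mac_address, server="127.0.0.1",
                    port=7911, key_name="omapi_key", key_secret="<secret>"):
    """Render omshell input that creates a host map (illustrative sketch).

    The defaults for server, port and key are placeholders, not MAAS values.
    """
    lines = [
        "server {}".format(server),
        "port {}".format(port),
        "key {} {}".format(key_name, key_secret),
        "connect",
        "new host",
        # The name is just a fixed identifier, but it must be a quoted
        # string; in the session above the unquoted form failed.
        'set name = "{}"'.format(ip_address),
        "set ip-address = {}".format(ip_address),
        "set hardware-address = {}".format(mac_address),
        # Hardware type 1 is Ethernet.
        "set hardware-type = 1",
        "create",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical MAC address, used only to show the rendered script.
script = render_add_host("192.168.35.70", "00:16:3e:aa:bb:cc")
```

Rendering the whole script as one string keeps the existing "send one chunk of input to omshell" approach intact.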
 * jtv files one bug, 2 cards
<bigjools> jtv: "it would be good to see confirmation in Q/A" What?
<jtv> The "I/O error" I'm getting might still be something weird in my setup.  For whatever reason the omshell just accepts an IP address there, and parses & represents it in its own format.
<jtv> Maybe it's just too generic and accepts any kind of argument though.
<jtv> Anyway, I suspect we'd be wanting the generated hostname there, right?
<bigjools> jtv: I recreated locally
<bigjools> quoting fixes
<bigjools> jtv: hostname is irrelevant actually
<jtv> Yes, I just don't know what the reason was why we had the ip address there.
<bigjools> jtv: it just needs to be a fixed identifier
<jtv> If hostname is irrelevant, do we need it here at all!?
<jtv> Ah
<bigjools> IP address works fine
<jtv> Puzzling stuff!
<bigjools> omapi is shite
<jtv> I'll note it on the bug.
<jtv> It's from an age when getting a string all the way to the user was a job in itself.
<bigjools> mmyes
<jtv> But I don't see how the errno handling could have been correct or portable.
<jtv> errno = 0;
<jtv> fprintf(...);
<jtv> if (errno) handle_error();
<jtv> I've been bitten by that.  AFAIK standard-library functions can set a nonzero errno even on success.
<jtv> Well, since we've discussed it now, I'll just write up that fix.
<jtv> roaksoax, are you around?
<jtv> Too early, probably -- sorry for waking you, better go back to sleep.  :)
<jtv> Daviey, maybe you can help?  I'm creating a custom upstart job to run dhcpd with our own defaults & config, and updating HACKING.txt.  I have a few questions about it.
<roaksoax> jtv: not yet oficially but what's up :)
<jtv> Good morning :0
<jtv> :)
<roaksoax> morning :)
<roaksoax> jtv: what are your questions about it?
<jtv> I'm creating a custom upstart job, so that we can use custom config files that we're free to modify.
<jtv> Questions:
<jtv>  - Where do I put the upstart script?  In the maas package, or in the packaging branch, or elsewhere?
<jtv>  - Where do I put the new /etc/default/maas-dhcp-server config/script file?
<roaksoax> jtv: in the packaging branch
<jtv> Ah OK
<roaksoax> jtv: look at debian/maas.maas-celery.upstart
<roaksoax> jtv: look at debian/maas.maas-*.upstart
<jtv>  - Are we keeping the maas-dhcp package?  I'm updating HACKING.txt but just installing isc-dhcp-server won't do it any more.  You'll need the custom stuff installed.
<roaksoax> jtv: what's custom stuff?
<jtv> Ahh and I just add my own maas.maas-dhcp-server.upstart ?
<jtv> The custom upstart script and the custom /etc/default/... file.
<roaksoax> jtv: yes, well kind of. It would be: maas-dhcp.maas-dhcp-server.upstart
<jtv> Wow :)
<jtv> So that answers my question about the maas-dhcp package.
<roaksoax> jtv: and about the default, i think you can also do maas-dhcp.default
<roaksoax> and the package system will take care of installing them in the appropriate places
<jtv> Will it all automatically be owned by maas:maas?
<roaksoax> jtv: nope
<jtv> For files we want to write to, I thought it'd make sense to skip the sudo dance while we're at it.
<jtv> Say, maybe it would be good to have a quick face-to-face chat about it?
<roaksoax> jtv: so the idea is to write files in a maas owned location
<jtv> That way I can say I had a proper pre-imp call.  :)
<roaksoax> and dhcp would just source from there
<roaksoax> jtv: sure, give me 5
 * jtv gives roaksoax 5
<roaksoax> jtv: ready
<jtv> See if this works... https://plus.google.com/hangouts/_/51b1cfe53b7b186ae3519d28fac1eff5b7ce2bdb?authuser=0&hl=en-GB
<smoser> rbasak, https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1031065
<ubot5> Ubuntu bug 1031065 in cloud-init (Ubuntu) "cloud-init-nonet runs 'start networking' explicitly" [Medium,Triaged]
<smoser> right now i'm suspecting that you're hitting the other race condition that slangasek mentioned.
<smoser> that you have /run but not "virtual-filesystems".
<smoser> hm..
<smoser> yeah. thats the issue.
<smoser> i wonder which of the virtual filesystems you're not getting.
<smoser> rbasak, can you get the same log, but with '--verbose' added to the mountall command line in /etc/init/mountall.conf ?
<rbasak> smoser: sure. Though I'm running some tests right now - I'll need to do it a bit later
<rbasak> smoser: you still want the upstart --verbose too?
<smoser> yeah.
<rbasak> OK
<jtv> roaksoax: turns out the only real sticking point with the existing apparmor profile for dhcpd is the pidfile.  And I can't find any documentation to support having multiple profiles for the same executable...  what if we just hardlinked /usr/sbin/maas-dhcpd to /usr/sbin/dhcpd and created a new profile?
<jtv> Oh hi there smoser -- thanks for spotting the fatal flaw in our grand config-writing scheme.  Currently working around it.
<smoser> jtv, i talked with jdstrand about the apport changes.
<smoser> and he's fine with them.
<jtv> s/apport/apparmor/ ?
<smoser> ie, adding a sectoin to the apport stanza to allow maas to have well defined config.
<smoser> yes
<smoser> sorry
<smoser> i also use the words 'landscape' and 'launchpad' interchangeably
<jtv> Burn in hell!
<jtv> You unox geeks are all the same.
<roaksoax> lol
<jtv> Rolix, Relix -- what's the difference?
<roaksoax> jtv maybde jdstrand knows better
<roaksoax> smoser but that doesnt solve precise
<smoser> we can solve that.
<smoser> but i'm hesitent to tell you how
<smoser> because i think you'll like the idea for quantal also
<jtv> And that's bad?  Now I'm curious.  :)
<smoser> hold on, collecting data
<roaksoax> k
<smoser> ok . so /etc/apparmor.d/usr.sbin.dhcpd is pastebinned at http://paste.ubuntu.com/1189094/
<smoser> note lines 57 and 58
<smoser> i believe that renders to "/etc/apparmor.d/local/usr.sbin.dhcpd" (i'm not sure of the path, but obvioysly it does *something*)
<smoser> that file is *not* a config file, so its more ok for maas to poke at it
<jtv> Why isn't it a config file?
<jtv> (It exists -- I looked at it earlier)
<smoser> hm.. strange that it exists.
<jtv> Placeholder comment, basically.
<jtv> I guess mostly to stop the #include from breaking.
<jtv> But yes, I had that A-HAH moment too.  :(
<smoser> hm..
<smoser> well jdstrand suggested it.
<smoser> if it is a conf file, but you are explicitly expecting someone to update it, then thats broken by design
<jtv> Or BSD, depending on your nomenclature.
<smoser> well, maybe its not so bad.
<smoser> the conf file prompt comes when you have local changes to a file and hte package has changes to the file
<smoser> it would seem *very* unlikely for the package to ship a change to that header stub
<smoser> and if it did, the package maintainer would surely be aware that they were going to force config file prompts for all users.
<smoser> so they'd want to avoid if at all possible.
<smoser> anyway... i'd much rather see us change that file, and would seem ok for us to do that for precise.
<smoser> reasonable?
<jtv> Wouldn't the hardlink solution make it all a bit easier?
<jtv> Assuming it would even work.
<smoser> well.
<smoser> a.) we shouldn't just look for the easiest solution
<smoser> b.) how would you package the hardlink?
<jtv> Create it in postinst?
<smoser> then you have an unpackaged file in /usr/sbin
<smoser> which screams WHAT_THE_%*#*_IS_THIS!
<jtv> Ship a dummy, replace with hardlink?
<smoser> then you have a file in /usr/bin that looks like it has been modified if you check checksums of your package
<smoser> which, again screams *#*$**
<jtv> True, true -- but it's about the most policy-compliant trick we have so far, no?
<smoser> i think editing the local file is ok. imo.
<smoser> with hardlink, what happens on upgrade of package?
<jtv> Heh.
<smoser> with upgrade of isc-dchp-server?
<jtv> You're right.
<smoser> so my suggestion is update apparmor profile for quantal.
<smoser> and we can even ask security team if they're ok with that change SRU to precise
<jtv> Just add the maas directories & pidfile?
<smoser> yeah. basically like eucalyptus
<smoser> at least that is seemingly a designed solution to start off of
 * jtv was tempted, so tempted, to jam our own pidfile into /run/eucalyptus
<jtv> Actually the pidfile looks like the only one that we really can't circumvent decently.
<jtv> There is some wiggle room for our own leases and config files.
<smoser> jtv, well we were still running into permission issues on those files
<smoser> but we could work around that.
<jtv> We could own them.
<smoser> well, there were issues.
<smoser> at least when i actually tried this.
<jtv> The upstart script ensures ownership/permissions of the leases file (and backup).
<jtv> I imagine maas installation would take care of ownership/permissions of the config file.
<smoser> you're saying running as the isc-dhcp user?
<jtv> I was thinking of continuing to run dhcpd as dhcp, yes.
<jtv> Make the config file owned by maas, but group dhcp.
<smoser> if we can separate and run our own, i really think thats the best.
<jtv> Or maas:maas for all I care, as long as dhcpd can read it.
<smoser> then you can know that you can just kill it
<smoser> when the person pusshes the button "do not run dhcpd"
<smoser> its is cleaner separation. and well known path.
<jtv> Wait... are you saying running as dhcp is best?  Or that running as maas is best?
<smoser> i'm saying maas running/maintaining its own dhcpd is best.
<smoser> *not* using the isc-dhcpd
<jtv> Oh, yes, that's what I'm coding up.
<jtv> Well, it uses the isc-dhcpd *binary* of course.  But not its config etc.
<smoser> right.
<jtv> But we need at least a new pidfile in the apparmor profile, I think.  Everything else we _could_ keep as-is afaics.
<smoser> you could use /ltsp :)
 * jtv snarls at smoser
<jtv> I guess I'll just add rules for /etc/maas/dhcpd.conf, /var/lib/maas/dhcpd.leases, and _something_ in /run.
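The three rules mentioned above could be generated by a helper along these lines. This is a hedged sketch, not the code from jtv's branch: the function name and the permission strings are assumptions loosely modelled on how a dhcpd profile typically grants access, and the pidfile path is deliberately left to the caller because the discussion only says "_something_ in /run".

```python
def render_dhcpd_rules(pidfile, config="/etc/maas/dhcpd.conf",
                       leases="/var/lib/maas/dhcpd.leases"):
    """Render AppArmor rules for a MAAS-run dhcpd (illustrative sketch)."""
    rules = [
        (config, "r"),          # dhcpd only needs to read its config
        (leases + "*", "lrw"),  # leases file plus its backup/temp siblings
        (pidfile, "rw"),        # pidfile location is supplied by the caller
    ]
    # AppArmor file rules are "<path> <permissions>," one per line.
    return "".join("  {} {},\n".format(path, perms) for path, perms in rules)


# "/run/maas/dhcpd.pid" is a made-up path for illustration only.
snippet = render_dhcpd_rules("/run/maas/dhcpd.pid")
```

The output is meant to be written into the local extension file for the dhcpd profile rather than into the packaged profile itself.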
<roaksoax> rvba: around?
<roaksoax> jtv: still around?
<roaksoax> allenap: are you?
<allenap> roaksoax: I'm just passing (it's dinner time here). Email :-/
<roaksoax> allenap hehe just left for lunch myself im from the phone
<roaksoax> allenap just wanted to know how to create a db migration to be able to add distro release support
<roaksoax> if theres  template or what
<rvba> roaksoax: have a look at HACKING.txt, it's a one line command once you've made the change to the model code.
<roaksoax> rvba ok cool thanks
<smoser> roaksoax, ok. you around
<roaksoax> smoser: nope, this is an automated message
<roaksoax> smoser: hehe, what's u?
<roaksoax> up*
<smoser> ok. so where can i get your dhcp fixed maas?
<roaksoax> smoser: it is in the quantal archives, just not on the cd
<smoser> ah. ok.
<smoser> so we have generally functional dhcpd?
<smoser> how?
<roaksoax> smoser: maas-provision binary is the one generating the config
<roaksoax> so we just needed sudoers access to it
<roaksoax> not the best approach obviously, but enough for now until we have the other fixes in place
<smoser> roaksoax, ok. i'm gonna try to walk through this, so i'll probably bother you again
<roaksoax> alrighty
<smoser> roaksoax, ok.
<smoser> so bug 1044229 is marked as fixed
<ubot5> Launchpad bug 1044229 in MAAS "DHCP config doesn't get written unless an inhuman combination of scripts is run" [Critical,Fix released] https://launchpad.net/bugs/1044229
<smoser> and that was a dupe of bug 1044061
<ubot5> Launchpad bug 1044229 in MAAS "duplicate for #1044061 DHCP config doesn't get written unless an inhuman combination of scripts is run" [Critical,Fix released] https://launchpad.net/bugs/1044229
<smoser> so, how do i enable maas dhcp ?
<smoser> oh. fun. from the api
<smoser> any other way than api and ui ?
<smoser> hm..
<smoser> and i go to my maas ui and it has 'manage dhcp' but ihave no dhcp process running
<smoser> roaksoax, ?
<roaksoax> smoser give me a sec plz
<smoser> deal
<smoser> issue is that dhcpd is not configured to listen on any interfaces
<roaksoax> smoser you enable it by installing maas-dhcp
<smoser> $ dpkg-query --show maas-dhcp
<smoser> maas-dhcp       0.1+bzr971+dfsg-0ubuntu1
<smoser> button is highlighed in maas web ui
<smoser> maas-dhcp installed
<smoser> no isc-dhcp process
<smoser> http://paste.ubuntu.com/1189590/
<smoser> that is everythign i've done so far
<smoser> (except not added daily builds)
<smoser> as you told me archive updates was good
<smoser> roaksoax, what am i doing wrong?
<roaksoax> smoser did you accept the creation of the dhcp network
<smoser> </dev/null
<smoser> so, no
<roaksoax> smoser so when you install maas-dhcp and say yes and configure the subnet then it gets enabled
<smoser> i just dpkg-reconifgure maas-dhcp
<smoser> and isc-dhcp-server still wont run
<smoser> what am i doing wrong?
<matsubara> smoser, roaksoax: I just reproduced the same thing in the lenovo lab
<matsubara> installed the latest package, answered the dhcp config questions but isc-dhcp-server keeps respawning saying there's not subnet declaration for eth0, eth1 or eth2
<matsubara> and /etc/dhcp/ doesn't have any maas config
<smoser> roaksoax, is the version i have listed above sufficient?
<smoser> yeah. i have latest package per https://launchpad.net/ubuntu/+source/maas (44 minutes old)
<smoser> oh wait. i have 0ubuntu1 and 0ubuntu2 is 44 minutes old
<matsubara> I'm using 0.1+bzr971+dfsg-0+972+74~ppa0~precise1 from https://code.launchpad.net/~maas-maintainers/+archive/dailybuilds
<matsubara> actually: 0.1+bzr971+dfsg-0+975+74~ppa0~precise1	
<matsubara> which is the latest latest crack
<matsubara> :-)
<smoser> yeah, an di just by hand applied https://launchpadlibrarian.net/114878785/maas_0.1%2Bbzr971%2Bdfsg-0ubuntu1_0.1%2Bbzr971%2Bdfsg-0ubuntu2.diff.gz
<smoser> to no avail
<roaksoax> smoser: ok i'm back
<smoser> k
<smoser> i can let you in if you want to poke around, but i
<smoser> i've basically done nothing other than that pb
<roaksoax> smoser: ok so, the first time you install maas-dhcp, if you configure it, and dhcp is disabled, then it will be enabled
<smoser> so what are you saying ?
<smoser> its enabled.
<smoser> thats fine.
<smoser> it doesnt run
<smoser> or do anything
<smoser> thats not.
<roaksoax> smoser: right, but is it enabled on the WebUI?
<smoser> yes
<roaksoax> smoser: and no config file in /etc/dhcp/dhcpd.conf?
<roaksoax> smoser: have you rebooted the machine?
<roaksoax> err
<roaksoax> i mean, restart apache2?
<roaksoax> or pserv or celery?
<smoser> well, /etc/dhcp/dhcpd.conf exists. as ait always did.
<roaksoax> smoser: right, but maas should have written a config to it
<smoser> i did for p in celery pserv txlongpoll; do sudo restart maas-$p; done
<smoser> and have actually rebooted since installing maas.
<roaksoax> let me see
<smoser> ubuntu@10.55.60.167
<smoser> [2012-09-06 20:44:27,803: INFO/PoolWorker-3] Not sending DHCP leases to server: not all required knowledge received from server yet.  Missing: api_credentials, maas_url, nodegroup_name
<roaksoax> smoser: Version: 0.1+bzr971+dfsg-0ubuntu1
<roaksoax> that's the reason
<roaksoax> it is not ubuntu2
<matsubara> roaksoax, I get the same behaviour (not the same log entry though) on 0.1+bzr971+dfsg-0+975+74~ppa0~precise1
<roaksoax> matsubara: the fis is on packaging, not on upstream
<roaksoax> s/fis/fix
<matsubara> ah ok
<smoser> roaksoax, i manually added the sudoers
<matsubara> what's the package revno I should be using to build with?
<smoser> and restarted since then
<roaksoax> smoser: that should work then
<smoser> you can look.
<smoser> $ sudo cat /etc/sudoers.d/99-maas-sudoers
<smoser> maas ALL= NOPASSWD: /usr/sbin/service isc-dhcp-server restart
<smoser> maas ALL= NOPASSWD: /usr/sbin/maas-provision
<roaksoax> matsubara: the release ubuntu2
<roaksoax> matsubara: 1 before last one i think
<roaksoax> smoser: disable/enable from the web ui
<smoser> $ ls -altr /etc/dhcp/dhcpd.conf
<smoser> -rw-r--r-- 1 root root 3602 Jul 10 21:29 /etc/dhcp/dhcpd.conf
<roaksoax> smoser: lket me check locally first as network is crappy
<roaksoax> again
<smoser> (Jul 10). no dice
<roaksoax> smoser: is DNS enabled?
<smoser> i have 2 nice check boxes checked
<smoser> "Enable Dns", "Manage DHCP".  They're really pretty, and orange.
<roaksoax> smoser:     return self.run(*args, **kwargs)
<roaksoax>   File "/usr/lib/python2.7/dist-packages/provisioningserver/tasks.py", line 322, in restart_dhcp_server
<roaksoax>     check_call(['sudo', 'service', 'isc-dhcp-server', 'restart'])
<roaksoax>   File "/usr/lib/python2.7/subprocess.py", line 511, in check_call
<roaksoax>     raise CalledProcessError(retcode, cmd)
<roaksoax> CalledProcessError: CalledProcessError
<roaksoax> weird
<smoser> where do you see that?
<smoser> duh.
<smoser> not wierd
<smoser> "sudo service"
<smoser> maas can't sudo
<roaksoax> yeah
<roaksoax> which is weird
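The traceback above only reports "CalledProcessError", which hides which command failed and with what status. A minimal sketch of a wrapper that surfaces both is shown below; it is not the code in provisioningserver/tasks.py, and the function name and message wording are assumptions.

```python
import subprocess
import sys


def run_checked(command):
    """Run a command; return (ok, message) instead of raising."""
    try:
        subprocess.check_call(command)
    except subprocess.CalledProcessError as error:
        # error.cmd and error.returncode identify what failed and how.
        return False, "command {!r} exited with status {}".format(
            error.cmd, error.returncode)
    except OSError as error:
        # Raised when the executable itself cannot be started.
        return False, "could not start {!r}: {}".format(command, error)
    return True, "ok"


# Demonstrate with a child Python process that exits non-zero.
ok, message = run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Logging a message like this for the `sudo service isc-dhcp-server restart` call would make a missing sudoers entry easier to recognise in the celery log.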
<roaksoax> smoser: and magically, it now works
<smoser> it can though
<smoser> it is in /etc/sudoers.d/
<smoser> for maas
<smoser> what magically works ?
<roaksoax> smoser:
<roaksoax> smoser: DHCP config was written now
<smoser> /etc/dhcp/dhcpd.conf ?
<smoser> on my system?
<roaksoax> smoser: http://paste.ubuntu.com/1189688/
<roaksoax> smoser: on my test env
<roaksoax> let me access yours
<smoser> ubuntu@10.55.60.167
<roaksoax> smoser: can you disable/enable dhcp config again please
<roaksoax> from maas webui
<smoser> sure
<smoser> disabled
<smoser> enabled
<roaksoax> smoser: so just did a dpkg-reconfigure maas-dhcp and it worked
<smoser> hm..
<roaksoax> smoser: seems to be (wild guess) that when maas gets installed (maas-dhcp really) it tries to create the DHCP database, and enables it (that's why it is enabled on webui). However, since the creation of the dhcpd config file fails (because maas-provision not in sudoers) the dhcp client network doesn't become permanent
<roaksoax> so when we add maas-provision to sudoers
<roaksoax> and dpkg-reconfigure maas-dhcp
<roaksoax> it succeeds in creating
<roaksoax> hence the network is created
<smoser> i will check your theory
<roaksoax> cool
 * roaksoax will try to finish the quantla support
<smoser> hm..
<smoser> it seems to have worked.
<smoser> although i think i'd done that before
<smoser> where did you see the stack trace you showed above?
<roaksoax> smoser: celery log
<roaksoax> /var/log/maas/celery.log
<smoser> hm.. i never saw that htere.
<smoser> matsubara, could you open a bug based on the above?
<roaksoax> smoser: there's  no bug really, it is fixed
<roaksoax> smoser: the problem ois that it has not been released
<roaksoax> in the archives
<roaksoax> smoser: https://launchpad.net/ubuntu/+source/maas/0.1+bzr971+dfsg-0ubuntu2
<smoser> http://paste.ubuntu.com/1189725/
<smoser> roaksoax, ^
<smoser> file gets written now, but it is bad
<roaksoax> smoser: isn't that wrong configuration
<roaksoax> smoser: ah yes, it is badupstream side
<smoser> http://paste.ubuntu.com/1189729/
<matsubara> smoser, roaksoax I think there's another bug. the dhcp config is written but next-server is set to 127.0.0.1, is that correct?
<smoser> right.
<smoser> i was going to point that out
<smoser> ok
 * smoser has to go now
<roaksoax> matsubara: *and* I think there's another issue
<roaksoax> subnet 192.168.1.5 netmask 255.255.255.0
<roaksoax> that should be the network address not the starting range
<roaksoax> maas-provision asks for starting range
<roaksoax> and should probably calculate the network address
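Deriving the network address from any address in the range plus the netmask is a small calculation. The sketch below uses Python's standard `ipaddress` module (Python 3); the function name is an assumption and this is not what maas-provision actually does.

```python
import ipaddress


def network_address(any_ip_in_subnet, netmask):
    """Return the network address for an IP and dotted-quad netmask."""
    # strict=False lets us pass a host address; the host bits are masked off.
    network = ipaddress.ip_network(
        "{}/{}".format(any_ip_in_subnet, netmask), strict=False)
    return network.network_address


addr = network_address("192.168.1.5", "255.255.255.0")
```

For the values quoted above, the subnet declaration would then use 192.168.1.0 rather than the start of the range.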
<matsubara> roaksoax, this is the config the package wrote for me: https://pastebin.canonical.com/73987/
<matsubara> roaksoax, this https://code.launchpad.net/~maas-maintainers/maas/packaging.precise has the latest fix you mention that's included in ubuntu2 right? My package was built from that source
<roaksoax> matsubara: try dpkg-reconfigure maas
<matsubara> roaksoax, ok. it offered to change the MAAS pxe address.
<roaksoax> matsubara: err i meant maas-dhcp
<roaksoax> matsubara: that's a bug for sure when it tells next-server
<roaksoax> or maybe not as it binds to localhost
<roaksoax> we'll have to do some real pxe boot testing on that
<matsubara> roaksoax, ok, reconfigured the maas-dhcp package, it asked me the same questions asked when I installed and the dhcpd.conf still have next-server as 127.0.0.1
<roaksoax> matsubara: that's a bug for sure
<matsubara> roaksoax, that's what I'm doing :) when I pxe boot, the node gets a TFTP open timeout
<roaksoax> yeah
<matsubara> roaksoax, I'm going to file it, is it in the packaging or upstream?
<roaksoax> matsubara: upstream
<matsubara> https://bugs.launchpad.net/maas/+bug/1047061
<ubot5> Ubuntu bug 1047061 in MAAS "dhcpd.conf next-server set to 127.0.0.1" [Undecided,New]
<roaksoax> cool
<roaksoax> anyways
<roaksoax> im off
<roaksoax> later
<matsubara> roaksoax, have a good one. Thanks!
#maas 2012-09-07
<roaksoax> jtv: around yet?
<roaksoax> smoser: ubuntu releases support works successfully :D
<bigjools> roaksoax: \o/
<jtv> Hi roaksoax -- sorry I missed you
<jtv> bigjools: "seeing if an object exists" in omshell will complicate our code, in that we need to send partial input, evaluate output, then decide what input to send next.  My original delete-first approach looks much simpler.
<bigjools> jtv: it is also racy
<bigjools> it;s really not hard to check first
<bigjools> two lines of omshell
<jtv> The omshell part is easy.  See above.
<jtv> Wait, I have an idea.
<jtv> But it's ugly.
<bigjools> it's all easy
<bigjools> I don;t understand your reticence
<bigjools> but perhaps I am missing something you've seen
<jtv> Well there's no conditionals in omshell.
<jtv> So we can't keep sending one chunk of omshell code the way we're doing.
<bigjools> you don't need to do it all in omshell
<jtv> That's what I'm saying.
<jtv> If we want to check first, we can't keep sending one chunk of omshell code the way we're doing.
<bigjools> so what's the problem with that?
<bigjools> you run it once and then possibly again
<jtv> That we've been patching omshell a lot in tests.
<jtv> I can give it a try.  It's definitely not impossible, just breaking something we've been making assumptions about.
<bigjools> I can't see why it won't work, although it will require more mocking for tests
<jtv> It's definitely not impossible.
<bigjools> I guess that's the best I can get out of you :)
<jtv> It's something I've been noticing this week: a lot of communication has been breaking down.
<jtv> All I wanted to say is: we've been making a lot of assumptions about how we run omshell, and so this change may involve a lot of test changes.  That's the reason for my reticence.  I can try it, but I needed to discuss alternatives first.
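The "run it once and then possibly again" flow under discussion can be sketched as follows. Everything here is hypothetical: the runner, the two scripts, and the `is_missing` predicate are injected by the caller, because the exact omshell output for a missing object is not shown in this log. Note that a check followed by a create is still two separate steps, so it does not by itself remove races.

```python
def ensure_host(run_omshell, open_script, create_script, is_missing):
    """Check for an object first, then create it only if it is missing.

    run_omshell: callable taking omshell input and returning its output.
    is_missing: callable deciding, from that output, whether the object
    does not exist yet (the matching rule is left to the caller).
    """
    first_output = run_omshell(open_script)
    if not is_missing(first_output):
        return "exists"
    run_omshell(create_script)
    return "created"


# Fake runner standing in for a real omshell process in tests.
calls = []


def fake_run(script):
    calls.append(script)
    return "MISSING" if script == "open" else ""


result = ensure_host(fake_run, "open", "create", lambda out: "MISSING" in out)
```

The fake runner shows the extra mocking the tests would need: each call has to be answered separately instead of patching a single omshell invocation.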
<bigjools> ok
<bigjools> let's reqind
<bigjools> rewind even
<jtv> rescind?
<bigjools> maybe :)
<jtv> remind.
<bigjools> the error message that you check for, let's see if you can make it appear for other scenarios
<bigjools> if it doesn't, then your original code is good IMO
<bigjools> what other scenarios? comms fail, syntax error, etc
<jtv> fsetpos failure in a trace file.
<jtv> (Seriously!  I have no idea why that generates that same error)
<bigjools> whut?
<jtv> Yeah.  :(
<bigjools> no, I mean what are you doing, I don't understand
<jtv> I went through the dhcp source to find other things that might generate this same error.
<jtv> And there were a few, in code paths that I _think_ we're not exercising.
<bigjools> ok
<jtv> I have one thing I can try.
<jtv> Hi rvba!
<bigjools> if that's the case we are probably safe
<rvba> Hi jtv, hi bigjools.
<bigjools> however, did you check what errors omshell generates itself?
<bigjools> morning rvba
<jtv> bigjools: yes, I did -- it's all the same codebase, so I just grepped the whole thing.
<bigjools> ah ok
<bigjools> jtv: in that case you have convinced me that your existing change is good
<bigjools> does it work in practice?
<jtv> When I tried it manually in omshell, yes.
<bigjools> cool
<jtv> I appreciate your critical probing; I wish I could give easy absolute certainties about that codebase.  :/
<bigjools> yeah I know :/
<bigjools> glad you appreciate it, not trying to be difficult, honest!
<jtv> "Uh, trying?"  :)
<bigjools> :p
<jtv> It's good code for its era, actually.  Clay pots, numeric error codes...
<jtv> Makes me appreciate python all the more.
<jtv> roaksoax: I've got a change up for review that will manage the discussed extension to dhcpd's apparmor profile.  We'll have to run that when we set up our dhcpd, and direct its output into /etc/apparmor.d/local/usr.sbin.dhcpd
<jtv> roaksoax: Still have to set up the custom dhcpd instance though, with its own config file etc.
<smoser> jtv, are you there?
<smoser> so, thinking about apparmor
<smoser> is there any reason that maas should write that ? rather than packaging?
<smoser> the reason i ask is that I think it might just be easier all around if
<jtv> hi smoser
<smoser> a.) in precise, we add 1 line in packaging to the local/usr.sbin.dhcpd that says "include /etc/maas/apparmor.d/usr.sbin.dhcpd" or the like
<smoser> b.) in quantal apparmor.d we get such a thing included.
<jtv> Sure, if you can swing it, for both precise and quantal.
<smoser> ie, in quantal we'll modify usr.sbin.dhcpd profile to have that include.
<smoser> then maas packaging can write the file in both cases. the only difference is we have to update that local file in precise.
<smoser> it just seemed strange to write it in maas source code. its really packaging that caused this pain :)
<jtv> Ohhh, right, I see.
<jtv> Well the maas code is only to manage the extension to the local file.
<smoser> right. but even that seems wierd.
<jtv> It could easily contain just the #include.
<smoser> i dont want to stop this from moving forward.
<smoser> but that seemed a bit less in trusive
<smoser> and it really seems wierd to me to have a daemon writing apparmor config
<jtv> Either way it's going to be done in packaging, of course.
<smoser> interesting...
<smoser> if there were an apparmor profile for maas
<smoser> it probably would *not* allow it to write/modify files in /etc/maas/apparmor.d
<jtv> It's not the daemon writing there -- it's just a piece of python that the installation scripts can call.
<smoser> because if it did... that'd be like allowing full exploit and defeat of apparmor
<smoser> hm..
<smoser> oh well.
<smoser> i think i'll just table my thoguhts for now.
<jtv> At this point, I'd rather have them soon so I can move forward a bit more before the night is over!
<jtv> Er... did that come out weird?
<jtv> What I mean is, I'm hoping to get a bit more done tonight.
<smoser> me too
<jtv> So I'd very much like these thoughts resolved.
<jtv> smoser: if you feel the approach is wrong, it's better if you say so now than to wait until we're committed.  If you feel it's OK to append and #include to the local apparmor file, without managing custom settings in there, I can scuttle my branch to support the latter.
<smoser> i think for precise we have to modify that file (the local)
<smoser> there is no other way.
<jtv> Agreed.  But how do we modify it?
<smoser> i think if it works, it makes sense to reduce your management of a common file
<smoser> by using "#include"
<smoser> right?
<jtv> Right.
<jtv> I just didn't want to assume that we can just append that to the file and never look back.
<smoser> right
<jtv> For example, we don't want to leave a broken #include around after you purge the config.
<smoser> yeah, i dont know how it handles that. and packaging on removal would have to handle removal of that if missing files cause apparmor to choke
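Adding the include on install and taking it out again on purge can be done idempotently on the text of the local file. The helper below is a sketch under stated assumptions: the function name is invented, and the include line (both its syntax and the path, which was only given as "or the like") is an example rather than a confirmed value.

```python
def set_include(profile_text, include_line, present):
    """Return profile_text with include_line present exactly once, or absent."""
    # Drop any existing copies so repeated calls never duplicate the line.
    lines = [line for line in profile_text.splitlines()
             if line.strip() != include_line]
    if present:
        lines.append(include_line)
    return "\n".join(lines) + "\n" if lines else ""


# Example include line; the path follows the suggestion in the discussion.
INCLUDE = '#include "/etc/maas/apparmor.d/usr.sbin.dhcpd"'
local = "# Site-specific additions and overrides.\n"

added = set_include(local, INCLUDE, present=True)
again = set_include(added, INCLUDE, present=True)
removed = set_include(added, INCLUDE, present=False)
```

Packaging scripts could call this with `present=True` on configure and `present=False` on purge, so no dangling include is left behind.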
<smoser> roaksoax, what do you think ?
<smoser> it seems more proper to me for packaging to basically handle apparmor
<jtv> FWIW with my code it'd be easy to disable that bit we add.
<jtv> Packaging scripts would _call_ my code, obviously.
<smoser> well, they could, yeah.
<smoser> i'll wait to see what roaksoax thinks.
<jtv> OK
<smoser> it really does to me seem like packaing's problem.
<jtv> Yes -- I'm just providing a tool for it to use.
<jtv> smoser: I'm going to go out for dinner now.  I may be some time.
<roaksoax> smoser: i'm fine with that, is the same thing done with bind
<roaksoax> jtv: is the change in question make the change in the apparmor profile, or will we have to do it (in packaging)?
<roaksoax> rvba: around?
<roaksoax> rvba: let me know when around
<rvba> roaksoax: I'm there but I'm otp
<roaksoax> rvba: ok, no worries, let me know when you're done :)
<smoser> ok.
<smoser> i have a thought.
<smoser> tell me how stupid this is.
<smoser> in commissioning environment (and enlistment)
<smoser> there is basically no way in to the image.
<smoser> and when it fails, its a PITA to debug
<smoser> so i'm thinking i will do:
<smoser> if we take the failure case...
<smoser> right now we print a message, sleep 60 seconds, poweroff
<smoser> instead, we could
<smoser> enable a password (possibly random), give a message, and say "if you do not login in the next X seconds, you will be powered off"
<roaksoax> smoser: yeah that could work for debugging purposes
<roaksoax> smoser: so, during enlistment, I waas thinking how to we chose what release to use?
<smoser> http://paste.ubuntu.com/1190988/
<smoser> roaksoax, what do you mean choose?
<roaksoax> smoser: so, while working on adding other ubuntu releases support
<roaksoax> smoser: enlistment now requires you to tell what release to use for the enlisted system
<roaksoax> smoser: so, how can we effectively choose what ubuntu release to use
<smoser> i'm confused.
<smoser> you mean as in which ephemeral image to boot ?
<roaksoax> smoser: no, it will boot in precise ephemeral image
<roaksoax> smoser: but when you enlist the node itself
<smoser> ok (that is a separate issue)
<smoser> you're saying when i enlist a system, i have to choose the operating system that will be installed on it?
<smoser> that is broken
<smoser> badly broken
<roaksoax> smoser: yes and no (that's one of the things I want to discuss with rvba)
<roaksoax> smoser: but, as it stands now, I need to specify the ubuntu release to use
<melmoth> random question of the day: if i was to deploy a nova-compute service on the node running zookeeper (just for the sake of not wasting hardware), would i regret it ?
<rvba> roaksoax: all right, I'm there.  Sorry for the delay.
<roaksoax> rvba: no worries :)
<roaksoax> rvba: ok so basically, the problem is that during enlistment we don't have to specify a release, but we always want it to set one by default
<roaksoax> rvba: so in models/node.py I did this:
<roaksoax> os_release = CharField( max_length=10, choices=UBUNTU_RELEASE_CHOICES, blank=False, default=UBUNTU_RELEASE.precise)
<roaksoax> which means that it should never be blank basically
<roaksoax> right?
<roaksoax> now, in forms.py I initially had this:
<rvba> right
<roaksoax> os_release = forms.ChoiceField( choices=UBUNTU_RELEASE_CHOICES, required=False, initial=UBUNTU_RELEASE.precise, error_messages={'invalid_choice': INVALID_UBUNTU_RELEASE_MESSAGE})
<roaksoax> err
<rvba> Note that 'blank' is purely validation-related.
<roaksoax> s/False/True
<roaksoax> so I had the above with s/False/True, then enlistment failed because we were not specifying the release
<roaksoax> so I changed it to required=False, and then enlistment no longer complained about not specifying release
<roaksoax> but there was this:
<roaksoax> ValidationError: {'release': [u'This field cannot be blank.']}
<roaksoax> which means it was related to the blank=False
<rvba> yep
<roaksoax> so my question is, shouldn't the initial= and/or default= have set the release?
<rvba> Well, that is indeed confusing stuff in Django.
<rvba> The initial=xxx stuff lets you create a form with initial values.
<rvba> As in an HTML form.
<rvba> but that's all it will do.
<rvba> So in the API where we're simply using forms for validating stuff, that won't be taken into account.
<roaksoax> rvba: right, but what about the default= on node.py
<roaksoax> rvba: or better yet, how do you think this problem should be addressed?
<rvba> roaksoax: the form should be responsible for setting the default.  Let me find an example.
<rvba> roaksoax: if you look at WithMACAddressesMixin which is a mixin used to extend node-adding forms, you'll see that in save(), we set the default hostname if the one provided is the empty string.
<rvba> roaksoax: that's in src/maasserver/forms.py
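The point rvba makes above (initial= only pre-fills a rendered form, so code that validates API input has to supply the default itself) can be sketched in plain Python. This is not MAAS or Django code: `clean_os_release`, `SUPPORTED_RELEASES`, `DEFAULT_RELEASE` and the "quantal" entry are hypothetical names chosen for illustration.

```python
# Hypothetical list of supported releases; "quantal" is only an example entry.
SUPPORTED_RELEASES = ("precise", "quantal")
# Hypothetical default, mirroring the "precise" default discussed in the log.
DEFAULT_RELEASE = "precise"


def clean_os_release(submitted):
    """Validate a submitted release, filling in the default when it is missing.

    This plays the role the log assigns to the form's save(): the caller
    did not send a release, so the validation layer picks one.
    """
    if submitted in (None, ""):
        return DEFAULT_RELEASE
    if submitted not in SUPPORTED_RELEASES:
        raise ValueError("Unsupported release: %r" % (submitted,))
    return submitted
```

With this shape, an enlistment request that omits the release passes validation and ends up with the default, while an unsupported value is still rejected.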
<roaksoax> rvba: right, that's exactly where I was looking
<roaksoax> rvba: but then again, that's particularly for mac address isn't it?
<rvba> roaksoax: yes
<rvba> roaksoax: shouldn't the default come from a global config setting?
<rvba> Rather than being hardcoded I mean.
<roaksoax> rvba: yeah but if we do that, someone can simply set a non-supported release
<roaksoax> rvba: but for now I'm just doing a first step
<roaksoax> on getting this to work
<roaksoax> rvba: and thinking that the default should be "precise"
<rvba> The setting could be a dropdown menu with only the valid options.
<roaksoax> rvba: right, that's what it is
<rvba> So only supported releases would be in there right?
<roaksoax> rvba: yes
<roaksoax> rvba: everything seems to be working as expected but enlistment
<roaksoax> rvba: so that's why I was wondering where should a default value be set
<rvba> roaksoax: also, maybe you should allow Null value in there.  If this field is null, then it means to use the default distrib.
<rvba> s/distrib/release/
<roaksoax> rvba:     os_release = CharField(
<roaksoax>         max_length=10, choices=UBUNTU_RELEASE_CHOICES, null=True,
<roaksoax>         blank=True, default=UBUNTU_RELEASE.precise)
<rvba> If someone changes the default release, then all the nodes without a specific release set will use the new default transparently.
<roaksoax> rvba: right, so that would technically fix it?
<rvba> roaksoax: yes.
<rvba> roaksoax: but I would be in favor of default=None
<roaksoax> rvba: http://pastebin.ubuntu.com/1191126/
<rvba> Then if you want to change the default, change the global setting.
<roaksoax> rvba:  but default=None is no release being set
<roaksoax> rvba: meaning the node will have os_release = None
<roaksoax> rvba: each node should have a release
<roaksoax> it is just as the architecture
<rvba> roaksoax: yeah, and we would simply add a small get_os_release method which would return the default setting in that case.
<roaksoax> IMHO
<rvba> That method would be used to decide which release should be used.
<roaksoax> rvba: that would be in node.py?
<rvba> roaksoax: yes, something like http://paste.ubuntu.com/1191129/
<rvba> s/get_or_release/get_os_release/
<roaksoax> rvba: right, ok cool, I'll first test the null change stuff and see what happens and then will look into the rest
<rvba> roaksoax: ok
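The paste rvba links to is not reproduced in this log, so the following is only a guess at the idea: a node stores None when no release was chosen, and a small get_os_release method falls back to a global setting. `GLOBAL_CONFIG` and the plain `Node` class are hypothetical stand-ins, not the Django model in node.py.

```python
# Hypothetical stand-in for a global configuration store.
GLOBAL_CONFIG = {"default_os_release": "precise"}


class Node:
    """Simplified stand-in for the node model discussed in the log."""

    def __init__(self, os_release=None):
        # None means "no specific release chosen for this node".
        self.os_release = os_release

    def get_os_release(self):
        """Return the node's own release, or the global default if unset."""
        if self.os_release is None:
            return GLOBAL_CONFIG["default_os_release"]
        return self.os_release
```

Because the fallback is resolved at call time, changing the global default affects every node that has no explicit release, without touching the data stored on the nodes.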
<roaksoax> rvba: though, default should probably be distro-info --stable
<roaksoax> rvba: that's what I was thinking to do instead of UBUNTU_RELEASE.precise
<roaksoax> to simply use UBUNTU_RELEASE.default = distro-info --stable
<rvba> roaksoax: we can initialize it with that but then if we make it a configuration option it will be in the hands of the user.
<roaksoax> rvba: right, ok so I'll first get this working
<roaksoax> and then we can improve the default release
<roaksoax> and setting option
<rvba> Sounds like a plan.
<roaksoax> rvba: btw.. preseed.py would look similar too: http://paste.ubuntu.com/1191157/ (i need to update it first though)
<rvba> roaksoax: makes sense.
<roaksoax> rvba: where does the dev inst/home/roaksoax/Desktop/project/maas-whole/maas/bin/maas createsuperuser
<roaksoax> sorry
<roaksoax> errr
<roaksoax> rvba: one more thing, when I click on "Add Node" on the WebUI I can't see the ubuntu release options, but I can when editing the node
<roaksoax> rvba: where is that?
<rvba> roaksoax: the js code uses ./src/maasserver/templates/maasserver/snippets.html to create that form
<roaksoax> rvba: so that should also be updated then?
<rvba> roaksoax: yes
<roaksoax> rvba: ok good to know
<roaksoax> rvba: and what about the label it will use, it shows as "Os release"
<roaksoax> and I'd like to use just "release"
<rvba> roaksoax: use label="xx" on the CharField
<roaksoax> rvba: ok cool thanks
<rvba> np
<roaksoax> rvba: so that also needs a migration?
<roaksoax> or needs to be part of the migration
<rvba> roaksoax: if you're changing a model, then there is a migration.  I recommend having one migration per change.
<roaksoax> rvba: gotcha, thanks
<roaksoax> rvba: so do you think it is a good idea for UBUNTU_RELEASE.default to be obtained from a setting and then use that instead?
<rvba> roaksoax: it allows someone to 'upgrade' the default used without having to mass-change the information stored on the nodes themselves so yeah, I think that's a good idea.
<roaksoax> rvba: ok cool
<tsandall> I created a VirtualBox VM and enlisted with my MAAS server. The node shows up as Declared on the MAAS 'nodes' page. When I try to boot the VM for the first time though it acquires a DHCP lease, and it looks like PXE starts before hitting an error: "Boot sector signature not found" and then dropping to a boot: prompt.
<tsandall> the pxelinux.cfg file for the VM's MAC doesn't reference any of the precise images, instead it's referencing chain.c32, is that correct?
#maas 2013-09-02
<AskUbuntu> juju ceph deployment | http://askubuntu.com/q/340465
<racedo> ping roaksoax
#maas 2013-09-05
<roaksoax> jtv: ping
<jtv> Hi roaksoax
<roaksoax> jtv: howdy!
<roaksoax> jtv: so I'm wondering really quickly if the latest upstream releases of MAAS
<roaksoax> jtv: now depend on the pgarray thingy
<jtv> Not yet.  Holding off on that until we know we can support it in the builders etc.
<jtv> We'll need to be able to backport to Precise, even.
<roaksoax> jtv: ok, it is already in universe though
<bigjools> land it
<jtv> Great!
<jtv> Thanks.
<bigjools> we should repackage for precise for the PPA
<jtv> Take cover.  Branches coming down.
<roaksoax> either way nothing new will hit the archives
<roaksoax> bigjools: yeah but will be largely based on the saucy stuff i think though
<bigjools> yes
<roaksoax> bigjools: with the cloud-tools archive things should be easier to maintain
<bigjools> \o/
<roaksoax> i need to catch up with jamespage about that though
 * jtv just can't bring himself to say that his package is in Saucy Universe.
<jtv> It sounds like a magazine.
<roaksoax> jtv: https://launchpad.net/ubuntu/+source/djorm-ext-pgarray
<roaksoax> jtv: do we need to look into updating the versions of oops stuff, and the longpoll stuff and so on?
<roaksoax> bigjools: ^^
<jtv> You mean, for Saucy?
<roaksoax> yeah
<roaksoax> or any of the dependencies maintained by us
<jtv> Julian may be better able to answer that...  I don't suppose we're expecting any miraculous improvements from celery/rabbit.
<bigjools> roaksoax: not yet really
<bigjools> roaksoax: I intend to have a version revamp for 14.04
<roaksoax> ack!
<roaksoax> perfect then
<bigjools> but before then it's harder because we're trying to keep trunk running on precise
<roaksoax> bigjools: yeah but with the cloud-tools pocket from the cloud-archive that should be fixed
<bigjools> roaksoax: well I'd not be so hasty about that
<bigjools> maas is broken if you run stuff from the cloud archive
<roaksoax> bigjools: i mean, we can technically put all the dependencies we need there as long as they are in the development release IIRC
<roaksoax> s/iirc/afaik
<roaksoax> bigjools: because of the use of django 1.5?
<bigjools> roaksoax: yes
<roaksoax> yeah
<roaksoax> anyway im off to bed
<bigjools> roaksoax: bit late for you!
<jtv> nn roaksoax
<jtv> And thanks
<jtv> bigjools: can't land anything anyway... jenkins is still failing.
<bigjools> jtv: didn't rvb do a fix?
 * bigjools nearly has tarmac charm ready for us
<jtv> I think he managed to get the machine running again, but the problem may have come back, or this may be a new problem.
<jtv> I think this current failing streak started around 22:00 UTC.
<bigjools> ugh
<bigjools> jtv: I thought the landing was ok, it was only the dailies
<jtv> Then we have a separate problem on our hands.  :(
<bigjools> jtv: yes - there's no pgarray egg on the saucy machine :)
<bigjools> https://jenkins.qa.ubuntu.com/job/maas-merger-trunk/label=lp-saucy-server-amd64/47/console
<bigjools> jenkins is configured to run tests on both raring and saucy before landing the branch
<bigjools> I guess you only put the egg in raring
<jtv> Yup, that'd be it...
<bigjools> this is going to go away when I turn this off and use my new lander
<jtv> Oh wow, you got actual console output.  I was only seeing a summary failure notice.
<jtv> Found it.  That's not very intuitive UI...
<bigjools> jtv: yes you need to dive into the sub-jobs
<jtv> And then it's on the left, disguised as a global pane.  Anyway, should I just be patient then?
<bigjools> hey roaksoax!
<bigjools> when will the packages in -proposed get pushed to -updates?
<bigjools> verification was done ages ago
#maas 2013-09-06
<diegonat> hi guys... https://wiki.ubuntu.com/ServerTeam/MAAS/AvahiBoot where can i find this page? I cannot find instruction about how to do it!
<kentb> I'm running maas and juju-core 1.13.  When deploying the precise images to bare metal I'm running into the situation where an SSL-enabled mongodb is required but the firewall setup where I'm located (I think) is preventing the correct ppa from being added to pull in mongodb.  I'm currently using 13.04 for my bare-metal machines to work around it, but, is
<kentb> there a way on the maas/juju side to possibly hack in a workaround if I want to use 12.04?
<mgz> you can mirror the package from the ppa inside your firewall
<kentb> ah ok. good idea
<mgz> I'm trying to remember the way to get juju to look at a different apt repo, but can't recall the details
<diegonat> hi all.... I have got a problem with MAAS but I m not sure how to troubleshoot it. Basically, I set up a MAAS environment and I added one node. However, now I cannot add any other nodes. If you I power up a machine (virtual), it does boot but it doesnt come out in the web control panel. Any idea??
<diegonat> in celery ive got these two lines
<diegonat> [2013-09-06 16:30:11,570: INFO/MainProcess] Got task from broker: provisioningserver.tasks.upload_dhcp_leases[ed613b58-7ba0-402f-8a59-1fb9ec931cb2]
<diegonat> [2013-09-06 16:30:11,679: INFO/MainProcess] Task provisioningserver.tasks.upload_dhcp_leases[ed613b58-7ba0-402f-8a59-1fb9ec931cb2] succeeded in 0.104770898819s: None
<diegonat> [2013-09-06 16:30:11,570: INFO/MainProcess] Got task from broker: provisioningserver.tasks.upload_dhcp_leases[ed613b58-7ba0-402f-8a59-1fb9ec931cb2]
<diegonat> [2013-09-06 16:30:11,679: INFO/MainProcess] Task provisioningserver.tasks.upload_dhcp_leases[ed613b58-7ba0-402f-8a59-1fb9ec931cb2] succeeded in 0.104770898819s: None
<diegonat> ops
<diegonat> sorry
<diegonat> guys on MAAS i have machines' status on commissioning although they are up and running... any idea?
<roaksoax> diegonat: commissioning means that the machines need to PXE boot into MAAS, do the commissioning process, then tell MAAS that are ready to be deployed
<diegonat> roaksoax, i launch the machine
<diegonat> it boots up
<diegonat> and ive got the prompt for the user and pass
<roaksoax> diegonat: ok so let it do its thing
<diegonat> do i need to wait for some time?
<roaksoax> yeah
<roaksoax> diegonat: is this a VM?
<diegonat> yes
<roaksoax> diegonat: yeah so might take a bit longer than expected
<roaksoax> especially if you are using precise
<diegonat> yes im
<diegonat> but i dont have any output
<roaksoax> yeah
<roaksoax> just let it be for a second
<roaksoax> for a few minutes in reality
<roaksoax> and let's see what happens
<diegonat> ok
<diegonat> its not working
<roaksoax> diegonat: did the machine turn off?
<roaksoax> or the VM
<diegonat> no its still on
<diegonat> i have an error error inserting ipmi_si
<diegonat> but ive read on the internet that its not relevant
<roaksoax> diegonat: yeah, so it is doing its thing, it just got stuck somewhere else because of that
<roaksoax> diegonat: it is just taking longer than expected
<diegonat> do u think that is the problem?
<diegonat> yeah it took sooo long
<diegonat> but roaksoax you were right
<diegonat> matter of waiting
<roaksoax> ;)
#maas 2014-09-01
<bigjools> jtv1: could I trouble you for some reviews please
<jtv1> I'm in the process of writing up a review.
<bigjools> ta
<jtv> bigjools: found it!  'from django.contrib import messages' and then e.g. 'messages.error(request, "Aaaigh!")'
<jtv> Now to find the request...
<bigjools> perfick
<jtv> Well, still have to find that request.
<bigjools> it's passed into the view iirc
<bigjools> and api request
<bigjools> afk for a few
<jtv> Yes, the view gets it, but the triggers don't.  They don't even know whether there is one.
<jtv> Signal handlers, I mean.
<jtv> Not triggers.
<bigjools> that's the point of the handlers
<bigjools> they're not supposed to care
<bigjools> not looked at django docs for signals but I wonder what it does if there's an error
<jtv> Turns out the transaction does commit.  Which does not bode well.
<jtv> Meanwhile, my NUCs won't auto-enlist any more.  :-(
<jtv> My non-NUC test machines won't even netboot!  Complain about "APM not present."
<rvba> bigjools: thanks for the review of my robustness branch.  Much appreciated.  Addressing your comments now.
<rvba> jtv: I've seen the "APM not present" message in the lab.
<jtv> rvba: searching for it yielded very little information... it sounded as if some tool suddenly expects APM.
<jtv> Which is strange, given the laws of nature.
<rvba> jtv: it happens when node is told to power off (which is the default PXE config instruction sent when MAAS doesn't really know what to do) but fails to do so.
<jtv> I mean specifically the one that says time moves in a forward direction.
<jtv> That's what I found by searching.  To me though it happens while trying to netboot...
<rvba> Which probably means that the netboot/status combination is unexpected or wrong.
<jtv> This is when trying to auto-enlist...  MAAS shouldn't even know the machine exists.
<rvba> jtv: I see nodes being enlisted okay in the current CI run.  Could it be a problem with your specific branch?
<jtv> Could be, though I think it's basically a version of trunk
<jtv> Phew.  Installing the latest trunk got me past it somehow.
<jtv> Past the "APM not present" problem, that is.
<rvba> blake_r: Hi blake.  I'm having a look at the CI runs and the new maas-integration.TestMAASIntegration.test_imported_boot_resources test takes 20 minutes to complete.  Why is this so long?  (I'm asking because it's important to keep the total runtime as low as possible)
<jtv> rvba: you said you were seeing static IP addresses in the lab... is that a recent version of trunk?
<rvba> jtv: it's trunk + my robustness changes (which shouldn't interfere with the IP assignment)
<jtv> Hmmmright.  Do you know what the last trunk revision in there was?
<rvba> jtv: 2854
<jtv> Thanks.
<jtv> That's current.  So... what in blazes is going on?
<bigjools> rvba: sorry, I am being a hardass on your review
<rvba> bigjools: the change you suggest about the netboot flag has nothing to do with my change.
<rvba> bigjools: and I don't really understand why re-assigning a status is dangerous.
<bigjools> rvba: I explained why it's bad in the review comments
<bigjools> you can end up in bad states.  I remember mentioning at the sprint that I don't like that flag any more
<rvba> bigjools: I prefer the 'default' (i.e. what happens to nodes that won't be picked up by the migration) to be that the nodes end up 'Deployed' instead of 'Allocated'.
<bigjools> 2. the status change has a race if the maas is in use
<rvba> bigjools: I agree that I need to write a migration.
<bigjools> the new state is fine, just don't change the values of existing ones
<rvba> bigjools: there is no bad state. netboot is still only relevant for one status.  Same as before.
<rvba> bigjools: I think changing the meaning is the safest thing to do here.  Let me explain:
<bigjools> rvba: EXACTLY!
<rvba> Previously we had one state 'Allocated', which could mean 3 things: Allocated/Deploying/Deployed.  Now, I expect most nodes in the old 'Allocated' state to be effectively in the new 'Deployed' state and that's why I'd like this migration to be as transparent as possible.
<rvba> bigjools: but it's really a detail.  I can write one additional data migration if this is to get this branch landed.
<bigjools> rvba: ok
<rvba> bigjools: re-netboot.  I'm just saying this branch doesn't make things worse when it comes to the netboot stuff.  Let's get it landed and think about whether or not we want to change this after.
<bigjools> rvba: that's fine
<rvba> Okay, cool.  I'll revert the change to the enum and write this migration then.
<bigjools> rvba: excellent
<rvba> bigjools: Don't get me wrong, I appreciate the extra scrutiny on this.
<bigjools> rvba: I know :)
<bigjools> it's a hairy area not to be rushed
<rvba> Indeed.
<gmb> allenap: I have a question re: the MockLiveClusterToRegionRPCFixture… When I set a mock result properly (see http://pastebin.ubuntu.com/8204974/), I get the following error: http://paste.ubuntu.com/8204977/. It's almost as though something is wrapping the list in a tuple and then the whole thing breaks. If I specify the "interfaces" item in the response as just being a single dict, rather than a list of dicts with one element, it works perfectly. HALP?
<gmb> allenap: Aaah, hang on, I hadn't applied your patch, I think I see the problem…
<gmb> allenap: Yeah, I'd not spotted the stray comma on the "interface =" lines. Thanks for that :)
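The stray-comma problem gmb describes is a general Python pitfall: a trailing comma after an assignment's right-hand side turns the value into a one-element tuple. The pastes are not included in this log, so the `response` dict below is a made-up example rather than the actual test data.

```python
# Made-up response data, standing in for the mock result from the paste.
response = {"interfaces": [{"name": "eth0"}]}

# Intended assignment: the list itself.
interfaces_ok = response["interfaces"]

# Same line with a stray trailing comma: Python builds a 1-tuple
# whose only element is the list.
interfaces_bad = response["interfaces"],

print(type(interfaces_ok).__name__)
print(type(interfaces_bad).__name__)
```

This matches the symptom of something appearing to wrap the list in a tuple.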
<jtv> rvba: those static IP addresses you saw in the lab... are you very very sure they're static?  Because I'm seeing addresses now, but from the dynamic range.
<rvba> jtv: I've got another run in progress, I'll tell you when it gets to the point where static IP addresses are assigned… I didn't check that the addresses I saw were from the static range last time.
<rvba> jtv: I just had a problem in the lab (my nodes didn't get an entry in the zone file) but I think it's caused by the change I'm trying to QA.
<jtv> rvba: thanks, highly interested to see if you meet with more success.
<rvba> jtv: just did another test locally with revision 2857 and my node just got an IP from the static range.
<jtv> Gah.
<jtv> Here, my nodes do get IP addresses, just from the dynamic range.
<jtv> But that's with the REVEAL_IPv6 flag set.  I wonder if that makes a difference...
<jtv> What I had before I set it, I believe, was no IP address at all.
<jtv> allenap: your misc-boot-resources-stuff branch removes a lock check... is that intentional?
<jtv> The one where it doesn't import if its lock is currently held?
<allenap> jtv: Yes, it's superfluous. It tries to get the lock later on. Actually, there's a chance that it'll block for a long time (it could have before too; it's racy). I'll improve that.
<jtv> The occasional race may not be so bad, but the point was to skip the entire attempt if another thread is already working on a download...  is that behaviour still there?
<allenap> jtv: It is, but it may wait 15 seconds before giving up. However, it then joins the lock thread, which will hang around until it can actually get the lock. Perhaps it doesn't actually need to join the lock thread; that can be left to die on its own. Of course, that's a leak in itself.
<jtv> As long as it's cleaned up eventually, I guess...
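The behaviour jtv asks about (skip the whole import if someone else already holds the lock, rather than queueing behind it) can be sketched with a non-blocking acquire. This is an illustration only: `import_lock` and `maybe_import` are hypothetical names, and a `threading.Lock` stands in for whatever lock the real import code uses.

```python
import threading

# Hypothetical lock guarding the boot-resources import.
import_lock = threading.Lock()


def maybe_import(do_import):
    """Run do_import() unless the lock is already held.

    Returns True if the import ran, False if it was skipped.
    """
    # blocking=False returns immediately instead of waiting for the holder.
    if not import_lock.acquire(blocking=False):
        return False
    try:
        do_import()
        return True
    finally:
        import_lock.release()
```

A timed wait, like the 15 seconds allenap mentions, would use `import_lock.acquire(timeout=15)` in place of the non-blocking call.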
<jtv> rvba: ruddy-cave.maas is Deployed and on, but now I see no IP address for it at all...
<jtv> Ah, one just appeared.  And it's dynamic.
<rvba> jtv: I think this is related to the thing I'm testing now (the robustness stuff).
<jtv> The fact that it didn't get a static IP address?
<jtv> Remember, I'm seeing the same thing with trunk in my own setup.
<rvba> Hum, the StaticIPAddress table is empty.
<rvba> That's weird.
<jtv> Yup.
<jtv> Oh this is just horrible.
<jtv> Whether the node claims static IP addresses also seems to depend on its power type.
<jtv> Unknown power type: no static IP.
<jtv> ether_wake and no MAC address set in the power parameters: no static IP.
<jtv> And am I going cross-eyed or are there a Node.claim_static_ips and a Node.claim_static_ip_addresses?
<rvba> jtv: this code is a mess :/
<jtv> Yup.
<jtv> Note no docstring.
<jtv> On _create_tasks_for_static_ips.
<rvba> And claim_static_ip_addresses is strangely similar to _create_tasks_for_static_ips.
<jtv>  /o\
<jtv> rvba: I also see a lot of special cases for "self.status == NODE_STATUS.ALLOCATED"...   I guess those are complicating your life right now.
<allenap> Node.claim_static_ips is going away, eventually.
<allenap> However it's not in use right now.
<jtv> And claim_static_ip_addresses will be its eventual replacement?
<allenap> jtv: Yep.
<jtv> That'd be worth putting in docstrings.
<allenap> jtv: I'm changing a lot of this code for the RPC work I'm doing, so if you find logical faults please tell me about them; I've recreated what was already there, so I may have recreated bugs.
<jtv> Don't be afraid to write that something is unclear.  Better than a shared false belief that it was all done deliberately!
<jtv> rvba: it looks as if mac_addresses_on_managed_interfaces is not returning empty...  Maybe MACAddress.cluster_interface never got set.
<rvba> jtv: let's check the current run…
<rvba> Nodes are commissioning now…
<jtv> Static addresses should be assigned at the point where the nodes are first started in Allocated state.
<rvba> jtv: when is MACAddress.cluster_interface populated exactly?
<jtv> Good question.
<jtv> I was just trying to find that out actually.
<jtv> NodeGroupHandler.update_leases..?
<rvba> Right, it calls update_mac_cluster_interfaces.
<rvba> jtv: current state in the lab: http://paste.ubuntu.com/8206479/
<jtv> So those cluster interfaces haven't been populated.
<rvba> Apparently not.  Looks like a bug to me.
<jtv> Maybe it's just a matter of waiting a bit longer..?
<jtv> BTW I filed bug 1363999 about this.
<ubot5> bug 1363999 in MAAS "Not assigning static IP addresses" [Critical,Triaged] https://launchpad.net/bugs/1363999
<rvba> jtv: If the lease table is populated, it means update_leases has been called.
<rvba> leases*
<jtv> Ugh.  I hadn't realised the significance of that part.
<jtv> Oh, but careful: that table can contain old leases from deleted nodes.
<rvba> This is from a run in the lab, it's using a clean VM.
<jtv> Damn.
<allenap> jtv, rvba: Do you want any more eyes on the problem?
<jtv> Oh that would be great.
<jtv> We're currently staring at update_mac_cluster_interfaces, in api/node_groups.py.
<jtv> (Huh what, his groggy brain asks him, where did that huge api.py module go?)
<jtv> We have reason to believe that that function runs, but it doesn't appear to be doing this:
<jtv>                 mac_address.cluster_interface = interface
<jtv>                 mac_address.save()
<rvba> jtv: I don't understand why we still have MAC.cluster_interface now that the network stuff is unified and that we can use the Network<->MACAddress link.
<jtv> They're not quite the same thing.  For example, two NGIs can have overlapping IP ranges, which are different subnets that happen not to be connected.
<jtv> It'd be nice to resolve that at some point, but we haven't taken that step yet.
<rvba> I thought we didn't support overlapping IP ranges.
<jtv> For Network we don't.
<jtv> But two cluster interfaces (on different clusters) might still do it.
<jtv> rvba: stupid question perhaps, but... do we even still call the API's update_leases method?
<jtv> I mean, hasn't that been moved to RPC or anything?
<rvba> jtv: well, that's a good question :).  Let's have a look at the KB board.
<rvba> jtv: apparently it's been ported to RPC by Julian… but if it is so, why is this method still there?
<jtv> "Periodically upload DHCP leases"...
<jtv> Lots of good questions today.
<rvba> jtv: src/maasserver/rpc/leases.py
<rvba> Doesn't call update_mac_cluster_interfaces :/
<jtv> Well that looks like an explanation.
<rvba> Yep
<allenap> Good find :)
<jtv> Believe me, it gives me no joy.  :)
<jtv> Not even the relief I expected from discovering that it's not the IPv6 changes.
<allenap> Now, which poor soul is going to fix it?
<jtv> I might do it since it's blocking my work, but not tonight!
 * jtv tired
<rvba> blake_r: I can see two reasons why the import is slow: a) you're downloading many images by default (?) or b) you're not using the configured proxy (?).
<rvba> blake_r: my money is on b).
<blake_r> rvba: I would go with number 2, unless the node by default is supposed to use that
<rvba> blake_r: I don't see the relation to the node… this is all happening on the region.
<rvba> blake_r: btw, did you land the UI for the new image stuff?
<blake_r> rvba: sorry I meant region
<blake_r> rvba: no ui yet
<blake_r> rvba: only api
<blake_r> rvba: ui is next
<blake_r> rvba: create a bug for not using proxy and I will fix it this week
<rvba> blake_r: okay, cool.
<rvba> blake_r: https://bugs.launchpad.net/maas/+bug/1364062
<ubot5> Ubuntu bug 1364062 in MAAS "New download boot resources method doesn't use the configured proxy" [Critical,Triaged]
<Valduare> hi guys
<Valduare> any news on maas with arm devices
#maas 2014-09-02
<jtv> Oops: the development environment is broken.
<jtv> Cluster won't start up because the startup code tries to create GNUPGHOME, which is now hard-coded to its production location in /var/lib.
<bigjools> jtv: yeah rvb filed a bug about that
<bigjools> another critical that needs fixing.... we have so many all of a sudden
<bigjools> btw: https://code.launchpad.net/~julian-edwards/maas/update-mac-to-cluster-when-updating-leases/+merge/232961
<jtv> bigjools: approved.
<bigjools> jtv: thanks. BTW I am reviewing Gavin's branch
<bigjools> I forgot to mark it
<jtv> Brave.
<bigjools> I feel follhardy
<bigjools> foolhardy, too
<jtv> It's all in the follicles.
<jtv> bigjools: need pre-imp... are you free?
<bigjools> jtv: not just yet, trying to finish reviewing gavin's branch and then I need to eat
<jtv> OK.
<jtv> I'll fill the time with something useful.
<bigjools> I'll ping you when I can
<bigjools> jtv: https://bugs.launchpad.net/maas/+bug/1363913 and https://bugs.launchpad.net/maas/+bug/1362693
<bigjools> :)
<ubot5> Ubuntu bug 1363913 in MAAS "Impossible to remove last MAC from network in UI" [Undecided,New]
<ubot5> Ubuntu bug 1362693 in MAAS "MAAS is providing 2 IP's even if there's only 1 NIC in the network" [Undecided,New]
<jtv> Sounds fun.
<bigjools> bug 1330765
<ubot5> bug 1330765 in MAAS "If start_nodes() fails, it doesn't clean up after itself." [High,Triaged] https://launchpad.net/bugs/1330765
<jtv> Why are you listing bugs at me?
<bigjools> ignore that one
<bigjools> that was for me
<bigjools> I'm too lazy to go to LP
<bigjools> jtv: ok call?
<jtv> Yay
<bigjools> jtv: alert
<jtv> ?
<bigjools> GPG home needed on maasserver as well :(
<bigjools> see src/maasserver/start_up.py line 95
<jtv> Is that a problem though?  We can have the Makefile set it on both sides.
<jtv> And the maasserver can import it from the provisioningserver, no problem.
<bigjools> which imports from provisioningserver.upgrade_cluster, yegads
<bigjools> the dev ui is now almost useless without a cluster and image data :(
<jtv> The work I'm doing now should put us in a position to provide that though.
<jtv> Because a branch will have a natural place to put boot resources etc.
<bigjools> even after blake's changes?
<jtv> Oh, does he still put them in /var/lib/maas..?
<bigjools> no, they're going to be in the DB
<jtv> Well never mind that then.
<bigjools> we could just fudge the sample data for now
<jtv> rvba, this looks like a question for you: is there, or should there be, a way for a signal handler to find the web request, if any, whose processing led to the firing of the signal?
<jtv> Because signal handlers may reconfigure DNS/DHCP, and any errors while doing that ought to be displayed.
<rvba> jtv: there is no direct (as in built-in) way AFAIK.  The first thing that comes to mind is to make the current request available by storing it somewhere (i.e. as a global object) and letting anything (including signal callbacks) find it.
<jtv> Yeah.  It's a bit ugly though isn't it?
<rvba> A tad.  But AFAIK it's the only way.
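The workaround rvba describes, storing the current request somewhere that signal callbacks can find it, might look roughly like the sketch below. It uses a thread-local rather than a plain global (an assumption, so that concurrent requests in different threads don't see each other's state), and every name in it is hypothetical; a dict stands in for the request object.

```python
import threading

# Per-thread storage for the request currently being processed.
_state = threading.local()


def set_current_request(request):
    """Called at the start of request handling (e.g. from middleware)."""
    _state.request = request


def clear_current_request():
    """Called when request handling finishes."""
    _state.request = None


def get_current_request():
    """Return the request being handled on this thread, or None."""
    return getattr(_state, "request", None)


def report_errors(errors):
    """What a signal handler might do: attach errors to the request, if any.

    Returns True if there was a request to attach them to.
    """
    request = get_current_request()
    if request is None:
        # Not running inside a web request; nowhere to display messages.
        return False
    request.setdefault("messages", []).extend(errors)
    return True
```

The handler still does not need to know how it was triggered; it only asks whether a request happens to be in flight.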
<rvba> jtv: I /seem/ to remember we've used this technique before… let me see if I can find it in the code.
<jtv> We do have the "register a message on the current page whatever it is" part.
<jtv> Another way to do it would be to make it run asynchronously (somehow) and register component errors.
<jtv> Or, we could move the calls out of the signal handlers and into the forms.
<rvba> If it's possible, moving things like that into the forms seems like the cleanest way to do it.
<jtv> Yeah.  In the long run I'd love to make it asynchronous, where we just set a "dirty" flag to say "this needs reconfiguring," but that sounds like a major project.
<jtv> If you take the error backchannel into account, at least.
<jtv> Anyone able to review https://code.launchpad.net/~jtv/maas/bug-1363900/+merge/232982 ?
<rvba> jtv: sure, I'll take it.
<jtv> Thanks.
<rvba> jtv: I wonder why you felt we needed another mechanism to override variables in a dev env.  Don't we have a separate config for this (src/maas/development.py)?
<jtv> rvba: provisioningserver.  :)
<jtv> Can't import anything django-ey from there; that rules out the django configs as well, right?
<rvba> Right.
<jtv> It's annoying, but we're ditching the celery config so we need something in its place.
<rvba> It seems to me that every time we ditch a part of Celery we introduce something more hacky instead.
<jtv> I wouldn't really know -- I've been assigned to other work while this was going on.  But I hope this can replace a bunch of it with a bit of system to it.
<jtv> You do a "make run" and you basically get a filesystem root in the "run" directory.
<rvba> Right.
<jtv> And: if we stick to the pattern, we lose some side effects.
<rvba> jtv: in get_path's docstring you say the call has side effects… which side effects?
<jtv> Reading the environment and the filesystem.
<jtv> Ah, not the filesystem -- the current working directory.
<jtv> They're not very big side effects, but that's a slippery slope.
<rvba> Is reading the environment considered a dangerous side effect?
<jtv> No, just a side effect -- the sort of thing we should avoid doing during import.
<rvba> Reading the current working directory?  I don't see it, what am I missing?
<jtv> It's implicit in the abspath.
<jtv> As in, "reading what the current working directory is," not "reading the directory listing."
<rvba> Really?  I didn't know that.
<rvba> Oh, I see, because you're using '.'
<jtv> Actually I think that one will be a simple text processing exercise.  But calling abspath with a relative path makes it get the current working directory.
<jtv> (Not read the working directory's contents, but read what the cwd is.)
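To illustrate the point about `abspath`: `os.path.abspath()` consults `os.getcwd()` only when given a relative path, and reading `os.environ` is likewise a (small) side effect. The sketch below does both at call time rather than at import time. It is not the branch's actual `get_path`; the `MAAS_ROOT` variable name and the `env` parameter are assumptions for illustration:

```python
import os


def get_path(*path_elements, env=None):
    """Return an absolute path below a configurable root.

    The environment (and, for a relative root, the current working
    directory) is consulted on each call rather than at import time.
    """
    if env is None:
        env = os.environ
    # "MAAS_ROOT" is an assumed variable name for this sketch.
    root = env.get("MAAS_ROOT", "/")
    # Strip leading separators so os.path.join() keeps the root.
    relative = [element.lstrip(os.sep) for element in path_elements]
    # abspath() calls os.getcwd() only when the joined path is relative.
    return os.path.abspath(os.path.join(root, *relative))
```

With a root of `.` the result depends on the process's working directory, which is the implicit side effect discussed above.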
<jtv> Very minor side effects, but we've been quietly accepting side effects and it leads to more.
<rvba> jtv: branch approved with a couple of minor suggestions.
<jtv> Thanks!
<allenap> jtv, rvba: There’s pserv.yaml, which is really not hacky (it has a _schema_). Modifying it from packaging though… that’s hacky.
<bigjools> allenap: it doesn't have per-env settings
<rvba> allenap: what bigjools said
<bigjools> allenap: there were a couple of critical pserv/rpc bugs
<bigjools> one was where if it starts before the region, it never connects
<allenap> bigjools: Ah yes, I’ll take a look at that.
<bigjools> allenap: It would need a 1.6 backport as well
<rvba> allenap: about my bug-1359169 branch (thanks for the review btw), I'm not sure I can move away from using exception.message because piston and oauth both rely on this pretty heavily it seems.
<ramonskie> hi i have upgraded to trusty on my cluster controller with maas 1.5.2 installed and i stumbled on this bug https://bugs.launchpad.net/maas/+bug/1307779
<ubot5> Ubuntu bug 1307779 in MAAS "fallback from specific to generic subarch broken" [Critical,Fix released]
<ramonskie> i saw that this bug is solved in 1.6 but i can't find that package anywhere in the repo
<ramonskie> any other suggestions?
<allenap> rvba: Okay, it’s out of your hands, no worries.
<blake_r> rvba: https://code.launchpad.net/~blake-rouse/maas/fix-1364062/+merge/233070
<rvba> blake_r: I like the commit message of revision 2874 on https://code.launchpad.net/~blake-rouse/maas/fix-1364062/+merge/233070
<blake_r> rvba: haha
<blake_r> rvba: forgot to link the bug with --fixes
<blake_r> rvba: thanks for the review
<rvba> My pleasure.
<newell> rvba, how was your flight back?
<rvba> newell: horribly bumpy.
<newell> blake_r, I am fixing up what I have for the curtin-maas-install stuff and then I will push so you can take a look
<newell> rvba, ah that sucks
<newell> Hopefully you both didn't feel like it was then last flight of your life
<blake_r> newell: okay, thanks
<newell> s/then/the
<rvba> newell: well, yes, at some point I did.
<newell> rvba, roaksoax mentioned he wanted me to work with you on whatever you are doing.  Not sure what that is or if you talked with him about this or not.
<rvba> newell: yeah, I could use your help on the node event log work.
<blake_r> rvba: didnt we use to get a traceback on api errors, finding errors in tests, is really hard now
<blake_r> allenap: ^
<rvba> blake_r: yes, now we only get tracebacks for real errors, not validation errors etc which, although handled internally as exceptions, are not errors per-se.
<blake_r> rvba: can we change that so we get tracebacks when running tests?
<blake_r> rvba: like debug=True return traceback
<rvba> blake_r: I guess we could do that… what kind of errors are you interested in?
<blake_r> rvba: i have something wrong in my code, I am running a unit test
<blake_r> rvba: all i get is a oneliner says the exception error
<blake_r> rvba: i would like to know where, so I can find it
<blake_r> rvba: stack tracing through django code to find it takes forever
<blake_r> rvba: i just want stack trace back
<blake_r> rvba: on django side
<rvba> blake_r: sure, but what *kind* of errors are you interested in?
<rvba> You'll still get a stacktrace for a real error.
<blake_r> rvba: 'NoneType' object has no attribute 'items'
<blake_r> rvba: that would be a real error, that is all I get
<rvba> Not for a BadRequest error anymore.  Which makes sense, because it's not an error even if internally it uses exceptions.
<blake_r> rvba: its a 500 error
<rvba> blake_r: okay, this is a bug then.
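The distinction being drawn here (expected errors such as a bad request get a short message, unexpected ones should carry a traceback) can be sketched as follows. `BadRequest`, `exception_to_response` and the `debug` flag are hypothetical stand-ins, not the MAAS `ExceptionMiddleware` itself:

```python
import traceback


class BadRequest(Exception):
    """Stand-in for a validation-style error that maps to HTTP 400."""


def exception_to_response(exc, debug=False):
    """Map an exception to a (status, body) pair.

    Expected errors yield only their message; unexpected ones yield a
    500 and, when debug is set, the full traceback.
    """
    if isinstance(exc, BadRequest):
        return 400, str(exc)
    if debug:
        body = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__))
    else:
        body = str(exc)
    return 500, body
```

With `debug=True`, an `AttributeError` raised deep inside a view would come back with the file and line that raised it, instead of only the one-line message.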
<blake_r> rvba: https://bugs.launchpad.net/maas/+bug/1364481
<ubot5> Ubuntu bug 1364481 in MAAS "http 500 error doesn't contain a stack trace" [Critical,Confirmed]
<rvba> blake_r: cool, I'll take care of it.
<blake_r> rvba: thanks
<rvba> blake_r: https://code.launchpad.net/~rvb/maas/traceback-error-500/+merge/233086
<blake_r> rvba: line 8-9 on the diff? dont you need that?
<rvba> blake_r: no, it's integrated into ExceptionMiddleware now.
<blake_r> rvba: okay
<ramonskie> i have upgraded to trusty on my cluster controller and now i stumbled on this error https://bugs.launchpad.net/maas/+bug/1307779 so it seems that i need to upgrade to maas 1.6 but i can't find the package
<ubot5> Ubuntu bug 1307779 in MAAS "fallback from specific to generic subarch broken" [Critical,Fix released]
<gQuigs> anyone know if there is any way to get maas to dump all data about it's environment?  sosreport tries to collect "maas dumpdata" but that never works
#maas 2014-09-03
<jtv> rvba: is the user-data at the end of installation only for the fast-path installer then?
<Valduare> hows it going guys
<jtv> Hi
<Valduare> any news on maas with arm devices
<rvba> jtv: the user-data is requested before the f-p installation happens.  Now I'm not sure it happens in the case of d-i.
<jtv> rvba: I'm trying it out...
<jtv> dimitern: hi there -- when could you talk about networking?
<dimitern> jtv, hey, how about tomorrow or on friday?
<dimitern> jtv, what's a good time for you?
<jtv> Anytime before 11:00 UTC.  Except our standup is at 08:30 UTC.
<dimitern> jtv, so how about tomorrow @ 10 UTC ?
<jtv> Yes, great.
<dimitern> jtv, i'll send an invite, cheers
<dimitern> i'll invite roaksoax and jam if they want to join
<jtv> Sure.
<dimitern> jtv, actually, do you mind if we move it 30m earlier - 9:30 UTC, as it will overlap with our standup :)
<jtv> Better for me actually!
<dimitern> great! invite sent
<ramonskie> i have upgraded to trusty on my cluster controller and also to maas 1.5.2 but now i stumbled on to this bug https://bugs.launchpad.net/maas/+bug/1307779
<ubot5> Ubuntu bug 1307779 in MAAS "fallback from specific to generic subarch broken" [Critical,Fix released]
<ramonskie> it seems to be fixed in 1.6 but i can't find that package. any idea's?
<bigjools> ramonskie: ppa:maas-maintainers/stable
<ramonskie> thanks
<ramonskie> okay so i upgraded and it finds a image now. only its now stuck on the screen where i see route-info
<jtv> rvba: hey, how about I remove that restriction where you need at least 16 bits of netmask on a managed network?  That was for the old generated zone files.
<rvba> jtv: yep, we don't need that restriction anymore.
<jtv> \o/
<jtv> Easy karma.
<ramonskie> when enlisting nodes it hangs on the following screen https://dl.dropboxusercontent.com/u/50671970/enlist-hang.jpg
<jtv> ramonskie: that part looks OK to me in itself... how long did you watch them hang?
<jtv> Because if it failed there, I'd probably expect some error output.
<ramonskie> for about 10 minutes now
<jtv> If this is all it shows on the console, I'd give it a bit longer.
<ramonskie> jtv: thanks i will wait
<ramonskie> jtv: how long should i wait....
<jtv> ramonskie: any change on the screen?
<ramonskie> nope nothing
<jtv> Then I guess it's time to go trawling through the logs.
<ramonskie> which log files do i need to check..
<jtv> I'm looking along.  It's on the maas server, in /var/log/maas.
<jtv> To be honest, since we're not seeing any error message, I don't know what to look for in this case.
<jtv> Oh, one thing that might also help: shift-PageUp on the node's console might show a bit more history.
<jtv> (Might as well keep the node doing whatever it's doing for now, in case it changes its mind)
<jtv> First thing to do is a quick scan for obvious errors:
<jtv> /var/log/maas/apache2/error.log, /var/log/maas/celery.log, /var/log/maas/maas.log
<jtv> If there's an error in there, chances are it'll jump right out at you.
<ramonskie> maas.log is empty nothing
<jtv> That is really odd.
<jtv> Even if there are no requests, it's supposed to have periodic jobs in there.  Unless... which version is this?
<jtv> (Roughly -- e.g. "the one that came with 14.04")
<ramonskie> first i wasn on ubuntu 12.04 with maas 1.4 then upgraded to ubuntu 14.04 and maas 1.5 and now upgraded to 1.6
<jtv> OK, that's pretty recent.  Good.
<jtv> I'm not sure if you'll have /var/log/maas/apache2 then; but it's just a symlink to /var/log/apache2.
<ramonskie> in selery is see some dhcp lease errors
<jtv> Oh?
<jtv> Can you paste one?
<ramonskie> ERROR/MainProcess] Task provisioningserver.tasks.upload_dhcp_leases[e19c4353-92a2-499e-8f95-154b3b017950] raised unexpected: IOError()
<ramonskie> also need the trace?
<jtv> That'd be nice, thanks.  Maybe use paste.ubuntu.com.
<ramonskie> wait let me clean all the logs and restart the controller server and start all over
<ramonskie> in case its non related stuff
<jtv> OK
<jtv> Thanks for the review rvba.
<rvba> jtv: np.  I just put up for review https://code.launchpad.net/~rvb/maas/revert-2872/+merge/233191.  A run in the CI shows that it fixes the problem introduced recently.
 * jtv looks
<ramonskie> jtv: these are the errors after a restart in celery.log http://pastebin.com/ayRxD5v0
<jtv> Thanks.
<jtv> rvba: done.  :)
<rvba> Ta
<jtv> ramonskie: scant consolation but this is code that's already removed from the dev version.  :)
<jtv> Now, what seems to be going wrong is that the cluster controller is having trouble talking to the region-controller API.
<ramonskie> its on the same server
<ramonskie> i only have one
<jtv> Yeah, it should be easy, shouldn't it?
<ramonskie> :P
<jtv> You may want to check that your DEFAULT_MAAS_URL is configured sensibly.
<jtv> That's in several places.  Just grep /etc/maas for it -- but as root, or you won't be able to read some of those files.
<jtv> (There's credentials in some of them.)
<ramonskie> ./maas_local_settings.py:DEFAULT_MAAS_URL = "http://172.21.42.1/MAAS" ./maas_local_settings.py.dpkg-dist:DEFAULT_MAAS_URL = "http://maas.internal.example.com/"
<jtv> That looks sensible -- assuming 172.21.42.1 is indeed your server's IP address, and the nodes will be able to reach it.
<jtv> You could have a look to see if that same request shows up in the Apache error log, in case it did get through to the server.
<jtv> (Or maybe even the Apache access log -- but I doubt that)
<jtv> Oh!  Just in case, you may want to search the Apache access log for "/MAAS/MAAS"
<ramonskie> nope nothing found
<ramonskie> only thing i see in the apache error log is this : No such file or directory: mod_wsgi (pid=2708): Unable to change working directory to '/home/maas'
<ramonskie> but should not be the problem
<jtv> No, shouldn't be.
<jtv> It's dumb, but maybe you could just try making a wget request to http://172.21.42.1/MAAS from the server itself, just to make sure that gets through?
<ramonskie> strange thing is i see now that i have 2 cluster controllers in clusters
<jtv> Two clusters?  That's interesting.
<jtv> When it wakes up, the cluster registers itself with the region controller, and then just keeps polling for the region controller to say "sure, yeah, you're accepted."
<ramonskie> yeah i think upgrade
<jtv> The region controller should identify them by UUID.
<ramonskie> i think its a upgrade quirk that happend in 1.5
<ramonskie> but i gave that another set op ips and dns zone
<jtv> New to me...  I guess they have different UUIDs?  I guess one is "master"?
<ramonskie> yes one is master and one is called maas
<ramonskie> in dns zone
<jtv> If they're both running on the same server, that spells trouble.
<jtv> Because they control a DHCP server, a DNS server, iSCSI, a TFTP server, and so on.
<ramonskie> ahh well that explains a lot
<jtv> I'm still not sure _how_ it would cause the failure in the log, but it's definitely closer to the source.
<ramonskie> the only problem is if i delete the newly created cluster it pops backup in pending state
<jtv> Yeah.
<ramonskie> and the other cluster have still a set of working nodes in them
<jtv> You could try stopping the cluster controller, deleting the new cluster, and then in the UI updating the old one to look like the new one.
<jtv> Mind you, they'll still have different UUIDs...  so that may not be good enough.
<jtv> I think this'll require some database surgery.
<ramonskie> they have different uuids
<jtv> Yeah.
<jtv> So the upgrade generated a new one instead of reusing the old one.
<ramonskie> also the old cluster has only 6 boot-images and the new one 126
<jtv> Yeah, a lot has changed there.
<ramonskie> so can i move the nodes from the old cluster to the new one
<jtv> Let's start by getting a good view of the situation... if you grep /etc/maas for "UUID", do you get consistent UUIDs from the various config files?
<ramonskie> and then delete the old cluster
<jtv> The only way to move nodes is to delete them from one cluster and re-enlist them into the other.
<jtv> If that is not a problem, then I think it's the easiest way out.
<jtv> But it means that anything you've got running on those nodes is lost.
<ramonskie> maas_local_celeryconfig_cluster.py:CLUSTER_UUID = '3d245a63-2b23-42be-8977-f36cb2218b9e'
<ramonskie> and thats the new one
<ramonskie> yeah i can't delete thos nodes openstack is running on it with alot of vms :(
<jtv> Blast.
<ramonskie> otherwise i would already have done a clean install :)
<jtv> Well, it's going to get tricky at any rate.  First let me have a look for known bugs.
<ramonskie> okay
<jtv> Meanwhile, could you check that the cluster UUID in /etc/maas/maas_cluster.conf is consistent with the one you found in maas_local_celeryconfig_cluster.py?
<ramonskie> yes they are the same
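The consistency check done by hand here (grep the config directory for "UUID" and compare the values) can be sketched in Python. The function names are hypothetical, and the directory would be `/etc/maas`, read as root, as noted earlier in the conversation:

```python
import os
import re

UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE)


def find_cluster_uuids(config_dir):
    """Map each file in config_dir to the UUIDs on its "UUID" lines."""
    found = {}
    for name in sorted(os.listdir(config_dir)):
        path = os.path.join(config_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, errors="replace") as config_file:
            uuids = {
                match.lower()
                for line in config_file
                if "UUID" in line
                for match in UUID_RE.findall(line)
            }
        if uuids:
            found[name] = uuids
    return found


def is_consistent(found):
    """True if every file that mentions a UUID mentions the same one."""
    return len(set().union(*found.values())) <= 1
```

If `is_consistent(find_cluster_uuids("/etc/maas"))` is false, the per-file mapping shows which file disagrees.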
<jtv> OK
<ramonskie> both the new accepted cluster
<ramonskie> can i disabele the other cluster but still let dns work
<jtv> ramonskie: safest thing to try I guess would be to set them to the old cluster.  But... first a look for known bugs.
<ramonskie> that should solve it
<jtv> Well, plus a restart.  :)  And then you'd have to delete the new one.
<ramonskie> okay but if i set it to the old cluster will also the new boot images be added?
<jtv> Should be, yes.  Because AFAICT the two actually share everything except a process.
<jtv> It's the same files on disc, etc.  It may take a few minutes for the remaining cluster controller to inform the region controller of what it has.
<ramonskie> you already checked known bugs?
<ramonskie> so should i try this?
<jtv> I checked known unfixed bugs.  Let me make one more round for ones that may have been fixed later.
<jtv> ramonskie: I guess it's not bug 1344089, and that's the best candidate I found.
<ubot5> bug 1344089 in MAAS 1.6 "IntegrityError after upgrading to 1.6beta5" [Critical,Fix released] https://launchpad.net/bugs/1344089
<jtv> (I realise you hit your problem with an earlier version)
<ramonskie> i'am on 1.6 now
<jtv> Anyway, assuming that's not it, we'll have to make the change.  I'd stop the cluster controllers first.
<jtv> Then update the UUID entries in the config, to use your original cluster's UUID.
<jtv> I'd also set the cluster interfaces to Unmanaged, just so you can re-enable the right one later.
<ramonskie> whats the best and savest way to stop the cluster controller
<jtv> Then restart, accept the right cluster controller if needed (it may be automatic), enable the right cluster interface, and see if that fixes things.
<jtv> sudo service maas-cluster-celery stop
<jtv> sudo service maas-pserv stop
<jtv> Then I'd run a “ps -ef | grep maas” to check for lingering processes.
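For scripting the same check, a rough Python equivalent of the `ps -ef | grep maas` step is sketched below. The function names are hypothetical; only the standard `ps -ef` invocation is taken from the conversation:

```python
import subprocess


def matching_processes(ps_output, needle="maas"):
    """Return the lines of `ps -ef` output that mention needle."""
    lines = ps_output.splitlines()
    # Skip the header row (UID PID PPID ...), as it is not a process.
    return [line for line in lines[1:] if needle in line]


def lingering_maas_processes():
    """Run `ps -ef` and return the lines mentioning "maas"."""
    result = subprocess.run(
        ["ps", "-ef"], stdout=subprocess.PIPE,
        universal_newlines=True, check=True)
    return matching_processes(result.stdout)
```

As in the conversation, the region controller's own celery workers will match too, so the output still needs reading rather than blind killing.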
<ramonskie> wow there is still a lot running
<jtv> Yeah it's not a small thing.
<ramonskie> several of these: /usr/bin/python /usr/bin/celeryd --logfile=/var/log/maas/celery-region.log --schedule=/var/lib/maas/celerybeat-region-schedule --loglevel=INFO --beat --queues=celery,master
<ramonskie> should i kill them?
<jtv> No, those are the region controller's celery.
<ramonskie> and there should be 10 of them?
<jtv> Probably not.
<ramonskie> lol
<jtv> But I don't know what might cause there to be more...  I do hope you don't have two region controllers as well!
<ramonskie> i cerently hope not
<ramonskie> the only thing i did was what i thought a simple upgrade
<jtv> Yeah.  This clearly shouldn't have happened.
<ramonskie> okay edited both files maas_local_celeryconfig_cluster.py and maas_cluster.conf
<ramonskie> with the old uuid
<jtv> OK.
<jtv> And you've set the cluster interfaces to Unmanaged in the UI?
<ramonskie> the new cluster?
<ramonskie> done for the new cluster
<jtv> OK.  I'd do the old one as well.
<jtv> (The only drawback is your DHCP server will be down briefly -- let's keep it short)
<ramonskie> no dhcp entries will be deleted?
<jtv> Not as such, though there may be more confusion that will only become clear later.
<ramonskie> backed it up just in case
<jtv> Good.
<jtv> And then we get to restart.  A reboot would be the most comprehensive.
<ramonskie> reboot it is
<ramonskie> okay rebooted
<ramonskie> i removed the new cluster now
<ramonskie> do i need to set managed dhcp on again?
<ramonskie> what are the best next steps to take?
<jtv> First: is the old cluster now Accepted?
<jtv> If it is, then yes, re-enable DHCP management (and DNS management I guess -- you mentioned using that)
<ramonskie> yes the old cluster is accepted and the boot-images are now also 126 instead of 6
<jtv> Excellent!
<jtv> Want to try that node again?
<ramonskie> yup
<ramonskie> let me first check if everything is okay
<ramonskie> and that not the ipaddress have changed :P
<jtv> Yeah.  Anything you can check is a plus at this point.  :)
<jtv> If you feel up to it, maybe a fresh look at those logs in /var/log/maas.
<ramonskie> okay check seems okay no error for now
<ramonskie> will try a node now
 * jtv bates breath
<ramonskie> whooopppdidoooh
<ramonskie> it works
<ramonskie> muchos kudos to you!!!
<jtv> Phew.
<ramonskie> thanks for helping mate
<jtv> Glad I could help -- and glad it didn't come crashing down on us.  :-)
<ramonskie> yes i'm realy glad i don't need to start over. this saved so much work
<ramonskie> and finaly the auto discover of ipmi is working :D
<jtv> I'll have to go now, but I would really appreciate if you could file a bug about this -- especially the part where you upgraded and got two cluster controllers.  That might still be in the packaging somewhere.
<ramonskie> okay will do thanks for all the help
<ramonskie> where do you want me to fill in the bug?
<jtv> https://bugs.launchpad.net/maas
<jtv> (You have a Launchpad account ,right?)
<ramonskie> yup
<ramonskie> okay creating one now
<jtv> Thanks.  If we can prevent this from happening to someone else, that's wonderful.
 * jtv runs now
<jtv> Good night!
<ramonskie> i'm out to bye
<ramonskie> bug created https://bugs.launchpad.net/maas/+bug/1364903
<ubot5> Ubuntu bug 1364903 in MAAS "2 cluster controllers after upgrade from 1.4 > 1.5" [Undecided,New]
<rvba> blake_r: Hi Blake.  I had to revert 2872.  See https://code.launchpad.net/~rvb/maas/revert-2872/+merge/233191 for details.
<rvba> blake_r: Now I'm thinking that revision 2871 also introduced a problem (but a different one): one CI run failed with the nodes failing to get the images they need to boot.  This looks like the problem you diagnosed yesterday and said you were working on.
<rvba> blake_r: I'd say this is a race condition as the CI test passed a couple of times.
<rvba> blake_r: since you'll be up in less than an hour I'll refrain from reverting this one again.  Let's talk when you come online.
<blake_r> rvba: yes it is possible it will pass
<blake_r> rvba: the issue is that RPC is used for the API call but not for the image selection when a node is booting
<blake_r> rvba: so pxeconfig will fail
<blake_r> rvba: I have a branch that fixes pxeconfig
<rvba> blake_r: 2871 causes the images not to be present from time to time.  2872 (which I reverted) was causing the node to fail to enlist (see my paste on the revertion MP).
<rvba> noeds*
<rvba> nodes*
<rvba> arg
<blake_r> rvba: its not that the images are not present, its that the images are not present in the BootImage model, which is going away
<rvba> blake_r: right, what I meant is that, as far as the node can see (and this involves the BootImage model), the images are not there.
<blake_r> rvba: yes correct, I was going to get that branch ready and land today, looks like I will have to do all of the again, :(
<rvba> blake_r: I just reverted 2872 (which was causing failures all the time), not 2871.
<blake_r> rvba: okay
<blake_r> rvba: will look at the mp in a moment, getting through email this morning
<rvba> blake_r: I'm sorry but I was in the middle of a QA and having trunk broken like that means a lot of time wasted for me.
<blake_r> rvba: oh I see the reason
<blake_r> rvba: yeah its reporting the avaliable architectures wrong, I will work on a fix
<rvba> blake_r: cool, ta.
<rvba> blake_r: if you can fix the breakage introduced by 2871 first that would be great.  Because 2871 is still checked in.
<blake_r> rvba: okay
<rvba> Thanks.
<newell> I am getting a 500 error when I go to aquire a commissioned node with latest trunk.  Here is the stacktrace: http://paste.ubuntu.com/8223903/
<newell> Anyone seen this before?
<rvba> newell: let me have a look at this stacktrace…
<rvba> newell: looks like a bug in gen_dynamic_ip_addresses_with_host_maps: it should skip the ngi with no static_ip_range_low/hig
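The kind of fix suggested here, skipping interfaces that have no static range instead of failing on them, could look like the sketch below. It is a simplified stand-in using plain dicts and a hypothetical function name, not the real `gen_dynamic_ip_addresses_with_host_maps`, and rvba himself later questions whether this is the right fix:

```python
from ipaddress import ip_address


def gen_static_addresses(interfaces):
    """Yield every address in each interface's static range.

    Interfaces are dicts with optional "static_ip_range_low" and
    "static_ip_range_high" keys (a simplification for this sketch).
    Interfaces lacking either bound are skipped rather than raising.
    """
    for interface in interfaces:
        low = interface.get("static_ip_range_low")
        high = interface.get("static_ip_range_high")
        if not low or not high:
            continue  # No static range configured: nothing to generate.
        current, last = ip_address(low), ip_address(high)
        while current <= last:
            yield str(current)
            current += 1
```

Without the `continue`, an interface with a missing bound would reach `ip_address(None)` and raise, which is the sort of failure that surfaces as a 500 error.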
<rvba> newell: can you file a critical bug about this?
<newell> yeah
<newell> https://bugs.launchpad.net/maas/+bug/1364993
<ubot5> Ubuntu bug 1364993 in MAAS "gen_dynamic_ip_addresses_with_host_maps: it should skip the ngi with no static_ip_range_low/hig" [Critical,New]
<rvba> newell: I changed the title for this bug.  We try to explain what the problem is in the title/descriptions.  Suggestions on the possible cause or ideas on how to fix it should be put in the comments.
<rvba> newell: this helps triaging and lets people come up with alternative solutions.
<newell> ha I just changed the title as well before I just read this
<newell> wonder where it stands now
<newell> https://bugs.launchpad.net/maas/+bug/1364993
<ubot5> Ubuntu bug 1364993 in MAAS "500 error when trying to acquire a commissioned node" [Critical,New]
<newell> Is that better?
<rvba> Yep, it describes the problem.
<newell> k, we were both in the middle of changing it and my page wasn't refreshed, that is why I didn't see that you had modified it
<rvba> I figured :)
<newell> rvba, still around?
<rvba> newell: yep
<newell> rvba, we need a test for that bug or is it trivial enough to just push the change you mentioned?
<rvba> newell: as with any non-trivial change, it's worth a test
<rvba> newell: now I'm not so sure my solution is the right one as test__treats_undefined_static_range_as_zero_size_network seems to test the case where ngi has no static range.
<newell> yeah I was looking at that too
<newell> your change doesn't break any tests though
<rvba> newell: there is a bug in the test :)
<newell> ha
<rvba> newell: can you spot it?
<newell> let me take a look
<rvba> newell: My fix is up for review.  https://code.launchpad.net/~rvb/maas/bug-1364993/+merge/233244.  And I need to step out now. ttyl.
<newell> sounds good I will review it, sorry wife was asking me questions and got pulled away
<newell> ha, just needed to save it
<dpb2> Hi all -- I'm starting a server, but I don't see any log message for the power on attempt in celery log (just periodic dhcp refreshes, etc).  What is up?
<dpb2> (this install had been working fine)
<dpb2> roaksoax: ^ any ideas?
<roaksoax> dpb2: check whether maas-pserv is running
<roaksoax> dpb2: are you importing images?
<dpb2> roaksoax: all maas services are reported as running
<dpb2> roaksoax: let me check on the images
<dpb2> roaksoax: actually, I'm not sure how to check that. :)
<roaksoax> dpb2: what MAAS version are you using?
<dpb2> 1.6.1+bzr2550-0ubuntu1~ppa2
<roaksoax> dpb2: uhmmm
<roaksoax> blake_r: ^^ any thoughts?
<dpb2> roaksoax: I *could* restart all the services, but I didn't want to mask an issue
<roaksoax> dpb2: do please restart the issue. 1.7 will completely change in that area
<roaksoax> dpb2: because of celery being silly and causing issues like this
<roaksoax> dpb2: (we are getting rid of celery)
<roaksoax> dpb2: can you see logs?
<roaksoax> dpb2: maas.log celery.log
<dpb2> hm
<dpb2> roaksoax: I'm seeing the logs now
<dpb2> (just now)
<dpb2> roaksoax: so...
<dpb2> roaksoax: if boot images are importing, does that block power up attempts
<dpb2> ?
<roaksoax> dpb2: yes it can... celery blocks any other jobs if a bigger job is in progress
<dpb2> yikes
<dpb2> ok
<dpb2> well, I think it's working now.  the old "try it again" fixed it
<dpb2> thanks
<roaksoax> dpb2: np!
<Valduare> hows it going guys
<Valduare> any news on maas with arm devices?
<newell> Valduare, there is currently some arm support (i.e. arm64/armhf etc.)
<Valduare> I have a few mk808 devices here that would be fun to be able to spin them up etc
#maas 2014-09-04
<bigjools> jtv: may I avail myself of your reviewing powers please https://code.launchpad.net/~julian-edwards/maas/timer-commands/+merge/233296
<jtv> *trumpet music blares briefly*
<bigjools> is that like the reaping music in Hunger Games?
<jtv> Can't discuss that here.  Probably illegal.
<jtv> Anyway, your review is done.
<bigjools> thanks
<newell> Hola all
<jtv> Controlling the power on our NUCs would be so much easier if amttool supported IPv6...
<jtv> Bah.  Doesn't look as if wsman supports IPv6 either.  :(
<rvba> allenap: it seems we've got many different ways to test methods that call the RPC stuff in the code base… I assume this is due to the fact that you've refined it recently… can you point me to the most up-to-date example?
<allenap> rvba: It depends on what you want to do :) call_responder is best when you’re testing a single RPC method directly.
<rvba> allenap: I want to test this: a RPC method should be called as a side-effect of changing a node's status.
<allenap> rvba: Mock(Live)?(ClusterToRegion|RegionToCluster)Fixture are best when testing code that uses RPC. The Live variants are best when there isn’t an opportunity to pump IO in the test, by hand as it were.
<rvba> allenap: okay, thanks; that's what I started using.  There are places in the code that use pure mocking instead of this.
<allenap> rvba: Personally I would be tempted to write a helper in maasserver.clusterrpc somewhere, test that on its own, then use patch_autospec to test that *that* is called when changing a node’s status.
<rvba> allenap: I don't understand… what would that helper do?
<rvba> allenap: couldn't we get MockRegionToClusterRPCFixture to use patch_autospec under the hood to create "valid" mocks for all the registered methods?
<allenap> rvba: We’re actually getting something similar because all calls are being validated against the RPC schema.
<rvba> allenap: ah, okay.
<rvba> allenap: still, it's not exactly the same thing as having mocks already setup.
<allenap> rvba: The helper would be a thin wrapper around getting a client and making the remote call. It’s a conveniently small thing to test, but also provides a good point at which to mock.
<allenap> rvba: What kind of mocks do you want? When running the pserv tests I don’t really want to bring up the database and Django.
<rvba> allenap: well, just mock objects;  and then you can configure them if you want.  I don't see how that's related to Django at all but I must be missing something.
<allenap> rvba: If you’re using addEventLoop (from the cluster) or addCluster (from the region) with the end-to-end fixtures, then you can specify which calls to mock, and it does provide stubs for you to customise. It’s hard to customise those to have some default behaviour when that behaviour is defined elsewhere (e.g. provisioningserver shouldn’t import
<allenap> maasserver, and that’s where Django comes in) and might require a database with sample data. I also don’t think that would be a good way to write unit tests.
<rvba> allenap: fair enough.
<gQuigs> ~$ maas maas-mini-test maas get-config maas_name=unicode
<gQuigs> No provided name!
 * gQuigs doesn't understand how to query get-config
<gQuigs> ideally, I'd like to be able to just get all the config back
<rvba> allenap: I'm still not completely satisfied with the explanation you gave me on ~allenap/maas/rpc-start-nodes-extra… why were we excluding nodes with MACs?  If this is still a requirement, why aren't you changing all the call sites to eliminate the nodes with MACs before calling start_nodes.  And if you think start_nodes is the wrong place to do this, why did you do something very similar in stop_nodes?
<rvba> allenap: sorry, lots of questions :)
<allenap> rvba: I honestly don’t know why we were excluding nodes without MACs.
<allenap> It seems bizarre to make that decision then. Worse yet to exclude them without warning, just silently drop them.
<allenap> rvba: I’m not sure the call sites need changing… well, I don’t know. How do we get to a point where we have a node without a MAC address?
<rvba> allenap: the only explanation I can see is that for some reason we know we can't power them up.  In which case it's logical to exclude them.
<rvba> allenap: yeah, I know, that's why the whole thing is weird.
<allenap> rvba: stop_nodes() doesn’t exclude nodes without MACs, only those where the power_info tuple has can_be_stopped=False.
<rvba> allenap: right.
<rvba> allenap: I'm tempted to agree with you and get rid of it… but because I don't understand why we were doing this in the first place I'm a bit confused…
<allenap> rvba: On one hand I want to remove it and see if anything happens.
<rvba> gQuigs:  ~$ maas maas-mini-test maas get-config name=myconfig
<gQuigs> rvba: oh, wow..  thanks!
<rvba> allenap: I think we did this because get_primary_mac is used by get_effective_power_parameters
<rvba> allenap: but there is support for when get_primary_mac return None
<rvba> allenap: okay, let's get rid of it.
<allenap> rvba: Cool.
<allenap> rvba: Thanks, by the way. It’s good to have these kinds of discussions :)
<rvba> allenap: heh, you're welcome :)
<roaksoax> rvba: isn't the primary mac also for networking?
<roaksoax> rvba: we only do mapping for primary mac
<roaksoax> for DNS
<rvba> roaksoax: no, we populate the DNS with what's in the staticIP table. And this, in turn, uses the MAC on the managed interface.
<roaksoax> rvba: ok
#maas 2014-09-05
<bigjools> jtv: with regard to https://bugs.launchpad.net/maas/+bug/1365616 can you remember why we restricted API access (even read access) to cluster workers (and admins)?
<ubot5> Ubuntu bug 1365616 in MAAS "Non-admin access to cluster controller config" [High,Triaged]
<jtv> bigjools: looking...
<bigjools> jtv: just added a comment, refresh in 1 min
<bigjools> adding*
<jtv> bigjools: the only readily apparent reason I can see is that update privileges should be restricted to admins.
<jtv> So I think the reason is probably just that we didn't have time to build both privileged and unprivileged handlers at a time when it wasn't yet clear that the latter would be needed.
<bigjools> jtv: well I don't think it's a case of two handlers - the read code explicitly makes this check
<jtv> The _read_ code makes a security check!?
<bigjools> yes
<bigjools> see src/maasserver/api/node_group_interfaces.py
<bigjools> I think I know why
<bigjools> it would enable attackers easier access
<jtv> I think I see another reason.
<jtv> The access checks look for two things:
<jtv> 1. Admin.
<jtv> 2. Cluster worker.
<jtv> I think either we didn't realise that when we built the NGI API, or just didn't want the complication of also checking for different levels of access given time pressure.
<bigjools> hmmm
<bigjools> jtv: are you free for a pre-imp in about 10 minutes?
<jtv> bigjools: sorry, didn't notice the IRC notification there.  On the bright side, your branch is now reviewed.  :)  Give me another few minutes.
<bigjools> jtv: ok I need to remember what I wanted to talk about...
<bigjools> but I have fresh coffee, so it'll come soon
<bigjools> jtv: calling
<jtv> Looks like we have that CI check now to ensure that a node's IP address is one from the static range.  But it's failing.
<jtv> I thought we'd fixed MACAddress.cluster_interface?
<jtv> Whooo!  My node is “Deployed”!
<jtv> No longer just “Allocated.”
<jtv> Thanks rvba.  :-)
<rvba> \o/
<rvba> jtv: why the new factory? (make_Network)
<rvba> jtv: I don't see that pattern (i.e. make_*N*ode) used anywhere else…?
<jtv> Look at the first branch first.  :)
<rvba> ah
<jtv> Sometimes I wonder: are we actually cleaning up the Celery-based code that we disable?
<bigjools> seems not
<bigjools> I plan on eviscerating some stuff later
<bigjools> rvba: you might know this, there's a dupe of this but I can't work out which
<bigjools> https://bugs.launchpad.net/maas/+bug/1365035
<ubot5> Ubuntu bug 1365035 in MAAS "MAAS provider bootstrap: Timeout, server <server> not responding." [Undecided,New]
<bigjools> rvba: it's where the power is slow to go off/on
<bigjools> and the old machine gets re-used
<rvba> bigjools: are you thinking about https://bugs.launchpad.net/maas/+bug/1325610 ?
<ubot5> Ubuntu bug 1325610 in MAAS "node marked "Ready" before poweroff complete" [High,Triaged]
<bigjools> aha
<bigjools> thanks
<bigjools> oh you already did it
<bigjools> ah no you didn't, confusing LP ui fail
<bigjools> easy karma for someone, delete-only branch: https://code.launchpad.net/~julian-edwards/maas/remove-update-leases-api/+merge/233466
<maastest> Can someone please help me with this question http://askubuntu.com/questions/520240/problem-deploying-node
#maas 2015-08-31
<h0mer> anyone here familiar with the Canonical openstack distro?
<h0mer> specifically how to ssh into a glance machine deployed by JUJU?
<binoy> Hi
<binoy> how can I power off nodes using maas . I am using WoL as power type
<binoy>  how can I power off nodes using maas GUI. I am using WoL as power type
<bino> hi
<bino> can anyone help me on wake on lan power type with maas
<bdx> hows it going everyone?
<bdx> http://askubuntu.com/questions/666572/how-to-add-user-defined-cloud-config-preseed-in-maas
<mup> Bug #1490630 opened: intermittent access issues for IPMI with HP Gen8 BL460c <MAAS:New> <https://launchpad.net/bugs/1490630>
<ntpttr> Hey everyone, I just upgraded maas from 1.7 to 1.8 by running 'sudo add-apt-repository ppa:maas-maintainers/stable' and then 'sudo apt-get dist-upgrade', and now when I go to localhost/MAAS on my browser it's stuck on a screen that says "MAAS is starting, please try again in a few seconds". Does anyone know what could make this happen?
<mup> Bug #1490637 opened: Devices model should support types (container) <MAAS:New> <https://launchpad.net/bugs/1490637>
<roaksoax> ntpttr: what does /var/log/maas/*.log say ?
<roaksoax> ntpttr: try to access the webui again?
<mup> Bug #1490637 changed: Devices model should support types (container) <MAAS:New> <https://launchpad.net/bugs/1490637>
<ntpttr> roaksoax: It looks like reloading bind is failing, it's saying rndc connect failed: 127.0.0.1#954
<philibar> Hi All, does anyone here have any information on how we can update the commission image from 14.04.1 to 14.04.3? Thank you
<ntpttr> roaksoax: And when I try to start bind manually it fails with the same message
<roaksoax> philibar: you can't use point releases for commissioning images, you can however, use the latest images
<roaksoax> ntpttr: so what does syslog say about the bind ? maybe there's a config that's making maas fail ?
<roaksoax> or bind fail
<roaksoax> philibar: you can change the sources from 'releases' to 'daily'
<roaksoax> philibar: you can change the sources from 'releases' to 'daily' images, which will give you the latest commissioning image
<roaksoax> philibar: in Settings under Boot Images
<ntpttr> roaksoax: All it's giving me is 'Reloading BIND failed: command `rndc -c /etc/bind/maas/rndc.conf.maas reload` returned non-zero exit status 1: #012rndc: connect failed: 127.0.0.1#954: connection refused"
<philibar> roaksoax, Thanks a bunch, we'll give that a try
<ntpttr> roaksoax: I figured out the problem if you're interested. dnssec validation was defined both in /etc/bind/named.conf.options and /etc/bind/maas/named.conf.options.inside.maas which was causing it to fail because the configuration already existed
<roaksoax> ntpttr: weird, MAAS should handle the upgrade gracefully
<roaksoax> ntpttr: what version of 1.8 are you using?
<ntpttr> roaksoax: It looks like 1.8.0
<roaksoax> ntpttr: ah yes, we fixed that in 1.8.1
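The failure ntpttr describes (dnssec-validation set in both /etc/bind/named.conf.options and /etc/bind/maas/named.conf.options.inside.maas) can be spotted by checking whether the same option appears in more than one file. A minimal Python sketch, assuming the file contents have already been read into strings; `find_duplicate_options` and the sample contents are illustrative and not part of MAAS or BIND.

```python
import re
from collections import defaultdict

# Matches uncommented statements such as "dnssec-validation auto;".
OPTION_RE = re.compile(r"^\s*(dnssec-[a-z]+)\s+[^;]+;", re.MULTILINE)


def find_duplicate_options(configs):
    """Map each dnssec-* option to the files defining it, if more than one.

    `configs` is a dict of {filename: file contents}.
    """
    seen = defaultdict(list)
    for filename, text in configs.items():
        for name in set(OPTION_RE.findall(text)):
            seen[name].append(filename)
    return {name: sorted(files)
            for name, files in seen.items() if len(files) > 1}


# Illustrative contents only; real files will contain more statements.
configs = {
    "/etc/bind/named.conf.options":
        "options {\n    dnssec-validation auto;\n};\n",
    "/etc/bind/maas/named.conf.options.inside.maas":
        "forwarders { 192.0.2.1; };\ndnssec-validation auto;\n",
}
duplicates = find_duplicate_options(configs)
print(duplicates)
```

In practice the dictionary would be filled by reading the two files named in the chat.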
<ntpttr> roaksoax: Oh, do you know how upgrading to 1.8.1 works? I didn't see too much documentation about it
<ntpttr> roaksoax: Also, I don't know if you know much about juju as well, but I'm behind a proxy and every command I run with juju times out saying the API is unavailable
<roaksoax> ntpttr: you can just sudo apt-get dist-upgrade to upgrade to 1.8.1 and 1.8.2, although, 1.8.2 will be out soon
<roaksoax> it is already out, it  will be available for upgrade this week
<roaksoax> ntpttr: you probably need to set a proxy in juju too
<ntpttr> roaksoax: I
<roaksoax> ntpttr: https://jujucharms.com/docs/stable/howto-proxies
<ntpttr> roaksoax: I'll check that doc out, the 'juju set-env http-proxy' command is timing out too
<ntpttr> roaksoax: thanks for your help!
<roaksoax> ntpttr: np! sorry cant help more with juju
<philibar> hi roaksoax, your hack worked on getting 14.04.3, but the image does not support LTSEnablementStack, therefore the kernel does not update and doesn't fix our issue.
<philibar> roaksoax, do you think you would know a solution on how to have a more recent kernel? I'm guessing it's by setting up a local OS repo?
<roaksoax> philibar: recent kernel as in using a vivid kernel in trusty for example?
<philibar> yes
<philibar> it's exactly that
<philibar> we're using HP nodes, and the driver isn't detecting the disks if we're not using vivid :(
<mup> Bug #1490709 opened: When no HWE kernels for a release, MAAS UI shows  <MAAS:New> <https://launchpad.net/bugs/1490709>
<roaksoax> philibar: if you are using 1.8, change the subarchitecture for the node to amd64/hwe-v *or*, which is probably best, in the MAAS Settings Page
<roaksoax> philibar: change the Commissioning release from trusty to Vivid
<roaksoax> philibar: under the "Commissioning" section
<roaksoax> philibar: i think the latter should fix your issue
<roaksoax> philibar: in 1.9, however, when you *deploy* you will be able to select what kernel version you use right there instead of changing the subarch
<philibar> roaksoax, we're running 1.8, and that's where I run into the issue that even if I have 15.10, 15.04, 14.10, and 14.04 LTS, we are unable to change the Commissioning release that will be used.
<philibar> we only have 14.04 LTS in the drop down
<mup> Bug #1490709 changed: When no HWE kernels for a release, MAAS UI shows "string index out of range" <MAAS:New for ltrager> <https://launchpad.net/bugs/1490709>
<roaksoax> philibar: uhmm. Right, that means we are restricting LTS's for commissioning then
<roaksoax> philibar: ok, so the other option, go to the machine details itself, and change the subarchitecture to amd64/hwe-v
<roaksoax> philibar: that should do it
<roaksoax> philibar: let me know if that works
<philibar> roaksoax, Testing now..
<philibar> YESSSS!
<philibar> roaksoax, it works!
<roaksoax> philibar: cool!
<philibar> Finally! haha Thanks a bunch again, it has identified all my disks and went on the correct kernel
<roaksoax> philibar: woohoo!
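The subarchitecture string roaksoax suggests combines the architecture with an hwe-<letter> suffix, where the letter appears to be the first letter of the release codename (v for vivid). A tiny Python sketch under that assumption; `hwe_subarchitecture` is a hypothetical helper for building the string, not a MAAS API.

```python
def hwe_subarchitecture(arch, release):
    """Build an "<arch>/hwe-<letter>" string from a release codename.

    Assumption: the suffix letter is the first letter of the codename,
    as in the amd64/hwe-v example from the chat (v for vivid).
    """
    if not release:
        raise ValueError("release codename must not be empty")
    return f"{arch}/hwe-{release[0].lower()}"


subarch = hwe_subarchitecture("amd64", "vivid")
print(subarch)
```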
<mup> Bug #1490711 opened: MAAS doesn't allow you to select what HWE kernel to use when commissioning <MAAS:Confirmed> <https://launchpad.net/bugs/1490711>
<philibar> and right on time for me to catch my train! Thanks again
<philibar> have a good one roaksoax
<mup> Bug #1490711 changed: MAAS doesn't allow you to select what HWE kernel to use when commissioning <MAAS:Confirmed> <https://launchpad.net/bugs/1490711>
#maas 2015-09-01
<mup> Bug #1490847 opened: API docs for GET nodes/op=list don't mention available filters <MAAS:New> <https://launchpad.net/bugs/1490847>
<mup> Bug #1474417 changed: squid-deb-proxy does not refresh translation files often enough <maas (Ubuntu):Fix Released> <squid-deb-proxy (Ubuntu):Fix Released by mvo> <https://launchpad.net/bugs/1474417>
<mup> Bug #1490962 opened: xgene-uboot and xgene-uboot-mustang "not a valid architecture"  and missing in ui <arm64> <xgene> <MAAS:New> <https://launchpad.net/bugs/1490962>
<mup> Bug #1491057 opened: Record and expose SSH host keys <landscape> <MAAS:Triaged> <https://launchpad.net/bugs/1491057>
#maas 2015-09-02
<sputnik13> hello what does it mean for a maas cluster to be "disconnected"
<sputnik13> the region and cluster controller are on the same box
<sputnik13> and the nodes being managed are on the same switch as the controller, and they're all pingable
<sputnik13> what the...  is there a reason why maas-clusterd and maas-pserv try to listen on the same port
<mup> Bug #1491236 opened: How to configure virsh with maas <MAAS:New> <https://launchpad.net/bugs/1491236>
<mup> Bug #1491324 opened: Test failure: maasserver.websockets.websockets._WSException: Reserved flag in frame (114) <tests> <MAAS:Triaged> <https://launchpad.net/bugs/1491324>
<mup> Bug #1491344 opened: Possible block device misdetection after upgrade <MAAS:New> <https://launchpad.net/bugs/1491344>
<mup> Bug #1491344 changed: Possible block device misdetection after upgrade <MAAS:Invalid> <https://launchpad.net/bugs/1491344>
<mup> Bug #1491403 opened: add truncation/download button to log listing page <landscape> <ui> <MAAS:Triaged> <https://launchpad.net/bugs/1491403>
<mup> Bug #1491403 changed: add truncation/download button to log listing page <landscape> <ui> <MAAS:Triaged> <https://launchpad.net/bugs/1491403>
<taj_> Hi guys, we have a little problem - it would be great if someone had any leads: We have MaaS-RC and MaaS-CC. MaaS-RC uses an internal mirror (proxy issue) where we have all necessary files. MaaS-RC imports packages without any problem but MaaS-CC is still Out-of-Sync. I can see that the cache on MaaS-CC has filled (4GB) but the system is still in out-of-sync state without any error message.
<mup> Bug #1420450 changed: Unable to enumerate nodes in MAAS <openstack-installer:Fix Released by mikemc> <MAAS:Won't Fix> <https://launchpad.net/bugs/1420450>
<mup> Bug #1491474 opened: MAAS WoL power control broken <MAAS:New for newell-jensen> <https://launchpad.net/bugs/1491474>
<mup> Bug #1401651 changed: disks not recognized with HP blades using smart array  <cloud-installer> <curtin:New> <MAAS:Fix Released> <https://launchpad.net/bugs/1401651>
<roaksoax> taj_: what version?
<roaksoax> taj_: check the logs under /var/log/maas , clusterd.log
<taj_> Hi roaksoax, it is fixed now :) Few files were missing from our mirror (like grubnetx64.efi).
#maas 2015-09-03
<james> Hi
<Guest79467> when i do the following command,  I am getting null value as o/p sudo virsh list --all
<Guest79467> root@maasserver:/home/vagrant# sudo virsh list --all
<Guest79467>  Id Name State
<Guest79467>  ----------------------------------------------------
<Guest79467>  root@maasserver:/home/vagrant#
<mup> Bug #1491742 opened: Unable to modify static address pool when no machines exist <MAAS:New> <https://launchpad.net/bugs/1491742>
<mup> Bug #1491742 changed: Unable to modify static address pool when no machines exist <MAAS:New> <https://launchpad.net/bugs/1491742>
<mup> Bug #1491822 opened: openstack-install crashes on juju bootstrap <MAAS:New> <https://launchpad.net/bugs/1491822>
<mup> Bug #1491822 changed: openstack-install crashes on juju bootstrap <MAAS:New> <https://launchpad.net/bugs/1491822>
<mup> Bug #1491822 opened: UnicodeDecodeError in _get_systemd_service_status <MAAS:Triaged> <https://launchpad.net/bugs/1491822>
<mup> Bug #1491831 opened: maas-cli should allow to specify a cluster controller for a new node <MAAS:New> <https://launchpad.net/bugs/1491831>
<mup> Bug #1491831 changed: maas-cli should allow to specify a cluster controller for a new node <MAAS:New> <https://launchpad.net/bugs/1491831>
<mup> Bug #1491843 opened: maas web interface should display cluster names in the "Add a new node" cluster dropdown, not their DNS zone names <ux> <MAAS:Triaged> <https://launchpad.net/bugs/1491843>
<bino> hi
<bino> does anybody know about virsh with maas
<bino> ?
<mup> Bug # changed: 1391612, 1482784, 1483305, 1491843
<bino> ERROR Thu, 03 Sep. 2015 14:26:10 Failed to power on node - Node could not be powered on: virsh failed with return code 1: Failed to login to virsh console.
<bino> have any idea regarding this
<mup> Bug #1491887 opened: cannot create hostmaps when dhcpd server is down <MAAS:New> <https://launchpad.net/bugs/1491887>
<mup> Bug #1491898 opened: update drivers.yaml with hpdsa driver entry <MAAS:New> <https://launchpad.net/bugs/1491898>
<mup> Bug #1376483 changed:  AttributeError: 'Port' object has no attribute 'socket' <tech-debt> <MAAS:Invalid> <MAAS 1.8:Triaged> <https://launchpad.net/bugs/1376483>
<mup> Bug #1378993 changed: Let administrators see the region secret in the MAAS UI, and provide instructions on how to register clusters. <MAAS:Won't Fix> <https://launchpad.net/bugs/1378993>
<mup> Bug #1491924 opened: Deploying a node via the API with an incorrect HWE kernel succeeds <MAAS:Triaged by ltrager> <https://launchpad.net/bugs/1491924>
<mup> Bug #1491927 opened: No way to tell what caused a node to be released <oil> <MAAS:New> <https://launchpad.net/bugs/1491927>
<mup> Bug #1491927 changed: No way to tell what caused a node to be released <log> <oil> <MAAS:Triaged> <https://launchpad.net/bugs/1491927>
<mup> Bug #1491947 opened: ARM64 and ARMhf nodes report only 1 core on systems with 2 - 8 cores <MAAS:New> <https://launchpad.net/bugs/1491947>
#maas 2015-09-04
<binoy> I am getting the following error while trying to commission the node
<binoy> Failed to query node's BMC - Node could not be queried
<mup> Bug #1492262 opened: maas-enlist-udeb fails enlistment because of missing curl <MAAS:New> <maas-enlist (Ubuntu):Fix Released by mathieu-tl> <https://launchpad.net/bugs/1492262>
<mup> Bug #1492262 changed: maas-enlist-udeb fails enlistment because of missing curl <maas-enlist (Ubuntu):Fix Released by mathieu-tl> <https://launchpad.net/bugs/1492262>
<mup> Bug #1490865 opened: destroy-environment on an unbootstrapped MAAS environment can release all my nodes <cloud-installer> <oil> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1490865>
<mup> Bug #1490865 changed: destroy-environment on an unbootstrapped MAAS environment can release all my nodes <cloud-installer> <oil> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1490865>
<Kiall> Hey folks, I'm looking into MaaS but running into an issue.. When commissioning a node, it PXE boots etc correctly, but cloud-init attempts to pull metadata from the router's IP rather than the MaaS server.. Any clues?
<roaksoax> Kiall: that's probably a misconfiguration
<roaksoax> Kiall: what version of maas are you using?
<Kiall> roaksoax: yea, however I can't find it :) 1.8 from the MAAS PPA
<roaksoax> Kiall: what's the MAAS_URL in /etc/maas ?
<roaksoax> Kiall: and the generator in /etc/maas/pserv.yaml
<Kiall> roaksoax: two ops set to http://10.2.2.3/MAAS which is correct, and generator URL of http://10.2.2.3:5240/MAAS/api/1.0/pxeconfig/
<Kiall> Also correct
<roaksoax> Kiall: /etc/maas/maas_local_settings.py ? DEFAULT_MAAS_URL
<Kiall> Same again :(
<roaksoax> Kiall: can you capture the kernel parameters being passed to the machine when pxe booting?
<Kiall> I'm not 100% sure, the boot seems to stall before getting to a shell .. Though, may be able to interrupt the boot before it gets that far...
<roaksoax> Kiall: you should be able to see the kernel commands while the machine is pxe booting
<Kiall> kk.. need to walk a few rooms over ;)
<roaksoax> Kiall: do you have a serial console :) ?
<Kiall> roaksoax: sadly.. nope.. I see cloud-config-url=http://10.2.2.3/MAAS/metadata/etc/etc
<Kiall> and - the URL seems to correctly return a sane looking cloud-config YAML
<roaksoax> Kiall: yeah so that seems to be correct, then maybe networking is busted ?
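The check Kiall performs by eye (does cloud-config-url on the kernel command line point at the MAAS server?) can be expressed as a short Python sketch. The helper names are hypothetical, and the URL path in the sample is a placeholder because the chat elides the real one; only the cloud-config-url parameter name and the 10.2.2.3 address come from the conversation.

```python
from urllib.parse import urlparse


def cloud_config_host(cmdline):
    """Return the host part of cloud-config-url=..., or None if absent."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "cloud-config-url" and sep:
            return urlparse(value).hostname
    return None


def points_at_maas(cmdline, maas_url):
    """True when cloud-config-url uses the same host as the MAAS URL."""
    return cloud_config_host(cmdline) == urlparse(maas_url).hostname


# Placeholder command line: the path after /MAAS/metadata/ is invented
# for illustration because the chat only shows "etc/etc".
cmdline = "ro cloud-config-url=http://10.2.2.3/MAAS/metadata/placeholder quiet"
print(cloud_config_host(cmdline))
print(points_at_maas(cmdline, "http://10.2.2.3/MAAS"))
```

If this check passes, as it did for Kiall, the wrong metadata address is being picked up somewhere other than the kernel command line.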
<mwenning> Hi maas team, is there a way to fetch all the commissioning summary data for a maas node from the command line?
<rpa> Hi there
<rpa> i have a problem when i try to get a node running in my testing environment. i'm trying to run a new node, but when the node starts a poweroff command is issued
<mup> Bug #1492465 opened: shouldn't "Ram" be capitalized in node details view? <MAAS:New> <https://launchpad.net/bugs/1492465>
<mup> Bug #1492465 changed: shouldn't "Ram" be capitalized in node details view? <MAAS:New> <https://launchpad.net/bugs/1492465>
#maas 2015-09-05
<jbran> How do I go about deploying a specific point release? Let's say I want to deploy 12.04.0 instead of 12.04.5?
<mup> Bug #1492531 opened: MAAS seems to be using old Trusty images <MAAS:New> <https://launchpad.net/bugs/1492531>
<mup> Bug #1492531 changed: MAAS seems to be using old Trusty images <MAAS:New> <https://launchpad.net/bugs/1492531>
#maas 2015-09-06
<mup> Bug #1472228 changed: maas-pserv still " Stop"  !!  <MAAS:Expired> <https://launchpad.net/bugs/1472228>
<mup> Bug #1472228 opened: maas-pserv still " Stop"  !!  <MAAS:Expired> <https://launchpad.net/bugs/1472228>
<bleepbloop> Anyone have a clue why I would have begun receiving the error "list index out of range" when attempting to commission some machines in maas?
#maas 2016-09-05
<gurki> hello
<gurki> is there a way to have maas running on a centos?
<gurki> i kind of have to have centos on the management node and i dont want to deploy a second one ...
<kiko> gurki, deploy an ubuntu container and run MAAS inside that
<NetNet> Any idea what the default login details are for a deployed node???
<brendand> ubuntu/ubuntu
<brendand> actually, there shouldn't be a login as it should use public key authentication
<NetNet> Tried that and it only returns login incorrect...I'm pulling my hair out with this! haha I have tried lots of combinations and nothing has worked - no matter what OS I install on the node, it comes up with the login prompt and I can get no further! :(
<NetNet> hmm...I have an SSH key setup on the web ui - I take it this would need adding to the actual MAAS appliance first though?
<brendand> NetNet, that key needs to be wherever you want to ssh from
<NetNet> do you have to use the SSH key to login to the node locally (as in through direct KVM access to the node..)
<brendand> NetNet, when you added the ssh key, where did you get it from?
<gurki> kiko: thats kind of hard as maas is gonna try to create its own vms when running image builder
<gurki> ... breaking when run inside a vm.
<roaksoax> gurki: maas-image-builder has really nothing to do with MAAS. maas-image-builder simply creates a VM to be able to create a image with what's required for MAAS
<roaksoax> gurki: so yes, you'd need an ubuntu machine to run it, and then you can import that image into MAAS
<roaksoax> NetNet: a deployed node will have the SSH key of the user in MAAS
<gurki> im aware of that. but not being able to create the images on the very node i intend to deploy them is kind of ... strange
<gurki> i basically not only need a ubuntu vm but an ubuntu on a physical machine
<gurki> which is a problem
<gurki> id basically need a physical machine just for that as we just use centos atm
<gurki> you might ask 'why use maas then' - been going for maas as it provides moonshot support
<gurki> is there actually some technical reason why it cannot run on $distribution?
<roaksoax> gurki: do the centos images provided by MAAS not work for your hardware?
<gurki> roaksoax: itll throw 'no such file or directory' at me when i try to do as said on https://maas.ubuntu.com/docs/os-support.html
<gurki> am i expected to put $thing taken from the centos iso to wherever?
<gurki> i could create a tgz if i had some physical ubuntu which i dont have
<kiko> gurki, actually, we provide the centos images in a stream
<kiko> gurki, so you just need to tick the box for them
<kiko> gurki, running MAAS bare-metal on CentOS is not a trivial project unfortunately
<kiko> I'm sure it could be done, but we are flat out as it is and can't really blink
<kiko> gurki, can't you just request an Ubuntu box? if necessary you can just deploy CentOS onto a container or VM on the same box
<gurki> kiko: how can i grab said image?
<kiko> gurki, just tick the box on the images page
<gurki> i can just tick ubuntu stuff there ... or did i miss sth?
<gurki> *rechecking*
<kiko> gurki, what version of MAAS are you running?
<kiko> gurki, on 2.0 it's enabled out of the box
<gurki> hum
<gurki> i have 1.9.4 installing them from the ubuntu 14.04 repo.
<gurki> i guess thats too old a version then.
<gurki> *throwing a recent version on some vm*
<gurki> kiko: were inside some moonshot stuff ... each node is kind of costly :s
<gurki> it is supposed to be manageable as-is, with no additional pcs/whatever required
<gurki> a ubuntu vm added to the headnode is fine, the other way around is kind of tricky as we have other management stuff that needs to run there
<kiko> gurki, I see what your challenge is now
<roaksoax> /w/win 4
<PCdude> http://askubuntu.com/questions/820925/how-do-i-set-a-dns-server-in-maas-that-will-be-passed-on-to-the-nodes
<PCdude> I hope that makes the question clear, anybody an idea?
<kiko> gurki, is there a strong rationale for putting the MAAS region controller on the chassis itself?
<kiko> gurki, is the chassis all you have?
<kiko> PCdude, the nodes typically are set to use MAAS as their DNS server
<kiko> PCdude, but you can configure forwarders, and that normally works fine
<kiko> PCdude, does that not work in your case?
<PCdude> yes it indeed has the ip address of the MAAS controller and I have indeed set the value of the DNS forwarders, but nothing is resolved as far as DNS is concerned
<PCdude> kiko: maybe there is a file I can change ?
<kiko> PCdude, something is wrong. can you clarify to me something:
<kiko> PCdude, you've deployed a node, you ssh into it, and "host google.com" fails?
<PCdude> kiko: this is what I have done now, I have changed the interface of the maas controller facing the nodes to DHCP managed only (so no DNS) and that seems to work)
<PCdude> kiko: it still has the IP of the maas controller, but I think he immediately sends it to the forwarder
<kiko> PCdude, that doesn't make sense though
<PCdude> kiko: and indeed in the "DHCP and DNS managed" situation the "host google.com" would not succeed
<PCdude> kiko: exactly my thought :)
<kiko> PCdude, something else is wrong
<PCdude> kiko: like what?
<kiko> PCdude, what version of MAAS?
<kiko> sounds like 1.9
<PCdude> kiko: 1.9.4
<PCdude> I cant use 2.0
<kiko> PCdude, because of Juju?
<PCdude> coz of JUJU which I also need and is still in beta
<PCdude> kiko yes JUJU
<PCdude> kiko: is this a known issue of MAAS 1.9?
<kiko> PCdude, no, it's not.
<kiko> PCdude, it suggests something is wrong with your setup
<PCdude> kiko: well I dont know what could be wrong haha, how can I check?
<kiko> PCdude, can you revert back to manage dhcp+dns, and ensure the forwarders are correct, and redeploy a node?
<PCdude> kiko: gonna do that right now
<PCdude> kiko: deploying rn, it will take a minute or two
<kiko> thanks
<PCdude> kiko: packages succeed.... I am really confused I did this before and suddenly it works now
<PCdude> kiko: maybe the MAAS server got an update by changing the DNS and DHCP managing on the interface
<PCdude> ?
<kiko> PCdude, yes, there could be a hidden bug somewhere
<kiko> PCdude, but I'm happy you have a working setup now
<kiko> do you want to try and reproduce, or move on? :)
<PCdude> kiko: well lets try to reproduce, its easy I have snapshots :)
<kiko> PCdude, cool. let's go back to the broken setup then.
<kiko> PCdude, the first thing is to double-check the forwarders in the MAAS settings page, and get to a failing "host google.com" on the host
<PCdude> kiko: nodes and controller are set back, I am starting the controller rn
<kiko> cool
<PCdude> kiko: the settings page indeed has the right DNS forwarder, I am gonna deploy a node rn
<PCdude> kiko: lets see what happens :)
<kiko> PCdude, cool
<kiko> PCdude, also, check out the BIND configuration we generate, which should be in /etc/maas/bind/
<kiko> PCdude, see if the forwarder is correctly set there
<PCdude> kiko: I indeed checked those before and they contain the right values but the nodes do not in the end, but let me check again
<PCdude> kiko: I think its /etc/bind/maas btw , but np I am checking rn
<PCdude> the named.conf.options.inside.maas does indeed have the right value of the forwarder
<PCdude> kiko: node is almost done with deploying
<PCdude> ping: unknown host google.com
<PCdude> "ping google.com" gives "ping: unknown host google.com"
<PCdude> "host google.com" gives "connection timed out; no servers could be reached"
<kiko> PCdude, excellent.
<kiko> PCdude, what does resolv.conf say?
<PCdude> kiko: "ping 8.8.8.8" does succeed so there is internet for sure
<kiko> PCdude, cool
<kiko> PCdude, I am thinking the DNS server on the MAAS side is either broken or not running
<kiko> but first, resolv.conf
<PCdude> that file contains the wrong nameserver value and the right DNS name
<kiko> PCdude, what do you mean?
<PCdude> kiko:  let me put it on pastebin one second
<PCdude> http://pastebin.com/raw/HFD8mabz
<PCdude> the IP address is the IP address of the MAAS controller itself
<kiko> PCdude, but that's correct. the MAAS controller is the DNS server, right?
<kiko> PCdude, next, "nslookup [IP-address]"
<kiko> err
<kiko> PCdude, next, "nslookup google.com [IP-address]"
<PCdude> kiko: yes that is right, indeed sorry I was mixing two situations
<kiko> ok
<kiko> PCdude, what's the name of the node that was deployed?
<PCdude> "nslookup google.com [IP-address]" gives "server can't find google.com: SERVFAIL"
<PCdude> the name is very simple haha "node1"
<kiko> PCdude, okay, and "nslookup node1.maas [IP-address]"?
<kiko> PCdude, and finally, "nslookup google.com 8.8.8.8"
<PCdude> both of those succeed
<kiko> okay
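The three nslookup checks kiko asks for narrow the fault down by elimination: an external name through MAAS, a local name through MAAS, and an external name through a public resolver. A hedged Python sketch of that reasoning follows; `diagnose` and its wording are hypothetical, and the booleans are meant to be filled in from running the lookups by hand.

```python
def diagnose(external_via_maas, internal_via_maas, external_via_public):
    """Turn the three lookup results into a rough diagnosis.

    external_via_maas:   "nslookup google.com [IP-address]" succeeded
    internal_via_maas:   "nslookup node1.maas [IP-address]" succeeded
    external_via_public: "nslookup google.com 8.8.8.8" succeeded
    """
    if external_via_maas:
        return "resolution through MAAS works"
    if not internal_via_maas:
        return "MAAS DNS server is unreachable or not running"
    if external_via_public:
        return "MAAS DNS is up; forwarding of external names is failing"
    return "no external resolution at all; check upstream connectivity"


# Results reported in the chat: SERVFAIL for google.com through MAAS,
# while node1.maas through MAAS and google.com through 8.8.8.8 succeed.
result = diagnose(external_via_maas=False,
                  internal_via_maas=True,
                  external_via_public=True)
print(result)
```

That outcome is why the next step in the conversation is to look at what BIND logs on the MAAS server.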
<kiko> let's go to the MAAS server now
<PCdude> terminal window or web GUI of MAAS?
<kiko> terminal window
<kiko> and look at the syslog to see what BIND is complaining about
<kiko> PCdude, grep " named" /var/log/syslog | tail
<kiko> (I think bind logs to syslog)
<PCdude> http://pastebin.com/raw/6ZPsEuGa
<PCdude> sorry I have to remove some sensitive data here and there
<PCdude> *had
<kiko> hah
<kiko> dnssec strikes again
<kiko> PCdude, what does a "grep dnssec" on your bind config return?
<PCdude> so I have to issue: grep "dnssec" /var/log/syslog | tail ?
<kiko> PCdude, no, /etc/bind/*
<kiko> PCdude, possibly -r as it's /etc/bind/maas yes?
<kiko> PCdude, but basically, you need to set "dnssec-enable no;" somewhere, and otherwise to complain to your forwarders :)
<PCdude> I have done sudo grep "dnssec" -r /etc/bind/maas/ | tail
<PCdude> is that correct?
<kiko> PCdude, do it in /etc/bind
<PCdude> u wanna see that output, right?
<PCdude> http://pastebin.com/raw/Z7EQFz9i
<kiko> PCdude, yeah
<kiko> PCdude, so this line:
<kiko> /etc/bind/maas/named.conf.options.inside.maas:dnssec-validation auto;
<kiko> needs to become "dnssec-enable no; dnssec-validation no"
<kiko> err
<kiko> needs to become "dnssec-enable no; dnssec-validation no;"
<kiko> do that, restart bind and you'll be fine
<PCdude> kiko: I think that can be done from within the MAAS web GUI :)
<kiko> PCdude, it can?
<kiko> PCdude, if it can, could you send me a screenshot showing it?
<kiko> gurki, if you see this later, just /msg me
<PCdude> on the settings page this setting is present
<PCdude> http://imgur.com/a/fRUqK
<PCdude> kiko: that is exactly what I need to change right?
<kiko> PCdude, correct. turn that off.
<PCdude> YEAH! I turned it off and the DNS resolves succeed!
<PCdude> kiko: many thanks!
<kiko> PCdude, you're welcome. that's an annoying problem.
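The manual change kiko first proposes (replacing "dnssec-validation auto;" with "dnssec-enable no; dnssec-validation no;") can be sketched as a text transformation in Python. This is illustrative only: `disable_dnssec` is a hypothetical helper, the sample forwarder address is made up, and since the file is generated by MAAS a hand edit may be overwritten, which is one reason the settings-page toggle PCdude used is the simpler route.

```python
import re

# Matches lines that already set dnssec-enable or dnssec-validation.
DNSSEC_LINE = re.compile(r"\s*dnssec-(enable|validation)\b")


def disable_dnssec(options_text):
    """Return the options text with DNSSEC switched off.

    Existing dnssec-enable / dnssec-validation lines are dropped and the
    two statements quoted in the chat are appended instead.
    """
    kept = [line for line in options_text.splitlines()
            if not DNSSEC_LINE.match(line)]
    kept += ["dnssec-enable no;", "dnssec-validation no;"]
    return "\n".join(kept) + "\n"


# Illustrative fragment; the forwarder address is a documentation IP.
before = "forwarders { 192.0.2.53; };\ndnssec-validation auto;\n"
after = disable_dnssec(before)
print(after)
```

BIND would still need a restart or reload afterwards, as kiko notes.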
<PCdude> kiko:  this really helped me out, where are u from?
<kiko> PCdude, I work at Canonical, if that's what you are asking?
<PCdude> haha good to know too, but I was more interested in what country u are from?
<kiko> PCdude, from brazil
<kiko> PCdude, filled in the answer to http://askubuntu.com/questions/820925/how-do-i-set-a-dns-server-in-maas-that-will-be-passed-on-to-the-nodes/821368#821368
<PCdude> kiko: ah ok, good to know! I upvoted and gave it the mark as correct answer
<kiko> PCdude, thanks! where are you from?
<PCdude> kiko: I am from the netherlands
<PCdude> how old are u?
<kiko> PCdude, old enough to be on IRC!
<kiko> heh
<PCdude> kiko: haha good ;)
<PCdude> kiko: any experience with openstack?
<kiko> PCdude, well.. some
<PCdude> kiko: the problem u helped with was the error that was stopping openstack from installing, so I thought that was the last one
<PCdude> kiko:  but luckily..... its not
<PCdude> kiko: u wanna help?
<kiko> PCdude, heh, there are always more problems than time
<kiko> PCdude, I can try
<PCdude> kiko: let me post the error log of the install
<PCdude> http://pastebin.com/raw/rLZUTvKP
<PCdude> this is the whole log of the install
<PCdude> kiko: apparently for landscape it is trying to install a package that fails
<kiko> [ERROR: 09-05 20:59:46, multi.py:384] Problem deploying Landscape: {'err': 'sudo: juju-deployer: command not found\n', 'status': 1, 'output': ''}
<kiko> PCdude, you need juju-deployer installed
<kiko> is that a bug in landscape I wonder?
<kiko> PCdude, apt-get install juju-deployer
<PCdude> sudo apt-cache policy juju-deployer does indeed show that it is not installed
<PCdude> but I am wondering, this should be something that the landscape installer should do right?
<PCdude> kiko: why do I even need that package? haha
<kiko> PCdude, it's because landscape uses it. it seems like it's a real proper bug
<PCdude> kiko: so for the time being I can install it myself and file a report for a bugt
<PCdude> *bug
<kiko> PCdude, yep
<PCdude> kiko: ah ok, lets try that
<PCdude> kiko: do u work remote with canonical or is there an office nearby?
<kiko> PCdude, remote
<PCdude> kiko: ah ok cool, what are u responsible for?
<PCdude> kiko: I am trying to install openstack rn
<PCdude> fingers crossed :)
<kiko> PCdude, maas and storage are mine
<PCdude> kiko: ah ok nice
<kiko> I need to split -- catch you tomorrow
<PCdude> kiko: ah ok, see u tomorrow
<PCdude> kiko: I dont know if u are still around, but the openstack installer is failing again but at a later point
<PCdude> step 3 out of 4
<PCdude> the screen show the following:
<PCdude> http://imgur.com/a/Z47K1
<PCdude> and just stays there
<PCdude> the nodes have internet so there can't be the problem
#maas 2016-09-06
<mup> Bug #1620423 opened: [2.1a2] Crash in regiond.log <MAAS:New> <https://launchpad.net/bugs/1620423>
<holocron> quick question here.. does MAAS provide the 169.254.169.254 service that cloud-init is looking for upon node commissioning?
<holocron> I'm getting PXE boot, but not getting past commissioning
<holocron> http://imgur.com/a/V5unw
<holocron> After those attempts time out, it tries to connect to the default gw (not the MAAS server btw) to get the meta-data
<holocron> perhaps this is the problem?
<holocron> http://imgur.com/a/VRsyQ
<holocron> it seems i have hit upon https://bugs.launchpad.net/maas/+bug/1402861/comments/4 but after restarting my MAAS controller, the maas-dhcpd service has failed to start
<holocron> ConditionPathExists=/var/lib/maas/dhcpd.conf was not met
<holocron> trying to sort that out now
<holocron> simply turned out that the maas-rackd service hadn't started properly on reboot -- so dhcpd is back up.. but still seeing the same error as before with not finding the metadata service on 169.254.169.254, and then not finding it on the default gw
<holocron> https://bugs.launchpad.net/maas/+bug/1620458
<mup> Bug #1620458 opened: Commissioning fails with cloud-init meta-data unavailable <cloud-init> <virsh> <MAAS:New> <https://launchpad.net/bugs/1620458>
<mup> Bug #1620478 opened: [UI] Broken validation on VLAN MTU <ui> <MAAS:New> <https://launchpad.net/bugs/1620478>
<PCdude> kiko: u there?
<mup> Bug #1620513 opened: UniqueViolation: Got more than one neighbour <networking> <MAAS:Triaged> <https://launchpad.net/bugs/1620513>
<mup> Bug #1620514 opened: Unexpected error in processEnded: twisted.internet.defer.AlreadyCalledError <networking> <MAAS:Triaged> <https://launchpad.net/bugs/1620514>
<mup> Bug #1620458 changed: Commissioning fails with cloud-init meta-data unavailable <cloud-init> <virsh> <MAAS:Invalid> <https://launchpad.net/bugs/1620458>
<kiko> PCdude, yes
<PCdude> kiko: cool
<PCdude> I am almost there with openstack, but not totally
<PCdude> let me search for the error log real quick
<PCdude> kiko: http://pastebin.com/raw/A7qtJm4v
<PCdude> it timed out, but I have no idea why and where more logs are to search in to get more info about it
<PCdude> kiko: any idea?
<neith> once maas is installed, I try to set up openstack, but the UI requires at least 3 nodes on the public network. What is this public network?
<kiko> neith, the routable external network
<neith> kiko: I suppose so, but which one is detected as the external network
<neith> Should I name it public network manually?
<neith> kiko: i selected the pub net in openvswitch
<neith> but on the add hardware pane, all my nodes have a - in the public net column
<neith> kiko: can I use auto assign IP when deploying Openstack?
<kiko> neith, are you using Juju directly, or via Landscape?
<neith> kiko landscape
<kiko> neith, I think Landscape decides that for you
<kiko> PCdude, your problem is in the openstack-install step
<neith> kiko: so why am i stuck here: http://imgur.com/a/bgjk9
<PCdude> kiko: ok, I use the latest version of the openstack installer. the questions the installer asks are pretty basic the only important parts are the IP address and the MAAS key. Both should be right I think.
<PCdude> kiko: Maybe there are others logs I can check to rule out parts or maybe gives a better image of the problem itself?
<kiko> PCdude, I've asked Adam Stokes internally
<kiko> PCdude, oh, he's on freenode, his nick is stokachu
<kiko> PCdude, he should be around in a bit
<kiko> neith, it looks like you are missing a node suitable for neutron-gateway
<kiko> Beret, are any landscape people on this channel?
<PCdude> kiko: great thanks! I am in the JUJU channel too, I sent him a message already on a suggestion from someone else
<kiko> PCdude, okay, if he doesn't reply let me know
<PCdude> kiko: will do :)
<kiko> PCdude, it seems to me like the node maas deployed may not have internet connectivity, but we tested that yesterday and it worked, so I'm not sure..
<neith> kiko: You mean the neutron node needs multiple network interfaces correctly configured?
<PCdude> kiko: exactly my thought too. I even logged in the node (SSH) after the node was deployed by the openstack installer and both internet is working and DNS can be resolved.
<kiko> PCdude, and the problem is reproducible, right?
<kiko> neith, I think so -- I've asked somebody from the landscape team to join and help out
<PCdude> kiko: I tried 4 times and all the 4 times it was giving the exact same error, so yes it is
<kiko> neith, if they don't show up I guess you can /msg dpb directly
<kiko> I need to step out for a bit
<neith> kiko: thanks
<kiko> <andreas> kiko: quick hint, yes, neutron needs two nics, it's part of the autopilot checklist at the beginning
<kiko> neith, ^^
<PCdude> kiko: yeah I have a question about that too, so both nodes that have 2 nics. one of them should be connected to the public side and one with the private side and the other node both with the private side
<PCdude> https://insights.ubuntu.com/2015/04/10/maas-network-layouts-for-the-landscape-autopilot/
<PCdude> second image
<kiko> <andreas> kiko: one nic should have the ip assigned by maas, and the other should just be connected to the network, but unconfigured
<PCdude> and the other one should be connected to the same network as the first NIC right, but then unconfigured like u said?
<PCdude> kiko: but for a split network u will still need a node with one connected to the other network
<PCdude> kiko: should that make 3 nodes with 2 NIC's or can that be one of the 2 nodes with 2 NIC's?
<rbasak> kiko: AFAIK, every TLD is an official TLD. There are one or two ones designed for local use (eg .local), but .maas isn't one of them. I wonder if there's anything that makes things better when using one of the official names.
<rbasak> (ie. that's where I'd look)
<mup> Bug #1620662 opened: MAAS should permit (optionally) DNSSEC signatures for its authoritative domains <MAAS:New> <https://launchpad.net/bugs/1620662>
<mup> Bug #1613857 changed: [doc] api.html: Available configuration items in "Manage the MAAS server" section are not properly rendered <MAAS:Fix Released by nobuto> <https://launchpad.net/bugs/1613857>
<mup> Bug #1613918 changed: [doc] api.html: inconsistent reST format breaks rendering and indent <MAAS:Fix Released by nobuto> <https://launchpad.net/bugs/1613918>
<PCdude> kiko: I have not heard from stokachu yet, just wanted to tell u
<kiko> PCdude, I didn't either, I wonder if he's out
<PCdude> he works too in canonical?
<kiko> PCdude, neither is mmcc
<kiko> yes, both do, both are responsible for openstack-install and conjure-up
<kiko> [ERROR: 09-05 23:38:49, multi.py:218] Failed to get ip directly: [Errno -2] Name or service not known
<kiko> this I find suspicious
<PCdude> ah ok cool, good to know
<PCdude> kiko: I personally think it went wrong from the moment I installed the juju-deployer manually
<PCdude> I dont think that was a smart move, but maybe I am wrong
<PCdude> kiko: uhm, just did some digging on that error line, some questions about it no answers apparently
<kiko> PCdude, see if the suggested debug steps here give you any new information? http://askubuntu.com/questions/626535/landscape-installation-takes-a-long-time
<kiko> PCdude, also look through https://github.com/Ubuntu-Solutions-Engineering/openstack-installer/issues/857 which has quite a bit of detail
<PCdude> note: the first link the comment on the questions says that the deployer timed out, that could be a thing since we installed it manually
<kiko> PCdude, no, that's really unlikely to be the issue
<kiko> I'm not sure why the dependency was missing
<PCdude> shall I post the "JUJU status" output?
<PCdude> http://pastebin.com/raw/cE8ii7s3
<PCdude> thats the output of JUJU status
<kiko> PCdude, looks like your juju deployment is failing to conclude
<kiko> PCdude, what do your machines in MAAS look like? deployed?
<PCdude> first of what gave u that hint?
<PCdude> the line with "waiting for agent initialization to finish"?
<PCdude> I have the nodes and the controller all running in VMware ESXI
<kiko> yes
<PCdude> kiko: the nodes all have 1 GB and 1 core and the controller has 4GB and 2 cores
<kiko> PCdude, you need to check out what is happening on the nodes themselves
<PCdude> rn I only wanna try it to deploy it later on normal "real" nodes
<PCdude> kiko: well, from what I can see most of it is "normal"
<PCdude> kiko: I can see the last 20-25 lines that the nodes has as output
<PCdude> one second I will post those
<PCdude> kiko: http://imgur.com/a/vH3bX
<PCdude> thats what I can see on the node from the output
<PCdude> I can still login and use it though
<PCdude> maybe some error logs?
<kiko> PCdude, yeah, look through the logs, in particular in /var/log/juju
<PCdude> kiko: holy shit, that is alot of info
<PCdude> kiko: I think MAAS has a different view on the structure of the network
<kiko> PCdude?
<PCdude> ok, let me send the logs first then I will explain hahah
<PCdude> http://pastebin.com/zuX1TJcB
<PCdude> this is the log specifically for the node that was made by JUJU/openstack
<PCdude> it is fetching for a public IP but does not succeed
<PCdude> kiko: let me make sure that the way I set it up rn is how openstack needs it
<PCdude> I have 5 machines
<PCdude> 2 of them have 2 NIC's to the private network
<PCdude> 1 of them has also 2 NIC's but one of them is to the private network and one to the public
<PCdude> the other 2 have only 1 NIC in them
<PCdude> all use WOL to startup
<PCdude> I think that is pretty much it
<PCdude> kiko: is that setup correct?
<kiko> PCdude, I don't really understand what problem you are having, and what's worse I don't quite understand the openstack requirements
<PCdude> kiko: u mean I am not clear enough in what I want or u mean the interpretation of the info I gave u?
<kiko> the latter
<kiko> I know you just want an openstack deployed
<PCdude> ah ok, well that makes two of us : )
<kiko> PCdude, I think the easiest thing to do now is to ask on askubuntu, and we'll get somebody to look at it
<erickes> Hi, I have installed maas 194 on 1404. I have started a VM that is booting correctly from the maas node. I get the enlisting login prompt but enlisting is not started. What could cause this?
<PCdude> kiko: will do, I will make a question and post the link here
<kiko> thanks PCdude
<kiko> erickes, I'd guess network connectivity
<kiko> erickes, i.e. if the node can't get out to the internet
<erickes> Could it be that a port is blocked? We have a firewall on the maas node
<kiko> erickes, if you could try letting all traffic through, that'd be good
<erickes> Ok thx I can have a look on that.
<kiko> erickes, let me know if you need any further help
<erickes> thx, I think it's network related
<erickes> as you say. Need to investigate a little bit more
<kiko> sure thing
<erickes> I have added iptables -A FORWARD -o br0 -i br3 -s 172.18.10.0/24 -m conntrack --ctstate NEW -j ACCEPT
<erickes>  to the iptables and now it works, br0 is the internet interface
<erickes> brs is the maas interface where pxe boot is running
<erickes> br3
<kiko> erickes, is your masquerading being done externally?
<erickes> no
<kiko> erickes, well.. 172.18 is in the rfc1918 space
<kiko> erickes, so somebody needs to NAT it in order for it to route externally
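kiko's point about rfc1918 space can be checked mechanically. The sketch below uses only Python's standard `ipaddress` module; the function name `needs_nat` is made up for the example. If nothing upstream masquerades the traffic, a rule along the lines of `iptables -t nat -A POSTROUTING -s 172.18.10.0/24 -o br0 -j MASQUERADE` on the MAAS node is the usual shape of the fix, though whether that suits erickes's setup is an assumption.

```python
import ipaddress

# The three private blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def needs_nat(cidr: str) -> bool:
    """Return True when the whole subnet sits inside RFC 1918 private space."""
    network = ipaddress.ip_network(cidr)
    return any(network.subnet_of(block) for block in RFC1918_BLOCKS)

print(needs_nat("172.18.10.0/24"))  # the PXE subnet from the chat
print(needs_nat("8.8.8.0/24"))      # a publicly routed range for contrast
```

`subnet_of` needs Python 3.7 or newer.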
<erickes> check
<PCdude> kiko: a little late but here is the stackexchange link (had some other stuff I had to take care of)
<PCdude> http://askubuntu.com/questions/821804/openstack-with-landscape-install-fails
<kiko> PCdude, are you behind a proxy?
<PCdude> kiko: yes I am, I use PIA when my internet goes out of the house
<PCdude> kiko: but internet is working like it should be, could that be a problem?
<kiko> PCdude, well, sometimes you have packages that get corrupted when coming via the proxy, etc
<kiko> PCdude, can you do a juju bootstrap onto that MAAS server and deploy the ubuntu charm?
<PCdude> kiko: here is an update on the issue : https://github.com/Ubuntu-Solutions-Engineering/openstack-installer/issues/986
<PCdude> kiko: haha just what I was doing
<PCdude> that last answer could be the answer let me do that
<PCdude> kiko: ok so I changed my environments file to the following:
<PCdude> kiko: http://pastebin.com/raw/4fyi6AF3
<kiko> looks correct
<PCdude> kiko: the first line has just been added, does that look like it could work?
<PCdude> ah ok good
<PCdude> lets try juju bootstrap again
<PCdude> http://pastebin.com/raw/SgJQ883h
<PCdude> :(
<PCdude> lets try the internet for some solution
<DickV> Installed maas 2.0, dhcp on private net, nodes pxe-ed, boot to login, NEVER show up on maas Nodes pg
<PCdude> DickV: is ur DHCP server running on the private network for the nodes?
<kiko> DickV, sounds like enlistment isn't completing, just like we saw with erickes a while back
<kiko> DickV, my guess as to where to start: can your nodes route out to the internet?
<DickV> I can't login to the nodes directly
<DickV> or by ssh
<PCdude> DickV: u first have to add the key of the nodes for it to SSH
<kiko> DickV, yeah, during enlistment you really can't unless you backdoor the images
<kiko> PCdude, he's not even enlisted yet!
<DickV> Each node shows up in the used ip list after pxe boot, but then they all disappear
<kiko> PCdude, hey, why not start your environments file by running "juju init" first?
<kiko> DickV, my bet is the nodes can't enlist because they can't get out to the internet (to install packages, etc)
<DickV> Following Ubuntu and MAAS doc says install maas first, then juju
<PCdude> kiko: same error :(
<DickV> Does that mean I need a router on the private net?
<PCdude> DickV: I got the nodes to enlist without internet, but here is a link to get internet going through ur MAAS controller
<kiko> DickV, yes, and to NAT it
<PCdude> http://askubuntu.com/questions/717803/openstack-install-problem-with-juju-bootstrap/718820#718820
<kiko> it can route through the MAAS node itself
<PCdude> kiko: I posted my error from the bootstrap on the JUJU channel too. I dont use JUJU often, but this error is not from the yaml file I created right?
<DickV> Thanks. Working on it...
<kiko> PCdude, err.. you are stumbling on something quite basic now, as getting to the point where juju bootstrap kicks off is 5 minutes!
<kiko> s/basic/fundamental to be clear
<PCdude> kiko: yeah I know, I have to be honest here. I sort of took JUJU for granted as it should be working with the basic settings, but yeah never do that... haha
<PCdude> kiko: but still I searched on the internet, but no direct leads to the solution
<kiko> PCdude, when you juju init you get a template environments.yaml
<kiko> PCdude, you then just edit the key and endpoint IP address and it /will/ work
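For reference, the MAAS stanza of a Juju 1.x environments.yaml is small. The sketch below renders one from the two values kiko mentions. The key names (`type`, `maas-server`, `maas-oauth`) are written from memory of the Juju 1.x template and should be checked against what `juju init` actually generates; the URL and key are placeholders, and `render_maas_environment` is a hypothetical helper.

```python
def render_maas_environment(server_url: str, oauth_key: str, name: str = "maas") -> str:
    """Build a minimal environments.yaml text for a MAAS-backed environment.

    Assumption: key names follow the Juju 1.x template produced by `juju init`.
    """
    lines = [
        f"default: {name}",
        "environments:",
        f"  {name}:",
        "    type: maas",
        f"    maas-server: '{server_url}'",
        f"    maas-oauth: '{oauth_key}'",
    ]
    return "\n".join(lines) + "\n"

# Placeholder values; substitute the real endpoint and the MAAS API key.
print(render_maas_environment("http://192.0.2.10/MAAS/", "<MAAS-API-KEY>"))
```

Starting from the generated template and changing only these two values is the safer route, which is what kiko recommends.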
<kiko> I don't know how you ended up with that cut-down environments file
<PCdude> kiko: well I know very special ;) haha
<PCdude> *am
<PCdude> lets just delete the file
<PCdude> and bingo
<PCdude> there is a file now
<PCdude> kiko: and now just bootstrap juju and see if it succeeds ?
<PCdude> he is bootstrapping rn
<PCdude> kiko: lets hope this works...
<PCdude> kiko:  holy shit the bootstrap is successful!
<PCdude> kiko: I just added my "extra" parameters to the file and it succeeded, lets try the openstack install again
<holocron> Any help on why maas-dhcpd.service fails to start with "ConditionPathExists=/var/lib/maas/dhcpd-interfaces was not met" ?
<kiko> holocron, hmm, does that path exist?
<kiko> PCdude, see!!
<holocron> kiko -- nope, sure doesn't
<holocron> well, /var/lib/maas/ exists, but not that file
<kiko> holocron, okay, next question, have you told MAAS to manage some VLANs?
<holocron> kiko, MAAS detected some vlans from my system configuration, and one of the tagged vlans is where i am attempting to start the dhcpd service
<roaksoax> holocron: have you gone into the specific VLAN and "Enable DHCP" in the action menu ?
<holocron> yes, and at first i was able to describe the range of IP and gateway addresses
<PCdude> kiko: same error again.... :(
<PCdude> which is strange, coz I solved the problem with JUJU bootstrap. So any error should be a different one
<kiko> holocron, see the comment from roaksoax
<kiko> PCdude, if you did a juju bootstrap successfully, then can you juju deploy ubuntu?
<kiko> and does that work
<kiko> because if not.. that's the problem :-)
<holocron> roaksoax - yes
<holocron> kiko, roaksoax : the maas cli interface shows "dhcp_on": true for that fabric and vlan
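The mismatch holocron describes (the API reports "dhcp_on": true while the file named in the unit's ConditionPathExists is missing) can be expressed as a quick check. This is a sketch under assumptions: the JSON is taken to be a list of VLAN objects carrying `vid` and `dhcp_on` fields, the sample data is invented, and both function names are hypothetical.

```python
import json
import os

def dhcp_enabled_vids(vlans_json: str) -> list:
    """Return the VLAN ids whose dhcp_on flag is true in the CLI's JSON output."""
    return [vlan["vid"] for vlan in json.loads(vlans_json) if vlan.get("dhcp_on")]

def config_missing(vlans_json: str, path: str = "/var/lib/maas/dhcpd-interfaces") -> bool:
    """True when some VLAN has DHCP enabled but the expected file is absent."""
    return bool(dhcp_enabled_vids(vlans_json)) and not os.path.exists(path)

# Invented sample data shaped like a VLAN listing.
sample = json.dumps([
    {"vid": 0, "name": "untagged", "dhcp_on": False},
    {"vid": 100, "name": "vlan100", "dhcp_on": True},
])
print(dhcp_enabled_vids(sample))
print(config_missing(sample, "/nonexistent/dhcpd-interfaces"))
```

A true result from `config_missing` matches the state in the chat: MAAS believes DHCP is on, yet never wrote the configuration.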
<kiko> roaksoax, what do you think?
<kiko> interesting
<kiko> holocron, a bug, it looks like
<holocron> well, there's this https://bugs.launchpad.net/maas/+bug/1551378
<holocron> but i'm not getting a config file at all out of maas
<holocron> and this https://bugs.launchpad.net/maas/+bug/1592540
<holocron> but that doesn't help me much
<PCdude> kiko: ok, so I started fresh again with a snapshot. I did the bootstrap again which succeeded.
<PCdude> kiko: next command is "juju deploy ubuntu -n 4"
<PCdude> so far no errors, but not done yet so lets see what happens
<holocron> kiko, roaksoax : I can't explain why, but it seems to be working now - I had both an ipv4 and an ipv6 subnet on that vlan. I removed the v6 subnet and removed and readded the dhcp service and it came up
<holocron> perhaps it was because i did not have a dynamic range on the v6 network defined?
<PCdude> kiko: ok that works too now! YEEEESSS! I think we can go back to openstack now?
<mup> Bug #1620838 opened: [2.1a2] Got more than one neighbor <MAAS:New> <https://launchpad.net/bugs/1620838>
<PCdude> kiko: I will have to get some sleep now, I will see u tomorrow
<PCdude> it is still giving the same error... :(
<PCdude> but I think we are pretty close to the solution of it
<holocron> I'd like to provide DHCP to an untagged internal OVS bridge, any clues here? I've got the OVS bridge created, but I'm unsure how to go about creating the proper configuration in MAAS
#maas 2016-09-07
<holocron> :) alrighty -- commissioning virsh VMs now, they go to READY state
<holocron> when i go to deploy: Node failed to be deployed, because of the following error: divine-cow: Failed to start, static IP addresses are exhausted.
<holocron> change ip address back to dhcp
<holocron> then, i deploy and end up with "failed deployment"
<holocron> boot log looks ... okay, not sure what to debug
<holocron> first error in the event log is "  Node installation failure - 'curtin' failed: configuring disk: sda "
<mup> Bug #1620877 opened: Bad username or password error message is easy to miss <error-surface> <notifications> <MAAS:Triaged> <https://launchpad.net/bugs/1620877>
<mup> Bug #1620513 changed: UniqueViolation: Got more than one neighbour <networking> <MAAS:Triaged> <https://launchpad.net/bugs/1620513>
<mup> Bug #1593881 changed: 2.0 beta7: Internal Server Error following installation <oil> <MAAS:Expired> <https://launchpad.net/bugs/1593881>
<mup> Bug #1600052 changed: [2.0rc1] Failure to install image due to permissions - missing commissioning image choice <MAAS:Expired> <https://launchpad.net/bugs/1600052>
<mup> Bug #1620903 opened: [2.1-trunk] Unable to save network settings <MAAS:In Progress by mpontillo> <https://launchpad.net/bugs/1620903>
<mup> Bug #1620946 opened: API call fails with Internal Server Error <MAAS:New> <https://launchpad.net/bugs/1620946>
<sujeet_> Hi Kiko
<sujeet_> Hi roaksoax
<sujeet_> Is there any naming convention for the commission script?
<kiko> hello sujeet_
<kiko> not really to be honest
<sujeet_> i used the same code in other file called "get_controller_info.py" its not working
<mup> Bug #1621062 opened: Enable console login with ubuntu in enlistment phase <MAAS:Confirmed> <https://launchpad.net/bugs/1621062>
<mup> Bug #1621065 opened: [2.0] Curtin failure to install windows with xenial ephemeral image - Failed to fetch  .. rename failed, Stale file handle  <oil> <oil-2.0> <curtin:New> <MAAS:New> <https://launchpad.net/bugs/1621065>
<sujeet_> kiko: i used the same code in other file called "get_controller_info.py" its not working
<kiko> sujeet_, I think you want to engage with brian via email so we can get set up to support your work properly
<neith> how do I clear All the data related to maas?
<kiko> neith, what do you mean?
<neith> dhcp leases etc...
<neith> to restart from a fresh state
<neith> I suspect a lease collision
<neith> kiko: I have ip mismatch
<kiko> neith, 1.9 or 2.0?
<neith> 1.9
<kiko> ah, I think 1.9
<kiko> yeah, it's an infamous problem with 1.9
<kiko> you can just delete the leases directly
<kiko> shut down maas and dhcp and delete the file in /var/lib/
<neith>  /var/lib/maas/dhcp/dhcp.leases ?
<kiko> yeah
<kiko> in 2.x I think we do this for you -- roaksoax?
<neith> kiko: there are no service for maas and dhcp?
<roaksoax> in 2.0 MAAS will be notified when a lease has expired and nobody uses it
<kiko> roaksoax, I meant having an option of saying "nuke all leases"
<kiko> neith, you mean systemd/upstart?
<neith> kiko: yep
<kiko> neith, there is, but maybe it's confusingly named? I tend to look at /etc/init to remind myself
<neith> ok
<roaksoax> kiko: we dont have an specific option for that
<kiko> roaksoax, hmm, I thought we had discussed it at least. okay
<kiko> roaksoax, maybe an API call, clear all leases? to avoid a clickety click? :)
<neith> It's a must have I think
<neith> cause I often get lease collision
<kiko> neith, the question is why are you getting that? it tends to happen when your dynamic range is too small, which normally happens when you are using a /24
<neith> kiko: I do use a /24
<neith> but I only have 7 servers
<neith> Its only a PoC
<kiko> neith, how are you running out of leases then? :)
<neith> kiko: i'm not out of leases, MAAS is allocating the same ip to 2 different mac addresses
<neith> kiko: happened twice in 2 weeks
<neith> kiko: besides, half of my servers are pxe booting only once every 2 boot. do you have an idea?
<kiko> neith, that only happens when you run out of leases
<kiko> neith, first, we don't ever give you the same IP to different MAC addresses
<kiko> neith, we complain we have no IP addresses left
<neith> kiko: I'll paste you the lease file
<neith> kiko: you'll see yourself
<kiko> neith, the only way a collision can happen is a) dhcpd ran out of leases and had to reuse one and b) a bug :)
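Whether a lease file really shows one IP handed to two MAC addresses can be checked with a short script. The sketch below assumes the ISC dhcpd.leases layout (`lease <ip> { ... }` blocks with `binding state` and `hardware ethernet` lines) and no nested braces inside a block; the sample data is invented. Old entries for a previous holder can linger in the file, so a hit is a lead to investigate rather than proof of a bug.

```python
import re
from collections import defaultdict

LEASE_BLOCK = re.compile(r"lease\s+(\S+)\s*\{(.*?)\}", re.DOTALL)
ACTIVE = re.compile(r"^\s*binding state active\s*;", re.MULTILINE)
MAC_LINE = re.compile(r"hardware\s+ethernet\s+([0-9A-Fa-f:]+)\s*;")

def find_collisions(leases_text: str) -> dict:
    """Map each IP to its MACs when more than one MAC holds an active lease."""
    macs_by_ip = defaultdict(set)
    for ip, body in LEASE_BLOCK.findall(leases_text):
        if not ACTIVE.search(body):
            continue  # skip free, expired, or backup entries
        mac = MAC_LINE.search(body)
        if mac:
            macs_by_ip[ip].add(mac.group(1).lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}

# Invented sample: 10.0.0.5 is active for two different MACs.
sample = """
lease 10.0.0.5 {
  binding state active;
  hardware ethernet aa:bb:cc:00:00:01;
}
lease 10.0.0.5 {
  binding state active;
  hardware ethernet aa:bb:cc:00:00:02;
}
lease 10.0.0.6 {
  binding state free;
  hardware ethernet aa:bb:cc:00:00:03;
}
"""
print(find_collisions(sample))
```

On a MAAS 1.9 server the input would be the contents of the lease file neith names, read while dhcpd is stopped.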
<kiko> roaksoax may be able to provide some more color on that, but that's the highlight
<kiko> sure
<neith> kiko: I'm more upset by my server not pxe booting every time
<kiko> neith, what happens when it fails?
<neith> it does not even boot
<neith> its weird
<neith> the server is infinitely looping on the pxe boot sequence
<neith> and if I hit reset, the next boot is perfect
<neith> :S
<PCdude> hé everybody :)
<kiko> neith, can you catch a movie of the loop?
<kiko> hey PCdude
<neith> kiko: I can, but there are no useful information
<kiko> neith, it will help me understand what exactly is happening
<neith> kiko All 7 servers are the sames
<neith> :(
<kiko> neith, it might have to do with your problem running out of leases
<neith> kiko :(
<neith> I cleaned the lease file
<neith> and starting again
<neith> we will see
<PCdude> kiko: I forgot his IRC name, but is this other guy around for the JUJU problem? something with pickachu?
<sujeet_> ok Kiko
<mup> Bug #1621065 changed: [2.0] Curtin failure to install windows with xenial ephemeral image - Failed to fetch  .. rename failed, Stale file handle  <oil> <oil-2.0> <curtin:Invalid> <MAAS:Invalid> <https://launchpad.net/bugs/1621065>
<mup> Bug #1621072 opened: Avoid shutting down (or rebooting) when we encounter critical failures during enlistment, commissioning and installation <MAAS:Triaged> <https://launchpad.net/bugs/1621072>
<mup> Bug #1621090 opened: rack controller broken after a period of time when deployed on a seperate machine from the region <MAAS:New> <https://launchpad.net/bugs/1621090>
<sebastian__> Hi all, does anyone know how to set default IPMI credentials for Maas?
<neith> sebastian__: good question, maybe using the cli?
<sebastian__> You can only add it to individual nodes and you have to know the system id for each node where you want to set the credentials
<sebastian__> Maybe i have to ask a different question. i saw ipmi settings should be detected automatically, how can i debug this if it does not work?
<kiko> sebastian__, they are set up automatically during enlistment
<kiko> sebastian__, so that may be failing, and it's normally related to the firmware being broken/old
<sebastian__> kiko: For me this seems to be not working, do you know where i can see why?
<kiko> sebastian__, so the first step would be checking firmware correctness
<sebastian__> kiko: i did firmware updates today, they are from 22.08.2016 so not that old
<sebastian__> Maybe you need to know i am using huawei hardware
<kiko> interesting
<kiko> give me a few mins
<sebastian__> ok
<sebastian__> i'll be back in a few moments, fetching some fresh air, thanks kiko
<sebastian__> so i'm back
<kiko> sebastian__, you should be able to log into the machine during enlistment if it fails if you are fast enough. can you try to ssh ubuntu@node with the password ubuntu?
<sebastian__> one moment i'll try that kiko
<kiko> sebastian__, see the code in action here: http://bazaar.launchpad.net/~maas-committers/maas/trunk/view/head:/contrib/preseeds_v2/enlist_userdata
<kiko> sebastian__, the code we run is maas_ipmi_autodetect.py
<sebastian__> kiko: so i should be able to just execute the script and see what the output is?
<kiko> sebastian__, yep
<kiko> sebastian__, we run http://bazaar.launchpad.net/~maas-committers/maas/2.0/view/head:/etc/maas/templates/commissioning-user-data/snippets/maas_ipmi_autodetect_tool.py first
<kiko> sebastian__, and then we run http://bazaar.launchpad.net/~maas-committers/maas/2.0/view/head:/etc/maas/templates/commissioning-user-data/snippets/maas_ipmi_autodetect.py
<kiko> sebastian__, having said all that, huawei are signed partners in our cert program, so you can also file bugs and raise support tickets and we'll look into them
<kiko> bladernr is a useful contact for that
<kiko> neith, any luck?
<sebastian__> ok i'll try to boot one of the systems and log in and see what the output of those scripts is, if i am not able to figure it out i'd open a bug report for this
<neith> kiko: no, still the same 3 servers are failing to be commissioned
<neith> kiko: I'll double check the dhcp conf
<kiko> neith, oh, some work and some don't?
<neith> kiko: its really weird
<neith> 7nodes in total
<neith> 4 are perfectly working
<neith> 3 are pxe booting 1/3 attempt
<neith> 7 nodes perfectly identical and UEFI have the same conf
<kiko> neith, same firmware revs? that's often the case?
<neith> same firmware
<neith> kiko: wanna cry lol
<kiko> neith, send me the video :)
<neith> i'm in a meeting, will do later
<kiko> neith, ah! check if the switch ports have portfast enabled
<neith> kiko: GOOD ideaa
<kiko> neith, that's the other common place where that fails
<neith> kiko: it should be enabled, right? not disabled
<kiko> neith, it should be enabled. but I guess check if the ports are configured differently between machines that work and machines that don't?
<neith> kiko: ok ok
<sebastian__> kiko: just for your information, i figured out what the issue was... Huawei checks if the password is "complex" enough and thats where the ipmi detect failed... it looks like the password is "too simple" and therefore can't be set
<kiko> sebastian__, thanks! that is a bug worth reporting
<sebastian__> i'll note it down and hopefully i'll manage to create a bug report for it tomorrow, i'll have to leave now. Thank you very much for your help
<kiko> sebastian__, thanks!
<kiko> sebastian__, what model number, btw?
<sebastian__> kiko: RH1288 V3
<kiko> sebastian__, thanks
<neith> kiko: I got it
<kiko> neith, what was it?
<neith> kiko: they mismatched the port on the switch
<neith> first boot it used the 2nd interface
<neith> 2nd boot it used the 1rst one
<neith> but the first one is on the wrong vlan
<kiko> neith, phew! and we didn't need a video either :)
<kiko> neith, happy you got it sorted. what hardware is it incidentally?
<neith> HP ProLiant XL420 Gen9
<neith> I am mad about HP
<neith> how they bios can be so buggy
<neith> their
<kiko> ah, firmware
<neith> kiko: anyone from the landscape team around?
<neith> kiko: I did not figure out the subnet configuration to deploy openstack
<kiko> neith, sadly no, but if you can get an askubuntu post up I'll get somebody to look at it
<neith> kiko: alright
<jhegge> where are the API docs for the 2.1.0 Alpha 2 release?  still seeing only 2.0.0 online
<kiko> jhegge, I don't think they have been generated yet -- are you seeing a gap in the 2.0 docs?
<kiko> roaksoax, ^^
<jhegge> just wanting to look at the New discovery API
<jhegge> cool new features in both alphas
<roaksoax> jhegge: working on getting api docs updated
<jhegge> awesome, thx
<mup> Bug #1621175 opened: BMC acc setup during auto-enlistment fails on Huawei model RH1288 V3 <MAAS:New> <https://launchpad.net/bugs/1621175>
<kiko> sebastian__, see bug #1621175 I just reported
<PCdude> kiko: I sended a message to stokachu is he in today?
<kiko> PCdude, not sure, I can poke
<PCdude> kiko: please do, I really wanna get rid of this problem, all help is welcome :)
<sujeet_> Hi Kiko
<sujeet_> i want to pass the firmware upgrade image along with the script, so how can we do?
<sujeet_> From where i need to fetch the file, whether from Maas server or any other server where Maas dashboard can be accessed?
<kiko> sebastian__, if you can add your logs as roaksoax asked, I'd appreciate it
<kiko> sujeet_, I believe MAAS has an API where you can store objects which the commissioning script could fetch it from later
<kiko> roaksoax, does that still exist? I believe Juju stores or stored tools there ^^
<sujeet_> can i know the api?
<roaksoax> kiko: the object store in MAAS is deprecated
<roaksoax> kiko: but still exists
<dbainbri> on MAAS 1.9, i am seeing the IP in the MAAS UI is inconsistent with the actual IP handed out via DHCP. any way to sync these or get MAAS to accept the "correct" IP?
#maas 2016-09-08
<mup> Bug #1621285 opened: Server VLAN's on a Rack Details page not loading. <vanilla> <MAAS:New> <https://launchpad.net/bugs/1621285>
<mup> Bug #1621344 opened: [2.0] Multiple failed trusty deployments  -- rename failed, Stale file handle  <oil> <oil-2.0> <MAAS:New> <https://launchpad.net/bugs/1621344>
<mup> Bug #1621407 opened: [2.1] Add node listing filters for Network (NIC) and Storage (devices) tags <ui> <MAAS:Confirmed> <https://launchpad.net/bugs/1621407>
<mup> Bug #1621409 opened: On save tags bug storage/interfaces <ui> <MAAS:New> <https://launchpad.net/bugs/1621409>
<mup> Bug #1621344 changed: Error reading package list on ephemeral environemt <oil> <oil-2.0> <MAAS:Invalid> <https://launchpad.net/bugs/1621344>
<mup> Bug #1621344 opened: Error reading package list on ephemeral environemt <oil> <oil-2.0> <MAAS:Invalid> <https://launchpad.net/bugs/1621344>
<mup> Bug #1621344 opened: Error reading package list on ephemeral environemt <oil> <oil-2.0> <MAAS:Incomplete> <https://launchpad.net/bugs/1621344>
<mup> Bug #1621446 opened: [Vanilla] Design QA bugs <vanilla> <MAAS:In Progress by ricgard> <https://launchpad.net/bugs/1621446>
<mup> Bug #1621344 changed: Error reading package list on ephemeral environemt <oil> <oil-2.0> <MAAS:Incomplete> <https://launchpad.net/bugs/1621344>
<kiko> roaksoax, how do you think we should advise sujeet to store firmware blobs that commissioning scripts should access?
<kiko> roaksoax, I actually think the object store is a nice solution to that problem
<kiko> roaksoax, did it have major issues?
<nirlevy> Hello all
<nirlevy> has anyone recently installed MAAS on a vSphere node?
<roaksoax> kiko: yes it did have major issues and juju didn't need it anymore. The object store was there for juju really. We didn't remove it in 2.0 because juju still needed it.
<roaksoax> kiko: we won't remove it in 2.0 series
<roaksoax> 2.x series
<kiko> roaksoax, what were the major issues?
<nirlevy> thanks roaksoax,
<roaksoax> kiko: so if there's a use case to keep it, we can keep it
<nirlevy> I am stuck on a much more preliminary issue: adding nodes to MAAS
<kiko> roaksoax, we should think about it more carefully, yeah
<nirlevy> PXE works but I cannot find the proper power type
<kiko> nirlevy, I think PCdude has an installation like that, currently with some problems
<roaksoax> i'll need to dig through the buglist to find the issues it has
<kiko> nirlevy, there's a VMware power type?
<roaksoax> kiko: nothing blocking though, it is still usable
<kiko> roaksoax, thanks
<kiko> roaksoax, I remember there was an auth issue IIRC
<kiko> but that's all I remember
<nirlevy> can someone please explain the methodology behind virtual node booting?
<nirlevy> I am confused: on one hand I have virtual nodes on my ESXi host
<nirlevy> on the other hand the MAAS node should include libvirt to manage them, is that correct?
<kiko> nirlevy, no, MAAS uses libvirt for KVM nodes
<kiko> nirlevy, I'm curious -- does the version of MAAS you are using not have a VMware power type? what version are you using?
<kiko> nirlevy, virtual nodes boot into an ephemeral environment via PXE
<kiko> nirlevy, what we need the power controller for is simply to turn the nodes on and off
<nirlevy> I am using ubuntu 14.04 and apt-get installation which is version 1.9
<nirlevy> for maas
<kiko> roaksoax, the 1.9 SRU went through then?
<kiko> nirlevy, and is there not a VMware power type?
<kiko> in the dropdown?
<nirlevy> it does have VMWare in the drop down menu
<nirlevy> after PXE boot the nodes' status is "new"
<nirlevy> when I commission them, the operation fails
<kiko> nirlevy, correct, you go to the node then, choose that power type and set the VMware details correctly
<kiko> the VMware setup is rather manual
<nirlevy> let me try it now.
<nirlevy> after PXE boot the nodes are sitting at a login prompt; I do not know the user and password (is there any default?). those are the same details I need for the VMWare option
<kiko> nirlevy, the user and password are your VSphere credentials
<kiko> nirlevy, the nodes are New because they have just enlisted; once you have the power type set up you can kick off a commissioning process
<nirlevy> Thanks, Trying
<kiko> nirlevy, if you get failures check your VSphere access logs
<mup> Bug #1618543 opened: freeipmi lacks IPv6 support <maas-ipv6> <needs-upstream-report> <MAAS:Confirmed> <freeipmi (Ubuntu):Triaged> <https://launchpad.net/bugs/1618543>
<kiko> poor freeipmi
<nirlevy> failed to log in even in the vsphere console..
<nirlevy> same credentials as my vsphere.
<nirlevy> to be exact same credentials as my vsphere client on host running ESXI
<nirlevy> shall I try my MAAS web credentials?
<nirlevy> Did not work either.
<mup> Bug #1621507 opened: ipconfig lacks ipv6 support <maas-ipv6> <MAAS:Confirmed> <initramfs-tools (Ubuntu):New> <klibc (Ubuntu):New> <klibc (Debian):Unknown> <https://launchpad.net/bugs/1621507>
<nirlevy> kiko, any ideas?
<nirlevy> maas maas nodes accept-all - does not work either, the command seems to be wrongly parsed
<kiko> nirlevy, MAAS web credentials?
<kiko> nirlevy, maybe you're confused
<kiko> nirlevy, the power parameters are what MAAS uses to contact vSphere to tell it to turn a VM on
<kiko> nirlevy, so they are really the credentials on the VMWare side
<kiko> mpontillo, do you have any advice for nirlevy?
<vfontanella> Hi
<vfontanella> I usually see on the internet that to install MAAS we need at least 5 computers
<vfontanella> but the new docs say the minimum is 2
<vfontanella> is that true?
<mpontillo> kiko: nirlevy: In a call now, will read the scrollback shortly
<kiko> mpontillo, I'll copy you on an email reply to him in fact, so just look out for that. it has a bit more detail
<kiko> vfontanella, actually, to install OpenStack you have a minimum requirement
<kiko> vfontanella, MAAS itself can run on 1 machine
<kiko> vfontanella, but of course, you use MAAS to control other machines, so.. it's only really useful if you have a bunch to manage
<vfontanella> I see
<vfontanella> that means if I have a couple xenservers and esxi, I can install in one server and manage them
<mpontillo> vfontanella: depends on what you want to do; I run it on a laptop ;-) (but I'm just testing it mainly)
<vfontanella> I mean manage the VMs on it
<mpontillo> kiko: ok thanks, please loop roaksoax in as well; I may not be able to respond quickly since I have a lot to do by the end of next week
<vfontanella> I tried to install it in a vbox but I had problems powering on the VMs
<kiko> mpontillo, I think he's missing python-pyvmomi, actually?
<kiko> mpontillo, I can't really believe we ship MAAS half-broken for VMWare if that's true
<mpontillo> kiko: could be the certificate issue too
<mpontillo> kiko: I'll take a quick look
<kiko> mpontillo, thanks
<kiko> mpontillo, that package needs to be installed on the rack controller, right?
<mpontillo> kiko: yes, though the most recent version in Xenial is missing some code necessary to work around an issue with certificate validation that many people run into
<mpontillo> kiko: so some people have had more success installing the latest version from pip
<mpontillo> kiko: nirlevy: see my comments on this bug https://bugs.launchpad.net/maas/+bug/1593469
<kiko> mpontillo, and the latest 1.9.4 doesn't contain your workaround?
<mpontillo> kiko: ah, correct, I think the workaround is only in 2.0
<mpontillo> kiko: but the version in trusty didn't have this issue ;-)
<kiko> mpontillo, this user is on trusty..
<mpontillo> (what a tangled web this is.)
<roaksoax> kiko: we dont ship maas broken
<roaksoax> kiko: if python-pyvmomi is not there, MAAS should be notifying about the issue
<roaksoax> kiko: there's no python-pyvmomi in trusty
<roaksoax> we always shipped it in a PPA
<mpontillo> nirlevy: have you tried the "Add Hardware > Chassis" option? that usually works best with vmware instances. It's best to name all the nodes you want to enlist in MAAS with a common prefix, and use the prefix filter to ensure MAAS only configures the VMs you want to use.
<PCdude> kiko: any idea if someone is online that could help with my problem?
<kiko> PCdude, let me try and help again. where are you stuck, remind me of the whole story
<PCdude> kiko: thats cool
<PCdude> kiko: so just a quick review. I have 1 controller and 5 nodes. 3 of them have 2 NIC's. 2 of those three have both NIC's connected to the private network. The last one has one connection to the public network and one to the private network
<kiko> PCdude, what hardware is this?
<neith> kiko: hello
<neith> kiko: I'm bootstrapping juju again
<neith> and I keep getting this error: http://pastebin.com/PLganGgV
<kiko> neith, okay, let me know how it goes
<neith> kiko: yesterday worked like a charm
<neith> kiko: today I can't bootstrap juju, same error always
<kiko> neith, hmm, odd. I'm otp so I'll be slow in replying
<PCdude> kiko: I was disconnected for a moment, so dont know if I missed something, but anyway thanks for looking at it
<kiko> PCdude, what hardware is this?
<PCdude> so the nodes and controller are running in esxi, but the host is 16GB of corsair memory, asus H87i-plus motherboard, processor high-end of two generations back. I dont know the exact number but I could look it up
<PCdude> kiko: is that what u mean with the hardware?
<kiko> PCdude, aha, so it's a vmware-based setup. okay
<kiko> PCdude, did you use the autopilot vmdk we provide?
<PCdude> kiko: yes its vmware based.
<PCdude> kiko: no I did not use that, coz its a single machine solution. I wanted to try it all out as a copy of what I wanted to set up with real machines later on
<PCdude> kiko: but it should still work right? I mean, just using VMware ESXI could become problematic with WOL and that kind of stuff, but not for the nodes itself as VM
<kiko> PCdude, it should work, yes, but there are many pitfalls you can encounter (as you see)
<kiko> PCdude, why WoL as opposed to our native VMWare support?
<PCdude> kiko: well there is support in MAAS to connect directly with some API to vmware, but that is only possible with a more advanced version of ESXI. the lowest is free and that is what I use. I can't afford the more expensive ones. The problem is that ESXI does not allow API calls in my version. So I can't use the MAAS option, which would be great if I could use it
<kiko> PCdude, ah, indeed, I see the challenge
<PCdude> kiko: the next problem was that WOL does not even work with ESXI. So I installed a package on my main PC that listens for the WOL magic packet. I respond to that by turning the machine on in the GUI
<PCdude> kiko: not ideal , but for testing not bad I think
<PCdude> kiko: I can live with it, since when I move over to real machines WOL works and openstack can operate on its own
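A listener like the one PCdude describes has to recognise the Wake-on-LAN magic packet, which is six 0xFF bytes followed by the target MAC address repeated sixteen times. The following is a minimal sketch of that format only, not his actual tool; the function names are made up for illustration, and in practice the payload would be read from a UDP socket (an assumption, the log does not say how his package listens).

```python
import re


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN payload for a MAC address."""
    hex_digits = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(hex_digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)
    # Synchronisation stream (6 x 0xFF) followed by 16 copies of the MAC.
    return b"\xff" * 6 + mac_bytes * 16


def target_mac(payload: bytes):
    """Return the MAC a magic packet is aimed at, or None if it is not one."""
    start = payload.find(b"\xff" * 6)
    if start < 0 or len(payload) < start + 102:
        return None
    mac_bytes = payload[start + 6:start + 12]
    # All 16 repetitions must match the first copy of the MAC.
    if payload[start + 6:start + 102] != mac_bytes * 16:
        return None
    return ":".join(f"{b:02x}" for b in mac_bytes)
```

A listener could map the MAC returned by `target_mac` to a VM name and then power that VM on by whatever means the hypervisor allows.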
<neith> kiko: does the main network on which nodes are PXE booting need to have a gateway to the internet?
<neith> kiko: seems like juju wants to directly connect to the net from bootstrapped nodes
<kiko> thanks mmcc
<mmcc> hi maas folks
<kiko> PCdude, mmcc is one of the engineers responsible for openstack-install
<PCdude> kiko: great! :)
<kiko> mmcc, PCdude has a vmware-based deployment of MAAS that is failing on the openstack-install phase
<PCdude> hey mmcc
<mmcc> hi PCdude
<mmcc> so how is it failing? If there's a lot of discussion in the backscroll, I'd be glad to read it if someone could paste it up for me.
<PCdude> http://askubuntu.com/questions/821804/openstack-with-landscape-install-fails
<PCdude> mmcc: most of the info is there
<kiko> mmcc, and neith has a deployment that is failing like this: http://pastebin.com/PLganGgV
<mmcc> ok, looking now. thanks
<kiko> mmcc, neith's issue I can only find reference to here: https://github.com/Ubuntu-Solutions-Engineering/openstack-installer/issues/870
<kiko> mmcc, and it seems like it's pretty reproducible
<neith> mmcc: should a bootstrapped node be able to resolve the MAAS server by hostname?
<neith> and the other way around?
<PCdude> mmcc: I have the systems still in the state of the moment after they crash, so I can look at any tiny file that can be of relevance for the troubleshooting
<mmcc> PCdude: so it looks like you're hitting a timeout built into the juju-deployer tool that openstack-installer uses internally. Are the systems in your MAAS slow to spin up? I don't know how using VMWare in there affects speed
<PCdude> mmcc: good point, tbh I don't know what counts as slow since I only have experience with this machine. It takes about 2-4 minutes to deploy a node from MAAS. the nodes have 1gb and 1 core. the controller has 4gb and 2 cores. an important thing to add is that some days ago I was facing another issue and with some help from kiko we found out that juju-deployer was not installed by the openstack-installer, so I installed it manually. I have no idea if that is important, just saying to be complete
<PCdude> mmcc: I tried with more RAM on the nodes, but it did not make a difference (I tried 2 and 4 GB)
<kiko> mmcc, ah, yes, there was that point: juju-deployer was not installed when they installed openstack-install. is that expected?
<mmcc> kiko: yeah, juju-deployer is a dep of the 'openstack-landscape' package, which gets installed after running openstack-installer if you choose that option
<mmcc> if that wasn't happening, then something is odd with the packaging
<PCdude> mmcc: I chose that option, but it was not installed. by installing it manually, I got past the point it was complaining about in the past
<mmcc> PCdude: so are these VMs on one machine that are registered in MAAS as separate 'machines'?
<mmcc> I'm surprised that it didn't install juju-deployer.
<PCdude> yes, they are. So to be clear there is one host, which has 6 VM's on it. 5 of them are nodes and 1 is the controller. MAAS is installed on the controller and sees the other nodes (the other VM's) as separate machines
<mmcc> neith: do you still have the systems sitting around in the same state? I'm curious if you can interact with juju successfully outside of the installer. for example, "JUJU_HOME=~/.cloud-install/juju juju status"
<mmcc> setting JUJU_HOME is necessary because we use a separate directory to avoid affecting an existing environment
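The command mmcc gives sets JUJU_HOME for a single invocation only. For anyone scripting the same check, here is a minimal Python sketch under the same assumption that the installer's Juju state lives in ~/.cloud-install/juju; the helper names `juju_env` and `juju_status` are illustrative and not part of juju or openstack-installer.

```python
import os
import subprocess


def juju_env(juju_home="~/.cloud-install/juju", base_env=None):
    """Copy the environment and point JUJU_HOME at the installer's directory."""
    env = dict(os.environ if base_env is None else base_env)
    env["JUJU_HOME"] = os.path.expanduser(juju_home)
    return env


def juju_status(juju_home="~/.cloud-install/juju"):
    """Run `juju status` against the installer's environment, not the default one."""
    return subprocess.run(
        ["juju", "status"],
        env=juju_env(juju_home),
        capture_output=True,
        text=True,
        check=False,  # leave it to the caller to inspect returncode and stderr
    )
```

Because the variable is only set in the child process, an existing Juju environment in the user's default directory is left untouched, which is the reason mmcc gives for the separate directory.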
<PCdude> mmcc:  The machine has 16GB of corsair memory and ASUS h87i-plus motherboard and a high end processor (dont know the number exactly  )
<mmcc> ok
<mmcc> PCdude: if you try the same "JUJU_HOME=~/.cloud-install/juju juju status" command, what does it tell you?
<mmcc> it's possible that has errors that juju-deployer doesn't relay back to the user, so it just waits until a timeout
<neith> mmcc: It can't even reach this state now. I'm trying to rerun openstack-install
<PCdude> pastebin?
<neith> mmcc: before that I ran openstack-install -u and it gave: Error: an inet prefix is expected rather than "False".
<PCdude> mmcc:   http://pastebin.com/raw/AZY2ZHiV
<mmcc> neith: you can ignore that error. unfortunately the uninstall script is loud if it's tearing down an incomplete install
<neith> mmcc: Alright, running it again
<mmcc> PCdude: so it is trying to spin up one machine with four LXC containers on it. the containers are stuck in a 'pending' state
<mmcc> the next thing I'd do is to dig around on that machine, via "JUJU_HOME=~/.cloud-install/juju juju ssh 0"
<mmcc> and then once I'm on machine 0, look at the lxc containers on there and see if there are any errors that look obvious
<PCdude> mmcc: well I have one controller and one node is used for MAAS. I have never seen a machine start up to do that. So I can't log in to that extra machine and try those commands
<mmcc> PCdude: your juju status says that 'node1.maas' was brought up as a juju machine
<PCdude> mmcc: ah ok, yes that is correct. I will issue the command on that machine. I thought there should spin up another one than node1
<mmcc> PCdude: no, the landscape install puts its services into LXC containers on a single machine
<PCdude> mmcc: I have never used LXC before, so what command should I use to check that?
<PCdude> mmcc: one note is that version 1.0.8 is installed of lxc, while 2.0.3 is the recommended version on trusty (source:apt-cache)
<kiko> sudo lxc-ls --fancy
<mmcc> PCdude: openstack-installer is glue over MAAS, Juju, and LXC. You can see the bundle that we deploy here: https://jujucharms.com/u/landscape/landscape-dense-maas - I think the way forward for you is to verify that the underlying tech works in your environment before trying again with openstack-installer. e.g. make sure that you can bring up lxc containers inside your VMs, then try deploying a juju bundle that uses lxc containers onto your MAAS.
<PCdude> http://pastebin.com/raw/Lh1uTDQt
<mmcc> many issues with running openstack-installer end up being better answered by the communities around those component technologies
<PCdude> mmcc: that is the output of "sudo lxc-ls --fancy"
<PCdude> mmcc: ah ok, I am really learning stuff here :)
<mmcc> so it sounds like there's an issue bringing up lxc containers with juju on your environment. I'd suggest asking in #juju for expert advice on where to look for that
<mmcc> I've dug in there in the past, but I'd probably have to ask there myself, it's been long enough
<neith> mmcc: kiko : sorry for my previous questions, I can't trust the network guys, they messed up the cables again. Though I understand it's hard to manage nodes with 8 physical interfaces without mistakes
<mmcc> neith: so you're in good shape now? that's good to hear
<neith> mmcc: I'm wondering if we could improve MAAS by checking whether nodes using the same subnets are really able to talk to each other
<PCdude> mmcc: good I will head over to the JUJU channel
<neith> avoiding loops and cross connections
<mmcc> neith, that does sound useful to me, but I'm not a maas dev.
<mmcc> if no one else sees it here, I'd file a feature request on https://bugs.launchpad.net/maas/+filebug
<kiko> neith, mmcc: that's what mpontillo is already working on for release 2.1
<kiko> neith, a bit miffed that that was the problem in the end -- the failure mode is really obscure
<PCdude> kiko: mmcc well, there happens to be a known issue with bringing up instances. somewhere along the line apt-get update and apt-get upgrade fail. Disabling this option would make the installer succeed
<kiko> PCdude, why is the upgrade failing though?
<kiko> PCdude, network flakiness?
<PCdude> kiko: dont know yet, still busy in JUJU channel
<PCdude> kiko: mmcc https://lists.ubuntu.com/archives/juju/2016-September/007845.html
<PCdude> also good to know for other problems in the future with this version I think
<PCdude> this is for Xenial, but maybe for trusty too lets see
<kiko> PCdude, I'm thinking you'll continue to be in pain with your current setup
<kiko> PCdude, maybe you should pivot into getting some hardware, or using KVM instead of VMware?
<PCdude> kiko: Well, I have made a pretty precise plan of what I want to do with openstack. Though, I have some blind spots in understanding some parts. That's why I wanted to try and really reproduce the actual usage situation, since the test phase on ESXI will eventually give me a clue about what hardware I will need.
<PCdude> kiko: I have tried KVM, but I ran into some issues with that too. I can't remember what they were.
<kiko> PCdude, I think it fails for a few reasons, potentially the first that WoL isn't all that reliable
<PCdude> kiko: ah ok, yeah some other people told me that too. I was thinking about maybe buying some serious equipment (second-hand of course) to do the trick.
<kiko> PCdude, my overall guidance would be to prefer hardware if possible, and if not, to use KVM, which will be less magical than VMware
<mpontillo> kiko: I think checking if MAAS nodes can talk to *each other* requires software running on the deployed node; that is actually out of scope. the network discovery for 2.1 gives MAAS better information about which nodes have been seen on subnets directly attached to MAAS servers.
<PCdude> kiko: good point and I agree for sure. yeah I tried KVM probably an hour after I found out about openstack. So maybe that is part of the problem I had with KVM.
<kiko> mpontillo, yes, in 2.1 you get the first step of that, which is can the nodes talk to the networks MAAS knows about
<kiko> mpontillo, I assume it's actually "knows about", not necessarily "manages"?
<mpontillo> kiko: when it comes to interactions with juju, spaces are presumed to be the model of which networks can talk to each other in MAAS. that is something that needs to be defined by the MAAS admin though; MAAS does not yet try to determine that automatically, even in 2.1
<kv> do you guys run maas on ubuntu 16.04 ?
<kiko> mpontillo, that's okay, I'm just saying that if the user has his networks listed in MAAS, and the cluster controllers are attached to them, we can check if the nodes talk to them
<kiko> kv, sure!
<kv> any tips or tricks?  I am having quite a few issues with it
<mpontillo> kiko: yes, that much is certainly true.
<kiko> kv, first, upgrade to 2.0 final. :-)
<kiko> kv, second, tell us all about your issues
<kiko> mpontillo, that is already worlds better than where we are today
<kv> kiko any good docs for 16.04 ?
<kv> kiko, i had lots of issues with the dhcpd
<kiko> kv, http://maas.io/docs covers 2.0
<kiko> kv, what exact version are you running?
<mup> Bug #1621610 opened: [2.0,2.1] MAAS allow's machine to transition to 'commissioning' even if no images are present <MAAS:Triaged> <MAAS 2.0:Triaged> <MAAS trunk:Triaged> <https://launchpad.net/bugs/1621610>
<PCdude> kiko: mmcc it turns out NOT to be JUJU haha
<PCdude> as said before the LXD containers do not get an IP address and this should relate to either MAAS failing to give one or LXD not providing it right
<kiko> PCdude, it rarely is, unfortunately. most problems are in the MAAS domain :-)
<PCdude> kiko: haha ok, stokachu suggested that maybe MAAS ran out of IP addresses
<PCdude> both static and dynamic now have 100 addresses each for DHCP requests (50 before that)
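Whether a pool of that size is actually exhausted can be estimated by comparing the configured range with the addresses currently leased. The sketch below only does the arithmetic; the addresses in the usage note are hypothetical, and where the lease list comes from (for example a parsed leases file) is left open because the log does not say.

```python
import ipaddress


def range_size(first: str, last: str) -> int:
    """Number of addresses in the inclusive range first..last."""
    lo, hi = ipaddress.ip_address(first), ipaddress.ip_address(last)
    if hi < lo:
        raise ValueError("range end is below range start")
    return int(hi) - int(lo) + 1


def free_addresses(first: str, last: str, leased) -> int:
    """Addresses in the range that do not appear in the leased list."""
    lo, hi = ipaddress.ip_address(first), ipaddress.ip_address(last)
    # A set removes duplicate leases; leases outside the range are ignored.
    used = {ipaddress.ip_address(addr) for addr in leased}
    in_range = sum(1 for addr in used if lo <= addr <= hi)
    return range_size(first, last) - in_range
```

For a hypothetical dynamic range 10.0.0.100 to 10.0.0.199, `range_size` gives 100, matching the pool size PCdude mentions; a result of 0 from `free_addresses` would support stokachu's theory.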
<PCdude> kiko: How can I diagnose the MAAS DHCP settings in a CLI window?
<PCdude> kiko: /etc/dhcp/dhcpd.conf is pretty empty
<kiko> PCdude, I think they are in /etc/maas or /var/lib/maas
<PCdude> kiko: ah yes of course that makes sense
<kv> Sep  8 13:51:53 sj36 maas.dhcp: [ERROR] Could not rewrite DHCPv4 server configuration (for network interfaces enp1s0f0): Command `maas-rack atomic-write --filename /var/lib/maas/dhcpd.conf --mode 0644` returned non-zero exit status 1:#012None
<kv> kiko any idea why I keep getting that error?
<PCdude> well good night everyone :)
<roaksoax> kv: what's the permissions under /var/lib/maas/ ?
<kiko> kv, does /var/lib/maas exist? :)
<kiko> PCdude, happy you've made progress. we should do better at showing you when you're running out of IPs..
<kv> kiko drwxr-xr-x  5 maas     maas     4.0K Sep  8 14:19 maas
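The same mode, owner and group that `ls -l` shows for /var/lib/maas can be read programmatically, which helps when the check has to run across several rack controllers. A minimal sketch using only the Python standard library on a Unix host; the function name `describe` is illustrative.

```python
import grp
import os
import pwd
import stat


def describe(path):
    """Return (mode string, owner, group) for a path, like the first columns of ls -l."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid with no passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid with no group entry
    return stat.filemode(st.st_mode), owner, group
```

On kv's machine `describe("/var/lib/maas")` would be expected to report the mode and the maas owner and group shown in the listing above.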
<mup> Bug #1621615 opened: network not configured when ipv6 netbooted into cloud-init <maas-ipv6> <MAAS:New> <cloud-init (Ubuntu):New> <https://launchpad.net/bugs/1621615>
<mup> Bug #1621635 opened: [2.0, 2.1, UX] MAAS doesn't warn the user that MAAS it not managing DHCP when attempting to commission <error-surface> <notifications> <ux> <MAAS:Confirmed> <MAAS 2.0:Confirmed> <MAAS trunk:Confirmed> <https://launchpad.net/bugs/1621635>
<mup> Bug #1621647 opened: MAAS 2.0 SSL verification error when adding UCSM chassis <MAAS:New> <https://launchpad.net/bugs/1621647>
<mup> Bug #1621647 changed: MAAS 2.0 SSL verification error when adding UCSM chassis <MAAS:New> <https://launchpad.net/bugs/1621647>
<kv> oh boy...777 on /var/lib/maas and /var/lib/maas/dhcp still no luck
<kv> everything seems fast...and promising but dhcp ...
<roaksoax> kv: what happens if you run that command manually ?
<roaksoax> maas-rack atomic-write --filename /var/lib/maas/dhcpd.conf --mode 0644
<roaksoax> kv: also, how did you enable DHCP ?
<roaksoax> kv: are you running under a vm? lxc container ? may there be something preventing things from working ?
<roaksoax> like apparmor ?
<kv> i ran that command from root and it did not do anything
<kv> i enabled using the ui
<kv> i also tried from cli
<kv> i think i found the issue
<kv> it was related to sudo on maas account
<roaksoax> kv: interesting, there's no sudoers file  ?
<roaksoax> maas should have installed the sudoers file
<kv> i think it did but my puppet module removed it :)
<roaksoax> ah!
<kv> i added 'maas ALL=(ALL) NOPASSWD: ALL' and it is happy now
<roaksoax> kv: cool
<kv> thanks roaksoax
<kv> dhcp is working...tftp got timeout when booting from pxe...new host was not registered in maas ui.
<kv> any idea roaksoax ?
<roaksoax> kv: logs I can look at ?
<roaksoax> kv: rackd.log or a console log
<kv> pxe-e32: tftp open timeout
<kv> i don't see anything in rackd.log
<kv> weird tftpd isn't installed by default?
<roaksoax> kv: MAAS runs its own TFTP server
<roaksoax> kv: check /var/log/maas/rackd.log
<roaksoax> kv: can the machine reach the tftp address it is told to ?
<kv> bingo
<kv> looking good
#maas 2016-09-09
<mup> Bug #1621690 opened: [2.1] When starting network discovery, logs don't show which rack controller  <MAAS:New> <https://launchpad.net/bugs/1621690>
<neith> mmcc: hey, I had to reboot one node on my openstack cluster, lxd containers are running but openstack is displaying the L3 agent as down
<neith> where should I start?
<kiko> neith, get an overall juju status first?
<neith> kiko how?
<kiko> JUJU_HOME=~/.cloud-install/juju juju status
<kiko> from the node where you ran openstack-install
<neith> kiko: everything looks fine
<neith> I manually restarted the agent
<kiko> neith, and it's okay? then it's a bug
<kiko> neith, is this Liberty on 14.04?
<neith> kiko: yes
<neith> kiko: now I'm gonna fight with networks, actually instances are not reachable from the ext_net
<mup> Bug #1621981 opened: [2.1] Incorrect API output when adding SSH keys via LP <MAAS:Confirmed for newell-jensen> <https://launchpad.net/bugs/1621981>
#maas 2016-09-10
<koaps> hello
<koaps> is anyone around who might be able to help me with a MAAS 1.9 API command?
<DJHenjin> Apparently my region controller has disappeared,... If I have 3 NICs in the box I am running the MAAS on, Do I have to have all 3 of them connected to something, or can I leave one hanging?
<koaps> DJHenjin: I have interfaces not plugged in on my controllers with no prob
<koaps> Does anyone know if the only way to update the nodegroup on a node is during node new ? I can't find a way to update the value after the node is seen by MAAS after PXE
<koaps> I know I can do it in the webui, but I need to use the maas cli
<DJHenjin> koaps, I am sure there must be _a_ way to do it
<DJHenjin> not everyone can count on $10,000 servers with 0.1%  failure rate, so there would have to be mechanisms to replace dead machines, and possibly expand the node groups
<koaps> I dug through all the commands I could think of, none seem to allow updating the nodegroup for a node
<koaps> seems weird to me you can't change that when you can easily do it in the webui
<DJHenjin> or are you referring to a name? every time I have had to deal with something like that (in many other web coding projects) there has always been an obscure get_ID_by_Name, and set_Name_by_ID hiding somewhere
<koaps> I tried passing the equiv of what's used in the new command, nodegroup=CLUSTERID, to the node update, didn't error but also didn't update
<koaps> it just radically changes how we ingest servers if we need to add them new from the maas command
<DJHenjin> one sec, let me take a look at the code, (this is open source right)?
<koaps> ya
<DJHenjin> what version you on?
<koaps> 1.9
<DJHenjin> kk
<koaps> 1.9.3 I think is current
<koaps> http://maas.ubuntu.com/docs1.9/api.html#nodes
<koaps> when creating a new one, there's
<koaps> param nodegroup: The id of the nodegroup this node belongs to.
<DJHenjin> Take a look at line 114, 117 of http://bazaar.launchpad.net/~maas-committers/maas/1.9/view/head:/src/maascli/api.py
<DJHenjin> Looks like there may be a way to pass in store_true=true to get a persist mechanism going
<koaps> not sure how that would help me change the value?
<koaps> python not my strongest lang :)
<DJHenjin> well, basically speaking. store_true is an argument that by default is initialized to false in all instances of the command parser.
<DJHenjin> I haven't found the associated code yet, but to me it looks like any API commands that need to change stored values only part of the time can short-circuit the code which actually commits the new value by leaving store_true set to false. And if they want to make a meaningful commit (after validation, for example), they just run the same command, pass in store_true as true, and then the magic happens
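If the lines DJHenjin points at are argparse calls (an assumption; the file is not reproduced in this log), then `store_true` is not a persistence switch that a caller can pass in. In Python's argparse it is the action for a boolean command-line flag: the option defaults to False and becomes True only when the flag is given. The flag name `--debug` and the positional `data` argument below are hypothetical and only serve to show the behaviour.

```python
import argparse

parser = argparse.ArgumentParser(prog="example")
# action="store_true": the attribute is False unless the flag is present.
parser.add_argument("--debug", action="store_true", help="hypothetical flag")
# key=value pairs such as nodegroup=<id> arrive as ordinary positional strings.
parser.add_argument("data", nargs="*")

without_flag = parser.parse_args(["nodegroup=abc"])
with_flag = parser.parse_args(["--debug", "nodegroup=abc"])

print(without_flag.debug, without_flag.data)
print(with_flag.debug, with_flag.data)
```

A string like `store_true=true` on the command line would therefore just be one more key=value item in `data`, not something the parser treats specially.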
<koaps> think I follow ya
<DJHenjin> Looking at how the nodegroup is implemented, we use UUID to differentiate between the nodegroups programmatically, which would allow us to change the name relatively easily.
<DJHenjin> well, really it would allow changing anything at all about the group quite easily, although name might break the group since name is used for dhcp
<DJHenjin> er, dns
<koaps> i don't need to change the nodegroup itself, just add nodes to it
<koaps> I think there's commands for changing the nodegroup itself
<koaps> just nothing that says put this node in that group
<DJHenjin> that's what I am looking for. name was just an example of where it may break things to change something manually
<DJHenjin> BaseNodeManager on line 268 in http://bazaar.launchpad.net/~maas-committers/maas/1.9/view/head:/src/maasserver/models/node.py is likely where it will exist, I need to go have a smoke, but I will keep looking
<koaps> thanks for looking, I appreciate the help
<DJHenjin> I am coming up blank, sorry. I have to go deal with a dead UPS in the DC
<koaps> all good, thanks for checking
<koaps> I feel for you, I'm glad I don't have to make weekend DC trips any more :)
<DJHenjin> That's what I am hoping maas will help eliminate
<DJHenjin> although it helps when the "DC" is <10 feet to your right XD
<koaps> hah
#maas 2016-09-11
<roaksoax> koaps: that seems like a bug / error in not exposing that on the API
<roaksoax> koaps: but otherwise, this would be the command: maas admin node update node-e766df7c-7719-11e6-b22e-00163ea81775 nodegroup=<nodegroup>
<roaksoax> koaps: but otherwise, this would be the command: maas admin node update <node-system-id> nodegroup=<nodegroup>
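For ingesting many servers, the command roaksoax gives can be assembled per node from a script. This is a minimal sketch that only builds the argument list; the profile name `admin` comes from his example, the helper name is made up, and whether the server actually applies the `nodegroup` value is not confirmed in this log (koaps reported earlier that a similar update ran without error but changed nothing).

```python
import shlex


def node_update_command(profile, system_id, **fields):
    """Build the argv for `maas <profile> node update <system-id> key=value ...`."""
    argv = ["maas", profile, "node", "update", system_id]
    # Sorted so the generated command is stable from run to run.
    argv += [f"{key}={value}" for key, value in sorted(fields.items())]
    return argv


def as_shell(argv):
    """Render an argv list as a copy-pasteable shell line."""
    return shlex.join(argv)
```

The resulting list can be handed to `subprocess.run` unchanged, which avoids shell quoting problems with node or nodegroup identifiers.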
<kkkkk> just wonder if maas can manage iscsi ?
#maas 2017-09-04
<yosefrow> hi
<yosefrow> for some reason I cannot change the root DNS (maas)
<yosefrow> every time i restart MAAS it resets it to the wlan0 ip which is wrong
<yosefrow> https://askubuntu.com/questions/952596/maas-forces-wrong-root-dns-ip
<noobcloud> Hello!
#maas 2017-09-05
<kklimonda> is installing CentOS 7 on baremetal supported? If so, how can I debug error: file not found and alloc magic is broken when booting centos after installation. (it's UEFI boot)
<kklimonda> hmm, not after I've reinstalled centos, maas is pushing grub config that references /efi/ubuntu/grubx64.efi which doesn't exist on centos
<kristian__> Hey, does somebody use MAAS on OVH dedicated servers?
<xygnal> roaksoax:  our strange problem resolved itself maybe an hour after I gave up.
<xygnal> back on Thursday
<xygnal> I'll keep an eye on it
<xygnal> I do have another question though.  Is there a way to blacklist a kernel module during commission?
<xygnal> I have some older Cisco UCSC boxes that have this flexflash controller.  even though i have no cards in the controller, it still shows up as a tiny little device you cannot read or write from.
<xygnal> MAAS doesnt like this and fails the commission
<roaksoax> xygnal: you should be able to send a kernel parameter for the machine blacklisting the module
<xygnal> roaksoax: where do I set that? using a tag?
<roaksoax> xygnal: https://docs.ubuntu.com/maas/2.1/en/installconfig-nodes-kernel-boot-options
<roaksoax> xygnal: or rather: https://docs.ubuntu.com/maas/2.2/en/nodes-kernel-options
<roaksoax> same thing
<xygnal> roaksoax: thanks I found that too.  It looks as if the device that it is actually finding is not the flexflash controller but an iSCSI device
<xygnal> systemeeee
<xygnal> oops
<xygnal> roaksoax: sent you a paste
<xygnal> no.no
<xygnal> oops. damn this 3rd device in my KVM is throwing me off.
<mup> Bug #1715230 opened: Partition and VG API endpoints diverge <MAAS:New> <https://launchpad.net/bugs/1715230>
#maas 2017-09-06
<mup> Bug #1698891 changed: [web UI] non-admin can appear to create logical volume <docteam> <MAAS:Expired> <https://launchpad.net/bugs/1698891>
<mup> Bug #1702919 changed: displayed lease IP information not updated when entering rescue mode <dhcp> <MAAS:Expired> <https://launchpad.net/bugs/1702919>
<mup> Bug #1715337 opened: [2.3a2] Missing DNS in rescue mode <MAAS:New> <https://launchpad.net/bugs/1715337>
<mup> Bug #1715338 opened: Dumpdata failing for table metadataserver.nodeuserdata <MAAS:New> <https://launchpad.net/bugs/1715338>
<mup> Bug #1715345 opened: [2.3 alpha 2, Machine details] When I click the edit button, the Save and Cancel buttons remain hidden <ui> <MAAS:New> <https://launchpad.net/bugs/1715345>
<mup> Bug #1715353 opened: [2.3 alpha 3, Subnets/VLAN details] VLAN details do not have the Edit button and can still be edited with auto-save <ui> <MAAS:New> <https://launchpad.net/bugs/1715353>
<c06> hi all, I am trying to test MAAS in my VirtualBox VMs
<c06> I have two VMs: one VM with a public and a host-only adapter (192.168.56.101, gw 192.168.56.1), where I installed MAAS
<c06> I configured the second VM with a host-only adapter (192.168.56.102), but MAAS is unable to find the second machine. Any suggestions?
<c06> also, on my second machine I enabled network (PXE) boot
<c06> anyone on?
<c06> my node is unable to boot from the TFTP server
<c06> it got an IP, but after that we are getting "No bootable medium found."
<roaksoax> cnf: did you enable DHCP and/or import images?
<roaksoax> err
<roaksoax> sorry
<cnf> Hmm?
<cnf> I'm in the US on holiday atm
<cnf> roaksoax:  so hi from carson city \o :P
<roaksoax> cnf: sorry my message was for someone else
<ybaumy> roaksoax: I know you are not in charge of fixing that damn resolv.conf problem, but do you have an estimate of how long it will be until we get a solution? I would really need MAAS atm. I tried Foreman but it's not compatible with juju
<ybaumy> I also tried 14.04 for commissioning
<ybaumy> but it's the same problem
<ybaumy> and in the end juju doesn't work either
<xygnal> roaksoax: get a chance to see my paste?
<xygnal> roaksoax: not sure if this is an existing bug. cannot find one.
<jamesbenson> can someone describe badblocks-destructive further?  I've read the info in the release notes... just not sure what exactly it does.... i.e. are the hard drives erased?  just bad sectors?
<xygnal> i assume destructive mode does read AND write testing
<xygnal> write testing being destructive
<jamesbenson> yeah, I was guessing that any bad sectors they find, they mark bad so any data there is unrecoverable...
<jamesbenson> good to get validation.  thank you xygnal
<jamesbenson> do any of the tests work when the machines have a RAID controller card?
<jamesbenson> the few tests I've run all seem to fail (I have a PERC 6i RAID card)
<xygnal> not sure James. dev team seems busy today. still waiting on a reply myself before submitting a bug.
<mup> Bug #1715501 opened: MAAS can't connect to RSD 2.1 pod <MAAS:New> <https://launchpad.net/bugs/1715501>
#maas 2017-09-07
<c06> I want some suggestions on MAAS; my testbed machines are running in VirtualBox.
<c06> my 2nd machine is unable to PXE boot from the MAAS server
<Guest18676> hi, anyone there?
<Guest18676> facing a PXE boot problem in MAAS, any suggestions?
<Guest18676> running VMs in VirtualBox and also enabled PXE boot in the second VM, but it's not working
<c06> for VirtualBox, what power type do we need to use?
<c06> ** VirtualBox VMs
<mup> Bug #1715634 opened: 'tags machines' takes 30+ seconds to respond with list of 9 nodes <cdo-qa> <MAAS:New> <https://launchpad.net/bugs/1715634>
<jamesbenson> Does MAAS have an issue with machines with over 2TB of storage?  It seems to have commissioned my systems under that limit, but it can't commission the ones over it.  I had a fix in 2.1.5 where I manually partitioned the disk, but it doesn't seem to work with 2.2.2
<mup> Bug #1715687 opened: [2.3 alpha 3, Hardware test section] Avoid wrapping text to two lines in tables. e.g. if the hardware testign result is Timed out it wraps to a second line <ui> <MAAS:New> <https://launchpad.net/bugs/1715687>
#maas 2017-09-08
<aleister> hello party people. I'm attempting a MAAS setup but finding that any time a VM boots, it PXE boots and never seems to write to the disk so that it can boot from it. The disk is listed higher in the boot order. The node shows as New but re-installs when I try to commission
<mup> Bug #1715876 opened: [2.3 alpha 3, Notifications] We need to introduce a type of notification that is local to a page and persists until dismissed by the user <ui> <MAAS:New> <https://launchpad.net/bugs/1715876>
<gimmic> How do you log in to rescue mode from MAAS?
<gimmic> Node is prompting me for a password rather than using the PSK
<gimmic> wait.. never mind. Looks like it pulled a different DHCP lease in rescue
#maas 2017-09-10
<mup> Bug #1703462 changed: rack controller rejected by region <cdo-qa> <cdo-qa-blocker> <MAAS:Expired> <https://launchpad.net/bugs/1703462>
<ybaumy> moin
<Guest66126> Hi all, I wanted to use a custom kernel for MAAS deployment, for which I created a local repo of MAAS images using SimpleStreams. But I could not find any documentation on how a custom kernel can be imported. Has anyone tried this functionality who could provide some input?
#maas 2019-09-02
<mup> Bug #1834602 changed: unable to deploy 19.04 to ppc64el with unique disk config <curtin:Expired> <MAAS:Expired> <https://launchpad.net/bugs/1834602>
<mup> Bug #1834602 opened: unable to deploy 19.04 to ppc64el with unique disk config <curtin:Expired> <MAAS:Expired> <https://launchpad.net/bugs/1834602>
<mup> Bug #1842287 opened: Node deploy options without writing to secondary disks or storage <sts> <MAAS:New> <https://launchpad.net/bugs/1842287>
<mup> Bug #1842299 opened: Not all RPC calls get prometheus metrics tracking <MAAS:Fix Committed by ack> <MAAS 2.6:Fix Committed by ack> <https://launchpad.net/bugs/1842299>
<mup> Bug #1842300 opened: Add metrics for websocket calls queries count and latency <MAAS:Fix Committed by ack> <MAAS 2.6:Fix Committed by ack> <https://launchpad.net/bugs/1842300>
<danboid> I'm trying to report a bug for MAAS but the instructions for fetching the curtin logs aren't working for me
<danboid> blake_r, The instructions you linked me to don't seem to be quite right
<danboid> blake_r, I've sent you some logs anyway, attached to the Launchpad bug I opened on Friday
#maas 2019-09-03
<mup> Bug #1842454 opened: virsh address does not read hostname when adding machine <MAAS:New> <https://launchpad.net/bugs/1842454>
<mup> Bug #1842486 opened: [2.6] Redfish controller iDRAC returns 406 error <MAAS:New> <https://launchpad.net/bugs/1842486>
<pepperhead> o/
#maas 2019-09-04
<mup> Bug #1842524 opened: [2.6, UI] Composing machine via UI does not appear in pod machine list <MAAS:New> <https://launchpad.net/bugs/1842524>
#maas 2019-09-05
<mup> Bug #1842872 opened: Error during adding and commission identical UUID <MAAS:New> <https://launchpad.net/bugs/1842872>
<mup> Bug #1842896 opened: [VM Provisioning] If constraints: zones=<ZONE> causes Juju to provision new VMs on same node of correct zone in disrespect of overcommit restrictions <juju:New> <MAAS:New> <https://launchpad.net/bugs/1842896>
#maas 2019-09-06
<mup> Bug #1842642 opened: Juju fails to deploy, as nodes fails too start. Due to Unable to allocate static IP due to address exhaustion.  <juju:Incomplete> <MAAS:New> <https://launchpad.net/bugs/1842642>
<mup> Bug #1843024 opened: Data required to group network cards and show if SR-IOV is available <MAAS:New> <https://launchpad.net/bugs/1843024>
<mup> Bug #1843042 opened: Bridge type data not available to the UI <MAAS:New> <https://launchpad.net/bugs/1843042>
<mup> Bug #1843052 opened: CleanSave.__getattribute__ causes significant performance reduction <MAAS:In Progress by bjornt> <MAAS 2.6:In Progress by bjornt> <https://launchpad.net/bugs/1843052>
<mup> Bug #1729555 opened: [2.3b3, Filtering] I want to be able to filter out the VMS of a Pod from the machine listing <pod> <ui> <MAAS:Fix Released by ack> <maas (Ubuntu):New> <https://launchpad.net/bugs/1729555>
<mup> Bug #1729555 changed: [2.3b3, Filtering] I want to be able to filter out the VMS of a Pod from the machine listing <pod> <ui> <MAAS:Fix Released by ack> <maas (Ubuntu):New> <maas (Ubuntu Xenial):New> <https://launchpad.net/bugs/1729555>
<pepperhead> 0/
<pepperhead> Trying to get juju on MAAS running. I type "juju status" and get "ERROR no API addresses"
<pepperhead> Is it normal for nothing to show on the screen when pasting the API key into the credentials dialog when it asks for oauth1?
<pepperhead> is there any way to get juju or MAAS to report what API key is stored for the MAAS cloud?
<pepperhead> has anyone seen an issue with MAAS where it gets an order from juju, say to install the controller, deploys the OS, catches the PXE request on reboot, hands off to local boot to complete, and then the node sits at the "boot:" prompt?
<pepperhead> reproduced on two nodes
#maas 2019-09-07
<atdprhs> hello everyone, can anyone help me get a MAAS KVM pod to use a bridge network? I'm willing to pay a reward to anyone who can help me get this fixed
<atdprhs> I am experiencing the same issue as this https://bugs.launchpad.net/maas/+bug/1596683 , this is my network config for both `interfaces` and `virsh net-edit default` https://paste.ubuntu.com/p/wYCDBD4g2J/
#maas 2020-08-31
<mup> Bug #1893664 opened: Partition cannot be saved; not enough free space on the block device. <MAAS:New> <https://launchpad.net/bugs/1893664>
<mup> Bug #1893668 opened: Can't opt out of tracking software <MAAS:New> <https://launchpad.net/bugs/1893668>
<mup> Bug #1893670 opened: display bios_boot_mode in the web UI <MAAS:New> <https://launchpad.net/bugs/1893670>
<mup> Bug #1893690 opened: MAAS is unable to handle duplicate UUIDs <MAAS:In Progress by ltrager> <MAAS 2.8:New> <https://launchpad.net/bugs/1893690>
