#maas 2013-01-22
<geordish> Morning
<geordish> should squid-deb-proxy automatically be configured with the prefix from which your machines will be coming from?
<roaksoax> rvba: howdy!!
<rvba> roaksoax: hi Andres
<roaksoax> rvba: So I was wondering. I need to install 2 different commissioning images
<roaksoax> rvba: or the same image in two different
<roaksoax> rvba: directories, but it doesn't let me
<roaksoax> rvba: so i want to install images in commissioning/ and xinstall/ dir
<roaksoax> it doesn't let me due to saying something like "these ephemeral images have already been installed"
<roaksoax> or similar
<rvba> What command are you running exactly?
<roaksoax> rvba: let's say I do: maas-provision install-pxe-image "--arch=$arch" "--subarch=$subarch" "--release=$release" --purpose="commissioning" --image="$tmpdir"
<roaksoax> rvba: let's say I do: maas-provision install-pxe-image "--arch=$arch" "--subarch=$subarch" "--release=$release" --purpose="xinstall" --image="$tmpdir"
<roaksoax> (see the different purpose)
<roaksoax> $tmpdir is the same image dir. It fails to install into xinstall because it already has installed the images into commissioning
<roaksoax> rvba: which is right cause it doesn't make sense to copy the same images to a different tftp directory
<roaksoax> rvba: however, we are adding a different purpose (xinstall) for installation using an ephemeral image (fast path installer)
<roaksoax> rvba: so, i was wondering if it would make sense to add a parameter like --symlink="place-to-symlink-to"
<roaksoax> rvba: so that: maas-provision install-pxe-image "--arch=$arch" "--subarch=$subarch" "--release=$release" --purpose="commissioning" --image="$tmpdir" --symlink="xinstall"
<rvba> roaksoax: Where is the code for maas-provision again?
<roaksoax> rvba: that's src/provisioningserver (maas-provision is just a wrapper that uses src/provisioningserver)
<rvba> roaksoax: ah right, let me have a look at that code.
<bigjools> GOOD MORNING
<roaksoax> rvba: https://code.launchpad.net/~andreserl/maas/maas_fast_path_installer --> this is what I'm adding so far
<roaksoax> bigjools: morning!
<roaksoax> bigjools: us time this week?
<rvba> roaksoax: the path used to store the image already seems to include the "purpose" ('/'.join([arch, subarch, release, purpose]))
<bigjools> roaksoax:  yeah I am in Austin
<roaksoax> rvba: yes. So if I try to install an image dir to purpose 'commissioning' and then the same image dir to purpose 'xinstall', it will fail due to have 2 identical dirs (which is expected). However, I need to access the same commissioning images using a different purpose
<roaksoax> rvba: so I was wondering if it makes sense to add the --symlink option that basically does "if I install image with purpose commissioning, symlink that purpose dir into XYZ"
<roaksoax> bigjools: cool!
<rvba> roaksoax: maybe I'm missing something but what I see in the code is that the dirs should not be identical when you install the same image to another "purpose".
<roaksoax> rvba: Let's look at it another way. I need to use the 'commissioning' images for a different purpose ('xinstall').
<bigjools> roaksoax: too cool, it's bloody freezing :)
<roaksoax> so I need to symlink /var/lib/maas/tftp/<arch>/generic/<release>/xinstall to use /var/lib/maas/tftp/<arch>/generic/<release>/commissioning
<roaksoax> rvba: ^^
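A minimal sketch of what such a --symlink step might do, assuming the tftp layout quoted above (<arch>/<subarch>/<release>/<purpose>). The function name link_purpose and its parameters are hypothetical; nothing here is part of maas-provision itself.

```python
import os
from pathlib import Path


def link_purpose(release_dir, existing="commissioning", alias="xinstall"):
    """Make <release_dir>/<alias> a symlink to <release_dir>/<existing>.

    Hypothetical helper: it only illustrates the idea of reusing one
    purpose directory under a second name.
    """
    release_dir = Path(release_dir)
    source = release_dir / existing
    link = release_dir / alias
    if not source.is_dir():
        raise FileNotFoundError("no installed images at %s" % source)
    if link.is_symlink() or link.exists():
        raise FileExistsError("%s already exists" % link)
    # A relative target keeps the link valid if the tftp root is moved.
    os.symlink(existing, str(link), target_is_directory=True)
    return link
```

Called as link_purpose("/var/lib/maas/tftp/<arch>/generic/<release>") with the placeholders filled in, this would leave xinstall pointing at the commissioning directory next to it.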
<rvba> Ah I see.
<roaksoax> rvba: for that I'd like to add --symlink="purpose", which will do that (when using maas-provision install-pxe-image)
<rvba> Right now you only have the option so re-install it right.
<rvba> ?
<roaksoax> bigjools: heh  I don't think is that bad is it?
<bigjools> roaksoax: it depends on your relative outlook :)
<rvba> roaksoax: the part I don't get is why you get the error you're getting when you're simply trying to install the same image for a different purpose.  This should work all right.
<rvba> roaksoax: having a symlink instead of duplicating the file is nice.  But it should be just an optimisation over installing the same image twice.
<roaksoax> rvba: yes that's what I want to do instead of installing the same image twice
<roaksoax> bigjools: heh.. I live in Florida so I feel you :)
<bigjools> roaksoax: EXACTLY!
<roaksoax> bigjools: it's 16C here and its cold for me
<roaksoax> rvba: the error when trying to install $imagedir into 'xinstall' purpose *after* installing them into 'commissioning' purpose is: http://pastebin.ubuntu.com/1559858/
<roaksoax> rvba: but anyways I guess I'll simply add the --symlink option
<roaksoax> since that does make more sense
<rvba> roaksoax: Sounds like an useful optimisation.
<roaksoax> rvba: cool thanks
<rvba> roaksoax: but you should not be getting that error.
<bigjools> roaksoax: you in Miami?
<roaksoax> bigjools: yes I am :)
<bigjools> warm :)
<roaksoax> indeed :)
<roaksoax> i love the weather
<rvba> roaksoax: I'm looking into your error now…
<roaksoax> rvba: yeah but i guess it is comparing all the already installed dirs to check if there's one installed already
<roaksoax> rvba: cool thanks
<rvba> roaksoax: it's not that clever :).  I suspect it's failing because of something else.
<roaksoax> :)
<rvba> roaksoax: when you install an image, the source gets cleaned up afterwards.  That's why the second run fails: it does not find the image to import.
<rvba> roaksoax: if you duplicate the directory containing the images first and run the second import on the copy, it will work.
<rvba> roaksoax: note that the symlink thing won't solve that pb :).
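The workaround rvba describes (copy the image directory first, then import each copy) could be sketched like this. The helper names stage_image_copies and install_command are hypothetical; the command-line flags are the ones quoted earlier in the discussion, and the command is only built here, not executed.

```python
import shutil
import tempfile
from pathlib import Path


def stage_image_copies(image_dir, purposes):
    """Give every purpose its own private copy of image_dir.

    An import that cleans up its source directory afterwards then
    cannot remove the files needed by the next import.
    """
    staging_root = Path(tempfile.mkdtemp(prefix="pxe-images-"))
    copies = {}
    for purpose in purposes:
        target = staging_root / purpose
        shutil.copytree(str(image_dir), str(target))
        copies[purpose] = target
    return copies


def install_command(arch, subarch, release, purpose, image_dir):
    """Build the install-pxe-image argument list for one purpose."""
    return [
        "maas-provision", "install-pxe-image",
        "--arch=%s" % arch,
        "--subarch=%s" % subarch,
        "--release=%s" % release,
        "--purpose=%s" % purpose,
        "--image=%s" % image_dir,
    ]
```

Each entry of stage_image_copies(tmpdir, ["commissioning", "xinstall"]) can then be handed to its own install_command(...) invocation.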
<roaksoax> rvba: ah I see that now
<roaksoax> clear
<roaksoax> ls
<roaksoax> err
<rvba> heh :)
<roaksoax> rvba: good catch though!
 * roaksoax needs some coffee apparently
<roaksoax> rvba: I have a question... say i have 2 node controllers and 1 region. In order for me to deploy 2 different networks (one on each cluster), do I still need to be accessible to the region controller?
<rvba> roaksoax: not sure I understand the question… but every cluster needs to be able to contact the region if that's what you're asking.
<roaksoax> rvba: so the clusters/region communicate through 192.168.0.1/28, while I want each cluster to deploy on nodes using 192.168.1.0/24
<roaksoax> and 192.168.2.0/24
<rvba> roaksoax: that's fine, so the clusters will have 2 interfaces.
<roaksoax> rvba: so they don't have to be able to contact the region then
<rvba> Yes they do
<roaksoax> rvba: so all the deployed nodes have to have 2 interfaces then
<rvba> roaksoax: Either that or you need to add the appropriate routes so that they can contact the region through the cluster.
<roaksoax> right
<roaksoax> bigjools: there's no merge bot for packaging.precise.sru ?
<bigjools> roaksoax: no, because it's not a recognised release branch
<bigjools> it's an integration branch
<bigjools> so you need to merge manually
<bigjools> and push up
<roaksoax> ok cool
<roaksoax> thanks :)
<bigjools> I turned on appendrevisions for it so it should be safe
<roaksoax> cool!
<roaksoax> rvba: btw.. it is now the cluster that runs the tftp server and stores the images ?
<rvba> roaksoax: yes
<roaksoax> ok cool
<roaksoax> bigjools: btw... i was wondering whether you guys were planning to SRU the kernel_opts tag stuff to precise branch
<roaksoax> t/win 9
<roaksoax> matsubara: ping?
<matsubara> roaksoax, hi
<roaksoax> matsubara: howdy!! quick question. Have you played with tags much?
<matsubara> roaksoax, yes, a bit. How can I help?
<roaksoax> matsubara: any ideas? ubuntu@maas:/usr/share$ maas-cli admin tag update-nodes "xinstall" add=node01
<roaksoax> { "removed": 0, "added": 0
<roaksoax> }
<matsubara> roaksoax, did you create the tag xinstall first?
<roaksoax> yes
<matsubara> also the doc says: The nodegroup parameter, which restricts the operations to a particular nodegroup, is optional, but only the superuser can execute this command without it.
<matsubara> maybe you need the nodegroup parameter as well since you seem to be running with the ubuntu user
<roaksoax> matsubara: yeah: http://pastebin.ubuntu.com/1560577/
<matsubara> or node01 is not matching any node in your cluster. are you sure that's the system_id for the node?
<roaksoax> matsubara: ah!1 that's why then
<roaksoax> thanks for the info :)
<matsubara> no problem :-)
<bigjools> roaksoax: no plans to backport any tags
<roaksoax> bigjools: ok
<roaksoax> bigjools: why do I need maas-dns installed in order for the UI to not fail: http://paste.ubuntu.com/1560879/
<roaksoax> j/win 14
<bigjools> roaksoax: well that sucks.  Ask rvb about it, he wrote the start up wsgi stuff.  I am a little surprised that this happens but it might be an untested setup :(
<roaksoax> bigjools: yeah anyways, I already filed a bug. I'll check with him tomorrow
#maas 2013-01-23
<roaksoax> rvba: howdy: https://bugs.launchpad.net/ubuntu/+source/maas/+bug/1103195
<ubot5> Launchpad bug 1103195 in maas (Ubuntu) "MAAS WebUI crashes when installing maas-region-controller only" [Medium,Confirmed]
<roaksoax> rvba: the bug above is from latest lp:maas/1.2 (which is on raring). Happened after a clean install
<roaksoax> rvba: so a better quick fix would be to install the tool with maas-region-controller then?
<rvba> roaksoax: not sure it's better, but it's quicker :)
<roaksoax> indeed :)
<roaksoax> rvba: btw.. wanted to check with you if this makes sense to you: http://paste.ubuntu.com/1563502/
<roaksoax> this is adding an 'xinstall' purpose that installs Ubuntu using a fast path installer (using the ephemeral image)
<roaksoax> rvba: and.. in a preseed, what variable refers to the cluster controller from which a node is booting from?
<rvba> roaksoax: having a look now…
<rvba> roaksoax: looks good but: I wonder why you need the symlink stuff since importing the same image twice is simpler right now / not sure if using the tags to decide install/xinstall is the way to go in the long run (but looks ok for testing at least)
<rvba> roaksoax: I don't think that variable is present yet.
<roaksoax> rvba: the tags stuff is being tested already :)
<roaksoax> rvba: the symlink stuff might eventually change since the image will be obtained from oxygen and XYZ
<roaksoax> and yes the variable to refer to cluster controller is what is needed
#maas 2013-01-26
<lifeless> does maas still run the installer, or does it dd/rsync on a cloud image? [or did it always do that?]
#maas 2014-01-20
<jtv1> bigjools: ->
<bigjools> jtv: ←
<bigjools> did we just ping with arrows?
<jtv> Sort of.
<jtv> Bough & arrough.
<jtv> Anyway.  I phased my sketch of the changes according to my Baby Steps maxim.
<jtv> Here's what I'd like to do at this point:
<jtv> First, get an idea of what changes we want over the whole stretch.
<jtv> So that includes the multiple-interface-DNS change.
<jtv> Get a grip on the changes it requires to the DNS config generators etc.
<jtv> Then, decide whether (1) it makes sense to keep the phasing I outlined (where we do multiple DHCP interfaces first), and (2) anything can be done in parallel.
<jtv> Make sense?
 * bigjools digesting
<bigjools> jtv: ok yes that's fine
<bigjools> parallel is good
<bigjools> as is baby steps
 * bigjools fixorates trusty failures
<jtv> Right now it looks to me as if it's pretty easy just to remove the restriction on multiple DHCP-managed interfaces, and the get_dns_managed_interface makes it easy to grep for reliance on the remaining (for now) restriction limiting us to one DNS-managed interface.
<jtv> But a bit more exploration first may be wise.
<jtv> To support multiple DNS-managed interfaces, obviously we need to take a closer look at how we manage DNS configs, but also at what options we offer to the user.
<jtv> I'd particularly like to get a handle on the difference between (i) supporting resolution of hostnames to interfaces on the various networks, and (ii) providing name resolution to client interfaces on the various networks.
<jtv> For resolution, can we simply resolve a node's hostname to all of its interfaces?
 * jtv struggles with clarity
<jtv> I mean: nodes can be both clients to our DNS, and hosts resolved in our DNS.
<jtv> Should a node have a single hostname, mapping to all its IP addresses?
<jtv> Currently the DNS code only maps a node's hostname to an IP address leased to its oldest MACAddress object.
<bigjools> right
<bigjools> this falls under some of the other fixes we wanted to do
<bigjools> ie boot from a maas-controlled management network, but the DNS should resolve to a different NIC
 * bigjools goes to deal with ants. 
 * bigjools realises this is an Archer cliché
 * jtv smugly sidesteps the cliché for another one.
<bigjools> they are antliminated
 * jtv cringes at attempted wordplay
<ging> is there a way to get more logging/debugging info with maas? every time i have an issue i find it very difficult to find out where it is going wrong
<bigjools> ging: can I help with something in particular?
<bigjools> all the logs are under /var/log/maas/
<bigjools> there are a few different logs
<ging> at the moment my nodes in kvm are failing tests while commissioning
<bigjools> which tests?
<ging> 00-maas-01-lshw and 00-maas-02-virtuality
<bigjools> and what are you seeing?
<ging> it instantly shuts down the vm so i can't see what is going on
<bigjools> what makes you say it's failing?
<ging> it says failed 2/ and lists those 2
<ging> 2/5
<bigjools> ok
<bigjools> well don't worry too much it doesn't affect normal operations, it'll just be missing some info in the database
<jtv> We don't expose the commissioning results  anywhere in the UI, do we?
<bigjools> no
<bigjools> you have to DB-dive
<bigjools> this is something we should sort out I suppose!
<bigjools> ging: there's a database table called NodeCommissioningResult, the result logs will be in there
<ging> ok thanks i will have a look
<ging> do you think if i disable pxe boot on the vms they'll actually be working then ?
<jtv> No, that'll stop them from working with MAAS at all.
<ging> is it the postgres db that maas is using?
<bigjools> yes
<ging> i can't even get into the database
<bigjools> use "maas shell" and then:
<bigjools> from metadataserver.models import NodeCommissioningResult
<bigjools> from metadataserver.models import NodeCommissionResult
<bigjools> sorry
<bigjools> for n in NodeCommissionResult.objects.all()
<bigjools>   print n
<bigjools> or similar
<bigjools> jtv: want some cheap karma? https://code.launchpad.net/~julian-edwards/maas/fix-celery-callbacks-bug-1270713/+merge/202250
<bigjools> you lost your chance!  ok then gmb how about you?
<jtv> Don't worry, I'll take it.
<bigjools> ha!
<bigjools> it's a one liner
<jtv> I am known for my one-liners.
<bigjools> eye liners more like
<jtv> Tramp!
<jtv> bigjools: will your change cause any problems for the cloud-archive version?
<ging> does maas shell have a man page?
<bigjools> jtv: no idea, don't care. cloud archive should update if so.
<bigjools> ging: http://maas.ubuntu.com/docs is better
<jtv> I think we include the generated man pages there nowadays.
<bigjools> jtv: thanks for reviewiewiewiewering
<jtv> Yeahyeahyeah I know I know
 * bigjools out for a while
<ging> would a lack of connection to the repositories cause the commissioning tests to fail?
<jtv> Yes, it would
<jtv> The node connects to a proxy running on the region controller, and the proxy connects to the repository.
<ging> i think this could be where it is going wrong
<jtv> I think the lldp scripts would also fail though.
<jtv> Because those need to install lldp.
<jtv> And the virtuality script doesn't look as if it needs any networking.
<jtv> In fact I don't see how that script could fail at all...
<jtv> ging: if you can run that maas-shell loop, that should provide error output for the commissioning scripts.
<ging> jtv: i couldn't get it to run
<jtv> Where did it fail?
<ging> it imported NodeCommissionResult but the next bit gave me a syntax error
<ging> and i have no understanding of this syntax
<jtv> Did it say where the error happened?  If it's a traceback, just the last entry should be enough.
<jtv> Where it names a file and a line number.
<rvba> ging: run this: '[n for n in NodeCommissionResult.objects.all()]'
 * jtv searches
<ging> rvba with the braces and the quotes or neither?
<rvba> ging: just what's inside the quotes (i.e. with the braces).
<jtv> That's meant to be run within the maas shell though - that failed to start, right?
<ging> no i'm in the maas shell
<ging> this is what i get with the braces
<rvba> jtv: Julian forgot the ':' at the end of the loop statement.
<ging> [<NodeCommissionResult: NodeCommissionResult object>, <NodeCommissionResult: NodeCommissionResult object>, <NodeCommissionResult: NodeCommissionResult object>, <NodeCommissionResult: NodeCommissionResult object>, <NodeCommissionResult: NodeCommissionResult object>, <NodeCommissionResult: NodeCommissionResult object>, <NodeCommissionResult: NodeCommissionResult object>, <NodeCommissionResult: NodeCommissionResult object>]
<jtv> Ah!
<jtv> Well, that's the objects.  Now for the contents.  :)
<jtv> We can start with the metadata:
<jtv> for n in NodeCommissionResult.objects.all():
<jtv>     print(n.name, n.script_result)
<jtv> That'll tell us which ones succeeded and which ones failed (in case there's any misunderstanding there).
<rvba> bigjools: rarg, looks like the celery upgrade broke quite a few things :/
<ging> do i need to end the print command with something? nothing happens
<jtv> Try pressing enter another time.
<ging> TypeError: 'NodeCommissionResult' object is not callable
<jtv> Er...
 * jtv is not seeing it
<ging> got it now
<ging> didn't have enough of an indent
<ging> oh no i had a typo
<jtv> The most interesting output will be the non-zero results, of course.
<jtv> My system is upgrading; I'll drop out soon, hopefully briefly.
<ging> so far i can see what tests are failing, but i don't know how to get the actual output of the tests, or is that it just pass or fail ?
<ging> ah n.data gives me something
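Putting the pieces from this exchange together, a small sketch of a loop that shows only the failing scripts along with their output. It relies only on the name, script_result and data attributes mentioned above; the helper failed_results and the FakeResult stand-in are hypothetical and exist so the example runs outside the maas shell.

```python
from collections import namedtuple


def failed_results(results):
    """Return (name, exit status, output) for every non-zero result."""
    report = []
    for result in results:
        if result.script_result != 0:
            report.append((result.name, result.script_result, result.data))
    return report


# Stand-in objects with the same three attributes, for illustration only.
FakeResult = namedtuple("FakeResult", ["name", "script_result", "data"])
sample = [
    FakeResult("00-maas-01-lshw.out", 1, "lshw: command failed"),
    FakeResult("00-maas-03-install-lldpd.out", 0, ""),
]
for name, status, output in failed_results(sample):
    print(name, status)
    print(output)
```

Inside the maas shell, the same function could presumably be fed NodeCommissionResult.objects.all() instead of the sample list.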
<bigjools> guys, FYI: http://docs.celeryproject.org/en/latest/whatsnew-3.0.html
<bigjools> rvba,jtv, allenap, gmb ^
<bigjools> rvba: notice the "Celery will automatically use the librabbitmq module if installed" :)
 * jtv goes offline.  Will not be back for a while after all.
<rvba> bigjools: that part I know :).  We managed to work around that.
<bigjools> I know
<bigjools> the task chaining stuff is cool
<rvba> Hopefully the core problem will get fixed at some point.
<bigjools> it's clearly a bug in the rabbit lib
<rvba> Yep
<bigjools> aaaanyway I have to do kids' bedtime
<ging> should my maas server be listening as a proxy on port 8000? because it currently doesn't, it has a proxy on 3128 but apt running on commissioning is trying to connect to it on port 8000
<bigjools> ging: check the settings page to see what is configured. The proxy is squid-deb-proxy but I can't remember what its default port is.
 * bigjools wishes all good night
<ging> changing thing on the setting page seems to make no difference
<ging> restart of squid deb proxy and it now listens on 8000
<ging> squid deb proxy crashes every time i connect to it, i'm not sure why
<ging> but that is definitely my problem
<rvba> allenap: does that problem ring a bell? http://paste.ubuntu.com/6785337/
<rvba> allenap: this is when I run 'make' in the branch where I upgraded testtools to 0.9.34.
<allenap> rvba: It could be that you’ve got python-mimeparse installed locally.
<rvba> Ah, good point.
 * rvba checks
<allenap> rvba: Or that testtools 0.9.33 needs a newer version than is pinned in versions.cfg.
<rvba> allenap: no, not installed.
<rvba> allenap: python-mimeparse is not mentioned in versions.cfg
<allenap> rvba: Ah, it ought to be then.
<rvba> allenap: why?  If it's not mentioned, doesn't just mean any version is okay?
<rvba> doesn't it*
<rvba> allenap: can you remind me what command I should run to get buildout to write versions.cfg?
<allenap> rvba: bin/buildout -v buildout:allow-picked-versions=true
<rvba> allenap: Fu&%ing A!  Looks like it did the trick.  Thanks!
<tomixxx> @jtv, matsubara: are u online? do you remember my MAAS-problem: one maas-server with two network-card-interfaces, one card connects to a private switch with two other nodes and one interface connects to the uni network
<jpds_> tomixxx: Probably not at this time, but you're better off looking at http://shop.canonical.com/product_info.php?products_id=658
<tomixxx> jdps: ty but i guess i will keep the free-of-charge server ;)
#maas 2014-01-21
<ging> i've not been having much luck getting maas to work on 13.10 as the server, would going back to 12.04 give me something more stable or is the latest the best to go for?
<bigjools> latest is best
<ging> and for a commissioning distro is 13.10 also still the best to use?
<ging> or does that depend what actual distro you want to install?
<bigjools> ging: we generally only test with precise for that
<ging> ok i will leave it set on that
<ging> what does provisioning from other distros gain for you? is it more stable / quick to provision from the distro you are installing?
<jtv1> ging: not particularly, no.  But there might be some more recent driver your hardware needs.
<ging> well i don't understand this at all, i have tried several provisioning images and they all fail once the machine is installed and tries to connect to the archive mirror
<ging> does not seem to be a network issue because i can connect to the mirror with wget
<ging> maybe it is a network issue
<jtv> Then it may be a proxy issue.
<jtv> The nodes download their packages via the proxy running on the region controller.
<jtv> You could try if you can wget with the http_proxy variable set to that proxy.
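The same check can be made from Python instead of wget. A minimal sketch, assuming the proxy address is known; the helper names and the example addresses in the usage note are placeholders, not values from this setup.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends plain-http requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_status(url, proxy_url, timeout=10):
    """Fetch url through the proxy and return the HTTP status code.

    Raises urllib.error.URLError if the proxy refuses or drops the
    connection, which is the symptom being investigated here.
    """
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.status
```

For example, fetch_status("http://archive.ubuntu.com/ubuntu/dists/precise/Release", "http://<region-controller>:8000") should return 200 when the proxy is healthy; the port depends on how squid-deb-proxy is configured.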
<ging> ah
<ging> that will be why
<jtv> Well it's only an idea of where to look for a possible problem.
<jtv> Normally, this should just work.
<ging> the proxy on the region controller is broken and i can't get it to work
<jtv> It's squid-deb-proxy, if that helps.  You could see if it's running, and whether it logged anything.
<ging> i had to manually disable it in user_data_config.template to get it as far as installing
<ging> yeah something has gone completely wrong with it, as soon as anything connects to it it crashes
<jtv> !
<ging> i've done apt-get purge on it and delete all it's cached files and reinstalled but it still crashes
<jtv> No log messages?
<ging> http://pastebin.com/Bi8JTGxx
<ging> it is very strange i've been asking in the squid channel see if anyone can make sense of it
<ging> i've given the server 16 gig of ram just incase it was lack of memory
<jtv> Well it's not plain squid, it's squid-deb-proxy.  I'd look there primarily, just because it'll get less use.
<jtv> If all use of squid-deb-proxy uses squid, then squid probably gets a lot more testing etc.
<jtv> Whoa.  That looks like a corrupted file.
<jtv> Or just perhaps a problem with the hardware - let's hope not.
<jtv> Something you could try is to purge squid-deb-proxy and squid, and then see if it's left any state behind in /var/cache, /var/lib etc.
<jtv> Because that log looks to me as if it left a file behind there that got corrupted, perhaps because of a power outage while it was being written or something.
<jtv> It's only logged as a "warning," but coming right before the segfault it's the most obvious candidate.
<ging> i think i might have figured it out
<ging> it started off with 512 meg of ram when it was built, i gave it 16gig but swap is still set at 500meg
<jtv> More memory should never be a problem.  :)
<jtv> Unless you've got it overclocked and the dimms don't match up, of course...
<ging> it's a vm
<jtv> I don't think there's any reason nowadays why memory shouldn't be bigger than available swap.  ISTR that rule went out the window.
<jtv> Could the host be running out of memory maybe?  Or if the guest is 32-bit, maybe it runs out of address space?
<ging> i think it was just the fact it had a tiny swap partition and was trying to use it
<ging> no it still crashes with a bigger swap partition
<ging> i will try purging squid and squid-deb-proxy
<roaksoax> rvba: howdy!! What is the most recent revision you want released?
<rvba> Hi roaksoax, actually, we have a problem with the integrations tests so I'd like to postpone the release (of MAAS) by a day or two if that's all right with you.
<rvba> integration*
<roaksoax> rvba: yeah that's fine
<roaksoax> rvba: maas-test is a go though?
<rvba> roaksoax: yep
<roaksoax> rvba: ok cool!
<jamespage> does the version of 'maas-import-ephemerals' provide feedback on what its doing? on saucy it give me zip....
<jamespage> does the *newest* version...
<jamespage> that should have read
<roaksoax> jamespage: it doesn't!
<roaksoax> rvba: ^^ can we address that ASAP please?
<jamespage> roaksoax, its quite disconcerting
<roaksoax> tych0: ^^
<jamespage> roaksoax, I'm happy to raise a bug
<roaksoax> jamespage: agreed!
<roaksoax> jamespage: please do!
<jamespage> roaksoax, the only way I can tell its actually doing something is because I've got the network indicators on in byobu
<roaksoax> jamespage: yeah! We have requested this to be addressed before, I guess noone has gotten to it, or it has been forgotten
<jamespage> bug raising now
<tych0> there is callback support, i don't think anyone ever decided what the output should look like
<jamespage> some options to switch to daily stream would be nice as well
<jamespage> fwiw I have the same complaint with juju
 * jamespage stops whinging and raises a bug
<roaksoax> tych0: IIRC, we did at the sprint
<roaksoax> smoser: ^^
<smoser> well, that import-ephemerals needs all sorts of work.
<jamespage> bug 1271189 and bug 1271188
<ubot5> bug 1271189 in maas (Ubuntu) "support switching image streams in import ephemerals" [Undecided,New] https://launchpad.net/bugs/1271189
<ubot5> bug 1271188 in maas (Ubuntu) "maas-import-ephermerals provides no feedback on progress" [Undecided,New] https://launchpad.net/bugs/1271188
<smoser> some progress output would be useful thats for sure.
<smoser> theres other things that could be improved there too
<tych0> jamespage: it does support switching to the daily stream
<tych0> you can point it anywhere with --url
<jamespage> tych0, oh - does it
<roaksoax> tych0: is that documented in the manpage?
<smoser> its in help output
<smoser> but its not wonderful anyway
<smoser> we really need to think about how better to address that.
<tych0> roaksoax: probably not :-)
<jamespage> tych0, oh - I see
<roaksoax> heh :)
<jamespage> you need url and products
<jamespage> (a --daily shorthand would be nice)
<smoser> (i only knew it was in '--help' because rharper came asking me about it last week)
 * jamespage has to admit he just hacked his local copy to get working
<tych0> do you have to set products?
<tych0> i thought there may have been defaults
<smoser> right now, rbasak has somewhat convinced me that specifying 'product' id is not so important
<smoser> and that maas should search really on tags only, not on product id.
<rvba> roaksoax: Fixing this script is tricky.  It's completely intertwined with simplestreams which means that progress report will have to use simplestreams' features…
<roaksoax> smoser: ^^
<roaksoax> tych0: ^^
<smoser> which is fine
<tych0> simplestreams should have support for callbacks, though
<tych0> i did it as part of the re-write
<smoser> it does
<tych0> i think the plan was to change it when you guys figured out what output you wanted
<smoser> well, simplestreams has the update callbacks.
<smoser> but maas-import-ephemerals does not use them
<tych0> right
<smoser> simply putting a '.' on the screen every X MB would be useful.
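A rough sketch of the "dot every X MB" idea, written as a plain copy loop. It does not use the simplestreams callback interface mentioned above (whose exact signature is not shown in this discussion); copy_with_progress and its parameters are hypothetical.

```python
import sys


def copy_with_progress(source, dest, step=1024 * 1024,
                       chunk_size=64 * 1024, out=sys.stderr):
    """Copy source to dest, writing one '.' to out per `step` bytes copied.

    source and dest are binary file-like objects. Returns the number
    of bytes copied.
    """
    copied = 0
    next_mark = step
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        dest.write(chunk)
        copied += len(chunk)
        # One chunk may cross several marks if step is small.
        while copied >= next_mark:
            out.write(".")
            out.flush()
            next_mark += step
    return copied
```

With the defaults this prints one dot per MiB, which would at least show that a download is still moving.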
<rvba> tych0: am I wrong or only part of the import process has been migrated to simplestreams?
<tych0> well, simplestreams just does ephemerals
<smoser> import of di kernels does not use simplestreams
<tych0> not the pxe files
<rvba> Right, that's what I had in mind.
<rvba> Which means effective reporting should address both types of imports.
<rvba> But yeah, usability-wise I agree that it's pretty important.
<smoser> ok... so some info here might be useful.
<smoser> there are 2 things that will affect "import" going forward.
<smoser> a.) strikov and rbasak and I talked yesterday about "enablement packages" ... making maas able to serve up kernels and boot media for new arches/"subarches" and supporting better, and having sstream data on some server to describe that.
<smoser>  thats relevant, because the data that strikov was proposing would have 'di-kernel' and 'di-initrd' (debian installer) data also, which is what you're currently getting with import-pxe-data.
<smoser> b.) we may change the format of the ephemeral image download to help address 'a'.
<smoser> ie, instead of it being a .tar.gz, it might end up being a squashfs filesystem or something.
#maas 2014-01-22
<bigjools> jtv2: I fixed the trusty run environment
<bigjools> I'm going to switch the lander to trusty soon
<jtv2> Great!
<jtv> I've been holding back my other machine for this, so that's the green light for that upgrade.
<bradm> bigjools: is there an easy way to use MaaS to boot a rescue image?
<jtv> bradm: not that I know... it's an interesting thought  though.
<bigjools> bradm: I would just replace the kernel/initrd with your custom ones
<bradm> the issue is that I'm trying to rebuild the arrays on the proliants, it'd be awesome to have a way to handle that in MaaS
<bradm> HP Proliant iLOs aren't great with rebuilding arrays, particularly over ssh
<lifeless> iLO ssh specifically I presume? openssh to an OS would be fine, right ?
<bradm> yeah, iLO ssh specifically, getting into the array config is the issue, openssh to an OS would be fine
<bradm> so I could run hpacucli or similar
<bigjools> jtv:
<bigjools> -        content = render_dns_template(template_path, kwargs, context)
<bigjools> +        content = render_dns_template(self.template_file_name, kwargs, context)
<bigjools> what's going on here?
<bigjools> your cover note says "template_path disappear, because render_dns_template already knows how to compute this path" but you replaced path with file name
<bigjools> gah ok the diff was obfuscating things, never mind
<rvba> gmb: time for a tiny review? https://code.launchpad.net/~rvb/maas-test/ssh-with-sudo/+merge/202643
<gmb> rvba: Sure
<rvba> Thanks.
<gmb> r=me
<bigjools> the lander is up again
 * bigjools sleeps
<tomixxx> hi, jtv or matsubara online?
<tomixxx> i search the file in which i can set the url of the maas-server
<tomixxx> has someone an idea? i had already open this file but i cant find it right now
<rvba> allenap: can you tell roaksoax why we have two manpages for maas-cli? (docs/maascli.rst, docs/man/maas-cli.8.rst)… they seem related but I failed to recall how.
<allenap> rvba: otp, but I’ll check when I’m done.
<rvba> Thanks.
<allenap> roaksoax: It looks like the man page for maas-cli was copied from the doc, presumably because we wanted slightly different content between its web presentation and man page. However, they’re out of sync in a few places so it would be better to ditch one I guess, and my vote would be to ditch docs/maascli.rst in favour of the man page.
<allenap> docs/maascli.rst does include some images, which the man page cannot.
<allenap> Well, it /can/; we ought to see if they affect man page rendering.
<allenap> Anyway, I have to go.
<roaksoax> allenap: cool
<Ming> is "Allocated to root" the final state for bootstrap maas?
#maas 2014-01-23
<jtv> I wonder why the postgres fixture passes some of its arguments as one string.
<jtv> When executing pg_ctl, I mean.
<jtv> The fixture calls check_call() to run pg_ctl to start a cluster, but instead of passing a list of strings, one per command-line argument, it combines some of the arguments into one longer string with spaces.
<jtv> Bound to cause trouble.
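The difference jtv describes can be seen without PostgreSQL at all. In this sketch a child Python process stands in for pg_ctl and simply reports how many arguments it received; count_args is a hypothetical helper, and the -D / -l paths are placeholders.

```python
import subprocess
import sys


def count_args(args):
    """Run a child process with args and return how many it received."""
    child = [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)"]
    output = subprocess.check_output(child + list(args))
    return int(output.decode().strip())


# One list element per command-line argument: the child sees five.
separate = count_args(["-D", "/tmp/cluster", "-l", "/tmp/cluster.log", "start"])

# Options glued into one string: the child sees a single odd argument
# containing spaces, plus "start".
combined = count_args(["-D /tmp/cluster -l /tmp/cluster.log", "start"])

print(separate, combined)
```

Because check_call() with a list does no shell word-splitting, the combined form only works if the receiving program happens to re-parse that string itself.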
<roadmr> evilnickveitch: hello, I was told you're the person to talk to for a typo in a maas documentation web page.
<evilnickveitch> roadmr, i am such a person
<evilnickveitch> where is it?
<roadmr> evilnickveitch: thanks! https://juju.ubuntu.com/docs/charms-deploying.html
<roadmr> evilnickveitch: at the end of the "Considerations" section, it reads "scale out the service on it's own node"
<roadmr> it should be "its", no apostrophe
<evilnickveitch> it should indeed. thanks for that, i will get it fixed. Will probably be live by the end of the day
<roadmr> evilnickveitch: thanks! every time a bad "it's" is fixed, a kitten gets fed :)
<evilnickveitch> :)
<roadmr> evilnickveitch: hey found another typo! https://juju.ubuntu.com/docs/charms-exposing.html search for "abtracts"
<evilnickveitch> ok
<roadmr> evilnickveitch: ... just realized this is about juju, not maas :/ let me know if I should direct this to someone else
<roadmr> evilnickveitch: on that same page, "expose the service via it's given address" (its again)
<roadmr> evilnickveitch: finally, search for "simly" in that page :D
<evilnickveitch> roadmr, yeah, these are juju pages, but it's (!) okay, I am the goto person for docs in cloud
<roadmr> I see what you did there :) well thanks for your help!
<evilnickveitch> roadmr, thankyou for pointing out the mistakes - they bug me just as much, but I have a lot of pages to read so it's unlikely I will spot them all
<roadmr> evilnickveitch: no problem, I'm (clearly) reading through juju documentation, so with fresh eyes I can help catch a few of those.
<evilnickveitch> cool. If I am not around, you can always file a bug - https://bugs.launchpad.net/juju-core/+filebug
<evilnickveitch> or fix it yourself if you are so inclined -
<evilnickveitch> https://juju.ubuntu.com/docs/contributing.html
<roadmr> evilnickveitch: oh cool! thanks!
<ging> i've added my ssh key to my root user through the webinterface, should that then get copied to the ubuntu user for all nodes allocated to root that are already booted, or only newly booted ones or only newly provisioned ones? because at the moment i don't seem to be able to get in at all
<bigjools> it only copies it to newly booted machines
<ging> for the sirius user?
<bigjools> FWIW if you use juju it handles all that for you
<ging> i mean ubuntu user
<ging> i've not setup juju, does that mean it won't do it at all?
<bigjools> not at all
<bigjools> it just makes life easier
<ging> ok i'll give setting that up a go
#maas 2014-01-24
<jtv> bigjools: I get the impression our support for DNS management is brittle in the face of multiple networks...  we always map a node's hostname to its first-registered NIC, instead of picking a NIC that's on the DNS-managed network.
<jtv> The right thing to do AIUI is to map the hostname to each IP address that the node has on a network for which we manage DNS.
<jtv> This means AFAICS that the fix will change DNS mappings for setups where the nodes are on multiple networks, when the first interface that gets registered is not on the network for which we manage DNS.
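The mapping jtv proposes could look roughly like the sketch below. This is not MAAS code: the function name dns_addresses, the sample addresses, and the idea of passing networks as CIDR strings are all assumptions made for illustration.

```python
import ipaddress


def dns_addresses(node_ips, managed_networks):
    """Return the node's IPs that fall inside any DNS-managed network."""
    networks = [ipaddress.ip_network(n) for n in managed_networks]
    return [
        ip
        for ip in node_ips
        if any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Hypothetical node: its first-registered NIC (10.0.0.5) is NOT on the
# network for which DNS is managed, but its second NIC is.
node_ips = ["10.0.0.5", "192.168.1.20"]
managed = ["192.168.1.0/24"]

mapped = dns_addresses(node_ips, managed)
print(mapped)
```

Under these assumptions the hostname would map to 192.168.1.20 rather than to the first-registered NIC, which is the behaviour change being discussed.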
<bigjools> "for which", I like it :)
<bigjools> jtv: I think we need to set the "primary" network to which the first IP is assigned in the DNS
<bigjools> jtv: however not fixing this now would not be a regression
<bigjools> anyway can you get "make run" actually serving anything on trusty?
<jtv> I'll try it.
<jtv> My grammatically awkwardly laboured point is that this behaviour _does_ change if we're going to manage DNS for multiple networks.
<bigjools> jtv: what I meant was that we don't have to manage multiple DNS networks
<bigjools> stick to the single for now
<bigjools> and we can solve this problem later
<jtv> !
<bigjools> why the !
<jtv> That's what I documented in my phased approach, and you said absolutely we have to do DNS on multiple interfaces.
<bigjools> we do - but I am saying it can wait until the end
<jtv> The end of what?
<bigjools> the cycle/feature
<bigjools> anyway there's two things getting conflated here
<bigjools> management of cluster interfaces does not have anything to do with which NIC on a node has its IP used as its dns lookup
<bigjools> since the cluster may not be connected to all the same networks as the node
<bigjools> they only need to share one
<bigjools> jtv: want to do a hangout to talk more?
<jtv> Yes please.  Meanwhile, no dice on Trusty - ISE.  :(
<jtv> Let me get my equipment ready for a hangout.
<bigjools> fnar
<jtv> This will include a heavy object to hurl at the crying baby.
<bigjools> I didn't want that kind of hangout
<jtv> Oh zark off.
<bigjools> call me when yer ready
<jtv> Well, _that_ boot didn't work, so let me try another one.
<jtv> Logging in with your password sort of requires a response to keyboard input.
<jtv> Ah, I can switch to a text console and back, and then the login screen accepts input.
<bigjools> !
<jtv> It's an alpha.
<jtv> By the way, the error I get from maas running from branch in Trusty is an import error involving django's debug toolbar.  Let's try uninstalling that.
<jtv> Good thing we have oops reports!
<bigjools> all I get is an oops page
<jtv> Me too, but it does generate an oops report with the traceback.
<lifeless> oops
<bigjools> jtv: I don't get any traceback or nuffink
<bigjools> hey lifeless
<bigjools> you lurker you
<lifeless> noone expects the NZ inquisition!
<jtv> Hey there lifeless
<bigjools> jtv: so I think our django debug toolbar is an old egg, I'm trying to get a new one
<jtv> Trying that here too.
<bigjools> jtv: setting it to 1.0.1
<bigjools> jtv: btw djorm-ext-pgarray is at 0.9 now
<jtv> Oh, then we may no longer need our workaround.
<bigjools> have a looksee and let me know
<jtv> Prioritising the nodegroupinterface work though...
<bigjools> debug_toolbar/base.html missing now
<bigjools> fair enough
<bigjools> hurray fixed
<jtv> \o/
<jtv> Failing for me:
<jtv> Error: Picked: sqlparse = 0.1.10
<bigjools> landing a branch
<jtv> While installing repl.
<bigjools> one mo
<jtv> OK
<bigjools> give the lander 10 minutes
 * bigjools makes coffee
<jtv> bigjools: works for me too now!
<bigjools> hurray!
<roaksoax> bigjools: ping
<jtv> Hi roaksoax - try again in an hour, or after our standup.
<roaksoax> jtv: hehe, i'll be asleep in an hour (and probably should be sleeping now)
<jtv> !
<jtv> I thought you were just up early!
<roaksoax> nah
<roaksoax> i'
<roaksoax> :)
<roaksoax> jtv: anyway going to bed but will leave bigjools with my open question
<jtv> OK nn!
<roaksoax> bigjools: since otherwise we wont be able to talk until sunday night (my time), your monday morning, I was wondering what's the status of adding upstream support to MAAS for HA? The charm is coming along pretty good and will soon start looking at that HA stuff for maas-region!
 * roaksoax out
<jtv> rvba: I put a few cards on the board for immediate next steps towards multiple-managed-interfaces.  They have detailed descriptions.
<rvba> jtv: okay, I'll have a look.
<jtv> Thanks.
<rvba> jtv: AFAIK, we can have the same DHCP server listen to multiple interfaces and have one config per interface.  That's all good.  But can we do something similar with DNS?
<jtv> rvba: depends on how similar you mean...  that's the region controller, of course, so it's going to listen on all of the controller's interfaces.
<jtv> But any given node is only going to be on one managed network, apparently.
<jtv> And that's why there is no confusion between the networks.
<rvba> Do you mean that there is one (and only one) network marked as "the management network"?
<jtv> From the cluster controller's perspective, no.  But from the node's perspective, yes.
<rvba> Care to explain?
<rvba> jtv: also, shouldn't we try splitting the card named "Handle multiple cluster net interfaces" into smaller tasks?
<jtv> rvba: why do you think there are cards for jobs that are part of that project?
<jtv> About the networks: a cluster will manage multiple networks, and a node will be in only one of them.
<rvba> jtv: then I think there are things that are missing :).  Like update the generated DHCP config to account for the multiple interfaces.
<rvba> Also, the parsing of the lease file will have to be changed (I think).
<jtv> You're right - we'll have to update the DHCP config for multiple networks.
 * rvba adds a card
<jtv> That's currently all built around the assumption of a single network.  :(
<rvba> Right, that's why I'm saying it will probably have to be changed.
<jtv> I can't think of anything that would need changing in the parser.
<rvba> Well, I'm just saying that's worth a look :).
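As a rough illustration of "one config per interface", the sketch below renders one ISC dhcpd subnet declaration per managed network. It is not MAAS's actual template: the function names, the dictionary keys, and the sample addresses are hypothetical, and a real config would carry more options.

```python
import ipaddress


def render_subnet(cidr, range_low, range_high, router):
    """Render one dhcpd 'subnet' declaration for a single managed network."""
    net = ipaddress.ip_network(cidr)
    return (
        "subnet {} netmask {} {{\n"
        "    range {} {};\n"
        "    option routers {};\n"
        "}}\n"
    ).format(net.network_address, net.netmask, range_low, range_high, router)


def render_dhcp_config(interfaces):
    """Join one declaration per cluster interface into a single config."""
    return "\n".join(render_subnet(**iface) for iface in interfaces)


# Two hypothetical cluster interfaces, each managing its own network.
interfaces = [
    {"cidr": "10.0.0.0/24", "range_low": "10.0.0.10",
     "range_high": "10.0.0.100", "router": "10.0.0.1"},
    {"cidr": "192.168.1.0/24", "range_low": "192.168.1.10",
     "range_high": "192.168.1.100", "router": "192.168.1.1"},
]

config = render_dhcp_config(interfaces)
print(config)
```

A single dhcpd instance can serve several such declarations, which matches rvba's point that the DHCP side extends naturally to multiple interfaces.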
<rvba> jtv: btw, did you fix the problem in maas-test where the machine was not cleaned up?
<jtv> rvba: no, and I can't test it now.  :(
<rvba> jtv: you mean you can't reproduce the problem?
<jtv> That's right.
<jtv> And as I said I just don't see how the "finally" clause could fail to run...
 * rvba tries
<bigjools> roaksoax: HA is not going to happen I'm afraid.
<bigjools> unless we find time out of nowhere
 * gmb lunches
<gmb> allenap: Are you able to help me unfuck my dev box? I accidentally rebooted halfway through apt-get remove maas last night and now maas-region-controller can't be fully installed or removed.
<allenap> gmb: Sure :)
#maas 2014-01-26
<netkook> does anybody want to help a newbie?
<netkook> == hello room
#maas 2015-01-19
<caribou> rvba: remember my query last week about DNS resolution ?
<rvba> caribou: yep
<caribou> rvba: well, I got maas 1.7 working fine in the same context
<caribou> rvba: maybe some specific issue with maas 1.5
<caribou> rvba: one thing I noticed with 1.7 is that if DNS resolution fails, it will drop to IP address which helped
<caribou> rvba: I'll take some time to see if I can get my maas 1.5 setup to work
<rvba> caribou: interesting… my guess is that the upgrade from 1.5 to 1.7 caused the entire DNS conf to be re-written… and it fixed the issue.
<caribou> rvba: oh, I didn't upgrade, I installed in another VM
<rvba> ah, okay
<caribou> now I'll reinstall a fresh trusty with its maas & see what I get, now that I have DNS resolution apparently working
<rvba> Okay, cool.  Thanks for testing this.  Keep me posted.
<caribou> but I do see the entries in the zone file & tasks in the logs
<caribou> for maas 1.7
<caribou> rvba: FYI, reinstalled maas 1.5 from scratch : works fine now. Go figure :-/
<aslaen> hello.. this is prob a question with a simple answer but I couldn't find it looking through the docs.  How do I set a static IP on a node?  I need to define a static IP so it NATs correctly.
#maas 2015-01-20
<kdavyd> Hi All. If this isn't the right channel, please point me the right way. Has anyone ever successfully managed to integrate MAAS node power controls and vSphere nodes via vCenter without going through a bunch of manual shenanigans and pinning VMs to individual vSphere hosts?
<kdavyd> I see two ways - one is to hack on libvirt to allow it to talk to a vCenter via vpx:// without having to specify the ESX server name, just the cluster.
<kdavyd> The other is to add vmware tools support into either the preseed or a custom commissioning script, and then get power controls to the VMs via wake-on-lan.
<kdavyd> Neither option appears to be trivial, and google offers no help either.
<kdavyd> What I'm trying to accomplish is a production-grade OpenStack HA deployment on top of an existing vSphere environment, with all OpenStack management nodes virtualized, and the only bare metal provisioned hosts being the actual compute and storage.
<X-Rob> kdavyd: vmware have an openstack build that talks to vCenter
<X-Rob> however, apparently it's poo.
<kdavyd> X-Rob: I don't even want to down that path, partially for the reason you've described.
<X-Rob> heh, yup.
<kdavyd> (we've been through vCloud, and that didn't end up well)
<dimitern> rvba, meeting?
<rvba> dimitern: sorry I'm late…
<jlondon> Howdy all. I'm trying Maas again after a couple years and am having some issues with the bootstrapping process. Some machines won't get past the initial pxe instantiation and just sit at the login prompt and the machines that do get their IPMI information added and then turned off... don't have ram/cpu info stored in maas. Any ideas? I've looked through the logs and didn't really see anything.
<roadmr> jlondon: ram/cpu info is only collected when you commission them; the first contact with maas is called "enlistment" and it won't get that information
<roadmr> jlondon: commission the nodes you want to use (there's a control for that), then reboot those nodes; they'll boot, then turn off again, but *then* you'll see their hardware information
<roadmr> jlondon: once they're commissioned, you can start them or deploy workloads (e.g. with juju) on them
<roadmr> jlondon: there's even a "commission and start node" button
<jlondon> roadmr: Okay, understood. That part is working as expected then. Any ideas on troubleshooting the nodes which can't get past the 'maas-enlisting-node' login?
<jlondon> I don't see anything in the logs about them and they never show up in the nodes list.
<roadmr> jlondon: try switching VTs (alt-f1,2,3...) and see if another console shows what they're doing
<roadmr> jlondon: if they got as far as showing the enlistment login, everything else should be ok
<jlondon> All tty's show the enlisting login screen, no logs.
<jlondon> Oh wait... tty7 has the dmesgs
<jlondon> Last entry is 'Ign http://archive.ubuntu.com trusty-updates'
<roadmr> jlondon: :D weird, is there some config difference between the failing nodes and the successful ones?
<jlondon> roadmr: I seem to remember something like this happening the last time I tried Maas and it had something to do with that Maas sets up a proxy for the nodes or something like that?
<roadmr> jlondon: network-wise, perhaps... a different switch?
<roadmr> jlondon: yes, what's odd is, why do some nodes work while others don't
<jlondon> roadmr: Nope. Same switches and I even just deleted one node which had been added okay and then tried adding it again and now it's hanging. Sooo... maybe a proxy issue?
<jlondon> If so how do I disable the proxy?
<roadmr> jlondon: That, I don't know :/ sorry.. since it's worked OK for me, I don't know how to disable it :(
<roadmr> jlondon: you could post maas-related questions in askubuntu.com, maas people regularly check there for questions
<jlondon> roadmr: Hrm, Okay thanks. I'll try to see if there's anything in the docs about it. I am now remembering that I had issues with the proxy the last time I tried this so.. Do you know how to login to the node before it has been found by maas? Maybe I can see the network info of the host from there.
<roadmr> jlondon: hm no idea, particularly because I don't think the ephemeral image has a password :/
<roadmr> jlondon: go to the console (alt-f7) right after the node boots, you'll be able to see cloud-init including network data there
<jlondon> roadmr: Yeah saw the IP of one of the machines previously but couldn't get in with my ssh key or anything I tried. I think you might be right about no password though :(
<jlondon> It's looking like it is a proxy issue now though, so I bet if I can disable it things will be okay.
<roadmr> jlondon: right, they won't even be sshable until they're started I think :/ that's when cloud-init fetches the ssh keys from maas
<roadmr> jlondon: have patience, someone here will know what to do. Just that I don't :) tomorrow morning you may have better luck, more people than now (almost end-of-day)
<jlondon> :)
<jlondon> Thanks for the help so far.
<jlondon> 99% sure now that it's a proxy issue. The apt-get process or whatever is being run is going.. just very, very, very slowly.
<roadmr> jlondon: np. Also sounds like wonky network card... hopefully you'll figure out the proxy and be able to rule this out.
<jlondon> Yep. It's the proxy. Killed it and the node finished booting.
<jlondon> Not certain why an issue I had two years ago is still an issue :|
<roadmr> so weird...
<roadmr> ok, need to log off. good night!
<jlondon> Seeya~
#maas 2015-01-21
<aslaen> hello.. I'm trying to get MaaS setup and I'm running into an issue where commission is fine but when I try to bootstrap I get "APM not present"  I believe this is because we have a NAT setup and MaaS doesn't appear to be using the IP I assigned to this node.
<aslaen> I used claim-sticky-ip-address to assign the IP to this specific node.
<aslaen> maas node claim-sticky-ip-address <node System ID> requested_address=172.16.0.177
<aslaen> but when it tries to pxe it says "My IP address is 172.16.0.226"
<aslaen> any ideas?
#maas 2015-01-22
<swtshoulder> hi can any one tell me if I have to disable my DHCP on my router if I want to use MaaS DHCP server to boot my nodes or my router's DHCP won't have a conflict with MaaS's DHCP?
<wojtek_dev__> hi, I got a lot of errors running make test (log: https://gist.githubusercontent.com/skarab7/f764ae115e07ecc395f3/raw/1afa5f9e6459d6051ab87baa39a46ea82523d825/maas_dev_make_test_log) on fresh ubuntu utopic (see https://github.com/skarab7/maas-dev-vagrant-env). I was following the hacking maas page. Did I miss something?
<wojtek_dev__> steps I took: https://github.com/skarab7/maas-dev-vagrant-env/blob/master/Vagrantfile#L56
<roaksoax> /qu/win 9
<blake_r> swtshoulder: you will need to disable your router's DHCP server if you want MAAS to deploy your nodes
<aslaen> Hello, does claim-sticky-ip-address work? It doesn't seem to DHCPREQUEST even after I set it.
#maas 2015-01-25
<Guest57661> hello i am trying to learn the Ubuntu maas setup in virtual environment using virtualbox
<Guest57661> how ever the node enlistment process for vm;s doesnot complete
<Guest57661> the maas server reports bad udp checksums in syslog
<Guest57661> and the node displays wating for response retrying
<Guest57661> help plz
<Guest57661> IP-Config: no response after 60 secs - giving up
#maas 2016-01-25
<BlackDex> hello there. I have installed a maas server, but i don't seem to have a commissioning release available
<BlackDex> How can i manually download this?
<D4RKS1D3> BlackDex, it also happens to be and I realized that you need to download 14.04 (not higher) to do the commissioning (I'm using MaaS 1.9.0)
<BlackDex> D4RKS1D3: Ah, that could explain it :). Seems to be working with the 14.04LTS :)
<mup> Bug # changed: 1321023, 1381058, 1386327, 1389806, 1415820, 1419041, 1423691
<nagyz> hey guys
<nagyz> any developers around? :)
<nagyz> I think I've managed to stumble upon some bugs in the latest 1.9
<mup> Bug #1475977 changed: filenames for user-defined preseeds when booting using the debian installer are not correct. <MAAS:Invalid> <MAAS 1.7:New> <https://launchpad.net/bugs/1475977>
<mup> Bug #1481992 changed: Upgrade of grub-pc during install fails <canonical-bootstack> <MAAS:Invalid> <curtin (Ubuntu):New> <https://launchpad.net/bugs/1481992>
<mup> Bug #1475977 opened: filenames for user-defined preseeds when booting using the debian installer are not correct. <MAAS:Invalid> <MAAS 1.7:New> <https://launchpad.net/bugs/1475977>
<mup> Bug #1481992 opened: Upgrade of grub-pc during install fails <canonical-bootstack> <MAAS:Invalid> <curtin (Ubuntu):New> <https://launchpad.net/bugs/1481992>
<mup> Bug # changed: 1475977, 1481992, 1484268, 1488594, 1494955
<mup> Bug #1484268 opened: MAAS not auto-detecting/auto-entering credentials for HP Proliant ML310E G8 V2 server <MAAS:Invalid by newell-jensen> <maas (Ubuntu):Expired> <https://launchpad.net/bugs/1484268>
<mup> Bug #1488594 opened: Nodes cannot boot after a storage disk replacement <ceph> <disaster-recovery> <storage> <MAAS:Invalid> <syslinux (Ubuntu):New> <https://launchpad.net/bugs/1488594>
<mup> Bug #1494955 opened: MAAS reports duplicate MAC on commissioning <canonical-bootstack> <MAAS:Invalid> <MAAS 1.8:Triaged> <https://launchpad.net/bugs/1494955>
<mup> Bug # changed: 1484268, 1488594, 1494955, 1508565, 1513224
<mup> Bug #1508565 changed: maas uses 3.13 (hwe-t) kernel which does not work on non-virtual IBM power <blocks-hwcert-server> <MAAS:Invalid> <maas-images:Confirmed> <maas (Ubuntu):Invalid> <https://launchpad.net/bugs/1508565>
<mup> Bug #1508565 opened: maas uses 3.13 (hwe-t) kernel which does not work on non-virtual IBM power <blocks-hwcert-server> <MAAS:Invalid> <maas-images:Confirmed> <maas (Ubuntu):Invalid> <https://launchpad.net/bugs/1508565>
<mup> Bug #1513224 opened: 1.9b2: node details storage displays big json blob <landscape> <MAAS:Won't Fix> <https://launchpad.net/bugs/1513224>
<mup> Bug #1513224 changed: 1.9b2: node details storage displays big json blob <landscape> <MAAS:Won't Fix> <https://launchpad.net/bugs/1513224>
<mup> Bug #1537789 opened: MAAS dhcp fails to start on  up-to-date Xenial with MAAS built from source <MAAS:New> <https://launchpad.net/bugs/1537789>
<mup> Bug #1537789 changed: MAAS dhcp fails to start on  up-to-date Xenial with MAAS built from source <MAAS:New> <https://launchpad.net/bugs/1537789>
<mup> Bug #1508501 changed: maas dns entry missing <adoption> <charmers> <cpp> <MAAS:Invalid> <MAAS 1.8:Triaged> <https://launchpad.net/bugs/1508501>
<mup> Bug #1508501 opened: maas dns entry missing <adoption> <charmers> <cpp> <MAAS:Invalid> <MAAS 1.8:Triaged> <https://launchpad.net/bugs/1508501>
<lovea> Hi, Trying to use ProLiant DL160 Gen9 servers in a MAAS set up. They have hpdsa drivers detected. So far so good. Then I use Juju to deploy a 15.10 Wily charm and the deploy fails because the hpdsa drivers are requested from http://downloads.linux.hp.com/SDR/repo/ubuntu-hpdsa/dists/wily/main/binary-amd64/Packages. This doesn't exist. HP only has a 14.04 Trusty repo. Any ideas how I can proceed?
 * jmalcaraz Hi
<timello> Hi there, I'm trying to deploy ubuntu on POWER8 box using maas. I'm wondering if there's anything I should do in the POWER box before issuing the commission... In the ipmi console log, I don't see anything in the 'Kernel command line:'
<timello> MaaS can power on/off, but commission fails
<timello> Nothing in the logs other than: ERROR: Section post-commit `Chassis_Boot_Flags']
<timello> And: [WARNING] Failed to change the boot order to PXE <IP>
<bdx> hey whats up everyone? Is there a maas/next repo for xenial?
<roaksoax-brb> bdx: we are testing a xenial release as of now. We should have something up this week
<bdx> roaksoax-brb: sweet! thanks!
<nagyz> so something changed and (even after a reinstall!) my enlistment and commissioning runs into errors when trying to get an IP for the node :/
<nagyz> according to maas's log, dhcp is ACKing it, but the node never sees it (or just ignores it)
#maas 2016-01-26
<mup> Bug #1519726 changed: Status ready but no RAM,  Asus node <MAAS:Expired> <https://launchpad.net/bugs/1519726>
<smoser> rfolco, you had some questions wrt power and maas ?
<rfolco> smoser, yeah. Looks like some boot options are not set by maas to start commissioning on a p8 box
<rfolco> timello, ^
<timello> rfolco, smoser, yeah, actually we are looking for some guide or list of requirements... Is there any specific firmware version, any preparation in the box prior the maas commission
<rfolco> smoser, you mentioned you have a p8 box on maas. Any special config to make ipmi/pxe work?
<smoser> hm..
<smoser> oh. i dont recall if enlistment works or not.
<rfolco> timello, can you give more details on the error ? so the p8 box boots, then falls to petitboot, and maas returns error because it doesn't find an entry to boot from?
<smoser> likely it'd enlist and then you'd have to fill in the power parameters
<smoser> you do have to leave the user blank on power8
<timello> smoser: ipmi works
<timello> maas can power on and off
<timello> get this though: [ [WARNING] Failed to change the boot order to PXE 9.114.118.83: ERROR: Section post-commit `Chassis_Boot_Flags']
<smoser> oh. so what is wrong ? i'm sorry.
<timello> commission fails
<smoser> what maas version ?
<timello> latest from ppa
<smoser> 1.9 then.
 * timello checks
<smoser> hm..
<smoser> well.... 1.9 not so good for power at the moment.
<timello> smoser: what would you recomend
<smoser> https://bugs.launchpad.net/maas/+bug/1523779
<smoser> that is completely going to block you on 1.9
<smoser> that said the error you see is not expected or ignorable
<rfolco> 1.7 then?
<smoser> pick a 1.8 you should be good.
<timello> MAAS Version 1.9.0+bzr4533-0ubuntu1 (trusty1)
<rfolco> there you go timello
<rfolco> smoser, you da man, thanks!
<smoser> well, sorry. i dont know where you can easily get a 1.8 though
<smoser> :-(
<smoser> roaksoax-brb, ^ can you advise on that? blake_r_ ?
<smoser> fwiw, 1.8 does not work with powervm systems.
<smoser> powernv or powerkvm should work fine.
<smoser> you need 1.9 for powervm but that breaks as described in that bug above.
<smoser> its being worked.
<rfolco> pkvm here
<timello> smoser: btw, thank you!
<rfolco> +1
<mup> Bug #1519470 changed: Trusty Deployment w/ RAID storage config fails because Trusty images do not contain RAID kernel modules <hwcert-server> <curtin:Invalid> <MAAS:Invalid> <maas-images:Confirmed> <https://launchpad.net/bugs/1519470>
<mup> Bug #1538280 opened: MAAS 1.9.0+bzr4533 can not start HP ProLiant DL360 Gen9 servers over IPMI <MAAS:New> <https://launchpad.net/bugs/1538280>
<mup> Bug #1538280 changed: MAAS 1.9.0+bzr4533 can not start HP ProLiant DL360 Gen9 servers over IPMI <MAAS:Opinion> <https://launchpad.net/bugs/1538280>
<mup> Bug #1519470 opened: Trusty Deployment w/ RAID storage config fails because Trusty images do not contain RAID kernel modules <hwcert-server> <curtin:Invalid> <MAAS:Invalid> <maas-images:Confirmed> <https://launchpad.net/bugs/1519470>
<mup> Bug #1538280 opened: MAAS 1.9.0+bzr4533 can not start HP ProLiant DL360 Gen9 servers over IPMI <MAAS:Opinion> <https://launchpad.net/bugs/1538280>
<rfaux> Hello. Anyone been able to use MAAS on nodes with multipath storage?
<krondor> trying out maas 1.9 seeing a few issues.  Deploys fail with [Errno 2] No such file or directory: '/sys/block/c0d0/holders' and I cant' seem to find maas-image-builder anymore in PPAs
<krondor> Here's more error log from deployment failure; https://gist.github.com/krondor/92f145d67d150eca3602
#maas 2016-01-27
<tertiary> my group is considering maas+juju. one thing i am very hesitant about is our requirement to stay with the most current Ubuntu OS version. We will be running gluster to share storage across all the nodes. in order to reprovision the nodes with the newest Ubuntu versions, won't maas rebuild all the nodes, consequently destroying the gluster share?
<rbasak> tertiary: what would you prefer to happen instead? MAAS will only do that if you ask it to, and presumably you don't?
<rbasak> tertiary: you might be interested to look into how Juju manages HA Openstack upgrades without breaking the deployment in the process. Different problem but the solution is relevant to you I think.
<mup> Bug #1538640 opened: Failed Deployment:  No Arrays Found <MAAS:New> <https://launchpad.net/bugs/1538640>
<tertiary> rbasak: thanks
<mup> Bug #1538640 changed: Failed Deployment:  No Arrays Found <MAAS:New> <https://launchpad.net/bugs/1538640>
<roaksoax-brb> OS/win 11
<roaksoax-brb> bl	why wouldn't this work ?
<roaksoax-brb> roaksoax@unleashed:~$ maas roaksoax node get-curtin-config system_id=node-7a75ff92-324b-11e5-902a-00163eca07b6
<roaksoax-brb> Not Found
<roaksoax-brb> blake_r_: ^^
<blake_r_> roaksoax-brb: because its wrong
<blake_r_> roaksoax-brb: maas roaksoax node get-curtin-config node-7a75ff92-324b-11e5-902a-00163eca07b6
<blake_r_> roaksoax-brb: no "system_id="
<mup> Bug #1538640 changed: Failed Deployment:  No Arrays Found <curtin:Confirmed> <MAAS:Invalid> <https://launchpad.net/bugs/1538640>
<mup> Bug #1538640 opened: Failed Deployment:  No Arrays Found <curtin:Confirmed> <MAAS:Invalid> <https://launchpad.net/bugs/1538640>
<donu7> Is this room an adequate place to ask maas+juju based questions?
<donu7> nvm, found the #juju channel
<mup> Bug #1356012 opened: maas incorrectly overmanages DNS reverse zones <dns> <MAAS:In Progress by lamont> <MAAS 1.9:Won't Fix> <MAAS trunk:In Progress by lamont> <maas (Ubuntu):Confirmed> <https://launchpad.net/bugs/1356012>
#maas 2016-01-28
<mup> Bug #1536262 changed: dns-* e/n/i are misplaced <uosci> <curtin:In Progress> <MAAS:Invalid> <MAAS 1.9:Invalid> <MAAS 2.0:Invalid> <https://launchpad.net/bugs/1536262>
<mup> Bug #1536262 opened: dns-* e/n/i are misplaced <uosci> <curtin:In Progress> <MAAS:Invalid> <MAAS 1.9:Invalid> <MAAS 2.0:Invalid> <https://launchpad.net/bugs/1536262>
<mup> Bug #1522910 changed: Default Flat partition scheme fails on efi - lacks /boot/efi <MAAS:Invalid> <https://launchpad.net/bugs/1522910>
<mup> Bug #1524500 changed: [xenial, 1.10] dhcpd.conf doesn't get created anymore <MAAS:Invalid> <https://launchpad.net/bugs/1524500>
<mup> Bug #1524501 changed: [xenial, 1.10] dhcpd.conf doesn't get created anymore <MAAS:New> <https://launchpad.net/bugs/1524501>
<mup> Bug #1539248 opened: Subnet.objects.allocate_new should take a subnet and requested_address <networking> <MAAS:Triaged> <https://launchpad.net/bugs/1539248>
<mup> Bug #1539277 opened: [1.10, 2.0] MAAS UI is not available immediately after install <MAAS:New> <https://launchpad.net/bugs/1539277>
<mup> Bug #1539292 opened: 1.9: Stale view while deleting a server - no way to tell that operation is in progress <oil> <MAAS:New> <https://launchpad.net/bugs/1539292>
#maas 2016-01-29
<mag009_> anyone around
<mup> Bug #1539739 opened: Memory size reported by MAAS less than actual memory size by 1GB for commissioned VM <oil> <MAAS:New> <https://launchpad.net/bugs/1539739>
<mup> Bug #1514648 opened: Got more than one item. - unable to add/modify machines in the UI <MAAS:New> <https://launchpad.net/bugs/1514648>
#maas 2016-01-30
<nagyz> is there a way to tell maas not to use the server's IP as a DNS server when deploying new nodes? :/
<nagyz> ok, opened a bug...
<mup> Bug #1539924 opened: dns forwarders are not set properly <MAAS:New> <https://launchpad.net/bugs/1539924>
#maas 2016-01-31
<mup> Bug #1540021 opened: Machine Summary Page Storage Displays Meta Details like: 53.0GB over [{"is_boot":true,"available_size_human":"0.0...  <MAAS:New> <https://launchpad.net/bugs/1540021>
<mup> Bug #1520645 changed: Unable to enlist node in gMAAS <MAAS:Expired> <https://launchpad.net/bugs/1520645>
<Skaag> I have two machines, both with maas installed... how do I make them aware of each other? (cluster)
<Skaag> I tried to set the IP address of the master machine in the dialog that popped up when running 'dpkg-reconfigure maas-cluster-controller'
<Skaag> (but that doesn't work)
<mup> Bug #1540040 opened: MAAS 1.9 is broken on xenial <MAAS:New> <https://launchpad.net/bugs/1540040>
<mup> Bug #1540040 changed: MAAS 1.9 is broken on xenial <MAAS:Invalid> <MAAS 1.9:In Progress by allenap> <https://launchpad.net/bugs/1540040>
<mup> Bug #1539924 changed: dns forwarders are not set properly <MAAS:Invalid> <https://launchpad.net/bugs/1539924>
<mup> Bug #1540021 changed: Machine Summary Page Storage Displays Meta Details like: 53.0GB over [{"is_boot":true,"available_size_human":"0.0...  <MAAS:Invalid> <https://launchpad.net/bugs/1540021>
<zdoge> Hello everyone! I need help integrating the MAAS master (which runs in a VM) with a VMware vSphere host. I tried to add the vSphere host but I keep getting "Failed to probe and enlist VMware nodes: (vim.fault.HostConnectFault) {#012   dynamicType = <unset>,#012   dynamicProperty = (vmodl.DynamicProperty) [],#012   msg = '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify faile". Does anyone know how I can bypass this? Thanks :)
#maas 2017-01-23
<aaran> Hi, I am trying to get MAAS working but I am having a little trouble with the setup when using an external DHCP server
<aaran> First of all, if I am using an external DHCP server, should I be seeing "rackd - Missing connections to 1 region controller(s)." in the Rack Controller section of the MAAS server?
<pmatulis> aa...
<andrew-ii> I have a node that boots under maas direction, but does not commission. Other nodes in the region commission fine - is that likely a BIOS/firmware/UXE issue on the failing node?
#maas 2017-01-24
<tdihp> Hi mpontillo
<tdihp> After https://askubuntu.com/questions/873701/maas-node-unable-to-resolve-its-own-hostname updated, I'm still trying to get the commissioning working.
<tdihp> the current problem is that the storage info is still missing after commissioning, with a valid 00-maas-07-block-devices.out output (/dev/sda is listed).
<tdihp> What else can I try to diagnose the situation?
<mpontillo> tdihp: I would run this script on the commissioning node while SSH'd in to see where the failure is.
<mpontillo> https://gist.github.com/pontillo/0b92a7da2fba43fb5dce705be2dcf38b
<mpontillo> tdihp: that is, the commissioning node executes (pretty much) those commands to figure out what disks are on the system
<mpontillo> tdihp: so you should be able to tell where it's going wrong. I think lamont's theory was that it was timing out trying to contact an iSCSI disk specified by a name rather than an IP address
<tdihp> Thanks mpontillo, I'll try.
<mpontillo> tdihp: updated https://gist.github.com/pontillo/0b92a7da2fba43fb5dce705be2dcf38b -- now includes also the 'sudo' commands
<tdihp> mpontillo: I've tried the script
<tdihp> It will take more than 2 minutes, with more than 6 sudo calls
<tdihp> However the 00-maas-07-block-devices.err file just has 4 lines of sudo errors.
<tdihp> mpontillo: Oddly though the script you provided did not reproduce those sudo error messages
<mpontillo> tdihp: my script redirects sudo's stderr to /dev/null =)
<mpontillo> tdihp: can you humor me and do two more things... (1) tell me the output of `iscsiadm -m session`, and (2) if you can determine that open-iscsi is trying to connect to MAAS via a hostname rather than an IP address (either iscsiadm or mount should tell you), run "sudo dpkg-reconfigure -plow maas-rack-controller" on the MAAS server, and make sure the MAAS URL is set to an IP address and not a hostname
<tdihp> mpontillo: And yes, After removing those "2>/dev/null" I saw the identical errors
<tdihp> mpontillo: the iscsi output: tcp: [1] 10.9.8.1:3260,1 iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-amd64-generic-xenial-daily (non-flash)
<mpontillo> tdihp: lamont's theory, by the way, was that the sudo errors are a red herring; I think he thought the actual issue was probing an iSCSI target by its hostname rather than an IP address
<tdihp> mpontillo: I see. The iscsi target is in ip though.
<mpontillo> tdihp: yeah. I see that. I wonder if it was passed into the kernel that way. try:
<mpontillo> cat /proc/cmdline | tr ' ' '\n' | grep iscsi_target_ip
<mpontillo> tdihp: leave off the grep and you should also see a root= line that has the iSCSI /dev/ path in there.
<tdihp> iscsi_target_ip=10.9.8.1
<mpontillo> ok
<mpontillo> tdihp: hm. so when you tried my script, did storage information come back as expected?
<tdihp> mpontillo: yes, the script did output sda sda1 and sdb
<tdihp> root=/dev/disk/by-path/ip-10.9.8.1:3260-iscsi-iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-amd64-generic-xenial-daily-lun-1
<tdihp> The only line(s) with any hostnames are: iscsi_initiator=pure-mammal -- ip=::::pure-mammal:BOOTIF
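The kernel command line checks in this exchange (the `tr`/`grep` pipeline and the manual scan for hostnames) can be done in one pass. A minimal Python sketch, assuming the parameters are whitespace-separated `key=value` tokens like the values pasted above; `parse_cmdline` is a hypothetical helper name and not part of MAAS, and the order of tokens in the sample string is an assumption:

```python
def parse_cmdline(text):
    """Split a kernel command line into a dict of key -> value.

    Tokens without '=' get an empty value. Only the first '=' splits,
    so values that themselves contain '=' (such as URLs) stay intact.
    """
    params = {}
    for token in text.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


# Sample assembled from the values quoted in the log. On a live node,
# replace this string with the contents of /proc/cmdline.
sample = (
    "iscsi_target_ip=10.9.8.1 iscsi_initiator=pure-mammal "
    "ip=::::pure-mammal:BOOTIF "
    "root=/dev/disk/by-path/ip-10.9.8.1:3260-iscsi-"
    "iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-amd64-generic-xenial-daily-lun-1"
)
params = parse_cmdline(sample)
for key in ("iscsi_target_ip", "iscsi_initiator", "root"):
    print(key, "=", params.get(key, "<not set>"))
```

Splitting on the first `=` only mirrors the `cut -d= -f2-` idiom, which keeps everything after the first delimiter.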
<mpontillo> tdihp: so it's curious that you don't have that same data when you commission then. in my setup, when I browse to the node details page, scroll down to commissioning details, and click on "00-maas-07-block-devices.out", I see a JSON object which is a list of two dictionaries (each disk in the system)
<tdihp> mpontillo: Yes, my "00-maas-07-block-devices.out" also lists two items, curious truly :D
<mpontillo> tdihp: ok, what state is the node in? perhaps the commissioning is failing at some other point?
<tdihp> mpontillo: BTW I've figured a way to tap to the HTTP calls the node made to the MAAS server
<mpontillo> tdihp: oh yeah? what was your method? I've captured packets in the past, but that was a pain. later I noticed we had a mass_get.py script or similar which made the calls, heh
<tdihp> mpontillo: the node is now in "ready", just with "storage" as 0 GB
<tdihp> mpontillo: Yeah I wish I could use mitmproxy too.
<tdihp> mpontillo: https://sysdig.com/blog/decode-your-http-traffic-with-sysdig/ the echo_fds bit
<mpontillo> tdihp: so 00-maas-07-block-devices.out -- I assume it doesn't contain sizes then?
<mpontillo> tdihp: do the sudo commands in my script get the block size? (just noticed a typo, the second one should be bsz: and not size64:)
<mpontillo> tdihp: ah nice, that tool looks interesting
<mpontillo> tdihp: and I didn't use a MitM method, I just used "sudo tcpdump" on the rack controller, btw
<mpontillo> tdihp: but I had to wade through a huge packet capture ;-)
<tdihp> mpontillo: So the point was, between the "starting 00-maas-07-block-devices" request and the report, there was less than 80 secs
<tdihp> mpontillo: which matches the lag of 4 sudo calls (2 devices)
<mpontillo> tdihp: well, the important thing to me is whether or not you're getting the data. if sudo is slow, that's not good. but I guess it's aborting before it's able to get the data. which is weird... on my system, I still get the "unable to resolve" error, but it works immediately
<mpontillo> tdihp: I suspect that is because I am getting a NXDOMAIN response back from the DNS server, and it's able to fail fast, whereas in your case, for some reason, the packets don't make it, and it times out
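The fail-fast versus timeout theory can be checked by timing a single name lookup on the node. A minimal Python sketch, assuming the slow step is resolution of the node's own hostname; `time_lookup` is a hypothetical helper, and the `resolver` parameter exists only so the function can be exercised without a network:

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (resolved, seconds) for one name lookup.

    A failure that returns quickly suggests the DNS server answered
    (for example with NXDOMAIN); a failure after many seconds suggests
    the queries are going unanswered and timing out.
    """
    start = time.monotonic()
    try:
        resolver(hostname, None)
        resolved = True
    except socket.gaierror:
        resolved = False
    return resolved, time.monotonic() - start
```

On the commissioning node, calling `time_lookup(socket.gethostname())` and printing the result would show whether the lookup fails immediately or only after a long wait.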
<tdihp> mpontillo: the 00-maas-07-block-devices.out does include size for sda like: "SIZE": "500107862016",
<mpontillo> tdihp: well that's interesting. so I assume you have a 500 GB disk then
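The "500 GB" reading follows from treating the SIZE value as a byte count. A minimal Python sketch of that conversion, assuming the commissioning output is a JSON list of dictionaries as described earlier in this exchange; only the "SIZE" field is confirmed by the quoted output, while the "NAME" key and the `disk_sizes_gb` helper are assumptions for illustration:

```python
import json


def disk_sizes_gb(block_devices_json):
    """Map each device name to its size in decimal gigabytes.

    Assumes a JSON list of dictionaries whose "SIZE" field is a byte
    count stored as a string; the "NAME" key is an assumption.
    """
    devices = json.loads(block_devices_json)
    return {dev["NAME"]: int(dev["SIZE"]) / 10**9 for dev in devices}


# One-device sample using the size quoted in the log.
sample = '[{"NAME": "sda", "SIZE": "500107862016"}]'
for name, gigabytes in disk_sizes_gb(sample).items():
    print(f"{name}: {gigabytes:.1f} GB")
```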
<mpontillo> tdihp: if that's the case then I'm surprised it isn't being reported correctly.
<tdihp> mpontillo: Yeah, MAAS still reports "No storage information"
<mpontillo> tdihp: is there anything interesting under /var/log/maas/rsyslog/<hostname>/* on the MAAS server?
<mpontillo> tdihp: another thing you could do is look at: curl $(cat /proc/cmdline | tr ' ' '\n' | grep cloud-config-url | cut -d= -f2-)
<mpontillo> tdihp: you should see under datasource: and reporting: the URLs used to communicate back to the MAAS server. though I expect they're okay given that you seem to have commissioning output.
<mpontillo> tdihp: at this point, I would also check /var/log/maas/regiond.log, and check for any Python tracebacks. maybe that could provide a clue why the size couldn't be calculated
<tdihp> maasserver: [error] Error while calling DescribePowerTypes: Unable to get RPC connection for rack controller 'nv750' (k4am4k).
<mpontillo> tdihp: are the region and the rack on the same machine? is that consistent or a one-off?
<mpontillo> tdihp: that could happen if, for example, MAAS is restarting
<mpontillo> tdihp: what you might do is "tail -f" the log and try commissioning again, then see if any tracebacks occur during commissioning
<tdihp> mpontillo: It's the same machine
<mpontillo> ok. probably a red herring then, unless you have firewall rules I don't know about ;-)
<tdihp> mpontillo: and its not one-off
<mpontillo> tdihp: hm, well that is weird then. if you go to Nodes > Controllers and click your controller, is the status all green?
<tdihp> maasserver.rack_controller: [critical] Failed configuring DHCP on rack controller 'id:1'.
<tdihp> mpontillo: oops
<tdihp> mpontillo: It seems the rack controller is offline on MAAS
<mpontillo> tdihp: well that's an interesting clue, however it seems DHCP is working, or you wouldn't be able to PXE boot?
<mpontillo> tdihp: ok, I would "tail -f rackd.log" and then "service maas-rackd restart"
<tdihp> mpontillo: Yes DHCP/TFTP seems working
<tdihp> mpontillo: Oh, my IP has changed
<tdihp> mpontillo: the rackd is fixed, I'll try commissioning again
<mpontillo> tdihp: great. hoping it works this time =)
<tdihp> mpontillo: I guess I still have to try harder
<tdihp> mpontillo: is it possible that the MAAS server failed to update the node status when commissioning?
<mpontillo> tdihp: you can look at the node event log and find out
<tdihp> mpontillo: there is a burst of events like this: pure-mammal ureadahead[463]: ureadahead:events/fs/open_exec/enable: Ignored relative path
<mpontillo> tdihp: that might be a red herring https://bugs.launchpad.net/maas/+bug/1643838
<mpontillo> tdihp: any chance you can pastebin your regiond.log and rackd.log?
<mpontillo> tdihp: maas.log too for completeness
<tdihp> mpontillo: Sure I can.
<tdihp> mpontillo: https://1drv.ms/u/s!Ap2xOCxdN0NAjCHA48CAfrmpySNc
<tdihp> mpontillo: Can you download the file?
<tdihp> Oh, you mean there's a real pastebin tool
<tdihp> pls wait, I'll paste there
<mpontillo> tdihp: no that's fine, I grabbed it and was looking through
<mpontillo> tdihp: haven't found anything interesting though
<mup> Bug #1656717 opened: Juju -> MAAS [2.2+] API integration needs to account for null spaces <juju:Triaged> <MAAS:In Progress by mpontillo> <https://launchpad.net/bugs/1656717>
<mup> Bug #1643001 changed: Moonshot iLO4 'Power HW address' prevent ipmitool from working <MAAS:Expired> <https://launchpad.net/bugs/1643001>
<tdihp> mpontillo: I haven't mentioned it, but commissioning did work exactly once, and I could never reproduce it =)
<mpontillo> tdihp: when was the last time you reopened the browser tab with your MAAS UI? can you try force-reloading it? I may have just spotted the bug. it's possible that if the amount of storage changes, the UI won't refresh properly. it may be that your node commissioned just fine.
<mpontillo> tdihp: the fact that the node is in "Ready" state means that MAAS thinks the node is usable. it would be in "Failed Commissioning" state if MAAS thought otherwise.
<mpontillo> tdihp: if that doesn't work, I would try deleting all the storage devices and recommissioning the node (though it SHOULD work without doing that; MAAS tries to update existing devices though, and I'm wondering if that step somehow goes horribly wrong)
<tdihp> mpontillo: Yes! Strangely enough, after opening the UI in a fresh browser session, the storage size shows up!
 * mpontillo groans
<tdihp> I'll try deleting the node and do another check
<mpontillo> tdihp: glad we're finally through that =)
 * mpontillo should sleep soon; it's after midnight here
<tdihp> mpontillo: Thanks really. Good night, cheers
<mpontillo> tdihp: happy to help; have a good day.
<jlec_> Hi there
<jlec_> What is the current way of creating custom CentOS images with MAAS-2.x?
<brendand> jlec_, CentOS is available by default in MAAS now
<brendand> no custom images needed
<jlec_> brendand: yes I know. But the official images provided by canonical are older versions. I need 7.2/3
<jlec_> Secondly, I'd like to change the base image deployed.
<jlec_> And lastly, only 6.6 picks up an IP; 7.0 fails to do so
<Yoofy> Hi
<Yoofy> I see that maas-image-builder is deprecated; do you know how to make a custom image?
<MrLeau> Does anybody have problems commissioning nodes using 16.04 with an offline MAAS install?
<MrLeau> I'm getting "Hash sum mismatch"; I think it's because it's renaming the files in /var/lib/apt/lists
<mpontillo> MrLeau: I have seen that before when my mirror was out of date. For me it meant that my images were more up-to-date than my mirror.
<mpontillo> MrLeau: MAAS installs packages from the archive during commissioning, for things like checking lldp connectivity.
<MrLeau> Yes, it is very likely that my mirror is out of date, but I found that if I delete everything from /var/lib/apt/lists, apt will update it and it works
<MrLeau> If there was a way to clean /var/lib/apt/lists I think it would fix my problem, but the commissioning script runs after the standard apt-get update
<jlec> hi again. now that the maas-image-builder is deprecated, what is the state of the art way to build custom images?
<pmatulis> jlec, i don't think there is a way afaik
<jlec> pmatulis: but how does canonical do it to provide the image to the community?
<jlec> There is something around curtin, but no real docs anywhere
<plars> I have a small maas system with http://maas.ubuntu.com/images/ephemeral-v2/daily/ as the boot image sync url, and a while back when I first set it up, I was able to use it to boot with either xenial or zesty.
<plars> but now if I try to boot start node with distro_series=zesty, it tells me:
<plars> {"distro_series": ["'zesty' is not a valid distro_series.  It should be one of: '', 'ubuntu/trusty', 'ubuntu/wily', 'ubuntu/xenial'."]}
<plars> but distro_series=xenial works just fine
<plars> brendand: iirc you had helped me with this a while back when it first worked, any ideas?
<pmatulis> plars, what version of maas are you using?
<plars> pmatulis: 1.9.4+bzr459
<pmatulis> plars, oh. you haven't up'd to the 2.x series. i only really started using maas with 2.0 and 2.1
<plars> pmatulis: for various reasons, I'm reluctant to update from that. We are trying to use an existing dns/dhcp server in this lab rather than letting maas handle that function. This worked for a long time, then broke in an update a while back. Folks here were able to help me convince it to (mostly) quit trying to force us down that path but warned me that future updates might break it completely
<plars> and this did previously import the zesty images and let us deploy with it at one time, so I'm not sure when it stopped
<plars> or why
<pmatulis> plars, maybe go directly to the URL and see if zesty is there
<plars> pmatulis: I see it on that url
<pmatulis> ohh
<plars> there was an update a few days ago it seems
<plars> to that image
<pmatulis> plars, i suggest filing a bug then
#maas 2017-01-25
<errr> Using MAAS Version 2.1.1+bzr5544-0ubuntu1 (16.04.1) is there a limit to the number of machines returned by the API in a single request?
<brendand> errr, not that i know of
<errr> from what I could find in the source code thats what it was looking like to me but I wasnt 100% sure
<aaran> Hi, tried to get support regarding an issue yesterday however there was no one around so I will ask again
<aaran> when i tell a client to network boot it gives the error Unable to locate configuration file
<aaran> After restarting the server the client now boots into an Ubuntu instance; however, the server shows it as commissioning until it fails
<g3> hey yo!
<g3> Anyone around?
<g3> =(
<errr> is it possible to provision based on tags a machine has been given?
<errr> Using MAAS Version 2.1.1+bzr5544-0ubuntu1 (16.04.1)
<g3> setting up maas for the first time, currently stuck at "One cluster is not connected to the region."
<g3> Currently have a cluster controller setup at a network interface handling dhcp and dns
<g3> and /etc/maas/clusterd.conf reflects that information
<g3> Hmmm apt installed 1.9
<pmatulis> errr, how are you provisioning?
<pmatulis> g3, i suggest using maas 2.1 if you're setting up for the first time
<g3> Yeah I just realized I can't use Trusty for that
<g3> So I'll have to upgrade another server first.
<roaksoax> errr: yes
<roaksoax> errr: you can 'allocate' a machine based on tags in order to deploy it
<roaksoax> errr: so you can allocate + deploy the machine
<andrew-ii> What could cause a node to properly boot via PXE but not enlist?
<roaksoax> andrew-ii: lack of network access ? or not being able to contact the maas server ?
<andrew-ii> It seems to do ok for the most part
<roaksoax> andrew-ii: do you have a console log I can see ?
<andrew-ii> It boots under maas direction, and only seems to trip up at the end during the enlist stage
<andrew-ii> Lemme double check if rsyslog captured any of the end bits
<roaksoax> andrew-ii: if you could get me an error output or the whole output for the matter, I would be able to have an idea of what's wrong
<andrew-ii> Neat - is there a preferred place to put a log? (It's about 3.9MB at the moment)
<roaksoax> andrew-ii: not really, a pastebin would be good
<roaksoax> andrew-ii: if you have the snippet of the failure, that's better
<andrew-ii> Let me sed out the ureadahead bits and see if that makes it more obvious
<andrew-ii> Sorry for the delay - here's the last attempt: http://pastebin.com/FkMnpY8f
<andrew-ii> The strange bit is that I've got 4 other machines that commissioned happily. Just this one seems to get confounded.
<andrew-ii> Note that all the machines are different - it's a horrific Frankenstein's monster of boxes
<roaksoax> andrew-ii:
<roaksoax> Jan 25 21:56:24 maas-enlisting-node cloud-init[2325]: curl: (22) The requested URL returned error: 400 BAD REQUEST
<roaksoax> Jan 25 21:56:24 maas-enlisting-node cloud-init[2325]: + maas-enlist --serverurl http://10.222.222.10:5240/MAAS/api/2.0/machines/
<roaksoax> andrew-ii: can the machines access 10.222.222.10 ?
<roaksoax> andrew-ii: or is that in a different subnet
<andrew-ii> They should - the maas region/rack controller is 10.222.222.10 and all the machines are on the same set of switches
<andrew-ii> That's the PXE address, at least
<roaksoax> andrew-ii: right, so tail -f /var/log/maas/regiond.log while that happens, is there any error ?
<roaksoax> andrew-ii: my guess, however, is that the machine itself cannot ping the above IP
<andrew-ii> Just did it and literally caught it
<andrew-ii> I think this is what you mean? 2017-01-25 16:22:17 -: [info] ::ffff:10.222.222.50 - - [25/Jan/2017:22:22:16 +0000] "POST /MAAS/api/2.0/machines/ HTTP/1.1" 400 119 "-" "curl/7.47.0"
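Spotting a rejected enlistment call like this one in a busy regiond.log can be scripted. A minimal Python sketch that picks out requests with a 4xx or 5xx status; the pattern is derived only from the single line quoted above, so other regiond.log lines may not match it, and `failed_requests` is a hypothetical helper name:

```python
import re

# Pattern for the access-log portion of a regiond.log line, based only on
# the one line quoted in the log; other lines may be formatted differently.
ACCESS_RE = re.compile(
    r'(?P<client>\S+) - - \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3})\s'
)


def failed_requests(lines):
    """Return (client, method, path, status) for every 4xx/5xx request."""
    failures = []
    for line in lines:
        match = ACCESS_RE.search(line)
        if match and int(match["status"]) >= 400:
            failures.append(
                (match["client"], match["method"], match["path"],
                 int(match["status"]))
            )
    return failures


sample = (
    '2017-01-25 16:22:17 -: [info] ::ffff:10.222.222.50 - - '
    '[25/Jan/2017:22:22:16 +0000] "POST /MAAS/api/2.0/machines/ HTTP/1.1" '
    '400 119 "-" "curl/7.47.0"'
)
print(failed_requests([sample]))
```

Feeding the function the lines of /var/log/maas/regiond.log instead of the sample would list every failing request along with the client address that made it.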
<andrew-ii> Running another commission attempt to see if I can see it live
<andrew-ii> cat /tmp/enlist.out yields empty string, which is a bit disappointing
<andrew-ii> I think it was related to https://bugs.launchpad.net/maas/+bug/1520645
<andrew-ii> Basically, I rebuilt the machine, this time adding *all* four NIC MACs instead of just the first. I guess it was attempting to phone home to MAAS via the second NIC, and that MAC wasn't included yet.
<andrew-ii> At any rate, thanks roaksoax! I was wondering how it could possibly boot but not be able to enlist. I guess the node just didn't reply with the first NIC =/
<g3> Okay I'm now on 2.0
<g3> Should I upgrade to 2.1?
<g3> So what does an untagged VLAN mean exactly?
<g3> I setup a subnet with dhcp on a specific port
<andrew-ii> An untagged VLAN is basically traffic that carries no 802.1Q VLAN tag/number, i.e. whatever the port sends and receives by default
<andrew-ii> I'm not sure about dhcp ports (I'm currently trying to get a node to contact the outside world through the region controller)
<g3> I see
<g3> So now that I have MAAS setup with a subnet serving DHCP on a fabric.
<g3> next step would be to try to provision a node?
<andrew-ii> I'd give it a shot! I think PXE is untagged by default, hence why you can't provision on a VLAN.
<g3> I don't think a VLAN is currently needed as I am using an entirely separate network for this?
<andrew-ii> Right
<g3> Can MAAS handle IPMI stuff?
<andrew-ii> At least, I think it's recommended to let maas be on its own hardware; that's what I'm trying
<g3> Like have MAAS provide DHCP and then use those IPs for IPMI management stuff?
<andrew-ii> Generally I think the IPMI stuff is pretty good
<andrew-ii> I have an HP iLo2 Gen 5 server off Craigslist that apparently has a firmware bug that just refuses to play ball with it, but otherwise every other machine seemed to connect well
<g3> Oh, and I see that it will handle APC PDU things. Interesting.
<andrew-ii> I set static IPs for the IPMI connections, but I think the node discovery may actually figure it out for you; I haven't tried that yet
<g3> Yeah it looks like post-commission you can have it set up all of the network interfaces.
<g3> Do you know the difference between commission, acquire and deploy?
<andrew-ii> Commission: bootstrap the machine and get it to report back settings; Acquire: plonk your name on it and reserve the node (I think); Deploy: Spin up the node with an image (full install)
<andrew-ii> Someone probably has a better answer about what Acquire does
#maas 2017-01-26
<g3> I think it is you and I only bud
<g3> no one else in the world
<andrew-ii> Prolly. I'm trying to figure out where the setting is to allow nodes to talk to the outside world
<g3> What do you mean allow them to talk to the outside world?
<andrew-ii> Basically, I have the region (and rack) controller on the office network, and all the nodes on a separate wired network, with the only line out through the region controller. It should act like a proxy, but it only does it for apt out of the box, I think.
<g3> Oooh
<g3> You'll need a VLAN no?
<andrew-ii> I've got a lot of those, too. I'm vaguely following a tutorial by a Juju dev (  http://blog.naydenov.net/2015/11/deploying-openstack-on-maas-1-9-with-juju-network-setup/)
<andrew-ii> Mainly, I'm still learning how the various configurations are set and overridden.
<g3> You'll have one VLAN to do the management and another VLAN on the same port that allows outside access
<andrew-ii> So maas uses squid as a proxy at /var/lib/maas/maas-proxy.conf, and I'm trying to figure out if it's allowing my node to talk to the outside world or not. Hilariously, it's handling outside DNS great, which means I have the opposite problem I had a long time ago
<g3> Fun!
<g3> Do you know the difference between fabric and space?
<andrew-ii> I think a space is an arbitrary designation for grouping things together
<andrew-ii> But a fabric should (at least semantically) be on its own physical network
<andrew-ii> In the tutorial linked there, I set up two fabrics: `maas-external` which is just on the one NIC on the region controller going to the office, and `maas-management` which is all things connected to the maas switches
<g3> So lets say I have two subnets: 1 management and another that is my block of IPs from the isp.
<g3> And when I go to provision new nodes I'd like for MAAS to use the subnet from my ISP and hand out IPs to the nodes
<g3> Is maas capable of doing so while keeping track?
<andrew-ii> I think so
<andrew-ii> But honestly I am not sure exactly how to go about that - I'd guess that after the node is commissioned, you would set the NIC to have its IP autoassigned
<g3> Hmm in the rack controller, there isn't a green check next to the dhcpd
<andrew-ii> That's a bad day - I had that happen too. There can be a few reasons for that from what little I've seen: make sure you're actually serving dhcp on a subnet (make sure there isn't a conflict), make sure there isn't a bogus subnet (like a blank IPv6 one) that is confounding it
<g3> Hmm
<g3> I have the interface set with a CIDR 192.168.1.0/24
<g3> with a gateway at 192.168.1.1
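One quick sanity check for a subnet definition like this is confirming that the gateway address actually falls inside the CIDR. A minimal Python sketch using the standard `ipaddress` module; `gateway_in_subnet` is a hypothetical helper and is not something MAAS itself provides:

```python
import ipaddress


def gateway_in_subnet(cidr, gateway):
    """Return True when the gateway address falls inside the subnet."""
    return ipaddress.ip_address(gateway) in ipaddress.ip_network(cidr)


# Values quoted in the log above.
print(gateway_in_subnet("192.168.1.0/24", "192.168.1.1"))
```

This only rules out one class of misconfiguration; it says nothing about whether dhcpd is running on the rack controller.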
<andrew-ii> I assume that is also set as the region controller's IP address?
<andrew-ii> err, and/or racks? (sometimes that needs a dpkg-reconfigure to get set after install)
<g3> I only have one management node which is a region and a rack?
<g3> 1 controller that is a Region and a rack controller
<andrew-ii> Makes sense.
<g3> And on the subnet that I have set for fabric 3 on interface em3 it says it is serving dhcp
<g3> according to https://docs.ubuntu.com/maas/2.1/en/installconfig-subnets-dhcp#enabling-dhcp
<g3> but ifconfig doesn't show anything on that port
<andrew-ii> I don't think ifconfig will show that, though I don't know enough to say why
<andrew-ii> But if the subnet says that it is serving dhcp, and if rackd shows dhcpd is running, then I'd bet it's working
<g3> So rackd doesn't have a check next to dhcp
<andrew-ii> ah, I was hoping that got fixed for you; hmm, I don't really know too well how to troubleshoot that
<g3> Hmp
<andrew-ii> I'd guess there's a conflict preventing it from starting
<g3> The thing is there is nothing else on that machine
<andrew-ii> perhaps check /var/log/maas/?
<g3> it has a static route on a different port
<g3> 2017-01-25 23:42:43 maasserver.dhcp: [info] Successfully configured DHCPv4 on rack controller 'ncgqxy'.
<andrew-ii> I guess I'm out of my depth on that
<g3> shame
<andrew-ii> Much so. I really wish I understood better how that particular bit works
<g3> me too
<g3> hmm
<g3> http://askubuntu.com/questions/683733/maas-lan-cable-wiring-topology-for-dhcp?rq=1
<g3> did you set your gateway in the subnet to the upstream router perhaps?
<andrew-ii> I don't think so... If I do a wget on my earlier pastebin ( `wget http://pastebin.com/raw/FkMnpY8f` ) it will correctly resolve the ip of it, but it will never connect
<andrew-ii> from a deployed node, that is
<g3> hmp
<andrew-ii> The maas controller is gold
<andrew-ii> If I set the proxy to my maas controller via `export http_proxy=http://10.222.222.10:3128/`, then I get a 403 forbidden. So it's like the built-in maas proxy could work, but it rejects everything
<g3> http://www.openstackbasement.com/maas-network-hardware
<andrew-ii> I think that may be for an earlier version of maas (maybe pre-1.8?)
<andrew-ii> Though that's close to my setup, where I just don't have that red line
<andrew-ii> Part of the problem is the docs and a lot on the net are for pre-maas2.0, and a lot has been added and managed in maas 2, so it's hard to tell what's still meaningful
<andrew-ii> Well, there are a lot of 2.x docs, too. But it's not quite as expansive yet
<g3> yeah
<g3> So I have the checkmark next to the dhcp now
<g3> awesome
<g3> but there is no activity on that port
<g3> no activity light* blinking on that port
<andrew-ii> Might not do much until it commissions/deploys?
<g3> hmp
<g3> on your maas server
<g3> are you doing dhcp?
<g3> if so do you get anything in ifconfig?
<andrew-ii> Quite a bit of it
<g3> ?
<andrew-ii> The maas controller is doing dhcp on all subnets that are not on the wan or office (basically all the 10.xxx stuff)
<g3> but on that interface port?
<g3> like em#
<andrew-ii> ifconfig shows that all my adapters and vlans are live
<g3> hmp
<g3> interesting
<g3> mine doesn't show anything but the main network
<andrew-ii> There's almost no traffic, since only one node is live
<g3> I have em1 which is the network and I'm trying to set em4 to the private network
<andrew-ii> Ah, mine's enp3s0f0 and enp3s0f1
<g3> yeah I don't expect the names to match
<andrew-ii> So the first is the external, and the second is the internal. Everything internal seems to be sane, though I haven't bootstrapped juju or fully configured it all yet
<andrew-ii> Externally, I serve no dhcp
<g3> are you serving dhcp from the region/rack controller?
<andrew-ii> The little office router will handle that (well, the Windows domain controller, I guess)
<g3> I see
<g3> http://pasteboard.co/2EyTOQSDf.png
<andrew-ii> The rack/region controller handles dhcp for everything on the maas management network
<g3> Yeah
<andrew-ii> One sec, snapshotting mine
<g3> also if you could pastebin your cat /var/lib/maas/dhcpd.conf ?
<andrew-ii> http://pasteboard.co/qxGFsGdut.png
<andrew-ii> http://pasteboard.co/qxHrk6m9K.png
<andrew-ii> http://pastebin.com/raw/qYDuH0c1 < cat /var/lib/maas/dhcpd.conf
<andrew-ii> Note that I only configured names. The maas setup automatically filled in all the stuff from my /etc/network/interfaces file
<g3> yeah
<g3> it basically matches mine
<g3> with more vlans
<andrew-ii> I also have to ifup the vlans after startup, because the interfaces file apparently trips over itself if you do it all there at once
<g3> what does your /etc/network/interfaces look like?
<g3> did you have to configure the interface there?
<g3> I got the activity light to come on by setting it in /etc/network/interfaces
<g3> but if I set it to dhcp it doesn't come up
<andrew-ii> I've configured them all there
<g3> I seee
<andrew-ii> http://pastebin.com/raw/BVCfWxan
<andrew-ii> I'm just noticing that there's some iptables stuff there. I dunno if that's not needed anymore; the EC2 redirect was a response to what I think was a bug in the setup, but again, I may need to try it without the rules
<andrew-ii> Note that I commented out the vlan auto enp... lines, since it breaks the setup. I have to ifup the vlans after the network is up
<g3> OH NICE
<g3> *does a dance*
<andrew-ii> Make sense?
<g3> Though I have two of the rackcontrollers popping up
<g3> on 192.168.1.1 and 192.168.1.2
<g3> but the other machine plugged in is popping up on the dhcp
<andrew-ii> Oh, good!
<g3> I don't understand the two, but hey
<g3> Now I see the other machine in the subnet
<andrew-ii> Seems like something that could at least be troubleshot, right?
<g3> how do I now commission?
<g3> hmm
<g3> It isn't showing up in device discovery
<andrew-ii> I've added machines manually; I have a text file with all the machines' MAC addresses and the static IPs I set for the IPMI setup, and then just commission manually
<andrew-ii> I think it will eventually discover the machine, but that's a new feature that I just haven't gotten to trying yet
<andrew-ii> Alright, giving up for the day. Good luck!
<g3> thanks buddy
<g3> now it is stuck on commissioning
<g3> It can reboot
<g3> Hmm, now when I went to commission, it did its PXE thing. Then a new machine popped up
<g3> and then I was able to commission that one?
<g3> acquire
<pmatulis> g3, what are you trying to do? the backscroll is too long
<g3> No worries
<g3> I finally commissioned and acquired
<g3> when I went to deploy
<g3> Stdout: "Error: /dev/sda: unrecognised disk label
<g3> Installation failed with exception: Unexpected error while running command.
<g3> Command: ['curtin', 'block-meta', 'custom']
<g3> Exit code: 3
<g3> Reason: -
<g3> Stdout: "Error: /dev/sda: unrecognised disk label\nError: /dev/sdb: unrecognised disk label\nno disk with serial 'PHFT6401010P800CGN' found\n"
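A small triage sketch for the curtin output pasted above: it separates the "unrecognised disk label" complaints from the "no disk with serial" complaint, so it is easier to see which disks and which serial the storage config tripped over. The function name and the returned keys are made up for illustration; only the message wording comes from the paste.

```python
import re


def summarize_curtin_stdout(stdout: str) -> dict:
    """Group the two kinds of complaint seen in the pasted curtin Stdout."""
    # Disks that parted could not read a partition table from.
    unlabeled = re.findall(r"Error: (/dev/\w+): unrecognised disk label", stdout)
    # Serials named in the storage config that matched no disk at deploy time.
    missing = re.findall(r"no disk with serial '([^']+)' found", stdout)
    return {"unlabeled_disks": unlabeled, "missing_serials": missing}


# The Stdout string exactly as pasted in the channel.
sample = (
    "Error: /dev/sda: unrecognised disk label\n"
    "Error: /dev/sdb: unrecognised disk label\n"
    "no disk with serial 'PHFT6401010P800CGN' found\n"
)

print(summarize_curtin_stdout(sample))
```

A missing serial suggests the disk inventory recorded at commissioning no longer matches what the deploy environment sees, which would be one reason to recommission before redeploying.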
<g3> Most of the backscroll is myself and andrew-ii trying to get my dhcp to work
<g3> I think I sorted that out.
<g3> Though the machine wasn't showing up in the device discovery, so I manually added a node.
<g3> using the IPMI 2.0 I was able to reboot / commission
<g3> but the node that I created hung on commissioning, while a new node popped up with a random name (warm-spider.maas)
<g3> I was able to then commission that node (which was the one that I created...as that is the only machine plugged in)
<g3> and it had the same DHCP ip of x.x.x.101
<g3> then after it commissioned I acquired it, set up a RAID1 on sda&sdb
<g3> went to deploy
<g3> and it failed
<g3> So now I released it and am currently erasing the disks
<pmatulis> you don't need to acquire. that's really just used if a user wants to reserve a node. maas will never automatically use a node that is acquired
<g3> ok so the correct path is to commission then deploy?
<pmatulis> g3, yes. curious. did you start by reading the maas documentation?
<g3> I did
<g3> but frankly, it seems a little spotty
<pmatulis> this gives a runthrough: https://docs.ubuntu.com/maas/2.1/en/installconfig-checklist
<pmatulis> subheadings: 'Commission a node', then 'Deploy a node'
<g3> Okay
<g3> so what does acquiring actually do?
<g3> And under DHCP I had to setup the MAAS interface (em4 for instance) under /etc/network/interfaces as a static route
<g3> then in MAAS give that interface a static route as well
<pmatulis> g3, re acquire, i pretty much told you
<pmatulis> but you can read here: https://docs.ubuntu.com/maas/2.1/en/intro-concepts
<pmatulis> (see 'acquire' action and 'allocated' status)
<g3> Okay
<g3> thanks
<g3> Where did the random name come from?
<g3> warm-spider
<pmatulis> when used with a service orchestration tool, such as Juju, acquire happens under the hood
<pmatulis> g3, from some python library probably
<pmatulis> (random names)
<g3> fair enough
<pmatulis> g3, i have never had to set up a static route in MAAS yet. why did you do that?
<g3> Because nothing was coming up?
<g3> I went to the VLAN and setup DHCP
<g3> went to the interface on the region / rack controller and set it to DHCP
<g3> but nothing was showing up in MAAS or in ifconfig, and the activity light wasn't blinking on the port
<g3> Only once I setup static routes for MAAS did the green check next to DHCPD on the region/rack controller page appear
<g3> And only then did the other machine I am attempting to deploy show up in the node list
<pmatulis> you set the interface on the region/rack controller to 'DHCP'?
<g3> I did, but nothing was happening
<g3> I can remove the static from /etc/network/interfaces and give it a go again
<pmatulis> that's not how you enable DHCP
<pmatulis> i recommend you read the docs: https://docs.ubuntu.com/maas/2.1/en/installconfig-subnets-dhcp
<pmatulis> anyway, i need to go to bed. please read the docs. if you find them lacking kindly open a bug. there is a link at the bottom of the menu (left pane)
<g3> I have
<g3> and I've done that
<g3> I've now removed the route from /etc/network/interfaces
<g3> in the vlan on the 192.168.1.0/24 subnet I have enabled the dhcp
<g3> x.x.x.101-x.x.x.254
<g3> in the region/rack controller I have the em4 interface set to dhcp
<g3> but now the check mark next to the dhcpd is gone and there is no activity on the port again
<g3> http://pasteboard.co/qzuzQs9d1.png
<ihih> hi
<g3> http://pasteboard.co/qzv9izzdK.png
<ihih> help configuring maas?
<g3> http://pasteboard.co/qzvF1IMh5.png
<g3> help me?
<ihih> how?
<g3> Well first I can't get the dhcp to work unless I set a static in MAAS and in /etc/network/interfaces
<ihih> is there another dhcp server on the same network?
<g3> No
<g3> This side of the network is all static routes as it is our public IP block
<g3> and I am setting up MAAS on a private OOB network
<ihih> cool
<ihih> its supposed to be static
<g3> Hmm okay
<g3> pmatulis said otherwise, but that is the only way I've been able to get it to work
<g3> okay lets say it is working
<g3> It wasn't discovering the device
<g3> So I added it manually and commissioned the node
<g3> it then hung at commissioning forever, while a new node of random name popped up
<g3> with the correct cpu, mem, drives, etc.
<g3> I commissioned that one. then acquired. then deployed with a raid1
<ihih> yup that happens
<g3> which part?
<g3> random name?
<ihih> the entire thing
<g3> okay
<g3> then it failed though
<ihih> commissioning failed?
<g3> deploy
<ihih> oh
<g3> Stdout: "Error: /dev/sda: unrecognised disk label\nError: /dev/sdb: unrecognised disk label\nno disk with serial 'PHFT6401010P800CGN' found\n
<ihih> well, I tried to configure it for openstack
<ihih> and juju bootstrap failed
<g3> And how hmp
<ihih> im guessing youre stuck where i am
<ihih> nodes arent properly configured
<ihih> how have you networked the nodes?
<g3> I currently only have the node connected to the OOB network
<g3> I did see lots of 401's popping up
<ihih> is your region controller node configured to access the internet
<g3> It is, but I don't have that subnet forwarded to the nodes
<ihih> is your network configured to something like this? - www.openstackbasement.com/maas-network-hardware/openstack%20network.png?attredirects=0
<g3> minus the 5port dumb switch yes
<ihih> have you updated the boot images?
<g3> I only pulled them down maybe 4 hours ago
<ihih> have you tried recommissioning the same node again?
<g3> I am doing it right now
<g3> without the raid1
<ihih> okay
<g3> Though now it appears that the machine is off and it is stuck at deploying
<ihih> Give it some time
<ihih> say 10-15 min
<ihih> which maas version are you using?
<g3> 2.1
<g3> Okay well I'll head home then check on it in 30 min or so
<ihih> 2.1 out? updating maas only gets me 1.9
<g3> are you on 14.4?
<g3> 14.04
<g3> https://docs.ubuntu.com/maas/2.1/en/installconfig-upgrade-to-2
<ihih> yep
<g3> states you need xenial for 2.0+
<ihih> right
<g3> for 2.0 and above**
<ihih> is your node configured to wake on lan?
<g3> hello
<g3> so I uploaded a sshkey to the MAAS admin account. Deployed a server, but I can't access it?
<g3> ssh ubuntu@<node> asks for a pass
<g3> Oh it gave it a ip from the reserved block. got it.
<mup> Bug #1656717 opened: Juju -> MAAS [2.2+] API integration needs to account for null spaces <juju:Triaged> <MAAS:In Progress by mpontillo> <https://launchpad.net/bugs/1656717>
<mup> Bug #1643001 changed: Moonshot iLO4 'Power HW address' prevent ipmitool from working <MAAS:Expired> <https://launchpad.net/bugs/1643001>
<mup> Bug #1659538 opened: juju deploy fails, unable to contact charm store <MAAS:New> <https://launchpad.net/bugs/1659538>
<mup> Bug #1657491 changed: Several IPs assigned to the same iface in the DB due to 'free' leases <sts> <MAAS:Won't Fix> <https://launchpad.net/bugs/1657491>
<mup> Bug #1659607 opened: [2.1.3] Can't link subnet with link_up from maas API as in WebUI- Cannot configure interface to link up (with no IP address) while other links are already configured. <oil> <MAAS:New> <https://launchpad.net/bugs/1659607>
<mup> Bug #1659613 opened: [Map subnet] When the user takes action Map subnet, (s)he gets a confirmation message including the link to the dashboard, while the mapping has not started yet and they have to click the "Go" button <ui> <MAAS:New> <https://launchpad.net/bugs/1659613>
<mup> Bug #1659642 opened: maas manpage? <MAAS:New> <https://launchpad.net/bugs/1659642>
<mup> Bug #1659672 opened: [2.1, rev5670] Can't release machines previously deployed/failed deployment/or newly deployed after upgrade to latest trunk <oil> <oil-2.0> <MAAS:Confirmed for ltrager> <https://launchpad.net/bugs/1659672>
<Teranet> Question where do I report a BUG ???
<pmatulis> Teranet, https://bugs.launchpad.net/maas/+filebug
<Teranet> Thank you
#maas 2017-01-27
<mup> Bug #1659692 opened: MAAS Nodes don't rescan RAM when rebooted <MAAS:New> <https://launchpad.net/bugs/1659692>
<mup> Bug #1659694 opened: Cannot set min kernel to hwe-x <MAAS:New> <https://launchpad.net/bugs/1659694>
<andrew-ii_> Does the maas controller only proxy apt and ubuntu images? Or, does it need to be reconfigured to allow nodes access to the internet?
<andrew-ii_> I'm not sure, but I'd guess that it would have to do with the rules in the maas squid configuration? The file says not to directly modify it, so ... is there another way?
<andrew-ii_> Or is it only reasonable to dedicate one NIC on each node to the public network and (at least) one to the internal maas network?
<pmatulis> andrew-ii_, maas has a general HTTP proxy that nodes can partake in
<andrew-ii_> Is it configured by default?
<pmatulis> yes
<andrew-ii_> Huh. That's both exactly what I wanted to hear and really disheartening
<pmatulis> andrew-ii_, what do you mean?
<andrew-ii_> Basically, a deployed node can resolve DNS fine, but it never seems to connect to outside sources
<andrew-ii_> As in, a wget will properly be sent, but fails to ever retrieve
<pmatulis> can the maas server contact outside sources?
<andrew-ii_> Yes
<pmatulis> is the maas server also set as the default gateway (for the nodes)?
<andrew-ii_> Yes? It is for every subnet
<pmatulis> is the maas server able to route node traffic to the internet?
<andrew-ii_> I feel like this is where I'm dropping the ball on the configuration: the maas controller does not seem to route traffic to the internet
<pmatulis> so you probably need to set up masquerading on the maas server
<andrew-ii_> Ah, so that really is the solution. I had seen reference to it, but I hadn't really understood if that was a workaround or a special case
<andrew-ii_> Do you have a reference implementation off the top of your head?
<andrew-ii_> I'm afraid I am not familiar with proper iptables configuration details yet, but I'll learn if that gets the nodes online
<pmatulis> i don't have anything specific on hand
<andrew-ii_> (I'm currently at the "chinese room" stage of network configuration understanding)
<andrew-ii_> No worries. That basically sets me on a course
<pmatulis> but you should prolly capture packets on the maas server when a node tries to access an internet resource
<pmatulis> to see what's going on
<andrew-ii_> Yep. That would probably be a very sensible thing to do
<pmatulis> confirm that the requests are at least hitting the maas server but not going out the other interface
<andrew-ii_> Really wish that occurred to me sooner :P
<andrew-ii_> Do I need to do anything to the nodes? Like set an http_proxy environment variable?
<andrew-ii_> Or should the maas controller take the request and route it transparently?
<pmatulis> that should already be set
<andrew-ii_> Cool
<pmatulis> also, confirm that the maas proxy config file has an ACL for your particular subnet. it is supposed to include one automatically but i have seen it happen that this is broken
<andrew-ii_> Last I checked it did, but I'm double checking now
<pmatulis> /var/lib/maas/maas-proxy.conf
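One way to do the ACL check pmatulis describes without eyeballing the file: a minimal sketch that reads `acl localnet src ...` lines (the form andrew-ii quotes from his own config) and tests whether a node's address falls inside any of them. The sample config text, the node address, and the helper names are assumptions for illustration; only plain CIDR entries are handled, and anything else on an ACL line is skipped.

```python
import ipaddress


def localnet_acls(conf_text):
    """Collect the CIDR networks listed on 'acl localnet src' lines."""
    nets = []
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[:3] != ["acl", "localnet", "src"]:
            continue
        for token in parts[3:]:
            if token.startswith("#"):
                break  # rest of the line is a comment
            try:
                nets.append(ipaddress.ip_network(token, strict=False))
            except ValueError:
                pass  # not a plain CIDR entry; ignored in this sketch
    return nets


def is_allowed(node_ip, conf_text):
    """True if node_ip is inside at least one localnet ACL network."""
    ip = ipaddress.ip_address(node_ip)
    return any(ip in net for net in localnet_acls(conf_text))


# Hypothetical excerpt; on a real controller, read /var/lib/maas/maas-proxy.conf.
sample_conf = """\
acl localnet src 10.222.222.0/24
acl localnet src 10.10.0.0/16  # another managed subnet
"""

print(is_allowed("10.222.222.101", sample_conf))
print(is_allowed("192.168.1.50", sample_conf))
```

If a node's subnet is missing here, a 403 from the proxy is the expected symptom, since requests from addresses outside the ACL are denied.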
<pmatulis> you can also set a custom/upstream proxy
<pmatulis> read here: https://docs.ubuntu.com/maas/2.1/en/installconfig-proxy
<andrew-ii_> Nothing upstream yet, and the proxy has the acl localnet src 10.x.x.x subnets
<pmatulis> and this is the maas-proxy log, to see HIT/MISS or DENIED: /var/log/maas/proxy/access.log
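To make the HIT/MISS/DENIED check concrete, here is a rough sketch that tallies result codes per client from the proxy access log. It assumes the log uses squid's default native format (timestamp, elapsed time, client address, result code/status, ...), which may not hold if the format has been customised. The two sample lines are invented for illustration.

```python
from collections import Counter


def count_results(log_text):
    """Count (client, result_code) pairs, e.g. ('10.0.0.5', 'TCP_DENIED')."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        # Native format: time elapsed client code/status bytes method URL ...
        if len(fields) < 4 or "/" not in fields[3]:
            continue
        client = fields[2]
        result = fields[3].split("/", 1)[0]
        counts[(client, result)] += 1
    return counts


# Invented sample; on a real controller, read /var/log/maas/proxy/access.log.
sample_log = (
    "1485500000.123      5 10.222.222.101 TCP_DENIED/403 3900 GET "
    "http://pastebin.com/raw/FkMnpY8f - HIER_NONE/- text/html\n"
    "1485500001.456    120 10.222.222.101 TCP_MISS/200 1024 GET "
    "http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease - HIER_DIRECT/91.189.88.152 -\n"
)

for (client, result), n in sorted(count_results(sample_log).items()):
    print(client, result, n)
```

A client that only ever shows TCP_DENIED points at the ACL; a client that never appears at all points at routing or at the node not using the proxy.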
<andrew-ii_> Alrighty, releasing, redeploying, and checking this out fresh
<andrew-ii_> Is there any reason to put the maas proxy in the proxy field? That would be redundant, right?
<pmatulis> right, redundant
<andrew-ii_> Like if my controller was 10.0.0.1, putting http://10.0.0.1:3128 in the field is silly
<pmatulis> right
<andrew-ii_> ok, that's a whole set of things I need not try again
<andrew-ii_> Oh, and thanks a mil - just getting confirmation of proper/expected behavior is huge! It will save me huge amounts of second guessing
<pmatulis> i guess the docs are not clear then. please open a bug if you have suggestions on how to improve them: https://github.com/CanonicalLtd/maas-docs/issues/new
<andrew-ii_> My money is that it's just a hidden learning curve for newbs
<andrew-ii_> I know just enough to sorta make sense of it, but not enough to be wise to the actual cause of the trouble
<pmatulis> alright, i need to retire. do leave a msg on how you make out and i'll see it tomorrow
<andrew-ii_> I appreciate the help immensely; Sleep well!
<andrew-ii_> pmatulis: Turns out your first note was spot on: I needed to properly set up the iptables masquerade. I followed the setup at the end of http://askubuntu.com/a/594723 - I think my settings were just out of place *and* I missed the net.ipv4.ip_forward=1 setting. Way back I had a configuration that *did* have it set, but then the masquerade was wrong.
<andrew-ii_> The answer links to a tutorial at http://wiki.cloudbase.it/maas that notes the settings that set the NAT from the private network to the public.
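For anyone landing here with the same problem, the two pieces andrew-ii names (IP forwarding plus a masquerade rule out the external interface) can be sketched as below. This only builds and prints the commands; it does not run them. The subnet `10.0.0.0/8` is an assumption standing in for "all the 10.xxx stuff", and `enp3s0f0` is the external interface name mentioned earlier in the log; substitute your own. The exact rules in the linked answer may differ.

```python
import ipaddress


def nat_commands(internal_subnet, external_iface):
    """Build the commands that enable forwarding and source NAT for a subnet."""
    net = ipaddress.ip_network(internal_subnet)  # raises ValueError on a bad CIDR
    return [
        # Let the kernel forward packets between interfaces.
        ["sysctl", "-w", "net.ipv4.ip_forward=1"],
        # Rewrite the source of traffic from the internal subnet as it
        # leaves through the external interface.
        ["iptables", "-t", "nat", "-A", "POSTROUTING",
         "-s", str(net), "-o", external_iface, "-j", "MASQUERADE"],
    ]


for cmd in nat_commands("10.0.0.0/8", "enp3s0f0"):
    print(" ".join(cmd))
```

Neither setting survives a reboot when applied this way, and a restrictive FORWARD chain policy would need matching accept rules as well.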
<jlec__> Hi roaksoax Do you have time for a short query?
<brendand> jlec__, roaksoax isn't around at this time, what's your query?
<jlec__> brendand: lou peers referenced me to roaksoax regarding some questions I have around Maas.
<brendand> jlec__, you can ask them here and he'll see them, or maybe someone else can help
<jlec__> I am looking for documentation for creating custom images.
<jlec__> Secondly, the CentOS 7 images from Canonical do not work for me. MAAS will not assign IP addresses to the nodes
<brendand> jlec__, are you using MAAS 2.x?
<jlec__> brendand: yes, latest on the LTS
<zeestrat> Hey, is there a recommended boot device order for nodes managed by MAAS? |PXE,Disk|, |Disk,PXE| or just |PXE|?
<MrLeau> I believe PXE has to be first, but for the rest I don't know
<brendand> zeestrat, always pxe first. i believe it *should* work with just pxe, but not positive
<vogelc> has anyone had success deploying centos with LVM?  All my nodes are deployed with flat storage.
<roaksoax> jlec: maas doesn't configure networking manually for centos. we only assign a DHCP address for the PXE interface in CentOS
<roaksoax> zeestrat: if your machine supports being told to PXE every time MAAS power manages it, it doesn't matter
<roaksoax> zeestrat: maas will tell the machine to PXE every time when you do an action, but if that is not supported in the BMC, then you will need to make sure to set the boot order
<vogelc> Trying to install a custom storage layout with LVM.  all the partitions are created but curtin fails because it tries to install grub on /dev/dm-0.  Is there a way to direct curtin not to try /dev/dm-0?
<mup> Bug #1659891 opened: Trusty deploy fails, with unrecognised disk label, on nodes commissioned with Xenial <cdoqa> <MAAS:New> <https://launchpad.net/bugs/1659891>
<mup> Bug #1659891 changed: Trusty deploy fails, with unrecognised disk label, on nodes commissioned with Xenial <cdo-qa-blocker> <cdoqa> <MAAS:Incomplete> <https://launchpad.net/bugs/1659891>
<thiagolib> Hi, I'm having the following problem with maas-region-controller on ubuntu 16.04 when trying to reinstall the package I'm getting the errors below.
<thiagolib> https://pastebin.ubuntu.com/23875672/
<thiagolib> I believe it is because it verifies that there is already a database and an admin user.
<stokachu> thiagolib, ah i see you're here
<thiagolib> stokachu: I came in a few minutes ago.
<stokachu> thiagolib, cool, these be the people that can help with your problem
<mup> Bug #1659948 opened: stuck tgt and tgt-admin, super slow boot <landscape> <MAAS:New> <https://launchpad.net/bugs/1659948>
<cj9255> Anyone do freelance for MaaS Deployment?
<bdx> just wondering ... is there a kub power driver floating around anywhere?
<mup> Bug #1659959 opened: [2.2b1] Cannot add a device from the dashboard  <MAAS:Confirmed> <https://launchpad.net/bugs/1659959>
<mup> Bug #1659960 opened: LVM support in non Ubuntu OS <MAAS:New> <https://launchpad.net/bugs/1659960>
<g3> Hello there
<g3> When using device discovery, can one add that as a machine?
<g3> Hmmm so I went to commission four new servers
<g3> and in MAAS they are showing up with no drives
<g3> But in the full output of 00-maas-07-block-devices.out from the commissioning script
<g3> it found all 15 drives
<g3> but it found them on the second try at commissioning?
<g3> andrew-ii where you at!
<andrew-ii> Hi!
<andrew-ii> g3 - I dunno about it
<andrew-ii> I would actually be curious to know if the system can use any previous knowledge when commissioning a second time
<andrew-ii> If so, that may explain why it figured out your drive array the second time (but I wouldn't know :( )
<mup> Bug #1659982 opened: [2.2b1, UI] When in the node listing, the top menu disappears when scrolling down (when too many machines) <MAAS:Triaged> <https://launchpad.net/bugs/1659982>
#maas 2017-01-28
<g3> oh shame
<mup> Bug #1659986 opened: [2.1.3] Curtin fails to deploy Centos on HP DL360 G9 - FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpt84rtm0x/target/boot/efi/EFI' <oil> <curtin:New> <MAAS:New> <https://launchpad.net/bugs/1659986>
<g3> Hmm if I add another network interface when I deploy it all fails
<andrew-ii> g3: That... is really weird. As in, you add to the commissioned node?
<g3> I have a OOB network and a network with the internet connection
<g3> If I setup the interface on the node with the internet connection
<g3> it fails when it tries to download something
<g3> if I just use the OOB network it works..
<andrew-ii> I wonder if it's using the new interface instead of the old as the main route for internet traffic?
<andrew-ii> One of my nodes did that, where the first NIC wasn't actually the one that was the main connection, and so commissioning failed (if it didn't include all the MACs)
<g3> okay when I have a second interface added into the interfaces of the node
<g3> I see
<g3> searching for network data from DataSourceMAAS
<g3> and it fails there
<g3> https://www.stackevolution.com/node/20
<mup> Bug #1467465 changed: Hardcoded deployment timeout of 40 minutes <canonical-bootstack> <timeout> <windows> <MAAS:Triaged> <https://launchpad.net/bugs/1467465>
<mup> Bug #1659986 changed: [2.1.3] Curtin fails to deploy Centos on HP DL360 G9 - FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpt84rtm0x/target/boot/efi/EFI' <oil> <curtin:New> <MAAS:New> <https://launchpad.net/bugs/1659986>
<mup> Bug #1644920 changed: Image selection unclear <docteam> <MAAS:Expired> <https://launchpad.net/bugs/1644920>
<mup> Bug #1645057 changed: [web UI] erasure options during Release do not reflect chosen default <docteam> <MAAS:Expired> <https://launchpad.net/bugs/1645057>
#maas 2017-01-29
<mup> Bug #1660169 opened: Upgrade resulted in a rack-controller with server BMC info <MAAS:New> <https://launchpad.net/bugs/1660169>
<mup> Bug #1660171 opened: [2.2b1] Machines did not turn off after enlistment <MAAS:New> <https://launchpad.net/bugs/1660171>
<mup> Bug #1660182 opened: not able to locate regiond connection issue log messages in regiond.log and rackd.log <oil> <MAAS:New> <https://launchpad.net/bugs/1660182>
<mup> Bug #1660185 opened: [2.2b1] Deploying and Deployed still look too similar <MAAS:New> <https://launchpad.net/bugs/1660185>
<mup> Bug #1660188 opened: [2.2b1] Devices cannot be edited in the devices listing <MAAS:New> <https://launchpad.net/bugs/1660188>
<g3> hey yo
<g3> So here is the deal: I have two NICs. One OOB management and another that is for data/network.
<g3> When I deploy the system with the two interfaces configured, it sets the main gateway/route through the OOB network
<g3> which doesn't allow data to be routed to the interwebs.
<g3> Is there anything I can do for this? hmm
<g3> exit
<mup> Bug #1660211 opened: MAAS fails to properly configure two NICs with different subnets <MAAS:New> <https://launchpad.net/bugs/1660211>
#maas 2018-01-22
<mup> Bug #1741216 opened: Instance launch failed with error 'No valid host was found' , getting 'Unable to refresh my resource provider record' error continuously on all computes <juju:Incomplete> <MAAS:New> <https://launchpad.net/bugs/1741216>
<vtriple> Jan 22 02:04:51 vhost02 cloud-init[2617]: Exception in thread smartctl-validate (id: 679, script_version_id: 1): Jan 22 02:04:51 vhost02 cloud-init[2617]: Traceback (most recent call last): Jan 22 02:04:51 vhost02 cloud-init[2617]:   File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner Jan 22 02:04:51 vhost02 cloud-init[2617]:     self.run() Jan 22 02:04:51 vhost02 cloud-init[2617]:   File "/usr/lib/python3.5/threading.
<vtriple> AttributeError: 'str' object has no attribute 'get'
<vtriple>  model = value.get('model')
<vtriple> any ideas?
<vtriple> I'm not sure what's wrong here
<vtriple> well i found the python file at least on maas so i should be able to figure it out
<vtriple> so there is a check for not model and not serial but it's failing
<vtriple> because it doesn't ensure they are there before the attempt
<mup> Bug #1744072 opened: MIR Chrony in 18.04 <NTP Charm:New> <ceph (Ubuntu):New> <chrony (Ubuntu):New> <cloud-init (Ubuntu):New> <maas (Ubuntu):New> <https://launchpad.net/bugs/1744072>
<vogelc> roaksoax: Checking in to see if you have any ideas about why twistd is exploding?
<xygnal> roaksoax filed bug 1744765
<xygnal> roaksoax let me know what else you need attached, provided, or confirmed
<mup> Bug #1744802 opened: create postgresql 9 to 10 transition for bionic upgrades <MAAS:New> <https://launchpad.net/bugs/1744802>
#maas 2018-01-23
<mup> Bug #1733900 changed: [2.3final, UI] Machines that have failed testing don't have an error icon <2.3qa> <ui> <MAAS:Expired> <https://launchpad.net/bugs/1733900>
<ejat> anyone here?
<xygnal> roaksoax: around?
<mpontillo> xygnal: I think he's out sick today; anything I can help with?
<xygnal> mpontillo: we have a bug report open for a very impactful issue
 * mpontillo just found the "exploding twisted" bug; not good
<xygnal> on top of the exploding part, its really slow.  it seems to always be loading the nodes list, and its always sooooo sloooow to load
<mpontillo> xygnal: are you saying this bug did NOT occur prior to upgrading to MAAS 2.3? (that's interesting; I'm not aware of any changes that should have significantly impacted UI scalability between those two releases.)
<xygnal> we added quite a few nodes to the system since the last restart of MAAS, I believe.  We also had some crazy behavior in the past after a region restart, so this could be a bug we 'got around' before that is more exposed now.
<xygnal> even after eliminating any swap devices, i am seeing I/O wait times of 32 seconds sometimes
<xygnal> no idea what MAAS is doing in those moments to queue so hard
<xygnal> but it could have something to do with the fact it burns through all the memory in minutes
<mpontillo> xygnal: might be good to get some data on what postgresql is doing. I can imagine it might be worse if you've patched for meltdown/spectre as well... meanwhile, I wonder if you could help us get some test data from your environment?
<xygnal> we've got monitors on the pgsql host and its performance so far is largely idle
<xygnal> yes please, what can I collect?
<mpontillo> xygnal: if you can find out what queries MAAS executes just prior to the crash, (like, when you first load the page) that would be helpful. I'm trying to figure out if we can easily have you dump the database minus the OS images, but it's non trivial it seems
<xygnal> how can i get those queries logged?
<mpontillo> xygnal: it is possible that we're loading up the websocket connection with a huge amount of results, which causes the OOM situation. it is difficult to turn on logging in pgsql without logging way too much though. I'll look into it...
<xygnal> we dont have direct access to pgsql box, it's on a box provided as a service.  hm. we might have a non-root login.
<xygnal> i was hoping we could dump that kind of info out of the regiond itself
<mpontillo> xygnal: so it might be nice to confirm that it's the act of loading the machines listing itself that causes the OOM situation. here's something you might be able to do https://paste.ubuntu.com/26446310/
<mpontillo> xygnal: you'd replace "mpontillo" in the example code with an admin username in MAAS, and run that after typing "sudo maas-region shell".
<mpontillo> xygnal: that should run the database fetching outside the context of the region server - rather, in the Python shell itself. so if that process dies, that could confirm where the bug is
<mpontillo> xygnal: here's a version you can copy/paste without thinking about it. https://paste.ubuntu.com/26446337/
<xygnal> its running. still waiting.
<xygnal> cd
<xygnal> oops
<mpontillo> xygnal: the other thing I was wondering: about how many concurrent UI sessions would you say are open? is it just the one?
<xygnal> if i reset the region controller and login as the first user
<xygnal> and go to nodes
<xygnal> it does not freak out
<xygnal> it just runs very slowly
<xygnal> if i try to 'reload' the page
<xygnal> that happens
<xygnal> as if it could not finish its first scan and the second scan called it to freaaaak-out
<xygnal> çç
<gimmic> Can maas set the dns search suffix even if it is not running/authoritative DNS?
<xygnal> curse my lazy bluetooth keyboard switching.
<xygnal> box tanked so hard i can't even SSH in.  waiting for it to settle down.
<mpontillo> xygnal: wow, thanks for confirming
<xygnal> it should be handling it more gracefully than that if it's memory. i removed all swap, so it should oom_kill as soon as it hits 24GB
<mpontillo> gimmic: currently no, we use the list of all authoritative domains as the search list, and place the domain the machine is actually in first in the list
<mpontillo> gimmic: as of MAAS 2.3 anyway - I think there was some inconsistent behavior prior to that
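The ordering mpontillo describes can be illustrated with a toy function: every authoritative domain ends up in the search list, with the machine's own domain moved to the front. This is not MAAS code, just a sketch of the stated rule; the function name and the example domain names are invented.

```python
def build_search_list(authoritative_domains, machine_domain):
    """Machine's own domain first, then the other authoritative domains in order."""
    rest = [d for d in authoritative_domains if d != machine_domain]
    return [machine_domain] + rest


# Invented domains for illustration.
domains = ["maas", "lab.example.com", "prod.example.com"]
print(build_search_list(domains, "lab.example.com"))
```

This also shows why gimmic needs a follow-up pass with ansible: under this rule a domain that MAAS is not authoritative for never makes it into the list.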
<gimmic> Okay, we currently come back around with ansible to 'fix' this, but it would be nice to deploy it out of the gate or have an option to deploy it that way
<mpontillo> gimmic: how would you want that to look? a per-domain flag to indicate if the domain should be in the search list?
<gimmic> hmm. Maybe by zone
<gimmic> domain would likely work too
<mpontillo> gimmic: I think they're effectively the same thing to MAAS right now anyway; we derive things like reverse zones, not sure about sub-zones of a domain, but I would think you could model it however it suits you
<xygnal> mpontillo:  stopped/started regiond.  your command works before I login to UI, your command works after I login to UI, your command continues to work until free memory hits 0
<mpontillo> xygnal: wait, are you saying that more memory is consumed each time you run it in the same python shell?
<xygnal>  no no.  I mean i dont see the memory issue at all if i restart maas-regiond and dont login to the UI.
<xygnal> even when i am in the UI and its loading the list of nodes so slowly, that command returns just fine
<xygnal> its not the UI itself that is slow but the 'loading' of the nodes in the nodes page.
<xygnal> and if you refresh the nodes page
<xygnal> that memory issue kicks up and it is killed soon after
<xygnal> the slowness and the oom condition may not be directly the same issue, just appearing at the same time
<xygnal> trying to get bumped up to 32gb today
<xygnal> just to be sure it actually uses all of that
<mpontillo> xygnal: yeah, it's odd that the refresh itself seems to push it over the edge; I'm guessing maybe it can handle one session with all that data loaded, but when you refresh, the old session doesn't immediately go away.
<xygnal> its more than a double in memory jump
<xygnal> quit
<mpontillo> xygnal: I'll keep poking at it. one thing I did was open the network inspector in Chrome and look at what the websocket was doing on a large MAAS. I noticed that it seems to make a lot of requests with regard to the device discovery listing. I wonder if it might help for you to look at the same data on your system to see what it's up to
<catbus> wililupy: hi
<mpontillo> xygnal: that is, if you open the javascript console and select the "network" tab, then find the entry for "ws?csrftoken=...", then click the "Frames" tab, you'll be able to see what data the UI is requesting and receiving. that might tell us more about what the UI is so obsessed about that it needs to consume so much memory =)
 * mpontillo needs to step out for lunch, back later
<wililupy> Hi catbus! How are you doing?
<catbus> wililupy: Hey, I am good. How are you?
<catbus> wililupy: I have some questions about the demo here: https://insights.ubuntu.com/2017/03/01/devops-for-netops/ wonder if you can help clarify.
<wililupy> catbus: Good. I'm glad you are doing well as well.
<wililupy> catbus: I remember that article. I remember doing the demo as well. What are your questions?
<catbus> wililupy: the wedge 100 running MAAS, is it an ONIE-based wedge 100? Is it classic ubuntu running on the switch?
<catbus> wililupy: then what image does MAAS deploy to the wedge 40? assuming wedge 40 is also onie-based?
<wililupy> catbus: Yes, it is the Accton Wedge 100. It was running Ubuntu 16.04 with MAAS installed
<wililupy> catbus: MAAS deployment depended on 2 things for the Wedge. It can either use PXE and install Ubuntu Classic on the switch, or we could use ONIE to install an ONIE image that is hosted on the MAAS ToR Switch.
<wililupy> catbus: If we wanted to use ONIE, we disabled PXE boot on the Wedge; if we wanted to deploy and manage the switch from MAAS, we enabled PXE on the Wedge and deployed it just like we would a server.
<catbus> wililupy: ok, in the demo, wedge 40 uses pxe, and it was classic ubuntu running on top of it deployed by maas, do I read it right?
<wililupy> catbus: The demo we did at OCP last year was slightly different in that when MAAS enlisted a node, it would detect it was a switch, and then when we commissioned it would deploy Ubuntu 16.04 and then deploy the SONIC Snap automatically and build the required Kernel Modules needed for the ASIC to function.
<wililupy> catbus yes ma'am.
<catbus> wililupy: how does it deploy the SONIC snap automatically?
<catbus> node-specific preseed?
<catbus> how does maas know it's a switch..?
<wililupy> catbus: it was for our demo so we had a custom preseed and custom image and some other customizations with MAAS to get this to work.
<catbus> ok
<wililupy> catbus: basically, when enlisting the node, it would detect the ASIC during the lspci and the dmidecode, and then MAAS would tag the device as a switch. That is actually stock now in MAAS 2.3
#maas 2018-01-24
<hkominos> hi guys! Just a quick question. After playing around a bit with maas I have ended up unable to start the regiond
<hkominos> it complains with a python exception The SECRET_KEY setting must not be empty
<hkominos> Where is this key loaded from ?
<hkominos> thx for your help
<hkominos> resolved.
<mup> Bug #1745198 opened: [2.3] kernel_opts is not documented in `maas admin tag update -h` <cpe-onsite> <MAAS:New> <https://launchpad.net/bugs/1745198>
<mup> Bug #1745230 opened: region controller not starting <MAAS:New> <https://launchpad.net/bugs/1745230>
<mup> Bug #1745230 changed: region controller not starting <MAAS:Won't Fix> <https://launchpad.net/bugs/1745230>
#maas 2018-01-25
<Mashkoor> hi team
<Mashkoor> can I use MAAS for my LAN Infra for autodeployment of different operating systems? like windows and linux
<hkominos> Greetings. I am having a weird issue with maas-tftp. It is veeeeeeeeery slow. (The issue does not appear to be networking related. All machines are on the same host). Has anybody seen similar behaviour on aarch64 ?
<hkominos> I googled a bit and saw that tftblocksize could be increased to help. Can that be done on maas-tftp?
<roaksoax> hkominos: did you upgrade due to meltdown/spectre ?
<roaksoax> hkominos: it could be related to that
<roaksoax> hkominos: is your maas running on a VM or on actual hardware?
<hkominos> it is running on a VM. and we are booting both physical hardware and Vms (aarch64 all of them). This issue has been here for some time.it takes about 40 seconds to transfer the images from the VM to the machine
<hkominos> current kernel is 4.4.0-87
<hkominos> ubuntu
<roaksoax> kwmonroe: right, but is maas itself, where is it running? on a vm inside a aarch64 ?
<dnegreira> Hi, I am having trouble starting MAAS after an upgrade
<dnegreira> I am getting this on /var/log/maas/regiond.log: /usr/bin/twistd3: Unknown command: maas-regiond
<roaksoax> dnegreira: it seems that your machine didn't really fully upgrade
<roaksoax> or something prevented it from fully upgrading
<dnegreira> shall I run dpkg --reconfigure ?
<dnegreira> on the maas-region-controller package
<dnegreira> 2018-01-25 15:22:52 status half-configured maas:all 2.3.0-6434-gd354690-0ubuntu1~16.04.1
<dnegreira> you are right indeed
<dnegreira> weird, I seemed to miss any message regarding this during the upgrade, I will try to see what happened
<roaksoax> dnegreira: try sudo apt-get -f install
<dnegreira> reports nothing
<dnegreira> also nothing in dpkg -l or dpkg-query
<dnegreira> the plot thickens
<dnegreira> ok figured it out
<dnegreira> it seems that rackd was not able to read out the files in /etc/apt/apt.conf.d/
<dnegreira> fixed, and sorry for the noise
<roaksoax> no worries
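The "status half-configured" line dnegreira pasted is the kind of entry dpkg writes to its log. A minimal sketch for spotting packages whose last recorded state is an incomplete one follows. It assumes the usual "DATE TIME status STATE PACKAGE VERSION" layout of status lines; the function name is made up.

```python
# dpkg states that mean a package was left partway through an operation.
INCOMPLETE_STATES = {
    "half-installed",
    "unpacked",
    "half-configured",
    "triggers-awaited",
    "triggers-pending",
}


def unfinished_packages(log_lines):
    """Map package -> last recorded state, for packages left incomplete.

    Hypothetical helper; assumes status lines of the form
    'DATE TIME status STATE PACKAGE VERSION'.
    """
    last_state = {}
    for line in log_lines:
        fields = line.split()
        if len(fields) >= 5 and fields[2] == "status":
            # Later lines overwrite earlier ones, so the final state wins.
            last_state[fields[4]] = fields[3]
    return {
        pkg: state
        for pkg, state in last_state.items()
        if state in INCOMPLETE_STATES
    }


if __name__ == "__main__":
    sample = [
        "2018-01-25 15:22:52 status half-configured maas:all 2.3.0-6434-gd354690-0ubuntu1~16.04.1",
    ]
    print(unfinished_packages(sample))
```

On a Debian or Ubuntu system the lines would normally come from /var/log/dpkg.log, for example by passing the open file object to the function.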
<hkominos> roaksoax: Yes
<hkominos> Progress: I have been trying to play around with the blk_size in /usr/lib/python3/dist-packages/tftp/bootstrap.py, which is 1028. If I push it a bit further (I assume this leads to bigger IP packets, so less time to offload the image) then tftp stalls and does not serve anything
<hkominos> I will see if I can somehow enable jumbo frames
<xygnal> roaksoax: mpontillo: in regards to bug 1744765 - it sounds like you confirmed that with over 300 nodes it slowed down a bit? is that something expected? it's quite slow for us. need to know if that is being looked into or if we need to split up our environments
<roaksoax> xygnal: did you patch your MAAS for meltdown ?
<roaksoax> err
<roaksoax> your system for meltdown ?
<roaksoax> xygnal: we've tested the listing (both devices and machine listing) with 8k machines without issues
<roaksoax> but that was before meltdown
<roaksoax> xygnal: how many machines do you have ?
<roaksoax> xygnal: what's the technical specs of the machine running maas? inside a vm or in hardware ?
<xygnal> just under 600 according to my bug report
<xygnal> VM
<xygnal> ESXi ubuntu vm
<xygnal> 8 cores, 32GB of ram
<xygnal> storage is NFS-based (As in that is how ESXi is storing its volumes) so we have swap disabled
<roaksoax> xygnal: right
<roaksoax> xygnal: so did you start seeing this recently? judging by the date of the bug it would seem after the meltdown fix on the kernel was released ?
<xygnal> when NOT refreshing the nodes page, the memory issue doesnt trigger and memory utilization is normal
<xygnal> usually under 8gb
<xygnal> however
<xygnal> it is just as slow then
<xygnal> it is slow the whole time
<Beret> xygnal, did you file a support request with Canonical?
<xygnal> no, we have only worked on this with the MAAS devs so far
<xygnal> and as of yet i don't feel i have sufficient evidence that that is the reason it's performing poorly, unless you have some way to prove that's the behavior it's sitting on
<Beret> xygnal, quickest way to resolution is to file a support request
<xygnal> Beret: we could just boot the old kernel on maas 2.3 and see if it performs any differently
<xygnal> far faster than filing a support request to find out
<xygnal> booting back in the dev environment certainly seems faster, but i'm not ready to test it in prod yet
<xygnal> just noticed prod is ALSO unable to look up node details
<xygnal> if i open up individual nodes it just sits at 'loading'
<roaksoax> xygnal: yeah improvements for machine details loading are in 2.4
<roaksoax> xygnal: but for listing are in 2.3
<roaksoax> xygnal: but that's just because it loads the commissioning logs and such
<roaksoax> t/win 5
#maas 2018-01-26
<hkominos> narinder
<narinder> hkominos, hi
<hkominos> Hi. There is rumour that you demoed maas in the opnfv plugfest
<hkominos> so I assume you have a deep understanding of Maas?
<narinder> hkominos, not a rumour but i did talk about kubernetes and MAAS.
<narinder> hkominos, what exactly are you looking for?
<hkominos> I wanted to ask you about a bug (?) that I see in maas. Here in opnfv-armband at least. It all started with tftp being veeeeeeery slow. After considerable looking around I found that the tftp library which maas leverages is unable to understand the properties of the underlying link and forces blocksize to be small. I remember you had a similar issue with 9000 MTU ??
<hkominos> IF you set 9000 MTU maas does not serve images anymore BUT alav told me that you might have a hack in place to tell maas to use proper block size ?
<narinder> for pxe network we use 1500 MTU most of the time.
<hkominos> we have the same. However the library forces blksize to be 1000 . (if I force the blksize to be 1464 then it works perfectly and faster)
<hkominos> Should i just open a bug ?
<narinder> hkominos, please open it against maas and maas team can have a look.
<hkominos> ok thx!
<Karunamon> 'morning folks - I've got a fresh install of 2.3.0 on 16.04 that is refusing to let me enable DHCP on the one single vlan that was configured out of the box
<Karunamon> specifically, I get a "The IPRange could not be created because the data didn't validate."
<Karunamon> no idea where to even begin troubleshooting because this is literally step 6 on the install page
<catbus> Karunamon: could it be the ip range for dhcp is not reachable via that interface?
<catbus> Karunamon: what's the MAAS ip on the interface where you will provide DHCP and what's the IP range you configured for DHCP?
<Karunamon> catbus: so the box has a single interface - 10.0.0.225, the DHCP range is only two addresses now, 10.0.0.246 through 10.0.0.247.
<Karunamon> as configured the vlan is 10.0.0.0/24
<Karunamon> I didn't actually add that vlan, it was automatically created after install, presumably from the IP settings on the machine prior to installing the 'maas' package
<roaksoax> mpontillo: "The IPRange could not be created because the data didn't validate."
<roaksoax> Karunamon: can you file a bug with all the information about your networks, and what ranges you are trying to create, etc
<mpontillo> Karunamon: are you creating the IP range in the UI or the API? how does the IP range you're trying to create correlate to any subnets on that VLAN?
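A quick way to answer the "how does the range correlate to the subnet" question independently of MAAS is to check the arithmetic with Python's standard `ipaddress` module. This is a sanity-check sketch with a made-up function name; it does not reproduce whatever validation MAAS itself performs.

```python
import ipaddress


def check_range(subnet_cidr, start, end, in_use=()):
    """Return a list of problems with a proposed range (empty if none found).

    Hypothetical helper, unrelated to MAAS's own IPRange validation.
    """
    net = ipaddress.ip_network(subnet_cidr)
    lo = ipaddress.ip_address(start)
    hi = ipaddress.ip_address(end)
    problems = []
    if lo > hi:
        problems.append("start is after end")
    for addr in (lo, hi):
        if addr not in net:
            problems.append(f"{addr} is outside {net}")
    for ip in map(ipaddress.ip_address, in_use):
        # Addresses already assigned (e.g. the MAAS server) must not be in the range.
        if lo <= ip <= hi:
            problems.append(f"{ip} is already in use inside the range")
    return problems


if __name__ == "__main__":
    # Values quoted in the chat: subnet, proposed range, and the server's address.
    print(check_range("10.0.0.0/24", "10.0.0.246", "10.0.0.247", in_use=["10.0.0.225"]))
```

With the values from the chat the list comes back empty, which suggests the numbers themselves were not the problem.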
<dd__> Hi all, My network discovery is only finding 3 bare metal machines that already have OS's on them in the VLAN. When I PXE boot from Idrac, MaaS isn't discovering the machines. Any ideas on things to check?
<Karunamon> roaksoax: I think one may already exist, but it's been dormant for months. https://bugs.launchpad.net/maas/+bug/1569960
<Karunamon> in the meantime I'm evaluating auto deployment tools and the poor error messaging is... discouraging
<Karunamon> roaksoax: The range was already created when I logged into the system for the first time. This error appears when I hit "provide DHCP" on the vlan
<Karunamon> dd__: Do you have active discovery turned on in the vlan?
<Karunamon> *subnet, rather
<mpontillo> Karunamon: right, I triaged that but couldn't get the issue to occur; it wasn't clear what the cause of the problem was. if you can tell us how to get that to happen consistently, that would be great
<mpontillo> dd__: what happens on the system console when it PXE boots?
<dd__> The virtual console on idrac just says no media detected.
<dd__> Or do you mean on the maas console?
<Karunamon> mpontillo: I can give a tl;dr - it's a bog standard 16.04 install with ssh and basic ubuntu server installed, IPs as configured above, maas installed with 'apt install maas'
<mpontillo> dd__: I was trying to confirm that the machine actually PXE booted from MAAS first. "no media detected" implies that it either isn't set to boot from the network, or MAAS isn't providing DHCP on the network and thus can't serve up the parameters for the PXE request
<dd__> MaaS says it's providing dhcp.
<Karunamon> the only thing I can think of is that it's getting upset that the maas server (rack and region controller same box) is living inside the vlan its managing
<mpontillo> Karunamon: the only unusual thing about that is probably the very small range - can you try making the DHCP range larger?
<mpontillo> Karunamon: I'll check the code to see how we handle that
<Karunamon> mpontillo: tried that. When I fill out the "provide DHCP" form, it suggests a range of 10.0.0.136 through 10.0.0.199
<Karunamon> I get the same behavior even if I just accept the default
<mpontillo> Karunamon: btw, it's fine (required even, since DHCP is UDP-based) for MAAS to have an IP address on the managed VLAN
<Karunamon> mpontillo: heh, just spitballing
<Karunamon> I tried creating a dynamic range manually under the subnet, that causes a "There is no room for any dynamic ranges on this subnet."
<dd__> mpontillo: idrac says booting from PXE Device 1: Integrated NIC 1 Port 1 Partition 1 PXE: No media detected. Maas on the Controller summary says DHCPD is running.
<mpontillo> dd__: can you confirm that the MAAS server is actually receiving the DHCP requests and replying?
<dd__> mpontillo: how would I confirm that?
<Karunamon> mpontillo: I think I worked past it. Reserved everything but the range I wanted used (so two regular reservations), and reserved the remainder as a dynamic range
<Karunamon> then I was allowed to enable DHCP for the vlan
<mpontillo> Karunamon: strange, I wouldn't expect that would have an effect. when you "provide DHCP" in the UI it should be equivalent to adding the IP range; seems there may be a bug in the "provide dhcp" path then
<mpontillo> Karunamon: glad you got it working thouh
<mpontillo> *though
<mpontillo> dd__: you could install something like `dhcpdump` and run `sudo dhcpdump -i <interface>` on the MAAS rack controller, then any DHCP requests and replies would be printed on the console
<Karunamon> mpontillo: The trick was putting the subnet into managed rather than unmanaged
<Karunamon> I toggled it around a few times and that seems to be the magic
<mpontillo> dd__: when I do that, I can see the DHCP request from the machine I'm booting, and then I can see the reply from MAAS that includes "FNAME: pxelinux.0", showing that MAAS has given the client the option of doing a PXE boot
<mpontillo> Karunamon: ah. should have known. that part of MAAS can be a bit confusing; I actually have a blog post that explains a bit about the history of it here http://spectrum42.com/posts/ip-ranges-in-maas/
<mpontillo> Karunamon: sounds like we need clearer error messages about attempts to enable DHCP on unmanaged subnets (unmanaged means we don't control DHCP, so I'm guessing that's why it was rejected)
<Karunamon> mpontillo: Ah! Unmanaged means it doesn't even try to assign addresses to a dynamic range you create yourself?
<Karunamon> (say, you've got an external DHCP server, you're basically telling maas to look for new nodes within that range?)
<mpontillo> Karunamon: it's more for dual-homed MAAS environments; let's say your MAAS has two interfaces; it manages DHCP on a private network set up for MAAS, but it also connects to your office LAN with some other DHCP server. and maybe you've got a couple ranges on that network that MAAS can allocate from. You configure those ranges that MAAS can manage as reserved, that way if you deploy a dual-homed machine on both networks, it can use what your IT
<mpontillo> department (or whatever) has allocated for you on the network where you /don't/ manage DHCP.
<roaksoax> Karunamon: unmanaged means "interface doesn't get configured", DHCP means " the interface gets configured for /dhcp/"
<mpontillo> Karunamon: that way you don't make your IT department angry at you for running a DHCP server on their network ;-)
<mpontillo> but you can still assign IPs
<dd__> mpontillo: Thanks for that. It actually was getting DHCP requests, but in settings I hit enable active discovery and then told to scan every 10 mins. Passive discovery just wasn't doing it.
<mpontillo> dd__: passive discovery is ARP based, so you won't see devices show up there unless they get an IP address from DHCP and start asking where things are
<mpontillo> dd__: but to be clear, not being able to PXE boot is completely separate from device discovery
<mpontillo> dd__: if a machine isn't enlisting into MAAS, that is generally an issue with DHCP enablement in MAAS on the VLAN you want to boot from
<mpontillo> (that is, if it won't even PXE boot.)
<mpontillo> dd__: if you don't control DHCP on the subnet, you'd have to have a network administrator configure it to PXE boot from MAAS, or enable DHCP on an alternate VLAN and relay it to MAAS. but MAAS works smoother if it controls DHCP itself
<dd__> mpontillo: The dhcp is disabled on this vlan.
<mpontillo> dd__: all right; if you are able to enable it, you should see PXE requests start to work - when you PXE boot the machine it should auto-enlist and add an IPMI password to allow MAAS to power-control the machine
<dd__> mpontillo: I should say that our networks ip manager (infoblox) is not managing dhcp for that vlan.
<mpontillo> dd__: good. so if there are no other DHCP servers on that network, I'd go ahead and try enabling DHCP in MAAS for that VLAN
<dd__> mpontillo: DHCP is enabled for that maas vlan
<mpontillo> dd__: you said you saw DHCP /requests/ in dhcpdump when you attempted the PXE boot but you didn't mention any /replies/; are you sure the settings are correct?
<mpontillo> dd__: I would take a look at /var/lib/maas/dhcpd.conf on the rack controller and see if it looks sane
<dd__> mpontillo: Nothing jumps out at me on /var/lib/maas/dhcpd.conf  I haven't looked at any dhcp snippets yet though.  On the dhcpdump, I got a bunch of bootprequests and bootreply over and over.
<mpontillo> dd__: ok, so it sounds like MAAS is attempting to reply to the PXE request but the server is ignoring it? it could help if you pastebin the dhcpdump.
<mpontillo> dd__: I would check the BIOS settings to make sure the system is set up to allow and prefer network boot
<mpontillo> dd__: also check to be sure there is no firewall on your network that could be preventing the DHCP replies from reaching the server, or could be preventing the PXE boot (TFTP-based) from happening
<dd__> mpontillo: https://pastebin.com/hHCnFJ97
<dd__> mpontillo: Since maas is on the same vlan as the machines it's provisioning, there wouldn't be any firewall.
<Karunamon> Great news - DHCP works :D
<Karunamon> bad news: my machine picks up the initrd from maas and immediately reboots
<mpontillo> dd__: yeah, just checking since some switches prevent DHCP requests unless they are from known authorized DHCP servers - like a L2 firewall
<ltrager2> Karunamon: can you get the console output?
<roaksoax> Karunamon: the initrd should launch an ephemeral image
<roaksoax> Karunamon: so maybe it is just doing that ?
<mpontillo> Karunamon: the expected behavior for the first time it PXE boots is for it to load the kernel/initrd, then run a script to register itself with MAAS, then turn itself off
<mpontillo> dd__: from the pastebin it kind of looks like the offer from MAAS is sent but never acted upon; the DHCPDISCOVER request is repeated a few seconds later, after the original DHCPOFFER was sent. I'd check for network problems between MAAS and the BMC; maybe it can send traffic but not receive it, or maybe it's a network filtering issue like what I mentioned before
<mpontillo> dd__: it looks like MAAS offers up the possibility of an IP address and a PXE boot path four times, but the offer is never accepted
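The pattern mpontillo reads out of the capture (DHCPDISCOVER, DHCPOFFER, then DHCPDISCOVER again, with no DHCPREQUEST) can be expressed as a small check. The sketch assumes the message types for a single client have already been extracted into a list in capture order; it does not parse dhcpdump output, and the function name is made up.

```python
def offers_never_accepted(message_types):
    """True if a DHCPOFFER was seen but no DHCPREQUEST ever followed one.

    message_types: DHCP message type names for one client, in capture
    order (a hypothetical, pre-extracted input).
    """
    offer_seen = False
    for msg in message_types:
        if msg == "DHCPOFFER":
            offer_seen = True
        elif msg == "DHCPREQUEST" and offer_seen:
            # The client answered an offer, so replies are reaching it.
            return False
    return offer_seen


if __name__ == "__main__":
    # Four discover/offer rounds with no request, as described in the chat.
    capture = ["DHCPDISCOVER", "DHCPOFFER"] * 4
    print(offers_never_accepted(capture))
```

A True result points at the reply path (filtering, a one-way link) rather than at the DHCP server configuration, which matches where the conversation ends up looking.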
<mpontillo> Karunamon: btw, if it succeeds, you'll see the server in the machines listing in MAAS with a random name
<dd__> mpontillo: It looks like something IS blocking it at the switch level. Thanks for your help today.
<mpontillo> dd__: happy to be of service =)
#maas 2018-01-27
<mup> Bug #1745705 opened: snap installed maas has flapping processes <MAAS:New> <https://launchpad.net/bugs/1745705>
<hallyn> roaksoax: is 'install region controller' on ubuntu server artful cd meant to work right now?
<hallyn> odd, on second try it worked
<mup> Bug #1745705 changed: snap installed maas has flapping processes <MAAS:Won't Fix> <https://launchpad.net/bugs/1745705>
<Chuck_d_> I have a basic question about MaaS in a bare metal deployment.  I have a jump server with 4 NICs + 1 IPMI.  I have 5 nodes that will be controlled by the Jump Server running maas.  I currently have one of my NICs on the jump configured on the IPMI network. All is working nicely.
<Chuck_d_> I would like to NOT dedicate a NIC to the IPMI network.   Is it possible to use my JUMP server's IPMI NIC to interface to the other servers?
<Chuck_d_> In other words is MaaS smart enough to use it? Do I need to provision or configure something? or is this just NOT possible?
<mup> Bug #1745778 opened: [MAAS, DNS] While adding device, same name on different domain throws "hostname already exists" error <MAAS:New> <https://launchpad.net/bugs/1745778>
#maas 2018-01-28
<troubled> Greetings, programs!
<troubled> Okay, maybe someone can chime in here. I got a test setup where I have a region controller with bond0 (lacp), carved up into vlans, where vlan100 is my "maas" vlan that the nodes pxe boot up in. The nodes have 2 nics each, with eno1 being untagged vlan 100 for maas, and eno2 is on the "external" network. Problem is maas provides the dhcp to them, but can't set a default route that isn't in the vlan100 subnet.
<troubled>  Ideas?
<troubled> To clarify more: say node1 has eno1 dhcp of 192.168.100.100, I want it to use eno2 with its real 1.2.3.4 IP, which I can assign statically, but I can't get it to be the default gw, since the dhcp from eno1 points to the gw from 192.168 and not the 1.2.3.1 ip from my ISP gw that I need it to be
<troubled> I am trying to avoid NAT via the regional controller, or via some manual script that runs during the node startup that drops some snippet into /etc/network/, ie: I would prefer it to be something done via the web ui in subnets or fabric etc
<troubled> (my kingdom for a router + nat....)
#maas 2020-01-20
<mup> Bug #1853047 changed: VLAN with the specified VID already exists error when updating the fabric attribute <cdo-qa> <foundations-engine> <MAAS:Expired> <https://launchpad.net/bugs/1853047>
<mup> Bug #1860340 opened: redfish power driver tweaks <MAAS:New> <https://launchpad.net/bugs/1860340>
<mup> Bug #1860383 opened: MAAS does not check if #includedir is missing from /etc/sudoers <cpe-onsite> <field-medium> <papercut> <MAAS:New> <https://launchpad.net/bugs/1860383>
<mup> Bug #1860388 opened: MAAS fails clean install www-data user does not exist due to nginx requirement <cpe-onsite> <field-medium> <papercut> <MAAS:New> <https://launchpad.net/bugs/1860388>
#maas 2020-01-21
<eduardo71> Need a consultant to help us debug an issue with rack controllers in a cluster not syncing images
<mup> Bug #1838943 opened: Cannot PXE boot arch 0f due to protocol mismatch <MAAS:Confirmed> <https://launchpad.net/bugs/1838943>
<cyphermox> so; I have just installed MAAS in a VM, a very standard install on 18.04, I can login to the interface but it just remains at "Connecting..." forever after login
<cyphermox> anyone has an idea what's wrong, what I can check?
<cyphermox> rbasak: ^ pinging you, because I don't even know if this channel is the right place to be asking the question :)
<rbasak> I think this channel is the right place
<rbasak> But I don't know the answer, sorry.
<rbasak> The MAAS team hang out here too I think though?
<cyphermox> yeah I expected you might not know the answer
<cyphermox> it's certainly different from all previous experiences
<cyphermox> it's always worked before, but I've always deployed it on metal, now it's in a VMWare virtual machine
<cyphermox> roaksoax: any idea ^
<newell> cyphermox: ltrager might have an idea as I think he has messed with VMWare VMs, I personally have not and I have not seen what you are experiencing
<cyphermox> well, even then whether it's hardware or a VM and which hypervisor should not matter at all
<newell> cyphermox: I haven't seen what you are experiencing in general
<cyphermox> yeah, I understand
<newell> cyphermox: are you using main archive or ppa?
<newell> or other?
<ltrager> cyphermox: I've used MAAS to deploy VMware but I haven't deployed to VMware itself
<cyphermox> ppa:maas/proposed
<ltrager> I mean installed MAAS on VMware
<newell> cyphermox: ppa:maas/2.7
<newell> try that one if it is not too much trouble
<ltrager> cyphermox: If you can't connect to the UI it could be there is a network rule blocking websockets
<ltrager> cyphermox: To get around that try using an SSH sockets proxy
<cyphermox> well, it's port 5240 isn't it?
<cyphermox> or are there extra firewall rules required?
<ltrager> cyphermox: 5240 is for the API, I believe 80 is still used for the websocket
<ltrager> cyphermox: argh actually nevermind I was thinking of something else
<ltrager> cyphermox: I have seen firewalls block just websockets while letting other traffic through
<cyphermox> mmkay
<cyphermox> I'm going to have to go beat things into submission
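ltrager's point that a firewall can block just websockets while passing ordinary HTTP can be tested by looking at the reply to a WebSocket upgrade request. The sketch below only classifies a response that has already been received; the handshake rule (status 101 plus a Sec-WebSocket-Accept derived from the request key) comes from RFC 6455, while the function names are made up and nothing here is MAAS-specific.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key):
    """Value a server must return in Sec-WebSocket-Accept for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_succeeded(status_code, headers, key):
    """True if the response completes a WebSocket handshake for `key`.

    headers: dict mapping response header names to values.
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return (
        status_code == 101
        and normalized.get("upgrade", "").lower() == "websocket"
        and normalized.get("sec-websocket-accept") == expected_accept(key)
    )


if __name__ == "__main__":
    sample_key = "dGhlIHNhbXBsZSBub25jZQ=="  # example key from RFC 6455
    print(expected_accept(sample_key))
```

If plain page loads from the same host succeed but the upgrade response fails this check (for instance a 200 or 403 instead of 101), something between the browser and the server is likely interfering with the upgrade.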
#maas 2020-01-23
<mup> Bug #1860619 opened: maasserver_event table grows without bounds, impacting UI performance <MAAS:New> <https://launchpad.net/bugs/1860619>
<ebbex> Did the ephemeral-v3/daily/ images break networking recently? All my machines are coming up with a minimal network config, but have the proper files under /etc/cloud/cloud.cfg.d/50-curtin-networking.cfg
<ebbex> netplan doesn't seem to apply the config from these either. So maybe netplan broke recently?
<ebbex> Nope, I'm missing properly applied network config from /etc/cloud/* on xenial as well as bionic.
#maas 2020-01-24
<ebbex> Oh, ha! cloud-init possibly commit de34dc7c467b318b2d04d065f8d752c7a530e155 reacts to iscsi_auto kernel parameter
<sdhd-sascha> hello,
<sdhd-sascha> are there any PPAs for maas series 2.7 ? I currently build it myself from time to time ...
<cyphermox> ltrager: thanks, it was indeed this firewall filtering websocket, and I had to have a good firm talk with it to even manage to push new configurations ;)
#maas 2020-01-25
<theodrim> Hello, is there any pointer on how to make curtin able to use hp smart array logical drives as the install destination? Can't use HBA mode sadly, as this server won't be able to boot. The error I'm getting: when curtin tries to vgchange -a (because the arrays are seen as logical volumes) it fails. https://dpaste.org/t1Vt/raw
<theodrim> I understand that this question is better asked at discourse, but maybe somebody encountered this before ?
