#maas 2013-07-22
<AskUbuntu> Can i install juju zookeeper on maas server itself? | http://askubuntu.com/q/323000
<AskUbuntu> Is virtual maas removed? | http://askubuntu.com/q/323070
#maas 2013-07-24
<bbcmicrocomputer> this juju-core bug is a bit of a blocker for using GoJu+maas - https://bugs.launchpad.net/juju-core/+bug/1204507
<ubot5> Launchpad bug 1204507 in juju-core "juju bootstrap dies with 400 bad request" [Undecided,New]
<bbcmicrocomputer> looks like maas file storage doesn't like empty files
<bbcmicrocomputer> so not sure exactly whose bug this should be?
<roaksoax> robbiew: ^^
<roaksoax> err
<roaksoax> sorry :)
<roaksoax> rvba: ^^
<robbiew> ;)
 * rvba looking.
<bbcmicrocomputer> rvba: thanks for doing the fast patch..hope maas supports updating/overwriting the provider-state file :)
<rvba> bbcmicrocomputer: welcome.  Yeah, that part should be fine ;).
<rvba> bbcmicrocomputer: fwiw, a package with the fix is in ppa:maas-maintainers/dailybuilds (saucy) and one is building in ppa:maas-maintainers/daily-backports (precise).
<bbcmicrocomputer> rvba: yep, was just looking, thanks!
<rvba> np
#maas 2013-07-25
<Nac> Gello
<Nac> Hello *
<Nac> Okay, just one question. Can MaaS be installed on a single computer? That's all
<roaksoax> rvba: o/! Yes, I have it on my ToDo
<roaksoax> rvba: is that bug you guys fixed yesterday a candidate for SRU?
<rvba> roaksoax: hi, thanks… I'm not sure, it's pretty important because it breaks the usage of MAAS with the latest juju-core… so it's SRU material when the most recent juju-core will be SRU material.
<roaksoax> rvba: right, but keep in mind that juju-core won't get SRU'd, but we "expect" that juju-core PPA works against precise
<rvba> roaksoax: then it's a candidate for SRU.
<roaksoax> rvba: please add it here so I can take care of it: https://bugs.launchpad.net/maas/1.2/?field.searchtext=&orderby=-importance&search=Search&field.status%3Alist=FIXCOMMITTED&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.tag=&field.tags_combinator=ANY&field.status_upstream-empty-marker=1&field.has_cve.used=&field.omit_dupes.used=&field.omit_dupes=on&field.affects_me.used=&field.has_patch.used=&field.ha
<ubot5> Launchpad bug 1 in Ubuntu Malaysia LoCo Team "Microsoft has a majority market share" [Critical,In progress]
<AskUbuntu> isnt there any descriptive step by step guide for MAAS + juju + Ceph + Openstach HA deployment? | http://askubuntu.com/q/324470
<AskUbuntu> MAAS node OS auto install - failed to retrieve preseed configuration file | http://askubuntu.com/q/324545
<AskUbuntu> can we have multiple juju config files? | http://askubuntu.com/q/324604
#maas 2013-07-26
<bbcmicrocomputer> just wondering if there are any plans to support empty files fix (https://bugs.launchpad.net/maas/+bug/1204507) for MAAS 1.3 on Raring?
<ubot5> Launchpad bug 1204507 in MAAS 1.2 "MAAS rejects empty files" [Critical,Fix committed]
<rvba> bbcmicrocomputer: that's right, I forgot about that… since this fix is essential for juju-core, I'll backport it to 1.3.
<bbcmicrocomputer> rvba: cool, thanks!
<bbcmicrocomputer> rvba: I'm guessing tych0's add-hwinfo merge (1539) won't be going into 1.3?
<rvba> bbcmicrocomputer: well, if it's SRU-material, then it should.
<rvba> But I don't think it's SRU-material.
<bbcmicrocomputer> rvba: k, thanks
<rvba> bbcmicrocomputer: if you want that on raring then maybe we should have a raring package built from trunk.
<bbcmicrocomputer> rvba: just enquiring at this stage :)
<rvba> k
#maas 2013-07-28
<AskUbuntu> Can make conflict if I run "maas-import-pxe-files" command on different networks? | http://askubuntu.com/q/325631
<freeflying> shall I file bug against package from daily build ppa?
<freeflying> http://paste.ubuntu.com/5921918/
<AskUbuntu> juju bootstrap not working for MAAS | http://askubuntu.com/q/325749
#maas 2014-07-21
<jtv> The CI has been seeing import failures because /usr/lib/syslinux/pxelinux.0 did not exist... missing dependency?
<rvba> jtv: the maas-test CI job is failing because it expects a 1.6 or trunk-based package and all we have in Utopic is 1.5.
<jtv> Can we fix that?
<rvba> As soon as 1.6 will be published in Trusty, the problem will go away.  The best fix, I think, is to change the CI job so that it can take a PPA argument and use the MAAS package from that PPA.
<rvba> This way we can get the CI job to use the package that is our "target package" instead of using whatever is in the archive at the time.
<rvba> jtv: would you mind having another look at https://code.launchpad.net/~rvb/maas/event-model/+merge/227156 ?  The code has changed significantly after my discussion with Julian but I've addressed your comments that were still valid after I changed the code.
<jtv> OK, having a look.
<rvba> Ta.
<rvba> allenap: reviewed the 'fascist' branch ;)
<allenap> rvba: Thanks!
<rvba> allenap: jtv: any idea why the CI is failing with "No module named bson"? http://paste.ubuntu.com/7829493/
<rvba> Did we add the bson dependency recently?
<rvba> It seems to me we've been using bson for quite some time.
<rvba> Hum, looks related to map_enum being moved to provisioningserver…
<allenap> rvba: Gosh, no. I’ve not changed anything in that area recently, I think.
<jtv> rvba: looks like newell used the same decoders as allenap, but this time it led to trouble...
<jtv> Why does a function called node_exists need bson..?
<jtv> Ah, so it queries the API for a node and, probably just for belt-and-braces, decodes the response from either json or bson as appropriate.
<jtv> I could imagine this adding the bson dependency to a package that didn't need it before, but isn't this one of those cases where we have all our packages' dependencies installed?
<jtv> Pardon my ambiguity.
<jtv> "Isn't the failure in a situation where bson would be installed even if it's a dependency for a _different_ maas package?"
<allenap> Yeah. It’s odd. There are two different bson packages though, and we need one in particular. I wonder if it’s picking up the wrong one…
<rvba> I /think/ the problem is not in the package but in how the CI builds the package.
<jtv> Ah, mixed install mechanisms again...
<rvba> allenap: I've got a weird test failure that seems related to twisted getting in the way…
<rvba> allenap: the pserv tests fail with http://paste.ubuntu.com/7829947/
<rvba> (On my event-rpc-utility branch)
<rvba> allenap: what's weird is that the tests I added in this branch pass in isolation… but fail when running the complete test suite.
<rvba> allenap: I suppose I'm doing something wrong here: I'm importing "from maasserver.testing.factory import factory" from inside the pserv code.  But how can I initiate my objects otherwise?
<allenap> rvba: You can’t and you shouldn’t.
<allenap> rvba: Verify that the call is being made correctly according to the specifications in p.rpc.region, but have a stub/fake response.
<rvba> allenap: right, I was creating real nodes but this is already a test that uses a fake client so I guess I don't need real nodes…
<allenap> rvba: Nah :)
<rvba> allenap: what's funny is that the tests as they are now pass in isolation.
<allenap> rvba: Perhaps because they're not running using a Django-derived test case, or using Django’s runner; those do some bookkeeping and tidying-up with regards to the database and stuff.
<allenap> rvba:
<allenap> rvba: If you have a moment could you look at my additions to https://code.launchpad.net/~allenap/maas/rpc-call-list-operating-systems/+merge/227117?
<allenap> then I’ll be 100% on power control :)
<rvba> allenap: cool :).  I'll have a look at your branch in a sec.
<allenap> rvba: Thanks for that review. Here’s the first for power: https://code.launchpad.net/~allenap/maas/rpc-power-remove-celery-vars/+merge/227550
<rvba> allenap: \o/
<blake_r> allenap: is the rpc for operating systems wired in?
<allenap> blake_r: Not 100%. Are you blocked?
<blake_r> allenap: one the license-key-view and ubuntu-release-simplestreams
<allenap> blake_r: Okay. I’m going to juggle finishing that with rvba’s demands; he can hassle me in my timezone :)
 * allenap goes to collect kids from school.
<jseutter> I have some computers that stop at the grub boot prompt after installing with maas.  A third computer with the same hardware boots just fine.  Does anyone know where I should look for issues?
<bigjools> jseutter: can you photo the screen and paste it so we can see?
<bigjools> rvba: what say you to this? https://code.launchpad.net/~jason-hobbs/maas/allow-hostname-change-while-allocated/+merge/227372
<jseutter> bigjools: yes, I'll do that in a few
<rvba> bigjools: looks like we had very good reasons to prevent the host from being renamed while it's in use… I'm trying to see if the assumptions on which the original code was based have changed…
<bigjools> rvba: my thoughts too
<bigjools> juju snafus
<rvba> Yeah
<rvba> jhobbs: Hi there.  Looking at lp:~jason-hobbs/maas/allow-hostname-change-while-allocated, wonder why you seem confident that Juju won't be utterly confused by a node changing its name in flight?
<jhobbs> rvba: i'm not at all confident
<rvba> jhobbs: Then you like to gamble :)
<jhobbs> rvba: but why would you change a node's hostname if it will confuse juju?
<jhobbs> rvba: if you run a DNS server, you know changing hostname config for running systems might break them, so you don't do it
<bigjools> it could also confuse the node
<bigjools> I suppose we can allow people to footgun and let them pick up the toes later
<jhobbs> bigjools: rvba: well the use case here is 1) acquire node 2) set hostname 3) start node
<rvba> jhobbs: MAAS sort of works on the contract that once a node is allocated, it's under someone else's control.  I think this also means that MAAS sort of guarantees that the information about the node (most notably DNS info) won't be changed.
<jhobbs> rvba: is that how other cloud providers work?
<rvba> jhobbs: I understand that, but I don't think we should remove the restriction in question before we clearly distinguish between acquired+off and acquired+started
<jhobbs> i setup a digital ocean system recently; i started the system, had it a couple of days, then configured dns for it
<rvba> jhobbs: trying to work out juju's exact expectations now…
<jhobbs> rvba: juju isn't the only consumer though
<rvba> jhobbs: indeed.  But it's an important one.
<rvba> jhobbs: this restriction exists mostly because of Juju.
<jhobbs> rvba: you can always just not change your hostnames if you're running juju - it only happens if someone asks it to, not automatically
<rvba> jhobbs: true
<jhobbs> rvba: is acquired+stopped vs acquired+start tracking something being worked on already?
<rvba> jhobbs: no
<rvba> jhobbs: I guess we want 3 states: acquired (i.e. not started), deploying (being installed) and deployed (acquired+started=in use)
<rvba> jhobbs: adding the distinction deploying vs deployed is part of the robustness work.
<jhobbs> do deploying and deployed apply to commissioning also?
<rvba> Not really, 'commissioning' already is an "in progress" state.
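The three states rvba proposes above can be sketched as an enumeration. This is only an illustration of the idea, not MAAS code: the `NodeState` class and the `can_change_hostname` helper are hypothetical names, and the rule that renaming is allowed only before the node is started is an assumption drawn from jhobbs's acquire, set hostname, start use case.

```python
from enum import Enum


class NodeState(Enum):
    """Hypothetical split of the single 'allocated' state into three."""

    ACQUIRED = "acquired"    # allocated to a user, not started yet
    DEPLOYING = "deploying"  # being installed
    DEPLOYED = "deployed"    # acquired and started, i.e. in use


def can_change_hostname(state):
    """Assumed policy: renaming is only safe before the node is started."""
    return state is NodeState.ACQUIRED
```

With such a split, the hostname restriction could stay in place for deploying and deployed nodes while being lifted for nodes that are merely acquired.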
<jseutter> bigjools: pic of grub prompt here: http://imgur.com/Kdaq64G
<jseutter> bigjools: I'm surprised you're awake..
<bigjools> jseutter: me too :)
<bigjools> jseutter: landed in UK this morning
<jseutter> bigjools: ah!
<bigjools> jseutter: I am afraid I have NFI what is wrong there.  Perhaps one of jhobbs, blake_r or roaksoax_ might
<jseutter> bigjools: ok, well thanks for having a look.  :)  In the meantime I'll dig through the bios on these machines and look for differences.
<jhobbs> i've never seen that error
<jhobbs> "attempt to read or write outside of disk `hd0`
<bigjools> jseutter: did you install with curtin or d-i?
<jseutter> d-i.   I might be misremembering, but I think curtin just black-screened for several hours until I gave up.
<bigjools> ummm
<bigjools> is the disk partitioned already?
<jseutter> bigjools: yes
<bigjools> does it have any of them marked for raid?
<blake_r> jseutter: this might help you
<blake_r> jseutter: http://askubuntu.com/questions/397485/what-to-do-when-i-get-an-attempt-to-read-or-write-outside-of-disk-hd0-error
<jseutter> bigjools: there are several disks in the raid, configured as one virtual disk on the raid card.  What do you mean by marked for raid?
<bigjools> jseutter: I think it's a bug in the installer, I have vage recollections of seeing this even outside of maas once
<bigjools> vague, even
<jseutter> bigjools: hm, I'm just looking at the system that booted successfully, and the drac raid section says there are no disks in the system.  I have somewhere to start digging
<mthaddon> I'm trying out MAAS on a machine. It's a region and cluster controller (trusty) and I have a VM I'm trying to add as a node. It's currently stuck in "Commissioning" status and if I try to either commission or start it from /MAAS/nodes/ I get: The action "..." could not be performed on 1 node because its state does not allow that action.
<mthaddon> any ideas what I'm missing? I don't see anything useful in logs in /var/log/maas
<jseutter> blake_r: thanks
<bigjools> mthaddon: two things to check
<bigjools> 1. did you define the power details correctly?
<bigjools> 2. is the cluster controller celeryd running?
<bigjools> damn, we need a faq
<bigjools> and hi mthaddon btw :)
<mthaddon> bigjools: o/
<mthaddon> bigjools: power address is "qemu:///system" and I've confirmed I can run "virsh --connect qemu:///system list" as the maas user
<mthaddon> bigjools: power ID is "test" and the name of the virsh instance from virsh list --all is "test"
<bigjools> mthaddon: did you follow this http://maas.ubuntu.com/docs/nodes.html#virtual-machine-nodes
<mthaddon> I did, yep
<bigjools> col
<bigjools> cool
<bigjools> blast this keyboard, it's going in the bin
<mthaddon> I see celery processes running
<bigjools> if you look in its log /var/log/maas/celery.log, do you see a line with power_on around the time you added the node?
<mthaddon> bigjools: http://paste.ubuntu.com/7831026/
<bigjools> mthaddon: aha
<bigjools> sadly, I think you're using an old version that lacks debugging info
<mthaddon> bigjools: an old version? this was a clean install over the weekend!
<bigjools> of which maas version?
<mthaddon> well, of trusty - maas appears to be 1.5.2+bzr2282-0ubuntu0.2
<bigjools> ok
<mthaddon> is that "old"?
<bigjools> ummmmm you can get past this by "powering up" the VM manually for now
<bigjools> not massively, but I am surprised the task logging isn't working :(
<mthaddon> presumably I'd need to tell it to connect to the MAAS cluster somehow? I'm not doing DHCP or DNS via MAAS
<bigjools> and then you'll have to try and re-create what the power template is doing to start the node up to work out what's wrong
<bigjools> if you're not managing dhcp, then presumably you've configured another DHCP with the next-server pointing at maas?
<mthaddon> I haven't - if that's the next step, sounds like I need to do that
<bigjools> yeah
<bigjools> mthaddon: http://maas.ubuntu.com/docs/configure.html#manual-dhcp-configuration
<mthaddon> k, thx
#maas 2014-07-22
<chiluk> Hey guys is there any hidden functionality that enables maas to be better suited for a test machine environment *(i.e. nag notices on extended length machine reservations?)
<jtv> chiluk: we have no such notices, no.  Should be doable through the API though.
<rvba> allenap: now that I think about it, wouldn't it be simpler to have a RPC power_change() method instead of power_on/power_off?  This way, the new method used to query the power state of a node would simply be a different call to the existing RPC method?
<allenap> rvba: power_change(query=True) smells.
<allenap> or something like that.
<rvba> allenap: power_change({'on', 'off', 'query'})
<allenap> rvba: That smells too. PowerOn and PowerOff don’t declare any response arguments because they’re not relevant. If we munged these together with checking the power state we’d need to declare some response arguments just for that case.
<rvba> allenap: hum… okay, good point.  I just feel that the RPC layer is starting to extend its tentacles in every direction.
<allenap> rvba: Heh. By keeping the interface very tight and focused instead of overloading RPC calls it’s doing the opposite.
<allenap> Once Celery is out and its replacement is in, I think MAAS overall will feel less embraced by wandering tentacles.
<rvba> allenap: what interface?  It seems to me we're extending the API surface (the RPC api) quite a bit.
<rvba> allenap: one good thing about Celery was that it wasn't very "invasive".  It was pretty clear what the API surface was and how to do retries and things like that was pretty straightforward.   I just hope the RPC stuff won't lose this.
<rvba> I'm not saying that it looks like the RPC stuff is going to be worse than Celery was;  the importance of having a simple interface is just something we need to keep in mind.
<rvba> allenap: btw, the RPC power_on/power_off methods are there now but they are not "wired" (as in, used) at all yet correct?
<allenap> rvba: Each RPC call has a well-defined shape. Celery allows us to pass almost anything around, which increases the implementation coupling between parts of MAAS. It’s convenient, but there’s no distinct edge to the layering.
<allenap> rvba: That’s right, they’re not used yet because there’s a hornet’s nest of static IP Celery stuff to sort out first.
<rvba> allenap: Is sorting out the static IP stuff a pre-req for using the PowerOn/PowerOff methods?
<rvba> (Just asking because I was planning to do that now.)
<allenap> rvba: They can be used, but if you look in m.models.node…start_nodes you’ll see that power_on can be chained behind a task returned by claim_static_ips().
<rvba> allenap: oh, right.  Damn.
<allenap> rvba: Indeed. The hornet’s nest is found by digging into claim_static_ips :)
<rvba> allenap: yep, it's definitely a hornet's nest :)
<bigjools> allenap, rvba: is my beautifully written code too complex for you?  Should I draw a picture? :)
<bigjools> FTR, this is why I said we needed a layer on RPC to simulate jobs so that you can chain them to guarantee ordering
<rvba> allenap: time for a quick pre-imp chat?
<schegi> hey, got some problems with the maas dns, i think. after i bootstrapped a juju node, it is not reachable by dns always getting ERROR state/api: websocket.Dial wss://controller.wcloud.uni-koblenz.de:17070/: dial tcp: lookup controller.wcloud.uni-koblenz.de: no such host when doing juju ssh 0
<schegi> bootstrapping works fine starts with Attempting to connect to controller.wcloud.uni-koblenz.de:22 Attempting to connect to 192.168.25.14:22
<schegi> anyone?
<allenap> bigjools: It is complicated, but then it’s not a trivial problem.
<allenap> rvba: Sorry, I was otp, and I’m getting a call back any minute :-/
<bigjools> allenap: yes, I always said that removing celery was not as simple as just replacing tasks with rpc :)
<rvba> allenap: it won't take long, and I can wait if the someone calls you while we're on the phone.  I'd be happy to talk to you before we step out for lunch if possible.
<rvba> s/the someone/someone/
<allenap> rvba: Okay, call again.
<bigjools> if anyone needs a trivial bug to fix: https://bugs.launchpad.net/maas/+bug/1340920
<bigjools> the formatter's ordering of imports can really bugger up tests when you're using monkeypatching :(
<bigjools> allenap: do you know a convenient way to create a mock object that has mock functions which return a particular value, without doing a dance involving multiple mocks?
<allenap> bigjools: Yes :)
<allenap> my_thing = self.patch(thing, "name")
<bigjools> excellent, always full of useful info
<bigjools> :)
<allenap> my_thing.function_1.return_value = 1234
<bigjools> oh that easy
<allenap> Just reference the thing you want and you’ll get a Mock back.
<allenap> Yep.
<bigjools> the obvious is always sitting too close to see, isn't it?
<allenap> Yeah. And mock is a little scary too.
<bigjools> allenap: a further thing - is there a convenient way of making the return value conditional on passed values? :)
<allenap> bigjools: Also, see mock.create_autospec(). That’s a fairly cool function.
<allenap> bigjools: my_thing.function_1.side_effect = my_callable
<bigjools> basically I am patching the rpc client
<bigjools> oh side_effect...!
 * bigjools not having a good day
<allenap> Hehe :)
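The mock tips in this exchange (`return_value`, `side_effect` and `create_autospec()`) can be tried outside MAAS with the standard library. The sketch below assumes Python 3's `unittest.mock` rather than the standalone `mock` package and the MAAS `self.patch` helper used in the log; `RpcClient`, `power_state` and the node names are hypothetical stand-ins for whatever is being patched.

```python
from unittest import mock


class RpcClient:
    """Hypothetical stand-in for the real object that a test would patch."""

    def power_state(self, system_id):
        raise RuntimeError("would talk to the network")


def demo():
    client = mock.Mock(spec=RpcClient)

    # Fixed answer: referencing the attribute yields a Mock, and setting
    # return_value on it makes every call return that value.
    client.power_state.return_value = "on"
    fixed = client.power_state("node-1")

    # Conditional answer: side_effect receives the call's arguments and
    # its return value becomes the mock's return value.
    states = {"node-1": "on", "node-2": "off"}
    client.power_state.side_effect = lambda system_id: states[system_id]
    conditional = [client.power_state("node-1"), client.power_state("node-2")]

    # create_autospec() additionally enforces the real call signature.
    strict = mock.create_autospec(RpcClient, instance=True)
    try:
        strict.power_state()  # missing the required system_id argument
    except TypeError:
        signature_checked = True
    else:
        signature_checked = False

    return fixed, conditional, signature_checked
```

Calling `demo()` exercises all three features in one go; a single mock object is enough, with no need to assemble several mocks by hand.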
<rvba> allenap: care to have a look at this tiny branch? https://code.launchpad.net/~rvb/maas/retry-power-changes-1/+merge/227722
<allenap> rvba: It would be my pleasure.
<rvba> allenap: ta
<rvba> allenap: "RabbitMQ will be going away this cycle" to we have a plan to do away with txlongpoll?  Because it also depends on RabbitMQ.
<rvba> s/to we/do we/
<allenap> rvba: In my mind I do :) otp now.
<bigjools> rvba: irrelevant for this bug though
<rvba> bigjools: who said it was about this bug?
<bigjools> rvba: because that's the context for the statement!
<bigjools> allenap: there?
<bigjools> need rpc halp
<allenap> bigjools: I’ll be with you in 3 minutes.
<bigjools> tick tock
<allenap> bigjools: https://plus.google.com/hangouts/_/canonical.com/bigjools
<bigjools> allenap: https://code.launchpad.net/~julian-edwards/maas/image-download-service/+merge/227309
<bigjools> allenap: can you review my branch please?
<rvba> allenap: question for you:  I thought the methods in src/provisioningserver/rpc/clusterservice.py like power_on were being executed in the reactor?   That's weird because PowerAction.execute is effectively blocking… isn't that a problem?
<bigjools> rvba: should be deferred to a thread
<rvba> allenap: in other words, shouldn't this be wrapped in deferToThread?
<allenap> rvba: Yeah, it probably should be deferred. I’ll sort that out unless you want to?
<rvba> allenap: I'm on it.
<allenap> Tip top.
<bigjools> allenap: https://code.launchpad.net/~julian-edwards/maas/image-download-service/+merge/227309
<allenap> bigjools: I’m half way done already. Waaay ahead of you :)
<bigjools> allenap: ha!
<bigjools> allenap: I just realised I am about to be hoist by my own petard
<bigjools> I only did happy path testing
<bigjools> lol
<allenap> bigjools: Hehe, everything’s happy in Twistedland.
<bigjools> allenap: because it's on drugs
<allenap> Absolutely. As long as it doesn’t stop taking them it’ll never come down ;)
<bigjools> allenap: I just re-read the original email you sent me about the implementation for this and realised I got something wrong.  Lack of maas.meta means it should not be importing automatically.
<allenap> bigjools: Ah yes! I’d forgotten that.
<bigjools> allenap: LOL at last two MP comments
<schegi> hey one question in the /etc/maas/templates/dhcp/dhcpd.conf.template the option ntp-server is set by dhcp_subnet.get('ntp_server') but there is no specific option when specifying dhcp in maas how to set that to a different address?
<matsubara> schegi, The setting is in the main maas settings page not at the cluster dhcp config.
<matsubara> I mean, the ntp server setting.
<schegi> ok gotz it thx
<matsubara> np
<schegi> ah and theere is also a field for the forwarders, nice
<schegi> have hardcoded them
<schegi> if i update these values do i have to restart something??
<matsubara> Nope
<schegi> it is just possible to specify one forwarder right?
<matsubara> Yes
<automatemecolema> So I'm getting an "Internal Server Error" after I have MAAS installed, and the PXE images downloaded. Not sure what has happened. I used the latest ISO release, and updated before I installed MAAS any ideas?
<automatemecolema> Looked through some logs and maas.log reveals an RPC connection error, and celery has a couple errors too
<Sh3l0ck> Is there a way to statically map IP addresses to machines in MAAS?
<roaksoax_> Sh3l0ck: not until the next release
<Sh3l0ck> roaksoax_: Oh I see...Is the next release coming out soon?
<roaksoax_> Sh3l0ck: yes! sometime in the next few weeks
<automatemecolema> So I'm getting an "Internal Server Error" after I have MAAS installed, and the PXE images downloaded. Not sure what has happened. I used the latest ISO release, and updated before I installed MAAS any ideas?
<newell> automatemecolema, can you do a tail -f *.log in /var/log/maas and see if there is any information that would help you out
<breze411> anyone knows what login works maas nodes console ?
#maas 2014-07-23
<alexmcwhirter> Hey guys, I'm having issues with nodes failing to fully PXE boot. They get stuck at IP-Config requesting an IP address from the MAAS server but never getting a response. The cluster controller doesn’t ever see these requests either.
<alexmcwhirter> A bug report was opened for similar issue https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1327412
<ubot5> Launchpad bug 1327412 in initramfs-tools (Ubuntu) "Delay during PXE Boot, IP-Config gives up" [High,New]
<jtv> rvba: I wonder if that migration (1) shouldn't break in the database anyway because it updates a column that was added in the same transaction, and (2) won't cause trouble updating a field that _on the model_ hasn't been added yet.
<jtv> In SQL terms, what I'd expect that migration to do is: update the node table using a left outer join with the tag table where the tag is for the fastpath installer; and set the newly added column to one value or the other depending.
<jtv> Instead, it loops over all nodes, queries the tags table for each, updates a model field that hasn't been fully added yet, then writes back that one node.
<jtv> It was this MP wot did it: https://code.launchpad.net/~blake-rouse/maas/add-boot-type-to-node/+merge/227234
<jtv> rvba: I'm working on that broken migration.
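The set-based update jtv describes, as opposed to looping over every node, can be sketched with the standard-library `sqlite3` module. Everything named here is an assumption made for illustration: the `node`, `tag` and `node_tags` tables, the `boot_type` column, the tag name and the two values are not taken from the MAAS schema. A correlated `EXISTS` subquery stands in for the left outer join he mentions, since it gives the same per-node answer in a single statement.

```python
import sqlite3


def set_boot_type(conn):
    """Fill the new column for all nodes with one UPDATE statement."""
    conn.execute(
        """
        UPDATE node
        SET boot_type = CASE
            WHEN EXISTS (
                SELECT 1
                FROM node_tags
                JOIN tag ON tag.id = node_tags.tag_id
                WHERE node_tags.node_id = node.id
                  AND tag.name = 'use-fastpath-installer'
            ) THEN 'fastpath'
            ELSE 'di'
        END
        """
    )


def demo():
    # Hypothetical miniature schema: two nodes, one tagged for fastpath.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE node (id INTEGER PRIMARY KEY, hostname TEXT, boot_type TEXT);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE node_tags (node_id INTEGER, tag_id INTEGER);
        INSERT INTO node (id, hostname) VALUES (1, 'a'), (2, 'b');
        INSERT INTO tag (id, name) VALUES (1, 'use-fastpath-installer'), (2, 'other');
        INSERT INTO node_tags VALUES (1, 1), (2, 2);
        """
    )
    set_boot_type(conn)
    return conn.execute(
        "SELECT hostname, boot_type FROM node ORDER BY id"
    ).fetchall()
```

Doing the work in SQL avoids one query per node and does not depend on the model class already knowing about the new field.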
<bigjools> thanks jtv
<bigjools> rvba: I'll review your branch if you review mind :)
<bigjools> mine*
<rvba> bigjools: well, it's always the same question I have with twisted: how to *start* the damn thing. What am I doing wrong here: http://paste.ubuntu.com/7841796/ ?
<bigjools> rvba: looking
<bigjools> rvba: aiee
<bigjools> rvba: you don't need the callLater
<rvba> bigjools: well, I tried without first
<rvba> bigjools: what you see now is the hole I ended up in after trying different things
<bigjools> basically it's going to be hard the way the code is written
<bigjools> because of inline callbacks
<bigjools> the function won't exit until it's finished
<bigjools> but you need to do assertions while it's running
<rvba> Indeed, but the function yields at various points.
<bigjools> I forgot what it's calling
<rvba> Isn't that enough to let us do the checks we want to do?
<bigjools> power_action, right
<bigjools> no, because it yields to the reactor
<bigjools> let me show you something I did recently
<bigjools> rvba: also fwiw I think you need some test refactoring to reduce the boilerplate
<bigjools> rvba: move the clock.advance() to the first statement after the "for"
<bigjools> advance just runs the reactor for one iteration so you need to do that to get your callLater to run
<bigjools> the test now fails because it's actually incorrect :)
<rvba> bigjools: doens't seem to work.
<rvba> doesn't*
<bigjools> rvba: it doesn't work because your test is wrong
<rvba> ah :)
<bigjools> it needs updating to cope with this
<bigjools> rvba: you could try advance(0), I've never done that though
<bigjools> for the first iteration I mean
<rvba> bigjools: I've tried that first actually.  No dice.
<bigjools> rvba: 0.1 then
<bigjools> :)
<bigjools> you just need to spin the reactor once to get the callLater to run
<bigjools> but not enough to get the retries to start yet
<rvba> I've tried that too :)
<bigjools> rvba: what happened?
<rvba> bigjools: the first check failed because nothing had been called.
<bigjools> rvba: weird.  did you try 1?
<rvba> bigjools: with 1 it fails during the second check.  Maybe my test has a problem… let me have another look…
<bigjools> rvba: yes, there is a problem, it's not taking into account the initial calls I think
<rvba> bigjools: it doesn't look as if clock.advance is doing its job: the code inside the loop in change_power_state() only gets called once.
<bigjools> rvba: the balance of probability is that your code is buggy then :)
<bigjools> rvba: give me a few minutes to finish my branch and then you can have my full attention
<rvba> bigjools: either that or I'm not using the right magic formula in my test.
<rvba>     raise mismatch_error
<rvba> MismatchError: !=:
<rvba> reference = [call(context-key-4XklLw=u'context-val-TYmWJk', power_change=u'on')]
<rvba> actual    = [call(context-key-4XklLw=u'context-val-TYmWJk', power_change=u'on')]
<rvba> : after <function get_mock_calls> on <Mock name='mock().execute' id='140383727837328'>: calls do not match
<rvba> Nice
<bigjools> wat
<rvba> bigjools: I've pushed the latest version of the code to lp:~rvb/maas/retry-power-changes
<rvba> ./bin/test.maas src/provisioningserver/rpc/tests/test_power.py:TestPowerHelpers.test_change_power_state_pauses_in_between_retries
<bigjools> rvba: it works now?
<rvba> Obviously not :)
<bigjools> rvba: ok I'll take a look
<rvba> The clock.advance doesn't seem to have any effect on the tested method resuming its execution.
<bigjools> rvba: test.maas for pserv tests?
<rvba> The call* to clock.advance
<bigjools> use test.pserv!
<rvba> It works.
<bigjools> it's slow
<rvba> Can you run just one test with test.pserv?
<bigjools> rvba: provisioningserver.rpc.tests.test_power:TestPowerHelpers.test_change_power_state_pauses_in_between_retries
<bigjools> rvba: bin/test.pserv provisioningserver.rpc.tests.test_power:TestPowerHelpers.test_change_power_state_pauses_in_between_retries
<bigjools> sigfh
<bigjools> notice how quick it is ;)
<rvba> Okay, cool.
<rvba> True, it's quick
<bigjools> rvba: would your DNS CNAME removal branch easily backport to 1.5?
<rvba> bigjools: I think so; the interface hasn't been changed, only the internals.
<bigjools> yeah I thought as much, thanks
<rvba> bigjools: if I debug what happens when I reach the clock.advance in the test, I can see that right after exiting the twisted code, the code runs the second execution of the loop… in other words, *nothing* calls the method under test again.
<bigjools> rvba: put a breakpoint at the end of the method and see if it's reached
<rvba> bigjools: it's not
<rvba> did that
<bigjools> pok
<bigjools> so it exits at the yield
<rvba> bigjools: reading the twisted code, I wonder if Clock is the right tool to use here.  It seems it's only meant to write unit tests that use callLater
<bigjools> nah it's fine
<bigjools> hmmm so when I tested something similar in my own code I patched out deferToThread
<bigjools> so you're creating threads in here
<rvba> Damn, that did the trick.
<bigjools> haha
 * rvba can't believe it.  Let me put a breakpoint and see it with my own eyes.
<bigjools> I suspect Clock() won't work with threads
<bigjools> rvba: I would patch out deferToThread so it just returns defer.succeed
<rvba> self.patch(power, "deferToThread", maybeDeferred) is what I did
<bigjools> same thing
<bigjools> nearly
<bigjools> rvba: so did you get it working?
<rvba> bigjools: the test passes
<bigjools> hurrah
<rvba> I'm seeing if I can't break that giant function into smaller pieces that could be unit tested separately.
<rvba> Well, I'm sure I can but the question: will the resulting code still be readable?
<rvba> the question is*
<bigjools> I am sure an engineer of your high calibre will do just fine :)
<bigjools> rvba: I was thinking, you might not need the callLater.
<bigjools> make the test use inlineCallbacks
<schegi_> question on the virsh power settings for a node can anyone help? i configured them like this 'power address' qemu+ssh://ubuntu@192.168.25.14/system and 'power id' vm1 but it does not power on the vm. running 'sudo -u schegi virsh -c qemu+ssh://ubuntu@192.168.25.14/system start vm1' where schegi is my maas user works.
<bigjools> schegi_: http://maas.ubuntu.com/docs/nodes.html#virtual-machine-nodes
<bigjools> rvba: then you can yield the first result, followed by advancing the clock after that.
<rvba> bigjools: yeah, I'll try… let me see if I can break the methods into smaller pieces first…
<rvba> method*
<bigjools> rvba: good call.
<bigjools> will be much much easier to test
<schegi_> bigjools, already found that but the problem is that running the command from the console powers on the machine. starting the vm by maas doesnt work
<bigjools> schegi_: two things to check then, 1. is the cluster celery process running? 2. look in its log for errors
<schegi_> bigjools, seems that the login to the virsh console somehow fails http://pastebin.com/QfH9Ybqa
<jpds> maas has a broken piece in trusty.
<jpds>  /var/log/maas/rsyslog has the wrong permissions by default.
<bigjools> schegi_: can't help you with that, sorry
<bigjools> jpds: someone filed a bug the other day
<rvba> bigjools: http://paste.ubuntu.com/7842616/ (a first version with a tiny bit extracted); I think it can be done better.  Is there a better pattern than having sub-methods like what I've done with doer() and checker()… I could go all the way and get rid of the inlineCallbacks entirely…
<rvba> I'm starting to see why twisted is fun :).
<jpds> bigjools: Do you have the bug handy/
<bigjools> jpds: sorry, not to hand
<bigjools> busy atm, will look in a bit
<bigjools> rvba: one mo
<bigjools> rvba: ok I can look now
<bigjools> rvba: you could enhance retries() to take a sequence
<bigjools> but up to you
<rvba> bigjools: what do you mean by that?
<bigjools> rvba: the retries util that gavin wrote just takes fixed intervals
<bigjools> a fixed interval, I mean
<rvba> bigjools: I'm doing something different here: there is no global timeout.
<bigjools> ok
<rvba> bigjools: the thing I don't like in my code is the need to encapsulate actions into tiny methods that take no arguments (checker and doer).
<bigjools> rvba: presumably you're doing that so it's easier to monkey patch?
<rvba> bigjools: no, only to have something contained to pass to retry()
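The doer()/checker() pattern discussed above can be sketched without Twisted. This is a minimal synchronous illustration, not the code from the paste: `retry_with_waits`, `power_on` and `is_on` are hypothetical names, and `functools.partial` stands in for the tiny no-argument wrappers rvba mentions. It takes a sequence of waiting times rather than one fixed interval, and has no global timeout.

```python
from functools import partial


def retry_with_waits(doer, checker, waits, sleep):
    """Call doer(), wait, then checker(); repeat once per entry in waits.

    Returns True as soon as checker() reports success, False if every
    attempt is used up. `sleep` is injected so tests can avoid real delays.
    """
    for wait in waits:
        doer()
        sleep(wait)
        if checker():
            return True
    return False


# Hypothetical stand-ins for the real power actions.
state = {"attempts": 0}


def power_on(system_id):
    state["attempts"] += 1


def is_on(system_id):
    # Pretend the node only responds after the second attempt.
    return state["attempts"] >= 2


slept = []
ok = retry_with_waits(
    partial(power_on, "node-1"),  # bind arguments instead of writing doer()
    partial(is_on, "node-1"),     # ... and instead of writing checker()
    waits=(3, 5, 10),
    sleep=slept.append,           # record the waits instead of sleeping
)
```

In Twisted code the `sleep` step would presumably be a Deferred-returning pause driven by the reactor or a test Clock; that part is left out here.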
<bigjools> rvba: my brain is starting to shut down for the day, I might not be much more help!
<rvba> bigjools: you've been very helpful thus far!  I'll refine my code a bit more so that it marks the node as broken when the template execution fails.
<bigjools> sweet
<bigjools> rvba: TDD is generally even more important when writing twisted code
<rvba> bigjools: right, I can see that now.
<bigjools> rvba: and sometimes I eschew inlinecallbacks in favour of passing deferreds around just to make tests easier
<bigjools> it's quite nice being able to add a test callback on one of the deferreds from the code being tested
<rvba> bigjools: not just that.  In this particular case, not using inlinecallbacks means dealing with errors that need to be treated the same way but are happening at various point in the callback tree is easier.
<rvba> points*
<bigjools> right
<bigjools> rvba: the downside means you end up writing lambdas or inline functions
<rvba> Indeed.
<rvba> bigjools: another thing is that using the Deferred pattern (vs inlineCallback) forces you to think in terms or success vs. failure.  It structure the code paths around that distinction.
<rvba> s/terms or/terms of/
<bigjools> rvba: well yield is more like normal python
<bigjools> exceptions and so on, so not sure about that.
<rvba> bigjools: if you compare my code vs the previous version, handling failures in the "deferred" version comes more naturally.
<allenap> bigjools: Do you know why the MAAS Daily Builds PPA appears to be fairly out of date?
<allenap> Or am I misunderstanding?
<allenap> I want a package built from trunk.
<allenap> For Trusty.
<bigjools> hang on
<allenap> I /could/ build it myself.
<bigjools> allenap: it got switched to utopic
<allenap> bigjools: Oh.
<bigjools> allenap: dailies for trusty are in here https://launchpad.net/~maas-maintainers/+archive/ubuntu/experimental/+packages
<allenap> bigjools: Can we build trunk packages for Trusty and Utopic?
<bigjools> allenap: it's complicated :)  can't upload the same thing twice to a PPA, so we'd have to reversion stuff in the recipes to include the series info
<bigjools> this situation needs reviewing I think
<allenap> Yeah. My lack of knowledge means I resent the arcane dark art of packaging even more.
<bigjools> allenap: I've been building my own lately
<bigjools> since I made it *so* easy ;)
<bigjools> allenap: talking of which, can you remember why, after installing packages, pserv fails to start with:
<bigjools> /usr/bin/twistd: Unknown command: maas-pserv
<bigjools> plugin not found, obviously, but why
<allenap> bigjools: No, sorry, but I can help diagnose. Not now - kids have just returned and need feeding - but after 8pm.
<bigjools> allenap: at 8pm I'll be semi-comatose.  better leave it to tomorrow
<allenap> bigjools: Okay.
<bigjools> thanks
<allenap> Bye!
<bigjools> look at the time!
<bigjools> I keep thinking "team call in 5 minutes..."
<bigjools> allenap: oh ah I see what's up, plugin failed to load
<bigjools> exceptions.AttributeError: 'Settings' object has no attribute 'WORKER_QUEUE_DNS'
<bigjools> but why... question for tomorrow!  good night
<schegi_> got still problems with powercontrol of an virsh vm node. already configured my maas user to be able to perform pwd-less virsh command in the terminal, but doing the same via maas is not possible.
<schegi_> e.g commissioning node would not start node with power settings qemu+ssh://ubuntu@192.168.25.14/system id: vm1, but performing virsh -c qemu+ssh://ubuntu@192.168.25.14/system console vm1 in the terminal starts node. celeryd is running and log says provisioningserver.custom_hardware.virsh.VirshError: Failed to login to virsh console..
<Waruii> hi. I have a newly installed on a mainly vanilla 14.04, but maas does not import images. pserv.log tells me "AssertionError: MAAS_URL is not set.  This probably means that the script which started this program failed to source maas_cluster.conf", but access rights look fine to me
<Waruii> I also see repeatedly two(!) instances of HttpClientFactory (http://localhost/MAAS/rpc/) booting up and being stopped immediately
<jseutter> Waruii: I remember seeing the same error with my maas install but I don't remember how I fixed it.  From looking at https://maas.ubuntu.com/docs/troubleshooting.html, sudo dpkg-reconfigure maas-region-controller may be what you need.
<jseutter> Waruii: on my install, I have maas_cluster.conf:MAAS_URL=http://localhost/MAAS
<jseutter> sorry, that's in /etc/maas/maas_cluster.conf
<Waruii> jseutter: config says the same for me. just did the dpkg-reconfigure to no avail
<jseutter> Waruii: Are you trying to install maas from packages on an existing 14.04 install?
<jseutter> Waruii: If so, check your installed packages against mine - http://pastebin.ubuntu.com/7843901/
<Waruii> installed the "maas" metapackage after a vanilla 14.04 installation
<jseutter> Waruii: did you do the sudo add-apt-repository cloud-archive:tools before installing the maas metapackage?
<Waruii> does not ring a bell
<jseutter> Waruii: I had a couple of failed attempts, it finally worked for me when I followed https://maas.ubuntu.com/docs/install.html closely.  It could be that the maas packages in the standard archive do not work as well as the ones in the cloud archive
<jseutter> Waruii: actually
<jseutter> Waruii: On my system, the cloud-tools archive has been disabled by distUpgrade, so I must have gotten maas working on 12.04 and then done a dist-upgrade to 14.04.
<jseutter> nuts.
<Waruii> I thought I followed it but I will give it another try … tomorrow ;)
<Waruii> thx though!
<jseutter> Waruii: That add-apt-repository command is described as optional in the docs, so if one way doesn't work you could try the othre
<jseutter> other
#maas 2014-07-24
<JJ__> hi
<JJ__> Hello?
<rvba> bigjools: please have another look at my branch (https://code.launchpad.net/~rvb/maas/retry-power-changes/+merge/227884);  I've kept the hardcoded list of power types that support querying for now.  The way we're using the power type registry is a bit nasty: the objects it stores are JSON representation of the fields the power type requires, not proper objects that can be extended cleanly.  If we're going to
<rvba> extend this, I think it's worth taking a step back and seeing if we can't refactor this a bit before we add more cruft on top of what we have now.
<bigjools> rvba: sure
<bigjools> gimme a few, just writing an email
<rvba> No worries
<rvba> bigjools: also, if you have time after that, I'd like to talk about error reporting with you.
<bigjools> rvba: first comment - the test_change_power_state_marks_the_node_broken_if_exception test won't fail if there's no Failure.  You need to do d.addBoth() and check the result is a Failure.
<bigjools> did you ever get the test to fail?  :)
<bigjools> rvba: see self.assert_fails_with() which does what you want
<rvba> bigjools: I actually *did* get the test to fail…
<rvba> bigjools: if you apply http://paste.ubuntu.com/7846674/ the test fails.
<bigjools> rvba: wrong kind of failure
<bigjools> if there is no failure at all, the test will pass
 * bigjools writes on MP
<rvba> bigjools: I get http://paste.ubuntu.com/7846676/, that's what I'm expecting
<rvba> i.e. the node not being marked as broken
<bigjools> rvba: wait for my MP comments, the test needs a simple fix, that's all
<rvba> bigjools: oh, I see what you mean now
<bigjools> :)
<bigjools> rvba: there is a test helper that does it all in one line
<rvba> Right.
<bigjools> rvba: see email
<bigjools> also a useful nugget is that you can pass debug=True to make_factory() which gives you a bunch of info about Deferreds when debugging your tests (full stack trace of where they were created is tremendously useful)
<rvba> bigjools: hum, looks like you need to "return" assert_fails_with()
<bigjools> rvba: ah yes
<bigjools> returns the deferred to the reactor
<rvba>         d = power.change_power_state(
<rvba>             system_id, power_type, power_change, context)
<rvba>         d.addErrback(check)
<bigjools> rvba: well what you can do, since you're adding an extra check is this (let me just write it down)
<rvba>         return assert_fails_with(d, Exception)
<rvba> bigjools: this ^ works.
<rvba> And check is:
<rvba>         def check(failure):
<rvba>             # The node has been marked broken.
<bigjools> rvba: you don't need the check func you can do a lambda
<rvba>             self.assertThat(
<rvba>                 client.mark_node_broken, MockCalledOnceWith(system_id))
<rvba>             raise failure.value
<bigjools> d.addErrback(self.assertThat, ...)
<bigjools> not a lambda... heh
<bigjools> rvba: ok?
<rvba> Having a cleanly defined check() method is easier on the eyes.
<rvba> I think.
<bigjools> rvba: I massively disagree
<bigjools> inline funcs are to be avoided at all costs
<bigjools> well, not all, but they are hideous
<rvba> Not in tests when they encapsulate a tiny bit of logic that is nice to see isolated from the rest of the code.
 * bigjools continues to disagree
<rvba> bigjools: in this case, I need to do two things in check(): the test itself and then re-raise the exception
<bigjools> rvba: no you don't need to re-raise
<bigjools> if you add the errbacks in the right order, anyway
<rvba> bigjools: if I don't re-raise, the exception is swallowed and assert_fails_with fails
<rvba> oh, I see.
<bigjools> rvba: d = self.assert_fails_with ...
<rvba> assert_fails_with returns a deferred
<rvba> bigjools: yeah
<bigjools> d.addErrback(...
<rvba> bigjools: so that's what you're suggesting then? http://paste.ubuntu.com/7846751/
<bigjools> rvba: no need to re-assign d
<bigjools> but basically, yeah
<rvba> It's not failing properly when no exception is raised…
 * rvba reads the documentation for d.addErrback…
<rvba> bigjools: http://paste.ubuntu.com/7846769/ that's better
<bigjools> ah yes sorry, forgot the extra arg :)
<bigjools> normally it's not needed but because you have other callbacks ...
<rvba> Right.
<bigjools> rvba: that's interesting.  The test func where you added the yield doesn't have inlinecallbacks
<rvba> bigjools: ah!
<bigjools> It's fine, but ...
<bigjools> rvba: is assert_fails_with not on the testcase?
<bigjools> self.assert_fails_with ... ?
<rvba> No, it's not.
<bigjools> huh, weird.  Someone must have patched it in on the LP tests then
<bigjools> rvba: I reckon we should change our test cases to have it
<rvba> bigjools: it's only used in two places now (including the one I'm adding) so the fix should be tiny.
<rvba> bigjools: how do you explain "yield method()" works without the inlinecallbacks decorator?
<bigjools> it's a generator...
<rvba> But how come the test runner deals with that transparently?
<bigjools> I suspect you don't need the run_tests_with
<bigjools> it's never hitting the reactor
<rvba> bigjools: when I call power.change_power_state() without explicitly passing a reactor, the default one gets used…
<bigjools> I mean the tests are not relying on it
<rvba> But when 'yield pause(waiting_time, reactor)' gets called, this should hit the reactor shouldn't it?
<bigjools> rvba: not if inlinecallbacks is missing it's not :)
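bigjools's "it's a generator" answer can be shown in plain Python. A function whose body contains `yield` returns a generator object when called, and none of its body runs until something iterates it. A caller that just invokes the test method and ignores the return value therefore sees no exception at all. The function name below is hypothetical, and how any particular test runner treats a generator return value is not covered here.

```python
def test_without_decorator():
    # The `yield` makes this a generator function.
    yield None
    raise AssertionError("never reached unless something iterates")


# Calling it runs none of the body; it only builds a generator object.
result = test_without_decorator()
body_ran = False

# Only driving the generator (the job inlineCallbacks does) runs the body.
try:
    for _ in result:
        pass
except AssertionError:
    body_ran = True
```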
<rvba> bigjools: but I've got other test methods with inlinecallbacks.  They pass with or without it.
<bigjools> meh, the lack of an easy way to run a quick image import on canonistack is annoying
<bigjools> need a sources file now
<bigjools> rvba: right, because the tests are not relying on the reactor spinning, just the clock
<rvba> bigjools: but the clock defaults to twisted.internet.reactor
<bigjools> rvba: not the one you're relying on
<rvba> bigjools: I'm talking about other tests which don't use Clock() at all.
<bigjools> I'm not doing a good job of explaining, sorry
<jtv> Who wants to review my database migration fix for trunk?  https://code.launchpad.net/~jtv/maas/bug-1347579/+merge/228099
<bigjools> jtv: I would be delighted to
<jtv> Thank you.
<allenap> Please can a review get I? https://code.launchpad.net/~allenap/maas/rpc-get-preseed-data-helper/+merge/228086
<rvba> allenap: I'll look at it in a sec
<allenap> rvba: Thanks.
<bigjools> allenap: pserv logging.  it's, erm, not doing a lot
<bigjools> is the python logger wired up?
<bigjools> to twisted's, I mean
<allenap> bigjools: What’s happening specifically?
<bigjools> allenap: two problems:
<bigjools> 1. logger.info() is not appearing in the log
<bigjools> 2. the log is swamped with RPC connection reporting which is annoying
<roaksoax> bigjools: so I was thinking, would it be a good idea to start working on Unified Logging?
<roaksoax> since most of the new logging that we will be doing in all of the work
<roaksoax> will need to end up in a unified log
<bigjools> roaksoax: that's a terrible idea right now, but when everything else in progress is done, it will be an excellent idea
<roaksoax> bigjools: right, but we need to start consolidating logs in a single location anyway, instead of creating new logs for new stuff
<allenap> bigjools: I shall look.
<bigjools> roaksoax: there is nothing that I know of that is going to require a new log
<allenap> roaksoax: Yeah, let’s just clean up what we have for now, get over the hump, no new development right now. In a month we can do something a bit more structured.
<bigjools> allenap: thanks.
<allenap> bigjools: logger.info() probably not appearing because we’re not configuring the standard lib's logging in pserv.
<bigjools> allenap: I think we need to build some retrying backoff algorithm into getClient()
<bigjools> allenap: yes that's what I guessed at 14:52
<bigjools> getClient fails hard and quick, and this is not a normal situation, we expect RPC to always be up.  It could be that we're trying to access too quickly after startup.
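A retrying backoff of the kind bigjools suggests might look roughly like this. It is a synchronous sketch under stated assumptions, not MAAS code: `backoff_intervals`, `call_with_backoff`, the `get_client` stub and the `NoConnectionsAvailable` class are all defined here purely for illustration, and the intervals are arbitrary.

```python
def backoff_intervals(initial=0.5, factor=2.0, maximum=30.0):
    """Yield an endless series of waits: initial, initial*factor, ... capped."""
    wait = initial
    while True:
        yield min(wait, maximum)
        wait *= factor


def call_with_backoff(func, attempts, sleep, retry_on=(Exception,)):
    """Call func() up to `attempts` times, sleeping between failures."""
    intervals = backoff_intervals()
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            sleep(next(intervals))


class NoConnectionsAvailable(Exception):
    """Illustrative error raised while RPC is not yet up."""


calls = []


def get_client():
    # Illustrative stand-in: fails twice, then returns a client.
    calls.append(1)
    if len(calls) < 3:
        raise NoConnectionsAvailable()
    return "client"


waits = []
client = call_with_backoff(
    get_client, attempts=5, sleep=waits.append,
    retry_on=(NoConnectionsAvailable,))
```

Injecting `sleep` keeps the sketch testable; inside the reactor a blocking sleep would not be appropriate and a Deferred-based pause would take its place.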
<allenap> bigjools: We can wire up Twisted’s log to the standard lib’s logging with PythonLoggingObserver.
<allenap> And that’s probably a good thing to do.
<bigjools> yes
<bigjools> allenap: how evil would it be to put inlineCallbacks on __init__ ? :)
<bigjools> I need a startup pause for the image checking service because of this getClient thing
<allenap> bigjools: We could add a queue to ClusterClientService for getClient requests.
<allenap> bigjools: inlineCallbacks on __init__ won’t work.
<bigjools> yeah I figured
<bigjools> allenap: client(GetBootSources) is returning a string.  Is that right?
<allenap> bigjools: Nope, it returns a {“sources”: {…lots of stuff…}}. See GetBootSources.response for the schema.
<bigjools> allenap: I am getting a very very weird error
<bigjools> let me paste
<bigjools> allenap: http://paste.ubuntu.com/7848044/
<bigjools> the returned data looks ok
<bigjools> oh no.... do I really have to do returned_data['sources'] ...
<allenap> bigjools: Yep, that’s it.
<bigjools> allenap: so did that, got further: http://paste.ubuntu.com/7848059/
<bigjools> I feel we're lacking in some integration tests
<allenap> Yep :)
<bigjools> allenap: seriously - there's too much connection mocking, we really need end-to-end tests
<allenap> bigjools: I’m not sure that the response that GetBootSources gives was designed to exactly slot into existing code. It was designed to make sense first of all :)
<rvba> bigjools: I know you're busy right now but here is the MP that implements what we talked about earlier: https://code.launchpad.net/~rvb/maas/improve-broken-state/+merge/228132
<bigjools> roaksoax: ack
<allenap> bigjools: You’re the one wiring it up for the first time, so you get to write the integration tests.
<bigjools> rvba: ack
<bigjools> ffs
<bigjools> allenap:yay?
<allenap> bigjools: Think of how good you’ll feel when it’s done :)
<bigjools> allenap: I always enjoy finishing someone else's work ;)
<allenap> bigjools: Great, I think I have some more for you :)
<bigjools> allenap: arf!
<bigjools> allenap: do we have any infra to set it up in a test?
<allenap> bigjools: There’s ClusterRPCFixture for the other direction. You could replicate that for cluster->region.
<bigjools> allenap: right
<bigjools> allenap: so I see the problem
<bigjools> the keyring data is returned in sources, but the importer code wants it in a file
<bigjools> I love it when the rabbit hole deepens like this
<bigjools> so I can't just pass that sources obj that rpc returns
<bigjools> what I want to know is, how is the file written normally when the celery job runs
<bigjools> argh
<allenap> bigjools: That’s done in boot_resources.import_images, right? write_all_keyrings
<bigjools> argh arfg argf
<bigjools> allenap: the rpc code has a bug
<bigjools> it's returning the data in a field with an incorrect name
<bigjools> "keyring" is the filename, "keyring_data" is the, well, data
<bigjools> but the call uses the former for the latter
<bigjools> allenap:
<bigjools>         keyring_data = source.pop("keyring_data")
<bigjools>         source["keyring"] = b64decode(keyring_data)
<bigjools> why!
<bigjools> thinko or deliberate?
<allenap> bigjools: Deliberate. BootSource.to_dict() is the one smoking crack for the benefit of Celery. The cluster never needs to know the filename. To the cluster the keyring is the keyring, it does not need to be qualified with a _data suffix. Once the Celery task is gone we can fix BootSource.to_dict().
<bigjools> allenap: I'm changing it back to keyring_data, because it's the least path of resistance
<allenap> (It’s smoking crack because the model is where the base-64 encoding is happening, instead of where the task is dispatched.)
<bigjools> the code in import_resources relies on all of this
<bigjools> it's not just celery
<allenap> bigjools: Jumping off a cliff is a quick way to the bottom.
<bigjools> \o/
<bigjools> I prefer keyring_data tbh.  It doesn't leave me wondering if it's a filename or the data
<bigjools> keyring on its own is ambiguous
<bigjools> and in fact more confusing because the gpg executable uses --keyring=<filename>
<allenap> bigjools: Either way, fix BootSource.to_dict() when you delete the import_boot_images task.
<allenap> bigjools: Note that everything assumes that keyring_data is base-64 too. You’ll need to disabuse everything of that notion.
<bigjools> allenap: sigh :(
<bigjools> allenap: for now I am going to ensure that everything RPC is bytes and not b64
<bigjools> it's a good cutpoint.
<allenap> Yeah.
<bigjools> allenap: I have a dilemma
<bigjools> there's a time when the importer code needs to deal with both b64 and bytes for the keyring
<bigjools> before both the scheduled job and manual import are migrated to rpc
<allenap> The only way out is through.
 * allenap is the font of aphorisms today.
<allenap> bigjools: Well, you /could/ name one "keyring_data" and the other, oh, I don’t know, just “keyring”? :)
<bigjools> allenap: fooey
<bigjools> doesn't help anyway
<bigjools> all the underlying code assumes b64
<bigjools> can I detect which it is and DTRT
<allenap> bigjools: You could try decoding base-64 and catch exceptions, but that’s not without risk.
<bigjools> yeah
<bigjools> I'll decode it when the task is called
<allenap> Yeah, that works.
<bigjools> not when pumped right into the underlying code :/
<allenap> Perfick.
<bigjools> then when the task gets removed eventually, all will be well
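The risk allenap mentions with "decode and catch exceptions", and the alternative bigjools settles on (decode once where the task is called, so the underlying code only ever sees bytes), can be illustrated as follows. This is a sketch only: `looks_like_base64` and `task_entry_point` are hypothetical names, the byte strings are made-up sample data, and it is written for Python 3.

```python
import base64
import binascii


def looks_like_base64(data):
    """Guess whether `data` is base-64 text. This is only a heuristic."""
    try:
        base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def task_entry_point(source):
    """Hypothetical task wrapper: decode once, at the boundary."""
    decoded = dict(source)
    decoded["keyring_data"] = base64.b64decode(source["keyring_data"])
    return decoded


raw_keyring = b"\x99\x01\x0dkey"  # not valid base-64, so it is detected
ambiguous_keyring = b"abcd"       # raw bytes that also parse as base-64

encoded = base64.b64encode(raw_keyring)
from_task = task_entry_point({"keyring_data": encoded})
```

The `ambiguous_keyring` case is why detection alone is unsafe: some raw byte strings happen to be well-formed base-64, so guessing can silently decode data that was never encoded.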
<bigjools> allenap: so there is a slight flaw in the plan for detecting when to run a download :(
<bigjools> although it's not the end of the world
<bigjools> maas.meta only gets updated when there is something to download.  So once it's a week old, we run the downloads hourly :)
<bigjools> allenap: bug 1271188 is getting more attention now, and I am also a little frustrated.  Let's stick it on the list while we're doing this work?
<ubot5> bug 1271188 in MAAS "maas-import-ephermerals provides no feedback on progress" [High,Triaged] https://launchpad.net/bugs/1271188
<bigjools> allenap: if you want a gander I have a WIP MP, I'm offski now.  There's a bit more to do to stop that hourly checking.
<bigjools> https://code.launchpad.net/~julian-edwards/maas/replace-celery-download-job-with-pserv/+merge/228168
<allenap> bigjools: Ha! We could ensure that maas.meta is touched after each successful run, or use another reference file.
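allenap's suggestion of touching a reference file after each successful run could be sketched like this. The one-week threshold comes from bigjools's description; `is_stale`, `touch` and the temporary directory are assumptions for illustration, and the real location and handling of maas.meta are not shown in the conversation.

```python
import os
import tempfile
import time

ONE_WEEK = 7 * 24 * 60 * 60


def is_stale(path, max_age=ONE_WEEK, now=None):
    """True if `path` is missing or was last modified over max_age seconds ago."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True
    return now - mtime > max_age


def touch(path):
    """Create `path` if needed and set its mtime to now."""
    with open(path, "a"):
        pass
    os.utime(path, None)


# Demonstration with a temporary stand-in for maas.meta.
workdir = tempfile.mkdtemp()
meta = os.path.join(workdir, "maas.meta")

missing_is_stale = is_stale(meta)

touch(meta)
old = time.time() - 8 * 24 * 60 * 60
os.utime(meta, (old, old))  # pretend the last import was 8 days ago
old_is_stale = is_stale(meta)

touch(meta)                 # what a successful run would do
fresh_is_stale = is_stale(meta)
```

Touching the file on every successful run, even when nothing was downloaded, is what would stop the hourly re-check once the file passes a week old.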
<allenap> bigjools: I’ll be back online later, but I’m off now too. I’ll try to have a look at that wip.
#maas 2014-07-25
<rvba> bigjools: meanwhile, would you have time for a review (the diff looks big because there is a migration in it but the changes are pretty mechanical): https://code.launchpad.net/~rvb/maas/improve-broken-state/+merge/228132 ?
<bigjools> rvba: I saw it don't worry, just clearing my inbox and then I'll get to it
<rvba> bigjools: cool, thank you
<allenap> Does anyone have a moment for a review? https://code.launchpad.net/~allenap/maas/rpc-call-get-preseed-data/+merge/228179
<rvba> allenap: sure, I'll take it
<rvba> allenap: at the bottom of the diff, why are you changing metadata_url to be of type `urlparse.ParseResult`?  Doesn't that parameter come from `absolute_reverse` which returns a string?
<allenap> rvba: It actually gets passed via RPC as a tuple; see the ParsedURL type. I had previously stubbed it out using a string instead of a urlparse.ParseResult.
<allenap> Which was a mistake.
<rvba> allenap: oh, I see; the conversion is actually done in src/maasserver/clusterrpc/osystems.py:get_preseed_data;  that's the part I was missing.
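The string versus `ParseResult` versus tuple distinction in this exchange is easy to see in isolation. The sketch below uses Python 3's `urllib.parse` (the code under discussion used the Python 2 `urlparse` module) and a made-up URL; how the ParsedURL RPC type actually serialises is not shown in the chat, so the tuple round trip here is only an illustration of the idea.

```python
from urllib.parse import ParseResult, urlparse

# Hypothetical metadata URL, for illustration only.
metadata_url = urlparse("http://maas.example.com/MAAS/metadata/?op=get#top")

# A ParseResult is a named tuple, so it flattens to six plain strings...
wire_form = tuple(metadata_url)

# ...and can be rebuilt on the receiving side.
rebuilt = ParseResult(*wire_form)
```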
<bigjools> allenap: can you help me set up python's logger in pserv please?  There's currently zero feedback from the downloader service which is a little jarring :)
<allenap> bigjools: Quick and dirty is to call logging.basicConfig(level=logging.DEBUG) in p.plugin…makeService().
<bigjools> allenap: yesterday you said the loggingobserver needs setting up
<bigjools> I detest logging in twisted
<allenap> bigjools: That’s to get Twisted’s logs to go via the stdlib’s logging module.
<allenap> So you can configure both in there.
<allenap> bigjools: Fwiw, so do most people who use Twisted.
<allenap> Where “configure both in there” means you can select levels and whatnot using the familiar stdlib’s logging stuff.
<allenap> For Twisted logs.
<bigjools> allenap: I am slightly confused by it all.  The log there now is direct from twisted, right?  If Python's log is configured will it just pick it up and show it?
<allenap> bigjools: They’re separate at the mo. If you call twisted.python.log.err or .msg it’ll print something to can’t remember where (that’s configured in the plugin). If you use the stdlib’s logging module then it must be configured to show your messages; by default the level is set to WARN.
<allenap> bigjools: PythonLoggingObserver can be used to bridge the Twisted logs over to logging, but that’s not in there yet.
<allenap> not in there = not being used yet
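The "logger.info() is not appearing" symptom matches the standard library's default: with no configuration, the effective level is WARNING, so info() calls are dropped. The sketch below shows that on a private logger writing to an in-memory stream; the logger name is hypothetical, and the Twisted side (PythonLoggingObserver) is deliberately left out. `logging.basicConfig(level=logging.DEBUG)` would do the equivalent for the root logger, with output going to stderr by default.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)

# A private logger so the sketch does not disturb the root logger.
logger = logging.getLogger("pserv_sketch")  # hypothetical name
logger.propagate = False
logger.addHandler(handler)

# With no level set anywhere, the effective level is WARNING.
default_level = logger.getEffectiveLevel()
logger.info("dropped: below the default level")
logger.warning("shown: warning and above get through")

# Lowering the level is what makes info() appear.
logger.setLevel(logging.DEBUG)
logger.info("shown: level is now DEBUG")

output = stream.getvalue().splitlines()
```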
<bigjools> allenap: yeah I think the pserv service is started with --logfile=/dev/null
<bigjools> stdout is redirected instead
<allenap> bigjools: p.services.LogService gets all Twisted logs to go to the filename specified, which is defined in pserv.yaml iirc.
<bigjools> yeah
<bigjools> so is that going to pick up python's logging?
<bigjools> Guess I'll try it out
<allenap> bigjools: Nope… but you could add a call to logging.basicConfig into LogService.startService.
<bigjools> allenap: that's what I mean, if it's configured, where will it go?
<allenap> bigjools: basicConfig send the logs to stdout or stderr iirc.
<bigjools> allenap: yeah I think so.  might interleave with twisted...
<allenap> bigjools: The --logfile=… argument to twistd only affects Twisted’s logs I think. It doesn’t redirect stdio or configure the logging module.
<allenap> bigjools: As long as you can see the logs for development that’s a start.
<bigjools> I think we added the loggingservice to make the log location sane
<bigjools> I am past development, thinking of production/QA now
<allenap> bigjools: I would think about changing LogService to create a new PythonLoggingObserver to bridge Twisted logs over, and to configure the logging module using the log file defined in pserv.yaml.
<allenap> The latter can be done in p.plugin…makeService().
<bigjools> allenap: yeah, sounds reasonable
<allenap> bigjools: The current functionality of LogService can be removed.
<bigjools> right
<rvba> allenap: I'd like to create a RegionRPCFixture similar to the ClusterRPCFixture.  I'm thinking about doing something like: http://paste.ubuntu.com/7855652/ but it requires Region (from maasserver) to be imported from provisioningserver which is obviously wrong… Any idea what I can do instead?
<allenap> rvba: Yeah :) Import Region from p.rpc.region instead; that’s where it’s defined.
<allenap> rvba: Btw, self.patch() doesn’t work on Fixture.
<rvba> allenap: oh, right, I'll MonkeyPatcher then
<rvba> allenap: thanks!
<rvba> I'll use* MonkeyPatcher
<allenap> rvba: And I’m wrong about Region().
<rvba> allenap: yeah, that's what I just found out
<allenap> rvba: I think you’ll have to make it easy to create stubs.
<rvba> allenap: okay, I'll look into it… but I'd like a fixture that wires things up magically (i.e. a fixture that removes all the AMP transport and calls the 'responder' directly).
<bigjools> easy karma for someone: https://code.launchpad.net/~julian-edwards/maas/touch-maasmeta/+merge/228301
<bigjools> rvba, allenap: we really need a full end-to-end fixture...
<allenap> rvba: That’s not going to be easy at all; you’ll need to get Django configured and all that jazz.
<allenap> bigjools: I don’t think we do. I’ve cheated with ClusterRPCFixture and I’m having second thoughts. We should develop the region and cluster against each other’s RPC schema. We ought to get into the habit of writing QA tests for the whole-system side of things.
<bigjools> allenap: but I want integration tests that talk each way
<allenap> bigjools: You may think you want that ;)
 * bigjools dodges allenap's meta-argument :)
<bigjools> can someone review my simple branch please!
<rvba> bigjools: looking at your branch now
<bigjools> thank you rvba!
<jseutter> bigjools: I have a workaround for that grub issue where maas couldn't install on some nodes from earlier this week.  I dropped several drives from the raid array (down to 1.1TB array size) and the maas install worked.  What is weird is that the ubuntu cd installer works with all drives in the array.
<bigjools> jseutter: I've seen the installer fail for me when I've got raid partitions present (outside of maas)
<jseutter> bigjools: ok, good to know :)
<bigjools> 600-line upper decker for your Friday afternoon enjoyment. https://code.launchpad.net/~julian-edwards/maas/replace-celery-download-job-with-pserv/+merge/228168
<bigjools> no takers?  Beuller? :)
<bigjools> allenap: the logging is doing my head in, let's talk about it next week
<allenap> bigjools: Yeah, let’s :)
<allenap> bigjools: I haven’t looked at it in a long long time and it does need some attention.
<bigjools> allenap: if we send to stdout it makes things somewhat awkward
<bigjools> we have our own custom logging service
<bigjools> that takes a filename
<bigjools> not sure we need a filename, but at the same time we do need to write a log to the right place
#maas 2015-07-20
<mup> Bug #1454469 changed: 8 nodes transition to failed state within very short period of time <oil> <MAAS:Expired> <https://launchpad.net/bugs/1454469>
<mup> Bug #1476152 opened: CNAME to correctly address extra interface of a node <MAAS:Triaged> <https://launchpad.net/bugs/1476152>
<mup> Bug #1476254 opened: 1.8 Unexpected whitespace above node listing <ui> <ux> <MAAS:New> <https://launchpad.net/bugs/1476254>
<mup> Bug #1462155 changed: Rendering artifacts with Chromium 41.0.2272.76 Ubuntu 14.04 (64-bit) <ui> <ux> <MAAS:Invalid by ricgard> <https://launchpad.net/bugs/1462155>
<mup> Bug #1476291 opened: lshw does not detect disk with HP Smart Storage Controller P400i <lshw> <maas> <provisioning> <MAAS:New> <https://launchpad.net/bugs/1476291>
#maas 2015-07-21
<h0mer__> hey guys, is this the right place to ask about maas openstack networking questions?
<plars> are the APC PDUs supported for power control in the version of maas in trusty? Is there a way to get it to work with that, or should we really be running the version of maas in the stable ppa?
<plars> The one that was installed by our lab folks is 1.5.4+bzr2294-0ubuntu1.3 and I don't see an option for APC, but locally I have 1.8.0+bzr4001-0ubuntu2~trusty1 and I see it as an option there
<mup> Bug #1476718 opened: 1.8.1 Change table font size <ui> <MAAS:In Progress by ricgard> <https://launchpad.net/bugs/1476718>
<mup> Bug #1476719 opened: 1.8.1 ready machine shows OS as machine property <ui> <ux> <MAAS:New> <https://launchpad.net/bugs/1476719>
<blake_r> plars: I believe that was added in 1.8
<blake_r> plars: you can use the stable ppa to use 1.8, it is released just not in trusty
<plars> blake_r: right, I know it should work, just wondering if that's the recommended option for a production system
<blake_r> plars: 1.8 is recommended
<plars> blake_r: speaking of production maas, do you know if there's any documentation about recommended procedures for disaster recovery, backups, etc?
<blake_r> plars: with the MAAS server?
<blake_r> plars: everything in MAAS is stored in the database
<blake_r> plars: so if you backup the database, and  /etc/maas you will be able to recover
<blake_r> plars: you could even get away with backing up /etc/maas, but I would recommend it just in case
<blake_r> plars: no direct documentation as of this time
<mup> Bug #1476746 opened: 1.8.1 Release dropdown shows (duplicate) OS <ui> <ux> <MAAS:New> <MAAS 1.8:New> <https://launchpad.net/bugs/1476746>
<mup> Bug #1476746 changed: 1.8.1 Release dropdown shows (duplicate) OS <ui> <ux> <MAAS:New> <MAAS 1.8:New> <https://launchpad.net/bugs/1476746>
<mad4comp> hey guys, is this the right place to ask maas/landscape/openstack questions?
#maas 2015-07-22
<mup> Bug #1476291 changed: lshw does not detect disk with HP Smart Storage Controller P400i <lshw> <maas> <provisioning> <MAAS:Invalid> <https://launchpad.net/bugs/1476291>
<mup> Bug #1476291 opened: lshw does not detect disk with HP Smart Storage Controller P400i <lshw> <maas> <provisioning> <MAAS:Invalid> <https://launchpad.net/bugs/1476291>
<mattdiplant> My Webui for maas 1.8 will not connect to nodes
<mattdiplant> I just see "Connecting..."
<mattdiplant> Anyone else having this problem?
<Nastooh> Any pointers on how to add a modified Ubuntu image to MAAS? For more details, kindly take a look at this: http://askubuntu.com/questions/651410/deploy-modified-ubuntu-using-maas
#maas 2015-07-23
<mup> Bug #1477602 opened: 1.8.1 Close tag icon not loading <ui> <MAAS:New for ricgard> <https://launchpad.net/bugs/1477602>
<mup> Bug #1477609 opened: 1.8 Table sorting & active state is confusing <ui> <ux> <MAAS:New for carlaberkers> <https://launchpad.net/bugs/1477609>
<mup> Bug #1477614 opened: 1.8 Re-align two column tables <ui> <ux> <MAAS:New for ricgard> <https://launchpad.net/bugs/1477614>
<saraya> I'm following "Installing the Canonical Distribution of Ubuntu OpenStack" but I'm stuck. After installing a fresh ubuntu 14.04.2 and running apt install maas, it installed maas 1.8
<saraya> now I need to import the pxe files
<saraya> it shows this error: "Error: No boot sources provide Ubuntu images."
<mup> Bug #1477691 opened: Node.pxe_mac can be set to a MAC that doesn't belong to the node <MAAS:Triaged> <https://launchpad.net/bugs/1477691>
<saraya> it looks like a proxy issue, I'm working on it
<saraya> simplestreams.util.SignatureMissingException: No signature found!
<saraya> I've tried this solution but it's not working for me: http://askubuntu.com/questions/626507/maas-maas-import-pxe-files
<saraya> at the end "maasusr maas set-config name=http_proxy value=http://myname:mypass@myproxy:3128"
<saraya> http://wyldeplayground.net/maas-maas-import-pxe-files/
<saraya> "maas maas set-config name=http_proxy value=http://myname:mypass@myproxy:3128"
#maas 2015-07-24
<mup> Bug #1478067 opened: curtin uefi fails on precise with sgdisk: command not found\nfailed to sgdisk for uefi to /dev/nvme0n1 <oil> <curtin:New> <MAAS:New> <https://launchpad.net/bugs/1478067>
<h0mer> Hey guys, can someone help me with networking issues using maas/juju/landscape openstack distribution?
<mup> Bug #1478107 opened: MAAS does not setup rsyslog on deployed nodes with curtin <MAAS:Triaged> <https://launchpad.net/bugs/1478107>
#maas 2015-07-25
<greenride> I'm setting up a 4 node maas cluster in a home network with manual dhcp configuration. My router has the dhcp server, and it runs dhcp. Anyone know a tutorial for this setup? Mainly, I need help with the manual dhcp configuration.
<h0mer> install dnsmasq or proxydhcp on a machine and have it serve up the PXE boot file and images.
<greenride> h0mer: Are you suggesting this in place of maas or in addition to maas?
<h0mer> in addition to it
<h0mer> that's pretty much what i have set up at the moment
<h0mer> at a later time ill be flashing my linksys router with dd-wrt which gives you the ability to specify the pxe settings on its internal dhcp server
<greenride> h0mer: My router is running openwrt. So, I should be able to specify the pxe settings on the internal dhcp server.
<greenride> Is there a reason that you aren't using the master maas node to serve the pxeboot images?
<h0mer> can't you just ssh into the router and setup the pxe boot config?
<h0mer> https://maas.ubuntu.com/docs/configure.html#manual-dhcp
<greenride> Those instructions are for ISC DHCP. I'm not familiar with all the settings. Where do I start learning about those settings/options?
<greenride> But yes, I can ssh into the router.
<greenride> Also, the /etc/config/dhcp file has a section for dnsmasq.
<greenride> That's on the router.
<greenride> Here's the /etc/config/dhcp file. https://dpaste.de/Zyea
<greenride> I screwed up the cut and paste. This version is better. https://dpaste.de/Un8R
<h0mer> on dd-wrt I can set a "next-server" option (for dnsmasq) which I pointed to the maas server IP and it booted from the tftp directory on the maas server
<greenride> Did you have to setup the tftp server on the maas server?
<h0mer> it should be there by default when you import the images on the maas gui.
<greenride> Are you using Ubuntu's maas server or a different setup?
<lathiat> maas does that for you if you set it to managed mode
<h0mer> ubuntu's maas
<greenride> Thanks
<greenride> Let me try that.
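On a router running dnsmasq, what h0mer calls "next-server" is written in dnsmasq's own syntax as the `dhcp-boot` option: the boot filename, an optional server name, then the TFTP server address. A small sketch that formats such a line follows. The filename `pxelinux.0` and the address `192.168.1.10` are hypothetical example values, and whether a given router firmware lets you set the raw dnsmasq option is an assumption to verify.

```python
import ipaddress


def dnsmasq_dhcp_boot(filename, tftp_server, server_name=""):
    """Format a dnsmasq line: dhcp-boot=<filename>,<servername>,<server address>."""
    # Validate the address early; raises ValueError on a malformed IP.
    address = ipaddress.ip_address(tftp_server)
    return f"dhcp-boot={filename},{server_name},{address}"


# Hypothetical values: pxelinux.0 as the boot file, 192.168.1.10 as the MAAS server.
print(dnsmasq_dhcp_boot("pxelinux.0", "192.168.1.10"))
```

The empty middle field is intentional: the server name is optional, but its comma has to stay so that the address lands in the third position.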
#maas 2015-07-26
<greenride> I set up one maas cluster, which has 1 node. I have a total of two machines. I turned on PXE-boot and wake-on-lan in the BIOS for that node. When I select commission selected nodes, I get the error message: "The action 'commission selected nodes' could not be performed on 1 node because its state does not allow that action."
<greenride> How should I diagnose this problem?
<greenride> I tested that the following two commands wake up the node: 'sudo etherwake <my_mac>' and 'wakeonlan <my_mac>'. However, selecting commission does not wake up the node.
<h0mer> did you install etherwake?
<h0mer> apt-get install etherwake
<greenride> h0mer: I installed both wakeonlan and etherwake.
<greenride> wakeonlan works from both my laptop and the maas controller server.
<greenride> etherwake works from my laptop but not the maas controller server.
<h0mer> is your machine on a different subnet?
<greenride> I get the error message: 'SIOCGIFHWADDR on eth0 failed: No such device'
<greenride> Both the laptop and maas controller node are on the same subnet: 192.168.3.xxx.
<greenride> Both using Ubuntu 14.04 LTS. One is desktop and other is server.
<greenride> Maybe I should ask on Ubuntu about the etherwake before proceeding with the commissioning of nodes. If both etherwake and wakeonlan are available on the maas controller server, which does it attempt to use?
<h0mer> reboot the maas node
<h0mer> it uses one or the other
<h0mer> the script in the /etc/maas templates directory checks to see which one you have installed
<h0mer> if you have both, i think it uses etherwake first
<lathiat> one thing i had to do greenride was manually set the mac for wakeonlan on each node
<lathiat> it wouldn't pick it up from its general info
<lathiat> not sure if that helps
<greenride> lathiat: 'sudo etherwake <my_mac>' works perfectly from my laptop, but it doesn't work from the maas controller server. So, the target machine is set up properly. So is my laptop. It's the maas controller box that has a problem.
<h0mer> after installing etherwake/wakeonlan, i had to reconfigure maas
<h0mer> basically bring it down and back up
<h0mer> (i rebooted the node)
<h0mer> and my hp machines with wol worked after that
<lathiat> greenride: i mean within the maas config
<lathiat> greenride: but if "sudo etherwake" (literally) doesn't work, then yeah
<lathiat> i also had problems with it working over a vlan interface
<lathiat> so i had to stop making my maas vlan an ethernet trunk
<lathiat> i suspect that was a switch issue but i didn't verify that fully
<h0mer> if maas is on another subnet, then you'd need to add the subnet prefix to the etherwake/wakeonlan config
<h0mer> greenride: can you do a wol or etherwake from the maas node to wake up your machine?
<greenride> h0mer: laptop ip: 192.168.3.219, router: 192.168.3.1, maas controller: 192.168.3.174, maas node: 192.168.3.131
<h0mer> can you use wol or etherwake from the maas controller (command line) to turn on the machine?
<greenride> h0mer: wakeonlan <maas_node_mac_address> from the maas controller command line works.
<h0mer> but maas can't wake up the machine from the gui?
<greenride> h0mer: 'sudo etherwake <maas_node_mac_address>' does not work.
<greenride> h0mer: That's right. From the GUI, it does not wake up the maas node.
<h0mer> is wake on lan installed in the /usr/sbin directory?
<h0mer> i mean /usr/bin
<greenride> That's correct. wakeonlan is in /usr/bin. etherwake is in /usr/sbin.
<h0mer> you've rebooted maas?
<h0mer> or the maas node
<h0mer> that is?
<greenride> The maas node is in the power off state when the command is executed. I've tried this multiple times from the different machines/different commands.
<h0mer> no i mean reboot the machine maas is on
<greenride> The maas controller node?
<h0mer> yes
<greenride> Let me try that.
<h0mer> and then when maas comes back up, delete the node in the maas gui
<h0mer> and re-add it using wol credentials
<h0mer> i mean wol mac address
<greenride> Before deleting, should I try the etherwake command?
<h0mer> you mean from cli?
<greenride> yes
<h0mer> i thought you said that worked?
<greenride> h0mer: wakeonlan <mac_address> works. sudo etherwake <mac_address> does not.
<h0mer> shouldn't matter, just looked at the template script and WOL is used first, then etherwake
<greenride> one command works from the maas controller node. The other command does not work from the maas controller node.
<greenride> From my laptop, both commands work.
<greenride> Then, let me delete and re-add the node.
<h0mer> rgr
<greenride> rgr?
<h0mer> roger
<h0mer> as in okay.
<greenride> Understood the 'roger'... the abbreviation I didn't. :)
<h0mer> rgr
<h0mer> and btw do you have two nics connected on the maas controller node?
<greenride> No. Only one nic.
<h0mer> alright, the only other thing i can think of is a tcp/udp issue.  I know wol uses tcp and i think etherwake uses udp
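For reference, a Wake-on-LAN "magic packet" is six 0xFF bytes followed by the target MAC address repeated 16 times. The wakeonlan tool sends it as a UDP broadcast, while etherwake sends a raw Ethernet frame on a named interface (eth0 unless `-i` is given), which needs root and would be consistent with the 'SIOCGIFHWADDR on eth0 failed' error above on a box whose NIC is not called eth0. A minimal sketch of the UDP variant; the broadcast address and port 9 are common defaults, assumed here rather than taken from the log.

```python
import socket


def magic_packet(mac):
    """Build the 102-byte Wake-on-LAN payload for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    # Six 0xFF bytes, then the MAC repeated 16 times.
    return b"\xff" * 6 + raw * 16


def send_wol(mac, broadcast="255.255.255.255", port=9):
    """Send the magic packet as a UDP broadcast (address and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))
```

Calling `send_wol("<my_mac>")` with a real MAC from the maas controller is roughly what `wakeonlan <my_mac>` does, so it is a way to test the UDP path independently of MAAS.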
<greenride> By the way, it worked this time. After adding the node, status became 'Commissioning' and it booted.
<h0mer> rgr
<greenride> dhcp didn't work. But, the boot part did.
<h0mer> last time i had this issue, i just rebooted the maas controller node after installing etherwake
<h0mer> what's the dhcp stuff you're talking about?
<greenride> After the wake on lan, the maas node tries to get an ip address using DHCP. That fails.
<greenride> In the PXEBoot process.
<h0mer> but the wol/etherwake stuff works now right?
<greenride> Yes, it does.
<greenride> Thanks for your help.
<h0mer> alright, is maas now set up to manage the DHCP and DNS on your network?
<greenride> Deleting the node and re-adding it did the trick.
<greenride> h0mer: No. The DHCP is done through the router. The router runs OpenWRT.
<h0mer> did you set up the network in maas to point to the router?
<greenride> The OpenWRT router has dnsmasq.
<greenride> h0mer: No. I didn't
<h0mer> and it's set to pxelinux.0 and then points to the maas controller node?
<h0mer> goto the networks tab and configure the gateway and router ip to point to the DHCP server (in maas)
<greenride> No, the only thing I did was add 'next-server <maas_controller_ip>' to the dnsmasq section of /etc/config/dhcp on the OpenWRT router.
<greenride> rgr
<h0mer> oh the stuff i told you earlier?
<h0mer> got it
<greenride> right
<h0mer> id need to see how you set up the network in maas
<h0mer> is it managed or unmanaged?
<greenride> No networks are defined.
<h0mer> the networks tab shows no networks?
<greenride> Right.
<greenride> That's my problem I believe. :)
<h0mer> goto the maas controller and type 'ifconfig'
<h0mer> does it show eth0 or eth1 or something along those lines?
<greenride> It shows lo and p4p1. p4p1 is assigned the ipv4 address 192.168.3.174.
<h0mer> is this a vm?
<greenride> No
<greenride> Physical box.
<h0mer> goto the networks tab and add a network
<h0mer> set the gateway and dns server to your dhcp server ip
<h0mer> then goto clusters tab
<h0mer> and then add an interface
<h0mer> p4p1 or whatever
<h0mer> do the same thing, but set it as "unmanaged"
<h0mer> set everything up to the router IP
<h0mer> and then leave all the dhcp/static fields empty
<h0mer> then delete your node from maas and re-enlist it
<greenride> In the network tab for 'add a network', only fields for name, description, ip, netmask, vlan tag, and connected network interface cards exist.
<greenride> There is no field for gateway or dns server.
<h0mer> there's no field that says "default gateway" or "dns servers"?
<h0mer> what version of maas are you running?
<greenride> Right.
<h0mer> (it's on the bottom of the page)
<greenride> The version that installs by default with Ubuntu 14.04.2 LTS.
<h0mer> what does it say on the bottom of the maas page?
<greenride> © 2012 Canonical Ltd. Ubuntu and Canonical are registered trademarks of Canonical Ltd. View Documentation
<h0mer> sounds like a pretty old version
<h0mer> can you answer your pm?
<greenride> answer your pm?
<greenride> What do you mean?
<greenride> I'm trying to find the version.
<h0mer> do you not see a pm from me?
<greenride> h0mer: I got maas fully installed. I'm playing around with it. Thanks for your help.
<haripriya> I am facing an issue. I installed MAAS in a vagrant box using virtualbox as the provider. On accessing the web UI, I can see only the nodes link; no images/cluster links are available. I tried sudo maas-import-pxe-files but the issue persists
<haripriya> The nodes created were all stuck in commissioning status. Can anyone help me?
<h0mer> @haripriya: not sure I understand your issue.  Is the gui not resolving the links for the pages, or are you having a commissioning problem? or both?
<haripriya> I am having both the issues
<haripriya> gui is showing only the nodes link
<haripriya> and after adding nodes, they get stuck in commissioning status
<greenride> h0mer: Thanks for your help yesterday. I got maas up and running.
<h0mer> np
#maas 2016-07-25
<godleon> https://www.irccloud.com/pastebin/VufwujBs/
<godleon> above is the content I got from /var/lib/maas/dhcp/dhcpd.leases in MAAS 1.9.3, is that normal? I thought one MAC address should always get the same ip address, shouldn't it?
<siva> roaksoax: Hi. Commissioning is failing in my MAAS. The failure reason is here: http://paste.openstack.org/show/541592/
<siva> roaksoax: Could you please provide me a solution for this?
<siva> narindergupta: Hi. Commissioning is failing in my MAAS. I am using MAAS 1.9.3. The failure reason is here: http://paste.openstack.org/show/541592/ . I am not using any proxy or firewall service. I saw a related issue at https://bugs.launchpad.net/maas/+bug/1534013 . Could you please provide me a solution?
<narindergupta> siva, i see you are using wily. That's fine. Since I cannot reproduce the issue, will you please attach the logs requested in bug https://bugs.launchpad.net/maas/+bug/1534013 ? Looks like it needs debugging.
<narindergupta> please look for comment number #2
<narindergupta> also in our deployment we are not facing this issue in any of the labs. It could be specific to your environment, like bridges on your jump host etc...
<siva> narindergupta: Ok. Thank you. I will debug more.
<narindergupta> siva, thanks
<narindergupta> siva, also i think you need to look at the console of the VM to see whether pxe boot completes. If not, it might be the network configuration of your jumphost that is breaking it. I can only have a look if you can give me access to your environment.
<roaksoax> siva: Unfortunately, same as last time, I cannot provide you with a solution when I don't even know what the issue is. Which is why I requested the cloud-init.log and cloud-init-output.log. Those are found in the commissioning environment. For that, while you commission, you need to enable the SSH option and log into the server when it becomes available
<roaksoax> also, Wily is End of Life already
<siva> narindergupta: sorry. I am not authorized to give access as per my company policy. I have recorded video. I will send that. Please tell me how to send a video  here.
<narindergupta> siva, will you please look into roaksoax's suggestion, as he is the product owner of MAAS and can advise you on MAAS.
<siva> roaksoax: Yes. I tried as you said to get cloud-init.log. But i am unable to ssh to nodes from MAAS controller..
<roaksoax> siva: the other option, is to have a console log. Do you have a console log ?
<siva> narindergupta: Ok. thank you.
<siva> roaksoax: node console log?
<siva> roaksoax: I have a small video. It shows MAAS doing the PXE boot; then, while running the cloud-init script, it throws a network communication failure. Maybe commissioning failed because of that.
<roaksoax> siva: yes, if it is network communication issue then that's the reason why commissioning failed
<siva> roaksoax: I think that due to that network issue the client node is not able to get internet access. The commission process then waits up to 20 minutes and reports that commissioning failed.
<roaksoax> siva: yeah because while it access commissioning, i'm guessing it never finishes and maas waits for it to fail
<roaksoax> siva: ok, so you need to look into why it is not accessing ?
<roaksoax> siva: so are you under a proxy ? the machines don't have access to the internet ? have you set an upstream DNS server ?
<siva> roaksoax: No. I didn't configure any proxy .
<roaksoax> siva: so is MAAS directly connected to the internet and NOT the machines it manages ?
<siva> roaksoax: In the MAAS server I have two NICs. One NIC is on the public network. The second NIC I configured as an external router, and then I enabled IP forwarding. Then I installed VirtualBox and created one VM from it. This VM has one NIC.
<siva> roaksoax: And I configured the VirtualBox VM interface with a bridged network
<siva> roaksoax: Now I installed MAAS on the VirtualBox VM. Now the maas controller and its clients are in the same network
<siva> roaksoax: With single interface MAAS will manage both DHCP and DNS.
<roaksoax> siva: right, to me it seems like an issue with the forwarding or similar
<roaksoax> siva: have you tried putting an upstream DNS ?
<roaksoax> siva: under the settings page ?
<mup> Bug #1605756 changed: [ juju2 beta11 ] system show up in juju status as pending but there is no attempt to deploy in maas <oil> <oil-2.0> <juju-core:Invalid> <MAAS:Invalid> <https://launchpad.net/bugs/1605756>
<mup> Bug #1605756 opened: [ juju2 beta11 ] system show up in juju status as pending but there is no attempt to deploy in maas <oil> <oil-2.0> <juju-core:Invalid> <MAAS:Invalid> <https://launchpad.net/bugs/1605756>
<siva> roaksoax: No, I didn't try that. Can I try it now?
<roaksoax> siva: yeah you can try that. For example use 8.8.8.8 as upstream DNS and see if that fixes any resolution errors
<siva> roaksoax: OK. thank you.
<mup> Bug #1606264 opened: dns_servers and ntp_servers in _ConfigureDHCP are defined as strings <tech-debt> <MAAS:Triaged> <https://launchpad.net/bugs/1606264>
<siva> roaksoax: Hi. I tried with 8.8.8.8 , 8.8.4.4 and local name server . But it was showing the same network communication failed issue. And In maas.log showing  Marking node failed: Node operation 'Commissioning' timed out after 0:20:00.
<roaksoax> siva: so /etc/maas/clusterd.conf , what does maas_url= say ?
<siva> roaksoax: maas_url: http://192.168.10.2:5240/MAAS .
<siva> roaksoax: I tried first with   maas_url: http://localhost:5240/MAAS. Next instead of localhost I configured Cluster controller with IP.
<mup> Bug #1606281 opened: omapi code does not deal well with ipv6 addresses <MAAS:Incomplete> <https://launchpad.net/bugs/1606281>
<mup> Bug #1606329 opened: MAC address accessible in web UI, but UI is VERY obscure <MAAS:New> <https://launchpad.net/bugs/1606329>
<roaksoax> siva: is 192.168.0.2 an IP address that the machines can access ?
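Part of roaksoax's question can be checked mechanically: does the host in maas_url sit inside the network the commissioning machines are on? A sketch using the maas_url value siva quoted; the /24 network is an assumption, and being in the same subnet does not by itself prove the machines can reach the controller.

```python
import ipaddress
from urllib.parse import urlparse


def maas_url_host_in_network(conf_line, network):
    """Return True if the host in a 'maas_url: <url>' line lies inside network."""
    _, _, url = conf_line.partition(":")
    host = urlparse(url.strip()).hostname
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames such as 'localhost' cannot be checked against a subnet.
        return False
    return address in ipaddress.ip_network(network)


line = "maas_url: http://192.168.10.2:5240/MAAS"
# 192.168.10.0/24 is an assumed network for the machines.
print(maas_url_host_in_network(line, "192.168.10.0/24"))
```

A `localhost` maas_url, like the one siva tried first, is rejected by this check, since a commissioning machine would resolve it to itself rather than to the controller.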
<roaksoax> /win/win 3
<mup> Bug #1606386 opened: Apt Upgrade fails for maas-region-controller 2.0.0~rc2+bzr5156  to 2.0.0~rc3+bzr5171 <MAAS:New> <https://launchpad.net/bugs/1606386>
<mup> Bug #1606386 changed: Apt Upgrade fails for maas-region-controller 2.0.0~rc2+bzr5156  to 2.0.0~rc3+bzr5171 <MAAS:New> <https://launchpad.net/bugs/1606386>
#maas 2016-07-26
<mup> Bug #1565467 changed: twistd3 crashed with PermissionError in touch(): [Errno 13] Permission denied: '/etc/maas/rackd.conf' <amd64> <apport-crash> <xenial> <MAAS:Expired> <twisted (Ubuntu):Expired> <https://launchpad.net/bugs/1565467>
<siva> roaksoax: No. Machines are accessing IPs from the Dynamic ranges Starts from  192.168.10.3 to 192.168.10.70
<siva> roaksoax: Hi. As you said yesterday, Wily is already End of Life. Can I still use it for MAAS? If I can, please provide me the exact documentation link for configuring MAAS on Wily.
<siva> roaksoax: I want to do a POC for deploying OpenStack on Ubuntu [using MAAS, Juju, Landscape and Autopilot].
<siva> roaksoax: On which version of Ubuntu have you deployed MAAS successfully, without the issues I am facing right now on Ubuntu 15.10 [Wily]? Have you deployed MAAS on Wily successfully?
<siva> roaksoax: actually, I followed the link below to configure MAAS: http://www.ubuntu.com/download/cloud/install-openstack-with-autopilot. That reference mentions ubuntu 14.04 [trusty]. Is it written entirely for ubuntu 14.04 [trusty]?
<mup> Bug #1606499 opened: NTP servers must be IPv4 for use with ISC dhcpd <ntp> <MAAS:Triaged> <https://launchpad.net/bugs/1606499>
<mup> Bug #1606508 opened: Failover peers must be IPv4 for use with ISC dhcpd <ntp> <MAAS:Triaged> <https://launchpad.net/bugs/1606508>
<roaksoax> siva: we no longer test nor support MAAS running on top of wily since it is end of life
<roaksoax> siva: we support maas on trusty (1.9) or on xenial (2.0)
<siva> roaksoax: OK. Thank you. Please suggest me the correct reference link for MAAS on Xenial(2.0)? For trusty (1.9) I will use http://www.ubuntu.com/download/cloud/install-openstack-with-autopilot.
<siva> roaksoax: (or) for xenial(2.0) also same link http://www.ubuntu.com/download/cloud/install-openstack-with-autopilot sufficient?
<roaksoax> siva: autopilot doesn't support MAAS/Juju 2.0 yet
<roaksoax> siva: so if you are using auto-pilot it wont work
<siva>  roaksoax: Thank you.
#maas 2016-07-27
<KpuCko> hello, can somebody help me with "maas latest meta-data instance-id failed bad status code 404"
<KpuCko> ok, i've fixed the problem: the ip of the maas controller used for pxe boot was set to the wrong value
<KpuCko> with dpkg-reconfigure i fixed the issue
<mup> Bug #1606948 opened: [trunk] Filter scrolling for according always shows despite only having 1 item <MAAS:New> <MAAS 2.0:New> <MAAS trunk:New> <https://launchpad.net/bugs/1606948>
<mup> Bug #1606999 opened: reporting messages can slow down operations greatly <cloud-init:New> <curtin:New> <MAAS:New> <https://launchpad.net/bugs/1606999>
<Braven36> hello
<Braven36> is there anyone in the channel?
<roaksoax> Braven36: for the most part, yes! :)
<Braven36> Cool
<Braven36> Do you work with maas a lot?
<roaksoax> Braven36: i do
<roaksoax> Braven36: how can I help?
<Braven36> Curtin, I do not think I fully understand how it works
<Braven36> I thought last_command: would run when the OS boots up for the first time. But that does not appear to be the case
<Braven36> Do you know of any good documentation for curtin?
<Braven36> Did I lose you roaksoax
<roaksoax> Braven36: sorry,testing RC3 atm
<roaksoax> Braven36: well, late_commands in curtin run during the installation process, but after the image has been copied onto the disk and everything is ready to continue
<roaksoax> Braven36: if you see in /etc/maas/preseeds/curtin_userdata you can see sections that say "curtin", "in-target", "--", "XYZ"
<roaksoax> Braven36: that is telling curtin, do XYZ inside the installed OS
<Braven36> I am trying to run a disk config script.
<roaksoax> Braven36: do you have an example ?
<Braven36> of the script or what I put in the file
<roaksoax> Braven36: well that depends on what you want to do
<roaksoax> if you want to run a shell script
<roaksoax> you'd do stuff similar to "curtin", "in-target", "==",
<roaksoax> you'd do stuff similar to "curtin", "in-target", "==", "sh", "-c", "whatever-shell-you-want (e.g. run a script, do stuff like command1 && command2, etc"
<Braven36> I did that and saw the script run. Then the installation failed.
<Braven36> I see you are using "==" instead of --
<roaksoax> Braven36: that's a typo on my side
<roaksoax> Braven36:another way of doing it: driver_00_key_get: curtin in-target -- sh -c "/bin/echo -en '{{key_string}}' > /tmp/maas-{{driver['package']}}.gpg"
<Braven36> There's not much documentation out there about it. So I feel like I'm missing something that everyone else knows.
<Braven36> so in the example above. You do not need to use the ","
<roaksoax> Braven36: i think that's just two ways of doing the same thing
<roaksoax> Braven36: /etc/maas/preseeds/curtin_userdata would give you examples
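The two forms roaksoax shows, a list of arguments and a single command string, can be compared with a small sketch. The key names `example_list` and `example_string` and the script path `/root/setup-disks.sh` are hypothetical. JSON is printed because it is also valid YAML and needs no third-party library. The working assumption, to be checked against the curtin docs, is that curtin executes a list directly and hands a string to a shell.

```python
import json
import shlex

script = "/root/setup-disks.sh"  # hypothetical script inside the installed OS

late_commands = {
    # List form: each element is one argument, so no shell quoting is needed.
    "example_list": ["curtin", "in-target", "--", "sh", "-c", script],
    # String form: a single command line that a shell splits into arguments.
    "example_string": f"curtin in-target -- sh -c {shlex.quote(script)}",
}

# Both forms describe the same argument vector.
assert shlex.split(late_commands["example_string"]) == late_commands["example_list"]

print(json.dumps({"late_commands": late_commands}, indent=2))
```

The list form is the safer choice when the command contains quotes or `&&`, because nothing is re-parsed before `sh -c` receives its argument.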
<roaksoax> smoser: do you have curtin docs already available ?
<smoser> there are some curtin docs at http://people.canonical.com/~rharper/curtin/#
<roaksoax> smoser: thanks!
<roaksoax> smoser: can cloud-init do this too? http://people.canonical.com/~rharper/curtin/topics/reporting.html#configuration ?
<Braven36> interesting. I have not seen this yet
<smoser> cloud-init does not post files
<smoser> but  yes, it reports
<roaksoax> smoser: ah bummer, I was looking to be able to post cloud-init-output back to maas for storage
<Braven36> how does cloud-init fit into Maas?
<roaksoax> Braven36: the ephemeral environment (commissioning, deploying) is driven by cloud-init
<Braven36> Cloud-init reads the curtin file to deploy the server?
<Braven36> I see a lot of info on #cloud-config files. I also see #cloud-config at the top of the curtin file
<roaksoax> Braven36: cloud-init drives curtin
<Braven36> late-command is ran after the first reboot.
<roaksoax> Braven36: late_command is run during the install. It mimics what d-i does
<Braven36> I am new to this stuff. So curtin is all I know. is there a way to set up a script to run on boot
<Braven36> and have it run only once
<roaksoax> Braven36: you can modify the late_command to write whatever you need on your installed system, so that on first boot it runs what you need to run
<roaksoax> that's the usual way you can customize things
<Braven36> maybe it failed somewhere else... my script sets up disks for a hadoop data node.
<roaksoax> Braven36: what do you mean by setup disk ?
<Braven36> there are 4 disks in the server. 1 is for the OS (/dev/sda) and the others are for data
<Braven36> The fstab needs to have the uuid of the disk.
<Braven36> or I mean Partition
<roaksoax> Braven36: why don't you partition the disk in MAAS ?
<roaksoax> Braven36: are you using MAAS 2.0 ?
<Braven36> no.. I am using maas 1.0
<Braven36> hardware did not support 2.0 =(
<mup> Bug #1607112 opened: [2.0rc2] package installation fails when default gateway is not set <MAAS:New> <https://launchpad.net/bugs/1607112>
#maas 2016-07-28
<mup> Bug #1607345 opened: Collect all logs needed to debug curtin/cloud-init for each deployment <oil> <cloud-init:New> <MAAS:Triaged> <https://launchpad.net/bugs/1607345>
<mup> Bug #1607112 changed: [2.0rc2] package installation fails when default gateway is not set <MAAS:Fix Released by andreserl> <https://launchpad.net/bugs/1607112>
<mup> Bug #1607112 opened: [2.0rc2] package installation fails when default gateway is not set <MAAS:Fix Released by andreserl> <https://launchpad.net/bugs/1607112>
<mup> Bug #1576427 opened: [1.9.1] Commissioning didn't discover storage devices <MAAS:In Progress> <https://launchpad.net/bugs/1576427>
<voidspace> hey, I used to be able to commission KVM nodes without *having* to set power information - starting them manually
<voidspace> that's no longer possible
<voidspace> is that by design, or a regression?
<voidspace> or both...
<mup> Bug #1607403 opened: [trunk] WebUI unavailable due to new version of AngularJS <MAAS:New> <angular.js (Ubuntu):New> <https://launchpad.net/bugs/1607403>
<roaksoax> voidspace: by design. There's a "manual" power type
<voidspace> roaksoax: it's annoying :-( at least I know how to fix it now though, thanks.
<nturner> I'm seeing a strange issue where one of my hardware nodes often fails commissioning due to "Failed to power on node ..." --- however, IPMI power control appears to be working fine for this node. Whenever I click on "check now", power status updates. But it doesn't always update automatically.
<nturner> Does this sound familiar to anyone?
<nturner> (When I say power status doesn't always update automatically, I mean if I leave the UI up, this node often shows stale power state, but as soon as I click on "check now", the state is updated correctly. It's like it isn't polling correctly. But only this node.)
<nturner> Currently running 2.0.0~rc3+bzr5180-0ubuntu2~16.04.1
<nturner> But this behavior has been like this all month.
<zeestrat> nturner: Yeah, I am seeing the same thing here, both with rc2 and rc3. Should probably open up a bug.
<roaksoax> narindergupta: seems like BMC issues
<roaksoax> err
<roaksoax> nturner:
<roaksoax> nturner: seems like BMC issues
<roaksoax> nturner: like flaky BMC's
<narindergupta> roaksoax, may i know the bug number
<narindergupta> nturner, do you know which hardware?
<roaksoax> narindergupta: you only deal with hp right ?
<roaksoax> narindergupta: or dell too ?
<narindergupta> roaksoax, i deal in HP, lenovo, NEC, Ericsson
<narindergupta> roaksoax, no dell
<roaksoax> narindergupta: k thanks!
<nturner> roaksoax: Hmm, wouldn't you expect to see power control errors if maas tried to update the power status and the BMC didn't respond?
<nturner> I don't see that. And every time I initiate a power status check by clicking on "check now", it works immediately.
<roaksoax> nturner: probably because maas does retry and your BMC says yes ?
<roaksoax> nturner: what does rackd.log  tell you ? does it tell you about any errors ?
<roaksoax> nturner: note that the power, in the UI, might not update immediately
<roaksoax> nturner: it may take a few more seconds to update
<nturner> roaksoax: Where can I find rackd.log? On the maas controller?
<nturner> roaksoax: here's an excerpt from the event log for this node: https://paste.ubuntu.com/21328195/
<nturner> It looks to me like it concluded the deploy failed before it queried the BMC...
<roaksoax> nturner: Queried node's BMC - Power state queried: on (Wed, 27 Jul. 2016 19:03:41)
<nturner> I found rackd.log on the maas controller. There are backtraces related to this. Will paste...
<roaksoax> cool
<roaksoax> nturner: also, are you using rc3
<nturner> https://paste.ubuntu.com/21328779/
<nturner> Yes, I upgraded today. Though the log entry I just posted was from yesterday
<nturner> https://paste.ubuntu.com/21328867/ is the same thing today, after upgrade
<nturner> roaksoax: That first event list is in reverse-chronological order; that "Queried node's BMC..." message is after the rest.
<roaksoax> newell_: ^^
<roaksoax> newell_: is there any debug logging that would shed some more light on that?
<nturner> I'd be happy to turn up tracing somewhere and try to reproduce this.
<newell_> roaksoax: there isn't debug logging on the rack afaik
<roaksoax> newell_: where can we inject some debug info to debug the above ?
<roaksoax> nturner: I'll look through the code to try to find a good place to insert a piece of code to debug
<newell_> roaksoax: well it is weird because this is being thrown from the base class
<newell_> roaksoax: is this with trunk?
<roaksoax> newell_: 2.0rc3
<roaksoax> newell_: but where can we find the output of the power command
<roaksoax> newell_: and whether a power command succeeds
<roaksoax> newell_: and whether we are retrying
<newell_> roaksoax: in the perform_power method that is seen in the traceback
<newell_> this is where the retries happen
<newell_> perform_power ultimately calls the "actual" power driver to perform either off, on, query, etc.
<newell_> ah, I have never seen this error actually thrown in practice but if you look in provisioningserver.drivers.power.__init__ perform_power that error is thrown at the end if the state never transitions
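A minimal sketch of the retry loop newell_ describes, for readers following along. It is not the MAAS source: the function and parameter names are hypothetical, and the waiting-policy values are an assumption chosen because they sum to the 35 seconds mentioned later in this conversation. Check the installed provisioningserver/drivers/power/__init__.py for the real tuple and logic.

```python
import time


class PowerError(Exception):
    """Raised when the node never reaches the desired power state."""


# Assumed values for illustration only; they sum to 35 seconds.
WAITING_POLICY = (1, 2, 2, 4, 6, 8, 12)


def perform_power(set_power, query_power, desired,
                  policy=WAITING_POLICY, sleep=time.sleep):
    """Issue the power change, wait, then query; repeat for each wait time.

    set_power and query_power are hypothetical callables standing in for
    the driver-specific commands (for example, calls out to ipmipower).
    """
    for wait in policy:
        set_power(desired)
        sleep(wait)
        state = query_power()
        if state == desired:
            return state
    # Every wait in the policy elapsed and the state never matched.
    raise PowerError(
        "BMC never transitioned to %s after %d seconds."
        % (desired, sum(policy))
    )
```

With a policy like this, a node that needs longer than the sum of the waits to report "on" fails even though it powers on eventually, so lengthening the tuple only helps if the query itself returns the right state.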
<roaksoax> newell_: honestly, we need to add some debugging log there
<newell_> nturner: what type of power driver are you using for this?
<roaksoax> nturner: if you try to do it just once, how many of "provisioningserver.drivers.power.PowerError: Failed to power 4y3h8d. BMC never transitioned from off to on." do you see... can you please share the logs just for 1 attempt ?
<nturner> newell_: this node is using LAN_2_0 [IPMI 2.0]
<nturner> roaksoax: Sure, will do one now.
<roaksoax> nturner: if you could apply this: http://paste.ubuntu.com/21335012/
<roaksoax> nturner: to /usr/lib/python3/dist-packages/provisioningserver/..../__init__.py
<roaksoax> nturner: restart maas-rackd
<roaksoax> nturner: and retry it would be great
<nturner> roaksoax: sure, will do
<nturner> naturally, the last 2 deploys succeeded without incident =\
<nturner> Ah, a failure! Logs coming...
<nturner> roaksoax: newell_: here's syslog output (with verbose named entries elided): https://paste.ubuntu.com/21336662/
<nturner> Based on this tracing, I wonder if the problem is simply that this system is sometimes slow to power on.
<newell_> nturner: yeah your hardware seems to be slow
<nturner> Is it possible to change those timeout values or increase the number of retries?
<newell_> nturner: you can if you edit the python file manually
<roaksoax> yeah there's no setting to do it
<nturner> yeah, in there now...
<roaksoax> but strange... it takes more than 24+ seconds to power on ?
<newell_> 35 seconds to be exact
<newell_> nturner: do you have physical access to the hardware?
<newell_> nturner: if so, does it really take longer than 35 seconds to power on?
<nturner> well, it does seem odd.
<newell_> nturner: I am assuming that at some point the power actually does turn on
<nturner> when I looked at the maas UI after seeing this in the logs, the power state shows as on
<newell_> nturner: and does the node boot up at that point?
<nturner> I can try again with more retries and will monitor the UI a little closer during that time
<nturner> yeah, the node does boot
<newell_> nturner: if you edit the DEFAULT_WAITING_POLICY tuple in /usr/lib/python3/dist-packages/provisioningserver/drivers/power/__init__.py, save the file, and restart rackd as mentioned above, you will have more retries
<roaksoax> newell_: maybe there's a bug in the UI where it is saying it is ON when it is not and it is failing to check ....
<nturner> looks like you actually have to edit ipmi.py ... running now
<nturner> OK...
<nturner> so I configured it to retry many times after 12 seconds each...
<nturner> and after 5 or so, I opened the UI and clicked "check now" -- the UI showed Power on within a second
<nturner> Meanwhile, the "Successfully checked power state, checking if it is desired... off" continued in the log
<nturner> seems like there are 2 paths being taken here
<roaksoax> nturner: what if you manually turn off your BMC, and then click on "Check Power"
<nturner> roaksoax: newell_: What happens when I click "check now" in the UI? It doesn't appear to enter that maas.drivers.power logic (no traces seen).
<nturner> It shows as off.
<newell_> nturner: okay so if the BMC is off and you check the UI, that is working
<nturner> If I click on "check now" every second after deploying, it shows On after about 12 or so seconds.
<newell_> nturner: when you click on check now it should query the BMC via the power_query method in ipmi.py
<nturner> newell_: thanks, i added some tracing there.... it's very odd; i see /usr/sbin/ipmipower being run with the same arguments when I click the 'check now' in the UI and during the polling during deploy
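One way to compare the two paths outside MAAS is to run ipmipower by hand and parse its answer. The sketch below assumes the usual FreeIPMI output shape of "host: on" or "host: off" and a typical set of arguments (-D, -h, -u, -p, --stat); the exact arguments MAAS passes are not shown in this log, so treat query_power as illustrative only.

```python
import subprocess


def parse_power_state(output):
    """Return 'on', 'off', or 'unknown' from 'host: state' style output."""
    for line in output.splitlines():
        # Split on the last colon so hosts containing colons still parse.
        host, sep, state = line.rpartition(":")
        if sep and state.strip() in ("on", "off"):
            return state.strip()
    return "unknown"


def query_power(host, user, password, driver="LAN_2_0"):
    """Run ipmipower once and parse the reported state (assumed arguments)."""
    result = subprocess.run(
        ["ipmipower", "-D", driver, "-h", host,
         "-u", user, "-p", password, "--stat"],
        capture_output=True, text=True, timeout=30,
    )
    return parse_power_state(result.stdout)
```

Calling query_power in a loop with timestamps right after a deploy starts would show whether the BMC itself reports "off" for a while or whether only the polling path sees stale results.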
<nturner> but again, it polls for many cycles, and then i click 'check now' and the UI instantly shows Power on
<mup> Bug #1607560 opened: switching rackd.conf maas_url back to localhost has no effect <MAAS:New> <https://launchpad.net/bugs/1607560>
 * nturner has to head out for a bit; more fun tomorrow!
<newell_> nturner: so just to be clear, when you increase the wait times, it all works fine correct?
<nturner> newell_: no
<nturner> The only thing that seems to work reliably in this particular node is clicking 'check now' in the UI
<nturner> the polling seems to somehow get different results
<nturner> which seems really weird
<newell_> nturner: can you file a bug for this?
<newell_> nturner: if you would be so kind, also list what type of hardware you are using
<nturner> newell_: can do; will probably do this tomorrow
<nturner> sure, no problem.
<newell_> nturner: thanks!
<nturner> thanks for the help today
<nturner> now that I know where the relevant code is, I can have some fun doing a little further debug too =)
<newell_> nturner: np :)
#maas 2016-07-29
<mup> Bug #1607638 opened: [2.0rc3] the term "reserve" in IP Ranges is misleading <MAAS:New> <https://launchpad.net/bugs/1607638>
<KpuCko> hello, is there any reason why MAAS doesn't see any disks or partitions on the nodes?
<KpuCko> But if I ssh to the node I can see the devices, which are /dev/sda and /dev/sdb
<siva> roaksoax: Hi. I am getting a MAAS node power error while enlisting. Issue details: http://paste.openstack.org/show/543872/.  Please provide me some solution for this issue.
<Braven36> hello
<Braven36> Is there a way to add column to node page of maas?
<roaksoax> siva: the error is self explanatory: Power state could not be queried: 192.168.2.162: connection timeout
<roaksoax> siva: it cannot connect to your BMC
<siva> roaksoax: Thank you
<siva> roaksoax: resolved that BMC connection issue
<siva> roaksoax: Hi. I have taken 3 Dell machines as MAAS client nodes. Of the 3 client nodes, one deployed with Ubuntu 14.04, and I can SSH into it.
<siva> roaksoax: But for two nodes deployment failed.
<siva> roaksoax: Issue details : http://paste.openstack.org/show/543945/
<mup> Bug #1607980 opened: After using Rescue Mode for Broken machine, it is Mark Fixed, then Deploy doesn't work. <MAAS:In Progress by newell-jensen> <https://launchpad.net/bugs/1607980>
#maas 2016-07-31
<mup> Bug #1608203 opened: apt install maas fails without default gateway <MAAS:New> <https://launchpad.net/bugs/1608203>
<cafaroo> I'm having some problems bootstrapping juju on a ProLiant SL170z. I get a failed deployment and the following error "An error occured handling 'sda': OSError - [Errno 6] No such device or address: '/dev/sda2'". Has anyone had the same error?
<cafaroo> I think this may be related to https://bugs.launchpad.net/curtin/+bug/1562249 (Had this problem with cciss before with other hardware).
#maas 2017-07-24
<mup> Bug #1706196 opened: when using pods, during juju bootstrap maas creates vm in zone default when a vm already exists in another zone <MAAS:New> <https://launchpad.net/bugs/1706196>
#maas 2017-07-25
<BlackDex> Hello there
<BlackDex> i'm trying to delete a subnet
<BlackDex> but i get the following error
<BlackDex> Subnet matching query does not exist.
<BlackDex> maas has been upgraded to the latest version
<pmatulis> BlackDex, how did you try to delete the subnet?
<BlackDex> yes
<BlackDex> tried via CIDR and ID
<BlackDex> both same message
<BlackDex> also the same via the WebGUI
<pmatulis> BlackDex, you should paste your exact CLI command
<BlackDex> maas maas subnet delete 20
<BlackDex> maas maas subnet delete 10.10.216.0/23
<BlackDex> both give that same error message
<roaksoax> BlackDex: what version are you using ?
<roaksoax> BlackDex: for example if you "http://10.10.10.10:5240/MAAS/#/subnet/4" is the '4' the ID you are using ?
<BlackDex> 2.2.1 at the moment
<BlackDex> it has been upgraded
<roaksoax> BlackDex: strange, I just tried to delete subnets both ways and it worked fine
<BlackDex> if i go to that link it works
<BlackDex> i see the subnet
<BlackDex> and all the info
<BlackDex> i even tried to change the subnet
<BlackDex> that works
<BlackDex> but then deleting it just doesn't
<BlackDex> ow wow
<BlackDex> i now change the whole subnet
<BlackDex> so from 10.10.216.0/24 to 10.10.217.0/24
<BlackDex> and then delete works :S
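A side note on the CIDRs quoted above: the delete command used 10.10.216.0/23 while the later messages mention 10.10.216.0/24. Whether that mismatch caused the "does not exist" error is not established in this log; the sketch below only shows, with Python's standard ipaddress module, how to check that the two notations name different networks. The helper name is hypothetical.

```python
import ipaddress


def same_network(cidr_a, cidr_b):
    """True only if both CIDR strings describe exactly the same network."""
    return ipaddress.ip_network(cidr_a) == ipaddress.ip_network(cidr_b)


wide = ipaddress.ip_network("10.10.216.0/23")
narrow = ipaddress.ip_network("10.10.216.0/24")
moved = ipaddress.ip_network("10.10.217.0/24")

# The /23 spans both /24 networks, but it is not equal to either of them.
covers_both = narrow.subnet_of(wide) and moved.subnet_of(wide)
```

A lookup by CIDR string is only expected to match when the prefix length is the same as the one stored for the subnet.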
<BlackDex> oke have to flow
<BlackDex> fly
<BlackDex> thx for the help :)
<mup> Bug #1706423 opened: [2.x] No pods available when no images import <pod> <MAAS:Triaged> <https://launchpad.net/bugs/1706423>
<mup> Bug #1706196 changed: when using pods, during juju bootstrap maas creates vm in zone default when a vm already exists in another zone <cdo-qa> <juju:New> <MAAS:Invalid> <https://launchpad.net/bugs/1706196>
<mup> Bug #1706438 opened: pods don't have availability zones <cdo-qa> <MAAS:New> <https://launchpad.net/bugs/1706438>
<mup> Bug #1436887 changed: Failure: twisted.web.error.Error: 503 Service Unavailable <twisted> <MAAS:Invalid> <https://launchpad.net/bugs/1436887>
<mup> Bug #1706196 opened: when using pods, during juju bootstrap maas creates vm in zone default when a vm already exists in another zone <cdo-qa> <juju:New> <MAAS:New> <https://launchpad.net/bugs/1706196>
<mup> Bug #1706458 opened: MAAS tries to allocate VMs from pods even when resources are exhausted <cdo-qa> <pod> <MAAS:Triaged by newell-jensen> <https://launchpad.net/bugs/1706458>
<mup> Bug #1706459 opened: All auto-allocated VMs go to the first pod <cdo-qa> <pod> <MAAS:New> <https://launchpad.net/bugs/1706459>
<mup> Bug #1706461 opened: wishlist: allow turning off auto-create-on-allocate for pods <cdo-qa> <MAAS:New> <https://launchpad.net/bugs/1706461>
#maas 2017-07-26
<mup> Bug #1706459 changed: All auto-allocated VMs go to the first pod <cdo-qa> <pod> <MAAS:Won't Fix> <https://launchpad.net/bugs/1706459>
<BlackDex> Hello.
<BlackDex> I'm having an issue with MAAS
<BlackDex> i have a maas node which has a tagged interface for the PXE network
<BlackDex> but the nodes are all untagged on that vlan
<BlackDex> i'm not able to configure the nodes to have an untagged network in the PXE network, since the MAAS node doesn't show it as an untagged network
<roaksoax_> BlackDex: what version of MAAS are you using ?
<BlackDex> 2.2
<BlackDex> 2.2.1 to be exact
<roaksoax_> BlackDex: so your fabric-untagged does not hold the subnet, or the other way around ?
<BlackDex> on the maas node it doesn't indeed
<BlackDex> the maas node has it as a tagged interface
<BlackDex> while the nodes have them untagged
<roaksoax_> BlackDex: ok, maybe i'm not following, but what you are saying is MAAS is providing dhcp on a tagged interface (e.g eth0.1234)
<BlackDex> roaksoax_: that is correct
<BlackDex> and the nodes are connected via 1234 vlan untagged
<roaksoax_> BlackDex: right, so IIRC, that is now a supported scenario (it wasn't before)
<roaksoax_> BlackDex: what is the issue you are experiencing
<BlackDex> Well
<BlackDex> The issue is
<BlackDex> That, if i want to configure a node
<BlackDex> so, create a bond for example
<BlackDex> and have that bond connected as an untagged interface, i cant select that
<BlackDex> i need to select a vlan, which makes it tagged.
<BlackDex> if i do not configure the network myself, so no bonding whatsoever, it seems to work
<BlackDex> but creating a bond on a node, which is physically connected to vlan 1234 UNTAGGED, i can't configure that within the node config, because MAAS only allows me to have that configured as a TAGGED interface
<roaksoax_> BlackDex: right, so in the model, is the subnet in question under the untagged vlan or under the vlan X ?
<BlackDex> what do you mean with model?
<roaksoax_> BlackDex: under the Subnets tab
<BlackDex> no, that subnet is under a tagged network
<BlackDex> if i try to change that i'm not able to allow dhcp on it so that PXE will work
<roaksoax_> BlackDex: i'll try to reproduce but AFAIK, we should support the scenario now
<roaksoax_> (we didn't use to support that, we should support that now though)
<BlackDex> well, i have an other network on that untagged
<BlackDex> so maybe that causes the problem
<BlackDex> on the maas node that is
<BlackDex> i currently have it fixed by switching the tagged/untagged on the maas node. So that the PXE network is untagged, but not really the best solution in my opinion
<roaksoax_> BlackDex: yeah, I don't have something to reproduce right now, but I do know one in the team has reproduced it
<roaksoax_> i just need to double check with him
<BlackDex> oke cool :)
<tai271828_> hi, I failed to deploy node in another subnet with external DHCP. is it possible for MAAS to deploy node in another vlan with an external dhcp? my maas is 2.1.x.
<pmatulis> tai271828_, does the node show up with a status of 'New'? that is, does 'enlistment' work?
<tai271828_> more details: my maas controller is in vlan A with subnet A and an external DHCP A, deployment works fine. I try to deploy a node in vlan B with subnet B and an external DHCP B (for IPs in subnet B) and my controller seems to lose the IP of the node (enlist and commission work)
<tai271828_> pmatulis: enlistment works, the status could be "Ready"
<tai271828_> pmatulis: I could get info from the node (looks like the info from the cloud-init of the node) by watching the log of MaaS server, log: rsyslog/node/messages
<pmatulis> tai271828_, 'Ready' implies the node has also 'commissioned'. did you tell MAAS to 'commission' that node?
<tai271828_> pmatulis: yes, I tell MaaS to commission that node, and then it is from "New" to "Ready". and then I began to deploy it
<pmatulis> tai271828_, in that case, then, yes, you should be able, in principle, to 'deploy' with an external DHCP
<tai271828_> pmatulis: even if the external DHCP is in a different subnet from the one the MaaS controller is in? : (
<pmatulis> yep
<tai271828_> pmatulis: ok. then my vlan of the subnet does not show "there is an external DHCP". Something may be related to it.
<tai271828_> pmatulis: thanks for your comments
<pmatulis> tai271828_, you will need to set the node's IP assignment mode to 'DHCP' and have a 'reserved IP range' covering the external DHCP range to make sure MAAS never uses such IPs (not necessary if both ranges can never conflict)
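A quick way to sanity-check pmatulis's advice is to confirm that the reserved range configured in MAAS fully contains the external DHCP server's pool. The sketch below uses only the standard ipaddress module; the function name and the example addresses (picked from the 10.101.47.x subnet mentioned in this conversation) are illustrative assumptions, not values from anyone's actual setup.

```python
import ipaddress


def range_covers(reserved_start, reserved_end, dhcp_start, dhcp_end):
    """True if [reserved_start, reserved_end] contains [dhcp_start, dhcp_end]."""
    r0, r1, d0, d1 = (
        ipaddress.ip_address(a)
        for a in (reserved_start, reserved_end, dhcp_start, dhcp_end)
    )
    if r0 > r1 or d0 > d1:
        raise ValueError("range start must not be after range end")
    return r0 <= d0 and d1 <= r1


# Hypothetical pools: the reserved range is wider than the DHCP pool.
ok = range_covers("10.101.47.50", "10.101.47.200",
                  "10.101.47.50", "10.101.47.150")
```

If the check returns False, some addresses handed out by the external DHCP server could also be picked by MAAS for auto or static assignment.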
<tai271828_> pmatulis: yeah, actually I did in this way.
<tai271828_> pmatulis: at the beginning of the deployment process, it looks good. I can see the dhcp server receives requests and offers the expected IPs. In the meantime, the MaaS controller "lost" the IP configuration of the node. It went from "10.101.47.52(DHCP)" to "unconfigured" (10.101.47.52 is an IP in subnet B)
<tai271828_> I guess my networks of two subnets may block something if two subnets with their own external DHCP is a working case...
<mup> Bug #1706696 opened: maas auto-allocates machines via pods even when zone is included in constraints <cdo-qa> <MAAS:New> <https://launchpad.net/bugs/1706696>
<mup> Bug #1706700 opened: [UI, Tables] When only one row action is available, expose it in the UI instead of having a contextual menu <ui> <MAAS:Triaged by karlwaghorn-moyce> <https://launchpad.net/bugs/1706700>
<mup> Bug #1706723 opened: Failed talking to pod: int() argument must be a string, a bytes-like object or a number, not 'NoneType' <MAAS:New> <https://launchpad.net/bugs/1706723>
<mup> Bug #1706763 opened: [2.2] Add ability to add tags to a pod that can be used on allocation constraints <pod> <MAAS:Triaged> <https://launchpad.net/bugs/1706763>
<mup> Bug #1706811 opened: [2.2,2.3] Avoidable power errors occur when there is a mismatch among racks <MAAS:New> <https://launchpad.net/bugs/1706811>
#maas 2017-07-27
<mup> Bug #1706811 changed: [2.2,2.3] Avoidable power errors occur when there is a mismatch among racks <MAAS:Won't Fix> <https://launchpad.net/bugs/1706811>
#maas 2017-07-28
<mup> Bug #1707216 opened: MAAS should allow recognizing enlisting nodes by BMC credentials rather than just MAC <cdo-qa> <cpe> <MAAS:New> <https://launchpad.net/bugs/1707216>
<mup> Bug #1707218 opened: Unable to Deploy PowerEdge R730xd Server <MAAS:New> <https://launchpad.net/bugs/1707218>
<mup> Bug #1699555 changed: [Test] Bug to test new jenkins lander bug resolver <MAAS:Fix Released by blake-rouse> <https://launchpad.net/bugs/1699555>
<mup> Bug #1707253 opened: pxe boot fails with http enabled <cdo-qa> <MAAS:New> <https://launchpad.net/bugs/1707253>
<danielsouzasp> Hello guys, I have a doubt about network configuration for nodes
<danielsouzasp> I'm using a custom config, it works fine for Ubuntu, but does not work for CentOS 6
<danielsouzasp> I'm facing this Error:Custom network configuration only supported on Ubuntu. Using OS default configuration.
<danielsouzasp> Did someone face this problem already?
#maas 2018-07-23
<naturalblue> hey guys/gals
<naturalblue> I have set up my MAAS system with multiple vlans and they are working perfectly. I have told maas to use the maas builtin proxy for apt etc... but for some reason when an LXC is created the proxy details do not appear to be getting sent down. I am using juju to deploy the LXCs
<naturalblue> has anyone any idea where or what to check? Would I need to reload the spaces in juju
<roaksoax> naturalblue: that's probably because you need to set the proxy in juju itself
<naturalblue> roaksoax: yea. i set the proxy for the model but whcih helped but unfortunately different lxd's use different spaces and there seems to be no way to set the proxy per instance on deploy
<naturalblue> roaksoax: thanks
<naturalblue> ^i set the proxy for the model which helped but
#maas 2018-07-24
<BlackDex> Hello ppl. Is it possible to have juju bootstrapped within an lxd container for usage with maas?
<BlackDex> at this moment i'm still creating a virtual machine which i add to maas and use that
<BlackDex> but i think a lxd container is better performance wise
<BlackDex> also, adding maas in a container would be a good option then i think right?
<BlackDex> just to keep everything clean, and easy backup/snapshotable
<likelion> Can anyone explain why adding a second rack controller (2.4.0~beta2 (6865-gec43e47e6-0ubuntu1)) completely kills MaaS with 'Error while calling DescribePowerTypes: RPC connection timed out to rack controller'
<jing__> Hi all, trying to install MAAS 2.4 via numerous methods (snap, package) all to no avail... Installation goes well enough, but I can't seem to get the GUI to actually run and listen on the URL I specify
#maas 2018-07-25
<hakonw> Is it possible to select which bootloader to use with maas?
<roaksoax> hakonw: what do you mean ?
<fallenour> o/
<fallenour> Im having some pretty serious issues with maas after my update to bionic, is anyone else experiencing the same problems as me?
<fallenour> my juju instances aren't downloading apt packages anymore, even though I know they should be able to; DNS isn't working anymore, even though IP-based internet connectivity works; and now my maas page won't load via web browser. It loaded with nginx, but loaded broken.
<roaksoax> fallenour: dpkg -l | grep maas
<roaksoax> fallenour: what's the version it is showing ?
<fallenour> roaksoax: 2.4.0~beta2-6865
<fallenour> roaksoax: In better news, the page is loading again, this time I have to connect to it directly via port, but it still loads, which means its primary config is laying around somewhere, but the proxy isnt forwarding me well.
<roaksoax> fallenour: sudo add-apt-repository ppa:maas/stable -y
<roaksoax> fallenour: sudo apt-get update && sudo apt-get dist-upgrade
<roaksoax> fallenour: that will install 2.4.0
<roaksoax> fallenour: 2.4.0 is in the process of being SRU'd into Xenial
<roaksoax> sd/xenial/bionic
<roaksoax> s/xenial/bionic/g
<roaksoax> fallenour: i think that 2.4.0 will resolve your issues
#maas 2018-07-26
<fallenour_> o/
<fallenour_> hey guys I'm having a lot of issues with maas, and I'm super confused at this point
<fallenour_> maas can resolve any system, even itself; machines deployed by maas can resolve any machine except maas; containers deployed on those machines can ping any IP but can't resolve any system by DNS, though they can ping 127.0.0.53, which is set in resolv.conf
<fallenour_> all settings are DHCP
<fallenour_> hey is anyone else having issues with dns resolving with maas?
<roaksoax> fallenour: did you upgrade to 2.4.0 final rather than sticking with beta2 ?
<roaksoax> fallenour: on the dns side, are the containers IP addresses managed by MAAS ? (e.g. maas allows resolving against subnets it knows about)
<roaksoax> fallenour: e.g. https://bugs.launchpad.net/maas/+bug/1774206
<robottalk> Hi all. Just having a bit of a time trying to get a private ssh key into our preseed configuration. Tried a bunch of late_command methods (ssh_keys, write_files, custom sh), following maas, cloud-init, preseed docs. Nothing working very well. Any thoughts on best practice here or suggestions for getting this to work? We need the private key to then access a private git repo via ssh and pull down some manifests... Thank you!
<roaksoax> robottalk: did you do a late_command "in-target" ?
<robottalk> yes, last night we left off with something like   ssh_key_copy: curtin in-target -- sh -c "/bin/cp --preserve=mode /home/conductor/.ssh/maas_deploy /target/root/.ssh/id_rsa; /bin/cp --preserve=mode /home/conductor/.ssh/maas_deploy.pub /target/root/.ssh/id_rsa.pub"
<robottalk> but i wasn't sure at that point if the maas server (host called conductor) mount points were accessible
<robottalk> so that was a last effort
<robottalk> write_files worked
<robottalk> but the key was broken because it didn't respect new lines
<roaksoax> yeah I was gonna suggest write_files would be better
<robottalk> that was like
<robottalk> write_files:
<robottalk>   f1:
<robottalk>     path: /root/.ssh/maasdeploy
<robottalk>     content: "-----BEGIN RSA PRIVATE KEY-----
<robottalk> ...
<robottalk> -----END RSA PRIVATE KEY-----"
<robottalk>     permissions: '0600'
<robottalk> but again the key seems broken since it's just a long run on string
<roaksoax> robottalk: https://pastebin.ubuntu.com/p/2Fsw8CyrZN/
<roaksoax> i would do that
<robottalk> the pipe :-)
<robottalk> lemme try it
<robottalk> thanks!
<roaksoax> robottalk: for example, i did this myself: https://pastebin.ubuntu.com/p/vytRwCKd2x/
<roaksoax> robottalk: that correctly wrote the script provided by content
<roaksoax> as yaml would do weird stuff with the quotes
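The pastes are not reproduced in this log, so the following is an assumption about what "the pipe" refers to: a YAML literal block scalar (content: |), which preserves line breaks where a quoted string does not. The Python sketch below renders such a write_files entry as text, mirroring the key names robottalk posted; the helper name and its defaults are hypothetical, and the key material is a placeholder.

```python
def literal_block_entry(name, path, content, permissions="0600", indent=2):
    """Render a write_files entry whose content uses a YAML literal block.

    Each line of content is indented one level deeper than the
    'content: |' key, so YAML keeps the line breaks intact.
    """
    pad = " " * indent
    lines = [
        "write_files:",
        f"{pad}{name}:",
        f"{pad * 2}path: {path}",
        f"{pad * 2}permissions: '{permissions}'",
        f"{pad * 2}content: |",
    ]
    lines += [f"{pad * 3}{line}" for line in content.splitlines()]
    return "\n".join(lines) + "\n"


# Placeholder key body; a real key has many more lines.
example = literal_block_entry(
    "f1",
    "/root/.ssh/maasdeploy",
    "-----BEGIN RSA PRIVATE KEY-----\n...\n-----END RSA PRIVATE KEY-----",
)
```

Printing example shows the key spread over separate indented lines under content: |, which is the shape that avoids the "long run-on string" problem described above.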
<robottalk> awesome thanks - testing now
<robottalk> fingers crossed
<robottalk> hehe
<roaksoax> robottalk: i guess it worked ?
<robottalk> it just came up
<robottalk> the write_files with pipe worked
<robottalk> :-)
<robottalk> thanks so much!
<robottalk> just checking something in the cloud-init-output
<robottalk> seeing a "Failed to start Apply the settings specified in cloud-config"
<robottalk> and cloud-config.service isn't running
<robottalk> but i just started looking into it...
<robottalk> just this error which seems new ... but not sure of it's impact just yet
<robottalk> ERROR: ld.so: object 'libeatmydata.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
<roaksoax> that libeatmydata.so shouldn't really be a concern I think
<robottalk> roaksoax: it seems ok ... is the cloud-config.service needed once the machine is deployed? just looking around and that error and it doesn't seem critical but not sure if that service is required ... seems important?
<robottalk> roaksoax: thanks for your help! going to move forward with this method - seems to do the trick - wish i had thought about the pipe yesterday! hahaha thanks again!
<roaksoax> robottalk: no prob, glad it works
<xygnal> roaksoax: we are having problems with commissions just 'stalling' and never finishing.
<xygnal> roaksoax: just gathering some numbers but it feels like 40-50% of the time we submit a commission it fails.
<xygnal> and it's not a per-model or per-serial problem, we'll repeat the same commission 4 times and the 4th time it will go through.
<roaksoax> xygnal: what version?
<roaksoax> xygnal: and where does it get stuck ?
<roaksoax> xygnal: does it not expire after X minutes and gets marked failed commissioning?
<xygnal> 2.4
<xygnal> does not expire
<xygnal> I am also seeing disk service times of up to 1.0s on tur
<xygnal> on the local disk
<xygnal> but this is a regiond without db inside
<fallenour_> roaksoax: Im not sure what you mean
<fallenour_> roaksoax: im on 2.4.0-beta2
<fallenour_> if there is a more stable version of maas, id rather be on that, even if it means reverting back a version or two. I need maas to work, otherwise life will be quite miserable for me.
<xygnal> roaksoax:  it's as if the commission jobs *die* and MAAS loses track of them/does not restart the jobs.
<xygnal> They remain in 'COMMISSIONING' status *until we abort*
<roaksoax> fallenour: ppa:maas/stable is where 2.4.0 final is available
<roaksoax> fallenour: you should upgrade
<roaksoax> xygnal: how long after is that you abort ?
<roaksoax> xygnal: and where do they get stuck ? e.g. do they pxe boot into the ephemeral environment ? do they get stuck running a script ?
<xygnal>  roaksoax:  hours to days
<roaksoax> xygnal: i would file a bug, but the important thing here is determining where it is getting stuck
<xygnal> roaksoax: there appears to be a pxe boot element, as I believe we've seen it receive the DHCP offer but be told it had nothing for it to boot
<roaksoax> xygnal: how many machines are you booting at the same time ?
<xygnal> roaksoax: doesnt matter if we do 1 at a time or 10 at a time, we see the same result
<roaksoax> xygnal: right, well again i would file a bug, attach logs and such and we can try to look and determine whats wrong
<xygnal> I can verify that the actual load on the MAAS box itself is very low pretty much all the time, so it does not appear to be a performance bottleneck.  at least not outside of app code.
<xygnal> just /var/log/maas/ logs?
<roaksoax> xygnal: yeah, and the events for the given machine
<xygnal> well yes, a hostname and some chronological information
<xygnal> as well :)
<xygnal> on a lighter note, we just made a patch to your python code that handles code 64 SMART errors with a pass instead of a fail
<xygnal> (means that the log had errors, but there are not active errors)
<xygnal> we run into it a lot with nodes that have FPDMA errors from bad cabling.  The SMART Log will never clear, so in order to get MAAS to pass commission we had to force over-ride each time in the past.
<xygnal> apparently the Munin product had a similar patch/problem in the past
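For context on "code 64": smartctl's exit status is a bitmask, and bit 6 (value 64) means the device error log contains records of errors. The sketch below is not xygnal's patch, which is not shown in this log; it is a hypothetical helper illustrating one way to treat that single bit as non-fatal while still failing on every other bit.

```python
# smartctl exit status bit 6: the device error log contains error records.
ERROR_LOG_HAS_RECORDS = 1 << 6  # 64


def smart_passed(returncode, ignore_mask=ERROR_LOG_HAS_RECORDS):
    """Treat the run as passed if no bits remain after masking ignored ones."""
    return (returncode & ~ignore_mask) == 0
```

With this check, an exit status of 64 passes, while 8 (disk failing) or 72 (disk failing plus logged errors) still fails.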
<xygnal> its like.. 2 lines of code change.  we could submit a PR to your code or.. with how small it is.. would you prefer a bug report + attached fix?
<roaksoax> bug report + attached fix is better to keep track of stuff
<roaksoax> or you can attach a diff to the bug report as well
<xygnal> its tiny so i'll just bug report + the lines + a linked article about why
<roaksoax> cool
<mup> Bug #1783889 opened: COMMISSION S.M.A.R.T Tests fail unnecessarily  on code 64 (past log entries) <MAAS:New> <https://launchpad.net/bugs/1783889>
<xygnal> roaksoax: both submitted.  the COMMISSION one is private as it contains logs with IPs inside.
<xygnal> roaksoax: I think you are going to be disappointed with the info, as the logs show no problems between starting commission and our manual abort.
<roaksoax> xygnal: have the link for the commission on?
<roaksoax> one*
<xygnal> 1783892
<roaksoax> xygnal: do you have the rackd.log from where this machine is to be pxe booting ?
<xygnal> can get that
<roaksoax> xygnal: and the events of the machine in the failed attempt
<roaksoax> maas admin events query hostname=<machine>
<roaksoax> xygnal: also this would be helpful on a failed run /var/log/maas/rsyslog/<machine-name>/<date>/messages
<xygnal> hm... i can't find the hostname in rackd log on any rack servers
<xygnal> would it not be listed with its hostname in rackd?
<roaksoax> xygnal: by pxe mac
<roaksoax> or by pxe ip starting from 2.5
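Since the machine has to be found by its PXE MAC, a search that tolerates both colon and dash separators and either letter case avoids false negatives. This is a small sketch with hypothetical helper names; how rackd.log actually formats MAC addresses is not shown in this log, so treat the separator handling as a precaution and not as a description of the log format.

```python
import re


def mac_pattern(mac):
    """Build a case-insensitive regex matching the MAC with ':' or '-'."""
    octets = re.split(r"[:-]", mac.strip().lower())
    if len(octets) != 6 or not all(
        re.fullmatch(r"[0-9a-f]{2}", octet) for octet in octets
    ):
        raise ValueError("not a MAC address: %r" % mac)
    return re.compile("[:-]".join(octets), re.IGNORECASE)


def matching_lines(lines, mac):
    """Return the log lines that mention the given MAC address."""
    pattern = mac_pattern(mac)
    return [line for line in lines if pattern.search(line)]
```

Typical use would be matching_lines(open("/var/log/maas/rackd.log"), "52:54:00:ab:cd:ef"), run on each rack controller that could have served the machine.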
<roneth> Question: How to configure IP on the deployed OS?
<roneth> Just deployed CentOS7 and the deployed OS was assigned an IP which seemed to be from one of the subnet defined in MAAS.
<roneth> How can it be deployed such that a desired IP is assigned to the deployed OS ?
<roaksoax> roneth: you mean you want the machine to have a specific ip rather than the auto-assigned ip ?
<roneth> YEs.
<xygnal> you have to set the network config for the node to Static IP instead of Auto-Assign
<xygnal> and you have to put the IP you want in ahead of time, before deployment
<xygnal> which means an admin has to do it :)
<roneth> I am the admin. : ) Can you please elaborate on how I should be setting up the network config?
<roaksoax> roneth: for example, go to the UI, go to the specific machine, go to the interfaces section
<roaksoax> roneth: and edit the specific interface,
<roaksoax> change it from 'Auto assign' to 'Static assign' and select the IP you want
<xygnal> roaksoax: I dont see the mac address in question listed at all in rackd.log on our controllers.   using grep -i and :'s
<roneth> Ah! So, that is "commissioning stage"
<roaksoax> roneth: commissioning will obtain an IP from the MAAS run DHCP. Once the machine is 'Ready' you can change the ip you want for deployment
<roaksoax> xygnal: that's strange... that would mean the machine is not pxe booting ?
<xygnal> it's correct that we have not seen them able to boot PXE.  We see them get DHCP, but the PXE reply seemed to be invalid
<roneth> So, after commissioning and after the machine becomes "Ready", I would have to edit the interface to be "static assign" --- correct?
<roaksoax> roneth: correct
<xygnal> roneth: yep
<xygnal> roneth: beware that you need to be running a recent version of the CentOS image if you need that IP to be written as a static config file in CentOS, as opposed to static DHCP
<xygnal> roneth: and if you only run with the static DHCP method, beware that if you bring rack controllers down for 10 minutes all of those boxes will go offline
<roneth> xygnal: thanks a lot. A different question on "subnet": How does MAAS determine what subnet to pick per commissioning?
<roaksoax> roneth: whichever subnet you have connected to the vlan where the machine PXE boots
<roaksoax> roneth: and for which you have enabled DHCP by creating a dynamic DHCP range
<roneth> ah! So, that depends on the underlying real physical set up ?
<roaksoax> roneth: yes
<roaksoax> roneth: well, depends
<roaksoax> roneth: a vlan can have as many subnets as you want really
<roaksoax> you can just add any subnet , the machine could get any ip, pxe boot, etc
<roaksoax> but you will need a gateway to access the external network
<roaksoax> to get packages and stuff
<roaksoax> so that will be dependent on that
<xygnal> unless you proxy them through maas
<xygnal> no?
<xygnal> most of our rackd subnets are NOT internet accessible
<roaksoax> xygnal: yes, but it still has a gateway
<roneth> So, If I have 3 VLANs defined in MAAS, how can I tell MAAS what vlan to pick per commissioning or deployment?
<roaksoax> roneth: you will always need a physical network to do the pxe booting. After that you can configure pretty much anything you want
<roneth> So, it depends on the underlying VLAN then, I can't really tell MAAS what VLAN to pick. (?)
<xygnal> roaksoax: internal gateway to traverse internal subnets, sure. my bad.
<xygnal> roaksoax: let me know if we need to turn up any debugging on next commission
<roaksoax> xygnal: yup, will look at it tomorrow as i'm eod
<xygnal> ty same here
<roaksoax> roneth: so basically, maas will have an interface that's facing the machines for PXE boot, right? which could be connected to any vlan configured on the switch port or a trunk vlan or a management vlan
<roaksoax> whatever you may wanna call it
<roaksoax> lets say that's eth1 - 10.10.10.2 in MAAS
<roaksoax> in the maas model that would be, say fabric-0 - untagged - 10.10.10.0/24
<roaksoax> machines that PXE boot are connected to the same vlan
<roaksoax> so you would need to go to the 'untagged' vlan of fabric-0, enable DHCP and create a dynamic range on 10.10.10.0/24 so that machines that PXE boot get an IP from MAAS on that subnet
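The example above (rack interface at 10.10.10.2, untagged VLAN carrying 10.10.10.0/24) can be sanity-checked with a short sketch. The range bounds 10.10.10.100 to 10.10.10.200 are assumed for illustration, and `check_dynamic_range` is a hypothetical helper, not part of MAAS; it only checks that a proposed dynamic range lies inside the subnet and does not cover the controller's own address.

```python
import ipaddress


def check_dynamic_range(subnet, start, end, reserved=()):
    """Return a list of problems with a proposed dynamic DHCP range."""
    net = ipaddress.ip_network(subnet)
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    problems = []
    if first not in net or last not in net:
        problems.append("range is outside the subnet")
    if first > last:
        problems.append("range start is after range end")
    for addr in map(ipaddress.ip_address, reserved):
        # An address already in use (e.g. the rack interface) must not be leased.
        if first <= addr <= last:
            problems.append(f"range overlaps reserved address {addr}")
    return problems


# Assumed range on the subnet from the discussion; 10.10.10.2 is the MAAS interface.
ok = check_dynamic_range("10.10.10.0/24", "10.10.10.100", "10.10.10.200",
                         reserved=["10.10.10.2"])
bad = check_dynamic_range("10.10.10.0/24", "10.10.10.1", "10.10.10.50",
                          reserved=["10.10.10.2"])
print(ok, bad)
```

Machines that PXE boot on that VLAN would then receive a lease from the dynamic range during commissioning, which is why the interface can only be switched to a static assignment once the machine is 'Ready'.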
<roneth> roaksoax: that make sense. Thank you.
<roneth> I have a script that configures the bonding and assigns an IP to the interface.... How can I pass the script to the deployed machine?
<roneth> I tried "user_data" but it doesn't seem like it ever gets run.
#maas 2018-07-27
<mup> Bug #1783912 opened: [2.5]Failure: twisted.internet.error.MulticastJoinError: (b'\xe0\x00\x00v', b'\n\xf5\x88\x06', 98, 'Address already in use') <MAAS:Triaged> <https://launchpad.net/bugs/1783912>
<mup> Bug #1783913 opened: [2.5, 2.4] Errors about over commit ratios unclear <pod> <MAAS:Triaged> <https://launchpad.net/bugs/1783913>
<fallenour_> roaksoax: I think I have a solution. Id like to build an HA system for my maas solution, and I have the extra hardware. If i use the other system that is currently xenial, and put them in HA mode, and then tear down the first one, which is messed up due to bionic, will that fix my issue? Also, how will that impact my maas deployment?
<robottalk> Hello again.. I was here earlier with this write_files issue and needing a pipe before my content. It was definitely working after I followed this syntax https://pastebin.ubuntu.com/p/2Fsw8CyrZN/
<robottalk> but now I'm getting an error no matter what I try
<robottalk> 2018-07-27 01:43:34 (219 KB/s) - '/dev/null' saved [2]  /bin/sh: 1: setup_user: not found curtin: Installation failed with exception: Unexpected error while running command. Command: {'setup_user': {'content': '-----BEGIN RSA PRIVATE KEY-----
<robottalk> formatting here is a bit weird
<robottalk> but the main issue seems to be it calling /bin/sh 1: setup_user:
<robottalk> and not finding it
<robottalk> here's a link to the late_commands section of my curtin_userdata
<robottalk> https://pastebin.ubuntu.com/p/rvnfkp9JyP/
<robottalk> Any insights anyone? :-)
<robottalk> Ok I think I figured it out
<robottalk> looks like my indentation was too far over
<robottalk> i moved write_files to column 1 and it worked... I had it indented under late_commands before...
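The indentation problem described above can be illustrated with the parsed form of the config. This is a sketch under assumptions: it assumes each entry under late_commands is expected to be a command given as a string or a list, so a write_files mapping nested there ends up being treated as a command, which matches the `Command: {'setup_user': ...}` error quoted earlier. The entry names, path, and contents are placeholders, and `find_misplaced_late_commands` is a hypothetical helper, not part of curtin.

```python
def find_misplaced_late_commands(config):
    """Return names under late_commands whose value is not a string or a list."""
    late = config.get("late_commands", {})
    return [name for name, cmd in late.items()
            if not isinstance(cmd, (str, list))]


# write_files indented under late_commands: the mapping is read as a command.
broken = {
    "late_commands": {
        "write_files": {"setup_user": {"path": "/tmp/example", "content": "..."}},
    },
}

# write_files moved to column 1, i.e. a top-level key next to late_commands.
fixed = {
    "write_files": {"setup_user": {"path": "/tmp/example", "content": "..."}},
    "late_commands": {"example_cmd": ["sh", "-c", "echo done"]},
}

print(find_misplaced_late_commands(broken))
print(find_misplaced_late_commands(fixed))
```

Loading the real curtin_userdata with a YAML parser and running the same check would flag the nested key before a deployment is attempted.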
<robottalk> herm
<fallenour>  ok so jumping into oblivion here with this idea, but I have two rack controllers now, one on xenial, one on bionic. Will this remediate the DNS issue that MAAS/Bionic currently has?
<fallenour> another issue ive noticed, it wont let me change the domain of either of my controllers, region or otherwise from the default from the UI. Can anyone provide guidance?
<dumdumdum> hi everyone, need some help with maas
<dumdumdum> there is this error message "ubuntu bionic is configured as the commissioning release but it is unavailable in the configured streams"
<dumdumdum> i can connect to maas.io correctly and synced bionic's image
<dumdumdum> couldnt find much information about this
<dumdumdum> anyone over here has any idea?
<blizzow> Is there a way to set up my / partition as a zfs mirror?
#maas 2018-07-29
<alatrix> join
<robottalk> Hi all, having a heck of a time getting late_commands to do what I need. I have write_files working but I'm adding some custom sh cmds after that and nothing seems to happen... any help would be greatly appreciated! Here's an excerpt from my curtin_userdata file..
<robottalk> https://pastebin.ubuntu.com/p/xrK4drfyyM/
<robottalk> also if anyone can direct me to some useful logs that would maybe help dig up the issue - i've looked at /var/log/cloud-init.log and /var/log/cloud-init-output.log
<robottalk> not much in there pertinent to late_commands that I can find
#maas 2019-07-22
<mup> Bug #1837381 opened: Unable to deploy Precise on a KVM node with MAAS 2.3.5 <MAAS:New> <https://launchpad.net/bugs/1837381>
<mup> Bug #1837381 changed: Unable to deploy Precise on a KVM node with MAAS 2.3.5 <MAAS:New> <https://launchpad.net/bugs/1837381>
<mup> Bug #1837385 opened: Subnets -> VLAN Summary: unexpected comma displayed after rack controllers list <MAAS:New> <https://launchpad.net/bugs/1837385>
<mup> Bug #1837381 changed: Unable to deploy Precise on a KVM node with MAAS 2.3.5 <MAAS:Invalid> <https://launchpad.net/bugs/1837381>
<mup> Bug #1837381 opened: Unable to deploy Precise on a KVM node with MAAS 2.3.5 <MAAS:Invalid> <https://launchpad.net/bugs/1837381>
#maas 2019-07-23
<SaiHarshaK> hello, is there a guide for contributors? i only found one for documentation
<roaksoax> i'll let you/win 2
#maas 2019-07-24
<mup> Bug #1837042 changed: Failed to deploy Disco on Power8 with MAAS 2.5.2 <curtin:Invalid> <MAAS:Invalid> <The Ubuntu-power-systems project:Invalid by maas> <https://launchpad.net/bugs/1837042>
<mup> Bug #1837042 opened: Failed to deploy Disco on Power8 with MAAS 2.5.2 <curtin:Invalid> <MAAS:Invalid> <The Ubuntu-power-systems project:Invalid by maas> <https://launchpad.net/bugs/1837042>
#maas 2019-07-25
<mup> Bug #1837903 opened: preseed uses region controller as http_proxy where rack controller proxy should be used <http> <preseed> <proxy> <rack> <region> <MAAS:New> <https://launchpad.net/bugs/1837903>
#maas 2019-07-26
<mup> Bug #1838091 opened: MaaS reports wrong OS upon release of a node <MAAS:New> <https://launchpad.net/bugs/1838091>
<mup> Bug #1838091 changed: MaaS reports wrong OS upon release of a node <MAAS:New> <https://launchpad.net/bugs/1838091>
<mup> Bug #1837210 opened: power driver UI - IPMI - save settings disabled if field "power mac" is empty  <ipmi> <ui> <MAAS:New> <https://launchpad.net/bugs/1837210>
<mup> Bug #1837903 opened: preseed uses region controller as http_proxy where rack controller proxy should be used <http> <preseed> <proxy> <rack> <region> <MAAS:New> <https://launchpad.net/bugs/1837903>
<mup> Bug #1838117 opened: feature: support "forward only" and "forward first" options <cdo-qa> <MAAS:New> <https://launchpad.net/bugs/1838117>
#maas 2019-07-27
<mup> Bug #1830017 changed: Commission changes Redfish username and password <MAAS:Expired> <https://launchpad.net/bugs/1830017>
<mup> Bug #1830017 opened: Commission changes Redfish username and password <MAAS:Expired> <https://launchpad.net/bugs/1830017>
<atdprhs> is there any way that i can configure maas to create a machine with a bridged network?
<atdprhs> i am talking about composing a machine with KVM
<atdprhs> so i have 2 subnet masks, one of them is a virtual network, and the other one is on the same subnet mask as lan, i managed to configure the bridge network for kvm to make the machine have the same ip as lan
<atdprhs> but in maas, it's by default that it uses the virtual subnet mask
#maas 2020-07-20
<bradm> is there any way to configure lxd when you're using juju on a maas cloud?  particularly I'm thinking of targeting storage for lxd containers, rather than using a pre-defined directory.  I could just mount the space on /var/lib/lxd with maas, but seems like something that should be configurable
<mup> Bug #1888014 changed: PXE boot fails on Raspbeery Pi 4 missing recovery.elf/start.elf <MAAS:Invalid> <https://launchpad.net/bugs/1888014>
<sakharkar> Hi all, until now I was using MAAS 2.4 with esxi using virsh type to commission the nodes. Now we have upgraded to MAAS 2.6 and the node commissioning is failing with error: "failed to detect a valid IP address from"
<sakharkar> did 2.6 stop supporting esxi?
<sakharkar> can anyone help me out with this?
<d0ugal> sakharkar: AFAIK we support it still. The best place for support is on discourse. https://discourse.maas.io/
<mup> Bug #1888278 opened: Rotation/Retention of rsylog files in /var/log/maas/rsyslog/<host>/<date>/messages <feature-request> <MAAS:New> <https://launchpad.net/bugs/1888278>
<mup> Bug #1888284 opened: Configuration for logs in /var/log/maas/rsyslog/ to use SYSID instead of HOSTNAME <feature-request> <MAAS:New> <https://launchpad.net/bugs/1888284>
<mup> Bug #1888015 changed: Not possible to add region controller <MAAS:Invalid> <https://launchpad.net/bugs/1888015>
<mup> Bug #1888015 opened: Not possible to add region controller <MAAS:Invalid> <https://launchpad.net/bugs/1888015>
<mup> Bug #1886103 changed: Wrong instructions for registering a rack-controller with snap <ui> <MAAS:In Progress> <https://launchpad.net/bugs/1886103>
<mup> Bug #1886103 opened: Wrong instructions for registering a rack-controller with snap <ui> <MAAS:In Progress> <https://launchpad.net/bugs/1886103>
<mup> Bug #1888315 opened: MAAS does not set type=VLAN for CentOS & RHEL <MAAS:New> <https://launchpad.net/bugs/1888315>
<mup> Bug #1888315 changed: MAAS does not set type=VLAN for CentOS & RHEL <MAAS:New> <https://launchpad.net/bugs/1888315>
#maas 2020-07-21
<sakharkar> d0ugal: Thanks for the help. But as per the doc I checked for virsh.py file and only "/usr/share/sosreport/sos/plugins/virsh.py" is available in my setup. So is there any way to install virsh drivers?
<mup> Bug #1881892 changed: CPU usage bar not shown in UI for KVMs <ui> <MAAS:Fix Released by caleb-ellis> <https://launchpad.net/bugs/1881892>
<d0ugal> sakharkar: Sorry, I only really idle here for the bug stream :) Lots of far more knowledgeable people are looking at discourse and can help there.
<d0ugal> but I don't really know how the sosreport is related to this issue :)
<sakharkar> d0ugal: sure I am looking into it. Thanks for the help
<d0ugal> sakharkar: np, btw where in the docs does it say to look for virsh.py?
<d0ugal> sakharkar: Did you see this? https://discourse.maas.io/t/maas-2-6-breaks-virsh-esxi-powertype/734
<sakharkar> d0ugal: yup
<sakharkar> d0ugal: I edited it and now power management started working but commissioning is failing
<d0ugal> I see
<abyss> I updated maas from 2.6 to 2.8 and now I get error: warn] Failed to change the boot order to PXE 10.2.2.2: /snap/maas/7808/usr/sbin/ipmi-chassis-config: 15: exec: /usr/sbin/ipmi-config: not found. In 2.6 everything worked fine. Any ideas?
<d0ugal> abyss: Hi, this channel isn't really active. Can you open a bug with more details or maybe ask a question on discourse.maas.io
<d0ugal> abyss: if you can include `snap info maas` in the bug that would be useful
<mup> Bug #1789650 opened: Servers set to boot from disk after MAAS installation <curtin:Incomplete> <MAAS:New> <https://launchpad.net/bugs/1789650>
<abyss> d0ugal: ok, thank you for info
<mup> Bug #1809939 opened: dhcp snippet create fail when dhcp subnet is relayed <MAAS:Confirmed for sombrafam> <https://launchpad.net/bugs/1809939>
<mup> Bug #1888413 opened: "Add a rack controller" instructions are wrong <MAAS:New> <https://launchpad.net/bugs/1888413>
<mup> Bug #1888419 opened: MAAS UI is unclear when unable to MAAS deploy machines <ui> <MAAS:New> <MAAS 2.8:New> <https://launchpad.net/bugs/1888419>
#maas 2020-07-22
<mup> Bug #1888413 changed: "Add a rack controller" instructions are wrong <MAAS:New> <https://launchpad.net/bugs/1888413>
<mup> Bug #1888413 opened: "Add a rack controller" instructions are wrong <MAAS:New> <https://launchpad.net/bugs/1888413>
<mup> Bug #1888536 opened: named.conf.options.inside.maas reverts to default <MAAS:New> <https://launchpad.net/bugs/1888536>
#maas 2020-07-23
<stelucz> Hi, I just deployed a machine with MAAS. I kept 13 disks in the available state without any partition or format setting, except one root disk. I expected that if a disk is not configured in MAAS then it will be left untouched by MAAS, however it seems that all disks were erased by MAAS during deployment. What is the proper way to tell MAAS not to touch
<stelucz> other disks except the ones I want to configure? Thanks
<mup> Bug #1888673 opened: maas package doesn't install in focal lxd container <MAAS:New> <snapd:New> <https://launchpad.net/bugs/1888673>
<mup> Bug #1888718 opened: Rack controllers are not stateless <MAAS:New> <https://launchpad.net/bugs/1888718>
<mup> Bug #1888719 opened: Improve migration of rack controllers <MAAS:New> <https://launchpad.net/bugs/1888719>
<mup> Bug #1888720 opened: Provide scripts for each phase of deployment <MAAS:New> <https://launchpad.net/bugs/1888720>
<mup> Bug #1888721 opened: Allow editing default scripts <MAAS:New> <https://launchpad.net/bugs/1888721>
<mup> Bug #1888721 changed: Allow editing default scripts <MAAS:New> <https://launchpad.net/bugs/1888721>
<mup> Bug #1888721 changed: Allow editing default scripts <MAAS:Won't Fix> <https://launchpad.net/bugs/1888721>
#maas 2020-07-24
<damaya> Hello, I'm running into this issue, https://discourse.maas.io/t/maas-2-5-wont-pxe-qlogic-10g-nic-hp-model-nc523sfp/313
<damaya> I have a Broadcom NetXtreme II BCM57810 10 Gigabit Ethernet adapter, and we are running MAAS 2.8.1 (8567-g.c4825ca06). Every time we try to enlist a node using Legacy PXE boot, we run into this issue where as soon as PXE switches to HTTP, everything stops.
<damaya> The screenshot here is exactly what is happening, https://discourse.maas.io/uploads/default/original/1X/3d89884a818429aa1942b94cf683d30700e8a634.jpg
<damaya> That screenshot came from this thread: https://discourse.maas.io/t/pxe-not-working-after-upgrading-to-2-5-0/384/23
<damaya> Is there any way to simply disable HTTP PXE and go back strictly to TFTP?
<damaya> Or is there some sort of workaround here? I've searched launchpad / discourse and could only find unanswered threads.
<mup> Bug #1888884 opened: Rescue fails on maas 2.3.6/2.3.5 with boot-kernel failed no such file or directory <MAAS:New> <https://launchpad.net/bugs/1888884>
<mup> Bug #1888884 changed: Rescue fails on maas 2.3.6/2.3.5 with boot-kernel failed no such file or directory <MAAS:New> <https://launchpad.net/bugs/1888884>
#maas 2020-07-25
<mup> Bug #1888946 opened: image auto-sync options <MAAS:New> <https://launchpad.net/bugs/1888946>
#maas 2020-07-26
<mup> Bug #1888993 opened: 2.8 Can't edit curtin files inside maas snap  <MAAS:New> <https://launchpad.net/bugs/1888993>
<mup> Bug #1888993 changed: 2.8 Can't edit curtin files inside maas snap  <MAAS:New> <https://launchpad.net/bugs/1888993>
