[01:28] <mup> Bug #1703712 opened: [2.3] IP addresses not listed for devices when they are 'Dynamic' <MAAS:Triaged> <https://launchpad.net/bugs/1703712>
[01:28] <mup> Bug #1703713 opened: [2.3] Devices don't have a link from the DNS page <MAAS:Triaged> <MAAS 2.2:Triaged> <https://launchpad.net/bugs/1703713>
[04:23] <mup> Bug #1690154 changed: block-curtin-poweroff doesn't work <MAAS:Expired> <https://launchpad.net/bugs/1690154>
[04:38] <mup> Bug #1690154 opened: block-curtin-poweroff doesn't work <MAAS:Expired> <https://launchpad.net/bugs/1690154>
[04:44] <mup> Bug #1690154 changed: block-curtin-poweroff doesn't work <MAAS:Expired> <https://launchpad.net/bugs/1690154>
[10:53] <magicaltrout> hello folks, maas question
[10:53] <magicaltrout> I can commission a node
[10:53] <magicaltrout> but then when I deploy it doesn't PXE boot
[10:53] <magicaltrout> any idea where to prod around?
[10:56] <magicaltrout> hmm, it does if I restart
[10:56] <magicaltrout> maybe it comes to life too quickly
[11:24] <magicaltrout> scrap that next question:
[11:24] <magicaltrout> https://gist.github.com/buggtb/21517dda16ba6b91fd40ebe2ae3763c8
[11:24] <magicaltrout> curtin fails miserably when deploying a server
[11:24] <magicaltrout> not sure why though, the disk seems to exist earlier in the boot
[11:32] <magicaltrout> weird if i keep looking at /dev just prior to curtin running
[11:32] <magicaltrout> the partition vanishes
[11:38] <magicaltrout> centos images install though
[11:38] <magicaltrout> just not xenial hwe
[12:33] <magicaltrout> hmm
[12:33] <magicaltrout> disabling EFI seemed to fix that
[12:33] <magicaltrout> now grub fails...
[13:04] <roaksoax> magicaltrout: do you mean that when you deploy, the machine installs fine, reboots, but it never boots into the OS ?
[13:05] <roaksoax> magicaltrout: or you mean this is due to the curtin issue you found above ?
[13:13] <magicaltrout> still something curtin based roaksoax
[13:13] <magicaltrout> i get this
[13:13] <magicaltrout> https://gist.github.com/buggtb/c3406963c12646115891ad41c92c569f
[13:13] <magicaltrout> but a manual install works fine. I guess it finds the mmc partitioning a bit funky
[13:30] <roaksoax> magicaltrout: yeah, I'd suggest you file a bug on the curtin project specifying the versions of curtin/maas in use, with the output of maas <user> machines get-curtin-config <system_id>
[13:38] <magicaltrout> thanks roaksoax will do, posted to maas-devel on the off chance someone has an idea as well
[13:43] <xygnal> mpontillo: roaksoax: still waiting on info for how to set by boot & root device in MAAS without having to do it manually. (cannot set during commission or deploying states)
[13:43] <xygnal> s/by/my
[13:43] <xygnal> though I think it's just root device that needs 'ready' state
[14:44] <mup> Bug #1703845 opened: rackd should lower recheck interval on disconnect <MAAS:In Progress by blake-rouse> <https://launchpad.net/bugs/1703845>
[14:53] <mup> Bug #1703845 changed: rackd should lower recheck interval on disconnect <MAAS:In Progress by blake-rouse> <https://launchpad.net/bugs/1703845>
[15:02] <mup> Bug #1703845 opened: rackd should lower recheck interval on disconnect <MAAS:In Progress by blake-rouse> <https://launchpad.net/bugs/1703845>
[15:24] <gimmic> Is there an issue with running Landscape on the MAAS box?
[15:26] <gimmic> magicaltrout: Try booting into rescue and running wipefs, then deploy to the box
[15:26] <gimmic> curtin seems to handle existing partition data poorly
[15:26] <gimmic> I had duplicate vgs due to reusing hard drives with existing partitions on the nodes
[15:27] <gimmic> 'sudo wipefs -a /dev/sd* -f;sudo wipefs -a /dev/mapper/* -f'
[15:27] <gimmic> cleared up my issues, nuke it from orbit..
[15:30] <roaksoax> gimmic: are you sure you are in the latest version of curtin ?
[15:30] <roaksoax> what curtin version are you using ?
[15:30] <roaksoax> have you filed a bug ?
[15:31] <gimmic>   Installed: 0.1.0~bzr505-0ubuntu1~16.04.1
[15:31] <gimmic> was up-to-date at least as to when I was tshooting that last week
[15:49] <dannf> Odd_Bloke: does your team generate the ephemeral images used by MAAS directly? if so, do you know if you strip kernel modules out during that process?
[15:49] <dannf> (LP: #1702976 is why I ask)
[15:50] <Odd_Bloke> dannf: We run the code that generates the image, but the MAAS team own that code.
[15:50] <Odd_Bloke> So I don't know. :)
[15:50] <dannf> Odd_Bloke: got a pointer to that code? is it still lp:maas-images?
[15:51] <Odd_Bloke> dannf: It is, yeah.
[15:51] <dannf> Odd_Bloke: ok, perfect. thanks!
[15:53] <Odd_Bloke> :)
[16:01] <dannf> roaksoax: ^
[16:01] <dannf> i'm guessing this is what we need: https://code.launchpad.net/~dannf/maas-images/lp1702976/+merge/327307
[16:02] <mup> Bug #1703895 opened: BMC static allocated addresses are not associated with nodes in subnet list <MAAS:New> <https://launchpad.net/bugs/1703895>
[16:08] <dannf> fixed a typo, now: https://code.launchpad.net/~dannf/maas-images/lp1702976/+merge/327308
[16:09] <roaksoax> dannf: we dont generate kernels, we grab them from the archive
[16:09] <dannf> roaksoax: see my MP
[16:17] <xygnal>  roaksoax any ideas about my root device dilemma?
[16:18] <roaksoax> dannf: uhmmmm wtf! we should be using the initrd from the archive
[16:18] <dannf> roaksoax: there isn't an initrd in the archive, it is generated at installtime.
[16:18] <roaksoax> xygnal: sorry, i was following the conversation. Basically, you want to change / ?
[16:18] <roaksoax> dannf: http://archive.ubuntu.com/ubuntu/dists/xenial/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/
[16:18] <gimmic> xygnal: you can set boot devices via IPMI
[16:18] <roaksoax> e.g.
[16:18] <dannf> roaksoax: well, there's a d-i initrd - but that's not what you need here
[16:19] <dannf> roaksoax: booting that would boot d-i
[16:19] <xygnal> no no no.  I am trying to set the boot *and* root device in MAAS so that it installs on the correct disk (sda is NOT the correct disk)
[16:19] <xygnal> boot is not the problem really, it's the root disk that refuses to go forward if the node is not in 'ready' state
[16:20] <xygnal> that means I cannot set it during commission and I cannot set it during deploy... so... WHEN can I set it without having outside cron jobs to look?
[16:21] <xygnal> I wrote an endpoint that looks up the correct disk and sets it in MAAS via the MAAS API.  Problem is, I cannot find a place in the build process within MAAS where it would be able to run.
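A minimal sketch of the disk-lookup half of such an endpoint, assuming the node's block devices are available as a list of dicts with `id`, `name`, and `model` keys. These field names, the sample inventory, and the `pick_os_disk` helper are hypothetical and not confirmed against the MAAS API; the call that actually sets the boot disk in MAAS is deliberately left out.

```python
from typing import Dict, List, Optional


def pick_os_disk(block_devices: List[Dict], os_model: str) -> Optional[Dict]:
    """Return the single block device whose model matches os_model.

    Returns None when zero or several devices match, so the caller never
    guesses which disk is safe to install on.
    """
    matches = [d for d in block_devices if d.get("model") == os_model]
    return matches[0] if len(matches) == 1 else None


# Hypothetical inventory: the OS SSD is not the first disk (sda).
devices = [
    {"id": 1, "name": "sda", "model": "DATA-HDD"},
    {"id": 13, "name": "sdm", "model": "OS-SSD"},
]

chosen = pick_os_disk(devices, "OS-SSD")
print(chosen["name"] if chosen else "no unambiguous OS disk")
```

Refusing to choose when the match is ambiguous keeps the other disks safe, which matters here because they must stay untouched for clients.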
[16:35] <roaksoax> xygnal: sorry, otp. give me a sec
[17:00] <dannf> roaksoax: thx for the MP approval! would you also be able to merge it, or do we need a separate ack?
[17:02] <roaksoax> dannf: do you have landing rights ? if you do, go for it
[17:02] <dannf> roaksoax: i don't
[17:03] <ltrager> dannf: in meeting right now will land it after
[17:04] <dannf> ltrager: thx!
[18:01] <ltrager> dannf: on ARM64 does uname -m output 'arm64'?
[18:05] <gimmic> I just maas installed 50 identical nodes.. and 4 of them use eth0 rather than eno1
[18:05] <gimmic> I wonder why they would use the traditional interface naming when using 16.04
[18:17] <dannf> ltrager: no
[18:17] <dannf> ltrager: aarch64
[18:28] <roaksoax> xygnal: you can make changes to storage in 'ready' state (e.g, like partitions, bcache, etc). You can make filesystem changes on 'allocated'
[18:28] <roaksoax> xygnal: the machine can be ready and you may want to change the boot/root
[18:28] <roaksoax> xygnal: also, is this EFI or legacy ?
[18:29] <roaksoax> xygnal: if it is legacy, you do care which disk the bios uses to boot, since that is the disk where you will have to install /boot/
[18:50] <xygnal> legacy
[18:50] <xygnal> and yes they need to be the same disk, as all the rest of the disks need to be 100% safe for the clients to use and wipe
[18:51] <xygnal> we use one particular model of SSD for OS and unfortunately it cannot be re-arranged to be the first disk.
[18:57] <roaksoax> xygnal: right, so that seems like you would have to put /boot in the disk that the bios will boot from and /root on your ssd
[19:11] <xygnal> we have not had to do that
[19:11] <xygnal> we are able to change which disk is booted first in the BIOS
[19:11] <xygnal> I have been setting boot and root to sdm for example and that has been working
[19:12] <xygnal> THAT is working fine. The problem is that I cannot set this information during commission or deployment, which means I have to actually schedule it outside of MAAS.  Not optimal.
[19:13] <gimmic> [Errno 13] Permission denied: '/sys/class/block/md127/md/sync_action'
[19:14] <gimmic> I've come to the conclusion curtin is garbage
[19:15] <gimmic> xygnal: there's several instances where it would be great to be able to plug in basic OS commands at stages of deployment (post commission, pre-deployment, post deployment)
[19:57] <ltrager> dannf: I've created a new MP which only includes the driver on ARM64, can you take a look?
[19:57] <ltrager> dannf: https://code.launchpad.net/~ltrager/maas-images/lp1702976/+merge/327329
[19:58] <dannf> ltrager: see my comment about uname -m
[19:58] <ltrager> dannf: yes I changed it to using aarch64 and confirmed it was included when I did a test build on my system
[19:59] <dannf> ltrager: oh - so on an x86 system uname returned aarch64 in an aarch64 root?
[20:00] <ltrager> dannf: Yes, maas-image-builder uses qemu to emulate an ARM64 system during build
[20:03] <dannf> ltrager: personally, i'd still prefer dpkg --print-architecture. uname -m bites in lots of places - say, if we were building armhf images outside of maas-image-builders on an arm64 host
[20:03] <dannf> ltrager: but lemme reply in the MP FTR
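The mismatch dannf describes can be sketched as follows: `uname -m` reports the kernel's machine name, while `dpkg --print-architecture` reports the Debian architecture name of the userland. The `UNAME_TO_DEB` table and both helper functions are illustrative assumptions covering only a few architectures; they are not code from maas-images.

```python
import platform
import shutil
import subprocess

# Conventional kernel machine name -> Debian architecture name.
# Illustrative subset only; armv7l is assumed to mean armhf here.
UNAME_TO_DEB = {
    "x86_64": "amd64",
    "i686": "i386",
    "aarch64": "arm64",
    "armv7l": "armhf",
    "ppc64le": "ppc64el",
}


def deb_arch_from_uname(machine: str) -> str:
    """Translate a `uname -m` style name; unknown names pass through."""
    return UNAME_TO_DEB.get(machine, machine)


def userland_arch() -> str:
    """Prefer dpkg's answer; fall back to the kernel name when dpkg is absent."""
    if shutil.which("dpkg"):
        result = subprocess.run(
            ["dpkg", "--print-architecture"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    return deb_arch_from_uname(platform.machine())


# An armhf root built on an arm64 host would typically still see the
# kernel name aarch64, so the kernel-based guess would say arm64.
print(deb_arch_from_uname("aarch64"))
```

Asking dpkg first reflects the preference stated above: the userland's architecture is what decides which packages and drivers belong in the image.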
[21:00] <mup> Bug #1703984 opened: Commissioning fail with local ntp server <MAAS:Incomplete> <https://launchpad.net/bugs/1703984>
[21:15] <mup> Bug #1703992 opened: [2.3] Device discovery shows duplicate entry for the same device <MAAS:Triaged> <https://launchpad.net/bugs/1703992>
[21:27] <mup> Bug #1703992 changed: [2.3] Device discovery shows duplicate entry for the same device <MAAS:Triaged> <https://launchpad.net/bugs/1703992>
[21:30] <mup> Bug #1648635 changed: [2.x] Commissioning fails due to low ipmi wait_time <ipmi> <papercut> <MAAS:Triaged> <https://launchpad.net/bugs/1648635>
[21:30] <mup> Bug #1703992 opened: [2.3] Device discovery shows duplicate entry for the same device <MAAS:Triaged> <https://launchpad.net/bugs/1703992>
[21:36] <mup> Bug #1648635 opened: [2.x] Commissioning fails due to low ipmi wait_time <ipmi> <papercut> <MAAS:Triaged> <https://launchpad.net/bugs/1648635>
[21:42] <mup> Bug #1648635 changed: [2.x] Commissioning fails due to low ipmi wait_time <ipmi> <papercut> <MAAS:Triaged> <https://launchpad.net/bugs/1648635>
[22:13] <agrebennikov> roaksoax, we can discuss it over here if you wish :)
[22:13] <agrebennikov> will be much easier and faster
[22:42] <agrebennikov> ltrager, regarding the bug you've just asked me about (commissioning)
[22:43] <ltrager> agrebennikov: can you confirm that your machines do have access to the Ubuntu archives and can install lldpd and ntp?
[22:44] <agrebennikov> just did
[22:44] <agrebennikov> so yes, they download all the stuff
[22:44] <agrebennikov> moreove
[22:44] <agrebennikov> *over
[22:44] <agrebennikov> sometimes I have my commissioning passed
[22:44] <agrebennikov> as I mentioned in the description
[22:45] <agrebennikov> it seems to be all a matter of how fast the time can sync with the internal ntp
[22:45] <agrebennikov> the reason I'm so strict about it is that yesterday I spent about half a day struggling with ntp
[22:46] <agrebennikov> and it turned out that before ntp sync happens, all commands starting with "systemctl" failed with a timeout
[22:47] <agrebennikov> ltrager, ^^
[22:47] <ltrager> agrebennikov: so it seems that interacting with systemd is taking a long time in your environment
[22:47] <agrebennikov> when I set up correct ntp server - started working immediately
[22:48] <agrebennikov> when it passes ntp sync - it is immediate response
[22:48] <ltrager> agrebennikov: So it only fails when an unreachable NTP server is used?
[22:49] <agrebennikov> it seems so
[22:49] <agrebennikov> when it could never reach ntp - yes, I thought systemd is completely broken
[22:49] <agrebennikov> then I changed ntp server to a reachable one
[22:49] <agrebennikov> now I can pass commissioning on the 2nd or 3rd attempt
[22:50] <agrebennikov> when I log into the host during the commissioning I clearly see that the lldp is installed
[22:50] <agrebennikov> but the service doesn't run
[22:50] <agrebennikov> if I run it manually - works just fine
[22:51] <agrebennikov> unfortunately I just got my env broken, my jumphost went down
[22:51] <ltrager> agrebennikov: so do you need to be able to use MAAS in an environment that doesn't have access to an external NTP server?
[22:52] <agrebennikov> but you guys can try on your own, as I said if you specify unreachable ntp server in settings and deny access to ntp.ubuntu.com
[22:52] <agrebennikov> no, I'm not. Current ntp works just fine
[22:52] <agrebennikov> I don't know how to explain it better :(
[22:53] <agrebennikov> in the beginning server always wants to connect to ntp.ubuntu.com
[22:53] <agrebennikov> until cloud init or whatever injects proper settings to ntp.conf
[22:55] <ltrager> agrebennikov: okay I think I understand where the problem is. systemd has its own built-in NTP client (timedatectl) which I believe is configured for ntp.ubuntu.com. cloud-init replaces it with ntp so this might be a bug in systemd. I'll experiment a bit
[22:55] <agrebennikov> great
[22:57] <agrebennikov> hopefully you'll find something :)
[23:00] <agrebennikov> check this screenshot out: https://snag.gy/h3LrXj.jpg
[23:00] <agrebennikov> yeah, it is systemd-timesync that is trying to connect
[23:00] <agrebennikov> (it is just screen leftovers from the broken host unfortunately)
[23:06] <agrebennikov> ltrager, ^^
[23:07] <ltrager> agrebennikov: I suspect it's getting hung up on daemon-reload due to ntp trying to run again and holding up any commands to systemd.
[23:07] <ltrager> agrebennikov: I need to confirm that and see if daemon-reload is still needed
[23:08] <agrebennikov> well, yesterday I was trying stuff like "systemctl -l" and it didn't work either
[23:11] <agrebennikov> anyway, I'm heading off, hopefully will get my jumphost fixed in the morning. Please let me know if I need to try anything else tomorrow
[23:11] <agrebennikov> thanks ltrager!
[23:15] <ltrager> agrebennikov: np, I'll update the bug report with anything I find