mup | Bug #1703712 opened: [2.3] IP addresses not listed for devices when they are 'Dynamic' <MAAS:Triaged> <https://launchpad.net/bugs/1703712> | 01:28 |
---|---|---|
mup | Bug #1703713 opened: [2.3] Devices don't have a link from the DNS page <MAAS:Triaged> <MAAS 2.2:Triaged> <https://launchpad.net/bugs/1703713> | 01:28 |
mup | Bug #1690154 changed: block-curtin-poweroff doesn't work <MAAS:Expired> <https://launchpad.net/bugs/1690154> | 04:23 |
mup | Bug #1690154 opened: block-curtin-poweroff doesn't work <MAAS:Expired> <https://launchpad.net/bugs/1690154> | 04:38 |
mup | Bug #1690154 changed: block-curtin-poweroff doesn't work <MAAS:Expired> <https://launchpad.net/bugs/1690154> | 04:44 |
=== frankban|afk is now known as frankban | ||
magicaltrout | hello folks, maas question | 10:53 |
magicaltrout | I can commission a node | 10:53 |
magicaltrout | but then when I deploy it doesn't PXE boot | 10:53 |
magicaltrout | any idea where to prod around? | 10:53 |
magicaltrout | hmm does if I restart | 10:56 |
magicaltrout | maybe it comes to life too quickly | 10:56 |
magicaltrout | scrap that next question: | 11:24 |
magicaltrout | https://gist.github.com/buggtb/21517dda16ba6b91fd40ebe2ae3763c8 | 11:24 |
magicaltrout | curtin fails miserably when deploying a server | 11:24 |
magicaltrout | not sure why though, the disk seems to exist earlier in the boot | 11:24 |
magicaltrout | weird if i keep looking at /dev just prior to curtin running | 11:32 |
magicaltrout | the partition vanishes | 11:32 |
magicaltrout | centos images install though | 11:38 |
magicaltrout | just not xenial hwe | 11:38 |
magicaltrout | hmm | 12:33 |
magicaltrout | disabling EFI seemed to fix that | 12:33 |
magicaltrout | now grub fails... | 12:33 |
=== mpontillo_ is now known as mpontillo | ||
=== tai271828__ is now known as tai271828_ | ||
=== mup_ is now known as mup | ||
roaksoax | magicaltrout: do you mean that when you deploy, the machine installs fine, reboots, but it never boots into the OS ? | 13:04 |
roaksoax | magicaltrout: or you mean this is due to the curtin issue you found above ? | 13:05 |
magicaltrout | still something curtin based roaksoax | 13:13 |
magicaltrout | i get this | 13:13 |
magicaltrout | https://gist.github.com/buggtb/c3406963c12646115891ad41c92c569f | 13:13 |
magicaltrout | but a manual install works fine. I guess it finds the mmc partitioning a bit funky | 13:13 |
roaksoax | magicaltrout: yeah, I'd suggest you file a bug on the curtin project specifying the versions of curtin/maas in use, with the output of maas <user> machines get-curtin-config <system_id> | 13:30 |
magicaltrout | thanks roaksoax will do, posted to maas-devel on the off chance someone has an idea as well | 13:38 |
xygnal | mpontillo: roaksoax: still waiting on info for how to set by boot & root device in MAAS without having to do it manually. (cannot set during commission or deploying states) | 13:43 |
xygnal | s/by/my | 13:43 |
xygnal | though I think it's just root device that needs 'ready' state | 13:43 |
mup | Bug #1703845 opened: rackd should lower recheck interval on disconnect <MAAS:In Progress by blake-rouse> <https://launchpad.net/bugs/1703845> | 14:44 |
mup | Bug #1703845 changed: rackd should lower recheck interval on disconnect <MAAS:In Progress by blake-rouse> <https://launchpad.net/bugs/1703845> | 14:53 |
mup | Bug #1703845 opened: rackd should lower recheck interval on disconnect <MAAS:In Progress by blake-rouse> <https://launchpad.net/bugs/1703845> | 15:02 |
gimmic | Is there an issue with running Landscape on the MAAS box? | 15:24 |
gimmic | magicaltrout: Try booting into rescue and running wipefs, then deploy to the box | 15:26 |
gimmic | curtin seems to handle existing partition data poorly | 15:26 |
gimmic | I had duplicate vgs due to reusing hard drives with existing partitions on the nodes | 15:26 |
gimmic | 'sudo wipefs -a /dev/sd* -f;sudo wipefs -a /dev/mapper/* -f' | 15:27 |
gimmic | cleared up my issues, nuke it from orbit.. | 15:27 |
roaksoax | gimmic: are you sure you are in the latest version of curtin ? | 15:30 |
roaksoax | what curtin version are you using ? | 15:30 |
roaksoax | have you filed a bug ? | 15:30 |
gimmic | Installed: 0.1.0~bzr505-0ubuntu1~16.04.1 | 15:31 |
gimmic | was up-to-date at least as to when I was tshooting that last week | 15:31 |
dannf | Odd_Bloke: does your team generate the ephemeral images used by MAAS directly? if so, do you know if you strip kernel modules out during that process? | 15:49 |
dannf | (LP: #1702976 is why I ask) | 15:49 |
Odd_Bloke | dannf: We run the code that generates the image, but the MAAS team own that code. | 15:50 |
Odd_Bloke | So I don't know. :) | 15:50 |
dannf | Odd_Bloke: got a pointer to that code? is it still lp:maas-images? | 15:50 |
Odd_Bloke | dannf: It is, yeah. | 15:51 |
dannf | Odd_Bloke: ok, perfect. thanks! | 15:51 |
Odd_Bloke | :) | 15:53 |
dannf | roaksoax: ^ | 16:01 |
dannf | i'm guessing this is what we need: https://code.launchpad.net/~dannf/maas-images/lp1702976/+merge/327307 | 16:01 |
mup | Bug #1703895 opened: BMC static allocated addresses are not associated with nodes in subnet list <MAAS:New> <https://launchpad.net/bugs/1703895> | 16:02 |
dannf | fixed a typo, now: https://code.launchpad.net/~dannf/maas-images/lp1702976/+merge/327308 | 16:08 |
roaksoax | dannf: we dont generate kernels, we grab them from the archive | 16:09 |
dannf | roaksoax: see my MP | 16:09 |
xygnal | roaksoax any ideas about my root device dilemma? | 16:17 |
roaksoax | dannf: uhmmmm wtf! we should be using the initrd from the archive | 16:18 |
dannf | roaksoax: there isn't an initrd in the archive, it is generated at installtime. | 16:18 |
roaksoax | xygnal: sorry, i was following the conversation. Basically, you want to change / ? | 16:18 |
roaksoax | dannf: http://archive.ubuntu.com/ubuntu/dists/xenial/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/ | 16:18 |
gimmic | xygnal: you can set boot devices via IPMI | 16:18 |
roaksoax | e.g. | 16:18 |
dannf | roaksoax: well, there's a d-i initrd - but that's not what you need here | 16:18 |
dannf | roaksoax: booting that would boot d-i | 16:19 |
xygnal | no no no. I am trying to set the boot *and* root device in MAAS so that it installs on the correct disk (sda is NOT the correct disk) | 16:19 |
xygnal | boot is not the problem really, its root disk that refuses to go forward if the node is not in 'ready' state | 16:19 |
xygnal | that means I cannot set it during commission and I cannot set it during deploy... so... WHEN can I set it without having outside cron jobs to look? | 16:20 |
xygnal | I wrote an endpoint that looks up the correct disk and sets it in MAAS via the MAAS API. Problem is, I cannot find a place in the build process within MAAS where it would be able to run. | 16:21 |
roaksoax | xygnal: sorry, otp. give me a sec | 16:35 |
=== frankban is now known as frankban|afk | ||
dannf | roaksoax: thx for the MP approval! would you also be able to merge it, or do we need a separate ack? | 17:00 |
roaksoax | dannf: do you have landing rights ? if you do, go for it | 17:02 |
dannf | roaksoax: i don't | 17:02 |
ltrager | dannf: in meeting right now will land it after | 17:03 |
dannf | ltrager: thx! | 17:04 |
ltrager | dannf: on ARM64 does uname -m output 'arm64'? | 18:01 |
gimmic | I just maas installed 50 identical nodes.. and 4 of them use eth0 rather than eno1 | 18:05 |
gimmic | I wonder why they would use the traditional interface naming when using 16.04 | 18:05 |
dannf | ltrager: no | 18:17 |
dannf | ltrager: aarch64 | 18:17 |
roaksoax | xygnal: you can make changes to storage in 'ready' state (e.g., partitions, bcache, etc.). You can make filesystem changes on 'allocated' | 18:28 |
roaksoax | xygnal: the machine can be ready and you may want to change the boot/root | 18:28 |
roaksoax | xygnal: also, is this EFI or legacy ? | 18:28 |
roaksoax | xygnal: if it is legacy you do care which disk is the disk the bios uses as boot, which is the disk that you will have to install /boot/ | 18:29 |
xygnal | legacy | 18:50 |
xygnal | and yes they need to be the same disk, as all the rest of the disks need to be 100% safe for the clients to use and wipe | 18:50 |
xygnal | we use one particular model of SSD for OS and unfortunately it cannot be re-arranged to be the first disk. | 18:51 |
roaksoax | xygnal: right, so that seems like you would have to put /boot in the disk that the bios will boot from and /root on your ssd | 18:57 |
xygnal | we have not had to do that | 19:11 |
xygnal | we are able to change which disk is booted first in the BIOS | 19:11 |
xygnal | I have been setting boot and root to sdm for example and that has been working | 19:11 |
xygnal | THAT is working fine. The problem is that I cannot set this information during commission or deployment, which means I have to actually schedule it outside of MAAS. Not optimal. | 19:12 |
gimmic | [Errno 13] Permission denied: '/sys/class/block/md127/md/sync_action' | 19:13 |
gimmic | I've come to the conclusion curtin is garbage | 19:14 |
gimmic | xygnal: there's several instances where it would be great to be able to plug in basic OS commands at stages of deployment (post commission, pre-deployment, post deployment) | 19:15 |
ltrager | dannf: I've created a new MP which only includes the driver on ARM64, can you take a look? | 19:57 |
ltrager | dannf: https://code.launchpad.net/~ltrager/maas-images/lp1702976/+merge/327329 | 19:57 |
dannf | ltrager: see my comment about uname -m | 19:58 |
ltrager | dannf: yes I changed it to using aarch64 and confirmed it was included when I did a test build on my system | 19:58 |
dannf | ltrager: oh - so on an x86 system uname returned aarch64 in an aarch64 root? | 19:59 |
ltrager | dannf: Yes, maas-image-builder uses qemu to emulate an ARM64 system during build | 20:00 |
dannf | ltrager: personally, i'd still prefer dpkg --print-architecture. uname -m bites in lots of places - say, if we were building armhf images outside of maas-image-builders on an arm64 host | 20:03 |
dannf | ltrager: but lemme reply in the MP FTR | 20:03 |
mup | Bug #1703984 opened: Commissioning fail with local ntp server <MAAS:Incomplete> <https://launchpad.net/bugs/1703984> | 21:00 |
mup | Bug #1703992 opened: [2.3] Device discovery shows duplicate entry for the same device <MAAS:Triaged> <https://launchpad.net/bugs/1703992> | 21:15 |
mup | Bug #1703992 changed: [2.3] Device discovery shows duplicate entry for the same device <MAAS:Triaged> <https://launchpad.net/bugs/1703992> | 21:27 |
mup | Bug #1648635 changed: [2.x] Commissioning fails due to low ipmi wait_time <ipmi> <papercut> <MAAS:Triaged> <https://launchpad.net/bugs/1648635> | 21:30 |
mup | Bug #1703992 opened: [2.3] Device discovery shows duplicate entry for the same device <MAAS:Triaged> <https://launchpad.net/bugs/1703992> | 21:30 |
mup | Bug #1648635 opened: [2.x] Commissioning fails due to low ipmi wait_time <ipmi> <papercut> <MAAS:Triaged> <https://launchpad.net/bugs/1648635> | 21:36 |
mup | Bug #1648635 changed: [2.x] Commissioning fails due to low ipmi wait_time <ipmi> <papercut> <MAAS:Triaged> <https://launchpad.net/bugs/1648635> | 21:42 |
agrebennikov | roaksoax, we can discuss it over here if you wish :) | 22:13 |
agrebennikov | will be much easier and faster | 22:13 |
agrebennikov | ltrager, regarding the bug you've just asked me about (commissioning) | 22:42 |
ltrager | agrebennikov: can you confirm that your machines do have access to the Ubuntu archives and can install lldpd and ntp? | 22:43 |
agrebennikov | just did | 22:44 |
agrebennikov | so yes, they download all the stuff | 22:44 |
agrebennikov | moreove | 22:44 |
agrebennikov | *over | 22:44 |
agrebennikov | sometimes I have my commissioning passed | 22:44 |
agrebennikov | as I mentioned in the description | 22:44 |
agrebennikov | it seems all the matter of how fast the time can sync with the internal ntp | 22:45 |
agrebennikov | the reason I'm so strict about it is that yesterday I spent about half a day struggling with ntp | 22:45 |
agrebennikov | and it turned out then before ntp sync happens I had all commands starting with "systemctl" failed with timeout | 22:46 |
agrebennikov | ltrager, ^^ | 22:47 |
ltrager | agrebennikov: so it seems that interacting with systemd is taking a long time in your environment | 22:47 |
agrebennikov | when I set up correct ntp server - started working immediately | 22:47 |
agrebennikov | when it passes ntp sync - it is immediate response | 22:48 |
ltrager | agrebennikov: So it only fails when an unreachable NTP server is used? | 22:48 |
agrebennikov | it seems so | 22:49 |
agrebennikov | when it could never reach ntp - yes, I thought systemd is completely broken | 22:49 |
agrebennikov | then I changed ntp server to a reachable one | 22:49 |
agrebennikov | now I can pass commissioning from 2 or 3 attempt | 22:49 |
agrebennikov | when I log into the host during the commissioning I clearly see that the lldp is installed | 22:50 |
agrebennikov | but the service doesn't run | 22:50 |
agrebennikov | if I run it manually - works just fine | 22:50 |
agrebennikov | unfortunately I just got my env broken, my jumphost went down | 22:51 |
ltrager | agrebennikov: so do you need to be able to use MAAS in an environment that doesn't have access to an external NTP server? | 22:51 |
agrebennikov | but you guys can try on your own, as I said if you specify unreachable ntp server in settings and deny access to ntp.ubuntu.com | 22:52 |
agrebennikov | no, I'm not. Current ntp works just fine | 22:52 |
agrebennikov | I don't know how to explain it better :( | 22:52 |
agrebennikov | in the beginning server always wants to connect to ntp.ubuntu.com | 22:53 |
agrebennikov | until cloud init or whatever injects proper settings to ntp.conf | 22:53 |
ltrager | agrebennikov: okay I think I understand where the problem is. systemd has its own built in NTP client (timedatectl) which I believe is configured for ntp.ubuntu.com. cloud-init replaces it with ntp so this might be a bug in systemd. I'll experiment a bit | 22:55 |
agrebennikov | great | 22:55 |
agrebennikov | hopefully you'll find something :) | 22:57 |
agrebennikov | check this screenshot out: https://snag.gy/h3LrXj.jpg | 23:00 |
agrebennikov | yeah, it is systemd-timesync who is trying to connect | 23:00 |
agrebennikov | (it is just screen leftovers from the broken host unfortunately) | 23:00 |
agrebennikov | ltrager, &^^ | 23:06 |
ltrager | agrebennikov: I suspect it's getting hung up on daemon-reload due to ntp trying to run again and holding up any commands to systemd. | 23:07 |
ltrager | agrebennikov: I need to confirm that and see if daemon-reload is still needed | 23:07 |
agrebennikov | well, yesterday I was trying stuff like "systemctl -l" and it didn't work either | 23:08 |
agrebennikov | anyway, I'm heading off, hopefully will get my jumphost fixed in the morning. Please let me know if I need to try anything else tomorrow | 23:11 |
agrebennikov | thanks ltrager! | 23:11 |
ltrager | agrebennikov: np, I'll update the bug report with anything I find | 23:15 |
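
Note on the `uname -m` vs `dpkg --print-architecture` exchange between ltrager and dannf (18:01 to 20:03): the kernel reports `aarch64` where Debian/Ubuntu packaging says `arm64`, and the two can disagree outright when a root filesystem for one architecture runs on a kernel of another (dannf's armhf-on-arm64 example). The following is only a minimal sketch of that distinction, not code from maas-images. The mapping table and both function names are hypothetical, and `armv7l` could equally correspond to `armel` on some systems.

```python
# Illustrative mapping from `uname -m` strings to Debian architecture names.
# Assumption: this table is hand-written for the example and is not complete.
KERNEL_TO_DEB_ARCH = {
    "x86_64": "amd64",
    "i686": "i386",
    "aarch64": "arm64",
    "armv7l": "armhf",  # assumption: could also be armel
    "ppc64le": "ppc64el",
    "s390x": "s390x",
}


def deb_arch_from_machine(machine):
    """Return the Debian architecture for a `uname -m` string, or None."""
    return KERNEL_TO_DEB_ARCH.get(machine)


def is_arm64_image(machine, dpkg_arch=None):
    """Decide whether an image should be treated as arm64.

    If the output of `dpkg --print-architecture` is supplied, trust it,
    because it describes the root filesystem rather than the running kernel.
    Otherwise fall back to translating the kernel machine name.
    """
    if dpkg_arch is not None:
        return dpkg_arch == "arm64"
    return deb_arch_from_machine(machine) == "arm64"
```

With only the kernel name available, `is_arm64_image("aarch64")` is true. Passing the dpkg answer for an armhf root on the same kernel, `is_arm64_image("aarch64", "armhf")`, is false, which is the case dannf was worried about.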