/srv/irclogs.ubuntu.com/2017/07/12/#maas.txt

mupBug #1703712 opened: [2.3] IP addresses not listed for devices when they are 'Dynamic' <MAAS:Triaged> <https://launchpad.net/bugs/1703712>01:28
mupBug #1703713 opened: [2.3] Devices don't have a link from the DNS page <MAAS:Triaged> <MAAS 2.2:Triaged> <https://launchpad.net/bugs/1703713>01:28
mupBug #1690154 changed: block-curtin-poweroff doesn't work <MAAS:Expired> <https://launchpad.net/bugs/1690154>04:23
mupBug #1690154 opened: block-curtin-poweroff doesn't work <MAAS:Expired> <https://launchpad.net/bugs/1690154>04:38
mupBug #1690154 changed: block-curtin-poweroff doesn't work <MAAS:Expired> <https://launchpad.net/bugs/1690154>04:44
=== frankban|afk is now known as frankban
magicaltrouthello folks, maas question10:53
magicaltroutI can commission a node10:53
magicaltroutbut then when I deploy it doesn't PXE boot10:53
magicaltroutany idea where to prod around?10:53
magicaltrouthmm, it does if I restart10:56
magicaltroutmaybe it comes to life too quickly10:56
magicaltroutscrap that next question:11:24
magicaltrouthttps://gist.github.com/buggtb/21517dda16ba6b91fd40ebe2ae3763c811:24
magicaltroutcurtin fails miserably when deploying a server11:24
magicaltroutnot sure why though, the disk seems to exist earlier in the boot11:24
magicaltroutweird if i keep looking at /dev just prior to curtin running11:32
magicaltroutthe partition vanishes11:32
magicaltroutcentos images install though11:38
magicaltroutjust not xenial hwe11:38
magicaltrouthmm12:33
magicaltroutdisabling EFI seemed to fix that12:33
magicaltroutnow grub fails...12:33
=== mpontillo_ is now known as mpontillo
=== tai271828__ is now known as tai271828_
=== mup_ is now known as mup
roaksoaxmagicaltrout: do you mean that when you deploy, the machine installs fine, reboots, but it never boots into the OS ?13:04
roaksoaxmagicaltrout: or you mean this is due to the curtin issue you found above ?13:05
magicaltroutstill something curtin based roaksoax13:13
magicaltrouti get this13:13
magicaltrouthttps://gist.github.com/buggtb/c3406963c12646115891ad41c92c569f13:13
magicaltroutbut a manual install works fine. I guess it finds the mmc partitioning a bit funky13:13
roaksoaxmagicaltrout: yeah, I'd suggest you file a bug on the curtin project specifying the versions of curtin/maas in use, with the output of maas <user> machines get-curtin-config <system_id>13:30
magicaltroutthanks roaksoax will do, posted to maas-devel on the off chance someone has an idea as well13:38
xygnalmpontillo: roaksoax: still waiting on info for how to set by boot & root device in MAAS without having to do it manually. (cannot set during commission or deploying states)13:43
xygnals/by/my13:43
xygnalthough I think it's just root device that needs 'ready' state13:43
mupBug #1703845 opened: rackd should lower recheck interval on disconnect <MAAS:In Progress by blake-rouse> <https://launchpad.net/bugs/1703845>14:44
mupBug #1703845 changed: rackd should lower recheck interval on disconnect <MAAS:In Progress by blake-rouse> <https://launchpad.net/bugs/1703845>14:53
mupBug #1703845 opened: rackd should lower recheck interval on disconnect <MAAS:In Progress by blake-rouse> <https://launchpad.net/bugs/1703845>15:02
gimmicIs there an issue with running Landscape on the MAAS box?15:24
gimmicmagicaltrout: Try booting into rescue and running wipefs, then deploy to the box15:26
gimmiccurtin seems to handle existing partition data poorly15:26
gimmicI had duplicate vgs due to reusing hard drives with existing partitions on the nodes15:26
gimmic'sudo wipefs -a /dev/sd* -f;sudo wipefs -a /dev/mapper/* -f'15:27
gimmiccleared up my issues, nuke it from orbit..15:27
roaksoaxgimmic: are you sure you are in the latest version of curtin ?15:30
roaksoaxwhat curtin version are you using ?15:30
roaksoaxhave you filed a bug ?15:30
gimmic  Installed: 0.1.0~bzr505-0ubuntu1~16.04.115:31
gimmicwas up-to-date at least as to when I was tshooting that last week15:31
dannfOdd_Bloke: does your team generate the ephemeral images used by MAAS directly? if so, do you know if you strip kernel modules out during that process?15:49
dannf(LP: #1702976 is why I ask)15:49
Odd_Blokedannf: We run the code that generates the image, but the MAAS team own that code.15:50
Odd_BlokeSo I don't know. :)15:50
dannfOdd_Bloke: got a pointer to that code? is it still lp:maas-images?15:50
Odd_Blokedannf: It is, yeah.15:51
dannfOdd_Bloke: ok, perfect. thanks!15:51
Odd_Bloke:)15:53
dannfroaksoax: ^16:01
dannfi'm guessing this is what we need: https://code.launchpad.net/~dannf/maas-images/lp1702976/+merge/32730716:01
mupBug #1703895 opened: BMC static allocated addresses are not associated with nodes in subnet list <MAAS:New> <https://launchpad.net/bugs/1703895>16:02
dannffixed a typo, now: https://code.launchpad.net/~dannf/maas-images/lp1702976/+merge/32730816:08
roaksoaxdannf: we dont generate kernels, we grab them from the archive16:09
dannfroaksoax: see my MP16:09
xygnal roaksoax any ideas about my root device dilemma?16:17
roaksoaxdannf: uhmmmm wtf! we should be using the initrd from the archive16:18
dannfroaksoax: there isn't an initrd in the archive, it is generated at installtime.16:18
roaksoaxxygnal: sorry, i was following the conversation. Basically, you want to change / ?16:18
roaksoaxdannf: http://archive.ubuntu.com/ubuntu/dists/xenial/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/16:18
gimmicxygnal: you can set boot devices via IPMI16:18
roaksoaxe.g.16:18
dannfroaksoax: well, there's a d-i initrd - but that's not what you need here16:18
dannfroaksoax: booting that would boot d-i16:19
xygnalno no no.  I am trying to set the boot *and* root device in MAAS so that it installs on the correct disk (sda is NOT the correct disk)16:19
xygnalboot is not the problem really, it's the root disk that refuses to go forward if the node is not in 'ready' state16:19
xygnalthat means I cannot set it during commission and I cannot set it during deploy... so... WHEN can I set it without having outside cron jobs to look?16:20
xygnalI wrote an endpoint that looks up the correct disk and sets it in MAAS via the MAAS API.  Problem is, I cannot find a place in the build process within MAAS where it would be able to run.16:21
roaksoaxxygnal: sorry, otp. give me a sec16:35
=== frankban is now known as frankban|afk
dannfroaksoax: thx for the MP approval! would you also be able to merge it, or do we need a separate ack?17:00
roaksoaxdannf: do you have landing rights ? if you do, go for it17:02
dannfroaksoax: i don't17:02
ltragerdannf: in meeting right now will land it after17:03
dannfltrager: thx!17:04
ltragerdannf: on ARM64 does uname -m output 'arm64'?18:01
gimmicI just maas installed 50 identical nodes.. and 4 of them use eth0 rather than eno118:05
gimmicI wonder why they would use the traditional interface naming when using 16.0418:05
dannfltrager: no18:17
dannfltrager: aarch6418:17
roaksoaxxygnal: you can make changes to storage in 'ready' state (e.g., partitions, bcache, etc). You can make filesystem changes on 'allocated'18:28
roaksoaxxygnal: the machine can be ready and you may want to change the boot/root18:28
roaksoaxxygnal: also, is this EFI or legacy ?18:28
roaksoaxxygnal: if it is legacy you do care which disk is the disk the bios uses as boot, which is the disk that you will have to install /boot/18:29
xygnallegacy18:50
xygnaland yes they need to be the same disk, as all the rest of the disks need to be 100% safe for the clients to use and wipe18:50
xygnalwe use one particular model of SSD for OS and unfortunately it cannot be re-arranged to be the first disk.18:51
roaksoaxxygnal: right, so that seems like you would have to put /boot on the disk that the bios will boot from and /root on your ssd18:57
xygnalwe have not had to do that19:11
xygnalwe are able to change which disk is booted first in the BIOS19:11
xygnalI have been setting boot and root to sdm for example and that has been working19:11
xygnalTHAT is working fine. The problem is that I cannot set this information during commission or deployment, which means I have to actually schedule it outside of MAAS.  Not optimal.19:12
gimmic[Errno 13] Permission denied: '/sys/class/block/md127/md/sync_action'19:13
gimmicI've come to the conclusion curtin is garbage19:14
gimmicxygnal: there's several instances where it would be great to be able to plug in basic OS commands at stages of deployment (post commission, pre-deployment, post deployment)19:15
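A minimal sketch of the disk-selection step xygnal describes (look up the correct OS disk, then hand it to MAAS). It assumes each block device is a dict with `id`, `name`, and `model` keys; the model strings, the sample inventory, and the helper name `pick_os_disk` are hypothetical. The MAAS API call that actually sets the boot disk is not shown.

```python
def pick_os_disk(block_devices, os_model):
    """Return the single block device whose model matches os_model.

    Raises ValueError when zero or several disks match, so the caller
    never installs onto a disk that clients expect to use and wipe.
    """
    matches = [dev for dev in block_devices if dev.get("model") == os_model]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one {os_model!r} disk, found {len(matches)}"
        )
    return matches[0]


# Hypothetical inventory: the OS SSD is not the first disk (sda).
disks = [
    {"id": 1, "name": "sda", "model": "DATA-HDD"},
    {"id": 13, "name": "sdm", "model": "OS-SSD"},
]

chosen = pick_os_disk(disks, "OS-SSD")
print(chosen["name"], chosen["id"])
```

Failing loudly on an ambiguous match is a deliberate choice here: with data disks that must stay untouched, refusing to pick is safer than guessing.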
ltragerdannf: I've created a new MP which only includes the driver on ARM64, can you take a look?19:57
ltragerdannf: https://code.launchpad.net/~ltrager/maas-images/lp1702976/+merge/32732919:57
dannfltrager: see my comment about uname -m19:58
ltragerdannf: yes I changed it to using aarch64 and confirmed it was included when I did a test build on my system19:58
dannfltrager: oh - so on an x86 system uname returned aarch64 in an aarch64 root?19:59
ltragerdannf: Yes, maas-image-builder uses qemu to emulate an ARM64 system during build20:00
dannfltrager: personally, i'd still prefer dpkg --print-architecture. uname -m bites in lots of places - say, if we were building armhf images outside of maas-image-builders on an arm64 host20:03
dannfltrager: but lemme reply in the MP FTR20:03
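A rough sketch of the naming mismatch discussed above: `uname -m` reports the kernel's machine name (`aarch64`), while `dpkg --print-architecture` reports the Debian/Ubuntu architecture name (`arm64`). The table below is illustrative and not exhaustive, and the helper name `dpkg_arch_from_uname` is hypothetical. It also cannot address dannf's concern: the kernel name describes the running kernel, not the userspace being built, so an armhf root on an arm64 host would still see `aarch64`.

```python
import platform

# Kernel machine names (as printed by `uname -m`) mapped to the
# Debian/Ubuntu architecture names (as printed by
# `dpkg --print-architecture`). Illustrative subset only; armv7l is
# assumed to mean armhf, which is the usual case on Ubuntu.
UNAME_TO_DPKG = {
    "x86_64": "amd64",
    "aarch64": "arm64",
    "armv7l": "armhf",
    "ppc64le": "ppc64el",
    "s390x": "s390x",
}


def dpkg_arch_from_uname(machine):
    """Guess the dpkg architecture for a kernel machine name, or None."""
    return UNAME_TO_DPKG.get(machine)


if __name__ == "__main__":
    machine = platform.machine()
    print(machine, "->", dpkg_arch_from_uname(machine))
```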
mupBug #1703984 opened: Commissioning fail with local ntp server <MAAS:Incomplete> <https://launchpad.net/bugs/1703984>21:00
mupBug #1703992 opened: [2.3] Device discovery shows duplicate entry for the same device <MAAS:Triaged> <https://launchpad.net/bugs/1703992>21:15
mupBug #1703992 changed: [2.3] Device discovery shows duplicate entry for the same device <MAAS:Triaged> <https://launchpad.net/bugs/1703992>21:27
mupBug #1648635 changed: [2.x] Commissioning fails due to low ipmi wait_time <ipmi> <papercut> <MAAS:Triaged> <https://launchpad.net/bugs/1648635>21:30
mupBug #1703992 opened: [2.3] Device discovery shows duplicate entry for the same device <MAAS:Triaged> <https://launchpad.net/bugs/1703992>21:30
mupBug #1648635 opened: [2.x] Commissioning fails due to low ipmi wait_time <ipmi> <papercut> <MAAS:Triaged> <https://launchpad.net/bugs/1648635>21:36
mupBug #1648635 changed: [2.x] Commissioning fails due to low ipmi wait_time <ipmi> <papercut> <MAAS:Triaged> <https://launchpad.net/bugs/1648635>21:42
agrebennikovroaksoax, we can discuss it over here if you wish :)22:13
agrebennikovwill be much easier and faster22:13
agrebennikovltrager, regarding the bug you've just asked me about (commissioning)22:42
ltrageragrebennikov: can you confirm that your machines do have access to the Ubuntu archives and can install lldpd and ntp?22:43
agrebennikovjust did22:44
agrebennikovso yes, they download all the stuff22:44
agrebennikovmoreove22:44
agrebennikov*over22:44
agrebennikovsometimes I have my commissioning passed22:44
agrebennikovas I mentioned in the description22:44
agrebennikovit seems to be all a matter of how fast the time can sync with the internal ntp22:45
agrebennikovthe reason I'm so strict about it is that yesterday I spent about half a day struggling with ntp22:45
agrebennikovand it turned out that before ntp sync happens, all commands starting with "systemctl" failed with a timeout22:46
agrebennikovltrager, ^^22:47
ltrageragrebennikov: so it seems that interacting with systemd is taking a long time in your environment22:47
agrebennikovwhen I set up correct ntp server - started working immediately22:47
agrebennikovwhen it passes ntp sync - it is immediate response22:48
ltrageragrebennikov: So it only fails when an unreachable NTP server is used?22:48
agrebennikovit seems so22:49
agrebennikovwhen it could never reach ntp - yes, I thought systemd is completely broken22:49
agrebennikovthen I changed ntp server to a reachable one22:49
agrebennikovnow I can pass commissioning on the 2nd or 3rd attempt22:49
agrebennikovwhen I log into the host during the commissioning I clearly see that the lldp is installed22:50
agrebennikovbut the service doesn't run22:50
agrebennikovif I run it manually - works just fine22:50
agrebennikovunfortunately I just got my env broken, my jumphost went down22:51
ltrageragrebennikov: so do you need to be able to use MAAS in an environment that doesn't have access to an external NTP server?22:51
agrebennikovbut you guys can try on your own, as I said if you specify unreachable ntp server in settings and deny access to ntp.ubuntu.com22:52
agrebennikovno, I'm not. Current ntp works just fine22:52
agrebennikovI don't know how to explain it better :(22:52
agrebennikovin the beginning server always wants to connect to ntp.ubuntu.com22:53
agrebennikovuntil cloud init or whatever injects proper settings to ntp.conf22:53
ltrageragrebennikov: okay I think I understand where the problem is. systemd has its own built-in NTP client (timedatectl) which I believe is configured for ntp.ubuntu.com. cloud-init replaces it with ntp so this might be a bug in systemd. I'll experiment a bit22:55
agrebennikovgreat22:55
agrebennikovhopefully you'll find something :)22:57
agrebennikovcheck this screenshot out: https://snag.gy/h3LrXj.jpg23:00
agrebennikovyeah, it is systemd-timesync that is trying to connect23:00
agrebennikov(it is just screen leftovers from the broken host unfortunately)23:00
agrebennikovltrager, &^^23:06
ltrageragrebennikov: I suspect it's getting hung up on daemon-reload due to ntp trying to run again and holding up any commands to systemd.23:07
ltrageragrebennikov: I need to confirm that and see if daemon-reload is still needed23:07
agrebennikovwell, yesterday I was trying stuff like "systemctl -l" and it didn't work either23:08
agrebennikovanyway, I'm heading off, hopefully will get my jumphost fixed in the morning. Please let me know if I need to try anything else tomorrow23:11
agrebennikovthanks ltrager!23:11
ltrageragrebennikov: np, I'll update the bug report with anything I find23:15
