[00:52] <mup> Bug #1588547 changed: Generated bonding configuration is incorrect. <sts-needs-review> <curtin:Confirmed> <MAAS:Opinion by mpontillo> <https://launchpad.net/bugs/1588547>
[08:35] <mup_> Bug #1588706 opened: MAAS should not add 'source /etc/network/interfaces.d/*.cfg' to /etc/network/interfaces <MAAS:New> <https://launchpad.net/bugs/1588706>
[12:27] <mup> Bug #1588706 changed: MAAS should not add 'source /etc/network/interfaces.d/*.cfg' to /etc/network/interfaces <curtin:New> <MAAS:Invalid> <https://launchpad.net/bugs/1588706>
[14:06] <gimmic> my successful deploy rate of 14.04 LTS is terrible
[14:06] <gimmic> 10 identical nodes: 3/10 successful deployments
[14:20] <kiko> gimmic, using MAAS 2.0, right?
[14:20] <kiko> gimmic, we need to fix that rate and find out what is causing this crappy ratio
[14:22] <gimmic> can i increase the logging with some sort of debug flag?
[14:22] <gimmic> I'm about to fire off another 15 nodes at once
[14:22] <gimmic> they're all identical hardware and in ready state
[14:23] <kiko> gimmic, we tend to log everything
[14:23] <kiko> gimmic, remind me, do the nodes commission fine? 20/20?
[14:23] <gimmic> single disk R620's. Literally the only customization I'm doing is changing the hostname and static assignment of the network interface.
[14:24] <gimmic> They all commission fine
[14:24] <kiko> gimmic, always, right? never any commissioning failures?
[14:29] <gimmic> you know, a more global log viewer would be nice from the web interface
[14:29] <gimmic> like a general event log
[14:30] <gimmic> rather than host-specific only
[14:33] <kiko> gimmic, yeah, I hear you
[14:33] <kiko> gimmic, we have /something/ like that with remote rsyslog aggregation but I'm not sure it's working in 2.x
[14:34] <kiko> gimmic, anyway, when deployments fail, is the failure reproducible or rather random?
[14:36] <gimmic> I fired off all ten last night before I left work and came back to the 3/10. Have not tshot it yet.
[14:36] <gimmic> will poke at this next batch of 15 since I'm assigning IP/hostnames now.
[14:36] <kiko> okay
[14:36] <gimmic> Is there a document anywhere that describes where each subcomponent logs to? I see the logs under /var/log/maas
[14:41] <kiko> gimmic, maas.io/docs has the architecture -- there are basically two main components, rackd and regiond
[14:41] <kiko> rackd talks to the nodes (pxe, ipmi, etc)
[14:41] <kiko> regiond hosts the API and Web server
[14:43] <mup> Bug #1588846 opened: [2.0b6]  builtins.ValueError: invalid literal for int() with base 10: '' <MAAS:Triaged> <https://launchpad.net/bugs/1588846>
[14:51] <dimitern> does anyone know how long 'Disk erasing' is expected to take on a 256GB SSD ?
[14:52] <dimitern> is it a multi-pass secure wipe or something?
[14:53] <dimitern> roaksoax: ^^
[14:56] <roaksoax> dimitern: nope, but it will take a while
[14:56] <roaksoax> dimitern: it just wipes the whole disk
[14:57] <dimitern> roaksoax: yeah, it just finished ~10m for that 256GB SSD; interestingly the other 2 nodes with 120GB SSDs are taking longer (all were started pretty much at once)
[15:13] <mup> Bug #1588857 opened: [2.0b5] sudo: no tty present and no askpass program specified <MAAS:New> <https://launchpad.net/bugs/1588857>
[15:22] <mup> Bug #1588868 opened: [2.0b5] While monitoring service 'dhcpd/tgt/dhcpd6/proxy' an error was encountered: Unable to parse the output from systemd for service <MAAS:New> <https://launchpad.net/bugs/1588868>
[15:43] <mup> Bug #1588875 opened: [2.0-b6] Deploying a trusty (but not xenial) node frequently fails during storage setup of curtin  <MAAS:New> <https://launchpad.net/bugs/1588875>
[15:45] <gimmic> kiko: http://i.imgur.com/jzb7djY.png
[15:46] <gimmic> fired this off about 30 minutes ago, all the nodes are hung in deploying state, same logs:
[15:46] <gimmic> Node installation - 'curtin' failed: configuring storage	Fri, 03 Jun. 2016 10:26:55
[15:46] <gimmic> Node installation - 'curtin' failed: configuring disk: sda	Fri, 03 Jun. 2016 10:26:55
[15:46] <gimmic> Node installation - 'curtin' started: configuring disk: sda	Fri, 03 Jun. 2016 10:26:54
[15:46] <gimmic> Node installation - 'curtin' started: configuring storage	Fri, 03 Jun. 2016 10:26:54
[15:46] <gimmic> using LVM now
[15:47] <gimmic> none of them have failed yet, but I suspect there's a timeout somewhere ticking down
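The event lines gimmic pasted above can be summarized mechanically. The following is a minimal Python sketch, not a MAAS tool: it assumes the events arrive as plain text in exactly the pasted form (message, whitespace, then a timestamp such as "Fri, 03 Jun. 2016 10:26:55"), and the names parse_event and summarize_steps are made up for illustration. It pairs each "started" event with the later event for the same step and reports the final status and the elapsed seconds.

```python
import re
from datetime import datetime

# Matches lines of the form pasted in the channel, e.g.
#   Node installation - 'curtin' failed: configuring storage<TAB>Fri, 03 Jun. 2016 10:26:55
# The layout is an assumption based only on those four pasted lines.
EVENT_RE = re.compile(
    r"^Node installation - '(?P<source>[^']+)' "
    r"(?P<status>\w+): (?P<step>.+?)\s+"
    r"(?P<when>\w{3}, \d{2} \w{3}\. \d{4} \d{2}:\d{2}:\d{2})$"
)


def parse_event(line):
    """Return a dict for one event line, or None if the line does not match."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    # %a and %b are locale dependent; this assumes English day and month names.
    when = datetime.strptime(match.group("when"), "%a, %d %b. %Y %H:%M:%S")
    return {
        "source": match.group("source"),
        "status": match.group("status"),
        "step": match.group("step"),
        "when": when,
    }


def summarize_steps(lines):
    """Map each step to (final status, seconds since its 'started' event)."""
    events = [e for e in map(parse_event, lines) if e is not None]
    events.sort(key=lambda e: e["when"])  # the UI lists newest first
    started = {}
    summary = {}
    for event in events:
        if event["status"] == "started":
            started[event["step"]] = event["when"]
        elif event["step"] in started:
            elapsed = (event["when"] - started[event["step"]]).total_seconds()
            summary[event["step"]] = (event["status"], elapsed)
    return summary


SAMPLE = [
    "Node installation - 'curtin' failed: configuring storage\tFri, 03 Jun. 2016 10:26:55",
    "Node installation - 'curtin' failed: configuring disk: sda\tFri, 03 Jun. 2016 10:26:55",
    "Node installation - 'curtin' started: configuring disk: sda\tFri, 03 Jun. 2016 10:26:54",
    "Node installation - 'curtin' started: configuring storage\tFri, 03 Jun. 2016 10:26:54",
]

if __name__ == "__main__":
    for step, (status, seconds) in summarize_steps(SAMPLE).items():
        print(f"{step}: {status} after {seconds:.0f}s")
```

On the pasted lines this shows both steps failing about one second after they start, which points to an immediate error in the disk step rather than a long-running operation that times out.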
[15:47] <dimitern> gimmic: just filed that bug above for this
[15:48] <gimmic> dimitern: I'm not sure it's LVM related
[15:48] <gimmic> initially I was doing flat ext4
[15:48] <gimmic> and still run into it
[15:53] <dimitern> gimmic: it is
[15:54] <dimitern> gimmic: the default layout is flat, but I've created a VG on top of that
[15:54] <dimitern> in order to emulate having 2 distinct partitions
[15:57] <gimmic> Yeah, I'm saying I see the same error/behavior without using LV
[15:57] <gimmic> even installing w/ just flat ext4 curtain hangs in the same way
[15:57] <gimmic> ..curtin
[16:18] <dimitern> gimmic: ah, I see .. not good :/
[16:25] <mup> Bug #1588868 changed: [2.0b5] While monitoring service 'dhcpd/tgt/dhcpd6/proxy' an error was encountered: Unable to parse the output from systemd for service <MAAS:Incomplete> <https://launchpad.net/bugs/1588868>
[16:44] <kiko> gimmic, can you feed into https://bugs.launchpad.net/maas/+bug/1588875 as well?
[16:55] <mup> Bug #1588907 opened: [2.0b6] django.db.utils.IntegrityError: insert or update on table "piston3_consumer" violates foreign key constraint "piston3_consumer_user_id_4ac0863fa7e05162_fk_auth_user_id" <MAAS:New> <https://launchpad.net/bugs/1588907>
[17:04] <kiko> hm
[17:31] <mup> Bug #1588914 opened: [2.0b6] MAAS writes DHCP multiple times while not much is going on <MAAS:New> <https://launchpad.net/bugs/1588914>
[17:55] <gimmic> so just fyi
[17:56] <gimmic> if I use ipmi to power cycle the failed deployment node, release it, and re-deploy
[17:56] <gimmic> it seems to deploy fine
[17:57] <gimmic> I suspect it's something with how the partitioning step's exit code is handled
[17:57] <gimmic> when I redeploy, the partitions are already there
[18:40] <gimmic> okay, confirmed.. If I just shut the node down and netboot it again
[18:40] <gimmic> it works fine after re-deploying
[18:40] <gimmic> so the partition creation is exiting with an error code, but completing
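gimmic's hypothesis (a step that exits non-zero even though its work was done) can be illustrated in isolation. This is a generic Python sketch and does not reproduce what curtin runs: the child command is a stand-in that creates a file and then exits with status 1, and run_and_verify, demo, and the marker file name are hypothetical. The point is only that the exit code and an independent check of the result can disagree.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_verify(cmd, postcondition):
    """Run cmd, then report its exit code and whether the postcondition holds."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return {
        "returncode": result.returncode,
        "completed": bool(postcondition()),
        "stderr": result.stderr.strip(),
    }


def demo():
    """Stand-in for a step that does its work but still exits with an error."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / "partition-table-written"  # hypothetical side effect
        script = "import sys, pathlib; pathlib.Path(sys.argv[1]).touch(); sys.exit(1)"
        cmd = [sys.executable, "-c", script, str(marker)]
        return run_and_verify(cmd, marker.exists)


if __name__ == "__main__":
    outcome = demo()
    print(f"exit code {outcome['returncode']}, work completed: {outcome['completed']}")
```

A caller that trusts only the exit code would mark this step as failed, which matches the observation that a second deploy finds the partitions already in place.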
[19:22] <mup> Bug #1588875 changed: [2.0-b6] Deploying a trusty (but not xenial) node frequently fails during storage setup of curtin  <curtin:Confirmed> <MAAS:Invalid> <https://launchpad.net/bugs/1588875>
[19:36] <kiko> roaksoax, smoser: is there a way for gimmic to stop the deployment process from rebooting and ssh in?
[19:38] <gimmic> so my current 'workaround' is let it time out to failed deployment, mark the nodes as broken, mark them as fixed, deploy
[19:38] <gimmic> that allows me to avoid fiddling with out of band ipmi resets
[19:38] <gimmic> I have now deployed 30 nodes using that methodology
[19:44] <smoser> gimmic, 2 ways
[19:45] <smoser> a.) maas server change /etc/maas/preseeds/curtin_userdata
[19:45] <smoser> see 'power_state' there
[19:45] <smoser> commenting that out will do
[19:46] <smoser> b.) ssh in during deployment and sudo touch /run/block-curtin-poweroff
[19:47] <kiko> gimmic, that will let you try and re-run the partitioning command and see what's failing
[19:47] <kiko> could it be a gpt/efi thing?
[19:48] <smoser> kiko, you can also just put it in commissioning and have it not power itself off
[19:48] <smoser> and then do whatever you want
[19:48] <kiko> smoser, well, I think the problem is right now gimmic doesn't even know what curtin is doing
[19:50] <kiko> so doing it in the deployment phase is going to be easier
[23:41] <mup> Bug #1588154 changed: [2.0b5] Deploying node fails when it has VLAN configuration <MAAS:Invalid> <https://launchpad.net/bugs/1588154>