/srv/irclogs.ubuntu.com/2016/04/05/#cloud-init.txt

=== spandhe_ is now known as spandhe
kalxHi all. I recently made a new AMI on AWS after running through Linux system updates. I noticed the new one has some severe lag after startup.06:48
kalx Per logs, it seems the 'config_apt-pipelining' module is taking 4-5 mins to execute now. Anyone run into similar issues before?06:48
kalxDoes anyone know what the apt-pipelining config module does? Is it purely just disabling apt-pipelining? or does it do anything else?07:20
kalxChecked code, it just writes a config file to the apt config dir to disable/enable apt pipelining, nothing else09:28
kalxso likely something else is actually causing the issue, and it just happens to block that config module from finishing09:28
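
For reference, a minimal Python sketch of what a module like the one kalx describes amounts to: render a single apt setting and write it into apt's config directory. The file name and the use of the `Acquire::http::Pipeline-Depth` option are assumptions based on general knowledge of apt and are not stated in the log; the helper names are hypothetical, and the directory is a parameter so the sketch can be tried without touching the real apt configuration.

```python
import os
import tempfile

# Assumed file name; the log only says "a config file" in the apt config dir.
PIPELINING_FILE = "90cloud-init-pipelining"


def render_pipelining_conf(depth):
    """Return apt config text for the HTTP pipeline depth (0 disables pipelining)."""
    if depth < 0:
        raise ValueError("pipeline depth must be >= 0")
    return 'Acquire::http::Pipeline-Depth "%d";\n' % depth


def write_pipelining_conf(conf_dir, depth):
    """Write the rendered setting into conf_dir and return the file path."""
    path = os.path.join(conf_dir, PIPELINING_FILE)
    with open(path, "w") as fh:
        fh.write(render_pipelining_conf(depth))
    return path


if __name__ == "__main__":
    # Use a throwaway directory rather than the real apt config dir.
    with tempfile.TemporaryDirectory() as tmp:
        with open(write_pipelining_conf(tmp, 0)) as fh:
            print(fh.read(), end="")
```

Writing one small file like this should take milliseconds, which is consistent with kalx's conclusion that something else is holding the module up.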
Odd_Blokerharper: smoser: What clouds do you know of that use network_config?09:53
smoserOdd_Bloke, it works on openstack with config drive and nocloud currently.13:23
smoserand the 'fallback' will be in place on others.13:23
smoserOdd_Bloke currently its in place only for local data sources (not requiring network)13:24
smoserthe next step is to add it for data sources that require a network, for example EC2 or the OpenStack metadata service13:24
Odd_Blokesmoser: Sorry, I should have explained what I was looking for more fully: I'm trying to track down a problem with precise's ConfigDrive handling of network_config, and I'm looking for a place where the OpenStack configuration is known-good.13:24
Odd_Blokesmoser: Because I don't want to consider fixing it in precise if it turns out I'm just looking at a funky OpenStack configuration. :p13:25
smoseroh. that.13:26
smoserOdd_Bloke, what is it that you're looking at? is it openstack metadata service ? or config drive?13:27
Odd_Blokesmoser: Config drive.13:28
smoserok, so that has a shot at working, but even then i think that probably on precise the cloud-init local job doesn't fully block networking from coming up.13:28
Odd_Blokesmoser: (I'm hoping that we'll be able to convince the partner to just not have precise in the region they're seeing this issue, but want to be sure of all the facts before pushing for that :)13:28
Odd_BlokeBecause it EOLs in a year anyway, and this is a new region, etc.13:28
smoserand thus if it doesn't block networking coming up, then at best we ifdown something and then ifup it again.13:30
smoserthe new stuff is better, in that we block networking from coming up.13:30
Odd_Blokesmoser: Well, looking at it, I'm not sure it _does_ work at all.  It looks like configuration is put into keys that are never read later; but I want to confirm that I'm not just dealing with a weird OpenStack configuration that cloud-init mishandles.13:38
Odd_Blokesmoser: (This was totally refactored by trusty, and that works fine)13:38
smoseroh.13:38
Odd_BlokeSo I want a cloud which does network_config "properly" so I can just validate my finding of brokenness. :)13:41
smoserOdd_Bloke, and you want that to work with precise13:42
Odd_Blokesmoser: Well, once we know where we're at, we can go and talk to the partner about whether it's worth making it work.13:42
smoserOdd_Bloke, ok. quickly reading that..13:47
smoseri think that what is there is support for config drive v113:47
smoserwhich is probably not alive in any openstack cloud13:47
smoserconfig drive v2 is what you'd probably see anywhere.13:47
smoserv2 came probably 3 years ago at least13:47
rharpersmoser: did you see my ping re: xenial cloud-image not getting user ubuntu installed, which breaks when we add keys ?14:54
smoseryikes.14:54
smoserno. i didnt.14:54
rharper<rharper> smoser: http://paste.ubuntu.com/15621102/14:56
rharper<rharper> ubuntu user not in /etc/passwd, so the ssh key add failed (xenial cloud image from 2016-04-03 )14:56
rharperrunning a synced curtin vmtest should trip it14:56
smoserhm..14:56
rharper(ie new enough xenial cloud image)14:56
smoser cat /etc/cloud/build.info14:56
smoserbuild_name: server14:56
smoserserial: 20160403-14142914:56
smoserthat works in lxc at least.14:57
smoserlxd14:57
rharperyou're not useradding ubuntu ?14:57
smoser(ie, just launched an instance here and there is an ubuntu user)14:57
rharperwhere is that normally added? (default users/groups) ?14:57
smoserits part of config (/etc/cloud/cloud.cfg)14:57
rharperright. cloud-init does;14:57
rharperalso, there is another one related to booting an image a second time; http://paste.ubuntu.com/15630937/14:59
smoserrharper, so what happened is you failed to get the datasource15:08
smoserand you used the fallback datasource15:08
smoserwhich just generates ssh keys15:09
rharperheh, *I didn't* fail15:09
smoserand there is apparently a bug in that where it creates a directory rather than symlinking15:09
smoser:)15:09
rharperwell, maybe I'm speaking too soon15:09
rharperalways chance for a PEBCAC15:09
rharperwe're providing the normal seed via iso15:10
smoserwell, it is probably failing to find a source.15:10
smoserif you can get a console log it will probably mention that15:10
smoserfallback datasource BAD THINGS TO COME or something like that15:11
rharpercloud-init.log or ?15:11
smosercloud-init.log should have WARN in it15:11
rharperyeah15:11
rharperhard to know what to look for15:11
rharperwhy would it fail ?15:11
rharperto find the iso ?15:11
rharperthat's after it actually loads and reads seed from /dev/vdc15:12
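
Since it is "hard to know what to look for", one option is simply to pull every WARN line out of the log, as smoser suggests. The following is a small Python sketch of that; the default path and the sample log lines are assumptions for illustration, not taken from rharper's pastes.

```python
def find_warnings(lines, markers=("WARN",)):
    """Return (line_number, text) for every line containing any marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path="/var/log/cloud-init.log"):
    """Scan a log file on disk; the default path is an assumption."""
    with open(path, errors="replace") as fh:
        return find_warnings(fh)


if __name__ == "__main__":
    # Hypothetical log lines, only to show the shape of the result.
    sample = [
        "2016-04-05 15:00:01 - util.py[DEBUG]: reading seed\n",
        "2016-04-05 15:00:02 - __init__.py[WARNING]: sample warning text\n",
    ]
    for number, text in find_warnings(sample):
        print(number, text)
```

Matching on the substring "WARN" also catches lines tagged "WARNING", so the exact spelling used by the logger does not matter.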
rharpersmoser: which datasource should our seed.img in our curtin tests show up under ? (NoCloud) right ?15:17
smoseryeah.15:17
rharperhttp://paste.ubuntu.com/15631292/15:18
smoserrharper, ok. i'll take a look in 10 minutes; trying to finish something up for matsubara15:20
rharperwouldn't the fallback seed still read and use /etc/cloud/cloud.cfg (from which the default users get installed)?15:20
rharpersmoser: thanks15:20
smoseryeah.. i'm not sure why the warn about the user.15:21
=== jgrimm is now known as jgrimm-afk
rharpersmoser: so, another reason curtin needs to disable cloud-init network;  nic name races;  cloud-init emits the systemd link stuff, it clashed with the udev rules we wrote and now I got a renam5-eth2 in there15:38
smoserhm..15:39
smoserit shouldn't clash though.15:39
smoseras 70-persistent should always be favored15:39
rharperwell15:39
rharperit's not =/15:39
rharperthe fallback code decided that my eth2 would be a nice eth0 link15:39
rharperthen I suppose it got a rename event, and raced (and 70 won)15:40
rharperbut not before ifup had run and setup some link information in the kernel15:40
smoserok. i'll start poking15:41
* rharper is trying again with networking disabled 15:41
rharperthat fixed my test run15:42
rharperit appears that the disturbance of networking takes cloud-init down some other path that fails to use the local datasource15:43
smoseroh. yeah.15:43
smoserit does.15:43
smoserthats why you're seeing the DataSourceNone15:43
smoserbecause networking never comes up.15:43
rharperthat seems odd; especially for a nocloud ds15:44
rharperbut I'm sure I'm missing something15:44
smoserrharper, what curtin tests were failing for you ?16:44
smoserrharper, ^ i just ran successfully on diglett all tests except for one trusty one (which i think failed due to io load with --processes=-1)17:15
smoserie: http://paste.ubuntu.com/15634350/17:16
rharpersmoser: it's a new one I'm adding for vlan stuff17:40
smoserwell, pfft.17:40
rharperin particular, it's due to the RTNETLINK stuff17:40
rharpersmoser: I think the more general concern is that in the case that network fails, cloud-init fails even when nocloud datasource is present17:41
rharperand networking is failing due to cloud-init networking (in my case, emitting the systemd link file is the direct cause)17:41
rharperso before I add the vlan case which triggers this 100% of the time17:41
rharperI'm adding the code to emit network: config: disabled in curtin target17:42
smoserwell, boot failed because cloud-init wanted networking (as do other things in boot).  they wait until networking is available (network.target)17:42
smoserand there was no network.target reached, so it failed.17:42
smoseri dont think its related to systemd link file17:43
smoser see /lib/udev/rules.d/80-net-setup-link.rules17:43
smoser.link files are only paid attention to if NAME==""17:43
smoserand your 70-persistent.... would have set NAME=17:44
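
The precedence smoser describes can be written down as a toy Python model. This is only an illustration of the argument (a name set by an earlier rule wins, and the .link name is used only when NAME is still empty); it is not udev's implementation, the function and parameter names are hypothetical, and it deliberately ignores timing.

```python
def resolve_interface_name(kernel_name, persistent_name=None, link_name=None):
    """Toy model: pick the final interface name by rule precedence."""
    name = ""
    # A 70-persistent style rule runs first and sets NAME if it matches.
    if persistent_name:
        name = persistent_name
    # An 80-net-setup-link style rule uses the .link name only if NAME is empty.
    if name == "" and link_name:
        name = link_name
    # With no rule setting a name, the kernel-assigned name stays.
    return name or kernel_name


if __name__ == "__main__":
    print(resolve_interface_name("eth2", persistent_name="eth2", link_name="eth0"))
    print(resolve_interface_name("eth2", link_name="eth0"))
```

Under this model the persistent rule always wins, which is smoser's point; a static model like this cannot show the kind of ordering problem during boot that rharper reports.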
rharperyes17:45
rharperit's related17:45
rharperthe link file forced a iface rename of eth2 to eth017:45
rharperand it's racing with existing data in the routing table17:45
smoserhow ?17:45
rharperI don't fully understand the sequence17:45
rharperbut when the vlan config goes to ip set link up on the interface for the vlan17:45
rharperit finds an existing route17:46
rharperand fails17:46
rharperin the routing table, I see things like eth2-rename17:46
rharperand if I disable cloud-init networking, the eni is perfectly fine and solid17:46
rharperI'll post the branch in a minute17:46
rharperif you'd like to debug it more17:46
rharpersmoser: https://code.launchpad.net/~raharper/curtin/trunk.test-vlan/+merge/291023  ;  if you remove the bit in curtin/net/__init__.py where I now write a config to disable cloud-init networking, the XenialTestNetworkVlan testcase will fail and you can boot the install disk to see what's going on17:54
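
A rough Python sketch of the workaround rharper describes, writing a cloud-init config fragment into the install target so that cloud-init leaves networking alone. This is not the code from the linked branch: the file name under cloud.cfg.d and the function name are hypothetical, and only the `network: config: disabled` setting comes from the discussion above.

```python
import os
import tempfile

# YAML equivalent of "network: config: disabled" in flow style.
DISABLE_NETWORK_CFG = "network: {config: disabled}\n"
# Hypothetical file name inside the target's cloud.cfg.d directory.
CFG_RELPATH = "etc/cloud/cloud.cfg.d/99-disable-network-config.cfg"


def disable_cloudinit_networking(target):
    """Write the fragment under the target root and return its path."""
    path = os.path.join(target, CFG_RELPATH)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        fh.write(DISABLE_NETWORK_CFG)
    return path


if __name__ == "__main__":
    # A temporary directory stands in for the mounted install target.
    with tempfile.TemporaryDirectory() as target:
        cfg_path = disable_cloudinit_networking(target)
        with open(cfg_path) as fh:
            print(fh.read(), end="")
```

Taking the target root as a parameter keeps the sketch safe to run on a development machine, since nothing is written under the real /etc.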
=== jgrimm-afk is now known as jgrimm
