=== rangerpb is now known as rangerpbzzzz
cpaelzerthanks rharper for your mail10:47
cpaelzerrharper: I was creating wrappers accordingly to show our user-stories10:47
cpaelzerso far working fine10:47
cpaelzerI also made some time analysis with cloud-init-analyze on case 2 I wanted to discuss with you10:48
cpaelzerAlso I'm not sure if we want/need to "show" user story #4, I'd more expect that to be the story that leads over to the discussion on how to stage it into Xenial10:48
cpaelzerrharper: now lunch, then creating case 3 but that is essentially 2 on Openstack which should be fine10:52
cpaelzerrharper: please ping me once you are around for the discussion on the #2 timings10:52
cpaelzersmoser: you would have enjoyed this lunch http://johorkaki.blogspot.com/2015/10/samyang-extremely-spicy-chicken-flavor.html11:24
=== blaisebool is now known as Guest58715
cpaelzerrharper: comparing times on the Openstack based execution is just as non-helpful12:21
cpaelzerrharper: I need to discuss if/where I should see something stable enough for the Demo other than the huge ec2 conf timeout12:22
cpaelzerthat one I have and I also like how we can show off the "snap config and install via cloud-init"12:22
cpaelzerbut everything I threw at cloud-init analyze so far didn't give me a good showcase for the timing at least12:23
cpaelzeron top it seems that the older ci enabled image has not yet the improvements on the output for cloud-init analyze to be able to differ the stages more easily12:23
cpaelzerrharper: ping me, I can send you data or let you to my bastion whatever we need12:23
cpaelzerother than that the tech part is ready creating a few raw slides to guide along12:24
cpaelzerrharper: smoser: jgrimm: I've made a draft for the ds-identify show and fully automated the demo part of it based on ryans work13:31
cpaelzerrharper: thanks13:31
cpaelzerI'll share the draft slide deck so you can review/modify13:31
cpaelzerif nothing is fatally broken jgrimm and I will adapt depending what comes up on Monday13:32
cpaelzerrharper: the only blind spot that is left is my lack of a good case to show the timing - waiting for you to show up13:33
cpaelzermaybe I even have all I need but just don't find it impressive enough :-)13:33
rharpercpaelzer: not quite in yet, but I think this should capture the delta we need (the first being , no searching, then latter how cloud-init on OpenStack looks without identify)  http://paste.ubuntu.com/23746042/13:46
cpaelzerrharper: forgot to reply - I think I see what you mean - gathering that on my demo env now and seeing if it shows something nice14:03
cpaelzerrharper: smoser: I found every now and then (about 1/8 of the cases) that the final stage hangs15:15
cpaelzerthought it is part of my ssh setup, but now got in and I find it hanging15:15
cpaelzeroh I see15:15
cpaelzernot "us" but snap it seems15:15
rharpercpaelzer: ok, here now15:16
rharperif you use my user-data15:16
rharperit does snap installs15:16
rharperwhich may take some time15:16
cpaelzerwhich I want for some of the cases15:16
cpaelzerbut, what I mean is that this sometimes hangs15:16
cpaelzerlike really for minutes15:16
rharpersnaps do that15:16
cpaelzerhow unfriendly15:17
rharperdownload, verify, unpack, squash mount, etc15:17
cpaelzeris there anything cloud-init should do to unlock that?15:17
cpaelzerthe process is sleeping15:17
rharperapt isn't fast either15:17
cpaelzerI really think it is dead some way15:17
rharperpossible, in general, we've discussed the idea of doing some config modules in parallel but it's a non-trivial problem if one has dependencies between them15:18
cpaelzerwhat I want would be a timeout and retry15:18
cpaelzerthis is now crossing the 10 minute mark - it won't ever succeed15:18
rharpersee the mount15:18
rharper systemctl start snap-part\x2dcython-3.mount15:18
rharperthat's not us15:18
cpaelzerthe x2dcython-3.mount one15:19
cpaelzerok let me rephrase15:19
rharperI understand what you suggest;15:19
cpaelzerI think it is a snap issue, but should the snap module of cloud-init take care to detect and recover in those cases15:19
rharperconfig modules can certainly have timeouts15:19
cpaelzeroh ok15:19
cpaelzersorry for feeling misunderstood then15:19
rharperwe do in other places (like searching for data sources)15:19
rharperno worries15:19
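The "timeout and retry" idea discussed above can be sketched in Python as follows. This is a minimal illustration, not cloud-init's actual snap module: the helper name `run_with_retry` and its parameters are hypothetical, and it only assumes that the hanging step is an external command that can be killed and started again.

```python
import subprocess
import time


def run_with_retry(cmd, timeout, retries=3, delay=0.0):
    """Run cmd; kill it after `timeout` seconds and retry up to `retries` times.

    Hypothetical helper for illustration only, not part of cloud-init.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # when the timeout elapses.
            return subprocess.run(
                cmd, check=True, timeout=timeout,
                capture_output=True, text=True,
            )
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError) as err:
            last_error = err
            if attempt < retries:
                time.sleep(delay)
    raise RuntimeError(f"{cmd!r} failed after {retries} attempts") from last_error
```

A caller would wrap the slow command, for example `run_with_retry(["snap", "install", "hello"], timeout=300)`, where the 300 second value is an arbitrary assumption.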
cpaelzerbut when scripts start a bunch of them 1/8 hits often enough to feel bad :-)15:22
rharperyeah, maybe use different snaps15:24
cpaelzerone instead of three should already help to mitigate most I hope15:27
rharperhello is going to be fast and easy15:27
rharperit's the #1 snap in the store  =)15:27
cpaelzeranother issue seems to be that if I check /var/lib/cloud/instance/boot-finished too early that seems to conflict with the final stage (re)starting ssh - that makes the final stage hang as well it seems15:41
cpaelzerwhich still is part of config snappy I just see15:41
cpaelzer |`->running config-snappy with frequency once-per-instance @02.00000s +912.00000s15:41
cpaelzerthat is in the analyze after I hard restarted the ssh on the UC16 openstack guest15:41
cpaelzerthat is in the case without user-data even15:43
rharpernot following final stage restarting ssh ?15:44
cpaelzerthe cloud-analyze output is linear as well right?15:44
rharpersure, it's just sorted by event timestamp15:44
cpaelzerrharper: http://paste.ubuntu.com/23746681/ line 21215:45
cpaelzerthe pastebin has three cases with-data, no-data-ds-identify, no-data-old15:45
rharperI'm not seeing that huge time15:46
rharperwhich image?  recent or the old one ?15:46
cpaelzerold one15:47
cpaelzeryou think it might just be an old issue?15:47
rharperit certainly has older packages15:47
rharperwhich may get refreshed updated15:47
rharperwhich could block time15:47
rharperlet me upload a newer  image without the updated cloud-init15:47
rharperso it's more apples to apples w.r.t boot time15:47
cpaelzerno hurry ping me or write mail once you rebuilt one - so I can exchange the image I use15:52
cpaelzerthanks for the "policy can be overridden" statement - missed to add that15:54
cpaelzerrharper: FYI - I have all the timing data if discussion comes up, but I think there is nothing with enough "bang" in it to get a slide15:57
cpaelzerI can pull them out whenever needed - or not if not15:57
cpaelzertoo much data without the need could lead to deep dives on unimportant numbers15:57
rharperwell, I think a 10x factor in time to run cloud-init-local is nice enough15:57
rharperthe biggest win will be in the ec2 image (we don't have those times) simply because ec2 runs *last*15:58
cpaelzerof course15:58
cpaelzerjust at least on openstack old and new image both are rather fast15:58
cpaelzeror my timing granularity is bad15:59
cpaelzerit's basically 00.0000 for init-local and init-network on OLD15:59
rharperwell, you need to use the journalctl method to get subsecond resolution15:59
cpaelzerI had both15:59
cpaelzerah yeah15:59
cpaelzerit is so small15:59
rharperI suggest trying again with the journalctl -o short-precise -u cloud-init-local | cloudinit-analyze show -i -16:00
cpaelzerdo I miss something that init-network is not in the journal method?16:00
cpaelzerI already have the journal data rharper16:00
rharperI just did one unit16:00
cpaelzer00.56115 -> 00.0812616:00
rharperyou can string them all16:00
rharperin total time it's small16:00
cpaelzerI'll do so16:00
rharperbut it's a rather huge reduction16:01
rharper-u cloud-init-local -u cloud-init -u cloud-config -u cloud-final16:01
rharperare the unit names to append to the journalctl command16:01
cpaelzernow that I spent the automation effort for all showcases it is a minor change to get it with that on top :-)16:01
rharperthat'll get you all stages16:01
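As a sketch of "stringing them all", the journalctl invocation for all four stages could be assembled like this. The unit names come from the discussion above; `short-precise` is assumed to be the journalctl output mode intended, and the function name is hypothetical. The resulting output would then be piped into the analyze tool as described in the chat.

```python
import shlex

# Unit names as listed in the discussion, one per cloud-init stage.
CLOUD_INIT_UNITS = ("cloud-init-local", "cloud-init", "cloud-config", "cloud-final")


def journalctl_command(units=CLOUD_INIT_UNITS):
    """Build the journalctl argument list with one -u flag per unit."""
    cmd = ["journalctl", "-o", "short-precise"]
    for unit in units:
        cmd += ["-u", unit]
    return cmd


# Print the command line so it can be copied into a shell pipeline.
print(shlex.join(journalctl_command()))
```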
rharperfor the images using config-drive, the change will be small, ones using Openstack datasource, will be larger win as it probes local + other clouds then OpenStack datasource16:02
rharperso the further from the "head" of the list, the greater the win w.r.t time reduction;  and you're right, it's not huge in terms of wallclock time16:03
rharperbut the improvement scales with the speed of the system16:03
cpaelzerok, then I'm good16:04
rharpera follow up is that the POC only prevents cloud-init from probing other sources16:04
cpaelzerit means I read the data correctly and it isn't impressive until underlining the sweet spots :-)16:04
rharperwe will further have cloud-init skip the specific datasource probe as well16:04
rharperyes, that's true16:04
cpaelzerbut - as we said when you showed cloud-init analyze - no matter if big or small, having the data is the important point for the discussions16:05
powersjmagicalChicken: https://paste.ubuntu.com/23747530/17:47
magicalChickenpowersj: needs --upgrade17:54
magicalChickenI had meant to do that yesterday, I'll get that done now17:55
magicalChickenI added a config flag to automatically do --upgrade on linuxcontainers.org images17:55
magicalChickensince they don't ship with cloud-init17:55
powersjlet me know when I can pull again (no rush) and I can continue playing with it :)17:55
magicalChickenpowersj: sure, should have that fixed in just a bit17:56
aixtoolsbefore I get started again - a git question. How do I update 'my' copy - aka aixtools (aixtools        ssh://aixtools@git.launchpad.net/~aixtools/cloud-init (push)) - from origin  (https://git.launchpad.net/cloud-init (fetch))18:25
naccaixtools: do you have changes in your branch relative to what you currently have from cloud-init origin?18:27
magicalChickenaixtools: you can add the origin as a remote and pull18:27
naccand then you'd rebase your branch(es) onto the updated origin/master, normally18:28
naccit depends on your workflow, and what, if any changes, you have locally18:28
aixtoolsi have a separate branch I am working (an AIX port I hope); I have 'aixtools' that is my clone of master.18:35
aixtoolswhat I would like to do is: 1) get my launchpad "clone" up to date; 2) use that to update (i think fetch?) my 'local' copy of 'master';18:36
naccaixtools: sorry, can you pastebin the output of `git remote; git branch; git branch -r` ?18:36
aixtoolsfinally, 3)18:36
aixtoolsupdate/merge the changes of the current status into my 'changes' for aix-port18:37
naccaixtools: ok, so this is how *I* would do it, you can choose to take/leave what you want :)18:43
aixtoolsi am a noob - I shall live and learn :)18:43
naccaixtools: what i would first do (for your own sanity/helpfulness) is make sure that your ~/.gitconfig has:18:43
nacc      decorate = short18:44
naccthat will put in `git log` output, things like tags and head names if they are in the history18:44
naccif you do that, in your current branch (aix-port), `git log` should indicate that an ancestor of HEAD is origin/master (I'm guessing, presuming you have not fetched yet)18:44
naccso the way to update your tree would be then:18:45
nacc1) git fetch origin18:45
nacc2) git rebase -i origin/master18:45
nacciiuc, (it does depend on how many commits you have relative to the old upstream you were using), this will present you with an $EDITOR window, which allows you to specify how to treat the commits that are to be carried forward18:46
aixtoolsdid that also update my copy i.e., ssh://aixtools@git.launchpad.net/~aixtools/cloud-init , or is that an update of my local disks?18:46
aixtools(step 1, that is)18:47
naccjust local18:47
aixtoolsand step 2 - starting now...18:47
naccall step 1) did was to fetch from the 'origin' remote any branches and commits18:47
naccso then 3) would be `git push aixtools aix-port` (presuming your remote branch is also called aix-port, if it's master there, you would say aix-port:master)18:48
aixtoolsstep 2 - what is that 'trying' to do. looks like my changes coming into 'master' which I do not want.18:48
naccaixtools: look at the pictures in `man git-rebase`18:49
aixtoolsi want to move 'master' into aix-port and.or see conflicts, so I can review them18:49
naccbasically you have a topic branch (aix-port)18:49
naccwait, what?18:49
aixtoolsmaybe my thought process is 'wrong',18:50
aixtoolsyes - topic branch aka aix-port18:50
naccso git is just storing a DAG, right?18:50
naccdirected acyclic graph18:50
aixtoolsDAG was shorter, still do not know the fancy verb18:50
aixtoolsadjective i should say18:51
naccok, let's gloss it for now18:51
aixtoolshence, i wanted my 'local' master to be equal to the project master.18:51
naccif you look at the first example in `man git-rebase` (around line 68)18:51
naccaixtools: right, but you don't *need* that at all for git18:51
aixtoolsgit is turning into the new trick this old dog cannot learn18:52
aixtoolsi read somewhere git is keeping three copies: the 'master', a local copy of the master, and then the local changes18:53
naccaixtools: i mean, yes your repository's master branch can track origin/master, but you also already have origin/master :)18:53
nacci feel like this would be way faster to explain on the phone :)18:54
naccbut let me keep going18:54
aixtoolsmy thought is to keep my topic-branch as close to master as I can.18:54
naccthat's smart18:54
naccyou do that with regular rebases18:54
aixtoolsok, so all i have done now is step 1, step 2 I aborted.18:55
naccaixtools: do you have hangouts? i can explain this quickly if you have the time?18:55
aixtoolsfor the rebase - would I go back to my branch and then execute 'rebase'18:55
aixtools(imho - git has a lot of features - I will someday see the benefit(s) - but for now they just confuse.)18:56
naccright, i didn't realize you had left your branch18:56
naccas your output before was that you were on the aix-port branch18:56
naccand `git fetch` doesn't move your checked out state at all18:56
aixtoolswell, I did git checkout master before starting irc18:57
naccdon't do that :)18:57
naccwasn't in my list of steps :)18:57
aixtoolsno, it was in someone elses list (long-live google)18:57
aixtoolsseemed to be the way to prepare for a merge18:58
aixtoolswhich is what I thought I needed to do18:58
naccso, here's my opinion18:58
naccyou have no need for a local master branch18:58
naccit is of no use to you, as you're always doing topic branches forked from upstream's master18:58
naccso let's just ignore the local master :)18:58
naccyou can delete it, but git will complain sometimes, so it's easier to leave it around, but ignore it18:59
naccso we're going to only work on your aix-port branch18:59
nacc(git checkout aix-port)18:59
naccwe're going to run the rebase step here, which is basically telling git (long typing to follow)18:59
naccgit rebase -i origin/master18:59
nacc(implicitly the commit to rebase is HEAD)18:59
naccI was based off something in the history of origin/master19:00
naccbut now origin/master has moved on without me19:00
naccI want git to 'remember' all the stuff that i've done from that historical fork-point (called the merge-base) and save it19:00
naccthen I want to fast-forward the branch I'm on (which HEAD is on) to the updated state of origin/master19:00
naccand then I want to replay the 'remembered' stuff, as new commits19:01
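The remember / fast-forward / replay description above can be modelled with plain lists. This is only a toy sketch of the idea, assuming linear histories written as lists of commit labels (oldest first); it is not how git stores or rewrites commits internally.

```python
def rebase(topic, upstream):
    """Toy model of `git rebase upstream` for two linear histories.

    Both arguments are lists of commit labels, oldest first, that share
    a common prefix ending at the merge-base.
    """
    # Find the merge-base: the length of the shared prefix.
    base_len = 0
    for mine, theirs in zip(topic, upstream):
        if mine != theirs:
            break
        base_len += 1

    # "Remember" everything done since the fork point.
    remembered = topic[base_len:]

    # Replayed commits are new commits; a trailing prime marks that here.
    replayed = [commit + "'" for commit in remembered]

    # Fast-forward to upstream, then append the replayed work.
    return upstream + replayed


print(rebase(["A", "B", "X", "Y"], ["A", "B", "C", "D"]))
# prints ['A', 'B', 'C', 'D', "X'", "Y'"]
```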
aixtoolsso, just save the file with all the 'picks' in it.19:01
naccaixtools: in your case, yeah19:01
naccaixtools: as you don't want to drop anything19:01
aixtoolsnot yet :)19:01
aixtoolsSuccessfully rebased and updated refs/heads/aix-port.19:01
naccaixtools: you can, in the future, probably, drop the -i. I like to always see what git-rebase is going to do, but your case should be a quick rebase each time, esp. if you do it often19:02
aixtoolsOn branch aix-port19:02
aixtoolsYour branch and 'aixtools/aix-port' have diverged,19:02
aixtoolsand have 9 and 10 different commits each, respectively.19:02
aixtoolsso, rather than the pull suggested, I would do a push?19:02
aixtoolsto put the local copy on the server?19:03
naccyep, i said that earlier, `git push aixtools aix-port`19:03
naccnow, if i had to guess, that will complain saying it's not a fast forward19:03
aixtoolsforgot that... my apologies19:03
aixtoolsfew lines too many... http://pastebin.com/289U3mCR19:05
naccnon-fast-forward is the important bit19:05
naccso the reasoning here is19:05
naccimagine someone was using your branch as the basis for their work19:05
naccthey, just like you did, want to be able to do a `git fetch origin` (except their origin is your repository)19:06
naccand have it make sense and be a linear history19:06
naccbut you just 'moved' your history by rebasing it19:06
naccso for your topic branch, it's relatively likely you'll need to tell `git-push` the '-f' flag (to force), *if* you know your local branch is correct19:07
aixtoolsso, just add -f19:07
naccafter verifying you want your local aix-port branch to be what is on the server19:07
nacc(by looking at git-log, diffing against origin/master, etc)19:08
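Why the push is rejected as non-fast-forward can be seen with the same kind of toy list model (commit labels, oldest first; an illustrative assumption, not git's real data structures): a push is a fast-forward only if the history already on the server is a prefix of the history being pushed.

```python
def is_fast_forward(remote, local):
    """True if `local` only adds commits on top of what `remote` already has."""
    return local[:len(remote)] == remote


server = ["A", "B", "X", "Y"]               # branch as previously pushed
rebased = ["A", "B", "C", "D", "X'", "Y'"]  # same branch after the rebase

print(is_fast_forward(server, rebased))               # prints False
print(is_fast_forward(server, server + ["Z"]))        # prints True
```

In the first case the server's X and Y are no longer part of the local history, which is the situation where `git push -f` is needed.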
aixtoolswell, as I am working solo - it is either a mess and I get to start over again, or it is okay.19:08
aixtoolsI'll vote (read hope) for the latter.19:09
naccack, it's not a big deal for topic branches that are one-offs19:09
aixtoolsTotal 56 (delta 40), reused 0 (delta 0)19:09
aixtoolsTo ssh://git.launchpad.net/~aixtools/cloud-init19:09
aixtools + f1dee34...be633b8 aix-port -> aix-port (forced update)19:09
naccit's a bigger deal for origins, masters, etc19:09
aixtoolsSo, maybe even a good way to learn the ropes.19:10
naccnote that until you do a MR, there's not really even a reason to push19:10
naccexcept if you develop in multiple places, or if you are worried about your system dying19:10
naccthere's also not a reason *not* to push, admittedly19:10
aixtoolsthanks very much - the boss (wife) called. time to go...19:10
aixtoolswell, I am also trying to learn git.19:10
aixtoolslater, or tomorrow. thx.19:11
naccaixtools: np! i'll be around19:11
=== rangerpbzzzz is now known as rangerpb
magicalChickenpowersj: I got the setup_overrides working so lxd images can force --upgrade even if it isn't specified19:33
magicalChickenpowersj: so 'run -n xenial' should work now without upgrade19:33
powersjmagicalChicken: tests appear to be running, thank you!19:42
powersjmagicalChicken: I just got the too many open files on my laptop19:46
magicalChickenpowersj: what image was it on?19:46
powersjpython3 -m tests.cloud_tests run -v -n xenial19:47
powersjdidn't think you mentioned that with the ubuntu image19:47
magicalChickenpowersj: i've never seen it with xenial19:47
magicalChickeni only get it on centos 7/6.6, debian wheezy, and ubuntu precise19:48
magicalChickenpowersj: you're getting it in a different place too19:49
magicalChickeni've always seen the stacktrace come from inside pylxd19:49
magicalChickennot sure what's going on yet, must be resources leaking somehow19:50
rharpersomething is leaking file descriptors, likely the exec bits in pylxd ?19:51
magicalChickenyeah, it must be something like that19:51
magicalChickenthis only started after the switch to pylxd 2.2 i think19:51
magicalChickeni'm going to make a branch with centos/debian support but using pylxd 2.1, see if it happens there19:52
rharperApparently the relevant limit is /proc/sys/fs/inotify/max_user_instances. This is "128" by default. When increasing it with19:53
rharper   sudo sysctl fs.inotify.max_user_instances=25619:53
rharperlooks to be the tl;dr19:53
rharperthat looks helpful here19:54
magicalChickenthat definitely looks related19:54
magicalChickenfrom the last comment on there, it sounds like the init system being used by the container has an effect on whether or not the bug occurs19:55
magicalChickenwhich could explain why i'm only seeing it with some distros19:55
rharperfor sure19:56
rharpersystemd new enough spawns cgroups per unit19:56
rharperincluding inotify watches on each one19:56
magicalChickenthat make sense19:56
rharperso, pre-systemd like 2XX or something like that isn't affected19:56
magicalChickenI'm hesitant about just bumping the limit though19:57
magicalChickenbecause it may still hit the limit eventually19:57
rharperbut there's nothing to do about that other than wait19:57
magicalChickenMaybe minimizing the calls to execute() would help, as right now it polls for the system being up using execute()19:57
rharperit's a global limit19:57
rharperwell, sounds like systemd execution in the guest is consuming them19:57
rharpernot exec (I was thinking of leaking fd's, we saw that like more than a year ago)19:58
magicalChickenoh, yeah if its systemd itself that's using all the fd then there's not much to help19:58
rharperright, other than watching the global limit and raising it before running19:58
rharperwe can raise it up, watch the count during a run19:58
rharperand see how close we get19:58
rharperand then in jenkins (at least) ensure we run with a limit high enough to handle things with some head room19:59
magicalChickenyeah, that should work19:59
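A minimal sketch of the "watch the limit and keep some head room" check, assuming a Linux host where the limit is exposed at the /proc path quoted above. The function names and the 25% margin are hypothetical choices for illustration; how the peak usage during a run is measured is left out.

```python
from pathlib import Path

# Path quoted in the discussion above (Linux only).
INOTIFY_LIMIT_PATH = "/proc/sys/fs/inotify/max_user_instances"


def read_inotify_limit(path=INOTIFY_LIMIT_PATH):
    """Return the current per-user inotify instance limit as an int."""
    return int(Path(path).read_text().strip())


def has_headroom(limit, peak_in_use, margin=0.25):
    """True if the observed peak leaves at least `margin` of the limit free."""
    return peak_in_use <= limit * (1 - margin)
```

For example, with a peak of 120 instances during a run, `has_headroom(128, 120)` is false while `has_headroom(256, 120)` is true, which would argue for raising the limit on the Jenkins host before running the suite.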
magicalChickenhopefully there'll be either systemd or kernel changes to fix this eventually though19:59
rharperthere's no fix19:59
rharperit's just a global resource that's being used19:59
rharpersorta like file descriptors;  if you make that many opens, it has to be tracked;20:00
powersjmagicalChicken: also https://github.com/lxc/pylxd/issues/20920:00
rharpersystemd is a heavy inotify user20:00
powersjand 211 from the link at the bottom20:00
rharperdouble whammy20:00
magicalChickenyay :)20:00
rharpertime to poke  rockstar in #lxd20:01
magicalChickenwell, that one is fixable20:01
magicalChickenhaha yeah20:01
magicalChickentemp fix in the test suite is to go back to polling inside the instance with a single call to execute() though20:01
rharperwe could even test it20:02
rharpereither way, for sure20:02
magicalChickenfunny enough, I changed to doing it in python right after the pylxd 2.2 switch20:02
magicalChickenyeah, doing that switch should show which bug broke us20:02
magicalChickenor it could be both too :)20:02
rharperwell, I suspect the exec fd is the big one, but it was hard to find without systemd consuming a bunch of stuff too20:03
magicalChickenthe systemd limit is still an issue without exec though too20:03
magicalChickenbecause ideally I had wanted to get all of this going in parallel20:03
rharperyeah, but I think raising the limit on a jenkins instance is reasonable, and we can add that to testsuite docs20:03
rharperyeah =)20:03
magicalChickenyeah, that makes sense20:03
erick3kanyone here?22:13
erick3kor everyone idling?22:13
rharperbest to just ask, folks will answer when they can22:14
erick3koh nice22:14
erick3kjust making sure there is ppl22:14
erick3ki go other rooms and they are full yet is like there is nobody22:14
rharperhappens here too22:14
erick3ki am trying to expire root password after launch22:15
erick3kis it possible to do it within the vm?22:15
erick3kin cloud.cfg?22:15
erick3kalthough this doesnt work: # System and/or distro specific settings # (not accessible to handlers/transforms) system_info:    # This will affect which distro class gets used    distro: ubuntu    # Default user name + that default users groups (if added/used)    default_user:      name: root      lock_passwd: false  chpasswd:       expire: True22:15
rharperlet's look at the user config22:15
erick3ksomething like that?22:15
erick3klike the cloud.cfg?22:16
rharperhttp://cloudinit.readthedocs.io/en/latest/topics/modules.html#users-and-groups   shows you can set expire to a date;22:17
rharperlemme look at the code to see what gets passed around22:17
rharperanother option is to use the chage command as a runcmd22:17
erick3ksorry rharper am not familiar with terms22:18
erick3kby user config you mean cloud.cfg on /etc/cloud?22:19
rharperthere's a linux command called 'chage'22:19
rharper /etc/cloud is the default config, typically one passes in user-data into the instance in addition to the default22:19
rharperat least on debian/ubuntu, there's no root password set, so nothing to expire22:19
erick3kproblem is i would like to do it within the vm22:19
erick3kinstead of passing it through user-data22:20
erick3ki did set a password22:20
rharperand your image already has a password set for root, right22:20
erick3kbut upon launching i want it to expire so customers have to reset it22:20
rharperif you're authoring the image22:20
rharperthen when you set the root password, you can use the chage command to expire it immediately22:21
rharperrather than doing it in cloud-init (which is just going to run the 'chage' command anyhow)22:21
erick3ki tried both22:22
erick3kpasswd --expire root and  chage -d 0 root22:22
erick3kbefore powering off and running cloud-init22:22
erick3kbut doesn't work22:22
rharperyou want to look at the 'mount-image-callback' command; this lets you run commands inside the filesystem of the vm22:23
erick3kit applies whatever password you set but upon login it is not expired22:23
rharperyou can use chage --list root to see what got set22:23
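Besides `chage --list root`, the effect of `chage -d 0 root` can be checked directly in the image's /etc/shadow: it sets the third field (date of last password change) to 0, which shadow(5) defines as "must change password at next login". A small sketch of that check, with a hypothetical function name and made-up example lines:

```python
def must_change_at_next_login(shadow_line):
    """True if a single /etc/shadow entry forces a password change at login.

    Field 3 (index 2) is the date of the last password change in days
    since the epoch; the value 0 means the password must be changed at
    the next login.
    """
    fields = shadow_line.strip().split(":")
    return len(fields) > 2 and fields[2] == "0"


# Example entries with placeholder hashes, not real data.
print(must_change_at_next_login("root:$6$salt$hash:0:0:99999:7:::"))      # prints True
print(must_change_at_next_login("root:$6$salt$hash:17170:0:99999:7:::"))  # prints False
```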
erick3kok gonna try that again22:25
erick3krharper it worked22:31
erick3ki get this https://i.imgur.com/xPgaxcO.png22:32
erick3kuntil i change the expired password i can't ssh22:32
rharperare you trying to ssh in as root ?22:32
erick3ki had to login through the vm console and set the expired password22:34
rharperyeah; I've never set a root password or forced it to expire; I suspect there's something at play with sshd config with root22:35
rharpersorry I'm not more help here22:37
erick3kthanks for trying22:39
rharpermaybe not; -1 disables expiration22:40
erick3kquick question22:56
erick3kdoes the ubuntu user need to exist for cloud-init to run?22:56
rharperbut many modules expect there be a default non-root user22:58
rharperso, things like 'add ssh keys' only work if you use the default config (which supplies a non-root user for your distro type)22:58
erick3kis it safe to delete the ubuntu directory in /home?23:02
erick3kubuntu home user directory23:02
rharperonly affects the ubuntu user23:03
