wallyworldanastasiamac: can i please get a review on this which is a fix for one of the issues discovered last week during the outage analysis http://reviews.vapour.ws/r/1824/01:30
anastasiamacwallyworld: looking :D01:31
anastasiamacfun \o/01:38
wallyworldjam: hey, you around?06:08
jamhiya wallyworld06:08
wallyworldquiet today with everyone away06:08
wallyworldi have a MP for python-jujuclient which retries send requests if juju says it is upgrading06:08
wallyworldit should address some of the core issues deployer is having06:09
wallyworldcould you take a look?06:09
jamwallyworld: I do wish it was trivial to backoff retries06:11
wallyworldit *could* be implemented, but this i think is an ok first step06:11
wallyworldit covers the small window where juju machine agent needs to first check if upgrades are needed06:12
wallyworldduring which time the api is limited and so the "upgrade error" is reported06:12
wallyworldwhich would be < 1 second normally06:12
wallyworldor thereabouts06:13
wallyworld99% of the time (or pick your own stat), the api goes from limited -> open because no upgrade is required06:13
wallyworldthis stops the case where people juju bootstrap && juju-deploy via a script06:14
wallyworldfrom going wrong06:14
jamwallyworld: if its upgrading can't we also get disconnected completely during this time?06:15
wallyworldso, yes but what a recent juju change did was keep the api in limited mode until the check to see if an upgrade is required. if an upgrade is required, then it does that without giving the deployer a chance to connect simply to be disconnected06:16
wallyworldbut that change opened a small window where the deployer trying to connect initially got the "upgrading error"06:16
wallyworldbecause the upgrade worker needed to start06:17
jamwallyworld: so I see that you're retrying "upgrade in progress" which is fine, my concern is are we also retrying "I got disconnected completely". IIRC the latter is what broke OIL, etc.06:17
wallyworldi didn't intend to retry anything other than "we are upgrading"06:17
wallyworldfor this change06:17
wallyworldthe "we got disconnected" case is a bit separate06:18
jamwallyworld: Isn't the original bug about getting disconnected vs upgrading?06:18
jam(the problem with upgrading is that deployer got disconnected and then just died)06:18
wallyworldjam: hangout? a bit easier to explain06:18
jamsec, need to grab headphones06:19
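[Editor's sketch] The retry-on-upgrade idea discussed above, combined with the backoff jam asks for, could look roughly like the following. This is not the python-jujuclient patch: `UpgradeInProgress`, `call_with_backoff` and all parameter values are hypothetical names chosen for illustration, and only the "upgrade in progress" case is retried, as wallyworld describes. A disconnect would still propagate.

```python
import time


class UpgradeInProgress(Exception):
    """Hypothetical stand-in for juju's "upgrade in progress" API error."""


def call_with_backoff(send, retries=5, first_delay=0.25, factor=2.0,
                      sleep=time.sleep):
    """Call send(), retrying only on UpgradeInProgress with growing delays.

    Any other exception (for example a dropped connection) is not caught
    here and reaches the caller unchanged.
    """
    delay = first_delay
    for attempt in range(retries + 1):
        try:
            return send()
        except UpgradeInProgress:
            if attempt == retries:
                raise  # out of retries; surface the error to the caller
            sleep(delay)
            delay *= factor  # exponential backoff between attempts
```

With the defaults above the waits are 0.25s, 0.5s, 1s and so on, which comfortably covers a window of about a second without hammering the API.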
dimiterndooferlad, standup?09:01
voidspacedimitern: thanks to dooferlad it now works!09:31
dimiternvoidspace, sweet!09:32
dimiternvoidspace, omw to our 1:109:32
voidspacedimitern: grabbing coffee first09:33
dimiternvoidspace, sure09:33
voidspacedimitern: omw09:40
dimiternperrito666, o/10:06
=== jam1 is now known as jam
=== wesleyma` is now known as wesleymason
wallyworldsinzui: hey, i did a python-jujuclient change to fix the issue of deployer complaining that juju is upgrading. but i need to talk to a maintainer to get that merged (it's approved), and then we need to figure out how to unblock landings13:13
wallyworldsinzui: ^^^^ - i can get the fix landed but am unsure what to do next to unblock things. can we add a 1 sec delay to the CI test until python-jujuclient gets rolled out?13:31
sinzuiwallyworld: We automatically test that package.13:32
sinzuiwallyworld: all slaves use the juju ppa to get its packages. That is how we caused the quickstart regression last week13:33
wallyworldsinzui: so as soon as a python-jujuclient fix lands in source, CI will grab that copy?13:33
sinzuiwallyworld: no, CI gets the built packages13:34
wallyworldhow long from branch getting merged till CI using the changes?13:34
sinzuiwallyworld: about 1 hour after the package is built by Lp13:35
wallyworldok, great, i'm just trying to see how long we may still be blocked for13:35
=== psivaa-lunch is now known as psivaa
katcoericsnow: standup14:04
sinzuiwallyworld: katco: Do either of you have a minute to review http://reviews.vapour.ws/r/1829/14:15
katcosinzui: in a meeting, sec14:15
wallyworldsinzui: +114:16
sinzuithank you wallyworld14:16
=== redelmann is now known as rudi_gfm
dimiternwallyworld, hey there14:42
dimiternwallyworld, if you can find some time, please have a look at http://reviews.vapour.ws/r/1830/ - instancepoller using the api14:42
ericsnowwallyworld: any help I can give on #1460171?14:42
mupBug #1460171: Deployer fails because juju thinks it is upgrading <blocker> <ci> <deployer> <regression> <upgrade-juju> <juju-core:In Progress by wallyworld> <python-jujuclient:In Progress by wallyworld> <https://launchpad.net/bugs/1460171>14:42
dimiterndooferlad, voidspace, ^^14:42
voidspacedimitern: looking14:43
dimiternvoidspace, thanks!14:43
wallyworldericsnow: waiting for patch to land in python-jujuclient - no core changes14:44
wallyworldshould be soon i hope14:44
ericsnowwallyworld: cool14:44
wallyworldthanks for asking14:44
ericsnowwallyworld: :)14:44
wallyworlddimitern: sorry, was talking to someone else14:44
dimiternwallyworld, no worries14:45
voidspacedimitern: why does facade version start at 1 whilst others start at 014:45
dimiternvoidspace, new facades should start at 114:45
dimitern(there was some decision about this some time ago)14:46
perrito666voidspace: 0 is for facades previous to versioning iirc14:46
voidspacecool, thanks14:46
dimiternwallyworld, I'd appreciate if you can confirm the instancepoller should start once per apiserver (rather than per environment)14:46
dimiternfwereade, ^^14:47
wallyworlddimitern: so long as it knows how to deal with mult envs14:47
wallyworldmachines are per env after all14:47
fwereadedimitern, wallyworld: yeah, it sounds like a per-env thing to me14:47
wallyworldwe could have just the one, but polling intervals get tricky14:48
fwereadedimitern, wallyworld: and including multi-env logic in the instancepoller, rather than just running N of them, would seem suboptimal14:48
dimiternfwereade, wallyworld, but each running instance should only work for a given env?14:49
* dimitern wonders if requiring JobManagerEnviron will make this "just work", like for other "singleton" workers14:49
wallyworlddimitern: almost 1am here, my brain is dead sorry, i need sleep14:49
dimiternwallyworld, get some sleep then! :)14:50
wallyworldcan talk more tomorrow unless fwereade sorts it out14:50
dimiternsure, no problem14:50
wallyworldsee ya later14:50
fwereadedimitern, yes, each instance is part of one and only one env14:51
dimiternfwereade, so I guess starting one per env should work, as login will take care of which envs to use and subsequently what will the watchers report14:51
fwereadedimitern, I think you should just be starting the instancepoller alongside the firewaller and provisioner for each environment14:52
dimiternfwereade, right14:53
dimiternfwereade, so I'll change that, but the rest should be fine14:53
dimiternfwereade, thanks!14:53
fwereadedimitern, hey, has instancepoller just always been running non-singular?14:55
fwereadedimitern, I'm pretty sure we don't want one per state server per env14:56
fwereadedimitern, ...in fact14:56
fwereadedimitern, instance address-setting txns have been among the ones we've seen clogging up stuck environments, right?14:56
dimiternfwereade, so far it was started in the StateWorker() method of the MA14:57
fwereadedimitern, and the problems with mgo/txn absolutely centre around separate flushers racing to write the same doc14:57
dimiternfwereade, which means once per state server14:57
fwereadedimitern, it's also in startEnvWorkers14:58
fwereadedimitern, ...or only there14:58
dimiternfwereade, now it's only in startEnvWorkers (running tests still)14:58
fwereadeah ok14:59
fwereadedimitern, but I *do* see it non-singular in startEnvWorkers14:59
dimiternfwereade, where?15:00
fwereadedimitern, and as a worker that's yammering at the provider api we definitely want it to be singular, I think, not to mention my FUD about it causing the sort of workload that stresses mgo/txn15:00
fwereadedimitern, :1116 in master15:00
fwereaderunner.StartWorker("instancepoller", func() (worker.Worker, error) {15:01
fwereadereturn instancepoller.NewWorker(st), nil15:01
dimiternfwereade, right!15:01
fwereadedimitern, so s/runner/singularRunner/ and we get a little bit better in a couple of good ways too15:02
fwereadedimitern, (on top of passing in the api instead of the state :))15:03
dimiternfwereade, in a call, will get back to you15:04
cheryljsinzui: Should I backport bug 1442308 to 1.23?15:13
mupBug #1442308: Juju cannot create vivid containers <ci> <cloud-installer> <local-provider> <lxc> <ubuntu-engineering> <vivid> <cloud-installer:Confirmed> <juju-core:In Progress by cherylj> <juju-core 1.24:Fix Committed by cherylj> <https://launchpad.net/bugs/1442308>15:13
sinzuicherylj: no, I don’t think we will make a 1.23.4 release since we will propose 1.24.0 on Thursday15:14
cheryljok, thanks!15:14
sinzuicherylj: I will add a task to the bug as WONT FIX to be clear that we choose not to15:15
cheryljsinzui: awesome, thank you15:15
voidspacerebooting *sigh*15:20
natefinchabentley: you around?15:29
abentleynatefinch: Yes, but I have standup now.  I'll ping you when done.15:30
natefinchabentley: thx15:30
voidspacedimitern: ping15:54
voidspacedimitern: if you're still around15:54
voidspacedimitern: I'm still doing your review by the way...15:54
voidspaceit's big15:54
voidspace(the patch I mean)15:54
voidspacebut also trying to bootstrap juju with MAAS15:54
voidspaceand failing - hard to tell if current failure is a MAAS problem or a juju problem, or something else15:54
voidspacelast problem was HP proprietary drivers causing deploy to fail15:55
voidspacecurrent problem is this:15:55
dimiternvoidspace, yeah, I'm here15:55
dimiternvoidspace, sorry about the size - it's mostly tests though :)15:55
voidspacedimitern: http://pastebin.ubuntu.com/11499441/15:55
voidspacedimitern: heh, indeed15:55
dimiternvoidspace, looking15:55
voidspacedimitern: so juju fails to contact MAAS (connection refused)15:55
voidspacefetching that URL in the browser works15:56
voidspaceand there's nothing useful in the MAAS logs15:56
voidspacethe MAAS node is deployed15:57
voidspacedimitern: I updated MAAS version and am running juju latest master15:58
dimiternvoidspace, why localhost?15:58
voidspacedimitern: because MAAS is running locally15:58
dimiternvoidspace, on port 80?15:58
voidspacehmmm... apparently15:58
voidspacethat's working fine15:59
dimiternvoidspace, try bootstrapping with --debug to get more context15:59
voidspacedimitern: ok, will do15:59
voidspacedimitern: it takes about ten minutes or so because these proliants are *slow* to boot16:00
voidspacedimitern: the intelligent bios thing takes several minutes to do its thing16:00
voidspaceI might try and disable it16:00
voidspacebut it can run in the background whilst I continue the review16:01
dimiternvoidspace, is MAAS itself configured with http://localhost/MAAS/ ?16:03
dimiternvoidspace, dpkg-reconfigure maas (IIRC)16:03
voidspacedimitern: I'll check16:04
voidspacewhen I went to instead of localhost I had to login again16:04
voidspaceso there may be a difference16:04
voidspaceI'll wait until this bootstrap completes16:04
dimiternvoidspace, ok16:04
dimiternvoidspace, I'm pretty sure the MAAS URL has to match exactly - both in maas config and in juju's16:05
voidspacedimitern: yep, good call16:06
=== rudi_gfm is now known as rudi
=== rudi is now known as redelmann
abentleynatefinch: I'm free now.16:08
voidspacedimitern: I think it needs a visible url and not a local url16:12
voidspacedimitern: trying with the machine IP address16:12
dimiternvoidspace, that sounds good16:12
voidspacedimitern: i.e. a node can't use to reach the MAAS API16:12
voidspacetaking a break16:13
dimiternvoidspace, I have a similar setup locally, but I use a 192.168.50.X - .2 for maas, the rest for the nodes16:13
dimiternvoidspace, ok, I'll need to go, but might be back later16:14
voidspacedimitern: thanks, see you later16:14
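[Editor's sketch] The conclusion above, that deployed nodes cannot reach a MAAS API advertised on a loopback address, can be checked before bootstrapping with a small helper like the one below. `is_loopback_url` is a hypothetical name, not part of juju or MAAS, and the check is a heuristic: it recognises `localhost` and literal loopback IPs but does not resolve other DNS names.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_url(url):
    """Return True if the URL's host is localhost or a loopback IP literal."""
    host = urlparse(url).hostname  # lower-cased host, or None if absent
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (an ordinary DNS name); this sketch does not
        # resolve it, so it is reported as non-loopback.
        return False
```

For example, `is_loopback_url("http://localhost/MAAS/")` is true, while an address on the node network such as `http://192.168.50.2/MAAS/` is not.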
natefinchabentley: I was going to do something like this to add the actions feature flag to the CI tests... is this acceptable? http://pastebin.ubuntu.com/11499809/16:15
abentleynatefinch: That won't work because EnvJujuClient24 is only used for juju 1.24.  I meant that you should add an EnvJujuClient22 that was used for juju 1.22, that supplied the 'actions' feature flag.16:18
abentleynatefinch: A heads-up: jog is landing support for -e with "action do" and "action fetch" today.16:20
abentleynatefinch: In this branch: https://code.launchpad.net/~jog/juju-ci-tools/start_chaos16:21
voidspacedooferlad: hah, and four days later I have a working juju bootstrapped to MAAS on an HP proliant16:23
voidspacedooferlad: the PDU seems to be working fine now too, both for switching machines on and off16:23
voidspacedooferlad: http://pastebin.ubuntu.com/11500002/16:24
natefinchabentley: I'16:26
natefinchabentley: I'm not really prepared to spend very much more time on this CI test.  It's already taken 3-4 times as long as I had anticipated & scheduled16:27
natefinchcc katco ^^16:27
natefinchabentley: but if I can just remove my action code and merge with what jog lands, that's fine with me, though it would make for a lot of wasted work on my part.  It's unfortunate both of us were working on the same functionality.16:28
natefinchabentley: or maybe I misunderstood what you were talking about.. do you mean he was landing code in the tests or juju-core16:29
jognatefinch, sorry I was working on another project and just discovered our juju-ci-tools lib needed to handle actions differently on Friday.16:30
abentleynatefinch: He's just done an alternative implementation of the _full_args change, none of the rest.16:31
natefinchabentley: oh ok, that's good.  I'm glad we didn't overlap much16:31
natefinchabentley: do I have to do more in the EnvJujuClient22 than implement the _shell_environ, and add a new elif in EnvJujuClient.by_version?  Something like this? http://pastebin.ubuntu.com/11500166/16:35
abentleynatefinch: That's all you need to do for that.16:36
natefinchabentley: thanks16:36
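[Editor's sketch] The shape abentley describes, a version-specific client subclass that overrides `_shell_environ` plus an extra branch in `by_version`, might look like this. It is a simplified stand-in, not the juju-ci-tools code: the class bodies, the `by_version` signature and the `JUJU_DEV_FEATURE_FLAGS` variable name are assumptions made for illustration, and the pastebin contents are not reproduced here.

```python
import os

# Assumed name of the environment variable that enables feature flags.
FEATURE_FLAGS_VAR = "JUJU_DEV_FEATURE_FLAGS"


class EnvJujuClient:
    """Simplified stand-in for the generic client."""

    def __init__(self, version):
        self.version = version

    def _shell_environ(self):
        # Environment passed to the juju subprocess.
        return dict(os.environ)

    @classmethod
    def by_version(cls, version):
        # Pick a version-specific subclass where one exists.
        if version.startswith("1.22"):
            return EnvJujuClient22(version)
        return cls(version)


class EnvJujuClient22(EnvJujuClient):
    """Client for juju 1.22, which needs the 'actions' flag switched on."""

    def _shell_environ(self):
        env = super()._shell_environ()
        env[FEATURE_FLAGS_VAR] = "actions"
        return env
```

Keeping the flag in a subclass means clients for other versions are untouched, which is the point abentley makes about not putting it in `EnvJujuClient24`.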
katconatefinch: abentley: hey... so these CI tests are being wrapped up then?16:58
natefinchkatco: yeah16:59
katcoyay :D17:03
natefinchwhy the heck do I have to log into ubuntu to "download as text" from pastebin.ubuntu.com?17:05
perrito666you can always report it as a bug17:08
=== natefinch is now known as natefinch_afk
katcowwitzel3: ping17:57
wwitzel3katco: pong17:58
katcowwitzel3: hey on the rich status spec? who do you think from ecosystems/accounting would be good to ping?17:59
katcowwitzel3: it has to do with charm metadata, so charmers for sure. and i would think someone from accounts would want to give input on what information they'd like when doing installations18:00
wwitzel3katco: not 100% sure, so I'd ping arosales and ask him for some candidates that might have a strong interest/opinion18:00
katcowwitzel3: ty. arosales, any volunteers? https://docs.google.com/document/d/1JcWkE4SNxXuFClZGBcwnU3w13IpRU1yxMhddQG6mKyE/edit#18:01
arosaleskatco, /me looking . .18:03
katcoarosales: ty sir18:04
arosaleskatco, I'll bring it up on our daily and send a mail out on it too18:04
katcoarosales: ty... please let me know who you'd like to delegate so i can add them to the reviewers list18:04
arosaleskatco, will do18:06
arosaleskatco, thanks for looking for the feedback18:06
katcoarosales: ty again!18:06
arosaleskatco, np. I should have some more information this afternoon.18:07
katcoarosales: i'm also pulling marcoceppi into https://docs.google.com/document/d/1LORhaYvk_A8yMHkAb9FR_cN9V0S55zEx-T6QXdmr3fU/edit#18:08
katcoarosales: he expressed interest in nuremberg18:08
arosaleskatco, ah yes is a good one for min version18:08
katcoarosales: juju min. version is the one we'll be focusing on next18:10
abentleynatefinch_afk: jog's stuff has landed now.18:27
natefinch_afkabentley: thanks18:51
natefinch_afkabentley, sinzui:  I get this error on several of the tests, despite having run make install-deps18:56
natefinch_afkOSError: /usr/lib/python2.7/dist-packages/lookup3.so: cannot open shared object file: No such file or directory18:56
sinzuiI wonder what that is18:57
sinzuinatefinch_afk: It appears to relate to jenkins and I see several reports of it failing18:59
=== natefinch_afk is now known as natefinch
natefinchsinzui: yeah, just found some interesting things... I found it in /usr/local/lib/python2.7/dist-packages/18:59
sinzuinatefinch_afk: my apt-cache policy python-jenkins says I have 0.2.1-0ubuntu119:00
natefinchInstalled: 0.2.1-0.119:00
sinzuinatefinch: how did you get that version? pip? easy_install?19:01
* sinzui thinks we need the ubuntu version19:01
natefinchsinzui: quite possibly19:01
natefinchsinzui: I didn't know about make install-deps when I started, so I was just installing stuff however I could find it19:02
sinzuinatefinch: understood. I have to do the same on the win and OS X machines. The issue I am reading implies the jenkins lib does work on OS X, but it is working well enough for our tests19:03
natefinchI'm on ubuntu... just ran pip install (I think?)  because I didn't know how else to get it19:04
natefinchand..... now pip is dumping a giant stack trace when I do pip uninstall jenkins.  Nice.19:04
abentleynatefinch: If you ran make install-deps, you should have python-jenkins installed via apt.19:05
sinzuinatefinch: you can run pip uninstall jenkins?19:05
* sinzui isn’t sure of the pip package name19:05
natefinchsinzui: I can try and have it fail19:05
natefinchsinzui: it seemed to recognize the name19:05
natefinchabentley: yeah, apt seemed to think I had it installed via apt19:05
sinzuiabentley: surely pip is installing in a path that takes precedence.19:05
natefinchI removed and reinstalled the apt version, it still gives me  0.2.1-0.119:06
abentleyI do not have lookup3 installed, and I don't seem to need it.19:06
abentleyI have python-jenkins 0.2.1-0.1 installed.19:08
natefinchfull stack trace from running tests (there are a handful of these): http://pastebin.ubuntu.com/11502857/19:09
abentleynatefinch: Can you delete /usr/local/lib/python2.7/dist-packages/jenkins.py or at least move it aside so that the correct jenkins lib gets loaded?19:11
natefinchabentley: sure19:12
natefinchFYI, I don't have  /usr/lib/python2.7/dist-packages/jenkins.py19:15
natefinch(if I'm supposed to)19:15
natefinchIt looks like all my jenkins stuff got installed to /usr/local/lib/python2.7/dist-packages/  instead of /usr/lib/python2.7/dist-packages/19:17
natefinchthat sounds like "you installed something with or without sudo when you should have done it the other way"   but I have no idea what, being both a linux and python n00b19:17
abentleynatefinch: No, you shouldn't have that, you should have /usr/lib/python2.7/dist-packages/jenkins/__init__.py19:24
natefinchabentley: ahh, ok, yes, I have that19:25
natefinchI guess get_python_lib()  must be returning the wrong thing19:27
abentleynatefinch: There are at least two incompatible packages providing 'jenkins': https://pypi.python.org/pypi/jenkins https://pypi.python.org/pypi/python-jenkins and the one installed in /usr/local/lib is the wrong one.19:40
natefinchabentley: how am I supposed to install it?19:42
abentleynatefinch: The right one is already installed.  You just have to get rid of the wrong one.19:43
natefinchabentley: ahh, ok, I figured it out. pip uninstall, instead of saying "Hey, this needs to be run with sudo" instead dumped a giant ugly stack trace.19:45
natefinchwhich I incorrectly interpreted as "jenkins wasn't installed with pip"19:46
natefinchthat fixed it19:46
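[Editor's sketch] A quick way to confirm which file a module name such as `jenkins` resolves to, and so spot a pip-installed copy under /usr/local shadowing the apt package, is shown below. `module_origin` and `shadowed_by_local` are hypothetical helper names. The sketch uses `importlib.util.find_spec`, which needs Python 3.4 or later, whereas the log concerns Python 2.7; there, importing the module and printing its `__file__` gives the same information.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def shadowed_by_local(name, local_prefix="/usr/local/"):
    """True if the module resolves to a path under local_prefix."""
    origin = module_origin(name)
    return origin is not None and origin.startswith(local_prefix)
```

In the situation above, `module_origin("jenkins")` pointing at /usr/local/lib/python2.7/dist-packages/jenkins.py rather than the jenkins/__init__.py package under /usr/lib would have identified the wrong package straight away.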
rogpeppethumper: hiya19:57
natefinchis there a bzr plugin that'll let me run an external merge tool to fix conflicts?  I found bzr-extmerge, but it appears to be ancient (tries to run with python 2.4)20:09
natefinchthumper, sinzui, abentley: ^^20:10
abentleynatefinch: No, extmerge is the only one I'm aware of.  But bzr dumps THIS, BASE and OTHER files that you can use an arbitrary tool with.20:13
* natefinch closes his eyes and runs sudo python ./setup.py20:16
natefincher setup.py install20:17
=== brandon is now known as web
=== brandon is now known as Guest13004
sinzuiwallyworld: do you think the maas 1.7 test would pass if we added a 30s delay between bootstrap and deployer?21:46
wallyworldsinzui: yes21:46
wallyworldsinzui: not even 30s, more like 1 second21:46
wallyworldor 221:46
sinzuilet me try to solve the issue.21:46
sinzuiwallyworld: I will start with 5 seconds21:46
wallyworldok :-)21:47
marcoceppikatco: you still around?21:57
sinzuiwallyworld: I am adding a call to status between bootstrap and deployer. Do you think that is enough time? Do you have a branch ready to merge to test my change. I don’t want to start a test of an old revision if you have work queued?22:01
wallyworldsinzui: everything you need to test should be in tip of 1.2422:06
wallyworldsinzui: the python-jujuclient work simply retries during the second or so you will be delaying22:06
wallyworldwhich would make the delay unnecessary22:06
sinzuiwallyworld: I am pushing a change to all the slaves. I will retest 1.24 tip when I see the changes arrive22:08
wallyworldsinzui: tyvm, i will wait with bated breath22:08
mupBug #1460184 changed: Bootstrapping fails with Maas on Ubuntu Vivid <maas-provider> <vivid> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1460184>22:09
wallyworldericsnow: any chance of a trivial time display fix review? the code change is one line, the test changes are a search and replace http://reviews.vapour.ws/r/1823/22:14
ericsnowwallyworld: sure22:15
ericsnownice: "You Require More Vespene Gas" (in a test)22:16
ericsnowwallyworld: ship-it!22:17
wallyworldericsnow: ty22:17
ericsnowwallyworld: any time22:17
katcomarcoceppi: am now, what's up?22:24
=== anthonyf is now known as Guest41448
=== Guest13004 is now known as web
=== anthonyf is now known as Guest28879
wallyworldwaigani_: heya, you working on bug 1376246 ?23:45
mupBug #1376246: MAAS provider doesn't know about "Failed deployment" instance status <landscape> <maas-provider> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1376246>23:46
davechen1ygreat, build is blocked, again23:48
waigani_wallyworld: no, I should be able to start on that today though.23:49
wallyworldwaigani_: great, because we want 1.24 work done so we can look to do a release overnight23:49
waigani_wallyworld: okay, let me get a bite to eat and I'll get into it23:50
axwwallyworld: sorry I missed standup, been on the phone with iinet for 40 minutes trying to get my account unlocked :/23:52
wallyworldaxw: gawd, i hate isps. all fixed?23:52
axwwallyworld: yeah, silly error while setting up my new modem. OTOH, seems I got swapped to the new port and now I'm syncing at 16Mb as opposed to 4Mb I was getting for the last few months23:53
wallyworldoh good :-)23:53
wallyworldaxw: you free now for a chat?23:54
axwsure, just a quick one tho23:54
axwsee you in standup23:54
