/srv/irclogs.ubuntu.com/2015/11/16/#juju-dev.txt

axwthumper: http://reviews.vapour.ws/r/3149/ -- FYI01:19
axwthumper: I'd be keen to know if you still get errors with this applied, since I had to really work hard to get the peergrouper to reliably fail01:20
menn0thumper: and another easy one: http://reviews.vapour.ws/r/3150/01:26
thumperaxw: I'll grab it down and try01:26
thumperaxw: yay, that looks like it fixed it, for me at least01:29
axwthumper: sweet01:29
axwmenn0 thumper: can I retarget https://bugs.launchpad.net/juju-core/+bug/1516144 to a different series? it's blocking master but it's not on the master branch...01:52
mupBug #1516144: Cannot deploy charms in jes envs <blocker> <charms> <ci> <regression> <juju-core:Fix Committed by menno.smits> <https://launchpad.net/bugs/1516144>01:52
thumperaxw: yeah, it is01:52
axwthumper: oh. job says functional-jes01:53
thumperaxw: menn0 is currently looking at it01:53
axwok then01:53
thumperyes it is the master branch, but the JES test01:53
axwrighto01:53
axwsorry01:53
thumpermenn0 has fixed one of the problems, but CI failed again with an IP address error01:53
thumperso... weird01:54
menn0thumper: if I repeat what the CI test does with the local provider it all works02:18
menn0thumper: trying with joyent now02:18
=== cmars` is now known as cmars
thumpermenn0: any joy with joyent?02:58
menn0thumper: everything works fine when I do it manually03:06
menn0with local provider and joyent03:06
thumperFFS03:06
menn0i'm going to look over the logs from the test failure again in more detail03:06
thumperI'd reply to the curse email, and make sure it is addressed to Curtis and Aaron, cc juju-dev03:07
thumperlet them know03:07
menn0the test is seeing the dummy-source/0 unit in the first hosted environment is in an error state03:08
menn0no idea why or how03:08
thumperhmm03:08
menn0thumper: the test is passing a config.yaml to create-environment03:09
menn0I wonder if that is broken somehow03:09
menn0I kinda guessed with that03:09
thumperhmm..03:09
thumpermenn0: there is a cursed email to reply to now03:35
menn0thumper: thanks, I will03:36
menn0thumper: I'm using the actual CI test script now03:36
mupBug #1331151 changed: 'juju destroy-environment' sometimes errors <destroy-environment> <juju-core:Expired> <https://launchpad.net/bugs/1331151>04:40
davecheneyaxw: ping ?04:49
axwdavecheney: pong04:49
davecheneyaxw: you mentioned you had some fixes for the peergrouper ?04:50
davecheneydid they land ?04:50
davecheney?04:50
axwdavecheney: no, master is blocked04:50
axwdavecheney: fixes here: http://reviews.vapour.ws/r/3149/04:51
davecheneyaxw: crap04:57
mgzmenn0: is there some trick to getting hosted env... oh, oh dear05:01
mgz# TODO(gz): May want to gather logs from hosted env here.05:01
davecheneythumper: https://bugs.launchpad.net/juju-core/+bug/151649805:09
mupBug #1516498: api/unitassigner: data race <juju-core:New> <https://launchpad.net/bugs/1516498>05:09
mupBug #1516498 opened: api/unitassigner: data race <juju-core:New> <https://launchpad.net/bugs/1516498>05:16
=== mup_ is now known as mup
jamdimitern: ping08:20
dimiternjam, hey, sorry I missed our 1:1 :/08:25
jamdimitern: np, maybe we can chat after the standup if you have anything you want to go over08:43
dimiternjam, sure, ok08:48
=== mthaddon` is now known as mthaddon
rogpeppethis PR updates juju-core to the latest charm package version: http://reviews.vapour.ws/r/3152/09:14
rogpeppereviews appreciated :)09:14
mupBug #1516541 opened: payload/api/private: tests do not pass <juju-core:New> <https://launchpad.net/bugs/1516541>09:38
mupBug #1516541 changed: payload/api/private: tests do not pass <juju-core:New> <https://launchpad.net/bugs/1516541>09:41
mupBug #1516541 opened: payload/api/private: tests do not pass <juju-core:New> <https://launchpad.net/bugs/1516541>09:44
dimiternjam, frobware, standup?10:01
dimiternfrobware, dooferlad, voidspace, please take a look when you have a moment - http://reviews.vapour.ws/r/3153/ - almost straight cherry pick from the 1.25 fix for bug 148387910:46
mupBug #1483879: MAAS provider: terminate-machine --force or destroy-environment don't DHCP release container IPs <bug-squad> <destroy-machine> <landscape> <maas-provider> <sts> <juju-core:Triaged> <juju-core 1.24:Won't Fix> <juju-core 1.25:In Progress by dimitern> <https://launchpad.net/bugs/1483879>10:46
frobwarevoidspace, ok to start?11:03
voidspacefrobware: yep, omw11:03
dimiternfrankban, hey, do you have any idea when the guibundles branch will land on master?11:10
frankbandimitern: it is already landed on master11:18
frankbandimitern: because it was merged on the chicago-cubs one11:18
dimiternfrankban, awesome! so juju deploy bundle.yaml is usable?11:18
frankbandimitern: yes11:18
dimiternfrankban, nice! I'll give it a try now :) it's a pity it's not mentioned in juju deploy help11:19
frankbandimitern: it should be mentioned actually11:19
dimiternfrankban, oh, sorry - I missed it - it's there11:20
frankbandimitern: cool11:20
dimiternsweet! juju deploy bundle.yaml works just fine with spaces constraints11:26
frankbandimitern: \o/11:27
* frankban lunches11:27
dimiternfrobware, dooferlad, voidspace, I have another review for you to look at when you can - http://reviews.vapour.ws/r/3155/ - fixes spaces-based deployments on ec2 and brings feature parity between master and 1.2511:39
voidspacedimitern: is that a straightforward port?11:40
dimiternvoidspace, yes, no changes needed11:41
voidspacedimitern: you don't need a review then if it's already been reviewed11:41
voidspacedimitern: but LGTM :-)11:41
dimiternvoidspace, it still needs a review :) thanks!11:41
voidspacedimitern: I don't think we're re-reviewing stuff that is a straight port between branches11:42
voidspacedimitern: at least I and other people haven't been :-)11:43
voidspacedimitern: and it doesn't seem like a good use of time11:43
dimiternvoidspace, it still needs a ship it stamp11:43
voidspacedimitern: that isn't what we've been doing11:43
dimiternvoidspace, isn't it?11:43
voidspacedimitern: no11:43
frobwaredooferlad, would dimitern's change have any impact to your CI tests? ^^11:43
voidspacedimitern: and people shouldn't "Ship It" *without* reviewing it11:44
dimiternvoidspace, I agree11:44
voidspacedimitern: and re-reviewing it is a waste of everyone's time11:44
dimiternvoidspace, not if you reviewed it the first time I guess11:44
voidspacedimitern: heh, well possibly11:45
frobwaredimitern, voidspace: cherry-pick backports that have already been reviewed shouldn't need re-reviewing, IMO.11:45
dimiternfrobware, ok, I don't mind at all to just land it then :)11:45
voidspaceif they need substantive changes then a re-review is reasonable11:45
dooferladfrobware: it shouldn't have any impact on tests.11:46
frobwaredimitern, my only observation is for the CI tests11:46
voidspacedooferlad: how far off getting to the maas test server are you?11:46
voidspacedooferlad: I'm going to need it "soon-ish"11:46
dooferladvoidspace: I just ran into a KVM not appearing for one of my tests, which may be due to my MAAS or may be just flake.11:47
dooferladvoidspace: once I have that sorted I will have got the review answered I was looking at and can get on with the test server11:47
dooferladvoidspace: so, this afternoon.11:48
voidspacedooferlad: ok11:48
dimiternfrobware, voidspace, thanks for the Ship It! anyway guys :)11:49
dimiternfrobware, dooferlad, voidspace, yet another for you to review - a really small one this time - http://reviews.vapour.ws/r/3156/ fixes bug 149942612:36
mupBug #1499426: deploying a service to a space which has no subnets causes the agent to panic <network> <juju-core:In Progress by dimitern> <juju-core 1.25:In Progress by dimitern> <https://launchpad.net/bugs/1499426>12:36
dimiternfrobware, thanks for the review - I've replied and updated the PR13:36
=== Spads_ is now known as Spads
mattywfwereade, ping?13:46
fwereademattyw, pong13:46
frobwaredimitern, are you waiting for a review on http://reviews.vapour.ws/r/3153/  I ask because I saw that it was being merged.14:43
alexisbthank you wwitzel3 and katco !14:50
katcoalexisb: yep, we'll get it figured out14:50
wwitzel3alexisb: np14:51
dimiternfrobware, nope that one is for master and it's still blocked14:58
dimiternfrobware, and since we're not reviewing forward ports, I'll just merge it when possible, if that's ok14:59
katconatefinch: standup15:03
katcofrobware: hey, how close are you to getting a fix for bug 1512371 for 1.25?15:05
mupBug #1512371: Using MAAS 1.9 as provider using DHCP  NIC will prevent juju bootstrap <bug-squad> <maas-provider> <network> <juju-core:In Progress> <juju-core 1.25:In Progress by frobware> <https://launchpad.net/bugs/1512371>15:05
* dimitern steps out to the store; bbl15:06
frobwarekatco, probably tomorrow15:11
katcofrobware: kk15:11
frobwarekatco, actively working on it now15:11
frobwarekatco, blocking you?15:11
katcofrobware: cool, just trying to figure out how much wiggle room we have on another bug :)15:11
katcofrobware: nope not blocked.15:12
frobwarekatco, in terms of a making a 1.25.x release?15:12
katcofrobware: yeah15:12
katcofrobware: e.g. is everyone waiting on us15:12
katcorather, i.e.15:12
=== akhavr1 is now known as akhavr
frobwarecherylj, you mentioned you had the replica set problem again - still holding true?15:14
cheryljfrobware: that maas set up was hosed.  I ended up tearing it down and rebuilding it.  Haven't seen the problem since.15:15
cheryljfrobware: I can't say for sure there wasn't something else going on15:15
katcocherylj: hey, can you read my comment at the bottom of bug 1382556 and give guidance?15:19
mupBug #1382556:  "cannot allocate memory" when running "juju run" <cpe-critsit> <run> <juju-core:In Progress by ericsnowcurrently> <juju-core 1.25:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1382556>15:19
cheryljkatco: sure, taking a look....15:20
katcocherylj: ty15:20
katcocherylj: this is one of the last blockers for 1.25.115:20
cheryljkatco: yeah.  Are you guys in your stand up?  Could I come chat with you guys if you are?15:20
katcocherylj: of course: https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=115:20
lazypowerwwitzel3 katco - ping15:21
wwitzel3lazypower: pong15:21
lazypowerI'm riffing with mbruzek in a hangout, and it appears juju list-payloads isn't available on 1.26-alpha1, is this known/expected behavior?15:21
katcolazypower: pong15:21
katcolazypower: it is not yet in master15:22
lazypoweri'm confused as to how its in 1.25 and not 1.26 :P15:22
mbruzekHow did it get into 1.25 if it is not in master?15:22
mbruzekis it hidden by feature flag?15:22
mbruzekYou gave us a feature then took it away!15:22
lazypower^ yeah, wat15:23
katcolazypower: mbruzek: sorry in meeting. we started the feature based on 1.25, 1.26 was blocked by lack of a >= Go 1.3 process15:24
lazypowerhmm, hokay15:24
katcolazypower: it's on the radar. we'll get it landed asap15:24
mbruzekOK, sorry to interrupt meeting.15:24
lazypowerThanks for the follow up o/15:24
katco(we're also on bug squad this iteration)15:24
mupBug #1516668 opened: Switch juju-run to an API model (like actions) rather than SSH. <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1516668>15:36
mupBug #1516669 opened: Memory/goroutine leaks. <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1516669>15:36
mgzI replied to menn0's mail about the blocker again, can natefinch or someone take a look?15:54
mupBug #1516676 opened: Use of os/exec in juju is problematic in resource limited environments. <tech-debt> <juju-core:New> <https://launchpad.net/bugs/1516676>16:06
natefinchmgz: reading16:17
natefinchkatco: seems like the jes CI tests are still blocked by code introduced by the unitassigner.  Should I work on that or the juju run bug?  (I presume the blocker, but wanted to confirm)16:32
katconatefinch: hm16:33
katconatefinch: my inclination is to say the juju run bug since it's blocking the impending 1.25.1 release16:33
katconatefinch: we still have some runway on master16:34
katconatefinch: plus it looks like menno did a fix-committed?16:34
natefinchkatco:  menno responded to the CI test failure with some comments, thread title is "Cursed (final): #3310 gitbranch:master:github.com/juju/juju 0bf7c382 (functional-jes)"16:35
natefinchkatco: basically... he thought it should have been fixed, but the CI test was still having problems16:36
katconatefinch: i see. well, i think you should still focus on the 1.25.1 blocker16:37
katconatefinch: that's coming out first16:37
natefinchkatco: yep, that's fine.  That's why I asked :)16:37
katconatefinch: yep, ty16:37
mbruzekaxw: ping?16:42
voidspacefrobware: change to picking address algorithm landed on 1.2516:44
voidspacefrobware: change discussed in standup fixed that failing test16:44
voidspacefrobware: porting to master now16:45
voidspacefrobware: also I think that the new Subnets implementation is done - but needs tests, which means I need a test harness16:45
voidspacefrobware: I can switch to ListSpaces whilst I wait for that16:45
mupBug #1516698 opened: Juju never stops trying to contact charm store <juju-core:Triaged> <https://launchpad.net/bugs/1516698>16:45
natefinchone wonders if no one thought about what might happen if this was run on an environment of 5000 machines: https://github.com/juju/juju/blob/master/apiserver/client/run.go#L16416:57
dooferladvoidspace: *sigh*, that CI stuff took ages. I am not going to get far with gomaasapi before I need to stop (now-ish). Will see if I can take a look after dinner.16:58
frobwarevoidspace, all sounds good16:58
natefinchsometimes I think people just randomly decide whether or not to pass around pointers versus values :/17:05
voidspacedooferlad: thanks17:07
voidspacenatefinch: what's your problem with that function using a pointer?17:25
natefinchvoidspace: it shouldn't be modifying the value, and the value is small enough to be copied easily.17:37
natefinchvoidspace: making it a pointer makes me wonder if it's going to be modified somewhere.17:37
voidspacenatefinch: if it's called 5000 times surely using a pointer is *more* efficient17:38
voidspacenatefinch: and if that's not the issue why does it matter if it's called for 5000 machines as you called out17:38
voidspacenatefinch: or is that a separate issue?17:38
cheryljkatco, does the lxd provider use the container/lxc code to still do container provisioning?17:39
natefinchvoidspace: separate issue... the problem is spawning 5000 goroutines that all do stuff at the same time17:39
voidspacenatefinch: right, instead of queuing17:39
voidspaceyeah, that would be much better...17:39
katcocherylj: i don't think so. container/lxd17:39
natefinchvoidspace: and pointer dereference versus some small amount of memory copying is not always an obvious win17:39
natefinchvoidspace: queueing is what I'm writing right now, since this code is causing OOM issues17:40
voidspacenot always, just usually17:40
voidspacenatefinch: right, cool17:40
natefinchvoidspace: the pointer thing isn't really a problem, just a pet peeve17:40
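The pointer-versus-value point above can be illustrated with a minimal Go sketch. The `params` struct and both functions are hypothetical and are not taken from the juju code being discussed; they only show why a value parameter tells the reader the caller's data cannot be changed, while a pointer parameter leaves that open.

```go
package main

import "fmt"

// params is a hypothetical small struct, cheap to copy.
type params struct {
	Command string
	Timeout int
}

// byValue receives its own copy, so this write never reaches the caller.
func byValue(p params) {
	p.Timeout = 0
}

// byPointer shares the caller's struct, so this write is visible to the caller.
func byPointer(p *params) {
	p.Timeout = 0
}

func main() {
	p := params{Command: "uptime", Timeout: 30}

	byValue(p)
	fmt.Println("after byValue:", p.Timeout)

	byPointer(&p)
	fmt.Println("after byPointer:", p.Timeout)
}
```

Whether copying or dereferencing is cheaper depends on the struct size and how the value escapes, so this sketch is about readability rather than performance.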
voidspaceheh17:40
voidspacenatefinch: thanks for expanding, interesting stuff17:40
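The queueing approach discussed above, as opposed to one goroutine per machine, could look roughly like the following Go sketch. It is not the fix natefinch wrote; `runOnAll`, the `limit` parameter and the `run` callback are assumed names for illustration. A fixed number of workers pull indexes from a channel, so at most `limit` calls are in flight no matter how many machines there are.

```go
package main

import (
	"fmt"
	"sync"
)

// runOnAll calls run once per machine id, with at most limit calls
// in flight at any time. Results are returned in input order.
func runOnAll(machines []string, limit int, run func(string) string) []string {
	if limit < 1 {
		limit = 1 // avoid a deadlock when no workers would be started
	}
	results := make([]string, len(machines))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < limit; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker writes only to its own index, so no lock is needed.
			for i := range jobs {
				results[i] = run(machines[i])
			}
		}()
	}

	// Feed the queue; this blocks while all workers are busy.
	for i := range machines {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	machines := make([]string, 5000)
	for i := range machines {
		machines[i] = fmt.Sprintf("machine-%d", i)
	}
	out := runOnAll(machines, 10, func(id string) string {
		return "ok " + id
	})
	fmt.Println(len(out), out[0], out[len(out)-1])
}
```

With 5000 machines and a limit of 10, only ten goroutines exist at a time, which bounds memory use at the cost of a longer total run.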
natefinchman I love channels and goroutines18:30
natefinchbbiab18:30
=== natefinch is now known as natefinch-afk
katconatefinch-afk: hey did you get that tech-debt card created?19:43
=== natefinch-afk is now known as natefinch
natefinchkatco: oops, nope, will do now19:44
natefinchkatco: done19:54
katconatefinch: ty19:55
thumpersinzui: what's the status with the CI blocker21:06
thumper?21:06
thumpersinzui: menno ran the tests locally yesterday and could not reproduce21:06
thumperboth with local, joyent, and the CI scripts21:06
sinzuimgz: ^ I think you are versed in this topic21:07
mgzthumper: I replied to menno's message21:07
* thumper hasn't got to it yet, still reading21:08
thumpergo there21:08
thumpers/go/got21:08
mgzthumper: short version, somehow with trunk the units in the hosted environment are going *through* error, when the machines are not up, rather than pending, but once the machines are up are fine21:08
thumperwha?21:08
thumperoh...21:08
thumperha21:08
thumperI bet this is the unit assignment worker21:09
thumpertrying to assign them too early21:09
thumperand somehow marking them21:09
mgzthere is some layering thing screwed up21:09
thumpernatefinch ^^?21:09
mgzit thinks they are units for the hosting env... till it gets machines, then it works it out21:09
natefinchthumper: reading backlog21:13
thumpernatefinch: I was just about to look at the unit assignment worker21:13
thumperit seems that it is trying to assign the unit twice21:13
thumperbecause the machine isn't up yet21:13
thumperhttp://juju-ci.vapour.ws/job/functional-jes/276/console21:13
thumpernatefinch: see the status output in there21:13
thumpernatefinch: after two minutes, the status is taken again, and it looks ok21:14
thumperso it obviously settles itself down21:14
thumperbut putting the unit into an error state is confusing the tests (and users)21:14
natefinchthumper: I've seen it error and then settle, but I thought I'd fixed that when I told it only to run the worker on the master state machine21:15
natefinchthumper: already assigned does not sound like the error you'd get if you tried to assign it and there was no machine yet21:15
thumperno, it sounds like it was assigned, and then attempted to assign it again21:16
natefinchright, which would imply some sort of race condition - either multiple people getting notified and trying to assign (like I originally fixed) or maybe two notifications firing off in succession and thus causing two unit assignments to run concurrently.... the latter seems possible21:18
mgznatefinch: the other thing with this is this doesn't appear in a state server log anywhere21:19
mgzdespite being an error that appears in status. this seems very wrong.21:19
thumpermaor logging plz21:20
thumper:)21:20
natefinchhmmm... wonder if I went too crazy in removing my debugging logging21:20
thumperthis might be strange, but does the collection watcher fire when docs are removed?21:22
thumpernatefinch: I have a feeling it might, but just a stab in the dark at the moment21:22
thumperI thought any change to the doc would fire the watcher21:22
thumpernot just insertions21:23
thumperit appears that the assign units collection just has insertions and deletions and no updates21:23
thumperis that right?21:23
natefinchcorret21:24
natefinchcorrect21:24
thumpernatefinch: how about logging the unit ids that are being assigned21:26
thumperI wonder if we'll find a dupe21:26
natefinchthumper: yeah.... I swear I was, but again, maybe I just took out too much logging21:27
mgzI can rerun with logging turned up more if that would maybe make things clearer21:28
natefinchthe worker has some tracef calls that you could turn on, but it definitely looks like I took out too many log statements21:29
natefinchthat'll at least tell you what unit ids the worker is seeing firing from the watcher, and log the results of the unit assignment attempt21:31
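One way to check the "two notifications in succession" theory above is to track which unit ids are already being assigned and report any repeats. The Go sketch below is only an illustration under that assumption; `inFlight`, `claim` and `release` are hypothetical names, not part of the juju unitassigner worker, and the unit id "dummy-sink/0" is made up for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// inFlight remembers which unit ids are currently being assigned.
type inFlight struct {
	mu  sync.Mutex
	ids map[string]bool
}

func newInFlight() *inFlight {
	return &inFlight{ids: make(map[string]bool)}
}

// claim marks the ids in batch as in flight. It returns the ids that were
// not already claimed (fresh) and the ones that were (dupes).
func (f *inFlight) claim(batch []string) (fresh, dupes []string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	for _, id := range batch {
		if f.ids[id] {
			dupes = append(dupes, id)
			continue
		}
		f.ids[id] = true
		fresh = append(fresh, id)
	}
	return fresh, dupes
}

// release forgets ids once their assignment attempt has finished.
func (f *inFlight) release(ids []string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	for _, id := range ids {
		delete(f.ids, id)
	}
}

func main() {
	f := newInFlight()

	// First watcher event: both ids are new.
	fresh, dupes := f.claim([]string{"dummy-source/0", "dummy-sink/0"})
	fmt.Println("assigning:", fresh, "duplicates:", dupes)

	// Second event arrives before the first attempt finished.
	fresh, dupes = f.claim([]string{"dummy-source/0"})
	fmt.Println("assigning:", fresh, "duplicates:", dupes)

	f.release([]string{"dummy-source/0", "dummy-sink/0"})
}
```

Logging the `dupes` slice for each event would show directly whether the same unit id is being handed to the assigner twice.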
mgznatefinch: what do I want... "<main>=DEBUG ?=TRACE"21:32
natefinchjuju.worker.unitassigner21:32
mgzta21:32
mgznatefinch: http://juju-ci.vapour.ws/job/functional-jes/279/console21:52
perrito666anyone knows how to ask peergrouper which of the machines is the leader?21:52
mgzyou'll want the gathered logs when the job completes21:52
natefinchmgz: thanks21:54
natefinchwow, juju status --format=tabular doesn't show containers?21:54
mgz2015-11-16 21:52:19 TRACE juju.worker.unitassigner unitassigner.go:56 Unit assignment results: ["cannot assign unit \"dummy-source/0\" to machine: cannot assign unit \"dummy-source/0\" to new machine or container: cannot assign unit \"dummy-source/0\" to new machine: unit is already assigned to a machine" <nil>]22:00
natefinchlol, this OOM error from juju run is a lot harder to repro when we moved to m3.mediums.22:00
natefinchmgz: not really useful, given that we already knew that.  I'm working on another bug for bug squad right now, that's blocking 1.25, but I'll try to look at that one once I get this one finished up22:02
natefinchmgz: I'll take a look at the logs from the run later tonight and see if anything obvious pops up22:02
natefinchgotta run and make dinner for the family22:02
=== natefinch is now known as natefinch-afk
perrito666dinner, honestly? at 6PM...22:06
thumperI'm sorry, but seriously?22:44
thumpera critical blocker stopping the entire team has less priority?22:44
=== Makyo is now known as Guest20767
axwwallyworld: I'm rebasing the azure-arm-provider branch because it's missing fixes from master23:14
axwwallyworld: assuming no need to review23:14
wallyworldaxw: once, just finishing meeting23:15
wallyworldaxw: sorry, done now, in standup23:16
axwoops, is that the time23:16
thumperaxw: I have never suggested or required a review of merging master into a feature branch23:48
* thumper off to walk the dog23:49
