/srv/irclogs.ubuntu.com/2015/11/18/#juju-dev.txt

mupBug #1517258 opened: juju 1.24.7 precise: container failed to start and was destroyed <oil> <juju-core:New> <https://launchpad.net/bugs/1517258>00:13
anastasiamacwaigani: tyvm for review \o/00:17
anastasiamacwaigani: i'll prefix the PR with WIP and will add api/apiserver logic to it :D00:17
alexisbwaigani, did you use your new tool for the review?00:37
anastasiamacalexisb: waigani: new tool?00:40
wwitzel3wallyworld: ping00:58
wallyworldhey00:59
wwitzel3hey, hangout?00:59
wallyworldok00:59
wwitzel3wallyworld: https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=100:59
mgzsoooo... how do people actually feel about table tests these days?01:37
mgzI tried it as more-idiomatic go thing, but I might just be out of date01:38
davecheneymgz: i think table driven tests are ace01:42
mgzdavecheney: do you normally have two different structs for error cases and non-error cases?01:44
davecheneydepends01:44
davecheneyi normally add an err error field to the test table01:45
davecheneyand then you just test that the actual error and the error in the test table are the same01:45
davecheneyfor most cases it lets me combine success and fail cases01:45
mgzhm, issue is I need ErrorMatches01:48
mgzas I'm testing an error that includes at the bottom level a url.Parse error string I don't want to assert on the specifics of01:48
davecheneydo you care what error it is01:48
davecheneyor just the presence of an error01:48
mgznope.01:48
davecheneygot, err := f()01:49
davecheneyif t.err != err { // oops }01:49
davecheneyshould do01:49
mgzwell, I have two error paths, so I care a bot01:49
mgz*bit01:49
mgzdavecheney: exact code in cinder_test.go http://reviews.vapour.ws/r/3170/01:51
davecheneymgz: lgtm01:55
davecheneythat's how id do it with gocheck01:55
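The single-table approach davecheney describes, combined with mgz's need for ErrorMatches-style matching, might look roughly like the sketch below. It uses only the standard library rather than gocheck. parseEndpoint, errEmpty and the table fields are hypothetical names invented for illustration, and the regexp check is only a stand-in for gocheck's ErrorMatches, not the code from r/3170.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"regexp"
)

// errEmpty and parseEndpoint are hypothetical: a function under test with
// two error paths, one of which wraps a url.Parse error.
var errEmpty = errors.New("empty endpoint")

func parseEndpoint(raw string) (string, error) {
	if raw == "" {
		return "", errEmpty
	}
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("bad endpoint %q: %v", raw, err)
	}
	return u.Host, nil
}

// tests combines success and failure cases in one table. An empty errMatch
// means "expect no error"; otherwise it is a regexp the error text must
// match, so the url.Parse detail is never asserted on directly.
var tests = []struct {
	in       string
	want     string
	errMatch string
}{
	{in: "http://example.com:8776/v2", want: "example.com:8776"},
	{in: "", errMatch: `^empty endpoint$`},
	{in: ":8776/v2", errMatch: `^bad endpoint ".*": `},
}

// runTable returns one message per failing case (none when all pass).
func runTable() []string {
	var failures []string
	for i, t := range tests {
		got, err := parseEndpoint(t.in)
		switch {
		case t.errMatch == "" && err != nil:
			failures = append(failures, fmt.Sprintf("test %d: unexpected error: %v", i, err))
		case t.errMatch != "" && err == nil:
			failures = append(failures, fmt.Sprintf("test %d: expected error matching %q", i, t.errMatch))
		case t.errMatch != "" && !regexp.MustCompile(t.errMatch).MatchString(err.Error()):
			failures = append(failures, fmt.Sprintf("test %d: error %q does not match %q", i, err, t.errMatch))
		case got != t.want:
			failures = append(failures, fmt.Sprintf("test %d: got %q, want %q", i, got, t.want))
		}
	}
	return failures
}

func main() {
	fmt.Println(len(runTable()), "failing cases")
}
```

When the exact error value matters, the errMatch field could instead be an `err error` field compared with `!=`, as in the `t.err != err` line above.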
davecheneywho wants to look at the weirdest data race report02:02
davecheneyhttp://paste.ubuntu.com/13322899/02:02
davecheneyfunc (fakeAssignCaller) BestFacadeVersion(facade string) int { return 102:03
davecheney}02:03
=== natefinch-afk is now known as natefinch
natefinchdavecheney: derp?02:04
davecheneysure, most things are human error02:05
davecheneybut how can there be a data race on a function that never touches any data ?02:05
natefinchit even leaves out the receiver variable...02:06
davecheneywhy does it need the receiver, it always returns 102:07
perrito666davecheney: gocheck?02:07
davecheneynope02:07
perrito666ok, I am curious now02:10
davecheneyhttp://reviews.vapour.ws/r/3174/02:10
natefinchsome sort of pointer auto-dereference thing?02:10
davecheneyhave a look02:10
davecheneynatefinch: if I have a *T and I need to call a method that takes a T, how does the compiler do that ?02:11
perrito666interesting02:12
natefinchdavecheney: it dereferences the pointer, which is what I was thinking...02:12
davecheneyso why does that lead to a data race ?02:12
davecheneyRead by goroutine 12:02:13
davecheney  github.com/juju/juju/api/unitassigner.(*fakeWatchCaller).BestFacadeVersion()02:13
davecheney      <autogenerated>:18 +0xa902:13
davecheneyPrevious write by goroutine 11:02:13
davecheney  sync/atomic.CompareAndSwapInt32()02:13
davecheney      /home/dfc/go1.4/src/runtime/race_amd64.s:281 +0xc02:13
davecheney  sync.(*Mutex).Lock()02:13
davecheneyi'm going to have to blog this one02:13
natefinchit's copying the value02:13
natefincheven though we're not even using the value02:14
natefinchand so that counts as being "read", and in the other goroutine it's changing that value... so what gets copied depends on the order in which those two things happen.... even though we don't care.02:15
perrito666I would have guessed that it was compiled as a function rather than a method when the receiver is not assigned02:15
natefinchI definitely would have hoped that the compiler would figure out not to copy the value and therefore not trigger the race detector... however, it seems like it must not be that smart.... or the race detector's not that smart02:17
natefinchThe interesting thing then is, it seems like you should _always_ use *T for methods that don't need the receiver, to avoid exactly this problem.02:18
perrito666natefinch: methods that dont need a receiver are called functions :p02:18
natefinchperrito666: except it has to be a method to satisfy an interface02:19
davecheneynatefinch: bingo02:19
davecheneynatefinch: an observation in gopl was if you use pointer receivers for some methods, you should probably use them for all methods02:22
davecheneyand vice versa02:22
natefinchdavecheney: I've read that, and I wasn't sure if I agreed... sometimes value receivers are nice to both point out that the value isn't being modified, as well as enforce that fact.  Sort of immutability-lite ... but certainly with gotchas like this, it certainly is problematic.02:24
davecheneynatefinch: yeah, i'm still undecided02:34
davecheneyhttp://play.golang.org/p/DTDzYu4b3C02:34
davecheneyhere is a very small repro that shows the problem02:34
davecheneyno mutex02:34
davecheneyno sync atomic02:35
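A minimal sketch of the conclusion reached above; it is not a copy of the playground repro. It assumes the explanation given in the channel is right: calling a value-receiver method through a pointer goes via a compiler-generated wrapper that copies the struct, and that copy is what the race detector reports as a read. The names facadeCaller, valueCaller and pointerCaller are made up for this sketch. The program is race-free as written; the comments mark where the value-receiver variant would be flagged.

```go
package main

import (
	"fmt"
	"sync"
)

// facadeCaller mirrors the shape of the interface being satisfied.
type facadeCaller interface {
	BestFacadeVersion(facade string) int
}

// valueCaller uses a value receiver. Invoking the method through a
// *valueCaller dereferences and copies the struct first, so a concurrent
// write to calls elsewhere would be reported as a race even though the
// method never looks at the receiver.
type valueCaller struct{ calls int }

func (valueCaller) BestFacadeVersion(facade string) int { return 1 }

// pointerCaller uses a pointer receiver, so no copy of the struct is made
// when the method is called through the interface.
type pointerCaller struct{ calls int }

func (*pointerCaller) BestFacadeVersion(facade string) int { return 1 }

// Both pointer types satisfy the interface.
var (
	_ facadeCaller = (*valueCaller)(nil)
	_ facadeCaller = (*pointerCaller)(nil)
)

// exercise mutates the struct in one goroutine while another calls the
// method through the interface. With pointerCaller this is safe; swapping
// in valueCaller is the pattern the race detector complained about.
func exercise(c *pointerCaller) int {
	var iface facadeCaller = c
	var wg sync.WaitGroup
	version := 0
	wg.Add(2)
	go func() {
		defer wg.Done()
		c.calls++ // the only access to the field while goroutines run
	}()
	go func() {
		defer wg.Done()
		version = iface.BestFacadeVersion("UnitAssigner")
	}()
	wg.Wait()
	return version
}

func main() {
	fmt.Println("version:", exercise(&pointerCaller{}))
}
```

Running a variant that passes a *valueCaller under `go run -race` is one way to check whether a given toolchain still reports the read.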
wallyworldwwitzel3: where are the logs etc? i can't see any in /var/log/juju02:38
natefinchdavecheney: nice simple repro.  Do you know if it's the compiler's fault or the race detector's?02:56
natefinchfantastic... when CI tests fail, they don't bring down all the machines they created :/03:10
natefinchsomeday I'll get a clean run of the CI tests and maybe actually repro this bug03:13
mgznatefinch: this is an issue with the way this fails03:24
mgznatefinch: you call `juju destroy-environment` on an env you've just deployed machines for, but the machines have not yet come up, juju leaks the machines03:24
natefinchmgz: even with --force?03:24
natefinchI guess --force shouldn't matter....  but still... ouch, damn.03:25
natefinchmgz: I think this is a pass?  https://pastebin.canonical.com/144311/03:28
natefinchwould be nice if it said "PASS" or something at the end, rather than making me guess :/03:29
mgznatefinch: yeah, looks okay, `echo $?`03:30
natefinch003:31
mgzthat's a pass.03:31
natefinchI didn't change anything past menn0's fix03:31
natefinchgranted, it's on EC2 not joyent, but...03:32
* natefinch shrugs03:33
natefinchworks on my machine :/03:33
mgznatefinch: you can use the joyent creds from lp:cloud-city03:36
natefinchmgz: k, thanks03:36
menn0natefinch, mgz: I found it work with the local provider and EC2 as well03:36
menn0so it's beginning to look joyent specific which is weird03:37
natefinchmenn0: probably a timing issue03:37
natefinchprobably joyent is faster or slower in some specific part of the script03:37
menn0natefinch: agreed, that's likely03:37
mgzI suspect you just aren't seeing a race, :12 at last deploy, :19 before status is first called in your log03:37
mgzprobably long enough for EC2 to allocate instance ids for both machines03:38
natefinchyep03:38
mgz:09 at deploy on CI to :12 before first status03:39
natefinchmgz: the default-joyent creds in the environments.yaml in cloud-city?03:39
natefinchs/creds/environment03:39
mgznatefinch: yup03:40
natefinchmgz: btw, I hope you're not actually in the UK at this hour03:40
mgzor parallel-joyent and use the exact stream that the test did03:40
mgznatefinch: nah, NY03:40
natefinchmgz: ahh, good.  Whatcha doing on this side of the pond?03:41
mgzsprint, talking about QA across all of CDO products03:42
natefinchneat03:42
mgzwaigani_: updated r/3170 - thoughts welcome03:46
natefinchmenn0: you said you hacked the script to do an upload tools, care to show me where I'd do that?03:48
natefinchor mgz would you know how I could best do that? ^03:51
mgznatefinch: you can just append --upload-tools03:51
mgzoh wait, jes is still special?03:52
mgz--upload-tools doesn't work with jes, or at least didn't at the time the test was written03:53
natefinchwait, what, really?03:56
natefinchffs03:57
natefinchhow the hell do you test if you can't upload tools?03:57
menn0mgz: i'm not aware of a limitation with upload-tools and jes04:03
menn0but with joyent there is a routing problem04:03
menn0new instances often can't get to the state server's internal IP address04:03
menn0so they can't download the tools from the state server04:04
menn0I ran into that with my testing04:06
menn0wallyworld: where do instances get tools from during provisioning? always the state server or maybe directly from streams?04:07
waigani_mgz: +1 to how you've got the tests now. LGTM04:09
mgzmenn0: yeah, we have special hackery in CI to work around the fact the provider can't clean up networks04:15
mgzmenn0: see http://juju-ci.vapour.ws/job/joyent-deploy-jes-trusty-amd64/configure04:16
mgz$RELEASE_TOOLS/joyent-curl.bash /cpcjoyentsupport/fwrules | sed -e 's/[\[\{]/\n\0/g;' | grep $JOB_NAME | sed -e 's/.*"id":"\([^"]*\)".*/\1/' | xargs -I{} $RELEASE_TOOLS/joyent-curl.bash /cpcjoyentsupport/fwrules/{} -X DELETE || true04:16
mgzwaigani_: thanks04:16
mgzmenn0: I'm failing to remember why --upload-tools didn't work, just remember I disabled it on abentley's say-so04:19
mgznatefinch: there's a one-line change you can make in utility.py that will enable --upload-tools again04:21
mgzwell, and a dedent.04:22
mupBug #1497316 changed: TestUniterSteadyStateUpgrade permission problem <ci> <intermittent-failure> <windows> <juju-core:Expired> <https://launchpad.net/bugs/1497316>04:23
menn0mgz: what about changing the call to boot_context() in assess_jes_deploy.py04:30
menn0it takes an uploadTools arg04:30
menn0(hardcoded to false)04:30
mgzah, pants, yeah, that too04:31
mupBug #1497316 opened: TestUniterSteadyStateUpgrade permission problem <ci> <intermittent-failure> <windows> <juju-core:Expired> <https://launchpad.net/bugs/1497316>04:32
natefinchmgz: seems easy enough.  I think I have it, with the additional change to make False args.upload_tools in boot_context04:38
mupBug #1497316 changed: TestUniterSteadyStateUpgrade permission problem <ci> <intermittent-failure> <windows> <juju-core:Expired> <https://launchpad.net/bugs/1497316>04:38
natefincher in the call to boot_context04:38
* natefinch tries it out04:38
mgzokay, http://reviews.vapour.ws/r/3172/ should be good to go04:38
mupBug #1497316 opened: TestUniterSteadyStateUpgrade permission problem <ci> <intermittent-failure> <windows> <juju-core:Expired> <https://launchpad.net/bugs/1497316>04:41
mupBug #1497316 changed: TestUniterSteadyStateUpgrade permission problem <ci> <intermittent-failure> <windows> <juju-core:Expired> <https://launchpad.net/bugs/1497316>04:47
natefinchhmm yeah, not working... getting no tools found when doing create-environment.04:52
wallyworldmenn0: sorry, was out at 1:1. during provisioning we go to state server04:55
wallyworldmenn0: cloud init is configured to try all the recorded api host addresses to till connections04:55
menn0wallyworld: ok thanks04:56
menn0mgz, natefinch: so I must have been running into joyent firewall issues then. I frequently saw new instances that couldn't connect to the state server to get tools.04:56
menn0natefinch: you might want to make sure you remove the rules as per the CI job04:57
natefinchmenn0: this is running the CI test...04:58
menn0natefinch: yes, but the CI job that runs assess_jes_deploy.py does some stuff before and after it runs too04:58
natefinchahh04:58
menn0natefinch: see the job's config in jenkins04:58
menn0natefinch: or the long command mgz mentioned earlier04:59
menn0natefinch: that's from the CI job04:59
natefinchmenn0: gross05:01
menn0natefinch: it's not exactly beautiful no05:01
wwitzel3wallyworld: if you juju ssh 0 , you'll see all the logs05:02
wallyworldoh, that's not the bootstrap machine, doh05:03
wwitzel3wallyworld: nope, that is the client machine, in the home directory there is the 1.26-alpha1 client too05:04
wwitzel3wallyworld: the client on the path is 1.2205:04
wallyworldok05:04
mgzbug 1451104 could just be fixed in the joyent provider, then CI hackery goes away05:07
mupBug #1451104: Joyent machines can fail to fetch tools <ci> <deploy> <joyent-provider> <reliability> <juju-core:Triaged> <https://launchpad.net/bugs/1451104>05:07
mgzbug 1485781 is more specific05:08
mupBug #1485781: Juju is unreliable on Joyent <joyent-provider> <reliability> <repeatability> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1485781>05:08
natefinchle sigh05:08
davecheneyi read that as "joyent is unreliable on joyent"05:09
davecheneyit had a nice ring to it05:10
natefinchno, it's "CI only really works in CI"05:11
natefinchthere's just like 100 implicit dependencies that need to be perfectly correct to get things to run05:12
davecheneyworks on my cloud(tm)05:12
wallyworldwwitzel3: the data being logged does not match what is on streams.canonical.com05:19
wallyworldmaybe they have squid or something serving stale data05:20
wwitzel3wallyworld: but that shouldn't prevent upload-tools working?05:24
wallyworldtrue, i was looking to see why a normal upgrade fails05:24
natefinchmgz: what's $JOB_NAME?05:25
mgznatefinch: what you're passing as forth arg to assess_jes.py05:27
mgz+u05:27
natefinchmgz: so the copied environment name?05:31
mgzwell, ideally something unique, the idea is jobs don't stomp on each other if running at the same time05:32
wallyworldwwitzel3: aha05:33
wallyworldthe tools metadata is fucked05:33
wallyworldincorrectly generated05:33
wallyworldthis has happened before :-(05:33
wwitzel3wallyworld: can we use juju-metadata to generate new data?05:34
wallyworldno, it needs to be signed05:35
wallyworldit's a CPC issue05:35
wwitzel3wallyworld: and this is what is impacting even with upload-tools?05:35
wallyworldnot sure, that's next on the list to look at05:36
wallyworldthis is a serious, serious issue05:36
wwitzel3wallyworld: ah, ok05:36
mgzwallyworld: which streams?05:37
wallyworlddevel, maybe others, haven't checked05:37
natefinchstill no tools found...05:37
natefinchwhelp.  It's after midnight and well past bed time.  I guess I won't be fixing this today.05:37
wallyworldmgz: all streams seem affected05:39
=== mwenning is now known as mwenning-afk
mupBug #1482155 changed: lxc restriction on multiple state servers <ha> <lxc> <state-server> <juju-core:Invalid> <https://launchpad.net/bugs/1482155>05:47
mupBug #1482155 opened: lxc restriction on multiple state servers <ha> <lxc> <state-server> <juju-core:Invalid> <https://launchpad.net/bugs/1482155>05:50
mupBug #1482155 changed: lxc restriction on multiple state servers <ha> <lxc> <state-server> <juju-core:Invalid> <https://launchpad.net/bugs/1482155>05:53
mgzwallyworld: so... don't think we actually need cpc, probably from a lp:juju-release-tools change, so we're just sitting on their hardware06:01
wallyworldok, so we can maybe fix?06:01
wallyworldor rollback06:02
mgzyeah, but it's also night in canada06:02
wallyworldSOP \o/06:04
wallyworldSPOF06:04
mupBug #1517344 opened: state: initially assigned units don't get storage attachments <juju-core:Triaged> <https://launchpad.net/bugs/1517344>07:35
menn0everything is awesome08:57
menn0everything is awesome when you're part of a team08:57
dimiternmenn0, :) are you talking about the blocker bug?08:58
menn0not specifically08:58
menn0just slightly delirious08:58
menn0and that song is stuck in my head08:58
dimiternmenn0, you should get some rest then ;)08:59
menn0dimitern: nah... perfect time to write some more code09:00
menn0;-)09:00
dimiternmenn0, I know the feeling09:00
menn0fwereade: review please: http://reviews.vapour.ws/r/3178/09:13
fwereademenn0, ack09:14
fwereademenn0, and now that's stuck in my head too09:14
menn0you're welcome09:14
menn0;-)09:14
dimiternvoidspace, dooferlad, frobware, guys, please have a look at http://reviews.vapour.ws/r/3167/ when you can09:32
voidspacedimitern: looking09:36
voidspacedimitern: good to read this code as I don't (yet) understand the concepts fully09:36
dimiternvoidspace, cheers - I'm glad to clarify things where needed09:39
voidspacedimitern: straightforward so far09:50
frobwaredimitern, added some comments; also (for standup) why do we replace " " in space names?09:56
dimiternfrobware, thanks! spaces are not valid in constraints09:58
frobwaredimitern, so why do we collapse them? if you pass a name with " " isn't that just invalid?10:00
voidspacemaas space names can have spaces10:01
voidspacewe'll have to translate between juju space names and maas names10:01
dimiternbut not juju space names10:01
dimiternfwereade, jam, standup?10:02
mupBug #1517391 opened: MachineStorageIdsWatcher severely undertested <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1517391>10:09
mattywanastasiamac, are you still awake?10:30
mattywanastasiamac, I guess you shouldn't be really10:30
anastasiamacmattyw: hi o/11:13
mattywanastasiamac, hey hey. I see http://reviews.vapour.ws/r/3171/ is marked WIP - do you still want reviews or should I leave it for now?11:14
anastasiamacmattyw: tyvm for asking - plz leave it for now :D11:14
mattywanastasiamac, will do11:15
anastasiamacmattyw: \o/11:15
mattywanastasiamac, feel free to ping me or anyone else from emerald squad with reviews for x-model relations11:16
anastasiamacmattyw: gr8 idea - will do :)11:17
=== blahdeblah_ is now known as blahdeblah
mupBug #1517428 opened: state depends on multiwatcher and/or params <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1517428>12:12
voidspacedimitern: ping12:18
dimiternvoidspace, pong12:18
voidspacedimitern: your PR - parseDelimitedValues12:18
voidspacedimitern: where does rawValues come from?12:18
voidspacedimitern: is that from the original command line invocation?12:19
voidspacedimitern: I wonder why we're doing the " " stripping inside the provider12:19
voidspacedimitern: we should have already converted from juju space names to maas space names by here12:19
voidspacedimitern: so they *might* have valid spaces in them12:19
dimiternvoidspace, the raw values come from constraints12:19
voidspacedimitern: if they're space names then they need converting to maas names12:20
voidspacedimitern: and the provider can't do that as it requires access to state12:20
voidspacedimitern: and constraints will use juju names12:20
dimiternvoidspace, I guess I overengineered it a bit - I'll drop the space stripping and that will simplify it - there's no way we get invalid constraints that far into the provisioning process12:20
voidspacedimitern: ok12:20
voidspacedimitern: maas space names can have spaces in them - so stripping them is wrong12:21
voidspacedimitern: I'll add an issue12:21
dimiternvoidspace, the conversion needs to happen, but not having it now won't block us from possibly doing the demo12:21
voidspacedimitern: but the conversion needs to happen in a different place, not in the provider code12:22
dimiternvoidspace, yeah, that's true12:22
voidspacedimitern: the provider code should only be dealing with MAAS names (or ids)12:22
dimiternvoidspace, I think it will happen in the provisioner12:22
voidspacecool12:23
voidspacedimitern: frobware: dooferlad: just to mention that I'm off tomorrow, it's in the calendar.12:31
dimiternvoidspace, have a good one ;)12:31
voidspacedimitern: I'm spending the day with my mum, visiting stratford12:32
voidspaceshould be good12:32
dimiternafter some minor but required CI surgery on the parallel streams job, I'm happy to report we now have a blessed master!!12:38
dimiterndon't rush all at once to land your stuff :P12:39
dimiternvoidspace, I've re-queued your PR #3692 for merging as well as all 3 of mine12:51
mupBug #1516144 changed: Cannot deploy charms in jes envs <blocker> <charms> <ci> <regression> <juju-core:Fix Released by menno.smits> <https://launchpad.net/bugs/1516144>12:51
dimiternhopefully no conflicts will emerge12:52
mattywdimitern, ping?13:02
dimiternmattyw, pong13:04
mattywdimitern, you know maas, are you able to respond to this guy? https://github.com/juju/juju/issues/362713:04
dimiternmattyw, I have no idea why that happens unfortunately :/13:05
frobwarevoidspace, ack13:27
frobwaredimitern, you left mine out... sniff. :)13:28
mupBug #1517474 opened: provider/ec2: don't artificially limit EBS volumes to xvdf-xvdp <juju-core:Triaged> <https://launchpad.net/bugs/1517474>13:30
jamfwereade: ping for planning meeting?13:33
dimiternfrobware, sorry, should've checked :/13:39
dimiternfrobware, voidspace, I've updated http://reviews.vapour.ws/r/3167/ - can you have another look and approve it if it looks ok?13:40
frobwaredimitern, taking a look13:40
frobwaredimitern, glad we dropped the " " stripping13:40
dimiternfrobware, yeah, now that I think about it, it was like this because the --networks argument to deploy (where these came from in an earlier PoC) was not well validated and/or rushed to a demo-able state13:42
mgzdimitern: I don't suppose you heard anything from wallyworld on the status of upgrade issue? he was concerned our streams were screwed, but I don't see any mailing list updates from him.13:42
dimiternmgz, which upgrade issue is that?13:45
dimiternfrobware, once our PRs land, it's time to rebase maas-spaces onto master13:48
frobwaredimitern, +113:49
dimiternmgz, was that about upgrade 1.20.x->1.24.4 not working unless with --version 1.24.4 ?13:50
mgzI think it was several layers of investigation into rt #8546313:55
mupBug #85463: [apport] evolution-exchange-storage crashed with SIGSEGV in e2k_restriction_unref() <evolution-exchange (Ubuntu):Invalid by desktop-bugs> <https://launchpad.net/bugs/85463>13:55
mgznot that mup13:55
mattywkatco, woohoow!13:56
mattywkatco, I've been denying myself the lxd provider till it lands in master13:57
voidspacedimitern: LGTM14:22
dimiternvoidspace, cheers!14:22
mupBug #1517499 opened: i/o timeout on bundle deployment <juju-core:New> <https://launchpad.net/bugs/1517499>15:01
natefinchaxw: thanks for fixing that bug with unit assignment.  Can't believe I forgot to remove the docs after successful assignment.  I'm surprised that didn't show up more often in tests.15:13
frobwarecherylj, having trouble with the bond0 interface and juju-br0 1516891. Will concentrate on landing 1512371 first15:28
cheryljfrobware: okay, thanks for the update!15:29
frobwarecherylj, this may be at the root of some problems - https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/133787315:29
mupBug #1337873: Precise, Trusty, Utopic - ifupdown initialization problems caused by race condition <cts> <patch> <ifupdown (Ubuntu):Fix Released by dgadomski> <ifupdown15:29
mup(Ubuntu Precise):Confirmed> <ifupdown (Ubuntu Trusty):Confirmed> <ifupdown (Ubuntu Vivid):Confirmed> <ifupdown (Debian):New> <https://launchpad.net/bugs/1337873>15:29
cheryljfrobware: interesting.  It's an old bug, but wasn't addressed until last month?15:38
frobwarecherylj, I think there's a bit of work to address the bonding bug (for juju-br0)15:40
cheryljfrobware: do you know if it's just a problem with MAAS 1.9?15:41
frobwarecherylj, I don't.15:42
cheryljjust re-read the bug, it looks like the support for bonding is introduced in 1.915:43
cheryljfrobware: is there a workaround where users could manually edit /e/n/i?15:44
frobwarecherylj, let me try. though I am getting distracted... :)15:44
cheryljfrobware: thanks :)  I'd feel more comfortable moving it to 1.25.2 if we could give people a manual workaround15:45
natefinchericsnow: you around?15:58
ericsnownatefinch: briefly15:58
frobwarecherylj, synthesizing what juju bootstrap would do to /e/n/i and rebooting yields no working interfaces. \o/ ... :(15:59
frobwarecherylj, so, not sure there's even a quick workaround right now.15:59
cheryljfrobware: yikes!!15:59
natefinchericsnow: I don't think your fix for the OOM error is actually changing the core behavior... you're running the waits in goroutines, but you're starting all the executables in what is effectively a tight loop... since Cmd.Start() just fires off a goroutine15:59
natefinchericsnow: er, maybe not a goroutine per se... but we'll still be starting processes really fast and could easily have 100 fire off in a second16:01
ericsnownatefinch: fork doesn't take long16:02
ericsnownatefinch: fork+exec I mean16:02
ericsnownatefinch: by the time Start completes it should already be exec'ing16:03
natefinchericsnow: I guess I don't know when linux stops "counting" the memory of the fork against juju... or when it decides the fork doesn't really need that much.  You're going to have 100 ssh processes all running at the same time if there are 100 machines to run on.16:06
ericsnownatefinch: once it execs16:06
katcocherylj: got a sec?16:07
cheryljkatco: sure, what's up?16:07
natefinchericsnow: so, once Cmd.Start returns, we're no longer being hurt by the memory the fork uses?16:07
katcocherylj: can you hop in https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=116:07
ericsnownatefinch: correct16:08
natefinchericsnow: interesting. ok, thanks.16:09
ericsnownatefinch: np16:09
wwitzel3mgz: what do you mean isolate? I know if I bootstrap with 1.22.8, set the agent-stream to devel, and issue update-juju --version 1.26-alpha1 I get: no matching tools available16:10
wwitzel3s/update/upgrade16:10
mgzwwitzel3: I mean set agent-metadata-url to a specific different location16:13
wwitzel3mgz: sure, what location?16:14
ericsnownatefinch: FWIW, I don't think the 100 ssh procs will be a big problem (~50k each)16:14
mgzabentley: do you have bandwidth to work with wwitzel3 to look at the upgrade from 1.22.8 to devel streams?16:14
natefinchericsnow: only a problem due to the memory copying thing from fork16:14
ericsnownatefinch: yep16:15
abentleymgz, wwitzel3: Yes, I can help out.  I was just dealing with a similar issue 1.20.x16:16
abentleywwitzel3: hangout?16:16
wwitzel3abentley: https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=116:17
mupBug #1517535 opened: Agent stuck in "allocating" state <juju-core:New> <https://launchpad.net/bugs/1517535>16:25
natefinchkatco: in case you missed it, I talked with ericsnow and he convinced me that his code actually is doing the right thing... it was just my lack of knowledge of how linux's memory handling works during fork & exec.16:26
mupBug #1517535 changed: Agent stuck in "allocating" state <juju-core:New> <https://launchpad.net/bugs/1517535>16:31
mupBug #1517535 opened: Agent stuck in "allocating" state <juju-core:New> <https://launchpad.net/bugs/1517535>16:34
katconatefinch: cool, looking forward to seeing it land16:34
katcomattyw: jujubot merged commit de417eb into master 20 seconds ago. it's in master now :) alpha2!16:44
katconatefinch: ericsnow: wwitzel3: lxd provider is now in master. good job team :D16:44
mattywkatco, natefinch ericsnow wwitzel3 good work folks - just try to stop me using the lxd provider now!16:45
ericsnowkatco: \o/16:45
mgzcherylj: bug 1512399 fix landed on 1.2516:53
mupBug #1512399: ERROR environment destruction failed: destroying storage: listing volumes: Get https://x.x.x.x:8776/v2/<UUID>/volumes/detail: local error: record overflow <amulet> <bug-squad> <openstack> <sts> <uosci> <Go OpenStack Exchange:Fix Released by gz> <juju-core:In Progress by gz> <juju-core16:53
mup1.25:In Progress by gz> <https://launchpad.net/bugs/1512399>16:53
cheryljyay!  thanks, mgz!16:53
=== jamesmil_ is now known as jamesmillerio
wwitzel3abentley: https://bugs.launchpad.net/juju-core/+bug/1507867/comments/23 figured out the reproduction steps17:10
mupBug #1507867: juju upgrade failures <canonical-bootstack> <upgrade-juju> <juju-core:In Progress by wwitzel3> <https://launchpad.net/bugs/1507867>17:10
wwitzel3abentley: seems code related, but thought you might be interested17:11
abentleywwitzel3: Yes, using --upload-tools breaks future upgrades.  That is a known bug.  sinzui?17:11
wwitzel3abentley: yeah, but I thought setting the tools-url manually worked around it?17:12
abentleywwitzel3: Yes, I thought so, too.  Interesting.17:13
abentleywwitzel3: Can I see the log?17:15
abentleywwitzel3: Can you upgrade to a non-devel version like 1.25.0?17:16
wwitzel3abentley: nope17:17
wwitzel3abentley: the log doesn't have much in it, even at debug17:17
abentleywwitzel3: Sometimes the "reading tools with major.minor version 1.22" line is informative.17:19
wwitzel3abentley: that line does not exist in the log17:19
abentleywwitzel3: for an upgrade, the machine-0 log may also be useful, but you have to read it very damn closely.17:19
wwitzel3abentley: I dont see any streams activity, like it isn't trying at all17:19
mgzis the lxd-provider documented anywhere at present?17:20
katcomgz: in the release notes; no documentation as of yet17:22
mgzkatco: ta17:22
alexisbthe idea of having both bundle deploy support and a lxd provider on a devel ppa juju makes me giddy :)17:25
katcoalexisb: i know right? those 2 features together = love17:26
abentleywwitzel3: Here is the machine-0 log of a successful upgrade from 1.22 to master, if it is useful: http://data.vapour.ws/juju-ci/products/version-3327/aws-upgrade-22-trusty-amd64/build-376/machine-0.log.gz17:29
natefinchericsnow: I see you're still posting on the juju-run bug. Do you think you'll have time to get that landed today?  I was going to try to do the work to get it landed, but if you're doing it, I can move on to something else.17:42
ericsnownatefinch: I'm trying but quickly running out of time17:43
ericsnownatefinch: I'll let you know and hand it off if needed17:43
stokachukadams54: got a PR for you for theblues package17:43
natefinchericsnow: ok. yeah, I had assumed you weren't going to have any time today, so was prepared to do it. Let me know when you know.17:44
ericsnownatefinch: k17:44
ericsnownatefinch: will be soon17:44
stokachu2117:45
kadams54stokachu: Taking a look17:48
natefinchfwereade: btw, really like your point about passing an abort channel rather than a timeout value.  way more flexible, and still trivial to use as a timeout.18:11
fwereadenatefinch, cool, thanks :)18:11
wwitzel3abentley: was that after using upload tools?18:23
abentleywwitzel3: no, that was our standard test.  http://reports.vapour.ws/releases/3327/job/aws-upgrade-22-trusty-amd64/attempt/37618:34
wwitzel3abentley: yeah, the standard test works for me as well18:35
natefinchkatco: I'm going to go back to bug #1491688 for now, since ericsnow seems to still be working on his bug18:51
mupBug #1491688: all-machine logging stopped, x509: certificate signed by unknown authority <bug-squad> <landscape> <logging> <rsyslog> <sts> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1491688>18:51
katconatefinch: oh, ok. sorry, didn't think ericsnow would be around today18:51
natefinchkatco: me either :)18:52
ericsnowkatco: leaving in a few minutes18:52
katcoericsnow: ok. do you think you'll get this in by EOD?18:53
ericsnowkatco: was going to hand it off to natefinch soon18:53
katcoericsnow: natefinch: is it possible for you two to pair for the remaining time?18:54
ericsnowkatco: made the mistake of reading fwereade's review comments :)18:54
katconatefinch: i don't want you to get whiplash18:54
natefinchkatco: I gotta pick up my daughter from preschool now, unfortunately18:56
katconatefinch: ok18:57
katcoericsnow: please send nate an email with all the info he needs for the hand-off. we need this landed by tonight18:57
natefinchkatco: I'm very familiar with eric's changes and william's comments, though, so I don't think there will be many problems18:57
katconatefinch: ok18:57
ericsnownatefinch: I'm going to update my branch and then you can take it from there18:57
natefinchericsnow: cool18:57
ericsnownatefinch: thanks18:57
natefinchkatco: I spent a lot of time looking into it before I realized eric was working on it, so I guess that works out :)18:57
katco=|18:58
katcocommunication people!18:58
katcoericsnow: how dare you not actually be out today18:58
natefinchexactly!18:58
natefinch;)18:58
natefinchok, gotta run18:58
=== natefinch is now known as natefinch-afk
ericsnowkatco, natefinch-afk: okay, I'm out19:06
katcoericsnow: gl19:06
=== ericsnow is now known as ericsnow-afk
=== _thumper_ is now known as thumper
lazypowero/  allo core devs. Is there ever a reason to use a scope: container qualifier on a relationship outside of subordinate services? I've been noodling this for a while and dont see a use-case for it outside of subordinate relations.20:08
=== natefinch-afk is now known as natefinch
thumperdavecheney: isn't the reason the structs moved into the multiwatcher was because state depends on multiwatcher...20:25
thumperdavecheney: could the dependency change the other way around?20:25
thumperso state/multiwatcher just depends on state?20:25
thumperI'm not sure I'm remembering correctly20:26
stokachukadams54: is there a way to do a glob-like search with theblues library?20:28
stokachuusing search('nova') only pulls in nova-compute and not nova-cloud-controller etc20:28
kadams54stokachu: I don't rightly know. We use ElasticSearch on the backend, so it would be worth trying ES' query string syntax: https://www.elastic.co/guide/en/elasticsearch/reference/1.3/query-dsl-query-string-query.html#query-string-syntax20:41
stokachukadams54: ok thanks20:42
mupBug #1517611 opened: TestFilesystemInfo race condition in 1.25 <ci> <intermittent-failure> <regression> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1517611>20:50
stokachukadams54: find any issues with that PR?21:01
* thumper sighs21:01
thumperI merged the blessed master into my feature branch21:01
thumperand I have so many failures with go 1.521:01
thumperwhy are go 1.5 test runs not voting?21:02
natefinchdoh21:02
natefinchthumper: because they weren't passing :/21:02
urulamastokachu: the way to search anything mentioning "nova" is https://api.jujucharms.com/charmstore/v4/search?text=nova&autocomplete=1&limit=10021:02
thumperclearly the fix then is to fix them21:02
thumpernot just ignore them21:02
natefinchagreed21:02
thumperFFS21:03
stokachuurulama: perfect thank you!21:03
natefinchthe lxd stuff is behind build tags for go 1.3+, (since it requires go 1.3+)  Hopefully you're not seeing problems with that code21:03
urulamastokachu: and it's ranked21:04
stokachuurulama: yea it looks like it returns trusty/precise first21:04
stokachuurulama: so can i assume those are the blessed ones at the top each time?21:04
urulamastokachu: yes21:05
stokachuurulama: perfect much appreciated21:05
urulamastokachu: if you want to see just the blessed ones (no user namespace charms) do this https://api.jujucharms.com/charmstore/v4/search?text=nova&autocomplete=1&limit=100&owner=21:06
stokachueven better21:07
kadams54stokachu: QA checks out. I'm going to merge it.21:08
stokachukadams54: sweet thanks!21:09
thumpernatefinch: github.com/juju/juju/worker/provisioner, github.com/juju/juju/payload/api/private, github.com/juju/juju/cmd/jujud/agent (messy db cleanup), github.com/juju/juju/cmd/juju/commands, github.com/juju/juju/api/unitassigner , and some fallout that appears to be due to a clash with my current work21:11
stokachuurulama: so i noticed some of the results return charms like glance and neutron, is there a way to just search if the 'term' is in the id?21:11
natefinchthumper: hmm weird, yeah, none of that is lxd code.  not sure why it would fail for go 1.521:12
thumpernatefinch: most likely the runtime changes in 1.521:13
thumperw.r.t. GOMAXPROCS default21:13
stokachukadams54: ew owned by docs?21:13
thumpercauses more races in tests21:13
natefinchthumper: I don't think so. I always run GOMAXPROCS=821:13
* thumper shrugs21:13
thumpersomething else then21:13
natefinchsumthin21:13
natefinchkatco: I have no idea how to write a test to show that eric's code uses goroutines in exactly the right way and not in the wrong way :/21:14
kadams54stokachu: Yeah, I'll get it straightened out.21:14
stokachuurulama: not a big deal if not, i can filter it out21:14
stokachukadams54: :)21:14
katconatefinch: let me take a peek at his code again21:14
katconatefinch: what function are you trying to test?21:15
natefinchkatco: the meat of the change is in startSerialWaitParallel21:15
natefinchkatco: so, before we were effectively doing the whole inside of the loop in a goroutine, and now we're only doing the second half in a goroutine21:16
katconatefinch: i think the trouble is because the wait func should be passed into startSerialWaitParallel21:17
urulamastokachu: it should be with something like https://api.jujucharms.com/charmstore/v4/search?name=mongodb&limit=10021:18
katconatefinch: pull that out, test it separately, and then test that startSerialWaitParallel calls it properly21:18
urulamastokachu: however, some names don't return any results, which makes me wonder why :S21:18
katconatefinch: i.e. startSerialWaitParallel calls whatever is passed in properly21:18
natefinchkatco: yeah, that can work.  I already didn't like that wait was getting half its args from closing over them and half not, so that would fix that problem.21:19
katconatefinch: isn't the only var it's closing over, "wg"?21:20
natefinchkatco: 1 is half of three when you assign it to an int ;)21:21
katcolol21:21
natefinchtechnically correct is the best kind of correct!21:21
natefinchyeah, I thought there was something else, but I guess not :)21:22
katcodoes anyone recall seeing a bug open recently around not being able to "upgrade-juju" after using "--upload-tools" ?21:24
natefinchah hah, the cmd too21:24
natefinchkatco: upload tools is bad juju, so to speak.21:24
katconatefinch: so not the point ;p21:24
natefinchI'm not really clear on the rules when you use upload-tools vis a vis upgrades21:25
thumperdavecheney: ping when you are around21:27
perrito666I hate replicaset sometimes... and some other too21:28
* perrito666 bbl bike time21:28
stokachuurulama: so would https://api.jujucharms.com/charmstore/v4/search?name=nova&limit=100 return everything with 'nova' in the name?21:28
urulamastokachu: it should, but doesn't :S so, stick with the text=nova&owner= for now and filter out the ones you don't need21:31
stokachuurulama: ok cool21:31
urulamastokachu: i'll take a look at why name queries fail21:32
stokachuurulama: thanks again :)21:32
urulamastokachu: np21:32
natefinchgod I hate using external tests.  It's like every time I go to write a nice little unit test, someone is actively trying to make it more difficult :/21:50
natefinchkatco: well, refactored, but now to write the tests: https://github.com/natefinch/juju/commit/5f5e26441bc250c221e8e8b688432db8c5e8beec#diff-813c65409327242bb7df0d66ac593b91R19521:52
natefinchkatco: gonna require some... tricks, I think.21:52
katco=|21:53
natefinchjust difficult to detect when you're running in a separate goroutine or not21:53
natefinchactually,  I think I can do it.21:55
natefinchhmm..21:55
natefinchahh, I have it21:56
natefinchsome creative locks in the start and wait functions can prove that start is always called in order, and wait gets called all at the same time21:57
mupBug #1517632 opened: juju upgrade-juju after upload-tools fails <juju-core:New> <https://launchpad.net/bugs/1517632>21:59
=== natefinch is now known as natefinch-afk
perrito666seriously this country.... a bit of summer and people have barbecue parties every single night... and my desk is next to the windows22:59
alexisbthumper, axw, anastasiamac: release call is running over23:00
thumperk23:00
alexisbbe there shortly23:00
anastasiamack :D23:01
