[00:22] axw: menn0:thumper:wallyworld: an awesome, small review plz :D http://reviews.vapour.ws/r/5523/
[00:22] alexisb_: ping
[00:22] otp with reed can look soon
[00:22] alexisb_: we should have a quick word
[00:23] anastasiamac: looking
[00:24] menn0: \o/
[00:25] anastasiamac: well that was easy. ship it
[00:27] menn0: cmars: tyvm :) m loving to see this beta's bug count going down or fixes count going up - glass half full :D
[00:46] dog walk time
=== thumper is now known as thumper-dogwalk
[00:50] Bug #1471770 changed: TestPrunes fails occasionally still
[00:50] Bug #1580501 changed: cloudimg-base-url parameters not in Juju2 anymore <4010>
[01:10] anastasiamac: I think you might've broken something: ERROR failed to bootstrap model: model "controller" of type manual does not support instances running on "amd64"
[01:11] (I'm looking into it)
[01:11] axw: well, the only thing i can think of is that manual constraints validator does not have arches, and if there are no images for it, then nothing will be merged...
[01:12] anastasiamac: yeah, same thought
[01:12] axw: i think I'll need to put arches in there
[01:12] anastasiamac: yup
[01:12] k. i'll do it now :)
[01:12] anastasiamac: I can probably fix it in my branch
[01:12] axw: ooh even better ;)
[01:12] should be a pretty small change
[01:13] axw: i wonder if other providers will need to have similar thing.. i think manual slipped through fingers because maybe it never had arches vocab defined..?
[01:13] anastasiamac: yeah, it doesn't
[01:14] axw: \o/ tyvm. let me know if there is something i can do to assist
[01:32] wallyworld: have you noticed that "model-config" says FROM=model for logging-config and resource-tags in a fresh model?
[01:33] axw: logging-config i'd expect because juju itself sets the value via the api. resource-tags i'd need to look at, but i suspect the issue is the schema coercion to a map from a string
[01:34] bah, string to map i mean
[01:34] wallyworld: yeah, almost certainly. I was thinking we should just set the default logging-config though?
[01:34] to the same as what the agent runs
[01:35] and configure the client differently if needed
[01:35] axw: it starts out as one thing and juju sets to another (debug->info)
[01:35] or something like that
[01:36] but yeah, maybe we could do better, i recall at the time it made sense how it is
[01:36] wallyworld: *shrug* it seems odd to me that OOTB the config says it's not default
[01:37] agreed. i can't recall the specifics off hand, but juju messes with it
[01:37] we can clean up next week
[01:42] anastasiamac: would you please review https://github.com/juju/juju/pull/6083/commits/32fdee6ef69e1355480ed9dbd208ca69c97fdd0f
[01:45] axw: looks awesome :) did u get a chance to test live too?
[01:45] anastasiamac: yup, bootstraps fine after this change
[01:45] axw: \o/ LGTM for this commit :D
[01:45] anastasiamac: ta
=== thumper-dogwalk is now known as thumper
[02:09] Bug #1449210 changed: cloudsigma index file has no data for cloud
[02:09] Bug #1616197 changed: juju restore-backup error <20160826>
[02:09] Bug #1616298 changed: DebugMetricsCommandSuite.TearDownTest fails due to "no reachable servers."
[02:12] is canonical IRC down for anyone else?
[02:13] natefinch: yes
[02:13] Just back now
[02:13] oh good, then I haven't been fired yet.
[02:13] miken: not for me... still down \o/
[02:13] natefinch: unless we've all been :)
[02:14] Oh - I'm connecting from an internal IP address, and it just reconnected to irc.c.c 2mins ago.
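The bootstrap failure above boils down to the manual provider's constraints validator registering no architecture vocabulary, so once image metadata stops supplying arches the merged set is empty and nothing is "supported". A toy sketch of that mechanism — illustrative types only, not juju's actual constraints package:

```go
package main

import "fmt"

// validator maps a constraint attribute (e.g. "arch") to its allowed values.
type validator struct {
	vocab map[string][]string
}

func (v *validator) RegisterVocabulary(attr string, values []string) {
	v.vocab[attr] = values
}

func (v *validator) Validate(attr, value string) error {
	allowed, ok := v.vocab[attr]
	if !ok {
		// No vocabulary registered and no image metadata to merge in:
		// the effective set of supported values is empty.
		return fmt.Errorf("no %s vocabulary registered: %q cannot be supported", attr, value)
	}
	for _, a := range allowed {
		if a == value {
			return nil
		}
	}
	return fmt.Errorf("%q not among supported %s values %v", value, attr, allowed)
}

func main() {
	v := &validator{vocab: make(map[string][]string)}
	fmt.Println(v.Validate("arch", "amd64")) // fails, like manual before the fix
	v.RegisterVocabulary("arch", []string{"amd64", "arm64", "ppc64el", "s390x"})
	fmt.Println(v.Validate("arch", "amd64")) // nil once arches are registered
}
```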
[02:25] thumper: you'll need to resubmit your race fix pr i think, after adding "Build failed:" comment to trick the bot
[02:25] wallyworld: ack
[02:27] axw: got some time to talk manual providers?
[02:31] thumper: sure
[02:31] axw: https://hangouts.google.com/hangouts/_/canonical.com/manual?authuser=0
[02:42] * anastasiamac about to lose electricity - going afk for lunch and fun
[02:43] thumper: you guys talking about that bug I was looking at?
[02:45] natefinch: this one? https://bugs.launchpad.net/juju-core/1.25/+bug/1610880
[02:45] Bug #1610880: Downloading container templates fails in manual environment
[02:45] anastasiamac: yeah
[02:45] anastasiamac: was going to ask you if you had any thoughts about that one
[02:46] natefinch: looks like the fix needs to go into 1.25 not master
[02:46] natefinch: no thoughts whatsoever - hence, m happy to go with advice from wallyworld to mark it as Invalid. I do wish there was a bit more explanation as to why it is Invalid...
[02:47] anastasiamac: well, the customer who is experiencing it is on 1.25, yeah. I don't know if it happens in 2.0, honestly
[02:47] it has to be invalid as it's lxc only
[02:47] natefinch: k :) do u have enough context to fix?
[02:47] oh yeah, dug
[02:47] 2.0 uses a totally different mechanism
[02:47] duh
[02:47] it's not invalid for 1.25
[02:47] but it is invalid for 2.0
[02:47] correct
[02:47] yes.
[02:48] all it did was remove the targeted juju project
[02:48] left it targeted to juju-core
[02:48] Bug #1610880 changed: Downloading container templates fails in manual environment
[02:48] you can tell it's only 1.25 from reading the logs
[02:49] yes, of course. I hadn't really thought that part through :)
[02:50] i've removed juju-core and left for 1.25
[02:50] I added a note as to why it's not applicable to 2.0 :)
[02:50] Hi, is there a way to tear down a model from the gui?
[02:50] generally if the bug is in 1.25, we also keep it to be fixed in 2.0
[02:50] wallyworld: updates in http://reviews.vapour.ws/r/5510/
[02:50] however, in this case, lxc is not in 2.0; so bug is invalid
[02:50] looking
[02:50] from command line I can go 'juju destroy-model blah'
[02:50] natefinch: thnx :D
[02:50] anastasiamac: correct
[02:51] wallyworld: m glad that u r agreeing \o/
[02:51] that it is 1.25 only? yes :-)
[02:52] thumper: it's possible that the "should be terminated" is coming from the pkill issued by DestroyController, and the SIGABRT stack trace is due to the killall run by the CI script
[02:52] wallyworld: and to a new world order, no? :D
[02:52] axw: hmm... I'll poke around
[02:52] thumper: so maybe it's just a case of the agent not shutting down fast enough
[02:52] that too, peace in the middle east and all that
[02:54] axw: actually... you may well be right
[02:55] perhaps we need to make the destroy controller method on the manual provider wait until the process has been removed
[02:56] axw: I think this is the most likely case, and what I'll try for first
[02:56] thumper: maybe, could be that the agent is wedged though? we wouldn't normally care, because the controller machine gets destroyed in cloud environments
[02:56] the agent will hang around for a while to answer current api calls
[02:56] but no, don't think it is fully wedged.
[02:56] but I'll look too
[02:57] still think it is worthwhile waiting
[02:57] thumper: hmmk. seems reasonable to wait, yeah
[03:06] redir: really close, got a fix it then ship it, let me know if anything is unclear
[03:12] wallyworld: k. tc
[03:12] tx even
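The wait that thumper and axw converge on above (make the manual provider's DestroyController block until the agent process is really gone) could look roughly like this — a sketch of the polling idea only, shelling out to pgrep purely for illustration, not the actual provider code:

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// waitForProcessGone polls until no process matches name, or the timeout
// expires. pgrep exits non-zero when it finds nothing, which we treat as
// "the agent has shut down".
func waitForProcessGone(name string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if err := exec.Command("pgrep", "-f", name).Run(); err != nil {
			return nil
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("%q still running after %v", name, timeout)
}

func main() {
	fmt.Println(waitForProcessGone("jujud", 30*time.Second))
}
```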
[03:14] wallyworld: hmm...
[03:14] things that make you go
[03:14] wallyworld: trying to bootstrap a manual provider in lxd
[03:15] ERROR failed to bootstrap model: model "controller" of type manual does not support instances running on "amd64"
[03:15] wat?
[03:15] thumper: damn, that's fallout from anastasia's changes for fixing simplestreams issues, you running from tip?
[03:16] yep
[03:16] ok, will need to be fixed
[03:16] trying to look at the manual provider leaving stuff behind
[03:16] but can't bootstrap
[03:16] I don't mind fixing if I can be told what needs fixing
[03:16] you can comment out the error
[03:17] huh?
[03:17] it's a bit complicated, image metadata has been reworked, so different rules to figure out what can be bootstrapped
[03:17] comment out the error return
[03:17] ie don't do the check - that will get you going so you can do your fix
[03:18] where is that?
[03:19] don't commit that change of course
[03:19] validateUploadAllowed
[03:19] environs/bootstrap/tools.go
[03:19] ack
[03:20] wallyworld: that'll be the s390x manual bootstrap bug too
[03:20] ah, yes, could be
[03:21] i'll talk to her when she's back online
[03:24] So Postgres 9.6 tl;dr notes: parallel query scans, joins, and aggregates, incremental vacuum freeze, synchronous replication with multiple standbys, 10 will start a new version scheme (think firefox/chrome)
[03:24] other than that mostly lots of talk about the uber paper.
[03:25] redir: oh?
[03:26] uber: PostgreSQL -> Some schemaless object store thing on MySQL
[03:26] Apparently this has caused some hubbub https://eng.uber.com/mysql-migration/
[03:27] thumper: live from the East Bay Postgres meetup at Pandora...
[03:29] Time to go be social at the pub, getting kicked soon.
[03:39] wallyworld: http://reviews.vapour.ws/r/5530/ when you have a chance later. I haven't yet updated what we talked about earlier.
[03:40] wallyworld: also ignore the bits from the other PR, this is stacked on that so it has those issues too.
[03:40] later juju-dev
[03:40] ty
[03:41] axw: in my local lxd testing, it took two to three seconds from the time kill-controller had exited to the time the jujud agent stopped running
[03:43] thumper: sounds about right. in the CI failure it's still running 10 minutes later...
[03:44] 10 minutes?
[03:44] I thought it was much sooner than that
[03:44] * thumper double checks
[03:44] thumper: terminationworker says to terminate at 3:41, then the SIGABRT stack trace comes at about 3:50
[03:44] 3:51 actually, I guess the CI script is waiting 10 minutes
[03:45] anastasiamac: did you see there's fallout with the arch / image stuff and manual provider with lxd?
[03:46] wallyworld: the one that axw and i discussed and he has fixed (and i lgtm-ed) on his branch?
[03:46] wallyworld: I've got a fix for manual in my branch, can't land because master is blocked
[03:46] axw: i'd say make it a blocker and use $$fixes$$
[03:46] axw: i think u can land ur branch
[03:46] wallyworld: is there a bug #?
[03:46] axw: if u do not have a bug, jfdi
[03:46] anastasiamac: thanks, didn't see there's a fix, thumper ran into it before
[03:46] axw: no... wasn't 10 minutes
[03:47] ok
[03:47] was almost immediate
[03:47] AFAICT
[03:48] thumper: http://paste.ubuntu.com/23087267/
[03:49] from http://reports.vapour.ws/releases/4301/job/manual-deploy-precise-amd64/attempt/4018
[03:49] on attempt 4017 it was more immediate
[03:51] axw: that timing doesn't match the log outputs at all from 4018
[03:51] kill was here: 02:34:05
[03:51] thumper: I think the logs are appended to
[03:51] thumper: I'm probably looking at something old
[03:51] that test log output is now in local time
[03:52] but still
[03:52] thumper: I was looking from the top of the log, it looks like there's multiple test runs in the same log file
[03:52] searching from the back, I concur that it's immediate
[03:53] ok... good :)
[03:54] thumper: though there *is* a very slow one at the top of the log, so it's not consistent
[03:55] one problem at a time :)
[04:09] wallyworld: in your lxd PR, there's another target var lower down. I didn't realise that it exited early if the alias exists - maybe the check is still needed? does CopyImage return immediately if the image is already there?
[04:10] axw: in my testing, i deleted all lxd images. bootstrap the first time downloaded the image (slowly, with progress shown). then another bootstrap did not
[04:11] and lxc image list shows the one image
[04:11] wallyworld: but that might be because there's still another call to GetAlias
[04:11] the instance started immediately though
[04:12] so it's using whatever it cached the first time
[04:12] i can't see any obvious difference in behaviour
[04:12] wallyworld: no, I'm just saying there's still another call to GetAlias that looks like it should be removed. but I'm not sure of the impact.
[04:13] oh, i misunderstood you. i saw that call too but didn't follow what it did so left it
[04:25] i've seen a few cpu/mem spike related bugs... if i were a memory leak in juju, where would i be? :D
[05:06] axw, menn0: http://reviews.vapour.ws/r/5532/
[05:10] thumper: LGTM, thanks
[05:24] thumper: double ship it :)
[05:29] axw: i've got a few very small reviews up if you get a chance later. one is the lxd one which seems ok to me given it behaves as expected when testing
[05:30] wallyworld: sure, just finishing up QA for my add-model changes
[05:30] no worries
[05:36] wallyworld: add-model changes: http://reviews.vapour.ws/r/5534/
[05:36] looking
[05:42] wallyworld: I'm QAing your lxd branch, and bootstrap is fetching images that I have again. possibly due to that code removal
[05:42] hmmm, it didn't fetch mine again
[05:42] but i started from a clean slate
[05:42] what are your aliases?
[05:42] ubuntu-xenial etc?
[05:43] wallyworld: yep
[05:43] I have ubuntu-xenial
[05:43] hmmm, ok, i'll bootstrap again and see what it does
[05:43] nfi why it doesn't work for you
[05:43] wallyworld: yep, I put the code back in and it doesn't do it now
[05:44] wallyworld: possibly once it has the image again, it wouldn't copy again
[05:44] yeah, that's what i was thinking
[05:44] there might be some implicit alias or something
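For reference, the two LXD behaviours being compared in this thread, with GetAlias/CopyImage as stand-in names following the chat rather than the real lxd client library API:

```go
// Package imagecache sketches the old and new LXD image handling debated above.
package imagecache

// imageClient is a stand-in for the LXD client wrapper under discussion.
type imageClient interface {
	GetAlias(alias string) (fingerprint string, ok bool)
	CopyImage(source, alias string) error
}

// ensureImageOld is the removed behaviour: if a local image already
// carries the alias, trust it and never refresh - so stale images are
// kept forever.
func ensureImageOld(c imageClient, alias string) error {
	if _, ok := c.GetAlias(alias); ok {
		return nil
	}
	return c.CopyImage("https://cloud-images.ubuntu.com/releases", alias)
}

// ensureImageNew always delegates to CopyImage, which no-ops when the
// local image still matches the source and re-fetches when it doesn't -
// the auto-update behaviour wanted here, at the cost of one forced
// refresh for users whose existing alias no longer matches.
func ensureImageNew(c imageClient, alias string) error {
	return c.CopyImage("https://cloud-images.ubuntu.com/releases", alias)
}
```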
[05:45] which i think is ok behaviour - so long as it only fetches once. stephane was adamant we should be doing it this way or else auto update would not work
[05:50] wallyworld: it's probably fine, just doing one last test to satisfy myself
[05:50] sounds good, best to be sure
[05:50] i'm testing again too
[05:50] but download is sloooooooooow
[05:59] wallyworld: so, I think the issue is that the local alias I had did not match the image that was in the source
[05:59] wallyworld: so it replaced it
[05:59] yeah, whereas before maybe we were setting the alias name
[05:59] wallyworld: if you were to put that GetAlias code back in, people could continue using their existing images... but I guess they wouldn't auto-update
[05:59] that's my understanding
[05:59] and we want auto update
[06:00] wallyworld: ok. seems fine, maybe just add a release note that it will force an image refresh on everyone?
[06:00] sure
[06:00] axw: in the add-model / cloud branch - i've just started looking - do we reject add-model cloud where the controller doesn't support the cloud asked for?
[06:01] wallyworld: it will complain that "foo" is not a cloud or a region
[06:01] wallyworld: because you can't add clouds to a controller, the only cloud it'll find is the one that was bootstrapped
[06:01] wallyworld: I did test that actually, just didn't add in the QA steps
[06:01] will do that now
[06:01] ta, that would be good as i was wondering
[06:02] wallyworld: updated steps under LXD
[06:02] great ty
[06:02] axw: and you can +1 the lxd pr?
[06:03] wallyworld: sorry yes
[06:03] done
[06:03] not sure if i should land before beta
[06:04] might be good to get auto update fixed
[06:06] axw: "is neither a cloud nor a region". i don't like that message because aws is a cloud. it's just not supported by the current controller. so people will get confused by the message i think?
[06:07] wallyworld: well, it's not a cloud so far as the controller is concerned
[06:07] wallyworld: I agree it's a sucky message
[06:07] wallyworld: I guess we could look in the client's list of clouds first?
[06:07] could we rephrase to say that this controller doesn't support models on cloud "aws", only clouds "lxd" are supported
[06:08] yeah, look at client clouds, and if it is a valid cloud name, be smart about the message
[06:08] "... are supported by this controller"
[06:08] or something
[06:09] wallyworld: we don't have an API to list clouds yet. I suppose I could add it
[06:09] axw: that's one of the things martin asked for
[06:09] so it won't go to waste
[06:09] and we have the cloud facade
[06:09] wallyworld: yeah, was trying to keep this minimal. shouldn't take too long tho
[06:10] understood
[06:10] but the message sucks :-)
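One possible phrasing of the friendlier rejection wallyworld is asking for, assuming the client can fetch the controller's cloud list first; the helper is hypothetical, not juju's code:

```go
// Package clouderr sketches the improved add-model error message.
package clouderr

import (
	"fmt"
	"strings"
)

// unsupportedCloudError names the rejected cloud and lists what the
// controller actually supports, instead of "is neither a cloud nor a region".
func unsupportedCloudError(name string, supported []string) error {
	return fmt.Errorf(
		"controller does not support models on cloud %q; supported clouds: %s",
		name, strings.Join(supported, ", "))
}
```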
[06:22] axw: gotta duck out to do school pickup, but one last question - on the apiserver side where an unsupported cloud is passed in - it returns an annotated not found error but i think we can do better with the error message there also
[06:22] "such and such cloud is not supported, try one of these instead" type thing
[06:22] wallyworld: where's that?
[06:22] wallyworld: "getting cloud definition" ?
[06:23] yeah
[06:23] didn't matter before but now that we are allowing people to specify the cloud themselves
[06:23] need to tighten it up IMO
[06:23] wallyworld: isn't that redundant if we have the client query the supported clouds?
[06:24] that's in add-model, what about via the api
[06:24] python juju client, controller proxy etc
[06:25] hmm I guess so
[06:25] auto pilot, conjure up etc - they all use the api
[06:25] and in conjure up, someone could easily specify an unsupported cloud
[06:26] gotta run, bbiab, got to update release notes at some point
[06:45] axw: ping - larry shared his vsphere setup which i think you've used recently. Did you have any problems bootstrapping? For me it doesn't complete using beta15.
[06:46] frobware: hey. I didn't get past authentication. I think the issue I was seeing was that the client downloads the cloud image and then uploads it to vsphere. I'm quite far away, so that was so slow it timed out
[06:46] frobware: sorry I mean, it authenticated but didn't get any further (functionally) than that
[06:47] axw: I get as far as... https://pastebin.canonical.com/163942/
[06:48] axw: lines 75 & 76 repeat until timeout
[06:48] axw: I have never bootstrapped on vsphere before so could be operator error too
[06:48] frobware: ah, well you got further than me :p sorry, I don't know what's up with it. I've never used vsphere before that one time
[06:48] and I was just verifying that my auth changes were good
[06:51] axw: the only addition I made to the cloud definition was adding to clouds.yaml: vsphere: regions: dc0 {}
[06:51] axw: which was largely done based on a bug comment I think you made... somewhere... :)
[06:52] frobware: if the issue was with clouds.yaml, it would have failed much earlier
[06:52] I don't think it's user error
[06:52] more likely the provider or vsphere is broken
[06:53] axw: which it did. could not bootstrap because 'datacenter' was undef
[06:53] axw: I'll try going back to beta8 as that's where the bug was reported, but largely to see if bootstrap has regressed since.
[06:53] frobware: oh I see what you mean. yeah, larry's original clouds.yaml was broken
[06:53] oh
[06:54] frobware: this is what I've got: https://pastebin.canonical.com/163945/
[06:54] axw: you mean it was broken and needed the regions bit?
[06:55] yep
[06:55] frobware: well, and he was trying to use non-standard keys. that one I linked is in the valid format
[06:56] axw: this is what I'm currently using: https://pastebin.canonical.com/163946/
[06:56] frobware: yep that's fine
[06:56] auth-types is unnecessary but won't cause a problem
[06:57] wallyworld: how's this? https://pastebin.canonical.com/163947/
[06:57] looking
[06:58] axw: yay, much nicer, thank you
[06:58] wallyworld: cool. just gotta write some tests, and improve error messages on the server side now
[06:58] sgtm
[07:01] axw: you could potentially make the add-model cmd dumb and not do any checks and allow them all to be done on the server side
[07:02] since you need to make an api call to list clouds anyway
[07:02] you could avoid that call
[07:02] and just make the create model call
[07:02] wallyworld: thought about it, but that makes the cloud/region unstructured which I'm not too keen on
[07:02] you could still split on /
[07:02] wallyworld: this way we may also support auto-upload of cloud definition, if we want to do that
[07:03] wallyworld: sure, but you still don't know if it's cloud or region if there is no /
[07:03] true
[07:03] ok, ignore me, just thinking out loud
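The split-on-"/" parsing being debated, and why it can't fully resolve the ambiguity client-side — a minimal sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// parseCloudRegion splits an add-model style argument on "/". The point
// made above: a bare name is ambiguous - it may be a cloud or a region
// of the controller's cloud - so the client still has to ask the
// controller which clouds it knows about.
func parseCloudRegion(arg string) (cloud, region string) {
	if i := strings.Index(arg, "/"); i >= 0 {
		return arg[:i], arg[i+1:]
	}
	return arg, "" // cloud name or bare region: can't tell locally
}

func main() {
	fmt.Println(parseCloudRegion("aws/us-east-1")) // aws us-east-1
	fmt.Println(parseCloudRegion("aws"))           // aws "" - ambiguous
}
```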
[07:17] wallyworld: hey
[07:17] wallyworld: still hanging around?
[07:18] maybe
[07:19] thumper: what's up?
[07:19] pretty sure bug 1615839 is that bit you got me to comment out
[07:19] Bug #1615839: Manual-provider claims s390x is not supported
[07:19] is anastasiamac on that?
[07:20] or shall I take a look?
[07:20] might take me longer
[07:20] thumper: i think axw landed a driveby
[07:20] ot has one in train
[07:20] or
[07:20] but I could muddle through it
[07:20] all good, we broke it, we fix it
[07:20] thanks for offering
[07:20] who shall I assign the card and bug to?
[07:21] check that axw is/has done it, otherwise to anastasia
[07:21] axw: have you fixed it?
[07:22] is hudson back?
[07:24] redir: late for you, go to bed :-)
[07:24] yeah just got home and eating something, then bed
[07:25] who knew postgres folks were such talkers :)
[07:35] thumper: sorry was on school run, it should be fixed by my latest merge, have marked Fixed Committed
[07:35] axw: ok, cool
[07:35] what was the fix by the way?
[07:57] wallyworld: updated my PR, PTAL
[07:57] looking
[08:03] axw: looks great, ty
[08:05] axw: when it lands, let urulama and mhilton know as they've started to need the Clouds() API and are assuming a return of []string whereas we are offering a map of cloud details
[08:06] wallyworld: sure
[08:07] wallyworld: gonna have to get a second review, this is >500
[08:07] I'll point martin at it, maybe he'll be willing :)
[08:07] hmmm
[08:07] stupid rule
[08:11] wallyworld: well, the rule would not bite if the PRs are manageable
[08:12] >500 is not manageable for any reviewer
[08:12] disagree
[08:12] depends on the type of change
[08:12] of course u do
[08:12] and who's reviewing
[08:12] we had a much larger limit in launchpad
[08:12] 800
[08:12] 500 is too small
[08:12] no, usually only the dev knows what they wrote for a PR >500
[08:13] not just dev
[08:13] i know what's in that pr and i didn't write it
[08:13] u r very specail
[08:13] special*
[08:15] axw: off to make dinner, updated the pr, thanks for reviewing
[08:15] wallyworld: will look in a sec. I'm reviewing your show-user one now
[08:17] * frobware is back in ~1 hour
[08:27] wallyworld, axw: what have you done to my API design!
[08:36] mhilton: we needed more than just the cloud names :-) you get the names as the map keys
[08:37] and also you have allowable regions etc which are really useful for the gui
[08:38] wallyworld: It's fine, I'm curious. We were getting the regions from the Cloud() endpoint. but doing it all in one go is probably better.
[08:39] mhilton: yeah, we think so, one call to get all the info you need
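The Clouds() shape that surprised mhilton, approximately; field names here are guesses for illustration, not the exact apiserver/params structs:

```go
// Package cloudapi sketches the map-of-cloud-details payload described above.
package cloudapi

type Region struct {
	Name     string `json:"name"`
	Endpoint string `json:"endpoint,omitempty"`
}

type Cloud struct {
	Type      string   `json:"type"`
	AuthTypes []string `json:"auth-types,omitempty"`
	Regions   []Region `json:"regions,omitempty"`
}

// CloudsResult gives callers the cloud names (the map keys) plus regions
// and auth types in a single call, instead of a bare []string of names
// followed by one Cloud() call per name.
type CloudsResult struct {
	Clouds map[string]Cloud `json:"clouds"`
}
```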
[08:50] rogpeppe1: thanks for review, did you just want to check my answer to your question in the review http://reviews.vapour.ws/r/5533/
[09:09] wallyworld: i've just published a review of http://reviews.vapour.ws/r/5533/
[09:09] ta
[09:09] wallyworld: weird, i didn't think i'd published anything until now...
[09:10] wallyworld: i think you were maybe talking about axw's question
[09:10] oh dear, i was
[09:10] wallyworld: i was wondering about external user access too, although i forgot to mention it in my review
[09:11] this review isn't about any of that
[09:11] wallyworld: i think if we left the access field out entirely, things would become more obvious
[09:11] it's all already been done
[09:11] the access field is also pre existing so i don't really want to move it
[09:11] this review is just about properly looking up external user access
[09:11] wallyworld: the problem is that it *looks* as if the access field tells you what access rights a given user has
[09:12] it does
[09:12] wallyworld: but actually it doesn't tell you that
[09:12] why?
[09:12] it tells you what access that user has to the controller
[09:12] wallyworld: because if access has been granted to everyone, everyone will have at least those rights
[09:13] wallyworld: and when we implement general group checking, that issue will become still worse
[09:13] oh, i see what you're saying. yes right now it just tells you what that user has specifically been granted
[09:13] groups will require a whole lot of change
[09:13] wallyworld: yup. i think that's misleading, and we'd be better off fixing things now.
[09:13] wallyworld: i.e. remove the Access field
[09:13] wallyworld: because that's the only problematic part
[09:13] wallyworld: otherwise it's just about looking up information about a specific user
[09:14] why can't we look up the access transitively and just fill in the access bit
[09:14] we need to give the access value back to the caller
[09:14] wallyworld: why does it need to be in the same API call?
[09:15] why not? for distributed systems you aim to minimise the api calls
[09:15] fewer bulk calls is the design goal
[09:15] wallyworld: that's an optimisation - i generally prefer to start by being as clear as possible and optimise later
[09:16] we disagree there, i remember this discussion when juju's api was first being designed
[09:16] wallyworld: so how many bulk calls are actually being used as bulk calls now? :)
[09:16] mhilton: heh sorry :p but yeah, one query lets us get all the stuff we want in one go, rather than getting names and then calling Cloud a bunch of times
[09:16] wallyworld: POitRoAE... still!
[09:16] i can't answer that - i don't know what api clients people have written
[09:17] wallyworld: juju is the only api client for all the agent stuff
[09:17] i can imagine landscape etc would use bulk calls
[09:17] and the gui certainly *should*
[09:17] wallyworld: and bulk calls are used approximately zero times
[09:18] then they're doing it wrong if that's the case
[09:18] wallyworld: no. mostly you do only have one thing to do at a time
[09:18] wallyworld: and this isn't HTTP
[09:18] more's the pity
[09:18] it should be restful
[09:18] but that's another discussion
[09:19] wallyworld: HTTP1 is bad because the calls are expensive and cannot return replies out of order
[09:19] wallyworld: the RPC API doesn't have that limitation
[09:19] wallyworld: the overhead of a call is small
[09:19] sure, but HTTP1 is so last century
[09:19] wallyworld: we still only use HTTP1
[09:19] why is that out of curiosity?
[09:20] besides that Go is stuck in the 70s :-)
[09:20] wallyworld: for all the bulk calls we have, making several calls concurrently is faster than actually using the bulk call as a bulk call
[09:20] wallyworld: because we make our own http transport
[09:20] wallyworld: otherwise we'd get HTTP2 out of the box
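For context, the bulk-call convention this whole debate is about, in the spirit of juju's apiserver/params types (simplified — for instance, the real error field is a struct, not a string):

```go
// Package params sketches juju's bulk-call request/response convention.
package params

type Entity struct {
	Tag string `json:"tag"`
}

// Entities is the standard bulk request: a slice, even when a caller
// only has one thing to do.
type Entities struct {
	Entities []Entity `json:"entities"`
}

type ErrorResult struct {
	Error *string `json:"error,omitempty"` // nil means success
}

// ErrorResults holds one result per input entity, in order; a singular
// caller wraps its one tag in Entities and reads Results[0].
type ErrorResults struct {
	Results []ErrorResult `json:"results"`
}
```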
[09:21] hmmm, i'd like to see the numbers. may be fast for some things, but chatty api calls are evil for a distributed system
[09:21] that's why we should just stick to standards instead of rolling our own
[09:21] wallyworld: tell that to google, amazon, heroku, etc etc
[09:21] wallyworld: they all use RPC systems
[09:21] i've seen way too many distributed systems fall over due to inefficient apis
[09:21] wallyworld: and none of them have a "bulk calls only" policy
[09:21] wallyworld: HTTP APIs, right?
[09:22] rpc
[09:22] wallyworld: if a system is inefficient, optimise it
[09:22] wallyworld: it's not a hard thing to do
[09:22] easier said than done once the apis are set
[09:22] wallyworld: that's why we have versioning
[09:22] you can use a bulk call singularly but not the other way around
[09:23] and versioning is horrible for us to try and use
[09:23] each time we rev a facade version it introduces a world of hurt
[09:24] anyway, i need to change the access look up to take account of the everyone group
[09:24] regardless of the api design
[09:24] wallyworld: you can make many singular calls concurrently
[09:24] wallyworld: there really is very little overhead in doing so
[09:24] at the cost of many network resources
[09:24] wallyworld: no
[09:24] wallyworld: at the cost of *some* extra bandwidth
[09:24] wallyworld: but much less than you'd think
[09:25] do we really have bandwidth issues?
[09:25] frobware: not AFAIK
[09:25] bandwidth is a finite resource
[09:25] wallyworld: so are developer resources
[09:25] try living in australia
[09:26] the point being?
[09:26] where there's latency, bulk calls are much better
[09:26] wallyworld: the point being that we've expended 1000s of hours of extra effort making every call "bulk" and we never use that capability
[09:26] wallyworld: actually no
[09:26] what extra effort?
[09:27] we designed the api once
[09:27] and we do use it? how do you know we don't?
[09:27] have you audited every external juju api client?
[09:27] wallyworld: because most of the entry points in the api package don't even expose the bulk functionality
[09:27] wallyworld: i'm talking about the agent API here
[09:28] wallyworld: because that's easily checked
[09:28] the juju api layer only exposes singular calls, but python juju client, conjure up, etc etc don't use that
[09:29] wallyworld: testing and implementing a bulk API call is probably 5 times more effort than a single one
[09:29] wot?
[09:29] i don't agree with that
[09:30] i don't find it any different
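rogpeppe's "many singular calls concurrently" pattern, sketched with plain goroutines — replies are consumed as they arrive, so total wall time is about one round trip rather than N:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetch stands in for one singular API call.
func fetch(id int) string {
	time.Sleep(50 * time.Millisecond) // simulated round trip
	return fmt.Sprintf("result-%d", id)
}

func main() {
	var wg sync.WaitGroup
	results := make(chan string)
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			results <- fetch(id)
		}(i)
	}
	go func() { wg.Wait(); close(results) }()
	for r := range results {
		fmt.Println(r) // handled as each reply lands, possibly out of order
	}
}
```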
[09:30] wallyworld: there are lots more edge cases to be tested
[09:30] wallyworld: zero, one, many, all error, some errors, etc
[09:30] that is true, but once you get standard patterns in place, it falls out pretty easily
[09:31] wallyworld: BTW when there's latency, concurrent calls are better because you can get replies out of order and start dealing with them sooner, rather than waiting for all replies at once
[09:31] i wonder why there's such stark difference of opinion here
[09:31] wallyworld: you mean you copy and paste
[09:31] no, we have helper structs also
[09:31] like our errors
[09:32] % find api apiserver -name '*.go' | xargs cat | wc
[09:32] 125883 386115 3838814
[09:33] wallyworld: in a very few cases, you can have helper structs. but if you're returning actual results, you can't use 'em
[09:33] wallyworld: and you still need to have all those test cases
[09:33] sure
[09:33] not hard though
[09:33] wallyworld: it all adds up
[09:33] wallyworld: our api code is *huge*
[09:33] wallyworld: and it's mostly noise
[09:33] lol
[09:34] Go code is mostly noise :-P
[09:34] so much boilerplate and copy and paste
[09:34] wallyworld: you're writing it wrong then
[09:34] due to not having generics etc etc
[09:34] so all those sort functions are wrong
[09:34] different ones for int vs string etc
[09:34] wallyworld: very little of what we're doing in the API could be made better with generics
[09:35] no, i was making a point about the fact that you criticised our api for cut and paste when the language itself is just as bad :-)
[09:35] wallyworld: i honestly don't see that much copy and paste in decent Go code
[09:36] we have so many cut and paste functions for "is this string in this slice"
[09:36] etc
[09:36] wallyworld: even implementing sort only involves copying and pasting two lines
[09:36] and each time i have to do it i die a little inside
[09:36] no other language i've used makes you do that
[09:36] wallyworld: i guess you've never used C then
[09:37] not for years
[09:37] luckily
[09:37] bbiab, SIGWIFE
[09:55] Bug #1616832 opened: manual environment juju-db timeout
[10:57] * fwereade bbl
[11:04] dimitern: ping - I tried your patch but it didn't work for me.
[11:05] frobware: oh, what was wrong?
[11:05] dimitern: just trying to repro again to ensure it's all true...
[11:06] frobware: ok
[11:06] dimitern: but it was essentially the same problem that ivoks ran into initially
[11:07] frobware: DNS hostname (resolved) != PUBLIC ADDRESS in status?
[11:09] dimitern: double-checking. have too many pots on the go.
[11:31] Bug #1616832 changed: manual environment juju-db timeout
[11:55] juju server certs need unique serial numbers, it seems. this PR adds them. small PR, review appreciated :) https://github.com/juju/juju/pull/6100
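What "unique serial numbers" amounts to in practice: draw a random 128-bit serial per certificate instead of the zero value a bare new(big.Int) yields. A sketch of the standard approach with crypto/rand, not the PR's exact code:

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// newSerialNumber returns a uniformly random serial in [0, 2^128),
// the usual way to give each x509 certificate a unique serial.
func newSerialNumber() (*big.Int, error) {
	limit := new(big.Int).Lsh(big.NewInt(1), 128) // 2^128
	serial, err := rand.Int(rand.Reader, limit)
	if err != nil {
		return nil, fmt.Errorf("generating serial number: %v", err)
	}
	return serial, nil
}

func main() {
	serial, err := newSerialNumber()
	if err != nil {
		panic(err)
	}
	fmt.Println(serial)
}
```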
[12:14] please could someone give this small PR a review? (i need review from someone in -core), as it's blocking us right now: http://reviews.vapour.ws/r/5538/
[12:18] dimitern, frobware: ^
[12:18] rogpeppe1: looking
[12:18] frobware: thanks
[12:20] rogpeppe1: can you please list some QA steps
[12:20] frobware: ah, ok, sure
[12:21] frobware: done
[12:22] rogpeppe1: without your change you cannot connect?
[12:23] frobware: no, there's no externally visible behaviour change
[12:23] frobware: except, i guess, that if you use a browser to connect, it can do so
[12:23] frobware: hmm, maybe the QA steps could specify that i guess
=== plars_ is now known as plars
[12:24] rogpeppe1: right - I wanted to look at before and after
[12:24] frobware: let me check how they've been doing it - it involves creating a new CA key, adding its cert to your browser, and making a websocket connection to the API
[12:25] frobware: actually, even that won't quite check it. i think you need to do that with two controllers.
[12:27] frobware: oh yes, the controllers need to be bootstrapped using the new CA key
=== mup_ is now known as mup
[12:42] frobware, http://reviews.vapour.ws/r/5539/ is up if you have a moment :)
[12:42] and now I'm going to have some lunch, ping me if you need me and I'll catch up when I'm back
[12:43] fwereade: ack
[12:45] anastasiamac: when you deleted the juju-core task, but left a 1.25 task on bug 1616832, you removed the bug from search. All bugs that affect a project series must also affect the project.
[12:45] Bug #1616832: manual environment juju-db timeout
[12:49] Bug #1616832 opened: manual environment juju-db timeout
[12:50] frobware: i replied to your question
[12:51] rogpeppe1: ok, will take a look. looking at the other review atm
[12:51] frobware: as we're the ones affected, perhaps we should do the QA
[12:52] rogpeppe1: I think so.
[12:52] frobware: ok, cool. in that case your LGTM would be much appreciated.
[13:13] rogpeppe1: I dropped another question on the review
[13:16] frobware: i don't get which line 170 you're talking about
[13:17] the only new(big.Int) we're now using is on cert.go:133
[13:17] frobware: is that the one you're referring to?
[13:18] frobware: otherwise i'm seeing cert_test.go:170 is expiry, err := time.Parse("2006-01-02 15:04:05.999999999 -0700 MST", "2012-11-28 15:53:57 +0100 CET")
[13:18] frobware: and cert.go:170 is if !ok {
[13:19] rogpeppe1: ah, sorry. line 170 was in the old file.
[13:20] frobware: so we're not using new(big.Int) there any more
[13:20] rogpeppe1: dropped
[13:20] frobware: ok, ta
[13:25] dimitern, fwereade: Could you look at this? http://reviews.vapour.ws/r/5540/
[13:26] babbageclunk: looking
[13:26] babbageclunk: looking (I'm OCR)
[13:26] It's the machine undertaker worker with tests. (Also I worked out how to get it onto RB, since the bot wasn't helping!)
[13:26] dimitern, frobware: Thanks!
[13:27] babbageclunk: one quick question, why did the pattern of s.waitRemoved() & s.waitForRemovalMark() calls appear to now be the other way around?
[13:28] frobware: I've changed the provisioner not to remove machines anymore, so the tests can't wait for the machine to be removed.
[13:29] frobware: Instead they wait for it to be marked for removal.
[13:31] frobware: I'm not sure if that was quite what you were asking.
[13:32] babbageclunk: yes
[13:32] babbageclunk: thx
[13:33] frobware: cool cool. I saw something weird while testing manually, just seeing if I can reproduce it.
[13:34] frobware, dimitern: MAAS was giving the error "node with this hostname already exists" if I tried to create containers on two hosts at the same time.
[13:35] babbageclunk: but goes away if done in series?
[13:36] frobware: That was what it seemed like - trying to reproduce it now.
[13:36] frobware: (Most importantly, trying to reproduce it on upstream/master)
[14:01] dimitern: frobware: standup time
[14:02] katco: omw
[14:02] frobware: Hmm, can't reproduce it on master or my branch now.
[14:02] omw
[14:10] Hi. I have a question. While selecting components from landscape UI for autopilot openstack deployment, can we set external configuration parameters for a particular component?
[14:16] ram____: you'd do better to ask in #juju
[14:21] frobware: sorry, didn't mean to overlap
[14:21] katco: not a problem
[14:21] natefinch: Ok. thank you.
[15:10] mgz: ping?
[15:11] heya
[15:11] I'm trying to investigate this: https://bugs.launchpad.net/juju/+bug/1606308
[15:11] Bug #1606308: Restore cannot initiate replica set
[15:12] I can't really remember how to go about running the CI tests.
[15:13] babbageclunk: so following the links through to a recent failure gives you the rough outline
[15:13] this is a test we run on aws, so it's pretty easy
[15:13] want to do a ho or something quickly to go over?
[15:14] yeah, that would be brilliant
[15:14] babbageclunk: okay, I am in the meeting named core
[15:28] fwereade: I'm looking at this: https://bugs.launchpad.net/juju-core/+bug/1485784
[15:28] Bug #1485784: Error creating container juju-trusty-lxc-template; Failed to parse config
[15:29] fwereade: actually, sorry, wrong link, this one: https://bugs.launchpad.net/juju-core/1.25/+bug/1610880
[15:29] Bug #1610880: Downloading container templates fails in manual environment
[15:29] though they're similar
[15:30] fwereade: we're running lxc-create and trying to download the lxc image from the server - lxc-create [-n juju-trusty-lxc-template -t ubuntu-cloud -f /var/lib/juju/containers/juju-trusty-lxc-template/lxc.conf -- --debug --userdata /var/lib/juju/containers/juju-trusty-lxc-template/cloud-init --hostid juju-trusty-lxc-template -r trusty -T https://10.2.0.186:17070/environment/80234a11-2d53-436e-855c-da998c76d6ca/images/lxc/trusty/amd64/ubuntu-14.04-server-cloudimg-amd64-root.tar.gz]
[15:30] That's getting a cert error
[15:31] I notice that I get a similar error if I just try to curl that URL
[15:32] babbageclunk: sorry, been sidetracked. I wanted to try your changes
[15:33] natefinch, I don't have any immediate insight I'm afraid :(
[15:33] fwereade: np
[15:39] frobware: no worries
[15:39] mgz: oops, I guess firefox gave up? I think that's enough to go on with, thanks heaps!
[15:40] babbageclunk: yup, think it's pretty hung
[15:40] babbageclunk: no problems, yell if you need more
[15:40] mgz: I almost certainly will!
[15:43] frobware: If there are some issues that you want me to look at in the meantime you can put up a partial review while you're testing?
[15:44] babbageclunk: haven't got that far. :( still trying to get vsphere to boot whilst larry is about
[15:44] frobware: ok, fair enough :)
[15:45] frobware: I'm not blocked, so no stress
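An aside on the cert error natefinch hit with lxc-create above: the controller serves that image URL with a certificate chained to the model's private CA, so any client that doesn't trust that CA (curl, lxc-create's downloader) fails TLS verification. A sketch of what trusting it explicitly looks like in Go; caCertPEM is a placeholder, not a value from the bug:

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
)

// newCAClient builds an HTTP client that trusts a private CA - the
// missing piece when a plain client hits the controller's image URL.
func newCAClient(caCertPEM []byte) (*http.Client, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caCertPEM) {
		return nil, fmt.Errorf("could not parse CA certificate")
	}
	return &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{RootCAs: pool},
		},
	}, nil
}

func main() {
	fmt.Println("pass the model's CA PEM to newCAClient before fetching the image URL")
}
```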
[16:03] Hi. For testing purposes I developed a simple charm using a shell script to modify the cinder configuration file post-deployment of OpenStack. The cinder configuration is modified, but I saw some errors in the charm log. I pasted information about my issue: http://paste.openstack.org/show/563408/
[16:04] Please anyone provide me some solution.
[16:19] mgz: I'm getting a "no such file or directory" when it tries to call euca-terminate-instances - where should I get that from?
[16:20] anyone that has knowledge of migrations on this time zone?
[16:20] mgz, oh looks like euca2ools
[16:24] babbageclunk: hm, that really should be part of the deps
[16:24] but it should also be switched to boto....
[16:30] perrito666, voidspace but I don't see him online
[16:31] alexisb: he is travelling
[16:31] katco, thanks
[16:31] katco, there is a mail I need him to respond to as well
[16:31] I will loop you in for tracking purposes
[16:32] alexisb: he mentioned he might check-in. if so, i'll let him know
[17:12] natefinch: do you know how we've ended up with 5 packages under github.com/juju/juju/resource with no tests at all?
[17:13] natefinch: it makes life a bit awkward when refactoring in that area
[17:14] rogpeppe1: yeah.
[17:15] rogpeppe1: I think the only one with any substantial amount of code is the resourceadapters directory
[17:16] rogpeppe1: I don't have a good answer for that.
[17:22] Hi. I tried to deploy the "cinder-xtremio" charm in our local Juju openstack environment like $juju deploy cinder-xtremio. I was facing errors. pasted error log: http://paste.openstack.org/show/563432/. Please anyone provide me some solution for this.
[18:23] Hi. I want to develop a cinder-storagedriver charm. And i want to integrate it with Ubuntu-autopilot. So can I give input parameters like san IP, san user and san password from landscape autopilot UI? Otherwise we have to hardcode everything into the charm. And different users for the same storage array have different credentials.
[18:27] ram____: you might get a better response if you send an email to the juju mailing list
[18:28] ram____: most of us here work on the core code for juju itself, and we don't know much about the ubuntu autopilot code, or the openstack charms in general
[18:30] natefinch: Ok. thank you
=== mup_ is now known as mup
=== mup_ is now known as mup
[19:28] tych0: you around?
[20:02] tvansteenburgh: hey, have you made any progress on bug 1616574? still stuck?
[20:02] Bug #1616574: Can't deploy local charm via API
[20:07] katco: i haven't made progress
[20:08] tvansteenburgh: ok; did you have a look at the go code?
[20:08] if i'm bootstrapping and I see "2.0-beta16-xenial-amd64.tar.gz", does that mean i'm downloading the agent from simplestreams?
[20:08] sorry "Fetching agent 2.0-beta16-xenial-amd64.tar.gz" i mean
[20:09] cmars: should be, yeah
[20:09] natefinch, is there a way to get juju to pick up a jujud binary i've already built?
[20:09] natefinch, i can do --build-agent, but if i've already built it..
[20:10] cmars: it just has to be in your $PATH i believe. see wallyworld's email a week or so ago
[20:10] cmars: if you're bootstrapping with a built juju, it's supposed to automagically figure it out and do --upload-tools
[20:10] i've got jujud in my path, but it's not getting picked up
[20:10] so open a bug?
[20:10] it needs to be in the same dir as juju
[20:10] $GOPATH/bin usually
[20:10] ah
[20:11] yeah, i think it was
[20:11] but, i'll try again
[20:11] and it needs to match juju exactly in terms of version
[20:11] --show-log will have more info
[20:11] wallyworld, ok, thanks!
[20:11] it should all work, let me know if now and we can debug
[20:12] *not
[20:12] katco: i did, yes
[20:12] tvansteenburgh: i'm looking through and comparing the 2 now; did anything pop out?
[20:12] it's not critical, but it would shave a minute or two off our CI to not build twice
[20:13] tvansteenburgh: the logic i'm using is: if the juju binary can do this, there's no reason python-jujuclient shouldn't be able to as well. i.e. i don't think it's a fix on our end?
[20:13] tvansteenburgh: is there a flaw in that reasoning?
[20:13] katco: yes :)
[20:13] tvansteenburgh: haha
[20:13] what am i missing?
[20:14] katco: the logic i'm using is, this works with juju1 but not juju2
[20:15] katco: maybe it works with juju2 but there's another step or something, i dunno
[20:15] tvansteenburgh: there have been many many breaking api changes between the 2 versions
[20:15] katco: no, that's not the problem
[20:15] tvansteenburgh: but that's my point: i'm going to help you figure out what's wrong, but i don't know why this is targeted against the juju project and not python-jujuclient?
[20:17] katco: here's the thing. customer comes and says "how do i deploy a local charm using the juju2 api". i can't fix python-jujuclient until i know the answer to that.
[20:17] katco: so far no one has been able to tell me how to do it
[20:17] that's what the bug is for
[20:18] tvansteenburgh: ok, well let's get this figured out. the go code i pointed you at is how we do it, so we just have to figure out what the difference is
[20:33] katco: is it possible that the local charm should be uploaded to the controller and not the model?
[20:36] no, that didn't work either
[20:41] tvansteenburgh: if a customer came and asked how to deploy a local charm using the juju2 api, I'd say "don't"
[20:42] our API is not designed to be used directly by third parties. It's too granular and requires too much knowledge of the internal workings of juju.
[20:42] natefinch: well, that's why we supply libs to wrap that, which is what tvansteenburgh is trying to fix
[20:42] wallyworld, ah, i figured out how to force bootstrap to use jujud out of $PATH. i set the agent & image metadata url to localhost and streams to "nope"
[20:42] katco: yes, I get that
[20:43] wallyworld, that fails over to the "Preparing local Juju agent binary" case
[20:43] wallyworld, do you think that's expected behavior?
[20:44] (actually, i'm not sure i should have messed with image... i have no idea what i'm doing!)
[20:44] tvansteenburgh: i see your placement args are empty. placement.Scope must be the model UUID i think
[20:44] cmars: it will only use a local juju if it can't find any binaries in streams. setting the url like that will cause that search to fail
[20:45] cmars: we have beta16 binaries now, i bet your master source code still says beta16
[20:45] katco: cmars: maybe the issue is that your client is reporting beta16
[20:45] katco: thanks i'll try that
[20:45] wallyworld, ah! i bet that's it
[20:45] cmars: it has to report higher than that
[20:45] wallyworld, i think i'll keep it like this.. i want to test exactly what i've built
[20:45] wallyworld, kind of a hacked-up --agent-binary feature
[20:45] that will happen 99.999% of the time
[20:45] it's just we have an hour window just after a release
[20:46] where the source code is not yet updated to say beta+1
[20:46] * wallyworld needs coffee
[20:59] sinzui: k. thnx
[21:00] wallyworld: awake again?
[21:01] almost
[21:01] katco: no luck with that http://pastebin.ubuntu.com/23090486/
[21:04] katco: if i call the CharmInfo with the charm-url i also get a "charm not found" error back
[21:05] CharmInfo api i mean
[21:05] tvansteenburgh: here is the entire client-side call-chain serialized out, freshmen in cs101 failing miserably style: http://pastebin.ubuntu.com/23090503/
[21:06] tvansteenburgh: let me ponder your CharmInfo comment a moment
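The call chain katco pasted amounts to roughly the following. The path, query args (series, plus the schema/revision args noted just below), auth, and response fields are reconstructed from this conversation for illustration — they are not verified against the beta16 API:

```go
// Package charmupload sketches a local-charm upload against the juju 2.0 API.
package charmupload

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// uploadLocalCharm POSTs a charm zip to the model's charms endpoint and
// returns the charm URL the server assigns.
func uploadLocalCharm(c *http.Client, addr, modelUUID, series string, charmZip io.Reader) (string, error) {
	url := fmt.Sprintf("https://%s/model/%s/charms?series=%s&schema=local",
		addr, modelUUID, series)
	req, err := http.NewRequest("POST", url, charmZip)
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/zip")
	resp, err := c.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	var result struct {
		CharmURL string `json:"charm-url"`
		Error    string `json:"error"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return "", err
	}
	if result.Error != "" {
		return "", fmt.Errorf("charm upload: %s", result.Error)
	}
	return result.CharmURL, nil
}
```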
[21:07] tvansteenburgh: substitutions in that pastebin can be searched by triple "/" (e.g. ///)
[21:07] katco: cool, looking
[21:08] thumper, ping
[21:09] morning
[21:09] heya thumper
[21:09] morning thumper
[21:09] ping
[21:09] * thumper is munching on breakfast, been a busy morning
[21:09] kids off to ski trip
[21:09] nice
[21:10] thumper, do you mind joining a HO?
[21:10] sure
[21:10] I'll just be mute
[21:10] if you join now no need for you to be on release call
[21:10] https://hangouts.google.com/hangouts/_/canonical.com/bug-scrub
[21:10] thumper, ^^^
[21:11] * redir lunches
[21:20] tvansteenburgh: it looks like environment.py::add_local_charm is missing schema & revision queryargs
[21:21] tvansteenburgh: L65-67 in pastebin
[21:22] katco: i noticed that but figured i'd get an error back if they were needed. i'll try adding them though
[21:22] * katco doesn't know, is just pedantically going through the diff
[21:23] katco: after uploading my charm, and getting a url back, a call to the Charms.List api does not list my charm
[21:23] i'll try the extra args now
[21:23] tvansteenburgh: ok, so we're narrowing it down here at least ^.^
[21:23] tvansteenburgh: fyi i have a call in 7m which EOD me
[21:23] which will EOD me
[21:24] katco: ack
[21:25] katco: extra args didn't seem to make any difference
[21:26] tvansteenburgh: hm, can you verify that the lib's Charms.List call works at all? just so we know the scope of this problem?
[21:26] katco: yeah i have output from it
[21:27] it just doesn't include the charm i uploaded
[21:27] it does include other charms in the model thought
[21:27] though
[21:27] tvansteenburgh: k... so it has to be something with add_local_charm_dir down right?
[21:27] well i get a charm-url back from that, as if the upload was successful
[21:28] but then when i list via the api, it's not there
[21:28] can you manually verify it's in the environment? i.e. look in mongo?
[21:28] or use juju bin?
[21:29] katco: can the juju cli list apps that haven't been deployed?
[21:30] tvansteenburgh: it depends on if they're just "pending" or uploaded but not placed... i don't know which adding a charm does =/
[21:30] tvansteenburgh: i would just look in mongo to be sure
[21:31] katco: never done that, are there instructions somewhere? :D
[21:31] tvansteenburgh: yeah sec
[21:33] tvansteenburgh: https://lists.ubuntu.com/archives/juju-dev/2016-July/005772.html
[21:34] coool
[21:34] * tvansteenburgh tries
[21:38] * tvansteenburgh waits for mongo client to install
[21:44] katco: http://pastebin.ubuntu.com/23090623/
[21:47] tvansteenburgh: http://pastebin.ubuntu.com/23090638/
[21:47] tvansteenburgh: try this version
[21:49] katco: same :(
[21:49] is my mongo shell version ok?
[21:49] tvansteenburgh: oh, no... you need >= 3.2
[21:50] tvansteenburgh: sorry, didn't catch that
[21:50] ok
[21:52] katco: ok i'm connected
[21:53] tvansteenburgh: `use juju`
[21:53] thumper: target prechecks infrastructure: http://reviews.vapour.ws/r/5543/
[21:53] tvansteenburgh: `db.charms.find()`
[21:54] katco: i see my local ubuntu charms
[21:54] tvansteenburgh: the ones you've been uploading?
[21:54] yeah
[21:54] menn0: looking
[21:54] i'll pastebin one
[21:55] veebers: can we hangout?
[21:55] katco: http://pastebin.ubuntu.com/23090657/
[21:55] katco: now i notice that model uuid is not the one i passed to Placement.Scope
[21:56] tvansteenburgh: ah. worth a try
[21:57] katco: no luck
[21:58] thumper: sure thing, what about?
[21:59] veebers: running a ci test locally
[21:59] thumper: sure, where you want to meet?
[21:59] veebers: https://hangouts.google.com/hangouts/_/canonical.com/friday?v=1471633360&clid=9319256218B181C1&authuser=0
[21:59] tvansteenburgh: ok, we'll have to pick this up tomorrow. at least we've narrowed the scope
[22:00] katco: sounds good, thanks
[22:00] tvansteenburgh: np
[22:05] is master blocked?
[22:06] aaah, anastasiamac we should remove blocking tags
[22:06] yes. i will now \o/
[22:07] alexisb: there are no blocking bugs i see
[22:10] anastasiamac: https://bugs.launchpad.net/juju/+bugs?field.tag=blocker
[22:10] anastasiamac: I see 4 blockers
[22:11] Bug #1615986 changed: Agents failing, blocked by upgrade but no upgrade performed
[22:11] menn0: looking. i'll remove.. nothing is blocking according to juju.fail :D
[22:11] anyone know who runs juju.fail? I suspect it needs to be updated to look at "juju" instead of "juju-core" on launchpad
[22:11] menn0: i think it's marcoceppi
[22:11] ah... it's because juju.fail may look at launchpad juju-core, not the juju project
[22:11] side effect of the move :)
[22:12] \o/
[22:12] that's what I said :)
[22:12] haha
[22:12] :D
[22:12] * menn0 emails marcoceppi
[22:16] thanks all
[22:19] katco: i see what's happening. the uploads are being tagged with the controller uuid instead of the default model uuid. i'm not sure how to fix it though
[23:10] menn0 katco I'll update it, and add a link to bugs
[23:10] marcoceppi: thank you
[23:10] actually, there is a link, at the bottom of the page
[23:10] that says I made it
[23:10] ;)
[23:14] marcoceppi: the awesomeness of the rest of the page must have blinded us to that part ;-)
[23:14] the citools from QA have changed, I have to go patch the scripts
[23:15] menn0: this may seem odd, but apparently there are no blockers?
[23:16] anastasiamac, standup
[23:16] marcoceppi: that's correct... there were an hour or so ago, but not now
[23:16] perrito666, standup
[23:16] menn0: ah, I see, well it's switched over now
[23:17] menn0: the next time you have a blocking bug double check to make sure it works
[23:17] marcoceppi: will do!
[23:20] katco: i figured it out. bug updated with the details. TL;DR "I'm sorry"