[00:19] gary_poster: ping [00:32] * thumper wonders why his branch isn't being picked up by tarmc [00:32] hi wallyworld__ [00:32] hi [00:34] wallyworld__: would be good if you could take another look at https://codereview.appspot.com/10370044/ [00:34] ok [00:35] also, I think that https://codereview.appspot.com/10361045/ is missing a prereq [00:55] thumper: when we start a container via StartContainer(), do we really need to pass in tools and environ config? Why do containers need these and StartInstance in the provisioner doesn't? [00:56] wallyworld__: yes, in order to create the cloud-init file [00:56] and the provisioner does [00:56] well... [00:56] StartInstance(machineId, machineNonce string, series string, cons constraints.Value, info *state.Info, apiInfo *api.Info) (instance.Instance, error) [00:56] the environ provisioner has the environement [00:56] that's from the orker interface [00:56] containers don't [00:57] and tool selection for environment machines is done based on what you ask for [00:57] for the container, it is got from the parent machine [00:57] ok, that makes sense, thanks [00:58] when we have so many params, i sort of like to do a struct to hold them [00:58] but no biggie [00:59] * thumper shurgs [00:59] * thumper calls bikeshed [00:59] but I can understand where you are coming from [01:00] lots of other places in the code do it. i'm not saying it's a blocker or needs to be fixed, just thinking out loud [01:05] thumper: i can't see the container manager New() implementation [01:05] am i going blind? [01:05] wallyworld__: is golxc ContainerManager method [01:06] you said to embed it [01:07] ok, thanks, makes sense [01:07] thumper: so when we stop a container, why to we call manager.New() to get a container reference. why don't we ask for the existing container by id or something [01:08] New() just seem counterintuitive to get an existing container [01:08] wallyworld__: New just gives an object that wraps a name [01:08] New returns a new container object [01:08] doesn't create the container [01:08] very confusing [01:09] New normally implies a new entity that is managed by the manager is created [01:10] ie the underlying foo didn;t exist before but does exist after New is called [01:10] so i read that code and expect a whole new container to be created [01:11] and once created, we shouldn't call New again just to get a reference to it [01:11] surely some other method can be used, like Get() [01:27] wallyworld__: https://bugs.launchpad.net/juju-core/+bug/1174778 can you comment on this? [01:27] <_mup_> Bug #1174778: generate-config needs to be updated for new HP Cloud images [01:31] wallyworld__: also, this one https://bugs.launchpad.net/juju-core/+bug/1183446 [01:31] <_mup_> Bug #1183446: Error uploading tools into openstack: 401 Unauth [01:51] thumper: looking now, had to talk to tiler [02:27] I am told that trying to share symbols in _test.go files between packages doesn't work, is that right? [02:36] bigjools: whazzup? [02:36] bigjools: but the answer is yes if you are asking what I think you are asking [02:36] thumper: I want to export test features from one package so another can use them [02:37] general practice is to move useful functions into juju-core/testing [02:37] not gonna happen here, because it's a function in gwacl [02:39] we want to expose something that creates a test object in the library [02:44] what is gwacl? [03:04] thumper: azure library [03:04] I guess the question can be re-phrased as, does exporting symbols in test files actually do anything? [03:12] bigjools: no, don't think so [03:12] bigjools: make a testing package inside gwacl? [03:12] thumper: we thought of that but it presents more problems [03:13] why? [03:13] the same exporting problem in reverse mainly [03:14] wat? [03:14] we have test code that pokes around in package structs and what not [03:14] oh, internal testing [03:14] yeah [03:14] we don't do that (on the whole) [03:15] if we need to poke internals [03:15] we add functions to export_test.go [03:15] which is i the main package [03:15] not the _test package [03:15] that way they are only exported for the test [03:15] but generally not [03:15] what does exporting mean in a test? [03:16] bigjools: as in: [03:16] func SomeTestFunc(c *C, some value) [03:16] so it is accessible from the test file [03:16] in a different package? [03:16] no, just in the tests for that package [03:17] * bigjools is confused now [03:17] because that is the only time the _test files are used [03:17] ok, example time... [03:17] I realise that - but I don't understand the need to export for use in the same package [03:17] bigjools: hangout? [03:17] sure [03:17] bigjools: you start? [03:17] head or bunny? (can never remember) [03:18] head [03:44] https://code.launchpad.net/juju-core/+activereviews [03:47] bigjools: can you add ~juju-core to ~maas-maintainers? [03:48] thumper: why would I want to do that? [03:48] bigjools: do you want to maintain the project? or do you want core to? [03:48] it's me for now [03:49] I know it is [03:49] that's why I asked you [03:49] :) it's not really anything to do with juju core though [03:49] if you are happy to keep it, that's fine with me [03:49] was an action item for me to chase from our last meeting [03:50] oh really? [03:50] aye [03:55] gomaasapi now belongs to ~juju [03:55] maas-maintainers are down as driver though, feel free to change [04:02] thumper: got a sec to help with a compile error? [04:02] sure [04:02] jam: ping [04:02] * bigjools assembles pastebin [04:04] thumper: http://paste.ubuntu.com/5782476/ [04:05] * thumper clicks [04:05] missing this: [04:05] func (t *TestTransport2) RoundTrip(req *http.Request) (resp *http.Response, err error) { [04:05] forgot to paste it [04:06] bigjools: you need an & [04:06] where? [04:06] transport := &TestTransport2{} [04:06] TestTransport2 doesn't implement the interface [04:06] ok god this is the pointer vs non-pointer interface stuff [04:06] *TestTransport2 does [04:06] aye [04:06] :( [04:07] thanks [04:16] wallyworld__: ping [04:16] hello [04:17] wallyworld__: does machine.Destroy take containers into consideration? [04:17] think so, let me check [04:17] yes, it won't allow a machine to be destroyed if it has containers [04:18] cool [04:28] * thumper stopping for a bit, back for meetings tonight === thumper is now known as thumper-afk === tasdomas_afk is now known as tasdomas [06:14] morning [06:14] TheMue: morning [06:18] jam: hiya, thanks for review. will change to caps for abbrevs, yes [06:19] jam: regarding the location of the test server I have thought for some time. I need it in the ec2 package to test the reader but also in the synctools to test there. *sigh* [06:19] TheMue: can you put it into the testing/ locations ? [06:20] TheMue: I can live with it, but it would be nice to be clearly test-only code that doesn't have to end up in the juju and jujud binaries [06:21] jam: thought about testing too. in this case i would have to duplicate the structs for the xml marshalling/unmarshalling [06:21] jam: but maybe that price is worth it, yes [06:21] TheMue: fair enough. [06:21] Id probably say not [06:21] TheMue: I think you've already found a reasonable balance. [06:22] I didn't know about all the complexities. so LGTM where it si. [06:22] is [06:22] jam: thx === TheRealMue is now known as TheMue [07:06] mornin' all [07:14] rogpeppe: heyhey [07:14] TheMue: hiya [07:41] greetings wallyworld_ [07:41] hi [08:09] jam: hi there! Did you see how my test-for-proper-locking helper worked out, in the separate branch? [08:09] jtv: I did, and I posted a comment on it. [08:09] looks pretty good. [08:09] Ah, thanks! [08:10] it doesn't quite match what you originally requested, but it does assert that the function doesn't return until after you've unlocked. [08:13] jam: thumper suggested making it a checker. Would love to, but right now I've got to get the basics out of the way, and they're blocking the team. Funny how each improvement leads to a new one. :) === thumper-afk is now known as thumper [08:42] jam: I meant this, newer proposal: https://code.launchpad.net/~jtv/juju-core/generic-locking-test/+merge/170477 [08:42] Based on yours but pared down to something very simple & reusable. [08:47] jtv: I did review it, but I forgot to submit my inline comments. [08:48] jtv: sorry about that. 2-step review process and all [08:59] jtv: go-bot rejected your change because you added a new depdency. I imagine we need lp:gwacl installed on the machine? [09:05] jam: argh! Thanks for pointing that out. Yes, we do. [09:05] jtv: done [09:05] you can resubmit [09:05] Thanks very much. :) [09:05] Hadn't realized we hadn't had the dependency operational yet. [09:09] jam: blast. Now we also need gwacl's dependencies. For starters, it needs github.com/andelf/go-curl [09:09] (Looking for any other dependencies) [09:09] jtv: curl ? [09:10] jam: yes, the built-in http library doesn't support tls renegotiation. [09:10] (That set us back a bit! Requests just failed with the words "no renegotiation," and the rest we had to figure out from source) [09:11] jtv: so do we need 'libcurl-dev' ? do you know what package? [09:11] (libcurl is one of those that has 20 different versions based on what SSL lib you use, etc) [09:11] We need libcurl4-openssl-dev. [09:11] What Raphaël said. :) [09:12] rvba: wow, that brings in a lot [09:12] krb5, ldap2, gpg-errror, gcrypt11, etc, etc. [09:13] There are also some packages *inside* gwacl that we need: launchpad.net/gwacl/dedent, launchpad.net/gwacl/logging [09:13] jtv: isn't that provided by 'bzr branch lp:gwacl' ? [09:13] jtv: 'go get launchpad.net/gwacl/...' succeeded [09:14] I imagine it is... just wanted to save you later roundtrips if it came to that. [09:14] That's it. No other dependencies that I can see. [09:15] Shall I retry? [09:15] jtv: you can retry as much as you like :) [09:15] may not *work*, but hopefully it will this time. [09:20] jtv: go-curl brings in a cgo dependency, which I think we were trying to avoid? [09:20] jtv: can you at least send an email to juju-dev bringing it up [09:27] jam: I have no idea what the issue is that you're talking about... could you elaborate about what dependency you mean, and why we're trying to avoid it? [09:29] jtv: go-curl has .c code in it. Which means it isn't pure go code. We just changed goyaml to be pure go, and we implemented 'golxc' in terms of shell calls rather than lib calls because of not wanting to use cgo. [09:29] jtv: It is something that sounds like it needs a juju-dev conversation, but maybe I'm better to bring it up. I'm just in a call now and didn't want to forget about it. [09:30] jtv: test suite failed, not sure why [09:30] should be on the MP shortly [09:30] /o\ [09:31] jtv: ffs, that looks like the mgo stuff that I thought I had fixed [09:31] Yup. [09:33] jtv: ah, subtly different [09:34] :( [09:34] jtv: this is 1 connection "InUse" the old failure was 1 connection "Alive" [09:34] jtv: 2 columns in the output, the 1 moved from the right to the left. [09:34] I can't imagine it being related to my branch, but that may be a lack of imagination on my part. [09:35] The universe of possible failures tends to defy imagination. [09:35] jtv: well, the same test is failing [09:35] Same test as with that problem you worked on? [09:35] Maybe it's one of those random failures? [09:36] Passes sometimes, fails sometimes? [09:36] If so, I guess I could just retry landing mine. But then we'd better be damned sure my branch isn't to blame. :) [09:36] jtv: I'm running the test directly right now [09:41] jtv: So far I've run it 12 times and it hasn't failed without your branch. Will try it with your branch [09:41] * jtv hopes for success [09:45] jtv: run in isolation it passes with your change. We have to figure out where this 'InUse" connection is from. [09:46] Shall I give my landing another go then? [09:46] jtv: right now I have the lander blocked while I'm testing [10:31] rogpeppe: hi, if you are free at some point i have a watcher question [10:40] fwereade_: ping [10:43] wallyworld_: ask away [10:44] wallyworld_: sorry i didn't see your ping === rogpeppe1 is now known as rogpeppe [10:45] rogpeppe: i have a watcher and a test. if i comment out the SetProvisioned() line in the test, there should be no event but the watcher still gets a change and i can't see why. if i println inside the watcher, i see an empty change come through and i don't know why. here's a pastebin of some code snippets https://pastebin.canonical.com/93139/ [10:46] wallyworld_: watchers always provide one initial event [10:46] oh. why? [10:46] wallyworld_: to give you a starting point [10:46] wallyworld_: so that you know you're in sync [10:47] ok. and that initial event is empty? [10:47] wallyworld_: that initial event gives you the current state of the world [10:47] wallyworld_: as the watcher sees it [10:47] ok, thanks. [10:47] rogpeppe: also [10:47] do you know about the mega watchwr? [10:47] wallyworld_: i do [10:47] wallyworld_: at least, i really *should* do :-) [10:48] rogpeppe: long story, but i have moved the InstanceId and Nonce attrs off machineDoc and int a separate instanceMetadata doc where other machine instance things like mem, cpu cores etc are now stored [10:49] the mega watcher tests expect to listen to machine and see instanceids [10:49] wallyworld_: ah, you'll need to add another collection to the megawatcher stuff [10:49] rogpeppe: sure. but i'm worried about what uses the mega watcher to listen to instance ids. does anything? [10:49] wallyworld_: yeah, the GUI does [10:50] what does it use InstanceId for? I will need to coordinate the changes then since when i land this, the ui stuff will break [10:50] wallyworld_: is the instanceMetadata collection already watched by the megawatcher? [10:50] not yet [10:50] that doc is new [10:50] it holds instance related things [10:51] like mem, cores, id, nonce etc [10:51] it's Id is the same as the machine to which it is related [10:51] its [10:51] when i say mem, id; i mean mem, instanceid [10:52] when I say Id, I mean _id aka primary key [10:53] wallyworld_: look at the backingStatus in megawatcher.go for a pattern to follow to update the megawatcher [10:53] rogpeppe: so why does the gui need to know the InstanceId? can't it just send requests using the machine.Id? [10:54] rogpeppe: i'll look at the backinfStatus, thanks. luckily there's only 3 failing tests out of the entire test run [10:54] wallyworld_: it can, but we maintain the invariant that the properties you see when asking for an entity are the same things the you see changing with the AllWatcher [10:55] wallyworld_: i think that's a good thing to maintain [10:55] wallyworld_: and we provide that information currently, so i don't think we should break backward compatibility [10:55] so if InstanceId is not longer a property on machine..... [10:56] wallyworld_: it is as far as the client is concerned [10:56] wallyworld_: even if it's not held in the database that way [10:56] what business reason does the gui have for needing the instanceid? [10:57] wallyworld_: so we watch the metadata collection and update the machine entity in the megawatcher store when the metadata changes [10:57] wallyworld_, hi [10:57] gary_poster: hi [10:57] wallyworld_: it's actually a little simpler than for the statuses, because instanceMetadata can only apply to a machine, i assume [10:57] gary_poster: can i fill you in on the current conversation? [10:57] wallyworld_: where statuses can apply to several different kinds of entity [10:57] rogpeppe: yes, 1-1 [10:58] have a call in 3 wallyworld_ . I *thinjk* the answer is "we need it as a key" [10:58] gary_poster: would love to chat perhaps after your call [10:58] wallyworld_, you'll still be around in 1 hr? [10:58] yep [10:58] ok will ping wallyworld_ [10:58] thanks [10:58] rogpeppe: in the meantime, i'll look into making the current megawatcher tests pass [10:59] by adding a new collection etc [10:59] wallyworld_: cool [10:59] wish me luck [10:59] i've not done anything with this bit of the codebase before [10:59] wallyworld_: sorry for the extra work - that's the down side of using different collections for attributes with the same primary key [10:59] wallyworld_: feel free to ping me at any time [11:00] rogpeppe: no need to apologise. when the data model changes, there are always consequences [11:00] wallyworld_: i'm not sure that anyone else has added stuff to that particular part of the code base before [11:01] * wallyworld_ straps on the headlamp and goes diving in [11:01] wallyworld_: the basic model is that a multiwatcher maintains a store of information on all the entities that are currently being watched [11:01] the *entire* data model? [11:01] what if we have 1000000 machines [11:02] wallyworld_: that's indeed a current problem with having a watcher that watches for all changes [11:03] wallyworld_: in the future, we'll probably move to a model where clients only watch for some summary of the state [11:03] rogpeppe: so am i right in assuming the store of info on the client mirrors exactly the persistent model? [11:03] wallyworld_: so we won't need to keep all info in mmeory [11:03] same classes etc [11:03] wallyworld_: not really [11:03] wallyworld_: it's exactly derived from it [11:04] wallyworld_: but it holds the entities that are observed by the client [11:04] wallyworld_: which don't necessarily map directly to the entities stored in the database [11:04] ok, so if there's a machineDoc with A,B attriutes, there'll be a clientMachine with A,B also etc [11:04] wallyworld_: status being a good example [11:04] wallyworld_: not necessarily [11:04] wallyworld_: we don't provide all attributes to the client [11:05] wallyworld_: and we provide extra attributes in some cases (for instance the machine status) [11:05] ok. will look at the code with that in mind [11:05] wallyworld_: the indirection is implemented by the Backing.Changed(all *Store, change watcher.Change) error method [11:05] rogpeppe: so if gary just uses InstanceId as a key, he could also use Machine.Id right? [11:05] wallyworld_: i wouldn't dare to guess [11:06] wallyworld_: and i definitely think we should provide the instance id through the AllWatcher, so i think the point is moot [11:06] i'm not across what AllWatcher does [11:07] ie how it differs from mega watcher [11:07] wallyworld_: megawatcher.go implements the AllWatcher [11:07] wallyworld_: they're the same thing - just the AllWatcher is the "official" name [11:07] when you say provide InstanceId, does it have to be attached to a Machine object? [11:08] or can we expect code to listen to instanceMetadata to get it? [11:08] wallyworld_, mgz, danilos: The Juju team manager meeting is going to overlap with our standup, if it is done in time, I'll be there, but I might miss it. [11:08] wallyworld_: it must be attached to the params.MachineInfo object [11:08] wallyworld_: no, i don't think the client should need to know that there are two collections underneath [11:09] so the watcher listens to both and merges the data into MachineInfo? [11:09] wallyworld_: in the future, we might provide a multiwatcher that can watch a limited subset of the state, but that's not where we are now. [11:09] wallyworld_: yes [11:09] i'd still like to know what the business case is for needing InstanceId though [11:10] InstanceId is now just one bit of info that becomes available when a machine is provisioned - we now also have mem, cores etc [11:10] we didn't have all this previously [11:11] wallyworld_: we did provide the instance id through the AllWatcher [11:11] if InstanceId is just being used as a key, the Id will do fine [11:11] wallyworld_: InstanceId is also shown to the user [11:11] we did provide it, but do we need to continue prviding it if there's no business reason for it [11:11] jam, wallyworld, danilos: if we could mumble the standup, that would save me various bother (and jam could be on mumble and hangouts at the same time :P) [11:11] wallyworld_: (it's an important piece of info, after all) [11:12] wallyworld_: I think that's probably the only way that the GUI finds out about instance ids [11:12] wallyworld_: and i think backwards compatibility is worth preserving [11:12] wallyworld_: there are other people using the API too [11:12] wallyworld_: outside of juju-core [11:13] backwards compatibility is good if the cost is not too great and/or the business reason is high [11:13] wallyworld_: i don't think the cost is too great here; and the benefits are significant [11:14] wallyworld_: you can use the AllWatcher as a way to show a dynamically updated system status [11:14] mgz: if only I could split the headphones so I listen to you with my right ear, and the managers with my left :) [11:14] ok, i'll look at the code. i'm still wondering why it's such an important piece of info. just curious as to why it's needed [11:14] mgz: and you *can* be on 2 hangouts at once, I believe. [11:14] what benefit do users get from seeing it? [11:15] wallyworld_: it's the only way a user can cross-correlate juju instances with instances in the ec2 management console, for example [11:15] ah ok, that makes sense [11:15] and i guess the same would hold for openstack [11:15] wallyworld_: indeed [11:16] if there is a management console [11:16] wallyworld_: there are presumably non-juju tools for managing openstack [11:16] yes [11:16] wallyworld_: there y'go then :-) [11:17] rogpeppe: it's still all a bit abstract - i'll look at the status stuff and try and understand what's being done and apply that to this case. thanks for the input. [11:17] wallyworld_: np [11:17] fwereade_: have reproposed dimiter's machiner watcher branch as https://codereview.appspot.com/10439043 - can you give me a few more details on how to test watcher id validity? [11:17] wallyworld_: again feel free to ask about specific bits of code or anything else [11:18] rogpeppe: will do for sure. it's way past my EOD so i'll take a proper look tomorrow [11:18] mgz: if you want to test invalid watcher ids, you can always make raw api calls directly as necessary. [11:19] mgz: I also need to find out what is going on with co-installability with go-juju and py-juju. [11:19] it's less that I want to test invalid ones, and more that I want to verify what I get is valid. [11:19] jam: we need to update how we do the ppa [11:19] because it's diverged from the fixed packaging [11:20] the backporting to 12.04 is also much requested, but pretty much just ubuntu-process (we can basically take the existing raring packages for both pyjuju and gojujuju) [11:20] tooju [11:22] fwereade_: fyi, i put up a mp to refactor the instance metadata api. i did it as a pre-req to the branch storing the metadata in state because the latter branch needs the allwatcher stuff modified and i want to get the api tweak in trunk asap https://codereview.appspot.com/10384049/ [11:23] wallyworld_, tyvm, sorry meeting now [11:23] np, just a heads up [11:24] wallyworld_: hm, I approve of that idea in principle, looking at the branch now [11:25] ty [11:30] danilos, jam, wallyworld: I'm on mumble [11:30] mgz, coming [11:30] jam: I don't think you can do two hangouts, last I tried (for on-air stuff), it complained [11:41] i'd appreciate a review or two of these CLs, if anyones feels inclined to: [11:41] https://codereview.appspot.com/10364046/ [11:41] https://codereview.appspot.com/10437043/ [11:41] anyone [11:53] rogpeppe: I *think* I figured out why my 'hack' patch worked, and the actual one that landed didn't. [11:53] wallyworld_, you still want to talk? [11:53] jam: oh yes? [11:53] It seems that socket.Close *doesn't* remove the socket from the alive counts. [11:53] So when I returned a Closed socket. [11:53] It was calling socket.Release() on it [11:53] jam: ah! [11:53] now that I return an Err [11:54] nobody is Releasing it [11:54] so the code needs to be 'if server.closed: socket.Release(), socket.Close() return err" [11:54] jam: i should have looked deeper, sorry for slack code review! [11:54] rogpeppe: well, the test suite was passing [11:54] and I didn't do quite enough rigor testing of the juju-core test suite with the updated patch [11:54] gary_poster: hi, i think i'm ok now. the db model is changing to support instance metadata (ram, cores etc) for provisioned machines. part of the change is moving InstanceId from machine to a new entity. but i think i can still keep the watcher interface the same for your stuff [11:54] jam: Release doesn't do a Close then? [11:55] because it takes 10s for each run, and only fails once in a while when there is a release. [11:55] rogpeppe: Release is "put this back in the available queue" vs "Close" is don't let anyone talk on this anymore. [11:55] I would have thought Close would do a Release, but it doesn't [11:55] jam: so they're entirely orthogonal? [11:55] Release shouldn't clase. [11:55] gary_poster: i just wanted to find out how much you depended on InstanceId - i think you display it to the users [11:55] cool thanks wallyworld_ . FWIW, backwards incompatible changes without a transition period make me concerned generally [11:55] rogpeppe: not in *my* head :) [11:55] jam: neither mine :-) [11:55] Close calls kill [11:55] which makes the socket unusable [11:56] but it doesn't mark it as not InUse [11:56] just NotAlive [11:56] mgz, sorry, just out of meetings -- I guess that successfuly using the watcher id is the clearest demonstration of its being valid [11:56] jam: a subtle distinction... [11:56] rogpeppe: well that is the 2 columns [11:56] and why the count moved from one into the otherc [11:56] but my old patch worked [11:56] gary_poster: yeah, understood. i wasn't sure how your stuff used the attribute ie the impact of any change, hence i wanted to find out before doing too much on it [11:56] because it happened to still call release [11:56] wallyworld_, not sure but we might also need it as a key. would have to look through code but won't since you are not changing now [11:56] because it used the 'send a request even if it is closed' [11:56] jam: presumably because it was relying on the usual code patch for cleaning things up [11:56] cool thanks wallyworld_ [11:56] s/patch/path/ [11:56] rogpeppe: yep [11:57] and the test I wrote [11:57] just was testing that it noticed the server was closed [11:57] which it did [11:57] have a good night wallyworld_ :-) [11:57] gary_poster: thanks also [11:57] jam: ha. gustavo was right about that test [11:57] will do [11:57] jam: too embedded in implementation detail [11:57] rogpeppe: but it also gave precise injection of concurrency [11:57] which is usually the tradeoff [11:57] jam: agreed [11:57] "run this 100 times and maybe it fails once" [11:58] vs "this will always fail if you got it wrong" [11:58] rogpeppe: it wouldn't be hard to have had my test case also check that all sockets had been fully cleaned up. [11:59] jam: but it would be nicer still to have a test that didn't delve so much into the implementation [11:59] jam: though i appreciate that might be hard... [12:00] rogpeppe: precise injection of concurrency is usually very white-box testy [12:00] otherwise it is in the "well maybe they happened at the same time" [12:00] or "maybe the scheduler did them in an order that always passes" [12:01] jam: sometimes just having a slow test that runs things in a loop, failing probabilistically isn't a bad approach [12:01] And very white-box tests that exercise precise scenarios of concurrency are generally a drag. In practice, it's trivial to introduce race conditions that go uncovered. [12:01] rogpeppe: right now, juju-core test suite almost always fails when run in practice, but if you run just the test, it almost always passes [12:01] Good morning/afternoon, by the way :) [12:01] niemeyer: yo! [12:03] niemeyer: that is true, but you can also add those cases as you find them. The main issue is without adding those cases [12:03] all you know is that you probablistically made things better. [12:03] jam: If you Release the socket and it's injected back onto a closed server, the socket is immediately closed. [12:04] jam: Sure, but I don't consider it a good tradeoff. You're testing an extremely specific race condition, not the next one [12:04] niemeyer: RecycleSocket doesn't add it to unused, but *doesn't* call socket.Close() [12:05] niemeyer: unless I'm missing something, you must call both Release and Close on the socket. [12:05] jam: Indeed, you're right, because Server.Close already closed htem [12:06] One to unset the IsAlive flag, and one to unset the InUse flag. [12:07] jam: As long as we don't allow new sockets to be created (as the old code was doing) or left open, it should be okay [12:07] niemeyer: by the time server.Connect has returned, it has created a new socket that has been acquired, so it needs both. [12:08] jam: On a meta level, for race-prone logic, what I appreciate doing is punching the code with concurrency and checking that it survives in a consistent way [12:08] jam: That's actually useful, because it's covering the past and future cases of uncovered races. [12:08] jam: It's what mgo/txn does, for example [12:10] i find it interesting, given what people say about Go's memory non-safety, that almost none of the race conditions we've discovered have resulted from memory races [12:10] and would probably have been possible to occur in any of the so-called "concurrent-safe" languages [12:11] rogpeppe: Sure, I don't think anyone expects code that is not memory-racy to bug-free [12:12] rogpeppe: If that wasn't the case, every Python application would be bug-free :) [12:12] this particular issue is a good example - it could easily have occurred in haskell with STM [12:12] niemeyer: sure [12:12] niemeyer: i just think it's an interesting thing to observe [12:12] rogpeppe: I agree [12:12] rogpeppe: Reminds me of The Rubber Boots post [12:12] niemeyer: what post is that? [12:13] rogpeppe: https://plus.google.com/u/0/107994348420168435683/posts/993U42yVbfk [12:13] * rogpeppe sees the dark clouds gather above and wonders if it might be a good moment to go inside [12:14] niemeyer: thanks - i vaguely remember that now [12:15] rogpeppe: we had a kind of apocalypse yesterday evening [12:15] TheMue: lots of rain? [12:15] TheMue: or angels of darkness and death? [12:15] rogpeppe: heavy thunderstorms [12:16] rogpeppe: thankfully the center has been more south of us, so here not much rain. but around my hometown ... [12:17] * rogpeppe can't decide if this particular large black cloud might not produce rain and have sun following it, or if i might have to run for it [12:19] hi guys, just wanted a quick assessment of https://bugs.launchpad.net/juju-core/+bug/1192728 [12:19] <_mup_> Bug #1192728: differing behavior with pyjuju: config-get all json [12:19] if you agree it's a bug, or not, etc [12:20] ahasenack: it is, i'm just working on it [12:20] basically config-get with no key returns all keys. In pyjuju, it only returns keys that were set (manually or via default values) [12:20] in juju-core it returns all [12:20] TheMue: ah, ok, thanks [12:20] rogpeppe: ask www.nsa.gov/prism, they know if it'll rain [12:21] TheMue: only because they bug the Met Office :-) [12:21] niemeyer: so with my "patch" to widen the concurrency hole (set pingerDelay to 5s and add a 10ms delay in newSocket). I can get the juju-core test suite to fail 8-in-10 times, and adding a line 'socket.Release()' before socket.Close() in AcquireSocket, it passes 10-in-10. [12:21] niemeyer: do you want a branch for that, or you can do the change yourself? [12:22] ahasenack: have you encountered a problem with that behaviour? [12:22] ahasenack: (the Go-juju behaviour, that is) [12:22] jam: I can do the one-liner quickly.. I'm actually just trying to write a high-level test that exposes that logic to concurrency in a black-box fashion [12:22] rogpeppe: yes, one charm of ours basically dumped all juju config values into the config file [12:22] ahasenack: i'm wondering if there's another way to see all possible config options without looking directly at the charm config metadata [12:23] rogpeppe: in gojuju we started getting "url = None" [12:23] rogpeppe: in pyjuju we didn't get that key, because it wasn't set [12:23] niemeyer: you can have a goroutine calling AcquireSocket and closing it, and another closing the server at some random point in time. But you need real hardware concurrency, rather than single-threaded 'maybe the scheduler decided to switch goroutines'. [12:23] (i think) [12:23] ahasenack: interesting, thanks [12:24] rogpeppe: honestly, I'm surprised with both [12:24] ahasenack: what behaviour would you expect? [12:24] niemeyer: the problem is that the test suite *I* have, without the changes to delay, passes 10-in-10, and really only fails under direct test about 1-in-50 or so. [12:24] rogpeppe: one can argue both ways, that both behaviours are correct I think [12:24] (it fails when running the whole test suite about 3-in-4, I'm guessing because of extra load/memory/timing changes/etc.) [12:24] "config-get" will return all config keys. The ones that weren't set will have a value of None [12:24] or [12:24] "config-get" will return all config keys that have a value set either manually or via a default value [12:25] ahasenack: the former is the current Go behaviour, right? [12:25] rogpeppe: right, and I'm slightly in favor of it [12:25] rogpeppe: at this point I guess it's more a question of maintaining backwards compatibility [12:25] ahasenack: yeah [12:25] ahasenack: maybe we could add a "-a" flag [12:26] ahasenack: so that the current Go behaviour is still available ('cos it's useful) but backward-compatibility is preserved [12:26] hm [12:26] so the go behaviour that we have now would only happen if you added "-a", for "all"? [12:28] rogpeppe: -a or --all? [12:28] TheMue: maybe both, meaning the same thing [12:28] ahasenack: yes [12:28] rogpeppe: as an option like --format [12:29] seems nice [12:29] the docs would say something like this [12:30] "config-get with no key specified will return all keys that have a value, either set manually or via a default value" [12:30] "if -a is provided, then all keys, regardless of their value, will be returned" [12:31] ahasenack: "config-get with no key specified will return all keys that have a value that is not None" ? [12:32] what if I set "None" in the config? Is there an equivalent yaml value for python's None? [12:32] ahasenack: because i think it might be possible to manually set a config setting to its original unset value [12:32] ahasenack: it's a good question, i'm not sure [12:33] ahasenack: i *think* that "" is always unset, but fwereade_ has poked at this a lot more recently than me [12:33] I find myself reading your version several times over to understand it [12:33] ahasenack: yeah, double negatives are never good [12:33] ahasenack: "config-get with no key specified will return all keys that have a value" [12:33] is probably good enough [12:33] s/that have/with/ [12:34] +1, but I prefer that have [12:34] ahasenack, rogpeppe: I have a lurking feeling that it will actually include those without values [12:34] fwereade_: indeed [12:34] fwereade_: that's the bug [12:34] ahasenack, rogpeppe: (independent of default substitution for missing values ofc) [12:34] fwereade_: https://bugs.launchpad.net/juju-core/+bug/1192728 [12:34] <_mup_> Bug #1192728: differing behavior with pyjuju: config-get all json [12:35] ahasenack, rogpeppe: ah-ha, ok, thank you [12:35] fwereade_: but we think the current go behaviour is actually useful, so the proposal is to add a -a (and/or a --all) flag [12:36] fwereade_: with which you'd get the current behaviour [12:36] --include-unset? [12:36] we have two "all"s in here [12:36] config-get with no key returns "all" keys (with values set) [12:36] rogpeppe, ahasenack: --full? --complete? [12:36] and -a returns all keys again (regardless of value) [12:37] fwereade_: i think -a has quite a bit of historical precedent [12:37] rogpeppe, fair enough, don't let me bikeshed it :) [12:37] rogpeppe, ahasenack: I'm +1 on this then [12:37] TheMue, I see you're assigned to it [12:37] fwereade_: yep [12:38] TheMue, it should just be a trivial tweak in jujuc I think [12:38] fwereade_: already working on config-get, there is another issue [12:38] rogpeppe: "config-get with no key specified will return all keys that have a value" s/all// [12:38] fwereade_: looks like, yes [12:38] TheMue, ah, what's the other issue? [12:38] ahasenack: or s/all/any/ ? [12:38] fwereade_: https://bugs.launchpad.net/juju-core/+bug/1192706 [12:38] <_mup_> Bug #1192706: config: empty strings return null with --format=json [12:38] rogpeppe: works too [12:40] TheMue, hmm, when was an empty string ever a valid value for anything? [12:40] TheMue, it's not settable [12:41] fwereade_: as i understood default is no empty string but the string '""' [12:41] fwereade_: two double quotes in the yaml [12:41] jam, rogpeppe: Waiting for sockets to die: 451 in use, 46 alive [12:43] TheMue, that's an empty string, try unmarshalling that yaml [12:43] fwereade_: is it possible to explicitly set an empty string for a string-typed config value? [12:43] rogpeppe, it never was before, empty strings were always converted to "delete" internally [12:43] jam, rogpeppe: http://paste.ubuntu.com/5783538/ [12:43] fwereade_: yeah. it does seem a bit wrong though. sometimes i might *want* an empty string even though the charm default is non-empty. [12:44] rogpeppe, ok, but thus far it has been impossible except apparently via this default backdoor [12:44] niemeyer: that's more the kind of test i was imagining [12:44] niemeyer: +1 [12:45] fwereade_: if i've seen it right we currently don't handle a default value if a key is missing [12:46] TheMue, sorry, restate please [12:47] fwereade_: in the example of the issue a default: is used [12:47] fwereade_: I just sent an email to juju-dev trying to summarize the API for Upgrader as I see it. Can you at least skim it and see that it matches what you got out of our conversation? [12:47] jam, I'll try, I'm rather backlogged today :) [12:48] fwereade_: maybe i'm not yet deep enough in our code, but i haven't seen where we handle it [12:48] TheMue, default is handled at ReadConfig time [12:48] fwereade_: I'm not working on it till Sunday anyway [12:48] no rush [12:49] niemeyer: pingDelay*10 seems a bit funny. can you make the const pingDelay 20 instead ? (so you get a feeling for the range from the const at the top) [12:49] fwereade_: ah, ic, so looked at the wrong place [12:49] fwereade_: thx for the hint [12:49] ah, I guess I see why it is separate. [12:49] TheMue, and hits the ""-becomes-nil logic when figuring out the actual default [12:50] jam: It's not the same thing, but sure.. it'll probably hit the bug either way [12:50] fwereade_: just started to take a look into config-get, but it's earlier. good to know [12:50] TheMue, I am a bit nervous about having separate validation paths for default settings and normal settings [12:51] fwereade_: will have to look at it first [12:52] fwereade_: but a small different topic before [12:52] fwereade_: would like to merge the moved ec2 http storage reader [12:52] niemeyer: so nm, because you use pingDelay as s [12:52] fwereade_: it has two lgtms but i would like to discuss the location of the test server with you [12:53] I would say you probably want pingDelay +/- randomTime [12:53] as the time.Sleep part [12:53] (possibly only - ?0 [12:53] jam: Nah.. it's simpler.. taking the ping delay down to 1 already triggers it [12:53] jam: Just letting them fight a little bit is enough [12:54] fwereade_: i made it public iin the same package, as i have to use it their for the test and in synctools for the test. other location would be testing, but in this case i would have to duplicate the xml marshaling structs [12:54] (even without random) [12:54] fwereade_: so both solutions have their downsides [12:54] fwereade_: what do you say, keep it in ec2 or move it to testing? [12:55] TheMue, ok, cool, I will see if I can figure it out quickly [12:56] fwereade_: thx [12:57] TheMue, ah, ok, it's a full-fledged server, not just canned data? [12:59] fwereade_: it's the same one like before, a faked one but handling the requests of the http reader [13:00] fwereade_: delivering the index xml and the files (simplified ones out of a map) [13:02] jam, rogpeppe1 : It's fixed and up [13:03] bbl [13:03] niemeyer: thanks [13:03] rogpeppe1: np [13:05] TheMue, ISTM that it's a bit wrong to reuse the same structs... it renders the tests somewhat tautological [13:05] TheMue, how about cutting the internal structs down to just the fields we care about, and using the full structs (ideally with some actual values filled in) in the testing server? [13:07] fwereade_: would be possible, yes. we don't use very much of the structure. [13:07] TheMue, if it's not too much hassle I think that would be ideal [13:08] fwereade_: and also move the server to testing then? [13:08] TheMue, yes please [13:08] fwereade_: no, should be simple and feels even better for me [13:31] TheMue, awesome news [13:33] fwereade_: just re-proposed it [13:33] TheMue, cool, ty [13:37] mgz: you have a review [13:40] rogpeppe1: ta [13:43] argh, missed the change in synctools test, but now a (hopefully) final propose [13:46] jam: I guess you figured out the problem with that test? [14:02] rogpeppe2, TheMue, kanban [14:03] fwereade_: trying [14:04] fwereade_: it doesn't seem to like me currently [14:04] rogpeppe2, it's been a bit picky with me lately too [14:04] rogpeppe2, let's give it a minute [14:04] fwereade_: it's just not loading the page [14:05] fwereade_: other connectivity seems ok [14:05] fwereade_: same problem trying to connect to plus.google.com [14:05] rogpeppe2, I have had that too [14:05] rogpeppe2, rebooting my router seems to help, but it's probably superstition [14:05] fwereade_: google.com just fine === wedgwood_away is now known as wedgwood [14:20] jtv: yeah, I sorted it out [14:20] mgz: Can you go to the cross-team meeting again today [14:20] ? [14:31] jam: can if I reboot, it's now, right? [14:33] can someone invite me on g+? [14:41] rogpeppe2: here's your syntax: juju set option=∅ [14:41] ;) [14:48] hi guys, I got myself in a dying loop: http://pastebin.ubuntu.com/5783889/ [14:48] I will try to reproduce it, but basically it's a ceph unit, with a subordinate, and I ran "juju destroy-unit ceph/3" [14:48] and all hell broke lose :) [14:48] that pastebin is happening in a loop [14:53] Wow - my crash course in GoLang has made me more frustrated with it than anything... apparently it can't handle null values in JSON responses. :-/ [14:54] FunnyLookinHat: it can, but probably not in the way you expect [14:55] mgz, well - all I know is that juju-core can't bootstrap to my devstack because it's failing on a null value [14:55] you can unserialised to a pointer type in the struct, or use an intermeduary form [14:55] http://hastebin.com/rokukasixi [14:55] FunnyLookinHat: file a bug (and then probably just fill in the description in your devstack config) [14:55] Yeah I'll do that - going to make sure it's not some patch I applied to Goose inadvertently [14:57] you don't even need to file a bug, just +affectsmetoo bug 1188825 [14:57] <_mup_> Bug #1188825: If description is unset, goose doesn't understand 'null' value [14:57] Perfect [14:57] you can just that in devstack [14:57] I haven't got a step-by-step for it though === tasdomas is now known as tasdomas_afk [14:58] right right - I can figure that much out - thanks for the bug ref [14:58] dpb1: ^see the above, do you remember off-hand what keystone config you needed to set for that? [15:02] mgz: jamespage actually worked-around it. I'm not sure what he set. james? [15:03] mgz, you need to have an empty description in the tenant/project [15:04] ah, so when you create it via the keystone api? [15:04] rather than in a conf file? [15:24] Any reviewers available? I had to make this a native Launchpad review, because there was an unlanded prerequisite branch: https://code.launchpad.net/~jtv/juju-core/session-certificate/+merge/170611 [15:29] looks reasonable to me jtv, though haven't read the prereq yet [15:30] Thanks mgz — the prereq has landed by now [15:30] ..prereq does seem to be merged? or did the bot blow up again [15:30] It wasn't landed at the time I wrote that, but it s now. :) [15:30] jtv: you can just run `lbox propose -cr -v` in the branch now for the normal method then? [15:31] * jtv decodes [15:33] It's proposing... [15:36] newTempCertFile is surprisingly manual, doesn't go help you with any of that? [15:36] * mgz just looked up the implementation out of curiosity, as hadn't looked at the original mp for it [15:38] mgz, yes [15:38] (I'm lazy and did not specify one when setting up tenants for serverstack) [15:38] jtv: hm, seems the rietveld proposal is still confused about something [15:39] jamespage: thanks [15:39] mgz: I was just updating the LP MP because it didn't get the link to the Rietveld MP. [15:40] mgz: ouch... lots and lots of diff! [15:41] Maybe we can just stick with the native MP? [15:42] I'm not too worried about the final merge, because that's going to be based on the LP diff not the Rietveld diff. [15:46] I'd expect it to be correct, indeed [15:46] jam/bot smarter than goetveld [15:48] goetveld... "goodfield"? [15:53] don't expect the non-dutch to make sense when squidging dutch and english together :) [15:56] Why on Earth not? [15:57] Any backup reviewer for this branch? The Rietveld MP got a little messed up, but the Launchpad one is good: https://code.launchpad.net/~jtv/juju-core/session-certificate/+merge/170611 [15:58] rogpeppe2 maybe? ^ [15:59] (And thanks mgz for the first review :) [16:00] jtv: if you merge trunk into the prereq and push it, then re-propose, it might sort out the rietveld problem [16:00] jtv: obviously you'd need to pump the pipeline too [16:00] I don't have a pipeline there actually. [16:01] The prereq is already landed. I had hoped that would make things simpler. [16:01] merging trunk would probably work, though shouldn't really be needed again... [16:01] I'll give it a go... [16:01] I think the issue is basically goetveld doesn't understand three-way merges, at all [16:03] Turns out there was a conflict, too. [16:06] mgz, rogpeppe2: re-proposing updated version of my branch... Fingers crossed! [16:06] jtv: you've merged and pushed the prereq too, presumably? [16:07] That was already landed. [16:07] So implicitly that happened as well, yes. [16:07] jtv: it doesn't matter [16:07] jtv: lbox diffs against the branch that it finds in launchpad [16:07] jtv: which won't have had trunk merged into it unless you did so explictly [16:07] Ahhh [16:08] jtv: despite the fact that it was merged *into* trunk itself [16:08] Gotcha. [16:10] I have a new Rietveld MP now, and it updated my Launchpad LP but kept the old (broken) Rietveld link. So I updated manually. [16:11] But get this: the two Rietveld MP ids are near-identical, with just one digit somewhere in the middle being different! [16:12] Thanks mgz & rogpeppe2 for helping me out of this. I'll have to leave now, but I'm in a better state. :) [16:12] jtv: that's much more common than you might think [16:12] jtv: due to the way that rietveld allocates ids [16:12] jtv: have you got a link? [16:13] https://codereview.appspot.com/10447043 [16:13] (While I triple-check that I have the right one here :) [16:15] jtv: that's much more like it! :-) [16:16] Yup, much more modest size. [16:17] I think I'll just delete the old one, before some poor passer-by tries to review it. [16:18] (Very courageous of the Rietveld MP to have a Delete button with no Undo and no confirmation dialog! :) [16:46] jtv: you have a review BTW [16:46] Thanks! [16:46] (Yes I know I said I was gone for the evening... No backbone :) [17:42] guys, is there some sort of timeout involved in this error? [17:42] error: cannot put charm: cannot make Swift control container: failed to create container: juju-51e360f7f9cbfdf5d24fdb84aa0d4112 [17:42] caused by: failed executing the request https://swift.canonistack.canonical.com/v1/AUTH_850578306acb4f1baeaceae75ff7ada1/juju-51e360f7f9cbfdf5d24fdb84aa0d4112 [17:42] caused by: unexpected EOF [17:43] I'm wondering because that particular charm has a big'ish tarball inside it, something like 6Mb [17:43] I wonder if that is uploaded to the swift container [17:44] (and I changed my control bucket id already, don't worry) [17:50] yay, live test with machine agent connecting via the API worked! [17:51] ahasenack: sorry, not sure i can help there without more info [17:51] ahasenack: and it's eod for me [17:51] see y'all tomorrow! [17:52] rogpeppe2: ok, I'm looking through the code trying to find out where it defines timeouts for this connection [17:52] ahasenack: it seems a little unlikely to be a timeout, but y'never know [17:52] ahasenack: it might just be the underlying TCP connection failing [17:53] ahasenack: i mean - unlikely to be a timeout that we've built in [17:53] I'm getting that a lot recently [17:53] I'll resort to tcpdump soon === hatch_ is now known as hatch [20:57] . [21:11] morning [22:29] fwereade_: would love a quick chat if you are up and bored [22:29] thumper, hey, I'm really sorry, I am hyperfocused [22:29] np [22:29] thumper, and I haven't done any reviews :/ [22:30] * thumper is just reproposing the broker one with the right pre-req [22:31] thumper, but I'm *this* close to nailing the destroy-services-quicker bug, and I think we can do it backward compatibly [22:32] awesome [22:38] fwereade_: quick question, I'm adding a job type for JobEnvironProvisioner, should I add one for JobFirewaller at the same time? [22:39] thumper, those two are JobManageEnviron [22:39] I know, now... [22:39] do we want to keep them like that? [22:39] thumper, task is more granular than job [22:39] or break them out [22:39] thumper, is the idea [22:40] I could just shelve this bit and just use JobManageEnviron [22:40] is that your suggestion? [22:40] thumper, yeah, I think JobManageEnviron implies run-an-environment-provisioner [22:40] kk [22:40] * thumper removes pipe [22:41] thumper, I guess that's one of the things that causes me to lean towards having more than one top-level task in play [22:41] hmm... [22:41] thumper, the task implementation is shared which is great [22:42] thumper, but they're being run for different reasons [22:42] fwereade_: you thinking that we rename provisioner to environ-provisioner [22:42] and add one for lxc? [22:42] if they are managed by the outer task runner [22:42] it is certainly simpler [22:43] thumper, I'm not 100% sure we'll need extra types for the lxc-provisioner [22:43] thumper, I think that's just a task primed with the right watcher and broker [22:44] thumper, the environ casehas extra funky watching so the actual provisioner is wrapped there [22:44] * thumper nods... [22:44] let me see if I can get this working [22:44] I'll keep the multi-task runner the jujud bit itself [22:44] cool [22:45] I think it might work out quite neat [22:46] * thumper goes to see what a machiner actually does [22:47] arse [22:48] * thumper needs to go back to refactoring the state info getter === wedgwood is now known as wedgwood_away [23:11] * thumper will test this version in ec2 to make sure the provisioner still works [23:25] * thumper sighs [23:25] forgot to upload tools [23:25] at least my client works with 1.11.0 tools [23:48] Dave C. around?