wallyworld | thumper: you around? | 01:29 |
---|---|---|
thumper | wallyworld hey | 02:10 |
wallyworld | thumper: want a hangout? | 02:16 |
thumper | sure | 02:16 |
wallyworld | https://plus.google.com/hangouts/_/72cpi2rmduripigvi5lgv18b7c | 02:16 |
axw | wallyworld: sorry, I guess I took one person's preference on US spelling for policy :) | 03:16 |
wallyworld | axw: no worries :-) | 03:16 |
wallyworld | no need to apologise | 03:16 |
axw | and certainly no need to apologize | 03:17 |
axw | hur hur | 03:17 |
wallyworld | ha ha ha | 03:17 |
wallyworld | axw: did my review on your bootstrap tools change make sense? | 03:17 |
axw | yes, just making some changes now | 03:17 |
axw | thanks for that | 03:17 |
wallyworld | np. just wanted to check you were unblocked | 03:17 |
wallyworld | thumper: there's a potential issue with checking if a host machine can run a container. we currently support "juju add-machine lxc" which means create a new machine and add an lxc container. in that case, it's not possible at add-machine time to know if a container is supported or not. there was talk at one stage of removing the syntax allowing just "lxc" or "kvm" and forcing <containertype>:<machineid> | 03:26 |
wallyworld | do you think we could remove the add-machine <containertype> syntax? | 03:27 |
wallyworld | since i think there was feedback we should be explicit about where containers go | 03:27 |
wallyworld | but even in that case, add-machine <container>:<machine> may need to block for a while if the machine is not up yet | 03:27 |
wallyworld | eg if the user has scripted add-machine followed by add-machine container | 03:28 |
thumper | wallyworld: I think it is reasonable for someone to request it | 03:30 |
thumper | wallyworld: even though it may not be fully actionable | 03:31 |
thumper | so if there is a pending machine for a particular machine | 03:31 |
thumper | and that machine on starting says "I don't support containers" | 03:31 |
wallyworld | i guess we could reflect that in status | 03:31 |
thumper | we need to be able to put that container into an error state | 03:31 |
thumper | since we are operating asyncly | 03:31 |
wallyworld | yes, ok | 03:31 |
thumper | we need to be able to handle this cleanly | 03:32 |
thumper | or at least | 03:32 |
thumper | cleanish | 03:32 |
thumper | o/ axw | 03:32 |
wallyworld | i can do a retry strategy to allow a little waiting if supported containers not known | 03:32 |
axw | heya thumper | 03:32 |
thumper | wallyworld: nah don't do that | 03:32 |
thumper | just expect it to be ok | 03:32 |
thumper | it is up to juju to manage conflicts | 03:32 |
wallyworld | ok | 03:32 |
thumper | don't block | 03:32 |
wallyworld | in many cases, we will hopefully know | 03:33 |
thumper | ack | 03:33 |
wallyworld | and can error immediately | 03:33 |
wallyworld | well, reject the add-machine command i mean | 03:33 |
* thumper nods | 03:48 | |
axw | wallyworld: I had to merge cmd/juju/bootstrap_test.go, and some other minor things. I forgot to remove the --source flag from cmd/juju | 06:21 |
axw | if you want to re-review let me know, otherwise I'll just push and land | 06:22 |
wallyworld | axw: otp, but i trust you :-) | 06:22 |
axw | ta | 06:23 |
wallyworld | jam: yeah, looks like it may be about to hail, got to race and get my son from cricket training | 06:35 |
jam | wallyworld: try not to get hurt :) | 06:35 |
wallyworld | will do :-) | 06:35 |
axw | jam: I'm looking at that uninstall-script thing again. What do you think about this alternative: store the agent's upstart service name in agent.conf, as well as a list of subordinate services (for the moment, just juju-db) | 07:33 |
axw | then we just stop/remove them | 07:33 |
axw | and rm -fr config.DataDir() | 07:34 |
axw | no opaque script, so upgrading should be simpler | 07:34 |
jam | axw: can't we just derive the upstart service name? We derive it when we create the service in the first place? | 07:35 |
jam | I will admit I don't know exactly what steps need to be done, I'm mostly just thinking about (a) how much can we just do so that when we change that list it is easy to do so | 07:36 |
axw | jam: could do that too. we'd need to tell the agent that it's a state server some way other than through the state database | 07:36 |
jam | axw: why is that? | 07:36 |
jam | I thought the idea for manual teardown was that the state machines go down last, so we still have the database a bit before they're dead (I think) | 07:37 |
axw | yes.. hmm, maybe that's ok | 07:37 |
jam | axw: at the very least, we determined what jobs we were running at startup, right? | 07:37 |
axw | I'll need to look at the conditions for ErrTerminateAgent again | 07:37 |
axw | yes | 07:37 |
jam | given we needed to, ya know, *do* them :) | 07:37 |
jam | axw: this *might* even make it clearer for the HA w/ manual stuff. If you add another node, that one may not start out as a state server, but become one later | 07:38 |
jam | so deciding what needs to be cleaned up just before you do it, *sounds* better to me. | 07:38 |
axw | hmm true, good point | 07:38 |
axw | jam: ErrTerminateAgent requires a state conn anyway (makes sense; db err could be transient), so yes, that'd be fine | 07:39 |
jam | axw: in general my experience with upgrades and Upstart is that we don't have a good way to change Upstart config once we've installed. So I'd like to avoid putting stuff in at that level. If it looks plausible to have the "this is how I clean myself up" clearly expressed inside the thing that is running, that sounds the best to me. | 07:40 |
jam | Actually, the best is to have the newest thing possible know how that thing should clean up (like Upgrade should do the clean up in the New code, etc) but some of that is tricky to do. | 07:41 |
axw | jam: I never would've modified upstart config itself, but agent.conf maybe. But anyway, it looks like it's all doable at runtime, without config changes. | 07:41 |
axw | I'll dig in | 07:42 |
jam | axw: so as for the specifics, I don't care if it is a script file that we generate and then run, vs commands we run directly, or whatever | 07:43 |
jam | I don't quite understand why setUninstallScript has a restore function | 07:43 |
jam | is that so it happens in a defer avoiding panic conditions? | 07:43 |
axw | jam: that was just so changes to AgentEnvironment are contained | 07:44 |
axw | after Configure returns, the original value is restored | 07:44 |
jam | axw: so I think I now understand *what* it does, but I don't quite understand why you don't want AgentEnvironment to stay changed | 07:47 |
axw | jam: it's only of philosophical value - I prefer input variables to be considered immutable | 07:47 |
jam | axw: so I can see where we may not want to mutate what the caller thinks, except if it's the whole point of the function :) | 07:48 |
jam | o | 07:49 |
jam | axw: it is suspicious that configure takes and returns a cloudinit.Config which is the same object | 07:49 |
jam | but it appears to be the whole point of the function to mutate the c that is passed in. | 07:49 |
jam | otherwise we should copy it, and return a new one | 07:49 |
jam | which I do prefer | 07:50 |
jam | but it would still be important that we don't unset the thing we just configured | 07:50 |
axw | jam: AgentEnvironment belongs to the MachineConfig (input), not cloudinit.Config | 07:50 |
axw | agreed that the c being input and output is odd - I changed that in another branch the other day :) | 07:50 |
axw | (removed the output) | 07:50 |
jam | axw: the only reason you might want in & out is because you want the caller to nil their object if there is an error, but it does seem like you either want an INOUT var or an IN and OUT but not an INOUT and an OUT | 07:51 |
jam | and if you *really* need the caller to nil, then take an **obj | 07:52 |
jam | cloudinit_test.go is the only place that doesn't pass it back into the same object (as far as I can tell) | 07:53 |
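
The exchange above is about whether Configure should mutate the config it is handed or return a fresh copy. A minimal sketch of the two shapes being contrasted, using a placeholder Config type rather than the real cloudinit.Config:

```go
package main

import "fmt"

// Config is a placeholder for cloudinit.Config.
type Config struct {
	Packages []string
}

// ConfigureInPlace mutates c; mutation is the whole point of the call,
// so it returns only an error (the INOUT shape).
func ConfigureInPlace(c *Config) error {
	c.Packages = append(c.Packages, "lxc")
	return nil
}

// ConfigureCopy leaves its input untouched and returns a new value
// (the IN/OUT shape), copying the slice so the caller's data is not shared.
func ConfigureCopy(c Config) (Config, error) {
	c.Packages = append(append([]string(nil), c.Packages...), "lxc")
	return c, nil
}

func main() {
	in := Config{Packages: []string{"mongodb"}}
	_ = ConfigureInPlace(&in)
	out, _ := ConfigureCopy(Config{Packages: []string{"mongodb"}})
	fmt.Println(in, out)
}
```
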
jam | morning fwereade and dimitern | 08:04 |
dimitern | morning | 08:04 |
rogpeppe | mornin' all | 08:07 |
axw | morning | 08:07 |
axw | jam: do you think it'd be horrible to just attempt stopping/removing the juju-db service, and ignore the ENOENT? | 08:09 |
axw | i.e. no check for state server | 08:09 |
rogpeppe | axw: what's the context? | 08:10 |
axw | rogpeppe: uninstalling mongo (juju-db) when destroying a manual provider env | 08:10 |
axw | currently the machine agent just removes its own upstart config, and exits | 08:11 |
rogpeppe | axw: when will it get ENOENT? | 08:12 |
axw | rogpeppe: if the machine agent is not a state server, then juju-db won't exist | 08:12 |
fwereade | jam, dimitern, rogpeppe, axw: mornings | 08:12 |
rogpeppe | fwereade: hiya | 08:12 |
axw | ahoj | 08:12 |
rogpeppe | axw: i *think* it sounds reasonable | 08:13 |
rogpeppe | axw: but i'd've thought it might be just as easy to check for state-serverness | 08:14 |
axw | yeah it probably is. just looking at the options | 08:14 |
axw | blind removal is tempting, because it keeps it all in one spot | 08:15 |
rogpeppe | axw: which spot is that? | 08:15 |
axw | rogpeppe: func (m *MachineAgent) uninstallAgent() error | 08:15 |
axw | cmd/jujud/machine.go | 08:15 |
rogpeppe | axw: presumably you could just pass isStateServer into that function (or machine.Jobs()) | 08:16 |
axw | rogpeppe: there's an error condition that the agent deals with that would cause termination, where the agent wouldn't be able to determine its jobs | 08:17 |
axw | i.e. the machine entry does not exist in state | 08:17 |
axw | but hey, maybe we don't care about nonsense like that :) | 08:17 |
rogpeppe | axw: hmm, i wondered if something like that was possible | 08:18 |
rogpeppe | axw: in that case, i think just delete and ignore ENOENT | 08:18 |
rogpeppe | axw: but... | 08:18 |
rogpeppe | axw: how can we know to destroy things if we can't get the jobs? | 08:19 |
rogpeppe | axw: don't the jobs arrive in the same reply as the machine life status? | 08:19 |
axw | rogpeppe: yes. if that returns not found or unauthorized, the agent terminates | 08:20 |
rogpeppe | axw: ah, of course | 08:21 |
rogpeppe | axw: in which case, i think that ignoring ENOENT is preferable to the alternative (caching locally whether we *did* have state server jobs) | 08:21 |
rogpeppe | axw: in a sense, the upstart config *is* that local cache | 08:22 |
axw | yeah, I'm thinking that too | 08:22 |
rogpeppe | axw: my only hesitation is whether there might be something else with a juju-db service that might get annoyed, but i think enough things will break in that case that we can safely ignore the possibility | 08:23 |
axw | rogpeppe: indeed, I came to the same conclusion | 08:24 |
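
A minimal sketch of the "blind removal" approach agreed above: stop and remove the juju-db upstart job unconditionally, treating a missing config file (ENOENT) as "this machine was never a state server", then remove the agent's own service and data dir. The helper names and paths below are illustrative, not the actual cmd/jujud/machine.go code:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// removeUpstartService stops the named upstart job and deletes its config.
// A missing config file is not an error: the service was never installed here.
func removeUpstartService(name string) error {
	// "stop" fails if the job is unknown or already stopped; ignore that
	// and rely on the config-file check below.
	_ = exec.Command("stop", name).Run()

	confPath := "/etc/init/" + name + ".conf"
	if err := os.Remove(confPath); err != nil && !os.IsNotExist(err) {
		return fmt.Errorf("cannot remove %s: %v", confPath, err)
	}
	return nil
}

func uninstallAgent(machineServiceName, dataDir string) error {
	// juju-db only exists on state servers; removal is a no-op elsewhere.
	for _, svc := range []string{"juju-db", machineServiceName} {
		if err := removeUpstartService(svc); err != nil {
			return err
		}
	}
	// Finally drop the agent's data directory, as discussed above.
	return os.RemoveAll(dataDir)
}

func main() {
	if err := uninstallAgent("juju-machine-0", "/var/lib/juju"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```
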
rogpeppe | fwereade: i think launchpad.net/juju-core/agent/bootstrap.go:123 is crackful and that it should use config.DefaultSeries. what do you think? | 08:53 |
rogpeppe | fwereade: although... hmm, maybe not | 08:54 |
rogpeppe | fwereade: in fact, no i think it's right | 08:55 |
rogpeppe | fwereade: ignore me :-) | 08:55 |
jam | axw: As long as what we are getting rid of is *clearly* a juju script (juju-db, juju-machine-0, etc) I think we're fine. | 08:55 |
jam | we can't run 2 juju's on a given machine without a lot of other pain | 08:55 |
rogpeppe | jam: yeah. it's a pity, that, really. | 08:56 |
jam | (You *might* be able to run a unit of one environment and the state server of another environment, but that just sounds terrible) | 08:56 |
jam | rogpeppe: well, we'd have to put in namespaces to do that | 08:56 |
axw | jam: you can with local, but these changes just won't work with local (which I think is reasonable) | 08:56 |
jam | /etc/init/juju-env-X-machine-0 | 08:56 |
rogpeppe | jam: yeah - we'd probably put the env uuid in there | 08:57 |
jam | axw: I'd think we'd want local to clean up properly | 08:57 |
axw | why? env.Destroy does that anyway | 08:58 |
jam | axw: well, we still want local environments to clean up properly, right? (it may be done in a different layer, but we might want to consider how to avoid redundancy as well) | 08:59 |
axw | jam: I consider this to be like freeing memory before exiting a process | 09:00 |
axw | there may be some use case in the future, but I don't see one right now | 09:01 |
axw | handling the local provider with non-standard service names takes us back to modifying agent.conf | 09:02 |
axw | jam, rogpeppe: https://codereview.appspot.com/28270043 -- take a look, let me know if you think it's worthwhile involving agent.conf to fix the local provider case | 09:50 |
mgz | mornin' | 10:00 |
jam | morning mgz | 10:02 |
jam | standup time | 10:45 |
jam | fwereade, rogpeppe, TheMue, https://plus.google.com/hangouts/_/calendar/am9obi5tZWluZWxAY2Fub25pY2FsLmNvbQ.mf0d8r5pfb44m16v9b2n5i29ig | 10:45 |
mattyw | fwereade, thanks for the reviews :) | 10:54 |
jam | fwereade: you seem to be having connection issues | 10:58 |
fwereade | jam, ha, even my g+ chats don't seem to be getting through | 11:02 |
fwereade | ian, ok, will do | 11:02 |
fwereade | wallyworld, ^ | 11:02 |
jam | fwereade: I got your "isn't that just 2" but that was the last one | 11:02 |
fwereade | grar, v quick break, we'll see if it's happier in 3 mins | 11:07 |
fwereade | wallyworld, fwiw, a watcher for "kinds of containers this machine is expected to run" would be easy, and was the original plan a while ago | 11:16 |
jam | fwereade: well we have "containers this machine is running" | 11:16 |
fwereade | jam, I thought that was container-type-specific | 11:16 |
jam | fwereade: right, that is what we are talking about. making one that is non-container type specific, but just doing "all" and reporting back errors for ones it doesn't support | 11:17 |
fwereade | jam, wallyworld: isn't the simplest way to do that to launch a provisioner task with a broker that always just errors on provisioning? | 11:20 |
jam | fwereade: the concern is that you're launching 5 different provisioners that will never do anything | 11:20 |
jam | and a lot of duplicate code | 11:20 |
jam | why not just run 1 that can handle N container types | 11:20 |
fwereade | jam, because watcher->broker is a simple clean chunk of functionality that already exists | 11:21 |
fwereade | jam, that is the *point* of a provisioner -- it watches a specific set of machines and provisions them using a specific broker | 11:21 |
fwereade | jam, adding multiple brokers into the mix complicates that unnecessarily | 11:21 |
fwereade | jam, compared to starting one provisioner for each kind of machine, and using a, ha, "null broker" when that machine kind is not known | 11:22 |
jam | fwereade: but why multiple watchers? | 11:22 |
jam | why not watch all possible containers? | 11:22 |
fwereade | jam, to avoid complicating the provisioner task, mainly | 11:23 |
jam | (it may be the internal DB structure doesn't support it well) | 11:23 |
fwereade | jam, I'm not sure it's in a great position to have 1->N-ness poked into it | 11:23 |
jam | fwereade: but what would an LXC provisioner do differently than a KVM one? | 11:23 |
jam | the commands are different, but that is a lower level | 11:23 |
fwereade | jam, talk to a different broker, where the two brokers are independent and needn't block one another | 11:24 |
jam | fwereade: overloading your system because you start an LXC and a KVM and an OpenVZ provisioner doesn't seem a better User Experience :) | 11:25 |
jam | I agree in the external vs local provisioning case | 11:25 |
fwereade | jam, seriously, a provisioner overloads the system? | 11:25 |
fwereade | jam, if that's actually the case then fair enough | 11:25 |
jam | fwereade: starting up a container does | 11:25 |
jam | apt-get update | 11:25 |
jam | starting 10 of them is actually quite bad (from reports I've seen) | 11:26 |
jam | someone did the local provider and really hosed his system | 11:26 |
fwereade | jam, sure, that was jorge doing deploy -n 50, but that was still with one single provisioner in play, so I think not germane | 11:26 |
fwereade | jam, the problem was that he asked for his system to be overloaded | 11:26 |
fwereade | jam, and besides the provider/container distinction I think holes your argument -- you're asking for two kinds of provisioner, a 1->1 and a 1->N one | 11:28 |
fwereade | jam, a single provisioner, that maps from machine-set to broker, seems like the clearest model | 11:28 |
jam | fwereade: I personally don't see why machine-set needs to split by type | 11:29 |
jam | I guess | 11:29 |
fwereade | jam, it doesn't *need* to be, but doing so extracts extraneous functionality from the provisioner, which doesn't need any additional complexity imo | 11:30 |
wallyworld | jam: fwereade: so at the moment, there's an lxcBroker which acts as a provisioner task, as well as a cloud instance provisioner task. i'm sure we'll iterate to get the best model, whether that's one provision task for all container types or one per container type. the kvm provisioner code is not done yet. let's see how it falls out. in the meantime, we will achieve the required user facing functionality wrt containers and all that | 11:37 |
wallyworld | ie users will be able to start supported containers, and get sensible errors if they try and start non-supported ones | 11:37 |
fwereade | wallyworld, ok, that's cool -- I'm just saying I will get a bit shirty if the kvm code requires a single kvm-specific line in the provisioner task itself ;p | 11:38 |
wallyworld | fwereade: as will i. don't fret too much. all i'm saying, it's a work in progress :-) | 11:39 |
fwereade | wallyworld, don't worry, I won't, I know you know what you're doing :) | 11:39 |
wallyworld | sometimes :-) | 11:39 |
wallyworld | fwereade: one of my aims is to get rid of the switch statements in the current provisioner so that the kvm/lxc logic is isolated behind an interface | 11:40 |
* fwereade cheers at wallyworld | 11:41 | |
wallyworld | scaling as discussed in the backscroll is a valid concern. we will have to look at that also as part of the solution | 11:41 |
wallyworld | and there's always a trade off between conceptual complexity, number of moving parts etc | 11:42 |
jam | fwereade: so the other logical thoughts are "what about new types", having to add yet-another-thing to monitor for another type that is normally not doing anything, etc. | 11:47 |
jam | we know today that people want openvz, vagrant, vmware, ... | 11:47 |
fwereade | jam, that is the last thing I want | 11:47 |
jam | fwereade: so the nice thing about having 1 "ContainerProvisioner" is that it can also not think about types it doesn't know, but it can still say "I don't know about that type, so here is your error", rather than nothing listening for the OpenVZ container type, so when you go to deploy it just sits in pending forever. | 11:48 |
jam | It seemed to smooth things out to have a generic one | 11:48 |
fwereade | jam, the machine agent asks for the types of the containers it's meant to be running; starts provisioners with appropriate brokers for those it understands, and null-broker provisioners for the ones it doesn't | 11:48 |
jam | but it does depend on how things align | 11:48 |
fwereade | jam, null brokers just error on StartInstance | 11:48 |
jam | fwereade: sure, though that does mean Machine Agents run N watchers and N brokers for however many types that we might support | 11:49 |
fwereade | jam, no, it means they run one provisioner for every container type they are currently using | 11:50 |
fwereade | jam, plus one task that starts/stops them | 11:50 |
jam | fwereade: "and null-broker provisioners for the ones it doesn't" is still N | 11:50 |
fwereade | jam, it's a very small N compared to the number of possible container types out there | 11:51 |
jam | I'm not sure I follow. | 11:51 |
fwereade | jam, it's only for those cases where someone deployed an invalid container before the machine came up and was able to set its supported types | 11:51 |
fwereade | jam, most machines just run a container-types watcher | 11:52 |
fwereade | jam, no containers? no types, thus no provisioners | 11:52 |
fwereade | jam, kvm and lxc and vagrant containers added? start 3 of them | 11:52 |
fwereade | jam, one of which is null | 11:52 |
jam | fwereade: I do finally see, though I don't think that has been reflected in the discussions so far. | 11:52 |
jam | wallyworld: does that make sense to you? ^^ | 11:53 |
fwereade | jam, although with a bit of cleverness in state we should be able to auto-error the unsupported ones *anyway* I think | 11:53 |
* wallyworld reads | 11:53 | |
jam | I haven't heard about the container-types as a separate thing being watched. | 11:53 |
fwereade | jam, that was the original idea a while back | 11:53 |
jam | fwereade: well, defense-in-depth is always useful. (how will this thing act if we poke something that could be construed as invalid) | 11:53 |
wallyworld | jam: yes, it makes sense as that's how i've implemented it. when a machine agent starts up, it determines what container types the host can support | 11:54 |
fwereade | jam, that's a benefit of having a null broker for unknown ones that might slip through | 11:54 |
wallyworld | it then watches for containers of those types to be requested | 11:54 |
wallyworld | once a container of a type is first requested, a provisioner task is started | 11:54 |
fwereade | jam, in the ideal case we should be able to squish those weird ones before they even make it to the agent anyway | 11:55 |
wallyworld | the provisioner task for a container type may well then do other stuff to prepare for the first container of a type to start | 11:55 |
fwereade | jam, the types watcher implementation on the server side could filter those out and error them itself, or possibly the SetSupportedContainers stuff could do so itself with a bit of nasty state prestidigitation | 11:56 |
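
A rough sketch of the design fwereade describes: the requested container types for a machine drive which provisioners get started, with a real broker per supported type and a "null broker" (which only errors on StartInstance) for types the host does not support, so those containers go straight to an error state instead of pending forever. The Broker interface and all names below are simplified stand-ins, not juju-core's actual provisioner API:

```go
package main

import "fmt"

// Broker is a simplified stand-in for the provisioner's broker interface.
type Broker interface {
	StartInstance(machineId string) error
}

// lxcBroker stands in for a real container broker.
type lxcBroker struct{}

func (lxcBroker) StartInstance(machineId string) error {
	fmt.Printf("provisioning lxc container %s\n", machineId)
	return nil
}

// nullBroker errors on every StartInstance for an unsupported type.
type nullBroker struct{ containerType string }

func (b nullBroker) StartInstance(machineId string) error {
	return fmt.Errorf("machine %s: container type %q not supported on this host", machineId, b.containerType)
}

// startProvisioners reacts to the requested container types (here just a
// slice, standing in for a watcher) and starts one provisioner per type:
// a real broker when the type is supported, a null broker when not.
func startProvisioners(requested []string, supported map[string]bool) map[string]Broker {
	provisioners := make(map[string]Broker)
	for _, kind := range requested {
		if _, ok := provisioners[kind]; ok {
			continue // already running one for this type
		}
		if supported[kind] {
			provisioners[kind] = lxcBroker{}
		} else {
			provisioners[kind] = nullBroker{kind}
		}
	}
	return provisioners
}

func main() {
	provs := startProvisioners([]string{"lxc", "openvz"}, map[string]bool{"lxc": true, "kvm": true})
	fmt.Println(provs["lxc"].StartInstance("2/lxc/0"))
	fmt.Println(provs["openvz"].StartInstance("2/openvz/0"))
}
```
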
rogpeppe | natefinch: any chance of seeing your proposed package interface, please? | 12:14 |
natefinch | rogpeppe: yeah, sure | 12:16 |
rogpeppe | natefinch: godoc output would be ideal | 12:16 |
* fwereade lunch | 12:16 | |
natefinch | rogpeppe: let me just whip up some godoc comments :) | 12:17 |
rogpeppe | natefinch: always write your doc comments before writing the code :-) | 12:17 |
natefinch | rogpeppe: :) I usually do... this was kind of exploratory coding, so I didn't. *shrug* | 12:19 |
rogpeppe | natefinch: np | 12:19 |
rogpeppe | natefinch: at this point i'm mostly interested to see if you've got AddPeer or SetPeers | 12:20 |
natefinch | rogpeppe: add and remove | 12:20 |
natefinch | rogpeppe: I could do set, that's actually easier than add or remove | 12:21 |
rogpeppe | natefinch: i'm wondering if set might be more appropriate for our use case | 12:21 |
rogpeppe | natefinch: although you might want to leave add and remove since you've already implemented them | 12:21 |
jam | fwereade, wallyworld: your models don't actually match. wallyworld starts by introspecting the machine, fwereade has a list of requested-container-types that you watch | 12:21 |
jam | so once you have the list, then they mostly match | 12:21 |
wallyworld | i introspect the machine to determine what container types are supported, then watch those only | 12:22 |
jam | wallyworld: which is *not* watching a list of requested container *types* | 12:22 |
jam | and it *is* starting watchers for each type the machine might support | 12:22 |
jam | rather than only ones that have already been requested | 12:23 |
wallyworld | till the first container is requested, it's a strings watcher for the first container instance yes | 12:23 |
jam | that's the key bit that I was missing at least. It may be that I'm misunderstanding the things you've said, but I do feel there is a communication gap between what fwereade is actually describing and what you have | 12:23 |
wallyworld | which then starts a suitable provisioner task | 12:24 |
jam | wallyworld: "first container instance" ? | 12:24 |
wallyworld | afaik, all we have at the moment is the ability to call WatchMachineContainers | 12:24 |
wallyworld | which triggers whenever a container is added to a machine | 12:24 |
jam | which requires a list of container types, right ? | 12:24 |
wallyworld | yes | 12:25 |
jam | wallyworld: so what fwereade is describing, is another watcher | 12:25 |
jam | which is watching the list-of-requested-container-types | 12:25 |
jam | rather than the list-of-known-supported-types | 12:25 |
jam | so we would still have a startup "these are the types I know how to support" | 12:25 |
wallyworld | i currently call WatchMachineContainers - what it gives when it triggers is the container ids | 12:25 |
jam | we actually ask back "and what types would you like me to run" | 12:25 |
jam | wallyworld: right, that is something we *also* need, but fwereade is giving us a layer where we don't start provisioners until the list of *requested* containers now contains them | 12:26 |
wallyworld | that's right, i only start a provisioner when the first container of a type is requested | 12:26 |
wallyworld | until then, it's a simple strings watcher | 12:27 |
jam | so WatchMachineContainers would be run by *each* of the LXCProvisioner and KVMProvisioner, with their personal subset. *But* we don't start one until this other field includes that type in the list. | 12:27 |
wallyworld | no | 12:27 |
jam | wallyworld: but what happens when you have another one | 12:27 |
wallyworld | the provisioners are not started | 12:27 |
jam | or one that is a type you didn't probe for | 12:27 |
jam | or | 12:27 |
jam | etc | 12:27 |
wallyworld | well, the machine agent has to know what possible containers to probe for | 12:27 |
wallyworld | cause there's different initialisation code required for each type | 12:28 |
wallyworld | so it has to be baked in to the system | 12:28 |
wallyworld | we can't suddenly support new container types | 12:28 |
wallyworld | without the code which knows how to set up for that | 12:28 |
wallyworld | which packages to apt-get install etc | 12:28 |
jam | wallyworld: to give a hypothetical. Wouldn't it be nice if someone could request an OpenVZ which Juju 2.2 knows about, it goes into the DB, but the agent on machine-2 is only running Juju 2.0 and can just say "sorry, container type X not supported" | 12:28 |
jam | wallyworld: I'm certainly not saying we support things we don't know how to support | 12:29 |
jam | but what about being able to give error messages about things we haven't heard about before | 12:29 |
wallyworld | i can certainly modify the code to do that | 12:29 |
wallyworld | that would be easy to do | 12:30 |
wallyworld | it would only be a small tweak | 12:30 |
wallyworld | actually | 12:30 |
wallyworld | that's how it works now | 12:30 |
wallyworld | when the machine agent starts up | 12:30 |
wallyworld | it sets the supported container list | 12:30 |
jam | wallyworld: and I *think* it actually handles the "you asked for KVM which I know about but I don't actually support that" as well as "you asked for OpenVZ which I know nothing about" | 12:30 |
wallyworld | and if juju client 2.2 comes along | 12:30 |
wallyworld | and asks for something new, it will error immediately | 12:31 |
jam | wallyworld: except if the machine hasn't finished starting yet, which means it will go off into lala land because nobody is checking for a type that wasn't baked in. | 12:31 |
jam | which is why the "give me the list of types that have been requested, so I can start things for them, and oh, these ones I don't know about so put them into error state" | 12:32 |
wallyworld | no, cause i'm still writing that code | 12:32 |
jam | similar to "these ones I know about but don't actually support" | 12:32 |
wallyworld | and i will be checking for requested stuff that's not supported | 12:32 |
wallyworld | in fact, the code i have does do that already | 12:32 |
jam | it isn't hard to say "if I don't know about it, it isn't supported" | 12:32 |
wallyworld | yes | 12:32 |
natefinch | rogpeppe: http://pastebin.ubuntu.com/6437236/ | 12:32 |
wallyworld | jam: the code in progress i have iterates over all requested containers, and sets status if not supported | 12:33 |
wallyworld | so that will pick up new container type XYZ | 12:33 |
jam | wallyworld: but it only does that at startup time ? | 12:33 |
adeuring | rogpeppe: could you have a look athis MP: https://codereview.appspot.com/28310043 ? | 12:33 |
rogpeppe | natefinch: why ...[]ReplsetMember ? | 12:33 |
rogpeppe | adeuring: looking | 12:33 |
wallyworld | jam: yes, but the iteration happens after the block has been established to prevent unknown containers from being requested | 12:33 |
adeuring | thanks! | 12:33 |
jam | and afterwards it starts watching for only the types that it does support | 12:34 |
rogpeppe | natefinch: wouldn't ...ReplsetMember be sufficient? | 12:34 |
jam | but it starts watching for *all* things that it might support | 12:34 |
wallyworld | yes | 12:34 |
natefinch | rogpeppe: sorry, I just changed that... yeah, that's what I meant | 12:34 |
wallyworld | jam: but only a strings watcher, not a provisioner | 12:34 |
jam | the model from fwereade is to put a Watcher on actually requested types, for those that are requested, start a provisioner watching for containers of that type. When the list of requested types changes, the watcher fires and starts a new provisioner which may be a "not supported provisioner" | 12:35 |
natefinch | rogpeppe: I always forget the exact syntax for the variable params arrays | 12:35 |
wallyworld | jam: i don't plan on starting any "not supported provisioners" | 12:35 |
rogpeppe | adeuring: LGTM | 12:35 |
wallyworld | no point | 12:35 |
jam | wallyworld: my point is if we do the fwereade one, we don't ever end up in a race condition where we might not notice when someone requests something we don't support, even if the client is buggy and doesn't actually respect the supported types field | 12:35 |
adeuring | rogpeppe: thanks! | 12:35 |
rogpeppe | natefinch: ... replaces [] | 12:35 |
wallyworld | buggy client, never :-) | 12:36 |
adeuring | fwereade: could you have a llok here: https://codereview.appspot.com/28310043 ? | 12:36 |
natefinch | rogpeppe: yeah, I remembered that after you mentioned it. | 12:36 |
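
For reference, the variadic point just discussed: the parameter is declared as ...ReplsetMember (not ...[]ReplsetMember), and an existing slice is expanded with a trailing "..." at the call site. The type and function names here are placeholders, not the proposed package's real interface:

```go
package main

import "fmt"

// ReplsetMember is a placeholder for the real member type.
type ReplsetMember struct{ Host string }

// Add takes a variadic ...ReplsetMember rather than ...[]ReplsetMember.
func Add(members ...ReplsetMember) {
	for _, m := range members {
		fmt.Println("adding", m.Host)
	}
}

func main() {
	// Call with individual values...
	Add(ReplsetMember{Host: "10.0.0.1:37017"})
	// ...or expand an existing slice with a trailing "...".
	peers := []ReplsetMember{{Host: "10.0.0.2:37017"}, {Host: "10.0.0.3:37017"}}
	Add(peers...)
}
```
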
wallyworld | jam: the back end is the thing that looks at the supported types field | 12:36 |
jam | so while we *could* write it the other way, this way handles the cases we do care about, saves resources internally (doesn't have to even start watchers for supported types that aren't in use), and handles failure more gracefully | 12:36 |
wallyworld | jam: the client just gets an error | 12:36 |
wallyworld | there's no logic in the client | 12:36 |
jam | wallyworld: buggy code | 12:37 |
jam | regardless of the client | 12:37 |
natefinch | rogpeppe: I'd add SetReplicas that just takes an array of ReplsetMember and replaces the set in the mongo document | 12:37 |
jam | anyway, your model can be made to work, the other just seems more robust and actually consumes less resources because you aren't even starting Watchers until one is requested | 12:37 |
wallyworld | jam: the client asks for a container xyz, that goes to the server side, the server side is the thing that error's | 12:37 |
jam | you start A watcher | 12:37 |
jam | and never N watchers | 12:37 |
rogpeppe | natefinch: i think i'd prefer to see bools rather than *bools in ReplsetMember, even if it means having another type for marshalling | 12:38 |
wallyworld | jam: from memory, i start one strings watcher per supported container type, after the supported container types have been set, preventing new unsupported ones from being requested | 12:38 |
jam | wallyworld: anyway, your design can certainly be made to work. My #1 point is that it doesn't actually match what fwereade is saying, and his does have an interesting benefit. | 12:38 |
wallyworld | i don't see the benefit just yet | 12:39 |
wallyworld | my current implementation doesn't allow unsupported containers, is robust to new clients, and doesn't start unnecessary watchers | 12:39 |
jam | wallyworld: benefit #1, for 90% of all machines that don't run any containers, they start 1 watcher of requested-container-types, even when we support 10 different Virtualization types | 12:39 |
natefinch | rogpeppe: the only problem with that is that buildindexes defaults to true if unset, which is annoying to do in go marshalling .... doable, but annoying. | 12:40 |
jam | so you *could* deploy to any of those types (say you are running on MaaS so you have full support for whatever you want). but then you still have only 1 watcher because you aren't actually using any of them. | 12:40 |
wallyworld | jam: the current WatchContainer method doesn't take a list | 12:40 |
rogpeppe | natefinch: then have NoIndexBuilding or something? | 12:40 |
wallyworld | it just takes a single container to watch for | 12:41 |
jam | wallyworld: absolutely. It needs code written | 12:41 |
wallyworld | hence right now I need N | 12:41 |
jam | the stuff we have today doesn't match the design fwereade stated we were trying for | 12:41 |
wallyworld | i could change it yes | 12:41 |
natefinch | rogpeppe: hrm... I sorta hate to modify the API for replicasets away from the Mongo documentation. | 12:41 |
jam | we need another Watcher to watch the list-of-requested-container-types | 12:41 |
jam | fwereade's claim is that was the design | 12:41 |
jam | so the data may already be present in the DB | 12:42 |
wallyworld | jam: so whether we start 1 initial watcher or N, it's essentially the same design | 12:42 |
wallyworld | jam: not list-of-requested-container-types, but list of supported container types | 12:42 |
rogpeppe | natefinch: we're still having the same defaults, so i think it's reasonable. (I think it would be quite unintuitive if the zero value of some of those *bools implied true and others false) | 12:42 |
wallyworld | no need to watch for those we don't support | 12:43 |
rogpeppe | natefinch: i think we can be go-idiomatic even while sticking reasonably close to mongo docs | 12:43 |
rogpeppe | natefinch: even better, we could include links to the relevant parts of the doc | 12:43 |
natefinch | rogpeppe: I'm just going off the defaults of what Mongo gives you. Not my fault they're inconsistent ;) To me, the pointer shows that they're optional, and the default is whatever mongo says the default is. | 12:44 |
rogpeppe | natefinch: i think it's going to be really awkward to build a ReplsetMember | 12:44 |
jam | wallyworld: sure, but if we don't support them they likely won't end up in the requested set either. | 12:44 |
rogpeppe | natefinch: i'd much prefer if it didn't need pointers to bool | 12:44 |
jam | wallyworld: the point is to handle failure modes "oh it *did* end up in the requested set somehow" | 12:45 |
rogpeppe | natefinch: or float, for that matter | 12:45 |
natefinch | rogpeppe: I guess pointers to booleans are annoying, it's true. | 12:45 |
jam | and still be able to respond to it | 12:45 |
wallyworld | jam: sure, my code does that now | 12:45 |
jam | rather than just staying in Pending forever | 12:45 |
jam | wallyworld: sure, but it *also* starts N watchers when there are 0 requested containers | 12:45 |
wallyworld | by my code i mean the stuff in progress | 12:45 |
natefinch | rogpeppe: the problem is that the defaults aren't zero for Priority or Votes. | 12:45 |
wallyworld | jam: it has to start a watcher (or N currently) | 12:46 |
jam | wallyworld: it *has* to start 1 | 12:46 |
jam | yes | 12:46 |
jam | but 1 != N | 12:46 |
wallyworld | so it knows when to kick off a new provisioner and prepare for that container type to be deployed | 12:46 |
rogpeppe | natefinch: tbh, i don't mind if we don't have defaults for those | 12:46 |
wallyworld | jam: sure, but with the api we have now, i have to start N | 12:46 |
jam | wallyworld: you were asking for fewer moving parts | 12:46 |
wallyworld | i can change that | 12:46 |
wallyworld | i'm just using what we have to get it working | 12:46 |
wallyworld | in a user-visible sense | 12:47 |
wallyworld | can iterate behind the scenes once it's running | 12:47 |
jam | wallyworld: mostly it felt like you didn't quite understand the design fwereade was talking about, because you certainly weren't designing it in the same fashion. I wanted to make sure we are actually having the same conversation | 12:47 |
natefinch | rogpeppe: now who's making it more difficult to create replica members? As it is, all you have to specify is the Host (actually that's something I should work out - the Ids of the members are really just their index in the list... I probably shouldn't expose them) | 12:47 |
jam | steps along the way, as long as we're actually headed in the same direction | 12:47 |
rogpeppe | natefinch: one possibility is to have a separate type, say MemberSettings, and have a value, say DefaultMemberSettings | 12:47 |
rogpeppe | natefinch: which holds all the usual defaults | 12:47 |
wallyworld | but i am designing it in the same way - we just differ on the initial watcher | 12:47 |
wallyworld | 1 vs N | 12:48 |
wallyworld | or at least i think so | 12:48 |
natefinch | rogpeppe: likely, most people will only use Host as a value. The rest are rarely used, other than ArbiterOnly | 12:48 |
wallyworld | jam: i haven't actually talked to fwereade about this at all, just going by scanning the scrollback | 12:48 |
natefinch | rogpeppe: (at least from what the documentation says, it seems unlikely they're options that are used very often, since the doc says stuff like "generally you shouldn't change this" etc) | 12:49 |
jam | wallyworld: you aren't watching a list-of-requested-container-types at all, which is quite different from what he described. | 12:49 |
wallyworld | the core concepts though i think are in alignment - doesn't start provisioners until needed, delay host init until needed etc | 12:49 |
jam | I think after you have a watcher of something requested | 12:49 |
jam | you're doing the same thing | 12:49 |
wallyworld | maybe you and i differ on terminology | 12:50 |
wallyworld | i think i am watching a list of supported container types | 12:50 |
jam | (15:16:00) fwereade: wallyworld, fwiw, a watcher for "kinds of containers this machine is expected to run" would be easy, and was the original plan a while ago | 12:50 |
natefinch | rogpeppe: I like the default value of the struct.. that's not a bad idea. I just think it's less straightforward than just doing it the way I have it. | 12:50 |
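
A sketch of the two shapes being debated for replica-set member configuration: pointer fields where nil means "use mongo's default", versus plain values plus an explicit defaults value callers copy and then override (the MemberDefaults idea). Field names and default values below are assumptions for illustration, not the proposed package's actual interface:

```go
package main

import "fmt"

// MemberPtr uses pointers so nil means "use mongo's default"; the zero
// value is honest, but members are awkward to construct.
type MemberPtr struct {
	Host         string
	ArbiterOnly  *bool
	BuildIndexes *bool
	Priority     *float64
	Votes        *int
}

// Member uses plain values plus an explicit defaults value to copy from.
type Member struct {
	Host         string
	ArbiterOnly  bool
	BuildIndexes bool
	Priority     float64
	Votes        int
}

// DefaultMember mirrors mongo's usual defaults: indexes built, priority 1, one vote.
var DefaultMember = Member{BuildIndexes: true, Priority: 1, Votes: 1}

func main() {
	m := DefaultMember
	m.Host = "10.0.0.1:37017"
	fmt.Printf("%+v\n", m)
}
```
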
jam | wallyworld: no, you are starting N watchers on each type that you support | 12:50 |
jam | which is not a watcher of "what types have been requested for this machine" | 12:51 |
wallyworld | jam: yes - a list of supported container types = a list of kinds this machine is expected to run | 12:51 |
wallyworld | jam: no, 1 watcher on each type = N | 12:51 |
wallyworld | not N watchers on each type | 12:51 |
wallyworld | and that's only because there's no api to support one watcher for all supported types | 12:52 |
wallyworld | yet | 12:52 |
jam | so, yes we were talking past eachother a bit | 12:52 |
jam | N watchers, 1 for each type you support | 12:52 |
jam | vs | 12:53 |
jam | wallyworld: you are starting N watchers | 12:53 |
jam | vs | 12:53 |
jam | starting 1 watcher that gets a list of things that it should then start watchers for | 12:53 |
wallyworld | i don't get the last line, but | 12:54 |
wallyworld | i would only like to start 1 watcher for the N container types | 12:54 |
wallyworld | when the api supports that | 12:54 |
wallyworld | it would be a small change to what we do now | 12:54 |
jam | wallyworld: if I have a doc in the DB, that aggregates all container types requested for this machine, which is just a list of [LXC, KVM], though in the common case is just [] | 12:54 |
jam | fwereade's contention was that "that was the original design" | 12:54 |
jam | which clearly didn't match what is being implemented | 12:55 |
wallyworld | ok, i see now | 12:55 |
jam | and I was trying to be clear about where he at least thought we were going | 12:55 |
wallyworld | i watch for container ids | 12:55 |
wallyworld | and on the first one, start the provisioner | 12:55 |
wallyworld | essentially the same thing | 12:55 |
wallyworld | with less moving parts | 12:55 |
wallyworld | cause the db model is simpler | 12:55 |
jam | wallyworld: except in the common case each machine-agent starts up N watchers, which consumes resources in the API server | 12:55 |
jam | to watch for things that will never actually have changes | 12:56 |
wallyworld | yes, but it will be one | 12:56 |
wallyworld | when the api is fixed | 12:56 |
jam | wallyworld: so I was proposing that, but fwereade seemed to think it was bad | 12:56 |
jam | and prefered the "original design" | 12:56 |
jam | which is why this thread got started | 12:56 |
jam | so I've at least illuminated where you two differ | 12:56 |
jam | and I will happily step back and let fwereade and you finish the conversation | 12:56 |
wallyworld | seems so :-) sorry, i was not getting it at first | 12:56 |
jam | wallyworld: neither did I. I just got it slightly sooner than you. | 12:57 |
wallyworld | well, i've had a few glasses of wine here by now, it's almost 11pm :-) | 12:57 |
jam | It wasn't until (15:53:01) jam: wallyworld: does that make sense to you? ^^ | 12:57 |
jam | that I got what the difference was | 12:57 |
wallyworld | i think i skimmed that bit, sorry :-) | 12:58 |
wallyworld | in any case, i'd like to finish what i currently have - it works, is robust to different client versions | 12:58 |
jam | wallyworld: np, I didn't catch that you weren't understanding the "starting N watchers, 1 of each type" thing | 12:58 |
wallyworld | and we can argue about how to tweak it | 12:58 |
wallyworld | jam: yeah, i would have preferred to have only 1 watcher, but thought getting the new api done would take too much time | 12:59 |
wallyworld | and i wanted to try to have kvm done for this week | 12:59 |
wallyworld | cause the new api may well involve back end changes | 12:59 |
wallyworld | to the watcher infrastructure | 13:00 |
jam | wallyworld: sure. I think the discussion is just whether we have a higher-level watcher that then fires off these sub ones, or an aggregate watcher across them. | 13:01 |
wallyworld | yeah. once the provisioner is started, it essentially becomes the watcher | 13:01 |
jam | yeah | 13:01 |
wallyworld | all the work to add the supported containers db model, and the code to update status etc is essentially independent of the initial watcher thing | 13:02 |
wallyworld | hence i can get that done and provide the user visible functionality up front | 13:03 |
wallyworld | and tweak the behind the scenes stuff later | 13:03 |
rogpeppe | natefinch: why does InitReplicaSet need to be run on the same machine when initially setting up a replica set? | 13:13 |
=== gary_poster|away is now known as gary_poster | ||
natefinch | rogpeppe: hmm.. I thought it had to be so mongo would know to replicate the stuff in this mongo instance, but it looks like I was wrong. The docs don't mention that, so I'll remove it. | 13:15 |
rogpeppe | natefinch: i actually didn't think it was possible to change an existing mongo to use a replica set without restarting it, but presumably you found out a way? | 13:16 |
natefinch | rogpeppe: oh, no... this code is outside that. I didn't write the restarting code yet, since that's outside the scope of what mgo can do. But that's pretty trivial, a couple exec commands | 13:17 |
natefinch | rogpeppe: which is to say, yes, you have to restart mongo | 13:17 |
rogpeppe | natefinch: i think that perhaps we can ignore that - we'll perhaps use a different upstart name, and make sure that the old one is removed before ensuring the new one exists. | 13:18 |
rogpeppe | natefinch: do we actually need an InitReplicaSet call then? | 13:18 |
natefinch | rogpeppe: you have to do both, the flag on mongo startup and initreplicaset | 13:19 |
natefinch | rogpeppe: brb | 13:20 |
jam | natefinch, rogpeppe: do you have to restart mongo anytime you change the replica set ? It looks like you have to set the startup flag, but we could just do that always, right? | 13:25 |
rogpeppe | jam: no you don't | 13:26 |
jam | rogpeppe: so http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/ certainly says you can't take a running service and make it HA without stopping it | 13:26 |
rogpeppe | jam: we need to pass in the replica set name, but otherwise i think there's no need to restart | 13:26 |
jam | unless we default to always starting in a replica set with just 1 entry | 13:26 |
rogpeppe | jam: that's what we'd do, i think | 13:26 |
rogpeppe | jam: unless you can think of a reason that's a bad idea | 13:26 |
natefinch | jam, rogpeppe: so are you saying, always start mongo with the flag, but then just don't do replsetinitiate? | 13:27 |
rogpeppe | natefinch: i think so. i'm not quite sure what your InitReplicaset function is doing though. | 13:28 |
jam | rogpeppe: http://paste.ubuntu.com/6437458/ | 13:28 |
jam | ah, "we'd do" | 13:29 |
jam | not "thats what we already do" | 13:29 |
natefinch | rogpeppe: its what actually sets up the replica... it passes in the list of replicas. I don't know what it does behind the scenes. I can test what happens if you start mongo with --repl and try to use it as an individual database | 13:29 |
rogpeppe | natefinch: from an API user's p.o.v., i'd prefer not to have to call InitReplicaset ever | 13:30 |
jam | natefinch: so it sounds like we'd really like to do all of that work in "bootstrap-state" | 13:30 |
jam | so that we have *a* replica set with just 1 entry | 13:30 |
jam | natefinch: rs.initiate() takes an *optional* configuration | 13:30 |
jam | which sure sounds like the initial value can just be 1 node | 13:31 |
natefinch | jam: good point | 13:31 |
rogpeppe | my experiments seemed to show it worked fine with just one node in the replica set | 13:31 |
rogpeppe | there's one slight problem though | 13:32 |
rogpeppe | in bootstrap-state, we don't necessarily know the machine's address | 13:32 |
rogpeppe | or... do we? | 13:32 |
jam | rogpeppe: again, rs.initiate() can just be called without passing in anything | 13:32 |
jam | let mongo sort it out | 13:32 |
jam | we can change it later when we expand it | 13:32 |
rogpeppe | jam: it might not sort it out correctly | 13:32 |
rogpeppe | jam: at least on my laptop, it got the wrong address | 13:33 |
jam | rogpeppe: there is a warning that you shouldn't use localhost for a member unless all entries are on localhost | 13:33 |
jam | rogpeppe: *but* couldn't we do that when expanding? | 13:33 |
jam | certainly it doesn't matter when there are no entries | 13:33 |
rogpeppe | jam: yeah, you're probably right | 13:33 |
jam | well, when there are no *other* mongod's | 13:33 |
jam | mongods ? | 13:33 |
jam | natefinch: extra exciting is that the docs *explicitly* say you shouldn't use mongorestore to seed the new guys, but you *could* snapshot the filesystem when mongo is in a consistent state. | 13:35 |
jam | sounds like "stop mongod, snapshot the filesystem, then start it again" | 13:35 |
jam | which is pretty terrible. | 13:35 |
jam | I'm hoping as long as you haven't written data you don't *have* to | 13:36 |
jam | it just needs a really long oplog | 13:36 |
jam | yeah, while earlier in http://docs.mongodb.org/manual/tutorial/expand-replica-set/ it says "you can seed it this way", the next section says "you should not have any data already" | 13:36 |
natefinch | jam: It should just sync | 13:37 |
natefinch | jam: yeah, adding replicas with data in them would be bad | 13:37 |
jam | so a little Schizophrenic | 13:37 |
natefinch | brb, diaper | 13:37 |
jam | natefinch: so it sounds like as long as you have an exact FS snapshot, then you'd be ok | 13:37 |
jam | probably mongorestore doesn't preserve some oplog property | 13:37 |
jam | it does sound like you just want to start empty | 13:38 |
jam | I don't think we want to try producing a stable snapshot to copy on our own | 13:38 |
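
A minimal sketch of the bootstrap-time approach discussed above: initiate a single-member replica set using labix.org/v2/mgo. It assumes mongod is already running locally with --replSet juju; the set name, address, and port are illustrative only:

```go
package main

import (
	"fmt"
	"log"

	"labix.org/v2/mgo"
	"labix.org/v2/mgo/bson"
)

func main() {
	// Direct connection: this node is not yet part of an initiated set,
	// so normal replica-set discovery would not find a primary.
	session, err := mgo.DialWithInfo(&mgo.DialInfo{
		Addrs:  []string{"localhost:37017"},
		Direct: true,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()
	session.SetMode(mgo.Monotonic, true)

	// Explicit single-member config; rs.initiate() with no argument also
	// works, but may guess the wrong address, as noted above.
	cfg := bson.M{
		"_id": "juju",
		"members": []bson.M{
			{"_id": 0, "host": "localhost:37017"},
		},
	}
	var res bson.M
	if err := session.Run(bson.D{{"replSetInitiate", cfg}}, &res); err != nil {
		log.Fatal(err)
	}
	fmt.Println("replica set initiated:", res)
}
```
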
rogpeppe | natefinch: here's an idea for a possible package interface. it's pretty close to what you have now, semantically, but with somewhat different names: http://paste.ubuntu.com/6437493/ | 13:39 |
rogpeppe | natefinch: need to go to lunch, back soon | 13:40 |
jam | rogpeppe: interestingly, it does look like you have to add replica set members one by one, and they must already be started | 13:41 |
natefinch | jam: pretty sure I've tried adding them before they were started, but I should double check | 13:42 |
jam | natefinch: it does appear that you could set the configuration | 13:42 |
jam | and then they should come online "by magic" as the command doesn't look to be synchronous | 13:42 |
jam | but the docs certainly tell you to start them first | 13:42 |
jam | presumably you could add them, and it would just go into non-quorum state | 13:43 |
jam | which might be pretty bad if you are going from 1 to 3 | 13:43 |
jam | rather than adding 1, waiting for the sync to finish, then adding another | 13:43 |
natefinch | jam: isn't 2 a problem either way? | 13:44 |
jam | natefinch: might be worth trying. create a bunch of data, start 1, add 2 and see if it accepts more data while it is bringing 2 and 3 up | 13:44 |
jam | natefinch: so if you start with 1, and add 1, you still work, though if either one fails you've lost quorum | 13:44 |
jam | I think | 13:44 |
jam | but at least it should still take writes (i would think) | 13:44 |
jam | as it knows it is the elected master | 13:44 |
jam | if you start 1 and add 2 | 13:44 |
jam | maybe it is still true | 13:44 |
jam | that it knows it is the master by election process | 13:44 |
jam | worth trying to see if adding 2 immediately puts it into "unavailable until sync is done" | 13:45 |
jam | natefinch: anyway, if you add 2 and they aren't up yet | 13:45 |
natefinch | jam: it definitely says it re-elects when you remove a replica, but doesn't say it does when you add... so it's possible it'll just work | 13:45 |
jam | it should refuse writes | 13:45 |
jam | "should" | 13:45 |
natefinch | jam: yeah, lotta shoulds. I'll do tests and figure out what it *does* do :) | 13:46 |
jam | they may get put in some sort of "pending" nodes that don't actually change quorum | 13:46 |
jam | natefinch: hopefully you don't have to test across version permutations | 13:46 |
natefinch | jam: versions of mongo? | 13:46 |
jam | natefinch: right | 13:47 |
natefinch | jam: well, they just released 2.4 recently, and it looks like they're on about an 18 month cycle, so I think we're good for a while | 13:48 |
jam | natefinch: http://engineering.foursquare.com/2011/05/24/fun-with-mongodb-replica-sets/ is interesting, though I don't think we'll actually be setting up hidden backup nodes | 13:48 |
jam | natefinch: except we're running 2.2.? in production today :) | 13:48 |
natefinch | jam: oh. | 13:48 |
jam | so we know we need at least 2 versions | 13:48 |
natefinch | jam: I guess testing 2.2.x and 2.4 is probably a good idea. Are we likely to start using 2.4 soon? What determines that? | 13:49 |
jam | natefinch: #1 thing is what version will be in Trusty | 13:49 |
jam | but I'm quite sure we're stuck with 2.2 for precise->saucy for a while | 13:50 |
jam | if we want to go to 2.4 we probably have to get it into trusty real-soon-now | 13:50 |
jam | jamespage: ^^ do you know the plans for upgrading MongoDB version? I'm guessing we don't want to be using mongodb 2.2 in 3 years | 13:50 |
jam | anyway, dinner time here | 13:51 |
jam | see you all later | 13:51 |
jamespage | jam: trusty already has 2.4 | 13:51 |
jamespage | so did saucy | 13:51 |
natefinch | jamespage: cool, thanks | 13:51 |
jamespage | natefinch, np | 13:51 |
jam | natefinch: so in other words, we already deploy to 2.2 and 2.4 | 13:52 |
jam | given ppa:juju/stable is running 2.2 | 13:52 |
jam | for P | 13:52 |
jam | jamespage: unless my "apt-cache madison" is lying somehow :) | 13:53 |
jamespage | jam: yes - but juju auto-adds cloud-tools which contains 2.4 | 13:53 |
rogpeppe | natefinch: do you know if there's any way of asking whether replica set members are up to date with the log? | 13:58 |
=== adeuring1 is now known as adeuring | ||
adeuring | jam: could you have another look here: https://codereview.appspot.com/28310043/ ? | 14:09 |
natefinch | rogpeppe: I'm not sure | 14:11 |
rogpeppe | natefinch: i've just found it | 14:12 |
rogpeppe | natefinch: http://docs.mongodb.org/v2.2/reference/replica-status/#repl-set-member-statuses | 14:12 |
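
A small sketch of reading the member states that the linked replica-status page describes, by running replSetGetStatus through labix.org/v2/mgo. It assumes an already-initiated replica set reachable at localhost:37017; address and field handling are illustrative:

```go
package main

import (
	"fmt"
	"log"

	"labix.org/v2/mgo"
	"labix.org/v2/mgo/bson"
)

func main() {
	session, err := mgo.Dial("localhost:37017")
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// replSetGetStatus reports each member's state ("PRIMARY", "SECONDARY",
	// "RECOVERING", ...) and its optime, which indicates how far behind it is.
	var status bson.M
	if err := session.Run("replSetGetStatus", &status); err != nil {
		log.Fatal(err)
	}
	for _, m := range status["members"].([]interface{}) {
		member := m.(bson.M)
		fmt.Println(member["name"], member["stateStr"])
	}
}
```
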
rogpeppe | natefinch: what do you think of my proposed package interface, BTW? | 14:12 |
rogpeppe | natefinch: i tried to formulate it from the top down as something i'd like to use rather than from the bottom up | 14:13 |
rogpeppe | natefinch: http://paste.ubuntu.com/6437493/ in case you missed it | 14:13 |
natefinch | rogpeppe: saw it | 14:13 |
natefinch | rogpeppe: mostly looks good to me. The one problem I have with memberdefaults is that if anyone just constructs members to pass in without noticing they should use the defaults... they'll get pretty bad defaults (no votes, 0 priority, and no indexes) | 14:18 |
rogpeppe | natefinch: yeah; i think that's ok though. the defaults are there and obvious. | 14:19 |
natefinch | rogpeppe: hrmph. It's not horrible, but not my favorite thing. The defaults on the struct are not what the struct actually defaults to. | 14:22 |
sinzui | fwereade, rogpeppe: can either of you help me triage this bug? Is it really in Juju? Do we commit to fix it in the next 6 months? Bug #1250965 | 14:24 |
_mup_ | Bug #1250965: Loopback mounts do not work with local provider <local-provider> <juju-core:New> <swift-storage (Juju Charms Collection):New> <https://launchpad.net/bugs/1250965> | 14:24 |
rogpeppe | natefinch: yeah, maybe better to leave out the "defaulting to" remarks and just refer to MemberDefaults in the Member doc comment | 14:24 |
rogpeppe | sinzui: looking | 14:24 |
* rogpeppe looks up "loopback mounts" | 14:25 | |
rogpeppe | sinzui: by my very limited understanding of the issue, it looks like something we could probably fix soon and easily | 14:27 |
rogpeppe | sinzui: and that we should do | 14:27 |
sinzui | thank you! | 14:27 |
rogpeppe | sinzui: but there might be security or other issues that i'm not aware of | 14:28 |
sinzui | understood. | 14:28 |
dimitern | rogpeppe, ping | 14:46 |
rogpeppe | dimitern: pong | 14:46 |
dimitern | rogpeppe, what's the preferred way to get the environ from an api connection? | 14:47 |
rogpeppe | dimitern: juju.NewConnFromState | 14:47 |
dimitern | rogpeppe, if I use NewAPIClientFromName I only get the api client, not the underlying APIConn, which has both state and environ | 14:47 |
rogpeppe | dimitern: best to avoid the necessity if possible though | 14:48 |
rogpeppe | dimitern: which call is this that needs it? | 14:48 |
dimitern | rogpeppe, I need something like NewConnFromState, but connecting to the API and returning the APIConn | 14:48 |
dimitern | rogpeppe, upgrade juju needs an environ in order to call FindTools with it | 14:49 |
rogpeppe | dimitern: oh, i see, as an agent | 14:49 |
dimitern | rogpeppe, as a client | 14:49 |
dimitern | rogpeppe, right now conn.Environ is used to get the environ in the command | 14:50 |
rogpeppe | dimitern: cfg, err := st.EnvironConfig(); env, err := environs.New(cfg) | 14:50 |
dimitern | rogpeppe, ah, ok, so I can call client.EnvironmentGet() and use that to construct an environ object | 14:51 |
rogpeppe | dimitern: yeah | 14:51 |
dimitern | rogpeppe, cheers | 14:51 |
rogpeppe | dimitern: although... | 14:51 |
rogpeppe | dimitern: we might possibly want to provide a way for a client to find tools without necessarily providing them with the whole environ config | 14:52 |
rogpeppe | dimitern: so there may well be an argument for a new API call here | 14:52 |
rogpeppe | fwereade: what thinkest thou? | 14:52 |
dpb1 | fwereade: ping | 14:57 |
dimitern | rogpeppe, I realized I don't need to implement anything other than client.SetEnvironAgentVersion() in the API, and use EnvironmentGet() initially | 14:57 |
rogpeppe | dimitern: sounds good | 14:57 |
jcsackett | sinzui, abentley: either of you free to look at https://code.launchpad.net/~jcsackett/charmworld/better-jobs/+merge/195443 ? | 15:05 |
abentley | jcsackett: sure. | 15:06 |
jcsackett | also, do we have a new "ping the team" word, since we're not orange anymore? | 15:06 |
jcsackett | thanks, abentley. | 15:06 |
abentley | Maybe juju-qa? | 15:06 |
abentley | jcsackett: It looks like you've added tests for your github changes, but not askubuntu. | 15:10 |
jcsackett | abentley: that's true, since i didn't think it was really changing askubuntu execution. | 15:10 |
jcsackett | abentley: oh wait, the backoff thing should have a test. | 15:11 |
abentley | jcsackett: That's what I was thinking. | 15:11 |
jcsackett | abentley: dig, i'll add that. | 15:11 |
sinzui | fwereade, I see a report that using the potential 1.16.4 client causes a problem when the state-server is 1.16.3: ERROR no such request "DestroyMachines" on Client. | 15:33 |
* sinzui is attempting to reproduce | 15:33 | |
fwereade | sinzui, oh *shite* | 15:33 |
fwereade | sinzui, ofc it's reproable, I am an idiot, I even thought of it and then forgot it | 15:34 |
* rogpeppe sees 1.16.5 arriving pronto | 15:34 | |
fwereade | sinzui, *unless* I convinced myself that it was an expected and transient error | 15:34 |
fwereade | rogpeppe, that'll have the same problem | 15:34 |
sinzui | fwereade, we do not normally see this in tests because they assume you are savvy enough to upload your tools if you have a release candidate, or that we have released the actual tools | 15:34 |
rogpeppe | fwereade: unless 1.16.5 rolls back some client changes i guess | 15:35 |
fwereade | rogpeppe, and thus rolls back the bugfix | 15:35 |
rogpeppe | fwereade: the bugfix can't apply client-side? i guess not unless you factor out stuff to statecmd | 15:36 |
sinzui | 1.16.4 is not out. We are going to release today, I think. | 15:36 |
rogpeppe | fwereade: sorry, i should have thought of this in my review | 15:36 |
sinzui | fwereade, caribou reported the issue. | 15:36 |
fwereade | rogpeppe, or tangles the source tree by introducing a 1.16-only statecmd bit | 15:36 |
sinzui | I gave him the script that makes a package | 15:36 |
fwereade | sinzui, all praise to caribou | 15:36 |
rogpeppe | fwereade: i'm not quite sure what you're thinking of there | 15:37 |
fwereade | sinzui, I don't suppose it's reasonable to ask people to upgrade both server and client if they want the bugfixes? | 15:37 |
fwereade | rogpeppe, the more 1.16 diverges from the shape of trunk the harder it will be to maintain -- I don't want to make that experience suck until we have 1.18 out, at which point we needn't worry about 1.16 so much anyway | 15:38 |
sinzui | fwereade, I consider it a bug if the client ever selects a newer server. | 15:38 |
fwereade | sinzui, yeah, normal use will lead to breakage | 15:38 |
rogpeppe | fwereade: ah, bit==piece, not 1-or-0 | 15:38 |
sinzui | fwereade, I do think it is reasonable to say upgrade your client, then upgrade the server | 15:38 |
* fwereade kicks himself around a bit | 15:39 | |
jcsackett | abentley: tests are pushed up. | 15:39 |
fwereade | sinzui, new server with old client still works, but doesn't allow for --force, right? | 15:39 |
fwereade | sinzui, it's just old server with new client? | 15:40 |
sinzui | fwereade, I am not sure, caribou has stepped away for a bit | 15:40 |
abentley | jcsackett: This also adds remove_server_start_time.py. Is that deliberate? | 15:42 |
sinzui | fwereade, this is the background I have about the issue: after the report of the error: http://pastebin.ubuntu.com/6438018/ | 15:43 |
abentley | Oh, I guess that's a merge. | 15:43 |
abentley | jcsackett: r=me. | 15:44 |
sinzui | fwereade, "even without the --force it fails" | 15:53 |
sinzui | fwereade, basic "juju terminate-machine 1" fails with the message mentioned previously | 15:53 |
fwereade | sinzui, I think it is clear -- I backported the DestroyMachines and DestroyUnits API methods to 1.16.4, so 1.16.3 client still works by talking direct to the db | 15:55 |
fwereade | sinzui, but 1.16.4 client expects the APIs to exist, and a 1.16.3 server does not have them | 15:55 |
fwereade | sinzui, FWIW this will also break destroy-unit in the same circumstances | 15:55 |
sinzui | fwereade, This issue might also be alleviated with "best practice". I have advised "juju upgrade-juju --version=1.16.4" to be clear about putting everything on the same version | 15:56 |
fwereade | sinzui, well, if we can be very clear about it in the release notes, it *does* expose very useful new functionality | 15:57 |
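One way a 1.16.4 client could have coped with a 1.16.3 server, sketched very loosely: try the new API call and fall back to the old path when the server reports it doesn't know the request. The error check and the fallback helper are hypothetical, not what juju-core actually does.

```go
package compat

import (
	"strings"

	"launchpad.net/juju-core/state/api"
)

// destroyMachinesCompat prefers the new DestroyMachines API call but falls
// back to an older code path when the server predates the call.
func destroyMachinesCompat(client *api.Client, ids ...string) error {
	err := client.DestroyMachines(ids...)
	if err != nil && strings.Contains(err.Error(), "no such request") {
		// Old server: use the pre-API implementation instead.
		return destroyMachinesViaState(ids...)
	}
	return err
}

// destroyMachinesViaState stands in for the 1.16.3-style implementation
// that talked directly to the database; it is only a placeholder here.
func destroyMachinesViaState(ids ...string) error {
	return nil
}
```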
rogpeppe | natefinch: here's the replicaset package interface suggestion with status added, FWIW: http://paste.ubuntu.com/6438102/ | 16:06 |
jcsackett | abentley: thanks. | 16:08 |
dimitern | fwereade_, rogpeppe: upgrade-juju + api https://codereview.appspot.com/21940044/ PTAL | 16:16 |
natefinch | rogpeppe: reading it | 16:20 |
natefinch | rogpeppe: are you running Mongo 2.2 or 2.4? | 17:01 |
rogpeppe | natefinch: 2.2.4 | 17:01 |
rogpeppe | natefinch: ah, i was looking at the 2.2 docs when i was doing that package description | 17:02 |
natefinch | rogpeppe: yeah, figured. I was poking at mongo and noticed some more info in status, but it must be added in 2.4 | 17:03 |
rogpeppe | TheMue: ping | 17:12 |
* fwereade_ will bbl | 17:12 | |
TheMue | rogpeppe: pong | 17:12 |
rogpeppe | TheMue: i've just been looking at https://codereview.appspot.com/24040044 again | 17:12 |
TheMue | rogpeppe: yep | 17:12 |
rogpeppe | TheMue: it still doesn't seem quite right to me, unless i'm missing something | 17:12 |
TheMue | rogpeppe: ok, I'm listening | 17:13 |
rogpeppe | TheMue: if a connection drops, what cleans up the pingTimeout? | 17:13 |
TheMue | rogpeppe: if it drops, Ping() isn't called, so the timer isn't reset; after 3 minutes there's a timeout which calls the passed action. and here rpcConn.Close() is called, which also calls Kill() on the root (it implements the killer interface, but that already existed) | 17:15 |
TheMue | rogpeppe: in the initial code Ping() already existed, but with no code inside, only a comment | 17:15 |
natefinch | rogpeppe: I'm going to go with your suggestion and move the code I have over to it (it's really just some minor changes). I don't have the status code written, but that should be easy. | 17:15 |
rogpeppe | natefinch: cool, thanks. | 17:16 |
natefinch | rogpeppe: one thing - is it really that useful to return maps of statuses and members | 17:16 |
rogpeppe | TheMue: so if a client drops a connection, the goroutine will remain around for up to 3 minutes. that seems a bit wrong to me - surely we can clean it up? | 17:17 |
rogpeppe | natefinch: i dunno, i wondered about that | 17:17 |
rogpeppe | natefinch: it nicely suggests the fact that there's only one entry per address | 17:17 |
rogpeppe | natefinch: and it *might* work out more nicely in the actual agent code | 17:18 |
TheMue | rogpeppe: eh, until those 3 minutes are done we're not sure that the connection is dropped (or the agent on the other side is just blocked) | 17:18 |
rogpeppe | TheMue: what if the client explicitly drops the connection? | 17:18 |
natefinch | rogpeppe: seems like it just makes it a little more annoying to iterate... it's also a difference from the way the data is input. It's not too hard to construct a map from a list if you need a map... it just doesn't seem like it actually fits the data model (other than, yes, there's only one per host... but that's generally more useful on input than output) | 17:19 |
TheMue | rogpeppe: how is apiserver notified about that today? | 17:19 |
rogpeppe | TheMue: the Kill method is called | 17:20 |
rogpeppe | natefinch: the scenario i'm thinking about is you get info, then you want to see how the info corresponds with info you already hold. | 17:22 |
rogpeppe | natefinch: but if you really think it doesn't fit very well, then slices could be fine, probably | 17:22 |
TheMue | rogpeppe: ok, so I should stop the goroutine here too, but as Kill() is called in Close() I have to ensure that it doesn't deadlock | 17:23 |
rogpeppe | TheMue: yup | 17:23 |
TheMue | rogpeppe: do you add a note in the review? otherwise I'll do it ;) | 17:24 |
rogpeppe | natefinch: having CurrentMembers return the same thing as is passed to replicaset.Set and Add seems like a reasonably strong argument for returning a slice actually | 17:24 |
rogpeppe | TheMue: i will | 17:24 |
TheMue | rogpeppe: great, thx | 17:24 |
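A sketch of the pinger shape under review, with the explicit stop that rogpeppe is asking for: Ping resets the timer, the action runs only if no ping arrives within the timeout, and Stop tears the goroutine down when the connection is closed deliberately. Names are illustrative, not the actual apiserver code, and Stop is assumed to be called at most once.

```go
package pinger

import "time"

type pingTimeout struct {
	action  func()        // e.g. closes the rpc connection / kills the root
	timeout time.Duration // e.g. 3 * time.Minute
	reset   chan struct{}
	stop    chan struct{}
}

func newPingTimeout(action func(), timeout time.Duration) *pingTimeout {
	pt := &pingTimeout{
		action:  action,
		timeout: timeout,
		reset:   make(chan struct{}),
		stop:    make(chan struct{}),
	}
	go pt.loop()
	return pt
}

// Ping restarts the timeout clock; the API server calls it on every
// client Ping request.
func (pt *pingTimeout) Ping() {
	select {
	case pt.reset <- struct{}{}:
	case <-pt.stop:
	}
}

// Stop ends the loop without running the action, for explicit closes,
// so no goroutine lingers for the remainder of the timeout.
func (pt *pingTimeout) Stop() {
	close(pt.stop)
}

func (pt *pingTimeout) loop() {
	timer := time.NewTimer(pt.timeout)
	defer timer.Stop()
	for {
		select {
		case <-pt.reset:
			timer.Reset(pt.timeout)
		case <-timer.C:
			pt.action()
			return
		case <-pt.stop:
			return
		}
	}
}
```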
natefinch | rogpeppe: that was my thinking | 17:25 |
rogpeppe | natefinch: and CurrentStatus should be similar to CurrentMembers, so yeah, go with slices all round | 17:25 |
natefinch | rogpeppe: plus, there's only a max of 12 items in the list, so even if you have to do naive N^2 logic, it isn't going to hurt anything | 17:26 |
natefinch | rogpeppe: cool | 17:26 |
rogpeppe | natefinch: performance was not part of my considerations | 17:26 |
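With slices agreed on, CurrentStatus might look roughly like this: run replSetGetStatus through mgo and return one entry per member, mirroring the slice CurrentMembers returns for the configuration. The struct is a small illustrative subset, not the final package.

```go
package replicaset

import "labix.org/v2/mgo"

// MemberStatus reports a few illustrative fields from replSetGetStatus
// for one member of the replica set.
type MemberStatus struct {
	Id      int     `bson:"_id"`
	Address string  `bson:"name"`
	Health  float64 `bson:"health"` // 1 means healthy
	State   int     `bson:"state"`  // 1 primary, 2 secondary, ...
}

type statusResult struct {
	Members []MemberStatus `bson:"members"`
}

// CurrentStatus returns the status of every member as a slice, one entry
// per member, in the same shape as the configuration helpers.
func CurrentStatus(session *mgo.Session) ([]MemberStatus, error) {
	var res statusResult
	if err := session.Run("replSetGetStatus", &res); err != nil {
		return nil, err
	}
	return res.Members, nil
}
```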
natefinch | rogpeppe: going to be out for about an hour. Turning into a late working day for me, but so be it. I should have that code all set by EOD, and hopefully some tests too. | 17:34 |
rogpeppe | natefinch: brilliant, thanks! | 17:35 |
TheMue | rogpeppe, natefinch: does anybody know the error that made my merge fail: https://code.launchpad.net/~themue/juju-core/054-env-more-script-friendly/+merge/191838 | 17:35 |
rogpeppe | TheMue: sporadic failure | 17:36 |
rogpeppe | TheMue: just mark as approved again to try once more | 17:36 |
TheMue | rogpeppe: ah, already wondered | 17:36 |
TheMue | rogpeppe: thx | 17:36 |
sinzui | fwereade_, do you have a moment to discuss terminate-machine from new client to old server? | 17:51 |
rogpeppe | right, done for the day | 18:42 |
rogpeppe | g'night all | 18:42 |
rogpeppe | off to see Gravity at the local 3D imax; should be good if the reviews are anything to go by. | 18:43 |
mramm | rogpeppe: have fun! | 18:44 |
natefinch | rogpeppe: night. Supposed to be good, yeah. | 18:44 |
=== marcoceppi_ is now known as marcoceppi | ||
thumper | morning | 19:52 |
natefinch | thumper: morning | 19:52 |
=== gary_poster is now known as gary_poster|away | ||
thumper | natefinch: o/ | 19:57 |
=== gary_poster|away is now known as gary_poster | ||
sinzui | thumper, could you read and reply to the message "Geting bug 1222671 into 1.16.4" that I sent to juju-dev | 20:05 |
_mup_ | Bug #1222671: Using the same maas user in different juju environments causes them to clash <cts-cloud-review> <maas-provider> <Go MAAS API Library:Fix Committed> <juju-core:Fix Committed by thumper> <juju-core 1.16:In Progress by sinzui> <https://launchpad.net/bugs/1222671> | 20:05 |
thumper | hi sinzui | 20:05 |
thumper | sinzui: it was my understanding that the merge that rog did into the 1.16 branch fixed that | 20:06 |
thumper | which is why I marked it fix committed or fix released in that series | 20:06 |
sinzui | Fab. Thanks thumper | 20:06 |
thumper | I may be mistaken, but that is what I thought | 20:06 |
thumper | fwereade_: you around? | 20:12 |
fwereade_ | thumper, heyhey | 21:11 |
thumper | fwereade_: hey dude | 21:11 |
thumper | got time for a hangout? | 21:11 |
fwereade_ | thumper, how's it going? | 21:11 |
fwereade_ | thumper, sure | 21:11 |
thumper | good, | 21:11 |
* thumper starts one | 21:11 | |
thumper | fwereade_: https://plus.google.com/hangouts/_/7ecpjvqj508h694vc55hjnqsvo?hl=en | 21:12 |
thumper | wallyworld: so... we don't have any shared storage any more? | 21:47 |
thumper | wallyworld: I could remove a config key from the local provider | 21:47 |
thumper | shared-storage-port | 21:47 |
wallyworld | nope, cause only ec2 and openstack had it anyway | 21:47 |
wallyworld | and now we have simplestreams it's not needed | 21:48 |
wallyworld | i guess so, not sure what shared-storage-port did | 21:48 |
thumper | wallyworld: we should really fix the tools uploading for the local provider | 21:48 |
wallyworld | yes | 21:48 |
thumper | wallyworld: also, fwereade_ wants to chat with you | 21:48 |
wallyworld | i'm available | 21:49 |
fwereade_ | wallyworld, heyhey | 21:49 |
wallyworld | yello | 21:50 |
wallyworld | fwereade_: did you want a hangout? | 21:51 |
fwereade_ | let's have a go | 21:51 |
fwereade_ | google has started hating me again after a goodish week | 21:51 |
wallyworld | https://plus.google.com/hangouts/_/72cpil41gfi1iafo9ljflprqds | 21:51 |
* thumper pokes the local provider with a long stick to see if it moves | 21:57 | |
* thumper opens up the beast again for more surgery | 21:59 | |
wallyworld | fwereade_: google does hate you | 22:23 |
wallyworld | fwereade_: so, some remaining issues. i thought it best to keep the notion of managing the container dependencies out of the provisioner - those are separate concerns to me. the model is: wait until a container type is required, ensure stuff is set up, then start the provisioner to manage the creation of the containers | 22:26 |
fwereade_ | wallyworld, I'm +1 on that | 22:26 |
wallyworld | so, the initial watcher does then kill itself once it has done that job | 22:27 |
fwereade_ | wallyworld, what you have does a solid job of starting the appropriate provisioner at the appropriate time | 22:27 |
thumper | um... | 22:27 |
wallyworld | cause it has served its purpose | 22:27 |
thumper | do we have an initial watcher for each type of container? | 22:27 |
wallyworld | yes, but only because the api only allows that | 22:27 |
wallyworld | the api needs to change | 22:27 |
fwereade_ | wallyworld, I'm just saying that the setup bit is not implemented | 22:28 |
wallyworld | and that involves unknown unknowns | 22:28 |
wallyworld | fwereade_: it is in a downstream mp | 22:28 |
wallyworld | the next one in the pipe | 22:28 |
wallyworld | fwereade_: https://codereview.appspot.com/22980045/ | 22:28 |
* thumper fetches the paddles to shock the local provider back to life | 22:29 | |
thumper | CLEAR | 22:29 |
wallyworld | lol | 22:29 |
wallyworld | fwereade_: so, each container package is responsible for knowing how to set itself up so containers of that type can be started on a given host. the machine agent calls out when required to do that and then starts a suitable provisioner. i still want to get the hammer out and fix the provisioner task as previously discussed | 22:31 |
wallyworld | my main goal this week is to get kvm supported with what apis we currently have | 22:31 |
wallyworld | once that user-facing functionality is delivered, then we can tweak the behind-the-scenes things to clean it up | 22:32 |
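The split wallyworld describes, one-off setup separated from runtime management, might look roughly like this; the interface names and method signatures are illustrative, not the actual containers package.

```go
package container

import "launchpad.net/juju-core/instance"

// Initialiser prepares a host to run containers of one type, for example
// by installing the lxc or kvm packages. It runs once, lazily, when the
// machine agent first sees a container of that type assigned to the host.
type Initialiser interface {
	Initialise() error
}

// Manager handles the runtime lifecycle of containers of one type; the
// provisioner task started after Initialise succeeds drives it.
type Manager interface {
	StartContainer(machineId, series string) (instance.Instance, error)
	StopContainer(inst instance.Instance) error
	ListContainers() ([]instance.Instance, error)
}
```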
* thumper wondered why the local provider wasn't starting the machine | 22:48 | |
thumper | network traffic is spiking | 22:48 |
thumper | last line in the log file | 22:48 |
thumper | 2013-11-18 22:44:33 DEBUG juju.container.kvm container.go:32 Synchronise images for precise amd64 | 22:48 |
wallyworld | thumper: we need user feedback on that shit :-) | 22:49 |
thumper | wallyworld: not sure how... | 22:49 |
wallyworld | to eliminate the wondering | 22:49 |
wallyworld | we need to establish a channel back to the client | 22:49 |
wallyworld | and the service business logic can pop progress events into that channel | 22:50 |
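A minimal sketch of that channel idea, assuming a made-up Progress type and worker function; how the events would actually travel back to the client is left open here, as it is in the conversation.

```go
package progress

// Progress is one user-visible step of a long-running server-side task.
type Progress struct {
	Message string
	Percent int
}

// syncImages shows how a worker could report steps on a channel that some
// transport (not sketched) relays back to the juju client, instead of the
// user seeing only a silent pause while images are synchronised.
func syncImages(events chan<- Progress) error {
	events <- Progress{Message: "synchronising images for precise amd64", Percent: 0}
	// ... download and cache the images here ...
	events <- Progress{Message: "images synchronised", Percent: 100}
	return nil
}
```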
thumper | machine provisioning failed... | 22:54 |
thumper | now to figure out why | 22:55 |
fwereade | wallyworld, everything to do with computers hates me | 23:00 |
fwereade | wallyworld, thanks for that link though | 23:00 |
fwereade | wallyworld, for some reason it makes it all look less objectionable | 23:00 |
fwereade | wallyworld, do you think there's a reasonable evolution that lets us drop all the frickin' switching though? | 23:01 |
wallyworld | fwereade: one sec. otp | 23:01 |
fwereade | wallyworld, because I'm still feeling that if we're going to be lazy we should be really lazy | 23:01 |
fwereade | wallyworld, install the packages only when we're actually trying to run a container and find missing deps | 23:02 |
fwereade | wallyworld, separating out the thing that takes the decision on whether to start a provisioner is a great step | 23:03 |
fwereade | wallyworld, so that is a win in itself, no argument there | 23:03 |
fwereade | wallyworld, but smearing the specific-container-related logic out so widely stresses me out a little, because it introduces subtle dependencies | 23:04 |
* thumper sighs | 23:05 | |
* thumper pulls apart the threads looking for the issue | 23:05 | |
* thumper opens the patient up again | 23:06 | |
fwereade | wallyworld, I am cool with the watcher strategy though | 23:06 |
wallyworld | fwereade: sorry, just about finished phone call, one more sec | 23:07 |
fwereade | wallyworld, and while I would love to have further discussions about SOA I am flagging a little -- so I'm off for a quick ciggie, then a short chat before bed | 23:07 |
wallyworld | ok, i'll read your comments while you kill your lungs | 23:07 |
* thumper frowns | 23:08 | |
wallyworld | fwereade: we do only install packages when the first container is needed to be run, so we are really lazy | 23:11 |
fwereade | wallyworld, they're still twitching a bit | 23:12 |
wallyworld | all the container logic is in one place - the containers package. the agent (for setup) and the provisioner task (for running) call into that | 23:12 |
fwereade | wallyworld, but codewise the actual invocation of the setup has a very tenuous and distant connection to the actual launching of the container | 23:12 |
wallyworld | set up and launching are separate | 23:13 |
wallyworld | you could use the same argument when we used cloud init | 23:13 |
fwereade | wallyworld, how about if we jammed the setup bits into broker creation? that feels much closer | 23:13 |
wallyworld | eg just before this branch | 23:13 |
wallyworld | we always apt-get installed lxc in cloud init | 23:13 |
wallyworld | which is the lxc container set up | 23:13 |
fwereade | wallyworld, indeed I could :) | 23:13 |
wallyworld | that is very distant | 23:13 |
wallyworld | now, all related container logic is at least together | 23:13 |
wallyworld | and the worker task that uses it can call into it | 23:14 |
fwereade | wallyworld, I'm certainly not defending that practice -- it was expedient but rather hairy really | 23:14 |
fwereade | wallyworld, and it caused us problems ;p | 23:14 |
fwereade | wallyworld, I agree it's closer now, and better than before | 23:14 |
wallyworld | the containers package exposes 2 main semantics - setup and management | 23:14 |
wallyworld | and the task uses those 2 main concepts | 23:14 |
wallyworld | by management, i mean runtime - start/stop etc | 23:15 |
* thumper falls foul of wallyworld's hack on debug levels | 23:15 |
wallyworld | i *think* we have separation of concerns ok, or at least we're heading in the right direction | 23:15 |
fwereade | wallyworld, it just seems reasonable that (say) broker creation be a decent signal implying the need for setup, rather than having broker creation either sane or not depending on the action of distant code | 23:15 |
wallyworld | fwereade: so that implies there's some flag somewhere which records if setup has been done | 23:16 |
fwereade | thumper, heh, I thought that'd cause trouble, but I couldn't think of anything better and you were on holiday | 23:16 |
fwereade | thumper, if you're of a mind to fix it please correspond with davecheney, whose use cases we were trying to support | 23:16 |
fwereade | wallyworld, it's definitely going in the right direction | 23:16 |
* thumper nods | 23:16 | |
wallyworld | thumper: i did tell you about it and the need to fix it :-) | 23:16 |
thumper | wallyworld: yeah I know | 23:17 |
thumper | I just hadn't gotten around to it | 23:17 |
fwereade | wallyworld, I'm just complaining because it feels like almost virgin soil, and that we could get further | 23:17 |
* thumper puts it on the stack of shit to fix | 23:17 | |
wallyworld | i know, just pressing buttons :-P | 23:17 |
fwereade | wallyworld, however, I remind myself, progress not perfection :) | 23:17 |
fwereade | wallyworld, and as I said I'm pretty happy with how it looks in the context of the followup | 23:17 |
wallyworld | fwereade: understood. the way i see it - we have container setup and management "nicely" packaged. we have stuff that calls it. we can adjust the stuff that calls it | 23:18 |
thumper | heh, oops, found a weirdness... | 23:18 |
wallyworld | fwereade: if you wanted to +1, or +1 with fixes, i can do that today | 23:18 |
fwereade | wallyworld, yeah, I'm just rereading my whining | 23:19 |
wallyworld | lol | 23:19 |
wallyworld | thanks for staying up late todo this | 23:19 |
fwereade | wallyworld, don't suppose I can convince you of a SetSupportedContainers call? I'd still prefer that to the mooted result from UpdateSC | 23:20 |
fwereade | wallyworld, that's the only bit that still feels really wrong, and isn't amenable to easy fixes by virtue of being part of the api | 23:20 |
wallyworld | fwereade: actually, in the current branch, i think i'm going to have to go to that api anyway, at least at the service level | 23:20 |
fwereade | wallyworld, sweet :) | 23:20 |
wallyworld | fwereade: so i did the Add api in isolation | 23:21 |
wallyworld | and then as the implementation has evolved, it needs to be changed i think | 23:21 |
wallyworld | so i could land what's there, and it will be reworked in my current branch | 23:21 |
wallyworld | fwereade: thanks for +1. i am fully aware it's not yet perfect, but is a step along the way :-) | 23:23 |
wallyworld | and we are delivering new functionality to users | 23:23 |
fwereade | wallyworld, no worries -- and my concern about the Set/Add bit is that if 1.17 goes out with Add it will be waaay more tedious to make Set work without ugliness | 23:24 |
fwereade | wallyworld, so if you're confident that can be proposed today I can handle it | 23:24 |
wallyworld | yeah. i was thinking with Add, we would be less vulnerable to races | 23:25 |
wallyworld | cause i wasn't sure if we would have multiple callers each adding their own | 23:25 |
wallyworld | and add is more robust to that | 23:25 |
fwereade | wallyworld, I'm not so bothered about those, because I feel we can restrict it to a single caller quite naturally | 23:25 |
wallyworld | but i think calling Set will be limited to the machine agent at start up | 23:25 |
fwereade | wallyworld, yeah | 23:26 |
wallyworld | that wasn't clear at the time initially | 23:26 |
fwereade | understood :) | 23:26 |
wallyworld | thanks again, go get some sleep :-) | 23:26 |
fwereade | wallyworld, cheers, enjoy your day | 23:26 |
wallyworld | i'll try :-) | 23:26 |
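The Set-versus-Add question above, reduced to the two shapes being weighed; the names and signatures are a sketch, not the real state API.

```go
package sketch

import "launchpad.net/juju-core/instance"

// supportedContainers contrasts the two calls discussed above. Set replaces
// the full list in one call from the machine agent at startup, so a single
// caller owns the result; Add merges new types into whatever is already
// recorded, which is harder to change once 1.17 ships with it.
type supportedContainers interface {
	SetSupportedContainers(types []instance.ContainerType) error
	AddSupportedContainers(types []instance.ContainerType) error
}
```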
* thumper is stonewalled by kvm tools | 23:28 | |
* thumper needs to email robie to get answers | 23:28 | |
thumper | perhaps I'll look at the logging stuff while I wait | 23:28 |
wallyworld | thumper: or you could do this for me :-D https://codereview.appspot.com/28190043/ | 23:34 |
* thumper goes to have lunch first | 23:36 |