[02:54] ./conn_test.go:117: undefined: juju.NewConnFromAttrs [02:54] grrr, thanks rog [06:23] Good morning [06:31] TheMue, heyhey [06:31] hello [06:32] davecheney, heyhey [06:32] fwereade_: Had a nice party yesterday? [06:32] TheMue, ha, just a sleep :) [06:33] fwereade_: Thought you would be too upset to fell asleep. (Do you say so?) [06:33] TheMue, excited maybe more than upset [06:34] what did I miss ? [06:34] TheMue, upset usually carries unhappy connotations :) [06:34] davecheney, I ran something on juju :) [06:34] fwereade_: indeed you did [06:34] a cause for celebration, not for sadness by my account [06:34] davecheney, definitely, it's just that I celebrated by going to sleep [06:35] sounds like a valid choice to me [06:36] fwereade_: Ah, excited is the missing word. Thx. [06:36] fwereade_: Already felt that upset is wrong. [06:36] possibly elated [06:39] Btw, my vacation starts today. But I'll be in later today again as I have two LGTMs. [06:39] ok [06:40] While I'm away you can reach me by mail, thx to those little portable computers which also can phone. ;) [06:46] TheMue: i have a really bad camera that also has email on it [06:46] btw, http://codereview.appspot.com/6497070/ [06:46] this might be the solutoin to the strange EOF problems we have talking to LP and AWS [06:46] *might* [06:46] or it coudl just be the intertubes in australia [06:55] TheMue, btw, if you's still around, why do you favour the switch over the if/else in loose-hook-info-members? [06:56] TheMue, it's clearly a perfectly legitimate thing to do but I have a strong bias toward switching on just one thing... is this just because I'm used to "broken" switch statements? [07:09] davecheney, fwereade_: morning [07:10] morning [07:10] davecheney: the change to NewConnFromAttrs was niemeyer's suggestion BTW [07:10] davecheney: sorry about that [07:10] rogpeppe: so'k [07:10] wasn't hard to figure out [07:10] davecheney: it made for better alignment with the entry points in environs [07:10] yeah, 'tis and improvement [07:12] fwereade_: Just optical reasons. I like the switch more than if-else-chains (even if they are short). But don't care, just personal preferences. :) [07:14] TheMue, changed it, still somewhat ambivalent, we'll see [07:14] fwereade_: does this look better to you than the version as test fixture? https://codereview.appspot.com/6495086/ [07:14] rogpeppe, heyhey [07:14] * fwereade_ looks [07:15] fwereade_: i generally prefer a switch if there are more than two branches [07:15] fwereade_: Thx. So, off for the moment. [07:15] rogpeppe, there are only two here [07:16] fwereade_: in which case i'm probably on the fence, although i haven't seen the code in question [07:16] rogpeppe, it may just be that I had a couple of reviews requesting switch->if-else and have grown wary of them :) [07:19] rogpeppe, that looks much nicer to me [07:19] fwereade_: cool. i'm not entirely sure about the "series" helper methods (explicit better than implicit?) but i'm happier with it as a normal type not a fixture. [07:19] rogpeppe, definitely [07:20] davecheney: good work on tls fix - let's hope it helps [07:20] fwereade_: the only other thing i considered was embedding LocalRepository or something like that. [07:21] fwereade_: 'cos they're really quite similar [07:21] fwereade_: but most uses don't care, and it's easy to make up a LocalRepository when needed [07:21] rogpeppe, IIRC I wanted to do that when I was first writing it but couldn't get them quite similar enough :) [07:22] rogpeppe: i'm not really sure it was the problem, but TSAN picked it up so it needed to be fixed [07:22] fwereade_: i think it's ok - they both represent a repository, but do different things with it. and all the state is in the directory, so it doesn't matter. [07:23] davecheney: ah. i'm looking forward to that being included as standard. i'm interested to see what it makes of our code... [07:26] rogpeppe: there is a reason why I don't run GOMAXPROCS > 1 [07:26] that will make gozk EXPLODE!! [07:26] davecheney: have you tried it? [07:26] once [07:26] got a segfault in gozk [07:26] davecheney: ha ha. i thought it *should* be safe. [07:26] rogpeppe: did you log that issue to have the signal details logged [07:26] davecheney: ah, no. i'll do that. [07:33] davecheney: it would be quite a hassle to fix sadly [07:34] davecheney: because in that context we can't call any functions [07:35] isn't there a panicf [07:35] or a printf that doesn't allocate ? [07:36] davecheney: hmm, maybe. i'm not sure of what invariants we need at that level. [07:38] davecheney: of course, the real fix is to allow callbacks from non-Go-created threads. [07:41] rogpeppe: that fix might be saying 'the real fix is to relax the laws of reletivity to allow high paying packets to travel at their chosen speed' [07:41] davecheney: yeah [07:42] m and g are not available, but as long as the GC doesn't need to run, it should be safe to use runtime.print [07:45] davecheney: do you know what linux signal behaviour is w.r.t. threads? is the signal delivered to one thread only? or all threads? [07:46] davecheney: hmm, one random thread, it looks like. hurray. [07:47] rogpeppe: yup, but all threads wouldn't help either [07:47] davecheney: if it was all threads we could block the signals in all but the one we care about and ignore if we find no m, no? [07:48] rogpeppe: good point, but i'm sure that was tried and found wanting [07:50] davecheney: better would be if a thread had to explicitly enable signals rather than receiving them by default [07:53] /home/dfc/src/launchpad.net/juju-core/testing/mgo.go:119: c.Fatal("Test left sockets in a dirty state") [07:53] what does this mean ? [08:04] davecheney, sorry, no idea... failed to delete them maybe? [08:13] fwereade_: i was leaking a mgo.Session [08:13] my fault [08:14] np [08:14] davecheney: do you know a way of reliably reproducing the signal problem. my naive attempt doesn't fail: http://paste.ubuntu.com/1188513/ [08:14] rogpeppe: not really [08:14] it is quite rare [08:14] deinfitely run with high GOMAXPROCS [08:16] davecheney: if the issue is what we think it is (the signal is randomly delivered to some thread), i can't see why the above code doesn't trigger the issue [08:17] rogpeppe: try firing the signal from the C side [08:21] davecheney: i tried firing the signal from the command line. no difference. [08:22] rogpeppe: i'm not surprised, it's not supposed to happen [08:22] davecheney: well... how can they be stopping it from happening? [08:22] 'supposed to happen' == no panic [08:30] davecheney: i've seen a couple more failures of the TestSSHConnect test BTW. (one just now and another one yesterday). [08:31] davecheney: it concerns me [08:31] rogpeppe: my suggestion to that would be to switch to the ssh crypto package [08:32] davecheney: i'd love to. we started off trying that direction, but it wasn't mature enough. [08:33] rogpeppe: all it needs to support for loading keys off disk (in the same order that ssh does it) [08:33] and talking to the ssh agent (which we have now) [08:33] i'll put it on my list for this weekend [08:33] must fly [08:42] moin. [08:58] Aram, heyhey [09:16] rogpeppe, trivial: https://codereview.appspot.com/6490086 [09:17] fwereade_: LGTM [09:17] rogpeppe, cheers [09:17] fwereade_: pity it had never been tested live before... [09:17] rogpeppe, indeed [09:34] * Aram has troubles finding flights to lisbon. [10:12] fwereade_: need trivial LGTM please: https://codereview.appspot.com/6494090 [10:12] * fwereade_ looks [10:13] LGTM [10:13] rogpeppe, ^ [10:13] fwereade_: ta! [10:13] * rogpeppe is ashamed to have broken the build :-( [10:23] fwereade_: FWIW, i think a string argument to AgentToolsDir is still ok. each agent can decide on its own name, no? [10:24] fwereade_: (i just saw your discussion with gustavo last night) [10:24] rogpeppe, it happens ;p [10:25] rogpeppe, kinda... but, atm, everybody has to prepend unit-, and deslash, whenever they want to specify a unit name [10:25] rogpeppe, this feels kinda icky [10:26] rogpeppe, although I am also not 100% sold on KeyPath [10:26] fwereade_: i'm not sure what you mean. unit names aren't prepended with "unit-" AFAICS [10:27] fwereade_: the *agent name* for a unit is though [10:27] rogpeppe, isn't that what AgentToolsDir expects? [10:27] fwereade_: yes, but how often do you need to call AgentToolsDir? [10:28] rogpeppe, a few places that may eventually coalesce into one place [10:28] rogpeppe, the hook env needs it, EnsureJujucSymlinks needs it [10:29] rogpeppe, container needs it [10:29] fwereade_: why does the hook env need it? [10:29] rogpeppe, so we can put it on the PATH [10:30] fwereade_: why not just do that in the uniter main? [10:30] rogpeppe, and, later, add it as an explicit env var for use when setting up juju-run commands [10:30] rogpeppe, because it's only needed by the hooks? [10:31] fwereade_: fair enough. i'd probably do something like uniter.AgentName(unit *state.Unit) string [10:31] fwereade_: or even uniter.ToolsDir(unit *state.Unit) string [10:33] rogpeppe, not unreasonable for container to know about uniter, I guess... and environs doesn't need to know about uniters, because that's always done by machiner... yeah, probably good [10:34] fwereade_: yeah, i hadn't thought about container needing to know about uniter, but as you say, sounds reasonable, as it's starting the worker (indirectly) [10:34] rogpeppe, yeah, exactly [10:36] rogpeppe, a further thought [10:37] rogpeppe, if we do have a consistent "kind-name" badge for the various agents, which sounds sensible, we really should be using it in more places than just AgentToolsDir [10:37] rogpeppe, it should be in logfile names, for example [10:38] fwereade_: that seems like a good plan [11:12] davecheney: did you have any further discussion with niemeyer about your CL https://codereview.appspot.com/6499071 [11:12] ? [11:12] davecheney: i'm not sure i understand his objections [11:14] davecheney: but perhaps his objections are *only* to the CL description [11:14] rogpeppe: no, i will try again tomorrow [11:14] davecheney: and if it was changed to "most uses just want the *Machine" it would be ok [11:14] rogpeppe: couldn't hurt [11:16] davecheney: although he's right - there's actually *no* real code that wants the Machine rather than the id [11:20] rogpeppe: it's probably not worth the discussion [11:20] what I have works [11:20] and there are so many more important things do to [11:20] true 'nufff [12:47] * rogpeppe goes for some lunch [12:58] niemeyer: morning! [13:35] rogpeppe: Morning! [13:35] Hello all! [13:35] niemeyer: thanks for the review! [13:35] rogpeppe: np, thanks for all the niceties [13:35] niemeyer: no problem at all [13:36] niemeyer: i don't think there are two round trips to get machine id and machine. in state i think that's true anyway. or are you thinking of mstate? [13:37] rogpeppe: Sorry, your logic escapes me.. you mean that state.Machine(id) doesn't have a roundtrip to the state server? [13:37] niemeyer: no, i'm saying that MachindId needs the round trip anyway [13:38] MachineId [13:38] AssignedMachineId even [13:38] rogpeppe: The logic still escapes me.. the CL makes it call Machine, always [13:38] rogpeppe: Machine has a roundtrip to the state server, both in state and in mstate [13:40] niemeyer: ah yes, you're right. that's wrong. but we can construct the machine from the topology without needing an extra round trip. [13:41] rogpeppe: No, we can't, because we use cached state in mstate [13:41] niemeyer: ah, so Unit.AssignedMachineId doesn't go to the server? [13:42] rogpeppe: Yes, and Machine has to go to the server to get the machine state [13:42] niemeyer: right, gotcha. presumably we *could* fetch the machine too when we fetch the unit. [13:42] rogpeppe: Dude.. [13:43] lol [13:43] ok ok [14:02] huh, are we not actually using --juju-directory for anything? [14:40] Hi all. I was on swap yesterday, but am back now. [14:40] Just got off a call about doing a Juju marketing video [14:40] hi mramm [14:40] mramm, heyhey [14:40] mramm, I deployed something yesterday, would you believe? [14:41] I'm also working on the info for the website, a juju webinar slide deck, and talking to jorge about marketing juju to developers and using it as a DIY PaaS. [14:41] fwereade_: That is AWESOME! [14:41] fwereade_: Everybody's hard work is coming together! [14:42] mramm, yep :D [14:42] I am excited! [14:42] mramm, now ofc I have to hammer it into proposable shape [14:42] and using too many exclamation points!!!!!!!!!!!!! [14:42] mramm, but that shouldn't be too far off [14:42] fwereade_: great! [14:43] I am excited about the juju video [14:43] the people we've got working on it seem competent [14:43] fwereade: you'll be pleased to know that authorisation of internal traffic now works [14:44] rogpeppe, sweet [14:44] mramm, (that was the only manual step) [14:44] fwereade: just waiting on a live test that checks that both machine agents upgrade [14:44] YAY! [14:44] it works [14:44] * fwereade cheers [14:44] I also had a talk with Clint about packaging juju for Ubuntu on Tuesday night [14:44] rogpeppe: AWESOME! [14:45] so very much awesome going on right now! [14:45] fwereade: 3.87s to upgrade both agents FYI [14:46] fwereade: and that's using the proper deploy infrastructure, juju.AddService, AddUnits, etc [14:46] rogpeppe, sweeet [14:47] fwereade: yeah, it's nice to have some functional tests that are a bit higher level [14:49] fwereade: it's just a pity that pushing the tools takes so long. sometimes as long as 5 minutes. [14:51] fwereade: it would have been done earlier if amazon's docs weren't so crap :-) [14:52] rogpeppe, ha, I know your pain [14:53] rogpeppe: haha [14:53] So, I have a product review meeting tomorrow [14:53] if you have things you think I should give management heads up on, let me know [14:54] I've got the unit agent, updater, mongo, and marketing update basics figured out [14:55] so if you have any additional detail you think is important on those things, or any other stuff that people need to know about, shoot it on over. [14:57] rogpeppe, niemeyer: sanity check on http://paste.ubuntu.com/1189087/ if yu have a mo [14:58] niemeyer, KeyPath on state entities was not so nice in the end -- we don't always have the state entities available [14:58] niemeyer, but I *think* this ends up quite neat [15:00] niemeyer, rogpeppe: I would almost certainly want to pair it with a PathSuite or something that swaps out LibDir, LogDir, InitDir [15:00] niemeyer, (and possibly $HOME, why not, eh?) [15:01] juju/testing/conn.go:// It also sets up $HOME and environs.VarDir to [15:01] fwereade: ^ [15:01] niemeyer, yeah; and a lot of other tests swap one or both of those out here or there [15:02] niemeyer, it was moving environs.VarDir that made me think "ouch, too much duplication" [15:03] fwereade: My only concern is that right now we already have an Agent with an interface [15:03] niemeyer, but if I could find a sensible name, it would probably be ok? [15:04] niemeyer, and actually I'm not sure agent.Spec is all that much subject to confusion [15:05] niemeyer, wait a mo, there is no Agent interface that I can find [15:05] niemeyer, now I want to just call that an Agent :) [15:06] fwereade: cd cmd/jujud; grep Agent * [15:08] niemeyer, I see AgentConf and AgentState [15:08] fwereade: Uh? [15:08] fwereade: UnitAgent, MachineAgent, ProvisioningAgent, AgentState, AgentConf, etc etc [15:09] niemeyer, ok... you said we had an Agent with an interface, maybe I mistook what you meant [15:09] fwereade: AgentConf being JujuDir + StateInfo.. [15:09] niemeyer, ah yeah, mean to mention that, JujuDir is not used [15:10] fwereade: I mean that the concepts you're playing with already exist [15:10] fwereade: We can't just add another package without sorting their situation out [15:10] niemeyer, ok, I see now [15:10] niemeyer, well, JujuDir doesn't actually do anything [15:10] fwereade: That won't change if we just add another package :) [15:11] niemeyer, but I'm pretty sure what it *should* do is overwrite environs.VarDir (which would be agent.LibDir) [15:20] niemeyer: https://codereview.appspot.com/6500089 [15:21] rogpeppe: Cheers.. I think I'll focus a bit on the watcher today, though, otherwise we'll never have it [15:21] niemeyer: no problem [15:21] niemeyer: you'll be glad to know that this enables the uniter stuff to work live though [15:21] rogpeppe: Wow, sweet! [15:21] niemeyer: and that it tests two machine agents upgrading at the same time [15:22] niemeyer: through the juju.Conn deploy machinery [15:22] rogpeppe: Oh man [15:22] rogpeppe: It's coming together [15:22] niemeyer: if nothing else, just have a brief glance at https://codereview.appspot.com/6500089/diff/2001/environs/jujutest/livetests.go [15:23] niemeyer: (i'm really happy that worked) [15:25] rogpeppe: Impressive indeed.. even more impressive to have that as a stock test [15:27] niemeyer: yeah, i'm happy about that [15:30] fwereade: wb [15:30] fwereade: https://codereview.appspot.com/6500089 [15:30] fwereade: I sent you and -dev a mail [15:30] fwereade: With the feedback, before I forget and so that I could move on [15:30] niemeyer, cheers, I will take a look [15:32] niemeyer, yeah, consider my mind already churning on these thoughts [15:33] fwereade: +1, cheers [15:38] fwereade: shouldn't JujuDir set environs.VarDir? [15:38] rogpeppe, I intimated as much above [15:39] f.StringVar(&environs.VarDir, "juju-directory", environs.VarDir, "juju working directory") [15:39] fwereade: oh yes, i didn't notice that [15:39] rogpeppe, although I am starting to feel dissatisfied with its globalness, and am wondering if we really need it [15:40] fwereade: yeah, i'd be happy for it to avoid globalness [15:41] fwereade: i think most of environs/tools.go could happily fit in a new package [15:41] fwereade: with no globals [15:42] fwereade: maybe just a package called "tools" [15:42] aw [15:43] fwereade: last thing you saw? [15:43] fwereade: last thing from you i saw was [15:43] [16:39:26] rogpeppe, although I am starting to feel dissatisfied with its globalness, and am wondering if we really need it [15:48] rog, I saw fwereade: i think most of environs/tools.go could happily fit in a new package [15:49] 16:41:37] fwereade: with no globals [15:49] [16:42:19] fwereade: maybe just a package called "tools" [15:49] rogpeppe, I was just pondering exactly that while having a ciggie :) [15:49] fwereade: because that's what this is all about, i *think* [15:50] rogpeppe, yeah, there's a sideline in "bits of container look awfully like bits of cloudinit" -- which originally started me off on this jaunt -- but you are absolutely right [15:52] Lunch break [15:54] fwereade: which bits were you thinking of? [15:54] rogpeppe, just the bits that create Confs are awfully similar and mildly tediously inconsistent [15:54] rogpeppe, not a big deal at all really, but it was nagging at me [15:55] fwereade: Conf? [15:55] rogpeppe, upstart.Conf [15:58] fwereade: yeah, i think that code would fit quite happily in a tools package [16:03] * rogpeppe is quickly hacking together a sketch [16:09] fwereade: how about something like this? http://paste.ubuntu.com/1189197/ [16:10] fwereade: a few bits tidied up: [16:10] package tools [16:10] // SearchFlags gives options when searching for tools. [16:10] type SearchFlags int [16:10] const ( [16:10] // HighestVersion indicates that versions above the version being [16:10] // searched for may be included in the search. The default behavior [16:10] // is to search for versions <= the one provided. [16:10] HighestVersion SearchFlags = 1 << iota [16:10] // DevVersion includes development versions in the search, even [16:10] // when the version to match against isn't a development version. [16:10] DevVersion [16:10] // CompatVersion specifies that the major version number [16:10] // must be the same as specified. At the moment this flag is required. [16:10] CompatVersion [16:10] ) [16:10] // List holds a list of available tools. Private tools take [16:10] // precedence over public tools, even if they have a lower [16:10] // version number. [16:10] type List struct { [16:10] Private []*state.Tools [16:10] Public []*state.Tools [16:11] } [16:11] // ListAll returns a List holding all the tools [16:11] // available in the given environment that have the [16:11] // given major version. [16:11] func ListAll(env Environ, majorVersion int) (*List, error) [16:11] // Put builds the current version of the juju tools, uploads them [16:11] // to the given storage, and returns a Tools instance describing them. [16:11] // If vers is non-nil it will override the current version in the uploaded [16:11] // tools. [16:11] oops [16:11] sorry everyone [16:11] i meant this: http://paste.ubuntu.com/1189199/ [16:13] fwereade: http://paste.ubuntu.com/1189199/ [16:17] hmm, i wonder if i've been muted [16:18] rogpeppe, sorry, I can hear you [16:18] np [16:18] rogpeppe, it looks sane, I think, I'm not sure exactly how it will square with the other wild ideas I am chasing [16:19] fwereade: what kind of thing are you thinking of? [16:19] rogpeppe, agent.Agent, and what exactly it should have attached to it, if it should even exist [16:20] fwereade: i'm not convinced it should exist [16:20] fwereade: an agent is its own beast [16:21] rogpeppe, there are an *awful* lot of commonalities across agents, I think there is a useful common structure waiting to emerge [16:21] fwereade: i go along with niemeyer's comments in this respect [16:21] fwereade: when it emerges, we can factor it out [16:21] fwereade: as particular chunks of functionality [16:22] rogpeppe, I think it's worth an overnight ponder at least :) [16:22] fwereade: saying "*this* is an agent" is a bit like the inheritance way of thinking, i think [16:22] rogpeppe, I guess it is somewhat predicated on the assumption that an agent should correspond to a single state entity [16:22] fwereade: indeed [16:22] fwereade: which may very well not be the case in the future [16:23] fwereade: well... i guess we will probably always have at least one item in the state for an agent [16:23] rogpeppe, hmm -- so it's workers that should correspond to state entities? [16:23] fwereade: no [16:24] fwereade: it's running processes [16:24] fwereade: i.e. things that can upgrade themselves [16:24] rogpeppe, ok (is there now a Provisioner state entity that I haven't noticed?) [16:24] fwereade: but i may very well be seeing through very upgrading-centric eyes currently :-) [16:24] fwereade: no, but there will be [16:24] fwereade: of some kind [16:25] fwereade: otherwise we can't tell when a PA manages to upgrade itself [16:26] rogpeppe, cool, thought so, and this just makes all the agents look even more similar to me, but indeed the stars might not yet be aligned [16:26] fwereade: there are definitely similarities [16:26] fwereade: but we can abstract those out when we need some common functionaliy [16:26] ty [16:27] fwereade: what kinds of operations do you envisage on agent.Agent? [16:28] fwereade: would it be a concrete type or an interface? [16:48] Hello golangian friends. I am wondering, are there any examples of Go applications already packaged (via PPA, or even in the Ubuntu archive) ? [16:49] and on a related note, I am looking at writing a charm in Go as an exercise, and wondering how best to do it.. #!/usr/bin/gorun is great for prototyping, but at some point I think I'll want to just compile it [17:00] SpamapS: lbox and cobzr might be good examples to start with [17:00] SpamapS: They're both in PPAs [17:00] niemeyer: perfect [17:00] SpamapS: and auto-building [17:00] * SpamapS already has the PPA's for those.. apt-get source to the rescue [17:02] niemeyer: any reason you had to be so explicit on binary-arch here: http://bazaar.launchpad.net/~niemeyer/lbox/package/view/head:/debian/rules [17:03] niemeyer: seems like an override_dh_install: might have handled that. [17:07] hiho [17:07] niemeyer: ping [17:07] SpamapS: i had some ideas about writing charms in Go, but haven't had time to do anything about them yet [17:07] rogpeppe: it seems a bit heavy handed for most charm duties. [17:08] SpamapS: yeah, a shell script is often a good fit [17:08] which boil down to "parse this file. put that file over there. run this command" .. [17:08] TheMue: Pong [17:08] SpamapS: Hmm [17:08] rogpeppe: I'm actually finding puppet's DSL quite good and succinct for charm duties. ;) [17:08] niemeyer: just wanted to inform you that the CLs are in [17:09] TheMue: Beautiful, thanks [17:09] * rogpeppe hasn't actually looked at puppet yet [17:09] niemeyer: You had some notes on the first and not on the second and vice versa. I applied them on both. [17:09] rogpeppe: the biggest drawback is that you have to bust out to ruby to do anything interesting [17:10] TheMue: Well, actually you didn't, as I mentioned in the review, but that's fine [17:10] niemeyer: Like the tabs, the panic message and the comment. [17:10] TheMue: We can sort out these details when you're back [17:10] TheMue: Ah, yeah, that was great,thanks [17:10] niemeyer: What exactly you are referring? [17:10] SpamapS: interesting. ruby's Yet Another Language I Have Never Used [17:10] SpamapS: i always figured it was too close to python to be worth learning unless i needed to. [17:10] TheMue: The three points in the review [17:11] niemeyer: WaitAgentAlive() on Unit? [17:11] TheMue: Where I suggested they might be done in a different CL [17:11] rogpeppe: more like perl meets javascript [17:11] TheMue: There's a bug opened as well with them now [17:11] niemeyer: Yeah, the Dying and the Alive error, but not the removal of the loop. [17:11] TheMue: Don't worry, what's going in is a great start, and we can easily sort the points out in a follow up when you're back [17:12] TheMue: You made the comment not on the last checkin. [17:12] SpamapS: I don't really know.. I bet there are better ways to handle it.. I'm just not a well versed packager [17:12] SpamapS: anyway, Go might be good for charms that do a lot of orchestration dancing [17:12] niemeyer: its really only a future-proofing thing anyway.. the way its written now is 100% correct [17:13] rogpeppe: right, I was thinking of centralized things that have to handle all coming/going .. like monitoring or logging [17:13] SpamapS: i reckon you could write it in good Go style, as a goroutine that receives "charm-hook-execute" events in a loop, then acts on the hook and replies. [17:14] niemeyer: Please take a look at https://codereview.appspot.com/6494073/diff/2002/mstate/unit.go at line 230ff [17:14] SpamapS: rather than our traditional "you get called when a hook happens" model [17:14] TheMue: Ok? [17:15] niemeyer: It's now w/o the loop, like for Machine. [17:15] SpamapS: but these thoughts are only fleeting and unformed... [17:15] TheMue: Ok, that's great [17:17] rogpeppe: thats actually what I was thinking too. But I think such things will only be necessary in the extreme cases where something has many thousands of related service units. [17:18] rogpeppe: for 99% of the cases.. python/ruby/bash will be up to the task of running a hook every few seconds. [17:18] SpamapS: it's not necessarily about large scale - it could make it easier to make logic simpler [17:18] rogpeppe: I doubt that. :) [17:18] Fine, now I can leave relaxed. I will look into mail via phone to stay informed during my vacation. It's an exciting time for juju. [17:19] SpamapS: you may well be right :-) [17:19] rogpeppe: simpler to a go developer maybe.. but not to a charmer. :) [17:19] the most insanely complex charms are still pretty straight forward [17:19] SpamapS: well, i certainly intend to write some more charms at some point :-) [17:20] SpamapS: also, i have a feeling we'll get more golangers coming along in the python/ruby space, though i may be wrong in that [17:20] http://www.cloudifysource.org/ [17:20] another juju-like project [17:30] TheMue: Yeah, have a great time there [17:31] niemeyer: Thx, the weather forecast looks good, so we'll have some fine days (and evenings with a bottle of wine) at the water. It's only already too cold for swimming. [17:36] i'm off for too. have fun all. see you tomorrow! [17:36] TheMue: enjoy your break! [17:36] gn rogpeppe [17:36] and thank you [21:06] niemeyer, ping [21:06] fwereade: Pongus [21:07] niemeyer, I think trunk is broken, I'm getting a panic from somewhere inside presence in the mstate tests... just verifying it *actually* happens on trunk, not just my merged version [21:07] niemeyer, if it is, is it still ok to submit over the top? [21:07] fwereade: Can you please paste the panic? [21:07] niemeyer, https://bugs.launchpad.net/juju-core/+bug/1047051 [21:07] fwereade: Yeah, if that's the only failure, it'd be fine [21:07] niemeyer, cool, thanks [21:07] niemeyer, yeah, trunk too [21:08] fwereade: Ah, ok.. I can tell what it is [21:08] fwereade: I've missed that when reviewing Frank's branch [21:08] niemeyer, ++psychic debugging :) [21:08] fwereade: We're creating a watcher and not stopping [21:08] niemeyer, jolly good, could be worse ;) [21:08] fwereade: Hehe :) [21:09] fwereade: I'll fix it as soon as I stop the current line of thinking here [21:09] niemeyer, lovely, tyvm [21:09] fwereade: Sorry for the trouble [21:09] niemeyer, no trouble :) [22:11] fwereade: ping [22:11] niemeyer, pong [22:11] fwereade: Yo [22:11] niemeyer, how's it going? [22:11] fwereade: Up for a quick review fixing that issue? [22:11] niemeyer, absolutely [22:11] niemeyer, then sleepytime ;) [22:11] fwereade: Cool, pushing [22:13] fwereade: I bet! 8) [22:15] fwereade: https://codereview.appspot.com/6510043 [22:15] I've screwed up the description.. fixing meanwhile [22:16] niemeyer, LGTM [22:16] Done [22:16] fwereade: Cheers! [22:30] niemeyer, btw, I think I have convinced myself that the reset --soft is pure superstition; and that in any cases which may exist in which deleting the lock file is insufficient, we really can't do anything about this [22:31] niemeyer, *but* that I also shouldn't even try to recover on unit agent startup -- the user could very easily be logged in and messing with the dir while the uniter is in an error state, and we can't just charge in and start doing stuff [22:31] fwereade: Ah, cool, thanks for the note.. I really wasn't sure [22:32] niemeyer, so I now want to just speculatively delete the lock file before every command that hits the index [22:32] fwereade: Sounds good.. your atomicity scheme kind of makes it unnecessary as long as the automated process is concerned, so sounds sane [22:32] fwereade: Uh oh [22:33] niemeyer, hm, is that bad? [22:33] fwereade: Sounds like the two extremes.. [22:33] fwereade: We're not reseting because the user might be messing around, but we're happy to kill the lock all the time? [22:33] niemeyer, the most authoritative-sounding source I could find was a git list thread from 2009, in which they discussed making the "just delete the lock file" message carry more of that intent [22:34] niemeyer, the only times we have any reason to kill the lock are precisely when we're doing things with the index [22:34] niemeyer, this implies that we are upgrading, or running hooks, and if the user is messing around in the charm dir while we are working that is on hi own head ;) [22:35] fwereade: I see, so hitting the index as in changing it [22:35] fwereade: Not just reading, right? [22:35] niemeyer, I'm not sure it'll even read the index when something else has a lock [22:36] niemeyer, but this was in fact done by a process of pure research [22:36] fwereade: I'd rather start more conservative, if feasible [22:36] fwereade: The lock is there precisely to avoid processes from stepping on each other [22:37] niemeyer, of all the commands we use: add, commit, pull, reset all require that the lock file not exist [22:37] fwereade: Sure, because if it exists someone/something else may be acting on the directory [22:37] fwereade: But, as I understand all the changes are done in independent directories [22:37] fwereade: With your atomicity mechanisms [22:37] fwereade: Which means that if we ever see a lock, something awkward is happening [22:38] fwereade: Breaking when something awkward happens seems legit [22:38] niemeyer, not actually true, but it could be made to be so [22:38] fwereade: Ah, indeed, the final pull I guess [22:38] niemeyer, yeah, the charm dir itself is not a swappable symlink [22:38] niemeyer, although maybe it should be [22:38] fwereade: COol, we can leave that [22:39] fwereade: Even then, I'd be fine with asking the user to go there and fix it for the moment, before we're sure that stuff works fine [22:39] fwereade: Seems useful to learn about how frequently we'll observe lock files left around and whatnot [22:39] niemeyer, hm, ok, let me think a sec [22:40] fwereade: We can always change our mind and be less conservative [22:41] niemeyer, in that case I suspect I favour just explicitly blowing up when we see a lock file that wold impede the impending operation [22:41] niemeyer, sane granularity? [22:41] fwereade: Yeah, that's what I mean [22:41] fwereade: Ah, explicitly [22:41] fwereade: You mean checking it's there? [22:41] niemeyer, we shouldn't be operating on the charm dir except at times when the user really shouldn't be operating on the charm dir [22:42] niemeyer, hmm, just let it fail even :) [22:42] fwereade: Right.. there's no point in checking, since that's exactly the race that the lock file is meant to protect against [22:42] niemeyer, true :) [22:43] niemeyer, I'm just thinking it through... this is a potential new error state [22:44] fwereade: Really? How is this any different from any other error related to the charm deployment? [22:44] fwereade: Can't create dir, can't download, etc [22:45] niemeyer, ah, ok, it's a totally-screwed-agent-down-uniter-log-filled-with-errors situation, not a polite-error-message-log-in-nicely-and-resolve-it one? [22:45] niemeyer, I guess all of those are [22:45] fwereade: Right [22:46] niemeyer, yep, ok, I'm just more scared of git because I'm more aware of my ignorance [22:46] fwereade: Well.. some of those errors surely are resolvable too, right? [22:46] fwereade: I'm not sure if you're saying these are conditions we can't recover from or not? [22:47] niemeyer, I'm just saying that currently we don't [22:47] fwereade: Okay, I'm happy to move on and polish later, but in principle those conditions should also be recoverable [22:48] fwereade: E.g. a download error is a pretty unexceptional error [22:48] fwereade: As Amazon would gladly tell you [22:48] niemeyer, so *that* will work just fine assuming the resource is actuallythere... eventually it'll succeed [22:48] fwereade: Perhaps not gladly, no :-) [22:49] niemeyer, and therefore shouldn't be reported, at least not immediately, because we expect it to be transient [22:49] fwereade: Right, ok.. so that's what I mean.. I see all of those cases as "wedged, needs action" [22:49] fwereade: but "juju resolved" in general should mean "Alrightly, will try again and see if that works now" [22:51] niemeyer, I'm not sure I agree that a dropped connection during a download is reason enough to stop doing things until the user comes to help us [22:52] fwereade: Sorry, yes, you're right.. that's not a good example [22:52] niemeyer, that's the trouble [22:52] fwereade: I mean a case like "can't create directory" [22:52] niemeyer, some uniter errors are resolvable by time, and others justify a stop-the-world [22:53] niemeyer, we do not as yet have a way of classifying them [22:53] fwereade: I see, ok [22:53] niemeyer, it would be a nice thing to be able to do so but it also feels like it might get hairy [22:54] fwereade: I think we might do with some heuristics [22:54] fwereade: without having to put them in separate buckets [22:55] niemeyer, so, given that, I would like to just drop the concept of recovery entirely and let the process fail as it pleases [22:55] fwereade: E.g. if it's something we want to retry automatically, don't yet tell it's an error [22:55] fwereade: But if we retry so many times, or for so long, and it's not working, stop trying and get into an error state [22:55] niemeyer, yeah, reasonable -- feels like a different change to just dont-ever-recover though [22:55] fwereade: This is even a conservative approach to prevent a large system from self-destructing [22:56] fwereade: Agreed [22:56] niemeyer, but, yeah, absolutely, sounds like what we should be doing to me [22:56] fwereade: To make things simpler for us to move forward, it's probably fine to start with don't-ever-recover, and move towards auto-retry [22:57] niemeyer, well, auto-retry will happen on everything for free via upstart [22:58] fwereade: That's not very nice [22:58] fwereade: We're managing retrying internally in other cases already [22:58] fwereade: E.g. connections to state [22:59] fwereade: Well, all errors really [22:59] fwereade: In other agents we manage retry-after-error internally [23:00] niemeyer, oh, er, yeah, sorry, I *did* copy that bit in from the machine agent [23:00] niemeyer, ok, I see where you're coming from now [23:01] niemeyer, so managing the heuristics becomes a little easier, and it's still a good idea to add some sort of heuristics [23:01] niemeyer, er, sorry nonsense [23:02] niemeyer, it feels to me like "try again" should always be reasonable behaviour if we've written the workers right [23:03] niemeyer, the spinning and logspam are unsightly, it is true [23:04] fwereade: It is always true, but in some cases we may need authorization to do so, due to the non-idempotency of some operation [23:05] niemeyer, and that should be handled internally by the uniter, surely? [23:05] niemeyer, the uniter knows about these things; the unit agent just needs to keep a uniter running at all times, except when it's upgrading [23:06] niemeyer, ah ok I think I see a source of confusion [23:06] fwereade: That, what? [23:06] niemeyer, when I say always-retry is the default I am specifically referring to errors that cause the uniter to fail [23:07] niemeyer, hook errors do not cause the uniter to fail -- when state indicates a hook error, the uniter runs just fine, in a mode in which it waits for resolution [23:07] niemeyer, the uniter will not retry an upgrade or a hook unless requested [23:08] niemeyer, and so it should always be safe for the unit agent to launch a new one after the first has failed [23:08] fwereade: If a conflict happens on a merge.. how is that handled? [23:09] niemeyer, the GitDir returns ErrConflict, magic happens, the uniter runs ModeConflicted [23:10] fwereade: and then? [23:10] niemeyer, eventually the user resolves the error, and we pull again, commit, and continue in ModeStarted [23:11] fwereade: Okay, it is an error that doesn [23:11] 't cause the uniter to bail out, then? [23:11] niemeyer, exactly [23:11] fwereade: How's a Conflicted error different from a "directory has a lock" one, from our perspective? [23:12] niemeyer, no different, essentially; this is a point of trouble :/ [23:12] fwereade: How's it trouble? It seems like a solution.. if it's the same, let's unify the handling [23:13] niemeyer, ha, yes indeed [23:13] niemeyer, there's no good reason for any of the git operations to fail, none of them touch the network [23:13] niemeyer, if they *ever* do we can go straight to help-fix-git-please mode [23:14] fwereade: Right [23:14] fwereade: help-fix-deployment even [23:14] fwereade: heck-were-is-my-data alternatively [23:14] :-D [23:15] where [23:15] niemeyer, indeed, it's currently called ModeConflicted and assumes that it's because of an upgrade, but I don;t think that's fundamental, just conincidental [23:15] haha [23:16] fwereade: Yeah, ModeDeploymentError or something [23:16] fwereade: status-info ftw [23:22] niemeyer, yep, ok, I think it's all percolating nicely through my brain; time to sleep for now, I hope I'll have time to sort out smarter error handling before too long [23:22] gn [23:22] fwereade: Have a good night