#juju-dev 2012-04-23
<wrtp> TheMue, fwereade: good morning!
<fwereade> wrtp, heyhey
<fwereade> wrtp, I'm getting test failures in the ssh stuff
<TheMue> wrtp, fwereade: morning
<wrtp> fwereade: let me see 'em!
<fwereade> wrtp, permissions on the key file are "too open"
<wrtp> fwereade: oh yes, sorry, i forgot to fix that. darn. will do.
<wrtp> fwereade: workaround is: chmod go-rwx state/sshtest/*
<fwereade> wrtp, cheers
<wrtp> fwereade: bloody ssh being too clever for its own good
<fwereade> wrtp, haha, yeah
<fwereade> wrtp, TheMue: nice weekends?
<wrtp> fwereade: yes thanks. had a couple of friends to stay, had large quantities of curry, a nice walk, and a certain amount of nice whisky too.
<wrtp> fwereade: you?
<fwereade> wrtp, lovely :)
<wrtp> fwereade: still got a pot full of some of the curry left for lunch :-)
<fwereade> wrtp, yeah, very nice, wandering valletta with the family on sat and an afternoon of booze and boardgames with some friends on sun
<fwereade> wrtp, awesome
<wrtp> sounds nice
<TheMue> fwereade: Yes, with two parties. On Saturday our neighbor turned 50, on Sunday my niece had her confirmation.
<fwereade> wrtp, actually afternoon/evening I guess, munchkin takes too long really
<TheMue> fwereade: So too much to drink and eat. *lol*
<fwereade> wrtp, I guess we're still learning it, just need to be a bit snappier about playing
 * wrtp doesn't know about munchkin
<fwereade> TheMue, lovely, sounds like we all did that ;)
<fwereade> wrtp, it's a card game that aims to be the (humorous) essence of D&D
<fwereade> wrtp, killing monsters, stealing treasure, betraying friends
<wrtp> fwereade: with success?
<wrtp> fwereade: (its aim, that is)
<fwereade> wrtp, a surprising amount actually
<fwereade> wrtp, down to the scope for rules-lawyering arguments ;)
<fwereade> wrtp, the rules are pretty simple but a lot of the cards tweak them in one way or another
<fwereade> wrtp, and the overlaps/conflicts are not always well-specified
<fwereade> wrtp, so it's not a technically "good" game IMO but it's kinda fun
<fwereade> wrtp, also, most of the cards have/are some sort of joke
<fwereade> wrtp, "invoke obscure rules: go up a level"
<wrtp> fwereade: funnier if you've previously played D&D, perhaps?
<fwereade> wrtp, probably
<fwereade> wrtp, but people seem to enjoy it even if they haven't
<wrtp> fwereade: sounds fun
<fwereade> wrtp, it's a nice change of pace from catan/carcassonne
<wrtp> fwereade: oh yeah, minor version compatibility:
<fwereade> wrtp, oh yes?
<wrtp> fwereade: if you increment a minor version, you can add (backwardly compatible) features
<wrtp> fwereade: so if you've got a client with a minor version of 2, it might not work with agents with a minor version of 1
<fwereade> wrtp, ah; my reading had been that, if it doesn't deal with older versions, that's a non-backward-compatible change, and demands a major version bump
<wrtp> fwereade: hmm. interesting.
<fwereade> wrtp, it seems to me that in that sort of mixed environment we'd really want to just expose a lowest-common-denominator feature set
<wrtp> fwereade: i thought that 1.0.0 is compatible with 1.1.0 but not vice versa.
<wrtp> fwereade: because otherwise AFAICS minor versions can't actually add any features
<fwereade> wrtp, that makes sense as well tbh
<fwereade> wrtp, they can add features but only expose the common capabilities of the whole env
<wrtp> fwereade: yeah
<fwereade> wrtp, sounds like a hassle to implement, but... ;)
<wrtp> fwereade: so things are backwardly compatible
<wrtp> fwereade: but not necessarily forward compatible.
<fwereade> wrtp, that's what it intuitively means to me but I expect someone's come up with a rigorous definition of the phrase "backward compatible" somewhere ;)
<wrtp> fwereade: yeah, well this is why it's a good time to be having the conversation... the version package seems like a good place to start
<fwereade> wrtp, absolutely so :)
<fwereade> wrtp, ok, think of the common situation, that someone upgrades their client juju from 2.2 to 2.4
<wrtp> ok
<fwereade> wrtp, to be suddenly unable to interact, without upgrading the whole env, is a serious problem IMO
<fwereade> wrtp, so 2.4 *must* be able to emit stuff that can be understood by versions back to 2.0.0
<fwereade> wrtp, plausible reading of it?
<wrtp> fwereade: yes
<wrtp> fwereade: however
<wrtp> fwereade: think about when we've upgraded a client and now *bootstrap* a new environment
<wrtp> fwereade: we want to be able to use the new client features
<fwereade> wrtp, yeah; and because everything in the env has version >=2.4, we can
<wrtp> fwereade: yeah
<fwereade> wrtp, if everything doesn't then we block attempts to use them with friendly errors
<wrtp> fwereade: yes
<wrtp> fwereade: although
<wrtp> fwereade: i suppose that perhaps we shouldn't worry too much. if the latest minor version is there, we'll use it. if it's not, then we should probably still use the earlier version.
<wrtp> fwereade: so perhaps my compatibility test *is* wrong.
<wrtp> fwereade: because even though all the features of 2.4 won't work with a 2.2 backend, if a 2.2 backend is all we can find, we shouldn't bomb out
<fwereade> wrtp, exactly
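The fallback behaviour just agreed on (use the backend's older feature set rather than bombing out, but refuse across major versions) could be sketched roughly like this; the `Version` type and `Negotiate` function are invented for illustration, not the real version package:

```go
package main

import "fmt"

// Version is a hypothetical major.minor pair; in a mixed environment the
// usable feature set is bounded by the lowest minor at a given major.
type Version struct {
	Major, Minor int
}

// Negotiate returns the version whose features both sides can use, or an
// error when the majors differ (no compatibility across majors).
func Negotiate(client, agent Version) (Version, error) {
	if client.Major != agent.Major {
		return Version{}, fmt.Errorf("incompatible major versions: %d vs %d", client.Major, agent.Major)
	}
	if agent.Minor < client.Minor {
		// Older backend: fall back to its feature set rather than bombing out.
		return agent, nil
	}
	return client, nil
}

func main() {
	v, err := Negotiate(Version{2, 4}, Version{2, 2})
	fmt.Println(v, err) // a 2.4 client talking to a 2.2 backend uses 2.2 features
}
```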
<fwereade> wrtp, the thing is, part of me wants to impose versioning at the node-data-struct level
<fwereade> wrtp, but I can't really justify it very well, even with lots of handwaving
<wrtp> fwereade: doesn't that amount to the same thing, given that all of those are created by the code?
<fwereade> wrtp, kind of... it's just that the various kinds of node data won't necessarily be changing in sync with one another
<fwereade> wrtp, and that's to (1) accommodate the old-environment thing and (2) not bump major versions all the time
<wrtp> fwereade: i think it might be useful to use node-data-struct compatibility to determine the overall version number
<fwereade> wrtp, we'll need some way of saying "write the 2.2 format of a unit workflow state node"
<fwereade> wrtp, yeah, that too
<wrtp> fwereade: i think we can be simpler than that
<fwereade> wrtp, oh yes?
<wrtp> fwereade: like having a rule to say: sub-nodes can only be added, not removed or changed, without losing backward compatibility
<wrtp> fwereade: kinda structural compatibility
<wrtp> fwereade: rather than versioning every node
<fwereade> wrtp, my concern is that that has a fossilizing effect and, long term, leads to horrid frankensteiny data structures
<wrtp> fwereade: if anything is frankenstein it's 2.2 unit nodes bolted on to 2.1 machine nodes :-)
<wrtp> fwereade: and we can clean up at major versions
<fwereade> wrtp, "we can clean up at <point in future>" is an argument of which I have learned to be suspicious
<wrtp> fwereade: if everything is versioned, you end up with n versions of the code, one for each version
<wrtp> fwereade: i think the protobuf/gob approach can work better in general
<fwereade> wrtp, yes: N datatype versions per node
<wrtp> fwereade: yuck.
<wrtp> fwereade: don't add a node unless you mean it :-)
<fwereade> wrtp, it seems to me to encapsulate the yuckiness better than having a 2.6 format extended from 2.5 from ... from 2.0
<fwereade> wrtp, indeed, we should not abuse it
<wrtp> fwereade: i think it would significantly complicate the code
<fwereade> wrtp, but either way promiscuously changing data formats is icky and costly
<wrtp> fwereade: presumably you'd need a separate data type for each version
<fwereade> wrtp, yeah; I don't anticipate too many versions
<fwereade> wrtp, but if we have 3 versions I'd rather have 3 versions than one version that has to handle reading/writing all 3 formats
<fwereade> wrtp, even *with* the restrictions on how the format can change which at least cuts down on that specific burden
<wrtp> fwereade: i don't think it's that much of a problem - you do it with subset/superset
<wrtp> fwereade: read the "A bit of history" section at https://developers.google.com/protocol-buffers/docs/overview for some context
<fwereade> wrtp, like I said, it's a pretty handwavey preference... and protobuf does seem to me to be essentially a more sophisticated version of what I propose
<wrtp> fwereade: i think so. and it doesn't have version numbers.
<fwereade> wrtp, I wouldn't like it if we had to manage all the version-munging by hand
<wrtp> fwereade: agreed. so i think we can work out ways to proceed forward in such a way that version n is automatically compatible with version n-1
<wrtp> fwereade: (version == minor version that is)
<fwereade> wrtp, yeah, SGTM
<wrtp> fwereade: but i do think that writing down the schema in some semi-formalised way might help to see potential compatibility problems as they happen
<wrtp> fwereade: not sure what syntax we'd use though
<fwereade> wrtp, tbh if we're not generating stuff from the schema and *knowing* that we cannot change the schemas except in tightly prescribed ways I don't think it'd be worth the effort
<fwereade> wrtp, it's basically just a comment, and has all the attendant problems
<wrtp> fwereade: i, for one, would quite like to see that comment :-)
<wrtp> fwereade: but i take the point
<fwereade> wrtp, I agree that a description of the intent as it was at some unknown time in the past is better than nothing, so long as you are always aware that's what you have
<wrtp> fwereade, TheMue: i wonder if it might make sense to have some table-driven tests in state (maybe there are already and i've missed them) where we start with a given zk tree (maybe generated by a previous version) and verify that it has the expected properties.
<fwereade> wrtp, +1
<wrtp> fwereade: that way we could do automatic backward-compatibility checking
<fwereade> wrtp, exactly
<TheMue> wrtp: so far there are table-driven tests only for the watches
<TheMue> wrtp: the other ones are almost 1:1 ported from Python
<fwereade> wrtp, but then I start to feel that we're putting all the infrastructure for separate datatypes in place but trying to kid ourselves that they're not
<wrtp> fwereade: that's just testing
<wrtp> fwereade: the actual code remains straightforward (hopefully)
<TheMue> wrtp: ah, get it, you don't mean unit tests but a kind of compatibility tests as part of the features
<wrtp> fwereade: as part of the testing, yes.
<wrtp> oop
<wrtp> s
<wrtp> TheMue: ^
<fwereade> wrtp, indeed
<TheMue> wrtp: unit testing during dev or prerequisite testing during upgrades?
<wrtp> TheMue: unit testing during dev
<fwereade> wrtp, so +1 table-driven tests, and we'll see if anything else turns out to be a good idea at some point in the future
<wrtp> TheMue: at every version, you'd generate some representative zk trees and dump them, along with the results of various API calls on those trees (e.g. Unit.CharmURL, etc etc etc)
<TheMue> wrtp: Then I didn't get it. What exactly do you want to test?
<wrtp> TheMue: i want to test that a subsequent version still returns the same results on the same zk tree
<wrtp> TheMue: even though the zk tree was generated by a previous version of the code.
<wrtp> of course, it would be nice if we could test the old code against the new data structures too...
<TheMue> wrtp: OK. Makes it more clear.
<wrtp> TheMue: i'm imagining just testing the read-only methods, because testing the writing is harder and i can't think of a way of doing it nicely.
<wrtp> TheMue: for read-only methods, i think it might be possible to do something neat with reflection to automatically call methods and check/store the results
<TheMue> wrtp: Yes, it's a hard topic. What exactly is the result you would expect by such a test?
<wrtp> TheMue: this is my thought (it might be wrong!): call a load of functions to create a given state. dump the zk tree for that state, and also dump the results of enumerating all read-only methods on that state (reflection *might* make it possible to do that by simply naming the read-only methods). to test, we restore the zk tree, then do the same enumeration and check the result are the same as the dumped values.
<TheMue> wrtp: So as a result you get a verification that the read methods with a newer version are able to read the data created with an older version. Am I right?
<wrtp> TheMue: yes
<wrtp> TheMue: you could also branch the older version and add test data generated by a newer version.
<TheMue> wrtp: Sounds good, so far I only dislike the reflection part. I would like to make it more explicit.
<wrtp> TheMue: i think it might just save lots of boilerplate code, but if you don't mind writing boilerplate...
<TheMue> wrtp: There would be an initial effort, but later adding new readers is simple. I fear the read-by-accident.
<wrtp> TheMue: read-by-accident?
<TheMue> wrtp: And this first list of readers could be generated by reflection and then manually controlled.
<TheMue> wrtp: How do you know (by reflection) which methods are the (important) readers?
<wrtp> TheMue: you don't - you name the methods explicitly. but use reflection to make the call and dump (or restore) the data.
<wrtp> TheMue: but you're probably right - it might not be too much work and perhaps it's better to be explicit
<TheMue> wrtp: OK, that's the explicit part I missed and wanted. I thought your idea would even start at a higher level and 'detect' those read methods by reflection. That has been my fear.
<wrtp> TheMue: no, that's not possible, i think
<TheMue> wrtp: Exactly
<wrtp> TheMue: i guess all one would need would be a function that takes a state and returns JSON-marshalable data from that
<wrtp> TheMue: then we can do a DeepEqual on the dumped JSON and current result
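A minimal sketch of the scheme just described: readers named explicitly, their results gathered into a JSON-marshalable map, and the round-tripped dump compared with DeepEqual. The `state` type, its methods, and the values are all invented stand-ins for the real State API:

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// state stands in for the real State type; its read-only methods are
// listed explicitly in snapshot rather than discovered by reflection.
type state struct{}

func (state) CharmURL() string { return "cs:wordpress-1" }
func (state) UnitCount() int   { return 3 }

// snapshot calls each named reader and returns JSON-marshalable data.
func snapshot(s state) map[string]interface{} {
	return map[string]interface{}{
		"CharmURL":  s.CharmURL(),
		"UnitCount": s.UnitCount(),
	}
}

func main() {
	// In a real test the dumped blob would be golden data stored by a
	// previous version; here we just round-trip through JSON.
	dumped, _ := json.Marshal(snapshot(state{}))

	var want, got map[string]interface{}
	json.Unmarshal(dumped, &want)
	current, _ := json.Marshal(snapshot(state{}))
	json.Unmarshal(current, &got)

	fmt.Println(reflect.DeepEqual(want, got))
}
```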
<TheMue> wrtp: Why exactly JSON?
<wrtp> TheMue: 'cos it's a nice format for dumping
<TheMue> wrtp: We could also gob it and write the binary.
<TheMue> wrtp: I like JSON (especially more than XML or YAML), but here I could live with gob.
<wrtp> TheMue: i think the transparency of JSON could be good here. and we might want to change the struct names.
<wrtp> TheMue: i think it would be useful to be able to manually inspect both the output tree and the dumped state data.
<TheMue> wrtp: Why? In that case the compatibility check would fail.
<wrtp> TheMue: the compatibility check would fail unnecessarily
<TheMue> wrtp: The test is "read old, dump, deploy new, read and compare dumped data".
<wrtp> TheMue: i don't think the test itself would dump
<wrtp> TheMue: that would be a separate thing, done every so often
<TheMue> wrtp: Not the technological test run with go test, I described the whole test.
<wrtp> TheMue: ok. yes, that sounds right.
<wrtp> TheMue: so you don't want the "compare dumped data" step to fail unnecessarily
<TheMue> wrtp: You have to call the read methods of version A, dump those results, because you expect them later, install version A+1, run the read methods again and compare.
<wrtp> TheMue: yes
<wrtp> TheMue: except that we do the version A thing, then take those files and store them in version A+1 as test data
<TheMue> wrtp: Oh, yes, missed it.
<TheMue> wrtp: We need a ZK dump for simple restore of test data and a Go-readable dump for the comparisons.
<wrtp> TheMue: exactly
<TheMue> wrtp: Sounds like an interesting job. Any experience dumping and restoring ZK?
<wrtp> TheMue: and i think it might make sense for both dumps to be human-readable - that way we get automatic "documentation" of what the zk tree actually looks like, killing two birds with one stone
<wrtp> TheMue: should be trivial - just: type ZkNode struct {Contents string; Children map[string] ZkNode}; func GetContents(zk *Zk) ZkNode
<wrtp> TheMue: i don't think we need to check the metadata
<TheMue> wrtp: Then you should write a novel marshaller. *lol* "The identificator of this unit is …"
<TheMue> wrtp: But indeed, JSON is pretty fine here.
<wrtp> TheMue: don't understand previous remark...
<wrtp> yeah, JSON should do all the work just fine
<wrtp> ahh!
<wrtp> you mean a marshaller that produces documentation!
<wrtp> a fine idea. i'll leave it as an exercise for the reader :-)
<TheMue> wrtp: But a novel, generated out of ZK, could be fun too. And you could configure if it is crime, mystery or love. *rofl*
<TheMue> wrtp: Hehe.
<wrtp> TheMue: definitely science fiction
<wrtp> TheMue: all this stuff is already sf :-)
<TheMue> wrtp: That's pretty simple. "$ dump" leads to 42.
<TheMue> wrtp: Oh, no, error, that's the Python version.
<TheMue> *lol*
<wrtp> TheMue: i've thought of a particular reason why dumping as JSON might be better for the values too.
<wrtp> TheMue: it means we can potentially do forward-compatibility tests even when the new version has introduced some new fields.
<wrtp> TheMue: so rather than using DeepEqual, we could use an equality check that ignored new fields in the new version
<TheMue> wrtp: Sounds good.
<fwereade> TheMue, you've done some stuff with tomb recently, right?
<TheMue> fwereade: Yes.
<fwereade> TheMue, I was wondering what happened to tomb.Stop
<TheMue> fwereade: Isn't the old Stop() now Kill()?
<fwereade> TheMue, IIRC Stop was a constant that you could use to kill without it showing up as an error
<fwereade> TheMue, now it seems that Err() returns nil in 2 cases: "no error has yet been encountered" and "we're shutting down cleanly"
<TheMue> fwereade: Kill(nil) is allowed.
<fwereade> TheMue, and if you're checking the error return from a blocking call to something the gets Closed in another goroutine in response to <-Dying()
<fwereade> TheMue, then it's important to distinguish between "someone is shutting us down deliberately, so the read error caused by the Close should be ignored"
<fwereade> TheMue, and "whoa, read error! help help"
<fwereade> TheMue, checking for the error being tomb.Stop allowed for that
<fwereade> TheMue, is there another way to do what I ask?
<TheMue> fwereade: So watchers use two tombs. The first one is inside of the generic watcher types. They only handle the inner state of the watcher.
<TheMue> fwereade: And those concrete watchers who use it have their own tombs.
<TheMue> fwereade: They are used if there is a logical/technical error on this level (e.g. an illegal content delivered by the content watcher).
<fwereade> TheMue, sorry, reading code, trying to figure out the analogous bit
<fwereade> TheMue, ah, hold on, I think I see what to do
<TheMue> fwereade: So the tomb of the ContentWatcher may return a problem with ZK. And if everything is ok the content watcher delivers a string that may be not interpretable by the surrounding ConfigWatcher.
<fwereade> TheMue, ok, wait, I don't think it's the same situation
<fwereade> TheMue, say I have `conn, err = listener.Accept` in my looping bit
<fwereade> TheMue, this blocks, and makes select on Dying() tricky
<fwereade> TheMue, so I have a goroutine which blocks on Dying() and calls Close on the listener
<fwereade> TheMue, this causes the blocking Accept to return an error
<fwereade> TheMue, which in *this* case is expected, and should be swallowed
<fwereade> TheMue, but I need some mechanism to tell me whether that was the case
<TheMue> fwereade: Uff, nice task. Sounds tricky.
<fwereade> TheMue, tomb.Stop was a magic value to pass to Fatal(), which allowed me to distinguish between the two cases
<fwereade> TheMue, it was always a little ugly, but it did work ;)
<TheMue> fwereade: Could you paste it? Or is it too much code?
<fwereade> TheMue, relevant bit is here: http://paste.ubuntu.com/942426/
<fwereade> TheMue, there are a couple of bits that appear conspicuously stupid
<fwereade> TheMue, I have no idea why I blocked on Dead() as well in the killer goroutine
<niemeyer> Morning!
<TheMue> niemeyer: Morning.
<TheMue> fwereade: Just scanning it.
<fwereade> niemeyer, heyhey
<TheMue> fwereade: Don't really get why you're using the goroutine here.
<fwereade> TheMue, how would you structure it?
<TheMue> fwereade: One moment, maybe I've got an idea. You want to keep the err to return it to a possible caller?
<wrtp> niemeyer: yo!
<fwereade> TheMue, I want server.Close() to return the error but only if it's a real error
<TheMue> fwereade: OK
<fwereade> TheMue, and likewise it seems sensible to have Wait() working as I would expect
<fwereade> TheMue, (maybe I shouldn't be *embedding* a Tomb at all... it seemed like it might be simpler... misuse?)
<wrtp> fwereade: one initial trivial remark: there's no reason for the goroutine to wait on <-s.Dead()
<niemeyer> wrtp: Heya
<niemeyer> fwereade, TheMue: What's up folks
<fwereade> wrtp, yeah, I noticed that... the more I look at this the more I think I should start from scratch again
 * wrtp nods
<fwereade> wrtp, I have a horrible feeling that I was halfway through a change :/
<TheMue> fwereade: Back in a moment ...
<wrtp> niemeyer: we had a discussion about versions
<wrtp> niemeyer: you might have seen this: https://codereview.appspot.com/6082044/
<niemeyer> wrtp: Indeed, and I'm participating on it :-)
<wrtp> niemeyer: (my impression of semantic versions)
<wrtp> niemeyer: this morning we also discussed ways of testing state backward compatibility
<niemeyer> wrtp: Ok.. I haven't seen this code yet, and I'd prefer to continue the conversation until we reach consensus before jumping into an implementation
<wrtp> niemeyer: ok. the versions package was only a couple of hours' work - i thought people seemed pretty much agreed on using semantic versioning, so it seemed to make sense to write it as a point of reference.
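For context on the ordering rules being debated below, a toy semantic-version parser and comparator might look like this; it is only a sketch of the semver idea (a pre-release sorts before its final release), not the package under review:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// version is a toy take on a semantic version; the real package will
// differ in detail (build metadata, pre-release ordering, etc).
type version struct {
	Major, Minor, Patch int
	Pre                 string // "" means a final release
}

var verRe = regexp.MustCompile(`^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$`)

func parse(s string) (version, error) {
	m := verRe.FindStringSubmatch(s)
	if m == nil {
		return version{}, fmt.Errorf("invalid version %q", s)
	}
	atoi := func(d string) int { n, _ := strconv.Atoi(d); return n }
	return version{atoi(m[1]), atoi(m[2]), atoi(m[3]), m[4]}, nil
}

// less orders versions, putting a pre-release before its final release.
func less(a, b version) bool {
	if a.Major != b.Major {
		return a.Major < b.Major
	}
	if a.Minor != b.Minor {
		return a.Minor < b.Minor
	}
	if a.Patch != b.Patch {
		return a.Patch < b.Patch
	}
	// Same base version: a pre-release sorts before the release proper.
	return a.Pre != "" && b.Pre == ""
}

func main() {
	a, _ := parse("1.0.0-pre1")
	b, _ := parse("1.0.0")
	fmt.Println(less(a, b)) // true
}
```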
<TheMue> *: Have an interruption here, will be back later. Sorry fwereade
<niemeyer> wrtp: Sounds good.. we'll likely need this in a bit
<wrtp> niemeyer: (kapil's reply came quite a bit later)
<fwereade> TheMue, np, I'll experiment
<wrtp> niemeyer: so are we not agreed that your plan seems good? i was all set to go ahead with it.
<niemeyer> wrtp: I think people are mostly happy with it, so yeah, we can move forward with it I suppose
<wrtp> niemeyer: what do you think about my suggestion of using prerelease versions instead of odd-numbered versions?
<niemeyer> wrtp: There's just some minor disagreements there that we should settle on
<niemeyer> wrtp: I'm trying to answer it! :-)
<wrtp> niemeyer: :-)
<niemeyer> Ugh.. does bzr not have a hash for its revisions?
<wrtp> niemeyer, TheMue, fwereade: are we gonna have a meeting?
<niemeyer> wrtp: Yeah.. it'd be in 1h per the previous agreement, but I'm happy to do it sooner again
<fwereade> wrtp, niemeyer, TheMue: now is fine for me
<niemeyer> TheMue?
<wrtp> niemeyer: ah, sorry, i got my calendar time wrong by an hour
<wrtp> niemeyer: happy in an hour if TheMue is lunching
<fwereade> wrtp, niemeyer: ah yeah, TheMue did mention he had to go for a bit, didn't he?
<wrtp> no, it seems that google calendar is just confusing. it says "GMT+00" when it actually means "BST-01" (or is that +01 ?)
<wrtp> niemeyer: i'm not sure using the build scheme will work for dev versions
<wrtp> niemeyer: build versions have higher priority than the base versions
<wrtp> niemeyer: which means that 1.2.0+dev would override 1.2.0, which isn't what we want, i think.
<wrtp> niemeyer: but i'm probably misunderstanding
<niemeyer> wrtp: It actually is what we want
<niemeyer> wrtp: Oh, wait
<niemeyer> wrtp: No, it's not..
 * niemeyer reads again
<niemeyer> wrtp: Ah, interesting, I misunderstood indeed, but it actually works just as well
<niemeyer> wrtp: It just inverts what we put the +dev on
<niemeyer> wrtp: We need to put +dev on the last release rather than on the next one
<fwereade> niemeyer, wrtp: pre-release on the next version feels slightly cleaner/clearer to me
<niemeyer> fwereade: That's no better than the odd/even scheme..
<niemeyer> fwereade: That said, I guess using the build scheme in that way is no better either
<fwereade> niemeyer, yeah, I was trying to figure out where the relevant distinction lay
<fwereade> btw, I was chatting to someone about their organisation's data the other day
<niemeyer> fwereade, wrtp: Maybe -dev and -pre$N would do it..
<fwereade> niemeyer, that looks sensible at first glance
 * fwereade decides the thread of conversation it looked like he was going to start is a derail
<niemeyer> We're still breaking the concept a little bit by allowing -dev to change, but this should be fine really
<fwereade> niemeyer, heh, dev-<UUID>
<niemeyer> fwereade: Exactly.. :)
<wrtp> niemeyer: i don't *think* the concept is broken by allowing -dev to change, as the spec says nothing about breaking changes *between* prerelease versions
<niemeyer> fwereade: I guess we could have part of the bzr revid as a hash.. the last few chars.. something like 1.0.2-dev.$REVID[-8:]
<fwereade> niemeyer, SGTM
<niemeyer> wrtp: There's a distinction between changing *between* pre-release versions and changing *pre-release* versions
<wrtp> niemeyer: ah, i don't think we'd do that. i see what you're suggesting now.
<niemeyer> wrtp: Well, we need to say so, and say how that's happening
<wrtp> niemeyer: we can use build versions on top of prerelease versions for that if we want
<wrtp> 1.0.2-dev+build1
<niemeyer> wrtp: Who defines build1?
<niemeyer> wrtp: and why? This looks ugly
<wrtp> niemeyer: i think we need to talk about who is going to use what versions...
<wrtp> niemeyer: we've got versions that we'll hand out to people
<wrtp> niemeyer: we've also got versions that we'll use for our own development
<niemeyer> wrtp: Sure, that's 1.2.3-pre4
<niemeyer> wrtp: Yep
<wrtp> niemeyer: for versions for our own development, everyone that wants can have their "own" prerelease tag
<wrtp> e.g. 1.2.3-rog
<niemeyer> wrtp: the suggestion above puts those as 1.2.3-dev.hash
<niemeyer> wrtp: No no no
<wrtp> niemeyer: if i number my own prereleases, then i can use version numbering between my own releases
<niemeyer> wrtp: Let's please have a simple and well defined scheme..
<wrtp> niemeyer: that seems well defined to me
<niemeyer> wrtp: Exactly. It is well defined *to you*
<wrtp> niemeyer: we allocate each developer a part of the revision namespace
<wrtp> niemeyer: the semantic version spec takes care of the rest
<niemeyer> wrtp: No, that's going in an entirely different atmosphere which is completely out of the conversation in the list
<wrtp> niemeyer: we can't use revid
<niemeyer> wrtp: Because..?
<wrtp> niemeyer: because revid only applies to trunk
<niemeyer> wrtp: Huh?
<niemeyer> wrtp: revision id != revision number
<wrtp> niemeyer: oh, sorry, yeah
<wrtp> niemeyer: the problem with that is they're not ordered
<niemeyer> wrtp: Yep.. they're not..
<niemeyer> wrtp: Which is why the idea is handling dev builds as flag
<wrtp> niemeyer: which is a problem if i'm iteratively deploying a test env, no?
<niemeyer> as a flag
<wrtp> niemeyer: ah, so what does the flag do?
<niemeyer> wrtp: It is a problem indeed, and I'm trying to address it taking in account the characteristics of bzr
<niemeyer> wrtp: a dev build would always upgrade something with a version <= version (respecting the details for major)
<niemeyer> wrtp: Even if the -dev bit matches
<niemeyer> wrtp: so "juju upgrade-juju --dev" would do the right thing as we iterate through development
<wrtp> niemeyer: i'm not convinced we need a dev mode. if we can work out how to order dev releases, then the versions can do the work
<niemeyer> wrtp: Without having to artificially bump revision numbers all the time
<niemeyer> wrtp: We can't order dev releases.. there's simply no way
<wrtp> niemeyer: i think there might be
<wrtp> niemeyer: after all, revisions are ordered within a given branch
<niemeyer> wrtp: Ok.. so please explain how two different branches can cross-upgrade to each other?
<wrtp> niemeyer: they don't need to, i don't think
<niemeyer> wrtp: Heh
<niemeyer> wrtp: I want them to.. I don't want to be shutting down an environment just because I switched branches
<niemeyer> wrtp: I'm trying to solve actual problems. semver.org is only relevant to me for as long as it's solving issues. It won't solve this issue, so we'll need to agree on something else that does.
<wrtp> niemeyer: hmm. there are two issues here
<wrtp> niemeyer: 1) can i deploy a given client version against a given backend? 2) can i upgrade a given client to a given client software version?
<wrtp> niemeyer: it seems to me that we haven't really touched on 2) yet.
<niemeyer> wrtp: 2 is completely out of our control..
<wrtp> niemeyer: oh? i thought that's what kapil was talking about in his message.
<wrtp> niemeyer: so if we're just talking about 1), i don't see why you'd need to shut down an environment because you switched branches. the deployed environment would still be compatible.
<wrtp> niemeyer: (because all branches with the same major version are compatible with each other)
<niemeyer> <wrtp> niemeyer: i'm not convinced we need a dev mode. if we can work out how to order dev releases, then the versions can do the work
<niemeyer> <niemeyer> wrtp: Without having to artificially bump revision numbers all the time
<niemeyer> <niemeyer> wrtp: We can't order dev releases.. there's simply no way
<niemeyer> wrtp: Because it's an *upgrade*..
<wrtp> niemeyer: sorry, i don't understand. what's an upgrade?
<niemeyer> wrtp: Ok.. we're not making progress
<niemeyer> wrtp: THe suggestion is this:
<niemeyer> actually.. I've just figured that pre-releases scheme isn't friends with Debian versions.. hah
<niemeyer> SpamapS: Was that your concern?
<wrtp> niemeyer: how do you mean?
<niemeyer> wrtp: 1.0.0-pre1 is greater than 1.0.0
 * wrtp doesn't know anything about Debian versions
<wrtp> niemeyer: in Debian versions?
<niemeyer> wrtp: Yeah
<wrtp> niemeyer: does Debian versioning matter for our version-selection algorithm?
<niemeyer> wrtp: Absolutely.. that's the environment we're living in
<niemeyer> wrtp: Using 1.0.0-pre1 will mean people will have to bump the series version every single release
<niemeyer> wrtp: We'd end up with a version like 123:4.2.1
<wrtp> niemeyer: what's the "123:" ?
<niemeyer> wrtp: Because that's the only way to force the Debian package to upgrade 1.0.0-pre1 to 1.0.0
<niemeyer> wrtp: It's the series version.. a monotonically increasing number that overrides the whole string-based comparison for cases like this
<wrtp> niemeyer: ah. it does lexical comparison?
<niemeyer> wrtp: It breaks down the version in parts and compares lexically, usually, but not if there's a series
<niemeyer> wrtp: Well.. not if there's a *different* series
<niemeyer> wrtp: So, I'm coming back to the conclusion that odd/even is perfectly fine
<wrtp> niemeyer: i'm still not quite sure why these versions have to pertain directly to debian versions. kapil was suggesting a single number. that wouldn't be a debian version either.
<niemeyer> wrtp: We'll be doing semver.org at the evens, and less strict on the odds
<niemeyer> wrtp: That'd compare fine as a deb package version
<niemeyer> wrtp: I'm talking about packaging
<wrtp> niemeyer: why would we bother packaging a prerelease version?
<niemeyer> wrtp: Because we want people to use it?
<niemeyer> wrtp: There's an alternative, though.. which is simple and might work
<niemeyer> wrtp: and maybe you're happy with..
<wrtp> niemeyer: for prerelease versions, we *could* say "just get the bzr repo"
<wrtp> niemeyer: go on
<niemeyer> wrtp: We could use the +dev tag, as suggested at semver.org
<niemeyer> wrtp: In the *previous* version
<wrtp> niemeyer: yes
<wrtp> erm, no
<niemeyer> wrtp: So immediately after 1.0.0, we tag it as 1.0.0+dev
<niemeyer> wrtp: But this still won't solve the pre-release issue, I guess
<wrtp> niemeyer: no, i think that breaks semantic versions too badly
<wrtp> but i really don't like the odd-numbered dev versions either
<wrtp> niemeyer: if we want people to use a prerelease version, why don't we just release it?
<wrtp> niemeyer: then we can bump the patch version as necessary when their feedback comes in
<niemeyer> wrtp: Because it's a pre-release.. I think the concept of a pre-release is well understood, but I can explain if not
<wrtp> niemeyer: please do. this area is all quite new to me.
<niemeyer> wrtp: A pre-release is a candidate to be the given release
<niemeyer> wrtp: Which isn't yet blessed as stable for production use
<wrtp> niemeyer: so... does debian packaging have pre-releases?
<niemeyer> wrtp: It can also be *really* unstable, rather than a candidate, though
<niemeyer> wrtp: The goal is to get people in the wild to experiment with what's coming
<niemeyer> wrtp: But only the brave souls that understand the risks of doing so
<niemeyer> wrtp: Debian packages may contain pre-releases if the authors decide to offer them as such
<wrtp> niemeyer: so, in your proposal, all odd numbered branches are pre-releases, yes?
<niemeyer> wrtp: No
<niemeyer> wrtp: They can also be development snapshots
<wrtp> niemeyer: one might say the two things played a similar role
<wrtp> ok...
<wrtp> all odd numbered branches are *potential* pre-releases, then
<wrtp> ?
<niemeyer> wrtp: Development snapshots are changing by the minute..
<wrtp> niemeyer: many potential pre-releases :-)
<niemeyer> wrtp: They can be seen that way, yes, if we find a way to tag them appropriately with a mark that is unique
<wrtp> niemeyer: so with Debian packages, how do you tag a pre-release version?
<niemeyer> wrtp: But even that is problematic during development.. you'd have to commit for every test you do, for example
<niemeyer> s/test/test deployment/
<wrtp> ha
<niemeyer> Maybe that's fine, though
<niemeyer> wrtp: Debian packages don't care about the details of the version being packaged. It just has an algorithm to define what's new and what's old that must be respected, or the upgrade won't happen.
<wrtp> i think it's reasonable.
<fwereade> niemeyer, wrtp: commit per deployment sounds sane -- it hasn't been a notable pain point in the past
<niemeyer> fwereade: Cool
<wrtp> niemeyer: so debian prerelease versions would always use an explicit series version?
<wrtp> niemeyer: hmm, that seems unlikely.
<niemeyer> wrtp: No.. series would have to be bumped every single time a pre-release is used, to enable an upgrade like 1.0.0-pre1 to 1.0.0
<wrtp> niemeyer: how *do* Debian packages deploy pre-release versions?
<niemeyer> niemeyer> wrtp: Debian packages don't care about the details of the version being packaged. It just has an algorithm to define what's new and what's old that must be respected, or the upgrade won't happen.
<wrtp> niemeyer: sure. i'm just wondering what people do in practice.
<wrtp> niemeyer: do they bump the version number one more time, e.g. 1.2.3 for prerelease, then 1.2.4 for actual release?
<wrtp> niemeyer: or do they append to the previous version e.g. 1.2.2-prerelease before bumping to 1.2.3 for actual release?
<niemeyer> wrtp: In general I believe people end up splitting the version number in two to avoid the pain of series management
<niemeyer> wrtp: Ending up with something like 1.0.0-0~pre1
<niemeyer> wrtp: So that 1.0.0-1 would upgrade it
<wrtp> niemeyer: ok, so they'd never use 1.0.0 as a version after a pre-release then?
<niemeyer> wrtp: I'm not sure about what you mean in that case
<wrtp> niemeyer: if i've pre-released as 1.0.0-0~pre1 then i can't release as 1.0.0
<hazmat>  fwereade munchkin is awesome and well suited for the kids in the family ;-)
<niemeyer> wrtp: You can, as I just explained
<niemeyer> <niemeyer> wrtp: Ending up with something like 1.0.0-0~pre1
<niemeyer> <niemeyer> wrtp: So that 1.0.0-1 would upgrade it
<wrtp> niemeyer: yeah 1.0.0-1 != 1.0.0
<niemeyer> wrtp: -1 is the release information that is available in all packages
<niemeyer> wrtp: Do dpkg -l <whatever>
<niemeyer> dpkg -l bash
<fwereade> hazmat, don't think laura would follow it quite yet, we can play carcassonne as a jigsaw though ;)
<wrtp> niemeyer: ah, i didn't know about that
<niemeyer> wrtp: The basic rule is that the release information is distro-selected.. the version itself is upstream selected
<niemeyer> wrtp: In most cases, at least
<niemeyer> wrtp: If we put 1.0.0-pre1 or -dev as upstream, it'll certainly be broken, though, since it'd be a pain
<wrtp> niemeyer: so there's an easy mapping from semantic versions to debian versions: put a '0' before the pre-release version...
<wrtp> niemeyer: alternatively always use a 0 as the first char of the pre-release version
<wrtp> and enforce it
<niemeyer> wrtp: Nah, doesn't sound necessary
<niemeyer> wrtp: Anyway, you're right.. that's mostly a red-herring..
<niemeyer> wrtp: It can be hacked to fit
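The dpkg ordering rule that makes the 1.0.0-0~pre1 -> 1.0.0-1 upgrade path work is that `~` sorts before everything, even the end of the string. A minimal Go sketch of that rule for a single version component (this ignores epochs and the upstream/revision split, so it is an illustration of the ordering idea, not a drop-in for dpkg's real comparator):

```go
package main

import "fmt"

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// order gives each non-digit byte a sort key: '~' sorts before
// everything (key 0 is reserved for "end of string"), and letters
// sort before other characters.
func order(c byte) int {
	switch {
	case c == '~':
		return -1
	case c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z':
		return int(c)
	default:
		return int(c) + 256
	}
}

// compare orders two version components the way dpkg does: alternating
// non-digit and numeric runs, with '~' sorting before even the empty
// string. It returns -1, 0 or 1.
func compare(a, b string) int {
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		// non-digit run: byte by byte; an exhausted side (or the
		// start of a digit run) keys as 0, so '~' sorts before it.
		for (i < len(a) && !isDigit(a[i])) || (j < len(b) && !isDigit(b[j])) {
			ac, bc := 0, 0
			if i < len(a) && !isDigit(a[i]) {
				ac = order(a[i])
			}
			if j < len(b) && !isDigit(b[j]) {
				bc = order(b[j])
			}
			if ac != bc {
				if ac < bc {
					return -1
				}
				return 1
			}
			i++
			j++
		}
		// numeric run: strip leading zeros; the longer digit run is
		// the larger number, equal-length runs compare lexically.
		for i < len(a) && a[i] == '0' {
			i++
		}
		for j < len(b) && b[j] == '0' {
			j++
		}
		si, sj := i, j
		for i < len(a) && isDigit(a[i]) {
			i++
		}
		for j < len(b) && isDigit(b[j]) {
			j++
		}
		switch {
		case i-si < j-sj:
			return -1
		case i-si > j-sj:
			return 1
		case a[si:i] < b[sj:j]:
			return -1
		case a[si:i] > b[sj:j]:
			return 1
		}
	}
	return 0
}

func main() {
	fmt.Println(compare("0~pre1", "0")) // -1: the pre-release sorts first
	fmt.Println(compare("0", "1"))      // -1: so 1.0.0-1 upgrades both
}
```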
<niemeyer> So.. what can we agree on?
<niemeyer> 1.0.0-pre$N is a pre-release
<niemeyer> 1.0.0-dev.$REVID[-9:] is a snapshot?
<niemeyer> Maybe pre.1 to conform?
<wrtp> what's the "[-9:]" bit?
<niemeyer> That should be -8
<wrtp> a python slice operator?
<niemeyer> wrtp: The last N bytes of the revid
<wrtp> ah
<wrtp> yeah, that sounds good
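For reference, the scheme being agreed here (1.0.0 releases, 1.0.0-pre$N pre-releases, 1.0.0-dev.REVID[-8:] snapshots) could be recognized with something like the following sketch. The regular expressions and the `kind` helper are illustrative assumptions, not juju's actual code, and the exact shape of the revid suffix (8 alphanumeric characters here) is also assumed:

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative patterns for the three kinds of version string
// discussed above.
var (
	release    = regexp.MustCompile(`^\d+\.\d+\.\d+$`)
	preRelease = regexp.MustCompile(`^\d+\.\d+\.\d+-pre\d+$`)
	devSnap    = regexp.MustCompile(`^\d+\.\d+\.\d+-dev\.[0-9A-Za-z]{8}$`)
)

// kind classifies a version string under the proposed scheme.
func kind(v string) string {
	switch {
	case release.MatchString(v):
		return "release"
	case preRelease.MatchString(v):
		return "pre-release"
	case devSnap.MatchString(v):
		return "snapshot"
	}
	return "invalid"
}

func main() {
	for _, v := range []string{"1.0.0", "1.0.0-pre2", "1.0.0-dev.0a1b2c3d", "1.0.0-dev"} {
		fmt.Println(v, "->", kind(v))
	}
}
```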
<niemeyer> We'll need a --force flag on upgrade-juju
<wrtp> ordering of releases is a red herring anyway. with prereleases we want exact specification or nothing.
<niemeyer> To compensate for the lack of ordering
<niemeyer> Actually, we don't.. we just need to enable cross-dev upgrades
<niemeyer> wrtp: Yeah.. the issue is on development snapshots.. pre-releases are properly ordered
<wrtp> niemeyer: the main issue AFAICS is that you don't want to use someone else's dev snapshot
<wrtp> niemeyer: and that, i think, is solved by simply using a different bucket for each developer
<niemeyer> wrtp: My suggestion is that we enforce the use of --dev to get from a stable release onto a dev release, and then after being into this dev release, upgrading to a dev release on the same patch level is done ignoring the ordering of hash
<niemeyer> of revid I mean
<niemeyer> wrtp: I disagree.. I *do* want to use someone else's dev snapshot
<niemeyer> wrtp: and I do want to use my own snapshot, on different branches, out of order too
<niemeyer> wrtp: There's no question.. we have to solve that problem
<niemeyer> wrtp: Development is a nightmare without that
<wrtp> niemeyer: how would you want to specify the snapshot to use?
<wrtp> niemeyer: rev id hash isn't enormously friendly :-)
<niemeyer> wrtp: Actually, given we'll be using revids, it's an issue regardless.. the revids will be unordered even within a single branch
<wrtp> yes
<wrtp> niemeyer: which is why i was presuming that you'd always be pushing the current branch's executables
<wrtp> niemeyer: and then specifying that version exactly
<niemeyer> wrtp: Yep, we're just talking about what "that version" means
<niemeyer> wrtp: and the behavior the client will enforce when working with it
<wrtp> niemeyer: maybe we should push a hash of the actual executables
<wrtp> 1.0.0-dev.HASH
<wrtp> niemeyer: then you'll push correctly if you've changed the Go version, for example
<niemeyer> wrtp: That might work as well.. but it'd just be changing where the hash is being obtained from.. the issues debated still hold
<wrtp> niemeyer: not quite, as above
<niemeyer> wrtp: It's also potentially less useful information
<wrtp> niemeyer: it does mean that if you branch and do nothing, then (assuming a deterministic compiler) you won't need to push the executables again
<niemeyer> wrtp: Having such a hash will be pretty equivalent to having a random number
<wrtp> niemeyer: but i take the point about less useful information too
<wrtp> niemeyer: yeah, that's true.
<wrtp> niemeyer: but does the revid really matter too much either?
<wrtp> niemeyer: what *really* matters IMHO is replicability
<niemeyer> wrtp: It's not ideal, but it's useful.. I can look at a running env and tell which code it is running
<wrtp> niemeyer: all the juju bits anyway, but not necessarily the bits that they rely on
<niemeyer> wrtp: Well.. :-)
<wrtp> niemeyer: so if there's a fix to the tomb package, say, then it might not push when it should
<niemeyer> wrtp: Yeah, if there's a change in the kernel we won't know as well
<wrtp> niemeyer: we can't avoid that. we can avoid this.
<niemeyer> wrtp: not really..
<niemeyer> wrtp: Unless you introduce a random or pseudo-random number
<wrtp> niemeyer: if i add some logging code to some package that's not in juju and rebuild, i want to be able to deploy that executable
<wrtp> niemeyer: ?
<wrtp> niemeyer: how is the hash of the executables not good enough?
<niemeyer> wrtp: It's random..
<SpamapS> niemeyer: pre release scheme is fine if we use ~pre
<niemeyer> SpamapS: Cool, cheers
<wrtp> niemeyer: random? it's a deterministic hash of the code, surely?
<SpamapS> niemeyer: I was more thinking about those who want to automate on top of juju. It's a very small corner case though.
<niemeyer> wrtp: No, it's surely not..
<wrtp> niemeyer: no?
<niemeyer> wrtp: Nope.. hash of executable != hash of code..
<wrtp> niemeyer: how's that?
<niemeyer> wrtp: It's also not useful for what a version is useful for..
<SpamapS> niemeyer: just in general, I've never liked the idea of tagging releases with something like that. But I would not be wholly against using that scheme given all the parameters.
<wrtp> niemeyer: this isn't a sequential version - it's a scheme that means that, hopefully, we can reliably deploy development versions of juju and have them work predictably.
<niemeyer> wrtp: It's just not the same thing.. there are zero guarantees that two builds will produce the same hash
<SpamapS> hash of executable has a lot of parameters other than the code
<niemeyer> wrtp: There's also the detail that one can't lock a drawer with the key inside
<wrtp> niemeyer: it doesn't matter if they don't, i think. it's more important that two builds with different code will produce *different* hashes
<wrtp> niemeyer: would we store the hash inside the executable?
<niemeyer> wrtp: I hope we can type "juju version" and have it..
<wrtp> niemeyer: yeah, that would be good. but i still want to be certain that the code that's executing is the code i just compiled.
<niemeyer> wrtp: You can always md5sum the executable.. that's beyond the role of a version
<wrtp> SpamapS: code+compiler is it, i think. i don't think it embeds a time stamp.
<niemeyer> wrtp: building code doesn't have to produce the same binary every time.. there are zero guarantees about that
<SpamapS> compiler+all libraries
<TheMue> So, phew, back again.
<niemeyer> wrtp: Ok, I have to step out for lunch, and have a medical appointment first thing in the afternoon.. will think about that conversation meanwhile, and be back in action later in the day
<SpamapS> oddly enough, the same problem has existed in charms for a while
<wrtp> SpamapS: yeah. i count libraries as part of the code though.
<TheMue> niemeyer: Could you please later also review my latest proposal? Thx.
<SpamapS> we added the 'deploy --upgrade' flag to deal with it. Arbitrary revision number in the charm that is bumped +1 every time you ask to upgrade it.
<niemeyer> TheMue: Yeah, unfortunately the meeting time is gone now
<niemeyer> TheMue: Will do later
<TheMue> niemeyer: Emergency, my daughter had a cut in her food.
<niemeyer> TheMue: Ouch!
<TheMue> foot
<niemeyer> TheMue: How's she?
<wrtp> TheMue: for a moment i thought she'd eaten some glass! i'm glad she'll only lose a foot :-)
<TheMue> niemeyer: She now has a bandage, but walking will be difficult the next days.
<wrtp> TheMue: (sorry to hear it, hope she's ok)
<niemeyer> Heh
<niemeyer> TheMue: Glad to hear
<niemeyer> Ok.. will be back later!
<TheMue> wrtp: Yeah, only one char, but totally different meaning. ;)
<TheMue> niemeyer: Thx, me too.
<wrtp> SpamapS: by "the same problem" you mean that the current version wasn't pushed out, even though it had actually changed?
<TheMue> fwereade: Found a solution?
<fwereade> TheMue, tbh, no, I always seemed to converge on what I was doing before
<fwereade> TheMue, here's how it looks at the moment...http://paste.ubuntu.com/942518/
<TheMue> fwereade: OK, will do a quick draft I had time to think about while waiting. ;)
<fwereade> TheMue, sweet, tyvm
<wrtp> fwereade: so you want to propagate the listener close to the Server close?
<wrtp> fwereade: is that the problem?
<fwereade> wrtp, I don't know any other way to stop the loop while it blocks on listener.Accept
<wrtp> fwereade: is there other code that's also using the tomb?
<fwereade> wrtp, if I'm closing the listener explicitly, via server.Close, then I know I'll get an error out of Accept
<fwereade> wrtp, no
<TheMue> fwereade: Somehow this way http://paste.ubuntu.com/942618/
<wrtp> fwereade: i don't think i'd use a tomb
<fwereade> TheMue, I don't get that, the goroutine is surely wrong
<fwereade> wrtp, ha, I suppose that's an option, but it does seem to give me most of what I need
<fwereade> wrtp, (to continue dropped sentence) ...and I don't want that error to count as an error
<fwereade> wrtp, but, in general, I feel that I shouldn't just swallow real errors that happen inside the loop, even if the only way to get at them ATM is to close the server
<TheMue> fwereade: What's exactly what you don't get?
<fwereade> TheMue, that you kill the listener immediately for no apparent reason
<wrtp> fwereade: something like this perhaps, with one notable omission...: http://paste.ubuntu.com/942639/
<TheMue> fwereade: Ooops, wrote go instead of defer.
<TheMue> fwereade: That's indeed wrong.
<wrtp> fwereade: i'm not sure whether it's possible to portably define errorWasBecauseOfClose
<TheMue> fwereade: I meant http://paste.ubuntu.com/942647/
<fwereade> wrtp, indeed
<fwereade> TheMue, ok, but this appears to be dropping the errors on the floor, which I feel is a bad move unless they're expected
<TheMue> fwereade: No, the error that leads to the leaving of the loop is passed to Kill in the deferred func.
<fwereade> TheMue, ahhhh
<wrtp> fwereade: i think this does the job: http://paste.ubuntu.com/942650/
<fwereade> TheMue, we do still have the "is the error (overwhelmingly likely to be) due to a deliberate Close" issue
<wrtp> fwereade: i think my paste solves that issue
<fwereade> wrtp, I don't quite follow what the select is doing
<SpamapS> wrtp: well perhaps more generally, the same problem was that we needed to maintain remote binaries efficiently
<wrtp> fwereade: you're right, it's unnecessary
<wrtp> fwereade: this should work fine: http://paste.ubuntu.com/942657/
<fwereade> wrtp, and how do I distinguish between good errors and bad errors out of Accept?
<wrtp> SpamapS: ironically, i've discovered that Go *does* include a build time stamp in its binaries.
<wrtp> fwereade: doh! what was i thinking?
<wrtp> fwereade: in general, you can't tell
<wrtp> fwereade: unless you manage to work it out from the error itself
<wrtp> fwereade: because there's a race.
<fwereade> wrtp, in general it is indeed always possible that I could issue a Close(), and the Accept could error out on its own at just the right moment
<fwereade> wrtp, but that is a possibility I am not overly concerned about
<SpamapS> wrtp: perhaps that would be a good modifier then
<fwereade> wrtp, the important thing is that we asked it to close and now it's closed (or at least broken ;p)
<fwereade> wrtp, the disturbing case is when it broke without us asking it to close
<fwereade> wrtp, and while there's no Wait method yet it seems like the sort of thing that will be useful for orderly shutdown
<wrtp> fwereade: yeah. there should be an "IsErrorOnClosedConnection(err)" function in net
<fwereade> wrtp, that would be the ideal, yeah
<fwereade> wrtp, I'm reluctant to try to write that myself though
<wrtp> fwereade: how about this: http://paste.ubuntu.com/942664/
<fwereade> wrtp, I think I like that a lot
<fwereade> wrtp, tyvm
<wrtp> SpamapS: i'm not sure that we can avoid uploading the binaries every time while still guaranteeing we're executing the right code.
<wrtp> fwereade: np
<wrtp> fwereade: easy when you think about it the right way, as usual :-)
<wrtp> fwereade: took me a while tho
<fwereade> wrtp, yeah, it's taking a while to reroute my normal thinking
<fwereade> wrtp, a month of python didn't help ;)
<wrtp> fwereade: w e   w i l l   r e w i r e   y o u r   b r a i n
<fwereade> wrtp, haha
<wrtp> fwereade: there's still a potential problem though
<wrtp> fwereade: well, i suppose it depends
<fwereade> wrtp, bah! go on :)
<wrtp> fwereade: if you *didn't* get an error, you need to wait for the error anyway, otherwise the server may still be active
<wrtp> fwereade: and i presume that ServeConn is guaranteed to finish in a fixed amount of time??
<wrtp> fwereade: the first problem is easily solved: http://paste.ubuntu.com/942676/
<fwereade> wrtp, well, it depends on what the request actually does... it shouldn't just get stuck, in theory; I'd assumed we'd want to allow inflight requests to complete
<wrtp> fwereade: well, yeah, it's a matter of choice.
<fwereade> wrtp, we may want to change that choice at some stage, sure
<wrtp> fwereade: i think maybe you shouldn't be calling ServeConn synchronously.
<fwereade> wrtp, hmm, maybe, but I got confused about allowing multiple connections at once
<fwereade> wrtp, so I thought I'd punt on that for now
<wrtp> fwereade: i don't think it helps
<wrtp> fwereade: you can have multiple RPCs in parallel anyway, theoretically.
<fwereade> wrtp, yeah, there's nothing stopping us on the backend
<wrtp> fwereade: how about something as simple as this? http://paste.ubuntu.com/942683/
<wrtp> fwereade: but perhaps we *want* to block Close until the current request terminates. hmm.
<fwereade> wrtp, I had been fretting over the multiple connections to the socket, not on locking around backend stuff
<fwereade> wrtp, I thought it seemed like a sensible thing to do; it may be that future discoveries will prove it wrong
<wrtp> fwereade: ah. i'm not sure i see why we should prevent two commands executing concurrently. one might block while the other might return some quick info.
<wrtp> fwereade: in which case, just "go s.server.ServeConn" would be fine
<fwereade> wrtp, ok, cool, I guess I have been paranoid without reason ;p
<wrtp> fwereade: it is worth considering whether you want Server.Close to close all existing client connections. that would take more code, but not hard.
<fwereade> wrtp, IMO I may want that at some stage but it's premature for now
<wrtp> fwereade: easy to fix anyway
<SpamapS> wrtp: isn't that what makefiles are for?
<SpamapS> wrtp: like, "if X changed, compile it" ?
<wrtp> SpamapS: yes, that's a good point. i guess i was hoping to avoid upload if i had the same binaries as someone else. but in a dev environment, we don't care too much. i wonder how long it takes me to upload 4MB to S3.
<wrtp> SpamapS: so we could use revid + time of day.
<wrtp> SpamapS: or mod time of executable, rather
<SpamapS> I haven't looked, what does go use to build? I didn't see autotools stuff... :)
<SpamapS> wrtp: yeah mtime would actually be accurate
<fwereade> sorry all: eod, and I can't really hang around today -- I'll be around all tomorrow, can we meet then instead?
<fwereade> I'll try to pop on again later
<wrtp> fwereade: sounds good
<wrtp> SpamapS: it uses the go tool
<wrtp> SpamapS: it looks at the source files (recursively) to determine dependencies
<wrtp> SpamapS: http://golang.org/doc/articles/go_command.html
<SpamapS> wrtp: but not all of the "code" is go
<SpamapS> you have documentation, and examples...
<wrtp> SpamapS: the documentation and examples are in the code
<wrtp> SpamapS: well... the Go docs and examples are
<SpamapS> wrtp: where are the fairies and toilets made of solid gold? ;)
<wrtp> SpamapS: for other docs, i guess we'd use make or something similar
<wrtp> SpamapS: how did you guess? :-)
<wrtp> SpamapS: sorry, i thought you meant by "what does go use to build?", "what does go use to build Go executables?"
<wrtp> SpamapS: if you meant "what does go use to build restructured-text documentation and everything else?" i guess the answer is "it doesn't" :-)
<SpamapS> I figured as much. :)
<wrtp> SpamapS: i'm quite happy it does a good job with the code tbh
<SpamapS> I'm just thinking about how to make release.
<SpamapS> releases even
<wrtp> SpamapS: for getting the juju executables into a known place without knowing the names of all of them, this could work well: GOBIN=someplace go install launchpad.net/juju/go/...
<wrtp> SpamapS: the documentation can probably be done exactly as it is now
 * wrtp is off for the evening, see y'all tomorrow
 * niemeyer is back
<niemeyer> robbiew: Is the call still running?
<niemeyer> wrtp: Still around? Just wanted to run an idea by you
<niemeyer> mthaddon: ping
<wrtp> niemeyer: i've got 15 minutes before dinner if you're still around
#juju-dev 2012-04-24
<wrtp> fwereade: morning!
<fwereade> wrtp, heyhey
<fwereade> wrtp, how are you?
<wrtp> fwereade: just got back from my bike ride; not raining this morning, yay!
<fwereade> wrtp, nice
<wrtp> fwereade: and my 82 year old ex-lorry-driver neighbour just told me that 12.04 is out on Thursday
<fwereade> wrtp, awesome :D
<wrtp> fwereade: (which i didn't actually know!)
<fwereade> wrtp, better yet -- community in action ;p
<TheMue> morning
<fwereade> heya TheMue
<TheMue> Lunchtime ...
<wrtp> fwereade: i've been thinking about SuperCommand etc
<fwereade> wrtp, cool, go on
<wrtp> fwereade: the fact that when SuperCommand parses itself again it misses the flags added by LoggingCommand seems like an indication of something wrong structurally.
<wrtp> fwereade: and i *think* i have an idea for a way to fix it
<fwereade> wrtp, cool, it's seemed likely that everyone else can see an obvious design that I've missed ;P
<wrtp> fwereade: the underlying problem is, i think, that you're trying to use LoggingCommand to add functionality to SuperCommand, but it doesn't - it just embeds it as you know.
<fwereade> wrtp, it seemed to me like a good and sane consequence of embedding over inheritance
<fwereade> wrtp, yep
<wrtp> fwereade: how about actually providing a hook for SuperCommand to do the flag parsing.
<wrtp> fwereade: because SuperCommand doesn't have any flags itself, so why not let its user add them?
 * fwereade thinks
<wrtp> fwereade: so it could have a SetInitFlagSet(func(*gnuflag.FlagSet)) method for example
<fwereade> wrtp, I think it'd also need hooks for the other methods too though
<wrtp> fwereade: or that function could be an argument to NewSuperCommand
<fwereade> wrtp, we need to do something with them in Run()
<wrtp> fwereade: which other methods?
<fwereade> wrtp, Run and potentially ParsePositional-or-whatever-it-is could each want to do something with the flags on the logging type
<wrtp> fwereade: ParsePositional sounds wrong, as it would never get any positional arguments
<fwereade> wrtp, ParsePositional on supercommand always gets the subcommand as a positional arg
<fwereade> wrtp, and all the subcmd's args as well because it doesn't intersperse
<fwereade> wrtp, (incidentally I like `func Reduce(args []string) (Command, []string, error)`)
<fwereade> wrtp, then for consistency's sake you'd want (bare) Parse to return the final command ready to Run
<fwereade> wrtp, and that would actually always be the one originally passed in, but doesn't *have* to be
<wrtp> oh the twisted heaps we create just so we can intermix global flags and command-specific flags
<wrtp> fwereade: the thing is that having Reduce return a Command is *only* useful for the supercommand case AFAICS
<fwereade> wrtp, sure, but that bothers the other commands not one bit
<wrtp> fwereade: so rather than make that interface, which is used all over the place, more complex, why don't we make SuperCommand (which is a very small amount of code currently) slightly more complex instead?
<fwereade> wrtp, and it has a nice fit with the actual problem it's solving -- it is intuitively sensible that the process of parsing a given command be identical whether or not it's actually a subcommand
<fwereade> wrtp, it's used a certain amount; it amounts to writing ", nil" in a very few places
<wrtp> fwereade: i'm looking at the code that was deleted from SuperCommand in this CL and thinking: why can't the Logging piece implement just that code
<wrtp> fwereade: if you have that interface, it implies that a subcommand can itself return a subcommand
<fwereade> wrtp, IMO this is less of a cognitive load than adding two separate hooks to SuperCommand for InitFlagSet and Run
<fwereade> wrtp, well, actually, is there any reason it shouldn't?
<wrtp> fwereade: well, your current code wouldn't work - you'd need a loop
<fwereade> wrtp, like I suggest in the discussion already?
<wrtp> fwereade: oh, i didn't see that!
<fwereade> wrtp, ok it's a rubbish loop as I discovered when checking against reality but the core idea is there
<wrtp> fwereade: ok, here's another idea: make NewSuperCommand take a Command as an argument
<wrtp> fwereade: it will call Run but always with zero arguments
<fwereade> wrtp, hmm, at least it concatenates all the hooks into one package ;p
<fwereade> wrtp, I'm not sure that's intrinsically *better* than the loop idea... the fact that the loop works however many subcommands you have implies to me that it's well-suited to the problem
<wrtp> fwereade: then you'd call NewSuperCommand(name, doc, &LoggingSetupCommand{})
<wrtp> fwereade: to me it seems like overengineering. that's not the problem we're trying to solve.
<fwereade> wrtp, yeah, but the set of Commands that can usefully go in there is limited
<fwereade> wrtp, it's not the same thing as a Command, its fundamental nature is a set of hooks rather than a command
<wrtp> fwereade: true. alternative: SuperCommand could define its own interface type to be used as an argument.
<fwereade> wrtp, this is still fundamentally the hooks idea
<wrtp> fwereade: indeed. but i think that's fundamentally what's going on. you want to "hook" the global flags into any arbitrary subcommand
<fwereade> wrtp, and given that we have just one way in which we need to tweak SuperCommand
<fwereade> wrtp, we end up reducing the code to precisely the original implementation
<fwereade> wrtp, because SuperCommand checks some status and then does something in response in each of InitFlagSet and Run
<fwereade> wrtp, and it's just that it's an implicit do-we-have-a-hook check instead of an explicit self.SetsLog check
<fwereade> wrtp, with additional tomfoolery to set it all up
<fwereade> wrtp, I don't feel that's a win
<wrtp> fwereade: i'm not sure
<fwereade> wrtp, it may be more extensible in the future, but so is mine; they're both interestingly extensible in different ways and I don't think we currently have reason to prefer one over the other on that front
<wrtp> fwereade: i think the hooks possibility makes it easy to add more global flags that aren't related to logging.
<fwereade> wrtp, sure, that would be an awesome change at precisely the moment we actually have such a need
<wrtp> fwereade: and i think that the "return a command" thing we will never ever use apart from in supercommand. it seems to me like we're spreading supercommand mess around.
<wrtp> fwereade: for the record, this is the kind of thing i'm thinking of: http://paste.ubuntu.com/943945/
<wrtp> fwereade: then all the ParsePositional stuff can remain exactly as it is now.
<wrtp> fwereade: and the logging set up code is almost identical to now
<fwereade> wrtp, ok, so that's still an extra type and extra setup work to do everything the original implementation did, in the same way, which I was asked to change
<wrtp> fwereade: that's true. but at least the concerns are now separated. SuperCommand now has nothing to do with logging. and the logic is quite easy to follow, i think. (but then i would, wouldn't i? :-))
<fwereade> wrtp, how does SuperCommand as separated out have anything to do with logging in the first place?
<wrtp> fwereade: it doesn't, but it did, didn't it?
<wrtp> sorry, i'm not sure what the question is
<wrtp> fwereade: SuperCommand as you're proposing has nothing to do with logging, no.
<wrtp> fwereade: but i'm not keen on the way that it makes all commands into recursive Command producers. i think that adds to cognitive load, when we can isolate it into SuperCommand.
<fwereade> wrtp, I think that the right doc for Reduce could make it all very clear
<fwereade> wrtp, in practice, returning 2 nils in a few places is not an especially painful cost
<wrtp> fwereade: but there's always the question: why are *all* these commands returning Command when *none* of them apart from SuperCommand uses that functionality?
<wrtp> fwereade: or will ever use that functionality...
<fwereade> wrtp, can we keep the unknowable future off the table?
<wrtp> fwereade: when i first read the SuperCommand code, i had to spend a little while, as i didn't find it easy to see what was going on. I now understand it (and I think it's nice for the problem being solved), but i really think that the whole recursive command thing belongs *there*, not in every command that's being implemented. the Command interface should map nicely to the command interface we want to implement IMHO.
 * fwereade is still thinking
 * wrtp is trying to work out if he's being unreasonable.
 * wrtp hopes not :-)
<fwereade> wrtp, I understand your concerns -- I think -- but they don't carry the same weight in my mind
<fwereade> rogpeppe, ^^
<rogpeppe> :-)
<fwereade> rogpeppe, consider that (bare) Parse itself is only ever called with a SuperCommand in my proposal
<fwereade> rogpeppe, there is, as there always was, a certain commingling of functionality
<rogpeppe> fwereade: which Parse are you referring to there?
<fwereade> rogpeppe, the precise balance at a given point depends on the needs of the current codebase
<fwereade> rogpeppe, cmd.Parse
<fwereade> rogpeppe, it will surely become immediately apparent if the future needs of the codebase require that we rearrange the responsibilities of the cmd package
<rogpeppe> [13:01] <fwereade> rogpeppe, consider that (bare) Parse itself is only ever called with a SuperCommand in my proposal
<rogpeppe> isn't it called with a LoggingSuperCommand?
<fwereade> rogpeppe, ha, ok, something that flawlessly impersonates a SuperCommand, if you must
<fwereade> rogpeppe, but that definition only lends weight to my assertion that Parse, which is called with two distinct things that both expect this behaviour, is correct to accommodate their shared functionality
<rogpeppe> fwereade: not quite flawlessly... (given the recursive calling problem) :-)
 * fwereade concedes somewhat grudgingly
<fwereade> rogpeppe, the trouble is still that, considering proposals [rog] and [original], [rog] does what [original] did, and adds a bunch of other stuff that's only used in one case; [original] failed to find favour, and I don't think that an implementation that does just the same thing, but with more boilerplate, is likely to fly
<niemeyer> Hello!
<fwereade> niemeyer, heyhey
<rogpeppe> niemeyer: yo!
<niemeyer> fwereade: While you're there, I don't understand why we need recursion at all :-)
<rogpeppe> +1 !
<fwereade> niemeyer, rogpeppe: it is now a loop ;p
<niemeyer> fwereade: Not even that!
<rogpeppe> fwereade: recursion via Y-combinator :-)
<niemeyer> fwereade: more, err := scmd.Consume(args)
<niemeyer> fwereade: ...
<fwereade> niemeyer: no, we need some sort of repeated parsing whatever we do
<niemeyer> fwereade: subcmd := scmd.subcmds[name]
<niemeyer> fwereade: subcmd.Consume(args)
<rogpeppe> niemeyer: this is my current preferred proposal: http://paste.ubuntu.com/943945/
<niemeyer> fwereade: This is a linear call.. supercommand calls subcommand..
<TheMue> niemeyer: moin
 * niemeyer looks
<niemeyer> TheMue: Heya
<fwereade> niemeyer, wait, so supercommand Consume calls subcmd Consume?
<niemeyer> rogpeppe: That works too
<niemeyer> fwereade: Yeah, that sounds fine in principle.. am I missing something?
<fwereade> niemeyer, well, subcommands can have flags, and cmd.Parse is already set up to deal with flags before passing the positional args on to Reduce/Consume/whatever-it-currently-is
<fwereade> niemeyer, duplicating that in SuperCommand would seem to be kinda rubbish
<rogpeppe> i like the current names. InitFlagSet, ParsePositional and Run work well for me.
<fwereade> niemeyer, rogpeppe: we have always had recursive Parse calls; they were disguised by being indirect, via ParsePositional, but it was always how it worked
<niemeyer> fwereade: Sorry, I don't understand.. probably because I don't have the picture so clear in my head as you do
<rogpeppe> as i said earlier, that's an interface that works well for commands, even if it isn't quite sufficient for SuperCommand.
<niemeyer> fwereade: I know.. and as I mentioned in the review, I've always disliked that.. it's magical, and unnecessary, iMO
<niemeyer> fwereade: The problem seems so simple.. I don't get why we need all this back and forth
<niemeyer> fwereade: SuperCommand wraps a number of subcommands.. it gets called, handles what it must, and delegates the rest
<niemeyer> fwereade: No recursion, no looping
<rogpeppe> niemeyer: the fundamental difficulty is the mixing of global and subcommand flags AFAICS
<fwereade> niemeyer, how many times do we call Parse on a FlagSet in the course of this?
<fwereade> niemeyer, twice! there are two parsing steps needed
<rogpeppe> and we call InitFlagSet three times, right?
<niemeyer> fwereade: Yep.. twice, exactly.. not more, not less
<fwereade> rogpeppe, we may create a fresh FlagSet for usage output
<niemeyer> fwereade: No need for recursion or looping in that case
<rogpeppe> twice on the supercommand and once on the subcommand
<rogpeppe> fwereade: ah, makes sense
<fwereade> niemeyer, well, once *or* twice really, given that we expect Parse to work on anything that implements Command
<niemeyer> fwereade: Parse should be completely unaware about any of that
<rogpeppe> fwereade: is there a particular reason we export cmd.Parse BTW? does anything outside of the cmd package actually use it?
<niemeyer> fwereade: as should the interface of Command
<niemeyer> fwereade: The only particularity is that a supercommand needs to call the subcommand.. that's all really
<niemeyer> fwereade: The mechanisms out of the supercommand should not be aware of that
<fwereade> niemeyer, and mingle flaget flags
<rogpeppe> i'm all for mingling flagets
<niemeyer> fwereade: ?
<fwereade> niemeyer, the client of Parse doesn't see a difference; AFAICT the point of difficulty is in the interface for Command itself, right?
<fwereade> niemeyer, sorry, misunderstood
<niemeyer> fwereade: Maybe I should just try to code a sketch
<rogpeppe> fwereade: that's true for me, certainly. i think the interface for Command works well, and we're mangling it solely because of SuperCommand issues.
<fwereade> niemeyer, I was saying the SC needs both to select subcommands and reregister its own flags
<niemeyer> fwereade: I'm unhappy about this:
<niemeyer> func (c *SuperCommand) InitFlagSet(f *gnuflag.FlagSet) {
<niemeyer>         if c.subcmd != nil {
<niemeyer>                 c.subcmd.InitFlagSet(f)
<niemeyer> fwereade: This is all magical..
<niemeyer> fwereade: It's called once, somewhere, and then twice, in a full moon
<niemeyer> fwereade: Maybe it's set, or not.. and behaves in one way or the other..
<niemeyer> fwereade: and most methods go like that..
<niemeyer> fwereade:
<niemeyer> func (c *SuperCommand) Info() *Info {
<niemeyer>         var info *Info
<niemeyer>         if c.subcmd != nil {
<fwereade> niemeyer, as I recall we discussed this in some detail and it seemed to be the right way to deal with the legacy interspersed-or-not flag ickiness
<niemeyer> fwereade: My reaction to it, IIRC, is that it was a bit on the clever side and practical, even if it was hard to understand
<niemeyer> fwereade: Things are now being extended into other directions that extend the cleverness through the interface of Command, Parse, documenting recursiveness in the interface itself, etc
<niemeyer> fwereade: I'm getting lost
<niemeyer> fwereade: I totally admit it's probably my small brain, but I'd love if we could make this straightforward somehow
<niemeyer> fwereade: It sounds simple just from the description of what we need
<fwereade> niemeyer: the heart of it is in the requirement that we handle both "juju -v bootstrap" and "juju bootstrap -v"
<rogpeppe> [13:24] <niemeyer> fwereade: Things are now being extended into other directions that extend the cleverness through the interface of Command, Parse, documenting recursiveness in the interface itself, etc
<rogpeppe> +1
<niemeyer> fwereade: Ok, but that *sounds* (and I may be missing details for sure) simple..
<niemeyer> fwereade: The supercommand is wrapping the subcommands.. it is in control of what happens
<niemeyer> fwereade: It knows there are only two possible layers, and it has the second layer entirely at its will
<fwereade> niemeyer: to give you a bit of context on my thinking, would you read the comments on https://codereview.appspot.com/6107048/ and https://codereview.appspot.com/6100050/ please?
<niemeyer> fwereade: I had read the first one before we started to talk.. checking the second
<niemeyer> fwereade: Done
<fwereade> niemeyer, is anything I'm saying sounding any saner?
<niemeyer> fwereade: Everything you say is sane.. I just believe the problem may be solved in a simpler way
<niemeyer> fwereade: I'll give it a quick shot
<fwereade> niemeyer, before you do
<niemeyer> fwereade: A hack, arguably, but hopefully it'll be helpful
<fwereade> niemeyer, you seemed to like rog's proposal
<niemeyer> fwereade: I don't think it's necessary either
<fwereade> niemeyer, ok then :)
<niemeyer> fwereade: Maybe it will be, though.. once we want to use supercommand with other globals
<niemeyer> fwereade: But this is an aside.. the key point I'm making around non-recursiveness and non-looping is independent from it
<fwereade> niemeyer, I feel that's a touch speculative
<niemeyer> fwereade: Sure, I've been speculating since we first talked about this, and I'm now trying to stop speculating to show some code :)
<rogpeppe> fwereade, niemeyer: here's a version of supercommand with no recursive magic: http://paste.ubuntu.com/944043/
<TheMue> niemeyer: So, I now branched my last branch at the point before I modified the NeedsUpgrade behavior, but also before doing the changes from the review. Do you want a proposal now so you can first check that this branch is in a proper state, or shall I do the review changes first and then propose again?
<rogpeppe> (branched from a slightly earlier version, so using ParsePositional, as i believe Consume was added only for the recursive thing)
<fwereade> rogpeppe, how do you switch off logging?
<niemeyer> TheMue: Feel free to do the review changes
<rogpeppe> fwereade: the logging flags thing is orthogonal to this.
<TheMue> niemeyer: ok
<rogpeppe> fwereade: i still think my GlobalFlags proposal would work well
<niemeyer> rogpeppe: Cheers.. I'm looking at this as well, as I mentioned
<fwereade> rogpeppe, and, yes; you eliminate recursion by having a function that parses flags call a different function that parses flags in exactly the same way
<rogpeppe> niemeyer: sorry, i'd already been playing with it, so thought i'd push out something to think about.
<rogpeppe> fwereade: not exactly - it adds extra flags
<niemeyer> rogpeppe: No worries.. I'm just trying to focus on it for a moment to understand the problem better
<rogpeppe> fwereade: that's *the* crucial reason why SuperCommand exists, AFAICS
<fwereade> rogpeppe, that it gets in the exact same way that Parse does...
<fwereade> rogpeppe, there is repetition inherent in this problem
<fwereade> rogpeppe, we can loop or we can recurse or we can duplicate code and kid ourselves that it's simpler because we've managed to avoid a loop
<rogpeppe> fwereade: the entire reason for the recursion was so that we could reuse the 6 (!) lines of Parse?
<fwereade> rogpeppe, well, there are many reasons all of which interrelate in one way or another, but yes, that is one of them
<rogpeppe> fwereade: when nothing actually calls Parse itself other than Main, AFAICS. we could inline it into Main and no one would notice, i think.
<rogpeppe> fwereade: i think the version of SuperCommand.ParsePositional i posted shows very directly the logic of SuperCommand.
<fwereade> rogpeppe, yes it does, it looks to me very very much like the original version, except it wantonly duplicates the logic in Parse rather than just calling it; and it doesn't allow global args to come after the command, which was a requirement
<rogpeppe> fwereade: it *does* allow global args to come after the command
<rogpeppe> fwereade: it passes all tests
<fwereade> rogpeppe, I don't see where you register the global flags for the second flagset Parse
<rogpeppe> fwereade: newFlagSet
<rogpeppe> fwereade: actually i was confused by that at first.
<rogpeppe> fwereade: i'm not sure newFlagSet should take a Command, but that's another issue
<fwereade> rogpeppe, well, the way you're using it makes it wrong; the way it was used before was fine ;)
<rogpeppe> fwereade: ok, perhaps
<fwereade> rogpeppe, still, I just do not get how repetition is not part of the problem
<rogpeppe> fwereade: repetition of what?
<fwereade> rogpeppe, and why I am being criticised for using elementary constructs which express repetition
<fwereade> rogpeppe, repetition of parsing
<fwereade> rogpeppe, one Command parses part of the command line and delegates to another for the rest
<fwereade> rogpeppe, this is two parses; one would also be perfectly legitimate, because Parse should accept a Command
<rogpeppe> fwereade: the repetition is performed once only, but you're making a very general mechanism to replace something that uses exactly the same lines of code
<niemeyer> fwereade: You're not being criticised.. I'm trying to simplify the logic we (the team) have in place.. that's impersonal
<rogpeppe> niemeyer: +1
<fwereade> niemeyer, rogpeppe: sorry, inappropriate language, I sometimes come to identify with my code a touch too much, ty for reminder
<niemeyer> fwereade: Your code was largely reviewed and agreed with
<rogpeppe> fwereade: this is all the diff it takes to make your code non-recursive, and everything else can carry on using exactly the same interface as before: http://paste.ubuntu.com/944063/
<rogpeppe> fwereade: i don't see how that's not a win
<fwereade> rogpeppe, ok, the internal implementation should not matter; do we agree that the real pain point is in returning things other than errors from ParsePositional
<fwereade> ?
<rogpeppe> fwereade: what do we want to return from ParsePositional?
<fwereade> rogpeppe, well, an error, right? that's the interface you've been fighting to preserve
<rogpeppe> fwereade: "the real pain point is in returning things other than errors"
<rogpeppe> fwereade: so what else do we want to return?
<fwereade> rogpeppe: I'm starting to think: a command and list of unconsumed args (which replaces ourself entirely if used and is in practice nil, nil for most Commands, although there are other plausible uses of such a mechanism)
<fwereade> rogpeppe, plus an error
<rogpeppe> fwereade: but why?
<fwereade> rogpeppe, because it is a perfect fit for the problem space and reduces duplication
<rogpeppe> fwereade: we're talking 6 lines of duplication here!
<rogpeppe> fwereade: and you're proposing to change the interface in many places just because of that?
<fwereade> rogpeppe, and you're saying that 6 lines of duplication is a *better* thing than a loop to express the same repetition?
<rogpeppe> fwereade: yes
<hazmat> g'morning
 * hazmat is glad to finally be home
<fwereade> rogpeppe, in and of itself that makes little sense to me; with the wider context of "because it allows us to have a cleaner signature for ParsePositional" I can understand it
<fwereade> heya hazmat
<rogpeppe> fwereade: {fmt.Println("a"); fmt.Println("b")} can be better than for _, s := range []string{"a", "b"} {fmt.Println(s)}
<rogpeppe> hazmat: hiya
<rogpeppe> fwereade: and in this case, it's not strict repetition - the body of the loop is slightly different each time through.
<niemeyer> fwereade: FWIW, I'm still on it..
<rogpeppe> fwereade: if it *was* a simple loop, i wouldn't mind much. my problem is with the fact that to avoid this repetition you're proposing complicating the Command interface that's implemented everywhere.
<fwereade> rogpeppe, where by "everywhere" you mean "in two places"?
<fwereade> rogpeppe, hm, sorry, suddenly unsure whether you're looking at the version with FlagCommand
<fwereade> rogpeppe, ok, 3 places, FlagCommand implements it ;)
<rogpeppe> fwereade: ah, i'm not.
<rogpeppe> fwereade: but still.
<rogpeppe> fwereade: it's implemented (even if only with embedding) by every subcommand
<fwereade> rogpeppe, I guess this may be one of the sources of disconnect; my position is that the practical cost is utterly minimal
<rogpeppe> fwereade: the Command interface is something that people will see when maintaining the code. it's a very good thing that it's as simple as possible.
<rogpeppe> fwereade: both myself and niemeyer found the recursive nature of supercommand difficult to understand.
<fwereade> rogpeppe, I thought one of the benefits of Reduce was that it actually made everything cleaner
<rogpeppe> fwereade: sometimes 6 lines of extra code are worth one less level of abstraction IMHO
<rogpeppe> fwereade: i think everything's cleaner if *all* the supercommand magic is strictly within supercommand
<rogpeppe> fwereade: in fact, the patch above isolates all the magic into SuperCommand.ParsePositional - a few lines of sequential code.
<rogpeppe> fwereade: FWIW i don't think i've seen a CL with FlagCommand in it
<fwereade> rogpeppe, it's in https://codereview.appspot.com/6107048/
<fwereade> rogpeppe, niemeyer proposes something better but I haven't reproposed with that in
<rogpeppe> fwereade: even with ZeroArgsConsumer, i still see 5 implementations of Consume
<fwereade> rogpeppe, you mean including places where ZeroArgsConsumer is embedded?
<rogpeppe> fwereade: no
<rogpeppe> e.g. func (c *agentConf) Consume(args []string) ([]string, error) {
<rogpeppe> fwereade: every time someone sees that method, they have to wonder: does it actually return something other than an empty arg list?
<niemeyer> Strawman on the way..
<rogpeppe> fwereade: and the only reason for it is SuperCommand
 * rogpeppe likes the nutty, chewy taste of straw
<fwereade> rogpeppe, ah, typo, I was mid-edit :/ ...sorry :)
<niemeyer> https://codereview.appspot.com/6118044
<niemeyer> Tests are broken, but it actually builds at least
<rogpeppe> niemeyer: i like the look of that
<rogpeppe> niemeyer: particularly the way lines of code melt away :-)
<rogpeppe> niemeyer: not sure about calling FlagSet.Parse twice on the same flag set, but i guess it might be ok.
<niemeyer> rogpeppe: We can always make it ok :)
<niemeyer> rogpeppe: But I suspect it's already fine
<rogpeppe> niemeyer: yeah, or just initialise a new flagset
<niemeyer> rogpeppe: I'd rather not, if possible.. I really like how we have a single place where this is happening
<rogpeppe> niemeyer: that's true. it removes a lot of head-scratching.
<fwereade> niemeyer, that's *very* nice indeed
<fwereade> niemeyer, almost horrifyingly so
<niemeyer> fwereade: Cheers
<fwereade> niemeyer, I'm still trying to think through how it'll work with the responsibility split but I think it goes in a nice direction there too
<niemeyer> fwereade: Yeah, I suspect it should be fine
<fwereade> niemeyer, ok, I'll take a short break and get onto it
<rogpeppe> niemeyer, fwereade, TheMue: are we still planning to have a meeting this afternoon?
<niemeyer> fwereade: ctx takes care of most of the issue there already, I believe
<fwereade> oh, yeah, good plan
<niemeyer> rogpeppe: +1
<niemeyer> I'm happy to have it *now*
<rogpeppe> i've got a 1 to 1 with robbie in 5 minutes, but i'm good after that
<niemeyer> I have only about half an hour, but should be useful at least
<TheMue> rogpeppe: In half an hour I've got my 1:1 with Robbie.
<niemeyer> rogpeppe: Aw, ok
<niemeyer> rogpeppe: After lunch then
<niemeyer> in 1h30?
<rogpeppe> niemeyer: i'm sure i could defer with robbbiew
<niemeyer> rogpeppe: I'm short..
<niemeyer> rogpeppe: 1h30 would work best there too
<rogpeppe> niemeyer: sounds good to me
<niemeyer> 16UTC then
<rogpeppe> niemeyer: yup.
<TheMue> niemeyer: OK
<niemeyer> fwereade, TheMue: Sounds good?
<fwereade> niemeyer, 1h30 from now sounds good
<niemeyer> Super
<robbiew> rogpeppe: we 1:1ing? or did you defer me :/
<robbiew> :P
<rogpeppe> robbiew: we are
<rogpeppe> robbiew: G+?
<robbiew> rogpeppe: aye
<robbiew> https://talkgadget.google.com/hangouts/_/extras/canonical.com/the-hulk-hideout
<robbiew> rogpeppe: ^
<TheMue> niemeyer: Please forget latest propose. Somehow the merge of parent changes brought unwanted stuff in. Seems I took the wrong revision.
<niemeyer> TheMue: np
<TheMue> niemeyer: I've got to admit that I'm a bit puzzled while trying to understand which revision I should take. I opened it in Bazaar Explorer to get a better impression. But I didn't think that the merge needed to propose the stuff would bring my unwanted code back. *sigh*
 * niemeyer => lunch
<TheMue> niemeyer: So, now I've rolled back to an earlier revision, made the needed changes and would now like to propose it. How can this be done without the problem of a "diverged-branch"?
<TheMue> niemeyer: Oh, lunch, enjoy. ;)
<rogpeppe> TheMue: you need to have merged with trunk
<rogpeppe> TheMue: if bzr diff looks right, then the codereview should look right, assuming you've got the prereq correct
<rogpeppe> TheMue: also, i didn't know this for a while: you can't change the prereq of an existing merge proposal - you have to delete and repropose if you want to do that
<TheMue> rogpeppe: aha, have to take a look
<TheMue> rogpeppe: How do I do it?
<rogpeppe> TheMue: there's a "Delete this proposal" link in the top right of the merge proposal page
<TheMue> rogpeppe: You mean "Delete patch set"?
<TheMue> rogpeppe: Otherwise I'm on a different page. *lol*
<rogpeppe> TheMue: this page: https://code.launchpad.net/~themue/juju/go-state-unit-resolved-watcher/+merge/102497
<rogpeppe> TheMue: "Delete proposal to merge"
<TheMue> rogpeppe: Ah, thx, I've been on the review page.
<rogpeppe> TheMue: the launchpad pages are the important thing.
<rogpeppe> TheMue: you don't want to delete the codereview page.
<TheMue> rogpeppe: I just hoped both play together.
<rogpeppe> TheMue: codereview.com knows nothing about launchpad.net or vice versa. the only link is lbox.
<TheMue> rogpeppe: So we need a "lbox sync" supercommand. ;)
<rogpeppe> TheMue: there's no change you can make in codereview that we'd want to sync to launchpad, other than comments, which get transferred anyway.
<rogpeppe> TheMue: so lbox is already like lbox sync, just in one direction
<TheMue> rogpeppe: OK, to make it more clear. <JOKE> So we need a … </JOKE>.
<rogpeppe> TheMue: oh
<rogpeppe> TheMue: ha ha :-)
<TheMue> rogpeppe: I've got to admit it has been a bad one. ;)
<TheMue> rogpeppe: Just hoped to get a link into the supercommand discussion today.
<rogpeppe> TheMue: what kind of link?
<TheMue> rogpeppe: Hmm, maybe wrong wording. How would you say it if you want to connect a sentence now to a discussion that was held some time ago?
<TheMue> rogpeppe: A connection?
<rogpeppe> TheMue: you'd probably say something like "just hoped to make a reference to the supercommand discussion today"
<rogpeppe> "reference" is the word
<TheMue> rogpeppe: Ah, ok, thx, learned again. Sounds a bit formal from a German perspective.
<TheMue> TheMue: Here we also have the word "Referenz", but it is used in a different context.
<rogpeppe> TheMue: alternative phrasing "i was referring to the supercommand discussion" or "i was trying to refer to ..."
<TheMue> rogpeppe: Sounds like the same stem. How would you say it if you're in the context of making a joke? Would you also "refer to"?
 * TheMue likes English lessons here in the channel, as long as I don't have to pay for them.
<rogpeppe> TheMue: probably not in a joke itself. i'd just use some phrasing that alludes to the original thing.
<TheMue> rogpeppe: I tried it with the "lbox sync" supercommand, but yes, that hasn't been funny.
<rogpeppe> TheMue: i can't say i laughed out loud :-)
<TheMue> rogpeppe: I'll try to make a better one next time.
<rogpeppe> TheMue: i wait with eager anticipation
<TheMue> rogpeppe: I hope I won't disappoint you.
 * niemeyer is back
<rogpeppe> niemeyer, fwereade, TheMue: meeting?
<TheMue> rogpeppe: I still have the "diverged-branches" problem.
<rogpeppe> TheMue: push --overwrite
<niemeyer> yep!
<niemeyer> rogpeppe: Wanna do the honors?
<rogpeppe> niemeyer: k
<rogpeppe> niemeyer, TheMue, fwereade: invites out
 * hazmat lunches
<niemeyer> Lost my connection.. trying again
<niemeyer> rogpeppe: WTF.. is everybody else in?
<rogpeppe> niemeyer: yeah
<rogpeppe> niemeyer: we're still there
<niemeyer> rogpeppe: I'm retrying but it shows me as being online in my own list..
<rogpeppe> niemeyer: you're there, but in cartoon form only
<niemeyer> rogpeppe: I can hear you typing
<rogpeppe> niemeyer: lol
<niemeyer> Come on G+!
<niemeyer> I can hear everybody
<niemeyer> I'm disconnecting and retrying
<rogpeppe> can you use the text chat box?
<niemeyer> rogpeppe: I could..  just disconnected.. will retry
<niemeyer> Killing Chrome and restarting
<niemeyer> Shait
<rogpeppe> niemeyer: dammit
<niemeyer> This software wasn't so crackful before
<rogpeppe> niemeyer: it's working fine for us :-)
<niemeyer> rogpeppe: That's the kind of trick that always works.. if you restrict what "us" means enough, then everything is wonderful! ;-)
<rogpeppe> niemeyer: us == ∅ :-)
<rogpeppe> niemeyer: maybe if you invited us, the server might be located in a better place...
<niemeyer> Ok, let's try that
<niemeyer> G+ got frozen completely and Chrome had to kill the page
<niemeyer> It's that bad..
<niemeyer> "The connection to talkgadget.google.com was interrupted."
<rogpeppe> niemeyer: hmm. i wonder if it's a networking issue at all. maybe you could try connecting using a Windows machine :-) :-)
<TheMue> rogpeppe: Evil
<niemeyer> rogpeppe: Oh yeah, I'm sure it's the kernel!
<niemeyer> robbiew: ping
<robbiew> pong
<fwereade> niemeyer, I'm not seeing a new invite; should I be?
<niemeyer> robbiew: Can we borrow your conf call number for a moment?
<robbiew> one sec
<niemeyer> I can't find my pin
<niemeyer> fwereade: No.. it's seriously broken
<niemeyer> fwereade: I get this on the tab in the G+ page: The connection to talkgadget.google.com was interrupted.
<fwereade> niemeyer, ouch, didn't realise that was the main G+ page
<niemeyer> fwereade: <robbiew> niemeyer: 107 568 4916
<niemeyer> This is the conf code..
<niemeyer> Phone number is at /ConferenceCalls on the internal wiki
<rogpeppe> fwereade, TheMue: you might want to add an additional communication channel...
<fwereade> rogpeppe, sorry, what should I drop out of?
<fwereade> niemeyer, rogpeppe, TheMue: I'm afraid I should be away soon in the interest of domestic harmony...
<fwereade> niemeyer, rogpeppe, TheMue: and I'm afraid I certainly can't do justice to the details of this conversation
<fwereade> happy evenings all
<TheMue> fwereade: bye
<niemeyer> fwereade: Sounds good, we certainly don't want to disturb the domestic harmony there ;-)
<niemeyer> TheMue: I was just looking at GoPort.. I'm not sure about how useful that is ATM.. maybe you have different plans for it, but right now it feels like a snapshot of the +activereviews link in Launchpad
<niemeyer> TheMue: What I had in mind was closer to a "bird's view" on the different areas
<niemeyer> TheMue: Something vague enough to be up-to-date for longer
<niemeyer> TheMue: Also, re. https://code.launchpad.net/~themue/juju/go/+merge/103315, do you want me to review that or was it just a test?
<niemeyer> TheMue: Wondering mostly due to the branch name
<andrewsmedina> niemeyer, rogpeppe please take a look https://codereview.appspot.com/6099051/ :-p
<niemeyer> andrewsmedina: Woohay
<TheMue> niemeyer: The GoPort page is currently only a start. It shall also contain all open features. It's like a Kanban board, only in a wiki.
<TheMue> niemeyer: Tool-based it would be simpler.
<TheMue> niemeyer: The last branch is for review. The name has gone so "bad" due to an error while fighting with bzr and lbox.
<niemeyer> TheMue: Just saying.. I'm not going to be joining you on that style of roadmap
<niemeyer> TheMue: This is what +activereviews provides
<niemeyer> TheMue: and changes every day
<niemeyer> TheMue: It won't help
<TheMue> niemeyer: Where in activereviews are all outstanding features?
<niemeyer> TheMue: Where in GoPort are all outstanding features?
<niemeyer> TheMue: What's there now is a list of branches
<niemeyer> TheMue: that's what I'm talking about
<TheMue> niemeyer: I already told you that I've just started to get a feeling for how to put it into the wiki. When the form is right I would talk to you and all others to gather the outstanding features.
<niemeyer> TheMue: Sure.. you mentioned that page in the wiki today, I've looked, and am providing feedback on what's there.. no biggie
<TheMue> niemeyer: I'm not yet happy with the feature naming either. And I've got to try out what the best granularity is.
<niemeyer> TheMue: As I said, I'm also happy to try to produce something you're happy with
<TheMue> niemeyer: So any good idea by you is welcome.
<niemeyer> TheMue: I'd have to sit down and do it.. which is fine by me, but it won't look like what's there now
<niemeyer> TheMue: I'd probably take the big areas from the Python code, and say what's missing at a very high level
<TheMue> niemeyer: I'm also not absolutely happy. It's not simple to put it into a wiki form. In the past we used a mix of tools and physical kanban boards.
<TheMue> niemeyer: Yes, I already started scanning it. Today it's indeed too finely granular, and the orientation around branches isn't right.
<niemeyer> TheMue: I've used a number of different techniques too.. we have tools that produce something resembling a Kanban from the reviews actually
<TheMue> niemeyer: Nice
<TheMue> niemeyer: And you feed it by creating LP issues for open points?
<niemeyer> TheMue: E.g. http://people.canonical.com/~niemeyer/florence.html
<TheMue> niemeyer: Great, want. *smile*
<niemeyer> TheMue: We also had feature boards in the wiki with Landscape
<niemeyer> TheMue: But I don't miss either of these right now
<niemeyer> TheMue: I'm happy to look at the problems we're trying to solve and come up with another solution that fits for what we're doing today
<niemeyer> TheMue: UDS will be a good time for the two of us to sit together and come up with a plan
<TheMue> niemeyer: Yes, looking forward. Maybe we really don't need this way of control, only a list of milestones ensuring that we don't miss finishing by 12.10.
<niemeyer> TheMue: Yeah.. and some big areas that we can keep in mind and know how much progress we have there
<niemeyer> TheMue: E.g. "provisioning agent", "unit agent", "machine agent"
<TheMue> niemeyer: Sounds good. I really don't want to create big, ugly docs just for fun.
<niemeyer> TheMue: "state", "ec2 provider", ...
<TheMue> niemeyer: As you have seen I've also put some first open questions/topics at the bottom. At least open for me.
<niemeyer> "local provider", /me looks at andrewsmedina
<TheMue> niemeyer: Maybe some of them are only interesting for > 12.10, but some maybe earlier
<niemeyer> "command line API"
<niemeyer> TheMue: I'd not mix that up
<niemeyer> TheMue: Or we'll risk losing focus
<TheMue> niemeyer: Oh, good that this stuff is logged. I'll add it tomorrow. Sounds even better than todays table.
<niemeyer> TheMue: We need sharp attention on the port.. everything else is secondary
<niemeyer> TheMue: I'd put each of these as headers of a section
<TheMue> niemeyer: Eh, I only wrote them down to make clear that those parts which are important for 12.10 are handled early enough. Others may be deferred.
<TheMue> niemeyer: If you signal "This point A is covered there, and that point B is not interesting for the 12.10 Go port." then it's ok.
<TheMue> niemeyer: Nothing more.
<andrewsmedina> niemeyer: what?
<TheMue> Off for today.
 * TheMue waves
<niemeyer> andrewsmedina: nm
 * niemeyer breaks for a bit
<andrewsmedina> niemeyer: you saw my review?
<niemeyer> andrewsmedina: Not yet, but we'll get to it for sure
<hazmat> niemeyer, are you going to hit mongosv next week in sf?
<hazmat> bcsaller, i'm realizing that juju-info should probably also include some additional info relating to charm name and interfaces
<hazmat> take a monitoring tool, how does it know what it's monitoring outside of specific monitoring support in the monitored charm
<bcsaller> hazmat: I would think anything it includes could be included in any/all charms
<bcsaller> hazmat: and by charms I mean relations
<hazmat> hmm.. i suppose so, but this case is implicit whereas the others are already explicit, but given a polymorphic interface i suppose that makes sense as well
<bcsaller> hazmat: more info and fewer special cases, at minimum there is one win in that list
<SpamapS> hazmat: +1 from me to adding charm and service name to all relations
<SpamapS> easy win, lots less special casing
<hazmat> SpamapS, true, but adding it to all rels feels like we're missing something; it's going to be duplicated for every remote unit.
<hazmat> it's really a remote endpoint property
<hazmat> but we don't have a nice way to expose that except via relation-get on the remote unit
<SpamapS> hazmat: hm, not sure I understand. You're concerned that it's the same for all units and so it is a different type of information.. but I think that's ok because it's always true for each unit even though it is redundant.
<hazmat> SpamapS, yeah.. just wanted to see if we could store it differently to avoid the copies
<SpamapS> hazmat: why would it be stored?
<SpamapS> hazmat: we should have lazy-fetch for stuff like that
<hazmat> SpamapS, rel-get doesn't differentiate on keys, the info could be stored on the rel; as for why store on the rel, i've been trying to keep contexts fully functional even without topo info to minimize issues arising from lack of stop hooks
<hazmat> ie. self contained for rel hook exec
<hazmat> but that's a losing proposition anyways
#juju-dev 2012-04-25
<wrtp> fwereade: review delivered. just a couple of documentation suggestions.
<fwereade> wrtp, cool, tyvm
<fwereade> wrtp, just one thing: It calls f.SetOutput(ioutil.Discard).
<wrtp> fwereade: go on
<fwereade> wrtp, would "It may call f.SetOutput(ioutil.Discard)." be acceptable?
<fwereade> wrtp, it's the sense I was trying to convey in the original
<fwereade> wrtp, or I could always make it call that and thereby get the simpler doc ;p
<wrtp> fwereade: i'd maybe go for that
<fwereade> wrtp, ...and more consistent behaviour, even if it doesn't matter much
<wrtp> fwereade: exactly
<fwereade> wrtp, sounds good, tyvm
<wrtp> fwereade: cool
<fwereade> wrtp, there's a trivial followup at https://codereview.appspot.com/6115048/ as well
 * TheMue would like to review too but is currently involved in correcting the German translations of the 12.04 press releases.
<TheMue> wrtp, fwereade: moin btw
<wrtp> TheMue: hiya
<wrtp> fwereade: proposal for new method in the Environ interface:
<wrtp> 	// UploadExecutables uploads the files in the
<wrtp> 	// given directory to the environment. They will be
<wrtp> 	// tagged with the given Juju version.
<wrtp> 	UploadExecutables(dir string, version string) error
<wrtp> fwereade: does that seem reasonable to you?
<TheMue> So, back from translation office. *lol*
<wrtp> fwereade: did you see my UploadExecutable proposal?
<TheMue> wrtp: Btw, I wondered why my state test was failing. But then I became aware that I have to install sshd. It was a surprise at first.
<wrtp> TheMue: slightly surprising it's not installed by default, yes. did it give a decent error message, BTW?
<TheMue> wrtp: Yes, it told me that it could not start sshd.
<wrtp> TheMue: that's good!
<TheMue> wrtp: So the fix was just a sudo apt-get ...
<TheMue> niemeyer: moin
<niemeyer> Heya!
<wrtp> niemeyer: yo!
<wrtp> niemeyer: possible addition to the Environ interface:
<wrtp> 	// UploadExecutables uploads the files in the
<wrtp> 	// given directory to the environment. They will be
<wrtp> 	// tagged with the given Juju version.
<wrtp> 	UploadExecutables(dir string, version string) error
<andrewsmedina> wrtp: hi
<fwereade> wrtp, sorry, I missed that earlier
<fwereade> wrtp, shouldn't dir be calculated from version?
<wrtp> fwereade: no, because dir is just a temp directory into which we've installed the go executables
<wrtp> niemeyer, fwereade: a possible alternative:
<wrtp> // UploadExecutables uploads a gzipped tar archive
<wrtp> // containing juju executables read from r. They will be
<wrtp> // tagged with the given Juju version.
<wrtp> UploadExecutables(r io.Reader, version string) error
<wrtp> but i think i prefer the original.
<wrtp> fwereade: perhaps i've misunderstood your comment though
<niemeyer> wrtp: Hmm, sounds interesting, the latter feels better
<niemeyer> wrtp: version shouldn't be a string, though
<niemeyer> wrtp: UploadTools?
<niemeyer> wrtp: as it goes onto tools/
<wrtp> niemeyer: the reason i prefer the former is that it gives the environ the choice to deploy in a different form if desired (e.g. local provider could just copy the dir)
<wrtp> niemeyer: UploadTools sounds good.
<wrtp> niemeyer: why shouldn't version be a string?
<niemeyer> wrtp: I don't see how that's the case with the first and not the second
<wrtp> niemeyer: the second one requires the executables to be tarred and gzipped already
<wrtp> niemeyer: the first one just assumes that there's a directory containing them (as made by GOBIN=$dir go install launchpad.net/juju/go/cmd/...)
<fwereade> wrtp, no, it was my misunderstanding
<niemeyer> wrtp: Sounds sane.. I'd prefer to have that consistent
<niemeyer> wrtp: There's no reason to have each environment with a different scheme
<wrtp> niemeyer: ok.
<niemeyer> wrtp: The alternative would be to reimplement the uploading logic everywhere, and then having to figure how each does what
<niemeyer> wrtp: The consistency also makes it easier to have the other side consistent too (the consuming bit)
<wrtp> niemeyer: i was thinking that there would be a function Archive(w io.Writer, dir string) which would do the tar & gzip
<wrtp> niemeyer: then it's trivial for an environ to use it if it wants
<niemeyer> wrtp: Until we figure why it would not want, this feels counterproductive
<wrtp> niemeyer: one thing: you said that S3 doesn't count the file as uploaded if the network connection is interrupted. but what if the uploading client is interrupted (e.g. with ^C) ?
<niemeyer> wrtp: That's what I did
<wrtp> niemeyer: that's fine.
<wrtp> niemeyer: hmm, so how does S3 know?
<niemeyer> wrtp: http has a Content-Length header on the upload side too
<wrtp> niemeyer: ah yes, i'd forgotten that.
<wrtp> niemeyer: so, maybe the signature should be: UploadExecutables(r io.Reader, length int64, version string) error
<niemeyer> wrtp: +1
<niemeyer> wrtp: Except for version.. it should be a Version IMO
<wrtp> niemeyer: ok
<wrtp> niemeyer: so... if we have a --force-upload flag, how can we tell the environ to bootstrap with the just-uploaded tools?
<niemeyer> wrtp: We'll be sending the current version from the client.. we can use that to decide what to do
<niemeyer> wrtp: We might also have a flag next to it.. something like "latest compatible" vs. "exact"
<wrtp> niemeyer: but if the current version comes from a global variable, it won't be changing each time
<wrtp> niemeyer: we could have an overriding environment variable
<niemeyer> wrtp: Indeed.. what are you foreseeing there?
<wrtp> niemeyer: i'm foreseeing changing some code (but not the version) and uploading. i want that to work.
<wrtp> niemeyer: rather, i don't want it to clash with other people using the same version.
<niemeyer> wrtp: That sounds fine, that's what --force-upload would do, right?
<wrtp> niemeyer: or are you envisaging everyone using a different bucket?
<niemeyer> wrtp: Or --upload-tools, maybe
<niemeyer> wrtp: Ah, yeah, definitely
<niemeyer> wrtp: That was in the ML proposal
<niemeyer> wrtp: There's a well known $PUBLIC, but in development mode it'd upload to your own bucket
<wrtp> niemeyer: and that bucket comes from the environment.yaml, right?
<niemeyer> wrtp: Right
<niemeyer> wrtp: It's the same thing as the provider storage
<wrtp> niemeyer: that should work. we can just use the existing control-bucket, i guess
<niemeyer> wrtp: That's also the same thing as the provider storage :-)
<wrtp> niemeyer: ok, just making sure i understand.
<niemeyer> wrtp: Cool
<wrtp> niemeyer: so --upload-tools triggers dev mode ?
<niemeyer> wrtp: It's a bit unfortunate actually, that it's a common namespace
<niemeyer> wrtp: We could have done better than that otherwise
<niemeyer> wrtp: Hmm, good question.. what else would the dev mode affect?
<wrtp> niemeyer: i can't think of anything else.
<niemeyer> wrtp: Maybe we should handle that side of it as a feature on its own then
<wrtp> niemeyer: i think --upload-tools *is* dev mode for all intents and purposes
<niemeyer> wrtp: Just always check the provider storage before $PUBLIC on deployments
<niemeyer> wrtp: and join the versions found for purposes of deployment.. then, pick the right version considering "exact" vs. "latest", and deploy the right one wherever it is
<wrtp> niemeyer: i'm not sure i understand. if you've just uploaded the tools, why do you need to check anything other than provider storage for an exact match?
<wrtp> niemeyer: or are you suggesting that normal user deployments check their provider storage too?
<niemeyer> wrtp: The fact the tools were uploaded sounds orthogonal to whether it is used, yeah.. When we don't upload the tools, what do we want to happen?
<wrtp> niemeyer: that's a good question.
<wrtp> niemeyer: i think that the tools should be chosen once, and then stored in the zk state
<wrtp> niemeyer: (to be changed later if we want to upgrade)
<niemeyer> wrtp: I'm not sure, only because the matrix is wider
<niemeyer> wrtp: It's quite possible that different Ubuntu releases and architectures may be forced onto a specific version based on availability
<wrtp> niemeyer: ah, but if we're uploading to the same URL each time, we have a problem, because we haven't got a good handle on the unchanging tools archive.
<wrtp> niemeyer: hmm, that's true. but what should --upload-tools do in that case?
<wrtp> niemeyer: if we can't use the version we're trying to upload, that is
<niemeyer> wrtp: That's the thing, it feels like it's orthogonal.. it'd just cause the version in use to be sent to the environment storage
<wrtp> s/we're trying to upload/we've just uploaded/
<niemeyer> wrtp: Everything else works the same
<niemeyer> wrtp: Then we have to define that everything else to be sensible, of course
<wrtp> niemeyer: i'm not quite sure what you're suggesting here. that --dev-mode triggers exact version matching?
<niemeyer> wrtp: No, I'm suggesting that this is orthogonal to the storage provider.. there are two different things: whether to upload the tools, and whether to enforce specific version matching or be lax
<niemeyer> wrtp: So, just for the sake of understanding: bootstrap --upload-tools vs. bootstrap --exact-version vs. bootstrap --upload-tools --exact-version
<wrtp> niemeyer: would there ever be a case where you'd want to do --upload-tools without --exact-version?
<niemeyer> wrtp: Yes, in the cases where I'm happy for juju to pick a different version to enable me to work with other architectures, for example
<wrtp> niemeyer: even though the version you've uploaded might well be ignored?
<wrtp> only if there's a later uploaded version, i suppose
<wrtp> i'm slightly nervous about having multiple deployed units all with different versions
<wrtp> but i suppose it's an inevitable consequence of our versioning strategy
<wrtp> *maybe*
<niemeyer> wrtp: I understand and agree.. it's not even a consequence of our versioning strategy.. both of these aim at solving the problem which is supporting software for half a decade in multiple releases of the OS
<niemeyer> wrtp: and in multiple architectures
<niemeyer> wrtp: Practical example:
<niemeyer> wrtp: 12.10 ships with juju 2.0.0.. then, we find an important bug, and release 2.0.1.. do we want the guy that is using the client of 2.0.0 because he happened to stumble upon the CD to deploy 2.0.0 or 2.0.1? I'd prefer if the servers were running 2.0.1, even if the version doesn't match
<wrtp> niemeyer: yes, i agree. however, the scenario we were talking about was in development mode. i *think* i'd prefer --upload-tools to force my current version, even if it might not succeed on some architectures.
<niemeyer> wrtp: Fair enough, I'm happy to start from there, with --upload-tools implying --exact-version
<wrtp> niemeyer: so we give bootstrap a new exactVersion argument?
<niemeyer> wrtp: +1
<wrtp> niemeyer: and StartInstance, presumably
<niemeyer> wrtp: Actually, why don't we just start from there and ignore the other behavior for the moment?
<wrtp> niemeyer: just use a given version; no compatibility comparisons?
<TheMue> niemeyer: Beside an updated https://codereview.appspot.com/6120045/ I just proposed https://codereview.appspot.com/6111053/ containing only the NeedsUpgrade change (based on the trunk this time ;) )
<niemeyer> wrtp: Yeah, I *think* it'd be easy to get going with that, and then once the straightforward works, increment with a non-exact feature
<niemeyer> wrtp: We can even switch the default without much pain, I suppose
<wrtp> niemeyer: sounds good.
<niemeyer> wrtp: Otherwise we'll end up stumbling upon several issues before we even have anything working
<niemeyer> TheMue: Awesome, will try to review that now before lunch
<wrtp> niemeyer: so, what should i do about the versions package? it pretty much implements semantic versions as specified on semver.org, so perhaps i should just take out the Compat func (which is juju-specific) and leave the rest, and use it just for the parsing and not the comparison.
<TheMue> niemeyer: Great, thx
<niemeyer> wrtp: IMO, it should be greatly simplified
<wrtp> niemeyer: ok. you don't think it's useful having a standard semantic versions implementation?
<niemeyer> wrtp: What we need is type Version struct { Major, Minor, Patch int }
<wrtp> niemeyer: ok, no pre-release, patch etc
<niemeyer> wrtp: and a version.Parse that returns it
<wrtp> i might just put the original onto rog-go.googlecode.org, as it's potentially useful for someone.
<wrtp> niemeyer: sounds good.
<niemeyer> wrtp: Agreed, not disagreeing with your generic design of semantic versioning itself
<wrtp> niemeyer: it *is* pretty darn complex for what it gives you :-)
<niemeyer> wrtp: Yeah
<niemeyer> wrtp: func (v Version) IsDev() bool <= might be useful
<wrtp> niemeyer: +1
<wrtp> niemeyer: (although i still think the odd-numbered versions will seem odd to people: "We've just released version 2.0.0 of juju" "But what happened to 1.0.0?!")
<niemeyer> wrtp: odd/even is not so uncommon, and as long as we have the proper mechanisms in place in terms of code, no real damage can happen I suppose
<niemeyer> wrtp: e.g. upgrade-juju shouldn't jump from stable to dev without a flag
<wrtp> niemeyer: yeah, it's just a bit weird from a PR perspective. but if it's not uncommon, that's fine.
 * TheMue loves LGTMs
<niemeyer> wrtp: http://en.wikipedia.org/wiki/Software_versioning#Odd-numbered_versions_for_development_releases
<niemeyer> TheMue: ;)
<wrtp> niemeyer: i can't see anything that does that on odd-numbered major releases.
<wrtp> niemeyer: case in point: http://en.wikipedia.org/wiki/GNOME#Versions
<niemeyer> wrtp: That doesn't bother me too much.. the main point of the odd major is being able to test the major upgrades before releasing them
<wrtp> niemeyer: i'm sure it'll work out
<TheMue> niemeyer: I would like to move all watchers into a new file "watchers.go" and rename "watch_test.go" to "watchers_test.go". OK for you?
<wrtp> TheMue: one though: i think singular works just as well as plural, if not better. i'd suggest "watcher.go" and "watcher_test.go".
<wrtp> s/though/thought/
<TheMue> wrtp: I think you have a better feeling for the language than me, so it would be ok for me to take "watcher", yes.
<niemeyer> TheMue: Review delivered, cheers
<niemeyer> TheMue: Yes, fine to move them elsewhere, as long as its in a branch by itself
<niemeyer> TheMue: Hard to review moving code
 * niemeyer => lunch, biab
<TheMue> niemeyer: Thx for the power reviewing. :D
<TheMue> niemeyer: Yes, it will be a branch of its own.
<TheMue> niemeyer: Enjoy
<wrtp> niemeyer: i've updated the version CL taking into account our discussions above: https://codereview.appspot.com/6082044
<niemeyer> wrtp: Done
<wrtp> niemeyer: submitted. thanks a lot.
<niemeyer> wrtp: np!
<wrtp> pwd
<niemeyer> wrtp: I now realize that ClientVersion is probably not correct.. this should be version.Current I think
<wrtp> niemeyer: +1
<wrtp> niemeyer: will do
<niemeyer> wrtp: Thanks!
<wrtp> fwereade: ping
<fwereade> wrtp, pong
#juju-dev 2012-04-26
<wrtp> fwereade: mornin'
<fwereade> wrtp, heyhey
<wrtp> fwereade: my ping last night BTW was to let you know that i'd found an old branch that actually made some of the juju commands work, in case you were about to work on something similar.
<wrtp> fwereade: it's funny though, i dusted off the branch, split off a part of it that was logically separate and proposed that, then proposed the original again, and only *then* did i discover i'd proposed it before and even had a LGTM!
<fwereade> wrtp, yeah, it was nagging at my mind, I *thought* you'd done something like that :)
<wrtp> fwereade: yeah, exactly. i looked at cmd/bootstrap and thought "i thought this was working now", then i found the branch and thought i'd never got around to proposing it.
<wrtp> fwereade: there was quite a bit of code that i'd totally forgotten about
<fwereade> wrtp, heh, ouch :)
<wrtp> fwereade: it's fine. nobody's written anything similar in parallel, and the tests are *almost* passing.
<wrtp> fwereade: it's quite nice actually.
<fwereade> wrtp, awesome :)
<fwereade> wrtp, the Init change will hit one of us soon but it shouldn't be a bother
<fwereade> wrtp, oo, cath has made breakfast, bbs :)
<wrtp> fwereade: is there another Init change in the offing?
<wrtp> fwereade: enjoy!
<wrtp> hey Daviey
<fwereade> wrtp, oh, yeah, Init went in already; sorry, confused :)
<fwereade> wrtp, (mmm, maltese bread :))
 * wrtp has no idea what maltese bread might be like, but thinks he should have some breakfast because his mouth is watering.
<wrtp> (mmm, Sainsbury's muesli :-))
<wrtp> TheMue: hiya
<TheMue> wrtp: moin, just came back from workout
<TheMue> wrtp: â¦ and a discussion with my daughter that a fresh setup of a computer keeping software and data is more than a five minute job
 * TheMue laughs
<wrtp> TheMue: what, you mean you can't just do "juju add-unit ms-windows --constraint machine=daughters-computer"? :-)
<wrtp> fwereade: what *is* maltese bread like, BTW?
<TheMue> wrtp: Hehe, no. And not even "osx" (she's got an Apple). OK, installing the OS is done quickly, but backing up her data and reinstalling her software afterwards takes longer.
<TheMue> Btw, anyone already upgraded to 12.04?
<fwereade> wrtp, crusty on the outside, awesomely soft and springy on the inside, nice taste that I don't really have the vocabulary to describe; works really *really* well with tomatoes, capers, maybe basil, little bit of oil
 * fwereade is hungry again
<wrtp> TheMue: not I
<fwereade> TheMue, yeah, haven't got around to it yet
<wrtp> fwereade: white or brown?
<fwereade> wrtp, white
<TheMue> wrtp, fwereade: Will also wait a few days until upgrade.
<TheMue> fwereade: Hey, I'm trying to lose some weight. Please don't write about delicious food here. ;)
 * fwereade resists the urge
 * TheMue once again listens to Porcupine Tree and is happy about such great music
<fwereade> wrtp, btw, you remember we had a discussion about cmd/jujuc/server and tomb the other day
<wrtp> fwereade: i do
<wrtp> fwereade: i suggested not using tomb in that case if i remember rightly
<fwereade> wrtp, I must say I keep wanting to use tomb when I look at it -- for example, as it is I think it will be tricky to implement both Close and Wait nicely
<wrtp> fwereade: does it need a Wait?
<fwereade> wrtp, perhaps not yet; but without one, error discovery is deferred until you want to close the server anyway
<fwereade> wrtp, and that feels like it could be a problem
<wrtp> fwereade: i see
<wrtp> fwereade: depends whether something might want to wait for an error and start a new server or something
<wrtp> fwereade: but to be honest, when are you ever going to get an error here? accept is never going to fail in reality.
<fwereade> wrtp, I don't know... but I'm not very comfortable just dropping errors on the floor like that: just because I can't imagine the error doesn't mean it won't happen ;p
<wrtp> fwereade: can you paste the code again so i can remember it a bit better?
<fwereade> wrtp, http://paste.ubuntu.com/947032/
<wrtp> fwereade: are you ever going to implement any other methods on Server? (other than Wait)
<wrtp> fwereade: cos i have a suggestion
<wrtp> iom
<fwereade> wrtp, I don't think so
<wrtp> ion
<fwereade> wrtp, go on
<wrtp> fwereade: why not remove the server type, and replace with a function: RunServer(...) error
<wrtp> ah, no
<fwereade> wrtp, I *think* I need both wait and close
<wrtp> well, why not
<wrtp> fwereade: when are you going to want to shut down the server?
<wrtp> fwereade: won't it always run forever?
<fwereade> wrtp, as part of an orderly shutdown in preparation for a code upgrade, for example
<wrtp> fwereade: if we're doing that, won't we be exiting the whole program anyway?
<wrtp> fwereade: alternatively...
<wrtp> fwereade: don't start the server on NewServer - just return the server object. then implement Server.Run which blocks when called.
<fwereade> wrtp, ah, that sounds like it could work
<wrtp> fwereade: similar to rpc.Server.ServeConn
<fwereade> wrtp, I'll try that, ty :)
<wrtp> fwereade: np
<wrtp> fwereade, TheMue: https://codereview.appspot.com/6120052/
<TheMue> *click*
<wrtp> you have reviewed it before, but it has changed a certain amount
<fwereade> wrtp, I can't decide whether I like the bootstrap Init
<wrtp> fwereade: you mean that it initialises the Conn rather than just the env name?
<fwereade> wrtp, it kinda feels as though an error in juju.NewConn would be better reported to the user with os.Exit(1) rather than (2)
<fwereade> wrtp, which is the observable distinction between Init and Run errors
<wrtp> fwereade: it means we can uniformly test that all commands open their environment correctly
<wrtp> fwereade: although perhaps there's another better way of doing that that i haven't thought of
<fwereade> wrtp, I don't see how that follows -- the code is duplicated in the Inits of Bootstrap and Destroy, so where we happen to test it is a matter of choice
<fwereade> wrtp, now I think of it, what we did with Log could quite easily work with other chunks of common command functionality
<wrtp> fwereade: ah, now yer talkin
<fwereade> wrtp, I think it'd work out really quite nicely :)
<wrtp> fwereade: consider it done
<fwereade> wrtp, cool
<fwereade> wrtp, I'll leave a comment for form's sake, I haven't finished looking through it yet
<wrtp> fwereade: are the final Log changes submitted yet?
<fwereade> wrtp, afraid not, you'll have to look at the CL for the latest
<wrtp> fwereade: k
<fwereade> wrtp, hoping that this morning's modifications will pass muster :)
<wrtp> fwereade: i'm not sure that that approach entirely solves the problem. it looks to me like we'll still need to test the environment behaviour of every command.
<wrtp> fwereade: for instance, here's what bootstrap might look like: http://paste.ubuntu.com/947216/
<wrtp> fwereade: although, i suppose that if we test at least one working example, we can infer the fact  that the command has the correct c.env.open call, and hence testing the env name after Init is sufficient.
<fwereade> wrtp, I think that's a good boundary, yeah
<wrtp> fwereade: yeah, i think i'll do that.
<fwereade> wrtp, other comments delivered btw
<wrtp> fwereade: tyvm
<niemeyer> Hello!
<wrtp> niemeyer: yo!
<TheMue> niemeyer: moin
<wrtp> fwereade: PTAL
<wrtp> niemeyer: https://codereview.appspot.com/6120052
<fwereade> niemeyer, heyhey
<niemeyer> fwereade: https://codereview.appspot.com/6115048/ reviewed
<fwereade> niemeyer, tyvm
<niemeyer> fwereade: My pleasure
<fwereade> niemeyer, ah, I think I missed the import of what you said before
<niemeyer> wrtp: Ohhh, tasty
<niemeyer> fwereade: You mean re. public/private?
<fwereade> niemeyer, Help should probably be public if for no other reason than directness of testing
<niemeyer> fwereade: Yeah, it looks like a nice piece for the API
<fwereade> niemeyer, there's a followup that covers that but it's WIP until the parent pipeline settles down
<niemeyer> fwereade: That's ok
<wrtp> niemeyer: you reviewed it before some time ago, and i forgot about it, found the old branch, dusted it off, split it, and only when i re-proposed did i find the old thread!
<niemeyer> wrtp: Hah :-)
<niemeyer> wrtp: Nice to see this going along the cool improvements by fwereade in the last few days
<fwereade> niemeyer, I feel that "ERROR:" beats "error:" in practice merely because it's often followed up with "usage:", etc, and it makes it stand out much better
<niemeyer> wrtp: Should we do a gigantic s/Environ/Env/ replacement?
<wrtp> fwereade: the Init/Run distinction works well. i don't know quite why we didn't go with that originally. maybe because we didn't think of the trick of passing in FlagSet
<niemeyer> and s/environs/envs/ etc
<wrtp> niemeyer: i like Environ
<wrtp> niemeyer: but i'd prefer "environ" as the package name
<niemeyer> wrtp: I see we often pick one or the other arbitrarily
<fwereade> wrtp, indeed, it would be interesting to go spelunking through its very early history in my copious free time ;)
<wrtp> fwereade: yeah, there was a long discussion AFAIR
<niemeyer> That's of course not too important, and if there's no agreement let's move on
<wrtp> niemeyer: i realised that unnecessary plurals don't help anything. hence my "environs->environ" suggestion.
<wrtp> niemeyer: but i'm much less keen on abbreviations than i used to be :-)
<niemeyer> wrtp: I'd take "env" as a tradeoff ;-)
<wrtp> niemeyer: i'll think about it
<niemeyer> wrtp: FWIW, we had "environs" to free the "environ" name for variables, if I'm not mistaken
<niemeyer> I may be wrong, though
<wrtp> niemeyer: maybe. though we usually use "e" or "env" (a good reason not to rename the package, perhaps)
<TheMue> niemeyer: Just for info, https://codereview.appspot.com/6111053/ has the changes after the review in and https://codereview.appspot.com/6118055/ is a new one I would like to see in trunk before moving the watchers.
<niemeyer> wrtp: Or a good reason to have it as juju/envs
<niemeyer> wrtp: same as strings, bytes, ..
<niemeyer> TheMue: Cool, thanks
<wrtp> niemeyer: i don't feel like i type it that often, or that it's an eyesore when i do.
<niemeyer> wrtp: "that"?
<wrtp> niemeyer: although i'm still thinking :-)
<wrtp> niemeyer: "environ"
<wrtp> niemeyer: "Environ"
<niemeyer> wrtp: Ah, .. it's more that we can't agree on the proper way to abbreviate it
<niemeyer> wrtp: Not that I dislike it per se
<wrtp> niemeyer: true.
<niemeyer> wrtp: Your branch has EnvName
<wrtp> niemeyer: ah!
<wrtp> niemeyer: now i see where you're coming from!
<niemeyer> wrtp: It's not the only place.. it just reminded me of the issue
<wrtp> niemeyer: hmm, you're right: http://paste.ubuntu.com/947322/
<niemeyer> wrtp: I'm pretty sure I had reviewed that branch before, but my comments are not there.. was there a different thread in place already?
<wrtp> niemeyer: look at the first message in the thread
<wrtp> niemeyer: i had to make a new CL because i changed the dependency
<niemeyer> OMG
<niemeyer> I can see that :-)
<niemeyer> wrtp: I was reminded of it because I still feel the same way about the tests
<wrtp> niemeyer: all of them? i think there's enough commonality to justify tables, but perhaps only because i anticipate very similar tests for other subcommands.
<niemeyer> wrtp: cmd_test.go
<wrtp> niemeyer: there are two table-driven tests there.
<niemeyer> wrtp: It feels over the top.. I'm tempted to reduce them to straightforward checks to see how it'd look
<wrtp> niemeyer: the first one really will be identical for every command that takes a --environment flag
<niemeyer> wrtp: Maybe I'm just missing how complex it is, but the amount of exceptions and reflection on these tests makes me feel like it's being unnaturally forced onto a table
<wrtp> niemeyer: i don't see any exceptions in the EnvironmentInit tests, or reflection in the Commands tests.
<wrtp> niemeyer: your argument is stronger for TestCommands, i think
<niemeyer> if t.initErr != "" {; if t.runErr != "" {; com := testInit(c, t.cmd, t.args, t.initErr); etc
<wrtp> niemeyer: i could easily factor out the command test thing into a function.
<niemeyer> wrtp: This is also extremely simplistic:                 ops: envOps("peckham", dummy.OpBootstrap),
<wrtp> niemeyer: say runCommand(c *C, com Command, args ...string) (initErr, runErr error, ops []dummy.Operation)
<niemeyer> wrtp: This only works for cases where it's calling a simple method without any expectations about it
<niemeyer> wrtp: It's mocking at its best.. tying implementation shape to the tests
<wrtp> niemeyer: that's perhaps true
<wrtp> niemeyer: i'll try refactoring it. i stick to my guns on TestEnvironmentInit though.
<niemeyer> wrtp: This one is not so bad, but I'd change it to have a list of factories rather than a list of commands
<niemeyer> wrtp: So that the magic copying can go away
<wrtp> niemeyer: ok
<niemeyer> wrtp: That's the only comment really.. everything else is great
<wrtp> niemeyer: it doesn't get rid of reflection entirely BTW
<wrtp> niemeyer: thanks
<andrewsmedina> niemeyer, wrtp morning
<wrtp> andrewsmedina: hiya
<andrewsmedina> wrtp: are you ok?
<wrtp> andrewsmedina: fine thanks. and you?
<andrewsmedina> wrtp: when you have time, please take a look https://codereview.appspot.com/6099051/
<andrewsmedina> wrtp: im ok
<wrtp> andrewsmedina: yes i will. thanks for the reminder.
<andrewsmedina> wrtp: thanks
<wrtp> andrewsmedina: one question: do you need to be root to run those tests?
<wrtp> andrewsmedina: also i'm slightly concerned that running the tests might have undesirable side effects (but i'm not familiar with virsh - perhaps there are none)
<andrewsmedina> wrtp: no. the user that will run the tests needs access to libvirt
<wrtp> andrewsmedina: and side effects?
<andrewsmedina> wrtp: no
<andrewsmedina> wrtp: i create and destroy a dummy net
<wrtp> andrewsmedina: i don't see anywhere in the tests that the net is destroyed
<andrewsmedina> wrtp: in the other tests I'm using the default libvirt net
<wrtp> andrewsmedina: sorry, i don't know anything about libvirt. what's the default libvirt net?
<wrtp> andrewsmedina: is it something the user might be using anyway?
<niemeyer> fwereade: https://codereview.appspot.com/6123049/ done
<fwereade> niemeyer, tyvm
<andrewsmedina> wrtp: I'll be more careful in the tests and create a flag for those who have not installed virsh
<andrewsmedina> wrtp: they may be using the default network but I do not modify it in the tests
<wrtp> andrewsmedina: isn't starting it a modification?
<fwereade> niemeyer, nice suggestion, ty
<niemeyer> fwereade: Glad you like it
<andrewsmedina> wrtp: the start verifies whether the net is already started
<niemeyer> fwereade: I pondered about the forcing upgrade as well, btw
<wrtp> andrewsmedina: what if it's not already started?
<niemeyer> fwereade: I didn't bother too much because it's mimicking what's in Py today
<fwereade> niemeyer, oh, ok
<niemeyer> fwereade: I'm happy to splat the state if we have --force, but I don't think the opposite is sensible
<niemeyer> fwereade: IOW, if somebody does "upgrade --force", and then "upgrade", the second shouldn't drop the --force state
<fwereade> niemeyer, ah, I thought it... kinda dumb to do so, but not intrinsically *bad*
<TheMue> niemeyer, fwereade: I found it in the Py code, that a change isn't allowed.
<fwereade> niemeyer, I don't really feel that it should be an error, even so
<fwereade> niemeyer, it may silently refuse to downgrade from --force to not
<niemeyer> fwereade: I think it's actually bad because doing an upgrade at all is generally forbidden unless the unit is running fine
<niemeyer> fwereade: The first succeeds because of the --force, and the second succeeds because of the first
<andrewsmedina> wrtp: a network can be active, inactive, or nonexistent
<niemeyer> fwereade: So, in a way, the second would only happen at all because there's a --force in place..
<fwereade> niemeyer, hmm, ok
<andrewsmedina> wrtp: if it's active, it's started
<niemeyer> fwereade: That's why, in that situation, I'd rather see "hey, already doing it, hold on" rather than "sure!"
<wrtp> andrewsmedina: by running this test, the current status is changed. i don't think tests should have side-effects.
<andrewsmedina> wrtp: in your machine?
<wrtp> andrewsmedina: that's right
<niemeyer> TheMue: Are you with us
<niemeyer> ?
<niemeyer> TheMue: The summary is:
<wrtp> andrewsmedina: it's ok if it has side-effects that are cleaned up afterwards
<niemeyer> TheMue: upgrade + upgrade == fine
<niemeyer> TheMue: upgrade + force-upgrade == fine
<andrewsmedina> wrtp: yes
<niemeyer> TheMue: force-upgrade + force-upgrade == fine
<niemeyer> TheMue: force-upgrade + upgrade == NOT fine
<andrewsmedina> wrtp: I will use a testdummy net in tests for start method ok?
<wrtp> andrewsmedina: but anyone should be able to run go test launchpad.net/juju/go/... and be happy that it won't interfere with anything on their machine
<niemeyer> TheMue: Well.. or maybe, fine as well
<niemeyer> fwereade: We could actually make it fine, by saying that the --force continues in place in that situation
<TheMue> niemeyer: Hmm, sadly don't know enough about the upgrade mechanism in total.
<wrtp> andrewsmedina: you should make sure to delete the net afterwards too (perhaps by defining TearDownSuite)
<fwereade> niemeyer, yeah, that seemed nicer to me
<andrewsmedina> wrtp: are you "+1" for a virsh flag, or should I reuse the lxc flag?
<andrewsmedina> wrtp: I'm doing it for other tests
<wrtp> andrewsmedina: i don't understand
<fwereade> niemeyer, you're not allowed to, er, downgrade your upgrade ;)
<niemeyer> TheMue, fwereade: Ok, so what about this.. let's just move on with what's there given that it mimics what exists today.. we can improve it later
<niemeyer> fwereade: We can then implement that nicer behavior on a follow up
<TheMue> niemeyer: Sounds ok.
<fwereade> niemeyer, TheMue: yep, I'm fine with that
<niemeyer> TheMue: Cool
<wrtp> andrewsmedina: what flag are you referring to?
 * TheMue sadly forgot why he added (TODO) to the comment. 
<TheMue> I'm getting old.
<andrewsmedina> wrtp: a "--virsh" flag
<andrewsmedina> ops
<niemeyer> TheMue: So you just need to observe the last of fwereade's points to see how to get to some agreement
<andrewsmedina> wrtp: a "-virsh" flag
<wrtp> andrewsmedina: it depends whether you think you can implement the tests without affecting the user's environment
<wrtp> andrewsmedina: if you don't, i think it would be a good idea to implement some tests that can be run without running virsh, even if they don't verify all the functionality
<andrewsmedina> wrtp: how do I mock the virsh commands?
<wrtp> andrewsmedina: you could set $PATH
<andrewsmedina> wrtp: I don't understand
<wrtp> andrewsmedina: have a look at state/ssh_test.go
<wrtp> andrewsmedina: it uses that technique
<andrewsmedina> wrtp: brb
<wrtp> andrewsmedina: ok
<TheMue> fwereade: For the watchers, which run concurrently, I would like to keep those changes in the background (see the other watchers). Direct write/read tests are already done in state_test.go.
<fwereade> TheMue, I don't see how it's a benefit to run the test code concurrently when you don't have to
<fwereade> TheMue, the change goes through ZK just the same either way, and if you skip the concurrency you can eliminate both waiting and the (admittedly unlikely) potential event coalescing
<TheMue> fwereade: How would you build this as a table driven test?
<andrewsmedina> wrtp: im back :D
<fwereade> TheMue, I'm not sure it favours a table-driven style so much as it does a plain old procedural style
<TheMue> fwereade: niemeyer and wrtp convinced me about the elegant way of doing it table-driven. And I've got to admit, I now like this way. It's clean.
<fwereade> TheMue, heh, has it changed already?
<fwereade> TheMue, still don't feel it's a good fit tbh, but perhaps it'll grow on me
<TheMue> fwereade: What has changed? I had changed other watcher tests to be table-driven today.
<TheMue> fwereade: All now work the same way, also the new PortsWatcher test.
<fwereade> TheMue, sorry, I'm just unclear on a largely irrelevant point: whether or not the watcher tests were originally more procedural
<TheMue> fwereade: I've started them in a more mixed way, asynchronous changes and a procedural testing of the obtained values.
<TheMue> fwereade: And here it would have been simpler to move the changes between the tests.
<fwereade> TheMue, it's the asynchronicity that I feel is the problem anyway, not the table-drivenness
<fwereade> TheMue, what behaviour does it exercise that a synchronous test wouldn't?
<TheMue> fwereade: Yes, that's why I asked if you've got a good idea on integrating the changes into the table too.
<TheMue> fwereade: I've got no problem converting it to be synchronous (but in a different branch, all watchers, after moving). But I would like to keep it table-driven.
<fwereade> TheMue, `func() { unit.OpenPort("tcp", 80)}` ?
<andrewsmedina> wrtp: I will create the mock for virsh like your ssh tests, ok?
<wrtp> andrewsmedina: you've got a review
<TheMue> fwereade: OK, so the table has to be an anonymous struct to contain test func and expected result, but yes, looks ok to me.
<fwereade> TheMue, cool
<wrtp> andrewsmedina: i think so. perhaps you might talk to niemeyer some time about this too.
<andrewsmedina> wrtp: ok
<andrewsmedina> wrtp: I will improve the tests and talk with you and niemeyer again
<TheMue> niemeyer: How about you? I would do that change after this watcher and the PortsWatcher are in and then the branch with moving all watchers is in too.
<niemeyer> TheMue: Which change?
<niemeyer> TheMue: The refactoring fwereade suggested?
<TheMue> niemeyer: I discussed with fwereade to take the concurrent changes out of the tests later.
<TheMue> niemeyer: Yes.
<niemeyer> TheMue: Yeah, as long as fwereade is happy, I'm happy
<niemeyer> TheMue: and as long as you're happy too, of course
<TheMue> niemeyer: Fine, and I will keep it as small branches to make you happy.
<niemeyer> TheMue: WOohay small branches!
<niemeyer> :)
<TheMue> niemeyer, fwereade: Big party!
<niemeyer> !
<niemeyer>   i i i
<niemeyer> ----------
<TheMue> *rofl*
<fwereade> TheMue, I'm happy :)
<niemeyer> |          |
<niemeyer> \:-)
<TheMue> Oh, we get a cake.
<TheMue> Btw, just had some selfmade chocolate chips by my wife with the coffee. Yummy.
<fwereade> niemeyer, btw, ERROR/error: thoughts? ERROR stands out much better when the error line is not the only output
<TheMue> niemeyer: So, submitted, next stop: PortsWatcher. ;)
<niemeyer> TheMue: Was just looking at it, but in a meeting with zaid just now.. will continue after lunch
<niemeyer> fwereade: just a sec and will be with you
<TheMue> niemeyer: OK, np.
<niemeyer> fwereade: I'd prefer "error: unrecognized flag --foo" than "ERROR: unrecognized flag --foo"
<niemeyer> fwereade: My feeling about it is that it was just a problem detected.. no one is dying :-)
<fwereade> niemeyer, fair enough, I imagine I'll grow used to it :)
<niemeyer> fwereade: That said, you have a point..
<niemeyer> fwereade: Something we should do is to reorder the error vs. help
<niemeyer> fwereade: In that one place we print the error & help, error should come last
<niemeyer> fwereade: So it's next to the prompt
<fwereade> niemeyer, that's no problem... and it will be much clearer in that case anyway
<fwereade> niemeyer, ok, if I rearrange all that and make it lowercase, good to submit?
<niemeyer> fwereade: In other cases, I suspect that's already happening
<niemeyer> fwereade: I mean, error is already the last thing printed
<niemeyer> fwereade: As long as that's true, I don't see a problem with it being lowercased as it'll be very clear
<niemeyer> fwereade: Yep, +1
<niemeyer> TheMue: Btw, I just detected a silliness I've suggested that isn't necessary
<niemeyer> TheMue: You know that change, ok := ... if !ok { s.tomb.Kill(nil); return }, that we're using everywhere?
<TheMue> niemeyer: Yes?
<niemeyer> TheMue: It's not necessary
<niemeyer> TheMue: It's fine to just return in that case
<niemeyer> TheMue: The deferred Done() will make it all work fine
<TheMue> niemeyer: So check the ok but don't Kill()? OK, I'll change it during refactoring.
<TheMue> niemeyer: Is it ok to put it into the watcher movement change?
<TheMue> niemeyer: So that change would only contain this code change and the moving.
<niemeyer> TheMue: No, please not in the movement change
<niemeyer> TheMue: You can put it in the PortsWatcher, though
<niemeyer> TheMue: As it's just a bunch of one-liner removals, that's fine
<TheMue> niemeyer: OK, no prob.
<niemeyer> TheMue: If you do that now, please fix the comment of the watcher as well.. it's referring to the copy & pasted watcher
<niemeyer> TheMue: I'll review it right after lunch and hopefully we can already get it in
<TheMue> niemeyer: fwereade already reviewd it.
<niemeyer> TheMue: Ah, superb
<niemeyer> TheMue: Btw, do you understand why these lines can be removed?
<TheMue> niemeyer: OK, will change it now and propose it immediately.
<wrtp> niemeyer: PTAL  https://codereview.appspot.com/6120052
<TheMue> niemeyer: We don't pass an error to outside, but want to close the tomb and the channel. That's done by defers after the return.
<niemeyer> TheMue: Right, Done() takes care of "killing" the tomb if it's not yet dead
<niemeyer> TheMue: So as long as we're returning from this function, with nil err, just returning is fine
<niemeyer> wrtp: Thanks, I'll have  a look right after lunch + TheMue's
<TheMue> niemeyer: Fine, code reduction at its best.
<niemeyer> TheMue: +1
 * niemeyer => lunch
<wrtp> isn't it nice to know that gustavo implies lunch? i'll make sure to have him around as often as possible.
<TheMue> wrtp: *rofl* took a few moments to get it.
<wrtp> :)
<wrtp> hmm, i thought i'd sussed this prerequisite stuff. but despite small diffs
<wrtp> bzr diff --old lp:~rogpeppe/juju/go-more-commands | wc
<wrtp>      68     260    2113
<wrtp> i guess this:
<wrtp>  https://codereview.appspot.com/6117064
<wrtp> s/guess/get
<TheMue> wrtp: Strange. It claims changes different from yours. Hmm.
<wrtp> TheMue: yeah. somewhere it's diffing against an unmerged branch
<wrtp> or something
<TheMue> wrtp: I still don't understand the prereq stuff.
<wrtp> TheMue: i thought i did!
<TheMue> wrtp: As long as the tools are working I'm happy. But in case of a problem I'm lost.
<wrtp> TheMue: and this is really weird: http://paste.ubuntu.com/947637/
<wrtp> if i revert to a revision, isn't the revision-id going to be that revision?
 * wrtp is trying to work out how to diff two revision-ids
<fwereade> TheMue, wrtp: is it not that lbox requires that changes on the target that have been merged into the proposed branch have *also* been merged into the prereq, even if the prereq has already been merged?
<TheMue> wrtp: Huh
<wrtp> fwereade: yes. i did that though, i think
<fwereade> wrtp, ah ok -- it just looked a bit like it does for me when I've got that wrong
<wrtp> fwereade: me too. that's why "i thought i understood it!"
<fwereade> wrtp, haha, sorry :(
<wrtp> oh, i have uncommitted changes.
<wrtp> and it wants to remove cmd/juju/destroy.go
<wrtp> so weird. i could've sworn it passed its test *and* i merged it.
<fwereade> gn all, take care
<wrtp> fwereade: gn, take care
<wrtp> niemeyer, TheMue: https://codereview.appspot.com/6117064
<wrtp> all that hassle for about 6 lines of code :-)
<wrtp> niemeyer, TheMue: ultra-small branch BTW!
<TheMue> wrtp: *lol*
<niemeyer> wrtp: LGTM
<wrtp> niemeyer: still waiting for prereq BTW, but i'm sure you know that :-) it turns out that it didn't need to be a prereq, but i didn't want to delete the CL.
<wrtp> i probably should've
<niemeyer> wrtp: Yeah, I'll just review TheMue's first
<wrtp> niemeyer: np
<wrtp> niemeyer: do you know if http.Client.Do is guaranteed to close req.Body ?
<niemeyer> TheMue: Done.. just a couple of trivials
<TheMue> niemeyer: Fine, thx.
<niemeyer> wrtp: I don't
<wrtp> niemeyer: np. just wondering about s3 Put. but i don't think it matters actually.
<niemeyer> wrtp: I'd check the implementation
<TheMue> niemeyer: Do you wanna see the openPortsNode change? Otherwise I'll submit it.
<niemeyer> TheMue: No, happy to have it submitted, thanks!
<wrtp> niemeyer: yeah, i did, but it's not greatly obvious
<wrtp> time to go
<wrtp> niemeyer, TheMue: g'night. see ya tomorrow.
<TheMue> wrtp: Yes, good night, I leave too in a few minutes.
<TheMue> wrtp: CU
<niemeyer> wrtp: Thanks, and have a good one too
<wrtp> niemeyer, TheMue: to both of you too
<niemeyer> wrtp: I think the answer is no, there's no guarantee
<TheMue> niemeyer: So, I'm off. CU tomorrow.
<niemeyer> TheMue: Cheers, have a good evening too
<TheMue> niemeyer: You too, bye.
<robbiew> niemeyer: we have 1:1 now, mind if we just have it in person next week?
 * robbiew is cleaning up Precise blueprints all day today :/
<niemeyer> robbiew: np
<robbiew> thx...back to the grind
#juju-dev 2012-04-27
<rog> TheMue, fwereade: mornin'
<fwereade> rog, TheMue, heyhey
<TheMue> moin rog and fwereade
<TheMue> fwereade: After my watcher moving proposal this morning I'm now testing the test refactoring we talked about. Looks fine.
<fwereade> TheMue, cool
<TheMue> fwereade: So, done for the "real" watchers. Now for the watcher package and it's done.
<dpkingma> Hello everyone!! I'm looking for a Juju Charm for MongoDB v2.0+
<rog> fwereade: review delivered on the RPC CL
<dpkingma> NM. has been answered!
<rog> fwereade: review delivered on client RPC CL
<fwereade> rog, tyvm
<rog> fwereade: have you looked at the comments on the server CL?
<fwereade> rog, just starting to, need to digest them a little
<rog> fwereade: cool
<rog> fwereade: i think that losing the SuperCommand will be a useful simplification
<fwereade> rog, indeed, that's the bit I need to digest, because it doesn't fit my personal conception of the problem at all -- my current thinking is that SC already does exactly and precisely the job we need -- but you may well have a point :)
<rog> fwereade: consider that we will never get flags *before* the command name. and we already know what command to execute. so we aren't using any of SC's functionality.
<fwereade> rog, SC doesn't have that functionality
<fwereade> rog, the recent split is all about separating the log stuff from the command selection stuff
<fwereade> rog, (er, ok, it does, we moved the Log back in -- but the intent was absolutely that it be optional)
<rog> fwereade: AFAICS SC is all about intermingling the flags from before the command and after the command, and selecting the command.
<rog> fwereade: but we don't care about either of those things here
<fwereade> rog, well, we do care about selecting the command, and the convenient representation of the data is in fact the representation that happens to be perfectly convenient already
<fwereade> rog, you're just proposing a different way of selecting a command
<rog> fwereade: that's true. one that's considerably simpler :-)
<fwereade> rog, my vague idea was that a context should be able to just produce a list of commands it implements, and that the selection business could be taken care of by something that already does that
<rog> IMO
<rog> fwereade: but if a context can simply return a command for a given name, isn't that just as easy to do?
<rog> fwereade: i don't see why we take the trouble to register all those commands with SC when we *already know* what command we're going to run!
<dpkingma> Hi everyone! Question: is there any info on how to deploy stuff with juju on my own OpenStack cloud? Can't find any info online...
<fwereade> dpkingma, very briefly: use the ec2 provider and set the ec2-uri, s3-uri, default-image-id and default-instance-type fields in its config
<rog> pwd
<dpkingma> fwereade: thanks I'll try that!
<fwereade> rog, it seems to me that we know the name of the command we need to run but that actually executing it involves some extra work; and that it is surely convenient for a server.Context to know what commands it implements; but that it does not necessarily follow that we need to duplicate the command selection mechanism
<rog> fwereade: the "command selection mechanism" is one map lookup. i don't see that it's any significant complexity.
<rog> fwereade: but when i saw the SuperCommand logic, i had to think quite hard about what was actually going on. hence my suggestion.
<fwereade> rog, and building the map in the first place; but I don't see how SC represents significant complexity either
<rog> fwereade: it's a lot more complex than a simple map lookup
<fwereade> rog, sorry -- it seemed perfectly natural to reuse the code that happened to solve precisely the problem we have here
<rog> fwereade: i don't think it's solving the problem we've got here. i think you're modifying the problem so it looks like something that is good for SC :-)
<rog> fwereade: try the code without using SC there. i *think* it will look a reasonable amount nicer.
<fwereade> rog, it's building the map and extracting the command name and using it to select a command... and I agree that it's not complex, but it is work that's already been done
<rog> fwereade: and if not, then let's use SC
<fwereade> rog, SC takes a list which is a subcommand name followed by its args and does the needful
<rog> fwereade: building the map is the same complexity as building the slice, which it has to do currently. the command name already has to be extracted by the juju client.
<rog> fwereade: actually SC takes a list which is flags followed by subcommand name followed by flags followed by args...
<fwereade> rog, (please do not assume my motivations -- I'm not *trying* to use SC, I merely observed that it precisely followed the problem)
<fwereade> rog, only if you set the Log field
<rog> fwereade: sorry, i wasn't trying to assume your motivations. i'm going from my initial gut reaction when i saw the code, which hasn't gone away on closer inspection. i think this would be simpler without SC here.
<rog> fwereade: and easier to understand.
<fwereade> rog, I think we need to figure out exactly what our differences are wrt the definition of simplicity :)
<rog> fwereade: actually SuperCommand always takes flags followed by subcommand name - it always accepts --help for example.
<rog> fwereade: lines of code is a good start
<fwereade> rog, I see it and I think "right, a small chunk of the problem uses a well-understood type to so work that that type is expressly designed to do, I can forget that aspect of the problem for now"
<fwereade> rog, s/to so work/to do work/
<rog> fwereade: i see it and think "that entire cmd function does nothing useful"
<fwereade> rog, well, it uses the pieces we have in play right now to do its job
<rog> fwereade: we already have the pieces IMHO - the built in map type and possibly the existing *Log type are all that's needed.
<fwereade> rog, we definitely don't want a Log there
<rog> fwereade: that's fine, we don't need to use it
<rog> fwereade: even better!
<rog> fwereade: so instead of building a slice of Commands and leaving it up to something else to do the selection, we can use a map of commands and an index expression...
<rog> fwereade: which also means that potentially we could return a more useful error message when someone tries to execute a command in the wrong context
<fwereade> rog, isn't "here's a nicely-formatted list of the appropriate commands for this context" useful error-message-wise?
<rog> fwereade: not as useful as "you can't execute this command in this context for this reason"
<rog> fwereade: supercommand will just say "command not found" or whatever
<rog> hmm
<fwereade> rog, and list the ones it can, which is IMO useful
<rog> fwereade: i'm not sure. to the user, they all look like separate commands. seeing "usage: (->jujuc) ..." and a list of other commands isn't great.
<rog> fwereade: in particular the (->jujuc) thing
<fwereade> rog, heh, I felt that was useful information
<rog> fwereade: it's feels more like an implementation detail to me
<rog> s/it's/it/
<fwereade> rog, conceptually perhaps, but not one we can really hide from someone who takes the trouble to look
<rog> fwereade: sure. but we don't want to flaunt it to everyone that happens to mistype a command name...
<fwereade> rog, that'll just give them command not found
<fwereade> rog, it's only if they're hitting something symlinked to jujuc in the first place
<rog> fwereade: ok, to anyone that uses a command in the wrong context then ....
<fwereade> rog, this is moot anyway, AIUI the long-term plan is that they should all always be available anyway
<fwereade> rog, (but the wrong-context thing is IMO exactly when it'll be useful to see the commands that are valid in the current context)
<rog> fwereade: there's nothing stopping us showing them anyway.
<rog> fwereade: if we decide that's what we want. but an error message better than "unrecognised command" would be good.
<fwereade> rog, that feels like still more reimplementation of existing functionality...
<rog> fwereade: sure. i don't think it's really necessary. getting the right error message is more important. we *did* recognise the command, we just didn't act on it.
<fwereade> rog, well, probably, anyway... no doubt someone will try to call jujuc directly sometime ;)
<rog> fwereade: that's fine - we will get "unrecognised command" in that case :-)
<fwereade> rog, ok, I'll give it a go
<rog> fwereade: thanks for bearing with me :-)
<fwereade> rog, please understand that it's frustrating to work on something for a week with the clear intent that it will be used imminently, and to then have it dropped on the floor
<fwereade> rog, this essentially means that all the discussion and argument about how to extract Log from SuperCommand was pure academic wankery :(
<rog> fwereade: yes, i do understand that very well, and please believe me when i say that i wouldn't suggest it unless i thought it would be a significant improvement
<rog> fwereade: oh i see.
<fwereade> rog, this turn of events is maybe also evidence that we flat-out suck at predicting the future ;)
<fwereade> rog, don't get me wrong, SC turned out really nice, but...
<rog> fwereade: well, that is always true in s/w development at least. anything else, i can predict 3 days in advance perfectly.
<fwereade> rog, sure; but your simplicity argument does again hinge on future usage
<rog> fwereade: i'm really happy with the way SC turned out too. it means we can add extra flags easily too if we want, in a nicely organised way
<fwereade> rog, suggestion:
<fwereade> rog, I follow your advice but for different reasons
<rog> fwereade: i'm not sure. my simplicity argument is based on looking at the code as is
<rog> fwereade: and seeing that it could be made simpler
<fwereade> rog, that by implementing it as GetCmd(contextId, name) (string, error) I can thereby punt on the question of selection mechanism until we have an actual supplier of commands implemented
<rog> fwereade: that's true too
<fwereade> rog, you get simpler code to review and we defer any argument about the right way to do something until we're actually doing it ;p
<rog> fwereade: i wasn't sure whether you'd already got some code that implemented GetCmds already lurking somewhere though
<fwereade> rog, no -- it's the simplest thing I could predict it would be easy to get from a Context :)
<rog> fwereade: well, i'm glad about that at least - no downstream branches to mangle
<rog> TheMue: ping
<TheMue> rog: pong
<rog> TheMue: i got a test failure on TestUnitWatchPorts
<rog> TheMue: i've worked out the issue, and i think i have a slightly better way of working that test
<rog> TheMue: how about this? http://paste.ubuntu.com/949642/
<rog> TheMue: this issue was a timeout BTW
<rog> TheMue: the receiver of the changes wasn't waiting long enough for the generator
<TheMue> rog: The tests are currently being refactored in a more serialized way. But that will come in after the moving has a LGTM from Gustavo.
<rog> TheMue: this change makes them both run in lock step (so it should be faster) and also uses the same table for both
<rog> TheMue: ok, so you've got a similar change in the offing?
<TheMue> rog: Yep
<TheMue> rog: Wait, I paste one example.
<rog> TheMue: cool. i'll ignore the test failures for the moment then.
<TheMue> rog: http://paste.ubuntu.com/949648/
<rog> TheMue: that's cool. no need for the extra goroutine, of course, which i hadn't realised. nice.
<TheMue> rog: I followed an idea of William.
<rog> TheMue: 't'was a good idea
<TheMue> rog: Indeed.
<TheMue> fwereade: ping
<fwereade> TheMue, pong
<TheMue> fwereade: Just seen that you've got no roomie, like me. Shall I notify Marianna?
<fwereade> TheMue, sure :)
<TheMue> fwereade: OK, will do.
<fwereade> TheMue, I'm reasonably civilised most of the time ;)
<TheMue> fwereade: I'll try my best to do so too. *lol*
<rog> TheMue: where do you find out your roomie?
<TheMue> rog: I went to the wiki and there "latest changes".
<TheMue> rog: There's a "PeopleDetail" page.
<rog> TheMue: got it
<TheMue> fwereade: Mail is sent.
<fwereade> rog, new cut at https://codereview.appspot.com/6120054/
<fwereade> rog, actually hold off, I can simplify further
<rog> fwereade: yeah, jujucDoc and jujucPurpose can go, at least
<fwereade> rog, that was precisely what I realised :)
<rog> fwereade: LGTM
<fwereade> rog, cool, thanks
<rog> fwereade: i hope it feels ok
<fwereade> rog, it's good, thank you :)
<fwereade> rog, btw, I've really come to appreciate lbox propose publishing all my drafts, really nice idea
<rog> fwereade: i like that too. and the fact that i have to have a clean branch when proposing
<fwereade> rog, yeah, that's saved me a lot of embarrassment :)
<rog> fwereade: i wish cobzr switch would do that check too actually
<fwereade> rog, I've actually given up on cobzr, it surprises me slightly more than bzr does and that makes me paranoid
<rog> fwereade: i find it works pretty well actually.
<rog> fwereade: it would be nice if there was a universal way of specifying a co-located branch, so i could diff against them
<fwereade> rog, it's pretty clearly to do with a part of my brain that hasn't quite adapted to DVCSs in general
<rog> fwereade: so how do you set your GOPATH?
<fwereade> rog, so that's clearly soluble
<fwereade> rog, but the easy diffing is too much to give up
<fwereade> rog, sad to say, I'd rather just move directories ;)
<rog> fwereade: i usually just push to launchpad and diff against there. slow but quick enough for my purposes
<fwereade> rog, that would be fine if *something* I did didn't cause cobzr checkouts to occasionally forget their push branch :)
<rog> fwereade: ah, i have seen that before, i think. but nothing that caused too many problems.
<andrewsmedina> rog: thanks for the reviews :)
<rog> andrewsmedina: np
<niemeyer> Mornings!
<fwereade> heya niemeyer
<niemeyer> fwereade: Hey
<niemeyer> fwereade: I suspect the diff in https://codereview.appspot.com/6116049/ got mixed up with review points that we've sorted elsewhere
<niemeyer> fwereade: If that's the case, we can just get it in and let the other issues be sorted in whatever CL they are being debated
<andrewsmedina> Guest17887: I will improve testing for LXC using the same line you used for the ssh
<rogpeppe>  andrewsmedina: that sounds reasonable.
 * niemeyer missing fwereade at the moment
<niemeyer> I don't understand why the checks in https://codereview.appspot.com/5845051/diff/17001/charm/charm_test.go are being removed..
<niemeyer> Ahh, I think I see.. it was moved to another test
<niemeyer> Nice
<TheMue> So, have to leave.
<TheMue> niemeyer: Watcher moving is in at https://codereview.appspot.com/6131045/, could you please review? Thx.
<TheMue> niemeyer: Watcher test refactoring is also ready locally (but based on the moved watchers). So it could be reviewed immediately after submit of moved watchers.
<niemeyer> TheMue: Will do
<niemeyer> TheMue: Cheers
 * niemeyer => lunch too
<rogpeppe> niemeyer: i'm seeing a store test failure: http://paste.ubuntu.com/950070/
<rogpeppe> niemeyer, fwereade: https://codereview.appspot.com/6128046/
<rogpeppe> fwereade: https://codereview.appspot.com/6128046/
<rogpeppe> fwereade_: https://codereview.appspot.com/6128046/
<fwereade_> rogpeppe, cheers
<rogpeppe> fwereade_: *slightly* concerned about the fact that my test will only work if the juju build succeeds, but we're never gonna break the build, right?
<fwereade_> rogpeppe, that sort of thing never happens at all ;p
<fwereade_> rogpeppe, but tbh it's not clear what you can do about it...
<rogpeppe> fwereade_: yeah. i don't think it matters too much.
<rogpeppe> fwereade_: fix the build, then the tests can start working again
<niemeyer> rogpeppe: Hmm.. is it failing consistently or some times?
<rogpeppe> niemeyer: just that once, but i haven't run all the tests since then. will try again.
<niemeyer> rogpeppe: I will increase the timeout there
<rogpeppe> niemeyer, fwereade_: i'm off slightly early today. very shortly in fact.
<fwereade_> rogpeppe, niemeyer: off in a few short minutes myself
 * fwereade_ almost avoids saying something about friday, friday...
<niemeyer> rogpeppe, fwereade_: Cool, I'll see you guys on Monday, hopefully
<niemeyer> I'll be working from us-east, so we'll have little-to-no overlap
<rogpeppe> niemeyer: did you see my latest CL BTW? archiving of juju client now works. only the S3 and shell script stuff to do.
<niemeyer> rogpeppe: Woohay!
<rogpeppe> niemeyer: store tests passed that time BTW
<niemeyer> rogpeppe: init is close
<rogpeppe> niemeyer: ?
<niemeyer> rogpeppe: Cool.. it's the timeout indeed
<niemeyer> rogpeppe: zkinit on server side
<rogpeppe> niemeyer: cool. i reckon that'll probably be done by the time i've finished the S3 upload code
<rogpeppe> niemeyer: a good place to stop for the day i reckon. have a good weekend all. and niemeyer, have a good flight, hope your ears give you no trouble...
<rogpeppe> fwereade_: cheerio
<niemeyer> rogpeppe: zkinit itself is done or mostly done I think.. next after the upload stuff is tweaking user-data setup to pick the right commands and run them
<niemeyer> rogpeppe: I'm so excited about this stuff, btw
<fwereade_> rogpeppe, happy weekend
<rogpeppe> niemeyer: 'xactly. "the shell script stuff"
<niemeyer> rogpeppe: Yeha
<rogpeppe> niemeyer: me too.
<niemeyer> rogpeppe: That mechanism will make development of the upcoming stuff such a breeze.. I can barely believe we'll be able to just say upgrade-juju --upload-tools and *test the new code*
<rogpeppe> niemeyer: the go tool made it almost trivial... kudos to russ
<niemeyer> rogpeppe: Hmm.. I don't think that's entirely related
<niemeyer> rogpeppe: Kudos to all of them for having standalone binaries.. *that's* the huge facilitator
<fwereade_> gn all, take care
<rogpeppe> niemeyer: yeah, but that i could do "GOBIN=$tmp go install launchpad.net/juju/go/cmd/..." and have all the binaries made for me and up to date is just lovely
<niemeyer> rogpeppe: Ah, I didn't realize you were going in that direction
<niemeyer> fwereade_: Night man
<rogpeppe> niemeyer: hope it works ok for you.
<niemeyer> rogpeppe: Thinking..
<niemeyer> rogpeppe: This is most awesome for dev, but it means you need the dev environment in place too..
<niemeyer> rogpeppe: The alternative I was thinking off was looking for the tools in $PATH
<niemeyer> s/off/of
<niemeyer> rogpeppe: This would work in all cases, as long as one has the tools in their path.. both in dev and in cases when juju is installed via debs
<rogpeppe> niemeyer: i'd assumed non-devs would use public binaries
<niemeyer> rogpeppe: Sure, that's an assumption we can make, but the question is why
<niemeyer> rogpeppe: What are we trading off
<rogpeppe> niemeyer: i can easily add logic to find binaries in $PATH to the current code.
<rogpeppe> niemeyer: perhaps in the next CL?
<niemeyer> rogpeppe: Yeah, we can even wait longer
<niemeyer> rogpeppe: Until we have answers for that
<niemeyer> rogpeppe: Your scheme sounds better for the situation we're currently in
 * niemeyer goes for a few errands in preparation for the trip
#juju-dev 2013-04-22
<thumper> I think I've cracked it
<thumper> ah ffa
<thumper> ffs
<thumper> even
<thumper> I've got bootstrap fixed...
<thumper> but status fails
 * thumper stabs ec2
 * thumper forceably terminates instance
 * thumper waves at mramm in passing
<wallyworld> thumper: have you run the live ec2 tests on raring? ie in the environs/ec2 directory "go test -gocheck.v --amazon"
<thumper> night all
<wallyworld> jam: meeting?
<jam> wallyworld: yep
<jam> are you on mumble?
<wallyworld> aye
<rogpeppe> mornin' all
<jam> hi danilos, I see you on, but you're still muted on mumble
<danilos> jam: hi, joining (it's only :59 for me :)
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: jam | Bugs: 2 Critical, 64 High - https://bugs.launchpad.net/juju-core/
<jam> rogpeppe: did you land the patch to work around the HTTP breakage with the ppa versions of juju-core?
<rogpeppe> jam: the one that retries the PUT requests?
<jam> rogpeppe: something about keepalive that dimitern and danilos worked out at the end of last week.
<jam> the "closed connection" stuff.
<rogpeppe> jam: that's a problem with the debian go distribution
<rogpeppe> jam: the solution is to move to using go1.0.3
<jam> rogpeppe: well, juju-core being broken is our problem.
<rogpeppe> jam: i know, but there's no good fix AFAIK
<rogpeppe> jam: danilos' suggested fix will break other things
<jam> rogpeppe: I don't think we can get 1.0.3 into the archive at this point.
<jam> we're quite a bit past that point.
<rogpeppe> jam: the real fix is not to use a buggy version of go (that bug was never part of a released version BTW)
<jam> so some sort of fix is necessary
<jam> or we won't have juju-core that actually works on Raring.
<rogpeppe> jam: actually that's not the case - it does currently work on Raring i believe
<jam> rogpeppe: I'm pretty sure the ppa version is borked. I've had 3 people say so.
<rogpeppe> jam: because the problem was seen because we had many versions in the public bucket
<jam> If the ppa version becomes the archive version.
<rogpeppe> jam: which put the s3 LIST response over 8K
<rogpeppe> jam: the objects have now been deleted, so the issue is papered over for now
<rogpeppe> jam: it seemed that it *might* be possible to get 1.0.3 into the archive, even at this stage
<rogpeppe> jam: fwereade might know whether it's actually going to happen or not
<fwereade> rogpeppe, jam: I kicked it across to jamespage/Daviey
<jamespage> rogpeppe, fwereade: just getting to that
<fwereade> rogpeppe, jam: IMO it's still profoundly irresponsible from our POV to switch the language version at this stage
<jam> fwereade: I would agree with you
<fwereade> rogpeppe, jam, jamespage: but I concede that from the POV of raring it's a good thing and we have to take our lumps
<jamespage> fwereade, I actually agree; I won't support pushing 1.0.3 into raring this late
<rogpeppe> fwereade: i still think... well, you know what i think - 1.0.3 is a bug-fix version, and we're being bitten by one of the bugs
<jam> fwereade: I also wanted to check about Ian's constraint patches. Can it get reviewed and landed before final cut?
<fwereade> jamespage, yay!
<rogpeppe> fwereade: so how do we fix our code then?
<fwereade> rogpeppe, we don't, because we left it too late
<fwereade> rogpeppe, we paper over it because that's the only thing we can do that doesn't risk far worse instability
<jamespage> fwereade, if there is a critical bug then I'm happy to pull in a single fix but I don't think upgrading even to a point release is a great idea right now
<rogpeppe> fwereade: go1.0.3 is not unstable; and neither is our code when using it
<fwereade> rogpeppe, it's nothing to do with 1.0.3 itself
<rogpeppe> jamespage: it's a patch release really
<fwereade> rogpeppe, and our code has not had anywhere near the use and testing on 1.0.3 that it has on 1.0.2
 * jamespage shrugs
<jamespage> lots of  patch releases won't make raring now
<fwereade> jam, honestly I am -1 on wallyworld's constraints changes now
<jam> fwereade: wfm. It would be nice to have, but it would be a source of breakage.
<fwereade> jam, I don't know "wfm" but I think we're in agreement
<fwereade> jam, we have cowboyed it up far enough ;)
<rogpeppe> jam: i don't believe it would be a source of any breakage at all. the two releases are fully compatible - 1.0.3 just fixes some 1.0.2 bugs.
<jam> rogpeppe: I'm talking about the constraints stuff atm.
<rogpeppe> jam: oh, sorry
<rogpeppe> fwereade: we weren't seriously thinking of putting that constraints stuff in *now*, were we?
<fwereade> rogpeppe, jam suggested it, I said no
<rogpeppe> fwereade: phew.
<fwereade> rogpeppe, I don't think he was seriously expecting a yes ;)
<fwereade> rogpeppe, but it's good to be clear ;)
<fwereade> jam, rogpeppe: there are lots of things that I want to get onto ASAP *after* today once we have the release in place
<jam> fwereade: in a "we'd like to have feature parity with pyjuju" sense it is definitely in the really-nice-to-have category. But the risk is high enough that it didn't necessarily balance out.
<fwereade> jam, rogpeppe: but as far as I am concerned we fixed or worked around the *really* critical stuff on friday
<rogpeppe> fwereade: perhaps we should move to using go1.1beta as standard from then
<fwereade> rogpeppe, I'm fine switching to 1.0.3 after today but not developing officially on a beta language version
<jamespage> fwereade, just for the record please can someone explain how tooling is prepared, i.e. why 1.0.3 is important in raring but 12.04 only has a much older version
<fwereade> jamespage, nobody cared until it was an opportunity to snark about me "not wanting to fix bugs"
<rogpeppe> i did care, but i presumed that we'd have 1.0.3 in raring
<rogpeppe> jamespage: i think we just use the unmaintained debian upstream go package
<jamespage> but why does 12.04 not matter?
<danilos> fwereade, jam: do we need to identify exact patch that needs to go on top of already present debian/patches/15-net-http-connection-close.patch and see if it's minor enough for jamespage to include in 1.0.2 package?
<rogpeppe> jamespage: it does
<jamespage> for reference:
<jamespage>     golang |      2:1-5 | precise/universe | source, all
<jamespage>     golang |  2:1.0.2-2 | quantal/universe | source, all
<jamespage>     golang |  2:1.0.2-2 | raring/universe | source, all
<jamespage> danilos, if we do anything it will be that
<fwereade> jamespage, I don't think anyone disagrees that it would be good to have, and I'd be +1 on updating it when we are no longer in release frenzy mode
<fwereade> jamespage, (I think that applies across the board from P onwards actually)
<rogpeppe> jamespage: i think go 1 was only just out when precise was released.
<jamespage> looking at https://code.google.com/p/go/issues/detail?id=4914
<jamespage> it appears a revised patch was produced - but it's never been picked up in the packaging
<danilos> jamespage, right, I was wondering if people want me to test that patch on raring and see if it indeed solves our problems so we can ask for it to be included
<danilos> "that patch" == "patch from https://codereview.appspot.com/6211069"
<jamespage> danilos, yes that would be a good idea
<rogpeppe> danilos: that patch isn't sufficient - it doesn't fix https://code.google.com/p/go/issues/detail?id=1967
<rogpeppe> danilos: which is needed to make goamz/s3 work correctly, i believe
<danilos> rogpeppe, that patch is already included in the package I believe (one I referenced above, 15-net-http-connection-close.patch)
<jamespage> rogpeppe, my understanding is the fix for that issue (which is already cherry picked into the packaging) is the one that caused the refression
<jamespage> regression rather
<jamespage> danilos, rogpeppe, fwereade: I still really need a LP bug raising against ubuntu explaining what the current impact is
<rogpeppe> jamespage: the current impact of that go distribution on juju? or on anything?
<jamespage> rogpeppe, well juju would be a good start
<jamespage> right now I just have "there is a bug"
<fwereade> jamespage, rogpeppe: that was also my understanding -- if it's practical to switch to an unfucked *1.0.2*, and that change is genuinely small and precisely targeted to the point where we can analyse its impact by inspection, then great
<fwereade> rogpeppe, would this be a sane characterisation of the core of the issue: "ubuntu doesn't have the real 1.0.2 because the upstream contains a bad patch"?
<rogpeppe> fwereade: yes
<rogpeppe> fwereade: but 1.0.2 does have the problem that it ignores the Close field.
<jamespage> fwereade, exactly - I can also unbreak it for quantal as well if it's targeted enough
<fwereade> rogpeppe, ok -- but fixing that in isolation will also cause us problems with the s3 package, right? ornot?
<fwereade> rogpeppe, (due to the ignored close field)
<rogpeppe> fwereade: i don't know - applying a random patch and ending up with a version that is again not *quite* one of the official released versions seems to me to be a bad idea
<fwereade> rogpeppe, sorry imprecise: s/that in isolation/the diff against real 1.0.2 in isolation/
<danilos> rogpeppe, that's what debian packages usually look like
<danilos> rogpeppe, and fwiw, I'd test with only this patch (http://pastebin.ubuntu.com/5591992/) added on top of the existing package, since we have been using debian-patched 1.0.2 for a while now already
<rogpeppe> danilos: that's from here, right? https://code.google.com/p/go/source/detail?r=c3cbd6798cc7
<danilos> rogpeppe, yeah, it's verbatim "hg diff -c c3cbd6798cc7"
<danilos> rogpeppe, note that https://code.google.com/p/go/source/detail?r=820ffde8c396# is already included in the deb package
<rogpeppe> danilos: yeah
<rogpeppe> danilos: looking at the list of issues fixed between 1.0.2 and 1.0.3, i see none that might impact us, other than one which might actually fix potential bugs in our code.
<danilos> rogpeppe, it's still a risk imo (even fixes break stuff :)), but I am not the release manager for either juju-core or ubuntu go package, so I won't comment further
<rogpeppe> danilos: yes, fixes can break stuff, but we have done quite a bit of testing against 1.0.3 actually.
<rogpeppe> danilos: i suppose if go1.0.3 is considered too risky, then the above patch would be better than nothing.
<danilos> rogpeppe, yeah, it will also have the benefit that it's easier to include in raring so we don't have to keep a PPA with go 1.0.3 for people to use to compile juju-core
<rogpeppe> danilos: but what about this one? https://code.google.com/p/go/source/detail?r=4c333000f50b
<fwereade> danilos, won't we need one of those for precise at least regardless?
<rogpeppe> danilos: hmm, actually, that's server only
<rogpeppe> danilos: no, i lie, it's not
<danilos> rogpeppe, there's transport stuff in there as well
<rogpeppe> danilos: yeah
<danilos> rogpeppe, I'll test only with the first patch, if that doesn't help, then it'd be better to compile for 1.0.3
<rogpeppe> danilos: this feels a bit like grasping for straws
<danilos> fwereade, I don't know, since I have no idea how do version numbers compare and whether we can SRU this in precise (https://launchpad.net/ubuntu/precise/+source/golang/2:1-5)
<rogpeppe> danilos: we have a well known version that fixes the issue and is the most well tested version of go overall
<danilos> rogpeppe, that's how packaging works, yes :)
<danilos> rogpeppe, I remember niemeyer was saying about problems with 1.0.3 and was-it-juju-gui?
<danilos> "saying something"
<rogpeppe> danilos: there is one known problem with trying to hack http redirects in 1.0.3, yes
<danilos> rogpeppe, btw, for this particular problem, do you know if there's any reason we are using connection:close (since keep-alive worked much faster for me with the Asian Amazon zone)
<rogpeppe> danilos: the problem is that in general you're not allowed to keep on reusing s3 connections.
<rogpeppe> danilos: it can break after 3 or 4 tries
<danilos> rogpeppe, ah, I see, more of an amazon policy rather than technical?
<rogpeppe> danilos: i think so, yes
<danilos> rogpeppe, right, understood
<danilos> rogpeppe, anyway, if people don't see the value in me testing this patch and we don't want to ask for it to be included with raring, I won't waste my time doing it
<rogpeppe> danilos: if that's the way things are usually done, and we won't get a fix any other way, then i think it's worth doing
<rogpeppe> danilos: i would much prefer to be using a well known and tested standard version though.
<fwereade> danilos, I don't believe we can adequately test it in time -- I don't see how you *can* without hacking other parts of juju to re-expose the issue
<fwereade> danilos, and while that's what we'll have to do tomorrow or the day after or whenever, I don't think it's a viable strategy in which the hours we have before we can land anything continue to tick down through the single digits
<danilos> fwereade, it was failing consistently for me with ap-southeast-1 region, and so was http://pastebin.ubuntu.com/5721759/
<fwereade> danilos, is it still doing so today?
<fwereade> danilos, sorry, is that the right link? says it doesn't exist
<danilos> fwereade, sure, but my point is that we can get this fixed in *ubuntu* in the next couple of days so our _users_ would get the benefit of being able to build a working juju-core package out of nothing but standard ubuntu packages soon
<danilos> fwereade, it was, but it seems ubuntu pastebin removes stuff (I had it open since Friday): new one on http://pastebin.ubuntu.com/5592035/
<danilos> fwereade, I'll try that to confirm that it was not the 8k limit
<danilos> fwereade, no, it doesn't fail anymore
<danilos> fwereade, so I suppose it is a moot point
<fwereade> danilos, it's still an issue but it's not one biting us *today*
<danilos> fwereade, right, and I guess we can keep it under control by not allowing our bucket to grow too long? and then we can take the time to resolve the issue properly
<fwereade> danilos, I am not proud of the workaround but I think it's the only one with predictably bounded second-order effects
<fwereade> danilos, exactly
<fwereade> danilos, releasing i386 versions tightens the window, but we can relax it a little by deleting the more recent 1.9.*s
<fwereade> danilos, I am more confident that we can keep it out of users sight until we fix it for real than I am that we can fix it for real without unintended consequences given the timescale and associated pressure
<danilos> fwereade, sure, agreed
<fwereade> so... rogpeppe, TheMue, mgz, jam, danilos, wallyworld: aside from the ap-southeast security group weirdness, is anyone aware of any trivially-fixable outstanding issues that would directly impact users if we were to release right now?
<fwereade> (the security groups are IMO not trivially fixable)
<rogpeppe> fwereade: the security groups could be fixed quite easily, but not in the way proposed
<rogpeppe> fwereade: i could prepare a patch quickly, but i fear it would need more testing time than we've got to make sure it works properly against all regions
<fwereade> rogpeppe, yeah, that's the heart of it
<fwereade> rogpeppe, I don't want to fix one region at the cost of another
 * rogpeppe finds it very strange that amazon apparently implemented the same software many times independently
<fwereade> rogpeppe, I'd rather release with "ap-southeast-1 and ap-southeast-2 cannot currently be used"
<rogpeppe> fwereade: well, the fix would use mechanisms that are already used in other regions.
 * fwereade hasn't looked under the covers but is terrified we're doing something crazy like always using "/current" api versions
<rogpeppe> fwereade: no, istr changing it to use a fixed version
<fwereade> rogpeppe, <3
<fwereade> rogpeppe, it's a shame it only hit us just now but I think we're over the line
<rogpeppe> fwereade: hmm, the fixed version was just for the metadata
<fwereade> rogpeppe, aw feck
<rogpeppe> fwereade: ah, but all is not lost
<rogpeppe> fwereade: goamz/ec2 hardcodes 2011-12-15
<jam> d
<jam> mgz: /wave
<fwereade> rogpeppe, ok, great :)
<fwereade> right
<fwereade> ok
<fwereade> I assert that we should bump the version, and release what we have right now, right now
<rogpeppe> fwereade: +1
<fwereade> mgz, ping
 * fwereade slopes off for a quick ciggie to see whether anything else plops into his mind, brb
<mgz> hey jam
<mgz> fwereade: we should probably just do a release, though I'm still not sure what exactly we should do with it
<fwereade> mgz, I think we should release 1.10.0 and put that into raring, then switch to 1.11.0 and GTW on the various things we haven't managed -- I feel like your statement alludes to things I have not thought of?
<fwereade> mgz, I suspect there are maybe things to do with the "series" of juju-core that should be done, but this is completely outside my ken at the moment
<mgz> so, we can't bump the go version, though I think we landed another fix for the issue that helped with? and we still need to get stuff past the archive admins.
<fwereade> mgz, we papered over the go version one by trashing old releases in the juju-dist bucket
<fwereade> mgz, I don't think there's anything else we can fix today that will affect users
<mgz> ah, that was the swift bucket listing one
<fwereade> mgz, s3 but yeah
<mgz> ho ho ho, that's the reverse getting the names backwards from normal
<mgz> openstack is winning
<fwereade> haha
<fwereade> mgz, can I leave the release in your hands then, and inform Daviey and jamespage that you will have 1.10.0 for them imminently?
<mgz> yeah, though I also need to argue with the release guys for the other packaging as well.
<mgz> so, nothing landed after the tagging on 1.10.0 that we want in?
<mgz> the various bugs were all otherwise worked around?
<fwereade> mgz, we reverted that actually
<fwereade> mgz, but we didn't release anything from the briefly-1.10.0 source
<fwereade> mgz, so I think we're good to just bump and go
<fwereade> mgz, there were a couple on friday that I can't even remember today :/
<fwereade> mgz, but neither I nor anyone else can AFAICT think of any way to make the release *definitely* better in the next couple of hours
<fwereade> mgz, so I see no further reason to delay
<fwereade> mgz, sorry, I missed that: the "other packaging"?
<mgz> the python juju changes to make go juju installable under the same names
<fwereade> mgz, ah, hell, I had understood that that was already in hand
<fwereade> mgz, is it not?
<fwereade> mgz, if it isn't it kinda feels like we're irreversibly boned regardless...
<mgz> the upload was rejected, and I need an archive admin's attention to get any new packages in
<fwereade> mgz, well, crap
<fwereade> mgz, that seems to justify a certain amount of frothing and screaming -- do you know to whom we should be directing it?
<fwereade> mgz, or is it in fact just a straight-up can't-be-done sort of issue?
<mgz> the faceless bureaucracy of ubuntu, but don't froth, I'm on it.
<fwereade> mgz, ok, cool
<fwereade> mgz, so, the things we need are (1) bump to 1.10.0 (2) release 1.10.0 (3) bump again to 1.11.0 (4) get juju 0.7 into raring (5) get juju 1.10 into raring
<mgz> yup.
<mgz> I'm currently on 4, and will move onto 5 after
<fwereade> mgz, ok, I will propose a bump to 1.10.0
<jamespage> mgz, Daviey is looking at the rejected package
<Daviey> mgz: Should i be looking at the rejected one.. or wait out?
<mgz> I'll forward the email I got from stephane
<mgz> I have a trivial diff that I think is all he wants
<mgz> jamespage: what should I use to test install of a deb I've just built in canonistack? schroot?
<jamespage> schroot is good
<jamespage> (generally what I do)
<mgz> can you give me a quick example of your procedure? I wasn't watching closely enough last time you did it
<fwereade> anyone want to give me an LGTM on https://codereview.appspot.com/8759045 for form's sake? :)
<mgz> done.
<TheMue> fwereade: done
<fwereade> mgz, ok, that is submitted
<mattyw> rogpeppe, can you spot anything odd we're doing here? https://pastebin.canonical.com/89643/
 * rogpeppe gets his 2-factor key
<fwereade> mattyw, is there another package-level Test* function that doesn't use MgoTestPackage by any chance?
<rogpeppe> +1
<rogpeppe> that was my question too
<mattyw> fwereade, there definitely is yes
<fwereade> mattyw, that'd be the problem then, drop that and use this, I think
<rogpeppe> mattyw: you should only have one top level Test function
<mattyw> fwereade, rogpeppe I'll move it, thanks guys
<rogpeppe> mattyw: np
<fwereade> mattyw, cheers
<fwereade> ok
<wallyworld> fwereade: just got back from soccer and saw your question - i'd love to get the constraints stuff for openstack in the release, but maybe it's too late?
<fwereade> wallyworld, I'd love to too but I thought it was too late on friday really
<wallyworld> ok, no problem
<fwereade> wallyworld, we can get that into the first server-side update and get a lot of value
<wallyworld> i think we are doing a 1.10.1 release soonish anyway?
<fwereade> wallyworld, I think
<fwereade> wallyworld, that is the plan
<fwereade> wallyworld, as far as I am concerned we are now frozen
<wallyworld> fwereade: also, i really see a lot of value in the image-id constraint. can we discuss sometime?
<fwereade> wallyworld, definitely -- it's a use case we should have in mind, but I really don't think it's a constraint
 * TheMue is at lunch, bbiab
<wallyworld> fwereade: did you want to pop in on Blue's standup soon?
<fwereade> wallyworld, honestly I don't think I can do it usefully -- I ran out of energy on thursday, friday was past my limit, and my first weekend off in a month was not enough to reset me
<mgz> bah, this is still borked
<wallyworld> ok no problem :-)
<wallyworld> fwereade: we can discuss later this week perhaps
<fwereade> wallyworld, I'd love to
<wallyworld> sure. i am off thursday so maybe wednesday or tomorrow
<fwereade> wallyworld, weds would be perfect
<mgz> ah, no, probably just a bash quirk
<wallyworld> ok, wed it is
<fwereade> wallyworld, cheers
<fwereade> mgz, will you be doing the 1.10.0 release or should someone else pick that up for you?
<fwereade> mramm2, heyhey
<dimitern> jam, i'm very sorry i missed the 1-1 today
<wallyworld> fwereade: i'm sad you rejected my mp. i agree with rogpeppe about binary compatibility. i'd like to think that the landing bot would do the tests on precise, but we really need to run "bleeding edge"  tests locally on the latest series to minimise risk for future deployments
<mgz> fwereade: what does the release consist of in this case? build, tools to public bucket, and announcement on list?
<wallyworld> fwereade: i guess we need to add that to our wednesday agenda :-)
<mgz> note for self for next time... `sudo schroot -c juju -u root` as I don't know how to easily configure fancy permissions...
<fwereade> mgz, yes please, I think that's it -- maybe a note that 1.9.* will be removed imminently, and actual removal of 1.9.* in a day or 2
<markramm> mgz: yes those things seem to be the sum of what doing a release for go juju require ;)
<fwereade> wallyworld, I consider the failure entirely mine, it's fundamentally about communication
<fwereade> wallyworld, I'm not sure what's meant by the binary compatibility question but I'm afraid I'm done thinking for today
<wallyworld> fwereade: i don't think you failed. but that aside, in past projects and fundamentally, i think local devs should test against the latest system release while the landing bot tests against the stable release
<fwereade> gents, you're all great, but I'm wiped out for now
<wallyworld> fwereade: ok, talk later
<fwereade> wallyworld, cheers :)
<dimitern> jam, mgz: mumble poke
<mgz> ta
<danilos> jam: stand-up time
 * dimitern lunch
<hazmat> anyone familiar with this error, "2013/04/22 08:50:19 ERROR command failed: no CA certificate in environment configuration"
<rogpeppe> we really need to upload the 1.10 tools
<rogpeppe> currently we can't use juju bootstrap with the tip version without using --upload-tools
<rogpeppe> llog
<mgz> rogpeppe: building from recipe in the ppa now
<rogpeppe> mgz: thanks.
<rogpeppe> mgz: i wonder if every time we upload a new version, we should delete the oldest version currently in the bucket
<rogpeppe> mgz: to avoid running afoul of the bug
<mgz> right now, probably
<mgz> is it always safe to remove old tools, even if there are environments currently running them?
<mgz> I guess not if we're good about major/minor compat discipline
<rogpeppe> mgz: there's probably a window during which a started instance can fail because the provisioner finds some tools and puts the url in the cloud-init, only for the tools to be removed before they get chance to run
<rogpeppe> mgz: and there's the compatibility issue too. but i hope we're going to be much better about that from now on.
<mgz> so, I don't have creds to the juju-dist thing as far as I can find in my email archive, but it's pretty trivial for anyone else to download the debs and run the release-public-tools script
<rogpeppe> yay, look at these ping times http://paste.ubuntu.com/5592594/
<dimitern> mgz: no, i haven't brought that cloudinit workaround for raring to smoser's attention
<dimitern> mgz: maybe we should though, you're right - afaics it works, tested live and all, but some subtleties might have escaped me
<mgz> let's bug him and see
<dimitern> can you do this please - since you're handling the release anyway?
<mgz> done, though, as stated^ I'll need someone else to upload the binaries to the public bucket(s)
<dimitern> rogpeppe: kanban?
<rogpeppe> dimitern: i'll give it a go. network connect still v dodgy.
<gary_poster> hey niemeyer.  _mup_ seems to be sleeping, at least on #juju-gui.  Could you wake him up?
<niemeyer> gary_poster: WIll check it out
<gary_poster> thank you
<dimitern> gary_poster: _mup_ was on crack here as well - not answering to bug 0123456 as well since few days now
<gary_poster> dimitern, yeah, I thought I saw that.  thanks for the confirmation :-)
<dimitern> (it seems it still is)
<niemeyer> gary_poster: Launchpad seems to have changed the bug URL that mup looks for
<gary_poster> niemeyer, ah :-/
<gary_poster> thank you for investigating niemeyer.  Is that something you plan to address soon, or no time?
<niemeyer> gary_poster: I'd like to see if fixed for sure.. I'm having a deeper look just now
<gary_poster> cool
<gary_poster> thanks again
<niemeyer> gary_poster: np
<gary_poster> hey jam.  I think you had talked about adding tarmac support to lbox.  Do you know if that has gone anywhere beyond that initial statement of idea?  We would like that for the GUI.
 * jamespage high 5's mgz
<jamespage> py juju accepted....
<mgz> woho!
<mgz> right, now for go...
<mramm> jamespage: mgz: AWESOME!
<niemeyer> gary_poster: Found the issue with wgrant's and andreas' help
<niemeyer> gary_poster: We're taking the chance to update the machine as well
<gary_poster> niemeyer, fantastic
<niemeyer> gary_poster: mup will be awaken soon :)
<gary_poster> heh cool :-)
<rogpeppe1> i'm off now
<rogpeppe1> see y'all tomorrow
<ahasenack> guys, with go-juju, how are the open-port, close-port, etc commands inserted into $PATH?
<ahasenack> I just deployed a service and got
<ahasenack> 2013/04/22 19:14:24 INFO worker/uniter: HOOK /var/lib/juju/agents/unit-lds-quickstart-0/charm/hooks/install: line 329: open-port: command not found
<ahasenack> I logged in, and these tools are in /var/lib/juju/tools/machine-1/
<ahasenack> is that in the hook's shell env?
<ahasenack> ok, it's the charm that changes PATH
<smoser> please can we have https://codereview.appspot.com/8648047/ reverted?
<smoser> this problem should go away "forever" after released images on thursday.
<smoser> see my comment in that bug for more information.
<smoser> mgz, ^
<mgz> smoser: I'll propose that.
<mgz> hm, the r1111 change to make .lbox.check actually verify the build works is going to screw me
<mgz> because the build has *never* worked for me
<mgz> it just doesn't fail in an important manner...
<thumper> morning
<thumper> man, perhaps I should have a coffee before tackling the emails...
 * thumper goes to make that coffee
#juju-dev 2013-04-23
<m_3> davecheney: yo
<davecheney> m_3: wazzup ?
<m_3> davecheney: just wanted to touch base with you on scaling
<m_3> davecheney: make sure you weren't waiting on me for anything
<davecheney> m_3: nope
<davecheney> did you use the test harness on friday ?
<davecheney> i was going to bring it up to the latest changes and try the 300 node test again
<davecheney> HP cloud totally shat itself once I got to 300 nodes last week
<m_3> davecheney: saw email on it, but didn't get a chance to play
<m_3> took a real weekend :)
<davecheney> m_3: no probs
<davecheney> you should continue to take that weekend
<m_3> haha
<m_3> yeah, sprinting this week
<m_3> conference last
<m_3> conference next
<m_3> then another sprint
<m_3> whoohoooo!  party
<davecheney> shit, you travel more than I do
<davecheney> you should sell your house, or rent it out and live in your basement
<m_3> haha, yeah
<m_3> it's utah, so it's cheap
<davecheney> houses in australia don't have basements
<davecheney> and if they did
<davecheney> we'd convert them into garages
<m_3> we do rack up the hilton points though
<m_3> unfortunately it's all in my wife's name.... so I'm always Mr Martin
<m_3> not Mr Mims
<m_3> and not even Dr Martin
<davecheney> shitter
 * thumper has installed the ec2 command line tools to try to work out the difference between us-east-1 and ap-southeast-2
<thumper> yay
<davecheney> thumper: digging in boto is also a good way to figure out how this shit is supposed to work
<thumper> boto?
<davecheney> python boto
<davecheney> _the_ python ec2 library
<thumper> ah
 * davecheney is really wishing he hadn't let the TA talk him into flying via LAX next week
<thumper> heh
 * thumper has AKL->SFO
<davecheney> but the only option for SYD -> SFO
<davecheney> was flying united
<thumper> yep, that sucks
 * thumper likes Air New Zealand
<davecheney> UA flying an original 1970 vintage 747-400
<davecheney> fuck that
<davecheney> so now I have a 2 hour layover in LAX to clear customs (who are on strike) and land on time (flight controllers are also on strike) and transfer to a domestic flight
<davecheney> soooo not going to happen
<thumper> why is everyone striking?
<davecheney> furlough
<davecheney> air traffic controllers are paid with federal money
<davecheney> and it's run out
<davecheney> http://www.13wham.com/news/local/story/FAA-Furloughs-Kick-In-But-Flights-Come-In-On-Time/81mfoekV-0SKgJ8HusTqMA.cspx
<thumper> oh yay
 * thumper has just realised a big problem...
<thumper> well, bigish
<thumper> FSVO big
<davecheney> do tell
 * thumper tries to work this out
<thumper> when running tests we have MakeFakeHome
<thumper> which writes the environments.yaml into the juju home dir
<thumper> however MakeFakeHome doesn't take JUJU_HOME into account, but the code to work out the juju home to write the config file into does
<thumper> so if any dev has JUJU_HOME set
<thumper> the tests will break their environment
<davecheney> oops
 * thumper enfixorates
 * thumper files bug first
 * thumper thinks about how to side step this panic
<rogpeppe1> mornin' all
<TheMue> morning
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: TheMue | Bugs: 2 Critical, 64 High - https://bugs.launchpad.net/juju-core/
<smgz> I need a floating ip on canonistack, has anyone got one I can have?
<dimitern> mgz, wallyworld_, jam, danilos: sorry guys I might be 5m late for the standup
<wallyworld_> ok
<jam> np
<danilos> jam, dimitern, wallyworld_: I am not sure I'll be able to get mumble to work here at all :/ I think I'll drop out to find a new PSU instead of trying to fix the existing one
<wallyworld_> what's wrong with it?
<jam> wallyworld_: he is on his laptop, and having trouble getting it working. The PSU apparently blew out some capacitors
<danilos> jam, wallyworld_: I have mumble working here but with a high-pitched noise (like echo) coming in from the built-in microphone, and I am sure nobody wants to listen to that
<wallyworld_> danilos: you can maybe just listen in
<jam> danilos: yeah you can sit in and listen
<smgz> so, I can't connect to the standard mumble port right now
<smgz> it appears to be a day of mumble fail
<smgz> if there's a public server on 443 I could get to that instead
<danilos> smgz, have you tried setting up an ssh tunnel?
<wallyworld_> jam: forgot to mention - i had goose bot issues as well - it wouldn't land stuff and ssh to the goose bot machine failed
<jam> wallyworld_: yeah, I haven't been able to ssh into canonistack instances for a while, I was hoping mgz would have some insight, but he only just got back.
<wallyworld_> jam: so i justed landed my branch manually
<jam> wallyworld_: right, I've been doing that for the last few patches until we get goose-bot sorted out again.
<wallyworld_> ok, glad it's not just me then :-)
<smgz> danilos: have one now, it's... not going to be much good for voice packets
<jam> udp over tcp, great latency and reliability at the same time
<wallyworld_> dimitern: thanks for the review. i copied that imagesFields() function from ec2 - i'll look for a reference to the file format and point to that
<smgz> also, because of lack of floating ips, I'm going hpcloud->chinstrap->canonistack
<smgz> were I just in the datacenter, it might be more practical
<dimitern> wallyworld_: cheers!
<rogpeppe1> test
<ahasenack> hi guys, I noticed that with go-juju the binaries like open-port, close-port, juju-log and others are in /var/lib/juju/tools/machine-1/
<ahasenack> I suppose you change PATH, so that all hooks can easily find them, right?
<rogpeppe1> ahasenack: that's right
<ahasenack> I just stumbled upon a charm/script that sanitizes PATH to the usual bin paths, and then gojuju broke
<rogpeppe1> ahasenack: interesting
<rogpeppe1> ahasenack: where are those executables in py juju?
<ahasenack> rogpeppe1: /usr/bin
<rogpeppe1> ahasenack: i'm not sure there's anything we can do about that unfortunately
<rogpeppe1> ahasenack: we can't put binaries in /usr/bin
<ahasenack> why? Ubuntu policy?
<rogpeppe1> ahasenack: because we allow upgrades
<smgz> we can fix the script...
<rogpeppe1> ahasenack: and there can be more than one version of the tools binaries on the same machine
<smgz> or symlink probably, but really fiddling with path shouldn't be needed
<ahasenack> one per deployed unit, basically?
<rogpeppe1> ahasenack: one for each agent - that is one per deployed unit and one for the machine agent
<ahasenack> rogpeppe1: hm, unrelated, but did bootstrap with raring break again? It was working yesterday
<rogpeppe1> ahasenack: yes, it's just been deliberately broken again
<rogpeppe1> ahasenack: pending a proper fix
<ahasenack> ok
<rogpeppe1> ahasenack: which is an outstanding bug in raring
<ahasenack> isn't raring going to be released tomorrow?
<ahasenack> no, 25t
<ahasenack> 25th
<ahasenack> hardy dies tomorrow
<ahasenack> so, hm, I can't really test my PATH fix in this charm
<rogpeppe1> ahasenack: try with a slightly older version of juju-core
<ahasenack> the ppa gets rid of the older ones
<ahasenack> I can postpone this, np
<smgz> ahasenack: you can use the version currently in the ppa
<ahasenack> smgz: it's what I had installed, bootstrap wasn't working
<dimitern> rogpeppe1: kanban?
<rogpeppe1> dimitern: i'll see how the network works today...
<rogpeppe2> darn it!
<TheMue> rogpeppe2: no chance
<rogpeppe2> phone company says it might be fixed tomorrow. i'm not gonna hold my breath
<rogpeppe2> dimitern, mramm: sorry, i'm giving up. my mobile connection (previously quite reliable) doesn't like me now either
<mramm> rogpeppe2: that's fine
<mattyw> rogpeppe3, ping?
<rogpeppe3> mattyw: pong
<mattyw> rogpeppe3, I'm trying to run a unit test using the MgoTestPackage stuff in juju-core, do we have a version of mongo that will run on a 32 bit machine?
<rogpeppe3> mattyw: i'm not sure
<mattyw> rogpeppe3, I think we might be ok actually, we can get the enterprise edition of mongo free for development
<rogpeppe3> mattyw: cool
<hazmat> rogpeppe3, is the juju-core api exposed or is it only available internal to the env?
<hazmat> in a security group sense
<rogpeppe3> hazmat: it's exposed
<hazmat> rogpeppe3, cool thanks
<rogpeppe3> hazmat: are you planning on using it?
<hazmat> rogpeppe, yes.. i've got rewriting launchpad.net/juju-deployer on my plate for this week
<hazmat> its mostly cli forks.. but seems like a nice opportunity to use the api.
<rogpeppe> hazmat: staying in python, or rewriting in Go?
<hazmat> rogpeppe, not sure yet
<hazmat> might be better in go, if there's a desire to move into core
<rogpeppe> hazmat: it's a small enough rewrite that rewriting in Go is probably a net win
<rogpeppe> hazmat: 'cos you can use the api package directly
<rogpeppe> hazmat: you might want to consider putting it under https://launchpad.net/juju-utils i suppose
<hazmat> rogpeppe, thanks for the link
<rogpeppe> hazmat: speaking of which, here's the current way to open an API client from Go. i plan on moving some form of this function into core: http://bazaar.launchpad.net/~juju/juju-utils/trunk/view/head:/cmd/juju-wait/main.go#L134
<rogpeppe> hazmat: PS spot the deliberate mistake :-)
<hazmat> rogpeppe, envname ;-)
<rogpeppe> hazmat: indeed
<hazmat> rogpeppe, i've had problems using any juju-core env.. any subsequent command post bootstrap does.. error: no CA certificate in environment configuration
<hazmat> rogpeppe, do you know what might cause that ?
<rogpeppe> hazmat: hmm, sounds like the CA certificate isn't being saved or loaded correctly
<hazmat> i am using JUJU_HOME to separate pyjuju/gojuju envs
<rogpeppe> hazmat: are you using the env from different machines?
<hazmat> rogpeppe, nope
<hazmat> rogpeppe, same machine, multiple trunk builds w and wo upload-tools, multiple regions.. etc.
<rogpeppe> hazmat: there has been quite a bit of churn in that area since i touched it. i'll just have a look.
<hazmat> might just be an issue with JUJU_HOME
<rogpeppe> hazmat: that's what i think
<rogpeppe> hazmat: it's very new
<rogpeppe> hazmat: yes, that's the bug
 * hazmat reports a bug
<rogpeppe> hazmat: it always saves it to $HOME/.juju
<hazmat> and reads from $JUJU_HOME
<rogpeppe> hazmat: yup
<rogpeppe> hazmat: the bug is in environs.WriteCertAndKeyToHome
<hazmat> bug 1171910
<_mup_> Bug #1171910:  JUJU_HOME is broken, writes cert to wrong location <juju-core:New> <https://launchpad.net/bugs/1171910>
 * rogpeppe welcomes _mup_ back
<hazmat> is there an lp2kanban instance running on the juju core board?
<hazmat> rogpeppe, fix in review
<rogpeppe> hazmat: ha, i've done one too
<hazmat> doh.. bad coordination
<rogpeppe> hazmat: https://codereview.appspot.com/8839047
<rogpeppe> hazmat: very very slow upload speed here
<hazmat> rogpeppe, i went minimal.. https://code.launchpad.net/~hazmat/juju-core/fix-juju-home/+merge/160423
<rogpeppe> hazmat: if poss, we generally go through codereview
<hazmat> rogpeppe, understood, having a credential issue with google atm
<rogpeppe> hazmat: i think that fix isn't quite right - the authoritative source for juju home is config.JujuHome()
 * rogpeppe is still uploading
<rogpeppe> 56Kbaud FTW
<hazmat> rogpeppe, ack.. one minor with your branch is that it loses the cert write test if JUJU_HOME isn't set.
<rogpeppe> hazmat: it will panic, actually
<rogpeppe> hazmat: if config.SetJujuHome hasn't been called
<rogpeppe> hazmat: but MakeEmptyFakeHome calls it
<rogpeppe> hazmat: hmm, sorry ignore me
<rogpeppe> hazmat: speaking without looking
<rogpeppe> hazmat: i'm not sure what you mean then
<rogpeppe> hazmat: you mean it doesn't test what happens if $JUJU_HOME isn't set?
<hazmat> rogpeppe, yes
<hazmat> rogpeppe, but its not clear if that's auto setup via config to HOME and tested elsewhere
<rogpeppe> hazmat: the right place to test whether config.SetJujuHome is called is in cmd/juju
<rogpeppe> hazmat: the JUJU_HOME env var isn't part of the contract of the environs package
<hazmat> ack
<rogpeppe> hazmat: there is no test, but i'm disinclined to grumble. the call to InitJujuHome is at the start of Main.
<rogpeppe> hazmat: and juju.InitJujuHome itself is well tested
<rogpeppe> hazmat: and that's where the $JUJU_HOME / $HOME logic is
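The $JUJU_HOME / $HOME fallback just described, and the asymmetry behind bug 1171910 (reads honoured $JUJU_HOME, writes went to $HOME/.juju), can be sketched in shell — the `juju_home` function here is illustrative, not juju-core's actual code:

```shell
#!/bin/sh
# Sketch of the authoritative juju-home resolution: honour $JUJU_HOME
# when set, otherwise fall back to $HOME/.juju. The bug was that the
# cert-writing path skipped this and always used $HOME/.juju.
juju_home() {
    if [ -n "$JUJU_HOME" ]; then
        printf '%s\n' "$JUJU_HOME"
    else
        printf '%s\n' "$HOME/.juju"
    fi
}

JUJU_HOME=/tmp/gojuju-env
juju_home    # prints /tmp/gojuju-env
unset JUJU_HOME
juju_home    # prints $HOME/.juju
```

Funnelling every read *and* write through one resolver like this is what "the authoritative source is config.JujuHome()" amounts to.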
<rogpeppe> i'm done for the day
<rogpeppe> see y'all tomorrow
<TheMue> rogpeppe: cu, and much luck with your phone line
<gary_poster> hi.  I want to announce the GUI Juju core support today.  What is the end user expected to do to start up juju core?  I'd like to try it from that direction and make sure it works
<gary_poster> ppa?
<mramm> gary_poster: that sounds awesome
<gary_poster> :-)
<mramm> yep, ppa
<mramm> https://launchpad.net/~juju/+archive/devel is the latest
<gary_poster> mramm, cool, thanks that's what I was hunting for.
<ahasenack> gary_poster: is bootstrap working on raring yet with juju-core?
<ahasenack> it wasn't earlier today
<gary_poster> ahasenack, my raring machine is still updating :-P .  are you getting "error: no matching tools available"?
<ahasenack> gary_poster: yes
<ahasenack> gary_poster: they said it was broken on purpose today until the right fix lands
<ahasenack> gary_poster: so maybe you should check before announcing
<gary_poster> ahasenack, oh! ok.  Yeah id does that on quantal too.  I was wondering if they expected people to use --upload-tools until release or something
<gary_poster> s/id does/it does/
<gary_poster> mramm do you know of a timeline when that will be resolved?  I agree with ahasenack that it would be nicer to announce when the story is smoother
<sidnei> hola! heard rumours that the go version has reached feature-parity with the python version. is this what the announcement is about?
<gary_poster> sidnei, my announcement would be that the GUI works fully with juju core, with feature parity.
<sidnei> ah, so pretty close
<gary_poster> sidnei, the other announcement may be waiting in the wings, dunno :-)
<sidnei> does juju-core work with canonistack yet? or does it still need public ips?
<gary_poster> I have not tried that.  Still using pyjuju for that
<mramm> gary_poster: I think mgz was working on it
<mramm> gary_poster: I'll check in with him first thing in the morning (his time)
<mramm> so I'll know before you get in to work ;)
<mramm> sidnei: go juju works with canonistack
<sidnei> mramm: no more public ips issues?
<mramm> sidnei: but it does not have constraint enforcement for openstack yet (code in a branch, will be merged in the next release)
<gary_poster> thanks mramm
<mramm> sidnei: well, there are public IP issues on canonistack, but there are ways to get around it built in
<mramm> now
<sidnei> mramm: not even instance-type for constraints?
<mramm> there will be mem, cpu/arch constraints (cross platform constraints) as soon as john's branch lands
<mramm> but there was a lot of arguing about provider specific (instance-type) constraints, so those are not yet implemented at all
<sidnei> mramm: what about those workarounds to get around public ip issues, is it written down somewhere?
<gary_poster> mramm, probably already known, but fwiw, when working with the PPA, juju bootstrap --upload-tools works fine in raring and quantal, but juju deploy on raring is giving me "error:cannot log in to admin database: auth fails".  Works fine in quantal.  Maybe I am doing something wrong; will investigate later
<ahasenack> I couldn't get it to work with canonistack the other day, some error when trying to contact swift, I think 401
<ahasenack> but then the bootstrap errors started, and I didn't try again with canonistack
<mramm> sidnei: I believe it is written down, but I can't find it at the moment
<mramm> mgz and jam would know
<mramm> gary_poster: that seems like something I am not yet aware of -- will look into it
<gary_poster> thanks
<hazmat> hm.. if you have multiple relations, and one is in error, you can't tell which one has the error.
<hazmat> pip install jujuclient
#juju-dev 2013-04-24
<orospakr> I just tried deploying a charm (using juju-core) that requires the onieric series, but the VMs failed in the "no matching tools" state.  trying `juju sync-tools --all` just produced the same message.
<orospakr> here's how the world looks for me right now: https://gist.github.com/orospakr/b3a1950fbe80c61446b4
<davecheney> orospakr: we do no support onaric
<davecheney> sorry
<davecheney> you won't find any tools
<davecheney> yeesh
<davecheney> we do not support onieric
<orospakr> huh, okay. thank you.
<orospakr> silly question: when you juju bootstrap with juju-core, the bootstrap VM is running juju-core as well?
<davecheney> the first machine, machine 0 is running the mongodb service (what we call the state server)
<davecheney> it holds the knowledge of everything in that environment
<orospakr> ah, I knew about that. I thought a jujud ran on it as well.
<davecheney> orospakr: well a jujud is running on that machine
<davecheney> i'm not sure which question you are asking
<orospakr> does the first node also actively administer the other nodes, in response to your commands with the juju client, in addition to storing state in mongodb?
<davecheney> orospakr: yes and no
<davecheney> the first node also runs the provisioning agent which is the one that talks to the provider and creates new machines (nodes) if required
<orospakr> ah, that's the thing I'm thinking of. good. :)
<orospakr> so, now I've got this other issue: I can't destroy the wedged service/units/machines, because they're all inaccessible to juju.
<davecheney> orospakr: can you explain what you mean by inaccessible ?
<orospakr> the onieric-based service is in the "dying" state, and the units and machines alike are in the "pending" state.
<orospakr> s/"dying" state/life parameter is "dying"/
<davecheney> orospakr: have you tried juju terminate-machine ?
<orospakr> aye. [~]$ juju destroy-machine 2 3 4
<orospakr> error: no machines were destroyed: machine 2 has unit "couchbase/0" assigned; machine 3 has unit "couchbase/1" assigned; machine 4 has unit "couchbase/2" assigned
<orospakr> oof, excuse me. that didn't paste well.
<davecheney> i'd start with destroy-unit, destroy-server, etc
<davecheney> services
<davecheney> but at this point i'd just destroy the environment and start again if that is an option
<orospakr> it is, but I'm trying to stick it out on things like this so I can be confident that I can deal with failures when I do use this in production. ;)
<davecheney> i think destroy-unit and destroy-service take a --force
<davecheney> but they will create dangling references
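The error orospakr pasted — machines refusing destruction while units are still assigned — implies a bottom-up teardown order. A dry-run sketch using the names from this session (`run` just echoes the command so the ordering can be shown without a live environment; substitute `juju` for real use):

```shell
#!/bin/sh
# Dry-run teardown of a wedged service: units first, then the service,
# then the now-empty machines. `run` echoes rather than invoking juju,
# so this is runnable anywhere.
run() { echo "juju $*"; }

run destroy-unit couchbase/0 couchbase/1 couchbase/2
run destroy-service couchbase
run destroy-machine 2 3 4
```

Running `juju destroy-machine 2 3 4` first is exactly what produced "no machines were destroyed: machine 2 has unit ... assigned" above.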
<davecheney> m_3: ping
<davecheney> why did the PA restart ? http://paste.ubuntu.com/5597298/
<davecheney> m_3-droid: i'm a tit
<davecheney> the problem isn't the HP endpoint, it is the number of concurrent connections to the mongo server running on machine/0
<davecheney>      ââmongodâââ828*[{mongod}]
<davecheney>      ââmongodâââ9*[{mongod}]
<davecheney> oops
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1172110
<davecheney> shit, we're using 3 mongo connections per machine
<jam> davecheney: because of the different agents?
<davecheney> i suspect so
<davecheney> it isn't exactly 3 x
<davecheney> 450 connections == 156 machines
<davecheney> the mgo drivers' peer probing isn't helping
 * davecheney really hates that feature
<davecheney> jam: https://codereview.appspot.com/8931044/
<davecheney> jam: Apr 24 05:30:51 juju-goscale2-machine-0 mongod.37017[9222]: Wed Apr 24 05:30:51 [conn24025] end connection 15.185.174.63:33866 (599 connections now open)
<davecheney> Apr 24 05:30:53 juju-goscale2-machine-0 mongod.37017[9222]: Wed Apr 24 05:30:53 [initandlisten] connection accepted from 15.185.173.39:34560 #24049 (600 connections now open)
<_mup_> Bug #24049: KDE systemsettings doesn't have a file association component <kde-systemsettings (Ubuntu):Fix Released> <https://launchpad.net/bugs/24049>
<davecheney> ^ this is what you get from mongo
<davecheney> compared to how much we log
<davecheney> it's not a lot
<davecheney> #1
<davecheney> #1234
<davecheney> #12345
<_mup_> Bug #12345: isdn does not work, fritz avm (pnp?) <isdnutils (Ubuntu):Fix Released by doko> <https://launchpad.net/bugs/12345>
<davecheney> #9999
<davecheney> #10000
<_mup_> Bug #10000: xserver-common: X crashed (signal 7) while scrolling in Mozilla <xorg (Ubuntu):Invalid by daniels> <xorg (Debian):Fix Released> <https://launchpad.net/bugs/10000>
<davecheney> bzzt
<davecheney> Apr 24 05:46:53 juju-goscale2-machine-0 mongod.37017[9222]: Wed Apr 24 05:46:53 [initandlisten] connection refused because too many open connections: 819
<davecheney> Apr 24 05:46:53 juju-goscale2-machine-0 mongod.37017[9222]: Wed Apr 24 05:46:53 [initandlisten] connection accepted from 15.185.176.240:44746 #55756 (820 connections now open)
<_mup_> Bug #55756: installer partman crash selecting swap partition <ubiquity (Ubuntu):Invalid> <https://launchpad.net/bugs/55756>
<davecheney> Apr 24 05:46:53 juju-goscale2-machine-0 mongod.37017[9222]: Wed Apr 24 05:46:53 [initandlisten] connection refused because too many open connections: 819
<davecheney> Apr 24 05:46:53 juju-goscale2-machine-0 mongod.37017[9222]: Wed Apr 24 05:46:53 [initandlisten] connection accepted from 15.185.177.196:51276 #55757 (820 connections now open)
<_mup_> Bug #55757: "Send a mail" to a contact list adds only the first contact to "To:" <deskbar-applet:Fix Released> <deskbar-applet (Ubuntu):Fix Released by desktop-bugs> <https://launchpad.net/bugs/55757>
<fwereade> morning everyone
<fwereade> how screwed up is everything today?
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: fwereade | Bugs: 2 Critical, 64 High - https://bugs.launchpad.net/juju-core/
<davecheney> fwereade: we're looking at a hard limit of ~300 machines or 800 agents (the smaller of the two)
<fwereade> davecheney, thanks, I just read your email -- definitely mongodb connections?
<davecheney> fwereade: absolutely
<davecheney> it says so in the mongo log
<davecheney> fwereade: also, https://codereview.appspot.com/8931044/
<davecheney> not critical
<fwereade> davecheney, that is ludicrous, isn't it? surely that's not as far as it can go?
<davecheney> the default limit is 80% of nfiles, which is 1024 on ubuntu
<davecheney> these defaults can be changed
<davecheney> during service deployment we average 3x the number of connections as machines
<davecheney> the number falls back to 2x after deployment
<davecheney> (although that may not be a successful deployment as I restarted mongo)
<davecheney> 2x sounds logical, 1 machine agent, 1 unit agent per service unit
<fwereade> davecheney, yep, agreed
<davecheney> the conn limit for mongo can be increased
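The 819 figure in the log lines above follows directly from the default davecheney mentioned: mongod caps connections at 80% of the process's open-file limit, and a stock Ubuntu nofile limit is 1024.

```shell
#!/bin/sh
# mongod's default connection ceiling is 80% of `ulimit -n`; with
# Ubuntu's default of 1024 file descriptors that gives 819 — exactly
# the "too many open connections: 819" in the machine-0 log.
nofile=1024
maxconns=$((nofile * 80 / 100))
echo "$maxconns"   # 819

# Raising it would look roughly like this (illustrative, not the juju
# startup script):
#   ulimit -n 4096 && mongod --port 37017 --maxConns 3000 ...
```

At ~3 connections per machine during deployment, that ceiling is consistent with the observed wall of roughly 300 machines.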
<davecheney> fwereade: the other problem is i'm seeing continual cpu spikes as 600 agents wake up
<davecheney> and probe each mongodb server
<davecheney> this is a very unhelpful feature
<fwereade> davecheney, heh, I had been less worried about that in particular but it goes to show you never can tell:/
<davecheney> at 2000 agents I estimate the mongodb server will be under constant polling pressure
<fwereade> davecheney, *if* we had an internal API that would be a non-issue but it doesn't help us today
<davecheney> which will probably fuck us with sockets in TIME_WAIT
<davecheney> fwereade: yes, that is the logical solution
<davecheney> federated mongodb might help
<davecheney> but the polling logic will still be a massive burden
<fwereade> davecheney, yeah, doesn't feel like it'd really address the issue
<rogpeppe> mornin' all
<dimitern> rogpeppe: morning
<rogpeppe> dimitern: hiya
<rogpeppe> fwereade: hope you've recovered a bit!
<fwereade> rogpeppe, yeah, more or less, although I remain baffled at the actual state of play -- I am pondering dave's scale tests for now and hoping that mramm will wake up in an hour or two and bring us the gift of clarity
<rogpeppe> fwereade: unfortunately my network connection is still borked, so i missed the kanban meeting yesterday, so i don't really know
<rogpeppe> fwereade: i don't think i've seen anything about dave's scale tests other than the conversation just above
<dimitern> rogpeppe: I think nobody knows for sure, except probably mgz
<fwereade> rogpeppe, I had a brief chat with him this morning, it seems we are hitting mongo connection limits at around 300 machines
<rogpeppe> fwereade: interesting. i'm not entirely surprised actually.
<rogpeppe> fwereade: need to get that internal API done. and HA on that.
<fwereade> rogpeppe, yep
<fwereade> rogpeppe, considering possible mitigations in the meantime
<rogpeppe> fwereade: anything particular in mind?
<fwereade> rogpeppe, still going through internal sanity-vetting ;p
<dimitern> hmm.. cmd/juju tests running time seems to have improved slightly
<fwereade> mramm, heyhey
<mramm> heyhey
<mramm> I am trying to get caught up on the packaging issues
<fwereade> mramm, great
<mramm> I thought things were moving smoothly until late yesterday when antonio informed me that the server guys did not think it would go in
<mramm> I still don't have a clear picture from them of what needs to happen, just some hand waving about "issues"
<fwereade> mramm, my understanding had been that monday was the actual razor-sharp cutoff in any case
<mramm> well, that was my understanding too
<mramm> but I thought we'd given them something by then, and I hear from them yesterday that the razor-sharp cutoff is actually today
<fwereade> mramm, but, regardless, if there is anything I can do to help I would be happy to; and if you do manage to glean some measure of clarity I, and others, would be most grateful for it
<mramm> looks like jamespage is on the case
<mramm> if you can join #server on canonical IRC that would be helpful
<mramm> so anybody that can help jamespage in some way today gets an extra gold star from me ;)
<davecheney> good evening
<davecheney> could I draw your attention to
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1172122
<_mup_> Bug #1172122: state/presence: FAIL: presence_test.go:253: PresenceSuite.TestWatchPeriod <juju-core:New> <https://launchpad.net/bugs/1172122>
<davecheney> and
<mramm> so, on subject of dave's e-mail
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1172121
<_mup_> Bug #1172121: environs/maas: multiple test failures <juju-core:Confirmed> <https://launchpad.net/bugs/1172121>
<mramm> haha
<davecheney> mramm: i can leave again if you want to talk about me
<davecheney> i don't mind, i like the idea of being popular
<davecheney> also, https://docs.google.com/a/canonical.com/document/d/1p_OzWxqxaXalHBI3ohkUsB9_iQBPSGWwM-qODm5FSbI/edit#
<mramm> haha
<mramm> so, I think we should talk about the internal API stuff sooner rather than later
<mramm> given that we are hitting scalability limits now
<fwereade> davecheney, did you update gomaasapi?
<davecheney> mramm: SGTM
<davecheney> fwereade: probably not
<fwereade> davecheney, I think that resolves it
<davecheney> right, i'll check that tomorrow
<fwereade> davecheney, I'll close it then, reopen if I'm wrong please
<davecheney> SGTM
<davecheney> what about state/presence ?
<fwereade> davecheney, I think I have no option but to do a deep dive there and try to figure out WTF is going on
<fwereade> davecheney, occasional failures from there have always been a thing, but not one that ever quite rose high enough to be looked at properly
<davecheney> has anyone else seen that problem ?
<fwereade> davecheney, it doesn't look entirely unfamiliar, but I thought we had a bug for it already; apparently not
<mramm> so, can somebody help jamespage out with updating our "release" package to build from local source rather than latest trunk, so the source gets included in the package
<mramm> we can pull the local source from trunk and then build
<jamespage> mramm, forget the package
<jamespage> I just want a release tarball of juju-core 1.10.0 that contains everything that is juju-core 1.10.0 aside from the packaging
<mramm> ahh
<mramm> cool
<mramm> so rogpeppe, TheMue, fwereade, jam, dimitern:  can one of you get that for jamespage now?
<rogpeppe> mramm: ok, i'll do that
<mramm> rogpeppe: thanks!
<rogpeppe> jamespage: presumably not including binaries, right?
<jamespage> rogpeppe, you got it
<TheMue> ah, just wanted to ask too
<davecheney> if you fancy a break to look at it
<mramm> morning mgz!
<mgz> hey! seen your email.
<jamespage> mgz, hey!
<mgz> so, there are a couple of issues for me:
<mgz> the update-alternatives bits didn't get merged into dave's packaging branch, but he did then add a manpage, so that needed updating
<mgz> I don't have rights to upload to the public bucket on ec2, nor have I done what dave normally does for releases
<mgz> and there's a change in 1.10 working around the updates-during-raring to upstart bug that I don't think we want to release with
<jamespage> mgz, I've merged all the various bits of packaging - lp:~james-page/ubuntu/raring/juju-core/1.10
<jam> mgz: I have some questions for you as well if/when you have some time.
<mgz> jam: sure
<jamespage> mgz, and raised a FFe - 1172215
<jamespage> bug 1172215
<_mup_> Bug #1172215: [FFe] Please include juju-core 1.10.0 in Ubuntu 13.04 <Ubuntu:New> <https://launchpad.net/bugs/1172215>
<jam> mgz: 1) it is known that 1.10.0 is in the ppa but not uploaded to ec2, so you can't actually bootstrap (no tools found)
<mgz> those are the only things I'm aware of.
<mramm> davecheney: can you help out with the tools upload stuff if you are still around?
<jam> 2) I still can't ssh to the original goose instance, as near as I can tell chinstrap isn't letting ssh get to lcy01 (it is working to lcy02, but I get No route to host trying to get to 10.55.60.94)
<mgz> any of the core guys should have rights, but no one responded when I poked the other day
<mgz> so maybe they don't?
<davecheney> mramm: sure, mgz do you want the creds ?
<jam> 3) I tried starting a new tarmac bot with similar config and juju-core doing the bootstrap to lcy02, but it seems charms:precise/tarmac uses puppet, and that just doesn't work now.
<davecheney> or I can do the push if you cna point me to the deb in the archive
<mramm> giving mgz creds makes sense to me
<mramm> lp:~james-page/ubuntu/raring/juju-core/1.10 looks like the proposed release
<davecheney> sure, the only reason I hesitate is they belong to gustavo
<mramm> yea
<davecheney> so, be careful with 'em
<jam> davecheney: load testing time!
<jam> how many CC4.xxlarge can we run? :)
<mramm> we ultimately need to take that over and make it something owned by a team, not a person :/
<jam> mramm: is it possible to change ownership of an s3 bucket?
<mramm> not easily
<davecheney> mgz: ceheck your make
<davecheney> mramm: jam much easier to change the source in the code
<davecheney> mgz: check your mail
<mramm> davecheney: right
<mramm> you can delete it, and then try to pick the name up under a new account
<mramm> but 1) anybody can pick it up when it becomes available
<mramm> and 2) it can take up to 24 hours to go back into the available names pool
<mgz> jam: one funny thing I noticed was there are a couple of goosebot instances, one of which is shutoff, on lcy01. it's possible the routing is broken just for that host.
<mgz> jam: the tarmac puppet charm not working with juju-core is a bug here I guess, unless it does something very bogus
<jam> mgz: the shutoff one is the python-juju bootstrap node
<jam> I wasn't able to start and get to that one either, but didn't really need to.
<mgz> there are two shutoff ones...
<TheMue> mramm: thats why I suggest a dns name like tools.juju.ubuntu.com in our code to point to ANY bucket we want (or other server)
<rogpeppe> jamespage: i'm pushing a source-only branch containing all the latest source and its dependencies.
<mramm> TheMue: file a kanban card for it and add it to tomorrow's agenda ;)
<davecheney> TheMue: https://docs.google.com/a/canonical.com/document/d/1p_OzWxqxaXalHBI3ohkUsB9_iQBPSGWwM-qODm5FSbI/edit#
<TheMue> mramm: yep, will do
<rogpeppe> jamespage: unfortunately my network upload speed is outrageously slow at the moment (about 50Kbits/s) so it will take a while
<TheMue> davecheney: agenda is already edited, for a different topic ;)
<jamespage> rogpeppe, ack
<mgz> jam: seems 10.55.60.94 is indeed unreachable, worth raising with canonistack support
<jam> mgz: is that ask web-ops and then get it escalated ?
<jam> mgz: do you know if you have to have the puppet agent running to have the script run ? (default is that puppet agent in /etc/defaults/puppet is to not run)
<mgz> for the record, the packaging branch is at lp:~james-page/ubuntu/raring/juju-core/1.10.0 with the trailing .0
<rogpeppe> jamespage:  lp:~rogpeppe/+junk/juju-1.10.0-source-only
<mgz> jamespage: your packaging branch looks good to me
<jamespage> rogpeppe, that really need to be somewhere official
 * TheMue is at lunch
<jamespage> like lp:juju-core/1.10.0
<rogpeppe> jamespage: ah
<rogpeppe> i'm worried i might start stepping on someone's toes here - i'm not usually involved with this stuff
<mgz> needs a different name if you're putting it under the juju-core project
<mgz> as it's packaging
<mgz> I can push it somewhere though
<jamespage> mgz, no - thats the point
<jamespage> I want an upstream release of juju-core
<jamespage> forget the packaging
<mgz> it's not juju-core 1.10.0
<mgz> it's that plus all the deps
<jamespage> yes
<mgz> okay.
<jamespage> juju-core plus the deps that the juju-core dev team say are good for 1.10.0
<rogpeppe> basically i fetched all the deps from scratch and removed the .bzr and .hg directories
<jamespage> mgz, I want to switch the packaging away from 3.0 (native) - its not required
<mgz> to quilt?
<jamespage> mgz, yes
<jamespage> native rarely makes sense
<mgz> I can live with that
<jam> mgz: I wonder if you could grab them and use 'bzr-upload' to create light dirs so that it is easy to update them to newer tools.
<mgz> so, issue #1 is resolved by james' branch, issue #2 is okay now I have the ec2 creds, last question is if we carry the cloud-init hack or not...
<mgz> jam: we could certainly do something more elegant, for now I'm happy with just dumping the code and adding a bunch of fresh unrelated stuff to the repo on launchpad
<mramm> be back in a few -- getting breakfast and etc.   Ping me if needed.
<mgz> jamespage: probably a question for you as much as any of the juju guys, see <https://codereview.appspot.com/8648047/>
<jam> mgz: did you still want the hp-cloud instance running?
<mgz> jam: nope, I fixed my script so I can restart it myself as needed
<jam> mgz:except you couldn't reach http, right?
<jam> non-http because of the non-standard port
<jam> for keyauth
<mgz> well, "myself", provided I remember to do it in advance; otherwise not without manual intervention (lesson: use sed -r when being fancy)
<rogpeppe> my pesky phone line seems like it might be out for another whole week
 * rogpeppe didn't know about sed -r
<rogpeppe> i always get bitten by the fact that standard sed doesn't do "proper" regexps, 'cos i'm used to plan 9's sed which does them by default
<jamespage> mgz, rogpeppe: which of you two are working on preparing the juju-core 1.10.0 snapshot release and sticking it something official?
<jamespage> not clear from my backscroll
<rogpeppe> mgz: given my (lack of) current bandwidth, you might be best doing that
<jam> rogpeppe: still no home internet? ouch
<rogpeppe> jam: yeah. i just talked with the phone company, and they have no idea when it'll be resolved. it's been out since last tues
<jam> rogpeppe: for sed, you mean '\d' vs [[:digit:]] ?
<rogpeppe> jam: well, there is *some* internet, but the upload speed is stupidly bad, and i've been getting 3-5s ping response times
<rogpeppe> jam: no, i mean (foo|bar)
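The gotcha rogpeppe means: GNU sed's default basic-regexp mode treats `(foo|bar)` literally, so alternation silently never matches; `-r` switches to extended regexps where it works as expected.

```shell
#!/bin/sh
# Without -r, GNU sed's basic regexps take ( | ) as literal characters,
# so the alternation never matches; -r enables extended regexps.
printf 'foo\nbar\nbaz\n' | sed -r 's/^(foo|bar)$/match/'
# prints:
# match
# match
# baz
```

(Plan 9's sed, which rogpeppe is used to, does extended-style grouping and alternation by default.)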
<mgz> jamespage: I can push the source roger put up to somewhere... and I guess we can just leave the other questions for now
<mgz>  lp:~juju-core/ubuntu/raring/juju-core/1.10.0-source okay?
<jam> mgz, dimitern: standup?
<jamespage> mgz, not sure why we need the ubuntu/raring prefix
<jamespage> its not ubuntu or raring - its juju-core 1.10.0
<jam> jamespage: because ~juju-core/juju-core is completely unrelated code.
<jam> well, "unrelated"
<jam> in that it is just the base tree, not all deps
<jamespage> well like I said - I'm good with a tarball published on launchpad.net
<jamespage> like we do for juju
<mgz> hm, that would be ideal really
<jamespage> i'd rather have it that way - otherwise I have to cut my own tarball still
<jamespage> which is not always deterministic
<jam> mgz: mumble?
<mgz> I'm there.
<mgz> so, we don't have a 1.10.0 milestone yet? ...ah, the fun is it's half the 2.0 milestone
<wallyworld_> fwereade: want to join us on mumble?
<mgz> jamespage: https://launchpad.net/juju-core/1.10/1.10.0 has tarball based on the rogpeppe branch
<fwereade> wallyworld_, actually, could we have a quick hangout for 15 mins?
<mgz> I'll tidy up the rest of the release stuff for now
<wallyworld_> sure
<fwereade> wallyworld_, I forgot our differing interpretations of wednesday
<mgz> blast, no hangout for me
<jamespage> mgz, ta muchly
<wallyworld_> fwereade: if mgz can't do hangout, can you do mumble?
<fwereade> wallyworld_, er maybe, how can I set it up in 2 minutes or less?
<rogpeppe> mgz: you might want to include this info somewhere around the place: http://paste.ubuntu.com/5598150/
<wallyworld_> fwereade:  not in 2 mins i don't think :-(
<rogpeppe> mgz: it's the full list of the revision numbers
<wallyworld_> mgz: can you reduce your bandwidth for a hangout?
<mgz> fwereade: `sudo apt-get install mumble` then <https://wiki.canonical.com/StayingInTouch/Voice/Mumble> but hangout safer, go for that
<fwereade> mgz, cheers, I'll set that up after this then
<mgz> wallyworld_: issue is google don't provide arm debs for their binary blobs
<wallyworld_> mumble can be fiddly
<rogpeppe> mgz: hmm, except the revision of the go tree itself. that should probably be included too, i suppose, although we're not including it in the tarball
<rogpeppe> mgz: maybe we should
<rogpeppe> mgz: although i suppose we're building against a known go revision
<mgz> I can, if needed, cheat and re-up the tarball with fixes
<fwereade> wallyworld_, mgz: anyone else I should invite? jam?
<mgz> I'm not certain the current dir layout makes sense for instance
<wallyworld_> fwereade: he is afk for a little bit, will join when he gets back
<mgz> fwereade: jam was interested, but has to go off now
<wallyworld_> so invite him
<rogpeppe> mgz: i just did it so the root of the tree could be used as $GOPATH
<fwereade> mgz, I need to go in 10 mins but I think we can cover some stuff usefully
<mgz> yeah, go go, you and ian
 * dimitern bbin1h
<jamespage> mramm, mgz: I noticed the debian/copyright file was not complete in the packaging - working on that now
<jamespage> which means I have to document the copyright and license for all of the bundled projects as well
 * jamespage sighs
<mgz> ...sorry about that
<jamespage> mramm, mgz: OK - I have two problems
<jamespage> goose has no explicit Copyright holder
<jamespage> and lpad has neither a Copyright holder or license
<mgz> fwereade: sorry for the incoming launchpad email surge, you had a bunch of fixed bug targetted at 2.0 that you actually fixed long ago
<mgz> ...is lpad actually a dependency?
<mgz> or should we remove it from the tarball?
<mgz> goose is fixable
<mgz> we've had no external contributions, the copyright holder is just canonical
<mgz> jamespage: can you try building what you have as well, to see if tests etc are all fine? I'll upload a -1 tarball with any fixes needed
<jamespage> mgz, builds just fine - this is just a distro copyright/license thing
<jamespage> mgz, I don't see any tests executing fwiw
<mgz> jamespage: where should we put the copyright holder if we're not doing per-file licencing?
<mgz> jamespage: I doubt they're run as part of the packaging
<mgz> we should probably add that, but not now
<jamespage> mgz, good question - most of niemeyer's projects have it in LICENSE
<niemeyer> mgz, jamespage: I tend to do it per file as well
<niemeyer> mgz, jamespage: and certainly on a LICENSE file or similar
<jamespage> niemeyer, I spotted
<jamespage> :-)
<mgz> in juju-core that seems to just be a copy of the agpl
<jamespage> mramm, mgz: LICENSE/LICENCE or suchlike is sufficient - every file is best practice
<niemeyer> jamespage: +1
<niemeyer> Even because some projects (e.g. goyaml) do have mixed licensing
 * jamespage looks at goyaml again
<mgz> hm, and now canonistack is refusing to talk to me
<mgz> okay, I can't build or upload anything till canonistack is back, so having lunch
<mgz> jamespage: okay, have written a hacky script for rolling up tarball
<jamespage> mgz, yay for hacky scripts ;-)
<mgz> I just need to know, exactly, what change you want me to make for the goose license thing
<jamespage> it was missing copyright - hrm
<mgz> I currently have COPYING and COPYING.LESSER in the branch
<mgz> I don't really want to modify those, as they're just the text from gnu
<mgz> not that I know why we have both...
<mgz> modifying every darn file in the tree is also not sane, though it's what the gpl generally wants (header on each source file)
<jamespage> mgz, indeed - I think just adding a LICENSE file detailing which of those two licenses it's licensed under and details of the C holder would be OK
<jamespage> for now anyway
<jamespage> in fact I'm happy to leave what you have in tarball as-is - so long as there is a commit in the bzr branch with the details on it - I can refer to that with a comment
<jamespage> time is of the essence and all that
<mgz> okay, done, uploading
<mgz> this really does need the tests run on it though, as it's pristine from export, rather than copied files from trees that have been tested together
<mgz> jamespage: (and everybody else) https://launchpad.net/juju-core/1.10/1.10.0/+download/juju-core_1.10.0-1.tar.gz
<jamespage> mgz, wanna check my branch again? just re-cut using that tarball
<rogpeppe1> mgz: i ran the basic tests on it, but not live tests.
<rogpeppe1> mgz: BTW, i'm pretty sure that davecheney builds from pristine for each release
<mgz> jamespage: looking, it seems good, one question for the others
<mgz> go.net has lost the html/ package, is that something we used in any way?
<m_3> TheMue: I'm gonna turn that into a blog post though.. so give it a couple of days and I'll have a version that's a bit easier to read
<mgz> seems safe, I needed exp/html from go trunk long ago, but I think that was for rietveld not juju-core anyway
<TheMue> m_3: Great, that's what we need. Thx for your effort.
<mgz> jamespage: seems there were some changes to the cert code in your old cut of the source which I'm not clear on the origin of...
<mgz> what's there now is all that's been on trunk as far as I can see, and looks okay to me
<mgz> anyway, I shall build out of that branch, and upload to public bucket
<mgz> I guess we may also want to change the recipe to use this, and rebuild what's in the ppa?
<rogpeppe1> simple git question for someone: what's the equivalent of bzr revision-info in git?
<mgz> hm, something is not happy
<TheMue> rogpeppe1: took a look at git show?
<rogpeppe1> TheMue: i think that "git rev-parse HEAD" is what i need
<TheMue> rogpeppe1: oh, i'll take a look
<rogpeppe1> TheMue: it seems there's no linear idea of commit history in git, unlike hg and bzr. is that right? i.e. no numeric log numbering.
<rogpeppe1> TheMue: thanks. i've never used git in anger.
<TheMue> rogpeppe1: i've just started for private projects, used hg before
<TheMue> rogpeppe1: and yes, it seems to use large numbers like uuids and commit, tree, parent relations
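The bzr-to-git mapping discussed above can be tried in a throwaway repo; the repo contents below are made up purely for demonstration:

```shell
# Scratch repo so the commands have something to inspect.
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m "first"

# Closest equivalent of `bzr revision-info`: the full SHA-1 of HEAD.
git rev-parse HEAD

# There is no linear revno: each commit names its parent(s), forming a
# graph. A compact view of that graph:
git log --oneline --graph
```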
<jamespage> mgz, OK- just got a ftbfs on arm
<rogpeppe1> jamespage: interesting. what's the error?
<jamespage> apologies for the paste:
<jamespage> launchpad.net/goyaml
<jamespage> # launchpad.net/goyaml
<jamespage> src/launchpad.net/goyaml/goyaml.go:89: undefined: newDecoder
<jamespage> src/launchpad.net/goyaml/goyaml.go:90: undefined: newParser
<jamespage> src/launchpad.net/goyaml/goyaml.go:135: undefined: newEncoder
<rogpeppe1> jamespage: ah, i think i know what the issue might be. i wonder if cgo is disabled/not working on arm
<jamespage> rogpeppe1, how do I check?
<rogpeppe1> jamespage: are you getting the error at a command prompt?
<rogpeppe1> jamespage: oh jeeze
<rogpeppe1> jamespage: this is an old version of go we're using
<rogpeppe1> jamespage: i'm not surprised actually
<jamespage> 1.0.2 as in raring
<rogpeppe1> jamespage: yeah. hmm.
<rogpeppe1> jamespage: davecheney's the man for knowing about go-on-arm stuff.
<rogpeppe1> jamespage: if you type "go env" on the arm box, it should have an output line saying something like: CGOENABLED="1"
<rogpeppe1> sorry, CGO_ENABLED
<jamespage> rogpeppe1, CGO_ENABLED="0"
<jamespage> yikes
<rogpeppe1> jamespage: right, so that's the issue
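The check rogpeppe1 describes can be scripted. To keep this sketch runnable without a Go toolchain installed, the `go env` output is simulated using the exact line jamespage reported; on a real box you would capture `go env` itself:

```shell
# Simulated `go env` output (Go 1.0-style quoting assumed); on the arm
# box this line came from running `go env` directly.
go_env_output='CGO_ENABLED="0"'

# Extract the value and decide whether cgo-dependent packages can build.
cgo=$(printf '%s\n' "$go_env_output" | sed -n 's/^CGO_ENABLED="\(.*\)"$/\1/p')
if [ "$cgo" = "1" ]; then
    echo "cgo enabled"
else
    echo "cgo disabled: cgo-dependent packages (e.g. goyaml) will not build"
fi
```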
<jamespage> rogpeppe1, OK - we'll drop arm for juju-core in raring
<rogpeppe1> jamespage: +1
<rogpeppe1> jamespage: we've done no testing on arm
<jamespage> mramm, ^^
<mramm> jamespage: no arm is fine
<mramm> jamespage: arm support was explicitly pushed back to post 13.04 anyway
<mgz> well, nearly very smooth: <http://paste.ubuntu.com/5598730/>
<jamespage> mramm, OK
<rogpeppe1> jamespage: from http://code.google.com/p/go-wiki/wiki/GoArm : "currently the development version of Go includes better support for linux/arm, including full cgo support, than Go 1.0."
<mramm> rogpeppe1:  correct, git has no version numbers, just commits (identified by the hash of the commit contents)
<rogpeppe1> mramm: thanks
<mramm> commits define their parent commit, and you therefore get a graph of commits
<mramm> which git walks to show you a timeline
<mgz> and yeah, I can't build juju-core on this box, we don't work on arm currently, but we were also not targeting it
<rogpeppe1> mramm: that's true of all of those systems i think - but the linear history is quite nice for referring to a given trunk
<rogpeppe1> mgz: you probably could if you used go tip
<mramm> rogpeppe1: well, git does not attempt to pretend that there is a linear history
<rogpeppe1> mgz: i'd be interested to find out whether it works actually
<rogpeppe1> mramm: yeah.
<mgz> I'll add that to my list of fun-time things to try :)
<rogpeppe1> hmm, i wonder if my san box upstairs would be up to the task...
<mramm> in a fully distributed system I have commits in a branch you don't have and vice versa, so linear history is impossible to get right
<rogpeppe1> mramm: yeah, but hg and bzr both pretend quite well :-)
<mramm> linus was very opposed to pretending
<rogpeppe1> mramm: another stupid git question: how do i update the current working tree to a given rev id?
<rogpeppe1> mramm: git pull?
<fwereade> whoops, not been focusing on this: https://codereview.appspot.com/8939043
<fwereade> rogpeppe1, quick look for form's sake please?
<mramm> git checkout <sha>
<rogpeppe1> fwereade: LGTM trivial
<fwereade> rogpeppe1, cheers
<rogpeppe1> mramm: thanks
<mramm> git pull will grab objects (commits and trees) from the remote repo and pull them down
<mramm> git checkout will switch the cwd
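mramm's distinction — pull fetches from a remote, checkout moves the working tree — can be demonstrated in a scratch repo (contents invented for the example):

```shell
# Scratch repo with two commits, so there is an older revision to return to.
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m "one"
first=$(git rev-parse HEAD)
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m "two"

# To move the working tree to a specific revision, check out its hash
# (this detaches HEAD rather than moving any branch):
git checkout -q "$first"
git rev-parse HEAD    # same hash as $first
```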
<fwereade> ok, I'm nearly up to date on my reviews, and everyone can start committing their approved bits and pieces to trunk now
<fwereade> will try to swing by again later -- if not, ttyall tomorrow
<fwereade> dimitern, if C+L sleep early I might ping you for a late beer/catchup re upgrade-charm
<fwereade> dimitern, otherwise, maybe 20 mins before the meeting tomorrow?
<dimitern> fwereade: sgtm
<dimitern> fwereade: when you can
<jamespage> mgz, rogpeppe1, fwereade: you guys happy with what we are proposing for release into raring?
<rogpeppe1> jamespage: yup, sgtm
<rogpeppe1> jamespage: assuming it's essentially still the sources i put together
<jamespage> rogpeppe1, its the re-cut sources mgz did
<rogpeppe1> mgz: what did you change?
<jamespage> lpad got dropped
<jamespage> and some licensing clarification around goose
<dimitern> why is juju depending on lpad anyway? it's only used by lbox when interacting with LP
 * jamespage shrugs
<jamespage> anyway I just uploaded to raring - Daviey and slangasek are lined up for review
<dimitern> jamespage: tyvm!
<mramm> jamespage: awesome work!
<jamespage> let's not get too excited - it's just in the queue!
<mramm> thank you very, very much
<mramm> haha
<mramm> understood
<mramm> but it is progress
<jamespage> mramm, action prior to next release - get some copyright/license headers in all source across the board please!
<Makyo> fwereade, I got your review comments in the middle of submitting.  Would you be alright with a separate branch with those implemented?
<mramm> jamespage: will do
<Makyo> Er, too late :/
<rogpeppe1> dimitern: it's actually just the store code that depends on lpad
<rogpeppe1> dimitern: i included it just so i could do go test ./... without errors
<dimitern> rogpeppe1: it's about time to separate the store from juju-core now
<rogpeppe1> dimitern: it's happened before and it will happen again :-)
<dimitern> rogpeppe1: i certainly hope it'll be soon :)
<dimitern> Makyo: it's never late for another branch ;)
<Makyo> dimitern, More branches, more branches!
<Makyo> Getting conflicting reviews on r1192 and it should be reverted.  I've never been successful at that.  Can someone help me out?
<fwereade> Makyo, sure, that's fine
<fwereade> Makyo, sorry about that
<fwereade> Makyo, no need to revert if it's already in, just note that you will update in the review please
<mattyw> what's the best way for a charm to work out if it's been deployed using py-juju or go-juju other than checking for location of the agent.conf files?
<Makyo> fwereade, even with rogpeppe1's comments?
<fwereade> Makyo, ah sorry, just saw rog's
<fwereade> Makyo, 2 mins thinking time
<rogpeppe1> mattyw: JUJU_CONTEXT_ID is a reasonable indication, i think.
<rogpeppe1> Makyo: sorry for tardy review - i forgot to submit the comments earlier
<Makyo> rogpeppe1, That's okay, they're definitely necessary.  I think reverting might be the best choice, though, to make sure things aren't half-right.
<fwereade_> Makyo, yeah, it needs more thought -- sorry, I needed to look up what we did with deploy in that case
<fwereade_> Makyo, I think we should keep charm-adding and charm-setting separate, essentially as they are in deploy
<Makyo> fwereade_, alright, looking through deploy...
<fwereade_> Makyo, trying to figure it out myself
<fwereade_> Makyo, rogpeppe1: how do charms get into state from the GUI ie via the API?
<rogpeppe1> fwereade_: currently they don't
<rogpeppe1> fwereade_: we're restricted to charms in the charm store
<mgz> jamespage: what was the reason for juju-core rejection? ...I wish there was something recorded on launchpad
<Makyo> fwereade_, rogpeppe1, Correct, no support deploy/upgrade on local yet, though we can see them once they're deployed.
<fwereade_> rogpeppe1, ah ok, so the client just has to pass a url known to the charms store?
<rogpeppe1> fwereade_: yes
<rogpeppe1> fwereade_: my vague plan is to have an optional extra call to upload a charm (probably a sequence of calls so we don't bundle up MB in one json message)
<rogpeppe1> fwereade_: it may even end up being better as a PUT
<fwereade_> Makyo, rogpeppe1: ok, then are we ok just putting the same restriction on upgrades  for now? ie ServiceName, CharmURL, Force?
<Makyo> fwereade_, rogpeppe1, sounds good to me.
<rogpeppe1> fwereade_: i think so
<fwereade_> Makyo, rogpeppe1: I'm fine punting on local charms today at least, so long as we punt consistently
<rogpeppe1> fwereade_: yup
<rogpeppe1> i'm done for the day
<Makyo> Still on to revert 1192?
<rogpeppe1> see y'all tomorrow
<Makyo> Later.
<gary_poster> Hi all.  I'd like to announce the GUI's compatibility with juju core, but the Raring Juju from the devel PPA fails for me like this: http://pastebin.ubuntu.com/5598967/
<gary_poster> Is that known?  Is there some other, better way to suggest that people try out the GUI on juju core?  I didn't figure installing juju from source was the right sales pitch :-)
<jamespage> mgz, release team said it was too late
<ahasenack> gary_poster: so now --upload-tools is failing too?
<gary_poster> ahasenack, as in the pastebin, bootstrap --upload-tools succeeds temporarily, but then I can't deploy anything. :-/
<ahasenack> oh, ok, I missed the deploy command
<mramm> jamespage: mgz: we are still fighting the good fight.   And if we don't get it in now, we will do everything possible to get it in via backports early  next week
<mramm> so the end user visible difference will not be much
<mramm> either way after next week a user will be able to sudo apt-get install juju-core, and get our package
<mramm> and if it is in backports it will be even easier to get it updated with our monthly releases
<davecheney> fwereade_: what is the story with the trunk ?
<davecheney> can we land fixes ?
<mgz> davecheney: I thought it had been mentioned that it's okay to land again, but I now can't find a reference
<mgz> release things have all been branched though
<davecheney> mgz: mramm fwereade_ : please email juju-dev with the status of the trunk
<davecheney> mgz: i saw you were unassigning issues from 2.0
<davecheney> thank you
<mramm> feel free to use trunk
<davecheney> mramm: email please
<mramm> will do
<davecheney> just copy and paste this discussion
<mramm> it's posted
<mramm> and I've updated the agenda so we talk about that
<mramm> and so that we talk about the backports stuff
<mramm> and general release status
<mramm> I will try to write up a release status update e-mail later this evening
<mramm> but right now there is a lot up in the air, and I need some time to get all the details organized and written down
<mramm> and I have some personal stuff to take care of in a few min, so I don't think it will make it before I leave for that...
<davecheney> thanks mark
#juju-dev 2013-04-25
<fwereade_> dimitern, ping
<dimitern> hey
<dimitern> fwereade_: morning
<TheMue> fwereade_, dimitern: morning
<dimitern> TheMue: hiya
<fwereade_> dimitern, quick chat re relations/upgrade-charm before the meeting?
<dimitern> fwereade_: actually, I thought it's better for me to do the other card first: --switch
<dimitern> fwereade_: should be simpler, etc. but after the change to use statecmd, i'm a bit lost now
<fwereade_> dimitern, ah, is that in now? I thought Makyo|out was planning to revert
<dimitern> fwereade_: it landed and probably he'll build on it
<fwereade_> dimitern, and, yeah, this is bad timing -- rog will also have something to contribute, I think, he's assigned himself the "allow local charms via api" card
<dimitern> fwereade_: hmm..
<fwereade_> dimitern, *but* from your perspective it shouldn't be that bad actually
<rogpeppe1> mornin' all
<fwereade_> rogpeppe1, heyhey
<dimitern> fwereade_: actually, i'm lost now
<dimitern> rogpeppe1: hey
<fwereade_> dimitern, all --switch really needs to do is to send a different charm url in the params
<dimitern> rogpeppe1: were you planning to implement upgrade-charm --switch to local repo flag?
<rogpeppe1> dimitern: no
<rogpeppe1> dimitern: i'm not sure what the planned semantics are
<fwereade_> dimitern, sorry unclear: it's just that the api doesn't handle local charms
<fwereade_> dimitern, and I thought I saw rogpeppe1 assign that issue to himself
 * rogpeppe1 is relieved to see his phone switch from "E" to "H", so may make the meeting after all
<dimitern> fwereade_: but since upgrade-charm now uses the api, how will that be possible then?
<rogpeppe1> fwereade_, dimitern: ah, local charms is something i was planning to do, as it requires some API thought
<dimitern> so i cannot implement it then..
<fwereade_> dimitern, it's actually orthogonal, isn't it?
<fwereade_> dimitern, the commands don't *really* use the api anyway
<dimitern> fwereade_: how about me you and rogpeppe1 have a g+ after the meeting to discuss it?
<fwereade_> dimitern, +1
<rogpeppe1> dimitern: sgtm
<dimitern> great
 * rogpeppe1 hopes he might get a decent internet connection again one day
<dimitern> rogpeppe: what's E and what's H on your phone line?
<davechen1y> https://docs.google.com/a/canonical.com/document/d/1p_OzWxqxaXalHBI3ohkUsB9_iQBPSGWwM-qODm5FSbI/edit#
<rogpeppe> dimitern: different network data connectivity levels: E=EDGE, 3G=3G, H=HSDPA and H+=HSDPA+ i think
<davechen1y> https://plus.google.com/hangouts/_/d3f48db1cccf0d24b0573a02f3a46f709af109a6
<fwereade_> dimitern, (fwiw the short version is: you have a conn, with state and env, just add the charm yourself and pass a charm url to one that's already in state)
<rogpeppe> dimitern: E is good for nothing. H is good enough for a marginal G+
<davechen1y> edge is what they used to call 2G before 2G was defined by 3G
<rogpeppe> davechen1y: ah, i'm pretty clueless about this, thanks
<dimitern> rogpeppe: I see
 * davechen1y worked for ericsson during the 2g / 3g wars
<davechen1y> it's all horseshit
<davechen1y> the speed available to you is directly driven by the amount of free timeslots on your cell
<davechen1y> basically, the more people on their phone, the less data per person
<rogpeppe> luckily around here nobody uses much data, i think
<mattyw> rogpeppe, hi there me again! does py juju set JUJU_CONTEXT_ID? i.e. to work out if my charm is running under pyju or goju do I just need to find some data in JUJU_CONTEXT_ID or something specific?
<davechen1y> mattyw: a quick grep of the source says, no py juju (0.7) does not export JUJU_CONTEXT_ID
<mattyw> davechen1y, hi dave, that seemed to be the case when I looked, but wanted to check in case my vimgrep-fu wasn't that great
<mattyw> davechen1y, I guess it's a problem the juju-gui charm will have to solve at some point if not already, I'll try to pester hazmat when he's up
<davechen1y> +1
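The detection heuristic settled on above — py-juju 0.7 does not export `JUJU_CONTEXT_ID`, juju-core does — would look like this inside a hook script:

```shell
# Presence of JUJU_CONTEXT_ID is an indirect but workable check for which
# juju implementation deployed the charm (per the discussion above).
if [ -n "${JUJU_CONTEXT_ID:-}" ]; then
    echo "deployed by juju-core"
else
    echo "deployed by py-juju (or not running in a hook)"
fi
```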
<fwereade_> dimitern, would you start one? quick bathroom break
<dimitern> fwereade_: sure
<jam> mgz: you're not climbing the valleys?
<dimitern> fwereade_, rogpeppe: https://plus.google.com/hangouts/_/84df1b5aa61d251351d1476839611a33f6aa4f69?authuser=0&hl=en
<mgz> woho, it let me leave
<mgz> rogpeppe: haven't looked at the route cards yet, but it's normally white peak bit of peak district, around longnor. so, nothing very mountainy really.
<rogpeppe> mgz: it is beautiful around there
<dimitern> Makyo|out: sorry, i'll be reverting your branch about upgrade-charm API, as agreed with fwereade_ and rogpeppe; please ping me or them when you're here to discuss next steps
<dimitern> mgz: what was the easiest bzr way to revert a single rev?
<mgz> dimitern: eg, to backout rev 3, `bzr merge -r3..2 .` note the trailing dot
<dimitern> mgz: ok, thanks
<dimitern> mgz: hmm.. it's not working: bzr: ERROR: Not a branch: "/home/dimitern/work/go/src/launchpad.net/juju-core/.bzr/cobzr/.bzr/branch/": location is a repository.
<dimitern> mgz: maybe cobzr interferes somehow; i suspect it's the dot at the end that's confusing it
<mgz> you need to reference the branch you're dealing with
<mgz> dimitern: so, with native colo I could also use co:trunk if the rev I'm dealing with is on trunk. I presume cobzr has something similar
<dimitern> mgz: i got rid of cobzr and now i'm using just plain bzr
<TheMue> dimitern: i do so too. switching branches is simple enough.
 * TheMue is at lunch, bbiab
<dimitern> fwereade_: https://codereview.appspot.com/8958043
<fwereade_> dimitern, cheers
<fwereade_> dimitern, is that a straight revert?
<dimitern> fwereade_: yes
<fwereade_> dimitern, excellent, if the tests pass then LGTM trivial
<dimitern> fwereade_: running them now
<jam> mgz: poke
<danilos> jam: mumble is crashing for me :/
<jam> danilos: :(
<jam> danilos: still no love for mumble?
<jam> mgz: still not around yet?
<danilos> jam: nope, restarting pulseaudio didn't help either
<jam> well, if mgz isn't around, then we can do a hangout, as he is the only one that doesn't have good access.
<mgz> er, I'm here, but have turned off my hangout machine
<mgz> ...really need to fix the clock on this
<danilos> jam: sure, hangout or not?
<jam> danilos: mgz made it onto mumble first, so we'll proxy you via irc
<danilos> jam: ack
<jam> danilos: you go first
<jam> to get it out of the way
<danilos> jam: heh, fair enough
<jam> how was your day
<danilos> jam: it was decent, still fighting two remaining test failures with my keypair auth stuff; I also plan to start testing it --live
<danilos> jam: I wonder if I need anything to be able to run against hp cloud other than the credentials I already got from you?
<mgz> danilos: you shouldn't need anything else
<jam> right, you shouldn't need more
<danilos> mgz, jam: cool, I'll let you know how that goes then
<jam> and the creds I got you should be "safe" from one of the bugs wallyworld_ noticed.
<mgz> be a little careful about how you name your env and bucket, as that's a shared account (I assume)
<jam> mgz: yes, it is
<jam> danilos: sounds good
<danilos> mgz, right
<mgz> we're discussing non-interactive apt in the hook environments
<danilos> mgz, any conclusions out of that discussion?
<jam> danilos: it looks like it is actually doing the right thing there, so I'm on to debug why my charm isn't working correctly.
<jam> for a sec we thought it wasn't setting DEBIAN_FRONTEND=noninteractive, but it looks like that isn't true
<danilos> jam: interesting indeed
<wallyworld_> fwereade_: hey, you around?
<fwereade_> wallyworld_, hey
<fwereade_> wallyworld_, oh crap, this is the same timing as yesterday :(
<wallyworld_> wanna have a chat about the constraints stuff?
<fwereade_> wallyworld_, would you be ok to chat about it in your morning? I can come on before I go to bed
<fwereade_> wallyworld_, unless you'll be around in ~1h20 -- what's the time difference?
<fwereade_> wallyworld_, I can't actually remember what city you're in
 * danilos goes out to grab a quick bite
<jam> anyone know if "juju set --config" actually does anything? It supports the option, but AFAICT it doesn't actually set anything.
<jam> Or maybe the contents of the file changed?
<fwereade_> jam, the --config format is a bit surprising
<wallyworld_> fwereade_: it's 22:10 here now, if i can stay awake i'll ping you later or else let's do it my morning
<fwereade_> wallyworld_, hey, I can do 15 mins now
<wallyworld_> ok
<fwereade_> wallyworld_, I'll start a hangout
<jam> fwereade_: I think we broke the format supported by python-juju
<jam> which is that the top-level item is "tarmac"
<jam> we just do an unmarshal into a map[string][string] and then pass that in to s.SetConfig(options)
<fwereade_> jam, oh *fuck*, I could have sworn rogpeppe was doing that?
<jam> while we use the same s.SetConfig(options) from the commandline.
<jam> sorry not "tarmac" but the name of the service
<jam> rogpeppe: poke?
<rogpeppe> fwereade_: no, it's an outstanding bug that we decided not to fix at the time
<rogpeppe> jam: the top level items refers to the services to set the config options on i think
<jam> rogpeppe: that is what it is supposed to be (what it was in python-juju), but doesn't seem to be the case in juju-core today
<rogpeppe> jam: indeed
<jam> rogpeppe: note that it is also broken for "juju deploy --config" then
<rogpeppe> jam: yes
<jam> bug #1162122
<_mup_> Bug #1162122: deploy --config has no tests <juju-core:Triaged> <https://launchpad.net/bugs/1162122>
<rogpeppe> jam: tbh i find the python format a bit weird (we say to a service "please set these config options if there happens to be a matching service name in here") but it should still be fixed to be compatible
<rogpeppe> jam: i think there are actually tests
<jam> rogpeppe: well, it would let you have all your config in one file for multiple services, and just have it pick out the right one at the right time.
<rogpeppe> jam: they're just testing the wrong thing
<rogpeppe> jam: it would make more sense if there was a call that said "set all these settings on all these services"
<rogpeppe> jam: with just the config file as input
<jam> rogpeppe: well, the service could be optional if you specify 'juju set --config'
<rogpeppe> jam: yeah, that would make much more sense
<jam> :q
<rogpeppe> jam: it would be nice if there was some way of reliably telling the two-level config from the one-level config. that way you could have some settings in a config file that would apply to a given charm regardless of its service name.
<rogpeppe> jam: (that's the way juju is now, but we're going to remove that possibility by making it compatible)
<jam> rogpeppe: well, you could look for a top level key that matches the service name, and if it exists, recurse into it
<jam> though it leaves you open to 'interpretation' if you ever had key collisions
<rogpeppe> jam: it's quite possible that there's a ... yeah
<rogpeppe> jam: i much prefer to avoid heuristics when possible
<rogpeppe> jam: bug #1167465
<_mup_> Bug #1167465: service set (and deploy) uses wrong YAML config syntax <juju-core:New> <https://launchpad.net/bugs/1167465>
<rogpeppe> it's a pity we've got YAML going over the wire at all. if it wasn't such an abstruse format, there would be a decent js parser for it and we wouldn't need to do that.
<rogpeppe> because it would make much more sense for the client side to inspect the yaml and decide which settings to use based on the service name, then send just those settings
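The py-juju `--config` format being discussed — top-level keys are service names, with the option settings nested beneath — and the client-side selection rogpeppe proposes can be sketched like this. Service and option names are invented for the example, and the sed extraction is a crude stand-in for real YAML parsing:

```shell
# Two-level config file: top-level keys name services.
cat > config.yaml <<'EOF'
wordpress:
  tuning: optimized
mysql:
  dataset-size: 512M
EOF

# Client-side selection: keep only the settings under the matching
# service key (crude sed extraction, adequate for this flat shape).
sed -n '/^wordpress:/,/^[^ ]/{/^  /p;}' config.yaml
```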
<mgz> er... who has the creds to put tools in canonistack?
<gary_poster> Hi all.  I'd like to announce the GUI's compatibility with juju core, but the Raring Juju from the devel PPA fails for me like this: http://pastebin.ubuntu.com/5598967/
<gary_poster> Is that known?  Is there some other, better way to suggest that people try out the GUI on juju core?  I didn't figure installing juju from source was the right sales pitch :-)
<rogpeppe> gary_poster: yes, it's known and waiting on a fix to cloud-init in raring, i believe
<rogpeppe> gary_poster: we did have a fix but it was backed out on request
<gary_poster> ok cool rogpeppe thanks.  So we shouldn't announce till that is in backports?
<rogpeppe> gary_poster: and we were assured that the lifetime of the bug was "measured in days"
<gary_poster> :-)
<rogpeppe> gary_poster: i think we could announce but say that we can't currently bootstrap a raring instance (referencing the bug number). you shouldn't see this issue if you set default-series=precise
<hazmat> gary_poster, you're launching a precise env? rogpeppe if so that doesn't sound like the raring bug.
<hazmat> the raring cloud init bug
<rogpeppe> hazmat: i presumed he was launching a raring env
<rogpeppe> hazmat: yeah
<gary_poster> rogpeppe, sorry, I am launching a precise instance from raring
<gary_poster> default-series: precise
<gary_poster> in environment
<hazmat> rogpeppe, default is precise. he's using raring client
<rogpeppe> gary_poster: hmm
<hazmat> well dev ppa client
<gary_poster> I had to do --upload-tools
<rogpeppe> hazmat: doesn't upload-tools make it default to current series?
<gary_poster> Maybe that is related?
<rogpeppe> gary_poster: have you ssh'd to the bootstrap instance and looked at cloud-init-output.log ?
<gary_poster> I think that was changed a while ago: at least on quantal we've been able to launch precise instances with --upload-tools
<hazmat> rogpeppe, it also uploads precise.. i hope it doesn't re change default series..
<gary_poster> no rogpeppe, will look.  /var/log/...?
<hazmat> the result for that error was nothing on the other side..
<rogpeppe> gary_poster: yes
<gary_poster> k
<rogpeppe> hazmat: yup
<hazmat> rogpeppe, is "error: cannot log in to admin database: auth fails" cannot connect to remote side?
 * hazmat pokes around
<rogpeppe> hazmat: it means that we've got as far as starting mongo, but jujud bootstrap-state failed
<rogpeppe> hazmat: so yeah, it's not the raring bug
<rogpeppe> gary_poster: might be worth doing "cat /etc/os-release" on the remote instance too, just to make sure it really has booted precise
<gary_poster> ack rogpeppe.  (restarting instances)
<TheMue> gary_poster: also which juju release (1.9.14 or 1.10.0)
<rogpeppe> TheMue: he used --upload-tools
<rogpeppe> TheMue: so it shouldn't make a deal of difference
<gary_poster> TheMue, how do you tell, anyway?  I tried --version but that gave an error message
<ahasenack> I just bootstrapped on ec2 and I didn't need to use --upload-tools, fwiw
<TheMue> rogpeppe: "shouldn't" is the right answer ;)
<rogpeppe> ahasenack: with 1.10 ?
<ahasenack> rogpeppe: yes
<gary_poster> cool ahasenack.  I'll try that after
<rogpeppe> ahasenack: maybe we have got 1.10 in the public bucket after all then
<ahasenack> http://pastebin.ubuntu.com/5601017/
<ahasenack> rogpeppe: gary_poster ^^^
<gary_poster> cool
<rogpeppe> ahasenack: cool. i thought noone had acquired the right permissions to do that yet. i guess davecheney might've done it.
<ahasenack> rogpeppe: still missing in canonistack, though
<ahasenack> but that's a "private" matter ;)
<rogpeppe> ahasenack: :-)
<ahasenack> as in, private cloud
<TheMue> rogpeppe: imho he handled it with martin yesterday
<rogpeppe> TheMue: cool
<gary_poster> rogpeppe, unsurprisingly juju ssh 0 fails with same error, as does juju status.  There is a clear error in the log though: https://pastebin.canonical.com/89982/ .
<gary_poster> rogpeppe, and it definitely is precise
<gary_poster> ahasenack, were you starting a precise instance?
<rogpeppe> gary_poster: you can't use juju ssh
<rogpeppe> gary_poster: you'll have to use ssh directly
<gary_poster> rogpeppe, yup, figured that, I did, and that's how I got the pastebin :-)
<rogpeppe> gary_poster: try: ssh ubuntu@instance
<ahasenack> gary_poster: yes, I have default-series precise
<gary_poster> ahasenack, huh.  I may just want to try not using --upload-tools
<gary_poster> now that this works
<ahasenack> it used ami-d0f89fb9
<ahasenack> (us-east-1)
<ahasenack> that's 099720109477/ubuntu/images/ebs/ubuntu-precise-12.04-amd64-server-20130411.1
<gary_poster> ahasenack, same here
<gary_poster> (but I used upload-tools)
<rogpeppe> gary_poster: hmm, weird
<rogpeppe> gary_poster: ah i know!
<rogpeppe> gary_poster: you were using --upload-tools but you hadn't done "go install"
<gary_poster> so upload tools doesn't work from PPA
<gary_poster> cool, makes sense
<gary_poster> I was somewhat surprised when it worked on my quantal machine
<gary_poster> ok, will kill and retry without --upload-tools.  Thanks rogpeppe and ahasenack
<rogpeppe> gary_poster: juju --upload-tools should check that the command line client is the same version as the tools that have been built.
<rogpeppe> gary_poster: although that's not really sufficient either.
<rogpeppe> gary_poster: perhaps upload-tools should build the tools and then run them to do the actual bootstrap
<gary_poster> rogpeppe, that sounds potentially good.  Alternatively, --upload-tools could simply fail (with helpful message) if you are not running from a devel juju?
<ahasenack> hm, juju-log in juju-core does not have a --log-level parameter
<ahasenack> pyjuju does
<ahasenack> keystone charm failed to deploy:
<rogpeppe> gary_poster: that wouldn't fix the most common failure scenario that i encounter
<ahasenack> 2013/04/25 13:39:25 INFO worker/uniter: HOOK subprocess.CalledProcessError: Command '['juju-log', '--log-level', 'INFO', 'Configuring Keystone to use a random admin token.']' returned non-zero exit status 2
<rogpeppe> ahasenack: hmm, i thought we just ignored that
<ahasenack> it said
<ahasenack> 2013/04/25 13:39:25 INFO worker/uniter: HOOK error: flag provided but not defined: --log-level
<ahasenack> should I file a bug?
<hazmat> yes
<rogpeppe> ahasenack: yeah. we accept (and ignore) -l, but don't define --log-level
<rogpeppe> ahasenack: which should be defined as an alias
<ahasenack> -l is an alias for --log-level?
<ahasenack> ok
<rogpeppe> ahasenack: thanks for finding it
<ahasenack> #1172717
<_mup_> Bug #1172717: juju-log does not accept --log-level <juju-core:New> <https://launchpad.net/bugs/1172717>
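Until the alias landed, a charm targeting both implementations could normalize on the short flag. This is a hypothetical charm-side shim, not code from either juju: juju-core accepts `-l` per the discussion above; that pyjuju also accepts it is an assumption here, and the `echo` branch is only a fallback for running hook code outside a unit:

```shell
#!/bin/sh
# Hypothetical shim for bug #1172717: prefer the short -l flag, which juju-core
# accepts (it rejects --log-level at this point). The echo fallback lets the
# function run outside a juju hook context, e.g. in charm unit tests.
log_info() {
    if command -v juju-log >/dev/null 2>&1; then
        juju-log -l INFO "$@"
    else
        echo "INFO: $*"
    fi
}

log_info "Configuring Keystone to use a random admin token."
```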
<ahasenack> hm, found another incompatibility
<ahasenack> the charm directory layout changed a bit, and the keystone charm makes some assumptions
<ahasenack>     juju_rc_path = "/var/lib/juju/units/%s/charm/%s" % (unit_name, script_path)
<ahasenack> that does not exist in units deployed by go-juju
<rogpeppe> ahasenack: i think that's an unwarranted assumption on the part of the keystone charm
<rogpeppe> ahasenack: i don't think there has ever been a guarantee of the current directory
<ahasenack> rogpeppe: I agree
<rogpeppe> ahasenack: it should really use $CHARM_DIR
<ahasenack> rogpeppe: is that an env var exported by both pyjuju and juju-core?
<rogpeppe> ahasenack: i think so, yes
<ahasenack> ok
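The portable pattern discussed here is to derive paths from `$CHARM_DIR`, which both pyjuju and juju-core export to hooks, instead of hardcoding the `/var/lib/juju/units/<unit>/charm` layout that only pyjuju happened to use. A minimal sketch (the relative path `scripts/keystone-helpers` is a made-up example, not from the keystone charm):

```shell
#!/bin/sh
# Build charm-relative paths from $CHARM_DIR rather than assuming the unit
# directory layout. $CHARM_DIR is set by the agent when it runs a hook; the
# $PWD fallback is only for exercising this snippet outside a hook.
CHARM_DIR="${CHARM_DIR:-$PWD}"
juju_rc_path="$CHARM_DIR/scripts/keystone-helpers"   # hypothetical script path
echo "$juju_rc_path"
```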
<rogpeppe> fwereade__: what do you say to removing jujuc log --debug and making --log-level work correctly?
<rogpeppe> fwereade__: given that we now do actually have log levels
<fwereade__> rogpeppe, do we have no users of --debug? I thought it pre-existed
<fwereade__> rogpeppe, if we don't definite +1
<rogpeppe> fwereade__: not AFAICS in /juju/hooks/cli.py
<fwereade__> rogpeppe, even if we do +1 to using log levels
<rogpeppe> fwereade__: but i may be missing another place
<fwereade__> rogpeppe, sweet
<fwereade__> rogpeppe, kill it kill it kill it
<rogpeppe> fwereade__: need to fix general log-level setting first
<fwereade__> rogpeppe, agreed
<gary_poster> FWIW, verified that juju-gui works.  Do I need to suggest that people install the devel PPA, or should the default Raring juju-core work?  I'm using the devel PPA
<gary_poster> I suppose I could uninstall juju-core, uninstall the PPA, reinstall juju-core, and verify...
<gary_poster> but if someone knows off hand that would be convenient :-)
<mgz> the bucket probably just needs the version from the archive as well...
<mgz> I can do that and see
<ahasenack> gary_poster: the instructions at juju.ubuntu.com already say to install the devel ppa iirc
<mgz> some funny versioning business went on, didn't think the juju tool picking logic would care
<gary_poster> ahasenack, right you are, thanks.  But that may be specific to < raring.  So mgz, you are doublechecking to see if default raring juju-core works OK?
<mgz> well, people already had, but I'm trying something else to see if it makes a difference for you
<rogpeppe> oh wonderful "Your fault is due to major damage to cabling affecting all providers in your area. Engineers are working to resolve ASAP. We apologise for no fix time."
<TheMue> rogpeppe: so a wonderful mca
<rogpeppe> TheMue: mca?
<TheMue> rogpeppe: maximum credible accident
<rogpeppe> TheMue: ha.
<rogpeppe> TheMue: the fact that most people in the street don't have a problem seems to give the lie to it
<TheMue> rogpeppe: or they use lte ;)
<rogpeppe> TheMue: lte?
<TheMue> rogpeppe: 4g
<rogpeppe> TheMue: some do, yeah, but lots don't. my neighbour who's also affected has been canvassing other folks in the street and finding out
<TheMue> rogpeppe: http://en.wikipedia.org/wiki/LTE_(telecommunication)
<dimitern> https://codereview.appspot.com/8540050
<TheMue> rogpeppe: most aren't as dependent on it as you (we) are. so it's inconvenient, but it doesn't really hurt. they then use their mobiles or the internet at work.
<dimitern> so there's a mistake in the desc, sorry, the bug is 1040210, not 1040203
<rogpeppe> dimitern: looking
 * dimitern bbi30m
<rogpeppe> dimitern: reviewed
<rogpeppe> fwereade__, ahasenack: trivial? https://codereview.appspot.com/8955044
<ahasenack> rogpeppe: if that adds --log-level as an alias to -l, +1 :)
<rogpeppe> ahasenack: yup, that's it.
<fwereade__> rogpeppe, LGTM trivial
<rogpeppe> fwereade__: thanks
<rogpeppe>  hmm, "package" is not a great name for a type in go
<mgz> I try to use "if" as a variable name sometimes
<dimitern> rogpeppe, TheMue: thanks
<rogpeppe> i always find it interesting how much ambiguity the brain resolves automatically
<rogpeppe> i'll have used the same variable name in two entirely different ways in a function and have no problem with it until the compiler tells me
<rogpeppe> and then it's like "how tf didn't i notice that?"
<fwereade__> dimitern, reviewed
<dimitern> fwereade__: cheers
<dimitern> enter the saucy salamander (13.10) :D - http://www.markshuttleworth.com/archives/1252
<TheMue> dimitern: yeah, just read it
<dimitern> fwereade__, rogpeppe: replied to the reviews, will need some help - please take a look
<fwereade__> dimitern, responded
<dimitern> fwereade__: thanks
<Daviey> mgz: Hey, Did i understand correctly that you created the release tarball for juju-core?
<mgz> Daviey: yes, all complaints to be directed at me.
<Daviey> mgz: That'll have to wait until we have a big bottle of port, and some fine cheese.
<Daviey> mgz: The reason for asking.. Can you wiki-ify what you did, whilst it's still fresh?  (I imagine there was some sausage making, but that is OK).  What you did is a good basis for next time :)
<mgz> yeah, I'm writing up release process stuff today
<Daviey> mgz: superb
<dimitern> i'm off guys
<mgz> later dimitern
<dimitern> see you tomorrow and good evening
<mramm> Two new bugs from end users: 13:52 dpb_: mramm: https://bugs.launchpad.net/juju-core/+bug/1172814, https://bugs.launchpad.net/juju-core/+bug/1172811
<_mup_> Bug #1172814: Need a way to run an end-to-end test on a juju environment. <juju-core:New> <https://launchpad.net/bugs/1172814>
<_mup_> Bug #1172811: Need a way to watch juju-core environements <juju-core:New> <https://launchpad.net/bugs/1172811>
<rogpeppe> mramm: thanks. i've responded to the second one.
<rogpeppe> right, that's me for the day
<rogpeppe> g'night all
<mramm> anybody know anything about the issues folks are reporting with sync-tools ?
<mramm> that sounds like a high priority bug to me
<ahasenack> http://pastebin.ubuntu.com/5602223/ for reference
<mramm> hmm, looks like another issue we've seen caused by a packaging bug in go 1.0.2, we will investigate right away
<thumper> morning folks
#juju-dev 2013-04-26
<davecheney> m_3: ping
<davecheney> our cloudinit harness doesn't support the bits of upstart I need
<davecheney> so i'm going to hack the bootstrap node after boot
<davecheney> arosales: ^ as above
<davecheney> that will have the same effect and validate our assumptions about the ~298 connection limit
<davecheney> OT question: does bzr have anything like svn externals or git submodules ?
<davecheney> $ sudo initctl start -v juju-db
<davecheney> initctl: Job failed to start
<davecheney> FML
<thumper> hi davecheney
<davecheney> ubuntu@juju-hpgoctrl2-machine-0:~$ nova list
<davecheney> +---------+---------------------------+------------------+--------------------------------------+
<davecheney> |    ID   |            Name           |      Status      |               Networks               |
<davecheney> +---------+---------------------------+------------------+--------------------------------------+
<davecheney> | 1465097 | juju-hpgoctrl2-machine-0  | ACTIVE           | private=10.7.194.166, 15.185.162.247 |
<davecheney> | 1565949 | juju-goscale2-machine-37  | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89   |
<davecheney> | 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83  |
<davecheney> | 1581493 | juju-goscale2-machine-0   | ACTIVE           | private=10.7.27.166, 15.185.166.80   |
<davecheney> +---------+---------------------------+------------------+--------------------------------------+
<davecheney> ^ jammed in deleting for a few days now :(
<davecheney> 2013/04/26 00:51:08 DEBUG started processing instances: []environs.Instance{(*openstack.instance)(0xf8401b3f00)}
<davecheney> ^ *openstack.instance needs a String()
<m_3> davecheney: hey
<davecheney> m_3: hey mate
<davecheney> going for broke for 2k
<m_3> ssup?  still jammed?
<m_3> sweet
<m_3> bit of latency atm... gogo inflight wireless
<m_3> :)
<davecheney> i've hacked the mongo on the bootstap machine to have at least 20,000 conns
<davecheney> that should be enough for the moment
<m_3> oh nice
<davecheney> m_3: where u off to ?
<m_3> SF, then Portland
<m_3> SF is prep for the big data summercamp talk
<m_3> portland is railsconf
<m_3> whoohoo
<m_3> actually looking forward to hanging with the ole 'austin-on-rails' crowd
<davecheney> m_3: I think we'll probably run out of ram on the bootstrap node by 2,000
<davecheney> m_3: this one is a hp bug,
<davecheney> ubuntu@juju-hpgoctrl2-machine-0:~$ nova list | grep delet
<davecheney> | 1565949 | juju-goscale2-machine-37  | ACTIVE(deleting)  | private=10.6.245.47, 15.185.172.89   |
<davecheney> | 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting)  | private=10.6.246.187, 15.185.177.83  |
<m_3> davecheney: damn... I was just writing that we can bounce it and get something larger
<m_3> but we can't update the env after bootstrap still right?
<davecheney> ~ 1.5 mb per service unit
<davecheney> env ?
<davecheney> you mean the spec for the bootstrap machine ?
<m_3> juju environment
<m_3> yeah
<davecheney> not easily
<davecheney> probably easier to hack juju bootstrap
<m_3> right
 * davecheney facepalm
<davecheney> there is no swap on these machines
<davecheney> that will be a problem
<davecheney> mongo will probably explode
<m_3> yeah, sometimes when they were wedged with juju-0.7 we could do destroy-environment, and it was a little stronger than destroy-service
<m_3> can you kill em with nova
<davecheney> nova can't kill this one
<m_3> we should've started with ec2 imo
<davecheney> (how do you think it got into this state in the first place)
<m_3> haha
<davecheney> m_3: any movement on some ec2 creds ?
<m_3> not yet... I prepped antonio that the request had been pretty much approved from above... but gotta get ben on the actual acct stuff
<m_3> davecheney: I think we should just blow it up
<m_3> davecheney: maybe put something in place that'll tell us that's what's happening
<m_3> so we can distinguish between a juju error and the bootstrap node blowing up
<davecheney> "11:25 < m_3> davecheney: maybe put something in place that'll tell us that's what's happening"
<davecheney> oh
<davecheney> that
<m_3> :)
<davecheney> let me blow one up so I can see what to expect
<m_3> reasonable to get as big as we can
<m_3> ack
<m_3> unfortunately I won't be in the air for long... otherwise _that_ would be a great story :)... "kicked off 1000 nodes from the plane"
<m_3> latency's really dropped down too... so it's pretty nice actually
<davecheney> mramm: wazzup ?
<mramm> not much
<mramm> I just got an email from linaro folks about armhf support in juju-core
<davecheney> m_3: lemme hack this instance with a /.SWAP
<davecheney> mramm: piece of piss
<mramm> ?
<davecheney> i told someone that we can always do a one off build if they need armhf today
<davecheney> if they need it properly
<davecheney> we need some work done on the golang-go package in the archive
<davecheney> basically, we need go 1.1
<mramm> they are just asking if they can help test and support it
<mramm> right
<mramm> that was what I remembered from some earlier arm discussion
<davecheney> they can test it right now today if they build go and juju from source
<davecheney> http://dave.cheney.net/unofficial-arm-tarballs
<mramm> They are not being demanding, just asking how they can help
<davecheney> ^ or they can use my beta tarballs
<davecheney> feel free to cc me
<davecheney> i'm happy to help get them started
<mramm> and what they can do, so I will let them know the situation, and CC you
<mramm> sounds good
<mramm> did we hardcode the state server to be amd64?
<m_3> descending below 10k-ft... ttyl
<davecheney> mramm: opinions differ
<davecheney> william told me it _is_ hard coded to amd64
<davecheney> then he told me it wasn't
<mramm> ok
<davecheney> i don't know the current answer
<mramm> I will check with william
<davecheney> i'd expect it to just work
<davecheney> mramm: it's a bit of a problem that the UEC service doesn't list our armhf on amd64 images, http://cloud-images.ubuntu.com/query/precise/server/released.txt
<mramm> interesting
<davecheney> hmm, maybe they do for Q
<davecheney> nup
<mramm> we can talk to the "public cloud images" guys about that, and see what we can get done there.   I'll talk to antonio about that tomorrow.
<davecheney> mramm: http://www.h-online.com/open/news/item/Canonical-releases-EC2-image-for-Ubuntu-ARM-Server-1585740.html
<mramm> kk
<mramm> thanks
<thumper> hi mramm
<davecheney> mramm: m_3 286 slaves running, mongo using 450 mb of ram
<davecheney> so at least 4gb required for 2000 nodes at this rate
<thumper> davecheney: is that good?
<davecheney> it means you need to run a larger bootstrap instance
<mramm> davecheney: I guess that is to be expected if we are going to have thousands of open connections to mongo
<davecheney> but then, if you're running 2000 nodes in your environment
<mramm> true enough
<davecheney> you probably don't care about the cost difference
<mramm> right, the bootstrap node cost will be trivial compared to the 2000 nodes
<davecheney> each conn is a thread, which is anywhere between 1mb and 16mb depending on libc and the phase of the moon
<davecheney> mramm: bingo
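The sizing figures above can be turned into a back-of-envelope estimate. Note these are observations from this one scale run (~1.5 MB of mongod footprint per connected service unit), not guarantees:

```shell
#!/bin/sh
# Rough bootstrap-node sizing from the numbers observed in this run:
# ~1.5 MB of mongod memory per service unit connection.
units=2000
per_unit_kb=1536                            # ~1.5 MB per unit, as measured
total_mb=$(( units * per_unit_kb / 1024 ))
echo "estimated mongod memory for $units units: ${total_mb} MB"
# ~3 GB steady state, hence the "at least 4 GB" figure above once
# per-connection thread stacks and general headroom are included.
```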
<mramm> thumper: hey!
<mramm> davecheney: I think we should work to get 1.1 into S as soon as we can
<thumper> mramm: finally landed the hook synchronization branch
<mramm> we expect 1.1 final to land in plenty of time, and the earlier we propose the easier it is
<davecheney> mramm: that will require deviating from the upstream
<davecheney> which I have no problem doing
<mramm> yea
<thumper> snarky... superb... slimey...
<davecheney> but sounds like that isn't what we do (tm)
<thumper> what was S again?
<davecheney> surly
<thumper> not sweet
<thumper> I don't want to look it up
<thumper> but instead batter things around until it floats to the top of my memory
<davecheney> surly simian or something
<thumper> definitely a salamander
<thumper> not sticky
<thumper> which reminds me of a joke
<davecheney> stinky subhuman
<thumper> "What is brown and sticky"
<davecheney> 2013/04/26 01:53:23 NOTICE worker/provisioner: started machine 307 as instance 1582617
<mramm> stout sea-urchin?
<thumper> a stick
<mramm> haha
<mramm> fyi: https://wiki.ubuntu.com/SReleaseSchedule
<davecheney> hmm, at 300 nodes the main thread on mongod is at 30% duty
<mramm> interesting
<mramm> sounds like some more evidence that we will need an internal API sooner rather than later
<davecheney> mramm: it's all the reconnection and ssl handshaking from the clients probing
<mramm> does it settle down after they have connections established?
<davecheney> mramm: no
<davecheney> this is a constant load
<davecheney> the polling is every 2 ? minutes
 * davecheney goes and checks
<thumper> so changing to use the api internally should reduce the load here?
<thumper> or will it still be high
<thumper> just because of the number of clients?
<davecheney> thumper: lower, i would hope
 * thumper nods
<davecheney> the polling is internal to the mongo driver
<davecheney> the driver will poll all the known services in the replica set every 180 seconds at least
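At the 2000-node target, that 180-second poll cycle implies a steady handshake load on mongod's single accept/TLS thread. A rough estimate (assuming one TLS handshake per agent per poll, which is what the observed behaviour suggests):

```shell
#!/bin/sh
# Steady-state handshake load implied by the mgo driver's replica-set polling:
# each agent re-polls roughly every 180 seconds, and each poll costs the
# mongod main thread a TLS handshake.
agents=2000
poll_interval_s=180
handshakes_per_s=$(( agents / poll_interval_s ))
echo "~${handshakes_per_s} TLS handshakes/sec at ${agents} agents"
```

At ~11 handshakes per second on one thread, the 30% duty cycle seen at 300 nodes scales into saturation territory, which is the motivation for moving agents onto an internal API.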
<davecheney> 2013/04/26 02:13:11 NOTICE worker/provisioner: started machine 406 as instance 1582971
<davecheney> might have to go to lunch at this rate
<davecheney> hmm, 20 mins per 100 instances
<davecheney> not bad
<mramm> yea, that's not too bad at all
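The arithmetic behind that rate, for reference (the later "7 hours" estimate reflects the rate degrading as the run progressed, so this is a lower bound):

```shell
#!/bin/sh
# Provisioning-time estimate from the observed rate of 100 instances per
# 20 minutes. Integer math; this is a floor, since the rate slowed later.
instances=2000
per_batch=100
mins_per_batch=20
total_mins=$(( instances / per_batch * mins_per_batch ))
echo "${total_mins} minutes (~$(( total_mins / 60 )) hours) to start ${instances} instances"
```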
<thumper> davecheney: going up to 2000?
<davecheney> f' yeah
<davecheney> hp are anxious to have their capacity back
<davecheney> so no pussy footing around
<davecheney> oooh
<davecheney> ubuntu@juju-hpgoctrl2-machine-0:~$ juju debug-log 2>&1 | grep TLS
<davecheney> juju-goscale2-machine-281:2013/04/26 02:42:08 ERROR state: TLS handshake failed: local error: unexpected message
<davecheney> juju-goscale2-machine-444:2013/04/26 02:42:11 ERROR state: TLS handshake failed: local error: unexpected message
<davecheney> juju-goscale2-machine-160:2013/04/26 02:42:07 ERROR state: TLS handshake failed: local error: unexpected message
<davecheney> juju-goscale2-machine-405:2013/04/26 02:42:10 ERROR state: TLS handshake failed: local error: unexpected message
<davecheney> juju-goscale2-machine-162:2013/04/26 02:42:11 ERROR state: TLS handshake failed: local error: unexpected message
<davecheney> doesn't appear to be affecting things
<davecheney> instance creation time is slowing, 2013/04/26 04:17:37 DEBUG environs/openstack: openstack user data; 2712 bytes
<davecheney> 2013/04/26 04:17:52 INFO environs/openstack: started instance "1584731"
<thumper> davecheney: by how much?
<davecheney> not sure, i'd have to get the whole logs
<davecheney> but the bootstrap node is nearly out of memory
<davecheney> and starting to swap
<davecheney> i'm having a look to see if I can change the instance type of the bootstrap node
<davecheney> need at least 4x more ram to make it to 2000
<m_3> davecheney: can we `juju bootstrap --constraint='instance-type=standard.large'` or something?
<davecheney> m_3: not sure
<davecheney> there is something in the openstack logs that says the instance type is being hard coded
<m_3> oh, yeah, there's --constraints on bootstrap according to help
<davecheney> i'm going to grab the log and kill this test
<m_3> oh... didn't realize it was hard-coded... never tried anything other than standard.small on hp
<davecheney> i've seen enough to know it's not going to make it
<m_3> still great info
<m_3> got it to the point where it's swapping
<davecheney> m_3: will post my notes on this run
<m_3> so it's probably safest to keep the environment defaulted to standard.small and then do a special bootstrap
<davecheney> m_3: how do we advise customers to size their bootstrap node
<m_3> btw, we should do a special hadoop-master too
<davecheney> m_3: wanna take a look while i'm grabbing the logs ?
<m_3> lemme check my notes
<m_3> I stuck the heap-size config about halfway through http://markmims.com/cloud/2012/06/04/juju-at-scale.html
<m_3> we just need to test out if the openstack provider will take the --constraints="instance-type=xxx" on bootstrap
<m_3> those were mediums though
<m_3> in ec2
<m_3> but whatever, the big one is the bootstrap node for now... the hadoop job doesn't actually have to run atm
 * m_3 looks back for the dang ip
<davecheney> 15.185.162.247
<davecheney> ubuntu@juju-hpgoctrl2-machine-0:~$ scp -C 15.185.162.247:/var/log/juju/all-machines.log all-machines-2000-node-test-20130426.log
<davecheney> Permission denied (publickey).
<davecheney> why is this being a son of a bitch
<davecheney> oh hang on
<davecheney> ok, i'm going to destroy this environment
<m_3> rsync -azvP -e'juju ssh -e ...'
<davecheney> got it
<m_3> so we prob wanna do standard.xlarge
<m_3> can maybe do a standard.large, but might as well do the bootstrap at xlarge
<m_3> `nova flavor-list` describes them all
<davecheney> m_3: we'll probably have to do a set-config after we boot
<davecheney> but I need to do some screwing with the bootstrap node to make mongo scale
<m_3> ah, ok
<davecheney> unless you want to boot everthing as an xlarge
<davecheney> which might get me a bollocking
<m_3> davecheney: no, we only have perms on standard.small over normal limits
<m_3> davecheney: so I think we leave the environment using default-instance-type: standard.small
<m_3> davecheney: but try to use a constraint with the bootstrap
<m_3> davecheney: are you thinking that won't work?
<m_3> davecheney: sorry, I think I screwed up your scp... please check it
<davecheney> nah it's ok
<davecheney> don't worry i got the scp
<m_3> k
<davecheney> lets try the --constraint option
<davecheney> it's 3pm in AU now
<davecheney> i'm going to destroy this env and start again
<m_3> hell, I guess the easiest thing to do is first of all
<davecheney> i don't want to leave it running overnight
<m_3> deploy another service with a constraint
<m_3> yeah, we don't need to leave it up for anything
<m_3> I was just thinking we could test out the constraint thing pretty quickly
<m_3> but it'll be interesting to see how long the destroy takes :)
<m_3> ha
<m_3> davecheney: it still looks like it's spawning shit
<davecheney> yup, destroy works backwards
<davecheney> i'll stop the PA
<davecheney> stopped
<m_3> davecheney: so do we have to kill them via nova now?
<davecheney> m_3: if we have too, that is a bug
<davecheney> destroy means destroy, not do your best :)
<m_3> yup, but do the services you just killed have to be up throughout destroy?
 * m_3 doesn't know if destroy needs the db to get instance-ids
<m_3> davecheney: crap, just tried to bootstrap on another hp acct... doesn't respect the instance-type constraint
<davecheney> m_3: I suspected that
<m_3> davecheney: know the syntax for "mem>=16GB"
<m_3> ?
<davecheney> thumper: ?
<davecheney> m_3: our constraints support is very basic
<m_3> oh, looks like it's trying on a 'mem=16G'
<davecheney> wallyworld_: any ideas ?
<m_3> nice, I got past the basic validation it looks like... got a "no tools available"
<davecheney> --upload-tools ?
<wallyworld_> davecheney: about?
<davecheney> wallyworld_: we're trying to bootstrap an env with a larger bootstrap node
<m_3> davecheney: we can try from the ctrl instance... my laptop's off of the 1.10 distro package
<wallyworld_> on ec2 i assume
<davecheney> try from the control instance
<m_3> davecheney: nice, they're dying... slowly
<davecheney> we could kill them all with nova
<davecheney> probably not worth it
<davecheney> it'll be done in a few mins
<m_3> davecheney: yup
<m_3> once they're dead, we can try the constraint on bootstrap
<wallyworld_> davecheney: so you are typing something like this?  juju bootstrap --constraints "mem=4G"
<davecheney> wallyworld_: y
<wallyworld_> and it's not working?
<m_3> davecheney: I like that it blocks
<davecheney> ec2 blocks as well
<davecheney> but ec2 lets you just say 'delete these 1000 instance id's'
<m_3> ack
<davecheney> it looks like openstack makes you do them one at a time
<m_3> wallyworld_: not sure yet
<m_3> that's surprising
 * wallyworld_ has to go get kid from school
<m_3> might be worth filing it as a bug on the openstack provider
<davecheney> or at least a whinge
<m_3> davecheney: well, I spoke too soon :)
<m_3> it finished with instances still active
<davecheney> FAIL!
<m_3> maybe a timeout
 * davecheney embuginates
<davecheney> nup just raw fail
 * m_3 cheers from the sidelines
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1170210
<_mup_> Bug #1170210: environs/openstack: destroy-environment leaks machines in hpcloud <juju-core:Triaged> <https://launchpad.net/bugs/1170210>
<davecheney> here is one I apparently prepared earlier
<davecheney> m_3: ubuntu@juju-hpgoctrl2-machine-0:~$ nova list
<davecheney> +---------+---------------------------+------------------+--------------------------------------+
<davecheney> |    ID   |            Name           |      Status      |               Networks               |
<davecheney> +---------+---------------------------+------------------+--------------------------------------+
<davecheney> | 1465097 | juju-hpgoctrl2-machine-0  | ACTIVE           | private=10.7.194.166, 15.185.162.247 |
<davecheney> | 1565949 | juju-goscale2-machine-37  | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89   |
<davecheney> | 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83  |
<davecheney> | 1581727 | juju-goscale2-machine-5   | ACTIVE(deleting) | private=10.7.30.60, 15.185.168.253   |
<davecheney> +---------+---------------------------+------------------+--------------------------------------+
<davecheney> can you email that list to antonio and ask hp to find out why those won't delete
<m_3> oh, same stuck ones?
<davecheney> -5 is a new one from this round
<davecheney> -37 and -239 were stuck from tuesday
<m_3> ack
<m_3> sent
<davecheney> 2013/04/26 05:16:11 WARNING environs/openstack: ignoring constraints, using default-instance-type flavor "standard.small"
<davecheney> ^ this is what I was afraid of
<davecheney> wallyworld_: any way to hack around this ?
<m_3> crap
<m_3> davecheney: we could turn off the 'default' in the environment
<davecheney> m_3: i suspected that would happen, but lacked the words to express it
<m_3> then see what happens with a few
<m_3> or explicitly set the constraint for smalls too
<davecheney> i like how fast bootstrap happens in hp cloud
<davecheney> usually < 1 min
<davecheney> so much better than AWS plodding
<m_3> davecheney: yup... lots faster
<davecheney> m_3: hang on, let me fuck with it for a sec
<davecheney> ahh, you're doing what I was going to do :)
<m_3> shit, sorry
<davecheney> nah, you're good
<davecheney> that was what I was going to do
<davecheney> m_3:  do you wanna do a hangout for a bit ?
<davecheney> or is it a bit late in your local TZ ?
<m_3> davecheney: yeah, I should stop screwing around and hit the sack :)
<davecheney> go, flee, run wild, etc
<davecheney> sam is in perth this weekend
<m_3> hotel room with the wife asleep so can't do voice atm
<davecheney> so i'm going to hack on this all weekend
<davecheney> (not to mention drink scotch)
<m_3> :)
<m_3> ok, yeah, it doesn't look like our experiment was working anyways
<m_3> might not be hard to change the constraint "override" code though
<davecheney> I FIXED IT WITH SCIENCE !
<davecheney> m_3: ok, i got the environment setup the way we want
<davecheney> but forgot to goose mongo
<davecheney> lemme do that again
<davecheney> m_3: hey, machine 5 is dead :)
<davecheney> that is nice bonus
<m_3> oh, cool
<davecheney> please watch closely, there is nothing up my sleeves
<m_3> haha
<m_3> so you're gonna default to xlarge, then explicitly ask for 'mem=2G' for slaves?
<davecheney> m_3:  will know in a second
<davecheney> the environment config should default to .smalls
<m_3> sweeet
<m_3> nice
<davecheney> thank thumper for set-config
<m_3> ah
<davecheney> m_3: the rule is, once you've bootstrapped, most of the values in environments.yaml are ignored
<davecheney> the active values are in the state
<davecheney> ohh dear, it shouldn't show you all those things :)
 * m_3 was wanting set-config in juju-0.6 earlier this week
<m_3> ha
<m_3> well, yes
<davecheney> sorry, the command is set-environment
<m_3> it shouldn't
<davecheney> but its operation is straightforward
<m_3> understood... I was actually wanting set-config :)... but thought maybe the tool did both
<davecheney> we have set-config as well
 * m_3 happy camper
<davecheney> um, at least I thought we did
<m_3> just get
<davecheney> oh yeah
<m_3> `juju get hadoop-slave`
<m_3> no filtering it looks like
<davecheney> yeah, i blame myself
<m_3> I sooo want a "preload-packages" or the equiv
<davecheney> m_3: what would that do ?
<m_3> charm metadata level as well as environment level
<m_3> install packages before calling any hooks
<davecheney> ah, via cloud init (sorta)
<davecheney> so all the hook install commands would be no-ops
<m_3> even later would be fine
<davecheney> MUCHA PARALLELA
<davecheney> 2013/04/26 05:48:16 DEBUG environs/openstack: openstack user data; 2710 bytes
<davecheney> 2013/04/26 05:48:29 INFO environs/openstack: started instance "1585513"
<davecheney> 13 seconds to bootstrap an instance
<davecheney> thumper: i was wrong, this didn't significantly change with 1000 instances running
<m_3> davecheney: it's moving now...
<m_3> what, thought the per-instance startup time was changing?
<davecheney> it went a up a little as mongo started to swap
<davecheney> not significantly
<m_3> ack
<m_3> 5/min atm
<m_3> ish
<davecheney> the hold back time from openstack's rate limiting affects that
<davecheney> bc says 7 hours to bootstrap 2000 instances
<davecheney> faaaaaaaaaaaaaaark
<davecheney> you only get 4 cpus with the 16gb instance
<davecheney> that is pretty tight
<m_3> davecheney: where's htop on the bootstrap?
<davecheney> #6
<davecheney> fun fact, mongo supports a --maxConns flag
<davecheney> which defaults to 20,000
<davecheney> but that is gated by 80% of the current number of file descriptors
<m_3> huh
 * davecheney quietly expects mongodb to assplode at 10k connections
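The gating described above means raising mongod's connection count is really a file-descriptor exercise: `--maxConns` defaults to 20,000 but is capped at roughly 80% of the process's open-file limit. A sketch of the effective cap:

```shell
#!/bin/sh
# Compute the connection cap mongod will actually honour: min(--maxConns
# default of 20000, ~80% of the open-file limit). Raising connections
# therefore means raising `ulimit -n` (e.g. via the upstart job) first.
fd_limit=$(ulimit -n)
case "$fd_limit" in
    *[!0-9]*) fd_limit=20000 ;;   # e.g. "unlimited"; fall back for the sketch
esac
effective_cap=$(( fd_limit * 80 / 100 ))
if [ "$effective_cap" -gt 20000 ]; then
    effective_cap=20000
fi
echo "fd limit ${fd_limit} -> effective mongod connection cap ~${effective_cap}"
```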
<davecheney> m_3: juju-goscale2-machine-0:2013/04/26 05:55:05 NOTICE worker/provisioner: started machine 85 as instance 1585607
<davecheney> juju-goscale2-machine-0:2013/04/26 05:55:05 INFO worker/provisioner: found machine "86" pending provisioning
<davecheney> this is an interesting log line
<m_3> davecheney: I didn't catch your startup... are these related to a master?
<davecheney> sorry, say again
<m_3> did you deploy this from 'bin/hadoop-stack'?
<davecheney> yeah
<m_3> or just deploy -n?
<davecheney> with -n1975
<m_3> ok, cool
<m_3> wanna catch the master address... shit, status doesn't take any filters either though
<davecheney> that log line above shows how the PA works
<davecheney> 15.185.161.62
<davecheney> what is the port ?
<m_3> davecheney: yeah, that looks like what we'd expect to me
<m_3> 50070
<davecheney> using nova list is cheating, but whateva
<m_3> 80 nodes registered
<m_3> this'd be really hard to test without novaclient
<m_3> damn, this is looking great right now
<davecheney> m_3: so i'm trying to drag myself into the 90's and use tmux
<davecheney> but there is one thing that i can't figure out
<davecheney> when i C-a etc
<davecheney> sometimes it is like the ^C is ignored
<m_3> hmmmm not sure what you mean
<m_3> you're trying to ctrl-c a process you mean?
<davecheney> no, cntl-a n
<m_3> ctrl-a hangs waiting for a followup keypress
<davecheney> yeah
<m_3> there's a timeout setting I think
<davecheney> it feels like that
<davecheney> m_3: anyway
<davecheney> it looks like mongo does all its tls negotiation on the main thread
<davecheney> then spawns a worker thread
<davecheney> which is a bit lame
<m_3> I'll often find myself switching to another window as a no-op if I change my mind or get lost in a ctrl-a sequence
<davecheney> rather than accepting the connection and handling it in a thread
 * m_3 not surprised that something like tls integration is half-baked
<davecheney> at 900 machines running, the main thread was busy 90% of the time handling all the reconnections from the driver
<m_3> yeah
<davecheney> i expect that to get a bit shit at 2,000 nodes
<m_3> yup
<m_3> not sure how to get around that one
<davecheney> as william said, the fix is moving the ws api out to the agents
<m_3> yeah, but that's a huge change though right?
<davecheney> it's a lot of work, but conceptually it's straightforward
<m_3> right
<m_3> a fix
<m_3> not so much a workaround :)
<davecheney> everything talks to the state via a set of types which convert between mongo documents and data structures
<davecheney> so it would just be a different conversion
<davecheney> watchers are, as always, the tricky bit
<m_3> true dat
<davecheney> m_3: what happens if I deploy the juju-gui on this environment ?
<m_3> don't know if juju-gui talks to juju-1.10 api yet... does it?
<m_3> shit, we can try :)
<davecheney> m_3: gary poster said it did about 5 hours ago
<davecheney> who am I to doubt that lovely man
<davecheney> fuck, we'll have to wait 8 hours for that to be provisioned
<m_3> now your nova trick won't work this time :)
<davecheney> shitter
<davecheney> well this is fun, for relative values of fun
<davecheney> bugger, i should have deployed the gui first
<davecheney> hmm, i'll do that on the next run
<m_3> hmmmm... brain's getting fuzzy... but maybe there's a way to point the juju-gui to an api server via config
<m_3> i.e., from another env
<davecheney> probably
<davecheney> it won't use a relation
<davecheney> because the api server is not a service
<davecheney> (although it should be)
<m_3> nah, doesn't look like it in the charm
<davecheney>   juju-gui:
<davecheney>     charm: cs:precise/juju-gui-46
<m_3> i.e., no config for api server
<davecheney>     exposed: true
<davecheney>     units:
<davecheney>       juju-gui/0:
<davecheney>         agent-state: pending
<davecheney>         machine: "1999"
<davecheney> GLWT
<m_3> 1999
<m_3> sweet
<m_3> btw, the gui for this will be pretty un-interesting
<m_3> two boxes
<m_3> hadoop-master and hadoop-slave
<m_3> two lines between them
<davecheney> i bet it crashes my browser
<m_3> but yes, it'd still be neat to see
<m_3> haha
<m_3> well, yeah... maybe that too
<m_3> although kapil had a simulator mock thingy set up
<davecheney> that is true
<m_3> he may've done some scale testing with that
<davecheney> that can simulate infeasibly large environments
<m_3> most likely problem would be timeouts
<m_3> maybe
<m_3> while the api server chokes
<m_3> davecheney: sweet... that's thumping along
<davecheney> m_3: that is what I am thinking, it'll be lugging around the data for thousands of relations
<davecheney> yup
<m_3> davecheney: ok, well I think I'm gonna hit the sack then
<davecheney> yeah
<m_3> davecheney: you want me to do anything on the flipside?
<davecheney> this is as thrilling as watching paint dry
<davecheney> if anything eventful happens i'll put it in an email
<m_3> davecheney: or well just send me email if you get eod and want me to do something
<davecheney> i won't leave it running past about 11pm tonight
<davecheney> we should be pretty close to 2000 nodes by then
<davecheney> 7 hours really isn't fast enough for this
<davecheney> how long did it take for the ec2 2k node test ?
<m_3> k... I'm on UTC-7 for the next two weeks
<m_3> bout 7hrs iirc
<m_3> was split up a bit in the big run
<m_3> did 1000, tested job runs on that cluster
<davecheney> m_3: i'll see you in -7 on the 5th
<m_3> then cleaned out the hdfs and added 1000 more
<m_3> but I think that was 7hrs total
<davecheney> booooooooooooooring
<m_3> there were a few white russians involved too :)
<davecheney> a capital idea!
<m_3> :)
 * davecheney considers scouting for dinner
<m_3> davecheney: k, well goodnight fine sir
<davecheney> later mate
<davecheney> enjoy this port - land
<davecheney> rogpeppe: can you help with a juju-gui question ?
<rogpeppe> davecheney: perhaps...
<rogpeppe> davecheney: a question from you about juju-gui, or a question from the juju-gui team?
<davecheney> how to login to the bugger
<rogpeppe> davecheney: sorry, didn't see your question...
<rogpeppe> davecheney: if you want me to see something, you need to mention my irc handle...
<rogpeppe> davecheney: you use your admin secret
<rogpeppe> davecheney: have you tried it and had it fail?
<davecheney> rogpeppe: yeah, tried and failed
<davecheney> is there a length limit ?
<rogpeppe> davecheney: i don't think so
<rogpeppe> davecheney: hmm, let me try it. remind me of the charm url of the gui charm, please?
<davecheney> https://15.185.163.105/
<davecheney> ^ this is the deployed gui
<davecheney> ubuntu@15.185.162.247
<davecheney> is the machine that bootstrapped
<davecheney> rogpeppe: your key is already on that machine
<davecheney> so you should be able to recover the admin password
<rogpeppe> davecheney: actually, i was going to try deploying it, and couldn't remember the charm url
<rogpeppe> davecheney: but i'll try logging in to yours too
<davecheney> sorry this one is already deployed
<davecheney> rogpeppe: it's doing a 2000 machine bootstrap
<davecheney> so deploying another will take another 7 hours
<rogpeppe> davecheney: i want to see if i can reproduce the problem on a smaller env
<davecheney> kk
<davecheney> i just do juju deploy juju-gui
<davecheney> juju expose juju-gui
<davecheney> just followed garys instructions from his email
<rogpeppe> davecheney: i don't see any gui charm deployed on that machine
<rogpeppe> davecheney: and the error messages in machine.log look like they're not in the current juju tree
<davecheney> that machine is not inside the environment
<davecheney> rogpeppe: but you can use that machine to recover the admin secret for the goscale2 environment
<rogpeppe> davecheney: ah, ok; i thought you said it was the deployed gui
<davecheney> rogpeppe: the gui uri is https://15.185.163.105/
<rogpeppe> davecheney: sorry, i got muddled
<davecheney> rogpeppe: yeah, sorry, this is very confusing
<davecheney> we're running an environment within an environment
<davecheney> 'cos that is how m_3 rolls
<rogpeppe> davecheney: i sometimes do that too
<rogpeppe> davecheney: at some point i'll run up a "juju-dev" charm that provides a full juju-core dev environment
<davecheney> that is a great idea
<davecheney> screw local mode
<rogpeppe> davecheney: i've done it manually before, but it's a hassle; just what charms are for
<rogpeppe> davecheney: ok, so login fails for me too
<davecheney> weird eh
<rogpeppe> davecheney: any chance you could add my key to the gui node?
<rogpeppe> davecheney: ah, i can probably ssh from the bootstrap node
<davecheney> rogpeppe: yes
<davecheney> juju ssh 1
<rogpeppe> davecheney: is there any way we can get ssh to only *temporarily* add hosts. the "permanently added" thing seems wrong
<rogpeppe> davecheney: and i just saw this message, which is probably related: http://paste.ubuntu.com/5603807/
<davecheney> rogpeppe: unrelated
<davecheney> we've been creating and destroying machines all day
<rogpeppe> davecheney: ah, ok
<davecheney> so ip addresses have been reused
<davecheney> and have left stale entries in the ssh knownhosts file
 * davecheney has created on the order of 1600 machines today
<rogpeppe> davecheney: that sounds like exactly what i was talking about, no?
<rogpeppe> davecheney: isn't the "permanently added" thing talking about adding to the knownhosts file?
<davecheney> rogpeppe: that is correct
<davecheney> i think i meant to say 'that warning is not serious'
<rogpeppe> davecheney: oh, i realise that
<rogpeppe> davecheney: but if ssh wasn't adding to the known hosts file, we wouldn't see that message
<davecheney> it won't add it a second time
<davecheney> the warning is the ip address exists in the file, with a different fingerprint
<davecheney> because we pass -o ignorehostwarning or something to ssh it carries on anyway
<rogpeppe> davecheney: yeah; basically i don't want to say "i know this ip address" forever because ip addresses are totally transitory in the juju env
<davecheney> rogpeppe: bingo
<davecheney> rogpeppe: i'll forward you my notes from the first 1000 machines
<davecheney> rogpeppe: i didn't bother to send that to william, he's got enough on his plate
<davecheney> the amount of memory mongo uses per connection is obscene
<rogpeppe1> davecheney: last thing i saw was:
<rogpeppe1> [09:27:39] <davecheney> rogpeppe: i'll forward you my notes from the first 1000 machines
<davecheney> 18:32 < davecheney> rogpeppe: i didn't bother to send that to william, he's got enough on his plate
<davecheney> 18:33 < davecheney> the amount of memory mongo uses per connection is obscene
<davecheney> that is all I said
<davecheney> 'cos you were ignoring me :)
<rogpeppe1> davecheney: occupational hazard of going through a mobile data connection
<davecheney> rogpeppe1: do you think they will reconnect your part of england to the internet in the near future ?
<rogpeppe1> davecheney: no prospect in the near future
<davecheney> rogpeppe1: shitter
 * davecheney steps outside to order some dinner
<rogpeppe1> davecheney: the fault is somewhere in 200m of underground cable
<rogpeppe1> davecheney: and they have to get planning to dig it up
<rogpeppe1> davecheney: i'd like to see your notes BTW
<rogpeppe1> davecheney: you might've missed this BTW:
<rogpeppe1> [09:31:31] <rogpeppe> davecheney: ah, this looks like a problem: http://paste.ubuntu.com/5603842/
<rogpeppe1> [09:32:57] <rogpeppe> davecheney: oops, missed one redaction
<davecheney> rogpeppe1: if you're looking at the output of juju get-environment
<davecheney> yeah, i think we left our flies open a bit
<rogpeppe1> davecheney: i removed most of the passwords; but i've no idea what that one was from - third attempt, looks like
<rogpeppe1> davecheney: unfortunately there seems no way to deliberately delete a paste
<rogpeppe1> davecheney: before the crawlers find it
<davecheney> rogpeppe1: s'ok, i'll change the admin secret
<rogpeppe1> aw shucks, "juju deploy juju-gui --force-machine 0" doesn't work
<rogpeppe1> davecheney: that wasn't the admin secret
<davecheney> will fix
<davecheney> rogpeppe1: as penance, you need to fix that bug :)
<rogpeppe1> davecheney: i'm looking
<rogpeppe1> davecheney: i'll try to reproduce it first. please don't take down that environment for the time being (not that there's much danger, i think)
<davecheney> rogpeppe1: np
<rogpeppe1> davecheney: interesting minor bug: http://paste.ubuntu.com/5603887/
<davecheney> no you can't do that, oh, ok, if you must
<rogpeppe1> davecheney: no, it's not done - the unit is left around unassigned
<davecheney> oh
<davecheney> interesting
<rogpeppe1> davecheney: you have to manually destroy the unit then add another one
<rogpeppe1> davecheney: https://bugs.launchpad.net/juju-core/+bug/1173089
<_mup_> Bug #1173089: deploy can fail partially <juju-core:New> <https://launchpad.net/bugs/1173089>
<davecheney> bzzt
<rogpeppe1> davecheney: hmm, the gui works ok for me
<davecheney> rogpeppe1: poop
<davecheney> why can't i login to my deployment ?
<rogpeppe1> davecheney: here's an idea: kill the machine agent
<rogpeppe1> davecheney: and see if it works when it starts again
<davecheney> ok
<rogpeppe1> davecheney: 'cos that EOF error is really weird
<rogpeppe1> davecheney: i'm hoping that we will still see the error when it restarts
<rogpeppe1> davecheney: because then there's the possibility of upgrading the binaries with some updated logging and better error messages.
<rogpeppe1> davecheney: and finding out what's really going on
<rogpeppe1> davecheney: the only possibility that i can think of currently is that the connection to the mongo server has failed
<rogpeppe1> davecheney: i *wish* we annotated our errors more
<rogpeppe1> davecheney: if my theory is correct, that EOF error comes from about 6 levels deep and hasn't been given any context at all
<davecheney> rogpeppe1: is this on the api server, or the state/mongo server?
<rogpeppe1> davecheney: on the api server
<davecheney> right
<rogpeppe1> davecheney: if i had my way, there would be almost no if err != nil {return err} occurrences in our code
<rogpeppe1> davecheney: i lost that argument ages ago, but problems like this really show how bad our current conventions are
<davecheney> rogpeppe1: i'm starting to be convinced
<davecheney> and i think it can be reopened
<davecheney> times they have a-changed
<rogpeppe1> davecheney: my comment (the last one) on this post is a reasonable representation of my thoughts on the matter: http://how-bazaar.blogspot.co.nz/2013/04/the-go-language-my-thoughts.html
 * davecheney reads
<davecheney> rogpeppe1: the main mongo thread is now using more than 100% CPU
 * rogpeppe1 is not surprised
<davecheney> it looks like mongo handles the accept(2) and the tls handshake on the main thread
<davecheney> so every 30 seconds we get a storm of agents sniffing around
<rogpeppe1> davecheney: oh god
<davecheney> and the cpu wedges
<davecheney> only once it has done the handshaking does it hand off the connection to a new thread
<rogpeppe1> davecheney: we should try with a much much longer time interval there
<rogpeppe1> davecheney: 30s is ridiculous
<davecheney> it's not 30s
<davecheney> but that appears to be the resonant frequency of the polling interval
<davecheney> it's 180s or whenever they need to do a sync (that is what mgo calls it)
<davecheney> whichever is the sooner
<rogpeppe1> davecheney: ah i see. the usual self-synchronising clock thing
<davecheney> yeah, that isn't all 650 agents at once
<davecheney> but a swarm of them
 * rogpeppe1 loves emergent patterns
 * davecheney does not
<rogpeppe1> davecheney: it's the joy of the universe, maaan
<rogpeppe1> davecheney: does that blog comment make sense to you BTW? i have the impression that noone gets what i'm trying to say there.
 * rogpeppe1 is not good at rhetoric
<davecheney> rogpeppe1: i agree with your position
<davecheney> i think we talked about this a year ago
<davecheney> waiting for the computer history museum to open
<rogpeppe1> davecheney: ah yes, i remember
<davecheney> and now with the benefit of some history
<davecheney> i agree
<davecheney> well, i always agreed
<davecheney> but this is an excellent case
<rogpeppe1> davecheney: i might put a post together for juju-dev
<rogpeppe1> davecheney: 9 levels deep and still diving
<davecheney> rogpeppe1: remember to stop on the way back up and repressurise to avoid the bends
<rogpeppe1> davecheney: lol
<davecheney> don't go james cameron on me man
<rogpeppe1> davecheney: bottomed out at 12
<davecheney> 64 bit process
<rogpeppe1> davecheney: if we reported a stack trace, as some suggest, it would show only the bottom 2 levels
<rogpeppe1> davecheney: http://paste.ubuntu.com/5604054/
<rogpeppe1> davecheney: actually, there's probably another layer at the top
<rogpeppe1> davecheney: here's the complete stack: http://paste.ubuntu.com/5604064/
<davecheney> rogpeppe1: shit
<rogpeppe1> davecheney: one easy thing to do is to actually hook up the mgo logging
<rogpeppe1> davecheney: then that logf at the bottom would actually have printed something
<davecheney> rogpeppe1: is that hard to do ?
<rogpeppe1> davecheney: trivial
<rogpeppe1> davecheney: a one-line change
<rogpeppe1> davecheney: or one or two more if we want nicely formatted messages
<davecheney> rogpeppe1: a single thread is now using 209% CPU on the bootstrap node ...
<rogpeppe1> davecheney: is that possible?
<davecheney>   PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
<davecheney>  9611 root       20   0 8169M 1770M     0 S 194. 11.0  1h40:55 /usr/bin/mongod --auth --dbpath=/var/lib/juju/db
<davecheney> really, it is
<rogpeppe1> davecheney: i thought a thread was... single threaded
<rogpeppe1> davecheney: or do you mean a single process (with several threads inside) ?
<davecheney> rogpeppe1: this is using htop so it should be per thread
<davecheney> i cannot explain it
<davecheney> apart from observing it is large
<davecheney> ohh, and now I can see a lot of blocking on the mongo side
<davecheney> and that is only 800 machines
<davecheney> sorry, 888
<davecheney> Apr 26 10:21:44 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 10:21:44 [conn84734] query presence.presence.pings query: { $or: [ { _id: 1366971690 }, { _id: 1366971660 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:763142 nreturned:2 reslen:744 381ms
<davecheney> rogpeppe1: i'm assuming these are 'slow queries'
<davecheney> they only start to show up in the log at the 800 machine mark
<rogpeppe1> davecheney: wow, does that reslen value mean the query has been waiting for 12 minutes to be processed?!
<davecheney> i don't think so
<davecheney> i don't think it is 744,381 ms
<davecheney> surely it is 744 bytes after 381 ms
<rogpeppe1> davecheney: yeah, probably
<davecheney> rogpeppe1: Apr 26 10:56:20 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 10:56:20 [conn50284] query presence.presence.pings query: { $or: [ { _id: 1366973760 }, { _id: 1366973730 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:911100 nreturned:2 reslen:792 501ms
<rogpeppe1> davecheney: latency rises...
<davecheney> not really sure what that is showing me yet
<davecheney> it's sort of a CAS, isn't it?
<davecheney> Apr 26 11:02:02 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 11:02:02 [conn6275] query presence.presence.pings query: { $or: [ { _id: 1366974120 }, { _id: 1366974090 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:1413393 nreturned:1 reslen:406 768ms
<davecheney> but yes, they certainly rise
<davecheney> what is the heartbeat for presence ?
<davecheney> we should put some thought into avoiding harmonic feedback in all these periodic loops
<davecheney> shit, we're not even at 1000 instances
<davecheney> it's been running for 3 hours ...
<davecheney> testing this thing is a job for life :)
<dimitern> rogpeppe1: hey, how about a suggestion about better help doc for upgrade-charm --switch?
<davecheney> rogpeppe1: http://paste.ubuntu.com/5604256/
<davecheney> at the 1000 node mark, the api server is unusable
<rogpeppe1> dimitern: ah, will do. sorry, bit distracted currently as some old pipes have just sprung a leak in our kitchen and i've had to turn the main water supply off
<davecheney> or something maybe mongo
<dimitern> rogpeppe1: wow..
<davecheney> maybe the thing after that
<davecheney> crap
<rogpeppe1> davecheney: isn't the mongo, not the API server?
<rogpeppe1> s/the/that/
<davecheney> rogpeppe1: really not sure
<dimitern> rogpeppe1: "To manually specify the charm URL to upgrade to, use the --switch argument.
<dimitern> It will be used instead of the service's current charm newest revision.
<dimitern> Note that the given charm must be compatible with the current one, e.g.
<davecheney> i guess it is looking in the db
<dimitern> it must not remove relations the service is currently participating in,
<dimitern> and no settings types can be changed. This *is dangerous* and you should
<dimitern> know what you are doing."
<davecheney> to find the address of the instance
<davecheney> it could also be blocked waiting for the provider to return some data
<davecheney> but we've used up all our quota with the provider
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer:  | Bugs: 2 Critical, 64 High - https://bugs.launchpad.net/juju-core/
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer:  | Bugs: 3 Critical, 63 High - https://bugs.launchpad.net/juju-core/
<dimitern> wallyworld_: mumble?
<wallyworld_> dimitern: i just got back from soccer, i'll be a minute
<rogpeppe1> dimitern: can an upgraded charm have less config settings than the old one?
<dimitern> rogpeppe1: let me check
<davecheney> does anyone know if nova list has a limit on the number of rows it returns ?
<davecheney> https://bugs.launchpad.net/nova/+bug/1166455 ?
<_mup_> Bug #1166455: nova flavor-list only shows 1000 flavors <prodstack> <OpenStack Compute (nova):Invalid> <python-novaclient:Fix Committed by gtt116> <nova (Ubuntu):Invalid> <https://launchpad.net/bugs/1166455>
<dimitern> rogpeppe1: well, it seems the old config settings should remain, but you can add new ones
<rogpeppe1> dimitern: ok, that seems good
<rogpeppe1> dimitern: http://paste.ubuntu.com/5604375/
<dimitern> rogpeppe1: sgtm, thanks
<dimitern> rogpeppe1: so how to test both local: and cs: urls? start a http server mocking the store and set that to charm.Store?
<rogpeppe1> dimitern: good question.
<rogpeppe1> dimitern: sorry, still distracted, trying to get hold of a plumber
<dimitern> rogpeppe1: i'll propose it without that, for now
<ahasenack> hi guys, I'm getting this error in the bootstrap node when bootstrapping on canonistack:
<ahasenack> ERROR worker: loaded invalid environment configuration: required environment variable not set for credentials attribute: User
<ahasenack> full logs at http://pastebin.ubuntu.com/5604481/
<ahasenack> any ideas?
<ahasenack> "juju status" on my laptop just hangs
<dimitern> ahasenack: try running juju status --debug -v
<ahasenack> dimitern: hm
<ahasenack> dimitern: http://pastebin.ubuntu.com/5604493/
<ahasenack> security group issue?
<ahasenack> it connects over there (localhost), so there is something listening on that port
<dimitern> ahasenack: it seems it cannot connect to mongo - is it running?
<ahasenack> root@juju-canonistack-machine-0:~# telnet localhost 37017
<ahasenack> Trying 127.0.0.1...
<ahasenack> Connected to localhost.
<ahasenack> Escape character is '^]'.
<ahasenack> something is, I assume it's mongo
<ahasenack> tcp        0      0 0.0.0.0:37017           0.0.0.0:*               LISTEN      27573/mongod
<ahasenack> yep
<dimitern> ahasenack: so you can connect from machine 0 to mongo, but not from outside?
<ahasenack> right
<ahasenack> I'm checking the security group rules
<dimitern> ahasenack: yeah, good idea
<ahasenack> dimitern: ah, I know
<ahasenack> dimitern: the rules are ok
<ahasenack> dimitern: it's the public ip thing, on the private ip only ssh is routed through
<ahasenack> dimitern: I'll fire up sshuttle and that should sort it
<ahasenack> dimitern: yep, worked now, thanks
<ahasenack> the errors in the logs were misleading me
<dimitern> ahasenack: you can also try setting the "use-floating-ip" to true in env config
<ahasenack> yepo
<dimitern> ahasenack: but knowing the shortage of floating ips on canonistack, it might fail anyway
<ahasenack> yes, I will stick with sshuttle, works well enough for my testing
<ahasenack> rogpeppe1: hi, I see that https://bugs.launchpad.net/juju-core/+bug/1172717 is still open, but the branch is merged
<_mup_> Bug #1172717: juju-log does not accept --log-level <juju-core:In Progress by rogpeppe> <https://launchpad.net/bugs/1172717>
<ahasenack> rogpeppe1: is it fixed in trunk?
<rogpeppe1> ahasenack: i think so; let me check
<rogpeppe1> ahasenack: yes
<ahasenack> rogpeppe1: will that trigger a new ppa build? I still only see the version with the bug
<ahasenack> rogpeppe1: also, does it requires a new "tools" build?
<ahasenack> does it require*
<rogpeppe1> ahasenack: i don't think so. i think the patch needs to be back ported
<ahasenack> rogpeppe1: I'm using this ppa: http://ppa.launchpad.net/juju/devel/ubuntu/
<rogpeppe1> ahasenack: we haven't worked out best practice in that respect yet - we're still feeling our way
<ahasenack> I thought that was trunk
<rogpeppe1> ahasenack: the tools still need to be pushed to the public bucket
<rogpeppe1> ahasenack: because that's where they're pulled from, not the ppa
<ahasenack> rogpeppe1: the bug actually depends more on the tools than on the new deb
<ahasenack> ok
<ahasenack> and that does not happen with every commit?
<ahasenack> I guess there needs to be a concept of "stable" and "devel" tools
<rogpeppe1> ahasenack: there is that concept
<rogpeppe1> ahasenack: if the minor version is odd, it's a devel version
<rogpeppe1> ahasenack: i think we probably need to automate our pushing to the public bucket
<ahasenack> rogpeppe1: but are they in separate buckets?
<rogpeppe1> ahasenack: no, there's only one public bucket
<rogpeppe1> ahasenack: (for any given environment, that is)
<ahasenack> ok, so if you push to that bucket with every commit, like a "daily", you risk breaking production users
<ahasenack> with the ppa at least you have a distinction about what is "stable" and what is "devel" or "daily"
<rogpeppe1> ahasenack: only if we push versions with an even minor version number, i think
<ahasenack> rogpeppe1: so how do you test trunk, you use --upload-tools all the time?
<rogpeppe1> ahasenack: the idea is that we always develop against an odd minor version (currently we're developing against 1.11)
<rogpeppe1> ahasenack: yes
<ahasenack> rogpeppe1: like my case now, I was going through all the openstack charms and seeing if they deploy with juju-core trunk, and filing bugs where appropriate (some in openstack charms, some in juju)
<ahasenack> rogpeppe1: but I can't test a "trunk" build of juju-core, because it's not there, I'm stuck with the version with the bug :)
<rogpeppe1> ahasenack: you could use upload-tools
<ahasenack> last time I tried it exploded, I emailed the list
<ahasenack> I will wait for a new package in the devel ppa, and new tools :)
<rogpeppe1> ahasenack: there have been some significant issues fixed since then. it *should* work fine.
<rogpeppe1> ahasenack: in particular, it shouldn't pick incompatible tools if you've uploaded some, which was probably the cause of the explosion before
<ahasenack> rogpeppe1: I think my problem is more basic than that... http://pastebin.ubuntu.com/5604658/
<ahasenack> what does it mean "no go source files"
<rogpeppe1> ahasenack: try go get -v launchpad.net/juju-core/...
<ahasenack> rogpeppe1: the "..." are for real?
<rogpeppe1> ahasenack: there are no source files in the juju-core root directory
<rogpeppe1> ahasenack: yes
<rogpeppe1> ahasenack: it's a wildcard
<ahasenack> !!
<rogpeppe1> ahasenack: from "go help packages": http://paste.ubuntu.com/5604667/
<ahasenack> rogpeppe1: ok, that changes things, thanks, I'll go on from here
<rogpeppe1> ahasenack: if the wildcard was '*', you'd have to quote the names all the time
<rogpeppe1> ahasenack: and '*' usually doesn't match multiple levels of directory
<rogpeppe1> ahasenack: cool; please let us know when things go wrong, or are awkward to understand - it's nice to get feedback from people that aren't used to walking around the holes in the road.
<davechen1y> m_3 ping
<dimitern> i'd appreciate a review on https://codereview.appspot.com/8540050
<ahasenack> rogpeppe1: --upload-tools worked, and I verified that that -l/--log-level bug is indeed fixed
<dimitern> rogpeppe1:  ^^
 * dimitern bbi30m
<rogpeppe1> ahasenack: lovely, thanks for giving it a go
<rogpeppe1> dimitern: ok, will look in a little bit
<rogpeppe1> dimitern: reviewed
<dimitern> rogpeppe1: cheers
<m_3> davecheney: pong
<ahasenack> hi, I got this error when deploying cinder with juju-core, is this a change between pyjuju and gojuju? http://pastebin.ubuntu.com/5605085/
<rogpeppe1> hmm, interesting
<rogpeppe1> ahasenack: do you know what hook that was running in?
<ahasenack> rogpeppe1: install I think, this was just before, and I was really installing it only
<ahasenack> 2013/04/26 15:51:25 DEBUG worker/uniter/jujuc: hook context id "cinder/0:install:79731491855068321"; dir "/var/lib/juju/agents/unit-cinder-0/charm"
<ahasenack> rogpeppe1: wait, let me paste more context
<rogpeppe1> ahasenack: hmm, so which relation did the code expect to be set there?
<rogpeppe1> ahasenack: given that the install hook isn't associated with a relation.
<ahasenack> http://pastebin.ubuntu.com/5605098/
<ahasenack> the install had failed before, i had to run a few juju set foo=bar to fix a config and then resolved --retry
<rogpeppe1> ahasenack: i think we could do with even more context actually
<ahasenack> I'm not sure what it was trying to set
<ahasenack> ok
<ahasenack> let me get the whole file
<ahasenack> rogpeppe1: http://pastebin.ubuntu.com/5605109/
<rogpeppe1> ahasenack: right, it's running the install hook
<rogpeppe1> ahasenack: i think it's reasonable that relation-related commands can fail in that circumstance, but i'd be interested to know what the charm was actually trying to do
<ahasenack> let me see what it does
<rogpeppe1> ahasenack: perhaps we should just ignore untoward relation-related commands
<ahasenack> rogpeppe1: I found two relation-set commands that match that log
<ahasenack> rogpeppe1: one specifies a relation id :)
<rogpeppe1> ahasenack: :-)
<ahasenack> looks like a bug
<rogpeppe1> ahasenack: looks that way to me
<ahasenack> the one that doesn't is in keystone_joined() (!!)
<ahasenack>   relation-set service="cinder" \
<ahasenack>     region="$(config-get region)" public_url="$url" admin_url="$url" internal_url="$url"
<ahasenack> rogpeppe1: ok, thanks, I'll take it from here
<rogpeppe1> ahasenack: if charms are doing this commonly though, and the python allowed it, we should perhaps consider letting it through and ignoring it
<ahasenack> ok
<ahasenack> I will debug this one, see how it ended up running keystone_joined() in the install hook
<ahasenack> and then if we can get and use a relation id
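The failure mode discussed above can be sketched in Go (illustrative only; the function name and signature are assumptions, not juju-core's actual code): relation-set can only default the relation id when it runs inside a relation hook, so when keystone_joined is invoked from the install hook with no -r flag there is nothing to default to.

```go
package main

import (
	"errors"
	"fmt"
)

// relationIdForSet sketches the rule being discussed: an explicit -r
// wins; otherwise fall back to the hook's own relation context, which
// only exists in relation hooks. In a non-relation hook like install,
// neither is available and the call fails.
func relationIdForSet(hookRelationId, explicit string) (string, error) {
	if explicit != "" {
		return explicit, nil
	}
	if hookRelationId != "" {
		return hookRelationId, nil // e.g. running in a -relation-joined hook
	}
	return "", errors.New("no relation id specified")
}

func main() {
	// keystone_joined called from the install hook: no relation
	// context and no -r flag, so the call errors out.
	_, err := relationIdForSet("", "")
	fmt.Println(err)

	// The same call inside a relation hook succeeds.
	id, _ := relationIdForSet("identity-service:0", "")
	fmt.Println(id)
}
```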
<rogpeppe1> anyone know of a decent way of inserting nicely formatted code fragments into a gmail mail?
<rogpeppe1> or a google doc for that matter
<ahasenack> hi, I have a feeling that juju deploy --config file.yaml isn't working, it's not taking the options from file.yaml
<ahasenack> before I debug further, is this a known issue?
<ahasenack> juju set <service> --config file.yaml also didn't work, but juju set <service> key=value did
<ahasenack> https://bugs.launchpad.net/juju-core/+bug/1121907
<_mup_> Bug #1121907: deploy --config <cmdline> <juju-core:New> <https://launchpad.net/bugs/1121907>
<dimitern> ahasenack: I think deploy doesn't accept --config yet
<ahasenack> The option is there, but the bug still open
<dimitern> ahasenack: or more likely it ignores it
<ahasenack> yep, looks like it
<dimitern> rogpeppe1: bugging you one last time: https://codereview.appspot.com/8540050
<ahasenack> juju get works, but there is also a bug for it, still open
<ahasenack> weird
<rogpeppe1> ahasenack: we've been fixing lots of bugs - not all them have necessarily been marked as such...
<ahasenack> ok
<rogpeppe1> dimitern: why call repo.Latest at all if we've got a specified revision number?
<rogpeppe1> dimitern: it's a potentially slow operation
<dimitern> rogpeppe1: it doesn't seem slow - it just changes the rev in the curl
<rogpeppe1> dimitern: no it doesn't - it calls CharmStore.Info, which makes an http request
<dimitern> rogpeppe1: only for a local repo it does get, but this shouldn't be slow at all, the CS does not fetch anything on Latest
<rogpeppe1> dimitern: 	resp, err := http.Get(s.BaseURL + "/charm-info?charms=" + url.QueryEscape(key)) ?
<dimitern> rogpeppe1: it's not the charm that's downloaded here, just the metadata
<rogpeppe1> dimitern: looks like it's fetching something to me
<dimitern> rogpeppe1: it's essentially an HTTP HEAD
<rogpeppe1> dimitern: sure, but it's still making an unnecessary network request for no particularly good reason. surely it's easy to avoid?
<dimitern> rogpeppe1: yeah, i suppose..
<dimitern> rogpeppe1: but despite this the logic is now sound, right?
<rogpeppe1> dimitern: i stopped there, but will continue looking, one mo
<dimitern> rogpeppe1: i'll just move the Latest call into an else block after checking the other two cases
<rogpeppe1> dimitern: that was what i was just thinking
<dimitern> rogpeppe1: sorry, haven't seen it like this
<dimitern> rogpeppe1: thanks
<rogpeppe1> dimitern: you might even consider making it a bool switch
<dimitern> rogpeppe1: i did something like that, but it looked ugly, so i got rid of it
<rogpeppe1> dimitern: np; three cases is marginal
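The control flow being agreed on above could look roughly like this sketch (names and shapes are assumptions, not the actual juju-core code): the potentially slow repo.Latest call, which goes through CharmStore.Info and makes an HTTP request, is only reached when no explicit revision already decides the answer.

```go
package main

import "fmt"

// repo abstracts the charm repository; Latest may make a network
// round trip (CharmStore.Info issues an HTTP request).
type repo interface {
	Latest(charm string) int
}

// fakeRepo counts calls so we can see when the slow path is taken.
type fakeRepo struct{ calls int }

func (r *fakeRepo) Latest(charm string) int { r.calls++; return 7 }

// resolveRevision only falls through to Latest when the user gave no
// explicit revision (represented here as explicitRev < 0).
func resolveRevision(r repo, charm string, explicitRev int) int {
	if explicitRev >= 0 {
		return explicitRev // no network request needed
	}
	return r.Latest(charm)
}

func main() {
	r := &fakeRepo{}
	fmt.Println(resolveRevision(r, "cs:precise/riak", 5))
	fmt.Println(r.calls) // Latest never called for an explicit revision
	fmt.Println(resolveRevision(r, "cs:precise/riak", -1))
	fmt.Println(r.calls) // one call, only on the fallback path
}
```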
<rogpeppe1> dimitern: i'm still not sure the logic is quite right, even making that change
<dimitern> rogpeppe1: why?
<rogpeppe1> dimitern: don't we want to do a bump revision if the switch url is specified without a revno ?
<dimitern> rogpeppe1: I don't believe so
<rogpeppe1> dimitern: william said this, and i agree:
<rogpeppe1> Hmm. I suspect that bump-revision logic *should* apply when --switch is given
<rogpeppe1> with a *local* charm url *without* an explicit revision. Sane?
<dimitern> rogpeppe1: that's the user being explicit anyway, so we'll do what he asks, and probably knows what he's doing
<dimitern> rogpeppe1: I still disagree
<rogpeppe1> dimitern: as there's no way to explicitly specify bump-revision, i think we should make the default logic work
<dimitern> rogpeppe1: this is like --force - "do exactly what i'm telling you to do, no smart tricks"
<rogpeppe1> dimitern: hmm, you said "Done" in response to that sentence before - you didn't seem to disagree
<rogpeppe1> dimitern: if you don't specify a revision number, you're saying "please choose an appropriate revision number for me"
<rogpeppe1> dimitern: i think we should make that path work
<dimitern> rogpeppe1: done, meaning all the rest - except that, i should've been clearer perhaps
<dimitern> rogpeppe1: there's no way *not* to bump the revision otherwise
<dimitern> rogpeppe1: and why should we do it - it's a different charm, so no conflicts would apply (hopefully)
<rogpeppe1> dimitern: sure there is - specify a revision number, no?
<rogpeppe1> dimitern: it's a different charm, but we may already have another version of the one we're switch to
<rogpeppe1> switching to
<rogpeppe1> dimitern: it's not unlikely, in fact, if we're calling switch on multiple services
<dimitern> rogpeppe1: on the same service?
<dimitern> rogpeppe1: we can call it only on one service at a time
<rogpeppe1> dimitern: yes, but bump-revision isn't about the service, is it? it's about the charm's stored in the state, which are independent of the services that use them
<dimitern> rogpeppe1: so you think bumping revision on switch without explicit rev will be straightforward to understand from the user's point of view?
<rogpeppe1> dimitern: yes
<rogpeppe1> dimitern: because it's the behaviour they're used to when deploying with a local charm url
<dimitern> rogpeppe1: ok, i'll do it, but i'm still not convinced it's right
<rogpeppe1> dimitern: i think automatic bump-revision for any local charm is correct, as who knows what relationship the local charm bears to the one that's previously been uploaded?
<dimitern> rogpeppe1: fair enough
<dimitern> rogpeppe1: so when you have svc "riak",running charm "riak-7" and you upgrade it to "local:myriak" (no exp. rev, final result: "local:precise/myriak-7"), and then upgrade it again to "local:myriak", should the rev be bumped to "local:myriak-8" ?
<rogpeppe1> dimitern: yes, i think so
<dimitern> rogpeppe1: yeah, that's what I thought, adding a test for that now
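The behaviour agreed in the riak example above can be sketched like this (illustrative only, not juju-core's implementation; the map of latest revisions stands in for what's stored in state): with no explicit revision, a local charm gets one higher than any revision of that charm already known; an explicit revision is kept as-is.

```go
package main

import "fmt"

// bumpRevision sketches the agreed upgrade-charm --switch behaviour:
// explicitRev >= 0 means the user named a revision ("do exactly what
// i'm telling you to do"); otherwise bump past the latest revision of
// that charm already in state, or start at 0 for an unseen charm.
func bumpRevision(latestInState map[string]int, charm string, explicitRev int) int {
	if explicitRev >= 0 {
		return explicitRev
	}
	if rev, ok := latestInState[charm]; ok {
		return rev + 1
	}
	return 0
}

func main() {
	state := map[string]int{"local:precise/myriak": 7}
	// Switching again to "local:myriak" with no explicit rev: 7 -> 8.
	fmt.Println(bumpRevision(state, "local:precise/myriak", -1))
	// An explicit revision is respected as-is.
	fmt.Println(bumpRevision(state, "local:precise/myriak", 3))
}
```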
<dimitern> i'm off, happy weekend to everyone!
<ahasenack> rogpeppe1: about the earlier conversation about relation set and relation id, it looks like it's very common to not specify a relation id in pyjuju
<ahasenack> two charm authors I spoke with said so, and the "manpage" for relation-set in pyjuju says it's optional (as is everything else, so I don't trust that help doc very much: https://pastebin.canonical.com/90111/)
<rogpeppe1> ahasenack: it is optional, in relation-related hooks
<rogpeppe1> ahasenack: but in a non-relation hook, what could it possibly default to?
<ahasenack> ah, so it is optional in gojuju
<ahasenack> ok, I'll debug further
<rogpeppe1> right, eod and start of weekend for me here
<rogpeppe1> happy weekends all
<ahasenack> bye rogpeppe1, enjoy
#juju-dev 2013-04-27
<davecheney> m_3: arosales ping
<davecheney> rogpeppe3: ping
<davecheney> nm, i had lost the details for the control instance
<davecheney> but it's ok now
#juju-dev 2013-04-28
<m_3> davecheney: good to hear
<davecheney> m_3: i can't remember what I said :)
<davecheney> it was yesterday
<davecheney> grr, hp cloud is having a lie down
<davecheney> it can't talk to anything outside the network
<davecheney> which makes bootstrapping a bit problematic
<davecheney> Apr 28 10:29:56 juju-goscale2-machine-0 mongod.37017[9332]: Sun Apr 28 10:29:56 [conn1999] query juju.txns.log query: { $query: {}, $orderby: { $natural: -1 } } cursorid:6469952084798152748 ntoreturn:10 ntoskip:0 nscanned:11 keyUpdates:0 numYields: 1 locks(micros) r:486329 nreturned:10 reslen:948 267ms
<davecheney> are these log times bad ?
<fwereade> davecheney, morning
<fwereade> davecheney, I was wondering if you considered https://codereview.appspot.com/8953044 suitably trivial? it was agreed in the discussions preceding the original CL but we forgot to actually implement it then :/
 * fwereade goes to do housey things for a bit, will be back soon
<fwereade> rogpeppe1, heyhey
<rogpeppe1> yo!
<fwereade> rogpeppe1, how's it going?
<rogpeppe1> fwereade: trying to write family emails and you're a welcome distraction :-)
<rogpeppe1> fwereade: how's you?
<fwereade> rogpeppe1, I consider https://codereview.appspot.com/8953044 to be plausibly trivial, because this was always the agreed behaviour AIUI
<fwereade> rogpeppe1, I's good :)
 * rogpeppe1 is not looking at anything work related just now
<fwereade> rogpeppe1, a few cylinders are starting to fire again
<fwereade> rogpeppe1, no worries :)
<rogpeppe1> fwereade: that's good. you must've burned out a few
<fwereade> rogpeppe1, had a nice weekend, lots of laura time
<fwereade> rogpeppe1, was definitely needed
<rogpeppe1> fwereade: we had a nice weekend building smart new veg beds from old scaff boards
<fwereade> rogpeppe1, excellent
<rogpeppe1> fwereade: not quite done yet, but excellent progress made (and some celebratory g&ts as the sun went down)
<fwereade> rogpeppe1, I've been reading jonathan strange and mr norrell
<rogpeppe1> fwereade: good innit?
<fwereade> rogpeppe1, it's rather good I think
<fwereade> rogpeppe1, oh yes
<rogpeppe1> fwereade: i kinda stole it off my sister in law one xmas (she'd just been given it) and read it far too fast before she noticed
<rogpeppe1> fwereade: i'm on the fourth bear
<rogpeppe1> fwereade: but not v far in
<fwereade> rogpeppe1, haha, nice, I have done that on a few occasions
<fwereade> rogpeppe1, it is not impossible that you will cast it aside with a bewildered WTF, I am not sure how much my enjoyment of that one was based on knowledge of the wider world from his other books
<rogpeppe1> fwereade: the fire is flaming nicely and the foster cat is happily snoozing on the sofa.
<rogpeppe1> fwereade: we'll see
<rogpeppe1> fwereade: i'm reserving judgement for the moment
<rogpeppe1> fwereade: the context i had from you has been useful
<fwereade> rogpeppe1, yeah, I don't think that's explained at all
<fwereade> rogpeppe1, at least not when you kinda need to know it
<rogpeppe1> fwereade: i think there was one brief aside
<rogpeppe1> fwereade: interesting to see results of dave's stress tests.
<fwereade> rogpeppe1, definitely
<rogpeppe1> fwereade: oops, work related.
<rogpeppe1> fwereade: must write email :-)
<fwereade> rogpeppe1, do we ever need that internal api :)
<rogpeppe1> fwereade: indeed
<fwereade> rogpeppe1, I have a few threads I need to draw together myself
<rogpeppe1> fwereade: i was happy that the gui seemed to work fine with all those nodes
<fwereade> rogpeppe1, that's awesome
<rogpeppe1> fwereade: that's only one client for the API mind :-)
 * fwereade removes hat
<fwereade> rogpeppe1, sure
<fwereade> rogpeppe1, but it's always nice to break after some other component ;p
<rogpeppe1> :-)
 * rogpeppe1 hasn't quite worked out what to say on the subject of error reporting vs logging yet, but feels there is something sensible lurking around
<fwereade> rogpeppe1, well, every piece of context you describe *could* be provided by logging
<rogpeppe1> fwereade: if we were happy with logging 1GB/minute
<fwereade> rogpeppe1, but it is indeed a tradeoff in many things
<rogpeppe1> fwereade: yeah
<fwereade> rogpeppe1, I think that more logging *and* more error context is likely to be generally helpful
<rogpeppe1> fwereade: agreed
<rogpeppe1> fwereade: that's the gist of my as yet uncomposed reply
<fwereade> rogpeppe1, but there have been times when we have erred toward the massively redundant in our error messages, and I think that's also a problem
<rogpeppe1> fwereade: if we err in that direction and prune as needed, i think that's a better approach than the opposite
<fwereade> rogpeppe1, yeah
<fwereade> rogpeppe1, I find myself pondering state though
<fwereade> rogpeppe1, there are a lot of places we pass up unadorned errors from mgo
<rogpeppe1> fwereade: indeed
<rogpeppe1> fwereade: let's not
<fwereade> rogpeppe1, hmm -- this to me is actually a case where I think logging wins out
<fwereade> rogpeppe1, "mgo said no" is generally independent of context -- the connection fell over, basically, and there's nowt to be done
<rogpeppe1> fwereade: logging errors or logging actions?
<fwereade> rogpeppe1, logging actions
<fwereade> rogpeppe1, and leaving errors we can't account for to fall out, and the detailed context be pieced together as needed
<rogpeppe1> fwereade: we generate enormous log files as is - i'm concerned about the volume of logging data we generate.
<rogpeppe1> fwereade: or *will* generate.
<fwereade> rogpeppe1, I think that actually implementing fricking log levels will address that
<rogpeppe1> fwereade: the problem is that anyone in any significantly sized environment will dial back the log levels
<rogpeppe1> fwereade: and then we'll be left with io.EOF
<fwereade> rogpeppe1, we don't have to send everything everywhere
<rogpeppe1> fwereade: that's a good point actually.
<rogpeppe1> fwereade: but still, those API servers are gonna be helluva busy
<rogpeppe1> fwereade: when everything's going through the api
<fwereade> rogpeppe1, consider me undecided on how best to address logging
<rogpeppe1> fwereade: yeah me too. i agree with you that a combination of better logging *and* better error reporting will be useful
<fwereade> rogpeppe1, the distribution problem smells a little of being amenable to charming, if we can figure out the right interface to expose
<rogpeppe1> fwereade: perhaps. some volume estimations may well be useful.
<fwereade> rogpeppe1, yeah, just an idle thought
<fwereade> rogpeppe1, have been having slightly less idle thoughts about an environs/resource package with Image and Spec structs and Map and Kind interfaces
<fwereade> rogpeppe1, but I have been doing that when I should have been reviewing ian's stuff
 * fwereade gets he to a codereview
<rogpeppe1> fwereade: interesting. i think there's something nice struggling to get out there but i haven't spent any time on it since maas appeared.
<fwereade> rogpeppe1, it has kinda clicked -- not in the "it's perfect" sense but the "it's definitely *closer* to perfect" sense
<rogpeppe1> fwereade: i want something that's not just a "this appeared vaguely similar in these providers" thing; something that's obviously genuinely helpful in current and future contexts.
<fwereade> rogpeppe1, eg it will be as stupid as the individual implementations of Map are wrt caching
<fwereade> rogpeppe1, well, it's a step that way -- I don't think Image should *really* be a struct, but I don't want to speculate too far
<rogpeppe1> fwereade: i really need to get back to what i'm supposed to be doing :-) look forward to continuing discussions around this topic
<fwereade> rogpeppe1, likewise
<fwereade> rogpeppe1, hey sorry 1 thing?
<rogpeppe1> fwereade: 1 :-)
<rogpeppe1> fwereade: go on
<fwereade> rogpeppe1, davecheney mentioned you were going to fix the unassigned unit thing seen with juju-gui to machine 0; please chat to me about it at your convenience, ideally prior to implementation, I have thoughts
<fwereade> rogpeppe1, but not now
<rogpeppe1> fwereade: bug #?
<fwereade> rogpeppe1, not sure
<fwereade> rogpeppe1, if it's not currently on your mind you can leave it on mine, but I'll cc you in in case it rings a bell, I have a half-reply open
<rogpeppe1> fwereade: doesn't ring a bell currently, unless it's the failed deploy thing, in which case i just reported the bug, and had no immediate intention of fixing it
<fwereade> rogpeppe1, ah great, I'll find it and assign it to myself then, I have a CL up that addresses the best currently-addressable subset of the problem, I think
<rogpeppe1> fwereade: #1173089 ?
<_mup_> Bug #1173089: deploy can fail partially <juju-core:New> <https://launchpad.net/bugs/1173089>
<fwereade> rogpeppe1, yeah, that's the one
<fwereade> rogpeppe1, got distracted by another bug ;p
<fwereade> rogpeppe1, fwiw I don't think we should rollback -- I think create/assign should be a single txn
<rogpeppe1> fwereade: yeah, probs
#juju-dev 2014-04-21
<jam> vladk: Your patch failed to land on a couple of tests: provider/maas/environ_whitebox_test.go:686: undefined: environs.NetworkInfo
<jam> It looks relevant to  what you changed.
<vladk> jam: I'll look
<jam> mgz: ping me when you wake up
<vladk> jam, mgz: I can't commit my branch because tests failed due to interface changes in the trunk branch. What is the usual process to resolve such a situation? Merge manually? Use rebasing?
<jam> vladk: merge trunk, resolve conflicts, commit, push, mark approved again
<perrito666> morning everyone
<jam> morning perrito666
<jam> I think it's going to be a bit quiet in the standup today. I think most people are out for Easter
<perrito666> well for the moment it's only me and wwitzel3 in the room
<wwitzel3> :)
<natefinch> stupid hangouts
<jam> natefinch: morning
<natefinch> morning
<natefinch> I may need to reboot to get hangouts to work
<natefinch> brb
<jam> natefinch: if you want to chat a bit early, I have free time now
<natefinch> jam: sure
<wwitzel3> natefinch: mongo API rename review https://codereview.appspot.com/89840044
<natefinch> wwitzel3: I'll look
<wwitzel3> natefinch: also, what should I start on next?
<natefinch> wwitzel3: see if there are any of these you can take on: https://launchpad.net/juju-core/+milestone/1.19.1
<rogpeppe> hi all
<perrito666> hi rogpeppe
<rogpeppe> perrito666: hiya
<perrito666> rogpeppe: my calendar says you are not here
<rogpeppe> perrito666: i am not
<rogpeppe> perrito666: i am going skiing again today
<perrito666> wow, skiing while irc chatting, skillful dude
<rogpeppe> perrito666: :-)
<sinzui> natefinch, wwitzel3 : We are getting a lot of activity about 1.18.1 and 1.19.0 local not being able to find tools. I cannot reproduce it https://bugs.launchpad.net/juju-core/+bug/1309805
<_mup_> Bug #1309805: LXC / Local provider machines do not boot in 1.18 / 1.19 series <juju-core:Incomplete> <https://launchpad.net/bugs/1309805>
<sinzui> Maybe I am not affected because I have been on trusty for 6 months
<natefinch> sinzui: possible we have the same problem
<natefinch> (being on trusty)
<sinzui> natefinch, For some number of hours, there was a checksum mismatch in a security update that broke cloud-init (not tools). The fixed package is propagating now
<hazmat> sinzui, what's the juju ftest jenkins url?
<hazmat> sinzui, having some issues this morning (different then the archive issues noted) about 1.19 not installing mongo on precise.
<sinzui> http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/
<sinzui> hazmat, ^ CI was angry about the checksum mismatch over the last few hours
<hazmat> http://paste.ubuntu.com/7300113/
<hazmat> sinzui, that's an image/repo issue not juju's fault afaics
<hazmat> sinzui, that pastebin has the mongo issue i'm trying to trackdown
<hazmat> albeit juju should still deal with it nicely
 * hazmat tries again with current trunk
<sinzui> hazmat, wow
<sinzui> hazmat, I expect to see juju always install a juju. precise should install mongodb-server. I think juju wanted to exec /usr/bin/mongod
<sinzui> ^ natefinch I *think* juju always installs a mongodb-server on bootstrap when it doesn't find one. trusty wants /usr/lib/juju/bin/mongod, precise wants /usr/bin/mongod
<hazmat> sinzui, mongo install has  been subsumed into the ha worker bits
<hazmat> might have gotten caught in the ha merge ... trying again with current trunk/upload-tools to see if it helps
<natefinch> hazmat, sinzui: juju always installs mongod
<sinzui> natefinch, but it chooses one to match the series. trusty gets juju-mongodb?
<natefinch> sinzui: it should, I'm trying to detangle the code
<jam1> sinzui: hazmat: Trunk is not compatible with 1.19.0. In 1.19.0 it assumed the Client installed mongodb-server or juju-mongodb via cloud-init. In 1.19.1 it is done in the jujud server. So if you are testing with Trunk, you have to use --upload-tools
<jam1> hazmat: bug #1308337
<_mup_> Bug #1308337: 1.19.1 cannot bootstrap 1.19.0 (no longer installs mongodb-server during bootstrap) <regression> <juju-core:Won't Fix> <https://launchpad.net/bugs/1308337>
<natefinch> sinzui: it looks like it always installs mongodb-server from cloud-archive, but not juju-mongodb
<jam1> natefinch: it gets mongodb-server from ctools, but juju-mongodb is in the Trusty Universe archive.
<jam1> natefinch: the bug hazmat was seeing was because it was targetting the released 1.19.0
<jam1> as the "latest available version"
<jam1> but per my bug report, 1.19.1 *won't* bootstrap a 1.19.0
<jam1> (well, will fail to)
<natefinch> jam1: there's no mention of juju-mongodb in the code, so I have to assume we're not installing it
<natefinch> jam1: ensuremongoserver just installs mongodb-server
<jam1> natefinch: ffs...
<jam1> natefinch: well, on amd64 mongodb-server is now available, so that might actually still work, but it will be broken for non amd/i386
<natefinch> jam1: I can fix it so we install juju-mongodb on trusty
<jam1> natefinch: so there was func MongoPackageForSeries(series string) string {
<jam1> 	switch series {
<jam1> 	case "precise", "quantal", "raring", "saucy":
<jam1> 		return "mongodb-server"
<jam1> 	default:
<jam1> 		// trusty and onwards
<jam1> 		return "juju-mongodb"
<jam1> 	}
<jam1> }
<jam1> which is, install "juju-mongodb" for everything new
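For reference, the selection logic jam1 quotes above as a self-contained, runnable sketch (the function body comes from the quote; the package scaffolding around it is added here):

```go
package main

import "fmt"

// mongoPackageForSeries mirrors the removed logic quoted above: older
// Ubuntu series get the distro mongodb-server package, while trusty
// and anything newer get the purpose-built juju-mongodb package.
func mongoPackageForSeries(series string) string {
	switch series {
	case "precise", "quantal", "raring", "saucy":
		return "mongodb-server"
	default:
		// trusty and onwards
		return "juju-mongodb"
	}
}

func main() {
	fmt.Println(mongoPackageForSeries("precise")) // mongodb-server
	fmt.Println(mongoPackageForSeries("trusty"))  // juju-mongodb
}
```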
<natefinch> jam1: I remember that... weird, I guess it must have gotten dropped during a merge or something
<jam1> natefinch: it was removed by r 2593 "r=natefinch] The machine agent is now responsible for setting up mongo. Cloud Init is no longer used to set up the upstart script. Mongo now starts with --replSet and the replicaset gets initiated. We still start just a single machine, but more machines will be able to be added later (once we have code to do so)."
<jam1> natefinch: code from rog, r=natefinch
<jam1> natefinch: I think your EnsureMongoServer patch just removed the "what package should be installed" logic completely
<jam1> because it was removing the "use cloud-init to install the right package"
<natefinch> jam1: well, fixable, obviously it should be there.  Probably just overzealous code deletion
<hazmat> re ha stuff, can we get this out of status: state-server-member-status? feels like leaky internal details
<natefinch> hazmat: it's pretty useful information to know which of your state servers are voting members in the replicaset.  It can help when figuring out what's going on with HA
<hazmat> natefinch, again thats' leaky internal impl details
<natefinch> hazmat: yes, but it's *useful* internal impl details
<hazmat> natefinch, if we want to go that route, why not just list out jobs on machines
<hazmat> natefinch, ideally internal impl details only get show when requested
<hazmat> natefinch, else we're just drowning status in more noise
<natefinch> hazmat: we're considering making a more succinct status anyway
<hazmat> i've heard..
<natefinch> hazmat: status is already too huge for any medium sized deployment
<hazmat> but let's not keep doing the opposite ;-)
<hazmat> till we get there
<natefinch> hazmat: well, we need information about HA so you can tell what's going on about it. We can change it so it doesn't say "vote", but we need information that say s "this is a state machine, and it's currently being used to keep juju highly available"
<hazmat> natefinch,  state-server-member-status: has-vote doesn't really convey that
<natefinch> hazmat: Roger's initial suggestion was "Active, Inactive, Deactivating, Activating".  Maybe I was wrong to get him to change it.
<hazmat> natefinch, those sound pretty reasonable.. i'd expect omission on Inactive.. but the others convey better to an end user rather than has-vote
<hazmat> its still dev version.. re changing
<natefinch> hazmat: inactive means the machine is likely down or is a mongo backup (non-voting) only.   I don't think it should be simply removed, otherwise it will look like it's not a state server, when it is.
<natefinch> hazmat: the name change isn't a big deal, people can mentally translate to mongod terms as needed.  But I do still think more information is still better than less, even with status being too long as-is.  We can make a less spammy one later, but until then, this is the only way people can easily introspect their environment without having to start querying log files.
<natefinch> brb gotta make some lunch
<hazmat> natefinch, better would be some ha centric commands/options ie fold into ensure-availability
<natefinch> hazmat: yeah, ha-status or something
<hazmat> when's 1.19.1 scheduled
<hazmat> afaics 1.19.0 is hosed on precise
<hazmat> with manual provider
<jam> natefinch: did we get a bug for not installing juju-mongodb or should I add one?
<natefinch> jam: I haven't made one, my guess is that there isn't one right now.  I was just testing a fix for it
<jam> natefinch: submitting a bug so we can track it against the release
<jam> natefinch: bug #1310719
<jam> assigned to you
<_mup_> Bug #1310719: mongodb-server installed instead of juju-mongodb on trusty <ha> <mongodb> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1310719>
<natefinch> jam: thanks
<natefinch> jam: I have to run out to run an errand, but I should be able to get that fix in before EOD
<jam> natefinch: any status update on bug #1304407
<_mup_> Bug #1304407: juju bootstrap defaults to i386 <amd64> <apport-bug> <ec2-images> <metadata> <trusty> <juju-core:Triaged by natefinch> <juju-core 1.18:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1304407>
<natefinch> jam: I have a fix, needs to be proposed
<natefinch> jam: https://codereview.appspot.com/89900043
<natefinch> jam: occurred to me there's no tests for it, though
<jam> natefinch: do we strictly prefer amd64 if you're running from say ppc ?
<jam> I wonder if strictly preferring version.Current would be better
<jam> and a test helps us keep this from regressing
<natefinch> jam: I don't know.  It seems complicated to know what people expect
<natefinch> jam: probably a strict hierarchy of what we prefer is better than making anything depend on the juju client
<natefinch> jam: but I really gotta run
<jam> natefinch: so --upload-tools will just break if we don't use version.Current
<jam> natefinch: see ya later
<natefinch> jam: good point
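One way the preference just agreed on could work, as a sketch (pickArch, its signature, and the fallback order are assumptions for illustration, not juju-core's code): prefer the client's current architecture first, so --upload-tools keeps finding the tools it just uploaded, then fall back to a fixed order.

```go
package main

import (
	"fmt"
	"sort"
)

// pickArch chooses a tools architecture: version.Current's arch (here
// passed in as current) takes priority, then a fixed preference list,
// then whatever is available, sorted for determinism.
func pickArch(available []string, current string) string {
	prefs := append([]string{current}, "amd64", "i386", "ppc64el")
	for _, p := range prefs {
		for _, a := range available {
			if a == p {
				return a
			}
		}
	}
	sort.Strings(available)
	return available[0]
}

func main() {
	fmt.Println(pickArch([]string{"i386", "amd64"}, "amd64"))
	// A ppc client prefers its own arch, so --upload-tools still works.
	fmt.Println(pickArch([]string{"i386", "ppc64el"}, "ppc64el"))
}
```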
<hazmat> sinzui, the effort involved in converting the docs is pretty large.. and keeping it running and delta minimized as changes are made to both is painful.. is there an issue with just moving forward and resolving these issues as we go?
<hazmat> these formatting issues that is
<sinzui> hazmat, I am concerned that we lost information in translation.
<hazmat> sinzui, lost as in your recent merge proposals?
<hazmat> sinzui, syncing content between two different formats is time consuming.. the only way to make that pain go away is to finish the switch
<perrito666> hi, can you all read me?
<wwitzel3> perrito666: didn't notice until now, but if it still helps, yes
<natefinch-afk> alexisb: you around?  Sorry I'm late
<alexisb> natefinch-afk, yep, on and ready when you are
<perrito666> wwitzel3: tx, I was checking network connectivity through the phone, the power went out for a moment
<bac> sinzui: do you know when juju started generating a control-bucket if one was not supplied?
<wwitzel3> perrito666: ahh you still on the phone now?
<perrito666> wwitzel3: nope, power came back (I am behind an irc proxy)
<sinzui> bac 1.17.0 or 1.17.1
<bac> thanks
<stokachu> is the gomaasapi exposed through the juju api as well?
<stokachu> or is it just internal to juju
#juju-dev 2014-04-22
<bodie_> anyone know how to get the lbox tool to work properly on a headless host?
<bodie_> or can I simply use the launchpad site to propose my merge?  isn't there some process for going through rietveld?
<bodie_> we've been collaborating using a remote dev box
<perrito666> bodie_: you need to set the... sensible-browser iirc, to nothing and it will print the link to authorize lbox
<bodie_> okay, cool
<bodie_> perrito666, I think I've already done that, because it's giving me the link, but I thought I had to visit the URL on the same host
<perrito666> bodie_: nope
<bodie_> perrito666 -- http://paste.ubuntu.com/7303428/ -- this is what I get after authorizing it from another host
<bodie_> ah, might be working now
<perrito666> did you do the whole confirmation dance? it is not as straightforward iirc
<bodie_> it looks like it's working but now it's telling me to auth with my google account, and that's not working yet.
<bodie_> Also I thought the MR description had to be 50 words, not 50 characters.  lol
<rick_h_> bodie_: hah, 50 characters for the first line
<rick_h_> and then more details after that allowed
<bodie_> ah.....
<bodie_> well, that is a very terse description then, lol
<rick_h_> ex https://github.com/juju/juju-gui/pull/249
<rick_h_> the 50 chars is becoming a common convention
<bodie_> I mean, I crammed the whole MR description into the 50 characters thinking it was the limit.
<rick_h_> lol, no, just that first line
<bodie_> It's not quite twitter speak....  :$
<rick_h_> so that a one line diff is reasonable and no-wrap
<bodie_> ahhh
<rick_h_> but you can expand/list/etc in the body of the commit
<bodie_> yeah
<bodie_> is there a way to amend that?
<bodie_> I keep getting invalid user or password after it asks me for my Google creds.  I've tried with binary132 as well as binary132@gmail.com, I know I'm using the correct password, and I authorized it on Rietveld....
<arosales> axw: hello
<axw> arosales: heya
<arosales> axw: were you working with thumper and davecheney on the power enablement bits?
<axw> arosales: nope
<arosales> axw: ah ok
<axw> arosales: is there something broken?
<arosales> axw: do you happen to know if any other juju core folks were working on power enablement besides thumper and dave?
<arosales> axw: getting a seg fault and I know there is a kernel bug opened just not sure if it was recently resolved . .  .
<axw> arosales: I think waigani has been fixing some of the tests...
<axw> and wallyworld was looking at one stage
<axw> I'm not sure if there was anyone else
<wallyworld> there were packaging guys looking at the kernel bug
<wallyworld> not sure of the status
<perrito666> bodie_: your sso acct is not setup?
<bodie_> sso?
<arosales> axw: wallyworld: thanks for the info. Looking at https://bugs.launchpad.net/ubuntu/+source/gccgo-go/+bug/1275620
<arosales> I think the fix may be in gccgo, just not sure when that fix is going to make it into trusty
<wallyworld> arosales: yeah, i'm not up on the latest status sorry
<wallyworld> i wasn't sure if it was just gccgo related or not
<perrito666> bodie_: strange, can you pastebin the whole output? (omitting any info that you considered not public :p)
<bodie_> it actually looks like Google was blocking my login from the VPS for security reasons :/
<bodie_> giving it another go
<waigani> arosales: what is your email? I'll forward you the latest on that issue
<perrito666> bodie_: oh
<arosales> wallyworld: no worries
<arosales> waigani: also do you know the latest on https://bugs.launchpad.net/ubuntu/+source/gccgo-4.9/+bug/1304754
<_mup_> Bug #1304754: gccgo on ppc64el using split stacks when not supported <ppc64el> <trusty> <gccgo-4.9 (Ubuntu):Confirmed> <https://launchpad.net/bugs/1304754>
<arosales> I think this is the one we are actually hitting
<arosales> on -08 kernels we are ok, but anything more recent we consistently seg fault
<bodie_> THERE we go.  thanks google....
<bodie_> https://codereview.appspot.com/90130044 ! ^_^
<waigani> arosales: if I remember correctly, I think the two are related - dave is the best one to ask - hopefully the email I sent will be helpful
<arosales> waigani: yup they look to be related
<jose> hello, guys! I have a fix for a file in juju-core, and I was wondering what's the correct way of making an MP, as I've seen codereview.appspot.com used in the past
<rick_h_> jose: yep, reitveld is used with a tool called lbox to submit patches.
<jose> rick_h_: erm, would you mind helping me to propose an MP?
<jose> I think I fixed https://bugs.launchpad.net/juju-core/+bug/1309805 as the description mentioned
<_mup_> Bug #1309805: LXC / Local provider machines do not boot without default-series <config> <local-provider> <lxc> <juju-core:Triaged> <https://launchpad.net/bugs/1309805>
<rick_h_> jose: I'm looking for the doc. Not done it myself. :)
<jose> oh, ok, thanks :)
<rick_h_> jose: http://bazaar.launchpad.net/~go-bot/juju-core/trunk/view/head:/CONTRIBUTING
<jose> blargh, I suppose I need to check better for files like that in the future :P
<jose> thanks rick_h_ :)
<rick_h_> jose: all good, thanks for going through the process
<jose> np :)
<jose> uh, looks like I'm getting an error when proposing
<jose> RuntimeError: maximum recursion depth exceeded in cmp it says
<axw> wallyworld: do we have a sprint topic about bulk API?
<axw> machine provisioning is getting pretty chatty now
<wallyworld> axw: yes, i think so, i remember seeing it
 * axw hunts
<wallyworld> Bulk cloud API calls (both scalability, and things like port-ranges)
<wallyworld> i also recall reading some prose
<axw> hmm that's a bit different
<axw> I'm talking about bulk Juju API calls
<axw> bulk is maybe the wrong word
<axw> I mean combining API calls, for example
<wallyworld> i know what you mean, it's there somewhere
<axw> ok
<jose> oh hey wallyworld, have a minute?
<wallyworld> hi sure
<jose> I'm having some problems doing a proposal for juju-core
<jose> erm, let me find the error
<wallyworld> ok
<jose> ok, so when I try to do 'bzr switch' it tells me the branch doesn't exist, or if I do 'lbox propose -bug=bugnumber' I get http://paste.ubuntu.com/7304413/
<jose> I was wondering if there's a way to do a proposal in codereview.appspot.com without using these tools, if the branch is in LP and I already have a bug number
<wallyworld> hmmm. i've not seen that error before, but it looks like your initial creation of the branch missed something
<wallyworld> what does bzr info say
<wallyworld> launchpad is the means by which the landing bot picks up and merges the code
<wallyworld> codereview.appspot.com is an ancillary tool because some folks didn't like launchpad initially
<jose> let's check bzr info
<wallyworld> but lp is the main tool via which stuff needs to be proposed
<jose> bzr info gives the same error I think
<jose> so, saying I want to get https://code.launchpad.net/~jose/juju-core/init-comment-default-series merged into lp:juju-core
<jose> would an MP be enough/good? or I still need to go through rietveld?
<wallyworld> no, mp is fine by me. i prefer lp
<jose> ok, let's link the bug and do an mp then :)
<wallyworld> but your bzr setup is broken though if bzr info doesn't work
<jose> bzr info is broken when I execute it via cobzr
<jose> but if I do it with bzr, it gives the following output:
<jose> http://paste.ubuntu.com/7304444/
 * wallyworld sighs - cobzr is another tool that was not needed
<jose> I have to say, the CONTRIBUTING file complicates the process x1000
<wallyworld> does the contributing file talk about cobzr? that is unfortunate if it does as it's just not needed and makes things much harder
<wallyworld> bzr, lightweight checkouts, and switch are all that's needed
<jose> yeah, the contributing file talks about cobzr and lbox
<wallyworld> sadly lbox is unavoidable since rietveld is the review tool most people like using
<wallyworld> anyways, do your mp in lp for now and we'll get it reviewed
<jose> awesome, thanks :)
<jose> who should I set as the reviewer? or should I just leave that blank?
<wallyworld> leave it blank - i'll look at it now
<jose> thank you!
<jose> https://code.launchpad.net/~jose/juju-core/init-comment-default-series/+merge/216660 is it
<wallyworld> jose: the boilerplate environments.yaml file already has a commented out section for default series...
<wallyworld>     # The default series to deploy the state-server and charms on.
<wallyworld>     #
<wallyworld>     # default-series: precise
<wallyworld> i think we just want to add a comment in there to tell the user to uncomment default-series and set it to precise or trusty as needed
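The amended boilerplate comment wallyworld is asking for might read something like this (the wording is illustrative, not the patch that landed):

```yaml
    # The default series to deploy the state-server and charms on.
    # To choose a series, uncomment the default-series line below and
    # set it to precise or trusty as needed.
    #
    # default-series: precise
```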
<jose> ok, I'm modifying that line
<wallyworld> the juju init output does tell the user to edit the yaml file to suit
<wallyworld> thanks very much for contributing
<jose> no prob :)
<wallyworld> it's great when we can get help with all the little paper-cut issues like this, as we often don't get the time to polish these user-facing annoyances
<jose> and, message changed
<wallyworld> looking
<wallyworld> jose: sorry, i didn't mean change the juju init message. the extra juju init output is not needed. just add a line to the comment in the yaml boilerplate telling the user they should uncomment the default-series line and choose precise or trusty
<wallyworld> provider/local/environprovider.go
<jose> oh, gotcha, gotcha :)
<jose> that should do
<wallyworld> jose: yep, i'll approve and the landing bot will pick it up. depending on what else is in the queue, it should land in say 20 minutes or so
<wallyworld> thank you
<jose> awesome!
<jose> no prob, glad I could help :)
<jose> should I mark that bug as fix committed?
<wallyworld> jose: no, that will be done automatically by launchpad once it lands
<jose> good then
<jose> thanks for approving!
<wallyworld> np. you'll get an email soon hopefully saying it landed
<jose> cool
<jam> so quiet these days... :)
<vladk> jam: morning
<jam> morning vladk, how's it going?
<voidspace> morning all
<jam> morning voidspace
<perrito666> morning
<mgz> hey perrito666
<wwitzel3> hello
<voidspace> wwitzel3: morning
<wwitzel3> voidspace: hey, enjoy the holiday weekend?
<voidspace> wwitzel3: yeah, very good thanks
<voidspace> wwitzel3: my church has a big easter celebration weekend, so I've been to six meetings
<voidspace> wwitzel3: so hectic, but great fun catching up with all my friends from round the country
<wwitzel3> voidspace: nice
<voidspace> wwitzel3: I'm just catching up with administrivia and then we should (could) hangout
<voidspace> also the bluebells are out
<voidspace> wwitzel3: you have a good weekend?
<wwitzel3> voidspace: I did thanks :)
<perrito666> jam: ?
<jam> perrito666: just finishing up another hangout, will be there soon
<yaguang> hi all, does anyone know how to specify a public ip pool when bootstrapping juju with openstack
<yaguang> when using neutron as network service
<yaguang> as juju can't log in to the instance without a public ip
<jam> yaguang: "use-floating-ip: true" in environments.yaml
<yaguang> jam, I see that option, but juju complains that a floating-ip pool needs to be specified
<yaguang> jam, with neutron, you need to specify the pool to allocate a floating ip to instance
<mgz> yaguang: can you ask in #ubuntu-server about the neutron setup you need?
<mgz> yaguang: note, you don't really need public ips, provided you put the machines on a network you can route to
<yaguang> mgz, the problem is that without a public ip, the instance can only be accessed in netns
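For reference, the option jam mentions goes in the environment's stanza in environments.yaml; a minimal sketch (the environment name is an example, and with neutron the floating-ip pool itself still has to exist on the OpenStack side):

```yaml
my-openstack:
    type: openstack
    # give each instance a floating (public) IP so juju can reach it
    use-floating-ip: true
```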
<jam> vladk: my son's homework was pretty quick, want to do a hangout to get started on bug #1310255
<_mup_> Bug #1310255: juju-run failure /var/lib/juju/system-identity not accessible in HA <ha> <juju-run> <juju-core:Triaged by klyachin> <https://launchpad.net/bugs/1310255>
<jam> ?
<jam> wwitzel3: do you have a handle on https://bugs.launchpad.net/bugs/1304407 ?
<_mup_> Bug #1304407: juju bootstrap defaults to i386 <amd64> <apport-bug> <ec2-images> <metadata> <trusty> <juju-core:In Progress by natefinch> <juju-core 1.18:In Progress by natefinch> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1304407>
<natefinch> jam, wwitzel3: this is the review: https://codereview.appspot.com/89900043/    this is the branch: lp:~natefinch/juju-core/045-amd64plz
<natefinch> wwitzel3: it just needs a test that we pick amd64 if it's an option
<vladk> jam: let's do it
<jam> vladk: I'm in https://plus.google.com/hangouts/_/canonical.com/vlad-john
<voidspace> wwitzel3: let me know when you're back
<voidspace> hmm...
<voidspace> actually, I'll go on lunch
<voidspace> wwitzel3: I'll catch up with you in a bit
<jam> natefinch: "juju ensure-availability" takes an '-n' parameter that is mandatory. Did you discuss this with Rog?
<jam> I filed bug #1311083 because I think setting 3 is a sane default
<_mup_> Bug #1311083: juju ensure-availability should default to -n=3 <ha> <ui> <juju-core:In Progress by jameinel> <https://launchpad.net/bugs/1311083>
<jam> but I wanted to check if I missed some discussion
<jam> https://code.launchpad.net/~jameinel/juju-core/ensure-availability-default-3-1311083/+merge/216706 or https://codereview.appspot.com/90160044 for anyone who wants a quick review
<jam> ok, this looks to be the conversation: https://codereview.appspot.com/81700043/diff/1/cmd/juju/ensureha.go#newcode58
<jam> Which is "a default of 1 seems wrong" which I agree with
<natefinch> jam: I think 3 is the only sane default, and it's almost always useful to have a sane default
<jam> natefinch: the path from StateServingInfo into writing stuff to disk is *really* convoluted
<jam> natefinch: so I'm trying to fix EnsureAvailability so that it picks the default series based on the existing machines
<jam> natefinch: however, in client_test.go today, we don't actually *have* any state machines when we run the first EnsureAvailability command
<jam> which is completely *not the way it would be*
<jam> and it further has the bug that the StateServingInfo doc in the DB is only created when you *open* the State
<jam> after you've created machine-0
<natefinch> jam: the tests were where we had the most difficulty with this code.  I agree that we shouldn't be testing with zero state servers to start
<jam> but I'm trying to reopen state, and getting "bad password cannot create the log db"
<rogpeppe1> jam: the stateServingInfo doc is created at state initialize time
<rogpeppe1> jam: unless we're upgrading
<jam> rogpeppe1: but in JujuConnSuite there *is* no machine-0
<jam> so there are no valid machines serving the state
<jam> rogpeppe1: so we have to create one, and then get the doc to be recreated
<rogpeppe1> jam: yeah, that's wrong - the tests should add a machine before calling ensureavailability
<rogpeppe1> jam: but the doc is still created
<rogpeppe1> jam: even when there are no machines
<jam> rogpeppe1: well, there is a doc, but it has no contents
<jam> creating the machine
<rogpeppe1> jam: yeah
<jam> doesn't update the doc
<jam> so you have to create the machine
<jam> and then re-open the state
<jam> for it to put machine-0 in
<jam> rogpeppe1: unless there is some other trick I should know about
<rogpeppe1> jam: i don't understand what you mean by putting machine-0 in stateservinginfo - stateservinginfo doesn't have any machine info in
<rogpeppe1> jam: unless you're actually talking about state *server* info
<jam> rogpeppe1: if I do: 	_, err := s.State.AddMachine("quantal", state.JobManageEnviron)
<jam> rogpeppe1: then it doesn't end up in StateServingInfo
<rogpeppe1> jam: StateServingInfo doesn't have any machine info in
<jam> rogpeppe1: so it still looks like there are no machines actually serving the state
<rogpeppe1> jam: i don't understand your statement
<rogpeppe1> jam: unless you're talking about StateServerInfo
<jam> rogpeppe1: sorry, I think you're right about StateServerInfo
<natefinch> because that's not confusing ;)
<jam> just a typo looking through similarly named things on 5 different object types
<rogpeppe1> jam: and StateServerInfo *should* be maintained by AddMachine
<rogpeppe1> jam: if it isn't, that's a regression and needs to be fixed
<rogpeppe1> jam: but i'm pretty sure it is
<rogpeppe1> jam: otherwise the logic wouldn't work
<jam> rogpeppe1: I'm pretty sure just doing AddMachine doesn't put any MachineIds into StateServerInfo
<rogpeppe1> jam: it definitely should do
<jam> EnsureAvailability does
<jam> so the tests that start with EnsureAvailability(1) work
<jam> and bootstrap does
<jam> so in real-life it works
<jam> rogpeppe1: but in real life we never have a case where there is no machine-0
<jam> because life starts somewhere :)
<jam> ahh, ffs, I can't just poke EnsureAvailability either because that notices that len(VotingMachines) == 0
<natefinch> jam: like I said, the tests are where we had the most problems with this code :)  There's just a lot of moving pieces and a lot of mongo quirks to work around.
<jam> natefinch: the fact that JujuConnSuite doesn't match a bootstrapped environ is bad here
<jam> natefinch: it is the only place where you have a Mongo running and *no* machine to host it
<rogpeppe1> jam: about a million years ago I tried to change things so the dummy environ creates a bootstrap machine
<rogpeppe1> jam: but it broke many many tests
<jam> rogpeppe1: I can imagine
<jam> it makes it hard to do things like "machine-0 is *this* type of machine"
<rogpeppe1> jam: because we have loads of tests that assume that the first machine created has id 0
<jam> rogpeppe1: *right* now I think I can make it work if I can re-open State so that we trigger the upgrade logic
 * rogpeppe1 goes and looks at the code
<jam> rogpeppe1: state/open.go has newState() which calls createStateServersDoc
<jam> which notices that if "len info.MachineIds == 0" then it will create it.
<rogpeppe1> jam: yeah, it needs to
<rogpeppe1> jam: because we might be in an upgraded environment
<rogpeppe1> jam: but state.Initialize calls state.Open
<jam> rogpeppe1: sure, it just means I can backdoor it.
<rogpeppe1> jam: you should not need to
<jam> rogpeppe1: I have to create the machine, then get it to run the upgrade code
<rogpeppe1> jam: AddMachine with JobManageEnviron *should* add the machine to StateServerInfo
<jam> rogpeppe1: it doesn't
<jam> flat out
<jam> it doesn't
<rogpeppe1> jam: i thought there were tests that tested that specifically
<jam> rogpeppe1: so I see maintainStateServerOps
<rogpeppe1> jam: State.AddMachines calls maintainStateServerOps
<jam> but when I'm testing it, I see len(VotingMachineIds) == 0
<rogpeppe1> jam: which should do that
<rogpeppe1> jam: there are no *voting* machine ids
<rogpeppe1> jam: but there will be a non-voting machine
<jam> rogpeppe1: if there is only machine-0, why wouldn't it be voting ?
<jam> rogpeppe1: ok, PEBKAC, it wasn't the test I fixed that was failing, it was the other tests.
<rogpeppe1> jam: that's a reasonable question.
<jam> that *weren't* creating a machine
<rogpeppe1> jam: BTW i think the logic should create a voting machine id unless NoVote is true
<jam> rogpeppe1: well, again, that isn't the "standard" path that we're going to get machine-0 is it?
<jam> maybe it is
<jam> the first step is just AddMachine(the thing I just started)
<rogpeppe1> jam: AddMachine should add the machine to the voting servers
<rogpeppe1> jam: i'd be surprised if it doesn't
<jam> rogpeppe1: so it might, I'll have to double check that now that I've sorted out which tests are actually failing
<rogpeppe1> jam: ok
<jam> I certainly expected that we would always have 1 voting machine
<rogpeppe1> jam: i believe we do
<jam> but stuff broke because we never created machine-0
<rogpeppe1> jam: (after the first machine is actually created)
<jam> and I fixed that for one test, and the others broke, and I thought it was my test that was breaking
<jam> ok, we have a voting machine after AddMachine
<rogpeppe1> jam: good. everything seems to be working as it should then.
<rogpeppe1> jam: BTW i think it might be reasonable if AddMachine also took its series from the state server machines by default
<rogpeppe1> jam: because then at least we know that calling juju add-machine will work
<jam> rogpeppe1: I think having PreferredSeries return the state server series instead of LatestLTSSeries would be ok, but I think that is more of a conversation that we need to bring up for discussion.
<rogpeppe1> jam: yeah
<wwitzel3> voidspace: https://code.launchpad.net/~wwitzel3/juju-core/008-ha-rsyslog
<jam> natefinch: rogpeppe1: so axw's comment suggests to me that he wanted a default that means "preserve the existing availability"
<jam> I would be fine making the default 0, and have that interpreted on the server as 3 or whatever N is currently running if N != 1
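That interpretation could be sketched like this (illustrative Go, not juju-core's actual code; the function name is made up):

```go
// Sketch of jam's proposed server-side interpretation of
// "juju ensure-availability -n": an explicit request wins, otherwise
// 0 means "pick a sane default": keep the current count when HA is
// already in place, else go to 3.
package main

import "fmt"

func desiredStateServers(requested, current int) int {
	if requested != 0 {
		return requested // explicit -n always wins
	}
	if current > 1 {
		return current // preserve the existing availability
	}
	return 3 // fresh or single-server environment: default to 3
}

func main() {
	fmt.Println(desiredStateServers(0, 1)) // fresh environment
	fmt.Println(desiredStateServers(0, 5)) // already in HA
	fmt.Println(desiredStateServers(7, 5)) // explicit request
}
```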
<rogpeppe1> jam, natefinch: i have a version of https://codereview.appspot.com/70770043/ that consists of entirely automatic changes and uses "errgo" rather than "errors" as the errgo package identifier. i'd quite like to get it in, but it would be nice to have advance approval because of the overhead of proposing.
<jam> rogpeppe1: as this is not in the critical path to release, I can't commit the time to reviewing it today (especially since I'm 1.5hrs past EOD already)
<rogpeppe1> jam: which comment?
<jam> rogpeppe1: https://codereview.appspot.com/90160044/
<jam> about the default N for ensure-availability
<rogpeppe1> jam: ok. but are you amenable to the change in principle?
 * jam goes to spend time with the family
<rogpeppe1> jam: +1 to making the default the current number
<jam> I haven't participated in the discussion enough to really contribute there.
<jam> I think we had gotten to the point of "yes we should use errgo", but I wasn't a voting member of those discussions
<rogpeppe1> jam: i had a "LGTM but please use errgo for the package identifier"
<rogpeppe1> jam: which is basically what i'm doing now
<rogpeppe1> jam: but that was a little while ago. the LGTM might have expired.
<rogpeppe1> jam: i can make the changes in about 15 seconds, but lbox proposing it takes ages.
<bodie_> hi guys, can I get a review on our MR?  https://code.launchpad.net/~binary132/juju-core/skeletal_actions/+merge/216651
<bodie_> it's a little hefty
<bodie_> whenever you have time
<bodie_> thanks in advance :)
<bodie_> rietveld at https://codereview.appspot.com/90130044
<wwitzel3> voidspace: http://www.rsyslog.com/sending-messages-to-a-remote-syslog-server/
<natefinch> rogpeppe1: what does swapping in errgo get us right now?  Do we get file/line numbers for those errors?
<rogpeppe1> natefinch: it doesn't change any existing external behaviour, but it means we lose our dependency on github.com/errgo/errgo and it sets us up for the next stage.
<rogpeppe1> natefinch: and, yes, we might get file/line numbers for those errors
<natefinch> rogpeppe1: keeping that code from bitrotting seems worthwhile
<rogpeppe1> natefinch: (but those errors aren't really the important ones as they're easy to grep for)
<rogpeppe1> natefinch: that code has already bitrotted (i just do it from scratch each time), but the conversion program may bitrot
<stokachu> is the ProvisioningScript the api call i would use to "juju run" a shell script before any of the hooks are run?
<natefinch> rogpeppe1: ok, just do it.  If the code compiles and the tests pass, then I don't think anyone can argue too much.
<rogpeppe1> natefinch: cool
<stokachu> or is there an api command available that allows me to run arbitrary commands on a machine before any hooks are executed
<natefinch> stokachu: what are you trying to accomplish?
<stokachu> natefinch: writing a plugin to do some initial setup on the machine before running a charm
<natefinch> stokachu: you can always do add-machine, set up some stuff, and then deploy --to it
<stokachu> natefinch: by doing juju run or juju ssh?
<natefinch> stokachu: juju run, yeah.  You can do juju add-machine, juju run --machine 1, juju deploy foo --to 1
<stokachu> so how do i do it through the api?
<stokachu> i want to pass it a script to run
<stokachu> with the api i want to be able to add-machine, upload a script via sftp or something, then execute it before a charm is deployed
<stokachu> i can add-machine through the api just fine
<natefinch> stokachu: there's a RunOnAllMachines api endpoint
<stokachu> so that endpoint RunOnAllMachines, does it support passing a script over the wire? or is it for running commands independently
<natefinch> stokachu: looks like it just takes a list of commands, not a full script
<stokachu> ok
<stokachu> is the run commands done as root as well?
<stokachu> i dont have to worry about sudo or anything like in the charms?
<natefinch> stokachu: yeah, it's run as the "ubuntu" user which generally has root privs
<stokachu> ok cool
<stokachu> so essentially i could run juju scp, then call the api to execute that command
<natefinch> stokachu: that should work, yeah
<stokachu> is there an api for scp or does that have to be run manually, i couldnt find any reference to it in the code
<stokachu> in the api section of the code anyway
<natefinch> stokachu: yeah, scp doesn't use the API
<stokachu> ok cool, thanks man
<natefinch> welcome
<stokachu> natefinch: last question, other than juju scp is there any other way to upload files to machine using the api?
<sinzui> rogpeppe1, natefinch: I don't think Go can only import a master branch from git. Am I wrong?
<rogpeppe1> sinzui: i'm not sure what you mean
<rogpeppe1> sinzui: the only thing in Go that knows about branches is "go get"
<stokachu> marcoceppi: maybe you know? ^
<natefinch> stokachu: I don't think so.  What's wrong with using the juju client's juju scp?
<stokachu> natefinch: if im on a remote machine with no juju client
<natefinch> stokachu: I guess I don't know why you'd be on a random machine without juju, but with whatever code you're writing
<sinzui> rogpeppe1, right. I can *not* "go get" github.com/juju/core/1.18
<natefinch> sinzui: you can always use version control to get a branch before building it with go
<stokachu> natefinch: yea just wanted the ability to interface with juju without relying on juju binary
<natefinch> stokachu: but..... you have *code* that you're relying on somewhere.
<sinzui> rogpeppe1, for CI and releases, I need to keep the branch info separate and I need another step to checkout that branch
<mgz> can I have a review on https://codereview.appspot.com/90310043
<mgz> rogpeppe1: ^ you should be on holiday, no? (but if not... :)
<stokachu> natefinch: yea but that doesn't necessarily mean i have to have juju installed with it
<sinzui> natefinch, That presumes a little too much. Currently, we use go get, then switch to the branch.
<sinzui> natefinch, Then  run godeps to pin all the branches to the correct version
<natefinch> stokachu: well, otherwise you're just duplicating effort with what we have in juju client
<sinzui> natefinch, The new issue is that bzr urls have branch and (implicitly) repo; git only supports repo (implicitly master)
<stokachu> natefinch: what if its library bindings
<sinzui> So I need to change CI and Releases to gather another piece of information, the branch, and I need a step to switch to that branch when juju goes to github
<natefinch> sinzui: yes... although go get launchpad.net/juju-core/1.18 wouldn't work anyway, since it would put the juju-core code in $GOPATH/src/launchpad.net/juju-core/1.18
<natefinch> stokachu: not sure what you mean.
<sinzui> natefinch, It does work, but not that way. Release tarballs always exercise go get, then switch to a branch and revision. With bzr the branch is a url; with git it is just a name.
<natefinch> sinzui: I guess I'm not sure what the problem is, if you had to go get and switch before, and you do the same with git.
<natefinch> sinzui: does it matter if the switch target is a url or a name?  they're both just strings, right?
<sinzui> natefinch, There are some issues because with bzr a url is enough information for CI/releases to get all the juju code. It is not with git, where it is two pieces of information that I need to store now
<natefinch> sinzui: oh, I see.
<sinzui> or I create a pretend url with the extra information
<sinzui> natefinch, This is a niggle that makes CI transition to github awkward. Obviously it can be fixed in a day, but I was hoping that Go had a solution for the problem
<natefinch> sinzui: there is a solution, but it requires us to do a little work ahead of time. You can make a redirector service that can translate, for example  juju.com/v1.18/juju-core into github.com/juju/juju-core (branch 1.18)
<natefinch> sinzui: that's actually the "correct" way to solve this problem.... we've been going against the grain with godeps and using non-head branches etc etc
<sinzui> ah, thank you natefinch. I forgot about that
<stokachu> natefinch: if i'm writing an application using an api library i shouldn't need the actual juju command
<stokachu> the api server supports uploading of charms though
<natefinch> stokachu: the juju client pretty much is an api library, and one that we keep up to date with the rest of our changes.  But I understand it may not work for all purposes.
<stokachu> i think it would be cool to override cloud-init too
<marcoceppi> stokachu: you can use regular rsync
<marcoceppi> you just have to have ssh keys on that machine, and know the IPs
<stokachu> marcoceppi: do i just supply it the ssh key
<stokachu> i think the api lets me get those ssh keys
<marcoceppi> `rsync -avz ubuntu@<juju-machine-ip>:... ./`
<marcoceppi> stokachu: maybe, not sure
<natefinch> stokachu: you can certainly override cloud-init, but there is almost certainly code in jujud that will assume stuff has been done in cloud-init.
<stokachu> so i wanted to maybe utilize the provisioningscript api call to override cloud-init in a manual provider
<stokachu> but it looks like it would be run after a machine is deployed
<voidspace> need moar coffeez
<voidspace> natefinch: ping
<natefinch> voidspace: pacman (so much better than pong)
<voidspace> natefinch: hah, nice - I'll have to remember that
<voidspace> natefinch: so I'm working with wwitzel3 on the rsyslog issue
<voidspace> natefinch: we think we can do most of the work with the rsyslog configuration
<hazmat> hmm.. api-endpoints on trunk has degraded
<hazmat> http://pastebin.ubuntu.com/7308341/
<natefinch> voidspace: awesome
<voidspace> natefinch: with some extra work to update the config when available state servers change
<voidspace> natefinch: with the slight side issue (to be dealt with later) that new state servers possibly want all the previous logs too
<voidspace> natefinch: so we currently have separate config templates for nodes and stateserver machine
<voidspace> natefinch: nodes are currently configured to log just to the *first* state server machine
<voidspace> natefinch: changing that to have multiple rules logging to all of them is easy
<voidspace> natefinch: state servers only log locally (to a file)
<voidspace> natefinch: we need to change this to log locally *plus* to send all log messages generated on the machine to the other state servers
<natefinch> hazmat: what's degraded about that?
<hazmat> natefinch, previously it would return the first line..
<voidspace> natefinch: obviously not to just forward *all* messages, or they would forward all the ones they received and every message would be logged an infinite number of times
<voidspace> which might annoy people
<hazmat> natefinch, its now returning every possible state server address. the whole point of the method is for api clients to connect to the state server.
<voidspace> natefinch: sooo, we need an rsyslog rule that matches locally generated messages
<natefinch> voidspace: yeah, that sounds good.   definitely handling new servers added getting old logs is something that can sit in a todo for right now
<hazmat> natefinch, so private addresses and localhost addresses need not apply. ipv6 addresses should only be shown when asked for, etc. it breaks existing clients that were using the output previously
<voidspace> natefinch: well, we probably need to handle adding new servers unless the info for all state servers is always immediately available
<voidspace> but that aside
<voidspace> (we'll figure that out shortly)
<voidspace> we need a rule that can filter local messages
<natefinch> hazmat: yeah, jam was looking at that too.  Certainly the localhost ones are just wrong (especially since you might legitimately have a local environment running on localhost).
<voidspace> natefinch: I'm currently scouring the interwebs for examples of matching local messages only
<voidspace> natefinch: is this something you know about or should I delve further into my search?
<natefinch> voidspace: I know less about rsyslog than probably anyone else on the team :)
<voidspace> I can see we have %HOSTNAME% available to the templates
<voidspace> natefinch: haha
<voidspace> but I don't think local messages will come in with hostname
<voidspace> maybe we can match on the *absence* of a hostname as we know the format of *forwarded messages*
<hazmat> natefinch, its also confusing because the notion with ha is that each entry would be a different state server
<voidspace> that sounds potentially ropey though
<voidspace> natefinch: ok, I'll continue
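For what it's worth, legacy rsyslog property filters can distinguish locally generated messages, which arrive via the local socket with a fromhost-ip of 127.0.0.1. A hedged sketch of a forwarding rule along those lines (the peer hostnames and port are placeholders, and this assumes the local-file rules come after it):

```
# forward only messages generated on this machine; messages received
# from other state servers fall through to the local file rules below
:fromhost-ip, isequal, "127.0.0.1" @@state-server-2:6514
:fromhost-ip, isequal, "127.0.0.1" @@state-server-3:6514
```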
<natefinch> hazmat: brb, gotta help with the kids for a second.  I think part of the idea is that IPs internal to the environment will work from inside environment, and the external ones might not.
<hazmat> natefinch, one of the primary users of the api is the cli client which isn't in the environment. notions of external/internal are definitely fuzzy
<hazmat> filed a bug on it to track  fwiw http://pad.lv/1311227
<natefinch> hazmat: the idea is that you dial all the addresses (staggered slightly) and use whatever one connects first.  I believe the way the CLI works is that when it connects to one, it resorts that one to the top of the list, so next time it's likely to connect first again.
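The client behaviour natefinch describes, staggered dialing plus promoting the winner, can be sketched roughly like this (illustrative Go with a stand-in dialer, not juju's actual connection code):

```go
// dialFirst tries every address with a small stagger and returns
// whichever one "connects" first; promote moves the winner to the
// front of the cached list so it is likely to win again next time.
package main

import (
	"fmt"
	"time"
)

func dialFirst(addrs []string, dial func(string) bool) (string, bool) {
	done := make(chan string, len(addrs))
	for i, addr := range addrs {
		go func(delay time.Duration, a string) {
			time.Sleep(delay) // stagger the attempts slightly
			if dial(a) {
				done <- a
			} else {
				done <- "" // signal failure so the loop can finish
			}
		}(time.Duration(i)*50*time.Millisecond, addr)
	}
	for range addrs {
		if a := <-done; a != "" {
			return a, true
		}
	}
	return "", false
}

func promote(addrs []string, winner string) []string {
	out := []string{winner}
	for _, a := range addrs {
		if a != winner {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	addrs := []string{"10.0.3.1:17070", "203.0.113.5:17070"}
	up := func(a string) bool { return a == "203.0.113.5:17070" }
	if winner, ok := dialFirst(addrs, up); ok {
		fmt.Println(promote(addrs, winner))
	}
}
```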
<hazmat> natefinch, the issue is a bit more than that.. if this api is used by agents then the results need to differ by login
<hazmat> natefinch, there's little value in a cli client connecting to localhost, private ip addresses, or ipv6 (unless requested)..
<hazmat> where as agents preferentially want the private addresses
<natefinch> hazmat: right
<natefinch> hazmat: probably the ultimately right thing to do is return the same list to everyone, but include the network scope, so the consumer can decide which addresses are appropriate for itself.  the client would only look at public, agents would only look at cloudlocal, etc.
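natefinch's scope-tagging idea could be sketched like so (the type and scope names here are hypothetical, not juju-core's actual API):

```go
// Return every address tagged with its network scope and let each
// consumer filter: the CLI would ask for public addresses, agents for
// cloud-local ones.
package main

import "fmt"

type Scope string

const (
	ScopePublic     Scope = "public"        // reachable from outside the cloud
	ScopeCloudLocal Scope = "cloud-local"   // e.g. 10.x addresses inside the cloud
	ScopeMachine    Scope = "machine-local" // localhost
)

type Address struct {
	Value string
	Scope Scope
}

// filterByScope keeps only the addresses with the scope the consumer wants.
func filterByScope(addrs []Address, want Scope) []string {
	var out []string
	for _, a := range addrs {
		if a.Scope == want {
			out = append(out, a.Value)
		}
	}
	return out
}

func main() {
	addrs := []Address{
		{"203.0.113.5:17070", ScopePublic},
		{"10.0.3.1:17070", ScopeCloudLocal},
		{"127.0.0.1:17070", ScopeMachine},
	}
	fmt.Println(filterByScope(addrs, ScopePublic))     // what the CLI would use
	fmt.Println(filterByScope(addrs, ScopeCloudLocal)) // what agents would use
}
```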
<hazmat> yeah.. more incompatible output ;-)
<hazmat> natefinch, in that case (incompatible output) differentiating the state servers would also be good
<natefinch> hazmat: can't keep the world frozen forever :)    Really, the only problem with the current output (aside from the fact that it's not the same as what we had before) is that some of the addresses are ambiguous (localhost, 10. addresses from outside the cloud, etc)
<hazmat> natefinch,  well the current output is a regression for users of that cli command imo
<natefinch> hazmat: the API servers are all identical... there's no reason to differentiate
<hazmat> natefinch, 'so the consumer can decide what addresses is appropriate for itself. '
<hazmat> also applies to multiple distinct endpoints.. even if resulting traffic is identical
<voidspace> wwitzel3: example of having separate rulesets for local and received messages
<voidspace> wwitzel3: http://www.rsyslog.com/tag/more-complex-scenarios/
<hazmat> natefinch, ie a client wants to grab a set of ipv4 public addresses to round robin to.. how do they know they're getting an ha set without servers identified distinctly
<natefinch> hazmat: I'm not sure I understand.  You get a list of IP addresses the environment says are for the API.... what does that have to do with HA?  We always return addresses for all state servers.  If you're in HA, you'll get addresses from multiple servers.  What does the client need to know?
<hazmat> natefinch, i have a client that refuses to connect to an env unless its in ha mode.. how does it know the difference without state servers identified in the result?
<natefinch> hazmat: juju status would show multiple state servers.  api-endpoints is not the call to make to determine if you're in HA mode.
<hazmat> natefinch, it is the call to get the api endpoints
<hazmat> natefinch, which is what the client needs to connect to.. if one server has 4 public ip addresses, how does the client get only the ones representing a unique set of endpoints for the state servers it can talk to
<hazmat> it can't unless the results also distinguish state servers in addition to endpoints
<natefinch> hazmat: but it doesn't matter what API server it talks to
<hazmat> natefinch, the traffic doesn't matter, the connection being ha does.. ie. the client wants to be able to fall back to a different server, how does it know there's a different server to fall back to?
<natefinch> hazmat: it has more API addresses it can connect to, just round robin on those.
<hazmat> natefinch, if they all point to the same server?
<hazmat> natefinch, you expressed the desire to pass the information to the client, but you're also saying let's not pass it because it shouldn't matter
<hazmat> that's not very consistent
<natefinch> hazmat: the information I wanted to pass was the type of network address, not the machine that was behind it.
<hazmat> i'm saying it does matter if the client wants to know that there are multiple vms it can connect to
<hazmat> why should the client try a bunch of addresses for a machine it may know it can't talk to ...
<hazmat> because that machine is down now.. when if it knew which addresses mapped to the state server, it could just fallback to the next actual server
<natefinch> hazmat: it just seems like you're trying to overload api-addresses.  There's juju status if you want to know the status of the state servers
<hazmat> natefinch, it's already been overloaded
<hazmat> natefinch, i'm trying to clean it up.. you suggested breaking compatibility already.. i'm saying well let's give the clients all the info they need.
<natefinch> hazmat: well, we could return yaml document with each server and a list of addresses with what type they are (public, cloud local, etc)
<hazmat> natefinch, sounds good
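The per-server document natefinch describes might look roughly like this (a hypothetical sketch; the keys and machine names are illustrative, not juju's actual output):

```yaml
# hypothetical grouped api-endpoints output
state-servers:
  machine-0:
    - address: 54.12.34.56
      type: public
    - address: 10.0.0.4
      type: cloud-local
  machine-1:
    - address: 54.12.34.57
      type: public
    - address: 10.0.1.9
      type: cloud-local
```

With the addresses grouped by machine, a client can both round-robin across distinct servers and filter down to the address type it can actually route to.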
<natefinch> hazmat: probably not something we can do this week, though. We're pretty slammed, and a ton of people are off at gophercon
<hazmat> natefinch, yeah... i'm already on site
<hazmat> natefinch, it's a regression though in the dev version, so it would be good to fix up or make compatible at least for the next stable, not sure when that's scheduled
<natefinch> hazmat: pretty sure that'll be post-vegas.   But we should be able to target it for that, I'd think.
<natefinch> hazmat: I wonder if we should have tests external to juju-core that show any break in compatibility with old versions.  Obviously they wouldn't be able to test everything, but simple things like this could at least be caught and discussed / noted in release notes.
<natefinch> hazmat: I gotta run to a doctor's appointment, back in an hour or so.
<hazmat> natefinch-afk, yeah.. part of the issue is that the functional regression suite only hits a small fraction of the surface area (bootstrap/destroy-env/deploy/add-unit/relation/expose etc)
<hazmat> i should setup some sort of ci for some of the extant plugins
<hazmat> natefinch-afk, hmm.. although i just ran into an interesting case and regression as a result of the same.. i'm currently spanning regions with manual provider which used to work fine.. but with 1.19 breaks because the machines initially connect on public, get the cloud-local address, and attempt to reconnect back with that cloud-local address, which is broken since the machines are on different cloud-local/subnets, resulting in down machines.
<hazmat> http://paste.ubuntu.com/7308797/
<bac> hi sinzui, can i pick your brain about logging in to charmworld?
<sinzui> bac sure
<bodie_> thanks for the feedback natefinch
<bodie_> :)
<bodie_> i'll reply in thread
<natefinch> bodie_: no problem.  Glad to have this work getting done. I don't remember if I said so, but the code looks generally good.
<bac> sinzui: may have to wait until tomorrow...
<stokachu> so im trying to write some code against juju-core but i keep hitting this error: http://paste.ubuntu.com/7310622/
<stokachu> i thought someone brought this up previously but couldnt find it in my irc logs
<stokachu> cmars: i get that same error when trying to build your juju-nat plugin too
<stokachu> cmars: have you run into that issue before?
<Guest46215> stokachu: looks like you've pulled tip of code.google.com/p/go.crypto which breaks juju. Go's lack of proper dependency mgmt sucks. you can run godeps -u dependencies.tsv to get the correct revision
<stokachu> Guest46215: ah perfect that worked
<stokachu> Guest46215: thanks!
#juju-dev 2014-04-23
<perrito666> anyone know his way around state/open.go ?
<wallyworld> perrito666: it depends on what you want to know, i might be able to help
<stokachu> cmars: hah guess you figured i based my plugin off yours :)
<wallyworld> axw: mornin'. can we have a hangout now instead of in an hour?
<stokachu> was gonna give you credit once i got something working
<axw> morning wallyworld. sure thing, just give me a moment
<axw> wallyworld: erm, my sound isn't working. gotta fix that first...
<wallyworld> ok
<perrito666> wallyworld: tx, sadly my head is falling on the kb so I better hit the bed before I introduce a bug instead of fixing the current one
<wallyworld> perrito666: np, i'm on a call anyway now. if you have a question, feel free to email to the list or ask again later
<perrito666> wallyworld: ok, I am more curious about fixing this bug than about going to sleep :p so here I go
<perrito666> I am trying to fix the restore functionality
<perrito666> now, at some point the restore calls state.Open(), I tried to replace it by using juju.NewConn and NewConnFromName and in all cases, it times out at mgo.DialWithInfo while trying to call Ping()
<wallyworld> perrito666: ok. there may also be someone else looking into that from juju-core
<perrito666> "that" being?
<wallyworld> i think Horacio DurÃ¡n
<perrito666> sadly that would be me
<wallyworld> he's started to fix some of the backup bugs and was also going to look at restore
<wallyworld> oh
<wallyworld> hi
<perrito666> hi
<wallyworld> i didn't realise!
<wallyworld> perrito666: give me a couple of minutes to finish this call
<perrito666> sure
<wallyworld> perrito666: sorry, back now
<wallyworld> i'm not across the restore stuff specifically
<perrito666> wallyworld: I think the restore part of my explanation can be safely ignored
<axw> wallyworld: gotta go to the shops for a little while, bbs
<perrito666> I just provided it for context
<wallyworld> axw: sure, np
<wallyworld> perrito666: so you are looking to, in general, replace calls to state.Open() with juju.NewConn ?
<wallyworld> to use the api
<wallyworld> so you definitely have a state server running?
<wallyworld> api server even
<perrito666> wallyworld: well I am pretty sure I do, I try to query mongo by hand and it responds, yet when juju tries to dial it just timeouts
<wallyworld> mongo != api server though
<wallyworld> the api server listens on port 17070
<perrito666> true, although I am pretty sure this breaks before getting to state
<wallyworld> what code are you changing?
<perrito666> well, current existing code calls open, open in time calls DialWithInfo
<wallyworld> which file?
<perrito666> DialWithInfo creates a session
<perrito666> ah sorry
<perrito666> state/open.go
<wallyworld> sure, but the caller to that
<wallyworld> which caller of state.Open() is being replaced?
<perrito666> cmd/plugins/juju-restore/restore.go
<perrito666> around :187
<wallyworld> so at the time restore runs, is there a bootstrap node running?
<wallyworld> i don't think there is
<wallyworld> ah there may be
<wallyworld> cause looks like it calls rebootstrap()
<perrito666> there is
<wallyworld> but you might find that it is just that the api server has not started yet
<wallyworld> cause it can take a while to spin up the bootstrap node and then start the services
<wallyworld> maybe to see if that's the issue, pause the restore script or add in a big attempt loop to see if it just needs more time
<perrito666> wallyworld: mm I tried looping on that
<perrito666> I waited 30 mins total
<perrito666> that is a lot
<wallyworld> can you do a juju status when it fails?
<wallyworld> ie does juju status work?
<wallyworld> that would need an api server connection
<perrito666> mm, it does not
<wallyworld> so if juju status is broken also, then there's an issue with the bootstrap node
<wallyworld> you would need to ssh in and look at the log file
<wallyworld> cause it could be the node itself starts but then the juju services fail to start
<perrito666> mm, the service seems to be running, I even restarted it by hand
<perrito666> in what port should the state server be listening?
<stokachu> 37017
<wallyworld> 17070
<wallyworld> 37017 is mongo
<wallyworld> perrito666: when you say you restarted the state service by hand, that doesn't make sense to me because the state service runs inside the machine agent - did you start jujud?
<perrito666> wallyworld: yes
<wallyworld> and the machine log file is good?
<wallyworld> and yet juju status fails also
<wallyworld> there's gotta be something logged which shows the problem
<wallyworld> until something like juju status is happy, then the code changes to restore.go won't work either
<perrito666> wallyworld: interesting though, restore is trying to open a state server on 37017
<wallyworld> the current restore is using state.Open()?
<wallyworld> it will because it connects straight to mongo
<wallyworld> the new juju.NewConn() methods instead go via the api server on port 17070
<perrito666> aghh, juju.NewConn fails just as Open does, so something is definitely broken in my recently restored node
<stokachu> wallyworld: is that in trunk yet?
<stokachu> my logs show NewConnFromName accessing mongo directly on 37017
<wallyworld> stokachu: the api server stuff?
<stokachu> yea
<wallyworld> yes, been there since 1.16
<wallyworld> used universally since 1.18
<wallyworld> perrito666: i'd be surprised and sad if the log files on that node didn't show what was wrong
 * perrito666 runs the extremely tedious setup script
<wallyworld> perrito666: it will still be waiting for you tomorrow after you get some sleep :-)
<perrito666> wallyworld: certainly but now its personal
<wallyworld> lol
<wallyworld> feel free to pastebin logs files if you want some more eyes
 * perrito666 paints canonical logo on his face and yells mel gibson style
<stokachu> woot i actually a juju plugin to do something in go
<perrito666> stokachu: I sense a verb missing there :p
<wallyworld> would have been funnier if you said "i a missing verb there" :-)
<stokachu> hah
<axw> back...
<stokachu> too much time looking at juju core code
<waigani> wallyworld: axw: I'm here for standup
<perrito666> wallyworld: my wife is watching tv in spanish next to me, with the 2-language module enabled in my head I lose the capacity for witty sentences in both languages
<wallyworld> waigani: huh? i thought you were on holidays so we had it early :-)
<axw> waigani: we already had it early, weren't expecting you
<wallyworld> but we can have another
<waigani> :(
<waigani> I'm in auk airport
<waigani> okay, maybe I can talk through what I'm doing?
<wallyworld> waigani: sure, i'm in the hangout
<axw> brt
<perrito666> wallyworld: https://pastebin.canonical.com/108967/
<wallyworld> perrito666: looking, sorry was otp
<perrito666> on the same note https://pastebin.canonical.com/108968/
<wallyworld> perrito666: is there any more in machine-0.log?
<perrito666> wallyworld: well, there is before that although I am not sure if I can distinguish between pre/post restore (restore is a particularly ugly thing)
<wallyworld> perrito666: what i mean is, after the output you logged. that log looks ok i think. there was one timeout with the api client connecting but that can happen and it appeared to be ok after that, but i wanted to be sure by looking at subsequent logging
<perrito666> nope, after that it just loops with https://pastebin.canonical.com/108969/
<wallyworld> hmmm, ok. so that says there is an issue with the api server
<wallyworld> you may need to enable trace level logging and/or add extra logging to see why it's failing. i wonder if netstat shows the port as open
<perrito666> tcp        0      1 10.140.171.13:59925     10.150.60.153:17070     SYN_SENT    4001/jujud
<wallyworld> that's a different ip address to what is being dialled
<wallyworld> oh no
<wallyworld> it's not
<perrito666> nope, just without the dns name
<wallyworld> yeah
<wallyworld> if it were me, i'd have to add lots of extra debug logging at this point to see what's happening as i'm out of ideas
<wallyworld> but you can see even internally the machine agent api client can't start
<wallyworld> so there's a core issue with starting the api server itself
<wallyworld> axw: local provider is sorta ok. it doesn't like starting precise containers on trusty although it used to. and if i start a precise container first and it fails, subsequent trusty containers also fail, but starting a trusty container first works
<perrito666> wallyworld: well, I think the restore step is actually breaking the state api server
<perrito666> since it works right before
<wallyworld> likely
<perrito666> (restore bootstraps a machine and then untars the backup on top of it)
<wallyworld> roger wrote all that so i have no insight off the top of my head as to what might be wrong
<axw> wallyworld: ah ok. there have been a few bugs flying around about host vs. container series mismatch not working
<wallyworld> axw: yeah, i'm going to try explicitly setting default series to see if i can get precise to work. but precise failing should not also then kill trusty :-(
<perrito666> wallyworld: I think there might be something wrong with the backup, tomorrow I will strip one into pieces and see what is wrong. as for me, I am now officially out, or tomorrow I will be sleeping on the kb at the standup
<wallyworld> np, good night :-)
<axw> wallyworld: oh I didn't see that bit... weird
<wallyworld> yeah
<axw> wallyworld: I think you can also bootstrap --series=trusty,precise to get it to work
<axw> not sure why trying precise would fail trusty tho
<wallyworld> ta, will try that also to try and get a handle on it
 * wallyworld -> food
<axw> wallyworld: I just pasted the output I see from destroy-environment with manual
<axw> wallyworld: it's as I expected
<wallyworld> axw: i missed it as my laptop got disconnected
<axw> wallyworld: I mean I pasted it in the bug
<wallyworld> ah, looking
<axw> #1306357
<_mup_> Bug #1306357: destroy environment fails for manual provider <destroy-environment> <manual-provider> <juju-core:Incomplete> <https://launchpad.net/bugs/1306357>
<wallyworld> axw: clearly then i need to get my eyes tested as i had thought i included it all, sorry :-(
<wallyworld> although i wish the last error was first
<axw> wallyworld: nps. it does kinda get lost down there...
<wallyworld> as it would read much nicer that way
<wallyworld> ie root cause, followed by option to fix
<axw> wallyworld: I'm going to look at fixing these openstack tests. If you do have any spare time, it would still be useful if you could review the placement CL
<axw> but if you're busy then that's okay
<wallyworld> axw: funny you should mention that - just finished another review and am looking right now
<axw> wallyworld: cool :)
<wallyworld> axw: this is a personal view, but i tend to think that if a method returning a (value, error) returns a err != nil, then the value should be considered invalid. so this bit irks me:
<wallyworld> if c.Placement != nil && err == instance.ErrPlacementScopeMissing {
<wallyworld> i would use an out of band signal like a bool or something
<axw> wallyworld: err was originally nil, that was something william wanted
<axw> I suppose I could change it to return a nil placement, and have the caller construct one
<wallyworld> hmmm. is there value in adding a bool to the return values
<wallyworld> or something
<axw> I don't really think so, then you may as well just check if the placement has a non-empty scope
<wallyworld> i sorta think that err != nil meaning the value is bad is kinda idiomatic Go
<axw> yeah... probably should have just left it as it was
<wallyworld> change it since he isn't here :-)
<axw> wallyworld: I think I will just change it to return a nil Placement, and then the caller will create a Placement with empty scope and the input string as the directive field
<wallyworld> ok
<wallyworld> i think that sounds good
<axw> the caller needs to know the rule anyway, at least this way it's the usual case of nil value iff error
<wallyworld> sorta best of both worlds
<wallyworld> ta
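The convention agreed above (a nil Placement iff error, with the caller falling back to an unscoped directive) could look something like this sketch; the names loosely mirror juju's instance package but are reimplemented here for illustration:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Placement is a "scope:directive" pair, loosely mirroring
// juju's instance.Placement.
type Placement struct {
	Scope     string
	Directive string
}

var ErrPlacementScopeMissing = errors.New("placement scope missing")

// ParsePlacement returns a nil Placement together with
// ErrPlacementScopeMissing when the input has no scope prefix,
// keeping the usual nil-value-iff-error shape.
func ParsePlacement(s string) (*Placement, error) {
	i := strings.IndexRune(s, ':')
	if i < 0 {
		return nil, ErrPlacementScopeMissing
	}
	return &Placement{Scope: s[:i], Directive: s[i+1:]}, nil
}

func main() {
	p, err := ParsePlacement("host1")
	if err == ErrPlacementScopeMissing {
		// The caller knows the rule: no scope means the whole
		// string is the directive.
		p = &Placement{Directive: "host1"}
	}
	fmt.Printf("%q %q\n", p.Scope, p.Directive)
}
```

This keeps the err != nil means invalid-value idiom intact while still letting callers that know the fallback rule construct the unscoped placement themselves.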
<wallyworld> axw: with these lines in addmachine
<wallyworld> if params.IsCodeNotImplemented(err) {
<wallyworld> 	if c.Placement != nil {
<wallyworld> is there any point trying again if c.Placement is nil?
<wallyworld> should it just be a single if ... && ...   ?
<axw> wallyworld: yes we should try again, because we're calling a new API method
<axw> wallyworld: client.AddMachines now calls a new API method by default
<axw> wallyworld: and client.AddMachines1dot18 calls the old one
<wallyworld> oh, right. hadn't got to that bit yet, i recalled it was the same api from earlier review
<axw> it was, I fixed it :)
<wallyworld> but i guess versioning
<wallyworld> wish we had it
<axw> indeed
<stokachu> do i have to invoke "scp" with the ssh.Copy function in utils/ssh?
<axw> stokachu: the openssh client impl will delegate to scp, if that's what you're asking
<stokachu> https://github.com/battlemidget/juju-sos/blob/master/main.go#L89-L94
<stokachu> so im trying to replicate juju scp within my plugin
<stokachu> this is my log output : http://paste.ubuntu.com/7312090/
<stokachu> i think my actual copyStr is incorrect as i was following what is required by juju scp
 * axw looks
<axw> stokachu: I think you want the target and source in separate args
<stokachu> im a newb with golang as well so apologies if i got stupid stuff in there
<stokachu> lemme try that
<axw> stokachu: i.e. a length-2  slice
<stokachu> ok lemme see if i can make that happen
<wallyworld> axw: is there a reason why we store placement as a string and not a parsed object. and hence precheck takes a string and not a parsed struct etc. i would normally look to parse on the way in and then pass around the parsed struct etc so we fail as close to the system boundary as possible. am i missing a design decision?
<stokachu> sweet, gotten farther http://paste.ubuntu.com/7312102/
<axw> wallyworld: originally I did that, william wanted it changed. it should not get to the environment if the scope doesn't match
<stokachu> though maybe i should be using the instance.SelectPublicAddress of machine?
<wallyworld> axw: hmmmm. ok. i disagree with william here then :-(
<axw> stokachu: cool. ahh, "juju scp" does the magic of converting machine IDs to addresses
<axw> wallyworld: why? the environment should not need the scope
<stokachu> ive got a execssh that i borrowed from someone that uses instance.selectpublicaddress
<stokachu> going ot try that
<wallyworld> axw: what i mean is that the string should be parsed into whatever internal representation makes sense at the system boundary ie a struct of some sort, possibly different to what is used on the client ie minus the scope
<axw> stokachu: see juju-core/cmd/juju/scp.go, hostFromTarget  -- that's where it maps machine IDs to addresses
<wallyworld> and internal apis should then use that typed struct
<stokachu> axw: ahh i see that now
<wallyworld> not an "untyped" string
<wallyworld> but, doesn't matter, it's already been changed to get approval
<stokachu> too bad expandArgs isn't public
<axw> wallyworld: the directive string is free-form, so how are you going to do that?
<axw> wallyworld: it's up to the provider to decide what makes sense in directives
<wallyworld> axw: ah bollocks, i was thinking there was more to it than just a string. but you are saying that by the time it's stored, it represents a maas name or whatever
<wallyworld> that makes more sense. i hadn't fully re-groked the implementation
<axw> wallyworld: as far as the infrastructure is concerned, it's an opaque blob of bytes. the provider will interpret it. provider/maas will interpret it as maas-name to start with
<wallyworld> ok
<axw> we may converge on some convention, like thing=value
<axw> az=uswest-1 or whatever
<axw> stokachu: it's also worth noting that some providers (e.g. azure) require proxying through machine 0
<axw> stokachu: so you may want to just shell out to "juju scp" if you can...
<stokachu> axw: ah good point
<stokachu> cleaner than what im doing
<stokachu> is there a shell function in juju-core thats exposed?
<stokachu> or should i just use os.Exec
<axw> stokachu: os/exec is as good as anything
<stokachu> axw: good deal
<stokachu> ill do that instead
<axw> there are some utils in juju, but I don't think they'd be useful
<stokachu> cool no worries
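Shelling out to juju scp with os/exec, as suggested, might be sketched like this (scpCommand is a hypothetical helper; the "--" separator is the trick discussed later in the log, keeping juju from parsing -r itself):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// scpCommand builds a "juju scp" invocation that recursively copies
// remotePath from a machine to localPath. The "--" stops juju's own
// flag parsing so -r is passed through to scp.
func scpCommand(machine, remotePath, localPath string) *exec.Cmd {
	return exec.Command("juju", "scp", "--", "-r",
		fmt.Sprintf("%s:%s", machine, remotePath), localPath)
}

func main() {
	cmd := scpCommand("1", "/tmp/sosreport.xz", ".")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	// Actually running this requires a bootstrapped environment,
	// so here we only show what would be executed.
	fmt.Println(cmd.Args)
}
```

Delegating to the juju CLI this way also inherits its handling of providers (e.g. azure) that proxy connections through machine 0, which a hand-rolled ssh.Copy call would miss.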
<wallyworld> axw: yeah, i'm a fan of a little more structure. but none the less, land that f*cker
<jam> hazmat: fwiw the first line that api-endpoints returns is the one that we last connected to, so if you just do "head -n1" you can get the same output we used to give
<axw> wallyworld: thanks
<wallyworld> np. sorry if i went over old ground
<axw> nope, that's cool
<wallyworld> jam: i was going to get your opinion on that bug - i'd like to close now as "invalid" or whatever given the other fix has landed
<jam> wallyworld: sorry, which bug?
<wallyworld> jam: the one you just remarked on above
<wallyworld> bug 1311227
<_mup_> Bug #1311227: juju api-endpoints cli regression on trunk/1.19 <api> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1311227>
<jam> wallyworld: localhost shouldn't be in the output
<jam> and I would be fine pruning ipv6 by default
<wallyworld> jam: it can be for local provider since localhost is the public address for local provider
<wallyworld> jam: martin's branch does prune ip6 by default
<jam> wallyworld: sure, I'm not saying don't print localhost when that's the address, but *don't* print localhost for ec2
<axw> we shouldn't have localhost for ec2, but we would have 127.0.0.1 and that'll get pruned
<wallyworld> jam: martin's branch probably ensures that's the case, since for ec2 localhost is machinelocal isn't it?
<jam> wallyworld: hmmm... I don't know that Martin's patch is *quite* right. I'd rather still cache IPv6, but just not display them on api-endpoints
<axw> we don't use any scope heuristics for hostnames
<jam> wallyworld: right, I think his patch is what we want, and we do want to be caching the network scope data instead of just addrs
<wallyworld> jam: it's ok for now i think since we don't need/use ip6 yet
<wallyworld> jam: so, i think then that kapil's bug has 2 bits 1. the ip6/127.0.0.1 stuff which martin's branch fixes, and 2. the multiple api address thing which is new and intended
<wallyworld> so therefore we can mark the bug as invalid
<wallyworld> right ?
<jam> wallyworld: so I still think there are bits that we can evolve on api-endpoints. Namely, to change what we cache from just addrs to being the full HostPort content (which includes network scope), and then api-endpoints can grow flags to do --network-scope=public
<jam> wallyworld: so while I think we've addressed the regression today
<jam> I don't think the bug is "just closed"
<wallyworld> sure, but that's not the bug as described
<wallyworld> we can get it off 1.19.1 at least
<jam> wallyworld: right, i think the *regression* portion is stuff that we intend (multiple addresses, even per server), because we think they might be routable
<jam> and we don't save enough information (yet) to be able to provide --network-scope
<wallyworld> yep, i don't see any regression at all
<jam> (and then default it to public)
<jam> wallyworld: giving private addresses in api-endpoints by default is wrong
<jam> but "good enough" for now.
<jam> And hazmat has a point about actually grouping the data by server, so you have a feeling for what machine is a fallback
<wallyworld> ok, so let's retarget off 1.19.1 then
<jam> SGTM
<wallyworld> jam: 2.0 or 1.20?
<wallyworld> 2.0 i guess?
<jam> I'd be ok with 2.0
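Caching full HostPort-style records and filtering by scope, as jam describes, could be sketched as follows (the Address type and scope names here are illustrative, loosely following juju's instance package rather than its exact API):

```go
package main

import "fmt"

// Address pairs an address value with its network scope, loosely
// following juju's instance.Address.
type Address struct {
	Value string
	Scope string // "public", "cloud-local", "machine-local", ...
}

// filterByScope keeps only addresses in the requested scope, which is
// roughly what an api-endpoints --network-scope=public flag would show.
func filterByScope(addrs []Address, scope string) []string {
	var out []string
	for _, a := range addrs {
		if a.Scope == scope {
			out = append(out, a.Value)
		}
	}
	return out
}

func main() {
	addrs := []Address{
		{"54.12.34.56", "public"},
		{"10.0.0.4", "cloud-local"},
		{"127.0.0.1", "machine-local"},
	}
	fmt.Println(filterByScope(addrs, "public"))
}
```

Storing the scope alongside each address is what makes a default of "public only" possible without losing the cloud-local entries that in-cloud clients still want.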
<waigani> axw: when I use restore with patchValue I get this error: http://pastebin.ubuntu.com/7312196/
<stokachu> so heres my latest change using juju scp https://github.com/battlemidget/juju-sos/blob/master/main.go#L89-L96
<stokachu> and the error output http://paste.ubuntu.com/7312200/
<stokachu> i verified with juju ssh 1 that /tmp/sosreport*xz exists on the machine
<waigani> anyway, I need to go catch a plane
<axw> waigani: sorry, need more context. show me in vegas :)
<stokachu> axw: -r doesn't work with machine num it seems
<stokachu> juju scp 1:/tmp/test . works
<stokachu> but juju scp -r 1:/tmp/test* . fails
<axw> stokachu: you need to separate the command out into individual args
<axw> stokachu: i.e. "juju", "scp", ...
<stokachu> this is manually running the command from the shell
<axw> stokachu: there are some limitations with juju scp, I forget exactly how to pass extra args... lemme see
<stokachu> http://paste.ubuntu.com/7312211/
<stokachu> thats what ive tested manually
<axw> stokachu: stick "--" before -r
<stokachu> axw: you da man
<jam> axw: is that juju 1.16? as 1.18 is a bit broken wrt scp
<jam> stokachu: in 1.18 (for a while until it gets fixed) args for just scp must come at the end and be grouped
<axw> jam: well I'm on trunk... I forget which versions do what wrt scp
<jam> so: juju scp 1:foo 2:bar "-r -o SSH SpecialSauc"
<axw> jam: what I just described does work on trunk, so presumably on 1.18 too?
<stokachu> ah
<axw> jam: i.e. I just tested "juju scp -- -r 0:/tmp/foo /tmp/bar"
<jam> axw: https://bugs.launchpad.net/juju-core/+bug/1306208 was fixed in 1.18.1 I guess
<_mup_> Bug #1306208: juju scp no longer allows multiple extra arguments to pass throug <regression> <juju-core:Fix Released by jameinel> <juju-core 1.18:Fix Released by jameinel> <juju-core (Ubuntu):Fix Released> <juju-core (Ubuntu Trusty):Fix Released> <https://launchpad.net/bugs/1306208>
<jam> axw: trunk just lets you pass everything, and you shouldn't need "--" I thought
<axw> you do need --, otherwise juju tries to interpret the args
<jam> axw: fairy nuff
<stokachu> yea i had to use -- with 1.18.1-trusty
<stokachu> axw: that worked :D:D
<axw> stokachu: cool :)
<vladk> jam: morning
<jam> morning vladk, its early for you, isn't it ?
<jam> well, early for you to be on IRC :)
<fwereade> good mornings
<waigani> fwereade: morning :)
<jam> morning fwereade, we've missed you
<fwereade> waigani, jam: it's nice to be back :)
<waigani> heh, easter holiday?
<jam> brb
<axw> hey fwereade
<axw> fwereade: I was about to approve https://codereview.appspot.com/85040046 (placement directives) - do you want another look first?
<fwereade> axw, I'll cast a quick eye over it :)
<axw> okey dokey
<fwereade> axw, ok, based on a quick read of your responses I think I'm fine -- my only question is exactly what happens with the internal API change as we upgrade
<axw> fwereade: the provisioner will be unhappy until it has upgraded
<fwereade> axw, I *think* that it's fine, given that the environment provisioner only runs on the leader state server, and therefore the upgrade happens in lockstep
<fwereade> axw, but other provisioners?
<fwereade> axw, hm, I have a little bit of a concern about error messages during upgrade
<axw> fwereade: it will be the same for the container provisioners, I think
<jam> back
 * axw checks
<fwereade> axw, *we* might know they're fine
<fwereade> axw, but people who read our logs don't get quite such a sunny prospect of our general competence
<jam> axw: so we talked about having EnsureAvailability with a value of say 0 just preserve the existing desired num of servers
<jam> AFAICT, we never *record* the desired number of servers
<jam> we just have a number of things that are running.
<axw> jam: it's implied by what's in stateServerInfo
<jam> and we have stuff like WantsVote() but I can't see anywhere that sets NoVote=true to indicate that we no longer want to be voting.
<axw> jam: len(VotingStateMachineIds)
<axw> jam: that's done in EnsureAvailability, in state/addmachine.go
<jam> axw: sure, but isn't that the actual ones that are voting? I guess it would be an availability check?
<fwereade> axw, this must ofc be balanced against the hassle of maintaining the multiple code paths
<axw> jam: VotingMachineIds is really the ones that *want* to vote
<axw> fwereade: just checking still, sorry
<fwereade> axw, np
<fwereade> axw, what I did with the unit agent the other day was just to leave it blocking until the state server it's connected to *does* understand the message, and then continue as usual
<axw> fwereade: yeah, this is common to all provisioners - it will cause an error on upgrade for container provisioners
<axw> hmm ok
<axw> I'll take a look at that code
<axw> fwereade: worker/uniter?
<fwereade> axw, it's not the best code in the world but it seemed to work
<fwereade> just a sec yeah somewhere there
<axw> fwereade: got it I think
<axw>             logger.Infof("waiting for state server to be upgraded")
<axw> yeah okay, I can add that in
<fwereade> axw, cool
 * axw senses another need for API versioning imminently
<axw> although I suppose we can just see that fields are zero values...
<axw> fwereade: yuck, this means threading the tomb all the way through... oh well.
<axw> I suppose it's for the best
 * fwereade glances pointedly at jam re API versioning
 * jam ducks and pretends to catch a plane
 * fwereade does understand
<jam> fwereade: I made sure it was in the topics list
<fwereade> jam, great, thanks :)
<axw> jam: sorry, back to ensure-ha: if you just send 0 or -1 to state.EnsureAvailability, then it can load st.StateServerInfo() and set numStateServers=len(VotingMachineIds)
<jam> axw: I'm going to use 0, because it isn't otherwise valid, and we don't have to worry about negative numbers.
<axw> sounds good
<jam> axw: I was thinking to do that originally, but trying to verify the actual meaning of the various values was ... tricky
<axw> oh I don't have to thread the tomb, hooray
<axw> jam: it's not super clear, I agree
<jam> axw: I was reading through the code and trying to figure out what the actual invariants are
<jam> axw: I was really surprised that ensureAvailabilityIntentions doesn't take into account the new request
<jam> so we end up with 2 passes at it
<jam> also, the WantsVote vs HasVote split is confusing. Probably necessary, but very confusing
<axw> jam: yeah, we need to know what the existing ones want to do
<axw> jam: we certainly could do with some developer docs on this
<axw> I don't understand what the peergrouper does, haven't looked at it at all
<axw> I know what EnsureAvailability does, but it's easy to forget :)
<jam> axw: one advantage of "-1" is that it is odd :)
<axw> heh
<jam> axw: I took out the <= 0 and it still failed, and had to remember 0 is even
<jam> axw: non-negative or nonnegative ?
<jam> our error message currently says >0
<jam> and "greater than or equal to 0" is long
<axw> jam: non-negative looks good to me
<jam> though non-math people won't get non-negative, I guess
<axw> really?
<jam> number of state servers must be odd >= 0
<jam> number of state servers must be odd and >= 0
<jam> ?
<axw> will non-math people understand >= ? ;)  sure, I guess so
<jam> axw: non-engineering/scientists sort of people don't distinguish "positive" from "nonnegative"
<jam> axw: I can't even say "must not be even"... -1 for clarity :)
<jam> only not
<axw> hehe
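Combining the two points above — 0 meaning "keep the current replica-set size" and the odd/non-negative check — a sketch might look like this (a hypothetical helper, not the juju-core implementation):

```go
package main

import "fmt"

// desiredStateServers resolves a requested state-server count:
// 0 means "keep the current number of voting machines", and the
// result must be odd so the mongo replica set can reach quorum.
func desiredStateServers(requested int, votingMachineIds []string) (int, error) {
	if requested == 0 {
		requested = len(votingMachineIds)
	}
	if requested < 0 || requested%2 == 0 {
		return 0, fmt.Errorf(
			"number of state servers must be odd and non-negative: %d", requested)
	}
	return requested, nil
}

func main() {
	n, err := desiredStateServers(0, []string{"0", "1", "2"})
	fmt.Println(n, err)
}
```

Using 0 as the sentinel works precisely because it fails the odd-number check on its own, so it can never be a legitimate explicit request.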
<axw> fwereade: updated https://codereview.appspot.com/85040046/patch/120001/130035
<jam> axw: updated "juju ensure-availability" defaults 3 https://codereview.appspot.com/90160044
<axw> jam: looking
<jam> axw: note that I merged my default-series branch in there
<jam> to get the test cases right
<jam> but that didn't end up landing in the mean time
<axw> ok
<jam> so there is a bit of diff that should be ignored, but you can't really add a prereq after the fact
<axw> jam: reviewed
<axw> jam, wallyworld: review for a goose fix please https://codereview.appspot.com/90540043
<jam> looking
<jam> axw: lgtm
<axw> ta
<axw> fwereade: am I okay to land that branch, or are you still looking?
 * axw takes silence as acquiescence
<fwereade> axw, sorry, yes, it looks fine :)
<axw> cool
<axw> jam: is the bot awake?
<jam> axw: checking
<jam> axw: it is currently running on addmachine-placement
<jam> perhaps there was a queue?
<jam> it's been going for 14 min
<axw> okey dokey, thanks
<axw> I thought my goose one would go through first
<jam> axw: I don't think there is relative ordering, and the bot only runs one at a time based on what it finds when it wakes up every minute
<jam> so if you approve both, but it hasn't seen it
<jam> then it will wake up, get the list, and start on one
<axw> ok
<axw> wheee, placement is in
 * axw does the maas bits
 * fwereade bbiab
<axw> jam: the bot does do goose MPs, right?
<mgz> axw: it does
<mgz> wallyworld: thanks for landing my branch
<wallyworld> mgz: np, pleased to help
<wallyworld> i also tested with local provider just in case
<voidspace> morning all
<jam1> morning voidspace
<jam1> axw: so the bot has "landed" your code, but the branch isn't a proper checkout, so it didn't get pushed back to LP
<jam1> I'll fix it
<axw> doh
<axw> jam1: thanks
<jam1> axw: should be merged now
<mgz> right, time to get a train to a plane, see you all next week!
<jam1> mgz: see you soon
<jam1> have a good trip
<jam1> you'll see some of us tomorrow at gophercon, right?
<mgz> jam1: thanks! and yeah, some this week
<jam1> axw: lgtm on your dependencies branch
<axw> jam1: ta
<jam1> we'll have to make the bot get the latest version, though
<jam1> fortunately, I know someone who is currently logged in
<axw> :)
<axw> I thought the bot updated now?
<jam1> axw: it runs godeps
<jam1> but that won't pull in new data
<jam1> it does do go get -u when you poke config
<jam1> axw: I can't *quite* get go get -u to not screw up the directory under test
<axw> jam1: it does godeps? "godeps -u" updates the code though...?
<vladk> jam1: please, take a look https://codereview.appspot.com/90580043
<vladk> I will be offline until meeting
<axw> woop, add-machine <hostname> works... now the fun of updating the test service
<jam1> axw: it sets the version of an existing tree to that revision. It does not *pull* data from remote sources.
<jam1> so if it isn't present locally, godeps -u doesn't work
<axw> jam1: ah right, I see
<jam1> axw: so I haven't gotten a chance to dig into it thoroughly, but are we writing "/var/lib/juju/system-identity" via cloud-init? Or are we only using the cloud-initty stuff to get it on there via SSH bootstrap?
<axw> jam1: yes, that is how it is done now. I'm not a fan
<axw> jam1: actually...
<axw> jam1: sorry, no, we SSH in and then put it in place
<axw> jam1: anything inside environs/cloudinit.ConfigureJuju happens after cloud-init, but only for the bootstrap node
<psivaa> hello, could someone help me build juju from source pls?
<psivaa> I'm getting http://paste.ubuntu.com/7313347/ when i run go install -v launchpad.net/juju-core/...
<voidspace> psivaa: I'm just doing a pull and trying now
<voidspace> psivaa: works for me
<voidspace> psivaa: so I suspect you're using a "too old" version of Go
<voidspace> psivaa: what does "go version" say?
<voidspace> psivaa: I'm on 1.2.1 (built from source)
<psivaa> voidspace: 'go version xgcc (Ubuntu 4.9-20140406-0ubuntu1) 4.9.0 20140405 (experimental) [trunk revision 209157] linux/amd64' is the output for go version
<axw> fwereade: maas-name support -> https://codereview.appspot.com/90470044/
<jam> psivaa: actually that looks like an incompatible version of go crypto
<axw> fwereade: still need to support it in bootstrap
<fwereade> axw, awesome :)
<axw> (and add-unit and deploy, but they're coming later)
<jam> psivaa: if you "go get launchpad.net/godeps" you can run "godeps -u dependencies.tsv" and it should grab the right versions of dependencies
<psivaa> jam: ack, i did 'hg clone https://code.google.com/p/go.crypto/' to get go crypto.
<psivaa> jam: voidspace: thanks. i'll try your suggestion
<jam> psivaa: gccgo 4.9 should be new enough
<jam> psivaa: My guess is that go crypto updated their apis, which broke our use of their code
<jam> and we haven't caught up yet
<jam> which is why we have dependencies.tsv to ensure we can get compat versions
<psivaa> jam: ahh ack, i'll use that. thanks
<jam> psivaa: if you don't want godeps, then you can hg update --revision 6478cc9340cbbe6c04511280c5007722269108e9
<jam> I think
<jam> psivaa: looks like just "hg update 6478cc9340cbbe6c04511280c5007722269108e9"
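As jam explains above, dependencies.tsv pins each dependency to a known-good revision so the tree builds even after upstream APIs change. A minimal Go sketch of parsing one such pin line follows; the exact column layout (import path, VCS, revision id, plus an optional revno) is an assumption about the file format, and `parseDep` is a hypothetical helper:

```go
package main

import (
	"fmt"
	"strings"
)

// dep holds one pinned dependency from a dependencies.tsv-style file.
type dep struct {
	path, vcs, rev string
}

// parseDep splits a tab-separated pin line into its first three fields.
// Lines with fewer than three fields are rejected.
func parseDep(line string) (dep, bool) {
	fields := strings.Split(strings.TrimSpace(line), "\t")
	if len(fields) < 3 {
		return dep{}, false
	}
	return dep{path: fields[0], vcs: fields[1], rev: fields[2]}, true
}

func main() {
	// The go.crypto revision mentioned in the conversation above.
	line := "code.google.com/p/go.crypto\thg\t6478cc9340cbbe6c04511280c5007722269108e9"
	d, ok := parseDep(line)
	fmt.Println(ok, d.path, d.vcs, d.rev)
}
```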
<fwereade> axw, LGTM, it's really nice to see it implemented with such a small amount of new code:)
<axw> fwereade: :) thanks
<axw> fwereade: sadly the bootstrap one will be a bit larger - I'll need to change Environ.Bootstrap
<fwereade> axw, sure, but it's absolutely a desirable change, and subsequent ones (like zone on ec2) will themselves then basically come for free :)
<axw> yup
<fwereade> vladk|offline, ping me when you're back please -- wondering whether we should really share an identity across state servers, or whether we should be creating one each
<fwereade> vladk|offline, ah, forget it, I made bad assumptions in the first reading
<voidspace> my parents have just turned up for coffee
<vladk> fwereade: ping
<voidspace> be afk for 15minutes :-)
<fwereade> vladk, pong
<fwereade> vladk, I see we have separate identities, sorry I misread; but I don't see when we'll rerun those upgrade steps. or perhaps we'll never need them again?
<perrito666> good soon to be morning everyone
<vladk> fwereade: I just used a formatter struct, my code does nothing with upgrade. I don't know whether the SSH key will be distributed on tools upgrade. It wasn't my task.
<vladk> But the SSH key will be installed on every new machine with a state agent.
<vladk> Should I investigate what occurs during upgrade?
<fwereade> vladk, ahh, I see
<fwereade> vladk, yes, please see if you can find a way to break it by upgrading at a bad time
<fwereade> vladk, if you can't, then LGTM, just note it in the CL and ping me to give it the official stamp ;)
<fwereade> perrito666, heyhey
<fwereade> perrito666, sorry I left you hanging last week, I think I managed to send you another review a day or two ago though -- was it useful?
<jam1> fwereade: AFAIK we don't have different identities, do we?
<jam1> fwereade: https://codereview.appspot.com/90580043/patch/1/10013 concerns me
<jam1> are we actually writing that to userdata ?
<jam1> (exposing the secret ssh id)
<jam1> I think axw-away claimed that we didn't actually do that during bootstrap
<perrito666> fwereade: It was, although right now I put that on hold since I am juggling with a brand new set of restore bugs :p
<fwereade> jam1, it does indeed look like we were, grrmbl grrmbl; but it looks to me like what we do now is generate a fresh id and add that to the system, as one of N keys for the state-server "user", per state-server-machine
<fwereade> jam1, so I think it's solid -- did I miss something?
<fwereade> perrito666, ok, great -- I'm here to talk further if you need me
<jam1> fwereade: I haven't yet found that bit that you're talking about (where we actually generate the new value)
<jam1> I see the code that if we have the value we write it onto disk
<jam1> fwereade: but while we remove this: https://codereview.appspot.com/90580043/patch/1/10012
<jam1> I don't see the SystemPrivateSSHKey being removed from MachineCfg
<jam1> nor have I yet found anything that creates or populates the contents of identity
<jam1> but I could easily just be missing it, though I've gone over the patch a few times now
<fwereade> jam1, hum, yes, I now think I was seeing that bit in the upgrade instructions alone
<fwereade> jam1, yeah, I think that's the only place -- vladk, thoughts? ^^
<fwereade> jam1, but fwiw, I suspect that the stuff in cloudinit is actually not in *cloudinit*, only in the bit that gets rendered as a script when we ssh in at bootstrap time
<jam1> fwereade: and we are calling AddKeys(config.JujuSystemKey, publicKey)  and setting it to exactly 1 key
<jam1> fwereade: right, so I'm not very sure about the cloudinit stuff because we did the bad thing and punned it
<fwereade> jam1, AddKeys is meant to *add*, not update -- did that change?
<jam1> so that sometimes cloud-init is rendered to actual cloud-init
<jam1> and sometimes it is rendered to a ssh script
<jam1> fwereade: ah, it might
<fwereade> jam1, believe me, I told the affected parties when they wrote the environs/cloudinit module *waaay* back in the day -- cloudinit is just one possible output format
<fwereade> jam1, sadly I was not in an official tantrum-throwing position at that time ;p
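fwereade's point above — environs/cloudinit describes what a machine needs, and cloud-init is only one possible rendering of it (the other being the shell script juju runs over SSH at bootstrap) — can be sketched as one config with two renderers. Everything here (`MachineSetup`, both render functions) is a hypothetical illustration, not juju-core's actual API:

```go
package main

import "fmt"

// MachineSetup is a tiny stand-in for the machine configuration that
// environs/cloudinit describes.
type MachineSetup struct {
	Packages []string
	Scripts  []string
}

// renderCloudInit emits the setup as a minimal cloud-config document,
// the format handed to the cloud provider as userdata.
func renderCloudInit(s MachineSetup) string {
	out := "#cloud-config\npackages:\n"
	for _, p := range s.Packages {
		out += " - " + p + "\n"
	}
	out += "runcmd:\n"
	for _, c := range s.Scripts {
		out += " - " + c + "\n"
	}
	return out
}

// renderScript emits the same setup as a shell script, the way it would
// be executed over SSH at bootstrap time.
func renderScript(s MachineSetup) string {
	out := "#!/bin/sh\nset -e\napt-get install -y"
	for _, p := range s.Packages {
		out += " " + p
	}
	out += "\n"
	for _, c := range s.Scripts {
		out += c + "\n"
	}
	return out
}

func main() {
	s := MachineSetup{Packages: []string{"mongodb-server"}, Scripts: []string{"echo hello"}}
	fmt.Print(renderCloudInit(s))
	fmt.Print(renderScript(s))
}
```

The design point is that neither renderer is privileged: the same description can go out as userdata or as an SSH-run script.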
<jam1> fwereade: also, I think we have a point that steps118.go is only run when upgrading from 1.16 to 1.18, so it *won't* be run when upgrading to 1.20 (from 1.18)
<jam1> but I don't think that actually matters here
<jam1> as we don't actually need to fix upgrade
<jam1> because HA is new in 1.19, so we don't have anything that we're upgrading
<psivaa> jam1: jfyi, godeps method made installing from source work for me. thanks
<fwereade> jam1, I think that, yeah, upgrade is irrelevant except in that it's the one place that actually sets up the keys
<jam1> fwereade: the issue is that if we are going to give each one a unique identity (which I think is better, fwiw, but I'm not sure if it breaks some assumptions)
<jam1> I would expect us to see a change in AddMachine()
<jam1> or EnsureAvailability
<jam1> fwereade: it sets up the first key
<jam1> fwereade: I really don't see how his patch would populate the new "identity" field in agent.conf
<jam1> fwereade: but the fact that we have 3 or 4 types with a StateServingInfo method, and each gets its data from somewhere else
<jam1> (might be API, might be agent.conf, might be ...)
<vladk> fwereade, jam1: about https://codereview.appspot.com/90580043/patch/1/10012
<vladk> This is a part of ssh-init script construction.
<vladk> Now the ssh key is passed inside the agent.conf file, so I removed its direct creation.
<jam1> vladk: right, I think that line is great
<jam1> vladk: but I haven't managed to find the part that actually sets the contents of the agent.conf file
<vladk> here https://codereview.appspot.com/90580043/patch/1/10005
<vladk> via yaml marshaling
<jam1> vladk: but what is setting it on the struct
<jam1> (I'm also not sure that we're allowed to change the content of an agent.conf without bumping the format number, but that is a later concern)
<jam1> vladk: I see a lot of stuff that "if we have the data set" gets it written to the right places, which all looks good
<jam1> I just haven't managed to find a line that is "SystemIdentity = XXXXX"
<jam1> vladk: going the route you did, I would expect to see a change in state/addmachine.go
<jam1> to something in either EnsureAvailability or elsewhere
<jam1> to create the system-identity data that the machine agent then reads from agent.conf later
<vladk> jam1: https://codereview.appspot.com/90580043/patch/1/10008 set to StateServingInfo
<vladk> https://codereview.appspot.com/90580043/patch/1/10005 set to formatter of agent.conf
<jam1> vladk: thanks, fwereade^^ your original assumption is wrong, they all get the same value, and it is being written via cloud-init (from what I can tell)
<jam1> which is sad news, I believe
<jam1> vladk: I expected that we would be actually calling an API to get that data during cmd/jujud/machine.go
<jam1> if we are only reading it from disk
<jam1> then we wrote it to disk via cloud-init
<jam1> which means we are passing our ssh secret key to EC2
<jam1> to hand back to us
<jam1> we got away with it (slightly) with "bootstrap" because bootstrap actually SSH's onto the machine to write those files
<fwereade> well fuck
<jam1> but all other provisioning is done via cloud-init and follow up calls to the API
<fwereade> honestly I'd expect us to just generate it at runtime
<fwereade> jam1, wait, we're writing state-server info to new state servers we provision?
<perrito666> wwitzel3: can you see me?
<jam1> fwereade: I had originally thought they should be shared, but honestly, I like your idea to have the agent come up
<jam1> check that it doesn't have one
<jam1> generate it
<fwereade> jam1, that's *all* meant to come over the API
<jam1> and add the public key only to the list of accepted keys
<fwereade> jam1, and indeed in this case there's no reason not to do it on the agent
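The idea jam1 and fwereade converge on — the agent comes up, checks whether it already has an identity, generates one locally if not, and only the public half ever leaves the machine — can be sketched with the standard library. This is an illustration of the proposal, not juju-core's code; `ensureIdentity` is a hypothetical helper:

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"encoding/pem"
	"fmt"
)

// ensureIdentity returns the existing private key material if the agent
// already has one, and otherwise generates a fresh RSA key locally.
// Because the key is generated on the machine itself, the private half
// never has to travel through provider userdata (e.g. EC2 cloud-init).
func ensureIdentity(existing []byte) ([]byte, error) {
	if len(existing) > 0 {
		return existing, nil // already have an identity; nothing to do
	}
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{
		Type:  "RSA PRIVATE KEY",
		Bytes: x509.MarshalPKCS1PrivateKey(key),
	}), nil
}

func main() {
	pemBytes, err := ensureIdentity(nil)
	fmt.Println(err == nil, len(pemBytes) > 0)
}
```

Only the derived public key would then be registered (the AddKeys-style call mentioned above) so each state server can hold its own identity.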
<jam1> fwereade: *I* don't understand the code very well
<jam1> we do some crazy shit
<jam1> about writing agent.conf
<jam1> and then reading it back in
<jam1> fwereade: all of the code in machine.go uses agentConfig.StateServingInfo()
<jam1> fwereade: except line 240
<jam1> where we call st.Agent().StateServingInfo()
<jam1> and then call: 			err = a.ChangeConfig(func(config agent.ConfigSetter) {
<jam1> 				config.SetStateServingInfo(info)
<jam1> 			})
<jam1> to get it written to disk
<jam1> for everything else to read
<jam1> fwereade: but I *think* there is a bug that you have to have it written to agent.conf first, so that you come up thinking you want to be an API server
<jam1> fwereade: also see machine.go line 458
<jam1> that says "this is not recoverable, so we kill it, in the future we might  get it from the API"
<jam1> there *is* an issue with bootstrap, the first API server obviously has to get it from agent.conf
<jam1> so there is some bit of we can't just always read from the api
<jam1> I guess
<jam1> but the swings and roundabouts make it hard for me to reason
<jam1> anyway, standup time, switching machines
<jam> fwereade: standup ?
<perrito666> Horacio Durán
<perrito666> jam:
<voidspace> jam: on the logging, the theory is that all the state servers should have *all* the logging - so when bringing up a new state server it really shouldn't need to connect to *all* state servers to get existing logging. Any one (that is fully active) should do.
<jam> voidspace: I understand that, but when you go from 1 to 3, you'll probably see the other api server that is coming up at the same time, and then it is just random-chance if you get the full log or not
<jam> (similarly going from 3-5)
<jam> though not going from degraded-2 to 3
<voidspace> jam: right, so being able to determine if it's fully active or not would help - but if we can't do that then maybe there's no other way
<jam> voidspace: I certainly understand why it might work, but my point would still be "we can iron out getting the backlog later, because it isn't the most important thing right now"
<voidspace> jam: ok, understood
<voidspace> connecting to all state servers and filtering out duplicate logging offends me though
<voidspace> (and it's O(n^2) if you bring up lots of state servers)
<jam> voidspace: its O(n) if the data was properly sorted :)
<natefinch> definitely just ignore the backlog for now. We'll get a real logging framework set up that will do more than rsyslog.  There's a topic for it in Vegas.
<jam> though you only ever have 7 state servers (because we use mongo, and mongo has that limit)
<voidspace> ah
<voidspace> still, I'm sure we can do better
<natefinch> jam: in theory you can have up to 12 as long as only 7 are voting.
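jam's aside above — the dedup is O(n) if the data is properly sorted — comes down to a linear merge: when each state server's log stream is already ordered (say by timestamp), duplicates across streams can be dropped in one pass instead of an O(n^2) cross-filter. A minimal sketch over int keys (standing in for log-record timestamps; the real records and their ordering key are assumptions here):

```go
package main

import "fmt"

// mergeDedup merges two sorted slices into one sorted slice with
// duplicates removed, in a single O(len(a)+len(b)) pass.
func mergeDedup(a, b []int) []int {
	out := make([]int, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		var v int
		if j >= len(b) || (i < len(a) && a[i] <= b[j]) {
			v, i = a[i], i+1
		} else {
			v, j = b[j], j+1
		}
		// Since both inputs are sorted, a duplicate can only ever be
		// the value we just emitted, so one look-back suffices.
		if n := len(out); n == 0 || out[n-1] != v {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	fmt.Println(mergeDedup([]int{1, 2, 4}, []int{2, 3, 4})) // [1 2 3 4]
}
```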
<vladk> jam: 1) do we need different identites on different machines?
<vladk> 2) should I find places where agent.conf is written and where SystemIdentity is assigned?
<ghartmann> do we already have any clue why add-machine doesn't work for local providers anymore ?
<jam> ghartmann: I hadn't heard that that was the case
<jam> is there a bug/context/paste ?
<ghartmann> I don't get any logs at all
<ghartmann> the machines just stick on pending
<ghartmann> I tried installing on the VM and seen the same issue
<ghartmann> I decided to roll back to 1.18
<ghartmann> and it's kinda working
<ghartmann> I can't boot precise but trusty works
<ghartmann> by the way
<ghartmann> I am willing to help but I am struggling a bit on how to debug the code
<fwereade> ghartmann, sorry, my internet is up and down, I am missing context
<fwereade> ghartmann, but I would like to help you if I can
<ghartmann> I am currently using juju for local provider only
<ghartmann> best way to prototype and fix charms
<ghartmann> but since I updated juju I am unable to start any machines
<ghartmann> or they start but that way too long
<ghartmann> 30 minutes if they do start
<fwereade> ghartmann, hmm, that "way too long" is really interesting, to begin with it sounded like it might be https://bugs.launchpad.net/juju-core/+bug/1306537
<_mup_> Bug #1306537: LXC provider fails to provision precise instances from a trusty host <deploy> <local-provider> <lxc> <juju-core:Triaged> <juju-quickstart:Triaged> <https://launchpad.net/bugs/1306537>
<ghartmann> I would imagine that someone would have reported it, because being unable to start machines is a breaking issue
<ghartmann> I am trying to understand why this happens and how can I help
<fwereade> ghartmann, ok, the best way to collect information is to `juju set-env "logging-config=<root>=DEBUG"`; and then to look in /var/log/juju-<envname>
<fwereade> ghartmann, in fact looking at the lxc code you might want to set juju.container.lxc=TRACE
<jam1> fwereade: I think if you "juju bootstrap --debug" it does that level of logging, doesn't it ?
<jam1> DEBUG (not TRACE)
<fwereade> jam1, yeah, I was assuming an existing environment
<fwereade> jam1, but if it's not working I guess there's not much reason to keep the old one around
<fwereade> jam1, and in particular a lot of the lxc stuff is only logged at trace level, I now observe
<jam1> vladk: so having unique identities is more of a "it would be nice if they did" rather than "they must"
<fwereade> ghartmann, if you're struggling to find *where* in the code I would start poking around in the container/lxc package -- specifically CreateContainer in lxc.go -- but I'm not sure if that's what you're asking
<ghartmann> the debug helps a little bit but it seems it believes that it worked ... "2014-04-23 12:16:50 INFO juju.cmd.juju addmachine.go:152 created machine 4"
<jam1> ghartmann: created machine is creating a record in the DB for a new machine
<fwereade> ghartmann, that just indicates that it recorded we'd like to start the container
<jam1> != actually started a machine
<ghartmann> ah ok
<fwereade> ghartmann, it's possible that the provisioner is implicated, but in particular the slowness STM to point to the actual nuts and bolts of the container work
<jam1> fwereade: so I think his statement was "it isn't working after 30 minutes" which means it hasn't actually worked yet
<fwereade> jam1, ok, I see :)
<jam1> fwereade: ghartmann: if it *was* working, it would still need to download the precise/trusty cloud image, but that download should only need to happen once
<ghartmann> I will try looking on lxc
<fwereade> ghartmann, do you see any lines mentioning the provisioner in the logs?
<fwereade> ghartmann, in particular "started machine <id> as instance ..."
<ghartmann> opening environment local
<ghartmann> no started machine
<ghartmann> you mean on .juju/local/log right ?
<ghartmann> I am now trying to start the machine manually
<ghartmann> it seems that the machine can't start a network device
<fwereade> ghartmann, ah! you get a container created but it won't do anything?
<ghartmann> it seems that the lxc-start doesn't start the machine
<ghartmann> I will try to get it working first
<ghartmann> it is something related with the network
<ghartmann> it seems that the network of the machine doesn't start
<ghartmann> I will try making it as a bridge
<ghartmann> will let you know once I finish it
<ghartmann> thanks for the ideas
<fwereade> ghartmann, there's a "network-bridge" setting for the local provider which defaults to lxcbr0 -- that works for most people, but possibly you have a different setup there?
<ghartmann> I am using the standard
<ghartmann> but I will change a few things on my network
<ghartmann> will take a while
<jam> fwereade: so there is a bug that deploying precise on trusty will fail because of "no matching tools found"
<jam> fwereade: 2014-04-23 12:36:43 ERROR juju runner.go:220 worker: exited "environ-provisioner": failed to process updated machines: cannot start machine 1: no matching tools available
<fwereade> jam, is that different from the one I linked?
<jam> fwereade: it might be the root cause of the one linked, I'm not sure
<jam> fwereade: ghartmann: so one option is to try running "juju bootstrap --series precise,trusty" or possibly "juju upgrade-juju --series=precise,trusty --upload-tools" to see if that gets things unstuck. But for *me* the provisioner is spinning on not creating an LXC instance because it cannot find the right tools
<jam> if you got past that part
<jam> fwereade: so it would seem that if the provisioner cannot provision machine 1 because of no tools, it won't try to provision machine 2
<jam> (in this case, the former is precise, the latter is trusty)
<fwereade> jam, I think the core of it all is tools.HasTools
<fwereade> jam, oh, wait, it actually can't be here, can it
<fwereade> jam, but the provisioner task's possibleTools method is all messed up anyway :/
<jam> fwereade: the check we have that all machines are running the same agent version also fails when you have dead machines (since nil != "1.18.1.1")
<jam> so you can't use "juju upgrade-juju --upload-tools --series precise,trusty" to trick it
<fwereade> jam, not without force-destroying the machines, yeah
<jam> fwereade: but for *me* if I "juju bootstrap -e local --upload-tools --series precise,trusty" it works
<jam> without the --series trick, it gets stuck never finding tools for the precise charm
<jam> and then never getting to try for the trusty charm
<jam> seemingly
<fwereade> jam, it seems reasonably likely that the provisioner is just failing out on the first one, and then trying again in the same order when it comes back up
<jam> fwereade: right
<jam> fwereade: I would have thought the provisioner would fail and keep trying the next one
<jam> though perhaps the idea is that if tools aren't available yet, it isn't worth trying until later?
<fwereade> jam, yeah, unless explicitly handled otherwise we assume that errors might fix themselves if we try again later
<fwereade> jam, frankly it's insane that the provisioner even knows about tools in the first place
<jam> fwereade: well, it needs to pass them to cloud init
<jam> so that the machine that is starting up can get them
<jam> fwereade: why is that insane ?
<fwereade> jam, the environ *already knows about the tools*. we *ask it where to find the tools*.
<voidspace> lunch
<fwereade> jam, a bit more than a year ago, we managed to refactor some of the way, but not all
<jam> fwereade: is it intended to stay that way? Given we've talked about object storage in mongo
<fwereade> jam, tools-in-state would indeed change the picture significantly, it's true
<fwereade> jam, but even then the provisioner would just be a dumb pipe wrt tools, I think
<jam> fwereade: I thought "juju destroy-machine --force" was intended to prevent this status:
<jam>   "2":
<jam>     instance-id: pending
<jam>     life: dead
<jam>     series: trusty
<fwereade> jam, hmm, yeah, the provisioner ought to be able to kill all the dead machines before it starts worrying about the live ones
<jam> fwereade: well it is possible that it will get to it soon, but it is stuck downloading the cloud-image template
<jam> which is a few MB
<jam> like 100 or so
<fwereade> jam, btw, I don't suppose you know where that "instance-id: pending" business comes from?
<fwereade> jam, either we have an instance-id or we don't
<jam> fwereade: in that particular case, the "trusty-template" fslock was left stale
<jam> when I called "destroy-environment" while not waiting for trusty to come up.
<axw-away> jam: just saw your message about system-identity in cloud-init. that test you linked to is a bit misleading; it's running Configure, when it should be running ConfigureBasic
<axw-away> jam: IOW, the test does not reflect what we really do on bootstrap
<fwereade> oh WTF
<jam> fwereade: I'm also seeing: 2014-04-23 13:41:08 WARNING juju.worker.instanceupdater updater.go:231 cannot get instance info for instance "": no instances found
 * axw-away goes back away
<fwereade> jam, looks like m.InstanceId is not erroring when it should?
<jam> fwereade: perhaps
<jam> fwereade: so from what I can sort out, vladk's patch is worth landing. I'm still confused by bits of it (why is it working), but I can accept that it might just be because I don't understand the swings and roundabouts
<jam> certainly he said he confirmed that secrets aren't going to EC2
<jam> fwereade: a potential fix for bug #1306537: https://codereview.appspot.com/90640043
<_mup_> Bug #1306537: LXC local provider fails to provision precise instances from a trusty host <deploy> <local-provider> <lxc> <juju-core:In Progress by jameinel> <juju-core 1.18:In Progress by jameinel> <juju-quickstart:Triaged> <https://launchpad.net/bugs/1306537>
<hazmat> question via email this morning.. local provider (using lxc).. doing deploy --to kvm:0 is supported?
<jam> hazmat: my understanding is that it has worked, perhaps accidentally but it was working
<wwitzel3> voidspace: I'm going to grab an early lunch and do an errand and we can sync up with where we are at when I get back.
<fwereade> jam, I'm worried about that because tim added a hack somewhere else in an attempt to resolve essentially the same problem
<fwereade> jam, except it's not quite the-same *enough* I guess
<jam> fwereade: so there is certainly a bit of "this worked for me" vs feeling good about the change. but I have the strong feeling that feeling good about the change means a much bigger overhaul of our internals
<jam> fwereade: so I filed bug #1311677
<_mup_> Bug #1311677: if the provisioner fails to find tools for one machine it fails to provision the others <provisioning> <status> <ui> <juju-core:Triaged> <https://launchpad.net/bugs/1311677>
<jam> and looking at it
<jam> (the startMachines code)
<jam> it does exit on the first failure
<jam> and we have the fact that on "normal" provisioning failures
<jam> we call "task.setErrorStatus"
<jam> so if one fails
<jam> we mark it failing
<jam> and then just go back to doing the next thing when we wake up again
<jam> however, if possibleTools fails
<jam> we *don't* call setErrorStatus
<jam> so that machine stays around blocking up all other work
<jam> fwereade: my concerns: 1) We could try to keep provisioning even on errors, but if we are getting RateLimitExceeded, we really should just shut up and go sleep for a while
<jam> 2) Do we expect that possibleTools is actually going to resolve itself RealSoonNow?
<jam> now that we have the idea of Transient failures, could we treat no-tools as one?
<fwereade> jam, still thinking
<fwereade> jam, re (1), I really think we have to do the rate-limiting inside the Environ, and use a common Environ for the various workers that need one
<jam> fwereade: so even with that we are likely to eventually exceed our retries
<jam> (say we retry up to 3 times, do we want to come back tomorrow?)
<jam> I don't think we want to block a worker thread completely in Environ for more than ... minutes?
 * jam gets called away to actually be part of a family
<fwereade> jam, if you come back sometime soon: I don't think that tools failure is transient, so I don't think treating it as such will really help -- setErrorStatus is probably the right answer to the problem (apart from anything else, precise/trusty are not the only series people will use even if they are *today*)
<fwereade> to *that* problem
<natefinch> fwereade: definitely, no tools is likely to be a semi-permanent problem for all intents and purposes, certainly not something likely to get fixed within a small number of minutes, which is the most amount of time I can conceive of actually waiting for something to succeed.
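The fix the thread converges on — startMachines should mark a machine that fails (e.g. "no matching tools available") with an error status and keep provisioning the rest, instead of bailing on the first failure — can be sketched as below. All names here (`machine`, `start`, `startMachines`, the canned error) are hypothetical stand-ins for the provisioner task code being discussed, not juju-core's actual API:

```go
package main

import "fmt"

// machine is a toy model of a machine record the provisioner works on.
type machine struct {
	id     string
	series string
	status string
}

// start pretends to provision a machine; to mirror bug #1306537 it fails
// for precise (no matching tools) and succeeds for anything else.
func start(m *machine) error {
	if m.series == "precise" {
		return fmt.Errorf("no matching tools available")
	}
	return nil
}

// startMachines records a per-machine error status and continues, rather
// than returning on the first failure and blocking all later machines.
func startMachines(ms []*machine) {
	for _, m := range ms {
		if err := start(m); err != nil {
			m.status = "error: " + err.Error() // the setErrorStatus idea
			continue
		}
		m.status = "started"
	}
}

func main() {
	ms := []*machine{
		{id: "1", series: "precise"},
		{id: "2", series: "trusty"},
	}
	startMachines(ms)
	for _, m := range ms {
		fmt.Println(m.id, m.status)
	}
}
```

With this shape, machine 1's missing tools no longer stop machine 2 from being provisioned, which is exactly the symptom reported in bug #1311677.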
<hazmat> jam, it works, the question is it supported, i thought thumper had said that it was, but various folks are getting mixed signals on it
<hazmat> so there's some confusion in that regard
<sinzui> jam, fwereade, I think we are 2+ week away from a stable 1.20. I want to try for a 1.18.2 release this week.
<natefinch> hazmat: it works by accident.  I wouldn't say it is "supported"
<jam1> sinzui: so my understanding is that there is very strong political pressure to get something out that has HA in a 'stable' release by the end of the week. We don't have to close all the High bugs to get there.
<natefinch> hazmat: which is to say, I wouldn't rely on it working in the future.
<jam1> I think we might be able to do a 1.19.1 today
<jam1> which will be missing debug-log in HA, and backup/restore, I think
<jam1> but I think we can land Vladk's patch to get "juju run" to work in 1.19.1 and HA
<sinzui> jam1, You cannot have stable release until after users have given feedback. If I release today, you still don't get feedback until next week
<hazmat> natefinch, so if we have folks that need a working, supported solution for lxc and kvm today, the answer is you're out of luck? and we don't support lxc and kvm in the same local provider.
<jam1> fwereade: sinzui: alexisb (if around) I'm not the one who has the specifics for why we need HA available for April 25th, can you give more context ?
<sinzui> jam1, also CI still doesn't pass HA. Someone might need to work with abentley to make the test pass of find the bug that might be in the code
<fwereade> hazmat, I don't *like* it, but ISTM that it's (1) useful and (2) used, so we don't have any reasonable option for breaking it without providing an alternative
<hazmat> fwereade, there's an extant bug on the latter to support kvm and lxc containers in the same provider, which would also work, but it's a bit more work.
<jam1> fwereade: hazmat: I would agree with the "we shouldn't break it without providing another way"
<jam1> hazmat: you still have the problem with spelling "I want to deploy the next one into KVM", unless we go all the way and make all the things you deploy prefixed
<hazmat> ok.. so supported for now .. till we have something better :-)
<hazmat> jam, any placement effectively bypasses constraints
<hazmat> fwereade, jam1, thanks
<sinzui> jam1, alexisb, fwereade: I am not here to be the voice of idealism. I am the voice of pragmatism. We know developers, users, and CI find bugs, and all three need to affirm the feature works. There is not enough information to call HA stable for release
<fwereade> jam1, hazmat: or we bite the bullet and get multi-provider environments going; at which point it's just another pseudo-provider and should Just Work
<fwereade> jam1, hazmat: but I'm not confident that'll happen any time soon
<jam1> fwereade: then there is the argument that cross-env relations is better than multi-provider ones
<jam1> fwereade: if only because for most of them, you actually still want to run an agent local to that provider
<alexisb> jam1, the 4/25 date for the 1.20 release was set because the target for a release with HA is ODS and jamespage needs some time to integrate
<hazmat> long term that sounds great, manual provider with cross region worked well enough for most of those cases for me till 1.19 (the address stuff breaks it)
<alexisb> but as sinzui points out it has to be ready, which it is not
<jam1> alexisb: fwiw, it is probably ready enough for jamespage to look into integrating it
<alexisb> jam1, ok, we should connect with jamespage then
<sinzui> alexisb, jamespage If you get juju 1.19.1 with HA this week, is that good enough to test?
<natefinch> jam1, alexisb: that was going to be my thought as well.  There's some edge case stuff that should be fixed, but the main workings are all there
<jam1> sinzui: though probably we'll want to get 1.19.1 rather than have him running trunk
<jam1> sinzui: I was trying to assign someone to work on the HA bug today, I think natefinch is the one that volunteered to get the test running
<alexisb> sinzui, jam1 how close are we to a 19.1 release?
<alexisb> I see 2 critical bugs still being worked
<sinzui> alexisb, jam1, you are actually on schedule for a Friday release
<jam1> alexisb: one of those should have a patch that should be landing, I don't know for sure why it hasn't
<sinzui> I just don't see that release being called 1.20
<jam1> the other is "juju backup" which is also supposed to have something from perrito666, but may not have to block 1.19.1
<alexisb> sinzui, agreed
<jam1> sinzui: I agree, I don't think 1.19.1 is 1.20
<jam1> but it is HA out for testing
 * perrito666 feels conjured
<jam1> to get feedback to drive a proper 1.20
<jam1> perrito666: so you were working to get "juju backup" to find /usr/lib/juju/bin/mongod when available, did that get done?
<alexisb> jamespage, would a 1.19.1 development release be enough for you to begin testing and integration?
<sinzui> jam1 yep
<jam1> alexisb: I know of 2 things that are just-broken when you run HA (juju debug-log and juju run), but we have a patch for the latter, and wwitzel3 and voidspace on the former.
<fwereade> jam1, I'm not sure how important it is to have a local state-server in the *long* term, but in the short term it is true that we benefit a lot from it
<jam1> natefinch: did you get to look into the HA CI test suite? Can you give me an update on it by your EOD, as I can look at it tomorrow.
<perrito666> jam1: I am actually trying to fix the whole thing together (backup/restore) since the test takes time I try to make the best of it, but I can propose the backup fix alone if you want
<sinzui> jam1, returning to 1.18.2. You have diligently landed some fixes to it. I think there were a few more bugs that would be lovely to include. May I propose some merges to 1.18 to prepare a 1.18.2 that Ubuntu will love?
<natefinch> jam1: looking at it now, late start to my day today, but i still have a lot of time to put into it.
<jam1> perrito666: please never block getting incremental improvements on getting the whole thing. In general everyone benefits as long as it doesn't regress things in the mean time.
<fwereade> perrito666, I like small branches -- I know that a backup that can't be restored is no backup at all, but I'd still rather see a few branches that we merge all at once if we have to
<jam1> sinzui: I have the strong feeling that 1.18 is going to stick in Trusty and we're going to be supporting it for a while.
<perrito666> ack
<jam1> sinzui: so while I'm not currently focused on it, because of 1.19 and HA stuff filling my queue
<perrito666> :)
<jam1> sinzui: patches seem most welcome to 1.18
<fwereade> perrito666, jam1: indeed, the only reason to hold off on landing one of those branches is if it does, in isolation, regress something
<alexisb> jam1, are you thinking that 1.18 will be the long term solution for Trusty?
<sinzui> jam1. okay. I will make plans for 1.18.2
<natefinch> sinzui: how do I investigate a CI failure?  I believe functional-ha-recovery-devel is the one I'm supposed to be fixing
<jam1> alexisb: 1.18 doesn't have HA support, and will likely be missing lots of stuff. I just think that given our track record with actually getting stuff into the main archive, we really can't trust it
<sinzui> natefinch, abentley in canonical's #juju is seeing errors like this: http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/functional-ha-recovery-devel/64/console
<jam1> alexisb: so likely we'll want something like cloud-archive for Trusty that provides the latest set of tools that we like
<sinzui> natefinch, abentley believes the problem is the test. it is not waiting for the confirmation that juju is in HA.
<jam1> but I don't think we can actually expect to get things into the Ubuntu archive.
<sinzui> natefinch, abentley will ask for assistance if the test continues to fail after assuring itself that HA is up
<natefinch> sinzui: cool.  I'm more than willing to help.  I know that working with mongo can be hairy
<alexisb> jam1, yes we are working with the foundations team/TB to define the process for updating the juju-core package in Trusty
<alexisb> I don't know yet what the process will be
<jam1> alexisb: i might be being jaded, but cloud-tools:archive still has 1.16.3 because it never got 1.16.5 landed in Saucy
<jam1> and that is... 6 months old?
<alexisb> and it could very well become via cloud-tools
<jam1> alexisb: though again, we've struggled to get stuff in there, too
<hazmat> are there any tricks to compiling juju with gccgo?
<sinzui> jam1, alexisb : I thought jamespage had made progress getting juju 1.16.4..1.16.6 in old ubuntu. The issue was the backup and restore plugins...since the backup plugin wasn't in the code, we elected to not package it.
<fwereade> jam1, re https://codereview.appspot.com/90640043 -- how about fixing environs/bootstrap.SeriesToUpload instead?
<jam1> sinzui: so cloud-archive:tools still has 1.16.3 as the best you can get: http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/precise-updates/cloud-tools/main/binary-amd64/Packages
<alexisb> well HA is really important so we will need to fight the battles to get it into Trusty
<jam1> fwereade: so instead of LatestLTSSeries it would do AllLTSSeries ?
<fwereade> jam1, essentially, yeah
<fwereade> jam1, if we were smart we'd only upload a single binary anyway but I'm not sure we got that far yet
<jam1> fwereade: so at this point, I think using LatestLTSSeries is still a bit wonky since we really can't expect anything about T+4
<jam1> fwereade: we're not
<sinzui> alexisb, jam1, we have never tested upgrade from 1.16.3 to 1.18.x. We need to test that if jamespage fails to get 1.16.6 into the cloud-archive...and hope it works
<jam1> if you bootstrap --debug you can see the double upload
<fwereade> jam1, yeah, thought so
<jam1> sinzui: AIUI, the issue was that once Trusty releases, then the version in Trusty becomes the version in cloud-tools, so it will jump from 1.16.3 to 1.18.1 (?)
<sinzui> jam1, right, that was jamespage's fear.
<jam1> fwereade: I would be fine moving it to SeriesToUpload, and *I* would be fine just making that function put Add("precise"), Add("trusty")
<jam1> fwereade: but *I'm* way past EOD here
<fwereade> jam1, but regardless, I think we're better off fixing SeriesToUpload (and maybe improving the double-upload, now that it's potentially a triple-upload) than adding another tweak to a code path that is in itself pretty-much straight-up evil in the first place
<jam1> fwereade: so happy to  LGTM a patch that does that :)
<jam1> even better that it could *actually* be tested
<fwereade> jam1, quite so, that was my other quibble there ;)
<fwereade> jam1, ok, I have a meeting in a few minutes and am not sure I will get to it today myself, but I'll make sure you know if I do
<bac> sinzui: so the swift fix was a mirage?
<sinzui> bac: yes
<bac> drats
<sinzui> bac: and the corrupt admin-secret theory is crushed
<sinzui> bac, also, staging machine-0 has been stuck in hard reboot for a week. I think we can say it is dead.
<jam1> fwereade: I gave a summary of why vladk's patch works, mostly boiling down to the fact that what we write to the DB is the params.StateServingInfo struct, unlike most of our code, which uses separate types for the API and the DB
<jam1> https://codereview.appspot.com/90580043/
<jam1> vladk: are you able to land that patch today before sinzui can put together a release ?
<jam1> (and get CI to pass on it, I guess)
<vladk> jam1: yes
<jam1> vladk: great
<jam1> LGTM
<jam1> vladk: can I ask that you file a "tech-debt" bug to track that we may want to have each API server have their own system identity?
<vladk> jam1: ok
<jam1> I think as long as we have the api StateServingInfo we can actually notice who's calling and give them a different value if we want
<hazmat> it looks like 1.18 branch has deps on both github.com/loggo/loggo and github.com/juju/loggo are those the same ?
<jam1> hazmat: they need to be only one, otherwise the objects internally are not compatible
<jam1> it should all be "github.com/juju/loggo"
<hazmat> jam1, 1.18 stable branch -> state/apiserver/usermanager/usermanager.go:     "github.com/loggo/loggo"
<hazmat> jam1, thanks.. i'll mod locally
<jam1> hazmat: please propose a fix if you could
<hazmat> jam1, sure.. just need to get through the morning
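[The stale import path hazmat spotted can be rewritten tree-wide with one command. A minimal sketch, assuming GNU sed, demonstrated on a scratch file standing in for state/apiserver/usermanager/usermanager.go:]

```shell
# Rewrite the old loggo import path to the canonical one across a tree.
# The scratch file below is a stand-in so the sketch is self-contained.
workdir=$(mktemp -d)
cat > "$workdir/usermanager.go" <<'EOF'
package usermanager

import (
	"github.com/loggo/loggo"
)
EOF

# Replace every occurrence of the old path (GNU sed -i assumed).
find "$workdir" -name '*.go' -exec \
	sed -i 's|github.com/loggo/loggo|github.com/juju/loggo|g' {} +

grep 'github.com/juju/loggo' "$workdir/usermanager.go"
```

[In the real branch the same find/sed pair would run from the repository root; as jam1 notes, the tree must end up with only "github.com/juju/loggo" or the logger objects are incompatible.]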
<voidspace> jam1: ping, if you have 5 minutes
<voidspace> jam1: it can wait until tomorrow if not
<voidspace> ooh, precise only has version 5 of rsyslog so we can only use the "legacy" configuration format
<voidspace> lovely
<voidspace> jam1: cancel my ping :-)
<voidspace> natefinch: ping
<natefinch> voidspace: howdy
<natefinch> fwereade: where do I go to approve time off?
<perrito666> jam1: fwereade sinzui https://codereview.appspot.com/90660043 this fixes the backup part of the issue
<perrito666> so ptal?
<perrito666> anyone is encouraged to, although be warned, it's bash
<fwereade> natefinch, canonicaladmin.com is all I know
<perrito666> does anyone know why we are dragging the logs into the backup? (and more precisely, why are we restoring them?) I mean I know we might want to back them up for analysis purposes, but restoring the old logs pollutes information a bit
<jam1> natefinch: you should be able to log into Canonical Admin and have "Team Requests" under the Administration section
<jam1> perrito666: if you want to investigate why something failed in the past, you need the log
<perrito666> jam1: exactly, but if you restore the log from the previous machine you are lying about the current one
<jam1> perrito666: but it also contains the whole history of your actual environment
<jam1> vs just this new thing that I just brought up
<jam1> I would be fine moving the existing file to the side
<jam1> but all the juicy history is what you are restoring
<jam1> perrito666: did you test the backup stuff live against a Trusty bootstrap?
<jam1> perrito666: nate's patch landed at r2662
<perrito666> jam1: sorry I was at the door
<perrito666> I did, let me re-check that the env that is being backed up actually has the proper mongodb
<perrito666> jam1: re your comment, I could try to assert MONGO* is executable or fail instead
<voidspace> going jogging, back shortly
<jam1> perrito666: I don't really think we need to spend many cycles worrying about it.
<jam1> It may be that just using '-f' will give better failure modes (more obvious if we try to execute something that isn't executable than trying to run a command that isn't in $PATH)
<jam1> perrito666: anyway, not a big deal, don't spend too much time on it, focus on getting it landed and on to restore
<perrito666> yea, most likely if you have those and they are not executable you most likely noticed other problems
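[jam1's "-f" suggestion can be sketched as a small pre-flight check; the paths and the MONGOD variable below are scratch stand-ins for however the backup script resolves the binary, not juju's real layout:]

```shell
# Hedged sketch: test with -f/-x that the preferred mongod path exists and
# is executable, falling back to $PATH, so a bad path fails loudly up
# front instead of mid-backup. A fake binary in a temp dir stands in for
# /usr/lib/juju/bin/mongod.
bindir=$(mktemp -d)
printf '#!/bin/sh\necho fake-mongod\n' > "$bindir/mongod"
chmod +x "$bindir/mongod"

MONGOD="$bindir/mongod"
if [ ! -f "$MONGOD" ] || [ ! -x "$MONGOD" ]; then
	# Fall back to whatever mongod is on $PATH, failing if there is none.
	MONGOD=$(command -v mongod) || { echo "no usable mongod" >&2; exit 1; }
fi
"$MONGOD"
```

[As jam1 says, the explicit -f/-x failure mode is more obvious than letting the shell report a command not found in $PATH.]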
 * perrito666 repeats himself when he stops writing a sentence in the middle and then restarts
<jam1> that is certainly a common thing
<perrito666> well, I did a version of restore that backs up the old config just so I get to discover what part of our backup restoration breaks the state server
 * perrito666 's kingdom for an aws location in south america
<voidspace> EOD folks
<voidspace> g'night
<perrito666> bye
<wwitzel3> voidspace: see ya
<stokachu> is juju add-relation smart enough to handle add-relations to non-existent services that may be coming available in the future
<stokachu> for example if I deploy 3 charms and charm 1 relies on charm 3 so i add the relation during charm 1 deployment
<stokachu> is it smart enough to retry to add-relations once it sees charm 3 come online?
<stokachu> marcoceppi: ^ curious if you know this?
<marcoceppi> stokachu: no
<stokachu> marcoceppi: no to not smart enough or no to you aren't sure?
<marcoceppi> not smart enough, if you run add-relation then it won't actually work if one of the two services isn't there
<stokachu> so that makes it difficult for me to put juju deploy <charm>; juju add-relation <charm> <new_charm_not_deployed>; juju deploy <new_charm>
<marcoceppi> stokachu: not difficult, impossible.
<marcoceppi> stokachu: you should run add-relation once you have all your services deployed
<stokachu> so if i deploy an openstack cloud i'd have to deploy all charms, then re-loop through those charms and add-relations
<marcoceppi> stokachu: or, use juju deployer
<bloodearnest> stokachu: or better yet, deploy charms, mount volumes, then add relations, as many charms expect the volumes to be already configured on the joined hook
<stokachu> bloodearnest: interesting ill look into that
<bloodearnest> stokachu: on account of juju having no way yet to detect/react to volumes changing, AIUI
<stokachu> i wonder if it'd be worth it to have add-relations kept in a queue and when a service comes online it just checks for pending
<natefinch> stokachu: note that you don't need to wait for the charms to be deployed to add relations. You can fire off deploy deploy deploy add-relation add-relation add-relation, and juju will eventually catch up.   It's just that you have to run the deploy command before the add-relation command
<stokachu> natefinch: yea thats what im doing now
<stokachu> just iterating through the charms twice is all
<natefinch> stokachu: iterate through charms once and then through relations once ;)
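[natefinch's ordering can be sketched as a dry run; the charm names are hypothetical and the juju function below is an echo stub standing in for the real client:]

```shell
# Dry-run sketch of "one pass over charms, one pass over relations": juju
# queues a relation between services that aren't up yet, provided each
# deploy command was issued before the add-relation that references it.
juju() { echo "juju $*"; }   # stub so the sketch runs without a real juju

charms="mysql wordpress"

plan() {
	for charm in $charms; do
		juju deploy "$charm"          # first pass: all the deploys...
	done
	juju add-relation wordpress mysql     # ...then the relations
}
plan
```

[Nothing here waits for the units to come up; only the command ordering matters, which is the point natefinch makes above.]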
<natefinch> gotta run, car needs to be inspected, back in 45 mins
<sinzui> wwitzel3, natefinch CI cursed the most recent juju because of a unit-test failure on precise. Do either of you think the test can be tuned to be reliable on precise? https://bugs.launchpad.net/juju-core/+bug/1311825
<_mup_> Bug #1311825: test failure UniterSuite.TestUniterUpgradeConflicts <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1311825>
<natefinch> sinzui: looking
<wwitzel3> sinzui: also taking a look
<natefinch> man I hate overly refactored tests
<natefinch> wwitzel3: can you even tell what sub-test is failing?
<natefinch> all I see is "step 8" which doesn't tell me diddly
<wwitzel3> natefinch: not really, I've got as far as fixUpgradeError step
<wwitzel3> natefinch: but it is all nested so I can't tell in which that is happening
#juju-dev 2014-04-24
<axw> wallyworld: https://codereview.appspot.com/91740043/
<axw> tested live with maas
<wallyworld> axw: already looking :-)
<axw> cheers
<axw> going to go do some packing, bbiab
<axw> wallyworld: hrm, it is selecting the instance but it doesn't seem to be bootstrapping. I will continue investigating... there may be another change required to bootstrap though
<wallyworld> ok
<wallyworld> branch looks ok - i did have a suggestion about the error checking in bootstrap
<wallyworld> i've +1ed it so you can land once it works :-)
<axw> wallyworld: testing again with a fresh maas node, could've been because it was dirty
<axw> wallyworld: I didn't want to return other errors because we don't care about whether they're well formed
<axw> wallyworld: just that they're unscoped or not
<axw> bootstrap doesn't care if you specify an lxc placement with an invalid machine id
<axw> because it just doesn't support lxc placement
<wallyworld> axw: ok, it makes sense now that i think a bit harder
<axw> alrighty, worked fine with a new node
<jam> wallyworld: https://code.launchpad.net/~wallyworld/juju-core/bootstrap-supported-series/+merge/216974 LGTM
<jam> morning all
<wallyworld> jam: great thanks
<jam> wallyworld: fwiw I think we actually want to target 1.18 with that patch
<wallyworld> jam: we want to get a lot of stuff into 1.18
<jam> wallyworld: since 1.18 local is broken in the same way
<wallyworld> i'll land in 1.19 first cause that ships tomorrow and then will back port
<jam> wallyworld: *I* find it much easier to land to 1.18 and then merge up, but as you wish
<wallyworld> i just cherry pick :-)
<wallyworld> but i can see that could get messy easier than landing to 1.18 first
<jam> wallyworld: 1 case isn't bad, N cases starts to get hairy
<jam> wallyworld: can I ask to have you peek at https://code.launchpad.net/~natefinch/juju-core/045-amd64plz/+merge/216612
<jam> It looks like FindInstanceSpec doesn't like that we prefer amd64 now
<jam> and I think you have some familiarity with that code
<wallyworld> sure
<wallyworld> that was actually on my list
<wallyworld> cause i tried to land it this morning
<wallyworld> there's a couple of failures
<wallyworld> jam: btw, is the weekly meeting on tonight with everyone away? i can't make it cause it's my son's birthday
<jam> wallyworld: you lazy slacker, how dare you have a family life! :)
<wallyworld> yeah, bad i am
<jam> wallyworld: I think we'll still do it with whoever is around
<jam> but don't feel bad
<wallyworld> sure, axw was wondering also
<wallyworld> probs will just be discussing getting 1.19 out anyway
<axw> woot, maas-name is merged
<jam> axw: so that is "juju deploy --to maas-name:foobar" ?
<axw> jam: sorry was eating lunch. not quite; for one thing deploy/add-unit are not supported yet (I opened a new bug for that)
<axw> jam: we can now do "juju add-machine [env-name:]<maas-name>"
<axw> jam: also s/add-machine/bootstrap/
<jam> axw: [env-name:] why wouldn't it be -e envname ?
<axw> jam: it's how fwereade wanted it, not exactly sure when we'd have env-name != implied env
<axw> jam: but you can leave it out altogether and it uses the implied env name
<jam> wallyworld: lp:~wallyworld/juju-core/bootstrap-supported-series just bounced with test suite failures
<axw> jam: I *think* it's for multi-env
<wallyworld> ah, ok, just about to propose the i386 branch
<wallyworld> will look in a bit
<jam> wallyworld: I'm looking at your constraints vocabs patch
<wallyworld> great ta
<jam> wallyworld: reviewed https://codereview.appspot.com/96730043/
<wallyworld> ta, will look rsn
<jam> wallyworld: On your SeriesToUpload patch, I was surprised to see Lucid
<wallyworld> jam: i didn't add it, it was already there
<jam> k
<wallyworld> i think the test just set it up as the preferred series in config
<wallyworld> just to have something other than precise to use
<wallyworld> jam: i tweaked some metadata and a test and fixed a nil pointer issue and it seems to work now https://codereview.appspot.com/90720043/
<jam> wallyworld: LGTM
<wallyworld> ta
<fwereade> axw, jam: it's not really env-name, it's provider-name (or, heh, really more like account-name) but since we don't have those concepts in play at the moment we can't sanely reference them
<jam> fwereade: fwiw, Id rather bring that in  when we actually have multi-provider environments
<jam> but as long as you can just leave it off, I guess it is fine.
<jam> I do wonder if we want to continue the --to syntax
<axw> yeah, just as long as we don't prevent future use
<jam> since that would match how it would end up on things that aren't bootstrap and add-machine
<jam> but I suppose add-machine already uses the ssh: syntax
<fwereade> jam, axw: yes, I would generally prefer that we not make a big deal of the prefixes; I just want them in place for when we do
<fwereade> jam, axw: I think a placement directive is meaningful both as a positional arg and as a --to payload
<fwereade> jam, axw: and it's to our benefit to make it look nice and easy in both cases, just as it is to our benefit to design it such that we can expand in plausible directions in future without having to change everything ;)
<axw> me too. "I'm bootstrapping $machine" vs. "I'm deploying $service to $machine"
<jam> fwereade: axw: any chance that your change allows "juju bootstrap ssh:user@host" at the same time ?
<fwereade> axw: although, hmm, I suddenly fret that --to works better on bootstrap...
<axw> jam: no, I did think of that, but it's not possible atm
<axw> jam: because we use the bootstrap-host when preparing
<fwereade> axw, jam: I'm not quite sure what positional arg I would be saving it for
<fwereade> axw, ah, that's a shame
<axw> jam, fwereade I added a TODO in provider/manual to revisit it tho
<jam> axw: AIUI we have a bug open on it
<fwereade> axw, it does STM that we don't really even need an environments.yaml for that case, though
<axw> ok
<fwereade> axw, that's a bigger change, but one we should make soonish
<jam> axw: bug #1282217
<_mup_> Bug #1282217: Specifying bootstrap-host requires editing environments.yaml <bootstrap> <ci> <manual-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1282217>
<fwereade> axw, I have to write it up properly for vegas
<jam> and possibly bug #1282642
<_mup_> Bug #1282642: Bootstrap prefers .jenv over environments.yaml <bootstrap> <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1282642>
<axw> ah yes, that's right
<jam> fwereade: so AIUI in bug #1282217, we need it in Prepare because we potentially call SyncTools before we bootstrap
<_mup_> Bug #1282217: Specifying bootstrap-host requires editing environments.yaml <bootstrap> <ci> <manual-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1282217>
<axw> jam, fwereade: the main issue is that we have multiple ways of preparing an environment, and it currently needs to be usable before bootstrap
<axw> yep
<jam> however, sync-tools is sort of dying anyway, so I'd rather not tie us to that
<fwereade> jam, yeah, but I'm not sure there's any reasonable remaining justification for sync-tools
<fwereade> jam, axw: I don't *think* there's anything beyond sync-tools keeping Prepare distinct from Bootstrap, is there?
<axw> fwereade: only plugins, but they probably don't need it
<jam> fwereade: Prepare is the time when we take the rough template and fill out all the random bits
<jam> so I don't mind it as a step
<jam> but that step can happen only during bootstrap for all I care
<axw> that would be nice
<fwereade> jam, yeah, I'm not really against *prepare* so much as I am against environments.yaml
<jam> fwereade: I *think* the fact that we gave bootstrap --source means we've replaced the other needs we had
<fwereade> cool
<jam> fwereade: with some possible caveats
<jam> like I think we are missing a way to say "Give me a copy of the tools that are at $URL"
<axw> the juju-metadata plugin uses PrepareFromName. does it need it to?
<jam> and --metadata-source requires it to be a local path
<jam> so you can't bootstrap from a public URL either
<axw> I think it might, so we can prepare tools and metadata in keystone or whatever
<jam> axw: I'm a firm believer that metadata should not be a plugin
<jam> axw: and I think it is in the same boat as sync-tools, in that when you are preparing your own image/tool metadata
<jam> you can't have an env bootstrapped yet
<jam> because you're creating its infrastructure
<axw> right... so it just needs to be able to deal with an "unprepared" env config
<jam> axw: but I don't think manual actually needs it, since you're not doing image lookup if your target is manual
<fwereade> jam, axw: I dunno, I feel like creating metadata and uploading it are very distinct things
<jam> you may need it for tools
<fwereade> jam, axw: and that separation points towards the pluginness being a good thing
<jam> though again, I think --source for bootstrap is a better workflow there
<fwereade> jam, axw: it probably shouldn't be about environments at all
<fwereade> jam, axw: plugins for metadata generation, for them as needs it
<jam> fwereade: actually, I firmly believe it shouldn't involve an env, because I often try to configure stuff, and I don't have a default env, and it just goes "poo" you must set -e
<jam> fwereade: afaict it doesn't really do anything with the env, at least for generate
<jam> maybe validate will use it for where to find the stream data
<fwereade> jam, axw: --source args that let you point at local or remote metadata that gets pushed into your environment
<fwereade> jam, axw: would be nicest of all, really, to have validate again just point at a metadata source
<fwereade> jam, axw: then if an environment uses that source, yay, you know it's validated
<axw> fwereade: how do you point at the environment storage?
<axw> although hm, the tools don't actually upload to env storage?
<jam> axw: the metadata does
<jam> or can
<axw> jam: sorry I meant the plugin. doesn't look like it does... maybe there's a flag I'm missing
<axw> anyway, if we can make it so prepare is only ever called by bootstrap that would be awesome
<fwereade> axw, jam: yeah, I don't think we really need to -- although I think I'm backing off the idea that you *can't* validate metadata for an env, it still seems perfectly reasonable to just take the source from the env and check that source
<axw> seems sane. I think once it's in the env, it should be reasonable to assume it's valid
<jam> fwereade: I feel like you have a better handle on what should be happening while hooks are firing, I think I've gotten the context down small enough in https://bugs.launchpad.net/juju-core/+bug/1311825 but I don't quite understand why it does/doesn't do stuff
<_mup_> Bug #1311825: test failure UniterSuite.TestUniterUpgradeConflicts <ci> <intermittent-failure> <test-failure> <juju-core:In Progress by jameinel> <https://launchpad.net/bugs/1311825>
<jam> fwereade: do you want to Hangout about it?
<fwereade> jam, sure, but you may just be watching me catching up
<jam> fwereade: https://plus.google.com/hangouts/_/canonical.com/john-william
<mfoord> morning all
<fwereade> mfoord, heyhey
<mfoord> fwereade: o/
<fwereade> jam, https://codereview.appspot.com/92720044
<jam> fwereade: LGTM
<jam> fwereade: me thinks we might want that patch in 1.18 ?
<jam> since it is about making CI happier to make releases
<jam> fwereade: provisioner treats no tools as error: https://codereview.appspot.com/93720044
<natefinch> morning all
<jam> morning natefinch
<jam> natefinch: care to start your day with a review ? https://code.launchpad.net/~jameinel/juju-core/1.18-provisioner-no-tools-is-fatal-1311676/+merge/217011
<natefinch> jam: sure
<perrito666> morning
<natefinch> perrito666: morning
 * perrito666 notices his bug resembles a matryoshka doll
<axw> jam: not going to be able to make the meeting, I have to look after my daughter
<jam> axw: np
<axw> also I have tomorrow off - national holiday. see you all in vegas
<jam> axw: see you next week
<jam> fwereade: standup ?
<perrito666> axw: you can bring your daughter to the meeting its becoming a thing :)
<natefinch> axw: see you in vegas
<natefinch> perrito666: you have a point :)
<jam> vladk|offline: group standup ?
 * perrito666 sees the test running for too long and knows he is about to get an error
<mfoord> natefinch: you need to bring your daughter to Vegas. Daily standups without her just won't be the same...
<natefinch> mfoord: haha
<psivaa> jam: hey, could you help taking a look at http://paste.ubuntu.com/7315688/  for juju bootstrap with trunk version, 1.19.1-trusty-amd64 pls?
<jam> psivaa: known bug we won't fix, you have to use --upload-tools to bootstrap trunk
<psivaa> jam: ack, i think i tried that too. let me do that again. thanks
<jam> psivaa: bug #1308337
<_mup_> Bug #1308337: 1.19.1 cannot bootstrap 1.19.0 (no longer installs mongodb-server during bootstrap) <regression> <juju-core:Won't Fix> <https://launchpad.net/bugs/1308337>
<psivaa> jam: thanks. with --upload-tools i got http://paste.ubuntu.com/7314791/ in a private cloud. not sure if the cloud being overly restrictive is any reason.
<psivaa> i'll try that with hpcloud
<jam> psivaa: you're using juju-core with gccgo and not compiling with static linking
<jam> psivaa: var/lib/juju/tools/1.19.1.1-precise-amd64/jujud: error while loading shared libraries: libgo.so.5: cannot open shared object file: No such file or directory
<jam> psivaa: you need to set flags, let me go look them up
<jam> psivaa: INSTALL_FLAGS := -gccgoflags=-static-libgo
<jam> psivaa: so you should be doing "go install -gccgoflags=-static-libgo" when building juju from trunk
<psivaa> ack, will try that. thanks jam
<wwitzel3> anyone else having issues with publickey denied with juju ssh to machine 0? using local provider?
<jam> wwitzel3: your local machine isn't accepting your own ssh key, but that isn't that uncommon. ssh 0 is never guaranteed to work, because most people don't even run openssh on their laptop/etc
<jam> wwitzel3: what command are you running?
<jam> eg "juju debug-log" never worked with local until recent 1.19.1-ish stuff
<jam> thumper turned debug-log into an api call.
<wwitzel3> jam: just juju ssh 0
<jam> wwitzel3: right, that depends entirely on if "ssh localhost" works
<jam> wwitzel3: not something guaranteed for local provider
<wwitzel3> jam: well, I can run ssh localhost and it works as expected
<jam> wwitzel3: sorry, can you "ssh ubuntu@localhost" ?
<wwitzel3> jam: oh right, because that is the user it is expecting
<mfoord> canonical wiki not responding
<mfoord> and I want some lxc information from it
<mfoord> so that seems like a good time to go on lunch
<mfoord> back soon(ish)
<wwitzel3> miss you already
<mfoord> wwitzel3: liar :-)
<perrito666> mfoord: also irc server is not letting me connect
<jam> mfoord: wiki just came up for me
<jam> slow, but it did come up
<jam> fuuuuuu!!!
<jam> I can't land anything on the 1.18 branch on the bot, the test suite is broken on tip
<jam> didn't wallyworld have a patch recently because we had some test isolation issues when releases were made?
 * wallyworld has vague recollections but can't recall the details
<jam> wallyworld: does https://code.launchpad.net/~jameinel/juju-core/1.18-provisioner-no-tools-is-fatal-1311676/+merge/217011/comments/515862/+download ring any bells ?
<jam> wallyworld: I'm just getting lots of "no tools available" failures on the 1.18 branch
<jam> even without my patch (if I SSH in manually and try it)
<jam> but it passes on my machine on trusty
<wallyworld> jam: it's the changed lts thing
<jam> ah, ffs, LatestLTSSeries became trusty
<wallyworld> yep
<jam> so all the tests that were expecting "precise" to just work
<jam> ...
<wallyworld> jam: i fixed it in trunk with the SeriesToUpload fix
<wallyworld> it sorta worked before but then we got the %LTS% placeholder put in
<jam> wallyworld: I can land my patch today by just editing /usr/share/distro-info/ubuntu.csv ...
<jam> and making Trusty come out next month
<wallyworld> yeah
<wallyworld> i need to backport a bunch of fixes to 1.18
<jam> wallyworld: well, the test which failed now passes on the bot
<wallyworld> \o/
<jam> I feel a bit dirty, but I feel *very* pragmatic
<wallyworld> sometimes that's fine :-)
<wallyworld> jam: so it seems we won't have 1.19.1 for ODS unless we shift out some bugs
<jam> wallyworld: High bugs don't block releasing 1.19.1
<wallyworld> great
<jam> wallyworld: they're just the stuff I want next on everyone's queues
<jam> they probably block 1.20
<wallyworld> yeah
<jam> High 1.19.1 bugs do, High 1.20 bugs probably don't...
<jam> as we're essentially doing priority by keeping multiple milestones
<jam> natefinch:  wwitzel3: can I get a trivial review: https://code.launchpad.net/~jameinel/juju-core/1.18-panic-parsing-jenv-1312136/+merge/217036
<jam> we were ignoring an 'err' return that could cause a later panic()
<wwitzel3> + 1 we shouldn't do that
<wwitzel3> haha, taking a look
<wwitzel3> LGTM
<alexisb> dude! you guys are kicking a** on bugs!
<jam> alexisb: well, I feel like I'm reporting as many as I'm fixing, but we are doing a lot of them
<alexisb> jam :) understood
<alexisb> but finding them is important too
<wwitzel3> jam: that's interesting, because it was declared as a var and then used in scope at the bottom of the function, it wasn't a compile-time error when we didn't use the err we get back from the function.
<alexisb> either way the progress is impressive
<jam> wwitzel3: hypothetically I should be adding a test for that path
<alexisb> jam are we seeing a lot of regressions from HA and vlan landing?
<jam> alexisb: we are seeing a lot of things that don't work when HA is enabled
<jam> not quite the same thing
<jam> when you don't *use* HA, then everything still works
<wwitzel3> :)
<alexisb> jam ack
<alexisb> that is better than HA breaking everything :)
<fwereade> natefinch, please remind me how arch selection works after your change -- do we pick amd64 by default, regardless of client?
<jam> alexisb: right. I think it falls under "not a regression, but fairly important bugs" which is reasonable
<jam> given that HA isn't by-default
<fwereade> natefinch, if so, am I right in thinking it should also resolve https://bugs.launchpad.net/juju-core/+bug/1274755 and https://bugs.launchpad.net/juju-core/+bug/1262967 ?
<alexisb> jam we will need to flush out the HA bugs and get them fixed before 1.20
<natefinch> fwereade: I changed it to just select the same arch as the host if possible
<natefinch> fwereade: since jam had made a good point that otherwise we'll break --upload-tools
<alexisb> jam, for the HA sprint discussion...
<jam> fwereade: I don't think it fixes https://bugs.launchpad.net/juju-core/+bug/1274755
<jam> that is that our canned data doesn't include other ARCH
<alexisb> do you think it is a mini topic or needs a full timeslot?
<fwereade> natefinch, that seems backwards -- aren't we restricting by available images/tools before going further?
<jam> alexisb: I think we're going to be talking a lot about next steps for HA, TBH
<jam> fwereade: https://bugs.launchpad.net/juju-core/+bug/1274755 is a test-suite bug
<jam> fwereade: so if you just run "juju bootstrap" it will pick something
<jam> if you later run "juju upgrade-juju --upload-tools"
<jam> We've at least had support requests
<jam> where they expected the latter to work, but were on say i386
<jam> and you can't upload i386 tools to an amd64 env
<jam> I'm not *sure* what clouds would have amd64 and ppc64 and then run from a ppc64 client
<natefinch> fwereade: the arch check is the last thing we do, after other considerations (tools and constraints)
<jam> alexisb: I'm *mostly* concerned that we have time allocated to doing the work
<alexisb> ack
<jam> alexisb: We can do the actual discussion in the bar, but I do think it is going to have a longer tail if we really want polish
<alexisb> yep we are planning to put time aside for core coding
<jam> alexisb: so are you saying I can have time for my team to code, or time to discuss HA ?
<alexisb> I am saying we have already planned to have time for coding, and I can schedule time for HA discussion if needed
<jam> alexisb: sounds good
<jam> I realize I'm adding them late, but when I come across them, I figure it is better to bring it up
<jam> we'll probably be adding some during the week as well, I would imagine
<alexisb> yes something tells me this will be a dynamic plan
<alexisb> I am just trying to ensure we capture everything and make note of it
<jam> alexisb: certainly, and I *definitely* appreciate that I can fire off a "we should talk about this" and not have to actually schedule it myself
<jam> wwitzel3: I think the code in question wanted to do shared error handling, and then just didn't, or was refactored out of being shared. I can poke at it a bit more
<jam> wwitzel3: I can't land it yet on the bot because of the Trusty / Utopic stuff (still needs to land my other branch which fixes 1.18 branch to pass the test suite again)
<coreycb> Hi, I'm getting connection failed, will retry: dial tcp 127.0.0.1:37017: connection refused' on bootstrap of local provider on trusty
<fwereade> natefinch, jam: dammit, I thought we'd done the SupportedArchitectures work to head that sort of problem off at the pass
<jam> wwitzel3: cfg is needed because it gets used later on
<fwereade> natefinch, jam: anyway, thanks, I won't close the bugs then :)
<jam> wwitzel3: and you can't return 'cfg' without returning 'err' I believe, we could declare 'err' more locally but we can't use := syntax
<jam> fwereade: *shrug* I'm not sure. I'm just giving my synopsis of the bug
<jam> since we don't have that platform on-hand to test it
<jam> maybe it works
<jam> wallyworld was doing testing on ppc64, right?
<fwereade> jam, indeed, but the evidence does not strongly point that way :)
<coreycb> /var/log/upstart/juju-db*.log has:   "/bin/sh: 1: exec: /usr/lib/juju/bin/mongod: not found"
<jam> coreycb: do you have "juju-local" installed ? it should bring in "juju-mongodb"
<coreycb> jam: nope
<jam> coreycb: I believe in trunk we detect if juju-local is available and if not, request that you install it
<fwereade> jam, you remember updating the ec2instance types a couple of months ago? do you recall where you got the cpu-power scores?
<coreycb> jam, that fixed it.  I don't think it notified me to, but I could have missed something.  I know it told me to install mongodb-server.
<jam> coreycb: 1.18 detected fixed dependencies, which were valid on precise, but not trusty
<jam> so instead we now require juju-local
<jam> which brings in the right things on both platforms
<coreycb> jam, ok I'm on 1.18.1-0ubuntu1
<jam> fwereade: I believe they changed their webpages :(
<natefinch> jam, wwitzel3:  we should declare err inside the if scope.  Yes, you'll need to do it in two places, but it'll ensure the compiler knows if you don't use it
<fwereade> jam, yeah, and now the only provider that understood cpu-power no longer gives us values
<jam> fwereade: found it: https://aws.amazon.com/ec2/pricing/
<jam> fwereade: it is under pricing, but not under instance types
<fwereade> jam, ah-ha! thanks
<jam> natefinch: i did
<jam> though it feels pretty wasted victory
<jam> natefinch: that is going to be something where RSN someone is going to say "why is this code duplicated?" and pull it out as they refactor the function, and forget to handle the err :)
<jam> but I *did* add a test for this case
<fwereade> jam, and there's some new ones, might as well add them (since I'm fixing a typo and looking them up anyway ;p)
<jam> note that the code still has another bug
<jam> which is if envs.Config() doesn't return IsNotFound
<jam> but returns some other err
<jam> it is going to break again
<natefinch> jam: yeah, I was going to mention that
<jam> natefinch: so you can pick up my patch if you want, but I'm way past EOD and shaving yaks is not a good start to my weekend
<natefinch> jam: no problem
<jam> natefinch: its pretty obvious that the author thought err was going to bubble up and be handled
<jam> but isn't
<jam> like, why handle only IsNotFound here
<natefinch> yep
<jam> natefinch: so I'd appreciate handoff of https://code.launchpad.net/~jameinel/juju-core/1.18-panic-parsing-jenv-1312136/+merge/217036. If you like it well enough, Approve it, if you want to tweak it, feel free
<natefinch> jam: sure, I'd like to handle the non-not found error
<natefinch> jam: seems like we don't need to handle notfound special there, just return whatever error we get
<jam> natefinch: sgtm, I'm not sure what errors we could handle
<jam> which means we probably want to move the Warningf that I have
<jam> up
<jam> and share the err
<jam> natefinch: the Warningf is because otherwise we don't inform the user that their data is bad, because we are trying to connect on the other Try
<natefinch> jam: so always warn if we return a non-nil error?
<jam> natefinch: I think we could not warn on NotFound
<natefinch> jam: yeah, that seems valid
<natefinch> jam:  someone higher up is probably checking for isnotfound anyway
<jam> happy birthday to Vladk
<vladk> jam: thanks
<perrito666> vladk: hey hb
<perrito666> why is calendar not telling me that its your bd?
<wwitzel3> vladk: nice job, do it again next year ;)
<fwereade> jam, another question about the m3 instance types -- they seem to accept both paravirtual and hvm images. do you know any particular reason to choose one over the other?
<coreycb> are individual juju command options documented somewhere?
<natefinch> coreycb: juju help <commandname>
<coreycb> natefinch, duh, thanks :)
<natefinch> coreycb: np :)
<fwereade> alexisb, I've just realised I've been listening to the hold music on the cross-team call for 10 mins -- do you know if it's happening?
<coreycb> hmm 'juju debug' gets - ERROR control-bucket: expected string, got nothing
<alexisb> fwereade, I think mark sent a cancellation notice
<coreycb> sorry, juju debug-log
<alexisb> but you are not the first to ask so maybe not
<bloodearnest> fwereade: it just got all electro-funk on me
<fwereade> bloodearnest, yeah, it was a bit more soothing when I started listening
<fwereade> alexisb, well, I'll drop off for now, I'm around if I'm wanted
<rick_h_> fwereade: same here. I just gave up.
<rick_h_> fwereade: looking through my mail for any missed notice
<fwereade> rick_h_, couldn't find one, but iirc he is at gophercon
<rick_h_> yep, conference has swallowed our teams
<alexisb> :)
<alexisb> yep sorry guys
<alexisb> I would run it but I dont have the leader code
<alexisb> and I am on a plane
<rick_h_> all good, see everyone in 4 days
<natefinch> alexisb: I thought you were going to be there yesterday?
<alexisb> natefinch, I was
<alexisb> I ended up in the ER
<alexisb> it is a long story
<natefinch> alexisb: whoa, hope everything's ok
<alexisb> but I am headed that way now
<alexisb> yes just complications from the cold I had 3 weeks ago
<alexisb> but I promise everyone I am not contagious I wont get you sick
<natefinch> alexisb: heh... I thought I was getting over my three (now almost four) week cold a couple days ago, and it just got worse.
<coreycb> has anyone seen this?  http://paste.ubuntu.com/7322857/
<coreycb> (sorry for so many questions this morning)
<natefinch> coreycb: hmm you don't really need to do debug-log locally, since you can just view the log yourself.  But I do wonder if that's a bug in local
<perrito666> alexisb: plane with internet sweet, here I was once asked to shut down my ipod (as in old, no internet ipod) for takeoff :p I certainly would not be allowed to use internet on flight
<natefinch> fwereade: any known issues with debug-log on local?  Seems like it would try to ssh to localhost's port 22, and I'm not sure we can guarantee it'll be open?
<coreycb> natefinch, yeah I can just log into the machine for now
<natefinch> coreycb: it's the local provider, right?  So the logs should be under ~/.juju/local
<coreycb> natefinch, yes it is.  ok thanks
<fwereade> natefinch, not sure it ever worked properly until thumper's debug-log-via-api, thought it'd work now
<fwereade> natefinch, although, yeah, you can always just look at the logs ;)
<natefinch> fwereade: is that post 1.18.1?
<fwereade> natefinch, I *thought* .0, but I'd have to go hunting
<natefinch> fwereade: it's ok. I can try it out and see what's going on
<fwereade> natefinch, cheers
<natefinch> anyone else having bzr issues?   I presume it's a launchpad connection problem.
<wwitzel3> natefinch: everything was giving me issues earlier, but apparently there were some networking issues. Seems fine for me now.
<wwitzel3> natefinch: though still a bit slower than normal
<natefinch> wwitzel3: that's good
<natefinch> rogpeppe: hey, how's gophercon?
<rogpeppe> natefinch: great so far!
<rogpeppe> natefinch: (apart from the hangover)
<natefinch> rogpeppe: haha, well, only one person to blame for that. ;)   I hear the lower air pressure makes alcohol more effective up there
<rogpeppe> natefinch: yeah
<natefinch> wwitzel3: can you review my update to john's branch?  I think it's a lot easier to read, and also handles an error path that his change was missing: https://codereview.appspot.com/90740044
<mfoord> I was wrong about rsyslog v5 docs not being online
<mfoord> thankfully they are
<fwereade> natefinch, LGTM, much nicer
<mgz> rogpeppe: how do you have a hangover already...
<perrito666> mgz: clearly drinklag
<mfoord> rogpeppe: hey, hi
<mfoord> rogpeppe: obviously the right fix for the hangover is to start drinking again
<sinzui> natefinch, rogpeppe : CI still fails trunk because of precise unit tests: https://bugs.launchpad.net/juju-core/+bug/1312261
<_mup_> Bug #1312261: 3 unit tests  fail the entire precise suite <ci> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1312261>
<sinzui> natefinch, or wwitzel3 , or anyone. ^ any ideas what can be done to the tests or to the test setup to make them pass on precise?
<wwitzel3> natefinch: LGTM, nice job
<mfoord> so according to rsyslogd my config file is valid
<mfoord> need HA up and running to try it
<mfoord> that'll be a task for tomorrow I think
<mfoord> time to go jogging
<mgz> sinzui: one thing to check on those tests is whether precise has up-to-date distro-info-data
<sinzui> mgz, you mean the archives we pull from when we do an update could be stale?
<sinzui> mgz, distro-info-data was installed/upgraded to 0.8ubuntu0.6
<sinzui> oh, mgz, but trusty got  0.18ubuntu0.1
<sinzui> mgz, distro-info-data_0.8ubuntu0.6_all.deb on precise does know about utopic
 * sinzui quickly updates publishing script to use that name instead of unctuous.
<perrito666> has anyone ever encountered a loop of "INFO juju.state open.go:133 dialled mongo successfully" when calling state.Open() ?
<natefinch> perrito666: doesn't sound familiar.  Are you running with --debug?  Hopefully you'd get more info about what's actually going on.  The other option is to go look at the system logs, rsyslog, etc.  Sometimes mongo's logs will output something useful.
 * natefinch has to go run a quick errand, back in half hour ish.
<natefinch> directions for restarting Ubuntu:  click power button, restart, hit the restart button in the popup window. Wait a minute, nothing happens.  Close all applications and try again.  Wait a minute, nothing happens.  Open terminal run sudo reboot, window manager closes immediately, wait 2 minutes while ubuntu .... image spins, hold power button for 5 seconds until hard power off.  Push power button again.
<sinzui> hands up who moved loggo from the loggo organisation to the juju organisation
<perrito666> natefinch: wow, instead, iirc: open terminal, sync, unplug power cord
<natefinch> perrito666: heh, well, it's a laptop, so I'd have to remove the battery too, which requires unscrewing stuff.  Probably just pushing the power button for 5 seconds to start is the best bet.
<perrito666> natefinch: seems your machine hates the new ubuntu kernel
<sinzui> 1.18 cannot build because loggo has moved. I need to create a shim.
<natefinch> perrito666: something.  There was a sweet spot in there somewhere where it was rebooting correctly for a month or two.  Not so much, now.  Oh well, I don't reboot often enough for it to be a big deal.  Had to today because LXC was being a bitch and screwing my juju local
<natefinch> sinzui: does loggo not exist in the old place?
<sinzui> Will go/godeps allow me to symlink, or do I need to manually get it
<sinzui> natefinch, yes it does exist in the old place
<sinzui> well, regardless of the solution, I will know soon
<natefinch> sinzui: go build only cares about paths on disk.  it looks for "launchpad.net/loggo" under $GOPATH/src/launchpad.net/loggo  .. it has no idea and doesn't care how the code got to that directory
<natefinch> sinzui: not sure about godeps.... it expects to be able to use VCS to sync a branch to a commit number, will probably fail if it can't do that in the correct directory
<sinzui> thank you natefinch . I will hope for a symlink, but plan to pull from the old repo
<sinzui> hmm, but why didn't godeps complain that the lib was missing. go build certainly noticed
<natefinch> sinzui: not sure.  Not hugely familiar with the internals of godeps.  I'd assume that if that directory didn't exist, it would barf
<sinzui> maybe the tarball script is calling godeps too early
 * sinzui looks
<sinzui> natefinch, a new clue: it looks like loggo has to be in both the old and new locations for go to build. I tested with just old and just new and got failures.
<natefinch> sinzui: sounds like some of the code is referencing the old path and some is referencing the new path
<jam1> natefinch: my patch was intended for lp:juju-core/1.18 though you're proposing it against trunk
<natefinch> jam1: doh
<natefinch> jam1: that would be why it has 1.18 in the name, too :/
<sinzui> natefinch, if this is the case I think I or jam will update the code to just use the old location to meet Ubuntu's criteria for micro releases
<natefinch> sinzui: yeah, we definitely should not be using both, that could easily cause bad behavior
<jam1> sinzui: so the awful truth is that LTS switching to trusty broke some of our tests, I believe
<natefinch> jam1: luckily one of the tests failed ;)
<jam1> sinzui: bug #1312176
<_mup_> Bug #1312176: test suite fails on Precise now that Trusty is released <test-failure> <testing> <juju-core:Triaged> <https://launchpad.net/bugs/1312176>
<natefinch>  jam1: it's like one of those "are you sure you want to merge this?" popups.  You have to really mean it
<sinzui> jam1, you can feel good that trusty tests usually pass the first time. trusty is very stable
<natefinch> jam1: well, shit.  I guess I should cherrypick my change and move it to a new branch
<jam1> natefinch: can you just propose it against 1.18 or did you merge trunk in the meantime?
<natefinch> jam1: I merged trunk :/  there were conflicts (not surprising since it was for 1.18)
<sinzui> natefinch, jam1. 1.18 does indeed want both loggos, and the juju/loggo is unpinned because godeps doesn't know about it.
<natefinch> jam1: I can just re-branch your branch and copy over the changes, they were pretty trivial
<natefinch> sinzui: that's a pretty important bug.  Importing both could easily cause serious issues.  We need to only use one or the other.
<jam1> sinzui: yeah, I think hazmat noted that we have a bogus import
<sinzui> natefinch, jam1: https://bugs.launchpad.net/juju-core/+bug/1311909 is the proposed fix sane? I can do all the packaging changes. The upgrade suggestion might be too tricky to do
<_mup_> Bug #1311909: juju 1.18 (local provider) depends on lxc 1.0.0 but nothing forces it to install on precise <packaging> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1311909>
<natefinch> sinzui: sounds good to me
<natefinch> weird, lbox totally screwed up the link to the code review
<natefinch> no.... it looks like Rietveld just reused an old number including old comments, but with the correct code
 * natefinch doesn't know WTF is going on with Rietveld
<jam1> sinzui: if we just put lxc 1.0.0 into ppa:juju/stable would it make it work?
<natefinch> dammit, somehow bzr or lbox completely mangled my commit
 * natefinch does it over again
<sinzui> jam1, I think it might have avoided the issue where juju-local wasn't installed
<natefinch> ahh FFS
<natefinch> jam1, wwitzel3: not sure if either of you are still here, but I had to manually propose my new branch because lbox is somehow totally borked for this branch for whatever reason
<natefinch> jam1, wwitzel3: https://code.launchpad.net/~natefinch/juju-core/panic-parsing-jenv-1312136/+merge/217130
<fwereade> natefinch, that still LGTM
<natefinch> fwereade: thanks
<sinzui> natefinch, wwitzel3 will either of you be landing a branch in the next 2 hours?
<natefinch> sinzui: in theory I have one landing right now
<sinzui> fab, I will wait for your branch instead of replaying an old revision to test utopic building
<natefinch> fwereade: trivial code review if you have a sec: https://codereview.appspot.com/94760043
<fwereade> natefinch, shouldn't there be an error returned and checked there?
<natefinch> fwereade: oops, wow, yeah.  Thanks for catching that
<natefinch> fwereade: fixed
<sinzui> natefinch, lp:juju-core/1.18 r2275 just failed again.
<sinzui> go build -v launchpad.net/juju-core/...
<sinzui> tmp.t3vUjGZmJ6/RELEASE/src/launchpad.net/juju-core/state/apiserver/usermanager/usermanager.go:9:2: cannot find package "github.com/loggo/loggo" in any of:
<sinzui> 	/usr/lib/go/src/pkg/github.com/loggo/loggo (from $GOROOT)
<sinzui> 	/mnt/jenkinshome/jobs/build-revision/workspace/tmp.t3vUjGZmJ6/RELEASE/src/github.com/loggo/loggo (from $GOPATH)
<fwereade> natefinch, LGTM
<natefinch> fwereade: thanks
<natefinch> sinzui: do you need me to fix that?
<sinzui> natefinch, I am about to propose a fix for the one bad import
<natefinch> sinzui: ok
<natefinch> sinzui: that looks to be the only place it's imported in 1.18 AFAICS
<sinzui> natefinch, Do you have a moment to review https://codereview.appspot.com/96780043
<natefinch> sinzui: LGTM
<sinzui> thank you natefinch , I will offer to the gobot god
<natefinch> sinzui: good luck :)
<natefinch> ok, EOD for me
<perrito666> hey, is this on syslog something I should worry about? replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
<fwereade> if anyone's around, I'd appreciate a review of https://codereview.appspot.com/96790043 that includes checking a random sample of it against objective reality
 * fwereade bed
#juju-dev 2014-04-25
<voidspace> morning all
<voidspace> wwitzel3: ping
<natefinch> voidspace:  standup?
<voidspace> natefinch: coming
<jam1> fwereade: do you think authorized-keys is worthy of 1.18?
<fwereade> jam1, I don't see that it'll hurt, and it's nice to get consistency fixes in early imo
<jam> perrito666: did you get a chance to file the "juju-restore depends on mongodb-clients" ?
<jam> bug
<perrito666> jam1: nope, sorry, I was up until very late hooked on (what I believe is) the last bug from restore
<perrito666> natefinch: I will go get something caffeinated an then ping you
<natefinch> perrito666: ok
<fwereade> voidspace, wwitzel3: hmm, it kinda looks as though we only update the addresses on login, and we don't have any sort of watcher on them
<voidspace> fwereade: ah
<voidspace> fwereade: I'm pretty sure we update some mongo document as new state servers come up, and we may even have a watcher already for that
<voidspace> fwereade: just somewhere else
<voidspace> fwereade: we'll be needing this shortly, but not *right now*
<voidspace> so we'll circle back to it
<voidspace> it's  StateServingInfo I was thinking of - I'm not sure that's what we need
<voidspace> although there's also StateServerInfo
<psivaa> jam1: hey sorry to bother you, with juju-1.19.1-trusty-amd64, i see http://pastebin.ubuntu.com/7329106/
<psivaa> i was hoping to pick the fix for bug 1308767
<perrito666> natefinch: back
<natefinch> perrito666: hey
 * perrito666 is properly caffeinated
<natefinch> so, it's really unfortunate that the backup and restore stuff is a bash script and not a Go program, because then you could just reuse the code we already have for dealing with replicasets.  But anyway, it's easy enough to do via command line
<perrito666> natefinch: well, such is my luck :p
<natefinch> perrito666: hopefully we'll fix it later, but for now I guess we have to deal with it as-is.
<perrito666> basically, after loading the dump, mongo keeps yelling
<perrito666> replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
<perrito666> among other things
<natefinch> yeah, that means the replicaset config is incorrect
<natefinch> which is not surprising, since it'll be referencing IP addresses that don't exist anymore
<perrito666> yup
<natefinch> you can use the mongo client and give it either javascript in the command line or javascript from a file to run
<perrito666> I'll go with the client, that is what I've been doing so far for the rest
<natefinch> perrito666: what you want to call is rs.reconfig() and pass it a document with the right IP addresses
<natefinch> perrito666: http://docs.mongodb.org/manual/reference/method/rs.reconfig/
<perrito666> taking a look
<natefinch> perrito666: the document you pass to it should look something like this one: http://docs.mongodb.org/manual/reference/replica-configuration/
<natefinch> perrito666: ideally we'd read out the old config, fix up the IPs, and reset it with the IPs being the only things different, but I'm not sure if we can even get the old config out of it
<perrito666> mmm, something tell me that I will need to export that in the backup first
<perrito666> heh
<natefinch> perrito666: exporting it during backup is an excellent idea.  Hadn't thought of that
<perrito666> I mean, if the replset has more members I would be losing them if I dont backup that explicitly
<perrito666> right?
<natefinch> perrito666: right.  Definitely it's a good idea to back up the replset config on its own, because you'll need to fix it up before you can connect to the database properly
<natefinch> perrito666: the nice thing is, is that we tag each replica entry with the machine ID, so we can correlate the entry with the machine's new IP address
<natefinch> brb
<natefinch> back
<perrito666> jam1: any form of you around here?
<perrito666> anyone know if I can report a bug and say "affects this, this and this"? (and how, exactly; I don't see that option on the submit form)
<fwereade> perrito666, not sure quite what you're asking
<fwereade> perrito666, filing a single bug with multiple not-obviously-connected consequences?
<fwereade> perrito666, offhand I would be inclined to file the consequences separately and immediately mark them all dupes of the real cause, but this is really just a guess
<perrito666> fwereade: nope I need to file a bug because backup/restore requires mongodb-clients and jam suggested I should include juju-mongodb package and juju-core in the report
<fwereade> perrito666, ah, once you've filed it there's an "also affect project" link
 * perrito666 is about to bash his head on the keyboard if his keys keep failing
<fwereade> perrito666, maybe there's one on the report-a-bug page too, not sure
<perrito666> fwereade: tx, I never added dependencies to my bug reports before :)
<coreycb> 'juju get' returns 'default: true' for every option as far as I can tell.  is that a known bug?
<fwereade> coreycb, so you've explicitly set some config options and its still saying they're set to the defaults? that surely does sound like a bug
<coreycb> fwereade, yeah so for example 'juju get mysql | grep default' returns 'default: true' for everything, even if it has 'type: int'
<coreycb> unless I'm misinterpreting the meaning of default
<coreycb> it doesn't correspond to the config.yaml default values, at least
<fwereade> coreycb, the type is not relevant afaics -- but if the values claim to be the defaults but don't match, that's definitely surprising
<coreycb> fwereade, I may be misinterpreting the meaning of default for juju get
<coreycb> fwereade, is it always a bool?
<fwereade> coreycb, "default: true" means "nobody's explicitly set this -- or someone has explicitly unset it -- and so it has the default value and may change to the new default value if the charm is upgraded"
<coreycb> fwereade, yeah ok so I'm misinterpreting.  that's pretty confusing though vs usage of 'default: ' in the config.yaml.
<fwereade> coreycb, if you explicitly set a config value to one that happens to match the default, it will persist through charm upgrades so long as the type for that field continues to match
<fwereade> coreycb, yeah, I never liked it much myself, iirc it was a compatibility-with-pyjuju thing
<coreycb> fwereade, I was thinking if 'default: eth0' in config.yaml, then juju get should return 'default: eth0'
<coreycb> ok
<coreycb> nevermind me then
<coreycb> :)
<fwereade> coreycb, the thinking is that *what* the default is isn't very interesting, but the fact that it *is* the default value is, because that has runtime implications
 * fwereade has to collect laura, bbiab
<coreycb> fwereade, well a different option name could make sense in juju get.. like 'using-default: true' to differentiate from what's in config.yaml
<coreycb> but maybe you need backward compat at this point
<natefinch> fwereade: is this really something we need in for 1.19? https://bugs.launchpad.net/juju-core/+bug/1312463
<fwereade> natefinch, well, they rather suck, but I don't think they're critical
<fwereade> natefinch, feels like they should be pretty trivial though -- 1 missing line in setup sort of thing
<natefinch> fwereade: could be, it just seems like an edge case - running tests without network. Surely we shouldn't rely on having access to the network for tests, but it's also not going to affect the user at all.
<fwereade> natefinch, I'd prefer not to treat it as edgey -- look at it from the "intermittent failure" perspective
<fwereade> natefinch, agree not user-facing though
<natefinch> fwereade: yeah, definitely makes the tests more fragile
<natefinch> fwereade: where do we get the db username and password from?
<natefinch> fwereade: I'm trying to help perrito666 with the backup and restore scripts, and we need the user/pw for the admin database to be able to get the replicaset config
 * perrito666 keeps finding things to ask that nobody knows :p karma hates me
<natefinch> perrito666: it's just one of those rabbitholes where this data structure gets it from that data structure which gets it from another data structure.... and half the time if you trace it back far enough, it's actually hard coded
<perrito666> well, I know from past experience that I'm in trouble when fwereade says "ahh oh, ... that ... uhm"
<perrito666> I found, for instance, that one particular failure case of a non-working db ends up with mongo flooding the log with a success message :p
<natefinch> perrito666: haha... that's mongo for you
<natefinch> perrito666: working with mongo in juju has reinforced my love of postgres for personal projects :)
<perrito666> well, working enough with postgres has reinforced my love for csv :p any kind of db in excess is bad for your health
<natefinch> perrito666: heh, yeah, often times simple file storage is more than good enough
<perrito666> natefinch: so, is perhaps the password stored in agent config as "old-password"?
<fwereade> perrito666, natefinch: I don't think it's that one... don't we set up the state server machine-tag/password as mongo admins?
<fwereade> perrito666, natefinch: user-admin used to be one, with admin-secret, but I *think* we closed that off
<fwereade> perrito666, natefinch: sorry, I'm a bit back and forth at the moment, I have fewer hours until I fly than I thought I did
<perrito666> fwereade: ah, you leave today?
<fwereade> perrito666, early tomorrow
<perrito666> fwereade: ouch
 * perrito666 remembers he needs to secure a ride to the airport
 * fwereade just did that himself :)
<perrito666> my flight is at the not so fun time of 2AM
<natefinch> heh... I got our second car inspected yesterday so I could drive to the airport and not get ticketed for driving an uninspected vehicle.  Nothing like waiting until the last minute.
<natefinch> perrito666: ouch!
<perrito666> natefinch: you leave the car at the airport?
<natefinch> perrito666: sorta have to.  It's an hour and 15 minutes away... telling my wife she needs to drive 2 1/2 hours with two screaming kids in the back to drop me off is not a viable option :)
<perrito666> natefinch: ah, I live 10-15 min through the highway from the airport... although telling my wife that she needs to drive at 2am through the unsafest places of the city might yield a similar result, so cab it is
<natefinch> perrito666: haha yeah
<natefinch> perrito666: I don't think there's a cab in the world that would even agree to take me that far, and even if they did, it would cost as much each direction as parking for the week
<perrito666> heh, cabs here are cheap by intl standards, it's a USD 15 ride to the airport (~27km)
<perrito666> against 120 for a weeks parking
<natefinch> heh, it's like 90km for me, and if I could get a ride it would cost me at least $100.  There *might* be a shuttle bus from a nearby town, but those are usually $50 each way, and parking is only $108 per week
<natefinch> perrito666: anyway, I leave 7pm on Sunday and arrive 9:41pm local time in Vegas.  I had hoped to leave earlier on sunday, but there were no other direct flights, and indirect flights were adding like 2-3 hours onto the 5.5 hour direct flight
<perrito666> natefinch: well I get out at 2AM and get there around 4 or 5 PM local time with a 2h wait in panama
<natefinch> perrito666: on sunday?
<perrito666> yep
<perrito666> when I say local time, I mean ART
<perrito666> Its like 2PM there
<natefinch> that's not too bad then.  Long haul, but nice to have a little time to recover that day
<perrito666> yep, return on the other hand, is a 20h flight
<perrito666> 8 of those are a wait in panama :p
<natefinch> oh man, that's rough
<natefinch> I hate layovers
<perrito666> but, I have been told that tocumen airport has shuttles that take you to see the canal and the city if you have long waits
<fwereade> perrito666, did you get your priority pass for the lounges? I, uh, didn't, and I have 5+ hours in heathrow on the way back
<perrito666> fwereade: the code for canonical was not working and I forgot to ask and well, also I just remembered to ask for this too late
<perrito666> their page says they send the card in 3 to 7 business days, which, here, if it's intl mail will be around 15 days to a month
<Makyo> Hi from GopherCon. For those who aren't in talks, I'm trying to get juju building on my new laptop and running into the following: utils/ssh/ssh_gocrypto.go:84: undefined: ssh.ClientConn
<fwereade> perrito666, fwiw you're meant to make sure it gets sent to the office, and they need to send it on
<wwitzel3> ... well I was going to complain about my 1h layover in ATL and my 4 hour flight, but I feel much better about it now
<fwereade> perrito666, not sure why
<voidspace> hah
<perrito666> wwitzel3: lol
<jcw4> Makyo: did you use godeps to update the dependency versions?
<Makyo> jcw4, no, this is a fresh checkout.  I read about it in CONTRIBUTING, but wasn't sure if it was required for building.
<jcw4> Makyo: yes, that specific error looks like one I got recently before running godeps
<Makyo> jcw4, alright, thanks, will try.
<jcw4> Makyo: my process (in juju-core): go get -u ./... ; godeps -u dependencies.tsv; go build ./...
<hazmat> fwereade, ping
<fwereade> hazmat, pong
<perrito666> natefinch: strange, bootstrap seems to be using oldpassword as the password
<hazmat> fwereade, re cloud-installer, was wondering if you had a minute to chat about the approach they went with.. its actually a bit cleaner than what they were looking at previously (no --to=kvm:0) but it does have some nesting
<hazmat> using kvm as the local provider container type, with lxc inside of them
<hazmat> instead of touching machine 0 in local ... which is special.
<natefinch> perrito666: I think we change it after we log in the first time
<perrito666> apparently so, bc oldpassword is not working either
 * perrito666 dives back into grep
<wwitzel3> voidspace: ping
<voidspace> wwitzel3: pong
<voidspace> sorry
<wwitzel3> voidspace: np :)
<voidspace> wwitzel3: https://code.launchpad.net/~mfoord/juju-core/ha-rsyslog/+merge/217270
<perrito666> natefinch: interesting, we seem to be setting the admin-secret as password, yet this does not seem to work for a simple query using the client
<natefinch> perrito666: hmm weird.  can you try adding a logging statement to write out the exact password we're using to log in?
<natefinch> perrito666: also, it's possible there's two different passwords - one for the db in general and one for the admin database
<perrito666> natefinch: I pasted something in priv there
<perrito666> mostly bc it has the pw
<wwitzel3> was someone else seeing hash sum mismatch errors on ec2 earlier?
<wwitzel3> http://paste.ubuntu.com/7331230/
<natefinch> nope, sounds like maybe just a networking error though
<wwitzel3> natefinch: yeah I just kept retrying it and eventually it worked :/
<wwitzel3> always makes me confident
<natefinch> I gotta go run an errand.  Good luck perrito666, sorry I haven't been much help.  My wife is sick, so I'm getting a lot more kid duty today
<perrito666> ouch hope she gets better
<perrito666> you were very helpful
<natefinch> if her cold is anything like mine, it'll last another 3 weeks.  I've been sick for like a month.
<perrito666> is this a correct invocation for mongo?:  mongo --ssl -u "admin" -p "a very secret password" localhost:37017/admin -eval "rs.conf()"
<voidspace> EOW
<voidspace> see you in Vegas
<perrito666> voidspace: have a nice trip
<voidspace> perrito666: you too
<voidspace> bye all
<natefinch-afk> perrito666: how goes?
<perrito666> natefinch-afk: hey, just came back from lunch
<perrito666> natefinch-afk: not many uses of SetAdminMongoPassword
<perrito666> but I do notice that sets the hash
<perrito666> that might be why it's not working
<sinzui> perrito666, lbox hates me. LP sees the diff, but lbox/rietveld tells me to get stuffed
<sinzui> Failed to send patch set to codereview: diff is empty
<sinzui> ^ any ideas
<natefinch> sinzui: just propose it the old fashioned way, I had to do that yesterday
<natefinch> sinzui: there's a diff in launchpad, it's just not as purty as rietveld's
<sinzui> natefinch, I think the old fashioned way is LP's MP. Is that what you mean?
<natefinch> sinzui: yeah
<sinzui> natefinch, Do you have a moment? https://code.launchpad.net/~sinzui/juju-core/inc-1.19.2/+merge/217287
<natefinch> sinzui: see? we don't need no rietveld for that.  :)
<sinzui> :)
<sinzui> This is the first release without quantal. This could be the first release without a build failure.
<mgz> sinzui: for lbox things, it helps to delete any random branch directories it's gone and created
<sinzui> mgz, I find them from time to time.
<perrito666> sinzui: sorry I was afk
<sinzui> np perrito666
<natefinch> sinzui: no more 1.19.1 bugs?
<natefinch> sinzui: I don't see 1.19.1 as a valid milestone choice under milestone-targeted bugs
<sinzui> natefinch, yep, 1.19.1 is closed. It is being built now
<natefinch> sinzui: cool
<sinzui> natefinch, I will create 1.19.2
<sinzui> natefinch, 1.19.2 will show up if you reload the page
<bits3rpent> Lets say I set a config value to something new config-changed hook runs
<bits3rpent> eventually the actual config-changed script runs, and performs config-get. Does config-get get the new config that was changed in the state server, or the old config in the local state?
<natefinch> bits3rpent: I think there's very few juju devs online right now
<natefinch> I'm packing it in.  EOD for me today.
<mgz> fix for bug 1312940 <https://codereview.appspot.com/95770045/>
#juju-dev 2014-04-26
<mgz> rogpeppe: https://codereview.appspot.com/95780044
<axisys> juju does not start hadoop units in local environment .. showing pending..
<axisys> using 1.19.1-trusty-amd64 from ppa:juju/devel
<axisys> log http://paste.ubuntu.com/7334753/
<mgz> anyone remember if there's a good reason to stay with go 1.1.2 on the bot?
<mgz> I can't land the go.crypto update atm as it wants a few cypher constants that don't exist on that apparently
<mgz> seems 1.1.2 is the newest we have in the golang ppa... so stuck for now
#juju-dev 2015-04-20
<dimitern> morning
<jam> dimitern: are you back today, or taking a swap day ?
<jam> morning, btw
<dimitern> jam, morning :) I'm back today and I'm taking off the thu and fri this week
<jam> I'll be in the hangout in just a sec, grabbing water
<dimitern> ok
<TheMue> morning o/
<mup> Bug #1446159 was opened: actions required params list can specify undefined parameters <juju-core:New> <https://launchpad.net/bugs/1446159>
<mup> Bug #1446168 was opened: juju status --format=short has stray newline <landscape> <juju-core:New> <https://launchpad.net/bugs/1446168>
<Mmike> Hello, lads! Who can tell me how juju is creating lxc containers for maas provider? I'm interesed in the config file(s) provided, and how templates and containers are created?
<Mmike> especially in relating to juju-br0 interface
<mup> Bug #1446264 was opened: joyent machines get stuck in provisioning <bootstrap> <joyent-provider> <reliability> <repeatability> <juju-core:Triaged> <https://launchpad.net/bugs/1446264>
<xwwt> sinzui: Note I canceled release standup today
<sinzui> Ok, I didn't see that
<thumper> alexisb: are you wanting to catch up today?
<alexisb> thumper, yes
<thumper> alexisb: also, we need to move our weekly call an hour
<alexisb> and you missed our 1x
<alexisb> 1x1
<alexisb> thumper, that is fine
<alexisb> can you meet now
<thumper> alexisb: it was too early
<thumper> :)
<thumper> k
<alexisb> see you there
#juju-dev 2015-04-21
<menn0> thumper, cherylj: http://reviews.vapour.ws/r/1457/
 * thumper looks
<menn0> thumper: next PR will hook up bufferedLogWriter to the logsender worker and get these running inside the unit and machine agents
 * thumper nods
<menn0> thumper: chat?
<thumper> menn0: sure
<menn0> thumper: which hangout?
<thumper> meh
<thumper> standup
<menn0> kk
<mattyw> morning all
<dimitern> morning
<dimitern> dooferlad, happy birthday! :)
<dooferlad> dimitern: :-)
<TheMue> morning o/
<TheMue> dooferlad: oh, what do I see? happy birthday!
<TheMue> wow, April is always special, 5 to 6 birthdays per week, including our little daughter
<mattyw> gsamfira, ping?
<gsamfira>  mattyw: pong
<mattyw> gsamfira, hey there - was good seeing you last week, couple of questions...
<gsamfira> mattyw: it was :D. Hope to do it soon again :D
<gsamfira> mattyw: sure thing, shoot
<mattyw> gsamfira, 1. In one of the cloudbase talks there was a mention of common problems in unit tests, and one of them was "The process cannot access the file because it is being used by another process". And there was a statement about the pattern that usually leads to it. Can you remember what that was?
<gsamfira> yup
<gsamfira> it's opening a file for writing, deferring the close and then moving the file
<gsamfira> :)
<gsamfira> something like fd, err := os.Create("file"); defer fd.Close(); os.Rename(fd.Name(), "/tmp/bla")
<gsamfira> that usually blows up on windows
<gsamfira> mattyw: simple way to get around it is to simply close the file before trying to delete or move it
<mattyw> gsamfira, ok cool, I'll see if I can fix some of those this week
<mattyw> gsamfira, question 2. Have you ever considered making a chocolately package for juju?
<gsamfira> mattyw: huh...good idea :D
<mattyw> gsamfira, I'll add that to my list as well :)
<mattyw> gsamfira, no more questions for now :)
<gsamfira> mattyw: there are some tests that try to do cleanup by simply doing a defer os.Remove("target"). Those can be replaced with defer func() { fd.Close(); os.Remove(fd.Name())}()
<gsamfira> if those are the ones you plan to fix
<gsamfira> then that would drastically reduce the leftovers in $env:TMP
<gsamfira> :D
<mattyw> gsamfira, there are a number in apiserver that fail that way I think so I was hoping to make time to take a look at those this week
<gsamfira> mattyw: hmm. This must have happened in some recent commit. I did a merge on friday that fixed tests on windows again :)
<mattyw> gsamfira, ah - I have a branch from friday morning, I'll check again
<gsamfira> mattyw: https://github.com/juju/juju/commit/34313b849aee8cfed333df1d02ccdb7e0cc32f58 <-- this should pass :)
<mattyw> gsamfira, yep, tests look good :)
<mattyw> (I ran them ages ago but totally forgot)
<gsamfira> mattyw: sweet! now we need to make them less flaky and finally run them in parallel to the ubuntu ones in the CI :P
<mattyw> gsamfira, is mgz taking that on?
<gsamfira> mattyw: yup. And we will help wherever we can :)
<jam> dimitern: reviewed your patch
<dimitern> jam, thanks!
<mup> Bug #1446608 was opened: agent panic on MAAS network with uppercase characters <cloud-installer> <landscape> <juju-core:New> <https://launchpad.net/bugs/1446608>
<mgz> mattyw: I sent mail about windows tests - there are a couple of metering suite failures that aren't immediately obvious to me
<mattyw> mgz, I just saw that - thanks very much. I'll take a look at those now
<mattyw> mgz, this on master?
<mgz> mattyw: yup, windows tests on master only
<mattyw> mgz, I'll take a look at the meterstatus one
<mattyw> mgz, the rest we need someone who knows more
<mgz> mattyw: yeah, I got 'em
<mup> Bug #1394755 changed: juju ensure-availability should be able to target existing machines <cloud-installer> <ha> <landscape> <juju-core:Fix Released by natefinch> <juju-core 1.23:Fix Released by natefinch> <https://launchpad.net/bugs/1394755>
<mup> Bug #1430340 changed: Failing to create tempdir in tests on windows <test-failure> <windows> <juju-core:Fix Released by gabriel-samfira> <https://launchpad.net/bugs/1430340>
<mup> Bug #1432652 changed: upgrade_test.go: TestLoginsDuringUpgrade failing due to "upgrade in progress" <ci> <ppc64el> <test-failure> <unit-tests> <juju-core:Fix Released by menno.smits> <https://launchpad.net/bugs/1432652>
<mup> Bug #1435152 changed: Can't deploy local charm in non-server environment <juju-core:Fix Released by waigani> <juju-core 1.23:Fix Released by waigani> <https://launchpad.net/bugs/1435152>
<mup> Bug #1438683 changed: Containers stuck allocating, interface not up <add-machine> <cloud-installer> <landscape> <maas-provider> <network> <juju-core:Fix Released by mfoord> <juju-core 1.23:Fix Released by mfoord> <juju-core trunk:Fix Released by mfoord> <https://launchpad.net/bugs/1438683>
<mup> Bug #1439364 changed: error in logs: environment does not support networking <logging> <network> <juju-core:Fix Released by mfoord> <juju-core 1.23:Fix Released by mfoord> <https://launchpad.net/bugs/1439364>
<mup> Bug #1439447 changed: tools download in cloud-init should not go through http[s]_proxy <cloud-installer> <landscape> <juju-core:Fix Released by cherylj> <juju-core 1.23:Fix Released by cherylj> <https://launchpad.net/bugs/1439447>
<mup> Bug #1443440 changed: 1.23-beta4 sporadically fails autotests <local-provider> <mongodb> <systemd> <ubuntu-engineering> <vivid> <juju-core:Fix Released by menno.smits> <juju-core 1.23:Fix Released by menno.smits> <https://launchpad.net/bugs/1443440>
<mup> Bug #1443541 changed: juju 1.23b4 vivid panic: runtime error: invalid memory address or nil pointer dereference <openstack> <uosci> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1443541>
<mup> Bug #1443904 changed: Apache-licensed code has been borrowed with violation of license requirements <packaging> <tech-debt> <juju-core:Fix Released by wallyworld> <juju-core 1.23:Fix Released by wallyworld> <https://launchpad.net/bugs/1443904>
<sinzui> dimitern, how goes the container addressability feature flag?
<dimitern> sinzui, it's being tested - so far, so good
<sinzui> :)
<dimitern> sinzui, I'll most likely land it in a couple of hours - I have a review and a few things to fix first
<sinzui> dimitern, understood. I am going to turn off CI so that it doesn't start testing something ahead of you.
<dimitern> thanks sinzui
<mup> Bug #1426461 changed: Some service.Service methods should return an error <tech-debt> <juju-core:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1426461>
<mup> Bug #1446662 was opened: Vivid bootstrap and destroy-environment intermittently fails <bootstrap> <destroy-environment> <golang> <vivid> <juju-core:In Progress by gz> <juju-core 1.23:Fix Committed by gz> <https://launchpad.net/bugs/1446662>
<mup> Bug #1351099 changed: WARNING juju.worker.uniter.charm git_deployer.go:200 no current staging repo <cloud-installer> <landscape> <logging> <usability> <juju-core:Fix Released by davidpbritton> <https://launchpad.net/bugs/1351099>
<mup> Bug #1414424 changed: envStateCollection FindId and RemveId don't filter on env-uuid field <juju-core:Fix Released by waigani> <https://launchpad.net/bugs/1414424>
<mup> Bug #1422791 changed: DestroyEnvironment does not remove entry from envusers <destroy-environment> <users> <juju-core:Fix Released by waigani> <https://launchpad.net/bugs/1422791>
<cherylj> Hey core team, join the bootstack / core call if you can.
<cherylj> cmars, dimitern, TheMue ^^
<TheMue> cherylj: yup
<dimitern> cherylj, I won't manage, sorry :/
<TheMue> cherylj: hmmmpf, do you see me moving? FF says it's frozen
<cherylj> TheMue: Yes, I do
<TheMue> cherylj: fine, and I can at least hear you
<lazyPower> o/ I have a question about the enhanced service-status code thats coming... will we get something on the node itself like "unit-get status" to determine the overall status of the service being deployed? or are we relying on hook context to determine the status as we have in the past.
<alexisb> lazyPower, can you send a note to fwereade
<lazyPower> will do
<alexisb> I need a volunteer for this bug: https://bugs.launchpad.net/juju-core/+bug/1441826
<mup> Bug #1441826: deployer and quickstart are broken in 1.24-alpha1 <api> <ci> <deployer> <quickstart> <regression> <juju-ci-tools:Triaged> <juju-core:Triaged> <https://launchpad.net/bugs/1441826>
<mup> Bug #1444354 changed: juju backups includes previous backups in saved file <backup-restore> <juju-core:Fix Released> <https://launchpad.net/bugs/1444354>
<sinzui> cmars: juju-ci is failing because git.apache.org/thrift does not exist. I think I need to force this test to pass or not vote so that we can release 1.23.1
<alexisb> sinzui, cmars is out
<alexisb> force the non vote
<sinzui> alexisb, looks like bad server on their side. I cannot get it either.
<stokachu> what's the recommended way to pull the current api state? i used to use juju.NewConnFromName
<stokachu> it looks like NewAPIClient and NewAPIRoot gives me access to the api state, is that the new recommended way?
<stokachu> api state + api client
<alexisb> stokachu, sorry missed your questions earlier
<alexisb> stokachu, can you send your question to juju-dev mailing list
<alexisb> stokachu, we still have many folks out on swap days
<stokachu> alexisb: ah ok, it's not urgent, just some exploratory work im doing in the juju internals
<lazyPower> o/ can someone point me in the right direction as to what i'm doing incorrectly thats spawning 'no unit id'
<lazyPower> http://paste.ubuntu.com/10863354/
<alexisb> thumper, ericsnow, perrito666 others who might be online ^^^
<lazyPower> the intention is getting/sending variables out of band of the hook context in which its intended. "such as i have networking config updated, and i need to transmit over the networking relationship between hosts"
<ericsnow> lazyPower: relation-get -r <rel id> <key> <unit id>
<ericsnow> lazyPower: looks like you are missing the unit ID
<lazyPower> root@juju-lazyp-canonistack-machine-9:/var/lib/juju/agents/unit-docker-0/charm# relation-get private-address -r 19 flannel-docker/0
<lazyPower> 10.55.61.185
<lazyPower> oi
<lazyPower> ok i see *why* it has to be scoped that way though, so far i've only relation-set out of band. Thanks for the rundown ericsnow
<ericsnow> lazyPower: np :)
<mup> Bug #1446857 was opened: MeterStatusWatcher tests fail on windows test slave <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1446857>
<mup> Bug #1446871 was opened: Unit hooks fail on windows if PATH is uppercase <ci> <hooks> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1446871>
<sinzui> xwwt, alexisb and canonistack hates us. I will make the one failing non-voting because we have other tests in the substrate that passed an hour ago
<alexisb> wallyworld_, thumper can someone review williams pull request from earlier today
<wallyworld_> ok, will do soon
<alexisb> wallyworld_, anastasiamac is up for all call reviewer, but I would like william to have what he needs in the morning
<alexisb> wallyworld_, thanks
<wallyworld_> sure, np
<xwwt> sinzui: ty for the update on that
<sinzui> xwwt, we have a bless, I am waiting for all the assets to arrive to queue the release
<xwwt> sinzui: cool.  ty for sticking with this
<mup> Bug #1446885 was opened: Skipped cmd/jujud/agent/upgrade_test.go tests on windows <skipped-test> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1446885>
<sinzui> wallyworld_, thumper : do either of you have a minute to review http://reviews.vapour.ws/r/1464/
<wallyworld_> sure
<wallyworld_> sinzui: +1
<mgz> ocr: http://reviews.vapour.ws/r/1463
<sinzui> mgz: I have ci paused. I can wait a few hours for your current branch
<mgz> sinzui: for trunk? sounds good, I have one more test script change to work around a bunch of uniter failures
<sinzui> mgz, yes trunk. I don't want to test 1.23 again given how unwell joyent, aws, and canonistack are at this hour
<wallyworld_> thumper: i've updated http://reviews.vapour.ws/r/1440 if you could take another look that would be great
#juju-dev 2015-04-22
<thumper> wallyworld_: ack
<wallyworld_> thumper: thanks for review
<thumper> np
<stokachu> if i wanted to query AllMachines from the state package is there a preferred method of accessing that data? i used to do a juju.NewConnFromName which would expose that state api
<stokachu> i can't seem to find an alternative way now since that has disappeared
<thumper> stokachu: hmm...
<thumper> stokachu: where and why?
<thumper> stokachu: short answer is you should use the client API
<thumper> stokachu: which you should be able to get the machine information out of
<thumper> using the status command
<thumper> hmm... 16:30 and brain is going fuzzy
<thumper> about 3-4 hours later than yesterday
<stokachu> thumper: yea that's what im using now, im just updating https://github.com/battlemidget/juju-sos/blob/master/cmd.go
<stokachu> to work with latest juju code
<stokachu> ill push everything through the api though, makes more sense anyway
<thumper> stokachu: yeah, juju.Conn isn't the way
<stokachu> thumper: cool, thanks for confirming the api client
<thumper> stokachu: also, change the logger to be 'juju.cmd.sos'
<stokachu> ah ok will fix that too
<thumper> or
<thumper> juju.plugin.sos
<thumper> fairly arbitrary definition
<thumper> plugin probably makes more sense
<stokachu> ok sounds good ill fix that too
<thumper> cheers
<stokachu> thanks :)
<thumper> np
<dimitern> jam, dooferlad, please take a look http://reviews.vapour.ws/r/1465/ - follow-up to the branch from yesterday, this time adding tests
<dooferlad> dimitern: *click*
<dimitern> dooferlad, ta
<jam> dimitern: why are we getting "not supported" rather than "EPerm" for NewMachineTag("42") ?
<jam> it makes me wonder if we should be doing the address check later.
<jam> I'm not particularly worried about it, just made me wonder
<dimitern> jam, well, it is a valid tag - why ErrPerm?
<jam> the machine exists?
<jam> (because we don't do NotFound
<jam> )
<dimitern> jam, it does not exist, but that's fine, because you're not supposed to call the method at all without the feature flag
<jam> dimitern: yeah, seems ok
<jam> you're not leaking information about a particular entry
<dimitern> jam, cheers
<dimitern> jam, actually that issue you raised was bugging me as well for its behavioral inconsistency, so I'm changing it to return results with the same len as the passed args when the flag is not on
 * fwereade has so far found 27 workers running in the machine agent for one reason or another and knows he's missing some :/
<mattyw> dimitern, are you around for a quick review? http://reviews.vapour.ws/r/1466/
<mattyw> dimitern, (I'm ocr today so I can't do it)
<wwitzel3> perrito666: ping
<perrito666> wwitzel3: pong
<wwitzel3> ericsnow: ping
<fwereade> perrito666, I'm wondering if you'd know: is the limitLoginsDuringRestore stuff goroutine-safe?
<perrito666> I think it is
<perrito666> fwereade: btw, hi :)
<fwereade> perrito666, can you point me to how it is? all I can see are a bunch of unprotected fields in the agent, and callbacks to methods using them in the api server
<fwereade> perrito666, hi indeed :)
<dimitern> mattyw, sure, will have a look in a bit
<perrito666> fwereade: can you ask me again tomorrow I really really am trying to fit something into 1.24 for ian
<fwereade> perrito666, sure, np
<mgz> hm, the pre-push hook doesn't check that the *tests* build
<dimitern> mattyw, this just occurred to me: the reason why you're getting 2 events on linux and only 1 on windows might be because windows does not run the apiserver (yet), where the server-side watcher resided
<mattyw> dimitern, windows doesn't run the apiserver?????
<dimitern> mattyw, well, not that I know of - bootstrap is not supported on windows and there's some issues around packaging mongo with ssl on windows
<dimitern> mattyw, reviewed
<mattyw> dimitern, cheers
<mattyw> dimitern, what do we do in tests then under windows?
<dimitern> mattyw, but then again - this could have nothing to do with windows not running an apiserver
<dimitern> mattyw, we test we can use a juju client on windows to talk to a bootstrapped environment on ubuntu
<jam> dimitern: fwereade: you coming to the malta powow?
<dimitern> jam, sure, omw
<fwereade> jam, just a mo
<dimitern> pow's done, wow's mostly :)
<jam> :)
<mattyw> dimitern, yeah - I think we're only dealing with state
<mattyw> dimitern, looking into it now - but as you have shown an "interest" it's you I'm coming to if I have questions ;)
<dimitern> mattyw, no sweat :)
<jam> dimitern: I'm pretty sure you can at least run the infrastructure of the API server on Windows.
<jam> I'm not sure that "jujud' runs, but that is different
<mgz> wwitzel3: did you mean to remove all the windows deps in your gosigma dependencies.tsv update?
<wwitzel3> mgz: nope :(
<mgz> wwitzel3: happened to not go through anyway, so easy enough to fix
<wwitzel3> mgz: yep, already fixed, thanks
<fwereade> dimitern, jam, anybody: this is my current best guess at the workers we might run in a machine agent: http://paste.ubuntu.com/10865749/
<dimitern> fwereade, looking
<fwereade> dimitern, jam, annybody: I know it lacks detail and/or existence in several bits where one worker spawns many others (eg container provisioning, per-env provisioners, etc)
<fwereade> any suggestions, clarifications, additions gratefully received
<dimitern> fwereade, sure, will let you know
<jam> fwereade: my irc client seems to have not buffered the original request, I'm interested to look, but can you link it again?
<dimitern> fwereade, you're missing one of the most recent ones - worker/addresser
<fwereade> jam, http://paste.ubuntu.com/10865749/
<dimitern> fwereade, runs on each master state server (like cleaner, resumer)
<fwereade> dimitern, is that a *new* worker that uses a direct state connection? grrbml grrmbl
<dimitern> fwereade, also the workers in worker/provisioner are fairly twisted, but I know most of their deps from my dealings with lxc containers
<dimitern> fwereade, it's supposed to be run on the state servers only
<fwereade> dimitern, I have high hopes that one day we will have a simple provisioner that is no more or less than a watcher/broker adapter
<fwereade> dimitern, it's still something reaching into the database directly
<dimitern> fwereade, and it needs cloud creds as well
<fwereade> dimitern, and?
<dimitern> fwereade, the addresser
<fwereade> dimitern, let's not violate layers unless we have to
<dimitern> fwereade, +1
<fwereade> dimitern, nothing should touch state apart from the apiserver itself
<dimitern> fwereade, but frankly it's not worse than the resumer and cleaner
<fwereade> dimitern, they should have been done in the initial pass as well
<fwereade> dimitern, not sure why they weren't
<fwereade> dimitern, api-everywhere was quite the goal for us at one point
<dimitern> fwereade, because we said "meh - it's only on the apiserver, so it's fine"
<fwereade> dimitern, the apis might be simple, but that's no reason to drop the benefits
<fwereade> dimitern, there's no *general* statement that particular workers must be bound to particular machines
<dimitern> fwereade, agreed, however it's easy to introduce apis for such cases at any point
<jam> fwereade: TerrifyinglyExtremeSuiciderName  :)
<fwereade> dimitern, every time we make a choice like that we add to the friction and make it harder to move the workers from one place to another easily
<fwereade> dimitern, please add a card to move the new one behind an api (and if you can bear to add 2-method apis for the others, that would be awesome...)
<dimitern> fwereade, will make a note of it, ok
<fwereade> dimitern, cheers
<ericsnow> wwitzel3: pong
<mup> Bug #1447174 was opened: state crash with juju terminate-machine --force X <juju-core:New> <https://launchpad.net/bugs/1447174>
<sinzui> dimitern, can you read my addition (a comment) to the proposed 1.23.1 release notes about the feature flag? https://docs.google.com/document/d/1JApj2hsEwKKmAqDmIayGrZ1fkKNPQWvyDmv_bKOcQek/edit
<dimitern> sinzui, sure, will look in a bit
<sinzui> dimitern, the release is actually live for some people. Can you look now?
<dimitern> sinzui, ok, looking now
<dimitern> sinzui, looks good
<sinzui> thank you dimitern
<ericsnow> cherylj: I left a short review on your "system" command patch
<ericsnow> cherylj: basically, "system" is pretty ambiguous, but simply making the doc for the command more explicit will be sufficient to address that IMO
<cherylj> ericsnow: thanks for the feedback.  There have been discussions around "server" vs. "system", and things landed on "system".  I'll see if I can make it clearer.
<ericsnow> cherylj: I think "system" is fine as long as we are clear about its context (in documentation and help strings)
<wwitzel3> mgz: can you stop #2915 build, it is missing a commit
<mgz> wwitzel3: done
<wwitzel3> mgz: ty
<mup> Bug #1447174 changed: state crash with juju terminate-machine --force X <terminate-machine> <vivid> <juju-core:New> <https://launchpad.net/bugs/1447174>
<perrito666> wwitzel3: hey, do you currently have a vmaas for testing?
<perrito666> http://pastebin.ubuntu.com/10866692/ <-- I am trying to bootstrap my vmaas and getting this, Ill try master to see if it is my branch, but looks like not
<mup> Bug #1447234 was opened: juju prints "error" when deploying yet no units are in error <charms> <improvement> <set> <juju-core:Triaged> <https://launchpad.net/bugs/1447234>
<mup> Bug #1447235 was opened: add stdin support to "juju set" <juju-core:New> <https://launchpad.net/bugs/1447235>
<wwitzel3> perrito666: yeah, I have one
<perrito666> wwitzel3: could you try to deploy master and tell me if it works or you get the error I pasted above?
<wwitzel3> perrito666: yep, I can give that a shot
<wwitzel3> perrito666: working just fine so far
<wwitzel3> perrito666: what version of maas?
<perrito666> mm, good question, I had to kill it to get resources in the machine, but ill look as soon as I finish running tests
<perrito666> wwitzel3: if it reaches the end of bootstrap successfully then it's ok, and it is most likely something in my vmaas
<perrito666> tx a lot btw
<wwitzel3> perrito666: same failure
<perrito666> ok, so I am not crazy, if I had to guess I would say that the changes merged in the sprint to support centos broke that
 * perrito666 looks at gsamfira 
<wwitzel3> perrito666: I'm trying 1432affde02cd81b354871d679804beca9bbe21a right now
<wwitzel3> perrito666: yeah, so it is after 1432affde02cd81b354871d679804beca9bbe21a if you want to do a git bisect, probably an easy find
<perrito666> wwitzel3: tx a lot, I think Ill try after I finish this
<gsamfira> perrito666: pinging bogdanteleaga
<perrito666> gsamfira: I am not saying it was that, it is just an educated guess
<gsamfira> perrito666: I think I know where the problem is. Should be an easy fix.
<wallyworld_> thumper: got a minute?
#juju-dev 2015-04-23
<mup> Bug #1447390 was opened: mongo tools missing on centos <juju-core:New> <https://launchpad.net/bugs/1447390>
<mup> Bug #1447392 was opened: ssh args list too long when bootstrapping <juju-core:New for bteleaga> <https://launchpad.net/bugs/1447392>
<wallyworld_> thumper: you around?
<perrito666> wallyworld: http://reviews.vapour.ws/r/1472/
<wallyworld> ty, will look real soon
<perrito666> wallyworld: pushing a cleaner branch now that one got some odd merges from master
 * perrito666 wonders if rb will cope with it
<perrito666> it did and it fits one page
<perrito666> k its midnight I am out cheers
<jam> wwitzel3: ping
<mup> Bug #1447446 was opened: 1.23.1: bootstrap failure, vivid, local provider <landscape> <juju-core:New> <https://launchpad.net/bugs/1447446>
<wallyworld> thumper: you around?
<wallyworld> jam: you have a few minutes?
<jam> wallyworld: sure
<jam> what's up ?
<wallyworld> jam: need someone to tell me i'm an idiot, can you join say the TL hangout
<jam> core-leads-call?
<wallyworld> yeah
<mgz> wallyworld: still up?
<mgz> jam: poké
<mgz> jam: <http://reviews.vapour.ws/r/1470/> I'm not being unreasonable, right?
<wallyworld> mgz: sorry, was at soccer
<mgz> wallyworld: no problem, sending email with update
<perrito666> wallyworld: we really need a video of you playing socker
<perrito666> soccer*
<wallyworld> perrito666: no we don't :-)
<perrito666> for... negotiation purposes
<mgz> okay, I'm having lunch now, see email/review for init woes
<wwitzel3> jam: pong
<tasdomas> hi, could somebody take a look at http://reviews.vapour.ws/r/1116/ ?
<mup> Bug #1447595 was opened: TestLeadership fails on windows test slave <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1447595>
<mgz> bogdanteleaga: when proposing crazy mps some more explanation on *why* changing the whitespace fixes the issue would be helpful :)
<mgz> and what the issue is, for that matter
<bogdanteleaga> mgz: I accidentally discovered bash doesn't like the script with the old whitespace formatting
<bogdanteleaga> mgz: I'll edit the message, sure
<katco> natefinch: ericsnow: stand up
<mgz> bogdanteleaga: see, that does sound fun, I want the whole story in the mp :)
<mgz> (also, I think there's an associated lp bug?)
<bogdanteleaga> mgz: :)
<bogdanteleaga> mgz: there's no bug for this one
<bogdanteleaga> mgz: but it would be nice to get it in along with the bugfix
<natefinch> ericsnow: are you looking at https://bugs.launchpad.net/juju-core/+bug/1447446 ?  Or are you working on something else?
<mup> Bug #1447446: 1.23.1: bootstrap failure, vivid, local provider <landscape> <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1447446>
<ericsnow> natefinch: yep (forgot to assign it to me)
<natefinch> ericsnow: cool
<katco> ericsnow: please add that to the kanban as well
<ericsnow> katco: will do
<katco> ericsnow: ty sir
<katco> ericsnow: pro tip: you can just paste the bug ID in the "card id" field of the card, and it will auto link everything
<ericsnow> katco: yeah, just noticed :)
<lazyPower> is this something awesome you have setup or is this auto-behavior?
<katco> lazyPower: it's board specific
<katco> lazyPower: the board owner must set it up
<lazyPower> ah, we have an intermediary service we're routing through to get the functionality
<lazyPower> i can see where this works for your team as juju-core is one project
<lazyPower> we're tracking across ~ 8 billion (give or take a few billion)
<ericsnow> lazyPower: we scored a good lead with katco :)
<lazyPower> ericsnow: she's prettttyyy cool, i'll admit.
 * lazyPower hands katco a gold star
<ericsnow> natefinch was good too :)
 * katco spends the next five minutes trying to press the star onto her nose
<ericsnow> katco: doesn't it go on your belly (or was that 2 stars)
 * katco is a bit lost there
<ericsnow> katco: Dr. Seuss (star-bellied Sneetches)
<katco> my daughter hasn't reached the age where i'm back up on Dr. S yet :p
<natefinch> ericsnow: katco is 1000x as organized as I was.  I think
<natefinch> we'll be a lot more successful with her leading :)
<ericsnow> natefinch: :)
 * perrito666 imagines moonstone conquering countries and taking power
<katco> emacs will be mandatory.
<katco> which will lead to us being overthrown eventually
 * perrito666 starts a preemptive revolution
<katco> but it will be beautiful and terrifying while it lasts
<katco> ;p
<perrito666> <esc>:overthrow
<wwitzel3> perrito666: are you working on that issue with maas?
<perrito666> wwitzel3: bogdanteleaga is
<bogdanteleaga> there's a fix waiting for review, feel free to try it out in the meantime
<sinzui> natefinch, I will now bug you less.
<natefinch> huzzah!
 * natefinch notes that katco will probably now bug him a lot more, though ;)
<katco> haha
<katco> ericsnow: natefinch: it's certainly not hurting anything, but for now, don't feel the need to put points to bugs. if it makes you happy, please continue :)
<natefinch> katco: yeah, I meant to ask about that.
<katco> natefinch: i think we'll only need to point planned feature work
<natefinch> katco: fair enough
<natefinch> katco: maybe we should actively not put points on bugs, so we don't screw with our velocity?
<katco> natefinch: i'm really flexible... i think the reporting can sift that out
<wwitzel3> alexisb: ping
<alexisb> wwitzel3, omw
<ericsnow> what's our restriction on Go version?
<natefinch> 1.2.2
<ericsnow> looks like the patch for the vmware provider relies on Go 1.3 features
<natefinch> REJECTION
<ericsnow> (or rather, it relies on struct fields that don't exist in 1.2)
<katco> wwitzel3: ericsnow: core meeting
<natefinch> gsamfira: you around?
<natefinch> ahh crap, I forgot the uniter tests are an annoyingly monolithic table driven test
<perrito666> :D yes they are
<natefinch> fwereade: what's wrong with this line? runCommands{`if [ $(is-leader) != "False" ]; then exit -1; fi`},
<natefinch> fwereade: to be more specific, what happens when that test runs on windows? :P
<perrito666> the fact that it is an ugly bash one-liner
 * fwereade did something bashy again? sorry :(
<natefinch> fwereade: or you committed someone else's work so you get blamed for it ;)
<fwereade> natefinch, nah, that was me
<natefinch> fwereade: I'm looking to fix https://bugs.launchpad.net/juju-core/+bug/1446871
<mup> Bug #1446871: Unit hooks fail on windows if PATH is uppercase <ci> <hooks> <windows> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1446871>
<natefinch> fwereade: but when I went to run the tests on windows, saw some other failures too
<fwereade> natefinch, I guess it should have different constants in util_*_test.go
<natefinch> fwereade: it would be a lot easier to understand how to fix it if I didn't need to parse a 1600 line DSL first :/    Last time I had to fix a uniter test it took me like 3 days for a relatively simple fix :/
<fwereade> natefinch, which is why I've pulled so much of it out into separate packages with things resembling actual unit tests
<natefinch> fwereade: didn't realize you were doing that.  I hugely appreciate that.
<katco> natefinch: the most succinct way to state why table tests should be used sparingly is: you are defining your own test runner which has a very small sub-set of the features of its larger test runner.
<katco> why that is bad should be pretty obvious
<fwereade> natefinch, that bug looks like it should be reasonably possible to repro it in unit tests somewhere in uniter/runner
<natefinch> katco: ironically, we're already one more level deep using gocheck
<katco> yep
<katco> but at least gocheck is a proper test runner
<perrito666> katco: anyone against your argument should be forced to change the current accepted values for statuses :p it is much like a domino castle
<fwereade> natefinch, ...which remains less well tested than I would like, but *should* have some explicit tests for the env-var population bits
<katco> perrito666: i had to touch that code once and wowwwww
<perrito666> katco: it took me five minutes to change the values and validators and setters.. and 4 days to fix that test
<katco> perrito666: yeah exactly my experience
<natefinch> fwereade: so I'm just trying to get the tests to actually run on windows and then I'll tackle the env casing problem... is there not some more simple way to verify that the hook tool isn't there, rather than sending a command-line for the uniter to run?  Yes, we could make one that runs on windows and one on linux... but that still seems like the most complicated way to test if a file exists on disk.
<fwereade> natefinch, well, those tests are effectively the functional tests for the uniter
<fwereade> natefinch, you should be able to write a test case in uniter/runner that exposes the problem, though
<fwereade> natefinch, env.go
<fwereade> natefinch, ha:
<fwereade>     if runtime.GOOS == "windows" {
<fwereade>         c.Skip("bug 1403084: There are some problems regarding os.Environ() on windows")
<fwereade>     }
<mup> Bug #1403084: Tests that need to be fixed on windows <ci> <tech-debt> <testing> <windows> <juju-core:Fix Released> <https://launchpad.net/bugs/1403084>
<fwereade> natefinch, so I'd start there, I think
<natefinch> fwereade: ok
<fwereade> natefinch, I know there's a bit of jiggery-pokery regarding env vars for windows -- we were doing <something> twice in some circumstances, and I distorted the code a bit to make sure we only did it once
<fwereade> natefinch, so there is another path you need to check somewhere to make sure it works in some weird context (juju-run, or debug-hooks, or something)
<bogdanteleaga> iirc that test was skipped because os.Environ() introduced some flaky variables, but it should not affect anything else
<fwereade> natefinch, at a guess the problem is env.go:54?
<fwereade>         "Path=" + paths.GetToolsDir() + ";" + os.Getenv("Path"),
<fwereade> natefinch, a few extra tests similar to TestEnvWindows should pick it up, I think?
<fwereade> natefinch, although you should figure out what happens when you've got path and PATH and Path all defined -- presumably the `Path` we write is lower priority than whichever one's already set in problematic situations?
<fwereade> natefinch, regardless, you shouldn't need to touch the giant uniter functional tests just for that fix, so long as you can demonstrate it in the unit tests
<bogdanteleaga> actually, since I don't get the juju-log.exe not found on my machine, the env_test.go might reveal the issues on the problematic machines
<natefinch> fwereade:  The uniter tests don't pass on windows right now.  That's part of that bug.
<natefinch> fwereade: unrelated to the environment variables
<fwereade> natefinch, ok, and they should; but the bug has "Any windows machine with casing of the path variable other than 'Path' will fail to find hook tools."
<mgz_> bogdanteleaga: thanks!
<natefinch> fwereade: yes, I know.  I'm just trying to get the tests to a point where they only fail for that reason, then I can fix that reason
<bogdanteleaga> natefinch: I think it's only env issues at this point
<bogdanteleaga> natefinch: it complains about not finding juju-log.exe at some point
<bogdanteleaga> natefinch: after executing it moments before
<mgz> bogdanteleaga: assign bug 1447595 to yourself :)
<mup> Bug #1447595: TestLeadership fails on windows test slave <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1447595>
<fwereade> natefinch, I submit that you will have an easier time of it if you repro and address the path issue in isolation
<fwereade> natefinch, lest you get caught in a morass of other unexpected problems with the uniter test on windows
<bogdanteleaga> mgz: done
<bogdanteleaga> http://reviews.vapour.ws/r/1477/ anybody in for a fast review?
<mgz> bogdanteleaga: lgtmed
<fwereade> natefinch, I don't know what they might be but I'm not so optimistic as to assume the path issue is the only obstacle
<fwereade> bogdanteleaga, I don't see where it's successfully executing juju-log.exe?
<fwereade> brb
<bogdanteleaga> fwereade: on install it executes, on config-changed it says not found
<bogdanteleaga> fwereade: http://data.vapour.ws/juju-ci/products/version-2549/run-unit-tests-win2012-amd64/build-276/consoleText
<bogdanteleaga> fwereade: second test
<mgz> be a bit wary of that log, it would be one with both PATH and Path set in the environment block, which could well do weird things
<fwereade> bogdanteleaga, ...I see
<natefinch> huh, interesting, for some reason go test's  -test.foo style flags don't work in powershell
<fwereade> bogdanteleaga, well, that's exciting
<mgz> basically we just need a helper in utils that handles this correctly that both utils/exec and the uniter can use
<bogdanteleaga> mgz: afaik, I can't merge it so you can do it if you want to
<fwereade> mgz, yeah, PrependToPath or something
<bogdanteleaga> they pass for me on both win8 and win server so I can't really reproduce this one
<fwereade> mgz, I am somewhat convinced that we could come up with a happier arrangement of utils/exec so we didn't have that func cloned in both places
<mgz> fwereade: indeed
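The shared helper mgz and fwereade are describing might look like this. A sketch only: the function name comes from fwereade's "PrependToPath or something", and the signature is an assumption, not juju's actual utils/exec API. The key point is that Windows treats variable names case-insensitively but preserves their spelling, so the lookup must ignore case while keeping the original key.

```go
package main

import (
	"fmt"
	"strings"
)

// prependToPath inserts dir at the front of the PATH-like variable in an
// environ slice of "KEY=value" pairs (as returned by os.Environ). The
// match is case-insensitive so "Path", "PATH", and "path" are all found,
// and the existing key's casing is preserved. Illustrative code, not
// juju's implementation.
func prependToPath(environ []string, dir, sep string) []string {
	for i, kv := range environ {
		eq := strings.Index(kv, "=")
		if eq <= 0 {
			continue // skip malformed or "=ExitCode="-style entries
		}
		if strings.EqualFold(kv[:eq], "PATH") {
			environ[i] = kv[:eq+1] + dir + sep + kv[eq+1:]
			return environ
		}
	}
	return append(environ, "PATH="+dir)
}

func main() {
	env := []string{"=ExitCode=0", "Path=C:\\Windows"}
	fmt.Println(prependToPath(env, "C:\\juju\\tools", ";")[1])
}
```

This also sidesteps the `"Path=" + ... + os.Getenv("Path")` construction quoted earlier, which silently drops the value when the machine's variable is spelled `PATH`.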
<fwereade> bogdanteleaga, can you try changing the env we set up in the tests? just tweak the value in env_test.go:125 and see what you get
<bogdanteleaga> but the juju-log.exe thing has been there for a while, did you have both set up for a long time?
<mgz> bogdanteleaga: $$merge$$ requested - I'll look at your other branches again later if I get a mo
<bogdanteleaga> cool, they're just as small :)
<mgz> the other complication I didn't call out in the bug - the go behaviour of os.Getenv on windows changes
<mgz> it's exact case in 1.2 but uses windows api in I think 1.3 and that will retrieve Path if you ask for PATH and vice versa
<bogdanteleaga> now that's fun
<mgz> I had the intention of getting to fix this at some point this week... but many things have intervened. I'll happily throw peanuts at nate though :)
<natefinch> I like peanuts :)
<bogdanteleaga> fwereade: http://paste.ubuntu.com/10873918/ not sure what you're looking for
<fwereade> bogdanteleaga, I was thinking more about s/Path/PATH/ than s/bar/obar/?
<fwereade> and, well, it'll fail for sure, but it might fail in an interesting and instructive way... for someone running the right version of go... possibly on a suitably idiosyncratic windows box
<bogdanteleaga> fwereade: sounds like quite the setup
<fwereade> bogdanteleaga, yeah, I guess it takes some proper investigation
<bogdanteleaga> mgz: looks like it crashed, not sure why though
<mgz> hm, godeps failed, 'unrecognized import path "gopkg.in/natefinch/lumberjack.v2"'
<mgz> did someone delete nate?
<natefinch> mgz: better not have
<natefinch> mgz: works for me
<natefinch> mgz: godeps and go get
<mgz> natefinch: likely something intermittent and networky
<natefinch> mgz: could be a bad HTTP response from gopkg.in confused it
<mgz> I'll remerge and see
<bogdanteleaga> fwereade: changing it to PATH doesn't make it fail :)
<bogdanteleaga> mgz: that's what I get for pushing from windows with no prepush hook. can you try again?
<fwereade> anyone for a *really* trivial review? http://reviews.vapour.ws/r/1478/diff/
<natefinch> fwereade: ship it!
<fwereade> natefinch, cheers
<mgz> bogdanteleaga: sad gofmt :) regoing
<natefinch> fwereade, mgz, bogdanteleaga:  for the record, doing sets and gets in a simple program, I can't get setenv and getenv to act incorrectly on 1.2.2 .... get always seems case insensitive
<natefinch> (on windows)
<mgz> okay, maybe that bug report was wrong, good good
<bogdanteleaga> for me os.Environ() returns =ExitCode= in the environment on win2012r2
<natefinch> (sorry, get and set are both case insensitive)
<mgz> so, it's just the use of os.Environ() that's particularly suspect
<natefinch> mgz: probably just our use of os.Environ.... we probably expect it to be case sensitive, and it's not
<mgz> well, it's case-preserving
<mgz> you just can't do straight string matches vs things you've pulled out previously with os.Getenviron
<natefinch> yes
<mgz> -iron
<natefinch> gotta run, talk later.
<mgz> later nater
<fwereade> I have an even simpler review: http://reviews.vapour.ws/r/1479/
<fwereade> the description is noticeably larger than the change
<fwereade> menn0, good morning, can I hit you up for a review or 2?
<fwereade> menn0, http://reviews.vapour.ws/r/1479/ is genuinely trivial
<menn0> fwereade: sure
<fwereade> menn0, and http://reviews.vapour.ws/r/1224/ is less so, but I think and hope it's well-commented and documented, and has been used in anger in enough pending CLs, that I think it should be good
<fwereade> tasdomas, ...unless you're around and want to celebrate your graduation by upgrading your I-think-it-makes-sense to an LGTM? ;p
<menn0> fwereade: I'll take a look
<menn0> fwereade: sorry, just dropped out (keyboard problems, stuck capslock)
<menn0> fwereade: looking now
<fwereade> menn0, no worries :)
<alexisb> so menn0 fair warning, critical interrupt headed your way
<fwereade> menn0, then definitely don't worry about 1224
<menn0> alexisb: ok... a critical bug fix?
<alexisb> menn0, yes, maybe you and fwereade can have a critical bug party :)
<alexisb> wallyworld, has the details
<menn0> fwereade: ship it for 1479
<menn0> fwereade: it wasn't immediately obvious why it fixes the problem but after a bit of digging I think I get it
<menn0> fwereade: at any rate, it's a more straightforward way of getting the job done
<fwereade> menn0, indeed
<fwereade> menn0, glad it passes a sniff test though, I lost much of the detailed context by not extracting it immediately
<menn0> fwereade: just looking at 1224 now. i'm glad that you've been looking at this. the way workers are started and managed has become a mess.
<fwereade> menn0, yeah
<fwereade> menn0, I am currently rather quailing at the prospect of fixing the machine agent
<fwereade> menn0, the unit agent was hard enough
<fwereade> menn0, but I've had some practice now :)
<wallyworld> menn0: did you have a moment to chat? maybe in the onyx standup hangout?
<menn0> wallyworld: sure. give me a minute.
<wallyworld> np
<mgz> ...do I send my daily dose of good news to the mailing list?
<mgz> "hey everyone, have these horrible breaking bugs to play with..."
<menn0> fwereade: are you still around? one of the bugs i'm looking at now might be related to recent uniter changes
<menn0> jw4: ping?
<jw4> menn0: ola
<menn0> jw4: you've been looking at bug 1438489
<mup> Bug #1438489: juju stop responding after juju-upgrade <upgrade-juju> <juju-core:In Progress by johnweldon4> <juju-core 1.23:Triaged by johnweldon4> <https://launchpad.net/bugs/1438489>
<menn0> jw4: where are things at right now?
<jw4> menn0: yes, I'm afraid I'm responsible for that one
<jw4> menn0: I think the cleanest might be to back out my hook changes from a few weeks ago
<mgz> menn0: see also https://chinstrap.canonical.com/~gz/ for that, some logs and things that may be of interest
<menn0> jw4: what's the PR with your hook changes?
<jw4> menn0: let me find it
<menn0> mgz: thanks
<jw4> menn0:  https://github.com/juju/juju/pull/1897
<jw4> menn0: I'd prefer to fix it though rather than backing it out
<jw4> menn0: I haven't figured out why the hooks quit firing after the upgrade - I just assumed it was because of those changes of mine
<menn0> jw4: shall I take a look?
<menn0> agreed that fixing is preferable to backing out if possible
<jw4> menn0: feel free - the repo is easy
<jw4> s/repo/repro/
<jw4> all I did to reproduce was to install 1.22, install charm, upgrade to 1.23, observe that hooks quit firing
<menn0> jw4: cool
<jw4> menn0: mgz just added a simple repro step to the bug
<menn0> jw4: what about the recent attempts to fix this? should they stay or be pulled?
<menn0> jw4: PRs 2067 and 2058
<mgz> menn0: if I understand correctly, the earlier landing was just trying to make the error not as fatal, it didn't change the upgrade step itself
<jw4> menn0: the other fixes are actually valid...
<menn0> jw4: ok. just confirming.
<jw4> mgz: menn0 yeah.. actually this upgrade issue shouldn't be related to the upgrade steps
<jw4> more likely that it's related to the uniter logic itself that changed, if indeed it was caused by my original PR
<menn0> jw4: what about the errors related to not being able to read the uniter state?
<jw4> (the upgrade steps are effectively a no-op, because of a misunderstanding in how upgrades were handled)
<mgz> this is one of those kinda-shoulda just backed out in response to the bug cases, but the fact we weren't yelling because our upgrade job passed let it slip
<jw4> menn0: that was because of tightened validation logic.  Since the upgrade steps actually didn't work it was reverted to what was originally there
<jw4> mgz: yeah... however, I haven't established yet that the hooks not firing is a result of the original PR, but it sure looks suspicious
<menn0> jw4, mgz: ok well let me run with it for a bit and see what I find
<jw4> menn0, mgz originally this bug was about the uniter getting into a validation error loop, but now we're past that because of the other fixes, and now we're encountering this issue with hooks not firing
<menn0> jw4: understood, thanks
 * jw4 needs to drop off to take son to drum lessons... bbl
<menn0> jw4: np, talk to you later
<menn0> mgz: do we still have a problem with upgrades to 1.23 as well?
<menn0> mgz: or is it just 1.24 at this stage?
<mgz> menn0: the critical part is the upgrade path from 1.22 to 1.23, which exhibits this bug
<mgz> I've not separately tested 1.23 to 1.24 but the next CI run of a trunk branch will do that for us.
<menn0> mgz: ok, i'll focus on the upgrade to 1.23 to start
<mup> Bug #1447841 was opened: eu-central-1 AWS region V4 signing required and not supported <juju-core:New> <https://launchpad.net/bugs/1447841>
<mup> Bug #1447841 changed: eu-central-1 AWS region V4 signing required and not supported <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1447841>
<menn0> mgz: I think I see the problem. just trying to confirm now.
<mgz> menn0: ace
<menn0> mgz: but wallyworld needs something reviewed first
<mgz> circular favours! I have a branch I'd like wallyworld to review (it's not urgent though :)
<wallyworld> i'll get to it soon
<mup> Bug #1447841 was opened: eu-central-1 AWS region V4 signing required and not supported <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1447841>
<wallyworld> mgz: why is Shutoff considered to be alive?
<mgz> wallyworld: I presume because it's fine if you then manually nova start it, want me to dig up the change it was added in?
<mgz> wallyworld: I also wanted to allow BUILD(anything) but was scared we actually depend on the machine to be in a somewhat usable state elsewhere
<wallyworld> mgz: i ask because the behaviour is being changed from active | build and i'm not sure we want to do that
<mgz> wallyworld: bug 1382709
<mup> Bug #1382709: Openstack provider, Instance-state doesn't change on instance shutdown <cts> <cts-cloud-review> <status> <ubuntu-openstack> <juju-core:Fix Released by dimitern> <juju-core 1.21:Fix Released by wallyworld> <https://launchpad.net/bugs/1382709>
<mgz> wallyworld: ah, in AllInstances? that is actually dead code, I probably should have left alone
<wallyworld> yeah, in AllInstances
<wallyworld> you saying we don't call that anymore?
<mgz> no usage of a provider in the juju codebase ever calls AllInstances
<mgz> it's just part of the provider api :)
<wallyworld> mgz: nah, it's called in bootstrap
<wallyworld> and AllInstances should be a superset of Instances
<wallyworld> so i'm worried about changing that
<wallyworld> oh, right, i read the bug. doing stuff out of band we seem to be
<mgz> hm, so it is, somewhat oddly
<mgz> wallyworld: anyway, change just makes it more like the Instances() call so finds more stuff
<wallyworld> yeah, fair enough
<mgz> wallyworld: I'm not sure on the logging I added
<mgz> I want some amount more, but not completely sure on the balance between useful debugging and spam
<wallyworld> make  it trace perhaps
<mgz> wallyworld: that AllInstances call in bootstrap is new, and puzzling...
<mgz> well, new, 2014... new since I looked at that code
<wallyworld> i can't recall the specifics
<wallyworld> mgz: i'm not comfortable about landing this without tests - the goose test service should be updated to match
<wallyworld> awesome
<wallyworld> provider/cloudsigma/config.go:97: no formatting directive in Errorf call
<wallyworld> why was stuff landed when there was a govet error
<wallyworld> i thought we rejected such things now
<wallyworld> mgz: ?
<mgz> wallyworld: hm, is that still not fataled in the check script? I thought that had been flipped
<wallyworld> mgz: me too, but i just pulled master
<wallyworld> and now get that error
<mgz> ./scripts/verify.bash; echo $?
<ericsnow> could someone spare me a review on a small patch: https://github.com/juju/govmomi/pull/1
<jw4> menn0: you figured it out?
<menn0> jw4: still reviewing wallyworld's change. almost done.
<jw4> ah. kk
<menn0> jw4: but what i think is happening is that the upgrade step runs while the machine agent is upgrading
<menn0> jw4: but at that point the unit agent is still running the previous juju version
<menn0> jw4: so when the unit agent shuts down it overwrites the changes made by the upgrade step
<mup> Bug #1447846 was opened: Hooks don't fire after upgrade 1.23.0 <hooks> <regression> <upgrade-juju> <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1447846>
<menn0> jw4: sound plausible?
<jw4> menn0: oooh interesting.
<wallyworld> ericsnow: i'm not sure the build directives are correct. don't we just want go1.2 and !go1.2
<jw4> menn0: I don't think the hook.Info upgrade step *ever* runs, because the upgrade step is supposed to find all the units for a given machine and run the upgrade on each one manually
<wallyworld> otherwise how would it work on go1.4 etc
<jw4> i.e. I don't think the upgrade process automatically runs upgrade steps on units, only on machines
<jw4> maybe I'm wrong though...
<menn0> jw4: yeah you're right... so there's 2 problems
<menn0> jw4: the upgrade mechanics only run on the machine agents so the check for a unit tag in the upgrade step is never going to be true
<jw4> menn0: I backburnered my upgrade steps problem by reverting the validation logic... but how does your proposed scenario cause the hooks to stop firing after upgrade?
<jw4> (I think it's plausible, but I'm missing the connecting steps)
<menn0> jw4: i haven't quite figured it out either
<jw4> menn0: kk - well I'm glad to have your eyes on it... I'm embarrassed by this whole issue
<jw4> :)
<menn0> jw4: but i'm wondering if the uniter state file hasn't been upgraded then we can end up with the "unexpected hook info with Kind Continue" error
<jw4> menn0: yes - that error will always be logged - it's almost certainly a red herring to the real problem
<jw4> when I 'fixed' the validation I just changed it so that instead of crashing the uniter, it just logs the problem once and then continues
<ericsnow> wallyworld: go1.2 means Go 1.2 and later
<jw4> so the error will always show once when the uniter starts up  (until I/we fix the upgrade steps), but it shouldn't prevent the uniter from continuing normally after that
<wallyworld> ericsnow: ah, i see, thanks
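ericsnow's point is that Go's release build tags are cumulative: a file tagged `go1.3` builds on 1.3 *and every later release*, so you only ever need a pair of files gated on `go1.X` and `!go1.X`, not one per version. A minimal sketch of the "new" side of such a pair (the message string and file layout are illustrative; `go1.3` is used since that's the version the vmware patch needed):

```go
//go:build go1.3
// +build go1.3

// This file compiles only with Go 1.3 or later; release tags are
// cumulative, so "go1.3" stays satisfied on 1.4, 1.5, and beyond.
// A hypothetical sibling file tagged "!go1.3" would carry the Go 1.2
// fallback (e.g. constructing keep-alive behaviour by hand, as
// mentioned above for http.Transport).
package main

import "fmt"

// buildMessage reports which side of the version gate compiled.
func buildMessage() string {
	return "built with go1.3 or later"
}

func main() {
	fmt.Println(buildMessage())
}
```

The modern `//go:build` line and the era-appropriate `// +build` line say the same thing; when both are present they must agree.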
<menn0> wallyworld: review done. for some reason RB has duplicated one of my comments for a particular section of code and I can't figure out how to fix it. please ignore the dup.
<wallyworld> menn0: tyvm
#juju-dev 2015-04-24
<menn0> jw4: i will keep digging now
<wallyworld> ericsnow: why the dial changes?
<jw4> menn0: I'll be in and out for the next few hours... making supper for kids, and then school open house - I'll try to get on longer later
<menn0> jw4: no worries
<ericsnow> wallyworld: in 1.2 http.Transport doesn't have the KeepAlive field, so I had to manually create the same effect
<wallyworld> ok
<ericsnow> wallyworld: thanks for the review :)
<wallyworld> np
<mup> Bug #1447853 was opened: Local charms are not added to storage on upgrade to 1.22.x <charms> <regression> <storage> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1447853>
<wallyworld> menn0: fixes pushed when you get a chance
<menn0> wallyworld: kk
<menn0> wallyworld: ship it
<wallyworld> \o/
<wallyworld> ty
<mup> Bug #1447853 changed: Local charms are not added to storage on upgrade to 1.22.x <charms> <regression> <storage> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1447853>
<mup> Bug #1447853 was opened: Local charms are not added to storage on upgrade to 1.22.x <charms> <regression> <storage> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1447853>
<sinzui> wallyworld, I am exhausted. I haven't found much. I split bug 1447846 from the other upgrade issue and captured some unit log
<mup> Bug #1447846: Hooks don't fire after upgrade 1.23.0 <hooks> <regression> <upgrade-juju> <juju-core:Triaged by menno.smits> <juju-core 1.23:In Progress by menno.smits> <https://launchpad.net/bugs/1447846>
<wallyworld> sinzui: ty, menn0 is working that bug at the moment
<sinzui> wallyworld, mgz and I also discovered bug 1447853, but I now have a cheap workaround
<mup> Bug #1447853: Local charms are not added to storage on upgrade to 1.22.x <charms> <regression> <storage> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1447853>
<wallyworld> ah bollocks, saw that float past
<wallyworld> will have to fix for 1.24
<sinzui> wallyworld, the work around is to upgrade-charm --force. it works a minute later :)
<wallyworld> at least there is that
<ericsnow> anyone have a minute to review a patch for a very mechanical change: https://github.com/juju/govmomi/pull/2
<menn0> fwereade: ping?
<menn0> jam: ping?
<menn0> wallyworld: i have a learned a lot more about the hook not firing bug
<menn0> wallyworld: not sure I know how to fix it though
<wallyworld> oh?
<wallyworld> did you want to talk?
<menn0> wallyworld: having jam or fwereade around would be handy. it's leader related.
<wallyworld> oh joy
<wallyworld> jam will be online soonish
<menn0> wallyworld: quick chat would be good
<wallyworld> ok, same hangout
<menn0> wallyworld: i'm there now
<wallyworld> menn0: if you get a chance, but doesn't matter if not http://reviews.vapour.ws/r/1481/
<menn0> wallyworld: got it
<jw4> menn0: does that mean it's not related to my changes?
<menn0> jw4: at this stage it's looking like the hooks not firing issue is related to the leadership work
<menn0> jw4: i'm not any wiser about that mode error in the logs though
<jw4> menn0: well I know about the error - that's not critical
<jw4> menn0: cool, I'll try and dig into it too
<jw4> for my edification
<menn0> we just had an earthquake, 6.4 but very deep
<jw4> whoa
<menn0> because of the depth it wasn't that severe
<jw4> menn0: any damage
<menn0> but my chair was rolling around by itself
<menn0> no damage i think but it was a strange rolling feeling and went on for a while
<menn0> i've only experienced short "bangs" up until now
<anastasiamac> menn0: is south island sinking?
<jw4> that's freaky... are earthquakes common in that area? Pacific Rim and all
<menn0> yep, quite common in NZ
<jw4> anastasiamac: he's ignoring the question... I think he's rescuing pets or something
<menn0> anastasiamac: probably :)
<jw4> quite a bit of activity recently: http://www.geonet.org.nz/quakes/felt
<anastasiamac> jw4: :D
<jw4> :)
<anastasiamac> menn0: u should have a post-earthquake party, considering it's friday
<menn0> anastasiamac: they're a little too common to be having parties for each one :)
<anastasiamac> menn0: on a serious note, it's gr8 that it was deep. 6.4 sounds really scary... :))
<menn0> anastasiamac: yes. this one was the same magnitude as the one that did this a few years ago: http://en.wikipedia.org/wiki/2011_Christchurch_earthquake
<menn0> the city is still a long way from recovering from that one
<wallyworld> menn0: the earth moved for you in cape town as well didn't it?
<menn0> wallyworld: ssh not in public :)
<mup> Bug #1447895 was opened: Panic if jujud restarts while action is running <actions> <juju-core:New> <https://launchpad.net/bugs/1447895>
<mup> Bug #1447899 was opened: upgrade fails if no explicit version is specified <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1447899>
<jw4> nice sleuthing menn0
<menn0> jw4: with the hook not firing bug?
<jw4> yeah
<jw4> finding the WatchLeadershipSettings missing document
<menn0> wallyworld, jw4: bingo!
<menn0> wallyworld, jw4: I have a fix for the hook not firing issue
<menn0> wallyworld, jw4: adding an upgrade step that adds the leadership documents in the settings collection fixes the issue
<menn0> wallyworld, jw4: I hacked it in without tests so let me sort that out before proposing
<jw4> menn0: suh-weet!
<menn0> wallyworld, jw4: initial, rough fix for hooks not firing issue
<menn0> https://github.com/mjs/juju/tree/1447846-hooks-dont-fire-after-upgrade-1.23
<menn0> writing the tests now
<jw4> other than missing docstring on https://github.com/juju/juju/compare/1.23...mjs:1447846-hooks-dont-fire-after-upgrade-1.23#diff-d7b2b2c8e8ce6dfc1b7f09c3cf9744d1R1203
<jw4> and forthcoming tests...
<jw4> looks great menn0
<wallyworld> menn0: branch looks good, still want to understand why listener hangs if there's no record. i think that's poor behaviour
<jw4> wallyworld: I bet it's because the watcher isn't returning an initial empty event, like it's supposed to
<jw4> wallyworld, menn0 it was a little tricky getting that initial guaranteed event in the Action Watchers, so I wouldn't be surprised if the Leadership Watchers have a problem there too
<menn0> jw4: I think the problem here is that the code assumes the leadership doc will be there b/c it is for services created under 1.23
<menn0> jw4: but someone forgot to write this upgrade step
<jw4> the Watchers (afair) are supposed to guarantee an initial event, possibly empty, even if there are no records...
<menn0> jw4: and without the doc there when the watcher is created the initial event isn't fired
<menn0> jw4: are you sure about that?
<jw4> menn0: yeah, I think you're right, but I remember fwereade stressing to me the importance of that... (but I could be remembering wrong)
<jw4> (the importance of the initial, possibly empty, guaranteed event)
<menn0> jw4: yeah, I know there's been talk about this but I'm not exactly sure what should happen
<jw4> hopefully fwereade will be on soon and will chime in... I guess it's EOD for you though now (and EOW)
<menn0> jw4: i've got a bit more time left
<menn0> jw4: i'm going to try and get this fix done as much as I can
<menn0> jw4: regardless of what the watchers are supposed to do I'm pretty sure we want this upgrade step
<jw4> menn0: regardless, I think your fix is appropriate in this case
<wallyworld> jw4: exactly, but to me that's not how i thought the watchers worked
<wallyworld> +1
<jw4> wallyworld: yeah, I'm fairly confident that's how I was instructed to implement the actions related watchers, but we should verify
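The convention jw4 is recalling — every watcher delivers one guaranteed initial event, possibly empty, so consumers never hang waiting for the first change — can be sketched in miniature. This is a toy illustration of the idea, not juju's actual state-watcher code; the `watch` function is made up for the example.

```go
package main

import "fmt"

// watch mimics the watcher contract under discussion: it must send one
// initial event (possibly empty) even when no matching documents exist,
// so a caller receiving on the channel never blocks forever waiting for
// the first event. Toy code, not juju's real watcher implementation.
func watch(existing []string) <-chan []string {
	out := make(chan []string, 1)
	// Guaranteed initial event: the current state, which may be empty.
	out <- existing
	return out
}

func main() {
	// Even with no documents, the caller still gets an (empty) first event
	// instead of hanging — which is exactly what the missing leadership
	// doc broke in the bug above.
	first := <-watch(nil)
	fmt.Printf("initial event: %d changes\n", len(first))
}
```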
 * jw4 off to bed
<menn0> wallyworld: ping
<wallyworld> yo
<menn0> wallyworld: i have to EOD very soon
<wallyworld> one sec
<wallyworld> back
<menn0> wallyworld: just need to hand over the fix for this
<menn0> wallyworld: pushing now...
<menn0> wallyworld: give me a sec
<wallyworld> sure, is it ready to land after review?
<wallyworld> or tests needed?
<menn0> wallyworld: the hard tests are done
<menn0> wallyworld: just needs a test for upgrade step idempotency
<wallyworld> ok
<menn0> it's here: https://github.com/mjs/juju/tree/1447846-hooks-dont-fire-after-upgrade-1.23
<wallyworld> menn0: i'll either do it or hand over; it's my son's birthday today and i have to go to dinner soonish. either way will be done. thanks for fixing
<menn0> it's ready to go apart from that one test
<menn0> if no-one gets it done today it'll take me almost no time on Monday
<menn0> wallyworld: and we need to forward port to master
<wallyworld> ok, np. tbh, i don't think we are cutting a release on friday anyway
<wallyworld> but it's now done so the risk is removed
<menn0> wallyworld: ok good
<menn0> wallyworld: well have a good night and a weekend
<menn0> speak to you next week
<wallyworld> fwereade: you online?
<wallyworld> fwereade: hiya
<wallyworld> TheMue: hi there, you ocr? would love a review of this as it needs to land for 1.24 http://reviews.vapour.ws/r/1481/ it removes a feature flag
<wwitzel3> jam: ping
<mattyw> wwitzel3, ping?
<wwitzel3> mattyw: pong
<fwereade> wallyworld_, so do we still serve client facade v0? I'm not seeing where we do
<fwereade> wallyworld_, and that would STM to be a problem...
<wallyworld_> fwereade: ah, right, i wasn't sure about that. i wasn't sure what the exact mechanism was there
<fwereade> wallyworld_, the ideal, I think, is to *never touch* the client API, and carve off changing functionality into more service-specific facades
<fwereade> wallyworld_, otherwise you have to have an almost-perfectly-cloned Client facade
<fwereade> wallyworld_, then when (say) storage evolves, it shouldn't need to hit more than one or two small facades
<fwereade> wallyworld_, rather than dragging the Client mess into the equation every time
<fwereade> wallyworld_, what actually changed on Client?
<wallyworld_> fwereade: yeah sadly i think you are right, i'll have to do that, but will miss the branching of 1.24. bollocks. will have to forward port
<fwereade> wallyworld_, bad luck :(
<wallyworld_> fwereade: new params to 2 apis were added
<wallyworld_> fwereade: if a newer client adds additional params to the call, they will be ignored, but we need a way to tell the user
<wallyworld_> hence the api bump
<fwereade> wallyworld_, yeah, indeed, it really is a different API
<fwereade> wallyworld_, so I *think* that the right thing to do
<fwereade> wallyworld_, is to pull out a Service service facade from client
<wallyworld_> yeah, i suspect so too
<wallyworld_> i'll revisit over the weekend
<wallyworld_> i really want to remove the flag for 1.24
<perrito666> morning
<mgz> perrito666: wala
<redelmann> perrito666, ¿good? morning
<katco> wwitzel3: ericsnow: stand up
<alexisb> all, looking at this: https://bugs.launchpad.net/juju-core/+bug/1447846
<mup> Bug #1447846: Hooks don't fire after upgrade 1.23.0 <hooks> <regression> <upgrade-juju> <juju-core:Triaged by menno.smits> <juju-core 1.23:In Progress by menno.smits> <https://launchpad.net/bugs/1447846>
<alexisb> ^^^ based on menno's last comment is this something someone can pick up and complete today?
<jw4> alexisb: menn0 said it's almost done.. just one test for idempotency left
<katco> alexisb: yeah looks doable for someone to pick up
<alexisb> katco, can you please delegate
<katco> alexisb: sure
<alexisb> thanks
<katco> moonstone are all currently working on bugs. any volunteers?
<katco> cherylj: perrito666: do you two have capacity to land a simple bug?
<perrito666> katco: sorry, I don't feel really reliable right now.
<katco> perrito666: no worries. feel better soon
<katco> cherylj: sorry i lost connection. can you take that bug?
<natefinch> man I really hate that we require go 1.2 for juju, it totally screws up my development environment for everything else... and many projects already require 1.3+
<katco> natefinch: here's my setup
<katco> natefinch: ~/.local/go-1.{2,3,4}
<katco> natefinch: ~/.local/go -> go-1.2 during day
<katco> natefinch: export GOPATH=~/.local/go
<katco> natefinch: there is some churn with having to rebuild utils
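katco's side-by-side layout boils down to flipping one symlink between unpacked toolchains. A minimal sketch of the mechanics, using throwaway paths (`/tmp/godemo` stands in for `~/.local` and is purely illustrative):

```shell
# Mimic katco's setup with dummy directories: one dir per Go release,
# plus a single symlink that selects the active toolchain. In the real
# setup /tmp/godemo would be ~/.local and each dir a real Go install
# whose bin/ is on $PATH via the symlink.
mkdir -p /tmp/godemo/go-1.2 /tmp/godemo/go-1.3

ln -sfn /tmp/godemo/go-1.2 /tmp/godemo/go   # "day job" toolchain
readlink /tmp/godemo/go                     # -> /tmp/godemo/go-1.2

ln -sfn /tmp/godemo/go-1.3 /tmp/godemo/go   # switch for other projects
readlink /tmp/godemo/go                     # -> /tmp/godemo/go-1.3
```

`-n` keeps `ln` from descending into the old symlink target, so the flip replaces the link itself rather than creating a link inside the old directory.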
<ericsnow> hmmm...virtualenv for go...
<katco> ericsnow: or there's that :)
<natefinch> ericsnow: there's a tool for that... forget which of the myriad tools out there it is.  of  course, as katco said, you can always just run them side by side.
<katco> natefinch: wwitzel3: planning time!
<wwitzel3> I'm there?
<wwitzel3> is it not in moonstone? .. (checks calendar)
<natefinch> wwitzel3: not moonstone
<wwitzel3> cmars: http://reviews.vapour.ws/r/1484/
<cmars> wwitzel3, so i guess CI is going to set that feature flag to get that test to pass? what's the difference between not registering the provider vs. blocking bootstrap?
<wwitzel3> cmars: the code on the unit doesn't see the provider as registered since the flag is local
<wwitzel3> cmars: jam pointed this out on the ML and the QA ran into it with CI tests
<aznashwan> hey; could I please get a quick review on http://reviews.vapour.ws/r/1486/
<cmars> wwitzel3, ok, got it. so this moves the featureflag check to the client? in what context does environ.Bootstrap run?
<cmars> (environ's not an area of the codebase I'm familiar with)
<aznashwan> it's a simple module; ~100 lines of code that actually does something
<wwitzel3> cmars: just during juju bootstrap
<natefinch> cmars: quick review?  http://reviews.vapour.ws/r/1487/diff/#
<wwitzel3> cmars: thanks for the review
<perrito666>  /query alexisb
<perrito666> well, another bug in my irc client
<perrito666> I should stop using daily snapshots
<natefinch> sometimes I think Juju is the biggest example of what not to do with interfaces :/
<perrito666> you exaggerate
<natefinch> perhaps slightly
<natefinch> there might be a few worse examples in the world ;)
<natefinch> I like the way everything is a nice generic interface and then in the comments we just say "this is for precise"
<perrito666> natefinch: I think that is an example on how not to use a typed language
<mgz> anyone still around for trivial reviews?
<mup> Bug #1448308 was opened: Skipped TestUniterUpgradeConflicts on ppc64 <skipped-test> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1448308>
<ericsnow> mgz: sorry I missed that format stuff :/
<ericsnow> anyone around that feels comfortable reviewing http://reviews.vapour.ws/r/1490/
<ericsnow> it fixes the "hooks not firing" bug
<ericsnow> #1447846
<mup> Bug #1447846: Hooks don't fire after upgrade 1.23.0 <hooks> <regression> <upgrade-juju> <juju-core:Triaged by menno.smits> <juju-core 1.23:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1447846>
<mgz> ericsnow: I can look
<ericsnow> mgz: oh, cool
<mgz> ericsnow: I also have a trivial ppc64 test skip branch
<ericsnow> mgz: note that I only added the idempotency test, menn0 did the rest
<mgz> /1488
<ericsnow> mgz: I'll take a look
<mgz> ericsnow: yeah, I already looked over menno's branch
<ericsnow> mgz: nice
<ericsnow> mgz: I'm pretty sure I got the idempotency check right
<ericsnow> mgz: in the bug you make it sound like skipping the test might be masking a real bug
<mgz> ericsnow: it might be, but we have the bug, and know we need to do functional testing of the leadership stuff
<ericsnow> mgz: k
<ericsnow> mgz: so 1448308 will be closed only as soon as we resolve the issue (and not when the skip-the-test patch merges), right?
<mgz> ericsnow: how-upgrades-work question
<ericsnow> mgz: sure
<mgz> ericsnow: indeed, I'm keeping the skipped-tests bugs open, because we keep filing bugs and linking them in the codebase, but counting them as fixed when the test is skipped
<mgz> which is wrong...
<ericsnow> mgz: cool, then LGTM
<mgz> I'm on 1.23.0 which did not do this step, my state server notices there's a 1.23.2 with this fix and downloads it... how does it know to run this step?
<mgz> we just always do steps on new versions and the steps must apply?
<mgz> even minor ones, right?
<ericsnow> mgz: I'm not sure, but...the idempotency invariant of upgrade steps ensures it's a noop if you're already okay
<mgz> right
<ericsnow> I expect it simply runs all the steps
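The idempotency invariant ericsnow mentions is what makes "just run all the steps" safe: each step checks before it writes, so a rerun is a no-op. A hypothetical sketch of that shape — `store` and `ensureLeadershipDoc` are invented for illustration; juju's real upgrade steps operate on mongo collections, not a map:

```go
package main

import "fmt"

// store stands in for the settings collection. Purely illustrative.
type store map[string]bool

// ensureLeadershipDoc is a toy idempotent upgrade step: it creates the
// leadership settings document only if it is missing. Running it a
// second time (e.g. because every step runs on every upgrade) changes
// nothing, which is the invariant the upgrade machinery relies on.
func ensureLeadershipDoc(s store, service string) bool {
	key := "leadership#" + service
	if s[key] {
		return false // already present: nothing to do
	}
	s[key] = true
	return true
}

func main() {
	s := store{}
	fmt.Println(ensureLeadershipDoc(s, "wordpress")) // first run creates the doc
	fmt.Println(ensureLeadershipDoc(s, "wordpress")) // rerun is a no-op
}
```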
<mgz> ericsnow: test changes look good
<ericsnow> mgz: k, cool
<ericsnow> mgz: it made sense to factor out that stuff into helpers :)
<mgz> I'd be somewhat interested in a test that ran the step on an as-if-just-bootstrapped system, no services
<mgz> we don't try and do all error paths in these things, but that one does seem relevant
<ericsnow> mgz: so basically the same thing as the existing test but without adding any services first?
<mgz> (it should just return success without running any ops basically)
<mgz> ericsnow: yup
<ericsnow> mgz: I'll add such a test!
<mgz> ericsnow: lgtm
<mgz> hm, can I shipit and leave a comment in the same thing? not sure on the magic
<mgz> I guess I just include that string in my text huh.
<ericsnow> mgz: so is it correct that CI covers the upgrade scenarios now (i.e. the bug trigger will get exercised)?
<mgz> ericsnow: see all the failures on upgrade jobs on 1.23 and trunk :)
<ericsnow> mgz: that's what I figured :)
<ericsnow> mgz: thanks for the review
<mgz> http://reports.vapour.ws/releases/2557
<mgz> "panic: rescanned document misses transaction in queue" ...that one's new to me
<ericsnow> mgz: yeah, me too :\
<ericsnow> mgz: hmmm...see #1318366
<mup> Bug #1318366: jujud on state server panic misses transaction in queue <cloud-installer> <landscape> <orange-box> <panic> <performance> <sm15k> <juju-core:Fix Released by menno.smits> <juju-core 1.20:Fix Released by menno.smits> <juju-core (Ubuntu):Fix Released> <https://launchpad.net/bugs/1318366>
<fwereade> ericsnow, menn0, dammit, sorry, thanks
<ericsnow> mgz: I'm going to ask menn0 to double check the upgrade step and tests when he gets a chance; that failure is a little suspicious.
<mgz> ericsnow: seems wise, we're not in a rush now
<ericsnow> mgz: yep :)
#juju-dev 2015-04-25
<mup> Bug #1421258 changed: juju deploy fails to deploy ceilometer-agent - Output: ERROR connection is shutdown  <deploy> <oil> <oil-bug-1372407> <juju-core:Expired> <https://launchpad.net/bugs/1421258>
#juju-dev 2016-04-25
<babbageclunk> dimitern: quick review of this? https://github.com/juju/gomaasapi/pull/46
<dimitern> babbageclunk: looking
<babbageclunk> dimitern: it turns out that as long as the methods return nil literally, the interface values compare == to nil
<dimitern> babbageclunk: hmm really? Won't the nil be wrapped still in an nil interface value, which itself is not nil?
<babbageclunk> dimitern: as long as both the type pointer and the value pointer are nil it'll == nil
<babbageclunk> dimitern: but returning something like i.subnet when it's nil will set the type pointer.
<dimitern> babbageclunk: you're correct, I've just tried this: http://play.golang.org/p/dQSGV-JQ87
<dimitern> babbageclunk: so the compiler does the right thing as long as you return an untyped nil
<babbageclunk> dimitern: yup - and that means we can keep the interface nice while getting the right behaviour
<dimitern> babbageclunk: sweet!
<babbageclunk> dimitern: :)
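The distinction the playground link confirms, in self-contained form: an interface value compares `== nil` only when both its type and value words are empty, so returning a nil `*Subnet` through an interface is not the same as returning a literal `nil`. A minimal sketch mirroring the discussion (`Subnet` and the function names are illustrative, not gomaasapi's API):

```go
package main

import "fmt"

type Subnet struct{}

// getTyped returns an interface wrapping a typed nil pointer: the
// interface's type word is set to *Subnet, so it does NOT equal nil.
// This is the "returning i.subnet when it's nil" trap babbageclunk notes.
func getTyped(s *Subnet) interface{} { return s }

// getUntyped returns a literal nil, leaving both the type and value
// words empty, so the result DOES equal nil.
func getUntyped() interface{} { return nil }

func main() {
	var s *Subnet // nil pointer
	fmt.Println(getTyped(s) == nil)  // false: typed nil inside interface
	fmt.Println(getUntyped() == nil) // true: untyped nil
}
```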
<voidspace> dimitern: frobware: will be a couple of minutes late to standup
<voidspace> urgent coffee emergency
<dimitern> voidspace: np
<babbageclunk> sweet, opportunistically grabbing a tea too then
<dimitern> babbageclunk: reviewed
<jam> mgz: how's the release shaping up? Is CI happy or do we need some extra pokinG?
<frobware> voidspace: https://bugs.launchpad.net/maas/+bug/1573046
<mup> Bug #1573046: 14.04 images not available for commissioning as distrio-info --lts now reports xenial <landscape> <MAAS:Fix Released by andreserl> <https://launchpad.net/bugs/1573046>
<frobware> voidspace: I got kicked out, back into standup HO again?
<voidspace> frobware: I'm still there
<frobware> dooferlad: the precise /tmp issue - I wonder how it  ever worked then
<dooferlad> frobware: I don't know, but it definitely happens.
<babbageclunk> dimitern, frobware, voidspace: review please? https://github.com/juju/juju/pull/5271
<babbageclunk> also, trying to post a comment on a bug on launchpad is timing out - is that normal?
<babbageclunk> nvm - fine now
<dimitern> babbageclunk: LGTM, thanks!
<babbageclunk> man, juju on maas is really neat when it's all working smoothly.
<babbageclunk> dimitern: Thanks!
<voidspace> babbageclunk: that was a difficult review...
<voidspace> dimitern: that's what a review should look like - not 17 pages! ;-)
<voidspace> (I jest - remove as much code as you want...)
<babbageclunk> voidspace: sorry man - I tried to keep it to a 1-line change but it wouldn't compile without the other one.
<voidspace> babbageclunk: yeah, I have the same change in my branch
<voidspace> dimitern: I can confirm that lxd doesn't work on maas for either trusty or xenial
<voidspace> dimitern: due to this bug https://bugs.launchpad.net/juju-core/+bug/1568895
<mup> Bug #1568895: Cannot add MAAS-based LXD containers in 2.0beta4 on trusty <juju-core:Triaged> <https://launchpad.net/bugs/1568895>
<dimitern> babbageclunk: it should be smooth if we (and the maas guys) are doing it properly :)
<babbageclunk> dimitern: :)
<dimitern> voidspace: oh, not good :/ I'll give it a try here with 1.9 and 2.0 on xenial
<voidspace> I also can't commission new nodes at the moment
<dimitern> voidspace: some reviews are like that :P
<voidspace> which frobware reckons is probably due to the xenial transition
<voidspace> not an issue for me right now, but will be soon
<voidspace> heh, yea
<dimitern> babbageclunk: if you can bootstrap ok now, you can test a few things - e.g. constraints selection for nodes, bindings, deploying multi-nic lxd backed by a device, ideally as far as checking whether network-get works
<voidspace> dimitern: can't test lxd, multi-nic or otherwise
<dimitern> voidspace: 'otherwise' should work (i.e. with lxc)
<voidspace> dimitern: well, yes "not lxd" works...
<dimitern> voidspace: as long as it works for lxc (and kvm) at least we know the multi-nic code path is exercised
<dimitern> then it's down to fixing lxd I guess
<voidspace> dimitern: by the way - how is AllocateContainerAddresses tested in the maas provider
<dimitern> (again)
<babbageclunk> dimitern: ok - I'll probably need some hand-holding for those.
<voidspace> dimitern: not directly as far as I can tell
<voidspace> I'm writing tests for the new code path
<dimitern> babbageclunk: sure, just ask
<dimitern> voidspace: not tested except manually (live) I'm afraid
<voidspace> naughty :-/
<voidspace> who let that past review!
<voidspace> easier to test with the new gomaasapi though
<dimitern> voidspace: yeah, but at least we can test it properly now
<dimitern> babbageclunk: for testing constraints, I'd suggest trying to add a machine with --constraints='...', once for each type (arch, mem, cpu, etc.) on its own, then removing the machine with --force before trying the next
<babbageclunk> ok
<babbageclunk> dimitern, voidspace: man, reading through gomaasapi/testservice.go was pretty alarming! Basically having to re-implement a baby MAAS!
<dimitern> different combos are also useful to test, and constraints with negatives as well (tags=^mytag1,^othertag, same for spaces=)
<dimitern> babbageclunk: yeah, and that's a simpler test double than we use for ec2
<babbageclunk> dimitern: ouch
<voidspace> babbageclunk: yep :-)
<dooferlad> frobware: have replied to that review...
<mup> Bug #1574564 opened: Juju tools 1.25.5 not found in https://streams.canonical.com/juju/tools/releases/ <juju-core:New> <https://launchpad.net/bugs/1574564>
<dimitern> voidspace, frobware, babbageclunk: http://reviews.vapour.ws/r/4697/ please, take a look (one page diff this time :)
<fwereade> babbageclunk, dimitern, voidspace: FWIW, I feel quite strongly that complex test doubles are very smelly -- for individual interactions, explicit canned request/response is usually tractable, and for broader tests there's no substitute for a real maas
<dimitern> fwereade: agreed - we're moving the low-level MAAS API interactions and tests inside gomaasapi itself, and using a pre-canned http server for higher-level interactions
<fwereade> babbageclunk, dimitern, voidspace: and it seems like a bad deal to devote effort to exactly aping a substrate -- it's very hard to maintain, and it means that the test infrastructure is biased *against* being able to test weird responses
<fwereade> dimitern, cool
<dimitern> frobware, voidspace, babbageclunk: related, but simpler review for the names package: http://reviews.vapour.ws/r/4698/
<babbageclunk> dimitern: is there a way to ignore whitespace differences in reviewboard or github? The vertical alignment that gofmt does makes it hard to see the real change in lots of places. (The checkboxes in reviewboard don't seem to do anything for the changes I'm looking at.)
<dimitern> babbageclunk: I don't think so, but does it look better on github?
<babbageclunk> dimitern: oh-ho! https://github.com/blog/967-github-secrets
<babbageclunk> dimitern (and frobware): add ?w=1 on the end of the url
<dimitern> babbageclunk: nice! I knew there was something like that, but haven't used it yet
<frobware> babbageclunk: I was looking for something like that about 30 mins ago when perusing dimitern's PR... thx
<dimitern> frobware: fwiw I try to make individual commits make sense on their own
<frobware> dimitern: ack. I just find it a toss-up which UI is better. GH or RB.
<dimitern> ok, upgraded both maas-es to latest versions (1.9.1+bzr4543-0ubuntu2 (trusty1) and 2.0.0 (beta4+bzr4944), respectively) now bootstrapping to see if lxd will work
<dimitern> fwereade: hey, I know we don't usually do our 1:1s, but I have a few things to discuss if you have 15-20m time?
<dimitern> we can do it later/tomorrow as well
<dimitern> voidspace, frobware, babbageclunk: I can successfully deploy multi-nic lxd containers on xenial with maas 1.9.1 beta4
<jam> mgz: ping
<voidspace> dimitern: that's awesome
<dimitern> voidspace: maybe try upgrading your maas to see if it will fix the lxd issue?
<dimitern> now trying the same on 2.0.0
<voidspace> dimitern: ah, that's 1.9
<voidspace> dimitern: I'm using the latest 2.0 from experimental 3 ppa.
<voidspace> dimitern: not so awesome news then. I mean, great - but it's not using my code...
<voidspace> dimitern: you won't be able to deploy containers on maas 2 with master - you'll need my branch
<dimitern> voidspace: ah, ok can you paste a link and I'll try with your branch?
<voidspace> dimitern: https://github.com/voidspace/juju/tree/maas2-allocate-container-addresses
<dimitern> voidspace: thanks!
<jam> fwereade: have you seen the "test timed out waiting for the machiner to start" ? failure in CI?
<jam> http://reports.vapour.ws/releases/3919/job/run-unit-tests-mongodb3/attempt/538
<mup> Bug #1382131 changed: local provider configs cannot be trivially duplicated <config> <local-provider> <juju-core:Invalid> <https://launchpad.net/bugs/1382131>
<jam> fwereade: the very first thing that comes to mind is knowing *what* dependency is missing would be useful
<frobware> dooferlad: ping
<cherylj> fwereade: Just sent an email about the pinger bug - can you take a look?
<urulama> cherylj: are you changing the pinger?
<urulama> cherylj: that's the only way to keep WS open so that apache doesn't close it by itself
<cherylj> urulama: this is for fixing bug 1572237
<mup> Bug #1572237: juju rc1 loses agents during a lxd deploy <lxd-provider> <juju-core:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1572237>
<cherylj> urulama: it's to make sure the pinger restarts if it dies
<urulama> cherylj: ok, thanks. just didn't want to get any surprises seeing GUI stop working all of a sudden :)
<cherylj> hehe :)
 * dimitern is out for ~2h
<simonklb> anyone know if someone is working on https://bugs.launchpad.net/juju-core/+bug/1565872 ?
<mup> Bug #1565872: Juju needs to support LXD profiles as a constraint <adoption> <juju-release-support> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1565872>
<simonklb> and what is the best option for developing docker charms for 2.0 at the moment?
<fwereade> jam, the infrastructure is there but I never got to exposing it -- call .Report() on an Engine and you'll get all sorts of useful diagnostic goodness
<fwereade> cherylj, ack
<mup> Bug #1574607 opened: Multiple Interfaces lead to stalled charm download over wrong API endpoint <api> <networking> <juju-core:New> <https://launchpad.net/bugs/1574607>
<fwereade> cherylj, it's a real race, in stuff my eyes completely skipped over because I thought I recognised it
<mup> Bug #1574632 opened: Data Race in apiserver/presence/pinger.go <blocker> <ci> <race-condition> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1574632>
 * voidspace lunches
<mup> Bug #1574637 opened: TestScale fails intermittently <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1574637>
<mup> Bug #1574637 changed: TestScale fails intermittently <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1574637>
<Garyx> Anyone have any idea of when juju 2.0 RC1 or next Beta will rear its head in the repos?
<mup> Bug #1574637 opened: TestScale fails intermittently <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1574637>
<babbageclunk> Garyx: I think cherylj might be able to tell you that.
<alexisb> sinzui, fwereade, cherylj yay for CI for catching issues
<alexisb> fwereade, cherylj what do we think the turn around time will be for fixing the race?
<Garyx> babbageclunk: thanks
<Garyx> cherylj: any rough eta, not asking for a time and date ;)
<alexisb> Garyx, we are working on the next beta atm
<alexisb> trying to get it out today
<alexisb> beta6
<Garyx> alexisb: thanks for the info :)
<mup> Bug #1574649 opened: environs/config LatestLtsSeries is not isolated from the host during testing <juju-core:New> <https://launchpad.net/bugs/1574649>
<fwereade> alexisb, more than 0 hours and hopefully less than 2; surely less than 4
<alexisb> fwereade, ack thanks
<mup> Bug #1574564 changed: Juju tools 1.25.5 not found in https://streams.canonical.com/juju/tools/releases/ <juju-core:Invalid> <https://launchpad.net/bugs/1574564>
<mup> Bug #1574677 opened: Designation of 'local' for both users and controllers is confusing <docteam> <juju-core:New> <https://launchpad.net/bugs/1574677>
 * dimitern is back
<katco> ericsnow: standup time
<dimitern> voidspace: I'm trying your branch now on 2.0 with lxd
<mup> Bug #1570917 changed: upgrade-juju: success but then deploy fails <upgrade-juju> <juju-core:Invalid by rogpeppe> <https://launchpad.net/bugs/1570917>
<voidspace> dimitern: cool, happy path test (single NIC) just about ready to land
<voidspace> dimitern: I'll get that up for review and then do more extensive tests
<voidspace> dimitern: once it lands there's container support on master - so good to get it in
<voidspace> mind you, master is blocked so it doesn't really matter I guess
<dimitern> voidspace: sounds good
<voidspace> a review would be good though
<dimitern> voidspace: I was having issues with my maas2 - tgt was misbehaving again and nothing could be commissioned or deployed successfully, now it seems it works again
<voidspace> dimitern: I think you'll hit this bug, but we'll see: https://bugs.launchpad.net/juju-core/+bug/1568895
<mup> Bug #1568895: Cannot add MAAS-based LXD containers in 2.0beta4 on trusty <juju-core:Triaged> <https://launchpad.net/bugs/1568895>
<dimitern> voidspace: nope, it's not that I think, as the bug is about trusty and this is xenial
<voidspace> dimitern: I had the exact same error with both
<dimitern> voidspace: if you grep for 'tgtadm: out of memory' in /var/log/maas/rackd.log and find anything, you might be hitting my issue
<voidspace> dimitern: will do
<voidspace> dimitern: frobware: babbageclunk: I have to pick my daughter up from the childminders' at 5pm - wife is stuck at the doctors
<voidspace> so will be late to meeting, sorry
<dimitern> voidspace: ok, np
<dimitern> voidspace: do you have a dual-nic hardware node, and the second one is usb2eth?
<voidspace> dimitern: me, no
<dimitern> voidspace: and the node details page showing something like 'eno1' and 'enxxaabbccddeef0' ?
<voidspace> dimitern: my testing is kvm only
<dimitern> voidspace: ah, ok
<voidspace> dimitern: if you're talking about that bug report it is from frobware
<dimitern> voidspace: nope, it's about dealing with long NIC names (e.g. the second NIC on all of my NUCs shows up as 'enx00e10000163d')
<voidspace> ah
<dimitern> and that's a wee bit too long to prepend 'br-' to it
<voidspace> hah
<dimitern> bridge script fails and I can't access the node :/
<voidspace> dimitern: frobware: babbageclunk: AllocateContainerAddresses with happy path test http://reviews.vapour.ws/r/4700/
<voidspace> review appreciated even though it can't land yet - so I can make changes
<voidspace> in the meantime working on more tests
<dimitern> voidspace: that seems to happen if you commissioned the node with xenial instead of trusty
<dimitern> voidspace: looking
<mup> Bug # changed: 1556183, 1556252, 1557146, 1557148, 1557679
<alexisb> dimitern, stop landing stuff in master please
<dimitern> alexisb: it wasn't blocked today, and I was not aware of restrictions :/
<alexisb> dimitern, it is blocked
<dimitern> alexisb: I see now it is, but this morning it wasn't - I haven't $$__JFDI__$$'ed anything since you mailed me
<alexisb> thanks dimitern
<frobware> dimitern, voidspace: I find it odd that in 2016 the name of the iface is so constrained.
<voidspace> dimitern: frobware: back, joining networking meeting
<voidspace> frobware: I'm nearly at the point where nothing crazy about networking can surprise me...
<natefinch_> anyone care to help debug why I can't log in after upgrading to Xenial?  http://pastebin.ubuntu.com/16050493/
<perrito666> natefinch_: sure
<perrito666> natefinch_: expand, "cant login"
<natefinch_> well, I get to the greeter screen, enter creds, it looks like it's switching to the desktop, briefly displays "ubuntu has encountered a problem" and then bumps me back to the greeter screen
<perrito666> ah, I see, well first of all you seem to be using the floss driver for nvidia, is that intentional?
<natefinch_> (please excuse my typing, I'm on a tiny bluetooth keyboard via my tablet)
<natefinch_> I don't care what driver I use so long as it works.... which this one obviously does not
<perrito666> if I had to guess, I would say that your X is crashing after logging in
<natefinch_> I had to twiddle my modprobe.d blacklists to even get x to load
<natefinch_> perrito666 - probably
<bogdanteleaga> natefinch_, did you blacklist all the gpu drivers?
<perrito666> natefinch_: check if the greeter doesn't provide an option for a session without the effects
<natefinch_> I can switch to a console easily enough, but that's it
<perrito666> natefinch_: I am sure ubuntu greeter has an option to change the session you are going to log in
<perrito666> click on the ubuntu logo next to the user/pass prompt
<natefinch__> evidently just reinstalling the nvidia driver fixed it
<perrito666> lol
<dimitern> lol
<dimitern> CRITICAL: We failed, but the fail whale is dead. Sorry....
<redir> up
<redir> um, power outage.
<redir> shutting down for a few minutes until it comes back.
<perrito666> dimitern: what was the dirty hack to work around the issue with tests complaining about xenial?
<dimitern> perrito666: ha :) well, put a script returning "trusty", named 'distro-info' somewhere in $PATH, before /usr/bin
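dimitern's shim can look like this — a sketch only; `~/bin` is just one example of a directory that precedes `/usr/bin` on `$PATH`:

```shell
# Shadow the real /usr/bin/distro-info with a stub that always reports
# trusty, so tests that shell out to `distro-info --lts` see trusty
# instead of xenial. ~/bin is an illustrative location.
mkdir -p ~/bin
cat > ~/bin/distro-info <<'EOF'
#!/bin/sh
# Ignore all arguments (--lts, --stable, ...) and lie consistently.
echo trusty
EOF
chmod +x ~/bin/distro-info

PATH="$HOME/bin:$PATH" distro-info --lts   # prints: trusty
```

Remember to remove the stub (or drop the `PATH` override) afterwards, or every tool that consults distro-info will keep seeing trusty.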
<mup> Bug #1574773 opened: bootstrap --to [lxd|lxc]:N fails to validate if machine N exists <landscape> <juju-core:New> <https://launchpad.net/bugs/1574773>
<mgz> ahasenack: that's a fun bug
<ahasenack> mgz: thx! :)
<ahasenack> and, let me fix the title
<ahasenack> it's not bootstrap --to
<ahasenack> it's deploy --to
<ahasenack> obviously
<mgz> ahasenack: ah, looks like bug 1365124
<mup> Bug #1365124: "juju deploy --to <non-existent-machine> <charm-name>" juju still tries to deploy the service. <deploy> <placement> <juju-core:Triaged> <https://launchpad.net/bugs/1365124>
<ahasenack> just --to N worked to validate N
<ahasenack> it failed when I added the lxc/lxd prefix
<mgz> but you say that's fixed?
<ahasenack> yeah, look at the paste in the bug
<ahasenack>  ERROR cannot deploy "ubuntu-to-machine" to machine 1: machine 1 not found
<ahasenack> when I used --to 1
<ahasenack> but --to lx[dc]:1 passed that check
<mgz> well, close one bug, open another
<ahasenack> mgz: that bug you linked to is for juju 1.18!
<ahasenack> or rather, was reported on juju 1.18
<mgz> yup :)
<mgz> ahasenack: don't suppose you have a 1.X env alive at present? I only have 2.0 up currently
<mgz> just want to series target
<ahasenack> nope, I made it a point to move myself to juju 2
<mgz> ahasenack: I'll stand something up and test here
<mup> Bug #1365124 changed: "juju deploy --to <non-existent-machine> <charm-name>" juju still tries to deploy the service. <deploy> <placement> <juju-core:Triaged> <https://launchpad.net/bugs/1365124>
<mup> Bug #1574783 opened: Juju 2 status should show model name - discrepancy between output formats <juju-core:New> <https://launchpad.net/bugs/1574783>
<tvansteenburgh> can anyone point me to the code that retrieves group names from launchpad when i do `charm whoami`?
<natefinch__> marcoceppi: ^
<tvansteenburgh> natefinch: pretty sure it's go code, i just can't find it
<mup> Bug #1574783 changed: Juju 2 status should show model name - discrepancy between output formats <juju-core:New> <https://launchpad.net/bugs/1574783>
<mup> Bug #1574798 opened: Surprising charm downgrade <landscape> <juju-core:New> <https://launchpad.net/bugs/1574798>
<mup> Bug #1574809 opened: "to: lxc:0" ignored in bundle <kanban-cross-team> <landscape> <juju-core:New> <https://launchpad.net/bugs/1574809>
<redir_> and we're back
<mramm> Does GOOSE support the OpenStack Identity API V3?
<mramm> and can you configure/use it from juju?
<mup> Bug #1574844 opened: juju2 gives ipv6 address for one lxd, rabbit doesn't appreciate it. <landscape> <juju-core:New> <rabbitmq-server (Juju Charms Collection):New> <https://launchpad.net/bugs/1574844>
<katco> mramm: it does. you just have to request a /v3/ endpoint in your accounts.yaml (i think that's the right file... might be clouds.yaml)
<mramm> Thanks!
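As an aside, the v3 endpoint katco mentions would be requested in the cloud definition. A hedged sketch of what such an entry might look like in a juju 2.0 clouds.yaml (the cloud name and URL are made up; katco notes it may instead belong in accounts.yaml):

```yaml
clouds:
  mystack:
    type: openstack
    auth-types: [userpass]
    # Pointing the endpoint at /v3/ selects the Identity API V3
    endpoint: https://keystone.example.com:5000/v3/
```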
<katco> perrito666: ericsnow: meeting time
<perrito666> katco: tx
<alexisb> wallyworld, I have 5 minutes before my next call
<wallyworld> me too, be right there
<redir> katco: yst?
<alexisb> thumper, I lost you
<katco> redir: kinda
<redir> k. about to tanzanitestand but was looking for some juju1 assist. I'll hit you up tomorrow.
<redir> katco: ^
<katco> redir: ok tc, cya tomorrow
<mup> Bug # changed: 1537937, 1557747, 1564397, 1566130, 1568374, 1571832, 1571916, 1572237, 1572781, 1573148, 1573149, 1573259, 1573382, 1573659, 1574632
<mup> Bug #1573410 opened: trusty juju 1.25.5 having issues deploying xenial lxc containers <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1573410>
#juju-dev 2016-04-26
<redir> wallyworld: re the trusty tests, while looking to remove them, I recall that I left them in as they also test i386 constraints, which isn't supported in xenial anymore, IIUC.
<redir> wallyworld: with that in mind do you still want them removed?
<wallyworld> redir: yeah, 386 is no longer supported in 2.0
<anastasiamac> \o/
<redir> ahh even in trusty and older. OK makes sense.
<redir> tx
<redir> wallyworld: how about the even older quantal tests? nix them too?
<wallyworld> redir: depends. we tend to use quantal as a series for certain things in tests where we want to be sure the system on which the test runs is not that series
<redir> ok
 * redir leaves them
<redir> only removes trusty
<redir> bbiab
<mup> Bug #1574272 changed: Juju agent uninstalls itself while adding machine <juju-core:Incomplete> <https://launchpad.net/bugs/1574272>
<natefinch> single character code review anyone?
<natefinch> http://reviews.vapour.ws/r/4704/
<natefinch> wallyworld, thumper, menn0 ^
<wallyworld> looking
<wallyworld> natefinch: now if that facade were not done separately and differently to every other facade it would not have missed having its version bumped
<wallyworld> natefinch: and there are tests that check that all api and apiserver facades marry up
<natefinch> wallyworld: yeah.
<wallyworld> which also do not pick this up
<natefinch> wallyworld: the lack of tests is the real problem here, I think.
<wallyworld> we have tests
<wallyworld> but because this was done differently they did not pick it up
<natefinch> wallyworld: I mean, yes, also the fact it's different is bad
<natefinch> I don't think it's necessarily bad to have component1/api, component2/api rather than api/component1, api/component2 ... it's just that we have a mix, which is confusing, and this code is missing tests that should have picked up the mismatch of version numbers.
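The cross-check wallyworld and natefinch are discussing can be sketched as a test that compares client-side and apiserver-side facade version registries, so a facade bumped on only one side fails loudly. The maps and names below are illustrative, not juju's real registries:

```go
package main

import "fmt"

// Illustrative registries: facade name -> version. In juju these would
// be derived from the api and apiserver packages; here they are stubs.
var clientFacades = map[string]int{
	"Uniter":   3,
	"Deployer": 1,
}

var apiserverFacades = map[string]int{
	"Uniter":   3,
	"Deployer": 2, // bumped server-side only: the check below catches it
}

// mismatches reports every facade whose client version disagrees with
// (or is missing from) the server registry.
func mismatches(client, server map[string]int) []string {
	var bad []string
	for name, v := range client {
		if sv, ok := server[name]; !ok || sv != v {
			bad = append(bad, fmt.Sprintf("%s: client=%d server=%d", name, v, sv))
		}
	}
	return bad
}

func main() {
	for _, m := range mismatches(clientFacades, apiserverFacades) {
		fmt.Println("version mismatch:", m)
	}
}
```

A facade registered "separately and differently" dodges this only if it is never added to one of the registries, which is exactly the gap being described.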
<natefinch> anyway, bedtime for me.
<mup> Bug #1574949 opened: setting a valid config setting should not warn <landscape> <juju-core:New> <https://launchpad.net/bugs/1574949>
<redir_> wallyworld: http://reviews.vapour.ws/r/4691/diff/1-2/ updated per earlier conversation
<wallyworld> great ty
<wallyworld> redir: looks good, just missing that extra test
<redir_> there are tests
<redir_> wallyworld: https://github.com/juju/juju/blob/master/environs/config/config_test.go#L1855
<redir_> or it is covered
<wallyworld> redir: ok, np, i was just going by the diff, let me check the link you just posted
<wallyworld> redir: those tests don't check that FallbackLtsSeries is used as expected
<wallyworld> that's the missing test i think we need
<redir_> i added a couple using patch executable, but they didn't cover anything additional so I didn't commit them
 * redir_ looks again
<wallyworld> redir: i guess the fallback series is only used to set testing FakeLtsSeries so it doesn't really need any more tests
<wallyworld> i just wanted to ensure that the now exported FallbackLtsSeries was used as expected now that it is exported
<redir_> got it committing.
<redir_> wallyworld: http://reviews.vapour.ws/r/4691/diff/2-3/ adds tests to ensure the exported variable is used
<wallyworld> ok, ta
<redir_> they don't add any coverage however.
<redir_> I am going eod... but will merge in the morning if it looks good to you.
<wallyworld> redir: ok, np. looking at the tests, it was not so much testing the setting of FallbackLtsSeries (as you say that adds no extra coverage), but more testing that places that used it got what we expected, i.e. was the logic to use the fallback LTS wired up correctly. i think since we just use it to set a testing variable elsewhere, we can do without these tests
<redir_> ok reverting those tests:)
<redir_> OK they are gone again wallyworld :)
<wallyworld> ty, sorry for misunderstanding
<wallyworld> lgtm :-)
<redir_> np, it is getting late here, so I may have been unclear.
<redir_> I can ship it now if you want but I am not waiting around for the merge bot
<redir_> wallyworld: ^
<wallyworld> redir: yep, np. i'll check the progress
<redir_> k, the bot's got it
<redir_> nite
<redir_> btw, when do sleep?
<redir_> when do you sleep? even...
<mup> Bug #1574963 opened: juju2 lxd launch hostname reverse lookup inconsistent <kanban-cross-team> <landscape> <juju-core:New> <https://launchpad.net/bugs/1574963>
<thumper> night all
<frobware> dimitern: ping, running 10mins late...
<dimitern> frobware: no worries, thanks for reminding me :)
<dimitern> frobware: ping me when you're there
<frobware> dimitern: let's go
<dimitern> frobware: omw
<frobware> dimitern: let's try after standup
<TheMue> morning
<voidspace> dimitern: I've replied to two of your points on the review of my branch: http://reviews.vapour.ws/r/4700/
<voidspace> dimitern: the error 500 on the subnet is unrelated to the points you make and could be a genuine error
<voidspace> dimitern: and one of the points (about creating the primary NIC) I'll talk to you about in standup
<voidspace> dimitern: the other point about null types I'm fixing.
<voidspace> dimitern: I'd like to see the output of subnets read on your maas server - the error appears to be that subnet "2" doesn't exist
<voidspace> dimitern: frobware: standup?
<babbageclunk> dimitern: When setting up a VLAN and subnet, if I want a machine on that VLAN to be able to get to the internet (or anywhere outside that subnet) I need to set up a default gateway, right?
<babbageclunk> dimitern: Should that be the address of the MAAS controller? Does that mean that it needs another interface that I can assign that VLAN and subnet?
<babbageclunk> dimitern: reading docs
<dimitern> babbageclunk: yeah, the controller usually has access to all VLANs (has NICs on each)
<dimitern> babbageclunk: and also acts as a gateway for the VLAN's subnets
<babbageclunk> Ok, so I need to add an interface on the controller for each VLAN I create?
<dimitern> babbageclunk: to make the setup closer to the real world, add gateways to each subnet pointing to the respective cluster addresses on that subnet
<babbageclunk> dimitern: ?
<babbageclunk> dimitern: cluster addresses? The interface on the controller?
<dimitern> babbageclunk: yeah, one NIC for each subnet on the controller is the easiest thing, both for physical and vlan subnets
<dimitern> babbageclunk: yep
<babbageclunk> dimitern: ok, thanks.
<dimitern> babbageclunk: np
<babbageclunk> dimitern: I get an error when I try to create a subnet: malformed array literal: ""
<babbageclunk> LINE 1: ...', 5005, 1, '10.10.0.0/24', 2, '10.10.0.1'::inet, '', true) ...
<babbageclunk>                                                              ^
<babbageclunk> DETAIL:  Array value must start with "{" or dimension information.
<dimitern> babbageclunk: what did you run to get this?
<babbageclunk> Just trying to create the subnet in the MAAS admin.
<frobware> babbageclunk: is that in 2.0?
<dimitern> babbageclunk: I mean can you paste the command line?
<babbageclunk> I made a VLAN with the VID and name 10, and then tried to create a 10.10.0.0/24 subnet
<babbageclunk> I wasn't using the command line, hang on - I'll try that.
<dimitern> babbageclunk: have you tried setting up the VLAN NICs on the cluster first and rebooting?
<frobware> dimitern: I thought that detection only happened at install time
<dimitern> ooh wow
 * dimitern hadn't seen that the web ui in 2.0 allows that
 * frobware is jealous in 1.9 land...
<babbageclunk> dimitern: Added the nic and rebooted, but I can't set the vlan tag on the new nic.
<dimitern> frobware: unless it changed, it also happens at boot
<frobware> dimitern: nice!
<dimitern> babbageclunk: check out this http://juju-sapphire.github.io/post/MAAS%20Spaces%20Demo/
<frobware> dimitern: I was only scripting that a few days ago
<dimitern> frobware: ah, ok
<babbageclunk> dimitern: I thought maybe that was because I'd need to assign a subnet to the nic first and then it would set it to the VLAN for that subnet
<babbageclunk> dimitern: but having trouble creating the subnet.
<dimitern> frobware: it seems to work, as I've recently "gained" a fabric-2 with a 192.168.122.0/24 subnet on it after installing libvirt-bin and rebooting
<frobware> dimitern: but this is 2.0 only, yes?
<babbageclunk> dimitern: oh yeah - I've got that as well, on fabric-1 (I assume because I installed libvirt-bin before adding the new nic)
<dimitern> babbageclunk: nope, the creation order is: fabrics and spaces, vlan on the fabric, subnet on a vlan and space
<dimitern> frobware: it might be 2.0 only, now that I think of it
<babbageclunk> dimitern: Ok - I've got fabrics 0 (original nic on the controller), 1 (bridge from libvirt-bin), 2 (new nic I added to be on the new vlan)
<babbageclunk> dimitern: I've got a private space
<babbageclunk> dimitern: Trying to add a subnet in the private space in the 10 vlan.
<dimitern> babbageclunk: but, you either do A or B, no need for both; A) edit /e/n/i on the maas controller to e.g. add eth0.11, eth0.12, eth0.13 vlan nics, which are statically configured with e.g. 10.11.1.1/24, 10.12.1.1/24, 10.13.1.1/24 (so maas can detect the subnet along with the vlan, and from the parent device eth0 it can detect the fabric)
<babbageclunk> dimitern: Does the vlan need a primary rack?
<dimitern> or B) create vlans, then spaces, then subnets within the vlans and spaces manually with the CLI (or if it works - web ui)
<dimitern> babbageclunk: I expect so, but you should only have one anyway
<babbageclunk> dimitern: Ah - when I added the nic did I need to add a new network in VMM as well?
<voidspace> dimitern: I fixed the "nil" issue by the way, so if you could complete the review sometime it would be appreciated
<voidspace> dimitern: http://reviews.vapour.ws/r/4700/
<voidspace> doing some multi-nic testing.
<voidspace> well, attempting to without screwing up maas :-/
<dimitern> babbageclunk: not necessarily - adding new NICs on the VM is only needed if you need more than one physical NIC to appear on the VM when commissioned by MAAS
<babbageclunk> dimitern: but how else would I be able to make sure the controller had an address in the right vlan?
<dimitern> babbageclunk: you can add VLANs on top of existing physical NICs
<babbageclunk> dimitern: how?
<babbageclunk> dimitern: do I do that in the UI, or somewhere else?
<dimitern> voidspace: ta, will look in a bit
<dimitern> babbageclunk: see for example my /e/n/i from the maas controller machine: http://paste.ubuntu.com/16061991/
<babbageclunk> dimitern: (Sorry for all the questions - I feel like there's a lot of knowledge that I'm missing that everyone else has forgotten they know.)
<dimitern> babbageclunk: if I add e.g. "eth1.42" to that, with address 10.42.0.1/24 and vlan-raw-device eth1, then reboot, maas should have created a 10.42.0.0/24 subnet linked to a newly created VLAN 42 on the same fabric eth1 is on
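The /e/n/i stanza dimitern describes would look roughly like this (addresses follow his eth1.42 example; the exact interface names depend on the controller):

```
# Illustrative /etc/network/interfaces stanza for the MAAS controller:
# a VLAN interface on top of eth1, which MAAS can auto-detect on boot
# as subnet 10.42.0.0/24 on VLAN 42 of eth1's fabric.
auto eth1.42
iface eth1.42 inet static
    address 10.42.0.1
    netmask 255.255.255.0
    vlan-raw-device eth1
```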
<babbageclunk> dimitern: ok, that makes sense - I thought that I could do that through the UI but had missed a step.
<dimitern> babbageclunk: well, you could also do that from the CLI (possibly the UI as well), but why bother if MAAS does it for you?
<dimitern> babbageclunk: once you have it working (the auto-detected way), you could try to replicate it manually, if you're interested
<babbageclunk> dimitern: ok, I'll do that.
<dimitern> babbageclunk: here's some more info that can help shed some light on the manual approach: http://blog.naydenov.net/2016/01/maas-setup-deploying-openstack-on-maas-1-9-with-juju/
<dimitern> just you'd have to adapt it for the 2.0 API changes
<babbageclunk> dimitern: ok, that's great, thanks.
<katco> fwereade: jam: hey, will you be able to make the planning meeting in ~20m?
<katco> fwereade: doh, you responded. sorry to pester ;)
<fwereade> katco, thanks for the reminder, though :)
<katco> fwereade: ha ;p definitely time for some coffee
<jam> katco: yes
<dimitern> voidspace: reviewed
<katco> jam: awesome, ty!
<katco> frobware: hey were you able to get those estimates? i see you requested access to the spreadsheet
<voidspace> dimitern: ta
<frobware> katco: nope. don't have any meaningful numbers
<dimitern> voidspace: live testing your branch now - I think I found the cause of "subnet not found"
<dimitern> voidspace: PUT http://10.14.0.1/MAAS/api/2.0//MAAS/api/2.0/nodes/4y3h8x/interfaces/112/, params: (empty)
<dimitern> we pass no params (I'd expect the vlan id there) when updating interface
<voidspace> dimitern: no vlan id
<voidspace> dimitern: ok, I'll look for that
<dimitern> voidspace: so what I think happens is maas tries to find a subnet with id=2 in the vlan with id=0 (which does not exist)
<voidspace> dimitern: right
<voidspace> dimitern: don't suppose you have any logging where it's coming from
<voidspace> dimitern: PUT is for update - which only happens on the device create I think
<dimitern> voidspace: ah, just killed the controller :/
<voidspace> heh
<voidspace> I'm looking
<dimitern> voidspace: but the PUT 500 was right after the device was created
<voidspace> dimitern: so in gomaasapi/machine.go:310 you can see the code
<voidspace> dimitern: it's only adding a VLAN if the iface and Subnet don't match - which may well be incorrect
 * dimitern takes a look
<voidspace> dimitern: that should probably be unconditional
<dimitern> voidspace: it makes sense not to set the vlan param if it's empty
<dimitern> voidspace: as you might want to e.g. just rename the interface or change its mac
<voidspace> that's where the vlan should be added
<voidspace> dimitern: should it be vlan id or vlan vid
<voidspace> we're setting id, which is consistent with the other entitites (use id)
<voidspace> in interface.go:137
<dimitern> voidspace: it should be the vlan id, not vid (or alternatively vlan="vid:<vid>")
<voidspace> yeah
<voidspace> well, there's definitely code to handle it in that code path
<voidspace> dimitern: actually, that code doesn't specify a subnet (the update call), so it's probably not that one
<voidspace> dimitern: let me look at LinkSubnet - but that's a POST not a PUT
<dimitern> voidspace: the update call sets the vlan, the link-subnet call sets the subnet and mode
<voidspace> dimitern: right, but you said the error was specifying a subnet and no vlan
<voidspace> dimitern: and the update call (the PUT) doesn't specify a subnet
<voidspace> dimitern: so it can't be that
<voidspace> I need a setup I can try this on
<voidspace> I'm shooting in the dark here
<voidspace> lunch first
<dimitern> voidspace: sorry, I didn't say we need subnet and vlan in the link-subnet call
<voidspace> dimitern: no, you said you'd expect to see a vlan id in the update interface call
<dimitern> voidspace: however, not setting a valid vlan on the interface won't allow you to link any subnet to it
<voidspace> dimitern: but it doesn't take that parameter
<voidspace> oh, it does
<voidspace> dimitern: right
<dimitern> voidspace: yep
<voidspace> dimitern: however, we do have code to set the vlan id - so it's weird that it's not getting there
 * dimitern *facepalms*
<voidspace> go on...
<babbageclunk> dimitern: I've added the vlan in ENI, but it doesn't show up in the controller. Do I need to do something (other than rebooting) after changing ENI?
<dimitern> voidspace: of course it won't work :) we're not actually updating the device's eth0 vlan at all
<voidspace> dimitern: explain
<dimitern> voidspace: we're creating a device with a mac, then reading back the interface_set from it, skipping the primary in the loop around 2308 in environ.go, and only recording its id
<voidspace> dimitern: but machine.CreateDevice does the update
<voidspace> dimitern: in gomaasapi
<mup> Bug #1574949 changed: setting a valid config setting should not warn <landscape> <juju-core:New> <https://launchpad.net/bugs/1574949>
<voidspace> dimitern: which is why we skip it
<dimitern> voidspace: yeah, it looks like Update() uses the subnet linked to the primary device NIC to figure out which vlan id it should use
<dimitern> voidspace: sorry, not Update(), but CreateDevice()
<dimitern> voidspace: so the problem is in line 2272 in environs.go, not gomaasapi
<voidspace> dimitern: ah, so CIDR is not unique - so a simple hash is not sufficient
<voidspace> dimitern: it's CIDR/vlan id pairs
<voidspace> dimitern: it pulls out the wrong subnet
<voidspace> ?
<dimitern> voidspace: cidr is unique
<dimitern> voidspace: but I suspect preparedInfo['eth0'].CIDR is empty
<voidspace> dimitern: the maas 1.9 code is doing the same
<dimitern> voidspace: hmmm.. true
<dimitern> mystery
 * voidspace lunches
<babbageclunk> dimitern: ooh, while he's gone, can we hangout again? I'm hopelessly confused. (Unless you also want to lunch?)
<dimitern> babbageclunk: can you give me 10m ?
<babbageclunk> dimitern: if you do I can go for a run now and we can chat after?
<dimitern> babbageclunk: sounds better - please ping me when you're back then
<babbageclunk> dimitern: ok, thanks
<dimitern> voidspace: the problem is using Spaces(), followed by Subnets() on each does not give you the vlan of the subnet
<dimitern> voidspace: and CreateMachineDeviceArgs.Validate must also check a.Subnet.VLAN() != nil
<voidspace> dimitern: Space.Subnets() doesn't set vlan properly
<voidspace> that's a bug if true
<voidspace> anyway, really going on lunch
<voidspace> (you may be right)
<dimitern> voidspace: enjoy :)
<mup> Bug #1573410 changed: trusty juju 1.25.5 having issues deploying xenial lxc containers <canonical-bootstack> <juju-core:Invalid> <https://launchpad.net/bugs/1573410>
<babbageclunk> dimitern: back!
<dimitern> babbageclunk: hey, so let's use the standup HO?
<katco> frobware: hey, thanks for showing up. sorry it wasn't useful to you.
<babbageclunk> dimitern: yup
<mattyw> katco, ping?
<katco> mattyw: pong
<mattyw> katco, hey there, I'm struggling to deploy local charms on beta6 (it also fails on master) I'm getting connection timed out on ec2 but nothing more useful in the log than that (charmstore charms work fine) shall I raise a bug?
<katco> mattyw: s/charms/resources/ ? or do you really mean charms?
<mattyw> katco, charms
<katco> mattyw: does it work on a different substrate? i.e. is it a fluke?
<mattyw> katco, it did work on lxd, but I had other problems on lxd so had to stop using it
<katco> mattyw: this seems like something CI would be catching... hm
<katco> mattyw: i don't think it would hurt to raise a bug, but the inability to deploy a local charm seems very severe and something everyone would be seeing
<katco> sinzui: cherylj: mgz: is ci seeing anything like this?
<sinzui> katco: matty, ci deploys many 10s of local charms and is not seeing an issue
<mgz> mattyw: in good news, lxd will probably work for you now :)
<mattyw> mgz, lxd was sort of working, I was trying to install lxd in lxd though, and that was giving me problems
<sinzui> mattyw: this is an example of a deploy from a few hours ago "juju --show-log deploy"
<cherylj> mattyw: what connection is timing out?  the actual deploy command?
<abentley> sinzui: I see no instances in eu-west-1.  Did you do cleanup, or did we finally get a clean run?
<mattyw> cherylj, seems to be yeah
<sinzui> abentley: I have too many disasters to deal with to clean up. I guess all went well
<cherylj> mattyw: can you paste the juju deploy --debug output?
<sinzui> abentley: I saw the region being used yesterday
<mattyw> cherylj,  https://pastebin.canonical.com/155238/
<abentley> sinzui: Yes.  industrial tests were still running last I checked, but it looked reasonable then.
<cherylj> mattyw: and this is ec2?
<mattyw> cherylj, yep
<cherylj> mattyw: yeah, it's not filtering lxd addresses
<cherylj> jam has a PR up for that
<dimitern> babbageclunk: http://rickardnobel.se/the-vlan-802-1q-tag-part-1/
<mattyw> cherylj, I don't believe lxd is being used in this situation
<cherylj> mattyw: can you ssh to your controller and run an ifconfig?
<cherylj> mattyw: I'm guessing you have a bridge device on the controller that's not being filtered, but I could be wrong
<babbageclunk> dimitern: you froze up! Are you ok over there? Thanks heaps anyway!
<dimitern> babbageclunk: yeah, to it appeared you dropped out btw
<dimitern> s/it/me/
<dimitern> argh
<dimitern> s/it/me it/ :)
<cherylj> mattyw: can you also go to your ec2 dashboard and see what IPs that machine has assigned to it?
<babbageclunk> dimitern: ah well, I think we're finished anyway - I'll read through that and get the vlans set up in my maas.
<babbageclunk> Thanks!
<dimitern> babbageclunk: I've sent you a few links in prvmsg not to flood the channel :)
<mattyw> cherylj, so I can't seem to ssh into it anymore, seems like none of the juju commands work, I've found the instance in ec2 though
<mattyw> cherylj, it's eu-west, I guess it might be a not commonly used region
<cherylj> mattyw: yeah, that doesn't surprise me that you couldn't ssh through juju
<dooferlad> frobware: https://github.com/juju/juju/pull/5281 may look very similar to something you reviewed yesterday... if you could take a quick look.
<mattyw> cherylj, how do I ssh to the controller now?
<cherylj> mattyw: can you ssh directly to ubuntu@54.195.25.5 ?
<dimitern> mattyw: try juju ssh --proxy 0 ?
<mattyw> cherylj, yes: https://pastebin.canonical.com/155242/
<mattyw> dimitern, ERROR machine 0 not found (not found)
<cherylj> hmm, I guess that region does actually use a 10. internal IP
<cherylj> I kinda figured it was a bridge IP
<dimitern> mattyw: juju switch admin then the above?
<mattyw> dimitern, that seems to work
<dimitern> or more verbosely juju ssh --proxy -m admin 0
<cherylj> mattyw: see anything interesting in machine-0.log?
<dimitern> well, for machine 0 of the admin model you shouldn't need to pass --proxy, but for any other machine (or container on machine 0), --proxy is needed unless you can ping the private address of the machine directly
<mattyw> cherylj, nothing really
<cherylj> mattyw: and the deploy command just hangs in the POST?
<mattyw> cherylj, yeah
<mattyw> cherylj, I've just discovered the charm is >600MB, I wonder if I just have to wait for ages, and expect some errors to happen
<cherylj> could be
<cherylj> too bad there's not a progress indicator
<cherylj> I mean, you could watch df on the controller to see if something is happening
<cherylj> or use jnettop
<mattyw> cherylj, yeah will do, I must admit to not realising the charm was that big
<cherylj> mattyw: yeah, that is surprising
<jam> cherylj: mattyw: I'm trying to get that landed today
<mattyw> cherylj, yep panic over, just took ages
<cherylj> yay!
<cherylj> (well, not yay for taking ages, but yay to not being a regression)
<mattyw> cherylj, thanks for sticking with me
<cherylj> mattyw: of course :)
<mup> Bug #1575229 opened: juju/utils/GetAddressesForInterface only returns IPv4 addresses <ipv6> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1575229>
<rcj> cherylj, sinzui, alexisb: xenial amis are available
<alexisb> rcj, awesome
<sinzui> wee
<sinzui> thank you cherylj , I will update the merge job
<mgz> rcj: <3
<voidspace> dimitern: you're wrong by the way - I fetch the spaces from MAAS directly there, and it does get VLAN information
<voidspace> dimitern: those are gomaasapi.Subnet not network.SubnetInfo
<dimitern> voidspace: oh, it does? hmm..
<voidspace> dimitern: well, I'm pretty sure you're wrong anyway
<mup> Bug #1575229 changed: juju/utils/GetAddressesForInterface only returns IPv4 addresses <ipv6> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1575229>
<dimitern> voidspace: well, CreateMachineDeviceArgs.Validate has to check Subnet.VLAN is != nil
<voidspace> dimitern: that's true enough, but it's not the cause of the error
<voidspace> still working on a multi-nic setup
<dimitern> voidspace: well, something is wrong with the 2.0 implementation of AllocateContainerAddresses or something in gomaasapi, as the 1.0 API path still works
<voidspace> dimitern: sure :-)
<voidspace> dimitern: once I can reproduce I'm confident I can track it down
<dimitern> voidspace: +1
<voidspace> dimitern: I think your diagnosis about the VLAN might be right
<voidspace> dimitern: I just can't see how from the code
<voidspace> dimitern: but some debugging info should reveal it
<dimitern> voidspace: I'd suggest commenting out the deferred device deletion on failure to inspect it
<frobware> dooferlad: just a couple of comments, but LGTM
<katco> cherylj: got a sec to chat about a bug?
<cherylj> katco: sure
<katco> cherylj: https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=1
<frobware> voidspace, babbageclunk, dooferlad, dimitern: please remind me, who ran into the iface names too long issues yesterday? And was a bug raised for this?
<voidspace> frobware: I thought it was dimitern - at least he told me about it :-)
<babbageclunk> dimitern: I deleted a node from my maas (because I screwed up its network) and want to recommission it, but it won't pxe boot, always falls back to the disk.
<babbageclunk> frobware: yeah, I think it was dimitern
<voidspace> babbageclunk: check the boot order
<frobware> babbageclunk: boot order in the VM settings
<babbageclunk> voidspace, frobware: thanks, but I checked that. It shows that it's trying to boot from the network but nothing happens.
<frobware> dimitern: ^^ - please could you raise a bug for the iface name issue
<alexisb> fwereade, I will be a bit late for our 1x1
<voidspace> babbageclunk: weird. And the subnet that exists is one with managed dhcp?
<babbageclunk> voidspace: ooh, checking
<babbageclunk> voidspace: hmm, I don't know where to check that
<voidspace> babbageclunk: in the maas ui - networks
<fwereade> alexisb, chatting to dimitern, let me know when you're ready
<voidspace> babbageclunk: find the subnet you're interested in and click on "untagged" (vlan)
<voidspace> babbageclunk: and it will tell you if DHCP is enabled (you need it enabled to pxe boot)
<bogdanteleaga> anybody up for the fastest review of the week? http://reviews.vapour.ws/r/4710/
<babbageclunk> voidspace: yup, DHCP is enabled on the untagged VLAN in fabric-0
<perrito666> bogdanteleaga: shipit
<voidspace> babbageclunk: :-/
<voidspace> babbageclunk: clone the vm and commission a new one instead
<voidspace> babbageclunk: maybe maas won't commission a deleted machine?
<voidspace> babbageclunk: or ask in #maas
<babbageclunk> voidspace: might try creating a new one and see if that gets picked up ok
<babbageclunk> voidspace: nope - ok, something wrong with the controller rather than the node.
<voidspace> babbageclunk: :-/
<babbageclunk> voidspace: undoing the vlan change to see if that helps
<voidspace> dimitern: ok, I can reproduce
<mup> Bug #1575245 opened: Juju 2.0 LXD install instructions don't work on trusty <juju-core:New> <https://launchpad.net/bugs/1575245>
<dimitern> voidspace: \o/
<dimitern> voidspace: what was it?
<voidspace> dimitern: I said I can repro it
<voidspace> dimitern: working out what causes it is the next step
<babbageclunk> voidspace, dimitern: what the crap - my ENI is empty now! Just has loopback in it now. That might explain the problem (although I don't know how that happened).
<dimitern> frobware: which iface name issue? about biosdevnames and enxxaabbccddeef0 ?
<frobware> dimitern: yep
<mup> Bug #1575245 changed: Juju 2.0 LXD install instructions don't work on trusty <juju-core:New> <https://launchpad.net/bugs/1575245>
<voidspace> babbageclunk: hah, nice
<babbageclunk> yay, lucky I put it in a pastebin while chatting to dimitern
<dimitern> frobware: I did yesterday - bug 1572070
<mup> Bug #1572070: MAAS 2.0 cannot link physical device interfaces to tagged vlans, breaking juju 2.0 multi-NIC containers <juju> <MAAS:Fix Committed by blake-rouse> <MAAS 1.9:Fix Committed by blake-rouse> <https://launchpad.net/bugs/1572070>
<dimitern> babbageclunk: sorry, I was talking to fwereade - catching up on scrollback
<babbageclunk> no worries - it was a false alarm, my ENI is still there.
<dimitern> babbageclunk: ah, good! :)
<frobware> dimitern: is that the right bug#?
<dimitern> frobware: it's against maas only, not juju
<frobware> dimitern: I'm confused. How is that related to enxxaabbccddeef0 and our resultant bridge name?
 * dimitern *d'oh*
<dimitern> frobware: sorry - looking more carefully now
<dimitern> frobware: there it is - bug 1572070
<mup> Bug #1572070: MAAS 2.0 cannot link physical device interfaces to tagged vlans, breaking juju 2.0 multi-NIC containers <juju> <MAAS:Fix Committed by blake-rouse> <MAAS 1.9:Fix Committed by blake-rouse> <https://launchpad.net/bugs/1572070>
<dimitern> argh
<dimitern> bug 1574771
<mup> Bug #1574771: MAAS/curtin generate invalid /e/n/i and failed deployment for nodes with long (biosdevname) interface names, which in turn have VLANs <networking> <robustness> <MAAS:Invalid> <MAAS 1.9:Invalid> <https://launchpad.net/bugs/1574771>
<frobware> alexisb: this ^^ was the bug I mentioned earlier (bug 1574771)
<babbageclunk> dimitern: yup - if I remove the vlan, I can commission. If I add it back in, I can't.
<dimitern> babbageclunk: add/remove from the maas machine?
<dimitern> babbageclunk: I'm not sure what you're talking about :)
<natefinch> sinzui: is it safe to drop code that is only compiled for older versions of Go?
<natefinch> (older than 1.6)
<babbageclunk> dimitern: sorry - add/remove it to/from ENI
<dimitern> babbageclunk: on the maas controller maching?
<dimitern> machine*
<babbageclunk> dimitern: yup
<dimitern> babbageclunk: how does the commissioning fail?
<dimitern> babbageclunk: do you see the console of the VM node doing stuff?
<babbageclunk> dimitern: I get this error: 0x040ee119
<dimitern> babbageclunk: so pxe is failing
<babbageclunk> dimitern: yup
<babbageclunk> dimitern: so presumably that means DHCP?
<dimitern> babbageclunk: yeah, among other things
<babbageclunk> dimitern: could it be setting the dns servers on the subnet?
<babbageclunk> dimitern: I don't know why that would cause it though.
<dimitern> babbageclunk: hmm.. let me check something here on my maas
<voidspace> dimitern: so it's the call to machine.CreateDevice that errors - however as far as I can tell the juju side of that is correct (right subnet with the right vlan)
<voidspace> dimitern: so instrumenting gomaasapi to see what it actually sends...
<dimitern> voidspace: it gets deeper .. :/
<dimitern> babbageclunk: how did you trigger the commissioning - via the UI or CLI?
<mup> Bug #1564622 changed: Suggest juju1 upon first use of juju2 if there is an existing JUJU_HOME dir <juju-release-support> <juju-core:Fix Released by natefinch> <https://launchpad.net/bugs/1564622>
<natefinch> cherylj, katco: btw, got clarity from security... basically we just need to blacklist the RC4 ciphersuite, which is fairly straightforward... I'll add a function to juju/utils to create a default tls.Config and make sure we use that everywhere.
<voidspace> dimitern: yeah, as far as I can tell the arguments to machine.CreateDevice are correct, and note that at this point it's the same code path as for a single nic
<voidspace> dimitern: it hasn't even got into the part that's different!
<dimitern> babbageclunk: and what series was used - xenial or trusty?
<voidspace> dimitern: it *could* still be a MAAS bug :-)
<dimitern> voidspace: hmm the difference between single-nic and multi-nic should only be the way we handle the primary nic (update the vlan + link-subnet)
<katco> natefinch: cool, glad to hear it's manageable
<babbageclunk> dimitern: xenial
<cherylj> awesome, thanks for following up on that, natefinch
<babbageclunk> dimitern: although it wouldn't ever get to that point.
<dimitern> babbageclunk: and that particular node has 1 NIC on the same bridge where the controller's managed interface is on?
<babbageclunk> dimitern: I think so. What do you mean by managed interface?
<sinzui> natefinch: good question. We are still running windows and centos unit tests with go 1.2 because they do not pass on go 1.6. I don't think you can require go 1.6 today
<dimitern> babbageclunk: how many NICs does your maas controller KVM have configured?
<natefinch> sinzui: booooooooooooooooooooooooooo
<natefinch> oooooo
<babbageclunk> dimitern: 2
 * natefinch sadface
<dimitern> babbageclunk: ok, and they are attached to different virbrX bridges?
<babbageclunk> dimitern: I added one when I thought I needed another for the vlan.
<dimitern> babbageclunk: is it possible that second bridge has DHCP enabled on it?
<dimitern> babbageclunk: (last bridge you added)
<natefinch> sinzui: are we building the binaries with go 1.6?
<natefinch> sinzui: for centos and windows?
<sinzui> natefinch: yes
<babbageclunk> dimitern: I'm not sure how to see what bridge it's connected to.
<natefinch> sinzui: ew
<sinzui> natefinch: we only build with golang 1.6
<babbageclunk> dimitern: in VMM? They're both on the maas2 network - I guess that's the bridge.
<dimitern> babbageclunk: if you have virt-manager installed and connected to qemu:///system it will be easy :)
<natefinch> cherylj: this seems like a pretty bad thing - our tests and our production code are being built with different versions of go.  which means we're not actually testing 100% what we're distributing
<dimitern> babbageclunk: ah, well - not quite; the network is "backed" by a virbrX bridge, but it's a separate entity as far as libvirt is concerned
<babbageclunk> dimitern: I'm going to remove the second nic anyway now - it's not needed.
<dimitern> babbageclunk: so yeah, how is that maas2 network configured in terms of dhcp, ipv4/ipv6, etc?
<dimitern> babbageclunk: that might help indeed :)
<babbageclunk> dimitern: IPv4, DHCP off, NAT on
<dimitern> babbageclunk: ok, that's correct
<dimitern> babbageclunk: then, if you go to the Nodes | Controllers | your maas controller
<babbageclunk> dimitern: ok, I'll ditch the other one and put the vlan back, then try recommissioning.
<babbageclunk> dimitern: yup
<babbageclunk> ?
<cherylj> natefinch: I'm not sure what you're referencing. Since we made the move to go 1.6, we use it everywhere.  The only exceptions are windows unit tests using go 1.6 are still non-voting, pending some changes from perrito666.  Oh and, centos unit tests too
<dimitern> babbageclunk: what does it show in the "served vlans" section and interfaces below it?
<sinzui> natefinch: I think you misunderstand what is happening. we switched to go 1.6 as core asked, but core is not fixing the tests as QA asks
<dimitern> babbageclunk: sorry for the 20 questions :) I suspect the issue might be enabling DHCP on the newly added VLAN
<babbageclunk> dimitern: no worries! I appreciate the help.
<perrito666> dimitern: both urls seem valid, but ill make the change
<babbageclunk> dimitern: so served VLANs has fabric-0 (the nic) and fabric-1 (the bridge that got added when installing libvirt-bin)
<dimitern> babbageclunk: ok, so the first fabric-0 is the managed one, and its vlans should all have dhcp enabled
<babbageclunk> dimitern: under Interfaces there are ens3 (original nic), ens9 (second nic I added) and virbr0 (the libvirt bridge)
<babbageclunk> dimitern: yup (there's only the one VLAN, I removed the new one)
<babbageclunk> dimitern: fabric-1's vlan has DHCP off
<dimitern> babbageclunk: ok, so the ens9 is the (only) managed interface, but you should also see the newly added VLAN interface there as well - e.g. ens9.42
<dimitern> babbageclunk: that's fine though - it can't be on as libvirt also provides dhcp for 192.168.122.0/24 by default (virbr0)
<dimitern> babbageclunk: ah, you've removed the new vlan, ok
<dimitern> babbageclunk: and you're saying that adding it blocks commissioning - adding it how? manually with the CLI or editing /e/n/i and rebooting to let maas autodiscover it?
<babbageclunk> dimitern: so ens9 is the managed interface? should I be adding the vlan to that interface instead of ens3?
<babbageclunk> dimitern: the latter
<dimitern> babbageclunk: aha! that's it I think :)
<babbageclunk> dimitern: (and then using the cli to set space/gateway/dns)
<dimitern> babbageclunk: ok, so that misconfigured ens3 was the issue I suspect
<babbageclunk> dimitern: so you're suggesting - keep ens9, configure it in ENI, and add the vlan to that instead?
<natefinch> sinzui, cherylj: sorry, had to step away... I just mean - we should prioritize work to get the windows and centos tests passing on 1.6.
<babbageclunk> dimitern: ok, I can see that in your ENI from the pastebin
<dimitern> babbageclunk: sorry - which one did you add last - ens9 or ens3 ?
<babbageclunk> ens9
<cherylj> natefinch: yeah, perrito666 was doing some work for windows
<sinzui> natefinch: yes we agree
<cherylj> sinzui: is there a bug open for the centos / 1.6 failures?
<perrito666> cherylj: sorry I thought I merged those last night but got flakytestwalled
<dimitern> babbageclunk: ah, ok then add the vlan to the ens3 and drop ens9 (on the vm as well)
<sinzui> cherylj: bug 1570883
<natefinch> perrito666: I love that term, totally stealing it
<mup> Bug #1570883: imageSuite.TestEnsureImageExistsCallbackIncludesSourceURL fails on centos go 1.6 <centos> <ci> <go1.6> <jujuqc> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1570883>
<cherylj> sinzui: ah, I see it now
<dimitern> babbageclunk: as ens3 was the "original" managed interface :)
<babbageclunk> dimitern: ok
<babbageclunk> dimitern: (by virtue of being the only one)
<dimitern> babbageclunk: yep
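[editor's note: for reference, the /e/n/i shape being discussed - a static primary NIC with a tagged VLAN interface stacked on it. The interface name, VLAN ID 42, and addresses below are made-up examples, not the actual values from this setup.]

```
auto ens3
iface ens3 inet static
    address 192.168.100.2
    netmask 255.255.255.0
    gateway 192.168.100.1

# tagged VLAN 42 on top of ens3 (needs the vlan package / 8021q module)
auto ens3.42
iface ens3.42 inet static
    address 192.168.142.2
    netmask 255.255.255.0
```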
<dimitern> babbageclunk: my maas-es setups all use dual-nic lxc containers for the controllers - one managed, one "external" unmanaged
<dimitern> babbageclunk: but that's not a requirement, just me making my life interesting :D
<babbageclunk> dimitern: :)
<dimitern> babbageclunk: to be on the safe side, I'd delete that node which failed commissioning and re-add it
<dimitern> babbageclunk: for kvm-based nodes by far the easiest way is 'Add Hardware|V|' -> Chassis -> Power type 'virsh', Address like 'qemu+ssh:///...', etc. and set a prefix filter on the node names (e.g. mine are called maas-19-node-0, maas-18-node-2, etc.)
<babbageclunk> dimitern: Ooh, that's nice.
<dimitern> so I'd use 'maas-19-node-' as prefix
<babbageclunk> dimitern: I tried adding a new one though, and it also wouldn't pxe boot
<dimitern> babbageclunk: yeah :) nice trick
<dimitern> babbageclunk: so a "new" node can't pxe boot just like that, maas won't allow it
<dimitern> babbageclunk: it needs to be turned on and try pxe booting (with priority over local disk boot)
<babbageclunk> dimitern: yeah, I had done that.
<dimitern> babbageclunk: then it shows up as "new" and you can edit a few things like name, zone, etc. and then you "accept" it
<babbageclunk> dimitern: ok - just tried with the vlan, commissioning works (at least, got past the bit where it got stuck)
<dimitern> babbageclunk: but using "Add Hardware" like that bypasses most of that - it should add them as new and then start commissioning
<dimitern> babbageclunk: awesome!
<babbageclunk> dimitern: just adding the gateway/dns/space back on to make sure.
<babbageclunk> dimitern: thanks for all of the detective work!
<dimitern> babbageclunk: :) glad to help
<dimitern> ok I think it's beer o'clock already
 * dimitern eod-s
<babbageclunk> dimitern: yay, still works with the extra settings. Enjoy that beer!
<dimitern> babbageclunk: nice! I'll leave you to it then :) cheers!
<redir> we don't support any i386 in 2.x correct?
<natefinch> redir: correct
<redir> tx natefinch
<mup> Bug #1571053 opened: container networking lxd 'invalid parent device' <ci> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1571053>
<mup> Bug #1575283 opened: Juju 2 status doesn't show error reason in default format <juju-core:New> <https://launchpad.net/bugs/1575283>
<katco> yay, headed my way: https://weather.com/weather/radar/interactive/l/USMO0105:1:US?layer=radarConus&zoom=7
<jam> mgz: sinzui: did something change on the merge bot recently? It looks like LXD is installed but not configured, as I tried to land a change and I got a lot of "LXD is not configured" failures.
<jam> which I don't *think* I touched myself
<jam> (I am working in that area, so I might have tweaked something and not realized it)
<sinzui> jam: for merging? why yes we did. We switched back to xenial images when the AMIs appeared
<jam> ah, at least fwereade's patch failed as well
<jam> sinzui: so it looks like the merge bot is rejecting patches now because as it tests the LXD provider, it finds that the local LXD is unconfigured
<sinzui> jam: So either we revert to trusty, or we change something to configure LXD. I am still operating on the assumption LXD doesn't just work
<sinzui> jam: Some trickery is needed to configure LXD without a prompt. I think reverting to trusty is safest for now
<jam> sinzui: well, I'm heading to bed, unfortunately. Most expedient is probably to revert to trusty
<sinzui> jam: I can retry your branch in a few minutes
<jam> dpkg-reconfigure -p medium  still prompts you ?
<jam> (i can see that it could, I just noticed that the dpkg-reconfigure changed, and our prompt helping you to get working changed as well)
<sinzui> jam: katco fwereade : your merges are re-queued. The landing bot will use trusty images again because xenial images don't come with a working lxd, and "lxd init" requires an interactive prompt
<katco> sinzui: ty, was wondering about that
<jam> sinzui: so 'lxd init' is the wrong thing to run, as that won't setup lxdbr0, 'dpkg-reconfigure -p medium lxd' is apparently the right thing to run, but it might also need a prompt
<natefinch> sinzui, jam: is this something we need to ping the lxd guys about?
<sinzui> jam: does -p medium lxd not require interactive?
<natefinch> sinzui, jam: it prompts me
<jam> sinzui: not sure. '-p' seems to change what level of questions get asked (so I'm guessing dpkg-reconfigure asks you to enter the subnet, but -p medium doesn't)
<sinzui> jam :( one question is enough to kill a provisioning script
<frobware> jam: does the failure look like "initialisation.go:94 configuring zfs failed with exit status 1: LXD init cannot be used at this time"?
<jam> sinzui: dpkg-reconfigure -p high doesn't ask any questions, but I don't know if it works :)
<sinzui> jam: :) If it works, I can switch back to xenial in a few hours
<jam> tych0: ^^
<jam> tych0: do you know if "dpkg-reconfigure -p high' will just use the auto-selected subnet?
<jam> frobware: the current failure we're seeing is different
<jam> frobware: but that also looks like one we should be tracking if you're seeing that as well.
<frobware> jam: ok, it might be because my tip is a little behind upstream/master
<jam> not being able to configure ZFS shouldn't be fatal
<frobware> jam: it may also be due to local changes I'm making in terms of detecting whether we're already setup (bridge-wise)
<natefinch> anyone know if there's any reason to support a TLS version less than 1.2?
<tych0> jam: i don't know, but you can probably preseed it
<mup> Bug #1575310 opened: Add "juju status --verbose". <feature> <juju-core:New> <https://launchpad.net/bugs/1575310>
<cmars> natefinch, gui, possibly
<cmars> other than that, no
<natefinch> cmars: IE (of course) is the only browser that hasn't had 1.2 support for a long time.  Wikipedia says you need IE 11 for 1.2 support to be on by default... and google says IE 11 is the lowest supported version for win 7+
<mup> Bug #1575332 opened: add-cloud method not documented by "juju help" <juju-core:New> <https://launchpad.net/bugs/1575332>
<mup> Bug #1544890 changed: "ERROR the name of the model must be specified" when 'juju init' required <2.0-count> <bootstrap> <juju-release-support> <juju-core:Invalid> <https://launchpad.net/bugs/1544890>
<alexisb> I have to go pick up sick kid and vet supplies (For horse not kid); will be back online later
<mup> Bug #1467715 changed: worker/peergrouper: data race in package <ci> <intermittent-failure> <race-condition> <regression> <juju-core:Fix Released by menno.smits> <https://launchpad.net/bugs/1467715>
<mup> Bug #1575400 opened: juju2: no maas nodes available: error message mentions 'zone=default' <landscape> <juju-core:New> <https://launchpad.net/bugs/1575400>
<menn0> cherylj: reviewed the add-model change. Ship it with a few small suggestions.
<mup> Bug # opened: 1575403, 1575405, 1575409, 1575410
<cherylj> menn0: I'll have to address in a follow up PR.  I was just merging feature branch I had created to test the requisite CI changes.
<menn0> cherylj: no worries
<mup> Bug #1575409 changed: status hides container error in tabular format <kanban-cross-team> <landscape> <juju-core:Triaged> <https://launchpad.net/bugs/1575409>
<mup> Bug #1575410 changed: juju2-beta5: tools mismatch error on lxd container <landscape> <juju-core:New> <https://launchpad.net/bugs/1575410>
#juju-dev 2016-04-27
<menn0> thumper: SSHClient facade. Review pls. http://reviews.vapour.ws/r/4713/
<mwhudson> menn0: i have a couple of simple PRs https://github.com/juju/juju/pull/5252 https://github.com/juju/juju/pull/5241
<mwhudson> menn0: reviewboard doesn't seem to have picked them up
 * redir is eod. See you tomorrow juju-dev
<mup> Bug #1575448 opened: trusty juju 1.25.5 HA availability issues <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1575448>
<menn0> mwhudson: looking
<mwhudson> menn0: looks like tim merged them
<menn0> mwhudson: ok cool. I was having lunch.
<mwhudson> or well
<mwhudson> menn0: https://github.com/juju/juju/pull/5252 hasn't merged, can you tell why?
<mwhudson> oh still processing
<menn0> mwhudson: yep, still merging
<cherylj> hey perrito666 - your windows tests PR fixed this bug, right?  bug 1571783
<mup> Bug #1571783: Windows unit tests cannot setup under go 1.6 <ci> <go1.6> <jujuqa> <regression> <test-failure> <unit-tests> <windows> <juju-core:In Progress by hduran-8> <https://launchpad.net/bugs/1571783>
 * perrito666 checks
<perrito666> yes
<cherylj> cool, thanks
<a123> having trouble building juju(cb347bb7). Following the README.md results in: cannot find package "github.com/Azure/azure-sdk-for-go/Godeps/_workspace/src/github.com/Azure/go-autorest/autorest/mocks"
<davecheney> a123: did you run godeps first ?
<davecheney> cd $GOPATH/github.com/juju/juju && godeps -u dependencies.tsv
<a123> no I did not. The README.md only says to run 'make install-dependencies'
<davecheney> that should do the same thing
<davecheney> did that process work ?
<davecheney> it might be easier to run the command I suggested
<davecheney> and raise a bug if our install docs are out of date
<a123> hmm. godeps: command not found.    I'm using: go version go1.6.1 linux/amd64
<a123> I'm new to go, if that's not obvious.
<davecheney> please raise a bug that make install-dependencies does not install godeps, which is a dependency :)
<davecheney> for the moment
<davecheney> go get launchpad.net/godep
<davecheney> go get launchpad.net/godeps
<davecheney> you will need to install bzr, sorry
<a123> ha. yep.
<davecheney> that may have been why make install-deps failed
<perrito666> odd, I thought make install installed bzr
<a123> hmm. odd. bzr already installed. here's the output of go get launchpad.net/godeps
<a123> # cd .; bzr branch https://launchpad.net/godep /home/ubuntu/proj/gojuju/src/launchpad.net/godep bzr: ERROR: Not a branch: "https://launchpad.net/godep/". package launchpad.net/godep: exit status 3
<perrito666> a123: its godeps
<perrito666> go get launchpad.net/godeps
<a123> yep.
<perrito666> with an s at the end
<perrito666> you are missing it
<davecheney> yes, sorry, i typed it incorrectly the first time
<a123> working....
<davecheney> there is a related tool with a very similar name
<a123> godeps -u dependencies.tsv looks like it did its thing.
<a123> now running go install -v github.com/juju/juju/... and that looks like it's doing its thing. Thanks for the help.
<davecheney> no worries
<a123> BTW, anyone get juju bootstrap to work from behind a proxy? At home, no issues at all. Different story behind a proxy.
<davecheney> juju _should_ pickup the various http_proxy variables if they are defined in your shell
<blahdeblah> a123: 1.25, or 2.x?
<blahdeblah> 1.25 works for me
<a123> yes. It does look like it does that at first.... Yes, 1.25 worked for me too, now trying 2.0
<a123> don't know if this is related to our proxy or not, but doing lxc remote list produced a table where images url looked like:
<a123>  https://images.linuxcontainers.org
<a123> I could not download any images from there. but when I redefined that url to include the port(8443) the image download was successful.
<davecheney> if you've changed something (i'm not sure what you've changed) then you'll need to be explicit
<davecheney> it sounds like the thing you're downloading from expects 443 as the default
<a123> right. I ran: lxc remote set-url images https://images.linuxcontainers.org:8443
<a123> this redefined the remote images url. I then ran: lxc image copy images:ubuntu/xenial/amd64 local: --alias ubuntu-xenial
<a123> w/out changing the URL, the lxc image copy command would not work behind our proxy
<davecheney> umm, juju 2.0 doesn't support lxc
<davecheney> only lxd
<davecheney> i hope this statement is helpful, not frustrating
<a123> laughing... I thought lxd2.0 was built on top of lxc. Your point is taken though.
<davecheney> well it is
<davecheney> so your point is technically correct, which is the best kind of correctness
<a123> before changing the image url and running: juju bootstrap --config default-series=xenial lxd-test lxd --debug produced:
<a123> sorry guys. I'm seeing different debug output now than earlier. lots of connection refused to the 10.0.3.0/24 network. Does the no_proxy setting take CIDR?
<a123> that network is attached to lxdbr0
<a123> What I should ask is, does juju understand CIDR if used in the no_proxy env var?  ie. export no_proxy=10.0.3.0/24
<mgz> a123: no, no_proxy doesn't take a cidr
<a123> thanks.
<davecheney> i _think_ no_proxy is just a sort of match string
<davecheney> export no_proxy=10.
<davecheney> ^ not tested
<mgz> just a comma separated list of "domain extensions"
<mgz> so, it's suffixes, not prefixes
<mgz> valid: no_proxy=.com
<mgz> no_proxy=.255
<a123> I've found this to be application dependent in the past.
<a123> ah. nice.
<mgz> not valid: no_proxy=10. no_proxy=10.* etc etc
<mgz> we plug this stuff into wget in some places so are limited by what wget supports
<a123> oh. It wasn't clear to me if wildcards would work.
<a123> ok. wget is the driver. I think I know why my debug is different now. I used the --keep-broken flag when running the bootstrap in hopes of finding answers. How do you destroy the model?
<davecheney> juju kill-controller $controller
<davecheney> from memory
<davecheney> kill is the more finite form of destroy
<davecheney> which tends to not actually destroy things 'cos it's a wimp
<a123> so much I don't know. Is the controller a different concept than a model? It looks like a controller never got created when using the --keep-broken flag.
<a123> so. juju list-controllers returns an empty table. When I then run: juju list-models I get: error: controller local.lxd-test not found. What exactly does --keep-broken do?
<a123> thanks for the help everyone. I'm not confident my environment is in a good state. I'm going to tear down the VM, bring up a fresh one and try again tomorrow.
<thumper> menn0: a few questions on your review
 * thumper goes to make coffee
<menn0> thumper: ok, looking
<menn0> thumper: good point... these APIs copy the existing APIs used by juju ssh/scp exactly but it's probably worth making sure they at least work for IPv6 too
<menn0> thumper: hangout to discuss?
<thumper> menn0: sure, gimmie 5?
<menn0> yep
 * thumper wants to enjoy his coffee first
<mup> Bug #1575463 opened: buildSuite.TestGetVersion* CryptAcquireContext: Provider DLL failed to initialize <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1575463>
<davecheney> menn0: are you looking at this bug https://bugs.launchpad.net/juju-core/+bug/1458585
<mup> Bug #1458585: SSHGoCryptoCommandSuite.TestCommand fails <ci> <go1.6> <intermittent-failure> <regression> <test-failure> <wily> <xenial> <juju-core:Incomplete> <juju-core 1.23:Won't Fix> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1458585>
<davecheney> or a duplicate of it ?
<menn0> davecheney: no and no
<menn0> davecheney: I'm working on making juju ssh/scp use the actual SSH host keys of the machine being connected to
<menn0> davecheney: close to being done
<davecheney> ok, related, but not the same issue
<davecheney> thanks
<thumper> wallyworld: when do we add things into the cloud metadata storage?
<thumper> wallyworld: is it "normal" behaviour for clouds
<thumper> or more used for custom openstack stuff?
<wallyworld> thumper: you mean keystone?
<wallyworld> or the metadata url
<wallyworld> we still allow the use of keystone for simplestreams
<wallyworld> but i don't think we've used the metadata url for ages
<wallyworld> we now bootstrap differently
<wallyworld> it used to be a way to pass bootstrap instance info across
<mup> Bug #1575469 opened: liveSuite.TestBootstrapMultiple invalid character \"\\\\\" in host name <ci> <go1.6> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1575469>
<mup> Bug #1575472 opened: Data Race github.com/juju/juju/environs/tools/build.g <ci> <race-condition> <regression> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1575472>
<thumper> wallyworld: I mean the state cloudmetadataC collection
<wallyworld> thumper: we cache the simplestreams metadata there at bootstrap and whenever we poll cloud-images
<thumper> k
<davecheney> menn0: https://github.com/juju/juju/pull/5289
<davecheney> i heard u were on call review
<menn0> davecheney: looking
<menn0> davecheney: looks fine. what's changed in x/crypto that fixes the issue?
<davecheney> menn0: who knows, it's been a year since we updated that dependency
<davecheney> and there have been heaps of bug fixes to the crypto repo
<davecheney> i'm deliberately phrasing it like this because I don't want to backport anything
<menn0> davecheney: ah right... so you didn't find a specific upstream change that fixed the issue
<davecheney> i didn't even look
<davecheney> upgrading to tip fixed the problem
<menn0> davecheney: well you have a ship it
<davecheney> Do we need to maintain compatibility with Go 1.2 ?
<davecheney> the build bot just failed to land a branch 'cos it tried to build with Go 1.2
<dimitern> axw: are you around by any chance?
<dimitern> or wallyworld ?
<wallyworld> hey, otp
<axw> dimitern: yo, I am
<dimitern> axw, wallyworld: hey, just a quick question, if you happen know
<dimitern> in provider/ec2 can we now only filter instances by model tags rather than secgroups ?
<dimitern> axw: I know you did something around that lately
<axw> dimitern: not yet, tried to land today but master is blocked
<dimitern> I'm working on bug 1321442 and it will be a lot easier not to have to make group filters work with explicit VPC ID
<mup> Bug #1321442: Juju does not support EC2 with no default VPC <ec2-provider> <network> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1321442>
<dimitern> axw: I see, ok - I'll look into your changes
<axw> dimitern: hmm you know, I just remembered a reason why we may not want to do away with the group filtering.
<axw> dimitern: tags aren't added immediately after creating an instance...
<dimitern> axw: ah, bummer
<axw> sorry
<axw> dimitern: I mean, they're not added until after they're created
<axw> dimitern: so there's a window where they could leak
<dimitern> axw: np, it makes sense in an AWS world I guess
<dimitern> axw: ok, cheers, I'll do group filtering then as well
<axw> dimitern: and I'll revert that bit in my PR :)
<axw> cheers
<dimitern> you gotta love how inconsistent the AWS API is at times :[
<dimitern> lol `deleteSecurityGroupInsistently`
<fwereade> jam, tech board?
<voidspace> dimitern: ping
<dimitern> voidspace: pong
<voidspace> dimitern: I've addressed two of your comments from your review of the devices branch
<voidspace> dimitern: you suggest just adding a test for the multi nic case as it's basically done
<voidspace> dimitern: however, adding a test means some test infrastructure work as the multi-nic path calls additional provider methods
<voidspace> dimitern: not difficult but not nothing
<dimitern> voidspace: it looked like that, but yeah - can be a follow up
<voidspace> dimitern: in addition there is the known bug with the multi-nic case
<voidspace> dimitern: I'd rather land as is - with working and tested single nic containers
<dimitern> voidspace: what did you find about the missing subnet?
<voidspace> dimitern: not uncovered it yet - still digging
<voidspace> dimitern: should be done and dusted today though, can't be *that* hard to find :-)
<voidspace> dimitern: so I'll land as is and land multi-nic tests along with the bug fix (assuming it's not a maas bug)
<voidspace> dimitern: ok?
<dimitern> voidspace: you know what occurred to me yesterday: it might be due to unchecked 0 ids for fabrics or vlans
<voidspace> dimitern: possibly
<voidspace> dimitern: I'm going to try always setting VLAN ID even if it isn't changing in that update call
<dimitern> voidspace: e.g. fabric-0 is always there, but we might not have meant that
<voidspace> yeah
<dimitern> voidspace: ok, if you don't mind let me do a quick bootstrap with your branch now
<voidspace> dimitern: sure
<voidspace> I'm doing the same, with modified gomaasapi
<dimitern> ok
<voidspace> forcing the VLAN to be specified didn't help
<voidspace> adding more debugging
<voidspace> dimitern: anyway, I'm on it
<dimitern> voidspace: can you paste the outputs of 'maas <profile> subnets read', 'fabrics read', and 'machines read' for your maas2?
<voidspace> dimitern: http://pastebin.ubuntu.com/16076113/
<dimitern> voidspace: thanks, here's mine for comparison: http://paste.ubuntu.com/16076129/
<voidspace> dimitern: I'm going to try allocating the machines and making the maas calls manually
<voidspace> dimitern: after this bootstrap to confirm it's the CreateDevice call that fails
<dimitern> voidspace: sgtm, I'm bootstrapping now
<voidspace> dimitern: I'd still like to land single nic support
<dimitern> voidspace: I don't disagree with that :)
<voidspace> cool
<dimitern> voidspace: are both of your first 2 fabrics managed? (0 and 1)
<dimitern> i.e. subnets 172.16 and 172.17
<voidspace> dimitern: gah, master blocked
<voidspace> dimitern: subnets are managed if that's what you mean
<voidspace> I don't know what a managed fabric is
<voidspace> both subnets
<voidspace> dimitern: ok, I was wrong - it gets past CreateDevice and UpdateInterface
<voidspace> dimitern: so I need to add more instrumentation and redo
<voidspace> dimitern: I've got this anyway
<voidspace> dimitern: I'll ping you when I've made progress
<dimitern> voidspace: ok, your maas2 setup looks fine - so the issue must be related to how we're calling the api
<voidspace> yep
<dimitern> voidspace: still the same thing - no params (with vlan) passed to update the primary NIC of the device
<voidspace> dimitern: standup old boy
<TheMue> morning
<axw> babbageclunk: it would appear you're the lucky OCR today, would you please take a look at http://reviews.vapour.ws/r/4718/diff/#?
<axw> fixes master blocker
<voidspace> dimitern: I see vlan=5001 in the params of the PUT
<dimitern> voidspace: and I can't :/
<dimitern> voidspace: and with the vlan set do you still get 500 ?
<voidspace> dimitern: wait, that's the wrong vlan though
<axw> dimitern voidspace: can either of you take a look at the tiny PR above? it should unblock master
<axw> well, with the juju/juju change coming after
<voidspace> dimitern: I do get 500
<voidspace> dimitern: but I think that's the wrong vlan
<voidspace> dimitern: interesting
<dimitern> axw: LGTM
<axw> ta
<dimitern> voidspace: aha!
<dimitern> voidspace: 5001 should be the untagged vlan of your fabric-0
<dimitern> voidspace: and it should be the same as the host's first NIC's VLAN
<voidspace> dimitern: however this subnet comes straight from MAAS
<voidspace> but "spaces read" seems to give the right thing
<voidspace> dimitern: so in gomaasapi it should be args.Subnet.VLAN() not iface.VLAN()
<voidspace> dimitern: and in the single NIC case it works because there's only one
<voidspace> dimitern: trying that
<dimitern> voidspace: exactly!
<voidspace> I think that's it
<voidspace> trying it now
<dimitern> voidspace: the iface is the device interface, while args.Subnet.VLAN is the one we want on iface
<voidspace> yep
<voidspace> gaah, that took a long time
<voidspace> at least we get to blame Tim
<dimitern> :)
<voidspace> and a chunk of the time was getting a multi-nic MAAS setup which is worthwhile work
<dimitern> voidspace: \o.
<dimitern> \o/ even
<dimitern> voidspace: and it also helped your maas2 setup is a bit different than mine
<voidspace> yep
<dimitern> voidspace: are you using 2.0.0 beta4, more recent than bzr 4936?
<voidspace> dimitern: 4941
<voidspace> dimitern: beta3 though!
<voidspace> dimitern: I've got past that point with no error
<voidspace> dimitern: and I see a device with two nics and two IP addresses
<dimitern> voidspace: hmm, well you *might* hit this bug 1572070
<mup> Bug #1572070: MAAS 2.0 cannot link physical device interfaces to tagged vlans, breaking juju 2.0 multi-NIC containers <juju> <MAAS:Fix Committed by blake-rouse> <MAAS 1.9:Fix Committed by blake-rouse> <https://launchpad.net/bugs/1572070>
<babbageclunk> axw: Sure - sorry, missed this until now.
<dimitern> voidspace: sweet! then it should work the rest of the way
<voidspace> dimitern: yep, so I'll propose a fix for gomaasapi and then tests for multi-nic
<voidspace> frobware: babbageclunk: it turned out to be a bug in gomaasapi
<dimitern> voidspace: but that bug will be relevant with mult-nic only
<voidspace> dimitern: ok, thanks
<voidspace> dimitern: the linking is already done (my vlans are untagged)
<voidspace> dimitern: so I think I've got past that
<babbageclunk> voidspace: nice
<dimitern> voidspace: well since the physical nics of the host are on untagged VLANs it only will become an issue trying to create a second physical NIC of the device linked to a tagged VLAN
<voidspace> dimitern: container is running fine
<dimitern> voidspace: no WARNINGs in the log around provisioning / PrepareContainerInterfaceInfo ?
<voidspace> dimitern: only about no DNS settings found
<voidspace> dimitern: nothing else
<babbageclunk> axw: dimitern: d'oh, should've reloaded
<voidspace> dimitern: and I can ssh fine
<voidspace> dimitern: case closed on that one
<dimitern> voidspace: awesome! so the device has eth0 and eth1 linked to br-eth0 and br-eth1 on the host?
<voidspace> dimitern: this is e/n/i on the container http://pastebin.ubuntu.com/16076575/
<dimitern> voidspace: slightly odd to see `dns-nameservers 172.16.0.2` on eth0 which has address 172.17.0.4/24
<voidspace> dimitern: it adds the nameservers to the first stanza
<dimitern> but otherwise looks solid
<voidspace> yeah
<dimitern> voidspace: yeah - and only the .16 subnet has dns_servers set?
<voidspace> dimitern: this is the host http://pastebin.ubuntu.com/16076591/
<voidspace> dimitern: you only need one nameserver entry
<voidspace> they're global
<voidspace> neither dns is set in maas (both 0.0.0.0) and both are managed
<dimitern> voidspace: I see, ok - does 'ping bbc.co.uk' work while inside the container?
<voidspace> dimitern: yep
<dimitern> voidspace: we're nearly done then :)
<voidspace> dimitern: can't test that change easily in gomaasapi - you can specify responses in the test server it uses but not verify requests it seems
<voidspace> you can fetch the last request, but it makes another after the update
<voidspace> looking to see if I can get at the one before last :-)
<voidspace> hah, no
<voidspace> ah, I can check the VLAN
<voidspace> should be ok
<dimitern> voidspace: sure, on it
<voidspace> dimitern: are you doing it?
<voidspace> dimitern: I have a branch with a fix
<dimitern> voidspace: haven't started
<dimitern> doing a few things at once as usual
<voidspace> dimitern: a proper test involves a lot of json (new interface, vlan and subnet in json)
<voidspace> dimitern: I have this though https://github.com/juju/gomaasapi/compare/master...voidspace:maas2-create-device-vlan
<voidspace> as there's only one interface on the test machine the test doesn't fail without the fix
<voidspace> I suggest we land this, I'll work on juju and come back for a multi-nic test for gomaasapi later
<dimitern> voidspace: looks good
<voidspace> babbageclunk: https://github.com/juju/gomaasapi/pull/47
<voidspace> coffee
<babbageclunk> voidspace: I like it
<babbageclunk> voidspace: !
<babbageclunk> voidspace: ahh, missed your discussion about the test above.
<voidspace> babbageclunk: yeah, it needs a better test - but creating a machine with multiple interfaces, subnets and vlans for the gomaasapi test harness is a pain
<voidspace> babbageclunk: I'll do it later
<voidspace> babbageclunk: it's no worse tested than it was before ;-)
<voidspace> dimitern: I've updated http://reviews.vapour.ws/r/4700/
<voidspace> dimitern: includes gomaasapi revision bump and a happy path test for multi-nic
<voidspace> is master unblocked yet
<voidspace> dimitern: still more tests needed
<dimitern> voidspace: still looks good to land, and I'll do another quick live test with it
<voidspace> dimitern: that would be good, see if it works for you
<voidspace> master is still blocked
<alexisb> voidspace, dimitern, frobware, babbageclunk please JFDI any maas2 related PRs
<alexisb> master is blocked but maas2 stuff is an exception here
<dimitern> alexisb: thanks!
<dimitern> voidspace: it looks a lot better: bootstrap ok, switch to admin, add-machine, then add-machine lxd, lxc, kvm to both :0 and :1
<dimitern> voidspace: but only machine-1 containers all came up ok and have expected addresses, the ones on machine-0 failed with host machine device "br-eno2" has no address, and I was digging into the logs to figure out why
<mup> Bug #1575676 opened: Hard to use non-default LXD bridge <landscape> <juju-core:New> <https://launchpad.net/bugs/1575676>
<alexisb> thank you katco !
<katco> alexisb: np... happy that we're giving capacity planning a little more attention
<katco> alexisb: also revealing future targets :)
 * katco begins her morning routine
<voidspace> alexisb: ok
<voidspace> dimitern: machine-0 ones didn't work?
<voidspace> picking up daughter from school
<voidspace> back in 15mins
<dimitern> voidspace: nope, something's odd, still investigating and adding more logging
<voidspace> dimitern: alexisb: frobware: babbageclunk: container support has landed on master
<frobware> voidspace: nice work!
<babbageclunk> voidspace: awesome!
<dimitern> voidspace: great!
<dimitern> I'm still debugging the issue on machine-0
<voidspace> dimitern: ok
<voidspace> dimitern: I'll try adding a node to my maas with two nics and try that
<voidspace> dimitern: I've *only* deployed to machine-0
<dimitern> voidspace: in the admin model?
<voidspace> dimitern: yep
<voidspace> dimitern: commissioning an additional node now
<alexisb> \o/
<voidspace> dimitern: when I commission a node with two nics the second nic comes up as "unconfigured" (no address) whereas the primary nic is "auto assign"
<voidspace> dimitern: I can manually change it to auto assign
<voidspace> dimitern: if I don't do that then I think I only get a single nic on the container
<dimitern> voidspace: otp
<voidspace> successfully created a container with two nics on a new machine not in the admin model
<mup> Bug #1574809 changed: "to: lxc:0" ignored in bundle <bundles> <juju-release-support> <kanban-cross-team> <landscape> <juju-core:Invalid> <https://launchpad.net/bugs/1574809>
<dimitern> voidspace: yeah, you can set them to auto or static (unconfigured physical + auto/static VLAN children won't work though)
<dimitern> voidspace: nice! I added more logging and now trying again
<babbageclunk> dimitern, voidspace - hmm, trying to add a machine with a space constraint gives a schema error - it looks like the maas response might have changed since we put the constraint matching code in.
<babbageclunk> It's returning an array of interface/device ids, not just one.
<babbageclunk> I'm going to fix gomaasapi - do you think that just taking the first is the right thing to do? I guess I should ask the MAAS people what it would mean for the result to have more than one value.
<dimitern> babbageclunk: can you point me to the code you're talking about (causing the issue)?
<babbageclunk> dimitern: https://github.com/juju/gomaasapi/blob/master/controller.go#L810
<babbageclunk> dimitern: compared to this response: http://pastebin.ubuntu.com/16083467/
<mgz> voidspace: where is the gomaasapi/bootresource.go stuff meant to come from? bootstrapping complains about missing kflavor schema
<babbageclunk> mgz - if you do "maas <session> boot-resources read" can you see kflavor in the items that come back/
<babbageclunk> ?
<mgz> babbageclunk: a bunch of them do, some do not (centos ones)
<dimitern> babbageclunk: hmm
<dimitern> babbageclunk: so looking at the response you pasted, the schema seems correct
<babbageclunk> mgz: that makes it sound like it's an optional field that we didn't know was optional.
<mgz> babbageclunk: it does indeed, I'll file a bug?
<babbageclunk> mgz: yes please!
<babbageclunk> dimitern: will ForceInt accept an array?
<dimitern> babbageclunk: I think the ForceInt applies to the values of the StringMap
<babbageclunk> dimitern: sure, but the values are lists, right?
<babbageclunk> dimitern: I'm going to write a little test to check
<dimitern> babbageclunk: i.e. given a {"constraints_by_type":{"storage":{"foo":1,"bar":2}},"interfaces":{"aa":1,"bb":2}} it should parse it
<mup> Bug #1575760 opened: Juju switch returns confusing error message <juju-core:New> <https://launchpad.net/bugs/1575760>
<babbageclunk> dimitern: but the one coming from maas has "interfaces": {"default": [24]}
<dimitern> babbageclunk: you're correct! so specifying "interfaces=foo:space=0" gives me multiple items in the list for "foo"
<babbageclunk> dimitern: is that because any of those interfaces are in the right space?
<babbageclunk> dimitern: I mean, all
<dimitern> babbageclunk: yeah - if you have 1 space only (space-0 a.k.a. default) and you pass that to interfaces, you'll get all nodes back
<dimitern> babbageclunk: so I guess it needs to be something like "interfaces": schema.StringMap(schema.List(schema.ForceInt()))
<babbageclunk> dimitern: yeah, I think so.
<babbageclunk> dimitern: And I think we need to change gomaasapi.ConstraintMatches to have slices of BlockDevice and Interface as well then.
<dimitern> babbageclunk: indeed
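The shape change babbageclunk found can be seen with plain JSON decoding: each constraint label now maps to a *list* of interface IDs rather than a single ID. This is an illustrative sketch only — the real gomaasapi code coerces the response with juju/schema's StringMap(List(ForceInt())) as proposed above, not struct tags:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseInterfaces decodes the "interfaces" section of a MAAS 2.0 allocate
// response, where each label maps to a list of matching interface IDs.
func parseInterfaces(raw []byte) (map[string][]int, error) {
	var resp struct {
		ConstraintsByType struct {
			Interfaces map[string][]int `json:"interfaces"`
		} `json:"constraints_by_type"`
	}
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	return resp.ConstraintsByType.Interfaces, nil
}

func main() {
	// Shape matching the pasted response: {"interfaces": {"default": [24]}}
	raw := []byte(`{"constraints_by_type": {"interfaces": {"default": [24, 25]}}}`)
	ids, err := parseInterfaces(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(ids["default"]) // all interfaces matching the "default" label
}
```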
<babbageclunk> dimitern: once it gets into the provider, can we just pick any one of them?
 * babbageclunk looks in environ.go for how we use the matches.
<dimitern> babbageclunk: and the way ids are handled in the loops below (id := value.(int) -> ids := value.([]int) and range over it)
<dimitern> babbageclunk: you mean, if you have more than one interface ID per label?
<dimitern> babbageclunk: does not matter if we only want a given space to be accessible on the allocated machine
<babbageclunk> dimitern: yeah, actually we only use the constraint matches for storage information.
<dimitern> babbageclunk: but why do you need those results from constraintsMap ? the 1.0 code path ignores them as we don't care (i.e. we read all interfaces along with which subnets and spaces they're linked to each time we call NetworkInterfaces())
<dimitern> babbageclunk: :) yep
<dimitern> we probably should at some point, but so far we don't need it
<babbageclunk> dimitern: ok, in the storage case should that just be another loop over the block devices that come back
<babbageclunk> ?
<dimitern> babbageclunk: I suspect so, but have a look how 1.0 code path does it (and its tests)
<babbageclunk> dimitern: context: https://github.com/juju/juju/blob/master/provider/maas/volumes.go#L295
<mgz> voidspace, babbageclunk: filed bug 1575768
<mup> Bug #1575768: boot resource 2.0 schema check failed: kflavor: expected string, got nothing <bootstrap> <ci> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1575768>
<babbageclunk> dimitern: It was structured differently in the v1 API - looks like it was id -> label, so you could have multiple ids going to the same label
<dimitern> babbageclunk: it doesn't look like the 2.0 code path is using physicalblockdevice_set at all
<babbageclunk> thanks mgz - I'll start on that now.
<babbageclunk> dimitern: no - gomaasapi handles that lookup
<babbageclunk> dimitern: in controller.go: parseAllocateConstraintsResponse
<babbageclunk> mgz: do you know what MAAS version that is?
<mgz> babbageclunk: beta3+bzr4941
<babbageclunk> mgz: Thanks.
<mgz> it's upgraded from 1.9, which would have done the original image imports
<dimitern> babbageclunk: I need to go, but in case it might help, here's a couple of outputs with a set storage constraint on 2.0 (http://paste.ubuntu.com/16084450/) and 1.0 (http://paste.ubuntu.com/16084458/)
<babbageclunk> dimitern: thanks! I'm not going to get to fixing it tonight, should fix mgz's issue first
<dimitern> mgz: that's a bit too old can't you upgrade to the latest beta4 from the experimental3 ppa?
<dimitern> babbageclunk: sounds good, cheers!
<babbageclunk> dimitern: mgz: I wasn't sure whether that was something I could demand. :)
<dimitern> aw c'mon mgz's a pal :)
<babbageclunk> mgz: the version we're working against is beta4+bzr4944
<mgz> dimitern: I can, I didn't see an announcement from roaksoax about it
<mup> Bug #1575764 opened: Juju doesn't detect lxd container IP address changes <juju-core:New> <https://launchpad.net/bugs/1575764>
<mup> Bug #1575768 opened: boot resource 2.0 schema check failed: kflavor: expected string, got nothing <bootstrap> <ci> <maas-provider> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1575768>
<mup> Bug #1575769 opened: Can't "forget" a controller that I've lost access to <juju-core:New> <https://launchpad.net/bugs/1575769>
<dimitern> mgz: they don't always mail when they do a point release I think
<dimitern> anyway, I'm outta here
<babbageclunk> mgz: it has a few bugfixes we need, so I guess you'll need to upgrade anyway.
<mup> Bug #1575769 changed: Can't "forget" a controller that I've lost access to <juju-core:New> <https://launchpad.net/bugs/1575769>
<mgz> babbageclunk: dist-upgrading first
<babbageclunk> mgz: I'll try uploading a centos image to my local MAAS and see if that also leaves out kflavor.
<mup> Bug #1575769 opened: Can't "forget" a controller that I've lost access to <juju-core:New> <https://launchpad.net/bugs/1575769>
<babbageclunk> mgz: yup, it still leaves out kflavor on my local beta4 one
<voidspace> babbageclunk: have you got a fix for that issue?
<mup> Bug #1575769 changed: Can't "forget" a controller that I've lost access to <juju-core:New> <https://launchpad.net/bugs/1575769>
<babbageclunk> voidspace: I think we just need to mark it optional in the schema.
<voidspace> babbageclunk: do you have a branch with that?
<babbageclunk> voidspace: The provider doesn't use it - we only care about architectures
<babbageclunk> voidspace: not with the change yet - just reproducing the bug.
<voidspace> babbageclunk: cool
<redir> cherylj: looking
<cherylj> redir: at what?
<cherylj> what did I say?
 * babbageclunk lols
<redir> cherylj: at your PR
<cherylj> haha
<redir> :p
<voidspace> babbageclunk: reading scrollback - weird about the device id array
<voidspace> babbageclunk: did you ask the maas folk about that?
<voidspace> I've just upgraded to beta4 - was using next-proposed instead of experimental3
<mgz> there are too damn many maas ppas
<babbageclunk> voidspace: no, haven't yet
<babbageclunk> voidspace: but the new way is right - if the mapping's gone from {id: label} to being keyed by label, the values need to be lists
<mgz> babbageclunk: I don't see a beta4 in any of the ~maas ppas
<mgz> it's in someones private one?
<mup> Bug #1575769 opened: Can't "forget" a controller that I've lost access to <juju-core:New> <https://launchpad.net/bugs/1575769>
<babbageclunk> mgz: I think it's this one: ppa:maas-maintainers/experimental3
<frobware> jam, tych0: PTAL @ http://reviews.vapour.ws/r/4722/
<tych0> frobware: oh, derp :)
<frobware> tych0: you were doing the same thing?
<tych0> no, just that it's a dumb bug on my part
<mup> Bug #1575769 changed: Can't "forget" a controller that I've lost access to <juju-core:New> <https://launchpad.net/bugs/1575769>
<tych0> i didn't realize Initialize was called more than once, actually
<tych0> but it makes sense that it is
<tych0> frobware: looks good to me
<frobware> tych0: ok, will do some testing tomorrow morning (as master is blocked).
<frobware> tych0: thx
<tych0> frobware: np, thanks for the fix
<mup> Bug #1575794 opened: Agent config format version should be changed for 2.0 <juju-release-support> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1575794>
<voidspace> babbageclunk: mgz: that's the right one - I just had to switch to it
<voidspace> babbageclunk: ok, the kflavor ones sounds easier
<voidspace> babbageclunk: I can propose that
<mgz> babbageclunk, voidspace: similar issue with 'mount_point'
<mgz> on filesystem2_0
<babbageclunk> voidspace: https://github.com/juju/gomaasapi/pull/48
<babbageclunk> voidspace: can you do the mount_point one?
<voidspace> babbageclunk: gah, you beat me to it
<babbageclunk> sorry!
<voidspace> babbageclunk: mount_point needs to be optional?
<mgz> what command is this coming from?
<voidspace> babbageclunk: I fixed it by supplying a default instead
<babbageclunk> voidspace: oh, that's nicer
<mup> Bug #1575794 changed: Agent config format version should be changed for 2.0 <juju-release-support> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1575794>
<mgz> hm, block-device, which needs a machine
<voidspace> babbageclunk: https://github.com/juju/gomaasapi/compare/master...voidspace:maas2-kflavor-optional
<voidspace> babbageclunk: and then you always get a value even if it's missing
<voidspace> mgz: I can make mount_point optional in filesystem
<babbageclunk> voidspace: yeah - do that instead
<voidspace> babbageclunk: ok, I'll add making mount_point optional in the same branch
<mgz> voidspace: some of my filesystems are null
<babbageclunk> mgz: you should be able to see it with machines read
<voidspace> the whole filesystem?
<voidspace> mgz: what should we default to for mount_point: "/" ?
<mgz> voidspace: http://paste.ubuntu.com/16085749
<babbageclunk> voidspace, mgz: sorry, I have to head home
<voidspace> I have to go in five minutes too
<voidspace> mgz: when you say "some of my filesystems are null" you mean mount_point, label and mount_options being null
<voidspace> ah no
<voidspace> the whole filesystem is null
<voidspace> geez
<mgz> :)
<voidspace> mgz: this is now a tomorrow problem, sorry
<voidspace> wife is calling me to dinner
<mgz> voidspace: I'll file you a bug son
<voidspace> mgz: can you link to this pastebin on it please
<mgz> will also put up a non-voting job for maas 2.0
<mup> Bug #1575794 opened: Agent config format version should be changed for 2.0 <juju-release-support> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1575794>
<mgz> to avoid cursing on this stuff
<voidspace> mgz: much appreciated - thanks
<voidspace> mgz: yeah
<voidspace> weird-ass maas configuration
<voidspace> right, dinner
<voidspace> sorry
<mup> Bug #1575797 opened: AddressableContainerSetupSuite.TestContainerInitialised lxc-net: no such file or directory <arm64> <centos> <ci> <regression> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1575797>
<mup> Bug #1575808 opened: filesystem 2.0 schema check failed: mount_point: expected string, got nothing <bootstrap> <ci> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1575808>
<mup> Bug #1575808 changed: filesystem 2.0 schema check failed: mount_point: expected string, got nothing <bootstrap> <ci> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1575808>
<mup> Bug #1575808 opened: filesystem 2.0 schema check failed: mount_point: expected string, got nothing <bootstrap> <ci> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1575808>
<redir> ericsnow: you around?
<ericsnow> redir: yep
<redir> can I borrow your eyes for a minute?
<redir> ericsnow: ^^
<ericsnow> redir: sure
<redir> ericsnow: am in moonstone
<natefinch> uhh... redir, cherylj, are the tests supposed to pass on xenial?
<natefinch> I'm getting a lot of this:  obtained string = "xenial"... expected string = "trusty"
<natefinch> katco, ericsnow ^ ?   This is just running tests on master
<katco> natefinch: what does dist-info --lts return for you?
<ericsnow> natefinch: I don't believe we've landed any fixes for that yet
<katco> natefinch: sorry distro-info --lts
<ericsnow> natefinch: redir is working on a comprehensive patch
<natefinch> ahh ok
<natefinch> katco: xenial
<mup> Bug #1575895 opened: juju loses apt-http/s-proxy information if a model is deleted and a new one created <juju-core:New> <https://launchpad.net/bugs/1575895>
<redir> natefinch: getting close with some help from ericsnow just now. Making sure it works then reviewing previous commits to see if this works on those and reverting them to keep things uniform.
<redir> Hopefully done RSN(tm)
<natefinch> redir: \o/
<redir>  /o\
<cmars> ah, resources are awesome
<natefinch> cmars: glad you like them.  I do think they're pretty great
<cmars> natefinch, updating my mattermost charm to use them
<natefinch> cmars: nice!
<natefinch> cmars: let us know if there's any problems or unexpected behavior
<katco> cmars: :D
<perrito666> wallyworld: ping me when you are around please
<wallyworld> perrito666: hey. just about to go into a meeting
<perrito666> k
<mup> Bug #1575940 opened: LXC containers under MAAS get no "search <domain>" entry in resolv.conf when deployed with juju2 <juju-core:New> <https://launchpad.net/bugs/1575940>
<alexisb> menn0, ping
<menn0> alexisb: hey hey
<alexisb> are you available to join the leads call
<menn0> alexisb: ah crap... sorry
<cherylj> davecheney: ping?
<perrito666> wallyworld: lemme know if you have a spot before the standup
<wallyworld> perrito666: stuck in meetings, will let you know. may have to move standup depending on how meetings go
<perrito666> k
<wallyworld> perrito666: axw: anastasiamac_: redir: quick standup between meetings?
<cherylj> redir: will you need to coordinate with CI to not hack distro-info for your PR?
<redir> cherylj: If I have this right it should pass as the bots are, and when they unhack them as well.
<cherylj> redir: sweet!
<redir> so master is blocked
<redir> should I target something else for a PR?
#juju-dev 2016-04-28
<mup> Bug #1575403 changed: juju2: juju deploy --to and -n not usable together? <landscape> <juju-core:Invalid> <https://launchpad.net/bugs/1575403>
<davecheney> cherylj: https://github.com/juju/juju/pull/5297
<mup> Bug #1575983 opened: Having 2 machine-0 is confusing <juju-core:New> <https://launchpad.net/bugs/1575983>
<mup> Bug #1575983 changed: Having 2 machine-0 is confusing <juju-core:New> <https://launchpad.net/bugs/1575983>
<mup> Bug #1575983 opened: Having 2 machine-0 is confusing <juju-core:New> <https://launchpad.net/bugs/1575983>
<davecheney> menn0: https://github.com/juju/juju/pull/5297
<davecheney> do you have a sec for a small review of a blocker
<menn0> davecheney: give me 2 minutes
<menn0> davecheney: looking now
<axw> davecheney: LGTM
<menn0> davecheney, axw: I have no real issue with the change but I don't understand why there was a data race to begin with
<menn0> can you explain it?
<axw> menn0: only if Stdout==Stderr exactly will cmd/exec guarantee that only one goroutine writes to one at a time
<axw> (I wondered too)
<menn0> davecheney, axw: in that case, would this be a better solution? http://paste.ubuntu.com/16090295/
<menn0> then we don't lose the interleaving of stdout and stderr
<axw> that assumes there's no stderr in the success case... which probably should always be true
 * axw shrugs
<axw> just use CombinedOutput in that case then
<menn0> true
<menn0> whatever... it's a minor thing and I guess there is the risk that there might be something on stderr for the success case
<menn0> davecheney, axw: just leave it
<redir> wallyworld: cherylj: whomever... http://reviews.vapour.ws/r/4724/
<wallyworld> looking
<davecheney> menn0: interleaving of stdout/stderr is undefined
<davecheney> it'll make the test brittle
<davecheney> menn0: there is a data race because var both bytes.Buffer is being written by two writers
<davecheney> menn0: but we're not looking for stderr, we're looking for stdout, which is the juju version
<davecheney> the code cannot cope with anything on stderr
<menn0> davecheney: all good. your change avoids any unnecessary complexity.
<davecheney> menn0: fwiw i wish the test wasn't written like that
<davecheney> the test is written to cope with the fact that the mock exec.Command is weak
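The guarantee axw refers to is the one documented for os/exec: concurrent writes to Stdout and Stderr are serialized only when both fields hold the *exact* same writer value. A minimal sketch of the safe pattern — this assumes a POSIX `sh` is available, and is an illustration rather than the test code under review:

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// runBoth collects a command's stdout and stderr into one buffer. os/exec
// serializes the two output goroutines only when Stdout and Stderr are the
// identical writer, so the same *bytes.Buffer is assigned to both; two
// distinct buffers (or differently wrapped writers) lose that guarantee
// and race, as discussed above.
func runBoth(name string, args ...string) (string, error) {
	var both bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &both
	cmd.Stderr = &both // identical writer => at most one goroutine writes at a time
	err := cmd.Run()
	return both.String(), err
}

func main() {
	out, err := runBoth("sh", "-c", "echo out; echo err 1>&2")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

exec.Cmd.CombinedOutput does the same thing internally, which is why it was floated as the convenient alternative when interleaving doesn't matter.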
<wallyworld> redir: i have some suggestions which had been on my mind before but which I've now enumerated in the review. happy to discuss
<menn0> axw: thanks for the review. a few things:
<menn0> the calls are bulk on the server side
<menn0> they're just not on the client side
<axw> menn0: ah, didn't look properly, sorry
<menn0> regarding passing tags, instead of machine or unit ids ... i'm fine with that
<menn0> the client will have to have the logic to figure out which is which but that's fine
<axw> menn0: yeah, my only concern with it is that the client would have to be updated to handle new entities. but I think that's never going to happen, or only extremely rarely
<menn0> I considered lumping the addresses together but the concept of preferred private and public addresses is baked right into the machine docs
<menn0> also, if they're returned together then the client needs to figure out which one to use
<menn0> maybe that's a good thing
<natefinch> davecheney, menn0: sorry for the crappy code/test
<menn0> but it would be quite a change to the current commands
<menn0> axw: ^^
<axw> menn0: OK, forget about that for now then
<axw> menn0: thanks
<menn0> axw: I agree the current approach isn't ideal
<davecheney> natefinch: it's ok
<menn0> axw: esp when there's mixed IPv4/6 deployments
<davecheney> it's not that crappy
<davecheney> patchvalue comes with it's own costs
<menn0> axw: I'll just make the calls take tags instead of raw ids for now
<axw> menn0: SGTM, thanks
<mup> Bug #1575983 changed: Having 2 machine-0 is confusing <juju-core:Invalid> <https://launchpad.net/bugs/1575983>
<mup> Bug #1576003 opened: Juju 2.0: default bootstrap-timeout insufficient for physical machines <juju-core:New> <https://launchpad.net/bugs/1576003>
<redir_> wallyworld: got a minute to discuss?
<wallyworld> sure
<redir_> tanzanite?
<wallyworld> standup hangout
<redir_> there
<wallyworld> axw: can you let me know when you have 5 mins?
<axw> wallyworld: I have 5 mins
<axw> maybe even more
<wallyworld> axw: standup ho
<mup> Bug #1576021 opened: 1.25.6 cannot deploy on CI maas 1.9 or 1.8 <ci> <maas-provider> <regression> <juju-ci-tools:Triaged> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1576021>
<axw> dimitern: FYI, http://reviews.vapour.ws/r/4625/diff/3-4/  -- see comments for an alternative method of filtering instances in ec2
<dimitern> axw: thanks! looking
<frobware> ping
<dimitern> axw: I was thinking about using client-token like this, but decided to go with a simpler approach for now - adapting the group-based selection
<axw> dimitern: okey dokey
<dimitern> frobware: was that for me?
<dimitern> axw: can you expand on this: The only danger is if we need to make non-idempotent calls to RunInstances for the machine ID ?
<frobware> dimitern, heh, nope. off-by-1. :)
<dimitern> ha :)
<axw> dimitern: if provisioning fails, then you do "juju retry-provisioning", then we will call StartInstances again. if we use a predictable ClientToken, then the result of StartInstances should always be the same
<babbageclunk> voidspace: so do you want to propose your kflavor/mount_point fix instead?
<voidspace> babbageclunk: https://github.com/juju/gomaasapi/pull/49
<axw> dimitern: that would be fine if RunInstances is always called with exactly the same args
<axw> (for a given machine)
<voidspace> babbageclunk: I'm doing mount_point / filesystem as a separate branch as it's a bit more involved
<voidspace> dimitern: https://github.com/juju/gomaasapi/pull/49
<voidspace> babbageclunk: dimitern: as it's a critical juju bug we'll also need to propose juju PR bumping gomaasapi
<babbageclunk> voidspace: I'll get on with the constraint schema change - it's also a bit more involved, means changing the gomaasapi interfaces.
<frobware> dimitern, voidspace, babbageclunk, dooferlad, jam: would appreciate a review for http://reviews.vapour.ws/r/4722/
<dimitern> axw: agreed, as the API docs say "A client token is valid for at least 24 hours after the termination of the instance"
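The client-token idea being weighed here can be sketched briefly: derive a stable token per machine so that a retried RunInstances call carries the same token and EC2 deduplicates it. The function name and inputs below are illustrative assumptions, not juju's actual provisioning code:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// clientToken derives a deterministic EC2 client token from the controller
// UUID and machine ID. Because the token is a pure function of its inputs,
// "juju retry-provisioning" (or a future automatic retry) calling
// RunInstances again for the same machine presents the same token, and EC2
// treats the retry as the same idempotent request. EC2 limits client tokens
// to 64 ASCII characters; a truncated hex digest fits comfortably.
func clientToken(controllerUUID, machineID string) string {
	sum := sha256.Sum256([]byte(controllerUUID + "/" + machineID))
	return fmt.Sprintf("%x", sum[:16])
}

func main() {
	a := clientToken("deadbeef-controller-uuid", "42")
	b := clientToken("deadbeef-controller-uuid", "42")
	fmt.Println(a == b) // same machine => same token on retry
}
```

As axw notes, this is only safe if RunInstances is always called with exactly the same arguments for a given machine — a changed request under the same token is the danger case.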
<voidspace> babbageclunk: this is a separate issue - the problem with space / storage constraints?
<babbageclunk> voidspace: yup
<dimitern> voidspace, frobware: looking at both
<voidspace> thanks
<babbageclunk> frobware: looking now
<axw> dimitern: also, at some point I want retry-provisioning to be replaced with automatic retries. so any potential problem would be exacerbated. I don't *think* there's any problem, but also don't want to change it this close to RC :)
<frobware> dimitern, voidspace, babbageclunk, dooferlad: might not be in standup. expecting Lenovo engineer to turn up with parts to fix my laptop.
<voidspace> frobware: ok
<dimitern> frobware: no standup today btw
<babbageclunk> oh yeah - big meeting at 11 instead?
<frobware> dimitern, ah, ok. was on autopilot
<dimitern> axw: it should like a daunting task to do that generically for all providers, and idempotently as well (at juju-level, possibly relying on cloud-specific features underneath, like client-token)
<dimitern> s/should like/sure looks like/
<voidspace> dimitern: babbageclunk: corresponding juju PR: https://github.com/juju/juju/pull/5301
<axw> dimitern: I don't think it's *that* hard, and we don't need to have the the idempotent provisioning if we reap unknown instances. but I'm sure I'll find gotchas if/when I do it :)
<dimitern> axw: hopefully in 2.1 or 2.2 :)
<dimitern> voidspace: that one also LGTM
<voidspace> dimitern: ta
<jam> frobware: I just reviewed http://reviews.vapour.ws/r/4722/
<voidspace> mgz: did you file a bug for the filesystem issue?
<frobware> jam, thanks
<voidspace> mgz: so the schema for partition can already handle filesystem being null
<dimitern> frobware: reviewed
<voidspace> mgz: so it's just mount_point and label need to be optional
<voidspace> mgz: the output you sent me - was that blockdevices? http://paste.ubuntu.com/16085749/
<voidspace> some of those have no partitions but a top-level filesystem, which we don't expose anyway
<voidspace> hmmm... that may bite us in the future
<voidspace> we'll see
<frobware> jam, dimitern: you both had comments about parsing the file and whitspace and trim, et al. I was treating this as a machine generated file - how far and how robust should we try to be?
<dimitern> frobware: I'm ok with assuming that, but when the format changes even a little, we might fail unpredictably (even silently), so more logging should help
<dimitern> frobware: even a single extra space before e.g. ` #LXD_IPV4_ADDR="0.1.2.0/24"`
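One way to be robust to the stray-whitespace case dimitern raises is to trim each line before classifying it as a comment or a KEY="value" pair. A sketch only, not the PR's actual parser:

```go
package main

import (
	"fmt"
	"strings"
)

// parseConfig tolerantly parses a machine-generated KEY="value" file such as
// an lxd-bridge config. Trimming surrounding whitespace *before* the comment
// check means a stray leading space in front of `#LXD_IPV4_ADDR="0.1.2.0/24"`
// still classifies the line as a comment rather than silently changing the
// parsed result.
func parseConfig(data string) map[string]string {
	values := make(map[string]string)
	for _, line := range strings.Split(data, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // blank or commented-out line
		}
		parts := strings.SplitN(line, "=", 2)
		if len(parts) != 2 {
			continue // not a KEY=value line; a real parser might log this
		}
		key := strings.TrimSpace(parts[0])
		value := strings.Trim(strings.TrimSpace(parts[1]), `"`)
		values[key] = value
	}
	return values
}

func main() {
	data := " #LXD_IPV4_ADDR=\"0.1.2.0/24\"\nLXD_IPV4_ADDR=\"10.0.8.1\"\n"
	fmt.Println(parseConfig(data)["LXD_IPV4_ADDR"])
}
```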
<voidspace> mgz: I found https://bugs.launchpad.net/juju-core/+bug/1575808 - which answers my questions
<mup> Bug #1575808: filesystem 2.0 schema check failed: mount_point: expected string, got nothing <bootstrap> <ci> <maas-provider> <juju-core:In Progress by mfoord> <https://launchpad.net/bugs/1575808>
<voidspace> dimitern: frobware: babbageclunk: https://github.com/juju/gomaasapi/pull/50
<dimitern> voidspace: LGTM +2 suggestions
<voidspace> dimitern: without the ok check it panics
<dimitern> voidspace: it panics, after the schema was coerced ?
<voidspace> dimitern: ah, you mean still use the two value form of the cast but ignore ok
<dimitern> yeah
<voidspace> dimitern: yep - because nil is a valid option
<voidspace> dimitern: and you can't cast nil to a string...
<voidspace> dimitern: your suggestion works
<dimitern> voidspace: no, it's already verified to be a string at that point
<voidspace> dimitern: nope, string or nil
<voidspace> but this works
<voidspace> value, _ := thing.(string)
<dimitern> voidspace: yeah, but (string) with ignored ok will yield "" for a nil value
<voidspace> dimitern: yes
<voidspace> dimitern: the single value form of the cast panics
<voidspace> not the double value form
<voidspace> dimitern: so thank you
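The panic/no-panic distinction here is standard Go type-assertion behavior, worth pinning down with a tiny example:

```go
package main

import "fmt"

// optionalString mirrors the fix above: after schema coercion an optional
// field holds either a string or nil. The single-value assertion
// coerced.(string) panics when coerced is nil; the two-value form with the
// ok result ignored yields the zero value "" instead.
func optionalString(coerced interface{}) string {
	value, _ := coerced.(string) // safe even when coerced is nil
	// value := coerced.(string) // would panic on a nil interface value
	return value
}

func main() {
	fmt.Printf("%q\n", optionalString(nil))       // missing optional field -> ""
	fmt.Printf("%q\n", optionalString("generic")) // present field passes through
}
```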
<dimitern> voidspace: ta
<voidspace> dimitern: updated
<voidspace> hmmm, no
<voidspace> *now* updated
<dimitern> voidspace: looking
<voidspace> dimitern: too late
<dimitern> babbageclunk: fair question btw - I've replied with some examples
<dimitern> voidspace: thanks :)
<voidspace> dimitern: frobware: babbageclunk: https://github.com/juju/juju/pull/5302
<voidspace> dimitern: thanks
<dimitern> voidspace: you're on fire today :D
<dimitern> nice!
<voidspace> dimitern: easy ones I can do...
<voidspace> heh
<axw> wallyworld anastasiamac_: team meeting?
<axw> dimitern: ^^
<mup> Bug #1576120 opened: "juju kill-controller" removes controllers.yaml entry even if destroying fails <juju-core:Triaged> <https://launchpad.net/bugs/1576120>
<dimitern> I won't make it to the team meeting :/
<wallyworld> axw: was at soccer
<frankban> axw: could you please take a look at a quick fix for macaroon login that we need in the GUI for next beta? https://github.com/juju/juju/pull/5305
<axw> frankban: sorry was afk, reviewed now
<axw> wallyworld: you gave mattyw's branch a shipit, but I don't think we should be ignoring errors ...
<axw> I mean it's not ignored, but we should surely be checking for specific types of errors
<wallyworld> axw: at that point, the controller name is either invalid or not found so we exit. i think that's the right thing to do?
<wallyworld> the type of error doesn't really matter
<axw> wallyworld: you might fail to read the controllers.yaml file? or the cross-process mutex fails?
<wallyworld> axw: right, but what can be done besides exiting with an error?
<axw> wallyworld: that's exactly what we should do
<wallyworld> that's what i thought we did, i may have misread the change
<axw> wallyworld: except in the case where the controller just wasn't found
<axw> wallyworld: ah
<axw> wallyworld: no, it's me that's doing the misreading :)
<wallyworld> whew
<axw> wallyworld: I looked at the diff back-to-front
<wallyworld> i *almost* did the same thing :-)
<axw> sorry!
<wallyworld> np :)
<mup> Bug #1576184 opened: "juju create-backup" fails if you're not operating on the admin model  <juju-core:New> <https://launchpad.net/bugs/1576184>
<mup> Bug #1576184 changed: "juju create-backup" fails if you're not operating on the admin model  <juju-core:New> <https://launchpad.net/bugs/1576184>
<frankban> axw: ty!
<babbageclunk> voidspace, dimitern, frobware: change for constraint parsing here: https://github.com/juju/gomaasapi/pull/51
<voidspace> babbageclunk: looking
<dimitern> babbageclunk: I think it would be a lot nicer if we used the schema to process the constraintsMap
<dimitern> babbageclunk: e.g. like when deserializing a vlan embedded into a subnet
<voidspace> other than that it looks alright to me
<voidspace> relatively straightforward
<voidspace> I don't think it's that clunky, but if you can do as dimitern says then it sounds better
<babbageclunk> dimitern: ok - I'll look at that
<dimitern> babbageclunk: basically, define a getConstraintsMapDeserializationFunc and the related constraintsMapDeserializationFuncs map with twoDotOh: constraints_map_2_0 (ugh! I really don't like underscores in go code :( btw but I'd rather have consistency with pre-existing code)
<babbageclunk> dimitern: Right, so put a StringMap(Any) for each of them, and then a separate function for each that does another checker.Coerce call? Sounds like it's worth a go.
<dimitern> babbageclunk: that as well, yeah
<babbageclunk> dimitern: ok - something that takes an interface{} and returns a nice map[string][]int would be a huge improvement.
<dimitern> babbageclunk: see for example how (the lot more complicated) interface_set is handled in machine_2_0
<babbageclunk> dimitern: I don't want to go back through the version dispatch though - I'm just going to make a function and call that directly.
<dimitern> babbageclunk: ignoring the API version sounds like a bad idea - esp. if the response format differs
<dimitern> babbageclunk: a much simpler (alas inconsistent) way will be to define a struct with JSON serialization tags for the map and its keys
<babbageclunk> dimitern: Oh, I thought this was down-stack from a machine_2_0 function, but you're right.
<dimitern> babbageclunk: (inconsistent with the other code I mean)
<dimitern> babbageclunk: look at how devices or interfaces are handled in the 1.0 code in provider/maas/ (before the changes to gomaasapi)
<babbageclunk> dimitern: yeah, I don't really understand why this code didn't use the builtin JSON serialisation tags.
<dimitern> babbageclunk: me too (fwiw) :)
<dimitern> babbageclunk: but I guess the schema is the sexy new thing everybody should use everywhere now :)
<babbageclunk> dimitern: it's ok, but it doesn't much help with the type system stepping all over everything.
<dimitern> babbageclunk: going with the struct(s)+json tags will be a lot cleaner to do for such a simple map format, and since it will be in gomaasapi no need to jump through hoops like in provider/maas/interfaces.go (e.g. serialize to JSON only to get the []byte blob and deserialize it via the struct)
<babbageclunk> dimitern: yeah, but I probably should try to keep it consistent. <sigh>
<dimitern> babbageclunk: here's a trick: just put all that in a separate file :) inconsistency less obvious
<perrito666> morning
<babbageclunk> dimitern, voidspace: put up a tweaked version that just pulls out the common conversion code
 * babbageclunk|afk goes for a run.
<dimitern> jam: ping
<voidspace> babbageclunk: looking
<babbageclunk> voidspace, dimitern: any objections to me merging https://github.com/juju/gomaasapi/pull/51
<dimitern> babbageclunk: sorry, got distracted, will have a look now quickly
<babbageclunk> dimitern: thanks!
<voidspace> babbageclunk: LLLGTM
<voidspace> uhm, or something like that
<babbageclunk> :)
<voidspace> babbageclunk: so I assume there's a follow-up in juju coming to use this
<dimitern> babbageclunk: it could be a bit simpler, considering the "storage" and "interfaces" are both validated to be maps with string keys and []int values
<dimitern> babbageclunk: what will happen if you type-assert to map[string]map[string][]int instead on line 826 in controller.go?
<dimitern> babbageclunk: if it works, then convertConstraintMatches becomes unnecessary and you could just iterate over the nested map for both top-level keys
<babbageclunk> dimitern: I couldn't get that to work - the values actually have interface objects in them, so they need to be visited to unpack them.
<dimitern> babbageclunk: you can still type assert the nested maps - e.g. https://play.golang.org/p/D2cz2Af4vE
<dimitern> babbageclunk: but I'm OK with landing this as is and possibly trying a bit simpler approach in a follow-up
<dimitern> (and the follow-up itself doesn't have to be done today)
<dimitern> :)
<babbageclunk> dimitern: I think the problem is that you can't do the next level down - https://play.golang.org/p/0jmajp13vl
<alexisb> voidspace, babbageclunk nicely done on the bugs reported by CI for maas 2.0
<babbageclunk> dimitern: so you can't do the full cast on line 826, you have to walk through the key/values converting the interface{}s to []interface{}
<babbageclunk> dimitern: and then walk through those turning the leaf interface{}s into ints.
<babbageclunk> alexisb: thanks!
<katco> ericsnow: standup time
<babbageclunk> dimitern: (at least, that's what I think - I'd be happy to have that code be simpler!)
<dimitern> babbageclunk: you can go as deep as you want: https://play.golang.org/p/UmVNx8N8tu
<dimitern> babbageclunk: but as I said, as long as it fixes the bug let's land it, and improve upon it when we have some spare time
<babbageclunk> dimitern: Right, but that's essentially what I had in https://github.com/juju/gomaasapi/pull/51/commits/61b3b97003699be0c103f21cb26e7fb924d19b4b
<mup> Bug #1570796 changed: container startup issue when juju network management disabled <juju-core:Invalid> <https://launchpad.net/bugs/1570796>
<mup> Bug #1576266 opened: apiclientSuite.SetUpTest fails because no tools available <ci> <go1.6> <test-failure> <unit-tests> <windows> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1576266>
<mup> Bug #1576270 opened: 'juju create-backup'  fails first on a mongodump dependency then auth failure <docteam> <juju-core:New> <https://launchpad.net/bugs/1576270>
<babbageclunk> dimitern: the casting is all interspersed with the matching devices and interfaces.
<babbageclunk> dimitern: anyway, thanks!
<babbageclunk> dimitern: :)
<dimitern> babbageclunk: that looks better, yeah - remember the schema gives us some confidence to do e.g. matchMap := source.(map[string][]int) directly
<babbageclunk> except that panics - it complains that the actual value is a map[string]interface{} (even though all of the interface{}s actually hold []interface{}s and all of them have ints).
<dimitern> babbageclunk: anyway, it was an interesting exercise I guess :)
<babbageclunk> dimitern: yup yup - thanks!
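A sketch of the walking babbageclunk describes: a direct assertion to map[string][]int panics because generic JSON/schema decoding only ever produces map[string]interface{} with []interface{} leaves. `toIntsMap` is a hypothetical helper, not the code in the PR:

```go
package main

import "fmt"

// toIntsMap converts the map[string]interface{} produced by generic
// decoding into a typed map[string][]int. Each level must be visited
// individually; asserting the whole structure at once panics, since
// the dynamic type really is map[string]interface{} all the way down.
func toIntsMap(source interface{}) (map[string][]int, error) {
	outer, ok := source.(map[string]interface{})
	if !ok {
		return nil, fmt.Errorf("expected map, got %T", source)
	}
	result := make(map[string][]int, len(outer))
	for key, raw := range outer {
		items, ok := raw.([]interface{})
		if !ok {
			return nil, fmt.Errorf("%q: expected list, got %T", key, raw)
		}
		ints := make([]int, 0, len(items))
		for _, item := range items {
			// encoding/json decodes numbers as float64; schema
			// coercion may hand back int instead, so accept both.
			switch n := item.(type) {
			case float64:
				ints = append(ints, int(n))
			case int:
				ints = append(ints, n)
			default:
				return nil, fmt.Errorf("%q: expected number, got %T", key, item)
			}
		}
		result[key] = ints
	}
	return result, nil
}

func main() {
	source := map[string]interface{}{"eth0": []interface{}{2, 3}}
	fmt.Println(toIntsMap(source))
}
```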
<jcastro> alexisb: filing bugs in juju core, launchpad or github?
<jcastro> if launchpad shouldn't we disable the github issues tracker?
<alexisb> jcastro, launchpad
<natefinch> jcastro: we've talked about that a bnuch
<alexisb> jcastro, we have discussed disabling it
<jcastro> we need to make it obvious then, I just realized I've been filing bugs in the wrong place
<jcastro> and I work here
<alexisb> our current policy is that we open launchpad bugs for folks
<babbageclunk> voidspace, dimitern, frobware: review? http://reviews.vapour.ws/r/4733/
<babbageclunk> this is the other side of my gomaasapi constraint change.
<natefinch> code review anyone? https://github.com/juju/utils/pull/208
<natefinch> very short and sweet
<dimitern> babbageclunk: did you try this live against MAAS2 ?
<babbageclunk> dimitern: ooh, no - thanks for reminding me! That was what led me down this crazy rabbithole in the first place!
<dimitern> babbageclunk: :) ok
<mup> Bug #1576295 opened: add-credential input field should support tab completion <juju-core:New> <https://launchpad.net/bugs/1576295>
<babbageclunk> dimitern: hmm. So I can add a machine using --constraints spaces=private, and it picks the right machine and starts deploying..., but the machine never comes up
<dimitern> babbageclunk: how are the machine NICs configured?
<babbageclunk> dimitern: it's running, but juju says pending. I can't ssh in by juju ssh or directly.
<dimitern> babbageclunk: it doesn't let you in or it doesn't connect ?
<dimitern> babbageclunk: if EPERM, try ssh -i ~/.local/share/juju/ssh/juju_id_rsa.pub ubuntu@<IP-known-by-maas>
<babbageclunk> dimitern: network looks like this http://pastebin.ubuntu.com/16098886/
<babbageclunk> dimitern: juju ssh just hangs
<babbageclunk> dimitern: I tried ssh with the key, that says: Connection closed by 192.168.150.4
<dimitern> babbageclunk: I suspect the DHCP primary is the reason for this
<babbageclunk> dimitern: any way to log in on the terminal? Probably no default password for ubuntu user, right?
<babbageclunk> dimitern: ok
<dimitern> babbageclunk: I've never tried that and the maas docs are not quite clear about what it means (it's not the same as Auto)
<mup> Bug #1576301 opened: Cryptic error message if Juju uses the wrong json file from GCE <juju-core:New> <https://launchpad.net/bugs/1576301>
<dimitern> babbageclunk: you can hack the userdata juju passes to maas to include setting the password (or removing it)
<mup> Bug #1576313 opened: windows: uniter tests fail because logs get dumped to stderr <juju-core:New> <https://launchpad.net/bugs/1576313>
<babbageclunk> dimitern: ok, so I released the machine, but I can't remove it in juju - just stays in pending.
<dimitern> babbageclunk: remove-machine # --force
<babbageclunk> dimitern: I tried that - it looks like it timed out on its own eventually
<babbageclunk> dimitern: ok, so I've changed it from DHCP to auto? Try again?
<babbageclunk> dimitern: Or should I try deploying it in MAAS and making sure I can ssh to it as I'd expect?
<dimitern> babbageclunk: either auto or static should work (that's true for all types of NICs)
<dimitern> babbageclunk: well, that's also an option if you're not sure juju does the right thing
<dimitern> babbageclunk: but as you explain it, it looks like juju did good, maas didn't
<dimitern> there's still a LOT maas can do to improve the UX around misconfigured nodes, networks, images, ..
<natefinch> ericsnow: MR OCR, can you check out my patch?  it's 31 lines, half of that is comments.  https://github.com/juju/utils/pull/208
<mgz> bogdanteleaga: new bug is very like fixed bug 1470601
<mup> Bug #1470601:  UniterSuite.TestLeadership fails on windows <blocker> <ci> <regression> <unit-tests> <juju-core:Fix Released by bteleaga> <https://launchpad.net/bugs/1470601>
<ericsnow> natefinch: sure
<dimitern> babbageclunk: re cloud-init and ubuntu password: http://blog.scottlowe.org/2015/11/09/changing-passwords-cloud-init/ (you can hack provider/maas/ where it creates a cloudinit config; alternatively, in maas you can add it to the custom preseed scripts used for deployments)
<babbageclunk> dimitern: I'm getting "Unable to allocate static IP due to address exhaustion." - I think my DHCP covers the whole subnet
<babbageclunk> dimitern: where can I adjust the range? I can't find it.
<babbageclunk> dimitern: Do I need to delete and re-add the subnet?
<babbageclunk> dimitern: ah, found it - on the VLAN
<bogdanteleaga> mgz, yeah, I think what we talked about earlier is happening, the fix to that bug was to create a reg key that just happened to stick around
<mgz> bogdanteleaga: heh
<dimitern> babbageclunk: no need to remove the subnet, adjust the range
<babbageclunk> dimitern: I can't see how to do that - I can either provide DHCP or disable it, but they never give me the option to adjust the range.
<dimitern> babbageclunk: have a look at the subnet in the ui
<dimitern> babbageclunk: you should see a bunch of used addresses at the end
<dimitern> babbageclunk: and there are also the CLI commands for iprange(s) that control what's reserved, etc.
<dimitern> babbageclunk: but I haven't actually tried (I did poke around in the postgres db directly though with maas-region --dbshell :)
<dimitern> babbageclunk: `sudo maas-region dbshell -i` - it's the familiar django dbshell (running psql)
<ericsnow> natefinch: ship-it
<natefinch> ericsnow: thanks!
<mup> Bug #1576324 opened: Juju2.0 commandline client inconsistent treatment of yaml files  <landscape> <usability> <juju-core:New> <https://launchpad.net/bugs/1576324>
<mup> Bug #1576324 changed: Juju2.0 commandline client inconsistent treatment of yaml files  <landscape> <usability> <juju-core:New> <https://launchpad.net/bugs/1576324>
<babbageclunk> dimitern: hmm. I shouldn't have deleted that subnet while it was in use, I don't think.
<babbageclunk> dimitern: I think I might need to reinstall MAAS tomorrow.
<dimitern> babbageclunk: sounds good (after a few times it becomes always second nature :)
<dimitern> s/always/almost/
<babbageclunk> dimitern: Sometimes taking off and nuking the site from orbit really *is* the only way to be sure!
<rogpeppe> anyone know if there's a way to list what models are stored locally?
<cherylj> rogpeppe: what models or what controllers?
<rogpeppe> cherylj: what models
<rogpeppe> cherylj: all the models i can juju switch to
<cherylj> rogpeppe:  well, all the models locally won't necessarily be all the ones you can switch to
<cherylj> hmm
<rogpeppe> cherylj: ah, switch allows you to switch to remotely held models too?
<cherylj> actually
<cherylj> yeah, you could be granted access to a model
<cherylj> and it wouldn't be in your local cache
<rogpeppe> cherylj: juju switch will automatically go and grab a model into the local cache?
<cherylj> there's a way you can list models you have access to for a particular controller
<cherylj> juju list-models will limit it to ones you have access to
<cherylj> (unless you're an admin and use the whatever all-models flag)
<rogpeppe> cherylj: but there's no way to find out what models i have cached locally?
<cherylj> rogpeppe: I'm not sure if the model info will be stored locally
<cherylj> when you switch to it
<cherylj> I'd have to see
<rogpeppe> cherylj: well, model info is stored locally
<rogpeppe> cherylj: i'm presuming that's used sometimes
<cherylj> rogpeppe: not apart from manually looking in ~/.local/share/juju/models.yaml
<mup> Bug #1576342 opened: `juju status` should show leadership primitives <juju-core:New> <https://launchpad.net/bugs/1576342>
<mup> Bug #1576346 opened: upgrade-charm with a local charm fails with trailing slash <juju-core:New> <https://launchpad.net/bugs/1576346>
<bdx> hey whats going on everyone? Does anyone know the timeframe for RC1?
<cherylj> bdx: we're most likely going to do another beta next week.  Not sure when we're going to have something we want to call rc1
<bdx> cherylj: ok, awesome. thx
<mup> Bug #1576359 opened: Cannot talk to a new model with a reused name <juju-release-support> <switch> <juju-core:Triaged> <https://launchpad.net/bugs/1576359>
<kwmonroe> hey juju-dev, i have 2 models in azure, and both have a "machine-4".  the last one to come up seems to steal the dns entry.  is this a bug, or am i doing networking wrong?  http://paste.ubuntu.com/16115389/
<mup> Bug #1576366 opened: juju 2 beta6: show-controller --format=json is broken <landscape> <juju-core:New> <https://launchpad.net/bugs/1576366>
<mup> Bug #1576368 opened: blockdevice 2.0 schema check failed: model: expected string, got nothing <ci> <deploy> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1576368>
<natefinch> kwmonroe: not sure... definitely worth filing a bug, though.
<katco> mgz: ping
<mgz> katco: yo
<katco> mgz: o/
<mgz> \o\
<katco> mgz: i'm removing HP cloud schtuff in master, and there's some code in the openstack provider that considers an instance started if the status is BuildSpawning... is that safe to remove?
<katco> mgz: https://github.com/juju/juju/blob/master/provider/openstack/provider.go#L1089-L1097
<katco> mgz: just want to remove the nova.StatusBuildSpawning from L1093, and a corresponding test
<mgz> yeah, it should be, I don't think any trunk version of openstack used that form of status
<katco> mgz: awesome, ty
<mgz> katco: hm, we may want to keep something, looking at https://github.com/openstack/nova/tree/master/nova/compute *_state.py files
<katco> mgz: it looks like the only reference to spawning is for task states whereas this is server state
<katco> mgz: you know way more about openstack, but seems safe to me? could ask in #openstack maybe
<mgz> katco: yeah, the HP way of displaying "BUILD(spawning)" was HP specific
<mgz> but the isAliveServer logic is pretty ropey
<katco> yeah
 * katco just remembered ods is going on
<mgz> katco: I think I'd remove the code as it exists and file a bug against goose to update/make the various exposed states more usable
<mgz> yeah, it's a good and a bad week to ask about openstack things :)
<katco> hehe
<katco> mgz: as in, don't expose the state, expose a synthesized concept of machine up?
<mgz> katco: well, we get three types of status/state exposed, but goose doesn't give us that cleanly
<katco> 14:05> go get -u gopkg.in/juju/charm.v6-unstable
<katco> # cd /home/kate/workspace/go/src/gopkg.in/juju/charm.v6-unstable; git pull --ff-only
<katco> fatal: unable to access 'https://gopkg.in/juju/charm.v6-unstable/': server certificate verification failed. CAfile: none CRLfile: none
<katco> is this just me?
<mgz> katco: I think I made all the upstreams for charm etc go straight to github
<mgz> ha, actually, didn't restore that after my drive got wiped
<mgz> so, lets see
<natefinch> katco: I don't get that
<mgz> katco: worked for me
<katco> hm, ok... now wondering what the heck is up with my machine
<natefinch> However, I have been having problems with go get -u.... I get this: http://pastebin.ubuntu.com/16118444/
<natefinch> maybe godeps + go get -u is a bad combo
<katco> natefinch: i think you're just on a detached head
<katco> natefinch: try git checkout master first
<natefinch> katco: right, but I think that's because of godeps
<katco> natefinch: well yes, godeps checks a commit out which puts you on a detached head unless the commit happens to be master
<natefinch> katco: right, ok, so it's just godeps messing up go get -u. That's fine.
<natefinch> annoying, but fine :)
<mup> Bug #1576376 opened: azure multi model dns failure <juju-core:New> <https://launchpad.net/bugs/1576376>
<natefinch> ericsnow: I think reviewboard is grumpy.  My last couple of PRs haven't been picked up
<ericsnow> natefinch: :(
<cherylj> hey katco, did you see I scheduled the interview for tomorrow morning?  Will you be able to make it?
<ericsnow> natefinch: you mean like http://reviews.vapour.ws/r/4735? <wink>
<katco> cherylj: yes, i'll be there. sorry i haven't responded yet.
<cherylj> katco: no worries, just wanted to make sure.
<katco> cherylj: thanks for setting that up
<cherylj> katco: did you want to chat for a bit first to make a plan?  I'm not sure how you and mattyw worked things out before
<natefinch> ericsnow: yeah, weird, it didn't update the PR with the link
<natefinch> ericsnow: I'm not sure what to do with the test that ensures we don't support SSL. It seems redundant with this change, but... I was hesitant to remove a test that still passed and still tested something we want to be true.
<mup> Bug #1572772 changed: URLsSuite.TestImageMetadataURL paths fail on windows <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Fix Released by hduran-8> <https://launchpad.net/bugs/1572772>
<mup> Bug #1575463 changed: buildSuite.TestGetVersion* CryptAcquireContext: Provider DLL failed to initialize <blocker> <centos> <ci> <ppc64el> <regression> <test-failure> <unit-tests> <windows> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1575463>
<redir> so this https://github.com/juju/juju/blob/master/environs/config/config.go#L342 should validate that 'series' is a valid LTS series, yes?
<perrito666> bbl ~1h
<redir> ericsnow, natefinch, katco ^^
<katco> redir: sec otp
<ericsnow> redir: not a valid *LTS* necessarily
<ericsnow> redir: just one the charm package recognizes as valid
<redir> ericsnow: um, so the error is wrong or the test is wrong?
<redir> ericsnow: but it should verify an actual LTS series, no?
<ericsnow> redir: charm.IsValidSeries() just ensures that the string is a valid simple name: "^[a-z]+([a-z0-9]+)?$"
<ericsnow> redir: the LTS-ness comes from being the result of the "distro-info --lts" command
<redir> right but distroLtsSeriesFunc returns the latest LTS series so that doesn't seem like a valid check
<redir> should it be more correct, ericsnow ?
<redir> or omitted?
<ericsnow> redir: the charm.IsValidSeries() call is just making sure the data we got back from "distro-info --lts" is valid
<redir> well is a lower case string that doesn't start with a number.
<ericsnow> redir: I think the error message is fine given the context of the function
<redir> which isn't a valid LTS
<redir> which means the function could return a nonsense string
<ericsnow> redir: it doesn't have to verify that; it's more a sanity check of the output
<redir> partial sanity check :)
<ericsnow> redir: we're relying on "distro-info --lts" to do the right thing
<redir> OK
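The check ericsnow quotes can be tried directly. This uses the regexp exactly as quoted in the discussion, which may not match what charm.IsValidSeries does in the current charm package:

```go
package main

import (
	"fmt"
	"regexp"
)

// validSeries is the pattern quoted above for charm.IsValidSeries:
// a lowercase word that does not start with a digit. It is only a
// sanity check on the "distro-info --lts" output, not a check that
// the string is actually an LTS series.
var validSeries = regexp.MustCompile(`^[a-z]+([a-z0-9]+)?$`)

func main() {
	for _, s := range []string{"xenial", "trusty", "16.04", "Xenial"} {
		fmt.Printf("%s: %v\n", s, validSeries.MatchString(s))
	}
	// xenial and trusty match; "16.04" and "Xenial" do not.
}
```

This makes redir's point concrete: any lowercase word passes, so the real LTS-ness guarantee rests entirely on `distro-info --lts` doing the right thing.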
<perrito666> Wallyworld ill be half hour later to the 1:1 I misread the calendar
<wallyworld> perrito666: no problem at all, just ping me
<perrito666> Tx
<perrito666> wallyworld: I am in the call
<wallyworld> ok
<mup> Bug #1575472 changed: Data Race github.com/juju/juju/environs/tools/build.g <ci> <race-condition> <regression> <test-failure> <unit-tests> <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1575472>
<mup> Bug #1575472 opened: Data Race github.com/juju/juju/environs/tools/build.g <ci> <race-condition> <regression> <test-failure> <unit-tests> <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1575472>
<axw> anastasiamac_: what's the 3 day weekend? don't think it is over here
<anastasiamac_> axw: oh... maybe jsut qld: labour day :D
<axw> anastasiamac_: ah, ours is earlier in the year
<anastasiamac_> oh..
<anastasiamac_> axw: m happy to have both off - it's sinful to work on labour day... right?
<wallyworld> davecheney: i just saw for the first time your compiler benchmarks. wtf. i thought go > 1.2 was slow, but wow. was it just that the go compiler sucks compared to c++?
<axw> C
<wallyworld> go 1.7 looks like it is getting back to parity
#juju-dev 2016-04-29
<davecheney> cherylj: looks like CI is still running 1.2
<davecheney> at least on the 1.25 branch
<davecheney> does sinzui 's fix need to be backported to 1.25 ?
<davecheney> >was it just that  the go compiler sucks compared to c++?
<davecheney> ^ i literally cannot even
<anastasiamac_> wallyworld: axw: i think this is ready to go http://reviews.vapour.ws/r/4573/ could someone plz stamp it? :D
<sinzui> davecheney: What is happening here http://reports.vapour.ws/releases/3933/job/run-unit-tests-race/attempt/1387
<davecheney> ta
<wallyworld> sinzui: are all the juju tests now run on go 1.6?
<wallyworld> for 2.0?
<davecheney> the race detector does not support ppc64
<davecheney> hang on
<davecheney> is this ppc64
<sinzui> wallyworld: no, windows and centos are tested on both 1.2 and 1.6 because the 1.6 tests do not pass
<davecheney> no, this is amazon
<sinzui> davecheney: That is exactly my wtf moment. a trusty container that thinks it is ppc
<wallyworld> sinzui: that's the unit tests right?
<sinzui> wallyworld: yes
<davecheney> ok, so the race detector in the trusty golang1.6 deb is busted
<davecheney> :9
<davecheney> :(
<davecheney> paging mwhudson
<wallyworld> sinzui: i thought horatio fixed the windows tests on 1.6, i'll need to check
<mwhudson> davecheney: ah yes
<sinzui> wallyworld: no his last fix was for go 1.2 regressions
<sinzui> wallyworld: all packages have been built with go 1.6 for several weeks
<mwhudson> davecheney: try https://launchpad.net/~mwhudson/+archive/ubuntu/trusty-race-detector/+packages ?
<wallyworld> sinzui: the issue right now is there's some upstream azure sdk changes we need but those require go 1.6 to build, so we need 1.6 everywhere first
<wallyworld> so those tests need fixing
<sinzui> wallyworld: I ask every day in the meeting for them to be fixed
<wallyworld> yeah, everyone is busy :-(
<sinzui> davecheney: I am inclined to make the race tests non-voting until we sort out what happened
<sinzui> wallyworld: in the case of windows, I see bugs we already see, but more often, and some that seem to have been revealed by recent fixes http://reports.vapour.ws/releases/3933/job/run-unit-tests-win2012-amd64-go1_6/attempt/51
<wallyworld> sinzui: oh the fix for beta6 for ODS - the pinger shutdown fix - may have introduced a new window failure
<sinzui> wallyworld: :(
<wallyworld> well just just a guess
<sinzui> wallyworld: I am on unknowns tomorrow. I will get every bug filed for the remaining Go 1.6 tests
<wallyworld> it seems go 1.6 on windows slowness may be exacerbating some of these failures
<sinzui> wallyworld: and as for slowness. I think we need to extend timeouts for centos and windows
<wallyworld> if that helps short term, that would be nice so we can get the azure fixes in
<sinzui> wallyworld: I will also get that sorted out tomorrow
<wallyworld> ok, ty
<sinzui> davecheney: I think we need to force the race tests to run with go1.2 or quickly solve the golang 1.6 ld issue
<davecheney> mwhudson: what say you to that ?
<davecheney> afaik the race detector in go 1.6 works fine
<davecheney> but something is wrong with how it's packaged in trusty
<mwhudson> ah
 * redir is eod later #juju-dev
<mwhudson> sinzui, davecheney: there is a ppa with the support files needed
<mwhudson> https://launchpad.net/~mwhudson/+archive/ubuntu/trusty-race-detector/+packages
<davecheney> sinzui: is using that ppa an option ?
<mwhudson> sinzui, davecheney: i should get this into trusty too
<mwhudson> sinzui, davecheney: alternatively, can you run the race tests on xenial?
<sinzui> davecheney: yes, yes I think it is
<davecheney> cool
<sinzui> mwhudson: I just switched the tests to xenial. got the same error though
<mwhudson> sinzui: oh that's more surprising
<mwhudson> sinzui: do you have apt/apt-get installing recommends by default?
<sinzui> mwhudson: no, but we support installing a different set of packages instead of using the makefile (that is how we test s390x and arm64)
<mwhudson> shouldn't matter
<mwhudson> sinzui: can you link me to the log of it failing on xenial?
<sinzui> mwhudson: It is happening right now at http://juju-ci.vapour.ws:8080/job/run-unit-tests-race/1388/console. You need to log in as a developer to see it
<davecheney> can someone send me the developer credentials
<davecheney> my browser has forgotten them
<mwhudson> sinzui: how does the golang-1.6 package get installed there?
<mwhudson> all i see is golang-1.6 is already the newest version (1.6.1-0ubuntu1).
<mwhudson> sinzui: anyway, somehow the golang-1.6-race-detector-runtime package needs to get installed
<mwhudson> sinzui: usually that gets installed when golang-1.6 does, because golang-1.6 Recommends: it
<mwhudson> er golang-1.6-go Recommends it, to be more precise
<mwhudson> sinzui: but if golang-1.6-go is somehow installed when recommends are disabled, it won't, so that's my guess as to what's happening here
<sinzui> mwhudson: understood. The makefile forces no-recommends. the makefile got an update today to ensure golang 1.6 or higher is preferred. I will reconfigure the job to install the required packages.
<mwhudson> sinzui: ah makes sense
<axw> anastasiamac_: LGTM
<anastasiamac_> axw: \o/
<sinzui> that's not fair. the rules to install extra packages conflict with the rules to run with --race
<sinzui> oh, all is good. thank you mwhudson and davecheney . I can see golang-1.6-race-detector-runtime is installed and the tests are running with --race on xenial
<mwhudson> sinzui: \o/
<davecheney> woot
<natefinch> menn0, axw, wallyworld:  anyone care for a quick review?  It's fairly trivial, but for a critical bug: http://reviews.vapour.ws/r/4735/
<wallyworld> sure
<natefinch> I'd gotten eric to review, but he had some questions, so I wanted another pair of eyes
<wallyworld> natefinch: if we have test coverage for proper use of ssl via other means, then we probably can remove that special case in the test
<natefinch> wallyworld: well, my static analysis tests are stuck in committee... so right now, we don't.
<wallyworld> natefinch: ok, so no harm leaving it there for now. i think the pr is lgtm
<natefinch> wallyworld: ok... i think that's basically where eric landed too, just wanted a second opinion.  Thanks.
<wallyworld> np
<natefinch> wallyworld: do you know if the apt mirrors thing is *only* applicable to MAAS?
<wallyworld> natefinch: the general principle is applicable anywhere, but as axw mentioned to me, maas allows an apt mirror to be configured for it so juju could make special use of that to set the juju apt mirror setting off of what maas uses
<wallyworld> instead of making the user specify it manually in 2 places
<axw> natefinch wallyworld: I'm not sure if it does now ... it has configuration for proxies for sure
<natefinch> wallyworld: does juju's apt mirror get propagated to containers?
<wallyworld> natefinch: no, that's the whole point of this work :-)
<wallyworld> it is supposed to
<wallyworld> so we need to figure out why not and fix
<wallyworld> that's my understanding anyway
<natefinch> wallyworld: ok... just making sure I understand the problem correctly.  Sounds like it's two things: 1.) juju apt mirrors don't get propagated to containers. 2.) (bonus points) juju should pick up maas's apt mirror and pass it into containers juju creates on maas nodes
<wallyworld> yes, assuming maas has such a setting
<axw> yup
<wallyworld> given all this, we need to audit what's there in juju already
<wallyworld> so we understand the current actual behaviour
<wallyworld> and figure out the gap between that and the to be
<wallyworld> and document the to be in that spec
<axw> wallyworld: I meant to ask, when you've got a spare few minutes, can you please have a look over my openstack PR?
<wallyworld> sure
<wallyworld> axw: looks like you already have a +1?
<axw> wallyworld: yeah, but babbageclunk is not a graduated reviewer AFAIK
<wallyworld> ah, i didn't check who did it
<wallyworld> just saw the tick
<mup> Bug #1334481 changed: juju should not record ssh certificates of ephemeral hosts <ci> <cloud-installer> <landscape> <juju-core:In Progress> <https://launchpad.net/bugs/1334481>
<natefinch> where did all the help go?
<natefinch> do we not have help on the providers in the CLI anymore?
<wallyworld> axw: lgtm with a drive by request
<wallyworld> natefinch: i think the help topics are under construction, not sure
<axw> wallyworld: ta
<natefinch> wallyworld: ok, that gives me hope.  Right now, juju bootstrap on a new cloud basically just gives me the middle finger with no way to figure out what I'm supposed to do.
<wallyworld> what do you mean?
<natefinch> $ juju bootstrap google google
<natefinch> ERROR detecting credentials for "google" cloud provider: gce credentials not found
<natefinch> and as far as I can tell, none of our help actually tells you how to enter those credentials
<wallyworld> juju help add-credentials
<natefinch> $ juju help add-credentials
<natefinch> ERROR unknown command or topic for add-credentials
<wallyworld> sorry, no s
<wallyworld> juju help add-credential
<natefinch> oh man... bad one to not have a plural alias
<wallyworld> well, it is only a single credential
<natefinch> it is very often used in the plural, though
<wallyworld> natefinch: juju help bootstrap tells you what you need anyway
<wallyworld> but spot the sypo
<wallyworld> typo
<natefinch> yeah, I was gonna say, it even says to see "add-credentials"
<wallyworld> so i'm not sure how the current help is deficient
<wallyworld> bootstrap failed, juju help bootstrap has the answer
<wallyworld> albeit with a typo
<natefinch> juju help bootstrap is a wall of text.  it would be nice if the error about "credentials not found" also said "see juju help add-credential"
<wallyworld> hopefully that stuff is being addressed
<natefinch> I hope so :)
<natefinch> we do still need juju help google/gce
<wallyworld> i'm not 100% sure of the status of all that
<natefinch> understood
<natefinch> gah
<natefinch> spent 15 minutes figuring out gce's credentials only to hit another bootstrap error:
<natefinch> cannot start bootstrap instance: sending new instance request: sending new instance request: googleapi: Error 400: Invalid value for field 'resource.networkInterfaces[0].network': 'global/networks/default'. The referenced network resource cannot be found., invalid
<menn0> davecheney: small utils/ssh change to allow StrictHostKeyChecking to be turned on: http://reviews.vapour.ws/r/4738/
<menn0> axw: or you? ^^
<axw> menn0: looking
<menn0> cheers
<natefinch> whelp, can't bootstrap gce... guess it's time to file a bug
<axw> natefinch: :/  how are we not finding that in CI ...
<axw> pretty sure I bootstrapped gce last week, must be something new or maybe region or account specific
<natefinch> axw: there was some networking stuff landed today... maybe it hasn't made it through CI yet
<axw> ah
<natefinch> axw: or at least, I heard them talking about networking stuff... I don't know for sure if anything landed
<natefinch> hmm.. the commit history on master does not look like anything interesting landed lately
<natefinch> yeah, weird, even if I do it from master from weeks ago, it still fails the same way
<natefinch> I must be special
<natefinch> or I have to configure my GCE account in some way that I haven't
<menn0> axw: thanks for the review. I think strict should be the default too but I didn't want to break existing consumers of utils/ssh
<menn0> axw: FWIW it will soon default to on for juju ssh/scp
<menn0> it's working on my machine
<menn0> pulling the host keys from the API server etc
<menn0> but needs tidying up and better test coverage
<axw> natefinch: did it fail for you immediately? it seems to be working for me, but hasn't finished yet
<axw> menn0: fair enough
<axw> menn0: cool :)
<natefinch> axw: failed in 10 seconds
<axw> natefinch: I guess it's something about your account/project
<natefinch> axw: yeah, I'll try creating a new project
<anastasiamac_> yes, my gce is bootstrapping fine too from master tip..
<natefinch> this was a fairly old project... I'll see if a new one works better
<natefinch> still weird
<natefinch> that seems to be working better.
<davecheney> menn0: lookin
<menn0> davecheney: it's all good. axw already reviewed.
<mgz> axw: http://reports.vapour.ws/releases/3933 <- note gce
<mup> Bug #1576503 opened: Agents stuck in failed state <juju-core:New> <https://launchpad.net/bugs/1576503>
<axw> mgz: interesting, seems to be different tho
<menn0> davecheney: screw you fslock...
<mgz> axw: possibly over quota and bad message?
<mgz> axw: we had passes since then
<axw> mgz: hmm dunno, not much to go on apart from failing to get public addresses...
<axw> sounds plausible tho
<axw> mgz: ah yes, GCE operation error: (QUOTA_EXCEEDED) Quota 'CPUS' exceeded.  Limit: 96.0
<mup> Bug #1576509 opened: Race in macaroon-bakery <ci> <race-condition> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1576509>
<mup> Bug #1576510 opened: Race in macaroon-bakery <ci> <race-condition> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1576510>
<menn0> axw: i've just realised that this SSH host key work has some interestsing corner cases with "juju scp"
<menn0> axw: do we care much about "juju scp 0:some.file 1:"
<menn0> ?
<menn0> axw: it doesn't work by default because the keys aren't in place but it could
<axw> menn0: does scp let you do that? I thought it didn't
<menn0> axw: it definitely does... I've just been reading up on it
<axw> so it does
<axw> menn0: I must be living in the past
<axw> uummm
<axw> menn0: what do you mean "keys aren't in place"?
<axw> why is it different to making two ssh connections, and writing to known_hosts twice?
<menn0> axw: when you do "scp host1:some.file host2:" there's a connection from your local machine to host1, and then host1 connects to host2
<menn0> with juju there's no keys in place to allow connections between host1 and host2
<axw> menn0: ah, I see.
<axw> menn0: I would say that's up to the user to manage
<menn0> axw: that said, with "scp -3 host1:... host2:" the connections are local -> host1 and local -> host2
<menn0> so that's more likely to work
<menn0> except I can't make it work right now
<axw> menn0: I don't really think it's worth worrying about
<menn0> axw: with a bit of work I could have juju scp generate a known hosts with the keys for both hosts
<menn0> axw: but it does make things a bit more complex
<menn0> axw: for now I'll generate a known hosts for the first host
<menn0> axw: that covers the majority of cases
<axw> menn0: that will still cover "juju scp /local/file 0:remote/path" right?
<menn0> axw: of course. "0:" would be the first (and only) host in that command line
<axw> menn0: ok, just making sure I understood you. sounds good
<menn0> cool
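The known_hosts generation menn0 describes could look roughly like this; the host address and key material below are made up for illustration, and the real keys would come from the API server as discussed above.

```python
# Rough sketch of generating a throwaway known_hosts file for the first
# remote host, so scp/ssh can run with strict host key checking enabled.
# The key data here is fake; it only illustrates the file format.

def make_known_hosts(entries):
    """entries: iterable of (hostname, keytype, base64_key) tuples."""
    return "".join("%s %s %s\n" % entry for entry in entries)

content = make_known_hosts([("10.0.0.5", "ssh-ed25519", "AAAAC3Nza...fake")])
print(content, end="")
# scp would then be invoked with something like:
#   -o UserKnownHostsFile=<tempfile> -o StrictHostKeyChecking=yes
```

Covering both hosts of a `scp host1: host2:` copy would just mean passing both hosts' entries to the same helper.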
<mup> Bug #1576503 changed: Agents stuck in failed state <juju-core:New> <https://launchpad.net/bugs/1576503>
<wallyworld> axw: a small one for friday? http://reviews.vapour.ws/r/4739/
<axw> wallyworld: ok
<axw> wallyworld: how does that satisfyPrerequisites function work on ubuntu < trusty?
<axw> wallyworld: oh, this is only for series where mongo-3.2 is available
<axw> never mind
<wallyworld> yeah
<axw> separate command, was thinking it was run by the machine agent
<axw> wallyworld: LGTM
<wallyworld> ty
<anastasiamac_> have u seen this before? I have just destroyed a controller (had 2 models: admin and default)...
<anastasiamac_> when i list-controllers, i get empty list (just the headers) as expected
<anastasiamac_> but when i list-models, i get "error: controller local.volumes not found"
<anastasiamac_> i would have expected (and have in the past) to see empty list with just headers
<anastasiamac_> it seems that maybe we are no longer cleaning current-controller ?.. is it bug-worthy?..
<mup> Bug #1576527 opened: listSuite.TestListJSON got null (showSuite too) <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576527>
<mup> Bug #1576528 opened: current controller not cleared on destroy <juju-core:New> <https://launchpad.net/bugs/1576528>
<mup> Bug #1576534 opened: upgrade-charm refuses to upgrade <juju-core:New> <https://launchpad.net/bugs/1576534>
<bdx> hello everyone
<bdx> is 'juju deploy <service> --to lxd:<machine#>' supported by the maas provider in 2.0?
<frobware> bdx: are you talking about MAAS 2.0, or Juju 2.0?  Currently supported for Juju 2.0.  For MAAS 2.0 & Juju 2.0 currently WIP and only available behind a feature flag and building from tip....
<bdx> frobware: nice, whats the flag?
<bdx> frobware: maas2?
<frobware> bdx: yep
<bdx> frobware: I also must build master?
<bdx> will this be dropping into beta7 then?
<frobware> bdx: yep, and a moving target... currently very active for maas2 work
<frobware> bdx: tbd
<bdx> frobware: yea, I've been following along ... entirely. thx
<frobware> bdx: we haven't done a CI run yet with this flag, so YMMV
<bdx> I have a lab up with juju2.0 + MAAS2.0 + the flag
<bdx> just not a build of master :-(
<bdx> I'll give tip a build and see what gives
<frobware> bdx: I would initially try the single NIC setup, just a hint. :)
<bdx> frobware: thanks, by single nic, you mean enlisted maas nodes only have 1 nic ?
<frobware> bdx: yep. though things could have moved on since wed/thu when I was last tracking this.
<bdx> that way no one has to decide what interface to put the bridge  on
<bdx> ?
<frobware> bdx: we bridge all interfaces now
<frobware> bdx: I just recall seeing a comment from voidspace that he was trying/testing single NIC setup to begin with
<bdx> nice, ok
<frobware> bdx: single NIC, with multiple VLANs, all end up with their own bridge: http://pastebin.ubuntu.com/16123783/
<rogpeppe1> anyone know why aws-china is a separate cloud rather than just an aws region ?
<mattyw> rogpeppe1, something something firewalls I think
<mattyw> wallyworld, you around mate?
<voidspace> frobware: bdx: I was testing containers deployed on machines with a single nic first
<voidspace> frobware: bdx: that worked and then we got multiple nics working too
<frobware> voidspace: ooh nice
<voidspace> frobware: maas2 work is now feature complete - as soon as we get it passing CI we'll remove the flag
<voidspace> frobware: so still *technically* experimental, but hopefully not for much longer
<anastasiamac_> rogpeppe1: different clouds for aws, aws-china and aws-gov because they require different authentication types or secret information for different regions
<rogpeppe1> anastasiamac_: ok, so you can't use the same credentials across those clouds? that makes sense.
<voidspace> frobware: we're still getting maas schema failures though (json doesn't match what we expect)
<voidspace> frobware: this is the latest CI run: http://reports.vapour.ws/releases/3934/job/maas-2_0-deploy-xenial-amd64/attempt/5
<voidspace> frobware: I'll fix that
<anastasiamac_> rogpeppe1: essentially, yes... quoting the spec for more details :D..."There can only be one default credential for a given cloud. If that credential does not work on a particular region then you will need to specify the credential used to deploy in that region. There can only be one default region for a cloud."
<rogpeppe1> anastasiamac_: thanks
<anastasiamac_> rogpeppe1: i hope this helps \o/
<anastasiamac_> rogpeppe1: there is a similar separation for azure :D
<voidspace> frobware: standup
<frobware> voidspace: arriving... now!
<voidspace> frobware: babbageclunk: https://github.com/juju/gomaasapi/pull/52
<rogpeppe1> fwereade: ping
<Mo0O> hello
<babbageclunk> frobware: maas is up and running, but my nodes won't pxe boot to enlist. I've turned on DHCP and downloaded a boot image. What have I missed?
<frobware> babbageclunk: boot order on VM nodes?
<babbageclunk> frobware: the nodes have their nics first in the boot order.
<babbageclunk> frobware: ha ha
<frobware> babbageclunk: images imported?
<frobware> babbageclunk: eth0 configured to be managed (and dns and dhcp)?
<babbageclunk> yup - do I still need a 14.04 image imported?
<frobware> babbageclunk: you have 16.04?
<babbageclunk> frobware: yup
<frobware> babbageclunk: what does the commissioning node say in your maas settings? (the drop-down in the UI)
<babbageclunk> frobware: xenial
<frobware> babbageclunk: HO?
<babbageclunk> yes
<frobware> babbageclunk: https://plus.google.com/hangouts/_/canonical.com/juju-sapphire
<fwereade> rogpeppe1, pong
<rogpeppe1> fwereade: np, IS just saw an occurrence of this bug after upgrading to 1.25. fixed by a manual JS script https://bugs.launchpad.net/juju-core/+bug/1516989 (but not quite sure why the upgrade migration hadn't worked)
<mup> Bug #1516989: juju status <service_name> broken <canonical-bootstack> <sts> <juju-core:Invalid by waigani> <juju-core 1.25:Fix Released by waigani> <juju-core (Ubuntu):Confirmed> <https://launchpad.net/bugs/1516989>
<rogpeppe1> fwereade: so, incident fixed, but not sure why it happened.
<fwereade> rogpeppe1, huh
<fwereade> rogpeppe1, sorry, I have only the vaguest memories of writing that script, and I don't recall tracking the fix in 1.25
<rogpeppe1> fwereade: np :)
<rogpeppe1> fwereade: just thought you might be able to provide some insight at the time
<fwereade> rogpeppe1, PR3860 went into 1.25 on dec 1, it seems -- is it remotely possible this was somehow an upgrade to an unfixed 1.25?
<rogpeppe1> fwereade: i don't think so; let me check
<rogpeppe1> fwereade: the version was 1.25.5
<fwereade> rogpeppe1, yeah, that should be fine
<fwereade> rogpeppe1, no further insights here, I'm afraid :(
<rogpeppe1> fwereade: thanks
<frobware> babbageclunk: if you do try my scripts you want to invoke as ... $ VIRT_RAM=1024 ./add-node ...
<frobware> babbageclunk: because the default is 4GB, which may cause you a few problems
<voidspace> babbageclunk: how's your branch doing?
<babbageclunk> voidspace: finally got maas set up again, just setting up the private space node so I can have some constraints.
<voidspace> babbageclunk: ok
<babbageclunk> voidspace: it occurs to me I could test the change with storage constraints rather than space ones, sorry.
<babbageclunk> voidspace: did you try adding a machine without a constraint?
<voidspace> babbageclunk: I've added machines, yes
<babbageclunk> voidspace: so should I just kick off the merge anyway? Not making anything worse if it doesn't work (although I think it will) and then unblocking you.
<voidspace> babbageclunk: go for it
<voidspace> babbageclunk: you might still need to JFDI it
<babbageclunk> voidspace: oh, also dimiter made some comments that I should address.
<voidspace> babbageclunk: cool
<voidspace> babbageclunk: let me know when it's landed please
<babbageclunk> voidspace: wilco
<babbageclunk> voidspace: will $$jfdi$$ still run the tests?
<frobware> babbageclunk: yes
<voidspace> ^^^^
<babbageclunk> voidspace, frobware: so it just bypasses blocking? Or it runs them and then merges regardless of the result?
<voidspace> babbageclunk: bypasses blocking
<babbageclunk> voidspace: thanks
<babbageclunk> voidspace: building now, also successfully added a machine with a space constraint.
<voidspace> babbageclunk: awesome!
<voidspace> great news
<dooferlad> frobware: could you take a look at http://reviews.vapour.ws/r/4741/ please? It is a rather important fix for the last bridge script mods.
<frobware> dooferlad: looking
 * dooferlad goes for lunch
 * babbageclunk is lunching
<frobware> dooferlad: reviewed
<mup> Bug #1576670 opened: undefined: tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 <build> <centos> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576670>
<mup> Bug #1576510 changed: Race in macaroon-bakery <ci> <race-condition> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1576510>
<mup> Bug #1576674 opened: 2.0 beta6: only able to access LXD containers (on maas deployed host) from the maas network <oil> <juju-core:New> <https://launchpad.net/bugs/1576674>
<mup> Bug #1576686 opened: bundle repository/bundles-lxd.yaml is invalid <juju-core:Triaged> <https://launchpad.net/bugs/1576686>
<babbageclunk> voidspace, frobware: is there any way to get maas-cli 2 installed on wily?
<babbageclunk> experimental3 only has xenial.
<frobware> babbageclunk: is this because your laptop is running wily?
<babbageclunk> frobware: yeah
<babbageclunk> frobware: can't try to use/adapt your scripts without it
<mup> Bug #1576695 opened: Deployer cannot talk to Juju2 (on maas2) because :tlsv1 alert protocol version <ci> <deployer> <maas-provider> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1576695>
<dooferlad> frobware: updated http://reviews.vapour.ws/r/4741/
<mup> Bug #1576700 opened: Misleading "Waiting for agent initialization to finish" message <ci> <status> <juju-core:New> <https://launchpad.net/bugs/1576700>
<mup> Bug #1576704 opened: MigrationExportSuite.TestUnits unequal results <ci> <go1.6> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576704>
<mup> Bug #1576705 opened: cloudImageMetadataSuite.TestSaveDiffMetadataConcurrentlyAndOrderByDateCreated wrong order <ci> <go1.6> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576705>
<alexisb> babbageclunk, voidspace just want to make sure you guys saw this from mgz : https://bugs.launchpad.net/juju-core/+bug/1576368
<mup> Bug #1576368: blockdevice 2.0 schema check failed: model: expected string, got nothing <ci> <deploy> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1576368>
<babbageclunk> alexisb: yup, voidspace is on it - https://github.com/juju/gomaasapi/pull/52
<frobware> dooferlad: it's not clear to me why we don't know all the routing information as and when we create the new stanzas and simply accumulate them. if we accumulate then it also means those bridging functions have no side effects.
<dooferlad> frobware: the only function that has a side effect is bridge_now.
<dooferlad> frobware: the most reliable thing to do is save routes for an interface, bridge the interface, restore routes that vanished.
<frobware> dooferlad: iff, you first successfully write the new file
<frobware> dooferlad: hence my preferred order. transform, successfully write new file, bridge-le-world.
<dooferlad> frobware: you mean you want to not do anything unless you have generated a new e/n/i?
<dooferlad> frobware: I did just update the rev with a comment about that
<frobware> dooferlad: it's moot. but if you can't write the file all bets should be off.
<dooferlad> frobware: seems reasonable
<dooferlad> frobware: so I just suggested not running the bridge_now loop until after the if not args.activate: ... exit(0) code
<frobware> dooferlad: that would re-read the (new) eni file?
<dooferlad> frobware: no
<frobware> dooferlad: so what happens before your exemplary if not activate?
<dooferlad> frobware: easiest if I just write the code. Will be two minutes.
<frobware> :)
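The ordering frobware and dooferlad converge on (transform, write the new e/n/i, only then bridge) can be sketched as a runnable toy; every helper below is a stand-in for the real bridge script's logic, not the actual code under review.

```python
# Toy version of the agreed ordering: the transform is pure, the new file is
# written before anything live is touched, and bridge_now (the only step
# with side effects) runs last, and only when activation is requested.

def transform(eni_text):
    # Stand-in transformation: rename each ethN stanza to a br-ethN bridge.
    return eni_text.replace("iface eth", "iface br-eth")

def bridge_all(eni_text, activate, write_file, bridge_now):
    new_eni = transform(eni_text)  # pure; if this fails, nothing was touched
    write_file(new_eni)            # all bets are off unless this succeeds
    if not activate:
        return new_eni             # dry run: stop before any side effects
    bridge_now(new_eni)            # the one step that changes live state
    return new_eni

written, bridged = [], []
out = bridge_all("iface eth0 inet dhcp", True, written.append, bridged.append)
print(out)  # iface br-eth0 inet dhcp
```

Saving and restoring routes around the `bridge_now` step, as dooferlad suggests, would slot in around that single side-effecting call without disturbing the ordering.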
<alexisb> babbageclunk, voidspace awesome
<babbageclunk> alexisb: just doing the juju diff to update the dependency now.
<dooferlad> frobware: http://reviews.vapour.ws/r/4741/ updated
<mup> Bug #1576728 opened: ConnectSuite.TestLocalConnectError: windows cannot connect to local lxd server <ci> <go1.6> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576728>
<voidspace> alexisb: we're playing whackamole with maas schema oddities :-/
<voidspace> alexisb: we'll get there though
<babbageclunk> voidspace, frobware: review please? http://reviews.vapour.ws/r/4742/
<babbageclunk> voidspace: figured I'd just do this bit
<voidspace> babbageclunk: you beat me to it!
<babbageclunk> voidspace: :)
<voidspace> babbageclunk: thanks, LGTM
<alexisb> voidspace, understood, thank you for the quick turnaround; it makes it so the CI runs can continue to move forward
<babbageclunk> voidspace: I'm going to quickly make a list of the attributes we use and see if roaksoax or anyone else in #maas can point out others that are optional
<voidspace> babbageclunk: it's any that appear in the gomaasapi schema - so depending on what you mean by "we use" it's not just the ones we use
<voidspace> babbageclunk: but yeah, catching these ahead of time would be good
<babbageclunk> voidspace: yeah, that's what I meant - frustratingly, in some of these cases they're not even things we use
<babbageclunk> voidspace: we're still good to JFDI things, right?
<babbageclunk> voidspace: (optional or nullable, I guess)
<voidspace> babbageclunk: yep
<voidspace> babbageclunk: and yep
<babbageclunk> voidspace: huh. Exact spelling of JFDI please?
<mgz> I feel like our blocking process has not been working the past few weeks
<mgz> babbageclunk: how about we just use fixes-BUGNUM instead
<katco> ericsnow: redir: hey i'm going to miss this morning's standup... guy who is refinishing our floor just showed up
<mgz> babbageclunk: I marked bug 1576368 blocking so don't jfdi, use magic string 'fixes-1576368'
<mup> Bug #1576368: blockdevice 2.0 schema check failed: model: expected string, got nothing <blocker> <ci> <deploy> <maas-provider> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1576368>
<babbageclunk> mgz: Ok, thanks
<ericsnow> katco: k, good luck
<redir_> katco: standup?
<redir_> katco: OK. Missed the scrollback, eric passed it on.
<perrito666> wallyworld: still working?
<mup> Bug #1576743 opened: juju agent 1.25.5 crashes on ubuntu 14.04 in an private openstack cloud <juju-core:New> <https://launchpad.net/bugs/1576743>
<mup> Bug #1576750 opened: juju2 usability: many options have to be specified for every bootstrap <landscape> <juju-core:New> <https://launchpad.net/bugs/1576750>
<mup> Bug #1576756 opened: cannot shut down controllers running other people's models <juju-core:New> <https://launchpad.net/bugs/1576756>
<rogpeppe1> anyone know how i can "juju switch" to a different account?
<cherylj>  rogpeppe1 you mean be a different user?
<rogpeppe1> cherylj: yes
<rogpeppe1> cherylj: i'm getting "ERROR getting controller environ: getting bootstrap config from API: permission denied (unauthorized access)" when i try to destroy a controller
<rogpeppe1> cherylj: and i think the problem is probably that i'm using the wrong account
<cherylj> there was a switch-user, I thought.   hmm  maybe that was removed
<mup> Bug #1576756 changed: cannot shut down controllers running other people's models <juju-core:New> <https://launchpad.net/bugs/1576756>
<rogpeppe1> cherylj: i see two users for the controller in accounts.yaml
<rogpeppe1> cherylj: but the current user is not the admin user
<cherylj> rogpeppe1: maybe logout / login?
<rogpeppe1> cherylj: i don't think that's what i want
<rogpeppe1> cherylj: looks like logout will delete my password
<rogpeppe1> http://paste.ubuntu.com/16127920/
<frobware> dooferlad, voidspace, babbageclunk: PTAL @ http://reviews.vapour.ws/r/4743/
<cherylj> rogpeppe1: sounds like you want to change your password first :)
<cherylj> I've never done that before, so I can't tell you what will happen
<rogpeppe1> cherylj: tbh i don't want to clear the credentials from the client
<rogpeppe1> cherylj: i ended up just manually editing the accounts.yaml file
<cherylj> that works too
<rogpeppe1> cherylj: thanks for the help
<dooferlad> frobware: looking
<dooferlad> frobware: http://reviews.vapour.ws/r/4741/ is updated again
<dooferlad> frobware: +1
<babbageclunk> voidspace: this is the schema as we have it now (munged for a bit more clarity): http://paste.ubuntu.com/16127709/
<babbageclunk> roaksoax is going to get someone to go through it.
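The class of failure being chased here can be illustrated with a toy checker; `check_string` below is a hypothetical stand-in for gomaasapi's schema checks, not its actual code.

```python
# Toy version of the schema failure mode: a checker that demands a string
# fails with "expected string, got nothing" when MAAS omits the field, and
# marking the field optional (or nullable) is the per-field fix.

def check_string(value, field, optional=False):
    if value is None:
        if optional:
            return None
        raise ValueError("%s: expected string, got nothing" % field)
    if not isinstance(value, str):
        raise ValueError("%s: expected string, got %r" % (field, value))
    return value

blockdevice = {"model": None, "name": "sda"}
print(check_string(blockdevice["model"], "model", optional=True))  # None
print(check_string(blockdevice["name"], "name"))                   # sda
```

Auditing the schema up front, as babbageclunk proposes, amounts to deciding `optional=True` per field once, instead of discovering each one from a CI failure.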
<frobware> dooferlad: thanks. I raised the question about `ip route' and IPv6 - what happens when the iface has an IPv6 route in addition to 4?
<frobware> cherylj: master is blocked... or not?
<cherylj> frobware: it is blocked
<cherylj> there are new failures
<cherylj> and they need to be addressed
<frobware> cherylj: ok, see you next week. :)
<cherylj> haha
<cherylj> frobware: it's going to be hot and humid.  Hope you're prepared
<frobware> cherylj: it's trying to snow here... ???
<cherylj> https://weather.com/weather/tenday/l/78757:4:US
<frobware> ooh, some of the numbers start at 8
<cherylj> high of 30C Friday
<frobware> cherylj: I'll go with old money; 80 sounds way higher
<voidspace> babbageclunk: awesome
<voidspace> babbageclunk: nice work
<cherylj> hehe
<mup> Bug #1576778 opened: juju2 tab-completion for sub commands is broken in beta6 <landscape> <usability> <juju-core:New> <juju (Ubuntu):New> <https://launchpad.net/bugs/1576778>
<alexisb> katco, ping
<katco> alexisb: pong
<alexisb> heya katco , we need some help unblocking master and cherylj is currently occupied with sprint tasks for next week
<alexisb> katco, can you please rally the troops around these 1.6 failures:
<alexisb> https://goo.gl/0jYcTy
<katco> alexisb: sure thing
<alexisb> thank you, I am here if you have qs
<alexisb> though I will be changing locations here in a minute
<katco> alexisb: thanks
<katco> alexisb: for bug 1576670. what substrates are still on 1.2? i thought we had 1.6 backported to trusty and we were now fully on 1.6?
<mup> Bug #1576670: undefined: tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 <blocker> <build> <centos> <ci> <go1.2> <go1.6> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576670>
<alexisb> 1.6
<alexisb> no more 1.2
<cherylj> katco: that bug started the discussion this morning
<cherylj> of whether or not we still use go 1.2 for our windows / centos unit tests
<cherylj> because those unit tests fail under go 1.6
<katco> cherylj: if we do, that's a bug against CI i think? windows can use 1.6 without any problem. centos i'm less sure of. what do you think?
<cherylj> so if we can get the windows / centos tests working with go 1.6, that bug will be invalid
<mgz> katco: 'without any problem' is the issue
<cherylj> yeah
<katco> cherylj: mgz: ah i see... chicken and egg problem? there are some centos/windows tests failing on 1.6?
<cherylj> katco: exactly :)
<cherylj> so we're asking for help to fix those other tests
<alexisb> 1.6 all the way!
<katco> cherylj: read my mind: i think we should push forward. mark bug 1576670 as invalid and flag failing tests as blockers
<mup> Bug #1576670: undefined: tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 <blocker> <build> <centos> <ci> <go1.2> <go1.6> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576670>
<mgz> 200% compile time slowdowns all the way!
<cherylj> huzzah!
<katco> mgz: that has been addressed in tip :)
<katco> cherylj: so are you fine with me marking this as invalid? can you raise the other test failures as blocking bugs?
<cherylj> katco: already did
<cherylj> :)
<katco> cherylj: ty kindly :)
<alexisb> katco, what we need from the core team is to fix the test failures
<katco> alexisb: agreed
<mgz> katco: yeah, it's now just a 100% slowdown
<katco> mgz: native go compiler > c compiler, and slowdowns will be addressed over time :) it's a good thing overall!
<katco> ericsnow: redir: redir_: what are you 2 up to?
<katco> perrito666: what are you working on?
<mgz> katco: I'm not being too serious :)
<cherylj> okay, I'm going to get some lunch.  bbiab
<ericsnow> katco: syslog
<katco> ericsnow: the 1-pager?
<alexisb> folks these blockers are top priority so if you have a nice stopping point
<ericsnow> katco: yep
<redir> katco: The LTS update fix and ian's feedback.
<katco> ericsnow: sounds like that's getting trumped
<katco> redir: k, carry on
 * redir nods
<katco> redir: hi btw
<redir> morning katco :)
 * alexisb changes locations brb
<katco> ericsnow: you have your pick:
<katco> https://bugs.launchpad.net/juju-core/+bugs?field.searchtext=&orderby=-importance&search=Search&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.st
 * katco now understands why alexisb put that in a url shortener
<ericsnow> katco: k
<katco> ericsnow: lmk which one you pick up so i can pick something else
<ericsnow> katco: it'll be a few minutes before I can get to a stopping point
<katco> ericsnow: np at all, same here
<mup> Bug #1576670 changed: undefined: tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 <blocker> <build> <centos> <ci> <go1.2> <go1.6> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576670>
<mup> Bug #1576778 changed: juju2 tab-completion for sub commands is broken in beta6 <landscape> <usability> <juju-core:New> <juju (Ubuntu):New> <https://launchpad.net/bugs/1576778>
<mup> Bug #1576318 opened: juju register not clear that you're creating a new password <apport-bug> <i386> <usability> <xenial> <juju-core:Triaged> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1576318>
<mup> Bug #1576805 opened: Juju controllers set the rsyslog NetstreamDriver affecting all subsequent rsyslog configuration <juju-core:New> <https://launchpad.net/bugs/1576805>
<katco> cherylj: hey, build bot is wrong? https://github.com/juju/juju/pull/5318
 * cherylj looks
<mgz> katco: it's being fussy about high vs critical
<katco> mgz: ah ok. jfdi?
<mgz> go again, I edited the bug
<katco> mgz: ta
<mgz> use $$fixes-NNN$$
<katco> mgz: sanity check: for bug 1576728, i'm just not going to run any tests for lxd on windows
<mup> Bug #1576728: ConnectSuite.TestLocalConnectError: windows cannot connect to local lxd server <blocker> <ci> <go1.6> <regression> <test-failure> <unit-tests> <windows> <juju-core:In Progress by cox-katherine-e> <https://launchpad.net/bugs/1576728>
<katco> cherylj: ^^
<cherylj> katco: sounds good.  You may also need to include a skip for centos too
<cherylj> so, I guess if ! ubuntu
<katco> mgz: cherylj: when/if lxd is supported on windows, this code will have to be modified to be platform independent anyway. the error instructions don't make sense for anything but ubuntu
<katco> cherylj: yeah good point
<mgz> katco: I'm not sure on that,
<mgz> we have some general isolation test failures with lxd
<cherylj> but yeah, we'd want to move to a mocked out lxd client
<katco> mgz: i don't follow
<katco> cherylj: not only that, but these tests would be testing code that would never be exercised on those platforms
<cherylj> that's true
<mgz> katco: I've not looked in detail, but we have a bunch of tests that aren't very unit-y and actually try to talk to the local lxd
<mgz> which should really just not do that at all, rather than be skipped on some platforms
<katco> mgz: oh, yes. i completely agree there. just trying to remove the blocker pedantically
<katco> mgz: not really prepared to do major surgery atm :)
<mgz> katco: eg bug 1564511 - which is a different suite
<mup> Bug #1564511: cmd/jujud/reboot tests fail with lxd container running <ci> <regression> <s390x> <tech-debt> <test-failure> <unit-tests> <xenial> <juju-core:Triaged> <https://launchpad.net/bugs/1564511>
<katco> mgz: yeah. that's just so awful.
<mgz> so, I'm not anti skipping on windows where the test otherwise makes sense but isn't code we ever run on windows
<katco> mgz: i'll lay down robustness in layers. i'll address platform robustness in this patch
<redir> katco: ericsnow got a minute to help me understand PatchValue?
<ericsnow> redir: sure
<katco> redir: sure, moonstone?
<redir> sure
<ericsnow> mgz: do you know of any problems we have had (testing or otherwise) due to Windows's long clock tick (15.6 ms by default)?
<mup> Bug #1571783 changed: Windows unit tests cannot setup under go 1.6 <ci> <go1.6> <jujuqa> <regression> <test-failure> <unit-tests> <windows> <juju-core:Fix Released by hduran-8> <https://launchpad.net/bugs/1571783>
<mgz> ericsnow: not with juju I think
<ericsnow> mgz: k
<ericsnow> mgz: I'm thinking that the bug I'm working on may be due to that long clock tick
<mgz> ericsnow: it's possible, given we have a number of rather timing dependent tests
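The failure mode ericsnow suspects is easy to reproduce in principle: on a system with a coarse clock tick (about 15.6 ms by default on Windows), a short sleep is rounded up to a whole tick, so a test asserting "this took under 10 ms" can fail spuriously. A small probe, with no Windows-specific claims beyond the rounding behaviour:

```go
package main

import (
	"fmt"
	"time"
)

// measureSleep returns how long a nominal sleep actually took.
// time.Sleep guarantees *at least* d; on a coarse-tick scheduler a
// 1ms request may come back as ~15ms, breaking tight timing asserts.
func measureSleep(d time.Duration) time.Duration {
	start := time.Now()
	time.Sleep(d)
	return time.Since(start)
}

func main() {
	elapsed := measureSleep(time.Millisecond)
	// Prints true everywhere; the interesting part is how much larger
	// than 1ms elapsed is on a coarse-tick system.
	fmt.Println(elapsed >= time.Millisecond)
}
```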
<mup> Bug #1576851 opened: juju debug-log -i unit-rabbitmq-server-0 is unfriendly <juju-core:New> <https://launchpad.net/bugs/1576851>
<alexisb> anastasiamac, are you there
<anastasiamac> alexisb: i am
<alexisb> heya
<anastasiamac> alexisb: do u have a sec?
<alexisb> I do
<anastasiamac> could we go to our 1:1?
<alexisb> yep
<mup> Bug #1576873 opened: Juju2 cannot deploy centos or windows workloads on maas 1.9 <blocker> <centos> <ci> <maas-provider> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576873>
<mup> Bug #1576874 opened: restore-backup never completes <backup-restore> <blocker> <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1576874>
<perrito666> alexisb: still here?
<redir> thanks katco ericsnow that was very useful
<alexisb> perrito666, yes
<alexisb> wuz up?
<perrito666> alexisb: priv
<katco> redir: np
<ericsnow> katco: you have a sec for a quick review?  http://reviews.vapour.ws/r/4746/
<katco> ericsnow: sure tal
<katco> ericsnow: do we have a way to test this empirically?
<ericsnow> katco: test what exactly?
<katco> ericsnow: that the fix for your bug works
<ericsnow> katco: I suppose I could spin up my KVM, update it, etc. and then run that test and hope my local timing isn't slow anyway...
<ericsnow> katco: but running in KVM might mask the issue
<ericsnow> probably
<katco> ericsnow: i think it'd be enough to land the fix and coordinate with ci folks
<katco> ericsnow: to ensure it's really solved
<ericsnow> katco: that's what I was thinking
<katco> ericsnow: shipit
<ericsnow> katco: thanks
<ericsnow> mgz: are all the go-1.6, windows blocks supposed to be critical?
<ericsnow> mgz: trying to merge a fix
<katco> ericsnow: flag your bug as critical to get it through
<ericsnow> katco: done
<perrito666> :950
<perrito666> meh, wrong window
#juju-dev 2016-04-30
<redir> ericsnow: you gone?
<redir> ericsnow: if not here's http://reviews.vapour.ws/r/4747
<mup> Bug #1576911 opened: github.com/juju/juju/environs timeout (sigquit) <centos> <ci> <go1.6> <regression> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1576911>
<redir> Have a great weekend #juju-dev
 * redir is eow
<mup> Bug #1576913 opened: StatusHistorySuite.TestPruneStatusHistory <blocker> <ci> <go1.6> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576913>
<cherylj> oh sinzui, haven't you opened enough bugs for one day?
 * cherylj sighs
<mup> Bug #1576985 opened: aggregateSuite.TestBatching wrong size <ci> <intermittent-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1576985>
<a123> I'm trying to get juju-2 bootstrap to work from behind a proxy. The image downloads fine, then appears to hang during package installation. I connected to the image via lxc exec <cont> bash, verified apt-get was running, then grep -iRn proxy /etc/apt/* which returned nothing. Shouldn't there be proxy info under /etc/apt/ ?
#juju-dev 2016-05-01
<mup> Bug #1576743 changed: juju agent 1.25.5 lost on ubuntu 14.04 in an private openstack cloud <openstack-provider> <juju-core:Incomplete> <juju-core 1.25:New> <https://launchpad.net/bugs/1576743>
<mup> Bug #1365828 changed: Default juju install  caught in logspam loop <juju-core:Triaged> <https://launchpad.net/bugs/1365828>
<mup> Bug #1376334 changed: lxc template creation, give option to use mirror <feature> <landscape> <lxc> <juju-core:Triaged> <https://launchpad.net/bugs/1376334>
<mup> Bug #1576686 changed: bundle repository/bundles-lxd.yaml is invalid <juju-ci-tools:New> <juju-core:Invalid> <https://launchpad.net/bugs/1576686>
<thumper> morning folks
<veebers> morning thumper :-)
<thumper> veebers: when do you change teams?
<veebers> thumper: Today is my first day on JujuQA
<thumper> ah ha
<thumper> awesome
<thumper> who are you reporting to?
<veebers> thumper: Torsten
<thumper> veebers: do you have enough to do today? or do you want a brain dump?
<veebers> thumper: I'm just going through the new starter tasks so far, can I hit you up later today for that dump? :-)
<thumper> sure
<thumper> veebers: my team has a standup at 9am
<thumper> you are welcome to come along and listen in
<thumper> I'd really like some QA on that call :)
<veebers> thumper: sweet, perhaps not today but sometime this week onwards. I first meet my new team tomorrow morning
<mup> Bug #1484688 changed: [Juju HA] Past logs do not get synced to new slave state servers <docteam> <ha> <logging> <juju-core:Invalid> <https://launchpad.net/bugs/1484688>
#juju-dev 2017-04-24
<wallyworld> sure
<wallyworld> babbageclunk: i think the PR needs a new test? see my comment
<wallyworld> thumper: which bug? there's nothing on the 2.2 beta3 milestone. i could search but if you have the number handy....
<babbageclunk> wallyworld: thanks
 * thumper looks for wallyworld
<thumper> https://bugs.launchpad.net/juju/+bug/1669540
<mup> Bug #1669540: state: storeManager dies permanently when it should not <jaas> <sprint> <juju:Triaged> <https://launchpad.net/bugs/1669540>
<wallyworld> oh that one
<thumper> see related comments at the end of the bug
<wallyworld> i wasn't intending to look at it
<wallyworld> i have oracle provider stuff queued up
<wallyworld> i want to get that done for beta3
<wallyworld> i could look tomorrow
<thumper> wallyworld: where would I get the tools sha256 from?
<wallyworld> there's a tools doc in mongo or are you asking where's the code to calculate the hash?
<thumper> I can see the calculation in the unpack tools
<thumper> but I'm doing the 1.25 upgrade
<thumper> perhaps a 5 minutes Hangout would be better
<wallyworld> sure
<thumper> got time?
<wallyworld> 1:1
<thumper> wallyworld: 1:1
<babbageclunk> wallyworld: there was already a test for updating - I tweaked it to make it clear that it preserves arbitrary firewall rule names.
<wallyworld> babbageclunk: ok. was there also an issue where adding a new rule with updated cidrs was an issue?
<wallyworld> or applying a new cidr to an existing rule?
<wallyworld> i thought there was a gap like that you mentioned?
<babbageclunk> wallyworld: That test does add a CIDR to an existing rule. Do you mean handling swapped orders of CIDRs? I've got a test at the ruleSet level.
<wallyworld> yeah, ok, sounds good. i didn't digest all the new tests
<babbageclunk> wallyworld: sweet.
<babbageclunk> wallyworld: Looks like the error swallowing I thought I saw was actually because I was watching log messages from juju.apiserver.remotefirewaller and juju.worker.firewaller, and the error was reported by juju.worker.dependency, so I didn't see it. Checking that now
<wallyworld> ok
<wallyworld> babbageclunk: can you look at this trivial PR to get Oracle provider working again?
<wallyworld> https://github.com/juju/juju/pull/7269
<babbageclunk> wallyworld: sure
<babbageclunk> wallyworld: LGTM'd
<wallyworld> babbageclunk: tyvm
<wallyworld> thumper: can you recall where the code to shorten the lxd instance ids lives? was that in a utils somewhere or was it a bespoke lxd bit of code? or similarly for maas
<thumper> I think it is in the instance package
<thumper> tail of a UUID
<babbageclunk> wallyworld: It looks like this check is the cause of the ports not being closed: https://github.com/juju/juju/blob/staging/worker/firewaller/firewaller.go#L1124
<babbageclunk> wallyworld: Removing it makes my new test pass, doesn't break any of the other ones. Any idea why it's there?
<wallyworld> babbageclunk: hmmm, yes, that was to fix something else
<wallyworld> um
<wallyworld> i can't recall exactly what now, but it showed up in manual testing maybe
<wallyworld> when you deleted a relation, the ports didn't close
<wallyworld> maybe with your work, it's now fixed implicitly
<wallyworld> i think it was that - create a remote relation. see the ports open, delete the relation, the ports should close
<babbageclunk> wallyworld: ok, I'll check that manually.
<wallyworld> babbageclunk: or
<wallyworld> i think you then need to add the relation again perhaps, and see the ports open again
<babbageclunk> ah, ok
<wallyworld> so yeah, just a small manual test to check everything
<babbageclunk> wallyworld: sounds good
<babbageclunk> wallyworld: review plz? https://github.com/juju/juju/pull/7270
<wallyworld> sure
<babbageclunk> thanks!
<babbageclunk> as a reward, some delightful doggo pics: https://www.reddit.com/r/rarepuppers/
<wallyworld> lol
<wallyworld> babbageclunk: so the manual testing worked. i wonder why i put that line in
<wallyworld> babbageclunk: before you land, we need to get the vsphere thing fixed or else it won't go through, so do try yet
<wallyworld> *don't
<babbageclunk> ok, shan't
 * babbageclunk goes for a wee run
<wallyworld> babbageclunk: that means something different here in australia
<wallyworld> just saying
<blahdeblah> wallyworld: that's a wee walk
<blahdeblah> wee runs are anything under 5km
<wallyworld> depends on how much beer
<wallyworld> axw: your PR is on its way....
<axw> wallyworld: thanks
<wallyworld> menn0: here's a small PR for the watcher bit https://github.com/juju/juju/pull/7272
<wallyworld> i wish the bug only contained one issue rather than 2
<menn0> wallyworld: looking
<menn0> wallyworld: done
<wallyworld> ta
<wallyworld> jam: unless i am wrong, containers have never inherited their host machine's constraints, and that's not something that has been changed in the last week or so, right?
<jam> wallyworld: it has never been the intent, but I do believe I've seen containers inheriting the host machines's space constraints for a reason I don't understand
<wallyworld> oh, i see. that is unintended i'm sure
<wallyworld> i saw the email to the list and it seems quite a bad issue for him
<blahdeblah> juju add-model has a --credential flag; is there a way to set this after the model has been added?
<blahdeblah> or show which credentials are in use for a model?
<tomk88> Hi, if there's somebody awake now:
<tomk88> I am writing new provider for juju. When I implement the provider interfaces, I am constantly getting "method Id should be ID [go/golint]" from golint. do you guys have any recommendation how to turn this particular warning off in golint?
<tomk88> I am using vim, vim-go, YouCompleteMe
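tomk88's warning comes from golint enforcing Go's initialisms convention (ID, URL, HTTP stay uppercase), and golint has no inline suppression mechanism. The clean fix is the rename; a sketch with a hypothetical type:

```go
package main

import "fmt"

// Instance is a hypothetical provider type for illustration. golint
// flags a method named Id because Go convention keeps the initialism
// uppercase. Note: if an existing interface (as in juju's provider
// contracts) requires the exact name Id, you can't rename it to
// satisfy the linter; the common workaround is filtering golint's
// output, e.g. `golint ./... | grep -v "Id should be ID"`.
type Instance struct {
	id string
}

// ID is the lint-clean spelling.
func (i Instance) ID() string { return i.id }

func main() {
	fmt.Println(Instance{id: "machine-0"}.ID())
}
```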
<bdx> rick_h: new bug here https://bugs.launchpad.net/juju/+bug/1685782
<mup> Bug #1685782: juju deployed LXD inherit constraints from the host causing lxd deploys to fail <juju:New> <https://launchpad.net/bugs/1685782>
<bdx> please put some heat on that
<rick_h> bdx: rgr
<zeestrat> Anyone got a spare cycle for troubleshooting steps on a model migration issue in #1680392?
<mup> Bug #1680392: Model migration fails on large model <juju:Incomplete> <https://launchpad.net/bugs/1680392>
<balloons> Can I get a review of https://github.com/juju/juju/pull/7273?
#juju-dev 2017-04-25
<rogpeppe> some cleanup of some of the juju/application command tests: https://github.com/juju/juju/pull/7274/
<rogpeppe> anyone around that might be able to do a review of this? https://github.com/juju/juju/pull/7274
<mup> Bug #1685382 changed: max_user_instances in inotify runs low on juju-db units - version 1.25.6 <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1685382>
<anastasiamac> thumper: wallyworld: this needs to be assigned/addressed asawc - bug 1627127
<mup> Bug #1627127: resource-get gets hung on charm store <cdo-qa> <cdo-qa-blocker> <juju:Triaged> <https://launchpad.net/bugs/1627127>
<thumper> asawc?
<thumper> nm
<thumper> as soon as we can?
<anastasiamac> yes :)
#juju-dev 2017-04-26
<wallyworld> axw: builders etc on 1.8 now. i tried to merge the earlier ec2 error status related PR and after an initial bot issue, it's run with some test failures; i haven't looked in detail; a few url parsing things etc
<axw> wallyworld: yeah thanks, I'm merging develop into the feature branch - I think that should fix those failures
<wallyworld> sgtm
<wallyworld> i forgot that was on the fb
<wallyworld> hence it won't have the 1.8 test fixes
<seyeongkim> hello, any news about https://bugs.launchpad.net/juju/+bug/1616704 ?
<mup> Bug #1616704: juju restore-backup does not complete <backup-restore> <cpe> <juju:Triaged by thumper> <https://launchpad.net/bugs/1616704>
<wallyworld> seyeongkim: no news as yet as far as I know. there's a performance issue there, but no update yet
<seyeongkim> thanks wallyworld
<wallyworld> seyeongkim: there's a comment on bug 1680683 which may be related. i suspect there's a couple of underlying issues that need to be addressed. some work has already been done as indicated in the comment, but there's still more to do
<mup> Bug #1680683: Poor "juju create-backup" performance <canonical-is> <juju:Triaged> <https://launchpad.net/bugs/1680683>
<seyeongkim> I see wallyworld
<axw> babbageclunk: belated happy birthday :)
<babbageclunk> axw: thanks!
<wpk> anyone wants to review this small one? https://github.com/juju/juju/pull/7280
<ashipika> good morning.. who has experience with the txn pruning?
<wpk> jam: you're needed :)
<wpk> axw: What about CpuPower?
<wpk> 13ae680e4ac environs/manual/detection.go                     (Andrew Wilkins     2013-08-15 17:21:16 +0800 156)         // TODO(axw) calculate CpuPower. What algorithm do we use?
<axw> wpk: I wouldn't worry about it, it's not as widely used as cpu-cores
<axw> (AFAIK)
<jam> ashipika: what's up?
<ashipika> jam: txn cleanup failed.. error while iterating.. io timeout.. is it safe to retry?
<jam> ashipika: what command were you running?
<jam> it is generally ok, except you can't expect it to actually work the second time if it failed the first
<ashipika> jam: we're migrating our DB.. and pruning txns.. so a custom command.. using the CleanAndPrune method in juju/txn
<ashipika> jam: while would it time out, is my question..
<jam> 'why would it time out' ?
<jam> ashipika: migrating from what to what?
<jam> ashipika: does this happen to be a mongo2.4 source using the CleanAndPrune that is in mgopurge 1.4?
<jam> as there are fixes in juju/txn to handle mongo2.4 better
<mwhudson> uh juju-core autopkgtests fail in artful: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-artful/artful/amd64/j/juju-core/20170426_074500_fee63@/log.gz
<mwhudson> my read of that is that there is no jujud in the streams for artful?
<mwhudson> but maybe not
<rogpeppe> a rename of some types that were confusing me in juju: https://github.com/juju/juju/pull/7283
<rogpeppe> review appreciated
<rogpeppe> jam: fancy taking a look? ^
<jam> rogpeppe: isn't there a separate 'cmd' package that isn't specific to Juju which is the reason we have CommandBase vs JujuCommandBase ?
<rogpeppe> jam: not really
<rogpeppe> jam: i mean, there *is* a separate cmd package
<rogpeppe> jam: but modelcmd already has CommandBase
<rogpeppe> jam: but the "Base" in that name implies something different from the "Base" in all the other type names
<rogpeppe> jam: i've been trying to sort the whole picture out in my head, and wrote this summary: http://paste.ubuntu.com/24459695/
<rogpeppe> jam: where CommandBase is an evident anomaly
<rogpeppe> jam: this PR is proposing changing that so it looks like: http://paste.ubuntu.com/24459702/
<rogpeppe> jam: FWIW I also think that sysCommandWrapper should probably be controllerCommandWrapper but that's less important
<rogpeppe> jam: are you planning to review it or should I try to find someone else?
<jam> sorry, I have to take my son to music for the next hour, it seems reasonably ok, as its just a mechanical change and i don't mind the names, but I haven't thought deeply about it yet
<anastasiamac> wallyworld: axw: I've updated https://github.com/juju/juju/pull/7265, PTAL :) would b nice to have it in for b3 \o/
<rogpeppe> jam: ok, thanks
<rogpeppe> anastasiamac, wallyworld, axw: any chance of a review of this mechanical rename PR, by any chance? https://github.com/juju/juju/pull/7283
<anastasiamac> rogpeppe: m about to eod :D can b done tomorrow?
<anastasiamac> rogpeppe: i mean, it's only 10pm here... but it's been a looong couple of days :)
<rogpeppe> anastasiamac: i'd prefer to get it in today, but np
<rogpeppe> anastasiamac: "only" 10pm!
<rogpeppe> anastasiamac: stop working already! :)
<anastasiamac> rogpeppe: :) if noone looks at it today, i'll do it tomorrow \o/
<rogpeppe> anastasiamac: tyvm
<rogpeppe> anastasiamac: do we still have on-call-reviewers, BTW?
<rogpeppe> jam: thanks a lot
<bdx> question concerning the `deploy` command
<bdx> is this expected http://paste.ubuntu.com/24460316/ ?
<jrwren> bdx: I think that it is, if nothing is yet placed on that machine.
<bdx> jrwren: ok, thanks
<wpk> Q: is it possible to have two API functions with the same name with different arguments/return values for different API versions? If so, is there an example?
<wpk> (and if not, where could I find an example of API version-dependent call?)
<babbageclunk> wpk: ping?
<babbageclunk> wallyworld: busy?
<wallyworld> babbageclunk: yeah, in meeting, will ping soon
<babbageclunk> wallyworld: cool thanks
<wallyworld> veebers: join release call now?
<wallyworld> anastasiamac: ^^^^^
<veebers> wallyworld: omw
<wallyworld> babbageclunk: sorry, meeting finished and i forgot about you
<babbageclunk> wallyworld: hey now!
<wallyworld> did you want to talk?
<babbageclunk> yes, please - just plugging some stuff back in.
<wallyworld> righto
<babbageclunk> in standup?
<anastasiamac> babbageclunk: wallyworldalso forgot about my nice, updated PR just for him :D
<anastasiamac> wallyworld also *
<wallyworld> no he didn't
<wallyworld> it's been reviewed
<anastasiamac> well, no he did not... :)
#juju-dev 2017-04-27
<stokachu> wallyworld: any word on the oracle provider not listed anymore?
<stokachu> s/provider/cloud
<axw> menn0: any chance of a review on https://github.com/juju/juju/pull/7278?
<menn0> axw: looking
<menn0> axw: done
<axw> menn0: awesome, thanks
<axw> menn0: I've simplified the SetSSHKeys as suggested, but left the other one alone. can you please see if my reply is reasonable?
<menn0> axw: all good, thanks. merge away
<axw> menn0: thanks
<axw> babbageclunk wallyworld: ready whenever you two are
<wallyworld> righto
 * thumper unpicks convoluted code
<wallyworld> axw: i'll see you in HO once babbageclunk pings back
<axw> okely dokely
<babbageclunk> Sorry guys, was on the phone, finished now
<babbageclunk> axw, wallyworld: ^
<thumper> hmm...
<thumper> wallyworld: do you have 5 minutes?
<thumper> this code looks and feels wrong
<thumper> and I want to double check
<wallyworld> thumper: sure, just otp with andrew and xtian, soon?
<thumper> sure, ping when done
<wallyworld> thumper: free now
<thumper> wallyworld: 1:1
<wallyworld> thumper: sorry, cut you off
<thumper> nm
<thumper> was just going to ask if you were ready for karaoke next week
<anastasiamac> wallyworld: axw: do u know if this is something expected? https://bugs.launchpad.net/juju/+bug/1686585
<mup> Bug #1686585: Juju deploy via UI fails on Azure <juju:New> <https://launchpad.net/bugs/1686585>
<axw> anastasiamac: nope
<anastasiamac> axw: :D 'nope' = dunno or 'nope'='not expected'?
<wallyworld> you mean is a failure to deploy expected?
<axw> anastasiamac: not expected. it's a bug.
<axw> what wallyworld said. it would be pretty odd for a basic deploy of mariadb to be expected to fail
<anastasiamac> wallyworld: yes, mayb under some circumstances we know there could b failures.. kind of like under some circumstance we know we have difficulties ;)
<wallyworld> us? never :-)
<anastasiamac> axw: wallyworld: yes. this was my expectation too.. just finding way to flag to u - 'failure' :D
 * wallyworld saw the bug
<axw> anastasiamac: I'll have a look into it after I'm done landing ssh things
<axw> also gotta do some azure auth changes
<anastasiamac> axw: awesome \o/
<thumper> wallyworld: fyi, there are other tests that make sure you can only use --force with --series
<thumper> so ... whatever
<wallyworld> i should have remembered that
<wallyworld> been a while
<thumper> so...
<thumper> much of that other code is bollocks
<thumper> because it is only valid if force is true and series is passed in
<wallyworld> yeah
<thumper> oh well
<wallyworld> maybe we intended to allow --force without series at one point
<thumper> so... the LTS case will never happen
<thumper> yeah...
<thumper> maybe
<thumper> I'll leave it that way for now...
<thumper> but perhaps we should look to fix it later
<wallyworld> yep
<thumper> I'll leave a note
<thumper> wallyworld: https://github.com/juju/juju/pull/7285
<wallyworld> looking
<wallyworld> babbageclunk: for tomorrow when you start your day and can't face writing code straight away https://github.com/juju/juju/pull/7286
<babbageclunk> wallyworld: I mean, I guess it's net deletion so should be easy right?
<wallyworld> supposedly
<wallyworld> cut and paste
<wallyworld> 99% of it
<wallyworld> i am still finishing the hand testing
<wallyworld> thumper: lgtm, just a minor niggle
<thumper> ta
<thumper> axw: you have snap install go?
<anastasiamac> anyone cares if i self-approve a typo fix? https://github.com/juju/juju/pull/7287
<thumper> approved
<anastasiamac> thumper: \o/ tyvm!
<axw> thumper: yes I have
<thumper> axw: what do you do about gofmt?
<thumper> I'm still on 1.6, so I'll move to using the snap
<axw> thumper: nothing yet, but you can add /snap/go/current/bin to $PATH. gofmt is in there
<axw> it's just not exposed in /snap/bin
<thumper> ah
<thumper> right
<thumper> or I may just symlink from ~/bin because that's in my path already :)
<thumper> did you remove the golang package?
<thumper> or did you have go from source?
<thumper> wallyworld, menn0: either of you move to the snap from the deb?
<axw> thumper: depends on the machine. I was mostly building from source before
<axw> just took it out of my path
<menn0> thumper: same for me
<wallyworld> i'm still using deb
<thumper> I'm just thinking since /snap/bin is at the end of my PATH
<thumper> I'll probably have to remove the deb
<menn0> actually, i've got a symlink to /snap/bin/go
<thumper> this is my first snap apart from the live patch
<thumper> :)
<thumper> heh...
<thumper> go 1.8.1 from 'mwhudson' installed
<axw> anastasiamac: actually, that azure bug has been fixed in 2.2. so I guess it is expected after all :)
 * axw goes to find the bug #
<anastasiamac> axw: nice :) so it's a dupe? these are the best :)
<axw> anastasiamac: not exactly, just happened to be the same underlying issue
<axw> anastasiamac: I've just linked it a comment and marked Fix Committed
<rogpeppe> wallyworld: hiya
<wallyworld> hi
<rogpeppe> wallyworld: i just ran across a line of code that does nothing, and was wondering if it is intended to do something...
<rogpeppe> wallyworld: just asking before i delete it :)
<wallyworld> sure
<rogpeppe> wallyworld: (and your name is on it)
<rogpeppe> wallyworld: the line is 		url.Source = ""
<rogpeppe> wallyworld: in cmd/juju/crossmodel/show.go
<rogpeppe> wallyworld: around line 66
<rogpeppe> wallyworld: was the intention to remove the source from the string set in c.url ?
<wallyworld> rogpeppe: it doesn't do nothing does it? it sets the Source to empty
<rogpeppe> wallyworld: except it sets it in the local url variable which is immediately discarded
<rogpeppe> wallyworld: unless ParseApplicationURL returns a reference to some persistent value, i guess
<wallyworld> oh i see, no that's a bug
<wallyworld> but that line will never execute right now
<wallyworld> I don't think, as source will always be "" from memory
<rogpeppe> wallyworld: why not? it's not possible to specify a source?
<wallyworld> it's for when we support cross controller cmr
<wallyworld> the CLI won't let you
<wallyworld> we only support single controller cmr for now
<rogpeppe> wallyworld: at a quick glance, it looks like the url parsing code does support returning a source
<wallyworld> the Source attr is there for future use
<wallyworld> it does
<wallyworld> but the CLI errors from memory
<rogpeppe> wallyworld: so someone *could* type in a url with a source
<wallyworld> if you try and create an offer an a different controller to the current one
<wallyworld> they could but the CLI won't let them get very far
<wallyworld> this is all WIP
<wallyworld> behind a feature flag
<rogpeppe> wallyworld: ok. shall i just remove that line then?
<wallyworld> yeah, or i'll fix it as a drive by
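The no-op rogpeppe spotted is a classic Go pitfall: mutating a value that is never read again. A miniature reproduction with hypothetical names standing in for the crossmodel URL type in `cmd/juju/crossmodel/show.go`:

```go
package main

import (
	"fmt"
	"strings"
)

// URL stands in for the application URL type; Source is the
// controller prefix the code meant to strip before display.
type URL struct {
	Source string
	Name   string
}

func parse(s string) URL {
	if parts := strings.SplitN(s, ":", 2); len(parts) == 2 {
		return URL{Source: parts[0], Name: parts[1]}
	}
	return URL{Name: s}
}

// displayName mirrors the shape of the bug: the local copy is
// mutated, then the function keeps using the original string, so the
// assignment has no observable effect.
func displayName(s string) string {
	u := parse(s)
	u.Source = "" // no-op in practice: u is never read after this
	return s      // still carries the source prefix
}

func main() {
	fmt.Println(displayName("ctrl:mysql")) // prints "ctrl:mysql"
	// The intended behaviour would return u.Name ("mysql") instead.
}
```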
<wpk> babbageclunk: pong
<wpk> babbageclunk: timezones suck..
<rogpeppe> wallyworld: thanks - i've removed it for now
<wallyworld> rogpeppe: no worries, thank you
<axw> wallyworld: did you say something about QA tools and such being moved to git?
<wallyworld> axw: no
<wallyworld> i asked heather to file a bug for a CI change
<axw> wallyworld: ok
<axw> it would be nice if all our things were together
<wpk> jam: https://pastebin.canonical.com/186915/ does this look OK to you? Or do we want something more there?
<jam> wpk: looks like a good start.
<jam> wpk: as we discussed, its probably better to start with a minimal interface that is actually all in use and then grow it as we need to
<jam> wallyworld: axw: ping if you're around
<wallyworld> jam: i am somewhat
<jam> wallyworld: so I'm looking through some of the facade registries, and we're registering the *same* object with multiple versions
<jam> how can that be correct?
<jam> if we had the same object, we didn't need a version
<jam> if we changed the object, then we're violating the old api by exposing it with the new object
<wallyworld> they're not supposed to be the same - new or changed apis are there
<wallyworld> which one?
<jam> all of them that I saw
<wallyworld> oh application
<jam> wallyworld: I just sent an email about Application
<jam> but also SSHClient
<jam> and others, IIRC
<wallyworld> from memory, application facade adds new methods
<jam> MachineManager
<wallyworld> so the version 3 does use the same object as v4
<jam> wallyworld: but it *shouldn't*
<wallyworld> but v3 clients don't see the new method
<jam> wallyworld: it means we're exposing the new method on the old version
<jam> and things like libjuju
<jam> will expect it
<jam> wallyworld: they *see* it, they just don't know to ask for it
<jam> but something like libjuju will create code that will *fail* against old versions
<jam> because it expects the api to be there, because 'develop' says that it is
<wallyworld> that is something that wasn't apparent at all at the time
<wallyworld> it all works with juju's versioning mechanism
<wallyworld> but if libjuju imposes other restrictions, we'll need to change
<jam> wallyworld: if you have (4, NewMethod), with the code written there you will also have (3, NewMethod)
<wallyworld> sure, but v3 juju clients won't see it
<jam> wallyworld: again, if they ask, they see it
<jam> its there
<jam> it can be called
<jam> its exposed
<wallyworld> and nothing breaks
<jam> you're just assuming nobody ever inspects the api
<wallyworld> if they inspect it they can use it though
<wallyworld> they use what they see
<jam> wallyworld: but then they write code against a version of Juju and it breaks against the real version
<jam> wallyworld: the point of a versioned API is to *not change* the old version
<wallyworld> that is true. the assumption was that a it worked for juju clients so would have been ok
<jam> I'm also a bit surprised we've managed 4 revisions of Application with only 2.0, 2.1 and 2.2
<jam> not sure where the extra version comes in
<wallyworld> we revved the facades when we went to 2.0
<wallyworld> to avoid 1.x clients accidentally calling 2.x facades
<wallyworld> so 2.0 juju started with v2 application facade
<wallyworld> i guess we should fix the facades for 2.2
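jam's objection can be made concrete with a toy registry in the spirit of the discussion (not juju's actual facade machinery): when dispatch is by method name via reflection and two versions register the *same* object, a method added "for v4" is equally reachable through v3 — the old version's surface silently grows.

```go
package main

import (
	"fmt"
	"reflect"
)

// FacadeV4 is a hypothetical facade implementation. Registering it
// for both versions means v3 exposes NewMethod too.
type FacadeV4 struct{}

func (FacadeV4) Deploy() string    { return "deploy" }
func (FacadeV4) NewMethod() string { return "new-in-v4" }

// call dispatches by method name, as a reflection-based RPC layer
// effectively does.
func call(facade interface{}, method string) (string, bool) {
	m := reflect.ValueOf(facade).MethodByName(method)
	if !m.IsValid() {
		return "", false
	}
	return m.Call(nil)[0].Interface().(string), true
}

func main() {
	registry := map[int]interface{}{3: FacadeV4{}, 4: FacadeV4{}} // shared object
	out, ok := call(registry[3], "NewMethod")
	// A "v3" client that inspects the API finds and can call the v4
	// method — which is what breaks generated clients like libjuju.
	fmt.Println(out, ok)
}
```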
<rogpeppe> wallyworld, jam: i'm about to move cookie jars into the jujuclient.ClientStore interface so we don't get tests accidentally creating cookie files. does that seem reasonable to you?
<jam> rogpeppe: I don't quite have the context to immediately say yes/no, but I'd really like us to move cookie jars to be less of a 'global' thing
<jam> so steps in that direction sound goo
<jam> good
<rogpeppe> jam: yeah, that's what i'm working on currently
<rogpeppe> jam: cookie jars will be per-controller
<wallyworld> i guess it will be a new embedded interface
<rogpeppe> wallyworld: CookieJar(controllerName) http.CookieJar
<rogpeppe> wallyworld: or something like that
<wallyworld> that's the method, a new interface as well
<wallyworld> to follow the pattern already in place
<wallyworld> where ClientStore is composed of other interfaces
<rogpeppe> wallyworld: yeah, i guess so
<rogpeppe> wallyworld: type CookieStore interface {CookieJar(controllerName string) http.CookieJar}
<wallyworld> sgtm, ty
<rogpeppe> wallyworld: actually, the returned jar needs a Save method, but otherwise the same as http.CookieJar
<wallyworld> ok
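The shape rogpeppe and wallyworld converge on — a `CookieJar`-plus-`Save` jar handed out per controller, composed into `ClientStore` like the other sub-interfaces — might look like the following sketch. All names here are assumptions drawn from the conversation, not the final juju API:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
)

// SavableJar is http.CookieJar plus the Save method the returned jar
// needs so per-controller cookies can be persisted.
type SavableJar interface {
	http.CookieJar
	Save() error
}

// CookieStore is the new sub-interface ClientStore would compose.
type CookieStore interface {
	CookieJar(controllerName string) (SavableJar, error)
}

// memJar/memStore form a trivial in-memory implementation so tests
// never touch cookie files on disk — the original motivation.
type memJar struct{ http.CookieJar }

func (memJar) Save() error { return nil } // nothing to persist in memory

type memStore struct{}

func (memStore) CookieJar(controller string) (SavableJar, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	return memJar{jar}, nil
}

func main() {
	var store CookieStore = memStore{}
	jar, err := store.CookieJar("production")
	fmt.Println(err == nil, jar.Save() == nil)
}
```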
<jam> rogpeppe: you did the work to allow Runner.Worker() to return the underlying worker?
<rogpeppe> jam: yeah
<jam> rogpeppe: bug #1686711 I'm trying to track down our test suite getting a nil panic
<mup> Bug #1686711: panic() during lease manager <panic> <worker> <juju:Triaged> <https://launchpad.net/bugs/1686711>
<jam> and it *might* be that Worker() is returning nil without returning an error
<rogpeppe> jam: you can't reproduce it?
<jam> rogpeppe: its a random test failure
<jam> rogpeppe: my guess is that we happen to call a leadership function at exactly the right time while something else is tearing down
<rogpeppe> jam: i'm pretty sure that the Manager instance isn't nil
<rogpeppe> jam: if that's what you're thinking
<jam> rogpeppe: well, Secretary pretty much can't be nil and 'config' isn't a pointer
<jam> (all places that create it in our code pass in a Config with a valid Secretary
<jam> if workerInfo.worker was nil it would get returned
<jam> as long as workerInfo was there
<rogpeppe> jam: the stack trace shows that the first argument (the receiver) is non-nil
<rogpeppe> jam: how often is this happening?
<jam> rogpeppe: not sure, it just rejected a merge request
<jam> but nil panics in code are serious so worth investigating so that we know they aren't production issues
<rogpeppe> jam: you've seen it before?
<jam> rogpeppe: I have not, but it certainly isn't my code that is changing this
<jam> rogpeppe: and unless you're going with "bit flips due to cosmic rays" its a latent bug in our code
<rogpeppe> jam: indeed
<jam> rogpeppe: so the stack trace, Claim() only takes 3 arguments and is a method, but the stack trace shows 8 parameters to the function
<jam> rogpeppe: is it expanding 'string' objects ?
<rogpeppe> jam: yeah, that's right
<jam> rogpeppe: time.Duration is only an int64; there are still too many parameters
<rogpeppe> jam: args in turn are: receiver, str ptr, str len (9), str ptr, str len (11), duration (60s)
<jam> rogpeppe: there are still 2 more
<rogpeppe> jam: i don't think the stack-printing logic knows how many args there are
<rogpeppe> jam: given that args 2-6 look plausible, i think the first one is probably correct
<jam> some of them have ... and some don't so it seems it might know something
<jam> rogpeppe: anyway, i'm happy to agree that the receiver is probably not nil
<rogpeppe> jam: it looks like another string at the end.
<jam> rogpeppe: indeed, but not sure where that is coming from
<jam> NewManager calls config.Validate() to make sure config.Secretary != nil
<jam> I haven't found any code that changes the Secretary at runtime
<jam> rogpeppe: we *have* seen a different nil pointer panic in one of witold's submissions, wpk, do you remember which?
<wpk> Sec
<jam> 10722
<rogpeppe> jam: BTW I'd suggest that you set GOTRACEBACK=all in your CI setup
<jam> I added it to the bug
<wpk> http://juju-ci.vapour.ws:8080/job/github-merge-juju/10722/artifact/artifacts/xenial.log this one
<jam> rogpeppe: its the same failure
<rogpeppe> jam: then you'd be able to tell which test was running
<jam> rogpeppe: that one has 0x0 0x0 as the last two params
<jam> feels a bit like 'print the stack that might happen to still have stuff on it'
<rogpeppe> jam: yeah, i think it probably is
<rogpeppe> jam: FWIW http://paste.ubuntu.com/24466503/
<rogpeppe> jam: same thing when it's called through an interface
<rogpeppe> jam: i'd suggest setting GOTRACEBACK=all and waiting for it to happen again
<rogpeppe> jam: this kind of thing will be better in go1.9 - we're getting column positions on error messages
<rogpeppe> jam: i'm certain that Secretary is nil. look at this: http://paste.ubuntu.com/24466590/
<rogpeppe> jam: it faults on exactly the same address (0x30)
<jam> rogpeppe: you're saying that would be the offset of the function we're calling?
<rogpeppe> jam: i suspect something is using an uninitialised Manager somewhere
<rogpeppe> jam: no, that's the address it's trying to get a value from
<rogpeppe> jam: it's probably the offset into the interface value of the CheckLease method
<rogpeppe> jam: or of the method table
<rogpeppe> jam: ok, i know what's going on
<rogpeppe> jam: and it's all my fault :)
<rogpeppe> jam: i'm not quite sure what the best fix is though
<rogpeppe> jam: actually, not too hard to fix. i'll raise a bug for it.
<rogpeppe> wpk, jam: actually, is there an existing bug report for this?
<jam> rogpeppe: https://launchpad.net/bugs/1686711
<mup> Bug #1686711: panic() during lease manager <panic> <worker> <juju:Triaged> <https://launchpad.net/bugs/1686711>
<jam> rogpeppe: what is the bug?
<jam> well, what is the underlying issue/
<jam> ?
<rogpeppe> jam: ha! you sent that message at *exactly* the same time i clicked on Submit on https://bugs.launchpad.net/juju/+bug/1686720
<mup> Bug #1686720: worker/lease: NewDeadManager returns manager that crashes <juju:New> <https://launchpad.net/bugs/1686720>
<rogpeppe> jam: i'll leave it up to you to decide which one to go with :)
<rogpeppe> jam: i need lunch
<wpk> smacznego
<jam> rogpeppe: I'll mark a duplicate
<jam> rogpeppe: I went with yours
<jam> since it explains what is actually wrong
<Hetfield> hi guys, i need some quick help i could not find in the docs. basically is there something like juju resolved for "machines" instead of units?
<Hetfield> i have juju with maas, it tried to deploy some apps on a machine, but, due to wrong tags, juju could not get a machine. now i fixed the tags and i would like to tell juju "try again" or juju add-machine and relocate units to the new machine
<rick_h> Hetfield: so there's a retry-provisioning command that might help you there
<rick_h> Hetfield: check out the command reference here: https://jujucharms.com/docs/2.1/commands (basically juju retry-provisioning --help)
<Hetfield> rick_h: thx but it's not working
<Hetfield>    machine-status:       current: provisioning error       message: 'cannot run instances: cannot run instance: No available machine matches
<Hetfield> rick_h: now i manually added a new machine. any way to tell juju to move all units to new machine? because they are 4 lxd so i need to manually add them, so boring
<rick_h> Hetfield: no, unfortunately there's no migration path/tool there
<Hetfield> ok i'm going to open a bug for this so
<Hetfield> i just opened https://bugs.launchpad.net/bugs/1686767
<mup> Bug #1686767: juju lacks manual unit relocation on new machines (retry-provisioning not working) <juju:New> <https://launchpad.net/bugs/1686767>
<Hetfield> hope you can give some love :)
<rick_h> ty Hetfield
<wpk> babbageclunk: ping?
<babbageclunk> wpk: hey - still around? I was just wondering if you'd had an answer to your API versioning question
<wpk> babbageclunk: Yes, around, and yes
<wpk> babbageclunk: there was quite a discussion about it today, as e.g. application API presents 4 versions using the same struct
<wpk> 4 API versions with the same API
<babbageclunk> wpk: oh cool - I figured you probably would have given that it was yesterday
<wpk> babbageclunk: timezones suck...
<babbageclunk> wpk: weird!
<wpk> everyone should be on CET!
<frankban> wallyworld: ping, could you please take a look at https://github.com/juju/juju/pull/7290 . it is a change to the gui handler code that you helped implementing
<wallyworld> frankban: sure, otp but after that
<frankban> wallyworld: thanks, and have a good day
<wallyworld> you too
<wallyworld> balloons: oracle call just finished. what's the tl;dr on the release?
<balloons> wallyworld, seems we can ship 5177
<wallyworld> awesome
<menn0> wallyworld, anastasiamac, axw: now that we require Go 1.8 our install instructions in README.md and CONTRIBUTING.md might give people trouble (as per bug 1662857)
<mup> Bug #1662857: cannot go get the source code  <juju-core:Confirmed> <https://launchpad.net/bugs/1662857>
<menn0> people are most likely to have Go 1.6 installed and when they do  go get -d -v github.com/juju/juju/... it'll fail
<menn0> i'm thinking we should change the instructions to tell people how to get Go 1.8 first
<menn0> and then direct them to do  go get -d -v github.com/juju/juju/...
<menn0> I also think the "install-dependencies" make target then shouldn't install Go for people
<menn0> thoughts?
<anastasiamac> menn0: sounds great - feel free to update ;)
<menn0> anastasiamac: I will. just wanted some consensus
<anastasiamac> menn0: since we r taking 5177 as per balloons ^^ for beta3, this will end up on beta4
<anastasiamac> menn0: well, u have my +1 - anything to improve docs and user experiences
<menn0> thumper: repeating in a summarised way for thumper.
<menn0> we now require Go 1.8
<anastasiamac> thumper: can we talk now-ish?
<thumper> morning
<thumper> yeah
<menn0> people are likely to have Go 1.6 installed
<menn0> this means our instructions of go get github.com/juju/juju/... in the README and CONTRIBUTING docs will fail for them
<menn0> thumper: I'd like to change the instructions to tell people how to get Go 1.8 first
<menn0> and then get them to pull the code
<menn0> also change "install-dependencies" target not to install Go
<menn0> thumper: sound ok?
<menn0> thumper, anastasiamac : oh man... some of the instructions in the README are Juju 1.x specific
<menn0> embarrassing
<thumper> heh
<anastasiamac> menn0: yeah :( i think no one looked at the full contents of these files in a while... I've looked at one paragraph recently... but did not have a chance to do a comprehensive scan/update...
<anastasiamac> menn0: thank you for doing it ;D
<axw> menn0: (hours later) sounds fine to me
<menn0> axw: cheers
<babbageclunk> wallyworld: take a look at the tiny PR here? https://github.com/juju/juju/pull/7291
<wallyworld> sure
<babbageclunk> wallyworld: Once we're happy with that I'm tempted to try implementing the API code that will use the methods first.
<babbageclunk> axw: oh, you're up! Could you look at ^ as well to check it matches your thinking?
<axw> babbageclunk: okey dokey
<babbageclunk> ta!
<wallyworld> babbageclunk: you certainly could implement the api methods which use IsRouteable
<babbageclunk> how can daylight savings changes feel so confusing for so long?
<anastasiamac> thumper: oh, this one looks to be not-quiet-fixed :( https://bugs.launchpad.net/juju/+bug/1680392
<mup> Bug #1680392: Model migration fails on large model <juju:New for thumper> <https://launchpad.net/bugs/1680392>
#juju-dev 2017-04-28
<wpk> anastasiamac: thanks
<anastasiamac> wpk: anytime \o/
<menn0> babbageclunk: I guess the lazy lookup was in case something happened to import the package without actually running tests that use mongod?
<menn0> babbageclunk: for that reason I guess it makes sense to keep it lazy
<thumper> wallyworld: I have things to do in town this afternoon, so let's cancel our call
<thumper> wallyworld: we will be seeing each other in two days anyway
<wallyworld> thumper: sgtm
<axw> menn0: I've noticed that we're not passing "--storageEngine wiredTiger" to mongo 3.2 in the tests. have you tested with that added?
<babbageclunk> menn0: yeah, that'd be the reason not to do it at import time.
<menn0> axw: isn't the default WT?
<axw> menn0: yeah, just the help text is a bit confusing "defaults to wiredTiger if no data files present"
<axw> menn0: anyway, I tested and it made no difference
<menn0> axw: ok good to know
<axw> menn0: forcing it to mmapv1 makes it faster though, in the limited testing I did
<axw> menn0: I don't think we want to do that though, since we use WT in prod
<axw> though maybe just having WT in CI would be good enough
<menn0> axw: we think the problem is that we're still deleting DBs instead of clearing the collections in a lot of places
<menn0> WT is much slower at deleting DBs
<axw> menn0: could be, but I did notice that at least some of the test code (i.e. between SetUpTest completion and TearDownTest start) is slower with WT
<menn0> ok, so there's issues there as well
<menn0> axw, thumper, babbageclunk: there's quite a significant performance increase with wiredtiger if you disable transparent huge pages
<menn0> echo never > /sys/kernel/mm/transparent_hugepage/enabled
<menn0> without changing anything else this takes the agent/agentbootstrap tests from 2.8s to 1.8s on my machine
<menn0> consistently
<menn0> interestingly, Juju is supposed to be setting that on controller machines but I don't see that
<babbageclunk> menn0: right, but do you know what other impact that would have on our systems?
<menn0> babbageclunk: not sure yet
<axw> menn0: hmm interesting. doesn't improve allWatcherStateSuite.TestChangeApplications in state for me
<menn0> axw: what's the setting set to on your machine?
<axw> menn0: it was set to always
<menn0> ok
<babbageclunk> menn0: It makes sense to do it on dedicated machines (although I guess it wouldn't help in lxd?)
<menn0> duh
<axw> menn0: with mongo 2.4 (juju-mongodb), that test takes just under 0.4s. with mongo 3.2 (juju-mongodb3.2) it takes 1.3s
<menn0> that sucks
<menn0> maybe we shouldn't be using wiredtiger at all
<menn0> there's a lot of horror stories online about WT
<menn0> lots of people seem to have switched back
<menn0> babbageclunk: https://github.com/juju/testing/pull/123/files
<thumper> hmm...
<thumper> wow
<thumper> menn0: initial timing tests... 2.6 mongod 23m17s, api 32s, apiserver 138s, state 671s
<menn0> babbageclunk, axw, thumper: seems like we get most of the performance back by passing --storageEngine mmapv1
<menn0> but then it's not like production
<babbageclunk> menn0: I think I floated that at the time.
<thumper> menn0: initial timing tests... 3.2 mongod 55m36s, api 366s, apiserver 1515s, state 1455s
<thumper> api and apiserver packages are 10x slower
<menn0> thumper: can you try with transparent huge pages turned off?
<babbageclunk> menn0: I guess the main risk/annoyance would be if there's some behaviour difference and we only find out when the CI tests fail.
<thumper> let me address the clear databases
<thumper> then I'll try with the transparent pages off
<menn0> thumper: yep ok.
<axw> menn0: yeah that's what I found too. I think I'm OK with that as long as our CI still uses 3.2 (which it would, since there's no way to override that?)
<babbageclunk> menn0: less of a problem now that we have check builds though
<axw> babbageclunk: do you have time for a small review? https://github.com/juju/testing/pull/124
 * thumper has kicked off the new test run
<menn0> babbageclunk, thumper: I'm going to return to other work for now but will be happy to discuss ideas or test stuff
<thumper> menn0: ack
<thumper> babbageclunk: using clear databases rather than reset actually looks like it is taking longer...
<thumper> hmm
<thumper> api package was 378s
<thumper> menn0, babbageclunk: I'll wait for the apiserver package timings, but the clear databases call makes it even slower
<thumper> I'll try the transparent pages turned off
<babbageclunk> axw: sure, looking at that now (and menn0's too).
<babbageclunk> axw: LGTM'd
<axw> babbageclunk: thanks
<axw> babbageclunk: did you forget about this one? https://github.com/juju/description/pull/8
<axw> there's no bot on that repo, I can merge if you like
<babbageclunk> axw: oh, yes please - I think I decided to merge it when thumper was away and then forgot to follow it up.
<babbageclunk> ta
<axw> babbageclunk: done
<menn0> babbageclunk: thanks for the review
<babbageclunk> menn0: <thumbsup emoji>
<thumper> menn0, babbageclunk: changing the transparent huge pages setting is making very little difference
<thumper> I don't know what to do next
<menn0> thumper: weird. it seemed to make a big difference for me.
<babbageclunk> thumper: :( you said clearDatabases is slower?
<thumper> yeah
<babbageclunk> thumper: well bums
<thumper> yeah
<thumper> even with the huge pages set to never, the api package is 10x slower
<thumper> so 300s instead of 30s
<thumper> NFI what to do next
<babbageclunk> convenient timing too
<menn0> thumper: --storageEngine mmapv1 :)
<thumper> menn0: where?
<menn0> in juju/testing/mgo.go, in the args we pass to mongod
<menn0> have to be careful that mongod is 3.x though as --storageEngine doesn't exist in 2.x
<menn0> thumper: also, what about dropping DBs in MgoSuite.TearDownTest instead of clearing?
<thumper> menn0: did you do something about caching the mgo version?
<menn0> thumper: yes
<thumper> menn0: where?
<menn0> https://github.com/juju/testing/commit/3ccb7d0a3f3412b41f8c1bf000ff7078d28c0af9
<thumper> k
<menn0> thumper: that doesn't deal with the slowness. it just closes a potential problem.
<thumper> hmm with mmapv1 storage engine it is only 30% slower
<thumper> are we prepared to accept a 30% decrease in test speed?
<thumper> and what other choice do we have?
<thumper> and should this info drive production choices?
<babbageclunk> thumper: production usage patterns are very different from the tests though - we very rarely drop databases. :)
<thumper> yeah...
 * thumper needs to organise some trip stuff
<babbageclunk> jam: ping?
<axw> wallyworld: oracle bootstraps for me on my trial account FWIW
<wallyworld> axw: it is failing because it is starting a yakkety instance (and looking for yakkety tools in /var/lib/juju) even though juju thinks it has asked for a xenial instance. so there's an issue with image selection :-(
<wallyworld> i will try and remove the yakkety image from my account, that should fix it
<axw> okey dokey. I think I only have xenial in mine
<wallyworld> yeah, i'll let gabriel know, just testing to make sure
<wpk> axw: re: proxy - this can break things, as it really sets the proxy globally
<wpk> axw: I forwarded you an e-mail from jam describing changes, I don't want to land it without a proper test run, preferably by someone who uses proxying
<wpk> axw: there is a 'workaround' (remove systemd code), but that has yet to be decided
<jam> wpk: how about if we leave the ability to do systemd, but have it disabled, and land the rest ?
<jam> I think the rest is a clear improvement
<jam> and we can discuss whether systemd will actually break things
<mup> Bug #1686938 opened: During a destroy-model the units first update/upgrade which delays the destroy process <apt> <delay> <destroy-model> <upgrade> <juju-core:New> <https://launchpad.net/bugs/1686938>
<wpk> Is anyone willing to review #7204? I know it's big but it's been waiting for almost a month now..
<jam> wpk: did you go through it as though you were doing a review as well?
<wpk> jam: yes, although it is a huge one so I might have missed some things...
<jam> wpk: so it seems there is one piece that needs fixing wrt bridge_ports and inet and inet6 stanzas
<wpk> jam: we should only put one?
<jam> bug #1650304
<mup> Bug #1650304: Juju2: 'Creating container: failed to ensure LXD image: image not imported!' <oil> <oil-2.0> <regression> <juju:Incomplete> <juju 2.1:Incomplete> <https://launchpad.net/bugs/1650304>
<jam> from axw: https://bugs.launchpad.net/juju/+bug/1650304/comments/7
<wpk> jam: Reading through bridge-utils, IMHO it's OK to have it but I'll check
<jam> wpk: it wasn't ok from experience, given the original bug
<wpk> jam: as I understand that wasn't the problem, and looking at the code it's simply doing for (bridge in bridge_ports) if (bridge not already in the bridge) brctl addif....
<jam> hopefully thats if "port not already in the bridge" :)
<jam> but I get your point
<jam> wpk: are we sure that is true across trusty+xenial+yakkety, etc?
<wpk> trusty, xenial, zesty
<wpk>     if [ "$MODE" = "start" ] && [ ! -d /sys/class/net/$IFACE/brif/$port ]; then
<wpk> I don't have any Yakkety but I don't think they'd change that line just for this one :)
<axw> wpk: if you're changing it back (I'm happy to be told I was wrong - again, I'm no expert), can you please run the QA steps from that PR that jam linked to?
<axw> I mean, the PR linked from the bug that jam linked to...
<wpk> axw: will do, I'm checking how the 'old version' worked and bridge_ports wasn't the only problem (repeated auto, etc.)
<wpk> axw: was the problem only with rackspace or with other providers too?
<axw> wpk: yep, I may well have conflated the other issues. that's the only place I observed and reproduced the issue. I think it was also seen on MAAS by others, but could never confirm myself
<rogpeppe> here's a tiny little PR for review, controller-specific cookie jars: https://github.com/juju/juju/pull/7294
<rogpeppe> axw, wpk, jam, wallyworld: if someone manages to review it, i'll owe them a beer or two :)
<axw> rogpeppe: heh. I've already got a beer and it's 9pm on a Friday night ;)  I'll take a look on monday if nobody does it sooner
<rogpeppe> axw: thanks
<rogpeppe> axw: this work was the subject of this tweet: https://twitter.com/rogpeppe/status/849963422032723968 :)
<axw> rogpeppe: I figured :)
#juju-dev 2018-04-23
<veebers> wallyworld: migrating a model to a controller of a newer version is supported right?
<veebers> thumper: can you share the 2.3.6 upgrade bug link please?
<wallyworld> veebers: yep
<anastasiamac> veebers: 2.3.6 upgrade bug - https://bugs.launchpad.net/bugs/1765722
<mup> Bug #1765722: upgrade to 2.3.6 failed: the dotted field is not valid for storage <docteam> <openstack> <sts> <juju:In Progress by jameinel> <juju 2.3:In Progress by jameinel> <https://launchpad.net/bugs/1765722>
<veebers> wallyworld: ok, we might have a bug then :-\ https://paste.ubuntu.com/p/pmt4cfYd64/
<veebers> awesome, cheers an
<veebers> anastasiamac ^ :-)
<anastasiamac> haha, yes let's NOT abbreviate!! :D
<wallyworld> veebers: oh, i forgot, we don't yet support migrating models with offers
<veebers> wallyworld: ah, hah ok. No bug then, just unsupported :-|
<wallyworld> veebers: yeah, cause if you migrate a model with offers, things connected to them will break
<veebers> wallyworld: ack, makes sense. Have I misunderstood the macaroon testing plan from Friday, or does this throw a spanner in the works?
<wallyworld> veebers: no, i just forgot about the migration limitation
<veebers> wallyworld: ack, I'll shelve for now and get this initial tomb PR up.
<wallyworld> yup
<veebers> huh, this is concerning: juju destroy-controller mbakev2-offer . . . Continue? (y/N):ERROR controller destruction aborted: read /dev/stdin: resource temporarily unavailable
<vino_> wallyworld: have a min
<wallyworld> vino_: yeah, just finishing testing, it fails with this error
<wallyworld> 2018-04-23 01:08:31 ERROR juju.service.systemd service.go:134 failed to read conf from systemd for application "juju-db": get conf failed (cat /lib/systemd/juju-init/juju-db/juju-db.service): error executing "/bin/systemctl": cat: /lib/systemd/juju-init/juju-db/juju-db.service: No such file or directory;
<wallyworld> 2018-04-23 01:08:31 ERROR juju.worker.dependency engine.go:551 "state" manifold worker returned unexpected error: failed to read conf from systemd for application "juju-db": get conf failed (cat /lib/systemd/juju-init/juju-db/juju-db.service): error executing "/bin/systemctl": cat: /lib/systemd/juju-init/juju-db/juju-db.service: No such file or directory;
<wallyworld> let's jump into standup ho
<vino_> yes please.
<kelvinliu> wallyworld, I think i found something different. Do u have a few minutes on hangout?
<thumper> wallyworld: can you file a bug about migrating models with cmr?
<wallyworld> kelvinliu: give me a few minutes, talking to vinu
<wallyworld> thumper: ok. it's something we need to figure out how to handle, won't be trivial
<thumper> yeah...
 * thumper needs food badly
<kelvinliu> wallyworld, all good, let me know when u r free thx
<veebers> babbageclunk: I'm seeing 'go vet' 'lock copy' errors on develop. It doesn't seem like the file has changed recently. You re-enabled that check and fixed the issues though, right?
<babbageclunk> veebers: yup yup
<babbageclunk> Hang on, looking
<veebers> babbageclunk: am I going crazy? :-P  Seeing the issue in mongo/mongometrics/mgostatsmetrics.go
<babbageclunk> hmm, I'm getting gofmt probs before the go vet.
<veebers> babbageclunk: gofmt probs are recent, they are in progress to revert :-)
<veebers> I've since fixed the check and merge jobs so they actually fail on go fmt/vet errors
<babbageclunk> veebers: running verify.bash doesn't fail once I comment the gofmt check out. It passes the go vet check. Maybe that code isn't actually being checked - is the package used?
<babbageclunk> If I run `go tool vet -all .` in mongo/mongometrics I don't get any errors.
<veebers> I wonder if I'm doing something odd
<veebers> babbageclunk: cheers I'll double check I'm not doing something screwy on my end.
<babbageclunk> paste the error you're seeing maybe?
<veebers> babbageclunk: https://paste.ubuntu.com/p/63SRG6JCH9/
<babbageclunk> veebers: have you updated mgo?
<babbageclunk> veebers: in my code, line 38 is copying a struct that only has int fields, but I can see that the package it's in has a global mutex. If the mutex has been moved into the struct that would do it.
<thumper> it's always the case where you think you have a nice simple bug, then it all goes to pot
<anastasiamac> thumper: the bug u picked did not look simple...
<thumper> i thought it might have been
<anastasiamac> anastasiamac's definition of simple bug: https://bugs.launchpad.net/juju/+bug/1765688
<mup> Bug #1765688: create-storage-pool unnecessarily requires key=value pairs <juju:New> <https://launchpad.net/bugs/1765688>
<veebers> babbageclunk: what rev of gopkg.in/mgo.v2 do you currently have?
<thumper> https://github.com/juju/juju/blob/develop/apiserver/facades/client/application/deploy.go#L171
<anastasiamac> hahaha, awesome comment :)
<babbageclunk> f2b6f6c918c452ad107eec89615f074e3bd80e33
<babbageclunk> veebers: ^
<veebers> babbageclunk: huh, yeah samsies, is yours patched via our make add-patches?
<babbageclunk> no, it won't be - I generally don't run make release
<babbageclunk> Oh, this is in the build?
<veebers> babbageclunk: I just reverted mine (as I had add-patched it), but still failure. I'm confused. I need to look harder I think
<babbageclunk> That would explain it
<babbageclunk> Yeah, it's the stats race patch
<wallyworld> kelvinliu: free now for a chat if you need
 * thumper quietly extracts himself from this bug
<babbageclunk> veebers: that means the vet error is highlighting a real problem.
<veebers> babbageclunk: sigh, yeah passes for me on a second run.
<veebers> babbageclunk: right, introduced by the patch
<babbageclunk> veebers: I don't think it's too hard to fix - the code needs to pass around *Statses rather than Statses.
<babbageclunk> veebers: (and the patch needs to make GetStats return *Stats too)
<veebers> babbageclunk: ack, I'll take a go at it shortly. Just trying to determine now why the merge/check jobs failed to pick it up
<veebers> babbageclunk: I'm a little confused now :-\ I cleared out those dirs (mgo.v2 etc.) did godeps, applied patches and verify no longer complains. The issue is still there though right? passing around mgo.Stats
<babbageclunk> veebers: yeah that's weird - if you apply that one patch, you should be able to see the lock in the Stats struct
<wallyworld> thumper: got a minute to join a team standup hangout?
<thumper> wallyworld: ack
<thumper> veebers: can you pull 2.3.6 out of streams?
<veebers> thumper: technically yes? I'm sure it's possible I'm not sure how to do it off the top of my head, would have to have a look
<thumper> veebers: please look
<veebers> thumper: ack, will do
<babbageclunk> thumper: have a moment?
<thumper> babbageclunk: kinda
<thumper> babbageclunk: 1:1?
<babbageclunk> thumper: sure thanks
<thumper> anyone? https://github.com/juju/bundlechanges/pull/35
<babbageclunk> thumper: oh go on then
<thumper> babbageclunk: are you still looking at it?
<babbageclunk> thumper: yeah, sorry - also having a discussion on irc.
<thumper> :)
<babbageclunk> Sorry done now
<thumper> ta
<veebers> babbageclunk: (having circled back) so odd, *now* I see the govet issues when I run the verify script. Anyway, would one edit the current patch? maybe spin a new one using the existing as a basis?
<babbageclunk> veebers: Not sure - I think editing the existing one is better (less confusing to debug later).
<babbageclunk> veebers: it's a bit iffy though - this changes the public interface of the package. But the previous change *kind of* already did that by adding a mutex to something that's returned by value.
<veebers> babbageclunk: ack, I'll attack that tomorrow. For now I'll try run the tests I was hoping to run this morning so I can propose a PR ^_^
<veebers> thanks for your help earlier, was scratching my head a bit
<babbageclunk> no worries - that's pretty mysterious.
<babbageclunk> Hey, we should probably migrate to the fork of mgo! ;)
<veebers> ^_^
<vino_> hey WallyWorld : i have few questions
<vino_> wallyworld : for the first PR i can verify by myself juju-updateseries.
<vino_> Just keeping u posted.
<vino_> i need help only for the second PR.
 * vino_ away for tea
 * vino_ back
<wallyworld> vino_: hey
<vino_> hey
<vino_> i tried reaching u
<wallyworld> did you have a PR you wanted me to look at?
<vino_> now i am in middle of something...
<vino_> i am abt to.
<wallyworld> sure, np
<vino_> but just one question.
<vino_> i couldn't move the agentinfo.c from service to agent folder due to a circular package issue..
<vino_> more code added and service needs that file and agent needs service.
<vino_> so i am moving it to core/agents
<vino_> is that fine for u
<vino_> ?
<wallyworld> ah right, sure, that happens sometimes. i think it may be ok, will be able to take a closer look when i see the PR
<vino_> yes.
<vino_> and the question i have is.
<vino_> inside agents folder..
<vino_> i do notice tools dir..
<vino_> which i feel can be moved to core/agents/tools
<vino_> which is more generic.
<vino_> i want to do it and send it the PR.
<wallyworld> let me take a quick look
<vino_> can u take a look at that folder and let me know ur opinion.
<vino_> agents/tools/*
<wallyworld> vino_: the packages have slightly different focus. let's leave for now. we can discuss more tomorrow
<vino_> oh ok.
<vino_> i was doing a few changes
<vino_> i will revert.
<vino_> so yes.
<vino_> we can look at it tomorrow.
<vino_> i will send it for review.
<wallyworld> the PR will be simpler with just the upgrade bits
<vino_> yup.
<veebers> wallyworld: re: cmr and migrating, I cannot destroy my controller now (after the failed attempt to migrated a shared model), error is: https://paste.ubuntu.com/p/4PjdrN4BDJ/
#juju-dev 2018-04-24
<vino_> wallyworld : Build is over for #8644 and it is still the go fmt issue..
<wallyworld> vino_: weird, ok. the PR doesn't touch that file right?
<vino_> it does. thats why i am worried
<wallyworld> vino_: did you run go fmt before committing?
<wallyworld> or use the pre-commit check?
<veebers> wallyworld, vino_ just pulled the branch locally gofmt complains on it. I would suggest enabling the pre-commit hook :-)
<wallyworld> veebers: thank you. i didn't realise till just now that that particular file was part of the pull request
<wallyworld> was conversing in another channel
<veebers> wallyworld: no worries. I had a vested interest to make sure my job changes hadn't borked things ;-)
<vino_> i still have issues with the pre-hook check. so i manually run go fmt.
<vino_> i missed these folders i guess.
<vino_> but its going on now.
<vino_> the other PR inherits the same issue..
<vino_> so fixed both and the CI is running now.
<wallyworld> vino_: go fmt ./...   from the top level juju dir is safest
<wallyworld> but we need to sort out the pre-commit hook also
<vino_> agree..
<vino_> wallyworld : PR#8644 is happy
<wallyworld> vino_: just finishing review, been multitasking
<vino_> no worries.
<wallyworld> vino_: first one reviewed, take a look, happy to chat, but need to grab coffee first (still haven't been able to get away from keyboard to grab one and i'm getting desperate)
<vino_> sure wallyworld. ping me once u r back
<wallyworld> vino_: did you want to talk about first PR?
<wallyworld> or you happy to make suggested changes?
<vino_> hm.. i want to talk abt moving the agentinfo to agent.
<vino_> can we have a quick talk.
<vino_> other changes are fine for me.
<vino_> they are very minor ones(comments)
<vino_> wallyworld : i am in hangout..
<anastasiamac> anyone keen to review 2 tiny PRs? 8647 and 8648...
<anastasiamac> babbageclunk: veebers: ^^
<veebers> I'll take 8648
<wallyworld> vino_: coming
<anastasiamac> veebers: haha :D since u did, it'd b awesome to have a ci test around storage pools... :D
<anastasiamac> veebers: but I'll take ur eyes on PR for now ;D
<veebers> anastasiamac: I'm always happy to help people add new and extend our CI tests ;-)
<anastasiamac> veebers: haha
<veebers> anastasiamac: reviewed, have asked a question but otherwise lgtm
<anastasiamac> veebers: it is tested for this layer :) so if u happy could u tick the tick so that i can land it?
<veebers> anastasiamac: I can , for money
<veebers> anastasiamac: hah :-) yep all good have approved now
<anastasiamac> veebers: \o/ money? i cannot money... ;( but i can drinks... how about a drink at next sprint?
<veebers> anastasiamac: deal
<anastasiamac> veebers: awesome. now it's set in stone (too many witnesses) ;D
<veebers> ^_^
<wallyworld> kelvinliu: i left a comment or two on your PR
<kelvinliu> wallyworld, looking now, thx
<babbageclunk> anastasiamac: sorry to blank you! I didn't get a pop up notification for some reason! Any still needing review?
<babbageclunk> anastasiamac: looking at 8647
<babbageclunk> anastasiamac: approved with v. minor comment
<babbageclunk> names
<babbageclunk> doh
<anastasiamac> babbageclunk: :-P
<balloons> Good morning all
<frankban> stickupkid: hey
<stickupkid> hey: frankban :D
<frankban> stickupkid: so I've seen that you tried to use jujushell in production (jujucharms.com) and you got an error. I updated the service, could you please try again?
<rogpeppe> anyone from juju side around and able to give this small juju/cmd PR a lookover? https://github.com/juju/cmd/pull/54
<balloons> rogpeppe, what's the drive for this?
<rogpeppe> balloons: i want to be able to have flags that are common to all subcommands (like the logging flags are)
<rogpeppe> balloons: so when i type "mycommand help" i'll get to see those flags
<balloons> rogpeppe, but you don't have a specific global flag you have in mind?
<rogpeppe> balloons: yeah, i do: --candid-url and --agent
<rogpeppe> balloons: (i'm going to use this in the candid command initially
<rogpeppe> )
<balloons> rogpeppe, ty. ohh right, candid. I know that name!
<rogpeppe> balloons: wanna give me a review then? :)
<balloons> rogpeppe, on it
<rogpeppe> balloons: ta!
<rogpeppe> balloons: i responded to your questions on the PR. they were both drive-by fixes tbh. I could do them as part of a separate PR if you'd prefer.
<balloons> rogpeppe, no, that's fine. approval is the same
<rogpeppe> balloons: ta
<balloons> rogpeppe, looks like we need to add a bot for juju/cmd though to merge
<rogpeppe> balloons: there definitely *was* a bot at some point...
<rogpeppe> balloons: i've tried a $$merge$$ and we'll see
<rogpeppe> balloons: i can always push the green button...
<balloons> rogpeppe, yea, we have new infrastructure now. It's trivial to add a bot, just need to get off the phone
<rogpeppe> balloons: ok, cool, thanks
<rogpeppe> balloons: did you manage to enable a bot on juju/cmd, by any chance?
<balloons> rogpeppe, haven't gotten a moment yet :-)
<balloons> I saw you landed it, that's fine
<rogpeppe> balloons: np, thought that might take the pressure off :)
<admcleod_> anyone around who can help me with 'juju metadata generate-image' ?
<admcleod_> not sure if im doing it wrong or have a bug
<balloons> admcleod_, that's kind of tricky to get right
<admcleod_> balloons: yeah - actually i think i have it
<admcleod_> balloons: the error i was getting was: well
<admcleod_> https://paste.ubuntu.com/p/6tgrNGqST5/
<admcleod_> balloons: but it looks like specifying the arch as the bootstrap constraint sorts it out - just that error is a bit confusing
<admcleod_> balloons: line 72
<admcleod_> balloons: cos image type and flavour are not really related other than the size of the image
<balloons> admcleod_, ahh.. yea, cross arch can be confusing. Best to specify arch in those cases
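For reference, a typical cross-arch workflow looks roughly like this (flag spellings are from juju 2.x and may differ per version; check `juju metadata generate-image --help` — the paths and placeholders are illustrative):

```shell
# Generate simplestreams image metadata for a non-default architecture.
juju metadata generate-image -i <image-id> -s bionic -r <region> \
    -u <endpoint-url> -a arm64 -d ~/simplestreams

# Point bootstrap at the metadata and pin the architecture explicitly,
# which avoids the confusing flavour/arch mismatch error above.
juju bootstrap <cloud> --metadata-source ~/simplestreams \
    --bootstrap-constraints arch=arm64
```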
#juju-dev 2018-04-25
<manadart> Shamown. https://github.com/juju/juju/pull/8656
<balloons> stickupkid, so you wanted to chat about status?
<stickupkid> yeah, that would be great
<balloons> stickupkid, in 30 mins work?
<stickupkid> balloons: yeah, that's perfect
<rogpeppe> if anyone's around that might be able to review this, that would be marvellous, thanks :) https://github.com/juju/juju/pull/8657
<rogpeppe> balloons: ^
<balloons> rogpeppe, do you know anything about the verbiage choices for SLA 'unsupported'?
<cmars> balloons: i do
<cmars> balloons: pm for any questions regarding
<cmars> balloons: see also, state/model.go, specifically the `slaLevel` type
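The `slaLevel` type cmars points at is a string-backed enum in the state package; a minimal sketch of that pattern looks like this. The level names follow the discussion ("unsupported" is the default), but treat the exact values and methods here as illustration, not juju's code.

```go
package main

import "fmt"

// slaLevel is a string-backed type for SLA levels, in the style of the
// one in state/model.go (sketch only; values assumed from the chat).
type slaLevel string

const (
	slaUnsupported slaLevel = "unsupported" // default: no SLA attached
	slaEssential   slaLevel = "essential"
	slaStandard    slaLevel = "standard"
	slaAdvanced    slaLevel = "advanced"
)

// valid reports whether s is one of the known SLA levels.
func (s slaLevel) valid() bool {
	switch s {
	case slaUnsupported, slaEssential, slaStandard, slaAdvanced:
		return true
	}
	return false
}

func main() {
	fmt.Println(slaLevel("unsupported").valid())
	fmt.Println(slaLevel("platinum").valid())
}
```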
<balloons> cmars, ty
<hml> balloons: found the 0 problem… a different method copied the file too before sending over the wire - so the file handle was at EOF instead of start.  :-)
<hml> doh
<balloons> hml, ah-ha! Makes sense
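The bug hml hit is easy to reproduce: after copying from (or into) a file handle, its offset sits at EOF, so a later read through the same handle returns 0 bytes unless you seek back to the start. A minimal Go demonstration:

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// copyThenRead writes payload into a temp file via io.Copy, then reads
// it back twice through the same handle: once without seeking (which
// reads nothing, because the offset is at EOF) and once after seeking
// back to the start.
func copyThenRead(payload string) (withoutSeek, withSeek int64) {
	f, err := os.CreateTemp("", "demo")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	io.Copy(f, strings.NewReader(payload)) // handle is now positioned at EOF

	var buf strings.Builder
	withoutSeek, _ = io.Copy(&buf, f) // reads 0 bytes: nothing after EOF

	f.Seek(0, io.SeekStart) // rewind before re-reading / sending over the wire
	withSeek, _ = io.Copy(&buf, f)
	return withoutSeek, withSeek
}

func main() {
	a, b := copyThenRead("payload")
	fmt.Printf("bytes read without seek: %d, after seek: %d\n", a, b)
}
```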
<wallyworld> vino: free now if you need to talk
<vino> hey..
<vino> iam in hangout..
#juju-dev 2018-04-26
<wallyworld> vino: omw
<vino> hi wallyworld
<balloons> Good morning all
<thumper> morning team
<vino> wallyworld: hi
<wallyworld> vino: hey
<wallyworld> did you want to talk about your PR?
<vino> yes.
<vino> quick hangout..
<wallyworld> sure, there now
<vino> ok.
<hml> gânight all o/
<veebers> Have a good evening hml o/
<admcleod_> ill just leave this here: bug 1765571
<mup> Bug #1765571: lxd container fails to launch on bionic host: No associated target operation <juju:New> <https://launchpad.net/bugs/1765571>
<admcleod_> so i can launch lxd containers on bionic with either juju 2.3.7 or 2.4-beta1, still
#juju-dev 2018-04-27
<wallyworld> babbageclunk: i'd love to get to land this today if you have time to look. the code is small, the test fallout is larger (lots of little changes). it's in 2 commits https://github.com/juju/juju/pull/8665
<babbageclunk> wallyworld: sorry, was grabbing lunch - looking now
<wallyworld> ty
<wallyworld> babbageclunk: also, join #juju as we should talk in there
<babbageclunk> ok
<veebers> kelvinliu: commented on your PR, sorry it took me a while to get to
<kelvinliu> veebers, thx, I made some changes like improving the docstrings etc and also left some comments there, let's discuss from there
<veebers> kelvinliu: sounds good. I'm just about to pop out for a family dinner so lets hit it on Monday :-)
<kelvinliu> veebers, sure, have a good weekend
<veebers> kelvinliu: thanks, you too :-)
<vino> veebers: have a good weekend!
<veebers> cheers vino, you too
<veebers> have a great weekend all o/
<rogpeppe> anyone know if there's a simple way to use the juju API to find out what units are part of an application? currently the only way i can see is the FullStatus call which returns a vast amount of extraneous information.
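The filtering rogpeppe wants from FullStatus amounts to indexing the Applications map in the result and collecting unit names. The struct shapes below are simplified stand-ins for the juju API params types (hypothetical, for illustration only), but the extraction logic is the same:

```go
package main

import "fmt"

// Simplified stand-ins for the status structs the juju API client
// returns from FullStatus (assumed shapes, not the real params types).
type unitStatus struct{ Machine string }
type applicationStatus struct{ Units map[string]unitStatus }
type fullStatus struct{ Applications map[string]applicationStatus }

// unitsOf extracts the names of the units belonging to one application
// from a full-status result, discarding everything else.
func unitsOf(st fullStatus, app string) []string {
	var names []string
	for name := range st.Applications[app].Units {
		names = append(names, name)
	}
	return names
}

func main() {
	st := fullStatus{Applications: map[string]applicationStatus{
		"mysql": {Units: map[string]unitStatus{"mysql/0": {}, "mysql/1": {}}},
	}}
	fmt.Println(len(unitsOf(st, "mysql")))
}
```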
<stickupkid> if i make changes to the controller, how do I test the changes locally on lxd?
<rick_h_> stickupkid: what do you mean changes to the controller?
#juju-dev 2018-04-29
<schkovich> hey guys, cannot bootstrap to private openstack queens cloud. WARNING Authorization failed. im getting correct response from api using the same user, pass, domain, project and url
<schkovich> WARNING Could not find project: tom, although project is listed in the openstack dashboard
<schkovich> to make it more odd there are two admin and service domains, juju is trying to login to phantom admin domain that is not visible from os dashboard
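When keystone has multiple domains, juju generally needs the domain names spelled out in the credential rather than inferring them. A credentials.yaml entry for an OpenStack userpass credential looks roughly like this (field names from juju 2.x; cloud name, domains, and values are placeholders):

```yaml
credentials:
  my-openstack:
    admin:
      auth-type: userpass
      username: <user>
      password: <pass>
      tenant-name: tom               # the project to use
      user-domain-name: <domain>     # domain the user belongs to
      project-domain-name: <domain>  # domain the project belongs to
```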
