/srv/irclogs.ubuntu.com/2013/09/30/#juju-dev.txt

thumperhi davecheney01:50
thumperdavecheney: are you working today?01:50
thumperdavecheney: wallyworld is on holiday, and axw has a public holiday01:51
davecheneythumper: is it a public holiday today ?01:51
davecheneyi'm terrible at these things01:51
thumperdavecheney: for WA I think01:51
davecheneynup, i'm in NSW01:51
davecheneyi'll be here all week, try the fish01:52
thumperdavecheney: I have a number of small branches that fix saucy issues01:52
thumperdavecheney: https://codereview.appspot.com/14114043/01:52
thumperdavecheney: oh, just saw your review of it01:53
thumperdavecheney: and it does work on precise01:53
thumperthe other option is --session (for testing only)01:53
thumperI checked this01:53
thumperand tested on ec201:53
thumperdavecheney: I have a golxc one, and another juju one coming01:55
davecheneythumper: keep 'em coming02:02
thumperdavecheney: ack02:03
thumperdavecheney:  https://codereview.appspot.com/14114044/02:03
davecheneythumper: LGTM02:04
davecheneyreviewed by email02:04
thumperdavecheney: I agree on the name, but that would be a bigger change just now02:05
thumperI'm trying to keep 'em smallish02:05
thumperdavecheney: it seems it isn't just me failing with this error02:19
thumperdavecheney: the gobot is also failing02:19
thumperdavecheney: could I get you to run the tests on trunk to see if you get it?02:19
davecheneysure, running trunk now02:20
thumperta02:21
davecheneysame,02:23
davecheney[LOG] 36.85478 DEBUG juju.environs.simplestreams cannot load index "http://127.0.0.1:42617/peckham/private/tools/streams/v1/index.sjson": invalid URL "http://127.0.0.1:42617/peckham/private/tools/streams/v1/index.sjson" not found02:23
thumperhmm...02:23
davecheneyhttp://paste.ubuntu.com/6173877/02:24
davecheneybroke02:24
* thumper wonders how it landed02:24
thumperdavecheney: https://code.launchpad.net/~thumper/golxc/nicer-destroy/+merge/18825402:41
* thumper now looks at the failing test02:42
thumperdavecheney: you are running raring?03:02
* thumper afk for a bit03:11
davecheneythumper: yes sir03:25
_thumper_jam: ping for when you start04:40
=== _thumper_ is now known as thumper
jamthumper: pong04:54
thumperjam: hangout? fire-fighting04:54
jamsure04:55
thumperjam: https://plus.google.com/hangouts/_/7e75017df572083de566b5fc04dab18866050eb4?hl=en04:56
thumperjam: https://code.launchpad.net/~thumper/juju-core/revert-1901/+merge/18826105:36
thumperjam: https://code.launchpad.net/~thumper/golxc/nicer-destroy/+merge/18825405:38
rogpeppemornin' all06:56
rogpeppefwereade: hiya07:05
fwereaderogpeppe, heyhey07:05
rogpeppefwereade: looking for a review of https://codereview.appspot.com/14038045/ if you have a mo at some point. (joint work of mgz & i)07:06
TheMuemorning07:52
rogpeppeTheMue: mornin'08:02
TheMuerogpeppe: heya, need a short restart after update08:05
TheMueso, back again08:08
fwereaderogpeppe, reviewed, not sure if there's some reason to mix concerns that I'm not quite getting08:16
fwereadejam, thank you for spotting the AccessDenied08:17
jamfwereade: np08:17
fwereadejam, am I right in thinking that mgz has keys to fix that?08:17
jamfwereade: It is the ec2 bucket, I don't know who has keys. Dave does, probably curtis does08:17
jamI don't08:17
dimiternfwereade, hey08:18
fwereadedimitern, heyhey08:19
rogpeppefwereade: the reason i thought it was good to put both the address updater and publisher in the same place is that they both need to respond to almost exactly the same information - it's trivial to do them both together08:19
fwereaderogpeppe, no it's not08:19
dimiternfwereade, https://codereview.appspot.com/14036045/ would you take a look please?08:19
rogpeppefwereade: and the publisher is the thing i actually need out of this work08:19
fwereaderogpeppe, why so? just having the info in state is good enough, surely?08:20
rogpeppefwereade: i wanted to avoid doing a scan through all machines every time someone logs in08:20
fwereaderogpeppe, can't we just index by jobs if that turns out to be a cost worth worrying about?08:22
rogpeppefwereade: can you index by a set?08:22
fwereaderogpeppe, unless you know for sure that you *can't*, mixing concerns like this is seriously premature optimization08:23
fwereaderogpeppe, last resort, not first08:23
fwereaderogpeppe, even if you do know08:23
fwereaderogpeppe, there's nothing stopping a separate publisher task from working with the data collected here08:23
fwereaderogpeppe, and pretending the two tasks are the same is just not helpful08:24
rogpeppefwereade: they seem to go together quite nicely to me08:24
fwereaderogpeppe, if your type description says "X does Y. Also, it does Z" you really should be writing either two types, or a long comment detailing the justifications for doing so08:25
rogpeppefwereade: we'll also want this logic for publishing the provider stateinfo08:25
fwereaderogpeppe, I'm not saying the logic is *bad* even08:25
fwereaderogpeppe, just that the package is doing way too much08:25
fwereaderogpeppe, (and when you do the provider state info, I worry you'd say it's "trivial" to add it to this type too, because the tasks go together "nicely"...)08:27
rogpeppefwereade: ok, i guess. i thought the publishing bit is a relatively small addition to the rest of the logic, which is concerned with knowing when addresses change.08:27
rogpeppefwereade: yes, i'd thought that this package could be concerned with all addressing stuff.08:28
rogpeppefwereade: in particular, i'd thought we'd have two places where we'd publish the current set of addresses08:28
rogpeppefwereade: in state and in the provider08:28
fwereaderogpeppe, me too08:28
rogpeppefwereade: and that the same code can be responsible for both08:28
fwereaderogpeppe, ISTM that that is more than enough reason to put that clever code in its own package08:29
rogpeppefwereade: it's not very clever code08:29
fwereaderogpeppe, because then whoever needs to add functionality to it will *only* have that to deal with08:29
fwereaderogpeppe, rather than having to understand all the address-updating stuff as well08:29
rogpeppefwereade: i think this is making life harder for ourselves again08:29
rogpeppefwereade: but if you insist08:30
fwereaderogpeppe, I am not open to argument here08:30
fwereaderogpeppe, the concerns are separate08:30
fwereaderogpeppe, you've got the first one practically done, it seems08:30
fwereaderogpeppe, it can go in as a worker and start making life easier immediately08:30
rogpeppefwereade: that was the plan08:30
jamfwereade: I wanted to chat a bit about the ssl stuff, but after you're done with rog08:31
fwereaderogpeppe, and then we can write another worker that might even be properly trivial08:31
fwereaderogpeppe, and really easy to understand and change in isolation for the publish-to-environ case08:31
rogpeppefwereade: honestly, the publisher goroutine is really simple, and it won't be that simple when factored out as its own goroutine08:32
rogpeppes/goroutine/worker/08:32
rogpeppes/goroutine$/worker/ :-)08:32
rogpeppefwereade: because it'll have to duplicate a lot of stuff that this one is doing08:33
rogpeppe but we like duplication08:33
fwereaderogpeppe, AFAICT the only actual point of overlap is watching all machines08:33
rogpeppefwereade: and the environ08:33
fwereaderogpeppe, and that's not really appropriate to a publisher anyway, but it'll do in a pinch08:33
rogpeppefwereade: the publisher needs to watch all machines, no?08:34
fwereaderogpeppe, aw man, I guess we're doing the mix-environ-watching-into-everything stuff again?:(08:34
fwereaderogpeppe, depends08:34
fwereaderogpeppe, would be nicer if we could just watch all the state servers08:34
rogpeppefwereade: in this case, we can have a separate environ watcher that sets a shared Environ, guarded by a mutex.08:34
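The shared-Environ idea rogpeppe describes could look roughly like the sketch below. It is only an illustration: `Environ`, `fakeEnviron` and `SharedEnviron` are hypothetical stand-ins, not juju-core types, and the watcher that would call `Set` is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Environ is a hypothetical stand-in for juju's environs.Environ;
// only the one method this sketch needs is declared.
type Environ interface {
	Name() string
}

// fakeEnviron is a trivial Environ used for the demonstration.
type fakeEnviron struct{ name string }

func (e fakeEnviron) Name() string { return e.name }

// SharedEnviron holds the most recent Environ. A single environ watcher
// calls Set whenever the configuration changes; any number of workers
// call Get when they need to talk to the provider.
type SharedEnviron struct {
	mu  sync.Mutex
	env Environ
}

// Set replaces the current Environ.
func (s *SharedEnviron) Set(env Environ) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.env = env
}

// Get returns the current Environ, or nil if none has been set yet.
func (s *SharedEnviron) Get() Environ {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.env
}

func main() {
	shared := &SharedEnviron{}
	shared.Set(fakeEnviron{name: "peckham"})
	fmt.Println(shared.Get().Name())
}
```

Workers built this way never watch the environ themselves; they only see whichever value was current when they called `Get`, which is the trade-off rogpeppe raises in his next line.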
fwereaderogpeppe, is there any case we *couldn't* do that in?08:35
fwereaderogpeppe, Environ is meant to be goroutine-safe, right?08:35
rogpeppefwereade: yeah08:35
rogpeppefwereade: i'm not sure - there might be some cases where we actually want to know when an environ has changed.08:35
fwereaderogpeppe, I would hope not, surely?08:36
fwereaderogpeppe, and if we do, that would seem to be the place for custom environ-watching code08:36
fwereaderogpeppe, anyway axw has some investigation into that in his queue, I think08:38
fwereadejam, ssl?08:43
jamfwereade: so. I like smoser's idea to add the cert, mostly because it means I don't have to track down edge cases. It means I still need the code I've landed, because the initial *client* needs to have a way to connect.08:44
jamHowever08:44
jamfwereade: It is completely non-obvious how we get the Cert out of the connection.08:44
fwereadejam, ha08:44
jamI think we have to create a custom http.Transport object, that overrides Dial08:44
jamso that when it connects08:44
jamwe can peek at the tls.Conn object08:44
jamwhich has a ConnectionState call08:45
jamthat can have the certs in it08:45
jamBut the layer at which cloud-init sits08:45
jamis about 5 abstractions away from the actual Conn08:45
jamfwereade: and I'm wondering how terrible that is08:45
jamThe best I can think of is to have a global registry of hostname => Certs08:45
jamand then create a custom Transport08:45
jamwell, custom Dial that adds those certs to the registry08:46
jamand then if you have "ssl-hostname-verification: false" set08:46
jamit still does what I've done today08:46
jambut then at cloud-init time08:46
jamit looks in the global registry08:46
jamif there is a cert for auth-url08:46
jamand if so, it puts it into cloud-init08:46
jamThe vagaries of "hostname => certificate " concern me08:47
fwereadejam, I'm shuddering a little there08:47
jambut it might be feasible08:47
jamfwereade: the net/http stuff doesn't expose any way to get access to the cached Conn objects08:47
jamso I can't do it without overriding Dial and peeking at connection time08:47
fwereadejam, that bit seems fine to me08:47
fwereadejam, it's the global registry that freaks me out08:48
fwereadejam, altogether too much action at a distance08:48
jamfwereade: so we already have a custom Transport object08:49
jambecause we have to set InsecureSkipVerify = true on its tls.Config08:49
jamit isn't hard to inject a Dial there08:49
jamthough I'm not sure how to make that dial08:49
jamhave enough context08:49
jamto be able to cache the connection information on the Goose object ?08:49
fwereadejam, weeeell there are always ways... eg can we make the goose object itself supply the custom dial function?08:50
fwereadejam, (I feel like the situation is symptomatic of too many globals, and that adding more is unlikely to bring us to a happy conclusion)08:51
jamfwereade: so *today* we are using a shared HTTP Client08:52
jambecause that seems to be the recommended way08:52
rogpeppejam: could you just add the cert to /etc/ssl/certs/ca-certificates.crt ?08:52
jamso that you get global connection pooling08:52
jamrogpeppe: that is what cloud-init allows for you, the trick is *digging out* the certificate from the connection08:52
jamfwereade: we could certainly just punt on all of that (though it is how net/http works), and go with a one http.Client per goose.Client, and then goose.Client asks for an http.Client that has *this custom Dial* func() that is actually an appropriate closure08:53
jamfwereade: I'm not 100% sure how we do the juju-side of it.08:54
jamBecause of simplestreams08:54
jamwe might not need to08:55
jamas in, we leave juju simplestreams as just ignoring the certificate, we teach goose how to grab the certificate, and then we teach juju bootstrap how to ask goose for what the cert is08:55
jamnote there is still a small problem that the "SWIFT" URL doesn't have to match the AUTH URL08:55
rogpeppejam: we can't make it a configuration option, so the user tells juju about their own self-signed cert?08:56
jamrogpeppe: how do they get that cert08:56
rogpeppejam: off their provider, i suppose08:56
jamrogpeppe: users don't really want to connect via Firefox, click on "I understand the warning" then "Download Certificate", copy and paste that into an environment.yaml file (.jenv)08:56
jamrogpeppe: my point on the bug is: "ssl-hostname-verification: false" really easy for a user to type and understand08:57
jamrogpeppe: go inspect this service over there to pull out its SSL certificate08:57
jamrogpeppe: *completely* non-obvious08:57
fwereadejam, +108:57
jamrogpeppe: right now, I'm looking at how I get "ssl-hostname-verification: false" to work for all our stuff that just downloads from a URL08:58
jamcloud-init does08:58
jamupgrader does08:58
jamcharmer does08:58
jametc08:58
jamI either propagate ssl-hostname-verification = false into EnvironConfig08:58
jamand teach the API08:58
jamthat for things that return a URL08:58
jamthey also return a "And you should ignore the certificate for this URL"08:59
jamor I teach something like Bootstrap08:59
jamto put "here is a new Cert for you to use"08:59
rogpeppeor you find out the cert somehow, yeah08:59
fwereadejam, I'm starting to feel gordian-knotty here -- I think you should probably just go with skipverify for now, because it delivers actual value to users who specifically say they want insecurity08:59
jamto accept08:59
jamfwereade: what really sucks is that we have the cert08:59
fwereadejam, giving those users a bit of extra security is just a bonus08:59
jambut it is over here on this object that is hidden between 3 interfaces and a type that doesn't expose its internal map08:59
rogpeppejam: this problem is almost all about when we're talking to storage, right?09:00
jamrogpeppe: right09:00
jamrogpeppe: there are 2 problems, but I feel like I've solved the first09:00
rogpeppejam: so, storage already exposes a URL method, yes?09:00
jamwe need to handle connecting to the Provider09:00
jamand we need to handle Storage09:00
jamrogpeppe: all the agents that aren't on machine-0 don't have a Provider connection09:01
jamjust a bunch of URLs09:01
jamrogpeppe: which is why on Openstack the Storage() has to be a world-readable container09:01
jam(swift version of s3 bucket)09:01
rogpeppejam: so if we've got a URL for a provider, can we can find out the certificates provided by that URL?09:02
jamrogpeppe: per the work we've been doing to put everything into the API, we *really* don't want the Provider secrets on any machine but machine-009:02
rogpeppejam: ISTM that that's a potential way of bypassing the abstraction layers09:02
jamrogpeppe: as I've been saying, yes. You just connect to it, and then the tls.Conn object has a "ConnectionState" which has the certs. But *that* object is very hidden.09:02
fwereadejam, I feel like the urge for a solution is going to cause us either to fuck proper layering hard, or to fiddle with quite a lot of code in order to pass certs around with all the urls we store09:02
jamrogpeppe: so yes, we could make the API Server proxy for anything you want to download09:02
jambut that is quite a bit bigger change.09:03
rogpeppejam: i'm not sure i was suggesting that.09:03
rogpeppejam: are you saying that it's not possible to use the net/http interface to make a connection and find out the certs at the other end, regardless of our code?09:03
jamrogpeppe: so if we've done the work to extract the Certificate, then when we start an instance, we can tell cloud-init to add the certificate to the accepted certs store for that machine.09:03
jamrogpeppe: net/http has a global shared Client that pools connections, and that map of address => connection is not exposed (that I can see)09:04
jamrogpeppe: we *can* create an http.Client that uses a custom Dial09:04
fwereadesorry, brb09:04
jamand when we get a Dial attempt09:04
jamwe inspect if it is a tls.Conn09:04
jamand if so09:04
jamgrab the certificate09:04
jambut *where do we put it*09:04
jamso that we can pull it out later when we get to cloud-init time09:05
rogpeppejam: can't we put it into the environ's config?09:05
jamrogpeppe: http.Client is *intended* to be a global shared state09:05
jamrogpeppe: how do we get it from Dial => environ config09:05
dimiternfwereade, ping09:07
rogpeppejam: what i'm trying to suggest is that somewhere outside the provider, if we have insecureSkipVerify, we invent a storage request, try to dial its URL, extract the certificate, and save it in the provider (and possibly change the global http client too)09:07
rogpeppejam: so we don't have to wait until the provider does its own Dial09:07
rogpeppejam: we preempt it by doing our own first09:08
jamrogpeppe: so we don't actually know where storage is until we've connected to the provider09:08
jamrogpeppe: openstack uses a registry of URLs for where things like swift are09:08
jamvs how ec2 has "known urls" ahead of time.09:08
jamrogpeppe: so you log in, then get back a list of "this is the URL to use for Swift"09:08
rogpeppejam: but that's in code that isn't hard to change to allow insecureSkipVerify, no?09:09
jamrogpeppe: so that's already been done, but it also means we've already done the Dial, so it doesn't make a lot of sense to do it separately09:11
rogpeppejam: i don't mind a bit of inefficiency in this case09:11
rogpeppejam: it's only one extra http request, after all; i'm probably missing something though.09:13
jamrogpeppe: so we have a fair number of abstractions about what URLs we are downloading from09:13
jamthere isn't Just One09:13
jamwe could probably do just Environ.Storage09:13
jam(and assume that tools-url is going to match that)09:14
jamthough there are no guarantees to that effect09:14
rogpeppejam: isn't the whole reason for URL so that we can use it in shell scripts ?09:14
rogpeppejam: what other abstractions are you thinking about?09:15
jamrogpeppe: you mean for env.Storage().URL ?09:15
rogpeppejam: yeah09:15
dimiternfwereade, jam, updated https://codereview.appspot.com/14036045/09:15
jamrogpeppe: so you're allowed to specify "tools-url" and "imagemetadata-url" which are just URL roots that we will use to look for image metadata and for tools metadata09:15
fwereadedimitern, heyhey09:16
fwereadedimitern, I will take a look09:16
rogpeppejam: is it ok to assume that they use certs signed by the same authority?09:16
rogpeppejam: or that if someone uses ssl-hostname-verification=false, that adding a cert from one of them will be good enough?09:17
fwereaderogpeppe, that does not sound ideal to me09:18
jamrogpeppe: so ian's design for simplestreams is that it can be any-old-http-server that you want, one of which might be swift/s309:18
jamrogpeppe: for the *immediate* use case, that might be ok09:18
jamthough auth-url and swift-url are different machines, I think09:18
jamso if they are using self-signed, they might be different self signed.09:18
fwereadejam, can we land it with just the existing disabling in place, and triage a bug for doing it better as wishlist or something? I feel like we're in danger of sacrificing better to best09:19
jamfwereade: it won't work today with just what I've done so far09:19
jamwe can bootstrap09:19
jamand with the cloud-init it will start09:20
jambut Upgrader Uniter etc will still be broken09:20
jamI can land this, and work on those09:20
jamfwereade: but that is why I was tempted by smoser's idea09:20
rogpeppejam: smoser's idea is good *if* you know where to find the certificates to add09:20
fwereaderogpeppe, well, yeah,but it's *that* problem that feels to me like an uncontainable horror09:21
jamrogpeppe: so we can iterate over all the simplestreams DataSources and get all of their certs (if any) I suppose09:21
jamah, except the Sources09:21
jamuse their own connection09:21
jamwe just call Source.Fetch()09:22
jambut we *do* have source.URL09:22
jamso for _, source := range GetToolsSources(): customClient.Get(source.URL()) => drops the Cert somewhere we can get it09:23
rogpeppejam: can we not define our own global http client, and have everything in juju use it?09:24
fwereadejam, but can we even be sure that a simplestreams file will only specify relative addresses for the actual downloads?09:24
rogpeppejam: hmm, that's not good either09:24
rogpeppefwereade: ha ha, good point09:24
jamfwereade: that is part of the simplestreams spec09:24
fwereadejam, ok, sweet, so long as someone's committed to that, I'd missed that09:24
jamfwereade: so there is stuff about mirrors, but the design is that the index always gives relative paths, so when you mirror the data, you don't have to change it09:25
fwereadejam, I think there's a disconnect there09:25
fwereadejam, but it's not worth worrying about actually09:25
fwereadejam, ech, or is it09:26
jamfwereade: so when we go to cloud-images... it tells us where the data is for amazonaws, etc.09:26
fwereadejam, anyway if it's in the spec this is moot09:26
jamfwereade: I'm not 100% sure about how the tools stuff is going to go, we are intending that you mirror tools into a local index09:27
jamfwereade: so I think we can avoid doing an HTTP get, we can iterate the sources, get the URL("") and then if the URL.Scheme == "https" do a tls.Dial and grab the cert out of there.09:29
jam(if ssl-hostname-verification: false)09:29
jamcreate a Set() of those certs, and the add them all to cloud-init09:30
fwereadejam, ok... we're still left assuming that those certs won't change... is that ok?09:30
jamwe probably still need to use ssl-hostname-verification: false when talking to the Provider itself (maybe), or we include auth-url as one of the bits we want to add09:30
jamfwereade: I think for the use case, it is fine. I was worried about that as well09:30
jamfwereade: but I don't think people are going to change their self-signed certs and expect juju to upgrade in place09:31
jamI think09:31
fwereadejam, I guess it's the same problem as updating authorized-keys in essence anyway09:31
fwereadejam, ie we could actually build the infrastructure to handle it if we had to09:31
jamfwereade: well ssl-hostname-verification: false would just disable it always, right?09:31
jamfwereade: we'd start managing certificates09:31
jamwhich I would love to avoid09:31
jam(oh, revoke that certificate, add this one), but I guess people want us to do that for authorized-keys as well09:32
fwereadejam, that was my thought, yeah09:32
jamfwereade: I think it is worthwhile to think how this interacts with the httpstorage proposal as well09:33
jam(a storage url may not be available before bootstrap time ?)09:33
fwereadejam, anything using it before bootstrap time will be able to get what it's looking for off the filesystem, won't it?09:34
jamfwereade: I'm meaning httpstorage with the local provider09:35
jamwe're talking about it exposing https09:35
jamand I guess we'll do something about accepting that cert09:35
jamfwereade: I thought I saw an axw commit that said "we'll need to disable certs for this"09:35
fwereadejam, I may have missed that bit... but then wouldn't the environment's CA cert be what we'd use/need there?09:40
jamfwereade: I honestly don't know what the plan is, and axw is already gone for the day09:40
fwereadejam, ok, fair enough09:41
jamfwereade: I just know of it as yet-another HTTPS source we might need to worry about09:41
fwereadejam, indeed, got you09:42
jamI don't really know how we set ca-certs for those instances09:42
jamI don't think we use cloud-init there09:42
jamgiven we would have to run a metadata server on the users' machine09:42
jam(I think)09:42
fwereadejam, alternate tack again: the cost and complexity of adding and using a bool-returning api method for each of the facades is known to be small, and fulfils the current use case adequately if not admirably09:45
fwereadejam, the cost and complexity of the alternatives stm to make it unlikely that we'll get an adequate implementation anywhere near as soon09:46
jamdimitern: so for https://codereview.appspot.com/14036045/ couldn't we have something that takes a list of jobs and tells you if you need state access?09:47
fwereadejam, no argument, the admirability ceiling of keeping track of the actual certs is much higher, but it feels wishlisty09:47
jamthat seems generic enough to work whether or not you're on the end of the API or directly on state09:47
jamfwereade: well, it was also impacted by the fact that scott raised the question, and nobody else reviewed the proposal :)09:47
jamfwereade: so I certainly considered the "add this cert manually" but the feeling of how to do the manual cert seemed terrible for users. So I explored this possibility of doing it for them09:48
jamfwereade: but yeah, disabling it everywhere seems more directly straightforward09:48
jamfwereade: and avoids the "oh you need 3 certs for the various services, etc"09:48
fwereadejam, don't get me wrong, I am happy that you explored it, it's exactly the sort of thing I'd like us to be considering by default09:49
jamfwereade: yeah, I felt it was worth discussing at least, I certainly wasn't committing to code it yet, but I did investigate to see what it would have taken.09:50
jamIf net/http could have exposed the existing conn I was pretty interested. Having to do it via inspecting Dial made me a bit sad.09:51
dimiternjam, I originally thought to add a method on MachineJob to return true if the job needs state, but we need the same one on params.MachineJob and state.MachineJob09:51
fwereadedimitern, that's a bit surprising09:51
fwereadedimitern, where do we get a state.MachineJob when we're not connected to state?09:51
jamfwereade: so there are 2 aspects (AIUI). One side needs to know if it should add a MongoPassword, and the other side needs to know if it should ensureStateConnection09:52
jamensureStateWorker09:52
jamdimitern: so I think your point is that we actually have 2 Job types09:53
jamone that is exposed on the API09:53
jamand one that is directly in state09:53
jamand so 1 function wouldn't take both types of objects09:53
jamshame they aren't just an Enum09:53
fwereadejam, dimitern: there are a few places where types moved from state to api rather than being copied... what are the forces that led us otherwise here?09:54
dimiternfwereade, the tricky part is the machine agent code09:54
dimiternfwereade, there we have both a state connection and an api connection09:55
jamdimitern: because we might run the env provisioner?09:55
dimiternfwereade, and we can't have the latter until we know we can connect - i.e. not bootstrapping and we know our jobs09:55
jamdimitern: I think fwereade's point is that why aren't they just params.MachineJob enums ?09:55
dimiternjam, because of JobManageState, but also because of the firewaller09:55
dimiternjam, the env provisioner also uses the api09:56
fwereadedimitern, jam: more to the point, they're ints in state and strings in the api09:56
fwereadebah09:56
jamfwereade: good times09:56
dimiternfwereade, all int consts are strings in the api - c.f. live09:56
fwereadejam, dimitern: however, no reason not to keep the int storage and expose them in methods as params.Job, right?09:56
dimiternfwereade, that's because json doesn't have true ints IIRC09:57
fwereadedimitern, and we want it to be half-way comprehensible too:)09:57
jamfwereade: right, in an API call it is nice to see "hosts-units" vs "1"09:58
jambut that would have been a reason to put them as strings into the DB as well :)09:58
jamdimitern: so I think the idea is that we would have state.JobHostsUnits only long enough to turn it into params.JobHostsUnits.09:59
jamBut that sounds like EOUTOFSCOPE for your patch09:59
jamI'd probably still rather it be a function that takes a slice of jobs09:59
jamrather than putting functions on what is otherwise a blob10:00
mgzhey jam10:00
jammgz: mumble/hangout ?10:00
dimiternjam, fwereade, if it has to be a helper taking a slice of jobs, we still need 2 helpers10:01
fwereadedimitern, to be fair, if one of them is a wrapper that does the conversion and calls the other, that wouldn't be so bad, would it?10:02
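The "two helpers, one wrapping the other" shape fwereade suggests could be sketched like this. Everything below is illustrative: `apiJob` and `stateJob` are hypothetical stand-ins for params.MachineJob and state.MachineJob, the constant values are placeholders, and which jobs need state access is an assumption made for the example rather than something taken from dimitern's branch.

```go
package main

import "fmt"

// apiJob stands in for the string-valued job type exposed over the API.
type apiJob string

// stateJob stands in for the int-valued job type stored in state.
type stateJob int

const (
	apiJobHostUnits     apiJob = "JobHostUnits"
	apiJobManageEnviron apiJob = "JobManageEnviron"
	apiJobManageState   apiJob = "JobManageState"
)

const (
	stateJobHostUnits stateJob = iota + 1
	stateJobManageEnviron
	stateJobManageState
)

// stateToAPI is the single place where the two representations meet.
var stateToAPI = map[stateJob]apiJob{
	stateJobHostUnits:     apiJobHostUnits,
	stateJobManageEnviron: apiJobManageEnviron,
	stateJobManageState:   apiJobManageState,
}

// apiJobsNeedState holds the actual rule. Assumption for this sketch:
// the two "manage" jobs need a state connection, hosting units does not.
func apiJobsNeedState(jobs []apiJob) bool {
	for _, job := range jobs {
		switch job {
		case apiJobManageEnviron, apiJobManageState:
			return true
		}
	}
	return false
}

// stateJobsNeedState is the thin wrapper: convert, then delegate, so the
// rule itself lives in exactly one function. Unknown state jobs convert
// to the empty apiJob, which never needs state.
func stateJobsNeedState(jobs []stateJob) bool {
	converted := make([]apiJob, 0, len(jobs))
	for _, job := range jobs {
		converted = append(converted, stateToAPI[job])
	}
	return apiJobsNeedState(converted)
}

func main() {
	fmt.Println(apiJobsNeedState([]apiJob{apiJobHostUnits}))
	fmt.Println(stateJobsNeedState([]stateJob{stateJobHostUnits, stateJobManageState}))
}
```

This keeps the logic centralized in the way jam asks for a few lines later, at the price of one conversion table that has to stay in sync with both job types.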
rogpeppemgz: ping10:05
dimiternfwereade, ok I suppose10:05
jamdimitern: so I don't feel like we need tons of overengineering, but we did have logic that needed to be changed in several places to keep them in sync10:05
jamit certainly felt like it should be centralized10:05
jamnot helped by "this is an if clause, this is a map index, etc"10:06
mgzrogpeppe: hey10:06
rogpeppemgz: do you want to continue with the address worker at some point?10:07
mgzrogpeppe: that would be good10:07
mgzafter standup?10:08
rogpeppemgz: sounds good10:08
fwereadejam, rogpeppe: ISTM that provider.StartBootstrapInstance and provider.StartInstance are out of whack10:33
jamfwereade: how so?10:33
fwereadejam, rogpeppe: it looks like StartInstance is something you do *to* an environ, and should be in environs10:33
jamfwereade: so I think the idea is that we have 90% common code between all implementations10:34
fwereadejam, rogpeppe: but StartBootstrapInstance looks like it's a common implementation of Bootstrap10:34
jamthat set up the machine-config etc10:34
fwereadejam, provider implementations call StartBootstrapInstance, but other code calls StartInstance10:34
fwereadejam, rogpeppe: I thought the idea of the code in the provider package was to help people implement providers10:36
rogpeppefwereade: i think you're probably right10:36
jamfwereade: I have a feeling it was exploring the bounce-and-bounce-back stuff. (Do we call Env.Foo() which calls something common, and then calls back on Env or do we just call something common, or...)10:36
jamfwereade: I have the feeling someone found the balance different each time10:36
jambut we should make them at least similar10:37
rogpeppefwereade: i mean, you're definitely right that the code in the provider package is to help people implement providers10:37
rogpeppefwereade: and i think you're probably right about StartInstance. i'm just looking around - i'm not that familiar with it10:37
rogpeppefwereade: (although i suppose i may have been responsible for putting it there!)10:37
fwereaderogpeppe, jam: ehh, we progress uncertainly, but we do progress10:38
fwereaderogpeppe, jam: they're certainly named very confusingly given the different domains though10:38
rogpeppefwereade: agreed10:38
* rogpeppe hates the .(T) magic strewn around seemingly at random10:39
fwereaderogpeppe, jam: ok, I think I'll move StartInstance over to Environs in a mo10:40
rogpeppeit feels exceedingly fragile to me10:40
jamfwereade: it is also exceedingly unhelpful that env.StartInstance does already exist10:40
fwereadejam, oh, wtf10:40
jamand is called by provider.StartInstance IIRC10:40
fwereadejam, ahh Environ.StartInstance, sorry10:41
jamfwereade: "last line of StartInstance is broker.StartInstance"10:41
jamand broker == environ10:41
jam("if env, ok := broker.(environs.Environ)")10:41
fwereadejam, yep10:41
jamfwereade: so the idea is that environ.StartInstance is hard to use because it needs this MachineConfig and Tools stuff10:41
jamso we'll pull that out into a helper10:42
jamoddly enough10:42
jamStartBootstrapInstance also needs this tools list10:42
jambut the env passes them in10:42
fwereadejam, rogpeppe: I would agree that the type-checking in provider.StartInstance looks like madness and bullshit10:42
jamthough it comes from Bootstrap10:42
rogpeppejam: i don't really understand the "environ.StartInstance is hard to use because it needs this MachineConfig and Tools stuff" comment10:43
rogpeppejam: it's only used in one single place in the code10:43
rogpeppejam: so how does creating an extra layer help?10:43
jamfwereade: and there is a bootstrap.Bootstrap that does the heavy lifting before calling env.Bootstrap which then calls provider.StartBootstrapInstance which then calls env.StartInstance10:43
rogpeppejam: that seems reasonable to me10:44
rogpeppejam: because Bootstrap is something that external code might want to do.10:44
jamrogpeppe: so provider.StartInstance is the same "use a helper to then call the right values on the environ" that bootstrap.Bootstrap is10:44
fwereadejam, rogpeppe: exactly10:45
jamrogpeppe: note that I didn't write these, though maybe more should be caught in review.10:45
rogpeppejam: except that no one except the provisioner worker should ever be starting instances10:45
jamstandup time10:45
jamfwereade: rogpeppe: https://plus.google.com/hangouts/_/8a92f5273abdde270a9fa8d3c6c19416568d4b6b10:45
fwereaderogpeppe, ok, but I don't think the provisioner should really need to know or care about tools10:46
natefinchSo, am I wrong in thinking EC2 instances all start with a specific amount of disk space you get for free with the instance, which is always more than 8gigs?  "Instance Storage" here - http://aws.amazon.com/ec2/instance-types/#instance-details11:42
mgznatefinch: all cloud providers tend to give you another volume for misc storage11:45
mgzwhich is much larger than the root partition11:45
mgzreally charms should all be configured to use that for storagey things like databases where possible11:45
mgzso the root can just be packages11:45
natefinchwhen I tweaked the defaults in the code, I could get root storage up to what was stated in that table.  afaik, aws gives you other storage but it's ephemeral, goes away on reboot11:47
mgzright... that's also a consideration11:47
natefinchgoes away, for me, means "treat like (really slow) in-memory storage"11:49
mgznatefinch: I'm not certain stopping and starting an instance is supported in general by juju, and ephemeral storage should persist across a simple reboot11:56
natefinchmgz: hmm.. I was under the impression that ephemeral storage wasn't reliable across a reboot, but I'm by no means an expert, and only read the docs once, a while ago.11:59
mgzhttp://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html12:06
mgz"The data in an instance store persists only during the lifetime of its associated instance. If an instance reboots (intentionally or unintentionally), data in the instance store persists. However, data on instance store volumes is lost under the following circumstances: *Failure of an underlying drive  *Stopping an Amazon EBS-backed instance *Terminating an instance"12:06
natefinchmgz: thanks for the link. I guess I saw ephemeral and "temporary" and jumped to conclusions12:08
mgzah, excellent12:10
mgzfor whatever reason, sydney is vpc-only12:11
mgzall the other regions have ec2-classic12:11
mgzamusing I have to test this on the server furthest from me, but hey12:11
natefinchwhat's an ocean or two between friends, really?12:12
=== TheRealMue is now known as TheMue
hazmatjam, fwereade fwiw, i'm poking around the perms on the bucket.. the problem seems to be whatever is doing the uploads13:22
hazmatneeds to also explicitly do an access grant to public for read13:22
mgzhazmat: yes, our code just defaults to creating private13:23
mgzsinzui: do you need any help fixing up the ec2 perms, following up for the 1.15 release?13:24
sinzuimgz, I don't think so. I used s3cmd to upload because it has bulk powers. I will try again using s3up for each file.13:27
mgzsinzui, the issue is the 'directories', not the files, I believe13:27
sinzuioh, then I am still clueless.13:28
* sinzui reads the emails again13:28
hazmatthere are no directories13:31
hazmatthere are only objects13:31
mgzthere is a tools/releases object that's 19 bytes13:31
hazmatodd13:31
sinzuithe paths then http://juju-dist.s3.amazonaws.com/13:32
hazmatany objects uploaded must be done so with a grant that allows global read13:32
hazmatsinzui, you're using s3cmd?13:32
hazmatsinzui, is this code from juju-core/scripts?13:32
sinzuimgz, again?  I saw and thought I fixed the 19 byte issue. It was caused when relative paths were passed to sync-tools. I switched to absolute locate paths13:32
abentleysinzui: ping for standup13:32
hazmatso here's the perm map of that bucket http://paste.ubuntu.com/6175597/13:34
hazmatsinzui, mgz  it looks like we can create a bucket policy directive for contained objects to default to read13:37
hazmatinvestigating13:37
sinzuihazmat I can fix this when I leave my current meeting.13:39
hazmatsinzui, at this point i'm already in it13:39
sinzuihazmat, then I thank you very much for helping13:40
rogpeppetrivial code review anyone?  https://codereview.appspot.com/1412304313:42
dimiternrogpeppe, looking13:43
dimiternrogpeppe, lgtn13:43
dimiternm even13:43
rogpeppedimitern: ta13:43
mgzlooks good to noone?13:44
mgzthat's pretty mean13:44
hazmatsinzui, np.. policy set13:45
* rogpeppe hates that feeling when you *know* you've implemented something identical in the (possibly recent) past, but just can't remember where that was.13:48
TheMuerogpeppe: it even gets worse if you don't even know it and some time later you discover that you've done it13:53
rogpeppeTheMue: actually, i don't mind that as much13:54
rogpeppeTheMue: it's that feeling of struggling to reproduce logic you can *almost* remember13:54
rogpeppefwereade, jam: delete cmd/builddb: https://codereview.appspot.com/1412704314:36
* rogpeppe goes for lunch14:39
hazmatsinzui, one other regression vs 1.14.[0,1] is the production and upload of the armhf binaries14:52
sinzuihazmat, yes, only Ubuntu is making them. I saw that14:54
hazmatrobbiew, non lts server distro support is 9months? ie. 12.10 is no longer supported?15:02
natefinchfwereade,mgz, rogpeppe, TheMue, jam, dimitern, anyone else who cares -  I'm writing juju help constraints... anyone want some input?  I want to make sure there aren't any technical errors, and if you have formatting suggestions, that's cool too: https://docs.google.com/document/d/1sy4yDUp93FYPt205Muarr8ASiEaylyVSAuycq0OkBgY/edit?usp=sharing15:10
mgznatefinch: will have a look15:11
fwereadenatefinch, I'll try to get to it later15:11
natefinchfwereade:  no problem15:11
mgznote I added some generic stuff to lp:juju/docs when we did the sprint15:11
mgzer... or whatever the correct location is15:11
natefinchmgz: I see some stuff on constraints under juju-core/doc/provisioning.txt15:12
mgznatefinch: the juju.ubuntu.com docs is what I'm talking about15:13
natefinchmgz: ahh, right15:13
natefinchmgz: didn't occur to me to look there15:13
natefinchmgz: definitely some stuff that needs adding to my docs15:14
mgzthose have:15:15
mgzhttps://juju.ubuntu.com/docs/charms-constraints.html15:15
mgzhttps://juju.ubuntu.com/docs/reference-constraints.html15:15
natefinchall right, well, there's obviously a lot to add to my docs.  Probably not worth looking over what I have until I work in those pages15:16
natefinchmgz: is this true?  "A value of 'any' explicitly unsets a constraint, and will cause it to be chosen completely arbitrarily."15:20
mgzer, it's not great wording, but it's true that pyjuju unsets a constraint when given the string 'any'15:21
natefinchmgz: yes, but does that apply to juju-core?15:22
mgznope.15:22
natefinchbadness15:22
mgzpyjuju distinguishes between "any" which is no constraint, and "" which is use the default15:23
mgzjuju-core just has "" which can mean several different things15:24
natefinchwe need separate documentation for pyjuju and juju-core :/15:24
mgzreally, we just need to update any remaining bits to talk about juju-core behaviours, with notes added for where we break compat15:25
jammgz: you might check with TheMue he's been working in the area. So that "nil" can unset a value, etc.15:26
natefinch*nod* also, a lot of that constraints page reads as release  notes, not documentation  "will be controlled with a new command"  "Please note that there are no changes to"15:26
jamI think that is null for JSON and maybe "" for the commandline15:26
jamah, nm, "juju unset"15:26
mgzTheMue: ^please also update natefinch and the docs when you land exciting constraint semantics changes :)15:27
* natefinch likes writing documentation, but it also means he's picky about it ;)15:28
TheMuemgz: will/would do, but so far nothing regarding constraints15:32
jamTheMue: ah, "juju unset" is all about config options for a charm, not constraints15:33
TheMuejam: yep, exactly15:33
jammgz: is there a way to, say, unset the mem constraint?15:33
mgzmem= and mem=0 both seem to have the same effect15:34
mgzno way of saying "go back to the juju default"15:34
mgz(returning references to uninitialised values still freaks me out in go...)15:35
natefinchyou mean copies of uninitialized values, right? :)15:35
mgz(need to spend brainpower to remember uint64 means 0, not random memory)15:36
mgzmeans? gets? summat.15:36
mgznatefinch: that also doesn't help :)15:36
natefinchthe "everything is initialized to a zero value" is one of my favorite things about go.  Although, to be fair, most modern languages do about the same thing... C#, java, etc.15:37
natefinchit's just go makes zero values more useful in many cases15:37
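A minimal Go sketch of the zero-value point above, and of why "mem=" and "mem=0" can end up looking identical. The constraints type and parseMem function are hypothetical stand-ins written for illustration; they are not juju-core's actual code, and the real parser may behave differently.

```go
package main

import (
	"fmt"
	"strconv"
)

// constraints is a hypothetical stand-in, not juju-core's real type.
// A nil Mem means "not mentioned"; a non-nil Mem holds an explicit value.
type constraints struct {
	Mem *uint64
}

// parseMem is a hypothetical parser for the value part of "mem=<value>".
// An empty value yields a pointer to a fresh uint64, which Go guarantees
// is 0, so it cannot be told apart from an explicit "0".
func parseMem(value string) (*uint64, error) {
	mem := new(uint64) // zero value: *mem == 0, never random memory
	if value == "" {
		return mem, nil
	}
	n, err := strconv.ParseUint(value, 10, 64)
	if err != nil {
		return nil, err
	}
	*mem = n
	return mem, nil
}

func main() {
	var unset constraints
	fmt.Println("never mentioned:", unset.Mem == nil)

	empty, _ := parseMem("")
	zero, _ := parseMem("0")
	fmt.Println("mem= equals mem=0:", *empty == *zero)
}
```

Under these assumptions, only the nil pointer can stand for "go back to the default", which matches the complaint that the command line has no way to express it.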
rogpeppehmm, provider/ec2 tests seem broken for me on trunk. anyone else see that?16:26
rogpeppei see this: http://paste.ubuntu.com/6176180/16:26
rogpeppefwereade, jam, dimitern, natefinch, TheMue: can you verify please?16:27
mgzrogpeppe: yeah, I see that16:32
rogpeppemgz: hmm, i wonder how it could have got past the 'bot16:32
mgzthe test looks like it talking to the real s3 bucket16:33
mgzso, presumably the bucket wasn't borken when it got run on the bot16:33
rogpeppemgz: what makes you think that it's talking to the real s3 bucket (not that i don't believe you)16:34
mgzlast few lines of the log have real urls16:34
rogpeppemgz: ha, good point16:35
mgzlacking the new "/releases/" part16:35
jamrogpeppe: mgz: this is because we *can* now read s316:35
rogpeppejam: i think this must be relatively recent behaviour16:36
jamrogpeppe: as in, kapil just fixed s3 about 2 hours ago16:36
rogpeppejam: rev 1901 introduced the problem16:37
jamrogpeppe: known, we had a failure elsewhere in the test suite because 1.15.0 was uploaded, and the test suite failed because it saw but couldn't read the bucket (see Tim's patch earlier today), then Kapil fixed the s3 bucket to be readable, and another test fails16:37
rogpeppejam: but at rev 1900, provider/ec2 tests are extremely slow (51s), whereas relatively recently they only took 5s16:37
jamrogpeppe: not specifically related16:38
rogpeppeoh it's such a twisty maze16:42
rogpeppei really think the dynamic type conversions everywhere are a horrible mistake16:43
rogpeppethere's no way to know by looking at provider.StartInstance what methods might be called on the broker parameter16:44
rogpeppejam: i'll disable that test for the time being, just so we can actually submit something16:46
jamrogpeppe: please make sure to submit a Critical bug about i16:48
jamit16:48
jamtest suite not being isolated is *bad* and will bite us again16:48
rogpeppejam: totally agree16:49
rogpeppejam: https://codereview.appspot.com/1412304516:58
rogpeppemgz, fwereade, dimitern: ^16:58
fwereaderogpeppe, LGTM16:59
fwereaderogpeppe, I am up to my elbows in it as we speak16:59
fwereaderogpeppe, in short I think that provider.StartInstance is total madness17:00
mgzrogpeppe: also lgtmed, note that you shouldn't mark that bug fixed when you land, as it's the one tracking the actual issue17:00
rogpeppefwereade: +117:00
rogpeppemgz: hmm, good point17:00
mgz(or you should reference a different bug in the skip message and close that one)17:01
rogpeppemgz: if i approve the branch, will it mark the bug as fixed?17:01
rogpeppemgz: (automatically)17:01
mgzI think the bot may, when landing, but you can always revert17:02
rogpeppemgz: ok17:02
rogpeppefwereade: i would really really *really* like it if we could lose all the dynamic type coercions, so any interface values passed to provider functions document exactly what methods may be called on them.17:03
fwereaderogpeppe, I'm not entirely convinced there17:04
natefinchrogpeppe, fwereade: +1   otherwise the interface is a lie17:05
fwereaderogpeppe, no argument that provider.StartInstance is abuse17:05
rogpeppefwereade: i can't see any advantage to the way things are done currently17:05
rogpeppefwereade: it breaks types-as-documentation, and it breaks encapsulation.17:06
rogpeppefwereade: and it's trivial to make it work conventionally.17:06
natefinch(to be clear, I was +1 for roger's point)17:06
fwereaderogpeppe, natefinch: ISTM that the reality is that we have environs that actually do expose different features, and the custom datasources are a valid application of the technique17:06
rogpeppefwereade: exposing a different feature is not a cause for exposing a different method17:07
rogpeppefwereade: environs that don't implement custom data sources can implement a method that returns no custom data sources17:08
rogpeppefwereade: and we can easily have a "nothing custom implemented here" empty provider type.17:08
rogpeppefwereade: which can be embedded to provide the default versions of the methods.17:08
rogpeppefwereade: so the cost to any given provider is at most one line17:09
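The embedding approach rogpeppe describes could look roughly like the Go sketch below. Every name in it (Environ, NoCustomSources, plainEnv, fancyEnv, describe) is hypothetical and chosen only for illustration; none of it is taken from juju-core.

```go
package main

import "fmt"

// All names here are hypothetical; they are not juju-core's real API.

// Environ lists every method provider helpers may call, so nothing
// needs to be discovered later with a type assertion.
type Environ interface {
	Name() string
	CustomSources() []string
}

// NoCustomSources supplies the "nothing custom here" default.
type NoCustomSources struct{}

func (NoCustomSources) CustomSources() []string { return nil }

// plainEnv opts into the default with one embedded field.
type plainEnv struct {
	NoCustomSources
}

func (plainEnv) Name() string { return "plain" }

// fancyEnv overrides the default by declaring its own method,
// which shadows the promoted one.
type fancyEnv struct {
	NoCustomSources
}

func (fancyEnv) Name() string { return "fancy" }

func (fancyEnv) CustomSources() []string {
	return []string{"https://example.invalid/streams"}
}

// describe only uses methods that appear in its parameter's type.
func describe(env Environ) string {
	return fmt.Sprintf("%s: %d custom source(s)", env.Name(), len(env.CustomSources()))
}

func main() {
	fmt.Println(describe(plainEnv{}))
	fmt.Println(describe(fancyEnv{}))
}
```

The cost to a provider with nothing custom is the single embedded field, and the helper's signature states everything it may call.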
fwereaderogpeppe, natefinch: so we have a giant nothing-special one that is itself useless as documentation, because any method could be overridden? or a bunch of little nothing-special ones that are embedded individually (and still you can't say for sure whether they're overridden)?17:11
fwereaderogpeppe, natefinch: ISTM that the idea that an interface specifies a minimum set of capabilities is somewhat endorsed by the language17:11
rogpeppefwereade: for a given Environ type it's easily possible to say what's overriden17:12
rogpeppedden17:12
fwereaderogpeppe, natefinch: if someone were to abuse that to, say, close a conn in a surprising fashion, that would be bad17:12
rogpeppefwereade: but what is awful is having functions that say they expect some interface, and then randomly assert some other interface type down in the depths of their implementation17:12
rogpeppefwereade: i am not endorsing that StartInstance take the giant interface type.17:13
fwereaderogpeppe, natefinch: whereas clever copying things with ReadFrom are touted as somewhat awesome17:13
natefinchfwereade: an interface defines both what could be called and what will not be called, by definition.  You can get around it, but it's bad form to do so17:14
fwereadenatefinch, so it's bad manners to accept an interface and not call its methods? agree in the abstract, think it's a bit fuzzier in practice17:14
rogpeppefwereade: the cleverness in ReadFrom is somewhat dubious - but excusable by virtue of the fact that the methods that it uses implement exactly the same functionality as the original interface methods.17:15
natefinchfwereade: I mean it's bad manners to accept an interface and then call OTHER methods17:15
natefinchfwereade: for example, if something takes an io.Reader, checks for Close() and then calls Close.... that would be surprising17:15
rogpeppefwereade: i think that we should try to define interfaces that define useful subsets of the Environ type.17:15
fwereadenatefinch, indeed so :) and everyone agrees it sucks17:15
fwereaderogpeppe, indeed so17:15
rogpeppefwereade: but i think that functions like provider.StartInstance should at least accept an interface type that defines a non-strict-superset of the methods that will be called17:16
fwereaderogpeppe, natefinch: as it happens I think you're right about the custom data sources17:16
fwereaderogpeppe, natefinch: because they all have the same goddamn implementation anyway ;p17:16
natefinchheh17:17
fwereaderogpeppe, natefinch: so, I dunno -- I'm not willing to excommunicate the technique, but it may well be the case that every single use of it that you can point to is unmitigated crack17:17
fwereaderogpeppe, natefinch: in which case, eh, we shouldn't have any of them :)17:18
rogpeppefwereade: here's a reason that the io.Copy magic isn't bad - you can't break the behaviour of io.Copy by embedding an io.Reader in a custom struct type.17:18
rogpeppefwereade: but our code will break if you embed an Environ in something else.17:18
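A small Go sketch of the breakage rogpeppe is pointing at: once a value is wrapped in a struct that embeds only the declared interface, a hidden type assertion stops finding the extra method. Broker, customSourcer, env, loggingBroker and sources are hypothetical names, not juju-core's real types.

```go
package main

import "fmt"

// Hypothetical names throughout; not juju-core's real types.

// Broker is the interface the helper claims to need.
type Broker interface {
	StartInstance() string
}

// customSourcer is the extra interface the helper quietly asserts.
type customSourcer interface {
	CustomSources() []string
}

type env struct{}

func (env) StartInstance() string   { return "started" }
func (env) CustomSources() []string { return []string{"custom"} }

// loggingBroker wraps a Broker. Because the embedded field has the
// interface type Broker, only Broker's methods are promoted.
type loggingBroker struct {
	Broker
}

// sources mimics a helper that asks for more than its signature says.
func sources(b Broker) []string {
	if cs, ok := b.(customSourcer); ok {
		return cs.CustomSources()
	}
	return nil
}

func main() {
	fmt.Println("bare env:", len(sources(env{})))
	fmt.Println("wrapped env:", len(sources(loggingBroker{Broker: env{}})))
}
```

The wrapped value still satisfies Broker, yet the custom sources silently disappear. With io.Copy the optional method only changes how the copy is performed, not what is copied, so wrapping a reader loses a fast path at most.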
fwereaderogpeppe, ok, that's a nice concrete reason17:19
rogpeppefwereade: what would you take as evidence that a particular use was *not* unmitigated crack?17:19
natefinchrogpeppe, fwereade: my stance would be, any use like this should be a huge red flag, and much scrutiny put towards finding a better way.  My guess is that almost always, there will be a way that is much more clear, without a ton more work.17:21
fwereaderogpeppe, I imagine it varies case-by-case, but I wasn't *that* hard to convince there, was I? ;)17:21
rogpeppefwereade: hrmph :-)17:21
natefinch:)17:21
fwereaderogpeppe, natefinch: my experience this afternoon might be leading me in your direction anyway17:22
fwereaderogpeppe, natefinch: I can certainly agree that it's fair to treat it as a pungent code smell17:23
rogpeppehmm, we may want to disable go vet in .lbox.check in go1.217:25
rogpeppeit takes 30s on my machine17:25
natefinch30s seems ok if it only happens when you're proposing17:27
* fwereade called for dinner, back later17:27
natefinchit's not *fun*, but it's pretty useful, and could easily be forgotten otherwise17:27
rogpeppenatefinch: propose is already really slow17:27
rogpeppenatefinch: and we've got the bot17:28
natefinchrogpeppe: I was actually going to say, lbox propose is already slow, so what's another 30s? :)17:28
rogpeppenatefinch: it gets in the way of my critical path17:28
rogpeppenatefinch: i can't do anything when i'm running lbox propose17:29
natefinchrogpeppe: did it get significantly slower in 1.2?17:29
rogpeppenatefinch: yes17:29
rogpeppenatefinch: or maybe only in tip, i'm not sure17:29
rogpeppenatefinch: it now runs the type checker17:30
natefinchrogpeppe: ahh hmm. when are we going to switch to 1.2?17:31
rogpeppenatefinch: good question17:31
natefinchrogpeppe: I'm sorta surprised, 30s is a long, long time.17:32
rogpeppenatefinch: yup17:32
natefinchrogpeppe: any idea if they plan on trying to make that perform better?  I haven't been keeping up on golang-nuts as closely as I should17:33
rogpeppenatefinch: probably. i should bug adonovan17:33
sinzuir1.15.0 continues to be cursed. Is there a command or url I can use to quickly locate an Ubuntu image for azure. The default image selected by Juju appears to be invalid now: http://pastebin.ubuntu.com/6176364/17:34
rogpeppesinzui: could you run that command with --debug please?17:40
rogpeppenatefinch, fwereade, jam, mgz, dimitern: next stage in environment config info storage:17:40
rogpeppe2013-09-30 17:00:40 ERROR juju supercommand.go:282 command failed: cannot start bootstrap instance: POST request failed: BadRequest - The location or affinity group East US specified for source image b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-12_04_3-LTS-amd64-server-20130916.1-en-us-30GB is invalid. The source image must reside in same affinity group or location as specified for hosted service West US. (http code 400: Bad Request)17:40
rogpeppeerror: cannot start bootstrap instance: POST request failed: BadRequest - The location or affinity group East US specified for source image b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-12_04_3-LTS-amd64-server-20130916.1-en-us-30GB is invalid. The source image must reside in same affinity group or location as specified for hosted service West US. (http code 400: Bad Request)17:41
rogpeppeoops!17:41
rogpeppehttps://codereview.appspot.com/14136043/17:41
rogpeppenatefinch, fwereade, jam, mgz, dimitern: ^17:41
rogpeppeplease ignore the deleted builddb noise17:41
sinzuirogpeppe, I am rerunning with 1.15.0 and debug. the issue is the same with stable and unstable: http://pastebin.ubuntu.com/6176490/17:42
* sinzui tries an older image from August17:44
rogpeppesinzui: i suspect a problem with our simplestreams logic17:46
sinzuiolder image does not work either :(. This did work last week17:46
sinzuirogpeppe, possibly. Did you see that that my first paste was using 1.14.117:47
rogpeppesinzui: try with revno 190017:47
sinzui1.14.1 did run with azure last week for me17:47
rogpeppedimitern: have you run live tests on ec2 since you merged your mongo password changes?17:52
dimiternrogpeppe, I haven't merged them yet17:52
dimiternrogpeppe, still fooling around with some tests17:52
rogpeppedimitern: ah, ok that's good - i can't blame them for my current test failure :-)17:52
dimiternrogpeppe, and, let's clear that out - 'cause I was wondering before17:53
sinzuirogpeppe, same result for -r1900, 1.15.0, and 1.14.1 used default image selection and when I force an image. I suspect this is more azure than juju.17:53
* sinzui looks for victim to test azure17:53
rogpeppesinzui: quite possible17:53
dimiternrogpeppe, by "ec2 live tests" do you mean bootstrapping an ec2 env and doing some deployments, etc. or running ec2 tests with --amazon? (or whatever it was)17:53
rogpeppedimitern: no, i mean cd provider/ec2; go test -test.timeout 1h -amazon17:54
rogpeppedimitern: the latter in other words, yes17:54
dimiternrogpeppe, I've never done that actually17:54
rogpeppedimitern: heh17:54
dimiternrogpeppe, I just fire up a c2 env from my account and do some manual deploy/status/etc. tests on the console17:55
dimitern*ec2 that is17:55
* rogpeppe leaves18:19
rogpeppemight be back later for a bit18:19
=== BradCrittenden is now known as bac
thumpermorning19:29
natefinchthumper:  morning... you're on early19:33
thumperhi natefinch19:33
thumpernah, the country is now UTC+1319:33
natefinchoh, it was over this weekend, that's right19:33
thumperI'm a little earlier than usual by about 30m19:33
natefinchright19:33
thumperbecause it is the school holidays19:33
thumperand I have less to worry about19:33
natefinchhah, I think I'd get in earlier when the kids had to go off, because there's more stuff to do in the morning19:34
natefinchbut then, I start work before anyone else is even awake in the house19:35
sidneihey guys, got a fancy one here. had an env running for a couple days, now i went back to look at it and one of the machines has the agent as down. looking at the logs, it's failing to log back in:19:35
sidnei2013-09-30 19:34:03 ERROR juju runner.go:211 worker: exited "state": cannot log in to juju database as "machine-6"19:35
thumpernatefinch: what time do you start?19:35
thumpersidnei: have you upgraded?19:36
sidneiobviously i can't 'juju terminate-machine' because it needs the unit to be removed before terminating it right?19:36
natefinchthumper: I get up at 5:30am most days and start work between 6 and 6:30 depending on what else I have to do in the morning.  before the kids get up is the only quiet time I get :)19:36
sidneithumper: not intentionally, but maybe landscape auto-upgraded it for me19:36
thumpernatefinch: do you split your day?19:36
sidneialthough, i think it wouldn't come from a package, but from tools?19:36
thumpersidnei: which environment?19:37
natefinchthumper: basically.  I help the kids get up around 7:30-8:30 or so (later if there's nothing going on that day), and then help some more around lunch time.  So, less split ,and more just interrupted ;)19:37
natefinchsidnei: oh, we don't support environments being up for multiple days19:37
sidneinatefinch: haha :)19:38
thumper:P19:38
natefinch:D19:38
sidneithumper: canonistack19:38
thumpersidnei: I wonder if juju is installed there19:38
thumpercan you ssh into it?19:38
sidneithumper: neither 'juju' is installed, nor 'juju-core'. in fact, it doesn't even know about a juju-core package19:39
thumpersidnei: ok, that's good19:40
thumpersidnei: ya know, I've always felt a bit weird about how the password thing is handled19:40
sidneithumper: if you want to poke at it i can add your ssh key, im about to destroy the instance otherwise19:40
thumpersidnei: no, destroy it...19:41
* thumper thinks19:41
thumperwell agents in lxc containers come back up19:41
sidneii guess i can't destroy it either19:41
thumperso the file must persist properly I guess19:41
sidneiotherwise the environment will go nuts19:41
thumpersidnei: not yet, we don't have a --force or other mechanism for cleanup19:41
sidneimaybe i can poke at mongo and get the creds out of it, then compare to what the machine thinks it should have19:43
sidneijujud has a timestamp of sept 10th btw19:43
natefinchman, it bugs me that you can do juju add-machine lxc, but you can't do juju add-machine --constraints container=lxc  :/19:48
rogpeppethumper: hiya19:49
thumperhi rogpeppe19:49
* thumper otp19:49
rogpeppethumper: otp = "off to play" ?19:50
thumperon the phone19:50
natefinchwhat does add-machine ssh do?19:57
fwereadenatefinch, manual provisioning19:59
fwereadenatefinch, ssh in and start running a machine agent19:59
natefinchso it basically gives credentials and an IP address for the state server to connect to that machine and start it?20:00
natefinch(and then do all our normal startup stuff)20:00
fwereadenatefinch, at the moment I think it's direct, and I'd need to read the code to tell you the exact sequence20:01
natefinchfwereade: no problem... I was just looking at docs related to constraints and..... add-machine needs some help :)20:01
fwereadenatefinch, something like: ssh in, maybe ask for a sudo password, figure out the machine's series/hardware, inject machine into state, start agent, log out20:02
fwereadenatefinch, I bet20:02
fwereadenatefinch, thank you ever so for focusing on this20:02
natefinchfwereade: I figure it's good to have the guy who already doesn't know what he's doing look at the docs and see if they make sense ;)20:02
fwereadenatefinch, hell yeah :)20:03
fwereadenatefinch, although, to be fair, I would not have described you thus :)20:04
thumperfwereade: hey hey20:04
fwereadethumper, heyhey20:04
fwereadethumper, how's it going?20:04
thumpergood20:05
fwereadesinzui, good evening20:12
sinzuiHi fwereade20:13
fwereadesinzui, how's azure? :)20:13
fwereadesinzui, is there anything I can do to help?20:13
sinzuifwereade, from an email I am typing:20:13
sinzui* BAD: I cannot deploy to azure with 1.15.0. I can with 1.14.1.20:13
fwereadesinzui, that statement is a model of clarity20:14
thumpersinzui: what sort of error are you getting?20:15
sinzuifwereade, http://pastebin.ubuntu.com/6177102/20:15
sinzui^ I have tried East US and West US. I have uploaded tools to both20:16
sinzuiI am confident that https://jujutools.blob.core.windows.net/juju-tools/tools is the tools-url20:16
thumpersinzui: looks to be an image problem not a tools problem...20:17
sinzuiI agree!20:17
sinzuiI can deploy 1.14.1 and I don't expect a different image to be selected though20:18
thumperhmm...20:18
thumpersinzui: the metadata seems to be in a different format than juju is expecting20:21
thumpersinzui: it looks to me that we are looking for: "com.ubuntu.cloud:server:12.04:amd64" but it has "com.ubuntu.cloud.daily:server:12.04:amd64"20:21
thumpernotice the .daily20:21
fwereadesinzui, yeah, that ".daily" would seem to be the problem20:22
thumpersmoser: ping20:23
sinzuiI can change image-stream: maybe I think "release" doesn't get the extra daily20:23
fwereadesinzui, but then that url does itself specify daily20:23
smoserthumper, here.20:24
thumpersmoser: hey20:24
thumpersmoser: can you read the scrollback to the pastebin?20:24
thumpersmoser: we are having trouble with juju and azure20:24
thumpersmoser: and it seems we can't find the simple streams image data20:25
thumpersmoser: wondering if the format change, or could just be our code20:25
thumpersmoser: so really just checking on expectations right now20:25
thumperfwereade, sinzui: http://cloud-images.ubuntu.com/daily/streams/v1/com.ubuntu.cloud:daily:azure.sjson is referenced from the azure group20:26
fwereadethumper, huh, our code is looking a bit problematic itself actually... azure code does weird things with daily vs ""20:27
smoserthumper, http://pastebin.ubuntu.com/6177102/20:27
thumpersmoser: yeah20:27
fwereadethumper, provider/azure/environ.go:93220:27
fwereadethumper, provider/azure/environ.go:90420:27
fwereadethumper, something does not add up20:28
thumperfwereade: hmm...20:28
thumperfwereade: getImageStream is never called, except in a test20:29
fwereadethumper, ah good... I suppose20:29
thumperheh20:29
thumpersmoser: I guess the real thing we want to check is that we should be looking for  "com.ubuntu.cloud.daily:server:12.04:amd64" not "com.ubuntu.cloud:server:12.04:amd64" in the index20:30
thumpersmoser: is that right?20:30
smosernice.20:30
smoserno.20:30
thumperfwereade: can you see where we build that string?20:30
smoseryou should no longer be looking for daily.20:30
smoserthat should work. but you shouldn't be looking for it.20:30
thumpersmoser: so... what should we be doing?20:31
smoserthere are released images on azure now.20:31
smoseri said this on an email thread.20:31
thumperfwereade: this might be it...20:31
* thumper sighs20:31
smoserhm... odd.20:31
thumpersmoser, fwereade: perhaps this was dropped on the floor?20:32
fwereadethumper, smoser: very possibly :(20:33
thumpersmoser: what's odd?20:33
fwereadethumper, smoser: the azure provider does seem to get the stream from the config20:33
smoserodd that you committed to juju the '.daily'20:33
smoserhttp://paste.ubuntu.com/6177171/20:34
smoseranyway, that is what you want.20:34
fwereadesinzui, does your environ config specify image-stream by any chance?20:34
sinzuiNo20:34
sinzuiI was pondering using it to force something other than daily20:34
thumperfwereade: line 90820:34
thumperfwereade: builds the image source list with the local storage, + the default (which is /daily)20:35
thumpersinzui: can I get you to try something?20:35
fwereadethumper, yeah, /releases is missing20:35
thumpersinzui: azure/environ.go line 90420:35
thumperchange daily to releases20:35
thumpersinzui: and try then20:35
smoserare you possibly looking in the released stream for daily products ?20:35
smoseras you won't find them.20:36
thumpersmoser: no, I don't think so20:36
thumperwe are looking in the daily for released20:36
thumperAFAICT20:36
fwereadethumper, agreed20:36
fwereadethumper, given that we have image-stream configurable, we really ought to look in both, I think20:37
fwereadethumper, ah, but can we?20:38
sinzui"image-stream: releases" yields this error:  no OS images found for location "East US", series "precise", architectures ["amd64" "i386"]20:38
smoserhm..20:38
smoseri think on canonistack we're actually combining both streams20:38
fwereadesinzui, is there a line just above with some product names?20:39
smoserin which case if you were looking for either .daily or released, you'd find both.20:39
thumperfwereade: where do you see the image stream be configurable?20:40
fwereadethumper, azure/config.go:7120:40
fwereadethumper, ha, :12720:41
sinzuifwereade, this looks identical to the first pastebin except for times: http://pastebin.ubuntu.com/6177196/20:42
thumperhmm20:42
thumpersinzui: haha, it is appending .releases20:42
thumperwhere I don't think it should20:43
thumpersinzui: what did you change exactly?20:43
smoserit should not.20:43
smoseryeah.20:43
sinzuithumper, I added20:43
sinzuiimage-stream: releases20:43
smoserproduct names are com.ubuntu.cloud:server:12.04:amd64 and com.ubuntu.cloud.daily:server:12.04:amd6420:43
sinzuithumper, juju init added this20:44
sinzui#image-stream: daily20:44
thumpersinzui: but you made it "release" ?20:44
sinzuithumper, I uncommented the line and made the value "releases"20:45
thumpersinzui: ok, comment that out again20:45
thumperfwereade: did you find the config line for the source url?20:45
=== gary_poster is now known as gary_poster|away
thumpersinzui: add this:20:47
thumperimage-metadata-url: http://cloud-images.ubuntu.com/releases20:47
=== gary_poster|away is now known as gary_poster
thumperfwereade: hmm environs/imagemetadata/simplestreams.go:2420:49
fwereadethumper, hmm... but we never seem to check it20:50
sinzuiOh, this is taking much longer20:50
thumperfwereade: environs/imagemetadata/urls.go:39:20:50
sinzuiBoom. Up comes a state-server20:50
thumperso the question is: why isn't it using the default that is there...20:50
thumperI think I know20:50
fwereadethumper, ohhhh... how do we handle falling back from one source to another? is it giving up before looking at the right one?20:51
thumpercorrect20:51
thumperI think20:51
fwereadethumper, ah bollocks20:51
thumperso azure says: try this daily one20:51
thumperso it goes:20:51
thumperconfig if set20:51
thumperenviron if provided20:51
thumperthen default (which is correct)20:51
thumperbut we seem to be giving up before we get to it20:51
thumperI feel that this is incorrect error handling20:52
thumperany error is bad20:52
thumperwe should have a specific error that we can check for to keep iterating through20:52
fwereadethumper, I think I remember ian explaining that it was a behaviour-preservation thing... the original tools stuff would only fall back in the case of *no* tools20:52
thumperthis isn't tools20:53
thumperthis is images20:53
fwereadethumper, indeed so, but they share underlying mechanisms20:53
thumperhmm...20:53
thumperthat's a little poked20:53
fwereadethumper, yep, I completely overlooked it at the time, I was thinking purely about tools20:53
thumperso...20:53
thumperwhere to from here?20:54
thumperfwereade: can we split the lookup behaviour?20:54
thumperadd a policy to the method?20:54
fwereadethumper, policy feels cleanest at first sight, doesn't it20:55
* thumper nods20:55
fwereadethumper, there we have it documented: environs/imagemetadata/simplestreams.go:6820:57
fwereadethumper, "the first location which has a file is the one used"20:57
thumperhmm...20:59
thumpersinzui: does this mean you can upgrade your azure env?21:00
sinzuiI am still waiting for the first juju status to complete21:00
sinzuithumper, I can try, but I have not been able to upgrade hp or aws21:01
thumpersinzui: omg slow21:01
thumperhmm, what is the hp upgrade problem?21:01
thumpersinzui: so amazon does upgrade?21:01
sinzuiazure is being routed through Somalia.21:01
sinzuithumper, no.21:02
sinzuiI have not been able to upgrade any env21:02
fwereadethumper, environs/simplestreams/simplestreams.go:444?21:02
sinzuibut since aws and hp are fast I can try again21:02
thumperfwereade: seems spurious21:04
thumpersinzui: have you tried since the bucket tools were made available?21:04
sinzuiyes.21:04
fwereadethumper, there's the core of something sane though -- if you're looking for product X and you find an index for product Y you should certainly move on to the next one21:13
fwereadethumper, if you find product X, but no examples of it that match what you're looking for, you should probably not21:13
fwereadethumper, or at least it's arguable, I think21:13
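The distinction fwereade draws here can be written down as a small policy function. The sketch below is hypothetical and much simpler than real simplestreams metadata: `Index`, `Outcome`, `Classify`, and `ShouldTryNext` are invented names, and the index is assumed to be a plain map from product name to the item versions it offers.

```go
package main

// Index is a hypothetical, simplified view of one source's metadata:
// product name mapped to the item versions available for it.
type Index map[string][]string

// Outcome describes what a single source said about a request.
type Outcome int

const (
	// ProductAbsent: the index does not list the product at all.
	ProductAbsent Outcome = iota
	// NoMatchingItems: the product is listed, but no item matches.
	NoMatchingItems
	// Matched: the product is listed and a matching item exists.
	Matched
)

// Classify reports how the index relates to the requested product and version.
func Classify(idx Index, product, version string) Outcome {
	items, ok := idx[product]
	if !ok {
		return ProductAbsent
	}
	for _, v := range items {
		if v == version {
			return Matched
		}
	}
	return NoMatchingItems
}

// ShouldTryNext encodes the policy suggested in the discussion: fall
// through to the next source only when the product is absent, not when
// the product exists but has no matching items.
func ShouldTryNext(o Outcome) bool {
	return o == ProductAbsent
}
```

Whether `NoMatchingItems` should also fall through is exactly the arguable part, so a real implementation might take the policy as a parameter instead of hard-coding it.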
thumperfwereade: yeah... otp with mramm now21:15
sinzuiI have still not gotten a status back from azure. This is more than 30 minutes without feedback21:18
hazmatsinzui, it takes a while but rarely more than 15m21:21
hazmatinterestingly bootstrap/destroy-env are basically synchronous there21:21
sinzuiyeah. I did three bootstraps of azure on 1.14.1 today21:21
hazmatfor valid reasons21:21
hazmatsinzui, is the instance up from the azure console?21:22
hazmatsinzui, are we tagging release in bzr?21:22
sinzuiYes, I see an instance21:23
sinzuihazmat, My first two status calls ended with "no reachable servers"21:24
sinzuiThe third is in progress21:24
hazmatsinzui, no reachable servers means no response on the api to get the ip21:24
hazmatfrom the object storage instance id21:24
* hazmat tries on azure21:26
sinzuithumper, I have a positive response from aws. It looks like it accepted the upgrade (http://pastebin.ubuntu.com/6177342/). But 10 minutes later I still see the agent versions are 1.14.121:30
thumpersinzui: would be interesting to get the entire log file back for analysis21:31
thumpersinzui: the -all.log from the bootstrap node?21:31
thumpersinzui: can you scp or pastebinit?21:31
* sinzui visits21:32
thumpergary_poster: hey21:32
thumpergary_poster: I'm now on saucy and having no issues with the local provider21:32
thumpergary_poster: I'm wondering what is different on your machine21:32
thumpergary_poster: can you think of any "non default" things you may have?21:33
* thumper afk for a bit21:36
hazmatthumper, gary_poster the issues gary mentioned sound like a cgroups issue not a juju one22:11
hazmatlike the cgroup mount space isn't correct.. normally we're using cgroups lite afaicr22:12
thumperhazmat: hmm... how does that get changed?22:12
hazmatgary_poster, you around?22:12
sinzuithumper, I reuploaded the aws tools using s3up, now I cannot see any public tools. I think it made things worse22:12
hazmatthumper, you need the cgroup-lite package to get the cgroup  sysfs to get automounted via upstart.. its been a while but i remember one time.22:12
* sinzui redeploys with s3cmd22:13
hazmatsinzui, i changed the bucket policy to ignore whatever the client said.. the bucket is always public22:13
thumperhazmat: however the clients inside aws can't see it22:13
hazmatsinzui, since this has happened multiple times..  re private tools22:13
hazmathuh22:13
* hazmat checks22:13
thumperyeah22:13
sinzuihazmat, I thought you did, but thumper reports others say they are still private22:13
thumperhazmat: sinzui was trying an upgrade, and the clients listed all the tools, but no 1.15 ones22:14
hazmatthumper, every link in jam's email worked for me22:14
thumperhazmat: perhaps it is how they are being listed?22:14
hazmatie.. http://juju-dist.s3.amazonaws.com/tools/releases/juju-1.15.0-precise-amd64.tgz22:14
thumperthe underlying api?22:14
thumperperhaps has nothing to do with the bucket settings22:14
hazmatthumper, the listing is http://juju-dist.s3.amazonaws.com/22:14
thumperthe goamz bit22:14
hazmatand it works fine too22:14
hazmatnot related, i overrode the default bucket policy which had keys default to private.. and switched it to public22:15
thumperah...22:15
thumperI know22:15
hazmatbecause this is like the second time in a week this issue has occurred.22:15
thumpersinzui: the tools need to be in /tools as well for the 1.15 / 1.16 releases22:15
thumperthey need to be in two places22:15
thumperotherwise 1.14 can't find them22:15
thumperthe new code puts them in tools/releases22:16
thumperthe old code is just looking in /tools22:16
hazmatgotcha22:16
hazmatso they need to be in both places for backwards compatibility.22:16
thumperack22:16
hazmatsinzui, which scripts are you using .. the ones in lp:juju-core/scripts22:16
sinzuino, it be broken, it cannot find the series, version, or tarball name22:17
hazmatthe key on tools/streams/v1/com.ubuntu.juju:released:tools.json appears to have whitespace around it.22:17
hazmatnot sure if that's real or just formatting oddity22:17
hazmatmight just be formatting22:18
sinzuihazmat, I used this, but have run each upload by itself. http://pastebin.ubuntu.com/6177496/22:18
hazmatugh22:18
sinzuiIt began as a fix to the script. I shouldn't be trusted to download each deb and then work out how to deploy to other clouds22:19
thumpersinzui: you need to add: s3cmd put --acl-public ${DEST_DIST}/tools/releases/*.tgz s3://juju-dist/tools/22:23
thumpersinzui: for at least the 1.15 and 1.16 versions22:24
thumpersinzui: once everything is on 1.16, we don't need the legacy location22:24
thumpersinzui: but otherwise the 1.14 machines can't see the new 1.15(16) tools22:24
sinzuithumper, let me repeat that ...22:25
thumpersinzui: did you want a hangout22:25
thumper?22:25
hazmatsinzui, that should get cleaned up and added to trunk branch..22:26
sinzuithumper, using v1.15.0 client, I cannot complete an upgrade to 1.15.0 server because the client didn't tell the server where the 1.15.0 servers are?22:26
hazmatsinzui, the server looks up tools for an upgrade22:26
thumpersinzui: correct22:27
sinzui2 minutes22:27
thumpersinzui: the client looks, and stores the new version in state22:27
thumperthe agents go "oh, new version" and go looking22:27
thumperusing the 1.14 codebase22:27
thumperwhich says "look in /tools"22:27
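The two-location requirement explained above (new code reads tools/releases, the 1.14 agents still read tools/) amounts to publishing every tarball under two keys. A minimal sketch of that mapping, with `UploadKeys` as a hypothetical helper and the key layout assumed from the URLs quoted in this conversation:

```go
package main

import "path"

// UploadKeys returns the bucket keys a tools tarball should be published
// under. The tools/releases/ key is where the newer code looks; the
// tools/ key is the legacy location that 1.14 agents still search.
// The legacy key is included only while older agents need it.
func UploadKeys(tarball string, includeLegacy bool) []string {
	name := path.Base(tarball)
	keys := []string{path.Join("tools", "releases", name)}
	if includeLegacy {
		keys = append(keys, path.Join("tools", name))
	}
	return keys
}
```

For juju-1.15.0-precise-amd64.tgz with the legacy flag set, this yields tools/releases/juju-1.15.0-precise-amd64.tgz and tools/juju-1.15.0-precise-amd64.tgz, matching the two S3 URLs seen in the log.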
sinzuik$ s3cmd info s3://juju-dist/tools/juju-1.15.0-precise-amd64.tgz s3://juju-dist/tools/juju-1.15.0-precise-amd64.tgz (object):22:30
sinzui   File size: 329168522:30
sinzui   Last mod:  Mon, 30 Sep 2013 22:27:33 GMT22:30
sinzui   MIME type: application/x-gzip; charset=binary22:30
sinzui   MD5 sum:   10e6466f113e751fa66461d755c0149d22:30
sinzui   ACL:       gustavoniemeyer: FULL_CONTROL22:30
sinzui   ACL:       *anon*: READ22:30
sinzui   URL:       http://juju-dist.s3.amazonaws.com/tools/juju-1.15.0-precise-amd64.tgz22:30
sinzuiThey are there now22:31
thumpersinzui: the agents should be able to upgrade now22:31
sinzuithumper, and it did without me asking any more from it22:33
thumper\o/22:33
sinzuiaws upgrades. I will replay this on hp (after I upload tools to tools/)22:34
thumperso we need to do similar things for the other tools locations22:34
gary_posterhazmat, thumper here for a sec.  anything I can check?22:34
sinzuiyep22:34
thumperyay22:34
thumpernow I can go to the gym not feeling guilty about leaving a huge mess for a while22:34
* gary_poster runs away again22:36
sinzuithumper, v1.15.0's sync-tools does not copy to tools/. I need to find another way to put the tgzs there22:43
thumperhmm...22:44
thumperwe need something half manual22:44
hazmatgary_poster, yes.. can you verify you have cgroup-lite installed22:51
hazmatsinzui, the acl doesn't matter anymore22:51
hazmatsinzui, the bucket policy will override any acl22:51
hazmatyou can upload private and its still publicly available22:51
hazmatits been too accident prone relying on a variety of clients and lack of automation.22:52
sinzuihazmat, rock. I will stop panicking . Thank you!22:52
sinzuilooks like the hp upgrade is accepted. I will wait for a few minutes22:55
sinzuihp isn't upgraded yet23:02
sinzuiI have uploaded the tools to azure's tools/ and the listing looks correct23:05
sinzuihazmat, thumper does juju support leaping upgrades of stable? eg 1.12 to 1.16?23:06
* sinzui hopes no23:06
hazmatsinzui, at the moment.. yes.. you can even downgrade23:07
hazmatsinzui, although..23:08
hazmatsinzui, in this context it's just going to look for the latest it can find in the location it knows about23:08
sinzuihazmat, so aren't we committed to putting tgz files in /tools until we break compatibility (2.0)23:09
hazmatmaybe.. given that we have to coordinate two versions (client, server) and two tool locations..23:09
