/srv/irclogs.ubuntu.com/2013/11/05/#juju-dev.txt

=== ehw is now known as Guest9026
axwwallyworld: in case you're wondering why I didn't land the simplestreams changes yet, I found a bunch of tests I forgot to update :)01:50
wallyworldah np :-)01:51
* thumper -> school run01:59
thumperwallyworld: please don't remove the names when done02:51
wallyworldthumper: ok, i saw some names had already been removed for things besides mine so i thought i'd tidy it up02:52
wallyworldlooks neater :-)02:52
* thumper fixes02:52
thumperit isn't how its doen02:52
wallyworlddo we care about the names?02:52
thumperdone02:52
wallyworldonce done02:52
axwjam: I had it as two tests, but changed it because I keep getting told to ;)   I agree - will change it back to two04:48
jamaxw: having worked through the Uniter tests, I find them very hard to debug when things go wrong04:48
jambecause the Nth item in the test is failing04:48
jamand the log is 1000 lines long04:49
axwyeah, I find this problematic too04:49
jamwallyworld: poke05:05
jamfor https://code.launchpad.net/~jameinel/juju-core/faster-passwords/+merge/193667 I realized that if an agent logs in with the "slow" hash, we can just rewrite it there to the fast one05:06
jam(or if we want something with salt, etc, etc)05:07
=== gary_poster is now known as gary_poster|away
wallyworldjam: hi05:33
jamhey wallyworld05:34
wallyworldyou want me to look at that review again?05:34
wallyworldjam: so did you need me to do anything re: the above poke?05:41
jamwallyworld: Sorry, I haven't finished responding to all the feedback, but I wanted to ask if a change seemed reasonable.05:42
jamNamely05:42
jamwhen running entity.PasswordValid05:42
jamif we see that the PasswordHash in the DB is the old form05:42
jamjust rewrite the DB to the new form05:42
jamwe can trivially compute the hash05:42
jambecause the agents always just pass in the full password05:42
wallyworldright, i thought you were looking to do that. seems reasonable to me, since cost is trivial and it will incrementally upgrade the db05:43
jamwallyworld: what about "salt" for UserPasswords05:44
jamsounds like fwereade would like to see that added05:44
jamdoesn't seem hard, though it means adding another field to the DB05:44
wallyworldthat makes sense to me too - o'd personally feel better with it05:44
wallyworldi'd05:44
jamwallyworld: is it worth doing for agent passwords?05:44
wallyworldthat's a harder question05:45
jamalso, is it worth doing something like a len(password) >= 18 for the fast version?05:45
jamwallyworld: so "worth it" is just in the "so the code paths are similar"05:45
jamI'm 99% sure it isn't worth it from an actual increased security05:45
wallyworldi'd like that since we're relying on long enough password = hard enough to brute force05:45
jam(len(password) maybe)05:45
wallyworlddoes the salt for agent passwords add any tangible benefit?05:46
wallyworldcf the extra complexity05:46
wallyworldi guess if the code is the same anyways....05:47
jamwallyworld: no05:47
jamsalt is a "prevent someone from precomputing 1B password hashes"05:47
jamof known user-likely passwords05:47
jamthe whole point is we don't have known user-passwords for agents05:47
wallyworldyeah05:47
wallyworldso i wouldn't do it05:47
wallyworldwe could still use same code05:48
wallyworldie look up salt and use it if there05:48
wallyworldi think mongo returns empty for non existent fields05:48
jamwallyworld: I'm sure we can tell and be compatible05:49
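Editor's sketch of the salt handling discussed above: a per-user random salt, with a fallback to a fixed salt when the stored record has no salt field. Everything here is an assumption for illustration, not juju-core code: the `userDoc` type, the `legacySalt` value, and the iterated SHA-512 `stretch` function (a stand-in for whatever key-stretching hash is really used).

```go
package main

import (
	"crypto/rand"
	"crypto/sha512"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// legacySalt is a made-up stand-in for the fixed salt used by old records.
const legacySalt = "legacy-fixed-salt"

// userDoc is a hypothetical stored record. PasswordSalt is empty for
// records written before salting existed.
type userDoc struct {
	PasswordHash string
	PasswordSalt string
}

// stretch is a stand-in for a real key-derivation function: iterated SHA-512.
func stretch(password, salt string) string {
	sum := sha512.Sum512([]byte(salt + password))
	for i := 0; i < 4096; i++ {
		sum = sha512.Sum512(sum[:])
	}
	return base64.StdEncoding.EncodeToString(sum[:])
}

// setPassword stores a hash computed under a fresh random salt.
func setPassword(doc *userDoc, password string) error {
	raw := make([]byte, 12)
	if _, err := rand.Read(raw); err != nil {
		return err
	}
	doc.PasswordSalt = base64.StdEncoding.EncodeToString(raw)
	doc.PasswordHash = stretch(password, doc.PasswordSalt)
	return nil
}

// checkPassword uses the stored salt if there is one, else the legacy salt.
func checkPassword(doc *userDoc, password string) bool {
	salt := doc.PasswordSalt
	if salt == "" {
		salt = legacySalt
	}
	got := stretch(password, salt)
	return subtle.ConstantTimeCompare([]byte(got), []byte(doc.PasswordHash)) == 1
}

func main() {
	old := &userDoc{PasswordHash: stretch("hunter2", legacySalt)}
	fmt.Println("legacy record accepted:", checkPassword(old, "hunter2"))

	fresh := &userDoc{}
	if err := setPassword(fresh, "hunter2"); err != nil {
		panic(err)
	}
	fmt.Println("salted record accepted:", checkPassword(fresh, "hunter2"))
	fmt.Println("hashes differ:", fresh.PasswordHash != old.PasswordHash)
}
```

Because an absent salt simply selects the fixed one, old and new records can share one code path, which is the compatibility property the discussion is after.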
wallyworldso then, add salt for user passwords, check length of agent passwords, rewrite out of date hashes05:50
wallyworlddo we pass password over the wire in plain text?05:50
wallyworldi guess we do?05:50
jamwallyworld: yes. though we have a TLS connection by that point05:51
wallyworldok05:51
jamwallyworld: I know fwereade also talked about using CA signed client certs for agents05:51
jambecause that also helps in the case of "recovery" mod.05:51
jammode05:51
wallyworldok05:52
jambut you'll still want *some* sort of user identity token/password/thingy05:52
wallyworldyeah05:52
jambecause you don't want machine-1 agent pretending to be machine-005:52
jamsince machine-0 gets all the passwords05:52
wallyworldyep :-)05:52
wallyworldor machine-N05:52
jamright05:52
wallyworldwhere N is a HA state server05:53
jamwallyworld: yeah, I think recovery will need some thought about security05:53
wallyworldindeed05:53
jamone can argue the attack surface is minimized by requiring a user to engage the mode05:53
jamand it could even require Admin registration sort of thing05:53
jamI don't know how much we want to automate all of recovery05:54
jamso EOUTOFSCOPE for now :)05:54
wallyworldyep05:54
wallyworldthere's lots of prior art for this sort of thing too i think05:54
wallyworldlet's not reinvent the wheel05:54
jamwallyworld: oh, the other bit about requiring min length of Agent passwords06:02
jamis that it is going to disrupt the test suite a lot06:02
jambecause we have tests that set the password for machine to "test-password"06:02
jamwhich is a lot less than the 24 bytes we normally have06:02
wallyworlds/test-password/test-password1234567890 :-)06:02
jamwallyworld: well that, and it *might* bite us in backwards compatibility mode06:03
jamsince we can't change the actual password06:03
jamwe can change what we *store*06:03
jambut we don't have a "that Login is valid, but you need to create a new Password now"06:03
wallyworldtrue06:04
jamutils.RandomPassword has generated 24-byte passwords for a long time now, though06:04
wallyworldi reckon it's worth trying06:04
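Editor's sketch of the agent-password plan discussed above: accept the old slow hash, rewrite the stored value to a fast single-round hash on a successful login, and refuse the fast hash for short passwords. All names are hypothetical (`agentDoc`, `compatHash`, `fastAgentHash`, `compatSalt`), the threshold of 18 is the figure floated in the conversation, and iterated SHA-512 stands in for the real slow hash.

```go
package main

import (
	"crypto/sha512"
	"crypto/subtle"
	"encoding/base64"
	"errors"
	"fmt"
)

// minAgentPasswordLen is an assumed threshold taken from the discussion.
const minAgentPasswordLen = 18

// compatSalt is a made-up stand-in for the old hash's fixed salt.
const compatSalt = "fixed-compat-salt"

// agentDoc is a hypothetical stored record for an agent.
type agentDoc struct{ PasswordHash string }

// compatHash is a stand-in for the old, deliberately slow hash.
func compatHash(password string) string {
	sum := sha512.Sum512([]byte(compatSalt + password))
	for i := 0; i < 8192; i++ {
		sum = sha512.Sum512(sum[:])
	}
	return base64.StdEncoding.EncodeToString(sum[:18])
}

// fastAgentHash is one round of SHA-512, acceptable only because agent
// passwords are long and random; short passwords are rejected.
func fastAgentHash(password string) (string, error) {
	if len(password) < minAgentPasswordLen {
		return "", errors.New("password too short for the fast hash")
	}
	sum := sha512.Sum512([]byte(password))
	return base64.StdEncoding.EncodeToString(sum[:18]), nil
}

func equal(a, b string) bool {
	return subtle.ConstantTimeCompare([]byte(a), []byte(b)) == 1
}

// passwordValid checks the password and upgrades an old-form hash in place.
func passwordValid(doc *agentDoc, password string) bool {
	fast, fastErr := fastAgentHash(password)
	if fastErr == nil && equal(fast, doc.PasswordHash) {
		return true
	}
	if !equal(compatHash(password), doc.PasswordHash) {
		return false
	}
	// The old form matched. Rewrite to the new form when the password is
	// long enough; otherwise leave the record alone so the login still works.
	if fastErr == nil {
		doc.PasswordHash = fast
	}
	return true
}

func main() {
	long := "0123456789abcdefghijklmn" // 24 characters
	doc := &agentDoc{PasswordHash: compatHash(long)}
	fmt.Println("valid:", passwordValid(doc, long))
	fmt.Println("still old form:", doc.PasswordHash == compatHash(long))

	short := &agentDoc{PasswordHash: compatHash("test-password")}
	fmt.Println("short valid:", passwordValid(short, "test-password"))
	fmt.Println("short still old form:", short.PasswordHash == compatHash("test-password"))
}
```

In this sketch a short password such as "test-password" keeps working through the old hash but is never upgraded, which mirrors the backward-compatibility concern: the stored form can change, the password itself cannot.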
=== Guest9026 is now known as ehw
=== ehw is now known as Guest25962
=== Guest25962 is now known as ehqw
=== ehqw is now known as ehw
rogpeppemornin' all08:47
rogpeppefwereade: ping08:51
axwmorning rogpeppe08:51
rogpeppeaxw: yo!08:51
axwrogpeppe: should everything prefer to use state.Machine.Addresses() rather than go to environs.Environ.WaitDNSName()?08:55
axwI'm updating juju ssh to use the API; it uses WaitDNSName currently08:55
rogpeppeaxw: yes, it should definitely use state.Machine.Addresses08:55
axwok08:55
rogpeppeaxw: or Unit.Address(es?) when appropriate08:56
axwyup08:56
axwcool08:56
axwrogpeppe: main problem now is that NewAPIConn doesn't pass secrets... I guess I'll do that now08:56
rogpeppeaxw: ah yes, that definitely needs to happen08:57
rogpeppeaxw: you mean, it doesn't push secrets if it's the first connection, right?08:57
axwrogpeppe: yup08:57
axwthere's a TODO08:57
rogpeppeaxw: ha, the TODO just above it is very stale...08:59
rogpeppeaxw: i'm just wondering if there's a nicer way to do secret pushing that doesn't incur an extra round trip08:59
axwrogpeppe: the API server could return an error that says "I haven't got my secrets yet"? and then we push and retry?09:00
rogpeppeaxw: something a little like that, yes09:00
TheMuemorning09:01
axwmorning TheMue09:01
* TheMue fights with a mail backlog of one week :)09:01
rogpeppeTheMue: hiya09:04
TheMuehiya axw and rogpeppe09:04
rogpeppeaxw: one alternative i'm considering is that the Login response, rather than failing, returns a "lacking secrets" status09:05
rogpeppeaxw: that can be cached locally in the api.State and queried.09:06
axwrogpeppe: ah ok. I don't know the internals well to know how separated they are...09:06
axwsounds sensible09:06
axwso instead of a GetEnvironment/SetEnvironment, it'd just push secrets during login09:07
rogpeppeaxw: yes09:07
rogpeppeaxw: so we might add an environ config argument to api.Open09:08
rogpeppeaxw: which is allowed to be nil, but if it is, then the connection will fail if it's the first API connection09:08
rogpeppeaxw: in that case in fact it might work well to have Login fail09:09
rogpeppeaxw: hmm, not sure though09:10
rogpeppeaxw: depends whether we want the login message to contain the environ config09:10
axwrogpeppe: if it's just the first connection, why not just have the login proceed, and then have the server request the secrets (via a special error)?09:12
axwthen a second message09:12
axwit's only once09:12
rogpeppeaxw: so the error implies "login has actually succeeded (despite the error) but secrets are needed" ?09:13
axwrogpeppe: yeah. perhaps confusing, but that's one option anyway :)09:14
rogpeppeaxw: i definitely don't mind a second message to push the secrets09:14
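Editor's sketch of the two-message flow being discussed: Login succeeds but reports that the server lacks secrets, and the client then pushes them in a second call. The `apiConn` interface, `loginResult` type, and `fakeServer` are invented for illustration and are not the juju API.

```go
package main

import (
	"errors"
	"fmt"
)

// loginResult is a hypothetical Login response carrying a secrets status.
type loginResult struct {
	SecretsNeeded bool
}

// apiConn is a hypothetical client-side view of the API connection.
type apiConn interface {
	Login(user, password string) (loginResult, error)
	PushSecrets(secrets map[string]string) error
}

// connect logs in and, only if the server asks for them, pushes secrets.
func connect(conn apiConn, user, password string, secrets map[string]string) error {
	res, err := conn.Login(user, password)
	if err != nil {
		return err
	}
	if !res.SecretsNeeded {
		return nil
	}
	if len(secrets) == 0 {
		return errors.New("server has no secrets yet and none were supplied")
	}
	return conn.PushSecrets(secrets)
}

// fakeServer stands in for an API server that starts without secrets.
type fakeServer struct {
	secrets map[string]string
}

func (s *fakeServer) Login(user, password string) (loginResult, error) {
	return loginResult{SecretsNeeded: s.secrets == nil}, nil
}

func (s *fakeServer) PushSecrets(secrets map[string]string) error {
	s.secrets = secrets
	return nil
}

func main() {
	srv := &fakeServer{}
	err := connect(srv, "admin", "pw", map[string]string{"secret-key": "s3cr3t"})
	fmt.Println("first connection error:", err, "secrets stored:", srv.secrets != nil)
	err = connect(srv, "admin", "pw", nil)
	fmt.Println("later connection error:", err)
}
```

The extra round trip happens only on the first connection; later clients may pass no secrets at all.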
rogpeppeaxw: one question that arises from this: is there ever a case where we want to allow some request to the API server *without* pushing the admin secrets?09:18
rogpeppeaxw: because if we push environ config with Login, that will be ruled out09:18
rogpeppeaxw: but that might well be a good thing09:18
rogpeppeaxw: because then there's no way that any client can do anything at all with an environment with no secrets09:19
axwhmm09:20
rogpeppeaxw: the other thing that i'm thinking about is how does the server know when secrets have been pushed09:20
rogpeppeaxw: the most straightforward approach is simply to get the environment config and see if there are any secrets in it09:23
rogpeppeaxw: but perhaps there might be an environment that has no secrets09:23
rogpeppeaxw: ha, that's actually not a problem, i realise09:23
axwrogpeppe: I thought the idea was an environ's config must be invalid if it doesn't have its secrets09:23
rogpeppeaxw: yeah, it is09:23
rogpeppeaxw: but we don't even need to create the Environ09:23
axwrogpeppe: as for allowing no secrets... sounds preferable to require them always, but I don't know if there's a case or not09:23
rogpeppeaxw: because we've got EnvironProvider.SecretAttrs09:23
rogpeppeaxw: so if that returns nothing, we know that we don't require secrets to be pushed09:23
axwah yeah, I see. then we can distinguish an invalid env from one with no secrets09:23
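Editor's sketch of the check described above: given the attribute names a provider treats as secret, decide whether the stored config still needs secrets pushed. The function name, the plain string map, and the attribute names are simplifications invented here; the real `EnvironProvider.SecretAttrs` signature is not reproduced.

```go
package main

import "fmt"

// missingSecrets returns the secret attribute names that have no value in
// cfg. An empty secretNames list means the provider has no secrets, so
// nothing is ever missing.
func missingSecrets(cfg map[string]string, secretNames []string) []string {
	var missing []string
	for _, name := range secretNames {
		if cfg[name] == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	cfg := map[string]string{"name": "test-env", "access-key": "AK123"}

	// A provider with two secret attributes, one of which is absent.
	fmt.Println(missingSecrets(cfg, []string{"access-key", "secret-key"}))

	// A provider with no secret attributes: no push is required.
	fmt.Println(len(missingSecrets(cfg, nil)) == 0)
}
```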
rogpeppeaxw: yeah09:24
rogpeppeaxw: although...09:24
rogpeppeaxw: perhaps it might be a good plan to actually validate the environ09:24
rogpeppeaxw: something we can't do currently09:24
axwcan't?09:25
axwrogpeppe: why can't we?09:25
rogpeppeaxw: because clients talk directly to mongo09:26
rogpeppeaxw: so there's nothing stopping a dodgy client pushing a bad environ config09:26
axwah ok, I see09:27
axwthere's no good time to do it currently09:27
rogpeppeaxw: the other thing that occurs to me is that we could cache "secrets pushed" in the apiserver.Server (because it can only go from false to true), meaning that any api server would only need to check once09:28
rogpeppeaxw: but that's just an optimisation (but one that's not possible if the secrets checking is done client-side)09:28
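Editor's sketch of the caching idea above: since "secrets pushed" can only go from false to true, a server can stop consulting the store once it has seen true. The `secretsCache` type and its `check` callback are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// secretsCache remembers a positive answer forever and re-checks otherwise.
type secretsCache struct {
	mu     sync.Mutex
	pushed bool
	check  func() (bool, error) // hypothetical lookup against the stored config
}

// Pushed reports whether secrets have been pushed, consulting the store
// only while the answer is still false.
func (c *secretsCache) Pushed() (bool, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.pushed {
		return true, nil
	}
	ok, err := c.check()
	if err != nil {
		return false, err
	}
	c.pushed = ok
	return ok, nil
}

func main() {
	calls := 0
	stored := false
	cache := &secretsCache{check: func() (bool, error) {
		calls++
		return stored, nil
	}}

	first, _ := cache.Pushed() // consults the store
	stored = true
	second, _ := cache.Pushed() // consults the store and caches the result
	third, _ := cache.Pushed()  // answered from the cache
	fmt.Println(first, second, third, "store lookups:", calls)
}
```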
axwrogpeppe: is all of this going to break the GUI horribly?09:30
rogpeppeaxw: i don't think so09:30
rogpeppeaxw: because AFAIK the GUI can't currently make the first connection anyway09:30
rogpeppeaxw: and the Login call can be changed in a totally backwardly compatible way09:30
axwcool09:31
rogpeppeaxw: so to summarise, how does this sound? http://paste.ubuntu.com/6363731/09:38
axwrogpeppe: sounds great.09:39
axwrogpeppe: are you planning to do this yourself, or are you working on other things?09:41
rogpeppeaxw: i'm currently oriented more towards the HA stuff - if you feel like doing this, it would be great.09:41
axwsure, I will look into it (probably in the morning)09:42
jamaxw: just to let you know, I'm currently poking a lot of stuff underneath login (PasswordHash) stuff09:42
jamit probably won't conflict, but you might want to wait a sec on it09:42
axwjam: ok no worries09:42
rogpeppejam, fwereade: how does the above plan look to you?09:42
axwthanks09:42
jamrogpeppe: I *think* it is all unnecessary. thumper was quite keen on changing "juju bootstrap" to wait until it can connect to the API server09:43
jamin which case09:43
jambootstrap does all the work09:43
jamand then we don't have to do it for every API connection.09:43
rogpeppejam: i don't think that's viable09:43
jamrogpeppe: because ?09:43
rogpeppejam: what happens if someone interrupts "juju bootstrap" ?09:44
jamrogpeppe: they have to start over09:44
jamthey don't have more than 1 machine at that point09:44
jamso we aren't destroying an environment that is well set up anyway09:44
jamOr we allow "juju bootstrap" to start where it left off09:44
fwereaderogpeppe, jam: the plan was to catch interrupts and take the machine down if it's interrupted09:45
rogpeppefwereade: i'm not sure that's great actually - what if the network is down? does that mean you can't interrupt bootstrap?09:46
fwereaderogpeppe, jam: blocking bootstrap actually has a lot of advantages -- no silly secrets dance, ability to create storage in the environment instead of the provider09:46
fwereaderogpeppe, it just fails09:47
jamfwereade: I'm a big fan, plus the fact you can give the user feedback about how far it gets09:47
fwereaderogpeppe, don't think it's any worse than the network going down during a normal bootstrap09:47
fwereadejam, rogpeppe: indeed, useful feedback during bootstrap is also awesome09:47
jamrather than trying to do that at every "juju status" or "juju deploy" or ... etc09:47
fwereadejam, in a sense that's just an extension of the secrets dance, but yeah, would be good to drop it entirely, no argument09:48
rogpeppefwereade: FWIW i think we can create storage in the environment instead of the provider anyway, can't we?09:50
fwereadejam, rogpeppe: the question is *when* thumper is likely to do this, because we need some solution for the cli-api work09:50
jamfwereade, rogpeppe: going back to another discussion, I'm going back to the PasswordHash stuff, and splitting it into a UserPasswordHash(password, salt) and AgentPasswordHash(password)09:51
jamwhere we allow CompatPasswordHash, but if that succeeds, we then change the DB to set it to the new methods09:51
jamfwereade: well, 'juju status' and 'juju deploy' are going to be some of the last steps we actually finish :)09:51
fwereaderogpeppe, well, we do, for the manual provider -- but that's a blocking bootstrap ;)09:51
jamwe can set someone on it, even if it isn't thumper09:51
rogpeppejam: i don't think you can salt user passwords until the entire CLI is API, can you?09:52
fwereadejam, heh, axw springs to mind given that he did the manual stuff -- it's just "make everything else work like manual bootstrap" :)09:53
axwheh09:53
fwereadejam, axw: modulo *also* needing provider storage -- or some alternative mechanism -- to store the bootstrap info09:53
jamrogpeppe: so we don't salt the Mongo password, so I don't think that actually changes, it is just when someone *does* connect via the API, we look up the hash + salt.09:53
axwjam, fwereade: per my email before, lack of secrets via API kinda blocks work I'm doing09:54
axwI can move onto destroy-environment maybe09:55
axwbut otherwise, I could look at the secrets/bootstrap business09:55
rogpeppejam: currently we do hash the mongo password, but i guess we could use a known salt for that09:55
rogpeppejam: (in fact that's what we do currently)09:56
jamrogpeppe: CompatPasswordHash() uses the same UserPasswordHash(password, FIXEDSALT)09:56
rogpeppejam: seems reasonable09:58
rogpeppejam: while we're about it, can we increase the password strength?09:59
jamrogpeppe: 18-bytes of entropy is about 2^53 or so. We could easily go up to 24 (and get 32-byte base64 passwords). which gets us up into 2**72.09:59
jamI may be wrong on the exact values10:00
jambut < 2^64 today, and >2^64 with a size bump.10:00
jamrogpeppe: the code itself says "we stick to 18-bytes because mongo uses md5sum anyway"10:00
rogpeppejam: i'm thinking we could usually use 256-bit random passwords and hashes10:00
rogpeppes/usually/usefully/10:01
jamrogpeppe: I honestly don't think that improves our actual security, but yes, we could make it really big.10:01
jam#1 mongo is still the most critical part10:02
jamas getting *that* password gives you everything10:02
rogpeppejam: yeah, but cracking the user password might give you access to other environments10:02
jamrogpeppe: doesn't matter, we don't set the user password10:02
jamand these passwords that we are generating are only good for a given agent10:03
jamso no leakage10:03
jamwe're using sha512 as our hash, so we have room internally10:04
jamif mongo is using md5sum10:04
jamthen that would be 128 bits10:04
jambut 18 bytes of entropy = 2^144 (i was doing the math wrong before)10:05
jamso we're already better than md510:05
rogpeppeyeah, i thought your numbers looked weird (i thought you perhaps meant 10^53)10:06
jamrogpeppe: I was doing 8 bytes == 8^8 rather than 256^810:07
rogpeppejam: or 2^(8 * 8)10:08
rogpeppejam: (easier just to work in bits, i reckon)10:08
jamrogpeppe: so because we take the raw bits and put it into base64 encoding, the useful bits are 18 bytes, 24 bytes and 30 bytes10:09
jamsince those leave us with a base64 password that doesn't have '=' padding.10:09
rogpeppejam: tbh a 144 bit random password is probably ample10:09
jamrogpeppe: for our attack surface I think it is more than ample myself10:10
rogpeppejam: that's not gonna be the way that someone breaks into our system10:10
jamrogpeppe: yeah, I was considering it when my math was bad, because 2^64 isn't great security. but 2^144 is perfectly fine10:10
rogpeppejam: yeah10:11
rogpeppefwereade: are you around for a chat about the HA stuff?10:37
rogpeppefwereade: ah, it's standup in a mo actually10:37
dimiternrogpeppe, mgz, fwereade: standup10:46
mgzta10:46
jamfwereade: standup?10:46
jamfwereade: https://plus.google.com/hangouts/_/calendar/am9obi5tZWluZWxAY2Fub25pY2FsLmNvbQ.mf0d8r5pfb44m16v9b2n5i29ig10:46
* TheMue => lunch11:50
tasdomasdoes juju support tokens in config.yaml? I.e. in cases where one value is dependent on another one (like base path and subfolders)?12:32
mgztasdomas: nope12:38
tasdomasmgz, right, thanks12:38
=== gary_poster|away is now known as gary_poster
abentleysinzui: did thumper fill you in on the lxc/local-provider developments?14:16
sinzuiNo, but I saw the branch merge14:16
abentleysinzui: I did a test run after the branch merged and the local provider still failed, but I did not have time to check whether it was the same issue as before.14:17
sinzuiabentley, are you using mysql in the test?14:17
abentleysinzui: Yes.14:17
jamrogpeppe1: fwereade: https://code.launchpad.net/~jameinel/juju-core/faster-passwords/+merge/193667 has been updated. It now sets a Salt for User passwords and uses clearly denoted AgentPasswordHash vs UserPasswordHash vs CompatPasswordHash14:18
rogpeppe1jam: looking14:19
jamI haven't had a chance to test live upgrades, but I have every belief things will JustWork14:19
rogpeppe1jam: one thing that occurs to me as *potentially* useful in the future, if we have many "users" that are actually agents, is that if we've generated the admin secret automatically (i.e. it's got lots of entropy) we could eschew the salting.14:22
rogpeppe1jam: something to think about for the future, perhaps14:22
jamrogpeppe1: well, eventually we'll get real users14:24
jamwe do often generate admin-secret today14:24
rogpeppe1jam: indeed.14:24
jambut, meh, salting is cheap, I'm only looking to change this stuff for the AgentPasswordHash changes14:24
rogpeppe1jam: salting is cheap, but UserPasswordHash is not. at some point in the future, we *might* come to a situation where we've got many agents (the GUI is one example) that reconnect when an API server goes down, and hence use lots of CPU resources when doing so14:25
rogpeppe1jam: so, i guess i'm not really talking about the salting per se14:26
rogpeppe1jam: anyway, it was just a thought that occurred to me; ignore me :-)14:26
abentleysinzui: my test was invalid, because it started with 1.16.x, which isn't expected to have a fix yet.14:28
sinzuiah14:28
sinzuiabentley, we could add stable branches to the test? juju/1.16 will build a 1.16.3 client and server14:29
abentleysinzui: Certainly.  Did that temporarily for thumper yesterday.14:30
abentleysinzui: Are you in the stand-up?  I switched my urls around.14:32
rogpeppe1jam: why is environs/cloudinit.go using CompatPasswordHash ?14:33
jamrogpeppe1: that is the old "use the hashed password until we can use the real one"14:34
sinzuiabentley g+ is asking me to juggle three identities14:34
abentleysinzui: Fun.14:34
rogpeppe1jam: why can't we use a salted password there too?14:34
jamrogpeppe1: that would require changing what we pass to cloud-init, I think14:35
rogpeppe1jam: (after all, it's actually one of the most insecure places that the admin password is kept)14:35
rogpeppe1jam: yes, it would.14:35
rogpeppe1jam: is that a problem?14:35
jamwhich is something I wasn't as comfortable with because it isn't hidden behind the api14:35
rogpeppe1jam: i'm not that comfortable seeing "Compat"PasswordHash being used in a place where it looks like it will not be deprecated.14:36
jamrogpeppe1: regardless, the data we write to cloud init gets rewritten anyway,14:36
rogpeppe1jam: how do you mean?14:36
jamrogpeppe1: I'm fine changing the name back, or having multiple names for the same thing.14:36
jamrogpeppe1: once an agent is up, it resets its password14:36
jamso it won't match what is in cloud-init14:37
jamand bootstrap changes the admin password back to the real password rather than the hashed password14:37
rogpeppe1jam: but the admin password is still hashed in cloud-init, no?14:37
rogpeppe1jam: so someone that gets access to the cloud-init data (probably not too hard) can still brute-force the non-salted admin password AFAICS14:38
rogpeppe1jam: if we *are* going to salt user passwords, i think that's probably one of the most important places to do it14:38
natefinchrogpeppe1, jam:  any place we *can* salt passwords, we  should.  It's not computationally expensive, and even if we're not too worried about that vector of attack, it certainly can't hurt.14:40
jamrogpeppe1: so I dont think anything I've done precludes us adding salt there, and I think bootstrap is particularly a place where it is easy to break compatibility accidentally14:40
rogpeppe1jam: yeah, it *would* mean you couldn't use a new juju to bootstrap with old tools14:44
rogpeppe1jam: but perhaps you could change the occurrences of CompatPasswordHash to call UserPasswordHash with the known constant salt - then it's more obvious what's going on, perhaps. And a TODO in the code would be nice too.14:45
rogpeppe1jam: you have a review15:19
abentleysinzui: As we move to have more sets of tests running, we'll want to have multiple versions of juju running concurrently.  Which makes me think we need to chroot (not lxc because that would break local provider).15:21
sinzuiabentley, I understand15:23
tasdomasI am getting a strange panic when running go test on juju-core/worker/uniter15:39
tasdomashttp://pastebin.com/ER6GUuza15:39
tasdomas(this is juju-core trunk)15:40
jcsackettsinzui: just realized what time it is. are we 1x1ing?15:42
sinzuijcsackett, sorry, had another meeting15:44
=== rharper_ is now known as rharper
abentleysinzui: AFAICT, it's impossible to run "make clean", because "go list -e -f '{{.Dir}}' launchpad.net/juju-core" doesn't find anything.  I could do "bzr clean-tree --unknown --ignored"16:03
sinzuiabentley, the make-recipe-and-package script creates a new directory with the revision in the name, so I don't understand how we could be reusing a built tree.16:15
abentleysinzui: Yesterday, I was testing 1.16 in the tree normally used for trunk.  With bad luck, a 1.16 revno could match a trunk revno.16:17
sinzuiah@16:18
sinzuiabentley, I have experienced that just after I created the 1.16 branch16:18
abentleysinzui: I also trigger the tests manually, without waiting for the revno to update, but that's probably less of an issue.16:20
bacjcsackett: are you around?  can we talk in a bit?16:23
=== natefinch is now known as natefinch-afk
abentleysinzui: I'm adding the clean-tree anyway to reclaim disk space.16:47
sinzuiYay16:47
jcsackettbac: sorry, i missed your message earlier. i can chat now, if you like.17:53
bachi jcsackett18:01
bacnow is good18:01
jcsackett  bac: g+?18:02
bacjcsackett: https://plus.google.com/hangouts/_/72cpjm0fnhduq36l7pim15v0mk?hl=en18:02
=== natefinch-afk is now known as natefinch
jcsackettsinzui: can you join in on the above g+? ^18:22
jcsackettsinzui: nm.18:29
* rogpeppe1 is done for the day.18:40
rogpeppe1g'night all18:41
bacsinzui: would you have a moment to review my one-line migration script?  i'll get someone else to look at the rest.  https://codereview.appspot.com/2179004519:37
jcsackettsinzui: can you look at https://code.launchpad.net/~jcsackett/charmworld/rollback-422/+merge/193999 today?20:06
sinzuibac, jcsackett I can start the reviews now20:33
jcsackettsinzui: awesome, thanks.20:33
bacsinzui: please do jc's first20:33
sinzuijcsackett, r=me20:42
jcsackettsinzui: thanks.20:43
jcsackettbac: i'll ping you when i've qa'ed it on staging.20:45
thumpersinzui: morning20:45
thumpersinzui: thoughts on 1.16.3?20:45
sinzuibac: LGTM. Thank you for updating the migration template20:47
baccool.  thanks for looking at the migration stuff sinzui.20:47
bacjcsackett: any problem with me landing my branch now or do you want me to wait?  i can walk the dog now and do it later20:48
sinzuibac: I am just happy we remember that es-update is automatically run for us20:48
jcsackettbac: i don't think our branches collide, so it should be fine.20:49
sinzuithumper, These bugs are fix released in stable. They are fixed in trunk. I think I can mark these as fix released because everyone has the fix https://bugs.launchpad.net/juju-core/+bugs?search=Search&field.importance=Critical&field.status=New&field.status=Incomplete&field.status=Confirmed&field.status=Triaged&field.status=In+Progress&field.status=Fix+Committed20:55
thumpersinzui: well, kinda, are we going to roll out 1.17?20:56
thumpershould we make them fix released when we release them?20:56
sinzuiI was thinking of doing it this week, but doing 1.16.3 might exhaust me20:57
thumperwell, bug 1246556 is in 1.16.220:57
_mup_Bug #1246556: lxc containers broken with maas <api> <maas-provider> <juju-core:Fix Committed by thumper> <juju-core 1.16:Fix Released by thumper> <juju-core (Ubuntu):Fix Released> <juju-core (Ubuntu Saucy):New> <juju-core (Ubuntu Trusty):Fix Released> <https://launchpad.net/bugs/1246556>20:57
thumperso I think that should be fix released20:57
thumperhmm, I see the 1.16 task is released20:57
thumperpersonally I'd rather not have them marked fix released until we have a release with the fix20:58
thumperfix committed is enough to say they are in trunk20:58
sinzuithumper, I think only your addition last night is unreleased as well as the complicated maas bug21:00
thumpersinzui: I was concerned that we'd break people in a charm school at ODS21:01
thumperas the local provider is broken for everyone21:01
thumperdue to an old mistake and a precise update21:01
jcsackettbac: qa-ok.21:01
jcsackettbac: how did you cleanup the bad old review jobs last time? my terminal-fu is weak, and we have processes that won't die.21:02
sinzuithumper, yes, but not every charm is affected. I won't push back on the releases, but each release of stable delays other work. It took days to get 1.16.2 to every place it had to be21:02
thumpersinzui: hmm, not every charm, but the local provider is broken now21:03
thumperfor everyone except those compiling trunk21:03
thumperno install hooks complete21:04
thumperbecause apt is left in an incomplete state21:04
sinzuilook at https://bugs.launchpad.net/juju-core/+bug/1240709. I think this is fix released for everyone because the juju-core maintains stable and devel trees.21:04
_mup_Bug #1240709: local provider fails to start <local-provider> <juju-core:Fix Committed by thumper> <juju-core 1.16:Fix Released by thumper> <juju-core (Ubuntu):Fix Released> <juju-core (Ubuntu Saucy):New> <juju-core (Ubuntu Trusty):Fix Released> <https://launchpad.net/bugs/1240709>21:04
bacjcsackett: it was in such a sad state i just rebooted the charmworld instance21:04
sinzuithumper, I typed into the wrong channel 30 minutes ago: thumper I can do it. I was hoping that CI would be up today, but abentley and I have had to rethink the branch + revno tactics used with jenkins21:06
* thumper nods21:06
thumperNormally I wouldn't be too concerned, but with ODS and charm school...21:06
thumpercould look real bad21:06
sinzuithumper, remember that the release takes between 8 hours and many days21:07
thumper:(21:07
sinzuiI don't want to be rushed it it has to be rushed because we don't control the builders and the copy step just adds more hours to the release21:07
sinzuithumper, If I had off a package in a few hours, I can hope that I have something to republish to end users in the first hours of my morning21:09
sinzuis/had/hand/21:09
thumpersinzui: I think we wouldn't need to push new tools21:09
thumpersinzui: as 1.16.3 should get 1.16.2 tools21:09
thumperwhich would work fine21:10
thumperthe only change is in the local provider cloud-init config21:10
sinzuibecause juju-core (client and server) are deployed by the client on the same machine?21:10
thumperyeah21:10
thumperthe local provider always pushes local tools21:11
thumperit is horrible21:11
thumperand we should fix it to be nicer21:11
thumperbut it works for now21:11
sinzuithumper, this is an awkward position to take because we have gone out of our way to ensure the jujuds are published before users get the clients21:11
thumper:)21:11
sinzuiThe number mismatches are scary21:11
thumperI understand21:12
thumperwas just trying to make things easier21:12
thumperbut I can understand that it isn't working21:12
sinzuithumper, jamespage makes the stable packages...21:12
sinzuibut I could go crazy and upload my own package to the builders. My packaging does not yet support dpkg alternatives, which would be a nasty regression for IS21:14
sinzuiand I haven't fixed packaging because I personally spent 2.5 days releasing 1.16.221:14
thumperwow, why so long?21:15
sinzuiCI does not test stable yet. I do all the testing, send the tarball for building. While it builds I write release notes and steal natefinch's time to get a Windows installer, and set up all the upgrade tests again. When I see all packages are built, I spend an hour assembling the tools and publishing them. Then a few hours testing that everything still works. Then I do the release announcements. Then I work on windows and mac21:19
sinzuidistribution with upstreams21:19
sinzuiIf Hp loses authentication like last time, I can spend 90 minutes flailing to get jujuds in the cloud21:20
sinzuithumper, how many hours until charm school?21:23
thumpersinzui: no idea, best asking jcastro or marcoceppi21:24
thumperI don't know that there is one21:24
* sinzui is calculating risk and time21:24
thumperbut they normally do something21:24
wallyworldfwereade: hey, any update on bug 1233457?21:28
_mup_Bug #1233457: service with no units stuck in lifecycle dying  <cts-cloud-review> <destroy-service> <juju-core:Triaged> <https://launchpad.net/bugs/1233457>21:28
thumpersinzui: news isn't good, cts is doing the charm school21:45
sinzuiwhen?21:45
thumperdon't know21:46
sinzuithumper, I have a release plan nonetheless https://docs.google.com/a/canonical.com/document/d/1J0xf_G1ZRU5timhVBnPsrDbQmZmW3iuw02ReCpO9cMk/edit#21:56
thumpersinzui: weird21:58
thumpersinzui: I paste that in to the browser and get nothing21:58
thumpernot an error, just nothing21:59
sinzuithumper, I think you are not logged in as your canonical id22:00
abentleythumper: I could see it.22:00
sinzuithumper, I can go through the steps to the tarball phase. At that point I could return to fixing the devel packaging rules. When the rules are compatible, I can safely place packages into any archive. OR we abandon the recipe...just ignore that that is how we build test packages and fall back to tarball + packaging + bzr22:00
thumpersinzui: do you have 1.16 installed locally?22:01
thumperabentley: do you have 1.16 and not trunk?22:01
sinzuiIf I have the right rules, I am unblocked from loading the package to any archive22:01
sinzuithumper, I do, the current packaging rules do not support dpkg switch...IS cannot use pyjuju in production22:02
abentleythumper: I have 1.16.0 on my local machine.22:02
thumperabentley: 1.16.0 or 1.16.2?22:02
abentleythumper: It claims to be 1.16.0.22:02
thumperhmm...22:02
thumperabentley: does it start the local provider?22:02
abentleythumper: At the sprint, it worked sometimes.22:03
sinzuioh, thumper , abentley you might be referring to my quickly put together notes. stable is 1.16.2 now. that is what we test22:03
thumperfunny, but my system one says 1.16.0 too22:04
sinzuithumper, depending on your mirrors and update checks you can be a few weeks behind22:05
* thumper nods22:06
thumpersinzui: my ppa's were probably disabled when I moved to saucy22:06
thumperand I've not enabled them22:06
thumperyet22:06
sinzuithumper, The only way for me to get the 1.16.3 to complete the release is to install the deb I pulled locally, or bleed. I am running trusty (bleed)22:06
thumper:)22:07
thumperI'm not that trusty of trusty22:07
sinzuiI still think of her as tarty22:07
thumpersinzui: what is the doc called? going through the listing22:08
thumperlink still doesn't work, and I am logged in with the right creds22:08
fwereadewallyworld, heyhey22:08
sinzuithumper, 1.16.3 Release Log22:08
wallyworldhi22:08
wallyworldfwereade: just thought i'd check in in case i could help at all22:09
thumpersinzui: can't find it22:09
thumpersinzui: can I get you to check the sharing on it?22:10
sinzuithumper, try again22:10
sinzuiI think I have ensured the entire folder is searchable22:11
thumpersinzui: nope...22:12
thumperwhat is the folder?22:12
sinzuithumper, "Juju QA" and I just shared the doc directly with you.22:13
thumperrestarting chromium seemed to help22:13
wallyworldfwereade: also, save me some search time - is there currently a way to get an environs.Environ instance using an apiclient made from a call to NewAPIClientFromName? I can't see a way22:13
* sinzui starts blessing and cursing22:14
thumpersinzui: expose on the local provider does precisely nothing22:14
thumpersinzui: also, why test installing apache not mysql?22:15
=== gary_poster is now known as gary_poster|away
sinzuithumper, That was copied from the 1.16.3 release. I am going to do the mysql + wordpress stack. The apache charm was an example of a charm not affected by the apparmor/cgroups issue22:22
thumpersinzui: wow, that is quite a list22:25
sinzuiThis is half the size of the 1.16.2 that has canonistack, hp, azure, and ec222:25
davechen1ylinkage22:26
sinzuiIf I die, someone can check off the boxes I didn't get to22:26
davechen1ysorry, i missed the discussion22:26
abentleythumper, sinzui: I just tested 1.16r1982, and mysql was unhappy: http://162.213.35.28/job/test-no-upgrade-stable/3/console22:32
* sinzui looks22:33
thumperabentley: this isn't the local provider22:35
abentleythumper: Yeah, I don't know whether it was caused by the local provider or just happened on the local provider.22:35
thumperabentley: looked like it was happening on the hp test22:36
thumperisn't it?22:36
sinzuiyep22:36
sinzuithat is the very risk I am taking with my plan. The change only affected the local-provider22:37
abentleythumper: No, we set them up in parallel to reduce lag.22:37
abentleythumper: So exposing on test-release-hp is the last thing we did before we turned our attention back to the local provider.22:37
thumperabentley: do you have access to logs?22:38
thumperI don't entirely believe this22:38
abentleysure, what do you want?22:38
thumper/var/lib/juju/containers/*/console.log22:39
thumperfrom the machine running the local provider22:39
sinzuiabentley, thumper I commonly see start-errors with local. I did *not* get errors starting mysql+wordpress with 1.16.2 just now. I am set up for an upgrade test22:39
thumpersinzui: really? WTF?22:40
thumperI guess it is good...22:40
thumperbut kinda shit22:40
thumpersinzui: doing mysql+wordpress with the local provider?22:40
sinzuiI have noted in the past that mysql seems to work between the hours of UTC 0 and 622:40
sinzuiyep22:40
sinzuithumper, remember when you helped me last week with local provider...that is the thing I deployed 3 times to convince myself all was good...22:41
davechen1y09:40 < sinzui> I have noted in the past that mysql seems to work between the hours of UTC 0 and 622:41
davechen1yo_O!22:41
sinzuithen the next morning mysql told me to go fuck myself22:41
davechen1ysinzui: are you running through squid-apt-proxy ?22:42
thumpersinzui: it worked last week, but I didn't expect it to work today...22:42
sinzuidavechen1y, not in this test22:42
davechen1yif N, maybe that would make things more reliable^h^h^h^h^h reproducible22:42
sinzuiI have a package. I will do the upgrade22:43
abentleythumper: mailed to you.22:45
thumperta22:45
sinzuithumper, abentley local 1.16.2 to 1.16.3 worked.22:45
thumperabentley: the error seems to be unrelated to apt or lxc, but a networking glitch22:48
thumpernear the end of the first machine log file22:48
abentleythumper: it appears that mysql is on the second machine.22:50
abentleythumper: And that the problem is with mysql itself not coming up.22:50
thumperabentley: unit log file for it?22:50
abentleythumper: something different from local-machine-2/console.log?22:52
thumperabentley: yeah...22:52
thumperis the environment still "live"?22:52
abentleythumper: No, it's down.22:52
thumperbugger22:53
thumperthat means the internal log files are gone too22:53
sinzui1.16.3 deployment of mysql+wordpress == PASS22:53
abentleythumper: It might just be ENOMEM.22:54
sinzuitarball is building22:56
sinzuithumper, I have a tarball that I am willing to sign (and have signed). I could send this to be packaged. I would prefer to fix my packaging branch. If I cannot solve its dep problems by my bedtime, I can send off the tarball.23:24
=== gary_poster|away is now known as gary_poster
=== gary_poster is now known as gary_poster|away
* wallyworld off to accountant, bbiab23:42
sinzuithumper, do you have any insight into why this upgrade did not complete? http://pastebin.com/QuLFM3hq23:49
sinzuioops, thumper, I think this has the log that shows a failed upgrade http://pastebin.com/3dTCS9W123:50
