davecheney | wallyworld: https://github.com/juju/juju/pull/379 | 00:20 |
---|---|---|
davecheney | mattyw: http://www.ohloh.net/projects/juju | 00:22 |
perrito666 | fwereade: since you're freaked out I returned :p | 00:23 |
fwereade | perrito666, I'm in a quantum superposition of being freaked out, and not freaked out, until I know more context :) | 00:24 |
perrito666 | fwereade: since the latest speed ups restore sometimes fails when it arrives to the part where it installs mongo client because someone else is running apt-get | 00:24 |
fwereade | perrito666, what's the apt lock issue? | 00:24 |
mattyw | davecheney, 19% javascript? | 00:24 |
fwereade | perrito666, heh | 00:24 |
perrito666 | so we need to wait until apt-get finished | 00:24 |
perrito666 | :) | 00:24 |
fwereade | perrito666, who else might be running it? | 00:24 |
perrito666 | fwereade: I have no clue, but I intend to figure that out tomorrow | 00:24 |
fwereade | perrito666, ok | 00:24 |
davecheney | "maintained by a very large development team" | 00:25 |
perrito666 | fwereade: but that set aside it is reckless for restore to just try to apt-get without checking if it can | 00:25 |
fwereade | perrito666, well | 00:25 |
fwereade | perrito666, restore is setting up a new state server, right? | 00:25 |
perrito666 | fwereade: well it is stepping on a fresh one :) so my quick guess, something like apt-get update/grade is happening | 00:26 |
mattyw | davecheney, thumper https://github.com/juju/juju/pull/369 | 00:26 |
fwereade | perrito666, it kinda feels like the apt lock is too low level | 00:26 |
fwereade | perrito666, we may be stepping on it while cloudinit is still finishing? | 00:27 |
fwereade | perrito666, in which case we can expect more things than just the apt lock to fall over, I suspect | 00:27 |
perrito666 | fwereade: good point, you think that if I wait for apt-to finish something else might break? | 00:27 |
fwereade | perrito666, yeah | 00:27 |
fwereade | perrito666, I think that we should have complete control over that server | 00:28 |
* perrito666 wonders how to know if everything else is finished | 00:28 | |
fwereade | perrito666, I feel like we ought to have already solved that issue for bootstrap | 00:28 |
fwereade | perrito666, maybe we haven't? | 00:28 |
perrito666 | I think we think we did | 00:28 |
fwereade | axw, you did sync bootstrap -- do we guarantee cloudinit is finished before we start work? | 00:29 |
fwereade | perrito666, if we do, restore should do whatever bootstrap does | 00:29 |
perrito666 | fwereade: well the first step of restore is to bootstrap a machine so we cant really be more boostrapy :p | 00:29 |
fwereade | perrito666, if not, restore and bootstrap should both make sure they wait | 00:30 |
fwereade | perrito666, and ideally use the same code in the same way to do so regardless | 00:30 |
fwereade | perrito666, that maybe sounds like we don't wait | 00:30 |
perrito666 | fwereade: well, new restore is far cleaner in that sense ;) | 00:30 |
* fwereade grumps a bit | 00:30 | |
fwereade | perrito666, cool | 00:30 |
fwereade | perrito666, so, high-level, ISTM that we should be reusing bootstrap as much as possible (which we are, great) but that in either case we should wait for cloudinit to be done before we start doing anything else | 00:31 |
fwereade | perrito666, or possibly I don't know what I'm talking about | 00:31 |
fwereade | perrito666, always bear that possibility in mind | 00:31 |
perrito666 | I definitely need someone which knows more bootstrap than I to go over what bootstrap does and find out why bootstrap declares it finished when it didn't | 00:31 |
fwereade | perrito666, either way, forget what I said about the hook execution lock, I'm pretty sure it's irrelevant | 00:32 |
fwereade | perrito666, thanks for coming back to chat | 00:32 |
fwereade | perrito666, bah, axw not around yet | 00:32 |
fwereade | perrito666, would you drop him a quick mail he can answer overnight please? | 00:32 |
* fwereade will slope off to lunch unless there's something else? | 00:33 | |
perrito666 | fwereade: I really need some sleep so ttyt | 00:33 |
perrito666 | lunch? where are you? | 00:33 |
fwereade | perrito666, new zealand | 00:33 |
perrito666 | cool, bring back a hobbit :p | 00:34 |
fwereade | perrito666, haha | 00:34 |
fwereade | perrito666, sleep tight | 00:34 |
perrito666 | thank you | 00:34 |
wallyworld | davecheney: i found a deadlock in the client api login tests | 01:11 |
wallyworld | that will explain a lot of the test failures in that area | 01:12 |
wallyworld | very much timing related so subtle changes due to session copying may have triggered it to be more frequent | 01:12 |
davecheney | wallyworld_: awesome | 01:27 |
wallyworld_ | davecheney: i updated your bug, hopefully explanation makes sense | 01:32 |
wallyworld_ | fixed locally | 01:32 |
davecheney | wallyworld_: thanks | 01:33 |
davecheney | thumper: go test -run=XXX github.com/juju/juju/... | 01:36 |
thumper | fwereade: https://github.com/juju/juju/pull/380 | 01:41 |
wallyworld_ | axw: a small one https://github.com/juju/juju/pull/381 | 01:47 |
axw | looking | 01:47 |
axw | ahh, is this what was causing the tests to timeout? | 01:47 |
davecheney | wallyworld_: do you have a new PR for reapplying your txn fix ? | 01:55 |
davecheney | then cmars can test it | 01:55 |
wallyworld_ | davecheney: not yet, i want to retest locally first. soon | 01:56 |
davecheney | ok | 01:59 |
davecheney | tasdomas: var uuidregex = regexp.MustCompile(`[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}`) | 02:01 |
axw | wallyworld_: something I can do to help you with the mongo session changes? | 02:16 |
axw | axw> wallyworld_: something I can do to help you with the mongo session changes? | 02:23 |
davecheney | axw: i think wallworld needs help with his intertubes | 02:27 |
axw | so it would seem | 02:28 |
tasdomas | davecheney, thumper https://github.com/juju/names/pull/19 | 02:29 |
axw | tasdomas: do you know if landscape has updated their code to use the new format? | 02:32 |
axw | tasdomas: also, #1257587 isn't actually Fix Released until this is in juju - right? | 02:32 |
_mup_ | Bug #1257587: environment-tag handling permits non-unique tags <landscape> <tech-debt> <juju-core:Fix Released> <https://launchpad.net/bugs/1257587> | 02:32 |
davecheney | axw: true | 02:32 |
davecheney | we need to land a companion branch to juju and dependencies.tsv | 02:32 |
wallyworld__ | davecheney: so, gh won't let me create a pr using the original copy-session branch because it says master already has all those commits, even though i reverted. so do i need to do a whole new branch? please say i don't | 02:38 |
menn0 | wallyworld__: add --force to the push? | 02:39 |
wallyworld__ | menn0: i haven't pushed anything new as yet - just went to the original branch on gh and tried to create a pr | 02:40 |
davecheney | wallyworld__: i wish I could help | 02:40 |
menn0 | wallyworld__: ah sorry... I misunderstood what you were trying to do | 02:41 |
menn0 | wallyworld__: I know even less about GH than Git | 02:41 |
davecheney | wallyworld__: i'd grab the .diff from the original PR and try to patch it onto a fresh branch | 02:41 |
wallyworld__ | menn0: that worked, thanks. not sure why exactly, since the branch hadn't been changed on my end and the revs it contains were already in my fork. but it seems to have done something | 02:45 |
axw | wallyworld__: still investigating, but I think I need to reopen #1345832 | 02:48 |
_mup_ | Bug #1345832: Juju writes to mongo without an actual change occurring <cloud-installer> <landscape> <juju-core:Fix Committed by axwalk> <juju-core 1.20:Fix Committed by axwalk> <https://launchpad.net/bugs/1345832> | 02:48 |
axw | don't think it's actually fixed on account of the assertion going into the oplog | 02:48 |
axw | may be best to take it off the 1.20.2 milestone if that's the case | 02:48 |
axw | -__________- | 02:49 |
axw | wallyworld__: your connection is awful | 02:50 |
axw | axw> wallyworld__: still investigating, but I think I need to reopen #1345832 | 02:50 |
_mup_ | Bug #1345832: Juju writes to mongo without an actual change occurring <cloud-installer> <landscape> <juju-core:Fix Committed by axwalk> <juju-core 1.20:Fix Committed by axwalk> <https://launchpad.net/bugs/1345832> | 02:50 |
axw | axw> don't think it's actually fixed on account of the assertion going into the oplog | 02:50 |
axw | <axw> may be best to take it off the 1.20.2 milestone if that's the case | 02:50 |
wallyworld__ | axw: it's only freenode | 02:50 |
wallyworld__ | nfi idea why | 02:50 |
wallyworld__ | axw: how often are the api hosts updated? | 02:51 |
axw | umm | 02:52 |
axw | not sure. will need to check | 02:52 |
axw | but the oplog will be changing at the same rate as before | 02:53 |
axw | just with a different type of entry | 02:53 |
tasdomas | thumper, davecheney https://github.com/juju/juju/pull/369 | 02:55 |
wallyworld | davecheney: cmars: master now updated with session copying branch | 03:18 |
cmars | wallyworld, thanks | 03:19 |
davecheney | wallyworld: ta | 03:28 |
axw | wallyworld: I'm reopening the bug about no-op writes, but thankfully I think it's easy to fix | 03:35 |
axw | I had thought there were two sources of calls to SetAPIHostPorts, but there's only one | 03:35 |
wallyworld | axw: np, sounds good | 03:35 |
wallyworld | axw: i got the copy sessions stuff backported, had to fix some additional tests. i'm doing a live test before landing. if you wanted to eyeball it, that would be good. no real hurry https://github.com/juju/juju/pull/385 | 03:39 |
axw | wallyworld: okey dokey. so what was causing the errors last night? | 03:39 |
wallyworld | axw: that necessitated the revert? | 03:41 |
axw | wallyworld: yeah | 03:41 |
wallyworld | i think it was the login deadlock | 03:41 |
axw | what about the authorization failures tho? | 03:41 |
wallyworld | let me re-read the email | 03:42 |
wallyworld | axw: i think i saw a fair few tests fail once the login deadlock caused failure | 03:43 |
axw | mk | 03:43 |
wallyworld | once i fixed that, i wasn't able to reproduce any more failures | 03:43 |
davecheney | thumper: fwereade https://github.com/juju/juju/pull/378 | 03:53 |
davecheney | fwereade: may I draw your attention to https://github.com/juju/juju/pull/378, which I have updated | 04:07 |
davecheney | mattyw: ping | 04:07 |
mattyw | davecheney, pong? | 04:07 |
mattyw | davecheney, at the moment all I've added is a NewUUID to the factory which just wraps utils.NewUUID | 04:08 |
davecheney | mattyw: that's the ticket | 04:09 |
davecheney | more abstraction, that's what we need | 04:09 |
axw | thumper: https://github.com/juju/juju/blob/master/environs/configstore/disk.go#L235 | 04:14 |
thumper | hi axw | 04:14 |
axw | hello | 04:14 |
thumper | bugger | 04:14 |
thumper | forgot to remove that | 04:14 |
thumper | sorry | 04:14 |
* thumper fixes | 04:15 | |
axw | thanks. I'd do it, but in the middle of a branch | 04:15 |
thumper | axw: also, just pulled trunk to get the new hotness | 04:15 |
thumper | but the tests seemed to be hanging | 04:15 |
axw | :( | 04:15 |
thumper | last output is: | 04:15 |
thumper | ok github.com/juju/juju/cmd/envcmd 0.145s | 04:16 |
axw | will see if I can repro in a moment | 04:16 |
thumper | proceeded the second time | 04:17 |
thumper | axw: I normally run all the tests when I freshly pull master | 04:18 |
thumper | to make sure it is all good | 04:18 |
thumper | and it makes me feel better | 04:18 |
thumper | I think it is up to the replica set tests now | 04:18 |
waigani | axw: I'm trying to live test manually adding a machine with a specified ssh key and I get the following error: | 04:19 |
waigani | WARNING failed to parse bootstrap-config: empty image-metadata-url in environment configuration | 04:19 |
waigani | ERROR empty image-metadata-url in environment configuration | 04:19 |
axw | waigani: do you have an empty image-metadata-url in your environment configuration? ;) | 04:19 |
waigani | lol | 04:19 |
waigani | axw: fair call | 04:20 |
axw | if it's blank in environments.yaml, that's considered an error | 04:20 |
axw | it needs to be either not there, or not blank | 04:20 |
waigani | right, thanks | 04:20 |
thumper | axw: https://github.com/juju/juju/pull/386 | 04:21 |
waigani | axw: its not there | 04:22 |
axw | hmm | 04:23 |
axw | waigani davecheney: I suspect https://github.com/juju/juju/pull/362/files#diff-2 | 04:23 |
axw | config previously allowed "", now does not | 04:23 |
axw | so... somehow your environment has a blank value I guess | 04:24 |
waigani | axw: ugh, nice catch | 04:24 |
davecheney | wallyworld: can you help me revert this PR ? | 04:30 |
davecheney | https://github.com/juju/juju/pull/362 | 04:30 |
davecheney | turns out we can't even remove these fields | 04:30 |
wallyworld | davecheney: sure, let me look | 04:30 |
wallyworld | davecheney: reverted and merged | 04:32 |
davecheney | ta | 04:32 |
davecheney | wallyworld: thanks | 04:33 |
wallyworld | np | 04:33 |
thumper | anyone else seeing the state/watcher tests fail? | 04:35 |
waigani | axw: worked out what happened: I bootstrapped, ran make install (new version), then ran addmachine. I get same error when I try to destroy my env (I'll monkey my config to make it work). davecheney's revert should fix this. | 04:37 |
axw | okey dokey | 04:38 |
wallyworld | thumper: i occasionally get failures with those. what are you seeing? | 04:42 |
thumper | watcher_test.go:631: | 04:43 |
thumper | assertChange(c, s.ch, watcher.Change{"test", "a", revno1}) | 04:43 |
thumper | watcher_test.go:118: | 04:43 |
thumper | c.Fatalf("watch reported nothing, want %v", want) | 04:43 |
thumper | ... Error: watch reported nothing, want {test a 2} | 04:43 |
thumper | also, my replicaset test timed out at 10 minutes | 04:43 |
wallyworld | thumper: i regularly get replica set timeouts. i sometimes see the watcher failures, been meaning to search for a bug. definitely intermittent for me. another race :-( | 04:44 |
davecheney | wallyworld: it's not a race | 04:44 |
davecheney | it's that PR that I landed | 04:44 |
wallyworld | the wather failures? | 04:44 |
wallyworld | watcher | 04:44 |
davecheney | the problem is we're starting a mongodb on [::1]:something | 04:44 |
davecheney | and then dialing it on 127.0.0.1:something | 04:44 |
davecheney | they are effectively different networks | 04:44 |
wallyworld | ok. i've seen them pass sometimes and not others | 04:45 |
davecheney | thumper: confirmed, this has just started happening | 04:50 |
davecheney | ---------------------------------------------------------------------- | 04:50 |
davecheney | FAIL: watcher_test.go:625: com_juju_juju_state_watcher_test.TestWatchBeforeRemoveKnown.pN55_github.com_juju_juju_state_watcher_test.SlowPeriodSuite | 04:50 |
davecheney | insert("test", "a") => revno 2 | 04:51 |
davecheney | remove("test", "a") => revno -1 | 04:51 |
davecheney | watcher_test.go:118: c.Fatalf("watch reported nothing, want %v", want) | 04:51 |
davecheney | ... Error: watch reported nothing, want {test a 2} | 04:51 |
davecheney | [LOG] 0:10.025 INFO juju.testing reset successfully reset admin password | 04:51 |
davecheney | OOPS: 20 passed, 4 FAILED | 04:51 |
davecheney | --- FAIL: TestPackage (34.06 seconds) | 04:51 |
davecheney | FAIL | 04:51 |
davecheney | FAIL github.com/juju/juju/state/watcher 34.559s | 04:51 |
wallyworld | davecheney: that's what's been intermittent for me, passes mostly but sometimes fails | 04:52 |
davecheney | wallyworld: https://bugs.launchpad.net/juju-core/+bug/1348032 | 04:59 |
_mup_ | Bug #1348032: state/watcher: FastPeriodSuite.TestWatchAfterKnown failure <juju-core:New> <https://launchpad.net/bugs/1348032> | 04:59 |
davecheney | it's started to fail all over the ship today | 04:59 |
wallyworld | hmmmm. mostly passes here for me, with the odd failure | 04:59 |
wallyworld | that's why i thought it to be a race | 04:59 |
davecheney | wallyworld: it's not a race | 05:05 |
davecheney | it's load related | 05:05 |
davecheney | the test passes on an idle machine | 05:05 |
davecheney | when I load up my machine | 05:05 |
davecheney | the test fails | 05:05 |
wallyworld | so the load could trigger the race, by changing timing of thread interactions | 05:06 |
wallyworld | race conditions typically manifest under different load conditions | 05:06 |
davecheney | i'm running it under -race | 05:06 |
davecheney | it's not a race condition by my definition | 05:06 |
thumper | davecheney: https://github.com/juju/juju/pull/390 | 05:33 |
thumper | :-( | 05:35 |
thumper | axw, wallyworld: above PR fixes a critical issue in a previous branch of mine | 05:35 |
thumper | typed nil FTL | 05:35 |
wallyworld | looking | 05:36 |
thumper | axw: ta | 05:36 |
axw | np | 05:36 |
thumper | talking with davecheney about writing a jc.IsNil that may assert the type of the interface if it is nil | 05:37 |
wallyworld | thumper: i'd argue the original code was wrong not to check err value anyway | 05:37 |
axw | +1 | 05:38 |
wallyworld | jeez, bot is busy today | 05:39 |
thumper | wallyworld: it did check the err value | 05:42 |
thumper | wallyworld: what's the problem there? | 05:42 |
thumper | it wasn't the error that was the problem | 05:42 |
wallyworld | - if info != nil { | 05:42 |
wallyworld | - info.environmentDir = d.dir | 05:42 |
thumper | it was the expectation of nil, not-nil | 05:42 |
thumper | wallyworld: that's perfectly fine there | 05:42 |
wallyworld | should use the return value being "nil" as the determining factor | 05:42 |
wallyworld | shouldn't | 05:42 |
wallyworld | it should have done if err != nil | 05:43 |
thumper | still not the solution there | 05:43 |
wallyworld | or if err == nil | 05:43 |
thumper | that still wouldn't fix the problem | 05:43 |
axw | thumper: wallyworld is arguing that all error cases should exit early, which I agree with | 05:43 |
thumper | yeah... | 05:43 |
wallyworld | if err != nil, then you know info is bad | 05:43 |
wallyworld | then it doesn't matter if it is nil or typed nil | 05:44 |
thumper | wallyworld: well, I have a test that asserts that info is nil if error is there | 05:44 |
wallyworld | cause you won't use it | 05:44 |
thumper | but we do use it | 05:44 |
thumper | and we have a test that asserts it is nil | 05:44 |
wallyworld | you use it even if err != nil? | 05:44 |
wallyworld | that's bad | 05:44 |
thumper | right | 05:44 |
thumper | we assume that info is nil if error is set | 05:44 |
davecheney | thumper: https://bugs.launchpad.net/gocheck/+bug/1248040 | 05:44 |
_mup_ | Bug #1248040: test failed on 1.2rc3 <gocheck:New> <https://launchpad.net/bugs/1248040> | 05:44 |
thumper | I'm not saying it is right | 05:45 |
wallyworld | cause if err != nil, then you mustn't assume the return val is usable | 05:45 |
thumper | I'm saying that is what we do | 05:45 |
* thumper makes a note to fix the call site too later | 05:47 | |
tasdomas | davecheney, https://github.com/juju/juju/pull/393 | 05:49 |
davecheney | log_test.go:159: c.Check(tw.Log, jc.LogMatches, []string{"foo", "bar"}) | 05:50 |
davecheney | ... obtained func() []loggo.TestLogValues = (func() []loggo.TestLogValues)(0x455c10) | 05:50 |
davecheney | ... expected []string = []string{"foo", "bar"} | 05:50 |
davecheney | ... Obtained value must be of type []loggo.TestLogValues or SimpleMessage | 05:51 |
thumper | davecheney: https://github.com/juju/testing/pull/23 | 05:54 |
wallyworld | axw: got a sec for a hangout in the standup hangout? | 06:15 |
axw | sure | 06:15 |
wallyworld | mgz: meeting? | 10:03 |
perrito666 | axw: good morning, do you think I can get an answer to my email before your EOD? | 10:18 |
perrito666 | good morning all | 10:18 |
axw | perrito666: good morning. the one I already replied to, or the one that hasn't come yet? :) | 10:19 |
perrito666 | lool | 10:19 |
* perrito666 kicks his inbox | 10:19 | |
perrito666 | sorry it got buried in githubbies | 10:20 |
axw | no worries :) | 10:20 |
perrito666 | aghh, then who is running apt-get | 10:21 |
axw | perrito666: can you see what's being installed via apt-get? | 10:22 |
perrito666 | axw: well to be honest I was never able to reproduce that yet, Ill give it a try | 10:23 |
axw | jam: are you abreast of all the IPv6 changes that dimitern has been doing? if so, would you cast your eye over this please? https://github.com/juju/juju/pull/394 | 12:06 |
axw | if not I'll land anyway and send dimitern an email | 12:06 |
=== jcw4|on-the-road is now known as jcw4 | ||
natefinch | jam, mgz: relatively easy code review? https://github.com/juju/juju/pull/375 | 12:23 |
TheMue | natefinch: *click* | 12:26 |
natefinch | TheMue: thanks... that's what I get for not scrolling all the way down in the user list | 12:26 |
TheMue | natefinch: hehe | 12:27 |
TheMue | natefinch: lumberjack? ;) | 12:27 |
natefinch | TheMue: it's a log rolling package | 12:28 |
TheMue | natefinch: funny name | 12:28 |
natefinch | TheMue: borning names are so boring | 12:28 |
natefinch | s/borning/boring/ | 12:29 |
TheMue | natefinch: I’ve been grown up with TLAs | 12:29 |
katco | good morning all | 12:31 |
TheMue | natefinch: it’s a bad PR | 12:31 |
TheMue | katco: morning | 12:31 |
katco | favorite TLA joke: what does idk mean? | 12:31 |
TheMue | natefinch: there’s nothing to complain :D | 12:31 |
TheMue | katco: iiirks, dunno? | 12:32 |
TheMue | katco: ah | 12:32 |
katco | :) | 12:32 |
TheMue | katco: I don’t know | 12:32 |
katco | the response is "well then i'll just ask someone who does" | 12:32 |
natefinch | Yeah, I've decided I like giving real names to my projects rather than just purely descriptive ones... I could have named lumberjack go-logrotate or something boring like that... but that's not memorable. Lumberjack is fun, memorable, and fairly unique. | 12:34 |
katco | natefinch: you should have named it "snore" | 12:34 |
katco | as in, "sawing logs" | 12:34 |
TheMue | natefinch: I dislike all those go-prefixes | 12:35 |
natefinch | TheMue: me too.... but it's tricky because pretty much every decent name has already been taken | 12:35 |
TheMue | natefinch: not in your namespace, eg. at github | 12:36 |
natefinch | TheMue: I've decided that I don't care if I conflict with an existing project as long as it's not hugely well-known and/or in the exact same functional space | 12:36 |
* TheMue has to admit he’s using prefixes for his repositories, but not for the packages/modules/libraries in there | 12:37 | |
natefinch | TheMue: http://www.pkgname.com/ | 12:37 |
TheMue | so it’s clear that all the stuff in there is for one language or is an application which possibly uses multiple languages | 12:38 |
TheMue | natefinch: oh no, people build services to check this | 12:39 |
natefinch | haha, it basically just checks if you have "Go" in the name, and if so, says it's a crappy name | 12:39 |
natefinch | of course, it's not smart... something like "gorilla" it says is a bad name because it mentions go | 12:40 |
TheMue | maybe it should be no prefix | 12:41 |
TheMue | ihavetogo | 12:41 |
TheMue | letsgo | 12:41 |
TheMue | canwego | 12:41 |
TheMue | igotoyou | 12:42 |
natefinch | nah, because people do foo-go (oh, hyphens are another no-no at pkgname, which I agree with) | 12:42 |
TheMue | those go* packages remind me of all those ISomething interfaces | 12:43 |
natefinch | yep | 12:43 |
TheMue | or upn | 12:43 |
TheMue | hmm, no, wrong name, don’t remember how it is calles | 12:44 |
TheMue | called | 12:44 |
TheMue | this paiPointerToArrayOfIntegers | 12:44 |
natefinch | TheMue: we were having a similar conversation yesterday... Lumberjack has a MaxSize field, which is supposed to be in megabytes.... should it be MaxSizeInMB, or should the units be in the type (type Megabytes int) or just in a comment? | 12:46 |
natefinch | TheMue: curious to hear your opinion... I went with a comment, but I sorta feel like it should have been its own type. Not a fan of the change of the name | 12:46 |
TheMue | natefinch: I like the approach with an own type. it’s one of the strength of go to make this so simple. also for arguments or constants | 12:49 |
TheMue | natefinch: think of the time package | 12:49 |
natefinch | Yeah... I regret not using it on lumberjack | 12:49 |
natefinch | oh well.... can't change it now, I'd break the API. | 12:49 |
natefinch | It's not so bad for lumberjack, because it's something you only ever set up once in your project, so you only have to remember it's megabytes that one time. | 12:50 |
TheMue | +1 | 12:50 |
natefinch | TheMue: what do you think about the log size and number of backups for the log rotation? It seemed like a reasonable config, but I don't know what people really expect. 100M, to me, is nice, because if you get something spamming the logs, you have half a chance at being able to go back far enough to find the root cause before all the spam.... and yet, 100MB in this day and age shouldn't be a hardship even on relatively small cloud disks. | 12:52 |
natefinch | jam: curious to hear your thoughts on the max size of the logs, too. Note, this is for the machine-n and unit-n logs, not the all machines log, which can be much much bigger (due to being, you know, everything) | 12:53 |
TheMue | natefinch: size is ok to me, only maybe more than just one generation, but I’m not sure | 12:55 |
TheMue | natefinch: standard is often 5, but for apps with less log changes, so the files are pretty small | 12:56 |
axw | natefinch: is there a way to get lumberjack to always backup the first log file? it may have useful information about the agent's initialisation | 13:25 |
axw | e.g. the agent version/compiler | 13:25 |
perrito666 | aghh I hate race conditions | 13:40 |
natefinch | heh | 13:40 |
natefinch | sorry, was out for a bit, electrician arrived | 13:41 |
natefinch | axw: there's no way to keep around the first file for forever... we could do that ourselves easily enough, though | 13:41 |
natefinch | axw: just copy the log file to unit-0-init.log after initialization is done | 13:42 |
natefinch | axw: lumberjack won't touch files it doesn't generate itself | 13:44 |
natefinch | (unless you happen to make a file that exactly matches origFilename-lumberjackTimestamp.origExt | 13:46 |
jamespage | I have a question with regards to relation state - when does data that a remote service has set on a relation become visible to related services? is it at the point of execution of the -changed hook on the remote side or as soon as the data is set on the local side? | 13:54 |
perrito666 | hey natefinch ericsnow can I be a couple of mins late? I am uploading some tools to a vm and my bw up is completely used | 14:00 |
ericsnow | perrito666: I'm good :) | 14:01 |
perrito666 | no you are not :p | 14:01 |
ericsnow | haha | 14:01 |
mgz | ericsnow: poke about reviewboard plugin, have you got it in a state I can pick up and carry on with? | 14:03 |
ericsnow | mgz: I take it you didn't get my email 10 days ago :) | 14:03 |
ericsnow | mgz: yeah, it's ready (though I'm sure we'll find something that needs tweaking) | 14:04 |
mgz | ericsnow: thanks | 14:04 |
ericsnow | mgz: np | 14:04 |
mgz | (I have also been stuck in other things since the sprint mostly...) | 14:05 |
ericsnow | mgz: I figured :) | 14:07 |
ericsnow | voidspace: how's the trip? | 14:14 |
perrito666 | ericsnow: natefinch I am all set, going in | 14:16 |
voidspace | ericsnow: hey, cool - just sitting in lightning talks now | 14:19 |
voidspace | ericsnow: you all coping without me? | 14:19 |
voidspace | current talk title "I hate testing"... | 14:19 |
ericsnow | voidspace: Gary Bernhardt? | 14:20 |
voidspace | ericsnow: hah, no - some European guy I don't know | 14:20 |
voidspace | speaking about code mutation for testing - checking that when you change your code randomly some test fails to tell you | 14:21 |
ericsnow | natefinch: you coming? | 14:22 |
wwitzel3 | ericsnow, perrito666, natefinch: we are working on the inbound to juju mapping document for tosca, can't join standup right now. | 14:29 |
natefinch | wwitzel3: no problem. Don't worry about the standup this week | 14:29 |
wwitzel3 | natefinch: I'll send another update to the team EOD tomorrow that will cover the second half of the week. | 14:31 |
perrito666 | wwitzel3: send pics :p | 14:31 |
ericsnow | wwitzel3: hey, don't sound so excited! | 14:33 |
mbruzek | Hello natefinch I just got done with the Juju Cross team meeting. The Landscape team has a bug blocking their release that someone on core should look at. https://bugs.launchpad.net/juju-core/+bug/1318366 | 14:52 |
_mup_ | Bug #1318366: jujud on state server panic misses transaction in queue <cloud-installer> <landscape> <orange-box> <panic> <performance> <sm15k> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1318366> | 14:52 |
perrito666 | unrelated question, any of you has a thinkpad x1 carbon? | 14:53 |
mbruzek | natefinch, fwereade, since Alexis is out who can I speak with for core issues? | 14:53 |
natefinch | mbruzek: you can talk to me... sorry I missed the cross team meeting | 15:02 |
mbruzek | natefinch, That is OK you were dealing with electricians | 15:03 |
perrito666 | sinzui: ping | 15:03 |
sinzui | hi perrito666 | 15:03 |
mbruzek | natefinch, This seems pretty important to the Landscape team and no one on juju except for John has made a comment. | 15:03 |
perrito666 | sinzui: hi, sorry to bother, the repo for CI tests changed? | 15:03 |
perrito666 | I am bzr pull-ing on my copy and it is complaining | 15:04 |
natefinch | mbruzek: sorry... much of the Juju Core leads are indisposed this week. William and Tim are on a sprint, and Alexis is on vacation. | 15:05 |
sinzui | perrito666, lp:juju-ci-tools and lp:juju-release-tools | 15:05 |
* perrito666 scratches his head | 15:05 | |
perrito666 | thank you | 15:05 |
natefinch | mbruzek: I'll look into that error today. Seems like mongo is having trouble for some reason | 15:05 |
sinzui | perrito666, note that the restore tests are being run in HP this week because of the ec2 provisioning issue. So the tests are more likely to pass. They certainly always pass for 1.20.3 | 15:06 |
mbruzek | natefinch, OK great, thank you for taking a look. I believe this is blocking the Landscape team so if you could leave a comment that would be appreciated. | 15:06 |
perrito666 | sinzui: I am setting up an ec machine to run the tests from there to ec2, so I can trigger the error | 15:06 |
sinzui | +1 | 15:07 |
natefinch | mbruzek: done | 15:10 |
natefinch | heh.... This landscape bug has a panic traceback that's over a million lines long. Looks like about 18000 goroutines | 15:14 |
mgz | amusing | 15:16 |
natefinch | this is why exceptions are somewhat less useful in multi-threaded environments :) | 15:17 |
mgz | not really, but dumping every stack when we're also leaking routines certainly does lead to silliness | 15:17 |
natefinch | didn't realize we're leaking goroutines. that seems... serious | 15:18 |
natefinch | dpb1: I hear you're the one to talk to about log files for this landscape bug... do you have the full log file? The one James Page posted is truncated, and missing a lot of info. | 15:18 |
natefinch | https://bugs.launchpad.net/juju-core/+bug/1318366 | 15:18 |
_mup_ | Bug #1318366: jujud on state server panic misses transaction in queue <cloud-installer> <landscape> <orange-box> <panic> <performance> <sm15k> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1318366> | 15:18 |
dpb1 | natefinch: I'm looking into it. sec | 15:20 |
sparkiegeek | have been pointed here for giving info about bug 1318366? | 15:25 |
_mup_ | Bug #1318366: jujud on state server panic misses transaction in queue <cloud-installer> <landscape> <orange-box> <panic> <performance> <sm15k> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1318366> | 15:25 |
sparkiegeek | sorry, wasn't connected, don't have the backscroll | 15:25 |
sparkiegeek | what's needed? | 15:25 |
sparkiegeek | natefinch: https://chinstrap.canonical.com/~acollard/maxwell-logs.tgz | 15:29 |
bodie_ | anyone familiar with the cmd.out.Write utility for marshaling JSON? | 15:29 |
natefinch | sparkiegeek: thanks | 15:30 |
sparkiegeek | natefinch: as stated on the bug, it might contain sensitive data, hence the canonical.com link :) | 15:31 |
natefinch | sparkiegeek: understood | 15:31 |
natefinch | niemeyer: we're getting "rescanned document misses transaction in queue" from mgo during a big deployment. Relevant stack trace: http://pastebin.ubuntu.com/7848252/ | 15:51 |
natefinch | niemeyer: I'm not really sure what the panic means | 15:51 |
natefinch | niemeyer: relevant bug: https://launchpad.net/bugs/1318366 | 15:52 |
_mup_ | Bug #1318366: jujud on state server panic misses transaction in queue <cloud-installer> <landscape> <orange-box> <panic> <performance> <sm15k> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1318366> | 15:53 |
niemeyer | natefinch: I'm getting into a meeting right now, but we can catch up afterwards | 16:00 |
natefinch | niemeyer: ok thanks | 16:01 |
sparkiegeek | natefinch: did you read the duplicate bug? https://bugs.launchpad.net/juju-core/+bug/1318044/comments/1 especially | 16:18 |
_mup_ | Bug #1318044: panic in mgo transaction lib <cloud-installer> <landscape> <mongodb> <race-condition> <juju-core:Triaged> <https://launchpad.net/bugs/1318044> | 16:18 |
natefinch | sparkiegeek: ahh, no, thanks | 16:31 |
perrito666 | natefinch: http://bashrcgenerator.com/ <-- this might come in handi | 16:33 |
perrito666 | handy | 16:34 |
natefinch | cute | 16:35 |
natefinch | what's the trick to rebase a change on top of stuff merged from upstream? | 16:48 |
natefinch | mgz, gsamfira ^^ | 16:48 |
gsamfira | natefinch: if your branch is based on "master" for example, you need to commit your changes, do a git checkout master, git pull, git checkout <my branch>, git rebase master | 16:49 |
gsamfira | your changes will be replayed on top of current master | 16:50 |
gsamfira | if any conflicts come up, you will be guided through resolving them | 16:50 |
gsamfira | make a backup of your local repo just in case ;) | 16:50 |
katco | https://github.com/juju/juju/pull/398 | 16:51 |
mgz | pretty much, it's not really as dangerous as it seems either, actually destroying underlying revisions is hard | 16:53 |
gsamfira | mgz: depends. If they are continuous, you could just do | 16:54 |
gsamfira | git rebase --interactive HEAD~2 | 16:54 |
gsamfira | and rebase the last 2 commits | 16:54 |
gsamfira | put a squash instead of a "pick" | 16:54 |
gsamfira | and they will be squashed into one | 16:54 |
niemeyer | natefinch: I'm here | 16:55 |
niemeyer | natefinch, gsamfira: You don't _have_ to rebase though.. merging works fine for that too | 16:56 |
gsamfira | niemeyer: true, rebasing just gives you a continuous history. If you start on a new feature, your commit messages will be at the top, instead of spread out throughout the commit log | 16:57 |
gsamfira | also squashing is helpful if you don't want to push 50+ commit messages for one big feature | 16:58 |
gsamfira | stuff like: changed based on feedback | 16:58 |
gsamfira | or "oops...forgot to close function" | 16:58 |
gsamfira | :) | 16:58 |
perrito666 | sinzui: getting "upgrade in progress - Juju functionality is limited" this is a behaviour you see too? | 17:00 |
niemeyer | natefinch: Can I please have the full log for that bug? | 17:01 |
natefinch | niemeyer: https://chinstrap.canonical.com/~acollard/maxwell-logs.tgz | 17:07 |
natefinch | brb, I have to drop off the babysitter. back in ~10 mins | 17:08 |
sinzui | perrito666, I have never seen that with the restore tests. | 17:09 |
perrito666 | sinzui: btw, I see what you meant about aws being unreliable these days | 17:10 |
niemeyer | natefinch, jamespage: Where are the db logs there? | 17:29 |
niemeyer | natefinch, jamespage: Ah, they seem to be unfiltered in syslog | 17:30 |
natefinch | back | 17:32 |
sparkiegeek | niemeyer: heh, those logs are from me. Separate case of the bug | 17:33 |
niemeyer | sparkiegeek: Hm? | 17:34 |
sparkiegeek | niemeyer: I think there's a lingering "rsyslog not configured to put mongodb in a separate file" bug in there somewhere too | 17:34 |
niemeyer | sparkiegeek: Which logs are from you? | 17:34 |
sparkiegeek | niemeyer: maxwell-logs.tgz is mine :) | 17:34 |
natefinch | sparkiegeek: ahh, I didn't realize these logs were separate from the ones in the bug | 17:34 |
niemeyer | sparkiegeek: Ah, you are the one facing the issue then? | 17:34 |
niemeyer | sparkiegeek: Or is that unrelated? | 17:35 |
sparkiegeek | niemeyer: initial bug was filed by jamespage (twice, I closed one as dupe). He hit it on a SeaMicro 15k which was put down to "lots of units, complex relations, something blew up" | 17:35 |
sparkiegeek | niemeyer: I hit it a couple of days ago on an OrangeBox, grabbed all the logs I could because I was aware of previous issue not being solved due to lack of logs | 17:36 |
sparkiegeek | niemeyer: make sense? | 17:36 |
natefinch | sparkiegeek: thank you for grabbing all that info. It makes it so much easier to be sure we're not missing something. Sometimes it can be like pulling teeth to get a full set of logs. | 17:37 |
niemeyer | sparkiegeek: Not yet.. | 17:37 |
niemeyer | sparkiegeek: The question is, what's the actual bug these logs are related to? | 17:38 |
sparkiegeek | niemeyer: bug 1318366 | 17:38 |
_mup_ | Bug #1318366: jujud on state server panic misses transaction in queue <cloud-installer> <landscape> <orange-box> <panic> <performance> <sm15k> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1318366> | 17:38 |
bodie_ | question for anyone who has input on the topic | 17:38 |
bodie_ | config-get returns a map of config values for a service or unit | 17:39 |
niemeyer | sparkiegeek: Okay, thanks | 17:39 |
bodie_ | you can also index it by key, as in config-get maxRam | 17:39 |
sparkiegeek | niemeyer: be sure to check the dupe bug too | 17:39 |
niemeyer | sparkiegeek: I don't think I understand what you meant originally by "separate case of the bug" then | 17:39 |
niemeyer | sparkiegeek: Which one? | 17:39 |
bodie_ | (I'll let you guys finish :) ) | 17:39 |
sparkiegeek | niemeyer: https://bugs.launchpad.net/juju-core/+bug/1318044 | 17:40 |
_mup_ | Bug #1318044: panic in mgo transaction lib <cloud-installer> <landscape> <mongodb> <race-condition> <juju-core:Triaged> <https://launchpad.net/bugs/1318044> | 17:40 |
niemeyer | sparkiegeek: Thanks, I've exchanged ideas on this one a couple of months ago already | 17:40 |
sparkiegeek | niemeyer: perhaps "separate instance of the bug" is clearer? I mean, I hit it separately from James and the logs are from that, not from James's instance | 17:40 |
niemeyer | sparkiegeek: Do you still have that system running, or can you reproduce the problem easily? | 17:41 |
sparkiegeek | niemeyer: no, and no | 17:41 |
niemeyer | sparkiegeek: Ah, got it | 17:41 |
niemeyer | sparkiegeek, jamespage: Okay, so I'll restate my original comment from #1318044: please grab a dump of the database when you see it again | 17:41 |
_mup_ | Bug #1318044: panic in mgo transaction lib <cloud-installer> <landscape> <mongodb> <race-condition> <juju-core:Triaged> <https://launchpad.net/bugs/1318044> | 17:41 |
niemeyer | sparkiegeek: The logs were the other pending detail, and you nailed it. Thanks | 17:42 |
sparkiegeek | niemeyer: just for future reference, how do I grab a dump of the database? | 17:43 |
sparkiegeek | niemeyer: the thing that jumped out at me from those logs was how often mongo seemed to be getting killed | 17:43 |
niemeyer | sparkiegeek: There's a command, mongodump | 17:43 |
sparkiegeek | niemeyer: ok :) thanks | 17:43 |
niemeyer | sparkiegeek: Just point it to the localhost juju-db address and it'll spit everything out | 17:44 |
niemeyer | sparkiegeek: It's awkward indeed.. and it's a SIGTERM, so it's being explicitly terminate | 17:45 |
niemeyer | d | 17:45 |
niemeyer | Why is it being so? | 17:46 |
sparkiegeek | niemeyer: right, that led me to upstart logs | 17:46 |
sparkiegeek | but I hit a dead end at that point | 17:47 |
sparkiegeek | is mongodb upstart controlled? | 17:47 |
natefinch | sparkiegeek: for juju, yes | 17:47 |
niemeyer | sparkiegeek: upstart/juju-db.log | 17:47 |
sparkiegeek | natefinch: juju-db.conf right? | 17:47 |
niemeyer | Nothing interesting | 17:47 |
natefinch | yeah | 17:48 |
sparkiegeek | yeah, but the noise there implies multiple restarts? That was my take away | 17:48 |
niemeyer | Just to be clear, it shouldn't really matter for this one instance.. it might be shut down thousands of times and it shouldn't matter | 17:48 |
niemeyer | But it's a problem worth understanding in itself | 17:48 |
natefinch | yeah.... wish there were timestamps in there | 17:48 |
sparkiegeek | natefinch: +1 | 17:49 |
niemeyer | natefinch: Where? | 17:50 |
natefinch | niemeyer: in the juju-db upstart log | 17:50 |
niemeyer | natefinch: Why? It's all boring messages.. | 17:50 |
natefinch | niemeyer: because then you can tell if they were 1 second apart or 1 day apart. Can make a big difference | 17:51 |
niemeyer | natefinch: I mean, not that I disagree.. it's very strange that these messages are not timestamped.. but that's yet another bug, this time on upstart :) | 17:51 |
natefinch | heh yep | 17:51 |
niemeyer | natefinch: We can tell how far apart they are.. just look at syslog | 17:51 |
niemeyer | They're under 30 seconds apart | 17:52 |
natefinch | right, I was going there next. Just was surprised that any log written anywhere was done without a timestamp | 17:52 |
perrito666 | aghh found it apt-get --option Dpkg::Options::=--force-confold --assume-yes install vlan | 17:54 |
sparkiegeek | the amusing thing is that the logs for mongo appear to have two timestamps :) | 17:55 |
sparkiegeek | take from column A, add to column B ;) | 17:55 |
natefinch | oh yeah... huh. didn't notice | 17:55 |
natefinch | very weird that it's getting sigterm'd like 21 seconds after it starts up | 17:58 |
sparkiegeek | natefinch: niemeyer: I'm about to EOD. I think I've given you guys all the help I can? | 17:59 |
natefinch | sparkiegeek: the logs are awesome, thanks again. Really makes all the difference. | 18:01 |
sparkiegeek | natefinch: np :) | 18:01 |
* perrito666 senses everybody forgot core team meeting | 18:02 | |
natefinch | mgz, TheMue, jam: team meeting? | 18:09 |
bodie_ | https://github.com/juju/juju/pull/399 should be ready to merge | 18:27 |
perrito666 | is anyone here familiar with networker? | 18:37 |
katco | just found this: https://chrome.google.com/webstore/detail/mnkacicafjlllhcedhhphhpapmdgjfbb | 18:50 |
katco | enables tweaking git diffs; side-by-side, ignore whitespace | 18:50 |
katco | here's github repository: https://github.com/KuiKui/Octosplit | 18:50 |
katco | firefox extension as well | 18:50 |
natefinch | katco: yeah, it sort of works | 19:07 |
katco | looks great with my sample size of 1 :) | 19:07 |
natefinch | it's poor man's side by side, and it mucks up if you expand stuff and then do side by side (or something like that) | 19:07 |
katco | ah. yeah i just went for the simple case. | 19:08 |
katco | does anyone know what the const aptAddRepositoryJujuStable is used for in juju/juju/provider/local/prereqs.go? i can't find any reference of it anywhere. | 19:19 |
natefinch | uh weird. must be old | 19:20 |
natefinch | I don't see any use of it either | 19:20 |
katco | can i delete it? | 19:20 |
katco | broken windows principle | 19:20 |
natefinch | right | 19:20 |
katco | i'm dealing with apt-getty schtuff atm | 19:21 |
natefinch | also, it someone is using it, it won't compile, so we'll figure it out | 19:21 |
katco | so i figure it fits | 19:21 |
natefinch | s/it/if/ | 19:21 |
natefinch | delete it, see what happens | 19:21 |
katco | k out it goes! | 19:21 |
katco | ty nate | 19:21 |
natefinch | katco: looks like the mongo code does the same add-apt-repository, but not using that constant... probably something missed in a refactor | 19:22 |
katco | ah | 19:22 |
natefinch | friggin' bzr | 19:49 |
natefinch | bzr: ERROR: Not a branch: "http://bazaar.launchpad.net/~maas-maintainers/gomaasapi/trunk/". | 19:49 |
perrito666 | perhaps it is not a branch ? | 19:50 |
* perrito666 hides | 19:50 | |
natefinch | it definitely is a branch :) | 19:52 |
perrito666 | I'll take bzr's word on this one, sorry | 19:52 |
perrito666 | far from perfecting my go skills I am becoming even better at bash... | 19:53 |
perrito666 | sinzui: quick question | 19:54 |
perrito666 | do you think that restore should rather timeout trying to restore or rather fail after a few attempts knowing that if it kept trying it could have done it before timeout? | 19:54 |
natefinch | deleting my local launchpad.net directory and re-doing a go get fixes the problem | 19:55 |
sinzui | perrito666, I don't understand. My experience with juju in CI is that juju passes quickly and fails slowly. Increasing timeouts/waiting doesn't improve juju's chance of success | 19:57 |
perrito666 | sinzui: ok you indirectly answered me | 19:58 |
sinzui | oh? | 19:59 |
perrito666 | sinzui: you make me write, this is the deal | 19:59 |
* perrito666 writes | 19:59 | |
perrito666 | there is a worker that tries to add vlan support, for that it needs to install a package | 20:00 |
perrito666 | "sometimes®" its apt-get is not finished before our restore starts installing mongodb-clients | 20:00 |
perrito666 | that causes restore to fail | 20:00 |
sinzui | ah | 20:01 |
perrito666 | which sucks bc its a dumb reason to fail | 20:01 |
perrito666 | so, I can either retry apt-get until it works or timeout | 20:01 |
perrito666 | oooor | 20:01 |
perrito666 | I can retry apt-get a few times and then give up | 20:01 |
sinzui | perrito666, thank you for the explanation | 20:01 |
perrito666 | I guess the first option is the right one | 20:02 |
sinzui | perrito666, All apt ops in juju need to retry since it shares the system. I suppose we need to know how long is reasonable? | 20:02 |
sinzui | perrito666, we have a bootstrap timeout of 10 minutes by default? is it reasonable to use a configured timeout to "rebootstrap" the state-server for this case | 20:04 |
sinzui | bootstrap-timeout? | 20:04 |
perrito666 | sinzui: mm, at that point I am not sure how much time passed, I think it is reasonable to stay there as much as required | 20:05 |
* natefinch gets ready to run go test ./... on his windows VM | 20:05 | |
sinzui | perrito666, +1 | 20:06 |
katco | natefinch: good luck lol | 20:13 |
natefinch | heh... it causes like 100 "windows firewall has blocked..." windows to pop up | 20:15 |
perrito666 | natefinch: lol | 20:16 |
perrito666 | kill the fw if you want to live | 20:16 |
perrito666 | sinzui: the fix should go to trunk and then backported? | 20:50 |
perrito666 | I am not sure what is the process there | 20:50 |
sinzui | perrito666, +1 for trunk. I believe we use git patch to backport to stable. But we only need to do that if we believe stable is affected | 20:54 |
perrito666 | ok anyway to be sure Ill have to run this a few times | 20:55 |
sinzui | wallyworld, CI blessed bc568a6d as 1.20.2. No human intervention. It took 2.5 hours. I can start the release now | 21:20 |
davecheney | \o/ | 21:25 |
wallyworld | sinzui: great :-) i'd like to strongly suggest to stakeholders that we do internal testing prior to formal release | 21:26 |
sinzui | wallyworld, how would they do that. compile their own and upload-tools? | 21:27 |
wallyworld | can't they take the release you make and run that? | 21:27 |
wallyworld | before we publish | 21:28 |
sinzui | wallyworld, I can copy all the binaries that CI made | 21:28 |
wallyworld | several internal changes have been made to how mongo connections are handled to combat i/o timeouts and we need proper load testing on maas etc | 21:28 |
wallyworld | we need testing in the same scenarios that showed issues with 1.20.1 | 21:28 |
sinzui | wallyworld, I could build into a PPA, and if they reject it, we skip to 1.20.3, because packaging never lets you reuse a version | 21:28 |
wallyworld | that sounds ok to me | 21:29 |
wallyworld | i'd rather do the extra due diligence given the high profile of this release | 21:29 |
wallyworld | with mark s involved also etc | 21:29 |
sinzui | wallyworld, Everyone using either CI or unreleased packages need to use --upload-tools or make their own streams | 21:30 |
sinzui | wallyworld, CI is already building its next test version, so its streams cannot be used | 21:30 |
wallyworld | i guess we could ask them to use upload-tools. but we are really trying to be able to kill that "feature" | 21:30 |
automatemecolema | So I'm trying to deploy a bundle file on juju canvas and I get an error saying no bundle name provided thoughts? | 21:31 |
wallyworld | i'd like us to have a pre-release mechanism for internal testing | 21:31 |
wallyworld | by a wider audience of stakeholders | 21:31 |
sinzui | wallyworld, If we provide alternate streams in a few cpcs, the users need proxy or direct access | 21:31 |
sinzui | wallyworld, I understand, but dev built juju to ensure it panics when it cannot find itself in the streams... | 21:32 |
sinzui | wallyworld, CI's packages are always available for people to test, but they need to use --upload-tools | 21:33 |
wallyworld | so we need to have a mechanism to publish a "daily" set of tools | 21:33 |
sinzui | no | 21:33 |
wallyworld | as opposed to "release" | 21:33 |
wallyworld | like daily images | 21:33 |
sinzui | wallyworld, we build 3 1.20.2's today | 21:33 |
sinzui | I don't think I will be doing that this year | 21:34 |
wallyworld | ok, np. it was a thought bubble | 21:34 |
wallyworld | so long as we have a way we can tell internal stakeholders how to test a pre-release | 21:34 |
wallyworld | if upload-tools is needed, then so be it | 21:35 |
sinzui | wallyworld, building packages and publish streams is a tightly coupled proc. CI can do it for itself. It isn't trivial to build real packages and streams because we need to ensure every version exists forever and it is unique to ensure no tampering | 21:36 |
wallyworld | yes, agreed | 21:36 |
* sinzui had a meeting with Ubuntu today and they now understand streams and juju's lack of commitment for stable cli/api means backporting is hard | 21:37 | |
wallyworld | sinzui: we are guaranteeing a stable client api afaik | 21:37 |
wallyworld | starting with 1.18 in trusty | 21:38 |
wallyworld | we will ensure 1.18 clients are supported with future versions of juju | 21:38 |
wallyworld | we now support api versioning internally | 21:38 |
sinzui | wallyworld, so a user who deployed 1.18.1 with trusty a few months ago will get an ubuntu upgrade to 1.25.0 and CLI/API guarantees that they can talk to their env? | 21:39 |
wallyworld | yes, that's the plan unless i am totally wrong | 21:39 |
wallyworld | that's why it took a while to get 1.18 done, to get all the stuff ported across to using the api which we could keep stable | 21:40 |
sinzui | wallyworld, I know there are deprecated cli between 1.18.1 and 1.20-alpha1 | 21:40 |
wallyworld | i think AddMachine may be one such api | 21:40 |
wallyworld | there's a V2, but we will still support V1 | 21:40 |
sinzui | bootstrap --series changed its meaning | 21:40 |
sinzui | wallyworld, I will start a conversation on the list about this case. and the case is future clients always talking to older envs because users are getting automatic updates of the client | 21:42 |
wallyworld | ok. i think the param change in meaning we may be able to get away with, as people can change their scripts. the client will still talk to the server | 21:43 |
wallyworld | ie it was a change to the client | 21:43 |
wallyworld | but i can't recall the exact change made | 21:44 |
waigani | just came across this https://github.com/gocircuit/circuit. looks interesting | 21:45 |
wallyworld | sinzui: the idea is though that we ask ubuntu to backport juju to because we are supposed to support old and new clients and then we publish new backend binaries via simplestreams | 21:46 |
wallyworld | tools are tied to deployments/workloads, not ubuntu per se | 21:46 |
sinzui | wallyworld, I understand, I just happen to know that cli is still getting deprecated, so the 5 year commitment isn't in everyone's head | 21:47 |
sinzui | wallyworld, I have a few minutes left in my day. I can send an email asking for testers and provide a link to CI's packages. | 21:48 |
sinzui | I may have time to make streams from those packages. | 21:49 |
wallyworld | sinzui: thank you, appreciated. i think a good dose of internal testing is essential given we haven't been able to see the issues arising in the field | 21:56 |
wallyworld | cant wait to get our own maas for ci etc | 21:57 |
sinzui | wallyworld, I am now skeptical about getting our own maas now that it's clear there are many configurations used by stakeholders. I need many maases | 21:59 |
wallyworld | yeah :-( | 22:03 |
wallyworld | but we do need at least one set up with a reasonable workload to deploy | 22:03 |
thumper | wallyworld: hey | 22:10 |
thumper | FYI: https://bugs.launchpad.net/juju-core/+bug/1348386 | 22:10 |
_mup_ | Bug #1348386: lxc template fails to stop <clone> <lxc> <juju-core:Triaged> <https://launchpad.net/bugs/1348386> | 22:10 |
thumper | people seeing lxc templates not stop | 22:10 |
wallyworld | \o/ | 22:10 |
thumper | is because of this | 22:10 |
* wallyworld wonders if it is his fault before reading the bug | 22:11 | |
thumper | wallyworld: no, lxc | 22:11 |
thumper | lxc changed the meaning of a command line arg | 22:11 |
thumper | from a filename to a device | 22:11 |
wallyworld | thumper: ok, thanks for heads up, will miss the 1.20.2 release sadly | 22:11 |
perrito666 | fwereade: ping | 22:12 |
fwereade | perrito666, pong | 22:12 |
perrito666 | gnight/morning | 22:12 |
perrito666 | fwereade: I found out what is running apt-get | 22:12 |
perrito666 | it's networker trying to install the vlan module. | 22:13 |
fwereade | perrito666, and not acquiring the hook lock? please make it do so | 22:13 |
fwereade | bbs | 22:14 |
arosales | anyone available to help mbruzek debug https://bugs.launchpad.net/ubuntu/+source/gccgo-4.9/+bug/1304754 and https://bugs.launchpad.net/ubuntu/+source/juju-core/+bug/1347322 | 22:22 |
_mup_ | Bug #1304754: gccgo has issues when page size is not 4kB <ppc64el> <trusty> <gcc:Fix Released> <gcc-4.9 (Ubuntu):Fix Released> <gccgo-4.9 (Ubuntu):Invalid> <gcc-4.9 (Ubuntu Trusty):Invalid> <gccgo-4.9 (Ubuntu Trusty):In Progress by doko> <gcc-4.9 (Ubuntu Utopic):Fix Released> <gccgo-4.9 (Ubuntu Utopic):Invalid> <https://launchpad.net/bugs/1304754> | 22:22 |
_mup_ | Bug #1347322: juju ssh results in a panic: runtime error <ppc64el> <juju-core:Triaged> <juju-core (Ubuntu):Confirmed> <https://launchpad.net/bugs/1347322> | 22:22 |
arosales | specifically mbruzek needs assistance testing utopic gcc (gccgo-4.9_4.9.1-1ubuntu3_ppc64el.deb) on trusty | 22:23 |
arosales | wallyworld: thumper any folks available to help? | 22:25 |
mbruzek | I need some help compiling juju. Can someone help me with that? | 22:34 |
arosales | mbruzek: can your power machines reach github? | 22:36 |
mbruzek | arosales, sure. | 22:36 |
arosales | does gccgo-4.9_4.9.1-1ubuntu3_ppc64el.deb install cleanly on your system? | 22:37 |
mbruzek | arosales, I don't know how to get it to use the utopic version | 22:37 |
arosales | if so reading https://github.com/juju/juju should get you a juju binary built with that gcc version | 22:37 |
mbruzek | arosales, how can I get apt-get to give me the utopic version? | 22:39 |
arosales | I thought you may be able to get it from https://launchpad.net/~ubuntu-toolchain-r/+archive/ubuntu/ppa/+packages | 22:40 |
arosales | mbruzek: https://launchpad.net/ubuntu/utopic/ppc64el/gccgo-4.9/4.9.1-1ubuntu3 | 22:41 |
arosales | mbruzek: specifically http://launchpadlibrarian.net/180170369/gccgo-4.9_4.9.1-1ubuntu3_ppc64el.deb | 22:41 |
arosales | mbruzek: hopefully the deps don't complain | 22:42 |
wallyworld | arosales: sorry, was having breakfast | 22:43 |
wallyworld | is there anything i need to do in particular? | 22:44 |
arosales | wallyworld: mbruzek is working on https://bugs.launchpad.net/ubuntu/+source/juju-core/+bug/1347322 which is blocking the verification of https://bugs.launchpad.net/ubuntu/+source/gccgo-4.9/+bug/1304754 | 22:45 |
_mup_ | Bug #1347322: juju ssh results in a panic: runtime error <ppc64el> <juju-core:Triaged> <juju-core (Ubuntu):Confirmed> <https://launchpad.net/bugs/1347322> | 22:45 |
mbruzek | arosales, depend problems let me see if I can resolve them http://pastebin.ubuntu.com/7850100/ | 22:45 |
_mup_ | Bug #1304754: gccgo has issues when page size is not 4kB <ppc64el> <trusty> <gcc:Fix Released> <gcc-4.9 (Ubuntu):Fix Released> <gccgo-4.9 (Ubuntu):Invalid> <gcc-4.9 (Ubuntu Trusty):Invalid> <gccgo-4.9 (Ubuntu Trusty):In Progress by doko> <gcc-4.9 (Ubuntu Utopic):Fix Released> <gccgo-4.9 (Ubuntu Utopic):Invalid> <https://launchpad.net/bugs/1304754> | 22:45 |
arosales | wallyworld: currently mbruzek is trying to test utopic's gcc https://launchpad.net/ubuntu/utopic/ppc64el/gccgo-4.9/4.9.1-1ubuntu3 on ppc64el trusty | 22:45 |
arosales | wallyworld: the help needed is just getting juju built with https://launchpad.net/ubuntu/utopic/ppc64el/gccgo-4.9/4.9.1-1ubuntu3 | 22:46 |
arosales | wallyworld: so mbruzek is going to try to install the utopic gcc-go.deb manually on ppc64el trusty, and then try to issue, "go install -v github.com/juju/juju/..." to build juju | 22:47 |
thumper | arosales: sorry, sprinting | 22:47 |
arosales | wallyworld: is that the correct approach? | 22:47 |
arosales | thumper: understood | 22:47 |
waigani | menn0, fwereade: https://github.com/avelino/awesome-go | 22:47 |
waigani | fwereade: https://github.com/gocircuit/circuit | 22:48 |
wallyworld | arosales: yes, go install will produce the juju binaries. but that will work regardless of what compiler is installed | 22:48 |
wallyworld | ie the one without the fix or the one with | 22:48 |
arosales | wallyworld: ack just wanted to confirm that was the correct process | 22:48 |
wallyworld | np | 22:48 |
arosales | wallyworld: so mbruzek just needs to get the fixed gccgo onto the system and then build juju with "go install -v github.com/juju/juju/..." | 22:49 |
wallyworld | arosales: yes, so long as the $GOPATH/bin directory is in the path then you can easily run the compiled juju | 22:49 |
arosales | mbruzek: note you'll need to get juju source first. | 22:49 |
wallyworld | or else you will need to type the full path | 22:49 |
arosales | mbruzek: https://github.com/juju/juju#getting-juju | 22:49 |
wallyworld | and all the dependencies | 22:50 |
wallyworld | i think go get does that | 22:50 |
mbruzek | arosales, having dependency problem with that deb package. | 22:50 |
mbruzek | http://pastebin.ubuntu.com/7850125/ | 22:50 |
arosales | mbruzek: so try to follow https://github.com/juju/juju#setting-gopath up to "Using Juju" | 22:50 |
thumper | wallyworld: since we are busy, any chance katco could look at the lxc clone bug? | 22:51 |
thumper | wallyworld: the only tricky bit is supporting old lxc versions | 22:51 |
arosales | mbruzek: why are you running sudo apt-get install gccgo | 22:51 |
wallyworld | thumper: i had assumed you wanted us to look at it | 22:51 |
thumper | wallyworld: I didn't assume :-) asking is nicer | 22:51 |
mbruzek | arosales, to resolve unmet dependencies. I ran it without gccgo installed and got the same error | 22:51 |
arosales | mbruzek: so you enabled the utopic archive? | 22:52 |
mbruzek | arosales, no I downloaded it, but I thought the apt-get would resolve the depends for me | 22:52 |
wallyworld | thumper: i can target that bug to 1.20.3 | 22:52 |
thumper | wallyworld: sure | 22:53 |
arosales | mbruzek: try just a "dpkg -i gccgo-4.9_4.9.1-1ubuntu3_ppc64el.deb" | 22:53 |
thumper | wallyworld: not sure how to best support multiple versions | 22:53 |
wallyworld | yeah, need to look into it | 22:53 |
wallyworld | maybe it needs to version sniff first | 22:53 |
wallyworld | when container manager is started | 22:54 |
mattyw | davecheney, ping? | 22:55 |
wallyworld | arosales: i use synaptic, does all the dependency stuff in a nice gui :-) | 22:55 |
arosales | mbruzek: hmm your apt get seemed to get, "Unpacking gccgo-4.9 (4.9.1-1ubuntu3) over (4.9-20140406-0ubuntu1) ..." | 22:55 |
mattyw | davecheney, is this close to being landable? https://github.com/juju/juju/pull/378 | 22:55 |
davecheney | mattyw: one small fix for fwereade then will and | 22:56 |
davecheney | land | 22:56 |
davecheney | doing that now | 22:56 |
davecheney | hold your breath | 22:56 |
perrito666 | fwereade: sorry you mean datadir/lockdir/uniter-hook-execution ? | 22:56 |
mbruzek | arosales, http://pastebin.ubuntu.com/7850160/ | 22:56 |
mattyw | davecheney, you're a good person, thanks | 22:56 |
mbruzek | arosales, I get dependency problems with or without the gccgo installed | 22:57 |
mattyw | davecheney, we're going to make use of it in our envuser branch - do you think there's any point in having a (s *State) GetEnvironUUID() call that wraps the call to EnvironConfig? | 22:58 |
wallyworld | sinzui: thank you for 1.20.2 and email | 22:58 |
thumper | wallyworld: talked to hallyn about the bug, he said he'd look tomorrow as I asked about the best way to support 0.8 and 0.9+ | 23:11 |
wallyworld | thumper: great, thank you. is he able to update the bug? | 23:11 |
fwereade | perrito666, yeah, that should be the one | 23:16 |
perrito666 | fwereade: restore (old restore, the one failing) is a bash script run via ssh from a cmd plugin? | 23:17 |
fwereade | perrito666, yeah, think os | 23:17 |
perrito666 | ouch sorry ? not intentional | 23:18 |
fwereade | perrito666, I just can't type "so" | 23:18 |
perrito666 | I meant, the "?" was a typo | 23:19 |
tasdomas | trying to deploy on local provider gives me this: ERROR juju.networker utils.go:35 command "lsmod | grep -q 8021q || modprobe 8021q" failed (code: 1, stdout: , stderr: FATAL: Could not load /lib/modules/3.13.0-32-generic/modules.dep: No such file or directory | 23:26 |
thumper | wallyworld: I've updated the bug with a solution | 23:27 |
thumper | wallyworld: which is effectively, try using --console-log, and if the command fails with exit code 1, fall back to --console | 23:27 |
wallyworld | thumper: you're all over it | 23:27 |
thumper | wallyworld: but not actually fixing it :) | 23:27 |
wallyworld | close enough, you spoon fed the solution | 23:27 |
wallyworld | om nom nom | 23:28 |
davecheney | I made mongodb panic, where is my prize ? | 23:34 |
davecheney | http://paste.ubuntu.com/7850431/ | 23:34 |
fwereade | davecheney, congratulations, you get to fix it | 23:36 |
mattyw | anyone else seeing this test failure on trunk? http://paste.ubuntu.com/7850458/ | 23:38 |
davecheney | fwereade: shitter | 23:38 |
perrito666 | axw: defer ping | 23:38 |
wallyworld | rick_h__: ping | 23:40 |
mattyw | thumper, hopefully I've got the right idea here https://github.com/juju/juju/pull/401 | 23:44 |
wallyworld | fwereade: from what i can see, the charm store code does not attempt to use any configured http proxy settings, just goes straight out to the internet. is that your understanding also? if so, does that mean for private clouds, people are expected to pull down charms to a local repo and deploy from there? | 23:45 |
fwereade | wallyworld, oh, wtf | 23:45 |
fwereade | wallyworld, it is true that in general that is what people do | 23:45 |
wallyworld | func (s *CharmStore) get(url string) (resp *http.Response, err error) { | 23:45 |
wallyworld | req, err := http.NewRequest("GET", url, nil) | 23:46 |
fwereade | wallyworld, many people on private clouds specifically want their own charms and managing a local repo is a reasonable way to do that | 23:46 |
wallyworld | that should use any http proxy from env though | 23:46 |
fwereade | wallyworld, but -- I can totally believe that code got missed when we were trying to do the proxy stuff :/ | 23:46 |
wallyworld | fwereade: so the above will use an http proxy env var but not a juju configured http proxy | 23:47 |
fwereade | wallyworld, hmm | 23:48 |
fwereade | wallyworld, I don't think we do put the proxy stuff in the env for the machine agent do we | 23:48 |
wallyworld | not sure, thumper ? | 23:48 |
thumper | wallyworld: weird race condition for you: paste.ubuntu.com/7850518/ | 23:48 |
wallyworld | \o/ | 23:49 |
wallyworld | thumper: so, am i right in saying the charm store get above will fail if the user has configured a http proxy on via juju config (and doesn't have http proxy env var set) | 23:51 |
perrito666 | sinzui: did you by any chance have errors about something corrupting your .jenv file? | 23:51 |
wallyworld | s/on/only | 23:51 |
wallyworld | thumper: because (by design) any config http proxy setting is not put into the env vars of a machine agent? | 23:52 |
thumper | wallyworld: by design any http proxy config *IS* put into the env of the machine agent | 23:55 |
thumper | see MachineEnvironmentWorker | 23:55 |
wallyworld | thumper: ok, so why would the charm store http gets be failing? | 23:55 |
thumper | NFI | 23:55 |
wallyworld | ok, i'll ask them for more info, just wanted to ensure we were doing the right thing | 23:56 |
thumper | I think we are | 23:56 |
wallyworld | thumper: so, if they have http proxy set up in env.yaml, we expect to be able to juju deploy a charm even from inside a private network then | 23:57 |
wallyworld | since proxy is propagated to machine agent on state server | 23:57 |
thumper | as long as the http proxy setting is right :) | 23:57 |
wallyworld | yep | 23:57 |
wallyworld | thanks | 23:57 |
thumper | pretty sure we actually use this with our #is folks | 23:58 |
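
The proxy exchange at the end of the log (whether a plain `http.NewRequest` call ends up going through a proxy) can be illustrated with a small Go sketch. This is not the juju charm store code: `clientWithProxy`, `proxyFor`, and both example addresses are hypothetical names made up for illustration. In Go, `http.DefaultTransport` consults `http.ProxyFromEnvironment`, so requests sent through the default client pick up a proxy from the process environment; a proxy that lives only in configuration would have to be wired into a transport explicitly, for instance with `http.ProxyURL` as below.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// clientWithProxy is a hypothetical helper (not part of juju). It returns an
// HTTP client whose transport sends every request through proxyAddr,
// independent of any proxy variables in the process environment.
func clientWithProxy(proxyAddr string) (*http.Client, error) {
	proxyURL, err := url.Parse(proxyAddr)
	if err != nil {
		return nil, err
	}
	transport := &http.Transport{Proxy: http.ProxyURL(proxyURL)}
	return &http.Client{Transport: transport}, nil
}

// proxyFor reports which proxy the client's transport would choose for
// target. It only builds the request; nothing is sent over the network.
func proxyFor(client *http.Client, target string) (string, error) {
	req, err := http.NewRequest("GET", target, nil)
	if err != nil {
		return "", err
	}
	transport, ok := client.Transport.(*http.Transport)
	if !ok || transport.Proxy == nil {
		return "", nil // no explicit proxy configured on this client
	}
	proxyURL, err := transport.Proxy(req)
	if err != nil || proxyURL == nil {
		return "", err
	}
	return proxyURL.String(), nil
}

func main() {
	// Both addresses are placeholders, not real juju endpoints.
	client, err := clientWithProxy("http://proxy.example.com:3128")
	if err != nil {
		panic(err)
	}
	proxy, err := proxyFor(client, "https://charms.example.com/charm-info")
	if err != nil {
		panic(err)
	}
	fmt.Println("requests would go via:", proxy)
}
```

Whether the environment or an explicit transport is the better fit depends on where the setting lives: as thumper notes, juju is meant to put the configured proxy into the machine agent's environment, in which case the default client should already be enough.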