wallyworld_ | sinzui: i don't know mongo very well but those errors could be mongo just refusing to start for whatever reason :-( | 00:06 |
---|---|---|
sinzui | wallyworld_, that's what I see too. I think something has changed and we might need to adapt juju to it | 00:14 |
wwitzel3 | perrito666: no luck yet, the problem is nothing in that commit would seem to have a direct impact on backup and restore in this fashion. It is really odd. | 00:16 |
sinzui | wallyworld_, the current tip of juju is failing the same test | 00:16 |
wallyworld_ | sinzui: the other possibility - we set a mongo socket timeout of 21 sec, which is twice the mongo heartbeat interval. those socket connection errors occur about 30 after the sockets are opened. perhaps mongo is closing its client connections because it thinks they are unused | 00:16 |
wallyworld_ | the socket timeout was recently increased from 10s | 00:17 |
wallyworld_ | so if anything that should have made it more tolerant | 00:17 |
wallyworld_ | but i'm just guessing that the error is due to mongo socket closure, may not be | 00:18 |
sinzui | wallyworld_, I don't think recent juju changes are the problem. We know exactly which change the failure happened with https://github.com/juju/juju/commit/635a06380b2df20f829f205e45e2d3a7e9476d0d | 00:21 |
sinzui | I don't think that is relevant. | 00:21 |
wallyworld_ | sinzui: the socket timeout increase happened last week before that change | 00:22 |
wallyworld_ | i was trying to make sense of what the most relevant culprit may have been | 00:22 |
wallyworld_ | sinzui: but deploy to other clouds works | 00:23 |
sinzui | wallyworld_, only a few intermittent failures caused by timeouts or resources over the last week. This set rarely failed before a few hours ago. It is the only test on aws though that is bootstrapping with juju-mongodb like this | 00:24 |
sinzui | wallyworld_, yep all clouds and all the locals love bootstrapping | 00:24 |
wallyworld_ | so just aws | 00:24 |
sinzui | If I could boot utopic there, I bet it would fail the same way | 00:25 |
sinzui | precise doesn't fail...it has mongodb-server | 00:25 |
wallyworld_ | hmmm | 00:28 |
sinzui | The ami is two weeks old | 00:28 |
=== Guest70819 is now known as wallyworld | ||
=== wallyworld is now known as Guest14539 | ||
=== Guest14539 is now known as wallyworld | ||
=== wallyworld is now known as Guest41478 | ||
=== Guest41478 is now known as wallyworld | ||
wallyworld | sinzui: i just bootstrapped an amazon env on trusty from my local machine | 01:31 |
sinzui | wallyworld, CI is still trying. Look at this extraordinary effort http://juju-ci.vapour.ws:8080/job/aws-deploy-trusty-amd64/643/console | 01:33 |
wallyworld | wow | 01:33 |
sinzui | wallyworld, I could kill the job, see if it restarts and completes...if so then I think aws was definitely doing something bad | 01:33 |
wallyworld | sinzui: i wonder if it is region related | 01:34 |
sinzui | hmm | 01:34 |
wallyworld | 2014-06-20 00:57:20 INFO juju.provider.ec2 ec2.go:643 started instance "i-193ae232" in "us-east-1c" | 01:34 |
wallyworld | that is what just worked for me | 01:34 |
wallyworld | it picked az c | 01:34 |
wallyworld | after it was rejected by a and b as being congested | 01:34 |
sinzui | wallyworld, us-east-1a | 01:34 |
sinzui | well, that is the one I left alive anyway | 01:35 |
wallyworld | 2014-06-20 00:57:17 INFO juju.provider.ec2 ec2.go:627 "us-east-1a" is constrained, trying another availability zone | 01:35 |
wallyworld | 2014-06-20 00:57:19 INFO juju.provider.ec2 ec2.go:627 "us-east-1b" is constrained, trying another availability zone | 01:35 |
wallyworld | 2014-06-20 00:57:20 INFO juju.provider.ec2 ec2.go:643 started instance "i-193ae232" in "us-east-1c" | 01:35 |
wallyworld | so i wasn't allowed to use a or b | 01:35 |
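The log lines above show the provider walking the availability zones in order and skipping the ones that report as constrained. A minimal sketch of that fallback loop follows. It is not the actual juju ec2 provider code: `startFunc`, `errConstrained`, `fakeStart` and the instance id `i-example` are hypothetical names made up for the illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errConstrained is a hypothetical sentinel for "this zone has no capacity".
var errConstrained = errors.New("zone is constrained")

// startFunc is a hypothetical stand-in for the provider call that starts an
// instance in one availability zone.
type startFunc func(zone string) (instanceID string, err error)

// startInFirstAvailableZone tries each zone in order. A constrained zone is
// skipped; any other error aborts immediately.
func startInFirstAvailableZone(zones []string, start startFunc) (id, zone string, err error) {
	for _, z := range zones {
		id, err := start(z)
		if errors.Is(err, errConstrained) {
			fmt.Printf("%q is constrained, trying another availability zone\n", z)
			continue
		}
		if err != nil {
			return "", "", err
		}
		return id, z, nil
	}
	return "", "", errors.New("all availability zones are constrained")
}

// fakeStart pretends that only us-east-1c has capacity.
func fakeStart(zone string) (string, error) {
	if zone == "us-east-1c" {
		return "i-example", nil
	}
	return "", errConstrained
}

func main() {
	zones := []string{"us-east-1a", "us-east-1b", "us-east-1c"}
	id, zone, err := startInFirstAvailableZone(zones, fakeStart)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("started instance %q in %q\n", id, zone)
}
```

With this shape, which zone is picked depends only on which zones are constrained at that moment, which would be consistent with both the local bootstrap and the CI job ending up in us-east-1c.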
sinzui | wallyworld, that one that is taking forever is us-east-1c | 01:35 |
wallyworld | hmmm ok, that's the one that worked for me | 01:36 |
sinzui | and it just failed. now the other aws jobs will start | 01:36 |
wallyworld | not sure what instance type it picked, hw is "hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M" | 01:36 |
wallyworld | m1.small maybe | 01:37 |
wallyworld | so i wonder why mine worked | 01:37 |
sinzui | wallyworld, We get m3.medium by bootstrapping with mem=2G | 01:37 |
wallyworld | i just used the default constraints | 01:38 |
wallyworld | i can try with mem=2G | 01:38 |
sinzui | did I count wrong? I see the test restarting | 01:38 |
sinzui | oh this is wrong, CI is only allowed to test this 5 times. It is now doing 7 | 01:39 |
perrito666 | wwitzel3: well tomorrow is a holiday here so I won't be here, this is your problem for a day :|, I think we need to find out what is re-writing agents.conf and in which documents is that info stored to see which one we are missing, my guess is that wallyworld's commit fixed something and that the fact that restore was working before was actually a bug | 01:39 |
perrito666 | we are missing an update somewhere, either peergrouper or instancepoller | 01:40 |
perrito666 | anyway, its late and I am very very tired | 01:40 |
wallyworld | perrito666: ok. is there info on the bug, i have not much idea what has been done etc | 01:40 |
wallyworld | eg how agents.conf is involved | 01:41 |
perrito666 | wallyworld: there is not much info, basically, we are doing some stuff in a bash script embedded in restore.go | 01:42 |
perrito666 | which is being undone | 01:42 |
wallyworld | ok, np | 01:42 |
perrito666 | that is the issue (in few words) | 01:42 |
wallyworld | and running the exact same steps by hand works? | 01:42 |
perrito666 | yup | 01:42 |
wallyworld | wtf | 01:42 |
wallyworld | which line does it fail at? | 01:43 |
perrito666 | wallyworld: none, that is what makes this wtf of an issue impossible to find | 01:43 |
perrito666 | the restore ends successfully | 01:43 |
wallyworld | hmmm, ok. i've not even looked at the restore script before so not sure how far i'll get | 01:44 |
perrito666 | and then when jujud-* starts, it all goes back as it was, courtesy of some worker | 01:44 |
wallyworld | i've got some other issues to deal with as well | 01:44 |
perrito666 | wallyworld: in any case wwitzel3 will be working on that too | 01:44 |
wallyworld | ok | 01:44 |
sinzui | wallyworld, something interesting is happening here. AWS doesn't see any sign of the two services being deployed | 01:44 |
perrito666 | wallyworld: this is the script https://github.com/juju/juju/blob/master/cmd/plugins/juju-restore/restore.go#L106 just in case you get inspired | 01:45 |
wallyworld | which services, the test ones? | 01:45 |
perrito666 | now gentlemen, I really need sleep | 01:45 |
perrito666 | cheers | 01:45 |
wallyworld | perrito666: ok,good night | 01:45 |
sinzui | did I mention that joyent manta went down for 5 hours yesterday? I had a power cut so it was difficult to report the outage | 01:46 |
wwitzel3 | perrito666: got it, have a good night | 01:46 |
sinzui | ah... | 01:46 |
sinzui | wallyworld, let's say the ec2 issue was region. this test suddenly came to life and may pass http://juju-ci.vapour.ws:8080/job/aws-deploy-trusty-amd64/644/console | 01:47 |
wallyworld | sinzui: i just bootstrapped with mem=2G, got same region/zone as before | 01:48 |
wallyworld | 2014-06-20 01:43:22 INFO juju.provider.ec2 ec2.go:627 "us-east-1a" is constrained, trying another availability zone | 01:48 |
wallyworld | 2014-06-20 01:43:24 INFO juju.provider.ec2 ec2.go:627 "us-east-1b" is constrained, trying another availability zone | 01:48 |
wallyworld | 2014-06-20 01:43:25 INFO juju.provider.ec2 ec2.go:643 started instance "i-4a528a61" in "us-east-1c" | 01:48 |
wallyworld | - i-4a528a61 | 01:48 |
wallyworld | and it worked | 01:48 |
sinzui | the test officially passed | 01:48 |
sinzui | And I was watching the moment it got unstalled | 01:49 |
sinzui | I didn't intervene | 01:49 |
wallyworld | so did the failing and passing tests use different regions / zones? | 01:49 |
sinzui | wallyworld, no, the test that passed and the one I kept alive were the same region | 01:50 |
sinzui | but I think your supposition of region or maybe bad machine were at fault | 01:50 |
wallyworld | maybe it is just mongo flakiness | 01:50 |
sinzui | wallyworld, the failed bootstraps, the two timeouts, and the pass were all us-east-1c | 01:51 |
wallyworld | \o/ | 01:52 |
sinzui | wallyworld, I don't believe mongo flakiness because there is no prehistory of it in the test | 01:52 |
wallyworld | just seems strange that mongo starts but then has connection errors | 01:52 |
sinzui | wallyworld, oh...I lie | 01:53 |
sinzui | wallyworld, there are two previous occurrences of this error in the last week. They happened twice in a row and resolved themselves in less than 30 minutes http://juju-ci.vapour.ws:8080/job/aws-deploy-trusty-amd64/614/console | 01:53 |
wallyworld | hmmmm | 01:54 |
sinzui | this case though is 4 hours long | 01:54 |
sinzui | anyway. Let's close this bug. maybe I need to walk away from CI and let it mind itself | 01:54 |
wallyworld | sinzui: would have been interesting to use placement to run the tests on a different region/az when the test fails in us-east-1 | 01:55 |
sinzui | hmm, well I think I can add such a test after the release, or before if we lose faith that we can make juju and CI happy in the next 16 hours | 01:56 |
wallyworld | ok | 01:56 |
sinzui | wallyworld, new info about joyent: https://bugs.launchpad.net/juju-core/+bug/1329123 | 02:08 |
_mup_ | Bug #1329123: status outage during upgrades in joyent, Juju is broken <joyent-provider> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1329123> | 02:08 |
wallyworld | sinzui: looking into that one now | 02:09 |
sinzui | I am looking for something like termination protection to keep this instance alive | 02:10 |
wallyworld | ok | 02:10 |
sinzui | wallyworld, I have an instance. I clobbered something on it to keep it alive when the test ended | 02:17 |
wallyworld | sinzui: ok, i'm bootstrapping a 1.18.4 env now and will see what the upgrade does | 02:18 |
wallyworld | far out joyent is slooooow | 02:32 |
sinzui | wallyworld, it is faster than azure | 02:39 |
wallyworld | surely you jest | 02:39 |
sinzui | as for Hp, it now varies between 9 minutes and 21 minutes | 02:39 |
wallyworld | yay cloud | 02:40 |
sinzui | azure is consistently 18 to 21 minutes | 02:40 |
sinzui | wallyworld, canonistack tests have been passing since I switched the test to pull tools from s3. I suspect swift is the principal cause of slowness and outright failures in canonistack | 02:42 |
wallyworld | sigh | 02:42 |
wallyworld | sinzui: so it seems on joyent the api server does not start again after an upgrade | 02:50 |
sinzui | :( | 02:50 |
wallyworld | i think mongo is ok, looking into it | 02:52 |
wallyworld | sinzui: so it's caused by joyent offering up machine addresses that juju doesn't know how to pick out a mongo replicaset peer from. juju's address processing changed in 1.19 and is not compatible with joyent | 03:59 |
sinzui | :( | 04:00 |
wallyworld | i'll see if i can figure out a fix | 04:00 |
wallyworld | i'm somewhat new to this part of juju | 04:00 |
sinzui | wallyworld, I'm pleased that we know why. I feel relieved that we have really triaged the issue | 04:01 |
wallyworld | yeah, me too | 04:01 |
wallyworld | sorry i had not enough time earlier to look | 04:02 |
wallyworld | sinzui: sort of argues against that upgrades are generic and can just be tested on one cloud, although i'm confused why a deployment worked as it should have had the same problem | 04:03 |
sinzui | wallyworld, yeah. abentley argued a different perspective that we certify upgrade works. We need to do it for each cloud to gather evidence that things work and that upgrade isn't different. | 04:04 |
sinzui | wallyworld, well maybe the deploy case works because juju is exploring the possible addresses. upgrade doesn't gather addresses. | 04:05 |
wallyworld | sinzui: there's not enough debugging to see the root cause, i'll add some and retest, but first i have to go and pick up my son as he finished school early today | 04:05 |
sinzui | wallyworld, All fine with me. I am going to sleep. | 04:06 |
wallyworld | ok, i'll send an update | 04:06 |
axw | wallyworld: I don't really understand this rsyslog stuff. I can manage to get it to work by changing the default ruleset, but I don't understand why that works... probably going to have to check with wwitzel3 and/or voidspace | 05:09 |
wallyworld | axw: fair enough, i don't fully grok it either | 05:10 |
wallyworld | maybe add a bug comment and we can ping them later | 05:10 |
marcoceppi | where should I be reporting bugs? | 05:21 |
axw | marcoceppi: launchpad | 05:21 |
marcoceppi | even for things like gomaasapi ? | 05:22 |
axw | that's not even on github yet is it? | 05:22 |
axw | I think launchpad too | 05:22 |
* marcoceppi has no idea | 05:22 | |
marcoceppi | cool | 05:22 |
axw | maybe not when it's on github. matters for juju for integration points, probably doesn't matter for libraries | 05:23 |
axw | wallyworld: is there anything else pressing I can look at? or just addresses/azure? | 05:30 |
wallyworld | axw: i've made progress with the joyent thing so yeah, just addresses/azure. let's try and wrap up the wip addresses stuff | 05:31 |
axw | ok. gonna write some tests now and propose part of it | 05:31 |
wallyworld | great | 05:41 |
jcw4 | clear | 06:13 |
jcw4 | derp | 06:13 |
davecheney | wallyworld: axw https://github.com/juju/juju/pull/135 | 06:41 |
axw | looking | 06:41 |
wallyworld | bah, beat me to it | 06:41 |
davecheney | wallyworld: the fix ? | 06:41 |
wallyworld | no, looking :-) | 06:42 |
wallyworld | axw: davecheney: might explain some of our mongo test issues | 06:46 |
* wallyworld handwaves | 06:47 | |
axw | maybe, not sure. it could cause sockets to not be released | 06:48 |
axw | davecheney: reviewed | 06:48 |
davecheney | bueno | 06:50 |
davecheney | axw: nice review | 06:50 |
davecheney | i don't know what the cost of leaking cursors is | 06:50 |
davecheney | or the rate we were leaking them | 06:50 |
davecheney | but given the usual nonsense we see with mongo | 06:50 |
davecheney | i'm confident to bet it wasn't helping | 06:50 |
voidspace | morning all | 06:51 |
axw | morning voidspace | 06:51 |
voidspace | o/ | 06:51 |
axw | voidspace: when you have a moment, can you please cast your eye over https://bugs.launchpad.net/juju-core/+bug/1332358 and my comment on it? | 06:53 |
_mup_ | Bug #1332358: creating a local environment stops the syslog (1.19.3) <local-provider> <logging> <juju-core:Triaged> <https://launchpad.net/bugs/1332358> | 06:53 |
wallyworld | axw: small one https://github.com/juju/juju/pull/136 | 06:54 |
axw | looking | 06:54 |
wallyworld | fixes last critical for 1.19.4 release | 06:55 |
voidspace | axw: ok | 06:55 |
voidspace | axw: ouch | 06:56 |
axw | voidspace: does that ruleset stuff mean something to you? | 06:56 |
voidspace | axw: yep | 06:56 |
voidspace | axw: I wrote the rulesets | 06:56 |
voidspace | axw: I can switch to looking at this | 06:57 |
voidspace | axw: but today is my last day for a week | 06:57 |
voidspace | axw: so maybe the best thing I can do is write up a quick explanation of the rulesets to help wayne or someone else debug it | 06:57 |
axw | voidspace: sounds good, thank you | 06:57 |
voidspace | axw: wayne did the rest of the rsyslog stuff (but not the ruleset) | 06:57 |
axw | ok | 06:57 |
wallyworld | voidspace: i've moved that bug to 1.19.4 so if there's any way to get it sorted before eod that would be great | 06:58 |
wallyworld | either you or wayne | 06:58 |
wallyworld | as it would be bad to ship 1.20 with that issue still there | 06:58 |
davecheney | axw: wallyworld i THINK i've got all the occurrences | 06:58 |
axw | wallyworld: what was setting blank addresses? | 06:58 |
davecheney | i've searched for them a bunch of different ways | 06:58 |
voidspace | axw: is the problem that with local provider, we kill the rsyslog configuration for anything else on the system? | 06:58 |
davecheney | i wanted to use the new pointer analysis feature in godoc 1.3 | 06:58 |
axw | davecheney: oracle? :) | 06:58 |
axw | heh | 06:58 |
wallyworld | axw: an old version of the instance poller | 06:58 |
davecheney | but there isn't enough memory in the world to process juju | 06:59 |
axw | wallyworld: ok | 06:59 |
wallyworld | i can't recall the exact bug | 06:59 |
voidspace | axw: Having a bug description with full sentences would help :-/ | 06:59 |
davecheney | the stdlib consumes 1.7gb on my machine | 06:59 |
wallyworld | but it rings a bell | 06:59 |
davecheney | axw: i could try the oracle | 06:59 |
axw | voidspace: yes, everything else stops it seems | 06:59 |
axw | davecheney: I have 16GB, I'll see if I can run it. | 07:00 |
voidspace | axw: and you can reproduce it? | 07:00 |
axw | voidspace: yes, trivially | 07:00 |
voidspace | axw: what are the repro steps | 07:00 |
axw | voidspace: bootstrap local provider, wait for rsyslog worker to kick in, /var/log/syslog stops showing other bits | 07:00 |
voidspace | ah | 07:00 |
axw | voidspace: I just did what's in the bug | 07:00 |
voidspace | hah, there are repro steps | 07:01 |
voidspace | axw: it's 8am here and I haven't had coffee :-) | 07:01 |
voidspace | so coffee | 07:01 |
axw | :) | 07:01 |
voidspace | axw: so I guess the problem is that I change the default ruleset to forward all logs to the state server | 07:01 |
voidspace | and the state server only logs juju messages and drops everything else | 07:01 |
voidspace | and on local provider the state server is "the machine" | 07:02 |
voidspace | so all other logs are dropped | 07:02 |
axw | it looked like they both do that though? | 07:02 |
axw | both rulesets I mean | 07:02 |
davecheney | axw: i also have a bunch of junk in my $GOPATH | 07:02 |
davecheney | which upset the analyser, it freaks on first error | 07:02 |
voidspace | axw: one should just forward messages (local) the other should actually log them (remote) | 07:02 |
voidspace | but coffee first | 07:03 |
axw | ok | 07:03 |
voidspace | (so "remote" ruleset is for logs received by TCP - and they should be logged. The "default ruleset" is "what to do with messages generated locally" and we want to forward those to all state servers - so that's the "local ruleset" which forwards messages.) | 07:04 |
voidspace | For local provider we need to be smarter and log messages that aren't juju ones | 07:04 |
voidspace | in fact we can do that for all providers | 07:04 |
voidspace | so it should be a simple change | 07:04 |
wallyworld | thanks davecheney | 07:05 |
voidspace | but I/we need to know how to specify "do what you would have done anyway" for the non juju messages | 07:05 |
voidspace | the remote ruleset just logs messages - which is why switching to remote by default works | 07:05 |
voidspace | but it means that log messages don't get broadcast to other state servers | 07:06 |
voidspace | so it isn't correct | 07:06 |
axw | voidspace: isn't that just for non-juju log messages? | 07:06 |
voidspace | we only want our log file to handle juju messages | 07:06 |
axw | wallyworld: why the "" check in bestAddressIndex too? | 07:06 |
voidspace | axw: the problem is that it is for non-juju log messages too | 07:07 |
voidspace | axw: but yes - we have a tag format for juju messages, so for all *non juju* ones we need to say "just process them as you would have done" | 07:07 |
voidspace | I don't know how to specify that | 07:07 |
wallyworld | axw: because somehow "" addresses still end up in there and i'm not sure how | 07:08 |
axw | nor do I | 07:08 |
voidspace | needs digging into | 07:08 |
voidspace | probably keeping a reference to the normal "default ruleset" before replacing it | 07:08 |
wallyworld | voidspace: axw: there's a ~ syntax i think | 07:08 |
voidspace | or something | 07:08 |
voidspace | yeah ~ means "stop processing" | 07:08 |
wallyworld | or something like that | 07:08 |
voidspace | we only want to do that for juju ones | 07:08 |
voidspace | but we're also replacing the default ruleset, so the old processing rules might not even be used | 07:09 |
voidspace | there must be a way to do it, just needs looking up and trying... | 07:09 |
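The routing described above boils down to one decision per locally generated message: juju messages are forwarded to the state servers and processing stops, everything else should get whatever the system would have done anyway. A small Go model of that decision follows. It is only a sketch of the intent, not rsyslog configuration; the `juju-` tag prefix and every name in it are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// jujuTagPrefix is an assumed tag prefix; the real tag format is not shown
// in the discussion.
const jujuTagPrefix = "juju-"

type message struct {
	tag  string
	text string
}

type action int

const (
	// forwardAndStop: send to the state servers, then stop processing.
	forwardAndStop action = iota
	// systemDefault: leave the message to the pre-existing rules.
	systemDefault
)

// route models the desired local ruleset: only juju messages are forwarded,
// and nothing else is dropped.
func route(m message) action {
	if strings.HasPrefix(m.tag, jujuTagPrefix) {
		return forwardAndStop
	}
	return systemDefault
}

func main() {
	msgs := []message{
		{tag: "juju-machine-0", text: "worker started"},
		{tag: "kernel", text: "usb device connected"},
	}
	for _, m := range msgs {
		fmt.Printf("%s -> %d\n", m.tag, route(m))
	}
}
```

The bug being discussed corresponds to a `route` that returns `forwardAndStop` for every message, so on the local provider the non-juju messages never reach /var/log/syslog.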
wallyworld | i'm sure between you and wayne you'll figure it out :-) | 07:09 |
voidspace | heh, thanks | 07:09 |
voidspace | wallyworld: I'll get wwitzel3 up to speed when he comes online if I haven't already fixed it | 07:10 |
wallyworld | voidspace: awesome, thanks. | 07:10 |
wallyworld | voidspace: i'll mark it as in progress to you so if curtis looks he'll know it's being worked on, and you can assign to wayne to finish off if need be | 07:11 |
wallyworld | voidspace: what's your lp id? | 07:12 |
wallyworld | never mind | 07:14 |
wallyworld | found it | 07:14 |
davecheney | ? github.com/juju/juju/rpc/rpcreflect [no test files] | 07:19 |
davecheney | ok github.com/juju/juju/state 370.446s | 07:19 |
davecheney | ok github.com/juju/juju/state/api 24.052s | 07:19 |
davecheney | what the f has happened to the state package | 07:19 |
davecheney | it was 156 seconds a few days ago | 07:19 |
axw | hm, I just ran it and it took 115.711s | 07:19 |
davecheney | ok, what the f is wrong with the bot ? | 07:19 |
axw | wallyworld: the real problem is in the joyent provider | 07:22 |
axw | instance.go, Addresses method | 07:22 |
axw | it's initialising a slice with len, and then appending to it | 07:22 |
wallyworld | let me look | 07:22 |
wallyworld | so it is | 07:23 |
wallyworld | i'll fix | 07:23 |
wallyworld | thanks | 07:23 |
axw | np | 07:23 |
wallyworld | we did also have that instance poller issue so i jumped to conclusions | 07:23 |
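The slice problem axw describes is a common Go mistake: `make([]T, n)` creates n zero-value elements, so a later `append` adds after them rather than filling them in. With strings, that leaves empty "" entries at the front, which would match the blank addresses seen earlier. The sketch below is not the joyent provider code; the function names and sample addresses are hypothetical.

```go
package main

import "fmt"

// buggyAddresses mimics the bug: the slice is created with a length, so it
// already holds len(ips) empty strings before anything is appended.
func buggyAddresses(ips []string) []string {
	addrs := make([]string, len(ips))
	for _, ip := range ips {
		addrs = append(addrs, ip)
	}
	return addrs
}

// fixedAddresses creates the slice with length 0 and capacity len(ips), so
// append fills it from the start without reallocating.
func fixedAddresses(ips []string) []string {
	addrs := make([]string, 0, len(ips))
	for _, ip := range ips {
		addrs = append(addrs, ip)
	}
	return addrs
}

func main() {
	ips := []string{"10.0.0.1", "192.0.2.7"}
	fmt.Printf("%q\n", buggyAddresses(ips)) // ["" "" "10.0.0.1" "192.0.2.7"]
	fmt.Printf("%q\n", fixedAddresses(ips)) // ["10.0.0.1" "192.0.2.7"]
}
```

The alternative fix is to keep `make([]string, len(ips))` and assign by index (`addrs[i] = ip`) instead of appending.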
* wallyworld taps fingers impatiently waiting for joyent env to start up | 07:32 | |
axw | I thought it was meant to be fast | 07:32 |
wallyworld | not when the instance has to do the apt dance | 07:33 |
davecheney | how come you guys rewrite the apt mirrors ? | 07:36 |
davecheney | are the ec2 mirrors busted ? | 07:36 |
axw | davecheney: dunno, that is inherited from CI I think. | 07:37 |
davecheney | it's probably not that important | 07:38 |
davecheney | 2mb downloaded | 07:38 |
davecheney | the real time is taken writing the data | 07:39 |
davecheney | is that using ephemeral storage ? | 07:39 |
davecheney | that instance | 07:39 |
wallyworld | axw: bollocks, just the provider fix without the other work still fails | 07:40 |
wallyworld | so i'll have to add back the other check | 07:40 |
axw | wallyworld: :( | 07:41 |
axw | davecheney: non capisco. is what using ephemeral storage? | 07:41 |
davecheney | the ec2 instance | 07:42 |
axw | it has some, it just shows up as /mnt | 07:42 |
davecheney | aww crap | 07:43 |
davecheney | ok, ignore my suggestion, won't work | 07:43 |
axw | davecheney: we use /mnt/tmp as $TMPDIR when running the tests | 07:43 |
rogpeppe1 | mgz: ping | 07:56 |
wallyworld | axw: i added your fix and retained one of mine (and removed the other) and it all works | 07:57 |
axw | wallyworld: cool. which one of yours did you have to keep? | 07:58 |
wallyworld | the merge addresses one | 07:58 |
axw | okey dokey. I guess that needs to be there cos it was buggered before | 07:58 |
wallyworld | that's my guess but i don't know 100% | 07:59 |
rogpeppe1 | because it looks like we're not moving to kr's godep tool that soon, i've just pushed a couple of updates to godeps which should make it a little more pleasant to use | 07:59 |
wallyworld | hard to find out for little gain | 07:59 |
axw | rogpeppe1: what changes? | 07:59 |
axw | wallyworld: no worries, LGTM | 07:59 |
wallyworld | ta | 07:59 |
rogpeppe1 | specifically, there's now a -f flag which, if set and an update fails, tries to fetch new changes from the repo before trying again | 08:00 |
davecheney | i did git pull upstream master and the conflicts were horrid | 08:00 |
davecheney | how do I undo that ? | 08:00 |
davecheney | i have no committed | 08:00 |
axw | rogpeppe1: nice, thank you | 08:00 |
davecheney | i have not committed | 08:00 |
rogpeppe1 | axw: also (actually this change has been around for a while) it tries to do as much work as possible in the face of failure | 08:00 |
rogpeppe1 | axw: so if one repo is dirty, it doesn't stop others being updated | 08:00 |
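The two behaviours rogpeppe1 describes (retry after a fetch when the fetch option is on, and keep going when one repo fails) can be sketched as below. This is not the godeps source: the `repo` type, its `update`/`fetch` hooks and the helper constructors are hypothetical, and only the control flow is meant to match the description.

```go
package main

import (
	"errors"
	"fmt"
)

// repo is a hypothetical abstraction over one dependency checkout.
type repo struct {
	name   string
	update func() error // move the checkout to the pinned revision
	fetch  func() error // pull new changes from the remote
}

// updateAll updates every repo. If an update fails and fetchOnFailure is
// set, it fetches and tries once more. A failure in one repo is recorded
// but does not stop the others from being updated.
func updateAll(repos []repo, fetchOnFailure bool) []error {
	var errs []error
	for _, r := range repos {
		err := r.update()
		if err != nil && fetchOnFailure {
			if ferr := r.fetch(); ferr == nil {
				err = r.update()
			}
		}
		if err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", r.name, err))
		}
	}
	return errs
}

// staleRepo fails to update until it has been fetched.
func staleRepo(name string) repo {
	fetched := false
	return repo{
		name: name,
		update: func() error {
			if !fetched {
				return errors.New("revision not found locally")
			}
			return nil
		},
		fetch: func() error { fetched = true; return nil },
	}
}

// dirtyRepo always fails to update, fetch or no fetch.
func dirtyRepo(name string) repo {
	return repo{
		name:   name,
		update: func() error { return errors.New("working tree is dirty") },
		fetch:  func() error { return nil },
	}
}

func main() {
	repos := []repo{dirtyRepo("a"), staleRepo("b")}
	for _, err := range updateAll(repos, true) {
		fmt.Println("failed:", err)
	}
}
```

Here the dirty repo is reported while the stale one is fetched and updated on the second attempt.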
axw | davecheney: "git reset HEAD" I think? | 08:00 |
axw | rogpeppe1: ah excellent | 08:01 |
davecheney | axw: nope | 08:01 |
davecheney | sorry | 08:01 |
axw | davecheney: "git checkout ." in the root. I dunno if there's a better way to do it. | 08:01 |
rogpeppe1 | axw: i'd appreciate it if you could try it out and tell me if it seems to work ok. i didn't have the energy to write tests :-) | 08:01 |
davecheney | axw: thanks | 08:01 |
axw | rogpeppe1: will do | 08:01 |
davecheney | that worked | 08:01 |
davecheney | git totally fucked the change to state/apiserver/root.go | 08:02 |
rogpeppe1 | davecheney: you might find it useful too FWIW | 08:02 |
davecheney | rogpeppe1: will do | 08:04 |
davecheney | much love for godeps | 08:04 |
rogpeppe1 | it's funny, i've been thinking for ages i couldn't make the time to do the auto-fetch stuff, and then it only took 30 mins to do :-) | 08:05 |
davecheney | #winning | 08:06 |
wallyworld | wtf | 08:09 |
wallyworld | Updating juju-core dependencies to the required versions. | 08:09 |
wallyworld | launchpad.net/godeps (download) | 08:09 |
wallyworld | # cd .; bzr branch https://launchpad.net/godeps /mnt/jenkinshome/jobs/github-merge-juju/workspace/tmp.PFiIQenwfH/RELEASE/src/launchpad.net/godeps | 08:09 |
wallyworld | bzr: ERROR: exceptions.RuntimeError: if we move self._source_infos, then we need to change all of the index pointers as well. | 08:09 |
wallyworld | Traceback (most recent call last): | 08:09 |
wallyworld | File "/usr/lib/python2.7/dist-packages/bzrlib/commands.py", line 920, in exception_to_return_code | 08:09 |
wallyworld | jenkins is busted | 08:09 |
wallyworld | i'll hit the retry button | 08:09 |
axw | rogpeppe1: ^^ did you break it? :) | 08:10 |
axw | we grab the latest godeps I think... | 08:10 |
rogpeppe1 | axw: it's quite possible, although i tried to make it so the current behaviour wouldn't change unless explicitly specified | 08:11 |
rogpeppe1 | axw: specifically, it doesn't try to fetch stuff unless the -f flag is given | 08:11 |
axw | okey dokey | 08:11 |
axw | hmm actually it looks like it's trying to branch godeps, not run it | 08:12 |
wallyworld | axw: i'm off to soccer, if my 2nd retry fails, and there's a fix to anything, can you type $$merge$$ for me again? | 08:12 |
axw | wallyworld: sure | 08:12 |
wallyworld | ta | 08:12 |
wallyworld | i'll be back later anyway, just want to get this sucker landed | 08:13 |
davecheney | so lp just had a lie down ? | 08:13 |
axw | maybe. I just branched it and it worked fine | 08:14 |
axw | on my machine | 08:14 |
rogpeppe1 | well, jenkins is still thinking about godeps... | 08:14 |
rogpeppe1 | hmm. *still* thinking. that's not a great sign. | 08:17 |
rogpeppe1 | same problem again | 08:18 |
rogpeppe1 | looks like it might be a bzr issue | 08:18 |
axw | :( | 08:18 |
axw | I'll jump on the machine and take a peek | 08:19 |
rogpeppe1 | axw: try making sure that the godeps repo dir is deleted | 08:19 |
rogpeppe1 | axw: that might make a difference, i suppose | 08:19 |
axw | it goes to a tempdir | 08:19 |
rogpeppe1 | axw: hmm | 08:19 |
rogpeppe1 | axw: it's perhaps indicative of something that it took so long to fail | 08:20 |
rogpeppe1 | i'll try fetching it from scratch myself | 08:20 |
rogpeppe1 | worked fine for me | 08:21 |
axw | rogpeppe1: works when I do it on the machine too | 08:22 |
axw | le sigh | 08:22 |
rogpeppe1 | axw: hrmph | 08:22 |
axw | rogpeppe1: did you do "go get" or "bzr branch"? | 08:24 |
rogpeppe1 | axw: go get | 08:24 |
rogpeppe1 | axw: which, i hope, is what the bot is doing too | 08:25 |
axw | yeah that works too | 08:25 |
rogpeppe1 | axw: in fact, i know it is | 08:25 |
axw | yep | 08:25 |
axw | it is | 08:25 |
rogpeppe1 | axw: oh yes, the other new feature in godeps is that it now understands package wildcards just like the go tool | 08:25 |
axw | rogpeppe1: ah nice, now I don't have to remember how to list the packages :) | 08:26 |
rogpeppe1 | axw: kudos to github.com/kisielk/gotool - which made it a 2-line change | 08:26 |
rogpeppe1 | axw: i could uncommit and push -f if we need to | 08:28 |
axw | rogpeppe1: just going to see if there are any env vars that are influencing this first | 08:33 |
rogpeppe1 | axw: good thought | 08:33 |
rogpeppe1 | fwereade: if you haven't seen it, i think you'll like this proposal: https://docs.google.com/document/d/1e8kOo3r51b2BWtTs_1uADIA5djfXhPT36s6eHVRIvaU/edit | 08:44 |
fwereade | rogpeppe1, is that the bundles one? haven't looked at it properly yet I'm afraid | 08:45 |
rogpeppe1 | fwereade: no, it's a go proposal | 08:45 |
rogpeppe1 | fwereade: but i'd appreciate your thoughts on the bundles thing too, if poss | 08:45 |
rogpeppe1 | fwereade: 'cos we're actually implementing it right now :-) | 08:46 |
fwereade | rogpeppe1, I read the first sentence and thought "woohoo!" | 08:46 |
rogpeppe1 | fwereade: :-) | 08:46 |
fwereade | rogpeppe1, that's awesome | 08:47 |
rogpeppe1 | fwereade: i think so too | 08:47 |
rogpeppe1 | fwereade: it means we can have a clear division in juju-core between internal and external APIs | 08:48 |
rogpeppe1 | fwereade: i think we could restructure it like that even before the go tool supports it actually | 08:48 |
fwereade | rogpeppe1, yeah, and we can make small packages without annoying you *too* much ;) | 08:48 |
rogpeppe1 | fwereade: :- | 08:49 |
rogpeppe1 | ) | 08:49 |
rogpeppe1 | axw: any joy? | 08:52 |
axw | rogpeppe1: nope :( sticking a "bzr branch" in there to see if it makes a difference | 08:52 |
axw | no joy... | 09:01 |
rogpeppe1 | axw: i've just retracted my latest commit and overwritten godeps/trunk | 09:04 |
rogpeppe1 | axw: let's see if it works again now... | 09:04 |
rogpeppe1 | if not, then we're stuffed :-) | 09:04 |
axw | rogpeppe1: thanks, I'll give it another kick | 09:04 |
axw | rogpeppe1: that did it | 09:06 |
rogpeppe1 | axw: fuck | 09:06 |
axw | :( | 09:07 |
rogpeppe1 | axw: what the hell was going on there then? | 09:07 |
axw | no idea, weird bzr internal badness | 09:07 |
rogpeppe1 | axw: i will try another time, but with a fresh commit (and slightly different program text, in case it's a weird sha clash or something) | 09:07 |
axw | ok | 09:08 |
rogpeppe1 | axw: pushed, with slightly different changes and commit message. | 09:11 |
axw | rogpeppe1: okey dokey. I'll stop the build and try it again | 09:11 |
rogpeppe1 | axw: great, thanks | 09:11 |
axw | does not look promising | 09:16 |
rogpeppe1 | axw: darn | 09:19 |
axw | rogpeppe1: would you mind backing it out for now, and we can take another look at it next week? | 09:23 |
rogpeppe1 | axw: ok :-\ | 09:23 |
axw | sorry. kinda need to get this fix in for 1.19.4 | 09:23 |
rogpeppe1 | axw: ok, done | 09:24 |
axw | thanks | 09:24 |
rogpeppe1 | mgz: you around today? | 09:30 |
=== julianwa is now known as julianwa_hungry | ||
=== vladk|offline is now known as vladk | ||
dimitern | fwereade, I've responded to comments and updated most of the networking doc, but there are a few open questions left | 10:41 |
dimitern | fwereade, like the ports collection, addresses on the machine doc, port on the address doc, and whether to expose subnet management commands | 10:42 |
dimitern | fwereade, also I'm a bit concerned about how to model ipv4 and ipv6 addresses on the same interface - as described we need 2 interface docs with the same name and different subnet ids (for 4 and 6) | 10:44 |
dimitern | fwereade, but we might have 2 subnetids on the interface doc instead - IPv4SubnetId and IPv6SubnetId | 10:44 |
dimitern | fwereade, thoughts? | 10:44 |
dimitern | TheMue, standup? | 10:51 |
TheMue | dimitern: so late already? coming | 10:51 |
voidspace | axw: wallyworld: ok - I have it so that local messages not originating from juju are no longer dropped | 11:10 |
voidspace | axw: wallyworld: now I need to test that juju messages still get through | 11:10 |
=== vladk is now known as vladk|offline | ||
fwereade | dimitern, cheers, I'll take a look at the doc | 11:35 |
fwereade | dimitern, I don't immediately have ideas about 4v6 on the same interface | 11:36 |
dimitern | fwereade, right, thanks | 11:39 |
=== psivaa is now known as psivaa-lunch | ||
wallyworld | voidspace: just got back from soccer, great to see you've figured it out. you're now the juju-core rsyslog expert :-) | 12:05 |
=== psivaa-lunch is now known as psivaa | ||
=== vladk|offline is now known as vladk | ||
bac | hi sinzui, i've been trying to test quickstart on utopic. the gui comes up fine but mediawiki and mysql (the only two i've tried so far) either error or never leave pending. since bug 1314686 is resolved i would've expected success. | 13:45 |
_mup_ | Bug #1314686: Please add support for utopic <packaging> <juju-core:Fix Released by wallyworld> <juju-core 1.18:Fix Released by wallyworld> <juju-core (Ubuntu):Fix Committed by racb> <https://launchpad.net/bugs/1314686> | 13:45 |
=== vladk is now known as vladk|offline | ||
rogpeppe1 | dimitern, fwereade, mgz: what do you think about making all the Snippet constants in the names package non-capturing? | 13:53 |
rogpeppe1 | we just tried to use the snippets to do some parsing, and they're not that useful if they've got capturing groups inside | 13:54 |
mgz | rogpeppe1: I'd favour that | 13:54 |
mgz | I use non-capturing by default | 13:54 |
dimitern | rogpeppe1, then you'll need to change the parsing not to depend on capture groups | 13:54 |
bodie_ | morning all | 13:54 |
rogpeppe1 | dimitern: yeah | 13:55 |
mgz | rogpeppe1: unrelated, can you push your godeps changes again in a sec when I ask, the bot is getting some bits sorted out | 13:55 |
rogpeppe1 | mgz: sure | 13:55 |
rogpeppe1 | mgz: do you know what might've been going on there? | 13:55 |
mgz | there seems to be a shared repo across lots of packages, that got a particular bug tickled, abentley is sorting it now we hope | 13:56 |
abentley | mgz: We are just working around the problem. We hope. | 13:58 |
ericsnow | perrito666: would https://github.com/juju/juju/pull/135 have had any impact on the restore (plugin) issue? | 14:03 |
natefinch | ericsnow: standup? Also perrito666 is on holiday | 14:03 |
wwitzel3 | ericsnow: perrito666 is out today, I'm looking at the backup and restore stuff | 14:03 |
ericsnow | natefinch: coming | 14:03 |
voidspace | wwitzel3: https://github.com/voidspace/juju/compare/local-syslog | 14:07 |
=== vladk|offline is now known as vladk | ||
rogpeppe1 | mgz, dimitern: https://github.com/juju/names/pull/9 | 14:18 |
rogpeppe1 | makes snippets non-capturing | 14:18 |
rogpeppe1 | review appreciated :-) | 14:19 |
dimitern | rogpeppe1, reviewed | 14:21 |
rogpeppe1 | dimitern: you're a star, ta! | 14:21 |
dimitern | :) | 14:22 |
voidspace | wwitzel3: https://bugs.launchpad.net/ubuntu/+source/rsyslog/+bug/479592 | 14:22 |
rogpeppe1 | merged! | 14:22 |
_mup_ | Bug #479592: rsyslog doesn't work with property filter 'startswith' <rsyslog (Ubuntu):Confirmed> <https://launchpad.net/bugs/479592> | 14:22 |
rogpeppe1 | who needs CI anyway :-) | 14:22 |
mgz | rogpeppe1: yeah, that looks fine | 14:23 |
voidspace | echo 'biggle again' | logger -p local0.notice -t juju-michael-local- | 14:28 |
voidspace | wwitzel3: ^^ | 14:28 |
TheMue | dimitern: your suspicion has been a good one. our infrastructure simply only expects strings | 14:29 |
dimitern | TheMue, so if we need to support binary values for service settings that's something to keep in mind :) | 14:31 |
TheMue | dimitern: exactly | 14:39 |
voidspace | natefinch: so I *think* I've fixed the logging issue, wwitzel3 and I are testing | 14:41 |
voidspace | natefinch: the trouble is that we need to verify that logging still works properly with HA and units | 14:42 |
voidspace | natefinch: and you can't test that with local | 14:43 |
voidspace | natefinch: and it takes a while to check | 14:43 |
voidspace | natefinch: with my patch it now works properly again on local | 14:43 |
=== andreas__ is now known as ahasenack | ||
voidspace | I basically just got rid of the local ruleset - we just add the rules that do the broadcasting and then drop the juju messages | 14:43 |
voidspace | no need for a new ruleset | 14:44 |
natefinch | voidspace: nice work | 14:53 |
voidspace | natefinch: well, we'll see | 14:53 |
voidspace | my machine has now bootstrapped | 14:53 |
=== tvansteenburgh1 is now known as tvansteenburgh | ||
=== hatch__ is now known as hatch | ||
=== vladk is now known as vladk|offline | ||
voidspace | wwitzel3: https://github.com/juju/juju/pull/137 | 15:14 |
voidspace | axw: wallyworld_ : https://github.com/juju/juju/pull/137 | 15:14 |
axw | voidspace: cool, thanks. so... those rules at the top go into the default ruleset? | 15:18 |
axw | now that there's no "local" one? | 15:18 |
alexisb | natefinch, perrito666 : where do we stand on lp 1332110? | 15:20 |
wwitzel3 | alexisb: perrito666 handed that off to me, he is out today (natl holiday) | 15:21 |
alexisb | wwitzel3, ah ok, do you have an update? | 15:22 |
wwitzel3 | alexisb: not fixed, I think we've isolated it to an issue with peergroup or instancepoller overwriting the conf, but not confirmed anything yet. | 15:22 |
alexisb | wwitzel3, ack, getting 1.19.4 out is important so if you need additional help/expertise let me know | 15:23 |
wwitzel3 | alexisb: thanks, will do | 15:24 |
=== vladk|offline is now known as vladk | ||
=== makyo_ is now known as Makyo | ||
voidspace | I'm off for the week, have fun without me next week | 16:07 |
natefinch | voidspace: have fun! | 16:07 |
ericsnow | voidspace: \o | 16:07 |
voidspace | thanks | 16:07 |
voidspace | o/ | 16:07 |
wwitzel3 | voidspace: safe travels | 16:07 |
wwitzel3 | natefinch: taking my car in | 16:11 |
natefinch | wwitzel3: ok | 16:12 |
rogpeppe1 | mgz: do you want me to push that godeps branch? | 16:14 |
mgz | rogpeppe1: yah, let's do it | 16:16 |
mgz | so we can undo again if the next one borks | 16:17 |
rogpeppe1 | mgz: ok, doing. i'll check you've got push rights so you can undo it, cos i'm just off for the w/e | 16:17 |
mgz | thanks | 16:18 |
rogpeppe1 | mgz: ok, you're added and i've pushed it... | 16:20 |
rogpeppe1 | mgz: now i'm off | 16:20 |
rogpeppe1 | happy weekends all | 16:20 |
=== vladk is now known as vladk|offline | ||
sinzui | what version of mongodb does juju require, 2.4.6? | 16:49 |
sinzui | I see 2.4.9 in ctools, but I have not ported it to juju stable ppa? | 16:50 |
sinzui | natefinch, ^ | 16:50 |
=== vladk|offline is now known as vladk | ||
natefinch | sinzui: 2.4.6 | 16:54 |
sinzui | thank you natefinch | 16:55 |
=== vladk is now known as vladk|offline | ||
=== vladk|offline is now known as vladk | ||
=== vladk is now known as vladk|offline | ||
=== vladk|offline is now known as vladk | ||
=== vladk is now known as vladk|offline | ||
natefinch | wwitzel3: you around? | 18:59 |
=== vladk|offline is now known as vladk | ||
wwitzel3 | natefinch: am now | 19:40 |
wwitzel3 | natefinch: an oil change and new plugs turned into $2500 and come get your car next week | 19:40 |
alexisb | wwitzel3, sucky | 19:42 |
bodie_ | 'lo alexisb :) | 19:45 |
bodie_ | wwitzel3, that is lame | 19:45 |
ericsnow | wwitzel3: ouch | 19:45 |
wwitzel3 | yeah :/ | 19:46 |
bodie_ | anyone have an idea how the canAccess stuff in apiserver works? | 19:46 |
bodie_ | I'm trying to figure out whether it's necessary to "canAccess" the Unit that Actions are going to, when I query an Action from State | 19:47 |
bodie_ | jcw4 / mgz / fwereade ? | 19:47 |
=== vladk is now known as vladk|offline | ||
bodie_ | sorry, state/apiserver in case that's too vague | 19:47 |
jcw4 | bodie_: I don't know but I can investigate also | 19:47 |
=== vladk|offline is now known as vladk | ||
=== vladk is now known as vladk|offline | ||
natefinch | wwitzel3: wow suck | 19:57 |
wwitzel3 | natefinch: I'm in moonstone | 19:57 |
bac | hi cmars | 20:02 |
alexisb | alrighty all, I am going to use friday afternoon to take down my system and do some much needed upgrades, if you need me call my cell or send me mail | 20:59 |
wwitzel3 | alexisb: have a good weekend, will update via the mailing list my progress on the backup and restore when I hit EOD | 20:59 |
alexisb | wwitzel3, thank you! | 21:00 |
alexisb | have a great weekend yourself | 21:00 |
bodie_ | oh good, my tests on master are passing again on my workstation | 22:14 |
bodie_ | wonder what happened | 22:14 |
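
The point raised at 13:53 about capturing groups inside reusable regex snippets can be sketched roughly as follows. This is only an illustration in Python, whose `(?:...)` non-capturing syntax is also accepted by Go's `regexp` package. The snippet strings and the `unit_pattern` helper are hypothetical and are not the actual constants from the names package.

```python
import re

# Hypothetical snippets for illustration only; NOT the real names-package constants.
SERVICE_CAPTURING = r"[a-z]+(-[a-z0-9]+)*"         # inner group captures
SERVICE_NON_CAPTURING = r"[a-z]+(?:-[a-z0-9]+)*"   # inner group does not capture
NUMBER = r"0|[1-9][0-9]*"


def unit_pattern(service_snippet):
    """Compose a larger pattern from a snippet (hypothetical helper).

    The caller wraps the two parts it cares about in its own groups:
    the service name and the unit number.
    """
    return re.compile("^(" + service_snippet + ")/(" + NUMBER + ")$")


with_inner = unit_pattern(SERVICE_CAPTURING).match("my-sql/3")
without_inner = unit_pattern(SERVICE_NON_CAPTURING).match("my-sql/3")

# The capturing snippet leaks an extra group, shifting the caller's indices.
print(with_inner.groups())
# The non-capturing snippet leaves only the groups the caller added.
print(without_inner.groups())
```

With the capturing snippet, the caller has to know about the snippet's internal groups to find the unit number. With the non-capturing one, group 2 is always the number, which is presumably why the snippets were awkward to reuse for parsing.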