/srv/irclogs.ubuntu.com/2014/04/11/#juju-dev.txt

thumpermachine-0: 2014-04-11 00:19:50 INFO juju.worker.instanceupdater updater.go:264 machine "0" has new addresses: [public:localhost local-cloud:10.0.3.1]00:20
thumperlocalhost is public?00:20
thumperalso...00:21
thumper$ juju destroy-environment local -y00:21
thumperERROR readdirent: no such file or directory00:21
thumperwhere did that start coming from?00:21
davecheneythumper: that is um, wrong00:27
davecheneyhow does that even happy00:27
davecheneyhappen00:27
davecheneyNewAddress requires you to give an address scope00:27
thumperdavecheney: axw landed something in the last day that changed it00:28
davecheneythumper: i've been thinking that Address.Scope should be private00:28
davecheneythen we can force every creation to go through a helper function00:28
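A minimal sketch of the idea davecheney describes here: keep the scope field unexported so that callers outside the package must go through a constructor. The names `Scope`, `ScopePublic`, `ScopeCloudLocal`, `Address` and this `NewAddress` signature are assumptions made for illustration, not juju-core's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// Scope is a hypothetical stand-in for an address scope.
type Scope string

const (
	ScopePublic     Scope = "public"
	ScopeCloudLocal Scope = "local-cloud"
)

// Address keeps its scope unexported. Code in other packages cannot set
// the field directly and has to call NewAddress instead.
type Address struct {
	Value string
	scope Scope
}

// NewAddress is the single creation helper; it rejects unknown scopes.
func NewAddress(value string, scope Scope) (Address, error) {
	switch scope {
	case ScopePublic, ScopeCloudLocal:
		return Address{Value: value, scope: scope}, nil
	}
	return Address{}, errors.New("address requires a known scope")
}

// Scope exposes the scope read-only.
func (a Address) Scope() Scope { return a.scope }

func main() {
	addr, err := NewAddress("10.0.3.1", ScopeCloudLocal)
	fmt.Println(addr.Value, addr.Scope(), err)

	// A missing scope is refused instead of silently defaulting.
	_, err = NewAddress("localhost", "")
	fmt.Println(err)
}
```

The protection only applies across package boundaries; code inside the same package can still build an `Address` literal directly.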
thumperdavecheney: chat with axw when he starts as he has done a lot of this00:31
thumperany one want to review the debug-log client hookup? https://codereview.appspot.com/8557004400:31
wallyworld__cmars: you got your lgtm, sorry about delay. i've been head down landing some critical 1.18 fixes00:33
cmarsnp, thanks wallyworld__00:33
wallyworld__thumper: suppose so00:34
stokachuif you remove juju-local juju-mongodb juju-core via apt-get with no bootstraped environment, the mongod process is still around00:40
stokachuis that intentional?00:40
stokachuthis is 1.18 on trusty00:41
thumperstokachu: depends, do you have mongodb-server package?00:46
stokachuthumper: just juju-mongodb00:46
stokachuand its pointing to /usr/lib/juju/bin/mongod00:47
thumperprobably unintentional00:47
stokachuok ill probably file a bug because if you dont kill that process subsequent bootstraps will fail00:48
thumperwallyworld__: I talked with rog last night about the error00:57
wallyworld__ok00:58
thumperwallyworld__: he suggested that I change the connection error to a more generic NotSupported error00:58
wallyworld__agree that's better00:58
* wallyworld__ doesn't like inconsistency00:58
* thumper nods00:58
thumperI have found another problem though00:58
thumperbut it is existing and elsewhere00:59
thumperI'll land this then fix the bug00:59
wallyworld__s/CodeIsNotImplemented/IsNotSupported :-)00:59
thumperwell...01:00
thumpernot implemented has a different meaning to not supported01:00
thumpernot implemented implies that one day you might01:00
thumperbut yes, agree in general01:00
wallyworld__sure, but we are using this as a mechanism to delete running against older api servers01:00
wallyworld__detect01:01
wallyworld__and falling back to 1dot16foo()01:01
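The detect-and-fall-back pattern discussed above could look roughly like the following. This is a sketch under assumptions: `errNotSupported`, `client`, `newCall` and `legacyCall` are hypothetical names, and the real API client and error codes in juju-core may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotSupported is a hypothetical sentinel for "the server does not
// offer this call", e.g. when talking to an older API server.
var errNotSupported = errors.New("not supported")

// client is a hypothetical API client.
type client struct {
	serverHasNewCall bool
}

// newCall simulates a call that only newer servers support.
func (c client) newCall() (string, error) {
	if !c.serverHasNewCall {
		return "", errNotSupported
	}
	return "result from new API", nil
}

// legacyCall simulates the older code path kept for compatibility.
func (c client) legacyCall() (string, error) {
	return "result from legacy fallback", nil
}

// fetch tries the new call first and falls back only when the server
// reports that the call is not supported; other errors are returned.
func fetch(c client) (string, error) {
	res, err := c.newCall()
	if errors.Is(err, errNotSupported) {
		return c.legacyCall()
	}
	return res, err
}

func main() {
	fmt.Println(fetch(client{serverHasNewCall: true}))
	fmt.Println(fetch(client{serverHasNewCall: false}))
}
```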
wallyworld__sinzui: you'll see the email, but i got into 1.18 both john's fixes, for scp and downgrades. hopefully that will allow CI to work again01:02
thumperwallyworld__: do you want me to just use not implemented?01:07
thumperI'd be ok with that01:07
wallyworld__thumper: nah, let's go with the new error and promise hand on heart to port to using it everywhere appropriate :-)01:08
thumperfft01:08
wallyworld__the 1dot16 fallbacks will be disappearing anyway01:08
thumperlike that'll happen01:08
thumperyeah, I also renamed it fallback from 1.16 to 1.1801:08
sinzuiwallyworld__, I am hopeful that lp:juju-core/1.18 r2267 will pass. The azure-deploy test is very ill. I think we need to review both the test and azure cloud itself01:13
wallyworld__ok, we can ask axw for input there perhaps01:14
axw?01:14
axwazure is bad on 1.18?01:14
wallyworld__axw: sinzui says there are issues with the azure CI test not working01:14
wallyworld__i haven't looked yet, but perhaps we need to review what's being tested and how01:15
wallyworld__to see where the issue is01:15
axwis there a bug I can look at?01:15
axwor build failure01:15
axwCI build01:15
wallyworld__http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/azure-deploy/01:16
axwta01:16
sinzuiwallyworld__, I revert the changes that were made to help it pass01:16
wallyworld__thanks for looking, i'm flat out right now landing stuff01:16
wallyworld__sinzui: you mean the changes with regard to downgrades and scp changes?01:17
sinzuiwallyworld__, axw. We increased the timeouts and tried to change the tools-metadata-url from stable to testing to help tests pass. The efforts didn't help01:17
wallyworld__i saw the metadata url bug01:17
axwsinzui: what's going on? it's just hanging there?01:17
sinzuiwallyworld__, the downgrades restoration fixes 1.18.01:17
wallyworld__\o/01:17
wallyworld__sinzui: i'm porting to trunk now, conflict to solve but should be landed soon01:18
sinzuiIn trunk no env will upgrade within 30 minutes01:18
axwI see the last one failed in scp, but the current one is just stuck on bootstrap?01:18
sinzuiThe scp is secondary, though it also acts as a compatibility test01:18
axw2014-04-11 01:20:06 ERROR juju.cmd supercommand.go:300 charm not found in "/var/lib/jenkins/repository": local:precise/dummy-source01:20
axwwat01:20
sinzuiaxw we tried replacing mysql and wordpress charms with charms that let us test just juju01:21
sinzuiaxw, we started on it earlier this week, then rushed it into use when we hoped to remove the many had starts that both of those charms have01:22
sinzuithe next run will use mysql and wordpress01:22
sinzuiIn theory charm-testing is responsible for making sure that mysql and wordpress are sane.01:23
axwok01:24
axwsinzui: which location do the azure tests run in?01:36
sinzuiUS West...the only location that has ever worked01:37
axwheh ok01:37
davecheneysinzui: thumper http://paste.ubuntu.com/7233131/01:41
davecheneygetting closer01:41
davecheneythe ones marked failed > 600s01:41
davecheneyare actually timeouts01:41
davecheneyif the builder was faster (building gccgo at the same time)01:42
davecheneythey may have passed01:42
sinzuiwell done01:42
davecheneySHIT01:42
davecheneyis jesse around01:42
davecheneyprovider/common is still complaining about missing keys01:43
axwsinzui: I'm not convinced azure is entirely healthy, I'm getting errors I haven't seen before from the API01:47
axwe.g. the storage API refusing connections01:47
axwthere's scheduled maintenance tomorrow on West US, I wonder if they started early on the "safe" parts01:48
axwthough now I've said that, azure-deploy just passed01:49
sinzuiaxw, I see a lot of errors. The CI often retries. Azure is actually very healthy today. It only failed 4 out of 24 hours01:49
axwok01:50
axwheh :)01:50
sinzuiBlessed: lp:juju-core/1.18 r226701:50
axwI have to backport another fix, but that's good to know01:51
axwwallyworld__: I'm about to backport the network addresses change - you're not already doing that, right?02:00
wallyworld__axw: no, i've been dealing with the other critical issues stopping CI from working02:01
wallyworld__just porting to trunk now from 1.1802:01
axwnps, thanks02:01
axwah ok02:01
wallyworld__sinzui: does that mean you worked around the scp issues fixed by r2268?02:02
sinzuiwallyworld__, we didn't succeed. It was a lower priority than getting a pass02:03
wallyworld__i'm not 100% sure, but maybe r2268 allows the original scripts to work?02:03
sinzuiwallyworld__, the scp issue only comes into play when the test fail02:03
wallyworld__ah ok02:04
wallyworld__anyways, it's merged into 1.18 and heading to trunk so if the tests fail again.... :-)02:04
sinzuiwallyworld__, http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/canonistack-deploy-devel/ didn't capture logs from the last tests02:04
wallyworld__yeah, that's r226702:05
wallyworld__r2268 should hopefully capture the logs02:05
sinzuiCI's normal and fallback credentials are broken in canonistack. we cannot test it until the swift authentication issue is fixed. So every canonistack test will fail02:06
wallyworld__:-(02:06
thumperanyone? https://codereview.appspot.com/85570045/02:49
sinzuiBlessed: lp:juju-core/1.18 r227002:52
waiganithumper: I put some garbage into jujud/agent.go then ran cmd/juju/bootstrap_test.go TestTest: passes locally, fails on vm02:53
thumper\o/02:53
thumperwallyworld__: don't worry about the juju command there...02:53
thumperwaigani: sorry that was for you02:53
thumperwaigani: but check the others02:53
thumperwaigani: although I do challenge that02:54
thumperwaigani: if you delete ~/go/bin/jujud and rerun the test02:54
thumperwaigani: what happens?02:54
waiganithumper: passes02:55
thumperwaigani: what is the error on the vm?02:55
waiganihttps://bugs.launchpad.net/juju-core/+bug/130476702:55
_mup_Bug #1304767: test failure in cmd/juju <ppc64el> <juju-core:In Progress by waigani> <https://launchpad.net/bugs/1304767>02:55
waiganithumper: I'll paste full error, hang on02:55
thumperwaigani: also, run a make check to run all the tests02:56
waiganithumper: http://pastebin.ubuntu.com/7233312/02:56
thumperwith a broken jujud you'll get a lot of failures02:56
thumperthat I don't think should happen02:56
thumperheh02:57
thumperok, to test locally02:57
thumperwe should do something like this...02:57
waiganiis it something to do with there not being a candidate match for 14.04:ppc ?02:57
thumperPatchValue(&version.Current.Series, "magic")02:57
thumpermake the series be something it can never be anywhere02:58
thumperand you'll hit the same problem locally02:58
thumper(I think)(02:58
waiganithumper: still passes02:58
thumperhmm...02:59
thumperit is something like that...02:59
thumperpatch it before the start of setup02:59
waiganiah okay02:59
thumperthe conn suite will bootstrap in setup02:59
waiganithumper: still passes03:00
thumperit is something like that....03:01
thumperplay a bit and break it03:01
waiganithumper: I'll keep debugging on the vm - slowly but surely03:01
waiganithumper: okay, I'm good at breaking things :)03:02
waiganiI have to run and get my girl now03:02
waiganiI've cornered the bug, with a bit more testing I should get it tonight.03:03
thumperwaigani: found it03:08
thumperwe patch version.Current, but don't patch arch.HostArch03:08
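A rough illustration of why patching only one of the two values is not enough. The variables `currentArch` and `hostArch` are hypothetical stand-ins for the arch in `version.Current` and the value from `arch.HostArch`, and `patch` is a simplified imitation of a PatchValue-style test helper, not the real one.

```go
package main

import "fmt"

// Hypothetical stand-ins, initialised as if the tests ran on the ppc64 VM.
var (
	currentArch = "ppc64" // stands in for the arch in version.Current
	hostArch    = "ppc64" // stands in for the value from arch.HostArch
)

// patch overwrites *dest and returns a function that restores the old
// value, in the spirit of a PatchValue test helper.
func patch(dest *string, value string) (restore func()) {
	old := *dest
	*dest = value
	return func() { *dest = old }
}

// fixtureMatches reports whether test fixtures built for fixtureArch
// would be selected; in this sketch both values must agree with it.
func fixtureMatches(fixtureArch string) bool {
	return currentArch == fixtureArch && hostArch == fixtureArch
}

func main() {
	// Patching only the version leaves the host arch pointing at
	// "this machine", so the amd64 fixtures are not matched.
	restoreCurrent := patch(&currentArch, "amd64")
	fmt.Println("version patched only:", fixtureMatches("amd64"))

	// Patching both makes the test independent of the machine it runs on.
	restoreHost := patch(&hostArch, "amd64")
	fmt.Println("both patched:", fixtureMatches("amd64"))

	restoreHost()
	restoreCurrent()
}
```

On an amd64 development machine the unpatched host value happens to match the fixtures, which would explain a test that passes locally and fails on the VM.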
sinzuiwallyworld__, https://bugs.launchpad.net/juju-core/+bug/1302205 bothers me. It is critical, but is not targeted to the current milestone. I can see you working on it. I want to move it to 1.19.0 to reflect how it is being treated03:12
_mup_Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1302205>03:12
thumpersinzui: what is the plan for releasing 1.19?03:12
wallyworld__sinzui: sure03:13
sinzuithumper, if 1.19.0 gets a passing rev today/tomorrow I might release it. lp:juju-core r2587  was the last time 1.19.0 passed.03:14
sinzui1.18.1 has two passes today, so I think it is more likely to be released03:15
* thumper nods03:16
cmarssinzui, i've marked 1303880 and 1295140 as fix-committed. fixes for these have landed in trunk & 1.1803:28
sinzuithank you cmars03:28
axwwallyworld__: backport for addresses fix, can you please review? https://codereview.appspot.com/86720043/03:46
wallyworld__sure03:46
wallyworld__axw: done, looks like a nice change03:51
axwwallyworld__: cheers03:52
wallyworld__davecheney: is this one you've seen before? http://pastebin.ubuntu.com/7233429/04:02
wallyworld__dannf: you still online?04:03
sinzuithumper: Maybe you can point a developer to work on bug 1306212. CI will fail trunk because of it04:06
_mup_Bug #1306212: juju bootstrap fails with local provider <bootstrap> <ci> <local-provider> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1306212>04:06
wallyworld__sinzui: make him fix it himself :-P04:11
wallyworld__except, it may be HA related04:12
sinzuiwallyworld__, I don't make fixes after midnight. I do dangerous things when I am tired04:12
wallyworld__sinzui: no, not you, thumper :-)04:12
wallyworld__i didn't mean you to fix it04:12
wallyworld__you need to go to bed!04:12
sinzuiI am very skeptical that any of the trunk upgrades will pass. They are all paused and approaching the 30 minutes timeout04:13
wallyworld__:-(04:13
wallyworld__deployments look ok04:13
wallyworld__at least 1.18 works so that can be released soon04:14
sinzuiI already downloaded the 1.18.1 tarball and win installer. I could release that tomorrow04:16
wallyworld__sinzui: fix for bug 1303735 landing now though04:17
_mup_Bug #1303735: public-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Fix Committed by axwalk> <juju-core 1.18:In Progress by axwalk> <https://launchpad.net/bugs/1303735>04:17
sinzui\o/04:17
wallyworld__we should not release without that one04:17
wallyworld__the only other one is backup failure04:18
wallyworld__meh for 18.1 i reckon04:18
sinzuiwallyworld__, There is not much point fixing that backup bug because the restore bug is targeted for 1.19.0.04:26
wallyworld__that's what i reckon also04:26
wallyworld__hence, "meh" :-)04:26
wallyworld__so we can retarget and release as soon as the final fix lands04:27
wallyworld__sinzui: fuck off to bed, it's way too late for you to be online04:27
wallyworld__:-)04:27
sinzuiI have more email explain what blocks 1.19.0 and then I will sleep04:28
davecheneywallyworld__: yes04:44
davecheneyit turns out that there is a bug in gccgo04:44
davecheneythings are only given small stacks04:44
davecheneyand when they run off the end, they crash04:44
wallyworld__boom04:45
davecheneydunno why some ppc machines are ok04:45
davecheneyi'm building a test gccgo now04:45
wallyworld__ok04:45
davecheneymwhudson: ping04:45
wallyworld__davecheney: btw, arm64 works now \o/04:45
davecheneywallyworld__: working on a compiler fix04:46
davecheneywallyworld__: it's actually a build option04:46
wallyworld__the issue i had was bug 1274558, transparent huge pages, there is a workaround04:47
davecheneyeek04:47
davecheneyautoconf is incorrectly turning on the -fsplit-stack option for gccgo04:47
wallyworld__the issue caused go executables to hang etc etc04:47
davecheneywhich is really only an intel thing04:47
davecheneywallyworld__: can you grab the last dozen lines of dmesg04:50
davecheneyon that system04:50
wallyworld__ok04:50
=== vladk|offline is now known as vladk
wallyworld__davecheney: this is not a juju bug is it https://bugs.launchpad.net/juju-core/+bug/130025605:46
_mup_Bug #1300256: juju status results in unexpected fault address on arm64 using/ local provider <arm64> <hs-arm64> <local-provider> <status> <juju-core:Triaged> <https://launchpad.net/bugs/1300256>05:46
=== vladk is now known as vladk|offline
=== liam_ is now known as Guest89383
dimiternmorning all07:22
dimiternfwereade, I hope you're feeling better today :), just as a reminder - my two VLAN CLs https://codereview.appspot.com/86010044/ and https://codereview.appspot.com/86600043/ when you can have a look07:23
* dimitern is away for 1h07:24
=== vladk|offline is now known as vladk
rogpeppemornin' all08:07
=== vladk is now known as vladk|offline
waiganidavecheney: turned out to be a one liner: https://codereview.appspot.com/86760043/08:37
waiganimorning rogpeppe08:37
rogpeppewaigani: hiya08:38
waiganiExciting Friday night here, coding in the kitchen08:38
axwdo we have a way to version API methods yet?08:47
rogpeppeaxw: renaming them is the only way currently08:55
rogpeppeaxw: or going the backwards-compatible route08:55
axwrogpeppe: ok, thanks08:55
rogpeppeaxw: there are a couple of directions i'd like to go with it, but it's a sensitive issue08:56
rogpeppeha, i wondered what i'd done to break the cmd/jujud tests, but this happens in trunk (many times): [LOG] 36.85478 ERROR juju worker: exited "rsyslog": failed to write rsyslog certificates: cannot create temp file: open /var/log/juju/rsyslog-cert.pem824698669: no such file or directory08:57
rogpeppethis is not great08:57
axwrogpeppe: did you get a chance to look at the EnsureAvailability CL?09:02
=== vladk|offline is now known as vladk
rogpeppeaxw: i'm about half way through the review09:02
axwok09:02
rogpeppeaxw: sorry, will get back to it!09:02
axwrogpeppe: nps09:03
rogpeppeaxw: there was one thing i was having difficulty working out09:04
rogpeppeaxw: i couldn't quite see whether it always preserves the invariant that the number of wants-vote machines is always odd09:04
axwit should always be the parameter, and that is checked at the top09:05
axwif one is taken out of VotingMachineIds, another will replace it09:06
axwrogpeppe: I must admit it was a little mindbendy to me, so don't take my word for it ;)09:07
mwhudsondavecheney: pong09:20
davecheneywaigani: nice fix09:24
davecheneyyeah, that was what I suspected09:24
davecheneywe weren't specifying, so it fell through to 'this machine'09:24
davecheneywhich didn't match the fixtures09:24
davecheneywaigani: interesting line wrapping, are you using a c64 ?09:25
waiganidavecheney: line wrapping? in the description you mean?09:26
rogpeppeaxw: i'm wondering if we should be doing all the stateserverinfo ids manipulation inside a single Update op09:26
davecheneywaigani: y09:26
axwrogpeppe: what's the benefit?09:27
rogpeppeaxw: it means that other code can't see them in an intermediate state09:27
waiganidavecheney: lol - well I may be formatting out of nostalgia ;)09:27
axwrogpeppe: all the ops are run in a single transaction though?09:28
waiganidavecheney: how are we looking now on ppc?09:28
rogpeppeaxw: read operations don't respect transactions09:28
davecheneywaigani: no complain', just sayin'09:28
davecheneywaigani: will know in < 300 seconds09:28
axwrogpeppe: ah, I see09:28
davecheneywaigani: provider/common was whinging about missing ssh keys09:28
waiganioh really?09:28
waiganiI can look into that if you like?09:29
davecheneywaigani: i'll know in a few mins09:29
rogpeppeaxw: the other thing i'm trying to persuade myself of is whether the $size asserts in maintainStateServersOps are still sufficient09:29
davecheneyhold tight09:29
axwrogpeppe: I was thinking they should be changed to exact match - is there a use case for two concurrent callers?09:30
axwI figure someone might want to have a cron job calling this09:31
rogpeppeaxw: well, we try to make everything work ok with concurrent callers09:31
rogpeppeaxw: i'm not sure that an exact match is possible to do though09:31
rogpeppeaxw: it might need to be a txn_revno assertion09:31
waiganidavecheney: I looked into 1262967. I suspect it is a similar problem. So far though, none of the tests are failing for me.09:31
davecheneywaigani: bootstrap_test.go:69: c.Assert(err, gc.IsNil)09:32
davecheney... value *errors.errorString = &errors.errorString{s:"no public ssh keys found"} ("no public ssh keys found")09:32
axwrogpeppe: sure, I just meant whether we try to allow concurrent modifications or lock each other out entirely09:32
rogpeppeaxw: ah yes09:32
rogpeppeaxw: if we've got two concurrent callers both calling ensure-availability with different numbers, that's likely to be problematic anyway09:33
rogpeppeaxw: but if the server count hasn't changed, i think it should be fine to just let a concurrent call assume that the first one has worked, and return with success09:34
rogpeppeaxw: s/server count/voting server count/09:34
axwrogpeppe: as we do now?09:35
axwlen(VotingMachineIds) == numStateServers?09:35
rogpeppeaxw: well, that was ok before, but isn't now, because we want to juggle available servers09:36
waiganidavecheney: I can't reproduce that?09:36
waiganiprovider/common$ go test on ppc 31 tests pass?09:36
davecheneywaigani: rm -rf ~/.ssh :)09:36
waiganiugh god facepalm09:36
axwrogpeppe: it should still work - it checks if there are any machines being taken out of the voting set09:37
davecheneywaigani: it should be the same as your provider manual fix, right09:37
waiganidavecheney: is there a bug for that one?09:37
davecheneythe underlying cause is the same09:37
axwrogpeppe: info is updated by updateAvailableStateServersOps if that's not clear09:37
waiganidavecheney: yep, looks very similar09:37
davecheneywaigani: the bug was for both I think09:37
axwso VotingMachineIds out may be smaller than going in09:37
waiganiooh right, okay I can fix and link it to the same bug?09:38
davecheneyyup09:38
waiganisweet, will do09:38
rogpeppeaxw: currently, we just return nil if the number of voting machines hasn't changed. i'm not sure we can still do that.09:38
davecheneymwhudson: i've found the cause of the juju crashes on ppc09:38
davecheneyin the gccgo deb09:38
davecheneyactually, i'll step back09:38
mwhudsondavecheney: oh?09:38
davecheneymwhudson: basically, libgo tests to see if the compiler supports -fsplit-stack09:39
davecheneywhich ppc says it does09:39
davecheneybut it lies09:39
mwhudsonah09:39
davecheneyi think the same is for arm09:39
davecheneyarm6409:39
davecheneyi mean, you need gold to support it09:40
davecheneyand gold isn't even installed on ppc09:40
axwrogpeppe: sorry, I don't understand why. if we remove a voting machine ID, then len(VotingMachineIds) < numStateServers. If we bring an available server back in, that count doesn't change, but we add a txn.Op to make the change in mongo09:40
mwhudsondavecheney: i'09:40
mwhudsondavecheney: i'm pretty sure i checked that the configure script does not think arm64 supports split stacks09:41
davecheneyok, that might be a different error09:41
davecheneybut it looks like ppc is saying it does09:41
davecheneyand so libgo configures itself accordingly09:42
davecheneyand so each goroutine has a small stack and runs off the end easily09:42
davecheneychecking whether -fsplit-stack is supported... no09:44
davecheneyhmm, that is odd09:44
rogpeppeaxw: i'm just saying that i don't think we can just do nothing at all if len(VotingMachineIds) == numStateServers09:44
rogpeppeaxw: (which is the current behaviour)09:44
axwrogpeppe: ah, we should perform an assertion you mean?09:45
rogpeppeaxw: we'll still need to potentially decommission unavailable machines and add new ones if so09:46
axwrogpeppe: yes, that's being done in my CL. possibly not very obviously09:48
rogpeppeaxw: yup09:48
axwor maybe I just don't understand09:48
rogpeppeaxw: perhaps when you said "as we do now" you meant "as we do in my branch" ?09:48
axwrogpeppe: "if the server count hasn't changed, i think it should be fine to just let a concurrent call assume that the first one has worked, and return with success"  -- do we not do that on trunk? maybe you meant server count hasn't changed in mongo, and availability hasn't changed09:50
axwanyway. I think we're roughly on the same page09:51
rogpeppeaxw: yeah09:51
natefinchmorning all09:51
rogpeppenatefinch: hiya09:51
rogpeppeaxw: i suppose the question now in my mind is: is it possible for EnsureAvailability to do any actions without changing the server id counts?09:52
rogpeppeaxw: hmm, i think it might be possible09:53
axwmorning natefinch09:53
rogpeppeaxw: so perhaps the best thing is to assert on txn_revno and assume that txn.ErrAborted means all's well09:54
axwrogpeppe: yes, it will change the counts definitely. the check there at the moment says that the counts didn't change externally09:54
axwrogpeppe: SGTM09:54
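The approach rogpeppe suggests (assert on the document's revision number and treat an aborted transaction as "someone else already did the work") can be sketched without a database. Everything here is a stand-in: `errAborted` plays the role of `txn.ErrAborted`, `doc` plays the role of the state server info document, and `revno` plays the role of `txn_revno`; the real mgo/txn calls are not shown.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errAborted stands in for txn.ErrAborted in this sketch.
var errAborted = errors.New("transaction aborted")

// doc is a hypothetical in-memory document with a revision counter.
type doc struct {
	mu        sync.Mutex
	revno     int
	votingIDs []string
}

// read returns the current revision and a copy of the voting ids.
func (d *doc) read() (int, []string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	return d.revno, append([]string(nil), d.votingIDs...)
}

// updateIfRevno applies the change only if nobody has modified the
// document since the caller read it; otherwise it aborts.
func (d *doc) updateIfRevno(expected int, ids []string) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.revno != expected {
		return errAborted
	}
	d.votingIDs = append([]string(nil), ids...)
	d.revno++
	return nil
}

// ensure writes ids if the document is unchanged since seenRevno. An
// abort is taken to mean a concurrent caller got there first, so it is
// reported as success.
func ensure(d *doc, seenRevno int, ids []string) error {
	err := d.updateIfRevno(seenRevno, ids)
	if errors.Is(err, errAborted) {
		return nil
	}
	return err
}

func main() {
	d := &doc{}
	rev, _ := d.read()

	// Two callers read the same revision; only the first write lands.
	fmt.Println(ensure(d, rev, []string{"1", "4", "5"}))
	fmt.Println(ensure(d, rev, []string{"1", "2", "3"}))

	_, ids := d.read()
	fmt.Println(ids)
}
```

Whether silently succeeding is right when the two callers asked for different server counts is the open question raised earlier in the conversation; this sketch does not settle it.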
davecheneywaigani: are you going to apply https://codereview.appspot.com/85710043/ to provider/common ?09:55
rogpeppeaxw: what if we've got the following scenario (x means unavailable): machineids: 1 2 3x 4; votingids: 1 2x 4; then we'll end up with something like: machineids: 1 2 4 5; voting ids: 1 4 509:56
rogpeppeaxw: i think09:56
waiganidavecheney: what would be best? I can update that branch or create a new one?09:56
rogpeppeaxw: where the voting counts haven't changed, but the contents have09:56
davecheneywaigani: you need to create a new branch09:56
rogpeppeaxw: (and 5 is a newly commissioned machine)09:56
davecheneythis is the gift that lbox brings us09:56
waiganidavecheney: ah hehe okay09:57
waiganinice timing, as I was just starting on the old branch09:57
waiganiyou must have sensed it (or you've hacked into my computer)09:57
axwrogpeppe: each of the promote/demote ops assert that they weren't changed too09:58
axwrogpeppe: so if something else modified the contents concurrently, we'd still see assert on wantvote/hasvote fields09:58
rogpeppeaxw: ah, good point09:59
rogpeppeaxw: reviewed10:09
axwrogpeppe: thanks10:09
axwrogpeppe: ahhhh, now I see what you mean about voting count not changing10:10
axwon ErrTxnAborted10:10
axwErrAborted rather10:10
rogpeppeaxw: cool10:13
waiganidavecheney: lboxing...10:13
davecheneykk10:16
waiganidavecheney: https://codereview.appspot.com/8680004310:17
davecheneywaigani: reviewing10:17
axwnight all10:17
davecheneyit would be nice to get this one in this evening10:17
waiganidavecheney: cool, I'll hang around until it lands10:17
davecheneywaigani: nah10:20
davecheneyi can land it myself10:20
davecheneyi have timezones on my side10:20
davecheneyit looks like there might be another bug in cmd/juju10:20
davecheneytests10:20
davecheneysimple ordering one10:20
waiganidavecheney: I'm hitting this on the vm: fatal error: bad spsize in __go_go10:21
davecheneyeek10:21
waiganiyeah, I don't seem to be able to run my tests10:22
davecheneywaigani: $ go test ./provider/common/10:28
davecheneyok      launchpad.net/juju-core/provider/common 24.926s10:28
davecheneyLGTM10:28
davecheneyship it10:28
waiganidavecheney: sweet10:28
waiganidavecheney: cmd/juju$ go test10:34
waiganiOK: 226 passed, 2 skipped10:34
waiganiPASS10:34
davecheneywaigani: nice one10:39
davecheneygive yourself the evening off10:39
waiganidavecheney: thanks. If you see any bugs you want me to look at, send me an email for Monday.10:43
waiganidavecheney: I'll be keen to see how close we are to getting juju working on ppc :)10:43
davecheneywaigani: i think we're down to 310:44
davecheneycmd/juju10:44
davecheneywhich looks easy10:44
davecheneycmd/jujud which looks like a timeout10:45
davecheneythe joyent provider which is a timeout10:45
davecheneyand worker tests10:45
davecheneywhich are a bug in the compiler i'm working on10:45
mgzrogpeppe, dimitern: standup!10:47
davecheneywaigani: http://paste.ubuntu.com/7233131/10:48
davecheneyfrom a few hours ago10:48
davecheneybe suspicious of any tests that take 600 seconds10:48
davecheneythe watchdog kills it10:48
davecheneyhttps://bugs.launchpad.net/juju-core/+bug/130653610:57
_mup_Bug #1306536: replicaset: mongodb crashes during test <juju-core:Triaged> <https://launchpad.net/bugs/1306536>10:57
davecheneythis is a real thing10:57
davecheneyif mongo shits itself during our CI10:57
davecheneywhich it does, continually10:57
davecheneywhat is it going to do in the field ?10:57
rogpeppenatefinch: shall we hang out elsewhere?11:02
natefinchrogpeppe: sure11:02
rogpeppenatefinch: https://plus.google.com/hangouts/_/canonical.com/juju-HA?authuser=111:03
natefinchdavecheney: we'll look into it. I've seen it occasionally.  not sure what causes it yet11:04
davecheneynatefinch: what version is the bot running11:05
davecheneyis it still running our old crack version of mongo we made 2 years ago ?11:05
rogpeppenatefinch: bzr+ssh://bazaar.launchpad.net/~rogpeppe/juju-core/natefinch-041-moremongo/11:07
natefinchdavecheney: I don't think so, but I'm not sure11:07
davecheney$ ~/bin/juju destroy-environment -y local11:08
davecheneyERROR failed verification of local provider prerequisites:11:08
davecheneyjuju-local must be installed to enable the local provider:11:08
davecheneyumm ....11:08
davecheneyso, i can't develop juju unless I have a conflicting juju binary installed ?11:08
davecheneyhttps://bugs.launchpad.net/juju-core/+bug/130654411:10
_mup_Bug #1306544: developing juju requires juju-local  to be installed <juju-core:Triaged> <https://launchpad.net/bugs/1306544>11:10
natefinchinteresting way to put it, we just need you to have juju-mongodb11:11
davecheneythis is kind of a problem11:11
natefinchit's been a huge problem for new developers.... but you're right, we should have a separate package to just deploy the db11:11
davecheneyyou have to be super careful `juju` is the juju you mean11:11
natefinchyeah, I think rog hit that11:12
natefinchinteresting... actually:11:12
natefinch$ which juju11:12
natefinch/usr/bin/juju11:12
davecheneysad trombone11:12
=== vladk is now known as vladk|lunch
=== vladk|lunch is now known as vladk
rogpeppedoes anyone know anything about statecmd.MachineConfig ?12:43
rogpeppefrom the description, it looks like it just returns information, but AFAICS, it's actually responsible for setting up the initial password info on a machine too12:44
rogpeppeand it seems kinda weird that it's living inside statecmd too12:46
rogpeppeah, i see, it's a 1.16 legacy12:47
mgzevilnickveitch: do you want me to resubmit doc changes against github in order to get them landed?12:51
evilnickveitchmgz, no, it's fine. I did a PR for the GH repo. Nobody has looked at it, so I will just merge it anyhow12:55
mgzevilnickveitch: I can probably stamp it...12:56
mgzevilnickveitch: too late, thanks!12:56
evilnickveitch:)12:57
wallyworld__fwereade: hiya, saw your comment on instance type constraints. i haven't done any work on it for a few days except for merging trunk and resolving conflicts as i've been doing the arm stuff and other work for 1.18.1 etc. i guess i should mark it back as wip. i'll be able to get more done next week13:00
natefinchrogpeppe: back'13:31
fwereadewallyworld__, no worries, I knew we'd talked about some of the stuff I mentioned, just wanted to make sure it was recorded13:34
wallyworld__sure, ok13:34
fwereaderogpeppe, IIRC it's primarily for the manual provider (and particularly for hazmat's convenience)13:35
rogpeppefwereade: i'm just pondering the best place to add mongo password setup for new state servers13:36
rogpeppenatefinch: https://plus.google.com/hangouts/_/canonical.com/juju-HA?authuser=113:36
fwereaderogpeppe, preferably purely inside existing state servers, surely?13:37
rogpeppefwereade: sure13:38
rogpeppefwereade: i think it's best done along with the other password in the API call ProvisioningScript13:38
rogpeppefwereade: (FWIW, that's another bad name - it doesn't sound like it actually sets up the machine too)13:39
fwereaderogpeppe, fair enough, so long as we don't put it in cloudinit13:39
rogpeppefwereade: definitely not - it couldn't go in cloudinit anyway13:39
rogpeppefwereade: BTW NewAPIAuthenticator seems to be fundamentally misguided - it only gets the state and API addresses once13:40
fwereaderogpeppe, yeah, I know, it sucks13:40
rogpeppefwereade: i'm not sure whether to change it to fetch the addresses each time, or to add a watcher to the provisioner13:40
fwereaderogpeppe, midpoint: get the addresses once per batch of machines13:41
rogpeppefwereade: i suppose AuthenticationProvider could do the watch too13:41
fwereaderogpeppe, the intent was always to do it per-batch but I forget why it didn't happen13:41
fwereaderogpeppe, I suspect there was some ugly interaction with container provisioners13:42
rogpeppefwereade: ah, i see. we should get the address in provisionerTask.startMachines13:48
perrito666how cool would be to have a special var that when something other than nil is assigned to it would automatically produce panic or return err (error handling induced day dreaming)13:49
rogpeppefwereade: it's pretty awkward to refactor the code so it uses bulk API calls, BTW13:49
rogpeppefwereade: although it's definitely a place that it's worth it13:50
rogpeppefwereade: although actually just running all the startMachine calls concurrently would be a big win (and quite likely faster than using sequential bulk calls)13:51
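Running the per-machine start calls concurrently, as rogpeppe suggests, might look something like this. It is a sketch only: `startMachine` here is a hypothetical placeholder that fails on an empty id, and the real provisioner task has more state and error handling than shown.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// startMachine is a hypothetical placeholder for the real per-machine
// provisioning call; it fails for an empty id so the sketch has an
// error path to show.
func startMachine(id string) error {
	if id == "" {
		return errors.New("machine id is empty")
	}
	return nil
}

// startMachines starts every machine in its own goroutine and returns
// one error slot per id, in the same order as ids.
func startMachines(ids []string) []error {
	errs := make([]error, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i int, id string) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no
			// extra locking is needed.
			errs[i] = startMachine(id)
		}(i, id)
	}
	wg.Wait()
	return errs
}

func main() {
	for i, err := range startMachines([]string{"1", "", "3"}) {
		fmt.Println(i, err)
	}
}
```

Unbounded concurrency like this would make provider rate limiting more likely, which is the concern fwereade raises next.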
natefinchperrito666: magic is pretty much anathema in Go. I prefer to have the code do exactly what it says it does, and no more.  Magic is how you get bugs.13:51
perrito666natefinch: amen13:51
perrito666natefinch: I guess I was more in search of syntactic sugar than magic13:56
natefinchperrito666: also not something Go generally does :)   you can do this, though:14:00
natefinchif err := foo(); err != nil {14:00
natefinch    return err14:00
natefinch}14:00
fwereaderogpeppe, we do also want environ.StartInstances to mitigate rate limiting14:02
rogpeppefwereade: yes.14:02
perrito666natefinch: yep, that is where I go whenever I can, although that is closer to syntactic saccharine than sugar14:02
fwereaderogpeppe, and I'm well aware of just how much hassle that will be, but we can only put it off so long ;)14:02
rogpeppefwereade: although i'm yet sure if rate limiting should really be done in the Environ itself14:02
rogpeppes/yet/not yet/14:03
fwereadeperrito666, also it keeps the scope of err nice and tight, which is often a benefit of itself14:03
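To illustrate the scoping point, here is a small sketch using the if-with-initializer form natefinch showed. The functions parsePositive and validate are made up for the example; the point is only that err exists inside the if statement and nowhere else.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parsePositive is a hypothetical helper that can fail in two ways.
func parsePositive(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, err
	}
	if n <= 0 {
		return 0, errors.New("not positive")
	}
	return n, nil
}

// validate checks every input using the if-with-initializer form.
func validate(inputs []string) error {
	for _, s := range inputs {
		if _, err := parsePositive(s); err != nil {
			return fmt.Errorf("invalid input %q: %v", s, err)
		}
		// err is out of scope here, so a stale error from this
		// iteration cannot be inspected or returned by mistake.
	}
	return nil
}

func main() {
	fmt.Println(validate([]string{"1", "2"}))
	fmt.Println(validate([]string{"1", "x"}))
}
```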
fwereaderogpeppe, I'm reasonably sure it should be, myself14:03
rogpeppefwereade: the problem is that not everything that makes provider calls necessarily shares the same Environ14:03
fwereaderogpeppe, which is not to say that we shouldn't also have some stuff around instancepoller too14:03
fwereaderogpeppe, most of them should, though14:04
fwereaderogpeppe, and it's probably not that hard to arrange14:04
rogpeppefwereade: doesn't every worker create its own Environ?14:04
rogpeppefwereade: but, yeah, it's probably not too hard to pass an environ in to those workers that need one14:04
mattywfwereade, when you have a moment...14:04
rogpeppefwereade: except that it can change14:05
fwereaderogpeppe, yeah, but it doesn't have to -- pass one in, have another worker responsible for updating it14:05
fwereademattyw, heyhey14:05
natefinchanyone understand this log about wordpress not installing?   failed to fstat previous diversions file: No such file or directory    whole log: http://paste.ubuntu.com/7235032/14:05
sinzuiHi jamespage : I have an idea to address a bug 1304493 that I think was caused by republication of juju tools.14:05
_mup_Bug #1304493: Juju tools 1.18.0 streams.canonical.com checksum mismatch <ci> <landscape> <packaging> <juju-core:Triaged> <https://launchpad.net/bugs/1304493>14:05
jamespagesinzui, I think so yes14:05
jamespagesinzui, that feels bad to me14:05
dimiternrogpeppe, fwereade, updated https://codereview.appspot.com/86010044/14:06
dimiternrogpeppe, fwereade, (networks stuff)14:06
sinzuijamespage, I am not convinced it is the right solution since I do not know why streams.canonical.com got a different file size for the amd64 precise tools. I think the only package it could find is the one from the juju stable ppa14:11
jamespagesinzui, one from the PPA, then one from the distro when I uploaded it I suspect14:12
sinzuijamespage, you made one for precise?14:12
jamespagesinzui, no14:12
sinzuiI thought you were just making trusty14:12
jamespageoh - that really is odd then14:12
jamespagesinzui, I am14:13
jamespagesinzui, I don't understand then14:13
jamespagewe need utlemming I think14:13
sinzuiI will keep investigating14:13
sinzuithank you jamespage14:13
rogpeppefwereade: can i run something by you?14:17
fwereaderogpeppe, sure14:17
rogpeppefwereade: i *think* we can remove StateInfo from environs.MachineConfig and cloudinit.MachineConfig14:18
fwereaderogpeppe, w00t!14:18
fwereaderogpeppe, I've been wanting to do that for months :)14:18
rogpeppefwereade: basically, any agent that needs to connect to State can dial localhost14:19
rogpeppefwereade: that will need to change in the future if we want to have more API servers than mongo instances14:22
rogpeppefwereade: but even then, i don't think it needs to be in MachineConfig14:23
fwereaderogpeppe, +10014:24
fwereaderogpeppe, anything starting up at bootstrap time can use localhost, and anything starting up late can just grab the info over the api, anyway14:24
rogpeppefwereade: yup14:25
fwereaderogpeppe, excellent14:25
sinzuijamespage, I now think bug 1304493 is caused by different versions of gzip. I don't think this issue is about alternate packages.14:46
_mup_Bug #1304493: Juju tools 1.18.0 streams.canonical.com checksum mismatch <ci> <landscape> <packaging> <juju-core:Triaged> <https://launchpad.net/bugs/1304493>14:46
jamespagesinzui, ah - yes - deterministic zip creation14:47
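As a rough illustration of what "deterministic" means for a compressed artifact, the sketch below gzips the same bytes twice with Go's compress/gzip, leaving the header name and modification time at their zero values, and compares SHA-256 checksums. The function names are hypothetical. This only shows repeatability within one implementation; different gzip versions or tools can still emit different bytes for the same input, which is the situation described in the bug, so it is not presented as the fix that was used.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
)

// deterministicGzip compresses data without embedding a file name or
// modification time, so repeated runs produce identical bytes.
func deterministicGzip(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw, err := gzip.NewWriterLevel(&buf, gzip.BestCompression)
	if err != nil {
		return nil, err
	}
	// zw.Name and zw.ModTime are deliberately left at their zero values.
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// checksum returns the hex-encoded SHA-256 of data.
func checksum(data []byte) string {
	return fmt.Sprintf("%x", sha256.Sum256(data))
}

func main() {
	a, _ := deterministicGzip([]byte("juju tools"))
	b, _ := deterministicGzip([]byte("juju tools"))
	fmt.Println(checksum(a) == checksum(b))
}
```

An alternative that sidesteps the compressor entirely is to publish the checksum of the uncompressed content.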
natefinchmarcoceppi, hazmat: either of you recognize this error from installing a charm?  failed to fstat previous diversions file: No such file or directory   full log: http://paste.ubuntu.com/7235032/14:50
rogpeppefwereade: i think we'll remove state addresses from agent config too14:50
natefinchseems to only happen when deploying locally14:50
fwereaderogpeppe, sgtm I think14:50
rogpeppefwereade: and i'm also changing the agent config to store hostports to avoid the current impedance mismatch.14:51
rogpeppefwereade: then it's [][]instance.HostPort throughout14:51
fwereaderogpeppe, cool14:52
marcoceppinatefinch: that's an interesting error, I've not come across it before. What's the charm?15:02
natefinchmarcoceppi: that's wordpress15:05
natefinchmarcoceppi: different error for mysql, start fails15:05
natefinchmarcoceppi: mysql gets this: http://paste.ubuntu.com/7235297/15:06
jamespagesinzui, are we likely to see a 1.18.1 release today?  as it will be critical bug fixes I can push that pre-release15:44
jamespageand we're still in universe so meh15:44
sinzuijamespage, the gzip bug is blocking, but I hope to be able to start the release in 2 hours15:45
alexisbnatefinch, fwereade ping16:11
natefinchalexisb: howdy16:13
alexisbhi natefinch happy friday16:15
alexisbcan I lean on you to take a look at the 4 new critical bugs for 1.19.0 and see if you can easily assign them to someone across the juju team?16:15
alexisblooks like most are regressions16:16
natefinchalexisb: I can look, for sure. What's the timeline on getting them fixed?  There's not a lot of working hours left for most juju devs16:16
alexisbas soon as we can, but in regular working hours16:17
natefinchalexisb: ok16:17
alexisbwe just need to try and unblock sinzui from releasing16:17
alexisbsinzui, are there any bugs in 1.18 that need our assistance for the release jamespage needs?16:18
alexisband natefinch thank you!16:18
natefinchalexisb: welcome16:18
alexisbsinzui, our == juju-core16:18
rogpeppeis hook output still labelled with HOOK in the logs?16:25
sinzuialexisb, I am working on the one bug that blocks the release of 1.18.1.16:32
sinzuialexisb, I am going to defer the backup bug because fixing it doesn't allow you to restore. The restore bug is targeted to 1.19.016:32
alexisbsinzui, ack16:33
natefinchsinzui, marcoceppi, alexisb:  Btw, looks like wordpress and mysql both fail to deploy on trunk, at least using the local provider on trusty.  Getting some errors related to apparmor when installing wordpress and when starting mysql.16:38
alexisbnatefinch, that would make sense given one of the critical bugs deals with failing to bootstrap a local provider16:38
natefinchalexisb: I can bootstrap ok, it's deploying that I have a problem with16:40
alexisbah ok, shows my ignorance :)16:41
fwereadealexisb, heyhey, I'll take a look as well16:42
alexisbfwereade, thanks16:42
natefinchfwereade: check out my response to this bug at the bottom and let me know if you disagree: https://bugs.launchpad.net/juju-core/+bug/120843016:45
_mup_Bug #1208430: mongodb runs as root user <mongodb> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1208430>16:45
fwereadenatefinch, concur16:46
fwereadenatefinch, db access gives you the keys to the kingdom regardless16:46
fwereadenatefinch, have we closed the mongod ports externally yet?16:46
fwereadenatefinch, I'm guessing not, but I think we can now; right?16:46
fwereadenatefinch, at least on providers where we can do firewalls16:47
fwereadenatefinch, in fact it should be a matter of just not explicitly opening it any more16:48
natefinchfwereade: roger says he thinks we keep them open still... but I think we can close them (and definitely should close them)16:48
fwereadenatefinch, close 'em :)16:48
fwereadenatefinch, ideally we'd close them even to traffic from other machines in the environment, but that's not practical yet16:49
natefinchfwereade: yep16:49
sinzuinatefinch, then this is the real bug 130528016:49
_mup_Bug #1305280: juju command get_cgroup fails when creating new machines, local provider arm32  <armhf> <local-provider> <lxc> <packaging> <juju-core:Triaged> <apparmor (Ubuntu):New> <https://launchpad.net/bugs/1305280>16:49
sinzuinatefinch, The bug is in ubuntu, but I kept it open on juju-core to make it easy for me to track16:49
natefinchsinzui: ahh, ok, thanks16:50
fwereadenatefinch, looking at those criticals I am suspicious that the other 3 are ha-/replicaset-related16:50
natefinchsinzui: the armhf in the title seems misleading, since I'm definitely not running arm here16:51
sinzuinatefinch, yep. I thought it was limited ubuntu ports/packaging issue16:52
natefinchfwereade: yeah16:52
=== vladk is now known as vladk|offline
natefinchfwereade:  this one seems like it just requires an update to the backup script to find the juju-db binaries: https://bugs.launchpad.net/juju-core/+bug/130578016:54
_mup_Bug #1305780: juju-backup command fails against trusty bootstrap node <backup-restore> <juju-core:Triaged> <https://launchpad.net/bugs/1305780>16:54
natefinchoh, i guess that one is 1.81.116:54
natefincher 1.18.116:54
natefinchfwereade: certainly the other ones are HA related16:55
sinzuinatefinch, I am going to move that bug to 1.19.016:55
sinzuinatefinch, if we fix the backup bug, the user still hits the two restore bugs in 1.19.016:56
* sinzui moves the bug to be with its friends16:56
natefinchthe more the merrier16:56
sinzuinatefinch, So I see one bug in progress for 1.18.1, It's my job to solve the gzip compression problem between different machines.16:57
sinzuiJuju-Core can focus on getting trunk releasable for Monday/tuesday16:58
fwereadenatefinch, yeah, I think they're not on the $PATH17:15
fwereadenatefinch, sorry, evening is happening around me a bit ;)17:15
natefinchfwereade: seems like an easy fix at least17:15
fwereadenatefinch, yeah17:15
fwereadebtw, https://codereview.appspot.com/86910043 is up for review at long long last17:15
cjohnstonnatefinch or fwereade any chance either of you could help debug an issue where trusty can't deploy precise instances using lxc?17:17
fwereadecjohnston, IIRC thumper has started work on that -- is this new?17:18
sinzuicjohnston, did you set default-series in your config... and is your failure with some charms bug 130528017:18
_mup_Bug #1305280: juju command get_cgroup fails when creating new machines, local provider arm32  <armhf> <local-provider> <lxc> <packaging> <juju-core:Triaged> <apparmor (Ubuntu):New> <https://launchpad.net/bugs/1305280>17:18
cjohnstonhttps://bugs.launchpad.net/juju-core/+bug/1306537 is the bug filed for it, yes default_series is set17:19
_mup_Bug #1306537: LXC provider fails to provision precise instances from a trusty host <juju-core:Invalid> <https://launchpad.net/bugs/1306537>17:19
cjohnstonI don't see (error: error executing "lxc-start": command get_cgroup failed) in juju status, the instances just sit at pending for hours17:20
cjohnstonhowever deploying cs:trusty/ubuntu works17:20
sinzuicjohnston, This issue may be fixed in trunk. The opposite scenario was recently fixed: bug 130282017:26
_mup_Bug #1302820: juju deploy --to lxc:0 cs:trusty/ubuntu creates precise container <landscape> <juju-core:Fix Committed by thumper> <https://launchpad.net/bugs/1302820>17:26
sinzuicjohnston, 1.19.0 will be released next week, a day after trunk stabilises.17:27
rogpeppe i'm sure i used to be able to move my mouse when my machine was heavily loaded17:34
natefinchrogpeppe: I've noticed that recently, too17:35
rogpeppeand this IRC client has an amazing failure mode when i've been typing into it when the machine's heavily loaded17:36
natefinch(recently = last few months at least)17:36
rogpeppeevery few keys typed, it ignores what i've typed and types something from the past instead17:36
rogpeppeso just then, if i typed "ddddddddd", i'd get "d abled abled"17:37
rogpeppehmm, we've taken away support for 1.19 to work against 1.16 clients, right?17:38
natefinchtime to get a new irc client17:38
sinzuirogpeppe, There is a commit in trunk that says that17:42
sinzuiand I removed the ability to package 1.16. it's dead to me17:42
rogpeppesinzui: so we don't care about upgrading from 1.16 then?17:42
rogpeppesinzui: which upgrade transition has been failing on trunk?17:42
sinzuirogpeppe, we test stable to stable. 1.16 goes to 1.18. 1.18 can go to 1.19 or 1.2017:43
rogpeppesinzui: ok, cool17:43
* sinzui notes that juju docs should make that very clear17:43
rogpeppenatefinch, dimitern, anyone else that's around: large but largely trivial: https://codereview.appspot.com/8701004418:22
=== vladk|offline is now known as vladk
natefinchreview for whoever: https://codereview.appspot.com/86920043/18:41
natefinchrogpeppe: this is the machine 0 log for upgrading 1.18 to trunk18:44
natefinchhttp://pastebin.ubuntu.com/7236187/18:44
natefinchrogpeppe:  ubuntu@ec2-174-129-121-255.compute-1.amazonaws.com19:00
perrito666how deep is a copy of an object in go?20:40
perrito666for instance I copy a document that has fields such as []string slices and arrays inside20:42
perrito666do all those get properly copied?20:42
natefinchperrito666: it's hard to answer that without being flippant.  The obvious answer is, it copies everything.  But that would be confusing20:42
natefinchperrito666: slices, channels, and maps are all implemented as pointers, so the pointers get copied but not what they point to20:43
=== Ursinha is now known as Ursinha-afk
natefinchinterfaces, too20:43
natefincharrays are actually consecutive memory like you'd expect, and when you copy them, you copy the whole dang thing20:43
natefinchbut slices are just pointers to parts of arrays, so when you copy them, you're only copying the pointer20:44
perrito666natefinch: as I thought, if I want to "clone" an object, I need to create a sort of deepcopy20:44
natefinchperrito666: depends on the value, but yes, some values take some extra work20:45
natefinchperrito666: maps and slices are really the only thing you need to worry about (there's no real way to deep-copy an interface, and it doesn't really make sense to copy a channel)20:46
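The copy behaviour described above can be sketched with a small struct. The doc type and its clone method are hypothetical, not juju code. Plain assignment copies the string and the array by value, while the slice and map fields keep referring to the same underlying data (a slice value is a small header holding a pointer, length and capacity), so a clone has to copy those two explicitly.

```go
package main

import "fmt"

// doc is a hypothetical document with value and reference-like fields.
type doc struct {
	Name   string
	Tags   []string
	Fixed  [2]int
	Labels map[string]string
}

// clone returns a copy of d that shares no slice or map storage with d.
func (d doc) clone() doc {
	c := d // copies Name and Fixed; Tags and Labels still alias d's data
	c.Tags = append([]string(nil), d.Tags...)
	if d.Labels != nil {
		c.Labels = make(map[string]string, len(d.Labels))
		for k, v := range d.Labels {
			c.Labels[k] = v
		}
	}
	return c
}

func main() {
	orig := doc{
		Name:   "a",
		Tags:   []string{"x"},
		Fixed:  [2]int{1, 2},
		Labels: map[string]string{"k": "v"},
	}

	shallow := orig
	shallow.Tags[0] = "changed" // visible through orig: shared backing array
	shallow.Fixed[0] = 99       // not visible through orig: arrays are copied

	deep := orig.clone()
	deep.Tags[0] = "independent" // not visible through orig

	fmt.Println(orig.Tags[0], orig.Fixed[0], deep.Tags[0])
}
```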
natefinchperrito666: I guess a good question would be - why do you need to clone something?20:47
perrito666natefinch: thank you, btw, how many hours a day do you work, you are here when I arrive and when I leave?20:49
perrito666natefinch: refactoring a big piece of code of assignToMachine20:51
natefinchperrito666: haha... not that much, really.  I have a lot of interruptions during the day due to having two small children.  I start work around 5:30 or 6am and end at 5pm, but there's usually a few hours of non-work in there.20:51
natefinchok, EOD for me20:56
perrito666natefinch: have a nice weekend20:56
natefinchperrito666: you too20:56
=== Ursinha-afk is now known as Ursinha
=== hatch__ is now known as hatch]
=== hatch] is now known as hatch
=== vladk is now known as vladk|offline
