[00:18] rick_h__: ping [00:18] did you get my email about winblows ? [00:19] davecheney: yes, sorry. Meant to ping alexis on it and see if she moved forward before I sent it over [00:19] davecheney: let me check my email for getting at the msdn stuff. [00:19] rick_h__: thanks [00:19] i've got as far as I can without being able to test locally [00:23] davecheney, I can start a win instance from the snapshot used to start the win-build-server [00:25] davecheney: see pm [00:44] wallyworld, rerunning an older revision is easy, but it takes 4 hours now that we test so much and don't have the slaves doing their own builds [00:45] sinzui: ok, np [00:45] wallyworld, But I don't see any reason to test master until 1.20 is good. So I will start the test [00:48] wallyworld, The build is started. We are about 2.5 hours from when the function*restore tests will run [00:48] sinzui: i'm manually testing restore, not done it before, ran into a problem: i took a backup, then destroyed my env, then restored. it fails saying it can't re-bootstrap without a control bucket :-( [00:48] so it seems it wants stuff from the jenv file which no longer exists [00:49] wallyworld, I have not seen that before [00:49] [master]ian@wallyworld:~/juju/go/src/github.com/juju/juju$ juju-restore juju-backup-20140702-1042.tgz [00:49] extracted credentials from backup file [00:49] re-bootstrapping environment [00:49] error: cannot re-bootstrap environment: control-bucket: expected string, got nothing [00:49] wallyworld, what kind of env are you testing with? [00:49] amazon [00:50] maybe i needed to keep the jenv file [00:50] wallyworld, CI doesn't set control-bucket for any env :( [00:51] sinzui: i think doing destroy-environment after the backup was a mistake [00:51] ...and unittests and win-build-installer are now playing [00:51] but it seems unfortunate we can't fully restore from that situation [00:51] wallyworld, well, yes... [00:52] wallyworld, use the console to kill the instance [00:52] sinzui: cause you know, restore is meant to recover from stuff going away [00:52] wallyworld, that will simulate a state-server failure [00:52] i should be able to hand over the backup tar gz to someone and they should be able to restore [00:52] wallyworld, and you need one service up to verify that the service learns about the new state-server [00:53] one service i mean, not one server [00:53] hmmm. true for running a test. but in general, restore should be able to handle a full restore from nothing [00:54] wallyworld, I don't think so. the control-bucket is only removed by destroy-env...that is human fault [00:55] sure, but here it failed because there was no jenv to read the control bucket from [00:56] requiring some artifacts to still be up for a restore to be successful implies restore is broken [00:56] wallyworld, your test isn't valid.
the case is for when hardware fails, or the like, when canonistack loses a machine [00:56] that is the scenario fwereade outlined to me [00:56] there's no reason why restore can't recognise there's no control bucket and recreate it and redeploy the charms listed in state [00:57] imagine if hard drive or data backup worked that same way [00:57] wallyworld, possibly, but to test what was written and given to customers, we just simulate a bootstrap node failure by terminating the instance using the provider's tools [00:58] "sorry, restore can't copy your stuff back from tape because you no longer have your original login details to your pc that was destroyed" [00:58] understood, i was just using restore how i thought it should work [00:58] and got disappointed [00:58] it's not what i would accept as a customer [01:08] axw: hiya, i landed your 1.20 keep alive branch. landings still quite problematic [01:08] wallyworld: thanks. yeah :( [01:08] but a bunch did go through last night [01:08] it seemed marginally better after some of the fixes landed, but could just be my bias [01:08] we still have the couple of root cause issues i think [01:09] wallyworld: is it okay if I modify the unit test job to create a tmpfs and use that for $TMPDIR? [01:09] sure [01:10] i also want to fix the root cause issues as well [01:10] axw: could i ask you to take a quick look to ensure we think the bootstrap-timeout option still works. it seems with CI joyent tests, joyent wasn't bootstrapping due to that 1.18 bug but the sessions weren't being closed after 10 minutes [01:11] wallyworld: ok [01:11] there was a problem with the timeout option originally, it never took effect I think [01:12] got fixed somewhere along the way. I'll verify in a bit [01:13] thanks [01:26] sinzui: i get the same restore "cannot connect to bootstrap instance: EOF" error in my manual test on 1.19.5 :-( [01:27] :( [01:28] the instance is there and i can ssh to it [01:28] i'll see what else might be happening [01:28] wallyworld: 2014-07-02 01:28:44 ERROR juju.provider.common bootstrap.go:119 bootstrap failed: waited for 30s without being able to connect [01:28] Stopping instance... [01:29] (it works) [01:29] axw: ok, thanks. sinzui ^^^^^ [01:29] not sure why timeout was failing for joyent [01:29] that was on ec2 [01:30] wallyworld: was it 1.18? [01:30] may have been [01:30] was there a fix post 1.18? [01:30] bootstrap-timeout used to be busted [01:30] not sure when it was fixed [01:30] ok, we'll ascribe the issue to that then :-) [01:31] wallyworld: not sure if you're aware of this yet: #1271144 [01:31] <_mup_> Bug #1271144: br0 not brought up by cloud-init script with MAAS provider [01:32] we fixed that a while ago, but there's a problem with maas assuming that the interface is eth0 [01:32] hadn't seen that [01:32] well, sorry, the maas provider [01:32] so maas is broken for 1.20? [01:32] it's the same now as it is in trusty [01:33] I don't *think* that particular bug is new [01:33] wallyworld: I don't think we need to fix it for 1.20 necessarily, just letting you know it exists because it's had some activity [01:34] thanks, we'll target for 1.20.1 [01:39] sinzui: hacking the restore code to just pick up from the bit where it attempts to connect to the newly started replacement bootstrap instance worked the second time through. so it looks like a mongo startup issue related to timing.
i can put the connect bit of code in a retry loop [01:40] :) you are a hero [01:40] i'll target the bug to 1.20 also [01:41] wallyworld, Do I dare take master out of testing to make 1.20 the only priority [01:41] sinzui: i would :-) [01:41] just for today [01:41] * sinzui does [01:53] davecheney: are you updating juju to the new loggo? if not, I can [01:58] * thumper needs to think things through, so going to walk the dog [01:59] axw: https://github.com/juju/juju/pull/212 [01:59] jinx! [01:59] sorry, took a while to run all the tests [02:00] cool [02:00] axw: say the words, SAY THEM! [02:00] luks gud [02:01] davecheney: reviewed [02:01] ermehgerd, wreeevu [02:05] sinzui: there are a few possibilities where the restore is falling over, i have to track down the place and then i can fix. not as obvious where to put the retry as i first hoped. [02:05] :/ [02:06] when writing tests for apis .. if the state/api tests execute the code paths in state/apiserver .. do I need to copy and paste those tests minus the caller? what is the stance on that? [02:07] I have unit tests for the methods internal to state/apiserver , but the method we expose to state/api (GetCapabilities) is being fully executed by the tests in state/api. [02:09] sinzui: well, i know the top level api call, but one part of the implementation does retry, and there's a section that doesn't. i'll have a fix soon i hope [02:09] wwitzel3: our api tests are not really complete - i believe the client tests involve calling all the way through to a functioning server [02:10] the apiserver tests are a subset [02:10] i just follow what others have done when i add tests there [02:11] wwitzel3: i think from memory the apiserver tests have full coverage, and the client tests just check that it's all wired up [02:11] so they just run a few representative calls [02:15] wallyworld: ok, well then I will move my state/api assertions into state/apiserver and then have the state/api test just ensure it is all wired up correctly. [02:15] wwitzel3: yeah, i think so. see what say keymanager does for example [02:16] it's sorta crap, but it was how it was all done originally [02:16] people just didn't stub out properly [02:16] the pattern has been cargo culted over and over :-( [02:21] I've been cargo culting for the last 3 months .. sorry [02:21] I'm new, I don't know what to do! [02:22] just got done looking at the keymanager example [02:22] I will adjust accordingly [02:22] thanks :) [02:25] https://labix.org - security certificate has expired [02:28] whups [02:39] wwitzel3: cargo culting reference wasn't meant for you - it's a reflection of the poor state of the original tests [02:39] i've done the same thing in that section of code [02:40] wallyworld: wow, that tmpfs change I made had quite an impact on the time taken to run the state package tests [02:40] yeah? [02:40] 325s down to 120s [02:40] wow [02:40] those disks are slow as [02:40] must be :-( [02:40] it's meant to have SSDs [02:40] i use a ramfs for my tmpfs cause i got 16G [02:40] fast [02:41] i don't want to wear out my ssd [02:41] * axw nods [02:41] woot, ran all the tests in 19 mins and passed first go [02:42] axw: if they pass 10 times in a row.....
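The retry loop wallyworld proposes above ("i can put the connect bit of code in a retry loop") might look something like this minimal Go sketch. The dial hook, attempt count, and delay here are hypothetical stand-ins for illustration, not the actual juju-restore code:

    package restore

    import (
    	"fmt"
    	"time"
    )

    // connectWithRetry retries a connection attempt, since mongo on a freshly
    // re-bootstrapped instance may not accept logins immediately. dial stands
    // in for the real connect-and-login step; the attempt count and delay are
    // illustrative values only.
    func connectWithRetry(dial func() error) error {
    	const (
    		attempts = 10
    		delay    = 5 * time.Second
    	)
    	var err error
    	for i := 0; i < attempts; i++ {
    		if err = dial(); err == nil {
    			return nil
    		}
    		time.Sleep(delay)
    	}
    	return fmt.Errorf("cannot connect after %d attempts: %v", attempts, err)
    }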
that will be a nice surprise [02:42] regardless, there's still race conditions to fix :-( [02:42] wallyworld: that's shaved 8 minutes off the last successful first-time run [02:42] yeah [02:42] I am looking at all the races now [02:43] and when we get a nailed up instance, goodbye to another 4-8 minutes [02:43] even SSDs are slow if they have to actually do IO [02:44] sure, but the difference on my laptop's SSD and tmpfs is about 30s in this test, not 2 minutes [02:45] ah [03:01] https://github.com/juju/juju/pull/213 [03:02] ^ another state related fix [03:02] actually related to the footgun that is suite.TearDownTest [03:02] oh fuck [03:02] the branch is mixed in with my loggo change [03:03] the change is in state/initialize_test.go [03:03] if you want to review [03:03] i'll repropose it after the bot has landed my previous branch [03:04] looking [03:06] davecheney: this is where rebase is handy. "git rebase -i master" and delete the loggo commit line [03:14] axw: hmm [03:14] i tried [03:14] but failed [03:14] the commits came back :* [03:16] mmkay. I never had any problems with it so I don't know what to suggest [03:17] axw: i'm batting 0.00 with git rebase [03:20] oh man... [03:20] head is spinning with auth shit [03:21] * axw calls victory over the lander [03:22] 3 in a row passed first time, 19m 17m and 17m [03:23] axw: i remain unconvinced [03:23] i suspect there is more pain in there [03:23] great :-) those were the times we saw when this was all first set up - it has degraded over time since :-( [03:23] speaking of shit, antipathy.nonserviam.net/utilities/shit <== beware, perl [03:23] basically any time mongo shits itself coming up [03:23] the test explodes in a blame misdirecting way [03:23] show image in terminal [03:23] saw it last night [03:23] davecheney: there are still problems, but I reckon it's due to slowness [03:24] pretty interesting [03:24] someone should write it in go [03:24] it could be called "go shit" [03:24] go is [03:24] * thumper chuckles [03:25] sfw - examples imgur.com/a/kmOhv#0 [03:25] thumper: ha, fool me thrice [03:25] sinzui: so i have a working solution. i tried to push the retry logic down to the root cause failure (login), but the underlying connection got closed underneath so i had to move it up higher. will propose and land in 1.20. took a while to get it right due to the iteration time [03:26] only took one retry loop before it connected [03:30] https://github.com/juju/juju/graphs/commit-activity [03:30] ^ this is what a sick commit bot looks like [03:31] sinzui: ah bollocks, it failed in a different place right near the end of the restore. ffs [03:32] might be spurious, will retry [03:32] * thumper out to collect daughter, back shortly [03:33] wallyworld: woohoo, now the github merge failures are back [03:34] \o/ [03:34] can't win [03:34] bbs picking up my daughter [04:38] bloody hell [04:38] juju/testing/mgo.go [04:38] how many panics are there in there !?!? [04:45] davecheney: local tests? The last build failure seemed to be a merge issue... [04:46] jcw4: this problem goes deeper than the last test failure [04:46] synopsis: [04:46] if mongo shits itself coming up [04:46] the test helper will panic in one of a dozen different ways [04:46] the panic is captured by the testing suite [04:46] which then goes to tear down the test [04:47] that usually generates a far more fatal panic [04:47] blech [04:47] davecheney: and mongo failing on startup is random? [04:47] jcw4: random as clockwork [04:47] haha [04:48] but... but...
mongodb is network scale [04:48] it's significantly less stable since we combined replica sets and tls into an unholy offspring [04:48] or something like that [04:48] hmm [04:49] wallyworld: axw http://paste.ubuntu.com/7734835/ [04:49] openstack tests fail regularly for me [04:49] complain they can't find tools and crap themselves [04:50] davecheney: what are the address type warnings about ipv6? [04:50] davecheney: seems possibly related to not finding resources on 127.0.0.1 [04:50] (ipv4) [04:51] our tests leak shitloads of pingers [04:51] http://paste.ubuntu.com/7734318/ [04:51] jcw4: no idea [05:01] davecheney: got a minute to chat? [05:01] davecheney: particularly about tests and pingers [05:02] thumper: sure, send me a hangout link [05:02] i'll go upstairs [05:03] kk [05:03] davecheney: https://plus.google.com/hangouts/_/canonical.com/testing [05:11] davecheney: what changes were made before those tests were run? [05:26] wallyworld: not sure I follow [05:26] i think those leaky pingers have been there for a while [05:26] i think they are leaking out of the mongodb driver [05:26] davecheney: i mean the tools errors [05:26] oh [05:26] PTAL: https://github.com/juju/juju/pull/216 [05:26] sorry, i'll try to capture more errors [05:26] ok [05:26] it doesn't happen consistently [05:26] \o/ [05:26] and when I run the tests in a loop i usually find another failure [05:26] which distracts me [05:27] for example, http://paste.ubuntu.com/7734852/ [05:32] jam: since you'll be on before me tomorrow: https://github.com/juju/juju/pull/216 :) [05:32] jcw4: ? [05:32] on call reviewer :) [05:32] k [05:33] jam: pretty straightforward [05:33] jam: what timezone are you? I thought you were closer to GMT? [05:34] jcw4: UTC + 4 [05:34] I'm in Dubai [05:34] jam: Ah! that's the second time I've forgotten that [05:35] jcw4: well it is 20+ people to keep track of :) [05:35] jam: I told fwereade a couple weeks ago I chatted with you late at night 'cause I thought you were down under [05:35] jam: and I was embarrassed to learn you were in dubai; now I know it's just that you're up bright and early [05:36] jcw4: in your defense, I don't keep to a very strict 8-hours-a-day work schedule. so I *might* show up at random times [05:36] my dog usually wakes me up for a walk at about 5:30-6am [05:36] jam: :) that's too early. well I'm UTC-7 so I'm off to bed myself [05:36] and my family was away for a week recently, which led to a bit more work time. [05:36] jcw4: rest well, it is quite late for you [05:37] yep, my self imposed bedtime was 30 minutes ago [05:37] :) [05:44] wallyworld: http://paste.ubuntu.com/7735005/ [05:46] davecheney: ok, that appears to be a test failure due to c.Check(info2, gc.DeepEquals, info) - what happens on failure is all the debug log stuff gets shown.
simplestreams debug logs what happens as it looks on the search path for metadata, and sometimes there's nothing there so it logs it for diagnostic purposes [05:46] so it appears to not be an issue so much [05:47] hmm [05:47] and it's upset because the test bound to 127.10.0.1 [05:47] not 127.0.0.1 [05:49] i'm not sure why that happened without looking into it [05:49] mkay [06:02] axw: bot is a lot happier now that /tmp is not on those slow disks [06:02] yeah :) [06:03] so much blue === vladk|offline is now known as vladk [06:07] buys us time to nail the remaining intermittent errors [06:07] cause running on a slow disk should not break the tests :-) [06:08] yup [06:08] cloud is always gonna be slow [06:08] that's what it does [06:13] indeed, really do need to get to the bottom of it === vladk is now known as vladk|offline === vladk|offline is now known as vladk [07:03] for the well motivated: https://github.com/juju/juju/pull/218 and https://github.com/jameinel/juju/pull/4 [07:03] dimitern: fwereade: ^^ [07:27] axw: did the race detector find those races you fixed? [07:27] wallyworld: yes [07:27] jeez, there's a lot of them [07:28] indeed :) [07:28] we need to set up a CI job [07:28] cause we won't pick them all up at review time [07:28] agreed [07:28] wallyworld: all I did was "go test -race ./..." [07:29] takes a long time though right? [07:29] not particularly long [07:29] I didn't time it [07:29] i still think a job will be helpful [07:35] morning dimitern [07:36] morning jam [07:36] jam, i'm looking at your PRs [07:36] thx [07:36] dimitern: and thanks for fixing the edit of the networking spec [07:37] jam, np, i should've done that initially [07:40] dimitern: hard to realize when your own editing of it works just fine :) [07:40] :D [07:48] vladk: thanks for reviewing https://github.com/juju/juju/pull/211, one point of note. When you finish a review, it is good to give a summary comment on the whole thing. That way we know that you are done reviewing, and we know whether you are giving it an overall LGTM or whether it is something that needs to come back for another review. [07:53] morning [07:53] jam: vladk: thx for review, good comments [08:06] morning all [08:11] morning TheMue and voidspace [08:11] * jam needs to go get food [08:45] jam, both reviewed [08:49] dimitern: does the coming network stuff need bridge-utils? I'm just tidying up our cloud-init, wondering if it can be removed for non-maas [08:51] axw, we should leave it to the networker to install that and the vlan package as needed, but let's leave it for now until it does? [08:52] dimitern: ok, will leave it there for now [08:52] axw, cheers [08:53] axw, if you're changing cloudinit anyway, perhaps add a TODO about dropping bridge-utils and letting the networker install it as needed [08:54] sure [08:54] there's a bug about it, I'll add a comment and ref to that [08:56] axw, awesome [08:59] axw, btw is the relation addresses work on hold pending approval? [09:00] dimitern: yes. I did a more modest change to trigger config-changed on address change, but relation addresses are on hold for now [09:00] axw, right, ok [09:00] needs some more discussion and/or arm twisting [09:00] :) [09:00] :) tell me about it [09:01] axw: want to do mini standup?
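A toy illustration (not juju code) of the kind of data race "go test -race ./..." reports in the exchange above, together with the usual mutex fix. Without the lock, two goroutines calling Inc concurrently would trip the detector:

    package example

    import "sync"

    // Counter is safe for concurrent use; the mutex serializes access to n.
    // Removing the Lock/Unlock pairs would make concurrent Inc calls a data
    // race that the race detector flags.
    type Counter struct {
    	mu sync.Mutex
    	n  int
    }

    func (c *Counter) Inc() {
    	c.mu.Lock()
    	defer c.mu.Unlock()
    	c.n++
    }

    func (c *Counter) Value() int {
    	c.mu.Lock()
    	defer c.mu.Unlock()
    	return c.n
    }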
[09:01] mgz: sure, brt [09:01] we've been going over the model back and forth for the past 3 weeks, but at least we're now into minor implementation details mostly [09:32] jam: so I've finally managed to find the right place to hardcode the ipv6 mongo address [09:32] jam: and I can get a connection to mongo on [::1]:37017 working fine [09:32] jam: which I think validates that mongo works ok locally with ipv6 [09:34] jam: and from his reading of the mongo codebase vladk thinks that we should be fine with both ipv4 and ipv6 connections to the same mongo(s) if we need it [09:50] voidspace, great news! [10:01] voidspace: sounds good, indeed. [10:05] voidspace: ping? [10:05] jam: have you got another part of the network coding I can support? [10:17] menn0: pong [10:30] dimitern: I think I've addressed all of your requests on https://github.com/juju/juju/pull/218 if you want to give it another quick pass. [10:30] jam, cheers, will look in a bit [10:31] TheMue: I don't think I quite understand? Just wanting more work to do ? [10:32] TheMue: we'll talk about it in 15 min anyway, so I'm going to just go make coffee and see you guys there. [10:32] jam: ok [10:41] voidspace, dimitern, jam, TheMue: here are my investigations on replicaset with IPv6 [10:41] https://docs.google.com/a/canonical.com/document/d/1mfLSIUAV5V7_uj3oxBDIGxlT5jPOKIXsWlmyO6pa23Y/edit?usp=sharing [10:41] *click* [10:42] vladk, thanks for sharing! [10:43] vladk: interesting notation when setting addr and port. go would dislike it [10:46] jam, standup? also - your PR is ready to land I think [10:51] axw, ping [10:53] mgz: i'm going to update godeps. worth watching the 'bot to check that it doesn't fall over [10:53] wallyworld: mgz: http://juju-ci.vapour.ws:8080/job/github-merge-juju/324/console failed because the script couldn't be found: /tmp/hudson6827171313236674227.sh: line 121: /home/ubuntu/jenkins-github-lander/bin/lander-merge-result: No such file or directory [10:56] rogpeppe, jam1: that was too fast, right? :) [10:56] mgz: too fast? [10:56] mgz: i think so [10:56] to blame rog :) [10:56] mgz: it failed a long while ago [10:56] mgz: so yes, not rog's fault [10:56] mgz: i don't think it had even merged by the time that message came here [10:56] phew :-) [10:58] yeah, it's from yesterday. I wasn't fiddling with it then, but someone may have been [10:58] would just try rerunning it, stuff has landed since then. [11:12] fwereade: pong. getting kids to bed and dinner on. I will bbl [11:12] axw, no worries, but if you have some time I would appreciate your thoughts on https://github.com/juju/juju/pull/189/files -- you've been in there a lot more recently than I have [11:13] axw, I think I'm saying things that are roughly sane but your oversight would be good [11:27] fwereade: nps, will take a look later === vladk is now known as vladk|offline === vladk|offline is now known as vladk [11:55] so I noticed there was state/policy.go and inside there is an interface EnvironCapabilities. So I should probably extend that to have the InstanceTypes and AvailabilityZones (it already has SupportedArchitectures) instead of defining my own interface (which was also called EnvironCapabilities). If I do that, I get an import cycle with state since I need to include it in the apiserver code to use this interface, but it is also included [11:57] do I move the interface to its own package under state? do I move the interface out of state and use the one I've defined in provider? [11:58] do I just have two different, but very similar interfaces?
[12:01] I feel like state is the wrong place for this interface anyway. It made sense to me existing in provider/common. [12:02] so I'm leaning towards just moving it there, since the interface itself is only used by providers anyway (afaict) [12:08] looks like it is used by state in a few places, hrmph [12:09] vladk: yay, your patch landed this time [12:09] jam: hooray [12:12] jam: back from lunch, shall I grab one coding task from our planning lane? [12:14] TheMue: so, we can have you (a) try to pair with voidspace on what he's doing with ipv6 and mongo [12:14] (b) have you look at fixing the firewall rules for providers to disallow external DB access [12:14] jam: TheMue: the first part of that is trivial and will be done soon [12:14] jam: TheMue: when I come back from lunch we could pair on the peergrouper stuff [12:14] voidspace: yeah, I expect '--ipv6' to be easy enough. It is more about pairing for replica set and peergrouper [12:15] sounds good [12:15] voidspace: though if you put that up for review for TheMue to look at before you go to lunch, he can approve it by the time you get back :) [12:15] cool [12:15] jam: voidspace: ok [12:15] :-) [12:15] sounds good [12:16] TheMue: if you have some time free now, getting developer docs on using API versioning is probably quite worthwhile while it is fresh in our memories [12:17] jam: yes, I'm currently looking into your changes and the doc to see where we differ from the old API [12:18] jam: just digging in PR 218 [12:22] jam: would btw like two APICall() versions, one like now with explicit version and using BestFacadeVersion(), and another one doing that implicitly [12:22] jam: APICallBestFacade() [12:22] jam: but that's only convenience ;) [12:23] TheMue: so we have FacadeCaller that provides the second form [12:23] along with BestAPIVersion for things that need to check compatibility [12:23] which I would expect most people would use [12:23] since they *also* just want to use calls on a single facade [12:23] jam: oh, not yet found it, nice === vladk is now known as vladk|offline [12:23] jam, how about instead of 1 suite, with methods to start ipv4/ipv6/both servers, instead to have 3 separate suites, that do each one separately at SetUpTest ? [12:24] jam: so in testing/mgo.go we hardcode the address to localhost:port (MgoInstance.addr) [12:24] jam: ah, no problem forget that [12:24] dimitern: it's possible, however I would think that you would have both kinds of tests in one file, and it might be more obvious to have them next to each other. [12:24] jam, because it's really not that easy to reason with MgoSuite and make it *not* start mongod in SetUpTest and assert it was closed in TearDownTest [12:24] jam: I just want to test I can *connect* to it with ipv6 [12:24] jam: "localhost" isn't a problem, right? [12:24] 127.0.0.1 would be [12:25] jam: no, it's fine - I was confusing starting the mongo instance and connecting to it [12:25] dimitern: I would be happy enough to have us split out Mongo testing from API Server testing. [12:25] having started it I just want to test connecting to it [12:25] what we probably care the most about at this layer is API Server testing [12:25] I thought I wanted to change the way it was started [12:26] jam, the problem is we can't start an API server without a connected backing state.State [12:26] dimitern: I'm just thinking that if I was writing tests, I would probably write: "TestIPv6APIServer()" and then "TestDualStackServer", etc.
[12:26] Which *could* be separate Suites, but it wouldn't be as obvious (I think) [12:26] alternatively, we have a test mixin [12:26] (so a common set of tests that get run with each base suite) [12:27] jam, I'll think about it some more [12:28] dimitern: so I'm less stuck on Mongo having to be connected to in the same way as the API server. [12:28] if it is easier to tease them apart for testing. [12:28] jam, at least i found out a smaller prereq i'll propose shortly - the api server needs to take Addrs []string (or []network.HostPort, but I'd rather not bind these two) and listen on all given EPs, not just one [12:28] dimitern: I think that's better, though I wonder if we want to just pass ":Port" and have it bind to all ? [12:30] jam, that's better actually and simpler [12:34] dimitern: well, I'm looking into the Python 'socket' code, and you can't just do that, at least. [12:34] dimitern: at least in python, when you declare a socket, you have to specify AF_INET or AF_INET6 [12:35] and then sock.bind(('0.0.0.0', 12345)) vs sock.bind(('::', 12345, 0, 0)) [12:35] (bind takes 4 args vs 2) [12:36] dimitern: http://golang.org/pkg/net/#Listen [12:36] dimitern: says that it must be "tcp" "tcp4" or "tcp6" [12:36] maybe "tcp" will give us both? [12:36] jam, right, we need "tcp" and "tcp6" and 2 Listen() calls [12:37] jam, I'll do some experiments [12:38] dimitern: well if we need 2 Listen calls, then we probably need 2 Listener objects, and if we want to support dual stack, then the inner select needs to select on both of them. [12:41] I get LOG messages "connection accepted", followed by a "no reachable servers" error [12:42] does the "connection accepted" log message mean that we *did connect*, or is it possible that the "no reachable servers" message is genuine [12:45] dimitern: ping ^^^ [12:47] voidspace, I think it means you connected to one of the mongos, but it itself couldn't reach the other replica set members perhaps? [12:48] rogpeppe will know better I guess ^^ [12:48] dimitern: I did some web searching, everything looks like you need 2 sockets to do it right. [12:48] dimitern: this is testing/mgo.go [12:48] so there's no replicaset [12:49] voidspace: the log message "connection accepted" just means TLS connected. I think we still need to Login, etc. after that. [12:49] jam: so mongo accepted the connection but login failed - something like that [12:50] This is how I'm constructing the connection: http://pastebin.ubuntu.com/7736622/ [12:50] essentially: info.Addrs = []string{fmt.Sprintf("[::1]:%s", testing.MgoServer.Port())} [12:50] let me try with 127.0.0.1 to make sure it can possibly work [12:52] voidspace: I forget the exact circumstances, but I've seen us go into a tight loop with 1000s of lines of "connection accepted"…. yeah, if it was accepted, why aren't you doing anything… :) [12:52] heh, no that fails [12:52] so the test construction is bad, nothing to do with ipv6 [13:00] hah, it was my python knowledge that was causing it to fail [13:00] dimitern: jam: %s is not the way to interpolate an int into a string with Sprintf ...
when I use %v the test passes [13:01] and without the ipv6 flag the test fails [13:01] ship it [13:03] :D [13:03] TheMue: https://github.com/juju/testing/pull/17 [13:10] jam, it actually works with Listen("tcp", ":54321") - the same listener accepts both 4 and 6 connections [13:11] TheMue: https://github.com/juju/juju/pull/221 [13:12] voidspace: looking [13:15] voidspace: just to be sure, when adding --ipv6 mongo only additionally enables ipv6, but still ipv4 in parallel, doesn't it? [13:18] TheMue: that's correct - and the rest of the test suite works [13:20] voidspace: fine [13:22] voidspace: so the first LGTM has been easy ;) [13:23] voidspace: and the second one too [13:30] dimitern "tcp" sounds good. I was trying to trace through the code to see if it would work, but didn't get to the root of it. [13:30] dimitern: I'm off until our call [13:30] jam, ok [13:30] TheMue: thanks [13:30] I'm going on lunch [13:31] voidspace, TheMue: so we need to do the same thing for github.com/juju/testing [13:31] since that sets up our MgoTestSuite [13:32] jam: I had pull requests for both [13:32] jam: and both reviewed [13:32] jam: probably need to bump revision in dependencies.tsv - but we're not yet *depending* on that [13:33] * voidspace really goes on lunch [13:56] abentley, wallyworld jam fwereade I am manually re-running the two failed restore function tests; 1 just passed. I think the feature is brittle, but better than 1.18. I hope to start a release in an hour [14:03] sinzui: would this be 1.19.5 or would we be jumping to 1.20? [14:04] I personally would like to go the "dev release becomes stable with only version.go changes" route, but I realize we're close on time. [14:04] jam I am releasing 1.20.0 from the 1.20 branch [14:05] jam Once I do that, I will change master to be 1.20-alpha1 [14:05] sinzui: I realize that, just that we have a fair number of code changes since 1.19.4 [14:05] sinzui: I think master becomes 1.21-alpha1 [14:07] jam, master has not passed in a week, so I am not inclined to attempt a 1.19.5. The plan last week was to only merge the safe and stable changes into 1.20 [14:08] sinzui: you would do 1.19.5 as a pre-release for 1.20 from the 1.20 branch [14:08] I would certainly not do a 1.19.5 from master [14:08] sinzui: I think *master* is already not a 1.20 branch, but a 1.21-pre branch [14:08] yep [14:09] sinzui: my point is that at one point in time, we wanted to have a stable release be *exactly the same code* as a dev release, except for the version.go line. [14:09] and while we've been doing 'stable' patches to 1.19.4, that is still patches from the last dev release. [14:10] We created the 1.20 branch out of desperation. I didn't do it until it was clear that was the only way to get a release out [14:10] jam. I think we need a policy change to get to "only the version is different". When we have a regression, we stop the line and everyone fixes it [14:11] sinzui: we can just do the release you want to do, but call it 1.19.5, and when we're happy it really is stable we do 1.20 with no other changes. [14:11] jam, master always has multiple regressions, and I think devs are building on top of the broken code [14:11] jam, release 1.19.5 today, then 1.20 tomorrow? [14:13] sinzui: given the timeframe, I'm not sure if it is actually worthwhile. If we were giving it 1 week then I think there would be genuine benefit for a stable release.
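dimitern's finding that a single "tcp" listener serves both address families, and voidspace's %v fix for building an IPv6 dial address, condensed into a runnable Go sketch. The port matches the experiment above; everything else is illustrative:

    package main

    import (
    	"fmt"
    	"net"
    )

    func main() {
    	// On a dual-stack host, "tcp" with no host part binds one listener
    	// that accepts both IPv4 and IPv6 connections; separate "tcp4" and
    	// "tcp6" listeners are not required.
    	l, err := net.Listen("tcp", ":54321")
    	if err != nil {
    		panic(err)
    	}
    	defer l.Close()

    	// Building a dial address by hand: %v renders an int port where %s
    	// does not, and net.JoinHostPort adds the brackets an IPv6 literal
    	// needs.
    	port := 54321
    	fmt.Println(fmt.Sprintf("[::1]:%v", port))                    // [::1]:54321
    	fmt.Println(net.JoinHostPort("::1", fmt.Sprintf("%v", port))) // [::1]:54321
    }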
[14:14] sinzui: as for getting into a bi-weekly release cadence for dev releases, we've talked about whether we have to split off each one for stabilization. We have a track record that says we need to. [14:14] anyway, I really need to go, but hopefully the release happens, it sounds good. [14:23] voidspace, vladk|offline, TheMue, a minor PR to enable and test the apiserver listens on ipv4+ipv6 - https://github.com/juju/juju/pull/224/ [14:24] rogpeppe, you might be interested as well ^^ [14:24] dimitern: thanks [14:24] are there any sort of architectural diagrams for juju? [14:24] to help identify subsystems, etc.? [14:25] katco: try https://github.com/juju/juju/tree/master/doc [14:25] ty, sir! [14:25] katco: more text than diagrams though [14:26] it's a start :) [14:27] dimitern: looking [14:28] TheMue, cheers [14:53] dimitern: that looks great [14:54] dimitern: so far here too, but now 1:1 with alexisb [14:54] voidspace, yeah, it turned out simpler than I expected [14:54] TheMue, np [15:00] mgz: no merge bot on juju/testing ? [15:01] dimitern: LGTM by my side too [15:02] TheMue, thanks! [15:03] voidspace, wwitzel3, perrito666: standup? [15:04] ericsnow: I am in a rather unhappy place right now, I would say that you go on without me, there are no changes since our last standup on my side [15:05] perrito666: :( Do you need a break to come visit us in the land of the living? [15:05] ericsnow: I currently am 200km from my usual workplace, I am working at a friend's house for the day, I had to travel to get a certificate that entitles my wife to drive my car (yes, you read that correctly) [15:06] perrito666: good luck [15:06] ericsnow: oh I got the required paper, it took me 5 min :p [15:06] it's the 200km part that bugs me [15:07] I need to do this for every person I want to authorize to drive my car [15:08] and now there is also a kid crying next to me :p [15:17] ericsnow: not sure why my highlight doesn't work when it is in a list with other names, but just saw this. [15:18] ericsnow: I just got out of the tosca meeting 5 minutes ago, you still want to do standup? [15:18] wwitzel3: I wouldn't mind [15:19] ericsnow: meet you there [15:19] ericsnow: not me [15:19] also fwereade I sent you that branch to look at [15:19] ericsnow: I don't like you lot these days [15:19] lol [15:19] lol [15:19] ericsnow: in moonstone [15:20] interesting, i see a lot of mutexes in the logging framework. isn't the idiomatic way to do synchronization in go message passing? [15:20] voidspace: you do not like us now that you have better internet, you have changed [15:20] hah, I wish my internet was better [15:20] perrito666: it's still horrible :-/ [15:21] I've complained to the ISP now anyway [15:21] voidspace: it's a good thing you don't have to spend a week locked with us :p [15:21] voidspace: they said new, not better [15:21] hah [15:32] TheMue: you up for / available for pairing?
[15:32] dimitern: ping === urulama is now known as urulama-away [15:41] voidspace: now again, had been afk for giving my wife a helping hand in the garden ;) [15:44] TheMue: cool :-) [15:44] TheMue: there are three tickets we could plausibly work on [15:45] TheMue: peergrouper IPv6 testing - but I think that one is better waiting for the test suite that dimitern is working on [15:45] TheMue: Have tests for Mongo in IPv6 replicaSet tests [15:46] TheMue: Client: juju/api.go Don't drop IPv6 addresses [15:46] TheMue: I get the feeling that jam would prioritise "Have tests for Mongo in IPv6 replicaSet tests" slightly higher [15:46] TheMue: but there is no additional description in the card [15:47] voidspace: ack [15:47] voidspace: yeah, cards often have only the title ;) [15:48] TheMue: I guess the most important thing is that members of a replicaset can talk to each other over ipv6 [15:48] I am using the local provider. I deleted the /var/cache/lxc/cloud-trusty/ubuntu*cloudimg*.tar.gz, destroyed and re-bootstrapped. I don't see a new image in that directory, but juju "deploys". I thought juju would download a new image. [15:48] and possibly in "dual stack" scenarios, but I think we're going to (initially at least) require that all HA state servers be on either ipv6 or ipv4 (so no mixed environments) [15:49] TheMue: on the other hand, the other ticket "Client: juju/api.go Don't drop IPv6 addresses" seems nice and straightforward [15:49] TheMue: although I think we can't actually use that until we can connect to the api server with ipv6 [15:50] voidspace: sounds logical. even if it may be possible it always can lead to hassle [15:50] voidspace: exactly, that's a dependency [15:51] voidspace: so the order would be replicaSet, no dropping in api and finally peergrouper [15:52] TheMue: cool [15:52] TheMue: hangout? [15:52] voidspace: can do, mom, have to fetch headset [15:53] did you just refer to voidspace as mom? I've clearly missed something TheMue [15:53] mgz: hehe, no [15:53] that's what I thought [15:53] mgz: mom or mom pls is simply for „one moment please" [15:54] mgz: hey, for juju/testing do I need to manually approve the merge [15:55] mgz: how would you abbreviate it? [15:55] voidspace: sorry, missed that question earlier - yeah, for anything where github gives you the option of manually merging, you need to run the tests yourself and hit that button on the page [15:56] I'm planning to start switching stuff over shortly, but atm it's only a few projects that are landing bot controlled [15:56] ok, cool [15:56] mgz: thanks [16:00] voidspace: so, calling you in hangout ;) [16:02] voidspace: hmm, doesn't work. taking the sapphire hangout? https://plus.google.com/hangouts/_/canonical.com/juju-sapphire [16:02] TheMue: I'm there [16:13] voidspace: lost you [16:13] ;) [16:54] * TheMue lost connection [16:55] voidspace: I've got to step out, will ping you tomorrow morning [17:04] TheMue: ok, sorry for the delay === revagomes__ is now known as revagomes [17:22] would anyone have some time to answer some questions about the logging systems? [17:27] alexisb, I just sent an email about the 1.20.0 release. I need to eat. Perhaps you want to talk in about 30 minutes about it [17:27] sinzui, I will be out for a bit for lunch, but can ping you when i am back [17:27] * alexisb goes and looks at email [17:40] dimitern, what's the status on container addressability?
[17:42] hazmat, it's on hold since ipv6 gained priority [17:42] dimitern, ic, thanks [17:43] grrr, new phone arrived today [17:43] DOA [17:43] :-/ [17:44] hazmat, btw why does the deployer still connect to state directly? (somebody mentioned it today) [17:45] dimitern, it doesn't [17:45] dimitern, you mean juju deployer or unit deployer in juju? [17:45] hazmat, juju deployer [17:46] dimitern, it doesn't connect to mongo... it primarily uses the api or cli. [17:46] hazmat, we'll be closing the mongo port 37017 in 1.20.x soon [17:46] dimitern, there's a separate tool .. juju db inspector that does [17:46] hazmat, ah, ok then, that's fine [17:47] dimitern, primarily the value there (state inspection with db inspector) is being able to look at unit relation data [17:47] hazmat, right [17:47] hazmat, if it's just that, why not use juju run relation-get ... [17:48] dimitern, good question ;-) .. amulet goes through all kinds of tricks to try and do just that (inserting proxy charms and subordinates) for charm testing. prolly cause it predates juju run [17:48] marcoceppi, ^ [17:49] :) i see [17:49] Oh, it predates the idea of there even being a juju run [17:50] dimitern: also, juju run relation-get requires you to do several other runs to get relation, relation-id, list of units, etc [17:50] but juju-run is a "good enough" alternative for sure [17:51] marcoceppi, you can run a script that does all that in one juju run call === vladk|offline is now known as vladk [17:51] dimitern: you could, and I guess the unit subordinates could install that script [17:51] * marcoceppi wanders off to contemplate this [17:52] marcoceppi, yeah, or you can just juju scp it :) === vladk is now known as vladk|offline === vladk|offline is now known as vladk [18:01] I am getting an error in Juju panic: runtime error: invalid memory address or nil pointer dereference [18:02] mbruzek, file a bug with the traceback from the machine log [18:04] Thanks hazmat I will do that. Is there anything else they would need? [18:06] mbruzek, juju status output for some context might help.. [18:57] i'm having trouble figuring out where juju instantiates log files. can anyone give me a pointer? [18:58] perrito666, can anyone help? [18:58] http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore/ [18:58] katco: look for DefaultLogDir [18:59] perrito666: ty [18:59] sinzui: going [18:59] ^ The job fails for 1.20, but we haven't landed related changes to it [18:59] I have amassed a lot of test results, but not a consistent failure to say what is wrong [18:59] * perrito666 uses a lot of adjectives [19:02] sinzui: ... odd as hell [19:05] jenkins says that this broke at https://github.com/juju/juju/commit/e7f77fc1 [19:08] perrito666, that is master, not 1.20 [19:09] perrito666, commit 9ece6097 could be what broke the test [19:09] sinzui: I just clicked on the link you gave me and then the first failing one [19:10] jenkins doesn't understand branches. It is a moron [19:11] perrito666, This commit is the first failure in 1.20 https://github.com/juju/juju/commit/9ece6097f4b21db8e934885ff9cb1373afa579f4 [19:11] oh I see [19:11] I suspect cloud issues rather than code changes, but it does indicate brittleness. [19:12] perrito666, I am starting a conversation about releasing 1.20.0 without support for this test [19:14] sinzui: I don't have the setup for that test at hand [19:14] It does look like restore is taking a long time [19:15] I am really hoping it is only that [19:16] perrito666, setup takes longer. I am updating the test to work with modern HP.
I am 30 minutes into it without achieving HA [19:19] :| [19:26] sinzui: I really need to EOD today but I'll be back later (I need to take a 3h bus ride) [19:27] perrito666, Thank you for your time [19:27] so if you throw in more data I will be glad to take a look [19:27] C U ALL [19:27] bye [19:43] Hey guys, I'm having issues with deploying my charm locally (it sits in /home/vagrant/charms/trusty/). I exported JUJU_REPOSITORY as /home/vagrant/charms and deploy with juju deploy local:trusty/metis however the log (.juju/local/log/unit-metis-0.log) claims the install file does not exist (it does in the metis/hooks). The charm proceeds to stay in a dying state after that (I have to destroy the machine and then the service in order to try again). Any tips? Full unit log at http://paste.ubuntu.com/7738369/ [20:00] lazypower, mbruzek ^ Do you have any experience with JoshStrobl's issue? [20:01] looking now [20:01] hey JoshStrobl o/ [20:01] hey lazypower o/ [20:01] Hi JoshStrobl is this a local charm ? [20:01] Yes. [20:01] JoshStrobl, the file is executable? 777? [20:01] that's interesting... the log leads me to believe the hooks weren't copied over. [20:01] or 755 [20:01] Yes, all the files in hooks are executable [20:02] JoshStrobl, have you checked in a branch to bzr yet? [20:02] JoshStrobl, Do you have some time to share your screen on G+? [20:02] mbruzek: I have an existing branch that is out of date. My work takes place via Git and GitHub and the charm will be residing on Launchpad. [20:02] mbruzek: Sure thing, let me get my mic set up. [20:02] https://plus.google.com/hangouts/_/canonical.com/juju-solutions [20:03] mbruzek: It'll be a few minutes :) [20:05] mbruzek: gotta install the plugin :D [20:05] no worries. [20:09] mbruzek: the link says the party is over [20:09] JoshStrobl, I am in the room right now [20:09] still says it after refreshing :\ [20:10] JoshStrobl: you may need to close out of chrome/firefox and restart it, the plugin is notoriously terribad at first time installation. === tvansteenburgh1 is now known as tvansteenburgh === vladk is now known as vladk|offline [20:39] https://github.com/StroblIndustries/Metis/tree/experimental [20:42] sinzui, I responded to your mail, but I am also happy to chat if you like [20:43] alexisb, Thank you. I am happy for the leads to speak up. [20:52] * JoshStrobl looks at Microsoft and curses out loud. Then opens up gEdit. [20:53] JoshStrobl, looks like dos2unix would help you out there Josh. [20:53] sinzui: found the issue - it was windows line endings in the character encoding. [20:53] FYI - one more thing we need to be aware of as our story with windows users grows. [20:53] hmm [20:53] (josh, you're a pioneer using windows line endings in linux.) [20:53] I would like to say I'm proud of that, but I'm not. [20:54] I don't even know how that happened... [20:54] i didn't know gedit had that as an option myself [20:54] lazypower, maybe charm proof could assist us [20:54] I blame gEdit...because you know, it is totally never the developer's fault /s [20:54] but it's nice to know we support the desktop interop [20:54] sinzui: really good idea. i'll file a bug [20:55] lazypower: ping me the launchpad bug when it's up, I'll mark myself as affected [20:55] lazypower, There was a file in juju core's win script that had a single win line ending in it. It was maddening to find. [20:56] https://bugs.launchpad.net/charm-tools/+bug/1336936 [20:56] <_mup_> Bug #1336936: Charm proof to detect character encoding [20:56] lazypower: thanks [20:56] sinzui: wow - epic.
a single line? [20:56] how does that even happen? [20:56] smallest broken merge, ever. [20:56] yeah, how does that happen? [20:58] You guys saved me several hours and probably a few euro at the pub. You guys rock. [21:01] JoshStrobl: we aim to please. tell your friends about us. [21:01] lazypower: Will do. [21:04] Oh and by the way, the dos2unix tool works flawlessly. I'd recommend making note of that tool in the docs for Winderps users. I was able to just do dos2unix * in the hooks directory and it converted all the scripts to unix "format". === thumper is now known as thumper-afk [22:14] fwereade: ping [22:22] wallyworld__: http://paste.ubuntu.com/7739086/ [22:22] more problems with the openstack provider [22:23] wallyworld__: gah, sorry, here is the real error [22:23] http://paste.ubuntu.com/7739090/ [22:23] bug report coming [22:24] wallyworld__: hey another question, looking at this comment. it says that the default for rsyslog is 0640; is that sufficient, or do we want to exclude the group as well? [22:24] https://bugs.launchpad.net/juju-core/+bug/1336980 [22:24] <_mup_> Bug #1336980: provider/openstack: map ordering issue in tests [22:24] spot the bug [22:36] wallyworld__: ah, and i remembered my other question. open to the room as well: is there a faster way to recreate the log-files in /var/log than to destroy-environment and bootstrap a new one? [22:37] katco: on a call, will respond soon [22:37] wallyworld__: no worries and no rush. thank you, sir! === revagomes_ is now known as revagomes [23:21] katco: sorry, off call now. today is my meeting day, lots of them. using juju proper, there's no faster way that i know of. you could though change the rsyslog conf file and stop and restart rsyslog (deleting the all-machines log in between) to see any permissions changes take effect
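For reference, the line-endings fix that rescued JoshStrobl's hooks above is roughly what dos2unix does, sketched here in Go; it is also the shape a charm proof check (bug #1336936) might take. This is an illustrative sketch, not charm-tools code. A hook whose shebang line ends in \r is unrunnable, which is why the install hook appeared not to exist:

    package main

    import (
    	"bytes"
    	"fmt"
    	"os"
    )

    // Strip CRLF line endings from each file named on the command line.
    func main() {
    	for _, path := range os.Args[1:] {
    		data, err := os.ReadFile(path)
    		if err != nil {
    			fmt.Fprintln(os.Stderr, err)
    			continue
    		}
    		fixed := bytes.ReplaceAll(data, []byte("\r\n"), []byte("\n"))
    		// 0755 applies only if the file is created; an existing hook
    		// keeps its permissions, including the executable bit.
    		if err := os.WriteFile(path, fixed, 0755); err != nil {
    			fmt.Fprintln(os.Stderr, err)
    		}
    	}
    }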