wallyworld_ | ahasenack: i think that's an ec2 issue from what i understand. i got the same error but can bootstrap fine on hp cloud etc. | 00:16 |
wallyworld_ | i did do a successful bootstrap just recently on ec2 and then it just stopped working | 00:17 |
ahasenack | my env was bootstrapped, and then juju deploy started to fail with that error | 00:17 |
ahasenack | I then destroyed the environment and then bootstrap started to fail too | 00:17 |
wallyworld_ | i'm not sure if there's some place that can be checked for known ec2 outages | 00:18 |
ahasenack | you think it's a s3 outage? | 00:19 |
wallyworld_ | it appears so. it's nothing to do with juju in my opinion | 00:23 |
wallyworld_ | maybe not an outage per se but an issue outside of juju's control | 00:24 |
ahasenack | wallyworld_: that actually sounds reasonable, I'm trying some s3 operations via aws's console, and they are failing | 00:27 |
wallyworld_ | :-( | 00:27 |
wallyworld_ | i hope it's fixed soon | 00:27 |
bradm | anyone about who can talk about LP#1241674 ? | 01:41 |
_mup_ | Bug #1241674: juju-core broken with OpenStack Havana for tenants with multiple networks <cts-cloud-review> <openstack-provider> <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1241674> | 01:41 |
hazmat | what's the timeout on bootstrap? | 06:43 |
wallyworld_ | hazmat: default 10 minutes but now can be changed | 06:50 |
wallyworld_ | if you run trunk | 06:50 |
hazmat | wallyworld_, cool, how? i'm on a crappy net connection, and mongodb times me out.. i'm on trunk | 06:50 |
wallyworld_ | let me check | 06:50 |
hazmat | wallyworld_, thanks | 06:51 |
wallyworld_ | hazmat: run bootstrap --help | 06:51 |
wallyworld_ | # How long to wait for a connection to the state server. | 06:51 |
wallyworld_ | bootstrap-timeout: 600 # default: 10 minutes | 06:51 |
wallyworld_ | # How long to wait between connection attempts to a state server address. | 06:51 |
wallyworld_ | bootstrap-retry-delay: 5 # default: 5 seconds | 06:51 |
wallyworld_ | # How often to refresh state server addresses from the API server. | 06:51 |
wallyworld_ | bootstrap-addresses-delay: 10 # default: 10 seconds | 06:51 |
wallyworld_ | the above go in your env.yaml | 06:51 |
hazmat | wallyworld_, got it thanks. | 06:51 |
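For reference, a minimal sketch of where those settings sit in an environments.yaml; the environment name and the other keys here are hypothetical:

```yaml
environments:
  my-ec2:        # hypothetical environment name
    type: ec2
    # How long to wait for a connection to the state server.
    bootstrap-timeout: 600         # default: 10 minutes
    # How long to wait between connection attempts to a state server address.
    bootstrap-retry-delay: 5       # default: 5 seconds
    # How often to refresh state server addresses from the API server.
    bootstrap-addresses-delay: 10  # default: 10 seconds
```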
bradm | anyone about who can talk about LP#1241674 ? | 06:51 |
_mup_ | Bug #1241674: juju-core broken with OpenStack Havana for tenants with multiple networks <cts-cloud-review> <openstack-provider> <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1241674> | 06:51 |
wallyworld_ | bradm: mgz is your best bet | 06:52 |
wallyworld_ | bradm: it says fix released - is it now working? | 06:53 |
wallyworld_ | not | 06:53 |
bradm | wallyworld_: well, I'm on the verge of testing it, I have an openstack setup deployed using maas that gets that error - but I had some questions about what happens if it does work - ie, will the fix be backported to 1.16, or do we have to wait for 1.18? and timeframes around that happening | 06:53 |
bradm | wallyworld_: at this rate I should have it tested and confirmed next week | 06:54 |
bradm | wallyworld_: but this is for the new prodstack for Canonical - I can't see us going live with a dev juju for that :) | 06:54 |
wallyworld_ | bradm: i personally am hopeful 1.18 will be out real soon now | 06:54 |
wallyworld_ | but we may need to consider a backport if 1.18 drags on a bit | 06:54 |
bradm | wallyworld_: if we're talking a couple of weeks, great - if it's months, we'll have issues | 06:55 |
wallyworld_ | it will be weeks but maybe a few rather than a couple if i had to guess | 06:55 |
wallyworld_ | we need to get some critical stuff in place for upgrades and other things before we release | 06:55 |
bradm | right, so if I said a couple to a few weeks, that could be reasonable? | 06:55 |
bradm | we have other things that need to be done too, so this isn't the only blocker | 06:56 |
bradm | just means everything will have to be done using 1.17 until it's released | 06:56 |
bradm | fun things like if you reboot a swift storage node, the charm hasn't set up fstab entries so swift doesn't work so well anymore. :) | 06:57 |
wallyworld_ | bradm: i'd have to take a closer look at the bugs against 1.18 milestone. i really wouldn't like to guess without more knowledge | 06:57 |
bradm | wallyworld_: ok, but you're thinking weeks rather than months, and if it blows out we could hope for a backport? | 06:58 |
wallyworld_ | yes, that is my view :-) | 06:58 |
wallyworld_ | if i were king for a day | 06:58 |
bradm | I'll put a comment on the ticket after I've tested all this, and mention our concerns | 06:59 |
wallyworld_ | bradm: it depends a bit perhaps on what comes out of the mid-cycle sprint currently underway in SA | 06:59 |
bradm | wallyworld_: there's a lot of people waiting on this openstack setup :-/ | 06:59 |
wallyworld_ | sure. make sure we are aware and then stuff can be looked at | 06:59 |
wallyworld_ | i can imagine. to me it is quite critical | 06:59 |
wallyworld_ | but i'm only one voice | 06:59 |
wallyworld_ | maybe a backport would be feasible, then that relieves the pressure somewhat | 07:00 |
bradm | yeah, that would be sufficient even. | 07:01 |
bradm | we'll see - I should be able to get final testing done next week, it's been a lot of waiting on hardware, and getting that into place | 07:01 |
wallyworld_ | ok. let us know how it goes and what you need | 07:03 |
bradm | will do, I pretty much have all the pieces in place now for at least some preliminary testing, so I should know pretty quickly next week if it works | 07:03 |
wallyworld_ | good luck :-) | 07:04 |
bradm | thanks. | 07:07 |
hazmat | hmm.. just got a report from a user.. is juju replacing authorized keys on machines? or just augmenting? | 07:11 |
hazmat | they're claiming their IaaS-API-provided keys stopped working once juju agents started running on the systems. | 07:13 |
=== jam is now known as Guest47101 | ||
=== Guest47101 is now known as jam1 | ||
wallyworld_ | hazmat: juju augments (appends to) any keys already existing in the ~/.ssh/authorized_keys file | 07:37 |
dimitern | rogpeppe, wallyworld_, mgz, standup | 10:47 |
dimitern | waigani, your connection could be better :) | 11:39 |
adeuring | natefinch: could you have another look here: https://codereview.appspot.com/60630043 ? | 13:08 |
natefinch | adeuring: sure | 13:08 |
natefinch | adeuring: reviewed. Thanks for looking into the OS-specific stuff. I just wanted to make sure we were being careful to not be too linux specific. | 13:14 |
adeuring | natefinch: thanks | 13:14 |
natefinch | sweet... now there's 2 waiganis... wonder if we'll end the day with 20 or something | 13:39 |
dimitern | adeuring, just so you know - when you push more revisions after the MP is approved, (i.e. fixing test failures the bot found) you'll need to self-approve it first with a comment, and then mark it as approved again, so the bot will be happy to land it | 14:01 |
adeuring | dimitern: thanks, i really tend to forget the comment ... | 14:02 |
dimitern | adeuring, yep, i did too, but the bot never forgets :) | 14:03 |
adeuring | dimitern: yeah, that's non-human bureaucracy ;) | 14:04 |
rogpeppe | i'm seeing test failures on trunk (running tests on the state package): http://paste.ubuntu.com/6891504/ | 14:19 |
rogpeppe | anyone else see the same thing? | 14:19 |
rogpeppe | (i'm seeing it every time currently) | 14:19 |
rogpeppe | dimitern, natefinch, mgz: ^ | 14:19 |
mgz | rogpeppe: will see | 14:19 |
dimitern | rogpeppe, i'm pulling trunk to try | 14:20 |
rogpeppe | mgz, dimitern: thanks | 14:20 |
mgz | (cd state && go test) enough? | 14:21 |
rogpeppe | mgz: should be | 14:22 |
dimitern | rogpeppe, OK: 395 passed | 14:23 |
rogpeppe | dimitern: hmm. still fails every time for me | 14:23 |
mgz | I got one of them | 14:25 |
mgz | the second only | 14:25 |
dimitern | rogpeppe, are you sure you have all the deps right? i needed to go get error and do godeps -u, which failed for gwacl (rev tarmac something not found), otherwise all good | 14:25 |
rogpeppe | mgz: ok, that's useful | 14:25 |
mgz | same failure (bar the random port) | 14:25 |
rogpeppe | dimitern: yeah, that was the first thing i did | 14:25 |
rogpeppe | unfortunately i don't get the same failure when running individual suites or tests | 14:25 |
dimitern | rogpeppe, i'm running them now several times to make sure | 14:25 |
dimitern | rogpeppe, i'm running go test -gocheck.v in state/ | 14:26 |
dimitern | rogpeppe, what's the panic in relationsuite? | 14:28 |
rogpeppe | hmm, i've just seen another error | 14:28 |
rogpeppe | dimitern: when a fixture setup method fails, gocheck counts it as a panic | 14:28 |
dimitern | rogpeppe, ah, i see | 14:29 |
rogpeppe | this time i got this: http://paste.ubuntu.com/6891545/ | 14:29 |
rogpeppe | (ignore the timestamps) | 14:29 |
dimitern | rogpeppe, hm, i got 2 failures on the third run: http://paste.ubuntu.com/6891551/ | 14:30 |
rogpeppe | dimitern: ah, that looks like the same thing | 14:30 |
rogpeppe | well at least it's not just me | 14:31 |
mgz | ah, you're just best at hitting the races for some reason rog :) | 14:31 |
dimitern | rogpeppe, it seems mongo couldn't handle the stress | 14:31 |
dimitern | rogpeppe, it's not properly shutting down and cleaning up stuff, or it lags | 14:31 |
rogpeppe | dimitern, mgz: looks like it's a consequence of changes to mgo between rev 240 and now | 14:54 |
rogpeppe | (and there do seem to be some relevant changes there) | 14:54 |
dimitern | rogpeppe, oh yeah? what changes? | 14:54 |
rogpeppe | dimitern: i'm still bisecting | 14:54 |
rogpeppe | dimitern: somewhere between r240 and r243 | 14:55 |
dimitern | rogpeppe, how are you bisecting? i haven't used the more advanced vcs forensics like that | 14:55 |
rogpeppe | dimitern: manually :-) | 14:55 |
mgz | ah, interesting | 14:55 |
rogpeppe | dimitern: bzr update -r xxx; go install | 14:56 |
dimitern | rogpeppe, ah :) | 14:56 |
mgz | we took that mgo bump for a gcc fix | 14:56 |
dimitern | rogpeppe, there are commands like bisect (for git or hg i think) that supposedly takes a lot away from the manual checking | 14:56 |
rogpeppe | dimitern: i know, but i can never figure out how to use them well | 14:57 |
mgz | which was trivial, but presumably picked up a bunch of other things, despite me being conservative with it | 14:57 |
rogpeppe | dimitern: i've started to try | 14:57 |
rogpeppe | dimitern: but never got very far. manual is quite easy anyway | 14:57 |
mgz | r241 looks the most suspect | 14:58 |
rogpeppe | mgz: yup, if my current run fails, that's where the finger points | 14:59 |
rogpeppe | mgz: yeah, that's it | 14:59 |
natefinch | rogpeppe: voyeur code: https://codereview.appspot.com/57700044 | 14:59 |
dimitern | rogpeppe, yeah, me too | 14:59 |
mgz | it flat adds a timeout in a bunch of places that had none before | 15:00 |
rogpeppe | mgz: there were later fixes to that code, but i guess they didn't work | 15:00 |
rogpeppe | mgz: i'll try with r248 and see if it still fails | 15:00 |
rogpeppe | natefinch: thanks. looking | 15:00 |
mgz | probably we need to SetTimeout to something longer in the context of our tests | 15:00 |
mgz | 5 seconds should be okay, but is probably pushing it for some of our testing | 15:02 |
rogpeppe | natefinch: reviewed | 15:15 |
rogpeppe | mgz: i'm not quite sure what it's a timeout for anyway | 15:17 |
rogpeppe | pwd | 15:17 |
natefinch | rogpeppe: thanks | 15:18 |
rogpeppe | mgz: it still fails for me if i change pingDelay to 30 seconds | 15:24 |
rogpeppe | mgz: so that may not be the issue | 15:24 |
mgz | yeah, that was the one that already existed, the new one is syncSocketTimeout | 15:26 |
rogpeppe | mgz: ah, i traced the code wrongly without looking at the diffs. foolish. | 15:27 |
mgz | (pingDelay did get lowered... but seems less impactful anyway) | 15:27 |
rogpeppe | mgz: doesn't look like it was syncSocketTimeout either | 15:29 |
rogpeppe | mgz: (i still see failures when it's 100 seconds) | 15:29 |
mgz | hm, that's no fun. | 15:29 |
rogpeppe | mgz: i'm just experimenting by printing out the deadlines as they're set | 15:32 |
rogpeppe | mgz: hmm, it seems like sometimes the timeout is only 100ms | 15:42 |
rogpeppe | mgz: ha, variously 10m, 100s, 15s and 100ms | 15:44 |
mgz | o_O | 15:48 |
rogpeppe | mgz: ah ha, i think i have it - the initial dial timeout is also used for the socket timeout | 15:59 |
rogpeppe | mgz: and we use 100 milliseconds in TestingDialOpts | 15:59 |
mgz | doh! | 15:59 |
rogpeppe | mgz: it's kind of odd that that value is used for two very different things actually | 16:00 |
rogpeppe | mgz: ha, i was wondering why it was being a little slow to read the file that i'd sent the output of go test to. turned out that wasn't too surprising because it was 271MB! | 16:13 |
rogpeppe | mgz: good thing my editor copes fine with that... | 16:13 |
mgz | I actually managed to make vim choke on a juju log file the other day | 16:18 |
mgz | giant log on a hobbling along m1.tiny... time for less | 16:18 |
rogpeppe | mgz, dimitern, natefinch: simple review for a fix to the above issues: https://codereview.appspot.com/61010043/ | 16:38 |
rogpeppe | mgz: doesn't vim store the whole file in memory? | 16:38 |
dimitern | rogpeppe, looking | 16:44 |
dimitern | rogpeppe, no it doesn't - you can open a multigigabyte file almost instantly - emacs does the same :P | 16:45 |
rogpeppe | dimitern: ah, i thought it did, interesting | 16:46 |
rogpeppe | dimitern: presumably it does have to copy the file when opening it though | 16:46 |
dimitern | rogpeppe, reviewed | 16:46 |
rogpeppe | dimitern: thanks | 16:46 |
dimitern | rogpeppe, why copy? | 16:46 |
rogpeppe | dimitern: because there's usually an assumption that if i'm editing a file, i can remove it or move it, and still write it out as it is in the editor buffer | 16:47 |
dimitern | rogpeppe, even if you delete it, the file handle that the editor opened is still readable some time after (perhaps up to the amount that got cached in the kernel) | 16:49 |
rogpeppe | dimitern: and a quick experiment persuades me that vim *does* copy the data (although i can't tell if it's to memory or disk) | 16:49 |
rogpeppe | dimitern: what if you overwrite it? | 16:49 |
dimitern | rogpeppe, the contents will be in the kernel file cache for some time at least | 16:49 |
dimitern | rogpeppe, if you open it again after overwriting you'll see the change | 16:50 |
rogpeppe | dimitern: i just confirmed: vim does not keep the original file open | 16:51 |
dimitern | rogpeppe, hmm good to know | 16:52 |
rogpeppe | dimitern: also, it does look like vim stores everything in memory | 16:58 |
rogpeppe | dimitern: (not that it's too much of an issue now, with enormous memories, but worth being aware of) | 16:58 |
dimitern | rogpeppe, perhaps up to certain size | 16:58 |
rogpeppe | dimitern: i'd have thought that 180MB was probably larger than that size | 16:59 |
dimitern | rogpeppe, i doubt you can put a 4GB file in memory so quickly, judging by the speed it opens it | 16:59 |
dimitern | rogpeppe, i think it uses memory mapped view of the file, using the kernel file cache | 17:00 |
rogpeppe | dimitern: it doesn't seem to | 17:00 |
rogpeppe | dimitern: but maybe it's different for truly enormous files | 17:00 |
dimitern | rogpeppe, yep | 17:01 |
* dimitern reached eod | 17:04 | |
dimitern | happy weekends everyone! | 17:04 |
rogpeppe | dimitern: and you! | 17:07 |
mgz | later dimitern! | 17:08 |
rogpeppe | dimitern: BTW vi definitely seems to read it all into memory, even for GB-sized files | 17:09 |
rogpeppe | that's me for the day | 18:40 |
rogpeppe | g'night all | 18:40 |
=== gary_poster is now known as gary_poster|away |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!