/srv/irclogs.ubuntu.com/2014/02/07/#juju-dev.txt

[00:16] <wallyworld_> ahasenack: i think that's an ec2 issue from what i understand. i got the same error but can bootstrap fine on hp cloud etc.
[00:17] <wallyworld_> i did do a successful bootstrap just recently on ec2 and then it just stopped working
[00:17] <ahasenack> my env was bootstrapped, and then juju deploy started to fail with that error
[00:17] <ahasenack> I then destroyed the environment and then bootstrap started to fail too
[00:18] <wallyworld_> i'm not sure if there's some place that can be checked for known ec2 outages
[00:19] <ahasenack> you think it's an s3 outage?
[00:23] <wallyworld_> it appears so. it's nothing to do with juju in my opinion
[00:24] <wallyworld_> maybe not an outage per se but an issue outside of juju's control
[00:27] <ahasenack> wallyworld_: that actually sounds reasonable, I'm trying some s3 operations via aws's console, and they are failing
[00:27] <wallyworld_> :-(
[00:27] <wallyworld_> i hope it's fixed soon
[01:41] <bradm> anyone about who can talk about LP#1241674 ?
[01:41] <_mup_> Bug #1241674: juju-core broken with OpenStack Havana for tenants with multiple networks <cts-cloud-review> <openstack-provider> <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1241674>
[06:43] <hazmat> what's the timeout on bootstrap?
[06:50] <wallyworld_> hazmat: default 10 minutes but now can be changed
[06:50] <wallyworld_> if you run trunk
[06:50] <hazmat> wallyworld_, cool, how? i'm on a crappy net connection, and mongodb times me out.. i'm on trunk
[06:50] <wallyworld_> let me check
[06:51] <hazmat> wallyworld_, thanks
[06:51] <wallyworld_> hazmat: run bootstrap --help
[06:51] <wallyworld_>     # How long to wait for a connection to the state server.
[06:51] <wallyworld_>     bootstrap-timeout: 600 # default: 10 minutes
[06:51] <wallyworld_>     # How long to wait between connection attempts to a state server address.
[06:51] <wallyworld_>     bootstrap-retry-delay: 5 # default: 5 seconds
[06:51] <wallyworld_>     # How often to refresh state server addresses from the API server.
[06:51] <wallyworld_>     bootstrap-addresses-delay: 10 # default: 10 seconds
[06:51] <wallyworld_> the above go in your env.yaml
[06:51] <hazmat> wallyworld_, got it thanks.
[06:51] <bradm> anyone about who can talk about LP#1241674 ?
[06:51] <_mup_> Bug #1241674: juju-core broken with OpenStack Havana for tenants with multiple networks <cts-cloud-review> <openstack-provider> <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1241674>
[06:52] <wallyworld_> bradm: mgz is your best bet
[06:53] <wallyworld_> bradm: it says fix released - is it now working?
[06:53] <wallyworld_> not
[06:53] <bradm> wallyworld_: well, I'm on the verge of testing it, I have an openstack setup deployed using maas that gets that error - but I had some questions about what happens if it does work - i.e., will the fix be backported to 1.16, or do we have to wait for 1.18? and timeframes around that happening
[06:54] <bradm> wallyworld_: at this rate I should have it tested and confirmed next week
[06:54] <bradm> wallyworld_: but this is for the new prodstack for Canonical - I can't see us going live with a dev juju for that :)
[06:54] <wallyworld_> bradm: i personally am hopeful 1.18 will be out real soon now
[06:54] <wallyworld_> but we may need to consider a backport if 1.18 drags on a bit
[06:55] <bradm> wallyworld_: if we're talking a couple of weeks, great - if it's months, we'll have issues
[06:55] <wallyworld_> it will be weeks but maybe a few rather than a couple if i had to guess
[06:55] <wallyworld_> we need to get some critical stuff in place for upgrades and other things before we release
[06:55] <bradm> right, so if I said a couple to a few weeks, that could be reasonable?
[06:56] <bradm> we have other things that need to be done too, so this isn't the only blocker
[06:56] <bradm> just means everything will have to be done using 1.17 until it's released
[06:57] <bradm> fun things like if you reboot a swift storage node, the charm hasn't set up fstab entries so swift doesn't work so well anymore. :)
[06:57] <wallyworld_> bradm: i'd have to take a closer look at the bugs against the 1.18 milestone. i really wouldn't like to guess without more knowledge
[06:58] <bradm> wallyworld_: ok, but you're thinking weeks rather than months, and if it blows out we could hope for a backport?
[06:58] <wallyworld_> yes, that is my view :-)
[06:58] <wallyworld_> if i were king for a day
[06:59] <bradm> I'll put a comment on the ticket after I've tested all this, and mention our concerns
[06:59] <wallyworld_> bradm: it depends a bit perhaps on what comes out of the mid-cycle sprint currently underway in SA
[06:59] <bradm> wallyworld_: there's a lot of people waiting on this openstack setup :-/
[06:59] <wallyworld_> sure. make sure we are aware and then stuff can be looked at
[06:59] <wallyworld_> i can imagine. to me it is quite critical
[06:59] <wallyworld_> but i'm only one voice
[07:00] <wallyworld_> maybe a backport would be feasible, then that relieves the pressure somewhat
[07:01] <bradm> yeah, that would be sufficient even.
[07:01] <bradm> we'll see - I should be able to get final testing done next week, it's been a lot of waiting on hardware, and getting that into place
[07:03] <wallyworld_> ok. let us know how it goes and what you need
[07:03] <bradm> will do, I pretty much have all the pieces in place now for at least some preliminary testing, so I should know pretty quickly next week if it works
[07:04] <wallyworld_> good luck :-)
[07:07] <bradm> thanks.
[07:11] <hazmat> hmm.. just got a report from a user.. is juju replacing authorized keys on machines? or just augmenting?
[07:13] <hazmat> they're claiming their iaas api provided keys stopped working once juju agents started running on the systems.
=== jam is now known as Guest47101
=== Guest47101 is now known as jam1
[07:37] <wallyworld_> hazmat: juju augments (appends to) any keys already existing in the ~/.ssh/authorized_keys file
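
A minimal Go sketch of the append-only behaviour described above - not juju's actual implementation, just an illustration of the difference between augmenting ~/.ssh/authorized_keys and replacing it (the key string is a placeholder):

    package main

    import (
    	"log"
    	"os"
    	"path/filepath"
    )

    // appendAuthorizedKey adds a public key line to ~/.ssh/authorized_keys.
    // Opening with O_APPEND leaves any provider-supplied keys intact,
    // whereas truncating and rewriting the file would replace them.
    func appendAuthorizedKey(key string) error {
    	path := filepath.Join(os.Getenv("HOME"), ".ssh", "authorized_keys")
    	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0600)
    	if err != nil {
    		return err
    	}
    	defer f.Close()
    	_, err = f.WriteString(key + "\n")
    	return err
    }

    func main() {
    	// Placeholder key for illustration only.
    	if err := appendAuthorizedKey("ssh-rsa AAAA... juju-client-key"); err != nil {
    		log.Fatal(err)
    	}
    }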
[10:47] <dimitern> rogpeppe, wallyworld_, mgz, standup
[11:39] <dimitern> waigani, your connection could be better :)
[13:08] <adeuring> natefinch: could you have another look here: https://codereview.appspot.com/60630043 ?
[13:08] <natefinch> adeuring: sure
[13:14] <natefinch> adeuring: reviewed. Thanks for looking into the OS-specific stuff. I just wanted to make sure we were being careful to not be too linux specific.
[13:14] <adeuring> natefinch: thanks
[13:39] <natefinch> sweet... now there's 2 waiganis... wonder if we'll end the day with 20 or something
[14:01] <dimitern> adeuring, just so you know - when you push more revisions after the MP is approved (i.e. fixing test failures the bot found), you'll need to self-approve it first with a comment, and then mark it as approved again, so the bot will be happy to land it
[14:02] <adeuring> dimitern: thanks, i really tend to forget the comment ...
[14:03] <dimitern> adeuring, yep, i did too, but the bot never forgets :)
[14:04] <adeuring> dimitern: yeah, that's non-human bureaucracy ;)
[14:19] <rogpeppe> i'm seeing test failures on trunk (running tests on the state package): http://paste.ubuntu.com/6891504/
[14:19] <rogpeppe> anyone else see the same thing?
[14:19] <rogpeppe> (i'm seeing it every time currently)
[14:19] <rogpeppe> dimitern, natefinch, mgz: ^
[14:19] <mgz> rogpeppe: will see
[14:20] <dimitern> rogpeppe, i'm pulling trunk to try
[14:20] <rogpeppe> mgz, dimitern: thanks
[14:21] <mgz> (cd state && go test) enough?
[14:22] <rogpeppe> mgz: should be
[14:23] <dimitern> rogpeppe, OK: 395 passed
[14:23] <rogpeppe> dimitern: hmm. still fails every time for me
[14:25] <mgz> I got one of them
[14:25] <mgz> the second only
[14:25] <dimitern> rogpeppe, are you sure you have all the deps right? i needed to go get error and do godeps -u, which failed for gwacl (rev tarmac something not found), otherwise all good
[14:25] <rogpeppe> mgz: ok, that's useful
[14:25] <mgz> same failure (bar the random port)
[14:25] <rogpeppe> dimitern: yeah, that was the first thing i did
[14:25] <rogpeppe> unfortunately i don't get the same failure when running individual suites or tests
[14:25] <dimitern> rogpeppe, i'm running them now several times to make sure
[14:26] <dimitern> rogpeppe, i'm running go test -gocheck.v in state/
[14:28] <dimitern> rogpeppe, what's the panic in relationsuite?
[14:28] <rogpeppe> hmm, i've just seen another error
[14:28] <rogpeppe> dimitern: when a fixture setup method fails, gocheck counts it as a panic
[14:29] <dimitern> rogpeppe, ah, i see
[14:29] <rogpeppe> this time i got this: http://paste.ubuntu.com/6891545/
[14:29] <rogpeppe> (ignore the timestamps)
[14:30] <dimitern> rogpeppe, hm, i got 2 failures on the third run: http://paste.ubuntu.com/6891551/
[14:30] <rogpeppe> dimitern: ah, that looks like the same thing
[14:31] <rogpeppe> well at least it's not just me
[14:31] <mgz> you're just best at hitting the races for some reason rog :)
[14:31] <dimitern> rogpeppe, it seems mongo couldn't handle the stress
[14:31] <dimitern> rogpeppe, it's not properly shutting down and cleaning up stuff, or it lags
[14:54] <rogpeppe> dimitern, mgz: looks like it's a consequence of changes to mgo between rev 240 and now
[14:54] <rogpeppe> (and there do seem to be some relevant changes there)
[14:54] <dimitern> rogpeppe, oh yeah? what changes?
[14:54] <rogpeppe> dimitern: i'm still bisecting
[14:55] <rogpeppe> dimitern: somewhere between r240 and r243
[14:55] <dimitern> rogpeppe, how are you bisecting? i haven't used the more advanced vcs forensics like that
[14:55] <rogpeppe> dimitern: manually :-)
[14:55] <mgz> ah, interesting
[14:56] <rogpeppe> dimitern: bzr update -r xxx; go install
[14:56] <dimitern> rogpeppe, ah :)
[14:56] <mgz> we took that mgo bump for a gcc fix
[14:56] <dimitern> rogpeppe, there are commands like bisect (for git or hg i think) that supposedly take a lot away from the manual checking
[14:57] <rogpeppe> dimitern: i know, but i can never figure out how to use them well
[14:57] <mgz> which was trivial, but presumably picked up a bunch of other things, despite me being conservative with it
[14:57] <rogpeppe> dimitern: i've started to try
[14:57] <rogpeppe> dimitern: but never got very far. manual is quite easy anyway
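
The manual loop rogpeppe describes (bzr update -r xxx; go install; rerun the tests) is straightforward to script. A hypothetical Go sketch - the checkout path is illustrative, the revision range comes from the log, and because the failures are intermittent it runs the tests several times per revision:

    package main

    import (
    	"fmt"
    	"os/exec"
    )

    // failsAt reports whether the state tests fail with the dependency at
    // bzr revision rev. Any failure across the repeated runs counts as "bad".
    func failsAt(rev, runs int) bool {
    	update := exec.Command("bzr", "update", "-r", fmt.Sprint(rev))
    	update.Dir = "/path/to/mgo" // the checkout being bisected (illustrative)
    	if err := update.Run(); err != nil {
    		panic(err)
    	}
    	for i := 0; i < runs; i++ {
    		// go test rebuilds the package against the updated checkout.
    		test := exec.Command("go", "test", "launchpad.net/juju-core/state")
    		if err := test.Run(); err != nil {
    			return true
    		}
    	}
    	return false
    }

    func main() {
    	good, bad := 240, 243 // known-good and known-bad revisions from the log
    	for bad-good > 1 {
    		mid := (good + bad) / 2
    		if failsAt(mid, 5) {
    			bad = mid
    		} else {
    			good = mid
    		}
    	}
    	fmt.Printf("first bad revision: r%d\n", bad)
    }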
[14:58] <mgz> r241 looks the most suspect
[14:59] <rogpeppe> mgz: yup, if my current run fails, that's where the finger points
[14:59] <rogpeppe> mgz: yeah, that's it
[14:59] <natefinch> rogpeppe: voyeur code: https://codereview.appspot.com/57700044
[14:59] <dimitern> rogpeppe, yeah, me too
[15:00] <mgz> it flat adds a timeout in a bunch of places that had none before
[15:00] <rogpeppe> mgz: there were later fixes to that code, but i guess they didn't work
[15:00] <rogpeppe> mgz: i'll try with r248 and see if it still fails
[15:00] <rogpeppe> natefinch: thanks. looking
[15:00] <mgz> probably we need to SetTimeout to something longer in the context of our tests
[15:02] <mgz> 5 seconds should be okay, but is probably pushing it for some of our testing
[15:15] <rogpeppe> natefinch: reviewed
[15:17] <rogpeppe> mgz: i'm not quite sure what it's a timeout for anyway
[15:17] <rogpeppe> pwd
[15:18] <natefinch> rogpeppe: thanks
[15:24] <rogpeppe> mgz: it still fails for me if i change pingDelay to 30 seconds
[15:24] <rogpeppe> mgz: so that may not be the issue
[15:26] <mgz> yeah, that was the one that already existed, the new one is syncSocketTimeout
[15:27] <rogpeppe> mgz: ah, i traced the code wrongly without looking at the diffs. foolish.
[15:27] <mgz> (pingDelay did get lowered... but seems less impactful anyway)
[15:29] <rogpeppe> mgz: doesn't look like it was syncSocketTimeout either
[15:29] <rogpeppe> mgz: (i still see failures when it's 100 seconds)
[15:29] <mgz> hm, that's no fun.
[15:32] <rogpeppe> mgz: i'm just experimenting by printing out the deadlines as they're set
[15:42] <rogpeppe> mgz: hmm, it seems like sometimes the timeout is only 100ms
[15:44] <rogpeppe> mgz: ha, variously 10m, 100s, 15s and 100ms
[15:48] <mgz> o_O
[15:59] <rogpeppe> mgz: ah ha, i think i have it - the initial dial timeout is also used for the socket timeout
[15:59] <rogpeppe> mgz: and we use 100 milliseconds in TestingDialOpts
[15:59] <mgz> doh!
[16:00] <rogpeppe> mgz: it's kind of odd that that value is used for two very different things actually
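
To make the pitfall concrete: in labix.org/v2/mgo (the driver juju-core used), the timeout passed to DialWithTimeout is reused as the session's per-operation socket timeout. A minimal sketch, with an illustrative address and values rather than juju-core's actual test code:

    package main

    import (
    	"log"
    	"time"

    	"labix.org/v2/mgo"
    )

    func main() {
    	// A short timeout is reasonable for the initial dial in tests...
    	session, err := mgo.DialWithTimeout("localhost:37017", 100*time.Millisecond)
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer session.Close()

    	// ...but the same value is reused as the socket timeout for every
    	// subsequent operation, which is far too aggressive for a mongod
    	// under test-suite load. Restore a generous value explicitly.
    	session.SetSocketTimeout(1 * time.Minute)
    }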
[16:13] <rogpeppe> mgz: ha, i was wondering why it was being a little slow to read the file that i'd sent the output of go test to. turned out that wasn't too surprising because it was 271MB!
[16:13] <rogpeppe> mgz: good thing my editor copes fine with that...
[16:18] <mgz> I actually managed to make vim choke on a juju log file the other day
[16:18] <mgz> giant log on a hobbling-along m1.tiny... time for less
[16:38] <rogpeppe> mgz, dimitern, natefinch: simple review for a fix to the above issues: https://codereview.appspot.com/61010043/
[16:38] <rogpeppe> mgz: doesn't vim store the whole file in memory?
[16:44] <dimitern> rogpeppe, looking
[16:45] <dimitern> rogpeppe, no it doesn't - you can open a multigigabyte file almost instantly - emacs does the same :P
[16:46] <rogpeppe> dimitern: ah, i thought it did, interesting
[16:46] <rogpeppe> dimitern: presumably it does have to copy the file when opening it though
[16:46] <dimitern> rogpeppe, reviewed
[16:46] <rogpeppe> dimitern: thanks
[16:46] <dimitern> rogpeppe, why copy?
[16:47] <rogpeppe> dimitern: because there's usually an assumption that if i'm editing a file, i can remove it or move it, and still write it out as it is in the editor buffer
[16:49] <dimitern> rogpeppe, even if you delete it, the file handle that the editor opened is still readable some time after (perhaps up to the amount that got cached in the kernel)
[16:49] <rogpeppe> dimitern: and a quick experiment persuades me that vim *does* copy the data (although i can't tell if it's to memory or disk)
[16:49] <rogpeppe> dimitern: what if you overwrite it?
[16:49] <dimitern> rogpeppe, the contents will be in the kernel file cache for some time at least
[16:50] <dimitern> rogpeppe, if you open it again after overwriting you'll see the change
[16:51] <rogpeppe> dimitern: i just confirmed: vim does not keep the original file open
[16:52] <dimitern> rogpeppe, hmm good to know
[16:58] <rogpeppe> dimitern: also, it does look like vim stores everything in memory
[16:58] <rogpeppe> dimitern: (not that it's too much of an issue now, with enormous memories, but worth being aware of)
[16:58] <dimitern> rogpeppe, perhaps up to a certain size
[16:59] <rogpeppe> dimitern: i'd have thought that 180MB was probably larger than that size
[16:59] <dimitern> rogpeppe, i doubt you can put a 4GB file so quickly in memory, judging by the speed it opens it
[17:00] <dimitern> rogpeppe, i think it uses a memory-mapped view of the file, using the kernel file cache
[17:00] <rogpeppe> dimitern: it doesn't seem to
[17:00] <rogpeppe> dimitern: but maybe it's different for truly enormous files
[17:01] <dimitern> rogpeppe, yep
[17:04] * dimitern reached eod
[17:04] <dimitern> happy weekends everyone!
[17:07] <rogpeppe> dimitern: and you!
[17:08] <mgz> later dimitern!
[17:09] <rogpeppe> dimitern: BTW vi definitely seems to read it all into memory, even for GB-sized files
[18:40] <rogpeppe> that's me for the day
[18:40] <rogpeppe> g'night all
=== gary_poster is now known as gary_poster|away
