[00:18] <wallyworld> thumper: yo
[00:18] <wwitzel3> thumper: you were helping a guy who had his juju upgraded from 1.18 to 1.19. Trying to get him to 1.20. I've been trying to replicate things locally
[00:19] <wwitzel3> thumper: I've got 1.19.3 bootstrapped .. but how do I make it think it is 1.18 so the juju upgrade-juju of 1.20 will proceed?
[00:26] <wallyworld> wwitzel3: i don't think you can - the version number is obtained via version.Current which is compiled into the executable
[00:26] <wallyworld> oh, wait, the db agent-version needs to be updated i think
[00:27] <wallyworld> but doing that will trigger downgrade
[00:27] <wallyworld> the version.Current and agent-version in db need to match, or else juju will try and make them match
[00:28] <wallyworld> so you may need to hack 1.20 to run upgrade steps for 1.19 rather than 1.18
[00:31] <jcw4> axw: thumper suggested I ask you to review https://github.com/juju/juju/pull/450
[00:32] <jcw4> axw: It's one half of the issue raised in https://bugs.launchpad.net/juju-core/+bug/1351089
[00:32] <_mup_> Bug #1351089: Isolation failure in sshstorage test <juju-core:New for johnweldon4> <https://launchpad.net/bugs/1351089>
[00:32] <jcw4> axw: I have another PR coming soon that addresses this specific isolation error, and a handful of others too, by suppressing the load of a user's custom BASH_ENV file
[00:33] <jcw4> axw: but I thought that this PR was a valid change on its own merits
[00:39] <wallyworld> sinzui: i remember why the lock dir is not removed - i read that removing the file could potentially cause issues (i think if other processes held the lock, not sure now). so that's why it is left behind. so i've made it so that bootstrap and destroy remove any lock files at that point
[01:15] <axw> jcw4: will take a look in a bit, thanks
[01:28] <thumper> wwitzel3: hey
[01:28] <thumper> wwitzel3: I *think* you just need to do the upgrade, then change the agent version in the agent configuration files on the various machines
[01:28] <thumper> wallyworld: hey
[01:33] <thumper> davecheney: care to look at https://github.com/juju/names/pull/22
[01:33] <thumper> wallyworld: around?
[01:38] <davecheney> thumper: /me looks
[02:01] <jcw4> thanks axw
[02:01] <jcw4> general question... how do I know when CI is ready to start accepting normal merges again?
[02:04] <axw> jcw4: you'd have to check the bugs I'm afraid
[02:04] <axw> they're tagged with "regression"
[02:05] <davecheney> jcw4: https://bugs.launchpad.net/juju-core/+bugs?field.searchtext=&orderby=-importance&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&field.importance%3Alist=CRITICAL&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.structural_sub
[02:07] <jcw4> awesome, thanks axw and davecheney
[02:11] <davecheney> thumper: I cannot see an item on the N sprint for addressing the requirements to support Trusty's 1.18 client
[02:12] <thumper> davecheney: I'll add one
[02:12] <davecheney> thumper: thanks
[02:12] <davecheney> thumper: for example, https://github.com/juju/juju/pull/449#issuecomment-50825367
[02:18] <axw> davecheney: does 1.18 CLI still dial mongo directly? I thought 1.16 was the last to do so
[02:20] <axw> looking at the 1.18 branch, looks like they all use API
[02:20] <davecheney> axw: cool
[02:20] <davecheney> i don't know the answer, that is why I asked
[02:23] <hazmat> how long are state tests supposed to take?
[02:23]  * hazmat realizes it's been a while
[02:23] <axw> hazmat: I think it takes 60s-80s on my laptop with SSD
[02:24] <axw> scratch that, with tmpfs
[02:25] <davecheney> hazmat: yup, that is how long it takes
[02:25] <davecheney> apiserver also take ~70 s for me
[02:26] <hazmat> hmm.. i haven't hit a failure but i am hitting timeouts with tokumx on ssd.. was running with extra verbose though (-gocheck.vv -v)
[02:26] <davecheney> axw: are all the build blockers related ?
[02:26] <hazmat> 10m and the test gets killed
[02:26] <davecheney> hazmat: then the test fuked up
[02:27] <davecheney> it happens
[02:27] <axw> davecheney: nfi
[02:27] <davecheney> axw: someone suggested at the team meeting they were all the same underlying cause; mongo startup failures
[02:27] <axw> hazmat: is tokumx cleanroom, or based on some version of mongo? if the latter, which version?
[02:27] <axw> davecheney: the azure one certainly is mongo
[02:27] <hazmat> axw, current release is against 2.4
[02:28] <hazmat> axw, but it's an mvcc db under the hood.. ie. always journaled
[02:28] <axw> ok, just asking cos I've been trying to get things working with 2.6 and have had a bunch of issues
[02:29] <hazmat> i'm wondering if the mvcc stuff is causing some additional time consumption due to write flushes
[02:29] <axw> I would hope it wouldn't add *that* much
[02:30] <axw> how do you control transaction isolation?
[02:30] <hazmat> axw, db.beginTransaction(); db.commitTransaction(); db.rollbackTransaction()
[02:31] <hazmat> axw, http://docs.tokutek.com/tokumx/tokumx-transactions.html
[02:32] <hazmat> axw, actually better http://docs.tokutek.com/tokumx/tokumx-commands.html#tokumx-new-commands-transactions
[02:33] <axw> cool
[02:33] <hazmat> are folks using go 1.3 for dev ?
[02:33] <axw> I am
[02:33] <davecheney> i use trunk
[02:33] <hazmat> cool, i am as well, just wanted to double check i had some issues with the rcs and juju
[02:34] <davecheney> rsc ?
[02:34] <davecheney> rcs ?
[02:34] <hazmat> davecheney, release candidates.. http client errors on writes to s3
[02:35] <hazmat> when uploading tools, i would get  EOF often
[02:35] <davecheney> ok
[02:36] <stokachu> anyone know if work has been committed on the api servers not being added to .jenv files until running juju status the first time?
[02:36] <stokachu> i looked through the commit logs but didnt see anything
[02:36] <stokachu> the state server i should say
[02:36] <davecheney> stokachu: it's waiting to land now
[02:36] <stokachu> davecheney, ok cool, thanks
[02:37] <axw> davecheney: they'll be stored by the bootstrap command?
[02:37] <davecheney> axw: waigini knows
[02:37] <davecheney> but yes
[02:37] <axw> cool
[02:37] <davecheney> axw: I think the failure on precise is because the cloud archive isn't being added
[02:38] <davecheney> Setting up rsyslog-gnutls (5.8.6-1ubuntu8.6) ...
[02:38] <davecheney> Setting up libsnappy1 (1.0.4-1build1) ...
[02:38] <davecheney> Setting up mongodb-clients (1:2.4.6-0ubuntu5~ubuntu12.04.1~juju1) ...
[02:38] <davecheney> Setting up mongodb-server (1:2.4.6-0ubuntu5~ubuntu12.04.1~juju1) ...
[02:38] <davecheney> Adding system user `mongodb' (UID 107) ...
[02:38] <davecheney> ^ this isn't the juju mongo
[02:38] <davecheney> it's the system one
[02:38] <davecheney> no
[02:38] <davecheney> wait
[02:38] <axw> I don't think we use juju-mongodb on precise
[02:38] <davecheney> it _is_ the juju one
[02:38] <davecheney> axw: we _have_ to
[02:38] <davecheney> precise is ooooooooooooold sauce
[02:38] <davecheney> there is logic in cloudinit that does the switch
[02:38] <davecheney> something like mongo.*Series
[02:39] <axw> I don't think it *exists* in precise. I can't remember the logic... will have to dig. but anyway, that hasn't changed AFAIK
[02:39] <axw> trusty+ uses juju-mongodb, everything else uses mongodb-server. it may be that it's meant to come from cloud-tools tho
[02:39] <davecheney> 2.0 is in precise
[02:40] <davecheney> anyway
[02:40] <davecheney> probably a red herring
[02:40] <davecheney> just looking for something I can help with
[02:43] <rick_h__> davecheney: wonder if your email is due to https://github.com/juju/txn/pull/6
[02:43] <davecheney> rick_h__: nope
[02:43] <rick_h__> davecheney: ok, nvm then
[02:43] <davecheney> this is something stupid we did to ourselves back in LV
[02:44] <davecheney> juju-local depends on juju
[02:44] <davecheney> and you can't run the tests without juju-local
[02:44] <davecheney> so even in an isolated environment you need to install juju
[02:44] <davecheney> then carefully ignore it
[02:44] <rick_h__> davecheney: ah, gotcha. Ok, just saw testing/dep and juju after seeing this pr today so wondered
[02:44] <davecheney> rick_h__: -ENAMEINGOVERLOAD
[02:44] <rick_h__> lol
[02:49] <katco> wallyworld: hey, i just pushed up a new round of changes. also exploratory. some tests aren't passing, but i'm not sure if they're transient. don't have time to verify tonight.
[03:16] <thumper> davecheney: added a sprint session to talk 1.18 support, see the bottom of the list
[03:17] <davecheney> thumper: thanks
[03:17]  * davecheney lunch
[03:32] <menn0> thumper: can I get your advice on something
[03:32] <menn0> ?
[03:41] <wwitzel3> wallyworld: thanks for the followup email
[03:43] <thumper> menn0: sure
[03:44] <menn0> thumper: I'm working on adding the ability to roll back the agent version to force a downgrade
[03:44] <thumper> ok
[03:44] <menn0> the upgrader had to be changed because it doesn't normally allow downgrades - that's done
[03:44] <thumper> ok
[03:44] <menn0> but now I'm at the part where I need to set the actual agent version back in state
[03:45]  * thumper tries not to be distracted by the thought of coffee
[03:45] <thumper> ok
[03:45] <menn0> and state.SetEnvironmentAgentVersion checks to make sure the new version makes sense
[03:46] <menn0> hang on...
[03:46] <menn0> ok
[03:46] <menn0> so one of the checks is that all the machines and agents are starting out at the same version
[03:47] <menn0> and in this case they might not be and that's actually ok
[03:47] <menn0> we want to roll back to last good version anyway
[03:48] <thumper> can I suggest that I make my coffee and then we have a hangout?
[03:48] <menn0> ok sounds good
[03:48] <menn0> coffee wins :)
[03:48] <thumper> what we should identify is all the places in state where we record the agent version
[03:48] <thumper> and also where on disk this is stored
[03:48] <thumper> and at which part of the process the various places change
[03:48] <thumper> so we know what is considered a valid rollback version
[03:49]  * thumper goes to make coffee
[03:50] <menn0> looking more closely I actually think I don't need to worry. would be good to discuss quickly anyway.
[03:55] <jcw4> so, should I not even try to $$merge$$ until the regressions are fixed?
[03:56] <thumper> jcw4: correct
[03:56] <jcw4> axw: this is the other half of the test isolation fix: https://github.com/juju/juju/pull/454
[03:56] <jcw4> thumper: thanks
[03:57] <axw> jcw4: can we just use IsolationSuite instead?
[03:58] <axw> or OsEnvSuite
[03:58] <axw> directly
[03:58] <jcw4> axw: I actually did that first, but it didn't fix the issue and the test I was testing on needed more refactoring to use it, which I wasn't 100% sure on... I can try again
[03:59] <jcw4> axw: I'll do that again and bug you with questions if I have them :)
[03:59] <axw> no worries
[03:59] <axw> thanks
[03:59]  * axw wanders off to get some lunch
[03:59] <axw> bbs
[04:00] <thumper> menn0: https://plus.google.com/hangouts/_/canonical.com/upgrades?authuser=1
[04:00] <menn0> "could not start because of an error"
[04:00] <menn0> thumper: ^^
[04:01]  * thumper sighs
[04:01] <thumper> menn0: you start one then
[04:01]  * thumper closes the hangout
[04:01] <menn0> thumper: k
[04:01] <jcw4> axw: one point, this one line fix to suppress BASH_ENV in the tests fixes 4 or 5 tests for me.  It seems helpful to merge this change *and* refactor the specific test we initially discussed
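The one-line fix jcw4 describes can be pictured as below. This is only a sketch of the idea (unset BASH_ENV for the duration of a test, then restore it); it makes no claim about how PR 454 actually does it. The helper names `suppressEnv` and `envIsSet` are hypothetical and are not juju testing APIs.

```go
package main

import (
	"fmt"
	"os"
)

// envIsSet reports whether the named environment variable is present.
func envIsSet(name string) bool {
	_, ok := os.LookupEnv(name)
	return ok
}

// suppressEnv (hypothetical helper) removes name from the environment and
// returns a function that puts back whatever was there before.
func suppressEnv(name string) (restore func()) {
	old, had := os.LookupEnv(name)
	os.Unsetenv(name)
	return func() {
		if had {
			os.Setenv(name, old)
		} else {
			os.Unsetenv(name)
		}
	}
}

func main() {
	restore := suppressEnv("BASH_ENV")
	defer restore()
	// Any bash started from here on will not source a custom BASH_ENV file.
	fmt.Println("BASH_ENV set during test:", envIsSet("BASH_ENV"))
}
```

Doing this once in a shared test setup, rather than per test, matches the point that one change fixes several tests at the same time.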
[04:02] <menn0> thumper: happened again as soon as you answered. killing my chrome now.
[04:02]  * thumper didn't answer
[04:02] <menn0> thumper: ok. well it keeps happening. even after restarting chrome.
[04:03]  * menn0 sighs
[04:03] <thumper> firefox?
[04:03] <davecheney> menn0: do you have two google identities ?
[04:03] <davecheney> i often have to add &authuser=1 to any link I open to make it use the right identity
[04:03] <menn0> (╯°□°)╯︵ ┻━┻ ︵ ╯(°□° ╯)
[04:04] <menn0> davecheney: i have 2 identities but it usually "just works"
[04:04]  * menn0 tries firefox
[04:04] <davecheney> menn0: maybe this is a time when it doesn't
[04:05] <stokachu> did the separate logs for each machine go away in juju 1.20.x?
[04:05] <menn0> FU22FA...
[04:05]  * menn0 goes to retrieve his phone from child
[04:06] <thumper> stokachu: no...
[04:06] <thumper> stokachu: local provider?
[04:06] <stokachu> interesting, only machine-0.log is being created
[04:06] <stokachu> thumper, yea local provider
[04:06] <thumper> stokachu: multiple local provider environments?
[04:06] <stokachu> machine-0.log shows all the logs from each machine correctly
[04:06]  * thumper recalled a bug
[04:06] <stokachu> thumper, nah a single local provider with kvm as the container
[04:07] <thumper> hmm
[04:07] <thumper> ah
[04:07] <thumper> kvm log files aren't mounted like the lxc ones because they can't
[04:07] <thumper> should still be an all-machines.log though
[04:07] <stokachu> yea there is, but the nested lxc machines aren't having their logs created either
[04:08] <stokachu> i guess is that what you meant by 'multiple' providers
[04:08] <thumper> no... they don't
[04:08] <thumper> no...
[04:08] <stokachu> kvm machines used to create machine-x.log files
[04:08] <stokachu> the unit logs aren't being created either if that helps
[04:09] <davecheney> stokachu: what does juju status say ?
[04:09] <menn0> thumper: https://plus.google.com/hangouts/_/gqu6sqg3xlszyfsl2u4blczvuaa?hl=en-GB
[04:09] <davecheney> have any units/services been deployed ?
[04:09] <stokachu> http://paste.ubuntu.com/7920401/
[04:09] <thumper> menn0: I get the party is over :-|
[04:09] <stokachu> yea, a bunch
[04:10] <davecheney> weird
[04:10] <menn0> thumper: trying again... at least it ain't crashing in FF
[04:10] <menn0> thumper: https://plus.google.com/hangouts/_/gqu6sqg3xlszyfsl2u4blczvuaa?hl=en-GB
[04:10] <stokachu> all-machines.log  ca-cert.pem  machine-0.log  rsyslog-cert.pem  rsyslog-key.pem
[04:10] <stokachu> those are the only files in .juju/local/log
[04:10] <wallyworld> menn0: july 28 is 2 days ago, not 2 weeks ago :-P
[04:10] <thumper> stokachu: yes, that is right
[04:11] <thumper> menn0: try this one https://plus.google.com/hangouts/_/gvvbftptty735axcajn45l2efia?hl=en-GB
[04:11] <thumper> arg
[04:11] <thumper> now I get error
[04:11] <stokachu> yea but there should be more of them :)
[04:11] <thumper> stokachu: no
[04:11] <thumper> stokachu: there shouldn't
[04:11] <menn0> wallyworld: 4-5 ish days, but yes
[04:11] <thumper> stokachu: it doesn't work that way
[04:11] <menn0> wallyworld: I screwed that up
[04:11] <stokachu> huh
[04:11] <stokachu> then why did the logs use to be there
[04:12] <thumper> for lxc they were
[04:12] <thumper> but kvm, never were
[04:12] <thumper> lxc containers share a log mount point
[04:12] <axw> jcw4: yeah, seems like it'll be too error prone to get everyone to use IsolationSuite - and it may not be viable in some cases
[04:12]  * menn0 needs more sleep
[04:12] <thumper> kvm can't
[04:12] <menn0> thumper: try calling me when you're ready
[04:12] <stokachu> but i have lxc containers deployed and there aren't logs
[04:12] <thumper> but your lxc containers are inside the kvm containers
[04:12] <thumper> not sure what you are expecting there
[04:12] <stokachu> and im pretty sure i had multiple machine logs with kvm
[04:13] <thumper> if you look inside all-machines.log, you should see both machine and unit logs
[04:13] <thumper> if you don't you have a problem
[04:13] <stokachu> all i know is juju used to create unit-x.logs and machine-x.logs with this same setup
[04:13] <thumper> also visible with 'juju debug-log'
[04:13] <stokachu> and now they dont
[04:13] <thumper> no
[04:13] <thumper> it didn't
[04:13] <thumper> never with kvm
[04:13] <thumper> trust me on this
[04:13] <thumper> I wrote it
[04:14] <thumper> kvm can't
[04:14] <thumper> otherwise it would
[04:14] <thumper> I talked with robbie basak about it
[04:14] <axw> jcw4: reviewed your original change
[04:16] <jcw4> axw: great thanks... will add that comment
[04:16] <thumper> menn0: trying again: https://plus.google.com/hangouts/_/gqu6sqg3xlszyfsl2u4blczvuaa?authuser=1&hl=en
[04:16] <thumper> gah
[04:16]  * jcw4 is sad; can't seem to find any of the blocking bugs that he can easily fix to clear the CI pipeline
[04:16] <thumper> also issues with couldn't start
[04:16] <thumper> WTF?
[04:17] <menn0> thumper: wondering if it's actually something wrong with hangouts at the moment
[04:17] <menn0> thumper: try calling me one more time
[04:17] <thumper> just did, it starts then crashes
[04:18] <thumper> menn0: ok, so back to previous statement, you have it sorted now?
[04:18] <menn0> maybe
[04:19] <menn0> thumper: it looks like changing the env agent version is allowed as long as all the machines in the env are on either the current or next version
[04:19] <menn0> thumper: which they should be for what I'm looking at
[04:19] <menn0> thumper: so it might all be fine.
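The rule menn0 describes (a version change is allowed as long as every machine is on either the current or the next version) can be sketched as follows. This is a simplified illustration, not the real check inside state.SetEnvironmentAgentVersion; the function name `canChangeAgentVersion` and the version strings are made up for the example.

```go
package main

import "fmt"

// canChangeAgentVersion (hypothetical) reports whether every agent is on
// either the current environment version or the proposed next one.
func canChangeAgentVersion(current, next string, agentVersions []string) bool {
	for _, v := range agentVersions {
		if v != current && v != next {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative versions only: a half-finished upgrade being rolled back.
	agents := []string{"1.20.1", "1.19.3", "1.20.1"}
	fmt.Println(canChangeAgentVersion("1.20.1", "1.19.3", agents))

	// One straggler on a third version blocks the change.
	fmt.Println(canChangeAgentVersion("1.20.1", "1.19.3", append(agents, "1.18.4")))
}
```

Under this reading a rollback is just a version change whose "next" version happens to be older, which is why the mixed-version case is acceptable.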
[04:20] <thumper> ok, that's good then
[04:20] <menn0> thumper: I'll be doing some extensive manual testing before trying to merge this
[04:20]  * thumper nods
[04:20] <menn0> thumper: thanks rubber duck / teddy bear :)
[04:20] <thumper> how are you going to make it fail?
[04:20] <menn0> thumper: hack the code I think
[04:21] <menn0> make one of the state agents not upgrade
[04:21] <thumper> menn0: I know
[04:21] <thumper> add an upgrade step
[04:21] <thumper> that checks for some random file on disk
[04:21] <thumper> say /var/lib/juju/kill-upgrade
[04:21] <thumper> and if it is there, the upgrade fails
[04:21] <menn0> that's a nice idea but this is just before the upgrade steps run
[04:22] <thumper> what if the steps fail?
[04:22] <thumper> don't we want to go back?
[04:22] <menn0> this is for the case where one of the state servers fails to come up with the new agent version
[04:22] <thumper> or are we going forwards?
[04:22] <menn0> that depends
[04:22] <menn0> if it's the master, we roll back using the backup
[04:22] <thumper> ok, instead of an upgrade step
[04:22] <menn0> unless there is no backup... then we go in to an error state
[04:23] <thumper> make it something at the start of the machine agent
[04:23] <menn0> if it's the secondary it marks itself as broken but the upgrade continues
[04:23] <thumper> I'm half considering that we should always have something that looks for a file in order to die
[04:23] <thumper> think the chaos monkey stuff
[04:24] <jcw4> thumper: +1
[04:24] <thumper> /var/lib/juju/chaos
[04:24] <jcw4> :)
[04:24] <thumper> so we can have CI tests that controllably cause things to fail
[04:24]  * thumper goes to add another agenda item to next week's meeting
[04:25] <menn0> thumper: that's not a silly idea
[04:26] <thumper> sinzui: you there next week?
[04:26] <menn0> thumper: i'll see if I can leave something permanent and extensible in place for testing this stuff
[04:26] <thumper> menn0: I suggest $datadir/chaos/machine-agent
[04:26] <menn0> thumper: I like it
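A minimal sketch of the chaos marker idea, assuming the layout thumper suggests ($datadir/chaos/machine-agent) and nothing else: at startup the agent looks for a marker file and fails on purpose if it is there. The function name `chaosRequested` is hypothetical; no such hook is claimed to exist in juju.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// chaosRequested (hypothetical) reports whether a marker file for the named
// agent exists under <dataDir>/chaos.
func chaosRequested(dataDir, agent string) bool {
	_, err := os.Stat(filepath.Join(dataDir, "chaos", agent))
	return err == nil
}

func main() {
	// /var/lib/juju is the data dir mentioned in the conversation.
	if chaosRequested("/var/lib/juju", "machine-agent") {
		fmt.Println("chaos marker found: failing on purpose")
		os.Exit(1)
	}
	fmt.Println("no chaos marker: starting normally")
}
```

A CI job could then create the marker file on one state server before an upgrade to force the failure path in a controlled way.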
[04:27] <jcw4> so we have 6 blocking regression bugs, but only two are assigned and in progress... since no-one can land anything til these are resolved, shouldn't everyone be working on these?
[04:27] <thumper> s/everyone/someone/
[04:27] <thumper> jcw4: and yes
[04:27] <jcw4> :D
[04:27]  * thumper is prepping for a trip away leaving tomorrow
[04:27]  * jcw4 bravely looks at installing precise on a virtualbox instance
[04:40] <davecheney> jcw4: you don't need to install precise
[04:40] <davecheney> set your default series to precise
[04:40] <davecheney> and bootstrap ec2
[04:42] <jcw4> davecheney: I haven't done that yet... do I need to provide ec2 tokens in my env. or something?
[04:42] <jcw4> i.e. do I use my own ec2 credentials
[04:42] <davecheney> jcw4: yes
[04:42] <davecheney> just set the usual ec2 env vars and it works (majic)!
[04:42] <jcw4> davecheney: okay... I want to try that
[04:44] <davecheney> jcw4: you can use any provider
[04:44] <davecheney> well, not azure, that is properly borken
[04:44] <jcw4>  hehe
[04:44] <davecheney> but openstack if you use rackspace or hp cloud
[04:45] <jcw4> I'm most familiar with ec2, but I've got a couple rackspace accounts too
[04:45] <axw> wallyworld: I've got some changes to hopefully make bootstrap/mongo more robust, but unit tests on 1.20 are not happy
[04:45] <axw> wallyworld: I can't run 1.20 tests without something mongo related failing
[04:45] <axw> with or without my changes
[04:45] <wallyworld> axw: did you want a quick chat
[04:45] <axw> sure
[04:46] <axw> ehrm, what's up with hangouts
[04:46] <wallyworld> i'm in 1:1
[04:47] <axw> wallyworld: it's dying on me
[04:47] <wallyworld> axw: yeah, you joined and then left, we can try standup one
[04:47] <axw> wallyworld: same thing
[04:47] <axw> thumper: did you get hangouts to work? I'm getting the same thing
[04:47] <davecheney> jcw4: protip: do a bootstrap without default-series set
[04:47] <davecheney> so it _should_ choose trusty (dunno)
[04:48] <davecheney> and maybe will work before you start trying to debug why it doesn't
[04:48] <wallyworld> axw: hmmm, restart chromium?
[04:48] <jcw4> davecheney: I see, so do a pass through on trusty using ec2, and then change my series to precise?
[04:48] <axw> wallyworld: didn't help :/
[04:48] <wallyworld> wtf
[04:49] <axw> I'll try firefox
[04:49] <axw> may have to install the plugin
[04:49] <jcw4> davecheney: and to make sure I'm clear... I initiate the tests on my local trusty box, and if my ~/.juju/... config is pointed at ec2 it should "just work" ?
[04:50] <davecheney> juju switch "the name of your ec2 environment"
[04:50] <davecheney> juju bootstrap --upload-tools
[04:50] <davecheney> jcw4: i am concerned that bootstrapping an environment is not second nature to you
[04:50] <davecheney> given the number of bugs we have that are not caught in CI
[04:50]  * jcw4 blushes
[04:50] <davecheney> you shouldn't just assume "works on my machine" == "ship it"
[04:51] <jcw4> davecheney: almost all of my work has been in areas that don't seem as susceptible to environmental issues
[04:51] <jcw4> davecheney: and my testing has been 100% via unit tests.
[04:52] <jcw4> davecheney: my first real dogfood testing of actions on the command line was yesterday
[04:52]  * jcw4 slinks off and hides
[04:52] <axw> oh wtf
[04:52] <davecheney> jcw4: better late than never
[04:52] <axw> firefox doesn't like it either
[04:52] <wallyworld> hmmm
[04:53] <wallyworld> axw: how about you try and start one?
[04:53] <axw> yeah trying now
[04:54] <axw> wallyworld: nope.
[04:54] <wallyworld> jeez
[04:54] <davecheney> axw: wallyworld menn0 why not use skype ?
[04:55] <wallyworld> yuk
[04:55]  * wallyworld doesn't have skype installed
[04:55]  * axw has never used skype before
[04:56] <wallyworld> axw: so do the trunk tests pass? is it just 1.20?
[04:56] <axw> wallyworld: I'll try now
[04:56] <menn0> axw, wallyworld: if you guys are having trouble too then hangouts is broken right now
[04:57] <axw> menn0: I am, dunno about wallyworld
[04:57] <menn0> thumper and I were unable to get a call up despite many attempts in chrome and firefox
[04:57] <axw> yeah, I guess it's busted
[04:57] <wallyworld> seems to work for me
[04:57] <menn0> it was crashing for us while the call rang
[05:02] <axw> wallyworld: I did get an error on master just now
[05:02] <axw> Panic: not authorized for query on presence.presence.beings (PC=0x40EF7D)
[05:02] <wallyworld> ffs
[05:03] <wallyworld> the precise failures on CI were uniter related
[05:04] <jcw4> davecheney: so after juju bootstrap --upload-tools... I do "go test ./..." locally and it'll magically connect to ec2?  Or do I need to first ssh into ec2 and then somehow initiate the tests?
[05:05] <davecheney> jcw4: nope
[05:05] <davecheney> jcw4: deploy some charms and see if your environment works
[05:05] <jcw4> davecheney: oh... I see.  What if I want to reproduce the test errors that are blocking CI?
[05:06] <davecheney> the error is "can't bootstrap a precise image"
[05:06] <davecheney> this isn't tested in the unit tests
[05:06] <davecheney> it's done in some integration test build that curtis runs
[05:06] <davecheney> which is roughly the steps you just did
[05:07] <davecheney> there might be some jiggery pokery we need to do to make sure that bootstraps a 32bit environment (getting pretty rare in ec2 these days)
[05:07] <jcw4> davecheney: oh, ok.
[05:07]  * jcw4 scrambles to keep up
[05:07] <jcw4> :)
[05:08] <jcw4> I have a friend who always says "Don't judge me Weldon"
[05:08] <jcw4> I know how he feels :)
[05:09] <axw> wallyworld: I think a lot of the mongo errors we get in tests are to do with how we reuse Mongo databases
[05:10] <wallyworld> axw: you mean in the test setup?
[05:10] <axw> wallyworld: MgoTestPackage creates a single Mongo database and then MgoSuite reuses it
[05:11] <axw> restarting it only on authorisation errors
[05:11] <axw> sometimes those errors don't get caught in the right place tho
[05:11] <wallyworld> yeah, that has bothered me
[05:11]  * thumper eagerly awaits daylight savings time so the team lead calls are at a sane time
[05:12]  * thumper is brain dead
[05:12] <wallyworld> thumper is soft
[05:18] <wallyworld> axw: i've tried removing the lock file and it ran into various issues, so easiest for now just to leave it but use a different name to previously
[05:19] <wallyworld> so diff is 1 line
[05:19] <axw> wallyworld: cool, I think that'll be fine
[05:19] <wallyworld> yep, for now
[05:19] <wallyworld> gotta get this release out
[05:21] <axw> wallyworld: the change I'm trying to do is to add the admin user to mongo before initialising state
[05:22] <axw> that way we can always open state authenticated, and use session.Copy
[05:22] <wallyworld> axw: is that the cause of some issues?
[05:22] <axw> wallyworld: I couldn't reproduce the i/o timeout, but it's the only thing I can think of
[05:23] <axw> wallyworld: sorry, this is for azure
[05:23] <wallyworld> hmm, i would hope that this sort of bug would at least be consistent across providers
[05:23] <axw> bootstrap fails because the mgo socket gets a timeout
[05:23] <axw> well it's timing related
[05:24] <wallyworld> but admin user first, then a consistent way to open state is good
[05:24] <wallyworld> then that leaves precise tests :-(
[05:25] <axw> well I need to get tests running for 1.20 on my trusty machine too...
[05:25] <wallyworld> sigh, at least it fails for both series then
[05:28] <wallyworld> axw: i've gotta go and do some packing for the flight tomorrow, how far away is the io timeout fix?
[05:28] <axw> wallyworld: it's done, but I can't verify that the tests are unaffected
[05:28] <wallyworld> ah right
[05:28] <jcw4> davecheney: it's aliiive :)
[05:28] <axw> so, however long it takes to get the unit tests to run, which could be quite a while at this rate
[05:29] <wallyworld> does it bootstrap ok?
[05:29] <wallyworld> could you see the azure issue?
[05:29] <axw> wallyworld: it bootstraps fine. I couldn't reproduce the error in the first place.
[05:29] <wallyworld> sigh
[05:30] <wallyworld> axw: i'm wondering if it's worth pushing up the fix - do the relevant state tests pass?
[05:30] <axw> I'll try them in isolation
[05:30] <wallyworld> at least CI can have a go then
[05:31] <davecheney> jcw4: ok, so the next thing is to ensure you're bootstrapping a 386 environment
[05:31] <davecheney> so
[05:31] <davecheney> juju destroy-environment -y $(juju switch)
[05:32] <davecheney> then juju bootstrap --constraints="arch=386" --upload-tools
[05:32] <davecheney> this may work
[05:32] <davecheney> or it might fail horribly
[05:32] <davecheney> i'm not sure --upload-tools will work
[05:32] <jcw4> davecheney: where do I change the series?
[05:33] <jcw4> I didn't see it in environments.yaml
[05:33] <davecheney> jcw4: lets leave that for the moment
[05:33] <jcw4> ok
[05:33] <davecheney> changing the architecture might be a showstopper
[05:33] <jcw4> (I saw default series)
[05:34] <jcw4> bad arch constraint?
[05:34] <jcw4> juju bootstrap --constraints="arch=386" --upload-tools
[05:34] <jcw4> error: invalid value "arch=386" for flag --constraints: bad "arch" constraint: "386" not recognized
[05:34] <davecheney> try i386
[05:35] <davecheney> we have to translate between debian nouns and go nouns
[05:35] <jcw4> better... but:
[05:35] <jcw4> juju bootstrap --constraints="arch=i386" --upload-tools
[05:35] <jcw4> Bootstrap failed, destroying environment
[05:35] <jcw4> ERROR cannot build tools for "i386" using a machine running on "amd64"
[05:35] <davecheney> right
[05:36] <davecheney> so, this is going to be tricky to reproduce
[05:36] <davecheney> i would recommend not continuing with this issue
[05:36] <jcw4> :) Okay.. it's approaching bedtime, but now my interest is piqued
[05:36] <axw> wallyworld: https://github.com/juju/juju/pull/456
[05:37] <jcw4> I assume the error is because I first bootstrapped with amd64
[05:37] <jcw4> and my terminated machines are configured for amd64
[05:37] <jcw4> and I'm guessing juju tries to re-use them?
[05:37] <axw> wallyworld: the only production code I changed was in agent.InitializeState; only tests in "cmd/jujud" and "agent" call that, and both those packages pass in isolation
[05:38] <davecheney> jcw4: nope, it's because --upload-tools can only generate tools for your current machine
[05:38] <davecheney> which is amd64
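The two errors above come from two separate checks: translating between Go and Debian architecture names, and refusing to build tools for an architecture other than the host's. A rough sketch follows, assuming a mapping table that covers only the two names from this conversation; juju's real table and function names are not shown here, and `checkUploadArch` is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// goToDebian maps Go architecture names to the names used in constraints.
// Only the two architectures discussed here are listed; this is illustrative.
var goToDebian = map[string]string{
	"386":   "i386",
	"amd64": "amd64",
}

// checkUploadArch (hypothetical) rejects a constraint arch that differs from
// the arch of the machine that would build the tools.
func checkUploadArch(constraintArch, hostGoArch string) error {
	hostArch, ok := goToDebian[hostGoArch]
	if !ok {
		return fmt.Errorf("unknown host arch %q", hostGoArch)
	}
	if constraintArch != hostArch {
		return fmt.Errorf("cannot build tools for %q using a machine running on %q",
			constraintArch, hostArch)
	}
	return nil
}

func main() {
	// On an amd64 laptop this reports the same kind of error jcw4 saw.
	fmt.Println(checkUploadArch("i386", runtime.GOARCH))
}
```

This is why a 32-bit build machine (or prebuilt i386 tools) is needed to reproduce the bug, rather than a different constraint on the same host.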
[05:38] <jcw4> oooh
[05:38] <wallyworld> looking
[05:39] <jcw4> davecheney: well I *could* fire up a 386 vbox machine
[05:40] <davecheney> jcw4: if you want to try
[05:40] <davecheney> you need to get your juju environment up and running
[05:40] <davecheney> which is roughly
[05:40] <jcw4> davecheney: I do want to try; I've gotten my env up a few times on different 64 bit machines
[05:40] <davecheney> sudo apt-get install golang-go mercurial bzr git && go get github.com/juju/juju/...
[05:41] <jcw4> davecheney: yeah (plus GOPATH=~/go or something)
[05:42] <jcw4> davecheney: fun for tomorrow though, if someone doesn't fix it first
[05:45] <jcw4> davecheney: it doesn't attempt to re-use terminated instances?
[05:45] <jcw4> davecheney: I suppose I have to clean those up manually
[05:47] <davecheney> jcw4: destroy-environment removes all machines
[05:47] <jcw4> davecheney: and if it doesn't that's a bug?
[05:47] <davecheney> yup
[05:47] <jcw4> 'cause it terminated them, but didn't remove them (unless the -y switch is what kept them around)
[05:48] <davecheney> what is it ?
[05:48] <davecheney> and how can you tell they aren't removed ?
[05:48] <jcw4> juju destroy-environment <== "it"
[05:48] <jcw4> https://console.aws.amazon.com is how I could tell
[05:48] <davecheney> did the command complete successfully ?
[05:49] <jcw4> yep
[05:49] <jcw4> I'll do it again to verify
[05:49] <davecheney> be careful
[05:49] <davecheney> juju will delete _ANYTHING_ connected to those ec2 credentials
[05:50] <jcw4> now you tell me
[05:50] <jcw4> ;p
[05:50] <jcw4> my production web site hosting is in that account
[05:51] <jcw4> but no, I now have three new terminated instances
[05:51] <jcw4> and juju didn't seem to touch my running production instance
[05:51] <davecheney> phew
[05:51] <davecheney> ok
[05:51] <davecheney> don't do that again
[05:51] <jcw4> hehe
[05:51] <jcw4> I should know better
[05:51] <davecheney> sorry, should have said that juju will do that
[05:52] <jcw4> I think if I create additional keys it'll firewall them
[05:52] <davecheney> yup, different account that cannot see the running machines will be fine
[05:52] <jcw4> (metaphorically speaking)
[05:52] <jcw4> k... now it's bedtime for real.  Thanks again davecheney
[05:53] <davecheney> np
[06:34] <voidspace> morning all
[07:06] <menn0> voidspace: morning
[07:41] <TheMue> morning
[08:09] <voidspace> TheMue: does the 1.18 client rely on direct mongo access?
[08:10] <TheMue> voidspace: very good question. I would expect that there are still parts, yes.
[08:10] <voidspace> TheMue: if we need to support 1.18 (LTS client) does that mean we *can't* shut off direct mongo access
[08:12] <TheMue> voidspace: oh, eh, indeed. software would cry. :’(
[08:12] <voidspace> I have a bootstrapped environment on amazon created with direct mongo access shut off
[08:12] <voidspace> going to switch to 1.18 and see if I can interact with it
[08:12] <voidspace> but I suspect not....
[08:15] <fwereade> voidspace, I'm pretty sure that 1.18 had switched to the API everywhere -- but it still needed direct-mongo-access code to interact with 1.16 installations
[08:15] <fwereade> voidspace, if we *hadn't* switched by then we screwed up royally by releasing it -- waiting for API everywhere was one of the big reasons we waited so long to release it
[08:16] <voidspace> fwereade: ok, I hope that's the case
[08:16] <voidspace> fwereade: I have an mp closing the mongo ports
[08:16] <voidspace> or rather, not opening them...
[08:19] <fwereade> voidspace, awesome
[08:49] <voidspace> hmm... internet here not as good as I remember it being last year
[08:50] <voidspace> I lost an hour or so at the end of the day yesterday. But that was ok, I had insomnia and ended up working at 1am. :-/
[08:56] <voidspace> so, when I try to interact with a 1.21-alpha1.1 juju state server from a 1.18 client I get: ERROR invalid agent version in environment configuration: "1.21-alpha1.1"
[08:56] <voidspace> I can't find a reference to agent in the jenv (nor in environments.yaml)
[08:57] <axw> I thought we stopped building jujud in our tests?
[08:57]  * axw witnesses tests take 10s longer than they need to
[08:58] <voidspace> "juju get-env" reports the agent version
[08:58] <voidspace> ah, the 1.18 client probably doesn't like the "alpha" in the agent version
[08:59] <voidspace> so I can't really test this without faking the version number
[08:59] <voidspace> but "juju status" works fine without direct mongo access :-D
[08:59] <axw> ugh. I guess that's not a problem since we'd only do that for pre-release versions...
[09:00] <voidspace> axw: right, but it makes testing that the 1.18 client still works with pre-release servers tricky...
[09:00] <voidspace> servers/environments
[09:00] <axw> indeed
[09:06]  * voidspace lurches
[09:09] <fwereade> mattyw, oops, forgot -- does the meeting have a video call?
[09:09] <mattyw> fwereade, there should be one on the meeting thing - otherwise: https://plus.google.com/hangouts/_/canonical.com/initial-metrics
[09:09] <mattyw> fwereade, and no problem - it's still early
[09:25] <rogpeppe> trivial change to the charm.v3 package: https://github.com/juju/charm/pull/32
[09:32] <TheMue> rogpeppe: *click*
[09:39] <TheMue> rogpeppe: done
[09:39] <rogpeppe> TheMue: thanks
[10:35] <voidspace> who is OCR?
[10:36] <TheMue> voidspace: o/ and fwereade
[10:37]  * fwereade should probably go and do some reviewing
[10:43]  * TheMue is happy about VM snapshotting. otherwise his test environment would have had hard network troubles now.
[10:44] <voidspace> TheMue: heh, yeah - messing with networking configuration is a recipe for pain
[10:44] <voidspace> TheMue: fwereade: a simple one if you have time https://github.com/juju/juju/pull/449
[10:44]  * perrito666 has been working 1h without glasses and finally realises why is this screen so bright
[10:44] <voidspace> this is the one that shuts off the StatePort
[10:44] <fwereade> voidspace, looking at that one right now
[10:44] <voidspace> perrito666: heh
[10:44] <voidspace> thanks
[10:45] <voidspace> fwereade: there is a comment removed, it's not entirely clear just from the mp why that comment is removed
[10:45] <voidspace> fwereade: and the answer is "because axw asked me to remove the TODO"...
[10:45] <fwereade> voidspace, haha, cheers
[10:45] <voidspace> as I was touching that code (in the azure changes)
[10:47] <TheMue> voidspace: quick hangout?
[10:47] <voidspace> TheMue: sure
[10:47] <voidspace> TheMue: let me grab a shirt
[10:47] <voidspace> for your sake...
[10:47] <TheMue> voidspace: hehe
[10:49] <fwereade> voidspace, LGTM, but I think we need followups to close the port in upgraded-from environments
[10:49] <fwereade> voidspace, I would plausibly accept bugs for the cases that's not actually possible, because there probably is some uncooperative provider
[10:50] <voidspace> fwereade: cool
[10:51] <voidspace> fwereade: I'll look at a port closing upgrade step
[10:51] <voidspace> fwereade: not needed for azure
[10:51] <voidspace> fwereade: as we no longer mask the state-port the firewaller will close it for us
[10:52] <fwereade> voidspace, ah, ok, cool -- don't follow how that's azure-specific though
[10:53] <TheMue> voidspace: seems I lost you
[10:58] <voidspace> TheMue: sorry connection went down, back now - will re-enter hangout if you're still around
[10:59] <voidspace> fwereade: I'm not sure why it was azure specific either, but that was the only provider where I ended up touching masking code
[10:59] <voidspace> fwereade: with luck it may apply to all the providers - I'll look into it
[11:01] <TheMue> voidspace: no problem, only said I found another problem in my setup due to copying the VMs. will notify you about the next result.
[11:01] <voidspace> TheMue: ok, cool
[11:01] <voidspace> TheMue: did you manage to get onto atanga?
[11:02] <voidspace> TheMue: I would like to know how to do that, I can file an rt about it though
[11:02] <voidspace> I don't have ipv6 routing from here to the outside world, so I'd need to vpn it as well (like jam )
[11:02] <voidspace> could be fiddly
[11:02] <voidspace> but then we can experiment with ipv6 routing without killing our own network configurations...
[11:02] <TheMue> voidspace: not yet, would be the next step
[11:02] <voidspace> cool
[11:03] <voidspace> if I get upgrades completed today I'll look at that
[11:03] <TheMue> voidspace: me neither, my provider currently doesn’t support it :(
[11:03] <voidspace> vpn will be fine, it's just another stage of setup
[11:03] <TheMue> yep
[11:21] <mattyw> fwereade, quick question?
[11:22] <mattyw> fwereade, (2 including that one)
[11:22] <fwereade> mattyw, sure, ofc
[11:22] <fwereade> :)
[11:23] <mattyw> fwereade, if meterstatus changes we want to fire the meter-changed (or config-changed) hook, my question: We probably want to add some rate limiting on that - so we don't call the hook too many times in quick succession - yes or no?
[11:24] <fwereade> mattyw, I think it naturally limits itself, doesn't it?
[11:25] <fwereade> mattyw, if there's little enough going on that we have time to run a hook for every change we detect, that's fine
[11:25] <rick_h__> mattyw: fwereade is this convo for another channel?
[11:26] <fwereade> mattyw, otherwise we'll deal with them by coalescing as usual -- we never get config-changes queued up, we just make them available until they're acted on
[11:36] <mattyw> fwereade, I was just thinking if there was some bug and it got updated 1k times in 1 second we probably would only want to fire the hook once for the last value?
[11:36] <mattyw> fwereade, or is that optimising too soon?
[11:38] <fwereade> mattyw, no, that's correct, and it should fall naturally out of an implementation that conforms to what we already do
[11:38] <fwereade> mattyw, look at uniter.filter, that's basically its job
[11:38] <mattyw> fwereade, awesome, having it already there makes me happy
[11:38] <mattyw> fwereade, noted, thanks
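The coalescing behaviour fwereade describes (pending changes are not queued; only the latest one stays available until it is acted on) can be sketched roughly as below. This is a minimal illustration, not the code in uniter.filter: the `coalescer` type and its methods are hypothetical names, and the sketch assumes a single producer goroutine.

```go
package main

import "fmt"

// coalescer holds at most one pending value; a newer update replaces
// an older one that has not been consumed yet. (Hypothetical type.)
type coalescer struct {
	pending chan string // buffered with capacity 1
}

func newCoalescer() *coalescer {
	return &coalescer{pending: make(chan string, 1)}
}

// Update records the latest value, discarding any unconsumed older one.
// It assumes a single producer goroutine.
func (c *coalescer) Update(v string) {
	for {
		select {
		case c.pending <- v:
			return
		default:
			// Slot is full: drop the stale value, then retry the send.
			select {
			case <-c.pending:
			default:
			}
		}
	}
}

// Next blocks until a value is available and returns it.
func (c *coalescer) Next() string { return <-c.pending }

func main() {
	c := newCoalescer()
	// Simulate the "updated 1k times in a second" case from the discussion.
	for i := 0; i < 1000; i++ {
		c.Update(fmt.Sprintf("status-%d", i))
	}
	// The consumer sees a single pending change carrying the last value.
	fmt.Println(c.Next())
}
```

With this shape a slow consumer (a hook run, in the discussion) is triggered once per batch of changes rather than once per change, which is the effect mattyw was asking about.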
[11:39]  * fwereade lunch for a bit
[12:28] <axw> wallyworld mgz_ : https://github.com/juju/juju/pull/458
[13:07] <wallyworld> sinzui: you online yet?
[13:12] <sinzui> I am
[13:17] <sinzui> wallyworld, I am
[13:19] <wallyworld> sinzui: so, the CI errors seem to be related to juju not talking to mongo when everything is first starting. i have deployed many ec2 environments today with no problem. but a change has been made to force juju to ignore replica sets when first connecting and that is landing now. hopefully it will help
[13:19] <wallyworld> sinzui: also, the name of the lock file for 1.20.3 has been altered slightly so as to not confuse 1.20.2 or earlier
[13:20] <sinzui> wallyworld, I don't recall an ec2 problem
[13:20] <wallyworld> there's lots of failures in the aws-deploy-precise-amd64 job
[13:20] <wallyworld> the same ones as for azure
[13:20] <sinzui> I saw your comment. I updated tests to not cleanup so that we can verify this version under test
[13:20] <sinzui> oh, sorry I did not see that
[13:21] <wallyworld> np, azure was most affected
[13:21] <wallyworld> i just fucking hope we can get a build out today
[13:22] <wallyworld> also, andrew is landing a number of changes in trunk which will hopefully make the mongo related tests more reliable
[13:23] <wallyworld> sinzui: anyway, i have to leave for the airport in 7 hours and need sleep. so i'm going, but just wanted to let you know what is happening
[13:23] <sinzui> wallyworld, Do we want to ask for testing of 1.20.3 even if azure fails?
[13:24] <wallyworld> i think we could
[13:24] <mgz_> wallyworld: where's our guide for setting up simplestreams on openstack?
[13:24] <wallyworld> i didn't try maas though
[13:24] <sinzui> As we will both be flying, this is our only chance to get something to testers
[13:24] <bigjools> you're up late wallyworld
[13:24] <wallyworld> bigjools: yeah :-( trying to get a juju version out
[13:25] <wallyworld> mgz_: juju docs somewhere, can't recall exact url, let me check
[13:25] <bigjools> you get to Deutschland Sunday?
[13:25] <wallyworld> mgz_: https://juju.ubuntu.com/docs/howto-privatecloud.html
[13:25] <wallyworld> bigjools: i leave in 7 hours
[13:25] <bigjools> lol
[13:26] <wallyworld> for the airport
[13:26] <wallyworld> still need to finish packing
[13:26] <bigjools> lolol
[13:26] <wallyworld> bigjools: FO
[13:26] <wallyworld> bigjools: how is the sprint?
[13:26] <mgz_> wallyworld: thanks!
[13:26] <bigjools> wallyworld: productive
[13:27] <wallyworld> good
[13:27] <bigjools> some nice improvements
[13:27] <wallyworld> i had lunch with steve k today
[13:27] <bigjools> ha
[13:27] <wallyworld> he's in brisbane for pycon
[13:27] <bigjools> that's three people in Bris while I've been away... ffs
[13:27] <wallyworld> he said he didn't want to see you anyway
[13:27] <wwitzel3> hah
[13:27] <bigjools> heh
[13:28] <wallyworld> he also said HP is investing $2 BILLION in openstack next year
[13:28] <wwitzel3> that's it?
[13:29] <wallyworld> that's heat and iron.io as well
[13:29] <wallyworld> or was it 1 billion
[13:29] <wallyworld> but a lot of money, so we had better get our shit sorted
[13:29] <katco> wallyworld: holy moly
[13:30] <katco> that is a lot of money
[13:30] <wallyworld> yeah :-( deep pockets
[13:30] <bigjools> fark
[13:30] <wallyworld> they employ so many good openstack people
[13:30] <katco> openstack is good for juju, yes?
[13:31] <wallyworld> and there's lots of buy in for heat and iron.io since a lot of people want just an openstack solution
[13:31] <perrito666> natefinch: we seem to have a meeting
[13:31] <wallyworld> katco: well it is supposed to be, but it has a juju competitor called heat
[13:31] <katco> wallyworld: yeah reading up on that now
[13:32] <wallyworld> and maas competitor iron.io
[13:32] <wallyworld> i think it's called that
[13:32] <wallyworld> so lots of people just want one technology stack, and built in products integrate much better than generic 3rd party ones
[13:33] <wallyworld> so we need to make juju/maas compelling in other ways
[13:33] <natefinch> iron.io is a competitor of MaaS?
[13:33] <hazmat> natefinch, fwiw. the binaries compiled fine for me. i can put a copy up somewhere (dropbox, chinstrap) if that would be helpful. i used a pristine trusty container, and this is my extracted bash_history for the compile http://paste.ubuntu.com/7924025/
[13:33] <bigjools> I thought it was called Ironic
[13:33] <hazmat> natefinch, huh
[13:33] <wallyworld> natefinch: AH YES
[13:34] <wallyworld> sorry, i got the name wrong
[13:34] <wallyworld> i'm tired
[13:34] <bigjools> go to bed
[13:34] <wallyworld> ironblahsomething
[13:34] <hazmat> iron.io is a golang shop using mgo for queues, and stuff
[13:34] <hazmat> sass style
[13:34] <hazmat> saas even ;-)
[13:34] <wallyworld> yeah, i got the name mixed up
[13:34] <hazmat> how ironic ;-)
[13:34] <wallyworld> lol
[13:34] <natefinch> hazmat: yes please put it somewhere so I can grab it
[13:34] <natefinch> perrito666: coming
[13:34] <wwitzel3> hah
[13:35] <wwitzel3> this channel is on a roll today
[13:35] <natefinch> yeah, ok... I was going to say, I thought I sorta understood what iron.io did, and it's nothing like maas
[13:35] <bigjools> the only irony is that Americans don't understand irony :)
[13:35] <wallyworld> especially alanis morrisette
[13:35] <wallyworld> although see is canadian
[13:36] <natefinch> I thought Alanis Morrisette was Canadian
[13:36] <natefinch> yeah, see?
[13:36] <wallyworld> yeah
[13:36] <wallyworld> she
[13:36] <wwitzel3> same thing, americas hat
[13:36] <wallyworld> right i'm off, catch you guys later, hope the build passes CI finally, i'll wake up and check the computer straight away :-/
[13:36] <katco> wallyworld: travel salfe, sleep welel.
[13:36] <katco> well
[13:36] <wwitzel3> wallyworld: see ya
[13:38] <hazmat> natefinch, uploading it to dropbox, so all state tests pass minus apiservers afaics.. i've got quite a few notes in the activity log section of the ACID doc.
[13:38] <hazmat> the api server failures are around concurrency issues (doc level lock contention)
[13:39] <hazmat> in a few of the tests, worth investigating more.. we've done quite a lot of hot spot creation on docs to work around various deficiencies of not having txns and isolation (various ref counts)
[13:40] <natefinch> perrito666: you still there?
[13:40] <natefinch> hazmat: very interesting
[13:41] <hazmat> natefinch, https://www.dropbox.com/s/dbcrgahxxyt8buv/tokumx-1.5.0-linux-x86_64.tgz
[13:42] <natefinch> hazmat: awesome, thanks.   No idea why the build failed for me
[13:42] <hazmat> natefinch, yeah.. rather curious.. using pristine containers for builds is a good rule of thumb, and also isolates build crap from the host
[13:42] <hazmat> natefinch, added my build notes/recipe to the doc as well
[13:43] <natefinch> hazmat: maybe I just need to clean before I build... haven't done that
[13:44] <hazmat> natefinch, i have some requests out to tokutek re info on how to speed up dropdb equiv.. there's a lot of time it seems like in setup/teardown that's causing the tests to go slower overall even though actual test times look normal
[13:45] <hazmat> natefinch, instrumenting that to isolate would be useful, as would investigating the doc level conflicts we're doing with concurrent mods
[13:46]  * hazmat wanders off to pre-sales meetings
[14:05] <fwereade> ericsnow, ping
[14:06] <ericsnow> fwereade: o/
[14:06] <fwereade> ericsnow, just wanted to check quickly -- what is the Status going to be used for, and why do we need to have the id matching the timestamp?
[14:07] <ericsnow> fwereade: Status is just part of my enums compulsion :)
[14:07] <ericsnow> fwereade: the timestamp thing is just me overthinking
[14:08] <fwereade> ericsnow, I'm questioning the very existence of everything connected with status -- I'm fine with the *code* but I don't quite see what it's for
[14:08] <fwereade> ericsnow, I'd imagined we could just create the backup, and record the complete metadata just once when we know the backup is safely stored elsewhere
[14:08] <fwereade> ericsnow, no need for status at all afaics
[14:09] <ericsnow> fwereade: 2 things
[14:10] <ericsnow> fwereade: if storing the archive succeeds but the metadata fails, I didn't want to leave the archive without metadata
[14:11] <ericsnow> fwereade: the other thing is I was thinking about restore, where I expect we will add the info without an archive (when uploading an archive for restore)...now obviously to me a premature optimization :)
[14:13] <cmars> jam, thanks for the review on login v2
[14:14] <fwereade> ericsnow, good thought re orphan archives, wondering if there's a different way to do it, can't think of an obvious one
[14:14] <fwereade> ericsnow, anyway I have a few other comments, let me know what you think
[14:14] <ericsnow> fwereade: so yeah, the status stuff is overkill and I'll find a simpler way
[14:14] <ericsnow> fwereade: thanks for all the help
[14:15] <fwereade> ericsnow, a pleasure :)
[14:39] <ericsnow> fwereade: do you think having dedicated error values (like the 3 in my patch) is worth the inflexibility?
[14:39] <ericsnow> fwereade:   The idea of checking the error message to decide on what to do really bugs me, but identity checking means you can't customize the error.
[14:41] <ericsnow> fwereade: I guess I could have a customizable error type and a helper function that lets you ask if an error is a particular one (kind of like happens in state/api/params/apierror.go), but that's just too much work for what I need
[14:46] <fwereade> ericsnow, the code is all already written in juju/errors
[14:46] <ericsnow> fwereade: got it
[14:51] <gmb`> Guys, from where can I clone a branch of the latest stable version of Juju? I need it to hack up some demo interactions with MAAS.
[14:54] <zirpu> gmb`: github.com/juju/juju i think. or find it on launchpad also.
[14:55] <mgz_> gmb`: what do you mean by latest stable?
[14:55] <mgz_> not-trunk?
[14:55] <mgz_> the 1.20 branch on github
[14:56] <gmb`> mgz_: Okay, that’s perfick. Ta.
[15:07] <sinzui> gmb: there will 1.20.3 tools and debs in a few minutes if you just need a juju that is very current
[15:10] <gmb`> sinzui, mgz_ Okay. I can’t get 1.20 (or trunk) to build from source
[15:11] <ericsnow> for our use of mongo in a state collection (e.g. backups), is it okay to use omitempty on an _id field (i.e. forcing auto-generated unique ID)
[15:11] <marcoceppi> sinzui: for juju-core merges, to all merges have to be target at a bug?
[15:11] <gmb`> go install -v github.com/juju/juju/… fails:
[15:11] <marcoceppi> s/to/do/
[15:11] <sinzui> gmb, that's right
[15:11] <gmb`> sinzui: Ok…
[15:11] <sinzui> gmb, go doesn't support git like that
[15:12] <gmb`> sinzui: Shame that the documentation in the branch says to do *exactly* that.
[15:12] <natefinch> https://github.com/juju/juju/blob/master/CONTRIBUTING.md#getting-started
[15:12] <sinzui> gmb`, it says that to get stable?
[15:12] <gmb`> sinzui, natefinch: Hmm, my env must be screwy, rvba isn’t having the problem…
[15:13] <sinzui> marcoceppi, no, only when there are regressions. We fix regressions immediately now...we dont wait a few months
[15:13] <marcoceppi> sinzui: I don't understand this error https://github.com/juju/juju/pull/447
[15:13] <gmb`> natefinch: https://github.com/juju/juju/blob/master/README.md#building-juju
[15:13] <gmb`> sinzui: ^^
[15:13] <gmb`> Anyroad.
[15:14] <sinzui> marcoceppi, your branch doesn't fix a bug that needs to be fixed now. you cannot land unless you add fixes-<critical-regression>
[15:15] <marcoceppi> sinzui: oic
[15:15] <sinzui> marcoceppi, you cannot land until the build is fixed
[15:15] <marcoceppi> that's pretty cool
[15:15] <marcoceppi> slightly annoying, but cool
[15:17] <sinzui> marcoceppi, I would hope it is so annoying that regressions won't live for weeks
[15:17] <marcoceppi> that's what makes it cool
[15:17] <sinzui> gmb`, the instructions are right, did you skip dependencies...https://github.com/juju/juju/blob/master/CONTRIBUTING.md#dependency-management
[15:17] <gmb`> sinzui: No.
[15:18] <sinzui> gmb`, This script is very convoluted because it makes a tarball with frozen deps http://bazaar.launchpad.net/~juju-qa/juju-release-tools/trunk/view/head:/make-release-tarball.bash
[15:18] <sinzui> it does go-get, and godeps
[15:19] <natefinch> gmb`: what error are you getting?
[16:11] <katco> hey all, having trouble with trunk on my local machine. several panics, first of which is: PANIC: cmd_test.go:54: CmdSuite.TearDownTest
[16:11] <katco> ... Panic: watcher iteration error: not authorized for query on juju.txns.log (PC=0x40EF8D)
[16:11] <katco> any ideas here?
[16:11] <jcw4> katco: not GOMAXPROCS right >
[16:11] <jcw4> ;)
[16:11] <katco> jcw4: lol no... that is hard-coded into my script now :)
[16:11] <jcw4> hehe
[16:12] <katco> well, -parallel actually. let me try gomaxprocs just to be sure it's not the same thing
[16:12] <jcw4> katco: oh, interesting - I think -parallel is specific to how the tests are structured, vs. GOMAXPROCS which is more general
[16:13] <katco> jcw4: yeah, that's true
[16:14] <katco> here we go...
[16:19]  * bodie_ lights a votive candle for katco's dev environment
[16:19] <katco> LOL
[16:19] <katco> it felt the good vibes and got past the problem point!
[16:20] <katco> i am totally comfortable with dismissing jcw4's insistence on GOMAXPROCS=1 and attributing it _all_ to bodie_'s candle. thank you so much bodie_. way more helpful than jcw4 :)
[16:20] <bodie_> the cargo gods smile on this day
[16:20]  * katco is frustrated. back to coding...
[16:20] <bodie_> maybe next they'll get my branch landed :P
[16:20] <katco> haha
[16:20] <katco> hey i'm hoping for that too!
[16:21] <katco> in fact, i may need a PR reviewed here in a bit, assuming these tests now pass on my branch
[16:32] <jcw4> katco: lol
[16:32] <jcw4> and bodie_
[16:45] <ericsnow> is there any type in juju that records identifying information for the state server?  something like (unique juju instance ID, environment ID, state machine ID, hostname)
[16:48] <katco> alright, review needed: https://github.com/juju/juju/pull/398
[16:52] <axw> mgz_: I'm going to crash shortly, just added some comments to the azure i/o timeout bug
[16:53] <axw> looks like mongo is blocking reads/writes during index creation
[16:53] <natefinch> axw: 1am on a Friday night?  Weak  ;)
[16:53] <wwitzel3> lol
[16:54] <axw> :~(
[16:54] <axw> I'm a young old man
[16:54] <mgz_> axw: ta, I'll have a look
[16:54] <natefinch> axw: I go to bed at 9:30 most nights.  I'm just an old man :)
[16:54] <axw> hehe
[16:54] <TheMue> how to directly merge a PR without CI (it’s only one changed doc file)
[16:55] <TheMue> natefinch: hey, you and old, ha!
[16:55]  * TheMue only laughs about it
[16:57] <natefinch> Heh
[16:57] <TheMue> wonna bet?
[16:58]  * TheMue reaches his next decade next year
[17:02] <jcw4> TheMue: I was wondering if JFDI meant merge regardless of CI gate?
[17:02] <natefinch> TheMue: oh, I know you're older than me ;)
[17:02] <TheMue> jcw4: JFDI?
[17:03] <jcw4> TheMue: I've seen it in a few $$merge$$ messages... I thought it was an expression of frustration, and then I wondered if it was a secret code to CI :)
[17:05] <TheMue> jcw4: ah, hmm, afaik the bot merges with $$.*$$, so regardless the text
[17:05] <jcw4> oh well...:)
[17:07] <TheMue> I thought there’s an option on the web interface too
[17:11] <jcw4> TheMue: I've only seen it on the smaller repos
[17:11] <jcw4> TheMue: Unless you're in the smaller "owners" group maybe?
[17:12] <TheMue> jcw4: seems to be an option only visible with the appropriate access rights
[17:12] <jcw4> TheMue: yep
[17:12] <TheMue> ok, so using the standard way
[17:13] <natefinch> TheMue, jcw4:  just so no one is unclear, it's $$\S$$  which is to say, no spaces in the word between the dollar signs
[17:13] <jcw4> ah
[17:14] <TheMue> natefinch: ok, that’s more precise
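The merge marker natefinch describes (a single word with no spaces between two pairs of dollar signs) could be checked with a regular expression along these lines. The pattern is an assumption reconstructed from his description, not copied from the landing bot, and `requestsMerge` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// mergeMarker is an assumed pattern: "$$", one or more non-space
// characters, then "$$". It is not taken from the bot's source.
var mergeMarker = regexp.MustCompile(`\$\$\S+\$\$`)

// requestsMerge reports whether a PR comment contains a merge marker.
func requestsMerge(comment string) bool {
	return mergeMarker.MatchString(comment)
}

func main() {
	comments := []string{"$$merge$$", "LGTM $$JFDI$$", "$$merge it$$", "$$ $$"}
	for _, c := range comments {
		fmt.Printf("%q -> %v\n", c, requestsMerge(c))
	}
}
```

Under this assumed pattern the first two comments match and the last two do not, because they contain a space between the dollar signs.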
[17:14] <katco> hey jcw4 had a very good question: is there any reason we can't parallelize machine provisioning? https://github.com/juju/juju/pull/398#discussion-diff-15706382
[17:15] <jcw4> I was thinking that since we're blocking waiting for the machine to be provisioned anyway it would be a perfect opportunity use goroutines
[17:16] <TheMue> eh, now it tells me "does not match ['fixes-1350983', 'fixes-1350911', 'fixes-1347715', 'fixes-1351030', 'fixes-1351019']"
[17:16] <natefinch> oh yeah.... sorry, the landing bot is locked down because there are bugs breaking CI
[17:17] <katco> jcw4: yeah i mean why not provision them in parallel. there could be an underlying reason i'm not aware of
[17:17] <natefinch> katco, jcw4: as long as you aggregate the errors back to the same place where they would go if they were serial, that's probably ok
[17:17] <katco> sweet... this could really speed things up. great idea jcw4
[17:17] <TheMue> natefinch: thx for info
[17:18] <natefinch> katco, jcw4: actually... I think there's a couple reasons not to do that.... the major one is that we should be making a bulk call to the API.  instead of "Start one machine" x5, we should do "start 5 machines"
[17:19] <katco> natefinch: i think i see what you mean... we've tied it to the Machine struct, but it looks like it can take multiple
[17:19] <natefinch> katco, jcw4: we also have to take into account rate limiting, though really that should take care of itself if the code is correct (though I think it's not... I think there's an outstanding bug where if you say deploy 6 and after 4 you hit a rate limit, it won't ever retry)
[17:20] <katco> natefinch: well, i'm not going to touch this then. this PR is already 3 weeks old
[17:20] <natefinch> yeah
[17:20] <natefinch> I wouldn't do it with your PR
[17:20] <natefinch> it should be done separately
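natefinch's condition for parallel provisioning (aggregate the errors back to the place they would have gone in a serial loop) can be sketched as follows. This is an illustration only: `startMachine` stands in for a blocking provider call and is hypothetical, and the sketch does not address the bulk API call or the rate-limit retry concerns he raises.

```go
package main

import (
	"fmt"
	"sync"
)

// startMachine is a stand-in for a blocking provider call (hypothetical).
func startMachine(id string) error {
	if id == "3" {
		return fmt.Errorf("machine %s: rate limit exceeded", id)
	}
	return nil
}

// startAll starts every machine concurrently and returns one error slot
// per machine, in input order, so the caller can report failures exactly
// as it would after a serial loop.
func startAll(ids []string, start func(string) error) []error {
	errs := make([]error, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i int, id string) {
			defer wg.Done()
			errs[i] = start(id) // each goroutine writes only its own slot
		}(i, id)
	}
	wg.Wait()
	return errs
}

func main() {
	ids := []string{"1", "2", "3", "4"}
	for i, err := range startAll(ids, startMachine) {
		if err != nil {
			fmt.Printf("machine %s failed: %v\n", ids[i], err)
		}
	}
}
```

Keeping one slot per machine avoids a mutex and preserves which machine each error belongs to.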
[17:20] <katco> natefinch: are LGTM from non canonical employees sufficient for landing?
[17:22] <jcw4> katco, natefinch : my guess is it's more related to experience on the project and with Go than with employment status?
[17:22] <katco> jcw4: i would think that should be the case, but i just don't know.
[17:22] <natefinch> yeah
[17:23] <jcw4> I mean if Rob Pike wandered in and said LGTM, I'd be pretty okay with it
[17:23] <jcw4> :)
[17:23] <natefinch> lol
[17:23] <katco> haha
[17:23] <natefinch> I actually wouldn't, since he doesn't know the project.... but if he said not LGTM, I'd listen, because there is probably a code problem
[17:24] <jcw4> +1
[17:24] <jcw4> good point
[17:25] <katco> natefinch: i understand the landing bot is broken; do we still do $$merge$$ and assume it will catch up?
[17:25] <bodie_> to be pedantic, there is probably a good argument to be made that if code needs to be special-cased around the project, it's a little bit not-go-ish
[17:25] <bodie_> though there are some cool bits and pieces, certainly
[17:26] <jcw4> last night thumper indicated I shouldn't $$merge$$ until CI opened up
[17:26] <katco> ah ok.
[17:26] <jcw4> I don't get the impression the bot will catch up
[17:26] <katco> poor bot.
[17:26] <jcw4> katco: I don't think the bot is broken, I think it's actively rejecting anything that doesn't fix the critical regressions
[17:27] <katco> jcw4: ahhh
[17:27] <katco> thank you for that adjustment in perception. now some recent email threads make more sense :)
[17:27] <natefinch> yep
[17:27] <natefinch> it's functioning as designed
[17:27] <jcw4> it's a feature!
[17:27] <natefinch> rejecting everything until someone fixes the bugs
[17:27] <katco> haha
[17:45] <hazmat> any body familiar with juju metadata tools
[17:45] <hazmat> a question came up for generating image stream data with multiple series, and it's not clear that the generate-image subcommand has any capability to do that
[17:46] <ericsnow> what uniquely identifies a juju instance?
[17:46] <natefinch> hazmat: unless sinzui knows.... you're on at the wrong time of day
[17:47] <natefinch> ericsnow: what do you mean by instance?
[17:47] <hazmat> natefinch, yeah.. i know who to ask when the right time comes.. just curious if this blackbox has other initiates
[17:47] <hazmat> ericsnow, two things.. a machine id from corresponding sequence, and a provider id... for a group when querying provider its the sec group typically. maas scopes by account.
[17:48] <hazmat> internal to juju its all machine id
[17:48] <ericsnow> the juju instance in common between multiple state servers
[17:48] <natefinch> ericsnow: you're still not defining instance
[17:48] <natefinch> ericsnow: you mean the environment?
[17:49] <ericsnow> natefinch: effectively
[17:49] <ericsnow> natefinch: however with multi-env that would not work, right?
[17:49] <natefinch> There's an environment ID
[17:50] <ericsnow> right (State.EnvironTag().Id())
[17:50] <natefinch> when we move to multiple environments per state server, the code will need to change to recognize it's not a 1:1 relationship anymore
[17:51] <ericsnow> fair enough...I'm thinking ahead I guess
[17:52] <natefinch> don't think too far ahead :)
[17:59] <ericsnow> natefinch: never! :)
[17:59] <natefinch> exactly
[18:22] <sinzui> hazmat, natefinch . sorry. I don't have any experience with generate-image
[18:24] <natefinch> sinzui: it's ok... I only suggested you because you often seem to know everything ;)
[18:26] <hazmat> sinzui, it was for ivoks, looks you already answered, thanks
[18:26] <hazmat> er. looks like
[18:26] <sinzui> natefinch, I know more than I should and I am bad at explaining why a suggestion makes me cry in less than 20 words
[18:27] <sinzui> hazmat, I failed I think. I gave him an example for juju tools
[18:27] <hazmat> sinzui, he just needed to run the command twice re image metadata once for each series
[18:27] <sinzui> I know jerff (Ben) makes the official images
[18:28] <sinzui> hazmat, That gives you two files though. I think Ben knows how to make one file
[18:35] <hazmat> sinzui, hmm.. bummer i was hoping.. but the params didn't give confidence.. there's a separate simple streams project with py tools for assembly
[18:35]  * hazmat files a bug
[18:36] <sinzui> hazmat, yep, I think that is what Ben and Scott use
[18:37] <hazmat> filed as bug 1351426
[18:37] <_mup_> Bug #1351426: juju metadata generate-image only supports a single series <juju-core:New> <https://launchpad.net/bugs/1351426>
[18:40] <hazmat> ericsnow, ah.. ic what you mean.. effectively the state servers of a MES situation aren't part of the environments they're hosting (ie not machine 0)... there's been some notion of a special env within mes that they are a part of to support existing management techniques. ops like ensure ha from hosted/guest envs would be no-ops against them though.
[18:50] <perrito666> hey ppl, my ISP died on me and my cell phone is not all that great as a modem so Ill be offline while I sort this out, mail me if you need anything I am likely to answer right away from the phone
[18:50] <perrito666> cheers
[19:02] <dpb1> There is no option to leave the bootstrap node up, is there?
[19:02] <dpb1> (in case of failure)
[19:03] <natefinch> not currently
[19:03] <natefinch> we want to add a flag to do that... and that's probably coming in the near future
[19:05] <natefinch> dpb1: one thing you can do is do a kill -stop right before the code does the teardown
[19:06] <natefinch> I think anytime after you see the line about bootstrapping juju in the --debug log output, stopping the juju client will prevent it from killing the instance
[19:06] <natefinch> it's kind of a hack, but it works
[19:09] <dpb1> natefinch: ok, interesting
[19:09] <dpb1> natefinch: that indeed will help me I think
[19:09]  * dpb1 tries now
[19:10] <natefinch> dpb1: cool.  I think we'll add that flag to bootstrap sometime pretty soon.  I might be able to find time for someone to do it next week if we're lucky.... it plagues the core devs as well as users.
[19:12] <dpb1> natefinch: indeed, it's up.  now I can debug why it failed. :)
[19:12] <dpb1> thx
[19:20] <natefinch> dpb1: awesome
[20:21]  * perrito666 borrows a neighbor's internet
[20:56] <wallyworld> hazmat: you run generate-image more than once, just with a different series and image id, and it will append to what's there, that's how you get > 1 series