/srv/irclogs.ubuntu.com/2013/04/16/#juju-dev.txt

davecheneym_3: ping00:08
davecheneybigjools: LP keeps eating my package00:16
davecheneyis there any log of what or why ?00:17
davecheneyhang on00:18
davecheneyLP says I have no pgp keys registered ...00:18
m_3davecheney:00:23
m_3yo00:23
m_3davecheney: so good news... just about to spin 200 nodes00:24
m_3davecheney: btw, we got approval for 2k as soon as hp catches up00:24
fwereadem_3, cool, has anything fallen over yet?00:24
m_3davecheney: btw, no log whatsoever... just email half an hour later saying it failed... til then, guessing game00:24
m_3(afaik)00:24
m_3fwereade: nope, only at 100 atm00:24
m_3fwereade: have 200-node answers shortly00:25
davecheneym_3: "10:24 < m_3> davecheney: btw, no log whatsoever... just email half an hour later saying it failed... til then, guessing game"00:26
davecheney^ what does this mean ?00:26
m_3davecheney: lemme know when you can play... I'm just bouncing things around atm, but plan to hand it to you in an hour or two00:26
fwereadem_3, excellent00:26
davecheneym_3: soon00:26
m_3davecheney: oh, sorry, that was in response to package uploads00:26
davecheneyjust getting fucked by pgp and launchpad at the moment00:26
m_3davecheney: ack00:26
m_3feel your pain00:26
davecheneybest I can tell, it is just throwing away my upload because my pgp keys were wrong00:27
davecheneym_3: what is the url of the host ?00:27
davecheneyi'll shoulder surf00:27
m_3fwereade: just the sensitivity to rate limiting... makes this soooo much more pleasant than before00:28
m_3davecheney: same as before... /me looks00:29
m_3ubuntu@15.185.162.24700:29
m_3davecheney:00:29
m_3^^00:29
m_3davecheney: `tmux attach`00:29
m_3davecheney: sorry, can't do voice atm00:30
bigjoolsdavecheney: https://answers.launchpad.net/launchpad/+faq/22700:31
davecheneym_3: that is fine00:32
davecheneybigjools: ack00:32
davecheneybigjools: * If the upload is signed, even if it gets rejected by packaging-inconsistencies, you should receive an email explaining the reasons within 5 minutes.00:33
davecheney^ never happens00:34
fwereadedavecheney, you might have a particular interest in https://codereview.appspot.com/8786043 because it hits the provisioner00:34
* davecheney looks00:34
fwereaderogpeppe, if you're on, and/or thumper, ^^00:34
bigjoolsdavecheney: "You probably have not signed the upload, or have not signed it with a GPG key registered for your Launchpad account"00:34
thumperfwereade: s'up?00:34
fwereadethumper, https://codereview.appspot.com/878604300:35
davecheneym_3: turn offed all that debug shit00:35
davecheneym_3: purdy00:35
m_3davecheney: totally want an ncurses ui00:36
m_3like htop00:36
m_3juju-top00:36
thumperfwereade: I'll look when I'm done with the current train of thought00:36
m_3jcastro says "hi"00:36
fwereadethumper, lovely, thanks00:36
thumperm_3: where is jcastro?00:36
fwereadehi jcastro00:36
m_3crap latency killing us00:37
m_3openstack devel summit00:37
m_3davecheney: can you ctrl-c that tail?00:37
m_3nm00:38
m_3now it's a waiting game00:40
m_3davecheney: http://15.185.169.172:50070/00:40
m_3"Live Nodes"00:41
m_3that's when they show up from the relation00:41
davecheney52 ... not bad00:43
m_3coming up nicely00:43
m_3davecheney: feel free to turn on the tail when you want... just turn it off when you're done cause it clogs up my pipes00:43
davecheneym_3: i followed your package build instructions00:43
m_3:)00:43
m_3davecheney: and?00:43
davecheneybut LP is shitty at me because it has produced a mixed upload00:44
davecheneycontains both src and bin00:44
m_3working? or just stuck on dput and lp?00:44
m_3oh, right00:44
m_3so the pbuilder-dist stuff is _only_ to test it out00:44
davecheneyriiigh00:44
m_3when it comes time to dput it to lp... just use the debuild00:44
m_3davecheney: I think the last email in the chain of three or so I sent the other day has all you need00:45
davecheneythat might be where I am going wront00:45
davecheneywrong00:45
davecheneyi have been working off the first00:45
m_3davecheney: yeah, sorry00:45
davecheneys'ok00:45
davecheneyits not your fault00:45
m_3davecheney: that's the dev process... build and test00:45
m_3there's probably a way to just upload the source bits to lp00:46
m_3but shit I don't know00:46
m_3davecheney: so I'm currently planning on _starting_ a terasort once the 197 slaves are up00:46
m_3won't let that one finish or run for too long00:47
m_3once that's working, then I'll turn it all over to you00:47
davecheneym_3: ok, what are the rules about shutting it down ?00:47
davecheneywe're paying for this right ?00:47
m_3play at will... current limits to 200, but that might bump to 2000 as early as a few hours00:47
m_3davecheney: we're paying yes00:47
m_3davecheney: just destroy it when you're not actively testing something00:48
davecheney 7863 root      20   0 1035m 317m    0 S   25 15.8   2:28.72 mongod00:55
davecheney 7892 syslog    20   0  331m 1748 1212 S    4  0.1   0:34.75 rsyslogd00:55
davecheney 7903 root      20   0  676m 118m 6712 S    1  5.9   0:13.78 jujud00:56
davecheneytop three processes on the bootstrap machine00:56
davecheneyfwereade: we have to turn down all the document logging bullshit00:56
davecheneyrsyslog is nearly the top process on the bootstrap machine00:56
fwereadedavecheney, dammit, I just wish we had slightly more sophisticated logging so we could turn that stuff on when we need it00:57
davecheneyjuju-goscale2-machine-0:2013/04/16 00:29:34 DEBUG state/watcher: got request: watcher.reqWatch{key:watcher.watchKey{c:"machines", id:interface {}(nil)}, info:watcher.watchInfo{ch:(chan<- watcher.Change)(0xf840220a50), revno:0}}00:57
davecheney^ i'm sure we do not need this crap00:57
fwereadedavecheney, I actually use it somewhat regularly... it has useful information buried in amongst the spam00:58
fwereadedavecheney, however00:58
fwereadedavecheney, it *is* fricking ridiculous00:58
davecheneyfwereade: i've seen in other places00:58
davecheneyDEBUG2 and TRACE00:58
davecheneyi think the watcher stuff could be classed as TRACE00:59
fwereadedavecheney, yeah, that sounds reasonable, but we don't have any useful filtering gubbins regardless00:59
davecheneym_3: looks pretty decent to me01:00
davecheneymongo is taking a pounding01:00
davecheneybut the jujud process is basically idle (although it may be blocking on mongo)01:00
davecheneym_3: actually at the 200'th node is the most important time01:01
fwereadedavecheney, however, so long as it's not *too* difficult to turn it back on I would trivial LGTM something that turned off the watcher stuff01:01
davecheneyevery new machine in the environment adds a worker which is racing to complete any outstanding transaction01:01
davecheneyso the more workers, the bigger the race01:01
davecheneythis is lower case race, for those watching at home01:01
fwereadedavecheney, I would consider "s/false/true/ somewhere and upload new tools" to be not *too* difficult01:01
fwereadedavecheney, yeah, I have been wondering about how those would end up01:03
davecheneyfwereade: yeah, we can hack it for load testing01:03
fwereadedavecheney, although it's not *any* outstanding transaction01:03
davecheneyfwereade: really ?01:03
fwereadedavecheney, yeah, just one that's blocking one it wants to make01:03
davecheneyohhh, so if you are not actively waiting on a transaction to complete01:04
davecheneyyou don't participate01:04
davecheneythat makes it a lot better01:04
fwereadedavecheney, however certain documents are much too popularly written01:04
davecheneym_3: i think some of the delay in juju status is too many round trips01:04
fwereadedavecheney, I *suspect* that contention for the service document of whatever has lots of units is the real killer01:04
fwereadedavecheney, I would be very interested to know how 1x200 looks vs 10x2001:04
davecheneyfwereade: understood01:05
davecheneygood test01:05
m_3fwereade: yup, that sounds like a decent next step... easy to gen multiple smaller named clusters01:06
m_3fwereade: launchpad id?01:06
fwereadem_3, I am fwereade, I think01:06
m_3davecheney: whooops wtf was that?01:07
m_3strace01:07
davecheneytrying to figure out where all the time is going01:08
m_3oh, the '-v'01:08
m_3ack01:08
davecheneythere is a large block where status is waiting for the other side to return some data01:08
davecheneyactually, let me try something01:08
m_3k01:09
davecheneym_3: in theory I should be able to scp the .juju from the control machine, then use JUJU_HOME=... juju status01:09
davecheneyto run from my machine01:09
m_3davecheney: we didn't inject your keys01:10
davecheneylucky(/tmp) % JUJU_HOME=/tmp/.juju juju status -v01:10
davecheney2013/04/16 11:09:59 INFO JUJU:juju:status environs/openstack: opening environment "goscale2"01:10
m_3into the environment... lemme check01:10
davecheney2013/04/16 11:10:02 INFO JUJU:juju:status environs/openstack: waiting for DNS name(s) of state server instances [1500421]01:10
davecheneyi only need the outer machine01:10
davecheneyfwereade: that is a win for JUJU_HOME01:10
m_3nope, only the outer machine's keys are in that env01:10
davecheneyyou can just grab the .juju for another environment01:10
davecheneythen use JUJU_HOME=... juju $SUBCOMMAND01:11
davecheneym_3: very very very slow on my host01:11
davecheneyi suspect a lot of round trips01:11
fwereadedavecheney, shame not to share caches though01:11
davecheneyfwereade: what do we not cache ?01:11
fwereadedavecheney, I think that `juju switch` thing might have some mileage01:12
fwereadedavecheney, charms mainly01:12
davecheneyfwereade: i remain -1 on that proposal01:12
fwereadedavecheney, that might be it actually01:12
davecheneyfor the reasons stated01:12
fwereadedavecheney, yeah, I'll keep it to the list, it just made me think of it01:12
m_3davecheney: also... in az2 of hp so west US prob01:13
m_3davecheney: the "outer" machine is local to that az01:14
davecheneym_3: ahh, need -f01:14
davecheneybasically just too many round trips01:14
davecheneysome multiple of the number of machines and services01:14
m_3ack01:14
davecheneydunno, i think on balance that is better than the topology node01:15
m_3still got a few danglers...01:15
davecheneyi say start, you've got 95% of the machines reporting in01:17
m_3really need to adjust the numbers tho :)01:19
m_3haha01:19
m_3lemme bump them up so something a little more appropriate for that cluster01:20
davecheneyfwereade: we have a lot of machine agents restarting01:21
m_3fwereade: your keys are there btw01:22
fwereadem_3, cool, thanks01:23
davecheneyfwereade: http://paste.ubuntu.com/5711961/01:23
davecheneywhy does the machine agent keep reconnecting to state01:23
davecheneyhttps://bugs.launchpad.net/juju-core/+bug/116937801:25
davecheneyi guess there is no _mup_ 'cos linode got hacked01:26
m_3davecheney: I'm gonna go grab food01:27
m_3davecheney: you can just let the job run or not01:27
m_3davecheney: easiest is to just destroy-environment01:27
davecheneym_3: lets tear it down01:27
m_3davecheney: ok01:27
davecheneysome good results already01:27
davecheneywe just need the all-machines.log from the 0 machine01:28
davecheneythat is all we need01:28
m_3davecheney: I'm out feel free to do whatever01:28
davecheneyok will do and destroy01:28
m_3davecheney: I'll try to bump up to 2k tomorrow01:28
davecheneyfwereade: I would like to add a 'starting $CMD' log message01:28
fwereadedavecheney, thanks01:28
fwereadedavecheney, +1 to that01:28
davecheneywe're making a connection to state every few seconds per worker01:29
davecheneyso two per machine01:29
davecheneybut no error lines ...01:29
fwereadedavecheney, actually, there's a log.Noticef("agent starting")01:29
fwereadedavecheney, I don't think the actual process is bouncing01:29
davecheneyfwereade: right, so the agent isn't restarting01:30
davecheneybut the job is rerunning01:30
davecheneyso something is killing the Tomb01:30
davecheneyubuntu@juju-goscale2-machine-27:~$ head  /var/log/juju/unit-hadoop-slave-25.log01:30
davecheney2013/04/16 00:36:52 NOTICE agent starting01:30
davecheneyindeed there is a process restart message01:30
davecheneyubuntu@juju-goscale2-machine-27:~$ grep -c starting /var/log/juju/unit-hadoop-slave-25.log01:30
davecheney1301:30
fwereadedavecheney, ok, but those dials are happening every 30s01:31
fwereadedavecheney, I bet it is mgo01:31
davecheneythat fucking anti feature01:32
fwereadedavecheney, we pass that dial func in01:32
fwereadedavecheney, I imagine it is checking all the addresses in the cluster01:32
davecheneyfwereade: m_3: i have the all-machines log, i'm turning off the 200 machine environment01:34
fwereadedavecheney, cool01:34
davecheneyjuju-goscale2-machine-0:2013/04/16 00:33:33 ERROR worker/provisioner: cannot start instance for machine "16": cannot set up groups: failed to create a rule for the security group with id: %!s(*int=<nil>)01:35
davecheneyjuju-goscale2-machine-0:2013/04/16 00:35:52 ERROR worker/provisioner: cannot start instance for machine "28": cannot set up groups: failed to create a rule for the security group with id: %!s(*int=<nil>)01:35
davecheneyjuju-goscale2-machine-0:2013/04/16 00:36:08 ERROR worker/provisioner: cannot start instance for machine "30": cannot set up groups: failed to create a rule for the security group with id: %!s(*int=<nil>)01:35
davecheneyjuju-goscale2-machine-0:2013/04/16 00:46:25 ERROR worker/provisioner: cannot start instance for machine "82": cannot set up groups: failed to create a rule for the security group with id: %!s(*int=<nil>)01:35
davecheneyjuju-goscale2-machine-0:2013/04/16 00:46:55 ERROR worker/provisioner: cannot start instance for machine "85": cannot set up groups: failed to create a rule for the security group with id: %!s(*int=<nil>)01:35
davecheneym_3: this is why those machines didn't come up01:35
davecheneyi think I have a patch for that logging snafu01:35
davecheneyinteresting01:38
davecheneydestroy-environment blocks on hpcloud01:38
davecheneyon ec2, it's fire and forget01:38
davecheneyfwereade: ubuntu@juju-hpgoctrl2-machine-0:~$ juju destroy-environment -v01:39
davecheney2013/04/16 01:36:39 INFO JUJU:juju:destroy-environment environs/openstack: opening environment "goscale2"01:39
davecheney2013/04/16 01:36:39 INFO JUJU:juju:destroy-environment environs/openstack: destroying environment "goscale2"01:39
davecheneyubuntu@juju-hpgoctrl2-machine-0:~$01:40
davecheneydo we need a DEBUG or INFO "command finished"01:40
davecheneyso we can tell how long the command runs for ?01:40
thumperwould be nice01:40
davecheneyi'll raise a ticket01:41
davecheneylucky(~) % bzcat all-machines-201304016.log.bz2  | wc -l02:02
davecheney154838402:02
davecheneylucky(~) % bzcat all-machines-201304016.log.bz2  | grep -c 'watcher: got'02:02
davecheney102334502:02
davecheney66% of all log lines are 'watcher got such and such'02:02
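(Editor's sketch, not from the channel.) The same measurement as the `bzcat | grep -c` pipeline can be written in Go. This is a minimal sketch: `countLines` and `percent` are illustrative helper names, and the sample log text is made up. The two large numbers are the counts quoted above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countLines returns the total number of lines in log and how many of them
// contain substr.
func countLines(log, substr string) (total, matching int) {
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		total++
		if strings.Contains(sc.Text(), substr) {
			matching++
		}
	}
	return total, matching
}

// percent returns part as a percentage of whole.
func percent(part, whole int) float64 {
	return 100 * float64(part) / float64(whole)
}

func main() {
	sample := "DEBUG state/watcher: got request: a\n" +
		"NOTICE agent starting\n" +
		"DEBUG state/watcher: got request: b\n"
	total, matching := countLines(sample, "watcher: got")
	fmt.Printf("%d of %d lines match\n", matching, total) // 2 of 3 lines match

	// The counts quoted in the channel: 1023345 matching out of 1548384.
	fmt.Printf("%.1f%%\n", percent(1023345, 1548384)) // 66.1%
}
```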
fwereadedavecheney, +102:09
davecheneyfwereade: card raised02:09
fwereadethumper, https://codereview.appspot.com/8663045/ has a couple of extra comments and surprisingly few actual changes02:09
davecheneythe whole log file, 200 machines, compressed to 5mb02:09
davecheneysooooo much duplication02:09
fwereadedavecheney, I had a vague thought in mind that it might compress quite nicely, yeah, especially considering every one of those messages is sent to every machine02:10
davecheneyyeah, it might be a low blow02:10
davecheneythose log lines contain exactly the kind of duplication bz2 loves02:11
thumperdavecheney: I have a var foo [20]byte02:39
thumperdavecheney: and I want a string of that...02:39
thumperbut string(foo) doesn't work02:39
thumperwhat does?02:40
davecheneystring(foo[:])02:40
davecheneygotta slice the array first02:40
thumperta02:44
thumperdavecheney: can strings contain embedded nulls?02:47
davecheneythumper: yes03:17
davecheneystrings (and slices) know their length03:17
davecheneythey don't rely on \003:17
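(Editor's sketch, not from the channel.) The two answers above can be seen together in a few lines of Go. The array has to be sliced before it can be converted to a string, and the resulting string keeps every zero byte because its length is stored alongside the data. `arrayToString` is just an illustrative helper name.

```go
package main

import "fmt"

// arrayToString converts a fixed-size byte array to a string. The array has
// to be sliced first; string(a) on the array itself does not compile.
func arrayToString(a [20]byte) string {
	return string(a[:])
}

func main() {
	var foo [20]byte
	copy(foo[:], "hello")

	s := arrayToString(foo)
	fmt.Println(len(s))       // 20: the 15 trailing zero bytes are kept
	fmt.Printf("%q\n", s[:6]) // "hello\x00"
}
```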
thumperdavecheney: what is the best way to compare to byte slices?03:18
davecheneyreflect.DeepEqual(slice, slice) is the simplest03:19
thumperdavecheney: can I assign a byte array to a byte slice?03:53
thumperand will it do what I expect?03:53
davecheneythumper: yes04:05
davecheneythe array backs the slice04:05
thumperthought so...04:05
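(Editor's sketch, not from the channel.) "The array backs the slice" means the slice shares the array's storage: the assignment goes through a slice expression such as `arr[:]`, and writes through the slice are visible in the array. `sliceOf` is an illustrative helper name.

```go
package main

import "fmt"

// sliceOf returns a slice that shares storage with the array arr points to.
func sliceOf(arr *[4]byte) []byte {
	return arr[:]
}

func main() {
	arr := [4]byte{1, 2, 3, 4}

	s := sliceOf(&arr)
	s[0] = 9
	fmt.Println(arr[0]) // 9: the write went through to the array

	// copy gives an independent slice instead.
	c := make([]byte, len(arr))
	copy(c, arr[:])
	c[1] = 7
	fmt.Println(arr[1]) // 2: the array is unchanged
}
```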
* thumper pokes some more04:05
thumperfucking channel magic...05:09
thumperif this works, fair dinkum, it'll be a miracle05:09
thumperhah, well the first bit worked...05:14
thumperheh, it worked05:17
thumpercolour me surprised...05:17
* thumper fears review comments on this one...05:22
thumperbut proposing anyway05:22
thumperRietveld: https://codereview.appspot.com/8602046 for a file system lock implementation using lock directories05:30
* thumper sighs05:31
thumperrealised I missed a test for Unlock, but it can wait as I have to make dinner now...05:31
bigjoolsnice one thumper05:31
thumperthanks bigjools05:31
thumpermaybe it'll even get through review without changing too much :)05:32
bigjoolsthumper: it's the sort of thing that should be in Go's core05:32
thumper:)05:32
thumperyeah, but it isn't in python either05:32
thumperthat is why bzrlib implemented one05:32
* thumper moves into the kitchen05:33
thumperciao05:34
rogpeppemornin' all06:28
rvbafwereade: Hi… if it's the intended behaviour, then fine… I was troubled because pyJuju behaves differently: http://paste.ubuntu.com/5712470/.07:11
fwereadervba, yeah, pyjuju doesn't have lifecycle management07:18
rvbafwereade: all right then… I'll just make sure that it works as expected if I run "resolve mediawiki/0" as you advised.07:20
fwereadervba, yeah, if that doesn't work there's a problem07:20
fwereadervba, it did work for me though :)07:20
fwereadeTheMue, dimitern, rogpeppe: morning all btw07:35
rogpeppefwereade: hiya07:36
rogpeppefwereade, dimitern: i'd appreciate a review of this, if poss. the gui people are wanting to use it.07:38
fwereaderogpeppe, allwatcher service config?07:39
rogpeppefwereade: yup07:39
TheMuefwereade: heya, already woke up? seen a 4am comment by you.07:40
TheMuerogpeppe, dimitern: good morning to you too07:40
fwereadeTheMue, just a short nap ;p07:48
rvbafwereade: by "resolving" I suppose you mean removing the (broken) relation right?07:49
fwereadervba, yeah07:49
fwereadervba, `juju resolved mediawiki/0`07:50
TheMuefwereade: take care for yourself07:50
fwereadeTheMue, I'm ok, thanks, but I think I will be unilaterally declaring a couple of swap days next week ;p07:50
TheMuefwereade: yeah, sgtm07:51
TheMuefwereade: we need you in the long term07:51
rvbafwereade: it does not seems to fix the problem here: http://paste.ubuntu.com/5712542/07:52
fwereadeTheMue, I am reasonably well attuned to my own burnout signs, right now the psychologically healthy thing is to Get Things Done ;p07:52
rvbaseem*07:53
fwereadervba, I don't see a `juju resolved mediawiki/0` in there07:53
fwereadervba, I see a destroy-relation, which would be silently ignored because the relation's already dying07:53
TheMuefwereade: i've been in a similar flow once, but w/o any burnout signs my health striked back over night. that's why i care.07:54
rvbafwereade: ah right, that's what I was missing (sorry, I'm still used to py juju). With that it worked fine!07:55
fwereadervba, sweet07:55
rvbafwereade: tyvm :)07:55
TheMuedimitern: you had a few comments on https://codereview.appspot.com/8705043. could you please take a new look?07:55
fwereadeTheMue, btw, how's juju-deploy looking? in terms of what status it checks for?07:56
TheMuedimitern: i think it's all covered now.07:56
fwereadervba, fwiw quite a lot of the lifecycle stuff is covered in some detail in the stuff under doc/07:57
TheMuefwereade: will start now after i just had proposed the latest changes. so far i only did a quick scan into how it is configured, but not how it is working.07:57
rvbafwereade: ok, I'll have a look.07:57
rvbata07:57
fwereadervba, it's generally aimed at developers and might clarify a few things07:57
fwereadervba, start with the glossary, terms in there are used without explanation elsewhere07:58
rvbafwereade: another question: I terminated all the machines, they were successfully released (I see that on the MAAS server), but they still show up in "juju status".  Is that normal? http://paste.ubuntu.com/5712552/08:01
fwereadervba, that's in review :/08:02
rvbafwereade: all right then :)08:02
rvbaThanks.08:02
fwereaderogpeppe, reviewed08:31
rogpeppefwereade: thanks08:31
fwereaderogpeppe, fwiw parts of https://codereview.appspot.com/8786043/ might make you happy :)08:31
fwereaderogpeppe, I actually got a physical tingle from hitting `d`08:32
* rogpeppe is very happy to see those big blocks of red08:34
wallyworld_jam: hi, did my email make sense?08:34
jamwallyworld_: I understood it, still trying to sort out if I agree with it. Also, William has a patch that changes things around.08:35
wallyworld_ok, np08:35
wallyworld_i can explain a bit more in the standup if required08:35
rogpeppefwereade: i think tim got as far as the "info0" name and threw his hands up in disgust08:36
fwereaderogpeppe, without context, it is a pretty bad name ;)08:37
rogpeppefwereade: the context is all there to see...08:37
fwereaderogpeppe, there's quite a lot of assumed knowledge that you have to just kinda pick up by osmosis though08:38
rogpeppefwereade: yeah08:38
fwereaderogpeppe, reading the docs helps08:38
fwereaderogpeppe, but I suspect that really you need to read them, forget them, hit the code in anger a bit, and then read them again, at which point things may start clicking08:39
fwereaderogpeppe, I have found that is often my pattern08:39
rogpeppefwereade: BTW i thought about using the Map method, but honestly we are already knee deep in knowledge about the settings and i prefer to avoid generating unnecessary garbage; maybe i should just avoid all use of the Settings object and just fetch directly into the map like GetAll does08:39
rogpeppefwereade: yeah08:39
rogpeppefwereade: the Go docs, you mean?08:40
fwereaderogpeppe, most large systems I have to assimilate tbh08:40
fwereaderogpeppe, it's in the nature of technical documentation08:40
rogpeppefwereade: yeah08:40
rogpeppefwereade: it doesn't make sense until you start trying to do something with it08:41
fwereaderogpeppe, every sentence is important but the importance of some cannot be readily grasped on a first read through08:41
jamwallyworld_: interestingly, if you set "public_bucket_url" it also fails to sync-tools --public08:44
jamGives an Unauthenticated error.08:44
jamso if you *don't* set it, then it goes via the swift and existing client (I guess).08:44
jamIf you do set it08:44
jamthen it does a different unauthed connection08:44
jam?08:44
wallyworld_jam: i got it to work by commenting out the FindTools code which looked at the private bucket08:45
wallyworld_i set public-bucket-url and it just looked at that and didn't attempt to open the private bucket08:45
jamwallyworld_: fwereade's patch changes that around a lot, though it still looks at the private bucket (to see if there are tools there causing it to ignore the public bucket)08:45
wallyworld_sure, but that patch should allow control-bucket to be ""08:46
fwereaderogpeppe, I argued for keeping the error in https://codereview.appspot.com/8748046/ - let me know what you think08:46
jamI believe his patch changes it to only look at the pub bucket of the source (good), but still look at pub and private when --public is set.08:46
wallyworld_jam: it should do that but allow control bucket to be ""08:46
wallyworld_and ignore it if not specified08:46
jamfwereade: well offhand it would fix a bug if you just didn't search the private bucket at all.08:46
wallyworld_so that we can set up an env for just a public bucket08:46
wallyworld_for the shared swift account08:46
fwereadejam, wallyworld_: https://codereview.appspot.com/8726044/ and https://codereview.appspot.com/8748046/ are the relevant CLs08:46
fwereadejam, wallyworld_: as I recall we agreed in atlanta that any private tools should exclude all public ones from consideration08:47
wallyworld_fwereade: yes, but if an account only has a public bucket defined, we should allow for that08:48
jamfwereade: the downside to that is just not working at all, but I think the argument was with dev versions you don't expect it to work08:48
rogpeppefwereade: looking08:48
jamfwereade: so the specific bug is a bit involved. 1) our shared HP account only has object store (no compute), 2) in Goose when you search the private bucket it checks that you have compute access.08:48
wallyworld_fwereade: so the current HP Cloud shared public bucket should be able to be set up and work just to provide tools etc, and no private bucket is needed, since it's just a tools repository08:48
jamso that it can give a nicer error message than falling over and failing later.08:49
fwereadejam, wallyworld_: I'm not convinced an environment without a control-bucket is meaningful08:49
jamfwereade: so again, the hp shared tools account isn't useful08:49
wallyworld_fwereade: jam: the reason it checks for compute is that a single openstack client is used to access all server resources - swift and compute08:49
jamit is a storage for a public bucket08:49
jamno compute means you can't run juju there08:50
jambut that is fine08:50
fwereadejam, wallyworld_: ISTM it would be easiest to have a public-tools env with the control-bucket set to the other envs' public-bucket08:50
jamyou just want to store files08:50
jamfwereade: you need the creds08:50
jamto write to the buckewt08:50
jambucket08:50
rogpeppejam, wallyworld: if the public bucket is "", doesn't the provider just return an EmptyStorage?08:50
wallyworld_rogpeppe: yes, but the issue is the private bucket08:50
jamrogpeppe: public-bucket vs public-bucket-url I believe08:50
rogpeppewallyworld_: sorry, i meant the private bucket08:51
wallyworld_fwereade: it's like the s3 public bucket - we just want a place to get tools from, not run juju08:51
wallyworld_rogpeppe: for openstack, it currently assumes control bucket must be specified08:51
rogpeppewallyworld_: "it" being which piece of code, sorry?08:52
fwereadedamn sorry bbiab08:52
wallyworld_rogpeppe: that's an implementation decision that needs to be changed if we want to allow public bucket only envs to be specified08:52
wallyworld_for openstack08:52
wallyworld_rogpeppe: the SetConfig() for the openstack provider08:52
rogpeppewallyworld_: ah, so it's an openstack provider issue08:52
wallyworld_yes, an implementation decision that control bucket is expected08:53
wallyworld_since juju won't work without one08:53
wallyworld_but if we want sync-tools to work with just a public bucket, we need to change that08:53
jamwallyworld_, rogpeppe: so there isn't a default config for control-bucket, so you have to specify one08:53
jamand I don't know what s3Unlocked.Bucket("") does08:54
wallyworld_jam: the default is "" but the code assumes it is specified08:54
wallyworld_for openstack08:54
wallyworld_since juju needs it08:54
rogpeppejam: that would be easy to change - nothing outside the provider-specific code knows about the control-bucket setting AFAIK08:55
jamwallyworld_: for ec2, there is no default, so you have to specify something.08:55
wallyworld_jam: effectively, that's the same for openstack08:55
jambut I don't know what "" does for a bucket.08:55
wallyworld_since it dies if it is ""08:55
wallyworld_but for sync-tools, we just want an env that specifies a public bucket to copy to08:55
wallyworld_and not require a control bucket08:56
jamwallyworld_: technically both from and to, but I cheat with "juju-dist" as the private source bucket.08:56
jamsince that overlaps with the actual public bucket (I believe)08:56
wallyworld_yes, the public bucket for tools assumes juju-dist08:57
wallyworld_rogpeppe: yes, only the provider knows about the control bucket, so it is easy to change08:58
rogpeppewallyworld_: cool08:58
davecheneyrogpeppe: can you please try bootstrapping a quantal state server again08:59
davecheneyi believe the problem is fixed08:59
wallyworld_rogpeppe: the issue came up cause the account where the "standard" hp cloud public bucket was created only had swift enabled, not compute. but we don't need compute for that since it's just a tools repository, but the provider code needs to be tweaked to allow that09:00
rogpeppedavecheney: great!09:00
rogpeppedavecheney: you'd probably be best asking someone that's actually running quantal though09:00
davecheneyrogpeppe: who reported the issue that you reported to me ?09:00
davecheneyrogpeppe: if it's not convenient09:00
davecheneydon't sweat it09:00
jamdavecheney: yay, you got https://launchpad.net/~juju/+archive/experimental sorted out?09:01
rogpeppedavecheney: it might've been benji09:01
davecheneyi'll bootstrap a machine after din dins09:01
davecheneyjam: yeah, turns out there is an amount of foul language that can solve any problem09:01
jamdavecheney: I can imagine that level is pretty high09:01
rogpeppedavecheney: i think using default-series=quantal should bootstrap a quantal node09:01
davecheneyrogpeppe: indeed, i'm well versed in hacking that crap09:01
rogpeppedavecheney: :-)09:01
davecheneyjam: rogpeppe i have heard from sources that a backport of 2.2.4 is in the works09:02
davecheneyso we may not have to live with this hack for too long09:02
TheMue*: python freaks to the front. what does the machine = machine = in machine = machine = status["machines"][m_id]["dns-name"] mean?09:27
fwereadeTheMue, er, file/line please?09:48
TheMuefwereade: one moment09:49
TheMuefwereade: http://bazaar.launchpad.net/~gandelman-a/juju-deployer/trunk/view/head:/utils.py#L8809:49
fwereadeTheMue, I think it's just a typo, equivalent to machine = machines[...]09:50
fwereadeTheMue, er, you know what I mean09:50
fwereadeit's getting harder to read python these days without refactoring it to go in my head09:52
TheMuefwereade: that's how i interpreted it too, just a typo. ;)09:52
fwereadebtw, can I get a review from somebody on https://codereview.appspot.com/8786043/ please?09:53
fwereadeit unfucks some fairly critical behaviour09:53
rogpeppefwereade: looking10:02
rogpeppefwereade: replied to earlier review also, BTW10:02
fwereaderogpeppe, tyvm10:04
TheMuefwereade: you've got a review10:06
* TheMue found another nice py statement he has to think twice about. looks like a list of sets is created by a post-positioned for loop. 10:22
davecheneyooh, some sneaky sod has introduced another dependency on the build10:32
davecheneyTheMue: rogpeppe today I found a great use for JUJU_HOME10:36
rogpeppedavecheney: oh yes?10:36
davecheneyscp over the ~/.juju of another environment10:36
rogpeppedavecheney: what's the new dep?10:36
davecheneyJUJU_HOME=/tmp/.juju juju status << you see their environment10:36
davecheneyrogpeppe: maas10:36
davecheneyit's a build dep on environs/maas10:36
davecheneybut I don't think it is part of the jujud deps10:36
rogpeppedavecheney: ah yes. i didn't actually notice when that went in10:37
rogpeppedavecheney: it should be10:37
rogpeppedavecheney: otherwise jujud won't work on maas10:37
davecheneywell, then they haven't updated the check10:37
rogpeppedavecheney: that's a nice use for JUJU_HOME10:38
TheMuedavecheney: nice10:38
davecheneyvar expectedProviders = []string{ "ec2", "openstack",10:39
davecheney}10:39
* rogpeppe still misses plan 9: bind /n/remote/usr/rog/.juju $home/.juju; juju status10:39
rogpeppedavecheney: yup, that should be there10:40
rogpeppedavecheney: i hadn't seen environs/all before10:40
rogpeppedavecheney: i was just wanting to do something like that10:41
rogpeppedavecheney: to be honest, the expectedProviders check should probably be a test in environs/all10:41
davecheneyrogpeppe: no, absolutely not10:41
davecheneyyou can duplicate it there if you like10:42
davecheneybut it must be part of the cmd/juju/main_test10:42
davecheneyotherwise we'll just fuck ourselves like we did in Atlanta when a transitive dep changed10:42
rogpeppedavecheney: did we have environs/all back then?10:42
davecheneyno10:42
davecheneyi will still oppose any move to move that check10:43
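(Editor's sketch, not from the channel.) A check like the `expectedProviders` one discussed above boils down to comparing an expected list against what is actually present. The sketch below shows only that comparison, under assumptions: `diffProviders` is a hypothetical helper and the `registered` input is illustrative, and the real check in cmd/juju may obtain its list differently.

```go
package main

import (
	"fmt"
	"sort"
)

// diffProviders compares the providers a binary is expected to link in with
// the ones actually registered. Both result slices are sorted.
func diffProviders(expected, registered []string) (missing, unexpected []string) {
	want := make(map[string]bool)
	for _, name := range expected {
		want[name] = true
	}
	have := make(map[string]bool)
	for _, name := range registered {
		have[name] = true
		if !want[name] {
			unexpected = append(unexpected, name)
		}
	}
	for _, name := range expected {
		if !have[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	sort.Strings(unexpected)
	return missing, unexpected
}

func main() {
	expected := []string{"ec2", "openstack"}
	registered := []string{"ec2", "maas", "openstack"} // illustrative input
	missing, unexpected := diffProviders(expected, registered)
	fmt.Println(missing, unexpected) // [] [maas]
}
```

A test built on this would fail as soon as a provider is added or dropped without the expected list being updated, which is the kind of guard argued for above.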
TheMuelunchtime, bbiab10:43
davecheneylucky(~/src/launchpad.net/juju-core) % juju bootstrap -v --upload-tools10:44
davecheney2013/04/16 20:37:11 INFO environs/ec2: opening environment "ap-southeast-2"10:44
davecheney2013/04/16 20:37:14 INFO environs/tools: built 1.9.14-quantal-amd64 (2299kB)10:44
davecheney2013/04/16 20:37:14 INFO environs/tools: uploading 1.9.14-quantal-amd6410:44
davecheney2013/04/16 20:37:55 INFO environs/ec2: bootstrapping environment "ap-southeast-2"10:44
davecheney2013/04/16 20:38:00 ERROR command failed: environment is already bootstrapped10:44
davecheneywhen did the bootstrapped check move to after the upload tools ?10:44
rogpeppedavecheney: fwereade's been doing quite a bit of work in that area10:44
davecheneyindeed10:45
davecheneyrogpeppe: https://canonical.leankit.com/Boards/View/103148069/10482639310:45
davecheney66% of our logging goes in watcher debugging messages10:45
rogpeppedavecheney: yeah10:46
rogpeppedavecheney: it was even worse10:46
davecheneyrogpeppe: this was a 200 node hadoop instance10:46
davecheney20% cpu to mongo10:46
davecheney16% cpu to rsyslog10:46
rogpeppedavecheney: (most of the messages *were* saying "i just saw nothing")10:46
davecheney1-2% for jujud on the bootstrap machine10:46
rogpeppedavecheney: i'm surprised about that error. uploadTools shouldn't make the provider-state object in the control bucket10:48
davecheneyGet:7 http://ppa.launchpad.net/juju/experimental/ubuntu/ quantal/main mongodb-clients amd64 1:2.2.4-0ubuntu3 [20.3 MB]10:48
davecheneyfuck yea10:48
rogpeppedavecheney: that's just 'cos jujud's blocked by mongod, probably10:48
davecheneywut ?10:48
rogpeppedavecheney: the 1-2% for jujud10:49
davecheneyoh, yeah, i suspect jujud could use more cpu10:49
davecheneybut was blocked by mongo10:49
rogpeppedavecheney: yup10:49
davecheneywe are super chatty10:49
rogpeppedavecheney: yes10:49
rogpeppedavecheney: we should turn log level to info by default10:49
davecheneyrogpeppe: +10010:50
rogpeppedavecheney: and pass through --debug only if the environment is bootstrapped with --debug10:50
davecheney+ another 10010:50
rogpeppedavecheney: and then (not right now) allow dynamic changing of debug level10:50
rogpeppedavecheney: ah, i see the problem with your bootstrap10:51
davecheneyso, i've overwritten the tools the environment (may) have been using, then failed10:51
rogpeppedavecheney: it's that you shouldn't try to upload tools if the environment is already bootstrapped10:51
rogpeppedavecheney: right?10:51
davecheneycorrect10:52
davecheneybut it looks like the check happens too late now10:52
rogpeppedavecheney: i wonder if we should have an Environ.PrepareForBootstrap method10:53
rogpeppedavecheney: which will return an error if it's already bootstrapped10:53
rogpeppedavecheney: or actually, just "Prepare"10:53
rogpeppedavecheney: then the environment could create the control bucket and put "pending" (or something) inside the provider-state object, so that something else can't bootstrap while we're uploading tools10:55
davecheneyrogpeppe: that sounds like an old bug, "don't go bootstrappin' twice"10:57
rogpeppedavecheney: it would be nice if bootstrap could be race-free10:58
rogpeppedavecheney: and i'd prefer to design our API such that it's actually possible for a provider to do that10:58
fwereaderogpeppe, responded again... I think it must be that there's a use case I'm not seeing11:02
fwereadedavecheney, rogpeppe: fwiw upload-tools moved to command-time a while ago11:05
rogpeppefwereade: do you see dave's issue though?11:05
fwereadedavecheney, rogpeppe: coincidentally and not deliberately my pipeline always uploads unique build numbers and so shouldn't overwrite11:05
rogpeppefwereade: if i call juju bootstrap, it shouldn't upload the tools, *then* check that the env is not already bootstrapped11:05
fwereaderogpeppe, sure, but you argued very firmly against an IsBootstrapped method when I suggested it a while back...11:06
rogpeppefwereade: yes, and i still think it's wrong, hence my Prepare suggestion above.11:06
fwereaderogpeppe, so Prepare would upload the tools?11:07
rogpeppefwereade: no, Prepare would check that the control-bucket doesn't exist and create it otherwise (and do anything else necessary to make it possible to use the environment's Storage)11:08
fwereaderogpeppe, that feels to me exactly as racy in effect as an IsBootstrapped11:09
rogpeppefwereade: not quite, because currently there's a very large window (the amount of time it takes to upload the tools) for the race11:10
rogpeppefwereade: and if a provider does have access to an atomic operation, then it's easy to make it non-racy11:11
rogpeppefwereade: whereas IsBootstrapped is *inherently* racy11:11
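Editorial sketch of the distinction rogpeppe is drawing: a `Prepare` step that checks and claims in one atomic operation, as opposed to an `IsBootstrapped` query whose answer can be stale by the time the caller acts on it. Everything here is an assumption for illustration: `fakeStorage` is an in-memory stand-in for a control bucket, the mutex stands in for whatever atomic check-and-set a real provider might offer, and `errAlreadyBootstrapped` is a hypothetical sentinel.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errAlreadyBootstrapped is a hypothetical sentinel error for this sketch.
var errAlreadyBootstrapped = errors.New("environment is already bootstrapped")

// fakeStorage stands in for a provider's control bucket.
type fakeStorage struct {
	mu      sync.Mutex
	objects map[string]string
}

func newFakeStorage() *fakeStorage {
	return &fakeStorage{objects: make(map[string]string)}
}

// Prepare claims the environment by writing a "pending" provider-state
// marker. The check and the write happen under one lock, so two callers
// cannot both succeed.
func (s *fakeStorage) Prepare() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.objects["provider-state"]; ok {
		return errAlreadyBootstrapped
	}
	s.objects["provider-state"] = "pending"
	return nil
}

func main() {
	s := newFakeStorage()
	fmt.Println(s.Prepare()) // first caller claims the environment
	fmt.Println(s.Prepare()) // second caller is refused before any upload
}
```

With this shape, tools would only be uploaded after `Prepare` returns nil, so the long upload no longer sits between the check and the claim.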
fwereaderogpeppe, and the providers you're aware of with atomic check-and-set operations we could use that way are..?11:12
rogpeppefwereade: it's trivially conceivable.11:13
rogpeppefwereade: i imagine that amazon provides such a thing if we look hard enough11:13
davecheneyhttps://docs.google.com/a/canonical.com/document/d/1zj8zs5SUTvKAcnLlLiaXOalMp07zInJz1fN7w1OTDLo/edit#11:14
davecheneyrelease notes for 1.9.1411:14
davecheneygonna be tappin' y'all for input if you touched the card11:14
fwereaderogpeppe, afaict dave's case would be fixed with a check for ErrNoTools before first upload, while the fancy anti-race stuff is restricted to a very specific set of users that aren't, I think, very common11:18
fwereaderogpeppe, ie those sharing environs that they all promiscuously start up and shut down11:18
fwereaderogpeppe, I submit that if you want to treat environs that way, you get your own ;)11:19
rogpeppefwereade: in general we try to make all operations safe in a concurrent environment. the fact that aws makes it hard to do so doesn't mean that we don't want to do it11:19
fwereaderogpeppe, describe to me the set of customers you expect to be impacted by this11:19
fwereaderogpeppe, it's not the hardness, it's the utility11:20
rogpeppefwereade: i could ask the same about set-environ11:20
fwereaderogpeppe, that is one of our explicit stated goals for the sprint11:21
fwereaderogpeppe, what alternative functionality do you have in mind?11:21
fwereades/sprint/release/11:21
rogpeppefwereade: i mean - why do we go to so much bother to make it safe to use concurrently?11:21
fwereaderogpeppe, we don't, it's pitiful horsecrap11:22
rogpeppefwereade: when only a "very specific set" of users will be concurrently setting environment settings11:22
fwereaderogpeppe, and I don't care too much about that because the multiple-admins story is still in the future11:22
rogpeppefwereade: that's what i think about concurrent bootstrap11:22
fwereaderogpeppe, but that set of people is still way larger than the set of people who will ever be impacted by concurrent bootstrap issues11:23
rogpeppefwereade: i have no idea11:23
rogpeppefwereade: i don't know how we can11:23
rogpeppefwereade: i just want to make a tool that works reliably11:23
fwereaderogpeppe, *any* multi-admin situation opens the possibility of concurrent env modification11:23
rogpeppefwereade: same could be said for bootstrap, i think11:23
davecheneydimitern: with machine errors in status, is there anything to add to the release notes about it ?11:24
dimiterndavecheney: something about nonce provisioning perhaps?11:24
davecheneydimitern: https://docs.google.com/a/canonical.com/document/d/1zj8zs5SUTvKAcnLlLiaXOalMp07zInJz1fN7w1OTDLo/edit#11:25
fwereaderogpeppe, a strict subset of those involves concurrent bootstraps, because I promise I will at least once create an environment and then give the details to someone else after it's bootstrapped11:25
davecheneywould you be able to write a line or two about what that means for the customer ?11:25
dimiterndavecheney: cheers11:25
davecheneyTheMue: do you have anything to add to the release notes for JUJU_ENV_UUID ?11:26
davecheneyfwereade: with "unused machines will not be reused", is there anything for the customers to know about this in the release notes11:28
fwereadedavecheney, possibly, yes -- "automatic machine reuse has been disabled for now; similar effects can be more reliably obtained by using the "--force-machine" with to `juju deploy` and `juju add-unit`, which duplicated the action of jitsu deploy-to"?11:31
fwereades/with to/option with/11:31
fwereades/duplicated/duplicates/11:31
davecheneyfwereade: roger11:32
davecheneyfwereade: this is because we can't really guarantee what state a previous charm will leave the machine in11:32
davecheney, correct ?11:32
dimiterndavecheney: I don't think I can explain nonced provisioning in a meaningful way to the end user, without revealing how bad it used to be :)11:34
fwereadedavecheney, yeah11:34
TheMuedavecheney: only that this variable is supported now inside the hooks11:35
davecheneydimitern: understood, don't mention the war11:35
TheMuedimitern: thx for your feedback11:40
jamdanilos: ping for mumble11:41
dimiternTheMue: np, I just think splitting the test table doesn't give much benefit, and duplicates a bit of code11:42
TheMuedimitern: it helped me during testing ;) but i'll keep the optimization in mind for later11:43
fwereadewell, yay!12:13
fwereadelatest tools code all still seems to work12:13
fwereadeagents quietly ignore failed upgrades with missing tools, and then handle the ones they have tools for12:14
fwereadethe provisioner barfs if it tries to start a new machine with no tools available, and (probably) sets the error on the machine12:15
dimiternfwereade: \o/12:15
fwereadebut we can't see it because of (1) a status bug: that a missing instance-id causes us to skip checking for machine errors (whoops)12:15
fwereadeand (2), sometimes, another status bug, wherein any error examining one machine causes the *whole* machines dictionary to be replaced with some "status error: cannot find instance id for machine 3" nonsense12:16
fwereade1) is a big deal I think because it means we *don't* get display of provisioning errors12:18
fwereade2) is less so, but still a bit crap, because if there's a 2-minute delay on new instances showing up in ec2, as there seemed to be today, it means you lose all machine status info, not just the missing ones12:19
dimiternfwereade: when do you expect to merge the tools stuff?12:24
fwereadedimitern, I need to look back through and figure out what has/hasn't been reviewed12:24
TheMuefwereade: i shared a doc with my juju-deploy notes with you. one thing we don't cover are subordinates12:24
fwereadeTheMue, great, thanks, what is going to hurt us worst?12:25
TheMuefwereade: i have to do another crosscheck against our code but it looks as we are mostly clean, only subordinates are missing 100%12:26
dimiternfwereade: because the chain of dependency just got longer - i'm waiting on you and wallyworld_ is waiting on me for the openstack constraints flavor/image picking12:27
dimiternfwereade: and I think we should have a short discussion12:27
rogpeppedimitern: i need another LGTM on this, if you want to have a look: https://codereview.appspot.com/876104512:28
fwereadeTheMue, that is excellent news -- I wonder a little about the error states12:28
* dimitern looking12:28
rogpeppedimitern: ta!12:29
fwereadeTheMue, do you think you can get subordinates done today?12:29
TheMuefwereade: have to check what it means exactly. the output below services and the units is changed.12:31
TheMuefwereade: let me take a deeper look12:32
fwereadeTheMue, ISTM they are additions, not changes, to what we produce; and that state supplies all the necessary info12:33
dimiternrogpeppe: reviewed12:34
rogpeppedimitern: thanks!12:34
TheMuefwereade: yes, that's my first impression too12:37
fwereaderogpeppe, how would you feel about EnsureAgentVersion for FindBootstrapTools?12:39
rogpeppefwereade: much better.12:40
fwereaderogpeppe, I think I have a better followup but structure is strictly more pressing at this point :)12:40
rogpeppefwereade: i understand :-)12:40
fwereadethen, rogpeppe and dimitern, I think it comes down to the sync-tools stuff12:40
danilosjam: hi, sorry, I sent an email that I won't be able to make a stand-up today; sorry again12:41
rogpeppefwereade: i still feel quite strongly about the force-version semantics. have you been able to fix that?12:42
rogpeppefwereade: i've got another possible solution there actually, simpler than the function argument.12:43
fwereaderogpeppe, I'm afraid not -- like MachineConfig, it's one of the boundaries I am not keen to cross lest this pipeline explode further12:43
* rogpeppe 's heart sinks a bit12:43
fwereaderogpeppe, I *am* very much keen to discuss and implement how I could do all this more cleanly12:44
fwereaderogpeppe, and indeed to fix up the building, because I think it's important12:44
rogpeppefwereade: i just feel that this semantic is breaking the very thing you're trying hard to fix12:44
rogpeppefwereade: and it will rebound on us 10 fold12:44
fwereaderogpeppe, it is breaking a single case AFAICT: we won't automatically explode when compiling one major version of the tools with another CLI12:45
fwereaderogpeppe, when we fix it, it's a simple "--upload-tools now respects source version as far as possible line, and basically nobody is affected but us"12:46
rogpeppefwereade: it's breaking juju status12:46
fwereaderogpeppe, huh?12:46
rogpeppefwereade: we won't be able to tell what versions the agents are running12:46
rogpeppefwereade: so an extremely useful diagnostic tool becomes useless12:47
fwereaderogpeppe, because we will have forgotten what;s in our source tree?12:47
rogpeppefwereade: because the version and agent reports in the status won't have any necessary connection with the version of the code that the agent is actually running12:48
rogpeppes/and agent/an agent/12:48
fwereaderogpeppe, they *already don't*12:48
rogpeppefwereade: they do if you haven't used upgrade-juju12:48
rogpeppefwereade: and that's a bug in upgrade-juju that i would very much like to fix12:48
fwereaderogpeppe, I would too12:49
rogpeppefwereade: rather than *breaking it further*12:49
fwereaderogpeppe, but I insist we upload tools consistently across bootstrap and upgrade-juju12:49
rogpeppefwereade: i'm convinced it would be just as easy to fix UploadTools to do the right thing12:50
fwereaderogpeppe, it would be easy to fix it *badly*12:50
fwereaderogpeppe, and that would make it harder to fix it well, and get some sort of clear tools-on-disk abstraction going12:50
rogpeppefwereade: arguably. but the scope is very limited. and the externally visible behaviour is really important here.12:50
rogpeppefwereade: i really don't believe it would make it harder to fix well12:51
rogpeppefwereade: we're talking about 10 lines of non-test code here12:51
fwereaderogpeppe, which people get used to, and make little tweaks assuming, and next thing you know it's another 200-line diff to unpick it all12:52
fwereade200012:52
rogpeppefwereade: UploadTools is not used everywhere12:52
rogpeppefwereade: and i don't believe it will be12:52
fwereaderogpeppe, it's only a matter of time before someone realises that it's crazy to have two implementations of it, and adds a func that calls it to envtesting12:53
fwereaderogpeppe, tentacles!12:53
rogpeppefwereade: why two implementations?12:53
fwereaderogpeppe, because of UploadFakeTools which does roughly the same thing12:54
fwereaderogpeppe, itself factored out of a range of tool-uploading tests in some prereq12:54
rogpeppefwereade: i don't want to support juju users with this misfeature in12:54
fwereaderogpeppe, dev version == not supported12:55
fwereaderogpeppe, upload-tools == dev version12:55
rogpeppefwereade: like we don't actually be supporting developers...12:55
rogpeppes/don't/won't/12:55
rogpeppefwereade: please tell me: why is this whole pipeline of changes important?12:55
rogpeppefwereade: i mean, important enough that we're desperately trying to get it in before the deadline12:56
fwereaderogpeppe, because our tools-picking was close to random, and it was wantonly fucking over developers, and I have no confidence that the implementation that fucks over developers will not also fuck over users12:56
fwereaderogpeppe, because there were 3 distinct live implementations of tools-picking, each of which was wrong, and probably in the same way, but I'm not confident of that either12:58
fwereaderogpeppe, I believe it is absolutely critical that we are as *predictable* as possible12:59
rogpeppefwereade: that's why i believe we should be able to predict the agent version from the version of the agent we're uploading12:59
rogpeppefwereade: otherwise developers will continue to be wantonly fucked over13:00
fwereaderogpeppe, "oh yeah, sometimes the wrong tools get chosen, I forget the details" inspires much less confidence than "developer tools are always uploaded with the cli version plus a unique build number, we're on it, see lp:1168754"13:00
fwereaderogpeppe, which we will have to fix imminently anyway13:01
rogpeppefwereade: it was actually "tools are chosen from the public bucket if you haven't uploaded a version with the right series". which is a fairly similar statement13:01
rogpeppefwereade: at least this change will fix the default case.13:02
fwereaderogpeppe, but you cannot in any way characterise what those tools will be13:02
rogpeppefwereade: but when someone comes to us and says "my environment is stuffed" and we want to find out what version they're running, we'll have to tell them to ssh to a machine, remove the force-version file and call jujud version again13:02
fwereaderogpeppe, we'll say "what's the version in your $GOPATH"?13:03
rogpeppefwereade: that may bear no resemblance to the version they bootstrapped with last week13:03
rogpeppefwereade: also, it's the version in your PATH that is the important thing13:04
rogpeppefwereade: and that's part of the point.13:04
fwereaderogpeppe, I don't follow: that's what they're *reported as*, not what they *are*13:04
rogpeppefwereade: oh i see. who knows whether they're still using the same branch?13:05
fwereaderogpeppe, they should if they're playing with sharp tools?13:06
fwereaderogpeppe, also, builds with the same exact version will always have been built from the same source13:07
fwereaderogpeppe, which is a pretty useful guarantee13:07
fwereaderogpeppe, x.x.x.1 was built from 1.10.2; x.x.x.2 was built from 1.11.7; upgrade, downgrade, dump one set of tools and see what happens13:08
fwereaderogpeppe, you might even want to build 2 versions of the cli to check that each can interact with each nicely13:10
fwereaderogpeppe, and that's really all you need, I think, to do sensible upgrade behaviour checking as a developer13:10
fwereadehazmat, ping13:24
fwereadedoes anyone have ~15s for my most trivial review ever? https://codereview.appspot.com/868804413:42
TheMuefwereade: done13:48
=== wedgwood_away is now known as wedgwood
rogpeppefwereade: i really don't think this is so bad: lp:~rogpeppe/juju-core/fwereade-do-not-lie13:53
rogpeppefwereade: it would need a little more test coverage around Upload, but i would be much happier with it done like this.13:54
fwereaderogpeppe, it's injecting a little snippet of custom logic in between steps 1 and 2 of three distinct separate operations -- it is taking things that are tightly coupled and could be profitably separated (if only so we could test the blasted things) and making them *more* coupled14:00
fwereaderogpeppe, and as soon as we're signing builds it will become more so14:00
rogpeppefwereade: i agree, but it fixes a real issue without undue perturbation to the code14:01
fwereaderogpeppe, I think this is where we differ14:01
rogpeppefwereade: and causes several big "THIS IS WRONG" comments to be unnecessary14:01
rogpeppefwereade: it's not a 1000 line diff14:01
rogpeppefwereade: kanban?14:02
fwereaderogpeppe, ah yeah14:02
rogpeppemramm: ^14:02
mrammrogpeppe: yea, be there in a minute14:02
rogpeppesaved by a "declared and not used" error once again14:50
rogpeppeniemeyer: hiya!14:51
niemeyerrogpeppe: Yo14:51
rogpeppefwereade: could you please take another look at this before i submit? https://codereview.appspot.com/876104514:59
fwereaderogpeppe, lgtm, nice15:09
rogpeppefwereade: thanks15:09
fwereadeI'll be back to do a submit-burst a bit later, need a quick rest15:11
rogpeppedimitern, fwereade, TheMue: trivial? https://codereview.appspot.com/866404715:15
fwereaderogpeppe, LGTM trivial with quibbles left to your judgment15:17
fwereadeand I really am off for a bit now15:17
=== rogpeppe2 is now known as rogpeppe
mrammHow goes everything?16:17
rogpeppejust about to leave16:38
rogpeppefwereade: trivial? https://codereview.appspot.com/865804516:38
mrammMany more items in the release notes: https://docs.google.com/a/canonical.com/document/d/1zj8zs5SUTvKAcnLlLiaXOalMp07zInJz1fN7w1OTDLo/edit#16:46
mrammI just took things from the kanban board, and wrote them up.16:46
mrammA few of them may have been available in 1.9.13 but were not announced then.16:46
rogpeppefwereade: there's a very simple reason why we don't see logs from the unit agent16:53
rogpeppefwereade: it's just not implemented16:53
rogpeppefwereade: no time to do it today i'm afraid16:53
rogpeppetime to go16:53
rogpeppesee y'all tomorrow!16:53
rogpeppemramm: thanks for that - quite a substantial list!16:55
mrammrogpeppe: agreed16:55
mrammI also got the force-machine stuff merged16:55
mrammso that part of the release notes is now true ;)16:55
rogpeppemramm: cool16:56
rogpeppemramm: has it been tested live?16:56
rogpeppeactually, i really am leaving :-)16:56
kapil_so the global firewall mode, still is adding entries per machine..17:20
kapil_into a global sec group, which still runs into size limits17:21
kapil_it's actually a smaller size limit than the number of groups17:21
mgzha17:23
mgzwell, that's fixable17:23
mgzbut... shouldn't dupes be rejected anyway?17:23
mgzie, I add a rule saying allow tcp 80 to 0.0.0.0/017:24
mgzif I then try to add that rule again, I get back an error from the api saying it's already got that17:24
m_3hazmat: juju-goscale2-machine-0:2013/04/16 00:46:25 ERROR worker/provisioner: cannot start instance for machine17:26
kapil_mgz, if they're differentiating on address then they would be distinct17:26
m_3hazmat: "85": cannot set up groups: failed to create a rule for the security group with id: %!s(*int=<nil>)17:26
kapil_the ostack provider ensureGroups looks sane17:28
kapil_hmm17:28
m_3hazmat: ubuntu@15.185.162.24717:29
mgzm_3: can you ssh-import-id gz too please?17:37
kapil_mgz, we're in the middle of performing an experiment, so read only observation pls unless coordinated17:40
mgzindeed.17:40
m_3mgz: added17:41
mgzta.17:41
kapil_fwereade, if we're not reusing, we should probably also be destroying during destroy-svc17:46
mgzI only see two ports opening in the log in home17:47
mgz...so, is it just lack of group cleanup between runs?17:48
kapil_mgz, looks sane17:50
kapil_we're only opening port on the master which is single instance17:50
kapil_perhaps it was accidental expose of the hadoop slave17:51
mgzit's probably just the code not being tolerant of the api "already got that" response and yeah, a double open17:53
mgzthe error is weird though, not what I'd expect17:54
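Editorial sketch of what "tolerant of the api 'already got that' response" could look like: a wrapper that treats a duplicate-rule error as success, so opening the same port twice is harmless. The types `rule` and `fakeGroup`, the sentinel `errDuplicateRule`, and `ensureRule` are all hypothetical names for this sketch, not the openstack provider's code. A real API might only expose the duplicate condition as an error string or status code, which is the mismatch hazmat raises later in the log.

```go
package main

import (
	"errors"
	"fmt"
)

// errDuplicateRule is a hypothetical sentinel for "rule already exists".
var errDuplicateRule = errors.New("security group rule already exists")

// rule is a minimal, comparable description of an ingress rule.
type rule struct {
	proto string
	port  int
	cidr  string
}

// fakeGroup is an in-memory stand-in for a security group.
type fakeGroup struct {
	rules map[rule]bool
}

// addRule mimics an API that rejects exact duplicates.
func (g *fakeGroup) addRule(r rule) error {
	if g.rules[r] {
		return errDuplicateRule
	}
	g.rules[r] = true
	return nil
}

// ensureRule adds r and treats "already exists" as success, so callers can
// open the same port repeatedly without failing.
func ensureRule(g *fakeGroup, r rule) error {
	err := g.addRule(r)
	if errors.Is(err, errDuplicateRule) {
		return nil
	}
	return err
}

func main() {
	g := &fakeGroup{rules: make(map[rule]bool)}
	web := rule{proto: "tcp", port: 80, cidr: "0.0.0.0/0"}
	fmt.Println(ensureRule(g, web)) // first open adds the rule
	fmt.Println(ensureRule(g, web)) // double open is treated as success
	fmt.Println(len(g.rules))       // number of rules actually stored
}
```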
m_3mgz: you want anything set up before we kick off a bigger run?18:05
=== TheRealMue is now known as TheMue
hazmatmgz, i wonder if we're getting different error strings causing a value mismatch on the duplicate group detection19:33
hazmatmgz, where you at..19:39
hazmatmgz, i'd like to pair on this.. the variation in errors is a bit high, it looks like some rate limiting is missing on flavor listing19:41
bacwith juju-core (r1164) i'm seeing juju commands failing rather than queueing up.  for instance if i bootstrap and then deploy in a script the deploy fails with "error: no instances found".  very non-juju.  anyone else seen it?20:02
bacthis: http://pastebin.ubuntu.com/5714170/20:06
mgzhazmat: sorry, just missed you before lunch, I'm in B113 right now, we could meet up somewhere to poke this21:03
thumpermorning21:10
thumperbac: not seen it, but not played much21:10
thumperbac: I agree not very juju :)21:10
bacthumper: it was suggested i clean out my buckets.  haven't gotten to try that yet.21:12
thumperbac: I don't think that buckets should have anything to do with that...21:12
mgzwhat exactly are you deploying on?21:13
TheMuethumper: morning21:13
mgzwhat you need to debug this is to run the list command on your underlying cloud and see what the instances are up to21:14
mgzyou can see that kind of behaviour if, for instance, the instance went to the error state21:14
m_3mgz: http://paste.ubuntu.com/5714448/22:04
m_3mgz: I'm gonna bring up 200 and then add some incrementally22:04
mgzm_3: ace22:11
mgz20 security group rules is pretty tight22:11
mgzdefault and the environ group will take about 10 just on their own22:12
m_3mgz: we can just go ahead and bump that up a bit22:15
mgzit wouldn't hurt22:15
m_3mgz: didn't realize we were going to be adding that many rules22:16
m_3is that because we're in global mode?22:16
m_3mgz: we're not going to nest any security groups right?22:20
mgzwe'll add rules to the global group for everything that opens ports22:22
mgzm_3: session done now, coming to find you22:22
m_3mgz: booth22:24
thumperrogpeppe: don't suppose you are around?23:06
thumperhmm... just after midnight23:06
thumperperhaps not...23:06
thumperhi wallyworld23:07
thumperwallyworld: how was the holiday?23:07
wallyworldg'day23:07
wallyworldfarking awesome23:07
wallyworldcan't wait to go back23:07
mgzno getting eaten by lion...23:07
wallyworldno, i am a fast runner23:07
wallyworldmgz: how's ODS?23:09
wallyworldthumper: i like your Set stuff - i really lament Go's lack of collections and associated standard things like Array.contains etc - there's so much boilerplate in our business logic where all this is done by hand each time :-(23:11
thumper:)23:11
mgzwallyworld: but writing a loop is so easy23:11
thumperwallyworld: yeah23:11
wallyworldseems like for every 100 lines of code, 50% is not business logic at all23:11
thumpermgz: don't make me hurt you23:11
mgzm_3: we're still getting the mongo timeout thing every minute or so23:13
mgzall seems to be from one machine, so that might just have something duff with networking23:14
thumpermgz: is mramm there with you?23:15
mgzhe's within yelling distance somewhere23:15
thumpermramm: oh hai... I'm guessing that we won't have a one-on-one call this week23:17
mrammthumper: I was not planning on doing one on ones with everybody23:18
TheMueso, 1st part of subordinates in status, time to go to bed.23:18
mrammbut I can sneak away from meetings to do some if they are helpful (on a case by case basis)23:18
TheMuehave a good night all23:18
mrammTheMue: thanks!23:18
mrammTheMue: good work.23:18
TheMuemramm: yw, and thanks.23:19
thumpermramm: nothing urgent, I talked with fwereade about work23:19
mgzso, machine 7 just never arrived at a good state: <http://paste.ubuntu.com/5714587/>23:20
m_3mgz: lemme know if we should bounce23:27
davecheneym_3: rog committed a fix overnight to reduce the amount of logging spam23:49
davecheneyso that should cause less rsyslog load on the bootstrap node23:49
mgzfiled bug 116977323:56
