/srv/irclogs.ubuntu.com/2013/04/26/#juju-dev.txt

davecheneym_3: ping00:00
davecheneyour cloudinit harness doesn't support the bits of upstart I need00:00
davecheneyso i'm going to hack the bootstrap node after boot00:01
davecheneyarosales: ^ as above00:01
davecheneythat will have the same effect and validate our assumptions about the ~298 connection limit00:01
davecheneyOT question: does bzr have anything like svn externals or git submodules ?00:04
davecheney$ sudo initctl start -v juju-db00:20
davecheneyinitctl: Job failed to start00:20
davecheneyFML00:20
thumperhi davecheney00:30
davecheneyubuntu@juju-hpgoctrl2-machine-0:~$ nova list00:34
davecheney+---------+---------------------------+------------------+--------------------------------------+00:34
davecheney|    ID   |            Name           |      Status      |               Networks               |00:34
davecheney+---------+---------------------------+------------------+--------------------------------------+00:34
davecheney| 1465097 | juju-hpgoctrl2-machine-0  | ACTIVE           | private=10.7.194.166, 15.185.162.247 |00:34
davecheney| 1565949 | juju-goscale2-machine-37  | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89   |00:34
davecheney| 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83  |00:34
davecheney| 1581493 | juju-goscale2-machine-0   | ACTIVE           | private=10.7.27.166, 15.185.166.80   |00:34
davecheney+---------+---------------------------+------------------+--------------------------------------+00:34
davecheney^ jammed in deleting for a few days now :(00:34
davecheney2013/04/26 00:51:08 DEBUG started processing instances: []environs.Instance{(*openstack.instance)(0xf8401b3f00)}00:51
davecheney^ *openstack.instance needs a String()00:52
m_3davecheney: hey01:17
davecheneym_3: hey mate01:18
davecheneygoing for broke for 2k01:18
m_3ssup?  still jammed?01:18
m_3sweet01:18
m_3bit of latency atm... gogo inflight wireless01:19
m_3:)01:19
davecheneyi've hacked the mongo on the bootstrap machine to have at least 20,000 conns01:19
davecheneythat should be enough for the moment01:19
m_3oh nice01:19
davecheneym_3: where u off too ?01:19
m_3SF, then Portland01:19
m_3SF is prep for the big data summercamp talk01:20
m_3portland is railsconf01:20
m_3whoohoo01:20
m_3actually looking forward to hanging with the ole 'austin-on-rails' crowd01:20
davecheneym_3: I think we'll probably run out of ram on the bootstrap node by 2,00001:20
davecheneym_3: this one is a hp bug,01:21
davecheneyubuntu@juju-hpgoctrl2-machine-0:~$ nova list | grep delet01:21
davecheney| 1565949 | juju-goscale2-machine-37  | ACTIVE(deleting)  | private=10.6.245.47, 15.185.172.89   |01:21
davecheney| 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting)  | private=10.6.246.187, 15.185.177.83  |01:21
m_3davecheney: damn... I was just writing that we can bounce it and get something larger01:21
m_3but we can't update the env after bootstrap still right?01:21
davecheney~ 1.5 mb per service unit01:21
davecheneyenv ?01:21
davecheneyyou mean the spec for the bootstrap machine ?01:21
m_3juju environment01:21
m_3yeah01:21
davecheneynot easily01:22
davecheneyprobably easier to hack juju bootstrap01:22
m_3right01:22
* davecheney facepalm01:22
davecheneythere is no swap on these machines01:22
davecheneythat will be a problem01:22
davecheneymongo will probably explode01:22
m_3yeah, sometimes when they're wedged with juju-0.7 we could do destroy-environment and it was a little stronger than destroy-service01:22
m_3can you kill em with nova01:23
davecheneynova can't kill this one01:23
m_3we should've started with ec2 imo01:23
davecheney(how do you think it got into this state in the first place)01:23
m_3haha01:23
davecheneym_3: any movement on some ec2 creds ?01:23
m_3not yet... I prepped antonio that the request had been pretty much approved from above... but gotta get ben on the actual acct stuff01:24
m_3davecheney: I think we should just blow it up01:25
m_3davecheney: maybe put something in place that'll tell us that's what's happening01:25
m_3so we can distinguish between a juju error and the bootstrap node blowing up01:25
davecheney"11:25 < m_3> davecheney: maybe put something in place that'll tell us that's what's happening"01:25
davecheneyoh01:25
davecheneythat01:25
m_3:)01:25
davecheneylet me blow one up so I can see what to expect01:26
m_3reasonable to get as big as we can01:26
m_3ack01:26
m_3unfortunately I won't be in the air for long... otherwise _that_ would be a great story :)... "kicked off 1000 nodes from the plane"01:26
m_3latency's really dropped down too... so it's pretty nice actually01:27
davecheneymramm: wazzup ?01:27
mrammnot much01:27
mrammI just got an email from linaro folks about armhf support in juju-core01:27
davecheneym_3: lemme hack this instance with a /.SWAP01:27
davecheneymramm: piece of piss01:27
mramm?01:27
davecheneyi told someone that we can always do a one off build if they need armhf today01:28
davecheneyif they need it properly01:28
davecheneywe need some work done on the golang-go package in the archive01:28
davecheneybasically, we need go 1.101:28
mrammthey are just asking if they can help test and support it01:28
mrammright01:28
mrammthat was what I remembered from some earlier arm discussion01:28
davecheneythey can test it right now today if they build go and juju from source01:28
davecheneyhttp://dave.cheney.net/unofficial-arm-tarballs01:28
mrammThey are not being demanding, just asking how they can help01:28
davecheney^ or they can use my beta tarballs01:29
davecheneyfeel free to cc me01:29
davecheneyi'm happy to help get them started01:29
mrammand what they can do, so I will let them know the situation, and CC you01:29
mrammsounds good01:29
mrammdid we hardcode the state server to be amd64?01:30
m_3descending below 10k-ft... ttyl01:31
davecheneymramm: opinions differ01:31
davecheneywilliam told me it _is_ hard coded to amd6401:31
davecheneythen he told me it wasn't01:31
mrammok01:31
davecheneyi don't know the current answer01:31
mrammI will check with william01:31
davecheneyi'd expect it to just work01:31
davecheneymramm: it's a bit of a problem that the UEC service doesn't list our armhf on amd64 images, http://cloud-images.ubuntu.com/query/precise/server/released.txt01:33
mramminteresting01:33
davecheneyhmm, maybe they do for Q01:34
davecheneynup01:34
mrammwe can talk to the "public cloud images" guys about that, and see what we can get done there.   I'll talk to antonio about that tomorrow.01:46
davecheneymramm: http://www.h-online.com/open/news/item/Canonical-releases-EC2-image-for-Ubuntu-ARM-Server-1585740.html01:47
mrammkk01:47
mrammthanks01:47
thumperhi mramm01:49
davecheneymramm: m_3 286 slaves running, mongo using 450 mb of ram01:49
davecheneyso at least 4gb required for 2000 nodes at this rate01:49
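That extrapolation can be checked with a quick back-of-envelope (a sketch assuming mongod memory grows roughly linearly with unit count, as the 450 MB / 286 units figures suggest):

```shell
# 450 MB of mongod RSS at 286 service units, extrapolated to 2000 units
awk 'BEGIN {
  per = 450 / 286   # ~1.57 MB per service unit
  printf "per unit: %.2f MB, at 2000 units: %.0f MB\n", per, 2000 * per
}'
# -> per unit: 1.57 MB, at 2000 units: 3147 MB
```

Roughly 3.1 GB, so the "at least 4gb" figure includes some headroom.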
thumperdavecheney: is that good?01:49
davecheneyit means you need to run a larger bootstrap instance01:49
mrammdavecheney: I guess that is to be expected if we are going to have thousands of open connections to mongo01:49
davecheneybut then, if you're running 2000 nodes in your environment01:49
mrammtrue enough01:50
davecheneyyou probably don't care about the cost difference01:50
mrammright, the bootstrap node cost will be trivial compared to the 2000 nodes01:50
davecheneyeach conn is a thread, which is anywhere between 1mb and 16mb depending on libc and the phase of the moon01:50
davecheneymramm: bingo01:50
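The thread-per-connection range above, sketched with the assumed 1 MB and 16 MB per-thread stack sizes (threads typically reserve, rather than touch, most of that):

```shell
# 2000 connection threads at 1 MB vs 16 MB of stack each
awk 'BEGIN {
  printf "best case: %.0f GB, worst case: %.0f GB\n",
         2000 * 1 / 1024, 2000 * 16 / 1024
}'
# -> best case: 2 GB, worst case: 31 GB
```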
mrammthumper: hey!01:50
mrammdavecheney: I think we should work to get 1.1 into S as soon as we can01:51
thumpermramm: finally landed the hook synchronization branch01:51
mrammwe expect 1.1 final to land in plenty of time, and the earlier we propose the easier it is01:52
davecheneymramm: that will require deviating from the upstream01:52
davecheneywhich I have no problem doing01:52
mrammyea01:52
thumpersnarky... superb... slimey...01:52
davecheneybut sounds like that isn't what we do (tm)01:52
thumperwhat was S again?01:52
davecheneysurly01:52
thumpernot sweet01:52
thumperI don't want to look it up01:52
thumperbut instead batter things around until it floats to the top of my memory01:52
davecheneysurly simian or something01:52
thumperdefinitely a salamander01:53
thumpernot sticky01:53
thumperwhich reminds me of a joke01:53
davecheneystinky subhuman01:53
thumper"What is brown and sticky"01:53
davecheney2013/04/26 01:53:23 NOTICE worker/provisioner: started machine 307 as instance 158261701:53
mrammstout sea-urchin?01:53
thumpera stick01:53
mrammhaha01:53
mrammfyi: https://wiki.ubuntu.com/SReleaseSchedule01:54
davecheneyhmm, at 300 nodes the main thread on mongod is at 30% duty01:55
mramminteresting01:55
mrammsounds like some more evidence that we will need an internal API sooner rather than later01:55
davecheneymramm: it's all the reconnection and ssl handshaking from the clients probing01:55
mrammdoes it settle down after they have connections established?01:56
davecheneymramm: no01:56
davecheneythis is a constant load01:56
davecheneythe polling is every 2 ? minutes01:56
* davecheney goes and checks01:57
thumperso changing to use the api internally should reduce the load here?02:03
thumperor will it still be high02:03
thumperjust because of the number of clients?02:03
davecheneythumper: lower, i would hope02:05
* thumper nods02:05
davecheneythe polling is internal to the mongo driver02:05
davecheneythe driver will poll all the known services in the replica set every 180 seconds at least02:07
davecheney2013/04/26 02:13:11 NOTICE worker/provisioner: started machine 406 as instance 158297102:13
davecheneymight have to go to lunch at this rate02:13
davecheneyhmm, 20 mins per 100 instances02:14
davecheneynot bad02:14
mrammyea, that's not too bad at all02:20
thumperdavecheney: going up to 2000?02:20
davecheneyf;yeah02:20
davecheneyhp are anxious to have their capacity back02:20
davecheneyso no pussy footing around02:21
davecheneyoooh02:42
davecheneyubuntu@juju-hpgoctrl2-machine-0:~$ juju debug-log 2>&1 | grep TLS02:42
davecheneyjuju-goscale2-machine-281:2013/04/26 02:42:08 ERROR state: TLS handshake failed: local error: unexpected message02:42
davecheneyjuju-goscale2-machine-444:2013/04/26 02:42:11 ERROR state: TLS handshake failed: local error: unexpected message02:42
davecheneyjuju-goscale2-machine-160:2013/04/26 02:42:07 ERROR state: TLS handshake failed: local error: unexpected message02:42
davecheneyjuju-goscale2-machine-405:2013/04/26 02:42:10 ERROR state: TLS handshake failed: local error: unexpected message02:42
davecheneyjuju-goscale2-machine-162:2013/04/26 02:42:11 ERROR state: TLS handshake failed: local error: unexpected message02:42
davecheneydoesn't appear to be affecting things02:42
davecheneyinstance creation time is slowing, 2013/04/26 04:17:37 DEBUG environs/openstack: openstack user data; 2712 bytes04:18
davecheney2013/04/26 04:17:52 INFO environs/openstack: started instance "1584731"04:18
thumperdavecheney: by how much?04:31
davecheneynot sure, i'd have to get the whole logs04:31
davecheneybut the bootstrap node is nearly out of memory04:31
davecheneyand starting to swap04:31
davecheneyi'm having a look to see if I can change the instance type of the bootstrap node04:33
davecheneyneed at least 4x more ram to make it to 200004:34
m_3davecheney: can we `juju bootstrap --constraint='instance-type=standard.large'` or something?04:34
davecheneym_3: not sure04:34
davecheneythere is something in the openstack logs that says the instance type is being hard coded04:35
m_3oh, yeah, there's --constraints on bootstrap according to help04:35
davecheneyi'm going to grab the log and kill this test04:35
m_3oh... didn't realize it was hard-coded... never tried anything other than standard.small on hp04:35
davecheneyi've seen enough to know it's not going to make it04:35
m_3still great info04:35
m_3got it to the point where it's swapping04:36
davecheneym_3: will post my notes on this run04:36
m_3so it's probably safest to keep the environment defaulted to standard.small and then do a special bootstrap04:36
davecheneym_3: how do we advise customers to size their bootstrap node04:36
m_3btw, we should do a special hadoop-master too04:36
davecheneym_3: wanna take a look while i'm grabbing the logs ?04:37
m_3lemme check my notes04:37
m_3I stuck the heap-size config about halfway through http://markmims.com/cloud/2012/06/04/juju-at-scale.html04:38
m_3we just need to test out if the openstack provider will take the --constraints="instance-type=xxx" on bootstrap04:40
m_3those were mediums though04:40
m_3in ec204:40
m_3but whatever, the big one is the bootstrap node for now... the hadoop job doesn't actually have to run atm04:40
* m_3 looks back for the dang ip04:41
davecheney15.185.162.24704:41
davecheneyubuntu@juju-hpgoctrl2-machine-0:~$ scp -C 15.185.162.247:/var/log/juju/all-machines.log all-machines-2000-node-test-20130426.log04:42
davecheneyPermission denied (publickey).04:43
davecheneywhy is this being a son of a bitch04:43
davecheneyoh hang on04:43
davecheneyok, i'm going to destroy this environment04:43
m_3rsync -azvP -e'juju ssh -e ...'04:43
davecheneygot it04:43
m_3so we prob wanna do standard.xlarge04:44
m_3can maybe do a standard.large, but might as well do the bootstrap at xlarge04:45
m_3`nova flavor-list` describes them all04:45
davecheneym_3: we'll probably have to do a set-config after we boot04:48
davecheneybut I need to do some screwing with the bootstrap node to make mongo scale04:48
m_3ah, ok04:48
davecheneyunless you want to boot everything as an xlarge04:49
davecheneywhich might get me a bollocking04:49
m_3davecheney: no, we only have perms on standard.small over normal limits04:51
m_3davecheney: so I think we leave the environment using default-instance-type: standard.small04:51
m_3davecheney: but try to use a constraint with the bootstrap04:52
m_3davecheney: are you thinking that won't work?04:52
m_3davecheney: sorry, I think I screwed up your scp... please check it04:52
davecheneynah it's ok04:52
davecheneydon't worry i got the scp04:52
m_3k04:53
davecheneylets try the --constraint option04:53
davecheneyit's 3pm in AU now04:53
davecheneyi'm going to destroy this env and start again04:53
m_3hell, I guess the easiest thing to do is first of all04:53
davecheneyi don't want to leave it running overnight04:53
m_3deploy another service with a constraint04:53
m_3yeah, we don't need to leave it up for anything04:53
m_3I was just thinking we could test out the constraint thing pretty quickly04:53
m_3but it'll be interesting to see how long the destroy takes :)04:54
m_3ha04:55
m_3davecheney: it still looks like it's spawning shit04:55
davecheneyyup, destroy works backwards04:56
davecheneyi'll stop the PA04:56
davecheneystopped04:57
m_3davecheney: so do we have to kill them via nova now?04:58
davecheneym_3: if we have to, that is a bug04:59
davecheneydestroy means destroy, not do your best :)04:59
m_3yup, but do the services you just killed have to be up throughout destroy?04:59
* m_3 doesn't know if destroy needs the db to get instance-ids05:00
m_3davecheney: crap, just tried to bootstrap on another hp acct... doesn't respect the instance-type constraint05:02
davecheneym_3: I suspected that05:03
m_3davecheney: know the syntax for "mem>=16GB"05:03
m_3?05:03
davecheneythumper: ?05:03
davecheneym_3: our constraints support is very basic05:04
m_3oh, looks like it's trying on a 'mem=16G'05:04
davecheneywallyworld_: any ideas ?05:04
m_3nice, I got past the basic validation it looks like... got a "no tools available"05:04
davecheney--upload-tools ?05:05
wallyworld_davecheney: about?05:05
davecheneywallyworld_: we're trying to bootstrap an env with a larger bootstrap node05:05
m_3davecheney: we can try from the ctrl instance... my laptop's off of the 1.10 distro package05:05
wallyworld_on ec2 i assume05:05
davecheneytry from the control instance05:06
m_3davecheney: nice, they're dying... slowly05:06
davecheneywe could kill them all with nova05:06
davecheneyprobably not worth it05:06
davecheneyit'll be done in a few mins05:06
m_3davecheney: yup05:06
m_3once they're dead, we can try the constraint on bootstrap05:07
wallyworld_davecheney: so you are typing something like this?  juju bootstrap --constraints "mem=4G"05:08
davecheneywallyworld_: y05:08
wallyworld_and it's not working?05:08
m_3davecheney: I like that it blocks05:08
davecheneyec2 blocks as well05:08
davecheneybut ec2 lets you just say 'delete these 1000 instance id's'05:08
m_3ack05:09
davecheneyit looks like openstack makes you do them one at a time05:09
m_3wallyworld_: not sure yet05:09
m_3that's surprising05:09
* wallyworld_ has to go get kid from school05:09
m_3might be worth filing it as a bug on the openstack provider05:09
davecheneyor at least a whinge05:10
m_3davecheney: well, I spoke too soon :)05:10
m_3it finished with instances still active05:10
davecheneyFAIL!05:10
m_3maybe a timeout05:11
* davecheney embuginates05:11
davecheneynup just raw fail05:11
* m_3 cheers from the sidelines05:11
davecheneyhttps://bugs.launchpad.net/juju-core/+bug/117021005:12
_mup_Bug #1170210: environs/openstack: destroy-environment leaks machines in hpcloud <juju-core:Triaged> <https://launchpad.net/bugs/1170210>05:12
davecheneyhere is one I apparently prepared earlier05:12
davecheneym_3: ubuntu@juju-hpgoctrl2-machine-0:~$ nova list05:13
davecheney+---------+---------------------------+------------------+--------------------------------------+05:13
davecheney|    ID   |            Name           |      Status      |               Networks               |05:13
davecheney+---------+---------------------------+------------------+--------------------------------------+05:13
davecheney| 1465097 | juju-hpgoctrl2-machine-0  | ACTIVE           | private=10.7.194.166, 15.185.162.247 |05:13
davecheney| 1565949 | juju-goscale2-machine-37  | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89   |05:13
davecheney| 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83  |05:13
davecheney| 1581727 | juju-goscale2-machine-5   | ACTIVE(deleting) | private=10.7.30.60, 15.185.168.253   |05:13
davecheney+---------+---------------------------+------------------+--------------------------------------+05:13
davecheneycan you email that list to antonio and ask hp to find out why those won't delete05:13
m_3oh, same stuck ones?05:14
davecheney-5 is a new one from this round05:14
davecheney-37 and -239 were stuck from tuesday05:14
m_3ack05:14
m_3sent05:16
davecheney2013/04/26 05:16:11 WARNING environs/openstack: ignoring constraints, using default-instance-type flavor "standard.small"  '05:16
davecheney^ this is what I was afraid of05:16
davecheneywallyworld_: any way to hack around this ?05:16
m_3crap05:16
m_3davecheney: we could turn off the 'default' in the environment05:16
davecheneym_3: i suspected that would happen, but lacked the words to express it05:16
m_3then see what happens with a few05:17
m_3or explicitly set the constraint for smalls too05:17
davecheneyi like how fast bootstrap happens in hp cloud05:17
davecheneyusually < 1 min05:18
davecheneyso much better than AWS plodding05:18
m_3davecheney: yup... lots faster05:18
davecheneym_3: hang on, let me fuck with it for a sec05:18
davecheneyahh, you're doing what I was going to do :)05:19
m_3shit, sorry05:19
davecheneynah, you're good05:19
davecheneythat was what I was going to do05:19
davecheneym_3:  do you wanna do a hangout for a bit ?05:19
davecheneyor is it a bit late in your local TZ ?05:19
m_3davecheney: yeah, I should stop screwing around and hit the sack :)05:20
davecheneygo, flee, run wild, etc05:20
davecheneysam is in perth this weekend05:20
m_3hotel room with the wife asleep so can't do voice atm05:21
davecheneyso i'm going to hack on this all weekend05:21
davecheney(not to mention drink scotch)05:21
m_3:)05:21
m_3ok, yeah, it doesn't look like our experiment was working anyways05:21
m_3might not be hard to change the constraint "override" code though05:21
davecheneyI FIXED IT WITH SCIENCE !05:32
davecheneym_3: ok, i got the environment setup the way we want05:35
davecheneybut forgot to goose mongo05:35
davecheneylemme do that again05:35
davecheneym_3: hey, machine 5 is dead :)05:35
davecheneythat is nice bonus05:35
m_3oh, cool05:36
davecheneyplease watch closely, there is nothing up my sleeves05:36
m_3haha05:37
m_3so you're gonna default to xlarge, then explicitly ask for 'mem=2G' for slaves?05:37
davecheneym_3:  will know in a second05:40
davecheneythe environment config should default to .smalls05:40
m_3sweeet05:40
m_3nice05:41
davecheneythank thumper for set-config05:41
m_3ah05:41
davecheneym_3: the rule is, once you've bootstrapped, most of the values in environments.yaml are ignored05:41
davecheneythe active values are in the state05:41
davecheneyohh dear, it shouldn't show you all those things :)05:42
* m_3 was wanting set-config in juju-0.6 earlier this week05:42
m_3ha05:42
m_3well, yes05:42
davecheneysorry, the command is set-environment05:42
m_3it shouldn't05:42
davecheneybut its operation is straightforward05:42
m_3understood... I was actually wanting set-config :)... but thought maybe the tool did both05:43
davecheneywe have set-config as well05:43
* m_3 happy camper05:43
davecheneyum, at least I thought we did05:44
m_3just get05:44
davecheneyoh yeah05:44
m_3`juju get hadoop-slave`05:44
m_3no filtering it looks like05:45
davecheneyyeah, i blame myself05:45
m_3I sooo want a "preload-packages" or the equiv05:46
davecheneym_3: what would that do ?05:47
m_3charm metadata level as well as environment level05:47
m_3install packages before calling any hooks05:47
davecheneyah, via cloud init (sorta)05:47
davecheneyso all the hook install commands are no-ops05:47
m_3even later would be fine05:47
davecheneyMUCHA PARALLELA05:47
davecheney2013/04/26 05:48:16 DEBUG environs/openstack: openstack user data; 2710 bytes05:48
davecheney2013/04/26 05:48:29 INFO environs/openstack: started instance "1585513"05:48
davecheney13 seconds to bootstrap an instance05:48
davecheneythumper: i was wrong, this didn't significantly change with 1000 instances running05:49
m_3davecheney: it's moving now...05:49
m_3what, thought the per-instance startup time was changing?05:49
davecheneyit went up a little as mongo started to swap05:49
davecheneynot significantly05:49
m_3ack05:49
m_35/min atm05:50
m_3ish05:50
davecheneythe hold back time from openstack's rate limiting affects that05:50
davecheneybc says 7 hours to bootstrap 2000 instances05:50
davecheneyfaaaaaaaaaaaaaaark05:50
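The arithmetic behind that bc run, assuming the ~5 instances/minute rate mentioned above holds steady:

```shell
# 2000 instances at ~5/min, gated by openstack's rate limiting
awk 'BEGIN { printf "%.1f hours\n", 2000 / 5 / 60 }'
# -> 6.7 hours
```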
davecheneyyou only get 4 cpus with the 16gb instance05:51
davecheneythat is pretty tight05:51
m_3davecheney: where's htop on the bootstrap?05:51
davecheney#605:51
davecheneyfun fact, mongo supports a --maxConns flag05:52
davecheneywhich defaults to 20,00005:52
davecheneybut that is gated by 80% of the current number of file descriptors05:52
m_3huh05:52
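A sketch of that gating (the numbers are assumptions: mongod's 20,000 default and a hypothetical 20,000 file-descriptor soft limit):

```shell
# mongod caps --maxConns at 80% of the available file descriptors,
# so even the 20,000 default needs a raised `ulimit -n` to be reachable
soft_nofile=20000                  # e.g. `ulimit -n` on the bootstrap node
requested=20000                    # mongod's default --maxConns at the time
allowed=$(( soft_nofile * 80 / 100 ))
if [ "$requested" -gt "$allowed" ]; then effective=$allowed; else effective=$requested; fi
echo "effective maxConns: $effective"
# -> effective maxConns: 16000
```

Which is why the 20,000-conn hack on the bootstrap node needed the descriptor limit raised first.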
* davecheney quietly expects mongodb to assplode at 10k connections05:53
davecheneym_3: juju-goscale2-machine-0:2013/04/26 05:55:05 NOTICE worker/provisioner: started machine 85 as instance 158560705:55
davecheneyjuju-goscale2-machine-0:2013/04/26 05:55:05 INFO worker/provisioner: found machine "86" pending provisioning05:55
davecheneythis is an interesting log line05:55
m_3davecheney: I didn't catch your startup... are these related to a master?05:55
davecheneysorry, say again05:56
m_3did you deploy this from 'bin/hadoop-stack'?05:56
davecheneyyeah05:56
m_3or just deploy -n?05:56
davecheneywith -n197505:56
m_3ok, cool05:56
m_3wanna catch the master address... shit, status doesn't take any filters either though05:57
davecheneythat log line above shows how the PA works05:57
davecheney15.185.161.6205:57
davecheneywhat is the port ?05:57
m_3davecheney: yeah, that looks like what we'd expect to me05:57
m_35007005:57
davecheneyusing nova list is cheating, but whateva05:58
m_380 nodes registered05:58
m_3this'd be really hard to test without novaclient05:59
m_3damn, this is looking great right now05:59
davecheneym_3: so i'm trying to drag myself into the 90's and use tmux05:59
davecheneybut there is one thing that i can't figure out05:59
davecheneywhen i C-a etc06:00
davecheneysometimes it is like the ^C is ignored06:00
m_3hmmmm not sure what you mean06:00
m_3you're trying to ctrl-c a process you mean?06:00
davecheneyno, ctrl-a n06:00
m_3ctrl-a hangs waiting for a followup keypress06:01
davecheneyyeah06:01
m_3there's a timeout setting I think06:01
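m_3's hunch about a timeout is plausible; a ~/.tmux.conf fragment along these lines is the usual fix (this is an assumption about the cause — tmux's escape and repeat timers commonly make prefix keys feel swallowed):

```
set -s escape-time 0    # don't pause after Esc to disambiguate escape sequences
set -g repeat-time 0    # don't keep the prefix live for repeatable bindings
```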
davecheneyit feels like that06:01
davecheneym_3: anyway06:01
davecheneyit looks like mongo does all its tls negotiation on the main thread06:01
davecheneythen spawns a worker thread06:01
davecheneywhich is a bit lame06:02
m_3I'll often find myself switching to another window as a no-op if I change my mind or get lost in a ctrl-a sequence06:02
davecheneyrather than accepting the connection and handling it in a thread06:02
* m_3 not surprised that something like tls integration is half-baked06:02
davecheneyat 900 machines running, the main thread was busy 90% of the time handling all the reconnections from the driver06:03
m_3yeah06:03
davecheneyi expect that to get a bit shit at 2,000 nodes06:03
m_3yup06:03
m_3not sure how to get around that one06:04
davecheneyas william said, it's moving the ws api out to the agents06:04
m_3yeah, but that's a huge change though right?06:05
davecheneyit's a lot of work, but conceptually it's straightforward06:06
m_3right06:06
m_3a fix06:06
m_3not so much a workaround :)06:07
davecheneyeverything talks to the state via a set of types which convert between mongo documents and data structures06:07
davecheneyso it would just be a different conversion06:07
davecheneywatchers are, as always, the tricky bit06:07
m_3true dat06:07
davecheneym_3: what happens if I deploy the juju-gui on this environment ?06:07
m_3don't know if juju-gui talks to juju-1.10 api yet... does it?06:08
m_3shit, we can try :)06:08
davecheneym_3: gary poster said it did about 5 hours ago06:08
davecheneywho am I to doubt that lovely man06:08
davecheneyfuck, we'll have to wait 8 hours for that to be provisioned06:09
m_3now your nova trick won't work this time :)06:09
davecheneyshitter06:09
davecheneywell this is fun, for relative values of fun06:09
davecheneybugger, i should have deployed the gui first06:10
davecheneyhmm, i'll do that on the next run06:10
m_3hmmmm... brain's getting fuzzy... but maybe there's a way to point the juju-gui to an api server via config06:10
m_3i.e., from another env06:10
davecheneyprobably06:10
davecheneyit won't use a relation06:10
davecheneybecause the api server is not a service06:10
davecheney(although it should be)06:11
m_3nah, doesn't look like it in the charm06:11
davecheney  juju-gui:06:11
davecheney    charm: cs:precise/juju-gui-4606:11
m_3i.e., no config for api server06:11
davecheney    exposed: true06:11
davecheney    units:06:11
davecheney      juju-gui/0:06:11
davecheney        agent-state: pending06:11
davecheney        machine: "1999"06:11
davecheneyGLWT06:12
m_3199906:12
m_3sweet06:12
m_3btw, the gui for this will be pretty un-interesting06:12
m_3two boxes06:12
m_3hadoop-master and hadoop-slave06:13
m_3two lines between them06:13
davecheneyi bet it crashes my browser06:13
m_3but yes, it'd still be neat to see06:13
m_3haha06:13
m_3well, yeah... maybe that too06:13
m_3although kapil had a simulator mock thingy set up06:14
davecheneythat is true06:14
m_3he may've done some scale testing with that06:14
davecheneythat can simulate infeasibly large environments06:14
m_3most likely problem would be timeouts06:14
m_3maybe06:14
m_3while the api server chokes06:15
m_3davecheney: sweet... that's thumping along06:15
davecheneym_3: that is what I am thinking, it'll be lugging around the data for thousands of relations06:16
davecheneyyup06:16
m_3davecheney: ok, well I think I'm gonna hit the sack then06:16
davecheneyyeah06:17
m_3davecheney: you want me to do anything on the flipside?06:17
davecheneythis is as thrilling as watching paint dry06:17
davecheneyif anything eventful happens i'll put it in an email06:17
m_3davecheney: or well just send me email if you get eod and want me to do something06:17
davecheneyi won't leave it running past about 11pm tonight06:17
davecheneywe should be pretty close to 2000 nodes by then06:17
davecheney7 hours really isn't fast enough for this06:17
davecheneyhow long did it take for the ec2 2k node test ?06:17
m_3k... I'm on UTC-7 for the next two weeks06:18
m_3bout 7hrs iirc06:18
m_3was split up a bit in the big run06:18
m_3did 1000, tested job runs on that cluster06:19
davecheneym_3: i'll see you in -7 on the 5th06:19
m_3then cleaned out the hdfs and added 1000 more06:19
m_3but I think that was 7hrs total06:19
davecheneybooooooooooooooring06:19
m_3there were a few white russians involved too :)06:20
davecheneya capital idea!06:20
m_3:)06:20
* davecheney considers scouting for dinner06:20
m_3davecheney: k, well goodnight fine sir06:20
davecheneylater mate06:20
davecheneyenjoy this port - land06:20
davecheneyrogpeppe: can you help with a juju-gui question ?07:42
rogpeppedavecheney: perhaps...07:42
rogpeppedavecheney: a question from you about juju-gui, or a question from the juju-gui team?07:43
davecheneyhow to login to the bugger07:43
rogpeppedavecheney: sorry, didn't see your question...07:57
rogpeppedavecheney: if you want me to see something, you need to mention my irc handle...07:58
rogpeppedavecheney: you use your admin secret07:58
rogpeppedavecheney: have you tried it and had it fail?08:00
davecheneyrogpeppe: yeah, tried and failed08:01
davecheneyis there a length limit ?08:01
rogpeppedavecheney: i don't think so08:01
rogpeppedavecheney: hmm, let me try it. remind me of the charm url of the gui charm, please?08:02
davecheneyhttps://15.185.163.105/08:02
davecheney^ this is the deployed gui08:02
davecheneyubuntu@15.185.162.24708:02
davecheneyis the machine that bootstrapped08:03
davecheneyrogpeppe: your key is already on that machine08:03
davecheneyso you should be able to recover the admin password08:03
rogpeppedavecheney: actually, i was going to try deploying it, and couldn't remember the charm url08:03
rogpeppedavecheney: but i'll try logging in to yours too08:04
davecheneysorry this one is already deployed08:04
davecheneyrogpeppe: it's doing a 2000 machine bootstrap08:04
davecheneyso deploying another will take another 7 hours08:04
rogpeppedavecheney: i want to see if i can reproduce the problem on a smaller env08:04
davecheneykk08:04
davecheneyi just do juju deploy juju-gui08:04
davecheneyjuju expose juju-gui08:04
davecheneyjust followed garys instructions from his email08:04
rogpeppedavecheney: i don't see any gui charm deployed on that machine08:08
rogpeppedavecheney: and the error messages in machine.log look like they're not in the current juju tree08:09
davecheneythat machine is not inside the environment08:09
davecheneyrogpeppe: but you can use that machine to recover the admin secret for the goscale2 environment08:10
rogpeppedavecheney: ah, ok; i thought you said it was the deployed gui08:10
davecheneyrogpeppe: the gui uri is https://15.185.163.105/08:11
rogpeppedavecheney: sorry, i got muddled08:11
davecheneyrogpeppe: yeah, sorry, this is very confusing08:11
davecheneywe're running an environment within an environment08:12
davecheney'cos that is how m_3 rolls08:12
rogpeppedavecheney: i sometimes do that too08:12
rogpeppedavecheney: at some point i'll run up a "juju-dev" charm that provides a full juju-core dev environment08:13
davecheneythat is a great idea08:13
davecheneyscrew local mode08:13
rogpeppedavecheney: i've done it manually before, but it's a hassle; just what charms are for08:14
rogpeppedavecheney: ok, so login fails for me too08:14
davecheneyweird eh08:15
rogpeppedavecheney: any chance you could add my key to the gui node?08:16
rogpeppedavecheney: ah, i can probably ssh from the bootstrap node08:16
davecheneyrogpeppe: yes08:17
davecheneyjuju ssh 108:17
rogpeppedavecheney: is there any way we can get ssh to only *temporarily* add hosts. the "permanently added" thing seems wrong08:19
rogpeppedavecheney: and i just saw this message, which is probably related: http://paste.ubuntu.com/5603807/08:19
davecheneyrogpeppe: unrelated08:20
davecheneywe've been creating and destroying machines all day08:20
rogpeppedavecheney: ah, ok08:20
davecheneyso ip addresses have been reused08:20
davecheneyand have left stale entries in the ssh knownhosts file08:20
* davecheney has created on the order of 1600 machines today08:21
rogpeppedavecheney: that sounds like exactly what i was talking about, no?08:21
rogpeppedavecheney: isn't the "permanently added" thing talking about adding to the knownhosts file?08:21
davecheneyrogpeppe: that is correct08:21
davecheneyi think i meant to say 'that warning is not serious'08:21
rogpeppedavecheney: oh, i realise that08:22
rogpeppedavecheney: but if ssh wasn't adding to the known hosts file, we wouldn't see that message08:22
davecheneyit won't add it a second time08:22
davecheneythe warning is the ip address exists in the file, with a different fingerprint08:22
davecheneybecause we pass -o ignorehostwarning or something to ssh it carries on anyway08:23
rogpeppedavecheney: yeah; basically i don't want to say "i know this ip address" forever because ip addresses are totally transitory in the juju env08:24
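The behaviour rogpeppe wants can be approximated from the client side with standard OpenSSH options; a sketch (these are stock ssh_config options, not necessarily what juju itself passes — the host and user are taken from earlier in the log):

```shell
# Point ssh at a throwaway known_hosts file so reused cloud IPs never
# collide with stale fingerprints; disable strict checking to match.
ssh -o UserKnownHostsFile=/dev/null \
    -o StrictHostKeyChecking=no \
    ubuntu@15.185.162.247
```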
davecheneyrogpeppe: bingo08:24
davecheneyrogpeppe: i'll forward you my notes from the first 1000 machines08:27
davecheneyrogpeppe: i didn't bother to send that to william, he's got enough on his plate08:32
davecheneythe amount of memory mongo uses per connection is obscene08:33
rogpeppe1davecheney: last thing i saw was:08:36
rogpeppe1[09:27:39] <davecheney> rogpeppe: i'll forward you my notes from the first 1000 machines08:36
davecheney18:32 < davecheney> rogpeppe: i didn't bother to send that to william, he's got enough on his plate08:37
davecheney18:33 < davecheney> the amount of memory mongo uses per connection is obscene08:37
davecheneythat is all I said08:37
davecheney'cos you were ignoring me :)08:37
rogpeppe1davecheney: occupational hazard of going through a mobile data connection08:37
davecheneyrogpeppe1: do you think they will reconnect your part of england to the internet in the near future ?08:37
rogpeppe1davecheney: no prospect in the near future08:37
davecheneyrogpeppe1: shitter08:38
* davecheney steps outside to order some dinner08:38
rogpeppe1davecheney: the fault is somewhere in 200m of underground cable08:38
rogpeppe1davecheney: and they have to get planning to dig it up08:38
rogpeppe1davecheney: i'd like to see your notes BTW08:38
rogpeppe1davecheney: you might've missed this BTW:08:39
rogpeppe1[09:31:31] <rogpeppe> davecheney: ah, this looks like a problem: http://paste.ubuntu.com/5603842/08:39
rogpeppe1[09:32:57] <rogpeppe> davecheney: oops, missed one redaction08:39
davecheneyrogpeppe1: if you're looking at the output of juju get-environment08:40
davecheneyyeah, i think we left our flys open a bit08:40
rogpeppe1davecheney: i removed most of the passwords; but i've no idea what that one was from - third attempt, looks like08:41
rogpeppe1davecheney: unfortunately there seems no way to deliberately delete a paste08:41
rogpeppe1davecheney: before the crawlers find it08:42
davecheneyrogpeppe1: s'ok, i'll change the admin secret08:45
rogpeppe1aw shucks, "juju deploy juju-gui --force-machine 0" doesn't work08:45
rogpeppe1davecheney: that wasn't the admin secret08:45
davecheneywill fix08:45
davecheneyrogpeppe1: as penance, you need to fix that bug :)08:46
rogpeppe1davecheney: i'm looking08:46
rogpeppe1davecheney: i'll try to reproduce it first. please don't take down that environment for the time being (not that there's much danger, i think)08:47
davecheneyrogpeppe1: np08:48
rogpeppe1davecheney: interesting minor bug: http://paste.ubuntu.com/5603887/08:48
davecheneyno you can't do that, oh, ok, if you must08:49
rogpeppe1davecheney: no, it's not done - the unit is left around unassigned08:49
davecheneyoh08:50
davecheneyinteresting08:50
rogpeppe1davecheney: you have to manually destroy the unit then add another one08:50
rogpeppe1davecheney: https://bugs.launchpad.net/juju-core/+bug/117308908:56
_mup_Bug #1173089: deploy can fail partially <juju-core:New> <https://launchpad.net/bugs/1173089>08:56
davecheneybzzt08:59
rogpeppe1davecheney: hmm, the gui works ok for me09:05
davecheneyrogpeppe1: poop09:06
davecheneywhy can't i login to my deployment ?09:06
rogpeppe1davecheney: here's an idea: kill the machine agent09:06
rogpeppe1davecheney: and see if it works when it starts again09:07
davecheneyok09:07
rogpeppe1davecheney: 'cos that EOF error is really weird09:07
rogpeppe1davecheney: i'm hoping that we will still see the error when it restarts09:07
rogpeppe1davecheney: because then there's the possibility of upgrading the binaries with some updated logging and better error messages.09:08
rogpeppe1davecheney: and finding out what's really going on09:08
rogpeppe1davecheney: the only possibility that i can think of currently is that the connection to the mongo server has failed09:10
rogpeppe1davecheney: i *wish* we annotated our errors more09:10
rogpeppe1davecheney: if my theory is correct, that EOF error comes from about 6 levels deep and hasn't been given any context at all09:11
davecheneyrogpeppe1: is this on the api server, or the state/mongo server?09:12
rogpeppe1davecheney: on the api server09:12
davecheneyright09:12
rogpeppe1davecheney: if i had my way, there would be almost no if err != nil {return err} occurrences in our code09:13
rogpeppe1davecheney: i lost that argument ages ago, but problems like this really show how bad our current conventions are09:14
davecheneyrogpeppe1: i'm starting to be convinced09:14
davecheneyand i think it can be reopened09:14
davecheneytimes they have a-changed09:14
rogpeppe1davecheney: my comment (the last one) on this post is a reasonable representation of my thoughts on the matter: http://how-bazaar.blogspot.co.nz/2013/04/the-go-language-my-thoughts.html09:16
* davecheney reads09:17
davecheneyrogpeppe1: the main mongo thread is now using more than 100% CPU09:19
* rogpeppe1 is not surprised09:19
davecheneyit looks like mongo handles the accept(2) and the tls handshake on the main thread09:20
davecheneyso every 30 seconds we get a storm of agents sniffing around09:21
rogpeppe1davecheney: oh god09:21
davecheneyand the cpu wedges09:21
davecheneyonly once it has done the handshaking does it hand off the connection to a new thread09:21
rogpeppe1davecheney: we should try with a much much longer time interval there09:21
rogpeppe1davecheney: 30s is ridiculous09:21
davecheneyit's not 30s09:21
davecheneybut that appears to be the resonant frequency of the polling interval09:21
davecheneyit's 180s or whenever they need to do a sync (that is what mgo calls it)09:22
davecheneywhich ever is the sooner09:22
rogpeppe1davecheney: ah i see. the usual self-synchronising clock thing09:23
davecheneyyeah, that isn't all 650 agents at once09:23
davecheneybut a swarm of them09:23
* rogpeppe1 loves emergent patterns09:23
* davecheney does not09:23
rogpeppe1davecheney: it's the joy of the universe, maaan09:24
rogpeppe1davecheney: does that blog comment make sense to you BTW? i have the impression that noone gets what i'm trying to say there.09:27
* rogpeppe1 is not good at rhetoric09:28
davecheneyrogpeppe1: i agree with your position09:29
davecheneyi think we talked about this a year ago09:29
davecheneywaiting for the computer history museam to open09:30
rogpeppe1davecheney: ah yes, i remember09:30
davecheneyand now with the benefit of some history09:30
davecheneyi agree09:30
davecheneywell, i always agreed09:30
davecheneybut this is an excellent case09:30
rogpeppe1davecheney: i might put a post together for juju-dev09:31
rogpeppe1davecheney: 9 levels deep and still diving09:43
davecheneyrogpeppe1: remember to stop on the way back up and represurise to avoid the bends09:45
rogpeppe1davecheney: lol09:46
davecheneydon't go james cameron on me man09:46
rogpeppe1davecheney: bottomed out at 1209:52
davecheney64 bit process09:52
rogpeppe1davecheney: if we reported a stack trace, as some suggest, it would show only the bottom 2 levels09:52
rogpeppe1davecheney: http://paste.ubuntu.com/5604054/10:00
rogpeppe1davecheney: actually, there's probably another layer at the top10:00
rogpeppe1davecheney: here's the complete stack: http://paste.ubuntu.com/5604064/10:02
davecheneyrogpeppe1: shit10:03
rogpeppe1davecheney: one easy thing to do is to actually hook up the mgo logging10:04
rogpeppe1davecheney: then that logf at the bottom would actually have printed something10:04
davecheneyrogpeppe1: is that hard to do ?10:09
rogpeppe1davecheney: trivial10:09
rogpeppe1davecheney: a one-line change10:10
rogpeppe1davecheney: or one or two more if we want nicely formatted messages10:11
davecheneyrogpeppe1: a single thread is now using 209% CPU on the bootstrap node ...10:18
rogpeppe1davecheney: is that possible?10:18
davecheney  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command10:18
davecheney 9611 root       20   0 8169M 1770M     0 S 194. 11.0  1h40:55 /usr/bin/mongod --auth --dbpath=/var/lib/juju/db10:18
davecheneyreally, it is10:18
rogpeppe1davecheney: i thought a thread was... single threaded10:18
rogpeppe1davecheney: or do you mean a single process (with several threads inside) ?10:18
davecheneyrogpeppe1: this is using htop so it should be per thread10:19
davecheneyi cannot explain it10:19
davecheneyapart from observing it is large10:19
davecheneyohh, and now I can see a lot of blocking on the mongo side10:19
davecheneyand that is only 800 machines10:20
davecheneysorry, 88810:20
davecheneyApr 26 10:21:44 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 10:21:44 [conn84734] query presence.presence.pings query: { $or: [ { _id: 1366971690 }, { _id: 1366971660 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:763142 nreturned:2 reslen:744 381ms10:21
davecheneyrogpeppe1: i'm assuming these are 'slow queries'10:22
davecheneythey only start to show up in the log at the 800 machine mark10:22
rogpeppe1davecheney: wow, does that reslen value mean the query has been waiting for 12 minutes to be processed?!10:23
davecheneyi don't think so10:23
davecheneyi don't think it is 744,381 ms10:23
davecheneysurely it is 744 bytes after 381 ms10:23
rogpeppe1davecheney: yeah, probably10:24
davecheneyrogpeppe1: Apr 26 10:56:20 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 10:56:20 [conn50284] query presence.presence.pings query: { $or: [ { _id: 1366973760 }, { _id: 1366973730 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:911100 nreturned:2 reslen:792 501ms10:56
rogpeppe1davecheney: latency rises...10:57
davecheneynot really sure what that is showing me yet11:02
davecheneyit's sort of a CAS, isn't it ?11:02
davecheneyApr 26 11:02:02 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 11:02:02 [conn6275] query presence.presence.pings query: { $or: [ { _id: 1366974120 }, { _id: 1366974090 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:1413393 nreturned:1 reslen:406 768ms11:02
davecheneybut yes, they certainly rise11:03
davecheneywhat is the heartbeat for presence ?11:03
davecheneywe should put some thought into avoiding harmonic feedback in all these periodic loops11:03
davecheneyshit, we're not even at 1000 instances11:06
davecheneyit's been running for 3 hours ...11:06
davecheneytesting this thing is a job for life :)11:06
dimiternrogpeppe1: hey, how about a suggestion about better help doc for upgrade-charm --switch?11:08
davecheneyrogpeppe1: http://paste.ubuntu.com/5604256/11:23
davecheneyat the 1000 node mark, the api server is unusable11:23
rogpeppe1dimitern: ah, will do. sorry, bit distracted currently as some old pipes have just sprung a leak in our kitchen and i've had to turn the main water supply off11:23
davecheneyor something maybe mongo11:23
dimiternrogpeppe1: wow..11:23
davecheneymaybe the thing after that11:23
davecheneycrap11:23
rogpeppe1davecheney: isn't the mongo, not the API server?11:24
rogpeppe1s/the/that/11:24
davecheneyrogpeppe1: really not sure11:25
dimiternrogpeppe1: "To manually specify the charm URL to upgrade to, use the --switch argument.11:25
dimiternIt will be used instead of the service's current charm newest revision.11:25
dimiternNote that the given charm must be compatible with the current one, e.g.11:25
davecheneyi guess it is looking in the db11:25
dimiternit must not remove relations the service is currently participating in,11:25
dimiternand no settings types can be changed. This *is dangerous* and you should11:25
dimiternknow what you are doing."11:25
davecheneyto find the address of the instance11:25
davecheneyit could also be blocked waiting for the provider to return some data11:25
davecheneybut we've used up all our quota with the provider11:26
=== ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: | Bugs: 2 Critical, 64 High - https://bugs.launchpad.net/juju-core/
=== ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: | Bugs: 3 Critical, 63 High - https://bugs.launchpad.net/juju-core/
dimiternwallyworld_: mumble?11:33
wallyworld_dimitern: i just got back from soccer, i'll be a minute11:33
rogpeppe1dimitern: can an upgraded charm have less config settings than the old one?11:37
dimiternrogpeppe1: let me check11:38
davecheneydoes anyone know if nova list has a limit on the number of rows it returns ?11:39
davecheneyhttps://bugs.launchpad.net/nova/+bug/1166455 ?11:42
_mup_Bug #1166455: nova flavor-list only shows 1000 flavors <prodstack> <OpenStack Compute (nova):Invalid> <python-novaclient:Fix Committed by gtt116> <nova (Ubuntu):Invalid> <https://launchpad.net/bugs/1166455>11:42
dimiternrogpeppe1: well, it seems the old config settings should remain, but you can add new ones12:00
rogpeppe1dimitern: ok, that seems good12:00
rogpeppe1dimitern: http://paste.ubuntu.com/5604375/12:02
dimiternrogpeppe1: sgtm, thanks12:04
dimiternrogpeppe1: so how to test both local: and cs: urls? start a http server mocking the store and set that to charm.Store?12:24
rogpeppe1dimitern: good question.12:40
rogpeppe1dimitern: sorry, still distracted, trying to get hold of a plumber12:41
dimiternrogpeppe1: i'll propose it without that, for now12:41
ahasenackhi guys, I'm getting this error in the bootstrap node when bootstrapping on canonistack:12:50
ahasenackERROR worker: loaded invalid environment configuration: required environment variable not set for credentials attribute: User12:50
ahasenackfull logs at http://pastebin.ubuntu.com/5604481/12:50
ahasenackany ideas?12:50
ahasenack"juju status" on my laptop just hangs12:51
dimiternahasenack: try running juju status --debug -v12:52
ahasenackdimitern: hm12:53
ahasenackdimitern: http://pastebin.ubuntu.com/5604493/12:54
ahasenacksecurity group issue?12:54
ahasenackit connects over there (localhost), so there is something listening on that port12:55
dimiternahasenack: it seems it cannot connect to mongo - is it running?12:55
ahasenackroot@juju-canonistack-machine-0:~# telnet localhost 3701712:55
ahasenackTrying 127.0.0.1...12:55
ahasenackConnected to localhost.12:55
ahasenackEscape character is '^]'.12:55
ahasenacksomething is, I assume it's mongo12:55
ahasenacktcp        0      0 0.0.0.0:37017           0.0.0.0:*               LISTEN      27573/mongod12:55
ahasenackyep12:55
dimiternahasenack: so you can connect from machine 0 to mongo, but not from outside?12:56
ahasenackright12:56
ahasenackI'm checking the security group rules12:56
dimiternahasenack: yeah, good idea12:56
ahasenackdimitern: ah, I know12:57
ahasenackdimitern: the rules are ok12:57
ahasenackdimitern: it's the public ip thing, on the private ip only ssh is routed through12:58
ahasenackdimitern: I'll fire up sshuttle and that should sort it12:58
ahasenackdimitern: yep, worked now, thanks12:59
ahasenackthe errors in the logs were misleading me12:59
dimiternahasenack: you can also try setting the "use-floating-ip" to true in env config12:59
ahasenackyepo12:59
dimiternahasenack: but knowing the shortage of floating ips on canonistack, it might fail anyway12:59
ahasenackyes, I will stick with sshuttle, works well enough for my testing13:00
ahasenackrogpeppe1: hi, I see that https://bugs.launchpad.net/juju-core/+bug/1172717 is still open, but the branch is merged13:19
_mup_Bug #1172717: juju-log does not accept --log-level <juju-core:In Progress by rogpeppe> <https://launchpad.net/bugs/1172717>13:20
ahasenackrogpeppe1: is it fixed in trunk?13:20
rogpeppe1ahasenack: i think so; let me check13:33
rogpeppe1ahasenack: yes13:34
ahasenackrogpeppe1: will that trigger a new ppa build? I still only see the version with the bug13:34
ahasenackrogpeppe1: also, does it requires a new "tools" build?13:35
ahasenackdoes it require*13:35
rogpeppe1ahasenack: i don't think so. i think the patch needs to be back ported13:35
ahasenackrogpeppe1: I'm using this ppa: http://ppa.launchpad.net/juju/devel/ubuntu/13:35
rogpeppe1ahasenack: we haven't yet worked out best practice in that respect yet - we're still feeling our way13:35
ahasenackI thought that was trunk13:35
rogpeppe1ahasenack: the tools still need to be pushed to the public bucket13:36
rogpeppe1ahasenack: because that's where they're pulled from, not the ppa13:36
ahasenackrogpeppe1: the bug actually depends more on the tools than on the new deb13:36
ahasenackok13:36
ahasenackand that does not happen with every commit?13:37
ahasenackI guess there needs to be a concept of "stable" and "devel" tools13:37
rogpeppe1ahasenack: there is that concept13:37
rogpeppe1ahasenack: if the minor version is odd, it's a devel version13:37
rogpeppe1ahasenack: i think we probably need to automate our pushing to the public bucket13:38
ahasenackrogpeppe1: but are they in separate buckets?13:38
rogpeppe1ahasenack: no, there's only one public bucket13:38
rogpeppe1ahasenack: (for any given environment, that is)13:38
ahasenackok, so if you push to that bucket with every commit, like a "daily", you risk breaking production users13:38
ahasenackwith the ppa at least you have a distinction about what is "stable" and what is "devel" or "daily"13:39
rogpeppe1ahasenack: only if we push versions with an even minor version number, i think13:39
ahasenackrogpeppe1: so how do you test trunk, you use --upload-tools all the time?13:39
rogpeppe1ahasenack: the idea is that we always develop against an odd minor version (currently we're developing against 1.11)13:39
rogpeppe1ahasenack: yes13:39
ahasenackrogpeppe1: like my case now, I was going through all the openstack charms and seeing if they deploy with juju-core trunk, and filing bugs where appropriate (some in openstach charms, some in juju)13:40
ahasenackrogpeppe1: but I can't test a "trunk" build of juju-core, because it's not there, I'm stuck with the version with the bug :)13:40
rogpeppe1ahasenack: you could use upload-tools13:40
ahasenacklast time I tried it exploded, I emailed the list13:41
ahasenackI will wait for a new package in the devel ppa, and new tools :)13:41
rogpeppe1ahasenack: there have been some significant issues fixed since then. it *should* work fine.13:42
rogpeppe1ahasenack: in particular, it shouldn't pick incompatible tools if you've uploaded some, which was probably the cause of the explosion before13:42
ahasenackrogpeppe1: I think my problem is more basic than that... http://pastebin.ubuntu.com/5604658/13:45
ahasenackwhat does it mean "no go source files"13:45
rogpeppe1ahasenack: try go get -v launchpad.net/juju-core/...13:46
ahasenackrogpeppe1: the "..." are for real?13:46
rogpeppe1ahasenack: there are no source files in the juju-core root directory13:46
rogpeppe1ahasenack: yes13:46
rogpeppe1ahasenack: it's a wildcard13:46
ahasenack!!13:46
rogpeppe1ahasenack: from "go help packages": http://paste.ubuntu.com/5604667/13:47
ahasenackrogpeppe1: ok, that changes things, thanks, I'll go on from here13:47
rogpeppe1ahasenack: if the wildcard was '*', you'd have to quote the names all the time13:48
rogpeppe1ahasenack: and '*' usually doesn't match multiple levels of directory13:48
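The failing command and its fix side by side, for anyone hitting the same "no go source files" error:

```shell
# The juju-core root directory contains no .go files, so this fails
# with the error ahasenack pasted:
go get -v launchpad.net/juju-core

# '...' is Go's package wildcard: it matches the named directory and
# everything beneath it, so this fetches and builds every package.
go get -v launchpad.net/juju-core/...
```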
rogpeppe1ahasenack: cool; please let us know when things go wrong, or are awkward to understand - it's nice to get feedback from people that aren't used to walking around the holes in the road.13:50
=== wedgwood_away is now known as wedgwood
=== gary_poster is now known as gary_poster|away
davechen1ym_3 ping14:00
=== flaviami_ is now known as flaviamissi
dimiterni'd appreciate a review on https://codereview.appspot.com/854005014:29
ahasenackrogpeppe1: --upload-tools worked, and I verified that that -l/--log-level bug is indeed fixed14:29
dimiternrogpeppe1:  ^^14:29
* dimitern bbi30m14:29
rogpeppe1ahasenack: lovely, thanks for giving it a go14:29
rogpeppe1dimitern: ok, will look in a little bit14:29
=== gary_poster|away is now known as gary_poster
rogpeppe1dimitern: reviewed15:06
dimiternrogpeppe1: cheers15:11
m_3davecheney: pong15:13
ahasenackhi, I got this error when deploying cinder with juju-core, is this a change between pyjuju and gojuju? http://pastebin.ubuntu.com/5605085/15:52
rogpeppe1hmm, interesting15:54
rogpeppe1ahasenack: do you know what hook that was running in?15:55
ahasenackrogpeppe1: install I think, this was just before, and I was really installing it only15:55
ahasenack2013/04/26 15:51:25 DEBUG worker/uniter/jujuc: hook context id "cinder/0:install:79731491855068321"; dir "/var/lib/juju/agents/unit-cinder-0/charm"15:55
ahasenackrogpeppe1: wait, let me paste more context15:55
rogpeppe1ahasenack: hmm, so which relation did the code expect to be set there?15:55
rogpeppe1ahasenack: given that the install hook isn't associated with a relation.15:56
ahasenackhttp://pastebin.ubuntu.com/5605098/15:56
ahasenackthe install had failed before, i had to run a few juju set foo=bar to fix a config and then resolved --retry15:56
rogpeppe1ahasenack: i think we could do with even more context actually15:57
ahasenackI'm not sure what it was trying to set15:57
ahasenackok15:57
ahasenacklet me get the whole file15:57
ahasenackrogpeppe1: http://pastebin.ubuntu.com/5605109/15:58
rogpeppe1ahasenack: right, it's running the install hook15:59
rogpeppe1ahasenack: i think it's reasonable that relation-related commands can fail in that circumstance, but i'd be interested to know what the charm was actually trying to do15:59
ahasenacklet me see what it does16:00
rogpeppe1ahasenack: perhaps we should just ignore untoward relation-related commands16:00
ahasenackrogpeppe1: I found two relation-set commands that match that log16:01
ahasenackrogpeppe1: one specifies a relation id :)16:01
rogpeppe1ahasenack: :-)16:01
ahasenacklooks like a bug16:01
rogpeppe1ahasenack: looks that way to me16:01
ahasenackthe one that doesn't is in keystone_joined() (!!)16:01
ahasenack  relation-set service="cinder" \16:01
ahasenack    region="$(config-get region)" public_url="$url" admin_url="$url" internal_url="$url"16:01
ahasenackrogpeppe1: ok, thanks, I'll take it from here16:02
rogpeppe1ahasenack: if charms are doing this commonly though, and the python allowed it, we should perhaps consider letting it through and ignoring it16:02
ahasenackok16:02
ahasenackI will debug this one, see how it ended up running keystone_joined() in the install hook16:03
ahasenackand then if we can get and use a relation id16:03
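The fix ahasenack is heading toward — passing an explicit relation id so relation-set works outside a relation hook — looks roughly like this in charm hook shell (the relation name here is a guess, not taken from the cinder charm):

```shell
# Outside a relation hook there is no implicit relation context, so
# relation-set needs an id passed via -r. relation-ids lists the ids
# for a named relation, one per line.
for rid in $(relation-ids identity-service); do
    relation-set -r "$rid" service="cinder" \
        region="$(config-get region)" public_url="$url"
done
```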
rogpeppe1anyone know of a decent way of inserting nicely formatted code fragments into a gmail mail?16:07
rogpeppe1or a google doc for that matter16:09
ahasenackhi, I have a feeling that juju deploy --config file.yaml isn't working, it's not taking the options from file.yaml16:26
ahasenackbefore I debug further, is this a known issue?16:27
ahasenackjuju set <service> --config file.yaml also didn't work, but juju set <service> key=value did16:29
ahasenackhttps://bugs.launchpad.net/juju-core/+bug/112190716:34
_mup_Bug #1121907: deploy --config <cmdline> <juju-core:New> <https://launchpad.net/bugs/1121907>16:34
dimiternahasenack: I think deploy doesn't accept --config yet16:34
ahasenackThe option is there, but the bug still open16:34
dimiternahasenack: or more likely it ignores it16:34
ahasenackyep, looks like it16:34
dimiternrogpeppe1: bugging you one last time: https://codereview.appspot.com/854005016:35
ahasenackjuju get works, but there is also a bug for it, still open16:35
ahasenackweird16:35
rogpeppe1ahasenack: we've been fixing lots of bugs - not all them have necessarily been marked as such...16:36
ahasenackok16:37
rogpeppe1dimitern: why call repo.Latest at all if we've got a specified revision number?16:37
rogpeppe1dimitern: it's a potentially slow operation16:37
dimiternrogpeppe1: it doesn't seem slow - it just changes the rev in the curl16:38
rogpeppe1dimitern: no it doesn't - it calls CharmStore.Info, which makes an http request16:39
dimiternrogpeppe1: only for a local repo it does get, but this shouldn't be slow at all, the CS does not fetch anything on Latest16:39
rogpeppe1dimitern: resp, err := http.Get(s.BaseURL + "/charm-info?charms=" + url.QueryEscape(key)) ?16:39
dimiternrogpeppe1: it's not the charm that's downloaded here, just the metadata16:40
rogpeppe1dimitern: looks like it's fetching something to me16:40
dimiternrogpeppe1: it's essentially a HTTP HEAD16:40
rogpeppe1dimitern: sure, but it's still making an unnecessary network request for no particularly good reason. surely it's easy to avoid?16:40
dimiternrogpeppe1: yeah, i suppose..16:40
dimiternrogpeppe1: but despite this the logic is now sound, right?16:41
rogpeppe1dimitern: i stopped there, but will continue looking, one mo16:41
dimiternrogpeppe1: i'll just move the Latest call in an else block after checking the other two cases16:41
rogpeppe1dimitern: that was what i was just thinking16:42
dimiternrogpeppe1: sorry, haven't seen it like this16:42
dimiternrogpeppe1: thanks16:42
rogpeppe1dimitern: you might even consider making it a bool switch16:42
dimiternrogpeppe1: i did something like that, but it looked ugly, so i got rid of it16:42
rogpeppe1dimitern: np; three cases is marginal16:43
=== gary_poster is now known as gary_poster|away
rogpeppe1dimitern: i'm still not sure the logic is quite right, even making that change16:45
dimiternrogpeppe1: why?16:46
rogpeppe1dimitern: don't we want to do a bump revision if the switch url is specified without a revno ?16:46
dimiternrogpeppe1: I don't believe so16:46
rogpeppe1dimitern: william said this, and i agree:16:47
rogpeppe1Hmm.I suspect that bump-revision logic *should* apply when --switch is given16:47
rogpeppe1with a *local* charm url *without* an explicit revision. Sane?16:47
dimiternrogpeppe1: that's the user being explicit anyway, so we'll do what he asks, and probably knows what he's doing16:47
dimiternrogpeppe1: I still disagree16:48
rogpeppe1dimitern: as there's no way to explicitly specify bump-revision, i think we should make the default logic work16:48
dimiternrogpeppe1: this is like --force - "do exactly what i'm telling you to do, no smart tricks"16:48
rogpeppe1dimitern: hmm, you said "Done" in response to that sentence before - you didn't seem to disagree16:48
rogpeppe1dimitern: if you don't specify a revision number, you're saying "please choose an appropriate revision number for me"16:49
rogpeppe1dimitern: i think we should make that path work16:49
dimiternrogpeppe1: done, meaning all the rest - except that, i should've been clearer perhaps16:49
dimiternrogpeppe1: there's no way *not* to bump the revision otherwise16:49
dimiternrogpeppe1: and why should we do it - it's a different charm, so no conflicts would apply (hopefully)16:50
rogpeppe1dimitern: sure there is - specify a revision number, no?16:50
rogpeppe1dimitern: it's a different charm, but we may already have another version of the one we're switch to16:50
rogpeppe1switching to16:50
rogpeppe1dimitern: it's not unlikely, in fact, if we're calling switch on multiple services16:51
dimiternrogpeppe1: on the same service?16:51
dimiternrogpeppe1: we can call it only on one service at a time16:51
rogpeppe1dimitern: yes, but bump-revision isn't about the service, is it? it's about the charm's stored in the state, which are independent of the services that use them16:52
dimiternrogpeppe1: so you think bumping revision on switch without explicit rev will be straightforward to understand from the user's point of view?16:52
rogpeppe1dimitern: yes16:52
rogpeppe1dimitern: because it's the behaviour they're used to when deploying with a local charm url16:53
dimiternrogpeppe1: ok, i'll do it, but i'm still not convinced it's right16:53
rogpeppe1dimitern: i think automatic bump-revision for any local charm is correct, as who knows what relationship the local charm bears to the one that's previously been uploaded?16:54
dimiternrogpeppe1: fair enough16:57
=== gary_poster|away is now known as gary_poster
dimiternrogpeppe1: so when you have svc "riak",running charm "riak-7" and you upgrade it to "local:myriak" (no exp. rev, final result: "local:precise/myriak-7"), and then upgrade it again to "local:myriak", should the rev be bumped to "local:myriak-8" ?17:17
rogpeppe1dimitern: yes, i think so17:17
dimiternrogpeppe1: yeah, that's what I thought, adding a test for that now17:18
dimiterni'm off, happy weekend to everyone!17:57
ahasenackrogpeppe1: about the earlier conversation about relation set and relation id, it looks like it's very common to not specify a relation id in pyjuju18:00
ahasenacktwo charm authors I spoke with said so, and the "manpage" for relation-set in pyjuju says it's optional (as is everything else, so I don't trust that help doc very much: https://pastebin.canonical.com/90111/)18:00
rogpeppe1ahasenack: it is optional, in relation-related hooks18:10
rogpeppe1ahasenack: but in a non-relation hook, what could it possibly default to?18:10
ahasenackah, so it is optional in gojuju18:11
ahasenackok, I'll debug further18:11
rogpeppe1right, eod and start of weekend for me here18:15
rogpeppe1happy weekends all18:15
ahasenackbye rogpeppe1, enjoy18:19
=== wedgwood is now known as wedgwood_away

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!