davecheney | m_3: ping | 00:00 |
---|---|---|
davecheney | our cloudinit harness doesn't support the bits of upstart I need | 00:00 |
davecheney | so i'm going to hack the bootstrap node after boot | 00:01 |
davecheney | arosales: ^ as above | 00:01 |
davecheney | that will have the same effect and validate our assumptions about the ~298 connection limit | 00:01 |
davecheney | OT question: does bzr have anything like svn externals or git submodules ? | 00:04 |
davecheney | $ sudo initctl start -v juju-db | 00:20 |
davecheney | initctl: Job failed to start | 00:20 |
davecheney | FML | 00:20 |
thumper | hi davecheney | 00:30 |
davecheney | ubuntu@juju-hpgoctrl2-machine-0:~$ nova list | 00:34 |
davecheney | +---------+---------------------------+------------------+--------------------------------------+ | 00:34 |
davecheney | | ID | Name | Status | Networks | | 00:34 |
davecheney | +---------+---------------------------+------------------+--------------------------------------+ | 00:34 |
davecheney | | 1465097 | juju-hpgoctrl2-machine-0 | ACTIVE | private=10.7.194.166, 15.185.162.247 | | 00:34 |
davecheney | | 1565949 | juju-goscale2-machine-37 | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89 | | 00:34 |
davecheney | | 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83 | | 00:34 |
davecheney | | 1581493 | juju-goscale2-machine-0 | ACTIVE | private=10.7.27.166, 15.185.166.80 | | 00:34 |
davecheney | +---------+---------------------------+------------------+--------------------------------------+ | 00:34 |
davecheney | ^ jammed in deleting for a few days now :( | 00:34 |
davecheney | 2013/04/26 00:51:08 DEBUG started processing instances: []environs.Instance{(*openstack.instance)(0xf8401b3f00)} | 00:51 |
davecheney | ^ *openstack.instance needs a String() | 00:52 |
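A `String()` method is what would turn that `(*openstack.instance)(0xf8401b3f00)` into something readable in the log. A minimal sketch, using a hypothetical stand-in type rather than juju-core's real one:

```go
package main

import "fmt"

// instance stands in for openstack.instance; the field name is hypothetical.
type instance struct {
	id string
}

// String satisfies fmt.Stringer, so %v prints the instance id instead of a
// raw pointer like (*openstack.instance)(0xf8401b3f00).
func (inst *instance) String() string {
	return inst.id
}

func main() {
	fmt.Printf("started processing instances: %v\n", []*instance{{id: "1581493"}})
}
```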
m_3 | davecheney: hey | 01:17 |
davecheney | m_3: hey mate | 01:18 |
davecheney | going for broke for 2k | 01:18 |
m_3 | ssup? still jammed? | 01:18 |
m_3 | sweet | 01:18 |
m_3 | bit of latency atm... gogo inflight wireless | 01:19 |
m_3 | :) | 01:19 |
davecheney | i've hacked the mongo on the bootstrap machine to have at least 20,000 conns | 01:19 |
davecheney | that should be enough for the moment | 01:19 |
m_3 | oh nice | 01:19 |
davecheney | m_3: where u off to ? | 01:19 |
m_3 | SF, then Portland | 01:19 |
m_3 | SF is prep for the big data summercamp talk | 01:20 |
m_3 | portland is railsconf | 01:20 |
m_3 | whoohoo | 01:20 |
m_3 | actually looking forward to hanging with the ole 'austin-on-rails' crowd | 01:20 |
davecheney | m_3: I think we'll probably run out of ram on the bootstrap node by 2,000 | 01:20 |
davecheney | m_3: this one is a hp bug, | 01:21 |
davecheney | ubuntu@juju-hpgoctrl2-machine-0:~$ nova list | grep delet | 01:21 |
davecheney | | 1565949 | juju-goscale2-machine-37 | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89 | | 01:21 |
davecheney | | 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83 | | 01:21 |
m_3 | davecheney: damn... I was just writing that we can bounce it and get something larger | 01:21 |
m_3 | but we can't update the env after bootstrap still right? | 01:21 |
davecheney | ~ 1.5 mb per service unit | 01:21 |
davecheney | env ? | 01:21 |
davecheney | you mean the spec for the bootstrap machine ? | 01:21 |
m_3 | juju environment | 01:21 |
m_3 | yeah | 01:21 |
davecheney | not easily | 01:22 |
davecheney | probably easier to hack juju bootstrap | 01:22 |
m_3 | right | 01:22 |
* davecheney facepalm | 01:22 | |
davecheney | there is no swap on these machines | 01:22 |
davecheney | that will be a problem | 01:22 |
davecheney | mongo will probably explode | 01:22 |
m_3 | yeah, sometimes when they're wedged with juju-0.7 we could do destroy-environment and it was a little stronger than destroy-service | 01:22 |
m_3 | can you kill em with nova | 01:23 |
davecheney | nova can't kill this one | 01:23 |
m_3 | we should've started with ec2 imo | 01:23 |
davecheney | (how do you think it got into this state in the first place) | 01:23 |
m_3 | haha | 01:23 |
davecheney | m_3: any movement on some ec2 creds ? | 01:23 |
m_3 | not yet... I prepped antonio that the request had been pretty much approved from above... but gotta get ben on the actual acct stuff | 01:24 |
m_3 | davecheney: I think we should just blow it up | 01:25 |
m_3 | davecheney: maybe put something in place that'll tell us that's what's happening | 01:25 |
m_3 | so we can distinguish between a juju error and the bootstrap node blowing up | 01:25 |
davecheney | "11:25 < m_3> davecheney: maybe put something in place that'll tell us that's what's happening" | 01:25 |
davecheney | oh | 01:25 |
davecheney | that | 01:25 |
m_3 | :) | 01:25 |
davecheney | let me blow one up so I can see what to expect | 01:26 |
m_3 | reasonable to get as big as we can | 01:26 |
m_3 | ack | 01:26 |
m_3 | unfortunately I won't be in the air for long... otherwise _that_ would be a great story :)... "kicked off 1000 nodes from the plane" | 01:26 |
m_3 | latency's really dropped down too... so it's pretty nice actually | 01:27 |
davecheney | mramm: wazzup ? | 01:27 |
mramm | not much | 01:27 |
mramm | I just got an email from linaro folks about armhf support in juju-core | 01:27 |
davecheney | m_3: lemme hack this instance with a /.SWAP | 01:27 |
davecheney | mramm: piece of piss | 01:27 |
mramm | ? | 01:27 |
davecheney | i told someone that we can always do a one off build if they need armhf today | 01:28 |
davecheney | if they need it properly | 01:28 |
davecheney | we need some work done on the golang-go package in the archive | 01:28 |
davecheney | basically, we need go 1.1 | 01:28 |
mramm | they are just asking if they can help test and support it | 01:28 |
mramm | right | 01:28 |
mramm | that was what I remembered from some earlier arm discussion | 01:28 |
davecheney | they can test it right now today if they build go and juju from source | 01:28 |
davecheney | http://dave.cheney.net/unofficial-arm-tarballs | 01:28 |
mramm | They are not being demanding, just asking how they can help | 01:28 |
davecheney | ^ or they can use my beta tarballs | 01:29 |
davecheney | feel free to cc me | 01:29 |
davecheney | i'm happy to help get them started | 01:29 |
mramm | and what they can do, so I will let them know the situation, and CC you | 01:29 |
mramm | sounds good | 01:29 |
mramm | did we hardcode the state server to be amd64? | 01:30 |
m_3 | descending below 10k-ft... ttyl | 01:31 |
davecheney | mramm: opinions differ | 01:31 |
davecheney | william told me it _is_ hard coded to amd64 | 01:31 |
davecheney | then he told me it wasn't | 01:31 |
mramm | ok | 01:31 |
davecheney | i don't know the current answer | 01:31 |
mramm | I will check with william | 01:31 |
davecheney | i'd expect it to just work | 01:31 |
davecheney | mramm: it's a bit of a problem that the UEC service doesn't list our armhf on amd64 images, http://cloud-images.ubuntu.com/query/precise/server/released.txt | 01:33 |
mramm | interesting | 01:33 |
davecheney | hmm, maybe they do for Q | 01:34 |
davecheney | nup | 01:34 |
mramm | we can talk to the "public cloud images" guys about that, and see what we can get done there. I'll talk to antonio about that tomorrow. | 01:46 |
davecheney | mramm: http://www.h-online.com/open/news/item/Canonical-releases-EC2-image-for-Ubuntu-ARM-Server-1585740.html | 01:47 |
mramm | kk | 01:47 |
mramm | thanks | 01:47 |
thumper | hi mramm | 01:49 |
davecheney | mramm: m_3 286 slaves running, mongo using 450 mb of ram | 01:49 |
davecheney | so at least 4gb required for 2000 nodes at this rate | 01:49 |
thumper | davecheney: is that good? | 01:49 |
davecheney | it means you need to run a larger bootstrap instance | 01:49 |
mramm | davecheney: I guess that is to be expected if we are going to have thousands of open connections to mongo | 01:49 |
davecheney | but then, if you're running 2000 nodes in your environment | 01:49 |
mramm | true enough | 01:50 |
davecheney | you probably don't care about the cost difference | 01:50 |
mramm | right, the bootstrap node cost will be trivial compared to the 2000 nodes | 01:50 |
davecheney | each conn is a thread, which is anywhere between 1mb and 16mb depending on libc and the phase of the moon | 01:50 |
davecheney | mramm: bingo | 01:50 |
mramm | thumper: hey! | 01:50 |
mramm | davecheney: I think we should work to get 1.1 into S as soon as we can | 01:51 |
thumper | mramm: finally landed the hook synchronization branch | 01:51 |
mramm | we expect 1.1 final to land in plenty of time, and the earlier we propose the easier it is | 01:52 |
davecheney | mramm: that will require deviating from the upstream | 01:52 |
davecheney | which I have no problem doing | 01:52 |
mramm | yea | 01:52 |
thumper | snarky... superb... slimey... | 01:52 |
davecheney | but sounds like that isn't what we do (tm) | 01:52 |
thumper | what was S again? | 01:52 |
davecheney | surly | 01:52 |
thumper | not sweet | 01:52 |
thumper | I don't want to look it up | 01:52 |
thumper | but instead batter things around until it floats to the top of my memory | 01:52 |
davecheney | surly simian or something | 01:52 |
thumper | definitely a salamander | 01:53 |
thumper | not sticky | 01:53 |
thumper | which reminds me of a joke | 01:53 |
davecheney | stinky subhuman | 01:53 |
thumper | "What is brown and sticky" | 01:53 |
davecheney | 2013/04/26 01:53:23 NOTICE worker/provisioner: started machine 307 as instance 1582617 | 01:53 |
mramm | stout sea-urchin? | 01:53 |
thumper | a stick | 01:53 |
mramm | haha | 01:53 |
mramm | fyi: https://wiki.ubuntu.com/SReleaseSchedule | 01:54 |
davecheney | hmm, at 300 nodes the main thread on mongod is at 30% duty | 01:55 |
mramm | interesting | 01:55 |
mramm | sounds like some more evidence that we will need an internal API sooner rather than later | 01:55 |
davecheney | mramm: it's all the reconnection and ssl handshaking from the clients probing | 01:55 |
mramm | does it settle down after they have connections established? | 01:56 |
davecheney | mramm: no | 01:56 |
davecheney | this is a constant load | 01:56 |
davecheney | the polling is every 2 ? minutes | 01:56 |
* davecheney goes and checks | 01:57 | |
thumper | so changing to use the api internally should reduce the load here? | 02:03 |
thumper | or will it still be high | 02:03 |
thumper | just because of the number of clients? | 02:03 |
davecheney | thumper: lower, i would hope | 02:05 |
* thumper nods | 02:05 | |
davecheney | the polling is internal to the mongo driver | 02:05 |
davecheney | the driver will poll all the known services in the replica set every 180 seconds at least | 02:07 |
davecheney | 2013/04/26 02:13:11 NOTICE worker/provisioner: started machine 406 as instance 1582971 | 02:13 |
davecheney | might have to go to lunch at this rate | 02:13 |
davecheney | hmm, 20 mins per 100 instances | 02:14 |
davecheney | not bad | 02:14 |
mramm | yea, that's not too bad at all | 02:20 |
thumper | davecheney: going up to 2000? | 02:20 |
davecheney | f' yeah | 02:20 |
davecheney | hp are anxious to have their capacity back | 02:20 |
davecheney | so no pussyfooting around | 02:21 |
davecheney | oooh | 02:42 |
davecheney | ubuntu@juju-hpgoctrl2-machine-0:~$ juju debug-log 2>&1 | grep TLS | 02:42 |
davecheney | juju-goscale2-machine-281:2013/04/26 02:42:08 ERROR state: TLS handshake failed: local error: unexpected message | 02:42 |
davecheney | juju-goscale2-machine-444:2013/04/26 02:42:11 ERROR state: TLS handshake failed: local error: unexpected message | 02:42 |
davecheney | juju-goscale2-machine-160:2013/04/26 02:42:07 ERROR state: TLS handshake failed: local error: unexpected message | 02:42 |
davecheney | juju-goscale2-machine-405:2013/04/26 02:42:10 ERROR state: TLS handshake failed: local error: unexpected message | 02:42 |
davecheney | juju-goscale2-machine-162:2013/04/26 02:42:11 ERROR state: TLS handshake failed: local error: unexpected message | 02:42 |
davecheney | doesn't appear to be affecting things | 02:42 |
davecheney | instance creation time is slowing, 2013/04/26 04:17:37 DEBUG environs/openstack: openstack user data; 2712 bytes | 04:18 |
davecheney | 2013/04/26 04:17:52 INFO environs/openstack: started instance "1584731" | 04:18 |
thumper | davecheney: by how much? | 04:31 |
davecheney | not sure, i'd have to get the whole logs | 04:31 |
davecheney | but the bootstrap node is nearly out of memory | 04:31 |
davecheney | and starting to swap | 04:31 |
davecheney | i'm having a look to see if I can change the instance type of the bootstrap node | 04:33 |
davecheney | need at least 4x more ram to make it to 2000 | 04:34 |
m_3 | davecheney: can we `juju bootstrap --constraint='instance-type=standard.large'` or something? | 04:34 |
davecheney | m_3: not sure | 04:34 |
davecheney | there is something in the openstack logs that says the instance type is being hard coded | 04:35 |
m_3 | oh, yeah, there's --constraints on bootstrap according to help | 04:35 |
davecheney | i'm going to grab the log and kill this test | 04:35 |
m_3 | oh... didn't realize it was hard-coded... never tried anything other than standard.small on hp | 04:35 |
davecheney | i've seen enough to know it's not going to make it | 04:35 |
m_3 | still great info | 04:35 |
m_3 | got it to the point where it's swapping | 04:36 |
davecheney | m_3: will post my notes on this run | 04:36 |
m_3 | so it's probably safest to keep the environment defaulted to standard.small and then do a special bootstrap | 04:36 |
davecheney | m_3: how do we advise customers to size their bootstrap node | 04:36 |
m_3 | btw, we should do a special hadoop-master too | 04:36 |
davecheney | m_3: wanna take a look while i'm grabbing the logs ? | 04:37 |
m_3 | lemme check my notes | 04:37 |
m_3 | I stuck the heap-size config about halfway through http://markmims.com/cloud/2012/06/04/juju-at-scale.html | 04:38 |
m_3 | we just need to test out if the openstack provider will take the --constraints="instance-type=xxx" on bootstrap | 04:40 |
m_3 | those were mediums though | 04:40 |
m_3 | in ec2 | 04:40 |
m_3 | but whatever, the big one is the bootstrap node for now... the hadoop job doesn't actually have to run atm | 04:40 |
* m_3 looks back for the dang ip | 04:41 | |
davecheney | 15.185.162.247 | 04:41 |
davecheney | ubuntu@juju-hpgoctrl2-machine-0:~$ scp -C 15.185.162.247:/var/log/juju/all-machines.log all-machines-2000-node-test-20130426.log | 04:42 |
davecheney | Permission denied (publickey). | 04:43 |
davecheney | why is this being a son of a bitch | 04:43 |
davecheney | oh hang on | 04:43 |
davecheney | ok, i'm going to destroy this envrionment | 04:43 |
m_3 | rsync -azvP -e'juju ssh -e ...' | 04:43 |
davecheney | got it | 04:43 |
m_3 | so we prob wanna do standard.xlarge | 04:44 |
m_3 | can maybe do a standard.large, but might as well do the bootstrap at xlarge | 04:45 |
m_3 | `nova flavor-list` describes them all | 04:45 |
davecheney | m_3: we'll probably have to do a set-config after we boot | 04:48 |
davecheney | but I need to do some screwing with the bootstrap node to make mongo scale | 04:48 |
m_3 | ah, ok | 04:48 |
davecheney | unless you want to boot everything as an xlarge | 04:49 |
davecheney | which might get me a bollocking | 04:49 |
m_3 | davecheney: no, we only have perms on standard.small over normal limits | 04:51 |
m_3 | davecheney: so I think we leave the environment using default-instance-type: standard.small | 04:51 |
m_3 | davecheney: but try to use a constraint with the bootstrap | 04:52 |
m_3 | davecheney: are you thinking that won't work? | 04:52 |
m_3 | davecheney: sorry, I think I screwed up your scp... please check it | 04:52 |
davecheney | nah it's ok | 04:52 |
davecheney | don't worry i got the scp | 04:52 |
m_3 | k | 04:53 |
davecheney | let's try the --constraint option | 04:53 |
davecheney | it's 3pm in AU now | 04:53 |
davecheney | i'm going to destroy this env and start again | 04:53 |
m_3 | hell, I guess the easiest thing to do is first of all | 04:53 |
davecheney | i don't want to leave it running overnight | 04:53 |
m_3 | deploy another service with a constraint | 04:53 |
m_3 | yeah, we don't need to leave it up for anything | 04:53 |
m_3 | I was just thinking we could test out the constraint thing pretty quickly | 04:53 |
m_3 | but it'll be interesting to see how long the destroy takes :) | 04:54 |
m_3 | ha | 04:55 |
m_3 | davecheney: it still looks like it's spawning shit | 04:55 |
davecheney | yup, destroy works backwards | 04:56 |
davecheney | i'll stop the PA | 04:56 |
davecheney | stopped | 04:57 |
m_3 | davecheney: so do we have to kill them via nova now? | 04:58 |
davecheney | m_3: if we have to, that is a bug | 04:59 |
davecheney | destroy means destroy, not do your best :) | 04:59 |
m_3 | yup, but do the services you just killed have to be up throughout destroy? | 04:59 |
* m_3 doesn't know if destroy needs the db to get instance-ids | 05:00 | |
m_3 | davecheney: crap, just tried to bootstrap on another hp acct... doesn't respect the instance-type constraint | 05:02 |
davecheney | m_3: I suspected that | 05:03 |
m_3 | davecheney: know the syntax for "mem>=16GB" | 05:03 |
m_3 | ? | 05:03 |
davecheney | thumper: ? | 05:03 |
davecheney | m_3: our constraints support is very basic | 05:04 |
m_3 | oh, looks like it's trying on a 'mem=16G' | 05:04 |
davecheney | wallyworld_: any ideas ? | 05:04 |
m_3 | nice, I got past the basic validation it looks like... got a "no tools available" | 05:04 |
davecheney | --upload-tools ? | 05:05 |
wallyworld_ | davecheney: about? | 05:05 |
davecheney | wallyworld_: we're trying to bootstrap an env with a larger bootstrap node | 05:05 |
m_3 | davecheney: we can try from the ctrl instance... my laptop's off of the 1.10 distro package | 05:05 |
wallyworld_ | on ec2 i assume | 05:05 |
davecheney | try from the control instance | 05:06 |
m_3 | davecheney: nice, they're dying... slowly | 05:06 |
davecheney | we could kill them all with nova | 05:06 |
davecheney | probably not worth it | 05:06 |
davecheney | it'll be done in a few mins | 05:06 |
m_3 | davecheney: yup | 05:06 |
m_3 | once they're dead, we can try the constraint on bootstrap | 05:07 |
wallyworld_ | davecheney: so you are typing something like this? juju bootstrap --constraints "mem=4G" | 05:08 |
davecheney | wallyworld_: y | 05:08 |
wallyworld_ | and it's not working? | 05:08 |
m_3 | davecheney: I like that it blocks | 05:08 |
davecheney | ec2 blocks as well | 05:08 |
davecheney | but ec2 lets you just say 'delete these 1000 instance id's' | 05:08 |
m_3 | ack | 05:09 |
davecheney | it looks like openstack makes you do them one at a time | 05:09 |
m_3 | wallyworld_: not sure yet | 05:09 |
m_3 | that's surprising | 05:09 |
* wallyworld_ has to go get kid from school | 05:09 | |
m_3 | might be worth filing it as a bug on the openstack provider | 05:09 |
davecheney | or at least a whinge | 05:10 |
m_3 | davecheney: well, I spoke too soon :) | 05:10 |
m_3 | it finished with instances still active | 05:10 |
davecheney | FAIL! | 05:10 |
m_3 | maybe a timeout | 05:11 |
* davecheney embuginates | 05:11 | |
davecheney | nup just raw fail | 05:11 |
* m_3 cheers from the sidelines | 05:11 | |
davecheney | https://bugs.launchpad.net/juju-core/+bug/1170210 | 05:12 |
_mup_ | Bug #1170210: environs/openstack: destroy-environment leaks machines in hpcloud <juju-core:Triaged> <https://launchpad.net/bugs/1170210> | 05:12 |
davecheney | here is one I apparently prepared earlier | 05:12 |
davecheney | m_3: ubuntu@juju-hpgoctrl2-machine-0:~$ nova list | 05:13 |
davecheney | +---------+---------------------------+------------------+--------------------------------------+ | 05:13 |
davecheney | | ID | Name | Status | Networks | | 05:13 |
davecheney | +---------+---------------------------+------------------+--------------------------------------+ | 05:13 |
davecheney | | 1465097 | juju-hpgoctrl2-machine-0 | ACTIVE | private=10.7.194.166, 15.185.162.247 | | 05:13 |
davecheney | | 1565949 | juju-goscale2-machine-37 | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89 | | 05:13 |
davecheney | | 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83 | | 05:13 |
davecheney | | 1581727 | juju-goscale2-machine-5 | ACTIVE(deleting) | private=10.7.30.60, 15.185.168.253 | | 05:13 |
davecheney | +---------+---------------------------+------------------+--------------------------------------+ | 05:13 |
davecheney | can you email that list to antonio and ask hp to find out why those won't delete | 05:13 |
m_3 | oh, same stuck ones? | 05:14 |
davecheney | -5 is a new one from this round | 05:14 |
davecheney | -37 and -239 were stuck from tuesday | 05:14 |
m_3 | ack | 05:14 |
m_3 | sent | 05:16 |
davecheney | 2013/04/26 05:16:11 WARNING environs/openstack: ignoring constraints, using default-instance-type flavor "standard.small" ' | 05:16 |
davecheney | ^ this is what I was afraid of | 05:16 |
davecheney | wallyworld_: any way to hack around this ? | 05:16 |
m_3 | crap | 05:16 |
m_3 | davecheney: we could turn off the 'default' in the environment | 05:16 |
davecheney | m_3: i suspected that would happen, but lacked the words to express it | 05:16 |
m_3 | then see what happens with a few | 05:17 |
m_3 | or explicitly set the constraint for smalls too | 05:17 |
davecheney | i like how fast bootstrap happens in hp cloud | 05:17 |
davecheney | usually < 1 min | 05:18 |
davecheney | so much better than AWS plodding | 05:18 |
m_3 | davecheney: yup... lots faster | 05:18 |
davecheney | m_3: hang on, let me fuck with it for a sec | 05:18 |
davecheney | ahh, you're doing what I was going to do :) | 05:19 |
m_3 | shit, sorry | 05:19 |
davecheney | nah, you're good | 05:19 |
davecheney | that was what I was going to do | 05:19 |
davecheney | m_3: do you wanna do a hangout for a bit ? | 05:19 |
davecheney | or is it a bit late in your local TZ ? | 05:19 |
m_3 | davecheney: yeah, I should stop screwing around and hit the sack :) | 05:20 |
davecheney | go, flee, run wild, etc | 05:20 |
davecheney | sam is in perth this weekend | 05:20 |
m_3 | hotel room with the wife asleep so can't do voice atm | 05:21 |
davecheney | so i'm going to hack on this all weekend | 05:21 |
davecheney | (not to mention drink scotch) | 05:21 |
m_3 | :) | 05:21 |
m_3 | ok, yeah, it doesn't look like our experiment was working anyways | 05:21 |
m_3 | might not be hard to change the constraint "override" code though | 05:21 |
davecheney | I FIXED IT WITH SCIENCE ! | 05:32 |
davecheney | m_3: ok, i got the environment setup the way we want | 05:35 |
davecheney | but forgot to goose mongo | 05:35 |
davecheney | lemme do that again | 05:35 |
davecheney | m_3: hey, machine 5 is dead :) | 05:35 |
davecheney | that is nice bonus | 05:35 |
m_3 | oh, cool | 05:36 |
davecheney | please watch closely, there is nothing up my sleeves | 05:36 |
m_3 | haha | 05:37 |
m_3 | so you're gonna default to xlarge, then explicitly ask for 'mem=2G' for slaves? | 05:37 |
davecheney | m_3: will know in a second | 05:40 |
davecheney | the environment config should default to .smalls | 05:40 |
m_3 | sweeet | 05:40 |
m_3 | nice | 05:41 |
davecheney | thank thumper for set-config | 05:41 |
m_3 | ah | 05:41 |
davecheney | m_3: the rule is, once you've bootstrapped, most of the values in environments.yaml are ignored | 05:41 |
davecheney | the active values are in the state | 05:41 |
davecheney | ohh dear, it shouldn't show you all those things :) | 05:42 |
* m_3 was wanting set-config in juju-0.6 earlier this week | 05:42 | |
m_3 | ha | 05:42 |
m_3 | well, yes | 05:42 |
davecheney | sorry, the command is set-environment | 05:42 |
m_3 | it shouldn't | 05:42 |
davecheney | but its operation is straightforward | 05:42 |
m_3 | understood... I was actually wanting set-config :)... but thought maybe the tool did both | 05:43 |
davecheney | we have set-config as well | 05:43 |
* m_3 happy camper | 05:43 | |
davecheney | um, at least I thought we did | 05:44 |
m_3 | just get | 05:44 |
davecheney | oh yeah | 05:44 |
m_3 | `juju get hadoop-slave` | 05:44 |
m_3 | no filtering it looks like | 05:45 |
davecheney | yeah, i blame myself | 05:45 |
m_3 | I sooo want a "preload-packages" or the equiv | 05:46 |
davecheney | m_3: what would that do ? | 05:47 |
m_3 | charm metadata level as well as environment level | 05:47 |
m_3 | install packages before calling any hooks | 05:47 |
davecheney | ah, via cloud init (sorta) | 05:47 |
davecheney | so all the hook install commands would be no-ops | 05:47 |
m_3 | even later would be fine | 05:47 |
davecheney | MUCHA PARALLELA | 05:47 |
davecheney | 2013/04/26 05:48:16 DEBUG environs/openstack: openstack user data; 2710 bytes | 05:48 |
davecheney | 2013/04/26 05:48:29 INFO environs/openstack: started instance "1585513" | 05:48 |
davecheney | 13 seconds to bootstrap an instance | 05:48 |
davecheney | thumper: i was wrong, this didn't significantly change with 1000 instances running | 05:49 |
m_3 | davecheney: it's moving now... | 05:49 |
m_3 | what, thought the per-instance startup time was changing? | 05:49 |
davecheney | it went up a little as mongo started to swap | 05:49 |
davecheney | not significantly | 05:49 |
m_3 | ack | 05:49 |
m_3 | 5/min atm | 05:50 |
m_3 | ish | 05:50 |
davecheney | the hold back time from openstack's rate limiting affects that | 05:50 |
davecheney | bc says 7 hours to bootstrap 2000 instances | 05:50 |
davecheney | faaaaaaaaaaaaaaark | 05:50 |
davecheney | you only get 4 cpus with the 16gb instance | 05:51 |
davecheney | that is pretty tight | 05:51 |
m_3 | davecheney: where's htop on the bootstrap? | 05:51 |
davecheney | #6 | 05:51 |
davecheney | fun fact, mongo supports a --maxConns flag | 05:52 |
davecheney | which defaults to 20,000 | 05:52 |
davecheney | but that is gated by 80% of the current number of file descriptors | 05:52 |
m_3 | huh | 05:52 |
* davecheney quietly expects mongodb to assplode at 10k connections | 05:53 |
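In other words the effective connection ceiling follows the process's file-descriptor limit rather than the flag. A quick way to see what that works out to on the bootstrap node (an illustrative sketch, roughly equivalent to checking `ulimit -n`):

```go
package main

import (
	"fmt"
	"syscall"
)

func main() {
	// mongod caps its connection count at about 80% of the available file
	// descriptors, so RLIMIT_NOFILE is the number that actually matters.
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		panic(err)
	}
	fmt.Printf("nofile soft limit: %d -> effective mongo connection cap ~%d\n",
		lim.Cur, lim.Cur*80/100)
}
```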
davecheney | m_3: juju-goscale2-machine-0:2013/04/26 05:55:05 NOTICE worker/provisioner: started machine 85 as instance 1585607 | 05:55 |
davecheney | juju-goscale2-machine-0:2013/04/26 05:55:05 INFO worker/provisioner: found machine "86" pending provisioning | 05:55 |
davecheney | this is an interesting log line | 05:55 |
m_3 | davecheney: I didn't catch your startup... are these related to a master? | 05:55 |
davecheney | sorry, say again | 05:56 |
m_3 | did you deploy this from 'bin/hadoop-stack'? | 05:56 |
davecheney | yeah | 05:56 |
m_3 | or just deploy -n? | 05:56 |
davecheney | with -n1975 | 05:56 |
m_3 | ok, cool | 05:56 |
m_3 | wanna catch the master address... shit, status doesn't take any filters either though | 05:57 |
davecheney | that log line above shows how the PA works | 05:57 |
davecheney | 15.185.161.62 | 05:57 |
davecheney | what is the port ? | 05:57 |
m_3 | davecheney: yeah, that looks like what we'd expect to me | 05:57 |
m_3 | 50070 | 05:57 |
davecheney | using nova list is cheating, but whateva | 05:58 |
m_3 | 80 nodes registered | 05:58 |
m_3 | this'd be really hard to test without novaclient | 05:59 |
m_3 | damn, this is looking great right now | 05:59 |
davecheney | m_3: so i'm trying to drag myself into the 90's and use tmux | 05:59 |
davecheney | but there is one thing that i can't figure out | 05:59 |
davecheney | when i C-a etc | 06:00 |
davecheney | sometimes it is like the ^C is ignored | 06:00 |
m_3 | hmmmm not sure what you mean | 06:00 |
m_3 | you're trying to ctrl-c a process you mean? | 06:00 |
davecheney | no, cntl-a n | 06:00 |
m_3 | ctrl-a hangs waiting for a followup keypress | 06:01 |
davecheney | yeah | 06:01 |
m_3 | there's a timeout setting I think | 06:01 |
davecheney | it feels like that | 06:01 |
davecheney | m_3: anyway | 06:01 |
davecheney | it looks like mongo does all its tls negotiation on the main thread | 06:01 |
davecheney | then spawns a worker thread | 06:01 |
davecheney | which is a bit lame | 06:02 |
m_3 | I'll often find myself switching to another window as a no-op if I change my mind or get lost in a ctrl-a sequence | 06:02 |
davecheney | rather than accepting the connection and handling it in a thread | 06:02 |
* m_3 not surprised that something like tls integration is half-baked | 06:02 |
davecheney | at 900 machines running, the main thread was busy 90% of the time handling all the reconnections from the driver | 06:03 |
m_3 | yeah | 06:03 |
davecheney | i expect that to get a bit shit at 2,000 nodes | 06:03 |
m_3 | yup | 06:03 |
m_3 | not sure how to get around that one | 06:04 |
davecheney | as william said, it's moving the ws api out to the agents | 06:04 |
m_3 | yeah, but that's a huge change though right? | 06:05 |
davecheney | it's a lot of work, but conceptually it's straightforward | 06:06 |
m_3 | right | 06:06 |
m_3 | a fix | 06:06 |
m_3 | not so much a workaround :) | 06:07 |
davecheney | everything talks to the state via a set of types which convert between mongo documents and data structures | 06:07 |
davecheney | so it would just be a different conversion | 06:07 |
davecheney | watchers are, as always, the tricky bit | 06:07 |
m_3 | true dat | 06:07 |
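The pattern being described looks roughly like this; the type and field names below are illustrative, not juju-core's actual state package:

```go
package main

import "fmt"

// machineDoc mirrors how a machine is persisted (a mongo document today, a
// serialised API response later); the fields here are made up.
type machineDoc struct {
	Id         string
	InstanceId string
}

// Machine is the in-memory type the rest of the code works with.
type Machine struct {
	doc machineDoc
}

func (m *Machine) InstanceId() string { return m.doc.InstanceId }

// newMachine is the single conversion point: feed it from a mongo query now,
// or from a decoded API payload once the agents talk to an API server.
func newMachine(doc machineDoc) *Machine {
	return &Machine{doc: doc}
}

func main() {
	m := newMachine(machineDoc{Id: "0", InstanceId: "1581493"})
	fmt.Println(m.InstanceId())
}
```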
davecheney | m_3: what happens if I deploy the juju-gui on this environment ? | 06:07 |
m_3 | don't know if juju-gui talks to juju-1.10 api yet... does it? | 06:08 |
m_3 | shit, we can try :) | 06:08 |
davecheney | m_3: gary poster said it did about 5 hours ago | 06:08 |
davecheney | who am I to doubt that lovely man | 06:08 |
davecheney | fuck, we'll have to wait 8 hours for that to be provisioned | 06:09 |
m_3 | now your nova trick won't work this time :) | 06:09 |
davecheney | shitter | 06:09 |
davecheney | well this is fun, for relative values of fun | 06:09 |
davecheney | bugger, i should have deployed the gui first | 06:10 |
davecheney | hmm, i'll do that on the next run | 06:10 |
m_3 | hmmmm... brain's getting fuzzy... but maybe there's a way to point the juju-gui to an api server via config | 06:10 |
m_3 | i.e., from another env | 06:10 |
davecheney | probably | 06:10 |
davecheney | it won't use a relation | 06:10 |
davecheney | because the api server is not a service | 06:10 |
davecheney | (although it should be) | 06:11 |
m_3 | nah, doesn't look like it in the charm | 06:11 |
davecheney | juju-gui: | 06:11 |
davecheney | charm: cs:precise/juju-gui-46 | 06:11 |
m_3 | i.e., no config for api server | 06:11 |
davecheney | exposed: true | 06:11 |
davecheney | units: | 06:11 |
davecheney | juju-gui/0: | 06:11 |
davecheney | agent-state: pending | 06:11 |
davecheney | machine: "1999" | 06:11 |
davecheney | GLWT | 06:12 |
m_3 | 1999 | 06:12 |
m_3 | sweet | 06:12 |
m_3 | btw, the gui for this will be pretty un-interesting | 06:12 |
m_3 | two boxes | 06:12 |
m_3 | hadoop-master and hadoop-slave | 06:13 |
m_3 | two lines between them | 06:13 |
davecheney | i be it crashes my browser | 06:13 |
davecheney | bet | 06:13 |
m_3 | but yes, it'd still be neat to see | 06:13 |
m_3 | haha | 06:13 |
m_3 | well, yeah... maybe that too | 06:13 |
m_3 | although kapil had a simulator mock thingy set up | 06:14 |
davecheney | that is true | 06:14 |
m_3 | he may've done some scale testing with that | 06:14 |
davecheney | that can simulate infeasibly large environments | 06:14 |
m_3 | most likely problem would be timeouts | 06:14 |
m_3 | maybe | 06:14 |
m_3 | while the api server chokes | 06:15 |
m_3 | davecheney: sweet... that's thumping along | 06:15 |
davecheney | m_3: that is what I am thinking, it'll be lugging around the data for thousands of relations | 06:16 |
davecheney | yup | 06:16 |
m_3 | davecheney: ok, well I think I'm gonna hit the sack then | 06:16 |
davecheney | yeah | 06:17 |
m_3 | davecheney: you want me to do anything on the flipside? | 06:17 |
davecheney | this is as thrilling as watching paint dry | 06:17 |
davecheney | if anything eventful happens i'll put it in an email | 06:17 |
m_3 | davecheney: or well just send me email if you get eod and want me to do something | 06:17 |
davecheney | i won't leave it running past about 11pm tonight | 06:17 |
davecheney | we should be pretty close to 2000 nodes by then | 06:17 |
davecheney | 7 hours really isn't fast enough for this | 06:17 |
davecheney | how long did it take for the ec2 2k node test ? | 06:17 |
m_3 | k... I'm on UTC-7 for the next two weeks | 06:18 |
m_3 | bout 7hrs iirc | 06:18 |
m_3 | was split up a bit in the big run | 06:18 |
m_3 | did 1000, tested job runs on that cluster | 06:19 |
davecheney | m_3: i'll see you in -7 on the 5th | 06:19 |
m_3 | then cleaned out the hdfs and added 1000 more | 06:19 |
m_3 | but I think that was 7hrs total | 06:19 |
davecheney | booooooooooooooring | 06:19 |
m_3 | there were a few white russians involved too :) | 06:20 |
davecheney | a capital idea! | 06:20 |
m_3 | :) | 06:20 |
* davecheney considers scouting for dinner | 06:20 | |
m_3 | davecheney: k, well goodnight fine sir | 06:20 |
davecheney | later mate | 06:20 |
davecheney | enjoy this port - land | 06:20 |
davecheney | rogpeppe: can you help with a juju-gui question ? | 07:42 |
rogpeppe | davecheney: perhaps... | 07:42 |
rogpeppe | davecheney: a question from you about juju-gui, or a question from the juju-gui team? | 07:43 |
davecheney | how to login to the bugger | 07:43 |
rogpeppe | davecheney: sorry, didn't see your question... | 07:57 |
rogpeppe | davecheney: if you want me to see something, you need to mention my irc handle... | 07:58 |
rogpeppe | davecheney: you use your admin secret | 07:58 |
rogpeppe | davecheney: have you tried it and had it fail? | 08:00 |
davecheney | rogpeppe: yeah, tried and failed | 08:01 |
davecheney | is there a length limit ? | 08:01 |
rogpeppe | davecheney: i don't think so | 08:01 |
rogpeppe | davecheney: hmm, let me try it. remind me of the charm url of the gui charm, please? | 08:02 |
davecheney | https://15.185.163.105/ | 08:02 |
davecheney | ^ this is the deployed gui | 08:02 |
davecheney | ubuntu@15.185.162.247 | 08:02 |
davecheney | is the machine that bootstrapped | 08:03 |
davecheney | rogpeppe: your key is already on that machine | 08:03 |
davecheney | so you should be able to recover the admin password | 08:03 |
rogpeppe | davecheney: actually, i was going to try deploying it, and couldn't remember the charm url | 08:03 |
rogpeppe | davecheney: but i'll try logging in to yours too | 08:04 |
davecheney | sorry this one is already deployed | 08:04 |
davecheney | rogpeppe: it's doing a 2000 machine bootstrap | 08:04 |
davecheney | so deploying another will take another 7 hours | 08:04 |
rogpeppe | davecheney: i want to see if i can reproduce the problem on a smaller env | 08:04 |
davecheney | kk | 08:04 |
davecheney | i just do juju deploy juju-gui | 08:04 |
davecheney | juju expose juju-gui | 08:04 |
davecheney | just followed gary's instructions from his email | 08:04 |
rogpeppe | davecheney: i don't see any gui charm deployed on that machine | 08:08 |
rogpeppe | davecheney: and the error messages in machine.log look like they're not in the current juju tree | 08:09 |
davecheney | that machine is not inside the environment | 08:09 |
davecheney | rogpeppe: but you can use that machine to recover the admin secret for the goscale2 environment | 08:10 |
rogpeppe | davecheney: ah, ok; i thought you said it was the deployed gui | 08:10 |
davecheney | rogpeppe: the gui uri is https://15.185.163.105/ | 08:11 |
rogpeppe | davecheney: sorry, i got muddled | 08:11 |
davecheney | rogpeppe: yeah, sorry, this is very confusing | 08:11 |
davecheney | we're running an environment within an environment | 08:12 |
davecheney | 'cos that is how m_3 rolls | 08:12 |
rogpeppe | davecheney: i sometimes do that too | 08:12 |
rogpeppe | davecheney: at some point i'll run up a "juju-dev" charm that provides a full juju-core dev environment | 08:13 |
davecheney | that is a great idea | 08:13 |
davecheney | screw local mode | 08:13 |
rogpeppe | davecheney: i've done it manually before, but it's a hassle; just what charms are for | 08:14 |
rogpeppe | davecheney: ok, so login fails for me too | 08:14 |
davecheney | weird eh | 08:15 |
rogpeppe | davecheney: any chance you could add my key to the gui node? | 08:16 |
rogpeppe | davecheney: ah, i can probably ssh from the bootstrap node | 08:16 |
davecheney | rogpeppe: yes | 08:17 |
davecheney | juju ssh 1 | 08:17 |
rogpeppe | davecheney: is there any way we can get ssh to only *temporarily* add hosts. the "permanently added" thing seems wrong | 08:19 |
rogpeppe | davecheney: and i just saw this message, which is probably related: http://paste.ubuntu.com/5603807/ | 08:19 |
davecheney | rogpeppe: unrelated | 08:20 |
davecheney | we've been creating and destroying machines all day | 08:20 |
rogpeppe | davecheney: ah, ok | 08:20 |
davecheney | so ip addresses have been reused | 08:20 |
davecheney | and have left stale entries in the ssh knownhosts file | 08:20 |
* davecheney has created on the order of 1600 machines today | 08:21 |
rogpeppe | davecheney: that sounds like exactly what i was talking about, no? | 08:21 |
rogpeppe | davecheney: isn't the "permanently added" thing talking about adding to the knownhosts file? | 08:21 |
davecheney | rogpeppe: that is correct | 08:21 |
davecheney | i think i meant to say 'that warning is not serious' | 08:21 |
rogpeppe | davecheney: oh, i realise that | 08:22 |
rogpeppe | davecheney: but if ssh wasn't adding to the known hosts file, we wouldn't see that message | 08:22 |
davecheney | it won't add it a second time | 08:22 |
davecheney | the warning is the ip address exists in the file, with a different fingerprint | 08:22 |
davecheney | because we pass -o ignorehostwarning or something to ssh it carries on anyway | 08:23 |
rogpeppe | davecheney: yeah; basically i don't want to say "i know this ip address" forever because ip addresses are totally transitory in the juju env | 08:24 |
davecheney | rogpeppe: bingo | 08:24 |
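The behaviour rogpeppe is after is what OpenSSH's `UserKnownHostsFile=/dev/null` plus `StrictHostKeyChecking=no` options give: nothing gets "permanently added". A sketch of driving ssh that way from Go (whether juju should do this is exactly the open question above):

```go
package main

import (
	"os"
	"os/exec"
)

// sshTransient runs ssh without touching ~/.ssh/known_hosts: host keys are
// written to /dev/null and strict checking is disabled, so reused addresses
// with new fingerprints don't trip the mismatch warning.
func sshTransient(host string, args ...string) error {
	cmdArgs := append([]string{
		"-o", "UserKnownHostsFile=/dev/null",
		"-o", "StrictHostKeyChecking=no",
		host,
	}, args...)
	cmd := exec.Command("ssh", cmdArgs...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	_ = sshTransient("ubuntu@15.185.162.247", "uptime")
}
```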
davecheney | rogpeppe: i'll forward you my notes from the first 1000 machines | 08:27 |
davecheney | rogpeppe: i didn't bother to send that to william, he's got enough on his plate | 08:32 |
davecheney | the amount of memory mongo uses per connection is obscene | 08:33 |
rogpeppe1 | davecheney: last thing i saw was: | 08:36 |
rogpeppe1 | [09:27:39] <davecheney> rogpeppe: i'll forward you my notes from the first 1000 machines | 08:36 |
davecheney | 18:32 < davecheney> rogpeppe: i didn't bother to send that to william, he's got enough on his plate | 08:37 |
davecheney | 18:33 < davecheney> the amount of memory mongo uses per connection is obscene | 08:37 |
davecheney | that is all I said | 08:37 |
davecheney | 'cos you were ignoring me :) | 08:37 |
rogpeppe1 | davecheney: occupational hazard of going through a mobile data connection | 08:37 |
davecheney | rogpeppe1: do you think they will reconnect your part of england to the internet in the near future ? | 08:37 |
rogpeppe1 | davecheney: no prospect in the near future | 08:37 |
davecheney | rogpeppe1: shitter | 08:38 |
* davecheney steps outside to order some dinner | 08:38 | |
rogpeppe1 | davecheney: the fault is somewhere in 200m of underground cable | 08:38 |
rogpeppe1 | davecheney: and they have to get planning to dig it up | 08:38 |
rogpeppe1 | davecheney: i'd like to see your notes BTW | 08:38 |
rogpeppe1 | davecheney: you might've missed this BTW: | 08:39 |
rogpeppe1 | [09:31:31] <rogpeppe> davecheney: ah, this looks like a problem: http://paste.ubuntu.com/5603842/ | 08:39 |
rogpeppe1 | [09:32:57] <rogpeppe> davecheney: oops, missed one redaction | 08:39 |
davecheney | rogpeppe1: if you're looking at the output of juju get-environment | 08:40 |
davecheney | yeah, i think we left our flies open a bit | 08:40 |
rogpeppe1 | davecheney: i removed most of the passwords; but i've no idea what that one was from - third attempt, looks like | 08:41 |
rogpeppe1 | davecheney: unfortunately there seems no way to deliberately delete a paste | 08:41 |
rogpeppe1 | davecheney: before the crawlers find it | 08:42 |
davecheney | rogpeppe1: s'ok, i'll change the admin secret | 08:45 |
rogpeppe1 | aw shucks, "juju deploy juju-gui --force-machine 0" doesn't work | 08:45 |
rogpeppe1 | davecheney: that wasn't the admin secret | 08:45 |
davecheney | will fix | 08:45 |
davecheney | rogpeppe1: as penance, you need to fix that bug :) | 08:46 |
rogpeppe1 | davecheney: i'm looking | 08:46 |
rogpeppe1 | davecheney: i'll try to reproduce it first. please don't take down that environment for the time being (not that there's much danger, i think) | 08:47 |
davecheney | rogpeppe1: np | 08:48 |
rogpeppe1 | davecheney: interesting minor bug: http://paste.ubuntu.com/5603887/ | 08:48 |
davecheney | no you can't do that, oh, ok, if you must | 08:49 |
rogpeppe1 | davecheney: no, it's not done - the unit is left around unassigned | 08:49 |
davecheney | oh | 08:50 |
davecheney | interesting | 08:50 |
rogpeppe1 | davecheney: you have to manually destroy the unit then add another one | 08:50 |
rogpeppe1 | davecheney: https://bugs.launchpad.net/juju-core/+bug/1173089 | 08:56 |
_mup_ | Bug #1173089: deploy can fail partially <juju-core:New> <https://launchpad.net/bugs/1173089> | 08:56 |
davecheney | bzzt | 08:59 |
rogpeppe1 | davecheney: hmm, the gui works ok for me | 09:05 |
davecheney | rogpeppe1: poop | 09:06 |
davecheney | why can't i login to my deployment ? | 09:06 |
rogpeppe1 | davecheney: here's an idea: kill the machine agent | 09:06 |
rogpeppe1 | davecheney: and see if it works when it starts again | 09:07 |
davecheney | ok | 09:07 |
rogpeppe1 | davecheney: 'cos that EOF error is really weird | 09:07 |
rogpeppe1 | davecheney: i'm hoping that we will still see the error when it restarts | 09:07 |
rogpeppe1 | davecheney: because then there's the possibility of upgrading the binaries with some updated logging and better error messages. | 09:08 |
rogpeppe1 | davecheney: and finding out what's really going on | 09:08 |
rogpeppe1 | davecheney: the only possibility that i can think of currently is that the connection to the mongo server has failed | 09:10 |
rogpeppe1 | davecheney: i *wish* we annotated our errors more | 09:10 |
rogpeppe1 | davecheney: if my theory is correct, that EOF error comes from about 6 levels deep and hasn't been given any context at all | 09:11 |
davecheney | rogpeppe1: is this on the api server, or the state/mongo server? | 09:12 |
rogpeppe1 | davecheney: on the api server | 09:12 |
davecheney | right | 09:12 |
rogpeppe1 | davecheney: if i had my way, there would be almost no if err != nil {return err} occurrences in our code | 09:13 |
rogpeppe1 | davecheney: i lost that argument ages ago, but problems like this really show how bad our current conventions are | 09:14 |
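The alternative being argued for is to add context at each return instead of passing the error up bare, so a low-level EOF still says what was being attempted. A small illustration (not juju-core code):

```go
package main

import (
	"fmt"
	"io"
)

// dial stands in for the low-level call that fails with a bare EOF.
func dial() error { return io.EOF }

// openState annotates the failure rather than returning err unmodified, so
// the caller sees the operation and the address, not just "EOF".
func openState(addr string) error {
	if err := dial(); err != nil {
		return fmt.Errorf("cannot connect to state server %q: %v", addr, err)
	}
	return nil
}

func main() {
	fmt.Println(openState("15.185.162.247:37017"))
}
```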
davecheney | rogpeppe1: i'm starting to be convinced | 09:14 |
davecheney | and i think it can be reopened | 09:14 |
davecheney | times they have a-changed | 09:14 |
rogpeppe1 | davecheney: my comment (the last one) on this post is a reasonable representation of my thoughts on the matter: http://how-bazaar.blogspot.co.nz/2013/04/the-go-language-my-thoughts.html | 09:16 |
* davecheney reads | 09:17 | |
davecheney | rogpeppe1: the main mongo thread is now using more than 100% CPU | 09:19 |
* rogpeppe1 is not surprised | 09:19 | |
davecheney | it looks like mongo handles the accept(2) and the tls handshake on the main thread | 09:20 |
davecheney | so every 30 seconds we get a storm of agents sniffing around | 09:21 |
rogpeppe1 | davecheney: oh god | 09:21 |
davecheney | and the cpu wedges | 09:21 |
davecheney | only once it has done the handshaking does it hand off the connection to a new thread | 09:21 |
rogpeppe1 | davecheney: we should try with a much much longer time interval there | 09:21 |
rogpeppe1 | davecheney: 30s is ridiculous | 09:21 |
davecheney | it's not 30s | 09:21 |
davecheney | but that appears to be the resonant frequency of the polling interval | 09:21 |
davecheney | it's 180s or whenever they need to do a sync (that is what mgo calls it) | 09:22 |
davecheney | which ever is the sooner | 09:22 |
rogpeppe1 | davecheney: ah i see. the usual self-synchronising clock thing | 09:23 |
davecheney | yeah, that isn't all 650 agents at once | 09:23 |
davecheney | but a swarm of them | 09:23 |
* rogpeppe1 loves emergent patterns | 09:23 | |
* davecheney does not | 09:23 | |
rogpeppe1 | davecheney: it's the joy of the universe, maaan | 09:24 |
rogpeppe1 | davecheney: does that blog comment make sense to you BTW? i have the impression that no one gets what i'm trying to say there. | 09:27 |
* rogpeppe1 is not good at rhetoric | 09:28 | |
davecheney | rogpeppe1: i agree with your position | 09:29 |
davecheney | i think we talked about this a year ago | 09:29 |
davecheney | waiting for the computer history museum to open | 09:30 |
rogpeppe1 | davecheney: ah yes, i remember | 09:30 |
davecheney | and now with the benefit of some history | 09:30 |
davecheney | i agree | 09:30 |
davecheney | well, i always agreed | 09:30 |
davecheney | but this is an excellent case | 09:30 |
rogpeppe1 | davecheney: i might put a post together for juju-dev | 09:31 |
rogpeppe1 | davecheney: 9 levels deep and still diving | 09:43 |
davecheney | rogpeppe1: remember to stop on the way back up and repressurise to avoid the bends | 09:45 |
rogpeppe1 | davecheney: lol | 09:46 |
davecheney | don't go james cameron on me man | 09:46 |
rogpeppe1 | davecheney: bottomed out at 12 | 09:52 |
davecheney | 64 bit process | 09:52 |
rogpeppe1 | davecheney: if we reported a stack trace, as some suggest, it would show only the bottom 2 levels | 09:52 |
rogpeppe1 | davecheney: http://paste.ubuntu.com/5604054/ | 10:00 |
rogpeppe1 | davecheney: actually, there's probably another layer at the top | 10:00 |
rogpeppe1 | davecheney: here's the complete stack: http://paste.ubuntu.com/5604064/ | 10:02 |
davecheney | rogpeppe1: shit | 10:03 |
rogpeppe1 | davecheney: one easy thing to do is to actually hook up the mgo logging | 10:04 |
rogpeppe1 | davecheney: then that logf at the bottom would actually have printed something | 10:04 |
davecheney | rogpeppe1: is that hard to do ? | 10:09 |
rogpeppe1 | davecheney: trivial | 10:09 |
rogpeppe1 | davecheney: a one-line change | 10:10 |
rogpeppe1 | davecheney: or one or two more if we want nicely formatted messages | 10:11 |
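mgo does expose `SetLogger` and `SetDebug`, so the hookup is roughly the following; the import path and prefix here are illustrative (the tree at the time used labix.org/v2/mgo):

```go
package main

import (
	"log"
	"os"

	mgo "gopkg.in/mgo.v2"
)

func main() {
	// Route mgo's internal messages (cluster syncs, reconnects, socket
	// errors) into our own logging instead of dropping them on the floor.
	mgo.SetLogger(log.New(os.Stderr, "mgo: ", log.LstdFlags))
	mgo.SetDebug(true) // optional and much chattier; handy when chasing EOFs
}
```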
davecheney | rogpeppe1: a single thread is now using 209% CPU on the bootstrap node ... | 10:18 |
rogpeppe1 | davecheney: is that possible? | 10:18 |
davecheney | PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command | 10:18 |
davecheney | 9611 root 20 0 8169M 1770M 0 S 194. 11.0 1h40:55 /usr/bin/mongod --auth --dbpath=/var/lib/juju/db | 10:18 |
davecheney | really, it is | 10:18 |
rogpeppe1 | davecheney: i thought a thread was... single threaded | 10:18 |
rogpeppe1 | davecheney: or do you mean a single process (with several threads inside) ? | 10:18 |
davecheney | rogpeppe1: this is using htop so it should be per thread | 10:19 |
davecheney | i cannot explain it | 10:19 |
davecheney | apart from observing it is large | 10:19 |
davecheney | ohh, and now I can see a lot of blocking on the mongo side | 10:19 |
davecheney | and that is only 800 machines | 10:20 |
davecheney | sorry, 888 | 10:20 |
davecheney | Apr 26 10:21:44 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 10:21:44 [conn84734] query presence.presence.pings query: { $or: [ { _id: 1366971690 }, { _id: 1366971660 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:763142 nreturned:2 reslen:744 381ms | 10:21 |
davecheney | rogpeppe1: i'm assuming these are 'slow queries' | 10:22 |
davecheney | they only start to show up in the log at the 800 machine mark | 10:22 |
rogpeppe1 | davecheney: wow, does that reslen value mean the query has been waiting for 12 minutes to be processed?! | 10:23 |
davecheney | i don't think so | 10:23 |
davecheney | i don't think it is 744,381 ms | 10:23 |
davecheney | surely it is 744 bytes after 381 ms | 10:23 |
rogpeppe1 | davecheney: yeah, probably | 10:24 |
davecheney | rogpeppe1: Apr 26 10:56:20 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 10:56:20 [conn50284] query presence.presence.pings query: { $or: [ { _id: 1366973760 }, { _id: 1366973730 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:911100 nreturned:2 reslen:792 501ms | 10:56 |
rogpeppe1 | davecheney: latency rises... | 10:57 |
davecheney | not really sure what that is showing me yet | 11:02 |
davecheney | it's sort of a CAS, isn't it ? | 11:02 |
davecheney | Apr 26 11:02:02 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 11:02:02 [conn6275] query presence.presence.pings query: { $or: [ { _id: 1366974120 }, { _id: 1366974090 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:1413393 nreturned:1 reslen:406 768ms | 11:02 |
davecheney | but yes, they certainly rise | 11:03 |
davecheney | what is the heartbeat for presence ? | 11:03 |
davecheney | we should put some thought into avoiding harmonic feedback in all these periodic loops | 11:03 |
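One standard way to damp that resonance is to jitter each agent's polling interval so they drift apart instead of pinging in synchronised waves. A minimal sketch (not juju-core code):

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jittered perturbs the base interval by up to ±20%, so thousands of agents
// started at the same moment don't all hit the state server together.
func jittered(base time.Duration) time.Duration {
	maxDelta := int64(base) / 5
	delta := rand.Int63n(2*maxDelta+1) - maxDelta
	return base + time.Duration(delta)
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println(jittered(30 * time.Second))
	}
}
```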
davecheney | shit, we're not even at 1000 instances | 11:06 |
davecheney | it's been running for 3 hours ... | 11:06 |
davecheney | testing this thing is a job for life :) | 11:06 |
dimitern | rogpeppe1: hey, how about a suggestion about better help doc for upgrade-charm --switch? | 11:08 |
davecheney | rogpeppe1: http://paste.ubuntu.com/5604256/ | 11:23 |
davecheney | at the 1000 node mark, the api server is unusable | 11:23 |
rogpeppe1 | dimitern: ah, will do. sorry, bit distracted currently as some old pipes have just sprung a leak in our kitchen and i've had to turn the main water supply off | 11:23 |
davecheney | or something maybe mongo | 11:23 |
dimitern | rogpeppe1: wow.. | 11:23 |
davecheney | maybe the thing after that | 11:23 |
davecheney | crap | 11:23 |
rogpeppe1 | davecheney: isn't the mongo, not the API server? | 11:24 |
rogpeppe1 | s/the/that/ | 11:24 |
davecheney | rogpeppe1: really not sure | 11:25 |
dimitern | rogpeppe1: "To manually specify the charm URL to upgrade to, use the --switch argument. | 11:25 |
dimitern | It will be used instead of the newest revision of the service's current charm. | 11:25 |
dimitern | Note that the given charm must be compatible with the current one, e.g. | 11:25 |
davecheney | i guess it is looking in the db | 11:25 |
dimitern | it must not remove relations the service is currently participating in, | 11:25 |
dimitern | and no settings types can be changed. This *is dangerous* and you should | 11:25 |
dimitern | know what you are doing." | 11:25 |
davecheney | to find the address of the instance | 11:25 |
davecheney | it could also be blocked waiting for the provider to return some data | 11:25 |
davecheney | but we've used up all our quota with the provider | 11:26 |
=== ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: | Bugs: 2 Critical, 64 High - https://bugs.launchpad.net/juju-core/ | ||
=== ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: | Bugs: 3 Critical, 63 High - https://bugs.launchpad.net/juju-core/ | ||
dimitern | wallyworld_: mumble? | 11:33 |
wallyworld_ | dimitern: i just got back from soccer, i'll be a minute | 11:33 |
rogpeppe1 | dimitern: can an upgraded charm have less config settings than the old one? | 11:37 |
dimitern | rogpeppe1: let me check | 11:38 |
davecheney | does anyone know if nova list has a limit on the number of rows it returns ? | 11:39 |
davecheney | https://bugs.launchpad.net/nova/+bug/1166455 ? | 11:42 |
_mup_ | Bug #1166455: nova flavor-list only shows 1000 flavors <prodstack> <OpenStack Compute (nova):Invalid> <python-novaclient:Fix Committed by gtt116> <nova (Ubuntu):Invalid> <https://launchpad.net/bugs/1166455> | 11:42 |
dimitern | rogpeppe1: well, it seems the old config settings should remain, but you can add new ones | 12:00 |
rogpeppe1 | dimitern: ok, that seems good | 12:00 |
rogpeppe1 | dimitern: http://paste.ubuntu.com/5604375/ | 12:02 |
dimitern | rogpeppe1: sgtm, thanks | 12:04 |
dimitern | rogpeppe1: so how to test both local: and cs: urls? start a http server mocking the store and set that to charm.Store? | 12:24 |
rogpeppe1 | dimitern: good question. | 12:40 |
rogpeppe1 | dimitern: sorry, still distracted, trying to get hold of a plumber | 12:41 |
dimitern | rogpeppe1: i'll propose it without that, for now | 12:41 |
ahasenack | hi guys, I'm getting this error in the bootstrap node when bootstrapping on canonistack: | 12:50 |
ahasenack | ERROR worker: loaded invalid environment configuration: required environment variable not set for credentials attribute: User | 12:50 |
ahasenack | full logs at http://pastebin.ubuntu.com/5604481/ | 12:50 |
ahasenack | any ideas? | 12:50 |
ahasenack | "juju status" on my laptop just hangs | 12:51 |
dimitern | ahasenack: try running juju status --debug -v | 12:52 |
ahasenack | dimitern: hm | 12:53 |
ahasenack | dimitern: http://pastebin.ubuntu.com/5604493/ | 12:54 |
ahasenack | security group issue? | 12:54 |
ahasenack | it connects over there (localhost), so there is something listening on that port | 12:55 |
dimitern | ahasenack: it seems it cannot connect to mongo - is it running? | 12:55 |
ahasenack | root@juju-canonistack-machine-0:~# telnet localhost 37017 | 12:55 |
ahasenack | Trying 127.0.0.1... | 12:55 |
ahasenack | Connected to localhost. | 12:55 |
ahasenack | Escape character is '^]'. | 12:55 |
ahasenack | something is, I assume it's mongo | 12:55 |
ahasenack | tcp 0 0 0.0.0.0:37017 0.0.0.0:* LISTEN 27573/mongod | 12:55 |
ahasenack | yep | 12:55 |
dimitern | ahasenack: so you can connect from machine 0 to mongo, but not from outside? | 12:56 |
ahasenack | right | 12:56 |
ahasenack | I'm checking the security group rules | 12:56 |
dimitern | ahasenack: yeah, good idea | 12:56 |
ahasenack | dimitern: ah, I know | 12:57 |
ahasenack | dimitern: the rules are ok | 12:57 |
ahasenack | dimitern: it's the public ip thing, on the private ip only ssh is routed through | 12:58 |
ahasenack | dimitern: I'll fire up sshuttle and that should sort it | 12:58 |
ahasenack | dimitern: yep, worked now, thanks | 12:59 |
ahasenack | the errors in the logs were misleading me | 12:59 |
dimitern | ahasenack: you can also try setting the "use-floating-ip" to true in env config | 12:59 |
ahasenack | yepo | 12:59 |
dimitern | ahasenack: but knowing the shortage of floating ips on canonistack, it might fail anyway | 12:59 |
ahasenack | yes, I will stick with sshuttle, works well enough for my testing | 13:00 |
ahasenack | rogpeppe1: hi, I see that https://bugs.launchpad.net/juju-core/+bug/1172717 is still open, but the branch is merged | 13:19 |
_mup_ | Bug #1172717: juju-log does not accept --log-level <juju-core:In Progress by rogpeppe> <https://launchpad.net/bugs/1172717> | 13:20 |
ahasenack | rogpeppe1: is it fixed in trunk? | 13:20 |
rogpeppe1 | ahasenack: i think so; let me check | 13:33 |
rogpeppe1 | ahasenack: yes | 13:34 |
ahasenack | rogpeppe1: will that trigger a new ppa build? I still only see the version with the bug | 13:34 |
ahasenack | rogpeppe1: also, does it requires a new "tools" build? | 13:35 |
ahasenack | does it require* | 13:35 |
rogpeppe1 | ahasenack: i don't think so. i think the patch needs to be back ported | 13:35 |
ahasenack | rogpeppe1: I'm using this ppa: http://ppa.launchpad.net/juju/devel/ubuntu/ | 13:35 |
rogpeppe1 | ahasenack: we haven't worked out best practice in that respect yet - we're still feeling our way | 13:35 |
ahasenack | I thought that was trunk | 13:35 |
rogpeppe1 | ahasenack: the tools still need to be pushed to the public bucket | 13:36 |
rogpeppe1 | ahasenack: because that's where they're pulled from, not the ppa | 13:36 |
ahasenack | rogpeppe1: the bug actually depends more on the tools than on the new deb | 13:36 |
ahasenack | ok | 13:36 |
ahasenack | and that does not happen with every commit? | 13:37 |
ahasenack | I guess there needs to be a concept of "stable" and "devel" tools | 13:37 |
rogpeppe1 | ahasenack: there is that concept | 13:37 |
rogpeppe1 | ahasenack: if the minor version is odd, it's a devel version | 13:37 |
rogpeppe1 | ahasenack: i think we probably need to automate our pushing to the public bucket | 13:38 |
ahasenack | rogpeppe1: but are they in separate buckets? | 13:38 |
rogpeppe1 | ahasenack: no, there's only one public bucket | 13:38 |
rogpeppe1 | ahasenack: (for any given environment, that is) | 13:38 |
ahasenack | ok, so if you push to that bucket with every commit, like a "daily", you risk breaking production users | 13:38 |
ahasenack | with the ppa at least you have a distinction between what is "stable" and what is "devel" or "daily" | 13:39 |
rogpeppe1 | ahasenack: only if we push versions with an even minor version number, i think | 13:39 |
ahasenack | rogpeppe1: so how do you test trunk, you use --upload-tools all the time? | 13:39 |
rogpeppe1 | ahasenack: the idea is that we always develop against an odd minor version (currently we're developing against 1.11) | 13:39 |
rogpeppe1 | ahasenack: yes | 13:39 |
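
A small sketch of the version convention rogpeppe1 describes (an odd minor number marks a devel series, an even one a stable series). The Version type here is a stand-in, not juju-core's actual version package:

    package main

    import "fmt"

    // Version is a stand-in for a juju tools version number.
    type Version struct {
        Major, Minor, Patch int
    }

    // IsDev reports whether the version belongs to a development series,
    // following the odd-minor-is-devel convention mentioned above.
    func (v Version) IsDev() bool {
        return v.Minor%2 == 1
    }

    func main() {
        fmt.Println(Version{1, 11, 0}.IsDev()) // true  - the current devel series
        fmt.Println(Version{1, 12, 0}.IsDev()) // false - an even minor would be stable
    }
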
ahasenack | rogpeppe1: like my case now, I was going through all the openstack charms and seeing if they deploy with juju-core trunk, and filing bugs where appropriate (some in openstack charms, some in juju) | 13:40 |
ahasenack | rogpeppe1: but I can't test a "trunk" build of juju-core, because it's not there, I'm stuck with the version with the bug :) | 13:40 |
rogpeppe1 | ahasenack: you could use upload-tools | 13:40 |
ahasenack | last time I tried, it exploded; I emailed the list | 13:41 |
ahasenack | I will wait for a new package in the devel ppa, and new tools :) | 13:41 |
rogpeppe1 | ahasenack: there have been some significant issues fixed since then. it *should* work fine. | 13:42 |
rogpeppe1 | ahasenack: in particular, it shouldn't pick incompatible tools if you've uploaded some, which was probably the cause of the explosion before | 13:42 |
ahasenack | rogpeppe1: I think my problem is more basic than that... http://pastebin.ubuntu.com/5604658/ | 13:45 |
ahasenack | what does it mean "no go source files" | 13:45 |
rogpeppe1 | ahasenack: try go get -v launchpad.net/juju-core/... | 13:46 |
ahasenack | rogpeppe1: the "..." are for real? | 13:46 |
rogpeppe1 | ahasenack: there are no source files in the juju-core root directory | 13:46 |
rogpeppe1 | ahasenack: yes | 13:46 |
rogpeppe1 | ahasenack: it's a wildcard | 13:46 |
ahasenack | !! | 13:46 |
rogpeppe1 | ahasenack: from "go help packages": http://paste.ubuntu.com/5604667/ | 13:47 |
ahasenack | rogpeppe1: ok, that changes things, thanks, I'll go on from here | 13:47 |
rogpeppe1 | ahasenack: if the wildcard was '*', you'd have to quote the names all the time | 13:48 |
rogpeppe1 | ahasenack: and '*' usually doesn't match multiple levels of directory | 13:48 |
rogpeppe1 | ahasenack: cool; please let us know when things go wrong, or are awkward to understand - it's nice to get feedback from people that aren't used to walking around the holes in the road. | 13:50 |
=== wedgwood_away is now known as wedgwood | ||
=== gary_poster is now known as gary_poster|away | ||
davechen1y | m_3 ping | 14:00 |
=== flaviami_ is now known as flaviamissi | ||
dimitern | i'd appreciate a review on https://codereview.appspot.com/8540050 | 14:29 |
ahasenack | rogpeppe1: --upload-tools worked, and I verified that that -l/--log-level bug is indeed fixed | 14:29 |
dimitern | rogpeppe1: ^^ | 14:29 |
* dimitern bbi30m | 14:29 | |
rogpeppe1 | ahasenack: lovely, thanks for giving it a go | 14:29 |
rogpeppe1 | dimitern: ok, will look in a little bit | 14:29 |
=== gary_poster|away is now known as gary_poster | ||
rogpeppe1 | dimitern: reviewed | 15:06 |
dimitern | rogpeppe1: cheers | 15:11 |
m_3 | davecheney: pong | 15:13 |
ahasenack | hi, I got this error when deploying cinder with juju-core, is this a change between pyjuju and gojuju? http://pastebin.ubuntu.com/5605085/ | 15:52 |
rogpeppe1 | hmm, interesting | 15:54 |
rogpeppe1 | ahasenack: do you know what hook that was running in? | 15:55 |
ahasenack | rogpeppe1: install, I think - this was just before, and I was really only installing it | 15:55 |
ahasenack | 2013/04/26 15:51:25 DEBUG worker/uniter/jujuc: hook context id "cinder/0:install:79731491855068321"; dir "/var/lib/juju/agents/unit-cinder-0/charm" | 15:55 |
ahasenack | rogpeppe1: wait, let me paste more context | 15:55 |
rogpeppe1 | ahasenack: hmm, so which relation did the code expect to be set there? | 15:55 |
rogpeppe1 | ahasenack: given that the install hook isn't associated with a relation. | 15:56 |
ahasenack | http://pastebin.ubuntu.com/5605098/ | 15:56 |
ahasenack | the install had failed before; I had to run a few juju set foo=bar commands to fix a config and then juju resolved --retry | 15:56 |
rogpeppe1 | ahasenack: i think we could do with even more context actually | 15:57 |
ahasenack | I'm not sure what it was trying to set | 15:57 |
ahasenack | ok | 15:57 |
ahasenack | let me get the whole file | 15:57 |
ahasenack | rogpeppe1: http://pastebin.ubuntu.com/5605109/ | 15:58 |
rogpeppe1 | ahasenack: right, it's running the install hook | 15:59 |
rogpeppe1 | ahasenack: i think it's reasonable that relation-related commands can fail in that circumstance, but i'd be interested to know what the charm was actually trying to do | 15:59 |
ahasenack | let me see what it does | 16:00 |
rogpeppe1 | ahasenack: perhaps we should just ignore untoward relation-related commands | 16:00 |
ahasenack | rogpeppe1: I found two relation-set commands that match that log | 16:01 |
ahasenack | rogpeppe1: one specifies a relation id :) | 16:01 |
rogpeppe1 | ahasenack: :-) | 16:01 |
ahasenack | looks like a bug | 16:01 |
rogpeppe1 | ahasenack: looks that way to me | 16:01 |
ahasenack | the one that doesn't is in keystone_joined() (!!) | 16:01 |
ahasenack | relation-set service="cinder" \ | 16:01 |
ahasenack | region="$(config-get region)" public_url="$url" admin_url="$url" internal_url="$url" | 16:01 |
ahasenack | rogpeppe1: ok, thanks, I'll take it from here | 16:02 |
rogpeppe1 | ahasenack: if charms are doing this commonly though, and the python allowed it, we should perhaps consider letting it through and ignoring it | 16:02 |
ahasenack | ok | 16:02 |
ahasenack | I will debug this one, see how it ended up running keystone_joined() in the install hook | 16:03 |
ahasenack | and then if we can get and use a relation id | 16:03 |
rogpeppe1 | anyone know of a decent way of inserting nicely formatted code fragments into a gmail mail? | 16:07 |
rogpeppe1 | or a google doc for that matter | 16:09 |
ahasenack | hi, I have a feeling that juju deploy --config file.yaml isn't working, it's not taking the options from file.yaml | 16:26 |
ahasenack | before I debug further, is this a known issue? | 16:27 |
ahasenack | juju set <service> --config file.yaml also didn't work, but juju set <service> key=value did | 16:29 |
ahasenack | https://bugs.launchpad.net/juju-core/+bug/1121907 | 16:34 |
_mup_ | Bug #1121907: deploy --config <cmdline> <juju-core:New> <https://launchpad.net/bugs/1121907> | 16:34 |
dimitern | ahasenack: I think deploy doesn't accept --config yet | 16:34 |
ahasenack | The option is there, but the bug still open | 16:34 |
dimitern | ahasenack: or more likely it ignores it | 16:34 |
ahasenack | yep, looks like it | 16:34 |
dimitern | rogpeppe1: bugging you one last time: https://codereview.appspot.com/8540050 | 16:35 |
ahasenack | juju get works, but there is also a bug for it, still open | 16:35 |
ahasenack | weird | 16:35 |
rogpeppe1 | ahasenack: we've been fixing lots of bugs - not all of them have necessarily been marked as such... | 16:36 |
ahasenack | ok | 16:37 |
rogpeppe1 | dimitern: why call repo.Latest at all if we've got a specified revision number? | 16:37 |
rogpeppe1 | dimitern: it's a potentially slow operation | 16:37 |
dimitern | rogpeppe1: it doesn't seem slow - it just changes the rev in the curl | 16:38 |
rogpeppe1 | dimitern: no it doesn't - it calls CharmStore.Info, which makes an http request | 16:39 |
dimitern | rogpeppe1: it only does a get for a local repo; this shouldn't be slow at all - the CS does not fetch anything on Latest | 16:39 |
rogpeppe1 | dimitern: resp, err := http.Get(s.BaseURL + "/charm-info?charms=" + url.QueryEscape(key)) ? | 16:39 |
dimitern | rogpeppe1: it's not the charm that's downloaded here, just the metadata | 16:40 |
rogpeppe1 | dimitern: looks like it's fetching something to me | 16:40 |
dimitern | rogpeppe1: it's essentially an HTTP HEAD | 16:40 |
rogpeppe1 | dimitern: sure, but it's still making an unnecessary network request for no particularly good reason. surely it's easy to avoid? | 16:40 |
dimitern | rogpeppe1: yeah, i suppose.. | 16:40 |
dimitern | rogpeppe1: but despite this the logic is now sound, right? | 16:41 |
rogpeppe1 | dimitern: i stopped there, but will continue looking, one mo | 16:41 |
dimitern | rogpeppe1: i'll just move the Latest call into an else block after checking the other two cases | 16:41 |
rogpeppe1 | dimitern: that was what i was just thinking | 16:42 |
dimitern | rogpeppe1: sorry, haven't seen it like this | 16:42 |
dimitern | rogpeppe1: thanks | 16:42 |
rogpeppe1 | dimitern: you might even consider making it a bool switch | 16:42 |
dimitern | rogpeppe1: i did something like that, but it looked ugly, so i got rid of it | 16:42 |
rogpeppe1 | dimitern: np; three cases is marginal | 16:43 |
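
A rough sketch of the control flow being settled on here: only fall back to repo.Latest (and its HTTP round trip to the charm store) when neither an explicit revision nor a local bump applies. The names are illustrative, not the actual upgrade-charm code:

    package sketch

    // charmRepo is a stand-in for the repository interface discussed above;
    // Latest is the call that makes an HTTP request against the charm store.
    type charmRepo interface {
        Latest(curl string) (int, error)
    }

    // pickRevision only consults repo.Latest when the user neither pinned a
    // revision nor triggered the local bump-revision path.
    func pickRevision(repo charmRepo, curl string, explicitRev, bumpRevision bool, rev int) (int, error) {
        switch {
        case explicitRev:
            return rev, nil // use exactly what the user asked for
        case bumpRevision:
            return rev + 1, nil // local charm: bump past the stored revision
        default:
            return repo.Latest(curl) // otherwise ask the store for the latest revision
        }
    }
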
=== gary_poster is now known as gary_poster|away | ||
rogpeppe1 | dimitern: i'm still not sure the logic is quite right, even making that change | 16:45 |
dimitern | rogpeppe1: why? | 16:46 |
rogpeppe1 | dimitern: don't we want to do a bump revision if the switch url is specified without a revno ? | 16:46 |
dimitern | rogpeppe1: I don't believe so | 16:46 |
rogpeppe1 | dimitern: william said this, and i agree: | 16:47 |
rogpeppe1 | Hmm. I suspect that bump-revision logic *should* apply when --switch is given | 16:47 |
rogpeppe1 | with a *local* charm url *without* an explicit revision. Sane? | 16:47 |
dimitern | rogpeppe1: that's the user being explicit anyway, so we'll do what he asks - he probably knows what he's doing | 16:47 |
dimitern | rogpeppe1: I still disagree | 16:48 |
rogpeppe1 | dimitern: as there's no way to explicitly specify bump-revision, i think we should make the default logic work | 16:48 |
dimitern | rogpeppe1: this is like --force - "do exactly what i'm telling you to do, no smart tricks" | 16:48 |
rogpeppe1 | dimitern: hmm, you said "Done" in response to that sentence before - you didn't seem to disagree | 16:48 |
rogpeppe1 | dimitern: if you don't specify a revision number, you're saying "please choose an appropriate revision number for me" | 16:49 |
rogpeppe1 | dimitern: i think we should make that path work | 16:49 |
dimitern | rogpeppe1: done, meaning all the rest - except that, i should've been clearer perhaps | 16:49 |
dimitern | rogpeppe1: there's no way *not* to bump the revision otherwise | 16:49 |
dimitern | rogpeppe1: and why should we do it - it's a different charm, so no conflicts would apply (hopefully) | 16:50 |
rogpeppe1 | dimitern: sure there is - specify a revision number, no? | 16:50 |
rogpeppe1 | dimitern: it's a different charm, but we may already have another version of the one we're switching to | 16:50 |
rogpeppe1 | dimitern: it's not unlikely, in fact, if we're calling switch on multiple services | 16:51 |
dimitern | rogpeppe1: on the same service? | 16:51 |
dimitern | rogpeppe1: we can call it only on one service at a time | 16:51 |
rogpeppe1 | dimitern: yes, but bump-revision isn't about the service, is it? it's about the charms stored in the state, which are independent of the services that use them | 16:52 |
dimitern | rogpeppe1: so you think bumping revision on switch without explicit rev will be straightforward to understand from the user's point of view? | 16:52 |
rogpeppe1 | dimitern: yes | 16:52 |
rogpeppe1 | dimitern: because it's the behaviour they're used to when deploying with a local charm url | 16:53 |
dimitern | rogpeppe1: ok, i'll do it, but i'm still not convinced it's right | 16:53 |
rogpeppe1 | dimitern: i think automatic bump-revision for any local charm is correct, as who knows what relationship the local charm bears to the one that's previously been uploaded? | 16:54 |
dimitern | rogpeppe1: fair enough | 16:57 |
=== gary_poster|away is now known as gary_poster | ||
dimitern | rogpeppe1: so when you have svc "riak",running charm "riak-7" and you upgrade it to "local:myriak" (no exp. rev, final result: "local:precise/myriak-7"), and then upgrade it again to "local:myriak", should the rev be bumped to "local:myriak-8" ? | 17:17 |
rogpeppe1 | dimitern: yes, i think so | 17:17 |
dimitern | rogpeppe1: yeah, that's what I thought, adding a test for that now | 17:18 |
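
A toy illustration of the behaviour agreed in the riak/myriak example: an uploaded local charm keeps its own revision unless an equal or higher revision of the same charm URL is already in state, in which case it is bumped past it. Purely a sketch, not the state code:

    package sketch

    // nextLocalRevision returns the revision to assign to a freshly uploaded
    // local charm, given the revisions already stored in state for that charm URL.
    func nextLocalRevision(stored []int, uploaded int) int {
        rev := uploaded
        for _, r := range stored {
            if r >= rev {
                rev = r + 1 // bump past anything already in state
            }
        }
        return rev
    }

    // From the conversation: the first upgrade to "local:myriak" stores
    // myriak-7 (nothing to clash with), so nextLocalRevision(nil, 7) == 7;
    // upgrading again finds myriak-7 in state, so nextLocalRevision([]int{7}, 7) == 8.
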
dimitern | i'm off, happy weekend to everyone! | 17:57 |
ahasenack | rogpeppe1: about the earlier conversation about relation set and relation id, it looks like it's very common to not specify a relation id in pyjuju | 18:00 |
ahasenack | two charm authors I spoke with said so, and the "manpage" for relation-set in pyjuju says it's optional (as is everything else, so I don't trust that help doc very much: https://pastebin.canonical.com/90111/) | 18:00 |
rogpeppe1 | ahasenack: it is optional, in relation-related hooks | 18:10 |
rogpeppe1 | ahasenack: but in a non-relation hook, what could it possibly default to? | 18:10 |
ahasenack | ah, so it is optional in gojuju | 18:11 |
ahasenack | ok, I'll debug further | 18:11 |
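
A minimal sketch of the rule rogpeppe1 states above: relation-set can omit the relation id inside a relation hook, where it defaults to the hook's own relation, but in a non-relation hook like install there is nothing to default to. The names are illustrative, not the jujuc implementation:

    package sketch

    import "errors"

    // resolveRelationID decides which relation a relation-set call applies to.
    // explicitID is the value of an explicit -r flag ("" if not given);
    // hookRelationID is the relation the current hook runs for ("" for hooks
    // such as install that are not associated with a relation).
    func resolveRelationID(explicitID, hookRelationID string) (string, error) {
        if explicitID != "" {
            return explicitID, nil // the charm named the relation explicitly
        }
        if hookRelationID != "" {
            return hookRelationID, nil // relation hook: default to its own relation
        }
        // e.g. the keystone_joined call reached from the install hook above.
        return "", errors.New("no relation id specified and not running in a relation hook")
    }
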
rogpeppe1 | right, eod and start of weekend for me here | 18:15 |
rogpeppe1 | happy weekends all | 18:15 |
ahasenack | bye rogpeppe1, enjoy | 18:19 |
=== wedgwood is now known as wedgwood_away |