[00:00] m_3: ping [00:00] our cloudinit harness doesn't support the bits of upstart I need [00:01] so i'm going to hack the bootstrap node after boot [00:01] arosales: ^ as above [00:01] that will have the same effect and validate our assumptions about the ~298 connection limit [00:04] OT question: does bzr have anything like svn externals or git submodules ? [00:20] $ sudo initctl start -v juju-db [00:20] initctl: Job failed to start [00:20] FML [00:30] hi davecheney [00:34] ubuntu@juju-hpgoctrl2-machine-0:~$ nova list [00:34] +---------+---------------------------+------------------+--------------------------------------+ [00:34] | ID | Name | Status | Networks | [00:34] +---------+---------------------------+------------------+--------------------------------------+ [00:34] | 1465097 | juju-hpgoctrl2-machine-0 | ACTIVE | private=10.7.194.166, 15.185.162.247 | [00:34] | 1565949 | juju-goscale2-machine-37 | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89 | [00:34] | 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83 | [00:34] | 1581493 | juju-goscale2-machine-0 | ACTIVE | private=10.7.27.166, 15.185.166.80 | [00:34] +---------+---------------------------+------------------+--------------------------------------+ [00:34] ^ jammed in deleting for a few days now :( [00:51] 2013/04/26 00:51:08 DEBUG started processing instances: []environs.Instance{(*openstack.instance)(0xf8401b3f00)} [00:52] ^ *openstack.instance needs a String() [01:17] davecheney: hey [01:18] m_3: hey mate [01:18] going for broke for 2k [01:18] ssup? still jammed? [01:18] sweet [01:19] bit of latency atm... gogo inflight wireless [01:19] :) [01:19] i've hacked the mongo on the bootstrap machine to have at least 20,000 conns [01:19] that should be enough for the moment [01:19] oh nice [01:19] m_3: where u off to ?
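A sketch of the kind of bootstrap-node hack being described, given the `juju-db` upstart job seen failing above. mongod caps its connection count at 80% of the process's file-descriptor limit, so ~25k descriptors are needed for 20,000 connections. The override file name and limit values are assumptions, not what was actually run:

```shell
# Hypothetical upstart override raising juju-db's open-file limit so mongod
# can accept ~20,000 connections (mongod caps conns at 80% of the fd limit).
sudo tee /etc/init/juju-db.override <<'EOF'
limit nofile 25000 25000
EOF
sudo initctl stop juju-db
sudo initctl start -v juju-db
```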
[01:19] SF, then Portland [01:20] SF is prep for the big data summercamp talk [01:20] portland is railsconf [01:20] whoohoo [01:20] actually looking forward to hanging with the ole 'austin-on-rails' crowd [01:20] m_3: I think we'll probably run out of ram on the bootstrap node by 2,000 [01:21] m_3: this one is an hp bug, [01:21] ubuntu@juju-hpgoctrl2-machine-0:~$ nova list | grep delet [01:21] | 1565949 | juju-goscale2-machine-37 | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89 | [01:21] | 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83 | [01:21] davecheney: damn... I was just writing that we can bounce it and get something larger [01:21] but we can't update the env after bootstrap still right? [01:21] ~ 1.5 mb per service unit [01:21] env ? [01:21] you mean the spec for the bootstrap machine ? [01:21] juju environment [01:21] yeah [01:22] not easily [01:22] probably easier to hack juju bootstrap [01:22] right [01:22] * davecheney facepalm [01:22] there is no swap on these machines [01:22] that will be a problem [01:22] mongo will probably explode [01:22] yeah, sometimes when they're wedged with juju-0.7 we could do destroy-environment and it was a little stronger than destroy-service [01:23] can you kill em with nova [01:23] nova can't kill this one [01:23] we should've started with ec2 imo [01:23] (how do you think it got into this state in the first place) [01:23] haha [01:23] m_3: any movement on some ec2 creds ? [01:24] not yet... I prepped antonio that the request had been pretty much approved from above...
but gotta get ben on the actual acct stuff [01:25] davecheney: I think we should just blow it up [01:25] davecheney: maybe put something in place that'll tell us that's what's happening [01:25] so we can distinguish between a juju error and the bootstrap node blowing up [01:25] "11:25 < m_3> davecheney: maybe put something in place that'll tell us that's what's happening" [01:25] oh [01:25] that [01:25] :) [01:26] let me blow one up so I can see what to expect [01:26] reasonable to get as big as we can [01:26] ack [01:26] unfortunately I won't be in the air for long... otherwise _that_ would be a great story :)... "kicked off 1000 nodes from the plane" [01:27] latency's really dropped down too... so it's pretty nice actually [01:27] mramm: wazzup ? [01:27] not much [01:27] I just got an email from linaro folks about armhf support in juju-core [01:27] m_3: lemme hack this instance with a /.SWAP [01:27] mramm: piece of piss [01:27] ? [01:28] i told someone that we can always do a one off build if they need armhf today [01:28] if they need it properly [01:28] we need some work done on the golang-go package in the archive [01:28] basically, we need go 1.1 [01:28] they are just asking if they can help test and support it [01:28] right [01:28] that was what I remembered from some earlier arm discussion [01:28] they can test it right now today if they build go and juju from source [01:28] http://dave.cheney.net/unofficial-arm-tarballs [01:28] They are not being demanding, just asking how they can help [01:29] ^ or they can use my beta tarballs [01:29] feel free to cc me [01:29] i'm happy to help get them started [01:29] and what they can do, so I will let them know the situation, and CC you [01:29] sounds good [01:30] did we hardcode the state server to be amd64? [01:31] descending below 10k-ft...
ttyl [01:31] mramm: opinions differ [01:31] william told me it _is_ hard coded to amd64 [01:31] then he told me it wasn't [01:31] ok [01:31] i don't know the current answer [01:31] I will check with william [01:31] i'd expect it to just work [01:33] mramm: it's a bit of a problem that the UEC service doesn't list our armhf on amd64 images, http://cloud-images.ubuntu.com/query/precise/server/released.txt [01:33] interesting [01:34] hmm, maybe they do for Q [01:34] nup [01:46] we can talk to the "public cloud images" guys about that, and see what we can get done there. I'll talk to antonio about that tomorrow. [01:47] mramm: http://www.h-online.com/open/news/item/Canonical-releases-EC2-image-for-Ubuntu-ARM-Server-1585740.html [01:47] kk [01:47] thanks [01:49] hi mramm [01:49] mramm: m_3 286 slaves running, mongo using 450 mb of ram [01:49] so at least 4gb required for 2000 nodes at this rate [01:49] davecheney: is that good? [01:49] it means you need to run a larger bootstrap instance [01:49] davecheney: I guess that is to be expected if we are going to have thousands of open connections to mongo [01:49] but then, if you're running 2000 nodes in your environment [01:50] true enough [01:50] you probably don't care about the cost difference [01:50] right, the bootstrap node cost will be trivial compared to the 2000 nodes [01:50] each conn is a thread, which is anywhere between 1mb and 16mb depending on libc and the phase of the moon [01:50] mramm: bingo [01:50] thumper: hey! [01:51] davecheney: I think we should work to get 1.1 into S as soon as we can [01:51] mramm: finally landed the hook synchronization branch [01:52] we expect 1.1 final to land in plenty of time, and the earlier we propose the easier it is [01:52] mramm: that will require deviating from the upstream [01:52] which I have no problem doing [01:52] yea [01:52] snarky... superb... slimy... [01:52] but sounds like that isn't what we do (tm) [01:52] what was S again?
[01:52] surly [01:52] not sweet [01:52] I don't want to look it up [01:52] but instead batter things around until it floats to the top of my memory [01:52] surly simian or something [01:53] definitely a salamander [01:53] not sticky [01:53] which reminds me of a joke [01:53] stinky subhuman [01:53] "What is brown and sticky" [01:53] 2013/04/26 01:53:23 NOTICE worker/provisioner: started machine 307 as instance 1582617 [01:53] stout sea-urchin? [01:53] a stick [01:53] haha [01:54] fyi: https://wiki.ubuntu.com/SReleaseSchedule [01:55] hmm, at 300 nodes the main thread on mongod is at 30% duty [01:55] interesting [01:55] sounds like some more evidence that we will need an internal API sooner rather than later [01:55] mramm: it's all the reconnection and ssl handshaking from the clients probing [01:56] does it settle down after they have connections established? [01:56] mramm: no [01:56] this is a constant load [01:56] the polling is every 2 ? minutes [01:57] * davecheney goes and checks [02:03] so changing to use the api internally should reduce the load here? [02:03] or will it still be high [02:03] just because of the number of clients? [02:05] thumper: lower, i would hope [02:05] * thumper nods [02:05] the polling is internal to the mongo driver [02:07] the driver will poll all the known services in the replica set every 180 seconds at least [02:13] 2013/04/26 02:13:11 NOTICE worker/provisioner: started machine 406 as instance 1582971 [02:13] might have to go to lunch at this rate [02:14] hmm, 20 mins per 100 instances [02:14] not bad [02:20] yea, that's not too bad at all [02:20] davecheney: going up to 2000? 
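Earlier it was noted these instances have no swap ("mongo will probably explode"), with a /.SWAP hack floated as the fix. A sketch of that hack, sized at 4 GB per the estimate above; the size and the root-guard are assumptions, not what was actually run:

```shell
#!/bin/sh
# Sketch of adding a /.SWAP file on the bootstrap node so mongod swaps
# instead of being OOM-killed. Only takes effect when run as root.
set -e
SWAPFILE=/.SWAP
SIZE_MB=4096   # 4 GB, per the sizing estimate above
if [ "$(id -u)" -eq 0 ] && [ ! -e "$SWAPFILE" ]; then
    dd if=/dev/zero of="$SWAPFILE" bs=1M count="$SIZE_MB"
    chmod 600 "$SWAPFILE"
    mkswap "$SWAPFILE"
    swapon "$SWAPFILE"
fi
cat /proc/swaps || true   # show active swap either way
```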
[02:20] f'yeah [02:20] hp are anxious to have their capacity back [02:21] so no pussy footing around [02:42] oooh [02:42] ubuntu@juju-hpgoctrl2-machine-0:~$ juju debug-log 2>&1 | grep TLS [02:42] juju-goscale2-machine-281:2013/04/26 02:42:08 ERROR state: TLS handshake failed: local error: unexpected message [02:42] juju-goscale2-machine-444:2013/04/26 02:42:11 ERROR state: TLS handshake failed: local error: unexpected message [02:42] juju-goscale2-machine-160:2013/04/26 02:42:07 ERROR state: TLS handshake failed: local error: unexpected message [02:42] juju-goscale2-machine-405:2013/04/26 02:42:10 ERROR state: TLS handshake failed: local error: unexpected message [02:42] juju-goscale2-machine-162:2013/04/26 02:42:11 ERROR state: TLS handshake failed: local error: unexpected message [02:42] doesn't appear to be affecting things [04:18] instance creation time is slowing, 2013/04/26 04:17:37 DEBUG environs/openstack: openstack user data; 2712 bytes [04:18] 2013/04/26 04:17:52 INFO environs/openstack: started instance "1584731" [04:31] davecheney: by how much? [04:31] not sure, i'd have to get the whole logs [04:31] but the bootstrap node is nearly out of memory [04:31] and starting to swap [04:33] i'm having a look to see if I can change the instance type of the bootstrap node [04:34] need at least 4x more ram to make it to 2000 [04:34] davecheney: can we `juju bootstrap --constraint='instance-type=standard.large'` or something? [04:34] m_3: not sure [04:35] there is something in the openstack logs that says the instance type is being hard coded [04:35] oh, yeah, there's --constraints on bootstrap according to help [04:35] i'm going to grab the log and kill this test [04:35] oh... didn't realize it was hard-coded...
never tried anything other than standard.small on hp [04:35] i've seen enough to know it's not going to make it [04:35] still great info [04:36] got it to the point where it's swapping [04:36] m_3: will post my notes on this run [04:36] so it's probably safest to keep the environment defaulted to standard.small and then do a special bootstrap [04:36] m_3: how do we advise customers to size their bootstrap node [04:36] btw, we should do a special hadoop-master too [04:37] m_3: wanna take a look while i'm grabbing the logs ? [04:37] lemme check my notes [04:38] I stuck the heap-size config about halfway through http://markmims.com/cloud/2012/06/04/juju-at-scale.html [04:40] we just need to test out if the openstack provider will take the --constraints="instance-type=xxx" on bootstrap [04:40] those were mediums though [04:40] in ec2 [04:40] but whatever, the big one is the bootstrap node for now... the hadoop job doesn't actually have to run atm [04:41] * m_3 looks back for the dang ip [04:41] 15.185.162.247 [04:42] ubuntu@juju-hpgoctrl2-machine-0:~$ scp -C 15.185.162.247:/var/log/juju/all-machines.log all-machines-2000-node-test-20130426.log [04:43] Permission denied (publickey). [04:43] why is this being a son of a bitch [04:43] oh hang on [04:43] ok, i'm going to destroy this environment [04:43] rsync -azvP -e'juju ssh -e ...'
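The rsync form m_3 suggests above, written out. The environment flag's value was elided in the log; `goscale2` is inferred from the machine names in the nova listings and is an assumption, as is addressing the bootstrap node as machine `0`:

```shell
# Pull the aggregated log off the bootstrap node by tunnelling rsync
# through `juju ssh`, sidestepping the plain-scp publickey failure above.
# -z compresses in flight, -P makes the transfer resumable.
rsync -azvP -e 'juju ssh -e goscale2' \
    0:/var/log/juju/all-machines.log \
    all-machines-2000-node-test-20130426.log
```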
got it [04:44] so we prob wanna do standard.xlarge [04:45] can maybe do a standard.large, but might as well do the bootstrap at xlarge [04:45] `nova flavor-list` describes them all [04:48] m_3: we'll probably have to do a set-config after we boot [04:48] but I need to do some screwing with the bootstrap node to make mongo scale [04:48] ah, ok [04:49] unless you want to boot everything as an xlarge [04:49] which might get me a bollocking [04:51] davecheney: no, we only have perms on standard.small over normal limits [04:51] davecheney: so I think we leave the environment using default-instance-type: standard.small [04:52] davecheney: but try to use a constraint with the bootstrap [04:52] davecheney: are you thinking that won't work? [04:52] davecheney: sorry, I think I screwed up your scp... please check it [04:52] nah it's ok [04:52] don't worry i got the scp [04:53] k [04:53] let's try the --constraint option [04:53] it's 3pm in AU now [04:53] i'm going to destroy this env and start again [04:53] hell, I guess the easiest thing to do is first of all [04:53] i don't want to leave it running overnight [04:53] deploy another service with a constraint [04:53] yeah, we don't need to leave it up for anything [04:53] I was just thinking we could test out the constraint thing pretty quickly [04:54] but it'll be interesting to see how long the destroy takes :) [04:55] ha [04:55] davecheney: it still looks like it's spawning shit [04:56] yup, destroy works backwards [04:56] i'll stop the PA [04:57] stopped [04:58] davecheney: so do we have to kill them via nova now? [04:59] m_3: if we have to, that is a bug [04:59] destroy means destroy, not do your best :) [04:59] yup, but do the services you just killed have to be up throughout destroy? [05:00] * m_3 doesn't know if destroy needs the db to get instance-ids [05:02] davecheney: crap, just tried to bootstrap on another hp acct...
doesn't respect the instance-type constraint [05:03] m_3: I suspected that [05:03] davecheney: know the syntax for "mem>=16GB" [05:03] ? [05:03] thumper: ? [05:04] m_3: our constraints support is very basic [05:04] oh, looks like it's trying on a 'mem=16G' [05:04] wallyworld_: any ideas ? [05:04] nice, I got past the basic validation it looks like... got a "no tools available" [05:05] --upload-tools ? [05:05] davecheney: about? [05:05] wallyworld_: we're trying to bootstrap an env with a larger bootstrap node [05:05] davecheney: we can try from the ctrl instance... my laptop's off of the 1.10 distro package [05:05] on ec2 i assume [05:06] try from the control instance [05:06] davecheney: nice, they're dying... slowly [05:06] we could kill them all with nova [05:06] probably not worth it [05:06] it'll be done in a few mins [05:06] davecheney: yup [05:07] once they're dead, we can try the constraint on bootstrap [05:08] davecheney: so you are typing something like this? juju bootstrap --constraints "mem=4G" [05:08] wallyworld_: y [05:08] and it's not working? [05:08] davecheney: I like that it blocks [05:08] ec2 blocks as well [05:08] but ec2 lets you just say 'delete these 1000 instance id's' [05:09] ack [05:09] it looks like openstack makes you do them one at a time [05:09] wallyworld_: not sure yet [05:09] that's surprising [05:09] * wallyworld_ has to go get kid from school [05:09] might be worth filing it as a bug on the openstack provider [05:10] or at least a whinge [05:10] davecheney: well, I spoke too soon :) [05:10] it finished with instances still active [05:10] FAIL! 
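For reference, the constraint shapes being tried in this session, as full commands. Whether the juju-core 1.10 openstack provider honoured them was the open question here; the exact flag values are illustrative assumptions drawn from the conversation (big bootstrap node, small slaves):

```shell
# Big state/bootstrap node, small workers: the plan under test above.
juju bootstrap --constraints "mem=16G"
juju deploy -n 1975 --constraints "mem=2G" hadoop-slave
```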
[05:11] maybe a timeout [05:11] * davecheney embuginates [05:11] nup just raw fail [05:11] * m_3 cheers from the sidelines [05:12] https://bugs.launchpad.net/juju-core/+bug/1170210 [05:12] <_mup_> Bug #1170210: environs/openstack: destroy-environment leaks machines in hpcloud [05:12] here is one I apparently prepared earlier [05:13] m_3: ubuntu@juju-hpgoctrl2-machine-0:~$ nova list [05:13] +---------+---------------------------+------------------+--------------------------------------+ [05:13] | ID | Name | Status | Networks | [05:13] +---------+---------------------------+------------------+--------------------------------------+ [05:13] | 1465097 | juju-hpgoctrl2-machine-0 | ACTIVE | private=10.7.194.166, 15.185.162.247 | [05:13] | 1565949 | juju-goscale2-machine-37 | ACTIVE(deleting) | private=10.6.245.47, 15.185.172.89 | [05:13] | 1566583 | juju-goscale2-machine-239 | ACTIVE(deleting) | private=10.6.246.187, 15.185.177.83 | [05:13] | 1581727 | juju-goscale2-machine-5 | ACTIVE(deleting) | private=10.7.30.60, 15.185.168.253 | [05:13] +---------+---------------------------+------------------+--------------------------------------+ [05:13] can you email that list to antonio and ask hp to find out why those won't delete [05:14] oh, same stuck ones?
[05:14] -5 is a new one from this round [05:14] -37 and -239 were stuck from tuesday [05:14] ack [05:16] sent [05:16] 2013/04/26 05:16:11 WARNING environs/openstack: ignoring constraints, using default-instance-type flavor "standard.small" [05:16] ^ this is what I was afraid of [05:16] wallyworld_: any way to hack around this ? [05:16] crap [05:16] davecheney: we could turn off the 'default' in the environment [05:16] m_3: i suspected that would happen, but lacked the words to express it [05:17] then see what happens with a few [05:17] or explicitly set the constraint for smalls too [05:17] i like how fast bootstrap happens in hp cloud [05:18] usually < 1 min [05:18] so much better than AWS plodding [05:18] davecheney: yup... lots faster [05:18] m_3: hang on, let me fuck with it for a sec [05:19] ahh, you're doing what I was going to do :) [05:19] shit, sorry [05:19] nah, you're good [05:19] that was what I was going to do [05:19] m_3: do you wanna do a hangout for a bit ? [05:19] or is it a bit late in your local TZ ? [05:20] davecheney: yeah, I should stop screwing around and hit the sack :) [05:20] go, flee, run wild, etc [05:20] sam is in perth this weekend [05:21] hotel room with the wife asleep so can't do voice atm [05:21] so i'm going to hack on this all weekend [05:21] (not to mention drink scotch) [05:21] :) [05:21] ok, yeah, it doesn't look like our experiment was working anyways [05:21] might not be hard to change the constraint "override" code though [05:32] I FIXED IT WITH SCIENCE ! [05:35] m_3: ok, i got the environment setup the way we want [05:35] but forgot to goose mongo [05:35] lemme do that again [05:35] m_3: hey, machine 5 is dead :) [05:35] that is a nice bonus [05:36] oh, cool [05:36] please watch closely, there is nothing up my sleeves [05:37] haha [05:37] so you're gonna default to xlarge, then explicitly ask for 'mem=2G' for slaves?
[05:40] m_3: will know in a second [05:40] the environment config should default to .smalls [05:40] sweeet [05:41] nice [05:41] thank thumper for set-config [05:41] ah [05:41] m_3: the rule is, once you've bootstrapped, most of the values in environments.yaml are ignored [05:41] the active values are in the state [05:42] ohh dear, it shouldn't show you all those things :) [05:42] * m_3 was wanting set-config in juju-0.6 earlier this week [05:42] ha [05:42] well, yes [05:42] sorry, the command is set-environment [05:42] it shouldn't [05:42] but its operation is straightforward [05:43] understood... I was actually wanting set-config :)... but thought maybe the tool did both [05:43] we have set-config as well [05:43] * m_3 happy camper [05:44] um, at least I thought we did [05:44] just get [05:44] oh yeah [05:44] `juju get hadoop-slave` [05:45] no filtering it looks like [05:45] yeah, i blame myself [05:46] I sooo want a "preload-packages" or the equiv [05:47] m_3: what would that do ? [05:47] charm metadata level as well as environment level [05:47] install packages before calling any hooks [05:47] ah, via cloud init (sorta) [05:47] so all the hook install commands were no-ops [05:47] even later would be fine [05:47] MUCHA PARALLELA [05:48] 2013/04/26 05:48:16 DEBUG environs/openstack: openstack user data; 2710 bytes [05:48] 2013/04/26 05:48:29 INFO environs/openstack: started instance "1585513" [05:48] 13 seconds to bootstrap an instance [05:49] thumper: i was wrong, this didn't significantly change with 1000 instances running [05:49] davecheney: it's moving now... [05:49] what, thought the per-instance startup time was changing?
[05:49] it went up a little as mongo started to swap [05:49] not significantly [05:49] ack [05:50] 5/min atm [05:50] ish [05:50] the hold back time from openstack's rate limiting affects that [05:50] bc says 7 hours to bootstrap 2000 instances [05:50] faaaaaaaaaaaaaaark [05:51] you only get 4 cpus with the 16gb instance [05:51] that is pretty tight [05:51] davecheney: where's htop on the bootstrap? [05:51] #6 [05:52] fun fact, mongo supports a --maxConns flag [05:52] which defaults to 20,000 [05:52] but that is gated by 80% of the current number of file descriptors [05:52] huh [05:53] * davecheney quietly expects mongodb to assplode at 10k connections [05:55] m_3: juju-goscale2-machine-0:2013/04/26 05:55:05 NOTICE worker/provisioner: started machine 85 as instance 1585607 [05:55] juju-goscale2-machine-0:2013/04/26 05:55:05 INFO worker/provisioner: found machine "86" pending provisioning [05:55] this is an interesting log line [05:55] davecheney: I didn't catch your startup... are these related to a master? [05:56] sorry, say again [05:56] did you deploy this from 'bin/hadoop-stack'? [05:56] yeah [05:56] or just deploy -n? [05:56] with -n1975 [05:56] ok, cool [05:57] wanna catch the master address... shit, status doesn't take any filters either though [05:57] that log line above shows how the PA works [05:57] 15.185.161.62 [05:57] what is the port ? [05:57] davecheney: yeah, that looks like what we'd expect to me [05:57] 50070 [05:58] using nova list is cheating, but whateva [05:58] 80 nodes registered [05:59] this'd be really hard to test without novaclient [05:59] damn, this is looking great right now [05:59] m_3: so i'm trying to drag myself into the 90's and use tmux [05:59] but there is one thing that i can't figure out [06:00] when i C-a etc [06:00] sometimes it is like the ^C is ignored [06:00] hmmmm not sure what you mean [06:00] you're trying to ctrl-c a process you mean?
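Two bits of arithmetic from the above: the bc behind the "7 hours" figure at the observed ~5 starts/minute, and the file-descriptor limit implied by the 80% gating on --maxConns:

```shell
# Provisioning time: 2000 instances at ~5/min (the rate observed above).
mins=$(echo "2000 / 5" | bc)               # 400 minutes
hours=$(echo "scale=1; $mins / 60" | bc)   # 6.6 -> "call it 7 hours"
echo "${mins} min = ~${hours} h"

# Connection ceiling: mongod's effective limit is min(--maxConns, 80% of
# the open-file limit), so reaching N connections needs N / 0.8 descriptors.
want=20000
fds_needed=$(( want * 10 / 8 ))
echo "for ${want} conns: at least ${fds_needed} fds"
```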
[06:00] no, ctrl-a n [06:01] ctrl-a hangs waiting for a followup keypress [06:01] yeah [06:01] there's a timeout setting I think [06:01] it feels like that [06:01] m_3: anyway [06:01] it looks like mongo does all its tls negotiation on the main thread [06:01] then spawns a worker thread [06:02] which is a bit lame [06:02] I'll often find myself switching to another window as a no-op if I change my mind or get lost in a ctrl-a sequence [06:02] rather than accepting the connection and handling it in a thread [06:02] * m_3 not surprised that something like tls integration is half-baked [06:03] at 900 machines running, the main thread was busy 90% of the time handling all the reconnections from the driver [06:03] yeah [06:03] i expect that to get a bit shit at 2,000 nodes [06:03] yup [06:04] not sure how to get around that one [06:04] as william said, it's moving the ws api out to the agents [06:05] yeah, but that's a huge change though right? [06:06] it's a lot of work, but conceptually it's straightforward [06:06] right [06:06] a fix [06:07] not so much a workaround :) [06:07] everything talks to the state via a set of types which convert between mongo documents and data structures [06:07] so it would just be a different conversion [06:07] watchers are, as always, the tricky bit [06:07] true dat [06:07] m_3: what happens if I deploy the juju-gui on this environment ? [06:08] don't know if juju-gui talks to juju-1.10 api yet... does it? [06:08] shit, we can try :) [06:08] m_3: gary poster said it did about 5 hours ago [06:08] who am I to doubt that lovely man [06:09] fuck, we'll have to wait 8 hours for that to be provisioned [06:09] now your nova trick won't work this time :) [06:09] shitter [06:09] well this is fun, for relative values of fun [06:10] bugger, i should have deployed the gui first [06:10] hmm, i'll do that on the next run [06:10] hmmmm... brain's getting fuzzy...
but maybe there's a way to point the juju-gui to an api server via config [06:10] i.e., from another env [06:10] probably [06:10] it won't use a relation [06:10] because the api server is not a service [06:11] (although it should be) [06:11] nah, doesn't look like it in the charm [06:11] juju-gui: [06:11] charm: cs:precise/juju-gui-46 [06:11] i.e., no config for api server [06:11] exposed: true [06:11] units: [06:11] juju-gui/0: [06:11] agent-state: pending [06:11] machine: "1999" [06:12] GLWT [06:12] 1999 [06:12] sweet [06:12] btw, the gui for this will be pretty uninteresting [06:12] two boxes [06:13] hadoop-master and hadoop-slave [06:13] two lines between them [06:13] i be it crashes my browser [06:13] bet [06:13] but yes, it'd still be neat to see [06:13] haha [06:13] well, yeah... maybe that too [06:14] although kapil had a simulator mock thingy set up [06:14] that is true [06:14] he may've done some scale testing with that [06:14] that can simulate infeasibly large environments [06:14] most likely problem would be timeouts [06:14] maybe [06:15] while the api server chokes [06:15] davecheney: sweet... that's thumping along [06:16] m_3: that is what I am thinking, it'll be lugging around the data for thousands of relations [06:16] yup [06:16] davecheney: ok, well I think I'm gonna hit the sack then [06:17] yeah [06:17] davecheney: you want me to do anything on the flipside?
[06:17] this is as thrilling as watching paint dry [06:17] if anything eventful happens i'll put it in an email [06:17] davecheney: or well just send me email if you get eod and want me to do something [06:17] i won't leave it running past about 11pm tonight [06:17] we should be pretty close to 2000 nodes by then [06:17] 7 hours really isn't fast enough for this [06:17] how long did it take for the ec2 2k node test ? [06:18] k... I'm on UTC-7 for the next two weeks [06:18] bout 7hrs iirc [06:18] was split up a bit in the big run [06:19] did 1000, tested job runs on that cluster [06:19] m_3: i'll see you in -7 on the 5th [06:19] then cleaned out the hdfs and added 1000 more [06:19] but I think that was 7hrs total [06:19] booooooooooooooring [06:20] there were a few white russians involved too :) [06:20] a capital idea! [06:20] :) [06:20] * davecheney considers scouting for dinner [06:20] davecheney: k, well goodnight fine sir [06:20] later mate [06:20] enjoy this port - land [07:42] rogpeppe: can you help with a juju-gui question ? [07:42] davecheney: perhaps... [07:43] davecheney: a question from you about juju-gui, or a question from the juju-gui team? [07:43] how to login to the bugger [07:57] davecheney: sorry, didn't see your question... [07:58] davecheney: if you want me to see something, you need to mention my irc handle... [07:58] davecheney: you use your admin secret [08:00] davecheney: have you tried it and had it fail? [08:01] rogpeppe: yeah, tried and failed [08:01] is there a length limit ? [08:01] davecheney: i don't think so [08:02] davecheney: hmm, let me try it. remind me of the charm url of the gui charm, please?
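On "you use your admin secret": the gui login password is the `admin-secret` from the environment's configuration. One place it can typically be read from on the client side (the path is the standard juju-core location of that era; this is a sketch, not the procedure actually used here):

```shell
# Recover the admin secret for the current environment from the local
# client config. Assumes the default ~/.juju layout.
grep admin-secret ~/.juju/environments.yaml
```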
[08:02] https://15.185.163.105/ [08:02] ^ this is the deployed gui [08:02] ubuntu@15.185.162.247 [08:03] is the machine that bootstrapped [08:03] rogpeppe: your key is already on that machine [08:03] so you should be able to recover the admin password [08:03] davecheney: actually, i was going to try deploying it, and couldn't remember the charm url [08:04] davecheney: but i'll try logging in to yours too [08:04] sorry this one is already deployed [08:04] rogpeppe: it's doing a 2000 machine bootstrap [08:04] so deploying another will take another 7 hours [08:04] davecheney: i want to see if i can reproduce the problem on a smaller env [08:04] kk [08:04] i just do juju deploy juju-gui [08:04] juju expose juju-gui [08:04] just followed gary's instructions from his email [08:08] davecheney: i don't see any gui charm deployed on that machine [08:09] davecheney: and the error messages in machine.log look like they're not in the current juju tree [08:09] that machine is not inside the environment [08:10] rogpeppe: but you can use that machine to recover the admin secret for the goscale2 environment [08:10] davecheney: ah, ok; i thought you said it was the deployed gui [08:11] rogpeppe: the gui uri is https://15.185.163.105/ [08:11] davecheney: sorry, i got muddled [08:11] rogpeppe: yeah, sorry, this is very confusing [08:12] we're running an environment within an environment [08:12] 'cos that is how m_3 rolls [08:12] davecheney: i sometimes do that too [08:13] davecheney: at some point i'll run up a "juju-dev" charm that provides a full juju-core dev environment [08:13] that is a great idea [08:13] screw local mode [08:14] davecheney: i've done it manually before, but it's a hassle; just what charms are for [08:14] davecheney: ok, so login fails for me too [08:15] weird eh [08:16] davecheney: any chance you could add my key to the gui node?
[08:16] davecheney: ah, i can probably ssh from the bootstrap node [08:17] rogpeppe: yes [08:17] juju ssh 1 [08:19] davecheney: is there any way we can get ssh to only *temporarily* add hosts. the "permanently added" thing seems wrong [08:19] davecheney: and i just saw this message, which is probably related: http://paste.ubuntu.com/5603807/ [08:20] rogpeppe: unrelated [08:20] we've been creating and destroying machines all day [08:20] davecheney: ah, ok [08:20] so ip addresses have been reused [08:20] and have left stale entries in the ssh knownhosts file [08:21] * davecheney has created on the order of 1600 machines today [08:21] davecheney: that sounds like exactly what i was talking about, no? [08:21] davecheney: isn't the "permanently added" thing talking about adding to the knownhosts file? [08:21] rogpeppe: that is correct [08:21] i think i meant to say 'that warning is not serious' [08:22] davecheney: oh, i realise that [08:22] davecheney: but if ssh wasn't adding to the known hosts file, we wouldn't see that message [08:22] it won't add it a second time [08:22] the warning is the ip address exists in the file, with a different fingerprint [08:23] because we pass -o ignorehostwarning or something to ssh it carries on anyway [08:24] davecheney: yeah; basically i don't want to say "i know this ip address" forever because ip addresses are totally transitory in the juju env [08:24] rogpeppe: bingo [08:27] rogpeppe: i'll forward you my notes from the first 1000 machines [08:32] rogpeppe: i didn't bother to send that to william, he's got enough on his plate [08:33] the amount of memory mongo uses per connection is obscene [08:36] davecheney: last thing i saw was: [08:36] [09:27:39] rogpeppe: i'll forward you my notes from the first 1000 machines [08:37] 18:32 < davecheney> rogpeppe: i didn't bother to send that to william, he's got enough on his plate [08:37] 18:33 < davecheney> the amount of memory mongo uses per connection is obscene [08:37] that is all I said
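One way to get the "temporarily add hosts" behaviour wished for above when sshing to juju machines by hand: tell OpenSSH to keep no record at all, since the addresses are recycled constantly. These are standard ssh_config options; the address is the bootstrap node from this session:

```shell
# Ad-hoc ssh to a juju machine without writing to ~/.ssh/known_hosts,
# so recycled IPs never produce stale-fingerprint warnings.
ssh -o UserKnownHostsFile=/dev/null \
    -o StrictHostKeyChecking=no \
    ubuntu@15.185.162.247
```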
[08:37] 'cos you were ignoring me :) [08:37] davecheney: occupational hazard of going through a mobile data connection [08:37] rogpeppe1: do you think they will reconnect your part of england to the internet in the near future ? [08:37] davecheney: no prospect in the near future [08:38] rogpeppe1: shitter [08:38] * davecheney steps outside to order some dinner [08:38] davecheney: the fault is somewhere in 200m of underground cable [08:38] davecheney: and they have to get planning to dig it up [08:38] davecheney: i'd like to see your notes BTW [08:39] davecheney: you might've missed this BTW: [08:39] [09:31:31] davecheney: ah, this looks like a problem: http://paste.ubuntu.com/5603842/ [08:39] [09:32:57] davecheney: oops, missed one redaction [08:40] rogpeppe1: if you're looking at the output of juju get-environment [08:40] yeah, i think we left our flies open a bit [08:41] davecheney: i removed most of the passwords; but i've no idea what that one was from - third attempt, looks like [08:41] davecheney: unfortunately there seems no way to deliberately delete a paste [08:42] davecheney: before the crawlers find it [08:45] rogpeppe1: s'ok, i'll change the admin secret [08:45] aw shucks, "juju deploy juju-gui --force-machine 0" doesn't work [08:45] davecheney: that wasn't the admin secret [08:45] will fix [08:46] rogpeppe1: as penance, you need to fix that bug :) [08:46] davecheney: i'm looking [08:47] davecheney: i'll try to reproduce it first.
please don't take down that environment for the time being (not that there's much danger, i think) [08:48] rogpeppe1: np [08:48] davecheney: interesting minor bug: http://paste.ubuntu.com/5603887/ [08:49] no you can't do that, oh, ok, if you must [08:49] davecheney: no, it's not done - the unit is left around unassigned [08:50] oh [08:50] interesting [08:50] davecheney: you have to manually destroy the unit then add another one [08:56] davecheney: https://bugs.launchpad.net/juju-core/+bug/1173089 [08:56] <_mup_> Bug #1173089: deploy can fail partially [08:59] bzzt [09:05] davecheney: hmm, the gui works ok for me [09:06] rogpeppe1: poop [09:06] why can't i login to my deployment ? [09:06] davecheney: here's an idea: kill the machine agent [09:07] davecheney: and see if it works when it starts again [09:07] ok [09:07] davecheney: 'cos that EOF error is really weird [09:07] davecheney: i'm hoping that we will still see the error when it restarts [09:08] davecheney: because then there's the possibility of upgrading the binaries with some updated logging and better error messages. [09:08] davecheney: and finding out what's really going on [09:10] davecheney: the only possibility that i can think of currently is that the connection to the mongo server has failed [09:10] davecheney: i *wish* we annotated our errors more [09:11] davecheney: if my theory is correct, that EOF error comes from about 6 levels deep and hasn't been given any context at all [09:12] rogpeppe1: is this on the api server, or the state/mongo server? 
[09:12] davecheney: on the api server [09:12] right [09:13] davecheney: if i had my way, there would be almost no if err != nil {return err} occurrences in our code [09:14] davecheney: i lost that argument ages ago, but problems like this really show how bad our current conventions are [09:14] rogpeppe1: i'm starting to be convinced [09:14] and i think it can be reopened [09:14] times they have a-changed [09:16] davecheney: my comment (the last one) on this post is a reasonable representation of my thoughts on the matter: http://how-bazaar.blogspot.co.nz/2013/04/the-go-language-my-thoughts.html [09:17] * davecheney reads [09:19] rogpeppe1: the main mongo thread is now using more than 100% CPU [09:19] * rogpeppe1 is not surprised [09:20] it looks like mongo handles the accept(2) and the tls handshake on the main thread [09:21] so every 30 seconds we get a storm of agents sniffing around [09:21] davecheney: oh god [09:21] and the cpu wedges [09:21] only once it has done the handshaking does it hand off the connection to a new thread [09:21] davecheney: we should try with a much much longer time interval there [09:21] davecheney: 30s is ridiculous [09:21] it's not 30s [09:21] but that appears to be the resonant frequency of the polling interval [09:22] it's 180s or whenever they need to do a sync (that is what mgo calls it) [09:22] whichever is the sooner [09:23] davecheney: ah i see. the usual self-synchronising clock thing [09:23] yeah, that isn't all 650 agents at once [09:23] but a swarm of them [09:23] * rogpeppe1 loves emergent patterns [09:23] * davecheney does not [09:24] davecheney: it's the joy of the universe, maaan [09:27] davecheney: does that blog comment make sense to you BTW? i have the impression that no one gets what i'm trying to say there.
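[Editor's note: the convention being argued against, and the fix, fit in a few lines of Go. This is a minimal sketch, not juju code; the function names and message text are invented:]

```go
package main

import (
	"errors"
	"fmt"
)

// readState stands in for the bottom of a deep call stack; in the
// incident above this is where the bare "EOF" originated.
func readState() error {
	return errors.New("EOF")
}

// openAPI annotates the error instead of the bare `return err` being
// complained about, so the failure arrives at the top with context.
func openAPI() error {
	if err := readState(); err != nil {
		return fmt.Errorf("cannot read state from mongo: %v", err)
	}
	return nil
}

func main() {
	fmt.Println(openAPI()) // → cannot read state from mongo: EOF
}
```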
[09:28] * rogpeppe1 is not good at rhetoric [09:29] rogpeppe1: i agree with your position [09:29] i think we talked about this a year ago [09:30] waiting for the computer history museum to open [09:30] davecheney: ah yes, i remember [09:30] and now with the benefit of some history [09:30] i agree [09:30] well, i always agreed [09:30] but this is an excellent case [09:31] davecheney: i might put a post together for juju-dev [09:43] davecheney: 9 levels deep and still diving [09:45] rogpeppe1: remember to stop on the way back up and repressurise to avoid the bends [09:46] davecheney: lol [09:46] don't go james cameron on me man [09:52] davecheney: bottomed out at 12 [09:52] 64 bit process [09:52] davecheney: if we reported a stack trace, as some suggest, it would show only the bottom 2 levels [10:00] davecheney: http://paste.ubuntu.com/5604054/ [10:00] davecheney: actually, there's probably another layer at the top [10:02] davecheney: here's the complete stack: http://paste.ubuntu.com/5604064/ [10:03] rogpeppe1: shit [10:04] davecheney: one easy thing to do is to actually hook up the mgo logging [10:04] davecheney: then that logf at the bottom would actually have printed something [10:09] rogpeppe1: is that hard to do ? [10:09] davecheney: trivial [10:10] davecheney: a one-line change [10:11] davecheney: or one or two more if we want nicely formatted messages [10:18] rogpeppe1: a single thread is now using 209% CPU on the bootstrap node ... [10:18] davecheney: is that possible? [10:18] PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command [10:18] 9611 root 20 0 8169M 1770M 0 S 194. 11.0 1h40:55 /usr/bin/mongod --auth --dbpath=/var/lib/juju/db [10:18] really, it is [10:18] davecheney: i thought a thread was... single threaded [10:18] davecheney: or do you mean a single process (with several threads inside) ?
[10:19] rogpeppe1: this is using htop so it should be per thread [10:19] i cannot explain it [10:19] apart from observing it is large [10:19] ohh, and now I can see a lot of blocking on the mongo side [10:20] and that is only 800 machines [10:20] sorry, 888 [10:21] Apr 26 10:21:44 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 10:21:44 [conn84734] query presence.presence.pings query: { $or: [ { _id: 1366971690 }, { _id: 1366971660 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:763142 nreturned:2 reslen:744 381ms [10:22] rogpeppe1: i'm assuming these are 'slow queries' [10:22] they only start to show up in the log at the 800 machine mark [10:23] davecheney: wow, does that reslen value mean the query has been waiting for 12 minutes to be processed?! [10:23] i don't think so [10:23] i don't think it is 744,381 ms [10:23] surely it is 744 bytes after 381 ms [10:24] davecheney: yeah, probably [10:56] rogpeppe1: Apr 26 10:56:20 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 10:56:20 [conn50284] query presence.presence.pings query: { $or: [ { _id: 1366973760 }, { _id: 1366973730 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:911100 nreturned:2 reslen:792 501ms [10:57] davecheney: latency rises... [11:02] not really sure what that is showing me yet [11:02] it's sort of a CAS, isn't it? [11:02] Apr 26 11:02:02 juju-goscale2-machine-0 mongod.37017[9611]: Fri Apr 26 11:02:02 [conn6275] query presence.presence.pings query: { $or: [ { _id: 1366974120 }, { _id: 1366974090 } ] } ntoreturn:0 ntoskip:0 nscanned:2 keyUpdates:0 numYields: 1 locks(micros) r:1413393 nreturned:1 reslen:406 768ms [11:03] but yes, they certainly rise [11:03] what is the heartbeat for presence ? [11:03] we should put some thought into avoiding harmonic feedback in all these periodic loops [11:06] shit, we're not even at 1000 instances [11:06] it's been running for 3 hours ...
[11:06] testing this thing is a job for life :) [11:08] rogpeppe1: hey, how about a suggestion about better help doc for upgrade-charm --switch? [11:23] rogpeppe1: http://paste.ubuntu.com/5604256/ [11:23] at the 1000 node mark, the api server is unusable [11:23] dimitern: ah, will do. sorry, bit distracted currently as some old pipes have just sprung a leak in our kitchen and i've had to turn the main water supply off [11:23] or something maybe mongo [11:23] rogpeppe1: wow.. [11:23] maybe the thing after that [11:23] crap [11:24] davecheney: isn't that the mongo, not the API server? [11:25] rogpeppe1: really not sure [11:25] rogpeppe1: "To manually specify the charm URL to upgrade to, use the --switch argument. [11:25] It will be used instead of the service's current charm newest revision. [11:25] Note that the given charm must be compatible with the current one, e.g. [11:25] i guess it is looking in the db [11:25] it must not remove relations the service is currently participating in, [11:25] and no settings types can be changed. This *is dangerous* and you should [11:25] know what you are doing." [11:25] to find the address of the instance [11:25] it could also be blocked waiting for the provider to return some data [11:26] but we've used up all our quota with the provider === ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: | Bugs: 2 Critical, 64 High - https://bugs.launchpad.net/juju-core/ === ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: | Bugs: 3 Critical, 63 High - https://bugs.launchpad.net/juju-core/ [11:33] wallyworld_: mumble? [11:33] dimitern: i just got back from soccer, i'll be a minute [11:37] dimitern: can an upgraded charm have fewer config settings than the old one? [11:38] rogpeppe1: let me check [11:39] does anyone know if nova list has a limit on the number of rows it returns ? [11:42] https://bugs.launchpad.net/nova/+bug/1166455 ?
[11:42] <_mup_> Bug #1166455: nova flavor-list only shows 1000 flavors [12:00] rogpeppe1: well, it seems the old config settings should remain, but you can add new ones [12:00] dimitern: ok, that seems good [12:02] dimitern: http://paste.ubuntu.com/5604375/ [12:04] rogpeppe1: sgtm, thanks [12:24] rogpeppe1: so how to test both local: and cs: urls? start a http server mocking the store and set that to charm.Store? [12:40] dimitern: good question. [12:41] dimitern: sorry, still distracted, trying to get hold of a plumber [12:41] rogpeppe1: i'll propose it without that, for now [12:50] hi guys, I'm getting this error in the bootstrap node when bootstrapping on canonistack: [12:50] ERROR worker: loaded invalid environment configuration: required environment variable not set for credentials attribute: User [12:50] full logs at http://pastebin.ubuntu.com/5604481/ [12:50] any ideas? [12:51] "juju status" on my laptop just hangs [12:52] ahasenack: try running juju status --debug -v [12:53] dimitern: hm [12:54] dimitern: http://pastebin.ubuntu.com/5604493/ [12:54] security group issue? [12:55] it connects over there (localhost), so there is something listening on that port [12:55] ahasenack: it seems it cannot connect to mongo - is it running? [12:55] root@juju-canonistack-machine-0:~# telnet localhost 37017 [12:55] Trying 127.0.0.1... [12:55] Connected to localhost. [12:55] Escape character is '^]'. [12:55] something is, I assume it's mongo [12:55] tcp 0 0 0.0.0.0:37017 0.0.0.0:* LISTEN 27573/mongod [12:55] yep [12:56] ahasenack: so you can connect from machine 0 to mongo, but not from outside? 
[12:56] right [12:56] I'm checking the security group rules [12:56] ahasenack: yeah, good idea [12:57] dimitern: ah, I know [12:57] dimitern: the rules are ok [12:58] dimitern: it's the public ip thing, on the private ip only ssh is routed through [12:58] dimitern: I'll fire up sshuttle and that should sort it [12:59] dimitern: yep, worked now, thanks [12:59] the errors in the logs were misleading me [12:59] ahasenack: you can also try setting the "use-floating-ip" to true in env config [12:59] yep [12:59] ahasenack: but knowing the shortage of floating ips on canonistack, it might fail anyway [13:00] yes, I will stick with sshuttle, works well enough for my testing [13:19] rogpeppe1: hi, I see that https://bugs.launchpad.net/juju-core/+bug/1172717 is still open, but the branch is merged [13:20] <_mup_> Bug #1172717: juju-log does not accept --log-level [13:20] rogpeppe1: is it fixed in trunk? [13:33] ahasenack: i think so; let me check [13:34] ahasenack: yes [13:34] rogpeppe1: will that trigger a new ppa build? I still only see the version with the bug [13:35] rogpeppe1: also, does it require a new "tools" build? [13:35] ahasenack: i don't think so. i think the patch needs to be back ported [13:35] rogpeppe1: I'm using this ppa: http://ppa.launchpad.net/juju/devel/ubuntu/ [13:35] ahasenack: we haven't yet worked out best practice in that respect - we're still feeling our way [13:35] I thought that was trunk [13:36] ahasenack: the tools still need to be pushed to the public bucket [13:36] ahasenack: because that's where they're pulled from, not the ppa [13:36] rogpeppe1: the bug actually depends more on the tools than on the new deb [13:36] ok [13:37] and that does not happen with every commit?
[13:37] I guess there needs to be a concept of "stable" and "devel" tools [13:37] ahasenack: there is that concept [13:37] ahasenack: if the minor version is odd, it's a devel version [13:38] ahasenack: i think we probably need to automate our pushing to the public bucket [13:38] rogpeppe1: but are they in separate buckets? [13:38] ahasenack: no, there's only one public bucket [13:38] ahasenack: (for any given environment, that is) [13:38] ok, so if you push to that bucket with every commit, like a "daily", you risk breaking production users [13:39] with the ppa at least you have a distinction about what is "stable" and what is "devel" or "daily" [13:39] ahasenack: only if we push versions with an even minor version number, i think [13:39] rogpeppe1: so how do you test trunk, you use --upload-tools all the time? [13:39] ahasenack: the idea is that we always develop against an odd minor version (currently we're developing against 1.11) [13:39] ahasenack: yes [13:40] rogpeppe1: like my case now, I was going through all the openstack charms and seeing if they deploy with juju-core trunk, and filing bugs where appropriate (some in openstack charms, some in juju) [13:40] rogpeppe1: but I can't test a "trunk" build of juju-core, because it's not there, I'm stuck with the version with the bug :) [13:40] ahasenack: you could use upload-tools [13:41] last time I tried it exploded, I emailed the list [13:41] I will wait for a new package in the devel ppa, and new tools :) [13:42] ahasenack: there have been some significant issues fixed since then. it *should* work fine. [13:42] ahasenack: in particular, it shouldn't pick incompatible tools if you've uploaded some, which was probably the cause of the explosion before [13:45] rogpeppe1: I think my problem is more basic than that... http://pastebin.ubuntu.com/5604658/ [13:45] what does "no go source files" mean [13:46] ahasenack: try go get -v launchpad.net/juju-core/... [13:46] rogpeppe1: the "..." are for real?
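[Editor's note: the odd-minor convention described above is easy to encode. A sketch; the function name is invented and the version strings are just examples of the scheme:]

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isDevel implements the convention described above: an odd minor
// version number marks a development release (e.g. 1.11.x), an even
// one a stable release (e.g. 1.10.x).
func isDevel(version string) bool {
	parts := strings.Split(version, ".")
	if len(parts) < 2 {
		return false
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false
	}
	return minor%2 == 1
}

func main() {
	fmt.Println(isDevel("1.11.0"), isDevel("1.10.2")) // → true false
}
```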
[13:46] ahasenack: there are no source files in the juju-core root directory [13:46] ahasenack: yes [13:46] ahasenack: it's a wildcard [13:46] !! [13:47] ahasenack: from "go help packages": http://paste.ubuntu.com/5604667/ [13:47] rogpeppe1: ok, that changes things, thanks, I'll go on from here [13:48] ahasenack: if the wildcard was '*', you'd have to quote the names all the time [13:48] ahasenack: and '*' usually doesn't match multiple levels of directory [13:50] ahasenack: cool; please let us know when things go wrong, or are awkward to understand - it's nice to get feedback from people that aren't used to walking around the holes in the road. === wedgwood_away is now known as wedgwood === gary_poster is now known as gary_poster|away [14:00] m_3 ping === flaviami_ is now known as flaviamissi [14:29] i'd appreciate a review on https://codereview.appspot.com/8540050 [14:29] rogpeppe1: --upload-tools worked, and I verified that that -l/--log-level bug is indeed fixed [14:29] rogpeppe1: ^^ [14:29] * dimitern bbi30m [14:29] ahasenack: lovely, thanks for giving it a go [14:29] dimitern: ok, will look in a little bit === gary_poster|away is now known as gary_poster [15:06] dimitern: reviewed [15:11] rogpeppe1: cheers [15:13] davecheney: pong [15:52] hi, I got this error when deploying cinder with juju-core, is this a change between pyjuju and gojuju? http://pastebin.ubuntu.com/5605085/ [15:54] hmm, interesting [15:55] ahasenack: do you know what hook that was running in? [15:55] rogpeppe1: install I think, this was just before, and I was really installing it only [15:55] 2013/04/26 15:51:25 DEBUG worker/uniter/jujuc: hook context id "cinder/0:install:79731491855068321"; dir "/var/lib/juju/agents/unit-cinder-0/charm" [15:55] rogpeppe1: wait, let me paste more context [15:55] ahasenack: hmm, so which relation did the code expect to be set there? [15:56] ahasenack: given that the install hook isn't associated with a relation. 
[15:56] http://pastebin.ubuntu.com/5605098/ [15:56] the install had failed before, i had to run a few juju set foo=bar to fix a config and then resolved --retry [15:57] ahasenack: i think we could do with even more context actually [15:57] I'm not sure what it was trying to set [15:57] ok [15:57] let me get the whole file [15:58] rogpeppe1: http://pastebin.ubuntu.com/5605109/ [15:59] ahasenack: right, it's running the install hook [15:59] ahasenack: i think it's reasonable that relation-related commands can fail in that circumstance, but i'd be interested to know what the charm was actually trying to do [16:00] let me see what it does [16:00] ahasenack: perhaps we should just ignore untoward relation-related commands [16:01] rogpeppe1: I found two relation-set commands that match that log [16:01] rogpeppe1: one specifies a relation id :) [16:01] ahasenack: :-) [16:01] looks like a bug [16:01] ahasenack: looks that way to me [16:01] the one that doesn't is in keystone_joined() (!!) [16:01] relation-set service="cinder" \ [16:01] region="$(config-get region)" public_url="$url" admin_url="$url" internal_url="$url" [16:02] rogpeppe1: ok, thanks, I'll take it from here [16:02] ahasenack: if charms are doing this commonly though, and the python allowed it, we should perhaps consider letting it through and ignoring it [16:02] ok [16:03] I will debug this one, see how it ended up running keystone_joined() in the install hook [16:03] and then if we can get and use a relation id [16:07] anyone know of a decent way of inserting nicely formatted code fragments into a gmail mail? [16:09] or a google doc for that matter [16:26] hi, I have a feeling that juju deploy --config file.yaml isn't working, it's not taking the options from file.yaml [16:27] before I debug further, is this a known issue? 
[16:29] juju set --config file.yaml also didn't work, but juju set key=value did [16:34] https://bugs.launchpad.net/juju-core/+bug/1121907 [16:34] <_mup_> Bug #1121907: deploy --config [16:34] ahasenack: I think deploy doesn't accept --config yet [16:34] The option is there, but the bug is still open [16:34] ahasenack: or more likely it ignores it [16:34] yep, looks like it [16:35] rogpeppe1: bugging you one last time: https://codereview.appspot.com/8540050 [16:35] juju get works, but there is also a bug for it, still open [16:35] weird [16:36] ahasenack: we've been fixing lots of bugs - not all of them have necessarily been marked as such... [16:37] ok [16:37] dimitern: why call repo.Latest at all if we've got a specified revision number? [16:37] dimitern: it's a potentially slow operation [16:38] rogpeppe1: it doesn't seem slow - it just changes the rev in the curl [16:39] dimitern: no it doesn't - it calls CharmStore.Info, which makes an http request [16:39] rogpeppe1: only for a local repo it does get, but this shouldn't be slow at all, the CS does not fetch anything on Latest [16:39] dimitern: resp, err := http.Get(s.BaseURL + "/charm-info?charms=" + url.QueryEscape(key)) ? [16:40] rogpeppe1: it's not the charm that's downloaded here, just the metadata [16:40] dimitern: looks like it's fetching something to me [16:40] rogpeppe1: it's essentially an HTTP HEAD [16:40] dimitern: sure, but it's still making an unnecessary network request for no particularly good reason. surely it's easy to avoid? [16:40] rogpeppe1: yeah, i suppose.. [16:41] rogpeppe1: but despite this the logic is now sound, right?
[16:41] dimitern: i stopped there, but will continue looking, one mo [16:41] rogpeppe1: i'll just move the Latest call in an else block after checking the other two cases [16:42] dimitern: that was what i was just thinking [16:42] rogpeppe1: sorry, haven't seen it like this [16:42] rogpeppe1: thanks [16:42] dimitern: you might even consider making it a bool switch [16:42] rogpeppe1: i did something like that, but it looked ugly, so i got rid of it [16:43] dimitern: np; three cases is marginal === gary_poster is now known as gary_poster|away [16:45] dimitern: i'm still not sure the logic is quite right, even making that change [16:46] rogpeppe1: why? [16:46] dimitern: don't we want to do a bump revision if the switch url is specified without a revno ? [16:46] rogpeppe1: I don't believe so [16:47] dimitern: william said this, and i agree: [16:47] Hmm. I suspect that bump-revision logic *should* apply when --switch is given [16:47] with a *local* charm url *without* an explicit revision. Sane?
[16:47] rogpeppe1: that's the user being explicit anyway, so we'll do what he asks, and probably knows what he's doing [16:48] rogpeppe1: I still disagree [16:48] dimitern: as there's no way to explicitly specify bump-revision, i think we should make the default logic work [16:48] rogpeppe1: this is like --force - "do exactly what i'm telling you to do, no smart tricks" [16:48] dimitern: hmm, you said "Done" in response to that sentence before - you didn't seem to disagree [16:49] dimitern: if you don't specify a revision number, you're saying "please choose an appropriate revision number for me" [16:49] dimitern: i think we should make that path work [16:49] rogpeppe1: done, meaning all the rest - except that, i should've been clearer perhaps [16:49] rogpeppe1: there's no way *not* to bump the revision otherwise [16:50] rogpeppe1: and why should we do it - it's a different charm, so no conflicts would apply (hopefully) [16:50] dimitern: sure there is - specify a revision number, no? [16:50] dimitern: it's a different charm, but we may already have another version of the one we're switching to [16:51] dimitern: it's not unlikely, in fact, if we're calling switch on multiple services [16:51] rogpeppe1: on the same service? [16:51] rogpeppe1: we can call it only on one service at a time [16:52] dimitern: yes, but bump-revision isn't about the service, is it? it's about the charms stored in the state, which are independent of the services that use them [16:52] rogpeppe1: so you think bumping revision on switch without explicit rev will be straightforward to understand from the user's point of view?
[16:52] dimitern: yes [16:53] dimitern: because it's the behaviour they're used to when deploying with a local charm url [16:53] rogpeppe1: ok, i'll do it, but i'm still not convinced it's right [16:54] dimitern: i think automatic bump-revision for any local charm is correct, as who knows what relationship the local charm bears to the one that's previously been uploaded? [16:57] rogpeppe1: fair enough === gary_poster|away is now known as gary_poster [17:17] rogpeppe1: so when you have svc "riak", running charm "riak-7" and you upgrade it to "local:myriak" (no exp. rev, final result: "local:precise/myriak-7"), and then upgrade it again to "local:myriak", should the rev be bumped to "local:myriak-8" ? [17:17] dimitern: yes, i think so [17:18] rogpeppe1: yeah, that's what I thought, adding a test for that now [17:57] i'm off, happy weekend to everyone! [18:00] rogpeppe1: about the earlier conversation about relation set and relation id, it looks like it's very common to not specify a relation id in pyjuju [18:00] two charm authors I spoke with said so, and the "manpage" for relation-set in pyjuju says it's optional (as is everything else, so I don't trust that help doc very much: https://pastebin.canonical.com/90111/) [18:10] ahasenack: it is optional, in relation-related hooks [18:10] ahasenack: but in a non-relation hook, what could it possibly default to? [18:11] ah, so it is optional in gojuju [18:11] ok, I'll debug further [18:15] right, eod and start of weekend for me here [18:15] happy weekends all [18:19] bye rogpeppe1, enjoy === wedgwood is now known as wedgwood_away