/srv/irclogs.ubuntu.com/2011/10/11/#juju.txt

jason_SpamapS, I'm getting an invalid ssh key now -- I did juju status, and it told me the key had been changed (from my many reinstalls no doubt) and asked to accept or no -- I said no, meaning to cancel out and delete the known hosts file and retry, and now it's Invalid SSH key each time00:16
jason_ok, I copied all my .ssh files from the client I'd been working on to the server -- seems to have gotten me past that bit00:24
jason_juju status completed -- looks like my first system is in place00:30
hazmatjason_, woot!00:52
_mup_juju/go-store r18 committed by gustavo@niemeyer.net01:08
_mup_Implemented URL.WithRevision.01:08
jason_hazmat, mysql deploy success, too...01:11
niemeyerjason_: ho ho01:26
_mup_juju/go-store r19 committed by gustavo@niemeyer.net03:58
_mup_New store package with AddCharm and OpenCharm interface.03:58
_mup_The interface to the package is trivial, but internally it actually03:58
_mup_handles all the necessary logic for concurrent runs of the algorithm,03:58
_mup_including mongo-based atomic locks with expiration, multi-URL synchronous03:58
_mup_revision bumping as described in the charm specification, GridFS-based03:58
_mup_memory-friendly uploading for large files, and ponies too.03:58
_mup_Lacks documentation and sha256 handling, though.. but I need some sleep.03:58
niemeyerNight all04:11
_mup_juju/expose-retry r402 committed by jim.baker@canonical.com06:04
_mup_Support retrying port mgmt ops in periodic machine check06:04
_mup_Bug #872164 was filed: [Oneiric] Cannot deply services - store.juju.ubuntu.com not found <juju:New> < https://launchpad.net/bugs/872164 >08:21
jamespagemorning - I took the liberty of pointing the bug reporter for bug 872164 in the right direction and marked the bug as invalid08:48
_mup_Bug #872164: [Oneiric] Cannot deply services - store.juju.ubuntu.com not found <juju:Invalid> < https://launchpad.net/bugs/872164 >08:48
fwereade_thanks jamespage, I just saw, much better response than mine08:52
jamespagefwereade_, np08:52
jamespageI think I must be missing something: should the stop hook be called when a unit is removed from a service using remove-unit?09:37
rogwhere can i find documentation for txaws?11:19
rogoops, LMGTFY11:19
hazmatgood morning12:07
hazmatfwereade_, the docs still look out of date.. https://juju.ubuntu.com/docs/user-tutorial.html#deploying-service-units12:09
hazmati think jimbaker mentioned yesterday they weren't regenerating12:09
hazmatjamespage, on bug 871966 when you say local juju environment you mean a local provider?12:15
_mup_Bug #871966: FQDN written to /etc/hosts causes problems for clustering systems <cloud-init (Ubuntu):Confirmed> <cassandra (juju Charms Collection):New> < https://launchpad.net/bugs/871966 >12:15
hazmatjamespage, the stop hook is not called12:17
hazmatjamespage, pretty much everything that deals with remove/destroy works one level up from the supervisor of the thing being killed12:18
hazmatwith the notion that even if the thing is AWOL, the action will happen12:18
roghazmat: hiya12:20
hazmatrog, txaws is pretty much UTSL for most questions imo12:20
roghazmat: yeah, i discovered that. thanks.12:22
rogfoundations of sand :-)12:22
hazmatrog, not really.. it's well tested. but yeah.. it's a consequence of using twisted, vs using the standard python library for aws (boto)12:24
roguh huh12:24
hazmathmm.. interesting12:25
jamespagehazmat: the comment on bug 871966 does refer to the local provider - but that provides an IP address for private-address anyway12:31
_mup_Bug #871966: FQDN written to /etc/hosts causes problems for clustering systems <cloud-init (Ubuntu):Confirmed> <cassandra (juju Charms Collection):New> < https://launchpad.net/bugs/871966 >12:31
hazmatjamespage, yup and private-address==public-address there12:31
hazmatand it shows up in juju status12:31
jamespagehazmat: I now have something that works with the local provider, and on ec2 and openstack12:32
hazmatjamespage, nice12:32
hazmatjamespage, comments about the local provider probably aren't relevant on a cloud-init bug, since the local provider doesn't use cloud-init.. fwiw12:32
jamespagehazmat: they more referred to the fix for cassandra12:33
hazmatah.. ic. its linked12:33
jamespageyep12:33
jamespagehazmat: with regards to units leaving a service/not calling stop I was trying to figure out the best way to remove a node from a cassandra cluster12:36
jamespagebecause the node does not get shutdown, it remains in the ring12:36
=== plars-holiday is now known as plars
robbiewrog: ping12:45
rogrobbiew: pong12:46
robbiewrog: have you registered for UDS?12:46
rogrobbiew: i think so.... but i'll just check12:46
rogrobbiew: yes, i have12:47
robbiewrog: -> http://uds.ubuntu.com/register/ :)12:47
rogrobbiew: i did it on 15th Sep...12:47
rogand flights all booked too12:47
robbiewrog: hmm, okay.  I'll talk to our admins then, thx12:48
rogrobbiew: at any rate, i've got a confirmation email from marianna12:49
rogrobbiew: i'll just check the web site directly12:49
robbiewrog: ah, cool12:50
robbiewnevermind then12:50
robbiew:)12:50
rogrobbiew: ah, maybe i didn't register on the linaro web site. i think i only did the UDS registration.12:53
hazmatjamespage, hmm12:55
hazmatjamespage, yeah.. i guess we really should be calling stop on units12:55
jamespagehazmat: I need to deal with two scenarios - one where its a controlled removal12:56
jamespageand one where the node goes AWOL12:56
hazmatjamespage, pls file a bug12:56
hazmati can look at that today12:56
jamespagehazmat: ack - doing now12:56
hazmatfor stopping a machine its almost irrelevant, since we shutdown the machine, but for a unit if we don't call stop, there isn't any thing to keep it from continuing to run12:57
hazmatat least till all units are containers12:57
hazmatand then the container is killed12:57
robbiewrog: UDS is all you need ;012:57
robbiew;)12:57
rogrobbiew: ok, i'll ignore the FAQ then...12:58
hazmatbut we really can't do the latter on ec2, till we figure out some magical networking solution, or stop doing dynamic port management12:58
hazmatunless we assume a single unit per machine in ec2 and do a targeted forward rule per exposed port13:00
_mup_Bug #872264 was filed: stop hook does not fire when units removed from service <juju:New> < https://launchpad.net/bugs/872264 >13:04
jamespagehazmat: ^^13:05
jamespageI tried to document the two challenges I have specifically with the cassandra charm13:05
hazmatjamespage, thanks13:05
jamespageI guess they may apply to other charms that have similar ring storage methods13:06
hazmatjamespage, so on 2) and 1) the other units should both detect the removal13:07
jamespagehazmat: yes - they do13:07
rogjust realised that "canonical/linaro employee" means "(canonical AND linaro) employee" not "(canonical OR linaro) employee"...13:07
rogdoh13:07
jamespagehazmat: and I could use the hook on the remaining nodes to deal with both situations13:11
jamespageI would need to write it such that only one node completes the action13:11
* jamespage thinks about that one13:12
* SpamapS awakens.. far too early13:15
niemeyerGood morning all13:17
rogniemeyer: yo!13:18
SpamapSjamespage: I think there's another bug asking for similar functionality..13:18
SpamapSjamespage: bug 86242213:19
_mup_Bug #862422: Provide a way for services to protect units during dangerous operations <juju:Confirmed> < https://launchpad.net/bugs/862422 >13:19
SpamapSjamespage: swift is a similar ring service and has times where adding or removing is a bad idea13:20
jamespageSpamapS, agreed - it looks very similar13:21
SpamapSDoes seem like the stop hook should handle this13:27
jamespageSpamapS: it would do for controlled removal13:27
SpamapSjamespage: not sure I understand the AWOL case13:28
jamespageSpamapS, thats more of a housekeeping case13:28
jamespagein cassandra if you never moved entries for nodes that had gone away ('Down' status) it gets very crufty13:29
jamespagealso you want to ensure that loadbalancing etc.. get re-adjusted as the node won't be coming back13:29
hazmatjamespage, but don't you get a departed event at all other nodes when one goes AWOL?13:29
jamespageSpamapS, yes13:29
jamespagesorry - I mean hazmat13:29
* hazmat checks the bug report13:30
SpamapSjamespage: yeah that should be detected in the peer relations13:30
SpamapScassandra has a prescribed procedure for removing a dead node from the ring13:31
rogniemeyer: i'm porting the ec2 launch code and i'm not sure how goamz's AuthorizeSecurityGroup is supposed to work the way it's being used in the python code. here's a comparison: http://paste.ubuntu.com/706060/13:31
jamespageSpamapS, it does13:31
SpamapSso on departed.. you would run that procedure for the departed unit13:32
hazmatjamespage, so in the case of 1) the desire is for the actual termination of the unit to hang till the stop (which is potentially a long running op) completes?13:32
hazmatand of course to execute stop as part of 113:32
jamespagehazmat: ideally yes13:32
jamespageSpamapS: what information is provided when the -departed hook fires about the remote service unit?13:33
hazmatjamespage, doesn't the same problem exist in reverse when adding units.. as i recall for cassandra (might be outdated), you're supposed to only add a single unit at a time13:33
niemeyerrog: Looks like there's a protocol setting missing13:33
hazmatjamespage, just the unit name and that it departed13:33
niemeyerrog: Check out the docs and the implementation13:33
SpamapShazmat: +1 for that, let stop be proactive about locally stored data13:34
hazmatSpamapS, niemeyer g'morning13:34
rogniemeyer: the python code doesn't seem to set a proto - i was just checking that it wasn't an obvious bug13:34
* hazmat just up the ante on his war against rodents, bring in the exterminator13:34
* SpamapS wishes the time would change, its pitch black here in LA at 6:30am :-P13:34
niemeyerrog: Maybe it has a default?13:34
SpamapSwe're porting the ec2 launch code?13:34
jamespagehazmat, there is a restriction on adding units - N+N rather than N+113:35
rogniemeyer: it seems to have two distinct modes of operation13:35
rogthere's no obvious default in the python code13:35
rogi'll recheck though13:35
niemeyerrog: They're both backed by the same implementation13:35
niemeyerrog: The same API13:35
niemeyerrog: If one of them is failing, the call is different.. just figure how it's different and you'll understand the problem13:35
SpamapShazmat: bug 862422 has a case where swift requires that nodes wait to be added until rebalance is done13:36
_mup_Bug #862422: Provide a way for services to protect units during dangerous operations <juju:Confirmed> < https://launchpad.net/bugs/862422 >13:36
jamespageSpamapS, hazmat: Cassandra has a similar requirement13:37
hazmathmm13:37
SpamapSIts not that hard on the add-unit case though13:37
SpamapSyou can error out the joined event13:37
hazmatthey can't really scan for a rebalance attribute since its being set by the same hook that's doing it13:37
hazmatand the hook values are only flushed at the end of the hook13:37
SpamapSand admins will just have to resolve --retry13:37
SpamapShazmat: the services should protect themselves13:38
SpamapShazmat: there's somewhere that an admin has to look to see if a re-balance is going on13:38
SpamapSthats where the hook should look13:38
hazmatSpamapS, there isn't any service level logic.. atm.. its got to be what the units can coordinate among themselves13:38
jamespageso - just to flip back to my -departed thinking13:39
jamespageATM I will need to a) detect which node needs to be removed from the ring13:39
SpamapShazmat: yeah, I don't think preventing it is juju's problems. Handling failures gracefully should be all it needs to do.13:39
jamespageand b) elect which of the remaining units is going to execute the removal13:40
jamespagein the -departed hook13:40
SpamapSThough this does go back to the --wait argument where as an admin I'd like to get feedback from the command's intended actions.13:40
hazmatjamespage, so a leader election/detection cli api for hooks13:41
jcastroDoes anyone want to volunteer to do a juju session for ubuntu openweek? https://wiki.ubuntu.com/UbuntuOpenWeek13:41
rogniemeyer: hmm, it looks like the python code is using an undocumented feature of aws.13:42
jamespagehazmat, that would be nice13:42
jamespageas it would prevent some fragile hack in the charm hook13:42
hazmatrog, that api has several different spellings, they are documented13:42
jamespageI'm doing something similar at the moment for unit bootstrapping - which is not 100% reliable13:43
jamespagewhen units join the peer relation13:43
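
A minimal sketch of the election jamespage describes for the -departed hook, assuming every remaining unit sees the same peer list and that unit names look like cassandra/3. The function names are hypothetical; the discussion suggests juju did not yet offer a leader-election API for hooks.

```python
def unit_number(unit_name):
    """Return the numeric suffix of a unit name such as 'cassandra/3'."""
    return int(unit_name.rsplit("/", 1)[1])


def elect_leader(remaining_units):
    """Pick one unit deterministically: the one with the lowest unit number.

    Every unit that runs this over the same list reaches the same answer,
    so no extra coordination is needed as long as the lists agree.
    """
    return min(remaining_units, key=unit_number)


def should_run_removal(local_unit, remaining_units):
    """True only on the single unit elected to remove the departed node."""
    return local_unit == elect_leader(remaining_units)


if __name__ == "__main__":
    peers = ["cassandra/10", "cassandra/2", "cassandra/7"]
    for unit in peers:
        print(unit, should_run_removal(unit, peers))
```

Comparing the numeric suffix rather than the raw string matters: a plain string comparison would rank cassandra/10 ahead of cassandra/2. The sketch is only as reliable as the assumption that all units see identical membership when the hook fires, which is the fragility jamespage mentions.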
SpamapSjcastro: I'm down for it.13:43
jcastroSpamapS: can you claim a block please?13:43
jcastroSpamapS: I'll do it with you if you want13:44
SpamapSYeah at least be there to help me with the bot. ;)13:44
hazmatrog, txaws is a poor reference impl to look at.. https://github.com/boto/boto/blob/master/boto/ec2/connection.py#L191713:44
hazmatis much better at api coverage and docs, notice right above that impl there is support for a deprecated mechanism with slightly different spelling13:44
lynxmanhazmat: SpamapS: got the juju macports done and working, just a versioning question, let me paste here the versions of the python packages I'm using and let me know which ones would you deem as "need upgrading"13:45
roghazmat: the name "SourceSecurityGroupName" is used as a parameter. i'd have thought that should be documented in http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/index.html?ApiReference-query-AuthorizeSecurityGroupIngress.html13:46
roggiven that seems to be the entry point.13:46
lynxmanargparse (1.2.1), zookeeper (3.3.0), python-regex (0.8.0), python-txaws (0.2), pydot (1.0.25), python-argparse (1.2.1)13:47
lynxmanhazmat: maybe we should upgrade txaws?13:49
rogniemeyer: looks like a new entry point is warranted. perhaps the original call would be better named AuthorizeSecurityGroupIP. hmm.13:51
hazmatrog, it's quite possible txaws is not targeting the latest api13:51
hazmatrog, actually highly likely given its lack of dev13:52
roghazmat: txaws has the call. as does boto. but the AWS documentation doesn't mention that variant AFAICS13:52
rogit looks like all the language APIs have that variant. do you know what it's actually doing? authorizing one group with the privileges of another?13:54
rogthat would be my guess, but it would be nice to know for sure, so that i can choose a good name.13:54
hazmatrog, aws supports both because they have a versioned api, boto has separate implementations for each version one marked deprecated.13:56
hazmatrog, it is documented, but not under the latest version of the api docs which document the latest13:56
jcastroSpamapS: which slot do you want?13:57
lynxmanhazmat: so what do you reckon :)13:57
roghazmat: ah, so... we have to ask: what's the equivalent of that old call in the new API?13:57
rogi'll try and find the old docs13:58
hazmatlynxman, so txaws doesn't have a release with the openstack fixes atm13:58
hazmatand i should probably push out a new version of txzookeeper13:59
hazmatlynxman, give me a moment, i'll cut releases for both13:59
lynxmanhazmat: cool :)13:59
hazmatlynxman, besides that.. what's python-regex?13:59
hazmatlynxman, we use the builtin re module not a third party lib13:59
hazmatunless a dep needs it like pydot..14:00
hazmatrog, it should be pretty clear from context how to translate14:00
lynxmanhazmat: I can drop it as a dependency then, pydot has its own :)14:02
roghazmat: perhaps. this page talks about a "user/group pair permission", but perhaps that's just code for "allow all IP access". http://docs.amazonwebservices.com/AmazonEC2/dg/2007-01-03/ApiReference-Query-AuthorizeSecurityGroupIngress.html14:03
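
A rough sketch of the two request spellings under discussion, written as plain dictionaries of query parameters. The parameter names are given as recalled from the EC2 Query API documentation (the older flat form with SourceSecurityGroupName, and the later IpPermissions.N form); treat them as assumptions and check them against the API version being targeted. The helper function names are hypothetical and belong to neither txaws, boto, nor goamz.

```python
def legacy_group_params(group, source_group, source_owner_id):
    """Older flat spelling: grant access to members of another group."""
    return {
        "Action": "AuthorizeSecurityGroupIngress",
        "GroupName": group,
        "SourceSecurityGroupName": source_group,
        "SourceSecurityGroupOwnerId": source_owner_id,
    }


def legacy_cidr_params(group, protocol, from_port, to_port, cidr):
    """Older flat spelling: grant access to an IP range on a port range."""
    return {
        "Action": "AuthorizeSecurityGroupIngress",
        "GroupName": group,
        "IpProtocol": protocol,
        "FromPort": str(from_port),
        "ToPort": str(to_port),
        "CidrIp": cidr,
    }


def ip_permissions_params(group, protocol, from_port, to_port,
                          cidr=None, source_group=None, source_owner_id=None):
    """Later spelling: one indexed IpPermissions entry covers both cases."""
    params = {
        "Action": "AuthorizeSecurityGroupIngress",
        "GroupName": group,
        "IpPermissions.1.IpProtocol": protocol,
        "IpPermissions.1.FromPort": str(from_port),
        "IpPermissions.1.ToPort": str(to_port),
    }
    if cidr is not None:
        params["IpPermissions.1.IpRanges.1.CidrIp"] = cidr
    if source_group is not None:
        params["IpPermissions.1.Groups.1.GroupName"] = source_group
    if source_owner_id is not None:
        params["IpPermissions.1.Groups.1.UserId"] = source_owner_id
    return params
```

Read this way, the group form does not copy one group's privileges onto another; it admits traffic coming from instances that belong to the source group, which is one plausible basis for naming the two entry points differently.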
hazmatlynxman, so python-txzookeeper 0.8.0 is needed as well14:04
hazmatlynxman, and zookeeper 3.3.3 .. there are definitely bug fixes in the py bindings we need14:05
lynxmanhazmat: alright, I'll upgrade both then, ty14:05
hazmatlynxman, np.. the latest pypi release for txzookeeper looks good, off to push out a 0.2.1 txaws release14:06
lynxmanhazmat: lovely, thanks! :D14:06
rogtcp port numbers are 16 bit even with IPv6, right?14:17
* niemeyer looks at rog with the eye14:18
rogok, ok, i should know that.14:19
SpamapSjcastro: sorry, family stuff, I'll grab one in the next 2 hrs14:36
rogniemeyer: just checking: have you already written some Go code to parse environments.yaml?14:41
niemeyerrog: No, that was the first bit I suggested you could start with14:41
rogok, cool14:42
rog(BTW the instance starting and group set up code is all working now)14:42
niemeyerrog: Please follow the existing convention in the charm package14:42
niemeyerrog: Wow, neat!14:42
niemeyerrog: How're you testing it?14:42
rogniemeyer: it's just a stub file currently, no tests written so far14:43
niemeyerrog: Heh14:43
niemeyerrog: So there's nothing..14:43
rogniemeyer: just running it and going to the aws console to check14:43
niemeyerrog: :)14:43
niemeyerrog: Please write tests with the logic, rather than retrofitting them14:43
niemeyerrog: We should follow a similar model to what was done with goamz itself14:44
niemeyerrog: Rather than the mocking craziness we have in the Python side14:44
rogniemeyer: yes, tests are the next thing i'm putting in. the code isn't even in a package yet.14:44
niemeyerrog: Ok, it's a spike then14:45
rogniemeyer: a spike?14:45
niemeyerrog: yeah, a  temporary hack to get a feeling of the problem14:45
rogniemeyer: yeah, although i've ported a lot of the logic from the original python, so it should be trivial to do it right.14:45
rogniemeyer: this is all i've got so far: http://paste.ubuntu.com/706139/14:46
niemeyerrog: Nice14:48
hazmatlynxman, latest txaws release @ http://launchpad.net/txaws/trunk/0.2/+download/txAWS-0.2.1.tar.gz14:50
lynxmanhazmat: lovely, thanks :)14:50
rogniemeyer: what's the best approach to testing with ec2? actually interact with ec2 directly?14:50
niemeyerrog: No, we can follow a similar model from goamz14:53
rogok, i'll have a look.14:53
rogniemeyer: BTW is this the only spec for the environment yaml? https://juju.ubuntu.com/docs/getting-started.html#configuring-your-environment14:55
niemeyerrog: Please read the Python code14:55
rogok14:55
lynxmanhazmat: new ports submitted, contacted one of the maintainers and it's *possible* that juju will be in the archive by next week15:05
hazmat`lynxman, sweet!15:07
=== hazmat` is now known as hazmat
SpamapSlynxman: is there an artifact somewhere where I can test and provide positive feedback to the maintainers?15:26
lynxmanSpamapS: I can send you my portindex branch if you want15:27
jimbakerSpamapS, this branch should hopefully fix the problem you saw on openstack with expose failing: lp:~jimbaker/juju/expose-retry15:45
SpamapSHah, I love this code15:53
SpamapSself.mocker.call(simulate_random_failure)15:53
SpamapS:)15:53
SpamapSjimbaker: indeed that should retry those ops. There are many others.. I think we just have to get defensive about txaws15:54
* hazmat lunches15:59
niemeyerI'm off to lunch too.16:00
jimbakerSpamapS, :). we need to be defensive about txaws because it needs work and it necessarily deals with bad stuff. in general, txaws will fail early, if it has a bad payload it can't parse16:06
jimbakerfor commands like destroy-environment that can be repeated, this may be ok. for agents, we need to do retries16:07
jimbakeri'm pretty certain that the provisioning agent retry mechanism (ignoring that it's a SPOF for now) seems to be robust, so long as we have errbacks defined such that stuff doesn't just stop. in the case of expose, the only place where txaws can be called is that one method (open_close_ports_on_machine), so trapping there and then using the existing resync mechanism for retries would seem to suffice16:10
SpamapSAre there any operations that the provisioning agent does w/ txaws where it shouldn't retry on error?16:11
SpamapSexpose/unexpose was just the most common fail we had16:12
SpamapSthere were others16:12
SpamapSany time listing instances returned empty ... things were likely to just grind to a halt16:12
jimbakerSpamapS, i suspect the problem with that is seen here: http://pastebin.ubuntu.com/706206/, specifically lines 17-2116:14
jimbakeri need to check that get_machines will always raise a ProviderError if it fails16:15
jimbakerSpamapS, no, it only catches EC2Error, but txaws will raise other errors16:16
SpamapSjimbaker: yeah seems like we should be able to trust our internal libraries to always raise only ProviderError. :)16:17
jimbakerSpamapS, that's definitely not the convention we have16:18
jimbakerno catchalls16:18
SpamapSseems like catchalls at external libraries would be a good idea, but not for internal ones.16:18
jimbakerexcept perhaps in some twisted code where we use an errback setup, and then that does catch everything16:18
jimbakerSpamapS, yeah, i don't know. i think i can defend the existing mechanism by stating that for nonagent code, it's better to failfast, so any unknown errors bubbling up is fine16:20
jimbakerSpamapS, but if i look at periodic_machine_check, it does the right thing: it always reschedules itself, even if there's an error (equiv to inlineCallbacks with a finally)16:22
jimbakerSpamapS, so it should be resilient. and of course, if txaws is bad here, vs just getting an occasional bad payload, there's nothing that can be done anyway except to repeatedly log the problem16:23
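
A small sketch of the "always reschedule, even on error" behaviour jimbaker attributes to periodic_machine_check, written without Twisted so it runs standalone. The schedule callable stands in for something like a reactor's call-later facility; all names here are hypothetical and this is not the juju implementation.

```python
import logging

log = logging.getLogger("provisioning-sketch")


def periodic_check(check, schedule, interval):
    """Run check() once, then always arrange the next run.

    Any exception from check() is logged rather than propagated, and the
    finally clause guarantees the next run is scheduled regardless.
    """
    try:
        check()
    except Exception:
        log.exception("machine check failed; will retry in %s seconds", interval)
    finally:
        schedule(interval, lambda: periodic_check(check, schedule, interval))
```

With a persistently broken backend this can do no more than log the same failure on every pass, which matches the limitation described above.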
SpamapSjimbaker: thats really what I'm wondering.. I don't know of any action the provisioning agent takes that shouldn't just be retried over and over. I will say that we need a better way than debug-log to track provisioning operations.16:26
jimbakerSpamapS, i think this would be helpful, bug 76912016:28
_mup_Bug #769120: Ensemble status shouldn't report dead units based soley on state, but also on presence. <juju:New> < https://launchpad.net/bugs/769120 >16:28
hazmatniemeyer, the doc builds on juju docs have been broken for a while.. they're still referencing old ways of deploying16:32
jimbakerSpamapS, ok, i think i see one bug here however: watch_machine_changes is a watch, and it calls process_machines. so this watch would stop working if process_machines fails because of some random exception from txaws16:33
niemeyerhazmat: Can you please raise that up in #is?16:33
jimbakerSpamapS, we would still see the resync from the periodic_machine_check, but the provisioning agent wouldn't respond to changes to ZK as they happen16:35
SpamapSjimbaker: exactly!16:35
jimbakerSpamapS, cool, glad to see your evidence corresponds to what i'm seeing here :)16:35
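
To illustrate the failure mode just described, here is a toy model in which a watch stops delivering events once its callback raises, plus a wrapper that traps the exception so the watch survives. ToyWatch and resilient are hypothetical names invented for this sketch, and the assumption that a raising callback kills the watch is taken from the discussion rather than from the ZooKeeper or juju code.

```python
import functools
import logging

log = logging.getLogger("watch-sketch")


class ToyWatch:
    """Stand-in for a watch: it goes dead after its callback raises."""

    def __init__(self, callback):
        self.callback = callback
        self.alive = True

    def fire(self, state):
        if not self.alive:
            return  # a dead watch silently drops further changes
        try:
            self.callback(state)
        except Exception:
            self.alive = False


def resilient(callback):
    """Wrap a watch callback so unexpected errors are logged, not fatal."""
    @functools.wraps(callback)
    def wrapper(*args, **kwargs):
        try:
            return callback(*args, **kwargs)
        except Exception:
            log.exception("watch callback failed; leaving it to the periodic resync")
            return None
    return wrapper
```

Wrapping process_machines this way keeps the agent responsive to later changes, while the periodic resync remains the place where the failed work is eventually retried.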
SpamapSjimbaker: did we ever open an actual bug for this?16:35
SpamapSI suppose you can just lpad it :)16:36
jimbakerSpamapS, i'll just open it conventionally, since i don't have a branch in place to fix it16:37
hazmatniemeyer, done.. is there any one i should ping about it?16:37
niemeyerhazmat: Hmm.. #is?  Who did you ping if you're wondering about who to ping?16:38
hazmatniemeyer, i just put the message about the problem on #is.. just wondering if i should bring it to a particular person's attention on #is16:38
niemeyerhazmat: Ah, gotcha16:39
niemeyerhazmat: No, I'd just wait to see if someone there is able to help16:39
niemeyerhazmat: Otherwise mail rt16:39
hazmatniemeyer, k, thanks16:39
_mup_juju/go-store r20 committed by gustavo@niemeyer.net16:54
_mup_Introduced revision key tracking so that we can detect whether a16:54
_mup_charm update is already the current tip across all requested URLs16:54
_mup_or not. If at least one of the URLs are out-of-date, the update16:54
_mup_will proceed and bump a revision on all of them.16:54
rogi'm off for the day. see y'all tomorrow.17:00
niemeyerrog: Cheers!17:03
_mup_Bug #872378 was filed: Provisioning agent stops watching machine changes in ZK <juju:New> < https://launchpad.net/bugs/872378 >17:05
jimbakerSpamapS, i just filed bug 87237817:05
_mup_Bug #872378: Provisioning agent stops watching machine changes in ZK <juju:New> < https://launchpad.net/bugs/872378 >17:05
SpamapSjimbaker: thanks, will confirm and mark High17:05
jimbakerSpamapS, thanks, just what i was going to ask :)17:05
SpamapSoh you did that :)17:06
jimbakeri did the high part, you can still confirm it however17:06
SpamapSneed to raise a txaws bug too17:06
jimbakeri'll get the bug dance better next time17:06
SpamapSwell I am pretty religious about not confirming my own bugs :)17:06
jimbakerSpamapS, it's an interesting question about txaws, but given that it's a closely related project, worth seeing their philosophy here - do they handle bad payloads or not?17:07
SpamapSno17:07
SpamapSthe project expects its AWS partner to be well behaved17:07
SpamapSso there's also a nova bug to raise17:08
SpamapSas nova shouldn't be returning empty ever17:08
SpamapSheh.. we should probably have a little triage party to clean up txaws's bug list.17:08
jimbakergot it. but regardless we would still expect to see TimeoutError, so there's some class of errors txaws will likely not handle17:08
SpamapS34 new, 72 open, 3 high..17:08
_mup_juju/go-store r21 committed by gustavo@niemeyer.net17:51
_mup_Track sha256 and store next to the charm information so we can answer17:51
_mup_related API requests in the future.17:51
_mup_juju/go-store r22 committed by gustavo@niemeyer.net18:01
_mup_Copied log.go from personal project (mgo).18:01
jcastrolynxman: heya, any update on the macports thing?18:44
jcastrohazmat: hey is there an easy way to tell the local provider to use my existing apt cache instead of installing all this apt-cacher-ng business?18:51
hazmatjcastro, i think he mentioned updating the portfile, he's going to ping one of the maintainers, with luck soon18:52
hazmatjcastro, sadly no18:53
hazmatjcastro, is the initial download a problem?18:55
jcastroyeah, this close to release the mirrors are hammered, I'll suffer and find something else to do18:55
m_3SpamapS: did you mention you had pending MW charm changes?19:05
SpamapSm_3: everything I had is in lp:charm/mediawiki19:06
m_3SpamapS: cool thanks19:07
_mup_juju/go-store r23 committed by gustavo@niemeyer.net19:09
_mup_Added info/debug logging across the charm storage operations.19:09
hazmatjamespage, ping19:18
hazmatjamespage, i'm wondering how problematic it is to always kill the unit's processes on removal instead of a controlled termination via stop19:19
SpamapShazmat: stop needs to be able to *cancel* the removal19:20
hazmatSpamapS, there's not much distinguishing a unit removal to a service removal at that level19:21
SpamapSIt would be awesome if charms could prevent data loss without a --force flag by simply refusing to stop the service while it is vulnerable.19:21
hazmatand units overriding the user express commands..19:21
hazmathmm19:22
SpamapSis this only happening on destroy-service, not on remove-unit ?19:22
SpamapSI do kind of think destroy-* should be more heavy handed19:22
hazmatSpamapS, it would happen on either one, the mechanics are the same atm19:22
hazmatSpamapS, how does the service know if its redundant or not?19:23
hazmatservice unit19:23
_mup_juju/config-get r393 committed by kapil.thangavelu@canonical.com19:25
_mup_juju get for service config/schema inspection19:25
SpamapShazmat: in the case of any clustered service, it will have some way to determine if removing this node is safe or not.19:27
SpamapShazmat: stop would also be a decent place for a single node service to signal some kind of snapshot or backup.19:29
SpamapSso blocking until its done would be cool19:30
hazmatSpamapS, the converse question is how to prevent problems with problematic charms, that might for example have a broken stop... or even well meaning ones that go out of control19:31
hazmatdecommissioning a node in cassandra is potentially a fairly long operation afaicr19:32
hazmatwe'll need intermediary states to properly convey status to a ui19:32
hazmatie. 'stopping'19:32
hazmatwe only have nouns now.. not verbs19:32
SpamapShazmat: --force ?19:36
hazmatsounds reasonable19:36
SpamapShazmat: I see what you mean. Yes it would be cool if we followed upstart's model there and had a goal state, and the in-between states with hooks available for each state.19:37
hazmatSpamapS, exactly19:37
hazmathmm.. well maybe not hooks available for each state, but at least the same re status19:37
hazmateffectively it would be a hook per verb19:37
SpamapSstop/running -> stop/hook-stop-running -> (if hook says so, stop/deferred-stop) -> stop/stopping-unit -> oblivion19:38
SpamapSLike if a hook exits 100 , that means it is running the safe stop in the background19:38
SpamapSthen you can just keep trying to stop it, and getting back 100 until its done decommissioning19:39
SpamapSand you can still have a short timeout to deal with misbehaving charms19:39
hazmati'm going to capture this discussion into the bug19:39
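
A sketch of the deferred-stop idea floated above: keep invoking the stop hook while it reports exit code 100, give up after a bounded number of attempts, and let a force option override the charm. The exit code comes from SpamapS's proposal in this conversation and is not an existing juju convention; the function name, state names, and parameters are hypothetical.

```python
import time

DEFERRED = 100  # proposed in the discussion: "safe stop still running"


def stop_unit(run_stop_hook, max_attempts=5, delay=0.0, force=False):
    """Drive a unit towards the stopped goal state.

    run_stop_hook is any callable returning the hook's exit code.
    Returns 'stopped' on a clean exit, 'error' if the hook fails,
    'deferred' if it is still decommissioning after max_attempts,
    or 'killed' when force overrides a failed or still-deferring hook.
    """
    for _ in range(max_attempts):
        code = run_stop_hook()
        if code == 0:
            return "stopped"
        if code != DEFERRED:
            return "killed" if force else "error"
        time.sleep(delay)  # hook asked for more time; try again later
    return "killed" if force else "deferred"


if __name__ == "__main__":
    codes = iter([100, 100, 0])
    print(stop_unit(lambda: next(codes)))
```

The bounded attempt count plays the role of the short timeout for misbehaving charms, and the intermediate 'deferred' result is the kind of in-between state a status UI would need to show.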