/srv/irclogs.ubuntu.com/2015/08/27/#juju.txt

01:37 <beisner> thedac, wolsen, gnuoy, coreycb, dosaboy - fix for bug 1485722 proposed and tested. I view this as critical as it is a deployment blocker for Vivid and later when nrpe is related to rmq. please & thanks :-)
01:37 <mup> Bug #1485722: rmq + nrpe on >= Vivid No PID file found <amulet> <openstack> <uosci> <nrpe (Juju Charms Collection):Invalid> <rabbitmq-server (Juju Charms Collection):New> <https://launchpad.net/bugs/1485722>
01:38 <wolsen> beisner, well sir let me go look
01:42 <wolsen> beisner - added a comment
01:43 <wolsen> beisner, please take a look - if I don't hear back I can fix it in the merge, as the rest of the change looks fine
01:45 <beisner> wolsen, removed a cat :-)
01:46 <wolsen> beisner, awesome, those pesky felines getting in the way :)
01:46 <wolsen> cats, rabbits, meh
01:48 <wolsen> beisner, this is backport-potential, yes?
01:48 <beisner> wolsen, definitely
01:48 <beisner> wolsen, affects next and stable rmq + nrpe
01:48 <beisner> well, just the rmq charm, but nrpe users will hurt
01:48 <wolsen> beisner, yuppers - though it's slightly mitigated by not affecting an LTS version - but ya
01:49 <beisner> wolsen, it'll be a MP blocker once those tests land ;-)  then it'll get heat if it doesn't already have any.
01:49 <wolsen> beisner, +1 on that
01:49 <wolsen> beisner, I was noticing earlier with a proposal that jillr put up - there's not a lot of testing around some of the nrpe stuff - which is something we should mark for improvement somewhere
01:50 <beisner> wolsen, i added a comment on that fyi
01:50 <beisner> basically to hang tight while we iron out these crits
01:50 <beisner> then resubmit
01:51 <wolsen> beisner, ah ok - hadn't looked at it too closely, but was trying to help jillr take a look at it
01:51 <wolsen> beisner, did she retarget? I had asked her to retarget /next (honestly haven't looked at that)
01:51 <beisner> wolsen, yep. but it's running the old tests, which, as we know now, lead to releasing broken-ass charms.
01:51 <wolsen> beisner, ack
01:52 <beisner> wolsen, hence the hang-tight, then rebase & resubmit review. no sense in not-testing any additional features.
01:53 <beisner> biab, thanks for the check-in wolsen
01:55 <wolsen> beisner, landed
02:01 <beisner> wolsen, thanks sir
02:03 <wolsen> beisner, np
=== scuttlemonkey is now known as scuttle|afk
13:17 <Walex> Hi, I am trying to log in to the Juju MongoDB instance and I am getting "not authorized" errors. I am using '-u admin' and '-p ....' from the '/var/lib/juju/agents/machine-0/agent.conf' file (API and state password are the same). The 'mongod' instance was started with '--keyFile ....' but there does not seem to be an equivalent option for the 'mongo' client. Suggestions welcome.
13:23 <Walex> also, curiously, all three members of the replica set have different passwords. How does one member log into the other members?
13:27 <lazyPower> Walex: MongoDB replica-set passwords are cluster specific, so typically you log in through a mongos gateway to reach your cluster nodes
13:27 <lazyPower> however I'm not certain how you would log in and poke around in the Juju DB - that's a good question. The core devs probably would have some insight here
13:28 <Walex> lazyPower: http://www.metaklass.org/how-to-recover-juju-from-a-lost-juju-openstack-provider/ has a suggestion which I tried...
13:28 <lazyPower> natefinch: wwitzel3 - Any insight for Walex on how they can connect to the jujudb?
13:30 <Walex> lazyPower: maybe I am using the wrong terms here; I don't see any 'mongos' daemons running on the nodes. I meant perhaps the state servers.
13:31 <Walex> the 3 nodes can log into each other obviously (I see the port 37017 traffic)...
13:32 <Walex> but that's obviously done with the replica set keyfile.
13:32 <lazyPower> right. I'm not so familiar with how our juju stateserver is set up, to be honest. I just know it exists and what function it serves
13:32 <lazyPower> the experience I speak of is from running a distributed mongodb cluster
13:33 <Walex> will wait, or perhaps later send a mailing list message.
13:33 <wwitzel3> let me see if I can remember
13:34 <natefinch> you use the oldpassword field from the agent config on machine 0....
13:35 <natefinch> I forget the exact incantation
13:35 <Walex> natefinch: I'll try again
13:35 <Walex> natefinch: in http://www.metaklass.org/how-to-recover-juju-from-a-lost-juju-openstack-provider/
13:35 <Walex> there is a plausible looking line
13:36 <lazyPower> Thanks natefinch, and good morning o/
13:36 <Walex> but I use the 'oldpassword' and "auth fails"; not sure if 'admin' is the right user
13:36 <natefinch> Walex: that looks good to me. I usually just get the password the old fashioned way (copy and paste) but assuming the grep does the right thing, then yes
13:38 <Walex> ahhhhhhhh I have just noticed my mistake: I was trying to log into the 'juju' database, not 'admin'. oops
13:40 <Walex> indeed with "/admin" that works, sorry...
13:44 <lazyPower> Walex: glad we could get you sorted!
13:46 <Walex> lazyPower: and I can connect directly to 'localhost:37017/juju' if I add '--authenticationDatabase admin' as an option to 'mongo'
13:46 <Walex> sorted!
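
[For reference, the working login sequence arrived at above - a sketch assuming Juju 1.x defaults. The agent.conf path, port 37017, the 'oldpassword' field, and '--authenticationDatabase admin' all come from the conversation; the exact grep is illustrative, and --ssl may also be required depending on how mongod was started:]

    # on a state server, pull the admin password from the machine agent config
    sudo grep oldpassword /var/lib/juju/agents/machine-0/agent.conf
    # then authenticate against the 'admin' database while using the 'juju' db
    mongo -u admin -p '<oldpassword>' --authenticationDatabase admin localhost:37017/juju
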
13:51 <lazyPower> cory_fu: so, if you've got a moment - we got this far yesterday - http://paste.ubuntu.com/12205843/
13:52 <lazyPower> that's our reactive/nginx.py, down to implementing a relationship stub, and the super simple intro to reactive and layers is basically complete. we've mirrored what we use in charm schools to teach charming w/ docker
13:53 <cory_fu> lazyPower: Would it have killed you to select Python as the language to get syntax highlighting? ;)
13:53 <lazyPower> cory_fu: i used pastebinit?
13:53 <cory_fu> Ah. Fair enough
13:55 <cory_fu> I'm surprised pastebinit doesn't guess the format based on the file name
13:56 <lazyPower> papercutz
13:56 <cory_fu> :)
13:59 <lazyPower> cory_fu: thanks for the sync yesterday. that really got mbruzek and I moving. We're going to have this particular charm wrapped today and ready to move on to extending the base layer(s) and writing docs before the week is up
14:00 <cory_fu> lazyPower: You have a bit of a bug in your config-changed handler. It could potentially call stop_container and attempt to issue docker commands before docker is installed
14:00 <lazyPower> that was critical to resolving things we were doing exploratory dev for
14:00 <lazyPower> cory_fu: in our testing it was from install => and the entire chain ran before it hit a possible stop hook.
14:00 <lazyPower> what scenario would be exposed that leaves us vulnerable to calling stop before it's present?
14:02 <cory_fu> Oh, yeah, I suppose you're right. docker.available will be set during the install hook, so it's a bit moot. Though, if your docker base layer ever changes (say, to require a repo URL config or something) that could potentially delay the docker install until config, it could open you up. *shrug*
14:02 <cory_fu> lazyPower: I was going to suggest creating an nginx.restart state that you could set
14:02 <cory_fu> Would be another potentially useful entrypoint for layers using this
14:03 <cory_fu> And would future-proof the code against the admittedly non-issue
14:03 <lazyPower> we thought about that, and i forget why exactly we refactored down to just dropping in a config-changed hook context vs using the state
14:03 <lazyPower> but it makes sense
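
[A minimal sketch of the nginx.restart pattern cory_fu suggests, against the charms.reactive API of the time. The docker.available state is from the conversation; the handler body is a stand-in for the stop/start helpers in the pasted reactive/nginx.py, so the docker command here is purely illustrative:]

    from subprocess import check_call
    from charms.reactive import hook, when, set_state, remove_state

    @hook('config-changed')
    def config_changed():
        # only record that a restart is wanted; issue no docker commands here
        set_state('nginx.restart')

    @when('nginx.restart', 'docker.available')
    def restart_container():
        # fires only once docker is installed, whichever hook triggers it
        check_call(['docker', 'restart', 'nginx'])  # stand-in for the charm's own helpers
        remove_state('nginx.restart')
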
14:11 * mbruzek starts reading the scrollback
=== mgz is now known as mgz_
15:06 <kwmonroe> hey lazyPower, i just deployed cs:trusty/etcd and was met with: http://paste.ubuntu.com/12206396/  have you seen that before?
15:06 <kwmonroe> ^^ no relations or anything, just "juju deploy cs:trusty/etcd" got me there
=== scuttle|afk is now known as scuttlemonkey
15:16 <mbruzek> hey Kevin, that looks awful
15:18 <mbruzek> kwmonroe: that looks like our overuse of path.py has come back to bite us.
15:18 <mbruzek> kwmonroe: Can you paste the entire unit log? Did the 'pip install path.py' not work?
15:19 <mbruzek> in the install hook
15:19 * marcoceppi wears a smug face
15:19 * mbruzek waves and nods at marco
15:21 <kwmonroe> momento mbruzek, i tore that env down, but I'll fetch the logs again shortly
15:21 <mbruzek> kwmonroe: it looks to me like the install hook does not install pip
15:32 <Walex> I see that the Juju "command" node(s) don't (necessarily) run any daemon, and that the "state" nodes run 'mongod' from the 'juju-mongodb' package. Also I see that all nodes run 'jujud' from the unit 'tools' directories. I am about to update to 1.24.5. How do the 'jujud' binaries in each unit get updated? When? Are there ordering dependencies among the 'juju-*' packages for upgrade, and among the state servers? ...
15:32 * Walex worries about details...
15:38 <kwmonroe> mbruzek: i betcha you gotta do "from path import Path" (cap P on the 2nd path): http://paste.ubuntu.com/12206611/
15:38 <mbruzek> kwmonroe: It looks like path.py was updated today! It is possible that release is not working.
15:39 <mbruzek> kwmonroe: we use lowercase path all over the place. whit, can you help with this problem?
15:39 <kwmonroe> mbruzek: we use cap P in our big data charms... now fight.
15:40 <lazyPower> kwmonroe: hmm, interesting, let me check the charm code 1 sec kwmonroe
15:40 <kwmonroe> mbruzek: lazyPower.. i'm just gonna leave this here: https://pypi.python.org/pypi/path.py
15:49 <lazyPower> kwmonroe: I'll follow that with the bug that spawned this issue
15:49 <lazyPower> https://github.com/jaraco/path.py/issues/102
15:51 <lazyPower> kwmonroe: but thanks for the heads up on the issue. We'll cut a hotfix patch and get it queued up - as we are apparently broken in the store now
15:53 <kwmonroe> gracias lazyPower!
15:53 <lazyPower> kwmonroe: once we have a fix in place, do you mind being the on-call reviewer for that MP? i'll stack it on what you're already reviewing so it's applicable :D
15:54 <kwmonroe> sure lazyPower, i'll be your huckleberry
15:54 <lazyPower> aww yeee
16:14 <lazyPower> kwmonroe: the broken release of path.py was just pulled from pypi
16:14 <lazyPower> ready for you to re-test at your leisure
16:18 <lazyPower> whoa, juju gui just removed its crosshatched background - https://github.com/juju/juju-gui/pull/799
16:21 <kwmonroe> confirmed lazyPower, latest deploy pulls path.py-7.7.1 and all is right with the world. would you like a tracking bug requesting s/path/Path for the inevitable time when path does finally go away?
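
[The rename kwmonroe is pointing at, as a compatibility shim - assuming, per the linked path.py issue, that newer releases expose Path while older ones only had the lowercase class:]

    try:
        from path import Path          # newer path.py exposes the capitalized name
    except ImportError:
        from path import path as Path  # older releases only had lowercase 'path'
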
16:22 <rick_h_> lazyPower: quit watching us :P
16:22 <lazyPower> kwmonroe: we're going to pin package deps now, and prepare for the inevitable breakage when we have the bandwidth
16:22 <lazyPower> that's used in a lot of places
16:22 <lazyPower> and we have a lot of auditing to do
16:23 <kwmonroe> ack lazyPower. thanks to you and whit for the nudge to get path-8 out of pypi :)
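
[The pin lazyPower describes, as it might appear in a charm's requirements.txt - 7.7.1 being the release confirmed working above:]

    # requirements.txt: hold path.py until the path -> Path rename is audited
    path.py==7.7.1
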
16:42 <Walex> I updated packages on the master node; now 'juju upgrade-juju' tells me that "no more recent supported versions available". how do I make a more recent version of the tools available to the nodes?
16:46 <lazyPower> Walex: when you run 'juju upgrade-juju' it should publish a newer version of the tools to your state server
16:47 <lazyPower> the nodes will slowly start to upgrade once your environment is upgraded, if memory serves me correctly
16:49 <Walex> lazyPower: ahhhhhhh, so in theory I just wait. I noticed somewhere a mention of a queue
16:50 <Walex> ah, but I just looked at my state servers and I don't see the 1.24.5 directories. I'll investigate
16:58 <Walex> what's peculiar is that when I upgraded 'juju-core' it took many minutes, and it is a fairly small package.
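
[For reference, requesting a specific tools version in a Juju 1.x environment - a sketch; the version number is the one Walex mentions, and the grep is illustrative:]

    juju upgrade-juju --version 1.24.5   # propose the specific tools version
    juju status | grep agent-version     # units catch up as their agents restart
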
17:00 <wolsen> ackk, regarding our discussion for the keystone pause/resume
17:00 <ackk> wolsen, yes
17:01 <wolsen> ackk, so for the clustering support - if we had the support in the hacluster charm to move the vip off a node (e.g. get a node in maintenance mode or paused), then I think what is in the keystone charm would be fine actually
17:01 <wolsen> ackk, the pause and such, that is
17:02 <wolsen> ackk, I still have a concern that if a user were to simply issue the pause against keystone but they hadn't done the appropriate action on the hacluster charm, they could end up with a service disruption
17:02 <wolsen> which might not be great
17:02 <ackk> wolsen, right. there are other similar cases where there's more to do than "service foo start/stop". for instance, ceph OSDs need to be set to "noout"
17:03 <wolsen> ackk, right - maybe we can address it with docs around the pause/resume action?
17:04 <ackk> wolsen, I see your point. I'm a bit worried about putting a lot of logic in a single action and having an action with the same name doing different things across charms
17:05 <ackk> wolsen, there are other cases where you'd definitely want separate steps, like for nova-compute
17:05 <wolsen> ackk, that's a fair point, but to me the action defines the semantics of what you want to happen, and it's up to the charm to define what needs to happen for that action to take place - which can add some complications
17:05 <wolsen> ackk, that being said, let's try to keep it simple until we have to do more
17:05 <wolsen> ackk, but if we do keep it simple, we still need to be able to inform the user what other actions need to take place
17:06 <ackk> wolsen, you mean documenting that you should do other stuff before stopping services?
17:11 <wolsen> ackk, yep
17:12 <wolsen> ackk, I'm thinking the action docs would say something about requirements in a clustered scenario, e.g. running the pause action there first
17:13 <ackk> wolsen, btw, what's needed on the hacluster side to move the VIP?
17:14 <wolsen> ackk, if we could enforce that the action were run there first, that'd be great, but that's kind of above and beyond...
17:14 <wolsen> ackk, for the hacluster - there's the option to move a resource - but the cluster may need to be in maintenance mode as well, or the node marked as offline
17:15 <wolsen> ackk, i'd have to go through the specific details of how to do that (to refresh my memory)
17:16 <ackk> wolsen, I see. so basically we could add a pair of actions there so that you'd "juju action do pause hacluster-keystone/X; juju action do pause keystone/Y"
17:16 <ackk> (roughly)
17:16 <wolsen> ackk, yep
17:16 <ackk> wolsen, cool
17:16 <wolsen> ackk, so it'd still keep the building blocks you're looking to add (we can fancy it up in the future if needed)
17:16 <ackk> wolsen, +1
17:17 <wolsen> ackk, but the user needs to know that they have to do the multi-step process
17:17 <ackk> wolsen, agreed
17:17 <ackk> wolsen, could you sum that up in a comment on the MP?
17:17 <wolsen> ackk, doing so now
17:17 <ackk> wolsen, thanks
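
[Spelled out, the two-step sequence ackk sketches would look something like this - unit names are hypothetical, and note that juju 1.x 'action do' takes the unit before the action name:]

    juju action do hacluster-keystone/0 pause   # move the VIP / enter maintenance first
    juju action do keystone/0 pause             # then stop the keystone services
    # ... maintenance work ...
    juju action do keystone/0 resume
    juju action do hacluster-keystone/0 resume
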
17:18 <ackk> wolsen, totally unrelated (but since we're on the openstack charms topic), do you know any downside of not using the embedded webserver for ceph-radosgw?
17:19 <wolsen> ackk, and I think the other proposals which are similar (e.g. glance and percona-cluster) will likely fall into the same - though for percona-cluster I think we should carefully think that through in some more depth (I'll try to give some more thought to it)
17:20 <wolsen> ackk, when not using the embedded server, it doesn't have the 100-continue support built in to the apache service. Ceph devs used to provide an apache package which had it, but they yanked it in favor of the embedded web server
17:20 <wolsen> ackk, the 100-continue support is necessary for some of the use cases (e.g. using it from the horizon dashboard)
17:20 <ackk> wolsen, I see
17:21 <wolsen> ackk, so the preferred way forward is the embedded server
17:21 <wolsen> ackk, but is there another use case that you have for not using it?
17:22 <ackk> wolsen, well, we've seen failures in autopilot deploys recently. I'm not sure it's related, but it might have happened since we switched to the embedded server
17:23 <wolsen> ackk, oh :(
17:24 <ackk> wolsen, as said, it's just a guess; maybe it's an unlucky coincidence
17:26 <wolsen> ackk, logs and a bug would be great (if you haven't got one already)
17:27 <ackk> wolsen, https://bugs.launchpad.net/charms/+source/ceph-radosgw/+bug/1477225
17:27 <mup> Bug #1477225: ceph-radosgw died during deployment <cloud-install-failure> <cpec> <ceph-radosgw (Juju Charms Collection):New> <https://launchpad.net/bugs/1477225>
17:27 <wolsen> ackk, also wanted to say the MP looked really good in general, and thanks for that contribution!
17:27 <ackk> wolsen, np! :)
17:29 <ackk> wolsen, wrt maintenance, we're also not sure yet of what needs to be done on neutron-gateway nodes (see notes in the doc)
17:30 <wolsen> ackk, bleh, yeah, that's tricky as it will almost certainly cause service disruption unless dvr is enabled, I believe
17:30 <ackk> wolsen, specifically whether removing/re-adding the l3 router in neutron is needed, and how to properly cause a failover
17:30 <ackk> wolsen, we deploy with l3ha
17:30 <wolsen> ackk, ok
17:30 <ackk> router-ha, that is
17:31 <ackk> wolsen, still, stopping services on the node is not enough to cause a failover
17:32 <wolsen> ackk, I'll have to dig into it (I don't have enough background on neutron gateway and ha, to be honest)
17:33 <ackk> wolsen, ok, thanks for the info
17:33 <ackk> wolsen, and for the review :)
17:35 <wolsen> ackk, np :-) it was fun!
17:36 <ackk> heh
17:52 <kwmonroe> lazyPower: fwiw, i saw pypi went to path.py-8.1 and re-checked etcd. you're still good.
17:52 <lazyPower> kwmonroe: above and beyond, that's awesome. Thanks!
17:52 <kwmonroe> np lazyPower, gives you time to work out which version you want to pin.
18:00 <whit> lazyPower: this path.py hiccup makes me think we should have an official juju python index
18:00 <marcoceppi> whit: that sounds like the opposite of what we need; why not just version-lock your deps?
18:01 <whit> marcoceppi: accomplishes the same thing without having to edit all the places the dep is defined every time you need to update
18:01 <whit> marcoceppi: think of it as a hierarchy of control
18:02 <whit> the index is centralized, but under our control (unlike pypi)
18:02 <whit> then reqfiles and setup.pys become the more granular control
18:02 <marcoceppi> sounds like a lot of work for little payoff
18:02 <lazyPower> whit: that sounds like an extra maintenance burden and infrastructure for the sake of running infrastructure. It would yield some benefit, but i'm not certain that's enough to not just version-lock deps.
18:03 <lazyPower> if we had packages constantly getting yanked from pypi, then yes, that sounds like the way to move forward
18:03 <lazyPower> so we can maintain the versions we depend on that are otherwise disappearing
18:03 <whit> the issue is the "default" version
18:03 <whit> which is always the most current in the index
18:03 <lazyPower> well, that's fair, but we also didn't define any of that in our requirements. in 2 places we had a blind install of path.py on the CLI
18:04 <whit> if you pin, you will no longer pull newer
18:04 <lazyPower> and in others we had no version data in the requirements.txt file
18:04 <marcoceppi> sure, so you should develop a charm, pip freeze, deliver, iterate, pip update, freeze again, release
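
[marcoceppi's freeze-and-release loop, spelled out - a minimal sketch:]

    pip freeze > requirements.txt      # capture the exact versions that work
    pip install -r requirements.txt   # every deploy then reproduces that set
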
18:04 <whit> I don't think you all are grokking how these things scale or the tooling available
18:05 * marcoceppi shrugs
18:05 <whit> just take it as my advice until it makes sense ;)
18:05 <lazyPower> whit: that's possible
18:05 <lazyPower> whit: but do we adopt the same thing with companion technologies for other languages? Run our own gem host, npm index, et al.?
18:05 <whit> think of every deploy of any charm as your "product website"
18:06 <whit> you wouldn't deploy packages straight from pypi to prod
18:06 <whit> quality control, folks
18:06 <lazyPower> i did :3
18:06 <whit> er.. production
18:06 <lazyPower> and i hated the pain that introduced, like today, but we pinned deps then too. we didn't spin up a gem host.
18:06 <marcoceppi> so, how does pip freeze not solve this?
18:06 <marcoceppi> I don't want us to be responsible for someone's charm not working because an index is down
18:07 <marcoceppi> or they're using a newer version than our index, or vice versa
18:07 <whit> marcoceppi: vs. pypi or github being down, which you can do nothing about?
18:08 <marcoceppi> yeah, but we're not responsible, and they're all well-established services that have a team dedicated to keeping those things running
18:08 <marcoceppi> no one's perfect, but I don't want to run pager duty because we're running our own index
18:08 <whit> marcoceppi: yeah, but those being down == a crap experience for charm deployment
18:09 <marcoceppi> and our index being down?
18:09 <whit> this is an academic example in reliability and control
18:09 <whit> we can fix that
18:09 <whit> we can't fix externalities
18:09 <whit> that's the point of the example
18:09 <marcoceppi> this is the exact reason SaaS exists
18:10 <whit> marcoceppi: when your shit breaks because someone else's SaaS breaks, you still look like an ass
18:10 <marcoceppi> if you want to run this for a set of charms you maintain, sure, that sounds great; I don't think it sets us up for success any more than what exists with pypi or other services
18:10 <whit> saas exists so you don't have to build it
18:10 <whit> but when aws goes down, netflix loses money
18:11 <marcoceppi> and when your shit breaks because you can't run a web service 24/7 due to staffing, you look like a bigger ass
18:11 <whit> marcoceppi: that we can fix ;)
18:12 <whit> marcoceppi: my general point is that python libraries working are part of charms working, and therefore part of a good charming experience
18:12 <whit> which is important to the success of juju
18:12 <lazyPower> whit: this sounds more like deps should be bundled with charms.
18:12 <lazyPower> not that we should run an indexer
18:12 <marcoceppi> running a proxy isn't a project concern
18:12 <marcoceppi> it's an operations concern
18:12 <marcoceppi> we run pypi proxies in our private environments
18:12 <whit> resources would help, but an index fixes the problem now without the developer issue of pin maintenance
18:13 <marcoceppi> that just works; leave this for ops people to run themselves, not us
18:13 <whit> https://pypi.python.org/pypi/devpi
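
[devpi being the tool whit links - pointing pip at a private index looks roughly like this; the host URL is hypothetical, and 'root/pypi' is assumed to be devpi's default mirroring index:]

    pip install --index-url https://pypi.internal.example/root/pypi/+simple/ path.py==7.7.1
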
18:13 <marcoceppi> lunch is here
18:14 <whit> marcoceppi: yes, it is an operations concern. that we agree on.
18:14 <whit> whose operations concern and why, we don't
18:17 <lazyPower> whit: actually, the fact we were pulling from pypi and not some mirror helped us today... had we still had the 8.0 copy cached we would still be broken in the charm store right now
18:17 <rick_h_> whit: marcoceppi fyi, from another team's perspective: we've recently discussed talking with IS about running a pypi index in prodstack for our services there and gating/curating. However, we currently have a matching "xxx-download-cache" project for each codebase and build it into the project's build steps.
18:17 <rick_h_> this allows for complete offline building of code/projects
18:17 <rick_h_> and completely version-locked w/o internet access (since prodstack has egress firewall locks)
18:18 <rick_h_> whit: marcoceppi so I guess some additional feedback: I can't be unable to redeploy my production because pypi or GH are having DDoS issues atm.
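
[The download-cache pattern rick_h_ describes, roughly - directory names are illustrative, and older pips spell the first step 'pip install --download' instead:]

    pip download -d download-cache/ -r requirements.txt    # vendor deps next to the code
    pip install --no-index --find-links=download-cache/ -r requirements.txt
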
18:18 * rick_h_ goes back to lurking
18:23 <jrwren> Every organization and project has different tolerances for acceptable risk. Some projects may be willing to accept the risk that goes with depending on github or pypi being up; others cannot.
18:45 <whit> lazyPower: we would have tested the new copy before updating the index
18:45 <whit> therefore no breakage?
18:45 <lazyPower> i guess
18:45 * lazyPower shrugs
18:46 <lazyPower> i'm not interested in running a pypi mirror
18:46 <lazyPower> but if someone else is, i'm thumbs up to them doing it
18:47 <whit> rick_h_: that sounds good
18:47 <whit> ideally, grabbing and freezing all necessary resources for a charm has lots of benefits
18:47 <whit> this is the general idea behind IBWFs (image-based workflows), more or less
18:49 <whit> whether you are building on the fly, or building resource blobs or some sort of image, controlling the source material has lots of benefits
18:50 <whit> image workflows do have the advantage of breaking before deploy (in the build stage) rather than during deploy
18:53 <jrwren> whit: how far do you take this? what makes archive.ubuntu and PPAs different from pypi?
=== fgallina`` is now known as fgallina
19:01 <beisner> Fwiw, I use a devpi caching mirror in uosci
19:02 <beisner> As doing a lot of iterations revealed pypi weaknesses and false test failures
=== scuttlemonkey is now known as scuttle|afk
20:30 <whit> jrwren: archive.ubuntu is better curated than pypi
20:30 <whit> a ppa is effectively == to a self-hosted index, if you control the ppa
20:30 <whit> if you don't, you trust the maintainer, so it depends on the nature of the relationship
20:31 <jrwren> whit: ah, I thought you were referring to uptime. I used to deal with pypi being down often enough.
20:31 <whit> so if you run the index, you have a bit more control of the uptime
20:31 <whit> rather than depending on the packaging volunteers of pypi
20:32 <jrwren> yes, my point is that to a 3rd party, there isn't much difference between pypi and a ppa.
20:33 <jrwren> Both are systems outside your control, which present risk.
20:33 <whit> jrwren: risk is contextual
20:33 <jrwren> what do you mean?
20:33 <whit> if i am trying out juju, and my charm fails due to pypi being down, I will still say juju sucks
20:34 <jrwren> definitely true.
20:34 <whit> if I'm using juju for a situation I'm invested in, it behooves me to run a debian index and a python index
20:34 <whit> and my own charm server
20:34 <jrwren> exactly.
20:35 <jrwren> or accept the risk of not doing so.
20:35 <whit> so from the context of eco (and anyone who cares about adoption), controlling the central vectors of potential failure is valuable
20:35 <whit> yep
20:36 <whit> a good first-time experience is one that works
20:37 <whit> from our perspective, it's a tradeoff between investing in running, curating and monitoring our own index, vs. the less-known cost of random failure
20:38 <jrwren> it's a very good point.
20:39 <jrwren> it makes me wonder if charms shouldn't declare their external dependencies.
20:41 <jrwren> certainly with unit status, external deps could be handled in the charm, and a status-set blocked used to clearly tell the end user that an external dep failed.
20:59 <beisner> +1 to charms declaring external dependencies, at minimum in the form of a README blip.
21:00 <beisner> That is so much better than having to figure it out via install hook failures when you're sitting behind firewalls and proxies.
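
[What jrwren's suggestion might look like from inside a charm hook - a sketch; status-set and the 'blocked' state are standard juju hook tools, while the message text is illustrative:]

    # in an install hook, after a dependency fetch from an external index fails:
    status-set blocked "cannot reach pypi.python.org to install python deps"
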
