/srv/irclogs.ubuntu.com/2018/02/12/#juju.txt

=== frankban\|afk is now known as frankban
gsimondo1	Got a 3 machine juju cloud that is currently stuck because it can't resolve a relation between influxdb and telegraf. This puts influxdb in error state. Resolving its state just puts it back in error state. The consequence is that I can't remove model, controller, machine, or anything that's above this unit/relation in stack.	10:45
gsimondo1	Any advice on how to handle these situations?	10:45
kjackal	hi gsimondo1, first file a bug with the proper section of juju debug-log showing the error. Then you can remove the machine where the failing application is deployed and that should unblock all the remove operations	11:03
kjackal	let me check if there is a --force flag	11:04
kjackal	yes, juju remove-machine <id> --force	11:04
gsimondo1	kjackal: Just saw your message. I figured that out a couple of hours ago but that kills the machine. So I take it that this is a bug that's worth reporting on github?	13:22
kjackal	gsimondo1: if you want to keep the machine you can try juju resolved with --no-retry flag	13:24
gsimondo1	kjackal: doesn't help. it wakes up and reexecutes the failing relation. can't get rid of the relation.	13:26
gsimondo1	2018-02-12 13:25:55 DEBUG query-relation-joined TypeError: configure() missing 2 required positional arguments: 'username' and 'password'	13:26
gsimondo1	2018-02-12 13:25:55 ERROR juju.worker.uniter.operation runhook.go:113 hook "query-relation-joined" failed: exit status 1	13:26
kjackal	gsimondo1: this is on the telegraf or the influx db side?	13:30
kjackal	influxdb, sorry	13:30
gsimondo1	root@kubernetes-2 ~ # tail -f /var/log/juju/unit-influxdb-0.log	13:30
kjackal	gsimondo1: how did you deploy this charm?	13:32
kjackal	I am looking for the revision	13:32
kjackal	is it thisone: https://jujucharms.com/influxdb/13	13:32
gsimondo1	kjackal: ehalilov@kubernetes-0 ~ $ juju deploy cs:~influxdb-charmers/influxdb	13:33
gsimondo1	before that juju deploy telegraf	13:33
gsimondo1	kjackal: the problem is of course that I was experimenting with relations that are not offered when you do 'juju add-relation telegraf influxdb'	13:35
gsimondo1	kjackal: with something like 'juju remove-relation telegraf:juju-info influxdb:juju-info'	13:36
gsimondo1	*add-relation, not remove	13:36
kjackal	gsimondo1: the error you got says that configure() missing 2 required positional arguments: 'username' and 'password', this method is here: https://github.com/ChrisMacNaughton/interface-influxdb-api/blob/master/provides.py#L27 and the charm is calling this method here: https://git.launchpad.net/influxdb-charm/tree/reactive/influxdb.py#n261	13:38
kjackal	so... this is a bug on the influxdb charm if I understand this correctly	13:38
kjackal	gsimondo1: will you be able to open a ticket here: https://launchpad.net/influxdb-charm ?	13:39
gsimondo1	kjackal: OK, I'll test a couple of other things and then open a ticket. I also have some other things failing in a similar fashion (relation related)	13:41
kjackal	now, to get you unblocked.... how confident are you in your scripting skills? If you juju ssh in influxdb you can find the charm source under /var/lib/juju/unit-influxdb-0(probably)/charm	13:42
gsimondo1	kjackal: correct me if I'm wrong but the main issue here is that I can't get rid of the failing relation... and it has effect on things like model, controller, machine, etc. Basically putting everything in state of paralysis and this is a bug that should be fixed on the level of juju or I'm missing something?	13:42
gsimondo1	kjackal: OK let me take a look at that file. I'm software engineer turned devops so I can code.	13:43
kjackal	awesome	13:43
kjackal	so if you go to the reactive/influxdb.py and do something like an early "return" you will not be hitting this road block	13:44
kjackal	gsimondo1: your issue is already reported: https://bugs.launchpad.net/influxdb-charm/+bug/1723334	13:48
mup	Bug #1723334: influxdb-api relation breaks when related to telegraf charm <InfluxDB Charm:New> <https://launchpad.net/bugs/1723334>	13:48
gsimondo1	mup: Yep	13:49
mup	gsimondo1: I apologize, but I'm pretty strict about only responding to known commands.	13:49
kjackal	mup: help	13:51
mup	kjackal: Run "help <cmdname>" for details on: bug, contrib, echo, help, infer, issue, login, poke, register, run, sendraw, sms	13:51
zeestrat	mup: help help	13:51
mup	zeestrat: help [<cmdname>] — Displays available commands or details for a specific command.	13:51
kjackal	mup: help sms	13:52
mup	kjackal: sms <nick> <message ...> — Sends an SMS message.	13:52
mup	kjackal: The configured LDAP directory is queried for a person with the provided IRC nick ("mozillaNickname") and a phone ("mobile") in international format (+NN...). The message sender must also be registered in the LDAP directory with the IRC nick in use.	13:52
zeestrat	What a helpful fellow	13:53
gsimondo1	kjackal: commenting the problematic line works but for sure not a solution to the problem	14:00
kjackal	yes, this has to be fixxed upstream in the charm code	14:13
gsimondo1	kjackal: I don't know people who use canonical k8s yet but it seems to me that it would be rather painful without custom charms	14:25
kjackal	gsimondo1: are you deploying k8s?	14:26
kjackal	I am with the team working on packaging k8s, can you elaborate a bit on the issues you faced?	14:27
kjackal	gsimondo1: we have a set of add-on charms that do the monitoring and logging but they are not based on influxdb, we use prometheus	14:28
gsimondo1	kjackal: creating a proof of concept but firstly trying to break juju as much as possible. the out of the box k8s "production" bundle provided installs fine but we would like to use components that do not have charms already or don't have them for xenial	14:29
gsimondo1	kjackal: I want prometheus but I want to have influxdb for durable storage	14:29
gsimondo1	kjackal: currently I use remote read/write that they offer for our current solution outside of k8s	14:29
kjackal	would you be able to make a case for influxdb addition here: https://github.com/juju-solutions/bundle-canonical-kubernetes/issues	14:30
kjackal	gsimondo1: this is the place where we gather all requests and bugs on canonical k8s	14:31
gsimondo1	kjackal: the only problem is that remote read/write for prometheus and influxdb integration seems to be something that they want to solve by providing an ability to add influxdb as default storage for prometheus	14:33
gsimondo1	kjackal: here's paul dix on that https://www.youtube.com/watch?v=BZkHlhautGk	14:33
gsimondo1	kjackal: so building something for this remote read/write solution that may be superseded by a more superior solution that's part of prometheus code seems like a waste of time	14:34
gsimondo1	kjackal: but I can create an issue if nothing then for discussion about this	14:34
kjackal	that would be nice, thank you	14:36
kjackal	gsimondo1: We keep kubernetes bundle addons here: https://github.com/conjure-up/spells/tree/master/canonical-kubernetes/addons	14:37
kjackal	i guess graylog and prometheus are the most interesting	14:38
gsimondo1	kjackal: so prometheus one is pretty much what I'm trying to test currently, just with influxdb in the mix for storage.	14:40
gsimondo1	kjackal: outside of the bundle you provide of course	14:40
gsimondo1	kjackal: the most dangerous thing I've encountered so far is the bug in these upstream charms that led to an impasse in my testing environment where I can't do anything due to some relation hook failing. my conclusion from that is that one has to test all the edge cases of charms used if they don't want to experience such paralyzing events in production systems	14:42
gsimondo1	kjackal: again, not sure if juju should provide some kind of mechanism to exit the deadlock as you seem to suggest that charms should just work, if they don't then hot fix the code	14:43
kwmonroe	gsimondo1: one thing you might be hitting is that events can queue up on the failing unit, so you'd have to run --no-retry multiple times. for example, influx may fail on the query relation, but then have a peer or status event queue up behind that. if the root cause of the error affects all those events, you'd have to run "juju resolved --no-retry influxdb/0" multiple times to get the unit back to a responsive state.	15:56
kwmonroe	you could watch status with "juju debug-log -i influxdb/0 --tail" in one window while you're doing the --no-retry. you should see juju progress past one hook, then potentially move to another that you have to pass with --no-retry, etc etc.	15:57
gsimondo1	kwmonroe: useful, thanks for that. I'll test if that resolves the issue. So far I've been mitigating this by manually editing code to get out of the deadlock	15:59
kwmonroe	gsimondo1: fyi, this feels like bug 1741506. if we had a --force flag for juju remove-*, it could do the --no-retries automatically until the removal could be completed.	16:02
mup	Bug #1741506: Removing a unit or application doesn't resolve errors <cli> <usability> <juju:Triaged> <https://launchpad.net/bugs/1741506>	16:02
gsimondo1	kwmonroe: seems like it. also, from the point of view of UI, what you suggest is important because prompting people should be optional IMO	16:05
kwmonroe	+1	16:06
rick_h	bdx: howdy	17:01
=== frankban is now known as frankban\|afk
bobeo	rick_h: bdx So I deploye two instances of owncloud into juju,and I also deployed two instances of postgresql, and the databases have synced as expected, and are replicating as expected, however, owncloud did not, even though owncloud uses postgresql for user accounts and for data storage. Can anyone explain, or help me understand something that I missed?	19:29
kwmonroe	bobeo: a quick look at the charm source (https://github.com/omnivector-solutions/layer-owncloud/blob/master/metadata.yaml) doesn't show any peer relations. that means multiple owncloud instances won't do anything special when another comes along. if you've deployed 2 of those, i'd bet that they both have their own db created and are working as 2 independent ownclouds.	19:43
bobeo	kwmonroe: But it does seem that the postgresql that I deployed does have it. postgresql/0* active idle 4/lxd/6 10.0.0.42 5432/tcp Live master (9.5.11) postgresql/1 active idle 5/lxd/2 10.0.0.80 5432/tcp Live secondary (9.5.11)	19:46
bobeo	kwmonroe: The thing that confuses me, is that shouldnt the postgresql database be storing all of the configurations data, IE the databa OOOOO! if the webserver doesnt point to the data, it doesnt matter that its replicated!	19:46
bobeo	kwmonroe: so what would you recommend I do to "make it" do the thing? Its so close to perfect for me. What are my options?	19:47
kwmonroe	bobeo: what thing are you trying to do? make it so that 2 owncloud units use the same database?	19:48
bobeo	kwmonroe: as for owncloud to the db, I added the relation of postgresql:db owncloud:postgresql	19:48
kwmonroe	right -- postgres does have a peer relation, so it will replicate amongst the cluster of postgres units.	19:48
bobeo	kwmonroe: Yes, and that the data replicates in the database between the database clusters, specifically so that 2 servers have the data in two databases, so that if one physical server dies, services remain available.	19:48
bobeo	kwmonroe: The idea is point haproxy forwarders to the local haproxy services, which point at the owncloud servers that share the same postgresql clusters	19:49
bobeo	high availablity from top to bottom. I lost a django project today. Never again.	19:49
bobeo	I lost almost a month worth of code progress, damn near cried, and quite literally might have bitten off something's head had anything live been within range at the moment of realization	19:50
kwmonroe	ouch :(	19:51
bobeo	kwmonroe: yea, especially because im new to django. The idea is build cool apps that solve real problems using django, decorate it using bootstrap and reactive. I was so close. less than 30 minutes away, and I would have finished my first successful django project.	19:52
bobeo	kwmonroe: so im hoping this idea, combined with a gitlab to go with it, will prevent that from ever happening again. Plus itll allow me to share the code with others for peer review, as well as external input/assistance.	19:54
kwmonroe	bobeo: since the postgres peering work is already done, the next piece would be to make owncloud aware of its peers. i dunno what that entails, so you'd have to check the OC docs for whatever is needed to run a cluster of owncloud apps. it may be as simple as just ensuring they all use the same db creds.	19:55
kwmonroe	bobeo: i think the harder part will be for the object storage. i'm assuming OC uses some actual filesystem to store objects and that they're not just blobs in a database.	19:56
kwmonroe	so for that, you'd need to have a common NFS share, or s3 storage, or something so all the OC units could access the same backend data.	19:56
bobeo	kwmonroe: I was under the impression since it didnt include using NFS that it was storing as BLOB in the postgresql db. How do I verify if thats the case? Ive never run into not knowing how the "data" is stored before?	19:57
kwmonroe	bobeo: i see that bdx has storage on his todo list -- https://github.com/omnivector-solutions/layer-owncloud/blob/master/README.md#todo -- maybe sync with him to find out if there is any work in progress to help with.	19:58
bobeo	kwmonroe: ahhh! ill do that. Im hoping to get this thing running sooner, rather than later. My soul still burns. My poor django baby. I dont see how developers carry on. It feels like I lost a piece of my very being.	19:59
kwmonroe	bobeo: if you have an OC deployment handy, upload a relatively big file (100ish megs) and see where it goes on the unit. i'm guessing /var/www/owncloud/data (which is the location that would need to be shared amongst your OC peers).	20:02
kwmonroe	if it does jam it into postgres, you'll probably see /var/lib/postgres grow by 100ish megs	20:02
kwmonroe	bobeo: looks like you can setup external storage through the GUI; https://doc.owncloud.org/server/10.0/admin_manual/configuration/files/external_storage_configuration_gui.html -- so if you get multiple OC units talking to the same database, then go in to each UI and configure external storage, that might get you where you want to be. then you'd hopefully find a programatic way to do that config so the charm could do it for	20:06
kwmonroe	you next time.	20:06
rick_h	kwmonroe: bobeo yea, you need the OC charm to elect a leader, pass it's pgsql connection to the non-leaders, and enable the leader sending object storage creds to the others as well	20:12
rick_h	kwmonroe: bobeo the idea being that if a non-leader dies no biggie and if the leader goes down a new one will be elected and already have the details it needs	20:12
rick_h	kwmonroe: bobeo it's a bit much to bite off as your first bits of charming and maybe you can send a note to the list and get some examples. All the ones I can think of would be openstack and rather large chunks to process	20:13
bobeo	rick_h: That sounds like serious "OOoh dis gonna hurt" content to try to take on. I guess in the interum simply wait until my skill and comfort level builds? And use Git instead? Its mostly the code Im concerned about. Is that a feature gitlab supports?	20:17
kwmonroe	bobeo: the leadership bits rick_h was referring to would need to be implemented in the owncloud charm. the hard part of leadership is already handled by juju and layer-leadership. the last mile is that piece in the OC charm that sets/gets config to ensure all OC units have the same info.	21:28
kwmonroe	i'm not quite sure what you meant with "feature gitlab supports", but again, this would be up to the owncloud charm to coordinate the config amonst its peers.	21:29
yosefrow_	anyone with knowledge of juju/openstack hear of a kolla juju bundle for openstack	21:31
kwmonroe	the docs cover a bit about juju's leadership capabilities (https://jujucharms.com/docs/stable/developer-leadership) and the layer that owncloud would need to include has a bit more about how a reactive charm works with leadership (https://git.launchpad.net/layer-leadership/tree/README.md)	21:31
kwmonroe	^^ that was for bobeo, not you yosefrow_ :)	21:32
yosefrow_	I had an conversation yesterday trying to convince someone to switch to juju for openstack, and they insisted that Kolla is the future for openstack deployments	21:32
yosefrow_	i didnt really have a comeback	21:32
yosefrow_	pretty much the conversation came down to why Kolla (openstack with dockers) is better than juju openstack bundle (which uses lxd)	21:33
kwmonroe	yosefrow_: i don't have OS/Kolla exp, but typically when it comes down to docker vs lxd, it's good to consider what type of container is best for you. docker (process) vs lxc (system) containers.	21:34
kwmonroe	i like to ssh into things and poke around. having /var/log/* and "normal" systemd / init functions make system containers easier for me to diagnose/debug. ymmv of course.	21:35
yosefrow_	kwmonroe, their argument was basically that lxd is dying and therefore not nearly as well supported as docker	21:35
bobeo	yosefrow_: I would heavily disagree with them. Nothing is stopping you from installing magnum, which will allow you to use docker containers with openstack. You dont want docker containers for infrastructure, you want them for containers for projects. System containers allow you to push process containers, or containers in containers. You cant as easily do that with lxc in docker, but very easily with docker in lxc. VM in a VM arguement.	21:35
bobeo	a type 1 into a type 2, but you can a type 2 into a type 1, which I have also done, and currently do.	21:36
kwmonroe	did i miss a netcraft report? yosefrow_, i'd love to see the "lxd is dying" source.	21:37
yosefrow_	kwmonroe, if not dying, than not very popular	21:37
yosefrow_	thats pretty much what they said	21:37
yosefrow_	but i had no statistics	21:37
bobeo	kwmonroe: No, you didnt. I speak with the netcraft people all the time. LXD is doing better than fine these days. Docker is overfluffed peacock fluffing its feathers again as always.	21:38
yosefrow_	bobeo, their argument for container usage is that eventually they say openstack services will move to kubernetes because it allows seamless and 100% reliable deployment with pod failover features. Therefore, docker is the path forward.	21:38
yosefrow_	Is there a solid basis to say that this is not hte case, and that system services should not be provided in docker containers?	21:39
bobeo	yosefrow_: They are wrong. That feature already exists with Nova_Evactuate, which is also possible with Magnum. That feature has existed for years for Openstack. Not only are they full of it, they are also severely late to the game. I know kelsey personally, and he will tell you the same, as he has always told people. Kube has its purpose, so does docker, so does KVM, so does LXD.	21:40
bobeo	yosefrow_: Yes. The first would be my personal experience, when I had to rebuild my entire openstack environment because 1 container crapped out. Unfortunately, it was my Keystone container, the only one you cant recover from if you dont have a proper HA deployment available.	21:41
yosefrow_	bobeo, so the main issue with system containers is their tendency to fail, and failing to take into account that many system services cannot follow the recovery model, but must simply never go down ?	21:42
bobeo	yosefrow_: The second would be any number of members from the kubenetes development team. They will directly tell you kube is designed as a devops tool, not as an infrastructure tool. Its not designed with the robustness to maintain a high availability, high load, feature rich environment. Containers, especially docker, has been service driven in design. its designed to provide one specific purpose, and perform it well. Infrastrucuture	21:42
yosefrow_	Basically that the fact that kubernetes services can recover quickly is irrelevant because the services cannot afford to fail in the first place?	21:42
bobeo	continues to require, a robust list of capabilities, and adapt on the fly requirements, which containers arent designed to do. They are designed to be depricated, not maintained.	21:43
bobeo	yosefrow_: Service containers historically die, and are built around a model that thats acceptable. System containers are build around the idea of "available at all costs". One is designed for HA Enviroments to cover their weaknesses, one is designed to be the HA environment.	21:44
yosefrow_	bobeo beautifully put. I like this distinction.	21:44
bobeo	yosefrow_: also, LXC containers have a VERY LONG history of success, docker containers do not. They are very new comparatively, and people historically need to knwo things will work.	21:44
yosefrow_	bobeo, do you have a link I can show the next guy who questions the viability or survivability of LXD as a container platform?	21:45
bobeo	yosefrow_: Honestly, I would put docker inside of LXC containers, and benefit from both. Use kubernetes to manage the docker containers inside of LXC. Enjoy the benefits of LXC system performance and density improvements over KVM, and also enjoy the centralized management performance and mobility of Docker/Kubernetes. That is our current plan with our project. To migrate to that model.	21:46
bobeo	Sure. Check out LXD vs KVM first though. Source is Canonical. Vancouver Event. It gives a good breakdown of what KVM is, and what LXC containers are. It gives a great insight into the performance difference as well.	21:47
bobeo	yosefrow_: allows the other side to understand why lxc containers in the first place, and then see why LXC is very different from Docker containers.	21:47
yosefrow_	bobeo, my main question in this context is whether or not the future of openstack services will be docker driven. From what you've said I've gathered that even if the capability for OS services via docker exists, its a bad idea because OS is too sensitive to change to rely on the Cattle Philosophy.	21:47
bobeo	yosefrow_: a good way to structure the environment is think about it like Hypervisors vs Baremetal.	21:47
bobeo	yosefrow_: Yes, that is correct. Docker isnt designed to support drastic OS shifts. its why the current issues with docker containers is patch management. The containers die when you patch many times, whereas LXC dont give a sh*t. Patch em 50 times in an hour, LXC is Honey Badger, LXC dont care.	21:48
yosefrow_	bobeo, so you are aware of the challenges that projects like magnum/kolla are facing, and in your opinion they are simply barking up the wrong tree?	21:51
bobeo	yosefrow_: Yes. Think about it from this perspective. Instead of looking at it from a Me vs. You Perspective, if we work together to Co-Exist, we are both able to play to our strengths and weaknesses better, rather than trying to focus on being better generalists in a moot point to replace each other that will surely fail.	21:52
yosefrow_	bobeo, my interests are completely apolitical when it comes to solutions	21:53
yosefrow_	This is the line that sticks out for me: <bobeo> yosefrow_: Service containers historically die, and are built around a model that thats acceptable. System containers are build around the idea of "available at all costs". One is designed for HA Enviroments to cover their weaknesses, one is designed to be the HA environment	21:53
bobeo	yosefrow_: Open source devs, and projects need to realize co-existance is the best path to highest performance yield. I use a wide variety of projects that do the same thing to achieve a task, and pour my efforts and focus into interoperability, which is the easiest of dev tasks.	21:53
yosefrow_	bobeo, my job is to analyze, distill, and understand the spirit of things	21:53
yosefrow_	bobeo, I can sympathize and understand the frustration with entities that refuse to coexist as a symptom of feeling superior	21:54
bobeo	yosefrow_: Thats exactly the issue. Every project has this crazy idea that it needs to be "superior" whatever that means. Open source isnt the Olympics, you dont get a Gold Medal or a Good Job Cookie for being the best.	21:55
yosefrow_	bobeo, on the other hand, a good amount of ego feeds innovation	21:56
bobeo	yosefrow_: To be honest, there is no "best", only "best for the job at hand". For Us, that means using MySQL, MongoDB, and PostgreSQL at the same time. Each better at something than the other, using HAProxy, NGinx, AND Apache to get the jobs done.	21:57
bobeo	yosefrow_: Not really. Historically Ego has been toxic to innovation.	21:57
bobeo	yosefrow_: If nothing else, ego is what drives your teams apart, and pushes them to create their own project, just to spite yours.	21:57
yosefrow_	bobeo, I mean innovation requires a healthy amount of pride in what you do	21:57
yosefrow_	if you dont believe in the solution you are building, why build it	21:58
bobeo	yosefrow_: Its not in the best interest of the community. Yes, its excellent to take pride in what you do, but not to be proudful. Pride is the greatest path to self destruction.	21:58
yosefrow_	I'm not saying to ignore the next guy, just to have enough pride to drive your passion forward	21:58
bobeo	yosefrow_: absolutely dont ignore them, call them, hang out with them, share beers together. When projects merge, that when you see amazing things built. Almost all of our greatest OS tech came from projects merging. When great minds collide, miracles happen. Let passion be your guide, and you will find that you will fly faster than any rocket ever could.	21:59
yosefrow_	If in fact someone is blinding themselves because of pride, that's of course destructive behavior	21:59
yosefrow_	but if pride is driving someone forward causing them to believe they can do something better because they know they can, than I think its a good thing	22:00
bobeo	yosefrow_: Yes, but the issue is pride can be bruised and beaten, passion cannot.	22:00
yosefrow_	ok, I tried to play the devil's out advocate. I'm out of ammo xD	22:01
yosefrow_	bobeo, well I think the shift towards a more cooperative community is happening already. But there are still pockets of resistance, where some people refuse to cooperate because of their own self importance.	22:02
yosefrow_	bobeo, We will get there though. I'm sure	22:03
bobeo	yosefrow_: Lol! I learned the hard way. My pride got so large it became a sun, creating its own gravity well, crushing everything I valued, blinding me, and pushing a lot of people I cared about away. I learned that life lesson the hard way, hopefully you wont have to. And yes, there will be hold outs. And they will burn, just like Rohan.	22:03
yosefrow_	bobeo, been there and done that. I've been through the fires that lead to self awareness. I can't say I've completely discarded my pride. But at least I've become aware of it.	22:04
yosefrow_	Either way, I agree with what you said earlier, that both docker and lxd have an important role to play in the future of openstack and cloud computing in general	22:05
yosefrow_	I will take your advice with me to my next conversation. Thanks for all the tips :)	22:05

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!