/srv/irclogs.ubuntu.com/2017/09/11/#juju.txt

stokachufallenour_: how'd it go?00:05
fallenour_still trying to install01:07
fallenour_keep getting a failed error, just realized though the error was lying, and the install was fine01:07
fallenour_so the past ~6 installs were all for nothing, wasted because of false error reporting01:07
fallenour_needless to say, juju is bringing me bad juju, and making me one sad panda01:08
fallenour_yogurt made my night better though01:08
fallenour_also question, I'm installing the standard conjure-up deployment, why is it that ceph-osd 2 and 3, the standard ones +1, aren't seeing the ceph-mon, even though the system automatically installs it?01:11
fallenour_@stokachu01:14
stokachufallenour_: they should be related and once the deployment is complete they would see each other01:16
stokachufallenour_:also we are testing `sudo snap refresh conjure-up --candidate`, you may have a better experience there01:16
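(Once the deployment settles, a quick check that the two applications actually joined up - a sketch assuming the default application names from the spell:)
    # the yaml status lists the relations each application has joined
    juju status --format=yaml ceph-osd
    # and ceph's own view of the osds, run from a mon unit once they're related
    juju run --unit ceph-mon/0 'sudo ceph -s'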
fallenour_@stokachu yea its cleaning up01:17
fallenour_you guys should really consider using my project as a large scale guinea pig01:17
fallenour_as insane as it drives me, its a great project, and a great idea, and I use openstack to provide a lot of services for free to a lot of major efforts01:17
stokachuwhat project?01:18
fallenour_Project PANDA, short for platform accessibility and development acceleration01:18
stokachuis it public?01:18
fallenour_Its designed to provide free infrastructure and services to nonprofits, research institutes, universities, and OSS developers01:18
fallenour_yeap, very public01:18
stokachuwhats the project url?01:19
fallenour_pending these last hurdles, I expect to take it fully public and live by the end of this month01:19
fallenour_100 Gbps pipe, and about 10 racks of gear to start with01:19
fallenour_3 supercomputers (small beowulf clusters)01:19
fallenour_damn, neutron gateway errored out01:20
stokachufallenour_:yea neutron needs access to a bridge device01:21
fallenour_@stokachu giving me a "config-changed " error01:21
stokachuso depending on your server you can set a range of bridges for neutron to search through01:21
fallenour_it should have one01:21
fallenour_right now the test stack is about 15 servers01:21
fallenour_does it configure a bridge when building via conjure-up?01:22
fallenour_it deploys the system, I figured it did by default01:22
fallenour_via eth1....01:22
fallenour_o.o01:22
fallenour_8O01:22
stokachufallenour_:https://jujucharms.com/neutron-gateway/237 look at port configuration01:23
stokachufallenour_:not openstack on maas, that's up to you01:23
stokachuyou can configure the port in the configure section for neutron gateway01:23
fallenour_oh my dear lawd! https://jujucharms.com/neutron-gateway/234 01:24
fallenour_Holy geebus Batman! you even provided me the config links via the status command output01:24
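(A hedged sketch of the port configuration stokachu points at, assuming the gateway's external NIC is eth1 and that the charm's data-port option maps it onto br-ex; check the charm page above for the exact option names:)
    juju config neutron-gateway data-port=br-ex:eth1
    # if NIC names differ across machines, mapping by MAC address is assumed to work instead
    juju config neutron-gateway data-port=br-ex:aa:bb:cc:dd:ee:ff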
fallenour_its not letting me ssh in?01:26
fallenour_isnt it supposed to inherit my maas ssh key?01:30
fallenour_hmmm01:37
fallenour_@stokachu Hey just an fyi, one of the systems we are working on is an equivalent to Red Hat Satellite for OpenStack environments, didn't know if that's a system already01:50
fallenour_but its major helpful for us, especially because we have limited bandwidth at the current location01:51
fallenour_not seeing two of my storage nodes in my volumes, can anyone provide any insight as to why?02:44
fallenour_I have ceph-mon and ceph-osd installed02:44
fallenour_ceph-mon shows 5/5 of cluster02:44
bdxrick_h: just to recap, I was haggling the collectd charm to get the prometheus-node-exporter, I just ended up going with a subordinate that relates to prometheus on the scrape interface https://jujucharms.com/u/jamesbeedy/prometheus-node-exporter/1 03:40
bdxand just dropping collectd03:40
tlyngI'm trying to bootstrap a controller on azure and it's stuck at "Contacting Juju controller at <internal-ip> to verify accessibility...". The controller VM gets assigned an internal IP and an external IP. I've tried connecting to the external IP using SSH and that is successful. How is juju supposed to connect to an internal IP at azure which is not routable from here? Apart from that I noticed the API server is listening on port 17070 or so08:35
tlyngIs there a list of ports that need to be open (apart from ssh) in firewall to actually manage to use juju on public clouds?08:35
=== disposable3 is now known as disposable2
tlyngI deployed Kubernetes using JAAS, but when trying to download the kubectl configuration from kubernetes-master/0 I get an authentication error. My private ssh key is not recognized by that node (juju scp kubernetes-master/0:config ~/.kube/config), how am I supposed to get hold of this configuration?10:30
mhiltontlyng: have you tried running juju add-ssh-key to add your key to the model?10:35
tlyngmhilton: no, didn't even know that command existed (I'm new :-)) I will try it. Should I do it before I deploy the model or is it possible to do it after it's up and running?10:36
mhiltontlyng: I think it should work after the model is up and running.10:37
rogpeppe1tlyng: what mhilton says10:37
mhiltontlyng: if your key is in github or launchpad then it can also be imported with juju import-ssh-key which might be slightly easier.10:37
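(A minimal sketch of the flow mhilton describes; the key path and GitHub username are placeholders:)
    juju add-ssh-key "$(cat ~/.ssh/id_rsa.pub)"
    # or pull a key already published on GitHub/Launchpad:
    juju import-ssh-key gh:<github-username>
    # after which the copy tlyng was attempting should authenticate:
    juju scp kubernetes-master/0:config ~/.kube/config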
tlyngmhilton: Ok thanks, I'll try. Another quick question if you have time / knowledge about it. I've tried bootstrapping my own controller at Azure, but after it has launched the bootstrap agent it tries to connect to the VM's internal IP address - which is not routable.10:39
tlyngContacting Juju controller at 192.168.16.4 to verify accessibility... ERROR unable to contact api server after 1 attempts: try was stopped10:39
mhiltontlyng, azure can be slow to bootstrap, it sometimes has to wait a while before it gets an external IP address. What version of juju have you got (output of "juju version")10:41
tlyng2.2.3-sierra-amd6410:42
tlyng(the one provided by homebrew on mac)10:43
tlyngit connects using the external IP to bootstrap (after it first tries the internal IP). But when it's waiting for the controller it only tries the internal IP, and it deletes everything when it fails.10:44
mhiltontlyng: OK that's interesting. I'll see if I see the same behaviour.10:46
tlyngSadly I have to use Azure, at least for the time being. It looks like Microsoft has created this stuff called "security" and told the authorities about it. So if you're in the financial industry only "azure" is certified/approved by the government.10:48
=== freyes__ is now known as freyes
mhiltontlyng, I've just successfully bootstrapped an Azure controller with that juju version. I think your bootstrapping problem was that it couldn't talk to port 17070 on the external address. Even though it only said it was contacting the internal address it will be contacting all of them at the same time.11:44
mhiltontlyng: port 17070 is the only port you'll need access to for juju to communicate with the controller.11:45
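(A quick way to confirm that reachability from a workstation, assuming nc is available; the controller address is a placeholder taken from juju's own output:)
    # the controller's public address is listed under api-endpoints
    juju show-controller
    # then probe the API port through whatever firewall is in the way
    nc -vz <controller-public-ip> 17070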
tlyngmhilton: Ok, thank you. From now on I will use my phone as modem. Did I mention I hate firewalls?11:46
mhiltontlyng: The easiest way to run models on Azure is through JAAS11:46
rick_htlyng: I'm testing it as well and seeing some issues. I'm working to collect a bootstrap with --debug for filing a bug. At the moment seems Juju can't get the agents needed. :/11:47
rick_htlyng: I'll bug balloons once it finishes timing out and get a bug report going11:47
tlyngWhat about persistent volume claims after deploying to Azure, do they work out of the box?11:54
tlyngCurrently it says "Pending" and it's been like that for some time.11:55
urulamamhilton, rick_h: fyi, i was able to bootstrap on azure/westeurope with 2.2.3 ... might be region thing12:15
ejathi .. can we use --constraints with bundle ?12:32
fallenour_!ceph13:21
rick_hejat: you stick the constraints on the machine or application in the bundle.13:24
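(A hedged sketch of what rick_h means; the mybundle.yaml filename, application name, charm, and constraint values are made up for illustration:)
    # sketch of a bundle file - constraints sit on the application and/or machine entries
    #   applications:
    #     mysql:
    #       charm: cs:mysql
    #       num_units: 1
    #       to: ["0"]
    #       constraints: mem=8G cores=2
    #   machines:
    #     "0":
    #       constraints: mem=16G root-disk=100G
    juju deploy ./mybundle.yaml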
BarDwellerNice work on adding flush =) my vagrant provisioning is a little more chatty now =) nice to see it slowly put the world together =)13:28
rick_hurulama: mhilton tlyng so I did get azure to bootstrap but it literally took 13min to get there.13:44
tlyngrick_h: yes, it's slow. I'm still unable to use azure storage and the loadbalancer stuff (for services). It doesn't look like the canonical distribution of kubernetes actually configures cloud providers, which I would say is broken.13:47
tlyngusing ceph on cloud providers ain't that wise13:47
tlyng(due to fault domains, data locality etc)13:47
stokachuBarDweller: nice!13:53
SimonKLBtlyng: juju currently doesnt enable charms to do anything cloud native such as setting up policies etc but with conjure-up there is some initial work on bootstrapping the kubernetes cluster (only on aws for now)13:53
fallenour_I figured out my problem is that the keyrings are in the wrong place, which is why it never got configured, but I need to know the cluster id so I can move the keyring to the appropriate directory @stokachu13:54
SimonKLBtlyng: see https://github.com/conjure-up/spells/pull/79 13:54
stokachucoreycb: jamespage ^ do you know anything about this wrt ceph-mon/ceph-osd?13:55
stokachutlyng: yea azure is next on our list to enable their storage/load balancer13:55
fallenour_the ceph god himself o.o13:55
stokachu:)13:55
fallenour_I am not worthy o.o13:55
fallenour_by the way, for future reference guides on Ceph-OSD, please see: http://docs.ceph.com/docs/jewel/rados/operations/add-or-rm-osds/ http://docs.ceph.com/docs/master/radosgw/admin/ http://docs.ceph.com/docs/master/radosgw/config-ref/ https://fatmin.com/2015/08/13/ceph-simple-ceph-pool-commands-for-beginners/14:03
fallenour_http://docs.ceph.com/docs/dumpling/rados/operations/pools/ http://ceph.com/geen-categorie/how-data-is-stored-in-ceph-cluster/14:04
fallenour_all very good resources14:04
jamespagefallenour_: give me the 101 on what you are trying to do14:15
fallenour_@jamespage hey james, https://github.com/fallenour/panda this is what I am working towards.14:35
fallenour_right now my struggle is getting the environment stable so I can go live, which is proving to be difficult14:36
fallenour_right now I think the issue is related to ceph-osd and ceph-mon, specifically with the /var/lib/ceph/mds directories missing on all ceph-mon  and ceph-osd systems14:37
fallenour_the error output is  "No block devices detected using current configuration" and "  auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring: (2) No such file or directory"14:38
fallenour_my direct thoughts are that since the directory /var/lib/ceph/mds was never created, and the /etc/ceph/ceph.conf file points to it for a keyring for mds, that is the reason why it's not working or responding to ceph-osd commands, which would explain why it thinks there isn't any ceph block storage14:39
fallenour_what confuses me the most though is in my horizon, I see 3 of the 5 storage devices.14:39
fallenour_my guess is that because nova-compute is still working, the host can still see the storage, even though it may not be able to use it.14:39
fallenour_Ive identified that /var/lib/ceph/mon has its keyring, and both upgrade keyrings are present, but im not sure what keyring to copy to fix it, or if thats even the issue.14:41
fallenour_one thing I did realize is that the keyring in /var/lib/ceph/mon/$cluster_id is the same keyring across multiple systems, but im not sure what uses it.14:42
jamespagefallenour_: thats generated by ceph during the cluster bootstrap process I think14:43
fallenour_yea I found the bootstrap scripts for that14:43
fallenour_@jamespage Do you think the error might be that the mds directories were never created? And if so, why didn't the yaml build script build those?14:43
jamespagethe mds directory being missing should not be a problem - that's related to ceph-fs14:43
fallenour_mmk14:44
jamespagewhere are you trying to run the ceph commands from?14:44
fallenour_from juju14:44
jamespageexample?14:44
jamespagewhich unit?14:44
fallenour_juju run --unit ceph-osd/3 .....14:45
fallenour_and I've tried running it on multiple systems14:45
fallenour_do I need to run it specifically against the radosgw system?14:45
fallenour_also, one thing I just noticed, my $cluster_id variable is empty on the ceph nodes. If im not mistaken, that variable is used to define where keyrings are located14:46
fallenour_@jamespage how can I verify that the variable is populated properly, aside from juju run --unit ceph-osd/3 'echo "$cluster_id"'14:46
jamespagefallenour_: that's internal to ceph, not an environment variable14:47
jamespagethe cluster_id is by default 'ceph'14:47
fallenour_@jamespage ahh I see. So what happens when ceph needs that variable or something outside of ceph needs the variable info in order to locate the keyring?14:47
jamespagethat all gets passed via command line options14:48
jamespagefwiw the ceph-osd units don't get admin keyrings so you won't be able to run commands from those units14:48
jamespageonly from the ceph-mon units, where "sudo ceph -s" should just work14:48
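(Concretely, per jamespage's guidance, something like this - the unit number is whichever ceph-mon unit juju status shows:)
    juju run --unit ceph-mon/0 'sudo ceph -s'
    # the ceph-osd units don't get the admin keyring, so admin commands fail there
    juju run --unit ceph-mon/0 'sudo ceph osd tree'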
fallenour_@jamespage my /etc/ceph/ceph.conf file still points at /var/lib/ceph/mon/$cluster-id14:48
fallenour_@jamespage I made all of my units a ceph-osd / ceph-mon pair. I didnt know if they all needed ceph mon, so I made 5 and 5 respectively14:49
jamespagefallenour_: ok so that's actually broken atm - you can't co-locate the charms (there is a bug open)14:50
fallenour_@jamespage ooooh...14:50
jamespagefallenour_: normally we deploy three ceph-mon units in LXD containers, and ceph-osd directly on the hardware14:50
fallenour_@jamespage Yea thats what I did14:50
fallenour_@jamespage I put all 5 on hardware, and 5 in lxd containers, ceph-osd hardware, ceph-mon lxd14:51
fallenour_@jamespage I figured it was done that way for a reason, so I copied the design for the other 2 additional storage units14:51
jamespageoh well that should work just fine - what does "sudo ceph -s" on a ceph-mon unit do?14:51
jamespagebut 5 is overkill - 3 is fine14:51
jamespagethere is no horizontal scale-out feature for ceph-mon - it's control only14:52
jamespagehave to drop for a bit to go find my room at the PTG14:52
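(A minimal sketch of the layout jamespage describes, three mons in LXD containers and osds on the metal; machine numbers, unit counts, and the osd-devices value are placeholders:)
    juju deploy -n 3 ceph-mon --to lxd:0,lxd:1,lxd:2
    juju deploy -n 3 ceph-osd --config osd-devices=/dev/sdb
    juju add-relation ceph-mon ceph-osd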
fallenour_I didnt want to have to scale it later, I figured 5 for 500 PB of storage would be good14:53
fallenour_Output:
    cluster fc36db4c-9693-11e7-aae7-00163e20bc2c
     health HEALTH_ERR
            196 pgs are stuck inactive for more than 300 seconds
            196 pgs stuck inactive
            196 pgs stuck unclean
            no osds
     monmap e2: 5 mons at {juju-950b53-0-lxd-0=10.0.0.51:6789/0,juju-950b53-1-lxd-0=10.0.0.10:6789/0,juju-950b53-2-lxd-0=10.0.0.37:6789/0,juju-950b53-3-lxd-0=10.0.0.252:6789/0,juju-950b53-4-lxd-0=10.0.0.40:14:53
Dweller_when I'm running --edge, I had to install lxd with snap before installing conjure-up, and I don't think I have a conjure-up.lxc command anymore..14:53
stokachuDweller_: yea all that went away now you just use the snap lxd14:54
Dweller_but when I do lxc list .. it doesn't show the juju containers?14:54
Dweller_mebbe I need to set a config somewhere14:55
stokachudoes juju status show anything?14:55
Dweller_it does.. until I first do lxc list, and then it breaks14:55
Dweller_(rebuilding the vm at the mo, will be able to confirm when it comes back up)14:56
fallenour_@jamespage just an fyi, power is becoming unstable, hurricane is coming towards georgia, so if i dont respond, that is why.14:57
=== disposable3 is now known as disposable2
Dweller_ok.. vm is back up.. juju status shows all containers as running and active17:13
Dweller_is there any config I should set for lxc to list the containers using lxc ?17:14
magicaltrouthello ya'll completely off the wall question here, but here goes17:15
magicaltroutif I wanted to run K8S in LXC/LXD on CentOS, my understanding is that conjure-up makes some changes to the profile to allow it on Ubuntu? Can I manually make those changes on CentOS or is that out of the question?17:16
stokachumagicaltrout: i think the changes made are related to app armor17:19
stokachusome of the changes17:19
stokachuthe others are just enabling privileged etc17:19
magicaltrouthrmmm17:19
magicaltroutk17:19
stokachumagicaltrout: https://github.com/conjure-up/spells/blob/master/canonical-kubernetes/steps/lxd-profile.yaml17:19
stokachuthats what our profile looks like17:20
stokachuthe lxc.aa_profile is apparmor17:20
stokachunot sure if devices apply either17:20
magicaltroutokay cool thanks stokachu i'll have a prod17:21
stokachumagicaltrout: np17:21
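(A hedged translation of that spell profile into plain lxc commands; "juju-default" is an assumed profile name - lxc profile list shows the real one - and the keys should be checked against the linked yaml:)
    lxc profile set juju-default security.privileged true
    lxc profile set juju-default security.nesting true
    lxc profile set juju-default linux.kernel_modules "ip_tables,ip6_tables,netlink_diag,nf_nat,overlay"
    # the aa_profile bit is the AppArmor piece stokachu mentioned; skip it on a host without AppArmor
    lxc profile set juju-default raw.lxc "lxc.aa_profile=unconfined"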
Dweller_confirmed.. I'm probably doing something wrong.. I install lxd with snap install lxd .. then I install conjure-up with  snap install conjure-up --classic --edge  then I bring up kube with  conjure-up kubernetes-core localhost  ..  after which   juju status  shows the stuff up and running..  I then do lxc list  and it mumbles about generating a client certificate, then lists no containers at all, and after that.. juju status18:38
Dweller_just hangs and doesn't work anymore18:38
stokachuhmm18:38
stokachuDweller_: what does `which lxc` show18:39
Dweller_ /usr/bin/lxc18:39
stokachutry /snap/bin/lxc list18:39
stokachuim curious18:39
Dweller_that works18:39
Dweller_ok.. so stock ubuntu has an lxc that isn't the one that juju used =) no probs.. I can work with that18:40
stokachuDweller_: yea conjure-up uses the snap lxd for its deployments18:40
stokachuthough i thought the environment's PATH had /snap/bin listed first18:40
Dweller_for me, /snap/bin is at the end18:41
stokachuok, it may just be something i have to document for now18:41
Dweller_I wonder if I can apt uninstall the old lxc18:41
stokachuuntil snap lxd becomes the default18:41
stokachuDweller_: yea if you aren't using the deb installed one18:41
Dweller_added apt-get purge -y lxd lxd-client to my vagrantfile =) that should sort it18:55
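(Per stokachu's PATH comment, two ways to make sure the snap lxc is the one that gets used; the purge Dweller_ added is the second:)
    # see every lxc on the PATH - stock ubuntu puts the deb one first
    which -a lxc
    # either put /snap/bin ahead for this shell...
    export PATH=/snap/bin:$PATH
    # ...or remove the deb client entirely, as above
    sudo apt-get purge -y lxd lxd-client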
=== antdillon_ is now known as antdillon
=== zeestrat_ is now known as zeestrat
=== rick_h_ is now known as rick_h
=== stokachu_ is now known as stokachu
Dweller_hmm.. my last 2 bringups have got stuck at the 'setting relation' bit20:52
Dweller_interesting.. I need to confirm this.. but I _think_ if I apt-get purge lxd-client before I do conjure-up lxd / conjure-up kubernetes-core .. then conjure-up kubernetes-core hangs at the 'Setting relation ...' phase (never gets to 'Waiting for deployment to settle' log output)23:27
Dweller_which really kinda makes you wonder whats going on there, and could it be using the 'wrong' lxc atm ?23:29
Dweller_hmm.. I mean snap install xd / conjure-up kubernetes-core ;p23:31
Dweller_s/xd/lxd23:31
