/srv/irclogs.ubuntu.com/2017/09/11/#juju.txt

stokachufallenour_: how'd it go?00:05
fallenour_still trying to install01:07
fallenour_keep getting a failed error, just realized though the error was lying, and the install was fine01:07
fallenour_so the past ~6 installs were all for nothing, wasted because of false error reporting01:07
fallenour_needless to say, juju is bringing me bad juju, and making me one sad panda01:08
fallenour_yogurt made my night better though01:08
fallenour_also question, I'm installing the standard conjure-up deployment, why is it that ceph-osd 2 and 3, the standard ones +1, aren't seeing the ceph-mon, even though the system automatically installs it?01:11
fallenour_@stokachu01:14
stokachufallenour_: they should be related and once the deployment is complete they would see each other01:16
stokachufallenour_:also we are testing `sudo snap refresh conjure-up --candidate`, you may have a better experience there01:16
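(Once the deployment settles, a quick check that the two applications actually joined up - a sketch assuming the default application names from the spell:)
    # the yaml status lists the relations each application has joined
    juju status --format=yaml ceph-osd
    # and ceph's own view of the osds, run from a mon unit once they're related
    juju run --unit ceph-mon/0 'sudo ceph -s'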
fallenour_@stokachu yea its cleaning up01:17
fallenour_you guys should really consider using my project as a large scale guinea pig01:17
fallenour_as insane as it drives me, its a great project, and a great idea, and I use openstack to provide a lot of services for free to a lot of major efforts01:17
stokachuwhat project?01:18
fallenour_Project PANDA, short for platform accessibility and development acceleration01:18
stokachuis it public?01:18
fallenour_Its designed to provide free infrastructure and services to nonprofits, research institutes, universities, and OSS developers01:18
fallenour_yeap, very public01:18
stokachuwhats the project url?01:19
fallenour_pending these last hurdles, I expect to take it fully public and live by the end of this month01:19
fallenour_100 Gbps pipe, and about 10 racks of gear to start with01:19
fallenour_3 supercomputers (small beowulf clusters)01:19
fallenour_damn, neutron gateway errored out01:20
stokachufallenour_:yea neutron needs access to a bridge device01:21
fallenour_@stokachu giving me a "config-changed " error01:21
stokachuso depending on your server you can set a range of bridges for neutron to search through01:21
fallenour_it should have one01:21
fallenour_right now the test stack is about 15 servers01:21
fallenour_does it configure a bridge when building via conjure-up?01:22
fallenour_it deploys the system, I figured it did by default01:22
fallenour_via eth1....01:22
fallenour_o.o01:22
fallenour_8O01:22
stokachufallenour_:https://jujucharms.com/neutron-gateway/237 look at port configuration01:23
stokachufallenour_:not openstack on maas, that's up to you01:23
stokachuyou can configure the port in the configure section for neutron gateway01:23
fallenour_oh my dear lawd! https://jujucharms.com/neutron-gateway/234 01:24
fallenour_Holy geebus Batman! you even provided me the config links via the status command output01:24
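(A hedged sketch of the port configuration stokachu points at, assuming the gateway's external NIC is eth1 and that the charm's data-port option maps it onto br-ex; check the charm page above for the exact option names:)
    juju config neutron-gateway data-port=br-ex:eth1
    # if NIC names differ across machines, mapping by MAC address is assumed to work instead
    juju config neutron-gateway data-port=br-ex:aa:bb:cc:dd:ee:ff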
fallenour_its not letting me ssh in?01:26
fallenour_isnt it supposed to inherit my maas ssh key?01:30
fallenour_hmmm01:37
fallenour_@stokachu Hey just an fyi, one of the systems we are working on is an equivalent to Red Hat Satellite for OpenStack environments, didn't know if that's a system already01:50
fallenour_but its major helpful for us, especially because we have limited bandwidth at the current location01:51
fallenour_not seeing two of my storage nodes in my volumes, can anyone provide any insight as to why?02:44
fallenour_I have ceph-mon and ceph-osd installed02:44
fallenour_ceph-mon shows 5/5 of cluster02:44
bdxrick_h: just to recap, I was haggling the collectd charm to get the prometheus-node-exporter, I just ended up going with a subordinate that relates to prometheus on the scrape interface https://jujucharms.com/u/jamesbeedy/prometheus-node-exporter/1 03:40
bdxand just dropping collectd03:40
tlyngI'm trying to bootstrap a controller on azure and it's stuck at "Contacting Juju controller at <internal-ip> to verify accessibility...". The controller VM gets assigned an internal IP and an external IP. I've tried connecting to the external IP using SSH and that is successful. How is juju supposed to connect to an internal IP at azure which is not routable from here? Apart from that I noticed the API server is listening on port 17070 or so08:35
tlyngIs there a list of ports that need to be open (apart from ssh) in firewall to actually manage to use juju on public clouds?08:35
=== disposable3 is now known as disposable2
tlyngI deployed Kubernetes using JAAS, but when trying to download the kubectl configuration from kubernetes-master/0 I get an authentication error. My private ssh key is not recognized by that node (juju scp kubernetes-master/0:config ~/.kube/config), how am I supposed to get hold of this configuration?10:30
mhiltontlyng: have you tried running juju add-ssh-key to add your key to the model?10:35
tlyngmhilton: no, didn't even know that command existed (I'm new :-)) I will try it. Should I do it before I deploy the model or is it possible to do it after it's up and running?10:36
mhiltontlyng: I think it should work after the model is up and running.10:37
rogpeppe1tlyng: what mhilton says10:37
mhiltontlyng: if your key is in github or launchpad then it can also be imported with juju import-ssh-key which might be slightly easier.10:37
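(A minimal sketch of the flow mhilton describes; the key path and GitHub username are placeholders:)
    juju add-ssh-key "$(cat ~/.ssh/id_rsa.pub)"
    # or pull a key already published on GitHub/Launchpad:
    juju import-ssh-key gh:<github-username>
    # after which the copy tlyng was attempting should authenticate:
    juju scp kubernetes-master/0:config ~/.kube/config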
tlyngmhilton: Ok thanks, I'll try. Another quick question if you have time / knowledge about it. I've tried bootstrapping my own controller at Azure, but after it has launched the bootstrap agent it tries to connect to the VM's internal IP address - which is not routable.10:39
tlyngContacting Juju controller at 192.168.16.4 to verify accessibility... ERROR unable to contact api server after 1 attempts: try was stopped10:39
mhiltontlyng, azure can be slow to bootstrap, it sometimes has to wait a while before it gets an external IP address. What version of juju have you got (output of "juju version")10:41
tlyng2.2.3-sierra-amd6410:42
tlyng(the one provided by homebrew on mac)10:43
tlyngit connects using the external IP to bootstrap (after it first tries the internal IP). But when it's waiting for the controller it only tries the internal IP, and it deletes everything when it fails.10:44
mhiltontlyng: OK that's interesting. I'll see if I see the same behaviour.10:46
tlyngSadly I have to use Azure, at least for the time being. It looks like Microsoft has created this stuff called "security" and told the authorities about it. So if you're in the financial industry only "azure" is certified/approved by the government.10:48
=== freyes__ is now known as freyes
mhiltontlyng, I've just successfully bootstrapped an Azure controller with that juju version. I think your bootstrapping problem was that it couldn't talk to port 17070 on the external address. Even though it only said it was contacting the internal address it will be contacting all of them at the same time.11:44
mhiltontlyng: port 17070 is the only port you'll need access to for juju to communicate with the controller.11:45
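(A quick way to confirm that reachability from a workstation, assuming nc is available; the controller address is a placeholder taken from juju's own output:)
    # the controller's public address is listed under api-endpoints
    juju show-controller
    # then probe the API port through whatever firewall is in the way
    nc -vz <controller-public-ip> 17070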
tlyngmhilton: Ok, thank you. From now on I will use my phone as modem. Did I mention I hate firewalls?11:46
mhiltontlyng: The easiest way to run models on Azure is through JAAS11:46
rick_htlyng: I'm testing it as well and seeing some issues. I'm working to collect a bootstrap with --debug for filing a bug. At the moment seems Juju can't get the agents needed. :/11:47
rick_htlyng: I'll bug balloons once it finishes timing out and get a bug report going11:47
tlyngWhat about persistent volume claims after deploying to Azure, do they work out of the box?11:54
tlyngCurrently it says "Pending" and it's been like that for some time.11:55
urulamamhilton, rick_h: fyi, i was able to bootstrap on azure/westeurope with 2.2.3 ... might be region thing12:15
ejathi .. can we use --constraints with bundle ?12:32
fallenour_!ceph13:21
rick_hejat: you stick the constraints on the machine or application in the bundle.13:24
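(A hedged sketch of what rick_h means; the mybundle.yaml filename, application name, charm, and constraint values are made up for illustration:)
    # sketch of a bundle file - constraints sit on the application and/or machine entries
    #   applications:
    #     mysql:
    #       charm: cs:mysql
    #       num_units: 1
    #       to: ["0"]
    #       constraints: mem=8G cores=2
    #   machines:
    #     "0":
    #       constraints: mem=16G root-disk=100G
    juju deploy ./mybundle.yaml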
BarDwellerNice work on adding flush =) my vagrant provisioning is a little more chatty now =) nice to see it slowly put the world together =)13:28
rick_hurulama: mhilton tlyng so I did get azure to bootstrap but it literally took 13min to get there.13:44
tlyngrick_h: yes, it's slow. I'm still unable to use azure storage and the loadbalancer stuff (for services). It doesn't look like the canonical distribution of kubernetes actually configures cloud providers, which I would say is broken.13:47
tlyngusing ceph on cloud providers ain't that wise13:47
tlyng(due to fault domains, data locality etc)13:47
stokachuBarDweller: nice!13:53
SimonKLBtlyng: juju currently doesnt enable charms to do anything cloud native such as setting up policies etc but with conjure-up there is some initial work on bootstrapping the kubernetes cluster (only on aws for now)13:53
fallenour_I figured out my problem is that the keyrings are in the wrong place, which is why it never got configured, but I need to know the cluster id so I can move the keyring to the appropriate directory @stokachu13:54
SimonKLBtlyng: see https://github.com/conjure-up/spells/pull/79 13:54
stokachucoreycb: jamespage ^ do you know anything about this wrt ceph-mon/ceph-osd?13:55
stokachutlyng: yea azure is next on our list to enable their storage/load balancer13:55
fallenour_the ceph god himself o.o13:55
stokachu:)13:55
fallenour_I am not worthy o.o13:55
fallenour_by the way, for future reference guides on Ceph-OSD, please see: http://docs.ceph.com/docs/jewel/rados/operations/add-or-rm-osds/ http://docs.ceph.com/docs/master/radosgw/admin/ http://docs.ceph.com/docs/master/radosgw/config-ref/ https://fatmin.com/2015/08/13/ceph-simple-ceph-pool-commands-for-beginners/14:03
fallenour_http://docs.ceph.com/docs/dumpling/rados/operations/pools/ http://ceph.com/geen-categorie/how-data-is-stored-in-ceph-cluster/14:04
fallenour_all very good resources14:04
jamespagefallenour_: give me the 101 on what you are trying to do14:15
fallenour_@jamespage hey james, https://github.com/fallenour/panda this is what I am working towards.14:35
fallenour_right now my struggle is getting the environment stable so I can go live, which is proving to be difficult14:36
fallenour_right now I think the issue is related to ceph-osd and ceph-mon, specifically with the /var/lib/ceph/mds directories missing on all ceph-mon  and ceph-osd systems14:37
fallenour_the error output is  "No block devices detected using current configuration" and "  auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring: (2) No such file or directory"14:38
fallenour_my direct thoughts are that since the directory /var/lib/ceph/mds was never created, and the /etc/ceph/ceph.conf file points to it for a keyring for mds, that is the reason why it's not working or responding to ceph-osd commands, which would explain why it thinks there isn't any ceph block storage14:39
fallenour_what confuses me the most though is in my horizon, I see 3 of the 5 storage devices.14:39
fallenour_my guess is that because nova-compute is still working, the host can still see the storage, even though it may not be able to use it.14:39
fallenour_Ive identified that /var/lib/ceph/mon has its keyring, and both upgrade keyrings are present, but im not sure what keyring to copy to fix it, or if thats even the issue.14:41
fallenour_one thing I did realize is that the keyring in /var/lib/ceph/mon/$cluster_id is the same keyring across multiple systems, but im not sure what uses it.14:42
jamespagefallenour_: thats generated by ceph during the cluster bootstrap process I think14:43
fallenour_yea I found the bootstrap scripts for that14:43
fallenour_@jamespage Do you think the error might be that the mds directories were never created? And if so, why didn't the yaml build script build those?14:43
jamespagethe mds directory being missing should not be a problem - that's related to ceph-fs14:43
fallenour_mmk14:44
jamespagewhere are you trying to run the ceph commands from?14:44
fallenour_from juju14:44
jamespageexample?14:44
jamespagewhich unit?14:44
fallenour_juju run --unit ceph-osd/3 .....14:45
fallenour_and I've tried running it on multiple systems14:45
fallenour_do I need to run it specifically against the radosgw system?14:45
fallenour_also, one thing I just noticed, my $cluster_id variable is empty on the ceph nodes. If im not mistaken, that variable is used to define where keyrings are located14:46
fallenour_@jamespage how can I verify that the variable is populated properly, aside from juju run --unit ceph-osd/3 'echo "$cluster_id"'14:46
jamespagefallenour_: that's internal to ceph, not an environment variable14:47
jamespagethe cluster_id is by default 'ceph'14:47
fallenour_@jamespage ahh I see. So what happens when ceph needs that variable or something outside of ceph needs the variable info in order to locate the keyring?14:47
jamespagethat all gets passed via command line options14:48
jamespagefwiw the ceph-osd units don't get admin keyrings so you won't be able to run commands from those units14:48
jamespageonly from the ceph-mon units, where "sudo ceph -s" should just work14:48
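(Concretely, per jamespage's guidance, something like this - the unit number is whichever ceph-mon unit juju status shows:)
    juju run --unit ceph-mon/0 'sudo ceph -s'
    # the ceph-osd units don't get the admin keyring, so admin commands fail there
    juju run --unit ceph-mon/0 'sudo ceph osd tree'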
fallenour_@jamespage my /etc/ceph/ceph.conf file still points at /var/lib/ceph/mon/$cluster-id14:48
fallenour_@jamespage I made all of my units a ceph-osd / ceph-mon pair. I didnt know if they all needed ceph mon, so I made 5 and 5 respectively14:49
jamespagefallenour_: ok so that's actually broken atm - you can't co-locate the charms (there is a bug open)14:50
fallenour_@jamespage ooooh...14:50
jamespagefallenour_: normally we deploy three ceph-mon units in LXD containers, and ceph-osd directly on the hardware14:50
fallenour_@jamespage Yea thats what I did14:50
fallenour_@jamespage I put all 5 on hardware, and 5 in lxd containers, ceph-osd hardware, ceph-mon lxd14:51
fallenour_@jamespage I figured it was done that way for a reason, so I copied the design for the other 2 additional storage units14:51
jamespageoh well that should work just fine - what does "sudo ceph -s" on a ceph-mon unit do?14:51
jamespagebut 5 is overkill - 3 is fine14:51
jamespagethere is no horizontal scale-out feature for ceph-mon - it's control only14:52
jamespagehave to drop for a bit to go find my room at the PTG14:52
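(A minimal sketch of the layout jamespage describes, three mons in LXD containers and osds on the metal; machine numbers, unit counts, and the osd-devices value are placeholders:)
    juju deploy -n 3 ceph-mon --to lxd:0,lxd:1,lxd:2
    juju deploy -n 3 ceph-osd --config osd-devices=/dev/sdb
    juju add-relation ceph-mon ceph-osd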
fallenour_I didnt want to have to scale it later, I figured 5 for 500 PB of storage would be good14:53
fallenour_Output:
    cluster fc36db4c-9693-11e7-aae7-00163e20bc2c
     health HEALTH_ERR
            196 pgs are stuck inactive for more than 300 seconds
            196 pgs stuck inactive
            196 pgs stuck unclean
            no osds
     monmap e2: 5 mons at {juju-950b53-0-lxd-0=10.0.0.51:6789/0,juju-950b53-1-lxd-0=10.0.0.10:6789/0,juju-950b53-2-lxd-0=10.0.0.37:6789/0,juju-950b53-3-lxd-0=10.0.0.252:6789/0,juju-950b53-4-lxd-0=10.0.0.40:14:53
Dweller_when I'm running --edge, I had to install lxd with snap before installing conjure-up, and I don't think I have a conjure-up.lxc command anymore..14:53
stokachuDweller_: yea all that went away now you just use the snap lxd14:54
Dweller_but when I do lxc list .. it doesn't show the juju containers?14:54
Dweller_mebbe I need to set a config somewhere14:55
stokachudoes juju status show anything?14:55
Dweller_it does.. until I first do lxc list, and then it breaks14:55
Dweller_(rebuilding the vm at the mo, will be able to confirm when it comes back up)14:56
fallenour_@jamespage just an fyi, power is becoming unstable, hurricane is coming towards georgia, so if i dont respond, that is why.14:57
=== disposable3 is now known as disposable2
Dweller_ok.. vm is back up.. juju status shows all containers as running and active17:13
Dweller_is there any config I should set for lxc to list the containers using lxc ?17:14
magicaltrouthello ya'll completely off the wall question here, but here goes17:15
magicaltroutif I wanted to run K8S in LXC/LXD on CentOS, my understanding is that conjure-up makes some changes to the profile to allow it on Ubuntu? Can I manually make those changes on CentOS or is that out of the question?17:16
stokachumagicaltrout: i think the changes made are related to app armor17:19
stokachusome of the changes17:19
stokachuthe others are just enabling privileged etc17:19
magicaltrouthrmmm17:19
magicaltroutk17:19
stokachumagicaltrout: https://github.com/conjure-up/spells/blob/master/canonical-kubernetes/steps/lxd-profile.yaml17:19
stokachuthats what our profile looks like17:20
stokachuthe lxc.aa_profile is apparmor17:20
stokachunot sure if devices apply either17:20
magicaltroutokay cool thanks stokachu i'll have a prod17:21
stokachumagicaltrout: np17:21
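(A hedged translation of that spell profile into plain lxc commands; "juju-default" is an assumed profile name - lxc profile list shows the real one - and the keys should be checked against the linked yaml:)
    lxc profile set juju-default security.privileged true
    lxc profile set juju-default security.nesting true
    lxc profile set juju-default linux.kernel_modules "ip_tables,ip6_tables,netlink_diag,nf_nat,overlay"
    # the aa_profile bit is the AppArmor piece stokachu mentioned; skip it on a host without AppArmor
    lxc profile set juju-default raw.lxc "lxc.aa_profile=unconfined"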
Dweller_confirmed.. I'm probably doing something wrong.. I install lxd with snap install lxd .. then I install conjure-up with  snap install conjure-up --classic --edge  then I bring up kube with  conjure-up kubernetes-core localhost  ..  after which   juju status  shows the stuff up and running..  I then do lxc list  and it mumbles about generating a client certificate, then lists no containers at all, and after that.. juju status18:38
Dweller_just hangs and doesn't work anymore18:38
stokachuhmm18:38
stokachuDweller_: what does `which lxc` show18:39
Dweller_ /usr/bin/lxc18:39
stokachutry /snap/bin/lxc list18:39
stokachuim curious18:39
Dweller_that works18:39
Dweller_ok.. so stock ubuntu has an lxc that isn't the one that juju used =) no probs.. I can work with that18:40
stokachuDweller_: yea conjure-up uses the snap lxd for its deployments18:40
stokachuthough i thought the environment's PATH had /snap/bin listed first18:40
Dweller_for me, /snap/bin is at the end18:41
stokachuok, it may just be something i have to document for now18:41
Dweller_I wonder if I can apt uninstall the old lxc18:41
stokachuuntil snap lxd becomes the default18:41
stokachuDweller_: yea if you aren't using the deb installed one18:41
Dweller_added apt-get purge -y lxd lxd-client to my vagrantfile =) that should sort it18:55
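(Per stokachu's PATH comment, two ways to make sure the snap lxc is the one that gets used; the purge Dweller_ added is the second:)
    # see every lxc on the PATH - stock ubuntu puts the deb one first
    which -a lxc
    # either put /snap/bin ahead for this shell...
    export PATH=/snap/bin:$PATH
    # ...or remove the deb client entirely, as above
    sudo apt-get purge -y lxd lxd-client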
=== antdillon_ is now known as antdillon
=== zeestrat_ is now known as zeestrat
=== rick_h_ is now known as rick_h
=== stokachu_ is now known as stokachu
Dweller_hmm.. my last 2 bringups have got stuck at the 'setting relation' bit20:52
Dweller_interesting.. I need to confirm this.. but I _think_ if I apt-get purge lxd-client before I do conjure-up lxd / conjure-up kubernetes-core .. then conjure-up kubernetes-core hangs at the 'Setting relation ...' phase (never gets to 'Waiting for deployment to settle' log output)23:27
Dweller_which really kinda makes you wonder whats going on there, and could it be using the 'wrong' lxc atm ?23:29
Dweller_hmm.. I mean snap install xd / conjure-up kubernetes-core ;p23:31
Dweller_s/xd/lxd23:31
