=== frankban|afk is now known as frankban
[06:24] why did conjure-up drop landscape? sort of a "very breaking change"
=== salmankhan1 is now known as salmankhan
[11:23] RageLtMan: sorry that hit you there. I think the main thing was conjure-up landscape was a way to get an easy openstack with autopilot, but conjure-up ended up going more directly into doing a solid openstack install walkthrough
[14:58] rick_h: thanks for the clarification. Is there a current documentation source for using conjure-up directly? It seems the sort of thing i'd be able to feed a json/yaml file into instead of the curses config...
[14:59] RageLtMan: you want to do a headless install?
[14:59] that would be great too - have Chef just execute it all :)
[15:02] https://imgflip.com/i/1vuudt
[15:05] RageLtMan: so you could check out the openstack spell and provide a bundle fragment with your changes
[15:05] RageLtMan: it's not documented yet but we're working on it
[15:13] stockachu: thank you much, will look into this when i get back in this evening
=== frankban is now known as frankban|afk
[17:21] day 8, the war continues. It has been 8 days since juju last worked as desired. I continue to fight on. Troops are running thin, injuries are countless, coffee supplies have almost run out. We receive further orders in briefings, but the cries of the battles the night before rage in my mind, muting whatever words escape the mouths of high command.
[17:22] The enemy, rbd, continues to elude us, hiding in the obscurity of multiple config files, and the overarching sophistication of ceph.
[17:25] fallenour: sorry for your troubles but i'm enjoying the journal
[17:26] @tvansteenburgh LOL
[17:26] its a bloody mess man
[17:26] ive tried damn near everything to make it work
[17:26] im at that "just give me your ssh key, and you fix it" point.
[17:27] its so painful, and the build times are terribly long for me because of my 6/1 connection speed
[17:29] @stokachu hey if I add another nova node in the future, will it continue to leverage the already active rbd config, or will it roll back to the ephemeral storage (default) unless I inject "unknown-syntax" as an option with the juju deploy -n1 nova-* command
[17:49] random question, why does juju not bother to update /etc/hosts to help units keep track of one another?
[17:51] magicaltrout: scale
[17:52] fair enough
[17:52] magicaltrout: think that's all there is to it.
[17:52] other random question, CDK related
[17:52] kubectl exec -it microbot-3325520198-djs5f -- /bin/bash
[17:52] Error from server: error dialing backend: dial tcp: lookup k8s-12 on 10.108.4.4:53: no such host
[17:52] anyone seen that?
[17:52] * rick_h ducks and hides
[18:06] yeah so to fix it rick_h
[18:06] i had to.....
[18:06] add almost all nodes to all my hosts file
[18:06] add almost all nodes to all my hosts files
[18:07] so the kube dns knows where to find my nodes
[18:07] this is a manual deployment, so i wonder how that differs in a cloud deployment
[18:07] @magicaltrout do you have dhcp turned on?
[18:08] yeah fallenour its just a manual deployment within openstack
[18:08] @magicaltrout easiest thing to do is point it all at relevant dns names
[18:08] well internally kube dns is looking for k8s-12
[18:08] that way you dont have to worry about IPs, and can just focus on names independently, let dhcp worry about IP addresses.
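A minimal sketch of the hosts-file workaround described above, run on each manually provisioned kubernetes-worker/master machine; the node names and addresses below are placeholders for illustration, not values from this log:

    # Append static entries for the cluster nodes so names like k8s-12 resolve
    # locally; repeat on every node (or template it via your config management).
    cat <<'EOF' | sudo tee -a /etc/hosts
    10.0.0.11  k8s-11
    10.0.0.12  k8s-12
    10.0.0.13  k8s-13
    EOF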
[18:09] but the kubernetes worker doesn't have a clue what k8s-12 is
[18:10] @magicaltrout to be honest, I dont either, but if k8s-12 isnt a DHCP server, it isnt gonna matter, because it aint gonna work, and Ill tell you now, manual additions of IP-to-host mappings in /etc/hosts files will become non-scalable real fast.
[18:10] @magicaltrout its in your best interest, if for some reason kube dns isnt working with your current dhcp server, to build another one.
[18:12] fallenour: yes, that much i'm aware of, so what i'm curious about is, if you deploy k8s on ec2 for example, how it'd know what k8s-12 is
[18:13] @magicaltrout sadly you dont. if you dont control the dhcp, cant configure the dhcp, and the systems arent sharing the same dhcp, the only way to make it work is with a subdomain name, and point it there with a routable IP over WAN.
[18:13] Its again why I stress that I dont like critical infrastructure in the cloud.
[18:14] but i'm not buying that kubectl exec doesn't work in EC2
[18:14] in which case the resolution must work
[18:14] but you don't magically get a dhcp server in EC2 if you deploy juju
[18:16] The issue isnt that it will or wont, the issue is that in EC2, you are in a cloud infrastructure, but your servers may be miles apart from each other in two geographically close DCs, or racks down. Either way, different switches, different broadcast domains. The issue is your DHCP query wont be on the same DHCP servers, specifically unless you put them on the same l2 device on the same broadcast domain on the same vlan. The issue is
[18:17] in a cloud infrastructure. As such, you either have to put everything on the same server, and virtualize to ensure they all use the same etherswitch
[18:17] or you have to put all your critical infrastructure on hardware you own and control.
[18:17] Otherwise, its no dice @magicaltrout
[18:18] fallenour i have no idea what you're saying, either way i suspect it doesn't match the issues i'm seeing :)
[18:18] The downside to containerization is theres no hardware to control, so theres no controlling.
[18:18] @magicaltrout Ok so DHCP works by broadcasting and listening to broadcasts for queries and requests for DHCP IP addresses, and responds accordingly
[18:19] @magicaltrout the issue is, in order to get that request to or from a system, they have to send or receive it. You have to be on the same vlan, on the same broadcast domain, in order to send/receive it between the same two systems
[18:19] @magicaltrout in a cloud infrastructure, your devices are very rarely on the same rack stack, much less the same DC in many cases, which is why the WAN IPs are often so different from one another.
[18:20] @magicaltrout that means they arent on the same broadcast domain, which means the DHCP server each device is talking to is very likely to be different from the others, which means theyll never get the same information, and wont know how to route it to you, which is why i recommended a subdomain name over a WAN address. Its the only feasible way when using EC2
[18:24] @magicaltrout for instance, you can do dhcp.magicaltrout.com with an nginx box, and point that nginx box ip to your internal dhcp server. This will allow you to move the dhcp request over dns (or ddns) to your dhcp server, over nginx, and serve that query over the internet to your dhcp server over the wan.
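Two quick checks that narrow down the "no such host" error above, run from the affected kubernetes-worker; the node name and DNS address come from the kubectl error earlier in this log, so adjust them for your own cluster:

    # Can the worker itself resolve the node name (hosts file or resolver)?
    getent hosts k8s-12
    # Can the DNS server kubelet was pointed at (10.108.4.4 in the error) resolve it?
    nslookup k8s-12 10.108.4.4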
[18:24] @magicaltrout its convoluted, and incredibly complex, but it works very well, though it requires a much more in-depth knowledge of protocols and load balancing, as well as geography-based traffic flow management.
[18:26] @stokachu ok so ceph is just Satan. Ive got HALF my OSD boxes green to go. Why does Ceph hate me so @jamespage @stokachu @catbus
[18:28] rick_h: how does juju resolve hostnames normally, dns on the controller?
[18:28] magicaltrout: bottom line is b/c you manually provisioned, dns isn't taken care of automagically for you
[18:29] well..... balls :)
[18:29] @stokachu @catbus @jamespage @rick_h Btw, I fixed my neutron issue by simply turning the autotune feature on. I would highly recommend that be a default config for future versions
[18:29] @tvansteenburg its DHCP
[18:29] @tvansteenburg DHCP registers the IPs to the hostnames that MAAS issues, and then registers their info accordingly with itself. From there, it queries the DHCP server for the DNS info, and executes accordingly
[18:29] magicaltrout: kubedns only manages container dns, not the hosts themselves
[18:30] yeah tvansteenburgh
[18:30] i'll write a charm to manage hosts files or something
[18:31] @stokachu @rick_h @jamespage In a future version as well, for RBD deployments, can you please add a note in the conjure-up text that /dev/sdb, /dev/sdc, etc. have to be annotated individually with a comma separator in order for larger disk counts to work effectively for RBD deployments? Comma separators are annotated in other areas, so hoping for some uniformity there in the future. It was a lesson learned the hard
[18:32] magicaltrout: Dmitrii-Sh might have a recommendation, i think he did CDK on the manual provider recently
[18:40] in a deployment I had to work with, the environment had some automation to provision VMs, assign IP addresses in an IPAM, and add the necessary entries to a DNS service
[18:40] given that it was a custom piece of automation, we could provide no integration juju-wise
[18:40] so it was the manual provider
[18:41] hrm
[18:42] considering manual connections are... manual... why couldn't juju provide dns services for manual stuff?
[18:43] My dear lord I wanna scream, why on earth is it only at 270GB (two drives) out of 8 drives, when 8 drives were provided? Why does this damn system hate me so?
[18:45] fallenour: what do you have set for 'osd-devices' in the ceph-osd configuration?
[18:46] rbd, with /dev/sbb, /dev/sbc, /dev/sbd, /dev/sbe, /dev/sbf, /dev/sbg, /dev/sbh
[18:46] it's configured for rbd it looks like
[18:48] at least it feels that way
[18:48] @catbus whats the command for listing osd-devices?
[18:50] magicaltrout: well, normally juju relies on a cloud provider to give it a node. If that cloud provider also has the responsibility of managing DNS entries, then juju won't interfere, because that wouldn't be generic (who knows what kind of infra you have, right?). With a manual provider you do everything manually, including making sure your nodes know who to talk to (routing) and how to resolve stuff.
[18:50] MAAS, for example, has its own bind service
[18:50] fallenour: try 'juju config ceph-osd' and look for osd-devices. there should be a command to list the value of the parameter directly, but I don't know it off the top of my head.
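For the record, juju can print a single config key as well as the whole application config, which answers the question about listing osd-devices directly (juju 2.x behaviour; output formatting may differ slightly between releases):

    juju config ceph-osd                 # dump the full ceph-osd configuration
    juju config ceph-osd osd-devices     # print just the current osd-devices value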
[18:50] if you need to update an upstream server, you can use dhcp snippets + ddns
[18:51] https://wiki.debian.org/DDNS
[18:51] yeah i just had a look at ec2 k8s and saw the internal ec2.internal pointer
[18:51] @catbus Yea, I was right, value: /dev/sdb, /dev/sdc, /dev/sdd, /dev/sde, /dev/sdf, /dev/sdg, /dev/sdh
[18:52] @catbus @jamespage so why isnt it recognizing that I have more than 2 drives in each server? I can tell that the server is using both drives on the third server, but no more than the 2. Does it cap the total amount of drives usable per server based on the server with the lowest drive count?
[18:52] fallenour: I believe it should be separated by spaces, not commas. https://jujucharms.com/ceph-osd/246
[18:53] @catbus please tell me theres a 1 line command to fix this. A rebuild just makes me wanna cry
[18:53] fallenour: juju config ceph-osd osd-devices='/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh'
[18:57] @catbus ok I updated, is there anything else i need to do, or will ceph-mon / ceph-osd automatically expand the pools and correct my mistake?
[18:58] @catbus OOOOO!
[18:58] @catbus THE GODS BLESS ME THIS DAY!
[18:58] @catbus I SHALL BUILD A STATUE IN YOUR NAME!!!
[18:59] fallenour: You should thank the openstack-charmers team.
[18:59] @catbus Oh I plan on giving them something extra special this year. Theyve made my life about a million times easier
[19:00] fallenour: that's exactly the idea behind Juju/Charms. :)
[19:00] @catbus the challenge is finding out who they all are. Aside from @stokachu, "they" are the only person I know on the openstack-charmers? maybe?
[19:00] as its fixed, does this mean you'll type less?
[19:01] @magicaltrout nope! As its working, i have to type more, a lot more o.o Now I can turn the project fully public, start to scale it, and start adding all the non-profits
[19:02] oh well it was worth a shot
[19:03] fallenour: what are you building this openstack cloud for, if you don't mind sharing a bit of detail?
[19:03] @magicaltrout dont worry though, theres a lot of cool stuff ahead. now that the heat on me will die down, i can start focusing on my stronger areas, and scaling systems. A lot of people stand to benefit from the platform, and itll help a lot of groups and OSS projects move forward. A lot of people have been waiting for me to kick the last kinks out, and Openstack storage was the last one
[19:04] @catbus Im building an IaaS for open source developers, research institutes, non-profits, and universities to use to develop on, free of charge. I provide the hardware, the environment, and the SaaS, and they build to their hearts' content. Its the missing piece of the perfect storm for the OSS community.
[19:06] @catbus I realized a long time ago how financially fortunate I was compared to most other OSS developers, so Ive taken a large portion of my income for several years to build a datacenter where I can host all the gear so people can share in what i have, and support their favorite projects without having to pay anywhere from 600-3500 a month for the privilege of giving to the community. Now all they have to give is their time.
[19:07] fallenour: thats awesome, keep us posted
[19:08] @bdx I will, and im more than happy to. The updates on the project are posted at www.github.com/fallenour/panda
[19:08] Ill be adding updates in the near future, to include the slides from the last presentation, and the current updates, probably today. Its been a huge pain in the ass getting this all working, so I think im gonna go drown myself in beer.
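Going back to the osd-devices change above, one way to confirm ceph actually brought the extra disks in is to query a monitor unit; the unit name ceph-mon/0 is an assumption, and the exact output depends on the ceph release in use:

    # OSDs per host, and the raw/usable capacity ceph currently sees
    juju ssh ceph-mon/0 'sudo ceph osd tree'
    juju ssh ceph-mon/0 'sudo ceph df'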
[19:09] fallenour: awesome!
[19:09] oh damn, one last question here @catbus one of the nodes failed to spin up properly and deploy, its juju deploy -n1 ceph-osd correct?
[19:10] fallenour: juju add-unit -n 1 ceph-osd
[19:10] @catbus and itll deploy with the current ceph-osd configs the other systems use?
[19:11] fallenour: yes, including all the relations it needs to have with other services.
[19:27] @catbus aaand, right back in the fire. Now its saying I have about... 3x more space than whats physically possible. Any ideas?
[19:55] fallenour: I am no ceph expert. sorry.
[20:00] fallenour: but I'd like to know how you came to the conclusion that it reports 3x more space.
[20:01] @catbus because in horizon it shows available space of 6.3TB, when the maximum drive count possible in OSD devices is 17, at 146GB drives each
[20:08] maybe someone else on the channel has ideas about what causes this.
[20:10] tvansteenburgh: if I was to create an NFS persistent volume
[20:11] the snappage of kubelet shouldn't interfere, should it?
[20:11] cause its a classic snap, it doesn't see any difference in filesystem, does it?
[20:31] scrap that
[20:31] user error
[21:05] well thats an interesting side effect
[21:05] juju add-relation kubernetes-worker telegraf - it appears that that does bad things! :)
[21:06] magicaltrout: bad enough to file a bug?
[21:06] i'll make it easy for you! https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/new
[21:07] i don't know if you'd call it a bug tvansteenburgh
[21:07] eh, it did it again
[21:07] * magicaltrout backs out the relation
[21:07] tvansteenburgh: it created
[21:07] telegraf:prometheus-client kubernetes-worker:kube-api-endpoint http regular
[21:08] which seemed to knock all my workers offline
[21:09] yeah i don't think you wanna be connecting to kube-api-endpoint
[21:09] yeah
[21:09] it made my cluster very sad
[21:09] i thought telegraf was a subordinate that you relate to prometheus
[21:09] * tvansteenburgh looks
[21:10] well on my master i have telegraf:juju-info related to kubernetes-master:juju-info
[21:10] and stats flowing
[21:10] but i may have guessed wrong, it was tricky to guess the flow
[21:11] yeah that should work on worker too
[21:11] yeah i put that in, you have to do a full juju add-relation kubernetes-worker:juju-info telegraf:juju-info though
[21:11] else it does the bad one :)
[21:11] don't be so lazy magicaltrout
[21:12] haha thanks!
[21:12] <3
[21:13] the problem is juju won't connect the juju-info relation implicitly
[21:13] so it saw that both sides had an http interface, and connected that
[21:14] yeah, its fair enough
[21:14] does brick your cluster for a while though ;)
[21:14] maybe i should add a feature request for relation warnings like "if x connects to y" then warn the user it might explode
[21:18] magicaltrout: triggers in your charm^
[21:18] yeah bdx
[21:18] like "you can do this, technically, but we dont advise it" :)
[21:20] well like .... right now, I seem to end up with something like this in every charm https://github.com/jamesbeedy/layer-django-base/blob/master/reactive/django_base.py#L35,L48
[21:20] triggers will entirely simplify the code for what I am trying to do there
[21:20] what you are talking about is similar
[21:21] (P and Q) -> R
[21:22] like
[21:23] if this relation is made, or flag is set, then warn the user
[21:27] cory_fu: whats the timeline looking like before 0.5.0 drops?
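For reference, the recovery magicaltrout describes above (dropping the implicitly chosen http relation and adding the juju-info one explicitly) would look roughly like this, with the endpoint names taken from the log:

    # remove the accidental relation juju picked by matching the http interface
    juju remove-relation telegraf:prometheus-client kubernetes-worker:kube-api-endpoint
    # add the intended subordinate relation explicitly
    juju add-relation kubernetes-worker:juju-info telegraf:juju-info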
[23:03] hmm... do we have a definitive list of the types supported in charm options?
[23:08] stokachu, cory_fu_: ^^ ?
[23:08] thumper: yes, gimme a sec
[23:08] https://github.com/juju/charm/blob/v6-unstable/config.go#L53
[23:08] thumper: ^^
[23:09] anastasiamac: I was hoping for something on https://jujucharms.com/docs :)
[23:10] anastasiamac: thanks
[23:10] * thumper wonders if we have that type validation in bundle options...
[23:13] hmm...
[23:56] honestly, I couldn't be more disappointed with the decision to put the elasticsearch charm back on lp
[23:56] ;(
[23:57] elasticsearch-charmers: sup
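On the charm option types question above: the config.go linked above lists the supported types, and a minimal charm config.yaml exercising each of them would look roughly like this (the option names are made up for illustration):

    # write an example config.yaml declaring one option of each supported type
    cat > config.yaml <<'EOF'
    options:
      log-level:
        type: string
        default: info
        description: Logging verbosity.
      worker-count:
        type: int
        default: 4
        description: Number of worker processes.
      cache-ratio:
        type: float
        default: 0.5
        description: Fraction of memory reserved for caching.
      enable-debug:
        type: boolean
        default: false
        description: Whether to expose debug endpoints.
    EOF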