[00:00] i see that 502 error sometimes too [00:00] hard to reproduce [00:01] could it be because I don't have DNS set up correctly? [00:05] yup that was the problem [00:51] now I have to figure out why my ingress isn't presenting the right wrong cert [00:54] "the right wrong cert" - so many questions about this statement... [00:57] lazyPower well I am expecting an error due to DNS not being set up instead of a CA / self-signed issue [01:01] lazyPower do you have an example of an https ingress pulling its cert from the secrets collection to help me see where I went wrong? [01:02] for some reason my ingress is pulling a self-signed cert!! [01:02] https://kubernetes.io/docs/user-guide/ingress/ [01:02] under the TLS header [01:03] yeah that is what I was using :-/ [01:03] https://github.com/kubernetes/ingress/blob/master/controllers/nginx/README.md#https [01:03] are you attempting to configure this via configmap? [01:03] just perchance? we have an open PR to implement that feature, however its author is on vacation [01:03] i may need to piggyback it in [01:05] no just did a kubectl create secret tls wildcard --key --cert [01:05] hmm ok [01:07] https://gist.github.com/cm-graham/51c866e87934b53daa64afa104a4f6b7 is my YAML [01:09] stormmore - can you confirm the structure of the secret has the keys 'tls.key' and 'tls.crt'? [01:09] lazyPower - yeah that is one of the things I checked [01:10] honestly i've only ever tested this with self-signed tls keys [01:10] i'm not sure i would have noticed a switch in keys [01:11] I am still curious as to where it got the cert it is serving [01:11] let me finish up collecting this deploy's crashdump report and i'll context switch to looking at this stormmore. [01:11] i'm pretty sure the container generates a self-signed cert that it will offer up by default [01:11] and you're getting that snakeoil key [01:11] yeah that is what I am thinking [01:11] like the ingress rule itself isn't binding to that secret [01:11] so its falling back to default [01:11] one way to test that theory... going to go kill a container! [01:12] hmmm so the container needs to present that cert first? [01:12] i don't think so [01:28] stormmore ok sorry about that, now i'm waiting for a deploy of k8s core on gce. [01:28] i'll pull this in and try to get a tls ingress workload running [01:28] lazyPower no worries, as always I don't wait for anyone :P [01:28] good :) [01:28] if you can't get this resolved i'll happily take a bug [01:28] lazyPower I am trying a different namespace to see if it is a namespace overload issue [01:30] OK definitely not a namespace issue and even deleting the deployment, service and ingress didn't stop it serving the right wrong cert [01:34] and I have confirmed that the container serves the right cert if I run it locally [01:34] you mean locally via minikube or via lxd deployment or? [01:35] even simpler ... docker run -p... [01:37] ok, i would expect there to be different behavior between that and the k8s ingress controller [01:37] it has to proxy tls to the backend in that scenario + re-encrypt with whatever key it's configured to serve from the LB [01:38] just ruling out the container 100% [01:38] it serves both tls and insecure traffic at the moment so should be fine for lb/ingress at the moment [01:39] * lazyPower nods [01:42] yeah I am running out of ideas on what to try next :-/ [01:46] well short of destroying the cluster and starting again! [02:13] you should only need to set up the deployment, service and ingress right?
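(For reference, a minimal sketch of the TLS wiring being debugged above; the names and hostname are placeholders, not values from this log. The secret needs the tls.crt/tls.key keys, must live in the same namespace as the Ingress, and the ingress rule has to reference it by name under spec.tls — otherwise the nginx controller falls back to its default self-signed "snakeoil" cert.)

# create the secret from an existing key/cert pair (stored under tls.crt / tls.key)
kubectl create secret tls wildcard-example --cert=wildcard.pem --key=wildcard.key

# an ingress that actually binds to that secret (extensions/v1beta1 was current at the time)
cat <<EOF | kubectl apply -f -
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-ingress
spec:
  tls:
  - hosts:
    - app.example.com
    secretName: wildcard-example     # must match the secret above, same namespace
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: example-svc   # placeholder backend service
          servicePort: 80
EOF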
[02:13] stormmore correct [02:13] hmmm sigh :-/ still not serving the cert from the secret vault [02:18] stormmore - this is what's incoming https://github.com/kubernetes/kubernetes/pull/40814/files [02:19] and it has tls passthrough from the action to the registry running behind it, but it's configured via configmap. [02:20] so this branch is a pre-req to get the functionality but this is our first workload configured with tls that we have encapsulated as an action, which has the manifests [02:21] yeah I am monitoring that [02:22] I am also thinking nexus3 is potentially a better option for us as it gives us other repository types than docker registries [02:22] also I am only using it at the moment as a test service [02:23] going to find an AWS region we aren't using and set up the VPC more appropriately for a cluster [02:23] going to head home and see if I can do that === thumper is now known as thumper-afk [03:29] lazyPower it's stormmore, do you know of a good guide for spinning up a k8s aws cluster including setting up the VPC appropriately? [04:24] Hi Chris [04:27] are you there chris? [04:27] abhay_: see pm [04:27] ok === frankban|afk is now known as frankban [08:55] Good morning Juju World! [09:11] how do I create a bundle with local charms? [09:11] (using juju 2.0.3) [09:14] right now I have a bundle/ directory with bundle.yaml and charms/ inside, and I'm putting local charms into charms/ [09:15] but when I try to use charm: ./charms/ceph-269 I get an error >>path "charms/ceph-269" can not be a relative path<< [09:28] I guess I can work around by passing an absolute path instead, but that's not what I should be doing [09:42] Hi kklimonda, I am afraid an absolute path is the only option at the moment. [10:22] what is the best practice for exposing isolated charms (i.e. charms co-located in LXD containers)? [10:23] right now i manually add iptables rules to port forward since expose doesn't seem to be enough, but perhaps there is a juju way? [10:33] that is the way SimonKLB [10:34] all expose does is open the port, if you have a provider that understands firewalls [10:34] LXD obviously doesn't so expose does nothing [10:37] magicaltrout: right, do you know if there has been any discussion regarding this before? it would be neat if the expose command did the NATing for us if the targeted application was in an LXD container [10:37] but perhaps this is not implemented by choice [10:50] SimonKLB: I've brought it up before, I don't think it ever really got anywhere. You could do this type of thing https://insights.ubuntu.com/2015/11/10/converting-eth0-to-br0-and-getting-all-your-lxc-or-lxd-onto-your-lan/ [10:50] basically just bridge the virtual adapter through the host adapter and get real ip addresses [10:50] all depends on how much you can be bothered :) [10:59] magicaltrout: i wonder how well that's going to work on a public cloud though [11:02] any idea how to debug juju MAAS integration? I can deploy the juju controller just fine, but then I can't deploy my bundle - juju status shows all machines in a pending state, and maas logs show no request to create those machines [11:04] SimonKLB: indeed, it will suck on a public cloud :) [11:05] i reckon there is probably scope for a juju plugin someone could write to add that functionality though. Something like juju nat [11:05] but i'm not that guy! ;) [11:18] also, are there any chinese mirrors of streams.canonical.com?
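(A rough sketch of the manual port-forward workaround SimonKLB describes above for charms confined to LXD containers — the interface name, container address and port are placeholders. juju expose only opens provider-level firewalls, so on LXD the host has to forward traffic itself.)

# forward host port 80 to the application inside the LXD container
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 \
  -j DNAT --to-destination 10.0.8.15:80
iptables -A FORWARD -p tcp -d 10.0.8.15 --dport 80 -j ACCEPT
# the eth0-to-br0 bridging approach in the linked article avoids this entirely
# by giving containers addresses directly on the host's LAN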
[11:40] hey kklimonda i'm not a maas user so I can't offer help but you could also #maas if you're not already there to see if other people are awake [11:41] #juju seems to be more active [11:41] lol fair enough [11:41] it will certainly pick up later in the day when more US folks come online [11:41] kjackal_ might be able to provide some MAAS support, I'm not sure [11:43] cholcombe: Hey [11:43] cholcombe: I needed some help regarding the Ceph Dash charm [11:45] help [11:52] ayush: you might find people awake on #openstack-charms [11:53] Thanks. Will check there :) [11:54] Hi! Can someone help with juju storage and cinder? [11:55] My deployments are working fine to my private openstack cloud, but I would want to use cinder volumes on my instances for logfiles etc. [11:56] I have https enabled on my openstack and I use ssl-hostname-verification: false [11:57] Units get added without problem, but when I want to add storage to instances I get error https://myopenstack.example.com:8776/v2/b3fbae713741428ca81bca384e037540/volumes: x509: certificate signed by unknown authority [12:37] kklimonda: I am not a maas user either, so apart from the usual juju logs I am not aware of any other debuging ...facilities [12:38] anrah: You might have better luck asking at #openstack-charms [12:40] I'm not deploying openstack :) [12:40] I'm deploying to OpenStack [12:40] OpenStack as provider [13:00] kjackal_: so it seems juju is spending some insane amount of time doing... something, most likely network related, before it starts orchestrating MAAS to prepare machines [13:01] stub, tvansteenburgh: ping for charms.reactive sync [13:02] this is a lab somewhere deep in china, and the connection to the outside world is just as bad as I've read about - it looks once juju finishes doing something it starts bringing nodes, one at a time [13:03] kklimonda: so a few things should happen, MAAS will check to make sure it has the correct images as far as I know, if it doesn't it'll download some new ones, likely Trusty and Xenial [13:03] then when juju spins up it will start downloading the juju client software and then do apt-get update etc [13:04] for the maas images, I've created a local mirror with sstream-mirror and pointed MAAS to it [13:06] it's definitely possible that juju is trying to download something else [13:06] yeah it will download the client, then when thats setup any resources it needs for the charms [13:07] can I point it to alternate location? [13:07] not a clue, clearly you can run an apt mirror somewhere [13:07] i don't know how you point cloud init somewhere else though [13:08] * magicaltrout taps in kjackal_ or an american at this point [13:08] I don't think it's even getting to apt [13:09] kklimonda: I am not sure either, sorry [13:09] controller is deployed and fine, and machines are in pending state without any visible progress for at least 10 minutes (that's how long it took for juju to spawn first out of three machines) [13:10] juju status --format yaml might, or might not give you more to go on [13:12] rick_h loves a bit of MAAS when he gets into the office [13:12] * rick_h looks up and goes "whodawhat?" [13:12] yeah, I'll wait around ;) [13:13] kklimonda: what's up? /me reads back [13:13] he does love MAAS, don't let him tell you otherwise! 
;) [13:13] rick_h: I have a MAAS+Juju deployment somewhere in China, and juju add-machine takes ages [13:14] I do, my maas https://www.flickr.com/gp/7508761@N03/47B58Y [13:14] my current assumption is that juju, before it even starts machine through MAAS, is trying to download something from the internet [13:15] which is kinda no-go given how bad internet is there [13:15] kklimonda: probably pulling tools/etc from our DC perhaps. You might know more looking at the details logs and doing stuff like bootstrap --debug which will be more explicit [13:15] kklimonda: hmm, well it shouldn't do anything before the machine starts [13:15] kklimonda: it's all setup as scripts to be run when the machine starts [13:16] bootstrap part seems fine [13:17] I've done a sstreams-mirror to get agent/2.0.3/juju-2.0.3-ubuntu-amd64.tgz and later bootstraped it like that: juju bootstrap --to juju-node.maas --debug --metadata-source tools maas --no-gui --show-log [13:17] kklimonda: hmm, no add-machine should just be turning on the machine and installing jujud on the machine and register it to the controller [13:19] kklimonda: if you tell maas just to start a xenial image does it come up? [13:20] I can deploy xenial and trusty just fine through UI and it takes no time at all (other than servers booting up etc.) [13:21] but juju is not even assigning and deploying a machine until it finishes doing... whatever it's doing [13:21] the funny part is, it seems to be working just fine - only with a huge delay (like 15 minutes per machine) [13:23] kklimonda: hmm, yea might be worth filing a bug with as many details as you can put down. what versions of maas/juju, how many spaces/subnets are setup, what types of machines they are, etc. [13:24] sigh, it's definitely connecting to streams.canonical.com [13:24] I just tcpdumped traffic [13:24] i blame canonical for not having a DC in the back of china! [13:27] sigh, there are mirrors for old good apt repositories [13:27] but we're living in a brave new world [13:27] and apparently infrastructure has not yet caught up ;) [13:28] well you can boot of a stream mirror, I wonder why its not using your config [13:28] or is it a fall back. I'm unsure of how simple streams work, its like some black art stuff [13:30] this part seems to be rather thinly documented [13:32] Has anyone used the ceph dashboard chime? [13:32] charm* [13:33] ayush: I have, a while ago [13:33] Did you use it with the ceph chimes? Or can it be setup with a separate ceph cluster? [13:34] ayush: I used it with the ceph charms [13:34] Okay. [13:34] Which version of juju were you using? [13:34] ayush: ultimately because you need to run additional software on the ceph nodes to actually gather the insights [13:35] juju 2.0 [13:38] marcoceppi: I ran this. Could you tell me how to get the credentials? "juju config ceph-dash 'repository=deb https://username:password@private-ppa.launchpad.net/canonical-storage/admin-ceph/ubuntu xenial main'" [13:39] ayush: you'd have to chat with cholcombe or icey on that [13:41] marcoceppi: Thanks :) [13:44] cory_fu: I would like your help on the badge status PR. Let me know when you have 5 minutes to spare [13:58] kjackal_: Ok, just finishing up another meeting. Give me a couple of min [14:08] hi here [14:08] lazyPower: are you around? 
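(On the slow-link problem above: a hedged sketch of pointing juju at local mirrors instead of streams.canonical.com and the default archive — URLs and paths are placeholders, and the exact keys are worth checking against `juju help bootstrap` for your release; --metadata-source, agent-metadata-url and apt-mirror all exist in juju 2.x.)

# bootstrap from locally mirrored agent binaries rather than the public streams
juju bootstrap maas lab-controller \
  --metadata-source /var/mirror/juju \
  --config agent-metadata-url=http://mirror.example.internal/juju/tools \
  --config apt-mirror=http://mirror.example.internal/ubuntu

# an existing model can be repointed at the apt mirror later on
juju model-config apt-mirror=http://mirror.example.internal/ubuntu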
[14:09] I have some problem with conjure-up canonical-kubernetes, two LXD machines for kubernetes-worker are staying in "pending" [14:09] (and the charm associated are blocked in "waiting for machine" so) [14:09] Zic, yea im working to fix that now [14:09] ah :} [14:09] stub, tvansteenburgh: Thanks for the charmhelpers fix. We once again have passing Travis in charms.reactive. :) [14:10] \o/ [14:10] stokachu: do you have a manual workaround? [14:18] http://paste.ubuntu.com/23995096/ [14:21] kjackal_: Ok, I'm in dbd [14:22] cory_fu: going there now [14:35] ayush, you have to be given access to that PPA [14:36] ayush, seems you and icey have been in contact. i'll move this discussion over to #openstack-charms === med_` is now known as medberry === medberry is now known as med_ [15:09] Zic, is this the snap version? [15:15] stokachu: nope, but I think it was because I forgot to 'apt update' after the add-apt-repository for the conjure-up's PPA :) [15:16] (I got the older version of conjure-up, with the new one from PPA it seems to be OK) [15:33] lazyPower: are you awake? :) [15:38] is there a juju way for handling NTP? [15:43] kklimonda - juju deploy ntp [15:43] Zic - heyo [15:44] will it just deploy itself on each and every machine and keep track of new machines I add? [15:46] lazyPower: my conjure-up deployed stale at "Waiting to retry KubeDNS deployment" at one of the 3 masters, don't know if it's normal: http://paste.ubuntu.com/23995418/ [15:46] ah, I see [15:46] the first deploy, I had a too many open files, I increased the fs.file-max via sysctl and do a second deployment :) [15:47] I can create a relationship for other units that need ntp [15:47] cool [15:47] now it's just this silly "waiting" which block [15:47] Zic - i saw that when i was testing before the adontactic was updated [15:47] *addonTactic [15:49] lazyPower: I don't know what can I do to unblock it, it's in this state from 30 minutes [15:50] Zic - did you swap with the cs:~lazypower/kubernetes-master-11 charm? 
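(On the NTP question above: ntp is a subordinate charm, so it doesn't get machines of its own — it lands alongside every unit of whatever application it is related to, including units added later. Application names here are placeholders.)

juju deploy ntp
juju add-relation ntp kubernetes-master
juju add-relation ntp kubernetes-worker
# each existing and future unit of the related applications gets an ntp unit next to it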
[15:51] yep [15:54] lazyPower: I saw the step "Waiting for crypto master key" (I think it's the one you added) [15:54] Zic - yeah, that's correct [15:54] but I have one of these kubernetes-master instances which stays waiting on KubeDNS :/ [15:54] Zic - give me 1 moment i'm on a hangout helping debug a private registry problem [15:55] i think i might have botched the build with the wrong template [15:55] i'll need to triple check [15:55] :D [15:56] lazyPower: just for info, I used the "juju upgrade-charm kubernetes-master --switch cs:~lazypower/kubernetes-master --channel=edge" command [15:56] (I didn't see your update with cs:~lazypower/kubernetes-master the first time :D) [15:56] (I didn't see your update with cs:~lazypower/kubernetes-master-11 the first time :D) [15:57] but the return displayed with the --channel=edge was actually cs:~lazypower/kubernetes-master-11 so it seems OK [16:04] Zic - running another build now, give me a sec to build and re-test [16:05] np :) [16:06] FYI, my mth-k8stest-01 VM is about 8vCPU of 2GHz, 16GB of RAM and 50GB of disk (I saw that CPU and RAM were heavily used in my first attempt with 4vCPU/8GB of RAM) [16:22] Zic - so far still waiting on kube-addons [16:22] churning slowly on lxd but churning all the same [16:22] (i have an underpowered vm running this deployment) [16:25] lazyPower: yeah, it stalled for a long time at "Waiting for kube-system pods to deploy" (something like that) but this step passed OK [16:25] Zic - if it doesn't pass by the first/second update-status message cycle it's broken [16:25] Zic - dollars to donuts it's failing on the template's configmap [16:26] Zic - kubectl get po --all-namespaces && kubectl describe po kube-dns-blahblah [16:26] should say something about an error in the manifest if it's what i think it is [16:26] yeah, I tried that, it's in Running [16:26] plot thickens... [16:26] mine turned up green [16:26] waiting on 1 more master [16:26] yep [16:27] was long on the 2 others, but it finally came to green [16:27] but one master is still waiting in kubernetes-master/1 waiting idle 8 10.41.251.165 6443/tcp Waiting to retry KubeDNS deployment [16:27] oops, missed copy/paste [16:27] Zic - but the pod is there and green? [16:27] yep [16:27] don't know what it's waiting for though :D [16:29] likely a messaging status bug [16:29] if the pod is up [16:29] actually Zic [16:30] restart that vm [16:30] test the resiliency of the crypto key [16:30] nothing like an opportunity to skip a step :) [16:31] oki [16:36] Zic https://www.evernote.com/l/AX7J_eiKOdNF94_eSoBqGZ3fjz-ZA8qQzAkB/image.png [16:36] confirmation that it's working for me locally. if you see different results, i'm highly interested in what was different [16:36] Zic - and to be clear, i did the switch *before* the deployment completed [16:37] i do need to add one final bit of logic to the charms to update any existing deployments [16:37] it's not wiping the auth.setup state which it should be on upgrade-charm.
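(Spelling out the checks suggested above — the pod name is a placeholder and container names vary by release; the describe output and pod logs are usually enough to tell a bad manifest or configmap apart from a pod that is simply still converging.)

kubectl get pods --all-namespaces -o wide
kubectl --namespace kube-system describe pod kube-dns-3097350089-xxxxx
kubectl --namespace kube-system logs kube-dns-3097350089-xxxxx -c kubedns
kubectl --namespace kube-system get events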
[16:38] lazyPower: yep, I do the --switch ~5s after conjure-up printed me to press Q to quit :) [16:38] (as you said I need to switch when juju status print "allocating") [16:39] yeah :) thats just to intercept before deployment and start with -11 [16:39] not test an upgrade path [16:39] this was only tested as a fresh deploy, so the upgrade steps still need to be fleshed out but it should be a simple state sniff and change [16:39] for now, it's stale at : kubernetes-master/1 waiting idle 8 10.41.251.165 6443/tcp Waiting for kube-system pods to start [16:39] (after the reboot of the LXD machine) [16:40] :\ boo [16:40] ok, can i trouble you for 1 more fresh deploy? [16:40] all kube-system pods are Running... [16:40] yep [16:40] thanks Zic - sorry, not sure what happened there [16:40] I'm here for ~1 hour more :p [16:40] however did your crypto-key validation work? [16:40] were you able ot verify all units had the same security key [16:41] I don't test it as I thought this KubeDNS waiting error blocks the final steps of installation [16:41] oh [16:41] if the kubedns pod is running [16:42] it switch to active o/ [16:42] just now [16:42] did it? [16:42] yep [16:42] fantastic, it was indeed a status messaging bug [16:42] looks like perhaps the fetch might have returned > 0 [16:42] not certain, but thats why the message says that, is its waiting for convergence of the dns container [16:43] yeah, I just rebooted the LXD machine with this status message blocked [16:43] take ~4minutes to switch to active/idle/green [16:44] update-status runs every 5 minutes [16:44] so that seems about right [16:44] ok [16:44] as "reboot" of LXD machine is too fast, I don't know if it's a good test for resilience [16:44] if I poweroff instead, and wait about 5 minutes [16:44] I need to find how to re-poweron an LXD machine :D [16:45] it's my first-time-use of LXD :p [16:45] Zic - lxc stop container-id [16:45] lxc start container-id [16:45] lxc list shows all of them [16:45] Zic - did you snap install or apt install? 
[16:45] just curious :) [16:45] apt [16:46] I just followed the homepage quickstart :) [16:46] (as it was updated with conjure-up instruction and add-apt-repository) [16:46] ok [16:46] :) I'm running a full snap setup and its working beatifully [16:46] not sure if you want to dip your toes in the snap space but its a potential for you [16:46] this mth-k8stest-01 VM will stay for test I think [16:46] as the snaps seem to move pretty quickly, and they auto-update [16:46] so I can test snaps in it :) [16:47] nice [16:47] just make sure you purge your apt-packages related to juju before you go the snap route [16:49] lazyPower: giving CDK another deploy in a few minutes, trying to get the exact steps documented to get deis running post CDK deploy [16:49] for the test VM I can go through snap, for the real cluster, I have right to reinstall it a last time tomorrow morning :x [16:49] so I'm not decided if I can use your edge patch directly to production [16:49] or if I should wait that it go to master [16:50] Zic - wait until its released with the charms [16:50] (master branch) [16:50] Zic - there's additional testing, this was just early validation [16:50] Zic - as well as teh upgrade path needs to be coded (remember this is fresh deploy test vs an upgrade test) [16:52] yeah, but as always, deadlines ruins my world: the last time I can reinstall the cluster is tomorrow morning, so I think I will just install the old (released) version with the bug with a singlemaster [16:52] and when your upgrade will go to release, I will add two more master [16:52] do you think it's the right path? [16:53] or it's better to deploy directly with 3 masters on the old release, and poweroff two masters waiting your patch going prod? === Anita is now known as Guest91022 [17:00] (it's just for the real cluster, I can do every tests I need/want on other VMs :)) [17:01] Hi This is regarding revoking few revisions of a charm. the revisions need to be revoked, first released to different channel and then tried to revoke those revisions. But revoking is happening revision wise. [17:02] Sorry revoking is not happening revisions wise [17:02] grant/revoke is happening for all revisions of the charm. [17:02] please advice [17:34] lazyPower: for the master part it seems ok for now [17:35] lazyPower: but for the worker part, if I reboot one of the worker, the ingress-controller on it pass to "Unknown" and try to respawn on another node... and stay in Pending [17:35] don't know if it's the normal behaviour with ingress-controller pod [17:36] 5m 20s 22 {default-scheduler } Warning FailedScheduling pod (nginx-ingress-controller-3phbl) failed to fit in any node [17:36] fit failure summary on nodes : PodFitsHostPorts (2) [17:40] oh, as Ingress Controller listen on 80 it's logic that I got this error [17:51] :() [17:51] Zic - interesting [17:52] so its trying ot migrtate an ingress controller to a unit thats already hosting one to satisfy the replica count [17:52] yep [17:52] it has requested an impossible scenario :P [17:52] i love it [17:52] but as it already have an Ingress which listen on *:80... it raise a PodFitsHostsPorts [17:52] yet another reason to do worker pooling [17:52] and have an ingress tier [17:53] does StatefulSet have a kind of "max one pod of this type per node"? 
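(On the question just above: a DaemonSet gives exactly that "at most one pod of this type per node" behaviour — it runs one copy per matching node instead of chasing a replica count, which is why it sidesteps the PodFitsHostPorts collision seen here. A minimal sketch, with the image and names as placeholders.)

cat <<EOF | kubectl apply -f -
apiVersion: extensions/v1beta1      # DaemonSets lived under extensions/v1beta1 at the time
kind: DaemonSet
metadata:
  name: ingress-controller
spec:
  template:
    metadata:
      labels:
        app: ingress-controller
    spec:
      containers:
      - name: nginx-ingress
        image: example/nginx-ingress-controller:0.9   # placeholder image
        ports:
        - containerPort: 80
          hostPort: 80              # safe: at most one copy per node can ever be scheduled
EOF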
[17:53] it's maybe one of possible solution :) [17:53] we would need to investigate teh upstream addons and see if they woudl be keen on accepting that [17:54] we dont modify any of the addon templates in order to keep that "vanilla kubernetes" label [17:54] i think we do one thing, which is sniff arch [17:54] but otherwise, its untained by any changes [17:54] it's not a crash issue so, I'm kinda happy with it anyway :D [17:54] sounds like its testing positively then? [17:54] aside from that one oddball status message issue [17:54] yep [17:55] fantastic [17:55] i'll get the upgrade steps included shortly and get this prepped for the next cut of the charms [17:55] thanks for validating Zic [18:13] lazyPower: http://paste.ubuntu.com/23996226/ [18:13] bdx - in a vip meeting, let me circle back to you afterwords [18:14] ~ 40 minutes [18:14] k, np [18:14] <3 ty for being patient === mwenning is now known as mwenning-lunch-r [18:44] kjackal_: I know it's late and you should be EOD, but were you +1 to merging https://github.com/juju-solutions/layer-cwr/pull/71 with my fix from earlier? [18:45] cory_fu: I did not have asuccessful run but went through the code and it was fine [18:45] cory_fu: So yes, merge it! [18:45] heh [18:45] kwmonroe: You want to give it a poke? [18:46] I ask because I'm trying to resolve the merge conflicts in my containerization branch and would like to get that resolved at the same time === frankban is now known as frankban|afk [18:47] kjackal_: Also, one last thing. Who was you said possibly had a fix for https://travis-ci.org/juju-solutions/layer-cwr/builds/201538658 ? [18:47] *Who was it you said [18:47] it was balloons! [18:47] balloons: Help! :) [18:48] yup cory_fu, do you have cwr charm released in the store? [18:48] cory_fu: balloons: it is about releasing libcharmstore [18:50] kwmonroe: What do you mean? That PR branch? [18:50] ohh, what did I do? :-) [18:50] kjackal_: libcharmstore? https://github.com/juju/theblues ? [18:50] yeah cory_fu. is it built/pushed/released somewhere, or do i need to do that? [18:50] balloons: kjackal_ says you might know how to fix our travis failure due to charm-tools failing to install [18:50] or cory_fu, do just want me to pause for 5 minutes, pretend like i read the code, and merge it? [18:51] rick_h: cory_fu: balloons: its about this: https://github.com/juju/charm-tools/issues/303 [18:51] rick_h: libcharmstore seems to be just a wrapper around theblues at this point. Not sure what it provides extra that charm-tools needs [18:51] cory_fu: ah ok cool [18:52] kwmonroe: I can push that branch to the store, but I thought we didn't update the store until it was merged. I'll push it to edge, tho [18:53] cory_fu: i would be most thankful for an edge rev [18:53] use a snap? AFAICT, charm-tools wasn't built for trusty at any point. [18:54] you could also migrate to xenial I guess [18:54] where can I find documentation on bundles vs. charmstore, e.g. how do I push a bundle to the charmstore? [18:54] bdx: bundles should act just like charms. [18:54] bdx: they're both just zips that get pushed and released and such [18:55] balloons: Travis doesn't offer xenial, AFAICT [18:55] bdx: you just have the directory for the readme/yaml file [18:55] rick_h: http://paste.ubuntu.com/23996456/ [18:57] cory_fu, building charm-tools for trusty and publishing it seems the most straightforward [18:57] rick_h: looks like it needed a README.md [18:57] bdx: otp, will have to look. 
must be something that's not made it think it's a bundle [18:57] bdx: ah ok [18:57] bdx: crappy error message there for just that :/ [18:58] rick_h: yeah, I'll file a bug there for clarity [18:58] thanks [18:58] bdx: ty [18:58] balloons, marcoceppi: I thought charm-tools was already available for trusty? [18:58] balloons: I also can't find the snap for charm-tools [18:58] snap install charm [18:59] there's a broken dep for trusty [18:59] marcoceppi: Yeah, I don't understand why the dep broke. Also, will snaps even work inside Travis? [19:00] probably? [19:04] kwmonroe: Ok, cs:~juju-solutions/cwr-46 is released to edge [19:05] gracias cory_fu [19:05] kwmonroe: You probably want the bundle, too [19:05] nah [19:05] cory_fu: bundle grabs the latest [19:05] oh, der.. probably latest stable. anyway, no biggie, i can wedge 46 into where it needs to go [19:06] kwmonroe: I can release an edge bundle [19:06] too late cory_fu, i just deployed what i needed [19:06] :) [19:09] rick_h: so I was able to get my `charm push` command to succeed, now my bundle shows in the store https://imgur.com/a/w4sYr, but when I select "View", I see this -> https://imgur.com/a/udu1s [19:10] bdx: what's the ACL on that? [19:13] bdx: hmm ok so that seems like it should work out. [19:13] rick_h: unpublished, write for members of ~creativedrive [19:14] I could try opening it up to everyone and see if it make a difference [19:14] I was able to deploy it from the cli ... [19:14] bdx: well you should be allowed to look at it like that w/o opening it up [19:14] bdx: right, you're logged in, first question would be a logout/login [19:16] balloons: Is there a ppa for current stable snap? Getting this: ZOE ERROR (from /usr/lib/snap/snap): zoeParseOptions: unknown option (--classic) [19:17] balloons: Also of note, our .travis.yml requests xenial but still gets trusty [19:17] rick_h: yeah .. login/logout did not fix [19:18] bdx: k, that sounds like a bug, the output of the ACL from the charm command would be good and I'll try to find a sec to sedtup a bundle and walk through it myself [19:18] Hi, I used conjure-up to deploy openstack with novakvm on a maas cluster. after it's up and running, I try to ssh to the instance via external port, but I can't. I am checking the switch configurations now (to make sure there is no vlan separating the traffic), but wanted to check here to see if there is any known issue here. I added the external port on maas/conjure-up node to the conjureup0 bridge via brctl. [19:19] I did specify the external port on the neutron-gateway. [19:19] rick_h: sweet. thx [19:36] cory_fu, ppa for a stable snap? [19:36] that's a confusing statement [19:37] cory_fu, yea, travis might keep you stuck. But again, fix the depends for trusty or ... [19:41] can storage be specified via bundle? [19:43] balloons: PPA to get the latest stable snapd on trusty. Version that installs doens't support --classic [19:44] cory_fu, heh. Even edge ppa doesn't supply for trusty; https://launchpad.net/~snappy-dev/+archive/ubuntu/edge [19:45] cory_fu, but are you sure it doesn't work? I know I've used classic snaps on trusty [19:45] you probably just need to use backports [19:46] balloons: backports. That sounds promising. [19:46] cory_fu, actually, no.. http://packages.ubuntu.com/trusty-updates/devel/snapd [19:46] that should work [19:47] balloons: How do I tell it to use trusty-updates? [19:48] Ah, -t [19:49] Have to run an errand. Hopefully that will work [19:50] balloons: That didn't work. 
:( https://travis-ci.org/juju-solutions/layer-cwr/builds/201634553 [19:50] Anyway, got to run. [19:50] bbiab [19:55] lazyPower: just documenting the next phase of the issue http://paste.ubuntu.com/23996697/ [19:57] bdx - self signed cert issue at first glance [19:57] just got back from meetings / finishing lunch [19:57] cory_fu: I'll have a dep fix tomorrow [19:57] lazyPower: yea, entirely, just not sure how deis it is ever expected to work if we can't specify our own key/cert :-( [19:58] lunch is banned until stuff works! [19:58] bdx - spinning up a cluster now [19:58] give me 10 to run the deploy and i'll run down the gsg of deis workflow, see if i can identify the weakness here [19:59] bdx - i imagine this can be resolved by pulling in teh CA from k8s and adding it to your chain, which is not uncommon for self signed ssl activities === thumper-afk is now known as thumper [20:03] lazyPower: oooh, I hadn't thought of that [20:12] rick_h - do you happen to know if storage made it into the bundle spec? [20:13] or is that strictly a pool/post-deployment supported op [20:14] cory_fu: you badge real good: http://juju.does-it.net:5001/charm_openjdk_in_cs__kwmonroe_bundle_java_devenv/build-badge.svg [20:15] lazyPower: yes https://github.com/juju/charm/blob/v6-unstable/bundledata.go check storage [20:16] aha fantastic [20:16] lazyPower: it can't create.pools but can use them I believe via constraints and such [20:16] rick_h: awesome thanks [20:16] rick_h: any docs around that? === menn0 is now known as menn0-busy [20:17] bdx: I think it's under documented. [20:17] bdx: sorry, in line at the kid's school ATM so phoning it in (to IRC) [20:17] i'm filing a bug for this atm rick_h - i gotchoo [20:18] bdx - nope sadly we're behind on that one. However https://github.com/juju/docs/issues/1655 is being edited and will yield more data as we get it [20:19] lazyPower: awesome, thx [20:22] blackboxsw: fginther: we should totally set up a call for figuring out best path to merge https://github.com/juju/autopilot-log-collector & https://github.com/juju/juju-crashdump -- talking use-cases and merging [20:25] lutostag, seems like log-collector might be a potential consumer/customer of crashdump as it tries to pull juju logs and other system logs together into a single tarfile [20:25] ahh extra_dir might be what we need there [20:26] blackboxsw: yeah, I was curious about your inner-model stuff in particular to make sure we could do that for you too [20:27] hey lazyPower I am still trying to understand ingress controllers... is there a way of load balancing floating IPs using them? [20:27] blackboxsw: and your use of ps_mem as well (the motivation behind that and what it gives you) [20:28] lutostag, it is a bit tuned to our CI log analysis, but would not be impossible to merge things [20:29] fginther: yeah, OIL has a similar log analysis bit I'm sure, and merging heads seems like a good idea -- get everybody on one crash-collection format, then standardize analysis tools on top of it [20:29] stormmore - actually yes, you can create ingress routes that point at things that aren't even in your cluster [20:29] stormmore - but it sounds more like you're looking specifically towards floating ip management? [20:30] lazyPower - just trying to efficiently utilize my public IP space while providing the same functionality of using the service type loadBalancer gives in AWS [20:31] stormmore - so in CDK today - every worker node acts as an ingress point. 
Every single worker is effectively an ELB style router [20:31] lutostag, yeah, no objections to going that route. Would hopefully make things easier in the future [20:32] stormmore - you use ingress to slice that load balancer up and serve up the containers on proper domains. I would think any floating IP assignment you're looking to utilize there would probably be best served at being pointed directly at the units you're looking to make your ingress tier (this will tie in nicely to future work with support for worker pooling via annotations, but more on this later) [20:32] fginther: since we already have 2 ci-teams doing it, we should smash em together so that other's don't have to re-invent, wish I had reached out to you guys earlier tbh :/ [20:32] lazyPower yeah I saw that, I am more looking at providing service IP / VIPs instead [20:32] ok for doing virtual ips, i'll need to do some reading [20:32] stormmore - i've only tested nodport for that level of integration, there's more to be done there for VIP support i'm pretty sure [20:33] lazyPower the problem with using the node's IP is what happens to the IP if the node goes down [20:33] right and that varies on DC [20:33] it could stay the same, could get reassigned [20:34] exactly... my idea right now, is figure out how to run keepalived in the environment on one of the interfaces [20:34] stormmore - the thing is, when you declare a service in k8s you're getting a VIP [20:34] but its not one that would be routable outside of the cluster [20:35] https://kubernetes.io/docs/user-guide/services/#ips-and-vips [20:35] have keepalived as a DaemonSet and then be able to assign an IP to the internal service IP [20:35] ok, that sounds reasonable [20:37] seems the logical way of separating out the actual used IPs from the infrastructure 100% [20:37] yeah i see what you mean. i was looking at this [20:37] https://kubernetes.io/docs/tutorials/stateless-application/expose-external-ip-address/ [20:37] i presume this is more along the lines of what you were looking for, with --type=loadbalancer [20:37] where its just requesting an haproxy from the cloud at great expense, to act as an ingress point to the cluster VIP of that service [20:41] yeah I have been looking at that from an AWS standpoint and allows kubernetes to setup the ELB [20:42] fginther: I'll spend a little time, making juju-crashdump more library-like, and import able, then maybe I'll setup a call for early next week to discuss what you guys need in terms of output format to minimize disruption to your analysis if that's ok [20:43] lutostag, a call next week would be fine... But please note that some of the content of autopilot-log-collector is no longer needed [20:43] all of the juju 1 content can be removed [20:44] lutostag, I wouldn't want you to implement a lot of changes for the sake of autopilot-log-collector and have them not be used [20:46] balloons, marcoceppi: It turns out I was installing "snap" when I should have been installing "snapd". Unfortunately, it seems that snaps do not in fact work in Travis: https://travis-ci.org/juju-solutions/layer-cwr/builds/201647356 [20:47] cory_fu, ahh, whoops [20:47] it's possible to ignore that, but not with our package [20:47] fginther: ok, sure, we'll make a list! [20:57] balloons: What do you mean, "not with our package"? [20:58] cory_fu, snapd can be built with selinux / apparmor or perhaps no security model. not sure. But the ubuntu package absolutely wants app armor [20:58] kwmonroe - what bundle is that svg from? 
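(A concrete version of the --type=LoadBalancer discussion above — deployment and service names are placeholders. On a cloud with a provider integration such as AWS, LoadBalancer has Kubernetes provision the ELB for you; without one, NodePort publishes the same service on every worker's address, which is what the ingress tier already does for HTTP.)

# cloud-integrated: kubernetes asks the cloud for an external load balancer
kubectl expose deployment example-app --port=80 --target-port=8080 --type=LoadBalancer
kubectl get svc example-app        # EXTERNAL-IP fills in once the ELB exists

# no cloud LB available: publish a high port on every node instead
kubectl expose deployment example-app --port=80 --target-port=8080 \
  --type=NodePort --name=example-app-nodeport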
[20:58] charm_openjdk_in_cs__kwmonroe_bundle_java_devenv <-- kinda tells me but kinda doesnt [20:59] lazyPower: The bundle would be cs:~kwmonroe/bundle/java-devenv [20:59] oh i guess its this https://jujucharms.com/u/kwmonroe/java-devenv/ [20:59] ninja'd [20:59] watching matt deploy some ci goodness [21:00] well if matt can do it, we're in great shape. [21:07] bdx - just got deis up and running, stepping into where you found problems i think [21:07] "register a user and deploy an app" right? [21:12] so i finally got around trying lxd on my localhost. My lxc containers are started with juju and they seem to be stuck in 'allocating'. I'm not sure why [21:15] bdx - can you confirm your deis router is currently pending and not started? [21:15] bdx - looks like a collision between the ingress controller we're launching and the deis router. [21:19] cholcombe: if you do a lxc list, do you see any "juju" machines there? [21:19] lutostag: yeah they're def running [21:19] i deployed 3 machines and i see 4 lxc containers with juju- in the name [21:20] bdx yeah i was able to get the router scheduled by disabling ingress, we made some assumptions there that would prevent deis from deploying cleanly, both are attempting to bind to host port 80 [21:20] cholcombe: I would try lxc exec bash # and then run top in there and see where it is stuck [21:20] lutostag: ok [21:20] cholcombe: at one point there was an issue with things not apt-upgrading appropriately and getting stuck there indefinitely [21:21] lutostag: ahh interesting [21:21] lutostag: i don't see much going on. let me check another one [21:22] but that was months ago, still going in and poking is the best way to find it [21:22] looks like everything is snoozing [21:31] cholcombe: hmm, if they are all still allocating, mind pasting one of the containers "ps aux" to paste.ubuntu.com [21:31] lutostag: sure one sec [21:31] if no luck there, we'll have to see what the juju controller says... [21:37] bdx - i see the error between your approach and what its actually doing [21:38] bdx - you dont contact the kube apiserver for this, its attempting to request a service of type loadbalancer to proxy into the cluster and give you all that deis joy [21:38] lutostag: http://paste.ubuntu.com/23997274/ [21:38] bdx - i'd need to pull in the helm charts and give it a bit more of a high-touch to make it work as is, in the cluster right now. It would have to use type nodeport networking, and we would need to expose some additional ports [21:40] Budgie^Smore - cc'ing you on this as well ^ if your VIP work-aroudn works, i'd like to discuss teh pattern a little more and dissect how you went about doing it. as it seems like there's a lot of useful applications for that. [21:46] lutostag: the last message i see in the unit logs are that it downloaded and verified my charm [21:48] cholcombe: yeah, I can't see anything popping out at me, nothing helpful in "juju debug-log" ? [21:48] lutostag: no just messages about leadership renewal [22:03] cholcombe: hmm, looks like maybe you are stuck in the exec-start.sh. You could try rebooting one of those containers [22:04] fighting another juju/lxd issue here myself, and a bit out of my depth, wish I could be more helpful [22:07] lazyPower: ok, that would make sense, awesome [22:10] bdx - however as it stands, this doesn't work in an obvious fashion right away. not sure when i'll have time to get to those chart edits [22:11] lutostag: no worries [22:13] lazyPower: what are my options here? am I SOL? 
[22:14] until you have time to dive in [22:14] bdx not at all, you can pull down those helm charts and change the controller to be type: hostnetwork or type: nodeport [22:14] oooh [22:14] then just reschedule the controller [22:14] helm apply mything.yml [22:14] its devops baby [22:14] we have the technology :D [22:14] nice [22:14] I'll head down that path and see what gives [22:14] lazyPower: as always, thanks [22:15] bdx - c'mon man :) We're family by now [22:16] :) === perrito667 is now known as perrito666 === mup_ is now known as mup === SaMnCo_ is now known as SaMnCo === psivaa_ is now known as psivaa === bryan_att_ is now known as bryan_att === rmcadams_ is now known as rmcadams === med_ is now known as Guest3904 [22:49] * Budgie^Smore (stormmore) had to come home since his glasses broke :-/ [22:51] Budgie^Smore - i've been there, done that. literally this last week [23:00] On one of my machines, LXD containers do not seem to deploy. No obvious error, machine (i.e. 0/lxd/0) just never leaves the Pending status, and the agent is stuck on "allocating". Am I missing something obvious? [23:06] Oh, and occasionally another machine doesn't deploy containers as well. Is there a point where something can cause LXD container deployments to fail without error/retry? [23:06] lazyPower ouch! just sucks just how short sighted I am :-/ [23:11] andrew-ii - which version of juju? [23:12] 2.0.2 [23:13] Running on MAAS 2.1.3 [23:14] Machine seems to be connected and healthy, and an app will deploy, just not to a container [23:18] cat /var/log/lxd/lxd.log is just like my other machines, except it doesn't have the "alias=xenial ....", "ephemeral=false...", or "action=start ...." lines === menn0-busy is now known as menn0 [23:33] to make matters worse, I have 4k displays [23:47] Budgie^Smore - so i just confirmed the reigstry action's tls certs work as expected, no sure if you were still on that blocker [23:47] but i can help you decompose a workload using this as a template to reproduce for additional tls terminated apps [23:47] i deployed one using letsencrypt certs and it seems to have gone without an issue [23:51] yeah that is a blocker still, got side tracked with another blocker anyway and looking at intergrating with the ELB for now [23:52] with the let's encrypt method are you manually uploading the key and cert to the k8s secrets vault? [23:55] I am also trying to build a cluster in a box so I can test scenarios better than right now [23:57] yeah [23:57] Budgie^Smore - its encapsulated in the action, its taking teh base64 encoded certs and stuffing them in the secret template and enlisting it [23:59] hmmm so if I go to the dashboard and look at the secrets should I see the base64 or the ascii? [23:59] base64
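(To close the loop on that last exchange: the values under data: in a Secret manifest are base64-encoded, and that is also what the dashboard shows — roughly what the action described above does with the supplied cert and key. File and secret names here are placeholders.)

CRT=$(base64 -w0 registry.crt)
KEY=$(base64 -w0 registry.key)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: registry-tls
type: kubernetes.io/tls
data:
  tls.crt: ${CRT}
  tls.key: ${KEY}
EOF
kubectl get secret registry-tls -o yaml   # values shown here are the base64 strings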