[04:31] <rmcadams> juju makes me want to bang my head on the desk sometimes :)
[07:40] <kjackal> Good morning Juju world!
[09:05] <spaok> heya, is there a way to force juju 1.25 to use a particular controller?
[14:27] <Spaulding> Hi folks!
[14:27] <Spaulding> https://gist.github.com/pananormalny/11144709446aa80e956d7977df3d88d1 anyone know how to run it just once? @only_once doesn't seem to be working here...
[14:28] <magicaltrout> only_once is really a thing?!
[14:28] <Spaulding> also i have no idea what the best way is to check whether the db is already populated. should I check that some tables exist, or is there some other fancy way of doing this?
[14:28] <magicaltrout> so it is
[14:29] <magicaltrout> I don't know how fixed it is Spaulding but an issue here says:
[14:29] <magicaltrout> "Another consideration that needs to be documented because it is very non-obvious is that, because of an implementation detail, @only_once must be the innermost decorator, if combined with @when, etc."
[14:29] <magicaltrout> yours is the outermost
[14:29] <magicaltrout> :)
[14:30] <Spaulding> omfg
[14:30] <Spaulding> let's try it then :)
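For reference, a minimal sketch of the ordering fix being discussed, assuming a charms.reactive handler (the state name and seeding function are hypothetical, not from the gist):

```python
from charms.reactive import when, only_once

# Broken ordering (as in the gist): @only_once as the outermost decorator.
# @only_once
# @when('db.connected')
# def fill_db(db): ...

# Working ordering: @only_once must be the innermost decorator
# when combined with @when.
@when('db.connected')          # hypothetical state set by a db interface
@only_once
def fill_db(db):
    seed_database(db)          # hypothetical one-time initialization
```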
[14:30] <magicaltrout> i know... my google-fu is amazing on a Friday
[14:31] <Spaulding> haha
[14:31] <Spaulding> friday the 13th
[14:31] <magicaltrout> shit
[14:31] <magicaltrout> i didn't even notice
[14:31] <Spaulding> so now you know ;)
[14:31] <Spaulding> your magic abilities work even today!
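An aside on Spaulding's other question (checking whether the db is already populated): one common approach is to probe for a sentinel table. A minimal sketch, assuming PostgreSQL via psycopg2; the DSN and table name here are hypothetical:

```python
import psycopg2

def db_is_populated(conn, table_name="app_data"):
    """Return True if the sentinel table already exists."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT EXISTS ("
            "  SELECT 1 FROM information_schema.tables"
            "  WHERE table_schema = 'public' AND table_name = %s"
            ")",
            (table_name,),
        )
        return cur.fetchone()[0]

conn = psycopg2.connect("dbname=mydb user=myuser")  # hypothetical DSN
if not db_is_populated(conn):
    pass  # run the one-time seed here
```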
[15:01] <BlackDex> hello there
[15:01] <BlackDex> i used juju with maas and i created several network aliases on one vlan/interface
[15:02] <BlackDex> maas deploys it correctly
[15:02] <BlackDex> i can access the servers on every network
[15:03] <BlackDex> but when i let juju create lxd containers on these hosts, they only receive one ip
[15:08] <Zic> hi here, I'm (still) testing the canonical-kubernetes bundle charms, and I'm shutting down an etcd member to see whether the 2 others deployed by default get hurt
[15:08] <Zic> when I shut down etcd/1 or etcd/2 it's fine
[15:08] <Zic> but if I shut down etcd/0, the cluster goes unhealthy
[15:08] <magicaltrout> lazyPower: one for you -^
[15:08] <lazyPower> Zic which version of the etcd charm?
[15:08] <Zic> have you any idea about this issue?
[15:09] <Zic> charm: cs:~containers/etcd-21
[15:11] <lazyPower> I have done disaster testing where we tear down nodes, and i haven't encountered this behavior. Is the broken deploy still around? Can you collect the logs from one of the etcd nodes that's failing after you've destroyed etcd/0?
[15:12] <Zic> yep, I collected it, it just tries to reconnect to etcd/0 in a loop (nothing unexpected): http://paste.ubuntu.com/23792623/
[15:13] <Zic> I get the same type of message if I shut down etcd/1 or etcd/2, but the cluster stays healthy and the Kubernetes cluster keeps working
[15:13] <Zic> but if I shut down etcd/0, all kubectl commands and even the kubernetes-dashboard are dead, with an "etcd cluster unhealthy or misconfigured" error message
[15:14] <lazyPower> oh Zic  i know what happened here. it looks like the etcd cluster lost quorum and didn't re-elect a new master because it couldn't reach consensus via the 2 nodes that were still active and not acting as coordinator
[15:16] <Zic> lazyPower: oh, that seems logical, but how can I prevent it?
[15:16] <lazyPower> Zic juju add-unit etcd -n 2  -- that will bring the total number of etcd units to 5, and you should be able to test the disaster recovery at that point
[15:17] <Zic> note that I put etcd on the same machine that's running the kubernetes-master charm, instead of the default model where etcd runs alone on a machine
[15:17] <Zic> don't know if that's safe
[15:17] <lazyPower> Zic that's not recommended for production deployments :)
[15:17] <magicaltrout> etcd needs to man up and make a decision! ;)
[15:17] <lazyPower> as etcd is the core database tracking the state of your cluster.
[15:17] <Zic> yeah, but I'm scaling to 3 kubernetes-masters
[15:18] <Zic> (and test kube-api-loadbalancer ;))
[15:18] <lazyPower> but it's fine to colocate etcd for testing purposes
[15:18] <Zic> even if I run 3 kubernetes-masters, do you recommend putting etcd on separate machines?
[15:19] <Zic> to put it correctly: the default model deploys 1 kubernetes-master and 3 etcd units
[15:19] <Zic> I add two more kubernetes-masters, and place the etcd charms alongside them
[15:19] <Zic> is that a poor design choice for a production cluster? :|
[15:22] <lazyPower> i think that's fine
[15:22] <lazyPower> extend the etcd cluster onto those principal units, 3 independent, 2 shared
[15:22] <lazyPower> where 2 is N really
[15:23] <Zic> lazyPower: thanks, I will try that
[15:24] <Zic> bonus question: is kube-api-loadbalancer really experimental? I hesitate to run it in a production environment, but as I looked at what it's doing, it's just an automatically configured nginx acting as a reverse proxy
[15:26] <lazyPower> Zic we've found some interesting behavior with the proxy, there's still some tuning to do before we can call it GA
[15:26] <lazyPower> it works well, and needs no further tuning in 90% of the scenarios in which it's deployed
[15:27] <lazyPower> but if you're using addons like HELM, we've found that it doesn't work when routed through the LB
[15:43] <Zic> lazyPower: what fault tolerance do 5 etcd units in total give? 2 nodes down, since 3 remain to reach a quorum decision?
[15:44] <lazyPower> Zic https://coreos.com/etcd/docs/latest/v2/admin_guide.html
[15:44] <lazyPower> see the table under Optimal Cluster Size
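For reference, the failure tolerance in that table follows from majority quorum: an N-member etcd cluster needs a majority of N // 2 + 1 members, so it tolerates the rest failing. A quick sketch of the arithmetic:

```python
def etcd_fault_tolerance(n_members):
    """Failures an n-member etcd cluster survives while keeping quorum."""
    quorum = n_members // 2 + 1
    return n_members - quorum

# Matches the "Optimal Cluster Size" table in the etcd admin guide:
for n in (1, 3, 5, 7, 9):
    print(n, "members ->", etcd_fault_tolerance(n), "failures tolerated")
# 1 -> 0, 3 -> 1, 5 -> 2, 7 -> 3, 9 -> 4
```

So yes: with 5 units, 2 can go down and the remaining 3 still form a quorum.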
[15:44] <Zic> oh yeah, I found this link yesterday and forgot it, sorry :)
[15:45] <lazyPower> Zic no worries :) I just find it useful to spot check my own knowledge against their upstream docs
[15:45] <lazyPower> it keeps me honest
[15:45] <Zic> :)
[15:45] <magicaltrout> nooo lazyPower make him pay!!!!
[15:45] <magicaltrout> make him paaaaaaaay!
[15:45] <magicaltrout> sorry, friday.
[15:45] <lazyPower> and according to magicaltrout you owe us pizza and beer... because we posted links.
[15:45] <lazyPower> and he pinged me
[15:45]  * lazyPower pokes magicaltrout 
[15:45] <magicaltrout> \o/
[15:45] <magicaltrout> marcoceppi_: i should be in DC in the last week in March
[15:46] <magicaltrout> consider yourself warned
[15:46] <magicaltrout> conference season hasn't even started and i'm already juggling trips
[15:46] <Zic> true! come to France, but I must warn you, we don't have good pizzas (and our 'good' beer is stolen from our Belgian neighbors)
[15:47] <magicaltrout> i was in Paris last year, that was fun
[15:49] <lazyPower> Zic i'll gladly take your belgian brews
[15:49] <magicaltrout> hehe
[15:49] <magicaltrout> wise lazyPower you need to get practicing ;)
[15:50] <lazyPower> practicing what exactly?
[15:50] <magicaltrout> drinking 15% beer for Ghent ;)
[15:50] <lazyPower> dear lord
[15:50] <lazyPower> i am not ready
[15:50] <lazyPower> magicaltrout you have lofty goals ;)
[15:51] <magicaltrout> it's always achievable... just requires dedication and practice... and the distinct possibility of doing talks whilst still drunk
[15:51] <lazyPower> That would clearly follow in the footsteps of my hero Jamie Windsor
[15:52] <lazyPower> i think it was chef conf 2012, where he took the stage blitzed and tried to run a live siri-controlled infra deploy and it failed fantastically. RIP that demo, but it certainly was memorable
[15:52] <magicaltrout> lol
[15:53] <Zic> hehe, here we use Puppet generally... but for some new projects (like K8S) we're testing Juju :)
[15:56] <Zic> lazyPower: if I don't care about resources, do you advise splitting etcd off from the machines which already host kubernetes-master? or is staying with 3 kubernetes-master+etcd and 2 other etcd units separately also fine? (more questions, more beers, I know :()
[15:56] <lazyPower> i do, i think it's the best strategy to keep them out of the kubernetes units themselves for deployments at scale
[15:57] <lazyPower> if you colocate etcd with kubernetes-master-3,4,5 for example
[15:57] <lazyPower> and your cluster goes dormant, and you remove master-4,5
[15:57] <lazyPower> you'll still have tainted machines running with those etcd units
[15:58] <lazyPower> Plus, in juju land, when you colocate services (we call this hulk-smashing), it *can sometimes* have unintended behavior, like say we add an addon that requires an isolated etcd container to be running... (i'm looking @ you CNI networking providers)
[15:58] <lazyPower> if you've got a colocated etcd unit on that node, you'll hit port collision
[15:58] <Zic> ok, good advices, I will apply them :)
[15:58] <lazyPower> and in some cases, it will cause the addon workload to fail
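As an illustration of the port-collision point: a colocated etcd unit and an addon that ships its own etcd would contend for the same listeners. A minimal sketch using generic socket probing, assuming etcd's standard client/peer ports (2379/2380):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex((host, port)) == 0

# etcd's standard client and peer ports
for port in (2379, 2380):
    if port_in_use(port):
        print(f"port {port} taken -- a second etcd here would collide")
```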
[18:40] <spaok> so, my juju is broken badly: mongo won't cluster and keeps restarting, and I can't issue ensure-availability. I tried restoring an lxc snapshot, but the last two didn't help. before I try something drastic, like restoring the first snapshot (removing the newer ones to do so), does anyone have ideas how I might recover?
[19:25] <lazyPower> spaok have you been able to collect the logs from your container? which juju version?
[19:25] <spaok> should be 1.25
[19:26] <spaok> it's an older install we are trying to decommission, but juju broke, so that's making it harder
[19:26] <lazyPower> yeah, that's a rough situation. sorry spaok, i'm not certain how to resolve mongo clustering issues, but i would lead with filing a bug with those logs so we can attempt to triage. it sounds like you're on a timetable, though
[21:25] <magicaltrout> friday is here \o/ gin o'clock
[21:26] <magicaltrout> this ones for you kwmonroe https://youtu.be/u6mJMJzDD-M?t=1m13s
[21:26] <kwmonroe> no way i'm clicking that
[21:26] <magicaltrout> lol
[21:26] <magicaltrout> its music! :P
[21:26] <magicaltrout> it popped up on spotify on the way home and made me think of you
[21:27] <kwmonroe> ugh.. fine.  i'll click it.
[21:28] <kwmonroe> ha!  i like it.  you're back on my nice list magicaltrout!
[21:28] <magicaltrout> lol
[21:28] <magicaltrout> thanks :P
[21:41] <elbalaa> spaok: did you try sshing into the mongo machine and looking at the logs?
[23:39] <spaok> elbalaa: ya there was some corruption or something, mongo would try to cluster and then restart
[23:39] <spaok> we were finally able to find a fix: a mix of restoring a previous snapshot and turning off the repl
[23:39] <spaok> then recovering from there
[23:40] <spaok> so it's up enough that we can finish the decomm process
[23:58] <elbalaa> spaok:  "corruption or something" story of my life when working with mongo
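For anyone landing in a similar state, it can help to ask juju's mongo replica set for its own view of member health before reaching for snapshots. A minimal sketch using pymongo; the port and TLS settings here are assumptions (juju 1.25 typically runs its mongod on a non-default port with SSL and auth), so adjust to your install:

```python
from pymongo import MongoClient

# Hypothetical connection details; real juju installs also require credentials.
client = MongoClient(
    "localhost",
    37017,
    tls=True,
    tlsAllowInvalidCertificates=True,
)

# Print each replica-set member's self-reported state and health.
status = client.admin.command("replSetGetStatus")
for member in status["members"]:
    print(member["name"], member["stateStr"], "health:", member.get("health"))
```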