[00:08] <R_P_S> hey rick_h, just wanted to thank you for the help so far.
[00:15] <bdx> is there support for artful series on aws?
[00:16] <bdx> I'm seeing juju has grabbed a machine http://paste.ubuntu.com/25970976/
[00:16] <bdx> but just doesn't want to start
[00:17] <bdx> been sitting there for a while, making me think artful hasn't made the cut just yet
[00:18] <bdx> by a "while" I mean like 20 mins
[09:49] <ChaoticMind> is the jaas controller being super slow right now or is it just me?
[09:50] <mhilton> ChaoticMind, what cloud/region are you seeing problems in?
[09:50] <ChaoticMind> aws/eu-central-1
[09:52] <mhilton> ChaoticMind, thanks I'll try it out and see if I see the same.
[09:52] <ChaoticMind> thanks
[09:57] <mhilton> ChaoticMind, is there any particular command that's taking its time for you?
[09:58] <ChaoticMind> mhilton: just deploying the bundle took forever (like 3 minutes for a smallish bundle). Setting relationships took like 15 seconds each
[09:58] <ChaoticMind> Usually it's about 0.5 seconds
[09:59] <ChaoticMind> I made a new model and tried it again now, it seems ok now!
[10:00] <mhilton> ChaoticMind: I'll look into it, one of the controllers may be overloaded. Thanks for mentioning it.
[10:01] <ChaoticMind> no worries
[16:12] <bdx> yoyoyo - whats the deal with artful deploys? Will someone `juju add-machine --series artful` on aws and let me know if I'm crazy
[16:14] <bdx> oooh shoot, looks like adding a machine of artful worked actually
[16:14] <bdx> nm
[16:14] <bdx> jeeze
[16:14] <bdx> ahh, `juju deploy ubuntu --series artful --force` is what is failing
[16:16] <kwmonroe> bdx: how about juju add-machine --series artful; juju deploy ubuntu --to X --force?
[16:16] <kwmonroe> just a shot in the dark
[16:16] <bdx> kwmonroe: no, great shot, I actually just did that and it worked
[16:16] <kwmonroe> sweet
[16:17] <bdx> and it looks like what was failing me last evening is now working too, with the `juju deploy ubuntu --series artful --force`
[16:17] <bdx> gd
[16:18] <bdx> I was experiencing some extreme jitter yesterday on JAAS I think
[16:18] <bdx> I was trying to get an artful deploy going for quite a while and it was just failing at machine "pending"
[16:18] <bdx> really strange
[16:18] <kwmonroe> ha!  yeah, "juju deploy ubuntu --series artful --force" just worked for me too on aws
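(A minimal sketch of the two paths discussed above, assuming an AWS model; the machine number returned by add-machine is a placeholder:)

    # workaround: add the artful machine first, then target the deploy at it
    juju add-machine --series artful          # suppose this returns machine 3
    juju deploy ubuntu --to 3 --force

    # the direct form, which also worked here once the earlier jitter cleared
    juju deploy ubuntu --series artful --force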
[16:34] <R_P_S> hey, so as part of my ongoing evaluation of juju, I've just created an HA controller.  But how do I specify which subnets to create the HA instances in?
[16:35] <bdx> jam:^^
[16:36] <bdx> R_P_S: that is possible with the `--to` directive, it's just not documented yet
[16:37] <bdx> I think it's something like `--to subnet=subnet-<id>`
[16:37] <R_P_S> so $ juju enable-ha --to subnet=subnet-priv1b --to subnet=subnet-priv1c ?
[16:38] <bdx> R_P_S: let me see if I can get it, omp
[16:40] <bdx> R_P_S: `juju bootstrap aws/us-west-2 --to subnet=subnet-<id> --credential mycred`
[16:40] <bdx> ^^ worked
[16:40] <bdx> I'll see about the HA omp
[16:40] <R_P_S> yeah, that worked for the first instance...
[16:40] <bdx> instances are launching faster than I've ever experienced
[16:41] <R_P_S> but after creating the initial controller, enabling HA put them in random subnets as far as I can tell
[16:41] <bdx> I already have a bootstrapped controller
[16:41] <bdx> crazy
[16:41] <R_P_S> including mixing public and private subnets for controllers 1 and 2
[16:43] <bdx> R_P_S: `juju enable-ha  --to subnet=subnet-<id>,subnet=subnet-<id>`
[16:43] <bdx> worked seamlessly
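(A sketch of the subnet placement bdx describes; region, credential name, and subnet IDs are placeholders:)

    # bootstrap the controller into a specific subnet
    juju bootstrap aws/us-west-2 --to subnet=subnet-aaaa1111 --credential mycred

    # grow to HA, pinning the additional controller machines to chosen subnets
    juju enable-ha -n 3 --to subnet=subnet-bbbb2222,subnet=subnet-cccc3333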
[16:44] <bdx> R_P_S: would you mind putting some heat on this please https://github.com/juju/docs/issues/2122
[16:47] <R_P_S> I don't have a github account and I'm at work... I'll need to do that later... as an aside, that ticket doesn't appear to be about enable-ha
[16:52] <R_P_S> so now I'm not sure how to remove the extra HA controllers
[16:52] <R_P_S> https://jujucharms.com/docs/2.2/controllers-ha
[16:53] <R_P_S> juju status doesn't have any mention of "has-vote" for the controller model... and "remove-machine" just fails with a message that "machine 2 is required by the model"
[16:55] <rick_h> R_P_S: so juju show-controller should mention HA status bits and has-vote I believe
[16:55] <jamesbenson> hi all, I'm trying to do a simple LXD conjure-up k8s with help, all-in-one.  But it fails from the get-go.
[16:55] <rick_h> R_P_S: you can always remove-machine --force but yea, best to know what's up there.
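(A short sketch of the checks rick_h mentions; as the discussion below shows, controller machines can still refuse removal, so this is not guaranteed to succeed. The machine number is a placeholder:)

    # inspect HA membership; controller machines should show ha-status / has-vote here
    juju show-controller

    # attempt a forced removal of a dead controller machine in the controller model
    juju remove-machine -m controller 2 --force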
[16:56] <stokachu> jamesbenson: whats the issue?
[16:56] <rick_h> jamesbenson: bummer, what's the issue? I'm sure folks can get you good to go here.
[16:56] <jamesbenson> thanks stokachu and rick_h
[16:56] <jamesbenson> I hope so
[16:57] <jamesbenson> So there seem to be a few issues.  Side note: I'm doing this from an ubuntu server VM in openstack, xlarge.  Pretty sure that doesn't matter, but just in case
[16:57] <stokachu> jamesbenson: whats the hw specs?
[16:57] <stokachu> ram, cpus
[16:58] <jamesbenson> 8 vCPU, 16GB RAM, 160 GB HD
[16:58] <stokachu> ok should be fine
[16:59] <jamesbenson> ubuntu 16.04 LTS
[16:59] <jamesbenson> These are the commands I do from deploy: http://paste.openstack.org/show/626536/
[17:00] <jamesbenson> https://snag.gy/nT0LPv.jpg
[17:00] <jamesbenson> that's the latest state...
[17:01] <jamesbenson> actually I've tried twice... here's the other: https://snag.gy/Z9cs2x.jpg
[17:02] <jamesbenson> thoughts stokachu?
[17:15] <R_P_S> so I'm trying to roll over controllers by simply terminating "bad" ones in aws directly and re-running enable-ha
[17:16] <R_P_S> but I'm still unable to remove machines that don't show up with ha-status enabled in "show-controller"
[17:17] <R_P_S> I upped enable-ha to 7 to test the subnets... it looks like I need to specify the subnets with each enable-ha command :\
[17:18] <R_P_S> but show-controller lists machines 0,3,4,5,6,7,8 (1,2 were "demoted" according to enable-ha output)... but I still get "machine 1 is required by the model"
[17:19] <R_P_S> and one thing I've found is that using --force for remove-machine leaves an orphaned security group
[17:22] <kwmonroe> jamesbenson: just a guess, but can those units get to the outside world?  i know etcd and k8s charms snap install stuff, so i wonder if they're having trouble getting out.  can you pastebin a "juju debug-log --replay -i etcd/0"?
[17:27] <R_P_S> juju remove-machine 1 --force
[17:27] <R_P_S> fails
[17:31] <R_P_S> this bug was opened almost a year ago https://bugs.launchpad.net/juju/+bug/1658033
[17:31] <mup> Bug #1658033: Juju HA - Unable to remove controller machines in 'down' state <4010> <cpe-onsite> <juju:Triaged> <https://launchpad.net/bugs/1658033>
[17:39] <bdx> R_P_S: downsizing the controller cluster isn't supported
[17:40] <bdx> you have to dump the db and restore to a smaller cluster (I think)
[17:40] <R_P_S> correct, once an -n N has been specified, it can't be shrunk
[17:40] <R_P_S> but I'm trying to simulate failure
[17:40] <R_P_S> so I terminated one instance and reran enable-ha to rebuild new ones
[17:40] <R_P_S> but the terminated ones are still in the list, unable to be removed
[17:41] <R_P_S> I'm up to 13 "machines" in the config, with 5/7 ha (currently rebuilding)
*  controller  admin  superuser  aws/us-east-1       2        13  5/7  2.2.6
[17:41] <bdx> kwmonroe: I'm hitting it again, I just deployed these and they stood up just fine, tore it down and redeployed, and it's the artful instances that have been in pending for > 20mins now -> http://paste.ubuntu.com/25975589/
[17:52] <bdx> kwmonroe: created a new model on the same controller, then deployed the same charm http://paste.ubuntu.com/25975644/
[17:52] <bdx> see what I'm saying about the inconsistencies ?
[17:52] <kwmonroe> bdx: i'm in us-east-1, and just verified "juju deploy ubuntu --series artful --force" worked again.  hard to say what's up with it being intermittent.  do a "juju ssh -m controller 0" and sudo grep around /var/log/juju for 'machine-X' to see if there's a provisioner issue.
[17:52] <bdx> right
[17:53] <kwmonroe> yeah bdx, frustrating for sure.. i'm hoping there's something in the controller log that will be more insightful about an artful provisioning issue.
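(A sketch of the log check kwmonroe suggests; the stuck machine's number is a placeholder:)

    # ssh to the controller and grep its logs for provisioner errors about the stuck machine
    juju ssh -m controller 0
    sudo grep -r 'machine-12' /var/log/juju/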
[17:55] <bdx> kwmonroe: http://paste.ubuntu.com/25975668/ - oh man
[18:00] <kwmonroe> bdx: i haven't seen "failed to start instance (failed to start instance in provided availability zone)" before, and no sign of it in my controllers. however, i'm on 2.3-beta1 so that could be new logging in beta3.
[18:01] <kwmonroe> bdx: what kind of constraints do you have for redis-cache?  any wicked machine reqs there?
[18:01] <bdx> root-disk, spaces
[18:02] <bdx> testing w/o any constraints
[18:03] <kwmonroe> bdx: i was hoping you had "instance-type=p3.xxlarge" and i could just say "us-west is simply out of those instance types", but that doesn't seem like the case.
[18:05] <bdx> kwmonroe: it was the constraints
[18:05] <bdx> I removed them, and voila
[18:05] <kwmonroe> must have been spaces, right?  surely not root-disk
[18:05] <bdx> I wonder if I'm hitting disk cap on aws
[18:05] <bdx> testing that right now
[18:06] <kwmonroe> yeah, make sure you're asking for GB and not PB ;)
[18:06] <kwmonroe> otherwise RIP your wallet
[18:09] <bdx> ha, yeah, "G"
[18:09] <bdx> so I just logged into aws console and created 10 x 100G ebs volumes
[18:09] <bdx> no issues
[18:11] <bdx> kwmonroe: https://bugs.launchpad.net/juju/+bug/1706462
[18:11] <mup> Bug #1706462: juju tries to acquire machines in specific zones even when no zone placement directive is specified <cdo-qa> <foundations-engine> <juju:Triaged by ecjones> <MAAS:Invalid> <https://launchpad.net/bugs/1706462>
[18:12] <bdx> see my comment at the bottom
[18:19] <bdx> kwmonroe: I'm about to suggest something crazy
[18:19] <bdx> http://paste.ubuntu.com/25975792/
[18:19] <bdx> taking ^ into consideration
[18:20] <bdx> redis-space and ubuntu-space are both deployed with only a "spaces" constraint
[18:20] <bdx> the ubuntu-space didn't have a --series constraint
[18:20] <bdx> or bah
[18:20] <bdx> --series argument
[18:21] <bdx> the only failures I'm seeing here are when '--series' is specified alongside a spaces constraint
[18:21] <bdx> because we see from ^ that redis-disk worked, it had '--series artful' and '--constraints "root-disk=100G"'
[18:21] <bdx> and ubuntu-space worked
[18:22] <bdx> which had no '--series' arg, but had the spaces constraint
[18:22] <bdx> but the only thing failing consistently
[18:23] <bdx> are things deployed to a space that has the '--series'  arg
[18:23] <bdx> I'll prove it by specifying '--series' with another series other than artful
[18:23] <bdx> how about zesty
[18:24] <bdx> since we see from ^ that zesty worked w/o a spaces constraint
[18:25] <kwmonroe> bdx: i don't know enough about juju's zone handling, but what happened here with graylog / #38? http://paste.ubuntu.com/25968550/
[18:25] <kwmonroe> did graylog have constraints?
[18:26] <bdx> no
[18:26] <bdx> well yes
[18:26] <bdx> but that isn't happening because of that
[18:27] <bdx> that happens with every single instance deployed with 2.3beta3
[18:27] <bdx> it eventually gets past the "failed to start instance (failed to start instance in provided availability zone) " and finds one and eventually starts
[18:27] <bdx> I was just posting that to show that it's not only maas that's having that issue
[18:28] <bdx> ok, well I think this verifies my theory http://paste.ubuntu.com/25975824/
[18:28] <bdx> I just deployed the ubuntu-zesty-space
[18:28] <bdx> it required the spaces constraint and --series
[18:29] <bdx> and it failed similarly to the artful one
[18:29] <bdx> just stuck pending
[18:29] <bdx> #@(*$U(#@*$@#*&
[18:30] <bdx> idk
[18:30] <bdx> I may as well go back to sleep
[18:30] <bdx> somehow I knew today would be a trying day
[18:37] <kwmonroe> :)
[18:39] <kwmonroe> bdx: i would note in bug 1706462 that spaces + series repro this easily on aws
[18:39] <mup> Bug #1706462: juju tries to acquire machines in specific zones even when no zone placement directive is specified <cdo-qa> <foundations-engine> <juju:Triaged by ecjones> <MAAS:Invalid> <https://launchpad.net/bugs/1706462>
[18:40] <bdx> kwmonroe: series + spaces only with artful
[18:40] <bdx> AND @kwmonroe
[18:40] <bdx> ^ bug is entirely different than what I'm seeing I think
[18:40] <jamesbenson> kwmonroe: sorry for the delay, turkey-luncheon thingy.... that command is giving me a TLS handshake timeout..
[18:41] <jamesbenson> kwmonroe: The instance can ping google...
[18:41] <bdx> kwmonroe: this verifies that it is only happening with artful http://paste.ubuntu.com/25975887/
[18:42] <bdx> kwmonroe: what I'm seeing is the instances stay in pending for only series + space + artful
[18:42] <bdx> 1706462 - failed to start instance (failed to start instance in provided availability zone) within attempt 0, retrying in 10s with new availability zone
[18:43] <kwmonroe> wait bdx, your previous paste shows machine 8 waiting for machine with series zesty: http://paste.ubuntu.com/25975824/
[18:43] <bdx> but then on *a* next attempt, juju will find an instance, and start it, and go on its way
[18:43] <bdx> kwmonroe: ah, my bad, yea, that machine started
[18:43] <bdx> which made me realize, in all cases, it's only artful that is the commonality here
[18:44] <bdx> when used with spaces + series
[18:44] <bdx> try it
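(A minimal repro sketch of the combination bdx is describing — a newer --series together with a spaces constraint; "myspace" is a placeholder space that must already exist in the model:)

    # these each worked on their own
    juju deploy ubuntu --series artful --force
    juju deploy ubuntu --constraints "spaces=myspace"

    # this combination reportedly stays stuck in pending on 2.3-beta3
    juju deploy ubuntu --series artful --force --constraints "spaces=myspace"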
[18:44] <bdx> oooh, it may be only beta3, let me try this on jaas
[18:48] <kwmonroe> jamesbenson: how about just "juju debug-log"?  does that give you a tls timeout too?
[18:48] <bdx> works great on jaas http://paste.ubuntu.com/25975933/
[18:48] <bdx> kwmonroe: the juju agent never starts, so I don't get any log from those instances
[18:48] <jamesbenson> yes
[18:48] <jamesbenson> kwmonroe: ^
[18:49] <bdx> oooh jamesbenson
[18:49] <bdx> my b
[18:49] <bdx> lol
[18:49] <kwmonroe> :)
[18:49] <jamesbenson> http://paste.openstack.org/show/626542/
[18:49] <kwmonroe> jamesbenson: ooooohhh, i thought you meant the debug-log command wasn't showing any output.
[18:50] <jamesbenson> kwmonroe: No, seems to have issues with net/http: TLS handshake timeout...
[18:51] <jamesbenson> I'm not sure why that is, though, since easyrsa is able to get active ...
[18:51] <jamesbenson> so it must be able to reach out, correct?
[18:52] <kwmonroe> jamesbenson: easyrsa doesn't snap install anything
[18:52] <kwmonroe> etcd and k8s charms do
[18:52] <jamesbenson> oh man..
[18:52] <jamesbenson> so something with the bridge then
[18:52] <kwmonroe> so jamesbenson, i'll bet you all the money in my pockets that if you do a "juju run --unit easyrsa/0 'sudo snap install etcd'", it'll fail
[18:53] <jamesbenson> kwmonroe: seems to be just sitting there...
[18:54] <jamesbenson> yep, same error
[18:54] <jamesbenson> juju run --unit easyrsa/0 'sudo snap install etcd'
[18:54] <jamesbenson> error: cannot perform the following tasks:
[18:54] <jamesbenson> - Download snap "core" (3440) from channel "stable" (Get https://068ed04f23.site.internapcdn.net/download-snap/99T7MUlRhtI3U0QFgl5mXXESAiSwt776_3440.snap?t=2017-11-16T20:00:00Z&h=30ced1b835617d49d8ff4221a62d789f7ca638aa: net/http: TLS handshake timeout)
[18:54] <jamesbenson> sorry about the paste there...
[18:54] <kwmonroe> jamesbenson: to test the tls/http connectivity more generically, do this.. juju ssh etcd/0, then wget https://google.com from the etcd unit.
[18:55] <kwmonroe> (make sure it's https)
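(A sketch of the connectivity checks kwmonroe walks jamesbenson through, run from inside the unit:)

    juju ssh etcd/0
    # plain HTTPS out of the container works...
    wget https://google.com
    # ...but the snap store download does not
    sudo snap install etcd      # fails with: net/http: TLS handshake timeout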
[18:55] <bdx> ok, here it is https://bugs.launchpad.net/juju/+bug/1732764
[18:55] <mup> Bug #1732764: series + spaces + artful + juju2.3beta3 = fail <juju:New> <https://launchpad.net/bugs/1732764>
[18:55] <jamesbenson> kwmonroe: works.
[18:55] <kwmonroe> interesting
[18:56] <jamesbenson> http://paste.openstack.org/show/626545/
[18:56] <kwmonroe> jamesbenson: how about a "sudo snap install etcd" from that same etcd unit?
[18:58] <jamesbenson> kwmonroe : nope...http://paste.openstack.org/show/626546/
[18:59] <jamesbenson> interesting...
[18:59] <ryebot> jamesbenson: if this is an egress-restricted environment and you're unable to hit the snap store, I can provide you with some steps for installing them manually.
[18:59] <jamesbenson> all ports are open....
[18:59] <jamesbenson> I'll double check though..
[19:01] <kwmonroe> bdx: nice detail in 1732764.  interesting that it's such a specific combo.  also, you may want to s/"spaces=myspace"/"spaces=facebook" in case a more recent social media platform helps.
[19:01] <jamesbenson> ryebot kwmonroe: iptables are empty, and security group is open all ports in and out.
[19:01] <jamesbenson> http://paste.openstack.org/show/626547/
[19:03] <jamesbenson> This is the only rule in my iptables -t nat: MASQUERADE  all  --  10.55.234.0/24      !10.55.234.0/24       /* managed by lxd-bridge */
[19:04] <kwmonroe> jamesbenson: stick some quotes around that url... wget 'https://068ed04f23.site.internapcdn.net/download-snap/99T7MUlRhtI3U0QFgl5mXXESAiSwt776_3440.snap?t=2017-11-16T20:00:00Z&h=30ced1b835617d49d8ff4221a62d789f7ca638aa'
[19:04] <jamesbenson> hmm... still shows connected.  But once connected it sits.
[19:05] <kwmonroe> jamesbenson: how about running "env | grep -i proxy" on that unit.  anything in there?
[19:07] <jamesbenson> NO_PROXY=10.55.234.245,127.0.0.1,::1,localhost
[19:07] <jamesbenson> no_proxy=10.55.234.245,127.0.0.1,::1,localhost
[19:09] <kwmonroe> hmph jamesbenson, that seems legit
[19:10] <jamesbenson> ....so confused....  not a good sign that everything seems legit from you too...
[19:11] <jamesbenson> kwmonroe: do you deploy on baremetal or in VM's?  Do you have any script or anything?
[19:22] <kwmonroe> jamesbenson: by "legit", i meant the no_proxy stuff looks legit :)  if you can't do a "sudo snap install foo" from the unit, juju won't be able to either.
[19:22] <kwmonroe> jamesbenson: there's a gremlin in there to be sure.  just need to figure out why those units can't snap install.
[19:23] <kwmonroe> jamesbenson: i typically deploy to clouds or localhost (lxd).  not much experience with maas.
[19:24] <jamesbenson> well this is in an openstack VM, so not with maas...
[19:24] <kwmonroe> ah, right
[19:25] <jamesbenson> I know I can deploy using openstack magnum, but want to do it manually...
[19:25] <kwmonroe> well jamesbenson, from what i can tell, apt install works and wget works, so it's not like your units are totally locked down.  i'm not sure what's causing snap install to fail.
[19:25] <jamesbenson> error: cannot install "foo": snap "core" has changes in progress
[19:26] <kwmonroe> silly rabbit, don't actually stick 'foo' in there
[19:26] <jamesbenson> :-p
[19:26] <jamesbenson> hey, didn't know if it was a test option ;-)
[19:27] <jamesbenson> ansible ping/pong test ...
[19:27] <kwmonroe> :)
[19:27] <kwmonroe> jamesbenson: what does snap changes show?
[19:27] <kwmonroe> "snap changes"
[19:27] <kwmonroe> i'm guessing it's stuck somewhere trying to download the core snap
[19:27] <jamesbenson> http://paste.openstack.org/show/626550/
[19:28] <jamesbenson> you'll like that..
[19:28] <kwmonroe> heh, classy
[19:29] <kwmonroe> jamesbenson: how about a "snap download etcd"?
[19:30] <kwmonroe> we should see the tls error.. just making sure.
[19:31] <magicaltrout> "Hello Kubernetes support desk, Kevin speaking, how may I help you today???"
[19:31] <kwmonroe> phew!  backup arrives.  magicaltrout, meet jamesbenson.  he's having trouble snap installing k8s.
[19:31] <jamesbenson> yep, tls error
[19:31] <magicaltrout> i have many k8s installations
[19:31] <magicaltrout> too many
[19:32] <kwmonroe> magicaltrout: any on openstack?
[19:32] <magicaltrout> sorta
[19:32] <magicaltrout> its manual though not openstack cloud provider
[19:33] <jamesbenson> magicaltrout: I've got a ubuntu 16 LTS, VM sitting in openstack.  Security group is completely open.  No iptables rules...
[19:34] <jamesbenson> 8 vCPU, 16GB RAM, 160 GB HD;  deployed using these commands:  http://paste.openstack.org/show/626536/
[19:37] <jamesbenson> can't seem to install though, giving me tls errors.
[19:41] <magicaltrout> okay, jamesbenson your cluster lives inside lxd on nodes on openstack?
[19:42] <jamesbenson> yes
[19:42] <jamesbenson> VM in openstack, lxd on that VM.
[19:42] <magicaltrout> hmm i've not tried that before
[19:42] <magicaltrout> if you snap install at vm level does it work?
[19:42] <jamesbenson> yeah, did that to install conjure-up
[19:42] <jamesbenson> and lxd
[19:45] <magicaltrout> hmm
[19:47] <kwmonroe> jamesbenson: it feels like something about your lxd-bridge is interfering with fetching data from the snap store, but i can't fathom a reason why it would affect snap and not apt or wget.
[19:49] <R_P_S> I am having difficulties adding subnets to spaces to ensure instances are deployed in the correct VPC/AZ
[19:49] <R_P_S> I get an error "cannot add subnet: no subnets defined" while running
[19:50] <R_P_S> juju add-subnet 1.2.3.4/5 public subnet-12345678
[19:52] <jamesbenson> kwmonroe, magicaltrout: do you have general guidelines/rules/instructions on how you set up lxd, zfs, and the network?
[19:53] <jamesbenson> ipv6 is disabled...
[19:53] <jamesbenson> but I wasn't sure about the bridge
[19:53] <magicaltrout> i've only installed k8s with conjure up on lxd once
[19:53] <magicaltrout> i just did whatever it told me
[19:53] <jamesbenson> how do you typically install it?
[19:54] <magicaltrout> i have 1 standard aws install and 3 openstack manual provider installs
[19:54] <jamesbenson> I'm doing lxd to do some dev with multiple "nodes" in an all in one...
[19:55] <jamesbenson> openstack with magnum?
[19:55] <magicaltrout> nope
[19:55] <jamesbenson> manual?
[19:55] <magicaltrout> yeah just spin up some nodes
[19:55] <magicaltrout> and deploy some stuff to them
[19:55] <jamesbenson> using which method?
[19:56] <magicaltrout> https://jujucharms.com/docs/2.2/clouds-manual
[19:56] <magicaltrout> just like a small 3 node cluster for k8s dev
[19:57] <stokachu> jamesbenson: how's your lxd bridge configured
[19:57] <jamesbenson> stokachu: any command to detail it?
[19:58] <stokachu> jamesbenson: lxc network show lxdbr0
[19:59] <stokachu> jamesbenson: easiest to do `lxc network show lxdbr0|pastebinit`
[19:59] <jamesbenson> http://paste.ubuntu.com/25976235/
[19:59] <stokachu> do you have another bridge defined?
[19:59] <jamesbenson> no idea about the pastebinit... awesome..
[20:00] <stokachu> jamesbenson: whats `lxc network list|pastebinit` show
[20:00] <jamesbenson> http://paste.ubuntu.com/25976248/
[20:01] <jamesbenson> http://paste.ubuntu.com/25976252/
[20:01] <stokachu> jamesbenson: yea youve got no bridge defined for lxd to use
[20:02] <jamesbenson> okay...
[20:02] <stokachu> jamesbenson: how'd you create lxdbr0 before?
[20:02] <jamesbenson> sudo lxd init --auto
[20:03] <kwmonroe> jamesbenson: fwiw, we have a generic lxd guide:  https://jujucharms.com/docs/stable/tut-lxd.  might be worth following that and bootstrap on a new node, then "juju deploy ubuntu", then "juju ssh ubuntu/0" and see if a "sudo snap install core" works.
[20:03] <stokachu> kwmonroe: it's his bridge
[20:03] <stokachu> it isn't configured
[20:03] <jamesbenson> BRB
[20:04] <jamesbenson> could you send me a few commands?
[20:04] <stokachu> jamesbenson: `lxc profile show default|pastebinit`
[20:05] <R_P_S> bdx / rick_h: any ideas why add-subnet is not working and complaining about subnets not being defined?
[20:05] <kwmonroe> stokachu: if the bridge was borked, how did he get this far with kubeapi-load-balancer going active: https://snag.gy/nT0LPv.jpg
[20:06] <kwmonroe> (and easyrsa)
[20:06] <stokachu> well for one, his lxd bridge is inet addr:10.55.234.1
[20:06] <stokachu> and those ip's are different
[20:06] <jamesbenson> http://paste.ubuntu.com/25976291/
[20:07] <stokachu> kwmonroe: also that's not output from conjure-up
[20:07] <stokachu> so i don't know what he did there
[20:08] <stokachu> jamesbenson: basically your lxd network bridge is acting up
[20:08] <stokachu> jamesbenson: i recommend tearing down that setup
[20:09] <stokachu> jamesbenson: juju kill-controller localhost-localhost
[20:09] <jamesbenson> okay
[20:09] <jamesbenson> and how do I bring it back up?
[20:09] <stokachu> then delete that lxdbr0 bridge
[20:09] <stokachu> one sec
[20:09] <jamesbenson> ok
[20:09] <jamesbenson> thanks 🙏
[20:09] <stokachu> jamesbenson: then do `sudo brctl delbr lxdbr0`
[20:11] <stokachu> jamesbenson: let me know when you've done that, and give output of `ip addr|pastebinit`
[21:01] <R_P_S> so I just discovered that by creating a new model, the subnets aren't populated...
[21:02] <R_P_S> the subnet info is available in the controller and default models, but I want to build a model for each environment
[21:02] <R_P_S> how do I populate... or copy the subnet info from one model to the other?
[21:07] <R_P_S> juju switch default && juju list-subnets -> full subnet output
[21:07] <R_P_S> juju switch dev-k8s && juju list-subnets -> No subnets to display
[21:20] <kwmonroe> R_P_S: does 'juju reload-spaces' while in the dev-k8s model do anything?
[21:21] <kwmonroe> R_P_S: on aws, all new models look populated with subnets for me.
[21:24] <R_P_S> reload-spaces appears to not do anything
[21:25] <R_P_S> hold on
[21:26] <R_P_S> would reload-spaces be dependent on a vpc-id being specified in the model-config?
[21:30] <jamesbenson> stokachu: I think it's easier to reset the VM and start from scratch, no?
[21:30] <stokachu> jamesbenson: probably
[21:34] <jamesbenson> stokachu: So it's rebuilt.
[21:34] <stokachu> ok so do this, `sudo apt-add-repository ppa:ubuntu-lxc/stable`
[21:34] <stokachu> `sudo apt update && sudo apt install lxd lxd-client`
[21:35] <stokachu> then `lxd init --auto` (no sudo)
[21:35] <stokachu> then `lxc network create lxdbr0 ipv4.address=auto ipv4.nat=true ipv6.address=none ipv6.nat=false`
[21:35] <stokachu> then `snap install conjure-up --classic`
[21:35] <stokachu> and run conjure-up
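(stokachu's rebuild steps collected in order, assuming a fresh Ubuntu 16.04 VM:)

    sudo apt-add-repository ppa:ubuntu-lxc/stable
    sudo apt update && sudo apt install lxd lxd-client
    lxd init --auto        # no sudo
    lxc network create lxdbr0 ipv4.address=auto ipv4.nat=true ipv6.address=none ipv6.nat=false
    sudo snap install conjure-up --classic
    conjure-up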
[21:38] <jamesbenson> snap needs sudi
[21:38] <jamesbenson> sudo
[21:39] <jamesbenson> running conjure-up
[21:40] <R_P_S> ok, turns out you can't just add a VPC to a model after the fact (got errors), as the VPC parameters need to be specified during model creation with --config
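(A sketch of what R_P_S lands on — setting the VPC when the model is created; model name and vpc id are placeholders, and reload-spaces is included on the assumption it may be needed to pick up the subnets:)

    juju add-model dev-k8s --config vpc-id=vpc-0123abcd
    juju reload-spaces
    juju list-subnets      # subnets from that VPC should now be listed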
[21:41] <jamesbenson> stokachu:  oooo something different is happening... getting a good feeling ^_^
[21:41] <stokachu> jamesbenson: \o/
[21:42] <jamesbenson> what's the watch command again?
[21:43] <jamesbenson> got it
[21:50] <R_P_S> ok, now I'm straight up running into this bug :(  https://bugs.launchpad.net/juju/+bug/1704876
[21:50] <mup> Bug #1704876: can't deploy to specific AWS subnets due to `juju add-subnet` fails <add-subnet> <aws> <conjure> <spaces> <subnet> <vpc> <juju:Triaged> <https://launchpad.net/bugs/1704876>
[21:59] <R_P_S> how do you delete a space in a model?
[21:59] <hml> spaces can’t be deleted currently.  :-(
[21:59] <R_P_S> ...
[22:00] <R_P_S> are spaces completely broken? :(  can't delete, can't add subnets to a space... can't do anything with them?  yet they're core to defining where things will be deployed?
[22:01] <hml> i never use spaces personally - there are other ways to define how things are deployed
[22:01] <hml> depends on the cloud you’ve bootstrapped
[22:01] <R_P_S> I'm following: https://insights.ubuntu.com/2017/02/08/automate-the-deployment-of-kubernetes-in-existing-aws-infrastructure/
[22:02] <R_P_S> how would I rewrite this command then to not use spaces?
[22:02] <R_P_S> juju deploy --constraints "instance-type=m3.medium spaces=private" cs:~containers/etcd-23
[22:03] <hml> ah…
[22:03] <hml> you can just make a space with a different name to use - suboptimal i know -
[22:04] <R_P_S> the only things I've done different so far are that I'm not using cloudformation (infrastructure preexisting) and creating a model
[22:04] <R_P_S> But how do I use empty spaces?
[22:04] <R_P_S> since I can't add-subnet to a space?
[22:04] <jamesbenson> stokachu : etcd/0 Missing relation to certificate authority.
[22:05] <jamesbenson> https://snag.gy/5ED6sa.jpg
[22:05] <jamesbenson> ah, my nginx just became active....
[22:08] <R_P_S> so apparently you need to define your subnets when calling add-space...
[22:09] <R_P_S> do it once, don't screw it up... and if you ever accidentally re-assign a subnet to a different space, you're screwed?
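(A sketch of defining the space and its subnets together at creation time — since add-subnet was failing here — and then using it as a constraint, as in the linked article; the CIDRs are placeholders:)

    # create the space with its subnets up front
    juju add-space private 172.31.16.0/20 172.31.32.0/20

    # then deploy against it
    juju deploy --constraints "instance-type=m3.medium spaces=private" cs:~containers/etcd-23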
[22:14] <jamesbenson> kwmonroe stokachu : thoughts?  seems like I'm having a similar issue to before.  the bridge is managed now though
[22:16] <stokachu> If there is no error in juju status then give it time
[22:17] <jamesbenson> error hook failed: "install" ?
[22:17] <jamesbenson> for all etcd
[22:17] <jamesbenson> but keeps on restarting...
[22:17] <stokachu> Is this a full VM you're running?
[22:17] <jamesbenson> full?  yes, ubuntu server cloud image...
[22:18] <kwmonroe> jamesbenson: juju debug-log --replay -i etcd/0
[22:18] <kwmonroe> let's see if it's still a problem snap installing
[22:18] <kwmonroe> jamesbenson: alternatively, juju ssh etcd/0 and try a "sudo snap install etcd"
[22:18] <jamesbenson> http://paste.ubuntu.com/25976912/
[22:19] <kwmonroe> instant regret on the --replay ;)
[22:19] <kwmonroe> please hold while your pastebin loads into ram
[22:20] <jamesbenson> :-(
[22:20] <stokachu> Looks like a timeout downloading the snaps
[22:20] <jamesbenson> yeah
[22:20] <jamesbenson> should I try the `sudo snap download etcd`
[22:20] <stokachu> sudo snap install etcd
[22:21] <jamesbenson> tried inside of etcd and got the TLS handshake timeout error
[22:21] <kwmonroe> R_P_S: while i'm waiting on jamesbenson to crash my browser, what's your end game here?  i'm really not well versed with spaces/subnets, but i'm curious what people are up to when they have strict space reqs. i know bdx does these space constraints all the time -- i never knew why.
[22:21] <stokachu> Yea, not a conjure-up or juju issue
[22:21] <stokachu> But an issue nonetheless
[22:22] <jamesbenson> kwmonroe: sorry!
[22:22] <stokachu> Are you behind proxies?
[22:22] <jamesbenson> stokachu: no
[22:22] <kwmonroe> no worries jamesbenson -- it's ammo for getting a new rig for the holidays ;)
[22:23] <stokachu> jamesbenson: may want to post on discuss.snapcraft.io
[22:23] <jamesbenson> ooo, nice :-) 👍. I'm running MBP with touchbar...
[22:24] <R_P_S> kwmonroe: simple management of VPCs and subnets.  Without this, some of the anti-patterns that juju enables are mind boggling... things like 0.0.0.0/0 SSH ACLs on every instance
[22:24] <kwmonroe> you happy with the touchbar jamesbenson?  i hear mixed reviews (kinda like battlefront 2), where by "mixed" i hear "i hate it" ;)
[22:25] <kwmonroe> jamesbenson: i see we're back to the tls handshake timeout :/
[22:25] <jamesbenson> lol... it's okay... I do need to reset it though, really random hangs and freezes as of late... circle of death for like 10-20 seconds then it frees up.
[22:25] <jamesbenson> does snapcraft have an IRC?
[22:26] <kwmonroe> oof on the death spiral
[22:26] <kwmonroe> jamesbenson: you'll get much better response from snapcraft.io, but there is a #snappy freenode channel
[22:26] <R_P_S> kwmonroe: at that point, the only way to secure them is to control your subnets with things like public and private... these are very basic security concepts when building AWS infrastructure.
[22:26] <jamesbenson> but I hack the hell out of it, so I probably fugged something somewhere...
[22:27] <jamesbenson> back to point though
[22:27] <kwmonroe> jamesbenson: i shouldn't say "much better", but i know those forums are monitored like crazy.  irc, i'm not sure.
[22:27] <jamesbenson> okay
[22:28] <kwmonroe> jamesbenson: https://forum.snapcraft.io/ is the place
[22:28] <kwmonroe> don't go there ^^ from your etcd/0 unit because you'll probably get a tls handshake error.
[22:29] <R_P_S> kwmonroe: the fact that conjure-up for a kubernetes cluster uses ec2-classic by default is, to be brutally honest, downright scary :(  ec2-classic was deprecated years ago, and should never be used again
[22:30] <R_P_S> damn, I gotta run to meetings... I'll likely be in meetings until EOD...
[22:30] <R_P_S> once again, thanks for all the help, I am making progress, but it is much slower than I'd hoped
[22:30] <kwmonroe> R_P_S: thx for the insights on your use of subnets/spaces!
[22:31] <kwmonroe> i'll catch up with you later to dive in more
[22:31] <stokachu> Lol ec2-classic?
[22:32] <kwmonroe> yeah - i dunno what ec2-classic is either, that was the diving part i alluded to ;)
[22:33] <stokachu> R_P_S: feel free to elaborate on ec2-classic
[22:36] <jamesbenson> kwmonroe: ha, thanks.  Not sure exactly what to post to them..  I suppose just snap is failing in openstack VM with that massive pastebin from earlier..
[22:37] <kwmonroe> jamesbenson: like you said, back to the point, if you can't "sudo snap install etcd" from the deployed unit, juju won't be able to either.  so step 1 is to figure out why that's failing.  you're probably going to hear 1000 people ask "what are your proxy settings", don't be mad.  whatever's going on is probably a mix of openstack / lxd / snap.
[22:38] <kwmonroe> jamesbenson: i would just create a new topic that says "snap install fails in a lxd container on an openstack VM"
[22:41] <kwmonroe> jamesbenson: and the pastebin is good -- but since it has so much juju noise in it, i'd paste the failure that you see from "sudo snap install core" on the etcd unit.
[22:47] <tvansteenburgh> stokachu: kwmonroe: ec2-classic == ec2 in the days before vpcs were introduced. if you have a sufficiently old aws account, the machines juju provisions will not be in a vpc, which can break things in unexpected ways. the way to get around this is to tell juju which vpc to use. you can do that using bootstrap or model config.
[22:55] <jamesbenson> kwmonroe : Posted... lets see what happens.
[22:55] <kwmonroe> i predict nothing but good things jamesbenson ;)
[23:15] <R_P_S> ec2-classic is amazon ec2 before VPCs existed... IIRC, it's not even accessible on accounts that were created in the past few years
[23:16] <R_P_S> ec2-classic was the equivalent of one giant public VPC that contained every amazon customer all in one giant internal subnet (split per region)
[23:17] <R_P_S> ec2-classic didn't have as many features as "ec2-vpc" either.  examples include: an instance's SG couldn't be changed, and only one SG could be attached to an instance.
[23:18] <R_P_S> ec2-classic SGs don't have egress ACLs.  They simply don't exist (non-configurable ALL ALL 0.0.0.0/0 egress)
[23:19] <R_P_S> ec2-classic had no VPCs, so you didn't make subnets either... I can't remember everything though, it's been years since I've done any significant amount of work in ec2-classic.
[23:19] <R_P_S> ec2-classic is inherently insecure compared to ec2-vpc
[23:21] <tvansteenburgh> R_P_S: you can make conjure-up use a vpc, but you need to bootstrap the juju controller or create the model before using conjure-up, then tell conjure-up to use that controller or model
[23:21] <tvansteenburgh> for example: juju bootstrap --config "vpc-id=vpc-xxxxxxxx"
[23:22] <tvansteenburgh> or: juju add-model --config "vpc-id=vpc-xxxxxxxx"
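(tvansteenburgh's two options as a rough sequence; the vpc id and model name are placeholders:)

    # either bootstrap a controller that targets the VPC...
    juju bootstrap aws/us-east-1 --config "vpc-id=vpc-0123abcd"
    # ...or add a model with the VPC config on an existing controller
    juju add-model k8s --config "vpc-id=vpc-0123abcd"
    # ...then run conjure-up and select that controller/model when prompted
    conjure-up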
[23:23] <R_P_S> at this point, I have my ha controllers in the VPC i want... kwmonroe was curious about ec2-classic though
[23:26] <R_P_S> and the fact that the AWS account I'm using appears to be old enough to still support ec2-classic meant that a basic/barebones conjure-up created a kubernetes cluster in ec2-classic instead of inside a VPC