R_P_S | hey rick_h, just wanted to thank you for the help so far. | 00:08 |
---|---|---|
bdx | is there support for artful series on aws? | 00:15 |
bdx | Im seeing juju has grabbed a machine http://paste.ubuntu.com/25970976/ | 00:16 |
bdx | but just doesn't want to start | 00:16 |
bdx | been sitting there for a while, making me think artful hasn't made the cut just yet | 00:17 |
bdx | by a "while" I mean like 20 mins | 00:18 |
=== frankban|afk is now known as frankban | ||
ChaoticMind | is the jaas controller being super slow right now or is it just me? | 09:49 |
mhilton | ChaoticMind, what cloud/region are you seeing problems in? | 09:50 |
ChaoticMind | aws/eu-central-1 | 09:50 |
mhilton | ChaoticMind, thanks I'll try it out and see if I see the same. | 09:52 |
ChaoticMind | thanks | 09:52 |
mhilton | ChaoticMind, is there any particular command that's taking its time for you? | 09:57 |
ChaoticMind | mhilton: just deploying the bundle took forever (like 3 minutes for a smallish bundle). Setting relationships took like 15 seconds each | 09:58 |
ChaoticMind | Usually it's about 0.5 seconds | 09:58 |
ChaoticMind | I made a new model and tried it again now, it seems ok now! | 09:59 |
mhilton | ChaoticMind: I'll look into it, one of the controllers may be overloaded. Thanks for mentioning it. | 10:00 |
ChaoticMind | no worries | 10:01 |
=== salmankhan1 is now known as salmankhan | ||
=== zeus is now known as Guest84717 | ||
=== Guest84717 is now known as zeus | ||
bdx | yoyoyo - whats the deal with artful deploys? Will someone `juju add-machine --series artful` on aws and let me know if I'm crazy | 16:12 |
bdx | oooh shoot, looks like adding a machine of artful worked actually | 16:14 |
bdx | nm | 16:14 |
bdx | jeeze | 16:14 |
bdx | ahh, `juju deploy ubuntu --series artful --force` is what is failing | 16:14 |
kwmonroe | bdx: how about juju add-machine --series artful; juju deploy ubuntu --to X --force? | 16:16 |
kwmonroe | just a shot in the dark | 16:16 |
bdx | kwmonroe: no, great shot, I actually just did that and it worked | 16:16 |
kwmonroe | sweet | 16:16 |
bdx | and it looks like what was failing me last evening is now working too, with the `juju deploy ubuntu --series artful --force` | 16:17 |
bdx | gd | 16:17 |
bdx | I was experiencing some extreme jitter yesterday on JAAS I think | 16:18 |
bdx | I was trying to get an artful deploy going for quite a while and it was just failing at machine "pending" | 16:18 |
bdx | really strange | 16:18 |
kwmonroe | ha! yeah, "juju deploy ubuntu --series artful --force" just worked for me too on aws | 16:18 |
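A sketch of the two paths tried above, for reference. This assumes a bootstrapped AWS controller; the machine number `5` is a placeholder for whatever `juju add-machine` actually reports:

```shell
# Workaround: pre-create the artful machine, then target it explicitly.
juju add-machine --series artful          # note the machine number it prints
juju deploy ubuntu --to 5 --force         # 5 is a placeholder machine number

# The direct form that was failing intermittently in this session:
juju deploy ubuntu --series artful --force
```

Both forms ended up working here once the controller stopped misbehaving, so the split add-machine/deploy variant is mainly useful as a way to isolate whether provisioning or charm deployment is the part that hangs.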
R_P_S | hey, so as part of my ongoing evaluation of juju, I've just created an HA controller. But how do I specify what subnets to create the ha-instances into? | 16:34 |
bdx | jam:^^ | 16:35 |
bdx | R_P_S: that is possible with the `--to` directive, it's just not documented yet | 16:36 |
bdx | I think it's something like `--to subnet=subnet-<id>` | 16:37 |
R_P_S | so $ juju enable-ha --to subnet=subnet-priv1b --to subnet=subnet-priv1c ? | 16:37 |
bdx | R_P_S: let me see if I can get it, omp | 16:38 |
bdx | R_P_S: `juju bootstrap aws/us-west-2 --to subnet=subnet-<id> --credential mycred` | 16:40 |
bdx | ^^ worked | 16:40 |
bdx | I'll see about the HA omp | 16:40 |
R_P_S | yeah, that worked for the first instance... | 16:40 |
bdx | instances are launching faster than I've ever experienced | 16:40 |
R_P_S | but after creating the initial controller, enabling HA put them in random subnets as far as I can tell | 16:41 |
bdx | I already have a bootstrapped controller | 16:41 |
bdx | crazy | 16:41 |
R_P_S | including mixing public and private subnets for controllers 1 and 2 | 16:41 |
bdx | R_P_S: `juju enable-ha --to subnet=subnet-<id>,subnet=subnet-<id>` | 16:43 |
bdx | worked seamlessly | 16:43 |
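Collected from the exchange above, the subnet-placement commands that worked, as a sketch. The subnet IDs and `mycred` are placeholders for real AWS subnet IDs and a stored credential name:

```shell
# Bootstrap the initial controller into a specific subnet:
juju bootstrap aws/us-west-2 --to subnet=subnet-aaaaaaaa --credential mycred

# Grow to an HA controller, pinning the new machines to chosen subnets:
juju enable-ha --to subnet=subnet-bbbbbbbb,subnet=subnet-cccccccc
```

Without the `--to subnet=...` directives, enable-ha picks subnets on its own, which is how R_P_S ended up with controllers spread across public and private subnets.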
bdx | R_P_S: would you mind putting some heat on this please https://github.com/juju/docs/issues/2122 | 16:44 |
R_P_S | I don't have a github account and I'm at work... I'll need to do that later... as an aside, that ticket doesn't appear to be about enable-ha | 16:47 |
R_P_S | so now I'm not sure how to remove the extra HA controllers | 16:52 |
R_P_S | https://jujucharms.com/docs/2.2/controllers-ha | 16:52 |
R_P_S | juju status doesn't have any mention of "has-vote" for the controller model... and "remove-machine" just fails with a message that "machine 2 is required by the model" | 16:53 |
rick_h | R_P_S: so juju show-controller should mention HA status bits and has-vote I believe | 16:55 |
jamesbenson | hi all, I'm trying to do a simple LXD conjure-up k8s with help, all-in-one. But it fails from the get-go. | 16:55 |
rick_h | R_P_S: you can always remove-machine --force but yea, best to know what's up there. | 16:55 |
stokachu | jamesbenson: whats the issue? | 16:56 |
rick_h | jamesbenson: bummer, what's the issue? I'm sure folks can get you good to go here. | 16:56 |
jamesbenson | thanks stokachu and rick_h | 16:56 |
jamesbenson | I hope so | 16:56 |
jamesbenson | So there seem to be a few issues. Sidenote: I'm doing this from a ubuntu server VM in openstack, xlarge. Pretty sure that doesn't matter, but just in case | 16:57 |
stokachu | jamesbenson: whats the hw specs? | 16:57 |
stokachu | ram, cpus | 16:57 |
jamesbenson | 8 vCPU, 16GB RAM, 160 GB HD | 16:58 |
stokachu | ok should be fine | 16:58 |
jamesbenson | ubuntu 16.04 LTS | 16:59 |
jamesbenson | These are the commands I do from deploy: http://paste.openstack.org/show/626536/ | 16:59 |
jamesbenson | https://snag.gy/nT0LPv.jpg | 17:00 |
jamesbenson | that's the latest state... | 17:00 |
jamesbenson | actually I've tried twice... here's the other: https://snag.gy/Z9cs2x.jpg | 17:01 |
jamesbenson | thoughts stokachu? | 17:02 |
R_P_S | so I'm trying to rollover controllers by simply terminating "bad" ones in aws directly and re-running enable-ha | 17:15 |
R_P_S | but I'm still unable to remove machines that don't show up with ha-status enabled in "show-controller" | 17:16 |
R_P_S | I upped enable-ha to 7 to test the subnets... it looks like I need to specify the subnets with each enable-ha command :\ | 17:17 |
R_P_S | but show-controller lists machines 0,3,4,5,6,7,8 (1,2 were "demoted" according to enable-ha output)... but I still get "machine 1 is required by the model" | 17:18 |
R_P_S | and one thing I've found is that using --force for remove-machine leaves an orphaned security group | 17:19 |
kwmonroe | jamesbenson: just a guess, but can those units get to the outside world? i know etcd and k8s charms snap install stuff, so i wonder if they're having trouble getting out. can you pastebin a "juju debug-log --replay -i etcd/0"? | 17:22 |
R_P_S | juju remove-machine 1 --force | 17:27 |
R_P_S | fails | 17:27 |
R_P_S | this bug was opened almost a year ago https://bugs.launchpad.net/juju/+bug/1658033 | 17:31 |
mup | Bug #1658033: Juju HA - Unable to remove controller machines in 'down' state <4010> <cpe-onsite> <juju:Triaged> <https://launchpad.net/bugs/1658033> | 17:31 |
=== frankban is now known as frankban|afk | ||
bdx | R_P_S: downsizing the controller cluster isn't supported | 17:39 |
bdx | you have to dump the db and restore to a smaller cluster (I think) | 17:40 |
R_P_S | correct, once an -n N has been specified, it can't be shrunk | 17:40 |
R_P_S | but I'm trying to simulate failure | 17:40 |
R_P_S | so I terminated one instance and reran enable-ha to rebuild new ones | 17:40 |
R_P_S | but the terminated ones are still in the list, unable to be removed | 17:40 |
R_P_S | I'm up to 13 "machines" in the config, with 5/7 ha (currently rebuilding) | 17:41 |
R_P_S | <controllername>* controller admin superuser aws/us-east-1 2 13 5/7 2.2.6 | 17:41 |
bdx | kwmonroe: I'm hitting it again, I just deployed these and they stood up just fine, tore it down and redeployed and its the artful instances that have been in pending for > 20mins now -> http://paste.ubuntu.com/25975589/ | 17:41 |
bdx | kwmonroe: create a new model on the same controller, then deployed the same charm http://paste.ubuntu.com/25975644/ | 17:52 |
bdx | see what I'm saying about the inconsistencies ? | 17:52 |
kwmonroe | bdx: i'm in us-east-1, and just verified "juju deploy ubuntu --series artful --force" worked again. hard to say what's up with it being intermittent. do a "juju ssh -m controller 0" and sudo grep around /var/log/juju for 'machine-X' to see if there's a provisioner issue. | 17:52 |
bdx | right | 17:52 |
kwmonroe | yeah bdx, frustrating for sure.. i'm hoping there's something in the controller log that will be more insightful about an artful provisioning issue. | 17:53 |
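The controller-log check suggested above can also be run non-interactively by passing the command to `juju ssh`. `machine-7` is a placeholder for the stuck machine's ID from `juju status`:

```shell
# Grep the controller's logs for provisioner errors about a specific machine.
# Assumes admin access to the "controller" model; machine-7 is a placeholder.
juju ssh -m controller 0 "sudo grep -r 'machine-7' /var/log/juju/"
```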
bdx | kwmonroe: http://paste.ubuntu.com/25975668/ - oh man | 17:55 |
kwmonroe | bdx: i haven't seen "failed to start instance (failed to start instance in provided availability zone)" before, and no sign of it in my controllers. however, i'm on 2.3-beta1 so that could be new logging in beta3. | 18:00 |
kwmonroe | bdx: what kind of constraints do you have for redis-cache? any wicked machine reqs there? | 18:01 |
bdx | root-disk, spaces | 18:01 |
bdx | testing w/o any constraints | 18:02 |
kwmonroe | bdx: i was hoping you had "instance-type=p3.xxlarge" and i could just say "us-west is simply out of those instance types", but that doesn't seem like the case. | 18:03 |
bdx | kwmonroe: it was the constraints | 18:05 |
bdx | I removed them, and voila | 18:05 |
kwmonroe | must have been spaces, right? surely not root-disk | 18:05 |
bdx | I wonder if I'm hitting disk cap on aws | 18:05 |
bdx | testing that right now | 18:05 |
kwmonroe | yeah, make sure you're asking for GB and not PB ;) | 18:06 |
kwmonroe | otherwise RIP your wallet | 18:06 |
bdx | ha, yeah, "G" | 18:09 |
bdx | so I just logged into aws console and created 10 x 100G ebs volumes | 18:09 |
bdx | no issues | 18:09 |
bdx | kwmonroe: https://bugs.launchpad.net/juju/+bug/1706462 | 18:11 |
mup | Bug #1706462: juju tries to acquire machines in specific zones even when no zone placement directive is specified <cdo-qa> <foundations-engine> <juju:Triaged by ecjones> <MAAS:Invalid> <https://launchpad.net/bugs/1706462> | 18:11 |
bdx | see my comment at the bottom | 18:12 |
bdx | kwmonroe: I'm about to suggest something crazy | 18:19 |
bdx | http://paste.ubuntu.com/25975792/ | 18:19 |
bdx | taking ^ into consideration | 18:19 |
bdx | redis-space and ubuntu-space are both deployed with only a "spaces" constraint | 18:20 |
bdx | the ubuntu-space didn't have a --series constraint | 18:20 |
bdx | or bah | 18:20 |
bdx | --series argument | 18:20 |
bdx | the only failures I'm seeing here are when '--series' is specified alongside a spaces constraint | 18:21 |
bdx | because we see from ^ that redis-disk worked, it had '--series artful' and '--constraints "root-disk=100G"' | 18:21 |
bdx | and ubuntu-space worked | 18:21 |
bdx | which had no '--series' arg, but had the spaces constraint | 18:22 |
bdx | but the only thing failing consistently | 18:22 |
bdx | are things deployed to a space that has the '--series' arg | 18:23 |
bdx | I'll prove it by specifying '--series' with another series other than artful | 18:23 |
bdx | how about zesty | 18:23 |
bdx | since we see from ^ that zesty worked w/o a spaces constraint | 18:24 |
kwmonroe | bdx: i don't know enough about juju's zone handling, but what happened here with graylog / #38? http://paste.ubuntu.com/25968550/ | 18:25 |
kwmonroe | did graylog have constraints? | 18:25 |
bdx | no | 18:26 |
bdx | well yes | 18:26 |
bdx | but that isn't happening because of that | 18:26 |
bdx | that happens with every single instance deployed with 2.3beta3 | 18:27 |
bdx | it eventually gets past the "failed to start instance (failed to start instance in provided availability zone) " and finds one and eventually starts | 18:27 |
bdx | I was just posting that to show that its not only maas thats having that issue | 18:27 |
bdx | ok, well I think this verifies my theory http://paste.ubuntu.com/25975824/ | 18:28 |
bdx | I just deployed the ubuntu-zesty-space | 18:28 |
bdx | it required the spaces constraint and --series | 18:28 |
bdx | and it failed similar to the artful | 18:29 |
bdx | just stuck pending | 18:29 |
bdx | #@(*$U(#@*$@#*& | 18:29 |
bdx | idk | 18:30 |
bdx | I may as well go back to sleep | 18:30 |
bdx | somehow I knew today would be a trying day | 18:30 |
kwmonroe | :) | 18:37 |
kwmonroe | bdx: i would note in bug 1706462 that spaces + series repro this easily on aws | 18:39 |
mup | Bug #1706462: juju tries to acquire machines in specific zones even when no zone placement directive is specified <cdo-qa> <foundations-engine> <juju:Triaged by ecjones> <MAAS:Invalid> <https://launchpad.net/bugs/1706462> | 18:39 |
bdx | kwmonroe: series + spaces only with artful | 18:40 |
bdx | AND @kwmonroe | 18:40 |
bdx | ^ bug is entirely different than what I'm seeing I think | 18:40 |
jamesbenson | kwmonroe: sorry for the delay, turkey-luncheon thingy.... that command is giving me a TLS handshake timeout.. | 18:40 |
jamesbenson | kwmonroe: The instance can ping google... | 18:41 |
bdx | kwmonroe: this verifies that it is only happening with artful http://paste.ubuntu.com/25975887/ | 18:41 |
bdx | kwmonroe: what I'm seeing is the instances stay in pending for only series + space + artful | 18:42 |
bdx | 1706462 - failed to start instance (failed to start instance in provided availability zone) within attempt 0, retrying in 10s with new availability zone | 18:42 |
kwmonroe | wait bdx, your previous paste shows machine 8 waiting for machine with series zesty: http://paste.ubuntu.com/25975824/ | 18:43 |
bdx | but then on *a* next attempt, juju will find an instance, and start it, and go on its way | 18:43 |
bdx | kwmonroe: ah, my bad, yea, that machine started | 18:43 |
bdx | which made me realize, in all cases, its only artful that is the commonality here | 18:43 |
bdx | when used with spaces + series | 18:44 |
bdx | try it | 18:44 |
bdx | oooh, it may be only beta3, let me try this on jaas | 18:44 |
kwmonroe | jamesbenson: how about just "juju debug-log"? does that give you a tls timeout too? | 18:48 |
bdx | works great on jaas http://paste.ubuntu.com/25975933/ | 18:48 |
bdx | kwmonroe: the juju agent never starts, so I don't get any log from those instances | 18:48 |
jamesbenson | yes | 18:48 |
jamesbenson | kwmonroe: ^ | 18:48 |
bdx | oooh jamesbenson | 18:49 |
bdx | my b | 18:49 |
bdx | lol | 18:49 |
kwmonroe | :) | 18:49 |
jamesbenson | http://paste.openstack.org/show/626542/ | 18:49 |
kwmonroe | jamesbenson: ooooohhh, i thought you meant the debug-log command wasn't showing any output. | 18:49 |
jamesbenson | kwmonroe: No, seems to have issues with net/http: TLS handshake timeout... | 18:50 |
jamesbenson | I'm not sure why that is, though, since easyrsa is able to get active ... | 18:51 |
jamesbenson | so it must be able to reach out, correct? | 18:51 |
kwmonroe | jamesbenson: easyrsa doesn't snap install anything | 18:52 |
kwmonroe | etcd and k8s charms do | 18:52 |
jamesbenson | oh man.. | 18:52 |
jamesbenson | so something with the bridge then | 18:52 |
kwmonroe | so jamesbenson, i'll bet you all the money in my pockets that if you do a "juju run --unit easyrsa/0 'sudo snap install etcd'", it'll fail | 18:52 |
jamesbenson | kwmonroe: seems to be just sitting there... | 18:53 |
jamesbenson | yep, same error | 18:54 |
jamesbenson | juju run --unit easyrsa/0 'sudo snap install etcd' | 18:54 |
jamesbenson | error: cannot perform the following tasks: | 18:54 |
jamesbenson | - Download snap "core" (3440) from channel "stable" (Get https://068ed04f23.site.internapcdn.net/download-snap/99T7MUlRhtI3U0QFgl5mXXESAiSwt776_3440.snap?t=2017-11-16T20:00:00Z&h=30ced1b835617d49d8ff4221a62d789f7ca638aa: net/http: TLS handshake timeout) | 18:54 |
jamesbenson | sorry about the paste there... | 18:54 |
kwmonroe | jamesbenson: to test the tls/http connectivity more generically, do this.. juju ssh etcd/0, then wget https://google.com from the etcd unit. | 18:54 |
kwmonroe | (make sure it's https) | 18:55 |
bdx | ok, here it is https://bugs.launchpad.net/juju/+bug/1732764 | 18:55 |
mup | Bug #1732764: series + spaces + artful + juju2.3beta3 = fail <juju:New> <https://launchpad.net/bugs/1732764> | 18:55 |
jamesbenson | kwmonroe: works. | 18:55 |
kwmonroe | interesting | 18:55 |
jamesbenson | http://paste.openstack.org/show/626545/ | 18:56 |
kwmonroe | jamesbenson: how about a "sudo snap install etcd" from that same etcd unit? | 18:56 |
jamesbenson | kwmonroe : nope...http://paste.openstack.org/show/626546/ | 18:58 |
jamesbenson | interesting... | 18:59 |
ryebot | jamesbenson: if this is an egress-restricted environment and you're unable to hit the snap store, I can provide you with some steps for installing them manually. | 18:59 |
jamesbenson | all ports are open.... | 18:59 |
jamesbenson | I'll double check though.. | 18:59 |
kwmonroe | bdx: nice detail in 1732764. interesting that's it's such a specific combo. also, you may want to s/"spaces=myspace"/"spaces=facebook" in case a more recent social media platform helps. | 19:01 |
jamesbenson | ryebot kwmonroe: iptables are empty, and security group is open all ports in and out. | 19:01 |
jamesbenson | http://paste.openstack.org/show/626547/ | 19:01 |
jamesbenson | This is the only rule in my iptables -t nat: MASQUERADE all -- 10.55.234.0/24 !10.55.234.0/24 /* managed by lxd-bridge */ | 19:03 |
kwmonroe | jamesbenson: stick some quotes around that url... wget 'https://068ed04f23.site.internapcdn.net/download-snap/99T7MUlRhtI3U0QFgl5mXXESAiSwt776_3440.snap?t=2017-11-16T20:00:00Z&h=30ced1b835617d49d8ff4221a62d789f7ca638aa' | 19:04 |
jamesbenson | hmm... still shows connected. But once connected it sits. | 19:04 |
kwmonroe | jamesbenson: how about running "env | grep -i proxy" on that unit. anything in there? | 19:05 |
jamesbenson | NO_PROXY=10.55.234.245,127.0.0.1,::1,localhost | 19:07 |
jamesbenson | no_proxy=10.55.234.245,127.0.0.1,::1,localhost | 19:07 |
kwmonroe | hmph jamesbenson, that seems legit | 19:09 |
jamesbenson | ....so confused.... not a good sign that everything seems legit from you too... | 19:10 |
jamesbenson | kwmonroe: do you deploy on baremetal or in VM's? Do you have any script or anything? | 19:11 |
kwmonroe | jamesbenson: by "legit", i meant the no_proxy stuff looks legit :) if you can't do a "sudo snap install foo" from the unit, juju won't be able to either. | 19:22 |
kwmonroe | jamesbenson: there's a gremlin in there to be sure. just need to figure out why those units can't snap install. | 19:22 |
kwmonroe | jamesbenson: i typically deploy to clouds or localhost (lxd). not much experience with maas. | 19:23 |
jamesbenson | well this is in an openstack VM, so not with maas... | 19:24 |
kwmonroe | ah, right | 19:24 |
jamesbenson | I know I can deploy using openstack magnum, but want to do it manually... | 19:25 |
kwmonroe | well jamesbenson, from what i can tell, apt install works and wget works, so it's not like your units are totally locked down. i'm not sure what's causing snap install to fail. | 19:25 |
jamesbenson | error: cannot install "foo": snap "core" has changes in progress | 19:25 |
kwmonroe | silly rabbit, don't actually stick 'foo' in there | 19:26 |
jamesbenson | :-p | 19:26 |
jamesbenson | hey, didn't know if it was a test option ;-) | 19:26 |
jamesbenson | ansible ping/pong test ... | 19:27 |
kwmonroe | :) | 19:27 |
kwmonroe | jamesbenson: what does snap changes show? | 19:27 |
kwmonroe | "snap changes" | 19:27 |
kwmonroe | i'm guessing it's stuck somewhere trying to download the core snap | 19:27 |
jamesbenson | http://paste.openstack.org/show/626550/ | 19:27 |
jamesbenson | you'll like that.. | 19:28 |
kwmonroe | heh, classy | 19:28 |
kwmonroe | jamesbenson: how about a "snap download etcd"? | 19:29 |
kwmonroe | we should see the tls error.. just making sure. | 19:30 |
magicaltrout | "Hello Kubernetes support desk, Kevin speaking, how may I help you today???" | 19:31 |
kwmonroe | phew! backup arrives. magicaltrout, meet jamesbenson. he's having trouble snap installing k8s. | 19:31 |
jamesbenson | yep, tls error | 19:31 |
magicaltrout | i have many k8s installations | 19:31 |
magicaltrout | too many | 19:31 |
kwmonroe | magicaltrout: any on openstack? | 19:32 |
magicaltrout | sorta | 19:32 |
magicaltrout | it's manual though, not the openstack cloud provider | 19:32 |
jamesbenson | magicaltrout: I've got a ubuntu 16 LTS, VM sitting in openstack. Security group is completely open. No iptables rules... | 19:33 |
jamesbenson | 8 vCPU, 16GB RAM, 160 GB HD; deployed using these commands: http://paste.openstack.org/show/626536/ | 19:34 |
jamesbenson | can't seem to install though, giving me tls errors. | 19:37 |
magicaltrout | okay, jamesbenson your cluster lives inside lxd on nodes on openstack? | 19:41 |
jamesbenson | yes | 19:42 |
jamesbenson | VM in openstack, lxd on that VM. | 19:42 |
magicaltrout | hmm i've not tried that before | 19:42 |
magicaltrout | if you snap install at vm level does it work? | 19:42 |
jamesbenson | yeah, did that to install conjure-up | 19:42 |
jamesbenson | and lxd | 19:42 |
magicaltrout | hmm | 19:45 |
kwmonroe | jamesbenson: it feels like something about your lxd-bridge is interfering with fetching data from the snap store, but i can't fathom a reason why it would affect snap and not apt or wget. | 19:47 |
R_P_S | I am having difficulties adding subnets to spaces to ensure instances are deployed in the correct VPC/AZ | 19:49 |
R_P_S | I get an error "cannot add subnet: no subnets defined" while running | 19:49 |
R_P_S | juju add-subnet 1.2.3.4/5 public subnet-12345678 | 19:50 |
jamesbenson | kwmonroe, magicaltrout: do you have general guidelines/rules/instructions on how you set up lxd, zfs, and the network? | 19:52 |
jamesbenson | ipv6 is disabled... | 19:53 |
jamesbenson | but I wasn't sure about the bridge | 19:53 |
magicaltrout | i've only installed k8s with conjure up on lxd once | 19:53 |
magicaltrout | i just did whatever it told me | 19:53 |
jamesbenson | how do you typically install it? | 19:53 |
magicaltrout | i have 1 standard aws install and 3 openstack manual provider installs | 19:54 |
jamesbenson | I'm doing lxd to do some dev with multiple "nodes" in an all in one... | 19:54 |
jamesbenson | openstack with magnum? | 19:55 |
magicaltrout | nope | 19:55 |
jamesbenson | manual? | 19:55 |
magicaltrout | yeah just spin up some nodes | 19:55 |
magicaltrout | and deploy some stuff to them | 19:55 |
jamesbenson | using which method? | 19:55 |
magicaltrout | https://jujucharms.com/docs/2.2/clouds-manual | 19:56 |
magicaltrout | just like a small 3 node cluster for k8s dev | 19:56 |
stokachu | jamesbenson: how's your lxd bridge configured | 19:57 |
jamesbenson | stokachu: any command to detail it? | 19:57 |
stokachu | jamesbenson: lxc network show lxdbr0 | 19:58 |
stokachu | jamesbenson: easiest to do `lxc network show lxdbr0|pastebinit` | 19:59 |
jamesbenson | http://paste.ubuntu.com/25976235/ | 19:59 |
stokachu | do you have another bridge defined? | 19:59 |
jamesbenson | no idea about the pastebinit... awesome.. | 19:59 |
stokachu | jamesbenson: whats `lxc network list|pastebinit` show | 20:00 |
jamesbenson | http://paste.ubuntu.com/25976248/ | 20:00 |
jamesbenson | http://paste.ubuntu.com/25976252/ | 20:01 |
stokachu | jamesbenson: yea youve got no bridge defined for lxd to use | 20:01 |
jamesbenson | okay... | 20:02 |
stokachu | jamesbenson: how'd you create lxdbr0 before? | 20:02 |
jamesbenson | sudo lxd init --auto | 20:02 |
kwmonroe | jamesbenson: fwiw, we have a generic lxd guide: https://jujucharms.com/docs/stable/tut-lxd. might be worth following that and bootstrap on a new node, then "juju deploy ubuntu", then "juju ssh ubuntu/0" and see if a "sudo snap install core" works. | 20:03 |
stokachu | kwmonroe: it's his bridge | 20:03 |
stokachu | it isn't configured | 20:03 |
jamesbenson | BRB | 20:03 |
jamesbenson | could you send me a few commands? | 20:04 |
stokachu | jamesbenson: `lxc profile show default|pastebinit` | 20:04 |
R_P_S | bdx / rick_h: any ideas why add-subnet is not working and complaining about subnets not being defined? | 20:05 |
kwmonroe | stokachu: if the bridge was borked, how did he get this far with kubeapi-load-balancer going active: https://snag.gy/nT0LPv.jpg | 20:05 |
kwmonroe | (and easyrsa) | 20:06 |
stokachu | well for one, his lxd bridge is inet addr:10.55.234.1 | 20:06 |
stokachu | and those ip's are different | 20:06 |
jamesbenson | http://paste.ubuntu.com/25976291/ | 20:06 |
stokachu | kwmonroe: also thats not output from conjure-up | 20:07 |
stokachu | so i dont know what he did there | 20:07 |
stokachu | jamesbenson: basically your lxd network bridge is acting up | 20:08 |
stokachu | jamesbenson: i recommend tearing down that setup | 20:08 |
stokachu | jamesbenson: juju kill-controller localhost-localhost | 20:09 |
jamesbenson | okay | 20:09 |
jamesbenson | and how do I bring it back up? | 20:09 |
stokachu | then delete that lxdbr0 bridge | 20:09 |
stokachu | one sec | 20:09 |
jamesbenson | ok | 20:09 |
jamesbenson | thanks 🙏 | 20:09 |
stokachu | jamesbenson: then do `sudo brctl delbr lxdbr0` | 20:09 |
stokachu | jamesbenson: let me know when you've done that, and give output of `ip addr|pastebinit` | 20:11 |
R_P_S | so I just discovered that by creating a new model, the subnets aren't populated... | 21:01 |
R_P_S | the subnet info is available in the controller and default models, but I want to build a model for each environment | 21:02 |
R_P_S | how do I populate... or copy the subnet info from one model to the other? | 21:02 |
R_P_S | juju switch default && juju list-subnets -> full subnet output | 21:07 |
R_P_S | juju switch dev-k8s && juju list-subnets -> No subnets to display | 21:07 |
kwmonroe | R_P_S: does 'juju reload-spaces' while in the dev-k8s model do anything? | 21:20 |
kwmonroe | R_P_S: on aws, all new models look populated with subnets for me. | 21:21 |
R_P_S | reload-spaces appears to not do anything | 21:24 |
R_P_S | hold on | 21:25 |
R_P_S | would reload-spaces be dependent on a vpc-id being specified in the model-config? | 21:26 |
jamesbenson | stokachu: I think it's easier to reset the VM and start from scratch, no? | 21:30 |
stokachu | jamesbenson: probably | 21:30 |
jamesbenson | stokachu: So it's rebuilt. | 21:34 |
stokachu | ok so do this, `sudo apt-add-repository ppa:ubuntu-lxc/stable` | 21:34 |
stokachu | `sudo apt update && sudo apt install lxd lxd-client` | 21:34 |
stokachu | then `lxd init --auto` (no sudo) | 21:35 |
stokachu | then `lxc network create lxdbr0 ipv4.address=auto ipv4.nat=true ipv6.address=none ipv6.nat=false` | 21:35 |
stokachu | then `snap install conjure-up --classic` | 21:35 |
stokachu | and run conjure-up | 21:35 |
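The rebuild sequence above, collected in order for reference. This is a sketch assuming a freshly reset Ubuntu 16.04 VM; note from the follow-up that the snap install needs sudo:

```shell
# Install LXD from the stable PPA:
sudo apt-add-repository ppa:ubuntu-lxc/stable
sudo apt update && sudo apt install lxd lxd-client

# Initialize LXD (no sudo), then create a properly configured NAT bridge:
lxd init --auto
lxc network create lxdbr0 ipv4.address=auto ipv4.nat=true \
    ipv6.address=none ipv6.nat=false

# Install and run conjure-up:
sudo snap install conjure-up --classic
conjure-up
```

The key difference from the earlier broken setup is creating `lxdbr0` explicitly with `lxc network create`, so the bridge is managed by LXD rather than left half-configured.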
jamesbenson | snap needs sudo | 21:38 |
jamesbenson | running conjure-up | 21:39 |
R_P_S | ok, turns out you can't just add a VPC to a model after the fact (got errors), as the VPC parameters need to be specified during model creation with --config | 21:40 |
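Based on the finding above, a sketch of creating a model with the VPC baked in at creation time. `vpc-id` is the AWS model-config key; `dev-k8s` and `vpc-xxxxxxxx` are placeholders for a model name and a real VPC ID:

```shell
# VPC settings must be supplied at model creation, not added afterwards:
juju add-model dev-k8s --config vpc-id=vpc-xxxxxxxx

# Then ask juju to (re)discover the VPC's subnets for this model:
juju reload-spaces
```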
jamesbenson | stokachu: oooo something different is happening... getting a good feeling ^_^ | 21:41 |
stokachu | jamesbenson: \o/ | 21:41 |
jamesbenson | what's the watch command again? | 21:42 |
jamesbenson | got it | 21:43 |
R_P_S | ok, now I'm straight up running into this bug :( https://bugs.launchpad.net/juju/+bug/1704876 | 21:50 |
mup | Bug #1704876: can't deploy to specific AWS subnets due to `juju add-subnet` fails <add-subnet> <aws> <conjure> <spaces> <subnet> <vpc> <juju:Triaged> <https://launchpad.net/bugs/1704876> | 21:50 |
R_P_S | how do you delete a space in a model? | 21:59 |
hml | spaces can’t be deleted currently. :-( | 21:59 |
R_P_S | ... | 21:59 |
R_P_S | are spaces completely broken? :( can't delete, can't add subnets to a space... can't do anything with them? yet they're core to defining where things will be deployed? | 22:00 |
hml | i never use spaces personally - there are other ways to define how things are deployed | 22:01 |
hml | depends on the cloud you’ve bootstrapped | 22:01 |
R_P_S | I'm following: https://insights.ubuntu.com/2017/02/08/automate-the-deployment-of-kubernetes-in-existing-aws-infrastructure/ | 22:01 |
R_P_S | how would I rewrite this command then to not use spaces? | 22:02 |
R_P_S | juju deploy --constraints "instance-type=m3.medium spaces=private" cs:~containers/etcd-23 | 22:02 |
hml | ah… | 22:03 |
hml | you can just make a space with a different name to use - suboptimal i know - | 22:03 |
R_P_S | the only things I've done different so far are that I'm not using cloudformation (infrastructure preexisting) and creating a model | 22:04 |
R_P_S | But how do I use empty spaces? | 22:04 |
R_P_S | since I can't add-subnet to a space? | 22:04 |
jamesbenson | stokachu : etcd/0 Missing relation to certificate authority. | 22:04 |
jamesbenson | https://snag.gy/5ED6sa.jpg | 22:05 |
jamesbenson | ah, my nginx just became active.... | 22:05 |
R_P_S | so apparently you need to define your subnets when calling add-space... | 22:08 |
R_P_S | do it once, don't screw it up... and if you ever accidentally re-assign a subnet to a different space, you're screwed? | 22:09 |
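Given that subnets must be attached when the space is created (and spaces can't currently be deleted), a sketch of the one-shot setup implied above. The CIDRs are placeholders for the VPC's actual private subnets, and the charm/constraint line is taken from the command quoted later in this session:

```shell
# Create the space with its subnets up front; there is no second chance:
juju add-space private 172.31.1.0/24 172.31.2.0/24

# Then constrain deployments to it:
juju deploy --constraints "instance-type=m3.medium spaces=private" cs:~containers/etcd-23
```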
jamesbenson | kwmonroe stokachu : thoughts? seems like having a similiar issue like before. the bridge is managed now though | 22:14 |
stokachu | If there is no error in juju status then give it time | 22:16 |
jamesbenson | error hook failed: "install" ? | 22:17 |
jamesbenson | for all etcd | 22:17 |
jamesbenson | but keeps on restarting... | 22:17 |
stokachu | Is this a full VM you're running? | 22:17 |
jamesbenson | full? yes, ubuntu server cloud image... | 22:17 |
kwmonroe | jamesbenson: juju debug-log --replay -i etcd/0 | 22:18 |
kwmonroe | let's see if it's still a problem snap installing | 22:18 |
kwmonroe | jamesbenson: alternatively, juju ssh etcd/0 and try a "sudo snap install etcd" | 22:18 |
jamesbenson | http://paste.ubuntu.com/25976912/ | 22:18 |
kwmonroe | instant regret on the --replay ;) | 22:19 |
kwmonroe | please hold while your pastebin loads into ram | 22:19 |
jamesbenson | :-( | 22:20 |
stokachu | Looks like a timeout downloading the snaps | 22:20 |
jamesbenson | yeah | 22:20 |
jamesbenson | should I try the `sudo snap download etcd` | 22:20 |
stokachu | sudo snap install etcd | 22:20 |
jamesbenson | tried inside of etcd and got the TLS handshake timeout error | 22:21 |
kwmonroe | R_P_S: while i'm waiting on jamesbenson to crash my browser, what's your end game here? i'm really not well versed with spaces/subnets, but i'm curious what people are up to when they have strict space reqs. i know bdx does these space constraints all the time -- i never knew why. | 22:21 |
stokachu | Yea, not a conjure-up or juju issue | 22:21 |
stokachu | But an issue nonetheless | 22:21 |
jamesbenson | kwmonroe: sorry! | 22:22 |
stokachu | Are you behind proxies? | 22:22 |
jamesbenson | stokachu: no | 22:22 |
kwmonroe | no worries jamesbenson -- it's ammo for getting a new rig for the holidays ;) | 22:22 |
stokachu | jamesbenson: may want to post on discuss.snapcraft.io | 22:23 |
jamesbenson | ooo, nice :-) đź‘Ť. I'm running MBP with touchbar... | 22:23 |
R_P_S | kwmonroe: simple management of VPCs and subnets. Without this, some of the anti-patterns that juju enables are mind-boggling... things like 0.0.0.0/0 SSH ACLs on every instance | 22:24 |
kwmonroe | you happy with the touchbar jamesbenson? i hear mixed reviews (kinda like battlefront 2), where by "mixed" i hear "i hate it" ;) | 22:24 |
kwmonroe | jamesbenson: i see we're back to the tls handshake timeout :/ | 22:25 |
jamesbenson | lol... it's okay... I do need to reset it though, really random hangs and freezes as of late... circle of death for like 10-20 seconds then free's up. | 22:25 |
jamesbenson | does snapcraft have an IRC? | 22:25 |
kwmonroe | oof on the death spiral | 22:26 |
kwmonroe | jamesbenson: you'll get much better response from snapcraft.io, but there is a #snappy freenode channel | 22:26 |
R_P_S | kwmonroe: at that point, the only way to secure them is to control your subnets with things like public and private... these are very basic security concepts when building AWS infrastructure. | 22:26 |
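The public/private subnet separation R_P_S describes maps onto Juju's network spaces. A rough sketch of how that looks on the CLI (the space names and CIDRs here are hypothetical, and space support varies by cloud provider and Juju version):

```shell
# Define spaces from existing VPC subnets (names and CIDRs are examples only)
juju add-space public 172.31.0.0/24
juju add-space private 172.31.16.0/24

# Constrain a deployment so its units only get addresses in the private space
juju deploy ubuntu --constraints spaces=private

# List known spaces and their subnets
juju spaces
```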
jamesbenson | but I hack the hell out of it, so I probably fugged something somewhere... | 22:26 |
jamesbenson | back to point though | 22:27 |
kwmonroe | jamesbenson: i shouldn't say "much better", but i know those forums are monitored like crazy. irc, i'm not sure. | 22:27 |
jamesbenson | okay | 22:27 |
kwmonroe | jamesbenson: https://forum.snapcraft.io/ is the place | 22:28 |
kwmonroe | dont go there ^^ from your etcd/0 unit because you'll probably get a tls handshake error. | 22:28 |
R_P_S | kwmonroe: the fact that conjure-up for a kubernetes cluster uses ec2-classic by default is, to be brutally honest, downright scary :( ec2-classic was deprecated years ago, and should never be used again | 22:29 |
R_P_S | damn, I gotta run to meetings... I'll likely be in meetings until EOD... | 22:30 |
R_P_S | once again, thanks for all the help, I am making progress, but it is much slower than I'd hoped | 22:30 |
kwmonroe | R_P_S: thx for the insights on your use of subnets/spaces! | 22:30 |
kwmonroe | i'll catch up with you later to dive in more | 22:31 |
stokachu | Lol ec2-classic? | 22:31 |
kwmonroe | yeah - i dunno what ec2-classic is either, that was the diving part i alluded to ;) | 22:32 |
stokachu | R_P_S: feel free to elaborate on ec2-classic | 22:33 |
jamesbenson | kwmonroe: ha, thanks. Not sure exactly what to post to them... I suppose just that snap is failing in an openstack VM, with that massive pastebin from earlier. | 22:36 |
kwmonroe | jamesbenson: like you said, back to the point, if you can't "sudo snap install etcd" from the deployed unit, juju won't be able to either. so step 1 is to figure out why that's failing. you're probably going to hear 1000 people ask "what are your proxy settings", don't be mad. whatever's going on is probably a mix of openstack / lxd / snap. | 22:37 |
kwmonroe | jamesbenson: i would just create a new topic that says "snap install fails in a lxd container on an openstack VM" | 22:38 |
kwmonroe | jamesbenson: and the pastebin is good -- but since it has so much juju noise in it, i'd paste the failure that you see from "sudo snap install core" on the etcd unit. | 22:41 |
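One way to narrow this down from inside the unit, following kwmonroe's advice (the unit name etcd/0 comes from the log above; jamesbenson reports no proxy, but the proxy checks are the usual first step, and the proxy URLs below are placeholders):

```shell
# Get a shell on the failing unit
juju ssh etcd/0

# Reproduce the failure directly, bypassing the charm
sudo snap install core

# Check whether the container has proxy settings that snapd isn't seeing
printenv http_proxy https_proxy no_proxy

# If a proxy turns out to be required, configure snapd itself
# (values are placeholders)
sudo snap set system proxy.http="http://proxy.example.com:3128"
sudo snap set system proxy.https="http://proxy.example.com:3128"
```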
tvansteenburgh | stokachu: kwmonroe: ec2-classic == ec2 in the days before vpcs were introduced. if you have a sufficiently old aws account, the machines juju provisions will not be in a vpc, which can break things in unexpected ways. the way to get around this is to tell juju which vpc to use. you can do that using bootstrap or model config. | 22:47 |
jamesbenson | kwmonroe: Posted... let's see what happens. | 22:55 |
kwmonroe | i predict nothing but good things jamesbenson ;) | 22:55 |
R_P_S | ec2-classic is amazon ec2 before VPCs existed... IIRC, it's not even accessible on accounts that were created in the past few years | 23:15 |
R_P_S | ec2-classic was the equivalent of one giant public VPC that contained every amazon customer all in one giant internal subnet (split per region) | 23:16 |
R_P_S | ec2-classic didn't have as many features as "ec2-vpc" either. for example: an instance's SG couldn't be changed after launch, and only one SG could be attached to an instance. | 23:17 |
R_P_S | ec2-classic SGs don't have egress ACLs. They simply don't exist (non-configurable ALL ALL 0.0.0.0/0 egress) | 23:18 |
R_P_S | in ec2-classic, without VPCs, you didn't make subnets either... I can't remember everything though, it's been years since I've done any significant amount of work in ec2-classic. | 23:19 |
R_P_S | ec2-classic is inherently insecure compared to ec2-vpc | 23:19 |
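Whether an AWS account still has EC2-Classic enabled, as R_P_S's account apparently does, can be checked with the AWS CLI (a sketch; assumes configured credentials):

```shell
# Lists the platforms the account supports: "EC2" means EC2-Classic is
# still available, "VPC" means the account is VPC-only
aws ec2 describe-account-attributes --attribute-names supported-platforms
```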
tvansteenburgh | R_P_S: you can make conjure-up use a vpc, but you need to bootstrap the juju controller or create the model before using conjure-up, then tell conjure-up to use that controller or model | 23:21 |
tvansteenburgh | for example: juju bootstrap --config "vpc-id=vpc-xxxxxxxx" | 23:21 |
tvansteenburgh | or: juju add-model --config "vpc-id=vpc-xxxxxxxx" | 23:22 |
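Putting tvansteenburgh's two commands together, the full flow might look like this (the controller and model names, region, spell name, and VPC id are placeholders):

```shell
# Bootstrap a controller whose models default to a specific VPC
juju bootstrap aws/us-east-1 mycontroller --config vpc-id=vpc-xxxxxxxx

# Or, on an existing controller, create a model pinned to that VPC
juju add-model k8s-model --config vpc-id=vpc-xxxxxxxx

# Then point conjure-up at that controller so it deploys into the VPC
conjure-up canonical-kubernetes mycontroller
```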
R_P_S | at this point, I have my ha controllers in the VPC i want... kwmonroe was curious about ec2-classic though | 23:23 |
R_P_S | and the fact that the AWS account I'm using appears to be old enough to still support ec2-classic meant that a basic/barebones conjure-up created a kubernetes cluster in ec2-classic instead of inside a VPC | 23:26 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!