[00:54] babbageclunk: https://github.com/juju/juju/pull/10686 for the httpserver test race
[00:54] looking
[00:55] babbageclunk: noticed that I hadn't pushed last commit
[00:55] ok
[00:55] babbageclunk: good now
[00:59] thumper: why'd you change the quick request loop from a timeout based to just trying 5 times? Won't that lead to more failures?
[01:00] no, because it normally catches it immediately, the first time through
[01:00] and if not the first, definitely the second
[01:00] the loop is just there for paranoia
[01:02] I don't see how making it max of 5 rather than up to LongAttempt is more paranoid though
[01:03] I could rewrite it to end up at LongWait, but the attempt code is deprecated, and shouldn't be used
[01:03] it introduces double sleeps through the loop for no benefit
[01:03] ah, ok
[01:04] * thumper reworks
[01:04] running tests
[01:10] thumper: so I should be waiting? I might just approve with a comment on that line
[01:11] babbageclunk: pushing change now
[01:11] was looping the tests to double check
[01:11] babbageclunk: it's all there now
[01:14] thumper: cool approved
[01:14] ta
[04:44] another worker updated : https://github.com/juju/juju/pull/10687
[09:47] Good morning everyone!
[09:48] Is anyone else having issues with containers deploying with juju? im still having this issue, and I either need to get this resolved, or I need to pull juju out of production entirely and explore alternative solutions.
[09:57] Fallenour, any more information about that?
[10:00] yea I keep getting DNS error messages on my lxd containers not deploying. MaaS/Juju deployment, servers deploy images just fine, image deploys just fine, and the OS can ping google.com, so its not a DNS issue. From what Ive seen its an ongoing issue from as far back as 2016, and its impacting all of juju it seems. I cant pitch a solution that doesnt work in my datacenter to someone elses.
[10:07] @stickupkid, http://paste.ubuntu.com/p/DBbGM7Ksdx/
[10:07] for the juju status.
Just rechecked by sshing into machine 0, pinging google, it works
[10:11] manadart, do you have any thoughts about this?
[10:12] "no matching image found", seems familiar, but could be wrong
[10:20] Im testing something right now as a LDE @stickupkid
[10:20] Is there a way to add just a container, an empty container via juju?
[10:20] stickupkid Fallenour: 2.4.7 is quite old.
[10:20] I think I know what the issue might be. Its a BIG MAYBE
[10:21] 2.4.7 for what @manadart
[10:21] Fallenour: "juju add-machine lxd:x"
[10:21] Fallenour, juju
[10:21] Im on 2.6.9 @manadart stickupkid
[10:21] bionic-amd64
[10:22] My controllers are on 2.4.7 though I just saw. Should I update those, and if so, how?
[10:22] Fallenour, not according to the status output
[10:22] http://paste.ubuntu.com/p/kpKXyQYfYp/
[10:23] Fallenour, upgrading can be found here https://jaas.ai/docs/upgrading-models
[10:23] Just an FYI, thats how long Ive been fighting to get this to work, and even since before 2.4.7
[10:23] So do keep in mind, this is not from a lack of effort o.o
[10:23] I really have been fighting for years to get this thing to work for me. I really do want it in production
[10:23] Fallenour, i think upgrading your controllers would help, some bug fixes around this have landed since 2.4.7
[10:24] alright, I just ran upgrade model
[10:24] on the controllers
[10:25] its installing 2.5.8
[10:25] im also concurrently installing the latest images for 18.04 on bare machines, as well as the upgrade on the controllers
[10:26] Im gonna be so sad if it was a simple upgrade on the controllers @___@
[10:26] 2-3 years of fighting, all because of an update x...x
[10:49] @stickupkid, @manadart I upgraded the controllers, as well as ran upgrade-model on the model itself, but its still failing to install an image. I did notice that when I run juju status its still showing that the model is on 2.4.7. Is there anything else i need to run?
[10:50] Fallenour, you need to run it on the controller, `juju upgrade-model -m controller`
[10:50] I ran it in the controller model
[10:50] @stickupkid,
[10:51] It shows 2.5.8 in the controller model
[10:51] but not in cloud-000-000001, the current model im testing with
[10:53] @stickupkid, @manadart I went ahead and destroyed the model and redeployed it. Its showing 2.5.8. Ill let you know what it does once I get the machines deployed
[10:59] achilleasa, nammn_de - whilst I'm fighting with a lxd container in jenkins, I've brought the shell check stuff up to parity with the older pre-check tests https://github.com/juju/juju/pull/10689
[11:00] achilleasa, nammn_de if anyone finds a way to speed up `go vet` I'm all ears, there is no way it should take 180s, that's crazy
[11:01] @stickupkid, not sure what you are building, but ramfs is a good start for dramatic performance improvements. thats one of the things Ive been working with for a while now. ive noticed substantial performance improvements when using it, especially in combination with ceph with juju
[11:05] Fallenour, don't tempt me with ramfs :D
[11:06] @stickupkid, loool, its one of the things Im hoping to bottle and ship as part of the cloud Ive created, which uses juju as a machine mechanism. ceph is another major component. Once i get this working, I plan on shipping alpha for people to test. Ill probably need you guys help with getting it setup right, but lawd if it works, its gonna be a monster.
[11:07] I was able to build an entire hybrid platform, complete with a centralized UI, and self-healing services, fully compliant out of the box, but its all in pieces until this works.
[11:07] Fallenour, yeah, you should bring it up here - https://discourse.jujucharms.com/ people would be very interested!
[11:08] @stickupkid, Oh I cant wait! Its just insane. ive written over 615,000+ lines of code for it so far. Its a beast.
[11:08] NOOOOOOOOOOOOooooooooooooooo
[11:08] the same issue still!
[11:08] D;
[11:23] @stickupkid, @manadart whats the current version for controllers?
[11:25] Fallenour: long time no see, how goes?
[11:26] @rick_h, ! Its been a while! Im doing amazingly well! We are borderline like...3-4 weeks of getting funded! Super awesome things happening
[11:26] How are you doing? How have you been?
[11:26] Fallenour: whoa! awesome on the funding news
[11:26] How are you doing? How have you been?
[11:26] ooops! Wrong window XD
[11:27] Fallenour: partying hard you know :) just plugging away at the Juju fun stuff
[11:27] lol
[11:27] yea its super cool! Im really excited! OOH! Even bigger news! I got NASA to sponsor the project! I will be able to test the entire cloud platform Ive been working on for ages with them.
[11:27] rick_h, Oh I wish! Im still smashing my face to keyboard trying to get it working.
[11:27] Fallenour: wow, that's crazy. You run across magicaltrout in your travels there?
[11:27] Fallenour: ouch on the face smashing, I'd suggest less of that.
[11:27] rick_h, not yet, but Im expecting Ill hear from him the moment he realizes juju is coming DoD and fedspace wide.
[11:27] never really seems to work on the good bugs anyway
[11:27] lol
[11:28] rick_h, Im having issues with lxd deploying now, of all the things, images were never a problem. Its trading one fire for another I suppose.
[11:28] Fallenour: :/ that's odd. What LXD issues. I haven't seen anything in the recent fires.
[11:28] * rick_h is still catching up on weekend emails and now is scared to look at the bugs section of email
[11:29] rick_h, yea they think it might be a controller version issue thatll solve itself once updates are done, so Im moving from 2.4.7 to whatever current is.
[11:29] can we triple check by doing "lxc launch ubuntu:18.04 bionic"
[11:29] Fallenour: oh ouch, yea 2.4's been a bit
[11:29] stickupkid, yea Im just waiting for the 2.5.8 to 2.6.X to occur. its requiring I update twice.
[11:30] and what does "lxc image list"
[11:30] stickupkid: Fallenour yea my thought exactly. Did 2.4 know bionic existed to be able to use?
[11:30] print out
[11:30] rick_h, errrr
[11:30] maybe
[11:30] rick_h, yea I was really hesitant to change anything until I got everything working, and then I was gonna merge it all together, and then upgrade. Seems like upgrade is happening now though :P
[11:30] Fallenour: wheeee
[11:30] rick_h, surprisingly, yea, it worked
[11:30] Fallenour: umm, congrats? :P
[11:31] I was running trusty, xenial, and bionic, side by side. was mazballs tbh
[11:31] hah, well that's ok and good. We definitely support each of those LTS's
[11:31] rick_h, security would have had a seizure though, so definitely not recommended for prod, but it demonstrates the capacity for juju to support a dramatic difference in OS, which was a plus
[11:32] Fallenour: just have to get on that UA train and have the ESM for trusty :P
[11:32] * rick_h takes off the sales hat
[11:33] rick_h, loool. I was keeping it up for minecraft modules. The target for that one was gaming servers. Ill be able to move it to bionic once I start pushing modules out for software support with bionic
[11:33] I didnt have a chance to tell you, but I finished carousel, so now I can mass produce software support once i figure out how to build charms.
[11:35] Fallenour: awesome
[11:51] it doesnt seem to be a version problem, Ive updated everything to 2.6.9, and its still occurring @rick_h @stickupkid @manadart
[11:51] Fallenour: ok, what's the issue/log output?
[11:51] machine-0: 07:46:30 DEBUG juju.worker.logger reconfiguring logging from "<root>=DEBUG" to "<root>=WARNING;unit=DEBUG"
[11:51] machine-0: 07:46:30 ERROR juju.worker.dependency "broker-tracker" manifold worker returned unexpected error: no container types determined
[11:51] machine-0: 07:50:03 WARNING juju.container.broker no name servers supplied by provider, using host's name servers.
[11:51] machine-0: 07:50:03 WARNING juju.container.broker no search domains supplied by provider, using host's search domains.
[11:51] machine-0: 07:50:03 WARNING juju.container.broker incomplete DNS config found, discovering host's DNS config
[11:51] machine-0: 07:50:45 WARNING juju.worker.provisioner failed to start machine 0/lxd/0 (acquiring LXD image: no matching image found), retrying in 10s (10 more attempts)
[11:52] machine-0: 07:51:02 WARNING juju.container.broker no name servers supplied by provider, using host's name servers.
[11:52] machine-0: 07:51:02 WARNING juju.container.broker no search domains supplied by provider, using host's search domains.
[11:52] machine-0: 07:51:02 WARNING juju.container.broker incomplete DNS config found, discovering host's DNS config
[11:52] Sorry for the wall of text everyone
[11:52] hah ok sec...processing
[11:52] pastebin ftw :)
[11:52] yea I wasnt sure if itd let me pastebinit since its an ongoing write
[11:52] Fallenour: this is on MAAS?
[11:53] yea
[11:53] Fallenour: what version of MAAS and are the DNS servers setup in the MAAS config?
[11:57] Fallenour: Can you leave the root log level at DEBUG for a re-run, so we can see what the container manager is outputting?
[11:59] rick_h, its 2.4.2, and DNS is configured and working. I can ssh into a machine deployed and ping google.
[11:59] manadart, how would i do that?
[12:00] Fallenour: hmmm, this is from the machine you want the container on?
[12:00] Fallenour: e.g. machine 0 in this case it looks like
[12:00] rick_h, yea
[12:01] Fallenour: According to the first line of the output^ you have model config to use WARNING by default except for units.
[12:01] rick_h, when I ssh into the machines, I can ping external DNS sources, but for some reason its giving me the image cannot be found error. I even went out of my way to download all the images just to make them available, but that didnt work.
I also added additional DNS servers, to include a root dns server, but that didnt resolve the issue either.
[12:01] Fallenour: try: juju model-config logging-config="<root>=DEBUG"
[12:01] manadart, ok ill try that
[12:02] Fallenour: was the machine manually added? https://bugs.launchpad.net/juju/+bug/1821714
[12:02] Bug #1821714: Container on sshprovided machine: incorrect DNS
[12:03] rick_h, I have this issue whether I use juju deploy, conjure up, or manual
[12:03] one thing I have been looking into, is juju using snap as the default lxd or /bin/ ?
[12:03] Fallenour: hmm, yea I see https://bugs.launchpad.net/juju/+bug/1826203 is the same as well :/
[12:03] Bug #1826203: deploy openstack base bundle failed with lxd error: incomplete DNS config found, discovering host's DNS config
[12:04] Fallenour: the deb bin vs the snap though it detects if the snap is on the system
[12:04] my thoughts are if the lxd container system use isnt matching, it would fail
[12:04] rick_h, ok, so that wouldnt matter to juju?
[12:04] Fallenour: I don't think so
[12:05] rick_h, I didnt either, but Im willing to try it.
[12:07] * rick_h has to take the dog to the vet, will look when I get back.
[12:07] Fallenour: it might be worth hopping onto https://bugs.launchpad.net/juju/+bug/1826203 as that seems the same vein and means it's not just you :)
[12:07] Bug #1826203: deploy openstack base bundle failed with lxd error: incomplete DNS config found, discovering host's DNS config
[12:39] Fallenour: do you have any http proxies or anything in play in the controller/model?
[12:41] rick_h, not that Im aware
[12:41] Im currently playing with wiping out the dns config entirely, and seeing what happens
[12:41] building a new machine, which will deploy the new config. Will see what happens @rick_h
[12:41] Fallenour: ok, still looking
[12:42] if everything works, and I get this working, like fully working, i expect free tacos & energy drinks for life from Canonical
[12:43] and some kinda plushy animal.
Suse gave me an iguana o.o
[12:46] rick_h, well...at least I know that its not maas now
[12:46] Fallenour: ?
[12:46] rick_h, unless...does juju store the dns configurations for the model, or per machine when using maas?
[12:46] Fallenour: so the thing is that I don't think Juju deals with DNS and just relies on the machines at play.
[12:47] Fallenour: e.g. dhcp/etc
[12:47] rick_h, wiped the dns config entirely from maas, so its not a bad config from maas. maas is fully up to date.
[12:47] Fallenour: I'm guessing there's something up with the dhcp setup and dns but not sure
[12:47] rick_h, thats the thing though, the machines can ping externally
[12:47] rick_h, if its dns, it doesnt make much sense, because it can ping dns names, and resolve them.
[12:48] rick_h, whats the address of the system it gets images from?
[12:48] Fallenour: right, but according to https://github.com/juju/juju/blob/085584f255f6d66530da25947544ca33418d0675/container/broker/broker.go#L76 we look at the interfaces on the host
[12:48] rick_h, which is what doesnt make sense. The host can reach the internet and resolve addresses
[12:48] rick_h, whats the dns name of the source it draws the images from?
[12:51] Fallenour: I *think* https://us.images.linuxcontainers.org/
[12:52] rick_h, OOOO
[12:52] OOOOH BOIII!
[12:52] I found something here!
[12:52] Fallenour: I'm not going to get my hopes up yet...
[12:55] rick_h, it cant ping the address for the images
[12:55] Fallenour: :(
[12:55] rick_h, it can ping ubuntu though. hmmm
[12:56] rick_h, and it can pull down packages
[12:56] Fallenour: https vs non https?
[12:56] rick_h, stickupkid manadart is there a command I can use to test to see if I can download an image?
[12:57] Fallenour: lxc launch
[12:57] Fallenour: from that host machine
[12:58] rick_h, I tried building a container from the local machine, but its failing
[12:58] Fallenour: there's your problem?
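The exchange above turns on the gap between "the host can ping and resolve names" and "the host can fetch images over HTTPS". A minimal sketch of how those layers could be checked one at a time from the affected host follows. It assumes the two host names mentioned in the conversation and port 443; the names `IMAGE_HOSTS`, `probe`, and `explain` are hypothetical helpers, and this does not reproduce what `lxc launch` itself does.

```python
import socket
import ssl

# Host names taken from the conversation; which one LXD actually contacts
# depends on the configured image remote (an assumption here).
IMAGE_HOSTS = ["cloud-images.ubuntu.com", "us.images.linuxcontainers.org"]


def probe(host, port=443, timeout=5.0):
    """Return (dns_ok, tcp_ok, tls_ok) for one host, stopping at the first failure."""
    try:
        socket.getaddrinfo(host, port)
    except OSError:
        return (False, False, False)
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return (True, False, False)
    try:
        context = ssl.create_default_context()
        # The handshake runs inside wrap_socket; the wrapped socket is closed
        # when the with-block exits.
        with context.wrap_socket(sock, server_hostname=host):
            return (True, True, True)
    except OSError:
        return (True, True, False)
    finally:
        sock.close()


def explain(dns_ok, tcp_ok, tls_ok):
    """Turn a probe result into a short hint about which layer failed."""
    if not dns_ok:
        return "DNS lookup failed"
    if not tcp_ok:
        return "DNS ok, TCP connect failed (blocked or dropped in transit?)"
    if not tls_ok:
        return "TCP ok, TLS handshake failed (proxy or inline filter interfering?)"
    return "reachable over HTTPS"
```

Calling `explain(*probe(host))` for each entry in `IMAGE_HOSTS` prints nothing by itself but returns one hint per host. A host that resolves yet fails at the TCP or TLS step would point toward something inline rather than toward DNS.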
[12:58] rick_h, yea Im thinking its something to do with reaching the images
[12:58] if you can't do it Juju certainly can't
[12:58] rick_h, mmmmm
[12:58] rick_h, have you guys seen this issue before? this doesnt make any sense logically
[12:59] Fallenour: no, my brain is hurting trying to think it through
[12:59] lxc launch ubuntu:18.04 bionic
[12:59] if I can ping the parent domain, it doesnt make sense that I could....FIREWALL PFSENSE!!!
[12:59] Fallenour: I'm getting a setup going on my maas to see if I can run logs and see where it falls over here
[12:59] that's the exact command
[12:59] Fallenour: is there pfsense on the maas nodes?
[13:00] rick_h, no, but there is a firewall inline
[13:00] rick_h, im testing that now by disabling all block ru...no that isnt it either.
[13:00] rick_h, I disabled all the block rules
[13:01] Fallenour: Does resolv.conf on the MAAS hosts have the MAAS server as the nameserver?
[13:02] manadart, yea it says search maas
[13:02] rick_h, manadart which if Im not mistaken, if the gateway is the firewall, would forward it to the firewall if it doesnt know it right?
[13:03] rick_h, manadart would...would ubuntu be blocking the request because its behind a firewall? because its port is 443. ssl issue?
[13:03] rick_h, I have an idea
[13:05] btw I just want to say in advance I really do appreciate you guys and what you do for the community
[13:05] rick_h, its gotta be something inline
[13:05] rick_h, I just pulled down an image after testing to see if I can pull it down inline, and it failed, then unplugged, went wireless, and it worked
[13:06] Fallenour: yea, it's hard because firewalls like to drop things without ack/etc so you don't know who to blame
[13:06] rick_h, manadart ok, so we know its inline, but whats the best way to troubleshoot which device it is? theres only two things in line that could do this, the maas system, and the firewall.
[13:06] rick_h, the thing with the firewall though is I disabled the block rules
[13:06] Fallenour: check the logs on those boxes looking for someone running a firewall and dropping the requests
[13:07] rick_h, i run both boxes. Whats the best way(s) to do that?
[13:07] rick_h, should I start with the firewall, pfsense, or the maas box?
[13:07] both are latest and up to date.
[13:10] rick_h, man this is something...else? Ive never seen this kind of problem before. im in the logs now, and its not even showing the IP address for anything ubuntu related.
[13:10] Fallenour: but the url is linuxcontainers.org vs ubuntu
[13:12] rick_h, but its trying to resolve to cloud-images.ubuntu.com:443
[13:12] rick_h, does it hit that address, and then pull from somewhere else?
[13:13] Fallenour: oh maybe that's good then.
[13:13] rick_h, ok, I need to make this really noisy so I see it for sure
[13:14] Fallenour: maybe see if you can create a container on the maas host?
[13:15] rick_h, ok so now this really doesnt make sense. Im on the same network, behind the same infra, and one system allows me to pull, and the other doesnt?
[13:16] rick_h, but none of the infra is specifically isolated. Where does the metadata come from for the images for lxd?
[13:20] rick_h, manadart ok so this is just weird. Maas wont let me test it. it keeps saying lxdbr0 exists, but it doesnt? so I cant use sudo lxd init --auto to test
[14:20] Fallenour: ? on the maas node?
[14:20] sorry, maas host I should say
[14:39] @rick_h, stickupkid manadart ok lets try this again
[14:39] OOOOOOOOOOOOOOOooooooooooooooooOOOOOOOOOOOOO!!!!!!!!!!!
[14:39] GOOOOOOOALLL!!!
[14:40] fallenour1: ?! found it?
[14:40] rick_h, DAMN YOU SNORT! I DID!
[14:40] fallenour1: \o/ So what is it.
[14:40] * rick_h drumrolls
[14:40] rick_h, It was snortc2 rule set that was blocking a lot of things unreasonably
[14:41] rick_h, So what I did collectively: updated juju from 2.4.7 > 2.6.X, Updated maas packages, wiped DNS setting in maas, removed squid, squid proxy, snort, and pfblockerng rules, and it works!
[14:41] fallenour1: lol
[14:41] fallenour1: "small tweaks"
[14:42] rick_h, LOL
[14:42] rick_h, alright! lets DO THIS!
[14:42] ooo you GOTTA BE KIDDING ME
[14:42] power went out Q____Q
[17:33] rick_h, here it is: https://bugs.launchpad.net/juju/+bug/1847128
[17:33] Bug #1847128: [2.7] ceph-osd stuck in "agent initializing"
[19:00] pmatulis: ty
[20:00] hml: did that help any with the model-config test?
[20:01] rick_h: i’m in debugger land, trying to understand what’s going wrong. helped to point in a better direction.
[20:01] hml: ok, let me know if you could use extra eyeballs
[20:01] rick_h: rgr
[22:10] heyya
[22:11] when I add `series: kubernetes` to my bundle, it does not want to deploy
[22:11] this bundle https://api.jujucharms.com/charmstore/v5/~omnivector/bundle/slurm-core-k8s-2/archive/bundle.yaml
[22:13] gives me https://paste.ubuntu.com/p/Bh6Hwy54zq/
[22:13] but the bundle shows the k8s tag in the charmstore, all looks good there
[22:14] to get the k8s bundle to deploy I need to change `series: kubernetes` to `bundle: kubernetes` like so https://paste.ubuntu.com/p/WQ7xmF4Gw7/
[22:15] after I make this change the bundle can deploy, but doesnt show the k8s tag in the charmstore
[22:15] are people already aware of this?
[22:25] series: kubernetes is just wrong
[22:25] because kubernetes isn't a series
[22:25] the bundle: keyword was added to handle this
[22:26] if the charmstore is parsing 'series: kubernetes' it is doing it wrong
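The point made in the last few messages, that `kubernetes` is not a series and that the top-level `bundle:` key exists for this case, could be turned into a small pre-deploy check. The sketch below assumes the bundle has already been parsed into a plain dict by some YAML loader, which is not shown; `check_bundle` is a hypothetical helper, not part of juju or the charmstore.

```python
def check_bundle(bundle):
    """Return a list of warnings for a parsed bundle (a plain dict).

    Only inspects the top-level keys discussed in the conversation.
    """
    warnings = []
    if bundle.get("series") == "kubernetes":
        # Per the discussion: kubernetes is not a series.
        warnings.append(
            "top-level 'series: kubernetes' found; use 'bundle: kubernetes' instead"
        )
    return warnings
```

For example, `check_bundle({"series": "kubernetes"})` returns one warning, while a dict carrying `"bundle": "kubernetes"` in its place returns an empty list.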