[03:38] externalreality: the check merge job runs the tests in a xenial container, we should be able to tell which version of mongodb it's running easily
[03:38] * veebers checks
[03:39] externalreality: I can see it's installing juju-mongodb3.2 from xenial-updates (this matches with what I see in the make file install-deps)
[03:40] I think it probably depends which series the controller is running.
[03:43] aye, we have mongodb-server-core, but that's bionic/cosmic only
[04:03] babbageclunk: if I've deployed an app under an alias (i.e. juju deploy cs:ubuntu blah) is it possible, in code, to map blah -> cs:ubuntu? (i.e. app name to charm name)
[04:15] veebers: sorry, didn't see this - yeah, an application has a charm, which has a url.
[04:15] Hmm, I can see a way to get name -> charmurl
[04:15] veebers: What context are you in?
[04:15] hah sweet
[04:15] babbageclunk: Deciding what extra metadata needs to be stored and what exists already for doing a resource-get on behalf of a charm
[04:16] veebers: ah, ok - so in the unit agent?
[04:18] babbageclunk: I'm not 100% certain where it will occur, but the unit agent makes a lot of sense. I'm just storing the data atm (at deploy-ish time)
[04:18] oh gotcha
[08:15] Need a review for LXD cluster nodes and AZs: https://github.com/juju/juju/pull/8961
[08:29] manadart: i'm on it
[08:30] manadart: i have a question about the "default" profile, should we be using "default", or should we be using some way to get the profile name?
[08:31] stickupkid: We'll be discussing and evolving that in the course of designing profile/device pass-through.
[08:31] manadart: perfect, just reading your PR and that cropped up
[08:33] manadart: slightly OT, i was looking into cleaning up some lxd tests yesterday and noticed this https://github.com/juju/juju/blob/develop/provider/lxd/environ.go#L46
[08:34] manadart: turns out we never send the provider to the environ - so we should clean that up and remove it
[08:35] stickupkid: Yes; I also added some TODOs in my PR for moving logging/checking. Created a card for it yesterday.
[08:35] manadart: perfect :)
[08:42] manadart: done - LGTM
[08:42] stickupkid: Thanks.
[08:43] manadart: "github.com/juju/juju/cmd/juju/machine.TestPackage" that package is failing constantly now
[08:43] stickupkid: Will look.
[08:47] manadart: I've made a bug for it https://bugs.launchpad.net/juju/+bug/1783284
[08:47] Bug #1783284: Intermittent unit test failure: machine.TestPackage
[08:48] I had a look last week, but I wasn't able to work out why we don't get any really good stack trace of the error
[08:48] In fact i tried to run the go test with -count=64 to try and force it into failing, but it didn't fail locally at all
[08:51] I've got a theory (probably wrong), that it's failing on a tear down
[08:54] stickupkid: Ack.
[09:45] manadart: how much do we want to refactor the lxd tests?
[09:45] the goal being, removing stubs?
[09:46] stickupkid: That is my position, but maybe too much to bite the whole lot off as a single exercise.
[09:47] yeah, that's what i'm thinking, maybe incremental steps...
[10:51] stickupkid: Another one: https://github.com/juju/juju/pull/8964
[10:54] manadart: approved, much better - less branching
[10:55] stickupkid: We should probably start back-porting LXD provider commits to 2.4 as well...
[10:55] stickupkid: Ta.
[10:55] manadart: yeah, makes sense to me
[11:50] morning party folks
[11:58] rick_h_: Howdy.
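(Editor's aside, not part of the log: on veebers' earlier question about mapping an alias like "blah" back to its charm - a hedged sketch. At the CLI level the mapping is visible in status output; in agent code you would follow the application's charm URL, as babbageclunk notes. The alias "blah" is the hypothetical name from the question, and the field layout is as I recall it for 2.x status output.)

    # Deploy the ubuntu charm under the alias "blah".
    juju deploy cs:ubuntu blah
    # The applications section of status maps the alias back to its charm URL,
    # e.g. applications.blah.charm: cs:ubuntu-<rev>.
    juju status blah --format=yaml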
[12:36] bdx: ping when you're about
[12:36] kwmonroe: same to you please
[13:15] kwmonroe has the holiday blues
[13:15] he's gone on strike
[13:16] manadart: doh
[13:16] oops magicaltrout
[13:16] magicaltrout: do you have any use for lxd profile edits that we can/should be aware of for big data charms so we can spec/make it nice and awesome?
=== plars_ is now known as plars
[14:38] rick_h_: strike is over, i have returned from holiday. we don't customize lxd profiles in big data charms today, but k8s does. see this for the rigamarole: https://github.com/juju-solutions/bundle-canonical-kubernetes/wiki/Deploying-on-LXD. if/when big data charms need gpu accel in containers, or have a process that needs /proc or /sys, they might need something similar to the cdk profile.
[14:39] rick_h_: any config or command that makes editing lxd profiles easier would be nice and awesome.
[14:39] (i mean, juju config/command)
[14:40] kwmonroe: k, just wanted to check in.
[14:42] o/
[14:43] I'm looking to build a high availability juju controller set (3) as an individual model, what's the best way to do this? I want to use a model so I can take advantage of the permission controls
[14:45] fallenour: I'm confused. So basically you bootstrap, juju switch controller, juju enable-ha and you get three api servers set up and running
[14:45] fallenour: then any models you create/use are using those HA api servers in the controller model there
[14:45] fallenour: is there something else you're looking for?
[14:46] rick_h_: Correct, I want to enable ha, but I also want to control how that's accessed. I don't remember how I did it last time, but I set up a model, and inside that model I built all of my controllers. I'm not sure how I did that
[14:46] fallenour: so the controller is the api server and then you can juju add-model and use juju add-user to create users and then juju grant... those users access to different models
[14:46] fallenour: that's how the layering functions
[14:47] I guess a better way to put it, unless the models are divided, everyone that has access to the model can make any changes to any system inside that model. I want the controllers to be apart from model A, inside model B, and only give access to Model B to admins
[14:47] fallenour: so each model can be granted access independently
[14:47] rick_h_: Correct. My issue is, if I enable-HA inside the normal model, the controllers will also be inside that same model, model A, and not inside of model B,
[14:48] fallenour: so enable-ha only does one thing. It brings up more machines into the controller model. (juju switch controller) if you juju status on that model you can see the new machines listed
[14:48] rick_h_: So you are saying all I'll need to do is simply create a new model, and then activate HA inside of that new model, model B, and that will do it?
[14:48] fallenour: enable-ha can never touch any other model
[14:48] yes
[14:48] YES!
[14:48] fallenour: I'm saying that enabling HA in any other model than the controller model doesn't make sense/work
[14:48] Ok, so we are on the same page then, so the Model B (controller model) already exists upon deployment?
[14:49] fallenour: yes, bootstrap comes out of the box with a controller model for the controller bits (e.g. HA api servers) and a default model which you can use to deploy workloads/etc
[14:49] so switch over to Model B (controller model) and THEN do juju enable-ha, correct?
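(Editor's aside, not part of the log: a minimal shell sketch of the flow rick_h_ describes - HA lives in the built-in controller model, and per-model access is handled with add-user/grant. The cloud, controller, user, and model names here are hypothetical.)

    juju bootstrap my-maas mycontroller   # creates the "controller" model plus a "default" model
    juju switch controller                # the controller model holds the API servers
    juju enable-ha                        # brings the controller up to 3 machines
    juju status                           # the new controller machines show up here

    juju add-model webapps                # workload model, separate from the controller model
    juju add-user alice                   # register a user on the controller
    juju grant alice write webapps        # alice can change webapps, not the controller model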
[14:49] manadart: https://github.com/juju/juju/pull/8965 these are just some clean-up that we should land; we can always work on more in the future
[14:49] fallenour: the controller model can never be removed
[14:49] fallenour: it only goes away with the destroy-controller command
[14:49] fallenour: but the default model can be removed, new ones added, etc
[14:49] ok so that explains a lot
[14:49] stickupkid: Ack.
[14:50] fallenour: so it sounds like I'd suggest you bootstrap, juju switch controller, juju enable-ha, juju remove-model default, and then follow https://docs.jujucharms.com/2.4/en/tut-users from there
[14:53] rick_h_: Ok, so just rebuild from scratch, all the way back from bootstrap?
[15:03] fallenour: not sure what you've got so I can't say
[15:03] fallenour: I mean all of that is possible at any time
[15:08] those docs are looking real nice
[15:10] thank you to everyone who put effort into migrating/making the new docs.jujucharms.com
[15:10] it's beautiful
[15:11] manadart: i get the comments, i think my PR is a better first step, before tackling the arch feature...
[15:12] rick_h_: help update for LXD - https://github.com/juju/juju/pull/8966
[15:14] Hi, I have a strange error from a unit. We tried to remove the application, and it shows up as terminated, but it won't go away. In debug-log I see this:
[15:14] machine-0: 23:06:58 ERROR juju.worker.dependency "unit-agent-deployer" manifold worker returned unexpected error: failed to query services from dbus for application "jujud-unit-sru20170725637-1": Failed to activate service 'org.freedesktop.systemd1': timed out
[15:14] stickupkid: You can lick the arch feature in less code than it takes to shim it out here.
[15:14] anyone seen something like that? The current version of juju in that model is 2.2.9 but it won't let me upgrade it
[15:15] manadart: ok, i'll drop that commit
[15:15] stickupkid: We also don't need to shim out IsSupportedArch, because we can always just return a mock arch that gets the return that we want from that method.
[15:17] stickupkid: I am thinking of icing my logging PR too for now. What we discussed with rick_h_ means putting back some mess that we took away :(
[15:18] manadart: dropped that commit, so it's just a simple update to the provider
[15:20] manadart: do we need to change the plan?
[15:22] plars: no, that's a new one to me.
[15:22] rick_h_: any suggestions on debugging or repairing it?
[15:23] plars: looking for existing bugs atm to see if there's something more to help
[15:23] thanks!
[15:24] plars: and coming up empty...
[15:24] plars: can you file a new bug with details on version/cloud/what was running/etc please?
[15:25] plars: I mean it might be some cleanup step race condition but I'm not sure. The fact that it's 2.2.9 makes me :( but the issue is that if you can't upgrade then double :/
[15:25] rick_h_: sure, tbh I'm not sure how it got in this state. We've been running very stable for a long while and got a similar error on a unit when trying to deploy. Then it started having this on an existing unit that we tried to get rid of
[15:25] rick_h_: I'd be happy to upgrade, but it gives me an error that it can't because of that unit
[15:26] ERROR some agents have not upgraded to the current model version 2.2.9: unit-sru20170725637-1
[15:26] plars: ugh, yea it's so tough to debug this stuff. Maybe we can see if we can upgrade around it or force it in some way
[15:26] plars: is the machine that the application is on still there?
[15:26] plars: e.g. can we juju remove-machine --force to help put some pressure on things?
[15:26] rick_h_: yes - it's maas, but I can't remove the machine because it hosts a lot of other applications/units
[15:26] plars: yea, that's what I was worried about
[15:26] plars: can they be migrated off?
[15:26] plars: I guess no, but figure I'll ask
[15:27] rick_h_: on another model, I also have a machine that I can't remove, even with --force
[15:27] stickupkid: one thought sorry, can you verify that the current help text there conforms to that template we got a while ago?
[15:27] stickupkid: just to make sure that while we're in there we bring it up to standard across the whole thing
[15:27] rick_h_: on that one, the maas machine that it was once using is gone. It just seems to silently fail
[15:28] plars: ? that seems odd. --force with remove-machine is a pretty big hammer that usually doesn't fail unless something is really odd
[15:28] and the machine never disappears. On that one, the whole model can go if there's an easier way to force that
[15:28] 2 down 10.101.49.149 nx38gq xenial default Deployed
[15:28] is how it shows up
[15:28] plars: no, we're looking to add some add-model force bits to 2.5 this cycle but I don't have it yet
[15:28] sorry, remove-model --force bits
[15:28] that one is stuck on 2.0.2 - and can't update for the same reason. No units are even deployed
[15:29] plars: is it in a happy state? e.g. does the agent report ok?
[15:29] plars: I'm curious if model-migrations can be used to help garden up to the later versions with fixed bugs
[15:29] stickupkid: https://docs.google.com/document/d/1ySjCNqd0x6veLfcBetxLI9NH7qfw3xayLWJNqXMvyW8/edit specifically
[15:31] rick_h_: sure let me look
[15:31] rick_h_: the agent for the one where I can't remove the machine?
[15:31] plars: right, but it shows it still there?
[15:32] rick_h_: yes, if I do juju status on that model, it shows the machine is there. In reality, that's the only machine left in the model, and it's gone
[15:32] plars: oh I see
[15:47] rick_h_: https://bugs.launchpad.net/juju/+bug/1783357 - please tell me if you need any other information, or have suggestions for debugging
[15:47] Bug #1783357: Failed to activate service 'org.freedesktop.systemd1': timed out
[15:49] anyone in here have a good contact with the maas team? It's really becoming frustrating that the system I'm working with keeps trying to change all of its configuration instructions mid-build. It completely defeats the purpose of asking me what domain, IP, storage config, etc if you are just going to randomly generate all of that, and change it all, including hardware zone.
[15:52] fallenour, consider documenting your situation and sending to the juju mailing list
[18:39] what's the flag option for acquiring devices in a specific zone with juju?
[18:39] is it --zone=
[18:55] fallenour: yea check out https://docs.jujucharms.com/2.4/en/charms-deploying-advanced under "deploy --to"
[18:56] fallenour, 'zone' is a placement directive. it can be used whenever juju spawns a machine (commands: deploy, add-unit, add-machine, bootstrap)
[18:56] is it --zone=
[18:57] sorry, meant to send that earlier
[18:58] fallenour, it is dependent on your chosen cloud provider
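(Editor's aside, not part of the log: examples of the 'zone' placement directive pmatulis mentions. These are hedged sketches - the zone name is a placeholder, and whether the directive is honoured depends on the cloud provider, per the discussion above.)

    juju deploy ubuntu --to zone=zone2      # place the new unit's machine in MAAS zone "zone2"
    juju add-unit ubuntu --to zone=zone2    # same directive on add-unit
    juju add-machine zone=zone2             # add-machine takes the directive as a positional argument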
[18:58] wow, I'm losing my mind. Ok, next question. So I've got ha enabled, thank you very much for that btw rick_h_, rebuilding was definitely the smarter decision. I'd like to integrate docker. I did just notice there was a remove-k8s, which makes me think there's a segment for kubernetes and docker. Can you provide some enlightenment for me on that?
[18:58] pmatulis: cloud provider is maas.
[18:59] fallenour: heh, so now you're on the bleeding edge of stuff. There's work going on to enable Juju models on k8s but it's kind of "soft launch" as things like storage and features are in progress.
[18:59] fallenour: definitely not ready for production infrastructure yet unfortunately
[18:59] Oh! Also, for those deploying charms, if they keep pushing because "it's not working" with maas, tell them to check to see if they imported bionic (18.04 LTS). It hung up my openstack deployment because of that.
[18:59] fallenour: however, if you have the time/resource definitely play with it and see if it fits your needs
[18:59] fallenour: ah, definitely. Having the right MAAS images is vital
[19:00] rick_h_: You know, as much as I might hate myself in the morning, you guys have helped me a ton. I've got 3 extra servers. I'll see if I can't contribute a few pints of blood and a few more bpm to help with testing it.
[19:00] also, the new bionic boot, it looks amazing. I'm not even gonna lie, it's beautiful
[19:00] fallenour: all up to you, I just want you to know where stuff sits and as we build stuff for users to solve problems it's always <3 to get feedback that we're on the right track
[19:00] it's like 4-6 fonts smaller.
[19:01] lol
[19:12] So rick_h_ pmatulis I was thinking. I'd like to build a web app cluster with a separate docker cluster, both backed by ceph storage clusters, so the applications and containers can store across the ceph storage drive array across the 3 servers each, what are your thoughts?
[19:13] from my perspective, it should give all the apps and containers access to a total of about 1.8 TB storage space, with the ability to easily swap out the drives to increase size. Are there any risks I should be aware of, and am I overlooking anything?
[19:15] fallenour: you're stepping into kwmonroe and tvansteenburgh's expertise there. I'm not sure
[19:15] basically can you set up that ceph cluster as a storage provider for kubernetes and deploy in that way?
[19:17] kwmonroe: tvansteenburgh Can you two provide some insight? It'll be the first time I'll be combining ceph storage with docker.
[19:17] rick_h_: yea it's a pretty interesting idea, especially if it pans out. completely flexible application deployment with completely flexible storage.
[19:21] fallenour: if you're using kubernetes for your "docker cluster" then integrating with ceph will be pretty straightforward
[19:21] if you're not, then i have no idea
[19:24] tvansteenburgh: Um...can you clarify? Wait, easier question, is there a kubernetes for juju?
[19:25] fallenour: yah, `juju deploy canonical-kubernetes`
[19:25] or, for a minimal version `juju deploy kubernetes-core`
[19:25] tvansteenburgh: how many machines does it take? I only currently have 3 physical machines set aside for it, is that enough?
[19:26] fallenour, then you want kubernetes-core
[19:26] dang it tvansteenburgh, i knew that one. fallenour, here's some details on both of those: https://jujucharms.com/canonical-kubernetes/ (takes 9 machines) and https://jujucharms.com/kubernetes-core (takes 2 machines)
[19:26] tvansteenburgh: I was just reading up on it as well, it seems like a lot of thought went into building it. Yea I just found that one
[19:26] OOH MY GAWD 9 MACHINES!?
[19:26] 9 times the fun
[19:27] kwmonroe: Slow your roll there google, we're poor people over here.
[19:27] LOL
[19:27] :)
[19:28] fwiw fallenour, the big bundle is meant to represent a production cluster, so you have 3 etcd units, 3 workers, 2 masters.. thars 8 right thar.
[19:28] fallenour: for the poor people we have microk8s: https://github.com/ubuntu/microk8s
[19:28] kwmonroe: Any reason why I can't cram those onto 3 boxes?
[19:28] kwmonroe: tvansteenburgh I mean, 3 3 and 2.
[19:29] fallenour: sure you can
[19:29] fallenour: you could use the lxd provider and put it all on one machine
[19:29] tvansteenburgh: I'm sensing a downside coming.. o.o
[19:29] there's no downside
[19:30] fallenour: you could certainly adjust the bundle.yaml for canonical-kubernetes (cdk) to change num_units from 3 3 2 to 1 1 1. but if you're gonna hack it up like that, you may as well use kubernetes-core (which we've hacked/condensed for you).
[19:30] fallenour: if that one machine falls over you're out of luck
[19:30] kwmonroe: knobby tvansteenburgh No no no, I mean put 3 3 and 2 on 3 physical machines, one on each for each machine. so 1 1 and 1, 2 2 and 2, 3 and 3
[19:31] a game of sudoku spontaneously broke out
[19:31] so spreading all 8 over 3 machines instead of 8 machines. I know 8 is a lot more beefy, but the demand on 8 machines for someone like myself won't reach the point of justifying 8 for a while
[19:32] Ooh, yea, so machines A, B, C: etcd1, worker1 and master1 on machine A, etcd2, worker2, and master2 on machine B, and etcd3 and worker3 on machine C
[19:32] sorry
[19:32] fallenour: the only downside to reducing the number of machines is that you lose the highly available part of it to a degree. I'm running it and not using 9 machines. I smashed etcd onto the (single) master I have and 2 of the workers
[19:32] knobby: well you would still keep the HA, just spread it across less hardware.
[19:33] the odds of that many boxes dying at the same time without a serious issue occurring is really low.
[19:33] fallenour: but if you lose 2 machines you're lost. In the 9 machine setup, it would be ok
[19:33] I completely agree, fallenour and I am making the same gamble locally
[19:34] knobby: yea but again, the odds of even losing 2 machines at the same time at a moderate load is still really low. like I'm buying a lotto ticket low, and I'll see you all on my yacht. I got that much better odds.
[19:34] make that two yachts then XD
[19:34] knobby: but yea, I mean what's the best way to build that into a yaml?
[19:37] fallenour: i would start with the kubernetes-core yaml (https://api.jujucharms.com/charmstore/v5/kubernetes-core/archive/bundle.yaml), adjust it so there are 3 machines in the machines section, keep easyrsa as is (in an lxc container on machine 0), bump up num_units for etcd and k8s-worker to 2 (or 3, or whatever), and adjust those "to:" directives to be like "to: - '0' - '1' - '2'" as you want.
[19:37] fallenour: just use the lxd stuff like kubernetes-core has. I'm not a master of the --to stuff unfortunately.
[19:39] fallenour: you'd effectively be making a bundle somewhere between core and cdk. the only thing i'd be careful of is to ensure the k8s-master and workers are on different machines.. so like easyrsa+master on machine 0, etcd+worker on machines 1 and 2.
[19:40] knobby: kwmonroe yea I'm working on it now, I'll let you guys take a look once i get it done. Feel free to let me know your thoughts once I get it done.
[19:40] if you guys like it, I'll publish it.
[19:40] eh, in a low usage scenario I'm not worried about mixing masters and slaves on the same hardware. But then again, I live dangerously...
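(Editor's aside, not part of the log: a hedged sketch of the bundle tweak kwmonroe describes above - spreading a kubernetes-core-style deploy across three physical machines. The edits are summarised as comments rather than a full bundle; the application and key names should be taken from the real bundle.yaml of that era before deploying.)

    # 1. Grab the published kubernetes-core bundle as a starting point.
    curl -o bundle.yaml https://api.jujucharms.com/charmstore/v5/kubernetes-core/archive/bundle.yaml

    # 2. Edit bundle.yaml so that:
    #    - the machines: section lists three machines ('0', '1', '2')
    #    - etcd and kubernetes-worker have num_units: 2 or 3
    #    - their "to:" lists spread the units across '0', '1', '2'
    #    - easyrsa stays in its LXD container (e.g. to: ['lxd:0'])
    #    - kubernetes-master lands on a different machine than the workers where possible

    # 3. Deploy the edited bundle.
    juju deploy ./bundle.yaml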
[19:45] alright, so I got it done
[19:46] knobby: kwmonroe tvansteenburgh I'm curious though, do you think I should go ahead and build ceph into it?
[19:47] fallenour: if that was the end goal, I would
[19:49] knobby: what's the best way to integrate it? should I just toss it in anywhere, or do I need it to establish a specific relationship?
[19:52] kwmonroe: tvansteenburgh rick_h_ Do I just kinda "toss" ceph osd/mon onto the pile, and it's "good"?
[19:57] fallenour: you're out of my league there -- i haven't used ceph myself.
[19:58] kwmonroe: saaaadness. YOU WERE THE CHOOOSEN ONE!
[19:58] you guys buy 6 more machines, and i'll google how to use ceph ;)
[19:58] kwmonroe: LOOOL
[19:59] kwmonroe: Yes...."buy"....*pulls out lightsaber and laser pistols*
[20:03] fallenour: you need some monitors, they are like kubernetes master, and then machines with disks, which are the osd part
[20:03] you probably want to run osd/mon on each machine is my guess
[20:11] fallenour: I have a PR up to allow relating ceph to kubernetes and getting a default storage class for free so you can just make persistent volume claims and get them backed by ceph.
[20:39] kwmonroe: we've been charming up https://livy.incubator.apache.org/ in our efforts to get Hue up and running. Whilst not part of the Big Top stack, would you like us to eventually stick it in bigdata-charmers or just hold on to it?
[20:41] magicaltrout: i'm cool with you holding it. the only benefit for putting it into bd-charmers would be that we would auto build/release it when things like layer-basic changes. if you own it, i'll just open an issue reminding you guys to push it yourselves (which is what i do for giraph)
[20:41] fair enough
[20:42] due to a myriad of wifi issues we didn't have a call in the end but i did shoot uros a bunch of questions which he half answered with promises of grandeur and so on
[20:42] i'll forward it on
[20:44] he also said he wished he could grow a beard like rick_h_ but sadly his childlike features prevent it....
[20:45] LoL
[20:46] Naw, he's got that wise man sans beard thing going
[20:46] Eternal scholar
[20:46] oh, i thought he just couldn't be bothered going to the hairdressers
[21:17] babbageclunk, anyone: have you seen that odd github tls issue in any PR over the last day?
[21:18] veebers: no, I didn't see it yesterday
[21:20] veebers: although it looks like check-merge jobs are failing at the point of launching the container
[21:21] and merge jobs too.
[21:22] veebers: eg http://ci.jujucharms.com/job/github-check-merge-juju/2518/console
[21:22] I'm having a look on grumpig now
[21:23] babbageclunk: oh :-\ ok thanks, let me know what you find. Thanks re: tls issue
[21:25] veebers: I can't launch a new lxd container on it, getting this: https://paste.ubuntu.com/p/CJS4rKtJbb/
[21:26] babbageclunk: I would just reboot grumpig for a start, easiest and laziest way to debug :-)
[21:26] ok
[21:26] babbageclunk: there are a couple of things you need to do first
[21:26] cool
[21:38] wallyworld: That update to Juju edge did fix the issue I was having, thanks
[21:38] cory_fu: great! pr is lgtm also, looks awesome
[21:39] wallyworld: Great. I'll get a quick PR together for your charms repo to work with that before I EOD
[21:39] no rush!
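(Editor's aside, not part of the log: a minimal sketch of the ceph layout knobby describes above - monitors plus osd on each of the three machines. The ceph-mon/ceph-osd charm names and osd-devices option are from the charm store charms of that era; the machine numbers and the /dev/sdb device are placeholders, so adjust for your hardware. Wiring ceph into kubernetes for a default storage class is the subject of knobby's in-flight PR and isn't shown here.)

    # Assumes machines 0-2 already exist in the model (the three physical boxes).
    juju deploy ceph-mon -n 3 --to 0,1,2
    juju deploy ceph-osd -n 3 --to 0,1,2 --config osd-devices=/dev/sdb   # placeholder disk
    juju add-relation ceph-osd ceph-mon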
[21:39] hey wallyworld o/ welcome back to the sensible timezone :-)
[21:39] indeed
[21:43] also sensible hemisphere
[22:37] anastasiamac: when you have a moment could you review: https://github.com/juju/juju/pull/8346 the part I was specifically interested in is 391-392
[22:53] anastasiamac: heh, hold off for now, want to make a slight change to it
[23:16] veebers: oh awesome! good thing i did not look yet then :)
[23:17] ^_^