[00:49] yeah, check out the osm bundle — it uses `bundle: kubernetes` and doesn't get a kubernetes tag in the charmstore https://jaas.ai/osm
[00:50] and here, my bundle uses it incorrectly (non-deployable bundle) and gets the kubernetes tag https://jaas.ai/u/omnivector/slurm-core-k8s/bundle/3
[00:55] https://github.com/juju/charmstore/issues/887
[01:09] wallyworld / hpidcock / kelvinliu could I get a review of this? https://github.com/juju/juju/pull/10692 (model-defaults test fix)
[01:10] also this one is some assess tweaks https://github.com/juju/juju/pull/10693
[01:11] ok
[01:14] babbageclunk: lgtm, thanks for the fix
[01:15] thanks!
[01:15] damn - meant to put them against 2.6
[01:16] hang on, cancelling the merge on the test fix so I can retarget
[01:18] babbageclunk: we ain't gonna do any more 2.6 (touch wood) so don't feel like it's a requirement to land there first
[01:23] I just figured may as well fix it in both vaguely current branches
[02:15] thread logger through apicaller worker: https://github.com/juju/juju/pull/10694
[02:19] hpidcock: did you want to talk about bug 1847084?
[02:19] Bug #1847084: Juju k8s controller is not getting configuration parameters correctly
[02:19] there is history there
[02:22] hpidcock: nm, replied to the bug
[02:22] hpidcock: the issue is --config vs. --model-defaults, they were setting proxies in the config, adding a model then wondering why the proxies weren't there
[02:22] I already talked with wallyworld about it. I just wanted to gather a bit more information about what we should test a fix on, since there are like 20 different CNI plugins for kubernetes that could affect how this works
[02:23] this isn't a network issue
[02:23] this is a model config issue
[02:23] hmm...
[02:23] * thumper is rereading the bug...
[02:24] the model config is fine, it's not passing the proxy env vars to the controller pod
[02:25] but I wanted to understand the environment they are expecting this to work on so I can test it. Because the proxy configuration, especially the no-proxy ranges, could affect the in-cluster/container networking
[02:26] ok
[02:26] I think perhaps there is an issue with them setting no-proxy not juju-no-proxy
[02:27] hpidcock: I guess I'm not sure where they expect the proxy to be set
[02:29] a bit weird that the apt worked but curl didn't...
[02:29] I might be just assuming too much here, but I think they want the pod to have the proxy env vars set. So each unit in the model will have proxy env vars injected via the pod spec.
[02:29] yeah not sure about that
[02:29] hpidcock: I think you may well be right
[02:29] but I'd also check the no-proxy...
[02:29] and also their expectation that --config is passed on to models is wrong
[02:30] so there's a bunch of not-right thinking there
[02:30] I don't think we have a way to automatically populate no-proxy, since we might not know the pod networking cidrs
[02:30] but it seems no-proxy ranges should be for all container networking, so anything in-cluster shouldn't go via the proxy
[02:30] I agree
[02:31] not many things handle a cidr no-proxy
[02:33] proxy configuration at application level is also a bit weird inside of k8s, not a very idiomatic way to set up stuff. Normally you would use something like Istio to handle any proxy configuration, but I can understand setting the apt-proxy
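For context on what hpidcock describes above — proxy settings reaching each workload pod as environment variables — here is a rough sketch of how a k8s charm's pod spec could carry them. It relies on Juju's podspec convention that a container's `config` map becomes environment variables; the proxy values themselves are placeholders, not anything taken from the bug.

```python
# Hypothetical illustration only: proxy env vars injected via the pod spec,
# as hpidcock suggests the bug reporters expect. All values are placeholders.
proxy_env = {
    'HTTP_PROXY': 'http://proxy.internal:3128',
    'HTTPS_PROXY': 'http://proxy.internal:3128',
    # Per the discussion above, not every tool honours CIDR entries here.
    'NO_PROXY': '10.0.0.0/8,localhost',
}

pod_spec = {
    'containers': [{
        'name': 'workload',
        'image': 'workload-image:latest',
        # In Juju's podspec format, 'config' entries become container env vars.
        'config': proxy_env,
    }],
}
```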
[02:33] w
[02:34] wallyworld: I wasn't able to repro the 1847128 volume bug using 2.7 develop head
[02:51] hpidcock: hmmm, ok, i'd update the bug with what you did and possibly ask for exact repro steps and mark as incomplete in the interim
[03:01] babbageclunk: easy review? https://github.com/juju/juju/pull/10694
[03:02] sure
[03:05] wallyworld: got a min to discuss the add-k8s cmd?
[03:05] kelvinliu: sure, give me 2 minutes
[03:05] yup
[03:09] thumper: approved
[03:10] babbageclunk: ta
[03:16] kelvinliu: free now
[03:16] wallyworld: standup?
[03:17] yup
[03:21] bah, is anyone familiar with decrypting ssl/tls in wireshark?
[03:36] babbageclunk: it's a bit fiddly
[03:36] you need to add the private key, I believe in the settings
[03:37] hpidcock: yeah, I'm beginning to realise - the bit about needing to capture the handshake is what's tripping me up now
[03:37] I think I've got all the private keys added
[03:37] I think they just mean it needs to be a new connection
[03:38] like you can't decrypt an already established connection
[03:38] yeah, so I need to kill the controllers, start tcpdump, start the controllers again
[03:42] babbageclunk: https://sharkfesteurope.wireshark.org/assets/presentations17eu/15.pdf
[03:43] might not be possible if it's an ECDHE session
[03:44] see slide 15 "Ephemeral (Elliptic Curve) Diffie-Hellman (ECDHE)"
=== exsdev0 is now known as exsdev
[03:50] babbageclunk: https://golang.org/src/crypto/tls/cipher_suites.go#L77 looks like you're SOL without doing some MITM proxy or something else; you could probably force it to use TLS_RSA_WITH_AES_128_GCM_SHA256 if you're in the mood to recompile
[03:51] hpidcock: oh right, because the controllers will use the top ones for their connections so it'll be ECDHE
[03:52] hpidcock: it's probably not that important - was hoping to distinguish between different traffic types going to 17070, thanks for the pointers though
[04:06] kelvinliu: free again?
[04:19] Hey, I want to do some maintenance on a host machine which runs openstack juju units. Is there any way to migrate the existing units created by juju from one host machine to another?
[04:26] sou: depends what you're wanting to migrate i guess, you could `juju add-unit blah` to create a new one of the existing application
[04:26] then juju remove-unit to shut down the old one when they are sync'd up
[04:30] wallyworld: back now
[04:32] sorry, just finished lunch
[04:34] kelvinliu: no worries, standup?
[04:34] yes
[04:47] This is wrt the easyrsa unit. There is only one unit in the setup.
[04:48] Easyrsa serves as the CA for certs generated by etcd
[05:00] I had to take down the host which runs the easyrsa unit for maintenance (I had to reinstall the host machine). When it came up, a new easyrsa unit was added, but the etcd cluster was broken
[05:16] So I was figuring out a plan to securely reinstall the host machine which runs the easyrsa unit
[05:17] One of the points which came to my mind is to back up the volume used by easyrsa, and then when the unit is recreated, restore the backup
[05:34] wallyworld: we can't do a precheck on the podspec in the deploy facade, because at that time there's no podspec yet..
[05:35] derp
[05:35] just have to check and error out later then
[05:36] ok, so no need to do this check, coz we do all these in ensureXXResources already.
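A rough python-libjuju equivalent of the add-unit/remove-unit rotation suggested to sou above (libjuju comes up later in this log); the method names are from libjuju's Model and Application classes and should be treated as a sketch under those assumptions, not a verified recipe.

```python
# Sketch: replace a unit by adding a fresh one, waiting for it to settle,
# then removing the old one -- the rotation suggested in the chat.
from juju.model import Model

async def rotate_unit(app_name: str, old_unit: str):
    model = Model()
    await model.connect()  # connects to the currently active model
    try:
        app = model.applications[app_name]
        await app.add_unit()                    # juju add-unit <app>
        # Wait until every unit of the application reports active.
        await model.block_until(
            lambda: all(u.workload_status == 'active' for u in app.units),
            timeout=1800,
        )
        await model.destroy_units(old_unit)     # juju remove-unit <unit>
    finally:
        await model.disconnect()

# e.g. await rotate_unit('blah', 'blah/0')
```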
[09:03] Hi
[09:04] i have a question about juju: we can add units to an application, but if one of the units stops responding, will juju redeploy it?
[09:28] shann: You can "juju remove-unit app/x" and "juju add-unit app".
[09:29] thanks manadart, i also see docs about ha-controller and ha-applications
[10:31] manadart stickupkid a small pr with some tests https://github.com/juju/juju/pull/10696 if someone wants to take a look
[10:33] nammn_de: Looking.
[10:42] OH MY GAWD WHAT A BLESSED DAY!
[10:42] Openstack is written in Django!
[10:42] 8D
[10:42] <3
[10:42] you guys are amazing o.o
[10:57] Hey manadart stickupkid rick_h do you guys know which region is chosen by default for openstack?
[11:01] Fallenour, i don't unfortunately
[11:01] nammn_de, i've approved, you won't be able to land until my CMR branch does
[11:02] I figured it out @stickupkid @manadart @rick_h it's admin_domain. Also, the command for recovering the password is: juju run --unit keystone/0 leader-get admin_passwd
[11:02] the username is admin.
[11:02] My gift to the juju community <3
[11:03] That officially makes me a juju contributor for the openstack module XD
[11:03] seriously though, that should be added to the official docs on openstack-base-61
[11:04] next question: rick_h manadart stickupkid I have 5 OSDs per compute/storage node set, juju status shows 5 active, openstack only sees 1 drive each. thoughts?
[11:04] thanks manadart stickupkid i'll wait for your branch then and merge after
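The recovery command Fallenour posted above can also be driven through python-libjuju (discussed later in this log) rather than the CLI. A hedged sketch follows; how the action result is surfaced varies between libjuju versions, so treat the result handling as an assumption.

```python
# Sketch: run "leader-get admin_passwd" on keystone/0 via libjuju, mirroring
# "juju run --unit keystone/0 leader-get admin_passwd" from the chat above.
from juju.model import Model

async def keystone_admin_password():
    model = Model()
    await model.connect()  # current model, assumed to host openstack-base
    unit = model.applications['keystone'].units[0]
    action = await unit.run('leader-get admin_passwd')
    # Result layout differs across libjuju versions; on some you may need to
    # wait for the action to complete before results are populated.
    print(action.results)
    await model.disconnect()
```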
[11:32] Fallenour: not sure, this falls under openstack expertise I don't have tbh.
[11:32] Fallenour: have to bug icey and company on that one
[11:33] rick_h, WUT! Something you dont know!? The apocalypse o.o
[11:33] Fallenour: I know, I hang my head in shame
[11:33] rick_h, tis truly a sadness day :( its ok though! Openstack is up and running, and that is good enough. With ssl enabled, I should be able to access sessions over the web, and I do vaguely remember solving this issue in the past, so I'm sure I can do it again.
[11:34] rick_h, In other news, my boss didn't like the work I did to create a centralized UI for controlling all of our systems and services at work, which means the company has officially rejected it. This also means that it remains solely mine now.
[11:35] rick_h, that being said, it means I can contribute up to 615,000+ lines of code to whatever I see fit.
[11:35] rick_h, now that I know that openstack is built in django, that means I can contribute all of the code to the openstack juju charm suite, if the team will have it.
[11:38] Fallenour: OpenStack isn't built in django, the openstack dashboard is though
[11:38] Fallenour: we're generally over in #openstack-charms if you want to hang out with the cool kids ;-)
[11:38] icey, yeap! and the rest is built in python ;), which the app I built is completely and utterly designed to natively integrate with
[11:39] icey, horizon is the only app that matters O.o, all the rest are just...core services o.o
[11:39] XD
[11:40] icey, it won't let me join :(
[11:41] icey, I got in
[11:41] Fallenour: let me guess, you had to register your nick :-P
[11:45] lol yea icey I thought I had already signed in, but I'm guessing registration sign-ins time out
[12:25] nammn_de, right, I've fixed the issue around landing PRs, you should be able to merge yours now
[12:43] stickupkid: cool 🦸
[13:34] manadart: can you rubberstamp the rebase onto develop https://github.com/juju/juju/pull/10698
[13:35] https://media2.giphy.com/media/xT4Aphm45GMfpVEUxO/giphy.gif?cid=790b7611dfff702a60cf90c310d9d75147cd11c9ad8af327&rid=giphy.gif
[14:59] rick_h: https://github.com/juju/juju/pull/10685 should force all bootstrapping to be lower case
[15:33] nammn_de: getting back and looking
[15:34] nammn_de: I think if it's a user name then yes, however we can't force it for things that exist in the world like clouds and regions on those clouds that are outside our control
[15:35] you mean regions should stick to being camelcase in case they are? Aren't we then back to the initial kube problem?
[15:37] rick_h, fyi hpidcock cannot reproduce my ceph-osd issue but i can, consistently (can make it break and can make it work)
[15:37] nammn_de: sorry, so for the controller name we create it's fine to lowercase it
[15:37] nammn_de: but we have to be careful we don't pass that as a new "region name" value to the API for where to request a VM from
[15:38] nammn_de: because the underlying cloud might be case dependent
[15:38] nammn_de: the bug was that juju creates a name for the controller, and that can be lowered just fine
[15:38] nammn_de: let me know if you want to HO high bandwidth to make sure we're on the same page
[15:38] rick_h: lets ho, better safe than sorry :D
[15:40] rick_h: gimme a ping if you can join HO
[15:41] nammn_de: k, omw
[15:41] hopping into daily
[16:42] gaa. when your mac pops up a notification where you see someone saying something about a charm not working, then can find no trace of the message anywhere on your laptop or online.... 😡
[16:49] magicaltrout: hah yea, "was it IRC? no... Telegram? no... Email? no...wtf!!!"
[16:49] i know, i've literally looked everywhere, some chinese student and that's all I know
[16:49] grr
[16:58] oh well, whoever it was wasn't lying... i have broken it =/
[16:59] :(
[16:59] "the more you know" I guess
[17:01] well if every bug report was some transient osx notification my life would probably be a lot less stressful, as i'd ignore them unless there was a useful description in the first 10 words ;)
[17:01] so this chap got lucky ;)
[17:01] also, if you follow that plan, if you're not looking at your screen when the notification lands, it never happened ;)
[18:53] cory_fu: We're looking at implementing add/list/update clouds (https://github.com/juju/python-libjuju/blob/master/juju/juju.py). Do you foresee any issues? The goal is to be able to replicate the functionality of add-k8s, et al.
[18:57] aisrael: Well, the main caveat is that there is now a distinction between cloud info stored locally in the clouds.yaml file vs the cloud data actually registered with the controller. I don't actually think that class / file has any relevance any more and should probably be removed.
[18:58] aisrael: Instead, you probably care about the clouds in the controller, which should be accessed via the Controller class.
[18:59] cory_fu: okay, noted. Just making sure this can work via the api, remote from the bootstrapped machine
[19:02] aisrael: Yeah, it can. Just use the CloudFacade methods, e.g. AddCloud: https://github.com/juju/python-libjuju/blob/master/juju/client/_client5.py#L1342
[19:03] aisrael: If you're not already familiar, https://pythonlibjuju.readthedocs.io/en/latest/upstream-updates/index.html#integrating-into-the-object-layer has tips on how to use the Juju CLI to see what calls need to be implemented and what kind of data they need to be passed
[19:04] cory_fu: Perfect, thanks. And nice, I didn't know about that.
[19:05] aisrael: Also, libjuju is now officially maintained by the Juju core team, so they should be familiar enough with the Python code-base to help. You can check the PR or commit history to see who has actively worked on it.
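Pulling cory_fu's pointers together, a minimal sketch of registering a cloud with a running controller through the generated CloudFacade. The facade and field names follow the generated client referenced above (`juju/client/_client5.py`), but the exact AddCloud signature varies by facade version, so treat the parameters as assumptions.

```python
# Sketch: add a cloud via the CloudFacade on a live controller connection,
# per cory_fu's advice above. Field names mirror the generated _client code
# and may differ between facade versions.
from juju.controller import Controller
from juju.client import client

async def add_k8s_cloud():
    controller = Controller()
    await controller.connect()  # current controller from local juju data
    facade = client.CloudFacade.from_connection(controller.connection())
    cloud = client.Cloud(
        type_='kubernetes',
        auth_types=['userpass'],
        endpoint='https://1.2.3.4:6443',  # placeholder endpoint
    )
    await facade.AddCloud(cloud=cloud, name='my-k8s')
    await controller.disconnect()
```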
[19:58] cory_fu: aisrael right, stickupkid and I can help review/QA and such. I know stickupkid was consulted earlier today around pre-imp ideas and the like
[19:58] rick_h: much appreciated! It sounds like David's already making progress, thanks to stickupkid
[20:20] hello
[20:20] hey bdx
[20:20] how goes things out west?
[20:20] battles
[20:20] lol
[20:20] wheeee
[20:20] * rick_h collects his chain mail
[20:21] I have a peer relation that is giving me "ERROR permission denied"
[20:21] rick_h: ever heard of this?
[20:22] bdx: hmmm, in the unit log?
[20:22] yeah https://paste.ubuntu.com/p/2rHnhYJKNz/
[20:23] bdx: https://bugs.launchpad.net/juju/+bug/1818230 looks like yes...
[20:23] Bug #1818230: k8s charm fails to access peer relation in peer relation-joined hook
[20:23] thats it
[20:24] bdx: but honestly no, hadn't run across it. Thinking...
[20:24] looks like that was targeting 2.5.9
[20:24] Im running 2.7beta1
[20:24] geh
[20:24] yea, filed back in march and seems like it never went anywhere
[20:25] stub: did you find anything re: that or did that just get dropped? ^
[20:28] bdx: is your charm also a k8s charm?
[20:28] yeah
[20:28] ok, it's really unclear what permission is at issue here...
[20:29] chown -R root:root / && chmod -R 777 /
[20:29] magicaltrout: hah, "when all else fails"
[20:29] i've got all the best hacks
[20:30] may as well close that bug
[20:30] magicaltrout, great work here
[20:30] lol
[20:30] bdx: lol
[20:31] bdx: are you setting a pod-spec in this hook?
[20:32] bdx: looking at http://bit.ly/322LBrG there's a bunch of tests with that error around setting a pod spec without values
[20:33] https://github.com/omnivector-solutions/layer-slurmd-k8s/blob/7d6987a8a0b186486ad08854e6c12a60977ea3b5/src/reactive/slurmd_k8s.py#L130,L153
[20:36] oh no
[20:36] oh?
[20:37] this is not good
[20:37] does this code line up with what's running? the line before the permission denied is the @when('slurmd.initialized')?
[20:38] * rick_h walks back and away slowly...
[20:38] throw it all away and crack out terraform!
[20:38] lol
[20:38] I hear that has no bugs and never does anything bad ;)
[20:39] bdx: so what's bad?
[20:39] rick_h: from the charm code, nothing is gating the peer relation handler from running except being the leader and the relation.join
[20:40] bdx: yea, I get that
[20:40] but looking at the log and what code is being impacted, the line before is odd in that it's just the @when('slurmd.initialized')
[20:41] so like ... I'm guessing the only way it could run before the @when('slurmd.initialized') is if multiple units are deployed simultaneously
[20:41] maybe
[20:41] but yeah, from the log, you are right that it comes last
[20:41] bdx: can we put some debug output in to see which line is causing the permission denied?
[20:42] yeah
[20:42] I assume it must be the iterating over peer._data, but that seems odd so I'm questioning assumptions that this is what's running
[20:42] root=
[20:42] ?
[20:42] unit=INFO
[20:43] I more meant updating that get_slurmd_peers.... with some print("got 1")
[20:43] and 2 and 3 and see if we can tell right where it goes boom
[20:43] gotcha, perfect, on it
[20:43] ty
[20:43] * rick_h refills coffee while you do that
[20:47] morning team
[20:47] morning thumper
[20:47] https://github.com/juju/juju/pull/10702 for more logger threading
[20:54] rick_h: I set a log("DEBUG STAGE #") statement on every other line in that handler
[20:55] the traceback is getting thrown before any code in the handler executes
[20:55] not seeing any of the debug statements
[20:57] cmars: whats the deal with your kafka charm?
[20:58] bdx: ok, so that's good/bad
[20:58] I'm updating all the big data charms, but don't really wanna update the kafka one with the bigtop version, as it moves slowly compared to the upstream stuff, which is fine for hadoop etc but not as much for kafka
[20:58] i'm considering forking zookeeper into a non-bigtop version also, just cause you don't need all the crud for zk
[21:01] rick_h: do you think increasing the verbosity of the unit log would be helpful here?
[21:03] well I bumped it up, doesn't show anything useful around this error
[21:03] magicaltrout: hey! we're using it in production for an internal project. it might need some work to support other use cases. currently requires tls certs for clients, for example
[21:03] magicaltrout: we've also forked zookeeper
[21:04] magicaltrout: https://github.com/cloud-green/zookeeper-snap-charm
[21:04] magicaltrout: the kafka charm is https://github.com/cloud-green/kafka-snap-charm
[21:05] magicaltrout: we have a few other charms under cloud-green, a jmx-exporter to prometheus, and a cert-manager to help organize client certs for the kafka clients
[21:06] thats cool cmars!
[21:07] if we could make tls certs optional that would be good, but the cert-manager and tls support is a cool feature for something that's otherwise a pain to set up
[21:18] magicaltrout: are you using kafka streams at all?
[21:19] we've found it pretty nice for our use cases so far.. but there's a learning curve
[21:26] thumper: https://github.com/juju/juju/pull/10702 LGTM
[21:26] hpidcock: ta
[21:28] rick_h: possibly a good test would be to take out the peer relation code and see if the issue persists when adding a second unit
[21:35] I've commented out all charm code pertaining to the peer relation, running a deploy now
[21:37] commenting out all peer relation charm code allowed the second unit to deploy successfully https://paste.ubuntu.com/p/fNPGgmGbMR/
[21:38] now, possibly I should add back in bits of the peer relation to see where it breaks
[21:46] alright, I have somewhat of a data point
[21:47] when the only peer relation code was that in metadata.yaml https://github.com/omnivector-solutions/layer-slurmd-k8s/commit/9f39b24b32d94c17a983b61ba513eb12abb1694a
[21:50] adding a second unit of the application worked; there was a warning message in the log: WARNING unit.slurmd/1.juju-log slurmctld:3: No RelationFactory found in relations.slurmd.peers
[21:50] but no relation-get error, everything successfully deployed with no hook errors
[21:51] the next step I took was to add back the relation factory to peers.py
[21:51] which is where it broke again
[21:52] https://github.com/omnivector-solutions/layer-slurmd-k8s/commit/71455dcebfb793f706bd76d57a5b3a420ba74d5f
[21:53] the charm handler for the peer relation is still commented out https://github.com/omnivector-solutions/layer-slurmd-k8s/blob/debug_peer_relation/src/reactive/slurmd_k8s.py#L130,L156
[21:54] the error happens with just ^
[21:54] I'll add this to the bug, srry for spamming
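For reference, a minimal sketch of the instrumentation bdx describes above: a log marker around each statement of the peer handler so the traceback can be pinned to a specific line. The flag and endpoint names here only loosely mirror the linked slurmd-k8s layer and are assumptions, not the charm's actual code.

```python
# Hypothetical reconstruction of bdx's 'log("DEBUG STAGE #") on every other
# line' instrumentation; flag/endpoint names are assumed, not the real charm.
from charms.reactive import when_all
from charmhelpers.core.hookenv import is_leader, log

@when_all('slurmd.initialized', 'endpoint.slurmd-peer.joined')
def get_slurmd_peers(peers):
    log("DEBUG STAGE 1")
    if not is_leader():
        log("DEBUG STAGE 2: not leader, bailing")
        return
    log("DEBUG STAGE 3")
    # Iterating the peer relation data was the suspected failing call --
    # per the log above, the traceback fires before any marker prints.
    for unit in peers.all_joined_units:
        log("DEBUG STAGE 4: saw peer {}".format(unit.unit_name))
```

As bdx reports, none of these markers appear before the traceback, which points at the relation machinery itself rather than the handler body.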
[22:01] ok, here's a slightly more coherent version of this rambling https://bugs.launchpad.net/juju/+bug/1818230/comments/3
[22:01] Bug #1818230: k8s charm fails to access peer relation in peer relation-joined hook
[22:30] bdx: thanks for all the input, we'll take a look
[22:30] thx