[02:07] o/
[03:49] hi, has anyone seen an issue where juju just entirely refuses to work?
[03:50] as in, I type "juju status" and it just hangs
[03:50] i am running ubuntu 16, I have restarted my computer, I have reinstalled juju
[03:51] i had been using it successfully to follow tutorials, but it has since stopped working
[03:51] babou: There are a few bugs around which cause jujuds to become very busy; restarting jujud-machine-0 should sort them out
[03:51] ^ That's for juju 1.x; for 2.x it might be called something slightly different
[03:52] i'm running 2.0.2
[03:53] It will be called something similar (maybe the same); ssh directly to your juju controller and systemctl restart 'juju*'
[03:53] i'm running locally with lxd
[03:55] in that case I would enter that command on my local machine, right?
[03:56] babou: Within your lxd container
[03:56] aah
[03:57] babou: This could be something different as well, like your networking not set up quite right or something.
[04:02] that doesn't seem to have worked
[04:02] tried what you said and tried restarting the container externally
[04:02] i can use a cloud container though, so I will just use that for now
[04:03] babou: It could easily be something else then
[04:03] any other ideas?
[04:03] have a look in ~/.local/share/juju
[04:04] there's a file called controllers.yaml
[04:04] find your controller, and there should be a line in its section like this: api-endpoints: ['10.1.10.7:17070']
[04:04] check that you can "nc -vz 10.1.10.7 17070"
[04:05] i have no file called controllers.yaml
[04:05] hmmm
[04:06] what does "juju list-controllers" say?
[04:06] wait yes i do
[04:06] phew
[04:08] i found the line and tried this command
[04:08] nc -vz 10.187.163.241:17070
[04:08] but it just printed the help message
[04:24] babou: no colon between IP & port
[04:24] nc: connect to 10.187.163.241 port 17070 (tcp) failed: No route to host
[04:25] babou: Sounds like network setup then
[04:34] ok thank you for your help
[07:51] Good morning Juju world
[09:25] morning folks, reading through the docs on constraints (https://jujucharms.com/docs/stable/charms-constraints) was wondering if it was possible to specify a constraint in a file ahead of time, so for example, all controllers bootstrapped on the given cloud will have that constraint?
=== frankban|afk is now known as frankban
=== deanman_ is now known as deanman
[13:35] mattyw: did you read this here? https://jujucharms.com/docs/stable/charms-constraints#using-constraints-when-creating-a-controller
[13:43] holocron, I did thanks, I was hoping to have a file somewhere so I wouldn't have to remember to do it on bootstrap
[13:47] Matty, I see.. you want to do it post-bootstrap. That I'm not sure about
[13:48] Ah, you want to apply it to all controllers. Now I got it
[13:51] holocron, the other way around, I want it to happen to all future bootstraps
[13:51] holocron, all future bootstraps for a given cloud anyway
[13:51] holocron, if for example I'm too cheap to want to use m3.medium for all my instances
[13:52] holocron, but I'm also too lazy to remember how cheap I am
[13:55] I see.. sorry I don't know of a way. All my juju use cases are local LXD or manual
[13:57] mattyw: alias cheapbootstrap=juju bootstrap --constraints... ? :P
[13:57] mattyw: no, there's no "constraints via file" currently.
[14:56] help I do "juju deploy mysql" but it says connection refused
[14:58] what says connection refused, rts-sander?
[15:05] Hi @kjackal
[15:05] Merlijn_S: Hello!
[15:06] Merlijn_S: How can I help?
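The controllers.yaml check above is easy to script. Here is a minimal sketch in Python, assuming the default juju 2.x client data directory mentioned in the log; the output format is illustrative:

```python
import socket
import yaml  # pip install pyyaml
from pathlib import Path

# Default juju 2.x client data directory, per the advice above.
path = Path.home() / ".local/share/juju/controllers.yaml"
controllers = yaml.safe_load(path.read_text())["controllers"]

for name, info in controllers.items():
    for endpoint in info["api-endpoints"]:
        host, port = endpoint.rsplit(":", 1)
        try:
            # Same test as "nc -vz <host> <port>": can we open a TCP socket?
            socket.create_connection((host, int(port)), timeout=5).close()
            print(f"{name}: {endpoint} reachable")
        except OSError as e:
            print(f"{name}: {endpoint} unreachable ({e})")
```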
[15:06] I'm working on a setup for a spark course. ~20 teams of students, each get their own jupyter notebook with pyspark integration that's connected to a shared spark-on-yarn + hdfs cluster
[15:07] Merlijn_S: sounds nice!
[15:07] now the issue is that while a sparkcontext is active, it is a running job in yarn. This means that it blocks resources, even though nothing is running.
[15:07] ~20 teams means ~20 spark contexts
[15:09] yes, true. Is there any parameter we can play with? Let me see if I find something.
[15:09] each sparkcontext has its own yarn job that marks a certain amount of resources as "used", so the cluster runs out of resources.
[15:09] magicaltrout, https://justpaste.it/12280
[15:10] I guess I could download the charm and then deploy it from local
[15:10] rts-sander: looks like the server you're trying to run it on is behind a firewall
[15:10] or missing a route to the interwebs
[15:10] juju version 1.25.6-xenial-amd64
[15:11] I'm also not sure how the yarn resources work exactly. Does each "application container" get a fixed amount of resources or does this depend on the job itself?
[15:11] what are you doing running 1.25 anyway?! :P
[15:11] because we were going to use it in production before 2.0 was out of beta
[15:11] but now we switched to 2.0 but I'm too lazy to upgrade
[15:11] hehe
[15:11] well
[15:12] if I go to the url in your error in my browser
[15:12] it downloads the charm
[15:12] so i guess it's connectivity of some sort
[15:12] rts-sander: magicaltrout exactly
[15:12] same here
[15:12] mysql.zip
[15:12] rts-sander: it looks like you can't reach api.jujucharms.com to get the zip of the charm from the store
[15:12] rts-sander: so the only thing to do is to get the charms locally and deploy that way if you can't fix the connectivity
[15:13] I can wget it
[15:13] rts-sander: there's a download link here: https://jujucharms.com/mysql/ on the right side to get the zip
[15:13] rts-sander: right, but can the machine where the controller lives get to it?
[15:13] * rick_h goes back to read where the juju controller is sitting (local or on an openstack or?)
[15:13] it's a local deployment
[15:14] rts-sander: hmm, so is lxc set up with a bridge where it can reach the internet through your computer then?
[15:14] yes
[15:14] * rick_h thought we set that up in 1.25
[15:14] rts-sander: maybe an intermittent issue? did you try more than once and get it?
[15:14] I tried multiple times
[15:14] when I first installed it I could get charms through it
[15:15] I'll deploy from local, it will take less time than trying to figure this out
[15:15] rts-sander: hmm, that sounds like something intermittent then perhaps. I guess can you try now? If you can wget the file and lxc is set up to work.
[15:23] Merlijn_S: this looks like a good description of YARN resource management: https://www.cloudera.com/documentation/enterprise/5-3-x/topics/cdh_ig_yarn_tuning.html
[15:25] Merlijn_S: And this is how you set the cores and mem from spark https://www.mapr.com/blog/resource-allocation-configuration-spark-yarn
[15:28] Merlijn_S: If I understand your request correctly you need to either overcommit resources so that the utilisation increases or do some kind of dynamic allocation/resource preemption
[15:31] kjackal: We're running the slaves inside LXD containers; running multiple slaves in multiple lxd containers solves the issue by overcommitment, but I worry that when they start to run real queries everything will bork...
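The download-then-deploy-locally fallback rts-sander settles on could look roughly like this. The archive URL below is illustrative only (the actual link from rts-sander's error paste is not reproduced here; use whatever the download link on https://jujucharms.com/mysql/ points at):

```python
import urllib.request

# Illustrative URL -- substitute the real download link for the
# charm revision you want from https://jujucharms.com/mysql/
url = "https://api.jujucharms.com/charmstore/v5/mysql/archive"
urllib.request.urlretrieve(url, "mysql.zip")
# With juju 1.25, unpack the zip into a local repository layout
# (e.g. ./trusty/mysql) and deploy it with:
#   juju deploy --repository=. local:trusty/mysql
```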
=== scuttle|afk is now known as scuttlemonkey
[15:45] kjackal: So if I understand correctly, it's indeed spark that decides what the size of the container is. The size of the container is the same no matter how much is actually used in the jobs.
[15:46] kjackal: So ideally, I find a way to make the sizes of these containers dynamic depending on what actual jobs are running. If that doesn't work then I'll just have to find a way to make the spark containers smaller.
[15:48] Merlijn_S: give me some time. In a meeting
[15:48] kjackal: Related: How do the bigtop charms calculate what the "total memory" of a machine is?
[15:48] kjackal: aight :)
[16:11] cory_fu: I updated https://github.com/juju-solutions/matrix/pull/63/files, with code that catches the TestFailure in a slightly better place. (Also set the exit code to 1 if we have a generic uncaught Exception.)
[16:11] Merlijn_S: I am back. Not sure how bigtop queries for the available memory. I will have to look at the code, if you want me to.
[16:11] petevg: Excellent
[16:14] Merlijn_S: I would start with the configuration of spark. Like spark.executor.cores as described here: http://spark.apache.org/docs/1.6.1/configuration.html
[16:14] Merlijn_S: with --conf you are able to set the config variables in spark-submit
[16:16] kjackal: I'm using pyspark; any idea how those config values translate to that?
[16:16] Merlijn_S: I see there is a SPARK_CONFIGURATION variable in the jupyter configuration (never used that)
[16:18] Merlijn_S: are you using this pyspark? http://spark.apache.org/docs/latest/api/python/pyspark.html#pyspark.SparkConf
[16:19] kjackal: I'm using findspark: https://github.com/minrk/findspark
[16:20] nice
[16:21] kjackal: so that translates to: http://spark.apache.org/docs/latest/api/python/pyspark.html#pyspark.SparkContext
[16:21] Seems so
[16:21] Merlijn_S: and you can set the configuration there
[16:27] kjackal: k, thx, I'll take a look at that.
[16:30] kjackal: I'm also interested in the bigtop available memory logic. Currently, it only uses 2/3 of the real memory on the machines.
[16:31] kjackal: and it differs a lot between machines that have the same hardware.
[16:32] kjackal: scrap that, it's the same; it's 2/3 of the total memory of each machine.
[16:42] kjackal: ever heard of 'dynamic resource allocation' of spark contexts? That might be what I'm looking for: https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
[16:55] Merlijn_S: dyn res allocation won't be easy. it looks like it requires a jar + yarn-site.xml config changes to all nodemanagers, and we're currently not exposing those config options :/
[17:03] Merlijn_S: sorry i'm playing catch-up here.. but you also said "now the issue is that while a sparkcontext is active, it is a running job in yarn". are you in yarn-client mode? if so, the driver (which instantiates a sparkcontext) does not run on the yarn cluster. it runs wherever you start the spark application.
[17:04] Merlijn_S: if you're in yarn-cluster mode, of course, the driver will indeed take one of your yarn nodemanagers to do its thing.
[17:22] cory_fu: pushed one more update to that gating PR (I wrote some better .matrix tests, and found out that we couldn't turn off gating -- the update contains the fix + the tests.)
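Putting kjackal's SparkConf suggestion together with findspark, a notebook cell along these lines would shrink each team's footprint on YARN. This is a sketch: the app name and the memory/core values are illustrative, not tuned for any particular cluster:

```python
import findspark
findspark.init()  # locate the Spark install before importing pyspark

from pyspark import SparkConf, SparkContext

# Keep each student team's context small so ~20 of them fit on the
# shared cluster; values here are made up for illustration.
conf = (SparkConf()
        .setMaster("yarn-client")
        .setAppName("team-notebook")
        .set("spark.executor.memory", "512m")
        .set("spark.executor.cores", "1")
        .set("spark.executor.instances", "1"))

sc = SparkContext(conf=conf)
```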
[17:25] cory_fu: I kind of want to integrate those .matrix tests into tox, but your clever method of running main only works once -- the second time we try to do it, we've already set a bus and done other globally relevant stuff to matrix, so it doesn't work. I'm kind of tempted to write a script that sniffs out .matrix files in the test dir, and runs them via
[17:25] Python subprocess, checking for a 0 or non-zero exit code. Don't want that to delay the gating stuff, though. Will open a ticket ...
[17:28] petevg: Couldn't you just add them as additional test plans in the existing functional test?
[17:29] cory_fu: they have to be completely separate matrix runs.
[17:29] petevg: Oh
[17:42] kwmonroe: sorry, was out to get food. I'm talking about the sparkcontext itself, not the driver. The sparkcontext itself takes up space while it is alive. So every interactive spark session blocks unused resources.
[17:45] kwmonroe: So basically, a multi-tenant interactive spark cluster is more or less a no-go without dyn res allocation..
[17:51] I'm trying to remove a relation to re-add it, and juju remove-relation seems to work, but when I do juju add-relation it keeps saying the relationship exists. Is there a way to force-break the relation?
=== larrymi_afk is now known as larrymi
[17:54] * larrymi tries juju destroy-relation
[17:57] hmm.. destroy-relation doesn't work either
[18:00] larrymi - is the unit halted in a hook context after removing the relation?
[18:00] as in, debug-hooks or some such
[18:03] lazyPower: ah yes I see a config-changed hook failure
[18:04] lazyPower: not on the unit itself but on the other endpoint
[18:06] yeah, that will halt the removal of the relationship in the controller
[18:06] lazyPower: ah but that's not from the removal .. that was there before
[18:06] so it's in progress but not completed, and it's right that it's been removed but "isn't really removed"
[18:06] yep
[18:06] it's in the queue
[18:06] if you resolve that failure and step that remote unit through, it should hit the -departed and -broken hooks and clear itself up
[18:06] lazyPower: right
[18:07] lazyPower: ah that makes sense. Thanks! :-) I'll give that a try.
=== frankban is now known as frankban|afk
[18:57] rick_h: SHOWTIME?
[18:57] rick_h: got a link to the show ?
[18:58] accidental caps
[18:58] marcoceppi: arosales working on it
[18:59] hmm, won't let me check the "hangouts on air" button :/
[19:02] marcoceppi: arosales https://hangouts.google.com/hangouts/_/ytl/VusL3EZZPRDpX4QUibgOwiPCpKQKzhydHyEehU55OXw=?eid=103184405956510785630&hl=en_US&authuser=0
[19:02] to join the hangout
[19:02] had to create a new one :(
[19:02] https://www.youtube.com/watch?v=n86tRu4xpYU to watch
[19:04] rick_h: thanks, omw
[19:04] anyone else coming? jcastro
[19:04] natefinch: or redir or anyone want to join feel free
[19:06] https://www.youtube.com/watch?v=AfBnrLvDFy8 The Juju Show #3 | Last streamed live on 5 Oct 2016
[19:06] cfgmgmt session
[19:09] Why aren't we showing this today?
[19:09] arosales: coming or should we go on?
[19:09] CoderEurope: things and stuff, have to watch! :P
[19:10] rick_h: I have been waiting 3 days for this show - is it happening, or not?
[19:10] CoderEurope: yes, we were waiting for arosales
[19:10] but he's gone awol so we're going on now
[19:11] CoderEurope: the link to watch is here: https://www.youtube.com/watch?v=n86tRu4xpYU
[19:11] okay - I shall await ....
[19:11] rick_h: I have a fix for the lxd thing, so i'll be making a test and committing it. have fun, though!
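The runner petevg floats at 17:25 above could look roughly like this. Everything here is hypothetical: the tests/ directory, the plan-file layout, and the `-p` flag are made up; it only assumes a `matrix` console script that exits non-zero when a test plan fails:

```python
#!/usr/bin/env python3
"""Run every .matrix test plan in tests/ as a separate process.

A fresh interpreter per run sidesteps the problem of calling
matrix's main() twice in one process (bus already set, etc.).
"""
import subprocess
import sys
from pathlib import Path

failures = []
for plan in sorted(Path("tests").glob("*.matrix")):
    # "-p <plan>" is a hypothetical flag for "run this plan file".
    result = subprocess.run(["matrix", "-p", str(plan)])
    if result.returncode != 0:
        failures.append(plan.name)

if failures:
    print("failed plans:", ", ".join(failures))
sys.exit(1 if failures else 0)
```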
[19:12] rick_h: please start without me
[19:14] https://github.com/juju/python-libjuju
[19:17] rick_h: I neglected to see the examples directory in libjuju
[19:17] is that a community member that just joined?! hype!
[19:18] there are a bunch of examples that I seem to have missed, but it was a good learning experience nonetheless
[19:19] rick_h is it unfiled if we just discovered it today? :)
[19:19] hehe
[19:20] QUESTION: First of all, thank you for the videos \o/ - they're fabulous for the community! 2ndly, a Discourse juju charm (modern forum type) is really needed for the LUG I am involved with - what are the chances?
[19:21] oooo its mreed hype
[19:21] CoderEurope - we had a very old discourse charm, however we're interested if there's a community member that's interested in maintaining the discourse charm
[19:21] CoderEurope: there is a community developed Discourse charm @ https://jujucharms.com/u/marcoceppi/discourse/precise/27
[19:21] thejujushow: the libjuju work I've been cutting has already made it into my jenkins post-build step .... I'm simply using it to run actions on my juju deployed applications following successful build step completion
[19:23] lazyPower: I would but it may be beyond me .. but I have spoken about this before :D
[19:23] CoderEurope - well, if your LUG is interested in learning, we can certainly host a charm school for anyone interested
[19:23] free training to get you up and moving with juju with a focus / eye towards developing the discourse charm
[19:24] http://summit.juju.solutions
[19:25] bdx_: are you James? Have you looked at the ci pipeline bundle? It's more or less what we wanted to build..
[19:26] bdx_: https://jujucharms.com/u/juju-solutions/cwr-ci/ for reference, is what I think Merlijn_S is referencing
[19:26] Merlijn_s: yes, it's me
[19:26] bdx_: Yes. + they're working on support for testing and publishing full bundles. preview here: https://github.com/juju/python-libjuju
[19:26] Merlijn_s: niceeeee
[19:26] marcoceppi - juju status-history
[19:26] juju show-status-log
[19:26] ^
[19:27] that
[19:27] show-status-log
[19:27] sparkiegeek high5 on gmta
[19:27] lazyPower: o/ :)
[19:27] bdx_ : sorry, preview ci for bundles here: https://github.com/juju-solutions/layer-cwr/tree/build-bundles
[19:27] 32 days....
[19:27] bugger, better come up with some content
[19:27] alias wjst="watch -n 0.2 -c juju status --color"
[19:29] Merlijn_s: for me it's more about integrating application ci with juju though
[19:29] marcoceppi - this may be a good time to bring up layer-debug
[19:29] Merlijn_s: not so much CI on the charms themselves, which I think is more your use case right?
[19:29] I am not familiar with layer-debug.
[19:30] definitely bring that up
[19:30] QUESTION: Not really a question - but, but ... I would just like to say that beards win in the show today :D
[19:30] skay - we (the eco team) wrote a layer to provide scaffolding for reporting on a unit to do debug. In our case we wanted network info dumps, container / file descriptor info dumps, fs dumps, etc.
[19:30] so i hope they cover it, might be fun :D
[19:31] bdx_: I'd think that's more or less the same issue. Since charm tests can be whatever, I'd think you would just create a charm test that starts the tests of the actual application inside the charm..
[19:31] https://bugs.launchpad.net/juju/+bug/1645422 - retry-provisioning bug
[19:31] Bug #1645422: retry-provisioning doesn't retry failed deployments on MAAS
[19:34] Merlijn_S: application CI vs charm CI ....
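bdx_'s jenkins use case at 19:21 -- driving actions on deployed applications from Python -- looks roughly like this with libjuju. A sketch only: the application and action names are made up, and the exact connect call varies across libjuju releases:

```python
import asyncio
from juju.model import Model

async def run_smoke_test():
    model = Model()
    # Connects to the currently active model in older libjuju releases.
    await model.connect_current()
    try:
        # "myapp" and "smoke-test" are illustrative names.
        unit = model.applications["myapp"].units[0]
        action = await unit.run_action("smoke-test")
        action = await action.wait()  # block until the action completes
        print(action.status, action.results)
    finally:
        await model.disconnect()

asyncio.get_event_loop().run_until_complete(run_smoke_test())
```

This is essentially a post-build step: if the action fails, the script can exit non-zero and fail the jenkins job.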
we have a team that writes selenium test suites for each of our apps
[19:35] rick_h: true that
[19:35] bdx_: I have no experience with application CI so I'll just shut up :)
[19:36] Merlijn_S: lol ... oh man .. let's catch up soon
=== tris- is now known as tris
[19:40] rick_h: FYI: the Google logo is obscuring the first line of your terminal
[19:42] QUESTION: Does interactive juju add-cloud pre-seed potential MAAS API endpoints from ~/.maasclidb ?
[19:44] ooh nice
[19:44] no more tickling Juju in three different ways to bootstrap in MAAS
[19:45] cory_fu: updated the crashdumping/gating PR to address your comments: https://github.com/juju-solutions/matrix/pull/63/files
[19:45] cory_fu: if you have the branch checked out locally, you'll need to delete it and refetch -- someone's "trivial" indentation fix created conflicts that I needed to clean up with a rebase :-p
[19:46] petevg: :) Sorry about that
[19:47] Marco: I really think it would be solid to talk about layer-debug even if only briefly
[19:47] Is cool.
[19:47] marcoceppi ^
[19:47] if you have k8s deployed, you can just run it :)
[19:50] QUESTION: are there any ACLs planned for actions?
[19:50] I might not want everyone to be able to get debug information
[19:50] cover it next time, all good :)
[19:50] bye guys !
[19:50] Great show fellas
[19:50] \o/
[19:51] lazyPower: Was it you that was going to ping me about the discourse Xenial charm?
[19:51] CoderEurope - that's marcoceppi, but i'm happy to sit in and provide feedback/assistance
[19:51] okay cheers.
[19:51] CoderEurope: o/
[19:51] o/
[19:52] so I wrote this a long /long/ time ago. But they've moved to docker distribution. So it should be pretty straightforward to re-charm again where we run their bits in their docker package and then bolt on the rest from the outside
[19:52] right-oh
[19:52] CoderEurope: would you have time later this week, or next week, to hash it out a bit?
[19:54] marcoceppi: I have a new laptop coming in this/next week - had to stretch to the chromebook just to tune in today :D
[19:54] mbruzek: do you have a link for the layer to stick in the show notes?
[19:54] CoderEurope: cool, give me a ping here when you're set up
[19:54] yes
[19:54] marcoceppi: cool-beans !
[19:54] mbruzek: ty
[19:54] rick_h: https://github.com/charms/layer-debug
[19:55] welcome
[19:58] marcoceppi: jcastro either of you know why the comments are disabled when I look at the video, but if I go to the edit/advanced settings, allow comments is checked?
[20:00] rick_h: is it published?
[20:02] marcoceppi: hmm, says "public"; not seeing anything addressing "published" specifically
[20:04] I still have problems accessing the links in the show notes in iOS - the long links get truncated by YouTube (fine) but it actually changes the href to be truncated too :(
[20:04] rick_h: think you could use a link shortener to work around their bug?
[20:06] sparkiegeek: doh
[20:06] sparkiegeek: you're asking for an increased production level here. I feel like I need to go to film school for this :P
[20:08] rick_h: well normally I wouldn't pay it any attention, but there's this guy I follow on Twitter who keeps harping on about them :)
[20:08] sparkiegeek: lol
[20:28] in python world can i do status_get and get the message?
[20:30] oh yeah it appears it already does
[20:31] hey all, picking up after a few weeks. is juju on OSX compiled with Go 1.7 yet?
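On the status_get question at 20:28: in charmhelpers (the usual "python world" for charm hooks), status_get does return the message alongside the workload state. A minimal sketch, assuming a charmhelpers version recent enough to return the tuple:

```python
# Inside a charm hook: charmhelpers wraps the status-get / status-set
# hook tools. status_get() returns a (workload_state, message) tuple.
from charmhelpers.core.hookenv import status_get, status_set

state, message = status_get()
print("current state: %s, message: %s" % (state, message))

# And the counterpart, setting both the state and the message:
status_set("active", "ready to serve")
```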
[20:33] justicefries it is not
[20:33] i still get the random go stacktracing here and there
[20:36] magicaltrout yep :D
[20:36] rick_h - wasn't there some talk about the go 1.7 bit and how we handle that for brew?
[20:55] petevg: I added a comment reply to your PR, but I really think that we need to keep the test with glitch in and figure out how to make it work with the gating. That was the entire point of adding the gating logic in the first place.
[20:55] cory_fu: I thought that we had decided not to run glitch by default.
[20:56] cory_fu: ... because it conflicts with the goal of providing a reliable automated testing framework for now.
[20:56] petevg: The reason we added the gating flag was because we wanted to still be able to run the glitch test but not have it influence the success or failure
[20:56] So it would be informational but not affect pass / fail
[20:57] cory_fu: yes. But unless we add a generic Exception check, we can't do that, because we also fail on uncaught Exceptions.
[20:57] cory_fu: we could also just not throw an error on uncaught exceptions if the task doesn't gate, but that feels more likely to mask genuine issues to me.
[20:58] We need to address that, then. And I don't think a blanket catch is the right solution. What exception is being thrown and why? Is it a bug in glitch, or is it an expected failure case? If the latter, then it should be converted to a TestFailure or added to the gating logic
[20:58] cory_fu: I'm talking about things that you posted -- the NoneType surfacing because of a bug in core.
[20:59] cory_fu: I've caught that by checking for TypeError, along with some other general Python errors, but I don't know what other errors libjuju or core will throw (especially as libjuju is under rapid development).
[20:59] cory_fu: I want matrix to surface those errors, actually. But I don't want to do it in a way that messes up the automated testing of charms bit.
[20:59] cory_fu: I think that the best way to wind up with a glitch that gates in a controlled manner is to turn any Exception that comes up during a glitch run into a TestFailure.
[21:00] cory_fu: that way, if we gate on glitch, we see the error. If not, it's in the logs, but it doesn't break stuff.
[21:00] petevg: How is that NoneType error not a bug in glitch / matrix?
[21:00] cory_fu: it's a bug in core/libjuju.
[21:00] cory_fu: it's one of the things that can happen if you try to send a command to a machine that is rebooting.
[21:00] cory_fu: you periodically complain about it in PRs, but then you also complain about it when I try to fix it in a generic way :-p
[21:01] I don't think that NoneType error is a bug in libjuju. I think it's a bug in matrix in that it continues to try to perform glitch operations after encountering a TestFailure
[21:01] cory_fu: The NoneType error happens in libjuju.
[21:02] petevg: Sure it does, because we're trying to call model methods after the connection was closed
[21:02] cory_fu: and if I mask it in glitch by adding a check for it in the reboot action, I disable glitch's ability to tell us whether or not that error is fixed.
[21:02] cory_fu: let's jump into the hangout.
[21:18] cory_fu: wrote code, tested, and pushed the restored matrix.yaml, and Exception-catching glitch.
[21:19] Thanks
[21:19] I'll take a look in a minute
[21:34] cory_fu: ack. Didn't check in the "gating: false" lines in the matrix.yaml. Fixed and pushed.
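petevg's proposal at 20:59 -- turn any exception raised during a glitch run into a TestFailure so gating stays controlled -- could look roughly like this. All names are hypothetical and do not reflect matrix's actual module layout:

```python
from functools import wraps

class TestFailure(Exception):
    """Raised when a matrix test should be reported as failed, not errored."""

def gate_glitch(coro_fn):
    """Wrap a glitch coroutine so any uncaught exception (e.g. a NoneType
    error from libjuju after a connection closes) surfaces as a TestFailure.
    A gating run then fails cleanly; a non-gating run just logs it."""
    @wraps(coro_fn)
    async def wrapper(*args, **kwargs):
        try:
            return await coro_fn(*args, **kwargs)
        except TestFailure:
            raise  # already the right type; pass it through
        except Exception as e:
            raise TestFailure(
                "glitch raised an uncaught error: %r" % e) from e
    return wrapper
```

The original error stays attached via `from e`, so the logs still show the underlying libjuju traceback rather than masking it.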
[21:57] tvansteenburgh: Quick PR for you: https://github.com/juju/python-libjuju/pull/40
[22:06] do bundles need metadata.yaml's to be in the charm store?
[22:08] jhobbs: nope, just a readme and bundle.yaml
[22:10] ahh it was the README i was missing
[22:10] thanks marcoceppi
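For reference, the minimum marcoceppi describes is just two files in the bundle directory: a README and a bundle.yaml. A sketch of a minimal bundle.yaml from this era; the charm reference and unit count are illustrative:

```yaml
# bundle.yaml -- minimal illustrative bundle; pair it with a README.md
# in the same directory before pushing to the charm store.
services:
  mysql:
    charm: cs:mysql
    num_units: 1
```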