=== meetingology_ is now known as meetingology
[00:45] how do I kill an app in error state?
[00:47] redir: do u want to resolve the error or remove the app?
[00:47] remove
[00:47] anastasiamac: ^
[00:48] redir: juju remove-application?
[00:48] yeah that doesn't get rid of it
[00:48] redir: what does it say?
[00:49] remove-app says nothing anastasiamac
[00:49] just returns
[00:49] but the app (ubuntu) is still there in an error state and the machine is still there too
[00:50] redir: do u know what's the error? anything in the logs?
[00:50] ERROR juju.worker.uniter.operation hook "install" failed: exit status 127
[00:51] but it seems to keep renewing the leadership lease
[00:52] redir: in the past, we'd restart jujud on the machine to make sure that leadership gets kicked too... mayb it'll help here...
[00:53] redir: i think it's worth a bug tho with repro steps...
[00:53] nope nope, not killing any machines on this maas since it is sooo hard to get the nodes up and running.
[00:54] and there's no one there to wiggle the wires for 2 weeks
[00:54] all I did was `juju deploy ubuntu --series precise --to kvm:0`
[00:55] redir: not killing the machine... just re-starting jujud..
[01:17] menn0: the logs I had from IS had messages for the transactions we were running. they were all for lease & status. there were just a lot more entities than I thought there were
[01:17] menn0: so 9000 txns over 2 hours isn't unusual in their env
[01:18] trick-or-treaters are starting
[01:18] * redir eods
[01:18] still not sure why there was a spike (the previous period had only ~1000)
[01:18] axw: hmm interesting
[01:23] wallyworld: I think my change would be simpler if I ripped out the blank model uuid from the api first.
[01:23] babbageclunk: i didn't really have enough context to comment
[01:24] model uuid in 2.0 should not be blank though i don't think
[01:26] wallyworld: https://github.com/juju/juju/blob/staging/apiserver/utils.go#L47
[01:27] wallyworld: Where do I need to check?
I guess all of the clients?
[01:28] babbageclunk: from memory, i tried taking that check out recently and stuff failed. i can't recall exactly what though
[01:28] i think right now there's still a legitimate use case for empty model uuid in that context
[01:29] backup might have been one case
[01:29] wallyworld: Ah, ok - I'll leave it alone then.
[01:29] i was hoping we could remove that check
[01:30] it may be that there's just legacy code to fix, but am not sure now
[01:34] wallyworld menn0: monitoring is paying dividends... just found a mongo session leak
[01:34] yay!
[01:34] axw: where?
[01:35] wallyworld: apiserver/logsink. we've got a loop that never breaks as long as the apiserver is running
[01:35] wallyworld: so the session it copies at the start of the handler never gets closed
[01:35] wallyworld: the client reconnects and starts a new one when the client conn breaks, and so we have a slow buildup of goroutines & sessions
[01:36] how often does a client conn break?
[01:36] you mean each new incoming api request?
[01:38] wallyworld: well in my env, there's been 4 new goroutines added for this in the past 90 minutes. pretty slow buildup
[01:38] wallyworld: it's not one per API request, this is the logsink thing. it's a long-lived websocket
[01:39] makes sense
[01:39] anyone know how vsphere configuration is supposed to work?
[01:40] I can't tell if the code has a weird bug or if it's doing what it's supposed to do
[01:41] nfi
[01:41] horatio is your man
[01:42] Horatio; a fellow of infinite jest, of most excellent fancy
[01:43] technically the second part is talking about yorick, but it's funnier to take it out of context
[01:44] axw: nice work. hooray for monitoring!
[01:48] natefinch: context, schmontext
[01:48] natefinch: I *may* know something about vsphere configuration also
[01:49] blahdeblah: cool, well, I think I figured it out, since I see code explicitly doing what I thought might be a bug.
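The leak pattern axw describes at [01:35] can be illustrated with a minimal sketch. This is not the actual apiserver/logsink code: `session` here is a stand-in for `*mgo.Session`, with a counter so the leak is visible. The bug shape is that the session copied at the start of the handler only gets closed on server shutdown, so every client reconnect strands another copy; the fix is to scope the close to the handler itself.

```go
package main

import "fmt"

// Illustrative sketch only — not the actual juju apiserver/logsink code.
// session stands in for *mgo.Session; open counts un-closed copies.
type session struct{ open *int }

func (s session) Copy() session { *s.open++; return s }
func (s session) Close()        { *s.open-- }

// leakyHandler mimics the bug: the session copied at the start of the
// handler is never closed when the client connection drops, because its
// cleanup was effectively tied to server shutdown instead.
func leakyHandler(root session, connClosed <-chan struct{}) {
	_ = root.Copy() // copy is never closed
	<-connClosed    // handler returns when the client disconnects
}

// fixedHandler closes its copy as soon as the handler returns.
func fixedHandler(root session, connClosed <-chan struct{}) {
	s := root.Copy()
	defer s.Close()
	<-connClosed
}

func main() {
	var open int
	root := session{open: &open}

	// four client reconnects, as in axw's 90-minute observation
	for i := 0; i < 4; i++ {
		done := make(chan struct{})
		close(done) // simulate the client conn breaking immediately
		leakyHandler(root, done)
	}
	fmt.Println("leaked sessions:", open) // 4

	open = 0
	for i := 0; i < 4; i++ {
		done := make(chan struct{})
		close(done)
		fixedHandler(root, done)
	}
	fmt.Println("leaked sessions:", open) // 0
}
```

Each leaked session also pins a goroutine, which is why the goroutine count was the tell-tale in the monitoring.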
In fact, there's like 100 lines of reflection to do it, for some odd and scary reason
[01:50] Well, in that case, I know nothing about vsphere :-)
[01:50] lol
[01:51] blahdeblah: thanks anyway :) What it was, was that the docs say to set your yaml like the example here: https://jujucharms.com/docs/stable/help-vmware
[01:51] where the regions are just empty... but then when it gets saved into your clouds.yaml, the regions get their endpoint set with the value from the cloud endpoint... which seemed like a very weird bug
[01:52] but looking at the code, it's intentional.... just seems weird to manually copy them out when they're saved rather than simply interpret them that way at runtime.
[01:52] axw: this is awesome.. i don't think we had logsink in 1.25... did we?
[01:53] anastasiamac: I think we did, I'll check
[01:53] anastasiamac: Are you thinking what I'm thinking? :-)
[01:53] axw: oooh.. it'd b great if we have a fix to put in before 1.25.7 goes out then :D
[01:53] blahdeblah: :D
[01:53] blahdeblah: always!!
[01:53] anastasiamac: you are definitely thinking what I'm thinking. :-)
[01:53] anastasiamac: it's there, but the code is different. I don't think the bug applies to 1.25
[01:54] d'oh!
[01:54] axw: interesting. os.Mkdir("/some/path", 0660) only creates the directory with 0640 and there's no error
[01:54] at least on my yakkety system
[01:54] so unit tests fail \o/
[01:55] wallyworld: umask
[01:55] ah
[01:55] wallyworld: try clearing the umask and running again
[01:55] ta
[01:55] forgot about that
[01:57] yep, that was it
[01:57] wallyworld: I guess IsolationSuite should clear it?
[01:58] yeah, i think so
[01:58] except not every suite embeds that sadly
[01:58] the one that failed for me doesn't
[02:01] wallyworld: looks like an awesome drive-by improvement :D
[02:01] fix every test suite?
[02:01] there's a lot of them
[02:02] if u r really keen :D but i was talking about the test (or its suite) that was brought up to ur attention
[02:02] it embeds BaseSuite.
I'd rather approach the problem holistically
[02:03] look at making BaseSuite embed IsolationSuite
[02:03] but there will be consequences
[02:03] i would have preferred for this suite to embed IsolationSuite rather than Base... It looks like it needs it
[02:04] piecemeal embedding IsolationSuite here and there isn't the best approach
[02:04] every suite should embed BaseSuite
[02:04] no it's not... but it gets us to a better place
[02:04] that gives standard log capture, cleanup etc
[02:04] what does basesuite do that isolation doesn't?
[02:04] i think basesuite is more limited
[02:05] isolationsuite should b used where file systems r touched... off memory...
[02:05] JujuOSEnvSuite vs OSEnvSuite
[02:05] if you need logging to figure out why your tests failed, it means your tests are bad
[02:05] i'm not talking about that
[02:05] log capture prevents logging leaking onto the console
[02:06] ahh
[02:15] fun regression in tests \o/ https://bugs.launchpad.net/juju/+bug/1466520
[02:15] Bug #1466520: serviceSuite setup fails
[02:25] Bug #1466520 changed: serviceSuite setup fails
[02:25] Bug #1514922 changed: Deploying to maas ppc64le with proxies kills websocket
[02:52] if anyone is feeling ambitious, interactive add-cloud: https://github.com/juju/juju/pull/6498 +1,670 −58
[03:02] axw: i think we should move the ec2 withDefaultControllerConstraints() method and generically apply it to all providers. i can't see a reason why we would not want to ensure a minimum memory spec in all cases
[03:03] wallyworld: we already do have a min memory of 1G for all instances, see environs/instances/instancetype.go. but perhaps we should do something controller-specific
[03:04] axw: yeah, that's my point. 1G is too small for controllers, eg azure gets 1.5GB. aws gets 3.75 which is what we'd want for all providers
[03:05] wallyworld: sure, sounds like a good idea.
I think we'll need tests for each provider to check which instance types get used by default tho
[03:05] otherwise we allow people to bootstrap with undersized controllers where steady state is say 2GB and it thrashes/pages all the time
[03:05] do we care what instance types get used?
[03:05] so long as the capabilities are met?
[03:06] and for openstack say, we don't know ahead of time what the instance types are called
[03:06] wallyworld: IIANM, if you set a min of 3.75G on azure you get something with ~7GB
[03:06] there's one with slightly less than 3.75GB that would probably be appropriate
[03:07] ok, i'll look at adding some form of sanity check or something
[03:07] wallyworld: for openstack we wouldn't do it, I'm just saying as a general sort of rule - to make sure we don't go costing people more than necessary
[03:08] yep, will add it to azure etc
[03:09] anastasiamac: you able to look at https://github.com/juju/juju/pull/6518 for me?
[03:10] wallyworld: that's the really big one :D
[03:10] not really
[03:10] a few files, but minimal changes
[03:10] wallyworld: i'll look in a sec.. i did see it coming in earlier and was postponing the pleasure
[03:10] only +192 :-)
[03:10] -156
[03:11] yep against 66 files... gimme a sec
[03:11] most straight search and replace
[03:11] would be far easier in review board :-(
[03:11] like grep? I love it when u use it :D
[03:11] since you'd have a paginated list of files
[03:34] anastasiamac: i tried to explain in the PR description why Must was called. all tests use Must - that's our standard practice. the other places used Host except where it was not possible, eg an existing method did not return error.
[03:34] there's no one explanation
[03:35] and the WIP PR will obliterate most of this anyway
[03:35] and the stuff this PR replaced all used to panic
[03:36] it's just that the method in utils was incorrectly named
[03:36] so all this core PR does is adapt to the new method name essentially
[03:36] agent/agent.go is not a test and it only has .Must calls....
[03:37] in var initialisations.. if we can panic, shouldn't it be handled differently
[03:37] ?
[03:37] right, as i said in the description, for var initialisation it is needed
[03:37] the same question for api/certpool?...
[03:37] right but that means that we can still panic
[03:38] sure, that's what it always ever did
[03:38] the old HostSeries used to panic
[03:38] so if we panicked before and will panic now, what's the improvement?
[03:39] wallyworld: i'd appreciate if u got a second opinion :D
[03:39] the improvement is not in this host series stuff - i am forced to adapt to use the new apis because i committed changes to tip of the utils repo
[03:39] work has landed there that core does not use properly yet
[03:39] there's a WIP PR for that
[03:39] this just adapts core to use the apis from tip of utils
[03:40] i have landed changes elsewhere in utils that are needed
[03:40] this PR is just method renames, no behaviour changes
[03:40] except for the dep updates which do bring in desired behaviour
[03:41] wallyworld: like i said - i don't have issues with the code but m not understanding the value of the change... revving up the dependency is fine... but the change in utils is the one that we are adapting to here...
[03:41] but the change is not the HostSeries stuff
[03:41] that is a byproduct because other work landed in utils
[03:41] my change is to the internals of utils elsewhere
[03:42] so maybe do a dependency rev up as a separate PR?
[03:42] i can't - tip of utils has changed
[03:42] this PR *is* the dependency change
[03:42] with adaptations for the altered apis
[03:42] k... hangon m going to call royalty..
a sec
[03:42] which we are now forced to consume
[03:58] how anti-goto are we?
[04:15] not at all? great!
[04:20] babbageclunk: :) i thought i saw it in our codebase...
[04:20] Bug # changed: 1484105, 1506460, 1515475, 1545562, 1592609
[04:21] anastasiamac: Was only half a joke - the error flow in the code I'm working in is a bit unusual, I think goto would probably work a lot better than what I've got now. If it's not going to give people conniptions I'll try it out.
[04:23] babbageclunk: \o/
[04:38] anastasiamac: "Once the checks succeed, LGTM \o/" <- why is that a prereq? the checks are a subset of a merge run?
[04:38] (AFAICT the checks failed because of an intermittent failure)
[04:43] axw: failed checks sometimes indicate test failures :D
[04:44] if it's intermittent... would b nice to fix but we have landed code with intermittency before.. so m leaving it at ur discretion
[04:45] axw: hence, approved and lgtm'ed :D
[04:46] anastasiamac: okey dokey
[04:46] anastasiamac: thanks for the review
[04:57] yay, I think that code's much better with goto!
[05:52] axw: ur PR is merged \o/ awesome!!
[05:53] anastasiamac: :) just upgraded my azure env, will see how it fares now
[05:53] axw: m holding my breath :D
[07:07] anastasiamac: please see my reply on https://github.com/juju/juju/pull/6520
[07:28] anastasiamac: happy for me to land now?
=== mup_ is now known as mup
[09:46] hey everyone, is there a rest api for juju? Where can I find documentation for it?
=== dootniz is now known as kragniz
[10:04] hi, could anyone explain to me why I'm getting these log messages? trying to set some custom tools on bootstrap but I'm getting these logs
[10:04] http://paste.ubuntu.com/23410800/
[10:07] Some thoughts on this? Anyone?
[10:15] anyone?
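For reference, the goto-based error flow babbageclunk is weighing at [04:21] usually takes this shape in Go: several setup steps funnel into one shared failure path via a `fail:` label. This is a sketch with invented names (`openThing`, `prepare`, `setup`) — not the code under discussion — and it is legal Go because the jumps don't skip any variable declarations (both `t` and `err` are named results).

```go
package main

import (
	"errors"
	"fmt"
)

type thing struct{ ready bool }

// openThing and prepare are hypothetical fallible setup steps.
func openThing(fail bool) (*thing, error) {
	if fail {
		return nil, errors.New("open failed")
	}
	return &thing{}, nil
}

func prepare(t *thing, fail bool) error {
	if fail {
		return errors.New("prepare failed")
	}
	t.ready = true
	return nil
}

// setup funnels every failure to a single annotation/cleanup path.
// Named results let goto jump forward without new declarations.
func setup(failOpen, failPrep bool) (t *thing, err error) {
	t, err = openThing(failOpen)
	if err != nil {
		goto fail
	}
	err = prepare(t, failPrep)
	if err != nil {
		goto fail
	}
	return t, nil

fail:
	// shared error path: annotate once instead of at every return
	return nil, fmt.Errorf("setup: %w", err)
}

func main() {
	if _, err := setup(false, true); err != nil {
		fmt.Println(err) // setup: prepare failed
	}
}
```

Whether this beats nested `if err != nil` blocks is a style call; it pays off mainly when the shared tail does real cleanup, which is presumably the "unusual error flow" in question.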
[10:28] hoenir: it's telling you your data doesn't have everything it needs
[10:31] mgz, so please direct me where to read on "what it needs"
[10:32] now that is a much harder question :)
[10:35] I think it's just that the final url for the tools results in a 404? so you need to fix/change that location
[10:36] turning on trace logging/looking at the actual requests made may help
[10:36] and I'd just refer to the code in environs/simplestreams for what's actually happening
[10:37] which is pretty confusing as it's several layers of support across change versions
[12:10] I'm going to be out from standup, summary: I have the minimal ssh fix done, looking at a more comprehensive one
[12:11] as dash with no prompt and no control codes is still pretty annoying to work over
[12:24] voidspace: ping :D
[12:34] anastasiamac: pong
[12:35] voidspace: sorry, pm-ing u :D
[12:39] anastasiamac: do you not sleep?
[12:45] rick_h___: you bailed on me yesterday, I was too busy to rant at you then :p
[12:45] perrito666: not on irc :D
[12:48] Bug #1333162 opened: cannot bootstrap on openstack provider: Multiple security_group matches found for name 'XYZ', use an ID to be more specific.
[12:50] Any ideas how i could debug the following error 'agent is not communicating with the server' when deploying a charm using LXD?
[13:23] kjackal: https://godoc.org/github.com/juju/juju/api
[13:40] thanks rick_h___
[13:41] rick_h___: This does not look like a rest api
[13:42] kjackal: it's a websocket API for the most part. There's some restful bits, but for the most part it's async with events pushed across the websocket.
[13:42] rick_h___: I see. thanks
[13:43] kjackal: best thing might be to look at clients like the WIP libjuju https://github.com/juju/python-libjuju
[13:43] kjackal: not sure what you're looking to do/etc
[13:45] rick_h___: I saw python-libjuju but it's WIP. We have a client that wants to use its own admin console to drive juju deployments.
I am exploring what would be the right way to do that
[13:45] rick_h___: any suggestions?
[13:45] kjackal: in python or another language?
[13:47] Not sure about the language, I would guess python. That's why a REST api would seem a good starting point. Do we have a reference example?
[13:48] rick_h___: ^
[13:48] rick_h___: the Juju gui on juju 1 was doing something similar (driving juju), not sure how that worked
[14:05] frobware: mgz: dooferlad: standup time?
[14:19] katco: just to confirm, new developer workflow: for a branch forked off staging I propose it against develop. Right?
[14:20] voidspace: correct, unless it's a fix for a revision-release (e.g. 2.0.2), then you fork the 2.0 branch and propose both there and to develop
[14:20] katco: ok, I thought we were doing backports - i.e. propose against develop then cherry-pick to propose against 2.0
[14:21] voidspace: i believe it's the other way around? didn't we have this conversation in a standup and rick said, "no it's more like a forward-port"?
[14:21] katco: we have had this conversation and my memory (which I wouldn't rely on) was that we concluded back-port... :-)
[14:21] katco: this one is easy, so it doesn't matter anyway
[14:22] voidspace: lol yep
[15:01] voidspace: do you need review on your maas constraint changes still?
[15:47] balloons: any idea why the bot isn't getting triggered for this? https://github.com/juju/juju/pull/6414
[15:50] Hi guys! I'm using juju 2.0.1.1 here. I have the landscape-charm deployed on some machines as a subordinate charm. I removed it from one of them, but the unit still shows on "juju status". It was really removed since it doesn't appear on juju-gui. Also trying to "juju remove-unit landscape-client/3" gives me a "ERROR no units were destroyed: unit "landscape-client/3" is a subordinate". How can I get rid of it?
[15:51] you can't remove subordinates without removing the parent unit
[15:52] the thought is that many subordinates fundamentally change how the parent functions, and removing them could break the parent.
[15:52] I don't entirely agree with outright preventing people from doing it, but that's the way it works right now.
[15:57] sinzui: mgz: any ideas re. my question to balloons above?
[16:00] is hoenir a public member of the juju devs group?
[16:00] natefinch: I see. But I removed it before, it just didn't vanish from "juju status". Because it is in an "inconsistent" state now, it is on "maintenance" workload saying that it "Need computer-title and juju-info to proceed" on the message
[16:00] seems not. so, you can either land for him, or we can sign him up.
[16:01] mgz: he's in the group. looks like he just needs to make it public
[16:01] it doesn't even show on juju-gui anymore
[16:01] hoenir: can you please make your juju team status public?
[16:01] hackedbellini: I don't know what the GUI does, but if it's in juju status, it exists
[16:01] katco, yeah sure, let me search what button I should press in order to do that.
[16:02] hoenir: i always forget myself :)
[16:02] hoenir: go to
[16:02] hoenir: after you do that, try the merge one more time
[16:02] and mark your membership as public
[16:03] yeah mgz thanks! I'm now Public
[16:03] natefinch: ahh, I remember what I did. I removed the relation that existed between the subordinate charm and the charm itself. The relation doesn't exist anymore (the reason why I can't see it on juju-gui) but the unit still shows on status
[16:03] ahh
[16:03] okay, don't need to do anything else, the bot will believe your comment next run
[16:03] hmm interesting
[16:04] yohooo katco it worked!
[16:04] natefinch: so the unit that is still there is useless. It doesn't do anything without the relation. That is the reason that I want to remove it
[16:05] it is just polluting my status =P
[16:05] hoenir: yay!
i'm really sorry, i think you've been struggling with that for awhile haven't you?
[16:05] hackedbellini: can you kill the parent unit and replace it?
[16:06] katco, nah it was a fun experience, and thanks a lot for taking the time to review my PR.
[16:06] natefinch: hrm, I'd rather not have to do that if possible
[16:07] hoenir: no worries, hth. thanks for the pr!
[16:10] hackedbellini: it sounds like a bug... if there's no relationship, then it shouldn't be treated like a subordinate... it's probably one of those corner cases we just hadn't really spent much time on
[16:11] natefinch: hrm that explains it. Is there any way to hack it so I can "force" the unit deletion?
[16:14] hackedbellini: not that I know of, let me check the code to see if there's a backdoor we can try
[16:17] hackedbellini: unfortunately, the code that determines whether it's a subordinate or not just reads a field in mongo, so to change it, you'd have to twiddle with mongo
[16:20] natefinch: hrm I don't have any experience with nosql databases (if it was postgresql I would try to hack it myself =P), but if you tell me what I should do there I can try it!
[16:53] mgz: I do, but I want dimitern to QA it as well
[16:55] voidspace: I don't see from the diff where the 1.9/2.0+ split is
[16:56] mgz: are you looking at the gomaasapi PR?
[16:56] mgz: there's two branches - a gomaasapi one that addresses both bugs for 2.0
[16:56] voidspace: yeah, I'm not quite sure where the bits are going to fit together
[16:56] mgz: the gomaasapi fix is 2.0 only
[16:56] mgz: separate branch for 1.9 (that code is in juju - about to turn it into a PR)
[16:56] voidspace: gotcha
[17:03] rick_h___: katco: dimitern has a public holiday today which is why he wasn't at standup
[17:04] voidspace: yeah just realized that a few minutes ago :)
[17:28] hackedbellini: sorry, had to step out, missed your message before I left. I don't really recommend twiddling with the DB.
My recommendation is to create a new unit of the parent on a different machine and kill this machine. Sorry, I think that's the best you can do.
[17:31] natefinch: yeah I did that =P. Thanks for your time anyway! :)
[17:41] hackedbellini: file a bug, if you haven't. I think it's worth tracking. Might also convince people that being able to remove a subordinate is a good idea
[17:48] katco, sorry I never saw your pings :-(
[17:48] katco, I will look now and see if I still have the logs on what it saw the PR as
[17:48] balloons: no worries. we got it figured out
[17:49] katco, ohh? Why didn't it trigger?
[17:49] balloons: his juju membership wasn't public
[17:50] ahh. I wonder if dimitern has the same problem.. menno as well
[18:00] i've been trying to start a local lxd controller, and it's taking ages (it's been about 30 minutes so far mostly running apt-get update, upgrade and install). anyone got an idea what might be my issue?
[18:00] just the "Installing curl" step has taken about 15 minutes so far
[18:01] (it's still running)
[18:01] i just bootstrapped an aws controller easily within the time that step has been taking
[18:03] ah, seems like it's trying to connect via IPv6
[18:04] E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/c/cpu-checker/cpu-checker_0.7-0ubuntu7_amd64.deb Unable to connect to archive.ubuntu.com:http: [IP: 2001:67c:1560:8001::11 80]
[18:09] sigh
[19:08] anyone have time for a big review? Trying to get this reviewed in the next day or two so it can be merged by EOW: https://github.com/juju/juju/pull/6498
[19:09] 1600 lines, but almost all new code
[21:20] hey menn0, writing some more tests - what do you think should happen if a machine agent tries to log in to a model that's been migrated and the model's been removed?
[21:22] babbageclunk: error I think
[21:22] hmm
[21:22] babbageclunk: if the model has been removed then maybe it's ok to redirect
[21:22] ok
[21:23] babbageclunk: but really, the situation shouldn't happen
[21:23] babbageclunk: great care is taken during the migration to make sure that all the agents are talking to the new model
[21:23] new controller
[21:24] babbageclunk: what is the least complicated thing to do in the redirect code?
[21:24] menn0: Ok, thanks - redirect's probably simplest for me
[21:24] babbageclunk: as long as there's no redirects while the model data is still there
[21:24] babbageclunk: the migrationminion still needs to be able to connect back to the source controller post SUCCESS
[21:24] heya babbageclunk welcome to the cool TZ :)
[21:24] (for a little while anyway)
[21:24] thanks alexisb!
[21:25] menn0: Yup, I've got a test for that.
[21:25] babbageclunk: sweet
[21:32] menn0: Hmm, actually at the moment (with no changes) it gives a login error because the machine doesn't exist - ok to leave it like that? Since it's something that shouldn't happen anyway.
[21:32] babbageclunk: yep that's fine
[21:33] menn0: woot, then it's on to some manual testing!
[21:37] babbageclunk: awesome!
[23:17] team having some networking issues
[23:17] will be on the standup as soon as I can
[23:17] wallyworld, ^^
[23:18] wallyworld, menn0 please get started without me
[23:18] ok