[01:41] <hatch> If I have a multi-series subordinate, should I be able to relate that subordinate to multiple applications of different series as long as they are in the supported series list?
[02:08] <cory_fu> hatch: Unfortunately, no.  The series for a subordinate is set when it is deployed and it can then only relate to other applications with the same series
[02:39] <hatch> cory_fu: np thanks for confirming
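A minimal illustration of the constraint cory_fu describes, with hypothetical charm and application names (ntp stands in for any multi-series subordinate):

```
# ntp's metadata.yaml may list several series, e.g. [trusty, xenial],
# but the series is pinned when the subordinate is deployed:
juju deploy ntp --series xenial
juju add-relation ntp mysql     # ok only if mysql is a xenial application
juju add-relation ntp haproxy   # fails if haproxy is on trusty, even though
                                # trusty is in ntp's supported series list
```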
[09:23] <rts-sander> Hello, I'm trying to get juju actions defined on my charm; but it says no actions defined: https://justpaste.it/yird
[09:24] <rts-sander> did I miss something? I added actions.yml, have a command in the actions directory...
[09:27] <rts-sander> Oh.. I've got it: actions.yml => actions.yaml :>
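For anyone hitting the same thing: the file juju reads is actions.yaml at the charm root, paired with an executable of the same name under actions/. A minimal sketch (the action name here is made up):

```yaml
# actions.yaml -- note the .yaml extension; actions.yml is silently ignored
pause:
  description: Temporarily stop the service.
```

The corresponding executable would live at actions/pause.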
[10:39] <user_____> hi
[11:18] <user_____> hi! need your help
[11:18] <user_____> juju add-machine takes forever
[11:19] <user_____> cloud-init-output.log shows “Setting up snapd (2.14.2~16.04) …” (last line in the file)
[11:19] <user_____> how to fix this?
[11:49] <rock> Hi. I have MAAS 1.9.4 on trusty, and 4 physical servers commissioned. Now I want to test our OpenStack bundle on MAAS. On the MAAS node I installed juju 2.0 and then followed https://jujucharms.com/docs/2.0/clouds-maas. Bootstrapping a juju model on the MAAS cloud failed. Issue details: http://paste.openstack.org/show/581940/. Can anyone help me with this?
[15:08] <cholcombe> we need some docs for debugging reactive charms
[15:08] <cholcombe> everything i'm getting is stuff random people remember about how to poke at it
[15:15] <marcoceppi> cholcombe: what are you trying to debug? I gave a whole lightning talk on this at the summit ;)
[15:15] <marcoceppi> (I intend on turning that into a document)
[15:15] <cholcombe> marcoceppi, my state isn't firing and i'm trying to figure out why
[15:15] <marcoceppi> cholcombe: is it set? `charms.reactive get_states`
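(Aside: that command has to run on the unit, where the reactive state database lives. One way to invoke it, assuming a hypothetical unit name:)

```
# runs in hook context on the unit, so the state db is visible
juju run --unit my-charm/0 'charms.reactive get_states'
```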
[15:15] <cholcombe> marcoceppi, it's not.  so i'm working backwards
[15:16] <cholcombe> marcoceppi, my interface should be setting a state i'm waiting for so i suspect one of the keys the interface is waiting on is None
[15:17] <marcoceppi> cholcombe: did you write the interface?
[15:17] <cholcombe> marcoceppi, i did.  can i breakpoint it?
[15:17] <marcoceppi> you can with pdb, sure
[15:17] <marcoceppi> cholcombe: link to it? I've gotten good at finding oddities in interface layers
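A sketch of the pdb approach in an interface layer, run under `juju debug-hooks` so a terminal is attached when the breakpoint fires (class and relation names are hypothetical; the file's on-unit location is discussed just below):

```python
# temporary edit to the deployed copy of the interface's requires.py
import pdb

from charms.reactive import RelationBase, hook, scopes

class CephMDSRequires(RelationBase):
    scope = scopes.GLOBAL

    @hook('{requires:ceph-mds}-relation-changed')
    def changed(self):
        pdb.set_trace()  # execution pauses here when the hook fires
        ...
```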
[15:17] <cholcombe> marcoceppi, any idea where reactive puts the interface file?
[15:17] <marcoceppi> cholcombe: hooks/interfaces/*
[15:18] <cholcombe> cool
[15:18] <cholcombe> marcoceppi, https://github.com/cholcombe973/juju-interface-ceph-mds/blob/master/requires.py
[15:18] <cholcombe> it worked until i added admin_key
[15:18] <cholcombe> looks like hooks/interfaces doesn't exist
[15:19] <marcoceppi> cholcombe: it's in hooks
[15:19] <marcoceppi> could be relations
[15:19] <marcoceppi> cholcombe: also, admin_key isn't in the auto-accessors
[15:19] <cholcombe> marcoceppi, heh, helps to have more eyes on it, doesn't it
[15:19] <cholcombe> marcoceppi, thanks :D
[15:20] <marcoceppi> cholcombe: np, it might be better to just use get_conv instead
[15:20] <marcoceppi> which will get you all the relation data, regardless of auto-accessors
[15:20] <cholcombe> marcoceppi, yeah i think i should switch to that
[15:20] <cholcombe> too much magic going on
[15:20] <marcoceppi> *magic.gif*
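To make the get_conv suggestion concrete, a hedged sketch of a requires side reading relation data through its conversation rather than through auto-accessors (loosely modeled on the ceph-mds interface linked above, not the actual repo code):

```python
from charms.reactive import RelationBase, hook, scopes

class CephMDSRequires(RelationBase):
    scope = scopes.GLOBAL

    @hook('{requires:ceph-mds}-relation-changed')
    def changed(self):
        conv = self.conversation()
        # get_remote reads any key the remote side set, whether or not
        # it appears in auto_accessors
        key = conv.get_remote('key')
        admin_key = conv.get_remote('admin_key')
        if key and admin_key:
            conv.set_state('{relation_name}.available')
```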
[15:34] <kjackal> kwmonroe: is this the revision we should be testing cs:bundle/hadoop-processing-9   ?
[15:34] <kwmonroe> yup kjackal
[15:35] <kjackal> kwmonroe: not good....
[15:35] <kwmonroe> oh?
[15:35] <kjackal> something went wrong, with ganglia
[15:35] <kjackal> let me see
[15:35] <kwmonroe> kjackal: error on the install hook with ganglia-node?
[15:35] <kwmonroe> kjackal: i saw that.. looks like it's trying to run install before the charm is unpacked.. give it a couple minutes and it should work itself out
[15:36] <kwmonroe> it's not really red unless it's red for > 5 minutes ;)
[15:36] <kjackal> Indeed!!! It recovered!
[15:36] <kwmonroe> and that, my friend, is ganglia.
[15:36] <kjackal> Self-healing!
[15:36] <kwmonroe> :)
[15:54] <cory_fu> kjackal, kwmonroe: I was able to deploy cs:hadoop-processing-9 on GCE and run smoke-test on both namenode and resourcemanager without issue
[15:55] <kwmonroe> w00t.  thx cory_fu
[15:55] <cory_fu> admcleod_: ^
[15:55] <kjackal> kwmonroe: cory_fu: is the smoke-test doing a terasort?
[15:55] <kjackal> I thought there was a separate action for terasort
[15:57] <cory_fu> kjackal: smoke-test does a smaller terasort.  I can run the bigger one.  One min
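(For reference, in juju 2.0 those runs look roughly like this, assuming the bundle's default application names and that the bigger terasort is an action on resourcemanager:)

```
juju run-action resourcemanager/0 smoke-test
juju run-action resourcemanager/0 terasort
juju show-action-output <action-id>   # poll for the result
```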
[15:59] <bdx> how's it going, all? Can storage be provisioned via the provider and attached to an instance w/o also being mounted?
[15:59] <bdx> using `juju storage`
[16:00] <bdx> let's say I want to deploy the ubuntu charm and give it external storage, but not have the storage mount to anything
[16:01] <bdx> then, subsequently configure and deploy the lxd charm over ubuntu
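A sketch of the syntax in question; the storage name is an assumption, and whether the volume actually gets mounted is up to the charm's storage hooks, which is the crux of bdx's question:

```
# ask the provider for a 10G volume at deploy time; "data" must be a
# storage entry declared in the charm's metadata.yaml
juju deploy ubuntu --storage data=10G
```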
[16:01] <kjackal> cory_fu: kwmonroe: admcleod_: Terasort action finished here as well. On canonistack
[16:02] <kwmonroe> oh sweet baby carrots.  thanks kjackal cory_fu.  i kinda wish i had dug into the broken env more, but we're 3 for 3 today.. so it's ready to ship ;)
[16:07] <cory_fu> kwmonroe: How are you seeing the error manifest?  My second smoke-test on resourcemanager seems to be hung
[16:10] <cory_fu> kwmonroe: Hrm.  I seem to have lost my NodeManager on 2 of 3 slaves, too
[16:10] <kjackal> cory_fu: if you ssh to the slave node you should find only one java process (DataNode). There should be two java processes there; the NodeManager process is missing
[16:10] <cory_fu> Ok, I'm seeing that now
[16:11] <cory_fu> No errors in the hadoop-yarn logs, though
[16:12] <cory_fu> kwmonroe: 2016-09-20 15:55:25,059 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 9965 for container-id container_1474386123726_0003_01_000001: -1B of 2 GB physical memory used; -1B of 4.2 GB virtual memory used
[16:13] <cory_fu> -1B??
[16:41] <cory_fu> tvansteenburgh: You pushed the fix for the missing diff... Will it only apply if a new rev is added?
[16:42] <tvansteenburgh> cory_fu: once the fix is deployed, you'll need to close and resubmit the review
[16:42] <cory_fu> Oh.
[16:43] <tvansteenburgh> cory_fu: and to be clear, i'm not in the process of deploying it
[16:43] <cory_fu> tvansteenburgh: Also, +1 on removing that whitespace in the textarea.  :p
[16:43] <cory_fu> tvansteenburgh: Fair enough
[16:43] <tvansteenburgh> i knew you'd appreciate that
[16:55] <kjackal> kwmonroe: cory_fu: any luck on the namenode issue? I am trying different configs but with no success
[16:55] <cory_fu> kjackal: I've been focusing on InsightEdge and waiting for kwmonroe to get back.  I don't see anything useful in the logs, so I have no idea what's happening
[17:53] <kjackal> kwmonroe: I might have a set of params that seem to make namenode stable (until it breaks again)
[17:55] <cory_fu> kjackal: What were the params?
[17:56] <kjackal> just a sec
[17:57] <kjackal> cory_fu: kwmonroe: http://pastebin.ubuntu.com/23208123/ these go into yarn-site.xml
[17:58] <kjackal> I have been running terasort with a single slave for 3-4 consecutive runs
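(That paste hasn't survived, so for context only: the knobs in play here are typically the NodeManager resource and memory-check settings in yarn-site.xml. The properties below are real YARN settings, but the values, and whether they match kjackal's paste, are assumptions:)

```xml
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2048</value>
</property>
<property>
  <!-- the -1B memory readings earlier suggest the containers monitor is
       confused; disabling the vmem check is a common mitigation -->
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```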
[17:58] <cory_fu> kjackal: Do we have any idea why we're seeing these failures on the Bigtop charms and not (presumably) the vanilla Apache charms?
[17:59] <kjackal> cory_fu: not yet
[18:00] <kwmonroe> kjackal: cory_fu:  we should have shipped when we had the chance
[18:00] <kwmonroe> and yes, nodemgr death is what i saw in yesterday's failure
[18:01] <kwmonroe> kjackal: do those values go into yarn-site.xml on the resourcemanager, slaves, or both?
[18:02] <kjackal> kwmonroe: I have them on the slave
[18:03] <kwmonroe> also cory_fu kjackal, i wonder if the addition of ganglia-node and rsyslogd ate enough resources to cause the slaves to run out of memory.. that's one thing different between the bigtop and vanilla bundles
[18:04] <kwmonroe> furthermore, didn't we have a card on the board to watchdog these procs?  if not, we should add one... or at least check for each proc before we report status.
[18:05] <kwmonroe> it'd be nice to see at a glance that status says "ready (datanode)" and know that something went awry with nodemanager
[18:07] <kjackal> kwmonroe: cory_fu: nooo... it died again... after 5 terasorts...
[18:09] <cory_fu> +1.  We could have a cron job that checks for the process and uses juju-run to call update-status if it goes away
[18:13] <kjackal> yeah, this fire-and-forget policy we have for services could improve. kafka may also fail to start and we never report that (we have a card for this)
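A minimal sketch of the cron idea (process pattern, unit name, and interval are all assumptions):

```
# /etc/cron.d/nodemanager-watchdog on the slave machine: if the
# NodeManager JVM disappears, trigger update-status so the charm
# re-checks and reports the real state.
*/5 * * * * root pgrep -f NodeManager >/dev/null || juju-run slave/0 hooks/update-status
```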
[18:19] <cory_fu> kwmonroe, kjackal: There's an issue with the idea of relying on the base layer to set the series.  When doing charm-build, if the series isn't defined in the top-level metadata.yaml, it will default to trusty: https://github.com/juju/charm-tools/blob/master/charmtools/build/builder.py#L190
[18:19] <kwmonroe> yup cory_fu.. the workaround is to be explicit with charm build --series foo
[18:20] <cory_fu> Yeah, but that's a bit of a hassle
[18:20] <kwmonroe> cory_fu: i'm not married to the base layer defining the series.. if we want to leave it up to each charm, i'm +1
[18:20] <kwmonroe> but let's decide that now before i make final changes to push to bigtop
[18:20] <cory_fu> I'd like to fix charm-build, but I don't see an easy way to do so
[18:21] <kjackal> cory_fu: kwmonroe: if you have a single series then it will always place the charm under <build-dir>/trusty/mycharm
[18:21] <cory_fu> Without a fix for charm-build, I don't think it's reasonable for our charms to not "just work" with `charm build` by default
[18:21] <kjackal> even if the charm is for xenial
[18:22] <cory_fu> kjackal: That's not true.  If you define a single series in the metadata.yaml, it will use that
[18:22] <kwmonroe> yeah, pretty sure that ^^ is correct, but it has to be in the top layer metadata.yaml
[18:23] <cory_fu> kjackal: Specifically, if you define *any* series in metadata.yaml, it will output to builds/.  Otherwise, it will output to trusty/
[18:24] <kjackal> cory_fu: kwmonroe: I see, so I need to set the series in the top-level metadata.yaml to move the output to builds/
[18:24] <kjackal> that will not play well with the series on the base layer
[18:24] <cory_fu> kwmonroe, kjackal: Looks like a (slightly hacky) work-around is to put an empty list for series in the top-level charm layer
[18:25] <cory_fu> Ah, damnit.  Nevermind, that doesn't work, either
[18:25] <kwmonroe> 2 Ms in dammit, goose
[18:27] <cory_fu> Though, making that work-around work would be pretty easy
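In practice the workaround boils down to declaring the series in the top-level charm layer, e.g.:

```yaml
# top-level charm layer metadata.yaml
name: my-charm
series:
  - xenial   # listing any series here sends `charm build` output to
             # builds/ instead of the trusty/ default
```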
[18:31] <kjackal> since we are on this subject... I tried to remove the slave we have in the bundle (juju remove-application) and add an older one from trusty. The trusty one never started the nodemanager; it probably never related to the resourcemanager
[18:49] <cory_fu> kjackal: Looks like we weren't the only one to be hitting this: https://github.com/juju/charm-tools/issues/257
[18:55] <kjackal> cory_fu: Oh.. I totally forgot to continue with this issue. Got consumed with the hadoop thing
[18:55] <kjackal> trying the bundle on trusty now
[19:15] <cory_fu> tvansteenburgh: Have you run in to issues with charms that use_venv and the python-apt package?
[19:15] <cory_fu> marcoceppi: ^
[19:16] <tvansteenburgh> cory_fu: no
[19:17] <marcoceppi> cory_fu: I don't have a use_venv charm
[19:17] <cory_fu> grr
[19:28] <tvansteenburgh> cory_fu: i don't know what the issue actually is, but have you tried installing apt from pypi instead?
[19:37] <cory_fu> tvansteenburgh: I don't actually need python-apt but it gets pulled in automatically whenever charmhelpers.fetch is imported.  The problem with installing from pypi is that it adds a lot of dependencies into the wheelhouse that I don't need, and it's already installed on the system anyway.
[19:37] <cory_fu> tvansteenburgh: I remembered, though, that I could use include_system_packages, and that's working for me
[19:38] <tvansteenburgh> cool
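(For anyone searching later: both knobs are layer-basic options set in the charm layer's layer.yaml; a sketch, assuming layer-basic's option names are as remembered here:)

```yaml
includes: ['layer:basic']
options:
  basic:
    use_venv: true
    include_system_packages: true  # let the venv see system packages like python-apt
```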
[19:42] <kjackal> kwmonroe: cory_fu: I am testing the hadoop processing we have for trusty. I see a different behavior there. The namenode does not die but we get some jobs failing
[19:45] <kwmonroe> kjackal: that's odd.. are they long running jobs?  can you tell from the DN or RM logs if they are being killed for a particular reason (like memory exceeds threshold)?
[19:46] <kjackal> kwmonroe: looking
[19:47] <kwmonroe> also kjackal, the failure earlier with ganglia-node was a bug.  the charm uses '#!/usr/bin/python' which doesn't exist on xenial.  it "fixed" itself because rsyslog-forwarder-ha installs 'python' (http://bazaar.launchpad.net/~charmers/charms/trusty/rsyslog-forwarder-ha/trunk/view/head:/hooks/install), so once rsyslog-forwarder finished its install hook, the subsequent ganglia install hook would succeed.
[19:48] <kwmonroe> i'm pushing a similar install hook change for ganglia-node, so we shouldn't see that again.
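A sketch of that kind of fix, mirroring the rsyslog-forwarder-ha approach; the actual change kwmonroe pushed may differ, and hooks/install.real is a hypothetical name for the original python hook:

```bash
#!/bin/bash
# hooks/install shim: xenial images ship without /usr/bin/python, so
# install python2 before handing off to the python-based hook.
set -e
apt-get update
apt-get install -y python
exec ./hooks/install.real
```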
[19:56] <beisner> thedac, tvansteenburgh - i've been trying to figure out what changed to cause our juju-deployer + 1.25.6 machine placement to break.  it looks like 0.9.0 is causing our ci grief.  https://launchpad.net/~mojo-maintainers/+archive/ubuntu/ppa
[19:57] <beisner> basically, bundles that used to deploy to 7 machines with all sorts of lxc placement now end up asking for 18 machines, while still placing some apps in containers.  very strange.
[19:57] <beisner> is there a known issue?
[19:58] <beisner> i've found that we got deployer 0.9.0 from the mojo ppa of all places
[20:00] <tvansteenburgh> beisner: they got it from my ppa
[20:00] <tvansteenburgh> https://launchpad.net/~tvansteenburgh/+archive/ubuntu/ppa
[20:01] <tvansteenburgh> nothing has changed with placement in quite a while
[20:01] <tvansteenburgh> feel free to file a bug on juju-deployer though
[20:01] <tvansteenburgh> https://bugs.launchpad.net/juju-deployer
[20:02] <beisner> tvansteenburgh, at a glance, here's a bundle and the resultant model: http://pastebin.ubuntu.com/23205474/
[20:03] <tvansteenburgh> beisner: sorry, i don't even have time to glance right now, can you put that paste in a bug?
[20:03] <kjackal> kwmonroe: this is odd http://pastebin.ubuntu.com/23208612/
[20:03] <beisner> tvansteenburgh, ok np.  first i need to revert to a working state and block 0.9.0 pkgs.  we're borked atm.
[20:07] <kwmonroe> kjackal: http://stackoverflow.com/questions/31780985/hive-could-not-initialize-class-java-net-networkinterface.. sounds like our old friend "datanode ip is not reverse-resolvable".  what substrate are you on?
[20:07] <kjackal> I am on canonistack
[20:07] <kjackal> kwmonroe: ^
[20:08] <kjackal> let me check the resolutions
[20:08] <kwmonroe> kjackal: can you try adding an entry to your namenode and resourcemanager /etc/hosts files that includes your slave IP and `hostname -s`?
[20:09] <kjackal> yeap just a sec
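(Concretely, on each of the namenode and resourcemanager units, something like the following; the IP and hostname are placeholders:)

```
# append <slave-ip>  <output of `hostname -s` on the slave>
echo '10.0.0.42 slave-0' | sudo tee -a /etc/hosts
```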
[20:16] <kjackal> kwmonroe: it seems i have a ghost slave...
[21:13] <kjackal> kwmonroe: I have 5 consecutive successful terasorts
[21:20] <kwmonroe> kjackal: i got to 4 before my terasort hung.. http://imgur.com/a/maMAM  it's not dead yet, but i have little faith that it will return :/
[21:20] <kjackal> kwmonroe: I am on the ninth successful one now!
[21:22] <kwmonroe> kjackal: my "Lost nodes" count is rising on my RM :(  i think i'm toast.
[21:23] <kjackal> kwmonroe: 10 successful!
[21:24] <kjackal> So here is what the setup looks like: started from hadoop-processing-6, updated the /etc/hosts to have reverse lookups
[21:25] <kjackal> I believe what makes the difference is the trusty host :(
[21:26] <kjackal> What I do not fully get is why we still have the hosts issue; I thought we had a workaround for it.
[21:26] <kjackal> kwmonroe: ^
[21:28] <kwmonroe> right kjackal -- and especially on clouds that have proper ip/dns mapping, which aws and cstack have
[21:28] <kjackal> kwmonroe: http://imgur.com/a/VUTSZ
[21:29] <kwmonroe> however, kjackal, we have only ever worked around the NN->DN reverse ip issue with the hadoop datanode-ip-registration param set to allow non-reversible registration.. perhaps there's an issue with RM->NM that we're not considering.
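(The param kwmonroe refers to is presumably HDFS's registration check; naming it here is an inference from his description, not confirmed in the log:)

```xml
<!-- hdfs-site.xml: let DataNodes whose IPs don't reverse-resolve register -->
<property>
  <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
  <value>false</value>
</property>
```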
[21:33] <kjackal> Ok, kwmonroe, next step for me is to force the latest charms to deploy on trusty. The fact that the namenode is not crashing on trusty is promising
[21:34] <kjackal> kwmonroe: (even if the jobs fail)
[21:37] <beisner> thedac, tvansteenburgh - updated with examples and attachments.  it's definitely a thing.  https://bugs.launchpad.net/juju-deployer/+bug/1625797
[21:37] <mup> Bug #1625797: (juju-deployer 0.9.0 + python-jujuclient 0.53.2 + juju 1.25.6) machine placement is broken <uosci> <juju-deployer:New> <mojo:New> <python-jujuclient:New> <https://launchpad.net/bugs/1625797>
[21:42] <kwmonroe> kjackal: what timezone are you in this week?
[21:42] <kjackal> kwmonroe: I am in DC
[21:42] <kwmonroe> ah, very good kjackal.  you can keep working ;)
[21:42] <kjackal> kwmonroe: -5 I think
[21:43] <kwmonroe> yeah -- just as long as you're not back in Greece
[21:44] <kwmonroe> cory_fu: fyi, deploying bigtop zeppelin also apt installs spark-core-1.5.1
[21:44] <cory_fu> Makes sense