[02:28] <veebers> menn0: You mentioned earlier that you have a fix for the password/macaroon parts of migration? What's the error message I should look for on expected failure?
[02:28] <veebers> The current incorrect one is "empty target password not valid"
[02:28] <menn0> veebers: you should see a permission denied
[02:28]  * menn0 checks if the fix has merged
[02:29] <menn0> veebers: it hasn't merged next. looks like it's next in the queue.
[02:29] <menn0> veebers: but you should see "permission denied"
[02:30] <menn0> (when it has merged)
[02:30] <veebers> menn0: Cool, I'll make sure the test matches exactly on the error (otherwise we would have missed something like this)
[02:30] <veebers> Cheers
[02:30] <menn0> veebers: it occurred to me last night that it's worth having a CI test for a superuser that isn't the bootstrap user.
[02:31] <menn0> such a user should be able to start a migration, but the authentication path is a bit different to the bootstrap user (uses macaroons instead of passwords)
[02:32] <veebers> menn0: Similar to the test I've just proposed but with the proper permissions and thus it should work
[02:33] <menn0> veebers: exactly. so add a user to both controllers with the superuser controller permission and run a migration. it should work.
[02:33] <menn0> veebers: it won't work until this current change lands.
[02:34] <veebers> menn0: Cool, I'll get on that after I've cleaned up this current one.
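The CI scenario menn0 describes could be sketched as follows, assuming the juju 2.x CLI; the controller, user, and model names here are hypothetical:

```shell
# Create the same non-bootstrap user on both controllers and grant
# each one superuser controller access (all names are hypothetical)
juju add-user migrator -c source-controller
juju add-user migrator -c target-controller
juju grant migrator superuser -c source-controller
juju grant migrator superuser -c target-controller

# Logged in as that user, a migration should succeed, exercising the
# macaroon-based auth path rather than the bootstrap user's password
juju migrate some-model target-controller
```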
[07:21] <suresh_> hii all i am deploying openstack bundle in juju
[07:22] <suresh_> but in "juju status" it is showing all services in error state
[07:22] <suresh_> please someone help
[07:24] <suresh_> i am using this link to install https://github.com/openstack-charmers/openstack-on-lxd
[07:24] <suresh_> after this command "juju bootstrap --config config.yaml localhost lxd"
[07:25] <suresh_> it is saying deployed but the containers created are showing error state
[07:25] <suresh_> please some one help
[10:40] <kjackal> hey cory_fu are you around?
[10:42] <bbaqar_> hey guys I upgraded the rabbitmq server units and now seeing  Unit has peers, but RabbitMQ not clustered
[10:42] <bbaqar_> any thoughts
[10:57] <bbaqar_> someone must have worked with rabbitmq here
[11:42] <suresh_> hii all, i am installing openstack with juju
[11:42] <suresh_> while installing nova-compute it is giving error state
[11:42] <suresh_> please some one help
[12:56] <suresh_> hii all, I installed juju on ubuntu 16.04 and while running "juju quickstart" command
[12:57] <suresh_> i am getting this error
[12:57] <suresh_> interactive session closed juju quickstart v2.2.4 bootstrapping the local environment sudo privileges will be required to bootstrap the environment juju-quickstart: error: error: flag provided but not defined: -e
[12:57] <suresh_> my juju version is  "2.0-beta12-xenial-amd64"
[12:57] <rick_h_> suresh_: hmm, you shouldn't have a juju-quickstart command in 16.04 with the juju there.
[12:57] <suresh_> please someone help
[12:58] <rick_h_> suresh_: did you install juju-quickstart? can you remove it?
[12:58] <suresh_> rick_h: yes i do
[12:59] <rick_h_> suresh_: please check out https://jujucharms.com/docs/stable/getting-started for getting started
[13:01] <suresh_> ok thank you
[13:02] <suresh_> rick_h: And other problem i am facing while deploying "nova-compute" charm from the store
[13:03] <rick_h_> suresh_: what is the error? have you looked at the logs of the charm? you can get there by running a juju ssh to the unit and then looking at the log in /var/log/juju/unit-xxxxx where xxxxx looks like the nova-compute unit
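rick_h_'s suggestion, spelled out as commands; the unit name is illustrative:

```shell
# SSH to the failing unit, then read the charm's agent log
juju ssh nova-compute/0
less /var/log/juju/unit-nova-compute-0.log
```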
[13:03] <suresh_> i am getting "E: Sub-process /usr/bin/dpkg returned an error code (1)"
[13:05] <suresh_> it is trying to install nova-compute and some packages but it is failing; the last log it is showing is
[13:05] <suresh_> subprocess.CalledProcessError: Command '['apt-get', '--assume-yes', '--option=Dpkg::Options::=--force-confold', 'install', 'nova-compute', 'genisoimage', 'librbd1', 'python-six', 'python-psutil', 'nova-compute-kvm']' returned non-zero exit status 100
[13:07] <suresh_> rick_h: are you around
[13:08] <rick_h_> suresh_: sorry, in and out on meetings/etc
[13:08] <suresh_> rick_h: have you seen the error i pasted
[13:15] <axino> suresh_: try running the command on the unit and see why it fails
[13:17] <suresh_> in unit also i tried to run command
[13:17] <suresh_> it is giving same error
[13:17] <suresh_> invoke-rc.d: initscript nova-compute, action "start" failed. dpkg: error processing package nova-compute (--configure):  subprocess installed post-installation script returned error exit status 1 E: Sub-process /usr/bin/dpkg returned an error code (1)
[13:18] <suresh_> axino: other components i am able to deploy without any error
[13:18] <axino> suresh_: apparently it's failing on nova-compute start, try : sudo /etc/init.d/nova-compute start
[13:20] <suresh_> axino: I will try and let you know the status
[13:35] <suresh_> axino: i ran that command but it is saying "sudo: /etc/init.d/nova-compute: command not found"
[13:35] <axino> ugh
[13:35] <axino> suresh_: sudo start nova-compute
[13:36] <suresh_> yeah it is giving "start: Job failed to start"
[13:37] <suresh_> what i need to do
[13:45] <suresh_> axino: are you around
[13:46] <axino> suresh_: not really, and not for long I'm afraid
[13:46] <axino> suresh_: you can look at /var/log/upstart/nova-compute.log and /var/log/nova/*.log
[13:46] <axino> suresh_: good luck !
[13:55] <cory_fu> kjackal: Welcome back.  I'm here now
[13:55] <cory_fu> Sorry I missed you earlier
[13:56] <kjackal> Hey cory_fu, I wanted some help with some python dependency hell with cwr
[13:57] <cory_fu> kjackal: Hrm.  I didn't run in to any issues that tox didn't handle for me.  Jump on daily?
[13:57] <kjackal> managed to deploy cwr on a clean container but i guess i also need the juju-core env
[13:57] <kjackal> yes, daily
[14:05] <beisner> bdx, rick_h_ - traveling a similar path :) ... https://bugs.launchpad.net/juju/+bug/1614364
[14:05] <mup> Bug #1614364: manual provider lxc units are behind NAT, fail by default <amd64> <manual-provider> <s390x> <uosci> <juju:Triaged> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1614364>
[14:05] <beisner> and https://bugs.launchpad.net/juju/+bug/1615917
[14:05] <mup> Bug #1615917: juju openstack provider --to lxd results in unit behind NAT (unreachable) <openstack-provider> <uosci> <juju:Triaged> <https://launchpad.net/bugs/1615917>
[14:06] <rick_h_> beisner: heh :)
[14:34] <suresh_> hii all i am getting error in  "/var/log/upstart/nova-compute.log" file
[14:35] <suresh_> like "modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/3.13.0-32-generic/modules.dep.bin'"
[14:35] <suresh_> i deployed nova-compute using juju
[14:35] <suresh_> start nova-compute giving above error
[14:40] <beisner> hi suresh_ - can you give us a pastebin of the `juju status` output so we can get a sense of the topology?
[14:42] <suresh_> beisner: here is my juju status
[14:43] <suresh_> http://paste.openstack.org/show/562483/
[14:48] <suresh_> beisner: are you around
[14:49] <beisner> yep, one sec
[14:53] <beisner> hi suresh_ can you tell us about machine 12?  is it a container?
[14:55] <suresh_> beisner: yes it is a container
[14:56] <beisner> suresh_, generally-speaking, nova-compute and neutron-gateway units must be on metal.  it is possible to deploy the whole stack into containers using this approach:  https://github.com/openstack-charmers/openstack-on-lxd
[14:58] <suresh_> I followed this also
[14:59] <suresh_> but after running this "juju deploy bundle.yaml" command it is saying deployment completed
[15:00] <suresh_> beisner: but in juju status i am getting all are "error" state
[15:01] <beisner> suresh_, it looks like your deployment is using juju 1.25.6, where the openstack-on-lxd example requires juju 2.0 (currently in beta).
[15:01] <suresh_> beisner: not this environment
[15:02] <suresh_> I deployed on ubuntu 16.04 and followed that github repo
[15:02] <suresh_> there i am getting all the states are "error"
[15:03] <beisner> suresh_, the pastebin shows Juju 1.25.6 is in use
[15:03] <suresh_> beisner: actually i have two environments
[15:03] <suresh_> and in another i have juju 2.0
[15:07] <beisner> suresh_, that's the one i would focus on, as expected-to-work.
[15:09] <suresh_> beisner: here i am deploying that in a vm which has 12 GB RAM, 80 GB DISK and 5 cpu cores
[15:10] <suresh_> is it enough to "Deploy OpenStack on LXD"?
[15:10] <beisner> suresh_, if you see failures with Juju 2 current beta, and the openstack-on-lxd procedure, please provide details on that.  thanks!
[15:11] <suresh_> beisner: I am deploying this one and will let you know where i got stuck
[15:12] <suresh_> beisner: how much time will you be here?
[15:26] <beisner> hi suresh_ - +~ 6hrs
[15:27] <suresh_> beisner: thanks i will report the errors i will get
[15:44]  * D4RKS1D3 Hi 
[15:44] <suresh_> beisner: while running this command "sudo lxd init"
[15:45] <suresh_> it is asking for Name of the storage backend to use (dir or zfs):
[15:45] <suresh_> what i need to give
[15:45] <marcoceppi> suresh_: dir, unless you have ZFS set up
[15:46] <suresh_> and this Address to bind LXD to (not including port)
[15:47] <suresh_> can i leave it empty
[15:47] <marcoceppi> suresh_: again, that's up to you, 0.0.0.0 is generally okay
[16:03] <bdx> beisner: excellent. I put some heat on there for ya'
[16:04] <bdx> beisner: you may as well just make a general bug for all providers != MAAS, ya?
[16:06] <beisner> hi bdx, i'll leave that up to juju core triage, but i suspect each will be tracked separately as each provider would likely be addressed separately in dev efforts.
[16:06] <bdx> beisner: gotcha, thanks for filing those!
[16:09] <lazyPower> rick_h_ - do you recall the command to remove a controller from your $JUJU_DATA?
[16:09] <lazyPower> i'm pretty sure i mailed the list about it, but i'm having a dandy time trying to find it
[16:10] <rick_h_> lazyPower: unregister
[16:10] <lazyPower> thank you!
[16:10] <rick_h_> lazyPower: yes, mailed the list and filed a bug and we updated the help docs in response to the bug
[16:10] <rick_h_> lazyPower: np
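For reference, the command rick_h_ points to; the controller name is hypothetical:

```shell
# Drop a controller's entry from the local client store ($JUJU_DATA)
# without touching the controller itself
juju unregister my-lxd-controller
```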
[16:16] <beisner> bdx, yw.  thanks for the input
[16:55] <suresh_> beisner: when do i need to run this command "sudo ppc64_cpu --smt=off"
[17:04] <suresh_> marcoceppi: are you around
[17:05] <lazyPower> I've noticed a lot of bugs getting moved to the /juju project in launchpad (from juju-core), should we start opening bugs against /juju? or continue filing them against juju-core?
[17:06] <marcoceppi> lazyPower: check the mailing list (yes)
[17:06] <marcoceppi> suresh_: yes?
[17:06] <lazyPower> haha 4 minutes ago
[17:06] <lazyPower> \o/
[17:06] <rick_h_> lazyPower: hey, there was an email to the cloud list warning of this weeks ago :P
[17:06] <rick_h_> lazyPower: but yea, it's done today
[17:07] <marcoceppi> YEAH lazyPower READ YOUR EMAILS
[17:07] <lazyPower> ah, well, i just noticed in my bug-mail-feed its been a slew of project moving, soooo
[17:07] <rick_h_> :)
[17:07] <rick_h_> we just wanted to flood your inbox
[17:07] <rick_h_> and flooding my own to no end was so worth it!
[17:07] <lazyPower> i make no apologies for missing information in this black hole of messaging
[17:07]  * lazyPower points @ his inbox
[17:07] <lazyPower> its nicknamed e-fail for a reason
[17:09] <suresh_> marcoceppi: how much time will be taken by this command "juju bootstrap --config config.yaml localhost lxd"
[17:10] <marcoceppi> suresh_: depends, but at most 10 mins?
[17:10] <suresh_> marcoceppi: can we monitor logs regarding this command
[17:17] <marcoceppi> suresh_: if you issue the --debug flag when you run the command it will be more verbose
[17:18] <suresh_> marcoceppi: i ran this command 20 minutes ago
[17:19] <suresh_> and it is still waiting at apt-get update; i pasted the output here http://paste.openstack.org/show/562516/
[17:22] <suresh_> marcoceppi: can i interrupt this command to rerun with --debug
[17:22] <marcoceppi> yes
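A sketch of the interrupt-and-retry flow being discussed; the controller name is taken from the bootstrap command in the chat:

```shell
# After Ctrl-C on the stuck bootstrap, clean up the half-created
# controller, then retry with verbose output
juju kill-controller lxd -y
juju bootstrap --config config.yaml localhost lxd --debug
```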
[17:30] <suresh_> marcoceppi: i ran that command with the --debug option
[17:30] <suresh_> and the log is pasted here http://paste.openstack.org/show/562519/
[17:31] <suresh_> it is waiting at "Running apt-get update"
[17:32] <suresh_> I enabled ipv6. Is this a problem?
[17:33] <suresh_> beisner: are you around
[17:35] <beisner> hi suresh_
[17:35] <suresh_> beisner: yeah i am deploying openstack with juju by following this link https://github.com/openstack-charmers/openstack-on-lxd
[17:36] <suresh_> while execution of this command "sudo lxd init"
[17:36] <suresh_> i enabled Ipv6 also
[17:37] <mattrae> hi, when i did 'juju create-backup' it wanted me to switch to the controller model. after doing 'juju switch controller' i was able to run backup-create. does the backup also contain the default model? if i grep through the backup file for a name of my service, i can see some matches.. but i just want to make sure i've backed up everything
[17:38] <suresh_> beisner: and my problem is while "Bootstrapping a Juju controller" it is waiting at "Running apt-get update"
[17:39] <suresh_> and the log of that bootstrap command is pasted here http://paste.openstack.org/show/562519/
[17:41] <suresh_> beisner: have you seen my log
[17:41] <beisner> suresh_, i see fd7d:b856:c794:1a4:216:3eff:fea1:5c73 port 22: Connection refused.  i've not personally validated this with ipv6.  my suggestion would be to first run through the example pretty much verbatim (ipv4), make sure everything works as expected.
[17:43] <suresh_> beisner: I will try only enabling ipv4
[17:44] <suresh_> and will update if any issues
[17:51] <suresh_> while running "sudo lxd init" it is asking Address to bind LXD to (not including port)
[17:52] <suresh_> beisner: can i give localhost here
[18:04] <beisner> suresh_, i believe so, but for the all-on-one deploy, i usually answer 'no' to 'Would you like LXD to be available over the network (yes/no)?'
[18:06] <suresh_> beisner: i have given ip as 0.0.0.0
[18:07] <suresh_> and Would you like LXD to be available over the network is 'yes'
[18:07] <suresh_> and i enabled only ipv4
[18:07] <suresh_> and i ran juju bootstrap --config config.yaml localhost lxd command
[18:08] <suresh_> http://paste.openstack.org/show/562529/
[18:09] <suresh_> the above is the log and i got stuck at Running apt-get update
[18:11] <beisner> suresh_, it seems like your containers may not have internet access?
[18:11] <beisner> suresh_, i just bootstrapped successfully against a fresh xenial install, after doing sudo lxd init, yes to network, localhost as the binding.
[18:12] <suresh_> beisner: containers are getting internet
[18:13] <suresh_> and how much time will it take to bootstrap
[18:13] <suresh_> and also you selected zfs or dir
[18:19] <beisner> suresh_, outside of juju, and unrelated to openstack, this should all succeed:  can you try http://pastebin.ubuntu.com/23082706/ to confirm that is the case?
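The kind of connectivity check beisner's pastebin performs, roughly (container name from the chat; the exact steps in the pastebin may differ):

```shell
# Launch a throwaway xenial container and confirm it can reach the
# Ubuntu archives, independent of juju and openstack
lxc launch ubuntu:xenial test123
sleep 15   # allow the container to come up and get an address
lxc exec test123 -- apt-get update
lxc delete --force test123
```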
[18:22] <mattrae> hi, i'm trying juju restore-backup, but I can't find a syntax that doesn't give an error. any idea where i'm going wrong? https://gist.github.com/raema/4b70b3593f84e852a9fd22c4ab3f139f
[18:22] <suresh_> beisner: yeah it is waiting at bootstrap command
[18:23] <suresh_> beisner: can you paste your logs how you executed the bootstrap
[18:25] <suresh_> beisner: sorry, i am seeing your pastebin
[18:25] <suresh_> and let you know the output
[18:26] <beisner> suresh_, sure: http://pastebin.ubuntu.com/23082729/
[18:27] <beisner> suresh_, but if anything in that first pastebin **2706 fails, there is a config or network issue on the host or network
[18:27] <suresh_> beisner: you installed on baremetal or VM
[18:28] <suresh_> here i am trying on VM
[18:28] <beisner> suresh_, this is inside a vm
[18:34] <suresh_> beisner: in the first pastebin **2706 the commands are working properly but apt-get update is giving some errors
[18:34] <suresh_> http://paste.openstack.org/show/562535/
[18:36] <beisner> suresh_, if you do that a few times in a row, do you get the exact same failure?
[18:36] <suresh_> beisner: oh, can i redeploy the setup again
[18:37] <beisner> suresh_, i mean just the `lxc exec test123 apt-get update` command
[18:38] <suresh_> again it is giving same result
[18:40] <beisner> suresh_, identical to http://paste.openstack.org/show/562535/?
[18:40] <suresh_> beisner: yes
[18:43] <suresh_> my host machine is also giving the same result for apt-get update
[18:44] <beisner> suresh_, perhaps that is a transient issue with a mirror
[18:46] <suresh_> beisner: can i redeploy it again
[18:46] <beisner> suresh_, if the host can't do an apt-get update, i wouldn't try a redeploy yet.  Err:16 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages   Hash Sum mismatch
[18:47] <beisner> that needs to work from the host in your network before juju bootstrap will succeed
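A "Hash Sum mismatch" usually means a stale or inconsistent mirror index. One common remedy (a general suggestion, not from the chat) is to clear the cached indexes and re-fetch:

```shell
# Remove cached package lists, then fetch fresh ones
sudo rm -rf /var/lib/apt/lists/*
sudo apt-get update
```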
[19:01] <suresh_> beisner: Name of the storage backend to use (dir or zfs):
[19:01] <suresh_> what you used
[19:02] <suresh_> beisner: are you around
[19:08] <beisner> suresh_, i used zfs but dir should be fine too
[19:08] <suresh_> beisner: i used dir
[19:10] <lazyPower> mattrae - thats rough :( I haven't used the plugin myself so i'm not certain how to guide you other than to file a bug and that will get the proper eyes on the issue at hand.
[19:12] <suresh_> beisner: i am following your paste-bin http://pastebin.ubuntu.com/23082729/
[19:13] <beisner> suresh_, is `apt-get update` working on the host, in the vm, and in a lxc container?
[19:16] <bdx> postgresql-peeps: say I have an application that needs to connect to 2 separate postgresql database instances, is there a way to react to the states of the same service under a different name? Is this done by providing service sensitive interface names?
[19:18] <suresh_> yes it is working; after a few apt-get updates on the host it got output like http://paste.openstack.org/show/562540/
[19:18] <suresh_> beisner: now it is making some progress on the bootstrap command
[19:27] <petevg> cory_fu: you were right that the hadoop processing test might unearth an issue w/ dropping the openjdk relation. It seems to mess up the namenode relation for the slave machines, though I'm not 100% clear why
[19:27] <petevg> (either something is firing too soon, or some java lib isn't getting installed)
[19:27] <petevg> cory_fu: error from the logs on the slave machine: http://paste.ubuntu.com/23082956/
[19:28] <petevg> (That error happens if I tell bigtop to install java, whether or not I then go to add the openjdk relation, btw.)
[19:28] <petevg> cc kwmonroe ^
[19:32] <cory_fu> petevg: Hrm.  The UnboundLocalError is from an out-of-date jujubigdata
[19:32] <cory_fu> But it's only covering up a timeout error anyway
[19:33] <cory_fu> I honestly did not expect it to actually fail
[19:33] <petevg> cory_fu: yep. The more relevant part of the log is probably the connection refused bits.
[19:33] <cory_fu> petevg: Right.  Should probably check the NameNode log to see if it failed to start and why
[19:33] <petevg> cory_fu: it has failed. All my slaves say "hook failed: "namenode-relation-changed" for namenode:datanode"
[19:34] <cory_fu> petevg: I know that it *did* fail.  I'm saying that the java change *shouldn't* have caused that, according to my understanding
[19:35] <petevg> cory_fu: I don't see anything obvious in the reactive handlers that would cause it to fail, or even have different timings :-/
[19:36] <petevg> The openjdk charm does set JAVA_HOME to be inside of the jre directory, while we set JAVA_HOME to be one level up. Everything is symlinked from that level, though, so unless something isn't following a symlink, that should be fine ...
[19:39] <petevg> cory_fu: I rebased my apache-bigtop-base branch, and I'm going to redeploy; maybe there's something interesting in the error that it's chomping ...
[20:13] <firl> lazyPower you around?
[20:14] <lazyPower> firl - i am, whats up
[20:14] <firl> I will have some time over the next couple days if you wanted me to try getting the kubernetes bundle working
[20:14] <firl> ( inside openstack )
[20:15] <lazyPower> firl - sure! We verified it works inside openstack yesterday, but i'm more than happy to get additional feedback on what worked well for you vs what was rough around the edges
[20:16] <firl> oh sweet
[20:16] <firl> juju2 only?
[20:16] <lazyPower> juju 1 actually, we had to gut the 2.0 features so we could get a clean weather report on the bundles
[20:16] <lazyPower> so, either/or works swimmingly
[20:18] <lazyPower> http://status.juju.solutions/test/9f58fe960c8b4216ac93c1b71aefdb07  -- latest test results with the observable bundle
[20:18] <lazyPower> http://status.juju.solutions/test/fb39dcbd7f90454aa494fe6a6e6a5129 -- latest results with the core bundle
[20:19] <lazyPower> i'm thinking we will get an openstack provider enabled on this at some point in the not so distant future. but public cloud results are a decent litmus
[20:22] <firl> nice
[20:22] <firl> You were mentioning about having ingress working with traefik?
[21:47] <cholcombe> in the layer.yaml can you point it at interfaces that are local to your machine for testing?
[21:48] <mattrae> hi, how do i remove a machine from the controller model after using enable-ha to add additional controller machines? now destroy-machine is telling me that the machines are required by the model https://gist.github.com/raema/a8b8f9ab6c33572fc0ac263e91e6025e
[21:49] <kwmonroe> petevg: did you get your namenode:datanode issues resolved?
[21:50] <petevg> kwmonroe: nope. I'm still poking at it.
[21:51] <kwmonroe> so one thing i've learned petevg, is not to trust the hook that actually failed.  like cory_fu said, check the namenode logs (/var/log/hadoop*).  i'd bet money you have an OOM or something that's not quite java related.
[21:51] <petevg> kwmonroe: I did. There's nothing obviously broken in the logs (the one error I saw, I wasn't able to reproduce more than once).
[21:52] <kwmonroe> petevg: if you have a broken env, check 'hdfs dfsadmin -report' to see if hdfs is there
[21:53] <kwmonroe> also petevg, is this aws or lxd?
[21:53] <petevg> kwmonroe: I'm just re-setting up a broken environment right now. I was trying to setup two environments in parallel, but amazon was unhappy about that (I suspect I might have a machine limit on my account).
[21:54] <petevg> kwmonroe: aws. lxd fails for other reasons.
[21:54] <kwmonroe> roger that petevg.. lxd failures (though concerning) would be more explainable with container hostname resolvability
[21:55] <petevg> Yeah. I'm pretty certain that's the lxd issue.
[21:55] <kwmonroe> i guess all that's left is to blame your code ;)
[21:56] <kwmonroe> i +1 your suspicion that there's an account limit preventing you from multi aws deployments.. though i think those are region limits.. you should be able to setup an aws-east and aws-west and make gravy.
[21:56] <petevg> kwmonroe: yep. At least it's not an obvious mistake. I can deploy with revised bigtop base layer, with bigtop_jdk turned off, and everything works.
[21:56] <petevg> kwmonroe: Cool. I will try that next.
[21:57] <kwmonroe> oh, well poop.  if bigtop_jdk changes your life, that's on us.
[22:02] <petevg> kwmonroe: it does look like it might be a problem talking to hdfs: http://paste.ubuntu.com/23083287/
[22:02] <petevg> (I get that error both on the namenode and the slave)
[22:06] <kwmonroe> petevg: can you get on the namenode and verify there's a java process running?  (ps -ef | grep java)
[22:07] <kwmonroe> petevg: and if so, verify the NN is listening (sudo netstat -nlp | grep 8020)
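kwmonroe's two checks together; 8020 is the default HDFS NameNode RPC port:

```shell
# Is a Java process (the NameNode) running at all?
ps -ef | grep '[j]ava'

# If so, is anything listening on the NameNode RPC port?
sudo netstat -nlp | grep 8020
```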
[22:07] <petevg> kwmonroe: interesting. There isn't one running. (Java is installed, and setup in /etc/alternatives).
[22:08] <kwmonroe> ok petevg, /var/log/hadoop-hdfs* must tell you something
[22:08] <kwmonroe> if it doesn't, i'll give you a coors light in pasadena
[22:09] <petevg> kwmonroe: aha. There are errors, there.
[22:10] <petevg> "java.io.IOException: NameNode is not formatted."
[22:10] <kwmonroe> oh ffs
[22:10] <petevg> kwmonroe: http://paste.ubuntu.com/23083464/
[22:11] <petevg> (context)
[22:12] <kwmonroe> petevg: this is kindof a big deal.. why isn't https://github.com/juju-solutions/jujubigdata/blob/master/jujubigdata/handlers.py#L478 being run?
[22:13] <petevg> grepping code ...
[22:15] <petevg> kwmonroe: hmmm ... we don't call that function explicitly in layer-hadoop-namenode
[22:17] <petevg> kwmonroe: it's dinner time for me. Tomorrow morning, I am going to grab all the relevant layers and interfaces and libs, and trace how that function gets called. My guess is that something is relying on a status set by the openjdk layer, but it's not trivially greppable, in the bigtop repo, or in the bigtop base layer.
[22:17] <kwmonroe> ack petevg
[22:18] <kwmonroe> fwiw, jbd handlers "format_namenode" might be a red herring.. i don't see where that's called at all.  which makes it true, but not right.
[22:18] <petevg> Heh.
[22:19] <petevg> Maybe the next thing to do is to read the openjdk charm, to see what its doing that bigtop isn't.
[22:19] <petevg> (I skimmed it, but might be time for a deep dive.)
[22:19] <kwmonroe> no, openjdk is my charm.  there's nothing wrong with that.
[22:19] <petevg> :-)
[22:19] <kwmonroe> :)
[22:20] <petevg> Anyway ... going to go get noms. Thx for all the help, kwmonroe. I'll poke at it more in the morning, and bug you about it if I'm still stuck.
[22:23] <kwmonroe> word.  nom for us all.
[22:41] <kwmonroe> hey petevg, i see this on a normal deployment of hadoop-processing:
[22:41] <kwmonroe> unit-namenode-0: 2016-08-23 21:03:51 INFO unit.namenode/0.java-relation-changed logger.go:40 Debug: Executing '/bin/bash -c 'hdfs namenode -format -nonInteractive >> /var/lib/hadoop-hdfs/nn.format.log 2>&1''
[22:41] <kwmonroe> will you check your /var/lib/hadoop-hdfs/nn.format.log to see if there are details there?
[22:41] <kwmonroe> (i know you're nom'ing, just leaving here for when you get back.. petevg petevg petevg)
[22:59] <petevg> kwmonroe: interesting. I do see the call to format namenode, but the only line in the log is an error about JAVA_HOME not being set.
[23:00] <petevg> kwmonroe: my revised code does attempt to set JAVA_HOME, though (and I can see it successfully writing it to the bigtop defaults).
[23:00] <petevg> Maybe it winds up getting set later on, in a way that works for things that I've tested to work like Zookeeper, but doesn't work here.
[23:01] <petevg> kwmonroe: that's a concrete thing that I can actually go and see about fixing. Thank you :-)
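If the format step failed only because JAVA_HOME was unset when the hook ran, re-running it by hand on the namenode unit would look roughly like this; the JAVA_HOME path is an assumption:

```shell
# Export JAVA_HOME explicitly (path is an assumption), then re-run the
# same format command that appears in the charm's debug log
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
hdfs namenode -format -nonInteractive >> /var/lib/hadoop-hdfs/nn.format.log 2>&1
```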
[23:02] <petevg> In other news, the chickens appear to have been slacking off, and I have to run to the store for eggs lest dessert and breakfast plans get spoilt. Catch ya later :-)