/srv/irclogs.ubuntu.com/2015/09/11/#juju.txt

mikenI have a unit which *should* be in an error state (juju log shows an error in the charm resulting in config-changed failing), but the unit isn't in an error state... anyone familiar with that?01:08
mikenMore details on https://bugs.launchpad.net/juju-core/+bug/149454201:08
mupBug #1494542: unit does not go to error state <juju-core:New> <https://launchpad.net/bugs/1494542>01:08
=== natefinch-afk is now known as natefinch
=== urulama__ is now known as urulama
=== lukasa is now known as lukasa_away
Odd_BlokeHow can I check what hooks have been called on a particular unit?13:28
Odd_BlokeAh, status-history seems to do it.13:29
Odd_BlokeSo I'm trying to test leader election/failover of a service with three units.13:30
Odd_Bloke(N.B. We're not using normal leader election yet)13:30
Odd_BlokeI stopped the instance running the leader; Juju has noticed (in `juju status`) but hasn't done anything to notify any of the other units in the service.13:31
Odd_BlokeI would expect at least a relation broken or departed hook to be fired, but I'm not seeing that happen.13:32
Odd_BlokeDoes anyone know what I could do to investigate?13:33
lazypowerOdd_Bloke, the leader-elected hook runs on the unit that becomes the new leader juju elected13:37
lazypowerOdd_Bloke, there's only one way to truly know, and that's exposed via the is_leader check13:37
Odd_Blokelazypower: Couple of questions: (a) these units have a cluster relationship; am I wrong to expect the broken/departed hook on that relationship to be triggered?13:37
Odd_BlokeCrap, I forgot what (b) was going to be.13:38
lazypowerwell context here... let me scope this as someone not familiar with what you've done13:38
=== lukasa_away is now known as lukasa
Odd_Blokelazypower: So the broad context is that we were using "lowest numbered unit is the leader" logic in the ubuntu-repository-cache charm.13:40
lazypowerso, leadership hooks run when leadership changes occur. leader_elected is always run on the leader, and if is_leader = true, take action. If you need to send data over the cluster relation, do so out of band relation-set -r # foo=bar  - otherwise you get nothing really for free with this aside from juju picking your leader, and exposing a few primitives for that. leader-set (i need to double check this) can be used to send data to all the subordinates13:40
Odd_Blokelazypower: We updated charmhelpers (to fix another bug) without noticing that the leader stuff had been pulled in.13:40
Odd_Blokelazypower: So at the moment I'm trying to jerry-rig the "lowest numbered unit is the leader" logic back in (to fix existing deployments).13:41
lazypowerThat sounds troublesome13:41
Odd_Blokelazypower: And then we will look at moving forward to proper leader election.13:41
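The "lowest numbered unit is the leader" rule mentioned above can be sketched in a few lines. This is only an illustration, not the code from the ubuntu-repository-cache charm: the function names are hypothetical, and it assumes unit names of the form `service/N` and that the caller already has the list of peer units from the cluster relation.

```python
def unit_number(unit_name):
    """Return the integer after the slash in a unit name such as 'cache/3'."""
    return int(unit_name.rsplit("/", 1)[1])


def lowest_numbered_unit(local_unit, peer_units):
    """Pick the unit with the smallest number as leader.

    peer_units is assumed to be the list of units visible on the peer
    (cluster) relation; the local unit is not part of that list, so it
    is added here before comparing.
    """
    return min([local_unit] + list(peer_units), key=unit_number)


def is_lowest_numbered(local_unit, peer_units):
    """True when the local unit is the one this rule would pick."""
    return lowest_numbered_unit(local_unit, peer_units) == local_unit
```

Comparing the numeric suffix rather than the raw string matters once a service grows past ten units, since "cache/10" sorts before "cache/9" as text. The rule also only sees units that are still on the relation, so a unit whose machine was stopped without a departed hook firing would still be counted.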
lazypoweralso keep in mind leadership functions landed in 1.23 - so anything < (eg: whats shipping in archive) will not work w/ leadership functions13:42
lazypoweri ran into this with the etcd charm13:42
Odd_BlokeYeah, that's part of the reason we aren't moving straight forward to leadership election.13:42
lazypowerthe charm just blatantly enters error state, sets status, and complains loudly in the logs if you're using < minimum version.13:42
pmatulisis this the only place where users can look in order to choose a tools version to upgrade to? https://streams.canonical.com/juju/tools/releases/13:43
Odd_BlokeBecause we need to take stock of where this is deployed, and maybe manage them through a Juju upgrade.13:43
Odd_Blokelazypower: But I was surprised that one or both of cluster-{broken,departed} weren't called on other units in the service when I stopped the machine running another unit.13:43
lazypowerbroken/departed are implicit actions during the relation-destroy cycle13:44
lazypoweri dont think they get called when the machine is just stopped13:44
Odd_BlokeOK.13:44
lazypowerpmatulis, might want to try asking that in #dev - i dont think they monitor #juju as actively as the eco peeps.13:45
Odd_Blokelazypower: So in a pre-leadership-election world, how do you get notified of/handle a machine going AWOL?13:45
tvansteenburgh1that is surprising if true lazypower13:45
=== tvansteenburgh1 is now known as tvansteenburgh
lazypowerOdd_Bloke, to be completely honest, i dont think we did, because there was no good way to handle it without an implicit action causing a hook to be fired. The work around to something like this is use DNS and hide everything behind load balancers.13:46
lazypowertvansteenburgh, i've shot a few services in the aws control panel and never saw a broken/departed hook fire13:46
lazypowerthis might be a regression i witnessed13:47
lazypowerhere, i'll stand up an etcd cluster, scale to 4 nodes and kill off 113:47
tvansteenburghOdd_Bloke: i'd ask about that in #juju-dev too13:48
lazypowerlets test this theory on 1.24.5 and see if it behaves as we expect13:48
tvansteenburghif you don't i will13:48
lazypowerbootstrapping, should be g2g in ~ 813:48
tvansteenburghlazypower: cool, i'll wait :)13:48
lazypoweri mean the more we talk about this13:48
lazypoweryeah it seems like a big oversight13:49
lazypowerso i'm hoping i witnessed oddity in one environment, or mis-remembering13:49
mthaddonhi folks, can someone help me with the juju local provider on vivid with 1.24.5 (i386)? I was running into https://bugs.launchpad.net/juju-core/+bug/1441319, set the mtu as advised, and am now getting "container failed to start and was destroyed"13:49
mupBug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <canonical-bootstack> <cisco> <cpec> <deployer> <landscape> <lxc> <oil> <regression> <systemd> <upstart> <juju-core:Triaged by cherylj> <https://launchpad.net/bugs/1441319>13:49
cheryljhi mthaddon, there was another issue where the local provider wasn't working on vivid.  Let me double check that it was fixed in 1.24.513:51
mthaddoncherylj: great, thanks13:51
cheryljmthaddon: in the meantime, can you get the contents of /var/log/juju/containers/juju-trusty-lxc-template/console.log into pastebin or something for me to look at?13:53
mthaddoncherylj: is that from one of the instances in the environment? I don't see that on my local machine13:54
cheryljmthaddon: it should be on your system if you're running the local provider.13:55
mthaddoncherylj: https://pastebin.canonical.com/139636/13:55
pmatulislazypower: alrighty13:55
mthaddoner, I mean http://paste.ubuntu.com/12338803/ as some in this channel won't be able to see the one above13:56
cheryljmthaddon: sorry!  it's in /var/lib/juju/containers13:56
cheryljmuscle memory of going to /var/log/juju13:56
mthaddoncherylj: that's a 0 byte file on my machine :/13:56
cheryljmthaddon: okay, let me take a quick look at 1.24.513:58
cheryljmthaddon: if you unset the mtu and try to bootstrap again, can you see if that console.log file gets created?14:03
cheryljthe setting of the mtu was for that very specific environment in that bug14:03
mthaddonsure, gimme a few mins - was pulled into a call, but will get to it soon14:03
cheryljmthaddon: np, I'm going to spin up a vivid machine and see if I can recreate14:04
mthaddoncherylj: removed and still getting "container failed to start and was destroyed", but this time I have logs - http://paste.ubuntu.com/12339144/14:16
mthaddon"Incomplete AppArmor support in your kernel. If you really want to start this container, set lxc.aa_allow_incomplete = 1 in your container configuration file"14:17
mwenninglazypower, good morning14:18
lazypowermwenning, o/14:18
mwenninglazypower, any ideas from my pastebin?14:19
lazypowermwenning, honestly, haven't had a chance to take a look - let me wrap up this debug session i'm doing for Odd_Bloke and i'll take another look14:19
mwenninglazypower, k no hurry14:20
lazypowertvansteenburgh, ok a 7 node etcd cluster just settled.... i wont get into why its 7 nodes large.14:20
lazypowerbut it rhymes with i'm impatient14:20
lazypowertvansteenburgh, Odd_Bloke - agreeable method on testing this is to just terminate the machine in the AWS control panel?14:21
Odd_Blokelazypower: I was doing this on GCE and stopped rather than terminated; but yes, that sounds reasonable.14:21
lazypowerok state server received an EOF from the unit in question, no action taken so far14:22
cheryljmthaddon: I haven't seen that error before.  Let me poke around a bit more.14:23
* mthaddon nods14:23
lazypowerOdd_Bloke, 3 minutes in and no action taken. Unless it suddenly decides to execute those hooks i think my assertion stands that it does nothing for you without an implicit breaking action.14:25
Odd_BlokeRight.14:26
Odd_BlokeAnd that is expected behaviour?14:26
lazypoweri dont know that i would expect it to do that14:26
lazypoweri think the state server should do its diligence to run the broken/departed hooks on that unit's relations until it comes back14:26
lazypowertvansteenburgh, ^14:27
lazypowerOdd_Bloke, also - you cannot terminate the machine via conventional means - juju destroy-machine # --force just to get it out of the enlistment makes it go away, however the unit's departed/broken hooks do not run14:30
lazypowerso we still have possible broken config left around in the cluster14:30
Odd_BlokeBlargh.14:30
lazypowerso, looks like we found a pretty gnarly case that we need to file for14:31
lazypowerget it on the docket to be looked at14:31
Odd_BlokeOK, well at least I'm not going crazy. :p14:31
lazypowerOdd_Bloke, https://bugs.launchpad.net/juju-core/+bug/149478214:35
mupBug #1494782: should *-broken *-departed hooks run when a unit goes AWOL? <juju-core:New> <https://launchpad.net/bugs/1494782>14:35
lazypowerAnything you can add here would be great, as i'm not sure i did a great explanation of the problem domain14:36
lazypowermwenning, looking now14:36
lazypowermwenning, the invalid config items strike me as the first issue -  deployer.deploy: Invalid config charm ceph osd-devices=/tmp/ceph014:37
mwenninglazypower, I was assuming those would go away once it could find the local ceph charm.14:38
mwenninglazypower, the bundle was exported from a running juju session14:39
cheryljmgz, I'm looking at the artifacts for bug 1494356, and I only see the container information for the juju-trusty-lxc-template container.14:40
mupBug #1494356: OS-deployer job fails to complete <blocker> <ci> <regression> <juju-core:Triaged by cherylj> <juju-core 1.25:Triaged by cherylj> <https://launchpad.net/bugs/1494356>14:40
Odd_Blokelazypower: So if I wanted to get those hooks to fire, what do I do?  juju remove-unit?14:40
lazypowerOdd_Bloke, i was trying to figure that out and by destroying the machine it removed the unit14:41
lazypowerso it effectively blocked me from doing anything to reconfigure the service14:41
mgzcherylj: I think we have some namespace collision issues14:42
cheryljmgz: I was wondering if that was the issue.14:42
mgzcherylj: the logs are named the same thing and dir isn't preserved14:42
Odd_Blokelazypower: Ah, yes, remove-unit has triggered cluster-relation-departed14:43
lazypowerwell that helps14:43
* mwenning is rebooting after a kernel update...14:44
lazypowerack mwak14:44
lazypowerer14:44
lazypowermisping14:44
mgzcherylj: what else is in those dirs apart from logs?14:45
mgzcherylj: wondering if I can just archive the complete dirs14:45
cheryljmgz: the logs and the cloud config for cloud init.  Nothing too large14:47
lazypowermwenning, ok, lets see if we cant iron this out. When you comment out those config directives does the bundle deploy still fail by not finding the charm?14:53
=== JoshStrobl is now known as JoshStrobl|Nap
* mwenning is waiting for juju to bootstrap14:57
lazypowermwenning, also which version of juju-deployer are you running?15:00
mgzcherylj: I updated the logging, there's a CI run in progress though so won't get anything new for a while15:09
cheryljmgz: thanks!  ping me when it's done and I'll take a look15:09
mwenninglazypower, juju-deployer 0.5.1-315:13
lazypowerok, thats the most recent release of deployer15:23
* lazypower checks off one box15:23
lazypowerHey, if anybody here is interested in delivering docker app containers with juju - I'd love a review on this PR if you've got time - https://github.com/juju/docs/pull/67215:24
=== lukasa is now known as lukasa_away
=== lukasa_away is now known as lukasa
mwenninglazypower, found at least part of it, waiting for bootstrap again15:42
=== natefinch is now known as natefinch-afk
mwenninglazypower, the problem was that the charm dirs were named differently than "ceph" and "ceph-dash" .16:27
lazypoweri thought it boiled down to something like that. Deployer wasn't able to find the charms it was looking for16:28
mwenningThis worked OK with the command-line 'juju deploy', but juju-deployer apparently uses a different way of finding them (?)16:28
lazypowerwell cool - glad you sorted it mwenning16:28
lazypowerit does. Deployer creates a cache in $JUJU_HOME16:29
lazypowerand it looks for dir names that match the charms as thats part of proof16:29
lazypowerjuju is a bit more forgiving with that, raising a warning that charm_name doesn't match the dir name - but still deploys.16:29
mwenninglazypower, good to know16:29
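The mismatch described above (charm directories named differently from "ceph" and "ceph-dash") can be checked before running deployer. The following is a rough sketch, not part of juju-deployer: the helper names are hypothetical, and it assumes each charm directory has a `metadata.yaml` with a top-level `name:` key, read here with a simple line scan rather than a YAML parser.

```python
import os


def charm_name_from_metadata(charm_dir):
    """Return the top-level 'name:' value from metadata.yaml, or None."""
    path = os.path.join(charm_dir, "metadata.yaml")
    with open(path) as handle:
        for line in handle:
            # Only unindented keys are considered top-level.
            if line.startswith("name:"):
                return line.split(":", 1)[1].strip().strip("'\"")
    return None


def dir_matches_charm_name(charm_dir):
    """True when the directory's base name equals the charm's declared name."""
    expected = charm_name_from_metadata(charm_dir)
    actual = os.path.basename(os.path.normpath(charm_dir))
    return expected is not None and expected == actual
```

Running `dir_matches_charm_name` over each local charm directory would flag cases like a directory called `ceph-charm` holding a charm whose metadata says `name: ceph`.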
pmatulisi just did 'juju upgrade-juju' and got back < ERROR invalid binary version "1.24.5--amd64" > . indeed i am running juju-core 1.24.516:51
pmatulisagents are currently using 1.22.816:51
WalexI upgraded Juju from 1.23.3 to 1.24.5 should I also upgrade MAAS from 1.5.4 to 1.7.6 (all this on ULTS 14). Is the MAAS upgrade likely to be pretty painless? It is a provider for Juju.17:13
Walexpmatulis: I had much the same issue going 1.23.3 to 1.24.5 and things are complicated. Have a look at the recent thread here: http://comments.gmane.org/gmane.linux.ubuntu.juju.user/282417:18
=== natefinch-afk is now known as natefinch
pmatulisWalex: i read it but i don't see my error. i will try being explicit (--version) with a version other than 1.24.5 . i wonder what the rules are for a version to be considered "valid"?17:41
firlCharles butler on?17:42
Walexpmatulis: your error is described in the first message17:52
pmatulisWalex: i don't see it18:02
Walexpmatulis: http://permalink.gmane.org/gmane.linux.ubuntu.juju.user/282418:03
pmatulisWalex: nothing on 'invalid binary' there18:05
natefinchmarcoceppi: https://github.com/marcoceppi/juju.fail/pull/318:06
natefinchmarcoceppi: oops, crud, missing a comma18:07
=== JoshStrobl|Nap is now known as JoshStrobl
Walexpmatulis: look a bit harder18:13
Walexpmatulis: the words "invalid binary" don't indeed appear.18:13
Walexpmatulis: it is your own choice to look for those words.18:14
Walexpmatulis: look for ""1.24.5--amd64"18:18
firlAnyone know how to change the “http://ubuntu-cloud.archive.canonical.com “ mirror for installed unit packages?18:21
pmatulisWalex: yes, i see that. i'm assuming my error is the same then18:25
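One way to see why "1.24.5--amd64" is rejected: the string looks like a `<number>-<series>-<arch>` triple with the series left empty. The sketch below is an assumption about that format, not juju's actual parser, and `parse_binary_version` is a hypothetical name.

```python
def parse_binary_version(text):
    """Split '<number>-<series>-<arch>' and reject empty fields.

    Splitting from the right keeps any dash inside the number part
    (for example a pre-release tag) attached to the number.
    """
    parts = text.rsplit("-", 2)
    if len(parts) != 3:
        raise ValueError("expected <number>-<series>-<arch>: %r" % text)
    number, series, arch = parts
    if not number or not series or not arch:
        raise ValueError("empty field in binary version: %r" % text)
    return {"number": number, "series": series, "arch": arch}
```

Under this assumption, "1.24.5-trusty-amd64" parses cleanly, while "1.24.5--amd64" fails because nothing sits between the two dashes where the series would go.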
lazypowerfirl, Hello18:25
firlhey lazypower18:25
lazypoweri believe you're looking for me?18:25
firlDidn’t realize you were the person emailing, was just going to say thanks for sending me the links to the .md files for the docker layer18:26
lazypowerAnytime!18:26
lazypowerReally excited to get your feedback there18:26
firlYeah, It might be a few weeks, but the hope is to wrap some of our services into that layer, I am interested to see how easy it works with a private docker hub18:29
lazypowerIf you find any bugs that you need sorted to support that, feel free to file them on the GH repo for the docker layer and we will do our best. There's a todo item for charming up the private registry and adding relation stubs to support configuration ootb18:30
lazypowerso it may not do what you need just yet without some manual intervention18:30
=== zz_CyberJacob is now known as CyberJacob
firlgotcha; I will see how far I can get. One of the requirements is to put a virtual bridge between a physical nic into docker. So I might have to contribute some stuffs anyways18:34
blahdeblahHi all.  Can anyone point me to an example charm which makes use of leader election?19:46
lazypowerblahdeblah, we use the leader election bits in etcd - http://bazaar.launchpad.net/~kubernetes/charms/trusty/etcd/trunk/view/head:/hooks/hooks.py#L3619:54
lazypowerits pretty simplistic however19:54
blahdeblahlazypower: simplistic is what I want for now - thanks :-)19:54
=== scuttle` is now known as scuttle|afk
=== scuttle|afk is now known as scuttlemonkey
natefinchahasenac, dpb1_: note that a fix to 1486553 has landed in 1.2420:37
natefincher ahasenack ^20:37
dpb1_natefinch: that is fantastic20:37
natefinchhttps://bugs.launchpad.net/juju-core/+bug/148655320:37
mupBug #1486553: i/o timeout errors can cause non-atomic service deploys <cisco> <landscape> <juju-core:In Progress by natefinch> <juju-core 1.24:Fix Committed by natefinch> <juju-core 1.25:In Progress by natefinch> <https://launchpad.net/bugs/1486553>20:37
dpb1_natefinch: will there be another half landed in 1.24, or will that part be in 1.25?20:39
natefinchdpb1_: figuring that out now.  The other half is a little more tricky, so we may put it into 1.25 instead20:39
dpb1_natefinch: ok, understood20:40
=== natefinch is now known as natefinch-afk
=== CyberJacob is now known as zz_CyberJacob
=== zz_CyberJacob is now known as CyberJacob
=== CyberJacob is now known as zz_CyberJacob
