=== Guest16410 is now known as CyberJacob
[11:52] cholcombe - FYI - for this version of the revq, bugs must be in a new, or fix-committed state in order to be ingested. otherwise they get ignored. The "in-progress" status of that bug would have actively prevented it from being ingested.
[11:53] i went ahead and ftfy
[12:22] lazyPower, hi, good morning.
[12:24] Regarding the problem with resource-get hanging... I just hit that; https://bugs.launchpad.net/juju-core/+bug/1577415/comments/2
[12:24] Bug #1577415: resource-get hangs when trying to deploy a charm with resource from the store
[12:24] neiljerram - yep. one of two things will fix that
[12:25] Is there a straightforward workaround, given that I'm deploying from a bundle?
[12:25] supply the ETCD bins (that make target is handy for grabbing them), or deploy it as a local charm so it hits the xenial fallback
[12:26] Do bundles in Juju 2 have a syntax for local charms?
[12:26] yep, instead of defining cs:foo/bar - give it a fullpath on disk
[12:26] A few days ago, IIRC, I had a bundle with "branch:" settings, and it was complaining...
[12:28] But perhaps you're saying that charm: will work.
[12:30] ... yes, it appears so.
[12:36] lazyPower, Now with the local charm, juju status says "Missing Resource: see README". Does that mean that the fallback hasn't worked?
[12:36] niemeyer - can you confirm the series is xenial and not trusty?
[12:37] er
[12:37] neiljerram
[12:38] Ah no, it appears the machine is trusty. I must be missing something in my bundle.
[12:39] Is it because the charm's own metadata.yaml has trusty before xenial?
[12:43] neiljerram - give me a sec and i'll publish a revision of the charm that short circuits the resource-get calls
[12:43] in the interest of making it less painful while we sort out the store
[12:44] well if you like - here I just deleted the trusty decl from my local, and it looks like it's now deploying on xenial, so should work.
[12:44] \o/
=== rogpeppe1 is now known as rogpeppe
[12:52] lazyPower, my etcd install has completed now
[12:53] neiljerram - juju ssh into that etcd node and poke it :) your calico node should have barfed trying to talk to it, as its TLS encrypted and it doesn't have client certificates yet
[12:54] even with -6?
[12:54] (I thought the TLS work came after that version)
[12:56] -7 should be the TLS bits
[12:57] -6 is just xenial support
[12:58] I'm using just -6 for now - focussing on the Xenial support first (plus other things for the relevant contract).
[12:59] ack
[12:59] FYI, http://pastebin.com/UnxWHXrd I think that all indicates that it's not expecting https or TLS communication...
[13:00] yep, thats all the old port schema, and no TLS
[13:01] icey, beisner: minor niggle as a result of the switchout of ceph-mon from ceph:
[13:01] https://bugs.launchpad.net/charms/+source/ceph-radosgw/+bug/1577519
[13:01] Bug #1577519: ceph-radosgw "Initialization timeout, failed to initialize"
[13:02] basically rgw gives up if it can't complete init in 5 mins - if no OSD's appear, then we hit that...
[13:02] fginther, ^^
[13:02] they do appear - just a bit later I guess
[13:02] jamespage: I'm not certain that it is, I may try to reproduce with the ceph charm
[13:02] jamespage: in essence, I suspect that would happen with the ceph charm deploying no ceph-osds
[13:02] icey, quite correct
[13:03] icey, its fairly easy to reproduce - just deploy ceph-mon with ceph-radosgw with no ceph-osd...
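To make the race concrete: radosgw gives up on initialization after roughly five minutes, so if ceph-mon hands its key across the relation while the cluster still has zero OSDs, the gateway can time out before the first OSD ever joins. A minimal sketch of the fix the discussion converges on just below (withhold the key until OSDs are present) might look like the following on the ceph-mon side; the get_osd_count and get_radosgw_key helpers are hypothetical placeholders, not the actual ceph-mon charm code.

```python
# Hedged sketch, not the real ceph-mon charm: defer handing the radosgw key
# across the relation until at least one OSD has joined, so radosgw never
# starts its 5-minute init countdown against an empty cluster.
from charmhelpers.core import hookenv

hooks = hookenv.Hooks()


def get_osd_count():
    # Hypothetical helper: the real charm would ask the monitor how many
    # OSDs are in the cluster (e.g. by parsing `ceph osd stat`).
    raise NotImplementedError


def get_radosgw_key():
    # Hypothetical helper: fetch or create the cephx key for radosgw.
    raise NotImplementedError


@hooks.hook('radosgw-relation-joined', 'radosgw-relation-changed')
def radosgw_relation():
    if get_osd_count() < 1:
        # No OSDs yet - don't publish the key. The charm would need to
        # revisit this relation (e.g. from its osd relation hooks) once
        # the first OSD joins, rather than relying on an automatic re-run.
        hookenv.log('Deferring radosgw key exchange until OSDs join the cluster')
        return
    hookenv.relation_set(relation_settings={
        'radosgw_key': get_radosgw_key(),
        'auth': 'cephx',
    })
```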
[13:03] in essence, maybe we should hold off on allowing attachment until we have OSDs in the quorum jamespage
[13:03] jamespage: I mean repro on the ceph charm since it's not specifically a ceph-mon+ceph-osd issue but rather an issue of allowing the related connections to get set up before we have OSDs
[13:05] icey, like you say, this has always existed, but the switch to ceph-mon in containers triggers this more frequently in deployments; ceph was typically placed on hardware so had its own osd's
[13:13] fginther, hey - think we have a root cause for the radosgw init failures...
[13:14] jamespage, nice!
[13:14] jamespage, sorry if you sent me a bunch of questions, been having network/IRC issues
[13:14] fginther, its actually a bug that's always existed, but until the switch to ceph-mon, you would have been unlikely to see it
[13:15] fginther, I suspect that ceph-osd is not the first charm down onto the physical machines right?
[13:15] i.e. it gets done after nova-compute
[13:16] jamespage, let me check
[13:17] fginther, well whatever the order, the time between ceph-radosgw starting up, and the first osd's joining the cluster is > 5 mins
[13:18] fginther, as icey and I were discussing, the right fix is to probably not give out the keys to radosgw until we know there are some OSD's in the cluster...
[13:20] jamespage, let me know if you have something to test. I've been hitting this frequently on one of my maas clusters
[13:22] jamespage, would this be a change to the charm or the service itself?
[13:22] fginther: it's a change to the charm
=== frankban|afk is now known as frankban
[16:11] lazyPower, hi again - just hit a couple of further issues with the relation between etcd-6 and other charms that incorporate an etcd proxy
[16:12] lazyPower, one is that apparently a proxy needs the peering URL for its ETCD_INITIAL_CLUSTER config; not the client URL. In other words, typically, :2380 instead of :2379.
=== redir_ is now known as redir
[16:13] neiljerram - ah i didnt notice it was using the management port
[16:14] * lazyPower snaps
[16:14] so much for deprecating an interface
[16:14] lazyPower, the second is that the connection string, that gets passed across the relation, is missing the cluster name; in other words it appears to be something like "http://172.18.19.20:2379", whereas what a proxy needs is "etcd0=http://172.18.19.20:2380"
[16:14] lazyPower, yes, I'm afraid so. :-)
[16:14] neiljerram - think you're up to contributing the interface?
[16:15] I can hack around these things for now, but firstly was just wondering if I'd missed something.
[16:15] nope, you've caught me red handed at breaking you :(
[16:15] i would have gotten away with it too had it not been for you pesky kids and your dog!
[16:15] lazyPower, hehe
[16:17] lazyPower, so what would be involved in putting back a distinct proxy relation?
[16:18] neiljerram - its a departure from whats familiar - https://github.com/juju-solutions/interface-etcd -- basically modify this to be the etcd-proxy interface, and we'll need to feed it the management initial-cluster-string
[16:18] when i say modify, i mean clone / use as a template. it'll be a brand new interface, but a majority of what needs to be there is there.
[16:18] nix the peers interface and go for a strict provides/requires
[16:19] and some slight tweaks to the layer to implement the relationship (metadata, supporting code in reactive/etcd.py to provide the relation data)
[16:26] lazyPower, OK, thanks - let's leave the matter here for now. I'll review implications for my current project and follow up with you by email.
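To illustrate the two proxy problems described above (the proxy wants the peer URL on :2380, and it wants each entry prefixed with a member name, e.g. "etcd0=http://172.18.19.20:2380"), here is a rough sketch of how an etcd-proxy interface could assemble the ETCD_INITIAL_CLUSTER value from relation data. It is only an illustration under assumed input shapes, not the actual interface-etcd code.

```python
# Hedged sketch: build the ETCD_INITIAL_CLUSTER string a proxy expects,
# i.e. comma-separated "name=http://host:2380" entries (peer port), rather
# than the bare client URLs on :2379 currently sent over the relation.
# The (unit_name, address) input shape is an assumption for illustration.
def initial_cluster_string(peers):
    """peers: iterable of (unit_name, address) pairs gathered from the relation."""
    entries = []
    for unit_name, address in peers:
        # Juju unit names look like "etcd/0"; strip the "/" so the member
        # name is something etcd will accept, giving e.g. "etcd0".
        member = unit_name.replace('/', '')
        entries.append('{}=http://{}:2380'.format(member, address))
    return ','.join(entries)


# Example: [('etcd/0', '172.18.19.20')] -> 'etcd0=http://172.18.19.20:2380',
# which matches the format neiljerram says the proxy needs.
```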
[16:28] sounds good neiljerram. sorry about the inconvenience :/ i wish i would have caught that sooner
[16:49] how do i pass --storage data=ebs,10G to amulet for testing?
[16:50] cholcombe - i dont think we have storage support in amulet, but i may be wrong
[16:51] lazyPower, yeah. I see you can pass constraints but that won't help me
[16:51] iirc there's a bug to track that, but its under the umbrella of 2.0 feature compatibility
[16:51] lazyPower, yeah i found the bug
[16:53] lazyPower, ok so my charmhelpers promotion of gluster is prob on hold then until i can get storage support in amulet.
[16:53] cholcombe - feel up to contributing storage support? (its python, not rust)
[16:53] i might look into just hacking this in myself but i don't know if it'll get accepted
[16:53] lol
[16:54] :D sorry, obvious troll is obvious
[16:54] ;)
[16:54] howdy cholcombe, i think storage support will also be dependent on the provider used to run the test. ie., i don't think the juju openstack provider understands storage yet.
[16:55] interesting
[16:55] ah right
[16:55] seems like currently a maas prover only kinda thing i think?
[16:56] provider even
[16:56] https://jujucharms.com/docs/devel/charms-storage
[16:56] beisner - all supported substrates are listed here
[16:58] looks like it's more feature-ful than i recall :-)
[17:12] \o/ wooo features
[17:12] we like features
[17:26] https://review.openstack.org/#/q/topic:switch-to-bundletester
[17:26] ^ jamespage, wolsen, thedac
[17:26] 3 oscharms switched to run bundletester+amulet purely from venv via tox. thedac reviewed & +1d, he and i are looking for add'l reviews before unleashing the sweep of updates.
[17:26] I'd suggest starting with the README_TESTS.md to see the new workflow. The test bot is already wired to use this new method by priority, but fall back to the legacy stuff if tox func tests are not present.
[17:26] Can you have a sweep through, call out anything that's not clear? tia.
[17:26] dosaboy tinwood too ^
[17:27] beisner: sure thing, I'll queue it up for the day
[17:27] very likely pilot error, but are folks able to deploy a bundle on beta7?
[17:27] I can bootstrap, but once I deploy a bundle I get error on each machine
[17:27] dosaboy, will take a look tomorrow.
[17:28] beisner ^^
[17:28] thx tinwood wolsen much appreciated
[17:29] i'll let those set and chill, switching gears to charm upload/publish automation
[17:29] tvansteenburgh: any known issues with bundletester and this error? ssl.SSLError: [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:590)
[17:33] tinwood, wolsen, dosaboy - fwiw, if you're bored ;-) we still have "workload status vs init systems" raciness @ master for at least keystone, and i suspect others. see thedac's keystone review. he fixed one race path, but there is still something off.
[17:33] jamespage, fyi ^
[17:33] ref: https://review.openstack.org/#/c/316195/
[17:34] bug 1581171
[17:34] Bug #1581171: pause/resume failing (workload status races)
[17:43] arosales - is it an error at the machine level? (i assume yes)
[17:44] lazyPower: correct, error at the machine level
[17:44] arosales - and if thats the case, have you tried juju retry-provisioning? (this wont help in all cases... eg: out of resources)
[17:44] * arosales trying west-2 now
[17:44] was on east1
[17:44] arosales - are you on the CDP credentials?
[17:44] just wanted to see if others had been successful here, I suspect so.
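Returning to cholcombe's amulet question from earlier in the log: as a rough sketch from memory of the amulet API (so treat the exact keyword arguments as assumptions), constraints can be supplied when a service is added, but there is no equivalent keyword for Juju storage, which is the gap blocking the gluster tests.

```python
# Hedged sketch of a typical amulet test setup; keyword names are from
# memory and should be treated as assumptions. Constraints can be passed
# per service, but there is no storage equivalent of
# `juju deploy --storage data=ebs,10G` at the time of this discussion.
import amulet

d = amulet.Deployment(series='xenial')
d.add('gluster', units=3, constraints={'mem': '4G'})  # constraints: supported
# d.add('gluster', storage={'data': 'ebs,10G'})       # no such option exists
d.setup(timeout=900)
```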
[17:44] lazyPower: no
[17:44] yep, i've been in/out of AWS/GCE today without issue
[17:45] ok just checking, as thats getting a bit crowded lately
[17:46] west-2 now at least in pending
[17:46] instead of straight error
[17:46] lazyPower: ok thanks for the feedback
[17:47] well, thats progress
[17:47] * lazyPower is bootstrapping in us-east-1 rn
[17:50] magicaltrout: ping
[17:52] aisrael: you need the latest python-jujuclient
[17:53] aisrael: install from pypi or ppa:tvansteenburgh/ppa
[17:53] aisrael: ditto for juju-deployer
[17:59] tvansteenburgh: ta!
[18:05] hey all, been trying to set up ceph-osd and ceph-mon but get a 'hook failed: "mon-relation-changed" for ceph-mon:osd ' message for ceph-osd. AFAIK I've got correct parameters in my YAML file. Output and config available here:
[18:05] https://gist.github.com/0X1A/fd19baf8e8db7586e8d5e753b54589fc Any ideas?
[18:09] rick_h_: you around?
[18:19] beisner, okay, something for me to look at tomorrow too :)
[18:20] tinwood, muchos
[18:25] Brochacho, hey there
[18:26] Brochacho, looks like you're using juju2, is that correct?
[18:26] cholcombe: Hey Chris, yes I'm using 2.0-beta6
[18:27] Brochacho, icey was just saying that he deploys on xenial with juju2 and it's fine. hmm..
[18:28] Brochacho, can you do a juju debug-log --include ceph-osd/0
[18:28] i'd like to try and figure out what failed
[18:28] arosales: any luck with those instances? I had one machine allocate fine but the next two failed
[18:29] Brochacho, is your filesystem ext4 on those osds? That would give this error
[18:30] Brochacho: can you jump into a hangout for a minute?
[18:31] aisrael: finally got a good deploy in us-west-2
[18:32] aisrael: thanks for checking
[18:32] arosales: ack, thanks. I think us-east is my problem, too.
[18:33] cholcombe: Yes, one second. I respun the osd's since they locked up in error. Seems that when they error I can't use remove-unit
[18:33] icey: Yes, I can
[18:41] omg, there's a brochacho here
[18:41] ha!
[18:41] sorry for the absolutely do nothing ping... but you're my most abused slang word to refer to friends... just ask my co-workers
[18:41] i feel somehow validated...
[18:42] mbruzek hey, check this out. today is full of random occurrences of awesome
[18:42] what?
[18:42] theres a bro-chacho here... like... proper noun and everything
[18:43] its the little things...really :)
=== redir is now known as redir_afk
[19:05] arosales: how did you switch to us-west?
[19:20] oh, controller/region
=== redir is now known as redir_afk
[20:16] icey: 'juju set-default-region aws us-west-2'
[20:17] thanks arosales
[20:19] icey: np
=== blr_ is now known as blr
[23:33] hows it going everyone? Has anyone successfully bootstrapped lxd on a non lxdbr0 interface?
[23:34] ....like a bridge bridged to an external network
[23:34] non host only