=== scuttlemonkey is now known as scuttle|afk
[03:46] while doing 'juju bootstrap' on the local (lxc) env I get "ERROR there was an issue examining the environment: required environment variable not set for credentials attribute: User"
[03:46] Any hints as to what that means?
[03:47] * thumper thinks
[03:48] thomi: you aren't using lxc
[03:48] or local
[03:48] thumper: oh wait, I think yeah
[03:48] that error comes from the openstack provider
[03:48] thumper: sorry, thinko on my part
[03:48] np
[03:48] forgot I had the env var exported
[03:48] thanks
[03:51] thumper: good catch
[03:51] o/ lazyPower
[03:51] thumper: btw, i *will* get to your django MP's this week, soz its taken me so long to get to them
[03:51] lazyPower: review poke
[03:51] :_
[03:51] heh
[03:51] hah
[03:51] already on your wavelength mate
[03:51] lazyPower: once that one is in, I'll submit the celery one
[03:51] I have something you're going to want to take for a spin i think.
[03:52] I'm using it now
[03:52] http://github.com/chuckbutler/dns-charm
[03:52] i've been reviving this project from last year quite a bit
[03:52] huge feature branch is going to land later this week that includes RT53 as a provider
[03:54] lazyPower: interesting
[03:54] I like to think so
[03:54] I've got a long road ahead of me w/ the unit tests that are failing
[03:55] i think i've failed to encapsulate my contrib code from charmhelpers somewhere, its failing on the cache file for config() in ci
[03:55] but thats future chucks problem (by future i mean tomorrow)
[03:58] :)
[03:59] thumper: i'll hit your MP up first thing when i clock into the office tomorrow, that sound good? that gives you 3 days to refactor before I head out for SF if anything needs some touch ups
[03:59] then i'll be out until Thurs of next week
[03:59] lazyPower: sounds good.
[03:59] lazyPower: should be fine though
[03:59] aight, you're on the calendar
[03:59] * thumper crosses fingers
[03:59] It more than likely is :)
[04:00] i have faith in your ability to python
[04:00] nice
[04:00] lazyPower: I just remembered another fix that I should submit...
[04:00] lazyPower: although this one is all docs
[04:01] YOU WROTE DOCS?
[04:01] no
[04:01] oh
[04:01] the readme was stomped over
[04:01] between version 6 and 7 of the charm
[04:01] dude dont get me excited like that
[04:01] and they no longer reflect reality
[04:01] however...
[04:01] i dont think my heart could take it
[04:01] I am going to write docs
[04:01] around how to write a payload charm for it
[04:01] because that truly sucked
[04:02] messing around working that out
[04:02] i wouldn't doubt it
[04:02] payload charms are tricky to get write
[04:02] *right
[04:02] I did learn a lot though :)
[04:02] would be good to capture that
[04:02] in a way someone else can learn from it
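(Aside on the 03:46 bootstrap error above: juju's openstack provider fills in credentials from environment variables, so a half-set batch of OpenStack exports can break a `juju bootstrap` that was intended for the local provider. A minimal diagnostic sketch, assuming the conventional OS_* variable names -- the exact variable-to-attribute mapping is an assumption here, not something stated in the log:)

```python
#!/usr/bin/env python
# Sketch: warn about leftover OpenStack credential exports before
# bootstrapping the local provider. The OS_* names below are the
# conventional novarc variables and are assumed, not quoted from the log.
import os

OPENSTACK_VARS = ("OS_USERNAME", "OS_PASSWORD", "OS_TENANT_NAME",
                  "OS_AUTH_URL", "OS_REGION_NAME")

def stray_openstack_credentials():
    """Return any OpenStack credential variables currently exported."""
    return [name for name in OPENSTACK_VARS if name in os.environ]

if __name__ == "__main__":
    stray = stray_openstack_credentials()
    if stray:
        print("OpenStack credential variables are exported in this shell:")
        for name in stray:
            print("  %s is set" % name)
        print("Unset them (or explicitly select the local environment) "
              "before running 'juju bootstrap'.")
    else:
        print("No OS_* credentials exported; the local provider should "
              "bootstrap without consulting openstack credentials.")
```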
=== JoshStrobl is now known as JoshStrobl|AFK
[07:20] wallyworld: hey, I just saw your cursor on the resources spec, there was a change to "disk-path" I mentioned.
[07:22] jam: i've been making lots of changes and also responding to rick's comments. i understand the default is needed but also think we need to not hard code it to allow deployers to say their resources go elsewhere, eg onto an ebs volume
[07:23] we can hard code it if that's the plan, but i think it is a bit limiting? what if the resource won't fit on the root disk?
[07:27] wallyworld: can we fit those resources in gridfs?
[07:28] jam: i had thought we'd use a separate mongo db so we can shard etc
[07:28] but i guess it doesn't matter, we can just hard code
[07:28] wallyworld: so I'd like to leave it in the "do we want to add this" pile
[07:28] we can decide on it, but I'd rather start simple
[07:28] ok
[07:29] it's not that much to support, there's much harder stuff first up :-)
[07:29] also, isn't the default root disk size on aws 8GB?
[07:29] wallyworld: I certainly agree it isn't hard, but it is complexity that we may never actually need.
[07:29] that's rather small
[07:29] wallyworld: it is, but so is the size of the MongoDB that's running the environment.
[07:30] not if we use a separate db for resources
[07:30] wallyworld: where does that DB *live* ?
[07:30] well, fair point
[07:31] that would be a complication first up
[07:31] wallyworld: I guess if we let you tweak the Juju API server to put the Resource cache onto a different disk
[07:31] something like that. but as you say, we can start simple
[07:31] wallyworld: so I'm happy to have it as a separate logical Database (like we do for presence and logging)
[07:31] yup
[07:32] wallyworld: and especially for the large multiple environments having a way to go in and do some sort of surgery to handle scale will be good
[07:32] yeah, we always planned to use a separate logical db
[07:34] jam: so i'm off to soccer soon, i think i've answered most of rick's questions but i need time away from the doc as it's starting to blur into a mess of words. i'll revisit later and tweak some more. need to add sample output etc. there's still some points needing clarification. hopefully it's getting close
[07:35] wallyworld: np, have a good night
[07:35] ty, be back after soccer
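(For readers following the resources discussion above: the idea is to keep charm resources as blobs in MongoDB, in their own logical database so they can later be sharded or relocated without touching the main environment database. A minimal sketch of that shape using pymongo's GridFS -- the "juju-resources" database name and the metadata fields are illustrative assumptions, not juju's actual schema:)

```python
# Sketch: storing a charm resource blob in GridFS inside a separate
# logical database of the same mongod that backs the environment.
# Database name and metadata fields are illustrative, not juju's schema.
import gridfs
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
resources_db = client["juju-resources"]   # separate logical DB, same mongod
fs = gridfs.GridFS(resources_db)

def put_resource(name, payload, revision):
    """Store one resource payload; GridFS chunks large blobs for us."""
    return fs.put(payload, filename=name, metadata={"revision": revision})

def get_resource(file_id):
    """Read a resource payload back by its GridFS id."""
    return fs.get(file_id).read()

if __name__ == "__main__":
    fid = put_resource("trusty/wordpress/payload.tgz", b"...blob bytes...", revision=1)
    print(len(get_resource(fid)), "bytes round-tripped")
```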
[11:22] If someone could take a look at https://code.launchpad.net/~daniel-thewatkins/charms/trusty/ubuntu-repository-cache/update_charm-helpers/+merge/262072, it would be much appreciated.
[11:22] I failed to add some of the new charmhelpers files, so the ubuntu-repository-cache charm is broken.
[11:22] It's a very easy code review. :)
=== JoshStrobl|AFK is now known as JoshStrobl
=== anthonyf is now known as Guest89967
[12:59] lazyPower: ping =)
=== scuttle|afk is now known as scuttlemonkey
[13:50] lukasa: pong
[13:50] o/
[13:52] o/
[13:52] Wanted to get your eyes on this quickly: https://github.com/whitmo/etcd-charm/pull/10
[13:54] so, as these etcd units are not raft peers they aren't part of the same cluster?
[13:54] just independent etcd nodes on each docker host?
[13:54] lazyPower: Correct
[13:54] hi coreycb, a merge/review for you re: ceilometer amulet test updates: https://code.launchpad.net/~1chb1n/charms/trusty/ceilometer/next-amulet-kilo/+merge/261850
[13:54] well, s/docker/service/
[13:55] ok
[13:55] lazyPower: Eh, you say tomato...
[13:56] hehe, well the bug mentions calico openstack
[13:56] but i bet this is for both
[13:56] =P Certainly on OpenStack we deploy etcd proxies everywhere for scale reasons more than anything else
[13:56] beisner, ok I'll look later today probably
[13:56] But also for homogeneity
[13:56] (Fun word, glad I got to use it)
[13:56] ok, i'm good with this. would be excellent to see tests here too but i wont block on that
[13:56] Well, do you want to hold off a sec?
[13:56] coreycb, ack thanks
[13:56] sure
[13:56] I'm writing the Calico side of things, and I can quickly sanity check by actually running the damn thing
[13:56] =D
[13:56] i'm +1 for that
[13:57] while i've got your attention
[13:57] Awesome, so that'll get done today or tomorrow
[13:57] is the docker merge still blocked on CLA?
[13:57] AFAIK, yes, but I'll double check
[13:57] ok let me reach out to my contact and poke them again
[13:59] lazyPower: Fab, I'm checking on my end as well
[14:09] lazyPower: Yup, as far as we know we're still waiting on the CLA stuff
[14:10] i unfortunately had presumed as much, i just poked my contact again. i think they're dragging feet on a confirmation from management to sign it.
[14:10] i'll run the ropes on this and see if i cant get it expedited
[14:10] when you get some free time i'd like to work through whats there with you, i still haven't gotten a good test from it yet, but thats more than likely pebkac
[14:14] Hopefully I'll be sitting on a little bit of time this week, assuming this etcd charm change goes off without a hitch
[14:20] right on
[14:21] I'm the lone ranger left on my team prepping for dockercon, so our roles have been reversed this week
[14:21] but after the conf i should have some time
[14:24] =D Nice
[14:24] Our docker folks are all heads down atm, so I'm manning the fort on the charms side
[14:54] lazyPower: Still about?
[14:55] surely
[14:55] whats up
[14:59] The install hook of the etcd charm assumes that easy_install will be present
[14:59] But it's not present on an Ubuntu cloud image as far as I know...
[14:59] So installing the charm explodes =P
[14:59] easy_install is shipped in cloud images on CPP clouds
[14:59] where are you running these tests?
[14:59] On a MAAS box
[14:59] hmm
[15:00] thats bizarre, ok.
[15:00] Well, it's not necessarily the most up to date MAAS in the world
[15:00] i guess we can throw down a quick block of code to install easy_install.
[15:00] It's easy enough to fix, just need to manually intervene
[15:00] Well, you could
[15:00] but easy_install has been present on everything i've tested on
[15:00] Or you could just skip the middle-man and use get-pip.py directly to install pip ;)
[15:01] Which has the advantage of doing it over a secure connection, unlike easy_install
[15:01] ah, i'm not a fan of doing the wget | bash method
[15:01] Oh sure, I mean literally bundle get-pip.py
[15:01] Just a single file =)
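(A sketch of the "bundle get-pip.py" idea just mentioned, as an install hook that neither depends on easy_install being on the image nor pipes a download into a shell. The files/get-pip.py location and the hook layout are hypothetical, not the etcd charm's real code:)

```python
#!/usr/bin/env python
# Sketch of an install hook that bootstraps pip from a copy of get-pip.py
# shipped inside the charm, instead of assuming easy_install exists on the
# cloud image. The files/get-pip.py path is a hypothetical charm layout.
import os
import subprocess
import sys

CHARM_DIR = os.environ.get("CHARM_DIR", os.getcwd())
GET_PIP = os.path.join(CHARM_DIR, "files", "get-pip.py")

def ensure_pip():
    """Install pip from the bundled get-pip.py if it is not already present."""
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "--version"])
        return  # pip already available
    except (subprocess.CalledProcessError, OSError):
        pass
    # Runs entirely from the file shipped with the charm: no
    # curl-pipe-to-shell, no dependency on easy_install.
    subprocess.check_call([sys.executable, GET_PIP])

if __name__ == "__main__":
    ensure_pip()
    subprocess.check_call([sys.executable, "-m", "pip", "install", "requests"])
```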
[15:01] this all stems from our pip package in archive being busted
[15:01] install requests and the world blows up
[15:01] stupid python dependencies :|
[15:01] * lazyPower rages silently against a problem thats been cropping up more and more
[15:01] =P This is where I put my hand up as a requests core developer
[15:02] So this is a little bit my fault
[15:02] * lazyPower instantly un-rages and apologizes
[15:02] =D
[15:02] It's totally ok
[15:02] The situation is a mess
[15:02] it really is
[15:02] But HP are paying dstufft full-time to fix it
[15:02] system dependencies not being in a venv make this tricky
[15:02] Yup
[15:02] Presumably the charm could have a virtualenv, though...?
[15:03] thats tricky. we have venv being prepped in our docker charm - but we haven't really leveraged it
[15:03] i'm not sure what issues we will crop up with going that route - but i'm game for trying it out
[15:03] thats a hefty feature branch however, as it affects the entirety of the charm
[15:03] Yeah, I wouldn't do it now
[15:03] lets file a bug and explore that at a later date
[15:03] For now I can just do a juju add-machine and hop on and install easy_install
[15:04] Then deploy the charm to it directly
[15:04] ok, sorry about the inconvenience, but good to know if we have a substrate thats not shipping with batteries
[15:04] =P It's a pretty minor inconvenience
[15:04] I think I also have a too-old Juju, so I'm updating that as well while I'm here
[15:04] but *handsigns* magic
[15:04] be aware that 1.23.x has an issue whend estrying the env it pulls the socket out from underneath you
[15:05] *destroying
[15:05] What's the net effect of that?
[15:05] things like bundletester have random bouts of errors when running multiple test cases
[15:05] client connections are terminated and you get a stacktrace while destroying an env, but the env *does* get destroyed.
[15:06] http://juju-ci.vapour.ws:8080/job/charm-bundle-test-aws/173/console
[15:06] is a good example of the output you'll see
[15:06] the "reset" bits that loop for ~ 30 lines
[15:06] Eh, I'm not scared of stacktraces
[15:07] Oh, btw, we're dropping a new 'feature' that should make docker demos a bit nicer, which we may want to incorporate into the charm
[15:07] oh?
[15:07] But basically, on cloud environments we can set up ip-in-ip tunnels between hosts and run the Calico traffic through them
[15:07] This means you don't need a cloud that gives you a proper fabric
[15:07] nice :)
[15:07] when is that expected to land?
[15:08] i have a work item this week to get SDN in our bundle we're using @ the conf
[15:08] It's already in the latest release of Calico, I think the next calico-docker release will contain it
[15:08] Which I'd expect...today, I think?
[15:08] oh nice
[15:08] i'll def. tail the repo and when it lands give it a go
[15:08] We don't plan to call that a productised feature because customers won't deploy Calico in that kind of fabric
[15:08] right
[15:08] But it's useful for demos and trying it out on clouds
[15:08] +1 to that
[15:09] Also, setting up those tunnels involves typing a series of *super* cryptic 'ip' commands into Linux, so charms are perfect for it. ;)
[15:10] juju power activate!
[15:10] calico will form the network
[15:17] Hello to everybody. Is this the place where I can share my troubles with Juju? =)
[15:22] jamespage, odl-controller mp https://code.launchpad.net/~sdn-charmers/charms/trusty/odl-controller/odl-cmds/+merge/262095 (no great rush)
[15:23] Most of all I have a question about juju agents. Is there a way to restore or regenerate agents apipasswords?
[15:25] MrOJ: agent configurations are all listed /var/lib/juju
[15:25] let me get a direct path for you 1 m
[15:26] MrOJ: so assuming your charm name is 'test'
[15:26] the agent config path is /var/lib/juju/agents/unit-test-#/agent.conf
[15:27] the .conf file is a yaml formatted key/value store of all the data required to communicate w/ the state server. You can update all the values in there if required, including repointing to a new state server, updating the api password, etc.
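(Since agent.conf is plain YAML, re-injecting a recovered password is only a few lines of work once the correct value is known. A minimal sketch; the apipassword and statepassword keys are the ones named in this conversation, while the unit path and everything else about the file layout is assumed rather than a definitive description of the format:)

```python
# Sketch: re-injecting a recovered password into a unit or machine
# agent.conf. Only the apipassword/statepassword key names come from this
# conversation; the rest of the file layout is assumed.
import yaml

AGENT_CONF = "/var/lib/juju/agents/unit-test-0/agent.conf"  # hypothetical unit

def set_agent_passwords(path, api_password, state_password=None):
    """Load the YAML agent config, update its password fields, write it back."""
    with open(path) as f:
        conf = yaml.safe_load(f)
    conf["apipassword"] = api_password
    if state_password is not None:
        conf["statepassword"] = state_password
    with open(path, "w") as f:
        yaml.safe_dump(conf, f, default_flow_style=False)

if __name__ == "__main__":
    set_agent_passwords(AGENT_CONF, api_password="recovered-secret")
```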
[15:28] Yes I know that. It's a long story but right now I don't have that directory in my system
[15:29] Sorry my english. I'm from Finland and its not my main language
[15:29] no worries MrOJ
[15:31] I think that somehow Bug #1464304 might have caused this situation
[15:31] Bug #1464304: Sending a SIGABRT to jujud process causes jujud to uninstall (wiping /var/lib/juju)
[15:33] yikes!
[15:37] I've managed to manually restore agents.conf and all other files in /var/lib/juju and jujud start scripts in /etc/init.
[15:39] But if I start jujud-machine-xx it removes /var/lib/juju again in that node.
[15:44] In /var/log/juju/machine-xx.log there is a mention of "invalid entity name or password" and after that "fatal "api": agent should be terminated"
[15:45] natefinch: ping
[15:45] lazyPower: sup
[15:45] have you seen behavior like this? is this due to a stale status left over in the state server terminating the restoration of the unit agent? i'm a bit out of my depth here
[15:45] MrOJ: ran into a pretty hefty bug that's terminating a unit out from underneath him
[15:46] lazyPower: reading history
[15:49] MrOJ, lazyPower: ouch, that's a gnarly one.
[15:49] Version is 1.23.3
[15:50] MrOJ: what provider are you using? (like, amazon, maas, openstack, etc)?
[15:50] It's maas
[15:51] MrOJ: Do you need to keep that machine running, or can you just destroy it and recreate it?
[15:52] I need to have it running because it's in production.
[15:53] I have a small Openstack cloud running in our company and the machine is part of it.
[15:54] Openstack deployment itself is ok
[15:57] MrOJ: tricky. I'm talking to some of the other devs to see if we have a way to get that machine back in working order.
[15:58] I've learned basics about mongodb and have recovered most of the data straight from there but I can't figure out how I can restore apipassword
[15:58] natefinch: Thank you
=== kadams54 is now known as kadams54-away
[16:14] MrOJ: Still doing some tests to try to figure out the best way to get you recovered.
[16:16] natefinch: Thanks again!
=== natefinch_ is now known as natefinch
=== lukasa is now known as lukasa_away
=== lukasa_away is now known as lukasa
[16:32] hey MrOJ :)
[16:32] let me recap here, the files in /var/lib/juju were lost and you rebuilt them, right?
[16:33] and all seems ok except for the api password
[16:33] MrOJ: I have to run for a bit, so I'm handing you off to the very capable perrito666.
[16:35] perrito666: Yes that's right. I forgot to mention statepassword too..
=== scuttlemonkey is now known as scuttle|afk
[16:43] MrOJ: currently the status for said service says something?
[16:46] perrito666: juju status says "agent-state: down"
[16:46] is it the only unit for that service?
[16:47] perrito666: no but they all say the same
[16:48] oh, so you have multiple machines/containers in that shape?
[16:48] yes that is the situation..
[16:48] ah, sorry, I had missed that part
[16:49] it's ok..
[16:50] Actually all my machines are in that situation..
[16:50] Except state servers
[16:55] I had to restore HA state servers and at the same time I had a dns problem in MAAS.. I didn't know that then. Because of this I bumped into the bug I mentioned earlier
=== kadams54-away is now known as kadams54
[16:57] At least, I think this is what happened..
=== scuttle|afk is now known as scuttlemonkey
[16:58] MrOJ: I am thinking, I am sure we can rescue this, but I am thinking which is the best way, either to nuke all the passwords for one that we can use or something like that
[16:58] brb, lunch.
[17:02] perrito666: I can restore each unit one by one.. We have only about 50 units so it's not so big a job..
[17:03] perrito666: Ok. Take your time and have a great lunch =)
[17:06] can i setup a new deployment in each test_case for amulet? Is that advisable?
[17:11] cholcombe: Typically when the entire topology is undergoing a rapid change it warrants a new test file as the deployment map is defined in __init__()
[17:11] but i'm open to seeing a different pattern emerge :)
[17:12] ok interesting
[17:13] cholcombe: if you're only adding a unit to the topology, it should be fine to just self.deploy.add_unit() or add a new service. start w/ bare bones and iterate through the test file
[17:13] it'll cut down the overall test-run time which is a good thing, right now the integration tests are very slow
[17:13] so its kind of dependent on what you're doing
[17:13] well the issue is gluster has like 10 different volume types and i want to test each one
[17:14] is this something that needs to be defined at deploy time?
[17:14] or can you reconfigure the charm w/ the different volume type
[17:14] i've been setting the volume type in the charm config and then running deploy
[17:14] meaning once its stood up and running, is it possible to reconfigure the charm for that volume type
[17:14] or do you *have* to redeploy to gain that volume type
[17:15] i'm thinking this is like ceph, that your volumes are defined at deploy, and as its storage you're locked into that volume type for the duration
[17:15] pretty much yeah
[17:15] MrOJ: oh, didnt catch that and had a medium to bad lunch :p
[17:15] you set it before you run it and you're locked in
[17:16] Yeah, you'll need to do a different permutation of the charm then, which would warrant a new test - as afaik there's no way to destroy a service in amulet to date
[17:16] perrito666: I know the feeling =)
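(A sketch of the pattern lazyPower describes above: one amulet deployment per volume-type permutation, built once in setUpClass and reused by every test in that file, with each further permutation living in its own test file. The 'gluster' charm name, the volume-type config key, and the assertion are illustrative stand-ins, not the real charm's interface:)

```python
# Sketch: one amulet deployment per volume-type permutation, deployed once
# per test class rather than once per test method. The 'gluster' charm name
# and 'volume-type' config key are illustrative stand-ins.
import unittest

import amulet

class TestDistributedVolume(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        cls.d = amulet.Deployment(series="trusty")
        cls.d.add("gluster", units=3)
        cls.d.configure("gluster", {"volume-type": "distributed"})
        cls.d.setup(timeout=900)   # one deploy, shared by every test below
        cls.unit = cls.d.sentry.unit["gluster/0"]

    def test_volume_created(self):
        output, code = self.unit.run("gluster volume info")
        self.assertEqual(code, 0)
        self.assertIn("Distribute", output)

# A replicated (or striped, dispersed, ...) permutation would live in its own
# test file with its own setUpClass, since the topology cannot be
# reconfigured after deploy.
```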
[17:42] thanks for python-django work
[17:42] I haven't been able to look at it in a while, but definitely appreciate the work in the meanwhile
[17:43] delurking to give props
[17:47] cholcombe, lazyPower - you might be interested in related in-flight work on the ceph amulet tests...
[17:47] oh?
[17:47] the pivot point is different (ubuntu:openstack release) with the same topology
[17:48] WIPs @ https://code.launchpad.net/~1chb1n/charms/trusty/ceph/next-amulet-update/+merge/262016
[17:48] & https://code.launchpad.net/~1chb1n/charm-helpers/amulet-ceph-cinder-updates
[17:48] so that exercises the same ceph topology against precise-icehouse through vivid-kilo
[17:48] * cholcombe checking
[17:49] yeah that's similar to what i need to do
[17:49] actively working to update and enable kilo and predictive liberty prep
[17:49] ^ on all os-charms that is.
[17:50] beisner: wow thats a huge diff
[17:50] the tl;dr is you get bundle permutations mid-flight with this?
[17:50] yeah really
[17:50] yeah, some refactoring for shared usage by cinder and glance when i get there
[17:50] hmm
[17:50] you should blog about this :)
[17:50] so i can read the blog intead of the diff
[17:50] <3
[17:50] *instead
[17:50] lol
[17:52] ha!
[17:54] how about a pastebin of the test output? ;-) trusty-icehouse: http://paste.ubuntu.com/11726195/
[17:55] oh i guess that paste includes precise-icehouse too. just got the kilo stuff working, but no osci results yet.
[17:57] beisner: it has no pictures
[17:57] i need pictures, and a story to go with it
[17:57] i know i know, needs shine ;-)
[17:57] :D
[17:57] so, i'll put you down as writing a blog post next week on this? excellent
[17:57] jcastro: ^
[17:57] you saw it here first, beisner agreed to blog about his awesome osci permutations code
[17:57] actually, that is on my list of things to do, lazyPower
[17:58] I'm being deliberately obtuse to rally support for your cause
[17:58] in the form of giving you work items
[17:58] that are totally awesome, and i can tweet about
[17:59] where do you blog currently beisner?
[17:59] i'd like to add you to my feed
[18:01] lazyPower, http://www.beisner.com - it's been mostly idle though as i've been mostly throttled
[18:02] ack, thanks for the link
=== kadams54 is now known as kadams54-away
[18:22] hey jose
[18:22] ohai
[18:22] hey so office hours in 2 days iirc?
[18:22] jcastro: yes, want me to host? if so, can we move it up by 2h?
[18:22] I would like to confirm the time so marcoceppi doesn't make fun of me
[18:23] I can host, can you resend me the creds just in case though?
[18:23] MrOJ: looks like we're handing you back to me. Do you have log files you could share? all-machines.log on one of the state servers might contain some useful information for figuring out what went wrong.
[18:24] jcastro: sure, will do right now, along with some instructions
[18:24] excellent
[18:25] my calendar has it for 8pm UTC, is that what you have?
[18:43] jcastro: sorry, was having lunch. I do have it at 20 UTC. looks like we're good
=== anthonyf is now known as Guest41199
[19:49] MrOJ, ericsnow: we should talk here. MrOJ, ericsnow is one of the developers that worked on the backup code (along with perrito666).
[19:50] natefinch, MrOJ: note that I'm not nearly as familiar with the restore side of things, but I'll help as much as I can
[19:51] MrOJ: do you have a machine log from one of the machines that killed its state server?
[19:59] natefinch, ericsnow: I think I have. just a moment
[20:00] I am back
[20:16] natefinch: Yes I have the log, but the filesize is almost 20M
[20:21] MrOJ: how big is it if you compress it? It should compress a *lot*
[20:27] natefinch: I'll check
[20:36] natefinch: Ok.. Now I have logs from machine-0 and machine-2.
[20:37] natefinch: those files are about 1M compressed
[20:40] natefinch: Can I email those to you or somebody else?
[20:45] MrOJ: email to nate.finch@canonical.com please and thank you
[20:49] natefinch: ok.. I'll send those from my work email -> timo.ojala@kl-varaosat.fi
[20:53] lazyPower: cheers for the python-django review
[20:54] MrOJ: btw, you said you were doing a restore while having DNS issues... why were you doing the restore in the first place?
[20:56] thumper: happy to help :)
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away
=== anthonyf is now known as Guest21341