[00:05] <_mup_> juju/trunk r385 committed by kapil.thangavelu@canonical.com
[00:05] <_mup_> merge unit-info-cli [r=fwereade,niemeyer][f=863816]
[00:05] <_mup_> Allow units to obtain their public and private addresses in a provider
[00:05] <_mup_> independent fashion
[00:09] <_mup_> Bug #867991 was filed: local provider needs documentation < https://launchpad.net/bugs/867991 >
[00:12] <_mup_> Bug #867993 was filed: Code/module docs seem to use "j" instead of juju < https://launchpad.net/bugs/867993 >
[00:17] hazmat, here is the master-customize.log: http://pastebin.ubuntu.com/702511/
[00:18] i tried going against the latest in unit-info-cli, no changes i could see
[00:18] interesting
[00:18] in terms of the log or juju status'
[00:19] jimbaker, what's the output of > host archive.ubuntu.com192.168.122.1
[00:19] whoops space between those
[00:19] jimbaker, it looks like a network connectivity problem
[00:20] hazmat, here's the output: http://pastebin.ubuntu.com/702516/
[00:21] jimbaker, one more.. what's the output of sudo cat /var/lib/lxc/yourusername-yourenvname-0-template/rootfs/etc/resolvconf/run/resolv.conf
[00:22] hazmat, nameserver 192.168.122.1
[00:23] jimbaker, odd
[00:23] everything sounds like its correct
[00:23] jimbaker, aha
[00:23] hazmat, ?
[00:24] (if you want i can give you a login to this box)
[00:24] jimbaker, so oneiric is still in beta so old packages aren't per se kept around
[00:24] jimbaker, you'll need to update manually your lxc cache
[00:24] hazmat, ok
[00:24] so it has the latest oneiric packages
[00:25] jimbaker, $ sudo chroot /var/cache/lxc/oneiric/rootfs-i386/
[00:25] jimbaker, $ apt-get update && apt-get upgrade
[00:26] that should fix things, else the pkg cache was ref'ing non existant packages upstream i bet
[00:26] ok, doing that, but against amd64 :)
[00:26] * hazmat heads out for a dog walk
[00:26] bbiab
[00:26] hazmat, makes sense, this would explain those bad refs in master-customize.log
[00:27] jimbaker, also you can update your juju-origin to lp:juju/trunk if you want.. its all committed now
[00:28] hazmat, sounds good
[00:56] jimbaker, that work for you?
[01:14] hazmat: Man!
[01:14] hazmat: wtf is happy with unit-get!
[01:14] niemeyer, :-)
[01:14] * niemeyer dances around the chair
[01:14] niemeyer, cool, i'm out going to go enjoy some down time
[01:15] hazmat: Enjoy it :)
[01:15] niemeyer, sent out some mails to the list inviting additional local provider testing
[01:15] I'm going down too
[01:15] hazmat: Superbsome
[01:17] * niemeyer => poweroff
[01:51] hazmat, hmmm, the upgrade was successful of the lxc cache. looks like i have a networking issue, the agent logs (which i finally have) just report zk errors about network is unreachable
[01:51] (agent logs for mysql/0, wordpress/0)
[01:52] the other detail is that the master-customize.log was not touched. is that only used in the event of an error?
[02:10] hazmat: oh nice, I'll try local/LXC as soon as the build is finished
[02:10] hazmat: I've been looking forward to this.
[02:49] anyone try the local provider?
I'm having some problems getting the environment right
[02:52] jcastro, i have tried the local provider several times, but yet to get it running
[02:52] I'm not even getting past the environments.yaml bits
[02:54] jcastro, this is my environments.yaml for lxc: http://pastebin.ubuntu.com/702557/
[02:55] jcastro, for me it's failing in the network at this point
[02:56] "error: internal error Network is already in use by interface virbr0"
[02:56] look familiar?
[02:56] jcastro, i have seen that. one thing to try (hate to tell you it) is to reboot your computer
[02:56] hah, awesome.
[02:57] the initial lxc networking for whatever reason didn't come up properly until i did this
[02:57] jcastro, before doing that, might as well make sure you upgrade the pkg cache
[02:57] see above, approx 2.5 hours ago
[02:59] ah
[03:00] jimbaker: woo! status works
[03:01] jcastro, cool, but the provisioning agent runs on your machine, so it's not indicative of the lxc bits
[03:01] (other than using local of course)
[03:02] ah
[03:02] i just got past that just now, the fragility was in the package cache
[03:02] they show up in status, but their state is still null
[03:04] jcastro, do something like $ ps -ef | grep juju - verify you see unit agents for mysql/0 and wordpress/0 (or whatever stack you are attempting)
[03:05] doesn't look like it, just one for zookeeper itself, which I assume is the provisioning one
[03:07] jcastro, yes, you will have one zookeeper instance as part of the provisioning setup
[03:08] jcastro, so check for the existence of master-customize.log - this will be in the data dir, owned by root
[03:09] jcastro, ideally you will have logs for each of service units there too, a container log (for lxc debugging) and the unit.log itself
[03:09] all the log dirs are empty
[03:10] jcastro, ok, so you are seeing something else wrong (and i assume you did the reboot)
[03:10] yeah
[03:10] the bootstrap runs, completes, no errors
[03:12] jimbaker: ok heading to bed, openstack
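(The environments.yaml pastebin above has since expired. A minimal local-provider entry of the kind being discussed, reconstructed only from keys that actually come up in this conversation (type, admin-secret, data dir, juju-origin), with placeholder values; the exact schema for this era of juju may differ:)

```yaml
environments:
  local:
    type: local
    # any string works; per the discussion, admin-secret is anything you want it to be
    admin-secret: not-a-real-secret
    # where the provider keeps zookeeper data and unit logs (placeholder path)
    data-dir: /home/user/juju-local-data
    # optional; hazmat suggests pointing this at trunk once r385+ landed
    juju-origin: lp:juju/trunk
```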
conf tomorrow, I will certainly try on the plane. :)
[03:12] bootstrap really means nothing unfortunately - and status is less useful here
[03:13] well, of node 0
[03:13] yeah it looks like it thinks it's firing off things, but they're not happening
[03:13] at first I was like "wow, this is fast."
[03:13] but it doesn't appear to be actually doing anything, heh
[03:13] i think it might actually be fast, but yeah, still waiting to get it to work
[03:13] ok, enjoy the conf!
[03:13] i feel like we're close!
[03:14] i think so too, there are some basic setup doc issues, possibly ones that can be scripted, to make it work, i suspect i just have one more to go through
[03:20] /me catches up
[03:20] surprised rebooting is needed
[03:21] jcastro, its all async same as other providers so its fast to execute commands
[03:21] but takes a few moments for reality
[03:22] jcastro, planes are probably a problem, network connectivity is needed for non cached package installs
[03:24] jimbaker, so the lxc cache update didn't solve the issue for you?
[03:25] <_mup_> juju/remove-sec-grp-do-not-ignore-exception r383 committed by jim.baker@canonical.com
[03:25] <_mup_> Merged trunk
[03:28] jimbaker, the master customize log is updated only after the complete run, u should use ps | grep lxc to verify that
[03:28] ps uax
[03:28] oh... no.. gluster acquired
[03:38] <_mup_> juju/trunk r386 committed by jim.baker@canonical.com
[03:38] <_mup_> merge remove-sec-grp-do-not-ignore-exception [r=niemeyer][f=863510]
[03:38] <_mup_> Simplified logic for removing security groups to ensure any exceptions
[03:38] <_mup_> are not ignored by being consumed by the Twisted reactor.
[04:10] hazmat, i got further with the lxc cache update
[04:10] but now i'm seeing this networking issue
[04:14] hazmat, so in the unit.log files (in my case mysql/0, wordpress/0), i'm seeing this repeated: Network is unreachable
[04:14] so something is wrong w/ the virtual network setup
[09:46] Hi all !
[09:46] is juju well suited for my use case:
[09:46] I want to launch a complete infrastructure with 1 command, and services knowing each other's addresses
[09:47] and , also , I want to specify a list of packages to install on each nodes , and some sed commands
[09:47] is chef-solo better for doing this ? or cloudformation (I'm working on EC2)
[09:49] xerxas: from my reading I think juju can do that for you - but keep in mind it is under development. Not sure if you want to base any production sites on it right now
[09:52] TeTeT: you mean, if I want to make live my infrastructure ?
[09:53] If I'm adding units afterwards ?
[09:53] xerxas: adding units should work nicely, if the charm allows for it. What is your schedule for going productive?
[09:54] now ;)
[09:55] what's the simplest way to provision ec2 infrastructure (not instances, but infrastructures !)
[09:55] cloud-init seems ok for instances, is cloud-formation the cloud-init of infrastructure ?
[09:58] xerxas: actually I'd say juju is exactly what you need, and is the simplest way going forward
[09:59] xerxas: it's just that it's alpha quality right now
[09:59] kim0: why ?
[09:59] because it does what you described :)
[09:59] kim0: I don't mind if it's alpha or not, if it boots my 3 instances, sets up a rabbitmq on one , and creates a config file on a second one containing my rabbitmq server ip address
[09:59] kim0: ;)
[10:00] kim0: this is maybe why I'm asking here ;)
[10:00] ahh , also, question , I tested juju ... but
[10:00] juju client, needs to be ubuntu , as juju "controller" and , juju booted instances ?
[10:00] xerxas: no client can be osx too
[10:00] by controller I mean the instance where zookeeper is installed
[10:01] or other linux
[10:01] kim0: nice, I'm running osx ;)
[10:01] cool
[10:01] but the rest needs to be ubuntu
[10:01] xerxas: I think it's available on some osx repo
[10:01] can't remember what osx folks use to install cli tools :)
[10:01] brew
[10:01] yes that's it
[10:01] and easy_install ;)
[10:01] wasn't there fink back in the day?
[10:01] which was a lot like apt
[10:02] think brew it is now
[10:02] TeTeT: fink is useless now
[10:02] been 6 years since I worked with osx ...
[10:02] brew is git based, easy formula (which is metadata for building packages) writing , social ...
[10:03] been 6 years I had a ubuntu based desktop ;)
[10:03] xerxas: here is a rabbitmq charm: https://code.launchpad.net/~charmers/charm/oneiric/rabbitmq-server/trunk
[10:03] but never left ubuntu on the server ;)
[10:03] xerxas: good choice ;)
[10:03] and will never (you're doing good work , guys ! )
[10:03] kim0: thx for the url
[10:03] xerxas: so one instance is rabbitmq, what are the other two ?
[10:04] some daemon I wrote in python consuming messages
[10:04] ah cool!
[10:04] one will have some lxc container
[10:04] containers
[10:04] morning - anyone tried using the LXC local provider yet?
[10:04] xerxas: sounds like a cool project!
[10:05] anyway, my python daemon have a configuration file and in that configuration file I have an ip address, the one of the rabbitmq server
[10:05] (more precisely, one of rabbitmq-s server, because we might have several of them ... )
[10:05] xerxas: I'd love to write an article about that project when you get it running .. keep me in the loop :)
[10:09] jamespage: it's on my plate for later this week
[10:19] TeTeT: is it already working fine ?
[10:33] kim0: I don't know yet, this I why I'd love to test it
[10:33] yeah
[11:14] * HarryPanda can't get local charm repositories working anymore
[11:15] good morning
[11:16] `charm create test /root/charms && juju deploy --repository /root/charms/ local:test` = no dice
[11:17] HarryPanda, you need to use a series (like 'oneiric') as additional directory in the repository from the base to the charm..
[11:18] HarryPanda, ie.. so looking at the example formulas that come with juju.. its examples/oneiric/wordpress
[11:18] aah, I haven't been keeping up with the changelog
[11:20] HarryPanda, yeah.. that was recent
[11:20] * HarryPanda got confused about store.juju.ubuntu.com etc.
[11:20] there was a mail out to the list about the change
[11:20] HarryPanda, its basically a charm repository.. its not active yet
[11:20] HarryPanda, yeah.. its a bit confusing on the error message
[11:20] fwereade, ^
[11:21] jamespage, haven't gotten any new feedback yet
[11:22] hazmat, hey - I had a go based on your posting to juju@l.u.c
[11:22] jamespage, cool, how'd it go?
[11:23] hit a few errors - the environments.yaml entry wanted a type and admin-secret entry as well ( so I had a guess)
[11:23] jamespage, doh.. yeah
[11:24] jamespage, late night emails.. bad
[11:24] jamespage, and then?
[11:24] jamespage, fwiw admin-secret is anything you want it to be
[11:25] I appear to have a running environment!
[11:25] I've not deployed anything yet - you caught me as I was typing juju bootstrap --environment....
[11:25] * HarryPanda is slowly getting our full stack running on a mix of vmware+orchestra, ec2 and lxc :)
[11:26] jamespage, cool.. the real work happens when you first deploy a charm.. and it starts creating lxc containers..
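(The series-directory rule hazmat explains above, charms live at `<repo>/<series>/<charm>` rather than directly under the repository root, can be sketched as follows; the repository path and charm name are illustrative only:)

```python
import os
import tempfile

# A local charm repository must nest each charm under a series
# directory, e.g. <repo>/oneiric/test, not <repo>/test.
repo = tempfile.mkdtemp()
charm_dir = os.path.join(repo, "oneiric", "test")  # series dir is required
os.makedirs(charm_dir)

# juju is then pointed at the repository root, e.g.:
#   juju deploy --repository <repo> local:test
print(os.path.relpath(charm_dir, repo))
```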
[11:26] HarryPanda, fun
[11:27] hazmat, OK about to try that
[11:27] jamespage, its async to the cli, so giving progress on it is a little hard, you can ps aux | grep lxc to watch it happen, or tail the log file in data-dir/machine-agent.log
[11:36] hazmat: hmm - not sure the machine agent started
[11:38] http://paste.ubuntu.com/702702/
[11:39] jamespage, hmm.. haven't seen that one
[11:39] jamespage, can you try bootstrapping with the verbose flag? ie. juju -v bootstrap
[11:40] jamespage, that should give the traceback
[11:40] jamespage: I'm getting the same
[11:40] local/agent.py:59, changed JUJU_ORIGIN=%s to JUJU_ORIGIN=ppa
[11:41] hazmat: http://paste.ubuntu.com/702706/
[11:41] dah
[11:41] foobar
[11:41] jamespage, thanks
[11:41] np - want me to raise a bug?
[11:42] jamespage, i think its a trivial, i can commit it to trunk, just double checking
[11:42] i was verifying this with a juju-origin defined in my environments.yaml.. but i didn't test without one
[11:43] yeah.. niemeyer mentioned he'd changed the interface on this function, but i forget
[11:46] hazmat: hmm - might also be better to depend on zookeeper rather than zookeeperd as the local provider starts its own instance
[11:46] saves a running java process which is always a good thing
[11:47] jamespage, good point.. libzookeeper-java should have what we need
[11:47] yep
[11:49] keen to test this as I have seen panics running Java stuff under lxc with openstack
[11:49] might disappear as a result tho :-)
[11:54] jamespage, we don't require any java under lxc.. but noted
[11:54] fwereade, http://paste.ubuntu.com/702712/
[11:54] could i get a +1 on that trivial
[11:54] hazmat, well juju does not - but charms might...
[11:55] jamespage, yup
[11:55] no cassandra might hurt
[11:55] * hazmat sheds a tear over gluster's acquisition
[11:56] redhat? 0.o
[11:59] hazmat: sorry, I'm not quite following the paste
[12:01] hazmat: well, just the first bit
[12:01] hazmat: I guess, the packages need to change, so...
you change the packages
[12:01] hazmat: consider my confusion withdrawn
[12:02] hazmat: +1
[12:03] fwereade, yeah.. let me repaste.. the origin handling changed, but i made it to a branch copy by accident
[12:06] hazmat: getting there - I now have a running lxc container - however its not showing up as started yet
[12:08] I can log into it using SSH
[12:08] fwereade, updated diff.. http://paste.ubuntu.com/702715/
[12:09] fwereade, does a couple of things, logs juju origin when starting machine agent, fixes the handling for juju origin to reflect new return signature from get_default_origin, changes the package dep from zookeeperd to libzookeeper-java
[12:11] hazmat: cool; have a much-happier +1 :)
[12:12] jamespage: what's the 'state' for the service unit?
[12:13] HarryPanda: null
[12:13] I can see the failsafe upstart configuration waiting for network interfaces to come up within the lxc instance
[12:14] http://paste.ubuntu.com/702719/
[12:14] <_mup_> juju/trunk r388 committed by kapil.thangavelu@canonical.com
[12:14] <_mup_> [trivial] require libzookeeper-java not zookeeperd, update handling of get_default_origin, log origin used [r=fwereade]
[12:17] jamespage, interesting i never noticed that b4 re waiting for network within the container
[12:18] jamespage, sadly it happens on my laptop.. sort of destroys any value to making boot faster
[12:19] jamespage, the unit agent gets started via upstart.. it should have a symlink to its log at $data-dir/units/$unit-name/unit.log
[12:22] the link will be broken if the unit agent never started
[12:24] HarryPanda, where you get to with the origin=ppa change?
[12:25] hazmat: enough to get it working so I could deploy stuff
[12:25] I will re-checkout after lunch
[12:25] HarryPanda, cool, and you see the units with state: started in juju status?
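(The broken-link case hazmat describes, where upstart's unit.log symlink exists but its target was never written because the unit agent didn't start, can be distinguished programmatically. A small sketch with made-up paths:)

```python
import os
import tempfile

tmp = tempfile.mkdtemp()
link = os.path.join(tmp, "unit.log")
# Simulate the symlink pointing at a log file the unit agent never wrote:
os.symlink(os.path.join(tmp, "missing-target.log"), link)

# lexists() sees the link itself; exists() follows it to the target,
# so a dangling symlink is lexists-but-not-exists.
is_broken = os.path.lexists(link) and not os.path.exists(link)
print(is_broken)  # → True
```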
[12:25] HarryPanda, cheers
[12:26] lxc-ls shows the containers, the output is a bit odd, it will list it twice if its running, once if its defined but not running
[12:27] * hazmat regen's the ppa
[12:29] * HarryPanda still has a lot of catching up to do in the meantime, like puppet integration
[12:31] jamespage, can you pastebin your $data-dir/units/master-customize.log
[12:36] hazmat: http://paste.ubuntu.com/702724/
[12:37] jamespage, nice.. looks good
[12:37] jamespage, but your unit agents aren't starting?
[12:38] not within the instance
[12:39] jamespage, do you have any output in $data-dir/units/$unit-name/unit.log or is it a broken link?
[12:40] hazmat, broken link ATM
[12:40] this instance is stalled waiting for the network interface to start
[12:40] so I don't think the right run-level has been initiated yet
[12:41] jamespage, if your able to ssh into the unit, you can try manually starting the agent using its upstart file as a guideline /etc/init/$unit-name-unit-agent.conf
[12:41] jamespage, hmm.. i haven't seen that one before re network start wait in the container, except on the host
[12:42] * hazmat doesn't understand why oneiric is waiting on network config
[12:42] hazmat: so I did a sudo start xxx on the unit agent
[12:42] now showing as started
[12:43] jamespage, cool
[12:43] hazmat; adding another unit has resulted in another stalled lxc instance - bah!
[12:45] humbug
[12:46] jamespage, had you used lxc before today?.. i'm just wondering if destroying the env, and clearing out the lxc cache would help /var/cache/lxc/oneiric/*
[12:46] its been at least week since i did that myself
[12:46] hazmat, hmm - I can try that
[12:47] jamespage, alternatively you can try manually chroot into the cache and upgrade it.. but if i remember the bug correctly its pretty temperamental wrt to package install/upgrade/removal
[12:58] * hazmat will bbiab
[13:01] Morning all!
[13:05] niemeyer: hiya
[13:12] hazmat, looks like /etc/rcS.d/S07resolvconf is blocking - which is stopping the rest of the lxc instance from coming up OK
[13:15] rog: Yo
[13:15] jamespage: Hey James
[13:15] jamespage: Very interesting ideas in your testing charms post
[13:15] niemeyer, morning
[13:15] thanks
[13:15] jamespage: Still digesting them
[13:16] niemeyer, I did a bit of refactoring yesterday based of sabdfl's comments
[13:16] jamespage: We have a test suite continuously running at wtf.labix.org ATM, and it made me ponder if we could do more there
[13:16] the charm-tester is now a bit more clever and less implicit
[13:16] niemeyer, that might be good
[13:22] niemeyer, whats driving the testing ATM?
[13:22] jamespage: A few trivial scripts.. the whole thing is in lp:juju/ftests if you'd like to check it out
[13:26] fwereade: ping
[13:26] niemeyer: pong
[13:26] niemeyer: it's turned out *much* easier than I expected
[13:26] fwereade: Oh, that's so great to hear
[13:27] fwereade: I went to sleep a bit concerned yesterday imagining if we'd have to go back
[13:27] niemeyer: didn't stop me trying to do it wrong earlier, ofc, but you should have a fresh MP in a short while
[13:27] fwereade: Woohay!
[13:27] niemeyer: yeah, it's a great relief :)
[13:27] niemeyer: and then the auto-update on top of that should be almost trivial
[13:28] niemeyer: EOD or before
[13:28] * fwereade crosses fingers
[13:28] fwereade: Superb.. I'm hopeful we can close down the client side for fixes only today still
[13:28] SpamapS: ^
[13:32] hazmat: so it looks like plymouth --ping is hanging forever; hence why the failsafe config is kicking in
[13:33] * jamespage scratches his head
[13:52] niemeyer: if you guys can give me the revision to upload I will start some builds and tests
[13:53] SpamapS: Sounds great..
we just need the stuff fwereade is punching on right now
[13:54] niemeyer: here is r381's build failures btw https://launchpadlibrarian.net/81962870/buildlog_ubuntu-oneiric-i386.juju_0.5%2Bbzr381-1juju1~oneiric1_FAILEDTOBUILD.txt.gz
[13:56] haven't seen those
[13:57] jamespage, lame.. dhcp should just work
[13:57] hazmat, I think it kinda is - however plymouth --ping is hanging when it tries to configure its logging
[13:57] jamespage, we can prepopulate resolvconf/resolv.conf.d/base like we did before but for the container startup, straight dhcp should just work
[13:58] * hazmat does a man plymouth
[13:58] hmm.. undocumented
[13:59] jamespage, so plymouth --ping sounds like it just pings upstart
[13:59] from its help
[13:59] its the thing that multiplexes all of the output from init scripts and upstart
[13:59] * hazmat is out of his element
[13:59] ah
[14:00] I've pinged hallyn_ - he might have some ideas esp. with regards to why its borked in a lxc container
[14:01] jamespage, great, just sent some updated instructions to the list and noted the plymouth hangs
[14:02] * hazmat resets his lxc cache
[14:03] * hazmat really needs to find a 7mm ssd
[14:20] niemeyer: about "breaking" changes
[14:21] niemeyer: it's ok to just commit something that will not work with existing agents deployed from an older branch... or is it?
[14:21] fwereade: *TODAY* it is.. tomorrow, it's not. :-)
[14:21] heh, good :)
[14:22] hey guys, you should get the juju mailing list added here: https://lists.ubuntu.com/
[14:22] it's kind of hard to find if you don't know what it's called
[14:26] mdeslaur, good point
[14:27] <_mup_> Bug #868391 was filed: Juju mailing list should be listed on lists.ubuntu.com < https://launchpad.net/bugs/868391 >
[14:27] mdeslaur, at the moment the easiest way to find the ml is from the section at the bottom of juju.ubuntu.com
[14:28] hazmat: yes, thanks
[14:33] rog: Great stuff indeed in update-server-interface man, thanks!
[14:33] rog: I've sent a few comments/suggestions, but pretty superficial stuff
[14:33] niemeyer: good, i'm, erm, relieved you like it :-)
[14:37] don't you hate it when you know you've made some changes somewhere, but you can't find them anywhere? makes me wonder what else i'm also missing :-)
[14:40] niemeyer: how would you like really-isolate-formula-revisions? stacked branch on isolate-formula-revisions, both set to needs review?
[14:40] jamespage, works for me ootb with a clean lxc cache
[14:40] fwiw
[14:40] jcastro, ping
[14:41] niemeyer: ah, i've just realised that update-server-interface is several revisions ago.
[14:41] rog: Well, please do the suggested changes in the version that is up for review, and let's merge it as is
[14:41] rog: Then you can push a separate branch with the new content
[14:41] niemeyer: i thought you were reviewing factor-out-service ...
[14:41] rog: Nope.. reviewing this now
[14:42] rog: Please do the changes in the specific revision that was actually reviewed so we can merge it
[14:42] niemeyer: will do.
[14:42] fwereade: Hmmm
[14:42] fwereade: Not sure I understand your question
[14:43] fwereade: You have a review on the previous branch.. what is happening with it?
[14:43] niemeyer: the trivials are fixed in that branch; the big one is fixed in the stacked branch
[14:43] niemeyer: the actual changes are trivial but widespread
[14:44] niemeyer: I'm just as happy to merge back into the original branch, or to add a new MP
[14:44] niemeyer: whatever's easiest for you to review
[14:44] fwereade: Please just push it all on that single branch.. I'll have to review it jointly either way since the prior one has blockers without the follow up
[14:44] niemeyer: ok, cool
[14:48] rog: Just added a [10] comment now that I've figured a detail in the follow up branch
[14:49] niemeyer: one thing that concerns me about doing kill(pid, 0) is that it's system-specific. but i guess we're tied to linux anyway.
[14:50] rog: Yeah..
I'm not very concerned about Windows to be honest :)
[14:50] rog: The way to fix that would be to add something in the Process interface itself
[14:50] rog: IsRunning or whatever
[14:50] rog: server side linux is a good bet. clients may be OS X or something else.
[14:51] rog: But that's not something I worry about myself.. I'll let the Windows folks worry about it
[14:51] this is client-side too, right? i'm also slightly concerned that some users may get EPERM even though the process is actually running.
[14:52] i'm not too au fait with modern unixy capability stuff
[14:59] rog: If they get EPERM, that's great.. it's not their process and they shouldn't be fiddling with it
[15:00] * rog nods.
[15:04] hazmat: I have something to try - I have plymouth in debug mode on my laptop for another issue - might be causing problems
[15:10] cd
[15:15] niemeyer: hmm, there's a problem with checking if server process is actually running: what do you do if the pid.txt file exists, but there's no running server? you'll have to remove or rewrite the pid.txt file, but then you no longer get the nice race-free create semantics that you've got currently, so two concurrent callers could start two servers at once.
[15:17] niemeyer: btw, https://code.launchpad.net/~fwereade/juju/isolate-formula-revisions/+merge/78164 is ready for another look
[15:21] hazmat: that fixed it - for some reason having plymouth:debug as a kernel boot options breaks the in-instance plymouth
[15:33] fwereade: Ok, let me see how's lunch looking like here
[15:33] fwereade: I'll review it either right now or right after lunch
[15:34] rog: I don't get the problem.. if the pid file exists and there's no server running, there's simply no server running?
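(The kill(pid, 0) probe rog and niemeyer are discussing sends no signal at all; the kernel only reports whether the pid exists and whether the caller may signal it, with EPERM meaning "exists, but not yours". A Python sketch of the same idea; the Go code under review is not shown here:)

```python
import os

def process_is_running(pid):
    """Probe a pid with signal 0: nothing is delivered, but the kernel
    still reports existence and permission."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:   # ESRCH: no such process
        return False
    except PermissionError:      # EPERM: it exists, just isn't ours
        return True
    return True

print(process_is_running(os.getpid()))  # → True
```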
[15:34] niemeyer: yes, but there's a problem if two processes are both trying to start a server at the same time in the same directory
[15:34] niemeyer: currently that situation is dealt with fine
[15:35] rog: One of them will fail because the port number won't be available
[15:35] niemeyer: but the wrong pid might end up in the pid file
[15:36] jamespage, awesome!
[15:36] * niemeyer thinks
[15:36] niemeyer: my current thought is that we should still return an error, but that the error should reflect our knowledge of whether the server is or isn't currently running
[15:37] kill(pid,0) should work on osx i would think
[15:37] hazmat: yeah, it should.
[15:38] hazmat: i was more thinking of windows.
[15:38] ah, i don't think anyone cares about that in practice for juju ;-)
[15:38] rog: Please ignore the concurrency issues for now and just proceed as we discussed
[15:38] e.g. "server is currently running" vs "server was terminated abnormally. Remove /.../pid.txt to force a start"
[15:39] rog: We can use actual file locking to sort the problem out, but that's really not for this branch
[15:39] niemeyer: oh yeah, another thought: does anyone use shared filesystems these days?
[15:39] niemeyer: 'cos that'll break it too.
[15:40] rog: I don't know, but that's not a problem to worry about now either
[15:41] niemeyer, latest tip should work with the go branches in review?
[15:44] hazmat: It should
[15:44] * hazmat compiles tip and dives in for a review
[15:52] fwereade: Ok, I'll have lunch before finishing your review..
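(The "race-free create semantics" rog wants to preserve come from O_EXCL: creation is atomic, so of two concurrent callers only one can claim pid.txt. A minimal Python sketch; the file name and layout are illustrative:)

```python
import os
import tempfile

def write_pid_file(path, pid):
    """Claim pid.txt atomically: O_CREAT|O_EXCL fails with EEXIST if
    another process created the file first, so exactly one caller wins."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    with os.fdopen(fd, "w") as f:
        f.write("%d\n" % pid)

path = os.path.join(tempfile.mkdtemp(), "pid.txt")
write_pid_file(path, os.getpid())
try:
    write_pid_file(path, 12345)    # a second caller loses the race
except FileExistsError:
    print("already claimed")       # → already claimed
```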
will be back ASAP and dive into it directly
[15:52] niemeyer: cool, the auto-upgrade itself is simple, and works AFAICT, but testing it is a bit tedious
[15:53] fwereade, should be much faster with local provider ;-)
[15:53] hazmat: it's just *writing* the tests that's bugging me :)
[15:54] fwereade, ah
[15:55] hazmat, i tried the latest (trunk r388), still seeing the same networking issues
[15:55] i'm going to just reboot this box and see if that helps
[15:55] jimbaker, yeah.. bcsaller is as well
[15:55] jimbaker, rebooting won't do anything for this
[15:56] hazmat: last I heard that was about virbr0 errors for jimbaker, thats not what I'm getting
[15:56] hazmat, ok, it was the ultimate solution for the first time, but definitely welcome any ideas
[15:56] i'm clearing out my apt-cacher-ng cache, and trying again, already cleared out the lxc cache and it worked fine.. so afaics i shouldn't have anything cached locally
[15:56] jimbaker, your not getting virbr0 errors anymore are you?
[15:56] hazmat, no
[15:57] bcsaller, so i cleared out lxc cache, apt-cacher-ng cache, and it works for me..
[15:57] i'm just getting network connectivity issues between the lxc instances (mysql/0, wordpress/0) and zk
[15:57] hazmat: we may want to add a juju-local-dev metapackage with all those extra deps
[15:57] SpamapS, perhaps.. bootstrap will fail and tell you whats missing though
[15:57] hazmat: yeah, thats expected, right. Its stale cached data that makes the issue
[15:58] hazmat: think upgrades
[15:59] hazmat: if we add more of those, or don't need more of those, we can help users add/remove them automatically :)
[15:59] SpamapS, yeah.. but that puts the extra dep everywhere
[15:59] SpamapS, including machines deploying units
[15:59] SpamapS, ie require java ;-)
[16:00] i'd rather avoid that
[16:00] hazmat: which is why we put it as juju-local-dev, which is only a Suggests of juju
[16:01] SpamapS, ah. true that, yeah.. that sounds fine to me..
i think niemeyer objected last time around on that, but if your up for doing it.. sounds good
[16:01] hazmat: its a distro choice really. :)
[16:02] SpamapS, well then i leave it to some anonymous distro person ;-)
[16:12] slightly frustratingly my local lxc instances are not picking up any dns resolvers...
[16:12] weird
[16:12] hazmat: ^^
[16:17] bcsaller, why would dnsmasq need to be specified explicitly in the container if its dhcp based
[16:18] hazmat: maybe if the resolvconf package isn't installed, I don't know the details, it might be a matter of competing systems there
[16:18] bcsaller, so what do you have in resolv.conf of the master template?
[16:21] SpamapS: hey
[16:24] hazmat, finally the lxc containers came up after cleaning the caches (i assume /var/cache/lxc, /var/cache/apt-cacher-ng), but i'm still getting Network is unreachable in the unit logs: http://pastebin.ubuntu.com/702840/
[16:25] fwereade: Diving in again
[16:27] jimbaker, do you have zookeeper running in the host?
[16:28] sounds like it on a different port then what the unit agents expect
[16:29] hazmat, it was started by the local provider
[16:30] with this setting: clientPort=53699
[16:34] interesting
[16:34] jimbaker, what ip address does the lxc instance have?
[16:35] lxc-ls .. grab container name.. and ping dnsmasq... host container-name 192.168.122.1
[16:40] jamespage, what do you end up with in resolv.conf of the container? is it empty?
[16:40] hazmat, it had the resolvconf header but no configuration
[16:41] bcsaller, yeah.. perhaps we need both
[16:41] i find it extremely odd that we don't pick up the dns server from dhcp
[16:41] hazmat, not clear what you mean here...
[16:42] jimbaker, the container has an ip address, the easiest way to discover it (if the unit agents aren't connecting) is to query the dnsmasq with the container name
[16:43] i have consulted the man page/other howtos on dnsmasq.
in any event, i can see the containers with lxc-ls
[16:44] hazmat, ok, i was getting confused about things ;) just use standard tools
[16:45] hazmat, do we actually setup dnsmasq with the local provider?
[16:45] jimbaker, no its from libvirt
[16:45] i need to grab some lunch
[16:46] bcsaller, i guess the way to verify re dns is to add it to both run/resolv.conf and resolv.conf.d/base
[16:46] bbiab
[16:46] hazmat, i do have dnsmasq running. in any event, we can debug more when you get back
[16:49] hazmat: Please ping me when you're back
[16:49] SpamapS: ping
[16:59] fwereade: ping
[16:59] niemeyer: pong
[17:00] fwereade: Looks fantastic
[17:00] fwereade: I'm pretty surprised as well
[17:00] niemeyer: yeah, totally unexpected
[17:01] fwereade: We should try to get a +1 from hazmat as well given this is such a core concept
[17:02] fwereade: I have just a few trivials for you meanwhile
[17:02] niemeyer: sounds good; sounds good
[17:03] fwereade: Sent
[17:03] niemeyer: cheers
[17:03] fwereade: Is there anything else in the pipe I can help you with while you take a look at these?
[17:04] niemeyer: try previewing lp:~fwereade/juju/always-upgrade-local
[17:04] niemeyer: just pushed it
[17:04] fwereade: I'm on it
[17:04] niemeyer: pretty sure it works, was just setting up a local provider to try to verify a bit faster
[17:04] fwereade: Awesome
[17:09] niemeyer: re: caching in a private attribute, the idea was that CharmDirectory instances should still report the correct revision even if they're up-revisioned mid-flow
[17:09] niemeyer: on balance, that might not actually be necessary
[17:09] niemeyer: because that *should* only happen via set_revision (or at __init__ time)
[17:10] fwereade: Yeah, I think that's the exception, so I'd rather delay the behavior until necessary
[17:10] fwereade: Opening the file all the time feels bad given we don't depend on this behavior today
[17:11] niemeyer: in which case...
yep, I'll just call it once, will simplify everything I think [17:11] niemeyer: cheers [17:11] niemeyer: others look good, ta [17:11] fwereade: Thanks! [17:18] fwereade: http://paste.ubuntu.com/702861/ [17:19] niemeyer: [1] yep, clearer [17:20] niemeyer: [2] ambivalent; I'm not sure local bundles are common enough to be a major worry [17:21] niemeyer: it feels like the same class of issue as "what if the local repo is not writable" [17:21] fwereade: They're not.. but if we don't take care of this case, it'll explode and render upgrade-charm non-useful with bundles [17:21] niemeyer: hm, true [17:21] fwereade: Different case, I think [17:22] niemeyer: fair enough [17:22] niemeyer: offhand, is there a non-isinstance way to tell the difference? [17:27] fwereade: Nope, I think isinstance is the way to go [17:29] hey folks .. is this plymouth thing blocking local deployment, or can I play already [17:29] lxc deployment I mean [17:44] /me is back [17:44] niemeyer, ping [17:44] fwereade, so the charms have revision directly on them now.. the callback into the metadata stuff was weird..
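The caching fwereade settles on here ("I'll just call it once") can be sketched roughly as follows. This is an illustrative Python sketch, not juju's actual code: the names (CharmDirectory, get_revision, set_revision) are taken from the conversation, and the revision-file layout is an assumption.

```python
import os
import tempfile

class CharmDirectory:
    """Sketch: read the revision file once at construction and cache it
    in a private attribute; the only sanctioned way to change it
    mid-flow is set_revision, per the discussion above."""

    def __init__(self, path):
        self.path = path
        self._revision_path = os.path.join(path, "revision")
        # Read once up front instead of re-opening the file on every
        # get_revision call ("Opening the file all the time feels bad").
        with open(self._revision_path) as f:
            self._revision = int(f.read().strip())

    def get_revision(self):
        return self._revision

    def set_revision(self, revision):
        self._revision = revision
        with open(self._revision_path, "w") as f:
            f.write(str(revision))

# demo in a scratch directory (hypothetical charm layout)
demo = tempfile.mkdtemp()
with open(os.path.join(demo, "revision"), "w") as f:
    f.write(" 27\n")              # hand-edited whitespace is tolerated
charm = CharmDirectory(demo)
print(charm.get_revision())        # prints 27
charm.set_revision(28)
print(CharmDirectory(demo).get_revision())  # prints 28
```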
[17:45] * hazmat checks out the branch latest [17:48] kim0, try it out [17:48] cool will [17:49] kim0, it was an issue with jamespage's setup, he had plymouth in debug mode [17:53] hazmat: yeah, the idea was to avoid changing the charm state format [17:53] hazmat: turns out, revision was never actually used from charm state [17:53] hazmat: (well, it was, but it came from charm_url, and we only touched metadata.revision to assert they were the same) [17:54] fwereade, cool [17:54] fwereade, so the branch looks good to me, the only thing is the file contents should be stripped before attempting to int() [17:55] fwereade, if people hand edit it, editors are wont to do wonky things [17:55] hazmat: >>> int(" \t27\n \r\n ") [17:55] 27 [17:56] hazmat: but still, no actual disagreement [17:56] interesting [17:56] hazmat: explicit has occasionally been postulated to be superior to implicit, after all ;) [17:56] i definitely remember that not working at some point [17:57] oh well, relic of the past [18:04] so, hazmat, can I count that as an approve? [18:04] fwereade, yeah.. i'm just wondering about the extra zipfile construction in get revision on the bundle, i guess its fine since its cached, but we have the zip file already in init [18:05] fwereade, +1 [18:05] hazmat: and we always call get_revision in __init__, too [18:05] hazmat: I'll just fix that and merge then [18:05] hazmat: cheers [18:06] fwereade, cool [18:11] Woohay [18:15] see y'all tomorrow [18:17] rog have a good one [18:19] rog: Cheers! [18:21] fwereade: I'm writing an email to the list warning about this change [18:25] niemeyer: tyvm [18:30] hazmat, so i just wanted to continue with you on the local provider debugging [18:30] hazmat, whenever it's a good time, just ping me [18:31] fwereade: Hey!
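hazmat's surprise is easy to check: int() has tolerated surrounding whitespace for a long time, so the explicit strip() is belt-and-braces rather than strictly required. What int() does not tolerate is embedded junk, which is the case a clear error message helps with. A quick standalone check:

```python
# int() already strips surrounding whitespace, matching the
# interpreter transcript in the conversation above.
assert int(" \t27\n \r\n ") == 27
assert int("27\n".strip()) == 27

# Embedded junk (e.g. a trailing comment from a hand edit) still fails,
# so a revision-file parser needs its own validation and error message.
try:
    int("27  # bumped by hand\n")
except ValueError:
    pass  # the failure mode explicit handling is for
```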
[18:31] fwereade: One thought just occurred to me [18:31] niemeyer: go on [18:31] fwereade: While writing the email [18:31] fwereade: Rather than erroring out with a missing revision file, we should create one with revision 1 automatically [18:32] fwereade: In a charm directory [18:32] niemeyer: good plan [18:32] niemeyer: 0 maybe? [18:32] fwereade: Sounds good.. I'm a fan of zero indexing as well ;-) [18:32] niemeyer: or is that just too look-at-me-I'm-a-*programmer*? [18:32] fwereade: LOL [18:33] fwereade: Maybe.. but memory initialized to zero by default is such a great idea! I'm sure we can't go wrong in this case either. ;-D [18:33] niemeyer: anyway, sounds good, I'll pastebin you a trivial shortly, just fixing up the always-upgrade-local branch [18:40] i thought it was backwards compatible already? [18:41] only on invalid content of the metadata file would it error [18:44] jimbaker, ping [18:46] niemeyer: http://paste.ubuntu.com/702915/ [18:46] hazmat: It's backwards compatible, but it's a significant change in structure that deserves a ntoe [18:46] note [18:47] hazmat: it is backwards compatible; this is a distinct case, where you create a charm and never even consider the possibility of needing a revision [18:47] sounds good [18:47] hazmat: Otherwise people will get a crazy warning that makes no sense after an unknown point in time [18:47] hazmat, hi [18:48] fwereade: Hmmm.. that's not quite it I think [18:48] fwereade: There's an important difference between a revision file not being present and it containing unexpected content [18:48] yeah [18:48] niemeyer: ...very good point [18:49] * fwereade looks shifty [18:49] fwereade: Don't worry..
you can look shifty for the next few months without any damage to your image given how late it is there and how much you've been pushing :) [18:49] jimbaker, so to continue the epic, our heroes were last trying to figure out why they couldn't get to the zoo [18:50] LOL [18:50] fwereade, and it also has to play nice with the backwards compatible stuff, ie. what might already be in the metadata [18:50] * niemeyer puts that in the quotes page [18:51] fwereade, bringing shifty back into style ;-) [18:51] :D [18:51] jimbaker, were you able to get the container address? [18:52] hazmat, not certain what the procedure for that should be [18:52] jimbaker, query the dnsmasq server [18:52] jimbaker, is it running? [18:52] bcsaller, were you able to try out adding the dnsmasq server to base in addition to run/resolv.conf? [18:52] hazmat, dnsmasq is running [18:53] hazmat, with this range reported from ps, --dhcp-range 192.168.122.2,192.168.122.254 [18:53] so that looks as expected [18:53] jimbaker, lxc-ls -> list of containers.. pick container name of unit... query dnsmasq on its listen address typically 192.168.122.1 [18:55] hazmat, hmm, i'm missing something in my knowledge of how something like this is setup; would the query be like this: dig -b 192.168.122.1 jbaker-desktop-wordpress-0 [18:55] (yes, its listen address is 192.168.122.1, according to ps) [18:56] dig @192.168.122.1 jbaker-desktop-wordpress-0 [18:56] or less verbosely [18:57] host jbaker-desktop-wordpress-0 192.168.122.1 [18:59] hazmat, now makes perfect sense, but dnsmasq doesn't have the address: http://pastebin.ubuntu.com/702926/ [18:59] hazmat, niemeyer: http://paste.ubuntu.com/702927/ [19:00] slightly less trivial now but just barely qualifies IMO [19:01] fwereade, that seems sensible..
basically only in the absence of revision set it [19:02] i was wondering about transparently doing migrations for folks, but that's probably icky, they'll see the warning [19:02] fwereade, +1 [19:03] fwereade: Put this within the try: if result >= 0:\n return result [19:03] fwereade: as a follow up you'll notice you can unify the branches [19:03] jimbaker, odd indeed. what's the output of > brctrl show and virsh net-list --all [19:03] jimbaker, its like you have a different bridge setup [19:04] fwereade: It all looks good though, +1 [19:04] jimbaker, for any active on virsh net-list.. can you paste virsh net-dumpxml network-name [19:05] http://pastebin.ubuntu.com/702932/ [19:05] hazmat, i don't have brctrl installed. is that a problem? [19:05] niemeyer: re try: in get_revision? [19:05] fwereade: yeah [19:05] in terms of not having some useful bridging utils installed? [19:05] niemeyer: that won't error on revisions < 0 [19:06] jimbaker, no its just a bridge management tool.. [19:06] although i thought libvirt used it [19:06] hazmat, just checking ;) [19:06] fwereade: It will if you tweak the branches below [19:06] hazmat, definitely could install bridge-utils shortly [19:06] fwereade: You'll need a single raise and won't have to define the spurious message upfront [19:06] niemeyer: ...oh, *branches*, not branches [19:06] jimbaker, do you have libvirt-bin installed? [19:06] fwereade: LOL [19:06] * fwereade looks shifty again [19:06] fwereade: Yeah, sorry :-) [19:06] jimbaker, bridge-utils is a dep of it [19:07] hazmat, i do, but apparently an upgrade is available [19:07] hazmat, enjoying the oneiric edge for sure [19:07] jimbaker, right.. but you should have brctrl.. 
its a dependency for libvirt-bin [19:07] hazmat, certainly strange [19:09] hazmat, the upgrade of libvirt-bin did not install bridge-utils/brctrl, so don't know why the dependency is not being honored (or different here) [19:10] hazmat, sorry, it was just suggesting bridge-utils be installed for missing brctrl, but i do have it [19:10] (the bridge-utils package) [19:11] jimbaker, okay... so output of brctrl show would be helpful... also do you have other virtualization packages (vmware, virtualbox, etc) on the machine? [19:11] hazmat, no other virtualization installed on this box [19:11] jimbaker, cool [19:11] hazmat, still looking for a package with brctrl [19:11] jimbaker, so the brctrl output and the virsh net-dumpxml default output are next [19:12] jimbaker, sorry its brctl [19:12] hazmat, virsh net-dumpxml default: http://pastebin.ubuntu.com/702935/ [19:13] hazmat, brctl show: http://pastebin.ubuntu.com/702936/ [19:14] jimbaker, so all that looks good.. time to check the container config [19:15] hazmat, sounds good [19:15] niemeyer: and at last: https://code.launchpad.net/~fwereade/juju/always-upgrade-local/+merge/78306 [19:15] jimbaker, can you pastebin sudo cat /var/lib/lxc/jbaker-desktop-wordpress-0/config [19:15] niemeyer: already addressed the points you brought up before [19:16] fwereade: Cool, looking! [19:16] hazmat, here it is: http://pastebin.ubuntu.com/702937/ [19:18] Oops [19:19] fwereade: http://wtf.labix.org/ [19:20] fwereade: Looks like it never came up.. not sure if it's a hiccup or if it's actually broken for unknown reasons [19:20] I'll kick it and run it again [19:21] niemeyer: blech, I'll verify here [19:23] fwereade: +1 on the change [19:30] fwereade: 390 is already on the pipeline running [19:30] fwereade: Let's see.. [19:33] niemeyer, something is screwed up, the cloud-init scripts fall over [19:33] niemeyer, but I don't *think* this was me...
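Pulling together the two revision-file threads above — auto-creating a missing revision file with 0, and niemeyer's "put this within the try / single raise" review comment — a hedged sketch of how get_revision might look. Function name, file layout, and error message are assumptions from the conversation, not the code that actually merged.

```python
import os
import tempfile

DEFAULT_REVISION = 0  # the zero-indexed default agreed on above

def get_revision(charm_path):
    """Sketch: parse <charm>/revision, creating it with 0 if missing."""
    revision_path = os.path.join(charm_path, "revision")
    # Missing file: a charm written before revision files existed.
    # Create it rather than erroring out.
    if not os.path.exists(revision_path):
        with open(revision_path, "w") as f:
            f.write(str(DEFAULT_REVISION))
        return DEFAULT_REVISION
    with open(revision_path) as f:
        content = f.read()
    # One try, one raise: return from inside the try when the content
    # parses to a non-negative int; everything else falls through.
    try:
        result = int(content.strip())
        if result >= 0:
            return result
    except ValueError:
        pass
    raise ValueError(
        "invalid revision file in %s: %r" % (charm_path, content))

# demo in a scratch directory
charm = tempfile.mkdtemp()
first = get_revision(charm)      # no file yet: created with 0
with open(os.path.join(charm, "revision"), "w") as f:
    f.write(" 7\n")
second = get_revision(charm)     # whitespace-padded content parses
```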
[19:34] 2011-10-05 19:24:02,144 - cc_apt_update_upgrade.py[WARNING]: Failed to install packages: ['bzr', 'byobu', 'tmux', 'python-setuptools', 'python-twisted', 'python-argparse', 'python-txaws', 'python-zookeeper', 'default-jre-headless', 'zookeeper', 'zookeeperd'] [19:35] niemeyer: I'm seeing a lot of 403s trying to talk to http://us-east-1.ec2.archive.ubuntu.com [19:36] jimbaker that's just wacky [19:36] hazmat, how so? ;) [19:36] jimbaker, everything looks good [19:36] hazmat, ahh. maybe i should try the reboot. just this once [19:36] niemeyer: does that sound familiar in any way? [19:37] reboot is for windows and kernel upgrades (/me shake fist at oracle for buying ksplice) [19:37] hazmat, i know, i know [19:37] fwereade: I've heard something about a name change yeah [19:37] fwereade: What's strange is that the wtf broke in that specific revision [19:38] niemeyer: theories regarding what I did received gratefully... :/ [19:38] it's an act of desperation. but has everyone else who's tried the local provider seen it work? [19:39] jimbaker, not afaik, i think jamespage and bcsaller had it fail on a different network issue (dns wasn't working) [19:39] fwereade: The previous revision looks suspicious [19:39] fwereade: 388 [19:39] fwereade: But that'd mean something in wtf got the revision wrong somehow [19:39] fwereade: Which isn't impossible [19:39] hazmat: I'm testing a small diff to juju-create that I think fixed everything here [19:40] bcsaller, writing both base and run/resolv.conf ? [19:40] niemeyer: hm, that went in a good while ago, and all it hits is providers.local... I thought unused providers weren't even imported? [19:41] hazmat: yeah, http://pastebin.ubuntu.com/702946/ w/ the apt-get update after the cache is in place [19:41] fwereade: True..
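bcsaller's juju-create fix (the pastebin above) writes the libvirt dnsmasq address into both resolvconf locations inside the container rootfs. A hedged approximation in Python — the helper name is hypothetical, the paths follow the conversation rather than the actual patch, and the real fix also runs apt-get update once the cache is in place:

```python
import os
import tempfile

def write_container_dns(rootfs, nameserver="192.168.122.1"):
    """Sketch: write the dnsmasq address to both resolvconf locations
    discussed above (run/resolv.conf and resolv.conf.d/base), since
    DHCP alone wasn't populating the container's resolver."""
    for rel in ("etc/resolvconf/run/resolv.conf",
                "etc/resolvconf/resolv.conf.d/base"):
        path = os.path.join(rootfs, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write("nameserver %s\n" % nameserver)

# demo against a scratch "rootfs" rather than a real container
root = tempfile.mkdtemp()
write_container_dns(root)
```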
[19:42] fwereade: It could be something in Oneiric itself too [19:42] fwereade: I'll rollback wtf to 388, and make it test everything again from there [19:42] niemeyer: cheers [19:42] bcsaller, that's slightly dangerous.. if the environment is long running, it will hit the same problem during a beta cycle.. but ok [19:42] fwereade: If 388 changes to broken, it might not be your revision [19:42] * fwereade feels slightly hopeful [19:43] fwereade: Your merge log messages are not following the convention, btw [19:43] bcsaller, it's bewildering to me though, it's like dhcp isn't being respected, we shouldn't have to manually set up dns for a dhcp setup [19:44] fwereade: Do a "bzr log" on trunk to see the difference [19:44] hazmat: I haven't verified whether resolvconf is messing that up, /etc/dhcp/dhclient-enter-hooks.d/resolvconf makes me think it should be working as well [19:45] niemeyer: pong! [19:46] niemeyer: damn, I'm sorry, I thought I'd retrained myself right [19:46] hazmat: another issue, the way wp is coming up now its only addressable by IP, I'll look at adding an alias with the assigned name and see if that fixes it [19:47] bcsaller, that's not really a problem its only resolvable by the browser by ip [19:47] bcsaller, that's irrelevant imo [19:47] fwereade: No worries, it's just important because it's hard to tell what was merged from e.g. http://bazaar.launchpad.net/~juju/juju/trunk/revision/389 [19:47] not a deal breaker, no [19:47] SpamapS: Hey man [19:47] SpamapS: Was going to ask you for some reviewing, but we already pushed it forward [19:48] fwereade: 388 is churning [19:49] niemeyer: the "mfrom: (348.8.15 isolate-formula-revisions)" on that page gives us as much information as does the summary line, surely?
[19:50] niemeyer: but I'll try to force myself to keep to the standard anyway [19:50] niemeyer: that's sort of good news and sort of bad news [19:50] <_mup_> juju/ftests r12 committed by gustavo@niemeyer.net [19:50] <_mup_> - Fixed butler so it doesn't jump revisions. [19:50] <_mup_> - Compute the waterfall after every revision. [19:53] fwereade: Indeed [19:53] fwereade: Hadn't noticed it to be honest [19:55] hazmat: do you have a moment for a review of https://code.launchpad.net/~fwereade/juju/always-upgrade-local/+merge/78306? [19:55] fwereade, sure [20:02] niemeyer: ahh ok. :) [20:03] hazmat, so one useful aspect of doing the reboot is looking at this scenario: what happens to the lxc containers? bootstrap reports it's still bootstrapped, but status fails [20:05] jimbaker, they're down [20:05] they don't autostart [20:05] phone bbiam [20:05] hazmat, exactly. is there a way to restart them? [20:05] hazmat, later [20:07] jimbaker, lxc-start -n name-of-container [20:07] hazmat, sounds good. so we can support that later in some sort of automatic way i guess [20:07] jimbaker, i believe there are mechanisms to autostart them as well [20:08] jimbaker, but the zk and machine agent are down as well [20:08] hazmat, in any event, the reboot did nothing (as expected) [20:08] jimbaker, so on reboot you'll need to destroy and bootstrap [20:08] the env [20:08] for now [20:08] hazmat, correct, that's why juju status failed so badly [20:08] hazmat, which is fine for now [20:09] jimbaker, absolutely, we're not touching the host this release with local provider [20:10] outside of the minimum needed to work (lxc containers) [20:10] hazmat, unless you have any more ideas, i'm going to put off getting local provider running. hopefully jamespage, bcsaller, and others have better luck [20:10] jimbaker, well, we haven't done anything since it's been rebooted [20:10] jimbaker: with the small patch I posted a short while ago it's working fine for me again [20:10] bcsaller, what was that?
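On the restart question: containers under /var/lib/lxc don't autostart after a host reboot, and the supported path in this release is destroy-environment plus a fresh bootstrap. For manual poking, the lxc-start commands hazmat mentions could be generated like this. This is a hypothetical helper, not part of juju; building the command lists (rather than running them) keeps the sketch testable, and you would pass each one to subprocess.check_call to actually start containers.

```python
import os
import tempfile

def restart_commands(prefix, lxc_dir="/var/lib/lxc"):
    """Build `lxc-start -n <name> -d` invocations for every container
    whose name carries an environment prefix, e.g. "jbaker-desktop-".
    Note this does not bring back the zk or machine agent state."""
    names = [n for n in sorted(os.listdir(lxc_dir)) if n.startswith(prefix)]
    return [["sudo", "lxc-start", "-n", name, "-d"] for name in names]

# demo against a fake lxc directory instead of /var/lib/lxc
fake = tempfile.mkdtemp()
for name in ("demo-local-0", "demo-local-1", "unrelated"):
    os.makedirs(os.path.join(fake, name))
cmds = restart_commands("demo-local-", lxc_dir=fake)
```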
[20:10] jimbaker, destroy-environment and bootstrap & deploy [20:11] jimbaker, the issue you're seeing is different from the patch, which fixes some dns resolver issues [20:11] hazmat, yes, i did that, and units were started, but the network issue is still there [20:11] http://pastebin.ubuntu.com/702946/ [20:11] Uh oh [20:11] jimbaker, so you still have units unable to connect to zk? [20:12] hazmat, correct, exact same issue in the unit.log (s) [20:12] jimbaker, i'm not entirely out of ideas yet ;-) [20:12] hazmat, SpamapS: AMBER CODE [20:12] hazmat, ok, then let's try them out [20:12] niemeyer, ready and waiting [20:12] EC2 deployment is broken [20:12] and it doesn't look like something we did [20:13] Something probably changed in Oneiric in the last few hours [20:13] jimbaker, confirming (netstat -al) that zk is listening on localhost with a port known to the agents [20:13] 388 was passing moments ago, and is now broken [20:13] jimbaker, i'm going to switch off to the ec2 issue [20:13] niemeyer, debugging [20:13] fwereade: Looks unrelated to your changes indeed [20:13] niemeyer, could be a bad image [20:13] fwereade: Do you have more details that could help us pinpoint the bug [20:13] ? [20:14] fwereade: (from your test run) [20:14] hazmat, definitely is priority [20:14] jimbaker, but if you could confirm that info it would be helpful [20:14] jimbaker, did you mention earlier i could get a login to that machine? [20:15] hazmat, yes, i will create one for you [20:15] niemeyer: sorry, not really: the only other thing I verified was that even bzr wasn't installed [20:15] fwereade, hazmat: Maybe one of the packages is missing (or has been renamed) [20:15] so if the ec2 repos are changing names [20:15] niemeyer, i'm firing up an ec2 environment, more info in a few [20:16] hazmat: From your log: [trivial] require libzookeeper-java not zookeeperd [20:17] hazmat: Shouldn't that be done to the EC2 provider as well? [20:17] niemeyer, no.. they don't install this stuff..
that's for running it on the host [20:17] Ok [20:18] * niemeyer checks the list of packages [20:22] Looks ok [20:23] smoser: Hey! [20:23] smoser: Were the AMIs updated today by any chance? [20:29] niemeyer, utlemming updated them. https://lists.ubuntu.com/archives/ubuntu-cloud/2011-October/thread.html [20:30] smoser: What should I be looking for there? [20:31] niemeyer, so i get the same traceback that fwereade got [20:31] hazmat: What's the traceback? [20:31] Refreshed UEC Images of 10.04 LTS (Lucid Lynx) [20110930] [20:31] smoser, niemeyer http://paste.ubuntu.com/702970/ [20:31] Refreshed Cloud Images of 10.10 (Maverick Meerkat) [20111001] [20:31] Refreshed UEC Images of 11.04 LTS (Natty Narwhal) [20111003] [20:32] smoser: We're using the Oneiric images [20:32] the dailies? [20:32] dailies [20:32] they're updated every day [20:32] smoser: I'll ping utl* about it [20:32] smoser: Oh, ok [20:32] Hmm [20:32] What the heck.. what's a Pangolin? [20:33] hazmat, you have the cloud-init-output.log now [20:33] pastebin that [20:33] smoser, sure [20:33] /var/log/cloud-init-output.log (i think that is the path we used) [20:33] apt just failed, either archive issue or package failure [20:33] smoser, what's the cli for pastebin called? [20:34] got it [20:34] smoser, http://paste.ubuntu.com/702972/ [20:34] smoser, and http://paste.ubuntu.com/702973/ [20:34] woah.. that second one is fubar [20:34] forbidden on the repo? [20:34] yeah [20:35] wtf [20:36] i'm asking in #is [20:36] Ugh.. that explains a ton [20:39] I'll stop the waterfall for the moment [20:39] fwereade: Thanks for your help on this issue, and sorry for the false alarm [20:40] fwereade: Is that last branch in? [20:41] <_mup_> juju/ftests r13 committed by gustavo@niemeyer.net [20:41] <_mup_> Fixed authorized_keys handling in ec2-setup.sh when moving ~/.ssh away [20:41] <_mup_> so that the user can still log in. 
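Earlier hazmat asked jimbaker to confirm via netstat -al that ZooKeeper is listening on the port the agents expect (clientPort=53699 in jimbaker's environment). The same check can be done programmatically; this sketch just attempts a TCP connect, and is demoed against a throwaway listener rather than a real ZooKeeper.

```python
import socket

def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds,
    e.g. can_connect("192.168.122.1", 53699) for the zk clientPort."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    sock.close()
    return True

# demo: open a listener on an ephemeral port, probe it, close it, probe again
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
open_result = can_connect("127.0.0.1", port)
server.close()
closed_result = can_connect("127.0.0.1", port)
```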
[20:41] hazmat, smoser: Thanks for debugging this as well folks [20:41] I'll step down for a cup of coffee.. biab [20:41] niemeyer, np [20:41] that sounds like a good idea [20:48] niemeyer, sorry, got pulled away for a while [20:49] hazmat: I presume the weirdness has prevented you from taking a look at that branch? [20:50] fwereade, it has, also trying to debug local provider on jim's machine [20:50] fwereade, many beautiful butterflies.. i'll stop and have a look now [20:51] hazmat, only unless you know who *isn't* horribly busy atm, who I could hit up instead? [20:56] fwereade: Man, I feel bad that we're holding you back still.. that must have been a long day for you [20:56] niemeyer, don't worry about it [20:57] heya hazmat, what's the path to the daily ppa that I should use to try out the local provider awesomeness you posted about? [20:57] statik, ppa:juju/pkgs [20:58] ah ok, I didn't realize that was daily [20:58] thx [20:58] bcsaller, +1 on that patch btw [20:59] jimbaker, i'm going to pause on debugging this and switch back to a review for a bit [21:02] hm, niemeyer, just noticed a minor lurking bug in the deploy/upgrade changes: the error when --repository is needed but not included is entirely unhelpful [21:03] niemeyer: worth a minor to raise FileNotFound from resolve(), or possibly LocalCharmRepository.__init__()? [21:03] fwereade: What does it look like? [21:03] niemeyer: NoneType has no attribute .endswith IIRC [21:03] fwereade: Ouch :-) [21:03] niemeyer: hm, it's not quite FileNotFound though [21:04] fwereade: It's CharmNotFound I guess [21:04] niemeyer: not even quite that, if anything it's RepositoryNotFound ;) [21:05] fwereade: Hmm [21:05] niemeyer: or, really, RepositoryNotSpecified [21:06] fwereade: Yeah, but only if the charm is local:, and if it's not yet in the env [21:06] fwereade, what's the leapfrog do? [21:06] rev 0 to rev 2?
[21:06] fwereade: and the problem is similar if the repository is provided, but the specific charm isn't there [21:07] hazmat: bumps to deployed+1, which is a jump of >1 from the one in the repo [21:07] fwereade, ah cool [21:07] hazmat: if yu can think of a beeter name I'm all ears :) [21:07] * fwereade peers suspiciously at his keyboard [21:07] it's the tools, I tell you! [21:08] niemeyer: I feel like the arg parser is really the right place to catch it [21:09] niemeyer: at the moment anyway [21:09] fwereade: I can't see how the arg parser is related to this [21:09] fwereade: This is highly contextual.. [21:09] niemeyer: that might change once we actually can specify a repo in the environment [21:09] fwereade: "if not repository_path and charm is local and charm not in environment: raise foo" [21:10] niemeyer: maybe, but it's connected: the reason the bug exists is because the "repository" arg is no longer required [21:10] fwereade: Sure.. that's quite expected given that we just removed it [21:10] fwereade: We just need to consider the case where it's not there and raise an error [21:11] fwereade: We can do that tomorrow, though, in a bug fix [21:11] fwereade: with you relaxed ;-) [21:11] fwereade, review in one minor, but very awesome [21:11] hazmat: cool, thanks [21:12] niemeyer: sounds like a plan [21:12] fwereade: Please just file a bug about this so you can sort it out later [21:12] niemeyer: will do, I meant "tomorrow" sounds like a plan :) [21:12] fwereade: Awesome, thank you! [21:16] SpamapS, m_3 .. fwereade has answered the common desire of all formula authors.. auto incrementing versions on upgrade [21:16] negronjl, ^ [21:16] ah nice [21:17] \o/ !!! [21:17] [21:25] <_mup_> Bug #868729 was filed: deploy and upgrade-charm give unhelpful errors when repository not specified < https://launchpad.net/bugs/868729 > [21:26] ...and that's it for me, nn all [21:28] fwereade: Woohay! [21:28] fwereade: Have a restful night! 
:) [22:45] niemeyer, bcsaller, jimbaker just a heads up, i'm moving over not in progress bugs to the next milestone [22:46] hazmat: k [22:46] hazmat: Awesome.. if you find anything interesting on the way that could deserve some attention, please raise a flag [22:48] jimbaker, this one is a duplicate afaics https://bugs.launchpad.net/juju/+bug/846055 [22:48] <_mup_> Bug #846055: Occasional error when shutting down a machine from security group removal < https://launchpad.net/bugs/846055 > [22:48] perhaps not [22:57] hazmat, that's a txaws parsing issue, when we separate out the related bug 863510 [22:57] <_mup_> Bug #863510: destory-environment errors and hangs forever < https://launchpad.net/bugs/863510 > [22:57] jimbaker, this one also seems closed afaics https://bugs.launchpad.net/juju/+bug/824222 [22:57] <_mup_> Bug #824222: juju bootstrap should be more robust < https://launchpad.net/bugs/824222 > [22:57] jimbaker, i only see tracebacks when in verbose mode [22:58] hazmat, yes, that's correct [22:58] jimbaker, and this one is fixed? https://bugs.launchpad.net/juju/+bug/802678 [22:58] hazmat, so let's definitely close that one. i think we could consider expected errors vs not, but that can be in the next milestone [22:58] <_mup_> Bug #802678: Update watch_ports < https://launchpad.net/bugs/802678 > [23:00] bcsaller, i assume statusd isn't getting merged in the next day or so? [23:00] hazmat: no [23:00] hazmat, the test has gone away [23:00] hazmat, let me see where it lives now, if at all [23:05] hazmat, let's mark that as fix released, the current test is very close to what's in the attached patch, which would fail when repeated [23:05] it doesn't now [23:05] well its invalid at this point [23:06] sure, that works too [23:06] hazmat, are we planning on doing jenkins for ec2 functional testing? i thought wtf covered it [23:07] jimbaker, fair enough.. feel free to mark it wont fix [23:07] hazmat, ok [23:08] kim0, ping..
just curious about the orchestra docs [23:08] hazmat, what do you think of bug 697093 - this purely seems to be an artifact of our testing setup [23:08] <_mup_> Bug #697093: Ensemble command should return nonzero status code for errors < https://launchpad.net/bugs/697093 > [23:09] (at the very least i need to update the description/summary accordingly) [23:11] jimbaker, oh.. perhaps [23:11] jimbaker, yeah... ideally we could test for status codes, but if i'm reading that right, we're really just testing the mocks not the actual exit codes [23:11] * hazmat calls it a night [23:11] cheers [23:12] hazmat, yeah, it would be nice to get it right, but tricky for sure [23:12] worthwhile for florence however [23:12] hazmat, enjoy! [23:13] niemeyer, as far as interesting bugs we might want to still consider for eureka.. https://bugs.launchpad.net/juju/+bug/814987 and https://bugs.launchpad.net/juju/+bug/828326 the latter is a new feature though [23:13] <_mup_> Bug #814987: config-changed hook is retried on 'resolved' even when --retry is not passed < https://launchpad.net/bugs/814987 > [23:13] <_mup_> Bug #828326: need to be able to retrieve a service config or schema from the cli < https://launchpad.net/bugs/828326 > [23:14] re the first one.. actually the issue i'm interested in isn't really about config-hooks, but that install error retries are broken [23:14] for getting to a working state [23:14] hazmat: Why are they broken? [23:14] niemeyer, because the automatic transition from install to start doesn't happen [23:15] afaicr [23:15] after the error recovery [23:15] the unit just stays in the installed state [23:16] hazmat: Ugh [23:16] hazmat: Ok [23:17] hazmat: Sounds like something that would be nice to get done indeed, if we find some spare time [23:55] I'll kick off the wtf [23:55] Hopefully the repositories are back already [23:56] Uh oh.. Steve Jobs is no more.. :(