[00:12] arosales, interesting re affinity group bug [00:13] arosales, did red squad implement affinity groups recently? it wasn't an issue last week [00:15] sinzui: no [00:15] sinzui: only minor jumps of .2 at this stage [00:15] sinzui: planning to fix this later [00:15] sinzui: has the world stopped burning? [00:18] sinzui, i though it looked for the lastest compatible minor release [00:20] whoops thumper ^ [00:21] looking at upgradejuju initVersions [00:22] thumper, per environ/tools.FindTools ... f minorVersion = -1, then only majorVersion is considered. [00:22] s/f/if [00:23] hazmat: whether or not it looks for the latest tools is irrelevant, the code doesn't promise to work [00:23] I have been told loudly and often that we only support single jumps at this stage [00:23] and not from dev -> dev [00:23] only from released -> current dev [00:23] or old release -> current release [00:24] or last release -> current release [00:24] well old_release -> current_release is the question.. and if the code doesn't prevent it, then the default behavior is effectively not supported [00:25] ie 1.12 upgrade will go straight to 1.16 or 1.18.. [00:25] :) [00:25] hazmat: well, wether is does or not is a bit moot [00:26] either its supported and its fine, or its not supported, and we have default behavior which is potentially dangerous for prod envs. [00:26] thumper, then we can talk about downgrading ;-) [00:26] well, I've been told it isn't supported, so we should check to see if it dies :) [00:27] thankfully a test trivial with a downgrade.. [00:28] downgrade ? [00:28] are you mad [00:29] the only way is up, baby [00:29] davecheney, i filed a bug and told it was meant to work for downgrade.. [00:29] but you can --version=previous_version [00:29] unfortunately can't go back to 1.12 since its not in simple streams.. [00:33] which also sort of explains the upgrade problems [00:34] hazmat, ack thanks. I do have cgroup-lite installed. [00:34] (1.8) [00:36] gary_poster, yeah.. i looked over the tracebacks again.. didn't seem related, not sure what's up there [00:37] gary_poster, it did seem possibly networking related, but unclear.. [00:37] gary_poster, could you pastebin ifconfig from the host [00:38] hazmat http://paste.ubuntu.com/6177815/ [00:38] totally sane.. [00:38] thumper, gary_poster i'd ask hallyn about this one [00:39] hazmat, I was thinking the same. [00:40] sinzui, thumper so with deploys coming from the new streams structure, effectively the only version of juju known to trunk is 1.15 atm.. no real testing of multi-version leaps atm unless using an old client [00:41] hazmat: i don't think downgrades have ever been supportde [00:41] i've certainly never heard that requirement, and it has never been a constraint [00:43] davecheney, fair enough.. this notion of 'support' only known to juju-core devs is interesting.. my concern is primarily what does the tool allow end users. if minor increment skipping isn't supported, why does the tool allow it.. much less do it by default.. if downgrading isn't supported, how about a warning at least.. if not remove the capability [00:44] hazmat: how can you downgrade ? [00:45] maybe this was added when I wasn't looking [00:45] davecheney, juju upgrade-juju --version=lower_version [00:46] meh, that is like --to, if you do that, all the safties are demoved [00:46] removed [00:46] davecheney, https://bugs.launchpad.net/juju-core/+bug/1227991 [00:46] <_mup_> Bug #1227991: upgrade-juju can downgrade.. 
and causes hook executions [00:46] yup, certainly a footgun [00:49] davecheney, --to doesn't remove safeties from juju, just the charms, and its quite safe for container spec [00:50] although honestly i like the capability.. what i dislike is a default behavior which only juju-core devs knows is not supported, namely skipping versions on upgrades. [00:50] ie juju upgrade-juju.. resulting in an unsupported behavior [00:51] * hazmat files a bug for future ref === gary_poster is now known as gary_poster|away [00:58] ah ffs [00:59] lxc has changed its behaviour for another thing [00:59] fuckity fuck fuck [00:59] * thumper looks at the docs [00:59] * thumper goes to write another branch [01:06] thumper: what's the cut off date for 1.16.0? [01:14] axw: soft cut off is thursday I think [01:14] and then it gets harder :) [01:14] thumper: okey dokey, ta [01:15] public holiday came at a bad time :\ [01:15] * thumper shrugs [01:15] * thumper goes to fix the lxc bug [01:17] thumper: sounds like the same story as last time [01:17] there is a date [01:17] but it's optional [01:17] depending on your frame of reference [01:18] davecheney: I would say that we only want bug fixes after that [01:18] and not try to drop features [01:20] drop == include [01:20] or drop == exclude ? [01:20] drop == land ? [01:21] why why did I make my admin-secret so secret [01:21] such a pain [01:21] y didn't i make it 'bob' [01:26] not to land features [01:28] understood [01:28] * thumper needs hacking music [02:00] axw, davecheney: I'm confused [02:00] I use os.Symlink to make a symlink [02:00] but it doesn't set the symlink mode flag [02:00] if I go os.Stat(path), fileinfo.Mode().IsRegular() it returns true [02:00] why? [02:01] thumper: not sure [02:01] let me try [02:01] I'm writing a checker for IsSymlink [02:01] also not sure.. [02:01] so in my test test, I make a symlink , then test it [02:01] and it says nah [02:02] * thumper is confused [02:02] c.Assert(symlinkPath, jc.IsSymlink) [02:02] ... obtained string = "/tmp/gocheck-5577006791947779410/8/a-symlink" [02:02] ... /tmp/gocheck-5577006791947779410/8/a-symlink is not a symlink: &{name:a-symlink size:0 mode:384 modTime:{sec:63516189566 nsec:62250446 loc:0x6c71c0} sys:0xc2000975a0}, true [02:02] anyone seen this one.. https://bugs.launchpad.net/bugs/1233457 [02:02] <_mup_> Bug #1233457: service with no units stuck in lifecycle dying [02:03] file, err := ioutil.TempFile(c.MkDir(), "") [02:03] c.Assert(err, gc.IsNil) [02:03] symlinkPath := filepath.Join(filepath.Dir(file.Name()), "a-symlink") [02:03] err = os.Symlink(file.Name(), symlinkPath) [02:03] c.Assert(err, gc.IsNil) [02:03] cts needs a fix, and afaics the only thing to do is munge the db.. [02:03] how about those pastebins [02:03] hazmat: sure... [02:04] thumper: I think you want os.Lstat [02:04] ah... [02:04] yeah [02:04] Stat is checking the thing that is being pointed to right? [02:04] yup [02:04] thumper, python is the same btw [02:04] os.lstat vs os.stat [02:04] yeah, that's it [02:05] hazmat: no, not seen that before [02:05] although maybe I have, but I don't recall [02:06] we have some lifecycle issues [02:12] thumper, so what's the downside of just yanking that document from mongo.. [02:12] hazmat: unsure [02:13] hazmat: nobody has ever tried [02:13] afaics as long as its cleared out of units, relations, relation scopes, its fine. [02:14] thumper, probably some txn issues as well [02:14] yeah... that's the problem.. prod customer env.. doing the crazy.. 
i guess i need a full db dump to verify this is sane.. [02:15] but against a sample env, it does look okay.. [02:16] the txn observers complicate things, but nothing triggers on this observation [02:16] thumper, hmm.. let me ask a different way.. do you know what processes the cleanup of services in lifecycle:dying [02:16] would have to be the provisioner [02:16] ie could wondering if i could just bounce the bootstrap jujud to process it [02:17] hazmat: no, there is a lifecycle doc written by fwereade in the tree, but no I don't really know [02:17] hazmat: try it [02:17] can't hurt [02:17] davecheney, why the provisioner? [02:17] its not related to the provider [02:17] hazmat: it's the only one left [02:17] unit agent moves the unit to dying [02:17] there is no service agent [02:17] so it must be the provisioner [02:18] davecheney, fair enough [02:18] davecheney, or at least a job on the machine agent [02:18] on bootstrap [02:18] ie could just as easily be api [02:18] or something else [02:19] davecheney, you reasonably sure that bouncing it won't pertub the system? === julianwa_ is now known as julian === julian is now known as juliawa [02:20] hazmat: it could be the machine agent [02:20] but that is a problem [02:20] because when the last machine is removed [02:20] who will remove the service ? [02:20] the machine agent moves the unit from dying to dead [02:20] that is all === juliawa is now known as julianwa [02:21] hi julianwa [02:21] hi davecheney hazmat [02:21] hazmat: yes [02:21] bouncing it is fine [02:21] julianwa, so the suggestion is juju ssh 0 ... and then [02:21] it is designed for this [02:22] julianwa, sudo service jujud-machine-0 restart [02:22] hazmat: ok, will try in few minutes [02:22] julianwa, that might resolve it as forces the agent to process current state, shouldn't pertub the system at all [02:23] hazmat: the bootstrap agent is probably restarting all the time [02:23] julianwa, if that doesn't work.. then i'll have to fallback to the manual munge the db, which i'd rather avoid if possible, as its not supported. [02:23] davecheney, huh [02:23] we don't let the process die, 'cos that will cause upstart to put it in the sin bin [02:23] but the process does restart the worker frequently [02:24] davecheney, if that's the case then the persistent error here over days, would have already been resolved.. [02:24] hazmat: yes [02:24] so a restart is unliekly to fix the problem [02:24] but it's also unlikly to make it any worse [02:25] davecheney, upstart will restart as long as its not a persistent suicide [02:25] davecheney, how often does juju kill workers.. my understanding is that it only does it on error in the worker [02:25] ie. its not a periodic process [02:25] hazmat: depends on the error [02:25] with maas, it's quite frequent [02:26] with other providers, it tends to be less frequent [02:26] davecheney, ie.. if its not a periodic process... then no error.. means no restart [02:26] so a manual restart still has value [02:26] hazmat: sure [02:26] davecheney, is the scope of the worker all the environ jobs [02:26] bottom line [02:26] nobody knows what will fix it [02:26] but a restart can't hurt [02:26] indeed [02:27] and its better than touching juju's db by hand.. which is the next step.. 
so worth a shot [02:28] aha [02:28] hazmat: I'd highly recommend not hacking the juju db by hand [02:28] davecheney, thumper we have the underlying issue it looks like [02:28] jujud died perm [02:29] thumper, except when nesc :-) like migration [02:29] hazmat: only if you mean pyjuju -> go juju [02:30] 12:28 < hazmat> jujud died perm [02:30] ^ can you please expand on this [02:30] davecheney, its not running [02:30] what does the log file say ? [02:32] thumper, ack on recommend, and yes that's the migration i'm working on... in this case service is pretty innoucous though, it has no agent manipulating state. [02:37] an observers to trigger on txn, in fact its the lack of triggers there that forces the manual [02:38] hazmat: already restart jujud-machine-0 on bootstrap node [02:38] hazmat: nothing changed [02:38] julianwa, did it stay up.. ie is it in the ps aux | grep juju output [02:39] julianwa, could you copy/send /var/log/juju/machine-0.log for analysis [02:39] thumper: [02:39] 2013-10-01 01:03:32 DEBUG juju.rpc.jsoncodec codec.go:172 -> {"RequestId":9,"Response":{"Results":[{"StringsWatcherId":"4","Changes":null,"Error":null}]}} [02:39] 2013-10-01 01:03:32 DEBUG juju.worker.logger logger.go:45 reconfiguring logging from "=DEBUG" to "=WARNING" [02:39] ^ when did this land ? [02:39] my juju environment has stalled [02:39] it hasn't stalled [02:39] and I'm not getting any output from the agents because we log nothing at WARNING level [02:39] it just isn't giving info and debug [02:40] if you want, you should bootstrap with debug [02:40] ok, is it possible to change this ? [02:40] or read my big long email [02:40] ok, will re-read [02:40] juju set-env logging-config='=DEBUG' [02:40] on the running environment [02:40] will set it back [02:41] juju set-env logging-config='=INFO;juju=DEBUG;juju.rpc=INFO' will reduce noise [02:41] or even [02:41] juju set-env logging-config='juju=DEBUG;juju.rpc=INFO' will reduce noise [02:41] yummy [02:41] thumper, how bout a default of that [02:42] export JUJU_LOGGING_CONFIG='juju=DEBUG;juju.rpc=INFO' [02:42] thumper, what's the current default? 
[02:42] =WARNING [02:42] k [02:42] the logging config you bootstrap with is passed on [02:42] hazmat: thumper i'll do that [02:43] but i'm concerned that the debugging that I needed has probably been thrown away [02:43] i'll rebootstrap and try again [02:46] davecheney, hazmat: perhaps we should change the default logging level if you are using a debug release [02:46] as that is most likely to be developers [02:46] who are more interested in logs [02:46] well, debug info anyway [02:46] thumper: i think the default sohuld be export JUJU_LOGGING_CONFIG='juju=DEBUG;juju.rpc=INFO' [02:46] hazmat: machine-0.log.tar.gz under chinstrap:/home/julianwa [02:46] the main problem is the log spam from the rpc code [02:46] everything else is crucial [02:46] davecheney: alternatively, we set the rpc spam to TRACE [02:47] so DEBUG just works [02:47] sure [02:47] i don't care about the specificas [02:47] everyone hates the rpc spam [02:47] all i know is I have an environment that is broken [02:47] and i cannot debug it beucase the logging that i needed was discarded [02:48] all the agents are idle [02:48] but i have no logs to debug the problem [02:48] davecheney: I'll submit a fix tomorrow [02:48] * davecheney destroys environment and trys again [02:50] julianwa, thanks [02:52] julianwa, that's odd i get unexpected EOF extracting [02:52] and output cuts off mid line [02:53] but it doesn't look the agent was doing anything for a while [02:53] julianwa, when you restarted it, did the process stay up? [02:55] hazmat: FWIW, I'm seeing upstart issues starting upstart for the local provider now [02:55] it seems that upstart sees it very fast [02:55] and starts it before we call start [02:56] a race it seems between us going: are you running? no? ok lets start you, fail - already running [02:57] hazmat: what's you mean stay up? [02:58] hazmat: did you mean process still have same pid? [02:59] julianwa, sure [02:59] julianwa, i'm trying to verify its running [02:59] its unclear from the log why it died, just want to make sure that it is in fact alive [03:00] hazmat: yes. it's running with new pid [03:00] julianwa, good, and status still has mysql dying... [03:00] ah fark [03:00] right [03:00] julianwa, so unto manual db work.. [03:01] one moment [03:03] * thumper headdesks [03:06] thumper, do you have an alternative? [03:06] * thumper headdesks [03:06] * thumper headdesks [03:06] julianwa, are you using juju-deployer? [03:06] hazmat: no [03:06] k [03:08] hazmat: for some reason, my local provider is completely screwed [03:08] thumper, hmm.. works okay for me. what do you see? i still have a sleep in Install and a sudo in start [03:08] re upstart.go [03:09] telling me that the environment isn't up when I can see that it is [03:09] bootstrap worked [03:09] status says "you aren't bootstrapped" [03:10] thumper, would you mind pastebin status -v --debug [03:10] just curious what it looks like [03:10] hazmat: -v isn't needed if you use --debug [03:10] really [03:10] thumper, i used to assume that then someone told/showed me otherwise [03:11] no, it's bollocks [03:11] davecheney, wasn't that you? [03:11] not important [03:11] thumper, yeah.. --debug would be interesting to see [03:11] $ juju status --debug 2>&1 | pastebinit [03:11] http://paste.ubuntu.com/6178119/ [03:12] thumper, is your provider storage running?.. its basically failure to resolve the juju db i'd assume [03:13] your not even getting to state/open [03:13] not very helpful eh? [03:14] so env/conn is the failure point [03:14] well maybe.. 
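For reference, the os.Stat vs os.Lstat confusion thumper ran into earlier (around 02:04): Stat follows a symlink and describes its target, while Lstat describes the link itself. A minimal sketch of a symlink check along those lines, not the actual jc.IsSymlink checker:

    package main

    import (
        "fmt"
        "os"
    )

    // isSymlink reports whether path is itself a symbolic link.
    // os.Stat would follow the link and report on its target instead,
    // which is why fileinfo.Mode().IsRegular() returned true above.
    func isSymlink(path string) (bool, error) {
        fi, err := os.Lstat(path)
        if err != nil {
            return false, err
        }
        return fi.Mode()&os.ModeSymlink != 0, nil
    }

    func main() {
        ok, err := isSymlink("/tmp/a-symlink")
        fmt.Println(ok, err)
    }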
[03:14] actually possibly before then [03:15] hmm... [03:15] for some reason it wasn't started [03:15] but it said it was [03:15] wtf [03:16] * hazmat returns to db surgery [03:16] * thumper is still confused [03:17] seems to be some weirdness there [03:37] hazmat: i did recommend -v always [03:37] but the meaning of -v has changed recently [03:38] it hasn't changed yet, but will one day soon [03:38] or perhaps it did way back, and I changed it [03:39] -v and --debug uses to be synonyms [03:39] not anymore [03:39] or somethign [03:40] julianwa, i need a sanity check before i give the modify db script.. could you run this http://paste.ubuntu.com/6178181/ from the client machine, you need to pass -e env_name and -o output_file.json .. and then copy the file over to chinstrap so i can verify what needs modification. [03:41] julianwa, you'll probably need to setup a virtualenv to get a recent version of mongodb if the client is precise [03:47] hazmat: setup a virtualenv on the server? [03:55] julianwa, on the client [03:55] sorry net dropped for bit, battery died [03:55] julianwa, virtualenv --system-site-packages jujufix && source jujufix/bin/activate && easy_install pymongo [03:56] then python script_from_pastebin -e env_name -o output.json [03:58] * thumper off to do a demo of juju [03:58] so I fully expect everything to fall in a heap [03:58] thumper, good luck [03:58] cheers [04:01] julianwa, that script is read only, it just grabs a copy of the db into a json [04:02] mostly so i can inspect and verify all refs to 'mysql' before doing an update script [04:11] we don't have a client to create virtualenv. the client is maas server. [04:11] hazmat: ^^ [04:11] julianwa, that's fine [04:12] julianwa, you can setup the virtualenv there [04:12] the virtualenv will keep the pymongo package isolated [04:12] from the system [04:12] you do need to install python-virtualenv package [04:24] ie. on the maas server [04:26] and then you can remove the virtualenv dir when the work is done [04:36] julianwa, sorry i just got told you may not have outbound net access [04:36] might have to do go for this.. [04:37] except not sure i have time for that tonight [04:46] julianwa, k.. i'm done for the night. can pick up via email. [04:49] hatch: thanks will exec script and get back to you [04:49] sorry hazmat ^^ [04:49] julianwa, no worries [07:41] mornin' all [07:41] morning [07:43] morning rogpeppe, dimitern [07:44] axw: hiya [07:44] axw: just glancing at q [07:45] axw: https://codereview.appspot.com/14197043 [07:45] axw: is the whole thing really on one line? [07:46] rogpeppe: yep, the base64 encoder doesn't add newlines if that's what you're thinking of [07:46] axw: hmm, that seems wrong - head should not have to read 10MB lines [07:47] rogpeppe: hmm yeah, I guess head is going to buffer all that isn't it. [07:47] axw: and... is there a pty at the other end of that ssh connection? [07:47] axw: indeed [07:47] rogpeppe: no pty allocated [07:47] axw: well at least we won't overflow the input buffer [07:48] rogpeppe: I can change it to a here doc. [07:48] hm [07:48] isn';t that going to be the same? [07:48] axw: that would be better, but for other reasons [07:50] rogpeppe: other reasons being? 
[07:50] axw: it's robust when the input doesn't happen to be exactly the right length [07:50] rogpeppe: this will always be one line [07:51] axw: currently yes, but i think it should probably be split [07:51] axw: so that it can be streamed all the way [07:51] axw: it's slightly odd that encoding/base64 doesn't provide an option to do that [07:52] rogpeppe: indeed, I thought there was a mandated line length for some usages of base64... [08:06] rogpeppe: I was thinking about https://codereview.appspot.com/14011046/ again before; I *think*, once everything's behind the API, we should be able to solve this cleanly by adding something into the environ config [08:07] i.e. a "bootstrapped" flag, which is checked to see if ssh/http storage is used [08:07] initially false (while bootstrapping), then set to true before adding into state [08:07] then the API server will always load it out as true [08:07] axw: i wondered about that [08:08] axw: i'm not sure though [08:08] axw: currently i assume that Bootstrap doesn't change any config options that need storing locally [08:10] axw: how about something like this for splitting lines (untested): http://paste.ubuntu.com/6178727/ ? [08:12] rogpeppe: sure, but then what on the server-side? head still buffers, as does a here doc [08:12] axw: no, a here doc doesn't buffer (and neither does head, other than line-at-a-time) [08:13] er yeah of course, line buffered.. :) ok, I thought the here doc was fully buffered [08:13] ok, I can make that change [08:15] axw: seems to work ok: http://play.golang.org/p/oYPyqeej5_ [08:24] fwereade, ping [08:25] dimitern, pong [08:25] axw: their unbuffered nature is almost the entire reason that heredocs exist (otherwise you could just use single quotes); think of shell archives. [08:26] fwereade, I updated that passwords CL, if you can take a look? https://codereview.appspot.com/14036045/ [08:26] dimitern, cheers [08:26] rogpeppe: thanks, I didn't think about that too hard [08:26] fwereade, ah, sorry didn't see you had comments recently - will look at them now [08:27] dimitern, they're basically a repriseof ourdiscussions, thought Ishould record them [08:28] fwereade, I couldn't find a way to access the password from the agent conf, except if there's a method on the interface [08:28] fwereade, but now at least, there's no newPassword and I do read it back to get the saved value [08:29] dimitern, hmm, I see, don't love it but I guess that's why we had that Password method in the first place [08:29] dimitern, do you think there's any way we can make its testiness clearer [08:30] dimitern, like if we added an unexported passwordForTests method to the interface, and then had an export_test func that called it? [08:31] fwereade, the problem is, I need it outside of the agent module, so export_test won't do [08:32] dimitern, jujud/bootstrap_test? [08:32] fwereade, also, originally the method was PasswordHash(), but that's not going to work in my case - I need the actual plain text password [08:32] dimitern, or more? [08:32] dimitern, yeah, I saw that, entirely unbothered there :) [08:32] fwereade, there and in machine_test, yeah [08:33] fwereade, hmm you know what? 
[08:33] fwereade, I do have OpenState and OpenAPI in config [08:33] dimitern, I don't see it in machine_test [08:34] fwereade, I can use these instead in assertCanConnectToState and the other [08:34] dimitern, hey, yeah, just load a config and check those work (or don't) as expected [08:34] fwereade, ok, and I'll revert to PasswordHash then [08:34] dimitern, and in bootstrap_test I think the try-to-load-a-config thing should work too [08:34] fwereade, you mean drop PasswordHash altogether? [08:34] dimitern, ISTM that we might then even be able to drop Password [08:34] yeah :) [08:35] fwereade, ok, will try now [08:35] dimitern, lovely, thanks [08:36] rvba, ping [08:37] fwereade: ping [08:37] Hi fwereade [08:37] rvba, those docs STM to imply that you can and should and do use multiple API keys for the same user, and that that's the accepted way to use juju/maas, and all I'm looking for is a way to automate the ugly bit of the setup-- have Imissed something? [08:38] TheMue, pong [08:38] fwereade: 1st, just found your mail regarding the unset documentation, missed it in the mail stream. aaaargh. will do it [08:38] fwereade: right, but I think the bug is there even if you use multiple API keys (belonging to the same user). [08:39] rvba, are we sure? the python implementation doesn't look like it does anything very different [08:39] fwereade: 2nd, the refactoring is almost done (currently in testing), but i have to admit that using a struct doesn't really feel better then 3 arguments and 4 return values [08:39] fwereade: it sometimes even reads more ugly [08:40] fwereade: I confess I haven't tested it in the field but I'm pretty sure. [08:40] fwereade: I'll test it today. [08:40] rvba, tyvm [08:40] fwereade: and additionally i'm not happy with a behavior we already had with the CL before. [08:41] TheMue, ok, if it's a bad idea let's not do it :) [08:41] TheMue, ah, what's that one? [08:41] fwereade: that is that we can write Status(params.StatusFoo, "something", nil) [08:41] fwereade: but when reading the status data is not nil but an empty map [08:42] fwereade: i would like to change that in a way, that in case of an empty map in the db we return nil for data [08:42] TheMue, what's the distinction you're trying to preserve here? [08:43] fwereade: none ;) it only feels asynchronous, not my way of thinking [08:43] fwereade: but ok, i see a problem [08:44] fwereade: if i write an empty data a nil would be returned, so the same problem :( [08:44] fwereade: so forget my stammering here [08:44] fwereade: :D [08:44] TheMue, haha, np [08:45] fwereade: so i'll now do the doc and then free for the next job [08:45] TheMue, rogpeppe: I am wondering if it is feasible/useful to have TheMue start work on some of the environ Prepare methods? [08:45] TheMue, perfect, ty [08:45] TheMue, lining that up now :) [08:45] fwereade: great [08:46] hey, trunk is doing this for me: http://paste.ubuntu.com/6178801/ [08:46] jam, rogpeppe, dimitern, TheMue: anyone else seeing it? 
[08:47] fwereade: I thought rogpeppe submitted a "skip this test" last night for that [08:47] fwereade, no, but haven't tried trunk yet [08:47] fwereade: once https://codereview.appspot.com/14136043/ has gone in, that might work ok actually [08:47] jam, fwiw it just failed on the bot, but only one of them did [08:48] fwereade: i'm still struggling through fixing test failures, but i don't think that should be an obstacle [08:48] rogpeppe, cool, that looked LGTMed already so I didn't examine closely [08:48] rogpeppe, ok, fantastic [08:48] fwereade: ok [08:49] fwereade: yes, i've seen that test failure [08:49] fwereade, updated https://codereview.appspot.com/14036045 again [08:49] fwereade, jam: that wasn't the one i submitted a Skip for [08:51] fwereade: looks like a problem with the code dialling out inappropriately [08:51] fwereade, sorry, some code left over, will repropose [08:51] fwereade: i was planning to do a global check that we're not doing that, by tweaking the net package so that it panics when resolving or dialling a non-localhost address [08:51] rogpeppe, do I recall that I convinced (or browbeat or ordered or something) you to handle admin-secret and other juju-level things in environs.Prepare? [08:51] rogpeppe, nice [08:52] rogpeppe, in which case TheMue should steer clear of the Prepare func itself and just work with the individual environ implementations [08:52] fwereade: yes [08:52] fwereade: well, yes to the latter anyway [08:53] fwereade: i'm not quite sure what you mean by "handle" in the above context [08:53] rogpeppe, insert values if none are present [08:54] rogpeppe, y'know, "prepare" them ;p [08:54] fwereade: yes, that should definitely happen [08:54] fwereade: the above CL does that for CA cert, and it would be easy to do for admin-secret too [08:54] rogpeppe, fantastic [08:55] rogpeppe, do we have any other global candidates? [08:55] rogpeppe, hey interesting wrinkle [08:56] fwereade: go on...? [08:56] rogpeppe, if state-port/api-port get filled in globally, the local provider won't be able to pick non-conflicting ones itself [08:56] rogpeppe, nbd [08:56] fwereade: hmm [08:57] fwereade: well, there's nothing stopping a given provider forcibly changing a config attribute, in fact, although that may be inadvisable [08:58] 25495131 - alexia [08:58] bugger [08:58] :-) [09:00] brb [09:01] b [09:09] dimitern, LGTM [09:10] fwereade, thanks [09:10] fwereade, fixing a few other minor things I found and landing [09:11] fwereade, after a live test ofc [09:11] fwereade: ah, sync.selectSourceStorage is the culprit [09:12] fwereade: we should set up sync.DefaultToolsLocation [09:12] rogpeppe, gaaah ofc (tyvm for looking into that, somehow it's already 3 levels down my stack) [09:12] fwereade: the "tweak net package" hack is working quite nicely [09:13] rogpeppe, excellent [09:15] fwereade: ah ha! [09:15] fwereade: we *do* set up sync.DefaultToolsLocation, but in this case we're acually invoking a different test binary (via TestRunMain) which doesn't tweak it [09:16] rogpeppe, gaah indeed [09:16] sorry bbs [10:00] rogpeppe: https://codereview.appspot.com/14197043 [10:00] renamed your writer, I hope you don't mind ;) [10:00] axw: it was an arbitrary name :-) [10:01] axw: although i think Wrap on its own is a bit lacking context. 
LineWrapWriter would probably be better [10:01] good point [10:01] axw: (as many writers "wrap" another writer) [10:01] yup [10:01] I shall change it [10:02] axw: also, it's not quite as general as the doc comment implies - it wraps lines at lineLength *bytes* not characters [10:02] axw: so it can split utf chars inappropriately [10:03] yeah, true, I should specify bytes/ascii chars [10:03] axw: that's the reason i think it's not entirely appropriate for utils, as it's ok to use for wrapping base64, but not really in general [10:03] * axw nods [10:03] rogpeppe: I very nearly left it in there... can move back [10:04] axw: making it do proper utf8 handling would be not entirely trivial and need some additional buffering, so i think that's probably best [10:05] rogpeppe: cool. btw, why did you add the additional bufio in there originally? [10:05] axw: because i didn't want to do lots of 1-byte writes [10:06] axw: it could slow things down quite a bit when transferring MBs [10:06] rogpeppe: ok. I would think that's the concern of the user [10:06] yeah [10:06] ok [10:07] axw: true; the user can always pass in a bufio.Writer [10:07] axw: which is probably a good thing anyway, as base64.Encoder does lots of small writes too [10:08] rogpeppe: yeah, I'll create one in the sshstorage.run command. [10:10] fwereade, live tests with the local provider pass nicely [10:11] fwereade, now trying ec2 [10:11] dimitern, try an upgrade from 1.14 as well on general principles :) [10:11] fwereade, oh.. that might be a problem [10:11] fwereade, not sure how to get 1.14 first [10:13] dimitern, it's in ppa:juju/stable [10:19] axw: are you sure that legitimate base64 output can't finish with the letters "EOF" ? [10:20] rogpeppe1: heh, good question. probably could mangle it so it is. /me fixes [10:20] axw: that's why i suggested '@EOF' originally [10:20] ah [10:21] rogpeppe1: and my response about not understand that was thinking it was a special bash syntax [10:21] thanks, I'll use that [10:21] axw: well, the quotes are significant [10:21] axw: if the string is quoted, the shell won't scan for potential variable and `` expansions [10:21] rogpeppe1: yeah I thought the @ was doing something special in the context of a here-doc [10:21] axw: ah no :-) [10:28] rogpeppe1: updated [10:28] axw: i posted a review [10:28] off for dinner now, adieu [10:28] ta, I'll take a look later [10:29] axw: i don't think you've re-proposed [10:29] axw: enjoy [10:32] can the x clipboard really not hold any more than 32K ? [10:35] hmm, must be bug elsewhere [10:43] fwereade: testing shows I was right (see the email I just sent). [10:46] dimitern: rogpeppe1 standup ? [10:46] https://plus.google.com/hangouts/_/cf9a1a494368b354ff8311d84b7e64aa52777ed0 [10:48] rvba, crap, so all the docs are wrong? *does* pyjuju do something special to handle it? [10:49] fwereade: something seems to be wrong with the docs indeed. I didn't test with pyjuju, but I doubt it behaves differently. [10:49] fwereade: it looks like not only might you shutdown all of your own instances, you might shutdown another users if you have perms to do so [10:49] rvba, yeah, I didn't spot any obvious differences in the implementations [10:54] geh. g+ is dropping too many packets for me... [11:02] understanding robot is too hard... 
please internet [11:17] * TheMue => lunch [11:34] fwereade, so summary of the live upgrade test [11:37] fwereade, I managed to upgrade, but I needed to do 3 manual steps: 1) copy the tools as discussed in control-bucket/tools; 2) create a new /var/lib/juju/tools/1.15.-precise-amd64/downloaded-tools.txt with the proper json format (without this the MA upgraded and all tasks run ok, except the lxc-provisioner stopped working); 3) replace the symlink in /var/lib/juju/tools/unit-mysql-0 to point to the 1.15.0.1 tools - now the UA works as well ( [11:37] after 2) and 3) did killall -9 jujud, just in case) [11:38] fwereade, and one final thing worth mentioning - destroy-environment called on 1.14 bootstrapped env, upgraded to 1.15, using the juju command from 1.15 produces this error: ERROR cannot read environment information: environment "amazon" not found (when I call the juju command from 1.14 it works) [11:40] fwereade, so should I land it anyway? [11:47] fwereade: i found the mock charm store BTW. it still doesn't play well with InferRepository though. [12:00] fwereade, jam: https://codereview.appspot.com/14200044 [12:06] rogpeppe1: "broken config" is just a single quote char ? [12:06] Isn't that invalid yaml? [12:06] jam: that's the point, yes [12:07] jam: it's a more reliable way of triggering an error [12:07] rogpeppe1: while that is true, that seems (to me) to be exercising a very different code path [12:07] one is that if we have well-formed but missing data [12:07] and the other is malformed data [12:08] jam: that's not the code path this test is interested in testing, if i read the test correctly. [12:08] jam, I think that's LGTM -- the only thing it's actually intended to test is that the arg parsing doesn't care about order of command vvs --log-file [12:08] jam, I am a little sceptical about the value that test provides anyway [12:08] fwereade: yeah, i seriously considered trashing it [12:09] rogpeppe1: so at least I would change the comment on "breakJuju" to say "with invalid configuration" [12:09] jam, seems like an awfully awkward way to run something a bit like the real binary [12:09] fwereade, have you seen my previous messages? [12:09] fwereade: you mean TestRunMain ? [12:09] it is very ackward, but used in quite a few places [12:09] since you do end up with a test binary [12:09] that you could then invoke [12:09] and I guess means you don't have to "go build" before you "go test" [12:10] I was a bit surprised we didn't do the "go build" route myself. [12:10] jam, I can't think of any good reason not to test the build product [12:10] jam, but the idea was that it should all be done in go test [12:10] fwereade: because when you just run "go test" there isn't a 'juju' build product [12:11] jam, ofc we *could* actually build the binary inside a go test test [12:11] jam, in fact in some places we do IIRC [12:11] jam, but that's kinda horrible as well [12:11] fwereade, jam: i'd be happier if these kinds of test were actually shell scripts or similar, outside the purview of go test. 
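For illustration, a rough sketch of the "build the real binary inside go test" idea floated above; the package path and the --version flag are assumptions here, and this is not how the suite actually does it:

    package main_test

    import (
        "bytes"
        "io/ioutil"
        "os"
        "os/exec"
        "path/filepath"
        "testing"
    )

    // TestBuiltBinary compiles the juju command into a temporary directory
    // and then exercises the produced executable directly, so the test runs
    // against the real build product rather than going through TestRunMain glue.
    func TestBuiltBinary(t *testing.T) {
        dir, err := ioutil.TempDir("", "juju-bin")
        if err != nil {
            t.Fatal(err)
        }
        defer os.RemoveAll(dir)

        bin := filepath.Join(dir, "juju")
        build := exec.Command("go", "build", "-o", bin, "launchpad.net/juju-core/cmd/juju")
        if out, err := build.CombinedOutput(); err != nil {
            t.Fatalf("go build failed: %v\n%s", err, out)
        }

        var stdout, stderr bytes.Buffer
        run := exec.Command(bin, "--version")
        run.Stdout, run.Stderr = &stdout, &stderr
        if err := run.Run(); err != nil {
            t.Fatalf("juju --version failed: %v\n%s", err, stderr.String())
        }
        if stdout.Len() == 0 {
            t.Fatal("expected version output from the built binary")
        }
    }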
[12:12] rogpeppe1, jam, +1 [12:12] fwereade: it seems to me that they don't add much value anyway - the amount of actual logic they're testing in general is just glue code and tiny [12:14] rogpeppe1, sure, but it would also be nice to run at least one test on our actual build output [12:14] fwereade: "badrun" is very badly named [12:14] fwereade: i agree [12:14] rogpeppe1, isn't it, I think it has mutated quite significantly over the years [12:15] fwereade: we should have more tests that work on the final juju binary - in particular live tests that use it to deploy charms, etc [12:15] rogpeppe1, +100 [12:15] rogpeppe1, I think that a few of those would be pretty good replacements for some of the live tests tbh [12:15] fwereade: yeah [12:16] fwereade: we really really need a charm-level regression testing suite [12:18] rogpeppe1, a bunch of standard charms that exerice our functoinality, that we deploy in those tests? yeah [12:18] fwereade: exactly [12:18] fwereade: i started doing it, but came up against the issue of how we determine success or failure (we need to actually talk to the charm) [12:19] fwereade: there are lots of possibilities and i got distracted while trying to decide on one :-) [12:20] rogpeppe1, yeah, that's a beguiling problem, I immediately started looking into space and pondering [12:20] fwereade: :-) [12:23] fwereade, ping [12:25] dimitern, pong [12:25] fwereade, ^^ [12:26] dimitern, ok, so that looks like "really rather broken" to me [12:26] dimitern, "downloaded-tools.txt" is complete nonsense, I fear [12:26] fwereade, it's hard to tell why it's needed [12:27] dimitern, and do we have any idea why the upgrader didn't run on the unit agent? [12:27] dimitern, it just looks like a not-properly-tested typo/thinko for downloaded-url.txt [12:27] fwereade, it ran, but kept restarting with that issue "empty size or checksum" for tools [12:28] dimitern, or maybe not [12:28] fwereade, downloaded-url.txt is just a url in a text file, while the downloaded-tools.txt has the url, size, sha256, and binary version string in a json-serialized format [12:28] dimitern, ohhhh right ok [12:28] jam: your new logMatches changes are just great. i was struggling to see what's what in http://paste.ubuntu.com/6179372/; i merged your branch and got this: http://paste.ubuntu.com/6179376/ [12:29] rogpeppe1: thanks, I think it is quite a bit nicer, if I had known I should have done it that way to start [12:29] fwereade, that brings me back to the question [12:29] dimitern, so we fucked the format to store a load of information that is not only unused but is literally worthless [12:30] * fwereade sighs [12:30] fwereade, should I land my branch despite the upgrade issues, which are related to tools and not to my changes? [12:30] fwereade, yeah, pretty much istm [12:30] jam: one thing i realise though - it talks about "line 8" but there's actually no way of seeing what was on the lines in question. [12:31] dimitern, ok so what you're saying is that once the actual tools are put in place the mongo-related implications of the change are fine [12:31] fwereade, in fact in the 1.15.0.1 tools dir there is indeed a downloaded-url.txt only [12:31] dimitern, ofc there is [12:31] dimitern, what version downloaded it, do you think? :) [12:31] fwereade, yes, the agents restart, and I can see the UA is using API only, and the MA on machine 1 is using only api [12:32] dimitern, ok then, please land that [12:32] fwereade, ah, right! 
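For context on the difference dimitern describes above: downloaded-url.txt holds a bare URL, while downloaded-tools.txt holds a small JSON record with the URL, size, SHA-256 and binary version string. A hypothetical sketch of that shape; the field names are guesses, not the actual juju-core serialization:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // downloadedTools illustrates the kind of record written next to the
    // unpacked agent binaries, as described above.
    type downloadedTools struct {
        Version string `json:"version"` // e.g. "1.15.0-precise-amd64"
        URL     string `json:"url"`
        Size    int64  `json:"size"`
        SHA256  string `json:"sha256"`
    }

    func main() {
        b, _ := json.Marshal(downloadedTools{
            Version: "1.15.0-precise-amd64",
            URL:     "https://example.com/tools/releases/juju-1.15.0-precise-amd64.tgz",
            Size:    4242,
            SHA256:  "abc123",
        })
        fmt.Println(string(b))
    }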
[12:32] fwereade, ok, landing then [12:32] dimitern, rogpeppe1: what would it cost us to just rip out all that where-did-the-tools-come-from crap? if we care we can log it at upgrade time [12:33] fwereade, not sure, have to check, but it seems the lxc-provisioner and upgrader currently log tool-related errors [12:34] fwereade: i guess we could do that. i *thought* it would be useful in the status, but if it's really causing problems... [12:35] dimitern: this error: "/var/lib/juju/tools/1.15.0-precise-amd64/downloaded-tools.txt" looks suspicious because I'm pretty sure it should look in /var/lib/juju/tools/downloaded-tools.txt [12:35] jam, there's no downloaded-tools.txt in /var/lib/juju/tools/ [12:36] jam, but if there is one in the subdirs it finds it [12:36] jam, there had better not be one in there :) === gary_poster|away is now known as gary_poster [12:37] fwereade: well it does say "$bin/downloaded-tools.txt" [12:37] I would have thought it would split the tools all into one file, but I see it is doing ">" not ">>" [12:39] rogpeppe1, jam: have we already released an API that demands we include that information? I think we have [12:39] awwfuck it [12:39] fwereade: needs what infor? [12:40] jam, all the stuff in a state.Tools which is completely irrelevant [12:40] jam, url, hash, size, etc [12:40] jam, I mean, I don't see how it's possibly useful to feed that info into juju from the agents [12:40] jam, the agents *got* that info from juju [12:41] jam, and I fear that SetAgentTools expects all that stuff now === TheRealMue is now known as TheMue [12:59] rogpeppe1, jam, dimitern: do any of you recall seeing a panic in testing.TarGz? [12:59] fwereade: I have not recently seen a panic there [12:59] fwereade: i don't think so [13:00] jam, rogpeppe1: dammit, I'm sure I saw one once, and I think mattyw is having trouble with one at the moment [13:00] fwereade: i'd be interested to see a stack trace [13:01] rogpeppe1, I've got the full output of go test juju-core/.. if that's useful? [13:01] mattyw: paste away [13:01] fwereade: back to the sha stuff, the only thing I can particularly remember is that Ian wanted to make sure the sha sum matched what we read in the simplestreams code. As you mention if we validate that it matches the expected value before we untar it, then we just use the expected value inside Juju and it doesn't help us to write it down again. [13:03] rogpeppe1, I've emailed you if that's ok, it'd be an enormous paste [13:03] mattyw: ok [13:04] mattyw: hmm, what version of Go are you running? [13:05] fwereade: re "It's that old wrong-environ problem again", I thought thought the issue was that an environment might be stale, and you need to SetConfig with config from state to do the right thing [13:05] rogpeppe1, 1.1 [13:06] fwereade: so, you could SetPrechecker with the stale env, then SetConfig later to make it right [13:06] (before using the state object) [13:06] mattyw: could you try with 1.1.2, please? i think that fixed a few GC issues. also, could you pull tip too? [13:07] rogpeppe1, tip of core and go 1.1.2? [13:07] mattyw: yeah [13:08] rogpeppe1, will do [13:19] ain't it great when a test fails because DeepEquals returns false, but the info printed out for each value is identical? [13:23] hey fwereade. when you have a chance, could you take a look at https://bugs.launchpad.net/juju-gui/+bug/1233462 and clarify expected/correct behavior please? The CLI and GUI should be aligned on issues like this, I think. 
[13:23] <_mup_> Bug #1233462: gui does not allow multiple relations between wordpress and ha proxy [13:23] rogpeppe1, thanks for your help, upgrading to go 1.1.2 seemed to fix it, although I get these failures on tip http://paste.ubuntu.com/6179576/ [13:24] mattyw: i suspect those are current test-isolation issues [13:24] rogpeppe1, ok, I'm able to ignore them for my stuff anyway [13:24] rogpeppe1, thanks again for your help [13:24] mattyw: np [13:27] sinzui: you want initiating into the ways of sync-tools? [13:27] mgz I do [13:28] so, confusing part #1: the landing bot and the cloud-wide tools account are the same one [13:28] after that, it's dead easy [13:29] source the bot creds, add a juju-env section something like: [13:29] canonistack-juju-tools: [13:30] type: openstack [13:30] admin-secret: [13:30] control-bucket: juju-dist [13:31] then run `juju sync-tools -e canonistack-juju-tools`, being careful with which version of juju you're actually using, due to simplestreams stuff having changed of late [13:31] if you don't have the bot creds, I will gpg enc them to you [13:32] it's the account named "juju-tools-upload" [13:35] it's useful to `swift list juju-dist` to check what's actually there afterwards [13:35] also it rhymes [13:54] gary_poster, responded [13:57] sinzui: do you need the creds from me? [13:58] fwereade, is this valid for a txn op.. "$not": "<_sre.SRE_Pattern object at 0x19276f8>" ? [13:59] that looks like an accidental serialization of a regex [13:59] object instead of pattern [13:59] hazmat, it does rather, doesn't it [13:59] heh [13:59] fwereade, line 14309 from the dump fwiw [14:00] great, thank you fwereade [14:00] (and yes, fwereade, helpful. :-) ) [14:01] hazmat, hmm, bson.RegEx, eh? that smells a little funny [14:01] gary_poster, that behavior is intended for the gui [14:02] * hazmat vaguely remembers writing that code [14:02] hazmat, you'd argue that the CLI and GUI should have different behavior [14:02] gary_poster, i'd argue 99% of charms are broken with a requires having multiple relations [14:02] gary_poster, see my conversation with dave cheney in #juju-gui last night [14:03] gary_poster, the gui should prevent people from shooting themselves and their services in the foot [14:03] thats the whole point for the relation dimming guides, to give contextual help to a user [14:03] hazmat, IMO talk through it with fwereade. CLI and GUI should be corresponding. Need to be on call now, can return to convo later [14:04] hazmat, I concur, we should respect relation limits [14:05] fwereade, fair enough.. this was done in the absence of enforcement of those limits, and the reality that at the time only one extant charm had any support for multiple requirers, every other basically had undefined behavior for it [14:05] and even then that one charm required explicit service config for the scenario to work [14:06] sinzui: emailed you the creds [14:06] thank you! [14:07] hazmat, yeah, I think the gui was right to restrict -- if there *is* a requires with a limit != 1, though, it should probably allow that one [14:09] fwereade, fair enough, although my intent at the time was that if they really needed that sort of thing, they could use the cli for it. 
[14:09] because frankly nothing supports it [14:10] in terms of charms [14:10] haproxy only does in the context of explicit service config, without that its entirely broken [14:12] gary_poster, commented on the bug [14:13] considering the complexity of fixing this and the zero gain in terms of real world usage, i'd rather kick this down the road for the gui. +1 on cli respecting limits and converging behavior with gui down the road. [14:13] ack thx hazmat === hatch_ is now known as hatch [14:20] * fwereade taking a break, bbs [14:24] jam, ping [14:24] or mgz ? [14:26] h, nevermind [14:27] hmm, anyone got any tips for debugging DNS issues? i'm seeing a 16s turnaround on some DNS requests [14:28] which *really* slows down some of the non-isolated tests .... [14:29] rogpeppe, you can use a local dns server perhaps? [14:30] thank you again mgz, canonistack is sorted. [14:30] dimitern: yeah, i guess i could try that [14:31] hazmat, re the affinity bug. I think it may be related to how msft sets an affinity group. [14:31] hazmat, still a hypothesis. I ran into this blog http://michaelwasham.com/2012/08/07/http-error-message-the-location-or-affinity-group-east-us-specified-for-source-image/ [14:31] * arosales still investigating [14:40] dimitern: hmm, tried that (i used these instructions http://askubuntu.com/questions/264827/how-do-i-activate-a-local-caching-nameserver), but this program (http://paste.ubuntu.com/6179816/) still reliably takes between 15 and 20s to run. [14:41] rogpeppe, are you sure you're hitting your local dns? [14:41] rogpeppe, try host -v store.juju.ubuntu.com [14:42] dimitern: ah, i was using nslookup to test stuff [14:43] rogpeppe, host is way better in many ways [14:44] dimitern: hmm, the local name server returns in no time, but then it times out twice: http://paste.ubuntu.com/6179829/ [14:45] dimitern: (that took about 15s to run) [14:47] rogpeppe, I'm running dnsmasq locally and it's pretty fast, and caching [14:47] dimitern: it seems to be caching ok, but for some reason the lookup is ignoring the cache [14:48] rogpeppe, take a look in your /etc/resolv.conf [14:48] dimitern: it just says "nameserver 127.0.1.1" [14:49] rogpeppe, mine does as well and dnsmasq is listening on :53 [14:51] dimitern: i wonder if it's an issue with the ubuntu DNS record - i tried a few things under ubuntu.com and they all took ages, but google.com was quick [14:51] rogpeppe, they're quick here (ubuntu.com I mean) [14:51] dimitern: actually google.com tried contacting the actual google server too, it seems [14:52] dimitern: so maybe it's an issue with an ubuntu DNS server [14:53] rogpeppe, could be, but it might be on your side as well [14:54] dimitern: what does host -v store.juju.ubuntu.com print for you? [14:54] rogpeppe, http://paste.ubuntu.com/6179868/ [14:56] dimitern: interesting - here's what i see http://paste.ubuntu.com/6179862/ [14:56] dimitern: yours seems to be following the same logic, but mine times out the first time and is slow the second [14:57] dimitern: i did change my router/modem at the weekend, so it's probably related to that somehow, but i can't quite see how [14:57] rogpeppe, this is what I get for ubuntu.com http://paste.ubuntu.com/6179878/ [14:58] rogpeppe, your router might be getting in the way as a preferred dns on the network and then not doing a good job and timing out? [14:59] rogpeppe, or perhaps there's a override somewhere in the settings ? 
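The exact contents of the program rogpeppe pastebinned above aren't reproduced here; as an illustrative stand-in, a minimal Go program that times a single lookup (the hostname is just an example) could look like this:

    package main

    import (
        "fmt"
        "net"
        "os"
        "time"
    )

    // Resolves a hostname once and prints how long the lookup took,
    // useful for spotting the ~15s resolver timeouts discussed above.
    func main() {
        host := "store.juju.ubuntu.com"
        if len(os.Args) > 1 {
            host = os.Args[1]
        }
        t0 := time.Now()
        addrs, err := net.LookupHost(host)
        fmt.Printf("lookup %s: addrs=%v err=%v elapsed=%v\n", host, addrs, err, time.Since(t0))
    }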
[14:59] fwereade: ping [14:59] axw, pong, I'm very sorry I haven't managed to review more of your branches today [14:59] dimitern: even if it was, why isn't my local dns cache working properly? [14:59] fwereade: nps, I just have a question [14:59] fwereade: so, you could SetPrechecker with the stale env, then SetConfig later to make it right [14:59] (before using the state object) [14:59] err [14:59] sorry, missed some context [14:59] dimitern: anyway, i'm spending too much time trying to fix this issue. thanks for the input. [14:59] that was preceded by: [14:59] fwereade: re "It's that old wrong-environ problem again", I thought thought the issue was that an environment might be stale, and you need to SetConfig with config from state to do the right thing [15:00] axw, I think it is, yes [15:00] fwereade: so a PrecheckerSource would be unnecessary, no? [15:01] axw, well, I was thinking more in the context of a task that kept a shared env up to date [15:01] I envision the series of events like this: juju.Open gets Environ with stale config, and a state, and calls SetPrechecker. Later, someone updates the env's config with SetConfig [15:02] axw, but then we need to track that specific environ around and update it at the right time -- and there's no guarantee we can create an environ at the point we're creating state, is there? [15:03] axw, environs lack secrets until the first CLI connection [15:03] axw, I was thinking of a model in which there was a task that waited for a valid environ in state and only then returned it [15:04] fwereade: ok. and I was thinking that you could have an invalid Prechecker (just as you can have an invalid Environ - they're the same after all), and make it valid by updating the environ's config [15:05] axw, how do you propose to create this invalid environ? ;p [15:05] axw, it'd have to be nil, wouldn't it? [15:06] * axw is confused [15:06] * fwereade is too, a little, do you have a moment for a g+? [15:07] fwereade: for a short while, going to bed shortly [15:07] just gotta go get set up, brb [15:09] fwereade: https://plus.google.com/hangouts/_/eef843eef5b79f33bd5a7a2a4d50f20192103ac721?authuser=1&hl=en-GB === hatch_ is now known as hatch [15:32] does anyone else see provider/ec2 tests fail (TestStartInstanceWithEmptyNonceFails fails for me) ? [15:32] on trunk [15:32] it looks like another isolation issue to me, but i can't see a specific bug for it [15:32] mgz: ^ [15:33] checking [15:33] * mgz switches to trunk [15:35] yeah, tht fails for me [15:36] If, in an ec2 environment, juju claims that the environment is bootstrapped but it is not actually (http://pastebin.ubuntu.com/6180035/), what do I do? this is trunk, as of yesterday and today. [15:36] probably the test passed when poorly isolated [15:36] mgz: one mo - i'll disconnect from the network and we'll see [15:37] gary_poster: destroy-environment and start again? [15:37] gary_poster: the more manual option is to poke around in the file store and see what it has [15:37] using some s3 tool or other [15:37] mgz that worked thanks :-) [15:38] mgz: well, it fails with the same error when the netork is disconnected [15:38] mgz: i'm not sure how it got past the bot [15:39] what caused all these isolation-related issues to suddenly raise their head? 
[15:40] changes to simplestreams [15:40] hmm [15:40] * rogpeppe escapes to lunch [16:16] fwereade: preliminary proposal of the branch that's been in the offing for ages and ages: https://codereview.appspot.com/14207046 [16:17] fwereade: all tests pass except the one that fails in trunk too [16:17] fwereade: (i don't know why that test passes on the 'bot) [16:19] rogpeppe, gaah, last time I ran trunk everything was passing :/ [16:20] rogpeppe, are you sure it's not due to your specific router/dns setup? [16:21] dimitern: no :-) [16:21] dimitern: but it's a vanilla setup [16:21] dimitern: the DNS settings are automatically obtained from my ISP [16:22] rogpeppe, I mean is the failing test timeouting on dns resolving? [16:22] dimitern: and... if the caching was working, why wouldn't the query return immediately it has got an answer? [16:22] dimitern: yes [16:22] dimitern: oh, no [16:22] dimitern: i don't think so [16:22] dimitern: mgz could reproduce the issue [16:23] dimitern: do provider/ec2 tests pass for you? [16:23] rogpeppe, they did when I landed my branch last time [16:24] dimitern: they failed for me with the network disconnected too [16:24] dimitern: that might be a good way of reproducing the issue, if it is an isolation problem [16:27] rogpeppe, yeah [16:28] dimitern: if you could bear to be without the internet for a minute or so, i'd appreciate it if you could try that [16:29] rogpeppe, not right now - in a few minutes, I'm testing a bugfix [16:29] dimitern: np [16:29] dimitern: whenever [16:51] rogpeppe, take a look at cmd/juju/destroyenvironment.go:46 [16:51] rogpeppe, why's that? [16:52] dimitern: the "if !assumeYes" line? [16:53] rogpeppe, no, the if before that, with the _, err = [16:53] dimitern: hmm, i don't see that in my branch - gimme 5 minutes while my lbox propose runs and i'll have a look [16:55] dimitern: that line was a bit premature [16:56] rogpeppe, it's spurious [16:56] dimitern: not really [16:56] rogpeppe, why? [16:56] dimitern: if you haven't got info stored for an environment you won't be able to talk to that environment at all [16:57] dimitern: so that line gives a better error message to the user [16:57] dimitern: however... [16:57] dimitern: that line is now gone in https://codereview.appspot.com/14207046/ [16:57] dimitern: (review appreciated BTW!) [16:58] rogpeppe, I can't see it gone [16:58] rogpeppe, and I was going to do that anyway [17:00] dimitern: actually it's gone in trunk already [17:00] hazmat, fwereade, added comment #7 to https://bugs.launchpad.net/juju-gui/+bug/1233462 fwiw. I welcome corrections, but I think what I said/did will be mostly OK to both of you. Thank you again for your feedback on this. [17:00] <_mup_> Bug #1233462: gui does not allow multiple relations between wordpress and ha proxy [17:01] rogpeppe, oh? that's good - it must be very recently [17:02] Already made correction/clarification: "The GUI also ignores the "limit" value, but simply does not currently allow more *requires* relations than 1." [17:02] dimitern: yeah, it merged in the last hour or so [17:06] gary_poster, that looks accurate and sane to me [17:06] thank you very much fwereade, cool [17:17] I keep wanting to "correct" all the spots where I see containerised and other similar words using s where I'd use a z, and then I remember who I'm working for.... [17:33] natefinch, that was done in Lp code, then we got a command from above. Canonical is British company and and it encourages British spelling. [17:35] sinzui: that's understandable. 
The language is called "English" after all, not "American" ;) [17:36] I went to high school in AU, University in US, and work for GB company. My spelling is permanently buggered [17:39] rogpeppe, dimitern: I may have gone mad. I used sync-tools --source --destination repeatedly from r1903 and make the tools. today (possibly after installing the official 1.15.0 package) I cannot get that command too work [17:39] Since I have the tools and metadata it created, and logs, I know it worked for many days [17:40] sinzui: i'm afraid i don't understand all the new "simple"streams stuff [17:41] natefinch: i try to use american spelling in the source code [17:41] I noticed right away that I couldn't use the juju from the package, but I am sure juju from the tree continued to work....but not today [17:41] sinzui: i didn't know that canonical is a british company, although i knew its head office is in london [17:42] rogpeppe, I think it corporate identity rather than legal identity. [17:43] sinzui: so... how does the command fail today? [17:43] rogpeppe: I think I've only seen one or two things that stuck out at me as british... and honestly, when those things pop up, they're things I didn't know we differed on. Not a big deal, for sure. [17:44] rogpeppe, no, I reverted to r1903, which I used on Friday. The commands no longer work. [17:44] sinzui: you've done "go install ./..." presumably? [17:44] yes. --version is correct [17:45] sinzui: what errors do you get? [17:46] sinzui: "no longer work" is a hard place to start from :-) [17:46] yeah [17:48] I am surprised it is listing available tools when the --source --destination options are for building a tree that can be republished anywhere. [17:56] sinzui: i still haven't seen any error messages... [18:00] rogpeppe, This is the command I run. It is from a script I have played more than dozen times over the weekend. sync-tools is run after the tgz files are created in new-tools/tools http://pastebin.ubuntu.com/6180604/ [18:00] sinzui: i'd like to see that with --debug on too [18:01] rogpeppe, http://pastebin.ubuntu.com/6180622/ [18:04] rogpeppe, "boing" is a new env I add. I wondered if the hp and aws envs were "tainted" because the tools were uploaded there. [18:05] sinzui: just to clarify: what do you expect this command to do? [18:06] rogpeppe, copy the tgz files to new-tools/juju-dist/tools/releases generate the json and copy that to new-tools/juju-dist/streams/v1/ [18:07] sinzui: without touching any external provider, right? this should all work locally on your machine. [18:07] that's right [18:07] sinzui: (even if you weren't using the local provider) [18:08] sinzui: in fact, the fact that you have to specify a provider for this command is kinda superfluous, i guess [18:08] that is also right. I use hp and aws in previous calls (to be certain the data was identical) [18:08] This is the content of the hp/aws calls from a few days ago: http://pastebin.ubuntu.com/6180642/ [18:09] sinzui: what does ls -lR /home/curtis/Work/new-tools print? [18:10] rogpeppe, http://pastebin.ubuntu.com/6180652/ [18:14] sinzui: i don't see anything in /home/curtis/Work/new-tools/tools/releases [18:14] sinzui: which is where synctools looks for tools, AFAICS [18:15] rogpeppe, that is right! sync-tools --destination is to make releases/ and streams/v1/ [18:15] I will make those dirs just to eliminate this scenario [18:18] sinzui: AFAICS the tools sync logic copies from $source/tools/releases [18:20] rogpeppe, thank you very much! indeed it doe NOW.... 
[18:21] sinzui: ah, cool [18:21] rogpeppe, sync-tools is a plugin, isn't it, and it obeys the rules of PATH [18:21] sinzui: a lot of this stuff has changed [18:21] ? [18:21] sinzui: possibly [18:21] * rogpeppe hates the frickin' plugin thing [18:22] rogpeppe, I think the 1.14.1 rules were used when I called GOPATH/bin/juju, and now I have 1.15.0 installed. [18:22] * sinzui revises script [18:23] sinzui: it looks like the plugin logic respects $PATH, yes [18:34] * rogpeppe is done for the day [18:34] g'night all [18:35] sinzui: glad you got it working! [18:35] goodnight and thank you again rogpeppe [18:37] sinzui: do you have a moment to chat? [18:40] jcsackett, do [18:40] sinzui: fantastic. i'll call you on g+. [18:41] jcsackett, is this a trap? I entered the room, but you aren't there [18:56] sinzui, abentley: can one of you review https://code.launchpad.net/~jcsackett/charmworld/missing-qa-data-in-review-queue/+merge/188584 [18:56] jcsackett: Sure. [19:03] abentley: thanks! [19:05] jcsackett: Should you consider series as well as charm name when checking for QA? [19:19] abentley: oh, good catch. [19:19] yes, i should. === meetingology` is now known as meetingology === _mup__ is now known as _mup_ === adam_g` is now known as adam_g [19:55] morning folks [19:56] sinzui: what's the status? [19:57] gary_poster: ping [19:58] thumper, I am redeploying ALL streams because they are different when generated with 1.15, and 1.15 is now installed. When I generated the streams on the weekend, I had 1.14.1 installed and I think its sync-tools plugin was used for part of the streams :( [19:58] :-| [19:58] thumper, fwereade: All tools are redeployed now. I will validate then restart the tests! [19:59] thumper, hey [19:59] sinzui: are you testing 1.15 deployments, or upgrades, or both? [19:59] gary_poster: hey, was giving a demo last night, showing off the juju-gui [19:59] gary_poster: one key problem I had was the projector was 1024x768 [19:59] thumper, both [19:59] gary_poster: and lots of stuff was not visible [20:00] sinzui: ack [20:00] thumper, full ack :-( [20:00] known problem [20:00] gary_poster: is there a plan to have a zoom out for the entire interface? [20:00] or perhaps this magic css fu I don't understand properly [20:00] that changes more behaviour based on resolution [20:01] thumper, no, UX intends a bigger design change to address it [20:01] inspector grows to consume all of the right hand side, and...other stuff they haven't worked out :-P [20:01] gary_poster: ok, would be nice if it scaled well for projector demos with crappy resolution [20:02] gary_poster: you could put that on the designers' todo list :) [20:02] thumper, it definitely is. this is an important one [20:02] gary_poster: cool, that's all [20:02] gary_poster: although people were very interested in bundles [20:02] ack thanks thumper [20:02] and being able to save state from the gui [20:03] s/although/also/ [20:03] I guess it reads ok either way [20:04] :-) thumper, cool. release today in progress gives you UX to save state from the gui (was only a keypress before). bundle work getting very close but last bits of hooking things up take extra time, as is often the case. was hoping for this week, but next week looks more likely [20:04] * thumper nods [20:29] thumper, fwereade, azure is definitely walking! I will know in about 30 minutes if we can say it is running [20:29] sinzui, awesome news [20:29] walking or working? [20:29] sinzui: you bootstrapped ec2 ok? [20:29] * thumper is having trouble with it [20:30] walking.
I can status just the state-server within 15 minutes of bootstrap. I could not do that yesterday [20:33] thumper, I have not done ec2 yet. I am doing azure and hp [20:33] sinzui: I have bootstrap succeeding, but no instance coming up [20:33] I will switch to ec2 then [20:34] usually the management console is pretty on the ball with new instances [20:34] but I'm not seeing anything [20:34] us-east-1 (the default) [20:34] * thumper tries ap-southeast-1 [20:35] * thumper tries ap-southeast-2 [20:35] not 1 [20:35] 2013-10-01 20:35:02 ERROR juju supercommand.go:282 cannot start bootstrap instance: cannot set up groups: cannot revoke security group: Source group ID missing. (MissingParameter) [20:35] hmm... [20:35] so much for aws having consistent apis [20:37] aws does appear to be slower doing a deploy [20:45] thumper, azure deploy PASS http://juju-test-release-azure-zw9r097xn7.cloudapp.net/ [20:51] thumper aws deploy PASS http://ec2-23-22-68-40.compute-1.amazonaws.com/ [20:56] sinzui: hmm, are you using --upload-tools? [20:57] No, I am not [20:57] ah crap [20:57] I think I'm looking at the wrong user [20:58] been there, done that this week [20:58] haha [20:58] I was [20:59] * thumper feels like a dumb ass [20:59] thumper, HP deploy PASS http://15.185.254.245/ [21:00] I can start the upgrade tests now. I think the messed up streams were the true cause of the HP upgrade failures. [21:02] * thumper nods [21:02] ok [21:12] does anyone know what the purpose of the settingsref collection is? afaics it's basically redundant info [21:14] hazmat`: sorry, no idea [21:14] no worries, i'll ping fwereade tomorrow [21:15] hazmat`, heyhey [21:16] fwereade, greetings, thought you might be done for the night.. just curious about the settingsref collection for the pymigration stuff.. namely what's the intent behind it [21:17] hazmat`, IIRC it allows us to clean up service config settings when they're no longer used [21:17] afaics it's basically a count of settings users, but it seems to always be a key ref to a named service's settings with a unit count? [21:18] fwereade, how's it any different than the unit count? [21:18] on the service doc [21:18] hazmat`, so, for each given charm a service runs, it has a (potentially) different config [21:19] hazmat`, the service always holds a ref to the service's current charm's version of the settings [21:19] hazmat`, units hold refs to the version suitable for the version of the charm they are currently running [21:19] hazmat`, when the refcount hits 0, we delete them [21:20] hazmat`, if we didn't have a refcount it would be hard to know when they were no longer used [21:20] but it's not mvcc on the service config.. ie no previous copies or versions are held. the keys are all the service ref [21:20] not versioned config refs [21:20] hazmat`, we can't assert things like "no doc exists with $value in $field" in the txn library [21:21] fwereade, so the value should be unit_count+1 [21:21] hazmat`, but we *do* hold older versions of the config, for use by those units that have not updated to a charm that is guaranteed to understand the current one [21:21] hazmat`, steady state, in general, yeah, I think so [21:22] hazmat`, service configs are keyed on service name + charm url IIRC [21:22] fwereade, cool, i think that answers my question.. but re older versions of config..
how can they be held when the keys conflict [21:23] hazmat`, we've always got the charm against which they originally validated, and we use that to fill in defaults etc when presenting old configs to outdated units [21:24] hazmat`, if charm config changes we basically just drop old fields, add new ones as default, and reset to default any we can't make head or tail of (type changes) [21:24] sure re defaults, but afaics you always have one actual service config, and a ref to the charm.. the older service config would have to be local data on the unit. [21:25] hazmat`, the older service config stays there in state until all the units have upgraded [21:26] fwereade, okay.. ah.. ic.. this is only in the context of upgrades, where you have both charm versions to merge against the current setting, doesn't apply to config changes, and multiple versions of applied config in state (ie juju set) [21:26] hazmat`, juju set only actually applies to the current one [21:26] yup.. so not really multiple versions of config [21:27] just synthesized for upgrades from multiple charms' defaults [21:27] hazmat`, a given set of settings may well only half-validate against an older charm so we leave them frozen until they're culled [21:27] hazmat`, so yeah it's just a dumb cache [21:27] fwereade, so the settingsref purpose is primarily garbage collection against extant refs [21:27] hazmat`, yeah [21:27] fwereade, cool, thanks [21:28] hazmat`, anywhere you see a refcount in state the reason for it is basically isomorphic [21:28] fwereade, yeah.. txn guard behavior.. what's generally unclear is the actor / supervision tree for state [21:30] mostly around the gc of docs, possibly b/c in several cases there are multiple answers. ie. service doc owned by last unit agent or cli vs. machine doc owned by machine or provisioner [21:30] fwereade, thanks for the insights [21:31] hazmat`, the lifecycle docs in doc/ should be helpful there [21:32] hazmat`, lifecycles.txt:156 onwards [21:33] hazmat`, it has probably rotted in one or two places but it should still be largely accurate [21:34] hazmat`, in particular I did not think to update it when we fixed the "unit destruction depends on unit agents" bug [21:35] fwereade: can I get you to run the ec2 tests for trunk? [21:35] fwereade: it could be that saucy is causing this test to fail [21:35] hazmat`, in fact death-and-destruction.txt might be even closer [21:35] fwereade, cool, i'll check them out [21:36] thumper, is there a particular one? I'm a bit up to my elbows in provider again here [21:37] localLiveSuite.TestStartInstanceWithEmptyNonceFails [21:37] same failure for openstack [21:41] thumper, both passing for me [21:41] * thumper sighs [21:41] thumper, paste me the failures? [21:41] fwereade: http://pastebin.ubuntu.com/6181433/ [21:42] fwereade: wrote to tools, but tried to read from a different location? [21:43] doesn't seem like a distro-specific issue there [21:44] ha, it isn't a tools problem but an image problem [21:46] thumper, still, it's surprising they're the only ones failing [21:46] yeah [21:47] thumper, although, hmm, it is uploading version.Current there [21:47] yeah, but that's the tools [21:47] not the image [21:47] it is complaining about the image search [21:47] thumper, I would not be too surprised to see we didn't include test image metadata for saucy though [21:48] yeah, do you know where the test metadata is?
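(Editor's note, picking up the settingsrefs thread above: a minimal sketch of the refcounting fwereade describes, using a plain in-memory map instead of mongo documents and txn asserts; the key format and names here are assumptions, not juju-core's actual schema. The service holds one ref to its current charm's settings, each unit holds a ref to the settings for the charm it is running, and a settings document is garbage-collected when its count reaches zero.)

    package main

    import "fmt"

    // settingsRefs maps a settings key ("service#charm-url" here, purely for
    // illustration) to the number of things still referring to that document.
    type settingsRefs map[string]int

    func (r settingsRefs) incRef(key string) { r[key]++ }

    // decRef drops a reference and removes the entry once nothing uses it,
    // which is the garbage-collection role of the refcount described above.
    func (r settingsRefs) decRef(key string) (removed bool) {
        r[key]--
        if r[key] <= 0 {
            delete(r, key)
            return true
        }
        return false
    }

    func main() {
        refs := settingsRefs{}
        oldKey := "wordpress#cs:precise/wordpress-1"
        curKey := "wordpress#cs:precise/wordpress-2"

        // The service has upgraded to revision 2; two units still run revision 1.
        refs.incRef(curKey) // the service's ref to its current charm's settings
        refs.incRef(oldKey) // unit/0
        refs.incRef(oldKey) // unit/1

        // As each unit upgrades, its ref moves from the old doc to the new one.
        for unit := 0; unit < 2; unit++ {
            refs.incRef(curKey)
            if refs.decRef(oldKey) {
                fmt.Println("old settings doc no longer referenced; deleted")
            }
        }
        fmt.Println(refs) // map[wordpress#cs:precise/wordpress-2:3]
    }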
[21:48] thumper, and specifying version.Current will force saucy where perhaps nothing else does [21:48] * thumper nods [21:48] fwereade: shall I try with "quantal", which is what other tests do? [21:49] thumper, that sounds sensible, yeah [21:50] thumper, heh, ec2 has a load in export_test [21:50] thumper, no mention of sauct [21:50] or even saucy [21:51] * thumper nods [21:51] I've modified the test to specify "quantal" [21:51] let's see if it passes [21:54] that looks like it fixed it [21:54] * thumper proposes as a drive-by in the other branch [21:55] gah, fixed ec2 but openstack broke [21:55] * thumper looks at image differences [21:58] oh ffs [21:58] too many non-constrained test variables [21:59] arch should be set too [22:13] HP upgrade FAIL. Like before, the state server completed, but the units did not.. I have captured the log [22:30] wow, lbox confused that merge [22:33] fwereade: if you are still around https://codereview.appspot.com/14243043/ [22:37] thumper, should we not be explicitly testing --system rather than just doing that shift-out-of-the-way lark in checkargs? [22:38] fwereade: probably [22:39] fwereade: do you know where the $(JUJU_HOME)/environments dir and associated .jenv files are created? [22:39] fwereade: if you bootstrap the local provider with sudo as you need to, you can't do anything else because it is created 600 by root [22:40] which includes status [22:40] I assigned the bug to rog, but he hasn't done anything with it [22:40] so local provider is still broken [22:41] thumper, it's all in environs/configstore, not sure offhand exactly where it's called from [22:41] environs.Prepare, probably [22:42] hmm... [22:42] * thumper will look after the gym [22:42] enjoy [22:42] sinzui: please add to the todo list of live tests: local provider [22:47] I will [22:49] sinzui: things fixed on trunk should target 1.15.1? [22:49] * thumper -> gym === thumper is now known as thumper-afk [23:08] azure upgrade PASS [23:12] Hi all -- is this a known issue on a precise bootstrap node with the maas provider? # start juju-db [23:12] juju-db start/running, process 4473 [23:12] root@qnc48:/var/log/upstart# cat juju-db.log [23:12] error command line: unknown option sslOnNormalPorts [23:12] use --help for help [23:14] seems like the wrong version of mono or something [23:14] *mongo [23:33] aws upgrade PASS
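(Editor's note, an aside on the saucy/quantal test failure chased above: a self-contained sketch, not the actual provider test code, of why pinning the series and arch matters when the test image metadata fixtures only cover older series. The fixture contents and names below are assumptions for illustration only.)

    package main

    import "fmt"

    // hostSeries stands in for version.Current.Series on a saucy development box.
    const hostSeries = "saucy"

    // testImageMetadata stands in for the fixture data loaded for the tests:
    // it only knows about older series, so anything that implicitly asks for
    // the host series will fail the image lookup.
    var testImageMetadata = map[string]bool{
        "precise/amd64": true,
        "quantal/amd64": true,
    }

    func findImage(series, arch string) error {
        if !testImageMetadata[series+"/"+arch] {
            return fmt.Errorf("no matching image metadata for %s/%s", series, arch)
        }
        return nil
    }

    func main() {
        // Implicitly using the host series fails on a saucy machine...
        fmt.Println(findImage(hostSeries, "amd64"))
        // ...while pinning both series and arch keeps the test host-independent.
        fmt.Println(findImage("quantal", "amd64"))
    }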