[00:05] * thumper wonders if os.Rename is stable under stress [00:05] it is the only thing that would be causing this to fail imo [00:05] ah fuk [00:06] ?? [00:07] I wasn't checking errors [00:07] * thumper has a screed of them [00:07] bzzzt [00:07] directory not empty... [00:08] that is an error I didn't expect [00:08] is there a .turd in there ? [00:08] or an editor file, or something ? [00:13] c.Check(err, IsNil) [00:13] ... value *os.SyscallError = &os.SyscallError{Syscall:"readdirent", Err:0x2} ("readdirent: no such file or directory") [00:13] huh? [00:13] this is on Unlock [00:15] ah.. I think I know what this is... [00:15] maybe [00:20] \o/ [00:20] stress test passes now [00:20] * thumper ups the stress [00:21] davecheney: what is a reasonable amount of stress in your opinion? [00:21] 3*100 showed the problem, which is now fixed [00:21] obvious joke is obvious [00:22] had to make unlock atomic at fs level too [00:22] and rename returned more errors than just ErrExists [00:22] which caught me out [00:22] it comes down to time [00:22] 1000 iterations with 10 concurrent locks takes about 2.5 seconds [00:23] wow, or 7s without the max procs [00:23] I really don't want 7s of time added to the test :( [00:24] 200 and 10 is 1.5s, which is bearable [00:24] just [00:25] ok, that has taken longer than I wanted... [00:25] but I'm off for lunch [00:26] which is really heading into town to go to the supermarket and buy the new device CD [00:43] davecheney: does juju core have any kind of automated integration testing?
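[editor's sketch] The rename-based locking discussed above (taking the lock with os.Rename, hitting "directory not empty" as well as "exists", and making unlock atomic at the filesystem level) can be illustrated roughly as follows. This is a minimal sketch and not the fslock code under review: the dirLock type, its methods, the "held" marker file and the stress helper are all hypothetical names, and POSIX rename semantics are assumed.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"sync"
	"sync/atomic"
)

var releaseSeq uint64 // makes the names used by unlock unique in this process

// dirLock is a hypothetical directory-based lock, for illustration only.
// The lock is held while parent/name exists.
type dirLock struct{ parent, name string }

func (l *dirLock) path() string { return filepath.Join(l.parent, l.name) }

// tryLock builds a non-empty directory and renames it onto the lock path.
func (l *dirLock) tryLock() (bool, error) {
	tmp, err := os.MkdirTemp(l.parent, "pending-")
	if err != nil {
		return false, err
	}
	if err := os.WriteFile(filepath.Join(tmp, "held"), nil, 0o644); err != nil {
		os.RemoveAll(tmp)
		return false, err
	}
	if err := os.Rename(tmp, l.path()); err != nil {
		os.RemoveAll(tmp)
		// os.IsExist covers both "file exists" and "directory not empty".
		if os.IsExist(err) {
			return false, nil // somebody else holds the lock
		}
		return false, err // anything else is reported, not swallowed
	}
	return true, nil
}

// unlock renames the lock directory away in one step, then cleans it up.
func (l *dirLock) unlock() error {
	released := filepath.Join(l.parent,
		fmt.Sprintf("released-%d-%d", os.Getpid(), atomic.AddUint64(&releaseSeq, 1)))
	if err := os.Rename(l.path(), released); err != nil {
		return err
	}
	return os.RemoveAll(released)
}

// stress runs workers goroutines that each take the lock iterations times.
// It returns the number of acquisitions and of mutual-exclusion violations.
func stress(workers, iterations int) (int64, int64, error) {
	parent, err := os.MkdirTemp("", "dirlock-")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(parent)
	lock := &dirLock{parent: parent, name: "lock"}
	var inside int32
	var acquired, violations int64
	errs := make(chan error, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iterations; i++ {
				for {
					ok, err := lock.tryLock()
					if err != nil {
						errs <- err
						return
					}
					if ok {
						break
					}
					runtime.Gosched() // lock is busy, let others run
				}
				if atomic.AddInt32(&inside, 1) != 1 {
					atomic.AddInt64(&violations, 1)
				}
				atomic.AddInt64(&acquired, 1)
				atomic.AddInt32(&inside, -1)
				if err := lock.unlock(); err != nil {
					errs <- err
					return
				}
			}
		}()
	}
	wg.Wait()
	select {
	case err := <-errs:
		return acquired, violations, err
	default:
		return acquired, violations, nil
	}
}

func main() {
	acquired, violations, err := stress(4, 50)
	fmt.Println("acquired:", acquired, "violations:", violations, "err:", err)
}
```

The marker file matters because POSIX rename may replace an existing empty directory, so only a non-empty lock directory reliably makes a competing rename fail. Unlock renames the directory to a unique name before deleting it, so no other process ever sees a half-removed lock.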
[01:02] bigjools: i think the best answer to that is the charm testing harness that m_3 has built [01:07] 2013/04/18 01:06:45 INFO environs/openstack: started instance "1517935" [01:07] 2013/04/18 01:06:45 NOTICE worker/provisioner: started machine 45 as instance 1517935 [01:07] 2013/04/18 01:06:45 INFO worker/provisioner: found machine "46" pending provisioning [01:07] 2013/04/18 01:06:45 INFO worker/provisioner: found machine "47" pending provisioning [01:07] ^ we need to log when the PA reloads [01:08] ok thanks [01:08] Hi there bigjools [01:08] jtv: you're doing an awesome impression of someone who has the week off :) [01:09] I get the hint [01:09] I'll be off later, through a region with spotty GSM coverage let alone internet. [01:09] I just wanted to pop online for a moment, and then had a long fight with bluetooth tethering in Raring. [01:10] (I checked out of the resort earlier this morning) [01:10] Apart from bluetooth tethering still not working except once just after installation, raring is working out pretty well so far. [01:11] Why does the Ubuntu Software Center now have a big A on it? [01:11] jtv: you can re-enable virtual desktops in settings btw [01:11] Yeah, already did thanks. [01:11] I tried to say it on IRC yesterday, but I think my network connection was in a bit of a limbo state at that point. [01:15] it seemed so! [01:19] Oh, gotta go [01:20] You may not get this message because my IRC ping time just now was 44 seconds. [01:42] https://bugs.launchpad.net/juju-core/+bug/1170176 [01:44] that's nasty [01:54] thumper: i think a nil in instance is being stored in that map [01:54] checking now [01:57] thumper: ubuntu@juju-hpgoctrl2-machine-0:~$ juju bootstrap -v --upload-tools [01:57] 2013/04/18 01:57:18 INFO environs/openstack: opening environment "goscale2" [01:57] 2013/04/18 01:57:22 INFO environs/tools: built 1.9.15.1-precise-amd64 (2193kB) [01:57] why does upload tools append a build number to the tool ? 
[01:58] davecheney: I don't know [01:58] that kind of sucks [01:58] i wanted to use those numbers [01:59] ask fwereade [02:02] thumper: https://bugs.launchpad.net/juju-core/+bug/1170176/comments/1 [02:03] hmm, that would explain it :) [04:49] jam: hi there, I've munged all my fslock branches into one, and addressed all the comments (I think) [04:49] jam: spent most of the day writing tests actually :) [04:50] and fixing the fallout when something failed... [05:59] "17": [05:59] instance-id: "1520273" [05:59] dns-name: 15.185.165.35 [05:59] agent-version: 1.9.15.1 [05:59] agent-state: started [05:59] "18": [05:59] instance-id: "1520275" [05:59] dns-name: 15.185.165.81 [05:59] agent-version: 1.9.15.1 [05:59] agent-state: down [05:59] agent-state-info: (started) [05:59] whut ? [06:45] * davecheney away til 18:00h [06:46] * thumper just realised that the meeting is not in 15 minutes [07:16] jam: heya, let me know when you can pair up === ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: dimitern | Bugs: 2 Critical, 61 High - https://bugs.launchpad.net/juju-core/ [07:25] morning all [07:25] dimitern: hiya [07:25] mornin' everyone else too! [07:25] darn internet is still down [07:26] rogpeppe1: hey, what happened to status? [07:26] dimitern: sorry, it changed [07:26] rogpeppe1: I like it :) [07:26] dimitern: glad to hear it :-) [07:29] rogpeppe1: me too, even i wondered this morning [07:30] rogpeppe1: i know ctx as abbrev for context, so i wondered here too. the context argument for the command execution btw has the name ctx ;) [07:30] TheMue: now not a single occurrence of .(someType) or interface{} in sight? [07:31] TheMue: yeah, i'd forgotten that. maybe i should change ctxt to ctx to fit in [07:32] TheMue: i hadn't even realised that i'd got "ctx" and "ctxt" in the same function scope... [07:33] rogpeppe1: would be nice. 
the introduction of a context (or a similar type) as brace for all status operations would have been the next refactoring step, as well as getting rid of this pseudo generic structure [07:33] rogpeppe1: but i thought due to other needs with respect to the code freeze we could do this later [07:34] TheMue: i figured it was easier to do the refactoring than to get the subord and relation stuff to work nicely [07:35] TheMue: i'm afraid i found processService entirely opaque. and i think there was a problem with it too. [07:36] thumper: i don't think you've pushed your latest changes to https://codereview.appspot.com/8602046/ [07:37] TheMue: oh, i see, you've made the changes in the final branch only [07:37] oops [07:37] thumper: ^ [07:37] rogpeppe1: which changes do you refer to? [07:38] rogpeppe1: ah, you meant thumper [07:38] TheMue: yeah, sorry, tab malfunction :-) [07:41] rogpeppe1: i know that, happens often to me with robbiew, *shit* ^H^H^H^H^Hg *lol* [07:43] * TheMue just got the order to write an article about rust. will be interesting to see the differences to go [07:47] 'sup ? [07:48] davecheney, heyhey, sorry late [07:48] davecheney, it appends a build number so that uploaded tools always have dev versions [07:49] fwereade: right [07:49] but we never made any use of the concept of dev version [07:49] so now it just sticks out there like a sore thumb [07:49] fwereade: the reason I wish to complain is I need to use the build version, as I will discuss in the hangout in T-10 [07:50] davecheney, ok, cool [07:52] fwereade: https://codereview.appspot.com/8648045/ [07:52] looking for a 2nd lgtm [07:52] * fwereade looks [07:53] this turned up today in load testing [07:53] davecheney, I don't think that's correct [07:54] davecheney, why hide state info if the instance is missing?
[07:54] davecheney, (this is not to say that panicing is correct either) [07:55] fwereade: ok, that is fine, but this is just making, https://codereview.appspot.com/8842043 work [07:55] i'm not changing the behavior [07:57] https://docs.google.com/a/canonical.com/document/d/1bSiicbYOV25fq73dZqXz738OU5l96QjkY3mqnXFdLlE/edit [07:59] hi rogpeppe1 [07:59] thumper: hiya [07:59] rogpeppe1: yes, changes all in fslock-mashup [07:59] rogpeppe1: thanks for doing this [08:00] thumper: am just reviewing. a few more comments on the fslock code after some pondering. [08:00] well, it's making it *slightly* less broken, and mea culpa for not spotting that in the review yesterday, but it's not really following intent [08:01] where did the g+ link from the even go? [08:01] event [08:01] can anybody send me the link? [08:02] fwereade, rogpeppe1: ? [08:04] davecheney, thumper: link please? [08:04] dimitern: https://plus.google.com/hangouts/_/calendar/bWFyay5yYW1tLWNocmlzdGVuc2VuQGNhbm9uaWNhbC5jb20.gdt9rkp5uspih9n3db6b95kccc [08:04] rogpeppe1: cheers [08:04] dimitern: i'm probably not going to be able to make it. [08:04] dimitern: will try through my phone connection, but i have my doubts [08:06] 1:05 AM [08:06] Thursday, April 18, 2013 (PDT) [08:06] Time in Portland, OR, USA [09:49] um [09:50] do we have any expectation that the packaged version will work? 
we don't know what the critical differences were in 1.9.14 [09:50] (other than that they shouldn't exist) [09:50] jam, rogpeppe2, dimitern: ^ [09:51] fwereade: we should try the packaged version and see whether we see the same problems [09:51] fwereade: not sure i understand your question [09:51] fwereade: it looked like the problem was client-side, so shouldn't be too hard to diagnose [09:53] * fwereade is trying to remember what the hell he saw going wrong when he tried ap-southeast-2 [09:53] ok, rogpeppe2 and dimitern, you should not be worrying about this [09:53] you have code to land :) [09:53] fwereade: I'll try with the ppa and one random region [09:54] fwereade: (after I land my stuff) [09:54] dimitern, awesome [09:54] TheMue, would you pick a region, let us know what it is, and get bootstrapping from the PPA please? [09:54] * TheMue just prepares a test image with ppa [09:55] TheMue, AFAICT us-east-1 works [09:55] fwereade: hehe, just wrote when you sent [10:00] TheMue, except... hmm, no, I think that maybe even that does not [10:04] Hi guys, I just put up for review a branch which adds constraints support in the MAAS provider: https://codereview.appspot.com/8842045/ [10:04] Please have a look. [10:05] rvba: i'm glad to see you got lbox working! [10:05] rvba, cool [10:05] dimitern: yeah, I must admit the problem was my fault, wrong bzr config. [10:06] rvba: please share your findings with the rest of the red squad, so they can set it up too :) [10:06] dimitern: already done :) [10:07] rvba: great, thanks! [10:07] rvba, I will take a look at it but can I please ask you not to land anything today, while we try to handle the release frenzy [10:07] fwereade: what do you think about reporting agent-state=pending when instance-state==pending or missing? [10:07] fwereade: sure, no problem.
[10:07] rogpeppe2, I'd rather not pretend we can give an instance-state when we can't [10:07] fwereade: oops, sorry, i meant instance-id not instance-state [10:07] rogpeppe2, the list appeared sanguine about the prospect of (temporarily) missing agent-state [10:08] rvba: I'll review it shortly [10:08] ta [10:08] fwereade: the current tests assume no agent-state when instance-id is pending or missing [10:08] rogpeppe2, while an instance id is pending I'm fine with reporting only machine series [10:08] rogpeppe2, missing is a different matter [10:08] fwereade: my changes are pushing towards reporting it always [10:09] fwereade: otherwise i have to special-case [10:09] fwereade: which seems kinda unnecessary. [10:09] rogpeppe2, that's probably simplest -- the downside is that not-yet-provisioned machines are nicely visually distinct today [10:09] rogpeppe2, and this would work against that [10:10] fwereade: yeah, maybe i'll make pending the only special case [10:10] rogpeppe2, when you say special-case... ISTM that it's just one branch in one place [10:10] fwereade: sure [10:10] rogpeppe2, if it has tentacles that's a different matter [10:10] rogpeppe2, +1 on early exit on pending [10:10] fwereade: every if statement doubles the number of reachable states :-) [10:10] rogpeppe2, but I thought dimitern was doing that? [10:11] fwereade: it's actually "if id==pending {agentstate=""}. [10:11] fwereade: it meshed too closely with what i was doing already [10:11] fwereade: we agreed rogpeppe2 to take life and I'll do series [10:11] rogpeppe2, agreed, but we are trying to report on a very large number of states ;p [10:11] dimitern, rogpeppe2: isn't that just asking for conflicts? [10:11] fwereade: i was already mucking with processMachine [10:12] rogpeppe2, you have one to land and one to write and get reviewed already though [10:12] fwereade: one to land? [10:12] rogpeppe2, https://codereview.appspot.com/8821043/ ? 
[10:12] rogpeppe2: cmd logging [10:13] rogpeppe2, that one on top [10:13] fwereade: oh yeah; will do [10:13] dimitern, it's a significant reduction in logspam when the allwatcher's running, been approved for a day or 2 [10:14] rogpeppe2, dimitern: regardless, rogpeppe2 is messing with exactly the method dimitern needs to change [10:14] fwereade: what's that? [10:14] dimitern, processMachine [10:14] fwereade: that is true [10:15] rogpeppe2, I do not think this work is sanely parallelisable [10:15] the changes shouldn't conflict (or not badly) [10:15] fwereade: the tests are the main part of the work [10:15] fwereade: i don't care about conflicts in the code - it's all trivial [10:15] rogpeppe2, so dimitern needs to modify every one of your tests, and also your code [10:16] fwereade: hmm. dimitern shall i pass over my WIP to you? [10:16] rogpeppe2, I appreciated the ninja-rewrite last night very much but I think this is decidedly less convenient tbh [10:17] fwereade: i think that all the changes we agreed to do will clash [10:17] rogpeppe2: I'm proposing mine now [10:17] dimitern: cool [10:17] rogpeppe2, dimitern: cool, I think that is a much cleaner direction to merge [10:18] rogpeppe2, dimitern: objections withdrawn [10:19] rogpeppe2, please focus on the other branches while waiting on dimitern's to land though [10:19] fwereade: will do [10:19] fwereade: only one branch, right? [10:20] rogpeppe2, cheers -- well, 290 to land if not already done, command output to propose with the logging just in SuperCommand [10:20] fwereade: ah, thanks for reminding of that one. i should've made a list! [10:22] I'm having trouble authenticating on rietveld while proposing :( tried 10 times, sign out/in from the web site works, I authorized the app (again), still no joy - probably related to the recent google apps / ubuntu sso change? 
[10:36] dimitern, np, it's pretty easy to review on LP [10:37] dimitern, you have an LGTM, but considering the current circumstances I'm not keen to call it a trivial [10:38] rogpeppe2, would you glance at https://code.launchpad.net/~dimitern/juju-core/034-status-shows-machine-series/+merge/159589 briefly please? [10:38] fwereade: sure [10:42] TheMue, fwiw my current findings are that eu-west-1 works perfectly but I *think* the us-east-1 issues were down to ec2 not us [10:42] fwereade: looking [10:43] fwereade, dimitern: i might remove the omitempty, 'cos if we do have a blank series for some reason, we'll want to know [10:43] rogpeppe2: ok [10:44] rogpeppe2: can you pull my branch and merge it for me please? [10:44] rogpeppe2, you can't create a machine without a series, and you can't set a machine's series once it's created [10:44] fwereade: so it can never be empty? [10:44] rogpeppe2, yeah, I think that is a guarantee that state makes [10:44] rogpeppe2, hence no ,bool or ,error [10:44] fwereade: in which case we don't need the omitempty, right? [10:44] rogpeppe2: i'm still struggling to get lbox/lpad working - auth failing [10:44] rogpeppe2, ha, that is true [10:44] fwereade: i don't care much though [10:45] dimitern: ok, i'll merge it for you [10:45] rogpeppe2, AFAICT omitempty is entirely academic [10:45] rogpeppe2: tyvm [10:45] rogpeppe2, follow your heart [10:46] fwereade: yeah. i might leave it there for consistency [10:46] rogpeppe2, +1 [10:47] TheMue, ok, yes, us-east-1 problem confirmed as an unhappily-timed connection loss to s3 causing apparent lack of tools [10:48] TheMue, where are you looking? [10:48] fwereade: Put needs to retry :-) [10:48] rogpeppe2, it's List actually [10:48] ; [10:48] fwereade: hmm. 
i thought List did [10:49] wtf is this: 2013/04/18 12:48:49 RIETVELD 0xf840000150 client.Get returned (*http.Response)(nil), &url.Error{Op:"Get", URL:"http://example.com/marker", Err:(*errors.errorString)(0xf8400a26d0)} [10:49] fwereade: wanted to look at us-east-1, but will now choose a different one. had troubles with my test image. :( [10:49] TheMue, what's this test image? [10:50] TheMue, you can just install from the ppa, can't you? [10:51] fwereade: have an extra vm for it [10:52] Get http://example.com/marker: redirect blocked ??? [10:52] reported by net/http/client.Get [10:52] dimitern, have you ever visited example.com? [10:53] dimitern, it's a placeholder basically [10:53] fwereade: i *know* what it is, but why is lbox misbehaving? [10:54] fwereade: interestingly, searching for that message in google gave me a #juju-dev log where I complained about the same thing on 2012/11/21 :) [10:54] http://irclogs.ubuntu.com/2012/11/21/%23juju-dev.txt [10:54] dimitern, I was taking the use of example.com to be evidence of lbox hitting the crack pipe pretty hard today, but I don't know *why* [10:55] dimitern, does nuking your various relevant .files and reauthing help? [10:55] fwereade: tried that already [10:55] dimitern, sorry, out of ideas then :( [10:59] fwereade: yeah... drawing knowledge from my earlier self in that conversation - it seems it's a go 1.0.3 issue, which was fixed on tip, and I need to rebuild lbox with go tip [11:00] cmd/juju tests almost take 3 minutes! 
[11:02] FFS this is altogether too eventual for my liking [11:02] and I need food [11:03] rogpeppe2, I'm aware of those tests, they are for now the price we pay for coverage [11:03] rogpeppe2, I will be trying to figure out what the hell is taking so long very soon [11:03] fwereade: me too :-) [11:03] but for now, lunch is an absolute necessity [11:03] maybe the instance will have shown up next hour [11:04] dimitern, rogpeppe2: if you can find someone else to review everything before I return, I would encourage one of you to set the build number and follow up on my juju-dev email [11:05] dimitern, rogpeppe2: but please, whoever does that, make sure it works live ;p [11:06] dimitern, rogpeppe2: if not I'll bbiab [11:06] fwereade: ok; i'm currently doing the manual diff thing on DeepEqual output [11:10] it worked! [11:12] so, for the record: goetveld is broken before a patch from wallyworld_ (https://code.launchpad.net/~wallyworld/goetveld/auth-cookie-fix/+merge/147585) [11:13] now it works with go 1.0.3, the "redirect blocked" issue is gone and I can use it normally [11:34] everything crashes [11:34] jam: dimitern: ping [11:34] * rogpeppe2 thanks thumper for passing on the nm-applet hack [11:35] I cannot bootstrap on any region with 1.9.14 from the ppa: http://paste.ubuntu.com/5718528/ [11:36] fwereade: any idea? [11:38] currently it looks as i can bootstrap but status doesn't return :( [11:39] ah, now, a pending machine 0 [11:39] TheMue: how did you manage? with the ppa version and just "juju bootstrap" on ec2? [11:40] TheMue: no --upload-tools or --series, right? [11:41] rogpeppe2: you managed to get my branch? [11:41] dimitern: my phone went down, and i've run into unexpected difficulties with the Life change. [11:41] dimitern: will submit your branch now [11:42] dimitern: with the ppa version [11:42] dimitern: and a pure juju bootstrap [11:42] TheMue: which region? [11:43] TheMue: ah, you're running precise! 
[11:43] dimitern: i wanted to start from west to east, so us-east-1 now [11:43] dimitern: yes, precise [11:44] so it's not working on quantal [11:46] dimitern: i'm trying to figure out the correct logic for processAgent [11:46] dimitern: i can't convince myself that it's currently right, and the tests aren't great [11:46] dimitern: i've been writing out a truth table [11:46] rogpeppe2: I see [11:47] TheMue: i wonder if you could talk me through the logic in processAgent (it was processStatus) [11:48] rogpeppe2: the idea is to have Life(), AgentAlive(), AgentTools() and Status() for entities that support it - units and machines [11:48] rogpeppe2: and process them similarly [11:48] i can't quite get my head around this condition: status != params.StatusPending && !agentAlive && !entityDead [11:49] dimitern: i realise that [11:49] rogpeppe2: I can help with that [11:49] dimitern: it's just that under *some* conditions the agentAlive status is lost [11:49] rogpeppe2: this is used to determine if the agent is down [11:50] rogpeppe2: it's down if the machine is alive, but the agent is not and the status is not pending (i.e. provisioned and started) [11:51] rogpeppe2: never touched processAgent(), sorry [11:51] TheMue: it's the same logic you wrote in processStatus [11:53] rogpeppe2: one moment, have to open then code [11:53] TheMue: it's ok, i think i'm there [11:53] rogpeppe2: I wrote that actually [11:54] dimitern: ah, ok [11:54] rogpeppe2: see above, does it make sense? [11:54] dimitern: it's all those double-negatives makes me see boggle-eyed [11:55] rogpeppe2: simple logic :) [11:57] dimitern: yeah. i have a better intuitive grasp when it's if !(status == params.StatusPending || agentAlive || entityDead) { [11:57] rogpeppe2: change it, if you think it'll be more readable [11:57] rogpeppe2: as long as it's the same logic [11:58] dimitern: i'm not sure. 
it helped me, but probably only 'cos i'd been staring at it the other way [11:58] dimitern: standard boolean transformation [11:58] rogpeppe2: yeah [11:58] dimitern: i can't remember the name of the rule though [12:00] dimitern: i think there's no point in calling AgentAlive if the status is pending [12:01] rogpeppe2: yeah [12:01] rogpeppe2: correct [12:04] http://paste.ubuntu.com/5718578/ - still cannot bootstrap from the ppa (tried both us-east-1 and eu-west-1) [12:05] dimitern, 2013/04/18 14:00:46 ERROR command failed: cannot find tools: use of closed network connection [12:05] dimitern, I have been seeing that sometimes, but not always [12:05] dimitern, I don't *think* it's us [12:05] dimitern: i now have a problem with us-west-2 when creating the s3 control bucket (conflicting condition) [12:06] dimitern: us-east-1 worked fine [12:06] TheMue, s3 bucket names are global [12:06] dimitern: i've reworked the code a little (the logic should still be the same though) http://paste.ubuntu.com/5718581/ [12:06] fwereade: yeah, the issue before (with uncommented public-bucket) was different (no compatible tools found) [12:06] TheMue, you need a new name [12:07] rogpeppe2: looks good [12:07] fwereade: even if the former is destroyed? [12:07] dimitern: at least my small brain can wrap itself around it now :-) [12:07] rogpeppe2: :) [12:08] TheMue, I don't recall that ever working, no [12:08] TheMue, dimitern: I have successfully bootstrapped and deployed in both us-east-1 and eu-west-1 with the ppa [12:08] fwereade: hmm, so when testing between east and southeast i seem to have switched my buckets. can't remember, but it looks like.
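[editor's sketch] The "standard boolean transformation" between the two spellings of the agent-down condition quoted above is De Morgan's law. The sketch below writes out both forms and compares them over the full truth table, in the spirit of the truth table mentioned earlier. It is only an illustration: statusPending and its string value stand in for params.StatusPending, and the function names are hypothetical rather than taken from the status code.

```go
package main

import "fmt"

// statusPending stands in for params.StatusPending from the discussion;
// the string value is an assumption made for this example.
const statusPending = "pending"

// agentDownOriginal is the condition as first quoted in the log.
func agentDownOriginal(status string, agentAlive, entityDead bool) bool {
	return status != statusPending && !agentAlive && !entityDead
}

// agentDownRewritten is the same condition after applying De Morgan's law:
// !a && !b && !c  ==  !(a || b || c).
func agentDownRewritten(status string, agentAlive, entityDead bool) bool {
	return !(status == statusPending || agentAlive || entityDead)
}

// formsAgree walks every combination of inputs and reports whether the
// two forms give the same answer for all of them.
func formsAgree() bool {
	for _, status := range []string{statusPending, "started"} {
		for _, alive := range []bool{false, true} {
			for _, dead := range []bool{false, true} {
				if agentDownOriginal(status, alive, dead) != agentDownRewritten(status, alive, dead) {
					return false
				}
			}
		}
	}
	return true
}

func main() {
	fmt.Println("status   alive dead  -> down")
	for _, status := range []string{statusPending, "started"} {
		for _, alive := range []bool{false, true} {
			for _, dead := range []bool{false, true} {
				fmt.Printf("%-8s %-5v %-5v -> %v\n", status, alive, dead,
					agentDownOriginal(status, alive, dead))
			}
		}
	}
	fmt.Println("forms agree:", formsAgree())
}
```

Only one of the eight rows reports the agent as down: a non-pending status with a live entity whose agent is not alive. In the rewritten form the pending check comes first, so short-circuit evaluation skips the liveness test for pending entities, which matches the remark that there is no point in calling AgentAlive when the status is pending.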
[12:09] fwereade: you're running precise, maybe that's why [12:09] fwereade: and i successfully to us-east-1 [12:09] TheMue, dimitern: I have "successfully bootstrapped" in ap-southeast-2, as in I have a bootstrap instance running, but it's been running for an hour and I'm still unable to get the instance [12:09] fwereade: do we have an issue or doc to collect the test results [12:10] TheMue, creating an issue now [12:11] fwereade: thx, +1 [12:12] * rogpeppe2 hates the spot-the-difference competition when a DeepEqual fails: http://paste.ubuntu.com/5718597/ [12:14] rogpeppe2: hehe, i know that from my tests. i then replaced those map[i]i by \n and some manual sorting. so it gets easier [12:14] current mostly-manual solution: Edit ,|gofmt Edit ,x/"}/c/",\n}/ Edit ,x/,/a/\n/ Edit ,x/{./v/}/x/{/a/\n/ [12:14] :-) [12:14] TheMue, dimitern: https://bugs.launchpad.net/juju-core/+bug/1170326 [12:14] dimitern, rogpeppe2: btw how is progress? should I be reviewing, supporting, etc? [12:15] fwereade: I'm done - mine has landed [12:15] dimitern, awesome [12:15] fwereade: just working out the right way to fix the tests. at least i'm convinced the logic is good now. 
[12:15] rogpeppe2, ?/3 [12:16] rogpeppe2, excellent [12:16] fwereade: i've submitted the log changes [12:16] rogpeppe2, cool [12:16] fwereade: not the finished ones though [12:16] fwereade: i think status life is 100x more important [12:17] rogpeppe2, well, I did originally ask dimitern to do it because (1) he'd already started and (2) you had 2 other branches to do [12:17] rogpeppe2, and now he is kicking his heels [12:18] fwereade: yeah, sorry, i thought life was trivial to do along with what i was doing anyway [12:18] rogpeppe2, I didn't think you were actually doing anything with status [12:18] fwereade: for some reason i thought i was [12:20] rogpeppe2, we agreed and minuted otherwise [12:20] rogpeppe2, https://docs.google.com/a/canonical.com/document/d/1bSiicbYOV25fq73dZqXz738OU5l96QjkY3mqnXFdLlE/edit# [12:21] rogpeppe2, but hey ho [12:21] fwereade: ah, i think it was because i was already half way through some changes when that was minuted [12:22] so, with tip I'm able to bootstrap with default region and public-bucket (commented out) (no --upload-tools or --series), with default-series: precise [12:22] but not with 1.9.14 from the ppa [12:22] dimitern, what's the error? [12:23] fwereade: the same - use of closed network connection [12:23] dimitern, and it's meaningless to compare tip and 1.9.14, I think [12:23] hey! [12:23] has goamz updated recently? [12:23] so 1.9.14-amd64-quantal1 is confirmed broken [12:23] fwereade: it has i think [12:24] GAAAH [12:24] magic dependency updates FTL [12:24] right, tests pass. [12:25] the difference between agent-state=pending and instance-id=pending is subtle [12:26] hmm, no goamz changes since march it seems [12:27] fwereade: you might be stuck [12:27] fwereade: try removing the goamz directory and go getting again [12:27] rogpeppe2, I went to look on launchpad ;) [12:27] rogpeppe2, but hmm maybe I hadn't actually updated since before then?
[12:27] rogpeppe2, but no [12:28] rogpeppe2, we've done releases that didn't exhibit these issues, right? [12:28] fwereade: you're right [12:28] fwereade: 35 is my latest revno [12:29] fwereade, dimitern: https://codereview.appspot.com/8852043 [12:30] one known issue - the summary in the test is wrong; fixing [12:32] the current state is that when the user clicks "Add" on a charm page the charm details will disappear and the left sidebar will stay visible and the service configuration panel will be displayed at the right [12:32] in full-screen mode we will switch to sidebar mode and go to the same state [12:33] I don't know if Rick is communicating with Jovan or not [12:34] fwereade, dimitern: now proposed with that fixed [12:35] rogpeppe2: reviewed [12:36] fwereade: I updated bug 1170326 [12:36] _mup_: wtf? [12:36] https://bugs.launchpad.net/juju-core/+bug/1170326 [12:37] rogpeppe2, reviewed [12:37] TheMue, how do you "install" in juju? [12:38] fwereade: typo, deployed a service and waited until it is started [12:38] fwereade: currently in us-west-1 [12:39] fwereade: what instance id would we set? [12:39] fwereade: when there's no instance [12:39] rogpeppe2, the one in state? [12:39] rogpeppe2, I don't understand why you'd ever call instance.Id() [12:39] hmm, sadly can't edit, will comment it after current test. [12:40] fwereade: excellent point [12:40] but looks good so far, mysql is pending [12:40] TheMue, are you watching the provisioner logs? [12:40] fwereade: FWIW this is an old problem - the logic there hasn't changed [12:41] rogpeppe2, it was also stuff that I'd figured out with dimitern before you took it over unilaterally [12:41] fwereade: as long as the commands and status tell me it's ok not. shall i look for something special?
[12:41] fwereade: v sorry about that [12:42] * dimitern lunch [12:42] rogpeppe2, no worries, it happens, I'm just a bit confused that it did when I thought I'd been extra clear -- but it takes at least 2 to experience a communication problem ;) [12:53] fwereade: is there ever a case that we can have an alive agent when the entity status is pending? [12:57] rogpeppe2: not really, with the nonced provisioning changes, even if this happens briefly, the agent will commit suicide soon after starting (even before setting AgentAlive I think) [12:58] dimitern: that's what i think. i was asking because of the review comment about that, so thought perhaps fwereade had some more useful input there. [12:58] fwereade, dimitern: PTAL https://codereview.appspot.com/8852043 [13:00] rogpeppe2: changing status is *not* what an agent does first - it sets itself as alive first [13:00] dimitern: interesting [13:00] dimitern: perhaps it should be the other way around [13:01] rogpeppe2: take a look at both uniter/modes and machiner [13:01] rogpeppe2: how so? [13:02] dimitern: it's an early indication of liveness [13:02] dimitern: we save round trips in status [13:02] dimitern: thanks for the review (MAAS provider constraints branch)! I see you guys are busy, just ping me when it's ok for me to land this. [13:03] dimitern: it's a once-and-for-all "i have started running!" - then the liveness status can change over time [13:03] rvba: please don't - we're about to release and it can land after that [13:03] rvba: it's a bit of a mess anyway, let's not complicate it [13:04] dimitern: sure, I will wait until you guys tell me it's good to go. [13:04] rvba: cheers! [13:04] rogpeppe2: i don't think it's an early indication of liveness [13:04] rogpeppe2: setagentalive is that indication, not the status change [13:05] rogpeppe2: it might seem so only from the command's perspective [13:05] dimitern: it's an indication that it got there anyway. 
[13:05] dimitern: if we set the status, then die immediately, then we at least see that it got that far [13:06] rogpeppe2: how is that useful? [13:06] rogpeppe2: if it dies, the status will be incorrect anyway [13:06] rogpeppe2: the agent has to be alive to set the status [13:06] dimitern: if it dies, we'll print "down" for the status [13:07] dimitern: so there's no difference there [13:07] rogpeppe2: "down" doesn't actually mean "oops i crashed while starting" [13:07] dimitern: it means "i crashed" [13:07] rogpeppe2: it means "i was running ok, entity was started, then something went wrong and i died" [13:08] dimitern: for me, it means that the agent started running and then stopped working for some reason [13:09] rogpeppe2: the significant distinction here is, the entity went into a started state before "down" being meaningful [13:09] dimitern: and a good (the only) indication we have that an agent started running is that it set its status [13:09] rogpeppe2: exactly, in addition to being alive as well [13:10] rogpeppe2: if the agents sets status to "started" and then sets itself alive, we'll see "down" a lot more often, and it will be a lie [13:11] rogpeppe2: it's a very brief moment, i agree, but it still be a lie [13:11] dimitern: hmm, good point [13:12] rogpeppe2: that's the point in setting the status after setting the agent to alive [13:12] dimitern: perhaps we should have a "pending but alive" status [13:13] rogpeppe2: how? 
[13:13] rogpeppe2: agent live and status are tightly linked ("agent-state" is what we use for status) [13:13] rogpeppe2, dimitern: if we did, I think it'd probably be "running" [13:13] fwereade: that's a good idea [13:14] rogpeppe2, dimitern: but I'm not sure [13:14] fwereade: we'll be departing from py-juju compatibility a bit if we do this [13:14] fwereade: https://codereview.appspot.com/8658045/ [13:15] dimitern: ^ [13:15] dimitern, not much tbh -- it's an extra step in unit status, and a bit closer to python for the machine -- although "running" will still not be a terminal state for a machine [13:15] rogpeppe2, cheers [13:15] fwereade: still waiting on https://codereview.appspot.com/8852043/ too [13:16] rogpeppe2: why remove Noticef("agent starting") ? [13:17] fwereade: well, in that case we can have "agent-state": "running" when status is pending and the agent is alive [13:17] dimitern: which file? [13:18] fwereade: but then we'll have "agent-state": "started" for most of the time [13:18] rogpeppe2: see the comments inline [13:18] dimitern, yeah, matching the unit agent [13:18] dimitern: ah, the *agent exiting* messages [13:18] dimitern: they're now redundant [13:18] rogpeppe2: how so? [13:18] dimitern, I *think* it is more important to impose consistency here by messing with the less-interesting-to-observe status [13:18] dimitern: as the messages are printed by supercommand [13:19] dimitern: which is the point of the CL [13:19] dimitern: no point in printing the info twice, i think [13:19] rogpeppe2: I thought the point was to report "command completed successfully" for cli commands, not agents [13:19] dimitern: i did that originally, but fwereade suggested the supercommand change [13:20] dimitern: and given that we print it *anyway* for agents, why not?
[13:20] fwereade: yeah, but it's slightly confusing to have "running" (for a short while) and "started" otherwise [13:20] dimitern, yeah [13:21] dimitern, maybe "starting" would be better [13:21] dimitern, anyway I don;t think that's one for today [13:21] rogpeppe2: I don't think these two are related - cli commands report success as a courtesy to the user; agents log stuff which is greppable by admins/etc. [13:21] fwereade: +1 for "starting" [13:21] wait, did we not do life for services/relations? [13:22] dimitern: they're both logging the same thing, no? [13:22] fwereade: and I also agree it's better postponed for after today [13:22] dimitern: given that the exit status is now logged by supercommand, what's the reason for the Noticef in the juju commands? [13:23] rogpeppe2: unless i'm on crack cli commands use stdout/err to report these things, not the logging infrastructure [13:23] dimitern, you're on crack ;p [13:23] dimitern, about one CLI command does [13:23] :) [13:23] +1 :-) [13:23] dimitern, they all *should* but that's not important enough for now [13:24] dimitern: this CL is entirely about log messages [13:24] so, with this change - if I run "juju somecommand" will I see "command completed successfully" on the console after it finished? [13:24] rogpeppe2, the status one's looking good given what you've done [13:24] dimitern: no [13:24] dimitern: only if you use --verbose [13:24] rogpeppe2, but there's no life for services/relations that I can see [13:24] rogpeppe2: so that's what I was thinking [13:24] fwereade: ah [13:24] rogpeppe2: can we add the stdout/err message like this as well please? 
[13:25] dimitern: no [13:25] dimitern: :-) [13:25] dimitern: i don't think a command should print this stuff by default [13:25] dimitern: it's noise [13:25] it's nice and reassuring [13:25] but, fine [13:25] dimitern: so is the next shell promt :-) [13:26] rogpeppe2, LGTM with the extra life fields [13:26] dimitern: if there's an error, that *will* be printed [13:26] rogpeppe2: fair enough [13:27] rogpeppe2: LGTM then [13:27] dimitern: thanks [13:27] dimitern: i'll leave "completed successfully" for another day if that's ok [13:28] rogpeppe2: series is still omitempty - should it be left like this? [13:28] dimitern: yeah, i decided it was fine [13:28] rogpeppe2: you mean the wording or the stdout output? [13:28] dimitern: the wording [13:28] dimitern: i don't want to spend another 10 minute round-trip [13:29] rogpeppe2: a command always finishes, even when it fails [13:29] dimitern: yeah, i'm +1 on the change, but i don't think it's that important right now [13:29] rogpeppe2: ok, if we're absolutely rushing things, fine [13:29] dimitern: meeting at 3 [13:30] rogpeppe2: so once you land these 2 we're done? [13:30] god, meeting [13:30] I'm going to lie down for 20 mins [13:30] see you then [13:30] who is gonna do the release following dave's process? [13:31] fwereade: how do you suggest we show relation life status? [13:31] fwereade: currently relations are [13:31] Relations map[string][]string `json:"relations,omitempty" yaml:"relations,omitempty"` [13:31] fwereade: no struct for a life field [13:31] rogpeppe2: the same way? [13:31] dimitern: how do you mean? [13:31] rogpeppe2: having relationStatus instead of string? [13:32] dimitern: that will break compatibility, no? 
[13:32] rogpeppe2: i think so, yeah [13:33] fwereade: when you're back comment on this one please [13:34] rogpeppe2: well, we can always add it in parenthesis after it :) [13:34] dimitern: bad idea [13:34] rogpeppe2: best compromise I think [13:34] dimitern: that breaks scripts horribly [13:35] dimitern: i'd add another field, RelationLife map[string]state.Life [13:35] dimitern: or something like that [13:35] rogpeppe2: yeah, should work, at the expense of extra output size, but meh.. [13:35] dimitern: we'd only include dead and dying ones [13:36] rogpeppe2: good point! [13:36] rogpeppe2: were we doing that in python? [13:36] rogpeppe2: reporting not alive relations [13:37] dimitern: i'm not sure there was such a concept in the python [13:37] rogpeppe2: really? oh, well [13:41] dimitern: quick: remind me of a good way to get a service that's in a dying state? [13:41] rogpeppe2: st.Service(id) should work [13:41] dimitern: no, i mean, to create a service and then put it into a dying state [13:42] rogpeppe2: ah, Destroy() [13:42] dimitern: doesn't that just remove the service if there's not something keeping it around [13:42] ? [13:42] rogpeppe2: yeah, you'll need to add a subordinate at least [13:42] dimitern: i think i probably need to add a unit [13:43] dimitern: but i'm not sure i can call Destroy if there's a unit [13:44] * rogpeppe2 thinks sometimes that we should have a way of recreating a given desired State rather than going through all the steps necessary to arrive at it [13:44] rogpeppe2: take a look at preventUnitDestroyRemove [13:44] dimitern: thank you [13:44] dimitern: perfect! [13:45] dimitern: oh, no [13:45] dimitern: that's the unit not the service [13:46] rogpeppe2: hmm.. 
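Editor's sketch of the extra-field idea above: keep `Relations` as it is, so existing scripts keep parsing it, and add a separate `RelationLife` map that lists only relations that are dying or dead. The `Life` type here is a stand-in for `state.Life`, and the `relation-life` tag, the `serviceStatus` struct, and the `newServiceStatus` helper are all hypothetical names chosen for illustration, not the code that landed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Life is a stand-in for state.Life (an assumption for this sketch).
type Life string

const (
	Alive Life = "alive"
	Dying Life = "dying"
	Dead  Life = "dead"
)

// serviceStatus keeps the existing Relations field untouched and adds
// a second, optional map. The "relation-life" tag is hypothetical.
type serviceStatus struct {
	Relations    map[string][]string `json:"relations,omitempty"`
	RelationLife map[string]Life     `json:"relation-life,omitempty"`
}

// newServiceStatus records life only for relations that are not alive,
// so the common case produces exactly the same output as before.
func newServiceStatus(relations map[string][]string, life map[string]Life) serviceStatus {
	st := serviceStatus{Relations: relations}
	for name, l := range life {
		if l == Alive {
			continue
		}
		if st.RelationLife == nil {
			st.RelationLife = make(map[string]Life)
		}
		st.RelationLife[name] = l
	}
	return st
}

func main() {
	relations := map[string][]string{"db": {"mysql"}}
	for _, l := range []Life{Alive, Dying} {
		st := newServiceStatus(relations, map[string]Life{"db": l})
		out, err := json.Marshal(st)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(out))
	}
}
```

Because the map stays nil when every relation is alive, `omitempty` drops the field entirely and the output grows only in the unusual case.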
[13:46] dimitern: actually, maybe i can destroy it even if it has units [13:47] rogpeppe2: if it has 1 unit and no relations, destroy() will remove it, otherwise it'll set it to dying [13:48] dimitern: so i need a relation i guess [13:48] rogpeppe2: looks that way - at least according to some of the tests that are faking a relation to test destroy() [13:49] rogpeppe2: TestDestroyStillHasUnits [13:51] dimitern: interesting. i think that might be wrong actually. [13:51] rogpeppe2: which one? [13:51] dimitern: if i destroy a service before it's been provisioned i think it should go away and all its units too [13:51] rogpeppe2: provisioning applies to machines, not services, right? [13:52] dimitern: yeah [13:52] rogpeppe2: I can't deploy a service with no units with the cli [13:52] dimitern: but it looks like i can't destroy a service until the units i created on it have started and stopped [13:53] rogpeppe2: well, adding a unit assumes you want it started [13:53] dimitern: if i do {juju deploy wordpress; juju destroy-service wordpress} i shouldn't have to wait 10 minutes for the machine to come up [13:53] fwereade: I know you're busy right now and this is definitely not urgent but when you have time, could you please see what you have to say about Jeroen's comment here: https://code.launchpad.net/~maas-maintainers/juju-core/maas-provider-skeleton/+merge/157025/comments/347752 [13:53] dimitern: yeah, but people change their minds === wedgwood_away is now known as wedgwood [13:54] rogpeppe2: peaople should rtfm [13:54] dimitern: and there's little more annoying than a service that doesn't do what it could do [13:54] :) [13:54] dimitern: i think we could do a better job here - the manual says "you can't stop what you've started, even though it's quite possible to do so|" [13:55] rogpeppe2: they can change their mind, they just have to wait for the action they issued before destroying [13:55] dimitern: yeah. 
we could do better there [13:55] rogpeppe2: possibly yeah, but that's not the only place, i assure you :) [13:56] dimitern: indeed :-) [13:56] rogpeppe2: it could be worth adding a wishlish bug? [13:56] wishlist [13:57] dimitern: just a bug would do [14:04] kanban meeting guys? [14:06] rogpeppe2, TheMue: kanban? [14:06] ouch, yes [14:25] rvba, responded [14:41] fwereade: ta [15:05] lunch [15:07] * TheMue has to step out in a few moments for dinner in a restaurant, younger daughter has her 17th birthday today [15:29] TheMue: have fun [15:29] live tests passed against trunk for me (except the usual StopInstances failure) [15:31] rogpeppe2: thx, we'll have. and daddy is allowed to pay. :) [15:52] anyone here know the easiest way to script install setuptools ? [15:52] the best i've got currently is: [15:52] wget -o setuptools.egg http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.7.egg#md5=fe1f997bc722265116870bc7919059ea [15:52] sh *.egg [15:52] which seems a bit arbitrary [15:53] this is in a charm, BTW === rogpeppe3 is now known as rogpeppe [16:53] here's my current juju environment which i'm using to try out some stuff: http://paste.ubuntu.com/5719234/ [16:53] i've done upgrade-charm a few times [16:53] dimitern: ^ [16:53] dimitern: seems to be working well! [16:53] rogpeppe: good to hear! :) [16:54] dimitern: this was the script i used to set things up: http://paste.ubuntu.com/5719240/ [16:54] rogpeppe: status looks nicer as well [16:54] dimitern: yeah, it's good to see it working [16:56] dimitern: hmm, actually i'm not entirely sure it is working [16:56] rogpeppe: i can see the hook failed [16:56] dimitern: actually i don't think it did [16:56] dimitern: oh, it did [16:57] dimitern: i ssh'd in to the one machine that it didn't fail on! [17:00] dimitern: two "juju resolved" invocations later and it's all running [17:00] rogpeppe: nice! 
[17:01] dimitern: yeah it feels really good to just play with it a bit [17:01] dimitern: just found a bug in juju get though [17:02] % juju get logging [17:02] error: constraints do not apply to subordinate services [17:02] rogpeppe: oh? [17:02] interrresting error [17:02] rogpeppe: indeed [17:03] dimitern: it only happens when doing juju-get on the subord [17:04] dimitern: ha, found it i think [17:05] dimitern: yup [17:07] dimitern: fixed. [17:07] dimitern: am sorely tempted to push the fix :-) [17:07] rogpeppe: what was it? [17:08] dimitern: in statecmd.ServiceGet, it calls svc.Constraints without checking if the service is principal or not [17:09] dimitern: personally i'd be tempted to make Service.Constraints return a zero constraints if the service is subord [17:09] dimitern: rather than an error [17:10] dimitern: hmm, another bug in juju get [17:10] dimitern: it doesn't appear to print default values [17:11] rogpeppe: hmm.. well, bugs will appear anyway :) good that we have some time now to actually test it and find them [17:11] dimitern: definitely [17:11] dimitern: and none of these are show-stoppers [17:12] rogpeppe: yeah [17:25] hmm, i thought juju get was supposed to work now [17:41] right, that's me done [17:41] g'night all! === deryck is now known as deryck[lunch] === deryck[lunch] is now known as deryck [21:01] hmm... forgot to close irc last night [21:01] oh well, [21:01] morning [21:05] mornin' [21:22] morning mgz, thumper [21:22] ;) === ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: danilos | Bugs: 2 Critical, 61 High - https://bugs.launchpad.net/juju-core/ [21:46] dimitern: why start log messages with lower case? [23:18] thumper, convention [23:18] oh hai fwereade [23:18] thumper, heyhey [23:18] fwereade: good timing, [23:18] fwereade: https://codereview.appspot.com/8849043/ just being updated [23:19] thumper, cool [23:28] fwereade: hmm... 
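Editor's sketch of the `juju get` fix described above, in which `statecmd.ServiceGet` called `svc.Constraints` without checking whether the service is principal. The types below are simplified stand-ins, not the real `state` package; `serviceGet`, `getResult`, and the `constraints` struct are hypothetical names. It shows only the guard: ask for constraints when the service is principal, and leave them unset for a subordinate.

```go
package main

import (
	"errors"
	"fmt"
)

// constraints is a simplified stand-in for the real constraints value.
type constraints struct {
	Mem uint64
}

// service is a stand-in for state.Service with just enough behaviour
// to reproduce the error seen in the log.
type service struct {
	name      string
	principal bool
	cons      constraints
}

func (s *service) IsPrincipal() bool { return s.principal }

// Constraints fails for subordinates, mirroring the reported message.
func (s *service) Constraints() (constraints, error) {
	if !s.principal {
		return constraints{}, errors.New("constraints do not apply to subordinate services")
	}
	return s.cons, nil
}

// getResult is a hypothetical result type; Constraints is nil for
// subordinate services.
type getResult struct {
	Service     string
	Constraints *constraints
}

// serviceGet only asks for constraints when the service is principal.
func serviceGet(s *service) (getResult, error) {
	res := getResult{Service: s.name}
	if s.IsPrincipal() {
		cons, err := s.Constraints()
		if err != nil {
			return getResult{}, err
		}
		res.Constraints = &cons
	}
	return res, nil
}

func main() {
	logging := &service{name: "logging"} // subordinate
	res, err := serviceGet(logging)
	fmt.Println(res.Service, res.Constraints == nil, err)
}
```

The alternative floated in the log, having `Constraints` return a zero value for subordinates instead of an error, would move the same decision into the service type rather than into each caller.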
[23:28] $ juju bootstrap [23:28] error: cannot find tools: use of closed network connection [23:28] $ juju version [23:28] 1.9.14-raring-amd64 [23:28] that is the package version [23:30] fwereade: I don't suppose you could help me test the serialization? [23:31] fwereade: like how to make some fake charms and fake subordinates that just take time :) [23:31] fwereade: try and force contention === wedgwood is now known as wedgwood_away [23:31] thumper, hum, that raring thing is not nice [23:31] thumper, we have been unable to adequately characterise it [23:32] running with -v [23:32] also, if I use the raring mongodb tests fail, with tarball, it works [23:33] 2013/04/19 11:32:36 ERROR command failed: cannot find tools: Get https://s3.amazonaws.com/juju-c54985419ee80c98531550e15fdcc6a8/?prefix=tools%2Fjuju-1.&delimiter=&marker=: remote error: handshake failure [23:33] thumper, AFAIWCT it is coming out of s3 somehow, very much more in some regions than others, and possibly varying by client series [23:33] thumper, handshake failures appear to Just Happen [23:34] thumper, I know this is shit [23:34] thumper, but IME they have never progressed beyond a mild annoyance [23:35] thumper, huh, was the mongo package used in 1.9.14? [23:35] * thumper tries again then [23:35] I think the terminal running juju is using the packaged mongo too [23:35] no, using tar ball [23:36] thumper, that shouldn't be hitting mongo except via mgo [23:36] thumper, the tests will just use whatever's on your path [23:36] * thumper nods [23:36] got connection closed that time [23:36] 2013/04/19 11:36:26 ERROR command failed: cannot find tools: use of closed network connection [23:37] * thumper runs of local version [23:37] thumper, assuming for now that you never see handshake failures, do you ever see anything other than closed connections to s3? [23:37] not backage [23:37] I get it with trunk too [23:37] ! 
[23:38] I can't bootstrap at all [23:38] ok we have a reproducible case with actual source [23:38] hallelujah [23:38] may I ask you to log the hell out of what is happening there? [23:38] why does our logging not give us file and line numbers? [23:38] * thumper switches to trunk, and tries again [23:40] 2013/04/19 11:39:31 INFO environs: reading tools with major version 1 [23:40] 2013/04/19 11:39:37 INFO environs: falling back to public bucket [23:40] 2013/04/19 11:39:37 ERROR command failed: use of closed network connection [23:40] from tip of trunk [23:40] where do I start logging? [23:40] must be in the tools search right? [23:42] thumper, sorry got distracted [23:42] thumper, ok that is interesting [23:42] thumper, I think we might want to delve into goamz/s3 and log in some detail [23:42] environs/tools.go line 26 fails [23:42] thumper, what region are you in? [23:43] thumper, that'd be a Storage.List(), right? [23:43] the default [23:43] fwereade: right [23:43] how do you format a bool for %s ?
[23:43] thumper, %v usually works [23:43] thumper, there might be something I ought to use instead [23:52] thumper, I'm sorry, but I think I have to sleep :( [23:52] fwereade: np, I'll keep digging [23:52] thumper, the absolutely most valuable thing you can do is to mail that, indeed [23:52] thumper, nail that [23:52] heh [23:52] funny typo [23:53] thumper, and possibly to just try deploying to ap-southeast-2 in case you're "lucky" enough to encounter difficulties finding instances from ids [23:53] ok, I'll try that now [23:53] thumper, I will finish your review first though [23:54] mramm, thumper seems to be able to repro one of the elusive issues against a source build [23:55] fwereade: changing region made no difference [23:55] fwereade: although unhelpfully, the region you are using isn't logged anywhere [23:55] mramm, in the light of mgz's comments re time, I think that we should not be releasing right now, but I would dearly appreciate clarity re the precise situation to which he alludes [23:56] thumper, ha [23:56] fwereade: I have been in customer meetings all afternoon, so I'll try to check in on that now [23:56] thumper, I will gladly LGTM just about anything that improves our logging [23:57] hmm, how am I supposed to log from within goamz?? [23:57] thumper, I know we produce a lot but we're cutting down the useless ones and so we have space for ones that might be more useful [23:57] * thumper does printf [23:58] thumper, what else does core/log import? I thought it should be doable directly without cycles [23:58] thumper, (not saying it's nice, just expedient) [23:58] actually, should probably be able to import it
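Editor's note on the bool formatting question above: in Go's `fmt` package, `%v` prints a bool in its default format and `%t` is the verb specific to bools; both give `true` or `false`. Passing a bool to `%s` does not fail to compile, but the output carries a `%!s(bool=...)` marker instead of the value, and `go vet` reports it. A minimal sketch follows; the function names and the "truncated" label are hypothetical and only stand in for whatever flag was being logged.

```go
package main

import "fmt"

// formatV uses the general-purpose verb, which works for any value.
func formatV(truncated bool) string {
	return fmt.Sprintf("truncated=%v", truncated)
}

// formatT uses the bool-specific verb.
func formatT(truncated bool) string {
	return fmt.Sprintf("truncated=%t", truncated)
}

func main() {
	fmt.Println(formatV(true))
	fmt.Println(formatT(false))
}
```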