[00:22] Is twisted.internet.error.CannotListenError: Couldn't listen on any:2121: [Errno 98] Address already in use.
[00:22] Is ^ on ec2 normal?
[00:34] StevenK: It started appearing a week or two ago.
[00:34] But yes.
[00:38] Hmm.
[00:39] StevenK, see my mail a week ago
[00:39] We should probably present a subtle beta notification to everyone when there's something optional on a page, and allow them to opt-in then and there.
[00:39] wgrant, i agree with the amendment that not every beta is going to be available to every person
[00:39] necessarily
[00:40] poolie: Right, a feature flag to turn on the option for certain teams, I guess.
[00:40] perhaps plenty of them are though
[00:41] IIRC Gmail did something like this years ago.
[00:41] Not sure if they still do.
[00:41] Labs or something, but seems Google discontinued all of that.
[00:41] yes, google love it
[00:41] they do things i really envy
[00:41] such as correlating this with their logs
[00:41] to see if it makes things slower, crashier, more popular
[00:42] I hope we can get to a point eventually where that would be useful.
[00:42] Right now it's not.
[00:43] stevenk https://bugs.launchpad.net/launchpad/+bug/894205
[00:43] <_mup_> Bug #894205: spurious test_poppy failures <https://launchpad.net/bugs/894205>
[00:43] We need an ec2test -D
[00:43] for what?
[00:43] So we can break into the instance once it hits that failure.
[00:44] And see what's going on.
[00:44] oh postmortem
[00:44] can't be that hard
[00:45] wouldn't ec2 test --attached -t '-D' do the job?
[00:46] except perhaps you don't want to do that every time just in case it fails
[00:46] Or perhaps if we just immediately emailed on failure.
[00:46] Or indeed just ran a few instances and watched ec2 list
[00:46] yeah
[00:48] i thought ec2 list used to give a progress percentage
[00:48] No
[00:48] but apparently not any more?
[00:48] well
[00:48] It never did.
[00:48] Just a time.
[00:48] maybe i'm thinking of bzr pqm
[00:48] i guess once you're experienced it's enough to know it takes about 4h
[00:48] Yeah
[00:50] the other thing i would like to do here is make the flags admin ui a bit better
[00:50] oh and developer control on *staging
[00:50] so much to do
[00:52] wallyworld__: What if we said something like "Albert Einstein | [Proprietary: everything (!)] [Security: 6 bugs, 4 branches (!)] [(+)]" (where (!) is the edit icons, (+) is the add icon)
[00:53] wallyworld__: Gets rid of the obscure "observer" and "restricted observer" terms.
[00:53] Removes the third column.
[00:53] And shows more directly what's what.
[00:53] wgrant: sounds reasonable at first glance
[00:53] wgrant, so
[00:54] self.factory.makeBugComment()
[00:54] fails if i don't previously log in as a user
[00:54] is this a bug? it seems inconsistent with other factory things
[00:54] poolie: A bug, but an unsurprising one.
[00:54] this is from https://code.launchpad.net/~mbp/launchpad/888353-microformats/+merge/82767 circa line 59
=== micahg_ is now known as micahg
[01:15] StevenK: wgrant: have you used ec2 test recently? does it work for you?
[01:15] ec2 test and ec2 land are mostly the same
[01:15] yes, that's what i thought. but first ec2 test complained there was no submit branch
[01:15] so i added one to branch.conf
[01:15] then it gets to starting ec2 and complains there's a missing revision
[01:15] What are you trying to do?
[01:15] Oh, you do everything in one working directory, don't you?
[01:15] just try ec2 test to confirm it still works for me
[01:15] i have a lightweight checkout
[01:15] i guess it needs submit_branch=... so that it knows where to grab the code from inside the ec2 instance
[01:15] wallyworld__: I used ec2 test 12 hours ago.
[01:15] Worked fine.
[01:15] hmmm. don't know what's broken for me then :-(
[01:15] wallyworld__: rocketfuel-setup adds a submit_branch
[01:15] To ~/.bazaar/locations.conf
[01:16] wgrant: mine has a submit_branch:policy = appendpath
[01:16] Er
[01:16] really?
[01:17] public_branch:policy and push_location:policy should be appendpath
[01:17] submit_branch should not.
[01:17] yeah, it's been that way forever. maybe it's mis-copied from somewhere
[01:17] http://paste.ubuntu.com/747751/ is what I use.
[01:18] i'll try those settings. but i'm sure ec2 test used to work
[01:19] ec2 land will work, since it gets it from the MP
[01:19] ec2 test uses submit_branch.
[01:20] right, makes sense
[01:21] wgrant: so now the code in ec2 says public branch out of date
[01:22] wallyworld__: The public branch is probably out of date :)
[01:22] Or wrongly configured.
[01:22] yet, i'm running ec2 test from a branch i've just bzr pulled to
[01:22] It has the same name?
[01:23] as what?
[01:23] Your local branch and LP branch need to have the same name.
[01:23] Or you need to configure public_location explicitly.
[01:24] ah, that may be it
[01:24] i have been using ec2 test a lot recently
[01:25] my public_branch is bzr+ssh://bazaar.launchpad.net/~wallyworld/launchpad/devel
[01:25] my submit branch is bzr+ssh://bazaar.launchpad.net/~launchpad-pqm/launchpad/devel
[01:26] Ah
[01:26] There's the problem.
[01:26] public_branch needs to be the public URL of the current branch.
[01:26] That's why it uses appendpath
[01:27] so my observer-db-2 branch maps to bzr+ssh://bazaar.launchpad.net/~wgrant/launchpad/observer-db-2
[01:27] wgrant: so, how to fix? edit ~/.bazaar/locations.conf?
[01:27] wallyworld__: Right.
[01:28] if this is misconfigured i don't understand how ec2 would be working for you at all
[01:28] or are you always giving the full url?
[01:28] Indeed, even ec2 land should fail here.
[01:28] If public_branch is wrong.
[01:29] it's all worked so far for whatever reason
[01:29] i have a single working tree and use bzr switch
[01:29] if the working tree has the current branch i want to land, i just type ec2 land
[01:30] if the working tree is switched to a different branch, i used ec2 land fullurl
[01:32] wgrant: so my locations.conf is the same as yours, yet i get the public and submit branch mismatch
[01:33] wallyworld__: Right, I use multiple working trees.
[01:33] wallyworld__: Where are your branches?
[01:33] Directly in lp-branches?
[01:33] yes
[01:33] Hmm
[01:33] and the branch directories are empty
[01:33] That should still work.
[01:33] except for .bzr
[01:34] wgrant@lucid-test-lp:~/launchpad/lp-branches$ bzr co --lightweight rip-out-accountpassword rip-out-accountpassword-co
[01:35] wgrant@lucid-test-lp:~/launchpad/lp-branches$ bzr info rip-out-accountpassword-co | grep public public branch: bzr+ssh://bazaar.launchpad.net/~wgrant/launchpad/rip-out-accountpassword
[01:35] WFM :/
[01:35] You don't have it configured in branch.conf?
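
A rough model of the appendpath policy mentioned at 01:26, as a minimal Python sketch. It assumes a locations.conf section covering a local directory, and it joins the configured base URL with the branch's path relative to that section. The function name resolve_appendpath and the /home/wgrant paths are hypothetical illustrations, not bzrlib code or anyone's actual configuration.

```python
def resolve_appendpath(section_path, configured_value, branch_path):
    """Simplified model of a locations.conf option with policy = appendpath.

    section_path     -- the directory named in the [section] header (assumed)
    configured_value -- the option value set in that section, e.g. a base URL
    branch_path      -- the local branch the option is being looked up for
    """
    section = section_path.rstrip("/")
    branch = branch_path.rstrip("/")
    if branch == section:
        # The branch is the section directory itself: nothing to append.
        return configured_value
    if not branch.startswith(section + "/"):
        raise ValueError("branch is not located under this section")
    # Append the branch's path relative to the section to the base value.
    relative = branch[len(section) + 1:]
    return configured_value.rstrip("/") + "/" + relative


# Hypothetical layout: every branch lives directly under lp-branches.
url = resolve_appendpath(
    "/home/wgrant/launchpad/lp-branches",
    "bzr+ssh://bazaar.launchpad.net/~wgrant/launchpad",
    "/home/wgrant/launchpad/lp-branches/observer-db-2",
)
print(url)
```

Under this model the local branch name becomes the last URL segment, which is why the local and Launchpad branch names have to match unless the option is set explicitly for that branch.
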
[01:35] my branch.conf has parent_location
[01:36] to point back to the parent devel checkout on disk which i pull into from lp and then branch from when i create a new branch
[01:37] [/home/mbp/launchpad/lp-branches/work/.bzr/branches/]
[01:37] submit_branch = bzr+ssh://bazaar.launchpad.net/~launchpad-pqm/launchpad/devel
[01:37] [/home/mbp/ianb/lp-branches/] actually
[01:38] that matches my submit_branch when i type bzr info
[01:38] but ec2 test complains that something is out of date
[01:38] what specifically?
[01:39] bzrlib.errors.PublicBranchOutOfDate: Public branch "bzr+ssh://bazaar.launchpad.net/~wallyworld/launchpad/devel" lacks revision "launchpad@pqm.canonical.com-20111123175547-p02iyjil3n13end4"
[01:39] which it does if i goto ~wallyworld/launchpad/devel
[01:40] but i'm running ec2 test from a branch which is fully up to date via bzr pull
[01:40] are you wanting to run the tests on devel?
[01:41] i just wanted to get ec2 test working, because ec2 demo is based off that and i want to see if ec2 demo works
[01:42] there is a special 'ec2 test --trunk' option that doesn't do a merge
[01:42] sounds like my config is hosed somehow, yet ec2 land, bzr pull, bzr push etc all work
[01:43] wallyworld__, i think your setup has a confusion between a branch called devel owned by you, and the real devel owned by launchpad-pqm
[01:43] it's kind of bad this state is possible
[01:43] the short answer is to either use --trunk or test one of your own real branches
[01:43] i tried one of my own real branches and got similar errors
[01:44] --trunk is fine for ec2 test but i really wanted to see if ec2 demo works with a branch containing some changes
[01:45] since it was stated ec2 demo is broken, and i wanted to see why/how
[01:45] really?
[01:45] wallyworld__: ec2 test should work fine
[01:46] Since ec2 land runs test with details from the MP
[01:46] ec2 demo also works fine for me
[01:46] wallyworld__, give me the name of one of your branches?
[01:46] poolie: someone stated on the thread i started about interactive demos that ec2 demo was broken
[01:47] poolie: delete-all-bugtasks-889202
[01:47] StevenK: ec2 land works fine for me. but ec2 test doesn't (i haven't run it in ages, and tried a bit earlier for the first time in a while)
[01:49] perhaps my workflow of having a single local checkout of devel and branching off that and using a single working tree which i switch between isn't very common
[01:49] Ya think?
[01:49] i think it's common
[01:49] i use colos which are similar enough
[01:49] wallyworld__, can you be more specific about what's going wrong
[01:49] do you get a traceback or something?
[01:50] steven@liquified:~/launchpad/lp-branches% ls -1 | wc -l
[01:50] 130
[01:50] i have all my branch dirs in lp-branches too
[01:50] but use devel-sandbox as my only dir with a working tree
[01:51] that's fine
[01:51] there is no good reason why that would affect ec2
[01:52] lifeless, what actually is bitrotten in ec2 demo?
[01:52] poolie: i tried it again in my devel-sandbox with a previous branch and it worked this time
[01:53] poolie: but it didn't work in my local trunk working tree that i use to branch from
[01:53] so
[01:53] that's probably best fixed by adding this configuration
[01:53] [/home/ianb/launchpad/lp-branches/devel]
[01:53] public_location = lp:launchpad
[01:54] so it knows it's just a mirror
[01:54] ah, ok. that's great. i'll try it. thanks :-)
[01:59] poolie: that worked. the key though is public_branch. thanks.
[02:41] wallyworld__, all good now?
[02:41] i might have some lunch
[03:07] wallyworld__: I work as you do
[03:07] wallyworld__: I think it's pretty common
[03:08] saves a lot of time checking out, and disk space too
[03:08] s/checking out/branching
[03:09] to be pedantic, i think you do mean checking out
[03:09] or tree building
[03:09] branching is not the bit you wait for :-)
[03:09] it is if you branch from lp:launchpad rather than a local ondisk copy
[03:10] No.
[03:10] Well, only if you don't have a shared repo.
[03:10] In which case you are wrong.
[03:10] if you're using a shared repo, you're doomed :)
[03:10] +not
[03:11] Exactly.
[03:11] maybe i misremembered then. i could have sworn branching off lp:launchpad took a fair bit longer than branching from local
[03:12] Well, that's orthogonal.
[03:12] You can branch locally perfectly well without colo.
[03:12] bzr branch devel somethingelse
[03:13] right
[03:14] steven@liquified:~/launchpad/lp-branches% du -sh .
[03:14] 21G .
[03:14] Crumbs
[03:14] StevenK: How!?
[03:14] Do you never delete branches?
[03:14] Not usually
[03:17] i don't delete branches either and mine is 1.5GB
[03:18] and that includes a few non lp branches
[03:18] ls -1 | wc -l -> 254
[03:40] that's crazy.
[03:40] I delete branches after they're deployed.
[04:22] nigelb: because i only use one working tree, the branches are just directories, and i keep them around locally for easy reference
[04:23] Nice.
[04:23] I ran out of space on my hard-disk.
[04:23] Too much code :)
[04:23] /dev/mapper/sys-home 46G 39G 4.7G 90% /home
[04:24] But I have another 390Gb that is unallocated. :-)
[04:26] I need to buy more hard-disk space.
[04:26] Going to work with lots of code soonish.
[04:26] Morning poolie!
[04:26] Err, Afternoon, rather :)
[04:27] hey there
[04:29] * StevenK deletes 10Gb of old branches
[04:32] StevenK: wgrant: attempting to make a multi-pillar bug private - do you agree that it should raise ValueError or do you prefer a custom exception type?
[04:33] A custom type.
[04:34] ockey dokey
[04:35] OMG
[04:35] Python 3.3 will have slightly less stupid Unicode.
[04:36] It won't just support the BMP.
[04:36] You mean like Perl 5.10 YEARS ago?
[04:40] * nigelb sees StevenK's subtle stab.
[04:45] Bah, no jtv.
[05:21] (Pdb) p self.errors
[05:21] [WidgetInputError('owner', u'Maintainer', ConstraintNotSatisfied())]
[05:21] :-(
[05:23] Hmm?
[05:23] Product:+edit-people
[05:23] Sure. But why is that saddening?
[05:24] Because the error that is set is "Constraint not satisfied"
[05:27] It's due to the validator failing, but I'm not sure how to deal with it properly.
[05:50] StevenK: could be trying to assign an open team as a pillar owner
[05:50] I am!
[05:50] That's the point.
[05:51] Just the error message is horrible.
[05:51] that's how it bubbles up from storm i guess
[05:51] Right
[05:51] we should catch that and rethrow
[05:51] wallyworld__: Catch it where?
[05:52] It's already in self.errors in the validate() of the view.
[05:52] hmmm. not sure then off hand
[05:52] in the model code somewhere
[05:52] so that by the time the view sees it, it is a nice error
[05:52] Can you think of an example?
[05:53] doesn't validate() take the form values prior to executing the form action?
[05:53] so a check could be done there that the owner isn't open
[05:53] and the field error set accordingly
[05:54] but the vocab shouldn't allow such a team to be picked anyway
[05:54] wallyworld__: The first thing validate() does is 'if len(self.errors) >= 1: return'
[05:54] so my guess as to how validate works is wrong :-(
[05:55] seems backwards
[05:55] shouldn't validate happen first, then the action
[05:55] It does happen that way
[05:56] wallyworld__: I can explain this easier over mumble, since it seems I've confused you.
[05:56] StevenK: let me look quickly at the code
[05:57] wallyworld__: The view in question is ProductEditPeopleView
[05:59] StevenK: so looks like custom_widget('owner', .....) needs to have the correct vocab passed in
[05:59] so that only closed teams or people can be picked
[05:59] wallyworld__: But I didn't use the picker to generate that error
[06:00] you typed into the field directly?
[06:00] And hit the button, yes
[06:00] ok. so the picker vocab is still wrong :-)
[06:00] Right, so I can fix that at the same time, but my question still stands
[06:00] but i don't understand how validate() has errors already set
[06:01] at the very start of the method
[06:01] when no validation has been done yet
[06:01] LaunchpadFormView has done it? Or storm's?
[06:02] perhaps. it seems apparent something is having a go at validation before the view nominated validate() is called
[06:03] Agreed.
[06:03] based on the form schema
[06:03] and the declared fields
[06:03] but that's the limit of my knowledge
[06:03] It seems ugly to just reach into self.errors and pull out ConstraintNotSatisfied and replace it
[06:03] yup. let me check something
[06:05] StevenK: so the storm validator for closed teams raises a OpenTeamLinkageError
[06:05] not sure how this is munged into a constraint error
[06:05] By Zope, I guess
[06:05] I have no idea how validators work
[06:05] i do seem to recall a doc test checking for OpenTeamLinkageError
[06:06] rather than ConstraintError
[06:07] sadly, i have in the past set breakpoints inside LaunchpadFormView to track the flow when a form submit is done
[06:07] you may need to do the same here
[06:07] bbiab
=== almaisan-away is now known as al-maisan
[06:13] wallyworld__: I've just sent a somewhat lengthy reply to the list about prototyping. Once you've had a chance to read it I'm happy to have a bit more of a chat here if it helps (not saying you should read it now).
[06:32] jtv: Evening.
[06:41] Hi wgrant
[06:42] Sorry I had to email; internet connection where I was was just unusable.
=== al-maisan is now known as almaisan-away
[06:43] jtv: Did you end up testing recipe builds on DF?
[06:43] Yes.
[06:43] Ahem.
[06:43] Not on DF.
[06:43] On staging.
[06:43] Ah, I just tested on DF anyway.
[06:43] Looks good.
[06:43] On staging as well.
[06:47] jtv: there is a lot of commit/DatabaseTransactionPolicy/dosomething/commit which seems like it could be done in a single context manager. You also unexplainedly abort in a few places.
[06:47] It's all a bit of a mess, but I guess that can't really be helped...
[06:48] Without the rewrite that can't happen.
[06:48] Right. Ideally, I'd like to abort everywhere.
[06:48] Because it's all supposed to be read-only transactions.
[06:48] But that might upset tests.
[06:48] Yeah.
[06:48] OTOH aborts may possibly be slower, depending on implementation.
[06:48] If it wasn't for tests I would have required you to also have the context manager assert that it was already a read-only txn.
[06:49] It's a risk that's hard to manage. :/
[06:49] That's also why I have all these minimal read-write regions: can't afford to have anything in there yield.
[06:50] It would be nice if we could add a hook to the reactor to confirm that the transaction is read-only before and after each callback.
[06:50] Yeah. I asked around, but Twisted doesn't seem to have anything like that.
[06:51] Ultimately Robert is right: the twisted-based daemon and the ORM-backed logic shouldn't even be in the same process.
[06:52] Although adding more moving parts is no happier a prospect than twistification probably was.
[06:52] Well.
[06:52] A Twisted bridge between two XML-RPC sides is probably not that bad.
[06:53] And the current architecture *could* work well, I suspect, except that the code architecture is still from the non-Twisted world.
[06:53] We don't have just 2 sides though: there's the slaves, the builder logic, the build manager, and Librarian.
[06:53] Anyway, I'd really like to see you use 'with write_transaction:' or something like that, rather than the commit/DatabaseTransactionPolicy/commit
[06:53] Like with dbuser.
[06:54] Mm.
[06:54] Sort of, I guess.
[06:55] Here's where I regret that it took so long. I forget so many details. What I remember is that write_transaction had both general problems and specific problems.
[06:55] I don't mean the existing write_transaction.
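
The repeated commit/DatabaseTransactionPolicy/dosomething/commit sequence could perhaps be wrapped along the following lines. This is only a sketch under stated assumptions: read_write_region and its three callables (commit, abort, set_read_only) are hypothetical stand-ins for illustration, not Launchpad's DatabaseTransactionPolicy and not the existing write_transaction decorator.

```python
from contextlib import contextmanager


@contextmanager
def read_write_region(commit, abort, set_read_only):
    """Run a short block in a fresh read-write transaction.

    All three arguments are hypothetical callables supplied by the caller:
    commit() and abort() end the current transaction, and
    set_read_only(flag) switches the transaction policy.
    """
    commit()              # end the surrounding read-only transaction first
    set_read_only(False)  # allow writes for the duration of the block
    try:
        yield
    except BaseException:
        abort()           # never leave a half-done write behind
        raise
    else:
        commit()          # the commit that is easy to forget by hand
    finally:
        set_read_only(True)


# Usage with a recording fake, so the order of operations is visible.
log = []
with read_write_region(
    lambda: log.append("commit"),
    lambda: log.append("abort"),
    lambda flag: log.append(("read_only", flag)),
):
    log.append("work")
print(log)
```

The block inside the with statement should stay as small as the hand-written regions it replaces, since the concern about nothing in a read-write region yielding to the reactor still applies.
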
[06:55] One of the general ones was that it doesn't enforce a fresh transaction.
[06:55] Which is a function decorator that also retries.
[06:55] Just something to encapsulate what you repeat.
[06:55] So you can get more of these weird “A and C have been done, but B aborted somehow” effects.
[07:02] wgrant: Ah yes, another problem with read_transaction is that it allows you to write to the database. I didn't want that in this case, because of the high probability that there would be accidental, untested database writes in the read-only sections.
[07:03] jtv: Sure, the existing decorators are not useful here. I should have used a different name.
[07:04] Naming is hard. Any specific suggestions?
[07:06] with a_promise_that_i_am_not_evil_twisted_code
[07:11] with hand_on_heart:
[07:57] Do we provide download statistics for user PPAs anywhere?
[07:58] In the DB
[08:00] jtv: And the API
[08:00] Only there?
[08:01] Yes.
[08:01] I only added an initial API, then got busy and never designed a UI.
[08:03] wgrant: Oh well, I'll tell them it's your fault then. Thanks.
[08:05] wgrant: meanwhile, any further thoughts on my MP? Anything I need to change? Do you think it's landable?
[08:09] jtv: I don't like the duplication. It's too easy to forget a commit. Apart from that it looks pretty sane, but I need to go over it again.
[08:09] The duplication of what?
[08:09] commit/DatabaseTransactionPolicy/commit
[08:30] * stub reads up on celery
[08:56] j
[09:04] stub: how much of the last fdt was slony / fti/trusted / the actual patch ?
[09:04] stub: e.g. how much identifiable fat do we have ?
[09:04] lifeless: I don't think we have done a rollout with the new Slony yet, so I don't know the real, current answer to that.
[09:05] lifeless: I don't think we can tell, because to minimize overhead we collapse everything update.py does into a single big .sql script rather than running EXECUTE SCRIPT several times. We can tell how long it takes to run security.py vs. update.py, but beyond that?
[09:09] * stub wonders if the timestamps in LaunchpadDatabaseRevision have the information, or if transaction_timestamp vs. statement_timestamp dance that has to be done means they lie.
[09:09] oh... staging has been new slony for ages. duh.
[09:12] stub: We did an fdt with the new slony on Monday
[09:13] And probably Friday too
[09:13] lifeless: http://paste.ubuntu.com/747987/ is the relevant section from staging. 51s to apply no patches (but this still resets security and applies trusted.sql)
[09:13] Yeah, 18th and 21st there were fastdowntimes.
[09:14] * stub looks for the production logs
[09:14] lifeless: :( postgres on carob is hungry.
[09:22] http://paste.ubuntu.com/747996/ is relevant stuff from a production outage, 3:12 outage
[09:24] The fat seems to be in waiting for sync and things to propagate. About 10 seconds every time we pause and ask if everything is ok and caught up.
[09:26] 13s to lock all the tables and run trusted.sql and the single db patch. The db patch thinks it took 2.3 seconds.
[09:26] 47s for that update to propagate and be run on the slaves and success to be reported back to the master.
[09:27] We run security.py in series, but that is only a few seconds as we bypass slony entirely for that.
[09:27] rvba: Can you QA bug #867941, please?
[09:27] <_mup_> Bug #867941: person:+activereviews times out <https://launchpad.net/bugs/867941>
[09:27] Your rollback is on qastaging.
[09:28] wgrant: oh, great, I was just waiting for it to be deployed to qastaging.
[09:28] rvba: buildbot finally passed :)
[09:28] Finally! :)
[09:28] stub: The previous fastdowntime might be better to analyse.
[09:29] The biggest fat I can identify is adding new stuff we have created to replication. If we create new tables, we need to call EXECUTE SCRIPT twice.
[09:29] stub: What are our slony intervals nowadays?
[09:29] We'll hopefully do a fastdowntime tomorrow.
[09:29] I might do a double one.
[09:29] The second being a no-op.
[09:30] So we can see how that goes.
[09:30] And we can fit both in 5 minutes, so it's fine.
[09:30] default settings if you mean the slond config timings
=== almaisan-away is now known as al-maisan
[09:31] so 2 seconds for sync-check-interval. We could lower that and see if it speeds up all the handshaking
[09:32] Hmm.
[09:32] What's the other one? 10s?
[09:32] IIRC that's the deaful
[09:32] default
[09:33] wgrant: That is a heartbeat - shouldn't affect handshaking
[09:33] You'd think not.
[09:33] But in my testing locally it did, IIRC.
[09:33] It's been a few months, though.
[09:34] hmm
[09:34] well, easy enough to tweak it and see what happens. We are not going to overload things changing these settings
[09:34] But only one at a time of course.
[09:35] wgrant: Any opinion on changing sync-check-interval or sync-interval-timeout first?
[09:36] Hallo
[09:40] stub: I am no longer sufficiently informed to have such an opinion.
[09:41] wgrant: yeah, it is
[09:45] lifeless: Hi Rob, any way for me to access an oops report… like manually or something? The web application seems to be still jammed ;).
[09:46] I suppose the old oops cleaning up stuff is taking more time than anticipated.
[09:47] wgrant: I'll knock the sync-check-interval (how often a daemon polls to see if a sync should be sent) to half, 1s. Let's see if it affects that 10s time.
[09:53] wgrant: huh. 'If the node is not an origin.... wasteful for this value to be much less than sync_interval_timeout'. So yeah, twiddle them both.
[09:54]