mwhudsonmbarnett: ok, that's not what i expected00:00
mwhudsonmbarnett: which machines did you try it on?00:00
lifelesssinzui: browser/team.py has an interesting thing00:00
lifelessit looks up the action from the form00:00
mbarnettmwhudson: galapagos, pear, russkaya00:01
lifelessbut it doesn't seem to check that the action is *for* that row00:01
mwhudsonmbarnett: sorry if this is tedious, but can you pastebin the ~importd/.bazaar/bazaar.conf files from pear and russkaya?00:03
spmmwhudson: I'll sort that mbarnett needs to go and en-cake-enate00:05
sinzuiLifeless yeah. I am staring at the template. I think the message is adapted to a widget and it builds ids from the message id...we automatically discard duplicate message ids00:05
maxbjelmer: Did the bzr-svn on launchpad somehow just not ever try "discovering revprop revisions" before the last rollout?00:05
jelmermaxb: it's always done that00:05
mbarnettyes yes, cake levels dangerously low!00:05
jelmermaxb: But we didn't do KDE imports until recently00:05
maxbYes, but the KDE imports which ran before the rollout didn't "discover revprop revisions"00:06
maxboh, wait, I'm getting my dates wrong.00:06
lifelesssinzui: anyhow all tests are passing, except for00:07
lifeless>>> find_tag_by_id(admin_browser.contents, 'batchnav_first')['class']00:07
sinzuiYes that will be an issue for a few more moments. I have a patch00:07
lifelesssinzui: and I don't think the bug with POST there is any better or worse due to batching; if its buggy its always been buggy.00:08
sinzuiI agree00:08
mwhudsonspm: cool, and good morning00:11
sinzuilifeless: https://bugs.edge.launchpad.net/launchpad-foundations/+bug/637654 has a proposed patch. It fixes the most common case of upper and lower. I imagine a page with two sets of BNs will continue to be broken00:11
spmmwhudson: hmm. you may have hit the nail. pear doesn't have that file00:12
lifelesssinzui: is this likely to cause other tests to fail ?00:12
spmmwhudson: https://pastebin.canonical.com/37119/ <== russkaya00:12
lifeless(I mean, will there be other fixups to do with that patch)00:12
lifelesssinzui: could you commit that patch somewhere and push it? I'll pull it in00:13
sinzuiI bet not since there are no tests reporting that we have rubbish ids. I reported it as a separate bug because I think this issue is a separate concern from your branch now00:13
* sinzui does00:13
lifelesssinzui: it is a separate concern, but either I leave the test out or I include your branch00:13
mwhudsonspm: gar00:14
spmmwhudson: I assume C&P from one t'other?00:14
mwhudsonspm: does /home/importd/.bazaar/sign-vcs-import exist on pear?00:14
mwhudsondid pear get reinstalled at some point?00:15
mwhudsonspm: can you look at  /home/importd/.bazaar/sign-vcs-import on russkaya?00:15
spmmwhudson: I'd assume it being a new machine; some things may have been missed :-(00:15
mwhudsonit probably references something ridiculous like ~importd/hoover/keys/key.gpg00:15
spmholy dooly00:16
mwhudsonhttps://wiki.canonical.com/InformationInfrastructure/OSA/LPHowTo/SetUpCodeImportSlave -> "# Make sure ~importd/.bazaar/ and ~importd/botslave look like they do on a working slave. "00:16
spmmwhudson: exec /usr/bin/gpg --no-default-keyring --keyring /home/importd/botslave/gpg/vcs-imports@canonical.com.pub --secret-keyring /home/importd/botslave/gpg/vcs-imports@canonical.com.secret --default-key A60FA0E1 "$@"00:16
mwhudsonspm: i was sure this was working at some point on pear :(00:17
mwhudsonmaybe not00:17
mwhudsonit only affects cscvs imports i guess....00:17
spmI'm surprised it's working on russkaya....00:17
sinzuilifeless: lp:~sinzui/launchpad/batch-ids-0 There are no tests for these links. Nor can I find any tests for the existence of the upper or lower navs. All the tests for the BN are getting the Next link using TestBrowser00:22
sinzuiI checked for uses of the default BN. Subclasses like root search and bug do their own layouts00:23
lifelesssinzui: so 'batchnav_first' is not what to look for00:24
sinzuilifeless try 'upper-batch-nav-batchnav-first' but keep in mind the nav will not be rendered if there is only one batch. We need 6 messages in the testrunner env, or we add ?batch=1 to the url we are testing to set the size to 1 message00:27
lifelessand for the bottom lower-batch-nav-batchnav-first ?00:30
lifeless?batch=1 doesn't do it00:31
lifeless1 is too small00:33
lifelessmoving it lower down00:34
lifelesssinzui: doesn't appear to be rendering the nav links to me00:37
lifelessContinue to hold the message, deferring\n          your decision until later.</li>\n        </ul>\n      </div>\n\n        <table class="listing">\n00:39
sinzuilifeless sorry, my screen was locked for a moment00:39
lifelesssinzui: ^ thats with batch=1 and 2 messages00:39
wgrantSo, we have an issue with the OpenID identifier migration last week, causing incorrect accounts to be linked together... can someone poke around on staging to work out WTF is going on?00:39
lifeless 2\n        \n        <span>\n        messages have</span>\n        been posted to00:39
lifelesswgrant: sure, once I'm finished here.00:39
wgrantlifeless: Thanks.00:40
lifelesssinzui: only one message is shown00:40
lifelessso the batch param worked00:40
lifelessits the navigation bit that isn't00:40
sinzuiI am a bad advisor. you are on the first batch. We should be checking for upper-batch-nav-batchnav-next00:41
lifelessno, you're fine00:42
lifelessits not there00:42
lifelessafter the advice00:42
lifelessDiscard</strong> - Throw the message aw00:42
lifelessyour decision until later.</li>\n        </ul>\n      </div>\n\n        <table class="listing">\n        <thead><tr>\n          <th>Message detail00:42
lifelessis whats in the browser.contents00:42
lifelessthe navigation bit is just awol00:43
lifelessit should be after that </div>00:43
sinzuiWe suppress rendering of the lower if there is no additional batches, so maybe that template fragment is wrong00:44
lifelesswell there is an additional batch - batch=1 and two messages to moderate00:45
lifelessthe count on the page shows '2' so we know there are two there00:45
lifelessand there is only one "approve" in the output, so we know only one got shown00:45
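The reasoning above (two held messages, a batch size of 1, so a second batch must exist and a "next" link should be rendered) can be sketched with a toy batch calculation. This is only an illustration: `batch_links` and its default size of 5 are hypothetical and are not Launchpad's batching code; the default is inferred from the remark that 6 messages would be needed to get a second batch.

```python
from urllib.parse import urlencode


def batch_links(total, start=0, size=5):
    """Return the indexes shown and the query strings for adjacent batches.

    Hypothetical helper: links are only produced when another batch exists,
    mirroring the idea that the navigation is suppressed for a single batch.
    """
    shown = list(range(start, min(start + size, total)))
    links = {}
    if start > 0:
        links["previous"] = urlencode({"start": max(start - size, 0), "batch": size})
    if start + size < total:
        links["next"] = urlencode({"start": start + size, "batch": size})
    return shown, links


# Two held messages viewed with ?batch=1: one message shown, one "next" link.
shown, links = batch_links(total=2, start=0, size=1)

# The same two messages with the assumed default size: a single batch, no links.
_, default_links = batch_links(total=2)
```

Under these assumptions the first call yields one visible message and a "next" query string of `start=1&batch=1`, so an absent navigation block on that page points at the template rather than at the batch size.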
sinzuilifeless, the upper template must be rendered since there is clearly a batch. the view guards the rendering with this: ``if self.context.currentBatch():``00:47
* sinzui is looking at canonical/launchpad/webapp/batching.py00:47
lifelesswhat is the context object going to be - held_messages ?00:48
spmmwhudson: pear now has those dirs/files setup per russkaya00:49
mwhudsonspm: great, thanks00:49
sinzuilifeless: yes held_messages. we are adapting a BN00:50
lifelessso I did this00:50
lifeless+++ lib/canonical/launchpad/webapp/batching.py  2010-09-13 23:50:20 +000000:50
lifeless@@ -40,7 +40,7 @@00:50
lifeless     def render(self):00:50
lifeless         if self.context.currentBatch():00:50
lifeless             return LaunchpadView.render(self)00:50
lifeless-        return u""00:50
lifeless+        return u"not rendered"00:50
lifelessnot rendered was not included in the output00:50
* sinzui checks zcml 00:51
lifelessI wanted to see if that code path was shortcircuiting or something00:51
* sinzui checks other batches with the hacked template00:54
lifelessand this:00:58
lifeless+++ lib/canonical/launchpad/webapp/batching.py  2010-09-13 23:52:07 +000000:58
lifeless@@ -38,6 +38,7 @@00:58
lifeless     css_class = "upper-batch-nav"00:58
lifeless 00:58
lifeless     def render(self):00:58
lifeless+        return u" fooo "00:59
lifeless         if self.context.currentBatch():00:59
lifeless             return LaunchpadView.render(self)00:59
lifeless         return u""00:59
lifelessalso doesn't show up in the output00:59
sinzuiright. I am not seeing my template change when testing https://blueprints.launchpad.dev/firefox?batch=200:59
sinzuiOr I could run the instance that I made the change in instead01:00
lifelessand to cap it off,when I make that raise an Exception I don't get an error01:00
sinzuioh.. I wonder. I see >>> admin_browser.reload() which has a history of being buggy01:01
lifelesshave a look at lib/lp/blueprints/templates/person-specworkload.pt01:01
lifelesssinzui: I'm sure its not that, I made the view crash and the page rendered01:02
sinzuiI see my hack in specs now that I am running the right branch01:02
lifelessI'm thoroughly confused01:06
lifelessis there a sample data team w/list ?01:07
sinzuilifeless me too, this always just works. Can you humour me by adding this line before we call find_tag_by_id01:08
sinzui    >>> admin_browser.open(01:08
sinzui    ...     'http://launchpad.dev/~guadamen/+mailinglist-moderate')01:08
lifelessof course01:08
sinzuilifeless there are no mls in data. I have a make harness note about making them after a request is made in the UI01:09
lifeless    >>> admin_browser.open(01:09
lifeless    ...     'http://launchpad.dev/~guadamen/+mailinglist-moderate?batch=1')01:09
lifeless    >>> find_tag_by_id(admin_browser.contents, 'upper-batch-nav-batchnav-first')['class']01:09
lifeless    first01:09
lifeless    >>> admin_browser.contents01:09
lifelessthats what the story does01:09
lifelesswhats weirded01:10
lifelessI added a string literal and I can't see it01:10
lifelessits almost like those divs are eaten01:11
lifelesswhen I put a literal above it works01:11
lifelessin the list of action descriptions01:11
lifelessbut when I add another div at the place we have the navigation ones it disappears01:12
sinzuilifeless as a desperate act to verify this we could add size=1 to the BN instantiation in the view to be certain that the URL is not being ignored01:12
lifelessI'm certain its not01:12
lifelessbecause only one "approve" action is in the contents01:12
sinzuiAh, yes, that is what I did to be certain something showed up in my env01:12
lifelessI suspect the metal:form stuff01:13
lifelessI'm positive its simply not evaluating things without metal:fill-slot in that container01:14
lifelessI think if we add a div around it it will work, moving the widgets slot up01:14
sinzuiwe are adapting the message in the same manner that we want to adapt the BN01:15
lifelessyes, but the metal interpreter isn't evaluating things without slots01:15
sinzuiWe can certainly move the navs out of the form to be sure it works01:15
lifelessbet you that that is is01:16
lifelessis it01:16
lifelessit was01:16
lifelessI have this now01:16
lifeless<a class="next" rel="next"\n           href="http://launchpad.dev/%7Eguadamen/+mailinglist-moderate?start=1&amp;batch=1"\n           id="upper-batch-nav-batchnav-next">01:16
lifeless-      <table class="listing" metal:fill-slot="widgets">01:16
lifeless+      <div metal:fill-slot="widgets">01:16
lifeless+      <tal:navigation01:16
lifeless+        replace="structure view/held_messages/@@+navigation-links-upper" />01:16
lifeless+      <table class="listing">01:16
lifelessthats the key01:16
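The behaviour behind that fix can be modelled in a few lines: when a template uses a METAL macro, only the elements marked `metal:fill-slot` are carried into the macro's slots, and anything else placed inside the macro use is dropped. The sketch below is a toy model of that rule built on `xml.etree`, not Zope's page template engine; `expand_macro`, `MACRO`, `BEFORE` and `AFTER` are hypothetical names, and the markup is simplified from the diff above.

```python
import xml.etree.ElementTree as ET

METAL = "http://xml.zope.org/namespaces/metal"
DEFINE = "{%s}define-slot" % METAL
FILL = "{%s}fill-slot" % METAL

# Hypothetical macro: a form exposing one slot called "widgets".
MACRO = '<form xmlns:metal="%s"><div metal:define-slot="widgets"/></form>' % METAL

# Before the fix: the navigation sits beside the slot filler, not inside it.
BEFORE = ('<div xmlns:metal="%s"><p>navigation links</p>'
          '<table metal:fill-slot="widgets"><tr><td>message</td></tr></table>'
          '</div>') % METAL

# After the fix: a wrapping div fills the slot and carries the navigation.
AFTER = ('<div xmlns:metal="%s"><div metal:fill-slot="widgets">'
         '<p>navigation links</p>'
         '<table><tr><td>message</td></tr></table></div></div>') % METAL


def expand_macro(macro_src, caller_src):
    """Toy macro expansion: copy fill-slot elements into matching slots.

    Everything in the caller that is not inside a fill-slot element is
    ignored, which is the behaviour this sketch sets out to illustrate.
    """
    macro = ET.fromstring(macro_src)
    caller = ET.fromstring(caller_src)
    fills = {}
    for element in caller.iter():
        name = element.get(FILL)
        if name is not None:
            fills[name] = element
    for parent in list(macro.iter()):
        for index, child in enumerate(list(parent)):
            if child.get(DEFINE) in fills:
                parent[index] = fills[child.get(DEFINE)]
    return ET.tostring(macro, encoding="unicode")


before_html = expand_macro(MACRO, BEFORE)
after_html = expand_macro(MACRO, AFTER)
```

In this model the message table survives in both outputs, but the navigation text only appears in `after_html`, which matches the symptom of literals vanishing when placed next to the table.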
sinzui1.5h of confusion and 1 minute to fix with insight.01:17
lifelesswe must be programming01:17
lifelessThank you for this; I'll push up and propose for merge01:18
lifelessand I'll write a mail to the list with a) a howto and b) asking for where it should go01:19
lifelessdoes anyone remember the wiki page for the bug sprinty thing at the end of the year?01:20
=== lifeless changed the topic of #launchpad-dev to: Launchpad Development Channel | Performance Tuesday | Week 1 of 10.10 | PQM is open for business | firefighting: - | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting
lifelesssinzui: https://code.edge.launchpad.net/~lifeless/launchpad/registry/+merge/3535401:24
wgrantlifeless: So, I worked out what was up with the broken accounts.01:40
wgrantSadly more will likely break soon.01:40
lifelesswgrant: ok cool01:40
lifelessI learnt how to batch stuff01:40
lifelessand to hate metal:form01:40
wgrantWho owns our OpenID consumer these days?01:41
lifelessconsumer? foundations01:42
lifelessthumper: does transaction time == scan time ?01:53
thumperlifeless: luckily, no01:53
thumperlifeless: 5.5 minutes to get the ancestry from bzrlib :(01:54
lifelessI think your idea of decoupling the tip change may not be enough01:58
lifelessI'd start with autocommit01:58
lifelessbut it seems like low effort for big return01:58
=== Ursinha-brb is now known as Ursinha-afk
lifelessthumper: can I borrow your eyeballs02:15
thumperlifeless: no, they're mine02:22
thumperlifeless: what do you need?02:22
* mwhudson is reminded of the end of hotshots02:22
lifelessthumper: a review02:23
lifelessits small, it will fix mailing list moderation (or make it fixable by further tuning)02:23
MTecknologyJust on the wild off chance... Is there anyone that knows very basic accounting principles in here?   I know there are very smart people in here and hoping one might be able to help me out..02:43
lifelessMTecknology: I do, enough to say 'run run away'02:47
lifelessspm: hey, don't suppose in the losa wiki you have a sql fragment to report on locks in the db ?02:48
lifelessspm: I know I wrote one up years ago ...02:48
spmyup sure do02:48
lifelessthumper needs it.02:48
spmit's a tad obscure to find tho. lp howto, troubleshooting from memory02:48
MTecknologylifeless: How about enough to help me figure out this problem that's driving me absolutely bonkers? I have the book - but the book doesn't cover the material.02:48
thumperlifeless: did you mute?02:48
spmthumper: https://wiki.canonical.com/InformationInfrastructure/OSA/LPHowTo/BlockedProcessesDBLocks as a general02:49
spmhttps://wiki.canonical.com/InformationInfrastructure/OSA/LPHowTo/PostgresOldQueries is also vaguely relevant02:49
spmMTecknology: google isn't helping find it? if they're basic accounting principles, there should be heaps of online references that explain them??02:50
MTecknologyspm: My issue is understanding the basics of what I'm even reading online02:50
spmMTecknology: being quite serious here (I've got a few in the series): Perhaps "Accounting for Dummies"? serious suggestion, the dummies series are excellent for explaining the basic concepts. ??02:52
MTecknologyspm: might be worth buying from ya - any chance you could try to help me in a query with this one?02:54
MTecknologyor else there's a Barnes & Noble here if you weren't offering to sell02:54
spmMTecknology: accounting? hell no. I never studied it at school or uni. wouldn't have a clue. I just have a few Dummies books that I've found excellent for explaining early concepts in the topics in question. :-)02:55
MTecknologyBTW - This is what I'm fighting.  Pearson Brothers recently reported an EBITDA of $13.5 million and net income of $2.6 million. It had $2.0 million of interest expense, and its corporate tax rate was 35%. What was its charge for depreciation and amortization?02:55
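One way to work the problem quoted above is to walk back up a simplified income statement: net income is pre-tax income after tax, pre-tax income is EBIT less interest, and EBITDA is EBIT plus depreciation and amortization. The sketch below assumes exactly that simplified statement with no other items; the function name is made up for illustration.

```python
def depreciation_and_amortization(ebitda, net_income, interest, tax_rate):
    """Back out D&A from EBITDA, assuming a simplified income statement.

    Assumed chain: EBITDA - D&A = EBIT; EBIT - interest = pre-tax income;
    pre-tax income * (1 - tax rate) = net income.
    """
    pretax_income = net_income / (1 - tax_rate)
    ebit = pretax_income + interest
    return ebitda - ebit


# Figures from the question, in millions of dollars.
charge = depreciation_and_amortization(
    ebitda=13.5, net_income=2.6, interest=2.0, tax_rate=0.35
)
```

With those inputs the pre-tax income comes to about 4.0, EBIT to about 6.0, and the charge to about 7.5 million dollars, subject to floating point rounding.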
spmhttp://www.amazon.com/Accounting-Dummies-John-Tracy/dp/0764550144 fwiw02:56
MTecknologycheap, and probably much more useful than this $150 unbound see through sheets of paper book I have02:57
spmprobably :-)02:59
cr3lifeless: hi there, sorry I couldn't answer you earlier. still around?03:10
cr3lifeless: so, regarding test runs, do you also feel that's the best way to describe a group of test results run at a point in time in a given context?03:15
cr3lifeless: typically, I prefer to name things with one word, like submission instead of test run, but I think the latter might be clearer03:15
lifelessI commented on that in #testrepository03:17
lifelesssorry otp now03:17
cr3lifeless: heh, you seem to have been on the phone all day :)03:19
cr3on an unrelated topic, I have a question about defining interfaces: if a class implements IBugTarget which inherits from IHasBugs, using bugs as an example, then that class typically defines a createBug method.03:22
cr3however, why not have the class have a bugs attribute which returns a IBugSet which, in turn, implements a create method03:23
cr3in other words, the difference is like product.createBug compared to product.bugs.create, does this make sense to anyone?03:24
lifelessthumper: https://dev.launchpad.net/LEP/FeatureFlags#preview03:25
lifelessthumper: if features.getFeature('code.incrementaldiff') == 'on':03:28
lifelessin templates, you do view/features/code.incrementaldiff03:28
lifelessor something like that03:28
lifelessspm: how many cpus on the master db?03:35
spmlifeless: 1603:36
lifelesscr3: hi03:57
lifelessin reverse order, I don't know, IBugSet really isn't the specific code I'd use if sketching it that way, and don't forget that all calls to SQL are ~ 1000 times slower than python.03:58
lifelessI don't have a brilliant name for the result of running many tests other than 'test run'03:59
wgrantlifeless: Any idea how to debug the +filebug issue?04:08
lifelessspm: we really do need a hand04:10
lifelessspm: when you can, its approx the top timeout on prod04:10
spmlifeless: sure, was on a call, earlier hence the terse reply. sup?04:10
lifelessspm: +filebug gives an apache/haproxy error, reliably, on staging and prod04:11
lifelesswgrant has been looking at it04:11
lifelesswe need to know a bit more about whats actually going on.04:11
wgrantBug 63680104:11
spmum. since when? I happily filed a bug earlier?04:11
lifelessor for someone to make the request to a naked appserver04:11
wgrantI guess we need someone to watch staging Apache and see why it errors.04:11
lifelessor something04:11
wgrantspm: Only when filing with lots of apport attachments.04:12
lifelessspm: with apport on a package with 20+ subscribers?04:12
spmno, just a soyuz one. qed. ;-)04:12
* wgrant kicks mup.04:12
lifelessmup has mastered the fine art of silence.04:12
spmmup appears to have left the channel04:12
wgrantspm: WRT that Soyuz one, it seems to be a general problem. I've received complaints that lots of builds are dispatching repeatedly.04:12
lifelessspm: _mup_04:13
cr3lifeless: the interface question was mostly related to something containing other objects. put another way, I could have projects['bzr'].create_test_run() or projects['bzr'].test_runs.create()04:13
spmahh. it hides under a new name.04:13
lifelesscr3: is this python or LP API's ?04:14
lifelesscr3: if its LP API's you probably want to design to the wire protocol, given how round-trip-happy it is.04:14
cr3lifeless: ok, so every dot is a roundtrip potentially04:15
wgrantNot just potentially :(04:15
lifelessif by potentially you mean 'almost guaranteed'04:15
lifelessand by dot you mean 'python method invocation'04:15
lifeless(which includes __getattr__ aka '.')04:15
cr3I thought that perhaps launchpadlib could potentially cache information on the client side, sometimes avoiding a roundtrip04:15
lifelesscr3: optimise for cold cache :)04:16
lifeless(it can, under very limited circumstances)04:16
lifelesswhich I suspect we'll be limiting to about 2.5 hours in the near future04:17
cr3ok, that answers my question and provides good guidelines for the future. thanks!04:17
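The "design to the wire protocol" advice can be illustrated by counting requests for the two interface shapes discussed above. This is a toy model with a cold cache, not launchpadlib: `CountingTransport`, `RemoteObject` and `invoke` are hypothetical names, and the model simply assumes each attribute hop fetches a linked resource.

```python
class CountingTransport:
    """Hypothetical transport that records every request it is asked to make."""

    def __init__(self):
        self.requests = []

    def get(self, path):
        self.requests.append(("GET", path))

    def post(self, path, operation):
        self.requests.append(("POST", path, operation))


class RemoteObject:
    """Toy remote proxy: each attribute hop costs one GET, each call one POST."""

    def __init__(self, transport, path):
        self._transport = transport
        self._path = path

    def __getattr__(self, name):
        # Only reached for names not set in __init__, i.e. linked resources.
        child_path = self._path + "/" + name
        self._transport.get(child_path)
        return RemoteObject(self._transport, child_path)

    def invoke(self, operation):
        self._transport.post(self._path, operation)


# Shape 1: projects['bzr'].create_test_run() -> one request.
flat = CountingTransport()
RemoteObject(flat, "/bzr").invoke("create_test_run")

# Shape 2: projects['bzr'].test_runs.create() -> a fetch, then the call.
nested = CountingTransport()
RemoteObject(nested, "/bzr").test_runs.invoke("create")
```

In this model the nested form costs two requests where the flat form costs one, so the nicer-looking object graph doubles the latency of the operation unless the client can serve the intermediate hop from a cache.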
cr3lesson learned, now time for bed. cheerio folks!04:20
wgrantBah, no staging.04:23
spmlifeless: so, been doing some log snarfing and head scratching. not finding any errors in apache - but if a POST, and timing out; tbh I wouldn't expect to. :-( If this can be reliably repeated, I'd suggest 'a' way forward, would be to sniff the traffic at the client end when doing such a thing. even tho the connection'd be ssl'd, I'd betcha we'd get useful info out of the flow.04:29
lifelessstub: https://code.edge.launchpad.net/~stub/launchpad/cronscripts/+merge/35279 reviewedish04:31
spmwgrant: ref soyuz; yeah, I'm sure I'd seen comments around this bug before; but didn't have enough "knowledge" to find 'em. So figured a new one with some detailed timing info may help Julian. Being a private build I had to be a tad circumspect in what I put in unf. :-/04:31
lifelessspm: we don't see the response on the client,thats the point.04:31
lifelessspm: client -> server, pause, 'could not connect to launchpad'04:31
spmlifeless: the tcp connection stays open forever until it gets client killed?04:32
lifelessspm: so we don't get an oops, don't get zip04:32
lifelessspm: no we get the haproxy/apache lalala page04:32
spmafter what time period, repeatedly?04:32
lifelesswgrant: please tell spm how to make it happen, then he'll see04:32
spmsame time period? longer? shortly? varies by moon phase and tides?04:32
lifeless10seconds sometimes apparently, though I think that was during the overload04:32
lifeless30 normallyish, I thinks.04:33
spmdifferent browsers to make a diff?04:33
lifelessspm: don't think we've tested, because the browser is working fine.04:33
spmjust wondering if it's an internal browser timeout that's then kicking the server error04:34
lifelessI don't even know how to parse that04:34
spmie. are packets actually flowing and then dying.04:34
spmor no packets flowing at all04:34
lifelessspm: its http - request/response model04:35
lifelessspm: and apport does preuploading of the bugs, so its not a big post.04:35
spmfor sure; I'm looking at the tcp layer to get clues for wtf is happening at the http layer.04:35
lifelesswgrant: whats a package that this has happened to ?04:35
lifelessspm: I don't think its an http problem myself, I think its appserver lalalalala land time genuinely, but we don't see the oops clearly04:36
spmfwiw, it should be pretty simple in staging: intranettertubers -> apache -> appserver. no squid, no haproxy.04:36
lifeless ok04:36
wgrantspm, lifeless: I was trying to prepare a case, but staging is borked.04:37
lifelessso we're seeing the apache 'server fail' message04:37
spmactually there's a point. wonder if the oops are being generated; we're just not seeing 'em. looks...04:37
lifelessOOPS-1717E1745, OOPS-1717G1716, OOPS-1717H1810, OOPS-1717K1882, OOPS-1717L176004:37
lifelessOOPS-1717E1218, OOPS-1717E1837, OOPS-1717K1949, OOPS-1717M1234, OOPS-1717N121104:37
lifelessOOPS-1717D703, OOPS-1717G778, OOPS-1717K884, OOPS-1717K88504:37
lifelessthey are listed as soft timeouts on +filebug04:37
lifelesswe also have some 'OffsiteFormPostError'04:38
spmprocess-apport-blobs.log is remarkably unhelpful04:38
wgrantprocess-apport-blobs is fine.04:38
lifelessthat happens async04:38
lifelessits all in the appserver at this point04:39
* spm is doing the Sherlock Holmes method of debug - eliminate the working, to discover the not ;-)04:39
wgrantCan we expect staging to return at some point?04:40
wgrantIt's been down a lot lately...04:40
lifelesswgrant: theres about 6 queries per attachment04:40
wgrantlifeless: Really?04:40
spmlaunchpad-trace.log has zip with 'filebug' in it. orsum04:41
lifelessINSERT INTO BugAttachment (message, bug, libraryfile, type, title) VALUES (%s, %s, %s, %s, %s) RETURNING BugAttachment.id04:41
wgrantMessage, BugAttachment, BugNotification, FUCKLOADS * BugNotificationRecipient04:41
lifelessSELECT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname04:41
lifelessSELECT BugTask.assignee, BugTask.bug, BugTask.bugwatch, BugTask.date_assigned, BugTask.date_closed, BugTask.date_confirmed, BugTask.date_fix_committed, BugTask.date_fix_released04:41
spmwgrant: for some reason, staging is being updated 'continuously' regardless of need. haven't had a chance to chase. yet.04:41
lifelessSELECT StructuralSubscription.blueprint_notification_level, StructuralSubscription.bug_notification_level, StructuralSubscription.date_created, StructuralSubscription.date_last_updated04:41
lifelessSELECT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname,04:41
lifelessSELECT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname04:41
wgrantlifeless: 'ugh' comes to mind.04:41
lifelessSELECT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname,04:41
lifelessSELECT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname,04:42
lifelessSELECT BugTask.assignee, BugTask.bug, BugTask.bugwatch, BugTask.date_assigned, BugTask.date_closed, BugTask.date_confirmed, BugTask.date_fix_committed, BugTask.date_fix_released,04:42
lifelessSELECT StructuralSubscription.blueprint_notification_level, StructuralSubscription.bug_notification_level, StructuralSubscription.date_created, StructuralSubscription.date_last_updated,04:42
lifelessSELECT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname,04:42
lifelessSELECT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname,04:42
lifelessSELECT LibraryFileContent.datecreated, LibraryFileContent.filesize, LibraryFileContent.id, LibraryFileContent.md5, LibraryFileContent.sha1 FROM LibraryFileContent WHERE LibraryFileContent.id = %s LIMIT 104:42
lifelessINSERT INTO BugActivity (oldvalue, datechanged, whatchanged, message, newvalue, bug, person) VALUES (%s, CURRENT_TIMESTAMP AT TIME ZONE 'UTC', %s, %s, %s, %s, %s) RETURNING04:42
lifelessINSERT INTO Message (datecreated, owner, subject, rfc822msgid) VALUES (CURRENT_TIMESTAMP AT TIME ZONE 'UTC', %s, %s, %s) RETURNING Message.id04:42
lifelessI'm going to stop there04:43
lifelesswgrant: mailed you, its an open package, normal person04:44
wgrantlifeless: Is this from a hidden OOPS?04:44
wgrantHm, that's only 500 queries.04:45
wgrantIs this a soft timeout?04:45
lifelessbut we've no reason to assume that this is unrelated ;)04:45
wgrantI'm more concerned with the bad error than the fact that there's an error.04:46
wgrantWe know why it's timing out.04:46
wgrantWe don't know why it's timing out like this.04:46
lifelessoh it concerns me too04:47
lifelesswgrant: are you doing some perf stuff today? It is tuesday...04:47
wgrantAre we going to be able to work this out on staging soon, or should we do it on prod now?04:49
lifelessprod it up04:49
wgranthttps://bugs.edge.launchpad.net/ubuntu/+source/linux/+filebug/5ca89d78-bfa3-11df-905e-0025b3df357a breaks pretty repeatedly.04:51
lifelesswgrant: whats your ip for spm to look in apache logs04:51
wgrantIsn't the token in the URL sufficient?04:52
spm"trust me, I'm a sysadmin"04:52
* lifeless looks in shock at spm04:52
jtvwgrant: heya04:52
wgrantjtv: Morning.04:52
jtvwgrant: may or may not be related but last night at least, we had some edge breakage where one of the edge instances reported the wrong revision.04:53
jtvSo it'd say it was at r11532 but actually seemed to be stuck at r11522 like the rest.04:53
wgrantjtv: Unrelated -- this has been going on since ~10.09.04:53
jtvOh ok04:53
jtvnm that then :)04:54
spmbleh. apache says '502'04:56
wgrantNothing useful in the error log?04:56
wgrantOr does that mean the appserver said 502?04:56
lifelessspm: now the question is, did an oops get generated04:57
spm[14/Sep/2010:04:50:42 +0100]04:57
* wgrant stabs BST in the face.04:57
spm[Tue Sep 14 04:50:55 2010] [error] [client] (70014)End of file found: proxy: error reading status line from remote server localhost, referer: https://bugs.edge.launchpad.net/ubuntu/+source/linux/+filebug/5ca89d78-bfa3-11df-905e-0025b3df357a04:59
spm[Tue Sep 14 04:50:55 2010] [error] [client] proxy: Error reading from remote server returned by /ubuntu/+source/linux/+filebug/5ca89d78-bfa3-11df-905e-0025b3df357a, referer: https://bugs.edge.launchpad.net/ubuntu/+source/linux/+filebug/5ca89d78-bfa3-11df-905e-0025b3df357a04:59
spm^^ errorlog04:59
wgrantSo, the appserver died.04:59
wgrantOr otherwise closed the connection.04:59
spmhmm. haproxy/squid are in there somewhere. there may be in ter est ing comp li ca tions05:00
wgrantI thought they were on the other side.05:00
wgrantBut I could well be wrong.05:00
wgrantSo I guess you need to go through all the layers :(05:03
lifelessapache -> ha -> appserver05:03
wgrantWith Squid in front of Apache?05:04
spmapache -> (squid)? -> ha -> app; POSTs don't go via squiddly05:04
lifelessnor do authenticated requests IIRC05:04
lifelessthumper: https://bugs.edge.launchpad.net/launchpad-code/+bug/637758 please put the code walkthrough we did in there, for gary's info when he sees the other bug I'm filing :)05:05
lifelessspm: how many appservers for lpnet ?05:08
pooliehow's it going, wallyworld?05:09
wgrantI was a bit surprised to see O oopses over the weekend.05:09
lifelessso 60 threads05:09
wgrantI didn't realise there were quite that many.05:09
lifelessthumper: https://bugs.edge.launchpad.net/launchpad-foundations/+bug/63776105:11
pooliewgrant: because the counter was broke, or because we actually had 0?05:13
poolieremarkably good if so05:13
mwhudsondoes that 15 include the login and shipit servers?05:13
wgrantpoolie: O != 005:13
poolieO meaning some particular category?05:14
lifelesspoolie: server ID in the oops code05:14
lifelessA, B, C, ...05:14
spmmwhudson: no, those are extras05:14
mwhudsonlots of hardware05:14
lifelesssome machines have multiple instances05:14
lifelessbut yes.05:14
pooliearen't some higher letters used for something other than a machine id?05:15
poolieor maybe that's a different field05:15
lifelessits an arbitrary string05:15
lifelesse.g. XML05:15
lifelessdate before, serial after05:15
lifelessthumper: rt 41361 if you want to high-pri it05:15
thumperlifeless: ack, dealing with a user on #launchpad right now05:16
wgrantThe appservers are single letters. Others are longer strings (eg. CW, FTPMASTER, PPA)05:16
spmwoo. progress. Sep 13 23:26:16 localhost haproxy[15039]: [13/Sep/2010:23:26:00.844] lpnet-app lpnet-app/potassium_lpnet_5 0/0/0/-1/15230 502 1184 - - SH-- 67/38/38/2/0 0/0 "POST /ubuntu/+source/linux/+filebug/8dc224d8-bf85-11df-806b-0025b3df357a HTTP/1.1"05:16
lifelessI wonder how hard it would be to port storm & zope to stackless05:16
lifelessor jython05:17
wgrantspm: But what does it mean?05:17
lifelesswgrant: it means SH--05:17
lifelesswgrant: one thing it tells us05:17
spmlpnet 5 on potassium did the "work"05:17
lifelesspotassium should have the oops05:17
wgrantIf there was an OOPS.05:18
wgrantI was hoping it would tell us on what terms the response ended.05:18
thumperlifeless: I'm thinking that we are seeing other xmlrpc problems from the bzr client05:20
thumperlifeless: as it does lp name resolution lookups05:20
lifelesswgrant: divorced05:21
spmlifeless: wgrant: it also tells us, this timed out after 15 seconds; some other logs around there have 300secs, so ... funky.05:21
lifelessthumper: sorry, can you expand on that please.05:21
spmor succeeded after 270 seconds; so I'd suggest this is *unlikely* (but not ruled out) to be a timeout issue directly.05:22
wgrantspm: Does potassium have an opinion?05:22
lifelessthe 270 seconds will be a file attachment05:22
lifelesswgrant: it likes water05:22
wgrantspm: Also, it didn't time out after 15 seconds.05:22
wgrantI don't think.05:23
wgrantBecause I get that error in less than 14 seconds.05:23
spm15230 <== ms, ~ 15 secs05:23
lifeless:23:26:00 -> 23:26:1605:23
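The haproxy line pasted above can be pulled apart mechanically. The sketch below assumes the standard HAProxy HTTP log layout as I understand it from its documentation: five slash-separated timers (request, queue, connect, server response, total, in milliseconds, with -1 meaning that phase never completed), then status, bytes, two cookie fields and a four-character termination state. `PATTERN` and `fields` are illustrative names, and the client address is absent because the pasted line does not include one.

```python
import re

LINE = (
    'Sep 13 23:26:16 localhost haproxy[15039]: [13/Sep/2010:23:26:00.844] '
    'lpnet-app lpnet-app/potassium_lpnet_5 0/0/0/-1/15230 502 1184 - - SH-- '
    '67/38/38/2/0 0/0 "POST /ubuntu/+source/linux/+filebug/'
    '8dc224d8-bf85-11df-806b-0025b3df357a HTTP/1.1"'
)

# Assumed field order: [accept date] frontend backend/server timers status
# bytes request-cookie response-cookie termination-state ...
PATTERN = re.compile(
    r'\[(?P<accepted>[^\]]+)\] (?P<frontend>\S+) '
    r'(?P<backend>[^/\s]+)/(?P<server>\S+) '
    r'(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/(?P<tt>-?\d+) '
    r'(?P<status>\d{3}) (?P<bytes>\d+) \S+ \S+ (?P<state>\S{4}) '
)

fields = PATTERN.search(LINE).groupdict()

total_seconds = int(fields["tt"]) / 1000       # total session time
got_response_headers = int(fields["tr"]) >= 0  # -1: headers never arrived
```

Read this way, the request went to `potassium_lpnet_5`, ran for roughly 15 seconds in total, and never produced response headers. As far as I recall the documentation, a termination state beginning `SH` means the server side ended the session while haproxy was still waiting for those headers, which would fit an appserver dropping the connection rather than a proxy timeout.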
wgrant(now, at least -- not sure about that request)05:23
wgrantSo it's not a pure timeout.05:23
spmI'd be inclined to rule out an apache/haproxy/squid timeout, not exclude, but look elsewhere.05:24
lifelessspm: is that url in the zope logs on potassium05:24
spmhuh. not the *same* url, but related: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1718F12405:29
lifelessspm: its odd for the request to dispatch and not be in the access log05:31
lifelessspm: wouldn't you say?05:31
spmwhich log? the lp one?05:31
spmoh yes.05:31
lifelessisn't there an access log for it?05:32
spmI've found it in the trace log05:32
spmbut not in launchpad-access5.log-2010091405:32
lifelessok that OOPS is also a soft timeout05:32
lifelessI wonder if there is something special going on05:32
wgrantThere is clearly something special going on.05:32
lifelesswgrant: I have an experiment you could do05:32
spmlifeless: https://pastebin.canonical.com/37129/05:33
lifelessconfigure launchpad.dev to have a 1 second soft timeout and 1.2 second hard timeout05:33
lifelesspoint apport at it and have fun05:33
lifelesswgrant: theory: hard timeouts are breaking in this case05:33
lifelessand - boom - I think I know why05:33
thumperlifeless: could have been something else, don't worry about it05:33
wgrantLet's see...05:33
lifelessrequesttimeline stuff we saw yesterday.05:33
wgrantYeah, I wondered if that was related.05:34
* wgrant finds the timeouts.05:34
lifelessspm: is there an OverlappingActionError in the lpnet5 appserver log ?05:34
lifelessspm: (not the trace log)05:34
wgrantlifeless: I only see soft_request_timeout05:35
wgrantNo hard_request_timeout.05:35
wgrantThe comment on that is misleading.05:35
* thumper looks at the daily timeout candidates05:35
wgrant# SQL statement timeout in milliseconds. If a statement05:35
wgrant# takes longer than this to execute, then it will be aborted.05:35
wgrant# A value of 0 turns off the timeout. If this value is not set,05:35
wgrant# PostgreSQL's default setting is used.05:35
thumperwhat is BranchSet:CollectionResource:#branches ?05:35
lifelessthumper: an API call05:35
wgrantthumper: API branch collection.05:35
thumperyeah but which?05:35
lifelessI've got a bug open on the clarity for that05:35
lifelessthumper: you have to look at the oops05:36
wgrant /branches05:36
lifelesswhich is why i have a bug open05:36
lifelesswgrant: *any* collection.05:36
wgrantlifeless: It says BranchSet05:36
lifelessah true, damn your eyes.05:36
wgrantBut yes, normally it's stupid.05:36
spmlifeless: not that I can find. afaict, that "request" doesn't exist in the lpnet5 access log. even looking at the full 10 min period around05:40
lifelesswgrant: http://pastebin.com/naNudsp305:40
lifelessspm: check nohup05:40
spmhrm. point05:40
wgrantProxy Error05:40
lifelessfor OverlappingActionError05:40
wgrantOverlappingActionError: (<lp.services.timeline.timedaction.TimedAction object at 0xe91ebac>, <lp.services.timeline.timedaction.TimedAction object at 0xe71ebec>)05:40
lifelesswgrant: you reproduced ?05:40
wgrantYou win.05:40
lifelesswgrant: \o/05:41
wgrantQuestion is... which are they...05:41
spm1st problem. nohup doesn't log times.05:41
lifelessapply my patch05:41
lifelessspm: never mind, my WAG was spot on05:41
spm    raise OverlappingActionError(self.actions[-1], result)05:41
spmOverlappingActionError: (<lp.services.timeline.timedaction.TimedAction object at 0x2b034a6f7990>, <lp.services.timeline.timedaction.TimedAction object at 0x1338fd90>)05:42
spm    raise OverlappingActionError(self.actions[-1], result)05:42
spmOverlappingActionError: (<lp.services.timeline.timedaction.TimedAction object at 0x13ff2f90>, <lp.services.timeline.timedaction.TimedAction object at 0x147b1f50>)05:42
spmlifeless: cool, fwiw tho ^^05:42
lifelessspm: thanks05:42
=== jerinas is now known as j24
wgrantOverlappingActionError: (<TimedAction SQL-launchpad-main-master[UPDATE Bug SET heat_]>, <TimedAction SQL-session[ UPDATE ]>)05:42
wgrantMaybe when it times out it doesn't close the action?05:42
lifelesssee the comment in errorlog about this05:42
wgrantI see.05:43
lifelessin logTuple05:43
lifelessstorm tracers are not a stack.05:43
lifelessour having a timeout tracer and a log tracer doesn't work as well as it should in theory.05:45
lifelessI think I'm going to create a stack-lock tracer that delegates to two other tracers and combine them.05:45
lifelesslong term.05:45
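[editor's note] A minimal sketch of the delegating tracer lifeless describes, under stated assumptions. The hook names (`statement_start`, `statement_done`, `statement_error`) and all three classes are hypothetical stand-ins, not Storm's real tracer API or the Launchpad code. The idea shown is only that one wrapper calls its children in order and, if a child raises part-way through (as a timeout tracer would), tells the children that already started so they can close their open action.

```python
class RequestExpired(Exception):
    """Raised by the timeout tracer when the request budget is used up."""


class LogTracer:
    """Records one open action at a time, like a request timeline."""

    def __init__(self):
        self.open_action = None
        self.closed = []

    def statement_start(self, statement):
        if self.open_action is not None:
            raise RuntimeError(
                "overlapping actions: %r, %r" % (self.open_action, statement))
        self.open_action = statement

    def statement_done(self, statement):
        self.closed.append(self.open_action)
        self.open_action = None

    def statement_error(self, statement, error):
        # An errored statement still has to close its action.
        self.statement_done(statement)


class TimeoutTracer:
    """Refuses to start a statement once the request has expired."""

    def __init__(self, expired=False):
        self.expired = expired

    def statement_start(self, statement):
        if self.expired:
            raise RequestExpired(statement)

    def statement_done(self, statement):
        pass

    def statement_error(self, statement, error):
        pass


class StackedTracer:
    """Delegate to child tracers so start/finish calls nest like a stack."""

    def __init__(self, *tracers):
        self.tracers = tracers

    def statement_start(self, statement):
        started = []
        try:
            for tracer in self.tracers:
                tracer.statement_start(statement)
                started.append(tracer)
        except Exception as error:
            # Unwind the children that did start, innermost first.
            for tracer in reversed(started):
                tracer.statement_error(statement, error)
            raise

    def statement_done(self, statement):
        for tracer in reversed(self.tracers):
            tracer.statement_done(statement)

    def statement_error(self, statement, error):
        for tracer in reversed(self.tracers):
            tracer.statement_error(statement, error)


log = LogTracer()
timeout = TimeoutTracer(expired=True)
tracer = StackedTracer(log, timeout)
try:
    tracer.statement_start("UPDATE Bug SET heat")
except RequestExpired:
    pass
timeout.expired = False
# No overlap here, because the first action was closed during unwinding.
tracer.statement_start("UPDATE session")
```

Without the unwinding step, the second `statement_start` would hit the overlap check in `LogTracer`, which is the shape of failure wgrant guesses at above.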
lifelessfor now, lets get fugly, lets get fugly.05:45
=== jerinas_ is now known as j24
* spm decides now would be a good time to run away for lunch05:46
lifelessspm: I have a cowboy05:47
lifelessspm: when you return05:47
lifelesswgrant: what was the bug for this?05:47
spmlifeless: cool; I assume by then you'll also have an incident report to go with? ;-)05:47
wgrantBug 63680105:47
lifelessspm: I can make one05:51
lifelesswgrant: please confirm that http://pastebin.com/iPnkpPpF fixes it05:52
wgrantlifeless: Great success.05:55
lifelessspm: we have the thing to cowboy06:02
wgrantstub: Hi.06:22
wgrantstub: The multiple OpenID identifiers stuff has had some interesting consequences.06:22
wgrantstub: In particular the bit where it respects email address linkage more than identifier linkage.06:23
wgrantWhich results in people being logged in as the wrong person, and the real person OOPSing because they no longer have an identifier.06:23
wgrantI guess the users can fix it by merging the accounts... but I'm not sure that respecting the email address in the first place is a good idea.06:24
lifelessI don't get not respect.06:24
lifelessplaying with words06:24
stubI don't understand what the problem case is. If you are logging into the OP using an email address, you want to login as the Launchpad Person attached to that email address.06:26
stubI suspect the cases that are broken were broken already, caused when LP accounts were merged (the main bug this change was supposed to tackle)06:28
lifelessspm: when you return : the thing to cowboy is https://code.launchpad.net/~lifeless/launchpad/cp/+merge/3536406:28
lifelessspm: its going through the motions now to get into prod-devel, and I'll request a normal reroll tomorrow or so with it in it, but we should fix it now.06:29
lifelesswgrant: so, care to work on filebug ?06:29
lifelesswgrant: -huge- room for improvement.06:29
lifelessspm: and for edge, we need https://code.launchpad.net/~lifeless/launchpad/oops/+merge/3536306:32
lifelessagain, its in the pipe to be done the normal way06:32
wgrantstub: In the cases I know of, the user had changed their LP email address to blah+launchpad@some.domain06:32
wgrantstub: A package or translations import then recreated blah@some.domain06:33
wgrantSo the next time they log in, they land in a different account.06:33
stubI see.06:33
wgrantWhat is the purpose of the email address match?06:34
stubBecause people can change their email details in the OpenID Provider.06:34
stubEdit your emails, create a new account with the old email, be unable to log into Launchpad.06:35
wgrantIf the LP person was tied to an identifier, then email addresses don't matter.06:36
wgrantIt could be a little confusing in some cases, until the OpenID associations are listed clearly... but it wouldn't do strange things like this.06:36
lifeless-> shops. If issues, ring me06:38
lifelessoh, this is needed primarily on appservers, so just them for now.06:38
stubCreate foo@example.com in the OP. Log into Launchpad. Edit the account to be bar@example.com. Create a new account for foo@example.com. Now if you log in as foo@example.com, you can't log into Launchpad as your email address in Launchpad is associated with a different identifier.06:38
stubAlthough the way we really triggered this was account merging.06:39
wgrantlifeless: ECHAN?06:39
wgrantstub: Ah, I see.06:39
stubPeople had multiple accounts in the OP, and a Launchpad person with multiple email addresses. They had to log into the OP using the email address that happens to be linked to the correct Person.06:39
wgrantI wonder what should be done here.06:41
wgrantI cannot see a good solution.06:42
stubBecause we can now link multiple identifiers to a Person, and because person merge does the right thing now, we might be able to drop some of the repair work login does now if the solution is causing worse problems.06:42
wgrantDoes the use case you provided above have any legitimate reason for occurring?06:43
stubIf it does, it is pretty obscure.06:43
wgrantSo I wonder if the repair is useful.06:43
wgrantOr if it should just tell you that you are doing bad things.06:43
stubWe are in a half way stage to becoming a real OpenID consumer. I think the problems go away if we stop trusting the Canonical SSO and instead implement the work flow for attaching OpenID identifiers to Launchpad accounts. But there is a fair bit of work that needs doing first (shipit and our test infrastructure makes this more complicated)06:46
wgrantRight, this is what I was thinking.06:46
wgrantExcept for the test infrastructure.06:46
wgrantWhat's the issue with that?06:46
wgrantDoes it do something stupid like using basic auth?06:46
stubThe OpenID our tests use is the old Launchpad OP code. It uses the same underlying database tables, so it is all tied up in knots.06:47
wgrantAh, right.06:47
wgrantstub: So, what should be done? Advise the users to merge?06:57
stubAt the moment, yes.06:58
stubThat might be the preferred solution too, as we ensure the SSO database and Launchpad database remain in sync. I'm not sure.07:00
* mwhudson eods07:00
wgrantThe separation needs to eventually be far more obvious.07:00
lifelesswgrant: not echan, info for spm07:15
=== almaisan-away is now known as al-maisan
lifelessspm: the cowboy07:15
lifelessspm: see all the backlog07:15
lifelessspm: we're missing out on many OOPS at the moment, its rather important07:16
spmyarp; just getting it all together atm07:16
spmhrm. complex patch that one07:16
stublifeless: by 'fail closed' do you mean if we can't load or parse the config file we should default to enabled?07:50
stubI can argue that either way. losas might have an opinion.07:50
lifelessI mean we should not run unless permitted to07:51
spmlosas have lots of opinions, some of them are even relevant07:51
lifelessif theres something wrong, running is likely to add to the problem07:51
lifeless(as a default, for most cronscripts)07:51
stubok. My reasoning for the current behaviour is a stuffup in the config mechanism (Apache dead, syntax error) shouldn't bring the Launchpad systems down. And your reasoning is just as valid :-) It does get noisy if things fail, but that is about it.07:54
lifelessits just cronscripts now isn't it ?07:55
stubI could make it a config option and make it somebody elses problem ;)07:55
stubYes - this is just cronscripts.07:55
lifelessso if all the cronscripts are down, when apache is down, I don't think its really a problem :)07:56
lifelessI mean, at that point, apache is down :)07:56
stubI think a typo or mistake is more likely - this config file is being edited by humans.07:56
lifelessso there are two places that can occur07:57
lifelessthe lazr config providing the url07:57
lifelessand the referenced ini file07:58
lifelessfor the former, it should change nearly never07:58
lifelessfor the latter, we could provide a small lint-and-update tool07:59
lifelessspm: what do you think08:00
spmI don't think. sysadmin.08:00
lifelessand if someone paid you to ? :P08:00
spmwith chocolate? I'd pretend to think REAL hard.08:01
spmnot sure I fully follow the issue? Q&D summary? something about not running cronscripts if parts of LP are borked?08:02
lifelessso there is a new ini file coming in08:02
lifelessit will disable all cronscripts in one hit, no need to touch cron08:02
lifelessif there is an error obtaining it (over http) and parsing it, what should happen: should the scripts run, or not run08:03
spmwhere "one hit" is ~ 26 easily accessible servers and 4 difficult ones?08:04
spmcool, just ensuring we appreciate what "one hit" means :-)08:04
lifelessas long as they can access it over http inside the network.08:04
spmOh! I see. Right! thats.... funky.08:04
lifelessyou don't like?08:05
spmNo I do like08:05
spmthe scripts should do whatever the last invocation was. which sucks, because now you're maintaining state as well.08:05
spmie. network hiccups *will* occur. we don't want to clobber LP by such a transient08:06
lifelessspm: so tcp syn will retry three times anyway08:07
spmas a thought: you'd have 2 checks. the official "check http"; with a secondary, check local/state. We can script update the state if necessary - eg apache update08:07
spmeven so. soyuz used to barf badly all the time on funky network woes.08:07
spmI'm thinking more resilient than what just tcp et al give.08:08
spmdoes that make sense? crackful?08:08
spmlifeless: that should be done on PROD btw08:09
lifelessso, for daily and hourly crons08:09
lifelesswgrant: please break it08:09
lifelesswgrant: prod08:09
wgrantlifeless: OK.08:09
spmnot edge, haven't done edge yet08:09
lifelessspm: for daily and hourly cronscripts, not running is fairly significant08:10
lifelessspm: OTOH daily and hourly things are background tasks mainly, and oops reporting etc is separate and not driven by this08:10
wgrantlifeless: Success.08:10
lifelessspm: \o/08:10
spmright, which is why I'd want them to fail as safely as possible - via a local state "what did I do last time?" <== but state is also likely to be shared (maybe??) so likely updated more frequently.08:11
lifelessso, personal opinion, if things are fucked royally I'd rather have the cronscripts not running to facilitate recovery.08:11
spmI'd probably suggest shying away from per-job state; go for a global08:11
lifelessas long as when they don't run they log it, if we find that network transients are an issue, we can iterate.08:12
wgrantAs long as it keeps attempting to retrieve it frequently...08:12
spmand if things are that bad; humans are involved. and we can set the state file; or 'script roll out' an updated state file08:12
stubI understand the 'what I did last time' argument, but the extra complexity makes diagnosis complex. I'd say keep it as simple as we can.08:12
lifelessstub: I agree.08:12
spmin that case, fail quiet; don't run.08:12
stubBut if that means maintaining state, we can do that (the cronscript can remember its last invocation on the file system, in /var/run or somewhere).08:12
lifelessspm: I'd fail closed, don't run, and log the failure.08:13
lifelessspm: why do you say fail quiet ?08:13
stubspm: At the moment, if the config file cannot be found (404) we emit a DEBUG and enable. Any other errors, including syntax errors, we emit a ERROR traceback and enable.08:14
spmprobably via some mechanism that makes it easy to nagios alert08:14
spmlifeless: don't cronspam bombard == quiet08:14
lifelessspm: thats in tension with 'diagnosable'08:14
lifelessspm: logging to a file would be ok?08:14
lifelessspm: also remember that this is on 404s and syntax errors08:15
lifelessspm: so things are messed up if its happening at all08:15
spmlifeless: you're talking > 260 cron tasks. if we get a global fail, that's a LOT of spam to wade thru08:15
spmlogging to disk is ok08:16
lifelessso, if we said:08:16
lifeless - on failure to get ini and parse, don't run.08:16
lifeless - log that to disk, not stderr08:16
lifeless - nagios should be monitoring those log files08:16
lifelesswould that make sense to you?08:16
lifelessstub: to you ?08:17
spm* log that to disk, not stderr: like oops' perhaps in a vague handwavy way. known dir; date time stamped; we can nagios alert on files between 0-60 mins old type of thing08:17
spmsetup a red button "archive and zot the cron logs" so the "all fixed" is not as painful08:18
stubSeems a little fragile relying on nagios like this. It is looking for an error rather than checking something is reacting correctly.08:18
spmstub: how so?08:18
stubWe are relying on the cronscript to log things correctly and ...08:19
stuboh.. hang on.08:19
spmbeing ware of assumptions: I'd assume the apache/configs setup is monitored08:19
stubWe already alert if scripts stop running.08:19
spmonly via scriptactivity, but yes.08:19
spmit's arguable if that should be nagios'd. atm I'd be vehemently against it.08:20
stubSo we could just disable silently (DEBUG or INFO - whatever) and rely on the existing checks to beep if things remain screwed up for too long.08:20
lifelessworks for me08:20
spmthat works08:20
lifelessas long as someone coming along to look can look at a file on disk to see whats up.08:20
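[editor's note] The behaviour agreed above (fail closed, don't run, leave a quiet log entry) can be sketched roughly as follows. This is an illustration under assumptions, not the branch stub is landing: the function name, the `enabled` key, the per-script section layout and the logger name are all hypothetical, and the fetch step is passed in as a callable so the sketch does not guess at how the real code retrieves the file over HTTP.

```python
import configparser
import logging

log = logging.getLogger("cronscript-gate")  # hypothetical logger name


def cronscript_enabled(fetch, script_name):
    """Return True only if the control file was fetched, parsed and says yes.

    `fetch` is any callable returning the ini text. Any failure at all
    (network error, 404, syntax error, missing or malformed key) means
    "don't run".
    """
    try:
        parser = configparser.ConfigParser()
        parser.read_string(fetch())
        # Use the script's own section if present, else the defaults.
        section = script_name if parser.has_section(script_name) else "DEFAULT"
        return parser.getboolean(section, "enabled")
    except Exception as error:
        # INFO rather than ERROR: existing script-activity checks are
        # expected to notice if scripts stay disabled for too long.
        log.info("Not running %s: control file unusable (%s)",
                 script_name, error)
        return False


def good_fetch():
    return "[DEFAULT]\nenabled = True\n\n[garbo]\nenabled = False\n"


def broken_fetch():
    raise IOError("connection refused")


print(cronscript_enabled(good_fetch, "expire-questions"))
print(cronscript_enabled(good_fetch, "garbo"))
print(cronscript_enabled(broken_fetch, "expire-questions"))
```

Pointing the `cronscript-gate` logger at a file handler would give the on-disk trail lifeless asks for without producing cron mail.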
spmI'm only aware of one script that really should have a nagios check against it - branch-merge-proposals08:20
lifelessSQL time: 10701 ms08:21
lifelessNon-sql time: 4505 ms08:21
lifelessTotal time: 15206 ms08:21
lifelessStatement Count: 53608:21
spmit's timely, and is also requiring human intervention on fail08:21
stubI think the 'too much spam to wade through' indicates too many scripts are emitting their logs via email. Perhaps they should log to file instead of stdout/email and losas look on disk when the alerts ping.08:22
lifelessstub: mthaddon wants that08:23
lifelessstub: for all basically08:23
* StevenK kicks webservice.get()08:23
spmI think pretty much every LP script logs to STDERR by default, which is known as Doing It Wrong08:23
StevenKFirst call works, second call returns AttributeError: 'thread._local' object has no attribute 'interaction' :-(08:23
stubspm: There is --logfile available right now to log elsewhere.08:23
lifelessStevenK: login()08:23
spmstub: I'm sure we've used that elsewhere and it doesn't quite work as described...08:24
StevenKlifeless: But why does the first call to webservice.get() work fine?08:24
lifelessyou're already logged in08:24
StevenKAnd how does that log me out?08:24
lifelessend of the request08:24
lifelessfeel free to clean this up08:24
stubspm: That would be a bug (not surprising as nobody ever used --logfile after it got implemented 5 years ago)08:24
stub  --log-file=LOG_FILE  Send log to the given file, rather than stderr.08:25
spmAhh. That's not helpful as is. what you want is all "normal output" to go to stdout, or the above option. any *real* errors that *REQUIRE* manual intervention get sent to STDERR.08:26
spmatm we get craploads of "INFO" messages or "CRITICAL ERRORS" that are nothing of the sort, sent to STDERR08:26
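[editor's note] What spm is asking for (normal output on stdout or a log file, only real errors on stderr) can be sketched with the standard `logging` module. This is an assumption-laden illustration, not how the Launchpad script logger is built; the helper and filter names are made up.

```python
import logging
import sys


class BelowError(logging.Filter):
    """Let through only records less severe than ERROR."""

    def filter(self, record):
        return record.levelno < logging.ERROR


def make_script_logger(name, out=None, err=None):
    """Build a logger that splits routine output from real errors."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers = []  # start clean if called twice for the same name

    # INFO and WARNING go to stdout (or whatever stream is passed in).
    normal = logging.StreamHandler(sys.stdout if out is None else out)
    normal.addFilter(BelowError())
    logger.addHandler(normal)

    # ERROR and CRITICAL go to stderr, which is what cron mails out.
    errors = logging.StreamHandler(sys.stderr if err is None else err)
    errors.setLevel(logging.ERROR)
    logger.addHandler(errors)
    return logger


logger = make_script_logger("demo-cronscript")
logger.info("processed 12 branches")      # stdout
logger.error("database is unreachable")   # stderr
```

With this split, cron only sends mail when something lands on stderr, so redirecting stdout to a file is enough to quiet the routine messages.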
lifelessspm: icing on edge is borked08:26
pooliestub: yeah we were just talking about this08:26
poolielifeless: me too08:26
lifelessspm: we haven't got that part of the deployment right yet08:27
stubspm: ok. That is a change, but we could do it globally for all scripts at once. The desired behaviour will need to be documented (bug report?)08:27
spmlifeless: bleh. edge is autodeploying atm08:27
lifelessspm: need a new RT ?08:27
pooliei'm also getting a 'The following errors were encountered:08:27
poolieServer error, please contact an administrator.08:27
spmstub: i think this is a bug I logged about 2 years ago... :-)08:27
pooliein an ajax thing08:27
spmpoolie: I can't do anything until you contact me per the above error. sorry.08:27
stubspm: Yes - I was expecting production scripts to all be run with -q08:27
lifelesspoolie: thats possibly/probably the thing we're deploying to fix08:27
poolielike a finely-oiled machine08:28
* spm watches poolie hop on his bike to drive down here and slap me upside the head....08:28
pooliei would but it's a bit cold and wet08:28
poolieand probably doubly so down there08:28
spm"horrible" <== and not just saying to keep you away08:29
lifelessspm: so now I have 11542 with no icing08:30
lifelessspm: can you check the apache ?08:30
spmhrm. supposedly we *are* 11542 everywhere08:31
lifelessbut the icing ain't08:31
spmisn't this the build farkup we saw the other week?08:31
lifelessspm: thats meant to be static, from apache08:31
spmle sigh08:31
StevenKlifeless: With webservice.get(), login(), webservice.get() I get newInteraction called while another interaction is active. for the second .get08:31
lifelessStevenK: odd08:32
lifelessStevenK: perhaps the get() isn't what was throwing.08:32
lifelessStevenK: perhaps you're actually trying to access something in between the ( and )08:32
StevenKlifeless: Of course I am08:32
lifelesscalculate the url outside of the function call08:32
lifelessbecause ...08:33
lifelessaccessing objects requires a participation08:33
spmoh awesome. that file doesn't exist on edge.08:36
lifelesssay what ?08:36
stublifeless: I'll land the branch I have with the abspath and maybe the timeout if it is simple, as it is still an improvement, and open bugs and kanban tickets on the next set of changes.08:36
lifelessstub: thanks08:36
spmlifeless: exactly that. the folder is there, that particular file (and possibly some small number of others) aren't.08:36
lifelessspm: they are built by 'make compile'08:36
lifelessor possibly make build08:37
spmapparently not in this case....08:37
spmoh ffs. the make build blewup again.08:39
lifelessspm: does the deploy script abort when that happens ?08:39
wgrantNot my fault, this time, though!08:39
spmlifeless: https://pastebin.canonical.com/37132/08:39
spmthe script can't - the make is continuing, so the deploy script doesn't know it's aborted08:40
spm(AIUI, IMBW)08:40
lifelessspm: filing an RT - it has to.08:40
spmNo. I lie. It does see the error.08:40
lifelessmake: *** [compile] Error 108:40
lifelessError 2 running ssh launchpad@banana make -C /srv/edge.launchpad.net/edge/launchpad build LPCONFIG=edge108:40
lifelessRunning ssh launchpad@banana "rm -rf /srv/edge.launchpad.net/edge/launchpad && ln -s /srv/edge.launchpad.net/edge/launchpad-rev-11542 /srv/edge.launchpad.net/edge/launchpad"08:40
lifelessits not halting !08:40
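[editor's note] The failure mode in the paste above is a deploy that sees "Error 2" from `make build` and then carries on to swap the symlink anyway. A minimal sketch of the halting behaviour lifeless expects, assuming the steps are plain command lines; the function name is hypothetical and this is not the actual LOSA deploy script.

```python
import subprocess


def run_deploy_steps(steps, run=subprocess.check_call):
    """Run each command in order and stop at the first failure.

    `subprocess.check_call` raises CalledProcessError on a non-zero exit
    status, so later steps (such as repointing the symlink) never run
    after a failed build.
    """
    done = []
    for argv in steps:
        run(argv)
        done.append(argv)
    return done
```

A caller would list the build step before the symlink swap and let the exception propagate, so a broken build leaves the previous tree in place.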
spmyeah x 308:41
spmok. later problem. reverting.08:41
lifelessspm: why was it 11542 that rolled out ?08:43
spmno idea atm08:43
lifelessok, thats tip of stable08:43
lifelessfair enough (but wtf with the error)08:43
spmoki, apaches rolled back; doing the app servers08:43
adeuringgood morning08:43
spmtruly. it's supposed to abort on errors. we use this logic all over the place. And it works on other systems :-(08:44
lifelessspm: can you do 11538 which I think was previous with the patch applied; and we may need to stop other edge rollouts till we fix.08:44
spmheya adeuring08:44
spmlifeless: that I have/am08:44
spmlifeless: launchpad@banana:/srv/edge.launchpad.net/edge$ rm launchpad ; ln -s /srv/edge.launchpad.net/edge/launchpad-rev-11538 launchpad08:44
lifelessspm: I bet its python2.508:46
spmlifeless: shrug, I just blame wgrant for everything. faster, easier, if less accurate08:46
lifelessspm: can you do this on a machine thats still 2.5 ?08:46
wgrantBut it *was* me (and python2.5) last time this happened.08:46
wgrantSo it's quite accurate.08:46
spmhaha. lets not let FACTS get in the way here!!!!08:47
lifelessspm: find . -name 'potemplate.py'08:47
spmlet me finish getting the apps restarted :-)08:47
lifelessspm: then, for each reported file, cd to that dir and run 'python -c 'import potemplate'08:47
spmedge3 done08:47
spm2010-09-14 07:48:01 WARNING SIGTERM failed to kill launchpad (7487). Trying SIGKILL <== yay. it's back! wooo!08:48
spmedge4 coming back08:49
spmedge1 coming back08:50
stubIs network syslog loathed by IS?08:51
spmedge2 on the way back08:51
spmstub: i don't mind it; no idea about others tho08:51
spmedge1 & 4 being difficult and not working08:52
spmedge2 is fine08:53
spmedge1 being really painful and needing to be manually killed.08:54
spmedge1 stabbing successful; it lives08:54
spmretrying edge4...08:54
spmedge5 coming back08:56
spmedge4 lives08:56
spmedge5 lives; should be done. verifying.08:56
spmlifeless: have you logged a bug on this essplosion?08:58
lifelessspm: urls like this: https://bugs.edge.launchpad.net/+icing/rev11542/combo.css - how are they served.08:58
lifelessspm: I RT'd it08:58
lifelessspm: for the explosion part08:59
spmthat's the continue, but not the root cause?08:59
lifelessspm: waiting on your python2.5 test for confirmation08:59
spmah k08:59
spmlifeless: so to recap a bit back - CP'd to prod; not to edge.09:01
spm*cowboyed* not CP'd.09:02
lifelessspm: right, can we do edge 11538 cowboy, not 11542 cowboy ?09:02
spmI'll throw that to Tom I suspect09:02
lifelessrev 11542 looks like the bust one09:03
lifelesswhich is jtv's patch09:04
lifelessjtv: you appear to have used python 2.6 in lib/lp/translations/browser/potemplate.py line 97109:05
* jtv looks09:05
lifelessjtv: look at spm's pastebin09:05
jtvWonder what's wrong with it…09:06
jtvlifeless: want me to write up a quick patch?09:08
jtvhi mrevell09:10
jtvlifeless, spm: I'd fix it thusly: http://paste.ubuntu.com/493508/09:10
lifelessspm: ping09:12
spmlifeless: yo09:13
lifelessspm: need you to try applying jtv's patch to a 11542 dir (one of the failed ones)09:13
lifelessand see if 'make build' will then work.09:13
lifelessjtv: could you please do a few things for me, its getting on here.09:13
lifeless - file a bug that this is broken,09:13
jtvlifeless: speak09:13
* jtv files bug09:13
lifeless - put your branch up for review etc etc - r=me to apply it, if spm confirms it works.09:13
lifeless - arrange for someone to let the LOSAs know when it lands in stable, so that edge updates can be reenabled.09:14
lifeless - (e.g. yourself, or your delegate)09:14
jtvOn the way.09:14
lifelessI will mail the list about the process issue09:14
spmtrying atm...09:15
spmtry2 with right config...09:15
jtvbug 63786809:16
lifelessjtv: did you ec2 your test ?09:17
spmlifeless: I'd suggest that cowboy gets rolled in with jtv's fix and just a regular edge rollout rolled.09:17
jtvlifeless: not yet, not yet09:17
lifelessjtv: sorry, let me be more clear09:17
lifelessjtv: the patch that broke; did you land it:09:17
lifeless  - by running all tests locally + pqm09:18
jtvec2 land.09:18
lifeless - by ec2 land09:18
lifeless - ...09:18
spmjtv: lifeless: https://pastebin.canonical.com/37134/ looks good09:19
jtvspm: still looking good?09:25
jtvThe codebase, I mean, not you.  You'll always look good.09:26
lifelessspm: that looks good; thanks. jtv your patch fixes it.09:27
jtvBTW it's odd that this passed PQM, what with the pagetests exercising it.09:27
lifelessjtv: pqm doesn't run tests.09:27
jtvSorry, buildbot.09:27
lifelessjtv: ec2 runs them, but its probably running lucid09:27
wgrantbuildbot is Lucid.09:27
lifelessbuildbot has two separate jobs.09:27
jtvWell, we have lucid and hardy buildbot slaves.09:27
lifelessjtv: see my mail09:28
* jtv will see mail09:28
lifelessI don't think bb requires *both* to be ok.09:28
lifelessbut thats what we probably need09:28
jtvBTW should I MP the fix for stable or for devel?09:29
jtvOK.  I branched off stable though just to be sure.09:30
lifelessnow that edge rollouts are blocked, theres no panic (no reason to delay) but no panic.09:30
jtvlifeless: the MP is at https://code.launchpad.net/~jtv/launchpad/bug-637868/+merge/3537309:31
wgrantlifeless: ec2 has been running Lucid for a long time.09:31
wgrantI think this is about the third time things have broken.09:32
lifelesswgrant: definitely second.09:32
lifelesswgrant: bug 63785409:33
wgrant_mup_, I am disappoint.09:34
wgrantlifeless: At least it tries to prejoin.09:34
lifelesswgrant: ugh!09:34
jtvlifeless: so now I can land on devel as normal and just wait for the fix to percolate?09:35
jtv(I would appreciate a click on the button from you btw, to prove I didn't invent your approval :)09:36
lifelessI did09:36
jtvOh!  The MP just timed out for me is all09:36
lifelessyou need to let mthaddon know when its good to go on stable09:36
lifelessjtv: interesting, what OOPS id ?09:36
jtvI already ran the applicable pagetests through ec2… guess a full EC2 run makes no sense here.09:36
jtvlifeless: I don't know; focused on fixing my bug, so just reloaded09:37
bigjoolsthere's a certain person posting comments on that one09:53
wgrantLet me guess...09:53
spivwgrant: you're psychic, clearly.09:55
wgrantHe does have some good points, as usual.09:59
lifelessand they are clearly unbiased10:04
lifelesswhich is refreshing10:04
bigjoolsshame it's the same sound of that grinding axe10:06
jtvSpeaking of grinding axes…10:10
jtvThe builds-list.pt template is supposed to work for any BuildFarmJob but it tries to access build/dependencies.  :(10:11
jml"No longer needed: Python 2.5"10:12
wgrantjtv: I'm glad you're completing the generalisation for us :P10:12
jtvwgrant: remember that axe I mentioned  just now?  Kindly insert it into one of your feet.10:12
jtvAnd thank you.10:15
* wgrant limps away viciously.10:15
bigjoolsjtv: then return None10:17
bigjoolsyou don't have any dependencies10:17
wgrantIt's not on the interface.10:17
wgrantAssuming it is illegal.10:17
jtvNo, that's the nasty bit.10:17
jtvIt's in IPackageBuild.10:17
jtvSo _implementing_ it in BuildFarmJob or BuildFarmJobDerived or my own specific buildfarmjob class isn't enough.10:17
jtvIt needs to move into IBuildFarmJob, which is uglier than a Windows desktop.10:18
bigjoolsat least Windows has sound that works10:18
jtvTo name some arbitrary example of extreme ugliness.10:18
jtvYes, Windows often has working sound, so you can _hear_ the "I don't know this codec" error instead of just seeing it.10:19
jtvBut we digress.10:19
jtvHere we are chattering about operating systems when wgrant's foot is oozing virtual blood.10:20
jtvHelp the man, for God's sake!10:20
wgrantbigjools: Your sound isn't working?10:21
wgrantI think mine might be slightly crackly on Maverick.10:21
wgrantBut it works mostly.10:21
bigjoolsmaverick re-installed pulseaudio10:21
jpdsjtv: He's in AU, dude.10:21
bigjoolspulseaudio is a crock of shit10:22
jtvjpds: pronounced "Ow!!!"10:22
wgrantbigjools: WFM!10:22
wgrantAlthough I could be distracted by my poor, poor foot.10:22
bigjoolsit's insisting on using my laptop instead of my headset's mic10:23
jtvwgrant: nice save10:23
bigjoolsI've no idea how to make it do what *I* want instead of what *it* wants10:23
jtvMeanwhile, I just managed to insinuate a TranslationTemplatesBuild into a builder history!10:23
wgrantbigjools: Kubuntu doesn't have a nice control panel for it?10:23
wgrantjtv: But does it crash?10:24
jtvwgrant: no.  Not that anyone'd notice: I don't see anything in the build code that would put a BuildFarmJob into the right state to show up there.10:24
bigjoolswgrant: I dunno, what is it in Gnome?10:24
jtvbigjools: a crock of shit.  Trick question.10:25
wgrantbigjools: The sound preferences thing (in the volume indicator's menu) has radio buttons for the input device.10:25
* bigjools hears 2 drums and a cymbal falling off a cliff10:25
allenapHi jml, thanks for looking into my Zope befuddlement. I think mwh's reply has now hit the nail on the head, so I'll wikify that and reply to the list.10:25
jtvAFAICS the Builder still selects and dispatches a BuildQueue object, not a BuildFarmJob.  How's it ever going to update BuildFarmJob.{status,date_started,date_finished}?10:26
bigjoolsfinally it works10:27
jmlallenap, cool.10:27
wgrantjtv: It's complicated.™10:28
wgrantBut it works.10:28
jtvwgrant: well since there is absolutely currently nothing coupling my BuildFarmJobs to my BuildQueues, I don't see how it can.10:28
bigjoolsso, when using pulse with skype, how do I make it ring the PC speakers and not in the headphones, which I might not be wearing? :/10:30
wgrantjtv: I believe it's handled by the IBuildFarmJobBehavior.10:30
wgrantjtv: IIRC you override updateBuild_WAITING.10:30
wgrantThe default calls handleStatus on the build.10:31
jtvI think we already override that.10:31
wgrantI mean you do currently.10:32
wgrantBut you should probably stop.10:32
jtvThat could hurt.10:32
wgrantIt will.10:33
jtvHand me that axe, will you?10:33
jtvThank—eewwww, there's blood all over the blade10:33
jtvAnyway, for now <puts axe aside> I guess it's enough to get display working and then next we can focus on tying the BuildQueue and the TranslationTemplatesBuild together.10:34
bigjoolswgrant: the discussion in https://bugs.launchpad.net/bugs/635103 is a little over my head at the moment, do you know why it's not working for him yet fine in Ubuntu?10:46
wgrantbigjools: He wants to not have to download and upload the whole thing.10:47
wgrantTo do that we'd need an ia32-libs-specific hack, to support the conglomeration of horrid hacks that is ia32-libs.10:47
wgrantI was going to tell him to go away. But you're probably a better person to do it :P10:48
bigjoolswgrant: possibly, but I don't understand what that package is doing10:49
wgrantbigjools: Do you *really* want to know?10:49
wgrantThere is a reason that the source packages is 700MB.10:49
wgrantIt contains approximately an awful lot of packed source packages.10:50
wgrantIt builds on amd64, but builds them for i386.10:50
wgrantEr, wait, no, it includes the binaries too.10:51
wgrantSo it doesn't build them.10:51
wgrantIt extracts the i386 binaries, and produces a big amd64 ia32-libs binary containing all of them.10:51
wgrantSo you have this huge source package containing dozens or hundreds of sources and binaries from the archive.10:51
wgrantEr, yes.10:51
bigjoolsdot com10:51
wgrantIt will be rendered obsolete by multiarch.10:52
wgrantBut multiarch hasn't happened yet.10:52
lifelessmultiarch was 'coming' when we -started-10:52
bigjoolsdem's de magic words10:52
wgrantlifeless: it's seriously in development now, though.10:52
wgrantSome of the work has been done in the last year.10:52
wgrantHell, even NMSP is happening.10:52
lifelesswgrant: I know.10:52
wgrantAnd derivative distros.10:52
wgrantThis is incredible.10:53
bigjoolsI can only lever so much of the planet with the team size I have10:53
jmlbigjools, well, to be true to the metaphor, you just need a better place to stand11:02
bigjoolsjml: I'll jump higher :)11:03
bigjoolsjml: when are you arriving here BTW?11:06
jmlbigjools, Sunday. Let me check my ticket.11:06
bigjoolsyou bought train ticket in advance?  gosh :)11:07
bigjoolsinsert preposition as required. Sigh.11:07
jmlI think you mean "article", and yes I did.11:08
jmlit's hard to get out of the habit of booking travel in advance11:08
wgrantIs this for the buildd-manager attack session?11:09
jmlindeed it is.11:09
jmlbigjools, anyway, I'll be taking an afternoon train, probably the 144211:11
bigjoolsjml: ok well when you get ensconced in the pub, gimme a shout and I'll pop over for a pint11:13
jmlbigjools, will do.11:14
wgrantbigjools: I've received lots of complaints in the last few days that builds keep getting redispatched.11:14
wgrantEven on non-virt builders.11:14
bigjoolsjml: there should be taxis at the station but if there are not let me know and I'll come and pick you up11:14
jmlbigjools, thanks.11:15
bigjoolswgrant: the UI is a lie11:15
bigjoolsthe early commit, is not :/11:15
lifelessjml: can you do me a favour?11:15
wgrantEven if it is committing before it confirms successful dispatch, why is the dispatch not successful?11:16
jmllifeless, quite possibly.11:16
bigjoolswhat's happening is that we mark the build as running before it's completely dispatched.  If there's a comms error then it looks like it gets re-dispatched after the next builder picks it up11:16
lifelessjml: my pqm-landed (nonec2) branch has a test failure11:16
lifelessjml: its -extremely- shallow.11:16
lifelessjml: (add the missing tuple)11:16
wgrantbigjools: But there shouldn't be comms errors :(11:16
lifelessjml: however the fix needs to be done to production-devel too.11:17
bigjoolswgrant: we don't live in a perfect world11:17
lifelessjml: before the oops fix can be uncowboyed.11:17
bigjoolsrouters drop out, DC engineers kick cables11:17
lifelessjml: yes/no ?11:17
wgrantbigjools: Over 20 minutes?11:17
jmllifeless, you'd like me to land the fix for you?11:17
* gmb hates at typos in sampledata11:17
gmb'testible' indeed.11:18
lifelessjml: yes, on devel and production-devel11:18
jmlgmb, it's a pun!11:18
bigjoolswgrant: yes, that's the interval I see because of the bad scaling11:18
lifelessjml: its 22:20 here, more or less11:18
gmbjml, I noticed that about half a second after pressing enter :)11:18
wgrantbigjools: Ahh, true, forgot that bit.11:18
jmllifeless, ok, will do.11:18
lifelessjml: thank you!11:18
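[Editor's sketch] The redispatch problem bigjools describes above (a build is marked as running before the dispatch is confirmed, so a comms error leaves a misleading status) can be illustrated roughly as follows. This is a minimal sketch, not buildd-manager code: `Build`, `CommsError`, `send` and the status strings are all hypothetical names chosen for illustration.

```python
class CommsError(Exception):
    """Raised when the (hypothetical) builder cannot be reached."""


class Build:
    """Hypothetical build record with a simple status field."""

    def __init__(self, name):
        self.name = name
        self.status = "NEEDSBUILD"


def dispatch_early_commit(build, send):
    # The status is recorded before the builder confirms receipt.
    build.status = "BUILDING"
    send(build)  # may raise CommsError, leaving the status misleading


def dispatch_confirmed(build, send):
    # Record BUILDING only after the builder has accepted the job.
    try:
        send(build)
    except CommsError:
        build.status = "NEEDSBUILD"
        return False
    build.status = "BUILDING"
    return True
```

With the second variant a failed dispatch leaves the build visibly waiting, so a later redispatch is expected rather than surprising.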
jtvAre we going into testfix?11:37
jtvThe lucid_lp buildbot just failed.11:37
jtvIs failing, rather.11:37
jtv   lib/canonical/launchpad/webapp/ftests/test_adapter.txt11:37
jtvLine 305, in test_adapter.txt Failed example:11:38
jtv     get_request_statements()11:38
jtvDifferences (ndiff with -expected +actual):11:38
jtv     - []     + [(0, 0, 'SQL-launchpad-main-master', 'SELECT 2')]11:38
wgrantIs that what lifeless was talking about above?11:40
jmlwgrant, yes.11:44
jmlI wonder why emacs is segfaulting for me.12:01
thumperjml: because it hates you :)12:02
bigjoolsit's telling you to use a real editor12:04
deryckMorning, all12:04
bigjoolshowdy deryck12:04
jmlbigjools, yeah, you're right. I've had to revert to "emacs -nw"12:04
jmlderyck, hello12:05
thumperjml: ?? whazzat?12:05
jmlthumper, try it!12:05
thumpernot right now12:05
jmlthumper, ok, I give up, it's emacs in a terminal12:05
thumperI'm trying to right a talk12:05
thumperah.. no windows, it's so obvious12:05
bigjoolsthat's either a typo or a clever play on words12:05
thumperbigjools: which one?12:06
jmlthumper, "righting" a talk.12:06
bigjools"right a talk"12:06
* thumper hangs head12:06
thumperit is a typo12:06
thumperI'd like to be more cleverer12:06
=== al-maisan is now known as almaisan-away
=== matsubara-afk is now known as matsubara
jmlI'm off for lunch & errands. Back later.13:25
=== almaisan-away is now known as al-maisan
cr3leonardr: when you have a moment, I would have a question for you about routes when exposing a restful interface with lazr14:31
leonardrcr3, sure14:34
leonardrroutes as in the url traversal code?14:34
cr3leonardr: yes, how can a collection be contextual? for example, lets say LP had /me/bugs and /project/foo/bugs, where both person and project would implement IHasBugs, how should the "bugs" part of the url be defined?14:35
leonardrcr3: that's called a scoped collection, and lazr.restful traverses from 'leonardr' to bugs or from 'mozilla' to bugs by attribute access on the person or project object14:36
leonardrso leonardr.bugs is /~leonardr/bugs14:36
cr3leonardr: ah, so it must be defined as an attribute, I thought it might be ProjectNavigation in the browser layer or perhaps even using the Bag14:37
leonardrcr3: no, once you have identified a specific object all further traversal happens through attribute access14:37
cr3leonardr: would it make sense to have the IHasBugs define a "bugs" attribute?14:38
leonardrcr3: afaik, yes14:39
cr3leonardr: but if IHasBugs has a searchBugs method already which should essentially behave like the bugs attribute, given no parameters, then wouldn't searchBugs and bugs look a lot the same?14:40
cr3leonardr: my concern is that every class implementing IHasBugs would essentially have to do something like: @property; def bugs(self): return self.searchParams();14:42
leonardrcr3: well, you don't _have_ to put 'bugs' in IHasBugs if different implementations get the bugs differently14:43
leonardrbut, i have two things to say on top of that14:44
leonardroh, never mind, you're saying that all the IHasBugs feature 'bugs'14:44
leonardrbe that as it may, /bugs is better for the end-user than ?ws.op=searchBugs14:45
leonardrhowever, there's nothing to be done about that for now14:45
leonardrmy next project will include things like14:46
cr3leonardr: I was mostly using IHasBugs as an example for a collection which might be implemented by more than one context14:46
leonardrcr3: sure, i know you're not really talking about IHasBugs. but i'm trying to deal with the situation as you posed it14:47
leonardrmy next project will include features like the ability to designate a method as being "the method you call to generate a scoped collection"14:47
leonardrso you could tag searchBugs as the generator for /bugs14:47
cr3leonardr: I was grepping through lazr for the concept of an alias, like "bugs" is aliased to searchBugs or somesuch14:48
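[Editor's sketch] The idea discussed above, where a scoped collection such as /bugs is reached by plain attribute access and that attribute is just the search method called with no parameters, might look like this in plain Python. It deliberately does not use lazr.restful; `HasBugsMixin`, `Person` and `traverse` are hypothetical names used only to show the shape of the pattern.

```python
class HasBugsMixin:
    """Hypothetical mixin: `bugs` is searchBugs() called with no filters."""

    def searchBugs(self, **params):
        raise NotImplementedError

    @property
    def bugs(self):
        return self.searchBugs()


class Person(HasBugsMixin):
    """Toy context object holding bugs as dicts with a 'status' key."""

    def __init__(self, name, bugs):
        self.name = name
        self._bugs = list(bugs)

    def searchBugs(self, status=None):
        return [b for b in self._bugs
                if status is None or b["status"] == status]


def traverse(obj, path):
    """Follow each URL segment by attribute access on the current object."""
    for segment in path.strip("/").split("/"):
        obj = getattr(obj, segment)
    return obj
```

Putting the property on a shared mixin is one way around cr3's concern that every implementer would have to repeat the same three lines.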
henningesinzui: ping15:01
sinzuihi henninge15:02
henningeI am a bit at loss at how to downgrade a package.15:02
henningepsycopg2 in this case15:02
sinzuihenninge, do you have the deb?15:02
henningesinzui: I have done that once or twice before but I could use a pointer, please ;-) ?15:03
henningeNo, I was just searching for that.15:03
* sinzui checks lp history15:03
sinzuihenninge, download the deb from here: https://edge.launchpad.net/ubuntu/lucid/i386/python-psycopg2/2.0.13-2ubuntu215:04
sinzuihenninge, sudo dpkg -i --force-downgrade python-psycopg2_2.0.13-2ubuntu2_i386.deb15:05
henningesinzui: thank you very much!15:06
* henninge actually forgot to look on LP for the package ...15:06
sinzuihenninge, i decided not to pin the version. I hold out some small hope that lp or psycho will resolve their differences. I downgrade after every update breaks lp15:06
henningesinzui: yes, otherwise one might forget about the pinning and hit strange errors later ...15:07
gmbrockstar: So, deryck solved our JS wizard problem :)15:52
rockstargmb, that's because deryck is awesome.15:52
rockstargmb, what was the issue?15:53
gmbrockstar: Two things: 1) YUI auto-generates the CSS class names based on WIDGET.name - so the hidden class was yui3-lazr-wizard-hidden, which wasn't defined anywhere.15:53
gmbAlso, widgets that aren't created as hidden can never *be* hidden.15:53
gmb(At least, that's how it behaves; I suspect there's a bug there)15:54
rockstargmb, wait, how was it never created?15:54
rockstargmb, and it's Widget.NAME, isn't it?15:54
gmbrockstar, Yeah, sorry, bad caps.15:54
gmbrockstar: Let me just check the patch deryck gave me so that I know I'm not BSing you.15:54
gmbrockstar: http://pastebin.ubuntu.com/493666/15:55
rockstargmb, okay, so I had it defined as Wizard.NAME = "wizard"; so I don't know where the lazr comes from either.15:55
gmbrockstar: The wizard was created but by default visible was True.15:55
gmbAt least, that's how it looks based on deryck's patch.15:55
rockstargmb, okay.15:55
rockstargmb, I'm not sure I understand the bottom patch though, to wizard.js.15:56
gmbrockstar: Yeah. lp:~gmb/lazr-js/wizard-widget/ contains deryck's fix and some further CSS fixes.15:56
rockstargmb, I guess he's just demonstrating that there's missing CSS somewhere?15:56
gmbI don't know if you need to do more with it.15:56
rockstargmb, well, it needs to get finished now.  If it's firing events, it can probably start moving through steps now.15:56
gmbWell, I don't know. That seems to be related to the way the widget behaves... deryck, can you clarify exactly why your fix fixes?15:56
deryckrockstar, yeah, that's all.  The CSS in use currently assumes the widget name is "lazr-formoverlay" and it wasn't set to hide by default.15:57
rockstarderyck, okay, so we can't reuse that name, so we need to just define yui3-lazr-wizard-hidden in the CSS then?15:57
deryckrockstar, yui3-NAME-hidden, where NAME is what you define in the widget.  This is how all those CSS classes get built.15:59
rockstarderyck, yeah, so it should have been yui3-wizard-hidden15:59
rockstarderyck, and I thought I had defined that.15:59
deryckrockstar, right.  There is a yui3-NAME for every class this widget descends from.  But only the current NAME gets the hidden class added.15:59
rockstarderyck, yeah, okay.16:00
deryckrockstar, you did, but the CSS was not using it.  And you couldn't tell because you weren't hiding by default with visible: false.  So changes to the name had no effect.16:00
rockstarderyck, ah, that makes sense.16:00
rockstarSo if it's not hidden by default, it can't be hidden again.16:00
rockstarThat is, quite possibly, one of the silliest things I've ever heard.16:01
gmbrockstar: Yeah, I needed to mop the tea off my monitor when deryck told me.16:01
rockstargmb, I'm digging in my junk drawer for a baby to punch as we speak.16:01
deryckwell, no, not quite accurate16:02
deryckrockstar, wizard.render() shows the widget without the hidden class.  Nothing in your code calls wizard.hide().16:03
rockstarderyck, the defaultCancel should have been doing it.16:03
deryckrockstar, I don't think so.  That's only called after UI changes, AIUI.16:03
deryckat least, my reading of the code.16:03
rockstarderyck, no, it's called when the CANCEL event is fired.  I know it was being called, because that's where the Y.log("aoeu") was.16:04
rockstarI think you and I both confirmed that the -hidden CSS class was getting set as well.16:04
deryckright, but that's not called on load.16:04
rockstarderyck, yeah, so I either have to hide by default and call .show() in the example, or call .hide() and then .show() on load.16:05
deryckCANCEL event is only fired by ESC or clicking away from the widget, no?  I never saw "aoeu" until I did some action, not on load.16:05
rockstarderyck, I saw it when I clicked the cancel button.16:05
rockstarYeah, okay.  So what you're saying is what I'm understanding.16:06
rockstarStupidness prevails.16:06
rockstarThanks for sorting it out.16:06
derycknp :-)16:06
rockstarWe need a page on the wiki of all the YUI gotchas.16:06
rockstarWe'll probably just forget the page exists and create it 4 more times, but whatever.16:06
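[Editor's sketch] The naming rule deryck describes (a yui3-NAME class for every class the widget descends from, with the -hidden suffix applied only to the current NAME) is modelled below in Python purely to make the rule concrete. It is not YUI code, and `widget_css_classes` and the example names are assumptions.

```python
def widget_css_classes(name_chain, visible, prefix="yui3"):
    """Model the class names for a widget.

    name_chain lists NAME values from the base class to the current class.
    """
    classes = [f"{prefix}-{name}" for name in name_chain]
    if not visible:
        # Only the most-derived NAME gets the hidden class.
        classes.append(f"{prefix}-{name_chain[-1]}-hidden")
    return classes
```

Under this model a stylesheet that only defines the hidden rule for a different NAME never matches, which is consistent with the symptom described above.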
salgadojcsackett, I was going to have a second look at your unknown-blueprints-service-597738 branch but noticed there's some discussion still going on, and I'm wondering whether you're going to do any more changes to it or if it's ready for a second look16:07
jcsackettsalgado: i'm still working on it.16:19
jcsacketti actually needed to add an attr to the view, so i need to write some tests as well.16:19
jcsackettsalgado: i'll ping you and sinzui when i've pushed changes for round 2.16:21
gmbrockstar: So - for my own clarity - are you now going to do further work on the wizard to make it do wizardly things properly?16:22
rockstargmb, I _can_, but it sounds like it's blocking you, and I'd like to avoid that as much as possible.16:23
gmbrockstar, Right. That sounds fair. In that case, I'll get cracking on getting it doing what we need and ping you if there are any further issues.16:24
gmbrockstar: Is there a specific bug or LEP that your work so far is tied to? I don't want to make something that doesn't do what you need it to do.16:24
rockstargmb, does the overall design make sense?16:24
rockstargmb, I had a kanban card with the work on it.16:25
rockstar(because we like to track our work in many different places)16:25
gmbrockstar: Yes. In fact, it's pretty much exactly what I had in mind for my hack, although that was less elegant :)16:25
gmbrockstar: Ah, cool. I shall go and find it.16:25
salgadojcsackett, ok16:26
deryckrockstar, hey, I think all the python was added to lazr-js for the testing story.  To hook up the yui-unittest stuff with zope test runner.16:42
rockstarderyck, I'm not so sure.  Our testing story uses Java.  It can be fired up from the shell.16:45
deryckah, ok.  Maybe not then.16:45
deryckI thought that was why.  Why all the Zope packages then?16:45
deryckand storm and lazr.restful.  god good y'all. ;)16:46
rockstarderyck, so, the testing story does use lazr.testing, but it doesn't need to.16:52
rockstarderyck, also, it could be used for the lazr-js testing, but not have to be distributed to our projects as well.16:52
deryckrockstar, just thinking more....16:55
deryckrockstar, some of what the egg provides us is the js lint stuff.... perhaps that could be broken out into its own package.... separate the testing, python utils, and js file building stories a bit?  Smaller simpler packages?16:56
rockstarderyck, yes.16:56
rockstarderyck, the problem is that we've tied ourselves to a buoy with no anchor, so we experience pain anytime we want to change anything.16:56
deryckyeah, maybe it's not easy to do a neat and clean separation.16:57
rockstarderyck, the launchpad build system is too closely tied to the lazr-js build system, which makes it exponentially more complicated.16:57
jmlclearly you should attach a sail to your buoy16:57
jmlor something.16:57
rockstarjml, engineering fail.  :)16:57
deryckrockstar, it also makes adoption of lazr-js by any other web project outside Canonical difficult or impossible.16:58
rockstarderyck, exactly.16:58
jmlrockstar, you're the one who's in pain and stuck to a buoy!16:58
rockstarjml, no, my boat is stuck to a buoy.16:58
rockstarjml, also, while the launchpad team does blow a lot of hot air, I think we want engines, not sails.  :)16:59
jmlrockstar, ok, as long as it's an electric motor16:59
rockstarderyck, I tried to use lazr-js this weekend.  After about an hour, we gave up and used jQuery.  :)16:59
jmlwith batteries charged by a wind farm.16:59
rockstarjml, yeah, because we also want to be environmentally responsible.16:59
rockstarUnfortunately, Google owns the wind farms, so we have to display Google Ads on our boat the whole time.17:00
=== benji is now known as benji-lunch
=== matsubara is now known as matsubara-lunch
=== Ursinha is now known as Ursinha-lunch
=== Ursinha-lunch is now known as Ursinha
mrevellnight all18:06
=== matsubara-lunch is now known as matsubara
=== benji-lunch is now known as benji
=== leonardr is now known as leonardr-away
deryckSome say money is the root of all evil, but it's really notifying subscribers in a web app request.18:54
rockstarderyck, :)19:01
rockstarderyck, sending mail in general...19:01
=== leonardr-away is now known as leonardr
rockstarjames_w, why does Recipe.__str__ need to call .parse() ?19:47
james_wrockstar: it doesn't19:48
rockstarjames_w, so I could write a patch that removes it, and you'd be happy with it?19:49
james_wrockstar: but it's a good way of ensuring that we are not generating malformed recipes in __str__19:49
rockstarjames_w, shouldn't tests be enough?19:49
james_wclearly you have found a case where that breaks down, but I'm not convinced that it warrants getting rid of that19:49
james_wI would remove it if I could be convinced that there are many more cases where it isn't going to be a good thing19:50
james_wrockstar: yes, they /should/ be enough19:50
rockstarjames_w, I can't foresee any other issues, but if I could foresee bugs, I would fix them before they became bugs.19:52
rockstarjames_w, I think that, for reads, we should trust that it creates them properly.19:52
rockstarjames_w, if we have bugs, then we can deal with them.19:52
james_wwhy trust when you can verify?19:52
rockstarAs it is, if __str__ ever creates a bad manifest, it'll explode on a user in Launchpad.19:52
rockstarjames_w, I guess the question is "Is Launchpad bzr-builder's most common use case."19:53
rockstarjames_w, I'm all for verifying if it didn't raise an exception the way it does.  I'd like it to warn and move on, but that would be difficult to look for in Launchpad.19:54
james_wbad manifest> explode how?19:54
james_wwarnings> considered that, but who ever pays attention to warnings?19:54
rockstarjames_w, yes, and warnings would be difficult to get out of a webapp.19:55
james_wwell we don't have to use warn(), but still...19:55
rockstarjames_w, you're verifying that functionality you wrote is working properly.  That's noble and all, but if there were a bug, RecipeParser.parse() would raise a rather exception that isn't really the user's fault.19:56
rockstarjames_w, I think the best course of action would be to remove the parse in __str__.  We could have a different method be more strict, but having __str__ be that strict seems odd.19:58
james_whaving it be sure to generate a valid recipe is odd?19:59
rockstarjames_w, in __str__ I think it is.20:00
rockstarjames_w, how 'bout this: Recipe.get_manifest() will always return a valid manifest or raise an exception, while __str__ just returns the manifest, valid or not.20:02
james_wwhy would you ever want to put an invalid manifest in the edit box?20:02
rockstarIn Launchpad, we could then call Recipe.get_manifest(), and if it raises an exception, get the raw string.20:02
rockstarjames_w, I think we have a valid reason to put an invalid manifest in the edit box.20:03
james_wif there is a bug that causes a round-trip to fail, then you will force the user to correct the error, then hit save, which will corrupt it again when displaying it20:03
rockstarWe can catch the exception and add an error box that says "This recipe is totally broken.  Please fix it."20:03
james_wbut you just said yourself that this would only happen for bugs where it isn't the users fault20:03
james_wso don't we want an OOPS if they are bugs?20:04
rockstarjames_w, in this case, it's *kind of* the user's fault, but we let them save the bad data, and now they can't fix it.20:04
rockstarjames_w, but we need a better migration path than oopsing on crap data.20:05
james_wbut I'm now starting to think we should call it a bug and go back and rethink the fix20:05
rockstarjames_w, I've been calling it a bug the whole time.20:05
rockstarBecause to Launchpad, it is a bug.20:05
rockstarAnd I'm trying to solve it for both Launchpad AND bzr-builder.20:05
rockstarjames_w, in this case, the current bug is caused by a bzr-builder bug getting fixed.20:06
rockstarThe user used . as a directory, and that never worked in building.  Now it fails earlier.20:06
james_wyes, but perhaps we should be saying that making the parser more strict without a format version bump is a bug, and should be fixed20:07
rockstarSo the user entered bad data, but we accepted it.20:07
rockstarjames_w, possibly.  I suggested that to abentley, and he had a point that this was always bad data.  It's just that it fails earlier now.20:07
rockstarjames_w, and it never really affected users of bzr-builder itself like it did with Launchpad.20:07
james_wyes, it was always bad data20:08
rockstarjames_w, it's just that now, the user has no way of getting to that data and fixing it themselves.20:08
james_wbut there is a distinction between parse+build in the code, and perhaps trying to conflate them like that isn't the best idea20:08
rockstarSo we need to teach Launchpad how to cope with crap data and encourage the user to fix the crap data.20:09
abentleyjames_w, I *like* validation.  I think we should do *more* of it.  For example, bogus revision specs.20:09
rockstarI'm not saying we shouldn't validate, but we should provide a path for coping with invalid data that doesn't require futzing with the database.20:10
james_wabentley: then please file bugs20:10
abentleyI just think we should distinguish between "well-formed" and "valid", and allow parsing of recipes that are merely "well-formed".20:10
abentleyjames, there's already a bug about that.20:10
james_wrockstar: if the code had always been like this then the bad data could never get in the db. If we have a rule that more strict parsing would always result in a format bump then there are two ways we could get this problem in future:20:11
james_w1. a bug that means that the parser doesn't detect the problem in the first place20:11
james_w2. a bug in __str__ which is why the check is there20:11
james_wor I guess 3. that we want to make it stricter without a bump for some reason20:11
rockstarjames_w, this change I'm proposing is STRICTLY for Launchpad sanity cases.20:12
james_win either the first two cases then there is a bug that we want to know about20:12
rockstarjames_w, have you seen https://bugs.edge.launchpad.net/launchpad-code/+bug/62086820:12
rockstar(this is the bug I'm addressing)20:12
james_wabentley: I don't see a bug20:14
rockstarjames_w, do you understand why that bug exists?20:14
rockstarjames_w, okay, there was a bug, #1 in your case.  It was fixed.20:15
rockstarIt caused existing data (that never really worked anyway) to cause oopses, but not provide a way for the user to fix it.20:15
james_wIt's not #1 in my case20:15
abentleyjames_w, https://bugs.edge.launchpad.net/launchpad-code/+bug/59282120:15
james_wthey were well-formed recipes before, just ones that were never going to work20:16
james_wabentley: ah, so not on bzr-builder20:16
abentleyjames_w, right.  It doesn't have to be done there.20:17
james_wwhy not do it there?20:17
rockstarabentley, I'm not sure I see how your issue and mine go together here.20:17
abentleyjames_w, because I don't know whether such a check is useful to bzr-builder.  Because if it is useful, I don't know why it's not there.20:19
james_wbecause I never thought of checking?20:19
abentleyjames_w, because the set of valid revision specs can vary, and maybe you don't want to get into that.20:19
james_wand maybe Launchpad shouldn't either?20:20
rockstarjames_w, in the case of your #1, yes, the parser wasn't detecting the problem, and now it is (with a newer bzr-builder)20:20
abentleyjames_w, we can guarantee that it won't vary between our appservers and our builders if we choose to.20:20
rockstarBecause it wasn't detecting the problem, the invalid recipe made it into the database.20:21
rockstarNow, with a new bzr-builder, it finds the bad data and oopses.20:21
abentleyrockstar, they go together because they are both issues of validation, where an incorrect value was put into a recipe field.20:21
rockstarLaunchpad needs a way for users to deal with bad data that made it into the data (however that happens) and allow them to change it.20:22
rockstarabentley, okay.20:22
rockstarjames_w, so I'm proposing a change that would help Launchpad by providing a better interface to bzr-builder's Recipe class.20:23
rockstarLaunchpad needs to be more robust.  We can validate 'til the cows come home, but if we don't give users a way to deal with invalid data, then we've only made things worse.20:24
james_wrockstar: I understand that, but I am looking to explore the issue in a little more depth. There's little point in asking the user to correct the problem if the problem was caused by us.20:24
rockstarjames_w, the problem was caused by us, only in the fact that we let them put bad data into the database, but that would never actually succeed.20:25
rockstarjames_w, format bump or whatever needs to happen, I'm happy with.  My big concern, however, is that the user's don't suddenly find out their recipe is broken by finding an oops where their recipe used to be.20:26
* rockstar should probably go eat something, so he stops this egregious use of apostrophes.20:26
james_wrockstar: I agree with you that they should be able to fix bad data that they put in20:27
rockstarjames_w, the patch I propose would do that.20:27
james_wrockstar: I'm arguing that doing this across the board leads to us possibly asking users to "fix" perfectly valid recipes due to bugs in bzr-builder20:27
rockstarjames_w, maybe.  I'm less concerned with that at this point.20:28
james_wso I'm looking for ways for us to separate the two things such that we can ask them to fix bad data, while apologising for bugs and fix them20:28
rockstarjames_w, I will always apologize to the user.  The fact that we're just now telling them they're wrong is a no-no on our part.20:29
james_wall of the examples given so far are recipes where we can perfectly understand the intent, they just aren't going to work20:32
james_wso as Aaron said, splitting well-formed and valid might make sense20:32
rockstarjames_w, which is what I'm proposing.20:33
james_wrockstar: at one level, yes, but we can push it deeper than the patch you are suggesting20:33
rockstarjames_w, here's my patch: http://pastebin.ubuntu.com/493798/20:33
james_wyes, I perfectly understand the change you are proposing20:34
rockstarjames_w, I don't see necessity for going any deeper than that.  If I can catch the exception and somehow say "Hm, this is what you had, but for some reason bzr-builder doesn't think it's valid anymore."  then I'm happy.20:35
james_wsure you are20:36
rockstarjames_w, I'm not sure how much more apart "valid" and "well-formed" you want.20:37
james_wa way of saying to the user "your recipe is well-formed, but these things are likely to be problems:"20:37
james_wat any time20:38
james_wthen we can make the parser more "strict", without causing issues like this, provide better assistance to the user, and still have validation that what we create is at least well-formed20:40
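[Editor's sketch] The split between "well-formed" and "valid" that abentley and james_w converge on could be sketched as a lenient parser plus a separate checker that reports likely problems instead of raising. The one-line `nest NAME BRANCH DIR` format below is a toy stand-in, and `parse`, `problems` and `MalformedRecipe` are hypothetical names, not bzr-builder's API.

```python
class MalformedRecipe(Exception):
    """The text cannot be parsed at all."""


def parse(text):
    """Accept anything well-formed: each line is 'nest NAME BRANCH DIR'."""
    instructions = []
    for lineno, line in enumerate(text.splitlines(), 1):
        parts = line.split()
        if not parts:
            continue
        if len(parts) != 4 or parts[0] != "nest":
            raise MalformedRecipe(f"line {lineno}: cannot parse {line!r}")
        instructions.append(tuple(parts[1:]))
    return instructions


def problems(instructions):
    """Report likely build failures without rejecting the recipe."""
    found = []
    for name, branch, directory in instructions:
        if directory == ".":
            found.append(f"nest {name}: '.' is not a usable target directory")
    return found
```

Because `problems` returns a list, new checks can be added later without turning previously accepted recipes into parse failures.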
rockstarjames_w, yeah, I have no opinions on the overall architecture of bzr-builder.  I would just like something that works today.  If there's a bigger picture, great.20:48
james_wthis has an impact on LP too though20:48
rockstarjames_w, I think it does.  I think it'd be great for user's experience.20:48
james_wit's about user-experience, so I won't let you wash your hands of it as lying outside of LP ;-)20:48
rockstarjames_w, however, right now, the user's experience is "WTF? Why can't I get to my recipe?"  That's my big concern now.20:49
james_wsure, but it's only two recipes?20:49
rockstarjames_w, yes, but that's two more oopses that we don't need.20:50
james_wthe change you propose is a "narrow" interface to likely problems (it can only report one), and it has a poor API to use it everywhere20:50
james_wsure, it's just not a stop-the-line issue IMO20:50
lifelessso the OOPS comes from where?20:50
james_wso if we can we should come up with an API that nicely gives us the better experience and implement that20:51
james_wlifeless: https://bugs.edge.launchpad.net/launchpad-code/+bug/62086820:51
lifelessso maybe I'm confused20:52
lifelesswe used a plain text field to store the recipe didn't we?20:52
abentleylifeless, no.21:00
abentleylifeless, we store the recipe in object form.21:00
lifelesssourcepackagerecipedata ?21:03
abentleylifeless, yes, and the SourcePackageRecipeDataInstructions that refer to it.21:04
lifelessok, I see21:04
lifelessI was wondering if it would make sense, when the recipe is invalid to still permit it to be edited21:04
lifelessuntil it becomes valid.21:04
abentleylifeless, that is what we want to do.21:05
lifelessthen we can handle a wider range of unexpected things like this21:05
lifelessabentley: awesome21:05
abentleylifeless, The problem is that we can't stringify the invalid recipe.21:05
lifelesswhat do we use stringification for ?21:05
abentleylifeless, because bzr-builder checks for validity when it stringifies a recipe.21:05
abentleylifeless, we use stringification for displaying the field to the user so that they can edit it.21:06
lifelessdoes this mean that we can't show the user the invalid recipe21:06
abentleylifeless, yes.21:06
lifelessI see, certainly not going to help things along ;)21:06
lifelessand it helps me understand the chat you were having - thanks.21:06
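[Editor's sketch] rockstar's proposal from earlier in the discussion (a strict accessor that validates, a lenient `__str__`, and an edit form that falls back to the raw text with an error message) might look roughly like this. The `Recipe` class here is a stand-in written for illustration, not bzr-builder's real class, and `edit_box_contents` and `InvalidRecipe` are hypothetical.

```python
class InvalidRecipe(Exception):
    """The stored recipe no longer passes validation."""


class Recipe:
    """Stand-in recipe object holding its instruction lines as strings."""

    def __init__(self, lines):
        self.lines = list(lines)

    def __str__(self):
        # Lenient: serialise whatever is stored, valid or not.
        return "\n".join(self.lines)

    def get_manifest(self):
        # Strict: check the stored lines before returning the text.
        for line in self.lines:
            if line.split()[-1:] == ["."]:
                raise InvalidRecipe(f"bad target directory in {line!r}")
        return str(self)


def edit_box_contents(recipe):
    """Return (text, error) for a hypothetical edit form."""
    try:
        return recipe.get_manifest(), None
    except InvalidRecipe as e:
        # Still show the stored text so the user can repair it.
        return str(recipe), f"This recipe needs fixing: {e}"
```

The point of the fallback is that an invalid stored recipe produces an editable page with a message rather than an OOPS.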
rockstarabentley, http://pastebin.ubuntu.com/493798/21:48
=== al-maisan is now known as almaisan-away
=== ajmitch_ is now known as ajmitch
abentleythumper, http://pastebin.ubuntu.com/493848/22:20
lifelessUrsinha: hi22:23
lifelessoh, I see whats up22:25
lifelessUrsinha: the ec2land stuff22:26
lifelessthat gets bugs from an MP, does it get it from the MP, or the branch ?22:26
mwhudsonlifeless: the mp gets bugs from the branch, unless i'm missing context horribly22:27
lifelessmwhudson: so I use the same branch for domain-fixues22:27
lifelessmwhudson: mps' show bugs already fixed in earlier MP's22:27
lifelessmwhudson: but I need to know which -precisely- the ec2 land code uses, or where to find that code, to make it stop including fix-committed and fix-released bugs in the bugs list in the pqm mail.22:28
mwhudsonah right22:29
mwhudsoni expect the ec2 land code isn't that impenetrable...22:29
lifelessits blatting a growing number of bugs every time i land22:31
Ursinhalifeless, the branch22:34
lifelessUrsinha: hi22:34
* Ursinha looks22:34
Ursinhalifeless, don't know if that works22:35
lifelessUrsinha: 'that' ?22:36
Ursinhalifeless, sorry, let me elaborate22:36
Ursinhalifeless, problem is many times people mention bugs that weren't properly fixed, but had code landed, so bugs that are fix committed or fix released22:36
Ursinhaso for that to work we'd need to ensure that all bugs that have a fix to land are !fix committed/released22:37
Ursinhaotherwise we'll start missing things22:37
Ursinhamy thoughts would be: create another branch22:37
Ursinhathen that won't happen22:38
lifelessUrsinha: hang on a sec.22:38
lifelessUrsinha: ec2 land will error if there are no valid bugs right ?22:38
Ursinhalifeless, if there are no bugs and it's not no-qa, yes22:39
lifelessso, if someone references only fix-committed and fix-released bugs22:40
lifelessthey will get an error22:40
lifelessand in that error you could say (bug X Y and Z are also linked on the branch but are fix committed/fix released)22:40
Ursinhalifeless, what will you do if you're trying to land a branch which is already linked to a bug which is fix committed, but that's what you want to do? are you going to unlink the bug?22:40
UrsinhaI don't like this idea22:40
lifelessUrsinha: if the code I'm landing is needed to fix that bug, its not really fix committed is it ?22:41
Ursinhalifeless, it can be, but qa-bad22:41
Ursinhafix committed == there's a fix for that bug (working or not)22:41
Ursinhain progress == fix is in progress (it might have incremental fixes but the whole fix isn't committed yet)22:42
lifelessqa-bad should imply in-progress or triaged22:42
lifelessfix committed isn't 'commit in the tree' its *FIX* in the tree.22:43
lifelessI think that when something is bust we should make the bug be in-progress again22:43
lifelessjust like --incr22:44
Ursinhalifeless, manually?22:44
lifelesswe could automate it22:44
lifelessbut sure, manually.22:44
lifelessqa-bad + fixreleased makes no sense.22:45
lifelessqa-bad + fixcommitted also makes no sense.22:45
Ursinhaqa-bad is added by the devel; we could a) change it manually at the same time we're adding the qa-bad tag or b) make the bot check for qa-bad bugs and change them to in progress again22:45
lifelessDesigning other workflows around nonsensical states is not going to work well.22:45
lifelessUrsinha: I like both a) and b)22:45
Ursinhalifeless, ok. what to say about branches that have several bugs linked, and some of them are already released22:45
lifelessjust list the other bugs.22:45
Ursinhawhy are people reusing branches?22:46
lifelessUrsinha: convenience; clarity; organisation.22:46
Ursinhalifeless, well, I think we're trying to make the scripts to workaround a situation that could be avoided by just not reusing branches22:48
Ursinhaand the problem is that the script already tries to workaround some behaviors to create consistency22:48
UrsinhaI think the scripts will get more and more confused because of that, but if you think this is really worth it, I can do that22:48
Ursinhaadding a mechanism to tagger to set qa-bad bugs to in progress isn't hard22:49
lifelessIt makes my dev environment a lot easier to manage.22:49
lifelessI have 'librarian' for the librarian, 'registry' for registry, 'oops' for oops etc22:49
lifelesssometimes I have bug-X branches when I have multiple things in flight : but the whole kanban + RFWTAD workflow is about removing the need for parallel-tasking.22:50
Ursinhalifeless, what are you going to do about the fix committed bugs linked to your branches, when you land a new fix?22:50
lifelessUrsinha: I don't understand the question.22:50
lifelessif its fix committed there is no more work to do on the bug.22:51
lifelesslanding a new fix suggests its not fix committed.22:51
Ursinhalifeless, that's not what I see following bugmail22:51
lifelessthats a contradiction22:51
lifelessUrsinha: can you point me at some examples?22:51
Ursinhalifeless, I'd have to do some gardening in my bugmail22:52
Ursinhalifeless, we can try that. the idea is to make ec2 land ignore fix committed bugs?22:52
Ursinhaor error on them?22:52
* wallyworld_ off to doctor appointment22:52
lifelessUrsinha: ignore fix(committed|released) bugs22:54
Ursinhalifeless, one case that came to mind now is, bug fix released but not really fixed. not tagged qa-bad or -needstesting, but has new fix22:54
Ursinhawhat to do in this case?22:54
UrsinhaI saw that happen a few times after releases22:54
lifelessso, something has been made better22:54
lifelessbut not good enough22:54
lifelesswith the QA workflow using bugs to permit commits to trunk to be deployed.22:55
lifelesswe need a new bug for the QA workflow.22:55
lifelessdon't we?22:55
Ursinhanot sure what you mean22:55
lifelessso in this scenario:22:55
UrsinhaI guess ec2 land should check all bugs, see only the !fix committed/released and if no bugs left, error22:55
lifeless - bug X22:55
lifeless - branch lands that 'fixes X'22:56
lifelesswe QA it - bugX [qa-ok]22:56
lifelesswe deploy22:56
lifeless - bugx [FixReleased]22:56
lifelessthen we realise its still timingout (for instance)22:56
lifelessthats your scenario, right ?22:56
lifelessnow, there are two places we might find this22:57
lifelessfirstly, we might notice in QA22:57
lifelessand secondly we might notice after deploy22:57
lifelessif we notice in QA, because its 'better' we don't need to stop the deploy.22:57
lifelessso any solution to this needs to cater for noticing in QA.22:57
lifelessif we notice in QA, we could set the bug to qa-incremental (is that right?)22:58
lifelessso In-progress status, and 'branch is ok to land'22:58
Ursinhaif we notice in qa, so it's qa-untestable22:58
Ursinhaor qa-bad if devel thinks the fix is going to bork prod. if rolled out22:59
lifelesshttps://dev.launchpad.net/QAProcessContinuousRollouts#We can QA the branch, and it is an incremental step towards the fix of one or more bugs22:59
Ursinhalifeless, but only if you landed the first fix as --incr22:59
Ursinhaif you know beforehand that the fix might not be the last one, then land it as incremental and all of that will be done automatically23:00
lifelessUrsinha: So I guess I'm saying 'what you describe is us realising that the fix *was* incr, even if we didn't *say it was*'23:00
Ursinhabug in progress, qa-untestable23:00
Ursinhayes, I see that23:00
lifelessso the right thing to do, whether we notice in QA or after deploy, is to set the bug status in the same way.23:00
UrsinhaI'm saying there's room for tweaking things :)23:00
lifelessand so if we set the bug status the same way23:01
Ursinhaqa-untestable, in that case23:01
Ursinhaotherwise we'll be blocked23:01
lifelesswe'll set it to in-progress, qa-untestable(or-qa-ok I guess for the testable-incremental-case)23:01
Ursinhadoesn't matter for the script, both are "go-for-it"23:02
UrsinhaI like qa-ok best23:02
lifelessand ec2 land would correctly *include* this bug in the later landing23:02
lifelessbecause its in-progress23:02
lifelessthat seems to work to me23:02
Ursinhalifeless, right. I'll update the wiki page and let you know23:05
lifelessUrsinha: thanks!23:05
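
The rule agreed above (ec2 land ignores fix committed/fix released bugs, errors if no bugs are left and the landing is not no-qa, and names the skipped bugs in that error) could be sketched roughly as follows. This is only an illustration of the discussion, not the actual ec2 land code: LinkedBug, NoValidBugsError, bugs_for_landing and the status strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Assumed status names; bugs in these states need no further landing work.
CLOSED_STATUSES = {"Fix Committed", "Fix Released"}


@dataclass(frozen=True)
class LinkedBug:
    """Hypothetical stand-in for a bug linked to the branch being landed."""
    id: int
    status: str


class NoValidBugsError(Exception):
    """Raised when no open bug is linked and the landing is not no-qa."""


def bugs_for_landing(linked_bugs, no_qa=False):
    """Return the linked bugs that should be listed in the landing mail."""
    open_bugs = [b for b in linked_bugs if b.status not in CLOSED_STATUSES]
    skipped = [b for b in linked_bugs if b.status in CLOSED_STATUSES]
    if not open_bugs and not no_qa:
        detail = ""
        if skipped:
            # Tell the developer which linked bugs were ignored and why.
            ids = ", ".join(str(b.id) for b in skipped)
            detail = (
                f" (bugs {ids} are also linked on the branch"
                " but are fix committed/fix released)"
            )
        raise NoValidBugsError("no valid bugs to land against" + detail)
    return open_bugs
```

Under this sketch a bug reset to in-progress after a bad or incomplete fix would be picked up again by the next landing, which is the behaviour lifeless describes.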
UrsinhaI don't like the way the theme in dev.lp.net separates the sections of the page23:16
Ursinhait's kinda hard to read23:16
lifelessits rather awkward23:17
lifelessmbarnett: nevermind, just BB flakiness23:18
marslifeless, reading backscroll, looking sadly upon the BB waterfall - did you already pass the BB restart work along?23:23
lifelessmars: restart work ?23:27
marslifeless, re: your "BB flakiness" comment23:27
lifelessmars: well I haven't debugged deeply, for clarity23:28
mbarnettlifeless: hehe..  kk23:28
lifelessmars: but the most recent build was for an older rev23:28
lifelessmars: so it hadn't tried tip of prod-devel23:28
lifeless(and it was nearly 12 hours ago that the fix landed in prod-devel)23:29
marslifeless, looking at the waterfall BB is completely hosed right now :(23:30
marsso I need to figure out what to tackle first23:30
lifelessmars: the machine gun ?23:30
lifelessmars: perhaps bring out the 'restart the world' card?23:31
=== matsubara is now known as matsubara-afk
marslifeless, we used that card earlier today - I am worried23:31
lifelessthat is a concern23:31
marsand lp and db_lp have been down for a week23:32
lifelesswe have several CPs pending23:32
lifelesscannot rename, ubuntu bug uploads, and OOPS generation23:32
marsok, so we need to get lucid_lp and prod up first23:34
marsmbarnett, I forced the lucid_lp build.  If the build does not start in, say, 10 minutes, you will have to restart the build slave.23:35
mbarnettmars: kk23:35
marsok, prod_lp is offline for some reason (is it an EC2 slave?)23:36
lifelessthe log shows 'substantiating'23:37
lifelessso I'd say yes23:37
marschecking master.cfg23:37
marsyes, EC2Latent23:38
marslifeless, you said prod_lp pulled a stale tip revision for its test run?23:38
lifelessmars: no, I said the last run in 'recent builds' was ages ago and for what is now obsolete23:39
lifelessmars: and that it hasn't had a more recent run which would get a better tip23:39
lifelessI hypothesised that it hadn't detected it23:39
marsok, I'll force a build then23:39
lifelessalternatively, if the slave doesn't come up, I bet bb doesn't report that as a failed run.23:39
lifelessmars: I forced a build23:40
marsit is still offline23:40
lifelesssee 23:18:22 in the waterfall23:40
marshave to restart the build master then23:41
marsmbarnett, could you please restart the build master?  That should get the prod_lp builder running again23:43
marslifeless, EC2 build slaves need a master restart in order to bring them back up (lp, db_lp, prod_lp).23:44
marsI haven't seen this problem before, but "restart the world" sounds right23:44
marsuse the unstoppable super weapons first23:44
marsalways the unstoppable super weapons first23:44
mbarnettbuild master has been restarted23:45
lifelesswhere's the earth shattering kaboom23:45
lifelessthere's meant to be an earth shattering kaboom23:45
mbarnettdoes not appear to be starting back up happily23:46
* mars mashes F5 a few more times in desperation23:47
marsj/k, that actually speeds server death in resource-exhausted environs :)23:48
marsmbarnett, do you have a log I could look at please?23:48
mbarnettmars: sure, give me a couple23:50