/srv/irclogs.ubuntu.com/2012/08/27/#launchpad-dev.txt

wgrantStevenK: More importantly you should check what the subscribers are00:05
StevenKwgrant: So I guess I should set a bug supervisor too, then00:06
wgrantRight00:07
* StevenK picks wgrant.00:08
StevenKwgrant: https://bugs.qastaging.launchpad.net/auditorclient/+bug/939552 looks good to me00:11
_mup_Bug #939552: Juju should support MAAS as a provider <juju:Fix Released by allenap> < https://launchpad.net/bugs/939552 >00:11
StevenKwallyworld_, wgrant: https://code.launchpad.net/~stevenk/launchpad/contains-to-match/+merge/12135400:49
wallyworld_+1 from me00:50
StevenKwallyworld_: Thanks00:51
wallyworld_np00:52
wgrantwebops: Could you ppa-reset marid, please?01:12
wgrantIt's on furud01:13
spmta01:13
spmwgrant: hmm. oki, how does one reset a single server. ppa-reset seems to be an all or nothing affair?01:16
wgrantspm: I think 'ppa-reset marid' should work01:18
spmI don't have rights for that01:19
wgrantAh01:19
wgrantYou'll have to do it from alphecca, I guess01:19
wgrantsec01:19
spmand looking at scripts, I'm going to need gsa intervention01:19
wgrantNope01:19
wgrantssh -i /home/lp_buildd/.ssh/ppa-reset-builder ppa@furud.ppa ppa-reset marid01:20
spmah yes, the builddmaster would have stab ability.01:20
spmMon, 27 Aug 2012 01:20:18 +0000: Clearing all marid Copy-On-Write devices.01:20
spmdevice-mapper: create ioctl failed: Device or resource busy01:20
spmCommand failed01:20
wgrantThat's rather unpleasant of it01:20
spmthat doesn't look happy.01:20
wgrantSounds GSAy01:20
StevenKwgrant: Shall I put together a deploy?01:24
wgrantStevenK: Worth a try01:24
wallyworld_wgrant: according to bug 1040999, branches should always be able to be marked as security fixes, with userdata only available if branch is linked to a userdata bug. so i'm going to make this change01:26
_mup_Bug #1040999: Cannot use branch information type portlet to set type <disclosure> <information-type> <javascript> <Launchpad itself:In Progress by wallyworld> < https://launchpad.net/bugs/1040999 >01:26
wgrantwallyworld_: Sounds reasonable01:26
wallyworld_but proprietary cannot be done just yet i don't think?01:27
wgrantIt can be01:27
wgrantIt should always be shown if it's allowed01:27
wgrantLike Public01:27
wgrantHm01:27
wgrantActually01:27
wgrantWe can't really hide userdata until nobody's using BVPs01:27
wallyworld_it's always there now, but the code comments say it should only be shown if branch linked to proprietary bug01:28
wgrantThat's meant to apply to Public Security, Private Security and Private01:28
wgrantNot Proprietary01:28
* wgrant checks the comments01:28
wgrantOh01:29
wgrant            # Once Proprietary is fully deployed, it should be added here.01:29
wgrant"it" == Private there01:29
wgrantI must have removed a line describing why Private wasn't included01:29
wallyworld_ok, i'll fix the comment01:30
wgrantThanks.01:30
wgrantThe idea is that we need to show Private now since it's what BVPs use for privacy01:30
wallyworld_so just to confirm, userdata is to be updated once bvps go away01:30
wgrantBut once everyone's using Proprietary, Private is no longer going to be common at all for branches01:30
spmwgrant: it's being stubborn, but looked at.01:30
wgrantspm: Thanks01:31
StevenKwgrant: The amount of cowboys is terrible :-(01:31
wgrantYeah...01:31
wgrantAll of them have landed, though01:31
wgrantOnly one code changes01:33
wgrant-s01:33
wgrantRest are security.cfg01:33
wgrantSo we can ndt without a problem01:33
wgrantUm01:33
wgrantThough01:33
wgrantStevenK: Have you checked for new DB perms?01:33
StevenKI have not.01:34
* wgrant does so01:34
StevenKgarbo is one I can think of, I think01:34
wgrantOh01:34
wgrantCan't pull01:34
wgrantBlah01:34
StevenKHahaha01:34
* wgrant does it manually01:34
StevenKwgrant: Is this going to involve a second call to Optarse?01:35
wgrantAlready done01:35
wgrantTwo sets of DB perms01:37
StevenKgarbo and what else?01:38
spmwgrant: we believe that's back01:41
wgrantStevenK, webops: There's an SQL request at the usual place to manually apply this nodowntime's DB security changes01:42
wgrantWhich will take us up to 5 live cowboys :)01:43
spm./ignore wgrant01:44
wgrantspm: marid looks healthy again, thanks01:45
StevenKwgrant: Do you even have an ETA from them?01:45
wgrantNo01:46
StevenKBecause routing to Europe is hard or something.01:46
wgrantMaybe I can convince a GSA to check what the melons think of the BGP state01:46
wgrantL3's looking glass looks fine01:46
wgrantMaybe Datahop is breaking things01:47
bigjoolsurls for multi-task bugs are weird01:52
bigjoolsI entered a bug on maas, url has maas in it.  Add a task for cloud-init and the url changes to one for cloud-init01:53
wgrantbigjools: Right, when you add a new task it sends you to the bug in that context01:55
StevenKwgrant: No test failure for me. :-(03:01
lifelessStevenK: auditor today?03:02
StevenKHaha03:09
StevenKwgrant: http://pastebin.ubuntu.com/1169187/ == no failure after make schema03:10
wgrantStevenK: The problem is creating the link from a --fixes03:13
StevenKwgrant: I thought that just linked the bug?03:18
wgrantStevenK: Yes03:18
wgrantBut it's the bug linking that crashes03:19
wgrantNot scanning a branch with a linked bug03:19
wgrantYou can probably reproduce by switching to the DB user before calling linkBug03:19
StevenKwgrant: Calling db_branch.linkBug() in the with dbuser block == no crash03:25
wgrantStevenK: Hm, possibly it's not calling the notify methods?03:28
wgrantI forget where in the traceback it died03:28
wgrantBut I linked the OOPS in the bug03:28
StevenKwgrant: Ah, yes, that would likely be it.03:30
StevenKActually, I think it's because the project that the bug is created has no structural subscribers.03:31
StevenKAnd maybe no notification03:31
wgrantThat could do it too03:37
StevenKwgrant: Right, I'm hooking it into the bug linking bit, I needed a revprop with the right format.04:41
StevenKBut still no failure, which is annoying.04:41
StevenKOh, hah. My contains-to-match trips over test_getAllPermissions_constant_query_count05:35
wgrantHeh05:46
StevenKwgrant: Still no failure. :-(05:47
spmwgrant: psql:tmp/wg.sql:8: ERROR:  cannot execute GRANT in a read-only transaction05:47
wgrantspm: That's no druk05:47
spmoh wait. nm. my bad. wrong server.05:47
spmyah05:47
spmI blame mondays.05:47
wgrantI blame Optus/Datahop/NTT :)05:48
spmand that tomorrow is Tuesday, and yesterday being Sunday.05:48
wgrantFor everything05:48
spmwgrant: applied05:48
wgrantspm: Thank05:48
wgrants05:49
wgrantHopefully you can now ndt without the world burning down05:50
wgrantmore than it already is05:50
StevenKspm: http://images.ucomics.com/comics/ga/1991/ga910304.gif05:51
wgrantwoah06:13
wgrantppa queue almost caught up06:13
StevenKWah, still no failure.06:19
StevenKAh, the job running code isn't sending a notification.06:22
StevenKwgrant: I think this test is being defeated by caching06:26
wgrantStevenK: :(06:30
wgrantStevenK: Then clear the cache :)06:30
StevenKWhich doesn't help either06:31
StevenKSigh06:31
StevenKwgrant: Store.of(obj).invalidate() invalidates only that object? What if I want to invalidate everything?06:36
wgrantStevenK: That invalidates the whole store06:43
StevenKThen I'm not sure why notify isn't calling back into getBugNotificationRecipients :-(06:49
StevenKwgrant: The notify(ObjectCreatedEvent(bug_branch))06:56
StevenKline in linkBranch() is directly implicated in the OOPS, but my test causes it to notify no one06:56
wgrantYou've verified that there are adequate subs to the bug?06:56
StevenKI've added a direct subscriber with an APG06:57
StevenKI'm trying to work out why that notify call is deciding to do nothing06:57
StevenKAnd failing, I might add06:58
StevenKwgrant: The bugchange ignores private branches. The notification code has to run before the branch is made private.07:10
wgrantStevenK: Why is the branch private?07:16
StevenKBecause I was forcing it to be.07:16
wgrantAh07:17
wgrant(and yes, it does ignore private branches -- I reported that leak a few years ago :))07:17
StevenKwgrant: BTF isn't in branchscanner's security.cfg, too07:18
wgrantStevenK: It probably inherits that07:29
wgranteg. from write or something07:29
wgrantYeah, from write07:31
StevenKAh. I think APG is fine, and we have a test that will blow up if that check changes.07:31
StevenKOh, sigh.07:43
StevenKI bet the scanner has cursed this branch07:43
* StevenK stabs the branch scanner over and over.07:54
wgrantI have mail07:58
wgrantSo I take it you uncursed it07:58
StevenKNo, rename and push again dance.07:59
wgrantYou do love to crush my hopes and dreams.07:59
StevenKAnd putting a bloody knife in an Express Post and addressing it to celeryd@ackee07:59
StevenKwgrant: Well, I do need a hobby ...08:00
wgrantStevenK: Care to check that the bug actually gets linked?08:00
StevenKThat's probably a good idea.08:01
adeuringgood morning08:11
=== adeuring changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On call reviewer: - | Firefighting: - | Critical bugs: 4.0*10^2
StevenKwgrant: Done. Look again?08:13
wgrantStevenK: r=me08:19
wgrantthanks08:19
=== almaisan-away is now known as al-maisan
stubwgrant: Do you think http://paste.ubuntu.com/1169485/ will work?08:43
stubwgrant: I'm thinking this behaviour makes the improved FDT process much simpler. I just need to stop the slaves replaying WAL, disable master connections, apply patches to the master, enable master connections, disable slave connections, reenable replication, wait for sync, back to normal.08:45
stubWhich I think I can do without swapping pgbouncer config files around, which seems fragile.08:46
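The sequence stub describes can be read as an ordered checklist where a failure stops everything after it. A minimal sketch in Python, assuming each step is driven by some callable; the step names only paraphrase the message above, and `run_steps` and its `execute` argument are hypothetical, not Launchpad or pgbouncer code:

```python
# Step names paraphrase the chat message; they are labels only,
# not real Launchpad, PostgreSQL, or pgbouncer commands.
FDT_STEPS = [
    "stop slaves replaying WAL",
    "disable master connections",
    "apply patches to the master",
    "enable master connections",
    "disable slave connections",
    "re-enable replication",
    "wait for sync",
    "back to normal",
]


def run_steps(steps, execute):
    """Run steps in order, stopping at the first one that fails.

    `execute` is a hypothetical callable that takes a step name and
    returns True on success. Returns (completed_steps, failed_step),
    where failed_step is None if every step succeeded.
    """
    completed = []
    for step in steps:
        if not execute(step):
            return completed, step
        completed.append(step)
    return completed, None
```

Returning the completed steps alongside the failed one shows an operator exactly how far the rollout got before deciding how to recover.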
wgrantstub: That was exactly the process I had in mind, but let me read the code...08:50
wgrantstub: I think that would work08:52
wgrantBut we'd want to go a bit further eventually :)08:52
wgrantA bit of refactoring to allow generic support for fallbacks would probably make it all a bit nicer08:53
stubIn what way? We can also cause master requests to get a slave, which is reinventing lp's read-only mode.08:53
wgrantstub: Well, webapp and API requests always use the master08:54
wgrantErm08:54
wgrantWebapp requests in recently-POSTed sessions08:54
wgrantBecause they want up to date data08:54
stubBut then we have a little risk with scripts, as we are deliberately returning a broken result.08:54
wgrantBut if the master's not available then they should fall back08:54
wgrantxmlrpc-private also usually uses the master policy08:55
wgrantBecause it wants to be as up to date as possible08:55
stubYeah, so we can do that for all master requests, which would be a lie. Or make a master + fallback policy for them, and switch them to using that policy.08:55
wgrantRight, I think we want a MasterIfYouCan policy which all those use08:55
wgrantSo the slave policy should always allow fallback to master08:56
wgrantthe masterifyoucan policy can always fall back to a slave08:56
wgrantand the master policy just fails08:56
stubYes, slave falling back to master is documented as allowed.08:56
wgrantOh right, I think it even already does that08:57
wgrantIt must08:57
stubI don't think we do that dynamically anywhere08:59
wgrantSo I think default_flavor = MAIN_FLAVOR becomes eg. flavours = [MAIN_FLAVOUR, SLAVE_FLAVOUR]08:59
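The three behaviours wgrant lists (slave falls back to master, "MasterIfYouCan" falls back to a slave, master just fails) amount to an ordered preference list per policy. A rough sketch of that idea, assuming string constants and a plain dict; none of these names are taken from dbpolicy.py, and `pick_flavour` is a hypothetical helper:

```python
# Hypothetical stand-ins for the real flavour constants.
MAIN_FLAVOUR = "main"
SLAVE_FLAVOUR = "slave"


class NoStoreAvailable(Exception):
    """Raised when no flavour a policy allows is reachable."""


# Each policy lists the flavours it may use, most preferred first.
POLICY_FLAVOURS = {
    "slave": [SLAVE_FLAVOUR, MAIN_FLAVOUR],              # may fall back to master
    "master_if_you_can": [MAIN_FLAVOUR, SLAVE_FLAVOUR],  # may fall back to slave
    "master": [MAIN_FLAVOUR],                            # no fallback: just fails
}


def pick_flavour(policy, available):
    """Return the first flavour allowed by `policy` that is in `available`."""
    for flavour in POLICY_FLAVOURS[policy]:
        if flavour in available:
            return flavour
    raise NoStoreAvailable(policy)
```

With the master down, `pick_flavour("master_if_you_can", {SLAVE_FLAVOUR})` yields the slave, while `pick_flavour("master", {SLAVE_FLAVOUR})` raises.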
wgrantstub: I thought the slave policy respected lag08:59
wgrantBut I can't remember exactly.08:59
stuboh yes09:00
stubOnly in the LaunchpadPolicy09:00
wgrantIndeed09:00
wgrantSlaveDatabasePolicy doesn't respect lag09:00
wgrantThat's probably a bad idea09:00
stubWe only choose the default based on lag.09:01
wgrantRight, and only in LaunchpadDatabasePolicy09:01
stubmaster requests still get a master if explicit, and slave requests still get a slave if explicit, no matter lag.09:01
wgrantOh hm09:01
wgrantTrue09:01
wgrantThat sucks09:01
wgrantSo09:01
wgrantI think most of dbpolicy.py wants a bit of a rethink09:02
wgrantand09:02
wgrantmost importantly09:02
wgranta de-Americanisation09:02
wgrant:)09:02
wgrantBecause there's not much reason to ever not respect lag09:02
wgrantLag should probably be treated as failure09:02
wgrantAlthough not failed enough that it won't use it as a last resort09:03
stubI think there is plenty of stuff that is happy using a slave even if it is an hour behind, and we raise alerts when things are 5 minutes behind09:03
wgrantTrue09:05
wgrantSo yeah09:06
wgrantxmlrpc09:06
wgrantxmlrpc-private09:06
wgrantrecently-POSTed webapp09:06
wgrantand API09:06
wgrantprobably all want to fall back to slaves09:06
wgrantWebapp writes shouldn't09:06
wgrantSo we need a new policy09:06
stubTry slave first, fallback to master ;)09:06
wgrantRight, that's correct for webapp09:07
wgrantBut xmlrpc probably wants the opposite09:07
wgrantOr at least a very low lag limit09:07
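One way to read "lag as failure, but still a last resort" with a per-caller lag limit is sketched below. This is an illustration under assumptions, not the LaunchpadDatabasePolicy logic: `choose_store`, its parameters, and the use of `None` for an unreachable slave are all hypothetical.

```python
def choose_store(slave_lag, max_lag, master_available):
    """Pick "slave" or "master" for a caller that tolerates `max_lag` seconds.

    `slave_lag` is the slave's replication lag in seconds, or None if the
    slave is unreachable.
    """
    # A slave within the caller's lag limit is preferred.
    if slave_lag is not None and slave_lag <= max_lag:
        return "slave"
    # A lagged slave counts as failed, so try the master next.
    if master_available:
        return "master"
    # Last resort: a stale but reachable slave beats nothing at all.
    if slave_lag is not None:
        return "slave"
    raise RuntimeError("no store available")
```

A caller that is happy with hour-old data would pass `max_lag=3600`, while something like xmlrpc would pass a value close to zero and so reach for the master far sooner.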
stubI'm joking there.09:07
stubI'd like to try to use the LaunchpadDatabasePolicy logic if there is a session cookie09:07
wgrantRight, that logic is still good09:08
wgrantExcept that that should only influence the default09:08
stubAnd I'm still annoyed nobody would put in a session token to the webapi, killing its scalability. But I suspect a lot of clients give us a cookie anyway.09:08
stubI think the way forward is to try this with just slave fallback, which is the original ppa use case. Get the bugs ironed out on the production side before complicating things further.09:09
wgrantI'm just worried that we're complicating things unnecessarily by adding a hacky single-case fallback09:10
stubWe only have 2 types of connections, 3 if you count 'DEFAULT'. We don't really need a generic framework.09:11
stubOr do you mean this shouldn't be in the BaseDatabasePolicy?09:11
wgrantRight, I don't think this belongs in BaseDatabasePolicy09:12
wgrantI'm not quite sure where is better09:12
stubI can put it in SlaveDatabasePolicy and SlaveOnlyDatabasePolicy09:12
wgrant(also, am I missing something or do you try to reretrieve the same store there? you don't change the flavour)09:12
wgrantWhich means it'll just try to regrab the slave09:13
stubtypo09:13
stubNot tried actually using this yet, just thinking through the idea :)09:13
wgrantHeh09:14
* wgrant foods09:14
czajkowskiwgrant: stub you guys doing anything to LP right now? getting timeouts doing the licence review09:46
czajkowski(Error ID: OOPS-e3753aed5cfe86fe227192e43be904c1)09:46
stubnope09:46
czajkowskihmm09:46
stubBah. Need to rethink this, again. DBPolicy will happily hand out stores from the ZStorm cache even if they won't work, and I don't want to test if connections work every time the policy is invoked.09:56
wgrantstub: Hm10:06
wgrantstub: Well10:06
wgrantstub: It should be done the same way as the lag check, right?10:06
wgrantI forget at what stage that is done, but whatever it is it's right at the start of the request10:06
wgrantAnd for non-request-based stuff we probably just want to switch when a connection fails, maybe?10:07
wgrantAlthough that makes it harder to fail back10:07
stubWe might not have a request10:07
stubI think I can ask the store if it is in a disconnected state or not before handing it out. If it is disconnected, attempt reconnection10:07
wgrantYeah10:07
stubSo a script running during fdt will get disconnected and need to handle that. And when it handles it, it will get the master store if it asks for the slave.10:08
stubJust need to wade through to see how to detect disconnected state, and to force a reconnection attempt.10:08
wgrantstub: Yeah, we may just have to wait for a disconnectionerror to be raised, I suspect10:11
wgrantAnd teach stuff to deal10:11
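"Wait for a disconnection error and teach stuff to deal" could look roughly like the sketch below for a script. It assumes the database layer raises some disconnection exception; the `DisconnectionError` class here is a local stand-in, and `run_with_fallback` and `get_store` are hypothetical names, not Launchpad or Storm API:

```python
class DisconnectionError(Exception):
    """Local stand-in for the database layer's disconnection error."""


def run_with_fallback(operation, get_store, flavours):
    """Try `operation` against each flavour in turn.

    `get_store` is a hypothetical callable mapping a flavour name to a
    store object. If the operation raises DisconnectionError, the next
    flavour is tried; if all fail, the last error is re-raised.
    """
    if not flavours:
        raise ValueError("at least one flavour is required")
    last_error = None
    for flavour in flavours:
        store = get_store(flavour)
        try:
            return operation(store)
        except DisconnectionError as error:
            # Remember the failure and move on to the next flavour.
            last_error = error
    raise last_error
```

This only covers switching when a connection fails; it does nothing about failing back once the preferred store returns, which is the harder part wgrant points out.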
=== benji changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On call reviewer: benji | Firefighting: - | Critical bugs: 4.0*10^2
=== al-maisan is now known as almaisan-away
deryckabentley, adeuring, rick_h_ -- let's be ready for a longer stand-up today, to let rick_h_ lead us through the mockups he has.13:20
abentleyderyck: okay.13:21
rick_h_party13:21
rick_h_drink refill before the meeting, got it13:21
abentleyrick_h_: hey, don't party too hard :-)13:22
jamjelmer: I added a card to track the translations stuff, but I don't have any specific insight into it.13:49
jamMy first guess is that there is an issue with a cron job that is supposed to be running.13:49
jamjtv is someone you can poke for translations background, but he doesn't necessarily know more intimate details if it is an operational issue13:49
jamwgrant is generally the person with the most ops ideas13:49
rick_h_gary_poster: ping, hazmat ping'd me about looking over their YUI work on the juju js app/ui and deryck mentioned that since you guys were coming into that work should someone from your squad take part in the discussion15:25
rick_h_bah15:25
gary_posterrick_h_, hey thank you.  why the bah?  It would be great to be a part of it, I think, though I'll check with hazmat15:28
czajkowskijam: jelmer jtv will be on later if that helps; in the meantime I'm going to put an announcement out on Twitter and places as we're getting more bugs/questions logged about it15:30
czajkowskiit's added to the topic as well in case people ask, that way you're not under as much pressure to find an answer15:30
deryckabentley, ready for call?15:30
deryckstand-up hangout is fine15:31
abentleyderyck: sure.15:31
=== bac- is now known as bac
=== salgado is now known as salgado-lunch
=== almaisan-away is now known as al-maisan
=== al-maisan is now known as almaisan-away
czajkowskihiya someone help me for a moment, bug https://bugs.launchpad.net/launchpad/+bug/1041864  I wanted to change the URL for the import, but when I do I get invalid as it's being used elsewhere and I've never seen that issue before :/15:59
_mup_Bug #1041864: Badly named weston import <Launchpad itself:New> < https://launchpad.net/bugs/1041864 >15:59
maxbczajkowski: Hm.. I don't think the user wants the import URL change16:18
maxb+d16:18
maxbCan someone look up OOPS-eb261d3e309c39d6948f60de23422af9 for me?16:19
rick_h_maxb: loaded, the user isn't a member of the team16:20
czajkowskirick_h_: you're faster than I was16:21
czajkowskimaxb: http://pastebin.ubuntu.com/1170166/16:21
maxbOh, right, this is because ~vcs-imports members has slightly weird hybrid edit permissions on code imports, I remember now16:23
=== salgado-lunch is now known as salgado
=== Beret- is now known as Beret
jtvczajkowski, jelmer: will be at least 8 more hours before I'm here — provided I get well enough.  What's the crisis?17:47
czajkowskijtv: translation imports seem to have stopped17:48
czajkowskijtv: https://bugs.launchpad.net/launchpad/+bug/104185817:49
_mup_Bug #1041858: No daily translation export anymore <Launchpad itself:Triaged> < https://launchpad.net/bugs/1041858 >17:49
jtvImport or export?17:49
czajkowskiexport sorry17:49
jtvAnd it's not the normal exports, but the exports to branches, I see.17:50
jtvNow, there's always a few exports that are skipped because the branches are locked, or there are concurrent translation updates that the exporter doesn't want to overwrite, etc.17:50
jtvSo there's a big difference between “several people haven't seen it work” and “it's stopped.”  Do we know which it is?17:51
czajkowskihttps://answers.launchpad.net/launchpad/+question/206912  https://answers.launchpad.net/launchpad/+question/20694817:52
czajkowskijtv: I asked jelmer to look into it today17:52
czajkowskinot sure he made progress or what update he got with it17:52
jtvIf there's no breakthrough, chances are it's hard to diagnose — which most likely means a crash in a C-level library.  Might be worth finding out if the outage coincides with an upgrade.17:53
jtvHmm the log on crowberry simply stops on the 17th.  We haven't disabled the whole cron job by any chance?17:55
czajkowskijtv: no clue :s17:55
czajkowskijtv: but as soon as jelmer and jam come on tomorrow will get them to look17:56
czajkowskior ask wgrant in a bit when he arrives17:56
czajkowskiaugust 17th was when we did a lot of the DC move17:56
jtvThere's more log on taotie.17:56
jtvI'll see if anything jumps out, but will leave it to the Dutch Cavalry otherwise then.17:57
czajkowskithanks jtv17:57
jtvczajkowski: I can give you my quick & vague impression…  The exporter writes to branches, which triggers branch-scan jobs (to make Launchpad notice the branch changes).  These seem to go into Celery now, via a RabbitMQ queue.  It looks as if the connection to RabbitMQ started breaking on the 18th (possibly from a change on the 17th, since this job runs in early morning UTC) and eventually on the 21st somebody may just have killed and disabled the18:10
jtvWell, not disabled exactly.  Maybe the lock file from the aborted run is still there; I seem to remember that logging is a bit asymmetrical when it comes to those lock files.18:10
jtvIt may mean that that instance from the 21st is still hanging around trying to request a branch scan for Stellarium, and the subsequent runs are quietly giving up as they notice that.18:11
lifeless_flacoste: o/ - just settling Cynthia a little, back soon19:05
flacostelifeless_: o/19:07
=== lifeless_ is now known as lifeless
lifelessflacoste: ok, ready when you are.19:13
* deryck heads home, back online shortly19:45
=== salgado is now known as salgado-afk
=== salgado-afk is now known as salgado
=== BradCrittenden is now known as bac
=== benji changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On call reviewer: - | Firefighting: - | Critical bugs: 4.0*10^2
wallyworldwgrant: StevenK: mumble?22:02
=== Ursinha` is now known as Ursinha
=== jelmer_ is now known as jelmer
=== jelmer is now known as Guest90165
=== Guest90165 is now known as jelmer
lifelesshttp crackheads of the world, does bug 1040689 strike you as add?23:15
lifelesss/add/odd/23:15
_mup_Bug #1040689: add api to refresh an existing token <escalated> <Canonical SSO provider:In Progress by ricardokirkner> < https://launchpad.net/bugs/1040689 >23:15
StevenKlifeless: Hello replay attack?23:16
