/srv/irclogs.ubuntu.com/2012/08/27/#launchpad-dev.txt

wgrant	StevenK: More importantly you should check what the subscribers are	00:05
StevenK	wgrant: So I guess I should set a bug supervisor too, then	00:06
wgrant	Right	00:07
* StevenK picks wgrant.		00:08
StevenK	wgrant: https://bugs.qastaging.launchpad.net/auditorclient/+bug/939552 looks good to me	00:11
_mup_	Bug #939552: Juju should support MAAS as a provider <juju:Fix Released by allenap> < https://launchpad.net/bugs/939552 >	00:11
StevenK	wallyworld_, wgrant: https://code.launchpad.net/~stevenk/launchpad/contains-to-match/+merge/121354	00:49
wallyworld_	+1 from me	00:50
StevenK	wallyworld_: Thanks	00:51
wallyworld_	np	00:52
wgrant	webops: Could you ppa-reset marid, please?	01:12
wgrant	It's on furud	01:13
spm	ta	01:13
spm	wgrant: hmm. oki, how does one reset a single server. ppa-reset seems to be an all or nothing affair?	01:16
wgrant	spm: I think 'ppa-reset marid' should work	01:18
spm	I don't have rights for that	01:19
wgrant	Ah	01:19
wgrant	You'll have to do it from alphecca, I guess	01:19
wgrant	sec	01:19
spm	and looking at scripts, I'm going to need gsa intervention	01:19
wgrant	Nope	01:19
wgrant	ssh -i /home/lp_buildd/.ssh/ppa-reset-builder ppa@furud.ppa ppa-reset marid	01:20
spm	ah yes, the builddmaster would have stab ability.	01:20
spm	Mon, 27 Aug 2012 01:20:18 +0000: Clearing all marid Copy-On-Write devices.	01:20
spm	device-mapper: create ioctl failed: Device or resource busy	01:20
spm	Command failed	01:20
wgrant	That's rather unpleasant of it	01:20
spm	that doesn't look happy.	01:20
wgrant	Sounds GSAy	01:20
StevenK	wgrant: Shall I put together a deploy?	01:24
wgrant	StevenK: Worth a try	01:24
wallyworld_	wgrant: according to bug 1040999, branches should always be able to be marked as security fixes, with userdata only available if branch is linked to a userdata bug. so i'm going to make this change	01:26
_mup_	Bug #1040999: Cannot use branch information type portlet to set type <disclosure> <information-type> <javascript> <Launchpad itself:In Progress by wallyworld> < https://launchpad.net/bugs/1040999 >	01:26
wgrant	wallyworld_: Sounds reasonable	01:26
wallyworld_	but proprietary cannot be done just yet i don't think?	01:27
wgrant	It can be	01:27
wgrant	It should always be shown if it's allowed	01:27
wgrant	Like Public	01:27
wgrant	Hm	01:27
wgrant	Actually	01:27
wgrant	We can't really hide userdata until nobody's using BVPs	01:27
wallyworld_	it's always there now, but the code comments say it should only be shown if branch linked to proprietary bug	01:28
wgrant	That's meant to apply to Public Security, Private Security and Private	01:28
wgrant	Not Proprietary	01:28
* wgrant checks the comments		01:28
wgrant	Oh	01:29
wgrant	# Once Proprietary is fully deployed, it should be added here.	01:29
wgrant	"it" == Private there	01:29
wgrant	I must have removed a line describing why Private wasn't included	01:29
wallyworld_	ok, i'll fix the comment	01:30
wgrant	Thanks.	01:30
wgrant	The idea is that we need to show Private now since it's what BVPs use for privacy	01:30
wallyworld_	so just to confirm, userdata is to be updated once bvps go away	01:30
wgrant	But once everyone's using Proprietary, Private is no longer going to be common at all for branches	01:30
spm	wgrant: it's being stubborn, but looked at.	01:30
wgrant	spm: Thanks	01:31
StevenK	wgrant: The amount of cowboys is terrible :-(	01:31
wgrant	Yeah...	01:31
wgrant	All of them have landed, though	01:31
wgrant	Only one code changes	01:33
wgrant	-s	01:33
wgrant	Rest are security.cfg	01:33
wgrant	So we can ndt without a problem	01:33
wgrant	Um	01:33
wgrant	Though	01:33
wgrant	StevenK: Have you checked for new DB perms?	01:33
StevenK	I have not.	01:34
* wgrant does so		01:34
StevenK	garbo is one I can think of, I think	01:34
wgrant	Oh	01:34
wgrant	Can't pull	01:34
wgrant	Blah	01:34
StevenK	Hahaha	01:34
* wgrant does it manually		01:34
StevenK	wgrant: Is this going to involve a second call to Optarse?	01:35
wgrant	Already done	01:35
wgrant	Two sets of DB perms	01:37
StevenK	garbo and what else?	01:38
spm	wgrant: we believe that's back	01:41
wgrant	StevenK, webops: There's an SQL request at the usual place to manually apply this nodowntime's DB security changes	01:42
wgrant	Which will take us up to 5 live cowboys :)	01:43
spm	./ignore wgrant	01:44
wgrant	spm: marid looks healthy again, thanks	01:45
StevenK	wgrant: Do you even have an ETA from them?	01:45
wgrant	No	01:46
StevenK	Because routing to Europe is hard or something.	01:46
wgrant	Maybe I can convince a GSA to check what the melons think of the BGP state	01:46
wgrant	L3's looking glass looks fine	01:46
wgrant	Maybe Datahop is breaking things	01:47
bigjools	urls for multi-task bugs are weird	01:52
bigjools	I entered a bug on maas, url has maas in it. Add a task for cloud-init and the url changes to one for cloud-init	01:53
wgrant	bigjools: Right, when you add a new task it sends you to the bug in that context	01:55
StevenK	wgrant: No test failure for me. :-(	03:01
lifeless	StevenK: auditor today?	03:02
StevenK	Haha	03:09
StevenK	wgrant: http://pastebin.ubuntu.com/1169187/ == no failure after make schema	03:10
wgrant	StevenK: The problem is creating the link from a --fixes	03:13
StevenK	wgrant: I thought that just linked the bug?	03:18
wgrant	StevenK: Yes	03:18
wgrant	But it's the bug linking that crashes	03:19
wgrant	Not scanning a branch with a linked bug	03:19
wgrant	You can probably reproduce by switching to the DB user before calling linkBug	03:19
StevenK	wgrant: Calling db_branch.linkBug() in the with dbuser block == no crash	03:25
wgrant	StevenK: Hm, possibly it's not calling the notify methods?	03:28
wgrant	I forget where in the traceback it died	03:28
wgrant	But I linked the OOPS in the bug	03:28
StevenK	wgrant: Ah, yes, that would likely be it.	03:30
StevenK	Actually, I think it's because the project that the bug is created on has no structural subscribers.	03:31
StevenK	And maybe no notification	03:31
wgrant	That could do it too	03:37
StevenK	wgrant: Right, I'm hooking it into the bug linking bit, I needed a revprop with the right format.	04:41
StevenK	But still no failure, which is annoying.	04:41
StevenK	Oh, hah. My contains-to-match trips over test_getAllPermissions_constant_query_count	05:35
wgrant	Heh	05:46
StevenK	wgrant: Still no failure. :-(	05:47
spm	wgrant: psql:tmp/wg.sql:8: ERROR: cannot execute GRANT in a read-only transaction	05:47
wgrant	spm: That's no druk	05:47
spm	oh wait. nm. my bad. wrong server.	05:47
spm	yah	05:47
spm	I blame mondays.	05:47
wgrant	I blame Optus/Datahop/NTT :)	05:48
spm	and that tomorrow is tuesday. and yesterday being sunday.	05:48
wgrant	For everything	05:48
spm	wgrant: applied	05:48
wgrant	spm: Thank	05:48
wgrant	s	05:49
wgrant	Hopefully you can now ndt without the world burning down	05:50
wgrant	more than it already is	05:50
StevenK	spm: http://images.ucomics.com/comics/ga/1991/ga910304.gif	05:51
wgrant	woah	06:13
wgrant	ppa queue almost caught up	06:13
StevenK	Wah, still no failure.	06:19
StevenK	Ah, the job running code isn't sending a notification.	06:22
StevenK	wgrant: I think this test is being defeated by caching	06:26
wgrant	StevenK: :(	06:30
wgrant	StevenK: Then clear the cache :)	06:30
StevenK	Which doesn't help either	06:31
StevenK	Sigh	06:31
StevenK	wgrant: Store.of(obj).invalidate() invalidates only that object? What if I want to invalidate everything?	06:36
wgrant	StevenK: That invalidates the whole store	06:43
StevenK	Then I'm not sure why notify isn't calling back into getBugNotificationRecipients :-(	06:49
StevenK	wgrant: The notify(ObjectCreatedEvent(bug_branch))	06:56
StevenK	line in linkBranch() is directly implicated in the OOPS, but my test causes it to notify no one	06:56
wgrant	You've verified that there are adequate subs to the bug?	06:56
StevenK	I've added a direct subscriber with an APG	06:57
StevenK	I'm trying to work out why that notify call is deciding to do nothing	06:57
StevenK	And failing, I might add	06:58
StevenK	wgrant: The bugchange ignores private branches. The notification code has to run before the branch is made private.	07:10
wgrant	StevenK: Why is the branch private?	07:16
StevenK	Because I was forcing it to be.	07:16
wgrant	Ah	07:17
wgrant	(and yes, it does ignore private branches -- I reported that leak a few years ago :))	07:17
StevenK	wgrant: BTF isn't in branchscanner's security.cfg, too	07:18
wgrant	StevenK: It probably inherits that	07:29
wgrant	eg. from write or something	07:29
wgrant	Yeah, from write	07:31
StevenK	Ah. I think APG is fine, and we have a test that will blow up if that check changes.	07:31
StevenK	Oh, sigh.	07:43
StevenK	I bet the scanner has cursed this branch	07:43
* StevenK stabs the branch scanner over and over.		07:54
wgrant	I have mail	07:58
wgrant	So I take it you uncursed it	07:58
StevenK	No, rename and push again dance.	07:59
wgrant	You do love to crush my hopes and dreams.	07:59
StevenK	And putting a bloody knife in an Express Post and addressing it to celeryd@ackee	07:59
StevenK	wgrant: Well, I do need a hobby ...	08:00
wgrant	StevenK: Care to check that the bug actually gets linked?	08:00
StevenK	That's probably a good idea.	08:01
adeuring	good morning	08:11
=== adeuring changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On call reviewer: - | Firefighting: - | Critical bugs: 4.0*10^2
StevenK	wgrant: Done. Look again?	08:13
wgrant	StevenK: r=me	08:19
wgrant	thanks	08:19
=== almaisan-away is now known as al-maisan
stub	wgrant: Do you think http://paste.ubuntu.com/1169485/ will work?	08:43
stub	wgrant: I'm thinking this behaviour makes the improved FDT process much simpler. I just need to stop the slaves replaying WAL, disable master connections, apply patches to the master, enable master connections, disable slave connections, reenable replication, wait for sync, back to normal.	08:45
stub	Which I think I can do without swapping pgbouncer config files around, which seems fragile.	08:46
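The ordering stub lists above can be written down as a small plan runner. This is only a sketch under stated assumptions: the step names and `run_fdt` are hypothetical, not part of Launchpad's deployment tooling, and "back to normal" is read here as re-enabling slave connections.

```python
# Hypothetical sketch of the step ordering described in the chat.
# None of these names come from Launchpad; they only illustrate the sequence.
from typing import Callable, Dict, List

FDT_STEPS: List[str] = [
    "pause_slave_wal_replay",      # slaves stop replaying WAL
    "disable_master_connections",
    "apply_patches_to_master",
    "enable_master_connections",
    "disable_slave_connections",
    "resume_replication",
    "wait_for_slaves_to_sync",
    "enable_slave_connections",    # assumed meaning of "back to normal"
]


def run_fdt(actions: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every step in order and return the names of the steps that ran.

    Any exception raised by an action propagates, halting the plan there.
    """
    completed: List[str] = []
    for step in FDT_STEPS:
        actions[step]()
        completed.append(step)
    return completed
```

Each entry in `actions` would wrap whatever real command performs that step; keeping the order in one list makes it hard to, say, patch the master while clients can still connect to it.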
wgrant	stub: That was exactly the process I had in mind, but let me read the code...	08:50
wgrant	stub: I think that would work	08:52
wgrant	But we'd want to go a bit further eventually :)	08:52
wgrant	A bit of refactoring to allow generic support for fallbacks would probably make it all a bit nicer	08:53
stub	In what way? We can also cause master requests to get a slave, which is reinventing lp's read-only mode.	08:53
wgrant	stub: Well, webapp and API requests always use the master	08:54
wgrant	Erm	08:54
wgrant	Webapp requests in recently-POSTed sessions	08:54
wgrant	Because they want up to date data	08:54
stub	But then we have a little risk with scripts, as we are deliberately returning a broken result.	08:54
wgrant	But if the master's not available then they should fall back	08:54
wgrant	xmlrpc-private also usually uses the master policy	08:55
wgrant	Because it wants to be as up to date as possible	08:55
stub	Yeah, so we can do that for all master requests, which would be a lie. Or make a master + fallback policy for them, and switch them to using that policy.	08:55
wgrant	Right, I think we want a MasterIfYouCan policy which all those use	08:55
wgrant	So the slave policy should always allow fallback to master	08:56
wgrant	the masterifyoucan policy can always fall back to a slave	08:56
wgrant	and the master policy just fails	08:56
stub	Yes, slave falling back to master is documented as allowed.	08:56
wgrant	Oh right, I think it even already does that	08:57
wgrant	It must	08:57
stub	I don't think we do that dynamically anywhere	08:59
wgrant	So I think default_flavor = MAIN_FLAVOR becomes eg. flavours = [MAIN_FLAVOUR, SLAVE_FLAVOUR]	08:59
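The fallback rules wgrant sketches (slave may fall back to master, MasterIfYouCan may fall back to a slave, master just fails) amount to an ordered list of flavours per policy. A minimal illustration, assuming hypothetical class names and a `connect` callable that are not taken from Launchpad's dbpolicy.py:

```python
# Hypothetical sketch: each policy lists flavours in order of preference.
# StoreUnavailable, FallbackPolicy and connect() are illustrative names only.
MASTER = "master"
SLAVE = "slave"


class StoreUnavailable(Exception):
    """Raised when no store of the wanted flavour can be obtained."""


class FallbackPolicy:
    flavours = ()

    def __init__(self, connect):
        # connect(flavour) returns a store or raises StoreUnavailable.
        self.connect = connect

    def get_store(self):
        # Try each flavour in preference order; the first that works wins.
        for flavour in self.flavours:
            try:
                return self.connect(flavour)
            except StoreUnavailable:
                continue
        raise StoreUnavailable("no usable store among %r" % (self.flavours,))


class MasterPolicy(FallbackPolicy):
    flavours = (MASTER,)          # no fallback: just fails


class MasterIfYouCanPolicy(FallbackPolicy):
    flavours = (MASTER, SLAVE)    # prefers master, tolerates a slave


class SlavePolicy(FallbackPolicy):
    flavours = (SLAVE, MASTER)    # prefers a slave, may use master
```

With the master down, `MasterIfYouCanPolicy` hands out a slave store while `MasterPolicy` raises, which is the distinction the discussion is after.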
wgrant	stub: I thought the slave policy respected lag	08:59
wgrant	But I can't remember exactly.	08:59
stub	oh yes	09:00
stub	Only in the LaunchpadPolicy	09:00
wgrant	Indeed	09:00
wgrant	SlaveDatabasePolicy doesn't respect lag	09:00
wgrant	That's probably a bad idea	09:00
stub	We only choose the default based on lag.	09:01
wgrant	Right, and only in LaunchpadDatabasePolicy	09:01
stub	master requests still get a master if explicit, and slave requests still get a slave if explicit, no matter lag.	09:01
wgrant	Oh hm	09:01
wgrant	True	09:01
wgrant	That sucks	09:01
wgrant	So	09:01
wgrant	I think most of dbpolicy.py wants a bit of a rethink	09:02
wgrant	and	09:02
wgrant	most importantly	09:02
wgrant	a de-Americanisation	09:02
wgrant	:)	09:02
wgrant	Because there's not much reason to ever not respect lag	09:02
wgrant	Lag should probably be treated as failure	09:02
wgrant	Although not failed enough that it won't use it as a last resort	09:03
stub	I think there is plenty of stuff that is happy using a slave even if it is an hour behind, and we raise alerts when things are 5 minutes behind	09:03
wgrant	True	09:05
wgrant	So yeah	09:06
wgrant	xmlrpc	09:06
wgrant	xmlrpc-private	09:06
wgrant	recently-POSTed webapp	09:06
wgrant	and API	09:06
wgrant	probably all want to fall back to slaves	09:06
wgrant	Webapp writes shouldn't	09:06
wgrant	So we need a new policy	09:06
stub	Try slave first, fallback to master ;)	09:06
wgrant	Right, that's correct for webapp	09:07
wgrant	But xmlrpc probably wants the opposite	09:07
wgrant	Or at least a very low lag limit	09:07
stub	I'm joking there.	09:07
stub	I'd like to try to use the LaunchpadDatabasePolicy logic if there is a session cookie	09:07
wgrant	Right, that logic is still good	09:08
wgrant	Except that that should only influence the default	09:08
stub	And I'm still annoyed nobody would put in a session token to the webapi, killing its scalability. But I suspect a lot of clients give us a cookie anyway.	09:08
stub	I think the way forward is to try this with just slave fallback, which is the original ppa use case. Get the bugs ironed out on the production side before complicating things further.	09:09
wgrant	I'm just worried that we're complicating things unnecessarily by adding a hacky single-case fallback	09:10
stub	We only have 2 types of connections, 3 if you count 'DEFAULT'. We don't really need a generic framework.	09:11
stub	Or do you mean this shouldn't be in the BaseDatabasePolicy?	09:11
wgrant	Right, I don't think this belongs in BaseDatabasePolicy	09:12
wgrant	I'm not quite sure where is better	09:12
stub	I can put it in SlaveDatabasePolicy and SlaveOnlyDatabasePolicy	09:12
wgrant	(also, am I missing something or do you try to reretrieve the same store there? you don't change the flavour)	09:12
wgrant	Which means it'll just try to regrab the slave	09:13
stub	typo	09:13
stub	Not tried actually using this yet, just thinking through the idea :)	09:13
wgrant	Heh	09:14
* wgrant foods		09:14
czajkowski	wgrant: stub you guys doing anything to LP right now? getting timeouts doing the licence review	09:46
czajkowski	(Error ID: OOPS-e3753aed5cfe86fe227192e43be904c1)	09:46
stub	nope	09:46
czajkowski	hmm	09:46
stub	Bah. Need to rethink this, again. DBPolicy will happily hand out stores from the ZStorm cache even if they won't work, and I don't want to test if connections work every time the policy is invoked.	09:56
wgrant	stub: Hm	10:06
wgrant	stub: Well	10:06
wgrant	stub: It should be done the same way as the lag check, right?	10:06
wgrant	I forget at what stage that is done, but whatever it is it's right at the start of the request	10:06
wgrant	And for non-request-based stuff we probably just want to switch when a connection fails, maybe?	10:07
wgrant	Although that makes it harder to fail back	10:07
stub	We might not have a request	10:07
stub	I think I can ask the store if it is in a disconnected state or not before handing it out. If it is disconnected, attempt reconnection	10:07
wgrant	Yeah	10:07
stub	So a script running during fdt will get disconnected and need to handle that. And when it handles it, it will get the master store if it asks for the slave.	10:08
stub	Just need to wade through to see how to detect disconnected state, and to force a reconnection attempt.	10:08
wgrant	stub: Yeah, we may just have to wait for a disconnectionerror to be raised, I suspect	10:11
wgrant	And teach stuff to deal	10:11
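"Wait for a disconnection error and teach stuff to deal" could look roughly like the following for a script. This is a sketch under assumptions: `DisconnectionError` here is a local stand-in class, and `run_with_reconnect` and `get_store` are hypothetical names, not existing Launchpad helpers.

```python
# Hypothetical sketch: retry a unit of work after a dropped connection.
class DisconnectionError(Exception):
    """Stand-in for the error a database layer raises when a connection drops."""


def run_with_reconnect(get_store, work, attempts=2):
    """Call work(store), asking for a fresh store after each disconnection.

    get_store() is assumed to consult the database policy, so a second call
    may return a different (fallback) store than the first.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        store = get_store()
        try:
            return work(store)
        except DisconnectionError as error:
            last_error = error  # remember it and try again with a new store
    raise last_error
```

The retry only makes sense if `work` is safe to run again from the start, which is the part scripts would need to be taught.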
=== benji changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On call reviewer: benji | Firefighting: - | Critical bugs: 4.0*10^2
=== al-maisan is now known as almaisan-away
deryck	abentley, adeuring, rick_h_ -- let's be ready for a longer stand-up today, to let rick_h_ lead us through the mockups he has.	13:20
abentley	deryck: okay.	13:21
rick_h_	party	13:21
rick_h_	drink refill before the meeting, got it	13:21
abentley	rick_h_: hey, don't party too hard :-)	13:22
jam	jelmer: I added a card to track the translations stuff, but I don't have any specific insight into it.	13:49
jam	My first guess is that there is an issue with a cron job that is supposed to be running.	13:49
jam	jtv is someone you can poke for translations background, but he doesn't necessarily know more intimate details if it is an operational issue	13:49
jam	wgrant is generally the person with the most ops ideas	13:49
rick_h_	gary_poster: ping, hazmat ping'd me about looking over their YUI work on the juju js app/ui and deryck mentioned that since you guys were coming into that work should someone from your squad take part in the discussion	15:25
rick_h_	bah	15:25
gary_poster	rick_h_, hey thank you. why the bah? It would be great to be a part of it, I think, though I'll check with hazmat	15:28
czajkowski	jam: jelmer jtv will be on later if that helps, in the meantime I'm going to put an announcement out on Twitter and places as we're getting more bugs/questions logged about it	15:30
czajkowski	it's added to the topic as well in case people ask, that way you're not under as much pressure to find an answer	15:30
deryck	abentley, ready for call?	15:30
deryck	stand-up hangout is fine	15:31
abentley	deryck: sure.	15:31
=== bac- is now known as bac
=== salgado is now known as salgado-lunch
=== almaisan-away is now known as al-maisan
=== al-maisan is now known as almaisan-away
czajkowski	hiya someone help me for a moment, bug https://bugs.launchpad.net/launchpad/+bug/1041864 I wanted to change the URL for the import, but when I do i get invalid as it's being used elsewhere and I've never seen that issue before :/	15:59
_mup_	Bug #1041864: Badly named weston import <Launchpad itself:New> < https://launchpad.net/bugs/1041864 >	15:59
maxb	czajkowski: Hm.. I don't think the user wants the import URL change	16:18
maxb	+d	16:18
maxb	Can someone look up OOPS-eb261d3e309c39d6948f60de23422af9 for me?	16:19
rick_h_	maxb: loaded, the user isn't a member of the team	16:20
czajkowski	rick_h_: you're faster than I was	16:21
czajkowski	maxb: http://pastebin.ubuntu.com/1170166/	16:21
maxb	Oh, right, this is because ~vcs-imports members have slightly weird hybrid edit permissions on code imports, I remember now	16:23
=== salgado-lunch is now known as salgado
=== Beret- is now known as Beret
jtv	czajkowski, jelmer: will be at least 8 more hours before I'm here — provided I get well enough. What's the crisis?	17:47
czajkowski	jtv: translation imports seem to have stopped	17:48
czajkowski	jtv: https://bugs.launchpad.net/launchpad/+bug/1041858	17:49
_mup_	Bug #1041858: No daily translation export anymore <Launchpad itself:Triaged> < https://launchpad.net/bugs/1041858 >	17:49
jtv	Import or export?	17:49
czajkowski	export sorry	17:49
jtv	And it's not the normal exports, but the exports to branches, I see.	17:50
jtv	Now, there's always a few exports that are skipped because the branches are locked, or there are concurrent translation updates that the exporter doesn't want to overwrite, etc.	17:50
jtv	So there's a big difference between “several people haven't seen it work” and “it's stopped.” Do we know which it is?	17:51
czajkowski	https://answers.launchpad.net/launchpad/+question/206912 https://answers.launchpad.net/launchpad/+question/206948	17:52
czajkowski	jtv: I asked jelmer to look into it today	17:52
czajkowski	not sure he made progress or what update he got with it	17:52
jtv	If there's no breakthrough, chances are it's hard to diagnose — which most likely means a crash in a C-level library. Might be worth finding out if the outage coincides with an upgrade.	17:53
jtv	Hmm the log on crowberry simply stops on the 17th. We haven't disabled the whole cron job by any chance?	17:55
czajkowski	jtv: no clue :s	17:55
czajkowski	jtv: but as soon as jelmer and jam come on tomorrow will get them to look	17:56
czajkowski	or ask wgrant in a bit when he arrives	17:56
czajkowski	august 17th was when we did a lot of the DC move	17:56
jtv	There's more log on taotie.	17:56
jtv	I'll see if anything jumps out, but will leave it to the Dutch Cavalry otherwise then.	17:57
czajkowski	thanks jtv	17:57
jtv	czajkowski: I can give you my quick & vague impression… The exporter writes to branches, which triggers branch-scan jobs (to make Launchpad notice the branch changes). These seem to go into Celery now, via a RabbitMQ queue. It looks as if the connection to RabbitMQ started breaking on the 18th (possibly from a change on the 17th, since this job runs in early morning UTC) and eventually on the 21st somebody may just have killed and disabled the	18:10
jtv	Well, not disabled exactly. Maybe the lock file from the aborted run is still there; I seem to remember that logging is a bit asymmetrical when it comes to those lock files.	18:10
jtv	It may mean that that instance from the 21st is still hanging around trying to request a branch scan for Stellarium, and the subsequent runs are quietly giving up as they notice that.	18:11
lifeless_	flacoste: o/ - just settling Cynthia a little, back soon	19:05
flacoste	lifeless_: o/	19:07
=== lifeless_ is now known as lifeless
lifeless	flacoste: ok, ready when you are.	19:13
* deryck heads home, back online shortly		19:45
=== salgado is now known as salgado-afk
=== salgado-afk is now known as salgado
=== BradCrittenden is now known as bac
=== benji changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On call reviewer: - | Firefighting: - | Critical bugs: 4.0*10^2
wallyworld	wgrant: StevenK: mumble?	22:02
=== Ursinha` is now known as Ursinha
=== jelmer_ is now known as jelmer
=== jelmer is now known as Guest90165
=== Guest90165 is now known as jelmer
lifeless	http crackheads of the world, does bug 1040689 strike you as add?	23:15
lifeless	s/add/odd/	23:15
_mup_	Bug #1040689: add api to refresh an existing token <escalated> <Canonical SSO provider:In Progress by ricardokirkner> < https://launchpad.net/bugs/1040689 >	23:15
StevenK	lifeless: Hello replay attack?	23:16
