/srv/irclogs.ubuntu.com/2012/07/04/#bzr.txt

=== zyga is now known as zyga-afk
jonohey05:20
jonoI am getting this error:05:21
jonoUsing saved push location: http://bazaar.launchpad.net/~jonobacon/ubuntu-accomplishments-system/adminportal/05:21
jonobzr: ERROR: Cannot lock LockDir(http://bazaar.launchpad.net/~jonobacon/ubuntu-accomplishments-system/adminportal/.bzr/branch/lock): Transport operation not possible: http does not support mkdir()05:21
jonocan anyone help me resolve it?05:21
bob2pretty sure you want to push via ssh instead05:22
bob2I think it'll be misconfigured due to you using an lp: url to clone without having used bzr launchpad-login (so lp: resolves to anonymous http)05:22
bob2from my vague distant memories :)05:23
jonobob2, I just re-authed with launchpad-login and it doesn't seem to fix it05:24
jonoI think the issue is that another user checked this branch out05:24
jonoand I changed the owner/group on the code05:24
bob2possibly logging in will fix it, possibly you'll need to replace the http url in .bzr/checkout (maybe) by hand, or use 'bzr push --update-defaults-or-whatever ssh://'05:24
bob2what does bzr info say the url is?05:24
jonoUsing saved push location: http://bazaar.launchpad.net/~jonobacon/ubuntu-accomplishments-system/adminportal/05:25
bob2yeah05:25
jonoso I need to set it to default to ssh?05:25
spivIf you're doing bzr push, just use "bzr push --remember lp:~jonobacon/ubuntu-accomplishments-system/adminportal" this once to remember the right location05:25
jonocd ..05:26
spiv(now that you've done 'bzr lp-login', the lp: URL will resolve to the right location)05:26
jonothanks spiv05:26
jonothat fixed it05:26
mgzmorning08:09
james_wI'm stopping the importer for the rollout10:09
james_wbacked up databases10:42
james_wcleaning some cruft from the dbs and vacuuming10:42
james_wstarting importer10:46
james_wfirst successful import10:52
jmlwhat's the reason for the named NOT NULL constraints in the table definitions in udd?11:06
jmle.g. signature TEXT CONSTRAINT nonull NOT NULL11:06
jmlWhy not "signature TEXT NOT NULL"?11:06
james_wdon't know11:08
james_wI'd say one of "sqlite refused the alternative" or "I didn't know any better"11:09
jmlok, thanks.11:10
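
The question above can be checked directly against SQLite. The following is a minimal sketch, assuming a throwaway in-memory database and two hypothetical one-column tables (`named` and `terse`, not the real udd schema). It suggests that SQLite accepts both spellings and enforces NOT NULL either way, so the constraint name looks optional rather than required.

```python
import sqlite3

# In-memory database; table names here are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE named (signature TEXT CONSTRAINT nonull NOT NULL)")
conn.execute("CREATE TABLE terse (signature TEXT NOT NULL)")


def rejects_null(table):
    """Return True if inserting NULL into `table` raises IntegrityError."""
    try:
        conn.execute(f"INSERT INTO {table} (signature) VALUES (NULL)")
    except sqlite3.IntegrityError:
        return True
    return False


results = {table: rejects_null(table) for table in ("named", "terse")}
```

In this sketch the only difference between the two tables is the constraint name, which matters mainly when a database reports the name of a violated constraint.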
jmlhmm.11:15
james_wimporter seems to be running fine so far fwiw11:15
jmljames_w: yay11:15
jmljames_w: some of these tables seem pretty weird, especially when creation is expressed with terser syntax11:15
jmle.g. CREATE TABLE packages_update (anupdate TIMESTAMP NOT NULL)11:16
james_whmm, yeah11:16
jmle.g. CREATE TABLE commits (package TEXT PRIMARY KEY)11:17
mikhashi, I am trying to use bzr-git on a git branch with a non-linear history (read: it has some merge commits)11:38
mikhascloning the branch fails, and I get bzr: ERROR: Unknown extra fields in <Commit f45a66d9ea4e1a3043c66e4cb0ee21f56ff7dcfd>: ['mergetag', '', '', '', '', '', '', '', '', '', '', '', ''].11:39
mikhasany ideas?11:39
vilajames_w: Count me in11:54
james_wvila, sorry, for what?11:54
vilajames_w: watching the importer11:55
james_wvila, oh, great, thanks11:55
vilaI see dead imports...11:55
vilajoke aside, how did you start it ?11:55
james_w./etc-init-mass-import start or whatever the script is called11:56
vila674 outstanding jobs seems... unexpected11:56
vilaand only 18 failures too11:56
vilasounds like.. dunno, retry-all-failures or whatever option jml added is on11:56
vilaon the bright side, some imports have succeeded11:57
* jml doesn't remember adding options11:57
vilaoh, did you clear the failures table or something ?11:57
james_wvila, not intentionally11:58
vilajml: pi.retry_all_failed_jobs ?11:58
vilajames_w: ok, may not be a problem in the end, was just checking11:58
james_wvila, it might be11:58
jmlvila: I might have written it, but I honestly can't remember :)11:58
vilajml: :)11:59
james_wthere's still a lot of failures in the failures table11:59
james_wso maybe a bug in the retry logic11:59
james_wit's also failing quite a lot with "something changed" with its own revisions, which shouldn't happen11:59
james_wbut that seems to have been happening a fair bit recently, so I'm not sure it's related to storm12:00
vilajames_w: there were quite a lot of such failures12:00
vilajames_w: we have successful imports, that means, most of the db updates are exercised and branches pushed12:01
vilaeven gtk+3.0 imported a release (I saw it succeed for several of them this morning so that's a *new* one)12:01
vilajames_w: the backup you did should allow further and finer diagnosis, keep them preciously ;)12:02
vilajames_w: egoboo killed ?12:03
vilahttp://package-import.ubuntu.com/status/egoboo.html#2012-07-04%2010:30:26.488895 ?12:03
vilaoh, *before* you restarted ?12:03
james_wvila, yeah, I got bored of waiting12:04
vilabut that one wasn't requeued ? weird12:04
james_wsorry, should have requeued that already12:04
vilajames_w: not sure, wait12:04
james_wyep12:04
james_wnot doing anything :-)12:04
vilaegoboo is the new nexuiz-data, it can't succeed IIUC12:04
vilaso better not requeue it12:04
vilahmm, key AssertionError:<module>:main:_import_package:find_unimported_versions:check will be retried automatically ?12:07
vilasomething is probably wrong here... as if some data was inverted about the failure12:07
vilaor the retry one12:09
=== zyga is now known as zyga-afk
vilajames_w: my guess would be that something is wrong with the retry table and the code can't find what it's searching for, inverting the whole behaviour12:13
james_wvila, when considering what to automatically retry?12:14
vilaat least12:15
vilaall the failures being retried is already unexpected, then all failures are seen as needing a retry too12:15
vilajames_w: (out of curiosity) what did you clean in the dbs ?12:17
james_wvila, old rows in the jobs table12:17
james_wand old rows in the import table12:17
=== zyga-afk is now known as zyga
vila'old' means ?12:18
james_wvila, earlier than this month for import12:19
james_wvila, earlier than this year and active=0 for jobs12:19
vilahmm12:20
vilaI mostly understand the latter (I thought all active=0 can be purged and the importer will catch up) but the former rings no bell12:21
vilaha no, the lp lag window, so not all active=012:21
vilajames_w: just got the emails about {categorise|graph}_failures12:27
james_wvila, I fixed categorise_failures12:28
james_wI haven't seen graph_failures yet12:28
vilajames_w: yup, I saw the fix before the email :)12:28
james_wah, ok :-)12:28
jmlis 'jobs.id' ever used as a foreign key?12:28
jmlI don't think so.12:29
james_wnot that I recall12:29
vilajames_w: looks like the same traceback12:29
james_wthese dbs haven't ever really heard of normal form12:29
vilajames_w: so most probably already fixed too12:29
james_wok12:29
vilajames_w: http://package-import.ubuntu.com/status/mlton.html#2012-07-04%2012:30:30.284526 is *definitely* something that shouldn't be retried12:35
vilajames_w: this means the next lp fdt will create a bunch of failures that won't be retried12:35
james_wvila, why is that?12:36
vilawell, if the retry logic is inverted, true failures are retried and transient ones are not12:36
james_wah, I see12:37
james_wand retrying that failure would have bad consequences? or it's just something that certainly won't be fixed by a retry?12:38
viladunno, that's the point, but I would refrain from requeuing anything until you understand what is going on as I don't know what kind of data will be inserted then12:39
vilanexuiz-data should not be retried because it can't succeed (we don't know why yet), but it's running right now (it has a special-cased timeout of 24h),12:41
vilathe last time it failed (I don't remember exactly how) the associated failure was a non-transient one,12:41
vilaas such it was black-listed12:41
vilanow it's not12:41
vilano big deal for this one, but as a whole, the importer has now lost its knowledge about what should be retried or not12:42
james_wvila, it certainly looks like it is retrying everything13:16
james_wexcept for that linux failure, which is odd13:16
james_wI'll look in to that13:16
vilajames_w: hmm, now that you mention it... I wonder if the linux failure wasn't marked as a transient one *by mistake* !13:18
vilajames_w: and since you said the blob -> text stuff shouldn't be an issue, I wonder if the root cause may be in how signatures are created from backtraces or something...13:19
james_wyeah13:20
james_wthat's what I'm thinking13:20
james_wif that changed (either type, or the content) then it could well cause this behaviour13:20
james_wthough I would expect it to stop auto-retrying, rather than to auto-retry everything13:20
vilahehe, that's part of what makes bugs interesting, they surprise us :)13:21
vilawow, wow,  Exception: sqlite3.OperationalError: database is locked13:21
vilanot yet on the web page13:22
vilapython-bsddb3 base-installer vdr-plugin-live nexuiz-data (!)13:23
vilafollowed by two successful imports13:23
vilaweb page updated13:24
mikhasis there a workaround for git mergetags? https://bugs.launchpad.net/dulwich/+bug/96352513:24
ubot5Ubuntu bug 963525 in Bazaar Git Plugin "mergetag support" [High,Triaged]13:24
mgznot that I'm aware of.13:24
mikhasso nothing I can do then? other than swearing at bzr, of course?13:25
vilajames_w: and mails from add_import_jobs and categorise_failures with the same traceback too13:26
james_wvila, database locked ones?13:26
vilayup :-/13:26
james_wok13:26
james_wno need to panic yet13:26
james_wit's not like they were completely absent with the old code13:26
vilaI don't :)13:26
vilaindeed13:26
mikhassince the error message I get is "ERROR: Unknown extra fields in blah", I was wondering whether I could force bzr to simply ignore the extra fields?13:27
vilaI realized today we weren't talking about the same db locked errors, you were talking about the one in your first attempt while I was talking about the existing ones, I was still hoping your change could kill two birds though :-/13:28
mgzmikhas: ask in the bug or on the list, jelmer would know best but is not around today13:29
mikhasok13:29
vilajames_w: still, the ones in the existing code fired like... 10 times in the last 2 weeks, we're at 7 in 3 hours. In some ways, it's progress, the bug is revealing itself, ready to be fought ;)13:33
james_wheh13:33
vilaI don't remember categorise-failures triggering it either but I may be wrong13:34
james_wvila, it's run less frequently than add-import-jobs, so I'd expect to see it fail with that error roughly one fifth of the number of times13:36
vilajames_w: unless an fdt ruined the game, and given that some very old failures seem to succeed (given the number of releases imported for some packages), I think it's worth letting the importer run a bit longer13:37
vilatrue13:37
james_wvila, I don't think there's a distinction between the existing errors and the ones from the last attempt13:37
james_wit was just a matter of frequency of them occurring13:37
james_wI predict we will all but abolish them by moving to postgres13:37
vilathat's a risky bet ;)13:38
james_wso unless there is a significant performance degradation with the importer on the current code we should just push ahead with that13:38
james_wvila, I'll bet you a beer :-)13:38
vilahehe13:38
vilaif you fix the retry stuff, the picture will be clearer,13:39
vilaif we don't get too many locked errors, we can mark them as transient errors and pursue the experiment13:39
james_wyeah13:40
vilaadd-import-jobs and categorise-failures won't retry unless you fix them too but again as long as they *can* run often enough, it's still worth continuing13:40
vilajames_w: you still have 645 outstanding jobs to find a fix ;)13:41
vilajames_w: just to confirm, in your udd use case, you don't run import-package, right ? Nor need bzr-builddeb or pristine-tar still, right ?13:48
james_wvila, correct, we do not13:48
vilagood13:48
vilahehe, oops is a package name, was wondering about its appearance in an error message ;)13:50
james_wvila, ok, found it13:52
james_wMP coming13:52
vilajames_w: hmm, looking at the crontab categorise-failures and add-import-jobs run at the same frequency...13:52
james_wvila, hmm, ok, I was misremembering then13:52
james_wit is odd that it is usually add-import-jobs then13:53
vilano worries, yeah weird13:53
vilaboth should run at the same time, may be one is blocking the other in "normal" circumstances13:53
vilajames_w: which reminds me of another question: you mention a 30 second delay for sqlite and I recently realized that's the case in your proposal, but what was it before ?13:54
james_wvila, 30 seconds13:55
james_wit has been for months/years13:55
james_wit crept up as the load went up13:56
vilayou mean: implicit before and explicit in your proposal ?13:56
james_wvila, nope, it was explicit before as well14:17
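
For readers unfamiliar with the "database is locked" error and the explicit delay discussed above, here is a minimal sketch using Python's standard `sqlite3` module. It assumes a temporary database file and a hypothetical one-column `jobs` table (not the real udd schema), and it uses a 0.2 second timeout instead of 30 seconds so that it finishes quickly. One connection holds the write lock while a second one tries to write and gives up once its timeout expires.

```python
import os
import sqlite3
import tempfile


def write_is_blocked(path, timeout):
    """Return True if a second writer fails with 'database is locked'."""
    # isolation_level=None lets us issue BEGIN/ROLLBACK ourselves.
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("CREATE TABLE IF NOT EXISTS jobs (id INTEGER PRIMARY KEY)")
    holder.execute("BEGIN IMMEDIATE")  # take the write lock and keep it
    # `timeout` is how long this connection waits for the lock to clear.
    waiter = sqlite3.connect(path, timeout=timeout, isolation_level=None)
    try:
        waiter.execute("INSERT INTO jobs DEFAULT VALUES")
        return False
    except sqlite3.OperationalError as exc:
        return "database is locked" in str(exc)
    finally:
        waiter.close()
        holder.execute("ROLLBACK")
        holder.close()


with tempfile.TemporaryDirectory() as tmpdir:
    blocked = write_is_blocked(os.path.join(tmpdir, "demo.db"), timeout=0.2)
```

A longer timeout only helps when the competing writer releases the lock within that window, which fits the remark that the delay crept up as the load went up.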
=== zyga is now known as zyga-food
james_wvila, https://code.launchpad.net/~james-w/udd/fix-auto-retry/+merge/11340914:47
vilawow, just a misplaced is not None ?14:49
james_wvila, yeah14:50
james_wand missing tests :-)14:50
vilaindeed :)14:50
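
The log does not show the actual change in the merge proposal, so the following is only an illustration of how a misplaced `is not None` can make every failure look retryable. All names (`find`, `one`, `should_retry_buggy`, `should_retry_fixed`) are hypothetical and the "query" is a plain list filter standing in for a database lookup.

```python
def find(rows, signature):
    """Hypothetical query helper: always returns a list, possibly empty."""
    return [row for row in rows if row == signature]


def one(result):
    """Return the single matching row, or None when nothing matched."""
    return result[0] if result else None


def should_retry_buggy(rows, signature):
    # The check is applied to the result list, which is never None,
    # so every failure is treated as retryable.
    return find(rows, signature) is not None


def should_retry_fixed(rows, signature):
    # The check is applied to the fetched row, which is None when the
    # signature is not in the retry table.
    return one(find(rows, signature)) is not None


retry_table = ["timeout"]  # assumed list of transient failure signatures
buggy = should_retry_buggy(retry_table, "AssertionError")
fixed = should_retry_fixed(retry_table, "AssertionError")
```

A test that feeds one known transient signature and one unknown signature through the retry check would catch this class of bug, which matches the comment about missing tests.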
vilajames_w: approved14:51
james_wthanks14:51
james_wrolling it out14:51
vilajames_w: I have to go in a few minutes,14:51
vilayeah, roll out, kill whatever is in the way but keep refraining from requeuing until you have a good feeling you nailed that one good ;)14:52
james_wyeah14:52
mgzI'll be around a bit longer, so feel free to bug me instead for anything urgent14:53
fullermdmgz: My lawn needs mowed...14:58
vilajames_w: the fix is convincing and rules out the data so that's encouraging14:59
james_wvila, yeah15:01
james_wvila, yeah, and testing with local dbs shows that the same data gives the correct behaviour now15:03
vila\o/15:03
* mgz seeds fullermd's lawn15:07
vilajames_w: back for a tiny bit, you did deploy with no down time ? ;)16:05
james_wvila, I did16:05
vilacute :)16:05
james_wthough mass-import should really be restarted16:05
james_wotherwise it may start to think that LP is down when it isn't16:06
vilato detect, yeah16:06
james_wI can do that before I go if you like?16:06
vilaor even now, so it will just have to be restarted16:07
vilahmm, libreoffice...16:07
vilawell, the web page shows that the fix is effective, since no data was harmed, we can requeue, so mass-import stop and the killed ones should just restart IIRC16:08
james_wvila, stopping...16:16
vilaI see, driver said db locked, not sure if anything can be wrong there (I don't think so, bit surprising though)16:17
=== zyga-food is now known as zyga-afk
james_wvila, finally started again after a few kills17:19
james_wI have to leave now, I'll check in later17:19
vilajames_w: mass-import stop (not grateful-stop) didn't kill them ?17:19
james_wvila, it did not17:19
vilaweird17:19
james_wnor a normal kill17:19
vilaWeird17:20
james_wI had to get out the bigger hammer in the end17:20
=== r0bby_ is now known as robbyoconnor
=== r0bby is now known as robbyoconnor
=== r0bby is now known as robbyoconnor
