/srv/irclogs.ubuntu.com/2011/09/16/#bzr.txt

Noldorin_jelmer, hey. is any of it making sense to you now?01:17
Noldorin_this issue...01:17
pooliespiv, hi01:42
poolienm01:44
=== medberry is now known as med_out
* maxb catches up on a bit of ~bzr PPA update backlog04:41
pooliehi maxb!04:42
maxbmorning04:42
maxbHmm. Not sure if I'm doing something wrong, but I'm finding the process of doing something in bzr to model having copied the beta-ppa line of bzr packages into the main ppa to be somewhat awkward05:26
pooliehm05:26
pooliedo you want to talk more about it?05:27
maxbWell, I have a way which works, it's just not pretty05:27
maxbcd ....../ppa/natty05:27
maxbbzr merge ../../beta-ppa/natty05:27
maxbbzr revert -r-1:../../beta-ppa/natty05:28
maxbbzr ci05:28
maxbcd ../maverick05:28
maxbbzr merge ../../beta-ppa/maverick05:28
pooliehm05:28
maxbbzr merge --force ../natty05:28
maxbbzr revert -r-1:../../beta-ppa/maverick05:28
maxbbzr ci05:28
maxbcontinue for lucid05:28
poolieso you want to make a merge, that actually replaces everything with the origin?05:29
poolieyou could just pull, perhaps05:29
maxbIt would be an --overwrite05:29
poolieright05:30
pooliebut that's arguably the reality here05:30
pooliehm05:30
poolieit would be better if there was a revspec for "my pending merge tip" so you could revert to that, too05:31
pooliei think there's a bug asking for it05:31
maxbyes, yes it would05:31
maxbI may have even filed it. Or me-too-ed it05:31
maxbhow on earth have I ended up with differing file-ids in the lucid ppa branch for various packaging bits?!05:36
maxboh, no05:37
maxbjust conflict files left behind05:37
maxbbut I do now have conflicting tags between beta-ppa/natty and ppa/natty!05:38
poolie:/05:46
pooliethat could actually be coming out of the workflow you discuss?05:46
pooliesince there are going to be effectively two different revisions for 2.4.1-blah05:47
maxbThere should never be that05:47
maxbOne of the conflicting tags is 'bzr-2.3.3' !05:47
maxbthe rest are packaging in nature05:48
maxber, whoops05:49
* maxb realizes the need to revert . not revert in the above workflow05:49
poolieright, or it will lose your pending merge05:52
pooliei assumed that was just a typo - otherwise the ci will fail05:53
vilahi all !06:40
maxbMorning vila06:42
maxb2.4.1 building PPA06:42
vilahay maxb !06:42
vila\o/06:42
vilathanks a ton !06:42
maxber, building in PPA06:42
maxbwe have an issue with bzr-builddeb. Its tests seem to be hinting at a bug in bzrlib.tests06:43
vilayeah, I figured ;) (I'm pretty good at deciphering tyops, lots of practice in producing them myself)06:43
vilaha ha, tell me more06:44
maxbAttributeError: type object 'TestCaseWithMemoryTransport' has no attribute '_SAFETY_NET_PRISTINE_DIRSTATE'06:44
vilariiiings a bell06:45
vilawhat and where was it...06:45
vilaknown and fixed issue anyway06:45
vilamy background daemon is whispering jelmer or Riddell06:46
maxbor.... you?06:46
maxb:-)06:46
vilaAug 30 12:18:08 <vila>_SAFETY_NET_PRISTINE_DIRSTATE was introduced recently, let me check06:47
vilafakeroot06:47
maxbbzr/2.4 r6024 looks like it might fix it06:47
spivSounds a bit like something not calling the super class's setUp?06:47
pooliehi spiv, vila06:48
vilatest isolation says my IRC logs, reading06:48
vilahey poolie !06:48
maxbi'll retry the bzr-builddeb builds once bzr 2.4.1 has built06:48
vilamaxb: lp:bzr/2.4 revno 6024 is: (jameinel) Bug #609187,06:49
vila check that packaging import branches are up-to-date when accessing them.06:49
vila (John A Meinel) here06:49
ubot5Launchpad bug 609187 in bzr (Ubuntu) "users are not warned when branching ubuntu:foo (or lp:ubuntu/foo) and the package import of foo is out of date" [High,Fix released] https://launchpad.net/bugs/60918706:49
vila6042 !!06:49
vilapff, who said he can't figure out tyops ? Lier06:50
vilamaxb: yup, that's exactly the one06:50
vilapff, who said he *can* figure out tyops ? Lier06:51
jammorning all07:25
pooliehi jam07:28
poolievila, so it looks like the changelog merge hook is working ok?07:28
vilapoolie: ha, ha, quiz question first: how long does it take the package importer to queue/process all tracked packages ?07:29
poolieto check them all?07:30
pooliei'm not sure, good question07:30
pooliea couple of hours?07:30
vilanothing to win, I was surprised by the answer myself and like to see what people feel it is07:30
vilaexactly my feeling, ~2 hours07:30
jamvila: queue and check when there is nothing to do?07:30
vilajam: not exactly, on average, how long between two attempts for the same package07:30
vilajam: no cheating by looking at the logs, first thought !07:31
jamvila: I'm not quite sure what you're getting at. It should be trying to pull the data of what needs to be done next from LP07:32
pooliemaybe you can just tell us?07:32
jamAre you saying to retry a failed package?07:32
vila~36 hours07:33
vilai.e. on average, we try to import a given package every 36 hours07:35
vilafar more than the ~2 hours gut feeling I had07:35
vilahence why I wanted you to tell your gut feeling before I told you07:36
poolieok07:36
pooliewell, that could certainly be faster07:37
pooliewhere did you get that data? from sampling the logs?07:37
vilayup07:37
poolieperhaps eventually it could look at a feed-like view from launchpad07:37
vilalooking at the output of https://code.launchpad.net/~vila/udd/analyze-log-imports/+merge/74057 instead of tail -F progress_log gives a better feeling07:38
vilaholistic feeling that is07:38
vilapoolie: and from that and to come back to your original question: no evidence yet that dpkg-mergechangelogs broke something, but still a bit more time needed to know that it has been tried for all packages07:40
vilai.e. tonight, I'll mark the bug as fixed with reasonable confidence and still ask for it to be re-opened if needed07:41
vilabut I'm already convinced it's ok07:41
vila2011-09-15 05:00:30,456 - __main__ - INFO - All packages requeued, start again07:42
vilaso roughly the next occurrence is expected today ~17h00 UTC07:43
poolievila, so you're on bug 795321 now?07:44
ubot5Launchpad bug 795321 in Ubuntu Distributed Development "udd importer should make tea while launchpad is down" [High,In progress] https://launchpad.net/bugs/79532107:44
pooliejam that was very quick on the disconnect stuff07:44
poolieis there anything i can do on it?07:44
vilapoolie: yup, I have a circuit breaker implemented and tested, I'm now trying to plug it into mass_import07:44
poolienice07:44
vilathree events: attempt (to import), success, failure07:45
vilaone assumption is that when lp is down, no import can succeed07:45
vilaanother is that failures are already classified, some of them are transient07:46
vilaI expect the transient ones to be easily linked to lp down07:46
vilaand the final one is that we have a way to say: this failure is a transient lp one from the backtrace and/or because it raises launchpadlib.HTTPError07:49
vilaideally from the command line07:50
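A minimal sketch of the three-event circuit breaker vila describes (attempt, success, failure), assuming only failures already classified as transient count against Launchpad. The class name, the threshold and the retry delay are hypothetical and are not taken from the udd code.

```python
import time


class CircuitBreaker:
    """Hypothetical breaker driven by attempt/success/failure events."""

    def __init__(self, threshold=3, retry_delay=60.0, clock=time.monotonic):
        self.threshold = threshold      # transient failures before opening
        self.retry_delay = retry_delay  # seconds to wait before probing again
        self.clock = clock              # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None           # None means the breaker is closed

    def attempt(self):
        """Return True if a new import may be started now."""
        if self.opened_at is None:
            return True
        # Open: allow a probe attempt only once the delay has elapsed.
        return self.clock() - self.opened_at >= self.retry_delay

    def success(self):
        """A success closes the breaker and clears the failure count."""
        self.failures = 0
        self.opened_at = None

    def failure(self, transient):
        """Record a failure; permanent ones say nothing about lp being down."""
        if not transient:
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

With several imports running at once, a late success can arrive after Launchpad has already gone down, so a real implementation would probably want to look at a trend rather than a single event.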
vilabbiab07:51
jampoolie: you can test the disconnect stuff if you like, it seems to work fine here. On Windows and Natty at least.07:54
jamI think the next obvious step is to hook into that with a SIGUSR1/SIGHUP07:55
jamand then get the client to gracefully reconnect07:55
poolievila, one option for concurrency is to change to use tdb07:55
pooliei'm pretty sure that's multi-writer07:55
poolieand simple07:55
poolieand udd does very simple things with its db07:55
poolieeither that or whichever nosql is fashionable today07:56
jampoolie: well, you could just switch to postgres07:56
jamMost nosql solutions do very poorly at low scale07:56
jamfor example, mongodb defaults to pre-allocating 5% of your disk space (in my case, 1GB)07:56
poolieawesome07:57
pooliei wasn't very serious about that actually07:57
jampoolie: I didn't think you were. As for tdb, we could, but sqlite seems much more well tested. With the WAL work for sqlite 3.6 or so07:58
jamthe actual contention between readers and writers is tiny07:58
jamthe only issue is if you need multi writer07:58
pooliei think tdb is pretty reliable, though less commonly used07:59
nigelbpoolie: On the note of nosql -> http://howfuckedismydatabase.com/nosql07:59
poolieupgrading sqlite would be a smaller change07:59
jamnigelb: :)07:59
pooliei like that07:59
pooliemore to the point, http://howfuckedismydatabase.com/sqlite/07:59
nigelbheh07:59
poolieso would anyone (maxb?) agree or disagree with me in bug 831699 that to track success, it's probably cleanest just to add a success table?08:04
ubot5Launchpad bug 831699 in Ubuntu Distributed Development "no report/log of successful packages" [High,Confirmed] https://launchpad.net/bugs/83169908:04
poolieand/or refactor the 'failure' thing into an 'outcome' that can be either success or failure08:04
vilapoolie: right, not there yet (migrating from sqlite ;)08:04
poolie?08:05
vilamaking tea has issues with concurrency but not related to the db08:05
vila. o O (Go decipher that joe random ;)08:06
vilaFor the circuit breaker, the fact that lp can go down while several imports are running has two outcomes:08:06
Riddellaloha08:07
vila- you can see a failure *followed* by a success because the failure saw lp down while the success still had work to do to finish the import that didn't require lp08:07
vilaso the success is a false positive as far as lp state is concerned08:08
poolieright08:08
jampoolie: I think adding a success table makes sense. It gives you a place to say *when* it last succeeded, how long it took, whatever other stats you want to track08:08
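One possible shape for the success table jam describes, sketched with the standard sqlite3 module. The table name, the column names and both helper functions are assumptions made for illustration; the real udd schema is not shown in this log.

```python
import sqlite3


def create_success_table(conn):
    """Create a hypothetical 'success' table: one row per package."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS success ("
        " package TEXT PRIMARY KEY,"
        " succeeded_at TEXT NOT NULL,"       # when it last succeeded
        " duration_seconds REAL NOT NULL)"   # how long the import took
    )


def record_success(conn, package, succeeded_at, duration_seconds):
    """Insert or overwrite the latest successful import of a package."""
    conn.execute(
        "INSERT OR REPLACE INTO success VALUES (?, ?, ?)",
        (package, succeeded_at, duration_seconds),
    )
    conn.commit()
```

Keeping one row per package records only the most recent success; dropping the primary key would keep a history instead.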
poolieso there has to be some kind of trending08:08
poolieone swallow does not make a spring08:08
jampoolie: I always burn my coats when I see a swallow....08:09
vila- you can see a failure related to lp but the classification is wrong (permanent instead of transient) which is also a false positive08:09
vilapoolie: right, so as far as the circuit breaker is concerned, there is little interest in waiting for more success as long as we keep trying on transient failures08:10
jamvila: you could also just have it be a soft timeout that increments on failure, and decrements on success.08:11
jam(not necessarily by the same amount)08:11
vilaanother issue is that there is no fast and unambiguous way to decide if lp is up08:11
pooliealso08:11
jamso start a new package every 30s, if they are failing, make it 45, then 60, then...08:11
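jam's soft timeout could look roughly like the sketch below: the delay between package starts grows on failure and shrinks on success, by different amounts. The class name and the down-step and upper bound are assumptions; only the 30s, 45s, 60s progression comes from the discussion.

```python
class SoftDelay:
    """Hypothetical delay between starting two package imports."""

    def __init__(self, base=30.0, step_up=15.0, step_down=5.0, maximum=600.0):
        self.base = base            # delay when everything is healthy
        self.step_up = step_up      # added after each failure
        self.step_down = step_down  # removed after each success
        self.maximum = maximum      # never wait longer than this
        self.delay = base

    def on_failure(self):
        self.delay = min(self.maximum, self.delay + self.step_up)

    def on_success(self):
        self.delay = max(self.base, self.delay - self.step_down)
```

This does not address vila's concern about lag: the delay only adapts as fast as imports report back.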
vilajam: right, but the issue here is more about lag08:12
vilaI don't know how long a package will need to tell me lp is up08:12
vilaso I'm more inclined to say: we have a max_threads for mass_import,08:13
pooliecould you look at the code i quote in bug 83169908:13
ubot5Launchpad bug 831699 in Ubuntu Distributed Development "no report/log of successful packages" [High,In progress] https://launchpad.net/bugs/83169908:13
poolieit seems wrong but i might be missing the point08:13
vilawhat is wrong ?08:15
vilaOLD_FAILURES ?08:15
pooliewhy does it check for a failures entry, and only if that exists delete the old job08:15
vilaI think the JOB table is populated only if you had a failure08:15
vilaor if it's a new package08:16
vilaerr08:17
vilawon't work for new packages, forget that08:17
poolieno, i think there's always a job created when it starts, and it's marked closed higher up in this function08:17
poolieso the intention certainly seems to be that they're kept around, but inactive08:18
pooliealso, why delete it if it failed on the previous attempt08:18
jampoolie: "I wonder what the logic is behind deleting the job if there was previously a failure record."08:18
jamI think it is deleted on any success, indicating that it doesn't need to be run again08:18
jam(job completed)08:18
vilaso that it doesn't appear on the web page while it's queued ?08:18
jamah, nm, I see your point08:19
jampoolie: I think it is just faulty logic. It seems like it checks for row, because it wanted to use it, or something like that.08:19
vilamass_import starts by queuing the job table and only when it's empty does it look at the package table08:19
pooliei think it's a mismatch08:20
pooliewell, i have to go out now08:20
jamalso note that the 'delete from %s' doesn't follow the rest of the SQL refactoring that pulls strings out into constants, etc.08:20
pooliemaybe james will reply08:20
jampoolie: have a good night08:20
poolieexactly08:20
pooliei could annotate it08:20
vila. o O (my empire for a test framework there)08:20
vilapoolie: g'night08:20
poolienothing obvious there08:21
poolieok, we'll see08:21
pooliei'll track successes on monday08:21
pooliecheerio08:21
Riddellin add_hook e.g. self.add_hook('transform_fallback_location', "Called when a stacked branch is activating its fallback " etc   is the description ever shown to the user?09:37
Riddellare the strings in check.py check() self.progress.update() user visible?09:49
jamRiddell: "bzr help hooks" ?09:58
jamI'm not sure about the check.py stuff09:58
jambut if it is "progress", then yes, most likely user-visible09:58
jamvila: did you ever get a chance to re-review the patch I put up? I think I ironed out the kinks09:59
vilajam: not yet10:00
jamI think poolie's only comment was that I should probably be checking "errors"10:00
vilayup, I agree with that10:00
jamI was hoping to have a way to actually trigger that, so that I know the code works10:00
vilaI can list a few themes I want to mention without formally reviewing if you wish10:01
vilaor do that later more formally10:01
jamvila: feedback is feedback if you have time to list it out10:10
vilamedium has a disconnect method, why do you need to implement a _close() one ?10:11
vilathe config stuff could be revisited once bug #491196 is fixed, in the mean time, what you did is good,10:13
ubot5Launchpad bug 491196 in Bazaar "want a way to set configuration options from the command line" [High,Confirmed] https://launchpad.net/bugs/49119610:13
vilaI wouldn't require plugins to support the new timeout parameter (but I'm not sure you had a choice there)10:14
jamvila: client side has .disconnect. not server side10:14
jamI can call it "disconnect" if you like10:14
jamvila: I don't see a way to pass the timeout parameter optionally, other than what I did10:15
jamtry/TypeError10:15
vilajam: but is there a way to not *force* the plugins to accept it (i.e. could they just ignore it to start with and implement it later)10:16
vilai.e. is loggerhead *required* to take it into account *today*10:16
jamvila: I did that, try/pass_5_arguments/except TypeError/pass 4 arguments10:16
jamwith a deprecation warning inbetween10:16
jamvila: I did mention that loggerhead works *today* with a warning.10:17
jamwhich is suppressed in release builds10:17
jamvila: I know it isn't obvious, but too-many-arguments is a TypeError10:17
jamin python10:17
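The fallback jam describes (try the call with the new timeout argument, and on TypeError retry without it after a deprecation warning) can be sketched generically as below. The helper name and the placeholder arguments a to d are hypothetical; this is not the actual bzrlib code.

```python
import warnings


def call_with_optional_timeout(factory, a, b, c, d, timeout):
    """Call factory with the new timeout argument, falling back if it is rejected.

    Passing too many positional arguments raises TypeError in Python, which
    is what lets older plugins keep working for now.
    """
    try:
        return factory(a, b, c, d, timeout)
    except TypeError:
        warnings.warn(
            "factory does not accept a timeout argument; calling it without one",
            DeprecationWarning,
            stacklevel=2,
        )
        return factory(a, b, c, d)
```

One caveat of this pattern: a TypeError raised inside a factory that does accept the timeout is also caught, and the factory is then called a second time.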
vilajam: oh, I may have looked at on old version then, I don't remember seeing this part10:18
vilas/on/an/10:18
vilaok, good then10:18
jamvila: possible, though I think I implemented except TypeError when I implemented the command line.10:18
vila<jam> vila: client side has .disconnect. not server side10:19
vilabut test server side has shutdown_client (client from the server side pov)10:19
vilathere may be a way for the tests to use a native disconnect() if available10:20
jamvila: only the test server, not this implementation10:20
jamvila: what do you mean by 'native disconnect'?10:20
jamSmartServerStreamMedium does not have the concept of disconnecting the client yet10:20
jamyou added shutdown_client only in the Test implementations10:21
vilanative as in supported by the server side of the client10:21
vilayes, to limit the scope at the time and at least for the SmartTCPServer because spiv said he didn't care10:21
jamvila: SmartServerStreamMedium *is* the server side of the client. I really feel like I'm missing something here.10:21
jamvila: I'm happy to rename ._close() to .disconnect() and to change shutdown_client() to call .disconnect()10:22
vilathe test infrastructure didn't try to use it because that was the only existing one10:22
jamthough it doesn't quite match because of *what* self.clients tracks10:22
vilaha10:22
jamit doesn't track the medium/handlers, it tracks the actual socket connections.10:22
vilathat may be where the mismatch is,10:22
vilaI feel like you're adding stuff that is existing but may be that's because the dots are not connected10:23
vilaand a fallout of doing that may explain the weird behaviors you're seeing10:23
vilaand I'm still uncomfortable with the select loop as I have a feeling that it's needed because these dots are not connected10:24
jamvila: fundamentally the test infrastructure is poking at a thread's internal state from another thread.10:25
jamI don't think that is a 'stable' situation.10:25
jamWe do it because parts of the thread are blocking10:25
vilaon top of that, if the SmartTCPServer ends up needing to track its connections, it sounds like this code should be shared10:25
jamvila: well, *today* SmartTCPServer does *not* track its connections. They are set to Daemon and forgotten.10:26
vilathe alternative was to raise an exception in the client thread from the server thread but back in the days this wasn't easy to do and still be compatible with 2.4, 2.5 and 2.610:26
jamvila: this is like the thread.interrupt_main()?10:27
jamvila: In my last comments I noted something10:27
jamwhich is that *doesn't* interrupt socket.accept()10:27
vilajam: which is why you encounter issues with "interpreter shutdown" and need to keep references to sys.stderr and the line10:27
jamit waits for it to timeout/return first10:27
vilalike10:27
jamvila: so "interrupting" the client thread doesn't actually do such a thing10:27
jambecause it is blocked in a C lib10:27
jamso you *still* need to do a loop10:27
vilawhich loop ? the select one ?10:28
jamvila: In that case, a loop around socket.accept()10:28
jam(SmartTCPServer.serve)10:28
jamit wanted a loop already10:28
jambecause it wants to support multiple connections10:28
jamThere is a test case in blackbox10:28
jamthat calls "thread.interrupt_main()"10:28
vilainterrupt_main sounds like a thread interrupting the *main* thread which can receive signals10:28
jamand it doesn't actually interrupt until socket.accept() returns10:28
jamyou can see my notes on it10:28
vilaI'm talking about raising an exception *from* the main thread in another one10:29
jambut if you have "socket.settimeout(1)" it takes 1s for the test to shutdown10:29
vilaevil, don't do that10:29
jamvila: sure, *my* point is that raising an exception doesn't actually interrupt select.select() or socket.accept() or socket.recv() etc.10:29
jamvila: we already have that10:29
jamI "fixed" it with an optional "change the timeout" parameter.10:29
vilawhere ?10:29
jambzr selftest -s bb.test_serve10:29
jamI forget the exact test10:29
jamblackbox.test_serve.TestCmdServeChrooting.test_serve_tcp10:30
jamAnd I've seen that happen with one of the other tests10:30
jamI think it is a race condition with whether it gets blocked in the socket.accept() before it gets a chance to raise the exception.10:30
jamvila: so our SmartTCPServer already does a 1 second timeout loop in serve10:33
vilaI've long suspected race conditions but 1) I stopped encountering them when adding the necessary sync points, 2) I'm not convinced anymore that *python* itself has some10:33
jamvila: sync points?10:34
vilajam: on the listening socket, irrelevant, this one is ok10:34
jamvila: except it makes the test take 1s to shut-down10:34
jamwhich I override down to 0.1s10:34
vilabut that's not a race10:34
jamvila: there *is* a race in another test, where it sometimes waits an extra second to shutdown after calling thread.interrupt_main()10:34
jamit isn't strictly a 'race'10:35
jamas in, it always gives the same results10:35
jambut how long it takes varies10:35
jambecause the 'main' thread is blocked10:35
jamwaiting on socket.accept(). I would expect the same thing for select.select()10:35
jambecause the 'thread.interrupt_main()' *doesn't* use signals10:35
jamso if the call is in a C function10:35
jamit is blocked from python seeing the 'you need to raise an exception' call.10:36
jamvila: hence, you need a loop, to avoid blocking forever10:36
jamvila: for example, if I wrote the _wait_for_timeout code to do select.select(..., timeout=300). I *think* you could not ^C the python process.10:36
jamYou technically *could*, but it may not actually trigger until the 300s times out10:37
jamI've certainly seen stuff like ^C get blocked because we are in a C function.10:37
vilainterrupt_thread is not what I had in mind, it may share some common parts but the one I remember was allowing raising a specific exception not KeyboardInterrupt10:38
vilahttp://docs.python.org/library/thread.html?highlight=thread.interrupt_main#thread.interrupt_main10:38
jamvila: so we sort of got off on a tangent. My specific point is that you need the loop, regardless of testing-specific interactions, because you aren't guaranteed that ^C will do what you want10:38
vilasays: "Threads interact strangely with interrupts: the KeyboardInterrupt exception will be received by an arbitrary thread. (When the signal module is available, interrupts always go to the main thread.)"10:38
jamvila: you're looking at a different function than what you linked10:39
vilajam: my point is: you shouldn't need the loop and we don't know *why*10:39
vila?10:39
jamvila: http://paste.ubuntu.com/690681/10:40
vilayou mentioned interrupt_main, I don't remember the link of the alternative10:40
jamfollow your own link10:40
jamit doesn't say what you said10:40
jamvila: ah, way down at the end?10:40
jamvila: I can confirm that it is a Windows thing I'm seeing.10:46
jamSpecifically: socket.recv(1) is blocking on windows.10:46
jamsuch that ^C doesn't interrupt it10:46
jamand you have to kill the process by other means10:46
jamon Linux, I get KeyboardInterrupt reliably10:46
jamon Windows, once I finally send some data10:46
jamthen I see KeyboardInterrupt10:47
vilaI don't understand what you're talking about, interrupt_main ?10:47
jamvila: so there are 2 things10:47
jam1) socket.recv() blocks ^C until it returns on Windows (not on linux)10:47
jam2) socket.accept() is known to block thread.interrupt_main() from raising KeyboardInterrupt until socket.accept() returns (on all platforms)10:48
jamas in, it queues up a KeyboardInterrupt *to be raised when socket.accept returns*10:48
vilaforget interrupt_main, not all servers run in the main thread (most don't)10:48
vilaright, hence the need for the *test* which runs in the main thread, to act on the socket in the server context so that either the blocking call is unblocked or it raises an exception10:49
jamvila: going further, select.select() [on Windows] blocks ^C until it returns10:49
vilathat's the whole idea of shutdown_client10:49
vilajam: even when you specify the error parameter ?10:50
jamvila: thread.interrupt_main() doesn't matter if it is main thread or not, it is still blocked until socket.accept() returns.10:50
vilaright, so, let's forget about it10:50
jamvila: i'll try that, but given that KeyboardInterrupt is raised the moment select.select() returns10:50
jamvila: select.select([c], [], [c], 10) doesn't respond to ^C until  I either wait the 10s or write a byte to the client socket10:51
vilaoh, right, TestBzrServeBase, now I remember that one, special case, probably the only one where the server runs in the main thread10:51
vilaor close the client socket ?10:51
vilafrom the server end, not the client end10:52
jamvila: my experience with testing, was that server-end *does not return* until the timeout if I close the socket in another thread.10:52
jamI'll try again, though.10:52
jamvila: *my* point, is that because select.select() blocks stuff like ^C until timeout10:53
jamwe should use a shorter timeout, and loop10:53
jamand if we need a loop anyway10:53
jamthen we can not worry about it in the test suite.10:53
viladunno for select but I can assure you that it works for read() (for the client threads)10:53
vilaso either the select is interrupted or the read is interrupted and as long as you catch the right exceptions you shouldn't need the loop around select10:54
jamvila: there is no read(), do you mean recv() ?10:55
vilayeah, recv, sorry10:55
jamvila: and you mean thread.interrupt_main() or you mean ^C10:55
jam?10:55
vilaI mean whatever way is used to close the connection10:56
jamvila: I'm setting up a test case for that on Windows. I'll let you know.10:56
vilaforget about interrupt_main, it's a hack and not a good example (useful shortcut for TestBzrServeBase though)10:57
vilabut TestBzrServeBase is about checking hook execution IIRC not really about how the server is interrupted10:58
jamvila: ok, I can say that... testing a very simple test case threading.Thread(target = (sleep1, c.close()).start(); select([c], [], [], 10) returns before 10s saying that you can recv from the socket without blocking.10:59
vilaso not relevant for our current discussion because we *cannot* use it10:59
jamon Windows10:59
jamvila: I can also say, this was not borne out in the test suite10:59
jamwhere sometimes it hangs forever10:59
jamwell, until timeout10:59
vilaborne out ?10:59
jamvila: confirmed by10:59
jamsimilar results with10:59
vilaoh, I can believe that10:59
vilahangs are a pain to debug, therefore extremely hard to diagnose11:00
jamhttp://answers.yahoo.com/question/index?qid=20100702014132AAcKdX811:00
vilajam: thanks ! Keep them coming ! (And always feel free to fix my broken english)11:01
jamvila: so. select.select() will block ^C on windows until timeout, and seems unreliable that we detect the socket getting closed underneath us (according to the test suite).11:01
jamAs such, it seems that a loop is reasonable.11:01
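A rough sketch of the loop jam is arguing for: rather than one long select() call, which by his account blocks ^C on Windows until it returns, poll with a short timeout until an overall idle deadline passes. The function name and default poll interval are assumptions and this is not the code from the merge proposal.

```python
import select
import socket
import time


def wait_for_data(sock, idle_timeout, poll_interval=1.0):
    """Return True if sock becomes readable within idle_timeout seconds.

    select() is called with at most poll_interval seconds each time, so
    control returns to Python regularly and pending interrupts can be handled.
    """
    deadline = time.monotonic() + idle_timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False  # idle for too long; the caller may disconnect
        readable, _, errored = select.select(
            [sock], [], [sock], min(poll_interval, remaining))
        if readable or errored:
            return True
```

A socket also counts as readable once the peer has closed it, so the caller still has to treat an empty recv() as a disconnect.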
vila...11:01
vilajam: before I fixed the test suite leaks, we *were* using a similar loop11:02
vilawith timeouts to make matters worse11:02
jamvila: if only because you can't ^C the process until timeout, I don't think we should have a 300s timeout.11:02
vilayou lost me there, I thought the scope of the bug was lp, why should we pay for busy loop there ?11:04
jamvila: the scope of the bug is "we'd like to disconnect clients that are idle for too long", a 1s sleep loop isn't a lot of wakeups, though certainly you could reduce it if you wanted.11:05
jamvila: I feel pretty strongly that we *need* a loop on windows11:06
jamI can have "if sys.platform == 'win32': loop"11:06
jamThat seems far worse than just having a loop, which handles all the current issues, even if there may be other issues in the future.11:06
viladoes selec([],[], [xxx]) block ?11:09
viladoes select([],[], [xxx]) block ?11:09
jamvila: it blocks ^C until it returns, yes.11:10
jamvila: are you saying with no timeout?11:10
jamor are you saying with the socket passed in the errors field11:11
vilathe latter11:11
jam(and note, there is no *error* with ^C)11:11
jamvila: *on Windows*, select.select() blocks until the C function underneath it returns, and then stuff like exception handling and signal processing occur11:11
jambefore the python function returns11:11
vilajam: wow, and recv(), listen() etc all behave this way on windows ?11:12
jamvila: recv() dose11:12
jamdoes11:12
jamI'm not sure about accept11:12
jamI'll check11:12
vilajam: it's that easy to create an unkillable unstoppable process ?11:12
jamvila: yes11:12
jamvila: you can kill it11:12
jamjust not ^C it11:12
vilameh11:12
jamvila: cygwin's "kill" command kills the process just fine.11:13
vilakill is certainly a way to interrupt especially if you can trap it11:13
jamvila: no "real" signals on Windows.11:13
jamvila: I dug into this a lot back when I implemented "SIGQUIT" for the pdb debugger for windows11:14
jami'll see if I can dig it out, just a sec11:14
vilais it because only the main thread is seeing the signal ?11:14
jamvila: On Windows, *there is no signal*11:15
jamYou have "GenerateConsoleCtrlEvent" and "TerminateProcess"11:15
vilayeah, whatever C-c trigger11:15
jamvila: there are some very interesting restrictions, like one Terminal cannot send "GenerateConsoleCtrlEvent" to another console11:15
jamand some other things like you can only kill a process group, which kills yourself11:16
jamvila: I could be wrong, I'm a bit fuzzy on the details, but in general, thinking in terms of signals doesn't work on Windows.11:16
vilaright, which makes it an interesting platform for servers...11:17
jamvila: I think you can do a lot of things, but you have to write them in the windows way. WaitForObjectEx, etc.11:18
vilajam: so, the bug mentions xinetd and inetd not really common on windows AFAIK11:18
jamrather than trying to use the posix workarounds like select()11:18
vilaindeed11:18
jamvila: sure, it doesn't work anyway because you can't select() on a pipe11:18
jamthat doesn't mean we have to have 2 implementations11:19
vilawell, it may mean it's harder to fix for windows so we'd better fix it for unix first11:19
jamvila: both savannah and launchpad are using the Pipe implementation because they are using bzr+ssh11:19
jamvila: I have11:19
jamyou just don't like that I loop11:19
jamit *still* helps "bzr serve" on Windows11:19
jamand "bzr serve --inet" on Linux11:20
jamand the test suite passes11:20
jametc11:20
vilabecause I suspect it hides other issues that gave us a lot of trouble in the past11:20
jamvila: we leak threads in some tests, but I made sure we were already leaking threads in those tests without my code.11:21
jam(again, that is pretty random, sometimes 9 leaking threads, sometimes 2, but 'bzr.dev' had the same behavior.)11:21
vilahuh ?11:21
vilaon windows you mean ?11:21
jamvila: I'm pretty sure on natty, too. I don't remember which tests, let me see if I can find them.11:24
jamvila: http://paste.ubuntu.com/690709/11:25
jamwith bzr.dev as of right now11:25
jamon devpad11:25
jamvila: on my machine right now on Windows, neither bzr.dev nor my code claims to leak threads.11:26
jamvila: on Natty, I think I generally got the same "9 leaking threads"11:26
jamI remember it varied11:27
jamand the same tests did not always leak11:27
vilaI *never* see leaking threads here... Do I miss some special flag I can't remember ?11:27
jamvila: I haven't set any flags AFAIK11:27
jammaybe debug_flags=hpss11:27
jambut it is consistently leaking for me on my Natty, and on devapd11:28
jamdevpad11:28
jamvila: re-running it, I only get 1 leaking thread11:28
jamso maybe your hardware is fast enough to handle it11:28
jamthis is without --parallel11:28
vilawhich tests ?11:28
jamvila: I don't see leaking threads with --parallel11:29
jamvila: see the paste11:29
jampy ./bzr selftest -s bt.test_smart_transport11:29
vilaha !11:29
jamL11:31
jam?11:31
jamvila: they leak for you? just not with --parallel ?11:31
vilayu[11:31
vilayup11:31
vilaso we have issues :)11:31
vilaright, so indeed, bzrlib.tests.test_smart_transport.TestServerHooks.test_server_started_hook_memory uses smart.server.SmartTCPServer which.....11:33
viladoesn't try to collect its client threads :)11:33
vilabut you mention the PipeServer right ?11:34
jamvila: bzr serve --inet ?11:35
vilayup11:35
vilastill can't buy the idea that this server will be used on windows where the TCP one is far better suited...11:36
jamvila: I don't see it being used on Windows either11:37
jamthis code also explicitly doesn't work there11:37
vilawhich one ?11:38
jamvila: see line 337 of https://code.launchpad.net/~jameinel/bzr/drop-idle-connections-824797/+merge/7534811:38
jamthat is the SmartServerPipeStreamMedium11:39
vilaAre you saying your fix doesn't apply to the PipeServer on windows ?11:40
jamvila: my fix doesn't select() Pipes on Windows11:41
jam(it would fail anyway)11:41
vilaso no select() loop either, so how do you handle the interrupt ?11:42
jamvila: interrupting bzr serve --inet on Windows? I don't try to do anything tere.11:43
jamthere.11:43
jamI don't try to timeout, etc.11:43
vilaright, so your fix doesn't apply to PipeServer on windows, correct ?11:43
jamvila: correct11:43
vilaok, so trying to handle C-c during a select is irrelevant, what we want is being able to handle C-c during listen() in the TCPServer which already has a loop with a timeout correct ?11:44
vila(still on windows)11:45
vilajam: ?11:51
jamvila: I doubt it is "irrelevant", but yes, blocking in a thread doesn't seem to block ^C for the process11:52
jamsorry, was testing it to make sure11:53
vilawell, irrelevant as in "we don't care about that *on windows*", sorry if this came out differently on your side, not the intent, just trying to get the picture11:53
vilabecause it's a very important distinction, even on linux, because the *main* thread doesn't have to be in a blocking call11:54
vilaand *can* handle more stuff11:54
vilaincluding tricks to terminate the other threads11:54
vilaif needed11:55
vilawow, lunch time is almost past and I need some food :) bb later11:57
jamvila: yeah, me too :)11:57
=== Ursinha-afk is now known as Ursinha
flacostehi Riddell13:11
Riddellsalut flacoste13:11
vilahey flacoste13:11
flacosteabout your question on bug #851379, what package contains language-selector-kde?13:11
ubot5Launchpad bug 851379 in qbzr (Ubuntu) "qdiff makes Xorg eats up all RAM on Oneiric" [Undecided,New] https://launchpad.net/bugs/85137913:11
flacostei don't seem to have it installed13:11
Riddellflacoste: how about usb-creator-kde ?13:13
vilajames_w: I'm trying to understand the conditions to decide which failures are considered transient in the package importer (i.e. when a package import is retried)13:14
flacosteRiddell: i only have the -gtk one installed13:14
flacostevila: salut!13:14
vilajames_w: is this only triggered by first asking for a package to be requeued ?13:14
Riddellflacoste: could you install usb-creator-kde and check?  it's unlikely to have the same problem but at least it would discount it being a general issue13:15
james_wyeah, with --auto13:15
vilajames_w: \o/13:15
vilajames_w: was visually grepping for transient and missed it ! Thanks !13:16
jamvila: interestingly, I'm trying a switch to avoid the loop. It only fails on linux so far13:19
flacosteRiddell: nope, usb-creator-kde works fine13:20
Riddellflacoste: and can you reply with your video card13:20
vilaswitch ? as in command-line switch ?13:20
Riddellah you did13:20
flacosteRiddell: i did, it's an intel GM965/GL960 and I attached my Xorg.0.log file13:21
vilajam: or as in switching main thread and client thread ? :)13:21
jamvila: no. I'm just trying the code without the loop, and the test suite fails (randomly?) on linux, and after 30 runs of a reasonable subset on windows, no failures13:22
Riddellflacoste: hmm, fiddly then, I've heard of a similar issue with Nvidia where the driver reports the screen size to be massive and some widgets get set as a proportion of that and break13:22
Riddellbut intel should be more reliable13:22
jamI did see one, one time. But I was also hitting ^C around that time, so I'm not positive the failure wasn't from the interrupt.13:22
vilawith the err parameter to select ?13:23
jamvila: just tried that, still got a timeout13:23
jamand no error returns13:23
vilawhich timeout ?13:23
jamselect.select() gave a timeout13:23
jam"waited until timeout before returning, after which returned an empty reads and errors available"13:24
jamwithout raising EBADF, etc.13:24
vilawell, I don't know what you're testing so it's hard to know if you want a timeout or not13:24
vilaI don't expect removing the loop is *enough*, I suspect it hides other issues, which you seem to observe now13:25
flacosteRiddell: if I branch lp:qbzr/0.21 in my plugins directory, it should use that one right?13:25
flacosteRiddell: i'll instrument to see which Qt call triggers the setMinimumSize log message13:26
Riddellflacoste: yes if it's in a directory named qbzr13:26
vilajam: basically the idea is that there is a race, sometimes you win (test pass) sometimes you lose (test fail), there should be 2 relevant threads here, the one waiting in the select and the main thread running your test (the client)13:29
vilaone of them is running too fast (or too slow) except in some unknown circumstances, how the time slices are given to each is less important here than *where* you need to synchronize (forcing one thread to wait for the other before a critical point)13:30
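The kind of synchronization point vila describes can be sketched with a threading.Event. This is a generic illustration, not code from the branch under review, and the function name run_with_sync_point is made up. The worker is forced to wait until the main thread has passed the critical point, so the ordering no longer depends on how time slices are handed out.

```python
import threading


def run_with_sync_point():
    """Make the worker wait until the main thread has finished its setup."""
    ready = threading.Event()
    log = []

    def worker():
        # Block here until the main thread signals that setup is complete.
        if not ready.wait(timeout=5.0):
            log.append("worker: gave up waiting")
            return
        log.append("worker: running after setup")

    thread = threading.Thread(target=worker)
    thread.start()
    log.append("main: setup done")   # the critical point on the main side
    ready.set()                      # only now may the worker proceed
    thread.join()
    return log
```

Without the Event, the two log entries could appear in either order; with it, the main-thread entry always comes first.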
flacosteRiddell: which module is responsible for the DiffWindow? ui_tag.py?13:32
jamvila: test suite is failing without a loop13:32
jamtest suite passes with the loop13:32
vilajam: I know about only *one* such race that is not fixed today but it occurs very rarely, http://babune.ladeuil.net:24842/job/selftest-chroot-oneiric/81/ for example13:32
vilajam: that's the issue with the races, you *think* you've fixed it .....13:33
vilauntil it comes back and won't go away13:33
Riddellflacoste: lib/diffwindow.py13:33
Riddellinstantiated in lib/diff.py13:33
jamvila: select.select() isn't noticing that the file handle is no longer valid, thus we time out incorrectly. If I call select.select() again, it properly notices what I wanted it to notice.13:33
vilajam: I encounter this exact situation a lot while chasing the leaks and I can understand your frustration, but there is clearly one here or you wouldn't observe a random test failure13:33
jamWould you prefer a double select.select with the second one having a very short timeout?13:34
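The "double select" jam proposes might look roughly like the sketch below. The helper name wait_readable and the recheck parameter are hypothetical, and this only shows the shape of the idea: if the first select.select() times out with nothing reported, try once more with a very short timeout before concluding that nothing happened.

```python
import select
import socket


def wait_readable(sock, timeout, recheck=0.05):
    """Return True if sock becomes readable or errored, retrying once.

    The first select() uses the real timeout; if it reports nothing,
    a second select() with a short timeout gets a chance to notice
    a state change that the first call missed.
    """
    for wait in (timeout, recheck):
        readable, _, errored = select.select([sock], [], [sock], wait)
        if readable or errored:
            return True
    return False
```

Whether the second call really papers over the behaviour jam observed is exactly what vila questions; a test that reproduces the missed wakeup would be needed to tell.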
jamvila: I'm not particularly interested in debugging this for another 5 days after I could implement it in 1-213:34
vilaI would prefer a test that clearly exhibits what you're talking about13:34
vilaI'm not interested in having to debug it later either13:35
jamI understand the desire to avoid future confusion13:35
jamI'm currently past the point of diminishing returns, however.13:35
jammaybe i'll feel better next week13:35
jamvila: did you make 2.5b1 publicly gold?13:36
jamvila: I don't see an email about it13:36
vilaI think I did, checking13:37
flacosteRiddell: do you have an idea what all the "Gtk-CRITICAL **: IA__gtk_widget_style_get: assertion `GTK_IS_WIDGET (widget)' failed" warnings are about?13:38
vilagrr, left in drafts....13:38
Riddellflacoste: are you running gnome/unity?  I think that's Qt's GTK theme trying to make Qt fit in13:39
flacosteRiddell: i am13:39
Riddellcheck with other Qt apps that you get the same thing13:39
flacoste(unity)13:39
RiddellI doubt it's the cause of the problem although maybe I should try running qbzr under unity to check13:40
flacosteRiddell: i don't have it with usb-creator-kde13:40
* Riddell installs ubuntu-desktop13:42
Riddellflacoste: no problems with me in unity14:05
RiddellI'm not sure how else to try and recreate the issue :(14:05
flacosteRiddell: that's all-right i'm tracing it over here14:09
flacosteRiddell: let you know once I have a hypothesis of what's going on14:09
pickscrapeIs it possible to disable an extension for a specific branch or checkout?14:30
SlimG_How do I fetch a subdirectory in a launchpad branch with bzr? bzr branch lp:project/i/want/this/directory #doesn't seem to work14:39
Riddellpickscrape: do you mean plugin?14:40
Riddellpickscrape: you can set e.g. BZR_DISABLE_PLUGINS=cia14:40
RiddellSlimG_: I don't think you can14:40
Riddelldoes  ./bzr selftest -s bb.test_branch  pass the test suite for others in current trunk?14:41
SlimG_Thanks for the info Riddell14:41
pickscrapeYes, plugin sorry. Context switching. :)14:41
pickscrapeRiddell: thanks, that's a decent workaround. :)14:43
vilaRiddell: yes, ./bzr selftest -s bt.test_branch pass: Ran 84 tests in 1.992s14:45
vilaRiddell: try BZR_PLUGIN_PATH=-site ?14:45
Riddellvila: mm, that helps14:46
RiddellI wonder what plugin is breaking it then14:46
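The two environment variables mentioned above, BZR_DISABLE_PLUGINS and BZR_PLUGIN_PATH, can be set for a single run from a script. The helper bzr_env below is a hypothetical convenience, not part of bzr; it just builds the environment dictionary to hand to a child process.

```python
import os


def bzr_env(disable_plugin=None, plugin_path=None, base=None):
    """Return a copy of the environment with bzr plugin settings applied.

    disable_plugin: a plugin name for BZR_DISABLE_PLUGINS, e.g. "cia".
    plugin_path:    a value for BZR_PLUGIN_PATH, e.g. "-site".
    base:           environment to start from (defaults to os.environ).
    """
    env = dict(os.environ if base is None else base)
    if disable_plugin is not None:
        env["BZR_DISABLE_PLUGINS"] = disable_plugin
    if plugin_path is not None:
        env["BZR_PLUGIN_PATH"] = plugin_path
    return env


# Possible use (requires bzr on PATH, so it is left commented out):
# import subprocess
# subprocess.call(["bzr", "selftest", "-s", "bt.test_branch"],
#                 env=bzr_env(plugin_path="-site"))
```

Because the settings apply only to the child process, this gives a per-invocation workaround of the kind pickscrape asked about without touching the global configuration.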
=== med_out is now known as medberry
flacosteRiddell: what does self.processEvents() do?15:47
flacostei assume it's a QWidget or QWindow method15:47
Riddellflacoste: just runs Qt's event queue15:48
flacosteany way to see what's going on in there15:48
Riddellit's a cheap way to ensure the UI is kept updated if you are running a complex task and don't want to use threads15:48
flacostethat's where the setMinimumSize() calls happen15:48
flacostein DiffWindow.load_diff()15:48
flacostethe first processEvents()15:48
flacostebefore it Xorg is at 1G of RSS (which is still high, normally it's around 300M)15:49
flacostebut after it, it climbs to 2G15:49
flacosteand that's when the setMinimumSize warning is output15:49
flacosteRiddell: any idea what I should try next?15:50
Riddellhmm, maybe finding the widget that gets set, overriding setMinimumSize() and seeing what is calling it15:50
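Riddell's suggestion of overriding setMinimumSize() to see who calls it can be sketched in plain Python. FakeWidget stands in for the real Qt widget and trace_calls is a hypothetical helper; with a real PyQt object, a wrapper like this would probably only see calls that originate in Python code, not ones made inside Qt itself.

```python
import functools
import traceback


def trace_calls(obj, method_name, sink):
    """Wrap obj.method_name so each call records its arguments and callers."""
    original = getattr(obj, method_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        # Drop the last frame (this wrapper) so the trace ends at the caller.
        stack = traceback.extract_stack()[:-1]
        sink.append((args, [frame.name for frame in stack]))
        return original(*args, **kwargs)

    setattr(obj, method_name, wrapper)


class FakeWidget:
    """Stand-in for a Qt widget; not a real Qt class."""

    def __init__(self):
        self.min_size = None

    def setMinimumSize(self, width, height):
        self.min_size = (width, height)
```

After trace_calls(widget, "setMinimumSize", calls), each entry in calls holds the arguments and the chain of function names leading to the call, which is the information needed to find the code path that sets the oversized minimum.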
flacosteRiddell: it seems to be related to the 'Maximizing behavior'15:54
flacostefor some reason, it opens the window maximized15:54
flacostehmm, scratch that16:00
flacosteit doesn't seem to be related16:01
davi_jelmer, hi, where should fetch_tags be set? in branch.conf?16:10
jelmerdavi_: sorry, in what context?16:16
davi_jelmer, bzr-git, you added a change that causes tags to not be fetched if a config option 'branch.fetch_tags' is set to false16:17
=== yofel_ is now known as yofel
=== deryck is now known as deryck[lunch]
jelmerdavi_: in the branch.conf of the branch you're fetching from, locations.conf or bazaar.conf16:36
davi_jelmer, none of them seem to work. does it apply to push?16:37
jelmerdavi_: "branch.fetch_tags = True" ?16:38
davi_jelmer, False, i'm trying to disable the fetch of tags.16:38
jelmerdavi_: fetching tags is disabled by default16:39
davi_jelmer, hum, why do I get a GhostTagsNotSupported exception then?16:40
jelmerdavi_: can you paste the traceback?16:42
davi_jelmer, http://pastebin.com/NuyJZYPr16:43
davi_jelmer, fwiw, it only happens if the destination git repo is empty16:52
sorinHello.16:56
sorinHow many of you use oh-my-zsh?16:56
=== deryck[lunch] is now known as deryck
jelmersorin: I'm a zsh user17:39
jelmerdavi_: ah, that's a slightly different code path17:39
jelmerdavi_: I think I know what's wrong but don't have time to look now. can you file a bug ?17:39
davi_jelmer, will do17:56
sorinjelmer, I am specifically interested if people use oh-my-zsh, not zsh.18:22
jelmersorin: ah, sorry - I don't have experience with that18:24
nigelbsorin: yes, I do18:25
* flacoste is happy18:32
flacosteqbzr works again18:32
flacosteafter applying the work-around for bug 80530318:32
ubot5Launchpad bug 805303 in xorg-server (Ubuntu Oneiric) "Gtk-CRITICAL **: IA__gtk_widget_style_get: assertion `GTK_IS_WIDGET (widget)' failed with the default qt4 gui" [Critical,Confirmed] https://launchpad.net/bugs/80530318:32
mgzcrazy.21:59
mgzI hope that says something good about the way I wrote these tests.21:59
mgzCompletely changed the implementation of the reporting and they all still pass.21:59
keeshello! I'm trying to get a list of files added per revno. using "bzr log -v" is crazy slow. is there some faster way to get that info?22:24
Noldorinhi jelmer22:29
jelmerhi Noldorin22:29
jelmerhi kees22:29
Noldorinkees, it shouldn't be that slow...isn't for me22:30
Noldorinjelmer, any progress in your busy schedule? :-)22:30
jelmerkees: what sort of performance are you getting?22:30
keesNoldorin: my tree has about 4000 revnos22:30
keesjelmer: 30 seconds per about 50 revs22:31
jelmerkees: hmm, that is slow indeed22:31
jelmer(I was going to say it wasn't quick here either, but at ~200 per 5 seconds it's still a lot better than for you)22:31
jelmerkees: is this with a recent version of bzr, and the 2a format?22:32
jelmerNoldorin: nope, sorry22:32
keesyeah, 2.3.4 2a22:32
Noldorinjelmer, no problem...is it proving tricky then eh?22:32
keesthe man page even carries a warning about the slow speeds22:32
Noldorinkees, you're not running on a pentium I by chance are you? ;-)22:32
keeshaha no22:32
keescore2 duo22:33
Noldorinok, so not terrible...22:33
Noldorinjelmer, i'm tempted to think black-box debugging is not being very helpful here. we are both struggling it seems...22:34
Noldorinhrmm22:34
jelmerkees: what size is the tree?22:34
jelmerNoldorin: I haven't had time to look at it at all yet22:34
kees57M.bzr22:34
jelmerkees: sorry, I mean the rough number of files in a checkout22:35
keesoh!22:35
keeser22:35
keesa bit under 900022:35
jelmerin that case, you could be hitting an inventory paging issue that was fixed for 2.422:36
jelmerkees: is this a public tree?22:37
keesjelmer: yeah, one sec22:37
keeslp:~ubuntu-security/ubuntu-cve-tracker/master22:37
keesah, my full bzr log -v finished. 23 minutes :)22:37
jelmerkees: hmm, that is indeed surprisingly slow22:49
jelmerthe launchpad tree is a lot bigger and has more revisions but running "bzr log -v" there is ~200 per 5 seconds22:50
jelmerkees: on your branch it's ~100 per 10 seconds22:50
keesweird22:53
=== jam2 is now known as jam
Noldorinjelmer, ah ok. :-(23:27
Noldorinjelmer, i know i'm pestering you about it so i don't want to too much... is there any other bzr-dev i should deal with?23:27
Noldorinbzr-git dev *23:27
jelmerNoldorin: I'm the only one who's working on bzr-git23:28
jelmerNoldorin: the best thing you can do to help is still to provide some sort of script which reproduces the issue from scratch23:28
Noldorinjelmer, fair enough. i just wanted to share the workload, but in this case let's just take our time :-)23:28
Noldorinjelmer, yes i spent ~4 hours trying to do that yesterday with no luck23:29
Noldorinblack-box testing is very difficult for such problems as these23:29
jelmereven if you copy all the contents out of r46, add them in an empty bzr tree and then try to make the same changes as r47?23:30
Noldorinjelmer, i get shitloads of merge conflicts then23:32
Noldorinnot doable23:32
jelmerNoldorin: I mean manually23:32
Noldorinoh haven't tried23:32
Noldorinthat would take a long time23:33
Noldorini'd have to figure out which lines of code i wrote23:33
Noldorinwhich are many23:33
jelmerthe code doesn't matter23:33
Noldorinno?23:33
jelmerit's the renames, etc that do23:33
jelmerand whether the code changed23:33
Noldorinoh ok23:33
Noldorini can try now then23:33
Noldorini suspect it will cause no error23:33
Noldorinsomething tells me that...23:33
Noldorinbut let's see23:33
PawnStarhi23:58
