/srv/irclogs.ubuntu.com/2013/05/27/#ubuntu-mir.txt

RAOFthomi: Yup?00:37
thomiRAOF: hey, maybe you can help me - I think the mir client library is leaking file handles when running multiple connections from different threads. I *think* it's leaking /dev/dri/card0, but I'm not sure where that's opened in the ode to do any debugging00:38
RAOFthomi: That's entirely plausible.00:39
thomiI'm just trying to get some more evidence to support my theory that I've got the correct FD00:41
RAOFYou could do that with strace -e file, right?00:43
thomiahh, good idea, I've been taking lsof snapshots over time00:44
RAOFYeah. Dump the strace output to a file, parse it with something, and track the open fd.00:46
thomiRAOF: so on th client side, I don't see any open() calls while the client is spitting out errors, which makes me think that perhaps the client is just reporting a server-side error.01:25
thomiRAOF:  but running strace on mir_demo_server doesn't seem to show me anything interesting either01:26
thomithe error I get is one of either:01:26
thomiCould not connect to server, error is: connect: Too many open files01:26
thomior:01:26
thomiCould not connect to server, error is: eventfd_select_interrupter: Too many open files01:26
RAOFThat does sound a lot like a fd leak, doesn,t it.01:27
thomiyeah01:27
thomiif I slow down the client app, it takes longer to appear, so it's soemthing that happens within a connect & disconnect01:28
thomiis valgrind something that might help in situations like this?01:28
RAOFIt *can* trace fd leaks, yes.01:29
RAOFWow. Compiz on llvmpipe is much better in mesa master.01:29
RAOFI didn't notice that I don't have HW accel.01:29
thomiheh01:29
thomiwill try valgrind, but I suspect I'm missing something01:30
thomiit seems like strace is missing some open calls, since lsof shows more files open thatn strace shows open() calls for01:30
RAOFIf it doesn't happen when slowing down the client then maybe it's *not* a leak.01:30
RAOFMaybe you're just trying to open too many files? :)01:30
thomiRAOF: no, it does happen, it just takes longer01:30
RAOFAh, fair enough.01:30
thomiRAOF: so my point was that it's linked to the speed at which I run the stress test01:31
RAOFI was trying to see if glxinfo open()s /dev/dri/card0, but llvmpipe has foiled that brilliant plan.01:31
thomihaha01:31
thomihmmm. when I run mir_demo_server under valgrind, everything works, probably because it's so bloody slow :-/01:33
RAOF:)01:55
thomiugh, I hate trying to track down these sorts of issues02:12
smspillazRAOF: compiz and unity or just compiz?02:47
RAOFsmspillaz: Compiz and Unity.02:47
smspillazhm, neat02:47
=== Mirv_ is now known as Mirv
RAOFthomi: OOH!07:17
thomiRAOF: ?07:17
RAOFthomi: I think I know where your fd leak is.07:18
thomioh good07:18
thomieasily fixable?07:18
RAOFI need to check the stress-test code.07:18
RAOFBut - do you close() the platform fd?07:18
thomiall it does is a connect & then release07:18
RAOFRight. So you get the platform package, which contains a drm fd, and then don't close it.07:19
thomihmmmm...07:19
thomiseems like a mistake lots of people are going to make. Can we not make it RAII?07:19
alf_RAOF: thomi: Could be, although in that case I would argue that it should be closed on release07:20
thomialf_: +107:20
RAOFalf_: Maybe?07:20
thomithe thing is: I didn't do anything to get the platform package07:20
thomiall I do is a connect07:20
RAOFCorrect. You get it automatically on connect.07:20
thomiright, so if you're going to allocate it automatically, I think you should also deallocate it automatically07:21
thomiotherwise, how do I know that I need to deallocate it?07:21
RAOFWell, you don't, unless you're deliberately writing stress-tests :)07:21
alf_RAOF: what's your concern about deallocating automatically?07:22
RAOFalf_: Silently closing file descriptors out from under clients smells a bit to me.07:22
RAOFBut since we already do it for the prime fds in buffers, I guess it's got the virtue of consistency.07:23
tvossRAOF, alf_ why not make it a policy and default to atexit, while clients have a way to say: please close it asap07:24
alf_RAOF: If it turns out to be an issue, we could also give copies of the fds to the user when asking for the platform package, and tell them that they need to close it themselves there, and we can still close our own fds internally07:25
thomiRAOF: surely it should be closed when I release the connection anyway?07:25
RAOFthomi: Why? You might want it, and it's not in any way bound up to the connection.07:26
RAOFalf_: I think I prefer that, and would like to also do that for the buffer package.07:26
thomiI mean, if it's created automatically as a side effect of calling connect(), I'd expect it to be released as a side effect of calling release()07:26
thomianyway, at the very least the documentation for the connect() call needs to be updated with the fact that you need to close certain FDs07:27
RAOFTo client code it doesn't look like it's created automatically; it looks like they get it as a result of get_platform_package().07:27
alf_tvoss: we could, although I find that giving copies is more versatile and straightforward07:27
RAOFI'd support a policy of only ever handing out dup()'d fds from *get_*_package(), and managing the lifetimes of our internal fds.07:28
tvossalf_, yup :)07:28
tvossRAOF, +107:28
RAOFThat will mean that we eat up double the fd space, but it's not like we're consuming a vast range of it now anyway :)07:29
thomiRAOF: hang on, I never call get_platform_package()07:29
RAOFthomi: Yeah. That *doesn't* create the platform package, merely accesses it. But from the client code's perspective, that's where it's created.07:29
thomiright07:29
RAOFWhich is why dup()ing in that call and handling our own fd lifecycles internally makes sense.07:30
alf_thomi: RAOF: perhaps even better, we should hand a release function to the user when they get the platform/buffer packages, so they will just need to call that, don't care about explicitly closing etc07:31
alf_tvoss: ^^07:31
thomialf_: that would work for me07:32
RAOFalf_: I don't know; client code that deliberately accesses the platform/buffer packages needs to know the platform-specific nature of that package anyway.07:32
tvossalf_, okay, so they receive responsibility and power at the same time07:33
RAOFI think that if dup() is sufficiently costless then dup() + automatic internal management is a better approach.07:33
alf_RAOF: I am not arguing against dup. I am saying that the client doesn't need to know the details of how to release the platform package. The FD they get may be dup()ed, maybe not, and the same things goes for other resources contained in the platform/buffer packages.07:35
alf_RAOF: tvoss: Perhaps an other approach would be, to not dup the FDs, just say that if you need it beyond connection lifetime just make a dup yourself.07:37
alf_RAOF: tvoss: This doesn't consume more fd space unless needed by the client.07:38
tvossalf_, I like that idea. RAOF: thoughts?07:38
tvossalf_, it requires the client to express intent quite clearly07:38
tvossand our policy is strict and straightforward07:38
RAOFI'm a fan of the API being able to respond consistently, and I think dup() makes that easier.07:39
alf_RAOF: what's inconsistend about saying we are giving access to resources but we own them?07:39
RAOFOtherwise, for example if a client accesses the platform buffer and close()s the drm fd, subsequent calls to get_platform_package() will appear to succeed but return a garbage platform.07:39
alf_RAOF: so it's mostly about not allowing the client to mess up?07:41
RAOFYeah.,07:41
RAOFWhere it's low-cost, I think it makes sense to have a forgiving API.07:42
RAOF(With an implicit assumption that dup() is low-cost)07:42
mlankhorstit usually is, it just adds a duplicate to the process' fd list07:45
RAOFYeah, it didn't seem like something that would be high-cost.07:53
RAOFAnd even if we double-up on fds we're only using about 10 or so, which is a negligible fraction of the 1024 limit(ish).07:53
thomiok guys, it's getting late here. Can someone post a msg to the ML with whatever the outcome of this is please? I'll fix up the stress tests in the morning07:57
hikikohmm hi, alan_g is not working today?08:22
hikikoalf_, ping? :)08:22
alf_hikiko: it's a public holiday for US and UK today08:23
hikiko:)08:23
hikikoI have a question08:23
hikikowhy do we use protected destructors in display.h?08:23
hikikois it a design pattern?08:24
hikikoI guess we only want to destroy the object at release time08:26
hikiko?08:27
hikikosorry :p i don't guess anything :)08:28
alf_hikiko: Protected destructors in display don't allow destroying a derived *Display object when using a pointer to a base class. This basically enforces the use of a shared_ptr for ownership when upcasting, and that's the reason it is done this way. It's not necessary to do that, though. You can just make the virtual destructor public.08:31
hikiko:)08:36
hikikothanks alf_ !08:36
hikikoalf_, could you please review this: lp:~hikiko/mir/mir.fix-missing-virtual-destructors if you have some time later? +question: shall I push my future branches as lp:~mir-team/...?10:31
alf_hikiko: whichever way you prefer is fine10:34
hikiko:)10:35
=== tvoss is now known as tvoss|lunch
=== mzanetti is now known as mzanetti|lunch
=== tvoss|lunch is now known as tvoss
=== mzanetti|lunch is now known as mzanetti
=== mmrazik is now known as mmrazik|afk
=== francisco is now known as Guest85729
=== mmrazik|afk is now known as mmrazik
=== mmrazik is now known as mmrazik|afk
=== greyback is now known as greyback|food
=== tvoss is now known as tvoss|test
=== tvoss|test is now known as tvoss
=== greyback|food is now known as greyback
=== francisco is now known as Guest35949
thomimorning20:14
RAOFYo!23:25

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!