/srv/irclogs.ubuntu.com/2017/05/02/#ubuntu-kernel.txt

DEvil0000so nobody here can help me with this kernel issue?07:41
apwDEvil0000, "this" ?09:27
ogra_apw, i have a user running CI tests in qemu/kvm for ubuntu on top of a CentOS box ... his VMs fill up /run/udev/data with tons of cgroup entries over time (ls output is at https://bugs.launchpad.net/snapd/+bug/1687507/+attachment/4870568/+files/run_udev_data.log) ... https://bugs.launchpad.net/snapd/+bug/1687507 any idea? (he runs a hwe kernel in the VMs)10:22
ubot5`Ubuntu bug 1687507 in snapd "Memory leak (/tmp file system filling up)" [Undecided,New]10:22
apwogra_, so he is saying that within the VM we are dropping those files in /run, or are they in his host ?10:28
apwogra_, but if they are /run/udev/data files i would assume they are related to systemd in whichever ?10:28
ogra_apw, they are inside the VM ... he runs his VMs with 200MB ram each and after a week or so they all run out of ram10:29
apwogra_, ok so then our systemd seems most likely given the names10:29
ogra_i would expect udev to only react to device creation here 10:29
apwogra_, which you do for every snap install, you make a new loop10:30
ogra_well, thats not loop devices there 10:30
apwno it is session things, perhaps one for every run of a snap command10:31
ogra_zyga_, ^^^ ?10:32
zyga_ogra_: looking10:32
apwogra_, but they look like regular files, and they aren't in a directory the kernel makes up out of the ether, they look like actual disk blocks10:32
ogra_well ... i see bits like: -rw-r--r-- 1 root root     31 May  1 06:08 +cgroup:kmalloc-1024(30177:apt-daily.service10:32
zyga_ogra_: I saw this bug10:32
ogra_thats definitely not from snapd10:32
zyga_ogra_: perhaps the real memory is hidden somewhere else that we don't see10:32
zyga_aha10:32
zyga_do you have a log of what is in the slab?10:33
apwogra_, no but that doesn't make it a kernel thing.  that is a systemd unit which ran10:33
apwzyga_, these are actual files in a directory on a ramfs10:33
ogra_zyga_, we only have the ls output of the udev db atm10:33
apwthe memory is being used storing them10:33
apwi would suggest whatever is making the sessions is triggering udev to leak an internal file10:33
zyga_apw: aha, 10:34
zyga_apw: is that an actual ramfs or did you mean tmpfs?10:34
ogra_its a tmpfs ... 10:35
ogra_usually /run has 10% of your ram10:35
apwzyga_, tmpfs prolly, as it is /run10:35
apwso according to the other logs in the forum, i think there is a session being logged in syslog for every execution of his snap commands10:36
zyga_apw: what do you mean by session specifically? (I'm not fully aware of how various "sessions" are managed by linux)10:37
ogra_Apr 28 09:36:50 ci-comp11-dut systemd[1]: Started Session 1817 of user root.10:37
apwzyga_, that is userspace shit :)10:37
ogra_you mean that line i guess10:37
apwogra_, yeah, that is what it looks like to me10:38
apwperhaps that is an install of a snap, dunno, he is implying he is doing repeated installs10:38
ogra_yeah, no wonder if he does CI 10:39
ogra_(though he should simply kill the VM and use a fresh one to do it right :) )10:39
apwogra_, he could, but i would open a systemd task on that bug at least, as it looks to be leaking, or if it is not they might be able to tell us better what the hell those files represent10:40
apwogra_, and what the reporter is doing wrong (this is systemd after all)10:41
apw"you didn't want to do it like _that_"10:41
zyga_do you guys think there's an actual snapd bug somewhere? 10:41
ogra_zyga_, unlikely given the type of files created there 10:41
apwzyga_, without knowing what the session represents ... you might ask for one of the file content as well10:41
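
To give the bug some numbers, the /run/udev/data listing discussed above can be grouped by entry type. The following is a minimal sketch, not something posted in the channel. It assumes the udev database names entries as "+<subsystem>:<name>" (as in the "+cgroup:kmalloc-1024(..." example ogra_ quoted) or with a leading "b", "c" or "n" for block, character and network devices. The function names are made up for illustration.

```python
import os
from collections import Counter


def entry_kind(name):
    """Classify one udev database file name.

    Assumption: names starting with '+' are '+<subsystem>:<sysname>',
    anything else is keyed by its first character (b, c, n, ...).
    """
    if name.startswith("+"):
        return name.split(":", 1)[0]
    return name[:1]


def summarize_udev_db(path="/run/udev/data"):
    """Count the entries in a udev database directory, grouped by kind."""
    return Counter(entry_kind(name) for name in os.listdir(path))


def used_fraction(path="/run"):
    """Return the fraction of the filesystem holding `path` that is in use."""
    st = os.statvfs(path)
    if st.f_blocks == 0:
        return 0.0
    return 1.0 - st.f_bfree / st.f_blocks


if __name__ == "__main__":
    db = "/run/udev/data"
    if os.path.isdir(db):
        for kind, count in summarize_udev_db(db).most_common():
            print(f"{kind:12} {count}")
        print(f"/run is {used_fraction('/run'):.0%} full")
```

Running it once a day inside one of the 200MB VMs would show whether the "+cgroup" count is the one that keeps growing, which is the detail a systemd task on the bug would need.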
LocutusOfBorg[12:42:15] <LocutusOfBorg> hello cjwatson wrt the git/ssh repo cloning issue, I finally found it10:43
LocutusOfBorg[12:42:16] <LocutusOfBorg> http://kernel.ubuntu.com/git/cking/stress-ng.git/10:43
LocutusOfBorg[12:42:31] <LocutusOfBorg> this one is git only clonable (ok not really a launchpad host)10:43
LocutusOfBorgcking, ^^ :)10:43
LocutusOfBorgdo you mind having https too?10:43
apwLocutusOfBorg, what does that mean, ie which protocols don't work10:43
LocutusOfBorggit clone https://10:43
* apw wonders how that works ... hmmm10:43
LocutusOfBorgif you look at the page, it is not even shown as "clonable"10:44
LocutusOfBorgprobably some configuration needs an enable10:44
apwthat host is somewhat of a legacy mess ... no idea if it even can do https10:44
apwi suspect it does not at all ...10:45
ckingI guess that's the reason10:45
apwcking, would it make more sense to just mirror that into LP and be happy ?10:45
LocutusOfBorg:(10:45
LocutusOfBorgwe have git protocol blocked here at work10:45
apwcking, we could even add it to the primary mirroring list if you care to keep both working10:45
infinityLocutusOfBorg: Tell the people who run your work network to stop being terrible?10:45
ckingapw, I'd prefer LP to mirror the original git repo if thats possible10:46
apwinfinity, but but they need to record everything you do, oh except https:// cause you can't do anything dodgy with that10:46
apwcking, i guess it depends where in the namespace it goes to10:46
infinitycking: Wasn't there a goal to eventually get *all* the master repos off of wani, though?10:47
LocutusOfBorginfinity, it is the customer's network :) unfortunately security team has a blacklist everything opinion10:47
ckinginfinity, yeah, I'm being lazy10:47
apwLocutusOfBorg, letting https:// out basically means they don't stop anything10:47
ckingI have it mirrored at https://github.com/ColinIanKing/stress-ng too10:47
apwcking, oh ... heh10:48
LocutusOfBorgapw, well, somebody was doing reverse ssh over git port :)10:48
LocutusOfBorgthey even blocked ssh on 44310:48
apwLocutusOfBorg, and they are now doing it over https no doubt10:48
ckingi guess you could download https://github.com/ColinIanKing/stress-ng/archive/master.zip10:48
infinityLocutusOfBorg: You can't "block ssh on 443".  I mean, you can block the pre-ssh handshake, but you can't stop people from setting up ssl tunnels and then going to town.10:48
apwcking, he can clone from github if it is a mirror10:49
infinityLocutusOfBorg: So, indeed, as Andy says, if you allow 443/https, you allow everything.  Just in a way you can't filter/inspect.10:49
LocutusOfBorginfinity, yes, of course, they just made things harder10:49
LocutusOfBorganyhow git over ssh works nicely, and I don't want to hack things on a build server10:49
apw"harder" for someone who was reverse tunnelling ssh over the git port, is not going to be actually hard10:49
ckingapw, I keep a mirror of all my repos on github because folks in .cn sometimes can't access my repos any other way10:50
apwcking, great, then LocutusOfBorg should mirror from 10:50
apwgithub for now, and we can think about where that repo should move to10:50
apwat our leisure10:50
ckingapw, the "at our leisure" generally means never10:50
apwcking, no we have cards now10:51
ckingapw, OK, I'll slap it on the card deck10:51
apwcking, so ... who owns that, is it cking or is it c-k-t or something else10:51
ckingapw, tis me10:51
apwthen i cannot mirror _to_ your LP, but if you switched master to lp:~cking i could mirror it back to kernel, and you could add the symlink there10:51
apwcking, as an aside does github auto-mirror for you ?10:52
ckingapw, sounds like a plan10:52
ckingcking, I generally push to my repo and mirror on each push10:53
LocutusOfBorgthanks!10:53
LocutusOfBorgcking, I'm pretty sure there is some automatic push feature on github10:53
infinitycking: I'd argue that if you're going to the trouble of moving it, you might want it owned by a stress-ng-hackers team or something instead of cking.  Future proof it a bit.10:53
ckingi suspect there is10:53
LocutusOfBorgbut I never found out how it works10:53
ckinginfinity, yeah, sounds like a good idea10:53
apwinfinity, in which case we could add the mirror-bot10:54
apwto that team until we get the fine-grained permissions we are meant to have10:54
apwcking, i assume you have an LP project already anyhow10:54
ckinghttps://launchpad.net/stress-ng10:55
LocutusOfBorgor maybe just 10:55
LocutusOfBorgadd a hook10:55
LocutusOfBorgBTW the network is mostly windows only, I'm the only one with such connection issues, and I don't want to complain to /dev/null network admins, corporate networks can be time consuming10:57
LocutusOfBorg(I do tethering and live happy for now)10:57
apwLocutusOfBorg, well it sounds like you have a work-around, use the github mirror as that is https: already10:58
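
The workaround apw describes comes down to picking whichever remote uses a protocol the network lets through. Here is a small sketch of that choice for a build script. The git:// URL is an assumption (the log only gives the web page http://kernel.ubuntu.com/git/cking/stress-ng.git/), the GitHub URL is the mirror cking posted, and the helper names are hypothetical.

```python
from urllib.parse import urlsplit

# Candidate remotes in order of preference.
CANDIDATES = [
    "git://kernel.ubuntu.com/cking/stress-ng.git",  # assumed git:// form of the original repo
    "https://github.com/ColinIanKing/stress-ng",    # mirror mentioned by cking
]


def pick_clone_url(candidates, allowed_schemes=("https",)):
    """Return the first URL whose scheme is permitted, or None."""
    for url in candidates:
        if urlsplit(url).scheme in allowed_schemes:
            return url
    return None


def clone_command(url, dest=None):
    """Build the argument list for `git clone` without running it."""
    cmd = ["git", "clone", url]
    if dest:
        cmd.append(dest)
    return cmd


if __name__ == "__main__":
    url = pick_clone_url(CANDIDATES)
    if url is None:
        print("no remote reachable with the allowed protocols")
    else:
        # Pass this list to subprocess.run(..., check=True) to actually clone.
        print(" ".join(clone_command(url)))
```

On a network that only allows https, this selects the GitHub mirror. Where the git protocol is open, passing allowed_schemes=("git", "https") would keep the original repository first.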
LocutusOfBorgyes, thanks! I forwarded this helpful message to my colleague10:59
ckingLocutusOfBorg, and let me know if you find any stress-ng issues too11:02
LocutusOfBorgcking, it is an awesome tool! we use it to stress our custom yocto BSP :)11:03
ckingLocutusOfBorg, thanks, that's good to know.  can you star it on the GitHub page :-)11:04
LocutusOfBorgdone :)11:04
LocutusOfBorgI didn't even star my projects11:04
cking\o/11:04
LocutusOfBorgI think I had some feature requests to do for this project, something I feel "hey that would be nice" but I forgot them11:05
ckingLocutusOfBorg, well, I'm very open to new feature requests and/or patches too :-)11:05
LocutusOfBorgI probably remember the need to test ram overloading and some cpu micro (e.g. stress test only one single core) or something like that11:07
LocutusOfBorgit happened 1.5 year ago, too much time for my brain to remember11:07
LocutusOfBorg:)11:07
ckingah, stress-ng has progressed a lot since then11:07
LocutusOfBorgsure, I would like to give it a new try :)11:07
LocutusOfBorgflashbench, this is a tool we had to add because stress-ng was not testing flashes :)11:11
DEvil000010:10:48 - DEvil0000: I have an issue with my 14.04 and hope to find some help here. Description: https://hastebin.com/raw/fareguzego11:35
DEvil000010:11:10 - DEvil0000: in short: core dumps of 32-bit processes on a 64-bit OS are not correct somehow. so it's about gdb, kernel, ubuntu 14.04/16.04 and packages11:35
DEvil0000@apw11:37
apwDEvil0000, hmmm so you are saying 32bit core dumps differ depending whether they are dumped by the lts-backport 4.4 or the native 4.4 from a later release ?11:39
apwthat is somewhat unexpected as the only real difference is normally the compiler used to build it11:43
apwDEvil0000, you should file a bug against the kernel and add a task for gdb as well i suspect, and drop the number in here11:44
DEvil0000apw: yes, this is what I see. resulting in incorrect/broken stack trace in gdb later when dumped with the backport kernel12:03
apwDEvil0000, that is most odd as those kernels are essentially the same other than where they are built12:03
DEvil0000apw: when I install the native 4.4 packages on my 14.04 it works as expected12:03
apwthat makes little to no sense in reality12:04
apwbut, it is happening, so please file a bug against the kernel with all that detail and how you reproduced it12:04
apwi assume a simple 32bit proggy which does abort will exhibit this behaviour12:04
DEvil0000apw: not sure - this seems to be quite likely for my processes with about 30 threads but less likely with fewer threads. hard to find a good example process12:05
DEvil0000apw: main thread dumps seem to be ok most of the time12:07
apwhmmm12:07
apwit is not 100% reproducible.  that is even odder12:08
DEvil0000is it safe to use the native 4.4 kernel packages instead of the backport ones?12:08
DEvil0000is maybe the patch level different in some way?12:09
DEvil0000or maybe a lib needed to compile the kernel?12:09
apwthey are identical as far as i know from a patches applied perspective (on x86 at least)12:10
apwand they do not use external libraries12:10
apwsafe is a relative term, you won't get updates if you do that12:10
DEvil0000I have found some reports from others via google having the same issues.12:12
DEvil0000sometimes with gcore gdb in the title. some for debian.12:12
DEvil0000updates are not an issue in this case. we have our own pool which is a less updated ubuntu pool clone12:12
DEvil0000(but yes I tested it on plain and fully updated 14.04)12:16
DEvil0000maybe it is somehow related to this: http://www.bigeng.io/recovering-redis-data-with-gdb/12:18
DEvil0000looks similar12:18
DEvil0000@apw: so you tell me using the native 4.4 should be no issue. why don't you use the same packages for both versions then? maybe even from a common package pool?12:25
apwi didn't actually say there would be no issues but regardless12:25
apwthe thing is not compiled using tools you have so no out of tree drivers can be relied on for one12:26
apwthat is why we build a native version of the kernel for the series it is run in12:26
apwas the differences between the two are minor, at best, i have no evidence you are not just getting lucky and the problem will come back12:27
apwDEvil0000, nope there are no code differences between the two, /me checks config12:29
apwDEvil0000, and the only thing there is how stack-crashing is detected (due to compiler limitations) but that should not be visible in userspace either12:30
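
Since the breakage is reported mostly for processes with about 30 threads, one detail worth attaching to the kernel bug is how many threads each core file actually records. The sketch below is not from the channel and does not replace gdb. It assumes a little-endian Linux ELF core in which each thread is stored as an NT_PRSTATUS note named "CORE" inside a PT_NOTE segment, and it reports the ELF class (32 or 64 bit) along with that count. The function name is made up for illustration.

```python
import struct
import sys

ELFCLASS32, ELFCLASS64 = 1, 2
ET_CORE = 4       # e_type value for core files
PT_NOTE = 4       # program header type holding notes
NT_PRSTATUS = 1   # one note of this type per thread


def _align4(n):
    """Round n up to the next multiple of 4 (note fields are 4-byte padded)."""
    return (n + 3) & ~3


def count_core_threads(data):
    """Return (elf_bits, thread_count) for a little-endian ELF core image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[5] != 1:
        raise ValueError("only little-endian files are handled in this sketch")
    if data[4] == ELFCLASS32:
        bits, ehdr_fmt, phdr_fmt, off_i, size_i = 32, "<HHIIIIIHHHHHH", "<IIIIIIII", 1, 4
    elif data[4] == ELFCLASS64:
        bits, ehdr_fmt, phdr_fmt, off_i, size_i = 64, "<HHIQQQIHHHHHH", "<IIQQQQQQ", 2, 5
    else:
        raise ValueError("unknown ELF class")

    ehdr = struct.unpack_from(ehdr_fmt, data, 16)
    e_type, e_phoff, e_phentsize, e_phnum = ehdr[0], ehdr[4], ehdr[8], ehdr[9]
    if e_type != ET_CORE:
        raise ValueError("ELF file is not a core dump")

    threads = 0
    for i in range(e_phnum):
        phdr = struct.unpack_from(phdr_fmt, data, e_phoff + i * e_phentsize)
        if phdr[0] != PT_NOTE:
            continue
        pos = phdr[off_i]
        end = min(pos + phdr[size_i], len(data))
        while pos + 12 <= end:
            namesz, descsz, ntype = struct.unpack_from("<III", data, pos)
            pos += 12
            name = data[pos:pos + namesz].rstrip(b"\0")
            pos += _align4(namesz) + _align4(descsz)
            if name == b"CORE" and ntype == NT_PRSTATUS:
                threads += 1
    return bits, threads


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        bits, threads = count_core_threads(f.read())
    print(f"{bits}-bit core with {threads} thread(s) recorded")
```

Comparing the count for a core written under the lts-backport kernel with one written under the native 4.4 kernel, for the same program, would show whether threads are missing from the dump or whether the notes are present and only the stack contents differ.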
=== daniel is now known as Guest47234
=== Guest47234 is now known as Odd_Bloke
=== JanC is now known as Guest15747
=== JanC_ is now known as JanC
