=== zzlatev is now known as ZZlatev
[03:09] @gbkersey: Thanks for the info. Even 7-8Gbps is better than 1Gbps. We are using the ga-16.04-lowlatency kernel, for what it's worth
[06:01] Good morning
[07:29] I have a fresh Ubuntu machine running and I installed Nagios using the following website: https://kifarunix.com/how-to-install-and-configure-nagios-core-on-ubuntu-18-04/
[07:29] Installation seems to be successful because I can access the website. But now comes the problem: when I reboot the machine it won't work anymore and I need to follow the complete tutorial again to get it up. But then again after a reboot it's not working.
[07:30] My Linux experience is zero. So I was hoping someone has an idea what could be wrong in that tutorial or how I can get Nagios to work again.
[07:31] Is the Nagios service enabled?
[07:47] lordievader: How can I check this on Ubuntu?
[07:48] `systemctl status nagios` (assuming here the nagios service is called that way)
[07:49] Active: active (running)
[07:53] Just did a check on Apache2 to see if that is running, and that is also good
[07:53] Is it the same after a reboot?
[07:54] Let me restart to be sure
[07:55] The IP address is also reachable. Maybe it has something to do with the firewall
[07:57] Both Nagios and Apache are running successfully after a reboot
[07:58] Found the issue! Thank you
[07:58] I need to re-add the "ufw allow apache; ufw reload"
[07:58] How can I make sure those settings will stay stored and not be gone after a reboot?
[08:01] I think ufw takes care of that. Haven't used ufw in ages. Dislike the way it does things.
[08:02] I am just rebooting the machine again to see if ufw is enabled or not
[08:14] It is enabled, but I still need to add the rules. I will look into how to make those rules permanent
=== msmarcal|eod is now known as msmarcal
[13:23] JamesBenson: thanks for that....
[14:05] * foo attempts to figure out what is causing oom to murder processes
[14:05] * foo reads https://serverfault.com/questions/134669/how-to-diagnose-causes-of-oom-killer-killing-processes
[14:24] I wonder if oom-killer can be too aggressive? What's strange is I rarely see this system swapping
[14:39] Hi all, I'm looking to set up a ticketing system on 18.04 LTS and looking for recommendations. I was using Spiceworks on Windows but have since moved to Linux and do not have a Windows license.
[14:39] ticketing system, like help desk thing?
[14:40] Something that I will mainly use, maybe 1 or 2 more users. Yeah for tech support.
[14:40] vahnx: look at osticket
[14:40] Ok thanks, will do!
[14:40] vahnx: https://osticket.com/
[14:41] any Traefik experts here?
[15:09] hi .. my server / is 100% full, but i am not able to see what is causing it .. / is 80G .. du -sh /* | grep G does not even come near 80G
[15:10] is it possible it can be something that is in memory or an open file handle .. and if there is such a thing, how do I find it ?
[15:11] admin0 I once ran out of inodes, and it showed as full
[15:11] inodes is only 4% used
[15:12] df -h => /dev/mapper/cloud-root 75G 75G 0 100% / | df -i => /dev/mapper/cloud-root 5005312 174582 4830730 4% /
[15:14] admin0: cd / ; sudo du -hs .[^.]*
[15:14] admin0: that'll run against any hidden directories
[15:15] admin0: cd / ; sudo du -hs .[^.]* * |grep G # this will run on everything
[15:15] admin0: once you get some space, I recommend using ncdu
[15:15] admin0: If a process is holding a file open, that space will still be consumed for filesystem allocation purposes until the process releases the lock.
[15:15] 9.0G .
[15:15] that is what i get
[15:15] but df is 100% full
[15:16] whislock, how do I locate such a process or such a file
[15:16] admin0: can you pastebin exactly what commands you are running and the output please?
[15:16] sure
[15:17] sure .. one moment
[15:19] whislock, thanks for the pointer .. a cron job did rm -rf on a file while the process was not stopped
[15:19] restarting that process (libvirtd) cleared up the space
[15:20] instead of rm -rf on the file, will cp /dev/null over it instead
[15:35] thanks guys for helping
[15:36] * admin0 sends pizza (virtual) to leftyfb and whislock :D
[15:36] admin0: future reference, install ncdu
[17:30] how can I figure out what blocked processes I have on a server?
[17:31] my monitoring system is telling me I have on average 5 blocked processes, but I don't see a D in the S column on top
[17:34] is uninterruptible sleep what your monitoring system means by "blocked process", though?
[17:38] tomreyn, good question... not sure
[17:39] Indicates the number of processes blocked for I/O, paging, etc.
[17:42] hmm yes, sounds like it should be that
[17:43] DammitJim: procfs(5) /proc/pid/syscall sounds vaguely enough like a blocked vs not-blocked measure for such a tool
[17:43] it'd be a bit silly to open, read, and close a few thousand files for this information every N seconds of course, but maybe that's what it's doing
[17:43] hhmmmm
[17:44] if its source code is available to you, you could inspect what it actually does.
[17:46] so, how do I get the process that is blocked?
[17:47] well, the thing with these kinds of measurements is that it's all very transitory and racy
[17:47] so, hard to "catch"?
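Editor's sketch for the question above. On Linux, a "blocked processes" figure in monitoring tools often comes from the `procs_blocked` line in /proc/stat, while the S column in top comes from the state field of /proc/<pid>/stat. Whether the monitoring system discussed here reads either of these is an assumption and is not confirmed anywhere in this log. The Python below reads both; all function names are made up for illustration.

```python
"""Sketch: read the kernel's procs_blocked counter and list processes whose
main thread is in a given state (D = uninterruptible sleep). Linux only."""
import os


def procs_blocked(stat_text):
    """Return the procs_blocked value from the text of /proc/stat, or None."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "procs_blocked":
            return int(fields[1])
    return None


def parse_stat_line(stat_line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the LAST ')' instead of on whitespace.
    """
    head, _, tail = stat_line.rpartition(")")
    pid, _, comm = head.partition(" (")
    return int(pid), comm, tail.split()[0]


def tasks_in_state(wanted="D", proc_root="/proc"):
    """Return (pid, comm, state) for each process currently in state `wanted`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process exited (or is inaccessible) during the scan
        if state == wanted:
            found.append((pid, comm, state))
    return found


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    with open("/proc/stat") as fh:
        print("procs_blocked:", procs_blocked(fh.read()))
    for pid, comm, state in tasks_in_state("D"):
        print(pid, comm, state)
```

The per-process stat file only reports the state of the main thread, so a blocked worker thread would not show up in this scan; walking /proc/<pid>/task/ would be needed for that. Like any snapshot of this kind, the result can already be stale by the time it is printed.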
[17:48] after all it takes ~20ms to handle a read IO operation from a spinning metal hard drive; by the time top or a similar tool has crawled through all the processes on the system, the information it has on a process is likely already out of date
[17:48] oh yeah, here I'm talking about an all flash array
[17:49] and the blocked process stats from the monitoring system are reported every 5 minutes and I had this "problem" for about an hour
[17:49] according to google your quote's source is https://docs.eginnovations.com/Unix_and_Windows_Servers/System_Details_Test_1.htm
[17:49] yes
[17:50] I'm on hold with them asking them what they are actually polling
[17:50] no source code there, i assume.
[17:51] you could perf trace or strace the thing. it'd be drinking from a firehose though
[17:56] yikes
[17:59] you can filter to specific syscalls with strace
[18:01] When oom starts killing stuff, per syslog, it's not always clear what that is, correct?
[18:02] hmm? I'm accustomed to seeing it saying which process it killed
[18:03] both pid and process name should be listed
[18:03] of course if it kills X11 and then all your X clients *also* die because the other end of their socket went away, that might feel a lot like the oom killer not reporting what died .. when really, it was just responsible for one process going away
[18:04] tomreyn / sarnold - thanks, but that's not *always* the process that is consuming the memory right? eg. X can consume a ton of memory, Y will get killed off as a result, correct? Or am I misunderstanding?
[18:05] foo: yeah, there's also some per-process scoring involved; and depending upon how much memory is shared among processes, killing "huge" ones may not actually free up much memory
[18:06] sarnold: ok, so whatever gets killed is not always the culprit. eg. I've seen a ton of different things killed off now that I think about it
[18:06] System runs nginx, postgres and a few python scripts. Attempting to figure out what is causing this
[18:07] yeah, the kernel tries to balance (a) killing something quickly (b) killing as little as possible (c) while also still getting as much memory as possible for the pain
[18:07] the journal will report which process was killed. processes which depend on this process may also fail as a result, and won't be listed individually as part of the oom kill record.
[18:08] sarnold: thanks
[18:08] nginx looks ok, checking postgres right now too.
[18:08] you can actually influence the kernel's decision making a little. but, much more reasonably, you don't want the OOM killing to happen in the first place.
[18:09] Also going to enable query logs for slower queries
[18:09] ./postgresqltuner.pl says [URGENT] set vm.overcommit_memory=2 in /etc/sysctl.conf and run sysctl -p to reload it. This will disable memory overcommitment and avoid postgresql killed by OOM killer. - I've been tracking down a memory issue with something, not sure what it is. Are we in agreement this is suggested? I assume it is but thought I'd ask
[18:10] first identify which of the processes allocated more memory than they should have according to your planning, then try to see how to tune them.
[18:10] if you start increasing debugging / verbosity now you already change their resource allocation
[18:12] tomreyn: "first identify which of the processes allocated more memory" - I can only do this by checking conf files, right? Is there another way?
[18:12] monitoring
[18:16] tomreyn: do you have suggested tools? It's so sporadic, I haven't been able to narrow it down. Running top and sysstat and whatnot now
[18:16] you run some services on your server. ideally as few as possible, and move others to separate servers (or VMs). you think about how much memory you want each of them (as well as the OS itself) to consume, and calculate the total memory allocation. you configure services to allocate only the amount of memory you want them to allocate (which is not always possible, but it often is more or less possible, especially with DB servers).
[18:18] and you do monitoring in short enough intervals to determine what may have consumed more memory than planned. and when this happens you review its logs (maybe increase verbosity), configuration, do the tuning.
[18:18] tomreyn: yeah, I thought about splitting things up a bit more... namely moving postgres onto its own system. Right now postgres + nginx + various python scripts all on one server... and thus fine-tuning isn't an exact science since each fluctuates
[18:18] right, DB servers should always be run just by themselves IMO.
[18:20] postgresql is actually quite configurable in terms of memory allocation, nginx also, but there i find it not to be so plannable.
[18:21] the downside to running databases on different servers is that it can add milliseconds to latency. that's probably better than minutes of latency if the oom killer has decided your database is a hog :) but still, something to keep in mind
[18:22] so can a lot of other factors, yes.
[18:22] tomreyn / sarnold - yeah, I'm not opposed to that. Would definitely help control resources better
[18:23] tomreyn: do you recommend to always separate the DB backend from the web frontend for security? performance? upgradability? all those?
[18:29] I know amazon has RDS. I wonder if Digital Ocean has something.
[18:29] Does anyone have any commentary on this suggestion: [URGENT] set vm.overcommit_memory=2 in /etc/sysctl.conf and run sysctl -p to reload it. This will disable memory overcommitment and avoid postgresql killed by OOM killer.
[18:30] foo: in isolation, I don't like the suggestion. after doing the analysis tomreyn suggested, you may realize it makes sense, or it may not make sense
[18:31] foo: yes, that should drastically reduce the chances of hitting OOM, but it might also make the machine nearly unusable.
[18:34] sarnold: thank you. Part of my challenge is little to nothing meaningful has changed in the past month that I can see. I'm almost wondering if some library had some API change and there's some obscure threading issue due to some change which is causing some resource issue... but meh, OOM killed stuff once in feb, once in march, and 4 times this month (already). Traffic all looks nearly the same
[18:35] foo: that sounds a lot like the machine just isn't sized correctly for the workload
[18:37] sarnold: thank you. it's been online for 3 months. It was a recent migration from ubuntu 14.04 to 18.04. Not much has changed in the past few months but nonetheless, I agree something isn't tuned properly. I don't think gunicorn can be tuned, leaving nginx + postgres, namely. Django also runs on here.
[18:37] how do I get an older version of mysql? every time I try to install a deb it tells me dependency issues and install -f just gives me the latest version
[18:38] wondering if anyone can point me in a direction :)
[18:38] BrianBlaze: can you pastebin the whole thing? (pastebinit package has an easy pastebinit tool that can help with this)
[18:38] * foo sets up pg_stat_statements
[18:39] https://pastebin.com/gEH5Li2i
[18:40] why do you want to install that specific version?
[18:40] where did you get it?
[18:40] because this app needs mysql version between 5.5 and 5.2.24
[18:40] sorry 5.7.24
[18:41] does 5.7.25 break something? or does their documentation just not know about 5.7.25 yet?
[18:41] when I go through the install it tells me it won't work with the newest version of MySQL and won't let me go further
[18:41] so yeah the latter sarnold
[18:41] ew
[18:41] alright then
[18:41] do you have any data in the database that you care about?
[18:42] nah this is a fresh install
[18:42] basically we use OrangeHRM at work
[18:42] open source
[18:42] and I am trying to go to the latest version
[18:42] I will worry about getting the data there after
[18:43] alright, cool. I think you'd be best served by apt-get purge mysql-server -- maybe you'll need to purge other mysql packages while you're at it -- and download the 5.7.24 packages from https://launchpad.net/ubuntu/+source/mysql-5.7/5.7.24-0ubuntu0.18.04.1
[18:45] thanks so much I will give it a shot
[18:52] sdeziel: not always, not necessarily for a small test / dev / hobby project. but for anything 'serious', yes.
[18:53] tomreyn: OK. I myself usually put it on the same machine to remove the network as a potential source of failure. I also think that since the web app has the DB password, security-wise it isn't much worse
[18:54] tomreyn: but for a bigger deployment, I guess you are right it's best to separate them
[19:02] sdeziel: sure, networking is always a possible hazard (still, but not necessarily as much, in a more controlled environment than the Internet), and there is latency, as sarnold mentioned. but if you run a webserver on the same system as a database server, it already rules out a serious HA setup. (definitely but not necessarily only) if there's server side scripting involved on the webserver it also means you're adding additional
[19:02] attack vectors against a local vs remote database server (vectors and attacks which involve the local (e.g. file) system, such as remote file include, privilege escalation, directory traversal).
[19:03] tomreyn: right, good point. It's harder to secure when both are on the same machine
[19:03] BrianBlaze: don't forget to dpkg hold the mysql packages to prevent security updates from replacing the specific versions you're installing
[19:05] BrianBlaze: apt-mark(8) can do that
[19:05] tomreyn: that said, the only valuable thing on the DB server is usually the DB itself
[19:05] how true
[19:05] thanks
[19:07] sdeziel: which is the big secret trove, the crown jewels, though, right? surely not always, but in many cases DB leaks are worse than, say, application code leaks (though those can be very bad, too, exposing malpractice, dodgy policies which carried into code)
[19:07] tomreyn: agreed but since the web app already has access to the DB...
[19:08] sdeziel: database user access, yes, not file system access
[19:08] those are very different
[19:08] tomreyn: that's probably what I fail to understand
[19:09] mind elaborating a little on the security implications?
[19:09] if you can "select into outfile" on a backend DB server but have no means to access the data it stored into a file that is now local to the DB server, such as through a remote file include attack against PHP, then this attack vector doesn't help you at all.
[19:11] and in such a case, the source of the select would have to be something other than the DB itself, is that even possible?
[19:11] I really appreciate the input sarnold I am on my way :)
[19:11] (I know very little about DBs... just enough to drop a table/DB ;) )
[19:11] BrianBlaze: great! :) have fun
[19:12] little sdeziel tables :)
[19:12] hehe
[19:13] https://www.xkcd.com/327/
[19:13] :)
[19:15] sdeziel: so imagine this scenario: there is a php application running on the webserver which is vulnerable to both remote file includes and SQL injection, and you have a mysql server as the backend. and the SQL injection is limited in that the application almost prevents it, except that you can still run INTO OUTFILE sql queries successfully, where mysql would write the result of a query into a file on the local file system.
[19:16] tomreyn: so far I understand from the above that you could extract stuff the mysql user has access to.
[19:16] sdeziel: in this scenario, if the DB server runs on the same system as the vulnerable web application, you can access this file via remote file include. not so if the database server runs on a different system and wrote the file to that system's file system but not that of the web application
[19:17] tomreyn: I (think I) understand that part but what I fail to understand is how would that be a bigger threat than leaking the full DB the web app has access to anyways?
[19:19] sdeziel: it is only marginally greater. but in the scenario discussed, you can't make the web application leak the full DB its DB user has access to by any other means.
[19:19] normally web applications are not meant to just read the full DB and dump it to the internet ;-)
[19:20] we're well beyond the scope of this channel by the way. if anyone thinks we should move elsewhere please say so.
[19:20] I'm not worried about the normal case ;) but I'd assume someone with SQL injection and interested in the DB data would simply leak it without the intermediate file
[19:20] yeah, that's OT, sorry
[19:22] sdeziel: it's all a matter of what the attacker can control. if they can just run any SQL statement they like against the database within the scope of the web application's database, then surely that means they can dump it.
[19:23] the scenario i meant to describe only allows the attacker very limited control over how sql statements can be modified.
[19:25] tv time now, but we can talk later in #ubuntu-offtopic or elsewhere, just ping me.
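Editor's sketch for the SQL injection thread above. How much of a statement an attacker controls depends first on how the application builds its queries. The Python below uses the standard sqlite3 module and a hypothetical `students` table purely for illustration; the scenario in the log is PHP with MySQL, and nothing here reproduces INTO OUTFILE. It contrasts a query built by string concatenation with one that passes the value as a bound parameter.

```python
"""Sketch: string-built SQL vs. a parameterized query (sqlite3, in memory)."""
import sqlite3


def make_db():
    """Create a throwaway in-memory database with a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, grade TEXT)")
    conn.executemany(
        "INSERT INTO students VALUES (?, ?)",
        [("alice", "A"), ("bob", "B")],
    )
    return conn


def lookup_unsafe(conn, name):
    """Vulnerable: the caller's text becomes part of the SQL statement."""
    query = "SELECT name, grade FROM students WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    """Safer: the value is bound as data and never parsed as SQL."""
    query = "SELECT name, grade FROM students WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    db = make_db()
    payload = "x' OR '1'='1"
    print("unsafe:", lookup_unsafe(db, payload))  # condition is always true
    print("safe:  ", lookup_safe(db, payload))    # no student has that name
```

With the concatenated query the payload rewrites the WHERE clause and every row comes back; with the bound parameter the same text is only compared against the name column. Parameter binding narrows what an attacker can control, but it does not replace the separation of web and database hosts discussed above.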
[19:25] tomreyn: thanks
[19:25] thanks for the discussion, it's been fun reading
[19:27] :) and fun for me learning to understand how i can express myself better, and not mixing up the proper terms so much. i bet sarnold would have explained it much better. ;-)
[19:27] I wouldn't be so sure of that -- actually *using* computers isn't my forte :)
[19:28] once again, I get to the conclusion I should learn more stuff to better understand things..
[19:28] heh, yes :)
[20:44] @gbkersey: FYI: Linux 4.4.0-145-lowlatency #171-Ubuntu SMP PREEMPT Tue Mar 26 13:17:00 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[21:26] JamesBenson: any luck with the 10Gb ?
[21:29] ehehehehehehehehe i feel privileged... xD
[21:29] I have TWO cable hookups here xD
[21:33] sarnold: mind helping me test something?
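Editor's sketch for the full-disk question at [15:09] to [15:19]. The log never shows how the deleted-but-still-open file was located. One possible way on Linux, assuming /proc is mounted and the script runs with enough privileges to read other processes' fd directories, is to look for file descriptor links whose target ends in " (deleted)". The function name is made up for illustration.

```python
"""Sketch: find files that were unlinked but are still held open (Linux)."""
import os


def deleted_open_files(proc_root="/proc"):
    """Return (pid, fd, target, size_bytes) for open files that were unlinked.

    Targets such as memfd or /dev/shm entries can also match; they live in
    memory, not on the disk that appears full.
    """
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                target = os.readlink(link)
                if not target.endswith(" (deleted)"):
                    continue
                # stat() follows the link to the still-open inode
                size = os.stat(link).st_size
            except OSError:
                continue  # descriptor was closed while we were scanning
            results.append((int(entry), int(fd), target, size))
    return results


if __name__ == "__main__" and os.path.isdir("/proc/self/fd"):
    # Largest first: these are the likely culprits when df and du disagree.
    for pid, fd, target, size in sorted(
        deleted_open_files(), key=lambda row: row[3], reverse=True
    ):
        print(pid, fd, size, target)
```

Restarting or stopping the process that holds the descriptor releases the space, which matches what happened with libvirtd in the log. Truncating the file in place keeps the path and the descriptor valid, which is presumably why the cp /dev/null approach was chosen over rm.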