=== zz_DenBeiren is now known as DenBeiren
[00:50] Question. I was looking at the performance of a program communicating with itself via localhost (TCP). I'm seeing some retransmissions, and an rto of 200ms when speaking between both programs. Wondering where I should start debugging, since local communication should not see a rexmit if it's healthy
[00:50] Example: ESTAB 0 65 127.0.0.1:41690 127.0.0.1:22144 timer:(on,212ms,0) uid:1000 ino:380496980 sk:ffff8801ca163600
[00:53] jeremy_carroll: 200ms sounds suspiciously like the TCP_CORK entry in the tcp(7) manpage
[00:54] sarnold: Yeah. Everything right around 200ms. Which I thought was RTO. Checking man entry
[00:55] sarnold: No shit. This looks exactly right. I do not think the program is setting TCP_NODELAY. So it's most likely waiting for CORK
=== markthomas|away is now known as markthomas
=== markthomas is now known as markthomas|away
[01:02] sarnold: I looked at the C code for the program. It's not setting TCP_CORK specifically. I'll look for setsockopts on startup to see if it's doing so. This is not a default option, correct?
[01:02] jeremy_carroll: well, I don't know that TCP_CORK is the right option to set, since you'd need to unset TCP_CORK when you want the data to fly on the wire; setting TCP_NODELAY is more likely the solution
[01:03] sarnold: Yeah. Thanks for the tip. I think you are right that this has 'something' to do with Nagle's. NODELAY, CORK, etc. Very helpful. The timer being set made me think it was rexmit / rto. Though now I know the timer can be for other options, such as CORK.
[01:04] jeremy_carroll: I hope that's it; if so, it'd be a simple enough fix. I'd be curious to know the results when you've got something sorted out :)
=== Siebjeee is now known as Siebjee
=== furkan_ is now known as furkan
[05:33] I mounted a partition (multipath) from SAN. It is working too slowly. Can you explain why it is slow? It is working fine on other nodes
[05:42] I mounted a partition (multipath) from SAN. It is working too slowly. Can you explain why it is slow? It is working fine on other nodes
=== suigeneris is now known as Kartagis
=== maxb_ is now known as maxb
[08:21] Good morning.
=== liam_ is now known as Guest79610
[08:28] Hey, is anyone using pxe for ubuntu server? Looks like I am hitting the same thing as: https://bugs.launchpad.net/ubuntu/+source/net-retriever/+bug/1067934
[08:28] Launchpad bug 1067934 in net-retriever "spends 10+ minutes deduplicating Package lists" [High,Fix released]
[08:28] both precise and trusty tested.
[08:29] every pxe installation hangs for 10 more mins at that stage.
[09:44] hi - I am looking for a way to sync 'parts' of various config files across multiple linux servers - they are different distros. I am also looking for a way to update all servers en masse - should I be looking at something like puppet ?
[09:45] or can anyone suggest a simple alternative ? I do not really care about deployment (yet) - just syncing 'parts' of config files and updating multiple servers
[09:45] i.e. does landscape have the tools to sync parts of config files, or is that a tool to update multiple servers ?
[09:46] yossarianuk: ERB
[09:46] yossarianuk: Puppet is great for that ;)
[09:47] lordievader: cheers that is what I thought....
[09:47] ikonia: what does ERB mean ?
[09:48] ruby templates
[09:48] ikonia: ah - thanks
[09:49] Puppet is written in Ruby, and can use templates.
[09:50] puppet could be a huge overkill though for a few config files
[09:50] it really depends on what's needed
[09:50] True, true.
[09:53] lordievader: ikonia: that was a fear....
[09:53] i.e. overkill...
[09:54] if you have any suggestions of lighter alternatives .....
[09:54] you can use ERB templates without puppet
[09:54] and that can also update servers of different os types ?
[09:54] totrally
[09:55] totally
[09:55] cool
[09:55] it's just a cross-platform template
[09:55] (it's used with puppet hence the cross platform)
[09:55] the only thing you need to work out is the distribution method, but that can be as easy as a shell script
[09:55] well cheers ! (going for a meeting now - back in several hrs.)
[09:56] ERB sounds like a good solution to be fair...
[09:56] setting it up outside of puppet will require a little thought, but once you've worked it out, you'll fly
[09:57] eg: hiera is a common use for populating the data, you won't be using that, so you'll need to do something different, but it won't be too hard
=== liam_ is now known as Guest73903
[10:21] hey guys. I'm having a bit of a dependency issue while trying to install php5-memcached. I was hoping I could get some advice on what to do next? Here's the bash output: http://pastebin.com/nZcn1YTx
[10:25] klander: gconf2 fails to set up, and everything seems to depend on that. What happens when you manually run dpkg on that package?
[10:25] lordievader: I haven't tried..
[10:25] dpkg -i gconf2 ?
[10:27] Hi guys, does anyone have experience with Ubuntu Landscape ?
[10:27] klander: Using the full path to the package, should be somewhere in /var/cache/apt/archives
[10:29] ok i have gconf2-common_2.28.1-0ubuntu1_all.deb , gconf2_2.28.1-0ubuntu1_amd64.deb, libgconf2-4_2.28.1-0ubuntu1_amd64.deb
[10:30] klander: Try gconf2_2.28.1-0ubuntu1_amd64.deb
[10:31] https://gist.github.com/anonymous/9fc2c90355ba15c47ff8
[10:34] Pff that is informative.. sudo apt-get autoclean&&sudo apt-get update&&sudo apt-get install gconf2
[10:36] https://gist.github.com/anonymous/0f165f657de5695761b7
[10:36] (after autoclean and update)
[10:38] klander: sudo apt-get purge gconf2&&sudo apt-get install gconf2
[10:39] https://gist.github.com/anonymous/248fc2618d89d6532019
[10:42] klander: Does "dpkg -l|grep gconf" show it as installed?
[10:43] https://gist.github.com/anonymous/bd756c995e76e5f2fdfe
[10:43] I guess not ^
[10:45] klander: sudo apt-get install gconf2
=== nath|off is now known as nathema
[10:46] https://gist.github.com/anonymous/3b66871ca41988c67c97
[10:46] :/
[10:46] shared-mime-info, libgtk2-perl and libgnome2-canvas-perl
[10:47] klander: Well gconf2 seems to be installed correctly: sudo apt-get install -f
[10:47] same output
[10:49] klander: "sudo dpkg --configure shared-mime-info" Errors I suppose?
[10:51] https://gist.github.com/anonymous/1671f16cbb349310bf84
[10:51] Segmentation fault?
[10:53] It ain't supposed to do that...
[10:56] klander: What you could try, though it might be risky, is removing the package temporarily, cleaning the cache and reinstalling it.
[11:00] okay..
[11:01] klander: shared-mime-info likely has dependencies; to remove it without removing the dependencies, see http://ubuntuforums.org/showthread.php?t=1513821
=== Lcawte|Away is now known as Lcawte
=== liam_ is now known as Guest35660
=== zz_DenBeiren is now known as DenBeiren
=== Lcawte is now known as Lcawte|Away
=== unreal_ is now known as unreal
=== liam_ is now known as Guest73986
[14:32] I have been getting this email regularly now. 'panic action' script /usr/share/samba/panic-action. nothing esoteric. just local samba for file sharing with windows machines. I am also getting no talloc stackframe at ../source3/param/loadparm.c:4864, leaking memory
[14:50] My home server, mainly media and backups, has been turning off at some point in the night. I have to power it up in the morning. This has happened maybe 3 days in a row. Things ran fine for months. Is there a log I can look at? I looked at dmesg but didn't see anything there.
[14:51] look at whatever log your ups software logs to
=== bilde2910|away is now known as bilde2910
[15:06] ok, didn't know there was a ups log. Thanks.
[15:59] smb: hi, are you around?
[15:59] hallyn, I feel tempted to say no, but yes.
[15:59] smb: caribou is having an issue with backported libvirt pkgs due to apparmor complications. I think that a version of your upstream patch to tweak the apparmor rules might be the best fix
[16:01] hallyn, Yeah... Should I fwd him my latest patches for upstream?
[16:01] Probably still have to be tweaked a bit since I only test compiled the upstream variant. Not integrated into Debian packaging
[16:02] smb: yeah, it's probably better to do it in debian/rules based on the deb target arch
[16:02] hallyn, btw, something else. is the irc meeting planned to take place or was it cancelled since many would be away
[16:03] it is cancelled
[16:03] Ah ok.
[16:04] In theory it should work after things are expanded. I am just not sure which steps are used to get there. Maybe repackage after ./bootstrap
=== Lcawte|Away is now known as Lcawte
=== exixt_ is now known as exixt
[18:30] Can I set UFW to allow SSH from all local networks? We've got quite a few 10.x.x.x VLANs at work, and I'd like to lock SSH down to the local VLANs without having to add each one independently
[18:31] maybe just allow from 10.0.0.0/8?
[18:32] tgm4883: try ufw allow in ssh from 10.0.0.0/8 or similar?
[18:45] sarnold: yes that seems to have worked. Thanks
[18:46] tgm4883: nice
=== DenBeiren is now known as zz_DenBeiren
=== exixt is now known as exixt_
[19:01] Hi there! I used smartctl --test=short to scan my server's hard drive for errors. I'm not totally sure how to interpret the results, however. Is there some easy way I can check whether my disk ought to be replaced soon? Anything to look out for in the future? https://puu.sh/cDdv3.png
[19:03] I'm guessing the answer to this is actually a bit too simple.. but I just can't seem to figure it out
[19:03] Which syntax would I use to bond an interface and then bridge it, while using DHCP?
[19:03] bilde2910: that hardware ecc recovered and raw read error rate seem staggeringly high; to the point that I even wonder if they're outright wrong..
[19:04] So... something's up? Should I replace the drive?
[19:04] bilde2910: I'd run the test again tomorrow or something and see if those counts have increased. if they have, plan its replacement soon. if they haven't, you might not have an -immediate- problem but .. it's scary, right? :)
[19:04] Well yeah, I should probably do more frequent backups then
[19:04] never a wrong answer :)
[19:05] Will run the test again tomorrow, then. Thanks for help
[19:05] good luck :)
[19:06] Thanks :)
[19:07] bilde2910: see the line about SMART Self-test log stuff
[19:07] Num #1 "Completed without error"
[19:08] Well that looks promising, at least in its current state.
[19:09] Oh, and another question. Is it possible to be alerted somehow (by email, for instance) when something bad happens or is about to happen?
=== roost_ is now known as roost
[19:09] bilde2910: also, ignore the Hardware_ECC_recovered line, usually only the vendor knows what it means
[19:09] Ok, thanks for the tip, dasjoe
=== exixt_ is now known as exixt
[19:10] bilde2910: If you can erase the drive you should run a destructive test using badblocks, it overwrites the disk multiple times with patterns and checks them for correctness
[19:11] dasjoe: oh, thanks
[19:11] dasjoe, not sure if that is currently an option; not sure how that would impact uptime on the web server I'm running there. I'd like to use it as much as possible and avoid any downtime I can
[19:12] bilde2910: also, see "man 5 smartd.conf" for info on how to receive mails from smartd. If you're using mdadm you should check out "man 5 mdadm.conf", too
[19:13] Thanks
[19:13] Sure
[19:15] sarnold: imho the only interesting lines are the ones where the vendor configured a threshold, where I usually compare VALUE to THRESH and (mostly) ignore the raw value
=== exixt is now known as exixt_
[19:17] One last question - how long could I hope my disk would last if I read/write about one file per second? I'm not sure if there are any good estimates on this, but if there are, it would be good to know
[19:17] dasjoe: ah, the middle columns that I've mostly ignored; those look scary too :)
[19:19] I have been getting this email regularly now. 'panic action' script /usr/share/samba/panic-action. nothing esoteric. just local samba for file sharing with windows machines. I am also getting no talloc stackframe at ../source3/param/loadparm.c:4864, leaking memory
[19:20] bilde2910: nobody can say, disk life is a guessing game at best; I replace when errors show up in the log, sometimes that's two months in, and sometimes it never happens
[19:20] 10 years down the line
[19:22] Ok, thanks!
[19:24] SMART errors give you reasonable warning prior to a failure about 98% of the time in my experience, and they're evidence enough for an RMA, so that's what I use
[19:24] bilde2910: your disk has "used" 6% of its target hard power-cycles (being switched off and on) and 11% of its load cycles (its head getting parked). So you can probably use it for about 9x as long as you've used it for now
[19:25] Interesting
[19:25] Just keep in mind SMART is not perfect, a large study (iirc done by Google) found SMART didn't give any warnings for 50% of failed disks
[19:27] Must have been some crap disks
[19:27] Failures without smart errors are pretty rare IME and normally that only happens with a drop dead failure situation
[19:28] Yeah, because that's what Google would be using. They're known for taking the worst possible hardware ;)
[19:28] "Figure 14 shows that even when we add all remaining SMART parameters (except temperature) we still find that over 36% of all failed drives had zero counts on all variables."
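The arithmetic behind the "6%", "11%" and "about 9x" figures above can be sketched as follows. This is only an illustration under two assumptions: that the normalized VALUE column counts down from 100 like a percentage (a vendor-dependent convention, not a guarantee), and that an attribute counts as failing when VALUE is at or below a non-zero THRESH. The rows, thresholds and function names are made up to match the percentages quoted in this log; they are not read from the screenshot.

```python
# Hypothetical rows shaped like the `smartctl -A` table: (id, name, VALUE, THRESH).
# VALUE numbers are chosen to match the 6% / 11% quoted in the log; THRESH is invented.
ROWS = [
    (9, "Power_On_Hours", 94, 0),
    (193, "Load_Cycle_Count", 89, 0),
]


def used_percent(value, start=100):
    """Share of the design budget consumed, assuming VALUE counts down from `start`."""
    return start - value


def lifetime_multiple(value, start=100):
    """Total design life as a multiple of the usage so far (None if nothing is used yet)."""
    used = used_percent(value, start)
    if used <= 0:
        return None
    return start / used


def below_threshold(value, thresh):
    """True when a vendor threshold is set (non-zero) and VALUE has reached it."""
    return thresh > 0 and value <= thresh


if __name__ == "__main__":
    for attr_id, name, value, thresh in ROWS:
        print(attr_id, name,
              f"used={used_percent(value)}%",
              f"multiple={lifetime_multiple(value):.1f}x",
              f"failing={below_threshold(value, thresh)}")
```

With these assumptions the load-cycle attribute is the tighter limit: 11% used gives roughly 9x the usage so far, which matches the estimate in the log. It says nothing about sudden failures, which is the point the following messages argue about.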
[19:28] I don't check the parameters, just the error log
[19:29] http://static.googleusercontent.com/media/research.google.com/en//archive/disk_failures.pdf
[19:29] The parameters are largely useless
[19:29] Most failures don't happen all at once, so there's a window of opportunity to replace it
[19:30] Right. I ignore the error log, but check the parameters, I also trust my senses of smell, hearing and temperature ;)
[19:33] I've never had any success with tools that monitor the parameters to predict failure, but I have had great success by monitoring the error count
[19:34] Soon as that error pops up, prepare to replace
[19:59] I've set up an OpenVPN server (just with the regular tun interface, not tap) and everything connects smoothly with the firewall disabled, but once I turn on my firewall again I can connect perfectly but it seems to refuse the routing, with the result that I have no internet access. I've tried adding rules to iptables such as "-A POSTROUTING -o eth0 -j MASQUERADE" & "-t nat -A POSTROUTING -s 10.8.0.0/24 -o eth0 -j MASQUERADE" but with
[20:00] I have iptables-persistent installed also
[20:00] Does anyone have any idea what's going wrong with the iptables rules I added?
[20:02] kevindf_: Let iptables log the dropped packets and look at what it is dropping.
[20:02] How can I log that exactly? As I'm not that familiar yet with iptables
[20:03] kevindf_: http://www.thegeekstuff.com/2012/08/iptables-log-packets/
[20:04] I'll take a look at that, and come back with the results in a few minutes
[20:04] thank you
[20:12] lordievader I logged the data, I think this is the output http://pastebin.com/i0WU96GD
[20:14] kevindf_: Lots of DNS is being dropped. Can you ping your vpn network with the firewall on?
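One rough way to see what a firewall log is mostly dropping is to tally the destination port (DPT) field and ignore the source port (SPT), which is usually a random ephemeral number. The sketch below assumes the usual kernel netfilter KEY=value log layout; the sample lines and the client address 10.8.0.6 are invented for illustration and are not the contents of the pastebin above.

```python
import re
from collections import Counter

# Invented sample lines in the KEY=value style of netfilter/UFW log entries.
SAMPLE_LOG = """\
[UFW BLOCK] IN=tun0 OUT=eth0 SRC=10.8.0.6 DST=8.8.8.8 LEN=60 PROTO=UDP SPT=54010 DPT=53
[UFW BLOCK] IN=tun0 OUT=eth0 SRC=10.8.0.6 DST=8.8.4.4 LEN=60 PROTO=UDP SPT=41877 DPT=53
[UFW BLOCK] IN=tun0 OUT=eth0 SRC=10.8.0.6 DST=8.8.8.8 LEN=60 PROTO=UDP SPT=38122 DPT=53
[UFW BLOCK] IN=tun0 OUT=eth0 SRC=10.8.0.6 DST=93.184.216.34 LEN=52 PROTO=TCP SPT=50112 DPT=443
"""

# Only the fields needed to tell "what service was the packet going to".
FIELD = re.compile(r"\b(PROTO|SPT|DPT)=(\S+)")


def tally_destinations(log_text):
    """Count dropped packets per (protocol, destination port)."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = dict(FIELD.findall(line))
        if "DPT" in fields:  # ICMP and similar entries carry no port fields
            counts[(fields.get("PROTO", "?"), int(fields["DPT"]))] += 1
    return counts


if __name__ == "__main__":
    for (proto, port), hits in tally_destinations(SAMPLE_LOG).most_common():
        print(f"{proto}/{port}: {hits} dropped")
```

In the invented sample the top entry is UDP port 53, the DNS port, which is the pattern described for the real paste.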
[20:16] will try to ping on my laptop with the vpn connection, as I tested the vpn quickly through my phone for the log
[20:16] hang on
[20:19] Got my server running under 40c finally
[20:21] at 100% load <3
[20:21] lordievader I can ping 10.8.0.1 perfectly when the firewall is enabled and when connected to the vpn
[20:22] but no internet access of course
[20:22] kevindf_: I think you'll find you have internet access but your DNS is broken.
[20:23] ^
[20:23] LinStatSDR: Whoo neat. Is it an airplane now?
[20:23] Nope, just ram air. Not too too loud but... servers are loud anyway.
[20:24] I will try commenting out push "dhcp-option DNS 8.8.8.8" push "dhcp-option DNS 4.4.4.4"
[20:24] in my openvpn server.conf file
[20:24] and then try again
[20:27] lordievader I've tried commenting out the DNS in my server conf so it doesn't push the dns servers to the client, but that didn't work out either unfortunately
[20:28] kevindf_: That's not what I meant with 'your DNS is broken', look at the iptables log paste you posted.
[20:28] kevindf_: What is it mainly dropping? What destination port?
[20:29] It's set on port 1194 UDP
[20:29] maybe i should try using port 443 or something?
[20:30] kevindf_: Try to answer my questions...
[20:35] kevindf_: Look at the paste you gave me, what destination port is being dropped?
[20:37] 54010?
[20:38] kevindf_: That is likely a source port... no, traffic with destination port (DPT) 53 is being dropped.
[20:39] kevindf_: What uses UDP port 53?
[20:39] dns?
[20:39] I know I know
[20:39] aww he beat me to it
[20:39] Yes, DNS
[20:41] i'm still pretty new to networking but trying to learn as much as i can every day
[20:41] kevindf_: Exactly, in other words: any host lookup you do from your vpn client is not able to resolve it to an ip address.
[20:42] kevindf_: Allow outgoing udp connections to 8.8.8.8:53 and 4.4.4.4:53 (wasn't it 8.8.4.4?) and you are good to go.
[20:43] allow tcp too
[20:43] (Unless there are other ports your firewall blocks ;)
[20:44] Ok, thank you. I will try adding those rules to my firewall and see how it turns out
[20:44] Sorry for some stupid answers, but everyone starts somewhere :)
[20:45] No worries. We don't mean to come off as being rude. Just text has no emotions or tones.
[20:46] kevindf_: Exactly, that is why I tried to teach you something rather than just provide answers ;)
[20:47] no problem :) and yes lordie i appreciate that a lot, helps me understand things more easily
[20:57] dasjoe, just curious, where did you see those cycle use percentages you mentioned
[21:00] I allowed the outgoing connections to 8.8.8.8:53 and 8.8.4.4:53 tcp & udp, the log is giving me UFW blocks now for port 80 TCP & port 443 TCP
[21:02] so http
[21:06] LinStatSDR If I'm correct I should allow 80 & 443 now also but for 10.8.0.0/24?
[21:06] TCP
[21:06] Sounds good to me.
[21:11] kevindf_: I'd allow those in general. Whitelisting of web servers is a drag.
[21:12] lordievader I just checked and these are both configured for IPv4 as well as IPv6 to allow from anywhere
[21:14] I don't see why UFW is blocking the packets on those ports now as they are both allowed
[21:14] lordievader: I agree, whitelisting is very time consuming.
=== bilde2910 is now known as bilde2910|away
=== bilde2910|away is now known as bilde2910
[21:58] bilde2910: check the table, ID 9 Power_On_Hours and ID 193 Load_Cycle_Count
[21:59] POH's VALUE is "094", which is in %. So it was on for 6% of the time it was designed for
[22:10] lordievader: Finally got it working, took me some time but added some new iptables rules and it works fine now
[22:11] lordievader: Thanks for helping me out and teaching some new stuff :)
[22:14] kevindf_: Sure, no problem. Glad to hear it is working now :)
[22:14] :)
=== Corey_ is now known as Corey
[23:44] could not find module name cc_ubuntu_init_switch
[23:44] anyone seen this?
[23:44] server failing to boot
=== Lcawte is now known as Lcawte|Away