[00:00] cwill_at_work... does a system with high run q and low load indicate that it's I/O bound to you?
[00:01] cwillu_at_work, sorry
[00:01] iowait% is i/o bound
[00:01] you have 0% iowait
[00:02] That post was of a healthy system... I was just trying to figure out how to interpret
[00:04] I logged some netstat numbers during the issue... I see some high send and recv Q, this may be something here
[00:06] that would likely be a nic driver issue, I would think
[00:06] do all your machines have the same driver/nic?
[00:07] been a long time since I attempted to diagnose or work on something at that level
[00:08] Yep every machine is identical
[00:08] Most are in LAST_ACK
[00:08] e1000e?
[00:09] but some ESTABLISHED
[00:09] like almost 200k in some queues
[00:09] it might just be the result of the issue, but it might be a cause
[00:09] the card?
[00:09] it's a broadcom, hang on
[00:09] not sure what to tell you to figure it out
[00:10] Broadcom 5720
[00:10] That queue size seems pretty unhealthy right?
[00:11] what do you get for ethtool -k eth0
[00:11] I don't have a currently borked system
[00:11] doesn't matter
[00:12] http://pastebin.com/ARwR2W6K
[00:12] but at least till someone has an idea how to diagnose this some more, I can at least throw you some things to see if they have any effect
[00:12] if they do, it's likely the cause, if not, just an effect
[00:13] Yea totally.
You've been reallllly helpful
[00:13] This stuff is all fairly new to me
[00:13] give a try: ethtool -K rx off tx off sg off tso off gso off rso off rxvlan off txvlan off eth0
[00:13] maybe again for eth1 if you use it
[00:13] oops
[00:14] ethtool -K eth0 rx off tx off sg off tso off gso off rso off rxvlan off txvlan off
[00:14] I am not sure about the broadcoms, but I know the intel driver has gone back and forth on it working and not working
[00:14] my older intel ones, I had to disable a few of those, to make it work correctly
[00:15] this will cause higher cpu usage
[00:15] I doubt it will be enough for you to notice though
[00:15] so basically turning everything off
[00:15] yep
[00:15] any potential for badness here?
[00:15] besides cpu usage
[00:15] no
[00:15] the checksums just lower cpu usage
[00:16] But in your experience they gunk up the works sometimes?
[00:16] the rest mainly cause the nic and linux to move around 64k of data at a time, instead of one packet at a time
[00:16] gro sometimes, rxvlan on one of mine here at home
[00:16] tso I think I had an issue with on some too
[00:17] this system I am using now needs: ethtool -K eth0 rxvlan off tx off
[00:17] that leaves only gso turned on
[00:18] forget about the tx on it, but it doesn't support rxvlan, but the driver thinks it does
[00:18] It's a gigabit card, its speed is set at 100Mb... likely the network it's on
[00:19] oh, at 100mbit you will never see the increased cpu usage :)
[00:19] I'm wondering if I need to upgrade the network... I've never seen us go beyond maybe 15Mb though
[00:27] Seems like a lot of data in queue in LAST_ACK state indicates a problem with our code no?
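A minimal sketch of the offload-disabling command from the messages above, built as an argument list so it can be read over before anyone runs it. The helper name `ethtool_offload_command` and the `OFFLOAD_FEATURES` tuple are made up for illustration. The feature names are the ones typed in the chat, except `rso`, which is left out on the assumption that it was a typo (it is not an ethtool feature name I can vouch for). The script only prints the command; running it for real needs root and an actual interface.

```python
import shlex

# Short feature names as typed in the chat, minus "rso" (assumed to be a typo).
OFFLOAD_FEATURES = ("rx", "tx", "sg", "tso", "gso", "rxvlan", "txvlan")


def ethtool_offload_command(interface, features=OFFLOAD_FEATURES, state="off"):
    """Return the argv list for `ethtool -K <interface> <feature> <state> ...`."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    argv = ["ethtool", "-K", interface]
    for feature in features:
        argv += [feature, state]
    return argv


if __name__ == "__main__":
    # Print only; nothing is executed here.
    print(shlex.join(ethtool_offload_command("eth0")))
```

Passing a shorter `features` tuple with `state="on"` gives the command for switching a single offload back on, which is one way to test the suspects one at a time.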
[00:27] Basically the connection has been severed on their end, but we haven't gotten rid of it
[00:28] TCP: 458 (estab 50, closed 96, orphaned 63, synrecv 0, timewait 1/0), ports 0
[00:28] Lotta orphans
[00:29] dunno what a LAST_ACK is
[00:30] Right before the tcp connection closes
[00:30] oh, that is actually a state
[00:30] I never see those
[00:30] I've got a bunch... perhaps that's an issue
[00:30] na, normally that for me is TIME_WAIT, where the connection was closed, but not properly
[00:31] "The remote end has shut down, and the socket is closed. Waiting for acknowledgement."
[00:31] ya, sounds like you're sending it data, but it's not responding
[00:31] oh
[00:31] is there a funny firewall in the way preventing those packets?
[00:32] just iptables
[00:32] hmm, odd though, never seen them, just the FIN_WAIT TIME_WAIT mainly
[00:34] Here so you have some idea what I'm looking at: http://pastebin.com/9RNzEbb9
[00:34] ips jiggled to protect the innocent :)
[00:36] I wonder if we're just shipping data to an "almost closed" socket, and filling up the tcp queue
[00:42] jsonperl_: the 'slabtop' utility ought to be able to show you if TCP is eating too much of your memory
[00:43] I'll check it out
[00:43] though memory utilization is quite good now
[00:43] (with a little help from my buddy PatrickDK)
[00:49] gotta head home... thanks folks, back later
[00:49] have fun :)
[01:11] back for more!
[01:31] PatrickDK
[01:32] I just had a system flake out… I hit those networking settings live, and it seems to have fixed it?
[01:32] (super super anecdotally)
[01:32] dunno :)
[01:32] so your theory there is that there is a driver issue with the card?
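For anyone wanting to put numbers on the LAST_ACK pile-up discussed above, here is a rough sketch that tallies connections and queued send bytes per TCP state from `netstat -tan` style text. The sample rows are invented (documentation address ranges), not taken from the pastebin, and the function name `summarize_tcp_states` is hypothetical.

```python
from collections import Counter

# Invented sample in the usual `netstat -tan` column layout.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0 196608 192.0.2.10:5000         198.51.100.7:51234      LAST_ACK
tcp        0  65536 192.0.2.10:5000         198.51.100.8:40022      LAST_ACK
tcp        0      0 192.0.2.10:5000         203.0.113.5:33100       ESTABLISHED
"""


def summarize_tcp_states(netstat_text):
    """Return (connections per state, total Send-Q bytes per state)."""
    counts = Counter()
    send_q = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip the two header lines and anything that is not a tcp/tcp6 row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        state = fields[5]
        counts[state] += 1
        send_q[state] += int(fields[2])
    return counts, send_q


if __name__ == "__main__":
    counts, send_q = summarize_tcp_states(SAMPLE)
    for state, n in counts.most_common():
        print(f"{state:12} connections={n} send_q_bytes={send_q[state]}")
```

Feeding it real output logged every minute or so would show whether the LAST_ACK count and its Send-Q total grow before a box goes bad or only afterwards.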
[01:32] personally, I would put those on like 3 or so, and see
[01:33] well, driver or firmware
[01:33] yea the whole "it didn't explode" thing is a really frustrating way to prove stuff :)
[01:33] more likely driver, but firmware could affect the driver's actions
[01:33] so by turning all of that off, we reduce the load on the card essentially?
[01:33] and let the os take care of stuff
[01:33] well, it puts the card into normal dumb mode basically
[01:34] instead of attempting to limit interrupts, and queue up requests and stuff
[01:34] and offloading some of the work
[01:34] it might be there is some kind of buffer overrun happening on the nic, causing the issue
[01:34] but I'm totally random guessing
[01:35] me2
[01:35] :D
[01:35] but now since that is off, nothing is really getting buffered
[01:35] oh man, if this fixes the problem
[01:35] I have had issues with broadcom drivers before, but not on linux
[01:35] but then, I really have not used broadcom on linux so :)
[01:35] I use what they rent me :)
[01:35] (peer1 / serverbeach)
[01:37] jsonperl: was that an ethtool command that seems to be fixing it?
[01:37] yep
[01:38] ethtool -K eth0 rx off tx off sg off tso off gso off rxvlan off txvlan off
[01:38] 'seems' being the operative word
[01:40] if you're really interested, start knocking one off at a time, till it acts up again :)
[01:41] hahahaha
[01:41] oh man, the fact that that's a reasonable thing to do kind of makes me ill :)
[01:42] :)
[01:43] * Patrickdk bets on the tso or gso
[01:44] I'm gonna turn everything off on all machines
[01:44] could be tx, but normally not
[01:44] then i'll pull those on one of them
[01:44] so do tso, gso, and tx in that order huh :)
[01:44] or, pull a different one per machine?
:)
[01:44] yeah, I'm also suspicious of tso and gso
[01:44] ahahha
[01:44] and it feels like 'sg' would be nice to have back
[01:44] I have no idea what sg is, never bothered by it before :)
[01:45] (at least I assume it means Scatter/Gather)
[01:45] it does
[01:46] oh man, I'm excited
[01:46] * Patrickdk locates a bed
[01:46] I MAY BE ABLE TO SLEEP
[01:46] g'night :)
[01:46] cya Patrick, thanks again
[01:52] all right, all machines updated
[01:52] now I wait :)
[01:52] sarnold/Patrickdk, it makes sense those settings kick in live right?
[01:52] no networking restart or anything
[01:53] jsonperl: right
[01:53] good… because if it didn't that would disprove that it fixed it ;)
=== jtv2 is now known as jtv
=== smb` is now known as smb
[07:27] yolanda, https://code.launchpad.net/~james-page/glance/sqlalchemy-bump/+merge/176613 if you are around :-)
[07:27] zul, ^^
[07:27] morning
[07:27] I'm gonna review all packages today
[07:27] great
[07:28] yolanda, morning!
[07:28] jamespage, bad news, since this branch is on ubuntu-server-dev, i don't have permissions
[07:30] yolanda, just need a review
[07:30] not a merge
[07:30] I'll do that myself
[07:30] jamespage, assign me as a reviewer
[07:30] otherwise i can't
[07:31] i don't have the permissions to "Request review"
[07:31] yolanda, dog
[07:31] doh rather
[07:31] yolanda, done
[07:33] ok, reviewed, i cannot change the main status anyway
[07:34] ack
[07:34] thanks
=== Ursinha-afk is now known as Ursinha
=== racedo` is now known as racedo
=== tim___ is now known as vorpalbunny
=== LordOfTime is now known as LordOfTime|EC2
=== tedski- is now known as tedski
=== vorpalbunny is now known as thumper
=== Tribaal_ is now known as Tribaal
=== thumper is now known as thumper-afk
=== thumper-afk is now known as thumper
=== psivaa_ is now known as psivaa
[10:25] how to check if ssh server is running?
[10:26] ThothCastel: service ssh status
[10:26] ps -ef | grep ssh
[10:30] greppy: mardraum: thanks, it's running, however I am unable to connect to it via ssh :S
[10:31] what exactly happens? use pastebin if you must
[10:36] zul, when you start review needed please - https://code.launchpad.net/~james-page/neutron/fixup-h2/+merge/176650
[10:43] greppy, "sshd"
[11:05] zul, you might wanna take a look at the python-greenlet upload you did yesterday
[11:05] it blasted all of the python3 work that you did in the previous two ubuntu versions
[11:06] (which is why it's blocked in proposed right now)
[11:35] jamespage: fuuuuuu
[11:38] zul: ?
[12:14] hello, I can upgrade my kernel on Ubuntu Server 12.04, but when I reboot, the server doesn't boot and hangs, it's KVM virtualisation
[12:18] zul, hey - I also uploaded trivial fixes for keystone and glance autopkgtest failures
[12:19] I'm stuffing them into havana staging as well
[12:19] jamespage: ack
[12:21] streulma, anything on the console?
[12:21] there is at the moment a problem with console, the isp upgraded to new version of OnApp
[12:22] but before I had the problem
[12:22] it boots the kernel
[12:22] and then hangs after keyboard...
[12:22] before the services load
=== smb` is now known as smb
[13:40] jamespage: http://people.canonical.com/~chucks/ca/
[13:42] zul, ceilometer?
[13:43] jamespage: yep
[13:56] zul, why does simplejson need " - Build for python 3.2 as well."
[13:56] I know precise has python 3.2
[13:57] but can't a generic fix be applied in saucy which makes it a no-change backport again?
[13:58] jamespage: because it explicitly depends on python 3.3
[13:58] zul, +1 for msgpack-python
[13:59] jamespage: python3-all-dev (>= 3.3.0-3) in the debian/control
[14:01] zul, ack
[14:01] reviewing now
[14:01] jamespage: I'll fix the saucy version
[14:01] zul, does it work with python3.2
[14:01] jamespage: yeah
[14:01] just wondering if that's why the min-versions are specced
[14:02] nothing in the changelog
[14:03] zul, nope
[14:04] and it looks OK - maybe poke piotr in #debian-python on OFTC and see if there are any gotchas
[14:04] jamespage: nope I'm not uploading it, i just noticed a bug
[14:04] zul, do we really need the new webtest?
[14:04] is 1.3.3 -> 1.3.4
[14:04] its rather
[14:04] jamespage: I'm not sure, nack it please
[14:05] jamespage: chuck@homer:~/pbuilder/precise_result$ dpkg -c python3-simplejson_3.3.0-2ubuntu1~cloud0_amd64.deb
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/doc/
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/doc/python3-simplejson/
[14:05] -rw-r--r-- root/root 3160 2013-07-24 09:06 ./usr/share/doc/python3-simplejson/changelog.Debian.gz
[14:05] zul, -1
[14:05] -rw-r--r-- root/root 1645 2011-02-15 15:56 ./usr/share/doc/python3-simplejson/copyright
[14:05] chuck@homer:~/pbuilder/precise_result$ dpkg -c python-simplejson_3.3.0-2ubuntu1~cloud0_amd64.deb
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/
[14:05] \o/
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/doc/
[14:05] drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/doc/python-simplejson/
[14:05] -rw-r--r-- root/root 7062 2013-05-01 16:01 ./usr/share/doc/python-simplejson/index.rst.gz
[14:05] -rw-r--r-- root/root 3160 2013-07-24 09:06 ./usr/share/doc/python-simplejson/changelog.Debian.gz
[14:05] nice
[14:05] -rw-r--r-- root/root 1645 2011-02-15 15:56 ./usr/share/doc/python-simplejson/copyright
[14:05] shit!
[14:05] it's zul, so I'll let it slide... this time ;)
[14:06] * jamespage drowns in irc
[14:06] jamespage: tests are not enabled in that package either
=== mahmoh1 is now known as mahmoh
[14:19] hi everyone
[14:19] i need some help with ldap integration with packetfence
[14:19] anyone has any idea how to go about doing this?
[14:40] if I connect to an openvpn server in the office... should it not tunnel all my internet connection through it?
[14:40] I have the same IP as before...
[14:42] Depends on how you have it configured.
[14:43] rbasak, it was configured by my predecessor - where can I check?
[14:44] I don't recall, sorry. Check the docs for mentions of your default gateway. I think it's a client-side setting, but you can also configure the client to accept the server's settings and then configure it on the server (IIRC).
[14:45] Or may default route, rather than default gateway.
[14:45] maybe
[14:46] lots of mention of bridging...
[14:47] Trivial question: how do I upgrade a kernel module that is in use? By in use it is module for raid controller but I am booting using a live CD
[14:47] Monotoko: check your routing, is default route via VPN or your ISP?
[14:48] command "ip r sh"
[14:49] usually, server pushed routes to the client, but client can overwrite it or do some other tricks w/out getting server involved
[14:49] pushed=pushes
[14:50] oozbooz, http://pastebin.com/65vcRqk7
[14:50] I tried to remove the comment in the config here: ;push "redirect-gateway def1 bypass-dhcp"
[14:51] however then the client wouldn't load anything
[14:51] I assume 5.10.152.225 is your ISP GW
[14:51] then your internet traffic should go over it
[14:51] yeah, we have a /29 I believe
[14:51] when I'm connected from outside the office
[14:51] I want it to still use the office IP
[14:52] use office IP for ... ?
[14:52] you mean send ALL your traffic via the tunnel?
[14:52] yeah - it's static - a lot of people who work here work from homes etc, with dynamic IP's
[14:53] I'd rather they all used our network to make it easier to firewall the servers and not keep punching random holes in the FW
[14:55] I don't get your last statement ...
[14:55] usually, you want to send only relevant traffic to your office via the tunnel,
[14:55] rest of the stuff, they should use their ISP
[14:56] why would you want them to download youtube videos using office bandwidth
[14:56] I'd say it depends. Road warriors might prefer everything to go via the office if they don't trust the connections they're using (coffee shops, hotels, etc)
[14:56] oozbooz, we have a "cloud" provider off site that I need to give developers access to, and certain things that they can log into through the web browser but only from this IP
[14:57] aha
[14:57] 3rd party mess..
[14:58] aye - obviously I need a static IP I can trust for that, so I'd rather tunnel everyone through our office network
[14:58] well... you can create a new route so that only traffic for the cloud provider goes via the tunnel
[15:00] hmm, what route would I be adding for that? route add 1.2.3.4 gw 5.10.152.227 eth0 ?
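A small sketch for the "is the default route via the VPN or the ISP" check suggested above with `ip r sh`. It looks at the `default` route and at the `0.0.0.0/1` and `128.0.0.0/1` pair that OpenVPN's `redirect-gateway def1` installs in place of overriding the default. Both sample routing tables are invented, the tunnel device name `tun0` is an assumption, and the function names are hypothetical.

```python
# Invented `ip route show` output for a client that only routes the VPN subnet.
SPLIT_TUNNEL = """\
default via 192.0.2.1 dev eth0 proto static
10.8.0.0/24 via 10.8.0.5 dev tun0
10.8.0.5 dev tun0 proto kernel scope link src 10.8.0.6
192.0.2.0/24 dev eth0 proto kernel scope link src 192.0.2.50
"""

# Same table plus the two half-routes that `redirect-gateway def1` adds.
FULL_TUNNEL = SPLIT_TUNNEL + """\
0.0.0.0/1 via 10.8.0.5 dev tun0
128.0.0.0/1 via 10.8.0.5 dev tun0
"""

CATCH_ALL_PREFIXES = ("default", "0.0.0.0/1", "128.0.0.0/1")


def catch_all_routes(ip_route_text):
    """Map each catch-all prefix found in the table to its output device."""
    routes = {}
    for line in ip_route_text.splitlines():
        fields = line.split()
        if not fields or fields[0] not in CATCH_ALL_PREFIXES:
            continue
        if "dev" in fields[:-1]:
            routes[fields[0]] = fields[fields.index("dev") + 1]
    return routes


def tunnels_everything(ip_route_text, tunnel_dev="tun0"):
    """True if general internet traffic would leave via the tunnel device."""
    routes = catch_all_routes(ip_route_text)
    halves = ("0.0.0.0/1", "128.0.0.0/1")
    if all(routes.get(prefix) == tunnel_dev for prefix in halves):
        return True
    return routes.get("default") == tunnel_dev


if __name__ == "__main__":
    print("split tunnel sends everything via VPN:", tunnels_everything(SPLIT_TUNNEL))
    print("full tunnel sends everything via VPN:", tunnels_everything(FULL_TUNNEL))
```

The two half-routes are more specific than `default`, so they win without touching the original default route; that is why the default line alone can be misleading on a client using `def1`.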
[15:01] but, if you decide to divert all traffic, you will have to change routing rules on the server, that will be pushed to the client
[15:01] which VPN server do you use
[15:01] openvpn
[15:01] jamespage: simplejson fixed locally, I'll upload to the regular archive and get it for the cloud archive as well
[15:02] openvpn or openvpn-AS?
[15:02] regular openvpn AFAIK
[15:02] yeah
[15:03] just checked with dpkg
[15:03] ok, first, my advice is to upgrade to openvpn-AS - much easier to manage
[15:04] there is IRC channel "openvpn", you should confirm with them... but it should be not difficult
[15:05] cheers oozbooz
[15:06] have fun
[15:07] jamespage: http://people.canonical.com/~chucks/ca/
=== pleia2_ is now known as pleia2
[15:14] smb: ping i was wondering if you could offer some insight on it https://launchpadlibrarian.net/145685953/buildlog_ubuntu-precise-amd64.xen_4.2.2-1ubuntu1~cloud0_FAILEDTOBUILD.txt.gz
[15:15] zul, maybe, let me read
[15:16] smb: this is on precise
[15:17] zul, Looks like the known problem of passing LDFLAGS in gcc format -Wl but don't we work around that
[15:18] smb: yeah seems to ignore that for some reason
[15:18] And why do you compile xen 4.2.2 on Precise?
[15:18] :-P
[15:18] Still have not cleared that MRE
[15:19] Actually I would not aim 4.2.2 immediately but 4.1.5... or .6 but anyway
[15:20] zul, +!
[15:20] +!
[15:20] +1 rather
[15:20] jamespage: cool thanks
[15:21] zul, "LDFLAGS = $(shell dpkg-buildflags --get LDFLAGS|sed -e 's/-Wl,//g')" in debian/rules?
[15:21] zul, https://code.launchpad.net/~james-page/neutron/fixup-rootwrap-conf/+merge/176708
[15:21] I'm more the nginx kind of guy so what did I miss here? Installed apache, changed port (so that it won't conflict with nginx), getting this netstat "tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 0 185527477 18552/apache2" but it just reacts to local requests. There is no iptables rule... Any ideas, I'm desperate :(
[15:26] zul, Just out of curiosity is that the 4.2.2 version from current Saucy?
[15:26] yeah
[15:27] zul, Hm, so it has that line... but for some reason I vaguely remember something going wrong with something like this (but I believe that was another package)
[15:29] zul, Oh wait maybe because in P LDFLAGS is exported by the build system...
[15:29] hmm...interesting I'll try it out
[15:30] zul, Is that LDFLAGS := instead of LDFLAGS =
=== JonnyNomad_ is now known as JonnyNomad
[15:39] zul, Oh I think I can imagine what is going on: we do not set LDFLAGS at all by default in newer releases. So when compiling in S I did not notice none of them being used and setting LDFLAGS in debian/rules being useless
[15:39] But in P when they are set by default it fails...
[15:39] smb: so disable it?
[15:40] zul, I'd probably try either an export in debian/rules or move the definition into debian/rules.real for a moment
=== Catbuntu is now known as LexieGrey
[15:40] smb: ok I'll try that
[15:41] zul, And I need to make sure I really use those flags in the Xen 4.3 I am preparing
[15:41] for S that is
[15:43] smb: when are you doing 4.3?
[15:43] zul, I am just about to think I got all pieces together. Testing it on my boxes
[15:44] smb: ok cool
[15:45] zul, a user asked me if there will be any quantum-> neutron renaming in raring or earlier (and similarly, anything before havana)
[15:45] my answer was "NO, but I'll check with zul"
[15:46] med: no, quantum in raring was quantum
[15:46] nod.
[15:47] * med_ was pretty sure it was only a cease and desist not a "go undo the world"
[15:59] Madkiss: howdy! have you looked into packaging dlm?
[16:00] smb: nope neither worked
[16:01] zul, Hm, ok need to figure out how to modify it correctly for the actual compile. Seems the more recent releases just don't use any
[16:02] I mean it does not get passed in and fails because where we change it somehow does not replace the default of the system
[16:04] zul, Doing the export did break the build in the same way on S though...
So maybe := is the second missing piece
[16:13] zul, having LDFLAGS= and export LDFLAGS both in rules.real seems to make the compile run longer (not finished yet)
[16:14] smb: can i see a snippet of your rules.real please?
[16:17] rbasak: BTW, merges.py won't work right now - until egress firewall is more relaxed. Have raised RT
[16:20] Daviey: OK, thanks.
[16:20] roaksoax: Hey, does Openstack / Kombu support Rabbit Active/Active in Havana?
[16:20] I'll try and keep people.canonical.com/~rbasak/delta.py updated in the mean time, though note that I'm doing it manually.
[16:22] Daviey: I haven't checked yet, sorry! I'm doing the whole upgrade process of the clustering tools, which is not as easy as syncing packages from debian
[16:24] Daviey, the issue wasn't active/active, it's the lack of any type of heartbeating support, so that the rpc layer (quickly) detects failure and migrates to a new server
[16:27] Patrick, I still got the issue, but I think I'm getting closer
[16:28] Patrickdk that is
[16:28] Would a BUNCH of connections in CLOSE_WAIT stop up the tcp pipeline at some point?
[16:40] jamespage: still around?
[16:44] zul, yes
[16:44] jamespage: one more for you today http://people.canonical.com/~chucks/ca/
[16:45] zul, does that one build against the havana-staging PPA?
[16:45] jamespage: just finished building
[16:45] zul, +1 then
[16:45] jamespage: thanks
[16:47] jsonperl, if that is the case, a couple of issues could be the case
[16:47] open file handles?
[16:48] or just exhaustion of resources
[16:48] maybe look here, it seems to have an ok description of the sysctl's involved
[16:48] http://www.ufirsttech.com/content/linux-kernel-settings-related-tcp-connections-68
[16:48] Awesome thanks
[16:49] normally there are several sysctls that need to be adjusted for any kind of high performance server
[16:49] especially when handling lots of connections
[16:49] In this case it's actually a library i use to hit amazon s3
[16:49] don't think any of this would cause that single cpu usage issue though
[16:49] which is the least often used connection i got
[16:49] I think all of what we were seeing is a RESULT of connectivity issues
[16:50] no players = no processing
[16:50] oh, that page uses proc, I normally do it via sysctl instead
[16:50] I think the ethtool command to change stuff maybe reset the stuck connections?
[16:50] jsonperl, still :)
[16:50] making it look fixed
[16:50] setup a ping
[16:50] see if you start missing, or get delayed pings
[16:51] if you're running tcpdump on the server at the time too, watching just for icmp
[16:51] ok, we use pingdom… that sufficient you think?
[16:51] you should be able to easily tell
[16:51] I actually try tcp to the server every minute
[16:51] isn't that like once a minute?
[16:51] yea
[16:51] You're thinking more often?
[16:51] ya, I would go second, and watch delays
[16:51] you want to know how long it takes, you know it gets there, and responds
[16:52] you want to know if it gets lost, or delayed
[16:52] well, tcp would get lost and retried
[16:52] but ping would just get lost
[16:52] Any service you can recommend?
or you just do it from another box
[16:52] I normally just do it from my home box
[16:52] gotcha
[16:52] or a work computer
[16:52] not like ping uses much traffic
[16:53] Doesn't feel very enterprisey :D
[16:53] now if you want to take it a step more, use mtr :)
[16:53] so you can see where the issue actually happens, if it's network related
[16:53] It's not
[16:53] these are my boxes
[16:54] I wish it were somebody else's fault!
[16:54] no, if you think the issue was you aren't receiving the players' traffic
[16:54] that would be network issue :)
[16:54] ping would easily show that
[16:54] But I see the same issue cross machines, cross facilities
[16:54] different parts of the US
[16:54] same issue
[16:54] not likely then
[16:55] I really don't know where to go
[16:55] unless I actually get on it and dig around and maybe setup my own stuff to monitor it
[16:55] I feel like i need to get rid of those orphaned connections
[16:55] but not even sure how good I could do that
[16:55] Want a consulting job? :D
[16:55] I have enough of those :)
[16:55] haha
[16:56] But we're a super entertaining indie game company
[16:56] like on the tv :D
[16:56] So real quickly...
[16:57] Do you believe it's possible that piling up of CLOSE_WAIT connections eventually can lead to connectivity issues in the tcp stack?
[16:57] or am I going up the wrong road here
[16:57] it can, I doubt you're anywhere near that though
[16:57] I doubt you're even >5% of the limit
[16:58] Does the OS limit per process?
[16:58] check ulimit for that
[16:58] k
[16:58] remember, tcp connections are file handles, and count with open files
[16:59] So what seems like a clue to me is
[16:59] Turning everything off with ethtool fixed "the glitch"
[16:59] Temporarily
[17:00] No question… went from "very borked" to normal the moment I changed the settings
[17:00] what kernel you running on these?
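Since TCP connections count against a process's open-file limit, as noted above, here is a rough sketch of checking how close a process is to that limit. It reads the same kind of limit `ulimit -n` reports, via the standard `resource` module, and counts descriptors through `/proc/<pid>/fd`, which is Linux-specific. The function names are made up for illustration.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count open descriptors via /proc (Linux only); None if unreadable."""
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


def usage_ratio(open_fds, soft_limit):
    """Fraction of the soft limit in use, or None when it cannot be computed."""
    if open_fds is None or soft_limit in (resource.RLIM_INFINITY, 0):
        return None
    return open_fds / soft_limit


if __name__ == "__main__":
    soft, hard = fd_limits()
    used = open_fd_count()
    print(f"open fds: {used}, soft limit: {soft}, hard limit: {hard}")
    print(f"fraction of soft limit in use: {usage_ratio(used, soft)}")
```

Passing a game server's PID to `open_fd_count` (with enough privileges to read its `/proc` entry) would show whether piled-up CLOSE_WAIT sockets are actually getting near the limit.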
[17:01] 3.2.0-38-generic-pae #61-Ubuntu SMP Tue Feb 19 12:39:51 UTC 2013 i686 i686 i386 GNU/Linux
[17:01] hmm, 32bit
[17:01] why not 64?
[17:02] actually wait… that box is an oddball
[17:02] the rest are 64
[17:02] :)
[17:02] using any dkms modules?
[17:02] I doubt you are
[17:02] 32 was to save memory
[17:02] these are the rest 3.2.0-49-generic #75-Ubuntu SMP Tue Jun 18 17:39:32 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[17:03] dkms? I dunno what that is
[17:03] addon modules for the kernels
[17:03] ah… can I list em?
[17:03] it's pretty stock 12.04
[17:03] like, vmware drivers, xtables, ....
[17:03] nvidia
[17:03] ah, i doubt
[17:03] no video
[17:03] normally should show up in dpkg -l | grep dkms
[17:03] no virtualization
[17:03] nothin
[17:04] since I still believe it's a kernel issue
[17:04] might be worth giving a 3.8 kernel test on it
[17:04] though, on all my servers I haven't hit this issue, but then, I likely wouldn't have noticed either
[17:06] I'm using 3.8 on my firewall machines for the newer firewall stuff in it
[17:07] to install it, apt-get install linux-generic-lts-raring linux-tools-lts-raring
[17:07] then reboot
[17:07] you can always uninstall it too
[17:07] Was just looking into that?
[17:08] How do you downgrade?
[17:08] it's just a new grub kernel option
[17:08] just select a different one
[17:08] then once it's booted apt-get remove those two
[17:08] I had all kinds of dkms issues with it
[17:08] cause I needed both vmware and xtables dkms modules
[17:09] Makes sense… That's why i like to stay 2 steps behind bleeeeeding edge
[17:09] I really wanted the bufferbloat stuff in 3.8 though :)
[17:09] for the firewall, and firewall needs xtables :)
[17:09] all my other machines are normal 64bit 12.04 though
[17:10] but I wonder if the issue you're having got fixed in the kernel already
[17:10] and there is a LOT of changelogs to read to find out easily
[17:10] without just testing it
[17:11] Or testing that it doesn't happen to explode
[17:11] over a period of days :)
[17:12] I guess we could always setup an ice, and test it there :)
[17:12] ice?
[17:12] http://en.wikipedia.org/wiki/In-circuit_emulator
[17:13] when you go there, it's not pretty
[17:14] I guess these days people would just use a vm
[17:15] but oldschool it was using an ice
=== Ursinha is now known as Ursinha-afk
[17:17] gotcha… yep that's before my time!
[17:17] mtr is cool
[17:17] cept allll my packets are lost on the way to my server
[17:18] Must be clipping all but the first
[17:21] whoops
=== jsonperl1 is now known as jsonperl
[17:33] btw I would be HAPPY to give you access to the box :)
[17:43] jsonperl: Sounds like a dreadful idea from a security point-of-view.
[17:44] haha
[17:44] Truth
[18:42] Hey guys. I want to install Redis on a 12.04 Azure Extra Small VM. It has only 768MB of RAM available. How can I find the RAM usage and what steps should I follow to minimize memory usage, so Redis can have the lion's share?
[18:43] rizzuh: measuring memory use is a bit complicated; 'free' will give you a very quick overview of free memory on the system, the -/+ buffers/cache line is probably most important summary of the summary..
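To make the `-/+ buffers/cache` figure from the last message concrete, here is a small sketch that derives it from `/proc/meminfo` style text: buffers and page cache are counted as free because the kernel can drop them under pressure. The sample numbers are invented for a 768MB box and the function names are hypothetical.

```python
# Invented /proc/meminfo excerpt for a machine with 768MB of RAM.
SAMPLE_MEMINFO = """\
MemTotal:         786432 kB
MemFree:          120000 kB
Buffers:           30000 kB
Cached:           250000 kB
SwapTotal:             0 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def buffers_cache_view(meminfo):
    """Return (used, free) in kB the way the '-/+ buffers/cache' line counts them."""
    reclaimable = meminfo["Buffers"] + meminfo["Cached"]
    free = meminfo["MemFree"] + reclaimable
    used = meminfo["MemTotal"] - free
    return used, free


if __name__ == "__main__":
    used, free = buffers_cache_view(parse_meminfo(SAMPLE_MEMINFO))
    print(f"used by programs: {used} kB, free incl. buffers/cache: {free} kB")
```

On a live system the same functions can be fed the contents of `/proc/meminfo` instead of the sample string.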
[18:43] rizzuh: ps auxw or top (sorted with M), look for the highest RSS numbers, that's what's actually resident in RAM for those programs..
[18:44] rizzuh: but sometimes shared libraries take a pile, the 'smem' tool can help you find out which processes have which shared libraries loaded, and apportions to each of them a certain amount of the fault for the memory used by those shared libraries
[18:47] sarnold, well ATM top shows 554478k free - if that isn't woefully inaccurate it's pretty good
[18:48] rizzuh: well, "free" is a funny thing. the kernel keeps some memory around, free, to handle spikes of allocations. but it tries to minimize the amount of free memory because free memory is wasted memory. :)
[18:48] sarnold, ahh, sure, free as in not reserved by an app. If it's full of cache that ain't an issue.
[18:48] rizzuh: that's where the -/+ buffers/cache line comes in -- that includes memory that is currently being used for storing in ram copies of files but _could_ be thrown away under pressure
[18:49] rizzuh: *nod* *nod*
=== wxl_ is now known as wxl
[18:55] sarnold, that said, 500 MB RAM to use is good, but damn this thing is slow. Good that Redis doesn't need much processing power. It's taking a while to update a few apt packages.
[18:56] rizzuh: at least the amazon micro instances are very heavily penalized in much the same way.. not bad for slight spikes in a mostly-idle environment, but installing a few hundred packages is -painful-
[18:57] yea those micros
[18:57] i'm fairly sure they arbitrarily throttle you...
[18:57] sarnold, yeah these are pretty much the same as AWS micro. 5 Mbit network as well, not great.
[18:57] if the azure storage can be moved among instances, it might even make sense to turn it off, attach to a good instance, upgrade, and move back to cheap again.. heh.
=== jsonperl1 is now known as jsonperl
[18:57] rizzuh: 5MBit? wow!
[18:58] The next one is small at $50 a month, with 1.5GB RAM and a dedicated core.
Oh and 100 Mbit network or something like that.
[18:59] But then through BizSpark we pay 33% less. "Pay", as we have $150 credit / dev, with production usage rights, so it's pretty good for the money :P
=== Ursinha-afk is now known as Ursinha
[19:50] Patrickdk, so running simulators at a box… I'm able to REALLLLLY pile up on LAST_ACK state connections
[19:50] Over about 20 minutes, I'm able to get to a count of 450 or so
[19:51] nice
[19:51] Seems odd right?
[19:52] something isn't closing the connection correctly
[19:52] might just be normal for ios, no idea though
[20:02] Our server was trying to "close a connection after writing remaining data"
[20:02] I changed it to just close the connection, seems to fix that at least
[20:34] sarnold: I've dumped some dmesg output from blocked processes, but still unclear how to read it
[20:35] jdstrand: would adding AUDIT_WRITE to libvirtd apparmor policy be acceptable?
[20:41] hallyn: usr.sbin.libvirtd?
[20:46] yes
[20:55] hallyn: that's fine, libvirtd is not really confined anyway (the VMs it launches are)
[20:55] hallyn: let me point you at a bug though
[20:55] hallyn: actually, nm, you should be ok
[20:59] jdstrand: ok, thanks. (i consider this ultra-low priority)
[20:59] zul: ^ if you happen to be merging libvirt soon-ish, we should toss that in i guess (there is an open bug requesting it)
[21:05] netstat -s output… does anything here look overly concerning? http://pastebin.com/bnzEFRPh
[21:05] hi hallyn
[21:05] hallyn: thanks for the comprehensive email
[21:05] it has me thinking...
[21:06] hallyn: also, lxc-device isn't available in the precise lxc that we are limited to
[21:08] jsonperl: 10878 invalid SYN cookies received
[21:08] jsonperl: that seems steep.
[21:08] take the system down steep
[21:08] ?
[21:08] maybe it's normal on the internet now, but .. it'd be worth asking your host if you're under attack..
[21:08] jsonperl: what's this machine -do-?
[21:09] serves a game via a persistent tcp connection to a bunch of users
[21:09] at this time only about 50-100 concurrent on that machine
[21:09] distributed amongst 14 servers on that machine
[21:10] thumper: are you actually limited to the stock precise lxc, or could you use lxc from the ubuntu-lxc ppa for precise? AFAIUI you're using ppas anyway.... but in any case lxc-device is just a nicety, you do NOT need it :)
[21:11] hallyn: possibly not necessarily limited to stock lxc
[21:11] but I've not considered extra ppas
[21:11] managed to not really need it at this stage
[21:12] hallyn: this would be on every machine, and I don't think we install ppas on every machine
[21:12] thumper: well lxc-device itself isn't enough of a reason to switch to ppa i don't think
[21:12] * thumper nods
[21:12] I need to find someone who knows maas
[21:12] to work out how to do the "gimmie a nic" thing
[21:12] thumper: is it acceptable to simply start up the container after getting the nic from ?
[21:12] yes, I think we can do that
[21:13] cool, that'll be easiest
[21:13] as long as the getting a nic doesn't take too long
[21:13] < 10s would be ok I think
[21:13] longer than that and we might need to work out something else
[21:13] by something else
[21:13] just a better work flow
[21:13] sarnold: Any ideas for further investigation into the invalid syn cookies?
[21:14] hallyn: I wish I knew about the "no network conf" bit to use the host
[21:14] that would have been a good enough setting by default I think
[21:14] an attack certainly could explain the very random connectivity issues we've seen
[21:14] I need to consider the implications for the local provider
[21:14] thumper: i don't follow. you mean lxc.network.empty ?
[21:14] no, the number 2
[21:15] no network entry
[21:15] jsonperl: syn packets tie up kernel memory; syn cookies are one way to try to avoid the worst of the kernel memory use.
for some good background information, see http://lwn.net/Articles/277146/
[21:15] jsonperl: /etc/sysctl.conf has a configuration you can set to turn on syn cookies
[21:15] also I need to work out how to have a nice api to our internal providers, and how to handle that config with the containers
[21:16] ok, thanks for the read
[21:16] the brain is busy handling this with a background process :)
[21:16] I think I almost have it :)
[21:16] sarnold: if netstat is reporting invalid syn cookies, doesn't that mean they're on?
[21:17] jsonperl: maybe? :)
[21:26] sarnold is that the only thing of concern that popped out at ya?
[21:28] jsonperl: the high connection counts made me wonder, but the use makes sense, hehe
[21:28] Kids jumping in and out of the game
[21:28] sorry nothing just stands out to me ;(
[21:29] worlds exist on one server on one machine, and they can "teleport" between them
[21:29] haha ok :)
[21:31] sarnold: good reading on syncookies thanks
=== Ursinha is now known as Ursinha-afk
[21:50] hallyn: still around?
[21:50] thumper: yup
[21:51] hallyn: thinking about number four, where we create a veth pair
[21:51] hallyn: if the container hasn't been started, there is no network namespace right?
[21:51] or is there?
[21:52] nope.
[21:52] also, this "sudo lxc-unshare -s NETWORK -- /bin/bash" seems like it does something interesting I don't quite grok
[21:52] thumper: that's just doing the same thing as creating a container.
[21:52] it starts a task inside a new, private network ns
[21:52] as for veth - if MAAS/openstack/ec2 will hand you a nic, then ignore veths
[21:53] lxc.network.type = veth will always create a new veth pair and attach the one end to lxc.network.link.
[21:53] well openstack won't
[21:53] ah, I was going to ask what the link bit was
[21:54] hallyn: can I run my idea past you?
[21:54] so if you *were* going to use veth, which my feeling is you won't, then you would bridge whatever you get from openstack to br0, then say lxc.network.type = veth lxc.network.link=br0
[21:54] sure
[21:54] hallyn: although #juju-dev might be better