jsonperl_ | cwill_at_work... does a system with high run q and low load indicate that it's I/O bound to you? | 00:00 |
---|---|---|
jsonperl_ | cwillu_at_work, sorry | 00:01 |
Patrickdk | iowait% is i/o bound | 00:01 |
Patrickdk | you have 0% iowait | 00:01 |
jsonperl_ | That post was of a healthy system... I was just trying to figure out how to interpret | 00:02 |
jsonperl_ | I logged some netstat numbers during the issue... I see some high send and recv Q, this may be somethin here | 00:04 |
Patrickdk | that would likely be a nic driver issue, I would think | 00:06 |
Patrickdk | do all your machines have the same driver/nic? | 00:06 |
Patrickdk | been a long time since I attempted to diagnose or work on something at that level | 00:07 |
jsonperl_ | Yep every machine is identical | 00:08 |
jsonperl_ | Most are in LAST_ACK | 00:08 |
Patrickdk | e1000e? | 00:08 |
jsonperl_ | but some ESTABLISHED | 00:09 |
jsonperl_ | like almost 200k in some queues | 00:09 |
Patrickdk | it might just be the result of the issue, but it might be a cause | 00:09 |
jsonperl_ | the card? | 00:09 |
jsonperl_ | it's a broadcom, hang on | 00:09 |
Patrickdk | not sure what to tell you to figure it out | 00:09 |
jsonperl_ | Broadcom 5720 | 00:10 |
jsonperl_ | That queue size seems pretty unhealthy right? | 00:10 |
Patrickdk | what do you get for ethtool -k eth0 | 00:11 |
jsonperl_ | I don't have a currently borked system | 00:11 |
Patrickdk | doesn't matter | 00:11 |
jsonperl_ | http://pastebin.com/ARwR2W6K | 00:12 |
Patrickdk | but at least till someone has an idea how to diagnose this some more, I can at least throw you some things to see if they have any effect | 00:12 |
Patrickdk | if they do, it's likely the cause, if not, just an effect | 00:12 |
jsonperl_ | Yea totally. You've been reallllly helpful | 00:13 |
jsonperl_ | This stuff is all fairly new to me | 00:13 |
Patrickdk | give a try: ethtool -K rx off tx off sg off tso off gso off rso off rxvlan off txvlan off eth0 | 00:13 |
Patrickdk | maybe again for eth1 if you use it | 00:13 |
Patrickdk | oops | 00:13 |
Patrickdk | ethtool -K eth0 rx off tx off sg off tso off gso off gro off rxvlan off txvlan off | 00:14 |
Patrickdk | I am not sure about the broadcoms, but I know the intel driver has gone back and forth on it working and not working | 00:14 |
Patrickdk | my older intel ones, I had to disable a few of those, to make it work correctly | 00:14 |
Patrickdk | this will cause higher cpu usage | 00:15 |
Patrickdk | I doubt it will be enough for you to notice though | 00:15 |
jsonperl_ | so basically turning everything off | 00:15 |
Patrickdk | yep | 00:15 |
jsonperl_ | any potential for badness here? | 00:15 |
jsonperl_ | besides cpu usage | 00:15 |
Patrickdk | no | 00:15 |
Patrickdk | the checksums just lower cpu usage | 00:15 |
jsonperl_ | But in your experience they gunk up the works sometimes? | 00:16 |
Patrickdk | the rest mainly cause the nic and linux to move around 64k of data at a time, instead of one packet at a time | 00:16 |
Patrickdk | gro sometimes, rxvlan on one of mine here at home | 00:16 |
Patrickdk | tso I think I had an issue with on some too | 00:16 |
Patrickdk | this system I am using now needs: ethtool -K eth0 rxvlan off tx off | 00:17 |
Patrickdk | that leaves only gso turned on | 00:17 |
Patrickdk | I forget about the tx on it, but it doesn't support rxvlan, though the driver thinks it does | 00:18 |
jsonperl_ | It's a gigabit card, its speed is set at 100Mb... likely the network it's on | 00:18 |
Patrickdk | oh, at 100mbit you will never see the increased cpu usage :) | 00:19 |
jsonperl_ | I'm wondering if I need to upgrade the network... I've never seen us go beyond maybe 15Mb though | 00:19 |
jsonperl_ | Seems like a lot of data in queue in LAST_ACK state indicates a problem with our code no? | 00:27 |
jsonperl_ | Basically the connection has been severed on their end, but we haven't gotten rid of it | 00:27 |
jsonperl_ | TCP: 458 (estab 50, closed 96, orphaned 63, synrecv 0, timewait 1/0), ports 0 | 00:28 |
jsonperl_ | Lotta orphans | 00:28 |
Patrickdk | dunno what a LAST_ACK is | 00:29 |
jsonperl_ | Right before the tcp connection closes | 00:30 |
Patrickdk | oh, that is actually a state | 00:30 |
Patrickdk | I never see those | 00:30 |
jsonperl_ | I've got a bunch... perhaps that's an issue | 00:30 |
Patrickdk | na, normally that for me is TIME_WAIT, where the connection was closed, but not properly | 00:30 |
sarnold | "The remote end has shut down, and the socket is closed. Waiting for acknowledgement." | 00:31 |
Patrickdk | ya, sounds like you're sending it data, but it's not responding | 00:31 |
Patrickdk | oh | 00:31 |
sarnold | is there a funny firewall in the way preventing those packets? | 00:31 |
jsonperl_ | just iptables | 00:32 |
Patrickdk | hmm, odd though, never seen them, just the FIN_WAIT TIME_WAIT mainly | 00:32 |
jsonperl_ | Here so you have some idea what I'm looking at: http://pastebin.com/9RNzEbb9 | 00:34 |
jsonperl_ | ips jiggled to protect the innocent :) | 00:34 |
jsonperl_ | I wonder if we're just shipping data to an "almost closed" socket, and filling up the tcp queue | 00:36 |
sarnold | jsonperl_: the 'slabtop' utility ought to be able to show you if TCP is eating too much of your memory | 00:42 |
jsonperl_ | I'll check it out | 00:43 |
jsonperl_ | though memory utilization is quite good now | 00:43 |
jsonperl_ | (with a little help from my buddy PatrickDK) | 00:43 |
jsonperl_ | gotta head home... thanks folks, back later | 00:49 |
sarnold | have fun :) | 00:49 |
jsonperl | back for more! | 01:11 |
jsonperl | PatrickDK | 01:31 |
jsonperl | I just had a system flake out… I hit those networking settings live, and it seems to have fixed it? | 01:32 |
jsonperl | (super super anecdotally) | 01:32 |
Patrickdk | dunno :) | 01:32 |
jsonperl | so your theory there is that there is a driver issue with the card? | 01:32 |
Patrickdk | personally, I would put those on like 3 or so machines, and see | 01:32 |
Patrickdk | well, driver or firmware | 01:33 |
jsonperl | yea the whole "it didn't explode" thing is a really frustrating way to prove stuff :) | 01:33 |
Patrickdk | more likely driver, but firmware could affect the driver's actions | 01:33 |
jsonperl | so by turning all of that off, we reduce the load on the card essentially? | 01:33 |
jsonperl | and let the os take care of stuff | 01:33 |
Patrickdk | well, it puts the card into normal dumb mode basically | 01:33 |
Patrickdk | instead of attempting to limit interrupts, and queue up requests and stuff | 01:34 |
Patrickdk | and offloading some of the work | 01:34 |
Patrickdk | it might be there is some kind of buffer overrun happening on the nic, causing the issue | 01:34 |
Patrickdk | but I'm totally random guessing | 01:34 |
jsonperl | me2 | 01:35 |
jsonperl | :D | 01:35 |
Patrickdk | but now since that is off, nothing is really getting buffered | 01:35 |
jsonperl | oh man, if this fixes the problem | 01:35 |
Patrickdk | I have had issues with broadcom drivers before, but not on linux | 01:35 |
Patrickdk | but then, I really have not used broadcom on linux so :) | 01:35 |
jsonperl | I use what they rent me :) | 01:35 |
jsonperl | (peer1 / serverbeach) | 01:35 |
sarnold | jsonperl: was that an ethtool command that seems to be fixing it? | 01:37 |
jsonperl | yep | 01:37 |
jsonperl | ethtool -K eth0 rx off tx off sg off tso off gso off rxvlan off txvlan off | 01:38 |
jsonperl | 'seems' being the operative word | 01:38 |
Patrickdk | if you're really interested, start knocking one off at a time, till it acts up again :) | 01:40 |
jsonperl | hahahaha | 01:41 |
jsonperl | oh man, the fact that that's a reasonable thing to do kind of makes me ill :) | 01:41 |
sarnold | :) | 01:42 |
* Patrickdk bets on the tso or gso | 01:43 | |
jsonperl | im gonna turn everything off on all machines | 01:44 |
Patrickdk | could be tx, but normally not | 01:44 |
jsonperl | then i'll pull those on one of them | 01:44 |
jsonperl | so do tso, gso, and tx in that order huh :) | 01:44 |
Patrickdk | or, pull a different one per machine? :) | 01:44 |
sarnold | yeah, I'm also suspicious of tso and gso | 01:44 |
jsonperl | ahahha | 01:44 |
sarnold | and it feels like 'sg' would be nice to have back | 01:44 |
Patrickdk | I have no idea what sg is, never bothered by it before :) | 01:44 |
sarnold | (at least I assume it means Scatter/Gather) | 01:45 |
Patrickdk | it does | 01:45 |
jsonperl | oh man, im excited | 01:46 |
* Patrickdk locates a bed | 01:46 | |
jsonperl | I MAY BE ABLE TO SLEEP | 01:46 |
sarnold | g'night :) | 01:46 |
jsonperl | cya Patrick, thanks again | 01:46 |
jsonperl | all right, all machines updated | 01:52 |
jsonperl | now I wait :) | 01:52 |
jsonperl | sarnold/Patrickdk, it makes sense those settings kick in live right? | 01:52 |
jsonperl | no networking restart or anything | 01:52 |
sarnold | jsonperl: right | 01:53 |
jsonperl | good… because if it didn't that would disprove that it fixed it ;) | 01:53 |
=== jtv2 is now known as jtv | ||
=== smb` is now known as smb | ||
jamespage | yolanda, https://code.launchpad.net/~james-page/glance/sqlalchemy-bump/+merge/176613 if you are around :-) | 07:27 |
jamespage | zul, ^^ | 07:27 |
yolanda | morning | 07:27 |
jamespage | I'm gonna review all packages today | 07:27 |
yolanda | great | 07:27 |
jamespage | yolanda, morning! | 07:28 |
yolanda | jamespage, bad news, since this branch is on ubuntu-server-dev, i don't have permissions | 07:28 |
jamespage | yolanda, just need a review | 07:30 |
jamespage | not a merge | 07:30 |
jamespage | I'll do that myself | 07:30 |
yolanda | jamespage, assign me as a reviewer | 07:30 |
yolanda | otherwise i can't | 07:30 |
yolanda | i don't have the permissions to "Request review" | 07:31 |
jamespage | yolanda, dog | 07:31 |
jamespage | doh rather | 07:31 |
jamespage | yolanda, done | 07:31 |
yolanda | ok, reviewed, i cannot change the main status anyway | 07:33 |
jamespage | ack | 07:34 |
jamespage | thanks | 07:34 |
=== Ursinha-afk is now known as Ursinha | ||
=== racedo` is now known as racedo | ||
=== tim___ is now known as vorpalbunny | ||
=== LordOfTime is now known as LordOfTime|EC2 | ||
=== tedski- is now known as tedski | ||
=== vorpalbunny is now known as thumper | ||
=== Tribaal_ is now known as Tribaal | ||
=== thumper is now known as thumper-afk | ||
=== thumper-afk is now known as thumper | ||
=== psivaa_ is now known as psivaa | ||
ThothCastel | how to check if ssh server is running? | 10:25 |
mardraum | ThothCastel: service ssh status | 10:26 |
greppy | ps -ef | grep ssh | 10:26 |
ThothCastel | greppy: mardraum: thanks, it's running, however I am unable to connect to it via ssh :S | 10:30 |
mardraum | what exactly happens? use pastebin if you must | 10:31 |
jamespage | zul, when you start, review needed please - https://code.launchpad.net/~james-page/neutron/fixup-h2/+merge/176650 | 10:36 |
cwillu_at_work | greppy, "sshd" | 10:43 |
jamespage | zul, you might wanna take a look at the python-greenlet upload you did yesterday | 11:05 |
jamespage | it blasted all of the python3 work that you did in the previous two ubuntu versions | 11:05 |
jamespage | (which is why it's blocked in proposed right now) | 11:06 |
zul | jamespage: fuuuuuu | 11:35 |
ikonia | zul: ? | 11:38 |
streulma | hello, I can upgrade my kernel on Ubuntu Server 12.04, but when I reboot, the server doesn't boot and hangs, it's KVM virtualisation | 12:14 |
jamespage | zul, hey - I also uploaded trivial fixes for keystone and glance autopkgtest failures | 12:18 |
jamespage | I'm stuffing them into havana staging as well | 12:19 |
zul | jamespage: ack | 12:19 |
jamespage | streulma, anything on the console? | 12:21 |
streulma | there is at the moment a problem with the console, the isp upgraded to a new version of OnApp | 12:21 |
streulma | but before I had the problem | 12:22 |
streulma | it boots the kernel | 12:22 |
streulma | and then hangs after keyboard... | 12:22 |
streulma | before the services load | 12:22 |
=== smb` is now known as smb | ||
zul | jamespage: http://people.canonical.com/~chucks/ca/ | 13:40 |
jamespage | zul, ceilometer? | 13:42 |
zul | jamespage: yep | 13:43 |
jamespage | zul, why does simplejson need " - Build for python 3.2 as well." | 13:56 |
jamespage | I know precise has python 3.2 | 13:56 |
jamespage | but can't a generic fix be applied in saucy which makes it a no-change backport again? | 13:57 |
zul | jamespage: because it explicitly depends on python 3.3 | 13:58 |
jamespage | zul, +1 for msgpack-python | 13:58 |
zul | jamespage: python3-all-dev (>= 3.3.0-3) in the debian/control | 13:59 |
jamespage | zul, ack | 14:01 |
jamespage | reviewing now | 14:01 |
zul | jamespage: ill fix the saucy version | 14:01 |
jamespage | zul, does it work with python3.2 | 14:01 |
zul | jamespage: yeah | 14:01 |
jamespage | just wondering if that's why the min-versions are specced | 14:01 |
zul | nothing in the changelog | 14:02 |
jamespage | zul, nope | 14:03 |
jamespage | and it looks OK - maybe poke piotr in #debian-python on OFTC and see if there are any gotchas | 14:04 |
zul | jamespage: nope im not uploading it, i just noticed a bug | 14:04 |
jamespage | zul, do we really need the new webtest? | 14:04 |
jamespage | is 1.3.3 -> 1.3.4 | 14:04 |
jamespage | its rather | 14:04 |
zul | jamespage: im not sure, nack it please | 14:04 |
zul | jamespage: chuck@homer:~/pbuilder/precise_result$ dpkg -c python3-simplejson_3.3.0-2ubuntu1~cloud0_amd64.deb | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./ | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/ | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/ | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/doc/ | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/doc/python3-simplejson/ | 14:05 |
zul | -rw-r--r-- root/root 3160 2013-07-24 09:06 ./usr/share/doc/python3-simplejson/changelog.Debian.gz | 14:05 |
jamespage | zul, -1 | 14:05 |
zul | -rw-r--r-- root/root 1645 2011-02-15 15:56 ./usr/share/doc/python3-simplejson/copyright | 14:05 |
zul | chuck@homer:~/pbuilder/precise_result$ dpkg -c python-simplejson_3.3.0-2ubuntu1~cloud0_amd64.deb | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./ | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/ | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/ | 14:05 |
jamespage | \o/ | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/doc/ | 14:05 |
zul | drwxr-xr-x root/root 0 2013-07-24 09:14 ./usr/share/doc/python-simplejson/ | 14:05 |
zul | -rw-r--r-- root/root 7062 2013-05-01 16:01 ./usr/share/doc/python-simplejson/index.rst.gz | 14:05 |
zul | -rw-r--r-- root/root 3160 2013-07-24 09:06 ./usr/share/doc/python-simplejson/changelog.Debian.gz | 14:05 |
Pici | nice | 14:05 |
zul | -rw-r--r-- root/root 1645 2011-02-15 15:56 ./usr/share/doc/python-simplejson/copyright | 14:05 |
zul | shit! | 14:05 |
Pici | its zul, so I'll let it slide... this time ;) | 14:05 |
* jamespage drowns in irc | 14:06 | |
zul | jamespage: tests are not enabled in that package either | 14:06 |
=== mahmoh1 is now known as mahmoh | ||
dranix | hi everyone | 14:19 |
dranix | i need some help with ldap integration with packetfence | 14:19 |
dranix | does anyone have any idea how to go about doing this? | 14:19 |
Monotoko | if I connect to an openvpn server in the office... should it not tunnel all my internet connection through it? | 14:40 |
Monotoko | I have the same IP as before... | 14:40 |
rbasak | Depends on how you have it configured. | 14:42 |
Monotoko | rbasak, it was configured by my predecessor - where can I check? | 14:43 |
rbasak | I don't recall, sorry. Check the docs for mentions of your default gateway. I think it's a client-side setting, but you can also configure the client to accept the server's settings and then configure it on the server (IIRC). | 14:44 |
rbasak | Or may default route, rather than default gateway. | 14:45 |
rbasak | maybe | 14:45 |
Monotoko | lots of mention of bridging... | 14:46 |
raub | Trivial question: how do I upgrade a kernel module that is in use? By in use I mean it is the module for the raid controller, but I am booting using a live CD | 14:47 |
oozbooz | Monotoko: check your routing, is default route via VPN or your ISP? | 14:47 |
oozbooz | command "ip r sh" | 14:48 |
oozbooz | usually, server pushed routes to the client, but client can overwrite it or do some other tricks w/out getting server involved | 14:49 |
oozbooz | pushed=pushes | 14:49 |
Monotoko | oozbooz, http://pastebin.com/65vcRqk7 | 14:50 |
Monotoko | I tried to remove the comment in the config here: ;push "redirect-gateway def1 bypass-dhcp" | 14:50 |
Monotoko | however then the client wouldn't load anything | 14:51 |
oozbooz | I assume 5.10.152.225 is your ISP GW | 14:51 |
oozbooz | then your internet traffic should go over it | 14:51 |
Monotoko | yeah, we have a /29 I believe | 14:51 |
Monotoko | when I'm connected from outside the office | 14:51 |
Monotoko | I want it to still use the office IP | 14:51 |
oozbooz | use office IP for ... ? | 14:52 |
oozbooz | you mean send your ALL traffic via the tunnel? | 14:52 |
Monotoko | yeah - it's static - a lot of people who work here work from homes etc, with dynamic IPs | 14:52 |
Monotoko | I'd rather they all used our network to make it easier to firewall the servers and not keep punching random holes in the FW | 14:53 |
oozbooz | I don't get your last statement ... | 14:55 |
oozbooz | usually, you want to send only relevant traffic to your office via the tunnel, | 14:55 |
oozbooz | rest of the stuff, they should use their ISP | 14:55 |
oozbooz | why would you want them to download youtube videos using office bandwidth | 14:56 |
rbasak | I'd say it depends. Road warriors might prefer everything to go via the office if they don't trust the connections they're using (coffee shops, hotels, etc) | 14:56 |
Monotoko | oozbooz, we have a "cloud" provider off site that I need to give developers access to, and certain things that they can log into through the web browser but only from this IP | 14:56 |
oozbooz | aha | 14:57 |
oozbooz | 3rd party mess.. | 14:57 |
Monotoko | aye - obviously I need a static IP I can trust for that, so I'd rather tunnel everyone through our office network | 14:58 |
oozbooz | well... you can create a new route so that only traffic for the cloud provider goes via the tunnel | 14:58 |
Monotoko | hmm, what route would I be adding for that? route add 1.2.3.4 gw 5.10.152.227 eth0 ? | 15:00 |
oozbooz | but, if you decide to divert all traffic, you will have to change routing rules on the server, that will be pushed to the client | 15:01 |
oozbooz | which VPN server do you use | 15:01 |
Monotoko | openvpn | 15:01 |
zul | jamespage: simplejson fixed locally ill upload to the regular archive and get it for the cloud archive as well | 15:01 |
oozbooz | openvpn or openvpn-AS? | 15:02 |
Monotoko | regular openvpn AFAIK | 15:02 |
Monotoko | yeah | 15:02 |
Monotoko | just checked with dpkg | 15:03 |
oozbooz | ok, first, my advice is to upgrade to openvpn-AS - much easier to manage | 15:03 |
oozbooz | there is an IRC channel "openvpn", you should confirm with them... but it should not be difficult | 15:04 |
Monotoko | cheers oozbooz | 15:05 |
oozbooz | have fun | 15:06 |
zul | jamespage: http://people.canonical.com/~chucks/ca/ | 15:07 |
=== pleia2_ is now known as pleia2 | ||
zul | smb: ping i was wondering if you could offer some insight on it https://launchpadlibrarian.net/145685953/buildlog_ubuntu-precise-amd64.xen_4.2.2-1ubuntu1~cloud0_FAILEDTOBUILD.txt.gz | 15:14 |
smb | zul, maybe, let me read | 15:15 |
zul | smb: this is on precise | 15:16 |
smb | zul, Looks like the known problem of passing LDFLAGS in gcc format -Wl, but don't we work around that? | 15:17 |
zul | smb: yeah seems to ignore that for some reason | 15:18 |
smb | And why do you compile xen 4.2.2 on Precise? | 15:18 |
smb | :-P | 15:18 |
smb | Still have not cleared that MRE | 15:18 |
smb | Actually I would not aim for 4.2.2 immediately but 4.1.5... or .6 but anyway | 15:19 |
jamespage | zul, +! | 15:20 |
jamespage | +! | 15:20 |
jamespage | +1 rather | 15:20 |
zul | jamespage: cool thanks | 15:20 |
smb | zul, "LDFLAGS = $(shell dpkg-buildflags --get LDFLAGS|sed -e 's/-Wl,//g')" in debian/rules? | 15:21 |
jamespage | zul, https://code.launchpad.net/~james-page/neutron/fixup-rootwrap-conf/+merge/176708 | 15:21 |
soahccc | I'm more the nginx kind of guy so what did I miss here? Installed apache, changed port (so that it won't conflict with nginx), getting this netstat "tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 0 185527477 18552/apache2" but it just reacts to local requests. There is no iptables rule... Any ideas, I'm desperate :( | 15:21 |
smb | zul, Just out of curiosity is that the 4.2.2 version from current Saucy? | 15:26 |
zul | yeah | 15:26 |
smb | zul, Hm, so it has that line... but for some reason I vaguely remember something going wrong with something like this (but I believe that was another package) | 15:27 |
smb | zul, Oh wait maybe because in P LDFLAGS is exported by the build system... | 15:29 |
zul | hmm...interesting ill try it out | 15:29 |
smb | zul, Is that LDFLAGS := instead of LDFLAGS = | 15:30 |
=== JonnyNomad_ is now known as JonnyNomad | ||
smb | zul, Oh I think I can imagine what is going on: we do not set LDFLAGS at all by default in newer releases. So when compiling in S I did not notice that none of them were being used and that setting LDFLAGS in debian/rules was useless | 15:39 |
smb | But in P when they are set by default it fails... | 15:39 |
zul | smb: so disable it? | 15:39 |
smb | zul, I'd probably try either an export in debian/rules or move the definition into debian/rules.real for a moment | 15:40 |
=== Catbuntu is now known as LexieGrey | ||
zul | smb: ok ill try that | 15:40 |
smb | zul, And I need to make sure I really use those flags in the Xen 4.3 I am preparing | 15:41 |
smb | for S that is | 15:41 |
zul | smb: when are you doing 4.3? | 15:43 |
smb | zul, I am just about to think I got all pieces together. Testing it on my boxes | 15:43 |
zul | smb: ok cool | 15:44 |
med_ | zul, a user asked me if there will be any quantum-> neutron renaming in raring or earlier (and similarly, anything before havana) | 15:45 |
med_ | my answer was "NO, but I'll check with zul" | 15:45 |
zul | med: no, quantum in raring was quantum | 15:46 |
med_ | nod. | 15:46 |
* med_ was pretty sure it was only a cease and desist not a "go undo the world" | 15:47 | |
roaksoax | Madkiss: howdy! have you looked into packaging dlm? | 15:59 |
zul | smb: nope neither worked | 16:00 |
smb | zul, Hm, ok need to figure out how to modify it correctly for the actual compile. Seems the more recent releases just don't use any | 16:01 |
smb | I mean it does not get passed in and fails because where we change it somehow does not replace the default of the system | 16:02 |
smb | zul, Doing the export did break the build in the same way on S though... So maybe := is the second missing piece | 16:04 |
smb | zul, having LDFLAGS= and export LDFLAGS both in rules.real seems to make the compile run longer (not finished yet) | 16:13 |
zul | smb: can i see a snippet of your rules.real please? | 16:14 |
Daviey | rbasak: BTW, merges.py won't work right now - until egress firewall is more relaxed. Have raised RT | 16:17 |
rbasak | Daviey: OK, thanks. | 16:20 |
Daviey | roaksoax: Hey, does Openstack / Kombu support Rabbit Active/Active in Havana? | 16:20 |
rbasak | I'll try and keep people.canonical.com/~rbasak/delta.py updated in the mean time, though note that I'm doing it manually. | 16:20 |
roaksoax | Daviey: I haven't checked yet, sorry! I'm doing the whole upgrade process of the clustering tools, which is not as easy as syncing packages from debian | 16:22 |
adam_g | Daviey, the issue wasn't active/active, it's the lack of any type of heartbeating support, so that the rpc layer (quickly) detects failure and migrates to a new server | 16:24 |
jsonperl | Patrick, I still got the issue, but I think I'm getting closer | 16:27 |
jsonperl | Patrickdk that is | 16:28 |
jsonperl | Would a BUNCH of connections in CLOSE_WAIT stop up the tcp pipeline at some point? | 16:28 |
zul | jamespage: still around? | 16:40 |
jamespage | zul, yes | 16:44 |
zul | jamespage: one more for you today http://people.canonical.com/~chucks/ca/ | 16:44 |
jamespage | zul, does that one build against the havana-staging PPA? | 16:45 |
zul | jamespage: just finished building | 16:45 |
jamespage | zul, +1 then | 16:45 |
zul | jamespage: thanks | 16:45 |
patdk-wk | jsonperl, if that is the case, a couple of things could be the cause | 16:47 |
patdk-wk | open file handles? | 16:47 |
patdk-wk | or just exhaustion of resources | 16:48 |
patdk-wk | maybe look here, it seems to have an ok description of the sysctl's involved | 16:48 |
patdk-wk | http://www.ufirsttech.com/content/linux-kernel-settings-related-tcp-connections-68 | 16:48 |
jsonperl | Awesome thanks | 16:48 |
patdk-wk | normally there are several sysctls that need to be adjusted for any kind of high performance server | 16:49 |
patdk-wk | especially when handling lots of connections | 16:49 |
jsonperl | In this case it's actually a library i use to hit amazon s3 | 16:49 |
patdk-wk | don't think any of this would cause that single cpu usage issue though | 16:49 |
jsonperl | which is the least often used connection i got | 16:49 |
jsonperl | I think all of what we were seeing is a RESULT of connectivity issues | 16:49 |
jsonperl | no players = no processing | 16:50 |
patdk-wk | oh, that page uses proc, I normally do it via sysctl instead | 16:50 |
jsonperl | I think the ethtool command to change stuff maybe reset the stuck connections? | 16:50 |
patdk-wk | jsonperl, still :) | 16:50 |
jsonperl | making it look fixed | 16:50 |
patdk-wk | setup a ping | 16:50 |
patdk-wk | see if you start missing, or get delayed pings | 16:50 |
patdk-wk | if you're running tcpdump on the server at the time too, watching just for icmp | 16:51 |
jsonperl | ok, we use pingdom… that sufficient you think? | 16:51 |
patdk-wk | you should be able to easily tell | 16:51 |
jsonperl | I actually try tcp to the server every minute | 16:51 |
patdk-wk | isn't that like once a minute? | 16:51 |
jsonperl | yea | 16:51 |
jsonperl | You're thinking more often? | 16:51 |
patdk-wk | ya, I would go every second, and watch delays | 16:51 |
patdk-wk | you want to know how long it takes; you know it gets there, and responds | 16:51 |
patdk-wk | you want to know if it gets lost, or delayed | 16:52 |
patdk-wk | well, tcp would get lost and retried | 16:52 |
patdk-wk | but ping would just get lost | 16:52 |
jsonperl | Any service you can recommend? or you just do it from another box | 16:52 |
patdk-wk | I normally just do it from my home box | 16:52 |
jsonperl | gotcha | 16:52 |
patdk-wk | or a work computer | 16:52 |
patdk-wk | not like ping uses much traffic | 16:52 |
jsonperl | Doesn't feel very enterprisey :D | 16:53 |
patdk-wk | now if you want to take it a step more, use mtr :) | 16:53 |
patdk-wk | so you can see where the issue actually happens, if it's network related | 16:53 |
jsonperl | It's not | 16:53 |
jsonperl | this is my boxes | 16:53 |
jsonperl | I wish it were somebody else's fault! | 16:54 |
patdk-wk | no, if you think the issue was you aren't receiving the players traffic | 16:54 |
patdk-wk | that would be network issue :) | 16:54 |
patdk-wk | ping would easily show that | 16:54 |
jsonperl | But I see the same issue cross machines, cross facilities | 16:54 |
jsonperl | different parts of the US | 16:54 |
jsonperl | same issue | 16:54 |
patdk-wk | not likely then | 16:54 |
patdk-wk | I really don't know where to go | 16:55 |
patdk-wk | unless I actually get on it and dig around and maybe setup my own stuff to monitor it | 16:55 |
jsonperl | I feel like i need to get rid of those orphaned connections | 16:55 |
patdk-wk | but not even sure how good I could do that | 16:55 |
jsonperl | Want a consulting job? :D | 16:55 |
patdk-wk | I have enough of those :) | 16:55 |
jsonperl | haha | 16:55 |
jsonperl | But we're a super entertaining indie game company | 16:56 |
jsonperl | like on the tv :D | 16:56 |
jsonperl | So real quickly... | 16:56 |
jsonperl | Do you believe it's possible that piling up of CLOSE_WAIT connections eventually can lead to connectivity issues in the tcp stack? | 16:57 |
jsonperl | or am I going up the wrong road here | 16:57 |
patdk-wk | it can, I doubt you're anywhere near that though | 16:57 |
patdk-wk | I doubt you're even >5% of the limit | 16:57 |
jsonperl | Does the OS limit per process? | 16:58 |
patdk-wk | check ulimit for that | 16:58 |
jsonperl | k | 16:58 |
patdk-wk | remember, tcp connections are file handles, and count with open files | 16:58 |
jsonperl | So what seems like a clue to me is | 16:59 |
jsonperl | Turning everything off with ethtool fixed "the glitch" | 16:59 |
jsonperl | Temporarily | 16:59 |
jsonperl | No question… went from "very borked" to normal the moment I changed the settings | 17:00 |
patdk-wk | what kernel you running on these? | 17:00 |
jsonperl | 3.2.0-38-generic-pae #61-Ubuntu SMP Tue Feb 19 12:39:51 UTC 2013 i686 i686 i386 GNU/Linux | 17:01 |
patdk-wk | hmm, 32bit | 17:01 |
patdk-wk | why not 64? | 17:01 |
jsonperl | actually wait… that box is an oddball | 17:02 |
jsonperl | the rest are 64 | 17:02 |
patdk-wk | :) | 17:02 |
patdk-wk | using any dkms modules? | 17:02 |
patdk-wk | I doubt you are | 17:02 |
jsonperl | 32 was to save memory | 17:02 |
jsonperl | these are the rest 3.2.0-49-generic #75-Ubuntu SMP Tue Jun 18 17:39:32 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux | 17:02 |
jsonperl | dkms? I donno what that is | 17:03 |
patdk-wk | addon modules for the kernels | 17:03 |
jsonperl | ah… can I list em? | 17:03 |
jsonperl | it's pretty stock 12.04 | 17:03 |
patdk-wk | like, vmware drivers, xtables, .... | 17:03 |
patdk-wk | nvidia | 17:03 |
jsonperl | ah, i doubt | 17:03 |
jsonperl | no video | 17:03 |
patdk-wk | normally should show up in dpkg -l | grep dkms | 17:03 |
jsonperl | no virtualization | 17:03 |
jsonperl | nothin | 17:03 |
patdk-wk | since I still believe it's a kernel issue | 17:04 |
patdk-wk | might be worth giving a 3.8 kernel test on it | 17:04 |
patdk-wk | though, on all my servers I haven't hit this issue, but then, I likely wouldn't have noticed either | 17:04 |
patdk-wk | I'm using 3.8 on my firewall machines for the newer firewall stuff in it | 17:06 |
patdk-wk | to install it, apt-get install linux-generic-lts-raring linux-tools-lts-raring | 17:07 |
patdk-wk | then reboot | 17:07 |
patdk-wk | you can always uninstall it too | 17:07 |
jsonperl | Was just looking into that? | 17:07 |
jsonperl | How do you downgrade? | 17:08 |
patdk-wk | it's just a new grub kernel option | 17:08 |
patdk-wk | just select a different one | 17:08 |
patdk-wk | then once it's booted apt-get remove those two | 17:08 |
patdk-wk | I had all kinds of dkms issues with it | 17:08 |
patdk-wk | cause I needed both vmware and xtables dkms modules | 17:08 |
jsonperl | Makes sense… That's why i like to stay 2 steps behind bleeeeeding edge | 17:09 |
patdk-wk | I really wanted the bufferbloat stuff in 3.8 though :) | 17:09 |
patdk-wk | for the firewall, and firewall needs xtables :) | 17:09 |
patdk-wk | all my other machines are normal 64bit 12.04 though | 17:09 |
patdk-wk | but I wonder if the issue you're having got fixed in the kernel already | 17:10 |
patdk-wk | and there is a LOT of changelogs to read to find out easily | 17:10 |
patdk-wk | without just testing it | 17:10 |
jsonperl | Or testing that it doesn't happen to explode | 17:11 |
jsonperl | over a period of days :) | 17:11 |
patdk-wk | I guess we could always setup an ice, and test it there :) | 17:12 |
jsonperl | ice? | 17:12 |
patdk-wk | http://en.wikipedia.org/wiki/In-circuit_emulator | 17:12 |
patdk-wk | when you go there, it's not pretty | 17:13 |
patdk-wk | I guess these days people would just use a vm | 17:14 |
patdk-wk | but oldschool it was using an ice | 17:15 |
=== Ursinha is now known as Ursinha-afk | ||
jsonperl | gotcha… yep that's before my time! | 17:17 |
jsonperl | mtr is cool | 17:17 |
jsonperl | cept allll my packets are lost on the way to my server | 17:17 |
jsonperl | Must be clipping all but the first | 17:18 |
jsonperl1 | whoops | 17:21 |
=== jsonperl1 is now known as jsonperl | ||
jsonperl | btw I would be HAPPY to give you access to the box :) | 17:33 |
jpds | jsonperl: Sounds like a dreadful idea from a security point-of-view. | 17:43 |
jsonperl | haha | 17:44 |
jsonperl | Truth | 17:44 |
rizzuh | Hey guys. I want to install Redis on a 12.04 Azure Extra Small VM. It has only 768MB of RAM available. How can I find the RAM usage and what steps should I follow to minimize memory usage, so Redis can have the lion's share? | 18:42 |
sarnold | rizzuh: measuring memory use is a bit complicated; 'free' will give you a very quick overview of free memory on the system, the -/+ buffers/cache line is probably most important summary of the summary.. | 18:43 |
sarnold | rizzuh: ps auxw or top (sorted with M), look for the highest RSS numbers, that's what's actually resident in RAM for those programs.. | 18:43 |
sarnold | rizzuh: but sometimes shared libraries take a pile, the 'smem' tool can help you find out which processes have which shared libraries loaded, and apportions to each of them a certain amount of the fault for the memory used by those shared libraries | 18:44 |
rizzuh | sarnold, well ATM top shows 554478k free - if that isn't woefully inaccurate it's pretty good | 18:47 |
sarnold | rizzuh: well, "free" is a funny thing. the kernel keeps some memory around, free, to handle spikes of allocations. but it tries to minimize the amount of free memory because free memory is wasted memory. :) | 18:48 |
rizzuh | sarnold, ahh, sure, free as in not reserved by an app. If it's full of cache that ain't an issue. | 18:48 |
sarnold | rizzuh: that's where the -/+ buffers/cache line comes in -- that includes memory that is currently being used for storing in ram copies of files but _could_ be thrown away under pressure | 18:48 |
sarnold | rizzuh: *nod* *nod* | 18:49 |
=== wxl_ is now known as wxl | ||
rizzuh | sarnold, that said, 500 MB RAM to use is good, but damn this thing is slow. Good that Redis doesn't need much processing power. It's taking a while to update a few apt packages. | 18:55 |
sarnold | rizzuh: at least the amazon micro instances are very heavily penalized in much the same way.. not bad for slight spikes in a mostly-idle environment, but installing a few hundred packages is -painful- | 18:56 |
jsonperl1 | yea those micros | 18:57 |
jsonperl1 | i'm fairly sure they arbitrarily throttle you... | 18:57 |
rizzuh | sarnold, yeah these are pretty much the same as AWS micro. 5 Mbit network as well, not great. | 18:57 |
sarnold | if the azure storage can be moved among instances, it might even make sense to turn it off, attach to a good instance, upgrade, and move back to cheap again.. heh. | 18:57 |
=== jsonperl1 is now known as jsonperl | ||
sarnold | rizzuh: 5MBit? wow! | 18:57 |
rizzuh | The next one is small at $50 a month, with 1.5GB RAM and a dedicated core. Oh and 100 Mbit network or something like that. | 18:58 |
rizzuh | But then through BizSpark we pay 33% less. "Pay", as we have $150 credit / dev, with production usage rights, so it's pretty good for the money :P | 18:59 |
=== Ursinha-afk is now known as Ursinha | ||
jsonperl | Patrickdk, so running simulators at a box… I'm able to REALLLLLY pile up on LAST_ACK state connections | 19:50 |
jsonperl | Over about 20 minutes, I'm able to get to a count of 450 or so | 19:50 |
patdk-wk | nice | 19:51 |
jsonperl | Seems odd right? | 19:51 |
patdk-wk | something isn't closing the connection correctly | 19:52 |
patdk-wk | might just be normal for ios, no idea though | 19:52 |
jsonperl | Our server was trying to "close a connection after writing remaining data" | 20:02 |
jsonperl | I changed it to just close the connection, seems to fix that at least | 20:02 |
jsonperl | sarnold: Ive dumped some dmesg output from blocked processes, but still unclear how to read it | 20:34 |
hallyn | jdstrand: would adding AUDIT_WRITE to libvirtd apparmor policy be acceptable? | 20:35 |
jdstrand | hallyn: usr.sbin.libvirtd? | 20:41 |
hallyn | yes | 20:46 |
jdstrand | hallyn: that's fine, libvirtd is not really confined anyway (the VMs it launches are) | 20:55 |
jdstrand | hallyn: let me point you at a bug though | 20:55 |
jdstrand | hallyn: actually, nm, you should be ok | 20:55 |
hallyn | jdstrand: ok, thanks. (i consider this ultra-low priority) | 20:59 |
hallyn | zul: ^ if you happen to be merging libvirt soon-ish, we should toss that in i guess (there is an open bug requesting it) | 20:59 |
jsonperl | netstat -s output… does anything here look overly concerning? http://pastebin.com/bnzEFRPh | 21:05 |
thumper | hi hallyn | 21:05 |
thumper | hallyn: thanks for the comprehensive email | 21:05 |
thumper | it has me thinking... | 21:05 |
thumper | hallyn: also, lxc-device isn't available in the precise lxc that we are limited to | 21:06 |
sarnold | jsonperl: 10878 invalid SYN cookies received | 21:08 |
sarnold | jsonperl: that seems steep. | 21:08 |
jsonperl | take the system down steep | 21:08 |
jsonperl | ? | 21:08 |
sarnold | maybe it's normal on the internet now, but .. it'd be worth asking your host if you're under attack.. | 21:08 |
sarnold | jsonperl: what's this machine -do-? | 21:08 |
jsonperl | serves a game via a persistent tcp connection to a bunch of users | 21:09 |
jsonperl | at this time only about 50-100 concurrent on that machine | 21:09 |
jsonperl | distributed amongst 14 servers on that machine | 21:09 |
hallyn | thumper: are you actually limited to the stock precise lxc, or could you use lxc from the ubuntu-lxc ppa for precise? AFAIUI you're using ppas anyway.... but in any case lxc-device is just a nicety, you do NOT need it :) | 21:10 |
thumper | hallyn: possibly not necessarily limited to stock lxc | 21:11 |
thumper | but I've not considered extra ppas | 21:11 |
thumper | managed to not really need it at this stage | 21:11 |
thumper | hallyn: this would be on every machine, and I don't think we install ppas on every machine | 21:12 |
hallyn | thumper: well lxc-device itself isn't enough of a reason to switch to ppa i don't think | 21:12 |
* thumper nods | 21:12 | |
thumper | I need to find someone who knows maas | 21:12 |
thumper | to work out how to do the "gimmie a nic" thing | 21:12 |
hallyn | thumper: is it acceptable to simply start up the container after getting the nic from <whatever hands it to you> ? | 21:12 |
thumper | yes, I think we can do that | 21:12 |
hallyn | cool, that'll be easiest | 21:13 |
thumper | as long as the getting a nic doesn't take too long | 21:13 |
thumper | < 10s would be ok I think | 21:13 |
thumper | longer than that and we might need to work out something else | 21:13 |
thumper | by something else | 21:13 |
thumper | just a better work flow | 21:13 |
jsonperl | sarnold: Any ideas for further investigation into the invalid syn cookies? | 21:13 |
thumper | hallyn: I wish I knew about the "no network conf" bit to use the host | 21:14 |
thumper | that would have been a good enough setting by default I think | 21:14 |
jsonperl | an attack certainly could explain the very random connectivity issues we've seen | 21:14 |
thumper | I need to consider the implications for the local provider | 21:14 |
hallyn | thumper: i don't follow. you mean lxc.network.empty ? | 21:14 |
thumper | no, the number 2 | 21:14 |
thumper | no network entry | 21:15 |
sarnold | jsonperl: syn packets tie up kernel memory; syn cookies are one way to try to avoid the worst of the kernel memory use. for some good background information, see http://lwn.net/Articles/277146/ | 21:15 |
sarnold | jsonperl: /etc/sysctl.conf has a configuration you can set to turn on syn cookies | 21:15 |
thumper | also I need to work out how to have a nice api to our internal providers, and how to handle that config with the containers | 21:15 |
jsonperl | ok, thanks for the read | 21:16 |
thumper | the brain is busy handling this with a background process :) | 21:16 |
thumper | I think I almost have it :) | 21:16 |
jsonperl | sarnold: if netstat is reporting invalid syn cookies, doesn't that mean they're on? | 21:16 |
sarnold | jsonperl: maybe? :) | 21:17 |
jsonperl | sarnold is that the only thing of concern that popped out at ya? | 21:26 |
sarnold | jsonperl: the high connection counts made me wonder, but the use makes sense, hehe | 21:28 |
jsonperl | Kids jumping in and out of the game | 21:28 |
sarnold | sorry nothing just stands out to me ;( | 21:28 |
jsonperl | worlds exist on one server on one machine, and they can "teleport" between them | 21:29 |
jsonperl | haha ok :) | 21:29 |
jsonperl | sarnold: good reading on syncookies thanks | 21:31 |
=== Ursinha is now known as Ursinha-afk | ||
thumper | hallyn: still around? | 21:50 |
hallyn | thumper: yup | 21:50 |
thumper | hallyn: thinking about number four, where we create a veth pair | 21:51 |
thumper | hallyn: if the container hasn't been started, there is no network namespace right? | 21:51 |
thumper | or is there? | 21:51 |
hallyn | nope. | 21:52 |
thumper | also, this "sudo lxc-unshare -s NETWORK -- /bin/bash" seems like it does something interesting I don't quite grok | 21:52 |
hallyn | thumper: that's just doing the same thing as creating a container. | 21:52 |
hallyn | it starts a task inside a new, private network ns | 21:52 |
hallyn | as for veth - if MAAS/openstack/ec2 will hand you a nic, then ignore veths | 21:52 |
hallyn | lxc.network.type = veth will always create a new veth pair and attach the one end to lxc.network.link. | 21:53 |
thumper | well openstack won't | 21:53 |
thumper | ah, I was going to ask what the link bit was | 21:53 |
thumper | hallyn: can I run my idea past you? | 21:54 |
hallyn | so if you *were* going to use veth, which my feeling is you won't, then you would bridge whatever you get <handwaving> from openstack to br0, then say lxc.network.type = veth lxc.network.link=br0 | 21:54 |
hallyn | sure | 21:54 |
thumper | hallyn: although #juju-dev might be better | 21:54 |
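
Two checks from the discussion above (how many connections are piling up in LAST_ACK, and whether SYN cookies are actually switched on) can be sketched in Python roughly as follows. This is only an illustrative sketch, not something anyone in the channel ran: the function names are made up here, and it assumes a Linux host that exposes the usual `/proc/net/tcp` and `/proc/sys/net/ipv4/tcp_syncookies` files.

```python
from collections import Counter

# Hex state codes used in the "st" column of /proc/net/tcp
# (values as defined in the Linux kernel's tcp_states.h).
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp.

    Hypothetical helper for illustration: the first line is the column
    header, and the fourth whitespace-separated field of every other
    line is the hex state code.
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def syncookies_enabled(path="/proc/sys/net/ipv4/tcp_syncookies"):
    """Return True if the net.ipv4.tcp_syncookies sysctl is non-zero."""
    with open(path) as f:
        return f.read().strip() != "0"
```

On a live box this could be used as `count_tcp_states(open("/proc/net/tcp").read())["LAST_ACK"]`, sampled every minute or so while the simulators run, to watch whether the LAST_ACK count keeps climbing; IPv6 sockets would need the same treatment for `/proc/net/tcp6`.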