[00:19] my issue was resolved on #postfix, thank you
=== airtonix_ is now known as airtonix
[02:15] Hi, does anybody have some monit experience?
[02:16] hi all. i need help getting sshfs working on boot. fstab entries are not working.
=== arrrghhh is now known as arrrghhhAWAY
[03:18] hello
[03:18] trying to figure out the cloud thingie :P any help here?
[03:19] right now i'm running a kvm on one server and a few virtual machines
[03:24] can i use MAAS to do the same?
[03:24] on one server that is
[03:25] or is that overkill?
[05:30] Hello Friends, I want to control the server in such a way that it kills a process when it consumes too much memory
=== thumper is now known as thumper-afk
[07:06] ketan985: try looking into "ulimit" and "oom killer".
=== rvba` is now known as rvba
=== smb` is now known as smb
[07:38] hello there
[07:39] hello anyone there ?
[07:40] /join ubuntu-gb
[08:29] hi
[08:29] anyone there ?
[08:30] is it possible to create a root user with temporary access ?
[08:34] zertyui, you can add any user to the sudo list temporarily.
[08:35] but .... how can you be sure this user will not change things while he is root so he can be root again later on ?
[08:37] yes if he would like
[08:38] but i'm looking for a command creating users with timestamp access
[08:45] so impossible ?
[08:49] root, by definition, can do anything. Including adding backdoors to regain root later
[08:54] yes of course i'm an idiot do not understand that
[08:55] my question was how to grant root access to a user with timestamp access ?
[08:56] you are logged in as root on a system, i would like to create a user
[08:56] and give him root
[08:56] just for example for 2 days
[08:56] add him in the sudoers file, and remove him after 2 days.
[08:57] if you wanna have this automatically done, write a script that does it for you.
and use cron to launch it
[08:57] then the user is automatically back to a normal user, i would do that automatically
[08:57] without doing it manually
[08:59] Daviey: "agile granite foundations", wow :)
[09:05] You can set user accounts to expire. See usermod(8).
[09:05] sudoers(5) expiry doesn't exist, AFAIK, but you can cron it.
[09:17] jdstrand: Would like to know if there is any update on bug #1197484. ETA for any possible fix, etc?
[09:17] Launchpad bug 1197484 in isc-dhcp "Connection requests to saucy server VMs from a hosts fail after fresh VM installs" [High,New] https://launchpad.net/bugs/1197484
=== Guest46086 is now known as jpds
[10:13] yolanda, that nagios3 merge is confusing - I'm not entirely sure why the debdiffs are so huge
[10:14] let me double check
[10:14] not sure now
[10:18] jamespage, you mean the diffs between the two ubuntu versions?
[10:19] yolanda, yes - the debian commits are patches and packaging
[10:21] I'll recheck the process
[10:21] yolanda, even the diff between -3 and -4 in debian is massive
[10:21] yolanda, I don't think it's what you have done
[10:21] although I did not expect to see the diff in debian/po
[10:21] let me paste the report
[10:21] maybe you see something
[10:22] http://paste.ubuntu.com/5876994/
[10:23] yolanda, yeah - I know
[10:23] but think about what change you actually made for 3ubuntu1
[10:23] that should be the only delta
[10:24] let me recheck that
[10:24] yolanda, oh - you should also close out the merge bug as part of your changelog
[10:25] I was about to add that to the bug report but got sidetracked by this issue
[10:26] ok, didn't know about it, so i should reference the LP bug in changelog?
[10:28] yolanda, yes
[10:28] yolanda, I think the po mods are a grab-merge bug
[10:28] yolanda, if I do the merge using ubuntu:nagios3 and lp:debian/sid/nagios3 I get what I would expect
[10:29] ok, so i'll try with that approach
[10:30] jamespage, is it better to rely on manual merges, not on the grab_merge script?
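Going back to the 09:05 suggestion about expiring accounts with usermod(8): a minimal sketch of how the expiry date for a two-day account could be computed and turned into a command line. It only builds the command, it does not run it. The account name `tempadmin` and both helper names are made up for illustration. Note that account expiry disables the whole account rather than just its sudo rights, so removing a sudoers entry would still need the cron approach mentioned above.

```python
from datetime import date, timedelta
from typing import List


def expiry_date(start: date, days: int) -> str:
    """Return the date `days` after `start` as YYYY-MM-DD, the format usermod(8) documents."""
    return (start + timedelta(days=days)).isoformat()


def usermod_expire_command(user: str, start: date, days: int) -> List[str]:
    """Build (but do not execute) a usermod call that expires `user` after `days` days."""
    return ["usermod", "--expiredate", expiry_date(start, days), user]


if __name__ == "__main__":
    # "tempadmin" is a hypothetical account name, not one from the log.
    print(" ".join(usermod_expire_command("tempadmin", date.today(), 2)))
```

The resulting list could be handed to `subprocess.run` by a root-owned script, but that step is left out here on purpose.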
[10:30] or maybe do the grab_merge and then check that for unexpected results?
[10:31] yolanda, yeah - that's what I end up doing
[10:31] ok, then i'll fix that, and i also have to update the changelog for the others
[10:33] yolanda, great - thanks!
[10:51] jamespage, much cleaner debdiff doing a manual merge
[10:52] i'll resend the patches
[10:59] jamespage, generated diff between prev ubuntu version and this one is huge anyway, mostly same size, but diff between debian/ubuntu is clean now
[11:27] yolanda, when you attach patches please can you make sure that you tick the 'this is a patch to fix the problem' option
[11:27] it breaks the sponsorship tooling otherwise
[11:28] yolanda, for the nagios merge the bug in the changelog does not match the one in launchpad
[11:31] yolanda, fwiw you can just push the branch to launchpad/raise a MP instead of doing the debdiffs
[11:34] jamespage, i remember not having the permissions, i think
[11:34] yolanda, to mark patches as 'patches'?
[11:35] no, for the MP
[11:35] yolanda, anyone can raise a merge proposal
[11:35] ok, and you do the merge? i can't remember who recommended me to use the debdiff approach
[11:36] np, having lunch and i'll raise the mp
[11:39] yolanda, OK - for the squid3 merge:
[11:39] dpkg-source: info: local changes detected, the modified files are:
[11:39] squid3-3.3.4/src/cf.data.pre
[11:39] when I try to use the debdiff - can you take a look at that as well
[11:39] ta
[11:40] sure
[11:57] jamespage: https://code.launchpad.net/~yolanda.robla/ubuntu/saucy/nagios3/debian_merge/+merge/174736
[12:00] yolanda, lots of conflicts
[12:00] i know
[12:00] lots of conflicts between prev and this ubuntu version
[12:00] but it's like that using the grab-merge and the manual merging also
[12:05] i'll try resubmitting the mp, just a moment
[12:06] yolanda, nm - I already uploaded that one
[12:07] is that ok?
[12:07] yolanda, I just fixed up the debdiff you uploaded a while back
[12:07] oh ok
[12:07] i'll check the squid3 problem, not sure what happens
[12:12] jamespage: i really need to beat out neutron today
[12:14] zul, beat away!
[12:14] morning btw
[12:15] good morning
[12:17] yolanda, https://jenkins.qa.ubuntu.com/job/saucy-adt-nagios3/
[12:19] let me try locally
[12:29] hi
[12:29] i want to emulate a NAS server on my ubuntu server, is that possible?
[12:29] is there any kind of software to do this?
[12:29] what could be the query string for google, nas software gives too many results
[12:30] What exactly are you trying to achieve?
[12:30] create folders, rename files, those basic things
[12:30] i guess a NAS is a webdav server, no?
[12:30] So just a file server? Pick a protocol.
[12:31] NFS, SMB, whatever.
[12:31] yes, i use smb, but some users asked to have an HTTP interface with admin control
[12:31] i just wonder if that exists
[12:32] or the user just should login as admin when mapping the server
[12:40] i would think you need some CMS type of software
=== cmagina-away is now known as cmagina
[13:18] hxm: what do you need to administer over web?
[13:19] hxm: look into freenas. It's BSD based, but perhaps you can run it inside a VM on Ubuntu Server?
[13:19] freenas is good, runs on zfs too IIRC, which is good indeed
[13:22] psivaa: still trying to reproduce
[13:23] jdstrand: ok, i had an impression that you've seen it somewhere else as well, maybe i misunderstood
[13:24] psivaa: I did, but I've yet to reproduce it
[13:24] jdstrand: ok, understand
[13:36] Hmmm, any idea why my NFS mount would lock up periodically (to the point of needing a reboot)?
[13:40] Chocobo: not sure, it could be anything. anything in the logs? dmesg?
[13:43] jamespage, i tested nagios3 tests again locally, and they run fine for me
[13:43] yolanda, bah
[13:45] could some dependency with nagios3-cgi be the problem? why does it work locally with run-adt-test, and not on the test machine?
[14:47] hey
[14:48] can MAAS be a good substitute for managing virtual machines?
[14:51] RoyK: Not really... I can still mount other NFS exports on the same interfaces, but it just hangs when I try to mount a certain export. It is strange because other nodes in the cluster all have the problematic export mounted
[14:59] When I try to mount it there is tons of traffic (using tcpdump), this is strange
[15:00] Chocobo: same server as well?
[15:01] RoyK: What do you mean same server? yes, I can mount other exports from the same server.
[15:01] Chocobo: if you mount with options soft,intr, then the connection should be interruptible
[15:02] otherwise, the default action for NFS is to hang while the server's unavailable
[15:06] jamespage, about squid3, i'm finding a strange problem with the patches. I removed the entire .pc directory, retried again, applying all patches manually, etc...
[15:06] when i do a bzr bd -S i have this error :bzr: ERROR: An error (1) occurred running quilt: The working tree was created by an older version of quilt. Please run 'quilt upgrade'.
[15:06] RoyK: this is my fstab entry: dedup-ib:/big_pool/os-grizzly /os-grizzly nfs rw,async,noatime,nolock,tcp,bg,intr,hard,_netdev,noauto 0 0
[15:07] running quilt upgrade doesn't help, it complains that the quilt metadata is already in version 2, nothing to do
[15:07] packaging with a debuild works, but not sure if that is ok
[15:10] Chocobo: perhaps try soft instead of hard
[15:10] Chocobo: it won't fix the issues, but may make it easier to debug
[15:10] btw, I don't think noatime is a valid nfs flag
[15:11] RoyK: Thanks, I will give it a shot.
[15:14] Chocobo: btw, is this some dedup thing?
[15:15] RoyK: it is a ZFS backend that has deduplication enabled, yes.
[15:19] ok
[15:19] lots of memory in the machine?
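To make the "try soft instead of hard" suggestion concrete, here is a small sketch that splits the fstab entry quoted at 15:06 into its six fields and swaps one mount option for another. The function names are invented for this example, and the code only rewrites text: it does not touch /etc/fstab or judge whether any option is valid for NFS.

```python
from typing import Dict, List


def parse_fstab_entry(line: str) -> Dict[str, object]:
    """Split one fstab line into its six whitespace-separated fields."""
    device, mountpoint, fstype, options, dump, passno = line.split()
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options.split(","),
        "dump": int(dump),
        "passno": int(passno),
    }


def swap_option(options: List[str], old: str, new: str) -> List[str]:
    """Return a copy of the option list with `old` replaced by `new`."""
    return [new if opt == old else opt for opt in options]


def render_fstab_entry(entry: Dict[str, object]) -> str:
    """Turn a parsed entry back into a single fstab line."""
    return "{device} {mountpoint} {fstype} {opts} {dump} {passno}".format(
        opts=",".join(entry["options"]), **entry
    )


if __name__ == "__main__":
    line = ("dedup-ib:/big_pool/os-grizzly /os-grizzly nfs "
            "rw,async,noatime,nolock,tcp,bg,intr,hard,_netdev,noauto 0 0")
    entry = parse_fstab_entry(line)
    entry["options"] = swap_option(entry["options"], "hard", "soft")
    print(render_fstab_entry(entry))
```

Keeping the options as a list makes it easy to test one change at a time while debugging a hanging mount.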
[15:19] in my experience, zfs dedup is *very* hungry for memory
[15:22] hey, my system clock keeps drifting
[15:22] what's the best solution to fix this
[15:22] doesn't ubuntu have a default cron to handle this ?
[15:23] fncirunbvhltdjiddnjuihkrfglcfigcvdekrevdnlin
[15:23] bitnumus: ntp should keep your clock in sync
[15:23] bitnumus: is this a vm?
[15:23] nope
[15:23] not sure how the provider has it setup, it's a VPS
[15:24] then it's probably a vm
[15:24] can you pastebin lshw output?
[15:24] lshw ?
[15:24] why would ubuntu have a cron to handle clock? that is the worst idea ever
[15:24] patdk-wk, just what i've read
[15:25] bitnumus: apt-get install ntp
[15:25] ntp is installed
[15:25] maybe not running but
[15:25] sec
[15:25] i looked at this a few days ago now, something about ntpdate
[15:25] bitnumus: yes, lshw, it should show on what hardware or hypervisor you're running
[15:26] RoyK, that gives 'bad command'
[15:26] then apt-get install it :)
[15:26] bitnumus: perhaps dmidecode will tell
[15:26] Some VPSes don't let you set the clock.
[15:27] but lshw output is better
[15:27] sec,
[15:27] I had one where the clock was out, the kernel wasn't available to user modification, and setting the clock resulted in an error. I had to get the hosting provider to fix it.
[15:28] RoyK, http://pastebin.com/z9aFjKGm
[15:28] rbasak: openvz or vserver based systems don't have individual clocks
[15:29] so i've installed ntp, anything i need to do to initialise it ?
[15:29] bitnumus: not sure, but I guess vserver
[15:29] does it need a reboot
[15:29] bitnumus: to manually set the time from a timeserver, use ntpdate pool.ntp.org
[15:30] i don't want to manually do anything, i need it to keep up to date with next to 0 drift
[15:30] you might need to stop ntp first because of an open socket
[15:30] will ntpd keep it in check ?
[15:30] yes
[15:30] so no reboot or anything ?
[15:30] no
[15:30] how often should it update it ?
[15:31] but if the clock is too far askew, ntpd might not catch up
[15:31] or it will take some time to catch up
[15:31] so, service ntp stop ; ntpdate pool.ntp.org ; service ntp start
[15:31] na, atm it's about 1 second out
[15:31] ntpd adjusts the clock speed to match the time it's syncing. So it's not updated as such. Once the clock stays in sync it should just appear to be in sync.
[15:31] maybe that was my issue before, it drifted to 264 seconds
[15:31] then it shouldn't be needed to use ntpdate
[15:32] ok great stuff
[15:32] what is the best way to update your server? apt-get upgrade, or apt-get upgrade --show-upgraded, or apt-get dist-upgrade, or aptitude dist-upgrade ?
[15:32] cheers ^
[15:32] streulma, depends on the goal
[15:32] streulma: I just do apt-get update && apt-get -y dist-upgrade && apt-get -y autoremove
[15:32] upgrade everything, upgrade security patches only, ...
[15:33] dist-upgrade is what I use, and you need it to bring in new kernel security patches
[15:34] used command from RoyK
[15:35] royk, maybe use virt-what next time, over lshw?
[15:35] patdk-wk: virt-what?
[15:35] ah
[15:35] didn't know that ;)
[15:36] I knew it existed, but couldn't remember the name
[15:36] and didn't know if it did openvz and them, but it does
[15:36] streulma: can you try virt-what as patdk-wk suggested?
[15:36] bitnumus, you mean?
[15:36] uh, yes
[15:37] bitnumus: ?
[15:37] what
[15:37] ya, what is the word :)
[15:37] bitnumus: can you try virt-what as patdk-wk suggested?
[15:37] RoyK: There is 512GB in that machine I believe.
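On the 15:31 remark that ntpd adjusts the clock speed and may take some time to catch up: a rough sketch of how long pure slewing would take for the two offsets mentioned in the log (about 1 second and 264 seconds). The 500 ppm maximum slew rate is an assumption taken from the usual ntpd documentation, not something stated in the channel, and as far as I know ntpd by default steps the clock instead of slewing once the offset is above its step threshold, so treat the result as an illustration only.

```python
# Assumed maximum slew rate in parts per million (not stated in the log).
MAX_SLEW_PPM = 500


def slew_seconds(offset_seconds: float, ppm: int = MAX_SLEW_PPM) -> float:
    """Seconds of wall time needed to remove `offset_seconds` by slewing at `ppm`."""
    return abs(offset_seconds) * 1_000_000 / ppm


if __name__ == "__main__":
    for offset in (1, 264):
        needed = slew_seconds(offset)
        print(f"{offset} s offset: {needed:.0f} s of slewing ({needed / 86400:.2f} days)")
```

Under that assumption a 1 second offset takes a bit over half an hour to absorb, while 264 seconds would take about six days, which is why setting the clock once with ntpdate first was suggested for large offsets.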
[15:37] Chocobo: should suffice for rather a large amount of diskspace ;)
[15:38] hate to see that reboot
[15:38] XEN
[15:38] ah
[15:38] I've seen clock drift with xen
[15:38] you're ok running ntp on xen
[15:39] it won't keep the clock perfect though
[15:39] but it will keep it close
[15:40] I've seen some time wibble on Xzn
[15:40] Xen
=== whaley is now known as aTribeCalled
[16:20] Hi everyone
[16:20] ho
[16:20] I just got access to two new sponsored servers running the latest ubuntu
[16:20] I haven't used the past few ubuntu versions so some things have changed
[16:21] First up, is there something special that needs to be done now to change my sudo password?
[16:21] passwd does not seem to be working
[16:22] What do you mean by 'sudo password'
[16:22] if your user is in the /etc/sudoers file it can access root via its own user password
[16:23] if you wanna change the root password you can do 'sudo passwd'
[16:23] I've tried about 4 times now, I use passwd, enter the current password, then enter my new password twice, it says it is changed, yet if I open a new SSH session only the old password works
[16:23] are you changing the password for the user you also try to ssh in with?
[16:23] So I mean my own account password which is the only account on the box
[16:24] nothing in that regard has changed, is this a brand new installation?
[16:24] My next question would be how to check if a root password is set at all, since I know ubuntu does not set one by default, so I would only want to change that if one is already set
[16:24] But I first need to figure out why my own password is not changing
[16:24] hachre: Yes, I was told it was installed today
[16:25] it's weird, passwd should go through the /etc/pam.d/system-auth component
[16:25] I see now it is not the latest ubuntu even, they installed Ubuntu 12.04.2 LTS
[16:25] I think
[16:25] ah yea
[16:25] I hope 12.04 does not cause me grief
[16:25] that's the latest LTS release
[16:26] weeb1e: "sudo getent shadow root" to see the root password hash. If it's "!" or "*" or something, then there's no root password set. That's generally the same across all distros.
[16:26] thanks rbasak, any idea why passwd wouldn't be taking effect for me?
[16:26] There is a hash there so I assume a root password is set
[16:26] weeb1e: "passwd" sets the password you use to sudo with (your own user password). What you ssh in as should be the same password. I don't know of any reason that wouldn't work unless your provider is doing something? Is it a fresh install on real hardware, or some kind of VM?
[16:27] rbasak: I know that, passwd says it works for my user but then my user's password is not changed in any new SSH sessions
[16:27] It is a fresh install on hardware, I had to wait a few days for them to remove the VM and install an OS directly
[16:28] VMs are useless for realtime software which requires minimal overhead and max performance
[16:28] weeb1e: how about an ssh user@localhost from the machine itself?
[16:29] weeb1e: were you running "passwd" as your own user, or as root?
[16:29] rbasak: Still only the old password works
[16:29] RoyK: My own user
[16:29] weeb1e: "sudo getent shadow youruser" before and after changing the password. Does that get updated?
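The 16:26 rule of thumb for reading `getent shadow` output can be written down as a small classifier. This is only a sketch of that rule: the function name and the sample lines are invented for illustration, and it looks at nothing but the second colon-separated field of a shadow entry.

```python
def password_state(shadow_line: str) -> str:
    """Classify the password field (second field) of one shadow(5)-style line."""
    fields = shadow_line.strip().split(":")
    if len(fields) < 2:
        raise ValueError("not a shadow entry: %r" % shadow_line)
    password = fields[1]
    if password == "":
        return "empty"      # no password required at all
    if password[0] in "!*":
        return "locked"     # no usable password set
    return "hash set"       # some password hash is present


if __name__ == "__main__":
    # Sample entries are made up; the hash is a placeholder, not a real one.
    for sample in ("root:!:15900:0:99999:7:::",
                   "root:$6$examplesalt$examplehash:15900:0:99999:7:::"):
        print(sample.split(":")[0], "->", password_state(sample))
```

Running it against the line before and after a `passwd` call gives the same before/after comparison suggested at 16:29, without having to eyeball the hash.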
[16:29] check the file date of /etc/shadow
[16:29] rbasak++
=== aTribeCalled is now known as whaley
[16:32] rbasak: Yes it changes
[16:32] This is very weird
[16:32] weeb1e: and then if you try to ssh youruser@localhost?
[16:32] I am getting very confused, I have worked with plenty of Ubuntu servers in the past and have never had an issue like this
[16:32] weeb1e: "grep password /etc/pam.d/sshd" - does that say "@include common-password" or something else?
[16:33] weeb1e: it certainly shouldn't do that on a default install.
[16:33] RoyK: Hmm, that worked now from that same session
[16:33] weeb1e: but not from another machine?
[16:33] And now it works in a new session
[16:33] Why the hell would it suddenly work on the 6th attempt :|
[16:33] ok, possibly PEBKAC ;)
[16:34] Oh wait, I changed the root password
[16:34] bingo
[16:34] So the only explanation is it is using the root password for my own account?
[16:34] Why would it be doing that
[16:34] rbasak: Yes, that is included
[16:35] I guess the techie that installed these boxes did something odd
[16:35] weeb1e: the behaviour you're describing is certainly non-standard non-default.
[16:35] normally, on ubuntu, root doesn't have a password. it means you can boot to single if you have physical access without a password, but then, if you have physical access, you can normally override most security
[16:36] rbasak: seems to me he just ran passwd as root, nothing more
[16:36] weeb1e: it might be worth comparing /etc/ssh/sshd_config and /etc/pam.d/* against a default system.
[16:36] If I had to remove the root password now, would my user still work with its own password?
[16:36] yes
[16:36] I just don't want to lock myself out
[16:36] weeb1e: RoyK: yeah, perhaps I've misunderstood the details.
[16:36] How would I remove the root password?
[16:37] weeb1e: no need, really
[16:37] weeb1e: your system is only slightly more secure with a root password
[16:37] weeb1e: leave an ssh session running "sudo -i" so you have a root prompt. Change and test at will. If you leave the session open you can recover from problems using that.
[16:37] EOD
[16:38] weeb1e: you may want to turn off root login in /etc/ssh/sshd_config, though
[16:38] Well, ok I don't need to remove the root password
[16:38] But I don't want all accounts to use that password
[16:38] Would removing "@include common-password" be enough to solve that?
[16:38] weeb1e: all accounts have their own passwords
[16:39] RoyK: Like I said I can't login or sudo with my own account's password
[16:39] It only started working when I set the root password to my own password
=== jkyle_ is now known as jkyle
[16:40] weeb1e: I'd avoid changing /etc/pam.d at all unless you're restoring defaults that have been changed. AIUI, the behaviour you want *is* default on Ubuntu
[16:40] @include common-password is there by default
[16:40] Ok well let me change my own password and see if it takes effect now
[16:40] The behaviour I've heard you describe here (as far as I've understood what you've said) *is not* default on Ubuntu.
[16:41] weeb1e: well, now, after you have successfully changed your password, login and try sudo -i
[16:42] rbasak: Yeah that was my understanding too, I've used plenty of ubuntu servers and never experienced this before
[16:42] But now after having set the root password, changing my own account's password works correctly
[16:42] weeb1e: I guess what you experienced was just that you changed the wrong password
[16:42] I still don't understand why it was not before
[16:42] RoyK: I tried using passwd without sudo at least 5 times
[16:43] And it said it worked, yet a new ssh session only worked with the old password
[16:43] never seen that - ever - since I installed slackware 2.1 back in 1994
[16:43] Very odd behaviour
[16:43] weeb1e: indeed - does ssh youruser@localhost work with the new one?
[16:43] Well, I have a second box that should be identical to this one, let's see how the password changing goes there
[16:44] It does now, it didn't before
[16:44] weeb1e: try localhost first
[16:44] if there's a difference between ssh to localhost and from another machine, there may be a man-in-the-middle somewhere
[16:44] which is rather alarming
[16:45] No, there is no difference, both ssh to localhost and an external ssh session failed for the first bunch of attempts
[16:45] They only started working with the newly set password after I changed the root password
[16:46] weeb1e: do both work now?
[16:46] Yes
[16:46] then you probably changed the wrong user's password
[16:46] But I have a second machine to test now, and it does not have a root password set
[16:46] try again
[16:46] ok
[16:47] Oh well, that machine worked as expected
[16:48] All things do point to me having changed the wrong password, but I am also very sure that I did not..
but oh well, thanks for the help anyway
[16:53] So much for the machines being identical, the second box has something seriously wrong
[16:53] E: Package 'build-essential' has no installation candidate
[16:54] huh
[16:54] weeb1e: I'd reinstall that if I were you
[16:54] perhaps run rkhunter or chkrootkit on it first
[16:54] and check the repos used
[16:54] or just nuke it
[16:55] RoyK: Reinstall the whole OS?
[16:55] I would have to get my sponsor to send a technician to do it
[16:55] if something has been let in that can be logging passwords, then it's rather bad
[16:55] can you compare /etc/apt/* between the two machines?
[16:55] "if something has been let in that can be logging passwords"?
[16:56] Where did you get that from?
[16:56] use rsync -r from a separate machine to transfer the contents
[16:56] weeb1e: I'm just paranoid, sometimes that's all it takes
[16:56] Hmm, I'll compare the contents
[16:57] why apt/* not just apt/sources?
[16:57] the sources.list files are the same
[16:57] heh?
[16:57] because sources.d is another source to sources :P
[16:57] someone can easily setup apt to use a proxy server
[16:58] and then give you whatever they want
[16:58] checking sources won't detect that
[16:58] true
[16:58] Yeah well, they could, but this sponsor likely does not have the technical know how for that :P
[16:58] * patdk-wk hopes no one gets my proxy :)
[16:58] weeb1e: check the checksums (md5 or sha) of passwd and the modules used by pam
[16:59] weeb1e: it may be a false alert, but you're seeing some rather interesting issues that *may* turn out to be nasty
[17:00] I'd need to find another 12.04.2 ubuntu server to compare against
[17:00] Let me check if I have a VM installed
[17:00] weeb1e: first: download rkhunter and/or chkrootkit from the source, not from the repos, and run it/them
[17:00] RoyK: I understand your concern, since I have just gained access to these boxes I'd rather be safe than sorry
[17:00] patdk-wk: do you know any other checks to run on such a system?
[17:01] not really, I just don't bother anymore
[17:01] restore from template
[17:01] patdk-wk: why not?
[17:01] ok
[17:02] I do tend to keep the old ones around for inspection, and find the issue
[17:02] but normally, people breaking into servers leave craploads of helpful info around
[17:02] patdk-wk: doesn't work too well for physical machines, though
[17:02] good thing I don't have any :)
[17:03] but it would work the same way
[17:03] just take longer to do a restore
[17:03] I do it for laptops, and desktops
[17:03] after I install, I backup to a template, that I restore on the other ones
[17:03] and use if someone gets infected
[17:04] that is windows though
[17:04] I only have physical machines, without any physical access :/
[17:04] VMs have too much overhead
[17:04] weeb1e: huh?
[17:05] weeb1e: we run 150ish VMs on 8 VMware hosts at work, and it runs smoothly
[17:06] would probably run well on 6, or it will, when we reorganize the two clusters into one
[17:09] since when do vm's have overhead?
[17:09] at least if you're using an ept server, so e54xx or higher cpu
[17:10] patdk-wk: heh - back before they added virtualization extensions ;)
[17:10] no, that was painful
[17:10] vmware around 2001 was rather heavy
[17:10] ept made it so you didn't have overhead for memory page changes
[17:11] if your server is that old, to not support vt, I would suggest, you don't need a server :)
[17:12] hehe
[17:12] but if your server is <5 years old or so, you probably have ept support
[17:12] so the vm will have an unmeasurable amount of vt overhead
[17:13] I will say, going from physical to vmware, caused me a 15% additional overhead
[17:13] then I realized the old servers didn't have ept, removed it, and I am <5% overhead
[17:15] patdk-wk: got a cluster?
[17:15] 4 clusters
[17:15] many hosts?
[17:15] large windows, small windows, large ubuntu, small rhel
[17:16] physical, from 3 to 6
[17:16] why separate the vm's into different clusters based on OS?
[17:16] royk, they aren't
=== matsubara is now known as matsubara-lunch
[17:16] they are in different datacenters doing different things
[17:16] ok
[17:17] large windows cluster has like 5 rhel on it
[17:17] but it has 400 windows vm's
[17:17] damn
[17:17] how many hosts?
[17:17] on 5 blades
[17:17] not bad
[17:17] how much memory in those?
[17:17] currently, 144, and we are pushing into 80% used again
[17:19] those blades are getting upgraded next spring, so moving to 386 or more ram, but need faster cpu's, single core performance in windows is really hurting lately
[17:19] we have two clusters, plus a separate box for patient data, running a single vm, separate box because of historical issues, I guess, since some people didn't trust putting a large VM on other machines that were exposed to the internet
[17:19] perhaps going for virtual datacentre one day
[17:20] if it had patient data, it would be a hippa issue here, much easier to say you're in regulations
[17:20] but not sure what the laws are there
[17:20] hippa?
[17:21] http://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act
[17:21] I know others that are using the same cluster for mixed data
[17:21] guess it's hipaa
[17:21] I see, thanks
[17:22] The Norwegian Data Protection Authority is the main actor at this, and they allow (at least certain) installations of patient data VMs along with open servers
[17:25] RoyK: "Exposed to the internet", you mean web servers? or just something that was able to access the internet outbound?
[17:25] ok, hipaa doesn't forbid it :)
[17:25] but if you don't want to report loss to your *customers*
[17:25] then it must be approved and encrypted
[17:26] Pici: web servers or others that can be reached from the internet
[17:26] so making it encrypted, and being able to verify loss, is simple if it's dedicated
[17:27] IMHO nothing is really dedicated when the blade is in the same chassis as the other blades and VLAN control is at the VM level
[17:29] ok, it's long after lunch time
[17:44] patdk-wk: 400 VMs on five hosts with 192GB seems rather heavy, it's like 1.92GB per VM
[17:45] (with four, if one fails)
[17:45] well, depends on mem dedup though
[17:45] is that really efficient?
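The 1.92GB figure from 17:44 is easy to reproduce. The sketch below uses the numbers from that message (five hosts, 192 GB each, 400 VMs, one host assumed failed) and ignores hypervisor overhead and memory dedup, both of which the channel notes would change the picture. The function name is made up for this example.

```python
def ram_per_vm_gb(hosts: int, ram_per_host_gb: float, vms: int,
                  failed_hosts: int = 0) -> float:
    """Average physical RAM per VM in GB, ignoring overhead and memory dedup."""
    usable_hosts = hosts - failed_hosts
    if usable_hosts <= 0 or vms <= 0:
        raise ValueError("need at least one usable host and one VM")
    return usable_hosts * ram_per_host_gb / vms


if __name__ == "__main__":
    print("all five hosts up:", ram_per_vm_gb(5, 192, 400), "GB per VM")
    print("one host failed:  ", ram_per_vm_gb(5, 192, 400, failed_hosts=1), "GB per VM")
```

So 1.92 GB is the four-host case, and with all five hosts available the average would be 2.4 GB per VM.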
[17:45] as 240 or so are cloned win7, they dedup good
[17:45] ok
[17:45] ya, they have 4gigs of ram each, and normally only use 1gig of ram each
[17:46] ok
[17:46] clients?
[17:46] vmware view, for client access yes
[17:46] destroyed on each logout
[17:46] should try that out
[17:47] we have some 20k users, mostly students, but some 1800 employees
[17:47] how many are logged in at any given time?
[17:47] looking at the fileserver statistics, perhaps 2k
[17:47] at really high times
[17:47] jamespage: ping enjoy: https://code.launchpad.net/~zulcss/neutron/rename/+merge/174832
[17:47] well, that would be how many licenses you need then
[17:48] possibly rather expensive :P
[17:48] well like everything
[17:48] do you use thin clients for this?
[17:48] you do it yourself, or you pay for it
[17:48] royk, heh?
[17:48] Is this a single hospital?
[17:48] clients==customers, we have no control over them
[17:48] Pici: hioa.no
[17:49] Ah.
[17:52] patdk-wk: I meant, are you using thin clients or PCs for this thing?
thin clients as in those that only know RDP or whatever access protocol, but don't have much of an OS locally
[17:53] like I said, how should I know
[17:53] RoyK: I host realtime sensitive software which is affected by the overhead and timeslicing of virtual machines
[17:53] they are controlled by the customer, offsite, nothing to do with our company
[17:53] Such software includes a variety of resource intensive game servers as well as multimedia transcoding and processing
[17:53] ya, realtime stuff is not vm friendly
[17:54] My services are realtime and latency sensitive, so VMs are really not an option
[17:54] you would be surprised how fast vm's can work
[17:54] depends though
[17:54] we keep hosts here with just 1 vm on them
[17:55] but if latency is the only issue, latency normally trumps all vm latency issues
[17:55] weeb1e: I see
[17:55] the 1 vm is a very important and high speed guest, the reason it's virtual is due to portability
[17:55] network latency
[17:55] we installed varnish on a dedicated blade some time back, 200% speed increase
[17:56] so in some applications, virtualization isn't the best approach
[17:56] what happens if your blade backplane fails?
[17:56] royk, that sounds like an ept issue :)
[17:56] which has happened to me
[17:56] patdk-wk: ept?
[17:56] the memory paging virtualization support in newer cpu's
[17:57] otherwise every page table lookup, hits the hypervisor
[17:57] TheSov: it all goes down, obviously, and the important VMs are started on the secondary site
[17:57] and since varnish is memory happy, it will matter a lot
[17:57] RoyK, i'm just saying virtualization, as much as it has its drawbacks, is worth it most of the time
[17:57] I was getting 50% slowdown on some vm's
[17:57] if not for just machine portability
[17:58] not being hardware dependent is ****** awesome
[17:58] TheSov: I know, but the positive side of virtualization is rather huge compared to the drawbacks
[17:58] i think we are arguing on the same side lol
[17:58] 150 VMs as pizzaboxes would fill four racks
[17:58] and consume a rather large amount of power
[17:59] oh, maybe you needed those old rlx blades I used to have :)
[17:59] 2ghz with 20gig drive, 24 per 4u blades
[17:59] we have three Dell bladecentres atm
[17:59] so happy to drop them off a cliff
[18:00] i have an entire rack of dell r714's with 12 core processors and 128 gigs of ram
[18:00] recycling the older ones for the secondary site
[18:00] they rock
[18:00] sounds like amd
[18:00] yes they are
[18:01] I'm normally ram heavy
[18:01] but the 100% flash san is helping to change that
[18:01] no need to cache as much stuff in ram
[18:01] patdk-wk: what sort of SAN do you have?
[18:02] purestor
[18:02] url?
[18:03] purestorage.com
[18:03] something like zfs?
[18:04] it's not
[18:04] it works a lot like zfs, but it's not zfs at all
[18:04] they are using raid3d, so it's basically raid6 but without a dedicated spare, but random holes all over
[18:04] have you tried to yank a disk and put it in a zfs-enabled box and tried zpool import?
[18:04] ;)
[18:05] it wouldn't work
[18:05] it's not zfs, as it's raid3d :)
[18:05] even if they did zfs on top of it
[18:05] what's raid3d?
[18:05] google it
[18:05] ibm made it
[18:05] it solves the slow rebuild issue of using spares
[18:06] hard to explain without the picture
[18:07] <1s failover is nice
[18:07] Dell tells EqualLogic customers to increase iSCSI timeout to 120 to avoid problems
[18:07] well, it's active/active
[18:08] which doesn't work too well with internal timeouts in databases, exchange etc
[18:08] ya, vmware says to use 180sec
[18:08] and it pushes that into windows
[18:08] but not linux
[18:08] doesn't work with exchange
[18:09] I have never failed over exchange yet
[18:09] exchange uses non-blocking I/O and fails after some seconds
[18:11] patdk-wk: all SSD SAN?
[18:11] yes
[18:12] what interlink?
[18:12] dude, lefthand networks has an amazing virtual san appliance
[18:12] using 8gb fc
[18:12] ok
[18:12] i use that in combo with freenas and RDM to produce a high speed full failover san solution that functions at high speed
[18:12] FC!?!? ok i get off the boat here
[18:12] patdk-wk: guess you get rather good IOPS from that thing
[18:13] only have 4 of the 8fc connected right now
[18:13] but we can max out the 2 fc ports per host, easily
[18:13] with 4k iops
[18:14] 4kiops doesn't sound that impressive, though
[18:14] when a single SSD can deliver 10x+ of that
[18:15] hmm?
[18:15] a single ssd can do >200k iops?
[18:15] I know I can do random write iops at full speed
[18:16] you can't say that about zfs with dedup, very easily
[18:17] royk, one thing I do like about them, and why their numbers do seem low
[18:17] their numbers you will get, they are the best numbers under perfect optimization conditions
[18:18] and they are working on a cost scale too
[18:18] so one gen old hardware, to keep costs down
=== matsubara-lunch is now known as matsubara
[18:28] patdk-wk: how much storage do you have in total (net) on those SSDs?
[18:28] 11tb raw usable
[18:28] how many SSDs?
[18:28] we have 19tb of data on it [18:28] seems like an awful lot [18:29] 48 256gb ssd's [18:29] we moved our 15tb of thin allocated data from our old san, to it, and used 4.5tb [18:29] you should e getting a wee more than 4kiops from that bunch [18:29] 4k? [18:29] even spinning rust should give you 4kiops with that amount of drives [18:30] wee more than 4k block size iops? [18:30] I keep peeking out around 300-500k iops [18:30] way over their specs [18:30] shit [18:30] that's a lot [18:31] can easily get 100-150k for a single stream [18:31] (and my excuses to the language police for saying a bad word) [18:31] it must not count anymore, or bot the bot would yell :) [18:31] patdk-wk: want to ship this over? you don't need it, do you? :D [18:33] I kind of like it [18:33] we are getting a 2x dedup ratio, and a 2.3x compression ratio on it [18:33] they join those numbers into one though, generally [18:34] but we pre-tested our data using a tool that will read your lun and spit out what it would use [18:34] so you can estimate how much you need [18:34] If you're going to make sarcastic comments in regards to the ops right after using language you clearly know is not acceptable, why do it at all? [18:36] * patdk-wk failed to see any sarcastic comments made [18:36] because you are not aware of all the facts perhaps [18:37] Anyway, let us all try to behave according to the rules please. [18:38] IdleOne, is something about this sarcastic? " (and my excuses to the language police for saying a bad word)" [18:38] as that was the only thing said after the word [18:39] in this channel atleast [18:39] and everything you said, if it was in reply to an off-channel comment, not sure why you would bring it in here [18:39] yes, first of all we are not "language police" second of all if you are aware enough to apologise for doingsomething wrong then you should have been aware enough not to do it. 
[18:40] the second part is not true [18:40] it's one thing to know you did something wrong, it's another thing to break your habbit [18:40] sure it is. There is no excuse for bad behaviour. When someone joins an Ubutu channel they know what behaviour is acceptable and expected. [18:41] * patdk-wk notes almost all drug addicts [18:41] Ubuntu* [18:41] especially someone who has been in ubuntu channels as long as RoyK has. [18:42] We all mess up now and then I'll grant you that, but in light of recent history. I think the rules woyuld have been fresh in his mind. === cmagina is now known as cmagina-away === cmagina-away is now known as cmagina === arrrghhhAWAY is now known as arrrghhh === cmagina is now known as cmagina-away === cmagina-away is now known as cmagina [21:31] HI there [21:32] I installed apache 2.4 but now when i try to analyze a log file, I got "-bash: fork: Cannot allocate memory" and the ssh session close. Do you know why ? :) [21:39] cyberviking: what ubuntu version? [21:39] cyberviking: how much memory? [21:41] total used free [21:41] Mem: 2097152 287824 1809328 [21:41] -/+ buffers/cache: 20072 2077080 [21:41] trying to analyze a 35mo fil via some grep [21:42] 35mB [21:44] pastebin ps axfv [21:44] !pastebin | cyberviking [21:44] cyberviking: For posting multi-line texts into the channel, please use http://paste.ubuntu.com | To post !screenshots use http://imagebin.org/?page=add | !pastebinit to paste directly from command line | Make sure you give us the URL for your paste - see also the channel topic. 
[21:45] cyberviking: cannot fork seems like a bunch of processess staggering [21:46] cyberviking: pastebin output of uptime as well [21:47] the command is not so impressive but it crash, just one to know how time Googlebot was there [21:47] cat /var/log/apache2/other_vhosts_access.log|grep "15/Jul"|grep -v "Googlebot"|wc -l [21:47] uptime : 23:47:18 up 3:11, 1 user, load average: 0.00, 0.00, 0.00 [21:48] so probably no disk issues [21:48] but now swap? [21:48] forget the "-v" on grep above of course :p [21:49] it's a VPS [21:49] with no swap [21:49] Swap: 0 0 0 [21:55] I can shutdown apache, execute this command and start apache again it works ^^. But I want to understand what the hell happen here. [21:56] pastebin ps axfv [21:57] the only difference is without apache -/+ buffers/cache: 11092 2086060 [21:57] and with apache : -/+ buffers/cache: 15912 2081240 [21:58] should be no difference [21:58] I know :s, but it's not :D === arrrghhh is now known as arrrghhhAWAY [22:14] i've got a bit of a strange situation with memory (potentially a swap thing) [22:14] i have a bunch of servers running with 16gb of ram available... they have a leak and when they get somewhere above 1GB, they get restarted [22:15] but for some reason, freeing of that memory seems to make the whole machine spike in cpu usage, and slows everything WAY DOWN while it happens [22:15] i was thinking maybe tuning the swappiness might be the solution, but does anyone have an idea what I should be looking for? [22:29] heh? [22:29] why would you think this is a swap issue? [22:29] where is a pastebin with any results that back this up? 
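(Editor's sketch, not part of the log.) The pipeline quoted above starts four processes (cat, grep, grep, wc), and each one needs a successful fork. A single awk pass gives the same count with one child process. This only lowers the number of forks; it does not explain or remove whatever limit the VPS enforces. One unconfirmed possibility, assuming the VPS is an OpenVZ-style container, is a per-container memory limit that `free` does not show. The helper name `count_hits` and its arguments are hypothetical. Following the correction in the log, the `-v` is dropped, so lines that do mention the agent are counted.

```shell
#!/bin/sh
# count_hits LOG DAY AGENT
# Hypothetical helper: count the lines in LOG that contain both DAY
# (e.g. "15/Jul") and AGENT (e.g. "Googlebot") as plain substrings.
# One awk process replaces cat | grep | grep | wc -l.
count_hits() {
    log_file=$1
    day=$2
    agent=$3
    # index() returns 0 when the substring is absent, so both tests
    # must succeed for the line to be counted. "n + 0" prints 0
    # instead of an empty string when nothing matched.
    awk -v day="$day" -v agent="$agent" \
        'index($0, day) && index($0, agent) { n++ } END { print n + 0 }' \
        "$log_file"
}
```

With the path from the log this would be called as `count_hits /var/log/apache2/other_vhosts_access.log "15/Jul" "Googlebot"`.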
[22:30] cause if you have 16gig ram, and you reboot them when they > 1gb ram, you have personal issues, not swap issues [22:32] each process has 1gb of ram [22:32] there are 14 server instances running [22:32] im now running sysstat so I can get some stats next time i see the issue [22:32] its a custom ruby / c game server [22:33] really, all you need to do is run vmstat, and maybe free, and probably ps axl, when you are having the issue [22:33] to tell if you have a swap issue or not [22:33] what would i want to look for? [22:33] something wrong [22:34] jamespage: is there any particular reason for openvswitch package not using upstart? [22:34] does that symptom seem indicative of a swapping issue? [22:34] jsonperl, the issue is unknown yet, as you have not described anything [22:34] you said a cpu spike, swap issues don't cause cpu spikes, they cause disk spikes [22:35] so far, that is the only clue given [22:35] sure ok here [22:36] what is nice, is to use something like munin, so you know what it *normally* looks like [22:36] then you can tell what changed [22:36] Basically all server activity drops to 0 [22:36] sysstat does it also, I just never used it [22:36] i have charts of core usage [22:36] i basially persist mpstat to db [22:36] it only starts happening once servers cycle... and release a lot of memory [22:36] I'll paste one somewhere and link [22:37] mpstat only gives cpu info [22:37] http://picpaste.com/pics/Screen_Shot_2013-07-13_at_10.19.40_PM-AS1JtSXk.1373927861.png [22:38] Cpu is clearly a problem here [22:38] so cpu usage drops to bottom [22:38] that says cpu is NOT the issue [22:38] so again, we have no idea [22:38] What are some potential reasons for that [22:38] heavy IO wait time? 
[22:38] I could list you at least a few million [22:39] but there is no point [22:39] there's not much running on the machine [22:39] pretty much just these servers [22:39] this is why you need to record all basic stats [22:39] disk i/o, memory, cpu [22:39] all in reference to each other [22:39] sysstat is doing that for me now [22:39] other things, if this is a vm [22:39] it might have nothing to do with you [22:40] it's not a vm [22:40] physical machine [22:40] all mine [22:53] Patrickdk ok I'm collecting stats on the minute now [22:53] hopefully I'll see something interesting [22:53] this sucks [22:54] I run deepworld btw... fun game if you have a mac or ios device
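(Editor's sketch, not part of the log.) The advice above is to record memory, swap and CPU figures side by side, so that a stall can be matched against what changed. The sketch below reads the Linux `/proc/meminfo` and `/proc/stat` files directly. The helper names `mem_snapshot` and `cpu_snapshot` are hypothetical, and the optional path argument exists only so the functions can be tried on a saved copy. Tools such as sysstat or munin, both mentioned in the log, do the same job more thoroughly.

```shell
#!/bin/sh
# mem_snapshot [FILE]
# Hypothetical helper: print free memory and used swap (in kB) from a
# meminfo-format file; defaults to /proc/meminfo.
mem_snapshot() {
    meminfo=${1:-/proc/meminfo}
    awk '
        $1 == "MemFree:"   { free = $2 }
        $1 == "SwapTotal:" { stotal = $2 }
        $1 == "SwapFree:"  { sfree = $2 }
        END { print "mem_free_kb=" (free + 0) " swap_used_kb=" (stotal - sfree) }
    ' "$meminfo"
}

# cpu_snapshot [FILE]
# Hypothetical helper: print the cumulative idle and iowait tick
# counters from the aggregate "cpu" line of a stat-format file;
# defaults to /proc/stat. Fields after the label are
# user nice system idle iowait ..., so idle is $5 and iowait is $6.
cpu_snapshot() {
    stat=${1:-/proc/stat}
    awk '$1 == "cpu" { print "cpu_idle_ticks=" $5 " cpu_iowait_ticks=" $6 }' "$stat"
}

# Example logging loop (left commented out so sourcing this file does
# not block); one line per minute, counters are cumulative, so compare
# consecutive lines:
#   while sleep 60; do
#       echo "$(date +%s) $(mem_snapshot) $(cpu_snapshot)"
#   done >> stats.log
```

If `cpu_iowait_ticks` climbs quickly between two samples while CPU activity is flat, that would point toward disk wait. If `swap_used_kb` stays at 0 throughout, swap tuning is unlikely to be the answer.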