/srv/irclogs.ubuntu.com/2017/11/10/#ubuntu-server.txt

=== hggdh is now known as WieWoWasWozu
cpaelzer	good morning	06:31
lordievader	Good morning	07:13
Slashman	hello, I think that I'm hitting a bug with the latest openjdk, I would like to rollback to the previous version, is there any way to do that ?	11:05
joelio	Slashman: within a point release or new version	11:17
joelio	but fundamentally, yes	11:17
=== andreas is now known as ahasenack
Slashman	joelio: apt-cache policy only show me 8u151-b12-0ubuntu0.16.04.2 or 8u77-b03-3ubuntu3	11:19
joelio	it may still be in your /var/cache/apt/archives dir	11:19
joelio	in which case you can dpkg -i it	11:20
joelio	I take it it's 8 series you need?	11:20
Slashman	joelio: no trace of it in /var/cache/apt/archives	11:28
Slashman	joelio: I used the oracle jdk instead, since any tar.gz can be downloaded from their website... it's a little sad that I have to switch to that to use a different version	11:29
joelio	the oracle jdk is available as a package too	11:32
joelio	https://launchpad.net/~webupd8team/+archive/ubuntu/java	11:32
joelio	Slashman: also, you have probably lost that specific version due to a security update	11:39
joelio	it's... java at the end of the day ;)	11:39
Slashman	I prefer to have several different java version to test it, but it doesn't seem to come from the jvm in the end	11:41
Slashman	I have "fork: retry: Resource temporarily unavailable" with 40GB free ram and plenty of none of limits that I know of breached	11:42
Slashman	I have "fork: retry: Resource temporarily unavailable" with 40GB free ram and none of limits that I know of breached	11:42
Slashman	the result is "java.lang.OutOfMemoryError: unable to create new native thread"	11:43
Slashman	with 40GB ram free and 100GB unused swap, it should be some kind of limit... but I don't see which	11:44
joelio	Slashman: you need to set it in the jvm options	12:11
joelio	as the heap is a value you set, depending on the resource needed	12:11
joelio	also be aware that the garbage collection mode changes when you set it above 4GB	12:12
joelio	but that will probably be your issue	12:12
joelio	Slashman: what's the application you're using in java (usually there is an /etc/default/{thing} that allows you to tune)	12:15
joelio	or in things like Elasticsearch there is a jvm.options file nowadays	12:15
faekjarz	Hi! [17.10 server / netplan] I want one of my NICs configured but booting into link DOWN. Where do i find the information / documentation to achieve this?	13:23
tomreyn	faekjarz: try asking in #netplan if you can't get help here	13:26
tomreyn	documentation should be in 'man 5 netplan' and online at http://people.canonical.com/~mtrudel/netplan/	13:27
tomreyn	...according to https://wiki.ubuntu.com/Netplan	13:27
faekjarz	tomreyn: aye, i did already, but my pesky impatience ;)	13:28
tomreyn	this is the first time i heard of it, i assume it's fairly new	13:28
faekjarz	yes, i've found ~mtrudel/netplan already but ctrl+f link doesn't highlight what i'm looking for. Wrong keyword?	13:32
Slashman	joelio: that's not a jvm issue, the error happens even if I try to run "java -version"	13:35
Slashman	not sizing I meant, it's production software, that's not a new service or anything	13:35
Slashman	joelio: you can see the error when trying to run "java -version" here: https://apaste.info/TIIb	13:37
joelio	Slashman: have you tuned the heap, otherwise you'll be running on the default.	13:39
joelio	err, if java -version is broken then I don't know, sounds like it's fubar	13:39
Slashman	some system limit are reached, that's how I interpret it	13:40
Slashman	but nproc, nfile, etc are far from their limit	13:40
joelio	umm, you said you'd used oracle version?	13:42
joelio	vm_info: OpenJDK 64-Bit Server VM (25.151-b12) for linux-amd64 JRE (1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12), built on Oct 27 2017 21:59:02 by "buildd" with gcc 5.4.0 20160609	13:42
Slashman	I'm using openjdk atm	13:43
Slashman	1.8.0_151 build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12	13:43
joelio	so what that output is literally from a java -version ?	13:43
joelio	(in the pastebin)	13:43
Slashman	https://apaste.info/mAn3	13:43
joelio	eh, you said it was failing	13:43
joelio	13:35 < Slashman> joelio: that's not a jvm issue, the error happens even if I try to run "java -version"	13:44
joelio	have you checked your heap space?	13:44
joelio	this is a fairly common thing to do	13:44
joelio	i.e. increase heap allocated to a java application	13:44
joelio	have to do it on stuff deployed here, as the 256Mb is pretty low	13:44
Slashman	I'll save you some time, this is a corporate production server, the prod JVMs are tuned, the server had no issue, we serve thousands of connections per JVM. I'm trying to understand why we now have an issue where we get JVM errors and "java -version" doesn't work anymore. The problem happened this morning and we had to restart 3 of the 7 JVMs; after restarting just one, the problem goes away, for a time	13:47
Slashman	so it works for a time and suddenly we see "java.lang.OutOfMemoryError: unable to create new native thread" in tomcat logs when there is available memory, both in the heap and on the os	13:48
Slashman	only solution is to restart tomcat at this point	13:49
Slashman	so, since this problem is "new", my guess is that we have reached some system limit, that is not the amount of available memory	13:50
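A minimal sketch of how the "some system limit" guess above could be checked. The log does not say which limit was hit, so the list below is an assumption: these are standard Linux task-count limits that can make thread creation fail while memory is still free. The function name `show_thread_limits` and the optional unit-name argument are hypothetical.

```shell
#!/bin/sh
# Print task/thread-related limits that are independent of free memory.
# All paths are standard Linux procfs entries; "n/a" means not readable here.

read_proc() {
    if [ -r "$1" ]; then cat "$1"; else echo "n/a"; fi
}

show_thread_limits() {
    # Soft "Max processes" limit of the current shell (same value as nproc).
    soft=$(awk '/^Max processes/ { print $3 }' /proc/self/limits 2>/dev/null)
    echo "max processes (soft): ${soft:-n/a}"
    echo "kernel.threads-max: $(read_proc /proc/sys/kernel/threads-max)"
    echo "kernel.pid_max: $(read_proc /proc/sys/kernel/pid_max)"
    echo "vm.max_map_count: $(read_proc /proc/sys/vm/max_map_count)"

    # Count every task (thread) currently visible in /proc.
    n=0
    for t in /proc/[0-9]*/task/[0-9]*; do
        [ -e "$t" ] && n=$((n + 1))
    done
    echo "threads in use: $n"

    # Optional: the systemd task limit of a unit, if a unit name was given.
    if [ -n "$1" ] && command -v systemctl >/dev/null 2>&1; then
        systemctl show -p TasksMax "$1" 2>/dev/null || echo "TasksMax=n/a"
    fi
}

show_thread_limits
```

Running the same function on a Debian host without the issue and on an affected Ubuntu host gives values to compare side by side. Passing a unit name, for example `show_thread_limits tomcat8` (the unit name is a guess), also prints that unit's `TasksMax` where systemd is in use.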
Slashman	it may have been a bug in the new java version but I just confirmed that we have the issue on the previous one too, so that's not it	13:53
=== albech1 is now known as albech
joelio	Slashman: memtest?	13:56
joelio	or is this on multiple boxes?	13:57
Slashman	not possible right now, but the server iDrac doesn't report any ECC issue, I'll have to try it	13:57
Slashman	joelio: it happens on 2 servers, you're right, doesn't seem related to hardware	13:57
Slashman	maybe it's ubuntu related, I have some debian servers without the issue with the same config, I'll compare the sysctl values...	13:58
joelio	yea, also check process list output to see what it's been instantiated with, just to make sure those values are being set	13:59
Slashman	what do you mean by that?	14:01
joelio	the ps -ef / auxx output for the java process	14:07
joelio	check how it's been instantiated	14:08
joelio	maybe there is a subtle difference	14:08
Slashman	oh, that's the exact same line except pid number ofc	14:12
=== JanC is now known as Guest54856
=== JanC_ is now known as JanC
drab	ikonia: sdeziel: fwiw, I got ldirectord working with 3 containers, 1 director, 2 real servers	16:06
drab	couple things are broken in the default pkg so took me a while, but it otherwise works very well	16:07
drab	systemd unit has a bad path and the pkg actually won't install at all	16:07
drab	as there's a race condition with the config file	16:07
sdeziel	drab: glad to hear that. I'd have to look into ldirectord as it's new to me	16:09
drab	there's still something I don't understand as far as networking goes tho, several of the howtos seemed to say I had to set the director as the default gw for the real servers, but I didn't	16:09
drab	I guess it seems more lightweight than haproxy/nginx for just pure tcp/udp connections	16:09
drab	and a lot simpler, which is all I need	16:09
sdeziel	drab: the default gw thing is related to the asymmetric routing we talked about the other day	16:11
sdeziel	drab: could you share your ldirectord config?	16:13
sdeziel	http://www.linuxvirtualserver.org/docs/ha/ultramonkey.html shows that it can "masq"uerade real servers so that could be why you got away without changing the default gw	16:14
drab	well, you'd think, yeah, but actually when you look at how things are set up I don't get it	16:14
drab	I don't do masq, I do "gate" which is direct routing	16:14
drab	the caveat in most howtos, for the real server to respond, is that a non arping interface needs to get the VIP	16:15
drab	so often lo:0 gets the VIP/32, that's how I've seen it in most howtos, and it makes sense	16:15
drab	box won't arp for that (you need to tweak sysctl), and it will still accept connections for that ip since it sees it as local	16:15
drab	responses will also originate from that ip since it's the obvious selection given that the request was received for it	16:16
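A rough sketch of the real-server side described above, written as a dry run that only prints the commands. The VIP value is a documentation address, and the two `arp_ignore` / `arp_announce` settings are an assumption: they are the values commonly used for LVS direct routing, but the log only says that sysctl needs tweaking. The function name is hypothetical.

```shell
#!/bin/sh
# Print (do not execute) the commands for a direct-routing real server:
# bind the VIP to a loopback alias and stop the host from ARPing for it.

real_server_cmds() {
    vip="$1"
    # VIP as a /32 on lo:0, so the host accepts traffic for it as local.
    echo "ip addr add ${vip}/32 dev lo label lo:0"
    # Assumed sysctl values: only answer ARP for addresses on the
    # receiving interface, and never announce the VIP as a source.
    echo "sysctl -w net.ipv4.conf.all.arp_ignore=1"
    echo "sysctl -w net.ipv4.conf.all.arp_announce=2"
}

real_server_cmds 192.0.2.10
```

Printing instead of executing keeps the sketch safe to run anywhere; the output can be reviewed and then run as root on each real server.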
drab	so it all seems to make sense and it works just fine, leaving me puzzled why I should be setting the gw to the director	16:16
drab	which several howtos I found mention	16:16
drab	altho they are all at least 5-7yrs old	16:17
sdeziel	yeah, everything in that space seems pretty dated documentation-wise	16:17
sdeziel	isn't ldirectord for http/https backends only though?	16:18
drab	well ldirectord actually really only check that backends are alive, it doesn't even do any of the switching etc	16:19
drab	that's done in kernel by ip_vs, which you have to modprobe	16:19
drab	so technically you can just install ipvsadm and modprobe ip_vs and you're done	16:19
drab	as far as balancing goes	16:19
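A small sketch of what "just ipvsadm and ip_vs" could look like on the director, again as a dry run that prints commands rather than running them. The addresses, the port, and the round-robin scheduler (`-s rr`) are assumptions for illustration; `-g` selects gatewaying, i.e. the "gate" direct-routing mode mentioned earlier. The function name is hypothetical.

```shell
#!/bin/sh
# Print (do not execute) the commands that set up one TCP virtual service
# with any number of real servers in direct-routing mode.

director_cmds() {
    vip="$1"
    port="$2"
    shift 2
    echo "modprobe ip_vs"
    # Add the virtual service (scheduler choice is an assumption).
    echo "ipvsadm -A -t ${vip}:${port} -s rr"
    # Add each real server with -g (gatewaying / direct routing).
    for rip in "$@"; do
        echo "ipvsadm -a -t ${vip}:${port} -r ${rip}:${port} -g"
    done
}

director_cmds 192.0.2.10 80 192.0.2.21 192.0.2.22
```

Nothing here monitors the backends; as noted in the conversation, that part still has to come from something like ldirectord or keepalived.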
sdeziel	OK so only the health checks, right	16:19
drab	but that won't give you any monitoring of the backends. the monitoring part seems to be http in many examples, but maybe not	16:20
sdeziel	so yeah, much lighter than a user space proxy like HAproxy/nginx	16:20
drab	also ldirectord is just one perl script... which well, it's perl, but it's a single script	16:20
drab	sdeziel: and it happens in kernel space	16:20
drab	in theory this could simply be plugged into nagios/icinga/whatever monitoring system	16:21
sdeziel	yeah, got that :)	16:21
drab	a nagios event_handler could run ipvsadm and remove the backend or something	16:21
drab	it would be trivial to implement	16:21
sdeziel	is your VIP actually movable between 2 or more boxes?	16:21
drab	I haven't tried that part yet, it's next on the list, testing one component at a time	16:22
drab	gonna give keepalived a shot	16:22
sdeziel	keepalived integrates nicely with IPVS	16:23
sdeziel	and keepalived can run whatever health check you want it to	16:23
sdeziel	no need to mess with nagios handlers	16:23
drab	I'm actually pretty happy to have figured this one out, because even in the case of exposing containers and whatnot this is now really straightforward, no iptables or other stuff	16:23
drab	seems I can just do straight ip_vs and fiddle with ipvsadm and I'm done	16:24
drab	set that up on the baremetal maybe and redirect to whatever containers on it at will swapping things around in just a oneliner	16:24
drab	sdeziel: keepalived will take care of the VIP, not the realservers, for those you still need something else like ldirectord or nagios	16:28
drab	or whatever	16:28
drab	my point was, ldirectord isn't technically needed to get the balancing part going, I thought it was	16:28
sdeziel	http://manpages.ubuntu.com/manpages/xenial/man5/keepalived.conf.5.html see the LVS section	16:29
drab	yeah, looks like I was wrong, so maybe ipvs + keepalived is all that's needed to both manage the VIP on the directors and manage the real servers.	16:36
drab	that's great, one less component, thanks	16:36
sdeziel	ldirectord seems to be responsible for monitoring and tapping into ipvsadm whenever needed	16:37
sdeziel	keepalived on the other hand provides a wrapper on top of ipvsadm and also handles monitoring	16:37
drab	which keepalived seems to be capable of doing too, no? that's how I read that section	16:38
drab	right	16:38
sdeziel	yeah	16:38
sdeziel	so if you setup with keepalived it should give you all the features you need	16:39
sdeziel	and you won't have a SPOF anymore	16:40
drab	yeah, that should be fine, what I'm most concerned about is the zfs snapshot part that comes after that	16:53
drab	I've redone the lxd hosts so that /var/lib/lxd is on zfs itself, that way I can send snapshots over to a backup host and have all containers set up in a single swoop	16:54
drab	but in that case they are going to have the same ips, so they need to be stopped until it's time	16:55
drab	now that I have lvs I'm wondering if instead I should have the containers on diff ips/not synced and just sync the data DS	16:56
drab	haven't thought that through quite yet	16:56
sdeziel	what do you mean same IPs? your 2 lxd hosts?	16:57
drab	the containers	16:57
drab	if I put them on zfs and send the snaps to the failover lxd server	16:57
drab	then all configs will be the same including mac and ip	16:57
drab	so if they come up I have a conflict	16:58
sdeziel	I'm assuming you'll "lxc copy" them, right?	16:58
sdeziel	but yeah, the same instance can/should be up only once	16:59
drab	I wasn't planning on it, I was planning on setting up zfs-backup-snapshot or something like that	16:59
drab	since all lxd is zfs backed up	16:59
drab	that way I don't have to make a difference between lxd or other data stored on zfs	16:59
sdeziel	unless you use the PPA/backports, I think there is no easy way to adopt a zfs	16:59
sdeziel	hence the suggestion of lxc copy	17:00
drab	how do you mean? adopt a zfs, that is	17:00
sdeziel	say you zfs send/receive the container's FS, the receiving lxd host won't be able to start it as is	17:00
drab	why not?	17:01
drab	I create a DS which I mount on /var/lib/lxd and then the default storage pool for lxc is a LXD DS. so both containers data and lxd config gets moved over to the receiving host	17:01
drab	by simply snapshotting everything and sending it	17:02
drab	all names and whatnot are consistent, the only diff is the ip of the lxd host and its hostname, that's about it	17:02
drab	am I missing something?	17:02
sdeziel	I guess this would work if you flip everything at once	17:04
sdeziel	but if you want to do it per container that's where you will need a different solution	17:04
drab	this also means that ldirectord/keepalived wouldn't have to change since ips would be the same, it's basically almost cloning the whole system, which is quite neat, the only issue is the turn on/off	17:04
drab	right	17:04
drab	that's what I'm debating, if I will corner myself... but at the same time this keeps it pretty simple and this is a charity, not a tech company	17:05
drab	I just want them to have a decent failover solution and data in a diff physical place	17:05
sdeziel	zfs send/receive should cut it then	17:07
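A minimal sketch of the snapshot-and-send approach discussed above, as a dry run that prints the commands. The dataset name `tank/lxd`, the snapshot name, and the host name `backuphost` are all made up for illustration; the log does not give the real ones. The function name is hypothetical.

```shell
#!/bin/sh
# Print (do not execute) a recursive snapshot plus a full replication
# stream of one dataset tree to a backup host over ssh.

replicate_cmds() {
    dataset="$1"
    snap="$2"
    host="$3"
    # -r snapshots the dataset and all of its children in one go.
    echo "zfs snapshot -r ${dataset}@${snap}"
    # -R sends the whole tree; -F lets the receiver roll back to match.
    echo "zfs send -R ${dataset}@${snap} | ssh ${host} zfs receive -F ${dataset}"
}

replicate_cmds tank/lxd 2017-11-10 backuphost
```

Because the received containers keep the same MAC and IP addresses, they would need to stay stopped on the backup host until failover, as drab points out.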
drab	are you using sanoid/syncoid by any chance? seems one of the common solutions to deal with that stuff	17:08
sdeziel	drab: sanoid+syncoid might assist you with that	17:08
drab	:)	17:08
sdeziel	if you want to venture into new territories, you can take a look at DRBD (~RAID1 over the network). Pretty nice	17:09
drab	yeah there was a thread about that on the zfsonnix ML, zfs + DRBD, I think it's more than they need	17:12
drab	in fact simply telling them "if something happens turn off and keep off this machine" is possibly a very good place to be for them	17:13
=== CodeMouse92 is now known as CodeMouse92__
drab	sdeziel: did you look at http://www.znapzend.org/ by any chance?	18:38
drab	or even the "official" https://github.com/zfsonlinux/zfs-auto-snapshot	18:38
sdeziel	drab: no first time I hear about znap	18:42
drab	I like their README quite a bit: https://github.com/mikalsande/znap	18:44
drab	and it's all bash, not a perl guy	18:44
drab	only caveat is, the author says he only uses fbsd so it's only really tested there	18:45
=== jelly-home is now known as jelly
hallyn	cpaelzer: why is https://launchpad.net/~ubuntu-virt/+archive/ubuntu/virt-daily-upstream disabled? :)	22:07
Pinkamena_D	How to find what is causing 'Device or resource busy'? I am trying to move /home so that I can overwrite it with something else. However, I get "device or Resource busy" . I am su to root and there is nothing in lsof | grep home . my cwd is in /	22:17
drab	Pinkamena_D: is home on a diff drive?	22:18
Pinkamena_D	no	22:18
drab	an mv command is saying resource busy?	22:18
Pinkamena_D	yes	22:18
drab	whups, got disconnected	22:21
drab	don't know if my msgs went through	22:21
Pinkamena_D	no, did not get any >.>	22:21
drab	Pinkamena_D: I was asking, did you log in with ur user and then su to root?	22:21
Pinkamena_D	sorry about it	22:21
drab	could you login directly with root and try again?	22:22
Pinkamena_D	thats correct, root login is disabled	22:22
drab	if your user's home is in /home then you can see the problem	22:22
Pinkamena_D	there is no way to remove all handles so thats not an issue?	22:23
drab	there's probably a ref to that somewhere that's giving you the error even tho your user is technically not opening any file	22:23
drab	altho that's just my guess, I don't think I've personally run into that before	22:23
drab	Pinkamena_D: what you could do is to change your homedir temporarily	22:25
drab	so say mkdir /var/tmp/tempuser	22:25
drab	change your homedir to that, logout, log back in, try again	22:25
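A hedged sketch of two extra checks for the "Device or resource busy" question above. Neither check comes from the conversation: the first assumes the error may come from /home being a mount point (for example a bind mount), the second lists processes whose working directory is still inside /home. Both function names are hypothetical.

```shell
#!/bin/sh
# Succeeds if the path is listed as a mount point in the mounts table
# (defaults to /proc/mounts; a different file can be passed for testing).
is_mount_point() {
    path="$1"
    table="${2:-/proc/mounts}"
    awk -v p="$path" '$2 == p { found = 1 } END { exit !found }' "$table"
}

# Print the PIDs whose current working directory is at or below the path.
pids_with_cwd_under() {
    for link in /proc/[0-9]*/cwd; do
        cwd=$(readlink "$link" 2>/dev/null) || continue
        case "$cwd" in
            "$1"|"$1"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}

if is_mount_point /home; then
    echo "/home is a mount point"
else
    echo "/home is not a mount point"
fi
pids_with_cwd_under /home
```

Run as root so that the `cwd` links of other users' processes are readable; a login shell that was started in the user's home before `su` would show up in the second list.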
