/srv/irclogs.ubuntu.com/2019/06/20/#ubuntu-server.txt

jamespagemdeslaur: cosmic ceph packages also tested OK05:35
zetherooon several (but not all) of our Ubuntu Server 18.04 installs the hostname resolution seems to intermittently fail and only 'service systemd-resolved restart' will get it working again. Any ideas as to what can cause this kind of behavior?07:12
blackflowzetheroo: can you give an example of how it fails, and what the systemd-resolve --status is  (did the upstream NS change?)07:22
blackflowmeanwhile... due to resolved's stupid design not to obey the list of NS entries given, and many other quirks, my recommendation is always to drop systemd-resolved, esp. on prod servers where DNS config is static and must be consistently reliable (ie. no roaming, changing networks, etc...)07:24
zetherooblackflow: when trying to ping a hostname you get this 'ping: hostname: Temporary failure in name resolution'07:25
blackflowthere's dnsmasq, unbound, bind, powerdns... pick your poison if you need a local caching, recursive resolver. If not, just statically configure /etc/resolv.conf (and mask out systemd-resolved)07:25
TJ-zetheroo: yes, the reason is that when resolved has a list of DNS servers and one is unreachable it'll move onto, and remain using, another in the list. If you've got 'private' DNS as well as public that can then break the ability to resolve public names, for example07:26
TJ-zetheroo: if you do have 'private' DNS (for LAN say) that should be set as on-link only so it isn't used globally07:27
zetherooThis is the resolv.conf and netplan config from two systems where this happens: https://paste.ubuntu.com/p/mDjnJwQSxm/07:29
blackflowzetheroo: ah you have .local . systemd-resolved won't work with those unless you have mDNS in your network07:30
zetherooblackflow: wdym? DNS does work ... 90% of the time ... it just goes dead after some random time ... or something is fritzing it out07:32
blackflowsystemd-resolved doesn't resolve .local names, and you have mt.local in search so I assumed you're querying for .local names?07:32
TJ-zetheroo: what does "systemd-resolve --status" show?07:36
TJ-The problem I see there is you've got both private and public DNS servers listed "addresses: [192.168.81.9, 1.1.1.1]"07:37
zetheroohttps://paste.ubuntu.com/p/j9NjmcPFQ3/07:38
TJ-zetheroo: 1.1.1.1, if used, is NOT going to resolve .local addresses whereas 192.168.81.9 presumably will07:38
TJ-zetheroo: so, if at some point resolved cannot get a response from 192.168.81.9 it'll switch to 1.1.1.1 at which point .local names won't be resolved07:39
TJ-zetheroo: is that what is happening? local names fail? or is it public names that fail?07:39
zetherooTJ-: ah, so if at any point the internal DNS server (192.168.81.9) doesn't work, resolved will use the next one (1.1.1.1) and from then on ignore the internal one?07:39
zetherooTJ-: honestly I didn't try to reach an external hostname ... only internal ones.07:40
zetherooblackflow: just doing 'ping hostname' resolves fine to 'hostname.mt.local'07:41
zetherooblackflow: 'ping hostname.mt.local' also works fine07:42
zetherooTJ-: systemd-resolve --status -> https://paste.ubuntu.com/p/j9NjmcPFQ3/07:43
TJ-zetheroo: I think the problem here is you're trying to use netplan to do something it is unable to do; what you need, from what I can see, is a systemd-resolved 'global' DNS server 1.1.1.1 set in resolved.conf (DNS=1.1.1.1) and then only have the 192.168.81.9 in the netplan config07:46
zetherooI guess we are using netplan like we were using interfaces.conf07:47
blackflowTJ-: wait, both are in the netplan config?07:47
TJ-zetheroo: that should help but if the internal DNS server is unreachable for reasons that is what you ought to focus on because if that is dying then there'll be no mt.local resolution anyhow07:48
TJ-zetheroo: DNS should be a highly available service with fast response times; unfortunately it's not given the respect it deserves in many LANs07:49
TJ-often bundled as a service with many others on some poor overloaded system :)07:49
zetherooTJ-: ok, but what actually causes resolved to switch to the secondary DNS? is it a timeout? if so ... what time limit?07:49
TJ-zetheroo: connection timeout if I recall correctly07:49
TJ-zetheroo: remember that DNS uses UDP in the main, and the U stands for Unreliable :)  UDP can be dropped by routers under pressure07:50
zetherooI'm just trying to get an idea of how critical the "drop" in DNS is for resolved - again we have dozens of Ubuntu systems, and this is only happening on 4.07:54
TJ-zetheroo: I'd check/monitor their network links07:57
zetheroothat's the thing ... 2 of the systems (the ones in the pastebin) are standalone hardware systems, and the other two are VMs on our virtualization, which are living with other Ubuntu VMs which don't do this ...07:59
blackflow(UDP is primarily used, but TCP must be allowed for requests and responses larger than single packet size)07:59
zetheroobut, OK, if we remove the external DNS address from the netplan config it should never ignore the internal DNS server entirely - right?08:00
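
A minimal sketch of the split TJ- suggests above, assuming the interface is named eno1 (a hypothetical name; the real one is in the pasted netplan config). The public resolver goes into /etc/systemd/resolved.conf as the global server:

```ini
# /etc/systemd/resolved.conf
[Resolve]
DNS=1.1.1.1
```

The netplan file then lists only the internal server and the search domain:

```yaml
network:
  version: 2
  ethernets:
    eno1:                          # hypothetical interface name
      # addressing keys (dhcp4, addresses, ...) omitted here
      nameservers:
        search: [mt.local]
        addresses: [192.168.81.9]  # internal DNS only
```

After 'sudo netplan apply' and 'sudo systemctl restart systemd-resolved', 'systemd-resolve --status' should list 1.1.1.1 under Global and 192.168.81.9 under the link. The log does not confirm whether this stopped the intermittent failures.
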
zetherooI normally use 'apt autoremove' to free up space in general, and it also frees up space in /boot by removing old kernels, but is there another way to specifically clean up /boot?09:15
TJ-zetheroo: clean up? or increase freespace?09:16
TJ-zetheroo: you could make the initrd.img smaller, /etc/initramfs-tools/initramfs.conf MODULES=dep09:19
zetheroowell it seems that even after the old kernels are removed there is still a bunch of files from those kernels in /boot09:20
zetheroohttps://paste.ubuntu.com/p/rMRxjb55vy/09:20
blackflowzetheroo: you have to explicitly remove the 4.13 kernel (wth btw). autoremove only keeps current and current-1 version of currently (heh) running kernel09:21
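
A small read-only sketch for the /boot question above: it lists the kernel versions that still have a vmlinuz file in a boot directory, other than the one to keep. The function name other_kernel_versions is made up for illustration, and nothing is removed by it.

```shell
#!/bin/sh
# Print the kernel versions found in a boot directory, one per line,
# skipping the version given as $2 (normally the running kernel).
# Hypothetical helper; it only reads file names and removes nothing.
other_kernel_versions() {
    boot_dir="$1"
    keep="$2"
    for f in "$boot_dir"/vmlinuz-*; do
        [ -e "$f" ] || continue            # glob matched nothing
        ver="${f##*/vmlinuz-}"             # strip directory and "vmlinuz-" prefix
        [ "$ver" = "$keep" ] || printf '%s\n' "$ver"
    done
}

# Versions present in /boot besides the running kernel.
other_kernel_versions /boot "$(uname -r)"
```

Each unwanted version can then be purged explicitly, as blackflow describes, with something like 'sudo apt purge linux-image-<version>' (the exact package names depend on the installed kernel flavour). TJ-'s alternative is to shrink the initrd instead: set MODULES=dep in /etc/initramfs-tools/initramfs.conf and regenerate with 'sudo update-initramfs -u'.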
mdeslaurjamespage: ack, thanks. xenial next?11:09
supamanI have an fstab entry to mount a smb share, when running mount -a I get that there is an error in options to the smb mount, here is the options, can someone see the error? credentials=/root/.smbcredentials,iocharset=utf8,sec=ntlm11:55
supamanhmmm ... removing sec=ntlm fixes the problem11:58
supamansetting sec=ntlmssp works12:01
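
For reference, a full fstab line with the options that worked for supaman might look like the following. The server name, share name, and mount point are placeholders, since the log shows only the options field:

```
# <share>         <mount point>  <type>  <options>                                                        <dump> <pass>
//server/share    /mnt/share     cifs    credentials=/root/.smbcredentials,iocharset=utf8,sec=ntlmssp    0      0
```

'sudo mount -a' re-reads fstab and reports any option error, as in the original attempt.
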
=== SoniEx2 is now known as Soni
kinghathttps://arstechnica.com/gadgets/2019/06/zfs-features-bugfixes-0-8-1/16:01
=== Casper26_ is now known as Casper26
=== kklimonda_ is now known as kklimonda
=== MarkMaglana_ is now known as MarkMaglana
=== yokel_ is now known as yokel
mybalzitchtoo bad kernel devs are working overtime to neuter ZFS16:03
kinghatwhy is that?16:20
=== jelly-home is now known as jelly
sarnoldkinghat: the last paragraph here describes the mood well https://marc.info/?l=linux-kernel&m=15471451683238917:10
kinghatso because of the GPL?17:12
tomreynbecause of the CDDL17:16
swillsi saw the author of that ars article give a pretty decent talk on zfs, part of which talked about how the GPL is unenforceable17:29
swillsall the stuff with the SFC vs SFLC etc17:29
lordcirthWhile I understand that supporting CDDL is annoying, I find it odd that they don't get that ZFS is good and people want it, regardless of what Sun wanted.17:30
lordcirthIf they want to make btrfs raid5/6 stable, I'd be ok with that, but it's just not there.17:30
swillsi think they do get that, but they also get that they are between a rock and a hard place17:30
swillsand facebook is doing a lot of work on btrfs17:30
lordcirthBtrfs is quite nice for root partitions (way easier than ZFS root on most distros, even Ubuntu) but I need raid5 in order to use it for a lot of use cases17:32
swillszfs on root is automatic on FreeBSD, fwiw17:33
swillsbut anyway, look at all the talk about btrfs in recent LWN coverage of LSFMM17:33
lordcirthI'm aware, but that's not really an option for a lot of systems17:33
swillsso i dunno, i think ultimately what's going to happen is everyone is going to be told "btrfs is good enough, use it, not ZFS" and "if you want to use ZFS, don't expect it to get easier"17:34
swillsbut i could be wrong17:35
kinghatyou have to set the trim flag with zfs, when setting up an ext4 fs on an ssd is it automagic?17:35
tomreynthere's an fstrim systemd timer17:36
lordcirthkinghat, iirc Ubuntu does scheduled trim, not instant trim17:36
kinghatis one better than the others?17:36
kinghatother*17:37
lordcirthscheduled is better, assuming you don't fill the entire drive with garbage before it can trigger17:37
lordcirthIn general, at least17:37
kinghatoh wow, it's weekly.17:38
tomreynthe "discard" mount option can cause serious I/O problems with some SSD / NVMEs.17:38
tomreynhttps://wiki.debian.org/SSDOptimization#WARNING17:39
swillsfun fact, some drives are slow at trim, so turning it on can make your disk seem slower17:40
swillsperhaps scheduled trim helps with that, i dunno17:40
kinghatso does that only run for the OS ssd or any attached ssds?17:41
lordcirthkinghat, well, it just runs "/sbin/fstrim -av"17:41
tomreynhttps://wiki.archlinux.org/index.php/Solid_state_drive#Periodic_TRIM  "The service executes fstrim(8) on all mounted filesystems on devices that support the discard operation. "17:42
lordcirthAnd "man fstrim" says that "-a" is all supported mounted devices17:42
kinghatah. would arch be applicable to ubuntu?17:42
tomreyni think it's the same systemd service + timer17:43
tomreynsystemctl list-timers fstrim.timer17:44
kinghatah17:45
tomreynhmm the timer seems to lack randomization, always runs at 00:00.17:47
tomreynif you want to review the timer + service: ls -l /lib/systemd/system/fstrim.*17:48
kinghattime is just a social construct tomreyn.17:48
tomreynmy point is you don't want all your servers to become I/O loaded at 00:0017:49
tomreynRandomizedDelaySec= should be used17:51
kinghatagreed17:51
=== Serge is now known as hallyn
lordcirthNote that if you want to change a timer/service, do not edit the one in /lib. Copy it to the equivalent directory in /etc and edit that. The /etc one will override the one in /lib when read, but will not interfere with the packaged file.17:55
tomreynbug 1833593 files17:59
ubottubug 1833593 in util-linux (Ubuntu) "fstrim.timer always triggers at 00:00, use RandomizedDelaySec" [Undecided,New] https://launchpad.net/bugs/183359317:59
tomreyn*fileD17:59
tomreynif some of you have access to I/O load logs across larger server farms, it'd be great to check (and comment) whether this has any noticeable impact.18:02
lordcirthMost of our servers have separate SSDs for root and data, so the root ones don't have much load. Maybe our DB servers?18:08
lordcirthEr, I meant, SSDs for root and HDDs for data.18:08
lordcirthIt's Monday at local midnight? I will ask18:09
sdeziellordcirth: 'systemctl edit $foo' lets you create an override snippet, add '--full' to it and it will do a full unit copy for you to edit18:15
lordcirthsdeziel, cool, didn't know that!18:15
sdeziela bit like upstart's .override files but better18:15
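
Putting tomreyn's and sdeziel's points together, a drop-in along these lines would spread the fstrim runs out. The 6h value is only an example and is not taken from the log or the bug report. 'sudo systemctl edit fstrim.timer' opens an editor and saves the snippet as an override:

```ini
# /etc/systemd/system/fstrim.timer.d/override.conf
[Timer]
RandomizedDelaySec=6h
```

'systemctl cat fstrim.timer' then shows the packaged unit followed by the override, and 'systemctl list-timers fstrim.timer' shows the next scheduled run.
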
geardexit19:56
mybalzitchCan I go from 18.10 to 18.04.2?23:18
mybalzitchI tried do-release-upgrade but it wants me to go to 19.xx23:19
sarnoldyou really can't go backwards23:20
sarnoldindividual packages might downgrade alright, once in a while, but the packages are packaged with the assumption that you only ever move forward along with the passage of time23:21
mybalzitchok, I was trying to get zfs 0.8.1 with Jonathon F's packages, but it seems he only supports LTS releases, so I don't have an "easy" way to get it installed. I will have to do dkms myself I think23:23
sarnoldoh dang. :/23:24
sarnoldyou could probably just build his packages locally23:24
tewardwooooooooooow I feel like an imbecile today...23:32
tewardi totally forgot to save a switch config so I thought my Ubuntu gateway machine in this one lab env was broken23:32
tewardand ended up realizing after puttering with it for 2 hours that it was the switch23:32
teward>.<23:32
tewardanyways... sarnold I'mma bother you again like I normally do :p23:32
sarnoldteward: sounds like a problem best solved with a sandwich. or pizza. with beer.23:32