[06:33] Could I `dpkg -i xserver-xorg-video-nouveau` from 16.04 to 18.04 to see if it fixes various showstopper issues that I see in ancient nvidia cards? Or would it break due to abi changes?
[06:34] I mean, install the 16.04 package to 18.04
[06:34] no, you'll find that it won't install
[06:35] Thank you :/
[06:42] alkisg: which version of X is it?
[06:43] KitsuWhooa: stock 18.04, xserver-xorg-video-nouveau 1:1.0.15-2
[06:44] The problems I've seen so far are xorg segfault/crashes, and scrambled lines in the physical output which do display fine over vnc though
[06:44] In various old nvidia cards, e.g. tnt2, mx 400, mx 4000...
[06:46] Hm, 18.04 has 1.19.6
[06:46] Are the segfaults with acceleration enabled consistent?
[06:46] KitsuWhooa: yes, and they are also consistent with NoAccel defined as well
[06:48] http://termbin.com/haem
[06:49] It boots fine up to lightdm, and crashes on login (where I assume things like compositing and opengl are used)
[06:50] Ah, it's probably different to what I encountered
[06:50] Maybe; I've seen tens of crashes and I assumed it was the same; maybe I've seen various different ones and haven't realized it yet
[06:50] I had a similar backtrace with the s3 savage driver with acceleration enabled, and resolved it by re-enabling sigio in X and recompiling it. I never found out why it happened, but it got disabled some time in 1.19 and I only found out by bisecting it
[06:51] (I'm getting reports from various schools with a lot of different clients)
[06:51] If possible, maybe try enabling HWE on a 16.04 machine and see if that also breaks it
[06:52] Thanks for the sigio pointer, I'll give it a try; I'll also see if I can reproduce it locally to try with 16.04 + hwe
[06:52] Although I think some schools already have that, and none reported the issue
[06:52] Maybe also install the debug symbols and get a gdb backtrace to see if there is anything interesting there
[06:54] My (old) school's ubuntu lab still runs the version with gnome2 (pre 12.04). I guess no one bothered updating it :p
[06:55] Some of the schools that I upgraded from 12.04 to 18.04 report that they want to go back :D
[06:55] I'd love to see fewer features and more stability, but I guess many developers find stability programming = boring :D
[06:56] I think it's more like it's getting incrementally difficult to support old hardware
[06:56] Oh I see it in programs that aren't related to hardware as well
[06:56] Panels that crash by just switching keyboard layouts "but they're so modern now, they support searching for programs while you type!"
[06:57] oh :p
[06:58] Btw the problem with the scrambled output was in nvidia 7200 as well; I don't know, is that considered too old/unsupported too?
[06:59] alkisg: so did you try the git version?
[06:59] tjaalton: I need to reproduce it locally first
[06:59] I'll try to get such a card in my office
[07:00] just build it and push to clients, it's only five commits on top of 1.0.15 ;)
[07:02] ty, will do
[09:05] alkisg: I can confirm I get corruption on 18.04.1 mate with nouveau on an MX400
[09:06] KitsuWhooa: thank you :) Btw, changing resolutions sometimes fixes it
[09:06] I tried with fx5200 locally, it worked on 1024x768, will try some other one now...
[09:07] https://tasossah.com/CameraPics/P1110844.JPG
[09:07] this is the ubuntu mate installer screen :p
[09:07] Yeah exactly like that
[09:11] I'll see what other old nvidia cards I have. I know there's an MX440 somewhere
[09:16] KitsuWhooa: I think it might also help if you click "try ubuntu" instead of "install ubuntu", as it might cause the second bug too, the segfault
[09:16] I can't see where to click :p
[09:16] Hehe, true, I have vnc there, you don't
[09:16] it's a live boot over usb1.1 and it's ridiculously slow
[09:16] so that's not helping either
[09:16] Ouch, network boot would help
[09:19] I started lightdm manually and I can barely see the firefox icon pinned in mate
[09:19] so I'm going to say this didn't segfault
[09:20] And I think netboot would take too much time to set up. I'd also need to find a NIC for this board as it doesn't have an onboard one
[09:20] and even then, it'd be fast ethernet, so I'm not sure how much better it'd be
[09:22] alkisg: interesting thing. When I switch to a tty, the desktop renders fine for a bit before switching to a tty
[09:23] KitsuWhooa: if you want help with netbooting, I'm an expert, I could set it up for you in a few minutes, and, 100mbps is a hell of a lot faster than 1.1 usb
[09:24] NIC => ipxe boots almost all of them
[09:24] That'd be appreciated
[09:25] give me a bit to see if I can find a PCI NIC to plug in to this board
[09:26] Right, I found one that looks to have a realtek chipset clone of sorts, and what looks like a boot rom
[09:27] That way you won't even need ipxe then :)
[09:28] So the last notes that I've made for netbooting without our "ltsp" project, are the "automation script" paragraph of this page: https://wiki.ubuntu.com/LiveCDNetboot
[09:28] I.e. you're supposed to mount the cd to the server /cdrom, and just run this command:
[09:28] wget 'http://alkisg.mysch.gr/steki/index.php?action=dlattach;topic=2525.0;attach=1421' -O /tmp/livecd-netboot && sudo sh /tmp/livecd-netboot
[09:29] one moment, trying to find out how to enable the boot rom in the bios
[09:29] If the boot rom is in the nic itself, you might need to press ctrl+f11 or so when it displays that message
[09:29] I.e. it might not be in the bios
[09:30] The easiest way would be to create a usb stick/floppy/cd with ipxe though
[09:30] it's not loading the rom at all, and IIRC there's usually a toggle in the bios
[09:30] boot.ipxe.org => images to download
[09:30] The bios usually is for onboard nics. It's the same bios for many boards, so it won't work for pci nics.
[09:31] (well, unless the bios was expecting a realtek onboard nic in other board versions)
[09:34] yeah looks like I'll be going with ipxe
[09:41] Errors were encountered while processing: nfs-kernel-server
[09:41] I have a feeling this isn't going to work
[09:42] mount: unknown filesystem type 'rpc_pipefs'
[09:42] looks like I'm going to be recompiling my kernel
[09:50] Er, nfs-kernel-server can't be installed? Yeah that's not a good sign...
[09:50] You can also boot another pc with the ubuntu mate usb stick, and then run that command, and it will allow you to netboot the older pc
[09:51] i.e. both live server and live client
[09:51] alkisg: your mysch site doesn't seem to be responding over IPv6
[09:51] (in case your actual setup is strange and you can't install nfs temporarily...)
[09:51] Yeah they're old school :D
[09:51] I'll do it in a stock 18.04 VM
[09:51] it has an AAAA record though and it resolves to an address
[09:51] so it just causes wget to hang
[09:52] I've filed complaints a lot of times, but no solution yet. They also don't support https in their hosting. Lame :/
[09:52] might want to edit the wiki page to use wget -4 then
[09:52] so that it forces ipv4
[09:52] Oh I wrote that paragraph 10 years ago, I don't think anyone maintains it
[09:53] Ah
[09:53] It's full of obsolete information, but I think my script still works
[09:53] * alkisg loves code that works 10 years later :D
[09:54] 01:00.0 VGA compatible controller [0300]: NVidia / SGS Thomson (Joint Venture) Riva128 [12d2:0018] (rev 10) ==> nah this one loaded vesa, too old, trying another...
[10:02] I love you
[10:02] er
[10:02] lmao
[10:02] not sure how that got in my X clipboard
[10:02] Haha no worries it's always a good thing to say
[10:02] Sorry for that. Anyway, I'm waiting for the unattended updates to finish
[10:03] because it started running dpkg when I booted the VM so I can't use apt
[10:03] that is true :p
[10:04] I managed to reproduce the segfault with this one: 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation NV5 [Riva TNT2 Model 64 / Model 64 Pro] [10de:002d] (rev 15)
[10:16] the netboot script seems to fail to detect my IP
[10:16] oh well
[10:18] alkisg: how do I get ipxe to boot?
[10:18] alkisg: do you mean how to put it in floppy/cd/usb?
[10:18] or, how does it detect the server ip?
[10:18] the latter
[10:18] I got ipxe running on the machine but it just says press a key to reboot
[10:19] The normal netbooting setup there is "a dhcp server somewhere, e.g. in a router, and my script running dnsmasq in proxydhcp mode to only send the boot server ip/boot filename"
[10:19] Is this your use case? Btw, did you put the VM in bridged mode, so that it has all TCP/UDP ports open?
[10:19] your script is running dnsmasq, yeah
[10:20] the VM is in bridged mode
[10:20] it has an IP in the lan like any other device
[10:20] and doesn't go through any nAT
[10:20] *NAT
[10:20] Does ipxe get an ip?
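The proxyDHCP arrangement described at [10:19] (the router keeps handing out addresses, dnsmasq only adds the boot server and boot filename) can be sketched roughly as follows. This is not the config the livecd-netboot script generates; the subnet, the TFTP root, and the output path are assumptions chosen for illustration, and the file is written to /tmp so it can be reviewed before anything uses it.

```shell
#!/bin/sh
# Sketch only: write a proxyDHCP + TFTP dnsmasq config to a scratch file.
# SUBNET, TFTP_ROOT and OUT are assumed values, not taken from the log.
SUBNET="192.168.1.0"
TFTP_ROOT="/var/lib/tftpboot"
OUT="/tmp/netboot-proxy.conf"

cat > "$OUT" <<EOF
# Do not answer DNS queries; this instance only helps clients netboot.
port=0
# Proxy mode: the existing DHCP server keeps assigning addresses.
dhcp-range=$SUBNET,proxy
# Boot filename for clients that read it from the DHCP reply.
dhcp-boot=pxelinux.0
# PXE menu entry for BIOS clients; dnsmasq appends ".0" to the basename.
pxe-service=x86PC,"Boot from network",pxelinux
# Serve the boot files with dnsmasq's built-in TFTP server.
enable-tftp
tftp-root=$TFTP_ROOT
EOF

echo "wrote $OUT"
```

The generated file can be syntax-checked with `dnsmasq --test --conf-file=/tmp/netboot-proxy.conf`. The `dhcp-range` subnet has to match the LAN the clients are on, otherwise the clients only ever see the normal gateway's DHCP reply.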
[10:20] You can also try ctrl+b there, and then run `autoboot`, and if it fails, run `config` and see what it got from dhcp/proxydhcp
[10:21] It definitely shows the mac address, but I'm not sure about the IP. I am already in the command prompt so I'll try that
[10:21] "[...]/tftpboot/pxelinux... no such file or directory"
[10:21] it does get an ip
[10:21] and it does talk to the serv er
[10:22] *server
[10:22] (01:21:51 PM) KitsuWhooa: "[...]/tftpboot/pxelinux... no such file or directory" => that sounds like some error in the script? Did you see anything wrong when it ran?
[10:23] https://tasossah.com/txt/netboot_script_log
[10:25] route => not found => yeah that's an issue
[10:25] Old script, not relying on ip
[10:25] guess I'll install it
[10:26] Btw it should be trying to download pxelinux.0, not pxelinux
[10:26] I thought that was to detect the ip
[10:26] route, I mean
[10:26] Can you try to symlink it just for a quick hack?
[10:26] Yeah, probably
[10:26] Sure, give me a bit to restart it
[10:27] I don't think I can
[10:27] I take that back
[10:27] however, pxelinux.0 is a symlink itself that points to a file that does not exist
[10:28] it points to /usr/lib/syslinux/pxelinux.0
[10:28] my guess is it depends on the pxelinux package
[10:28] It installs it, but it moved elsewhere, moment,
[10:28] /usr/lib/PXELINUX/pxelinux.0
[10:29] ah
[10:29] OK so I guess the 10 year old script needs to be updated with the new pxelinux path, sorry :D
[10:29] okay, adding those two symlinks got me further
[10:30] now it's complaining about .c32 files not being found
[10:31] https://tasossah.com/txt/netboot_script_log_2
[10:31] The new location is in /usr/lib/syslinux/modules/bios/
[10:31] Copy them from there or symlink them or something
[10:33] Yeah this is really broken
[10:34] now it fails loading casper/vmlinuz
[10:34] ...IO error? what
[10:34] "attempt to access beyond end of device"
[10:35] I wonder if the vmlinuz/initrd symlinks also point to wrong paths
[10:36] Do an ls -lR in the tftp dir and check for broken symlinks
[10:36] I remounted the image and now it loads vmlinuz but it can't find initrd.lz inside the casper dir
[10:36] and no, no broken symlinks
[10:36] It might be initrd.gz now or something
[10:36] there's only an initrd in the image
[10:36] and I can't modify anything under casper because it's a symlink to /cdrom
[10:37] Ah you modify the kernel etc in pxelinux.cfg/default
[10:37] That's somewhere under tftp, a file that the script generated
[10:37] I found it
[10:38] looks like it's booting
[10:38] * alkisg crosses fingers, took too long already...
[10:39] "nfs server not responding"
[10:39] Maybe that route part that failed, failed to export to local network only
[10:39] what's /etc/exports like?
[10:39] I installed route and restarted
[10:40] there's /cdrom in there
[10:40] /cdrom *(ro,no_subtree_check,no_root_squash)
[10:40] oh in the script log there's a "job for nfs-server.service cancelled"
[10:40] Sounds good. Try `exportfs -ra` in case it helps,
[10:41] ah
[10:41] try restarting it manually from another tab and see why it fails to start
[10:41] systemctl stop nfs-kernel-server, then start again, journalctl -xe, etc
[10:42] active (exited)
[10:42] and there are no errors whatsoever
[10:42] there are only two lines in the log, starting and started
[10:42] but it's not running
[10:43] https://serverfault.com/questions/859934/ubuntu-16-04-nfs-kernel-server-wont-start
[10:43] Is your Ubuntu server a linux container (lxc)? If yes, you need to set something like explained here: mount fstype=rpc_pipefs, mount fstype=nfsd,
[10:44] That happened on my 16.04 desktop because I run a custom kernel that didn't have the filesystem needed
[10:44] I wonder if running in a VM is related there
[10:44] the 18.04 VM is virtualbox
[10:44] Hmm
[10:44] so I very much doubt it
[10:44] it doesn't fail to start it
[10:44] issuing the start command doesn't throw any errors, I mean
[10:45] I wonder if the unattended update installed a new kernel and broke things. Let me reboot the VM and redo all the symlinks :p
[10:46] Ah, damn those unattended updates :)
[10:47] I ended up disabling them, since it's a VM anyway
[10:53] yeah sch.gr hosting is terrible
[10:53] I can't even download the script over v4 now :p
[10:54] Ah don't wget it again, one time is more than enough for one day :D
[10:54] Hehe
[10:54] I rebooted, so it's gone from /tmp
[10:54] I made the mistake of not saving it to the disk
[10:54] there we go, it worked on the third attempt
[10:54] This avoided the segfault, I'll try to narrow it down now: Option "HWCursor" "off" Option "PageFlip" "off" Option "WrappedFB" "on" Option "ShadowFB" "on"
[10:57] we're back to ipxe not detecting the server
[10:59] or the NIC
[10:59] one of the two
[10:59] What's the client output, does it get an ip? autoboot, config etc...
[10:59] ctrl+b before that
[11:01] okay yeah that was my fault. Needed to reseat the NIC
[11:01] sorry
[11:01] tjaalton: I didn't get to compiling git yet, but I found out that `Option "PageFlip" "off"` avoids the segfault in
[11:01] 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation NV5 [Riva TNT2 Model 64 / Model 64 Pro] [10de:002d] (rev 15)
[11:10] alkisg: after the reboot, nfs is working
[11:10] however the live image fails
[11:11] goes into emergency mode
[11:11] a few units fail, and I can't really figure out why
[11:12] Ouch, it sounds like it needs a lot of updating... I'm using ltsp everywhere now so I haven't updated it
[11:12] Sorry about that
[11:12] Oh well, it's fine
[11:12] I need to go afk for a couple of hours, be back later... :/
[11:12] sure
[11:12] thanks again
[11:13] np, thank you too
[11:13] Do check that pageflip option if you get the chance
[11:13] I'll go through the sch.gr manual and see if it's worth/easy to set up LTSP
[11:13] Nono ignore sch.gr, follow this one (mine again): http://wiki.ltsp.org/wiki/Installation/Ubuntu
[11:14] Ah, thanks
[11:14] If you have a mate installation, you can make it an ltsp server in about 5 commands and 10 minutes
[11:14] And it gives epoptes=vnc as a bonus
[11:14] I have a stock 18.04 with gnome in a vm
[11:15] That works too
[11:15] does that mean the client will try to boot gnome too
[11:15] ?
[11:15] I doubt gnome3 will work
[11:15] In the quick "chrootless" setup, yeah
[11:15] ah
[11:15] But you can choose an xterm session if you prefer
[11:15] Or install mate as well ...
[11:15] Ah, yeah
[11:15] thanks
[11:15] * alkisg really goes for now, bbl
[11:16] see ya
[11:42] alkisg: for when you get back, this is what happened when I tried booting the image in a VM to install it and then set up LTSP https://tasossah.com/s/dd9d97babad3.jpg
[12:07] KitsuWhooa: try alt+ctrl+f1, then alt+ctrl+f7
[12:07] *right ctrl, since it's vbox
[12:08] it's rshift for me, but wow that worked
[12:08] (I rebound it)
[12:10] * alkisg searches how to fetch/compile the git version...
[12:11] My guess would be to git clone the repo and then either install it directly, or use the files from the ubuntu package to make a deb
[12:12] packages.ubuntu.com usually points to all the necessary resources
[12:12] This one? https://cgit.freedesktop.org/nouveau/xf86-video-nouveau/
[12:12] I only see 3 commits there from 2018, that's a good sign that it will be somewhat easy to bisect it...
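For the bisect idea mentioned here, a minimal sketch of how `git bisect` could be driven from a script, assuming the driver has already been cloned locally. The function name `bisect_driver` and all of its arguments are hypothetical; the log does not name a known-good or known-bad revision, so those are left for the caller to supply.

```shell
#!/bin/sh
# Sketch: bisect a local checkout between a known-good and a known-bad revision.
# bisect_driver is a hypothetical helper; none of its arguments come from the log.
# Usage: bisect_driver REPO_DIR GOOD_REV BAD_REV TEST_CMD [ARGS...]
bisect_driver() {
    repo_dir=$1
    good_rev=$2
    bad_rev=$3
    shift 3
    (
        cd "$repo_dir" || exit 1
        # "git bisect start" takes the bad revision first, then the good one.
        git bisect start "$bad_rev" "$good_rev" >/dev/null || exit 1
        # "git bisect run" reads the test command's exit status:
        # 0 means good, 125 means skip, any other code from 1 to 127 means bad.
        git bisect run "$@"
        status=$?
        first_bad=$(git rev-parse --verify -q refs/bisect/bad)
        git bisect reset >/dev/null 2>&1
        [ "$status" -eq 0 ] && echo "first bad commit: $first_bad"
        exit "$status"
    )
}
```

A crash that only shows up on a physical client after login is hard to script, so in practice the manual form may be more realistic: build and install each candidate, test it by hand, and mark it with `git bisect good` or `git bisect bad`.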
[12:13] I think that's the one, yeah
[12:16] The test client ran glxgears for an hour with pageflip off, so I guess it makes things stable
[12:22] you assume it's fixed in git..
[12:22] that's not at all clear
[12:22] tjaalton: I don't assume that at all! I'm assuming it broke in git! :)
[12:23] ah
[12:23] I.e. i want to bisect and find the commit that broke it, I don't expect to find a commit that fixed it...
[12:23] it's not necessarily the driver that broke it
[12:23] it may be X server itself
[12:24] and a smaller chance the drm driver
[12:24] Ah
[12:24] 1.0.15 was released in april '17
[12:26] test cosmic
[12:26] or just file a bug upstream
[12:26] against nouveau
[12:26] I'll do both tomorrow morning
[12:27] I haven't reproduced the "scrambled screen" issue locally though, only the segfault
[12:27] I can only see the scrambled screen in vbox
[12:27] pretty sure that's a different issue
[12:28] I can't get the scrambled screen to go away by switching to a tty and back with my MX400
[12:28] Gotcha. So, 3 different issues.
[12:28] and then I got distracted trying to set up netboot/ltsp
[12:28] Hehe
[13:12] alkisg: I installed ltsp-server-standalone ltsp-client, ran ltsp-update-image, didn't install epoptes, and ran the first ltsp-config dnsmasq line in the wiki
[13:12] is it supposed to be working now?
[13:19] KitsuWhooa: I believe so, do you get any errors while booting the client?
[13:20] ipxe says nothing to boot
[13:21] dnsmasq seems to be running
[13:21] but ipxe is only seeing my normal gateway
[13:42] try ltsp-config dnsmasq --overwrite; systemctl restart dnsmasq
[13:46] that was the first thing I did, and I even rebooted
[13:46] Only different thing I did from the wiki was to not install epoptes or make a user for it
[14:13] Apparently I just needed to change the subnet in the dnsmasq conf for the proxy
[14:28] It should be autodetected... unless you didn't have an ip when you ran ltsp-config
[14:30] that might have been it
[14:30] but since it works now, I can also reproduce the bug with an MX440
[14:31] The fuzzy lines or the segfault?
[14:32] corruption
[14:32] no segfault
[14:32] I think the segfault might be exclusive to the TNT2
[14:33] does the pageflip off option fix the corruption?
[14:33] I haven't tried yet
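For anyone wanting to try the PageFlip workaround on another card, a minimal sketch of a config snippet that applies it. Only the `Option "PageFlip" "off"` line comes from the log; the file name, the Identifier string, and writing to /tmp first are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: generate an xorg.conf.d snippet that disables page flipping for nouveau.
# CONF and the Identifier value are assumed names, not taken from the log.
CONF="/tmp/20-nouveau-pageflip.conf"

cat > "$CONF" <<'EOF'
Section "Device"
    Identifier "nouveau-card"
    Driver     "nouveau"
    Option     "PageFlip" "off"
EndSection
EOF

echo "wrote $CONF; review it, then copy it to /etc/X11/xorg.conf.d/ and restart the display manager"
```

In the log this setting avoided the segfault on the TNT2; whether it also helps with the screen corruption on the MX400/MX440 was still untested at this point.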