[02:14] TJ-: I found a stable version of Ubuntu Server that doesn't present the aacraid error
[02:15] cryptodan: a different release?
[02:15] the kernel is Linux capricorn 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:31:42 UTC 2014 i686 i686 i686 GNU/Linux
[02:16] Linux capricorn 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:31:42 UTC 2014 i686 i686 i686 GNU/Linux
[02:17] DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"
[02:18] I also found that the bug goes all the way back to CentOS 5 on kernel 2.6
[02:19] ouch!
[02:20] this is what happens when devs mess with code for newer devices and don't ensure older devices aren't upset
[02:20] There's been a lot of that in the kernel in recent years
[02:20] also found a validated version of Red Hat for the server doesn't boot from the CD
[02:20] DevOps contagion
[02:20] it stalls at loading kernel
[02:21] let it sit overnight and no boot
[02:21] "loading kernel" is a boot loader message, if that's what you mean
[02:21] so the kernel doesn't start executing?
[02:22] it should go "loading kernel" ... "loading initrd" then kernel starts and you see its messages
[02:22] no dmesg
[02:24] right, so hand-over failed. that can happen if the firmware e820 memory map confuses the boot loader
[02:25] I'd expect a validated and supported OS per Dell would boot up
[02:28] but I posted my system specs on that one bug report https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1777586 that I found a stable system setup for people to try
[02:28] Launchpad bug 1777586 in linux (Ubuntu Bionic) "Ubuntu Server 18.04 LTS aacraid error" [High,Confirmed]
=== neel is now known as Guest16758
[03:30] HI
[03:31] need some help on package installation that is failing on my ubuntu server 18.04.1. i suspect it's because of the repository
[03:31] Err:1 http://archive.ubuntu.com/ubuntu bionic/main amd64 cpp-7 amd64 7.3.0-16ubuntu3
[03:31] Connection failed [IP: 91.189.88.149 80]
[03:31] Err:2 http://archive.ubuntu.com/ubuntu bionic/main amd64 gcc-7 amd64 7.3.0-16ubuntu3
[03:31] Connection failed [IP: 91.189.88.152 80]
[03:31] E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/g/gcc-7/cpp-7_7.3.0-16ubuntu3_amd64.deb Connection failed [IP: 91.189.88.149 80]
[03:31] E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/g/gcc-7/gcc-7_7.3.0-16ubuntu3_amd64.deb Connection failed [IP: 91.189.88.152 80]
[03:31] E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
[03:31] any idea folks?
[03:32] neell: no route to host? firewall?
[03:32] neell: can you connect to those URLs with curl/wget/w3m/lynx ?
[03:32] some packages were installed successfully, after 4/5 tries
[03:33] for this one, it's showing 63% completed and then failing
[03:33] Connection Failed means the TCP connection broke
[03:33] could be a MITM and/or proxy
[03:34] ok, i am running the same command again. it's showing now 63% waiting for header
[03:35] got this now:
[03:35] Err:1 http://archive.ubuntu.com/ubuntu bionic/main amd64 cpp-7 amd64 7.3.0-16ubuntu3
[03:35] Connection failed [IP: 91.189.88.161 80]
[03:35] 63% [Waiting for headers]
[03:36] and the same error as above
[03:36] I can connect to those IP addresses fine
[03:36] check the route with "tracepath 91.189.88.149", try a ping, see if there's any packet loss or variable latency
[03:37] something wrong from my end. i am kind of a newbie on Ubuntu commands
[03:39] i am giving u tracert result
[03:39] http://paste.ubuntu.com/p/q3GjsG28WG/
[03:41] You're on Guam? you ought to set a mirror in Hong Kong
[03:41] ok
[03:41] so u want me to connect to hk mirror instead and install package from there?
[03:43] how do i change repo to my closest one and install package from there?
[03:43] neell: well, it'd likely be more reliable, you'd avoid the level-3 HK>London link which causes a lot of latency
[03:44] The best mirror would be https://launchpad.net/ubuntu/+mirror/mirror.xtom.com.hk-archive
[03:44] ok
[03:44] how do i change it to that?
[03:44] You'd edit /etc/apt/sources.list and match the info given on that web page
[03:45] any command to do that?
[03:45] use the "Display sources.list entries for" choose Bionic and it shows you what should be in sources.list
[03:45] neell: any text editor, using sudo because the file is owned by root
[03:45] neell: if you use vim, it'd be "sudo vim /etc/apt/sources.list"
[03:47] yes, i can see the repo list
[03:47] so i need to create the hk mirror at the beginning of the file?
[03:48] replace every archive.ubuntu.com with mirror.xtom.com.hk
[03:49] ok
[03:50] I noticed apt allows editing now, you can do "sudo apt edit-sources"
[03:54] deb http://mirror.xtom.com.hk/ubuntu bionic main
[03:54] deb http://mirror.xtom.com.hk/ubuntu bionic-security main
[03:54] deb [arch=arm64,ppc64el,amd64] http://mariadb.mirror.digitalpacific.com.au/repo/10.3/ubuntu bionic main
[03:54] deb [arch=ppc64el,arm64,amd64] http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.3/ubuntu bionic main
[03:54] deb http://mirror.xtom.com.hk/ubuntu bionic-updates main
[03:54] this is now the list
[03:54] is it ok?
[03:58] neell: I don't think they mirror security, usually we set that to security.ubuntu.com so you get updates immediately
[03:59] so only the "bionic main" will be updated to HK mirror?
[03:59] neell: oh, they do mirror it... just the mirror is delayed a few hours compared to security.ubuntu.com
[03:59] neell: try the HK server out for those you've just shown
[03:59] ok
[03:59] neell: you can always change back if there's no improvement
[03:59] neell: "sudo apt update"
[04:00] Get:5 http://mirror.xtom.com.hk/ubuntu bionic/main amd64 Packages [1,019 kB]
[04:00] Get:5 http://mirror.xtom.com.hk/ubuntu bionic/main amd64 Packages [1,019 kB]
[04:00] Get:5 http://mirror.xtom.com.hk/ubuntu bionic/main amd64 Packages [1,019 kB]
[04:00] Get:5 http://mirror.xtom.com.hk/ubuntu bionic/main amd64 Packages [1,019 kB]
[04:00] Get:5 http://mirror.xtom.com.hk/ubuntu bionic/main amd64 Packages [1,019 kB]
[04:00] Get:5 http://mirror.xtom.com.hk/ubuntu bionic/main amd64 Packages [1,019 kB]
[04:00] 30% [5 Packages 0 B/1,101 B 0%]
[04:00] and lots of these lines
[04:01] !paste | nell
[04:01] nell: For posting multi-line texts into the channel, please use https://paste.ubuntu.com | To post !screenshots use https://imgur.com/ !pastebinit to paste directly from command line | Make sure you give us the URL for your paste - see also the channel topic.
[04:01] ok, sure.
[04:01] good morning
[04:01] seems like it's downloading...
[04:01] but still 30%
[04:02] neell: it could point to a problem with your connection
[04:02] between you and HK at least
[04:02] ok. any idea how can i solve it
[04:03] neell: unless it's on your network, complain to your ISP :)
[04:03] before, it would take few seconds to finish command sudo apt update
[04:03] neell: what have you changed on the server, or on the network, recently?
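For readers following the mirror advice above ("replace every archive.ubuntu.com with mirror.xtom.com.hk"), here is a minimal sketch of doing that non-interactively instead of in an editor. The function name `switch_mirror` is hypothetical and not part of apt; it assumes the classic one-line `deb ...` format of /etc/apt/sources.list and keeps a `.bak` copy so you can change back if there's no improvement.

```shell
#!/bin/sh
# Sketch only: rewrite one mirror host to another in an APT sources file.
# switch_mirror is a hypothetical helper name, not an apt command.
switch_mirror() {
    file=$1 old_host=$2 new_host=$3
    # Keep a backup so the change can be reverted.
    cp -- "$file" "$file.bak" || return 1
    # Escape dots so the old host name is matched literally by sed.
    old_re=$(printf '%s' "$old_host" | sed 's/\./\\./g')
    # Read from the backup, write the rewritten list to the original path.
    sed "s|//$old_re/|//$new_host/|g" "$file.bak" > "$file"
}
```

Run as root it would be called as `switch_mirror /etc/apt/sources.list archive.ubuntu.com mirror.xtom.com.hk`, followed by `sudo apt update`. Entries for other hosts (security.ubuntu.com, third-party repos such as the MariaDB ones in the list above) are left untouched.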
[04:03] no
[04:04] i was able to install maria-db from au mirror, it was i guess 183MB
[04:04] now, while installing packages of 140MB, it was giving error
[04:05] neell: I wonder if your ISP has deployed a transparent proxy
[04:05] ok, i got an error at last
[04:05] https://paste.ubuntu.com/p/xk43GW7YdR/
[04:06] neell: you might get a clue looking at the HTTP headers with "wget -S -O /dev/null http://mirror.xtom.com.hk/ubuntu"
[04:10] https://paste.ubuntu.com/p/7sKQ6ch4dB/
[04:11] neell: hahah "X-Custom-Job: If you see this header, please contact hello@xtom.com for a job"
[04:12] hmm
[04:12] ok, so it looks good?
[04:12] neell: no tell-tale signs of an HTTP proxy, but there could be a transparent proxy
[04:13] can I try to switch to an US mirror?
[04:13] neell: but we saw it only got about 1100 bytes when it should have been much more, so I think the connections are being broken on the link between you and HK
[04:14] ok, got it. maybe I can try some other mirrors?
[04:15] you could but I doubt it'll help, looks like all your traffic goes via HK
[04:18] where can I get mirror list?
[04:21] neell: https://launchpad.net/ubuntu/+archivemirrors
[04:22] ok, thanks. i will check
[07:34] looking for help here.... I am trying to install Webmin
[07:34] getting below error
[07:35] on Ubuntu Server 18.04.1
[07:35] https://paste.ubuntu.com/p/D873FM5BXJ/
[07:36] it gives me this
[07:36] The following packages have unmet dependencies:
[07:36] webmin : Depends: libauthen-pam-perl but it is not installable
[07:36] Depends: apt-show-versions but it is not installable
[07:36] E: Unable to correct problems, you have held broken packages.
[07:38] and when I am trying to install the package "sudo apt-get -f install libauthen-pam-perl"
[07:38] it gives me this:
[07:38] Package libauthen-pam-perl is not available, but is referred to by another package.
[07:38] This may mean that the package is missing, has been obsoleted, or
[07:38] is only available from another source
[07:38] E: Package 'libauthen-pam-perl' has no installation candidate
[07:39] anyone can help on this?
[10:21] So I'm getting started on my first production deployment - there's going to be around 27 production servers and I was looking into some sort of server management tool like SaltStack
[10:21] Anyone using it and love it? Would anyone recommend anything else?
[10:22] I use ansible
[10:22] there's quite a few different tools to choose from, though - chef, puppet, ansible, cfengine, ansible, saltstack etc etc
[10:23] ansible twice in that list - oh well - I like ansible :D
[10:25] thanks RoyK I'm looking at them all - trying to see which one has the best free offering :)
[10:36] "free offering"?
[10:37] iirc they're all open source
[10:42] indeed - I mean which one has the best open source option/easiest to use
[10:47] that mostly depends on your own preferences ;)
[12:24] so i seem to have an issue with slow dns query response. when i try to ping something like www.google.com it'll take like 6 seconds for it to do the dns lookup, then it'll start pinging. but if i do a dns query with nslookup it finishes quickly. i had read some stuff about disabling tx and rx offloading or something like that, but disabling it as per the suggestions didn't work. not sure where to look
[12:26] it's a vm?
[12:26] yes
[12:27] kvm?
[12:27] esxi 6.5
[12:31] thegoat_: I've seen that when there's an IPv6 record being returned for the hostname too. Try "dig www.google.com" see what is returned
[12:35] n/m fixed it. it was an id10t error
[14:27] rbasak: powersj: either of you around so I can bounce a question off of you?
[14:27] probably rbasak moreso since he's been around throughout nginx becoming a main package :P
[14:29] rbasak: i think this was discussed, but would there be a problem with the non-LTS interim releases tracking NGINX Mainline instead of NGINX Stable so the interim releases can work with newer features/etc. that won't be available until next LTS in the Stable NGINX branches?
[14:29] I think we had said there wouldn't be, but I forget the original conversation and what came from it. (It's also why I emailed the list, but apparently I'm still stuck in the moderation queue, so someone needs to poke the mailing list admins)
[14:44] TJ-: thanks for the assist yesterday, I've done some revisions to the package I was working on, and I think I am almost at the point where this can be tested for that LP bug requesting a daemon-only package. Just gotta wait for the PPA builders to finish uploading to run some tests myself...
[14:44] without the assist and guidance I'm not sure I'd have gotten this all solved.
[14:48] teward: :) it often helps to have a pair of unrelated eyes
=== jamespage is now known as JamesPage
=== JamesPage is now known as jamespage
[15:46] replaced the bad disk with a new disk and smartctl does not recognize it
[15:46] https://dpaste.de/iThm/raw
[15:46] any suggestion?
[15:50] the corrupted disk that was replaced was visible to smartctl.. so backend is fine. so possibly the new disk is bad?
[15:50] this disk was hot swapped?
[15:51] i assume it is not 600 peta bytes?
[15:52] is /dev/sdb a proper device node still?
[15:59] looks like the hot-swap... didn't :)
[15:59] yes, this looks like live transplant without sedation
[16:00] maybe you can: for host in /sys/class/scsi_host/host*; do echo "[ Rescanning ${host##*/} ]"; echo "- - -" | sudo tee -a "$host/scan" 1>/dev/null; sleep 1; echo; done
[16:00] but a reboot seems a good idea.
[16:00] axisys: ^ still with us?
[16:01] TJ-: yes, hot swap
[16:01] ok .. let me scan
[16:03] axisys: are the disks in a chassis? the messages look like there is some intermediary hardware/firmware between PC and disk
[16:03] cool.. scan did the trick
[16:04] be sure to find out how to hot swap properly for the future. this was not a healthy operation.
[16:05] !cookie | tomreyn
[16:05] tomreyn: Wow! You're such a great helper, you deserve a cookie!
[16:05] wow, now i get cookies for blaming people, sweet. ;-)
[16:06] axisys: don't take me too seriously, good luck there. i'd still want to reboot it soon.
[16:07] tomreyn: why reboot? (learning)
[16:07] tomreyn: curious on hot swap properly ..
[16:08] axisys: well you ripped this disk out while the controller was still accessing it, or thinking it was still there all the time. your dmesg will be full of errors, and a couple things may still be in an unsane state.
[16:09] how do I tell controller to stop accessing the disk? thanks for your help!
[16:10] axisys: how to hot swap properly will be documented in your server operators manual. but i'd always announce removal via software before the fact.
[16:10] don't do this now: echo 1 > /sys/block/sda/device/delete
[16:10] tomreyn: right.. documenting ..
[16:11] obviously you want to unmount everything from there beforehand
[16:11] ok /dev/sdb is not in use anywhere
[16:12] i'd "eject", too, just in case
[16:12] also not now
[16:12] eject would eject cdrom.. no?
[16:12] understand...
[16:13] system is running off of /dev/sda .. right now.. so no harm either way.. but understood
[16:14] tomreyn: so once I swap the disk just scan it, right?
[16:15] axisys: yes, that's usually enough afterwards. the important thing is to prepare for removal properly
[16:15] axisys: and most of all not all hardware has hot swap capability
[16:15] right
[16:16] axisys: you need the controller, firmware, and OS to support it. and whatever else might sit between controller and storage.
[16:17] so start by reading your controller / mainboard / server manual
[16:17] ok.. I tested on another system exact same hardware sun fire x2250 .. and it worked.. first stop accessing, swap out and then scan.. awesome! saving the steps in my wiki.. thank you!
[16:18] another computer museum?
[16:18] :)
[16:19] you'll pay a lot of power for those. might be worth replacing them by half as many current systems some day.
[16:19] or actually a third
[16:21] yes.. those are from part.. recently most of our servers hp dl360 or dl380s
[16:21] from past*
[16:22] current gen hp is fine, as long as you have a support contract.
[16:22] and don't need to stack up fast...
[16:24] ^ personal opinion / experience, i'm not affiliated with canonical
[17:05] teward: what would happen if we tracked mainline before LTS-1 and we didn't get a stable release by LTS?
[17:06] rbasak: NGINX stable releases are always cut from mainline around the time we release
[17:06] for 16.04 it was cut same-day as release and we did a version-string-only SRU with the Release team's approval post release
[17:06] for 18.04 it came out same week as FinalFreeze but I was able to get that in right before the freeze went into effect (same-day)
[17:07] rbasak: if we track Mainline up to release date, then the delta between Mainline and Stable when it's cut is extremely minimal, and the past several cases we've run into this we really didn't have to do any feature changes, etc. just the version string revisions
[17:08] by the time of NGINX Stable cut which is about when we release LTS, it would be most likely a trivial post-release version-string-only change SRU with no new 'features' by the release date
[17:08] rbasak: basically it'd mirror what we had for 16.04, or this past time for 18.04.
[17:09] the other problem we're going to face though rbasak...
[17:09] if we don't give the 'newer versions' people are going to become 100% dependent on the PPA for the "new features"
[17:09] at which point the question is "why do we bother updating nginx in the repos then?" (to the non-informed user, that is)
[17:09] teward: what I mean is: we should bump to mainline unless the following stable is already scheduled to be released before freeze for Ubuntu's following LTS.
[17:09] we *shouldn't*
[17:09] rbasak: and that's the 'problem'
[17:09] rbasak: it's always just before or just after our release date
[17:10] consistently falls around the same week or two, and they don't give firm date releases
[17:10] rbasak: i have no issues keeping it at 'stable'
[17:10] but people are going to complain heavily, I guarantee it.
[17:11] It's generally OK to bump to final stable, even in an SRU, if the changes are minimal (eg. just a version string bump, or a few bugfixes), since those changes qualify for SRU anyway.
[17:11] However, it risks pain.
[17:11] So it depends on you I think. By Ubuntu policies we can do it.
[17:11] it's actually less pain to bump to latest Mainline and switch it to nginx stable, because a large portion of the 'fixes' and changes to spec of HTTP/2 and such happen in Mainline
[17:11] But, as I don't particularly want to commit Canonical's time to back that up, I'd prefer to stick to nginx stable consistently.
[17:12] this is why i posted to the ML
[17:12] but unless you can release that, it's stuck in limbo for eternity
[17:12] (read: mod queue)
[17:12] Oh
[17:12] * rbasak looks at the mod queue
[17:13] Accepted
[17:13] rbasak: To be fair, I tell people to use the PPA if they want the "latest and greatest" anyways, but the reason I'd like more feedback is because MaaS people or other departments might want to see whether their stuff works in the newer releases, etc.
[17:13] rbasak: as for *my* workload it doesn't change
[17:13] i have to keep both NGINX Stable and NGINX Mainline uptodate and working in two PPAs anyways, so
[17:13] the other problem is Debian
[17:13] because they track Mainline usually most of the time
[17:14] ... though they are far behind at this point, last thing they did was in april
[17:14] It's the risk of work, I think. If nginx stable releases late and with feature changes, leaving us in a pickle if we've released our LTS pre-stable-release.
[17:14] (E:UnmaintainedInDebian?)
[17:15] Since we need to decide on that in advance, I think it depends on our relationship with upstream.
[17:15] (and on how much we need it)
[17:16] where 'upstream' means nginx in this context?
[17:16] and not Debian
[17:17] Yes
[17:18] rbasak: the only reason I am hesitant to track only Stable is because Stable is only supported officially upstream for a year
[17:18] that is, until the next Stable cut from Mainline
[17:18] and we're going to have that problem with Mainline either way, because that's only good for a year before they drop official support for it
[17:18] the remaining 'bug fixes' are either nitpicked or microreleased as needed for substantial ones
[17:19] and security patches need backported either way (but Security Team takes care of that for the most part)
[17:19] Stable is a better fit for stable distribution releases I think?
[17:19] do we consider the interim releases "stable distribution releases" though as we only support them for 9 months?
[17:19] I don't see how mainline would be better to have in the distribution from a length of support perspective.
[17:19] rbasak: wait until TLS1.3 is a thing?
[17:20] Yes. The SRU policy applies equally on non-LTS releases.
[17:20] And our stability promise is roughly the same.
[17:20] If anything, LTS is less stable, because we do HWE and occasionally feature enhancements in them.
[17:20] (because they have an extended life it's more necessary to do that)
[17:23] y'know it sucks I can't search the mailing list archives easily
[17:26] rbasak: i found a thread in the list about this, back from 2015...
[17:26] sarnold: you were the last to reply to it heh
[17:28] and it establishes the precedent that was used for 14.10 through 15.10 and then established the 16.04 changes. https://lists.ubuntu.com/archives/ubuntu-server/2015-June/007075.html https://lists.ubuntu.com/archives/ubuntu-server/2015-June/007076.html
[17:28] not sure if that opinion still stands
[17:28] not sure this has to be determined today, we could wait for replies to my message to the list you just released, rbasak
[17:29] I ultimately don't care either way, but you still have to realize that every x.04 release is going to run into the same problem with the current 'release schedule' that NGINX has.
[17:29] even if we stick to stable.
=== kallesbar_ is now known as kallesbar
[18:55] before rsync completes .. sda disappeared ..
[18:55] sdb is the new disk as part of md0 (sda1,sdb1) and md1(sda2,sdb2)
[18:56] # ls -al fstab
[18:56] -rw-r--r-- 1 root root 1113 Jul 2 2012 fstab
[18:56] # cat fstab
[18:56] cat: fstab: Input/output error
[18:56] yikes!
[19:00] was that a striped raid?
[19:00] ahasenack: raid1
[19:01] so why did the raid fail if sdb was still there?
[19:01] sdb is the new disk to replace bad sdb
[19:02] smartctl was saying FAILING and replace it now ..
[19:02] you had a failure during the raid rebuild?
[19:02] Hiya. What would happen if I create a Virtualbox VM with Ubuntu Server on it, including the ubuntu-desktop package for GUI, then copy this VM to a headless server? Would Ubuntu still start and run the background services?
[19:02] ahasenack: yes :-(
[19:03] I don't yet have a headless server to try it out for myself
[19:03] axisys: yep, I heard that can happen
[19:03] rangergord: it will still have a video card, right?
[19:03] ahasenack: condolences
[19:03] ahasenack: I'm not sure. Let's say it doesn't. What happens then?
[19:03] rangergord: I don't think a PC boots without a video card
[19:04] but could be wrong
[19:04] pretend it's a typical 1U server. if it needs a gpu to boot, then sure, there's one.
[19:04] rangergord: anyway, I would expect it to try to start X as usual and present the login greeter. If X failed (no driver), that would be ok. What wouldn't be running is services that run after a desktop user logs in
[19:05] all that UI stuff
[19:05] ahasenack: good enough for me! thanks.
[19:05] axisys: that's why I hear that two disk redundancy is advised. If possible ($$), of course
[19:06] the rebuild stress can make the last disk fail
[19:06] s/last/remaining/
[19:06] also never forget backup :)
[19:07] * ahasenack ponders about adding a 3rd disk to his 2-disk mirror
[19:07] I actually have one, but was considering it a spare
[19:07] could make it a differential backup of the mirror
[19:07] *mirrored disks
[19:08] I got myself a Synology recently for home server use. Didn't want to invest the time to learn everything.
[19:09] Synology is just Linux with a GUI slapped on it
[19:10] I have a synology
[19:10] what I disliked about it is that they have their own patches on top of btrfs
[19:10] I can't btrfs send/recv to/from it
[19:10] from a linux box
[19:11] so now I have an old desktop with zfs to backup that nas and other stuff
[19:11] almost headless :)
[19:16] it is possible raid build completed before sda giving up.. but I am not 100%
[19:16] ahasenack: not sure how to confirm
[19:16] is it still rebuilding?
[19:16] check /proc/mdstat
[19:16] ahasenack: no
[19:17] ahasenack: yep
[19:17] and you still have the error?
[19:17] sda is missing
[19:17] anything useful in the last lines of dmesg?
=== mark-otaris is now known as Guest43597
[19:17] ahasenack: https://dpaste.de/Ayvu/raw
[19:17] and, is sda really dead?
[19:17] ahasenack: yes
[19:17] file-system is readonly
[19:18] readonly now*
[19:18] what fs is on top of that?
[19:18] ext4
[19:19] so md0 completed .. but md1 is the large disk and not sure if completed
[19:21] I don't know
[19:31] hi all, I have a ubuntu 18.04 LTS server box (quad core, 8GB memory) as a router and have two access points. I am running dnsmasq as dhcp and dns server. I have around 30 wifi devices connecting to the network. For some reason, the ubuntu box randomly loses internet. I can't ping 8.8.8.8 or google.com. But as soon as I reboot the ubuntu box, it works. I don't change anything. Any advice? really appreciate it
[19:33] ironpillow: that's very generic, sorry. It could be a million things
[19:35] ahasenack: yeah. I am not able to figure it out. syslog is not showing anything in particular. dmesg only shows one error perf: EDAC pnd2: Failed to register device with error -22.
[19:35] ahasenack: do you know if /etc/resolv.conf is automatically rewritten. I ask because, there is a bug in 18.04 and I have to re-write resolv.conf manually every time system is rebooted.
[19:35] it's a generated file, yes. Changes will be lost
[19:36] networks coming and going could trigger an update to resolv.conf
[19:41] added a disk on same slot where sda was.. it came up as sdc .. server still up ..
[19:44] sdc1 is added to md0 fine, no complain.. but failing to add sdc2
[19:44] # mdadm /dev/md1 --add /dev/sdc2
[19:44] mdadm: cannot load array metadata from /dev/md1
[19:45] well md1 thinks its active device is sda2. but you removed this, apparently uncleanly.
[19:46] ahasenack: so it might be re-written on a running system?
[19:46] you'll need to mdadm --fail /dev/sda2, just telling mdadm about the facts, i guess. and probably delete the sda scsi device, too
[19:47] (or sata)
[19:47] tomreyn: no .. it was failing and removed itself while rebuilding
[19:47] ironpillow: nowadays, actually, I think it will stay put, with just the entry to 127.0.0.53. It's the resolver at 127.0.0.53 that gets reconfigured
[19:47] axisys: so those dmesg records are old?
[19:47] mdadm --fail /dev/sda2
[19:47] mdadm: error opening /dev/sda2: No such file or directory
[19:48] sorry, wrong usage
[19:48] tomreyn: that triggered I think during raid1 rebuild with sdb
[19:49] let me paste current /proc/mdstat
[19:49] ahasenack: I have to change the entry to 127.0.0.1 for internet to work.
[19:49] from 127.0.0.53?
[19:49] yes
[19:49] has someone already updated 16.04 to 18.04?
[19:50] https://dpaste.de/OHfb/raw
[19:50] axisys: actually it was the correct usage
[19:50] sorry about my confusion
[19:51] yes /dev/sda seems disappeared
[19:52] axisys: well it's still in mdstat
[19:52] is there a way I can force in sdc2 into md1
[19:54] ahasenack: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1624320
[19:54] Launchpad bug 1624320 in systemd (Ubuntu) "systemd-resolved appends 127.0.0.53 to resolv.conf alongside existing entries" [Low,Confirmed]
[19:54] you can "mdadm --remove /dev/sda2" now
[19:54] so does systemd-resolved append 127.0.0.53 at random times on a running system?
[19:55] no
[19:55] I don't know what happens in an upgrade from < 18.04
[19:55] axisys: and you can "mdadm --add /dev/md1 /dev/sdc2"
[19:56] but a fresh 18.04 will use just 127.0.0.53
[19:56] tomreyn: # mdadm /dev/md1 --add /dev/sdc2
[19:56] mdadm: cannot load array metadata from /dev/md1
[19:59] ahasenack: yes, I am not able to ping or access internet. I have dnsmasq installed.
[20:00] and my dns server is at 192.168.2.2
[20:01] sorry: my dns server dnsmasq is listening on 192.168.2.2
[20:02] but forgetting about dns, ping 8.8.8.8 also is not working when resolv.conf has 127.0.0.53. it only works after changing it to 127.0.0.1
[20:03] `ping 8.8.8.8` doesn't use DNS at all
[20:03] axisys: hmm that's quite unfortunate. the array metadata explains how the data is aligned on the raid devices. if this is missing... there's no way to interpret it.
[20:03] so if `ping 8.8.8.8` just doesn't work that's probably a different issue
[20:04] teward: yeah, that's what's confusing. ubuntu just stops working. I have to reboot in order for ping 8.8.8.8 to work
[20:08] ironpillow: does ping get stuck?
[20:08] or what does it complain about?
[20:08] axisys: you can try to --grow --raid-devices=2 /dev/md1 (but i assume it will fail with the same error)
[20:09] ahasenack: it gets stuck
[20:10] doesn't complain or anything.
[20:12] let me check
[20:13] # mdadm --grow --raid-devices=2 /dev/md1 --add /dev/sdc2 mdadm: /dev/md1: no change requested
[20:13] tomreyn: ^
[20:14] axisys: can you show a current paste of the same info as before?
[20:17] tomreyn: https://dpaste.de/6L5P/raw
[20:20] axisys: so do you still have file systems mounted on top of md1?
[20:22] system is still up .. but readonly and sometimes even worse like here
[20:22] you should not, so be sure to go to single user mode and unmount / disable anything that's on top of md1
[20:22] # ls -al fstab
[20:22] -rw-r--r-- 1 root root 1113 Jul 2 2012 fstab
[20:22] # cat fstab
[20:22] cat: fstab: Input/output error
[20:22] i assume your OS is on md0?
[20:23] md0 is /boot
[20:23] so the OS is on md1?
[20:23] right md0 is too small to be /
[20:23] https://dpaste.de/YrWm/raw
[20:24] lsblk would maybe answer my question
[20:24] haha ..
[20:24] but i think this is a lost cause, rebuild system, restore backups
[20:24] # lsblk
[20:24] bash: /bin/lsblk: Input/output error
[20:25] assuming there are backups, of course.
[20:25] well, we always assume that, right?
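On the resolv.conf thread above: since `ping 8.8.8.8` doesn't use DNS at all, it helps to test IP connectivity and name resolution separately before touching resolv.conf. The following is a rough sketch, not something posted in the channel; both function names are hypothetical, and the probe targets (8.8.8.8, www.google.com) are just the ones mentioned in the discussion.

```shell
#!/bin/sh
# Sketch only; list_nameservers and check_net are hypothetical helper names.

# Print the nameserver addresses found in a resolv.conf-style file
# (defaults to /etc/resolv.conf when no path is given).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Tell "no IP connectivity" apart from "no name resolution".
check_net() {
    if ! ping -c 3 -W 2 8.8.8.8 >/dev/null 2>&1; then
        echo "no IP connectivity: look at routes and links, not DNS"
    elif ! getent hosts www.google.com >/dev/null 2>&1; then
        echo "IP works but name resolution fails: check the resolver"
    else
        echo "IP connectivity and name resolution both work"
    fi
}
```

If `check_net` reports no IP connectivity, editing resolv.conf is unlikely to be the real fix, which matches the "probably a different issue" remark above.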
[20:25] after dealing with users on Ask Ubuntu for a couple years I lost hope that there're backups held by [Insert User Looking for Help Here]
[20:25] but you're not wrong
[20:26] we should always assume there's backups :P
[20:31] axisys: so in the hopefully very unlikely case that there are NO backups: you could dd sdb2 to some other device, then boot to some recovery system and run mdadm against sdb2, creating a new RAID-1 array with a single active device
[20:32] and then see if there is data on there that you can recover.
[20:34] how to recover this data will depend on which block device layers you had on top of md1, you would need to recreate those there as well
[20:34] ah.. so dd if=/dev/sdb2 of=/dev/sdc2; (no backup)
[20:35] yes, if sdc2 is not in use
[20:35] i thought we had added that to md1
[20:35] this is one of the 6 servers to access the network.. so it is not an outage.. but I am taking this opportunity to learn to rebuild it graciously (if possible)
[20:36] tomreyn: failing to add sdc2 to md1
[20:36] oh right there was no metadata, so it couldn't add it
[20:38] axisys: also worth a try while you're still running: mdadm --grow --raid-devices=3 /dev/md1
[20:38] but this would also fail, i guess
[20:39] # mdadm --grow --raid-devices=3 /dev/md1
[20:39] raid_disks for /dev/md1 set to 3
[20:39] axisys: any news on mdstat?
[20:39] md1 : active raid1 sdb2[2](S) sda2[0]
[20:39] 243801976 blocks super 1.2 [3/1] [U__]
[20:40] 3/1
[20:40] I could try adding it again
[20:40] yes
[20:40] but the metadata is still missing ;)
[20:40] does not like it.. if I could just get metadata from where :-)
[20:40] # mdadm /dev/md1 --add /dev/sdc2
[20:40] mdadm: cannot load array metadata from /dev/md1
[20:41] a good server with similar build
[20:41] built*
[20:42] what do you mean?
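The "[3/1] [U__]" in the mdstat output above is the part to watch: it reads as 3 configured devices with only 1 active. As a small sketch (the function name `degraded_arrays` is hypothetical, and it assumes the usual /proc/mdstat layout where the "[n/m]" counter follows the "mdX :" line), this lists arrays that have fewer active members than configured:

```shell
#!/bin/sh
# Sketch only; degraded_arrays is a hypothetical helper name.
# Prints "<array> <active>/<configured>" for every md array whose
# [configured/active] counter shows missing members.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array
        match($0, /\[[0-9]+\/[0-9]+\]/) {       # e.g. "[3/1]"
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name, n[2] "/" n[1]
        }
    ' "${1:-/proc/mdstat}"
}
```

Called with no argument it reads /proc/mdstat; a saved copy of the file can be passed as the first argument instead. For the md1 lines pasted above it would report md1 with 1 of 3 devices active.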
[20:42] I wonder if I could take its metadata and place it in /boot dir since md0 is good and point to it
[20:42] i have another good server with similar built
[20:42] how far in was the resync from sda2 to sdb2 when it failed, do you know?
[20:43] no ..
[20:43] the other server won't help you
[20:43] k
[20:45] your only hope now is to try to carve off sdb2 what was copied there.
[20:45] or to have someone recover data off your failed disk drives
[20:47] yeah.. if that is the case we will just rebuild.. but using this opportunity to indulge all ideas..
[20:53] different topic => is it possible to convert from RAID1 to RAID10 without data loss with HP raid controller?
[20:54] yes I am working on another set of hp servers when one of the server, whoever intern built, did not use LVM .. so trying to find a way to expand sda .. application vendor says it has to be all in sda ..
[20:54] s/when/where/
[20:55] all the other servers has LVM.. so had no issue on expanding
[20:57] so you have hardware raid 1 (which controller?) and the capacity it provides is insufficient, but you have more spare disks?
[20:57] downtime is an issue?
[21:00] tomreyn: downtime is not an issue
[21:00] tomreyn: raid1 (800G,800G) .. need to expand and bought 2 2TB ..
[21:00] so thinking another raid1 and then strip the two raid1s
[21:01] stripe*
[21:01] if downtime is not an issue i'd just backup and rebuild from scratch.
[21:02] if downtime is an issue, it seems to be possible to migrate https://serverfault.com/questions/545809/how-to-move-raid-1-to-raid-10
[21:04] hmm this is really old, better make sure it's still valid
[21:04] k
[21:04] thanks for the link tho..
[21:05] https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c03510253
[21:05] cool.. online resizing .. nice
[21:05] but then i don't know what hardware you have there
[21:08] Smart Array P440ar
[21:09] you apparently operate as a company or larger organization, and don't have backups for everything. i really recommend you take some time to make sure your processes are in good shape.
[21:09] if this can happen now, it will only get worse in the future unless you re-evaluate how you do what you do.
[21:10] P440ar - that's a common one.
[21:10] these HP servers have backup .. that old SUN FIRE was just bastion server one of 6 which do not have backup
[21:13] it still semed important enough that you spent time on evaluating whether the data on raid can be restored, if partially.
[21:13] *seemed
[21:14] i don't mean to criticise you, it's none of my business, i'm just trying to provide suggestions
[21:15] hey.. taking all as suggestions.. appreciate the help
[21:15] :)
[21:15] HPE SSA user guide https://support.hpe.com/hpsc/doc/public/display?docId=c03909334
[21:16] HPE P440ar controller quickspecs https://www.scalcom.de/ftp-import/datasheets/726736-B21.pdf
[21:17] actually this one https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=c04346299