[02:14] designated, your error is probably true.
[02:14] it couldn't get a lock because the running process (which never completed) had it locked
[02:14] you could strace that process and probably get some more info
=== CyberJacob|Away is now known as CyberJacob
[02:15] maybe you have outbound internet access blocked ?
[02:19] smoser, it's not blocked, maas is not recognizing the domain as being managed even though it's configured correctly and forwarding the dns request to the configured dns forwarder
[02:19] the sudo error is not relevant.
[02:20] it happens any time dns resolution of 'hostname' fails.
[02:20] which is fine.
[02:20] smoser, correct but that's what I'm trying to explain.
[02:20] that's fine though.
[02:20] if your apt-get update hung, then that's the problem.
[02:20] apt-get update is hanging because of a dns issue and never completing enlistment.
[02:20] no. i don't think so.
[02:20] dns resolution of `hostname` is failing
[02:21] it is because if i kill the process it finishes enlistment immediately
[02:21] can you verify it works for anything else ?
[02:21] and resolution would *fail* not hang.
[02:21] you're on the system now?
[02:21] yes
[02:21] do 'ping archive.ubuntu.com'
[02:21] i think you'll get dns resolution
[02:21] or if you don't, then yes, dns is the issue.
[02:22] it will resolve that
[02:22] apt-get update doesn't succeed because it cannot resolve its own hostname
[02:22] doesn't care.
[02:22] that's not why it's hanging. i'm certain of that.
[02:23] you can replicate that anywhere like this:
[02:23] this process is running: root 1384 0.0 0.0 31064 2380 ? S 20:13 0:00 /usr/bin/apt-get --option=Dpkg::Options::=--force-confold --option=Dpkg::options::=--force-unsafe-io --assume-yes --quiet update
[02:23] if i kill it, enlistment will finish
[02:23] strace it
[02:23] what is it doing.
[02:23] or even just tail /var/log/cloud-init-output.log
[02:23] sudo strace -p 2380
[02:24] $ sudo strace -p 1384
[02:24] sudo: unable to resolve host maas-enlisting-node
[02:24] Process 1384 attached
[02:24] select(8, [6 7], [], NULL, {0, 23729}) = 0 (Timeout)
[02:24] select(8, [6 7], [], NULL, {0, 500000}) = 0 (Timeout)
[02:24] select(8, [6 7], [], NULL, {0, 500000}) = 0 (Timeout)
[02:24] select(8, [6 7], [], NULL, {0, 500000}) = 0 (Timeout)
[02:25] Ign http://archive.ubuntu.com trusty-updates InRelease
[02:25] Err http://security.ubuntu.com trusty-security Release.gpg
[02:25] Connection failed
[02:27] smoser, mackrel is working with me on this issue
[02:27] just letting you know so he can ask questions about the same issue
[02:28] smoser, could it be an issue with the proxy server on maas?
[02:28] designated, see, it's timing out on a network connection.
[02:29] you set a proxy in maas ?
[02:29] smoser, no but i thought by default the apt-get requests got proxied through maas
=== CyberJacob is now known as CyberJacob|Away
[02:29] you can see if that was correctly written into /etc/apt (grep -r Proxy /etc/apt)
[02:30] under maas gui there is an option to configure proxy server, it says if you leave it blank "This will also be passed onto provisioned nodes instead of the default proxy (the region controller proxy)."
[02:30] it could be an issue then on the squid proxy on the maas region controller
[02:30] try:
[02:30] /etc/apt/apt.conf.d/95cloud-init-proxy:Acquire::HTTP::Proxy "http://192.168.168.7:8000/";
[02:31] http_proxy=http://your.maas.ip.addr:3128 wget http://security.ubuntu.com
[02:31] i suspect that will hang similarly.
[02:31] er... s/3128/8000/
[02:33] Resolving your.maas.ip.addr (your.maas.ip.addr)... failed: Name or service not known.
[02:33] wget: unable to resolve host address ‘your.maas.ip.addr’
[02:33] shit...just a sec
[02:33] :)
[02:33] smoser, designated, yes it is stalling.
[02:33] yeah, that's not going to work :)
[02:33] noobing it up tonight
[02:34] just hangs
[02:34] Connecting to 192.168.168.7:8000... connected.
[02:34] Proxy request sent, awaiting response...
[02:34] i wonder how squid proxy got jacked up
[02:38] sudo service squid-deb-proxy status squid-deb-proxy start/running, process 1005
[02:39] i show squid-deb-proxy as well as squid3 installed. are both of these needed?
[02:40] squid3 is listening on TCP/8000
[02:40] proxy 1005 0.0 0.1 127620 30736 ? Ssl 18:30 0:01 /usr/sbin/squid3 -N -f /etc/squid-deb-proxy/squid-deb-proxy.conf
[02:40] proxy 1233 0.0 0.1 113348 20260 ? Ss 18:32 0:00 /usr/sbin/squid3 -N -YC -f /etc/squid3/squid.conf
[02:43] is squid proxy simply blocked on outbound network connections ?
[02:43] ie, from maas system can you hit archive.ubuntu.com ?
[02:43] smoser, it's not blocked
[02:43] smoser, yes i can
[02:43] can you try the wget above on the maas region controller ?
[02:44] maybe just try restarting squid. that doesn't give you warm fuzzies, but i don't know.
[02:45] smoser, the wget from the controller succeeds but not when i proxy the request through itself.
[02:47] mackrel, didn't we have a similar problem in the lab?
[02:47] right.
[02:47] so yeah, squid is messed up. did you try restarting it ?
[02:47] i did
[02:47] no difference
[02:47] i know that's a hack.
[02:47] but i don't know why it would be hung.
[02:47] you can look at its logs for some info.
[02:49] yes. we experienced this issue before but this was working and has deteriorated to this
[02:52] mackrel, right, no changes were made to squid. everything was working, then it stopped. i rebuilt maas from scratch today and we're still seeing this issue.
[02:53] so see if there is anything in squid error or access logs that gives you any hints
[02:54] the only guess i have at this point is that squid's dns resolution is borked. suspecting something to do with maas taking over dns on that system.
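The wget test in the log (connects to the proxy, then hangs at "awaiting response") can be repeated with a timeout so that a stalled proxy is reported instead of blocking the shell. This is a minimal sketch using only the Python standard library; the function name probe_via_proxy and the 5-second default are assumptions for illustration, not part of MAAS or squid-deb-proxy.

```python
import socket
import urllib.error
import urllib.request


def probe_via_proxy(url, proxy_url, timeout=5.0):
    """Fetch url through proxy_url and classify the outcome as a short string."""
    # Route plain-HTTP requests through the given proxy only.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url})
    )
    try:
        with opener.open(url, timeout=timeout) as resp:
            return f"ok {resp.status}"
    except socket.timeout:
        # The proxy accepted the connection but never answered in time.
        return "timeout"
    except urllib.error.HTTPError as exc:
        # The proxy (or origin) answered with an HTTP error status.
        return f"http error {exc.code}"
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, socket.timeout):
            return "timeout"
        return f"error: {exc.reason}"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # Proxy address taken from the apt configuration quoted in the log.
    print(probe_via_proxy("http://security.ubuntu.com/", "http://192.168.168.7:8000/"))
```

A "timeout" result would match the symptom described in the log: the TCP connection to the proxy succeeds, but no response comes back.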
[02:54] but i don't have a lot of faith in that theory
[02:54] 1401936515.205 4723 192.168.168.7 TCP_MISS_ABORTED/000 0 GET http://security.ubuntu.com/ - HIER_DIRECT/2001:67c:1562::15 -
[02:54] i don't understand why there are two squid processes running
[02:54] proxy 8372 0.0 0.1 115640 22480 ? Ss 20:45 0:00 /usr/sbin/squid3 -N -f /etc/squid-deb-proxy/squid-deb-proxy.conf
[02:54] proxy 8464 0.0 0.1 113216 19956 ? Ss 20:48 0:00 /usr/sbin/squid3 -N -YC -f /etc/squid3/squid.conf
[02:55] well, yeah, that is kind of silly. :)
[02:55] but one of them is squid deb proxy and one is just squid.
[02:55] maas uses squid-deb-proxy...no?
[02:55] squid deb proxy actually i think probably runs on 3128
[02:55] err.
[02:55] i might be wrong
[02:55] i am wrong
[02:55] 8000 is squid deb proxy
[02:55] sudo netstat -tanpo | grep 3128
[02:56] tcp6 0 0 :::3128 :::* LISTEN 8464/squid3 off (0.00/0/0)
[02:56] tcp6 not 4
[02:56] both of them say tcp6
[02:56] sudo netstat -tanpo | grep 8000
[02:56] tcp6 0 0 :::8000 :::* LISTEN 8372/squid3 off (0.00/0/0)
[02:57] one process is listening on 3128 and another on 8000
[02:59] you're saying squid just isn't listening on ipv4 ?
[02:59] telnet localhost 8000 ?
[03:00] smoser, it connects
[03:00] does squid get installed as a depend of maas-dns?
[03:01] localhost resolves to ::1 in /etc/hosts, so I imagine when squid starts it binds itself to localhost and resolves to ipv6. we can probably comment that out, restart squid and it would listen on ipv4
[03:01] probably not maas-dns. probably maas region controller
[03:04] mackrel, any ideas?
[03:08] designated, not really. dns resolution was first step but now it is proxy... seems pretty weird we got three dozen nodes to register before and now kaput
[03:18] smoser, thanks for your help. we're going to have to figure out wth squid is doing
[03:38] designated, sorry couldn't get you past that issue.
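The squid access.log entry pasted at [02:54] is easier to read once split into its fields. The sketch below assumes squid's native access log layout (timestamp, elapsed ms, client, result code/status, bytes, method, URL, user, hierarchy/peer, content type); the names SquidEntry, parse_access_line and peer_ip_version are made up for illustration.

```python
import ipaddress
from typing import NamedTuple, Optional


class SquidEntry(NamedTuple):
    timestamp: float
    elapsed_ms: int
    client: str
    result_code: str
    http_status: int
    size: int
    method: str
    url: str
    hierarchy: str
    peer: str


def parse_access_line(line: str) -> SquidEntry:
    """Split one native-format access.log line into named fields."""
    fields = line.split()
    if len(fields) < 10:
        raise ValueError("expected at least 10 whitespace-separated fields")
    ts, elapsed, client, code_status, size, method, url, _user, hier_peer = fields[:9]
    result_code, _, status = code_status.partition("/")
    hierarchy, _, peer = hier_peer.partition("/")
    return SquidEntry(float(ts), int(elapsed), client, result_code,
                      int(status), int(size), method, url, hierarchy, peer)


def peer_ip_version(entry: SquidEntry) -> Optional[int]:
    """Return 4 or 6 if the peer field is an IP literal, else None."""
    try:
        return ipaddress.ip_address(entry.peer).version
    except ValueError:
        return None


if __name__ == "__main__":
    line = ("1401936515.205 4723 192.168.168.7 TCP_MISS_ABORTED/000 0 GET "
            "http://security.ubuntu.com/ - HIER_DIRECT/2001:67c:1562::15 -")
    entry = parse_access_line(line)
    print(entry.result_code, entry.elapsed_ms, entry.peer, peer_ip_version(entry))
```

For the line in the log, this shows a request aborted after 4723 ms with zero bytes returned, and an upstream peer that is an IPv6 address. Whether IPv6 reachability from the region controller is the cause of the hang is not established in the log; it is only one thing the entry suggests checking.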
=== plars is now known as plars-away
=== vladk|offline is now known as vladk
=== CyberJacob|Away is now known as CyberJacob
=== vladk is now known as vladk|offline
=== vladk|offline is now known as vladk
[08:57] gmb: I'm marking your fix-commissioning-page-distro-list-bug-1312844 branch "needs fixing". The problem I describe could have gone unnoticed because the testing coverage is not complete in this area. Happy to help you with this when you're back from your mini sprint.
[08:57] rvba: Merci. Yeah, you’re right… I kind of knew I was on a bit of a wing and a prayer tests-wise :)
[08:58] anyway
[08:58] * gmb -> sprinting
[09:02] bigjools: why do you think it's best to set it in start_nodes()?
[09:02] rvba: lol
[09:03] rvba: because I figured chaining the jobs would be better, but as you just pointed out we need host entries for other types too
[09:03] the other job being the power_on
[09:03] hmmm
[09:04] I'll add it tomorrow.
[09:04] in claim_static_ip() I mean
[09:04] Sounds good.
[09:05] we have to hope celery gets to it before the power_on :)
[09:06] This is a bit of a gamble.
[09:06] quite
[09:07] hence my question
[09:07] it's not so cut and dry
[09:11] bigjools: we can't reasonably take that risk.
[09:12] rvba: yeah, and I think in terms of abstraction it makes sense to leave it out
[09:13] I would have preferred to have the change in the DB and the setting of the host entry in one place. Because they are the two sides of the same coin (one part is internal the other is external).
=== vladk is now known as vladk|offline
=== vladk|offline is now known as vladk
=== vladk is now known as vladk|offline
=== vladk|offline is now known as vladk
=== vladk is now known as vladk|offline
=== vladk|offline is now known as vladk
[12:43] I need help debugging MaaS. The node (VM) fails to Commission
=== vladk is now known as vladk|offline
[12:49] I need help to debug MaaS. Anyone?
[12:50] I posted the question in askubuntu: http://askubuntu.com/questions/477028/maas-fails-to-commission-nodes
[13:11] Hi Jay_; thanks for the question. I'll have a look at the logs you provided in a short while. I'll get back to you when I do (probably on askubuntu.com).
[13:14] rvba: Thank you very much
=== vladk|offline is now known as vladk
[13:42] rvba: Please let me know if you need any more logs from the box.
[13:46] Jay_: I suggest you have a look at the machine's logs (syslog & co) when it fails to get its IP address. Maybe you'll find a hint as to why it failed commissioning.
[13:52] rvba: Do these logs make any sense?
[13:52] Jun 4 20:03:56 maas dhcpd: DHCPNAK on 50.50.50.13 to 08:00:27:b9:e6:16 via eth0
[13:52] Jun 4 20:04:12 maas dhcpd: DHCPDISCOVER from 08:00:27:b9:e6:16 via eth0
[13:52] Jun 4 20:04:12 maas dhcpd: DHCPOFFER on 50.50.50.58 to 08:00:27:b9:e6:16 via eth0
[13:52] Jun 4 20:04:12 maas dhcpd: DHCPREQUEST for 50.50.50.13 (50.50.50.3) from 08:00:27:b9:e6:16 via eth0: lease 50.50.50.13 unavailable.
[13:53] Jay_: the "lease unavailable" doesn't look too good.
[13:54] rvba: That's where I think MaaS is messing something up. Don't know how to proceed as I know little about MaaS
[13:56] Jay_: can you share the content of the lease file? /var/lib/maas/dhcp/dhcpd.leases
[13:56] leases* even
[13:59] Jay_: The DHCPNAK message also seems to indicate something is wrong with the network config/the DHCP config.
[13:59] rvba: Copied here: https://www.dropbox.com/s/fwu0db3orphz2jk/dhcp-leases
[14:00] rvba: Appears that MaaS assigns an IP and binds MAC during enlistment. Then it is getting confused during Commissioning!
[14:04] Jay_: yes, the assignment is made the first time MAAS sees a node. It should be used throughout a node's lifecycle so that the IP doesn't change.
=== vladk is now known as vladk|offline
[14:05] makes sense. I thought so too. However, the VM PXE boots again during Commissioning, which is when it is getting confused
=== vladk|offline is now known as vladk
=== CyberJacob is now known as CyberJacob|Away
[14:57] which tool do you use to change the volume quotas?
[14:59] cinder!
=== vladk is now known as vladk|offline
[16:34] blake_r, do you know of a quick way to restart all maas services?
[16:36] designated: i just normally restart them one at a time
[16:51] blake_r, so just restart everything in /etc/init/maas-* one at a time?
[16:52] yes
[16:52] and apache2
[16:52] and tgt
=== vladk|offline is now known as vladk
[17:28] blake_r, thank you.
[18:20] blake_r, do you know of a way to disable maas forcing the nodes to use the maas controller as a proxy? during enlistment, squid-deb-proxy doesn't seem to be functioning correctly, I've been troubleshooting it for a couple of days now with no success. I keep getting:
[18:20] Err http://security.ubuntu.com trusty-security Release.gpg
[18:20] Connection failed
[18:21] all of my nodes have direct internet access.
[18:53] smoser, who is responsible for working on the squid-deb-proxy portion of maas and can assist in troubleshooting this issue?
[18:56] well squid-deb-proxy is just an ubuntu package. maas depends on it. you can file a bug against squid-deb-proxy using 'ubuntu-bug squid-deb-proxy'.
=== roadmr is now known as roadmr_afk
[18:58] and you can probably turn up debug info in squid
[18:58] http://www.squid-cache.org/Doc/config/debug_options/
[18:59] smoser, do you know a way to prevent maas from forcing the nodes to proxy the apt requests?
[19:08] designated, grep through /etc/maas -r for http_proxy or just proxy and see if you see anything
[19:09] i think it should show up there.
[19:09] and i think you should even be able to set the proxy in the maas web ui
[19:10] smoser, i don't want a proxy
[19:10] right. i suspect you'll see it set to some value
[19:10] and you can unset it
=== CyberJacob|Away is now known as CyberJacob
[19:19] smoser, I'll try that. thank you
=== roadmr_afk is now known as roadmr
[20:11] smoser, the file /etc/maas/preseeds/commissioning only contains {{preseed_data}}. Does that get pulled in from 'generic' and 'preseed_master'?
[20:12] probably rendered in maas internal.
[20:12] i'd have to look at it.
[20:13] i don't really know. anyone know how to globally disable the squid proxy ?
[20:16] i successfully disabled the proxy server during enlistment and it enlisted perfectly. now trying to do the same for commissioning
[20:23] smoser, I think I found it here: /etc/maas/templates/commissioning-user-data/user_data_config.template
=== vladk is now known as vladk|offline
=== CyberJacob is now known as CyberJacob|Away
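The two searches suggested in the log (grep -r Proxy /etc/apt on a node, and grepping /etc/maas for "proxy" on the controller) amount to a recursive, case-insensitive scan of a config tree. A rough Python equivalent is sketched below; find_proxy_lines is a hypothetical helper name, and the script only reports matches, it changes nothing.

```python
import os
import re

# Case-insensitive, so it matches both "http_proxy" and "Acquire::HTTP::Proxy".
PROXY_RE = re.compile(r"proxy", re.IGNORECASE)


def find_proxy_lines(root):
    """Return (path, line_number, text) for every line under root mentioning a proxy."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, 1):
                        if PROXY_RE.search(line):
                            hits.append((path, lineno, line.rstrip("\n")))
            except OSError:
                # Unreadable file (permissions, broken symlink): skip it.
                continue
    return hits


if __name__ == "__main__":
    for root in ("/etc/apt", "/etc/maas"):
        for path, lineno, text in find_proxy_lines(root):
            print(f"{path}:{lineno}:{text}")
```

On an enlisting node this would be expected to surface the /etc/apt/apt.conf.d/95cloud-init-proxy entry quoted earlier in the log; which MAAS template writes that value is the part the log leaves as a tentative finding.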