=== CyberJacob|Away is now known as CyberJacob
[06:56] jtv: thanks for the review on my maas-test branch. Doing the fixes you suggested now.
[06:57] I was just looking. :)
[06:57] jtv: just pushed the fixes.
[07:14] bigjools: the second problem I saw is that the NUC's power details are not properly autodetected.
[07:15] bigjools: the third problem is that, even if I install amttool manually and fill the power details manually, power_on fails (because the IP is not correctly derived from the MAC address).
[07:15] bigjools: hence my question to gmb "can you confirm that you got an AMT-based machine working with MAAS"…
[07:16] he said he had, previously
[07:16] the latter problem is arp cache
[07:16] bigjools: no, the arp cache is fine, checked
[07:16] it needs something to populate that
[07:16] oh
[07:16] then.... wtf!
[07:16] I* checked
[07:16] Yeah, wtf indeed.
[07:16] I just plugged mine in for the first time, will have a play with it in a moment
[07:16] okay
[07:17] I have no monitor on it :)
[07:20] bigjools: btw, here is the manpage for the new --power-type/--power-parameters for maas-test: paste.ubuntu.com/7364567/
[07:21] rvba: you've been pasting urls without the http:// lately and they're not clickable!
[07:21] bigjools: I know, fu**ing chromium is at fault.
[07:21] Found documentation for those mysterious password requirements: “man amt-howto”
[07:21] rvba: I use chromium and it's not a problem here
[07:21] jtv: yeah, you need to define an env var
[07:22] ? I meant the mysterious requirements for setting a new password instead of “admin.”
[07:22] Where every password you can reasonably come up with is rejected without explanation.
[07:22] jtv: ah, no.
I thought you were talking about passing the pw to amttool
[07:22] Ah
[07:30] bigjools: with the 'http://', for your clicking pleasure: http://paste.ubuntu.com/7364567/
[07:30] \o/
[07:31] rvba: tip top
[07:32] Cool
[07:33] oh gee, thanks grub for removing the menu timeout on my headless server
=== CyberJacob is now known as CyberJacob|Away
[07:44] jtv: all good with the changes I made to my 'power-type-support' branch?
[07:45] (btw, I QAed it in the lab. We only have IPMI-based nodes in the lab but at least that's a confirmation that my branch doesn't introduce regressions)
[07:45] rvba: I didn't look in detail, but I have faith.
[07:45] jtv: okay, thanks. Landing it now :).
[07:46] It was already Approved anyway.
[07:46] Now, how do I know how to address my NUC on the network...
[07:46] MAC Address + arp -n
[07:48] That same MAC address that was so helpfully not displayed on boot, and found with random key-bashing...
[07:48] jtv: precisely :)
[07:48] jtv: or, nmap on the node's network will give you the MAC Address *and* the IP.
[07:50] That might be better in this case... not seeing any arp for the node.
[07:53] Ahhh, my own machine hadn't got a dhcp address for whatever reason.
[08:07] Christ, `sudo lshw -xml` doesn't return a lot of details on these NUCs… not surprised MAAS was unable to detect the number of cores or the amount of memory available.
[08:31] rvba: in which package does amttool live?
[08:31] bigjools: amtterm
[08:32] ta
[08:34] bigjools: so, the dependency is in 'Suggests'
[08:34] feh
[09:01] rvba, allenap, jtv, gmb: what did you think of the acquire() race?
[09:01] nasty
[09:01] bigjools: I replied to the list about that
[09:01] I saw
[09:02] bigjools: Like it said, it's unfortunate but Django doesn't do optimistic locking.
[09:02] And yes, isolation level should prevent that.
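The acquire() race mentioned above comes down to two requests both reading a node as available and both claiming it. One way to close that window, sketched here under loud assumptions, is a conditional UPDATE whose row count tells the caller whether it won. This is a compare-and-set, not necessarily the fix the participants have in mind. It uses Python's sqlite3 module only so that it runs anywhere; the `node` table, its columns, and the `make_db` and `acquire` helpers are hypothetical and are not MAAS's schema or code.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one node in the 'ready' state."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, status TEXT, owner TEXT)"
    )
    conn.execute("INSERT INTO node (id, status, owner) VALUES (1, 'ready', NULL)")
    conn.commit()
    return conn


def acquire(conn, node_id, user):
    """Claim a node only if it is still 'ready' (hypothetical helper).

    The status check and the write happen in a single UPDATE statement,
    so a caller that saw the node as 'ready' earlier cannot claim it
    after someone else already has. Returns True if this caller won.
    """
    cursor = conn.execute(
        "UPDATE node SET status = 'allocated', owner = ? "
        "WHERE id = ? AND status = 'ready'",
        (user, node_id),
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE actually modified.
    return cursor.rowcount == 1
```

Calling `acquire` twice for the same node returns True for the first caller and False for the second, and the loser can then pick another node instead of silently sharing one.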
[09:03] it needs a lock, yes :)
[09:03] this is very serious
[09:03] we need to fix this and backport everywhere
[09:04] Yeah
[09:04] And probably audit the code for other places where we need that locking.
[09:04] yep
[09:04] anyway I am switching locations to sort out the NUC
[09:04] ttfn
[09:05] gmb: any suggestions on what to do about that 1.5 lander?
[09:06] jtv: I appear to have missed context … what’s wrong with it?
[09:06] gmb: "it's stone dead, that's what's wrong with it" :)
[09:06] Oh poo.
[09:07] what he said.
[09:08] * gmb pokes
[09:09] jtv, rvba Looks like tarmac is stuck. Hang on, I’ll kill it and run it manually to see what it’s choking on
[09:09] Thanks.
[09:19] gmb: how do you get the AMT card to issue a DHCP request?
[09:25] AIUI it'd do that as soon as it's plugged into the network, even while off...
[09:26] The lander seems to be running again, but running out of memory. :(
[09:27] jtv: you're right, it got an IP when I unplugged it and then plugged it back in.
[09:27] The only problem was that the ARP table wasn't populated correctly.
[09:27] Yeah, hard resets are the only way to get it to do it.
[09:27] jtv, rvba: Re: the lander: “OSError: [Errno 12] Cannot allocate memory”
[09:27] I had to manually run `nmap` to get the ARP table to be populated.
[09:28] (Which is weird because there’s > 1G free)
[09:28] I think the arp table is populated on demand.
[09:28] bigjools: would you happen to know anything about that? ^. My NUC gets an IP alright but then MAAS fails to power it up because the ARP table didn't contain a record for the NUC's MAC address.
[09:28] bigjools: manually running `nmap` solves the problem of course.
[09:29] Populated on demand.
[09:29] Yeah, but doesn't this mean there is a bit of a flaw with the fact that MAAS expects the ARP table to be populated?
[09:30] !
[09:30] We rely on a cache lookup?
[09:30] Yes we do.
[09:30] Not a proper rarp lookup!?
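For readers unfamiliar with the lookup being questioned here: resolving a MAC address to an IP by reading the kernel's ARP cache only works if something has recently talked to that host. A rough sketch of such a lookup follows. `find_ip_for_mac` and the sample text are illustrative assumptions, not MAAS's actual code, and the column layout assumed is the usual Linux net-tools `arp -n` format.

```python
def find_ip_for_mac(arp_output, mac):
    """Return the IP address listed for `mac` in `arp -n` style output.

    Returns None when the MAC is absent, which is what a caller sees
    when the cache has not been populated for that host yet.
    """
    wanted = mac.lower()
    for line in arp_output.splitlines():
        fields = line.split()
        # Assumed columns: Address, HWtype, HWaddress, Flags Mask, Iface.
        # The header row and incomplete entries carry no hardware address
        # in the third column, so they never match.
        if len(fields) >= 3 and fields[2].lower() == wanted:
            return fields[0]
    return None


# Made-up sample in the net-tools layout; all addresses are placeholders.
SAMPLE = """\
Address                  HWtype  HWaddress           Flags Mask            Iface
192.168.1.1              ether   00:11:22:33:44:55   C                     eth0
192.168.1.50                     (incomplete)                              eth0
"""
```

A None result here does not mean the host is down, only that nothing has put an entry for it in the cache, which matches the "populated on demand" behaviour described in the log.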
[09:30] (1 NUC in ready state)
[09:31] jtv: we run `arp -n` and parse the output.
[09:31] Yeah that won't work...
[09:31] It doesn't.
[09:33] * rvba files bug
[09:37] it's populated on demand
[09:37] as I keep saying :)
[09:37] which is why maas-test uses nmap
[09:37] it seems to work fine for me, I used its IP address
[09:38] FPI runs in 30 seconds, lol
[09:38] bigjools: my testing shows that it's very unreliable.
[09:38] v reliable for me!
[09:39] in my admittedly limited set of tests
[09:39] That's a shame because we have the MAC<->IP correspondence in MAAS (from the parsing of the lease file).
[09:39] yes
[09:39] we can shortcut this for amt
[09:41] Filed bug 1314559.
[09:41] https://bugs.launchpad.net/maas/+bug/1314559
[09:49] ubuntu@nuc1:~$
[09:49] hehe
[09:52] 30 seconds? That seems quicker than what I saw.
[09:53] I'll test again tomorrow
[09:53] Timing [hit start node → ssh connect] now….
[09:54] 2min24s
[09:54] (Using fpi)
[09:58] bigjools, rvba, jtv: How about we bite the bullet and switch to serialized?
[09:59] why is there a bullet?
[09:59] bigjools: Fallout from doing it. Having to add a middleware for retries.
[10:02] somehow I was naive enough to think that would have been handled by Django already
[10:03] allenap: but yes, we need to do it, the DB is not safe otherwise.
[10:03] * gmb -> errands and lunch
[10:06] allenap: there are a bunch of plugins for Django that implement optimistic locking.
[10:08] jtv: I'm wondering if checking for the presence of a rogue DHCP server on the network *every minute* isn't a bit too aggressive… it certainly generates a lot of traffic/log noise.
[10:10] rvba: Agreed. I think the main thing is that we check initially...
[10:12] jtv: filed https://bugs.launchpad.net/maas/+bug/1314571 (low priority of course)
[10:14] rvba: I'd be ok with that changing to say 15 minutes
[10:15] bigjools: yeah, but like Jeroen said, it would be important to get the network scanned *initially* (i.e.
when the user is likely to be configuring his DHCP server).
[10:15] yes
[12:59] roaksoax: Hi Andres… as you probably know, we're testing MAAS on NUC nodes that use AMT for power-management. Since amtterm is only in 'Suggests', these nodes don't work with MAAS "out-of-the-box": you need to manually install amtterm to get the tool (amttool) that the AMT power template uses. Isn't that a bit suboptimal (it's quite hard to understand what's wrong, you need to dig up the celery log file)?
[13:00] Why was it decided not to add amtterm as a hard dependency (or, at least, as a 'recommends')?
[13:01] gmb: Did the power credentials detection work on the AMT boxes you've used?
[13:02] rvba: hey
[13:02] roadmr: release (universe)
[13:02] rerrr
[13:02] rvba: ^^
[13:03] rvba: amtterm is in universe
[13:03] arg
[13:03] roaksoax: masters of the universe unite!
[13:03] roadmr: :)
[13:03] roaksoax: I understand why it can't be a hard dependency then… but still, the experience is clearly suboptimal.
[13:04] rvba: same as ipmitool
[13:04] rvba: unfortunately, I don't think I added the dependency myself
[13:04] rvba: so that should have gone through a MIR
[13:04] roaksoax: IPMI works out of the box AFAIK
[13:04] rvba: which means more dependencies
[13:04] Because we're using free-ipmi or something
[13:04] rvba: IPMI when it uses freeipmi-tools not when using ipmitool
[13:05] roaksoax: so we should aim for either a) things working out of the box or b) a very clear message about what you need to install
[13:06] rvba: i don't disagree with what you are saying
[13:07] rvba: i'm just saying that at the time this happened, a request should have been made to take care of this dependency
[13:07] rvba: but anyways, I no longer maintain the packaging :)
[13:07] rvba: lutostag should be helping with this now
[13:08] roaksoax: sure, I was asking you because you know packaging better than I do… and because the changelog says you added the 'suggests' :)
[13:08] rvba: yeah I changed it from depends to
suggests
[13:08] IIRC
[13:08] rvba: but at the time that happened, it was too late to do something about it
[13:08] I can't remember really
[13:08] roaksoax: all right. I'll file a bug about this and see what lutostag can do about it.
[13:09] rvba: thanks! he should be able to file a MIR for utopic and then we might be able to MIR
[13:13] rvba: I don’t recall trying the credentials detection on the orange boxes.
[13:16] gmb: ah ok… I'm not sure it's a feature that's supposed to be there…
[13:17] tych0: do you happen to know if credential auto-detection is something that's possible with AMT/supposed to be working within MAAS right now? (I'm asking you because I see you've added the AMT power template.)
[13:17] rvba: it isn't supposed to be working
[13:18] i think it is possible, but i spent a day fiddling with it when i did it and i couldn't figure out how to do it
[13:19] tych0: okay, fair enough. If you have bits and pieces (i.e. a vague embryo of support), don't hesitate to file a bug with all the details you have so that, maybe, someone else will pick it up.
[13:19] ok
[13:19] i'll see what I can dig up
[13:20] tych0: fwiw, I just filed bug https://bugs.launchpad.net/maas/+bug/1314629
[13:20] ah, yeah
[13:20] well, actually amtterm isn't that huge
[13:20] it is like a 400 line perl script
[13:20] and maas doesn't even use most of it
[13:20] so you could probably get away with using something else there
[13:21] Something else as in "extract the bits that we need from amttool and ship it with MAAS" or as in "use another package"?
[13:27] the former
[13:27] i don't think there are any other packages, really
[13:27] most of intel's tools are windows-based
[13:30] I see.
=== CyberJacob|Away is now known as CyberJacob
=== roadmr is now known as roadmr_afk
=== roadmr_afk is now known as roadmr
=== CyberJacob is now known as CyberJacob|Away
[23:16] Wondering if anyone is around that might be able to help me troubleshoot something I am running into.
I am running into the same issue as highlighted in this link:
[23:16] http://askubuntu.com/questions/370971/ubuntu-13-10-server-maas-pb-to-import-boot-images
[23:17] However, when I check for celery.log I do not find one in /var/log/maas
[23:18] hi
[23:18] I can try to help
[23:18] hello bigjools, thanks appreciate it
[23:18] I am running from latest source
[23:18] are you using maas from trusty?
[23:19] I followed the HACKING.txt
[23:19] ah
[23:19] so checked out the latest code
[23:19] running from source
[23:19] yes
[23:19] doing that is a bit harder than using packages
[23:19] yeah I know
[23:19] but I want to fix some bugs
[23:19] ;)
[23:20] did you start up maas?
[23:20] yes
[23:20] "make run"
[23:20] able to login to dashboard etc
[23:20] there's a logs/ dir at the top level of src
[23:20] yeah I have been looking at those...still trying to track this down
[23:20] but unfortunately it will try to run stuff that is not installed
[23:21] I think there's code that assumes packaged locations
[23:21] you can try to run scripts/maas-import-pxe-files after editing etc/maas/bootresources.yaml
[23:21] bigjools, what is your advice in terms of getting a setup that is conducive to fixing bugs and verifying the fix?
[23:21] Ok
[23:22] we've never needed to import pxe files to test at this source level you
[23:22] s/you^//
[23:22] the testing is done in unit tests with fake resources
[23:22] otherwise working with such huge files would be a right pain
[23:23] Okay if I wanted to work on a bug like this: https://bugs.launchpad.net/maas/+bug/1283106
[23:23] Ubuntu bug 1283106 in MAAS "MAAS allows the same subnet to be defined on two managed interfaces of the same cluster" [High,Triaged]
[23:23] once a fix is in place, we do "make package" to build some .debs and install them in a separate VM
[23:23] Do you guys usually just use your testing framework?
[23:23] yes - that one's pretty easy to fix, just needs some overlapping network checks
[23:23] there's an IPNetwork class somewhere
[23:23] you don't need to import images to fix it
[23:23] ah okay
[23:24] thanks for looking to fix things!
[23:25] bigjools, to verify the fix would you just try and manage two cluster interfaces with the same subnet and see if the error pops up again?
[23:26] newell_: yes - create a unit test case *first* to reproduce the problem
[23:26] then once done you can work on a fix and keep running the test until it passes
[23:39] bigjools, so if I created my own unit test I would do something like: Just to verify: $ ./bin/test.maas test src/maasserver/tests/test_cluster_subnets.py
[23:40] newell_: yes but don't make a new file just for one test, find existing tests for the functionality and add a new test case
[23:40] k
[23:40] thanks
[23:41] np
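The "overlapping network checks" suggested for bug 1283106 could look roughly like the sketch below. It is an illustration only: it uses Python's standard `ipaddress` module rather than the IPNetwork class mentioned in the log, and `find_overlapping_subnets` is a hypothetical helper, not a function in MAAS.

```python
import ipaddress
from itertools import combinations


def find_overlapping_subnets(cidrs):
    """Return every pair of CIDR strings whose networks overlap.

    Hypothetical helper: a validator could reject a cluster whose
    managed interfaces produce a non-empty result here.
    """
    networks = [(cidr, ipaddress.ip_network(cidr)) for cidr in cidrs]
    return [
        (cidr_a, cidr_b)
        for (cidr_a, net_a), (cidr_b, net_b) in combinations(networks, 2)
        # overlaps() raises TypeError across IP versions, so compare
        # only networks of the same version.
        if net_a.version == net_b.version and net_a.overlaps(net_b)
    ]
```

In the test-first spirit of the advice above, a new test case would first assert that two interfaces on the same subnet are reported as a conflict, watch it fail against the current code, and only then add a check like this to the validation path.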