=== CyberJacob|Away is now known as CyberJacob | ||
rvba | jtv: thanks for the review on my maas-test branch. Doing the fixes you suggested now. | 06:56 |
---|---|---|
jtv | I was just looking. :) | 06:57 |
rvba | jtv: just pushed the fixes. | 06:57 |
rvba | bigjools: the second problem I saw is that the NUC's power details are not properly autodetected. | 07:14 |
rvba | bigjools: the third problem is that, even if I install amttool manually and fill the power details manually, power_on fails (because the IP is not correctly derived from the MAC address). | 07:15 |
rvba | bigjools: hence my question to gmb "can you confirm that you got an AMT-based machine working with MAAS"… | 07:15 |
bigjools | he said he had, previously | 07:16 |
bigjools | the latter problem is arp cache | 07:16 |
rvba | bigjools: no, the arp cache is fine, checked | 07:16 |
bigjools | it needs something to populate that | 07:16 |
bigjools | oh | 07:16 |
bigjools | then.... wtf! | 07:16 |
rvba | I* checked | 07:16 |
rvba | Yeah, wtf indeed. | 07:16 |
bigjools | I just plugged mine in for the first time, will have a play with it in a moment | 07:16 |
rvba | okay | 07:16 |
bigjools | I have no monitor on it :) | 07:17 |
rvba | bigjools: btw, here is the manpage for the new --power-type/--power-parameters for maas-test: paste.ubuntu.com/7364567/ | 07:20 |
bigjools | rvba: you've been pasting urls without the http:// lately and they're not clickable! | 07:21 |
rvba | bigjools: I know, fu**ing chromium is at fault. | 07:21 |
jtv | Found documentation for those mysterious password requirements: “man amt-howto” | 07:21 |
bigjools | rvba: I use chromium and it's not a problem here | 07:21 |
rvba | jtv: yeah, you need to define an env var | 07:21 |
jtv | ? I meant the mysterious requirements for setting a new password instead of “admin.” | 07:22 |
jtv | Where every password you can reasonably come up with is rejected without explanation. | 07:22 |
rvba | jtv: ah, no. I thought you were talking about passing the pw to amttool | 07:22 |
jtv | Ah | 07:22 |
rvba | bigjools: with the 'http://', for your clicking pleasure: http://paste.ubuntu.com/7364567/ | 07:30 |
bigjools | \o/ | 07:30 |
bigjools | rvba: tip top | 07:31 |
rvba | Cool | 07:32 |
bigjools | oh gee, thanks grub for removing the menu timeout on my headless server | 07:33 |
=== CyberJacob is now known as CyberJacob|Away | ||
rvba | jtv: all good with the changes I made to my 'power-type-support' branch? | 07:44 |
rvba | (btw, I QAed it in the lab. We only have IPMI-based nodes in the lab but at least that's a confirmation that my branch doesn't introduce regressions) | 07:45 |
jtv | rvba: I didn't look in detail — but I have faith. | 07:45 |
rvba | jtv: okay, thanks. Landing it now :). | 07:45 |
jtv | It was already Approved anyway. | 07:46 |
jtv | Now, how do I know how to address my NUC on the network... | 07:46 |
rvba | MAC Address + arp -n | 07:46 |
jtv | That same MAC address that was so helpfully not displayed on boot, and found with random key-bashing... | 07:48 |
rvba | jtv: precisely :) | 07:48 |
rvba | jtv: or, nmap on the node's network will give you the MAC Address *and* the IP. | 07:48 |
jtv | That might be better in this case... not seeing any arp for the node. | 07:50 |
jtv | Ahhh, my own machine hadn't got a dhcp address for whatever reason. | 07:53 |
rvba | Christ, `sudo lshw -xml` doesn't return a lot of details on these NUCs… not surprised MAAS was unable to detect the number of cores or the amount of memory available. | 08:07 |
bigjools | rvba: in which package does amttool live? | 08:31 |
rvba | bigjools: amtterm | 08:31 |
bigjools | ta | 08:32 |
rvba | bigjools: so, the dependency is in 'Suggests' | 08:34 |
bigjools | feh | 08:34 |
bigjools | rvba, allenap, jtv, gmb: what did you think of the acquire() race? | 09:01 |
bigjools | nasty | 09:01 |
rvba | bigjools: I replied to the list about that | 09:01 |
bigjools | I saw | 09:01 |
rvba | bigjools: Like I said, it's unfortunate but Django doesn't do optimistic locking. | 09:02 |
jtv | And yes, isolation level should prevent that. | 09:02 |
bigjools | it needs a lock, yes :) | 09:03 |
bigjools | this is very serious | 09:03 |
bigjools | we need to fix this and backport everywhere | 09:03 |
rvba | Yeah | 09:04 |
rvba | And probably audit the code for other places where we need that locking. | 09:04 |
bigjools | yep | 09:04 |
bigjools | anyway I am switching locations to sort out the NUC | 09:04 |
bigjools | ttfn | 09:04 |
jtv | gmb: any suggestions on what to do about that 1.5 lander? | 09:05 |
gmb | jtv: I appear to have missed context … what’s wrong with it? | 09:06 |
rvba | gmb: "it's stone dead, that's what's wrong with it" :) | 09:06 |
gmb | Oh poo. | 09:06 |
jtv | what he said. | 09:07 |
* gmb pokes | 09:08 | |
gmb | jtv, rvba Looks like tarmac is stuck. Hang on, I’ll kill it and run it manually to see what it’s choking on | 09:09 |
jtv | Thanks. | 09:09 |
rvba | gmb: how do you get the AMT card to issue a DHCP request? | 09:19 |
jtv | AIUI it'd do that as soon as it's plugged into the network, even while off... | 09:25 |
jtv | The lander seems to be running again, but running out of memory. :( | 09:26 |
rvba | jtv: you're right, it got an IP when I unplugged it and then plugged it back on. | 09:27 |
rvba | The only problem was that the ARP table wasn't populated correctly. | 09:27 |
gmb | Yeah, hard resets are the only way to get it to do it. | 09:27 |
gmb | jtv, rvba: Re: the lander: “OSError: [Errno 12] Cannot allocate memory” | 09:27 |
rvba | I had to manually run `nmap` to get the ARP table to be populated. | 09:27 |
gmb | (Which is weird because there’s > 1G free) | 09:28 |
jtv | I think the arp table is populated on demand. | 09:28 |
rvba | bigjools: would you happen to know anything about that? ^. My NUC gets an IP alright but then MAAS fails to power it up because the ARP table didn't contain a record for the NUC's MAC address. | 09:28 |
rvba | bigjools: manually running `nmap` solves the problem of course. | 09:28 |
jtv | Populated on demand. | 09:29 |
rvba | Yeah, but doesn't this mean there is a bit of a flaw with the fact that MAAS expects the ARP table to be populated? | 09:29 |
jtv | ! | 09:30 |
jtv | We rely on a cache lookup? | 09:30 |
rvba | Yes we do. | 09:30 |
jtv | Not a proper rarp lookup!? | 09:30 |
rvba | (1 NUC in ready state) | 09:30 |
rvba | jtv: we run `arp -n` and parse the output. | 09:31 |
jtv | Yeah that won't work... | 09:31 |
rvba | It doesn't. | 09:31 |
* rvba files bug | 09:33 | |
bigjools | it's populated on demand | 09:37 |
bigjools | as I keep saying :) | 09:37 |
bigjools | which is why maas-test uses nmap | 09:37 |
bigjools | it seems to work fine for me, I used its IP address | 09:37 |
bigjools | FPI runs in 30 seconds, lol | 09:38 |
rvba | bigjools: my testing shows that it's very unreliable. | 09:38 |
bigjools | v reliable for me! | 09:38 |
bigjools | in my admittedly limited set of tests | 09:39 |
rvba | That's a shame because we have the MAC<->IP correspondence in MAAS (from the parsing of the lease file). | 09:39 |
bigjools | yes | 09:39 |
bigjools | we can shortcut this for amt | 09:39 |
rvba | Filed bug 1314559. | 09:41 |
rvba | https://bugs.launchpad.net/maas/+bug/1314559 | 09:41 |
bigjools | ubuntu@nuc1:~$ | 09:49 |
bigjools | hehe | 09:49 |
rvba | 30 seconds? That seems quicker than what I saw. | 09:52 |
bigjools | I'll test again tomorrow | 09:53 |
rvba | Timing [hit start node → ssh connect] now…. | 09:53 |
rvba | 2min24s | 09:54 |
rvba | (Using fpi) | 09:54 |
allenap | bigjools, rvba, jtv: How about we bite the bullet and switch to serialized? | 09:58 |
bigjools | why is there a bullet? | 09:59 |
allenap | bigjools: Fallout from doing it. Having to add a middleware for retries. | 09:59 |
bigjools | somehow I was naive enough to think that would have been handled by Django already | 10:02 |
bigjools | allenap: but yes, we need to do it, the DB is not safe otherwise. | 10:03 |
* gmb -> errands and lunch | 10:03 | |
rvba | allenap: there are a bunch of plugins for Django that implement optimistic locking. | 10:06 |
rvba | jtv: I'm wondering if checking for the presence of a rogue DHCP server on the network *every minute* isn't a bit too aggressive… it certainly generates a lot of traffic/log noise. | 10:08 |
jtv | rvba: Agreed. I think the main thing is that we check initially... | 10:10 |
rvba | jtv: filed https://bugs.launchpad.net/maas/+bug/1314571 (low priority of course) | 10:12 |
bigjools | rvba: I'd be ok with that changing to say 15 minutes | 10:14 |
rvba | bigjools: yeah, but like Jeroen said, it would be important to get the network scanned *initially* (i.e. when the user is likely to be configuring his DHCP server). | 10:15 |
bigjools | yes | 10:15 |
rvba | roaksoax: Hi Andres… as you probably know, we're testing MAAS on NUC nodes that use AMT for power-management. Since amtterm is only in 'Suggests', these nodes don't work with MAAS "out-of-the-box": you need to manually install amtterm to get the tool (amttool) that the AMT power template uses. Isn't that a bit suboptimal (it's quite hard to understand what's wrong, you need to dig up the celery log file)? | 12:59 |
rvba | Why was it decided not to add amtterm as a hard dependency (or, at least, as a 'recommends')? | 13:00 |
rvba | gmb: Did the power credentials detection work on the AMT boxes you've used? | 13:01 |
roaksoax | rvba: hey | 13:02 |
roaksoax | roadmr: release (universe) | 13:02 |
roaksoax | rerrr | 13:02 |
roaksoax | rvba: ^^ | 13:02 |
roaksoax | rvba: amtterm is in universe | 13:03 |
rvba | arg | 13:03 |
roadmr | roaksoax: masters of the universe unite! | 13:03 |
roaksoax | roadmr: :) | 13:03 |
rvba | roaksoax: I understand why it can't be a hard dependency then… but still, the experience is clearly suboptimal. | 13:03 |
roaksoax | rvba: same as ipmitool | 13:04 |
roaksoax | rvba: unfortunately, I don't think I added the dependency myself | 13:04 |
roaksoax | rvba: so that should have gone through a MIR | 13:04 |
rvba | roaksoax: IPMI works out of the box AFAIK | 13:04 |
roaksoax | rvba: which means more dependencies | 13:04 |
rvba | Because we're using free-ipmi or something | 13:04 |
roaksoax | rvba: IPMI when it uses freeipmi-tools not when using ipmitool | 13:04 |
rvba | roaksoax: so we should aim for either a) things working out of the box or b) a very clear message about what you need to install | 13:05 |
roaksoax | rvba: i don't disagree with what you are saying | 13:06 |
roaksoax | rvba: i'm just saying that at the time this happened, a request should have been made to take care of this dependency | 13:07 |
roaksoax | rvba: but anyways, I no longer maintain the packaging :) | 13:07 |
roaksoax | rvba: lutostag should be helping with this now | 13:07 |
rvba | roaksoax: sure, I was asking you because you know packaging better than I do… and because the changelog says you added the 'suggests' :) | 13:08 |
roaksoax | rvba: yeah I changed it from depends to suggests | 13:08 |
roaksoax | IIRC | 13:08 |
roaksoax | rvba: but at the time that happened, it was too late to do something about it | 13:08 |
roaksoax | I can't remember really | 13:08 |
rvba | roaksoax: all right. I'll file a bug about this and see what lutostag can do about it. | 13:08 |
roaksoax | rvba: thanks! he should be able to file a MIR for utopic and then we might be able to MIR | 13:09 |
gmb | rvba: I don’t recall trying the credentials detection on the orange boxes. | 13:13 |
rvba | gmb: ah ok… I'm not sure it's a feature that's supposed to be there… | 13:16 |
rvba | tych0: do you happen to know if credential auto-detection is something that's possible with AMT/supposed to be working within MAAS right now? (I'm asking you because I see you've added the AMT power template.) | 13:17 |
tych0 | rvba: it isn't supposed to be working | 13:17 |
tych0 | i think it is possible, but i spent a day fiddling with it when i did it and i couldn't figure out how to do it | 13:18 |
rvba | tych0: okay, fair enough. If you have bits and pieces (i.e. a vague embryo of support), don't hesitate to file a bug with all the details you have so that, maybe, someone else will pick it up. | 13:19 |
tych0 | ok | 13:19 |
tych0 | i'll see what I can dig up | 13:19 |
rvba | tych0: fwiw, I just filed bug https://bugs.launchpad.net/maas/+bug/1314629 | 13:20 |
tych0 | ah, yeah | 13:20 |
tych0 | well, actually amtterm isn't that huge | 13:20 |
tych0 | it is like a 400 line perl script | 13:20 |
tych0 | and maas doesn't even use most of it | 13:20 |
tych0 | so you could probably get away with using something else there | 13:20 |
rvba | Something else as in "extract the bits that we need from amttool and ship it with MAAS" or as in "use another package"? | 13:21 |
tych0 | the former | 13:27 |
tych0 | i don't think there are any other packages, really | 13:27 |
tych0 | most of intel's tools are windows-based | 13:27 |
rvba | I see. | 13:30 |
=== CyberJacob|Away is now known as CyberJacob | ||
=== roadmr is now known as roadmr_afk | ||
=== roadmr_afk is now known as roadmr | ||
=== CyberJacob is now known as CyberJacob|Away | ||
newell_ | Wondering if anyone is around that might be able to help me troubleshoot something I am running into. I am running into the same issue as highlighted in this link: | 23:16 |
newell_ | http://askubuntu.com/questions/370971/ubuntu-13-10-server-maas-pb-to-import-boot-images | 23:16 |
newell_ | However, when I check for celery.log I do not find one in /var/log/maas | 23:17 |
bigjools | hi | 23:18 |
bigjools | I can try to help | 23:18 |
newell_ | hello bigjools, thanks appreciate it | 23:18 |
newell_ | I am running from latest source | 23:18 |
bigjools | are you using maas from trusty? | 23:18 |
newell_ | I followed the HACKING.txt | 23:19 |
bigjools | ah | 23:19 |
newell_ | so checked out the latest code | 23:19 |
bigjools | running from source | 23:19 |
newell_ | yes | 23:19 |
bigjools | doing that is a bit harder than using packages | 23:19 |
newell_ | yeah I know | 23:19 |
newell_ | but I want to fix some bugs | 23:19 |
newell_ | ;) | 23:19 |
bigjools | did you start up maas? | 23:20 |
newell_ | yes | 23:20 |
bigjools | "make run" | 23:20 |
newell_ | able to login to dashboard etc | 23:20 |
bigjools | there's a logs/ dir at the top level of src | 23:20 |
newell_ | yeah I have been looking at those...still trying to track this down | 23:20 |
bigjools | but unfortunately it will try to run stuff that is not installed | 23:20 |
bigjools | I think there's code that assumes packaged locations | 23:21 |
bigjools | you can try to run scripts/maas-import-pxe-files after editing etc/maas/bootresources.yaml | 23:21 |
newell_ | bigjools, what is your advice in terms of getting a setup that is conducive to fixing bugs and verifying the fix? | 23:21 |
newell_ | Ok | 23:21 |
bigjools | we've never needed to import pxe files to test at this source level you | 23:22 |
bigjools | s/you^// | 23:22 |
bigjools | the testing is done in unit tests with fake resources | 23:22 |
bigjools | otherwise working with such huge files would be a right pain | 23:22 |
newell_ | Okay if I wanted to work on a bug like this: https://bugs.launchpad.net/maas/+bug/1283106 | 23:23 |
ubot5 | Ubuntu bug 1283106 in MAAS "MAAS allows the same subnet to be defined on two managed interfaces of the same cluster" [High,Triaged] | 23:23 |
bigjools | once a fix is in place, we do "make package" to build some .debs and install them in a separate VM | 23:23 |
newell_ | Do you guys usually just use your testing framework? | 23:23 |
bigjools | yes - that one's pretty easy to fix, just needs some overlapping network checks | 23:23 |
bigjools | there's an IPNetwork class somewhere | 23:23 |
bigjools | you don't need to import images to fix it | 23:23 |
newell_ | ah okay | 23:23 |
bigjools | thanks for looking to fix things! | 23:24 |
newell_ | bigjools, to verify the fix would you just try and manage two cluster interfaces with the same subnet and see if the error pops up again? | 23:25 |
bigjools | newell_: yes - create a unit test case *first* to reproduce the problem | 23:26 |
bigjools | then once done you can work on a fix and keep running the test until it passes | 23:26 |
newell_ | bigjools, so if I created my own unit test I would do something like: Just to verify: $ ./bin/test.maas test src/maasserver/tests/test_cluster_subnets.py | 23:39 |
bigjools | newell_: yes but don't make a new file just for one test, find existing tests for the functionality and add a new test case | 23:40 |
newell_ | k | 23:40 |
newell_ | thanks | 23:40 |
bigjools | np | 23:41 |
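
The "overlapping network checks" bigjools suggests for bug 1283106 could look roughly like the sketch below. It is only an illustration: it uses Python's standard `ipaddress` module rather than the `IPNetwork` class mentioned in the log, and the function name `find_overlapping_subnets` is hypothetical, not something taken from the MAAS source.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlapping_subnets(cidrs):
    """Return every pair of subnets in `cidrs` that share addresses.

    `cidrs` is a list of CIDR strings, one per managed cluster interface
    (an assumed input shape, chosen for this example).
    """
    # strict=False accepts host addresses such as "10.0.0.5/24" and
    # normalises them to the enclosing network.
    networks = [ip_network(cidr, strict=False) for cidr in cidrs]
    return [
        (str(a), str(b))
        for a, b in combinations(networks, 2)
        if a.overlaps(b)
    ]


if __name__ == "__main__":
    # Two interfaces configured with the same subnet, plus an unrelated one.
    print(find_overlapping_subnets(
        ["10.0.0.0/24", "10.0.0.0/24", "192.168.0.0/24"]))
```

A check like this would fit the advice above: write the failing unit test first (two interfaces, one subnet), then reject the configuration whenever the returned list is non-empty.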
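
The ARP problem discussed earlier in the log (MAAS runs `arp -n` and parses the output, but the cache is only populated on demand) can be illustrated with a small sketch. Everything here is an assumption made for the example: the function names, the sample table text (made up, in the usual net-tools column layout), and the idea of passing the lease-derived MAC-to-IP mapping as a plain dict. It is not the MAAS implementation.

```python
def ip_from_arp_output(arp_output, mac):
    """Return the IP listed for `mac` in `arp -n`-style text, or None."""
    wanted = mac.lower()
    for line in arp_output.splitlines():
        fields = line.split()
        # Complete rows look like: <ip> <hwtype> <mac> <flags> <iface>.
        # Header and "(incomplete)" rows never have a MAC in column three.
        if len(fields) >= 3 and fields[2].lower() == wanted:
            return fields[0]
    return None


def resolve_ip(mac, arp_output, leases):
    """Prefer the lease-file mapping; fall back to the ARP cache.

    `leases` is a hypothetical dict of lower-case MAC -> IP, standing in
    for whatever the lease-file parser produces.
    """
    ip = leases.get(mac.lower())
    if ip is not None:
        return ip
    return ip_from_arp_output(arp_output, mac)


# Made-up sample in the usual `arp -n` layout, not captured from a real host.
SAMPLE_ARP = """\
Address                  HWtype  HWaddress           Flags Mask            Iface
192.168.1.1              ether   00:11:22:33:44:55   C                     eth0
192.168.1.7                      (incomplete)                              eth0
"""

if __name__ == "__main__":
    nuc_mac = "AA:BB:CC:DD:EE:FF"
    # Cache miss: the NUC has a lease but nothing has talked to it yet.
    print(ip_from_arp_output(SAMPLE_ARP, nuc_mac))
    # The lease mapping still resolves it.
    print(resolve_ip(nuc_mac, SAMPLE_ARP, {"aa:bb:cc:dd:ee:ff": "192.168.1.50"}))
```

The first lookup returns `None` because nothing has populated the cache, which matches the `power_on` failure rvba describes; consulting the lease mapping first is one way to read bigjools's "we can shortcut this for amt".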