/srv/irclogs.ubuntu.com/2014/04/30/#maas.txt

=== CyberJacob|Away is now known as CyberJacob
rvbajtv: thanks for the review on my maas-test branch.  Doing the fixes you suggested now.06:56
jtvI was just looking.  :)06:57
rvbajtv: just pushed the fixes.06:57
rvbabigjools: the second problem I saw is that the NUC's power details are not properly autodetected.07:14
rvbabigjools: the third problem is that, even if I install amttool manually and fill the power details manually, power_on fails (because the IP is not correctly derived from the MAC address).07:15
rvbabigjools: hence my question to gmb "can you confirm that you got an AMT-based machine working with MAAS"…07:15
bigjoolshe said he had, previously07:16
bigjoolsthe latter problem is arp cache07:16
rvbabigjools: no, the arp cache is fine, checked07:16
bigjoolsit needs something to populate that07:16
bigjoolsoh07:16
bigjoolsthen.... wtf!07:16
rvbaI* checked07:16
rvbaYeah, wtf indeed.07:16
bigjoolsI just plugged mine in for the first time, will have a play with it in a moment07:16
rvbaokay07:16
bigjoolsI have no monitor on it :)07:17
rvbabigjools: btw, here is the manpage for the new --power-type/--power-parameters for maas-test: paste.ubuntu.com/7364567/07:20
bigjoolsrvba: you've been pasting urls without the http:// lately and they're not clickable!07:21
rvbabigjools: I know, fu**ing chromium is at fault.07:21
jtvFound documentation for those mysterious password requirements: “man amt-howto”07:21
bigjoolsrvba: I use chromium and it's not a problem here07:21
rvbajtv: yeah, you need to define an env var07:21
jtv?  I meant the mysterious requirements for setting a new password instead of “admin.”07:22
jtvWhere every password you can reasonably come up with is rejected without explanation.07:22
rvbajtv: ah, no.  I thought you were talking about passing the pw to amttool07:22
jtvAh07:22
rvbabigjools: with the 'http://', for your clicking pleasure: http://paste.ubuntu.com/7364567/07:30
bigjools\o/07:30
bigjoolsrvba: tip top07:31
rvbaCool07:32
bigjoolsoh gee, thanks grub for removing the menu timeout on my headless server07:33
=== CyberJacob is now known as CyberJacob|Away
rvbajtv: all good with the changes I made to my 'power-type-support' branch?07:44
rvba(btw, I QAed it in the lab.  We only have IPMI-based nodes in the lab but at least that's a confirmation that my branch doesn't introduce regressions)07:45
jtvrvba: I didn't look in detail — but I have faith.07:45
rvbajtv: okay, thanks.  Landing it now :).07:45
jtvIt was already Approved anyway.07:46
jtvNow, how do I know how to address my NUC on the network...07:46
rvbaMAC Address + arp -n07:46
jtvThat same MAC address that was so helpfully not displayed on boot, and found with random key-bashing...07:48
rvbajtv: precisely :)07:48
rvbajtv: or, nmap on the node's network will give you the MAC Address *and* the IP.07:48
jtvThat might be better in this case...  not seeing any arp for the node.07:50
jtvAhhh, my own machine hadn't got a dhcp address for whatever reason.07:53
rvbaChrist, `sudo lshw -xml` doesn't return a lot of details on these NUCs… not surprised MAAS was unable to detect the number of cores or the amount of memory available.08:07
bigjoolsrvba: in which package does amttool live?08:31
rvbabigjools: amtterm08:31
bigjoolsta08:32
rvbabigjools: so, the dependency is in 'Suggests'08:34
bigjoolsfeh08:34
bigjoolsrvba, allenap, jtv, gmb: what did you think of the acquire() race?09:01
bigjoolsnasty09:01
rvbabigjools: I replied to the list about that09:01
bigjoolsI saw09:01
rvbabigjools: Like I said, it's unfortunate but Django doesn't do optimistic locking.09:02
jtvAnd yes, isolation level should prevent that.09:02
bigjoolsit needs a lock, yes :)09:03
bigjoolsthis is very serious09:03
bigjoolswe need to fix this and backport everywhere09:03
rvbaYeah09:04
rvbaAnd probably audit the code for other places where we need that locking.09:04
bigjoolsyep09:04
bigjoolsanyway I am switching locations to sort out the NUC09:04
bigjoolsttfn09:04
jtvgmb: any suggestions on what to do about that 1.5 lander?09:05
gmbjtv: I appear to have missed context … what’s wrong with it?09:06
rvbagmb: "it's stone dead, that's what's wrong with it" :)09:06
gmbOh poo.09:06
jtvwhat he said.09:07
* gmb pokes09:08
gmbjtv, rvba Looks like tarmac is stuck. Hang on, I’ll kill it and run it manually to see what it’s choking on09:09
jtvThanks.09:09
rvbagmb: how do you get the AMT card to issue a DHCP request?09:19
jtvAIUI it'd do that as soon as it's plugged into the network, even while off...09:25
jtvThe lander seems to be running again, but running out of memory.  :(09:26
rvbajtv: you're right, it got an IP when I unplugged it and then plugged it back in.09:27
rvbaThe only problem was that the ARP table wasn't populated correctly.09:27
gmbYeah, hard resets are the only way to get it to do it.09:27
gmbjtv, rvba: Re: the lander: “OSError: [Errno 12] Cannot allocate memory”09:27
rvbaI had to manually run `nmap` to get the ARP table to be populated.09:27
gmb(Which is weird because there’s > 1G free)09:28
jtvI think the arp table is populated on demand.09:28
rvbabigjools: would you happen to know anything about that? ^.  My NUC gets an IP alright but then MAAS fails to power it up because the ARP table didn't contain a record for the NUC's MAC address.09:28
rvbabigjools: manually running `nmap` solves the problem of course.09:28
jtvPopulated on demand.09:29
rvbaYeah, but doesn't this mean there is a bit of a flaw with the fact that MAAS expects the ARP table to be populated?09:29
jtv!09:30
jtvWe rely on a cache lookup?09:30
rvbaYes we do.09:30
jtvNot a proper rarp lookup!?09:30
rvba(1 NUC in ready state)09:30
rvbajtv: we run `arp -n` and parse the output.09:31
jtvYeah that won't work...09:31
rvbaIt doesn't.09:31
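
The approach rvba describes (run `arp -n`, parse the output) can be sketched roughly as below. This is not MAAS's actual code: the function name and the sample output are made up for illustration, and the column layout is assumed to be the usual Linux net-tools one. It shows why the lookup fails when nothing has talked to the host yet: the kernel's ARP cache simply has no row for that MAC.

```python
# Hypothetical sketch of "run `arp -n` and parse the output".
# SAMPLE_ARP_OUTPUT is invented; a real caller would capture the
# output of `arp -n` instead.

SAMPLE_ARP_OUTPUT = """\
Address                  HWtype  HWaddress           Flags Mask            Iface
192.168.1.1              ether   00:11:22:33:44:55   C                     eth0
192.168.1.20                     (incomplete)                              eth0
"""


def find_ip_for_mac(arp_output, mac):
    """Return the IP listed for `mac` in `arp -n` style output, or None.

    None means the ARP cache has no complete entry for that MAC, which
    is the normal state until some traffic (a ping, an nmap scan) has
    been exchanged with the host.
    """
    wanted = mac.lower()
    for line in arp_output.splitlines():
        fields = line.split()
        # Complete entries look like: Address HWtype HWaddress Flags ... Iface
        if len(fields) >= 3 and fields[2].lower() == wanted:
            return fields[0]
    return None
```

A cache miss here is not an error on the network side, so a caller relying on this alone has no way to tell "host is down" from "nobody has asked yet".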
* rvba files bug09:33
bigjoolsit's populated on demand09:37
bigjoolsas I keep saying :)09:37
bigjoolswhich is why maas-test uses nmap09:37
bigjoolsit seems to work fine for me, I used its IP address09:37
bigjoolsFPI runs in 30 seconds, lol09:38
rvbabigjools: my testing shows that it's very unreliable.09:38
bigjoolsv reliable for me!09:38
bigjoolsin my admittedly limited set of tests09:39
rvbaThat's a shame because we have the MAC<->IP correspondence in MAAS (from the parsing of the lease file).09:39
bigjoolsyes09:39
bigjoolswe can shortcut this for amt09:39
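
The shortcut discussed here, taking the MAC-to-IP mapping from the lease file instead of the ARP cache, might look something like the sketch below. It assumes an ISC dhcpd-style leases file; the function name and the regular expressions are illustrative and not taken from MAAS, and lease expiry is deliberately ignored.

```python
import re

# Assumed format: ISC dhcpd leases, e.g.
#   lease 192.168.1.50 {
#     hardware ethernet 00:11:22:33:44:55;
#   }
LEASE_RE = re.compile(r"lease\s+(?P<ip>[0-9.]+)\s*\{(?P<body>.*?)\}", re.DOTALL)
MAC_RE = re.compile(r"hardware\s+ethernet\s+(?P<mac>[0-9a-fA-F:]+);")


def parse_leases(text):
    """Map each MAC address (lower-cased) to the IP of its last lease entry.

    Newer entries are appended to the file, so a later entry for the
    same MAC overrides an earlier one. Expiry times are not checked in
    this sketch.
    """
    mac_to_ip = {}
    for lease in LEASE_RE.finditer(text):
        mac = MAC_RE.search(lease.group("body"))
        if mac:
            mac_to_ip[mac.group("mac").lower()] = lease.group("ip")
    return mac_to_ip
```

Unlike the ARP cache, this mapping exists as soon as the AMT interface has obtained a lease, whether or not anything has exchanged traffic with it since.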
rvbaFiled bug 1314559.09:41
rvbahttps://bugs.launchpad.net/maas/+bug/131455909:41
bigjoolsubuntu@nuc1:~$09:49
bigjoolshehe09:49
rvba30 seconds?  That seems quicker than what I saw.09:52
bigjoolsI'll test again tomorrow09:53
rvbaTiming [hit start node → ssh connect] now….09:53
rvba2min24s09:54
rvba(Using fpi)09:54
allenapbigjools, rvba, jtv: How about we bite the bullet and switch to serialized?09:58
bigjoolswhy is there a bullet?09:59
allenapbigjools: Fallout from doing it. Having to add a middleware for retries.09:59
bigjoolssomehow I was naive enough to think that would have been handled by Django already10:02
bigjoolsallenap: but yes, we need to do it, the DB is not safe otherwise.10:03
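
The "middleware for retries" allenap mentions could follow a pattern like the one below. This is a generic sketch, not Django or MAAS code, and it reads "serialized" as a serializable transaction isolation level, which is an interpretation. `SerializationFailure` is a hypothetical stand-in for whatever error the database driver raises when a transaction loses such a conflict; the decorator and `acquire_node` are likewise invented for illustration.

```python
import functools


class SerializationFailure(Exception):
    """Hypothetical stand-in for the database's serialization error."""


def retry_on_serialization_failure(attempts=3):
    """Re-run the wrapped callable when it loses a serialization conflict.

    The wrapped function is assumed to run in its own transaction, so
    calling it again is safe after the failed attempt was rolled back.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except SerializationFailure:
                    if attempt == attempts:
                        raise  # Give up after the last attempt.
        return wrapper
    return decorator


# Usage example: a fake acquire() that loses the race twice, then wins.
calls = []


@retry_on_serialization_failure(attempts=3)
def acquire_node():
    calls.append(1)
    if len(calls) < 3:
        raise SerializationFailure("concurrent update")
    return "node-1"
```

The trade-off is the one allenap calls "fallout": every request handler has to be safe to run more than once.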
* gmb -> errands and lunch10:03
rvbaallenap: there are a bunch of plugins for Django that implement optimistic locking.10:06
rvbajtv: I'm wondering if checking for the presence of a rogue DHCP server on the network *every minute* isn't a bit too aggressive… it certainly generates a lot of traffic/log noise.10:08
jtvrvba: Agreed.  I think the main thing is that we check initially...10:10
rvbajtv: filed https://bugs.launchpad.net/maas/+bug/1314571 (low priority of course)10:12
bigjoolsrvba: I'd be ok with that changing to say 15 minutes10:14
rvbabigjools: yeah, but like Jeroen said, it would be important to get the network scanned *initially* (i.e. when the user is likely to be configuring his DHCP server).10:15
bigjoolsyes10:15
rvbaroaksoax: Hi Andres… as you probably know, we're testing MAAS on NUC nodes that use AMT for power-management.  Since amtterm is only in 'Suggests', these nodes don't work with MAAS "out-of-the-box": you need to manually install amtterm to get the tool (amttool) that the AMT power template uses.  Isn't that a bit suboptimal (it's quite hard to understand what's wrong, you need to dig up the celery log file)?12:59
rvbaWhy was it decided not to add amtterm as a hard dependency (or, at least, as a 'recommends')?13:00
rvbagmb: Did the power credentials detection work on the AMT boxes you've used?13:01
roaksoaxrvba: hey13:02
roaksoaxroadmr:  release (universe)13:02
roaksoaxrerrr13:02
roaksoaxrvba: ^^13:02
roaksoaxrvba: amtterm is in universe13:03
rvbaarg13:03
roadmrroaksoax: masters of the universe unite!13:03
roaksoaxroadmr: :)13:03
rvbaroaksoax: I understand why it can't be a hard dependency then… but still, the experience is clearly suboptimal.13:03
roaksoaxrvba: same as ipmitool13:04
roaksoaxrvba: unfortunately, I don't think I added the dependency myself13:04
roaksoaxrvba: so that should have gone through a MIR13:04
rvbaroaksoax: IPMI works out of the box AFAIK13:04
roaksoaxrvba: which means more dependencies13:04
rvbaBecause we're using free-ipmi or something13:04
roaksoaxrvba: IPMI when it uses freeipmi-tools not when using ipmitool13:04
rvbaroaksoax: so we should aim for either a) things working out of the box or b) a very clear message about what you need to install13:05
roaksoaxrvba: i don't disagree with what you are saying13:06
roaksoaxrvba: i'm just saying that at the time this happened, a request should have been made to take care of this dependency13:07
roaksoaxrvba: but anyways, I no longer maintain the packaging :)13:07
roaksoaxrvba: lutostag should be helping with this now13:07
rvbaroaksoax: sure, I was asking you because you know packaging better than I do… and because the changelog says you added the 'suggests' :)13:08
roaksoaxrvba: yeah I changed it from depends to suggests13:08
roaksoaxIIRC13:08
roaksoaxrvba: but at the time that happened, it was too late to do something about it13:08
roaksoaxI can't remember really13:08
rvbaroaksoax: all right.  I'll file a bug about this and see what lutostag can do about it.13:08
roaksoaxrvba: thanks! he should be able to file a MIR for utopic and then we might be able to MIR13:09
gmbrvba: I don’t recall trying the credentials detection on the orange boxes.13:13
rvbagmb: ah ok… I'm not sure it's a feature that's supposed to be there…13:16
rvbatych0: do you happen to know if credential auto-detection is something that's possible with AMT/supposed to be working within MAAS right now? (I'm asking you because I see you've added the AMT power template.)13:17
tych0rvba: it isn't supposed to be working13:17
tych0i think it is possible, but i spent a day fiddling with it when i did it and i couldn't figure out how to do it13:18
rvbatych0: okay, fair enough.  If you have bits and pieces (i.e. a vague embryo of support), don't hesitate to file a bug with all the details you have so that, maybe, someone else will pick it up.13:19
tych0ok13:19
tych0i'll see what I can dig up13:19
rvbatych0: fwiw, I just filed bug https://bugs.launchpad.net/maas/+bug/131462913:20
tych0ah, yeah13:20
tych0well, actually amtterm isn't that huge13:20
tych0it is like a 400 line perl script13:20
tych0and maas doesn't even use most of it13:20
tych0so you could probably get away with using something else there13:20
rvbaSomething else as in "extract the bits that we need from amttool and ship it with MAAS" or as in "use another package"?13:21
tych0the former13:27
tych0i don't think there are any other packages, really13:27
tych0most of intel's tools are windows-based13:27
rvbaI see.13:30
=== CyberJacob|Away is now known as CyberJacob
=== roadmr is now known as roadmr_afk
=== roadmr_afk is now known as roadmr
=== CyberJacob is now known as CyberJacob|Away
newell_Wondering if anyone is around that might be able to help me troubleshoot something I am running into.  I am running into the same issue as highlighted in this link:23:16
newell_http://askubuntu.com/questions/370971/ubuntu-13-10-server-maas-pb-to-import-boot-images23:16
newell_However, when I check for celery.log I do not find one in /var/log/maas23:17
bigjoolshi23:18
bigjoolsI can try to help23:18
newell_hello bigjools, thanks appreciate it23:18
newell_I am running from latest source23:18
bigjoolsare you using maas from trusty?23:18
newell_I followed the HACKING.txt23:19
bigjoolsah23:19
newell_so checked out the latest code23:19
bigjoolsrunning from source23:19
newell_yes23:19
bigjoolsdoing that is a bit harder than using packages23:19
newell_yeah I know23:19
newell_but I want to fix some bugs23:19
newell_;)23:19
bigjoolsdid you start up maas?23:20
newell_yes23:20
bigjools"make run"23:20
newell_able to login to dashboard etc23:20
bigjoolsthere's a logs/ dir at the top level of src23:20
newell_yeah I have been looking at those...still trying to track this down23:20
bigjoolsbut unfortunately it will try to run stuff that is not installed23:20
bigjoolsI think there's code that assumes packaged locations23:21
bigjoolsyou can try to run scripts/maas-import-pxe-files after editing etc/maas/bootresources.yaml23:21
newell_bigjools, what is your advice in terms of getting a setup that is conducive to fixing bugs and verifying the fix?23:21
newell_Ok23:21
bigjoolswe've never needed to import pxe files to test at this source level you23:22
bigjoolss/you^//23:22
bigjoolsthe testing is done in unit tests with fake resources23:22
bigjoolsotherwise working with such huge files would be a right pain23:22
newell_Okay if I wanted to work on a bug like this: https://bugs.launchpad.net/maas/+bug/128310623:23
ubot5Ubuntu bug 1283106 in MAAS "MAAS allows the same subnet to be defined on two managed interfaces of the same cluster" [High,Triaged]23:23
bigjoolsonce a fix is in place, we do "make package" to build some .debs and install them in a separate VM23:23
newell_Do you guys usually just use your testing framework?23:23
bigjoolsyes - that one's pretty easy to fix, just needs some overlapping network checks23:23
bigjoolsthere's an IPNetwork class somewhere23:23
bigjoolsyou don't need to import images to fix it23:23
newell_ah okay23:23
bigjoolsthanks for looking to fix things!23:24
newell_bigjools, to verify the fix would you just try and manage two cluster interfaces with the same subnet and see if the error pops up again?23:25
bigjoolsnewell_: yes - create a unit test case *first* to reproduce the problem23:26
bigjoolsthen once done you can work on a fix and keep running the test until it passes23:26
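
The "overlapping network checks" bigjools suggests, together with the test-first workflow, could be sketched as follows. This uses Python's standard `ipaddress` module rather than the `IPNetwork` class he refers to, and the function and test names are hypothetical; the real fix would live in MAAS's own validation code and existing test files.

```python
import ipaddress
import unittest
from itertools import combinations


def find_overlapping_subnets(cidrs):
    """Return every pair of CIDR strings whose address ranges overlap."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return [
        (str(a), str(b))
        for a, b in combinations(networks, 2)
        if a.overlaps(b)
    ]


class TestFindOverlappingSubnets(unittest.TestCase):
    """Written first, to reproduce the bug before working on the fix."""

    def test_same_subnet_on_two_interfaces_is_reported(self):
        self.assertEqual(
            [("10.0.0.0/24", "10.0.0.0/24")],
            find_overlapping_subnets(["10.0.0.0/24", "10.0.0.0/24"]))

    def test_subnet_inside_larger_subnet_is_reported(self):
        self.assertEqual(
            [("10.0.0.0/16", "10.0.1.0/24")],
            find_overlapping_subnets(["10.0.0.0/16", "10.0.1.0/24"]))

    def test_disjoint_subnets_are_accepted(self):
        self.assertEqual(
            [],
            find_overlapping_subnets(["10.0.0.0/24", "10.0.1.0/24"]))
```

Saved as a module, this can be run with `python -m unittest`; a form or model validator would reject the configuration whenever the returned list is non-empty.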
newell_bigjools, so if I created my own unit test I would do something like:  Just to verify: $ ./bin/test.maas test src/maasserver/tests/test_cluster_subnets.py23:39
bigjoolsnewell_: yes but don't make a new file just for one test, find existing tests for the functionality and add a new test case23:40
newell_k23:40
newell_thanks23:40
bigjoolsnp23:41
