/srv/irclogs.ubuntu.com/2013/02/21/#maas.txt

=== matsubara-afk is now known as matsubara
=== matsubara is now known as matsubara-lunch
roaksoaxrvba: howdy!! thanks for the work on the GenericIPAddressField15:06
roaksoaxrvba: One thing though: I don't oppose merging this upstream; however, shipping it as a patch would have ensured that it only affected precise. Now that it is being merged, it will also affect quantal15:09
roaksoaxand in quantal we will not use GenericIPAddressField from django15:09
roaksoaxrvba: so just wanted to make sure you guys are aware of that15:09
rvbaroaksoax: Hi.  In my code, I detect which version of Django we are using and only do the monkey patch if django.VERSION = 1.315:11
roaksoaxrvba: ok cool then :)15:11
rvbaSo the quantal version will use the field from Django itself.15:11
rvbaI think it's better to put this upstream because it mean all that code gets tested everytime we make a change to the code (testing by the unit tests suite)15:12
roaksoaxrvba: awesome then!!15:12
rvbameans*15:12
roaksoaxrvba: for sure15:12
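[Editor's sketch] The version check rvba describes could look roughly like the following. This is a minimal sketch, assuming django.VERSION is a tuple such as (1, 3, 1, 'final', 0); the helper names needs_backport and pick_field_class are hypothetical and are not taken from the MAAS code base.

```python
def needs_backport(django_version):
    """Return True when the running Django belongs to the 1.3 series.

    django_version is expected to look like django.VERSION, i.e. a tuple
    whose first two items are the major and minor version numbers.
    """
    return tuple(django_version[:2]) == (1, 3)


def pick_field_class(django_version, native_field, backported_field):
    """Choose the backported field only on Django 1.3 (hypothetical helper)."""
    if needs_backport(django_version):
        return backported_field
    return native_field
```

With this shape, a quantal-era Django such as (1, 4, 1, 'final', 0) keeps the field shipped by Django itself, which matches what rvba says about quantal.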
roaksoaxrvba: alright, so once it lands in maas/1.2 i'll prepare a new package, upload it to stable15:30
roaksoaxrvba: upload a new django to PPA as well15:30
roaksoaxand cleanup to start testing15:31
rvbaroaksoax: ok, I'll land it now and then we can start testing it.15:32
=== matsubara-lunch is now known as matsubara
roaksoaxrvba: ok it is all in ppa:maas-maintainers/experimental for testing. It is building now though16:06
rvbaroaksoax: it's building in the daily ppa too :)16:06
roaksoaxok cool :)16:06
roaksoaxrvba: once tested I'll upload to raring16:07
roaksoaxand that should be almost all we need to SRU16:07
rvbaroaksoax: the recent change only impacted the 1.2 branch.  The raring package uses trunk.16:08
roaksoaxrvba: right, but not in Ubuntu archives16:08
roaksoaxrvba: ubuntu archives will have 1.2 until the SRU is done16:08
rvbaAh ok16:08
roaksoaxonce that happens, we can upload trunk to ubuntu archives16:08
rvbaroaksoax: btw, we need to fix bug 1123986 before we SRU MAAS.  we're in the process of fixing it.16:09
ubot5bug 1123986 in MAAS 1.2 "multiple juju environments in maas" [High,Triaged] https://launchpad.net/bugs/112398616:09
roaksoaxrvba: ok! Note that having different juju environments won't really help much though :)16:10
roaksoaxrvba: cause the network is the same so things can still collide16:10
roaksoaxrvba: won't help much in certain scenarios16:10
rvbaroaksoax: yeah, but we definitely need to fix the file storage stuff in MAAS.16:10
roaksoaxrvba: but this is definitely cool! we were just talking about it yesterday :)16:10
roaksoaxrvba: alright. Is there an ETA on when we can start seeing these changes landing?16:11
rvbaroaksoax: I'd say sometime next week (most of the code is already done).  Because we need to do some serious testing.16:12
roaksoaxrvba: ok cool16:13
roaksoaxrvba: btw.. urgency=high is a debian thing, not ubuntu :) (lp:~rvb/maas/packaging.precise.sru-high)16:18
rvbaroaksoax: haha :)16:18
Jon___Howdy, anyone home?16:37
rvbaroaksoax: btw, please have a look at bug 113129617:05
ubot5bug 1131296 in maas-enlist (Ubuntu) "maas-enlist uses a wrong url when enlisting nodes (/MAAS/api/1.0/nodes//MAAS/api/1.0/nodes/)" [Undecided,New] https://launchpad.net/bugs/113129617:05
roaksoaxrvba: will do17:08
roaksoaxrvba: could you provide a bit more background though?17:12
rvbaroaksoax: I think the way maas-enlist builds the url it uses to register nodes has a bug.17:14
rvbaroaksoax: but because of a bug in MAAS itself, it works ok :).17:14
rvbaThe thing is: the bug in MAAS must be resolved.17:14
roaksoaxrvba: for sure... I wonder what it is that is being given to maas-enlist17:53
roaksoaxrvba: as in maas-enlist -s X.y.z.a/17:53
roaksoaxor what17:53
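[Editor's sketch] One way a doubled path like the one in the bug title can arise is by appending the API path to a server value that already contains it. The snippet below only illustrates that failure mode and a defensive alternative; it is not the maas-enlist code and not a diagnosis of bug 1131296. The function names and the 192.0.2.10 address are hypothetical; the API path is copied from the bug title.

```python
from urllib.parse import urlsplit

API_PATH = "/MAAS/api/1.0/nodes/"  # path as it appears in the bug title


def naive_join(base_url):
    """Blindly append the API path (hypothetical, shows the failure mode)."""
    return base_url + API_PATH


def safe_join(base_url):
    """Append the API path only when the base URL does not already end with it."""
    if urlsplit(base_url).path.endswith(API_PATH):
        return base_url
    return base_url.rstrip("/") + API_PATH


full = "http://192.0.2.10/MAAS/api/1.0/nodes/"
doubled = naive_join(full)   # path is repeated, joined by "//"
guarded = safe_join(full)    # left unchanged
```

Whether maas-enlist receives a bare host or a full URL via -s is exactly the open question in the conversation, so the sketch handles both inputs.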
=== kentb is now known as kentb-afk
racedohey roaksoax we have seen a pattern that when we "release" an allocated node in maas to be ready and then redeploy it after rebooting it goes to grup rescue18:42
racedoif we instead delete it, reenlist it and deploy something it works18:42
racedos/grup/grub/18:42
roaksoaxracedo: how do you release it and how do you reboot it?19:05
roaksoaxracedo: it seems that on the reboot it is not being told to PXE boot! which is what maas tells the node to do when it tells it to start19:07
roaksoaxracedo: so if you are manually rebooting the node, it won't pxe boot unless you tell it so via IPMI19:07
racedoroaksoax: we reboot it via ipmi19:07
roaksoaxracedo: manually?19:08
racedoyes19:08
racedowe dont have ilo or access now19:08
roaksoaxracedo: that's the problem then19:08
roaksoaxracedo: if you reboot manually you need to tell ipmi to PXE boot19:08
racedoso the sequence is: enlist->commission->deploy then juju-destroy then deploy with constraints maas-name to a node then after reboot it goes to grub rescue19:10
racedoroaksoax: ack19:10
roaksoaxracedo: so19:10
roaksoaxipmi-chassis-config ${driver_option} -h ${power_address} -u ${power_user} -p ${power_pass} --commit --filename ${config}19:10
roaksoaxipmipower ${driver_option} -h ${power_address} -u ${power_user} -p ${power_pass} --cycle --on-if-off19:10
roaksoaxwhere config is: http://paste.ubuntu.com/1700983/19:10
racedoyep Boot_Device PXE19:11
racedogot that19:11
racedothanks roaksoax that could be it19:11
racedowe are confirming it right now19:11
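[Editor's sketch] The two commands roaksoax pasted can be assembled as argument lists before being handed to something like subprocess.run. This is a minimal sketch: the flags are the ones quoted in the chat, while the function name, the example address, credentials and config path are hypothetical placeholders. The contents of ${driver_option} are not given in the log, so it is left as an optional list.

```python
def build_pxe_power_cycle(power_address, power_user, power_pass, config,
                          driver_option=()):
    """Return the two argv lists: set the boot device, then power cycle.

    The config file is expected to request PXE boot (Boot_Device PXE),
    as confirmed in the conversation.
    """
    common = list(driver_option) + [
        "-h", power_address,
        "-u", power_user,
        "-p", power_pass,
    ]
    set_boot = ["ipmi-chassis-config"] + common + ["--commit", "--filename", config]
    cycle = ["ipmipower"] + common + ["--cycle", "--on-if-off"]
    return [set_boot, cycle]


# Hypothetical values, for illustration only.
commands = build_pxe_power_cycle("192.0.2.20", "admin", "secret", "/tmp/pxe.conf")
```

Running the chassis configuration first matters: the node has to be told to PXE boot before it is power cycled, otherwise it boots from disk as racedo observed.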
=== kentb-afk is now known as kentb
negronjlroaksoax, I have a question re: preseed when you get a chance19:58
roaksoaxnegronjl: shoot :)19:58
negronjlroaksoax, I see this line in /usr/share/maas/preseeds/preseed_master: partman/early_command string debconf-set partman-auto/disk `list-devices disk | head -n1`19:58
negronjlroaksoax, however, when I have seen that line in the past, I have seen it as: partman/early_command string debconf-set partman-auto/disk "$(list-devices disk | head -n1)"19:59
negronjlroaksoax, will the above affect anything ?19:59
roaksoaxnegronjl: uhmmm I wouldn't know really20:00
roaksoaxi don't think it should20:01
roaksoaxnegronjl: depends on the busybox shell I guess20:01
roaksoaxwhich i believe is posix20:01
negronjlroaksoax, If I change the preseed file, do I need to restart any particular service ?20:02
negronjlroaksoax, I mean if i change /usr/share/maas/preseeds/preseed_master BTW20:03
roaksoaxnegronjl: no, the preseeds are rendered at exec time20:03
negronjlroaksoax, ack ... thanks20:03
=== matsubara is now known as matsubara-afk
racedowe are getting a "No authorization header received." at the last stage of cloud init during commissioning when the nodes are accessing the maas server20:11
racedothat prevents them from reporting back to the maas server and go to ready state and they are stuck at commissioning20:11
racedoany clue?20:11
racedothe address the nodes are trying to contact is http://maas/MAAS/metadata/2012-03-01/20:12
roaksoaxracedo: the DEFAULT_MAAS_URL is incorrect20:13
racedoroaksoax: ok, where is that?20:13
* racedo checking20:13
roaksoaxracedo: sudo dpkg-reconfigure maas-region-controller and enter either a hostname or ip address that is addressable from the nodes that are commissioning20:14
roaksoaxracedo: /etc/maas/maas_local_settings.py20:14
racedoit's the right one20:14
roaksoaxracedo: so 'maas' is not resolvable20:14
roaksoaxhttp://maas/MAAS/metadata/2012-03-01/ --> 'maas' resolves?20:14
racedook20:14
racedono20:14
racedoit's the ip20:14
racedosorry20:15
racedoi pasted it for privacy reasons :)20:15
roaksoaxracedo: ah lol :), so is the address reachable from the commissioning server?20:15
roaksoaxracedo: as in the *same* network?20:15
roaksoaxof the nodes being deployed?20:15
racedoif I access from my browser it says "No authorization header received."20:15
racedoyes, they enlist, then reboot then we accept and commission20:15
racedothen after cloud init we see that they want to report back to maas using that URL20:16
racedoand then they get the auth error 40120:16
racedoand get stuck in commissioning20:16
racedoinstead of going to ready and shut down20:16
racedothey just shut down20:17
roaksoaxracedo: if you access through the browser you won't see anything because the commissioning step does authentication20:23
racedook20:24
roaksoaxracedo: maybe ntp issue?20:25
roaksoaxthe clocks in the maas server and the nodes are not the same?20:25
racedoi ssh to it during commissioning and check the date and it was fine, we went to the bios to set the right time and date too20:25
racedowe just rebooted maas and are trying again20:26
roaksoaxack20:33
racedoroaksoax: i'm going to share a screenshot in a sec if that's ok20:33
roaksoaxracedo: sure20:34
racedoroaksoax: https://docs.google.com/a/canonical.com/file/d/0BzitEgbYskgzN0Y3X21td0RKMU0/edit?usp=sharing20:35
racedoyou sho20:35
racedoshould have access :)20:35
roaksoaxracedo: yeah that's an issue with oauth, clocks not being synced20:35
roaksoaxsmoser: ^^20:35
racedoLP 978127 ?20:36
ubot5Launchpad bug 978127 in MAAS "incorrect time on node causes failed oauth" [Critical,Fix released] https://launchpad.net/bugs/97812720:36
roaksoaxracedo: that seems to be the one20:36
roaksoaxracedo: you guys are using stable ppa right?20:36
roaksoaxsmoser: was the fix for this backported to maas/1.2?20:36
smoserroaksoax, that bug (and its fix) are displayed there correctly.20:37
racedoduring commissioning i'm sshing to the node and the time is right, it was 5 hours ahead this morning but not now20:37
racedoroaksoax: yeah we use /stable20:37
smoserracedo, and that system is 5 hours off the clock on the maas server20:37
racedosmoser: it was20:37
racedonot any more20:37
smoserit was in that screenshot20:38
smoserthats what its telling you.20:38
smoserunless you're telling me you fixed it since that screen shot.20:38
smoserbut the INTERNAL SERVER ERROR is different.20:38
racedosmoser: no, i ssh during commissioning and the time is right, i did right during the time we took the screenshot20:38
roaksoaxracedo: also please pastebin apache2's error log20:38
smoseri suspect you have something in your maas logs20:38
racedook20:39
roaksoaxracedo: is the MAAS server with the same time too?20:39
racedoyeah20:39
smoserracedo, that system and the maas server disagree on the time. by 18000 seconds.20:39
smosertheres little doubt in my mind on that.20:39
smoser'date --utc'20:40
smoseron both20:40
racedook20:40
smoseri think you have differing clocks, but i dont think thats the whole issue. the fact that the client is re-setting its clock indicates that its working around the issue.20:43
negronjlroaksoax, smoser: apache log with errors here: https://pastebin.canonical.com/85324/20:43
smosernegronjl, /var/log/apache2/errors.log20:44
smoseror something to that effect.20:44
smoseryou're showing me the access log (i think)20:44
racedosmoser: roaksoax https://pastebin.canonical.com/85326/20:45
racedocheck lines 4 and 5020:45
smoserright. its 5 hours off.20:46
roaksoaxracedo: Thu Feb 21 20:43:44 UTC 2013 commissioning node: Thu Feb 21 15:43:42 UTC 201320:46
negronjlsmoser: https://pastebin.canonical.com/85327/20:46
roaksoaxtimes are different20:46
smoser(isnt that what i said?)20:46
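[Editor's sketch] The skew between the two timestamps roaksoax quotes can be checked with a few lines of Python. The timestamps are taken from the chat; the clocks_agree helper and its 300-second tolerance are hypothetical and do not reflect the window MAAS or its OAuth layer actually enforces.

```python
from datetime import datetime

# 'date --utc' output quoted in the chat, entered by hand.
maas_server = datetime(2013, 2, 21, 20, 43, 44)
commissioning_node = datetime(2013, 2, 21, 15, 43, 42)

skew_seconds = abs((maas_server - commissioning_node).total_seconds())


def clocks_agree(a, b, tolerance_seconds=300):
    """Return True when two clocks differ by no more than the tolerance.

    The default tolerance is an arbitrary value chosen for illustration.
    """
    return abs((a - b).total_seconds()) <= tolerance_seconds
```

The result is about five hours, in line with the 18000 seconds smoser mentions; the extra two seconds are just the delay between running date on the two machines.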
racedooh!20:47
roaksoaxracedo: that's your issue20:47
smoserits not the issue.20:47
racedo i was being too slow with date then :)20:47
smoserunless its causing fallout from maas/longpoll20:47
roaksoaxsmoser: i think txlongpoll is just for UI related stuff isn't it?20:48
smoseri dont know. but the error in the screenshot says INTERNAL ERROR20:49
smoserand the log says INTERNAL ERROR20:49
roaksoaxnegronjl: restart maas-txlongpoll20:50
negronjlroaksoax, done20:51
roaksoaxsmoser: maybe it is related... though I think we saw that too last time on the drill20:51
roaksoaxnegronjl: so there's 2 things it seems. 1. the clock skew, 2. txlongpoll20:51
negronjlroaksoax, checking the txlongpoll on logs to see if it is still there20:51
racedoroaksoax: but the time in the maas server is set to EST even if the BIOS has UTC, is that an issue?20:52
racedoroaksoax: https://pastebin.canonical.com/85328/20:53
smoserracedo, in ubuntu bios clock always has utc.20:53
smoser(i think there are some cases where if you're dual booting it will try to not use utc, but you want utc)20:53
racedosmoser: should i go to the BIOS and change it to match EST20:54
racedo?20:54
smoserthat is fine. all checking is done on actual time.20:54
racedook20:54
smoseryou want bios set to utc. on both boxes.20:54
racedosmoser: we changed it in the BIOS of the commissioning node just in case, then we are changing it back to UTC20:55
* roaksoax brb20:56
smoserracedo, fwiw, i'm pretty sure you could just run 'sudo ntpdate pool.ntp.org' and reboot. and i think that will get it fixed.20:56
smoser(because on system shutdown, the current clock is copied to bios clock)20:56
racedosmoser: thats right20:57
racedosmoser: ok done20:58
smoserthat will likely fix the oauth complaints, but i think you'll still see the internal server error.21:00
racedosmoser: you are right, now it only says internal server error21:00
racedothis is what i see in the maas apache error from that client when commissioning: https://pastebin.canonical.com/85331/21:06
smoserroaksoax, how do we get more info on that ?21:06
negronjlsmoser, roaksoax: increasing the debug level in apache ...21:08
smosernot apache.21:08
smosermaybe in maas.21:09
smoseri'm pretty sure its coming from maas.21:09
negronjlsmoser: ok21:09
smoserwe should be able to get a maas stack trace some where.21:09
roaksoaxmaas.log21:09
roaksoaxcelery logs21:10
roaksoaxand txlongpoll logs21:10
roaksoaxracedo pastebin those please21:10
racedoroaksoax: https://pastebin.canonical.com/85334/ is maas.log21:12
racedoafter increasing the apache log to debug nothing changed from above apache2 access log21:13
racedothe 500 errors are logged in the access log rather than the error log21:14
roaksoaxracedo did you guys create any tags? that error is weird in maas.log21:15
racedoroaksoax: no21:15
racedoroaksoax: we created constraints21:15
racedoand actually we are not getting the maas-name constraint to match the nodes names21:15
roaksoaxmaybe thats related21:17
roaksoaxi have little to no knowledge in the constraints system21:18
racedonow there's no zookeeper21:18
racedono constraints, just maas21:18
racedowe can reinstall maas :)21:18
roaksoaxwill need to check logs to see if any upstream commit might have regressed something21:18
roaksoaxcould you file a bug with that error log?21:19
racedoyes21:19
racedoroaksoax: https://bugs.launchpad.net/maas/+bug/113141821:29
ubot5Launchpad bug 1131418 in MAAS "Nodes don't go to ready, after commissioning they get a 500 error when reporting back to maas" [Undecided,New]21:29
racedoroaksoax: as we need to finish this, we may reinstall maas now and do exactly the same steps we were following21:29
roaksoaxracedo: ok i'm testing this in my local virtual environment21:29
racedoroaksoax: I'm sharing a doc with you of the exact steps we took from the very beginning21:30
racedoroaksoax: i shared with you a dump of all the http traffic between the client and the maas server to debug with wireshark if that helps21:48
roaksoaxracedo: ack thanks22:11
roaksoaxracedo: still around?23:22
racedoroaksoax: yes23:22
roaksoaxracedo: i don't think this is needed: juju set-constraints maas-name=23:22
roaksoaxnot anymore with newer maas23:22
racedooh ok23:23
racedobut doesn't the constraint stay until wiped?23:23
roaksoaxracedo: nah, the reason why that was done was because there was a bug , but IIRC that was fixed23:23
roaksoaxracedo: juju add-unit --constraints doesn't work23:24
roaksoax?23:24
racedoroaksoax: what we were doing is specify every time what node we are deploying the server to23:24
racedooh23:24
racedoi see what you mean23:24
racedook, thanks roaksoax23:25
negronjlroaksoax, didn't know that add-unit took constraints ... that saves us time23:25
roaksoaxnegronjl: I think that works23:26
negronjlroaksoax, not according to the juju help but, I'll give it a try anyway23:26
roaksoaxlet me check23:26
negronjlroaksoax, thx23:27
roaksoaxnegronjl: yeah it doesn't :(23:28
roaksoaxi thought it did23:28
negronjlroaksoax, thx23:28
roaksoaxracedo: so yeah after deploying swift you have to clean the constraints23:29
roaksoaxbecause you are setting them globally23:29
roaksoaxbut in cases like the bootstrap I think you have to23:29
roaksoaxbecause you only set them for that deployment in particular23:29
roaksoaxracedo: ok just tested commissioning with latest from maas-maintainers/stable and it commissioned just fine23:30
racedook, with constraints and stuff?23:30
=== kentb is now known as kentb-out
roaksoaxracedo: i'm testing that now23:32
racedocool thx23:32
roaksoaxracedo: ok bootstrap with constraint went fine. I commissioned nodes after bootstrap, also went fine23:43
racedook, juan is doing it as well here in parallel23:48
* roaksoax waiting for the bootstrap to finish23:52
