[15:06] <roaksoax> rvba: howdy!! thanks for the work on the GenericIPAddress field
[15:09] <roaksoax> rvba: One thing though. I don't oppose merging this upstream, but shipping it as a patch would have ensured that it only affected precise. Now that it is being merged, it will also affect quantal
[15:09] <roaksoax> and in quantal we will not use GenericIPAddressField from django
[15:09] <roaksoax> rvba: so just wanted to make sure you guys are aware of that
[15:11] <rvba> roaksoax: Hi.  In my code, I detect which version of Django we are using and only do the monkey patch if django.VERSION is 1.3
[15:11] <roaksoax> rvba: ok cool then :)
[15:11] <rvba> So the quantal version will use the field from Django itself.
[15:12] <rvba> I think it's better to put this upstream because it means all that code gets tested every time we make a change to the code (tested by the unit test suite)
[15:12] <roaksoax> rvba: awesome then!!
[15:12] <roaksoax> rvba: for sure
[15:30] <roaksoax> rvba: alright, so once it lands in maas/1.2 i'll prepare a new package, upload it to stable
[15:30] <roaksoax> rvba: upload a new django to PPA as well
[15:31] <roaksoax> and cleanup to start testing
[15:32] <rvba> roaksoax: ok, I'll land it now and then we can start testing it.
[16:06] <roaksoax> rvba: ok it is all in ppa:maas-maintainers/experimental for testing. It is building now though
[16:06] <rvba> roaksoax: it's building in the daily ppa too :)
[16:06] <roaksoax> ok cool :)
[16:07] <roaksoax> rvba: once tested I'll upload to raring
[16:07] <roaksoax> and that should be almost all we need to SRU
[16:08] <rvba> roaksoax: the recent change only impacted the 1.2 branch.  The raring package uses trunk.
[16:08] <roaksoax> rvba: right, but not in Ubuntu archives
[16:08] <roaksoax> rvba: ubuntu archives will have 1.2 until the SRU is done
[16:08] <rvba> Ah ok
[16:08] <roaksoax> once that happens, we can upload trunk to ubuntu archives
[16:09] <rvba> roaksoax: btw, we need to fix bug 1123986 before we SRU MAAS.  we're in the process of fixing it.
[16:10] <roaksoax> rvba: ok! Note that having different juju environments won't really help much though :)
[16:10] <roaksoax> rvba: cause the network is the same so things can still collide
[16:10] <roaksoax> rvba: won't help much in certain scenarios
[16:10] <rvba> roaksoax: yeah, but we definitely need to fix the file storage stuff in MAAS.
[16:10] <roaksoax> rvba: but this is definitely cool! we were just talking about it yesterday :)
[16:11] <roaksoax> rvba: alright. Is there an ETA on when we can start seeing these changes landing?
[16:12] <rvba> roaksoax: I'd say sometime next week (most of the code is already done).  Because we need to do some serious testing.
[16:13] <roaksoax> rvba: ok cool
[16:18] <roaksoax> rvba: btw.. urgency=high is a debian thing, not ubuntu :) (lp:~rvb/maas/packaging.precise.sru-high)
[16:18] <rvba> roaksoax: haha :)
[16:37] <Jon___> Howdy, anyone home?
[17:05] <rvba> roaksoax: btw, please have a look at bug 1131296
[17:08] <roaksoax> rvba: will do
[17:12] <roaksoax> rvba: could you provide a bit more background though?
[17:14] <rvba> roaksoax: I think the way maas-enlist builds the url it uses to register nodes has a bug.
[17:14] <rvba> roaksoax: but because of a bug in MAAS itself, it works ok :).
[17:14] <rvba> The thing is: the bug in MAAS must be resolved.
[17:53] <roaksoax> rvba: for sure... I wonder what it is that's being given to maas-enlist
[17:53] <roaksoax> rvba: as in maas-enlist -s X.y.z.a/
[17:53] <roaksoax> or what
[18:42] <racedo> hey roaksoax we have seen a pattern that when we "release" an allocated node in maas to be ready and then redeploy it after rebooting it goes to grub rescue
[18:42] <racedo> if we instead delete it, reenlist it and deploy something it works
[19:05] <roaksoax> racedo: how do you release it and how do you reboot it?
[19:07] <roaksoax> racedo: it seems that on the reboot it is not being told to PXE boot, which is what maas tells the node to do when it starts it
[19:07] <roaksoax> racedo: so if you are manually rebooting the node, it won't pxe boot unless you tell it so via IPMI
[19:07] <racedo> roaksoax: we reboot it via ipmi
[19:08] <roaksoax> racedo: manually?
[19:08] <racedo> yes
[19:08] <racedo> we dont have ilo or access now
[19:08] <roaksoax> racedo: that's the problem then
[19:08] <roaksoax> racedo: if you reboot manually you need to tell ipmi to PXE boot
[19:10] <racedo> so the sequence is: enlist->commission->deploy then juju-destroy then deploy with constraints maas-name to a node then after reboot it goes to grub rescue
[19:10] <racedo> roaksoax: ack
[19:10] <roaksoax> racedo: so
[19:10] <roaksoax> ipmi-chassis-config ${driver_option} -h ${power_address} -u ${power_user} -p ${power_pass} --commit --filename ${config}
[19:10] <roaksoax> ipmipower ${driver_option} -h ${power_address} -u ${power_user} -p ${power_pass} --cycle --on-if-off
[19:10] <roaksoax> where config is: http://paste.ubuntu.com/1700983/
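For context: the paste link above is what roaksoax committed via ipmi-chassis-config; a minimal sketch of what such a FreeIPMI chassis-config file can look like (key names follow FreeIPMI's checkout format, but this is an illustration based on the "Boot_Device PXE" line racedo quotes, not the actual paste contents):

```
Section Chassis_Boot_Flags
        ## Apply the boot flags to the next boot only
        Boot_Flags_Persistent                         No
        ## Boot from the network (PXE) on the next power cycle
        Boot_Device                                   PXE
EndSection
```

This is what makes a manually-power-cycled node come back up via PXE instead of falling through to the local disk (and grub rescue).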
[19:11] <racedo> yep Boot_Device PXE
[19:11] <racedo> got that
[19:11] <racedo> thanks roaksoax that could be it
[19:11] <racedo> we are confirming it right now
[19:58] <negronjl> roaksoax, I have a question re: preseed when you get a chance
[19:58] <roaksoax> negronjl: shoot :)
[19:58] <negronjl> roaksoax, I see this line in /usr/share/maas/preseeds/preseed_master: partman/early_command string debconf-set partman-auto/disk `list-devices disk | head -n1`
[19:59] <negronjl> roaksoax, however, when I have seen that line in the past, I have seen it as: partman/early_command string debconf-set partman-auto/disk "$(list-devices disk | head -n1)"
[19:59] <negronjl> roaksoax, will the above affect anything ?
[20:00] <roaksoax> negronjl: uhmmm I wouldn't know really
[20:01] <roaksoax> i don't think it should
[20:01] <roaksoax> negronjl: depends on the busybox shell I guess
[20:01] <roaksoax> which I believe is POSIX
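negronjl's two preseed variants differ only in command-substitution syntax; in a POSIX/busybox sh the backtick and $( ) forms produce the same result. A quick sketch, using echo as a stand-in since list-devices only exists inside the d-i environment:

```shell
# Stand-in for `list-devices disk | head -n1`, to show the two
# substitution forms agree.
a=`echo /dev/sda`          # backtick form, as in preseed_master
b="$(echo /dev/sda)"       # quoted $() form negronjl has seen elsewhere
echo "backticks:    $a"
echo "dollar-paren: $b"
[ "$a" = "$b" ] && echo "identical"
```

The quoting only matters if the substituted value could contain whitespace, which /dev device paths don't, so in practice both preseed lines behave the same.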
[20:02] <negronjl> roaksoax, If I change the preseed file, do I need to restart any particular service ?
[20:03] <negronjl> roaksoax, I mean if i change /usr/share/maas/preseeds/preseed_master BTW
[20:03] <roaksoax> negronjl: no, the preseeds are rendered at exec time
[20:03] <negronjl> roaksoax, ack ... thanks
[20:11] <racedo> we are getting a "No authorization header received." at the last stage of cloud init during commissioning when the nodes are accessing the maas server
[20:11] <racedo> that prevents them from reporting back to the maas server and go to ready state and they are stuck at commissioning
[20:11] <racedo> any clue?
[20:12] <racedo> the address the nodes are trying to contact is http://maas/MAAS/metadata/2012-03-01/
[20:13] <roaksoax> racedo: the DEFAULT_MAAS_URL is incorrect
[20:13] <racedo> roaksoax: ok, where is that?
[20:13]  * racedo checking
[20:14] <roaksoax> racedo: sudo dpkg-reconfigure maas-region-controller and enter either a hostname or ip address that is addressable from the nodes that are commissioning
[20:14] <roaksoax> racedo: /etc/maas/maas_local_settings.py
[20:14] <racedo> it's the right one
[20:14] <roaksoax> racedo: so 'maas' is not resolvable
[20:14] <roaksoax> http://maas/MAAS/metadata/2012-03-01/ --> 'maas' resolves?
[20:14] <racedo> ok
[20:14] <racedo> no
[20:14] <racedo> it's the ip
[20:15] <racedo> sorry
[20:15] <racedo> i pasted it for privacy reasons :)
[20:15] <roaksoax> racedo: ah lol :), so is the address reachable from the commissioning server?
[20:15] <roaksoax> racedo: as in the *same* network?
[20:15] <roaksoax> of the nodes being deployed?
[20:15] <racedo> if I access from my browser it says "No authorization header received."
[20:15] <racedo> yes, they enlist, then reboot then we accept and commission
[20:16] <racedo> then after cloud init we see that they want to report back to maas using that URL
[20:16] <racedo> and then they get the auth error 401
[20:16] <racedo> and get stuck in commissioning
[20:16] <racedo> instead of going to ready and shut down
[20:17] <racedo> they just shut down
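The check roaksoax is walking through (is the host in DEFAULT_MAAS_URL resolvable and reachable from the nodes' network?) can be sketched like this; MAAS_URL here is a placeholder, substitute the value from /etc/maas/maas_local_settings.py:

```shell
# Extract the host part of the metadata URL and check it resolves.
# (A 401 "No authorization header received." from the endpoint is
# normal for an unauthenticated client; a DNS failure is not.)
MAAS_URL="http://localhost/MAAS/metadata/2012-03-01/"   # placeholder
host=$(echo "$MAAS_URL" | sed 's|^[a-z]*://||; s|[:/].*||')
echo "host: $host"
if getent hosts "$host" >/dev/null 2>&1; then
    echo "resolvable"
else
    echo "NOT resolvable from this machine"
fi
```

Run it on a machine on the same network as the nodes being deployed, not just on the MAAS server itself.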
[20:23] <roaksoax> racedo: if you access through the browser you won't see anything because the commissioning step does authentication
[20:24] <racedo> ok
[20:25] <roaksoax> racedo: maybe ntp issue?
[20:25] <roaksoax> the clocks in the maas server and the nodes are not the same?
[20:25] <racedo> i sshed to it during commissioning and checked the date and it was fine, we went to the bios to set the right time and date too
[20:26] <racedo> we just rebooted maas and are trying again
[20:33] <roaksoax> ack
[20:33] <racedo> roaksoax: i'm going to share a screenshot in a sec if that's ok
[20:34] <roaksoax> racedo: sure
[20:35] <racedo> roaksoax: https://docs.google.com/a/canonical.com/file/d/0BzitEgbYskgzN0Y3X21td0RKMU0/edit?usp=sharing
[20:35] <racedo> you should have access :)
[20:35] <roaksoax> racedo: yeah that's an issue with oauth clocks not being synced
[20:35] <roaksoax> smoser: ^^
[20:36] <racedo> LP 978127 ?
[20:36] <roaksoax> racedo: that seems to be the one
[20:36] <roaksoax> racedo: you guys are using stable ppa right?
[20:36] <roaksoax> smoser: was the fix for this backported to maas/1.2?
[20:37] <smoser> roaksoax, that bug (and its fix) are displayed there correctly.
[20:37] <racedo> during commissioning i'm sshing into the node and the time is right, it was 5 hours ahead this morning but not now
[20:37] <racedo> roaksoax: yeah we use /stable
[20:37] <smoser> racedo, and that system is 5 hours off the clock on the maas server
[20:37] <racedo> smoser: it was
[20:37] <racedo> not any more
[20:38] <smoser> it was in that screenshot
[20:38] <smoser> that's what it's telling you.
[20:38] <smoser> unless you're telling me you fixed it since that screen shot.
[20:38] <smoser> but the INTERNAL SERVER ERROR is different.
[20:38] <racedo> smoser: no, i sshed during commissioning and the time was right, i did it right when we took the screenshot
[20:38] <roaksoax> racedo: also please pastebin apache2's error log
[20:38] <smoser> i suspect you have something in your maas logs
[20:39] <racedo> ok
[20:39] <roaksoax> racedo: is the MAAS server with the same time too?
[20:39] <racedo> yeah
[20:39] <smoser> racedo, that system and the maas server disagree on the time. by 18000 seconds.
[20:39] <smoser> there's little doubt in my mind about that.
[20:40] <smoser> 'date --utc'
[20:40] <smoser> on both
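smoser's comparison boils down to: take `date -u +%s` on both machines and look at the difference, since OAuth requests get rejected when the skew is large. A self-contained sketch, simulating the 18000-second (5-hour) offset seen in this log:

```shell
# Epoch seconds are timezone-independent, so comparing them sidesteps
# any BIOS vs. localtime confusion.
node_epoch=$(date -u +%s)
server_epoch=$((node_epoch + 18000))   # simulated: server 5 hours ahead
skew=$((server_epoch - node_epoch))
[ "$skew" -lt 0 ] && skew=$((0 - skew))
if [ "$skew" -gt 300 ]; then
    echo "clock skew ${skew}s -- sync with ntpdate before commissioning"
else
    echo "clocks agree within ${skew}s"
fi
```

On real hardware you would run `date -u +%s` on the node and on the MAAS server and subtract; `sudo ntpdate pool.ntp.org` plus a clean shutdown (which copies the system clock back to the BIOS clock) fixes the skew, as smoser notes.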
[20:40] <racedo> ok
[20:43] <smoser> i think you have differing clocks, but i don't think that's the whole issue. the fact that the client is re-setting its clock indicates that it's working around the issue.
[20:43] <negronjl> roaksoax, smoser: apache log with errors here: https://pastebin.canonical.com/85324/
[20:44] <smoser> negronjl, /var/log/apache2/errors.log
[20:44] <smoser> or something to that effect.
[20:44] <smoser> you're showing me the access log (i think)
[20:45] <racedo> smoser: roaksoax https://pastebin.canonical.com/85326/
[20:45] <racedo> check lines 4 and 50
[20:46] <smoser> right. its 5 hours off.
[20:46] <roaksoax> racedo: Thu Feb 21 20:43:44 UTC 2013 commissioning node: Thu Feb 21 15:43:42 UTC 2013
[20:46] <negronjl> smoser: https://pastebin.canonical.com/85327/
[20:46] <roaksoax> times are different
[20:46] <smoser> (isn't that what i said?)
[20:47] <racedo> oh!
[20:47] <roaksoax> racedo: that's your issue
[20:47] <smoser> it's not the issue.
[20:47] <racedo>  i was being too slow with date then :)
[20:47] <smoser> unless it's causing fallout from maas/longpoll
[20:48] <roaksoax> smoser: i think txlongpoll is just for UI related stuff isn't it?
[20:49] <smoser> i don't know. but the error in the screenshot says INTERNAL ERROR
[20:49] <smoser> and the log says INTERNAL ERROR
[20:50] <roaksoax> negronjl: restart maas-txlongpoll
[20:51] <negronjl> roaksoax, done
[20:51] <roaksoax> smoser: maybe it is related... though I think we saw that too last time on the drill
[20:51] <roaksoax> negronjl: so there's 2 things it seems. 1. the clock skew, 2. txlongpoll
[20:51] <negronjl> roaksoax, checking the txlongpoll on logs to see if it is still there
[20:52] <racedo> roaksoax: but the time in the maas server is set to EST even if the BIOS has UTC, is that an issue?
[20:53] <racedo> roaksoax: https://pastebin.canonical.com/85328/
[20:53] <smoser> racedo, in ubuntu the bios clock always has utc.
[20:53] <smoser> (i think there are some cases where if you're dual booting it will try not to use utc, but you want utc)
[20:54] <racedo> smoser: should i go to the BIOS and change it to match EST
[20:54] <racedo> ?
[20:54] <smoser> that is fine. all checking is done on actual time.
[20:54] <racedo> ok
[20:54] <smoser> you want bios set to utc. on both boxes.
[20:55] <racedo> smoser: we changed it in the BIOS of the commissioning node just in case, then we are changing it back to UTC
[20:56]  * roaksoax brb
[20:56] <smoser> racedo, fwiw, i'm pretty sure you could just run 'sudo ntpdate pool.ntp.org' and reboot. and i think that will get it fixed.
[20:56] <smoser> (because on system shutdown, the current clock is copied to bios clock)
[20:57] <racedo> smoser: that's right
[20:58] <racedo> smoser: ok done
[21:00] <smoser> that will likely fix the oauth complaints, but i think you'll still see the internal server error.
[21:00] <racedo> smoser: you are right, now it only says internal server error
[21:06] <racedo> this is what i see in the maas apache error from that client when commissioning: https://pastebin.canonical.com/85331/
[21:06] <smoser> roaksoax, how do we get more info on that ?
[21:08] <negronjl> smoser, roaksoax: increasing the debug level in apache ...
[21:08] <smoser> not apache.
[21:09] <smoser> maybe in maas.
[21:09] <smoser> i'm pretty sure it's coming from maas.
[21:09] <negronjl> smoser: ok
[21:09] <smoser> we should be able to get a maas stack trace some where.
[21:09] <roaksoax> maas.log
[21:10] <roaksoax> celery logas
[21:10] <roaksoax> and txlongpoll logs
[21:10] <roaksoax> racedo: pastebin those please
[21:12] <racedo> roaksoax: https://pastebin.canonical.com/85334/ is maas.log
[21:13] <racedo> after increasing the apache log to debug nothing changed from above apache2 access log
[21:14] <racedo> the 500 errors are logged in the access log rather than the error log
[21:15] <roaksoax> racedo: did you guys create any tags? that error in maas.log is weird
[21:15] <racedo> roaksoax: no
[21:15] <racedo> roaksoax: we created constraints
[21:15] <racedo> and actually we are not getting the maas-name constraint to match the nodes names
[21:17] <roaksoax> maybe that's related
[21:18] <roaksoax> i have little to no knowledge in the constraints system
[21:18] <racedo> now there's no zookeeper
[21:18] <racedo> no constraints, just maas
[21:18] <racedo> we can reinstall maas :)
[21:18] <roaksoax> will need to check the logs to see if any upstream commit might have regressed something
[21:19] <roaksoax> could you file a bug with that error log?
[21:19] <racedo> yes
[21:29] <racedo> roaksoax: https://bugs.launchpad.net/maas/+bug/1131418
[21:29] <racedo> roaksoax: as we need to finish this, we may reinstall maas now and do exactly the same steps we were following
[21:29] <roaksoax> racedo: ok i'm testing this in my local virtual environment
[21:30] <racedo> roaksoax: I'm sharing a doc with you of the exact steps we took from the very beginning
[21:48] <racedo> roaksoax: i shared with you a dump of all the http traffic between the client and the maas server to debug with wireshark if that helps
[22:11] <roaksoax> racedo: ack thanks
[23:22] <roaksoax> racedo: still around?
[23:22] <racedo> roaksoax: yes
[23:22] <roaksoax> racedo: i don't think this is needed: juju set-constraints maas-name=
[23:22] <roaksoax> not anymore with newer maas
[23:23] <racedo> oh ok
[23:23] <racedo> but doesn't the constraint stay until wiped?
[23:23] <roaksoax> racedo: nah, the reason that was done was because there was a bug, but IIRC that was fixed
[23:24] <roaksoax> racedo: juju add-unit --constraints doesn't work
[23:24] <roaksoax> ?
[23:24] <racedo> roaksoax: what we were doing is specify every time what node we are deploying the server to
[23:24] <racedo> oh
[23:24] <racedo> i see what you mean
[23:25] <racedo> ok, thanks roaksoax
[23:25] <negronjl> roaksoax, didn't know that add-unit took constraints ... that saves us time
[23:26] <roaksoax> negronjl: I think that works
[23:26] <negronjl> roaksoax, not according to the juju help but, I'll give it a try anyway
[23:26] <roaksoax> let me check
[23:27] <negronjl> roaksoax, thx
[23:28] <roaksoax> negronjl: yeah it doesn't :(
[23:28] <roaksoax> i thought it did
[23:28] <negronjl> roaksoax, thx
[23:29] <roaksoax> racedo: so yeah after deploying swift you have to clean the constraints
[23:29] <roaksoax> because you are setting them globally
[23:29] <roaksoax> but in cases like the bootstrap I think you have to
[23:29] <roaksoax> because you only set them for that deployment in particular
[23:30] <roaksoax> racedo: ok just tested commissioning with latest from maas-maintainers/stable and it commissioned just fine
[23:30] <racedo> ok, with constraints and stuff?
[23:32] <roaksoax> racedo: i'm testing that now
[23:32] <racedo> cool thx
[23:43] <roaksoax> racedo: ok bootstrap with constraint went fine. I commissioned nodes after bootstrap, also went fine
[23:48] <racedo> ok, juan is doing it as well here in parallel
[23:52]  * roaksoax waiting for the bootstrap to finish