/srv/irclogs.ubuntu.com/2012/06/11/#maas.txt

=== matsubara is now known as matsubara-lunch
=== matsubara-lunch is now known as matsubara
allenapmatsubara: Hi there, have you just installed python-tempita on the build machine by any chance?13:07
allenapmatsubara: Ah ha, no, it was rvba. Thanks rvba :)13:08
matsubaraallenap, I haven't but I can if you need.13:08
matsubara:-)13:08
matsubarathe joys of shared maintenance13:08
matsubarabtw, I should be able to start the migration from ec2 to the QA lab jenkins instance today13:08
matsubarafinish with the maza migration as their instance crashed last week and the team is blocked13:09
allenapmatsubara: Oh cool, nice.13:09
matsubaras/finish/finishing/13:09
allenaprvba: Interesting failure now; it wants confirmation for deleting the test database. I'll sort it out.13:09
rvbaallenap: yep, I'm seeing this.13:10
rvbamatsubara: that's great news!13:10
roaksoaxrvba: howdy! so I was wondering if there are significant changes in maas trunk yet that require new packaging and stuff?13:21
rvbaallenap: build fixed.  I think there is an option in Django to make the command "make check" non interactive.  This would avoid the problem we just saw.13:22
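A minimal sketch of the non-interactive option rvba mentions: Django's `test` management command accepts `--noinput`, which suppresses prompts such as the confirmation for deleting a leftover test database. The helper name `build_test_command` and the `manage.py` path are hypothetical; how `make check` actually invokes the test runner in the MAAS tree is not shown in this log.

```python
import sys


def build_test_command(manage_py="manage.py", noinput=True):
    """Build the argv for a Django test run.

    `manage_py` is a hypothetical path; adjust it to the project's layout.
    """
    command = [sys.executable, manage_py, "test"]
    if noinput:
        # --noinput tells Django not to prompt, e.g. before destroying
        # an existing test database.
        command.append("--noinput")
    return command


if __name__ == "__main__":
    print(" ".join(build_test_command()))
```

The resulting list can be handed to `subprocess.call` from whatever target `make check` runs.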
rvbaroaksoax: hi.  Well, we've changed the architecture quite a bit since we are working on removing cobbler.13:23
roaksoaxrvba: ok... is it still functional in trunk? What packaging changes do you think it needs? What new dependencies have been introduced?13:24
rvbaroaksoax: it is functional, we're removing things bits by bits.13:25
roaksoaxrvba: ok. On the other hand, what version of libraphael and libyui you guys need as the plan is to remove them from shipping them on trunk13:26
roaksoaxso we need to ensure the packages are updated accordingly13:26
rvbaroaksoax: indeed.  I think I'd like to use yui 3.5 (we currently use 3.4.1) but I need to make sure everything is fine with YUI 3.5 first.  I'll do that this week and get back to you.13:27
roaksoaxrvba: ok, other than that, If I get to that this week, I'll update to yui 3.4.1 and then we can upgrade13:27
roaksoaxrvba: what about libraphael?13:27
rvbaroaksoax: we use 2.1.0.13:28
roaksoaxok13:28
roaksoaxi'll take care of that13:29
rvbaroaksoax: again, I'll check if there is not a major release that has been done in the meantime.  I guess it would be more profitable to other people if we ship major milestone version as opposed to just the version we need.13:29
rvbaroaksoax: cool, I'll test MAAS with YUI 3.5.  Then, once the packaging is done, I'll be able to get rid of the version of YUI that we ship… that will be a very cleanup.13:30
roaksoaxindeed13:31
rvbaroaksoax: I guess it won't be a problem for you to upgrade once the initial packaging will be done right?  It's just a bunch of static files as far as you're concerned I guess.13:31
rvbaallenap: I've got another branch up for review if you feel like it: https://code.launchpad.net/~rvb/maas/ui-power-parameters/+merge/10962113:33
rvbaallenap: and then I'll be done with the power parameters \o/13:33
cheez0rhrm, i can't figure out if this is a maas issue or a juju issue; I've got 11 nodes in my maas, they're all allocated to me, I've done a juju bootstrap and a bunch of deploys, but some of my machines in juju stay in agent-state: not started; and on the console these nodes get stuck partway through their pxe boot.13:34
cheez0rthey seem to lock up, with nothing but the pink ncurses background screen showing. A reboot brings them back through the pxe process and they hang at the same spot.13:34
cheez0rsince it's not making it through pxe I figured I'd talk to you guys first ;)13:35
cheez0rThe last thing it does is I see it pull the kickstart, then load 'additional components', then NTP, then partitioning, and then it just sits there.13:41
roaksoaxrvba: yeah raphael and yui shouldn't really be an issue13:55
roaksoaxcheez0r: so those machines *don't* pxe boot? or they pxe boot but the installer doesn't show up?13:57
cheez0rroaksoax: they pxe boot, they load the pxelinux image, they do the initial boot sequence, they do the initial system configuration (network addressing, NTP time, etc)13:58
rvbaroaksoax: I'm testing with YUI 3.5.1 and everything seems fine.  Also, libraphaeljs 2.1.0 is the most recent release.13:58
cheez0rthen they hang after partition manager.13:58
cheez0rI'm watching the console of one right now: loading additional components: partman, pkgsel, etc. Setting up the clock/NTP; detecting disks; flash; starting up the partitioner, flash, flash, nothing.13:59
cheez0rflash = too quick to be read13:59
roaksoaxrvba: ok cool14:01
allenaprvba: Sure, I'll review that.14:02
roaksoaxcheez0r: is there a way you could get the syslog of the machine being installed?14:03
rvbaallenap: ta.14:03
allenaprvba: Sorry for the delay; someone at the door.14:03
cheez0rroaksoax: it never boots to a point where it can be accessed.14:03
roaksoaxcheez0r: right, but the installer is running, which means it has a ssyslong14:03
cheez0rI have been watching cobbler.log; it stalls directly after generating and downloading its kickstart.14:03
roaksoaxsyslong14:03
rvbaroaksoax: apart from that we've added two new dependencies: python-celery python-tempita14:03
cheez0rroaksoax: right, but I can't get to it without being able to log into the system14:04
roaksoaxrvba: awesome. so it is still using cobbler14:04
rvbaroaksoax: we will need to update the packaging scripts when we will want to start the celery workers but we need to think about how to do this properly first.14:04
rvbaroaksoax: yes, right now it does.14:04
rvbaallenap: hehe, no worries.14:05
roaksoaxrvba: ok so the first step, I'll provide a new release in quantal with what is in trunk now introducing the new dependencies and filing MIRs14:05
roaksoaxand then step by step we can be improving the packaging14:05
roaksoaxrvba: as I'm sure there will be some cleanup to be done14:05
rvbaroaksoax: if we don't start the celery workers this won't be functional.14:06
rvbaroaksoax: because the only part that replaces cobbler with our stuff uses celery.14:06
roaksoaxcheez0r: https://help.ubuntu.com/community/Installation/NetworkConsole14:06
roaksoaxrvba: right, but right now you said it is using cobbler still, so we don't need it yet?14:06
rvbaroaksoax: like I said, we're replacing it bits by bits.  Most of the application uses it still but the tiny part which does not use it uses celery.14:07
roaksoaxrvba: alright, I guess i could release a partially working maas in order for me to introduce the new dependencies and get things working14:07
rvbaroaksoax: if by "partially working" you mean "not really working" then yes… but I'm not sure it's worth it really.14:09
roaksoaxrvba: right, but I'd much rather have all the new dependencies in main now than later14:09
roaksoaxrvba: how do you usually start celery workers?14:09
cheez0rroaksoax: I'm not following what you're telling me to do; are you basically saying create this netboot iso, boot the system with it, log into it, and somehow recover the log from the prior install attempt where it never wrote a partition to a disk, so there can't be any log retained?14:09
roaksoaxcheez0r: not the netboot ISO. use the network-console part14:09
roaksoaxcheez0r: it will allow you to login while the installer is running14:10
cheez0rokay, so how do I implement this in the pxe boot image file?14:10
rvbaroaksoax: Very simple: add the maas package to the PYTHONPATH and then run 'celeryd'.14:10
roaksoaxcheez0r: you only need to modify the preseed file14:10
cheez0rah ok14:11
roaksoaxcheez0r: /var/lib/cobbler/kickstarts/maas.preseed14:11
rvba(note that it also need the rest of the python packages provided by maas but IIRC they will be on the general python path)14:11
roaksoaxrvba: so is it similar to how we run twistd?14:11
rvbaroaksoax: in the most simple case yes, but then we want to be able to run the workers on another machine and also run many workers.14:12
roaksoaxcheez0r: these 3 are the ones you need:14:12
roaksoaxd-i network-console/password           password SECRET12314:12
roaksoaxd-i network-console/password-again     password SECRET12314:12
roaksoaxd-i preseed/early_command string anna-install network-console14:12
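A rough sketch of adding the three network-console lines above to a preseed file's text. The function name `add_network_console` is hypothetical, and the sketch only appends lines; it does not try to merge with a `preseed/early_command` that a preseed may already define, which would need handling by hand.

```python
NETWORK_CONSOLE_LINES = [
    "d-i network-console/password           password SECRET123",
    "d-i network-console/password-again     password SECRET123",
    "d-i preseed/early_command string anna-install network-console",
]


def add_network_console(preseed_text):
    """Return preseed_text with the network-console lines appended once."""
    lines = preseed_text.splitlines()
    for line in NETWORK_CONSOLE_LINES:
        # Skip lines that are already present so the edit is repeatable.
        if line not in lines:
            lines.append(line)
    return "\n".join(lines) + "\n"
```

Applied to the contents of `/var/lib/cobbler/kickstarts/maas.preseed`, this would leave the existing settings in place and add the network-console ones at the end.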
rvbaroaksoax: and we haven't started the work for these complex setup yet.14:12
roaksoaxrvba: right, so as a first step, wouldn't it make more sense to ship a wrapper that sources PYTHONPATH and runs celeryd and/or upstartify it?14:13
cheez0rI've already got a partman/early_command string; that can coexist with preseed/early_command string?14:13
cheez0r(just confirming)14:13
roaksoaxrvba: and eventually ship it in i.e. maas-worker14:13
roaksoaxrvba: and provide a config to say how many workers are needed to be run?14:14
roaksoaxcheez0r: yes, just add network-console to the end14:14
rvbaroaksoax: That could be first step indeed. Yeah, the plan is to ship the workers within another package.14:14
rvbaroaksoax: but this is very much WIP.  Unless you really really want to do a release, I suggest waiting a little bit so that we can think of a plan.14:15
roaksoaxrvba: awesome then, so we *do* need a wrapper for celeryd or it is just the same way as how twistd was used?14:15
=== vibhav_ is now known as vibhav
roaksoaxrvba: well I was talking to Daviey last week and we agreed on having a new release in archives ASAP cause there14:16
roaksoaxthere's much to take care. i.e. New versions of raphael/yui/ File MIR's for celery ., etc etc14:16
rvbaroaksoax: ok then.  But we definitely will need to iterate on that first release.  You've been warned ;)14:16
roaksoaxrvba: :) no worries... after what I've dealt with last cycle this is piece of cake now :)14:17
rvbaroaksoax: haha. What would the wrapper do then?14:17
roaksoaxrvba: TBH, I'd rather identify issues and start working on getting things packaged now than be in trouble later14:17
roaksoaxrvba: that's my question, do we really need a wrapper that starts as many workers wanted? or for me we can just run celeryd from an upstart job14:18
rvbaroaksoax: sure.  We're in the middle of rearchitecting the whole thing that's all :)14:18
rvbaroaksoax: we need only one worker for now.14:18
roaksoaxrvba: ok, so that means using an upstart job is enough14:19
rvbaStarting celeryf with the right packages on the path should be enough.14:19
rvbaceleryd*14:19
rvbaYes I think so.14:19
roaksoaxrvba: cause, once we need various workers, we can just have a wrapper that starts as many as configured and replace it with the upstart job14:20
roaksoaxs/with the upstart job/within the upstart job14:20
rvbaWell, it will be more complicated than that because how many workers to start is an information that will be in the DB so starting workers and stuff will need to be done at the application level.  Or at least controlled at the application level.14:21
rvbaBut let's start with the simple case.14:21
cheez0rroaksoax: similar behavior- I connect to the network console, I start the install, it finishes with setting up the partition manager, it kicks me out of the network console.14:22
rvbaroaksoax: so, you need to start celeryd with all maas' packages available on the path, plus the celery configuration file (etc/celeryconfig.py).14:22
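A minimal sketch of the environment rvba describes, under some assumptions: that celeryd finds its settings by importing a module named `celeryconfig` from the import path (my understanding of Celery's default), and that putting both the MAAS source directory and the directory holding `celeryconfig.py` on `PYTHONPATH` is enough. The function name and any directory paths passed to it are hypothetical.

```python
import os


def celeryd_environment(maas_src, config_dir, base_env=None):
    """Return an environment dict with maas_src and config_dir on PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    paths = [maas_src, config_dir]
    existing = env.get("PYTHONPATH")
    if existing:
        # Keep whatever was already on the path, after the MAAS entries.
        paths.append(existing)
    env["PYTHONPATH"] = os.pathsep.join(paths)
    return env
```

A wrapper script or upstart job could then launch the single worker with something like `subprocess.call(["celeryd"], env=celeryd_environment(src_dir, etc_dir))`.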
cheez0rnow ssh back in no longer works, but the console is still stuck on "Continue installation remotely using SSH" screen14:22
roaksoaxrvba: ok, awesome. As always I'll seek your advice when I get to that part14:23
rvbaallenap: I'm thinking that if we want to do the massive code reorg we talked about the other day, we should probably do that ASAP.14:23
rvbaroaksoax: sure, no problem.14:23
roaksoaxcheez0r: uhmmm14:23
roaksoaxcheez0r: TBH I have no idea what it could be14:23
cheez0rk14:23
roaksoaxcheez0r: smoser` Daviey ^^ any ideas for cheez0r's issue?14:24
allenaprvba: I'm losing interest in that because of the Django app issue :-/ Do you want to try moving metadataserver to maas.metadata (or something like that)? If you can get that to work then I think we're back on.14:24
rvbaallenap: I can try that.  What was the issue?14:25
rvbaallenap: if we don't manage to pull it off then I think we should at the very least rename 'apiclient'.14:28
allenaprvba: Django complained that it couldn't load the app, but it didn't make a lot of sense. I suspect a bug.14:30
rvbaallenap: ok, I'll see what I can do.14:30
allenapCool.14:32
cheez0rroaksoax: so I ran the network connection as a shell and continued the installer on the console, and tailed syslog14:37
cheez0rI've got the output of that if you'd like it- it appears to be querying memory modules and CPUs and then just dies14:37
cheez0rthe last line I get is kernel: [ ###.######] lowmem_reserve{}: 0 869 64806 64806 and then the beginning of another line before "Connection closed by remote host."14:38
roaksoaxcheez0r: so that might be a kernel issue with your hardware? or faulty hardware even?14:40
cheez0rcould be; all of the blades in this chassis are identical though, so hardware maybe, and it's passing all of the on boot/POST tests14:40
cheez0rI'm also getting this issue on five of eleven blades14:40
cheez0rI'm checking hardware configuration right now to try and isolate differences14:41
roaksoaxalright!/win 314:46
allenapDiff against target: 4466764 lines (+3848606/-587797) 5320 files modified  <-- rvba does not muck around.15:51
rvbaallenap: haha :).  I look forward to removing YUI our tree.15:52
rvbafrom* our tree15:53
roaksoaxrvba: still around?17:07
=== smoser` is now known as smoser
marcoceppi_So, as I continue to venture into a more stable BTRFS setup, I've been trying to research if USB3 is just as bad as USB2 for connecting disks over long periods of time. Is USB3 any better than USB2 in regards to consistency of disk connections? Or should I still be hunting for a better method of connection not using USB at all17:36
marcoceppi_FUUUU, wrong room17:36
=== matsubara is now known as matsubara-afk
kurt_General help question:  I'm having difficulty getting a node to accept an ssh key and allow auto-logon.  Other nodes don't have this problem.  I've tried deleting and re-adding the node via maas, and the node re-bootstraps, but still no go with the key.  Any suggestions?23:12
kurt_Anyone watch this list? :)23:22
kurt_General help question:  I'm having difficulty getting a node to accept an ssh key and allow auto-logon.  Other nodes don't have this problem.  I've tried deleting and re-adding the node via maas, and the node re-bootstraps, but still no go with the key.  Any suggestions?23:59
