=== matsubara is now known as matsubara-lunch
=== matsubara-lunch is now known as matsubara
[13:07] matsubara: Hi there, have you just installed python-tempita on the build machine by any chance?
[13:08] matsubara: Ah ha, no, it was rvba. Thanks rvba :)
[13:08] allenap, I haven't but I can if you need.
[13:08] :-)
[13:08] the joys of shared maintenance
[13:08] btw, I should be able to start the migration from ec2 to the QA lab jenkins instance today
[13:09] finish with the maza migration as their instance crashed last week and the team is blocked
[13:09] matsubara: Oh cool, nice.
[13:09] s/finish/finishing/
[13:09] rvba: Interesting failure now; it wants confirmation for deleting the test database. I'll sort it out.
[13:10] allenap: yep, I'm seeing this.
[13:10] matsubara: that's great news!
[13:21] rvba: howdy! so I was wondering if there are significant changes in maas trunk yet that require new packaging and stuff?
[13:22] allenap: build fixed. I think there is an option in Django to make the command "make check" non-interactive. This would avoid the problem we just saw.
[13:23] roaksoax: hi. Well, we've changed the architecture quite a bit since we are working on removing cobbler.
[13:24] rvba: ok... is it still functional in trunk? What packaging changes do you think it needs? What new dependencies have been introduced?
[13:25] roaksoax: it is functional, we're removing things bit by bit.
[13:26] rvba: ok. On the other hand, what versions of libraphael and libyui do you guys need, as the plan is to stop shipping them in trunk
[13:26] so we need to ensure the packages are updated accordingly
[13:27] roaksoax: indeed. I think I'd like to use yui 3.5 (we currently use 3.4.1) but I need to make sure everything is fine with YUI 3.5 first. I'll do that this week and get back to you.
[13:27] rvba: ok, other than that, if I get to that this week, I'll update to yui 3.4.1 and then we can upgrade
[13:27] rvba: what about libraphael?
[13:28] roaksoax: we use 2.1.0.
[13:28] ok
[13:29] i'll take care of that
[13:29] roaksoax: again, I'll check whether a major release has been done in the meantime. I guess it would be more useful to other people if we ship a major milestone version as opposed to just the version we need.
[13:30] roaksoax: cool, I'll test MAAS with YUI 3.5. Then, once the packaging is done, I'll be able to get rid of the version of YUI that we ship… that will be a very cleanup.
[13:31] indeed
[13:31] roaksoax: I guess it won't be a problem for you to upgrade once the initial packaging is done, right? It's just a bunch of static files as far as you're concerned I guess.
[13:33] allenap: I've got another branch up for review if you feel like it: https://code.launchpad.net/~rvb/maas/ui-power-parameters/+merge/109621
[13:33] allenap: and then I'll be done with the power parameters \o/
[13:34] hrm, i can't figure out if this is a maas issue or a juju issue; I've got 11 nodes in my maas, they're all allocated to me, I've done a juju bootstrap and a bunch of deploys, but some of my machines in juju stay in agent-state: not started; and on the console these nodes get stuck partway through their pxe boot.
[13:34] they seem to lock up, with nothing but the pink ncurses background screen showing. A reboot brings them back through the pxe process and they hang at the same spot.
[13:35] since it's not making it through pxe I figured I'd talk to you guys first ;)
[13:41] The last thing I see it do is pull the kickstart, then load 'additional components', then NTP, then partitioning, and then it just sits there.
[13:55] rvba: yeah raphael and yui shouldn't really be an issue
[13:57] cheez0r: those machines *don't* pxe boot? or they pxe boot but the installer doesn't show up?
[13:58] roaksoax: they pxe boot, they load the pxelinux image, they do the initial boot sequence, they do the initial system configuration (network addressing, NTP time, etc)
[13:58] roaksoax: I'm testing with YUI 3.5.1 and everything seems fine. Also, libraphaeljs 2.1.0 is the most recent release.
[13:58] then they hang after partition manager.
[13:59] I'm watching the console of one right now: loading additional components: partman, pkgsel, etc. Setting up the clock/NTP; detecting disks; flash; starting up the partitioner, flash, flash, nothing.
[13:59] flash = too quick to be read
[14:01] rvba: ok cool
[14:02] rvba: Sure, I'll review that.
[14:03] cheez0r: is there a way you could get the syslog of the machine being installed?
[14:03] allenap: ta.
[14:03] rvba: Sorry for the delay; someone at the door.
[14:03] roaksoax: it never boots to a point where it can be accessed.
[14:03] cheez0r: right, but the installer is running, which means it has a ssyslong
[14:03] I have been watching cobbler.log; it stalls directly after generating and downloading its kickstart.
[14:03] syslong
[14:03] roaksoax: apart from that we've added two new dependencies: python-celery python-tempita
[14:04] roaksoax: right, but I can't get to it without being able to log into the system
[14:04] rvba: awesome. so it is still using cobbler
[14:04] roaksoax: we will need to update the packaging scripts when we want to start the celery workers, but we need to think about how to do this properly first.
[14:04] roaksoax: yes, right now it does.
[14:05] allenap: hehe, no worries.
[14:05] rvba: ok so the first step, I'll provide a new release in quantal with what is in trunk now, introducing the new dependencies and filing MIRs
[14:05] and then step by step we can be improving the packaging
[14:05] rvba: as I'm sure there will be some cleanup to be done
[14:06] roaksoax: if we don't start the celery workers this won't be functional.
[14:06] roaksoax: because the only part that replaces cobbler with our stuff uses celery.
[14:06] cheez0r: https://help.ubuntu.com/community/Installation/NetworkConsole
[14:06] rvba: right, but right now you said it is using cobbler still, so we don't need it yet?
[14:07] roaksoax: like I said, we're replacing it bit by bit. Most of the application uses it still but the tiny part which does not use it uses celery.
[14:07] rvba: alright, I guess i could release a partially working maas in order for me to introduce the new dependencies and get things working
[14:09] roaksoax: if by "partially working" you mean "not really working" then yes… but I'm not sure it's worth it really.
[14:09] rvba: right, but I'd much rather have all the new dependencies in main now than later
[14:09] rvba: how do you usually start celery workers?
[14:09] roaksoax: I'm not following what you're telling me to do; are you basically saying create this netboot iso, boot the system with it, log into it, and somehow recover the log from the prior install attempt where it never wrote a partition to a disk, so there can't be any log retained?
[14:09] cheez0r: not the netboot ISO. use the network-console part
[14:10] cheez0r: it will allow you to login while the installer is running
[14:10] okay, so how do I implement this in the pxe boot image file?
[14:10] roaksoax: Very simple: add the maas package to the PYTHONPATH and then run 'celeryd'.
[14:10] cheez0r: you only need to modify the preseed file
[14:11] ah ok
[14:11] cheez0r: /var/lib/cobbler/kickstarts/maas.preseed
[14:11] (note that it also needs the rest of the python packages provided by maas but IIRC they will be on the general python path)
[14:11] rvba: so is it similar to how we run twistd?
[14:12] roaksoax: in the simplest case yes, but then we want to be able to run the workers on another machine and also run many workers.
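Editor's sketch of the simple case rvba describes at 14:10 (put the maas package on PYTHONPATH, then run celeryd). This is only an illustration under assumptions: the function names and the `/opt/maas/src` path in the usage note are hypothetical, and nothing here is taken from the actual MAAS packaging.

```python
import os


def build_celeryd_env(maas_src, base_env=None):
    """Return a copy of the environment with maas_src first on PYTHONPATH.

    maas_src is a hypothetical directory holding the maas package.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    # Keep any entries already on PYTHONPATH, after the maas directory.
    parts = [maas_src] + [p for p in existing.split(os.pathsep) if p]
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env


def celeryd_command():
    """Command line for a single worker; assumes celeryd is on PATH."""
    return ["celeryd"]
```

A wrapper or upstart job could then do the equivalent of `subprocess.call(celeryd_command(), env=build_celeryd_env("/opt/maas/src"))`. Running several workers, or workers on another machine, is outside this sketch.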
[14:12] cheez0r: these 3 are the ones you need:
[14:12] d-i network-console/password password SECRET123
[14:12] d-i network-console/password-again password SECRET123
[14:12] d-i preseed/early_command string anna-install network-console
[14:12] roaksoax: and we haven't started the work for these complex setups yet.
[14:13] rvba: right, so as a first step, wouldn't it make more sense to ship a wrapper that sources PYTHONPATH and runs celeryd and/or upstartify it?
[14:13] I've already got a partman/early_command string; that can coexist with preseed/early_command string?
[14:13] (just confirming)
[14:13] rvba: and eventually ship it in i.e. maas-worker
[14:14] rvba: and provide a config to say how many workers need to be run?
[14:14] cheez0r: yes, just add network-console to the end
[14:14] roaksoax: That could be a first step indeed. Yeah, the plan is to ship the workers within another package.
[14:15] roaksoax: but this is very much WIP. Unless you really really want to do a release, I suggest waiting a little bit so that we can think of a plan.
[14:15] rvba: awesome then, so we *do* need a wrapper for celeryd or is it just the same way as how twistd was used?
=== vibhav_ is now known as vibhav
[14:16] rvba: well I was talking to Daviey last week and we agreed on having a new release in archives ASAP cause there
[14:16] there's much to take care of. i.e. new versions of raphael/yui, file MIRs for celery, etc etc
[14:16] roaksoax: ok then. But we definitely will need to iterate on that first release. You've been warned ;)
[14:17] rvba: :) no worries... after what I've dealt with last cycle this is a piece of cake now :)
[14:17] roaksoax: haha. What would the wrapper do then?
[14:17] rvba: TBH, I'd rather identify issues and start working on getting things packaged now than be in trouble later
[14:18] rvba: that's my question, do we really need a wrapper that starts as many workers as wanted? or for now can we just run celeryd from an upstart job
[14:18] roaksoax: sure. We're in the middle of rearchitecting the whole thing that's all :)
[14:18] roaksoax: we need only one worker for now.
[14:19] rvba: ok, so that means using an upstart job is enough
[14:19] Starting celeryf with the right packages on the path should be enough.
[14:19] celeryd*
[14:19] Yes I think so.
[14:20] rvba: cause, once we need various workers, we can just have a wrapper that starts as many as configured and replace it with the upstart job
[14:20] s/with the upstart job/within the upstart job
[14:21] Well, it will be more complicated than that because how many workers to start is information that will be in the DB, so starting workers and stuff will need to be done at the application level. Or at least controlled at the application level.
[14:21] But let's start with the simple case.
[14:22] roaksoax: similar behavior- I connect to the network console, I start the install, it finishes with setting up the partition manager, it kicks me out of the network console.
[14:22] roaksoax: so, you need to start celeryd with all maas' packages available on the path, plus the celery configuration file (etc/celeryconfig.py).
[14:22] now ssh back in no longer works, but the console is still stuck on "Continue installation remotely using SSH" screen
[14:23] rvba: ok, awesome. As always I'll seek your advice when I get to that part
[14:23] allenap: I'm thinking that if we want to do the massive code reorg we talked about the other day, we should probably do that ASAP.
[14:23] roaksoax: sure, no problem.
[14:23] cheez0r: uhmmm
[14:23] cheez0r: TBH I have no idea what it could be
[14:23] k
[14:24] cheez0r: smoser` Daviey ^^ any ideas for cheez0r's issue?
[14:25] rvba: I'm losing interest in that because of the Django app issue :-/ Do you want to try moving metadataserver to maas.metadata (or something like that)? If you can get that to work then I think we're back on.
[14:25] allenap: I can try that. What was the issue?
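Editor's sketch for the network-console preseed lines roaksoax quotes at 14:12: a small helper that appends those three directives to preseed text when they are missing. The function names are hypothetical, and it deliberately does not merge with an existing preseed/early_command line (the case raised at 14:13), which would need handling by hand.

```python
def network_console_lines(password):
    """The three debian-installer directives quoted in the log at 14:12."""
    return [
        "d-i network-console/password password %s" % password,
        "d-i network-console/password-again password %s" % password,
        "d-i preseed/early_command string anna-install network-console",
    ]


def add_network_console(preseed_text, password):
    """Append any of the three directives not already present verbatim."""
    existing = set(line.strip() for line in preseed_text.splitlines())
    missing = [
        line for line in network_console_lines(password)
        if line not in existing
    ]
    if not missing:
        return preseed_text
    # Make sure the appended lines start on a fresh line.
    if preseed_text and not preseed_text.endswith("\n"):
        preseed_text += "\n"
    return preseed_text + "\n".join(missing) + "\n"
```

Applied to the contents of a preseed file such as the maas.preseed mentioned at 14:11, calling the helper twice leaves the text unchanged the second time.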
[14:28] allenap: if we don't manage to pull it off then I think we should at the very least rename 'apiclient'.
[14:30] rvba: Django complained that it couldn't load the app, but it didn't make a lot of sense. I suspect a bug.
[14:30] allenap: ok, I'll see what I can do.
[14:32] Cool.
[14:37] roaksoax: so I ran the network connection as a shell and continued the installer on the console, and tailed syslog
[14:37] I've got the output of that if you'd like it- it appears to be querying memory modules and CPUs and then just dies
[14:38] the last line I get is kernel: [ ###.######] lowmem_reserve{}: 0 869 64806 64806 and then the beginning of another line before "Connection closed by remote host."
[14:40] cheez0r: so that might be a kernel issue with your hardware? or faulty hardware even?
[14:40] could be; all of the blades in this chassis are identical though, so hardware maybe, and it's passing all of the on boot/POST tests
[14:40] I'm also getting this issue on five of eleven blades
[14:41] I'm checking hardware configuration right now to try and isolate differences
[14:46] alright!/win 3
[15:51] Diff against target: 4466764 lines (+3848606/-587797) 5320 files modified <-- rvba does not muck around.
[15:52] allenap: haha :). I look forward to removing YUI our tree.
[15:53] from* our tree
[17:07] rvba: still around?
=== smoser` is now known as smoser
[17:36] So, as I continue to venture into a more stable BTRFS setup, I've been trying to research if USB3 is just as bad as USB2 for connecting disks over long periods of time. Is USB3 any better than USB2 in regards to consistency of disk connections? Or should I still be hunting for a better method of connection not using USB at all
[17:36] FUUUU, wrong room
=== matsubara is now known as matsubara-afk
[23:12] General help question: I'm having difficulty getting a node to accept an ssh key and allow auto-logon. Other nodes don't have this problem. I've tried deleting and re-adding the node via maas, and the node re-bootstraps, but still no go with the key. Any suggestions?
[23:22] Anyone watch this list? :)
[23:59] General help question: I'm having difficulty getting a node to accept an ssh key and allow auto-logon. Other nodes don't have this problem. I've tried deleting and re-adding the node via maas, and the node re-bootstraps, but still no go with the key. Any suggestions?