=== vladk is now known as vladk|offline
=== CyberJacob|Away is now known as CyberJacob
=== CyberJacob is now known as CyberJacob|Away
=== vladk|offline is now known as vladk
=== perrito6` is now known as perrito666
=== vladk is now known as vladk|offline
=== vladk|offline is now known as vladk
=== vladk is now known as vladk|offline
=== vladk|offline is now known as vladk
=== vladk is now known as vladk|offline
=== rvba` is now known as rvba
[16:30] So, I have 7 machines under maas control right now
[16:30] and all of them have been correctly started and configured with juju, and allocated to the correct user
[16:31] this morning I restarted celery in the hopes that it would clear up a problem I'm having with etherwake not working (it didn't), and now all the nodes are in the "ready" state, rather than showing as already allocated
[16:31] is that to be expected if celery restarts?
[16:31] How do I correct it?
[16:49] qhartman: absolutely not expected, I can't imagine how that could result from a celery restart.
=== roadmr is now known as roadmr_afk
[16:57] jtv, ok, I didn't think so. I have some other issue going with juju where I hit a bug that nuked my env, so I'm thinking that might be the root cause.
[17:00] Yes, that sounds much more probable. The juju env gave the nodes back to the maas.
[17:04] jtv, yeah. I hadn't noticed that had happened when I posted initially, so I'm going to call that the root.
[17:22] it does lead me to another question though. If I need to nuke-and-pave a node in maas, what's the "right" way to do it? I have been deleting it from maas and then re-initializing it, but the installer was refusing to install to a non-empty hdd
[17:22] so I've been manually wiping the drives of machines before trying to bring them back into maas
=== roadmr_afk is now known as roadmr
=== CyberJacob|Away is now known as CyberJacob
=== roadmr is now known as roadmr_afk
=== roadmr_afk is now known as roadmr
[19:52] So, I got MAAS up and running. I bootstrapped juju, but when I run juju status, it cannot resolve the host, despite the MAAS cluster being set to manage DHCP/DNS
[19:56] make sure that the node name in the .juju/environments/your_env.jenv actually resolves correctly
[19:57] I am running into a problem with wake-on-LAN not actually working all of a sudden
[19:58] I rebooted the maas box just in case something got into a weird state, and I can WOL machines from the command line using etherwake
[19:58] and the WOL template in /etc/maas/templates seems right
[19:58] (and was working)
[19:58] but for some reason MAAS can't wake machines to commission them anymore.
[19:59] I looked in the celery.log as suggested earlier. Couldn't really make heads or tails of it, but nothing that seemed to mention WOL popped out.
[20:00] any suggestions for troubleshooting would be appreciated
[20:21] qhartman_too: well I got it bootstrapped now, both nodes are allocated, except one of my nodes is always "pending" while the other is "running"
[20:21] I deployed juju-gui, it went to the "pending" node
[20:21] so it says the juju-gui agent status is "pending"
[20:21] (I just went into my pfsense router and set the hostnames in the DNS forwarder so they resolve)
[20:23] is the installation actually going on the pending node?
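[editor's note] A rough sketch of the checks suggested above, assuming juju 1.x (where the generated .jenv records the bootstrap host), an environment named "maas", a MAAS-assigned hostname like node-abc.maas, and a MAAS server at 192.168.1.10; all of these names and addresses are examples, not taken from the log:

  # find the hostname juju recorded for the bootstrap node (key names may differ by juju version)
  grep -i 'bootstrap-host\|state-servers' ~/.juju/environments/maas.jenv
  # check that the name resolves, both via the default resolver and via the MAAS DNS directly
  host node-abc.maas
  dig @192.168.1.10 node-abc.maas
  # confirm wake-on-LAN works outside of MAAS, using the node's NIC MAC address
  sudo etherwake -i eth0 00:11:22:33:44:55

If etherwake wakes the node by hand but MAAS cannot, the problem is on the MAAS/celery side rather than the network or BIOS settings.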
[20:29] yeah, juju status shows the 2nd node as pending, and the juju-gui agent-state as pending (on machine "1")
[20:29] I removed it and deployed it --to 0
[20:29] so it went to the machine that says "ready"
[20:29] but still pending
[20:29] oh, no, now it says started, hmmm
[20:36] it does take a bit to get things installed and whatnot
[20:39] the second node is still pending :(
[20:40] and I have some now stating that "no matching tools available"
[20:42] can you get on a console for the node?
=== CyberJacob is now known as CyberJacob|Away
[20:48] hm, no
[20:48] having a key issue.
[20:49] Permission denied (publickey)
[20:50] i mean like a management console
[20:50] though if its up enough to deny your key, thats probably a good sign
[20:50] oh, like a local terminal? yeah.
[20:50] I have it in my physical kvm
[20:58] I just shitcanned the environment, gonna start over :D
[21:02] magicrobotmonkey: So how do I fix the key issue?
[21:02] I don't know how to log into the nodes interactively
[21:03] I tried using the user/password for my admin account for MAAS
[21:03] you need to set up an ssh key for the user you're using to run the juju commands with in maas
[21:03] in the maas admin click on your user name and click preferences
[21:03] there should be a place to add an ssh key
[21:04] Yeah I did, I must've used the wrong key.
[21:04] are you sure you were using your key when ssh'ing?
[21:05] I never SSHed into the nodes directly yet.
[21:05] oh, you'll also need to specify the user "ubuntu"
[21:05] ah
[21:05] ah right
[21:05] ah that did it
[21:05] ssh -i /path/to/private_key ubuntu@host
[21:05] i always forget that at least once
[21:05] specifying ubuntu worked
[21:05] cool
[21:05] so, do I want to SU to ubuntu to run the juju bootstrap, etc?
[21:06] no, just as whatever user you've been using
[21:06] ubuntu is just the default username it uses when starting the hosts
[21:06] I see, ok cool.
[21:06] Thanks.
[21:06] you can do that if you want, but it's not needed
[21:07] might be better for the juju channel, but is there a way, using the juju-gui, to specify which node a service is being deployed to?
[21:07] also, how does one determine the IP of the container?
[21:07] I dunno in the gui
[21:07] i just use the cli for that
[21:07] on the cli you do --to N
[21:07] its easier
[21:07] where N is the node number
[21:07] yeah
[21:07] what I've been doing.
[21:07] what about the IP?
[21:07] you can script spinning up an environment
[21:07] juju status juju-gui
[21:08] will tell you the hostname
[21:08] ok, neat
[21:08] this is kind of cool
[21:09] cept that I did juju destroy-environment, now when I tried to bootstrap, it says it failed :P
[21:09] job already running, juju-db, failed: rc: 1
[21:10] did you clean up the nodes and re-initialize them in maas?
[21:10] I did not, do I just commission them again?
[21:10] juju usually takes care of that
[21:10] if its setup right, it will commission and decommission as needed
[21:10] huh, I have been doing that part by hand
[21:11] hm
[21:11] How can juju commission nodes if it doesn't know their power settings?
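[editor's note] Pulling the key advice above together, a minimal sketch, assuming the public half of ~/.ssh/id_rsa is the key added under Preferences for the MAAS user that juju runs as, and a node hostname (node-abc.maas here is a placeholder) taken from juju status:

  ssh-keygen -t rsa -f ~/.ssh/id_rsa            # only if you don't already have a key pair
  # paste ~/.ssh/id_rsa.pub into the MAAS web UI: your user name -> Preferences -> SSH keys
  juju status juju-gui                           # shows the hostname of the machine running the service
  ssh -i ~/.ssh/id_rsa ubuntu@node-abc.maas      # log in as the default "ubuntu" user created on deploy

Logging in as "ubuntu" is only for inspecting a node; juju commands themselves keep running as your normal user on the workstation that holds the environment config.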
[21:11] through maas
[21:11] (unless you're using vms I suppose)
[21:11] maas does an excellent job of using ipmi
[21:11] oh so you fill that in and then just stop short of commissioning
[21:12] huh, I thought they had to be "ready" before juju would touch them
[21:12] i don't know but when i set up juju i hooked it up to my maas install and it takes care of everything for me
[21:12] its kind of nuts
[21:12] except when it doesn't work its kind of hard to debug
[21:12] are you using real hardware?
[21:12] yea
[21:13] huh
[21:13] some weird old dell stuff
[21:13] and yeah, it's suuuper opaque
[21:13] my nodes said "ready" but when it got partially through, it said the juju DB was running, stopping instance, then bootstrap failed.
[21:13] yea im working on an openstack deployment
[21:13] and maas and juju got me pretty far
[21:13] have you had luck with wol working consistently, or are you using some other power method?
[21:13] but the networking stuff has stumped me
[21:13] im using ipmi
[21:13] yeah, I'm at about the same spot
[21:13] which works great
[21:13] using WOL myself.. it wasn't working at first, but then magically it worked.
[21:13] maas adds its own user when it boots the enlist preseed
[21:14] super slick
[21:14] yeah, Term1nal I've had WOL stuff magically stop working
[21:14] yea I've never used it
[21:14] when I enlist my HP machines it looks like it tries to IPMI them, but then it complains about no free user spots
[21:14] lol
[21:14] I've switched to attacking it with my established cobbler install and some ansible playbooks i found
[21:14] well, they both just commissioned
[21:15] so now I'm gonna run the bootstrap and watch things magically turn on
[21:15] I'm tempted to hook up their iLO ports, but I don't really have the switch space
[21:15] it's pretty impressive :D
[21:15] yea i think you need the ilo ports wired for ipmi?
[21:15] yeah, when I first got this going and the machines all started coming up one after another it was definitely an O_O moment.
[21:16] yea same
[21:16] magicrobotmonkey, not sure, it's been awhile since I worked with iLO stuff, and it was always the "deluxe" iLO before, so I'm not sure of the quirks yet
[21:16] if only maas was as configurable as cobbler, I'd be sold
[21:16] yeah, I'm still on the fence about the whole maas/juju thing
[21:17] heh yea one of my machines randomly complains about not having the license for certain ilo functions
[21:17] stupid
[21:17] yeah
[21:19] yea I've been pretty happy with cobbler
[21:19] I haven't used it at all
[21:19] I use chef for all my AWS stuff
[21:19] cobbler is like maas, for bare metal
[21:19] this is my first foray into config management w/ real hardware
[21:20] always just done it by hand before
[21:20] but if we grow this cluster like I think we will, that won't fly for long
[21:20] heh me too, then I had 80 nodes to do at once
[21:20] well, "by hand" using PXE and preseeds
[21:20] but still a helluva lot simpler than this
[21:21] yea cobbler is more flexible/transparent
[21:21] does the juju bootstrap do one at a time?
[21:21] I have 2 nodes, only one powered up and started going.
[21:21] bootstrap should only bring up one node
[21:21] ah
[21:21] the "machine 0"
[21:21] it just powers up one node and installs the juju master or whatever on it
[21:21] then it gets node 1+?
[21:22] once that's up do the "juju deploy..." and it will bring up another
[21:22] ah
[21:22] so, magicrobotmonkey, if you're happy w/ cobbler, why are you messing w/ maas?
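[editor's note] The bootstrap-then-deploy flow described above, as a short sketch. It assumes a MAAS environment already defined in ~/.juju/environments.yaml and nodes enlisted in MAAS with working power control; the mysql charm is just an example workload:

  juju bootstrap               # powers on one MAAS node and makes it machine 0
  juju status                  # wait for machine 0 to show agent-state: started
  juju deploy juju-gui --to 0  # co-locate the GUI on the bootstrap node
  juju deploy mysql            # allocates and powers on another Ready node from MAAS

Each plain "juju deploy" acquires one more node from the MAAS pool, which is why nodes only need to reach "Ready" by hand; allocation and power-on happen through the MAAS API.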
[21:24] openstack
[21:24] the maas/juju seemed like a good way to get it going
[21:24] yeah
[21:24] Yeah, I tried doing a foreman/staypuft plugin install of RDO openstack
[21:25] but getting foreman setup and shit, and installing the staypuft plugin...
[21:25] yeah, I'm not far from giving up on maas / juju and just rolling some shell scripts
[21:25] at least then I'd get some insight into what's going on
[21:26] this just feels like it would be useful long term
[21:26] only the latest pre-release version of foreman had the staypuft plugin in the repo, but it was an OLD version that was not compatible with the version of foreman the plugin was in the repo for...
[21:26] So I would have to install from source
[21:26] and it's all ruby, and screw ruby.
=== CyberJacob|Away is now known as CyberJacob
[21:27] best I had so far was packstack (RDO) on CentOS
[21:27] the only collection that I've had that got a running openstack platform, on a single box even, in less than an hour.
[21:27] with only a few commands.
[21:27] I'd give cobbler a look, qhartman
[21:28] yeah?
[21:29] It looks interesting on the surface
[21:29] its in between a bunch of shell scripts and maas
[21:29] If you're already familiar with pxebooting, it'll be cake for you to get going
[21:30] I actually haven't had much trouble with MAAS, aside from the unreliable WOL, it's the juju that has bugged me
[21:30] yea thats pretty much my experience too
[21:30] since my deployment needs aren't quite what they want to do, it's been tough figuring out the right way to tweak things
[21:30] i need a primer on whats going on behind the scenes or something
[21:30] yeah, me too
[21:31] it probably doesnt help that my first experience with it is attacking a project with the complexity of openstack
[21:31] there are a million how-to's but there's very little (that I've found) that goes under the covers
[21:31] heh
[21:31] <-also
[21:31] i think it did an ok job of deploying
[21:31] other than some handholding keystone around proxies
[21:31] but the networking is as confusing as crap
[21:32] I'm starting to think i might have a driver issue
[21:32] yeah, all the stuff it's done right is like magic, but when things go weird, or don't support being installed on the same host as one another, or some other corner-case I have the knack of finding, it's tricky to pick apart
[21:32] exactly
[21:32] and yeah, openstack networking is a PITA
[21:33] all I want is my VMs to be bridged onto the main network, and get their DHCP and DNS handled by the stuff I have in place
[21:33] heh all i want is any connectivity from my nodes
[21:33] no SDN, no single router to hide them, none of that
[21:33] i dont care how
[21:33] so, if you are using the flatdhcpmanager
[21:34] I've found that the charms don't correctly install the nova-network package on the compute nodes
[21:34] yup
[21:34] the OS and juju guys I've talked to swear it's supposed to
[21:34] i switched to using neutron and got further
[21:34] but I'll be damned if I can see how
[21:34] Installed those by hand, and it got working
[21:34] yea it totally doesnt add any bridges
[21:35] now im at the point where it gets all set up
[21:35] and seems right
[21:35] and then my external interface goes dark
[21:35] Yeah, it seems like neutron is supported better, but the last thing I want is all my VMs getting their traffic siphoned through a single host
[21:35] yea no kidding
[21:35] im still shooting for POC though
[21:35] so just anything working would be nice
[21:36] yeah, I've managed to get there a couple of times, but I haven't been able to repeat it consistently
[21:36] heh
[21:36] last time one of the dhcp servers started talking on the main network and started fucking people up
[21:37] still not sure why
[21:37] I had everything working, and then adding a second compute node made that happen
[21:38] haha
[21:38] with its own dhcp?
=== CyberJacob is now known as CyberJacob|Away
[21:39] apparently. I had left the office already by the time it manifested, so I just shut everything down
[21:39] and have since nuked it all since I knew it was being bad, but not sure where
=== CyberJacob|Away is now known as CyberJacob
[21:52] well I'll ask here since ubuntu-server is dead
[21:55] Adding a node should never add a DHCP server... Only editing the cluster network interfaces should do that.
[21:57] jtv, yeah, my best guess is that adding the second node made juju decide that the dnsmasq needed to be talking on the main interface so the other compute node could reach it
[21:58] and nobody noticed it was causing trouble until their lease expired
[21:58] Hmm... maas doesn't run any dnsmasq.
[21:58] jtv, yeah, this has wandered into openstack territory
[21:59] That does fit the story better I think. :)
[21:59] jtv: filed https://bugs.launchpad.net/maas/+bug/1317677
[21:59] Ubuntu bug 1317677 in MAAS "Spurious error in celery.log: [2014-05-07 19:36:14,895: ERROR/Worker-4] Ignoring DHCP scan for virbr0, it no longer exists. Check your cluster interfaces configuration." [Low,Triaged]
[22:00] Cool.
[22:09] hmmm
[22:09] so I tried to deploy --to 0/lxc/0
[22:09] do I need to -make- containers first before I can deploy to them?
[22:09] how's that work?
[22:11] no
[22:11] do juju deploy --to lxc:0
[22:11] and that should do it
[22:11] start a new lxc container on node 0
[22:12] you only use the 0/lxc/0-style notation when referring to existing nodes/containers
[22:14] gotcha
[22:14] so, if I do the deploy lxc:0, does that start it on a new node? can I specify a node to start the lxc on as well?
[22:15] after you have a couple running, examine the output of juju status and it will become clear
[22:15] the 0 refers to the node
[22:15] OH
[22:15] so I run lxc:0, that spins up a container ON 0
[22:15] so if you do multiple lxc:0 commands, it will spin up multiple containers on 0
[22:15] yup
[22:16] oh sweet :3
[22:16] that's neat.
[22:16] yeah, I have like a dozen lxc's running on my node 0
[22:17] I deployed mysql and rabbitmq to node 0, and openstack-dashboard to lxc:0
[22:17] sure
[22:17] I did the opposite, gave rabbit and mysql their own containers, and put the dash on the node directly
[22:18] ah
[22:18] The reason I went that way is that the dash needs to be user-facing
[22:18] ohhhh
[22:18] that makes sense
[22:18] can I move a service into a container?
[22:18] or just redeploy
[22:19] I think you'd have to destroy it and redeploy it
[22:19] ok fair enough
[22:19] not sure though
[22:19] * qhartman is still wearing his newb hat
[22:20] So you can't have containers user-facing?
[22:20] don't they get their own virtual IPs or what have you?
[22:20] dunno. they seem to only have single interfaces and they get their IPs from the admin-side network
[22:20] I'm sure it can be changed, but I've no idea how
[22:21] and on my physical boxes, I have eth0 as admin-side, and eth1 as user-side
[22:23] it seems like there should be a maas/juju/openstack channel to talk about the whole stack, to help avoid the semi-OT talk in one channel or the other.
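[editor's note] A short sketch of the container placement discussed above, assuming juju 1.x placement syntax and stock charm store names (mysql, rabbitmq-server, openstack-dashboard); which services go where is just one possible layout, not a recommendation from the log:

  juju deploy mysql --to lxc:0                # new LXC container on machine 0
  juju deploy rabbitmq-server --to lxc:0      # another new container, also on machine 0
  juju deploy openstack-dashboard --to 0      # straight onto machine 0, reachable on the node's own IP
  juju status mysql                           # shows the container id (e.g. 0/lxc/0) and its address

The lxc:0 form always creates a fresh container on machine 0; the 0/lxc/0 form addresses a container that already exists, for example with "juju deploy --to 0/lxc/0" or "juju add-unit --to 0/lxc/0".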
[22:24] yeah I agree :D
[22:24] it really involves all 3
[22:24] ok join majuos
[22:28] lol
[22:29] allenap: https://bugs.launchpad.net/maas/+bug/1317682
[22:29] Ubuntu bug 1317682 in MAAS "The cluster takes a long time to connect to the region." [High,Triaged]