/srv/irclogs.ubuntu.com/2014/11/07/#maas.txt

=== CyberJacob is now known as CyberJacob|Away
horatio	I'm having troubles with IPMI + Commissioning. I get 3 of "could not open device at /dev/ipmi0 .. no such file or directory", followed by initscript ipmievd action "start" failed. I added a backdoor and ran the maas_ipmi_autodetect.py script, which works. Then I ran IPMI tool with the credentials the python script returns, and that works too. And when I check, /dev/ipmi0 exists.	00:55
horatio	The IPMI commissioning failures look like they're blocking the reboot at the end of the script though.	00:56
=== jfarschman is now known as MilesDenver
=== mscheel is now known as Guest43984
=== CyberJacob|Away is now known as CyberJacob
=== CyberJacob is now known as CyberJacob|Away
jtv	Who's up for a pre-imp?	08:55
jtv	Because I've got to get some actual product improvement done this week, as opposed to specs and meetings, or I'll go mad.	08:56
=== jfarschman is now known as MilesDenver
jgrassler	Good morning.	10:12
jgrassler	jtv: I found out where yesterday's problem occurs.	10:12
jgrassler	local_host, local_port = get("local", (None, None)) # tftp.py, line 180	10:13
jgrassler	This retrieves the machine's IP address from twisted, which is quite sensible in most cases, but not in mine.	10:14
jgrassler	(since it amounts to what you'd get from `ifconfig eth0`)	10:14
jgrassler	I fixed it in a rather messy manner by setting params['local'] manually in the next line, but that's not exactly clean.	10:18
jgrassler	I might cobble up a patch that allows for configuring params['local'] in pserv.yaml - would that have a chance of getting accepted upstream?	10:19
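
The override jgrassler describes could look roughly like the sketch below. It is only an illustration of the idea, not MAAS's tftp.py: the function name and the `configured_host` parameter (standing in for a value read from pserv.yaml) are hypothetical, and `params` is assumed to be a plain mapping with an optional "local" entry holding a (host, port) pair.

```python
from typing import Mapping, Optional, Tuple

Address = Tuple[Optional[str], Optional[int]]


def choose_local_address(
    params: Mapping[str, Address],
    configured_host: Optional[str] = None,  # hypothetical config value
) -> Address:
    """Pick the (host, port) that gets advertised to the booting node."""
    # Same default as the line quoted from tftp.py: fall back to (None, None).
    detected_host, detected_port = params.get("local", (None, None))
    if configured_host:
        # An explicit setting wins over the address the transport reports,
        # which is what helps when NAT sits between node and cluster.
        return configured_host, detected_port
    return detected_host, detected_port
```

With no override the detected address passes through unchanged, so existing setups would behave as before.
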
dimitern	hey guys, just a heads-up; as discussed on the x-team call, I've filed a couple of bugs - https://bugs.launchpad.net/maas/+bug/1390404 and https://bugs.launchpad.net/maas/+bug/1390411	10:29
ubot5	Ubuntu bug 1390404 in MAAS "new API to get network interfaces information for a node" [Undecided,New]	10:30
ubot5	Ubuntu bug 1390411 in MAAS "add docs on how to take advantage of maas-proxy to cache custom images (e.g. for LXC/KVM containers)" [Undecided,New]	10:30
dimitern	and I've also commented on a few static ip addresses bugs I filed (mostly questions about whether what's left will be fixed in time for 1.7.1 or 1.7.2)	10:35
=== jfarschman is now known as MilesDenver
jtv	Hi jgrassler — thanks for digging that up! Let me just digest the whole thing...	11:03
jtv	allenap, did you see jgrassler's note above? Looks like the tftp code doesn't use MAAS_URL when parameterising a boot method, but "whatever my address is."	11:07
allenap	jtv, jgrassler: I need to remind myself what that code is meant to do.	11:08
* jtv mumbles his usual rant about documenting code		11:09
jtv	The "local" parameter goes into the PXE config as the iscsi address for the boot image.	11:10
jtv	Where do we serve iscsi now?	11:11
jtv	If it's the cluster controller, then MAAS_URL doesn't apply of course.	11:11
allenap	It should be the cluster controller.	11:11
jtv	Blast.	11:11
jtv	We have an abstraction for "address where nodes can find the region controller," but no equivalent for the cluster controller.	11:12
allenap	Well, here we discover it from the address on which the node has contacted the cluster controller.	11:13
allenap	That should be accurate.	11:13
allenap	Unless NAT or something gets in the way.	11:13
jtv	Which seems to be the case here. :(	11:13
jtv	Maybe this should really be the cluster interface address.	11:14
allenap	The cluster can have multiple interfaces, right?	11:14
jtv	(Which, I know, just raises more wrinkles)	11:14
jtv	Yes, a single cluster can manage multiple subnets...	11:14
roaksoax	iscsi is on the cluster	11:16
roaksoax	jtv: we cannot bind the cluster to one single address	11:16
jtv	Yeah.	11:16
roaksoax	because the cluster manages different networks	11:16
allenap	I think supporting cluster controllers behind NAT is a can of worms at the end of a rabbit hole that I really don't want to get drawn in to. jgrassler, I think it's unlikely that we'll support it.	11:17
roaksoax	if we do NAT, then MAAS would have to inject rules for all the networks it knows about	11:18
jtv	In this case MAAS is not involved in managing the NAT (and couldn't be).	11:18
roaksoax	jtv: that doesn't mean that we won't :)	11:19
roaksoax	jtv: but that's not something we will be doing anytime soon	11:19
jtv	I'm guessing if only we could determine an appropriate cluster interface, the cluster interface's address would be the right one here.	11:19
roaksoax	jtv: we can't really know what's the right cluster interface	11:20
roaksoax	jtv: when it comes to knowing, the right cluster interface is the interface they are being managed from	11:20
allenap	jtv: I think MAAS would need to know its real address and the address that outsiders know it by, and relate the two.	11:20
jtv	I think we want "the cluster interface's address from a given node's point of view."	11:21
jtv	Which, yes, is a can of worms. :(	11:21
roaksoax	jtv: right, and that we can know	11:21
roaksoax	jtv: with the NIC->network matching, we can know	11:21
jtv	Not in this case, I think.	11:21
jtv	(We're talking about a very specific scenario here)	11:21
roaksoax	jtv: well, i think we should discuss this in Austin	11:22
gmb	jtv, rvba, allenap: Branch needs review: https://code.launchpad.net/~gmb/maas/check-for-overlapping-cluster-networks/+merge/241061	11:22
jtv	roaksoax: you mean supporting NAT between the node and the cluster controller? I'm just mulling it over in hopes of finding an easy solution, but if we don't find one, is it a use-case we want to support?	11:23
roaksoax	jtv we might want to support doing nat when both the cluster and region are in the same machine	11:34
jtv	That is the case here.	11:34
gmb	allenap: Thanks for the review. I've replied. After your comments, I think it's safer to just disallow overlaps altogether. Sound sane to you?	11:41
=== jfarschman is now known as MilesDenver
allenap	gmb: Sounds good to me, but I'd like rvba to take a look too. rvba, can you look at Graham's last diff comment on https://code.launchpad.net/~gmb/maas/check-for-overlapping-cluster-networks/+merge/241061?	11:44
rvba	allenap: sure	11:59
jtv	Python wishlist item: have "import" propagate indirect ImportErrors as a different type, so we can tell "I'm trying to import something that doesn't exist" from "I'm trying to import a file with a broken import in it."	12:08
jtv	(Because I'm tired of test runners reporting that my test doesn't exist just because my test contains an import error)	12:09
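
For what it's worth, Python 3.6 and later get part of the way to this wish: a missing module raises `ModuleNotFoundError`, a subclass of `ImportError`, and its `name` attribute says which module could not be found. A minimal sketch of telling the two cases apart on that basis follows; the `classify_import` helper and its return labels are made up for illustration.

```python
import importlib


def classify_import(module_name: str) -> str:
    """Return "ok", "missing", or "broken-dependency" for module_name."""
    try:
        importlib.import_module(module_name)
    except ModuleNotFoundError as error:
        missing = error.name or ""
        # The module we asked for (or one of its parent packages) is absent.
        if module_name == missing or module_name.startswith(missing + "."):
            return "missing"
        # The module exists, but something it imports does not.
        return "broken-dependency"
    except ImportError:
        # Any other import failure raised while loading an existing module.
        return "broken-dependency"
    return "ok"
```

A test runner could use the same distinction to report "your test file has a broken import" instead of "no such test".
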
jgrassler	jtv, allenap: Sorry, I missed the discussion (lunch o' clock got in the way)	12:14
jtv	jgrassler: it's not good news I'm afraid — I hadn't realised yesterday that there'd be NAT between the cluster controller and the nodes.	12:15
jgrassler	I can relate to not wanting to support the oddball scenario we've got here - I'll just fix it locally by templating the address into tftp.py with puppet	12:16
jgrassler	It's ugly but it'll do for now	12:16
jgrassler	These floating IP addresses are a bit of a nuisance, unfortunately.	12:18
jgrassler	It's not the first time we've run afoul of the problem :-)	12:18
jtv	And this is one area where even in 1.7 you won't get IPv6. :(	12:31
jgrassler	That'll be another can of worms at some point in the future...	12:33
gmb	rvba: I think you missed what allenap was asking about… See the final comment on the original diff (circa line 124)	12:38
gmb	(https://code.launchpad.net/~gmb/maas/check-for-overlapping-cluster-networks/+merge/241061)	12:38
gmb	rvba: He's spotted a problem with the assumption that different clusters can define the same networks. And I think he's right.	12:39
=== jfarschman is now known as MilesDenver
rvba	gmb: when we discussed this yesterday, we were talking about overlapping networks in the same cluster. I don't think it makes sense to have a node related to many clusters (not related in DB terms, I'm talking network here)	12:43
gmb	rvba: Right, so the point allenap is making then - that no two interfaces anywhere in a region's scope should have overlapping networks - makes sense. It's not that the node is related to two clusters, it's that two independent clusters can right now define interfaces with exactly the same network settings. Which looks fine on paper - they're not the same network on the physical level - but once you get to layer 3 and above, they're identical, w	12:46
roaksoax	they could be yes	12:54
rvba	gmb: what I mean it that I don't see why we would have to enforce that in MAAS. The only problem we could see was the IP allocation and it only becomes a pb if a node is connected (network) to 2 clusters.	12:54
rvba	s/I mean it/I mean is/	12:54
roaksoax	rvba: right, but nodes should not be connected to two clusters, should they?	12:55
rvba	roaksoax: yeah, that's my point.	12:56
rvba	roaksoax: but it's not something we enforce anywhere.	12:56
gmb	rvba: So, my concern is that we're leaving a potential footgun lying around for people if we allow them to do stuff like this. OTOH, you could do some NATing at the cluster level, so maybe it's not a big deal and we should let them. I'm happy with either solution, TBH.	13:03
gmb	And we probably shouldn't be telling network admins what to do.	13:03
gmb	Come to think of it :)	13:03
rvba	gmb: yeah, I think it's the admin's job to sort out the routing. Unless letting them configure identical network will break something in MAAS, I think we should let them do so.	13:04
rvba	networks*	13:05
gmb	rvba: Okay, I'm happy with that.	13:05
roaksoax	rvba: right, but that's not something we recommend either	13:10
rvba	roaksoax: probably not. But I don't think we should forbid this (again, unless it breaks something in MAAS itself).	13:11
roaksoax	rvba: yeah, if someone does that it is their own fault	13:13
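
The kind of check under discussion, detecting interfaces whose networks overlap across a whole region, can be sketched with the standard `ipaddress` module. This is not the code in gmb's branch; the `find_overlaps` name and the "cluster/interface" labels are invented for the example.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlaps(networks):
    """Return label pairs whose networks overlap.

    `networks` maps an interface label (a string) to a CIDR string.
    """
    parsed = {label: ip_network(cidr) for label, cidr in networks.items()}
    return [
        (label_a, label_b)
        for (label_a, net_a), (label_b, net_b)
        in combinations(sorted(parsed.items()), 2)
        if net_a.overlaps(net_b)
    ]


region = {
    "cluster-a/eth0": "192.168.1.0/24",
    "cluster-b/eth0": "192.168.1.128/25",  # sits inside cluster-a/eth0
    "cluster-b/eth1": "10.0.0.0/8",
}
print(find_overlaps(region))
```

Whether such a result should be rejected or merely flagged is exactly the policy question debated above; the sketch only finds the pairs.
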
jesk	hi	13:16
jesk	trying to understand maas... having problems it :-)	13:17
jesk	s/it/with it/	13:17
jesk	I'm not able to get information about the boot order process	13:18
jesk	what I could see so far was that a) server boots via PXE, b) server reboots again and boots via PXE (why twice, I don't know) c) shuts off	13:19
jesk	when trying to install something (only tried juju quickstart) a) server boots and installs image b) reboots and boots again from PXE	13:20
jesk	do I need to deactivate PXE manually or is that handled by MAAS?	13:20
gmb	allenap, rvba, jtv: 'Nother branch for all y'all. https://code.launchpad.net/~gmb/maas/fix-ipmi-wording-bug-1304518/+merge/241075	13:22
jtv	I'll take it.	13:22
allenap	jtv: I've done it.	13:23
jtv	Grrr	13:23
allenap	Sorry :)	13:23
jtv	Bikeshed derby is ON!	13:24
jesk	is there any real technical doc about MAAS?	13:24
jesk	or just cloud-style powerpoint informations :D	13:24
jtv	jesk: http://maas.ubuntu.com/docs1.5/	13:26
jesk	jtv: those docs dont explain what happens when you really want to deploy nodes	13:28
jtv	It's an old version... more recent docs on maas.ubuntu.com may help. Did you have anything specific in mind?	13:29
jesk	its more like "type this and that"	13:29
jesk	jtv: i dont get the overall picture of it	13:29
jesk	jtv: concrete use case	13:30
jesk	currently I only have one MAAS node and now I want to deploy more nodes. I can't get past the step of "booting a server from pxe", which shuts down after PXE boot	13:31
jtv	Okay, so you're clearly beyond the part covered in the Orientation section.	13:31
jtv	Which is good.	13:31
jesk	even wake on lan works	13:32
jesk	"start node" -> node starts, boots from pxe -> and down again	13:32
jtv	Ah, wake-on-LAN is awkward because it has no way to shut down a node.	13:32
jtv	So you already commissioned and allocated the node?	13:32
jesk	yeah, but I'm happy with shutting down manually via ILO for the moment	13:32
jesk	yeah I did that, but I can't figure out what it even means :-)	13:33
jtv	Right.	13:33
jtv	I'm sure we documented this _somewhere_, but let me be lazy first and summarise.	13:33
jesk	one node is currently in "allocated to root"-state and one in "ready"-state	13:33
jtv	OK. The "allocated to root" one should be either being installed, or up and installed with your system and your ssh key.	13:34
jtv	Or, crucially, it could be waiting for you to start it.	13:34
jesk	"allocated to root"-state because of issuing "juju quickstart" most probably, which unfortunately ended in nothing	13:34
jtv	(This all gets much better in the 1.7 which we're currently in the process of releasing)	13:34
jtv	Oh, this was done through juju?	13:35
jesk	I'm not 100% sure if juju started a node through MAAS	13:35
jesk	but I could see via the remote console that a system was installed	13:35
jtv	Oh, an operating system was installed on that node?	13:36
jtv	That's good.	13:36
jesk	after the automatic reboot it booted again via PXE and juju quickstart timed out	13:36
jtv	When you bootstrap a juju environment, it allocates a node for itself.	13:36
jtv	It asks MAAS to allocate a node, and when it gets a node, it tells MAAS to start the node up.	13:37
jtv	As the node starts up, it netboots off the MAAS server, and boots into an install image.	13:37
jtv	Thus it installs an OS, and the user's SSH keys.	13:37
jtv	Then it reboots into the OS that it just installed.	13:37
jtv	At this point the user (which I guess here is juju) has a working node.	13:37
jtv	I guess your node got installed, and then shut down... and did it come back up after that?	13:38
jesk	in my case (i believe) it rebooted from PXE again :-)	13:38
jesk	so the user has to make sure that PXE is turned off on the server when the server reboots finally after OS installation?	13:39
jesk	or can the PXE boot image check if the server was started for normal operation after OS installation and boots from local disk?	13:40
jesk	boot order is: (1) PXE (2) CD (3) HDD	13:41
jtv	Once the node is deployed (as this one seems to be), it will boot off its own disk.	13:41
jesk	to have the flexibility to always boot from PXE for new installation PXE has to be (1)	13:41
jtv	So no need to change that order.	13:41
jesk	if it boots from disk directly, the boot order must always be (1) HDD first	13:42
jesk	but then I'm not able to boot from PXE if I want to	13:42
jtv	If the node tries to netboot while it's deployed, the MAAS server tells it to boot from local disk.	13:42
jesk	and the HDD always gets booted as soon as it has a valid boot record	13:42
jesk	ah! so via PXE it gets told to boot from local disk	13:43
jtv	Yeah. No need to change that order: just always let it netboot.	13:43
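
Put as code, the point jtv is making is that the boot order on the machine stays fixed and the server decides what a netbooting node does next. The mapping below is a sketch pieced together from this conversation only; the status names and answer labels are illustrative, not MAAS's internal values.

```python
# Illustrative labels only; they follow the chat, not MAAS's code.
PXE_ANSWERS = {
    "unknown": "enlist",            # never seen before: register with the server
    "commissioning": "commission",  # ephemeral image, hardware inventory
    "ready": "poweroff",            # nothing to do until someone allocates it
    "allocated": "install",         # started by its owner: boot the installer
    "deployed": "local",            # already installed: boot from local disk
}


def pxe_answer(status: str) -> str:
    """What a netbooting node in the given status is told to do."""
    return PXE_ANSWERS[status]


print(pxe_answer("deployed"))  # prints "local"
```

That is why PXE can stay first in the boot order: a deployed node still netboots, but is immediately handed back to its own disk.
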
jesk	interesting, but unfortunately it didn't seem to work	13:43
jtv	Any symptoms?	13:43
jesk	it booted from PXE	13:44
jesk	saw this via console	13:44
=== jfarschman is now known as MilesDenver
jesk	but maybe "juju quickstart" doesn't tell MAAS to treat that installation as persistent and leaves it as "new installed node"?	13:44
jesk	so much magic :-)	13:45
jesk	i'am just dumb network engineer playing with that stuff	13:45
jtv	So the node that was "allocated to root" booted from PXE? What happened then?	13:45
jesk	(with a bit of linux and freebsd background)	13:45
jesk	it shut off after that	13:46
jtv	(Yes, far too much complexity — there's a lot less you can count on once you cross the boundaries between machines and between reboots)	13:46
jtv	It shut off...	13:46
jtv	That normally means it's not allocated.	13:46
jesk	ah, those servers shouldn't shut off after PXE boot?	13:46
jtv	Now, the situation as I understand it is that you have two nodes: #1 was deployed by Juju itself, and #2 is in the Ready state.	13:47
jtv	Servers will PXE-boot rather a lot... it depends on the situation. During deployment there should be one reboot, from the install image into the installed system.	13:48
jesk	yes, but both off	13:48
jtv	Is it possible that the wake-on-LAN simply didn't come through? Again, things are much better in 1.7, but in 1.5 the server just wouldn't notice.	13:48
jesk	wake on lan works, I don't get how a system is installed at all. What I saw is that servers boot twice from PXE, but don't install a full-blown OS	13:49
jesk	and no matter what I do they don't come up with a plain OS boot	13:50
jesk	they always boot something from PXE which ends in a shutdown after that	13:50
jtv	I wonder if maybe you don't have the boot images you need...	13:50
jesk	so the goal is that MAAS would install the image I gave the node via the MAAS frontend	13:51
jtv	Well you wouldn't have to provide an image; MAAS downloads those by itself.	13:51
jesk	and it would install and boot it similar to installation from CD	13:51
jesk	ending in a terminal prompt	13:52
jtv	Well, login prompt. :)	13:52
jesk	yes :)	13:52
jesk	however it finds out charsets, language, time zone, disk partitions...	13:52
jtv	Let me just summarise what phases these pxe-boots go through:	13:52
jtv	First you "enlist" nodes — usually simply by turning them on and letting them netboot off the MAAS server.	13:53
jtv	They then register their existence with MAAS.	13:53
jtv	Then, you tell MAAS that you want to "commission" them.	13:53
jtv	MAAS boots them up, but into an ephemeral image, and builds an inventory of the node's hardware.	13:53
jtv	After this step, a node is Ready.	13:54
jtv	If you got to this point, that should mean that basic things like netbooting already work.	13:54
jtv	I do believe that ILO has some IPMI quirks, but if you're using wake-on-LAN, I don't think those would affect you.	13:55
jesk	(the wakeonlan package isn't pulled in as a dependency btw, I had to install it manually)	13:55
jesk	ok those steps were all done I guess, the servers registered, I see their MACs, and both were already in the ready-state	13:56
jesk	but what after that	13:56
jtv	If you go to a Ready node's UI page, there are buttons to allocate and start the node.	13:57
jtv	(You may want to log in as a non-admin user to hide the atypical steps for now)	13:57
jtv	Did you upload your SSH public key?	13:58
jesk	yes	13:58
jtv	Then the Start button should boot the node into the installer.	13:58
jesk	when I press "start node" what will happen?	13:58
jesk	it boots again from PXE?	13:58
jtv	Yes, into an installer.	13:58
jesk	ah ok	13:58
jtv	That then installs the OS (which is always Ubuntu in 1.5).	13:58
jtv	(If you edit the node you can select a different release.)	13:59
jtv	When the installer is done, it reboots the node.	13:59
jtv	At that point the node should come back up, into the OS that was just installed — with your SSH keys on it.	13:59
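
The phases jtv walks through (enlist, commission, allocate, install) form a small state machine. The sketch below encodes just that sequence; the state and event names are hypothetical labels taken from the wording of the chat, and real MAAS tracks more states than these.

```python
# Hypothetical state and event labels; they follow the chat, not MAAS's code.
TRANSITIONS = {
    ("unknown", "enlist"): "enlisted",     # node netboots and registers itself
    ("enlisted", "commission"): "ready",   # ephemeral image, hardware inventory
    ("ready", "allocate"): "allocated",    # a user (or Juju) claims the node
    ("allocated", "install"): "deployed",  # installer runs, node reboots into the OS
}


def advance(state: str, event: str) -> str:
    """Apply one event, refusing steps that skip a phase."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"cannot {event!r} a node in state {state!r}") from None


def walk(events, state: str = "unknown") -> str:
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = advance(state, event)
    return state


print(walk(["enlist", "commission", "allocate", "install"]))  # prints "deployed"
```

Trying to install a node that is merely Ready raises `ValueError`, which mirrors the symptom earlier in the log: a node that is not allocated just powers off after netbooting.
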
jesk	and this last step can also be managed by juju for example?	13:59
jtv	Yes.	13:59
jtv	When you ask Juju to start a unit, it allocates and starts that node.	14:00
jtv	(It provides some custom data to install the charm you want, of course.)	14:00
jtv	Once you have the node up and running, it's utterly yours. You can reboot it, mess with the OS, etc. Just don't disable PXE-boot or it will be hard for MAAS to manage after you release it. :)	14:01
jesk	thanks so far, jtv	14:01
jtv	Juju has a very cloud-y view of machines, so it will tend to think of machines as things you start up once, use for as long as you need it, and then discard.	14:01
jtv	np.	14:01
jesk	i will play a little more	14:02
jtv	OK. Let me know when you want to tackle the Mystery of the Phantom Server.	14:02
jesk	of what? :D	14:02
jtv	Yeah, this analogy isn't working very well. These titles usually complain about dead people/ships/animals acting as if they're alive, not the other way around.	14:03
jesk	i would really like to install the whole openstack magic on box for now	14:03
jesk	and have like 6 nodes for storage and computing, just for the possibilities in my lab	14:03
jesk	one openstack box	14:04
jtv	One small caveat: MAAS can manage VMs, but it doesn't create them.	14:04
jesk	but unfortunately the guides install like 6 openstack servers for having 2 compute and 2 storage nodes	14:04
jesk	ok thanks for the hint	14:05
jesk	interesting, the OS was installed	14:16
jesk	but I can't log in, the SSH pubkey of my user doesn't work	14:17
jesk	:D	14:17
jesk	wonder how juju handled that...	14:17
jesk	muha... my mistake sorry for the spam... user ubuntu ...	14:20
jtv	:)	14:25
=== matsubara is now known as matsubara-lunch
* jtv steps out for a break		14:40
=== jfarschman is now known as MilesDenver
=== matsubara-lunch is now known as matsubara
jesk	it's a bit of a shame that if you do an openstack installation like the ubuntu guides suggest, things don't work as explained at every turn	16:34
jesk	I would expect that from foreign howtos, but not from the distribution itself	16:35
lutostag	jesk: where at? I'd like to fix that if possible?	16:38
jesk	just take one of the manuals about installing openstack, this is really not a flame... I've been trying for a few days now to install it in all kinds of variants... without success	16:40
jesk	next error happened right now:	16:41
jesk	2014-11-07 14:56:27 ERROR juju.cmd supercommand.go:305 gomaasapi: got error back from server: 401 OK (Expired timestamp: given 1415372187 and now 1415381494 has a greater difference than threshold 300)	16:41
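
The numbers in that error can be checked by hand: the server compares the timestamp sent with the request against its own clock and rejects the request when they differ by more than the threshold. A small sketch of that comparison, using only the values quoted in the message (the `timestamp_skew` helper is made up for illustration):

```python
def timestamp_skew(given: int, now: int, threshold: int = 300):
    """Return (skew_in_seconds, rejected) for a request timestamp."""
    skew = abs(now - given)
    return skew, skew > threshold


skew, rejected = timestamp_skew(given=1415372187, now=1415381494)
hours, remainder = divmod(skew, 3600)
minutes, seconds = divmod(remainder, 60)
print(f"{skew} s = {hours} h {minutes} min {seconds} s, rejected: {rejected}")
# prints "9307 s = 2 h 35 min 7 s, rejected: True"
```

So the request was stamped roughly two and a half hours behind the server's clock, far beyond the 300-second allowance. One plausible reading is that the clocks on the juju client and the MAAS server disagree, though the log alone does not confirm the cause.
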
jesk	its a mess	16:41
roaksoax	rvba: ^^	16:42
jesk	you need to know a lot about all the components, maybe then it's possible to install that stuff, but then please no big marketing about "openstack installation from canonical step by step guides"	16:42
roaksoax	jesk: what guides are you using?	16:45
jesk	I tried all I could find :D	16:45
jesk	maas guides, juju guides, openstack guides	16:45
jesk	all from the ubuntu doc archive, and also foreign stuff	16:45
roaksoax	jesk: like?	16:46
jesk	what do you mean?	16:46
jesk	like that https://insights.ubuntu.com/2014/05/21/ubuntu-cloud-documentation-14-04lts/	16:47
jesk	or just the openstack-install package	16:48
roaksoax	jesk: http://insights.ubuntu.com/wp-content/uploads/UCD-latest.pdf?utm_source=Ubuntu%20Cloud%20documentation%20%E2%80%93%2014.04%20LTS&utm_medium=download+link&utm_content= that's what you need to follow	16:50
roaksoax	maybe you just ran into a bug with juju	16:50
=== roadmr is now known as roadmr_afk
=== jfarschman is now known as MilesDenver
=== CyberJacob|Away is now known as CyberJacob
=== roadmr_afk is now known as roadmr
