[04:23] bigjools: you up for a quick chat about the way we'll register installed boot images?
[04:43] Blast. Test failures in trunk.
[08:09] bigjools: I think having some time of a 3-4 person standup, separate from an 8-person standup is reasonable.
[08:09] jam: exactly what I was thinking :)
[08:27] jam: do you want to come in in about 10-15 mins then?
[08:27] the more the merrier, and distracting !
[08:27] Daviey: just the man
[08:28] oh dear
[08:28] Daviey: my upgrade to quantal has seriously fucked my installation
[08:29] rocking
[08:29] any more detail?
[08:29] $ dpkg -l
[08:29] dpkg-query: error: parsing file '/var/lib/dpkg/status' near line 56897 package 'libao4:i386':
[08:29] mixed non-coinstallable and coinstallable package instances present
[08:30] and a crapload of broken dependencies :/
[08:30] wow, that is quite spectacular
[08:30] thanks :)
[08:30] bigjools: sounds good
[08:30] bigjools: fancy raising a bug, and including all this output?
[08:30] jam: cool
[08:30] Daviey: yeah
[08:30] bigjools: just raise it against 'Ubuntu'
[08:31] Daviey: ok, shall I assign you? :)
[08:31] bigjools: I suspect it's one that cjwatson will be better suited :)
[08:31] heh
[08:31] i'm just a dumb mangler.
[08:44] jam, jelmer, mgz: https://plus.google.com/hangouts/_/c667fc896a4dfdbabcd78a47a4018ee6c2cb7341?authuser=0&hl=en-GB
[08:44] ta
[09:09] bigjools: I don't think i have access rights to modify the Maas board, can you add the Story 3 lane?
[09:09] ah, there it is, nm
[09:27] jam: you should have all the rights you need, let me know if you need more
[09:47] Daviey: you were asking about mongo auth. I have a summary of it, would you like it here or on Maas-devel?
[09:51] jam: mailing list might be more useful for historic and wider audience
[11:45] morning
[11:48] o/ roaksoax
[11:54] o/ bigjools
[12:47] roaksoax, rvba did you get a chance to read http://pad.daviey.com/server-maas-release-support ?
[12:48] do you have some time to talk about it?
[12:48] smoser: yeah I skimmed through it
[12:48] smoser: i, however, think you should probably post your proposal on the ML
[12:49] smoser: i'll drop my patches but I don't know whether there's time to get that into quantal
[12:49] well, we can do that. but we can surely have people tell me what they think is wrong first.
[12:50] smoser: well personally I still think we should be able to select what node should have what release, I think we can support both approaches... but if you consider my approach is incorrect i give in in favor to yours
[12:50] smoser: I think your proposal makes sense. Let's post it to the list to see what people think.
[12:51] That problem was briefly mentioned during our standup call this morning.
[12:51] People seen to think that having the ability to tell that a particular node will use a particular release before deploy time is superfluous.
[12:52] s/seen/seem/
[12:52] roaksoax: ^
[12:52] k
[12:52] as I said, I give in
[12:53] i guess that now making sure that smoser's proposal or any other fix for the issue gets into quantal is a priority
[12:53] yep
[12:53] but not having quantal support is a blocker for me, so my tasks will be blocked until ya'll have the support implemented
[12:53] smoser: ^^
[12:54] roaksoax, i dont understand "we should be able to select what node should have what release, I think we can support both approaches"
[12:54] how does my approach not do that?
[12:54] smoser: your approach is telling the api to give you a node with a particular release
[12:54] well, no.
[12:54] i think that is a requirement
[12:54] roaksoax: you can add the global default stuff, that alone will allow you to test stuff. And it will be one step in the right direction.
[12:54] but that is not related to this actually.
[12:55] we dont have an api entry point now for "give me a node"
[12:55] we only have "start" on a particular node
[12:55] so, unaddressed is "just give me a node already!"
[12:55] smoser: exactly, so that's what i mean.
start a node and install XYZ release on it
[12:56] i'm confused.
[12:56] my approach has that.
[12:56] smoser: yes
[12:56] smoser: that's what I'm saying
[12:56] i'm talking about your approach
[12:56] smoser: so on the WebUI, when you do "start node" how can you tell it what release to use if you want a different one from the default?
[12:57] smoser: in your approach it is not possible to do that because the way you are considering to do this is doing so through the API
[12:57] smoser: so in order to do it, you still need to tell the node what release to use, it might not be an attribute of the node, but you still need to tell it
[12:57] smoser: and that should be stored somewhere
[12:58] well it is stored somewhere.
[12:58] roaksoax: the usage of the WebUI to start a node was always considered a very special case primarily meant for testing.
[12:58] on the node
[12:58] "default_release" as i said.
[12:58] The API story is the real one.
[12:58] rvba: right, but the usage is there
[12:58] rvba: it is out in the wild
[12:58] rvba: so you need to provide the same functionality in both interfaces
[12:58] smoser: yes, default_release is very broad...
[12:58] I think it is ok to say that using the webUI will use the default for now. And refine later if we need to.
[12:59] rvba: i think by doing that you are removing functionality
[12:59] rvba: and another problem is making it play nicely with juju
[12:59] rvba: for 12.10
[12:59] Exactly, but Juju uses the API.
[13:00] rvba: while the WebUI might be a simplified interface to interact with MAAS, I think it is essential to be able to support a release
[13:00] rvba: so when you start a node, you know that that node will use the X release you want
[13:01] roaksoax: we can add this in the WebUI as well (dropdown menu next to the start button or something)
[13:01] Let's focus on the core functionality first, then the API, and only then the UI.
[13:03] rvba: ok, well the start button on the UI uses the API right?
[13:03] roaksoax: not directly.
[13:03] roaksoax: well, sorry, not at all.
[13:04] rvba: well then we would just simply need to figure out how to pass the release dynamically without relying on having it stored somewhere
[13:04] roaksoax: indeed.
[13:04] rvba: if I find the time I'll look into it, unfortunately i need to concentrate on other stuff atm
[13:04] roaksoax: first step would be to add the default I think.
[13:04] roaksoax: you've done most of the work for that already I think.
[13:05] rvba: yeah the default stuff is just a couple lines, i'll separate the patches and see to it
[13:05] roaksoax: cool.
=== cheez0r_ is now known as cheez0r
[13:13] i will send proposal to list.
=== flacoste changed the topic of #maas to: 4 weeks until Final Freeze | Discussion of upstream development of Ubuntu's Metal as a Service (MAAS) tool | MAAS jenkins: https://jenkins.qa.ubuntu.com/job/maas-trunk/
[13:13] why am I still up
[13:16] it's a mystery
[15:32] roaksoax, ok. so interesting problem here.
[15:32] i just booted an ephemeral image in a kvm instance.
[15:33] instance had (i think 256 M of memory) or some smallish number.
[15:33] apt-get update runs
[15:33] apt-get update writes 115M to /var/lib/apt/lists
[15:33] which in ephemeral instance consumes 115M of memory
[15:33] ie, i see some real potential issues here.
[15:35] Daviey, you might find that interesting also.
[15:35] i suspect that 2G memory is the minimum "real server" size
[15:36] but for test within vms, thats a bit annoying.
[15:40] smoser: hmm.. do you think we might need to back the overlayfs via networking ?
[15:41] i dont know. we could back it via block device exposed over iscsi
[15:43] yeah
[15:43] not urgent for 12.10 IMO
[15:44] but next cycle, we might need to consider something smarter?
[15:45] yeah
[16:03] smoser: interesting
[16:04] we can improve that specifically fairly easily, by removing i386 from amd64 arch
[16:04] smoser: during commissioning, do we actually need anything from apt?
[16:04] and removing deb-src lines
[16:05] smoser: right, so the only reason why it is being run is for maas-enlist right?
[16:05] apt* being run
[16:05] well..sort of.
[16:06] commissioning is intended to be a general purpose environment
[16:06] so you'd expect it to work.
[16:06] smoser: right, well we don't really need deb-src
=== hazmat is now known as kapilt
[16:38] allenap: still around?
[16:42] roaksoax: Yes :)
[16:42] allenap: so maybe you'll be able to help me
[16:43] allenap: where is it that when you try to start a node, it sends the data to generate the preseed for a particular node?
[16:43] allenap: or whats the entry point from the node and the generation of the tftp parameters it should use, such as arch, release, etc
[16:44] I'm trying to run 'make' but ipython.scypi.org seems to refuse all connections.
[16:44] Is there another option?
[16:44] jam: Installing ipython locally ought to solve it. Failing that, the tarball can be put in your buildout cache (I can send it to you).
[16:45] (We kind of need a better solution for that.)
[16:45] allenap: is that api.py?
[16:45] allenap: wasn't that the 'download-cache' discussion?
[16:45] jam: Yes.
[16:45] or ignoring it and installing from archive works if the version matches
[16:46] Right, what mgz said; that's what I was fat-fistedly trying to say with "install ... locally" :)
[16:46] mgz: there is a strong debate that you should ignore system packages so that the build environment enforces strict dependencies.
[16:46] (I had that for nosetests whose blog(?!) buildout was failing to scrape on make)
[16:46] allenap: 'apt-get install ipython; make' still fails
[16:47] still is trying to download ipython
[16:47] (on Precise)
[16:47] jam: buildout will enforce strict dependencies because allow-picked-version is false, but if they match it'll use it.
[16:47] Oh buildout, how you ***FEFEGEg4%$^$ me.
[16:48] allenap: how do I know what version buildout is wanting?
[16:48] jam: versions.cfg ought to say.
[16:48] allenap: ah, buildout wants 0.12, and I have 0.12.1 ... :(
[16:49] hacking that makes it get further at least
[16:49] jam: http://ubuntuone.com/6HXrvwiGY2qRqKyezMmVYZ for 0.12
[16:50] if only I could easily copy that to another machine :). But I'll make it work
[16:50] For the egg. Put it in ~/.buildout/cache/dist
[16:50] it still is getting selenium, and, and
[16:50] allenap: I don't have a ~/.buildout
[16:50] is it reasonably safe to just create it?
[16:50] jam: Ah, that's worth having. See HACKING.txt about it.
[16:51] (I wasn't following the discussion that closely.)
[16:57] smoser: so I still see no easy way to get the release stuff
[16:57] smoser: rather than continue to do what we were doing
[16:58] because when the preseed is presented or the parameters for tftp boot, the only thing we really send is the Node
[17:00] roaksoax: So, got distracted.
[17:00] roaksoax: pxeconfig() in maasserver.api creates a KernelParameters object, which has various boot-related things in it.
[17:00] It doesn't generate the preseed though.
[17:00] allenap: so the problem is this
[17:01] allenap: hold on (meeting :) )
[17:03] allenap: ok, so the issue is this: In order to add releases support, I added a new attribute to the Node as os_release. However, after discussing it with smoser, he mentioned that we should not have an attribute for the node
[17:03] allenap: but rather we should simply specify the release we want the node to boot into
[17:04] allenap: and that's it
[17:04] allenap: so the problem now becomes, how can we dynamically have that value (the release to use), if we only pass objects to preseed.py
[17:04] allenap: we can't pass a globally set value right? since we need it for a particular node
[17:05] allenap: the only easy solution i see is simply saying "the release for X node on deployment will be Y" but I don't see how to pass that Y if we are passing objects
[17:05] roaksoax: I think we need to store this on the node, as you've done.
[17:06] allenap: yeah that's the only solution I see
[17:06] allenap: I just needed another opinion
[17:08] roaksoax: smoser's right in that it's only used during the first boot after a machine has been allocated, and that once the machine has correctly installed and pinged maas to say "stop netbooting me" it's no longer relevant, but we need to keep it somewhere until that ping is done.
[17:09] allenap, shouldn't we be able to add a getter to the object?
[17:09] that does something more complex than just look at that attribute of the node?
[17:09] But there isn't yet a better place to put it. We will at some point model "allocations", at which point I guess we can move it to there.
[17:09] it just really isnt an attribute of a node
[17:09] operating systems != hardware
[17:10] (unless you come from apple)
[17:12] smoser: Node is overloaded so I'm not sure where else to store the choice of OS right now. We need to remodel. Perhaps something like: board -< node >- allocation, or something like that.
[17:12] board ?
[17:12] so.
[17:12] the one thing i really care about here.
[17:12] is the external interface
[17:13] 'start' simply needs to take "release"
[17:13] smoser: For reconfigurable hardware... so "future".
[17:13] and having the user have to change a node attribute in order to do that seems completely broken
[17:14] smoser: ok so, the problem is that basically, in order to create the tftp/pxe stuff we only use the node attributes, right?
[17:14] smoser: It probably ought to be an argument to the "acquire" call.
[17:14] smoser: so we need a way to tell "NodeXYZ will deploy with ABC release"
[17:14] "start" and "stop" are just power related; they don't imply OS installation.
[17:15] smoser: this means that each node object needs to know what release is related when it gets started
[17:15] smoser: so it is either have a database table/class that matches Node <-> Instance release version, or keep it as an attribute, but treat it as an instance
[17:15] smoser: I don't know if I explained myself clearly :)
[17:15] allenap, hm..
[17:15] i thought start because i thought userdata was attached to start
[17:16] but i could have been wrong
[17:16] User will call MAAS and say "I want an amd64 node running Quantal" (== the acquire call), MAAS will store the user and release on the Node record, turn netbooting on, mark it as allocated, and reboot the node.
[17:16] what is the link for doc on the api?
[17:16] Argh, you're right, start() does take the user data.
[17:17] acquire doesn't do the boot; you need to acquire and start in sequence.
[17:18] while start_nodes() does take the user data, it sets that data on the node right?
[17:18] NodeUserData.objects.set_user_data(node, user_data)
[17:19] but that's for the metadata server
[17:19] roaksoax: Yes, essentially. There are only ever zero or one NodeUserData rows for a node.
[17:19] I have to go now, but I'll be back in about 2h.
[17:19] allenap: alright
[17:19] thanks!
=== matsubara is now known as matsubara-lunch
[17:21] ugh.
[17:22] so. the big thing to me is that whatever takes user-data should take release.
[17:22] from the api perspective.
[17:23] smoser: right, but that user-data is for meta-data server, not for the preseed/tftp boot generation stuff
[17:23] smoser: the preseed/tftp takes a Node object
[17:24] but thats just garbage implementation
[17:24] just implementation
[17:25] the consumer should not think of those things as separate.
[17:26] smoser: right, so the user-data is set as Data for the node right? but that's meta-data server stuff. Unless you want to do the same for the preseed...
[17:30] well the user doesn't specify preseed.
[17:30] so i dont really care
[17:30] but user-data and release need to be exposed via api
[17:31] smoser: right, but the thing is "how do you match a node object with a piece of data (release)"
[17:31] smoser: so that when the preseed/tftp url generation happens, it uses the correct data for *that* node
[17:35] smoser: atm, it seems much easier to keep a node attribute for release and when you do "give me a node and install X on it" then it should just simply get a node and set the os_release attribute to the release you are requesting
[17:35] but that is just completely broken from the api perspective.
[17:36] smoser: i agree it might not be the best approach, but it is a similar approach to what cobbler did
[17:37] smoser: in cobbler you simply told it "this system inherits from profile X, and profile X is Y release"
[17:38] i dont think thats going to win you any fans.
[17:38] saying "cobbler did it that way" as a response to "thats not the right way"
[17:39] smoser: I'm not arguing that because cobbler did it, we should do it that way. I'm arguing that we need a way to handle this fast as upstream is busy with other stuff, I need to get other stuff done, and i'm sure you also need to
[17:39] smoser: so, i'm saying can we provide a more appropriate fix with the time constraints that we face?
[17:39] smoser: since you said you requested an "instance" type of handling for MAAS, but they were not going to make it happen for 12.10
[17:40] smoser: so the release stuff would be part of that "instance" approach within MAAS, right?
[17:41] smoser: on fwiw, in cobbler you could say "deploy this machine with X release, or deploy this machine with Y release using its api (using koan)"
[17:42] smoser: but anyways, I'm maybe too dumb to see a fix for it (at the moment) that doesn't involve having to store an attribute for a Node object
[17:43] suck.
[17:43] then we could amend my suggestion by having an "instance release" that defaulted to "node release" that defaulted to "global release"
[17:44] and instance release would be set to NULL on "return/destroy" (whatever that api call is called)
[17:44] smoser: as a node attribute?
[17:44] well, my proposal was that those things were node attributes, yes.
[17:44] (the proposal to the list)
[17:44] i suggested that we would have a node attribute for "release"
[17:45] that would default to the global release.
[17:45] smoser: ok hold on, I'm confused now
[17:46] smoser: the support that was hacked for that was basically this: Have an os_release attribute for a Node (similarly to architecture, power type). Every time we enlist a node, the os_release would be set to a globally default release
[17:47] smoser: so what I was suggesting now was that if we simply specify a release different than default, it would set node.os_release as the new requested release, and deploy with that
[17:47] smoser: and on a release, that could be set back to default
[17:47] smoser: is that what you were looking for?
[17:49] well, maybe for now that is sufficient.
[17:51] smoser: but if that's what you were proposing, then the support I added does that almost the same. The only difference is that I was setting a default release for both, commissioning/enlistment and deployment
[17:51] smoser: which can obviously be easily changed
[17:53] well, i think you need to differentiate between commissioning/enlistment and install
[17:53] so keep that.
[17:53] smoser: right, that's simple to do
[17:54] smoser: what I was trying to find a solution for is the attribute (os_release) for the node
[17:54] smoser: which I thought you didn't want at all
[17:54] outside of another table ("instance"), which i think is the correct long term path, i dont think there is one.
[17:54] so what i was suggesting is largely what you are doing
[17:54] except for the fact that i do not want specifying "release" in start to change the node.
[17:55] which you're accomplishing by setting it back to 'default' on release.
[17:55] which ... is somewhat lossy
[17:55] but not that bad.
[17:55] ie, if the default "install" for that node had been set to 12.10 for good reason
[17:55] and i specify 13.04 for one install
[17:55] and then on release maas puts it back to 12.04 (the global default)
[17:55] we've lost data.
[17:56] the only way i can see to work around that is to not set the default for the node, but on start to set an "install_release"
[17:56] and unset *that* on release.
[17:57] smoser: right
[17:59] smoser: but that's the thing, start() calls start_nodes(), which could take a release and set node.os_release = "whatever release passed"
[18:01] smoser: http://paste.ubuntu.com/1199137/ or similar
[18:28] smoser: so then the "instance" view of a node would be simply a NodeInstance that inherits from Node? which is able to handle user-data?
[18:30] "instance" view?
[18:30] roaksoax,
[18:31] smoser: i mean, what you refer to as treating node deployments as instances
[18:31] smoser: i mean, just wondering how you envisioned this?
[18:35] well, i basically consider an "install -> release" cycle an "instance"
[18:36] and an "instance" would have user-data specific to it and a user that owned the instance
[18:36] and a release that was installed
[18:36] and a start and end date
[18:41] smoser: ok, so how do you start a node through the API?
[18:42] look at juju i think.
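The "install_release" idea discussed above (set on start, unset on release, falling back to the node default and then the global default) can be sketched roughly as follows. This is a minimal illustration in plain Python, not MAAS code: the Node class, its field names, start_node(), release_node() and the GLOBAL_DEFAULT_RELEASE value are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed global default, matching the "12.04" used in the example above.
GLOBAL_DEFAULT_RELEASE = "12.04"


@dataclass
class Node:
    """Hypothetical stand-in for a node record; not the real MAAS model."""
    hostname: str
    default_release: Optional[str] = None   # per-node default, set deliberately
    install_release: Optional[str] = None   # per-deployment override

    def get_install_release(self) -> str:
        # Fallback chain: deployment override, then node default, then global.
        return (self.install_release
                or self.default_release
                or GLOBAL_DEFAULT_RELEASE)


def start_node(node: Node, release: Optional[str] = None) -> None:
    # Only the per-deployment field is touched; the node default is kept.
    node.install_release = release


def release_node(node: Node) -> None:
    # Unsetting the override restores whatever the node default was.
    node.install_release = None


node = Node("node1", default_release="12.10")
start_node(node, release="13.04")
print(node.get_install_release())   # the one-off override
release_node(node)
print(node.get_install_release())   # back to the node default, nothing lost
```

Because the override lives in its own field, releasing the node does not overwrite a node default that was set to 12.10 for good reason, which is the data loss described in the 12.10 / 13.04 / 12.04 example.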
[18:49] smoser: I think I just spotted another issue
[18:50] smoser: not that it matters much yet, but if the node has not been deployed yet, obtaining the preseed would display the preseed for the default release
[18:51] smoser: unless we specify a release specifically i guess
[18:51] smoser: it will indeed require a lot of refactoring I believe
[18:52] i guess we will leave that for upstream :)
[18:52] yeah, i dont think thats an issue really.
[19:14] smoser: ok, so another problem is this
[19:14] smoser: when a node makes a PXE request, the request grabs the MAC address of the node making the request
[19:15] smoser: and by using the mac address it obtains the node
[19:15] smoser: and with that node it generates the kernel command line
[19:16] smoser: so the release *has* to be stored somewhere for that particular node
[19:16] smoser: because the way this happens is by making a post request
[19:16] smoser: unless, we could obtain the release from that post request somehow
[19:17] smoser: oh no way, we can't, cause the request is being made by a node itself after you tell the node to start
[19:31] roaksoax, you've confused me.
[19:32] roaksoax, i dont see a problem in the above.
[19:32] if there were an "instances" table, there would be a way to get the "current" instance that occupied a node.
[19:32] so essentially, a node would have an os release but only because of the instance currently occupying it.
[19:32] which would have been created on the start
[19:33] smoser: right, but look at src/maasserver/api.py:pxeconfig
[19:35] roaksoax, right. thats fine.
[19:35] i'm missing something
[19:35] i'm sorry
[19:36] smoser: so it gets the MAC address of the machine that has made the request, it searches for a node within MAAS. If found, it uses its attributes to create the params to be granted by the pxe file
[19:36] right.
[19:37] so yes, i agree.
that means that given a node you have to be able to come up with the correct kernel/initramdisk/state
[19:37] but thats not bad.
[19:37] smoser: right, so there, there's two options, keep the os_release attribute of a node
[19:38] smoser: or, once a node is found based on its mac, then we would need to map the node to somewhere else to obtain the release, right?
[19:38] smoser: so if we have user data for the node, then first we need the node, then we need that node's user-data and then we need the release
[19:39] right.
[19:39] that is fine.
[19:39] given a node, there will only ever be one "instance" at a given point in time.
[19:39] and that instance would have release and user-data
[19:40] smoser: right, so the way I was fixing that is: release = node.get_os_release()
[19:40] so assuming mac is unique (which we have assumed), at a given point in time a MAC can map to user-data or os-release through node
[19:41] smoser: so, do you think that the Node class should have an attribute (which might be called instance_data, that contains the release)
[19:42] so when you call: node.get_os_release(), which we can rename to: node.get_install_release(), it will search for the release in that instance_data variable and return it, if not found, simply return the default?
[19:42] no.
[19:42] roaksoax, for your work i think i'd just hack the way you are
[19:43] the longer term fix is another table "instances" where an instance maps N:1 to nodes.
[19:43] and somehow you can look up "current" instance for a node.
[19:43] and that instance has "os_release" and such
[19:43] but your solution for now is fine i think.
[19:43] so dont get more complex than you need.
[19:43] smoser: right, that's my question. Where does that "instances" table live?
[19:43] it seems we are knowingly shortcutting
[19:43] or should live
[19:44] well its an object i think
[19:44] smoser: an object that would be instantiated within a node, or separately?
[19:46] separately i think.
but i dont know that i understand the question.
[19:46] it has to be separate though
[19:47] smoser: ok, but, there should only be 1 instance for 1 node, hence a 1:1 relation, shouldn't it? Because 1 node can only have 1 instance at a time
[19:47] but over history there are N
[19:47] and you want to be able to look at historic ones
[19:47] for accounting
[19:48] smoser: are we looking to make an instance a database entry?
[19:48] i would think so
[19:48] but i'm not sure how such relations are normally modelled.
[19:48] ie, how you would normally indicate "current" on something like that.
[19:49] smoser: right. Yeah cause I thought we wouldn't care how many times that node has been deployed in what release or for what purpose
[19:53] roaksoax, see that bug i opened.
[19:53] you'd want to know that for accounting
[19:53] you want to know that bob uses 90% of your cpu time
[19:53] or that only 8% of your cpu used run 10.04
[19:54] smoser: right
[19:54] or that 68% of your total cpu time is unused
[19:54] and you cant do that without history (or at least keeping the count at the moment of acquire and release)
[19:54] but i think you just keep the whole records instead
[19:55] right
=== matsubara-lunch is now known as matsubara
=== kapilt is now known as hazmat
[22:36] smoser: still around?
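One possible way to model the "instances" idea from the discussion above (many historic instances per node, at most one "current" one, kept around for accounting) is sketched below in plain Python. Nothing here is taken from MAAS: the Instance and InstanceLog names, their fields and their methods are hypothetical, and "current" is simply assumed to mean "the record with no end date yet".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Instance:
    """One install -> release cycle on a node (hypothetical record)."""
    node_id: str
    owner: str
    os_release: str
    user_data: bytes
    started: datetime
    ended: Optional[datetime] = None   # None means "still occupying the node"


class InstanceLog:
    """Keeps every instance record, so history survives a release."""

    def __init__(self) -> None:
        self._records: List[Instance] = []

    def current(self, node_id: str) -> Optional[Instance]:
        # N:1 over history, but at most one open record per node at a time.
        for inst in self._records:
            if inst.node_id == node_id and inst.ended is None:
                return inst
        return None

    def acquire(self, node_id: str, owner: str, os_release: str,
                user_data: bytes, when: datetime) -> Instance:
        if self.current(node_id) is not None:
            raise ValueError("node %s is already in use" % node_id)
        inst = Instance(node_id, owner, os_release, user_data, when)
        self._records.append(inst)
        return inst

    def release(self, node_id: str, when: datetime) -> None:
        inst = self.current(node_id)
        if inst is None:
            raise ValueError("node %s has no current instance" % node_id)
        inst.ended = when   # close the record instead of deleting it

    def usage_by_owner(self) -> Dict[str, timedelta]:
        # Accounting over finished instances only.
        totals: Dict[str, timedelta] = {}
        for inst in self._records:
            if inst.ended is not None:
                used = inst.ended - inst.started
                totals[inst.owner] = totals.get(inst.owner, timedelta()) + used
        return totals
```

With records like these, the lookup chain from the PXE discussion becomes MAC -> node -> current instance -> os_release, and questions such as how much time one user consumed can be answered from the closed records.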