[09:33] smoser: harlowja: So that looks like we won't need to move to openstack/, right? [09:33] As they are trying to avoid doing moves (rather than remove the string 'stackforge'), moving all the stackforge projects would seem unproductive... [13:35] Odd_Bloke, it seems that the name stackforge is gone. [13:35] we will have to move to the openstack/ namespace. [13:43] smoser: http://paste.ubuntu.com/12061743/ [13:43] smoser: That's from #openstack-infra just now. [13:50] well that's nicer than i thought [13:54] smoser: Yeah, I think there was an earlier draft which was harsher. [13:55] i thought we were at least forced to move namespace now rather than "at _some_ undefined point" [14:01] hey guys. [14:01] meeting today? [14:02] Yeah, I need to restart my browser. [14:02] Will be there RSN. [14:03] k [14:07] ok... [14:07] so we'll start the cloud-init meeting here now. [14:07] yeah, sorry about that, I don't know what's happening with my mic. [14:08] smoser, meeting here? or? [14:08] no worries [14:08] yeah [14:08] cool [14:08] let me type an agenda quick [14:09] agenda: [14:09] * reviews http://bit.ly/cloudinit-reviews-public [14:09] * https://review.openstack.org/#/c/209520/ [14:09] * reporting / smoser and cloud-init 0.7 [14:09] * main and persisting state [14:09] * open discussion [14:09] seem reasonable? if i've missed anything then we can add it in open discussion. [14:09] +1 [14:09] Sounds good. [14:10] +1 [14:10] ok. [14:10] so reading claudiupopa's very nice commit message. [14:11] i think it looks nice. have to read closer on how the network and local data source would work. [14:12] but since many comments were addressed in that review i think we're good there.. i'll review it some more after the meeting [14:12] reporting / smoser and cloud-init 0.7 [14:12] Yeah, I need to do a re-review of that as well. [14:12] == reporting / smoser and cloud-init 0.7 == [14:12] But it was great last I looked, so that shouldn't take long.
[14:13] i took the reporting code from cloud-init 2.0, added some things to it, and loaded it into 0.7. the things i added were not much, and the intent is to get them to really just be copied from 2.0 -> 0.7 [14:13] the things discovered in using it that need to be addressed: [14:13] a.) exceptions should not leak through reporting. [14:13] the handlers should log errors, but exceptions shouldn't be allowed to bubble up. [14:14] You mean errors inside the handlers? [14:14] yeah. [14:14] claudiupopa: Yes. [14:15] ie, posting an error to http://foo should not cause cloud-init to stacktrace [14:15] log the failure, go on with life. [14:15] b.) the webhook ideally would buffer messages until it has a network connection and then start. non-trivial to decide when it should try to reach an endpoint though. [14:16] That sounds like a nice-to-have to me; do we have a concrete use case for something needing to know what happened before the network comes up? [14:16] other than that i'd like to have that information collected... [14:17] if we add a timestamp to each event, then it makes logging boot time very easy. [14:17] c.) blocking... it's not really a problem yet, but it'd seem that blocking on a webhook could be problematic for boot performance. [14:18] if we're just expecting things to log errors rather than raise exceptions, it seems reasonable to somehow not block the rest of the code. [14:18] make sense? [14:19] Could we achieve (b) by writing structured data to the filesystem (i.e. a separate reporter) and then a separate job could Require/After cloud-init-with-networking and push that data out somewhere? [14:19] Why not store them in memory? [14:19] Seems redundant to cache them on the filesystem. [14:20] And that's additional disk access that we don't need. [14:20] Because we don't know when we're going to get networking, and systemd is good at ordered dependencies. :p [14:20] I mean, this sounds like a very specific use-case.
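The "exceptions should not leak through reporting" point (a) above can be sketched roughly like this. This is a hypothetical illustration, not cloud-init's actual reporting API: the handler class, the `publish_event` method name, and `report_event` are all invented for the example.

```python
# Sketch of point (a): reporting handler failures are logged, never
# allowed to bubble up and crash the boot. All names are illustrative.
import logging

LOG = logging.getLogger(__name__)


class FailingWebHookHandler:
    """Stand-in for a webhook handler whose endpoint is unreachable."""

    def publish_event(self, event):
        raise IOError("POST http://foo failed: no route to host")


def report_event(handlers, event):
    """Fan an event out to every handler; log failures, go on with life."""
    for handler in handlers:
        try:
            handler.publish_event(event)
        except Exception:
            # Never let a reporting failure stacktrace cloud-init.
            LOG.exception("handler %s failed for event %s", handler, event)


report_event([FailingWebHookHandler()], {"name": "init", "result": "SUCCESS"})
# execution continues even though the webhook POST failed
```

The key design point from the discussion is that the try/except lives in the dispatch layer, so no individual handler author can break boot by raising.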
[14:20] I'm thinking of a way that someone who wanted it could achieve it. [14:21] Since I'm not very familiar with it... what does systemd have to do with this? :-) [14:21] systemd would start a job for us when networking was all up. [14:21] just like our next topic. we'll have stages in boot, for the local datasource and applying networking config [14:22] and then a stage that runs when networking config has come up [14:22] in cloud-init 0.7 that is 2 separate processes, so we need disk to persist and replay. [14:23] i agree that it is a bit of a specific use case [14:23] I think I got it. I'm wondering if we could customize the stages or if they are hardcoded, since ideally on windows we'll have only two of them. [14:24] how so? [14:24] basically on windows we're doing everything in almost one run of cloudbaseinit. Except that setting a hostname requires a restart, so we have that split out into a separate initial run. [14:25] that's why we actually don't need persistence on disk for windows. [14:25] have to think some on it. [14:25] i think we've kind of moved on to topic 2 [14:25] we've bled over. [14:26] Yeah, I think there's a fairly fundamental design decision to be made here. [14:26] yeah, regarding topic 1, I'm +1 on not crashing cloud-init when a reporting hook fails. [14:26] Which is going to affect how we implement a whole bunch of things. [14:26] And, yeah, we shouldn't crash out, and we should be able to do reporting in the background. [14:26] We'll obviously need to ensure that the payload contains a timestamp if we're doing it in the background. [14:27] in the background meaning a thread? [14:27] I'm not fussy. :p [14:27] It's all IO/networking, so the GIL won't be a problem. [14:29] so .. [14:29] https://review.openstack.org/#/c/202743/7/cloudinit/shell.py [14:29] the first 4 are "stages" that are largely what we have in cloud-init. [14:29] basically you have: [14:30] search local datasources (and apply networking if possible).
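The "reporting in the background" idea discussed above (buffer events in memory, timestamp at publish time so boot timing survives buffering, and post from a thread so a slow webhook can never block boot) might look something like this sketch. The class name and the `post` callback standing in for the real HTTP POST are assumptions, not the actual implementation.

```python
# Sketch of points (b) and (c): an in-memory buffered, threaded
# webhook reporter. publish_event() returns immediately; a daemon
# thread drains the queue and does the (potentially slow) posting.
import queue
import threading
import time


class BufferedWebHook:
    def __init__(self, post):
        self._post = post              # callable that actually sends an event
        self._events = queue.Queue()   # in-memory buffer, no disk access
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def publish_event(self, event):
        # Timestamp at publish time, not send time, so boot timing is
        # accurate even if the event sits in the buffer for a while.
        event = dict(event, timestamp=time.time())
        self._events.put(event)        # never blocks the caller

    def _drain(self):
        while True:
            event = self._events.get()
            if event is None:          # shutdown sentinel
                return
            self._post(event)

    def close(self):
        self._events.put(None)
        self._worker.join()
```

Since the worker only does IO, the GIL is not a bottleneck, matching the point made in the discussion.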
this stage blocks networking from coming up [14:30] search network datasources and run "init" modules (in 0.7 talk) [14:30] then config-final. [14:31] which is really just supposed to be an "rc.local-like timeframe" [14:31] realize that i don't know that i wrote those well, but those are the general stages. [14:31] got it [14:31] in 0.7 we have 3 config stages: init, config, and rc.local. i'm not sure if we need both 'init' and 'config'. [14:31] so you're relying on the OS to call these 4 stages for you? [14:32] right. but whether it calls me or not... they're still tasks we want to accomplish [14:32] and also offer as places to "hook" into boot [14:32] as the earlier you can do something, the less re-work you might have to do... ie, ideally you can write config for a service before the service would normally start. [14:33] (so rc.local isn't ideal). [14:33] anyway.. [14:33] just to be clear on this, why the need for multiple stages in cloud-init? What does it solve over one single stage? [14:33] because running everything "really early" might mean the network isn't up. [14:33] and running everything "really late" means you've lost your chance to make a change without restarting a service, or possibly other negative side effects. [14:34] so basically each run is a granular list of tasks to do, of config modules to execute? [14:34] in 0.7, yes. [14:35] in 2.0 we'd like this to be more dynamic, but still maintain that idea. [14:35] yeah, what would be ideal is to have a workflow that doesn't require multiple runs on windows and works with more stages on other platforms. [14:36] how about having a stage that chains all the other ones together? [14:36] well, that is what 'all' is. [14:36] probably windows will be the only platform that will use it. [14:37] and it's useful for a user wanting to re-try something. [14:37] oh, there is one already. [14:37] i think you kind of still have 2 stages there though.
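The stage model being described (named stages run in order, each a granular list of tasks, with 'all' chaining whichever stages a platform configures) could be sketched like so. The stage names follow the shell.py review under discussion, but the runner class and task contents are invented for illustration and are not the real cloud-init code.

```python
# Sketch of the stage model: a platform can override the stage list
# (e.g. Windows wanting only two stages), and 'all' just chains the
# configured stages in one process so no persistence is needed.
DEFAULT_STAGES = ["local", "network", "config", "final"]


class StageRunner:
    def __init__(self, tasks_by_stage, stages=None):
        self.tasks_by_stage = tasks_by_stage   # stage name -> list of callables
        self.stages = stages or DEFAULT_STAGES

    def run_stage(self, name):
        """Run every task configured for one stage, in order."""
        return [task() for task in self.tasks_by_stage.get(name, [])]

    def run_all(self):
        """The 'all' stage: chain every configured stage in order."""
        results = []
        for name in self.stages:
            results.extend(self.run_stage(name))
        return results
```

When the OS init system (e.g. systemd) drives boot, it would invoke `run_stage` once per stage at the right point in boot ordering; 'all' exists for single-run platforms and for a user re-trying something by hand.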
[14:37] one before your networking is up, that configures networking, and sets the hostname (and reboots) [14:38] and then one that runs after that in the new boot [14:38] yeah, currently hostname and network config are separated. [14:38] ok. [14:38] Don't know why, but I'm guessing because in the first phase we don't have networking info available. [14:38] But I have to check it out. [14:38] So two of them are definitely what we need. [14:39] this means that there should be a way to customize what a stage is. [14:39] yeah, we can probably work things out. [14:40] so when Odd_Bloke went looking at this he came to the problem where the 'network' stage ('locate and apply networking configuration') [14:40] would crawl a datasource and that datasource had a one-time read-only password in it. [14:40] the network stage would ideally only apply networking information, so then he'd have to store that password somewhere so that the 'config' stage could apply it. [14:41] yep, I understand now why persistence came into the discussion. [14:41] at risk of being called an idiot.. i don't think it's all that big of a deal. [14:41] i'd write the data to a file and read it back later. [14:42] if it's a root-owned file, you risk that password being read off the disk by a root user at some future point, or having that data found otherwise. [14:42] i don't advocate storing passwords in plaintext on a disk, but i also don't advocate setting root passwords [14:42] Odd_Bloke, ? thoughts? [14:43] Hold on, pulled in to another call. [14:43] claudiupopa, thoughts? [14:44] yeah, I'm not into security, but storing a file with potentially private data doesn't strike me as being a very good practice. [14:44] on linux we could encrypt it for inclusion in shadow, so that's better than plaintext. [14:45] also, if we have the ability to customize stages, then on windows we probably won't activate the persistence. [14:45] If that makes sense.
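smoser's "write the data to a file and read it back later" approach might look roughly like the sketch below: persist the datasource data between stages in a mode-0600 file so only root can read it back in the 'config' stage. The function names, path handling, and JSON format are assumptions for illustration; and, as noted in the discussion, a root-owned file still leaves the plaintext on disk.

```python
# Sketch: persist inter-stage state in a file readable only by root.
import json
import os


def persist_for_later(path, data):
    """Write stage data with 0600 permissions from the moment of creation."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(data, f)


def read_back(path):
    """Re-read the persisted data in a later stage (or later process)."""
    with open(path) as f:
        return json.load(f)
```

Passing the mode to `os.open` at creation time (rather than `chmod` afterwards) avoids a window where the file briefly exists with looser permissions.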
[14:45] Since the password is probably not used by the network module. [14:45] and we can also attempt to "shred" the file on disk, but i'm not sure how effective shred is on non-local and spinning disks. (http://superuser.com/questions/22238/how-to-securely-delete-files-stored-on-a-ssd) [14:46] claudiupopa, i think we can come up with some solution there, yeah. [14:46] you'll have to drive that :) [14:46] yeah. [14:46] Okay. [14:46] :-) [14:47] Right, back. [14:48] smoser: claudiupopa: So I had an idea: we could make the persistence invisible to consumers and only happen at the last possible moment. [14:48] That way, if we decide to run in stages, we persist the data. [14:48] But if we decide to run all at once, we don't. [14:48] (Or we only persist things that haven't been scrubbed out of the data) [14:49] i think that sounds reasonable. [14:49] makes sense [14:51] the registry could do that [14:51] you could register 'persistent' items, some with a 'secure' flag or something. [14:51] and then on exit write it out [14:52] I think we'd always need to write out everything if we were writing out. [14:52] And then consumers of secure data would have a way of expunging it. [14:52] right. [14:53] Which would remove it from memory and, if it had been persisted, remove it from disk as well. [14:54] what's a consumer in this case? Another endpoint or? [14:54] config that consumes the password [14:54] aa. [14:56] A potential future enhancement would be to allow you to configure cloud-init to never persist "secure" information. [14:56] But that would require a way of specifying what was "secure" data, and so is probably too much to bite off at this point. [14:58] If I want to add a single item to cloud_final_modules, shouldn't I be able to do that with a cloud-config file that has a merge_how directive? [15:00] larsks, is the module available somewhere already? [15:03] Yeah, it's one of the standard config modules.
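The registry idea sketched in the discussion above (register persistent items, mark some 'secure', write everything out on exit, and let consumers of secure data expunge it from both memory and disk) might look like this. Every class and method name here is invented for the example; this is not a design that was agreed in the meeting, just one reading of it.

```python
# Sketch: a persistence registry that is invisible to consumers and
# only touches disk at the last possible moment.
import json
import os


class Registry:
    def __init__(self, path):
        self.path = path
        self._items = {}      # name -> value, held in memory
        self._secure = set()  # names flagged 'secure'

    def register(self, name, value, secure=False):
        self._items[name] = value
        if secure:
            self._secure.add(name)

    def write_out(self):
        # Called on exit, and only if we're actually running in stages;
        # a single-run platform never calls this, so nothing hits disk.
        with open(self.path, "w") as f:
            json.dump(self._items, f)

    def expunge(self, name):
        # A consumer (e.g. config that consumes the password) calls this
        # once the value is used: drop it from memory, and rewrite the
        # on-disk copy if one was already persisted.
        self._items.pop(name, None)
        self._secure.discard(name)
        if os.path.exists(self.path):
            self.write_out()
```

This matches the constraint raised above that a write-out always writes everything still registered, with expunge as the mechanism for removing secure data from both places.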
I just want to enable it (without needing to re-specify everything that's in the system cloud.cfg) [15:04] I thought something like this might work: http://chunk.io/f/f36e3fad696f4e23bf0b3a3e7eb26182 [15:04] ...but that seems to be *replacing* the cloud_final_modules: key. [15:10] smoser: claudiupopa: So I think I have enough to start on implementing that configuration stuff (and therefore beginning on main work). [15:10] smoser: claudiupopa: Do we have any open discussion topics? [15:11] I don't think so. You mean working on stages? [15:12] Yeah. [15:12] one requirement I would like to have is the ability to customize them; we'll probably use something different on windows. [15:12] So the current stubs include 'all' which runs all the stages. [15:13] We could either add an 'all-on-windows', or make the stages that 'all' runs customisable using /etc/cloud.cfg (or the Windows equivalent). [15:13] But getting 'all' working is probably further down the line anyway. [15:13] So we can hash it out then. :) [15:13] Okay. :-) [15:15] ok. i think we're done for today. [15:15] Cool. [15:15] I'm going to try to get the meeting I have half an hour after this meeting starts rescheduled. [15:15] okay, guys. [15:16] I'll be waiting for a review on the data source api. ;-) [15:16] So that I don't have to disappear just when we're getting to the juicy stuff. :p [15:16] claudiupopa: Well I'm backporting stuff for cloud-init 0.7.x on Azure, which means I regularly have time to do other things while waiting for Azure. [15:16] So hopefully I'll get to that this afternoon. [15:18] Odd_Bloke: smoser: any thoughts on the merging-vs-replacing question? [15:18] larsks: I've never touched the merging stuff, so I have no idea, I'm afraid. :) [15:20] larsks, sorry.. reading [15:20] larsks, that actually looks like it *should* work [15:20] you can also use cloud-config-jsonp to do something similar [15:23] smoser: yeah, that's what I thought, too :/.
I'm looking for jsonp examples right now; if you have one handy that would be awesome... [15:24] I found https://bugs.launchpad.net/cloud-init/+bug/1316323. Let me give it a shot. [15:50] smoser: do you have a minute to take a gander at my cloud-config-jsonp attempt? [15:51] larsks, yea. sorry. give me a minute. [15:52] no rush. Here are the details: https://gist.github.com/larsks/8e0848d4e81c9e7cb066 [15:52] At the bottom is the jsonp file, based on the examples from the lp bug. Above that is the system cloud.cfg, and at the top is the error that cloud-init is throwing... [15:53] The error suggests that it is applying my patch against an empty configuration. [15:58] larsks, ok... so [15:58] from trunk, there is a tool: ./tools/ccfg-merge-debug [16:00] Okay. I'll take a look at that in a second. Did the jsonp in my gist look reasonable? [16:02] http://paste.ubuntu.com/12062712/ [16:03] so it seems to work for me... [16:03] maybe you don't have a stock config? [16:03] in which case maybe you have to create the entry first? [16:03] hm.. [16:03] No, there's definitely a stock config there. [16:03] yeah [16:03] odd [16:19] If it matters, I'm working with 0.7.5, because that's what centos has. [16:34] do you see any WARN in the log? [17:18] smoser: there are no warning messages in the log. Using that merge config just replaces cloud_final_modules. I can get the same behavior by booting with no user-data, and then running "cloud-init --file config-with-merge.yml modules --final" [17:22] larsks, i'm sorry, i'm not following [17:22] (a) there are no warnings, (b) as we have seen, any use of that merge configuration actually *replaces* cloud_final_modules rather than appending, and (c) I can reproduce that behavior on demand just by passing the merge config to cloud-init using --file. [17:24] That makes me sad, because I am trying to avoid spinning a new cloud-init rpm every time someone says, "we should enable config module in the default configuration".
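For reference, what larsks is trying to do with cloud-config-jsonp amounts to a JSON Patch (RFC 6902) "add" operation with a trailing "-" path segment, which appends to a list rather than replacing it. The mini-applier below is only an illustration of those intended semantics, not cloud-init's implementation, and "phone-home" is just an example module name.

```python
# A user-data file along these lines was the goal (modeled on the lp
# bug's examples; the exact accepted syntax may differ):
#   #cloud-config-jsonp
#   [{"op": "add", "path": "/cloud_final_modules/-", "value": "phone-home"}]

def apply_add_op(config, path, value):
    """Apply one JSON Patch 'add' op to a nested dict/list config."""
    keys = path.strip("/").split("/")
    target = config
    for key in keys[:-1]:
        target = target[key]
    if keys[-1] == "-":            # RFC 6902: "-" means append to the array
        target.append(value)
    else:
        target[keys[-1]] = value
    return config
```

The append form is exactly the merging-vs-replacing behavior larsks wanted from `merge_how`; as the end of the log notes, the catch in 0.7.5/0.7.6 was that such a patch did not apply against the builtin config.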
[17:24] I'm going to see if maybe this is a version-related issue and maybe a more recent cloud-init will work correctly. [17:25] look at /etc/cloud/cloud.cfg.d/05_logging.cfg [17:25] and comment out [17:25] - [ *log_base, *log_syslog ] [17:26] so it doesn't go to /dev/log (i'm not sure where that goes) [17:26] /dev/log ends up in the system journal. [17:26] (on these systems) [17:27] Let me try this with 0.7.6 first, because if that makes the problem go away I think we're actually in okay shape. [17:50] larsks, it seems like maybe it's a matter of unicode [17:50] Interesting. In what way? Fwiw, 0.7.6 exhibits the same behavior... [17:53] nah. it's not that. [17:53] :-( [17:54] it appears that you can't patch the builtin config [17:54] Ahhhhhhhhh, poop. [17:54] yeah, crappy [17:55] you can re-define the whole list though [17:55] Yeah. Thanks for lookin'! === rangerpb is now known as rangerpbzzzz