[11:41] <Odd_Bloke> claudiupopa: harlowja: I like the look of taskflow, but there are a couple of problems: (a) it isn't packaged for Python 3 in Debian/Ubuntu, and (b) it pulls in a lot of dependencies that I don't think we'll use (e.g. MySQL/PostgreSQL drivers, Zookeeper library).
[12:39] <Odd_Bloke> claudiupopa: harlowja: smoser: So it looks like there are a few hurdles to getting a python3-taskflow package building, but they probably aren't insurmountable for 16.04.
[12:41] <Odd_Bloke> But I don't really want to try and get it sorted unless we are actually going to use taskflow. :p
[12:42] <Odd_Bloke> I expect that we could also convert quite a few of those dependencies to Recommends/Suggests.
[13:19] <smoser> Odd_Bloke, one thing to do is just file a bug
[13:19] <smoser> and tell openstack team.
[13:19] <smoser> they have to solve it at some point.
[13:20] <Odd_Bloke> smoser: Ah, true.
[13:20] <Odd_Bloke> smoser: Will they have to solve it for/by 16.04, do you know?
[13:21] <smoser> not really, but i filed a ton of "no python3" bugs and they did get fixed.
[13:22] <Odd_Bloke> (It turns out there's also some pain with building both python-taskflow and python3-taskflow at the same time, python{,3}-networkx conflict with one another)
[13:22] <smoser> https://bugs.launchpad.net/ubuntu/+source/python-novaclient/+bug/1319145
[13:23] <smoser> Odd_Bloke,  you can even just file in debian :)
[13:24] <Odd_Bloke> smoser: I've filed https://bugs.launchpad.net/ubuntu/+source/python-taskflow/+bug/1492267
[13:25] <Odd_Bloke> Can I file a Debian bug and link them somehow?
[13:33] <smoser> you can, yeah.
[13:34] <smoser> use 'reportbug' to file debian bug
[13:34] <smoser> and then you can link it in launchpad
[13:41] <Odd_Bloke> Done.
[13:45] <smoser> Odd_Bloke, question..
[13:45] <smoser> do you know if i can specify a bzr repository to tox
[13:46] <smoser> i want to use simplestreams in a tox, but don't want to push to pypi
[13:49] <Odd_Bloke> smoser: You mean as a requirement to install?
[13:49] <smoser> as a 'deps', yeah
[13:49] <Odd_Bloke> You can do it with pip via https://pip.pypa.io/en/latest/reference/pip_install.html#bazaar
[13:50] <Odd_Bloke> So you should be able to get tox to do it.
[14:42] <openstackgerrit> Daniel Watkins proposed stackforge/cloud-init: Configure basic logging, and make it possible to log to console.  https://review.openstack.org/220536
[14:53] <openstackgerrit> Daniel Watkins proposed stackforge/cloud-init: Use a single source for version information.  https://review.openstack.org/220543
[14:54] <smoser> Odd_Bloke, you can. i did, its nice.
[14:54] <smoser> http://paste.ubuntu.com/12273678/
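For the record, a tox deps entry using pip's bazaar support would look roughly like this (the lp: branch path and egg name are illustrative guesses, not necessarily what smoser's paste used):

```ini
# tox.ini (sketch) -- pip understands bzr+ URLs in deps, so tox can
# install simplestreams straight from a branch instead of pypi
[testenv]
deps =
    bzr+lp:simplestreams#egg=simplestreams
```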
[14:55] <Odd_Bloke> :)
[16:11] <Odd_Bloke> harlowja: I'm playing around with TaskFlow; I was wondering if you have any recommendations for handing objects around as results/arguments.
[16:11] <Odd_Bloke> harlowja: I've just switched to the dir/files storage implementation, and it can't serialise a DataSource to disk (unsurprisingly).
[16:36] <Odd_Bloke> harlowja: OK, maybe a more general question: I want to be able to do things in discrete runs of cloud-init (e.g. run it once to configure networking, once to find a data source, and once to do configuration).
[16:36] <Odd_Bloke> harlowja: So I figured that I could do this by re-using a logbook (and maybe a flow, but I haven't managed to get that far yet).
[16:37] <Odd_Bloke> harlowja: But DirBackend.get_connection().get_logbooks() consistently returns no items.
[16:37] <Odd_Bloke> harlowja: There are loads of logbooks stored on disk, but it only looks for them if they are links.
[16:37] <Odd_Bloke> harlowja: Am I hitting a bug in my understanding, or a bug in taskflow? :p
[16:43] <harlowja> Odd_Bloke yo yo
[16:44] <Odd_Bloke> harlowja: o/
[16:44] <harlowja> \o
[16:44] <harlowja> ha
[16:44] <harlowja> so did u get the handing objects around working?
[16:45] <Odd_Bloke> harlowja: Nah, I moved on to trying to get all the different bits working together.
[16:45] <Odd_Bloke> Because if I can't get the different runs to see the same data, then it doesn't matter what I can store. :p
[16:45] <harlowja> so let's see here
[16:46] <harlowja> dr.josh on the case
[16:46] <harlowja> ha
[16:46] <Odd_Bloke> harlowja: I think this might be a legit bug; books are always written as directories, but _get_children will only ever return links.
[16:46] <harlowja> ya, it might be
[16:47] <harlowja> legit bug ftw
[16:47] <harlowja> ha
[16:47] <harlowja> i don't think many people have been using the dir backend, most afaik have used the sql one
[16:47] <harlowja> or the zookeeper one
[16:47] <harlowja> btw, as for dependencies, most are actually optional, depending on what u use
[16:48] <harlowja> i should probably reorganize them now that better optional dependency support exists in pip
[16:48] <Odd_Bloke> harlowja: Yeah, I figured they would be.
[16:48] <Odd_Bloke> So hopefully it's mostly just shifting them from Depends to Suggests.
[16:48] <harlowja> ya
[16:48] <harlowja> i think pbr only recently got support for this, so thats part of it
[16:48] <harlowja> * support for https://www.python.org/dev/peps/pep-0426/#extras-optional-dependencies
[16:48] <harlowja> *thats part of why
[16:49] <Odd_Bloke> Well, let me switch to using sqlite for now.
[16:50] <harlowja> k
[16:50] <harlowja> but now u guys know my other project :-P
[16:50] <harlowja> ha
[16:53] <harlowja> ok, https://bugs.launchpad.net/taskflow/+bug/1492392 opened for optional deps
[16:56] <harlowja> http://paste.openstack.org/show/445602/ another example Odd_Bloke
[16:56] <harlowja> that one dumps out the in-memory backend
[16:58] <Odd_Bloke> harlowja: So I'm trying to work out how I should do this resuming stuff.
[16:59] <Odd_Bloke> harlowja: Should I be building the entire cloud-init flow up front, and then different commands can do different parts of it?
[16:59] <Odd_Bloke> Or can the commands just build their own, smaller flows which can be extended/resumed by later commands.
[17:00] <harlowja> i've seen people do both
[17:00] <harlowja> building up-front allows for more parallelism
[17:00] <harlowja> building smaller ones and executing them allows for less
[17:02] <harlowja> building up front gets complicated if u need to do conditionals, this kind of programming model (the one taskflow has, typically called dataflow-like) does require some brain twisting, since its basically ahead-of-time definition of all the things :-P
[17:02] <harlowja> so i'd start out simple (which seems to be what most people do)
[17:04] <harlowja> Odd_Bloke here is an example that uses a prior runs data
[17:04] <harlowja> http://paste.openstack.org/show/445605/
[17:05] <harlowja> output @ http://paste.openstack.org/show/445608/
[17:05] <harlowja> ^ will avoid recomputation
[17:05] <Odd_Bloke> Aha, I think I'm missing the models.FlowDetail part.
[17:06] <Odd_Bloke> I was trying to do something messy with logbooks.
[17:06] <harlowja> ya, that might be part of it, the bug probably is still legit though
[17:10] <Odd_Bloke> harlowja: taskflow.exceptions.NotFound: No flow details found with uuid '6c1a3f09-8a58-426f-bffb-8262680fcb67'
[17:10] <Odd_Bloke> harlowja: Using sqlite on the first run with it added in.
[17:10] <harlowja> ya, k, the memory backend is sorta different in that case, so u have to do a little more for other backends
[17:11] <harlowja> basically https://github.com/openstack/taskflow/blob/master/taskflow/examples/resume_from_backend.py#L106 (those 5 lines)
[17:11] <harlowja> the in-memory stuff sorta automatically saves the provided flow detail when a backend is not provided (because its an in-memory one, and the default)
[17:20] <Odd_Bloke> harlowja: OK, so I'm persisting and fetching the same flow_detail now.
[17:20] <harlowja> k
[17:21] <Odd_Bloke> harlowja: What I have ATM is two Flows A and B.  A produces 'data_source' and B requires 'data_source'.  I'm running A on one invocation of cloud-init (i.e. e = engines.load(A, ...); ...; e.run()) and B on the second.
[17:21] <harlowja> k
[17:21] <Odd_Bloke> harlowja: But B always fails because 'no other entity produces said requirements'.
[17:22] <roychri> Im trying to find how to access my ec2 metadata (specifically the private_ipv4) so I can use it in bootcmd, runcmd or write_files. Any pointers?
[17:22] <harlowja> Odd_Bloke can u pastebin the code u have?
[17:23] <harlowja> Odd_Bloke https://bugs.launchpad.net/taskflow/+bug/1492403 also (for the dir stuff); fixing that right now
[17:23] <openstackgerrit> Daniel Watkins proposed stackforge/cloud-init: [WIP] TaskFlow for running shell commands  https://review.openstack.org/220593
[17:24] <harlowja> is that the pastebin ;)
[17:24] <Odd_Bloke> Yep. :p
[17:24] <harlowja> :)
[17:25] <harlowja> Odd_Bloke can u also try turning log level 5 on, that will show the symbol lookup resolution
[17:25]  * harlowja aka the 'trace' log level
[17:25] <harlowja> the lookup logging should show u why / what is being searched for and found
[17:26] <Odd_Bloke> "2015-09-04 18:26:14,869 [Level 5] taskflow.storage: Looking for 'data_source' <= 'data_source' for atom named: cloudinit.flows.get_config_flow.<locals>.PlaceHolderTask"
[17:26] <harlowja> (yes i know thats a non-standard log level, but some openstack people complained about it being too noisy, ha)
[17:27] <Odd_Bloke> harlowja: So if you look at flows.py, all works fine but doing the two separately doesn't.
[17:27] <Odd_Bloke> Which makes sense.
[17:27] <Odd_Bloke> But I don't know how I should go about doing those two separately.
[17:28] <harlowja> can u insert the search flow (even if it already finished) into the config_flow
[17:28] <harlowja> i'm pretty sure that it needs to know the names of providers, so if u don't insert it, it doesn't know about the other prior tasks that already saved stuff
[17:28] <harlowja> so u could either add dummy tasks, or just insert the search flow
[17:29] <Odd_Bloke> OK, that did work.
[17:29] <Odd_Bloke> But that makes inserting earlier steps tricky.
[17:29] <Odd_Bloke> So I'm wondering if I should actually just be building the whole flow up-front, and using a targeted graph thingie just to execute up to where I want?
[17:29] <harlowja> that could work too
[17:30] <harlowja> most of the advanced users use the graph stuff, its more powerful imho
[17:30] <Odd_Bloke> Or maybe it's just the way I'm defining things that makes it seem messy doing it this way..
[17:31] <harlowja> i think the graph stuff would make it better, then run up to a point
[17:31] <Odd_Bloke> Cool, I'll try that.
[17:31] <openstackgerrit> Daniel Watkins proposed stackforge/cloud-init: [WIP] TaskFlow for running shell commands  https://review.openstack.org/220593
[17:31] <Odd_Bloke> That's the fixed version of what I have now.
[17:31] <harlowja> basically the argument providers aren't persisted, aka, what each task provides, so without those being inserted, the lookup mechanism goes 'idk where that is' for later tasks that want to use it
[17:32] <Odd_Bloke> Right, that makes sense.
[17:32] <harlowja> now maybe those should be saved
[17:32] <harlowja> not especially hard, just isn't right now
[17:32] <Odd_Bloke> I'd assumed that they were shoved in to the same global dict as the stuff you give as store={...}.
[17:33] <Odd_Bloke> But I think we'll want to transition to the graph stuff anyway.
[17:33] <Odd_Bloke> So I'll try that.
[17:33] <Odd_Bloke> And if it's too hard then you can fix the library to do whatever I want it to do. ;)
[17:33] <harlowja> nah, during the compile() step the whole set of tasks is validated against who provides what (by name) so its a little more complicated :-P
[17:33] <harlowja> :(
[17:33] <harlowja> haha
[17:33] <harlowja> http://docs.openstack.org/developer/taskflow/engines.html#scoping is basically the 'lookup algo'
[17:34] <harlowja> http://docs.openstack.org/developer/taskflow/engines.html#taskflow.engines.action_engine.scopes.ScopeWalker.__iter__ (the meat of it)
[17:34] <harlowja> Odd_Bloke as long as other openstack projects aren't affected by those changes then i guess i can change it for u :-P
[17:34] <harlowja> ha
[17:34] <Odd_Bloke> Nah, they can work around us.
[17:34] <harlowja> openstack/other projects (rackspace uses this for http://www.rackspace.com/cloud/big-data afaik)
[17:34] <harlowja> lol
[17:35] <Odd_Bloke> See if their instances will boot if cloud-init breaks. ;)
[17:35] <harlowja> :-P
[17:35] <harlowja> have u figured out the parallel stuff yet btw?
[17:35] <Odd_Bloke> "the parallel stuff"?
[17:35] <harlowja> https://github.com/openstack/taskflow/blob/master/taskflow/examples/hello_world.py#L82
[17:35] <harlowja> executing non-dependent tasks at the same time
[17:35] <Odd_Bloke> Oh, not yet.
[17:36] <harlowja> k
[17:36] <harlowja> np
[17:36] <harlowja> thats step 1.5 of the 4 step taskflow program
[17:36] <Odd_Bloke> Everything's been one big linear flow up until now, so it wouldn't have made a difference anyway, right?
[17:36] <harlowja> lol
[17:36] <harlowja> right, it wouldn't have
[17:36] <Odd_Bloke> Phew.
[17:36] <Odd_Bloke> I understood something. :p
[17:36] <harlowja> ya, linear stuff not so parallelizable
[17:37] <harlowja> *where not so == not at all, lol
[17:39] <Odd_Bloke> harlowja: So how should I target this execution?
[17:39] <Odd_Bloke> I can't just say "until data_source is set", right?
[17:40] <harlowja> nah, u have to set the node to run 'up to'
[17:40] <harlowja> https://github.com/openstack/taskflow/blob/master/taskflow/patterns/graph_flow.py#L314
[17:40] <harlowja> so set_target(config_task_obj)
[17:40] <harlowja> or whatever
[17:48] <Odd_Bloke> harlowja: It works. \o/
[17:48] <harlowja> woot
[17:49] <harlowja> damn, amazing shit
[17:49] <harlowja> ha
[17:49] <Odd_Bloke> Who wrote this?
[17:49] <Odd_Bloke> They must be amazing!
[17:49] <harlowja> mostly me :-P
[17:49] <harlowja> ha
[17:49] <Odd_Bloke> Oh...
[17:49] <Odd_Bloke> Never mind.
[17:49] <Odd_Bloke> ;)
[17:49] <harlowja> :-P
[17:50] <harlowja> and thats just part 1 of its awesomeness
[17:50] <harlowja> lol
[17:50] <openstackgerrit> Daniel Watkins proposed stackforge/cloud-init: [WIP] TaskFlow for running shell commands  https://review.openstack.org/220593
[17:50] <harlowja> stuff that won't be likely used by cloud-init, but is used by others http://docs.openstack.org/developer/taskflow/jobs.html
[17:50] <harlowja> once u have resumption, now imagine tying a job to a set of tasks, and having that job be placed somewhere for others to work on
[17:51] <harlowja> and if those job 'workers' crash the job gets resumed by others that then try to finish it...
[17:51] <harlowja> annnnd magic
[17:51] <harlowja> ha
[17:51] <harlowja> Odd_Bloke https://docs.google.com/presentation/d/1EZoY4FE2SDjfCqMCgBRrwo7ovHF4vYZFGKjjvFvWHIE/
[17:51] <harlowja> that might be useful now that u have seen some of it, ha
[17:52] <harlowja> (a talk i did with the hp folks)
[17:53] <Odd_Bloke> harlowja: That demo sucked.
[17:54] <Odd_Bloke> But the rest of the slideshow was helpful.
[17:54] <harlowja> lol
[17:54] <harlowja> https://github.com/openstack/taskflow/blob/master/taskflow/examples/99_bottles.py#L39 :-P
[17:54] <harlowja> ^ the demo, ha
[17:54] <harlowja> DIY
[17:54] <harlowja> lol
[17:54] <harlowja> DIY demo
[17:58] <roychri> Im trying to find how to access my ec2 metadata (specifically the private_ipv4) so I can use it in bootcmd, runcmd or write_files. Any pointers?
[17:59] <Odd_Bloke> roychri: You could hit the EC2 metadata server yourself.
[17:59] <roychri> using curl?
[18:00] <roychri> that will work in runcmd, but write_files ?
[18:00] <Odd_Bloke> roychri: I don't know that you can do it in write_files.
[18:00] <Odd_Bloke> roychri: Though you can, of course, write out files with your runcmd.
[18:00] <roychri> ok, I thought the metadata could be available thru some variables of some kind...
[18:00] <Odd_Bloke> roychri: You're right, let me dig up the docs on what's available.
[18:01] <roychri> I skimmed thru the datasources docs, I couldn't find anything there...
[18:03] <Odd_Bloke> roychri: Actually, looking at it, you can't do substitution with write_files.
[18:03] <Odd_Bloke> cc_final_message lets you do some substitution.
[18:03] <roychri> Maybe I should just use #!/bin/bash instead of #cloud-config
[18:04] <Odd_Bloke> roychri: You can send multi-part cloud-config over.
[18:04] <Odd_Bloke> roychri: Which would let you have the best of both worlds.
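Something along these lines would do it (untested sketch; the output path is arbitrary):

```yaml
#cloud-config
runcmd:
  # string entries run via the shell, so the redirect works; curl fetches
  # the private IP from the EC2 metadata service and writes it to a file
  - curl -s http://169.254.169.254/latest/meta-data/local-ipv4 > /etc/my-private-ipv4
```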
[18:04] <roychri> Im not there yet :)
[18:05] <roychri> I'll get my first instance to work, then I'll look at multipart.
[18:05] <openstackgerrit> Daniel Watkins proposed stackforge/cloud-init: [WIP] TaskFlow for running shell commands  https://review.openstack.org/220593
[18:09] <Odd_Bloke> harlowja: Could you do a sanity check of ^ before I spend any time firming it up with tests/logging/etc.?
[18:09] <harlowja> odd is 'PlaceHolderTask' going to stay?
[18:09] <harlowja> Odd_Bloke ^
[18:10] <harlowja> otherwise seems like a good start to me
[18:10] <Odd_Bloke> harlowja: That represents "all the configuration", so I'll probably move that out to a separate module to be a start for that.
[18:10] <harlowja> k
[18:11] <Odd_Bloke> harlowja: How do you feel about the 'select an arbitrary UUID' approach to identifying flows?
[18:11] <harlowja> unsure, mixed feelings :-P
[18:12] <harlowja> i prefer useful names :-/
[18:12] <Odd_Bloke> harlowja: They require a UUID, though.
[18:12] <harlowja> for the flow detail storage part, ya
[18:13] <Odd_Bloke> Yeah, that's the bit I meant.
[18:13] <harlowja> so i guess arbitrary UUID is fine then, its really used to reconnect with later runs (if the same names are used)
[18:14] <Odd_Bloke> harlowja: Actually, UUIDs aren't just random, I might be able to do something more sensible.
[18:14] <Odd_Bloke> s/aren't just/don't have to be/
[18:15] <Odd_Bloke> Blargh, they always have a part which is a UUID.
[18:17] <Odd_Bloke> In [37]: uuid.UUID(bytes=b'cloudinitiscool!')
[18:17] <Odd_Bloke> Out[37]: UUID('636c6f75-6469-6e69-7469-73636f6f6c21')
[18:17] <harlowja> lol
[18:18] <harlowja> so there is really no check that u are providing a uuid, btw, u can probably just provide 'cloudinitiscool!' as flow detail uuid arg
[18:18] <Odd_Bloke> Oh, cool.
[18:19] <harlowja> not saying thats the best, but perhaps this could be tweaked in taskflow, lol
[18:19] <harlowja> to be named 'ident' or something instead
[18:19] <harlowja> vs uuid, lol
[18:19] <harlowja> but bygones be bygones, ha
[18:20] <Odd_Bloke> Until that happens, I'll probably keep it as a UUID.
[18:20] <Odd_Bloke> Because who knows if you'll end up going in the other direction and enforcing UUIDs? :p
[18:20] <harlowja> :-/
[18:20] <harlowja> ha
[18:20] <harlowja> ok, https://review.openstack.org/#/c/220607/ fixes the get_logbooks for dir(s)
[18:22] <harlowja> thx for finding that Odd_Bloke
[18:25] <Odd_Bloke> harlowja: No worries, thanks for fixing it. :)
[18:25] <harlowja> sureee
[18:26] <harlowja> btw
[18:26] <harlowja> uuid.uuid5(uuid.NAMESPACE_URL, "https://launchpad.net/cloud-init")
[18:26] <harlowja> that seems to always create
[18:26] <harlowja> UUID('bbd9656b-aed9-5912-9702-7ddde940f8f6')
[18:26] <harlowja> so that could be your uuid :-P
[18:26] <Odd_Bloke> Oh, nice.
[18:26] <harlowja> (pick other url as u want, ha)
[18:26] <Odd_Bloke> I hadn't seen that there were hardcoded namespaces you could use.
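Putting both tricks from the conversation together (the URL choice is arbitrary; the expected values are the ones shown above):

```python
import uuid

# A name-based (version 5) UUID is deterministic for the same namespace and
# name, so it makes a stable, reproducible flow-detail identifier.
FLOW_UUID = uuid.uuid5(uuid.NAMESPACE_URL, 'https://launchpad.net/cloud-init')
assert str(FLOW_UUID) == 'bbd9656b-aed9-5912-9702-7ddde940f8f6'

# Alternatively, any 16 bytes can be packed into a UUID directly:
vanity = uuid.UUID(bytes=b'cloudinitiscool!')
assert str(vanity) == '636c6f75-6469-6e69-7469-73636f6f6c21'
```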
[18:32] <roychri> Why does runcmd have two formats (string and array)? In what use case I should use one over the other?
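Regarding the two runcmd formats: a string entry is passed to a shell (so redirects, pipes, and globs work), while a list entry is exec'd directly with no shell parsing (safer for arguments containing spaces or shell metacharacters). A sketch:

```yaml
#cloud-config
runcmd:
  # string form: run via the shell, so the redirect is interpreted
  - echo hello >> /tmp/greeting
  # list form: argv exec'd directly; no shell unless you invoke one yourself
  - [sh, -c, 'echo "also hello" >> /tmp/greeting']
```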
[18:34] <harlowja> Odd_Bloke and someone from yahoo is taking over https://bugs.launchpad.net/taskflow/+bug/1492392 so that hopefully will get done soon too
[18:34] <harlowja> (the splitting the optional/non-optional taskflow deps up)
[18:35] <Odd_Bloke> Ah, yeah, that would make sorting it out in packaging downstream a lot easier.
[18:35] <Odd_Bloke> harlowja: Thanks!
[18:35] <harlowja> ya
[18:35] <harlowja> np
[18:35] <harlowja> smoser u been keeping track of all of this?? ;)
[18:36] <harlowja> shit about to get cray cray
[18:36] <harlowja> lol
[18:36] <harlowja> ha
[18:36] <smoser> been keeping track of nothing
[18:36] <harlowja> lol
[18:36] <harlowja> woot
[18:46] <harlowja> Odd_Bloke also #openstack-state-management channel if u ever don't find me for questions :-P
[18:46] <harlowja> someone else in there might know too, ha