/srv/irclogs.ubuntu.com/2015/07/30/#juju-dev.txt

menn0	davecheney: I knew that but didn't think you were around	00:01
davecheney	got back on monday	00:02
davecheney	took a few days off to recover	00:02
davecheney	now back to hunting races	00:02
davecheney	\o/	00:02
anastasiamac	wow! first time experiencing earthquake in Brisbane... 5.3 @ 35km of coral sea :D	00:25
mup	Bug #1479653 opened: state depends on system clock <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1479653>	07:29
TheMue	dimitern: ping	08:22
dimitern	TheMue, pong	08:23
TheMue	dimitern: feeling better today, good for work, but not for hangout. but getting closer with the test problem. assigning a new IP address to a machine that exists before (!) the worker started leads to the wanted event. but if the machine is created after the worker no events are raised. interesting behavior, will add more logging	08:25
TheMue	dimitern: wondered like you yesterday, that the 0.1.2.9 is never shown in the logs.	08:27
dimitern	TheMue, I'm glad you're feeling better	08:27
TheMue	dimitern: now simply created a 0.1.2.10 and added it to the 1st machine initialized in SetUp it funnily is logged	08:27
dimitern	TheMue, hmm this feels like the EntityWatcher is forwarding the initial event, but then it's not triggering on change?	08:28
TheMue	dimitern: thanks, thankfully our job allows to work below a blanket ;)	08:28
TheMue	dimitern: the change comes later, but it is related to an already existing machine	08:29
dimitern	TheMue, :) yeah	08:29
dimitern	TheMue, not quite sure I follow you - please describe the steps when the event is triggered (and when it's not) in detail	08:29
TheMue	dimitern: IMHO the state.WatchIPAddresses() should react on state.AddIPAddress()	08:30
TheMue	dimitern: ok, prepare a small paste	08:30
dimitern	TheMue, hmm wait	08:31
dimitern	TheMue, AddIPAddress adds an alive address, right?	08:31
dimitern	TheMue, doesn't the entity watcher only trigger on dead addresses?	08:32
TheMue	dimitern: http://paste.ubuntu.com/11965055/	08:32
TheMue	dimitern: IMHO not, but would have to look. It's only a mapping StringWatcher, which maps the received string values to their according entity tags	08:34
TheMue	dimitern: I wondered, because another existing test adding a new IP doesn't fail. but it uses the existing machine. so I added this fragment to my failing test and found the astonishing behavior.	08:35
dimitern	TheMue, looking at the code to remind myself what was implemented	08:39
TheMue	dimitern: /me too, digging deeper and adding more logs (have to remove them afterwards, phew)	08:40
dimitern	TheMue, so the worker starts the watcher on SetUp ?	08:51
dimitern	TheMue, show me your latest branch code please	08:51
TheMue	dimitern: the branch is here: https://github.com/TheMue/juju/tree/addresser-worker-using-api	08:52
dimitern	TheMue, so far, looking at state, api, and apiserver watcher code I can't see any obvious flaws that might be causing it, so it must be the way worker uses it	08:53
alexisb	fwereade, I will need to steal thumper for 30 minutes	08:56
TheMue	dimitern: ah, wait, I've got an idea. may be due to the mix of strings watcher and entity watcher	08:57
TheMue	alexisb: heya o/	08:57
dimitern	TheMue, I don't see your worker implementing SetUp, where it should start the watcher	08:58
alexisb	heya TheMue	08:58
dimitern	TheMue, ah, I saw it below	08:58
TheMue	dimitern: yep	08:58
dimitern	alexisb, hey, btw there's a 30m scheduling conflict for OS+juju call and networking roadmap one	08:59
dimitern	alexisb, I guess the OS call is more important than the other one?	09:00
alexisb	dimitern, I am not sure what the OS+juju call?	09:02
dimitern	alexisb, the one with jamespage	09:02
alexisb	dimitern, you are good, they are the same	09:03
dimitern	alexisb, I see :) cheers then	09:03
dimitern	TheMue, does waitForInitialDead work in your branch (TestWorkerRemovesDeadAddresses) ?	09:11
TheMue	dimitern: yes, all other tests are fine	09:12
dimitern	TheMue, waitForInitialDead should (almost?) immediately see 2 dead IPs - 0.1.2.4 and 0.1.2.6, right?	09:13
TheMue	dimitern: yes, as those belong to machine2	09:14
dimitern	TheMue, fetching the IPAddress from state and then calling EnsureDead on it is weird	09:18
dimitern	TheMue, it's not what will happen in real life - the address being dead is a side effect of the machine it's assigned to getting destroyed	09:19
TheMue	dimitern: you mean in TestWorkerRemovesDeadAddress? that's what I found in the existing tests	09:19
TheMue	dimitern: the failing one is TestMachineRemovalTriggersWorker	09:19
dimitern	TheMue, but, machine removal should indeed trigger "Set all allocated ips to dead for this machine id"	09:20
TheMue	dimitern: so the original test already has been wrong? can remove it then	09:20
dimitern	TheMue, no :) let's think first why it's failing	09:20
dimitern	TheMue, so only TestMachineRemovalTriggersWorker fails?	09:21
TheMue	dimitern: yes, and the IP address is dead, see the asserts following the machine removal	09:21
TheMue	dimitern: exactly, the rest works fine	09:21
dimitern	TheMue, then there's the problem :)	09:21
TheMue	dimitern: already the adding of the new IP to the new machine isn't reported (at least as alive) while the reporting to a machine existing before the worker is started is reported (see pastebin, the second IP is reported)	09:22
dimitern	TheMue, see, removing a machine should include an op to ensure all alive ips of that machine end up dead without needing to do anything else (e.g. expecting the provisioner to see the machine getting dead and marking allocated ips as dead)	09:23
dimitern	TheMue, so I'd dig more into the list of ops machine.EnsureDead() and/or Remove() includes w.r.t. ipaddressesC	09:24
TheMue	dimitern: yeah, just started a state browsing of ipaddressesC to look exactly there ;)	09:24
dimitern	TheMue, adding a new alive IP (not need to even allocate it to machine) should trigger the IP addresses watcher, but should be ignored by the worker I think	09:26
TheMue	dimitern: as it is alive, yes. the logs should show it like "[LOG] 0:00.351 DEBUG juju.worker.addresser IP address ipaddress-03c48ed1-c389-4930-82e1-1df101fb7ab2 is not dead (life "alive"); skipping"	09:27
dimitern	TheMue, I seriously hope we dial down this log message to TRACE level (imagine thousands of IPs being reported when rapidly provisioning machines)	09:29
TheMue	dimitern: so far I haven't touched the log levels I found, but can do it while finishing the code	09:30
dimitern	TheMue, if I've suggested adding it at DEBUG, sorry	09:30
TheMue	dimitern: the modification on worker level has been less than thought, mostly moving the release stuff to server side and change behavior from single to bulk	09:31
dimitern	TheMue, state.ensureIPAddressDeadOp looks dangerous on its own - without an assert isAliveDoc (and the corresponding handling of ErrAborted where it's called) it's potentially overwriting the life field of the doc indiscriminately	09:33
TheMue	dimitern: I see. so the original intention has been to set the address to dead regardless of its life status? what's happening, when it isn't alive, so dying or already dead?	09:37
TheMue	dimitern: one usage is in Machine.Remove()	09:37
dimitern	TheMue, yeah, that's the one without an assert set, the other (with isAliveDoc) is in IPAddress.EnsureDead	09:38
TheMue	dimitern: and one is in IPAddress.EnsureDead() with an assert isAliveDoc	09:38
TheMue	h5	09:38
* TheMue loves mongo transactions <SARCASM OFF/>		09:39
=== ashipika1 is now known as ashipika
dimitern	TheMue, have you tried: 1) adding s.State.StartSync() just after line 234 (asserting the addr is dead); 2) if that doesn't work, try removing one or both of the other StartSync() calls before that, but leave the one introduced in 1)	09:46
dimitern	TheMue, looking at the sequence of ops, it looks like waitForReleaseOp is timing out because the apiserver has no chance of observing the address being dead before reading from the dummy ops chan	09:48
dimitern	TheMue, however, since opsChan is buffered, that shouldn't be the case (unless buffer size of 10 is somehow not enough)	09:49
TheMue	dimitern: already tried with a larger buffer, and played with the StartSync()s. not sure if I've done it how you've described, so I'll do now	09:50
dimitern	TheMue, need to get in a call, let's continue later	10:00
TheMue	dimitern: ok	10:00
perrito666	morning	11:29
* perrito666 is devoid of his internet connection		11:29
dimitern	TheMue, any luck isolating the issue?	11:59
TheMue	dimitern: not yet done, but deeper, heads down in the lifecycle watcher ;) wondering about its merge()	12:00
TheMue	dimitern: one moment, showing you an interesting log fragment	12:00
dimitern	TheMue, ok	12:01
TheMue	dimitern: http://paste.ubuntu.com/11966200/	12:02
TheMue	dimitern: so, here the first four addresses are the normal ones	12:02
TheMue	dimitern: the 0.1.2.9 is the one for the new created machine	12:02
TheMue	dimitern: the 0.1.2.10 is instead created for the existing machine	12:03
TheMue	dimitern: why is the 0.1.2.9 in updates, but not in the updated ids anymore? the only step between is the merge() of the lifecycle watcher and here I'm looking now	12:05
dimitern	TheMue, it looks to me the lifecycle watcher is receiving entities with wrongly prefixed IDs	12:05
TheMue	dimitern: the updates map contains all known IPs so far, all with the env id as prefix	12:06
TheMue	dimitern: so I have to see what merge() exactly does	12:07
dimitern	TheMue, hmm that's right - the ids are ok at that point	12:08
* TheMue never has been so deep in our watcher. this mix of differently formatted events, transformations, mappings, online queries etc seems weird sometimes		12:08
dimitern	TheMue, merge should combine the updates with the entities with known life and produce ids for the changes	12:09
TheMue	dimitern: and you can imagine the large number of debug statements lol	12:09
TheMue	dimitern: and here it drops the 0.1.2.9, maybe after its machine has been removed <LOOKING />	12:10
dimitern	TheMue, nothing should just remove ips without releasing them, the machine removal just triggers "set to dead"	12:13
perrito666	TheMue: any part of juju, upon detailed inspection, looks weird	12:14
TheMue	perrito666: rofl thanks for motivational remarks from Argentina	12:14
dimitern	TheMue, and I don't get why 0.1.2.3 is even there	12:14
TheMue	perrito666: heya btw	12:14
perrito666	TheMue: hi :)	12:14
TheMue	dimitern: you mean the received one?	12:16
dimitern	TheMue, yeah	12:16
TheMue	dimitern: you're right, the machine as well as its IPs aren't touched during the test	12:18
TheMue	dimitern: http://paste.ubuntu.com/11966320/ to understand where and what I'm logging in the lifecycle watcher	12:20
* TheMue should add a debug log remover based on the comments above to his juju development tool ...		12:22
dimitern	TheMue, it seems more and more like a sync issue to me	12:29
dimitern	TheMue, have you tried dropping all StartSync() calls?	12:29
TheMue	dimitern: yes, the log is w/o sync as well as w/ sync after the assert that the ip address is dead	12:32
TheMue	dimitern: doesn't change anything	12:32
TheMue	dimitern: and as I said, the IP assigned to the new machine is dropped in the notifications while the one for an existing machine is kept	12:33
TheMue	dimitern: look how different the .9 and the .10 behave	12:33
TheMue	dimitern: a theoretical question	12:35
TheMue	dimitern: oh, forget, got it while formulating it	12:36
dimitern	TheMue, :)	12:36
dimitern	TheMue, weird issue indeed	12:37
* dimitern hates debugging watchers		12:37
* TheMue too		12:37
TheMue	dimitern: boah, no, you don't get it	13:03
TheMue	dimitern: I took a deeper look at merge() with the individual states of the IPs etc	13:04
TheMue	dimitern: and I've seen that the .9 always is dead	13:04
TheMue	dimitern: and never known as alive	13:04
TheMue	dimitern: so no removal	13:04
TheMue	dimitern: then I thought we're too fast, dead simple	13:05
TheMue	dimitern: and for testing I added a 30secs pause between adding and machine removal	13:05
TheMue	dimitern: and now - TADDAAH - the test passes	13:05
TheMue	dimitern: so, yes, it is a syncing problem, but different from State.StartSync()	13:07
dimitern	TheMue, sorry, was afk; catching up..	13:12
dimitern	TheMue, that sounds like the desired behavior for watchers (consolidating multiple changes between two events)	13:13
TheMue	dimitern: isn't the API watcher a kind of polling?	13:14
TheMue	dimitern: because we now don't have a direct state watcher anymore, but using the API	13:14
dimitern	TheMue, ok, how about this: instead of sleeping for 30s, just add a short attempt loop between machine removal and adding 0.1.2.9 and setting it to dead	13:19
TheMue	dimitern: sure, the hard coded sleep just has been a test	13:20
dimitern	TheMue, cheers	13:25
katco	wwitzel3: natefinch: ericsnow: ping	14:43
ericsnow	katco: heyheyhey	14:43
wwitzel3	katco: pong	14:43
katco	o/	14:43
katco	did you guys get my email?	14:43
katco	how are we looking for iteration work?	14:44
wwitzel3	katco: good, the state/persistence story won't land	14:44
wwitzel3	katco: all others will be done by EOD Friday	14:44
wwitzel3	katco: of the pointed stuff that is	14:45
katco	wwitzel3: i can live with that :)	14:45
wwitzel3	katco: we have some low-prio overhead that probably won't get done	14:45
katco	wwitzel3: understood... glad the pointed work is mostly landed	14:46
katco	wwitzel3: ericsnow: ty, just wanted to check in!	14:46
katco	we'll talk about the sprint sometime after i get back. lots of interesting stuff	14:46
ericsnow	katco: sweet	14:47
katco	ericsnow: wwitzel3: k gotta run to another meeting... ty again, and if i don't talk to you before, have a great weekend	14:47
wwitzel3	katco: dibs on the Python library, lol	14:47
katco	rofl	14:47
ericsnow	wwitzel3: dang it!	14:47
ericsnow	wwitzel3: we should pair up :)	14:48
wwitzel3	katco: you too, safe travels	14:48
ericsnow	katco: ditto	14:48
thumper	o/ sinzui	15:12
thumper	sinzui: been working with fwereade on this blocker issue	15:13
thumper	just asked the bot to land it	15:13
thumper	it has been tested by Ed to deploy a complex openstack bundle that uses leadership a lot	15:13
thumper	and it all worked	15:13
thumper	\o/	15:13
thumper	also, I have run all the tests locally, and they at least pass here	15:13
thumper	first time too	15:13
* thumper crosses fingers for the bot to do its thing		15:14
thumper	hello?	15:14
thumper	anyone alive in here?	15:14
* thumper streaks through the empty channel		15:15
* ericsnow averts eyes		15:15
alexisb	ericsnow, all of us in annecy have to see it in real life	15:16
mgz	thumper: sorry, I wasn't sure if there was actually a question in all that	15:16
* alexisb is blinded		15:16
thumper	mgz: there wasn't	15:16
ericsnow	alexisb: :)	15:16
mgz	thumper: okay then, carry on streaking :)	15:16
thumper	but I do like to know that I'm not just talking to myself	15:16
wwitzel3	lol	15:16
alexisb	mgz, as soon as we land it is release time	15:17
thumper	well	15:17
thumper	once it passes CI	15:17
wwitzel3	I've given up and just assume I'm always talking to myself	15:17
alexisb	thumper, details details ;)	15:17
alexisb	mgz, what thumper said	15:17
thumper	wwitzel3: so... tycho here is doing some lxd container stuff for us	15:17
sinzui	thumper: was OTP. CI is ready for your landing	15:17
wwitzel3	thumper: awesome	15:17
thumper	sinzui: coolio	15:17
wwitzel3	thumper: what stuff?	15:18
thumper	container/lxd	15:18
sinzui	thumper: alexisb mgz: Robie had a brilliant idea to solve the deployer/quickstart/pyjujuclient problem. Maybe we can include those plugins in the juju-core source package to ensure lock-step delivery of compatible plugins to trusty (and everywhere)	15:19
wwitzel3	thumper: right, but what about it is being done for us, I mean	15:19
alexisb	wwitzel3, tych0 is adding lxd support to juju-core	15:21
alexisb	sinzui, thumper and mramm have been pondering that	15:21
alexisb	and I am sure would like your input	15:21
wwitzel3	alexisb: oh, nice :)	15:22
sinzui	alexisb: we can release as we have done in the past. But I think we need to change the policy to release blessed revisions that have passed compatibility and reliability tests. Those tests take days to run and mostly run on weekends when CI has more resources	15:22
tych0	thumper: github.com/tych0/juju lxd-container-type	15:23
perrito666	anyone more or less familiar with environ.Config?	16:11
TheMue	perrito666: don't know if I can help you, but ask	16:15
perrito666	I am looking at the implementation because I might want to add a key but I am not sure I understand it properly	16:16
TheMue	perrito666: regarding schema and default values?	16:17
mup	Bug #1479889 opened: Test failure com_juju_juju_featuretests.TearDownTest.pN44_github.com_juju_juju_featuretests.dblogSuite <ci> <intermittent-failure> <ppc64el> <test-failure> <unit-tests> <juju-core:Triaged> <juju-core trunk:Triaged> <https://launchpad.net/bugs/1479889>	17:06
redelmann	Hi there.	17:22
redelmann	Need some help upgrading juju 1.23 to 1.24.3	17:23
redelmann	1.23.3 to 1.24.3	17:24
perrito666	redelmann: what is going on?	17:24
redelmann	perrito666, hi.	17:25
redelmann	perrito666, i was trying to upgrade juju in maas environment	17:25
redelmann	perrito666, after running "juju upgrade-juju"	17:25
redelmann	perrito666, machine0.log says: http://paste.ubuntu.com/11967995/	17:26
redelmann	perrito666, Well, after that I can't run any juju command	17:28
redelmann	perrito666, that's the problem :P	17:28
perrito666	mm, are the machines still there? if so what is on the logs for machine 0? (Assuming you can access it)	17:29
redelmann	perrito666, all machines are online, machine0.log: http://paste.ubuntu.com/11967995/	17:30
perrito666	have you tried restarting the juju service by hand?	17:31
redelmann	perrito666, yes, and nothing happened	17:32
redelmann	perrito666, same log	17:35
perrito666	mm, strange, I think you will have to make some changes by hand	17:35
redelmann	perrito666, "ls /var/lib/juju/tools": http://paste.ubuntu.com/11968047/	17:36
redelmann	perrito666, agents tools are there, but not linked	17:36
perrito666	there is more than that to updates :)	17:37
redelmann	perrito666, well i suppose that moving links will not fix anything	17:38
perrito666	redelmann: I cannot really recall what change you need to do	17:39
redelmann	perrito666, mhhh.... look at this:	17:39
redelmann	perrito666, http://paste.ubuntu.com/11968067/	17:40
perrito666	redelmann: the rest are links	17:41
redelmann	perrito666, :P i see	17:41
redelmann	perrito666, couldn't read wrench directory: stat /var/lib/juju/wrench: no such file or directory	17:43
redelmann	perrito666, that is nothing to worry about?	17:43
perrito666	that is not a problem, wrench is something for development	17:43
perrito666	it is used to introduce failures into juju	17:44
redelmann	perrito666, i suppose that: rsyslogd-2039: Could no open output pipe '/dev/xconsole': No such file or directory [try http://www.rsyslog.com/e/2039 ]	17:54
redelmann	perrito666, is not a problem too	17:54
natefinch	ericsnow: I'd love it if you could review the status stuff again today. I think it should be all set.	18:09
ericsnow	natefinch: will do	18:09
=== kadams54 is now known as kadams54-away
=== kadams54_ is now known as kadams54
redelmann	perrito666, Ok, fixed by hand	18:40
perrito666	hey, I was afk, how did you?	18:40
marcoceppi	katco: could you or someone from moonstone look into this? https://bugs.launchpad.net/juju-core/+bug/1478156	18:40
mup	Bug #1478156: summary format does not give enough details about machine provisioning errors <charmers> <juju-core:Triaged> <https://launchpad.net/bugs/1478156>	18:40
marcoceppi	katco: ugh, nvm	18:40
marcoceppi	I see it's marked as high now, I had old data on the page	18:41
natefinch	wwitzel3: you around?	18:59
natefinch	ericsnow: you around?	19:03
ericsnow	natefinch: yep	19:03
natefinch	ericsnow: I was trying to work out what exactly I needed to do for my kanban card about local file images and docker.... and it seems like there's no such thing as a local file image... they're all stored in a local docker repository and behave exactly like remote ones.... there's no "docker run file://home/nate/mydockerimage"	19:05
natefinch	at least as far as I can tell	19:05
ericsnow	natefinch: the idea is, for local file images, to load them first	19:06
mup	Bug #1479931 opened: Juju 1.22.6 cannot upgrade to 1.24.3/1.24.4 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1479931>	19:19
mup	Bug #1479942 opened: Reference to undefined method <ci> <intermittent-failure> <ppc64el> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1479942>	19:19
natefinch	ericsnow: sort of a problem... the name of the tar file bears no relation to the name of the image.	19:22
natefinch	ericsnow: so if we're given foo.tar as something to load and run... we can load it, but we won't know what it is called once it's in the registry. I guess we could look in the tar file and figure it out :/	19:23
ericsnow	natefinch: wwitzel3 will have to take it from here; I don't know enough about that	19:23
natefinch	ericsnow: ok... actually, looks like a tar can have multiple images, so it even moreso won't work	19:28
wwitzel3	natefinch: yeah, looking at some of the other tools out there that wrap docker, they take an inventory first, using docker images	19:40
wwitzel3	natefinch: then they load it, and parse the diff	19:41
natefinch	wwitzel3: doesn't solve the problem if more than one image is loaded from the tar file	19:41
wwitzel3	natefinch: we could also use the remote API instead of wrapping the cmd	19:41
wwitzel3	natefinch: it does, since we would parse out both of them and they can only specify a single image name in the process definition	19:42
natefinch	wwitzel3: but I thought the feature was that the image name is the tar file	19:42
ericsnow	natefinch: gave you one last review (LGTM with some minor caveats)	19:42
natefinch	ericsnow: thanks	19:43
ericsnow	natefinch: np	19:43
wwitzel3	natefinch: well, in that case we could launch and register both	19:44
wwitzel3	natefinch: or we could leave image as is and make the file to load a type specific arg	19:44
natefinch	wwitzel3: so, does this seem like a useful feature? Is the idea that someone will package a tar file in their charm?	19:45
wwitzel3	natefinch: I can't remember the reason for it, it was based on some feedback we got iirc	19:47
natefinch	wwitzel3: seems like it needs to be better defined before we work on it. I don't want to guess at the correct implementation.	19:50
wwitzel3	natefinch: I don't even see a card for it	19:51
wwitzel3	natefinch: oh, there it is, overhead	19:51
natefinch	wwitzel3: yep	19:51
wwitzel3	natefinch: so if the file replaces the image name, then it won't matter how many images are in the tar, we would just load and launch any it contained	19:53
natefinch	wwitzel3: I don't think that's a good idea... in all other cases, the process specification is for a single process - you give it a command to run, etc. I think it would be surprising for a single process definition in the yaml to result in multiple registered processes.	19:54
natefinch	wwitzel3: maybe if we added a LoadFrom field in the process info that would tell Juju to load the image before launching it	19:55
natefinch	or maybe we need a separate step that loads all images before we start launching processes	19:56
wwitzel3	natefinch: I don't think it would be a surprise if I, the charm author provided a tar that had multiple images in it, but we shouldn't be designing this interaction anyway. We should probably ping lazyPower and whit about what that interaction would look like and what they want :)	19:56
lazyPower	hello o/	19:56
lazyPower	in office hours	19:56
lazyPower	will circle back when we're out, because i know what you're talking about and want to be a part of it	19:56
natefinch	lazyPower: awesome	19:57
natefinch	lazyPower: I love a man that knows what he wants ;)	19:57
lazyPower	natefinch: ok my session is over, whaaatt would we like to do with process management in charming? :) i have some ideas already for example workloads to deliver with this.	20:22
lazyPower	ah i see, this is wrt multi processes	20:23
natefinch	lazyPower: well, so, I had a work item to support loading images from files on disk (a la docker's load from a tar)	20:23
lazyPower	ok, i dont see shipping multiple images in the charm, i see more shipping with a dockerfile/compose-formation, and building on the host during deploy, or pulling from a private registry	20:24
lazyPower	thats the established pattern. Do we want to advocate for fatpacking images in a charm?	20:25
natefinch	I don't know that we want to make that standard practice, but some people may certainly ask for it. Fat charms are popular.	20:26
lazyPower	ok, let me re-check the spec to make sure i'm on the same page	20:26
lazyPower	i dont want to try and account for something thats already been discussed.	20:26
natefinch	lazyPower: AFAIK, it's not in the spec. So maybe that answers the question	20:28
natefinch	^^ ericsnow	20:28
lazyPower	We can always file and iterate	20:28
ericsnow	natefinch: this is something we added to the spec late last week in response to feedback katco got prepping for the demo	20:29
lazyPower	i think if you put in multiple resource uri's, fetch them	20:31
lazyPower	it wont be ovious to the user they only get a single resource, and thats not a one size fits all scenario	20:31
lazyPower	*obvious	20:31
lazyPower	and we'll see weird things happening like people tarballing up multiple images and then writing extra code to handle that when we could be handling it in the delivery mechanism	20:32
lazyPower	s/images/payloads/	20:32
natefinch	ericsnow: I think it's a bad idea to munge the idea of images with the tar files that docker supports. tar != image	20:35
natefinch	ericsnow: I'd prefer to either let the charm do the loading itself during install, or add a new field that'll tell juju how to load the info	20:36
ericsnow	natefinch: hey, it wasn't my idea! :)	20:37
wwitzel3	natefinch: I think having another field for the URI separate from the image is fine	20:38
wwitzel3	natefinch: since that would also work for the location of a private docker registry	20:39
natefinch	wwitzel3: that seems fine... unless you wanted to specify both	20:43
natefinch	wwitzel3: load the images from this tar into this registry... or is that not a thing?	20:43
wwitzel3	natefinch: if you want to specify both, then you define two processes	20:45
wwitzel3	natefinch: packing two images in to a single tar isn't that common from what I know, lazyPower might have more experience with that than me	20:45
wwitzel3	natefinch: but I've not seen it done personally, because the size of the tar is already large, most people are trying to make their images and archives smaller, not bigger	20:46
lazyPower	wwitzel3: well, you wouldnt have 2 images in a single tar	20:46
lazyPower	once you export, its a single package per container. I can see someone trying to work around an artificial limitation by bundling 2 images in a tar file	20:46
lazyPower	but that wouldn't be the norm i dont think.	20:46
lazyPower	unless you're trying to get hyper specific with arch and support multi-arch in the charm	20:47
lazyPower	ARMHF images will not run on amd64 for example, and vice versa	20:47
natefinch	ericsnow: ug, these juju status tests are horrible	20:52
ericsnow	natefinch: sorry	20:53
natefinch	ericsnow: as well you should be ;)	20:53
natefinch	ericsnow, wwitzel3, lazyPower: what do you guys think about adding a resource: key to the process info, that gets passed to the plugin, and the plugin can handle it however it wants (for docker it would do a docker load)	20:55
lazyPower	I like that idea	20:55
perrito666	natefinch: i would be a bit careful about the use of the word resource	20:56
ericsnow	natefinch: as long as it makes sense as a general feature and not just mostly-docker-specific	20:56
perrito666	I really dont feel like having State all over again	20:56
natefinch	ericsnow: I presume other container technologies might need a separate step for "install the image" before running it... but I don't know.	20:57
ericsnow	natefinch: yeah, who knows	20:58
lazyPower	natefinch: looking at the existing things - rocket/docker/runc - its all basically the same delivery mechanism	20:58
ericsnow	natefinch: for now we could just support it with a type option	20:58
lazyPower	but looking @ say, tomcat - loading a warfile has a different process	20:58
natefinch	ericsnow: ahh, yeah, type options... that makes sense	20:59
natefinch	ericsnow: forgot about that escape hatch	20:59
ericsnow	natefinch: yep, that's why we added them	20:59
natefinch	ok I gotta run. I'll do it via type-option for now, and we can always make it more official later	21:00
ericsnow	natefinch: sounds good	21:00
=== natefinch is now known as natefinch-afk
lazyPower	natefinch: ericsnow - is this going into a different branch than what landed for the concept wwitzel3 did?	21:00
ericsnow	lazyPower: nope, it'll go into feature-proc-mgmt	21:01
wwitzel3	lazyPower: it is going in to feature-proc-mgmt branch	21:01
lazyPower	ack	21:01
lazyPower	i'm going to setup a build and get a container running for this while its under active dev if you'd like active feedback on the feature before it hits CR	21:01
lazyPower	I had intended to do this for wwitzel3 but got sidetracked with the 1.0 launch of k8's	21:02
wwitzel3	yes please bat eyelashes	21:03
lazyPower	:) you got it dude	21:03
lazyPower	wwitzel3: i'll ping when i'm working on it tomorrow	21:04
wwitzel3	lazyPower: awesome, ty	21:04
sinzui	cherylj: you cannot make CI regression as fix released, we have tests and cloud checks that say upgrades are broken	21:15
cherylj	I didn't do that	21:16
cherylj	sinzui: It was set to fix released by the QA bit	21:17
cherylj	bot	21:17
sinzui	cherylj: from the same report we can see http://reports.vapour.ws/releases/2934 that the 22 jobs failed	21:18
cherylj	sinzui: Yeah, I can recreate the failure. Debugging it more now.	21:19
sinzui	cherylj: sorry, I have two emails with your name first :( I had to make them non-voting for this run because of the command to release 1.24.4, but I will make them voting again soon	21:20
cherylj	sinzui: I think this is a problem with 1.22.6, not 1.24.3/4. The upgrade is failing when it's trying to get the tools for 1.24.3	21:22
cherylj	just fyi	21:23
sinzui	cherylj: maybe we should try 1.22.7 (1.22 tip) if it works, it is an incentive to release as soon as possible.	21:23
cherylj	sinzui: I can give that a try after this debug run I'm doing now.	21:25
cherylj	ec2 seems particularly slow for me today :(	21:25
sinzui	cherylj: Indeed it is. installing packages seems to be taking longer	21:27
sinzui	cherylj: Joyent and GCE are the fastest clouds. I tend to use joyent	21:27
cherylj	sinzui: are there some shared creds for the core team? or do I need to create my own account?	21:28
sinzui	cherylj: in cloud-city? yes you can use default-joyent. and you can try different regions	21:29
cherylj	menn0: The state server will refuse connections while it's performing an upgrade, right?	21:36
cherylj	menn0: It appears that the state server is hung trying to unpack the tools, and I see the syslog filling up with these errors: http://paste.ubuntu.com/11969606/	21:42
menn0	cherylj: no the state server still accepts connections during an upgrade	21:46
cherylj	this is weird.	21:46
menn0	cherylj: the available API requests are quite limited though	21:46
menn0	cherylj: status should still work	21:46
menn0	cherylj: "the not authorized for status" error is worrying	21:47
cherylj	yeah	21:47
menn0	cherylj: also, the very high connection count	21:47
cherylj	just keeps going up!	21:47
cherylj	heh	21:47
menn0	cherylj: something in juju isn't releasing the connections	21:47
menn0	cherylj: that's probably not the root cause but related to it	21:47
menn0	cherylj: the authorization errors sound closer to the root cause	21:48
menn0	have you got the machine-0.log?	21:48
menn0	hang on... flying solo with a kid at the moment and he's calling	21:48
cherylj	yeah, I can add your SSH key to this machine.	21:48
sinzui	cherylj: menn0 help: I don't know which bugs were fixed by thumper's merge at tip https://github.com/juju/juju/commits/1.24. I can make a release, but I cannot say what issues are fixed	22:00
menn0	sinzui: looking	22:01
menn0	sinzui: looks like will and thumper have been activating the new leadership bits	22:03
menn0	sinzui: this will fix bug 1478024	22:03
mup	Bug #1478024: Looping config-changed hooks in fresh juju-core 1.24.3 Openstack deployment <blocker> <canonical-bootstack> <leadership> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:In Progress by fwereade> <https://launchpad.net/bugs/1478024>	22:03
sinzui	\o/	22:03
menn0	sinzui: but I wouldn't cut a release until they say it's done	22:04
menn0	sinzui: based on the commit messages it looks like they're close though	22:04
sinzui	menn0: I see this in the context of thumper, mgz and alexis a few hours ago	22:07
sinzui	alexisb>	22:07
sinzui	mgz, as soon as we land it is release time	22:07
menn0	sinzui: ok cool	22:07
* sinzui thinks the final job just passed and the rev is blessed by all the old rules		22:08
menn0	sinzui: we still have bug 1479931	22:11
mup	Bug #1479931: Juju 1.22.6 cannot upgrade to 1.24.3/1.24.4 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1479931>	22:11
menn0	sinzui: for some reason the QA bot marked it as fix released for 1.24	22:11
menn0	sinzui: but cherylj was able to repro it	22:11
menn0	sinzui: we're looking at that one now	22:12
sinzui	menn0: We had to make the two jobs that show the regression non-voting, which convinced CI that there was a bless	22:12
menn0	sinzui: never mind... I just saw your comment on the bug	22:12
menn0	sinzui: cool, makes sense	22:13
sinzui	menn0: we are juggling a nasty case of a regression in the wild. 1.24.4 is better than 1.24.3 :/	22:13
menn0	sinzui: I don't think we should release another 1.24 until this one is figured out	22:13
sinzui	menn0: I think so, I really don't like releasing in this rush. I officially EODed last hour	22:14
sinzui	menn0: we can replace the proposed version with another fixed version while in proposed. maybe 1.24.5 can be put in place by your tuesday	22:15
menn0	sinzui: ok	22:15
menn0	sinzui: this one should be fixed soon I think. i'm getting a sense of the problem from the logs	22:15
sinzui	menn0: also, I will hit the delay in Lp's builders. If I see a fix in CI, I can just switch the debs we plan to put in streams :)	22:17
menn0	sinzui: sounds good	22:17
menn0	waigani: if you get a chance could you have a look at http://reviews.vapour.ws/r/2279/ pls? (no rush though)	22:18
waigani	menn0: okay, I'm just finishing some stuff for Will. Probably get to it around 11am?	22:21
menn0	waigani: np. i'm looking at this upgrade issue anyway.	22:21
menn0	cherylj: I see the problem... the relevant revision is 0e39ac8d6fcc77793e5028e03bfb651707cf1bb6	22:30
menn0	cherylj: if the env UUID is missing open() tries to query the DB to figure it out, but that's before the mongodb login happens in newState() so the query isn't allowed	22:31
menn0	cherylj: I find it hard to believe that this was tested with an actual upgrade...	22:32
menn0	cherylj: it should be fixable by extracting the login into its own method	22:33
menn0	and calling that earlier in open()	22:33
sinzui	menn0: waigani Can either of you review http://reviews.vapour.ws/r/2283/	22:40
menn0	sinzui: ship it	22:41
menn0	sinzui: btw I'm pretty sure I have a fix for bug 1479931	22:42
mup	Bug #1479931: Juju 1.22.6 cannot upgrade to 1.24.3/1.24.4 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1479931>	22:42
menn0	sinzui: testing now	22:42
sinzui	menn0: Thank you. I may need to wait though. Hp cloud got retested and a job failed, so I am retesting	22:42
menn0	ok	22:43
sinzui	menn0: ping when you want to merge because I might just as well release your fix	22:43
menn0	sinzui: ok	22:44
menn0	sinzui: ok that fix works... just prepping for proposing now	22:59
sinzui	menn0: You rock, as does cherylj. I will let CI accept the current failure and wait for the fix	23:00
menn0	waigani or anyone else, review for CI blocker please : http://reviews.vapour.ws/r/2284/	23:09
waigani	menn0: looking	23:10
menn0	waigani: never mind ... the change breaks the state unit tests	23:12
menn0	sinzui: this is going to take longer	23:12
waigani	menn0: okay	23:13
axw	anastasiamac_: ok to delay 10m to wait for perrito666?	23:13
sinzui	menn0: okay. Hp hates me so I am in no rush	23:13
anastasiamac_	axw: yes :D brilliant - m going to coffee	23:13
anastasiamac_	axw: is ur school run going to b ok?	23:13
menn0	sinzui: ok. I have to be out of the house for a bit soon so it might be a few more hours	23:14
axw	anastasiamac_: should be fine	23:14
anastasiamac_	axw: gr8! see u then :D	23:14
menn0	sinzui: or perhaps someone else can run with it	23:14
menn0	let's see where I get to	23:14
menn0	waigani, sinzui: tests fixed	23:20
menn0	waigani: pushing now	23:20
menn0	waigani: can you take a look again please?	23:21
waigani	menn0: yep	23:21
menn0	waigani: I need to step away for a bit. if you're happy with the change can you pls hit merge for me?	23:21
menn0	back in 10min	23:21
waigani	menn0: yep np	23:21
perrito666	anastasiamac_: axw I am back, thanks :D	23:22
anastasiamac_	perrito666: axw: omw	23:23
waigani	menn0: done, I hit merge also	23:29
sinzui	menn0: waigani: the magic fixes-1479931 was missing, I am adding it and requeuing the merge	23:33
waigani	sinzui: ugh, sorry I keep forgetting that.	23:34
menn0	waigani, sinzui: i'm back for 20 mins or so then off again	23:36
sinzui	menn0: okay I will watch the merge and retry as needed	23:36
waigani	menn0: half day for me, heading to airport in 30min.	23:37
menn0	sinzui, waigani: thanks both of you	23:37
waigani	:)	23:37
mwhudson	davecheney: what's happened to the ppc64le builder?	23:57
