/srv/irclogs.ubuntu.com/2015/07/30/#juju-dev.txt

menn0davecheney: I knew that but didn't think you were around00:01
davecheneygot back on monday00:02
davecheneytook a few days off to recover00:02
davecheneynow back to hunting races00:02
davecheney\o/00:02
anastasiamacwow! first time experiencing earthquake in Brisbane... 5.3 @ 35km of coral sea :D00:25
mupBug #1479653 opened: state depends on system clock <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1479653>07:29
TheMuedimitern: ping08:22
dimiternTheMue, pong08:23
TheMuedimitern: feeling better today, good for work, but not for hangout. but getting closer with the test problem. assigning a new IP address to a machine that exists before (!) the worker started leads to the wanted event. but if the machine is created after the worker no events are raised. interesting behavior, will add more logging08:25
TheMuedimitern: wondered like you yesterday, that the 0.1.2.9 is never shown in the logs.08:27
dimiternTheMue, I'm glad you're feeling better08:27
TheMuedimitern: now simply created a 0.1.2.10 and added it to the 1st machine initialized in SetUp it funnily is logged08:27
dimiternTheMue, hmm this feels like the EntityWatcher is forwarding the initial event, but then it's not triggering on change?08:28
TheMuedimitern: thanks, thankfully our job allows to work below a blanket ;)08:28
TheMuedimitern: the change comes later, but it is related to an already existing machine08:29
dimiternTheMue, :) yeah08:29
dimiternTheMue, not quite sure I follow you - please describe the steps when the event is triggered (and when it's not) in detail08:29
TheMuedimitern: IMHO the state.WatchIPAddresses() should react on state.AddIPAddress()08:30
TheMuedimitern: ok, prepare a small paste08:30
dimiternTheMue, hmm wait08:31
dimiternTheMue, AddIPAddress adds an alive address, right?08:31
dimiternTheMue, doesn't the entity watcher only trigger on dead addresses?08:32
TheMuedimitern: http://paste.ubuntu.com/11965055/08:32
TheMuedimitern: IMHO not, but would have to look. It's only a mapping StringWatcher, which maps the received string values to their according entity tags08:34
TheMuedimitern: I wondered, because another existing test adding a new IP doesn't fail. but it uses the existing machine. so I added this fragment to my failing test and found the astonishing behavior.08:35
dimiternTheMue, looking at the code to remind myself what was implemented08:39
TheMuedimitern: /me too, digging deeper and adding more logs (have to remove them afterwards, phew)08:40
dimiternTheMue, so the worker starts the watcher on SetUp ?08:51
dimiternTheMue, show me your latest branch code please08:51
TheMuedimitern: the branch is here: https://github.com/TheMue/juju/tree/addresser-worker-using-api08:52
dimiternTheMue, so far, looking at state, api, and apiserver watcher code I can't see any obvious flaws that might be causing it, so it must be the way worker uses it08:53
alexisbfwereade, I will need to steal thumper for 30 minutes08:56
TheMuedimitern: ah, wait, I've got an idea. may be due to the mix of strings watcher and entity watcher08:57
TheMuealexisb: heya o/08:57
dimiternTheMue, I don't see your worker implementing SetUp, where it should start the watcher08:58
alexisbheya TheMue08:58
dimiternTheMue, ah, I saw it below08:58
TheMuedimitern: yep08:58
dimiternalexisb, hey, btw there's a 30m scheduling conflict for OS+juju call and networking roadmap one08:59
dimiternalexisb, I guess the OS call is more important than the other one?09:00
alexisbdimitern, I am not sure what the OS+juju call?09:02
dimiternalexisb, the one with jamespage09:02
alexisbdimitern, you are good, they are the same09:03
dimiternalexisb, I see :) cheers then09:03
dimiternTheMue, does waitForInitialDead work in your branch (TestWorkerRemovesDeadAddresses) ?09:11
TheMuedimitern: yes, all other tests are fine09:12
dimiternTheMue, waitForInitialDead should (almost?) immediately see 2 dead IPs - 0.1.2.4 and 0.1.2.6, right?09:13
TheMuedimitern: yes, as those belong to machine209:14
dimiternTheMue, fetching the IPAddress from state and then calling EnsureDead on it is weird09:18
dimiternTheMue, it's not what will happen in real life - the address being dead is a side effect of the machine it's assigned to getting destroyed09:19
TheMuedimitern: you mean in TestWorkerRemovesDeadAddress? that's what I found in the existing tests09:19
TheMuedimitern: the failing one is TestMachineRemovalTriggersWorker09:19
dimiternTheMue, but, machine removal should indeed trigger "Set all allocated ips to dead for this machine id"09:20
TheMuedimitern: so the original test already has been wrong? can remove it then09:20
dimiternTheMue, no :) let's think first why it's failing09:20
dimiternTheMue, so *only* TestMachineRemovalTriggersWorker fails?09:21
TheMuedimitern: yes, and the IP address is dead, see the asserts following the machine removal09:21
TheMuedimitern: exactly, the rest works fine09:21
dimiternTheMue, then there's the problem :)09:21
TheMuedimitern: already the adding of the new IP to the new machine isn't reported (at least as alive) while the reporting to a machine existing before the worker is started is reported (see pastebin, the second IP is reported)09:22
dimiternTheMue, see, removing a machine should include an op to ensure all alive ips of that machine end up dead *without* needing to do anything else (e.g. expecting the provisioner to see the machine getting dead and marking allocated ips as dead)09:23
dimiternTheMue, so I'd dig more into the list of ops machine.EnsureDead() and/or Remove() includes w.r.t. ipaddressesC09:24
TheMuedimitern: yeah, just started a state browsing of ipaddressesC to look exactly there ;)09:24
dimiternTheMue, adding a new alive IP (no need to even allocate it to a machine) *should* trigger the IP addresses watcher, but should be ignored by the worker I *think*09:26
TheMuedimitern: as it is alive, yes. the logs should show it like "[LOG] 0:00.351 DEBUG juju.worker.addresser IP address ipaddress-03c48ed1-c389-4930-82e1-1df101fb7ab2 is not dead (life "alive"); skipping"09:27
dimiternTheMue, I *seriously* hope we dial down this log message to TRACE level (imagine thousands of IPs being reported when rapidly provisioning machines)09:29
TheMuedimitern: so far I haven't touched the log levels I found, but can do it while finishing the code09:30
dimiternTheMue, if I've suggested adding it at DEBUG, sorry09:30
TheMuedimitern: the modification on worker level has been less than thought, mostly moving the release stuff to server side and change behavior from single to bulk09:31
dimiternTheMue, state.ensureIPAddressDeadOp looks dangerous on its own - without an assert isAliveDoc (and the corresponding handling of ErrAborted where it's called) it's potentially overwriting the life field of the doc indiscriminately09:33
TheMuedimitern: I see. so the original intention has been to set the address to dead regardless of its life status? what's happening, when it isn't alive, so dying or already dead?09:37
TheMuedimitern: one usage is in Machine.Remove()09:37
dimiternTheMue, yeah, that's the one without an assert set, the other (with isAliveDoc) is in IPAddress.EnsureDead09:38
TheMuedimitern: and one is in IPAddress.EnsureDead() with an assert isAliveDoc09:38
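The concern about state.ensureIPAddressDeadOp can be illustrated with a minimal sketch, written in Python for brevity rather than Go. It is not the juju code: the in-memory `docs` dict, `apply_op`, `Aborted`, `is_alive` and `ensure_dead` are all hypothetical stand-ins for a transaction op, its isAliveDoc assertion, and the handling of ErrAborted at the call site.

```python
class Aborted(Exception):
    """Hypothetical stand-in for an aborted transaction (assertion failed)."""


def is_alive(doc):
    # Stand-in for an isAliveDoc-style assertion.
    return doc["life"] == "alive"


def apply_op(docs, doc_id, update, assert_fn=None):
    """Apply `update` to docs[doc_id]; abort first if the assertion fails."""
    doc = docs[doc_id]
    if assert_fn is not None and not assert_fn(doc):
        raise Aborted(doc_id)
    doc.update(update)


def ensure_dead(docs, doc_id):
    """Guarded variant: only an alive doc is set to dead.

    Returns True if the doc was changed, False if the assertion failed
    and the abort was handled as a no-op.
    """
    try:
        apply_op(docs, doc_id, {"life": "dead"}, assert_fn=is_alive)
        return True
    except Aborted:
        return False
```

Calling `apply_op` without `assert_fn` corresponds to the unguarded case: it overwrites the life field whatever its current value, which is the behaviour flagged as dangerous above.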
TheMueh509:38
* TheMue loves mongo transactions <SARCASM OFF/>09:39
=== ashipika1 is now known as ashipika
dimiternTheMue, have you tried: 1) adding s.State.StartSync() just after line 234 (asserting the addr is dead); 2) if that doesn't work, try removing one or both of the other StartSync() calls before that, but leave the one introduced in 1)09:46
dimiternTheMue, looking at the sequence of ops, it looks like waitForReleaseOp is timing out because the apiserver has no chance of observing the address being dead before reading from the dummy ops chan09:48
dimiternTheMue, however, since opsChan is buffered, that shouldn't be the case (unless buffer size of 10 is somehow not enough)09:49
TheMuedimitern: already tried with a larger buffer, and played with the StartSync()s. not sure if I've done it how you've described, so I'll do now09:50
dimiternTheMue, need to get in a call, let's continue later10:00
TheMuedimitern: ok10:00
perrito666morning11:29
* perrito666 is devoid of his internet connection11:29
dimiternTheMue, any luck isolating the issue?11:59
TheMuedimitern: not yet done, but deeper, heads down in the lifecycle watcher ;) wondering about its merge()12:00
TheMuedimitern: one moment, showing you an interesting log fragment12:00
dimiternTheMue, ok12:01
TheMuedimitern: http://paste.ubuntu.com/11966200/12:02
TheMuedimitern: so, here the first four addresses are the normal ones12:02
TheMuedimitern: the 0.1.2.9 is the one for the new created machine12:02
TheMuedimitern: the 0.1.2.10 is instead created for the existing machine12:03
TheMuedimitern: why is the 0.1.2.9 in updates, but not in the updated ids anymore? the only step between is the merge() of the lifecycle watcher and here I'm looking now12:05
dimiternTheMue, it looks to me the lifecycle watcher is receiving entities with wrongly prefixed IDs12:05
TheMuedimitern: the updates map contains all known IPs so far, all with the env id as prefix12:06
TheMuedimitern: so I have to see what merge() exactly does12:07
dimiternTheMue, hmm that's right - the ids are ok at that point12:08
* TheMue never has been so deep in our watcher. this mix of differently formatted events, transformations, mappings, online queries etc seems weird sometimes12:08
dimiternTheMue, merge should combine the updates with the entities with known life and produce ids for the changes12:09
TheMuedimitern: and you can imagine the large number of debug statements *lol*12:09
TheMuedimitern: and here it drops the 0.1.2.9, maybe after its machine has been removed <LOOKING />12:10
dimiternTheMue, nothing should just remove ips without releasing them, the machine removal just triggers "set to dead"12:13
perrito666TheMue: any part of juju, upon detailed inspection, looks weird12:14
TheMueperrito666: *rofl* thanks for motivational remarks from Argentina12:14
dimiternTheMue, and I don't get why 0.1.2.3 is even there12:14
TheMueperrito666: heya btw12:14
perrito666TheMue: hi :)12:14
TheMuedimitern: you mean the received one?12:16
dimiternTheMue, yeah12:16
TheMuedimitern: you're right, the machine as well as its IPs aren't touched during the test12:18
TheMuedimitern: http://paste.ubuntu.com/11966320/ to understand where and what I'm logging in the lifecycle watcher12:20
* TheMue should add a debug log remover based on the comments above to his juju development tool ...12:22
dimiternTheMue, it seems more and more like a sync issue to me12:29
dimiternTheMue, have you tried dropping all StartSync() calls?12:29
TheMuedimitern: yes, the log is w/o sync as well as w/ sync after the assert that the ip address is dead12:32
TheMuedimitern: doesn't change anything12:32
TheMuedimitern: and as I said, the IP assigned to the new machine is dropped in the notifications while the one for an existing machine is kept12:33
TheMuedimitern: look how different the .9 and the .10 behave12:33
TheMuedimitern: a theoretical question12:35
TheMuedimitern: oh, forget, got it while formulating it12:36
dimiternTheMue, :)12:36
dimiternTheMue, weird issue indeed12:37
* dimitern *hates* debugging watchers12:37
* TheMue too12:37
TheMuedimitern: boah, no, you don't get it13:03
TheMuedimitern: I took a deeper look at merge() with the individual states of the IPs etc13:04
TheMuedimitern: and I've seen that the .9 always is dead13:04
TheMuedimitern: and never known as alive13:04
TheMuedimitern: so no removal13:04
TheMuedimitern: then I thought we're too fast, dead simple13:05
TheMuedimitern: and for testing I added a 30secs pause between adding and machine removal13:05
TheMuedimitern: and now - *TADDAAH* - the test passes13:05
TheMuedimitern: so, yes, it is a syncing problem, but different from State.StartSync()13:07
dimiternTheMue, sorry, was afk; catching up..13:12
dimiternTheMue, that sounds like the desired behavior for watchers (consolidating multiple changes between two events)13:13
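The behaviour found in merge() can be sketched as follows, in Python and under the assumption that the rule is the one described in the conversation: an entity that is already dead the first time the watcher sees it was never known as alive, so no change is reported for it. The function below and its `known`/`updates` dicts are hypothetical illustrations, not the lifecycle watcher's code.

```python
def merge(known, updates):
    """Fold `updates` (id -> current life) into `known` (id -> last seen life).

    Returns the sorted ids whose change should be reported.
    `known` is modified in place.
    """
    changed = []
    for entity_id, life in updates.items():
        was_known = entity_id in known
        if life == "dead":
            if was_known:
                # Known entity died: forget it and report the change.
                del known[entity_id]
                changed.append(entity_id)
            # Dead on first sight: never known as alive, nothing to report.
        elif known.get(entity_id) != life:
            # New entity, or a life change on a known one.
            known[entity_id] = life
            changed.append(entity_id)
    return sorted(changed)
```

With this rule, an address that is added and set to dead between two polls (the .9 case) produces no event, while an address first seen alive and later found dead (the .10 case) produces two.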
TheMuedimitern: isn't the API watcher a kind of polling?13:14
TheMuedimitern: because we now don't have a direct state watcher anymore, but using the API13:14
dimiternTheMue, ok, how about this: instead of sleeping for 30s, just add a short attempt loop between machine removal and adding 0.1.2.9 and setting it to dead13:19
TheMuedimitern: sure, the hard coded sleep just has been a test13:20
dimiternTheMue, cheers13:25
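A short attempt loop in place of the hard-coded 30s sleep could look roughly like this. It is a generic Python sketch, not the juju test helper; `wait_for` and its parameters are hypothetical names, and the condition to poll (for example "the watcher has reported the new address") is left to the caller.

```python
import time


def wait_for(condition, timeout=5.0, delay=0.05,
             clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Returns True as soon as the condition holds, False on timeout.
    The condition is always checked at least once.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(delay)
```

The test then waits only as long as the event actually takes, and fails with a clear timeout instead of depending on a fixed pause being long enough.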
katcowwitzel3: natefinch: ericsnow: ping14:43
ericsnowkatco: heyheyhey14:43
wwitzel3katco: pong14:43
katcoo/14:43
katcodid you guys get my email?14:43
katcohow are we looking for iteration work?14:44
wwitzel3katco: good, the state/persistence story won't land14:44
wwitzel3katco: all others will be done by EOD Friday14:44
wwitzel3katco: of the pointed stuff that is14:45
katcowwitzel3: i can live with that :)14:45
wwitzel3katco: we have some low-prio overhead that probably won't get done14:45
katcowwitzel3: understood... glad the pointed work is mostly landed14:46
katcowwitzel3: ericsnow: ty, just wanted to check in!14:46
katcowe'll talk about the sprint sometime after i get back. lots of interesting stuff14:46
ericsnowkatco: sweet14:47
katcoericsnow: wwitzel3: k gotta run to another meeting... ty again, and if i don't talk to you before, have a great weekend14:47
wwitzel3katco: dibs on the Python library, lol14:47
katcorofl14:47
ericsnowwwitzel3: dang it!14:47
ericsnowwwitzel3: we should pair up :)14:48
wwitzel3katco: you too, safe travels14:48
ericsnowkatco: ditto14:48
thumpero/ sinzui15:12
thumpersinzui: been working with fwereade on this blocker issue15:13
thumperjust asked the bot to land it15:13
thumperit has been tested by Ed to deploy a complex openstack bundle that uses leadership a lot15:13
thumperand it all worked15:13
thumper\o/15:13
thumperalso, I have run all the tests locally, and they at least pass here15:13
thumperfirst time too15:13
* thumper crosses fingers for the bot to do its thing15:14
thumperhello?15:14
thumperanyone alive in here?15:14
* thumper streaks through the empty channel15:15
* ericsnow averts eyes15:15
alexisbericsnow, all of us in annecy have to see it in real life15:16
mgzthumper: sorry, I wasn't sure if there was actually a question in all that15:16
* alexisb is blinded15:16
thumpermgz: there wasn't15:16
ericsnowalexisb: :)15:16
mgzthumper: okay then, carry on streaking :)15:16
thumperbut I do like to know that I'm not just talking to myself15:16
wwitzel3lol15:16
alexisbmgz, as soon as we land it is release time15:17
thumperwell15:17
thumperonce it passes CI15:17
wwitzel3I've given up and just assume I'm always talking to myself15:17
alexisbthumper, details details ;)15:17
alexisbmgz, what thumper said15:17
thumperwwitzel3: so... tycho here is doing some lxd container stuff for us15:17
sinzuithumper: was OTP. CI is ready for your landing15:17
wwitzel3thumper: awesome15:17
thumpersinzui: coolio15:17
wwitzel3thumper: what stuff?15:18
thumpercontainer/lxd15:18
sinzuithumper: alexisb mgz: Robie had a brilliant idea to solve the deployer/quickstart/pyjujuclient problem. Maybe we can include those plugins in the juju-core source package to ensure lock-step delivery of compatible plugins to trusty (and everywhere)15:19
wwitzel3thumper: right, but what about it is being done for us, I mean15:19
alexisbwwitzel3, tych0 is adding lxd support to juju-core15:21
alexisbsinzui, thumper and mramm have been pondering that15:21
alexisband I am sure would like your input15:21
wwitzel3alexisb: oh, nice :)15:22
sinzuialexisb: we can release as we have done in the past. But I think we need to change the policy to release blessed revisions that have passed compatibility and reliability tests. Those tests take days to run and mostly run on weekends when CI has more resources15:22
tych0thumper: github.com/tych0/juju lxd-container-type15:23
perrito666anyone more or less familiar with environ.Config?16:11
TheMueperrito666: don't know if I can help you, but ask16:15
perrito666I am looking at the implementation because I might want to add a key but I am not sure I understand it properly16:16
TheMueperrito666: regarding schema and default values?16:17
mupBug #1479889 opened: Test failure com_juju_juju_featuretests.TearDownTest.pN44_github.com_juju_juju_featuretests.dblogSuite <ci> <intermittent-failure> <ppc64el> <test-failure> <unit-tests> <juju-core:Triaged> <juju-core trunk:Triaged> <https://launchpad.net/bugs/1479889>17:06
redelmannHi there.17:22
redelmannNeed some help upgrading juju 1.23 to 1.24.317:23
redelmann1.23.3 to 1.24.317:24
perrito666redelmann: what is going on?17:24
redelmannperrito666, hi.17:25
redelmannperrito666, i was trying to upgrade juju in maas environment17:25
redelmannperrito666, after running "juju upgrade-juju"17:25
redelmannperrito666, machine0.log says: http://paste.ubuntu.com/11967995/17:26
redelmannperrito666, Well, after that I can't run any juju command17:28
redelmannperrito666, that's the problem :P17:28
perrito666mm, are the machines still there? if so what is on the logs for machine 0? (Assuming you can access it)17:29
redelmannperrito666, all machines are online, machine0.log:  http://paste.ubuntu.com/11967995/17:30
perrito666have you tried restarting the juju service by hand?17:31
redelmannperrito666, yes, and nothing happened17:32
redelmannperrito666, same log17:35
perrito666mm, strange, I think you will have to make some changes by hand17:35
redelmannperrito666, "ls /var/lib/juju/tools": http://paste.ubuntu.com/11968047/17:36
redelmannperrito666, agents tools are there, but not linked17:36
perrito666there is more than that to updates :)17:37
redelmannperrito666, well i suppose that moving links will not fix anything17:38
perrito666redelmann: I cannot really recall what change you need to do17:39
redelmannperrito666, mhhh.... look at this:17:39
redelmannperrito666, http://paste.ubuntu.com/11968067/17:40
perrito666redelmann: the rest are links17:41
redelmannperrito666, :P i see17:41
redelmannperrito666, couldn't read wrench directory: stat /var/lib/juju/wrench: no such file or directory17:43
redelmannperrito666, that's nothing to worry about?17:43
perrito666that is not a problem, wrench is something to develop17:43
perrito666it is used to introduce failures into juju17:44
redelmannperrito666, i suppose that: rsyslogd-2039: Could no open output pipe '/dev/xconsole': No such file or directory [try http://www.rsyslog.com/e/2039 ]17:54
redelmannperrito666, is not a problem too17:54
natefinchericsnow: I'd love it if you could review the status stuff again today.  I think it should be all set.18:09
ericsnownatefinch: will do18:09
=== kadams54 is now known as kadams54-away
=== kadams54_ is now known as kadams54
redelmannperrito666, Ok, fixed by hand18:40
perrito666hey, I was afk, how did you?18:40
marcoceppikatco: could you or someone from moonstone look into this? https://bugs.launchpad.net/juju-core/+bug/147815618:40
mupBug #1478156: summary format does not give enough details about machine provisioning errors <charmers> <juju-core:Triaged> <https://launchpad.net/bugs/1478156>18:40
marcoceppikatco: ugh, nvm18:40
marcoceppiI see it's marked as high now, I had old data on the page18:41
natefinchwwitzel3: you around?18:59
natefinchericsnow: you around?19:03
ericsnownatefinch: yep19:03
natefinchericsnow: I was trying to work out what exactly I needed to do for my kanban card about local file images and docker.... and it seems like there's no such thing as a local file image... they're all stored in a local docker repository and behave exactly like remote ones.... there's no "docker run file://home/nate/mydockerimage"19:05
natefinchat least as far as I can tell19:05
ericsnownatefinch: the idea is, for local file images, to load them first19:06
mupBug #1479931 opened: Juju 1.22.6 cannot upgrade to 1.24.3/1.24.4 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1479931>19:19
mupBug #1479942 opened: Reference to undefined method <ci> <intermittent-failure> <ppc64el> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1479942>19:19
natefinchericsnow: sort of a problem... the name of the tar file bears no relation to the name of the image.19:22
natefinchericsnow: so if we're given foo.tar as something to load and run... we can load it, but we won't know what it is called once it's in the registry.  I guess we could look in the tar file and figure it out :/19:23
ericsnownatefinch: wwitzel3 will have to take it from here; I don't know enough about that19:23
natefinchericsnow: ok... actually, looks like a tar can have multiple images, so it even moreso won't work19:28
wwitzel3natefinch: yeah, looking at some of the other tools out there that wrap docker, they take an inventory first, using docker images19:40
wwitzel3natefinch: then they load it, and parse the diff19:41
natefinchwwitzel3: doesn't solve the problem if more than one image is loaded from the tar file19:41
wwitzel3natefinch: we could also use the remote API instead of wrapping the cmd19:41
wwitzel3natefinch: it does, since we would parse out both of them and they can only specify a single image name in the process definition19:42
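The "take an inventory, load, parse the diff" approach can be sketched as a pure comparison of two image lists. This Python sketch assumes the lists come from running `docker images` before and after `docker load`; how that output is obtained and parsed is deliberately left out, and `newly_loaded` and `single_image` are hypothetical helper names.

```python
def newly_loaded(before, after):
    """Return the image names present in `after` but not in `before`, sorted."""
    return sorted(set(after) - set(before))


def single_image(loaded):
    """Return the one loaded image, or raise if the tar did not contain exactly one.

    Illustrates the stricter policy of one image per process definition.
    """
    if len(loaded) != 1:
        raise ValueError("expected exactly one image, got %d" % len(loaded))
    return loaded[0]
```

Whether a tar with several images should be rejected, as `single_image` does here, or should have every image loaded is exactly the open design question in this discussion.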
natefinchwwitzel3: but I thought the feature was that the image name *is* the tar file19:42
ericsnownatefinch: gave you one last review (LGTM with some minor caveats)19:42
natefinchericsnow: thanks19:43
ericsnownatefinch: np19:43
wwitzel3natefinch: well, in that case we could launch and register both19:44
wwitzel3natefinch: or we could leave image as is and make the file to load a type specific arg19:44
natefinchwwitzel3: so, does this seem like a useful feature?  Is the idea that someone will package a tar file in their charm?19:45
wwitzel3natefinch: I can't remember the reason for it, it was based on some feedback we got iirc19:47
natefinchwwitzel3: seems like it needs to be better defined before we work on it.  I don't want to guess at the correct implementation.19:50
wwitzel3natefinch: I don't even see a card for it19:51
wwitzel3natefinch: oh, there it is, overhead19:51
natefinchwwitzel3: yep19:51
wwitzel3natefinch: so if the file replaces the image name, then it won't matter how many images are in the tar, we would just load and launch any it contained19:53
natefinchwwitzel3: I don't think that's a good idea... in all other cases, the process specification is for a single process - you give it a command to run, etc.  I think it would be surprising for a single process definition in the yaml to result in multiple registered processes.19:54
natefinchwwitzel3: maybe if we added a LoadFrom field in the process info that would tell Juju to load the image before launching it19:55
natefinchor maybe we need a separate step that loads all images before we start launching processes19:56
wwitzel3natefinch: I don't think it would be a surprise if I, the charm author provided a tar that had multiple images in it, but we shouldn't be designing this interaction anyway. We should probably ping lazyPower and whit about what that interaction would look like and what they want :)19:56
lazyPowerhello o/19:56
lazyPowerin office hours19:56
lazyPowerwill circle back when we're out, because i know what you're talking about and want to be a part of it19:56
natefinchlazyPower: awesome19:57
natefinchlazyPower: I love a man that knows what he wants ;)19:57
lazyPowernatefinch: ok my session is over, whaaatt would we like to do with process management in charming? :) i have some ideas already for example workloads to deliver with this.20:22
lazyPowerah i see, this is wrt multi processes20:23
natefinchlazyPower: well, so, I had a work item to support loading images from files on disk (a la docker's load from a tar)20:23
lazyPowerok, i don't see shipping multiple images in the charm, i see more shipping with a dockerfile/compose-formation, and building on the host during deploy, or pulling from a private registry20:24
lazyPowerthats the established pattern. Do we want to advocate for fatpacking images in a charm?20:25
natefinchI don't know that we want to make that standard practice, but some people may certainly ask for it.  Fat charms are popular.20:26
lazyPowerok, let me re-check the spec to make sure i'm on the same page20:26
lazyPoweri dont want to try and account for something thats already been discussed.20:26
natefinchlazyPower: AFAIK, it's not in the spec. So maybe that answers the question20:28
natefinch^^ ericsnow20:28
lazyPowerWe can always file and iterate20:28
ericsnownatefinch: this is something we added to the spec late last week in response to feedback katco got prepping for the demo20:29
lazyPoweri think if you put in multiple resource uri's, fetch them20:31
lazyPowerit wont be ovious to the user they only get a single resource, and thats not a one size fits all scenario20:31
lazyPower*obvious20:31
lazyPowerand we'll see weird things happening like people tarballing up multiple images and then writing extra code to handle that when we could be handling it in the delivery mechanism20:32
lazyPowers/images/payloads/20:32
natefinchericsnow: I think it's a bad idea to munge the idea of images with the tar files that docker supports.  tar != image20:35
natefinchericsnow: I'd prefer to either let the charm do the loading itself during install, or add a new field that'll tell juju how to load the info20:36
ericsnownatefinch: hey, it wasn't *my* idea! :)20:37
wwitzel3natefinch: I think having another field for the URI separate from the image is fine20:38
wwitzel3natefinch: since that would also work for the location of a private docker registry20:39
natefinchwwitzel3: that seems fine... unless you wanted to specify both20:43
natefinchwwitzel3: load the images from this tar into this registry... or is that not a thing?20:43
wwitzel3natefinch: if you want to specify both, then you define two processes20:45
wwitzel3natefinch: packing two images in to a single tar isn't that common from what I know, lazyPower might have more experience with that than me20:45
wwitzel3natefinch: but I've not seen it done personally, because the size of the tar is already large, most people are trying to make their images and archives smaller, not bigger20:46
lazyPowerwwitzel3: well, you wouldn't have 2 images in a single tar20:46
lazyPoweronce you export, its a single package per container. I can see someone trying to work around an artificial limitation by bundling 2 images in a tar file20:46
lazyPowerbut that wouldn't be the norm i dont think.20:46
lazyPowerunless you're trying to get hyper specific with arch and support multi-arch in the charm20:47
lazyPowerARMHF images will not run on amd64 for example, and vice versa20:47
natefinchericsnow: ug, these juju status tests are horrible20:52
ericsnownatefinch: sorry20:53
natefinchericsnow: as well you should be ;)20:53
natefinchericsnow, wwitzel3, lazyPower: what do you guys think about adding a resource: key to the process info, that gets passed to the plugin, and the plugin can handle it however it wants (for docker it would do a docker load)20:55
lazyPowerI like that idea20:55
perrito666natefinch: i would be a bit careful about the use of the word resource20:56
ericsnownatefinch: as long as it makes sense as a general feature and not just mostly-docker-specific20:56
perrito666I really dont feel like having State all over again20:56
natefinchericsnow: I presume other container technologies might need a separate step for "install the image"  before running it... but I don't know.20:57
ericsnownatefinch: yeah, who knows20:58
lazyPowernatefinch: looking at the existing things - rocket/docker/runc - its all basically the same delivery mechanism20:58
ericsnownatefinch: for now we could just support it with a type option20:58
lazyPowerbut looking @ say, tomcat - loading a warfile has a different process20:58
natefinchericsnow: ahh, yeah, type options... that makes sense20:59
natefinchericsnow: forgot about that escape hatch20:59
ericsnownatefinch: yep, that's why we added them20:59
natefinchok I gotta run.  I'll do it via type-option for now, and we can always make it more official later21:00
ericsnownatefinch: sounds good21:00
=== natefinch is now known as natefinch-afk
lazyPowernatefinch: ericsnow - is this going into a different branch than what landed for the concept wwitzel3 did?21:00
ericsnowlazyPower: nope, it'll go into feature-proc-mgmt21:01
wwitzel3lazyPower: it is going in to feature-proc-mgmt branch21:01
lazyPowerack21:01
lazyPoweri'm going to setup a build and get a container running for this while its under active dev if you'd like active feedback on the feature before it hits CR21:01
lazyPowerI had intended to do this for wwitzel3 but got sidetracked with the 1.0 launch of k8's21:02
wwitzel3yes please *bat eyelashes*21:03
lazyPower:) you got it dude21:03
lazyPowerwwitzel3: i'll ping when i'm working on it tomorrow21:04
wwitzel3lazyPower: awesome, ty21:04
sinzuicherylj: you cannot mark a CI regression as fix released, we have tests and cloud checks that say upgrades are broken21:15
cheryljI didn't do that21:16
cheryljsinzui: It was set to fix released by the QA bit21:17
cheryljbot21:17
sinzuicherylj: from the same report we can see http://reports.vapour.ws/releases/2934 that the 22 jobs failed21:18
cheryljsinzui: Yeah, I can recreate the failure.  Debugging it more now.21:19
sinzuicherylj: sorry, I have two emails with your name first :( I had to make them non-voting for this run because of the command to release 1.24.4, but I will make them voting again soon21:20
cheryljsinzui: I think this is a problem with 1.22.6, not 1.24.3/4.  The upgrade is failing when it's trying to get the tools for 1.24.321:22
cheryljjust fyi21:23
sinzuicherylj: maybe we should try 1.22.7 (1.22 tip) if it works, it is an incentive to release as soon as possible.21:23
cheryljsinzui: I can give that a try after this debug run I'm doing now.21:25
cheryljec2 seems particularly slow for me today :(21:25
sinzuicherylj: Indeed it is. Installing packages seems to be taking longer21:27
sinzuicherylj: Joyent and GCE are the fastest clouds. I tend to use joyent21:27
cheryljsinzui: are there some shared creds for the core team?  or do I need to create my own account?21:28
sinzuicherylj: in cloud-city? yes you can use default-joyent. and you can try different regions21:29
cheryljmenn0: The state server will refuse connections while it's performing an upgrade, right?21:36
cheryljmenn0: It appears that the state server is hung trying to unpack the tools, and I see the syslog filling up with these errors:  http://paste.ubuntu.com/11969606/21:42
menn0cherylj: no the state server still accepts connections during an upgrade21:46
cheryljthis is weird.21:46
menn0cherylj: the available API requests are quite limited though21:46
menn0cherylj: status should still work21:46
menn0cherylj: "the not authorized for status" error is worrying21:47
cheryljyeah21:47
menn0cherylj: also, the very high connection count21:47
cheryljjust keeps going up!21:47
cheryljheh21:47
menn0cherylj: something in juju isn't releasing the connections21:47
menn0cherylj: that's probably not the root cause but related to it21:47
menn0cherylj: the authorization errors sound closer to the root cause21:48
menn0have you got the machine-0.log?21:48
menn0hang on... flying solo with a kid at the moment and he's calling21:48
cheryljyeah, I can add your SSH key to this machine.21:48
sinzuicherylj: menn0 help: I don't know which bugs thumper's merge at tip https://github.com/juju/juju/commits/1.24 fixed. I can make a release, but I cannot say what issues are fixed22:00
menn0sinzui: looking22:01
menn0sinzui: looks like will and thumper have been activating the new leadership bits22:03
menn0sinzui: this will fix bug 147802422:03
mupBug #1478024: Looping config-changed hooks in fresh juju-core 1.24.3 Openstack deployment <blocker> <canonical-bootstack> <leadership> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:In Progress by fwereade> <https://launchpad.net/bugs/1478024>22:03
sinzui\o/22:03
menn0sinzui: but I wouldn't cut a release until they say it's done22:04
menn0sinzui: based on the commit messages it looks like they're close though22:04
sinzuimenn0: I see this in the context of thumper, mgz and alexis a few hours ago22:07
sinzuialexisb>22:07
sinzuimgz, as soon as we land it is release time22:07
menn0sinzui: ok cool22:07
* sinzui this the final job just passed and the rev is blessed by all the old rules22:08
menn0sinzui: we still have bug 147993122:11
mupBug #1479931: Juju 1.22.6 cannot upgrade to 1.24.3/1.24.4 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1479931>22:11
menn0sinzui: for some reason the QA bot marked it as fix released for 1.2422:11
menn0sinzui: but cherylj was able to repro it22:11
menn0sinzui: we're looking at that one now22:12
sinzuimenn0: We had to make the two jobs that show the regression non-voting, which convinced CI that there was a bless22:12
menn0sinzui: never mind... I just saw your comment on the bug22:12
menn0sinzui: cool, makes sense22:13
sinzuimenn0: we are juggling a nasty case of a regression in the wild. 1.24.4 is better than 1.24.3 :/22:13
menn0sinzui: I don't think we should release another 1.24 until this one is figured out22:13
sinzuimenn0: I think so, I really don't like releasing in this rush. I officially EODed last hour22:14
sinzuimenn0: we can replace the proposed version with another fixed version while in proposed. maybe 1.24.5 can be put in place by your tuesday22:15
menn0sinzui: ok22:15
menn0sinzui: this one should be fixed soon I think. i'm getting a sense of the problem from the logs22:15
sinzuimenn0: also, I will hit the delay in Lp's builders. If I see a fix in CI, I can just switch the debs we plan to put in streams :)22:17
menn0sinzui: sounds good22:17
menn0waigani: if you get a chance could you have a look at http://reviews.vapour.ws/r/2279/ pls? (no rush though)22:18
waiganimenn0: okay, I'm just finishing some stuff for Will. Probably get to it around 11am?22:21
menn0waigani: np. i'm looking at this upgrade issue anyway.22:21
menn0cherylj: I see the problem... the relevant revision is 0e39ac8d6fcc77793e5028e03bfb651707cf1bb622:30
menn0cherylj: if the env UUID is missing open() tries to query the DB to figure it out, but that's before the mongodb login happens in newState() so the query isn't allowed22:31
menn0cherylj: I find it hard to believe that this was tested with an actual upgrade...22:32
menn0cherylj: it should be fixable by extracting the login into its own method22:33
menn0and calling that earlier in open()22:33
sinzuimenn0: waigani Can either of you review http://reviews.vapour.ws/r/2283/22:40
menn0sinzui: ship it22:41
menn0sinzui: btw I'm pretty sure I have a fix for bug 147993122:42
mupBug #1479931: Juju 1.22.6 cannot upgrade to 1.24.3/1.24.4 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1479931>22:42
menn0sinzui: testing now22:42
sinzuimenn0: Thank you. I may need to wait though. Hp cloud got retested and a job failed, so I am retesting22:42
menn0ok22:43
sinzuimenn0: ping when you want to merge because I might just as well release your fix22:43
menn0sinzui: ok22:44
menn0sinzui: ok that fix works... just prepping for proposing now22:59
sinzuimenn0: You rock, as does cherylj . I will let CI accept the current failure and wait for the fix23:00
menn0waigani or anyone else, review for CI blocker please : http://reviews.vapour.ws/r/2284/23:09
waiganimenn0: looking23:10
menn0waigani: never mind ... the change breaks the state unit tests23:12
menn0sinzui: this is going to take longer23:12
waiganimenn0: okay23:13
axwanastasiamac_: ok to delay 10m to wait for perrito666?23:13
sinzuimenn0: okay. Hp hates me so I am in no rush23:13
anastasiamac_axw: yes :D brilliant - m going to coffee23:13
anastasiamac_axw: is ur school run going to b k?23:13
menn0sinzui: ok. I have to be out of the house for a bit soon so it might be a few more hours23:14
axwanastasiamac_: should be fine23:14
anastasiamac_axw: gr8! see u then :D23:14
menn0sinzui: or perhaps someone else can run with it23:14
menn0let's see where I get to23:14
menn0waigani, sinzui: tests fixed23:20
menn0waigani: pushing now23:20
menn0waigani: can you take a look again please?23:21
waiganimenn0: yep23:21
menn0waigani: I need to step away for a bit. if you're happy with the change can you pls hit merge for me?23:21
menn0back in 10min23:21
waiganimenn0: yep np23:21
perrito666anastasiamac_: axw I am back, thanks :D23:22
anastasiamac_perrito666: axw: omw23:23
waiganimenn0: done, I hit merge also23:29
sinzuimenn0: waigani : the magic fixes-1479931 was missing, I am adding it and requeuing the merge23:33
waiganisinzui: ugh, sorry I keep forgetting that.23:34
menn0waigani, sinzui: i'm back for 20 mins or so then off again23:36
sinzuimenn0: okay I will watch the merge and retry as needed23:36
waiganimenn0: half day for me, heading to airport in 30min.23:37
menn0sinzui, waigani: thanks both of you23:37
waigani:)23:37
mwhudsondavecheney: what's happened to the ppc64le builder?23:57