/srv/irclogs.ubuntu.com/2015/09/04/#juju-dev.txt

davecheneymwhudson: testing now, thanks00:24
davecheneythis stuff is a bit subtle00:25
mwhudsondavecheney: as has been noted before the go linker is terrible00:26
mwhudsondavecheney: i know what i was doing wrong though00:27
mwhudsonbasically the value of goarm was being appended to the value from the runtime.a file00:27
mwhudsonso it was \x00\x0700:27
mwhudsonand it's only a uint8 so the runtime was just seeing 000:27
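[Editor's sketch, not part of the log: a minimal Go illustration of the failure mwhudson describes above, assuming the setting travels as a string of raw bytes and the reader looks only at the first byte. The helper name firstByte and the "\x00" default are hypothetical; this is not the actual linker or runtime code.]

```go
package main

import "fmt"

// firstByte mimics a consumer that treats a byte string as a single uint8
// and therefore only ever looks at the first byte. Hypothetical helper.
func firstByte(s string) uint8 {
	if len(s) == 0 {
		return 0
	}
	return s[0]
}

func main() {
	fromArchive := "\x00" // assumed default value already present in the archive
	goarm := "\x07"       // the value that was meant to be used

	appended := fromArchive + goarm // appending instead of replacing
	fmt.Println(firstByte(appended)) // the reader sees 0
	fmt.Println(firstByte(goarm))    // replacing would have given 7
}
```

[Appending leaves the stale leading byte in place, so a one-byte reader never reaches the intended value.]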
ericsnowwallyworld_: I've landed that fix to unblock master (thanks for the review)00:31
wallyworld_np, thanks for fixing :-)00:32
ericsnowwallyworld_: CI is supposed to auto-unblock once CI passes, right?00:32
wallyworld_yes00:32
* ericsnow crosses fingers00:32
ericsnowg'night00:32
davecheneyericsnow: thanks for fixing that00:45
stokachuran into an interesting bug: https://bugs.launchpad.net/juju-core/+bug/149208801:59
mupBug #1492088: juju bootstrap fails inside a wily container <cloud-installer> <juju-core:New> <https://launchpad.net/bugs/1492088>01:59
stokachuanyone seen this before?01:59
mupBug #1492088 opened: juju bootstrap fails inside a wily container <cloud-installer> <juju-core:New> <https://launchpad.net/bugs/1492088>02:04
mwhudsondavecheney: say, do you know off hand the difference between GOARM=5 and GOARM=6?02:11
mwhudsonoh soft float seems like a major suspect02:12
davecheney12:10 < dfc> thumper: http://reviews.vapour.ws/r/2588/02:13
davecheney12:10 < dfc> remove version.Binary.OS02:13
davecheneymwhudson: yes, and no atomics02:13
davecheneyno STREX/LDREX02:13
mwhudsoni think this is soft float02:16
mwhudsondavecheney: if you enable the -shared flag for arm things go very wrong in hashmap code02:17
mwhudsonand there is floating point code that could explain it02:17
davecheneyyes, the hash factor02:21
davecheneyor fill factor or something02:21
davecheneyfrom memory hashmap.go:mapinit02:21
perrito666cmd/juju/status_test.go is a clear form of modern torture02:26
wallyworld_yes it is :-(02:27
* perrito666 uses the time between testruns to learn emacs02:28
* perrito666 wonders if he can set kb layouts only for a given app02:29
mwhudsondavecheney: yeah, luckily the softfloat is so broken that it doesn't get 10.0*100.0 right...02:29
davecheneyperrito666: i'm working on the great american novel while running tests on ppc6402:29
perrito666heh02:29
perrito666oh finally, success02:30
perrito666I changed one value in formatted status02:31
perrito666git diff status_test.go | grep "^\+" | wc -l02:31
perrito66610502:31
mwhudsonARGH02:35
davecheneysinzui, I thought we had a voting race builder now ?02:52
davecheneyhttps://bugs.launchpad.net/juju-core/+bug/149209502:53
davecheneymwhudson: soft float only works for values of 1 with extremely large exponents02:53
mupBug #1492095: worker/statushistorypruner: data race <juju-core:New> <https://launchpad.net/bugs/1492095>02:53
mwhudsonhaha i fixed it02:53
mwhudsonnot sure that was a sensible use of my time, but it took waaaay less than figuring out what was going on02:54
mupBug #1492095 opened: worker/statushistorypruner: data race <juju-core:New> <https://launchpad.net/bugs/1492095>03:02
davecheneythumper: thanks for your comments, please see my reply03:03
thumperdavecheney: I agree with your summation03:04
thumperdavecheney: all within time03:04
perrito666davecheney: thanks for finding the race (that was me, most likely)03:04
davecheneyperrito666: np03:04
davecheneythumper: it's a valid operation03:05
davecheneybut I think it deserves to be broken out into its own logic03:05
davecheneypossibly in an upcoming juju/series package ?03:05
thumperperhaps03:26
davecheneymwhudson: congrats on your +205:03
davecheneyproving the motto of the Go team: "you get commit rights when we get sick of committing your stuff"05:03
mwhudsondavecheney: thanks05:31
mwhudsonand, yeah05:31
davecheneywe're classy like that05:33
=== Spads_ is now known as Spads
wallyworld_axw: a small one if you have a moment http://reviews.vapour.ws/r/2590/07:09
axwwallyworld_: looking07:29
wallyworld_ta07:29
axwwallyworld_: filesystem CLI is delayed, the existing volume stuff needs cleaning up first. I created another card on the board07:30
wallyworld_np07:30
axwwallyworld_: I don't really understand why this branch is required at all. when would we ever get to the end of the resolver and have Started==false?07:38
wallyworld_axw: during my testing i saw the update-status hook fire before the start hook had run (after install hook i think, before leader-elected)07:39
wallyworld_so it's in response to observed behaviour07:39
wallyworld_hmmm07:40
axwwallyworld_: ah hm, apparently Start isn't run until after the first ConfigChanged07:40
wallyworld_maybe the refactoring got rid of the problem07:40
axwwallyworld_: so if the update status trigger comes in before config changed, this could happen I think07:41
wallyworld_config changed may come first yeah07:41
wallyworld_yeah07:41
axwwallyworld_: but no... we always wait for the first config changed07:41
wallyworld_oh, ok, i forgot that07:41
axwwallyworld_: although I think it could still happen in the case of a failed/resolved config-changed07:42
wallyworld_could do. adding this extra started check is trivial and seems like a good safety net07:43
axwwallyworld_: we should probably move that logic into the resolver (out of operation/runhook.go), and have it drop out of the resolver if !Started and waiting for config changed07:43
axwmaybe not now07:43
wallyworld_the stuff in run hook commit?07:44
axwwallyworld_: yes07:44
wallyworld_yeah, makes sense to do that now i think, but not earlier with the old code07:44
axwwallyworld_: actually even that doesn't make sense. if we resolved the config-changed, it'd still commit and go to Started07:48
wallyworld_so maybe now that the update status trigger has been pulled into the main loop processing, the issue is moot07:50
wallyworld_whereas before it was a concurrency lottery07:50
axwwallyworld_: right, yeah. I thought you were testing with your change07:50
wallyworld_i just added a bunch of cards, but i should have retested after the first refactor07:51
wallyworld_i'll retest and drop this branch probably07:51
wallyworld_just goes to show how fragile the uniter was before07:51
wallyworld_and how this rework has fixed a bunch of stuff implicitly07:52
axwwallyworld_: that particular bit was okay before maltese-falcon, it got broken during (I think)07:58
wallyworld_that may well be true07:58
wallyworld_axw: and i think those other cards about duplicate status may be bugs in status history (need to dig further). so the feature branch may well be almost ready07:58
axwwallyworld_: cool.07:59
wallyworld_i have soccer now but will continue testing after07:59
fwereadecmars, axw: you know the worker/gate thing I did?09:26
fwereadecmars, axw: I think that we want a sort of extension of the concept to describe what charmdir.worker really does09:26
fwereadecmars, axw: because I think it really is just a custom synchronisation construct, the charmdir relationship is entirely incidental09:27
fwereadecmars, axw: metaphorically something like Fortress sorta works -- clients can Visit(func() error), the person in charge can Unlock (unblock Visits) and Lockdown (stop accepting new Visits, wait for existing ones to complete)09:31
fwereadecmars, axw: but that's mainly just because I'm thinking about gates09:31
fwereadeor anyone who's interested in naming problems :) ^^09:32
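[Editor's sketch, not part of the log: one possible shape for the Fortress construct fwereade outlines above, with Visit, Unlock and Lockdown. Everything beyond those three names is an assumption: NewFortress, ErrLocked, the field names, and above all the choice to reject a Visit while locked rather than block it until Unlock. It is not the code that later landed in juju.]

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrLocked is returned by Visit while the fortress is locked (assumed name).
var ErrLocked = errors.New("fortress is locked down")

// Fortress lets visitors in only while unlocked; Lockdown waits for them to leave.
type Fortress struct {
	mu     sync.Mutex
	idle   *sync.Cond // signalled when the last visitor leaves
	open   bool
	active int // visits currently in progress
}

// NewFortress returns a fortress that starts locked.
func NewFortress() *Fortress {
	f := &Fortress{}
	f.idle = sync.NewCond(&f.mu)
	return f
}

// Unlock starts accepting new visits.
func (f *Fortress) Unlock() {
	f.mu.Lock()
	f.open = true
	f.mu.Unlock()
}

// Lockdown stops accepting new visits and waits for existing ones to complete.
func (f *Fortress) Lockdown() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.open = false
	for f.active > 0 {
		f.idle.Wait()
	}
}

// Visit runs fn inside the fortress, or returns ErrLocked if it is locked.
func (f *Fortress) Visit(fn func() error) error {
	f.mu.Lock()
	if !f.open {
		f.mu.Unlock()
		return ErrLocked
	}
	f.active++
	f.mu.Unlock()

	defer func() {
		f.mu.Lock()
		f.active--
		if f.active == 0 {
			f.idle.Broadcast()
		}
		f.mu.Unlock()
	}()
	return fn()
}

func main() {
	f := NewFortress()
	visit := func() error { return nil }

	fmt.Println(f.Visit(visit)) // rejected: still locked
	f.Unlock()
	fmt.Println(f.Visit(visit)) // allowed
	f.Lockdown()
	fmt.Println(f.Visit(visit)) // rejected again
}
```

[A counter plus sync.Cond is used here instead of sync.WaitGroup so that Unlock and Lockdown can be called repeatedly without worrying about WaitGroup reuse rules.]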
perrito666davecheney: still here?11:11
voidspacedimitern: http://reviews.vapour.ws/r/2593/11:24
voidspacefwereade: ping11:25
dimiternvoidspace, cheers11:25
mupBug #1492232 opened: backup hogs resources. <juju-core:New> <https://launchpad.net/bugs/1492232>11:39
mupBug #1492237 opened: juju state server mongod uses too much disk space <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1492237>12:09
mupBug #1492241 opened: juju upgrade-juju cli doesn't provide clear feedback on action being taken <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1492241>12:09
dimiternfwereade, hey, are you around?12:39
fwereadedimitern, heyhey -- and voidspace, oops, sorry12:39
dimiternfwereade, :) voidspace has a branch that I'd like you to have a look, if possible12:40
fwereadedimitern, just saw; voidspace, looking forward to it :)12:40
dimiternfwereade, hopefully rectifies some of the issues with unit addresses changing randomly12:40
dimiternfwereade, awesome, thanks :)12:40
dimiternfwereade, we were careful not to break api compatibility12:42
ericsnowmgz: could we get a CI run against master to clear that blocking bug?12:46
voidspacefwereade: cool13:08
voidspacefwereade: it touches the uniter which is why we pinged you particularly13:08
voidspacefwereade: (touches it in a very minor way)13:08
fwereadevoidspace, phew :)13:08
fwereadevoidspace, what with maltese-falcon and all ;)13:08
voidspacefwereade: yeah...13:12
voidspacefwereade: hopefully a good touch not a bad touch...13:12
ericsnowmgz, dooferlad: ping13:14
dooferladericsnow: pong13:14
ericsnowdooferlad: could you kick off a CI run against master to clear the blocker bug?13:14
dooferladericsnow: I wouldn't know how...13:15
ericsnowdooferlad: ah13:15
dimiternericsnow, abentley, mgz, jog_, or sinzui are better people to ask :)13:16
ericsnowdimitern: duh, mixed up irc handles :)13:16
ericsnowabentley, mgz, jog_: could you kick off a CI run against master to clear the blocker bug?13:17
abentleyericsnow: looking...13:29
ericsnowabentley: thanks!13:30
abentleyericsnow: The lxc on our wily slave is unhappy.  We're still in the middle of testing 1.25.  I'm looking into fixing lxc.13:41
ericsnowabentley: k13:42
ericsnowabentley: any way we could get an exception for unblocking master?13:42
ericsnowabentley: I've verified locally that Windows builds and passes the test suite now13:42
abentleyericsnow: We need the lxc working before we can test master.13:43
wwitzel3katco: having google issues14:01
katcowwitzel3: there's no issues like google issues!14:02
abentleyericsnow: I've fixed the lxcs and queued master to be tested next.14:07
ericsnowabentley: thanks!14:08
ericsnowabentley: is it still about 2 hours to run?14:08
=== benji is now known as Guest60778
abentleyericsnow: yes.14:10
ericsnowk14:10
natefinchwwitzel3: we should circle up on my bug... I think I'm going to end up being out a lot today.14:44
wwitzel3ok, now is good14:44
wwitzel3fwereade: ping14:51
fwereadewwitzel3, pong14:52
wwitzel3fwereade: question about deploy.go and bug #148655314:52
mupBug #1486553: i/o timeout errors can cause non-atomic service deploys <cisco> <landscape> <juju-core:Triaged> <juju-core 1.25:In Progress by natefinch> <https://launchpad.net/bugs/1486553>14:52
fwereadewwitzel3, ah yes14:52
wwitzel3fwereade: if you look at the DeployService code, is there a specific reason all of that AddService and UpdateConfig and AddUnits aren't in a single transaction?14:53
wwitzel3fwereade: is there some chicken egg thing going on? Or is a possible fix to prevent the empty service being created just to perform all those in a single transaction?14:54
wwitzel3fwereade: if that isn't possible, then my other thought was to wrap all of those so I could handle the error and properly cleanup previous transactions manually14:54
fwereadewwitzel3, apart from sheer inertia, the trickiest bit of fixing that would be to unpick the machine assignments14:54
fwereadewwitzel3, I am generally a bit underwhelmed by "clean up the mess" approaches, because it's hard to guarantee that they get run14:55
wwitzel3fwereade: yeah, I thought about that, but given that the placement to the unit is the last thing that happens, if that errors, I wouldn't have to actually worry about unpicking the assignments right?14:55
fwereadewwitzel3, I *think* it goes add/assign/add/assign/add/assign etc14:55
wwitzel3fwereade: hrmm, ok, so even if the AddUnitsWithPlacecode returns an error, it may have done the add but not the assign?14:56
wwitzel3fwereade: and that won't be cleaned up?14:56
fwereadewwitzel3, yeah :(14:56
wwitzel3fwereade: that's shitty14:56
wwitzel3lol14:56
fwereadewwitzel3, it has always been like that: my justification is that at least it's *possible* to clean it up manually, and there are only so many hours in the day14:57
fwereadewwitzel3, however14:57
wwitzel3fwereade: :)14:57
fwereadewwitzel3, now that at last we've been told it's important to fix it, we can actually dive into Doing It Right14:58
fwereadewwitzel3, which I *think* is not that hard14:58
fwereadewwitzel3, because:14:58
wwitzel3fwereade: take my hand, show me the way14:58
fwereadewwitzel3, as you observe, add-service/set-config/add-unit go very nicely together14:58
fwereadewwitzel3, there will be tweaks necessary -- eg set all the unit refcounts in the service doc at once14:59
fwereadewwitzel3, and *create* service settings with X data instead of create-empty and set-later14:59
fwereadewwitzel3, and I really don't think there's anything terribly *hard* there15:00
fwereadewwitzel3, but, obviously, that leaves us with a bunch of unassigned units15:01
fwereadewwitzel3, ...and *that* sounds to me like a candidate for a watcher/worker that just assigns unassigned units15:02
fwereadewwitzel3, now this is clearly not a *small* bugfix15:02
wwitzel3fwereade: right, I can probably put a good dent in it today though and hand it off (I'm out next week)15:03
fwereadewwitzel3, but if we have traction on fixing it I think it would be worth while15:03
fwereadewwitzel3, sweet15:03
wwitzel3fwereade: so in the case where they used placement .. the worker/watcher would handle that too15:05
fwereadewwitzel3, (placement directives might be a touch fiddly -- some (`--to 0`) you can just run (or reject) directly, but others will need to be stored somewhere and used by the assigner)15:05
wwitzel3fwereade: but then, there would be a delay right?15:05
fwereadewwitzel3, yeah, there would, I think that's just the price we have to pay15:05
wwitzel3fwereade: we would still have to validate placement as part of the operation though right?15:06
wwitzel3fwereade: we wouldn't want the assigner to come back later and give the user a bad placement error15:06
fwereadewwitzel3, I think it's equivalent to a provisioning error15:07
fwereadewwitzel3, any pre-validation we can do, hell yes15:07
fwereadewwitzel3, in fact I think that covers everything, right?15:08
fwereadewwitzel3, we reject invalid ones on the way in, and we can ask the environ about them15:08
wwitzel3fwereade: so it looks like .. add service w/ config, increment unit refs, validate and store placement directives .. run that transaction15:08
fwereadewwitzel3, but they *might* induce provisioning errors on the associated machines later, just like any other machine15:09
fwereadewwitzel3, yeah exactly15:09
wwitzel3fwereade: assigner picks up job, attempts to use unassigned units, surface an error to the user like a provisioning error15:09
fwereadewwitzel3, yeah15:09
fwereadewwitzel3, in fact I think it is possible that assignment could fail there15:10
fwereadewwitzel3, manual provider with unhelpful assignment policy?15:10
fwereadewwitzel3, we don't have any way to retry machine provisioning *with different constraints/placement*, do we?15:11
wwitzel3fwereade: right, so in the case of this bug, does this fix their issue though .. I'm not sure. If we fail at assignment, we still have a service?15:11
wwitzel3fwereade: I guess this bug was caused by the adding of the service and the updating the config and units not being atomic .. this does solve that15:12
wwitzel3fwereade: an assignment error would be something else and wouldn't be caused by a timeout to the API since the worker would be retrying in cases of timeout15:13
fwereadewwitzel3, sorry just a sec15:14
wwitzel3fwereade: np15:14
fwereadewwitzel3, so, yes, I think that it's a different situation, even if I'm not 100% sure why15:19
fwereadewwitzel3, failing to put my finger on what it is about the assignment logic that plays badly with transactions15:19
fwereadewwitzel3, but I think if we (1) surface the errors and (2) think through how users'd want to address them, we will provide a much better experience there15:21
wwitzel3fwereade: well, at the least, this is an improvement over the current implementation and it addresses the bug15:21
fwereadewwitzel3, (the assignment stuff must be coming up to 3 years old now... memory is hazy)15:21
fwereadewwitzel3, yeah15:21
wwitzel3fwereade: thank you15:22
fwereadewwitzel3, np :)15:22
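[Editor's sketch, not part of the log: a toy Go model of the plan discussed above, where a deploy validates placement up front, then creates the service, its config and its unassigned units in one step, and a separate assigner places the units later. All names here (state, unit, deploy, assignUnassigned, machineIDOnly) are hypothetical, the "digits only" placement rule is invented for illustration, and the in-memory maps merely stand in for a real database transaction.]

```go
package main

import (
	"errors"
	"fmt"
)

// unit is a hypothetical stand-in for a unit document.
type unit struct {
	name      string
	placement string // stored directive; "" means no preference
	machine   string // "" until the assigner has run
}

// state is an in-memory stand-in for the real database.
type state struct {
	services map[string]map[string]string // service name -> config
	units    []*unit
	nextID   int
}

// machineIDOnly is an invented validation rule: empty or all digits.
func machineIDOnly(p string) error {
	for _, r := range p {
		if r < '0' || r > '9' {
			return errors.New("not a machine id")
		}
	}
	return nil
}

// deploy validates everything first and only then mutates state, so a
// failed deploy leaves no empty service behind.
func (s *state) deploy(name string, config map[string]string, placements []string, validate func(string) error) error {
	if _, ok := s.services[name]; ok {
		return errors.New("service already exists: " + name)
	}
	for _, p := range placements {
		if err := validate(p); err != nil {
			return fmt.Errorf("invalid placement %q: %v", p, err)
		}
	}
	s.services[name] = config
	for i, p := range placements {
		s.units = append(s.units, &unit{name: fmt.Sprintf("%s/%d", name, i), placement: p})
	}
	return nil
}

// assignUnassigned plays the assigner worker: it places every unit that has
// no machine yet and reports how many units it assigned.
func (s *state) assignUnassigned() int {
	assigned := 0
	for _, u := range s.units {
		if u.machine != "" {
			continue
		}
		if u.placement != "" {
			u.machine = u.placement
		} else {
			u.machine = fmt.Sprintf("new-%d", s.nextID)
			s.nextID++
		}
		assigned++
	}
	return assigned
}

func main() {
	s := &state{services: map[string]map[string]string{}}
	cfg := map[string]string{"debug": "true"}

	// A bad directive is rejected before anything is written.
	fmt.Println(s.deploy("wordpress", cfg, []string{"0", "oops"}, machineIDOnly))
	fmt.Println(len(s.services), len(s.units))

	// A valid deploy creates the service and two unassigned units together.
	fmt.Println(s.deploy("wordpress", cfg, []string{"0", ""}, machineIDOnly))
	fmt.Println(s.assignUnassigned())
}
```

[The delay wwitzel3 asks about shows up here as the gap between deploy returning and assignUnassigned running; errors found at that later stage would have to be surfaced separately, much like provisioning errors.]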
=== frankban_ is now known as frankban
perrito666who is ocr today??16:10
=== frankban_ is now known as frankban
natefinchwwitzel3, katco, ericsnow: sorry, my wife is not really getting much better, so I won't be getting anything done today.16:35
katconatefinch: hope she starts feeling better soon :(16:35
ericsnownatefinch: hope she gets better soon!16:35
natefinchthanks, hopefully it'll get better overnight16:36
* natefinch stays on to be able to read scrollback16:36
=== natefinch is now known as natefinch-afk
mupBug #1492396 opened: Misleading error when agent-version doesn't match juju version on bootstrap <bootstrap> <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1492396>17:04
perrito666any one willing to rubber stamp a couple of fwports?19:07
perrito666mmpdf, seem to be having flaky tests again19:25
perrito666MachineWithCharmsSuite.TestManageEnvironRunsCharmRevisionUpdater <-- anyone seen that one?19:25
natefinch-afkwwitzel3: you around?20:01
wwitzel3natefinch-afk: yeah20:01
=== natefinch-afk is now known as natefinch
natefinchwwitzel3: let's catch up20:01
