davecheney | mwhudson: testing now, thanks | 00:24 |
---|---|---|
davecheney | this stuff is a bit subtle | 00:25 |
mwhudson | davecheney: as has been noted before the go linker is terrible | 00:26 |
mwhudson | davecheney: i know what i was doing wrong though | 00:27 |
mwhudson | basically the value of goarm was being appended to the value from the runtime.a file | 00:27 |
mwhudson | so it was \x00\x07 | 00:27 |
mwhudson | and it's only a uint8 so the runtime was just seeing 0 | 00:27 |
ericsnow | wallyworld_: I've landed that fix to unblock master (thanks for the review) | 00:31 |
wallyworld_ | np, thanks for fixing :-) | 00:32 |
ericsnow | wallyworld_: CI is supposed to auto-unblock once CI passes, right? | 00:32 |
wallyworld_ | yes | 00:32 |
* ericsnow crosses fingers | 00:32 | |
ericsnow | g'night | 00:32 |
davecheney | ericsnow: thanks for fixing that | 00:45 |
stokachu | ran into an interesting bug: https://bugs.launchpad.net/juju-core/+bug/1492088 | 01:59 |
mup | Bug #1492088: juju bootstrap fails inside a wily container <cloud-installer> <juju-core:New> <https://launchpad.net/bugs/1492088> | 01:59 |
stokachu | anyone seen this before? | 01:59 |
mup | Bug #1492088 opened: juju bootstrap fails inside a wily container <cloud-installer> <juju-core:New> <https://launchpad.net/bugs/1492088> | 02:04 |
mwhudson | davecheney: say, do you know off hand the difference between GOARM=5 and GOARM=6? | 02:11 |
mwhudson | oh soft float seems like a major suspect | 02:12 |
davecheney | 12:10 < dfc> thumper: http://reviews.vapour.ws/r/2588/ | 02:13 |
davecheney | 12:10 < dfc> remove version.Binary.OS | 02:13 |
davecheney | mwhudson: yes, and no atomics | 02:13 |
davecheney | no STREX/LDREX | 02:13 |
mwhudson | i think this is soft float | 02:16 |
mwhudson | davecheney: if you enable the -shared flag for arm things go very wrong in hashmap code | 02:17 |
mwhudson | and there is floating point code that could explain it | 02:17 |
davecheney | yes, the hash factor | 02:21 |
davecheney | or fill factor or something | 02:21 |
davecheney | from memory hashmap.go:mapinit | 02:21 |
perrito666 | cmd/juju/status_test.go is a clear form of modern torture | 02:26 |
wallyworld_ | yes it is :-( | 02:27 |
* perrito666 uses the time between testruns to learn emacs | 02:28 | |
* perrito666 wonders if he can set kb layouts only for a given app | 02:29 | |
mwhudson | davecheney: yeah, luckily the softfloat is so broken that it doesn't get 10.0*100.0 right... | 02:29 |
davecheney | perrito666: i'm working on the great american novel while running tests on ppc64 | 02:29 |
perrito666 | heh | 02:29 |
perrito666 | oh finally, success | 02:30 |
perrito666 | I changed one value in formatted status | 02:31 |
perrito666 | git diff status_test.go | grep "^\+" | wc -l | 02:31 |
perrito666 | 105 | 02:31 |
mwhudson | ARGH | 02:35 |
davecheney | sinzui, I thought we had a voting race builder now ? | 02:52 |
davecheney | https://bugs.launchpad.net/juju-core/+bug/1492095 | 02:53 |
davecheney | mwhudson: soft float only works for values of 1 with extremely large exponents | 02:53 |
mup | Bug #1492095: worker/statushistorypruner: data race <juju-core:New> <https://launchpad.net/bugs/1492095> | 02:53 |
mwhudson | haha i fixed it | 02:53 |
mwhudson | not sure that was a sensible use of my time, but it took waaaay less than figuring out what was going on | 02:54 |
mup | Bug #1492095 opened: worker/statushistorypruner: data race <juju-core:New> <https://launchpad.net/bugs/1492095> | 03:02 |
davecheney | thumper: thanks for your comments, please see my reply | 03:03 |
thumper | davecheney: I agree with your summation | 03:04 |
thumper | davecheney: all within time | 03:04 |
perrito666 | davecheney: thanks for finding the race (that was me, most likely) | 03:04 |
davecheney | perrito666: np | 03:04 |
davecheney | thumper: it's a valid operation | 03:05 |
davecheney | but I think it deserves to be broken out into its own logic | 03:05 |
davecheney | possibly in an upcoming juju/series package ? | 03:05 |
thumper | perhaps | 03:26 |
davecheney | mwhudson: congrats on your +2 | 05:03 |
davecheney | proving the motto of the Go team: "you get commit rights when we get sick of committing your stuff" | 05:03 |
mwhudson | davecheney: thanks | 05:31 |
mwhudson | and, yeah | 05:31 |
davecheney | we're classy like that | 05:33 |
=== Spads_ is now known as Spads | ||
wallyworld_ | axw: a small one if you have a moment http://reviews.vapour.ws/r/2590/ | 07:09 |
axw | wallyworld_: looking | 07:29 |
wallyworld_ | ta | 07:29 |
axw | wallyworld_: filesystem CLI is delayed, the existing volume stuff needs cleaning up first. I created another card on the board | 07:30 |
wallyworld_ | np | 07:30 |
axw | wallyworld_: I don't really understand why this branch is required at all. when would we ever get to the end of the resolver and have Started==false? | 07:38 |
wallyworld_ | axw: during my testing i saw the update-status hook fire before the start hook had run (after install hook i think, before leader-elected) | 07:39 |
wallyworld_ | so it's in response to observed behaviour | 07:39 |
wallyworld_ | hmmm | 07:40 |
axw | wallyworld_: ah hm, apparently Start isn't run until after the first ConfigChanged | 07:40 |
wallyworld_ | maybe the refactoring got rid of the problem | 07:40 |
axw | wallyworld_: so if the update status trigger comes in before config changed, this could happen I think | 07:41 |
wallyworld_ | config changed may come first yeah | 07:41 |
wallyworld_ | yeah | 07:41 |
axw | wallyworld_: but no... we always wait for the first config changed | 07:41 |
wallyworld_ | oh, ok, i forgot that | 07:41 |
axw | wallyworld_: although I think it could still happen in the case of a failed/resolved config-changed | 07:42 |
wallyworld_ | could do. adding this extra started check is trivial and seems like a good safety net | 07:43 |
axw | wallyworld_: we should probably move that logic into the resolver (out of operation/runhook.go), and have it drop out of the resolver if !Started and waiting for config changed | 07:43 |
axw | maybe not now | 07:43 |
wallyworld_ | the stuff in run hook commit? | 07:44 |
axw | wallyworld_: yes | 07:44 |
wallyworld_ | yeah, makes sense to do that now i think, but not earlier with the old code | 07:44 |
axw | wallyworld_: actually even that doesn't make sense. if we resolved the config-changed, it'd still commit and go to Started | 07:48 |
wallyworld_ | so maybe now that the update status trigger has been pulled into the main loop processing, the issue is moot | 07:50 |
wallyworld_ | whereas before it was a concurrency lottery | 07:50 |
axw | wallyworld_: right, yeah. I thought you were testing with your change | 07:50 |
wallyworld_ | i just added a bunch of cards, but i should have retested after the first refactor | 07:51 |
wallyworld_ | i'll retest and drop this branch probably | 07:51 |
wallyworld_ | just goes to show how fragile the uniter was before | 07:51 |
wallyworld_ | and how this rework has fixed a bunch of stuff implicitly | 07:52 |
axw | wallyworld_: that particular bit was okay before maltese-falcon, it got broken during (I think) | 07:58 |
wallyworld_ | that may well be true | 07:58 |
wallyworld_ | axw: and i think those other cards about duplicate status may be bugs in status history (need to dig further). so the feature branch may well be almost ready | 07:58 |
axw | wallyworld_: cool. | 07:59 |
wallyworld_ | i have soccer now but will continue testing after | 07:59 |
fwereade | cmars, axw: you know the worker/gate thing I did? | 09:26 |
fwereade | cmars, axw: I think that we want a sort of extension of the concept to describe what charmdir.worker really does | 09:26 |
fwereade | cmars, axw: because I think it really is just a custom synchronisation construct, the charmdir relationship is entirely incidental | 09:27 |
fwereade | cmars, axw: metaphorically something like Fortress sorta works -- clients can Visit(func() error), the person in charge can Unlock (unblock Visits) and Lockdown (stop accepting new Visits, wait for existing ones to complete) | 09:31 |
fwereade | cmars, axw: but that's mainly just because I'm thinking about gates | 09:31 |
fwereade | or anyone who's interested in naming problems :) ^^ | 09:32 |
perrito666 | davecheney: still here? | 11:11 |
voidspace | dimitern: http://reviews.vapour.ws/r/2593/ | 11:24 |
voidspace | fwereade: ping | 11:25 |
dimitern | voidspace, cheers | 11:25 |
mup | Bug #1492232 opened: backup hogs resources. <juju-core:New> <https://launchpad.net/bugs/1492232> | 11:39 |
mup | Bug #1492237 opened: juju state server mongod uses too much disk space <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1492237> | 12:09 |
mup | Bug #1492241 opened: juju upgrade-juju cli doesn't provide clear feedback on action being taken <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1492241> | 12:09 |
dimitern | fwereade, hey, are you around? | 12:39 |
fwereade | dimitern, heyhey -- and voidspace, oops, sorry | 12:39 |
dimitern | fwereade, :) voidspace has a branch that I'd like you to have a look at, if possible | 12:40 |
fwereade | dimitern, just saw; voidspace, looking forward to it :) | 12:40 |
dimitern | fwereade, hopefully rectifies some of the issues with unit addresses changing randomly | 12:40 |
dimitern | fwereade, awesome, thanks :) | 12:40 |
dimitern | fwereade, we were careful not to break api compatibility | 12:42 |
ericsnow | mgz: could we get a CI run against master to clear that blocking bug? | 12:46 |
voidspace | fwereade: cool | 13:08 |
voidspace | fwereade: it touches the uniter which is why we pinged you particularly | 13:08 |
voidspace | fwereade: (touches it in a very minor way) | 13:08 |
fwereade | voidspace, phew :) | 13:08 |
fwereade | voidspace, what with maltese-falcon and all ;) | 13:08 |
voidspace | fwereade: yeah... | 13:12 |
voidspace | fwereade: hopefully a good touch not a bad touch... | 13:12 |
ericsnow | mgz, dooferlad: ping | 13:14 |
dooferlad | ericsnow: pong | 13:14 |
ericsnow | dooferlad: could you kick off a CI run against master to clear the blocker bug? | 13:14 |
dooferlad | ericsnow: I wouldn't know how... | 13:15 |
ericsnow | dooferlad: ah | 13:15 |
dimitern | ericsnow, abentley, mgz, jog_, or sinzui are better people to ask :) | 13:16 |
ericsnow | dimitern: duh, mixed up irc handles :) | 13:16 |
ericsnow | abentley, mgz, jog_: could you kick off a CI run against master to clear the blocker bug? | 13:17 |
abentley | ericsnow: looking... | 13:29 |
ericsnow | abentley: thanks! | 13:30 |
abentley | ericsnow: The lxc on our wily slave is unhappy. We're still in the middle of testing 1.25. I'm looking into fixing lxc. | 13:41 |
ericsnow | abentley: k | 13:42 |
ericsnow | abentley: any way we could get an exception for unblocking master? | 13:42 |
ericsnow | abentley: I've verified locally that Windows builds and passes the test suite now | 13:42 |
abentley | ericsnow: We need the lxc working before we can test master. | 13:43 |
wwitzel3 | katco: having google issues | 14:01 |
katco | wwitzel3: there's no issues like google issues! | 14:02 |
abentley | ericsnow: I've fixed the lxcs and queued master to be tested next. | 14:07 |
ericsnow | abentley: thanks! | 14:08 |
ericsnow | abentley: is it still about 2 hours to run? | 14:08 |
=== benji is now known as Guest60778 | ||
abentley | ericsnow: yes. | 14:10 |
ericsnow | k | 14:10 |
natefinch | wwitzel3: we should circle up on my bug... I think I'm going to end up being out a lot today. | 14:44 |
wwitzel3 | ok, now is good | 14:44 |
wwitzel3 | fwereade: ping | 14:51 |
fwereade | wwitzel3, pong | 14:52 |
wwitzel3 | fwereade: question about deploy.go and bug #1486553 | 14:52 |
mup | Bug #1486553: i/o timeout errors can cause non-atomic service deploys <cisco> <landscape> <juju-core:Triaged> <juju-core 1.25:In Progress by natefinch> <https://launchpad.net/bugs/1486553> | 14:52 |
fwereade | wwitzel3, ah yes | 14:52 |
wwitzel3 | fwereade: if you look at the DeployService code, is there a specific reason all of that AddService and UpdateConfig and AddUnits aren't in a single transaction? | 14:53 |
wwitzel3 | fwereade: is there some chicken egg thing going on? Or is a possible fix to prevent the empty service being created just to perform all those in a single transaction? | 14:54 |
wwitzel3 | fwereade: if that isn't possible, then my other thought was to wrap all of those so I could handle the error and properly cleanup previous transactions manually | 14:54 |
fwereade | wwitzel3, apart from sheer inertia, the trickiest bit of fixing that would be to unpick the machine assignments | 14:54 |
fwereade | wwitzel3, I am generally a bit underwhelmed by "clean up the mess" approaches, because it's hard to guarantee that they get run | 14:55 |
wwitzel3 | fwereade: yeah, I thought about that, but given that the placement to the unit is the last thing that happens, if that errors, I wouldn't have to actually worry about unpicking the assignments right? | 14:55 |
fwereade | wwitzel3, I *think* it goes add/assign/add/assign/add/assign etc | 14:55 |
wwitzel3 | fwereade: hrmm, ok, so even if the AddUnitsWithPlacecode returns an error, it may have done the add but not the assign? | 14:56 |
wwitzel3 | fwereade: and that won't be cleaned up? | 14:56 |
fwereade | wwitzel3, yeah :( | 14:56 |
wwitzel3 | fwereade: that's shitty | 14:56 |
wwitzel3 | lol | 14:56 |
fwereade | wwitzel3, it has always been like that: my justification is that at least it's *possible* to clean it up manually, and there are only so many hours in the day | 14:57 |
fwereade | wwitzel3, however | 14:57 |
wwitzel3 | fwereade: :) | 14:57 |
fwereade | wwitzel3, now that at last we've been told it's important to fix it, we can actually dive into Doing It Right | 14:58 |
fwereade | wwitzel3, which I *think* is not that hard | 14:58 |
fwereade | wwitzel3, because: | 14:58 |
wwitzel3 | fwereade: take my hand, show me the way | 14:58 |
fwereade | wwitzel3, as you observe, add-service/set-config/add-unit go very nicely together | 14:58 |
fwereade | wwitzel3, there will be tweaks necessary -- eg set all the unit refcounts in the service doc at once | 14:59 |
fwereade | wwitzel3, and *create* service settings with X data instead of create-empty and set-later | 14:59 |
fwereade | wwitzel3, and I really don't think there's anything terribly *hard* there | 15:00 |
fwereade | wwitzel3, but, obviously, that leaves us with a bunch of unassigned units | 15:01 |
fwereade | wwitzel3, ...and *that* sounds to me like a candidate for a watcher/worker that just assigns unassigned units | 15:02 |
fwereade | wwitzel3, now this is clearly not a *small* bugfix | 15:02 |
wwitzel3 | fwereade: right, I can probably put a good dent in it today though and hand it off (I'm out next week) | 15:03 |
fwereade | wwitzel3, but if we have traction on fixing it I think it would be worth while | 15:03 |
fwereade | wwitzel3, sweet | 15:03 |
wwitzel3 | fwereade: so in the case where they used placement .. the worker/watcher would handle that too | 15:05 |
fwereade | wwitzel3, (placement directives might be a touch fiddly -- some (`--to 0`) you can just run (or reject) directly, but others will need to be stored somewhere and used by the assigner) | 15:05 |
wwitzel3 | fwereade: but then, there would be a delay right? | 15:05 |
fwereade | wwitzel3, yeah, there would, I think that's just the price we have to pay | 15:05 |
wwitzel3 | fwereade: we would still have to validate placement as part of the operation though right? | 15:06 |
wwitzel3 | fwereade: we wouldn't want the assigner to come back later and give the user a bad placement error | 15:06 |
fwereade | wwitzel3, I think it's equivalent to a provisioning error | 15:07 |
fwereade | wwitzel3, any pre-validation we can do, hell yes | 15:07 |
fwereade | wwitzel3, in fact I think that covers everything, right? | 15:08 |
fwereade | wwitzel3, we reject invalid ones on the way in, and we can ask the environ about them | 15:08 |
wwitzel3 | fwereade: so it looks like .. add service w/ config, increment unit refs, validate and store placement directives .. run that transaction | 15:08 |
fwereade | wwitzel3, but they *might* induce provisioning errors on the associated machines later, just like any other machine | 15:09 |
fwereade | wwitzel3, yeah exactly | 15:09 |
wwitzel3 | fwereade: assigner picks up job, attempts to use unassigned units, surface an error to the user like a provisioning error | 15:09 |
fwereade | wwitzel3, yeah | 15:09 |
fwereade | wwitzel3, in fact I think it is possible that assignment could fail there | 15:10 |
fwereade | wwitzel3, manual provider with unhelpful assignment policy? | 15:10 |
fwereade | wwitzel3, we don't have any way to retry machine provisioning *with different constraints/placement*, do we? | 15:11 |
wwitzel3 | fwereade: right, so in the case of this bug, does this fix their issue though .. I'm not sure. If we fail at assignment, we still have a service? | 15:11 |
wwitzel3 | fwereade: I guess this bug was caused by the adding of the service and the updating the config and units not being atomic .. this does solve that | 15:12 |
wwitzel3 | fwereade: an assignment error would be something else and wouldn't be caused by a timeout to the API since the worker would be retrying in cases of timeout | 15:13 |
fwereade | wwitzel3, sorry just a sec | 15:14 |
wwitzel3 | fwereade: np | 15:14 |
fwereade | wwitzel3, so, yes, I think that it's a different situation, even if I'm not 100% sure why | 15:19 |
fwereade | wwitzel3, failing to put my finger on what it is about the assignment logic that plays badly with transactions | 15:19 |
fwereade | wwitzel3, but I think if we (1) surface the errors and (2) think through how users'd want to address them, we will provide a much better experience there | 15:21 |
wwitzel3 | fwereade: well, at the least, this is an improvement over the current implementation and it addresses the bug | 15:21 |
fwereade | wwitzel3, (the assignment stuff must be coming up to 3 years old now... memory is hazy) | 15:21 |
fwereade | wwitzel3, yeah | 15:21 |
wwitzel3 | fwereade: thank you | 15:22 |
fwereade | wwitzel3, np :) | 15:22 |
=== frankban_ is now known as frankban | ||
perrito666 | who is ocr today?? | 16:10 |
=== frankban_ is now known as frankban | ||
natefinch | wwitzel3, katco, ericsnow: sorry, my wife is not really getting much better, so I won't be getting anything done today. | 16:35 |
katco | natefinch: hope she starts feeling better soon :( | 16:35 |
ericsnow | natefinch: hope she gets better soon! | 16:35 |
natefinch | thanks, hopefully it'll get better overnight | 16:36 |
* natefinch stays on to be able to read scrollback | 16:36 | |
=== natefinch is now known as natefinch-afk | ||
mup | Bug #1492396 opened: Misleading error when agent-version doesn't match juju version on bootstrap <bootstrap> <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1492396> | 17:04 |
perrito666 | any one willing to rubber stamp a couple of fwports? | 19:07 |
perrito666 | mmpdf, seem to be having flaky tests again | 19:25 |
perrito666 | MachineWithCharmsSuite.TestManageEnvironRunsCharmRevisionUpdater <-- anyone seen that one? | 19:25 |
natefinch-afk | wwitzel3: you around? | 20:01 |
wwitzel3 | natefinch-afk: yeah | 20:01 |
=== natefinch-afk is now known as natefinch | ||
natefinch | wwitzel3: let's catch up | 20:01 |
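
For readers following the 09:31 thread, here is a minimal Go sketch of the "Fortress" idea fwereade describes: clients call Visit(func() error), and the person in charge calls Unlock and Lockdown. It is only an illustration under stated assumptions and is not the juju implementation. The type name, the NewFortress constructor, and ErrLocked are hypothetical, and the sketch rejects a Visit while locked instead of blocking it, which may differ from what was intended by "unblock Visits".

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrLocked is a hypothetical error for visits attempted while locked down.
var ErrLocked = errors.New("fortress is locked down")

// Fortress is a hypothetical synchronisation construct based on the chat.
// Unlock and Lockdown are assumed to be called from one controlling goroutine.
type Fortress struct {
	mu       sync.Mutex
	unlocked bool
	visits   sync.WaitGroup
}

// NewFortress returns a Fortress that starts locked down.
func NewFortress() *Fortress { return &Fortress{} }

// Unlock starts accepting new visits.
func (f *Fortress) Unlock() {
	f.mu.Lock()
	f.unlocked = true
	f.mu.Unlock()
}

// Lockdown stops accepting new visits, then waits for running ones to finish.
func (f *Fortress) Lockdown() {
	f.mu.Lock()
	f.unlocked = false
	f.mu.Unlock()
	f.visits.Wait()
}

// Visit runs visit if the fortress is unlocked and returns its error.
// Assumption: a locked fortress rejects the visit rather than blocking it.
func (f *Fortress) Visit(visit func() error) error {
	f.mu.Lock()
	if !f.unlocked {
		f.mu.Unlock()
		return ErrLocked
	}
	f.visits.Add(1) // registered under the mutex, so Lockdown will wait for it
	f.mu.Unlock()
	defer f.visits.Done()
	return visit()
}

func main() {
	f := NewFortress()
	fmt.Println(f.Visit(func() error { return nil })) // rejected: still locked
	f.Unlock()
	fmt.Println(f.Visit(func() error { return nil })) // runs the visit
	f.Lockdown()
	fmt.Println(f.Visit(func() error { return nil })) // rejected again
}
```

Registering each visit while holding the mutex means Lockdown cannot miss a visit that was admitted just before the flag flipped. A visit that itself calls Lockdown would deadlock in this sketch.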