[00:05] wallyworld_: https://github.com/juju/juju/pull/621 [00:05] ready for review [00:05] looking [00:09] wallyworld_: grabbing some supper, brb [00:09] ok [00:13] wallyworld_: back [00:14] katco: i left a couple of droppings in your PR [00:14] i think maybe the test coverage needs to be expanded a bit [00:16] wallyworld_: gah i'm flipping back and forth b/t branches too much. the reason i didn't use the val in that map is b/c with the harvesting stuff, what's in the map will be a string and we want an int. here it's totally fine though. thanks :p [00:16] :-) [00:44] wallyworld_: ready again. running tests on my machine [00:44] ok [00:54] wallyworld_: http://golang.org/doc/effective_go.html#redeclaration [00:56] katco: rightio. i thought that only applied to the first variable [00:56] i HATE := vs = soooo much. worst design decision EVER [00:56] wallyworld_: afaik, go doesn't do anything anywhere with regards to parameter ordering [00:56] wallyworld_: haha [00:57] wallyworld_: i don't mind it, but i do wish they would have standardized on new(...) vs := [00:57] := vs = is the cause of so many bugs [00:57] and anyway, it's the fucking compiler's problem to sort out, not the programmer [00:57] wallyworld_: really? i haven't experienced that directly yet [00:57] we have in juju [00:58] wallyworld_: any more feedback? i think i got everything/pushed [00:58] katco: yeah, just about to LGTM but you interrupted me :-P [00:59] wallyworld_: sorry oh supreme leader of wallyworld! ;) [00:59] have i told you today? [00:59] told me what? [00:59] fark off! [00:59] LOL [00:59] that'll be 2x i think today [00:59] there, now you've been told [00:59] haha [01:00] ok back to the harvesting stuff [01:00] thanks for fixing [01:00] thanks for the review [01:00] thumper: should be landing momentarily. sorry for the regression. [01:00] don't apoligise to him, he will expect it everytime now [01:01] i already told him i trusted him. 
i think i'm just off on the wrong foot with that guy [01:01] axw, ping [01:01] axw, was doing some research on azure earlier today and found some interesting info, wanted to run by you.. [01:01] he's just a pussy cat really, roll him over and rub his belly and he's all good [01:02] axw, nutshell we don't need to associate the env to an affinity group anymore for the sole purpose of getting a vnet.. vnets can be associated to regions now [01:03] hazmat: i have a theory why provisioning is failing, but the log files don't contain the error message i would expect to see, so it's a guess. apt contention installing container dependencies [01:03] that explains the issue you raised, but doesn't explain the one where only one container out of several fails to start [01:04] wallyworld_, do we normally have errors we don't log? [01:04] wallyworld_, possibly.. i thought that was addressed already via retry? [01:04] hazmat: yes we do, and i'm not seeing them which is confusing me [01:04] wallyworld_, the issue is nothing else on that machine is installing anything [01:04] wallyworld_, the one other unit on the machine.. is the ubuntu charm.. aka do nothing [01:04] wallyworld_, its log is also in the tarball [01:04] wallyworld_, so apt contention with what.. [01:05] hazmat: retry is only in 1.20.6 [01:05] wallyworld_, ah.. fair enough.. this is .5 [01:05] wallyworld_, but still curious as to what it would contend with [01:05] hazmat: ok, i didn't see what the unit was. 
but apt contention is the only thing that i can see right now that explains why the logging cuts off at the point it does [01:06] there may be another cause [01:06] wallyworld_, why are all the container watchers being killed on all the machines at the same time [01:06] but what's happening is that the code is calling the container setup,which calls apt, but then gets no further [01:07] hazmat: they are killed because they are no longer needed - they exist to set up container support for the machine and then they die [01:07] ie the apt stuff and set up to run lxc is done lazily [01:07] once the first lxc is asked for [01:07] wallyworld_, so it would be a container provisioner logic issue then [01:08] maybe, i can't explain why things just stop [01:08] there was an issue in 1.20.5 where the watcher was stopped twice [01:08] but i don't think that will cause this issue [01:08] i need to keep digging a bit [01:08] wallyworld_, ack [01:09] thanks for getting the logs [01:09] np [01:09] i'm keen to get 1.20.6 out there [01:09] so we can see how it behaves [01:09] lots of fixes in there [01:09] a CI issue with azure is holding things up [01:10] the --upload-tools issue on azure? [01:10] azure was broken for CI, might be fixed now [01:10] not sure, i just heard 2nd hand that the CI tests failed [01:10] they passed yesterday or the day before [01:10] but i haven't heard directly [01:11] we'll be pushing for a release tomorrow regardless [01:11] we have to get this 1.20.6 out and into the hands of landscape and other folks [01:12] wallyworld_, so does that mean there's some coordination between container watcher and container provisioner? === Ursinha is now known as Ursinha-afk [01:24] hazmat: I changede our vnet creation to use a "location" (region) a while back, but reverted it. 
not sure if it was a coincidence, but after that change there were a lot of problems with the vnet not being available [01:25] it would take >5 mins for the vnet to be accessible after creation [01:25] axw, interesting [01:25] IIRC the warning message that popped up in the azure console said it was only a problem with vnets created without an affinity group [01:26] katco: awesome, ta [01:26] thumper: np [01:27] hazmat: did you see my PR for the docs on zones? [01:27] axw, i did looked good [01:27] reading your doc now [01:27] hazmat: yes, the container watcher starts the provisioner when a new container is requested [01:30] ah ic now [01:32] wallyworld_, by why would container watcher killed be seen before apt-get install lxc if the watcher is responsible for installing the pre-reqs [01:35] hazmat: not sure. i think it asks the worker to die, but it won't do so until the current operation has finished ie the provisioner is started and then it exists [01:35] i'm not 100% across the worker infrastructure [01:35] wallyworld_, that's not apparent [01:35] wallyworld_, ie the logs where its successful show it die and then the provisioner come up [01:37] hazmat: my understanding is that kill() marks the worker as dying, and it still needs the current loop invocation to finish, but i'm not sure [01:40] i'll look at reworking it, adding more logging also [01:49] ah ic.. it signals to stops itself before doing its actual work [01:51] thumper: https://github.com/juju/juju/pull/614 [01:51] if you have a sec [01:51] * thumper looks [01:52] this is th eone from standup [02:02] waigani: you can stop reviewing https://github.com/juju/juju/pull/547, it's redundant [02:02] I already fixed the problem [02:04] axw: ah so I see, thanks I missed that [02:10] hazmat: I don't have much to say on your doc, SGTM. 
[02:10] it would be nice if we could use thise to enable colocation of services in azure [02:11] atm that's disallowed because we can't control which units communicate to which based on zone allocation [02:11] axw, hmm [02:12] (I still have no idea how it would work though) [02:12] axw, zone/fault domain in azure is a logical concept that's specific to azure service and its role instances. [02:13] axw, theoretically we could map to those when doing co-location, and pick the appropriate next instance (ie distribution group from co-located service's instances [02:13] you'd also have to make sure you don't spread the two units across fault domains though [02:14] it's not enough to stick them in the same availability set [02:14] and then there's upgrade domains [02:14] axw, you do want to spread across fault domains. [02:14] axw, we don't actually use upgrade domains afaics [02:15] they are implicitly used [02:15] axw, does azure use upgrade domains under the hood? [02:15] when the machine is upgraded [02:15] i.e. regular maintenance [02:15] ic [02:15] that's my understanding anyway [02:15] my understanding was that it was tied to the app roll out of updates [02:16] but yeah.. underlying upgrades also makes sense [02:16] it is definitely tied to the app updates [02:16] I thought both tho [02:17] hazmat: re spreading across fault domains, I mean if you have a co-dependent app server & db, you surely don't want to spread htem across fault domains [02:17] but multiple units of each, eys [02:17] yes* [02:18] axw, every ref ic to upgrade domain references app /deployment updates.. not iaas updates [02:18] axw, multiple units of each.. and you'd want spread.. single unit of each.. does it matter ;-) [02:19] hazmat: all I'm saying is the pairs need to be located in the same fault domain, otherwise you have a broken service if one goes down [02:19] axw, single unit of each and we don't really have any real notion of trying to keep it up.. 
fault domains are not global [02:19] their service local logical [02:20] axw, ie. if their co-located their on the same vm.. so doesn't matter.. if their separate services in azure [02:20] there is no guarantee that 0 == 0 between two services fault domains [02:20] right, they have to be in the same cloud service [02:21] it's a bit messy, forget I said anything :) [02:21] when thye're in the same CS there's also issues of port collision [02:21] axw, so we'd use them as separate roles ? [02:22] within a service [02:22] axw, yeah [02:22] azure.. is special [02:23] yes, separate roles. I was thinking we could deploy a service and specify the cloud service name [02:24] (to be the same as an existing one) [02:24] woah.. now your talking crazy.. semantic service names in an iaas console ? ;-) [02:24] i walk away for a few months to come back and remember how special it is.. i wrote up some code to verify the fault/upgrade domain thingy and its interaction with affinity groups. https://gist.github.com/kapilt/d326b853e4606f9203e9 i kinda of wish we had a list-machines to do iaas provider specific details [02:24] axw, oh.. nevermind not semantic [02:25] axw, we currently do separate roles per instance as well.. there's some messiness trying to treat azure as general compute [02:26] a (Virtual Machine) role is an instance [02:26] there's some other roles that aren't applicable to IaaS [02:26] web worker roles.. don't know much about them [02:26] axw, so why do we/they have roles and role_instance_list separately [02:26] nfi [02:27] I think it's to do with deployments [02:27] you can have prod/testing deployments [02:27] and switch them at runtime [02:27] yeah.. the slots and upgrades [02:27] and rollbacks [02:27] so you define a role and ther's an instance for it in each deployment [02:28] ah.. ic.. that makes a certain sense.. 
logical from instantation across prod vs staging [02:29] anyway, so what I was saying is we could do, say "juju deploy app --to cloudservice=mythingy" and "juju deploy db --to cloudservice=mythingy", then if you ensure each service has at least the same number of units as there are fault domains, then the units can self organise to talk to units in the same fault domain [02:30] there's still the issue of port collisions, but there's not much we can do about that. only matters for exposed services anyway [02:31] waigani: https://github.com/juju/juju/pull/622 [02:32] axw, there on the same machine w/ co-location.. so the port collision thing is immaterial to the provider. [02:32] axw, also matters for unexposed.. cause failure to bind [02:32] hazmat: not same machine, just same cloud service [02:32] axw, we don't control fault domain [02:33] davecheney: looking :D [02:33] hazmat: no... but there are 2 fault domains and we allocate 2 units, I think Azure will spread them equally? [02:33] but if* [02:33] it will [02:34] axw, this is where the spec comes into play.. the charms can choose to self-organize that way if they choose.. via relation-get query to remote unit matching zone [02:34] hazmat: right, that was my point :) [02:35] I'm saying with your proposal, this is feasible [02:35] axw, aha.. finally i understand.. i should go to bed.. that took a while ;-) [02:35] davecheney: that's awesome. [02:42] davecheney: what's the -type d flag? [02:43] help just says: -type [bcdpflsD] [02:43] not very helpful... [02:44] waigani: man find [02:45] waigani: please review my comments to https://github.com/juju/juju/pull/617 [02:51] waigani: please review my comments to https://github.com/juju/juju/pull/613 [02:52] waigani: did you want to update the envuser stuff now with the st.environTag, or as a followup? [02:53] thumper: followup? 
I've got the todos in there so should be easy/quick [02:53] kk [02:53] * thumper keeps reviewing [02:58] davecheney: I've got to do the school run, I'll be back online in a bit [03:14] kk [03:46] heh... [03:46] * thumper squeezed (╯°□°)╯︵ ┻━┻ into a unit test [03:54] * hazmat steps back from the unicode wizardry [03:59] davecheney, thinking about your concern with the empty ActionTag as a signal of non-action hook [04:00] davecheney, since I'm initializing the ActionTag with an empty value and only inserting a value (via api) if the hook was an Action, it seems to me like there would never be a case when it would not suffice as the switch [04:01] bodie_: then you never need to check ? [04:02] davecheney, well, the check is to consider whether it is an action (i.e. always has a tag value), or not in which case the value will always be empty [04:02] if it's empty then use somethign that can be nil [04:03] otherwise you'll get fucked by the subtle difference between var a names.ActionTag, and a = names.NewActionTag("") [04:03] I think the latter would only happen if the action didn't have an id, in which case we're fucked anyway [04:04] axw: a small one https://github.com/juju/juju/pull/623 if you have a moment [04:04] but, that error case should get caught by runHook [04:04] looking [04:04] i.e., the value is *always* going to either be empty = non-action, or non-empty = action, or already errored out when the id was mysteriously missing [04:05] bodie_: i don't like using the zero value like that [04:05] that is my feeling too [04:05] please make it a pointer or use the names.Tag interface [04:05] sounds like a plan [04:06] :) [04:06] thanks [04:06] cool [04:06] thanks [04:08] wallyworld_: that AddInt32 test looks like crack anyway [04:09] won't it always stop the last worker it added? 
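On the earlier ActionTag point, a rough sketch of the "make it a pointer" suggestion: a nil pointer means "not an action hook", so the zero value of the tag never has to carry that meaning. The `actionTag` and `hookContext` types here are hypothetical stand-ins, not the real `names.ActionTag` or juju's hook context.

```go
package main

import "fmt"

// actionTag is a hypothetical stand-in for names.ActionTag.
type actionTag struct{ id string }

// hookContext is hypothetical; action is nil for non-action hooks.
type hookContext struct {
	action *actionTag
}

// describe switches on presence of the tag, not on its contents.
func describe(ctx hookContext) string {
	if ctx.action == nil {
		return "plain hook"
	}
	return "action " + ctx.action.id
}

func main() {
	fmt.Println(describe(hookContext{}))
	fmt.Println(describe(hookContext{action: &actionTag{id: "42"}}))
}
```

With this shape, a tag that was constructed with an empty id is still visibly an action, so that error case can be handled separately instead of being mistaken for a non-action hook.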
[04:09] it doesn't add workers [04:09] it stops the container watcher once all supported container types have been intialised [04:10] lazy init of containers [04:10] ah, I see [04:10] i'm not sure it was bad how it was, but it's more logical to have it in a defer i think [04:19] axw: thanks. the defer is a hail mary. it *shouldn't* matter but the runner stuff is a bit mysterious. certainly early termination of the worker is one explanation for the logs i saw [04:20] waigani: if you can't use the factory, just use the state methods to create users [04:20] thumper: ok [04:38] thumper: https://github.com/juju/juju/pull/553 [04:40] waigani: I'll look shortly, need to go make dinner [04:40] thumper: np, I'll have to do the same soon - at ice skating right now [04:45] wallyworld_: can you please close https://github.com/juju/juju/pull/547? [04:45] sure [04:45] thanks [04:56] davecheney, addressed your points. any response to https://github.com/juju/juju/pull/617#discussion_r16817826 when you have a sec? [05:42] davecheney, this code is the dep for a bunch of other stuff, so if I can get even a brief comment on that reply it would be really helpful to moving us forward [05:42] otherwise I believe others may hesitate to jump in on that topic [05:43] and since this is my 1:30 am, I don't have a lot of confidence I will get a chance to pester you again soon :) [05:55] axw: something to ponder with the tools work, not 100% relevant now but good to keep in mind https://bugs.launchpad.net/juju-core/+bug/1347984 [05:55] Bug #1347984: container provisioner may choose bad tools [05:56] wallyworld_: thanks [06:17] wallyworld_: you make a good point about "pending forever", but it's the same either way [06:17] perhaps when we fix that we can put a sensible timeout in place? 
[06:18] yeah, we do need to do something [06:18] we have work scheduled to improve this area === uru_ is now known as urulama [06:45] morning [07:06] morning [07:15] hmm, two tests running the whole story fail *checking* [07:16] dimitern: btw, your latest change led to a minor but nice redesign by my side [07:31] TheMue, oh yeah? [07:34] dimitern: yeah, using an interface answering the questions RequiresSafeNetworker() has, instead of adding more and more arguments [07:34] TheMue, cool! [07:35] dimitern: and, you may believe it or not, machiner.Machine implements this interface too :D like my mock type for the tests [07:35] TheMue, the IsManual thing? [07:36] dimitern: and the Id of the machine, all now fetched in one versioned doc, and the params separated from the in-memory storage [07:36] dimitern: John and I discussed about it these days [07:36] TheMue, yep, it is better like this, isn't it? [07:37] dimitern: yeah, I think so. params should simply be for transport. this also will make the implementation of versioning more simple [07:37] TheMue, that's the intent, yeah [07:38] dimitern: +1 [08:17] davecheney, morning - thanks for the review [08:35] so, looks like I catched all failing tests due to the redesign. one final complete test and then PR :) [09:25] mattyw: no worries [10:11] morning all [10:17] wwitzel3, ericsnow, team meeting? [10:43] do we have a stack trace dump signal handler on agents? [10:44] wallyworld_, was thinking that might have helped container debug [10:45] hazmat: no, would be nice though [10:45] hazmat: the stack trace should get output to stderr on a panic and thus go into the log [10:46] hazmat: or maybe you mean like give it a signal and it'll log the current stack trace? We can do that easily [10:46] natefinch, the later [10:47] but yeah, no, doesn't currently exist [10:47] given a hung/spun .. with no log output. nothing happening on syscalls (per strace).. 
it would be nice to see what's brokens [10:49] * hazmat files a bug [10:51] hazmat: what's your preferred signal? [10:53] natefinch, QUIT [10:54] natefinch, https://bugs.launchpad.net/juju-core/+bug/1362546 [10:54] Bug #1362546: Need a way/signal handler to dump stack trace on agents === urulama-afk is now known as urulama [10:56] jam, i think i totally misunderstood the context of your email yesterday [10:56] re container density [10:57] hazmat: well, some of it was just testing that we can genuinely get container addressibility, and some of it was trying to see what we could do with it for scale testing. [10:57] natefinch: SIGQUIT is built into Go [10:57] to trigger a panic() [10:57] I've used it repeatedly [10:58] hazmat: I'm pretty sure you alredy can [10:58] jam: triggering a panic is different than just printing a stack trace though [10:58] jam: but that's a good point [11:02] jam, thanks x2 [12:02] axw: katco: finish meeting, be therereal soon [12:02] finishing [12:09] wallyworld_: is aggregateSuite.TestMultipleResponseHandling one of your intermittant tests? [12:09] because I just came across it [12:09] and it assumes that "go foo(); go bar()" will call foo before bar [12:09] which is *not* guaranteed. [12:09] jam: no. i will add it. what's the jenkins link? [12:10] wallyworld_: I just discovered it locally [12:10] wallyworld_: I'll try to just fix it, since i'm doing some tests there [12:10] jam: ok, thanks [12:10] I happened to have the system change ordering, or I wouldn't have noticed. [12:13] fortunately it is just a bug in the test, and not a more serious underlying issue === Ursinha-afk is now known as Ursinha [12:48] good morning everybody [12:48] natefinch: hey, did you get my email? === Ursinha is now known as Ursinha-afk === Ursinha-afk is now known as Ursinha [13:29] perrito666: yep, got it. [13:30] the cold medicine I took must be either made of unicorn powder or some illegal drug, these things work waaay too well [13:31] perrito666: heh.... 
psuedoephedrine is good stuff [13:31] heh, well that explains === jheroux_away is now known as jheroux === hatch__ is now known as hatch [14:03] ericsnow, natefinch: standup time :) [14:03] * perrito666 notices that the only person actually standing up in those is wwitzel3 [14:39] apparently landing is blocked - is anyone currently working on https://bugs.launchpad.net/juju-core/+bug/1362636 ? [14:39] Bug #1362636: ppc64el compilation error [14:41] mattyw: not that I know of [14:46] mgz_, rogpeppe1, thumper, wallyworld_: do any of you know if we verify the SSL certificate of the state servers when agents connect to them? I presume we do, but I don't actually know. [14:46] dimitern, TheMue ^^ [14:46] natefinch: we did originally, but at some point someone added InsecureSkipVerify i think. [14:46] natefinch: i hope that's been removed now. [14:48] mgz_, curtis doesn't seem to be around - any idea how I can get started on looking into that? [14:49] natefinch: actually it does look as if we correctly verify the SSL cert of the state servers now [14:49] natefinch: look in state/api/apiclient.go === urulama is now known as urulama-afk [14:53] rogpeppe1: blech [14:54] natefinch: what's the blech for? [14:55] rogpeppe1: oh, sorry, misread what you said [14:56] rogpeppe1: I can't really tell from the apiclient code if it's actually verifying the certs. I see them being passed around, but I can't figure out where they're actually being checked. [14:56] 's just done in the go stdlib, no? [14:56] natefinch: they're being checked by the websocket code [14:56] natefinch: and by the fact that we use a wss: address [14:56] natefinch: and we add a known root CA to the config [14:57] ahh, ok [15:05] anyone know of a way to get gtalk inside gmail to make the desktop notification mail icon thingy turn blue? Also, what is that thing called and how do I change its settings? It doesn't seem to have any kind of menu on it. 
[15:10] I dont think you can do that [15:10] that is a part of unity iirc === ericsnow is now known as ericsnow_switchi === ericsnow_switchi is now known as ericsnow__ [16:29] does anyone know how I could try to run a ppc build of core? I'm trying to take a look at https://bugs.launchpad.net/juju-core/+bug/1362636 [16:29] Bug #1362636: ppc64el compilation error [16:35] my full test always times out on my pc [16:35] is there some way to accelerate the tests, or increase the timeout? [16:40] arosales, ping? [16:40] mattyw: hello === psivaa is now known as psivaa-off === urulama-afk is now known as urulama [18:42] natefinch: just had a brief glance through lumberjack.go [18:42] natefinch: looks great in general [18:43] natefinch: a few minor suggestions: [18:43] natefinch: if you specified MaxAge as a time.Duration you wouldn't need the comment and your code would be simpler, and (i think) the API a little more obvious [18:44] natefinch: similarly, if you specified the max size in int64 bytes, you wouldn't need to mock megabytes. [18:45] natefinch: i think that rather than returning an error if a write is too big, you'd be best off just writing it anyway [18:45] rogpeppe1: v1 used bytes, but then in config files you have like size = 100000000 which is illegible and error prone... and no one cares about anything smaller than a megabyte anyway. [18:45] natefinch: i don't see any particular reason why you sort the result of oldLogFiles [18:45] rogpeppe1: I really appreciate the feedback btw. [18:45] rogpeppe1: sorting the old logfiles may be a leftover from the v1 code. I'll look at it again [18:46] rogpeppe1: I thought it was so I could determine which were the N newest and keep those [18:46] natefinch: you just scan directly through the list. you *could* break, i suppose, but that would seem like severe premature optimisation... 
[18:47] rogpeppe1: they're likely returned in last modified order, which if someone modifies an old log file might mean its last modified date is newer than the contents. [18:48] natefinch: i can't see how the order affects anything [18:48] natefinch: oh, i see [18:48] rogpeppe1: maxbackups .... right [18:48] natefinch: yeah [18:49] natefinch: perhaps it would be better to put the sort just before the code that relies on it [18:49] natefinch: rather than sorting in oldLogFiles [18:49] rogpeppe1: yeah that's probably more clear [18:50] natefinch: then it's more obvious why the slicing logic works [18:50] natefinch: trivial thing: i'd put the [:l.MaxBackups] before the [l.MaxBackups], just because it's slightly nicer to slice the start before the end [18:53] natefinch: i'm not entirely sure about the conflation of actual Logger and the serialisability of the logger config [18:53] natefinch: i *think* i'd be happier leaving all the serialisation stuff out, and leaving it for higher layers [18:56] rogpeppe1: I could see splitting out the config from the logger object itself, so people won't try to do wacky stuff like change values on the fly... [18:57] natefinch: the thing that seems a little hooky to me is the "well we'll preguess yaml and json because we know about those formats" thing [18:58] rogpeppe1: yeah, that's true [18:59] natefinch: i'd just leave the config as vanilla, i think, and if people outside the package want to massage it, they're free to [18:59] natefinch: and specify age as time.Duration and size as bytes. [19:00] natefinch: leaving it up to higher layers to decide about sensible formatting if need be (i'd like to see 4g, 32m, for example to specify sizes, but that's really out of the domain of lumberjack) [19:00] natefinch: great package name, BTW [19:01] natefinch: but i do see the other side of the coin [19:01] too [19:01] natefinch: it forces higher layers to know about all the lumberjack config details [19:02] yeah... 
I struggled with that [19:02] natefinch: but then again, they probably will anyway - we'd probably use juju config attributes to specify some of this stuff [19:03] natefinch: i *think* i tend towards the "not this package's concern" p.o.v. [19:03] rogpeppe1: yeah, easy deserialization definitely affected the API, that's why it's megabytes and days, not bytes and time.Duration [19:04] rogpeppe1: I think you're right, that it shouldn't be this package's concern [19:06] rogpeppe1: Thanks again for the review. It's a big help having fresh eyes on it. [19:08] natefinch: np. it's a nice package, thanks. [19:09] hey need a quick opinion: i'm looking to document the new harvest mode behavior, and also the update/upgrade settings. are those better in their own individual documents, or embedded in another file (architectural-overview.txt)? [19:23] oops nevermind, just reviewed my notes. looks like juju/docs is the place to be [19:37] what's the URL to download the zip from charmstore? [19:38] having a hell of a time tracking it down int he code [20:06] mattyw around? [20:19] I think alexisb got the power machine worked out [20:20] re mattyw [20:21] cmars, you have what you need with the power box? [20:21] arosales, mattyw is probably gone for the day [20:25] alexisb: ack [20:47] natefinch: it says On-call reviewer: see calendar. What calendar? [20:56] abentley: that's the joke, which calendar keeps changing.... ask thumper, he's redoing it as of this morning. I think it'll be on the juju core team calendar... which I doubt most people can see. [20:57] natefinch: Could I ask you do do a review? It's verra short. [20:57] abentley: I have 3 minutes, so we'll see how short [20:58] https://github.com/juju/juju/pull/629 [20:58] :) [20:58] abentley: LGTM'd [20:58] natefinch: TY. 
[21:36] waigani: https://github.com/juju/juju/pull/631 [21:36] thumper: https://github.com/juju/juju/pull/632 [21:36] ;) [21:38] thumper: good catch, I didn't write that test [21:38] I had it fail on me this morning [21:40] I also didn't know you could use the || in an assert like that - makes sense [21:40] thumper: CI blocker [21:40] thumper, mattyw and I were unable to land anything today, there's a ppc64el build error blocking [21:40] i got access to a ppc64 and about to try to reproduce [21:41] is davecheney around today? [21:41] cmars: he will be later [21:41] cmars: he normally starts in just over an hour [21:41] ok cool [21:43] cmars: you can do it locally [21:43] cmars: I have reproduced the compiler error on amd64 [21:44] state/apiserver/deployer$ go test -compiler gccgo [21:44] thumper, ah, so its a general gccgo issue [21:44] yep [21:44] unlikely to be power specific [21:45] should find out when it last passed, and what the change was [21:45] git bisect might be helpful there [21:49] damn, how to I get git log to show me the diff [21:49] for revisions [21:49] gitk might be best to browse that [21:49] ugly but useful [21:50] yup or gitg, which is slightly less ugly but also less useful [21:52] so git log won't show me a diff for the revision? [21:52] i dont think so, it should just tell you the commit message and some other metadata [21:53] thumper: what exactly are you trying to do? [21:53] I've found qgit to be a lot nicer [21:53] I want to look at the files changed for every commit [21:54] I know what I'm looking for (ish), I just want to see the commits [21:54] thumper: apparently -p does that [21:54] nope [21:54] ah... [21:54] hang on [21:55] --stat [21:55] that seems to produce a very useful output [21:56] I use that kind of ouput for pull and it is actually very informative [21:57] looks like it is passing now. something must have landed to fix? 
[21:57] cmars: that is the 1.20 branch [21:57] oh [21:58] cmars: yes, that is certainly very confusing [21:59] good grief, is there a way to see more build history for http://juju-ci.vapour.ws:8080/job/run-unit-tests-trusty-ppc64el/ [22:00] i know where it is on the filesystem... grr [22:00] cmars: jenkins is not actually finding it [22:00] I tried going to a previous job by hand and I get 404 [22:01] hmm [22:01] hmm... [22:01] ok, I have a commit I want to test [22:01] how do I revert the tree to a particular commit? [22:02] thumper: you can use co or revert [22:02] sorry s/co/checkout [22:03] not revert [22:03] * thumper nods [22:04] aghh effing git commands [22:04] thumper: apologies I meant to say reset [22:04] which is like svn revert === jheroux is now known as jheroux_away [22:04] and those always get mixed in my head [22:06] why is git show for a commit not showing me the diff? [22:08] --stat shows lots of files [22:08] but no diff [22:08] ok, definitely have the error [22:08] thumper: it's a merge [22:08] yes [22:09] thumper: you don't get a diff on merges [22:09] I want to see the diff as a result of the merge [22:09] yes you do... 
[22:09] grr
[22:09] dumb git
[22:09] thumper: one of the lines from show say merge blah and bleh
[22:10] git diff those two
[22:10] thumper: let me correct myself, you should, git sucks
[22:10] thumper, i'm running an automatic bisect, will let you know how it goes
[22:10] doing that now
[22:10] cmars: I have the revision
[22:10] looking at the change
[22:10] oh cool
[22:10] 3ebb3a1edbccd8e6c4211b2f5b9e1fd6d518d82a
[22:10] thumper: I presume that merge in internal terms for git adds actual git nodes doesnt do an actual merge of diffs
[22:11] * perrito666 never bothered to actually check how git internally works
[22:13] the problem is that the code is perfectly fine, just triggering a bug in gccgo
[22:13] * thumper sighs
[22:17] hmm, ok not that bit
[22:18] I have a bad feeling about this
[22:20] hahaha
[22:20] omg
[22:20] * thumper grunts
[22:21] well don't leave us hanging...
[22:21] waigani, I was thinking the same thing
[22:22] * perrito666 eats popcorn and reads
[22:22] here is the code that was removed:
[22:22] - // TODO(dfc) comparing the two interfaces caused a compiler crash with
[22:22] - // gcc version 4.9.0 (Ubuntu 4.9.0-7ubuntu1). Work around the issue
[22:22] - // by comparing by string value.
[22:22] - if names.NewMachineTag(parentId).String() == authEntityTag {
[22:22] it was replaced by a line that compared two interfaces
[22:22] well
[22:22] one interface and one type
[22:22] * thumper pokes
[22:22] lol
[22:23] I remember that one
[22:23] thumper, bisect tells me the breaking change is 41e8f0a7bf33d3b22a7ccf0949e988c834c4eeac
[22:24] and i confirm this with gccgo on 41e8f0a7bf33d3b22a7ccf0949e988c834c4eeac vs 41e8f0a7bf33d3b22a7ccf0949e988c834c4eeac~1
[22:24] cmars: didn't you trust me?
[22:24] i did, but i wanted to see my bisect work :)
[22:24] trust but verify? :)
[22:25] * perrito666 never saw "no" said so elegantly
[22:25] cmars: ok, I'll give you that
[22:26] funny thing is, it has a conditional *very* similar to the one you pasted up there
[22:26] oh this is so fucked
[22:26] for some value of 'this'
[22:28] ok I have a fix
[22:29] * thumper runs all apiserver tests with gccgo
[22:30] gccgo needs work
[22:30] thumper, that is why we are investing in golang
[22:31] * thumper nods
[22:31] I was just about to say something about that
[22:31] beautiful day here today, want to take the dog for a walk around ross creek at lunch time
[22:32] * cmars misses beautiful dunedin now. 100F outside and all the grass is dead
[22:32] it is about ...
[22:32] * thumper calculates
[22:32] 50°F
[22:32] now that is something useful we can teach mup
[22:32] so quite cool
[22:32] i'd take it :)
[22:33] cmars: interesting we had a couple of days like that a few days ago
[22:34] the only issue is that we are in winter
[22:34] perrito666, aw man, that's not fair at all. sounds like our winters
[22:34] but was an interesting change, it is quite hard to actually store summer clotes in winter
[22:35] cmars, perrito666: https://github.com/juju/juju/pull/633
[22:35] fixing the compiler would be best, but i wonder, if we could walk the AST to look for this bug ahead of time, to prevent this from breaking the build
[22:35] confirmed passes tests locally with gc and gccgo
[22:35] for apiserver at least
[22:35] it's comparing two interfaces that triggers the compiler bug?
[22:36] hang on
[22:36] I think I can simplify
[22:36] cmars: no, it appears to be one interface, and one concrete type
[22:40] * thumper pushing
[22:41] cmars: https://github.com/juju/juju/pull/633/files
[22:42] thumper: isnt tag == authEntityTag blowing?
[22:42] perrito666: no, because they are both interfaces
[22:43] it appears to be when one is a concrete type, and one is an interface
[22:43] where the concrete type implements the interface
[22:43] not the pointer to the concrete type
[22:43] that is a good thing to mail to the list for people to keep an eye on it
[22:43] agreed
[22:44] btw, isnt there a bug filed for that in gccgo? perhaps a reference to it in the comments would be useful so future maintainers can know when to remove the workaround
[22:44] ok, let me pull and restart the tests
[22:49] * thumper shrugs
[22:49] perrito666: I'll ask dave in the standup
[22:50] cmars: sorry this blocked you and matty so much today
[22:52] thumper, no problem. it's a good reminder to check gccgo locally. although, it's nice to have access to power8 now, in case we need it in the future
[22:52] thumper, I pointed wallyworld to a spreadsheet today that says you have access to multiple power vms
[22:52] but there was not access info
[22:53] it would be nice to share the info with the whole team
[22:53] https://docs.google.com/a/canonical.com/spreadsheets/d/1_y3BM1Fcxmc_niIMrNvqtrzOl23vrX1DdeoQQqTejbg/edit?usp=sharing
[22:53] I've forgotten mostly how to get there ... :)
[22:53] sure...
[22:53] however mostly gccgo problems can be caught locally
[22:54] that way we can have power access in US timezones when there is an issue
[22:54] people just don't know how
[22:54] well education would help to :)
[22:54] I included that in my email to the list
[22:54] cool, thanks
[22:55] cmars, thank you for driving help with that bug today!
[22:57] thumper: is there a gccgo bug report for that? want me to file one?
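Editorial sketch of the distinction drawn at 22:42 and 22:43: a comparison where one operand is a concrete value type and the other is an interface it implements, next to the two shapes reported as safe (both operands as interfaces, or comparing string values). `Tag`, `MachineTag` and `NewMachineTag` here are hypothetical stand-ins, not the real juju `names` package, and the string variant is adapted (in the pasted diff `authEntityTag` appears to have been a string already). All three are valid Go and agree under the gc compiler; nothing here reproduces the gccgo crash.

```go
package main

import "fmt"

// Tag is a hypothetical stand-in for an interface such as names.Tag.
type Tag interface {
	String() string
}

// MachineTag is a hypothetical concrete type. The value type, not the
// pointer type, implements Tag, matching the case described in the chat.
type MachineTag struct {
	id string
}

func (t MachineTag) String() string { return "machine-" + t.id }

// NewMachineTag returns the concrete type rather than the interface.
func NewMachineTag(id string) MachineTag { return MachineTag{id: id} }

// mixedCompare compares a concrete value with an interface value.
// This is the shape the chat reports as upsetting gccgo 4.9.
func mixedCompare(parentID string, auth Tag) bool {
	return NewMachineTag(parentID) == auth
}

// interfaceCompare converts to the interface first, so both operands
// of == are interface values.
func interfaceCompare(parentID string, auth Tag) bool {
	var tag Tag = NewMachineTag(parentID)
	return tag == auth
}

// stringCompare follows the older workaround of comparing string values.
func stringCompare(parentID string, auth Tag) bool {
	return NewMachineTag(parentID).String() == auth.String()
}

func main() {
	var auth Tag = NewMachineTag("0")
	for _, id := range []string{"0", "1"} {
		fmt.Println(id, mixedCompare(id, auth), interfaceCompare(id, auth), stringCompare(id, auth))
	}
}
```

Converting to the interface up front keeps the comparison semantics identical while avoiding the mixed operand types, which is why it is less intrusive than going through `String()`.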
[22:57] cmars: I'll check with dave if there is a bug fix
[22:57] mwhudson: oh hai
[22:57] mwhudson: I'll see if dave has done one already first
[22:57] ok
[22:57] would be good to get a minimal test case
[22:58] which I think I have a good grip on now
[22:58] yeah, that was going to be my next question :)
[23:00] waigani: with you in a sec
[23:00] waigani: just testing this bug
[23:00] thumper: okay, I'll just keep the hangout open in bg
[23:01] thumper: nice work on getting the bug!
[23:03] thumper: dave is here
[23:10] thumper: i don't see a fix flicking through gofrontend commits
[23:23] rogpeppe1: your suggestion works, and is less intrusive, ta
[23:24] thumper: cool, np
[23:24] rogpeppe1: we are looking at creating a simple reproduction of the error
[23:24] seems to be only with nested funcs and closure issues
[23:24] so... not simple
[23:24] thumper: ah
[23:25] what are the system-y tests - mentioned in the team lead minutes?
[23:59] * thumper takes the dog for a walk
[23:59] bbl