[00:00] <davecheney> natefinch: see tim's email thread from last year about the proposed design
[00:00] <davecheney> sorry i cannot link to it
[00:01] <davecheney> i dunno how to link to messages on that ML
[00:01] <natefinch> davecheney: search for diaf is surprisingly useful
[00:01] <davecheney> :)
[00:04] <mup> Bug #1582841 changed: remove check for lxdbr0 <juju-core:Invalid> <https://launchpad.net/bugs/1582841>
[00:08] <natefinch> davecheney: I don't see anything in that thread that says that flock won't work.  It seems to say flock won't release on process end, but that's incorrect from my reading and experience.  go closes the file handle and the lock is released on process exit.
[00:09] <natefinch> er lock is released when the file descriptor is closed, which happens automatically on exit
[00:10] <natefinch> davecheney: if you just want to avoid futzing with files on disk, that's valid, too
[00:11] <redir> anastasiamac: FWIW I only found these 3 https://goo.gl/JG8NXL none of which would have been fixed by the updates to aggregator AFAICT.
[00:25] <davecheney> natefinch: please, no flock, use unix domain sockets as we discussed
[00:25] <natefinch> davecheney: ok
[00:25] <davecheney> flock requires a file on disk
[00:25] <davecheney> if I delete that file I can effectively release that lock
[00:25] <davecheney> and we're back to square one
[00:26] <perrito666> can you do that in windows? (without a significant effort?)
[00:26] <natefinch> nope
[00:26] <natefinch> windows is a real lock
[00:26] <natefinch> I know flock is only advisory
[00:27] <davecheney> it's not that flock is advisory
[00:27] <davecheney> but the thing that is being locked can be moved or deleted which breaks the lock invariant
[00:29] <davecheney> flock also leaves you with the 'oh i crashed and there is a file on disk' problem
[00:29] <natefinch> man page says "Apply or remove an advisory lock on the open file specified by fd."  maybe the docs are misleading, but regardless.. yes, you can break it trivially
[00:29] <natefinch> the file on disk doesn't matter
[00:30] <davecheney> yes it does
[00:30] <davecheney> you need to have a file to apply a lock to
[00:30] <natefinch> the existence of the file has no meaning
[00:30] <davecheney> you cannot flock a nonexistent file
[00:30] <davecheney> so you have to create that file
[00:30] <natefinch> well, yes
[00:30] <davecheney> so now you have two lockers racing to create the file
[00:30] <davecheney> if you fail to create the file because someone else was racing with you, does that mean you cannot lock it, etc
[00:30] <perrito666> k people, EOD, see you all on thu
[00:30] <davecheney> please just use unix domain sockets
[00:31] <davecheney> they solve 100% of these problems
[00:31] <natefinch> it's really not a race, but that's fine
[00:31] <davecheney> net.Dial("unix", "@/juju/$SOMEHASH")
[00:32] <natefinch> I open with O_create... one proc will create first, the other will open
[00:33] <davecheney> good point
[00:33] <davecheney> but please
[00:33] <davecheney> no files on disk
[00:34] <natefinch> I agree that files on disk are a liability.  I was just doing it that way to minimize changes to current code.  but we can do socket on linux and mutex on windows
[00:38] <bradm> fwiw I seem to be hitting bug 1537585 a lot more with juju 2 and maas 2
[00:38] <mup> Bug #1537585: machine agent failed to register IP addresses, borks agent <2.0-count> <blocker> <landscape> <network> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1537585>
[00:59] <davechen1y> thumper: what tha ?
[00:59] <davechen1y> lucky(~/src/github.com/juju/juju/cmd/jujud/agent) % go test -i -v . && go test .
[00:59] <davechen1y> ok      github.com/juju/juju/cmd/jujud/agent    122.841s
[01:00] <davechen1y> lucky(~/src/github.com/juju/juju/cmd/jujud/agent) % go test -race -i -v . && go test -race .
[01:00] <davechen1y> ok      github.com/juju/juju/cmd/jujud/agent    1.168s
[01:00] <davechen1y> ^ this test runs 100x faster under the race detector ...
[01:00] <natefinch> lol, we should use the race detector during production, evidently
[01:02] <davechen1y> uh oh
[01:02] <davechen1y> i have a terrible feeling about this
[01:02] <natefinch> I have a great feeling that this is going to be an awesome bug
[01:02] <davechen1y> lucky(~/src/github.com/juju/juju/cmd/jujud/agent) % pt -i race
[01:02] <davechen1y> package_test.go:
[01:02] <davechen1y> 15:     if testing.RaceEnabled {
[01:02] <davechen1y> 16:             t.Skip("skipping package under -race, see LP 1519133, 1519097")
[01:02] <natefinch> lol
[01:02] <natefinch> nice
[01:02] <natefinch> bug 1519133
[01:02] <mup> Bug #1519133: cmd/jujud/agent: data race <2.0-count> <juju-core:Triaged> <https://launchpad.net/bugs/1519133>
[01:03] <natefinch> bug 1519097
[01:03] <mup> Bug #1519097: juju/utils/fslock: data race caused by createAliveFile running twice <2.0-count> <race-condition> <tech-debt> <juju-core:Fix Released by dooferlad> <https://launchpad.net/bugs/1519097>
[01:03] <davechen1y> OH MY FUCK
[01:03] <davechen1y> this god damn fslock nonsense
[01:03] <natefinch> what, it's fix released
[01:03] <natefinch> no more race
[01:03] <davechen1y> let's remove that check and see what breaks
[01:05] <natefinch> well, the fslock one we are working on
[01:10] <mup> Bug #1585424 opened: all: data race's in tests are surpressed <juju-core:New> <https://launchpad.net/bugs/1585424>
[01:11] <davechen1y> thumper: https://bugs.launchpad.net/juju-core/+bug/1585424
[01:11] <davechen1y> please see email
[01:11] <mup> Bug #1585424: all: data race's in tests are surpressed <juju-core:New> <https://launchpad.net/bugs/1585424>
[01:11] <davechen1y> and let me know your decision
[01:19] <mup> Bug #1585424 changed: all: data race's in tests are surpressed <juju-core:New> <https://launchpad.net/bugs/1585424>
[01:20] <davechen1y> bug 1519095
[01:20] <mup> Bug #1519095: state: tests to not pass with -race under Go 1.2 <2.0-count> <juju-core:Triaged> <https://launchpad.net/bugs/1519095>
[01:20] <davechen1y> oh wow
[01:21] <davechen1y> welp that's what happens when a bug isn't a blocker
[01:21] <davechen1y> it goes straight to the bottom of the pool
[01:31] <mup> Bug #1585424 opened: all: data races in tests are surpressed <juju-core:New> <https://launchpad.net/bugs/1585424>
[01:36] <natefinch> davechen1y: IIRC the idea was that at least we wouldn't get any *new* data races in core if we're running the race detector.  The problem being that we never went back and actually fixed the ones that got skipped
[01:38] <davechen1y> yeah, that's what I remember now
[01:38] <davechen1y> i remember it was an uphill battle to stop new races being added
[01:38] <davechen1y> it's good to know that we can predict the future
[01:40] <davechen1y> http://reviews.vapour.ws/r/3204/
[01:40] <davechen1y> :(
[01:41] <natefinch> *sad trombone*
[01:42]  * davechen1y considers replacing paul the octopus as a fortune teller
[02:10] <mup> Bug #1585430 opened: Cloud-init failed on windows <ci> <cloud-init> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1585430>
[02:19] <davechen1y> if you thought the state tests were slow without -race, just wait til you see them with -race
[02:19] <natefinch> heh yeah, -race and -cover are always interesting additions
[02:33] <davechen1y> thumper: cherylj https://github.com/juju/juju/pull/5452
[02:33] <davechen1y> please read the description closely
[02:33] <thumper> I think our timeout for race is like 60 minutes
[02:33]  * thumper double checks
[02:34] <davechen1y> that's probably _just_ enough
[02:36] <thumper> 30 minutes
[02:36] <thumper> go test -race -test.timeout=1800s ./...
[02:36] <davechen1y> it'll be tight
[02:36] <davechen1y> this passed in 1500s on my machine
[02:37] <davechen1y> without any other tests running
[02:37] <davechen1y> it'll probably be ok
[02:37] <davechen1y> at least we know which change to revert
[02:37] <thumper> right
[02:37] <thumper> I say push it in
[02:39] <davechen1y> jolly good
[03:11] <mup> Bug #1583893 changed: 1.25.5: goroutine panic launching container on xenial <landscape> <juju-core:Fix Released> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1583893>
[03:50] <mup> Bug #1583893 opened: 1.25.5: goroutine panic launching container on xenial <landscape> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1583893>
[03:59] <davechen1y> cherylj: anastasiamac https://bugs.launchpad.net/juju-core/+bug/1518806
[03:59] <mup> Bug #1518806: apiserver: tests to not pass with -race under Go 1.2 <2.0-count> <juju-core:Triaged by dave-cheney> <https://launchpad.net/bugs/1518806>
[03:59] <davechen1y> i think there is still a race here
[03:59] <davechen1y> what testing did you do and why do you think the race is not present ?
[04:00] <davechen1y> anastasiamac: cherylj for reference the race I see is https://bugs.launchpad.net/juju-core/+bug/1519133/comments/2
[04:00] <mup> Bug #1519133: cmd/jujud/agent: data race <2.0-count> <juju-core:Triaged by dave-cheney> <https://launchpad.net/bugs/1519133>
[04:00] <cherylj> davechen1y: I ran the race test locally
[04:00] <cherylj> specifically for apiserver/
[04:02] <anastasiamac> davechen1y: did u run only the apiserver package with the race detector (in isolation) or start at the root, i.e. from the top of juju/juju?
[04:02] <davechen1y> anastasiamac: did you remove the skip ?
[04:02] <davechen1y> that package knows when it's run under the race detector and skips all the tests
[04:02] <davechen1y> it does log it
[04:02] <anastasiamac> davechen1y: i have not run race detector here as I have not been in this package yet
[04:02] <davechen1y> but the log is not printed anywhere
[04:02] <davechen1y> :(
[04:03] <davechen1y> righto, i guess it's not fixed then
[04:03] <anastasiamac> if u r seeing it, then it does not sound fixed ;-P
[04:03] <anastasiamac> it could have also been re-introduced?..
[04:03] <cherylj> I ran the test locally removing the skip and didn't see a race in apiserver/, but that doesn't mean that there isn't one
[04:05] <anastasiamac> at some stage, we should really unskip tests that r related to bugs that are considered fixed...
[04:18] <davechen1y> cherylj: i think it's not well exercised by the race
[04:18] <davechen1y> the tests in cmd/jujud/agent agitate the cert updater so that's why it shows up there
[04:18] <davechen1y> but the race is definitely in the apiserver code
[04:19] <davechen1y> i'll try to write a test that aggravates the problem
[04:22] <natefinch> davechen1y: when you said use unix domain sockets and net.Dial(....) did you mean net.Listen?  can you dial a domain socket without anyone listening?
[04:25] <davechen1y> yes, i'm sorry, i meant net.ListenUnix
[04:33] <natefinch> EOD++ for me.  night all.
[04:37] <anastasiamac> axw: wallyworld: this is permissions check that i was talking about at standup - http://reviews.vapour.ws/r/4895/
[04:38] <axw> ok, taking a look
[04:39] <wallyworld> anastasiamac: but
[04:39] <wallyworld> 		// password, so that we don't allow unauthenticated users to find
[04:39] <wallyworld> 		// information about existing entities.
[04:39] <wallyworld> that information hiding is now lost
[04:40] <wallyworld> it's like when you enter user/pass into a banking website
[04:40] <wallyworld> it just says bad credentials or something
[04:40] <wallyworld> it doesn't tell you what was invalid
[04:40] <anastasiamac> it's not lost, we return permission denied without saying that entity does not exist
[04:41] <anastasiamac> when u go to the bank website, they say the same thing
[04:41] <wallyworld> right, and then when the correct entity is passed, we then return BadCreds right?
[04:41] <wallyworld> so 2 different errors depending if the user is valid or not
[04:42] <anastasiamac> we have a reaction to bad credentials error - we restart stuff... so we need to ensure that we only return bad credentials when we intend to restart the agent, for eg..
[04:42] <anastasiamac> yes, if user is valid but does not have permission - deny access
[04:42] <wallyworld> but the error code becomes different so we leak info
[04:42] <anastasiamac> if user/pwd is not valid, bad credentials - do whatever needs to be done like restart...
[04:43] <anastasiamac> we do not leak info - we do not say 'not found'
[04:43] <wallyworld> the point is - we need to return the same error if user/pass together are not valid, regardless of cause
[04:43] <anastasiamac> we say - u have no permissions to do what u r trying to do
[04:43] <wallyworld> you can infer the not found
[04:44] <wallyworld> by looking at the return error
[04:45] <anastasiamac> if we return bad credentials error, for example, we terminate api... we do not want to do that when the login entity has valid creds but not permissions
[04:47] <wallyworld> sure, that's an upstream problem though
[04:47] <wallyworld> we cannot solve that issue by leaking information
[04:47] <anastasiamac> ?
[04:47] <wallyworld> the comment that was deleted in the PR explained why it was done that way
[04:47] <anastasiamac> this is the reason we keep restarting agents
[04:48] <anastasiamac> permission error does not imply something is not found...
[04:48] <wallyworld> it does in this context as it is different to what's returned when the entity is found
[04:49] <wallyworld> the question to ask is - why is the api being called with a bad user?
[04:49] <wallyworld> isn't that the root cause issue?
[04:51] <anastasiamac> the root cause is in several bugs, the prime one is referred to in PR.
[04:51] <anastasiamac> essentially, we (juju) react differently to bad creds
[04:51] <anastasiamac> so we  need to be careful about when we throw it
[04:52] <wallyworld> sure, but we can't fix the issue by leaking information
[04:52] <anastasiamac> otherwise, we'll have a reaction that is too drastic, is uncalled for and is not helping
[04:52] <anastasiamac> \o/ i have difficulty seeing info leak
[04:53] <wallyworld> if i'm a hacker, i throw different user/pass at the api. if i get an errperm vs a badcreds, i can deduce if the user exists or not
[04:53] <wallyworld> that's an information leak
[04:53] <wallyworld> once i know a user exists vs not exists, the cracking problem becomes much easier
[04:53] <wallyworld> i now just need to guess the password
[04:54] <anastasiamac> sure, but if "entity" is not found, u really do not want to restart everything either
[04:54] <wallyworld> correct
[04:54] <wallyworld> so we need a different solution
[04:54] <wallyworld> that doesn't leak info
[04:54] <anastasiamac> well, we need an error that is not bad creds (coz we have a reaction to that)
[04:54] <anastasiamac> the only other one that we have is PermErr
[04:54] <wallyworld> we need to understand the upstream root cause
[04:54] <anastasiamac> which is actually what we are after - permission is denied \o/
[04:54] <wallyworld> why is the caller passing a bad entity name?
[04:57] <anastasiamac> this func is not just used by users, if u look for dependent calls
[04:58] <anastasiamac> also i think that this conversation should be taken off-line and have Will and Andrew involved :D
[04:58] <anastasiamac> m happy to have it anytime and m happy with any solution that involves not to restart for every failure
[05:01] <wallyworld> sure, we can discuss. but it's an information leak, and there was a comment that explained why. if it is decided we don't care about the information leak, then fine. but it's risky to leak information without a proper discussion
[05:03] <anastasiamac> true, information leak is a concern - m not convinced this is the case here.... another concern is the restart that is caused by bad creds... solution needs to ensure that neither happens :D
[05:04] <wallyworld> disclosure of whether a user protected by a password exists or not is an attack vector
[05:04] <wallyworld> the information is whether the user exists and it is leaked
[05:05] <wallyworld> if different errors are returned. it's why banks for example always just return a generic "invalid user password combination" error
[05:05] <wallyworld> they don't say *which* of the user or password is wrong
[05:10] <anastasiamac> in this instance, find entity determines if the call is made by machine/unit/user/service/etc... r u saying that if an unknown to us machine is making a call, it is the correct call for us to restart the api? :D
[05:17] <axw> anastasiamac wallyworld: what has the server's behaviour got to do with the client's behaviour around restarting the API? that's a client policy. you should not be leaking information (yes there *is* leakage) from the server to control the client's behaviour
[05:18] <wallyworld> +1
[05:18] <wallyworld> that's my point
[05:18] <axw> anastasiamac: like wallyworld says, if you can infer that the username is valid because "not found", then you're leaking info, and that's a security concern
[05:18] <axw> that reduces the cost of a brute foce attack to focusing mostly on the password
[05:18] <axw> force*
[05:19] <wallyworld> yep
[05:19] <anastasiamac> sure and i agree
[05:19] <anastasiamac> but throwing bad creds here if the entity is not a user, is what is causing issues
[05:20] <axw> anastasiamac: so the issue is related to the unhandled tag kind?
[05:24] <anastasiamac> axw: the issue is that we use given tag to find entity. if entity is not found we throw bad creds which we react to.. i think we should throw something else here
[05:24] <anastasiamac> m happy to revert this badcreds instance for now and only keep the other instances (that are more cleanly defined) in this PR
[05:25] <davechen1y> axw: this fix is surprisingly simple, can I get a sanity check https://github.com/juju/juju/pull/5455
[05:26] <davechen1y> thanks!
[05:31] <axw> anastasiamac: I don't understand how you can say "sure and i agree" and then what you just said. either we return ErrBadCreds, or we leak information
[05:31] <axw> davechen1y: looking
[05:34] <axw> davechen1y: LGTM
[05:36] <anastasiamac> axw: I agree about obfuscating and not leaking info :D i am not convinced that badcreds is always the right answer here.. maybe only for user tags...either way, i'll come back to this instance later but will address other comments on PR after school/kids pick-up
[05:43] <davechen1y> who has the power to unblock a message to juju-dev ?
[06:07] <wallyworld> axw: "application-wordpress" or "app-wordpress" ?
[06:07] <axw> wallyworld: "app-" please
[06:07] <wallyworld> ok, i kinda like application- :-)
[06:08] <wallyworld> app- is ok i guess
[06:08] <axw> wallyworld: feel free to get a second opinion, I just think application- is tedious
[06:08] <wallyworld> yeah it is, but app- sorta grates for some reason. maybe just me
[06:25] <wallyworld> axw: if you have a moment, first of a few sigh http://reviews.vapour.ws/r/4897/
[06:27] <axw> wallyworld: hrm, app- looks grating to me too now. maybe bring it up in the meeting later
[06:27] <wallyworld> axw: i do think application works better
[06:28] <wallyworld> it's not much longer than service
[06:28] <wallyworld> i have a charm.v6 change queued up, so i'll change to application-
[06:28] <axw> wallyworld: did you intend to change the package paths to gopkg.in?
[06:28] <wallyworld> yeah
[06:28] <wallyworld> so we can rev up the version
[06:29] <wallyworld> to names.v2
[06:29] <axw> right, ok
[06:29] <wallyworld> so we don't bugger up 1.25 etc
[06:29] <wallyworld> there'll be a lot of churn in all this sadly
[06:29] <mup> Bug #1584059 changed: Deployment of swift-storage charms fails with Juju 2.0 - swift-storage-relation-joined KeyError: 'JUJU_ENV_UUID' <oil> <juju-core:Triaged> <swift-storage (Juju Charms Collection):New> <https://launchpad.net/bugs/1584059>
[06:49] <wallyworld> axw: next one https://github.com/juju/charm/pull/210
[06:49] <wallyworld> gotta duck out for school pickup, will look at yours in a bit
[07:22] <wallyworld> axw: the yaml thing - we had the relations data unmarshalled with the v2 Unmarshal, and I think the service data with GetYaml() from v1, it was a bit of a mess
[07:23] <wallyworld> i had a quick look at the v2 release notes and didn't see anything that jumped out
[07:23] <wallyworld> multi-line strings have been improved, but i don't think that affects charms
[07:24] <axw> wallyworld: I don't follow. is anything feeding charm metadata into yaml.v1? not sure which things use it TBH. charmstore possibly? new charm tool?
[07:25] <axw> no argument that it's a mess, just not sure if we can fix it without breaking others
[07:25] <wallyworld> given we had a mix of v1 and v2 before, it seems unlikely we'd be relying on v2 specific behaviour
[07:26] <wallyworld> the charm metadata is fairly simple yaml i guess?
[07:26] <wallyworld> nothing too esoteric
[07:27] <axw> wallyworld: my point is that (AFAIK) the MarshalYAML methods won't be called if the structs are passed to yaml.v1
[07:27] <axw> wallyworld: so the custom behaviour that was in GetYAML won't be triggered
[07:27] <wallyworld> yeah, that's correct
[07:28] <wallyworld> i can't fathom why we'd need both
[07:28] <axw> wallyworld: only because some clients might not have been updated
[07:28] <wallyworld> now is the time to make this change i guess
[07:29] <wallyworld> it is charm.v6-unstable, clients out there would be using .v5
[07:29] <axw> ok
[07:29] <wallyworld> that's my self justification
[07:29] <wallyworld> if i say it often enough it makes sense
[07:30] <axw> wallyworld: :)
[07:30] <axw> wallyworld: seems fair, I'm just not sure what the impact is
[07:30] <wallyworld> yeah, i'm not 1000% sure
[07:31] <wallyworld> we can fix anything that comes up easily enough
[07:31] <axw> wallyworld: yep ok, consider the question dropped
[07:31] <wallyworld> ok
[07:33] <fwereade> wallyworld, in hangout
[07:39] <bradm> anyone know what the deal with br-bond1 is?  we have 2 bonds on these machines, and juju creates br-bondx for both of them, but only one of them is put in the interfaces file
[07:39] <bradm> this is with juju-2
[07:57] <wallyworld> bradm: dimitern or frobware would know
[07:58] <frobware> bradm: can you PB the original /e/n/i file and Juju's morphed version.
[07:59] <bradm> frobware: that's the point, there are no details about br-bond1 in /e/n/i
[07:59] <frobware> bradm: MAAS?
[07:59] <bradm> frobware: yup
[07:59] <bradm> frobware: in this case, maas 2.0 with juju 2.0
[08:00] <frobware> bradm: ah. haven't really tried MAAS 2.0 that hard. hmm.
[08:00] <wallyworld> bradm: frobware: maybe voidspace then
[08:00] <bradm> I have to run to take the boy to cubs, will check back in later
[08:01] <frobware> bradm: but there should be the original /e/n/i available
[08:01] <dimitern> frobware, bradm: hey, what's the issue?
[08:01] <bradm> frobware: ayup, there is - it adds br-bond0 to /e/n/i, but not br-bond1
[08:01] <frobware> bradm: it has the extension -before-juju-add-bridge.py
[08:02] <bradm> got to run, didn't realise the time
[08:02] <frobware> bradm: ack
[08:02] <dimitern> perhaps is br-bond1 unconfigured and has no address, but it has configured 'children'?
[09:02] <davechen1y> can someone with mod rights on juju-dev please let my message out of moderation
[09:02] <davechen1y> ta!
[10:29] <rogpeppe> anyone know anything about the reasoning behind https://github.com/juju/juju/pull/5100 ?
[10:29] <rogpeppe> for example, was it deliberate to remove "juju help plugins" ?
[10:30] <rogpeppe> fwereade, davechen1y, axw: ^
[10:31] <bradm> frobware, dimitern: yes, our bond1 is unconfigured, we use it for neutron-gateway lxcs
[10:32] <frobware> bradm: so juju should not create a bridge for it, but it should still be present in /e/n/i
[10:32] <dimitern> bradm: so you have bond1 unconfigured but up and e.g. bond1.42 also unconfigured and up?
[10:33] <bradm> frobware: right, that would do us - we create a bridge ourselves in the preseed
[10:33] <dimitern> oh..
[10:33] <bradm> dimitern: in maas we create the bond and leave it unconfigured, we add it to the lxcs as an extra interface, so we do need a bridge for it
[10:34] <bradm> which is fine if you guys do it, we can use whichever bond is there
[10:34] <bradm> er, bridge
[10:34] <bradm> just having 2 bridges on the interface doesn't work so well :)
[10:34] <dimitern> bradm: well, as long as the bridge you create in the preseed is called 'br-bond1' and e.g. 'br-bond1.42' for the neutron-ext vlan, the bridge script shouldn't mess with it
[10:35] <frobware> bradm: can you PB the original /e/n/i file - it should be /etc/network/interfaces-before-juju-add-bridge.py
[10:36] <frobware> bradm: are there any VLANs configured on the bond?
[10:37] <bradm> frobware: https://pastebin.canonical.com/157211/ - nope, very plain
[10:37] <bradm> I can try creating it as br-bond1 tomorrow to see if that fixes it
[10:38] <frobware> bradm: your dns-nameserver entry (x.y.z.5) - is that right?
[10:38] <bradm> yeah, I just hid our ip range, out of habit
[10:39] <bradm> frobware: thats the IP of the maas node.
[10:39] <frobware> bradm: ok, just checking... :)
[10:40] <frobware> bradm: so I don't see juju messing with bond1 at all
[10:40] <bradm> trying to do as much as we can in maas, and not hacking up preseeds for it
[10:40]  * frobware tries to recall what the issue was/is ...
[10:40] <bradm> frobware: right, yet there's a br-bond1 up
[10:40] <bradm> frobware: there's commands in the cloud-init-output.log that configure it
[10:41] <bradm> frobware: https://pastebin.canonical.com/157212/
[10:41] <frobware> bradm: so juju will create br-bond0 based on your original eni - http://pastebin.ubuntu.com/16676634/
[10:42] <bradm> actually, there's a br-ethx for all of the eth devices too
[10:43] <frobware> bradm: ah... let me check something...
[10:43] <bradm> but only br-bond0 gets written out to /e/n/i
[10:43] <bradm> maybe because none of the others have IPs?  dunno
[10:44] <frobware> bradm: br-bond0 gets written to eni because its parent device is 'static'
[10:44] <frobware> bradm: bond1 is manual.
[10:44] <frobware> bradm: so does not get bridged
[10:44] <bradm> frobware: ah, but it does!  it creates a br-bond1
[10:44] <bradm> frobware: its just not in /e/n/i
[10:45] <frobware> bradm: right. that's my "aha" - just checking something...
[10:45] <bradm> frobware: like I said, all the interfaces have a bridge
[10:45] <bradm> well, except for lo
[10:45] <frobware> bradm: I would say, based on the input, the /e/n/i as written looks correct, just checking on something that has changed in the bridge script quite recently.
[10:46] <bradm> frobware: yup, /e/n/i is what we want.  its all the extra interfaces that are up we don't want :)
[10:46] <frobware> bradm: agreed. state of interfaces should reflect what's in /e/n/i
[10:47] <bradm> frobware: https://pastebin.canonical.com/157215/ <- those are the extra interfaces
[10:49] <bradm> we just create our br1 in the maas preseed, which is fine
[10:50] <frobware> dooferlad: ^^
[10:50] <bradm> I can see in the cloud-init-output log where it does it all
[10:50] <frobware> dooferlad: I have a sneaky suspicion your recent changes to the bridge script may be bringing interfaces up where it shouldn't...
[10:51] <bradm> honestly, either way works - if it wants to create a br-bond1 because the interface is there, thats fine, we can use it
[10:51] <bradm> just need it in /e/n/i
[10:51] <anastasiamac> bradm: plz clarify - r u using juju2.beta7 or master tip?
[10:52] <bradm> anastasiamac: juju2 beta 7 with maas 2.0 beta 5
[10:52] <dooferlad> frobware: that is horrid. Yea, maybe that is happening.
[10:52] <frobware> dooferlad: bridge_now() should consider is_active - we are creating bridges where we should not.
[10:52] <frobware> dooferlad: see https://pastebin.canonical.com/157215/
[10:53] <frobware> bradm: this is a recent regression
[10:53] <frobware> bradm: I'm guessing that if you were to reboot the machine it would come up as you expect
[10:53] <bradm> frobware: ayup
[10:54] <bradm> frobware: well, except the lxcs are cranky because they couldn't access anything in our public IP range, so neutron is a little upset
[10:55] <bradm> but, cool, happy to see its not me totally misunderstanding something about the new stuff, these things happen.
[10:55] <frobware> bradm: I think we just need to get a fix and a binary to you. Would that work?
[10:56] <bradm> frobware: sure, thats fine, this is for an internal stack right now
[10:56] <bradm> frobware: and I'm not at work for another 12 hours anyway. :)
[10:56] <frobware> bradm: you happy with some unofficial build from some bloke off the net?
[10:56] <bradm> frobware: ideally we want it fixed in upstream :)
[10:57] <frobware> bradm: oh, of course.
[10:57] <frobware> bradm: I was merely trying to find a way to unblock you.
[11:05] <frobware>  dooferlad, bradm: https://bugs.launchpad.net/juju-core/+bug/1585582
[11:05] <mup> Bug #1585582: MAAS bridge script bridges inactive interfaces <network> <juju-core:New for dooferlad> <https://launchpad.net/bugs/1585582>
[11:06] <mup> Bug #1585582 opened: MAAS bridge script bridges inactive interfaces <network> <juju-core:New for dooferlad> <https://launchpad.net/bugs/1585582>
[11:07] <bradm> frobware: perfect.
[11:20] <fwereade> ericsnow, katco: I can add a payload with ID "xyz" (and class type/name docker/payloadA), but when I list the unit payloads it's reported as a Result with ID "payloadA"
[11:21] <fwereade> ericsnow, katco: I presume this is expected behaviour, but I'm not sure what it's for
[11:21] <fwereade> ericsnow, katco: can you enlighten me?
[11:41] <voidspace> dimitern: ping
[11:42] <dimitern> voidspace: pong
[11:42] <voidspace> dimitern: subnets in model migration
[11:43] <voidspace> dimitern: would you consider spacename and availabilityzone fields of the subnet to be "optional"?
[11:43] <voidspace> dimitern: (i.e. should they be omitempty in the yaml)
[11:43] <voidspace> dimitern: what do you think?
[11:45] <dimitern> voidspace: I think we should make best effort to set them, but failing that it's better to have them stored empty (i.e. not omitempty)
[11:45] <voidspace> dimitern: ok, cool - thanks
[11:45] <voidspace> no default on the import schema either then
[11:47] <dimitern> voidspace: yeah, I think so
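To illustrate the `omitempty` choice discussed above, here is a small sketch using the standard library's encoding/json as a stand-in, since it treats the `omitempty` tag option the same way for empty strings as the YAML package does. The struct and field names are hypothetical, not the migration schema. Without `omitempty`, an unset space name is still stored as an empty value, which is what dimitern recommends.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// subnetKeep always serialises SpaceName, even when it is empty.
type subnetKeep struct {
	CIDR      string `json:"cidr"`
	SpaceName string `json:"space-name"`
}

// subnetOmit drops SpaceName from the output when it is empty.
type subnetOmit struct {
	CIDR      string `json:"cidr"`
	SpaceName string `json:"space-name,omitempty"`
}

// encode marshals v and returns the result as a string.
func encode(v interface{}) string {
	out, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(encode(subnetKeep{CIDR: "10.0.0.0/24"}))
	fmt.Println(encode(subnetOmit{CIDR: "10.0.0.0/24"}))
}
```

Keeping the key present means an importer can require the field in its schema and does not need a default, which matches voidspace's "no default on the import schema either" conclusion.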
[11:47] <dimitern> voidspace: on a side-note: the subnetDoc should support storing multiple AZs per subnet
[11:47] <voidspace> dimitern: yeah, it should
[12:01] <voidspace> fwereade: ping
[12:03] <voidspace> dimitern: ping
[12:03] <dimitern> voidspace: pong
[12:04] <voidspace> dimitern: fwereade: the State.AddSubnet method has a checkModelActive in the transaction
[12:04] <dimitern> voidspace: yeah?
[12:04] <voidspace> this causes adding a subnet during a model migration import to fail because the model is not active
[12:04] <voidspace> the obvious solution is to remove the check...
[12:05] <voidspace> is that ok, or is there a "right way" to do this
[12:05] <dimitern> voidspace: I don't think that's ok
[12:06] <dimitern> voidspace: removing the assert I mean
[12:06] <dimitern> voidspace: why won't the model be alive ?
[12:06] <voidspace> this test fails: https://github.com/juju/juju/compare/master...voidspace:model-migrations-subnets#diff-62dc0c2f793cbec94ae9fcbdf9c9ab84R507
[12:06] <voidspace> on that assert
[12:07] <voidspace> it fails with: github.com/juju/juju/state/model.go:878: model "new" is being migrated
[12:07] <voidspace> dimitern: a model that is being migrated is not active
[12:07] <voidspace> dimitern: the checkModelActive assert actively checks for this
[12:08] <voidspace> dimitern: see the definition of checkModelActive
[12:08] <dimitern> voidspace: well, the checkModelActive is pretty common - how is that working for other types?
[12:09] <voidspace> dimitern: AddSpace doesn't do the check
[12:10] <dimitern> voidspace: it should btw; but how about e.g. machines, units, ..
[12:10] <dimitern> voidspace: maybe it's ok to skip the checkModelActive check *only* during migration?
[12:12] <babbageclunk> Weird - if I cd '$GOROOT/src/github.com/juju/juju/mongo; go test -v -check.v' the tests all run twice. Does anyone know why that is?
[12:13] <babbageclunk> Does anyone else see that?
[12:14] <dimitern> babbageclunk: perhaps due to how '-v' and '-check.v' are handled?
[12:14] <babbageclunk> Oops, meant $GOPATH
[12:14] <dimitern> babbageclunk: check `go help testflag`
[12:14] <voidspace> dimitern: machine import has its own transaction and manually inserts the machine into state
[12:14] <babbageclunk> Well, without -v you don't see the output from -check.v
[12:15] <dimitern> voidspace: so then it sounds like AddSubnet's ops cannot be used literally for import
[12:15] <voidspace> dimitern: copying and pasting them sounds worse
[12:15] <dimitern> babbageclunk: try '-check.v -test.v' in that order?
[12:16] <dimitern> voidspace: they are likely to be quite different anyway, aren't they?
[12:16] <voidspace> dimitern: identical minus that check
[12:16] <voidspace> I think, let me confirm
[12:17] <dimitern> voidspace: assuming they are called in a certain order - e.g. if spaces weren't yet imported..
[12:17] <babbageclunk> dimitern: oh no - taking off the -v I still see the (duplicate) output
[12:17] <voidspace> dimitern: we only store SpaceName on subnet
[12:17] <voidspace> dimitern: so order doesn't matter
[12:18] <voidspace> AddSubnet is quite short - I can copy it into migration_import and remove the assert
[12:19] <dimitern> babbageclunk: usually, mixing '-check.v' and '-test.v' is not a good idea - as -check.v works for the current package, whereas '-test.v' can work on a root import path recursively - e.g. go test -v github.com/juju/juju/...
[12:19] <voidspace> not so terrible I guess
[12:19] <babbageclunk> dimitern: Could you try 'cd $GOPATH/src/github.com/juju/juju/mongo; go test -check.v -check.f oplogSuite.TestRestartsOnError' for me and see if you see the test run twice?
[12:20] <dimitern> voidspace: or perhaps better, make a helper that returns addSubnetOps which does not add checkModelActive and use it for both import and AddSubnet, but in the last case append the check before running?
[12:21] <voidspace> dimitern: heh, yeah - better
[12:21] <dimitern> babbageclunk: just a sec..
[12:21] <voidspace> dimitern: ta
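A minimal sketch of the helper approach dimitern suggests above: one function builds the shared subnet ops without the model-active check, and only the normal AddSubnet path appends that check. The types `op` and `subnetArgs`, the assert strings, and the function names `assertModelActiveOp`, `addSubnet`, and `importSubnet` are hypothetical stand-ins for illustration, not the real juju/state code.

```go
package main

import "fmt"

// op is a hypothetical stand-in for a transaction operation.
type op struct {
	Collection string
	ID         string
	Assert     string
}

// subnetArgs is a hypothetical, trimmed-down set of subnet parameters.
type subnetArgs struct {
	CIDR      string
	SpaceName string
}

// addSubnetOps builds only the ops that insert the subnet document.
// It deliberately leaves out the model-active check.
func addSubnetOps(args subnetArgs) []op {
	return []op{{Collection: "subnets", ID: args.CIDR, Assert: "doc-missing"}}
}

// assertModelActiveOp stands in for the checkModelActive assertion.
func assertModelActiveOp() op {
	return op{Collection: "models", ID: "model-uuid", Assert: "is-active"}
}

// addSubnet is the normal path: shared ops plus the model-active check.
func addSubnet(args subnetArgs) []op {
	return append(addSubnetOps(args), assertModelActiveOp())
}

// importSubnet is the migration path: the model is not active while it is
// being imported, so only the shared ops are used.
func importSubnet(args subnetArgs) []op {
	return addSubnetOps(args)
}

func main() {
	args := subnetArgs{CIDR: "10.0.0.0/24", SpaceName: "default"}
	fmt.Println("AddSubnet ops:", len(addSubnet(args)))
	fmt.Println("import ops:", len(importSubnet(args)))
}
```

Keeping the check out of the shared helper means the import path cannot pick it up by accident, while AddSubnet keeps its existing behaviour.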
[12:21] <babbageclunk> dimitern: Thanks! I think the -check.v flag is a red-herring - I'm trying to understand why the tests run twice.
[12:22] <dimitern> babbageclunk: yeah, it does run the test twice
[12:22] <dimitern> babbageclunk: so not related to -check.v / -test.v
[12:23] <babbageclunk> dimitern: What's up with that?
[12:25] <dimitern> babbageclunk: usually it's because a 'baseSuite' or something is embedded in 2 other, separately registered suites (i.e. var _ = gc.Suite(&SuiteEmbeddingBase{}))
[12:26] <dimitern> babbageclunk: due to the embedding, any baseSuite.TestFoo() will be called once for each SuiteEmbeddingBase like that
[12:26] <dimitern> as TestFoo is 'inherited'
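The embedding effect dimitern describes can be seen with the standard library alone. This sketch assumes a reflection-based runner that discovers methods by their `Test` prefix; the suite types and the `testMethods` helper are hypothetical, and no gocheck code is involved.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// baseSuite carries a test method that embedding suites will inherit.
type baseSuite struct{}

func (baseSuite) TestFoo() {}

// suiteA and suiteB both embed baseSuite, so TestFoo is promoted into each.
type suiteA struct{ baseSuite }

func (suiteA) TestA() {}

type suiteB struct{ baseSuite }

func (suiteB) TestB() {}

// testMethods lists the exported Test* methods of a suite value, the way a
// reflection-based runner might discover them.
func testMethods(suite interface{}) []string {
	var names []string
	t := reflect.TypeOf(suite)
	for i := 0; i < t.NumMethod(); i++ {
		if name := t.Method(i).Name; strings.HasPrefix(name, "Test") {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	for _, s := range []interface{}{suiteA{}, suiteB{}} {
		fmt.Println(reflect.TypeOf(s).Name(), testMethods(s))
	}
}
```

Both suites report TestFoo in their method sets, so a runner that registers both will call it once per suite.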
[12:26] <babbageclunk> dimitern: I looked to see if there was a suite embedding this one, but no dice. Also this one just embeds BaseSuite.
[12:28] <dimitern> babbageclunk: indeed, might be something else..
[12:29] <dimitern> babbageclunk: always try '-check.vv' in such cases to see which setup/teardown methods were called
[12:32] <alexisb> cherylj, are you joining us this morning
[12:32] <alexisb> fwereade, ping
[12:32] <babbageclunk> dimitern: ok, I'll try that too, thanks.
[12:33] <dimitern> babbageclunk: ah! got it.. nasty indeed - internal_test.go:31
[12:34] <dimitern> babbageclunk: calling gc.TestingT twice in the same package is *infinitely* more sinister than embedding a suite
[12:35] <babbageclunk> dimitern: Ooooh. Nasty!
[12:35] <dimitern> babbageclunk: (it's already called in package_test.go as well)
[12:36] <dimitern> babbageclunk: and gc.TestingT is the entry point gc uses to hook into 'go test' machinery
[12:36] <babbageclunk> dimitern: Right. So is there any reason not to just delete the internal_test one?
[12:36] <dimitern> while gc.Suite is registering a suite (not fixture - it needs TestXX() methods to be a suite) into gc to run
[12:37] <dimitern> babbageclunk: please do, and I'm sure fwereade might like to add this to the guidelines :) never run gc.TestingT more than once per package
[12:38] <babbageclunk> dimitern: Awesome - thanks for the help!
[12:38] <dimitern> np :)
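A toy model of why a second gc.TestingT call doubles every test run, assuming (as described above) that the hook runs every registered suite each time it is invoked. The `runner` type and `runsWithHooks` helper are hypothetical and only imitate the behaviour; this is not gocheck's implementation.

```go
package main

import "fmt"

// runner is a hypothetical stand-in for the package-level suite registry.
type runner struct {
	suites []func()
}

// Suite registers a suite, playing the role of gc.Suite.
func (r *runner) Suite(run func()) {
	r.suites = append(r.suites, run)
}

// TestingT runs every registered suite, playing the role of gc.TestingT.
func (r *runner) TestingT() {
	for _, run := range r.suites {
		run()
	}
}

// runsWithHooks registers one suite, invokes the hook the given number of
// times, and reports how often the suite ran.
func runsWithHooks(hooks int) int {
	r := &runner{}
	runs := 0
	r.Suite(func() { runs++ })
	for i := 0; i < hooks; i++ {
		r.TestingT()
	}
	return runs
}

func main() {
	fmt.Println("one hook:", runsWithHooks(1))
	fmt.Println("two hooks:", runsWithHooks(2))
}
```

In this model the suite runs once per hook call, which matches the symptom in the mongo package: one hook in package_test.go and another in internal_test.go.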
[12:56] <hoenir> could anyone review my PR?
[12:56] <hoenir> http://reviews.vapour.ws/r/4900/
[13:07] <voidspace> dimitern: babbageclunk: http://reviews.vapour.ws/r/4901/
[13:07] <dimitern> voidspace: looking in a sec
[13:07]  * voidspace lunches
[13:30] <sinzui> abentley: can you review https://code.launchpad.net/~sinzui/juju-ci-tools/controller-model/+merge/295657
[13:31] <abentley> sinzui: sure.
[13:31] <babbageclunk> perrito666: ping?
[13:45] <babbageclunk> He's probably off doing fun birthday things.
[13:47] <hoenir> could someone review my PR http://reviews.vapour.ws/r/4900/
[13:47] <hoenir> ?
[13:47] <hoenir> my PR fixes this https://bugs.launchpad.net/juju-core/+bug/1585430
[13:47] <mup> Bug #1585430: Cloud-init failed on windows <ci> <cloud-init> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1585430>
[14:00] <dimitern> voidspace: reviewed
[14:23] <natefinch> katco: +1 on the comment about aliases. Feels like two people disagreed so we decided to just do both
[14:23] <katco> natefinch: yeah... death by options
[14:39] <frobware> dimitern: please could you take a look at http://reviews.vapour.ws/r/4903/
[14:39] <frobware> dimitern: we looked at this a while ago, was just getting back to it
[14:39] <dimitern> frobware: sure, looking
[14:39] <natefinch> fwereade: when you change a model config in a running model, what executes those changes?
[14:40] <fwereade> natefinch, depends -- what changes are you thinking of? substrate config will be watched for and applied by several things that have their own environs
[14:40] <fwereade> natefinch, different bits of config will be watched for by other things, e.g. logging-config
[14:41] <natefinch> fwereade: working on the syslog forwarding stuff... didn't know if there was a generic watcher for it, or if I should write a new one for the syslog stuff explicitly
[14:43] <fwereade> natefinch, you need to watch model config under the hood whatever you do
[14:43] <fwereade> natefinch, so at least it's easy to create the watcher
[14:44] <fwereade> natefinch, just don't let on what it's really watching, call it WatchRsyslogConfig or something
[14:44] <natefinch> fwereade: yep, ok
[14:44] <fwereade> natefinch, (note: do not attempt to filter out changes that don't apply to your fields: it is not possible to do so reliably)
[14:45] <natefinch> fwereade:  good to know.
[14:47] <natefinch> ahh, so there's ModelWatcher.WatchForModelConfigChanges
[14:48] <fwereade> natefinch, yeah, that's the one
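A rough sketch of fwereade's advice: expose a purpose-named watcher that is really the model config watcher underneath, and have the consumer re-read its settings on every notification instead of trying to filter events. The `NotifyWatcher` interface, `modelConfigWatcher`, `syslogConfig`, `handleChanges`, and the host name are all hypothetical; only the WatchRsyslogConfig name comes from the discussion above.

```go
package main

import "fmt"

// NotifyWatcher is a hypothetical, minimal notification source.
type NotifyWatcher interface {
	Changes() <-chan struct{}
}

// modelConfigWatcher is a hypothetical watcher that fires on any model
// config change.
type modelConfigWatcher struct {
	ch chan struct{}
}

func (w *modelConfigWatcher) Changes() <-chan struct{} { return w.ch }

// WatchRsyslogConfig hides what is really being watched: it hands back the
// model config watcher unchanged and does no filtering.
func WatchRsyslogConfig(modelConfig NotifyWatcher) NotifyWatcher {
	return modelConfig
}

// syslogConfig is a hypothetical slice of the model config.
type syslogConfig struct {
	Host string
}

// handleChanges re-reads the settings on every notification, whichever
// field actually changed, and returns once the watcher's channel is closed.
func handleChanges(w NotifyWatcher, read func() syslogConfig, apply func(syslogConfig)) {
	for range w.Changes() {
		apply(read())
	}
}

func main() {
	src := &modelConfigWatcher{ch: make(chan struct{}, 2)}
	src.ch <- struct{}{}
	src.ch <- struct{}{}
	close(src.ch)

	read := func() syslogConfig { return syslogConfig{Host: "log.example.com"} }
	applied := 0
	handleChanges(WatchRsyslogConfig(src), read, func(c syslogConfig) {
		applied++
		fmt.Println("applying syslog config for", c.Host)
	})
	fmt.Println("notifications handled:", applied)
}
```

Applying the current settings on every event is only safe if the apply step is idempotent, which is the price of not filtering.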
[14:52] <fwereade> natefinch, ...although, looking at it, there's a bit of WTF there
[14:52] <mup> Bug #1585674 opened: "Waiting for agent initialization to finish" needs to be more detailed. <juju-core:New> <https://launchpad.net/bugs/1585674>
[14:53] <frobware> dooferlad, voidspace: would appreciate a review of http://reviews.vapour.ws/r/4903/
[14:53] <fwereade> natefinch, non-bulk, doesn't identify model, returns an error indicating OMG INFRASTRUCTURE BORKED instead of "this request didn't work"
[15:28] <voidspace> dimitern: ping
[15:29] <katco> natefinch: lmk when you're done with the 1-pager; need to send an email out about the bug
[15:29] <natefinch> katco: ok, working on it now
[15:29] <dimitern> voidspace: pong
[15:29] <katco> natefinch: ta
[15:29] <voidspace> dimitern: you suggest abstracting newSubnetFromArgs out from addSubnetOps
[15:30] <voidspace> dimitern: but addSubnetOps still needs to construct a subnetDoc - so that at least would be duplicated
[15:30] <voidspace> dimitern: not a big deal, but that's why I didn't do it
[15:31] <voidspace> dimitern: if you still think it's worth it I'll make the change
[15:31] <dimitern> voidspace: sorry, otp - will get back to you shortly
[15:34] <dimitern> voidspace: well, addSubnetOps can take a prepopulated subnetDoc
[15:34] <dimitern> voidspace: it doesn't have to also create it
[15:40] <voidspace> dimitern: still leaving creation and validation of the Subnet to be duplicated, plus the subnetDoc is a bit of an "internal type" so it feels a bit yucky
[15:40] <voidspace> I'll duplicate creation of the subnetDoc
[15:47] <dimitern> voidspace: I'll leave it up to you
[15:50] <bogdanteleaga> natefinch, you could look at the retrystrategy stack, it's essentially watching a model config variable and acting on changes
[16:02] <alexisb> can someone please review this commit: https://github.com/juju/juju/pull/5457
[16:02] <alexisb> katco, ^^
[16:02] <katco> alexisb: tal
[16:02] <alexisb> btw perrito666, happy bday!
[16:03] <katco> perrito666: happy birthday :) hope you're feeling better
[16:03] <alexisb> thanks katco
[16:04] <natefinch> bogdanteleaga: awesome, thanks
[16:04] <katco> alexisb: looks like that already has a review: http://reviews.vapour.ws/r/4900/
[16:06] <alexisb> does the code author use review board?
[16:07] <katco> alexisb: i don't know, but the project does :) the link is automatically injected into the PR. i'll leave a note asking them to review the review
[16:08] <alexisb> katco, thank you
[16:13] <bogdanteleaga> alexisb, katco, yes he does, he recently joined our juju team :)
[16:14] <alexisb> bogdanteleaga, aaaah, ok that makes sense
[16:15] <katco> bogdanteleaga: grats on the add :)
[16:17] <mbruzek> katco: https://bugs.launchpad.net/juju-core/+bug/1585701
[16:17] <mup> Bug #1585701: juju resources needs to support extensions other than ".tgz" <juju-core:New> <https://launchpad.net/bugs/1585701>
[16:17] <katco> mbruzek: tal
[16:18] <katco> mbruzek: replied on bug
[16:18] <babbageclunk> perrito666: ping?
[16:19] <bogdanteleaga> katco, cheers :)
[16:21] <babbageclunk> bogdanteleaga: I've reviewed it as well (although I'm technically only a junior reviewer).
[16:21] <mbruzek> katco: Thanks for pointing that out. My bad.
[16:22] <katco> mbruzek: no worries; new feature
[16:22] <bogdanteleaga> babbageclunk, you've basically expressed my exact thoughts on it :P
[16:22] <katco> mbruzek: fwiw we opposed checking the file extension because it's just a blob of data
[16:23] <bogdanteleaga> I guess we should start reviewing on rbt as well
[16:24] <katco> bogdanteleaga: i didn't see the need at first, but after using it for awhile it's hard when i have to review something on GH
[16:24] <babbageclunk> bogdanteleaga: ok great - I thought maybe I was missing something.
[16:26] <natefinch> sinzui: what's with the multiple posts from the bot on this PR? https://github.com/juju/juju/pull/5419
[16:27] <natefinch> sinzui: (at the bottom it says merge request accepted 4 times in response to a single $$merge$$)
[16:28] <sinzui> natefinch: I have never seen that
[16:28] <sinzui> and I am looking at it now
[16:28] <natefinch> sinzui: k, just figured I'd bring it to your attention in case it indicates a problem
[16:32] <katco> natefinch: ding! your hour on the 1-pager is up :) are you about wrapped up?
[16:34] <natefinch> katco: forgot and got lunch in the middle, but still almost done. It's actually fairly simple.  5 minutes.
[16:34] <katco> natefinch: kk
[16:39] <voidspace> dimitern: why does the subnetDoc have an IsPublic field?
[16:42] <natefinch> katco: last two pages here: https://docs.google.com/document/d/1x60GL9zckzXNfHw_yyY_syxF1WSL5JkcWoWKJtwT_Ww/edit#
[16:42] <dimitern> voidspace: because subnets can be public or private (i.e. with and without shadow addresses support)
[16:42] <dimitern> (by spec, not as implemented currently except by mocking)
[16:42] <katco> natefinch: ta; can you send an email to ian and cc eric, me stating as such?
[16:42] <natefinch> katco: sure thing
[16:43] <katco> natefinch: and then you're rolling back onto the bug now?
[16:43] <natefinch> katco: yep
[16:43] <katco> natefinch: cool, reassign yourself plx :)
[16:45] <voidspace> dimitern: then why is that field not *used*
[16:45] <voidspace> dimitern: it's not on SubnetInfo
[16:45] <voidspace> dimitern: and it's not exposed on Subnet
[16:46] <voidspace> dimitern: so it's never set or changed
[16:51] <dimitern> voidspace: because we haven't quite got there yet
[16:51] <voidspace> dimitern: unused code for imaginary future use cases... :-p
[16:52] <dimitern> voidspace: not imaginary, just not done yet
[16:52] <voidspace> dimitern: I've added it as an ignored field in the migration_internal_test - when we do start using it we'll have to remember to migrate it
[16:52] <dimitern> voidspace: if you read the network model spec, private/public subnets and spaces are part of it
[16:53] <dimitern> voidspace: sounds good, +1
[16:53] <voidspace> dimitern: that doesn't change my opinion - specs are always imaginary until they're actually implemented
[16:55] <dimitern> voidspace: of course - I'd leave a TODO on the field at least - to keep it in mind; there's even a bug for it
[16:56] <voidspace> dimitern: good idea, adding comment
[16:57] <dimitern> voidspace: thanks!
[17:13] <rogpeppe> a simple addition to the juju/cmd package. anyone up for a little review? https://github.com/juju/cmd/pull/37
[17:13] <rogpeppe> voidspace, dimitern: ^
[17:14] <rogpeppe> natefinch: ^
[17:14] <rogpeppe> fwereade: ^
[17:21] <dimitern> rogpeppe: LGTM
[17:21] <rogpeppe> dimitern: you angel, thanks!
[17:22] <natefinch> rogpeppe: I'm sort of surprised that this wouldn't just be something we add to the help command itself
[17:22] <rogpeppe> natefinch: how would we do that?
[17:22] <rogpeppe> natefinch: the help command is built into SuperCommand
[17:22] <rogpeppe> natefinch: FWIW we considered about 3 dozen options here :)
[17:23] <rogpeppe> natefinch: this was the simplest and best tailored to what we actually needed
[17:23] <natefinch> rogpeppe: oh yeah, I forgot the help command is built in....
[17:23] <rogpeppe> natefinch: this is what it enabled: https://github.com/juju/charmstore-client/pull/66
[17:24] <natefinch> rogpeppe: very nice
[17:25] <rogpeppe> natefinch: yup, always happy to delete code :)
[17:37] <redir> reboot brb
[19:11] <mup> Bug #1585750 opened: Destroying a service in error gives no feedback <juju-core:New> <https://launchpad.net/bugs/1585750>
[21:38] <katco> natefinch-afk: you never reassigned yourself to bug 1537585 :(
[21:38] <mup> Bug #1537585: machine agent failed to register IP addresses, borks agent <2.0-count> <blocker> <landscape> <network> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1537585>
[21:51] <axw> rogpeppe: not sure why help plugins was removed.
[21:51] <axw> nor the rest really
[22:11] <cherylj> axw: the help topics were removed to avoid having to keep two places up-to-date: (online docs and cli help)
[22:12] <axw> cherylj: ok, fair enough. plugins tho? that might have been accidental?
[22:13] <cherylj> oooh, yeah I forgot that help plugins wasn't just a topic
[22:13] <cherylj> yeah, that might've been an oversight
[23:09] <wallyworld> axw: anastasiamac: have standup without me today, in another meeting
[23:09] <axw> ok
[23:16] <axw> anastasiamac: standup?
[23:16] <axw> perrito666: ^
[23:39] <mup> Bug #1585825 opened: Takes too long to download a resource from a controller to unit <ci> <resources> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1585825>