/srv/irclogs.ubuntu.com/2012/10/09/#juju-dev.txt

davecheney	hey - awesome - my CI machine on canonistack was deleted	00:34
davecheney	that's fantastic	00:34
niemeyer	davecheney: Oops :)	00:36
niemeyer	davecheney: Welcome to the Clouds!	00:36
niemeyer	:)	00:36
davecheney	it's a good thing there wasn't anything important on it	00:36
niemeyer	davecheney: The schema package, which we use with configs, takes care of the type-enforcing logic	00:51
niemeyer	Today is Launchpad-EOF day	00:51
davecheney	niemeyer: it sure bloody is	00:52
davecheney	niemeyer: yeah, I didn't think the int64/int problem would be a real problem	00:52
davecheney	only when constructing faux test data	00:52
davecheney	niemeyer: [LOG] 41.35831 SYNC Cluster 0xf840582210 is stopping its sync loop.	00:52
davecheney	... Panic: command failed: bzr commit -m Imported charm. (PC=0x4114F3)	00:52
davecheney	happens on a fresh precise machine	00:53
davecheney	^ store	00:53
niemeyer	davecheney: Does it say what the error message was?	00:53
niemeyer	davecheney: The bzr error, that is	00:54
davecheney	that is it	00:54
niemeyer	davecheney: That's the command run, not its output	00:54
davecheney	niemeyer: http://paste.ubuntu.com/1268515/	00:54
niemeyer	davecheney: Would you mind to tweak the message so we get an idea?	00:54
niemeyer	davecheney: ?	00:55
davecheney	i think i have looked into this before	00:55
niemeyer	davecheney: I trust you :)	00:55
davecheney	i have a branch somewhere	00:55
davecheney	that did add extra debugging	00:55
davecheney	i remember being annoyed that bzrDir.Commit panic'd	00:55
davecheney	will fix	00:55
niemeyer	davecheney: Btw, any news on the ec2 signature issue?	00:55
davecheney	it's on the cards for today	00:56
niemeyer	davecheney: Sweetest	00:56
davecheney	i have a nasty feeling there is a limit to the number of machines we can specify in that url	00:56
davecheney	going to do some spelunking in the aws forums	00:56
niemeyer	davecheney: I don't doubt, but it'd be a surprisingly bad error message if nothing else	00:59
davecheney	it's actually a 403	00:59
niemeyer	davecheney: Isn't that a forbidden?	00:59
davecheney	which smells like a generic 'hmm, i don't like that, better tell you to get stuffed'	01:00
davecheney	niemeyer: panic(fmt.Sprintf("command failed: bzr %s\n%s", strings.Join(args, " "), output))	01:00
davecheney	^ this is how that test tries to capture the output	01:00
davecheney	no idea why it isn't working	01:00
davecheney	will try combined output	01:00
davecheney	bingo	01:01
davecheney	... Panic: command failed: bzr commit -m Imported charm.	01:01
davecheney	bzr: ERROR: Unable to determine your name.	01:01
davecheney	Please, set your name with the 'whoami' command.	01:01
davecheney	E.g. bzr whoami "Your Name <name@example.com>"	01:01
niemeyer	davecheney: We should show the error as well	01:05
niemeyer	Ah, okay	01:05
niemeyer	Combined output	01:05
niemeyer	We cannot use that, unfortunately.. :(	01:05
davecheney	i tried that, but it breaks the test for others that expect Run to only handle stdout	01:06
niemeyer	We should at least display the rest of the output	01:06
niemeyer	davecheney: Right.. it fixed a real bug	01:06
niemeyer	It used to be combined	01:06
davecheney	is there a flag we can pass to bzr to force an identity	01:06
davecheney	?	01:06
niemeyer	davecheney: Doesn't it respect $EMAIL?	01:07
davecheney	niemeyer: no idea, let me try	01:07
davecheney	niemeyer: $EMAIL works	01:11
davecheney	i'll fix the test to pass that in	01:11
niemeyer	Sweet, thanks	01:11
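A rough Go sketch of passing $EMAIL to a subprocess from a test, assuming a hypothetical helper named setEnv and a made-up identity string. The shell command only stands in for bzr, so the example does not depend on bzr being installed.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// setEnv is a hypothetical helper. It returns a copy of env in which
// key is set to value, dropping any earlier setting of the same key.
func setEnv(env []string, key, value string) []string {
	prefix := key + "="
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if !strings.HasPrefix(kv, prefix) {
			out = append(out, kv)
		}
	}
	return append(out, prefix+value)
}

func main() {
	// The identity below is an illustrative value, not one from the log.
	cmd := exec.Command("sh", "-c", "echo $EMAIL")
	cmd.Env = setEnv(os.Environ(), "EMAIL", "Test User <test@example.com>")
	out, err := cmd.Output()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(string(out))
}
```

Setting the variable on the command rather than on the test process keeps the change local to the one invocation that needs an identity.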
davecheney	niemeyer: https://codereview.appspot.com/6631051	01:21
niemeyer	davecheney: LGTM	01:25
davecheney	niemeyer: ty	01:25
davecheney	niemeyer: http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/ApiReference-query-TerminateInstances.html	01:44
davecheney	no mention of a limit for n	01:44
davecheney	and nothing obvious on the googles	01:44
niemeyer	davecheney: It's likely an error in the signature logic	01:45
davecheney	i will look there for something length related	01:46
niemeyer	davecheney: I'd try to find something that can sign that request properly, like the Python's boto, and comparing the signatures	01:47
niemeyer	davecheney: and perhaps most interestingly, comparing the payloads	01:47
davecheney	will do	01:47
niemeyer	Why you love me not, Launchpad	02:21
davecheney	https://codereview.appspot.com/6642048/	05:35
fwereade	wrtp, heyhey	06:48
wrtp	fwereade: yo!	06:48
wrtp	fwereade: how's tricks?	06:48
fwereade	wrtp, ah, not bad, and you?	06:49
wrtp	fwereade: not too bad. just trying to keep myself oriented in the sea of tiny CLs that i'm doing for this change. sometimes i think that it's better to do larger CLs, just to keep the mental overhead down (plus less testing overhead)	06:51
fwereade	wrtp, I know the feeling	06:52
TheMue	morning	07:20
fwereade	TheMue, heyhey	07:21
TheMue	heya fwereade	07:21
TheMue	fwereade: any idea on how we can detect if a machine fails the hard way (not by stopping it manually)?	09:17
fwereade	TheMue, sorry, explain the situation more	09:18
fwereade	TheMue, are you talking about an actual instance disappearing?	09:18
TheMue	fwereade: yes.	09:19
fwereade	TheMue, I think we expect the firewaller to notice that the provider's not reporting it any more in the Instances list	09:19
TheMue	fwereade: if we remove it watchers will get notified. but what happens if there's a hard stop?	09:19
fwereade	TheMue, wait, we don't have an instances watcher do we?	09:20
TheMue	fwereade: afaik not	09:20
fwereade	TheMue, "machine" and "instance" are different -- which are we talking about here	09:20
fwereade	?	09:20
Aram	morning.	09:20
fwereade	Aram, heyhey	09:20
TheMue	Aram: morning	09:20
TheMue	fwereade: let's start with instances, the hardest part. ;)	09:21
fwereade	TheMue, AFAIK the only way to tell is by polling the provider :/	09:22
fwereade	TheMue, IIRC the python had a separate thing running once a minute to do that, does that ring a bell?	09:23
TheMue	fwereade: i also had polling in my mind, only wanted to get sure. thanks for the py hint, i'll look there.	09:23
fwereade	TheMue, I think that's what we do anyway :) been a little while since I looked...	09:24
TheMue	fwereade: i could integrate such a mechanism in the firewaller, one poller per instance. and if it fails i notify the main loop to react <thinking/>	09:25
fwereade	TheMue, one poller per instance sounds a bit off... consider N=100000	09:26
fwereade	TheMue, surely this is provisioner more than firewaller?	09:27
fwereade	TheMue, (arguably a whole separate task...)	09:27
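A small Go sketch of the alternative to one poller per instance, under the assumption that the provider can list all live instance ids in a single call. The function name missingInstances and the ids are hypothetical; one periodic poll compares the ids that state expects against the ids the provider reports.

```go
package main

import "fmt"

// missingInstances is a hypothetical helper. It returns the ids in known
// that do not appear in reported, preserving the order of known.
func missingInstances(known, reported []string) []string {
	alive := make(map[string]bool, len(reported))
	for _, id := range reported {
		alive[id] = true
	}
	var dead []string
	for _, id := range known {
		if !alive[id] {
			dead = append(dead, id)
		}
	}
	return dead
}

func main() {
	// Illustrative ids only. In a real poller, reported would come from
	// a single provider listing call made on a timer.
	known := []string{"i-aaa", "i-bbb", "i-ccc"}
	reported := []string{"i-aaa", "i-ccc"}
	fmt.Println(missingInstances(known, reported))
}
```

With this shape the cost per poll is one provider request plus a linear scan, regardless of how many instances exist.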
TheMue	fwereade: that's a scaling problem of the current firewaller, even w/o polling. it already runs goroutines for all machines and units	09:28
fwereade	TheMue, that's no reason to make it worse :p	09:28
TheMue	fwereade: that hasn't been meant as a reason, only that we have to rethink the fw for large clouds	09:29
TheMue	fwereade: maybe a kind of partinioning	09:29
TheMue	arg	09:29
TheMue	partitioning	09:29
fwereade	TheMue, yeah, makes sense... but surely this is even more reason to separate out the restarting for now	09:29
fwereade	TheMue, although... hm	09:29
fwereade	TheMue, it took users about a year to figure out that that functionality even existed in Python, iirc	09:30
fwereade	TheMue, I'm really just idly wondering if that bit is the highest possible priority right now, assuming you're in sync with niemeyer just ignore me :)	09:31
TheMue	fwereade: first step is only to recognize dead instances to keep the ports state in the fw up-to-date	09:31
fwereade	TheMue, ah, hmm, I see	09:32
fwereade	TheMue, actually, sorry, no I don't...	09:32
fwereade	TheMue, why do we need to close ports on dead instances?	09:32
fwereade	TheMue, ok, the more I think, the more I feel like I've missed something big/important	09:33
TheMue	fwereade: not to close them, but to know that they have to be opened when an instance becomes available again	09:33
fwereade	TheMue, I thought that you were changing to an everything-open model in preparation for per-machine FWing?	09:33
fwereade	TheMue, was that just some fever dream of mine? :P)	09:33
TheMue	fwereade: that's optional, and brings other problems we still have to think about. think about multiple services needing the same port, so you can't just close it for the first one but for the last one.	09:34
fwereade	TheMue, isn't the logic completely identical?	09:35
fwereade	TheMue, (but ok the answer to my question is "no" -- so, sorry, I'm out of the loop: what is your current goal?)	09:36
TheMue	fwereade: no, today we tell the instance to close ports. in case of only one global security group it would do it immediately for all, even if other services need that port.	09:36
TheMue	fwereade: but my current goal is only to get aware of real dying instances to keep the state in the firewaller up-to-date	09:38
TheMue	fwereade: the global firewaller mode is a different topic	09:38
fwereade	TheMue, with per-machine FWing we'd leave the global group open all the time, wouldn't we?	09:38
fwereade	TheMue, sorry I'm derailing again	09:39
fwereade	TheMue, ok, I think I am being dense though: please explain again why you need to close ports on an instance that doesn't exist?	09:39
fwereade	TheMue, ahhhhhh sorry: is it because next time we start an instance for that machine, it will be started with the security group of the original instance?	09:40
TheMue	fwereade: pls forget security groups	09:41
TheMue	fwereade: different topic and currently not interesting from the firewallers perspective, because the API is neutral to how the default and the global modes are implemented	09:41
fwereade	TheMue, ok then, back to your original explanation: "not to close them, but to know that they have to be opened when an instance becomes available again"	09:42
fwereade	TheMue, I still don't follow	09:42
fwereade	TheMue, if a new instance shows up for that machine	09:42
fwereade	TheMue, surely the last thing we want is open ports?	09:42
TheMue	fwereade: the ports will be opened for the units. but if the firewaller "thinks" that a port is already open, then it wouldn't open it.	09:43
TheMue	fwereade: technologically it is closed, the firewaller still thinks it is open	09:44
fwereade	TheMue, possible strawman: all we need to do is, whenever a machine's instance-id changes, we should close all ports on all units for that machine	09:44
fwereade	TheMue, because opening the ports that state thinks are open, on a new instance, is surely Bad and Wrong?	09:44
fwereade	TheMue, ISTM that you're trying to figure out how to implement a security hole ;p	09:45
fwereade	TheMue, but then I may be missing context and repeating previous discussions..?	09:45
TheMue	fwereade: trying to follow you, don't see the sec hole	09:46
TheMue	fwereade: the ports aren't opened by default, indeed not	09:46
fwereade	TheMue, well, we shouldn't open ports until the charm tells us to, right?	09:46
fwereade	TheMue, so if we have instance X running u/0, with a bunch of ports open	09:46
TheMue	fwereade: yes, and when the firewaller in this moment "thinks" it is already open, it won't open it	09:46
fwereade	TheMue, and instance X dies hard, and u/0 is redeployed to instance Y	09:47
fwereade	TheMue, and you then open a bunch of ports on Y before the unit tells you they should be opened	09:47
TheMue	fwereade: pls read, i will not open anything automatically	09:47
TheMue	fwereade: only on demand of a deployed unit	09:48
fwereade	TheMue, right	09:48
TheMue	fwereade: BUT!	09:48
TheMue	fwereade: the firewaller still "THINKS" !!! that the port is already open (he has not gone aware that the instance has been gone)	09:49
TheMue	fwereade: so he won't open the port, even if needed, because he currently thinks there is nothing to do	09:49
TheMue	fwereade: your hint with the instance id may be a good one	09:50
TheMue	fwereade: can we be sure, in every environment, that they are always new?	09:50
fwereade	TheMue, I'm not sure :/	09:50
fwereade	TheMue, so actually it shouldn't be the FW	09:51
fwereade	TheMue, I think the only thing that makes sense is for the UA to reset its port state when it installs a charm	09:51
fwereade	TheMue, which then solves your problem... right?	09:51
TheMue	fwereade: sounds reasonable	09:52
fwereade	TheMue, or maybe not, sorry, I need to think it through again, I'm somewhat unfamiliar with the guts of the firewaller	09:52
TheMue	fwereade: we only have the gap between the moment of the dying instance and the redeployment of the unit	09:52
TheMue	fwereade: the fw knows about machines to map this information to instances and about services (are they exposed) and units. the queue unit -> machine -> instance controls, which ports are to open/close.	09:54
TheMue	fwereade: in fast words ;)	09:54
wrtp	TheMue: can't you just remove ports from an instance when a machine's instance gets changed?	09:57
wrtp	TheMue: yes, that would mean that a dying instance would still keep some global ports open, but i think that's reasonable.	09:57
fwereade	wrtp, as pointed out, I think that we can't be sure that a new machine will have a new instance id	09:58
TheMue	wrtp: see fwereade	09:58
wrtp	fwereade: i don't think that matters.	09:58
TheMue	wrtp: how do you detect, that the instance is new?	09:59
fwereade	wrtp, ah, ok, if the provisioner does that it could work...	09:59
wrtp	TheMue: i don't think you need to	09:59
fwereade	wrtp, you suggested that we could remove ports when the instance is changed... but you're saying we don't need to detect new instances to detect this?	10:00
wrtp	TheMue: in fact, i think we have to be able to assume that an instance id isn't repeated.	10:00
* fwereade has a confuse		10:00
fwereade	wrtp, well, you can't do that, can you?	10:00
fwereade	wrtp, I'm pretty sure EC2 will sometimes repeat instance ids in really rather quick succession	10:01
wrtp	fwereade: if an instance id might be repeated, then we have no way of knowing for sure when a machine has been assigned to a new instance	10:01
TheMue	wrtp: you think of watching the instance id of a machine? every set, even with equal values, is a change.	10:01
fwereade	wrtp, except for the fact that we do the assigning...	10:01
wrtp	fwereade: yes, but we're watching the instance id on the machine. that's all we know of the new instance.	10:02
wrtp	fwereade: and if an old instance has gone away, we might allocate a new instance and set the machine's instance id.	10:02
fwereade	wrtp, right -- but the FW has no way to detect this, does it?	10:03
TheMue	fwereade: with the help of a watcher it may be done	10:04
wrtp	fwereade: and this could be a problem.	10:05
wrtp	TheMue: i don't believe it can	10:05
TheMue	wrtp: yes, if it doesn't change we will not see it :(	10:06
wrtp	fwereade: it's actually not a problem for real, because we actually never store fw settings per instance	10:06
wrtp	fwereade: we store them per machine.	10:06
fwereade	wrtp, yeah, which is a problem... isn't it?	10:07
wrtp	fwereade: despite the API in Environ.	10:07
wrtp	fwereade: well, it's kind of odd. we can have a situation where we cannot change the port settings for a given machine, because its instance has gone away	10:08
fwereade	wrtp, (a problem, because, we don't want to have open ports on an instance until code running on that instance asks for them to be open)	10:08
fwereade	wrtp, I assert that ports are entirely about instances and only coincidentally to do with machines	10:08
wrtp	fwereade: with global ports, we do want that to be true	10:08
wrtp	fwereade: when we start an instance, we wipe its security group AFAIR	10:09
wrtp	fwereade: it would be nice if they were, but the implementation doesn't do that	10:09
fwereade	wrtp, then isn't it just a matter of the MA, on first run, clearing all ports for all assigned units, and then we're done?	10:10
wrtp	fwereade: we pretend that the ports are set per instance, but they're not - they're per machine	10:10
wrtp	fwereade: possibly. i seem to remember suggesting that before.	10:11
fwereade	wrtp, I think that if the clearing is already in place that that is the right thing to do	10:12
TheMue_	wrtp: and with a global firewall mode they are globally. but how does a machine know about which ports other machines need?	10:15
=== TheMue_ is now known as TheMue
TheMue	wrtp: so there has to be a port manager for the global mode	10:16
wrtp	phone call, sorry	10:18
fwereade	TheMue, sorry, why would any machine need to know about any other machine's ports?	10:18
fwereade	TheMue, except in the very limited sense in which, yes, the FW is running with a machine agent	10:20
wrtp	back	10:21
wrtp	fwereade: wouldn't the correct place to clear the ports for a given unit be when we reassign that unit to a new machine?	10:22
fwereade	wrtp, when does that happen?	10:22
wrtp	fwereade: never, currently AFAIK. i think this is a fairly sketchy area.	10:23
TheMue	fwereade: no, you got me wrong, a machine does not need to know about other machines	10:23
fwereade	wrtp, the issue I think I am talking about is when a machine is reprovisioned, with all its units, and this does happen -- at least in python :)	10:23
TheMue	fwereade: but if a machine clears all ports for all assigned units, then in a global mode a port may be closed for a different machine that still needs that port.	10:24
wrtp	fwereade: ok, so perhaps that's the time that the ports should be cleared.	10:24
wrtp	fwereade: (by the provisioner)	10:24
wrtp	TheMue: a machine can't clear the ports in the environment - only the firewaller can do that	10:25
TheMue	wrtp: yes	10:25
wrtp	TheMue: a machine agent could clear the ports in any units assigned to that machine though, but i'm not convinced that's the best place for it.	10:25
fwereade	TheMue, I think there's something I'm missing about this "global" mode	10:26
TheMue	wrtp: i answered "<fwereade> wrtp, then isn't it just a matter of the MA, on first run, clearing all ports for all assigned units, and then we're done?"	10:26
fwereade	TheMue, er	10:26
TheMue	wrtp: no, i would like to have the control in the firewaller	10:27
fwereade	TheMue, how does a machine agent calling ClosePort() on a unit affect the global set of opened ports?	10:27
TheMue	fwereade: currently the fw doesn't know about that global mode	10:27
wrtp	TheMue: i don't think the firewaller should be changing a Unit's open ports.	10:27
fwereade	TheMue, if the FW is closing a port just because one unit had a port closed then it's just crack, surely?	10:27
* TheMue thinks we are turning here a bit		10:28
fwereade	wrtp, +1	10:28
fwereade	TheMue, s/closing a port/closing a global port/	10:29
TheMue	reset<CR>	10:29
TheMue	fwereade: again, the fw today doesn't know about the global mode	10:29
wrtp	fwereade: where's the code that's doing the refcounting of ports then?	10:29
fwereade	TheMue, please don't criticise my suggestions in the context of this global mode and then tell me the global mode is irrelevant	10:30
wrtp	s/fwereade/TheMue/	10:30
fwereade	wrtp, I have no idea... tbh the idea of a global FW sounds like complete crack to me	10:30
wrtp	TheMue: the plan is to make the firewaller aware of global mode, right?	10:30
fwereade	wrtp, shitty security, and a whole load of complexity	10:30
wrtp	fwereade: i think it's a reasonable pragmatic security	10:30
TheMue	fwereade: no, i'm not telling you it's irrelevant, i only want to sort the ideas a bit	10:30
wrtp	s/a reas/reas/	10:31
fwereade	wrtp, so every machine opens the intersection of everything that might be open?	10:31
wrtp	fwereade: union	10:31
* fwereade had a brain once but then he left it somewhere		10:32
fwereade	wrtp, ok... but, yeah, isn't that completely crackful from a security POV?	10:32
wrtp	fwereade: it means we will be able to scale in ec2 without opening all ports.	10:32
wrtp	fwereade: actually, i think the thing that's crackful is that we're pretending that it's all per-instance when it's not.	10:33
TheMue	wrtp: i made a proposal with ref counting, but niemeyer meant we have to discuss about it, because the environments can act differently. see https://codereview.appspot.com/6635043/ and http://irclogs.ubuntu.com/2012/10/05/%23juju-dev.html at 14:23	10:33
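A short Go sketch of reference counting for a single global port set, assuming a hypothetical globalPorts type (this is not the code in the linked proposal). The set of open ports is the union of what all units need, so a port is opened in the environment only on the first request and closed only when the last user releases it.

```go
package main

import "fmt"

// globalPorts is a hypothetical reference counter for ports shared by
// every unit in the environment.
type globalPorts struct {
	refs map[int]int
}

func newGlobalPorts() *globalPorts {
	return &globalPorts{refs: make(map[int]int)}
}

// Open records one more user of the port. It returns true only when the
// port must actually be opened in the environment (first user).
func (g *globalPorts) Open(port int) bool {
	g.refs[port]++
	return g.refs[port] == 1
}

// Close records one fewer user of the port. It returns true only when
// the port must actually be closed in the environment (last user gone).
func (g *globalPorts) Close(port int) bool {
	if g.refs[port] == 0 {
		return false
	}
	g.refs[port]--
	if g.refs[port] == 0 {
		delete(g.refs, port)
		return true
	}
	return false
}

func main() {
	g := newGlobalPorts()
	fmt.Println(g.Open(80))  // first unit: open for real
	fmt.Println(g.Open(80))  // second unit: already open
	fmt.Println(g.Close(80)) // one unit left: keep open
	fmt.Println(g.Close(80)) // last unit gone: close for real
}
```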
wrtp	TheMue: i think niemeyer's concerns there could be addressed by having the provisioner clear out the ports of machine's units when it detects that the machine's instance is dead (or when it reprovisions a machine)	10:36
TheMue	wrtp: that has been my initial question: how do we detect a dead instance?	10:37
fwereade	TheMue, by polling, in the provisioner, like I said	10:38
TheMue	fwereade: it's already doing so? you sounded like that's still open and in py it's an external program.	10:38
fwereade	TheMue, I dunno, I'm afraid I didn't implement it	10:39
fwereade	TheMue, if the provisioner is not polling for dead instances then, to match py, I think it should be	10:39
fwereade	TheMue, and if it is, then we have the information we need available right there, and can take action as needed	10:39
TheMue	fwereade: yes, i didn't find anything in the provisioner code. but i hoped i only may be blind. ;)	10:40
fwereade	TheMue, and if we decide that is not the right place for it we can then put it somewhere else	10:40
fwereade	TheMue, I thought I saw a TODO in there back in oakland :/	10:40
TheMue	fwereade: ok	10:40
TheMue	fwereade: imho it would be ok if the provisioner detects it but it notifies the firewaller to keep control over the ports (as it has its own representation of its world)	10:41
fwereade	TheMue, why can't the provisioner make the appropriate state changes?	10:42
wrtp	fwereade: +1	10:42
wrtp	TheMue: the provisioner only needs to set the units' ports	10:42
fwereade	TheMue, if that feels wrong, then I don't really mind who does make the state changes, except that it shouldn't be the FW... surely?	10:42
fwereade	TheMue, I think that the FW sounds complex enough without adding feedback loops to it ;)	10:43
TheMue	fwereade: yes, state changes are a kind of notification too. i mean that only opening and closing ports should be done in the firewaller to keep the internal representation up-to-date	10:43
wrtp	i'm wondering if the environment global port changes should have a different entry point; e.g. Environ.OpenPorts	10:44
TheMue	wrtp: that sounds reasonable	10:44
fwereade	wrtp, I shouldn't probably be getting into this, but the everything-opens-the-union-of-ports-needed-in-the-whole-deployment approach really does still sound crazy to me	10:45
wrtp	fwereade: and the alternative is?	10:45
fwereade	wrtp, it feels no different in spirit from "meh, just open everything"	10:45
wrtp	fwereade: i think it's quite different	10:46
wrtp	fwereade: a small set of ports vs everything	10:46
wrtp	fwereade: most installations will only open a very small number of ports, i believe	10:46
fwereade	wrtp, still doesn't seem sane to me that service not-a-big-deal can open ports for very-important-db-service	10:47
wrtp	fwereade: ah, well if we're talkin' malicious services, you're probably right.	10:47
fwereade	wrtp, but there is clearly something I just Do Not Get here	10:48
TheMue	fwereade: maybe one global group has the same fault than one per machine. one per service could make more sense.	10:48
wrtp	fwereade: that something is that without this change, we cannot scale under ec2	10:48
fwereade	wrtp, haven't we known since day 1 that the FWing is crack, and the only sane solution is getting the MA to handle it?	10:48
fwereade	wrtp, well, yeah, because of the known-stupid approach	10:48
wrtp	fwereade: yeah, that would be much better	10:49
wrtp	fwereade: do we know how to do that?	10:49
fwereade	wrtp, I was under the impression that we have iptables everywhere... and the logic for knowing what ports should be open on a machine does already exist	10:50
wrtp	fwereade: i tend to agree that we're adding complexity for no particularly good reason, and it's complexity that we want to throw away as soon as possible	10:50
wrtp	fwereade: is iptables sufficient?	10:50
Aram	a trivial: https://codereview.appspot.com/6637050	10:51
wrtp	fwereade: after all, do we want to allow a dubious charm to manipulate a machine's iptables and thereby exposed ports that shouldn't be exposed	10:51
fwereade	wrtp, perhaps not -- I am no expert -- but I haven't heard anyone suggesting it isn't, and I've heard plenty of people suggesting it should	10:51
fwereade	wrtp, we have this tool called "open-port"	10:51
wrtp	Aram: LGTM	10:51
wrtp	fwereade: yes, but open-port does nothing if the service is not exposed.	10:51
fwereade	wrtp, right, and?	10:52
fwereade	wrtp, in theory, at least, units are containerised	10:52
fwereade	wrtp, and the MA will be responsible for the FWing	10:52
wrtp	fwereade: yeah.	10:52
fwereade	wrtp, I don't see how the situation here is any different from anything else the unit could or could not do	10:52
wrtp	fwereade: i think part of the difficulty is we don't know what we're doing just to make do, and what's going to be around for a while.	10:53
wrtp	fwereade: because currently, units are not containerised	10:53
wrtp	fwereade: with the current scheme, a unit can't expose itself without someone from outside explicitly deciding to do so.	10:54
wrtp	fwereade: i'm not saying that using iptables is a bad thing, just that it does change the security model slightly.	10:54
TheMue	wrtp, fwereade: how do different providers handle this? does openstack have security groups too?	10:55
wrtp	fwereade: i do think that the effort that's going into doing this global ports stuff (and adding complexity to the core that will take effort to remove later) we'd be better implementing an on-machine firewaller	10:55
fwereade	wrtp, yeah, I'm just reacting to a feeling that we're putting a lot of effort into a provider-specific solution that is kinda crap even for that provider	10:56
wrtp	TheMue: at least some other providers don't implement firewalling at all AFAIK	10:56
wrtp	fwereade: agreed	10:56
fwereade	wrtp, but, well, I have my own worries in other bits of the codebase :/	10:56
TheMue	wrtp: expected it, yes. iptables should work everywhere, don't they? how about lxc?	10:56
wrtp	fwereade: and things like tests for config.FirewallerMode spreading around the code make me unhappy	10:56
* fwereade slopes off for a pre-meeting ciggie, brb		10:57
TheMue	wrtp: yes, the topic is larger than thought in the beginning	10:58
Aram	TheMue: containers can each have a different network stack, so yes.	11:01
niemeyer	Yo	11:02
wrtp	niemeyer: yo!	11:02
Aram	hi	11:02
wrtp	niemeyer: G+?	11:02
wrtp	Aram: ^	11:02
niemeyer	davecheney: ping	11:26
wrtp	niemeyer: this was the CL i meant to propose last night but accidentally pushed it onto a previous branch instead: https://codereview.appspot.com/6639043/	11:53
fss	niemeyer: good morning	12:04
fss	niemeyer: is launchpad happier today?	12:05
niemeyer	fss: Haven't talked to it yet, but I'm hoping so :)	12:06
niemeyer	wrtp: Cheers, will check it	12:06
fss	niemeyer: nice, let's hope so :-)	12:20
niemeyer	Woohay broken pipe, literally	12:26
niemeyer	And it wasn't just any pipe.. it was a huge pipe, with wild pressure	12:56
niemeyer	OK: 42 passed	13:14
wrtp	niemeyer: this should make the authentication problem easier to deal with in tests: https://codereview.appspot.com/6643049	14:00
niemeyer	wrtp: That doesn't quite work..	14:03
wrtp	niemeyer: oh	14:03
wrtp	niemeyer: it seems to...	14:03
niemeyer	wrtp: Yeah, but just seems :)	14:03
wrtp	niemeyer: ok, so how is it broken?	14:03
niemeyer	wrtp: The login information is cached.. mgo manipulates connections by itself.. you've rendered the login information it is using invalid by killing the user	14:03
wrtp	niemeyer: the login information is sent with each request?	14:04
niemeyer	wrtp: No, it's properly handled internally	14:05
wrtp	niemeyer: FWIW if this logic is wrong, then i think the logic that i previously had in the bootstrap-state test might have been wrong too	14:05
wrtp	niemeyer: i'd like to understand the mgo auth model a little better	14:05
niemeyer	wrtp: If you were killing the user that juju logs in with, then yes, it's wrong	14:05
wrtp	niemeyer: so you can't authenticate and then delete the user?	14:06
niemeyer	wrtp: It's pretty straightforward.. you give it a user, and it uses it	14:06
wrtp	niemeyer: and a session is associated with a user?	14:06
niemeyer	wrtp: If you remove the user under it, it may blow up next time	14:06
wrtp	niemeyer: so, i'm trying to think of a way that i can set the db into authenticated mode, set an admin user, and still be able to set the db back into unauthenticated mode afterwards, without knowing what the admin user was set to.	14:07
wrtp	s/admin user was set/admin password was set/	14:08
wrtp	niemeyer: i think you're saying that a session becomes invalid if the user it was logged in as is removed. but removing the user is the only way to get the db to work without logging in.	14:09
wrtp	niemeyer: or rather, having no admin user is the way to do that	14:10
niemeyer	wrtp: There's nothing complex there.. don't remove the user you want to operate as..	14:11
niemeyer	wrtp: otherwise we'll see random failures when it attempts to authenticate a connection.. that's all	14:12
wrtp	niemeyer: because there's no a one-to-one correspondence between session and connection?	14:12
wrtp	s/no a/not a/	14:13
wrtp	niemeyer: i have to say i thought there was, but i think i understand better now.	14:13
niemeyer	wrtp: yeah, there's a good reason why it's called session rather than connection :)	14:14
niemeyer	wrtp: A session abstracts away the communication with the whole cluster	14:15
niemeyer	wrtp: The primary may shift and you may not notice	14:15
niemeyer	(you may notice as well, though, depending on what was in progress by then)	14:15
wrtp	niemeyer: so perhaps a better approach might be to set the admin password to "" rather than removing the user, and always attempt to log in even if the password is "".	14:15
niemeyer	wrtp: My suggestion is to keep evolving logic for a bit without refactoring	14:16
niemeyer	wrtp: Let's try to get this stuff working	14:16
wrtp	niemeyer: my reason for doing this is i couldn't work out a good way of writing a particular test in another branch	14:17
wrtp	niemeyer: i'll show you the test, one mo	14:17
wrtp	niemeyer: http://paste.ubuntu.com/1269312/	14:18
wrtp	niemeyer: it's in the juju package	14:18
wrtp	niemeyer: given that Environ.Bootstrap sets the admin password (as it should), how can i make the test revert to the previous admin-passwordless state if it fails half-way through?	14:19
wrtp	niemeyer: if the password is changed, that might invalidate the session too, right?	14:20
niemeyer	wrtp: Reset the database if the test fails, for example	14:20
wrtp	niemeyer: yeah	14:20
wrtp	niemeyer: you mean restart the mgo server?	14:21
wrtp	niemeyer: so i think that means that my defer st.SetAdminPassword("") lines in the state tests are bogus	14:21
wrtp	s/that/this/	14:22
niemeyer	wrtp: I mean that's one way of doing it	14:24
niemeyer	wrtp: Resetting the password can also work, as you've noticed	14:25
wrtp	niemeyer: how can we reset the database?	14:25
niemeyer	wrtp: ?	14:25
wrtp	niemeyer: we don't have an authenticated connection any more, so we can't manipulate the db to reset it or change the password	14:25
niemeyer	wrtp: It's our database.. our files.. our machine :)	14:27
wrtp	niemeyer: ok, so how do we do it? can i go underneath mgo and manipulate its files directly?	14:28
wrtp	niemeyer: i'm not sure how many abstractions i need to break here :)	14:29
wrtp	niemeyer: perhaps restarting mgo is a reasonable approach. if we get an auth failed error when trying to reset the db, we could kill the server and start a new one. this should never happen in the normal course of events, so it won't slow down tests.	14:33
niemeyer	wrtp: Right	14:35
wrtp	niemeyer: i think it feels right to make the db reset work regardless of what the test has done. the current SetAdminPassword defers are unnecessary (and wrong, as i think you've demonstrated)	14:37
wrtp	niemeyer: so are you ok with the above approach? (restarting mgo on auth failure)	14:38
niemeyer	wrtp: I think it's fine if the test passes	14:38
niemeyer	wrtp: No reason to slow down the test	14:38
wrtp	niemeyer: which test?	14:38
niemeyer	wrtp: The same one you're talking about	14:38
niemeyer	Woohay.. Launchpad responded after 1h trying to submit	14:39
wrtp	niemeyer: i agree there's no reason to slow down the test, hence my suggestion (restarting mgo on auth failure when resetting), which is what i'm checking you're ok with	14:39
wrtp	niemeyer: jeeze	14:39
niemeyer	fss: Answering your question, no, Launchpad isn't much happier today	14:39
niemeyer	wrtp: The point is that you need the defers on the success case	14:40
wrtp	niemeyer: ah, i see	14:40
fwereade	been on since 8ish, have run out of brain... might be back a bit later, but provisionally calling it a day	14:41
niemeyer	fwereade: Have a good time then, provisionally :-)	14:42
wrtp	niemeyer: ok, that makes sense, as the password is in a known state at the end of the test. it means we probably don't need to make it a defer either - we can just call SetAdminPassword at the end of the test, which makes the logic more straightforward.	14:43
niemeyer	wrtp: Right	14:45
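[editor's sketch] The reset strategy wrtp and niemeyer settle on above (reset the database between tests, and restart the server only when the reset fails with an auth error) could look roughly like the Go below. Everything in it is an assumption made for illustration: `testServer`, `errAuthFailed`, `resetForNextTest` and `fakeServer` are hypothetical names, not juju or mgo APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// errAuthFailed is a hypothetical sentinel standing in for whatever
// error the real driver reports when the session is not authorised.
var errAuthFailed = errors.New("auth failed")

// testServer is a hypothetical view of the test database server.
type testServer interface {
	Reset() error   // wipe test data over the existing connection
	Restart() error // kill the server process and start a fresh one
}

// resetForNextTest wipes the database. Only if the wipe is rejected
// because a test left an admin password behind does it fall back to
// the slow path of restarting the server.
func resetForNextTest(s testServer) error {
	err := s.Reset()
	if errors.Is(err, errAuthFailed) {
		return s.Restart()
	}
	return err
}

// fakeServer simulates a server so the sketch can run on its own.
type fakeServer struct {
	passwordSet bool
	restarts    int
}

func (f *fakeServer) Reset() error {
	if f.passwordSet {
		return errAuthFailed
	}
	return nil
}

func (f *fakeServer) Restart() error {
	f.restarts++
	f.passwordSet = false
	return nil
}

func main() {
	// A test that failed half-way through and left the password set.
	s := &fakeServer{passwordSet: true}
	if err := resetForNextTest(s); err != nil {
		fmt.Println("reset failed:", err)
		return
	}
	fmt.Println("restarts needed:", s.restarts)
}
```

In the normal case (a test that passes and clears the password itself at the end) the restart branch is never taken, so the fast path stays fast, which is the point niemeyer makes about not slowing down the test.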
* niemeyer => lunch		15:21
wrtp	fwereade: i just saw this uniter test failure: http://paste.ubuntu.com/1269425/	15:26
fwereade	wrtp, hum, that is relevant to my interests, would you make a bug please?	15:27
wrtp	fwereade: k	15:27
wrtp	fwereade: https://bugs.launchpad.net/juju-core/+bug/1064476	15:30
wrtp	niemeyer: PTAL https://codereview.appspot.com/6643049	15:33
fss	niemeyer: :-(	16:35
fss	niemeyer: sorry for the huge delay, I was out for lunch	16:36
niemeyer	fss: No worries	16:36
wrtp	niemeyer: passwords used for real: https://codereview.appspot.com/6632049	16:46
niemeyer	wrtp: Thanks	16:47
niemeyer	wrtp: I'll have to give the other branches some attention	16:47
wrtp	niemeyer: np	16:47
niemeyer	fwereade: ping	17:33
niemeyer	fwereade: Oh, sorry, you're in relax mode already	17:34
wrtp	i'm off for the evening.	17:35
niemeyer	wrtp: Cheers	17:35
wrtp	niemeyer, fwereade, Aram: night all	17:35
niemeyer	wrtp: Thanks, have a good night too	17:35
wrtp	niemeyer: will do, thanks	17:36
wrtp	niemeyer: and you	17:36
niemeyer	Aram: https://codereview.appspot.com/6595064/ reviewed	18:02
niemeyer	Aram: Sorry it took a day to get to it	18:02
fwereade	niemeyer, ping	19:36
niemeyer	fwereade: yo	19:36
fwereade	niemeyer, I guess that was actually a pong really :)	19:36
niemeyer	fwereade: Ah, don't worry then, it's all good	19:37
niemeyer	fwereade: I've been doing reviews, so you'll have some ideas to look at/branches to merge tomorrow	19:37
fwereade	niemeyer, yeah, I see the one you don't like	19:37
niemeyer	fwereade: It's not entirely that I don't like.. I think it's more about a previous debate having some evidence than about the branch content itself	19:38
niemeyer	fwereade: I think we should debate, but I wouldn't mind that the shift of convention was done at the tip, for example	19:39
fwereade	niemeyer, the trouble is, on a brief reading, I can't see any cases where the error return doesn't make the code more complex	19:39
niemeyer	fwereade: Interesting, I see exactly the opposite	19:39
niemeyer	fwereade: I see a different convention handling cases we're used to	19:40
fwereade	niemeyer, in every case, you seem to be asking me to switch `if x() { y() }` into `if a, err = y(); err != nil { if err != someSpecificError {return err} } else { a.b() }`	19:40
niemeyer	fwereade: and pretty much no case where it's not the good-old if err != foo { ... }	19:41
fwereade	niemeyer, well, in every case, I have to handle a nonsensical extra branch	19:42
niemeyer	fwereade: Is it 100% guaranteed that if we see a relation id in RelationIds, Relation(id) will necessarily work for it?	19:42
fwereade	niemeyer, because those methods actually would only return one error ever	19:42
fwereade	niemeyer, well, yes...	19:42
niemeyer	fwereade: Interesting. How can we guarantee it?	19:42
fwereade	niemeyer, it should not be in any way dependent on external state	19:42
fwereade	niemeyer, well, we need to embed meaning into the interface above and beyond that explicitly stated in the code	19:44
fwereade	niemeyer, in the same way that, say, sort.Strings() makes the guarantee that it won't launch nuclear missiles	19:44
fwereade	niemeyer, IYSWIM	19:44
fwereade	niemeyer, this is a straight replacement of a struct with an interface	19:44
fwereade	niemeyer, if it's ok to do `if X != ""`, why is it not ok to do `if X() != ""`?	19:45
niemeyer	fwereade: I thought the review was clear	19:46
niemeyer	fwereade: I'm simply showing evidence that something looks odd	19:47
niemeyer	fwereade: I'm not hand-waving that this is bad	19:47
niemeyer	fwereade: If you're doing Has+Get, Has+Get, Has+Get, Has+Get consistently, it seems to me that the interface is fragile.. because tomorrow non-William will come here and put Get, and blow it up	19:47
niemeyer	fwereade: I accepted to wait until later to see if we'd do that or not.. your branch does exactly that so far	19:48
niemeyer	fwereade: In pretty much all cases but one or two	19:48
niemeyer	fwereade: I'm still talking, though	19:49
niemeyer	fwereade: Rather than enforcing anything	19:49
fwereade	niemeyer, ok... perhaps I am taking it the wrong way	19:51
niemeyer	fwereade: What if we took away all of those Has methods and used methods that have a second (..., ok bool) result?	19:55
niemeyer	fwereade: Does that solve your concern?	19:55
fwereade	niemeyer, probably 99%, yes	19:56
niemeyer	fwereade: Cool, it solves mine as well	19:56
fwereade	niemeyer, it's the introduction of fake error paths that's mandated by the error return that bugs me	19:56
fwereade	niemeyer, ok, and that's fewer methods in the interface too, nice :)	19:56
niemeyer	fwereade: Because it forces both the consumer and the producer of that interface to acknowledge the fact that the data may not be available	19:56
fwereade	niemeyer, indeed -- it still feels somewhat heavyweight, tbh, but maybe by just the right amount considering the different expectations of an interface and a struct	19:58
niemeyer	fwereade: My feeling is that it's actually both less work and less code than the current implementation	20:12
niemeyer	fwereade: If you're happy to move that way, as I mentioned, I'm happy to have that done at the tip	20:13
fwereade	niemeyer, agreed :)	20:13
fwereade	niemeyer, but tbh if we're agreed on a direction I'm perfectly happy threading it through... feels cleaner	20:14
niemeyer	fwereade: Sounds good.. my LGTMes are still valid if you decide to refactor on the way	20:15
fwereade	niemeyer, cool, thanks	20:15
fwereade	niemeyer, depending on whether or not cath wakes up, I should be able to run this branch past you again pretty soon	20:15
* niemeyer sings for cath		20:16
fwereade	niemeyer, if I give them named return values, ie (r ContextRelation, ok bool), ISTM that that makes the convention pretty clear without explicit documentation... sane?	20:18
niemeyer	fwereade: ok is kind of ambiguous.. if we name it "found" I guess it'd be okay	20:19
niemeyer	fwereade: ambiguous in the language, I mean	20:20
niemeyer	fwereade: ok is of course meaningless in that regard :-)	20:20
fwereade	niemeyer, ok, cool -- I'm mainly asking because I can't find the right words for the doc comment :)	20:20
niemeyer	fwereade: appending "if it was found and whether it was found" to the end of the first sentence of those methods should do the deal, I think	20:21
* fwereade peers critically at the sentences... yeah, LGTM		20:22
fwereade	niemeyer, cheers	20:22
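[editor's sketch] The convention agreed above, replacing a Has+Get pair with a single method that returns a second `found bool` result, might look like the following Go. This is a minimal illustration under assumptions, not the code from the branch under review: `Context`, `relation` and the `Name` method are hypothetical; only the `(r ContextRelation, found bool)` result shape and the doc-comment wording come from the discussion.

```go
package main

import "fmt"

// ContextRelation is a hypothetical stand-in for the interface
// discussed above; the real one will have different methods.
type ContextRelation interface {
	Name() string
}

// relation is a trivial implementation used only by this sketch.
type relation struct{ name string }

func (r relation) Name() string { return r.name }

// Context is a hypothetical holder of relations keyed by id.
type Context struct {
	relations map[int]ContextRelation
}

// Relation returns the relation with the given id if it was found,
// and whether it was found.
func (c *Context) Relation(id int) (r ContextRelation, found bool) {
	r, found = c.relations[id]
	return r, found
}

func main() {
	ctx := &Context{relations: map[int]ContextRelation{1: relation{"db"}}}

	// The caller cannot get at r without also receiving found, so
	// there is no separate Has call to forget.
	if r, found := ctx.Relation(1); found {
		fmt.Println("relation 1:", r.Name())
	}
	if _, found := ctx.Relation(2); !found {
		fmt.Println("relation 2: not found")
	}
}
```

Compared with an error return, the absent case here is an ordinary boolean branch, so callers do not have to handle an error path that can only ever carry one value.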
fwereade	niemeyer, and, yeah, the code's way nicer too	20:40
fwereade	niemeyer, https://codereview.appspot.com/6633043 reproposed	20:42
niemeyer	fwereade: Woot	20:48
niemeyer	fwereade: LGTM, thank you	20:50
fwereade	niemeyer, cool, thanks	20:50
fwereade	niemeyer, sorry this bit was difficult... I had a surprisingly violent adverse reaction to the error returns over the weekend, though... I felt my code was made of lies, in some way, and it really bugged me :)	20:51
niemeyer	fwereade: No worries.. I think the end result is better than either of the original ideas we had	21:00
niemeyer	fwereade: So the brainstorming was worth it	21:00
fwereade	niemeyer, definitely :)	21:00
