[00:03] smoser, interesting.. coreos guys rewrote cloudinit in go..
[00:04] i hadn't seen that.
[00:06] smoser, its very limited subset and assumes coreos /systemd https://github.com/coreos/coreos-cloudinit
[00:07] its a bit much for them to call it cloudinit... its almost zero feature set overlap
[00:16] did anyone see fwereade after this am? (and when I say AM I mean GMT-3 AM)
[00:18] perrito666: its unusual to see him online at this time
[00:19] davecheney: I know, he just said that he was taking a plane and returning later and then I got disconnected
[00:26] perrito666: ok, you probably know more than i then
[00:27] heh tx davecheney
[00:38] hmm.. odd /bin/sh: 1: exec: /var/lib/juju/tools/unit-mysql-0/jujud: not found
[00:41] hazmat, looks like the last message in juju-ci-machine-0's log. Jujud just disappeared 2 weeks ago. Since that machine is the gateway into the ppc testing, we left it where it was
[00:42] thumper, I can hangout now
[00:42] sinzui, its odd its there.. the issue is deployer/simple.go
[00:43] it removes the symlink on failure, but afaics that method never failed, the last line is install the upstart job, and the job is present on disk.
[00:44] sinzui: just munching
[00:44] with you shortly
[00:44] * sinzui watches ci
[00:44] sinzui, ie its resolvable with sudo ln -s /var/lib/juju/tools/1.18.1-precise-amd64/ /var/lib/juju/tools/unit-owncloud-0
[00:45] hmm.. its as though the removeOnErr was firing
[00:45] even on success
[00:47] * sinzui nods
[00:49] sinzui: https://plus.google.com/hangouts/_/76cpik697jvk5a93b3md4vcuc8?hl=en
[00:50] wallyworld, jam: looks like all the upgrade tests are indeed fixed. I disabled the local-upgrade test for thumper. I will retest when I have the time or when the next rev lands
[00:50] \o/
[00:50] sinzui: do local upgrade and local deploy run on the same machine?
[00:50] sinzui: can't hear you
[00:50] sinzui: so if thumper actually pulls his finger out, we could release 1.19.0 real soon now?
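Aside on the 00:43-00:45 removeOnErr exchange: a minimal sketch of how a "remove the symlink only if a later step fails" helper is commonly written in Go, using a named return value that a deferred function inspects. This is not the code in deployer/simple.go; installLink, linkExists and demo are hypothetical names, and the paths are throwaway temp directories.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// installLink (hypothetical) creates a symlink and then runs the remaining
// install steps. The deferred cleanup removes the link only when the named
// return value err is non-nil at the time the function returns.
func installLink(target, link string, rest func() error) (err error) {
	if err = os.Symlink(target, link); err != nil {
		return err
	}
	defer func() {
		if err != nil {
			os.Remove(link) // best-effort cleanup, failure path only
		}
	}()
	return rest()
}

// linkExists reports whether something exists at path (without following it).
func linkExists(path string) bool {
	_, err := os.Lstat(path)
	return err == nil
}

// demo exercises both paths in a temp dir and reports what happened.
func demo() (keptOnSuccess, removedOnFailure bool, err error) {
	dir, err := os.MkdirTemp("", "tools-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "1.18.1-precise-amd64")
	if err := os.Mkdir(target, 0o755); err != nil {
		return false, false, err
	}
	okLink := filepath.Join(dir, "unit-ok-0")
	badLink := filepath.Join(dir, "unit-bad-0")

	if err := installLink(target, okLink, func() error { return nil }); err != nil {
		return false, false, err
	}
	// The second install fails on purpose, so its link should be cleaned up.
	_ = installLink(target, badLink, func() error { return errors.New("later step failed") })
	return linkExists(okLink), !linkExists(badLink), nil
}

func main() {
	kept, removed, err := demo()
	fmt.Println("kept on success:", kept, "removed on failure:", removed, "err:", err)
}
```

With this shape the cleanup cannot fire on success, because the deferred function reads the same err that the return statement sets. Cleanup that appears to fire on success is usually worth checking for a shadowed err or a different cause entirely.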
[00:53] deployer worker is a bit strange .. does it use a tombstone to communicate back to the runner?
[00:56] thumper, when you have a moment i'd like to chat as well..
[00:56] hazmat: ack
[00:59] hazmat: the deployer worker is similar to most others, it is created by machine agent but wrapping it inside a worker.NewSimpleWorker
[01:00] wallyworld, ah. thanks
[01:01] np. that worker stuff still confuses me each time i have to re-read the code
[01:04] the pattern is a bit different
[01:04] trying to figure out why i'd get 2014-04-15 00:00:42 INFO juju runner.go:262 worker: start "1-container-watcher" .. when there are no containers.. basically my manual provider + lxc seems a bit busted with 1.18
[01:04] also trying to figure out if on a simpleworker erroring, if the runner will just ignore it and move on.
[01:04] with no log
[01:05] the nutshell being deploy workloads gets that jujud not found
[01:05] * hazmat instruments
[01:08] hazmat: whazzup?
[01:11] thumper, trying to debug 1.18 with lxc + manual
[01:11] thumper, mostly in the backlog
[01:11] Wow.
[01:13] abentley replaced the mysql + wordpress charms with dummy charms that instrument and report what juju is up to. They have taken 2-4 minutes off of all the tests
[01:13] Azure deploy in under 20 minutes
[01:14] AWS is almost as fast as HP Cloud
[01:16] sinzui: \o/
[01:17] wallyworld: should I patch envtools.BundleTools in a test suite e.g. coretesting? Or should I copy the mocked function to each package that is failing and patch there?
[01:18] wallyworld: it's just there seem to be a lot of tests that are all affected/fixed by this patch
[01:18] use s.PatchValue
[01:18] wallyworld: yep I am
[01:18] but should I do it in a more generic suite?
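Aside on the "use s.PatchValue" suggestion at 01:18: a minimal sketch of the general patch-and-restore idea for a package-level function variable. It does not use the juju testing suites; bundleTools and patchBundleTools are hypothetical stand-ins, and the real envtools.BundleTools signature is not reproduced here.

```go
package main

import "fmt"

// bundleTools stands in (hypothetically) for a package-level function
// variable that production code calls and tests want to replace.
var bundleTools = func() string { return "real" }

// patchBundleTools swaps in a fake and returns a function that restores the
// previous value. A suite would call it in SetUpTest and run restore in
// TearDownTest so that a patch never leaks into the next test.
func patchBundleTools(fake func() string) (restore func()) {
	old := bundleTools
	bundleTools = fake
	return func() { bundleTools = old }
}

func main() {
	restore := patchBundleTools(func() string { return "fake" })
	fmt.Println("while patched:", bundleTools())
	restore()
	fmt.Println("after restore:", bundleTools())
}
```

Doing the patch once per suite in SetUpTest keeps the one-liner out of every individual test, which matches the trade-off discussed in the next few messages.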
[01:18] so if the failures are clustered in a particular suite, you can use that in SetUpTest
[01:19] not sure it's worth doing a fixture for a one liner
[01:19] wallyworld: that is what I'm doing now, but already I've done that in about 4 packages, with more to go
[01:19] wallyworld: oh okay, you mean just patch in each individual test?
[01:20] possibly, depends on where the failures are
[01:20] okay, I'll do it the verbose way and we can cleanup in review if needed
[01:20] but if the failures are in a manageable number of suites, doing the patch in setuptest makes sense
[01:21] okay
[01:28] what the actual fuck!
[01:33] wallyworld, CI hates the unit-tests on precise. Have you seen these tests fail consistently in pains before? http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/run-unit-tests-amd64-precise/617/console
[01:34] ^ The last three runs on different precise instances have the same failure
[01:34] sinzui: I have some binaries copying to the machine
[01:35] sinzui: i haven't seen those. and one of them, TestOpenStateWithoutAdmin, is the test added in the branch i landed for john to make upgrades work
[01:35] thank you thumper.
[01:35] so it seems there's a mongo/precise issue
[01:36] thumper: were you running some tests in a precise vm?
[01:37] wallyworld: I have a real life precise machine
[01:37] wallyworld: that it works fine
[01:37] on
[01:37] I've hooked up loggo to the mgo internals logging
[01:37] so we can get internal mongo logging out of the bootstrap command
[01:37] uploading some binaries now
[01:37] hmm. so what's different on jenkins then to cause the tests to fail
[01:37] not sure
[01:38] same version of mongo
[01:38] my desktop is i386
[01:38] ci is amd64
[01:38] that is all I can come up with so far
[01:38] if that is the cause then we're doomed
[01:38] :-)
[01:38] FSVO doomed
[01:38] yeah :-)
[01:39] the error is that something inside mgo is explicitly closing the socket
[01:39] when we ask to set up the replica set
[01:39] thumper: so, one thing it could be - HA added an admin db
[01:39] hence the desire for more logging
[01:39] wallyworld: my binaries work locally
[01:39] and copying up
[01:39] if that is the case
[01:39] and my binaries work
[01:39] and the recently added test which i reference above tests that we can ignore unauth access to that db
[01:39] it could be that
[01:40] * thumper nods
[01:40] and that test fails
[01:40] still copying that file
[01:40] * thumper waits...
[01:40] * wallyworld waits too....
[01:40] and here I was wanting to sleep
[01:40] not feeling too flash
[01:40] :-(
[01:41] thumper, sinzui fwiw. my issue was user error around series. i have trusty containers but had registered them as precise, machine agent deployed fine, unit agents didn't like it though. unsupported usage mode.
[01:41] haha
[01:41] thumper, conceivably the same happens when you dist-upgrade a machine
[01:41] thumper, wallyworld: the machines that run the unit tests are amd64 m1.larges for precise and trusty. 95% of users deploy to amd64
[01:41] hmm...
[01:42] sinzui: right...
[01:42] we saw numbers that showed a very small number were i386, we assume those are clients, not services
[01:42] * thumper nods
[01:42] wallyworld: can I get you to try the aws reproduction?
[01:42] wallyworld: are you busy with anything else?
[01:43] i am but i can
[01:43] what's up with aws?
[01:43] just trying to replicate the issues that we are seeing on CI with the local provider not bootstrapping
[01:44] it works on trusty for me
[01:44] and precise/i386
[01:44] but we should check real precise amd64
[01:44] ok, so you want to spin up an aws precise amd64 and try there
[01:45] right
[01:45] okey dokey
[01:45] install juju / juju-local
[01:45] yarp
[01:45] probably need to copy local 1.19 binaries
[01:45] to avoid building on aws
[01:45] right
[01:51] ugh...
[01:51] man I'm confused
[01:57] wallyworld: sinzui: using my extra logging http://paste.ubuntu.com/7253010/
[01:57] so not a recent fix issue
[01:58] thumper: we should just disable the replica set stuff
[01:58] it has broken so much
[01:58] perhaps worth doing for the local provider at least
[01:58] we are never going to want HA on local
[01:58] it makes no sense
[01:59] closed explicitly? That's like the computer says no
[01:59] sinzui: ack
[02:01] * thumper has a call now
[02:06] axw, Is there any more I should say about azure availability sets? https://docs.google.com/a/canonical.com/document/d/1BXYrLC78H3H9Cv4e_4XMcZ3mAkTcp6nx4v1wdN650jw/edit
[02:12] sinzui: otp
[02:18] thumper: sinzui: i'm going to test this patch to disable the mongo replicaset setup for local provider https://pastebin.canonical.com/108522/
[02:19] this should revert local bootstrap to be closer to how it was prior to HA stuff being added
[02:19] and hence it should remove the error in thumper's log above hopefully
[02:20] sinzui: can I have permissions to add comments?
[02:21] sinzui: this line is a bit suspect 2014-04-15 02:20:44 DEBUG mgo server.go:297 Ping for 127.0.0.1:37019 is 15000 ms
[02:21] sinzui: locally I have 0ms
[02:22] sorry axw I gave all canonical write access as I intended
[02:22] sinzui: ta
[02:22] * sinzui looks in /etc
[02:23] sinzui: availability-sets-enabled=true by default; I'll update the notes
[02:27] wallyworld: that patch is wrong
[02:27] i know
[02:27] found that out
[02:27] doing it differently
[02:28] wallyworld: jujud/bootstrap.go line 165, return there if local
[02:28] yep
[02:37] sinzui: I updated the azure section, would you mind reading over it to see if it makes sense to you?
[02:43] Thank you axw. Looks great
[02:44] sinzui: wallyworld, axw: bootstrap failure with debug mgo logs: http://paste.ubuntu.com/7253155/
[02:44] sinzui: I don't know enough to be able to interpret the errors
[02:44] sinzui: perhaps we need gustavo for it
[02:44] thanks for playing thumper
[02:45] sinzui: can you re-enable local provider tests in CI? i will do a branch to try and fix it and then when landed CI can tell us if it works
[02:45] sinzui: I'm done with the machine now
[02:45] I will re-enable the tests
[02:46] thanks
[02:46] let's see if the next branch i land works
[02:47] thumper, wallyworld . I think you had decided to disable HA on local...and how would I do HA with local...Does that other machine get proper access to my local machine that probably has died with me at the keyboard
[02:47] sinzui: you wouldn't do HA with the local provider
[02:47] :)
[02:48] sinzui: we are trying to set up replicaset and other stuff which is just failing with local and for 1.19 at least, i can't see why we would want that
[02:48] :)
[02:48] so to get 1.19 out, we can disable and think about it later
[02:49] wallyworld, really, I don't think we ever need to offer HA for local provider.
[02:49] maybe for testing
[02:49] but i agree with you
[02:49] i was being cautious in case others were attached to the idea
[03:45] axw: this should make local provider happy again on trunk https://codereview.appspot.com/87830044
[03:57] wallyworld: was afk, looking now
[04:01] ta
[04:05] wallyworld: reviewed
[04:05] ta
[04:05] axw: everyone hates that we use local provider checks in jujud
[04:06] been a todo for a while to fix
[04:06] yeah, I kind of wish we didn't have to disable replicasets at all though
[04:06] I know they're not needed, but if they just worked it would be nice to not have a separate code path
[04:07] axw: yeah. we could for 1.19.1, but we need 1.19 out the door and HA still isn't quite ready anyway
[04:08] it is indeed a bandaid. nate added another last week also
[04:09] wallyworld: yep, understood
[04:09] makes me sad too though
[05:12] wallyworld, Your hack solved local. The last probable issue is the broken unit tests for precise. I reported bug 1307836
[05:12] <_mup_> Bug #1307836: Ci unititests fail on precise
[05:13] sinzui: yeah, i just saw that but didn't think you'd be awake
[05:13] I don't want to be awake
[05:14] i didn't realise we still had the precise issue :-(
[05:14] i'll look at the logs
[05:14] hopefully we'll have some good news when you wake up
[05:22] wallyworld, azure-upgrade hasn't passed yet. It may not because azure is unwell this hour. We don't need to worry about a failure for azure. I can ask for a retest when the cloud is better
[05:23] righto
[05:23] * sinzui finds pillow
[05:23] good night
[05:32] PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
[05:32] 7718 ubuntu 20 0 2513408 1.564g 25152 S 45.2 19.6 2:41.51 juju.test
[05:32] memory usage for Go tests is out of control
[05:47] jam1: you online?
[05:47] morning wallyworld
[05:47] I am
[05:47] g'day
[05:47] jam1: so with your branch, and one i did, CI is happy for upgrades
[05:47] but
[05:47] a couple of tests fail under precise
[05:48] there's the one you added for your branch, plus TestInitializeStateFailsSecondTime
[05:48] wallyworld: links to failing tests ?
[05:48] the error says that a connection to mongo is unauth
[05:48] http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/run-unit-tests-amd64-precise/621/consoleFull
[05:48] wallyworld: and are you able to see the local provider fail with replica set stuff, because neither Tim or I could reproduce it.
[05:49] yeah, i saw it
[05:49] and fixed
[05:49] i had to disable HA for local provider
[05:49] and while we don't have to have replica set local, I'd prefer consistency and the ability to test out HA locally if we could
[05:49] sure
[05:49] but to get 1.19 out the door i went for a quick fix
[05:49] which we can revisit in 1.19.1
[05:50] curtis was ok with that
[05:50] wallyworld: so I certainly had a WTF about why I was able to create a machine in "admin" but not able to delete it without logging in as the admin I just created.
[05:50] wallyworld: so it seems like some versions of Mongo don't have that security hole
[05:50] but I can't figure out how to log in as an actual admin, but I can try digging into the TestInitialize stuff a bit more for my test.
[05:51] so we are using a different mongo on precise vs trusty?
[05:51] wallyworld: 2.4.6 vs 2.4.9
[05:51] ok, i didn't realise that
[05:51] Trusty is the one that lets you do WTF stuff.
[05:51] :-(
[05:51] there are 2 failing tests
[05:51] maybe more, i seem to recall previous logs showing more
[05:52] but the latest run had 2 failures only
[05:52] the other one was TestInitializeStateFailsSecondTime
[05:53] jam1: i gotta run to an appointment soon, but will check back when i return. if we can get this sorted, we can at least release 1.19.0 asap and deal with the workarounds for 1.19.1
[05:54] wallyworld: is your code landed?
[05:54] yep
[05:54] k
[05:54] happy to revert it if we can find a fix
[05:54] I'll pick it up
[05:55] thanks, i can look also but only found out about precise tests just before and sadly i gotta duck out
[05:56] hmm... LP failing to load for me right now
[05:56] wallyworld: Ci is running an ancient version of mongo
[05:56] that won't help
[06:02] davecheney: sinzui: I would think we should run mongo 2.4.6 which is the one you get from the cloud-archive:tools
[06:04] jam1: agreed
[06:07] davecheney: are they running 2.2.4 from the PPA?
[06:09] jam1: good point, 2.0 was all that shipped in precise
[06:09] I'm just trying to find a way to reproduce, and I thought there was a 2.4.0 out there for a while, but I can't find it
[06:10] and it isn't clear *what* version they are running.
[06:10] jam1: Get:40 http://ppa.launchpad.net/juju/stable/ubuntu/ precise/main mongodb-clients amd64 1:2.2.4-0ubuntu1~ubuntu12.04.1~juju1 [20.1 MB]
[06:10] Get:41 http://ppa.launchpad.net/juju/stable/ubuntu/ precise/main mongodb-server amd64 1:2.2.4-0ubuntu1~ubuntu12.04.1~juju1 [5,135 kB]
[06:10] this is our fault
[06:10] remember that old ppa
[06:10] yep, thanks for pointing me to it
[06:10] well, I can at least test with it.
[06:10] so, that isn't the cloud archive
[06:10] :emoji concerned face
[06:10] At one point we probably wanted to maintain compat with 2.2.4, but I'm not *as* concerned with it anymore.
[06:12] 2.2.4 never shipped in any main archive
[06:12] i don't think we have a duty of compatibility
[06:12] https://bugs.launchpad.net/juju-core/+bug/1307289/comments/1
[06:12] if anyone cares
[06:12] btw, go test ./cmd/juju{,d}
[06:12] takes an age because the test setup is constantly recompiling the tools
[06:27] why are the cmd/juju tests calling out to bzr ?
[06:33] FAIL: publish_test.go:75: PublishSuite.SetUpTest
[06:33] publish_test.go:86:
[06:33] c.Assert(err, gc.IsNil)
[06:33] ... value *errors.errorString = &errors.errorString{s:"error running \"bzr init\": exec: \"bzr\": executable file not found in $PATH"} ("error running \"bzr init\": exec:
[06:33] \"bzr\": executable file not found in $PATH")
[06:33] what is this shit ?
[06:44] mornin' all
[06:47] https://bugs.launchpad.net/juju-core/+bug/1307865
[06:47] this seems like an obvious failure
[06:47] why does it only happen sporadically ?
[06:47] davecheney: that's been the case for over a year (tests running bzr)
[06:48] rogpeppe: fair enough
[06:48] davecheney: i agree, that does seem odd
[06:50] rogpeppe: do we have thoughts on how we would have a Provider work that didn't have storage? I know we don't particularly prefer the HTTP Storage stuff that we have.
[06:51] jam1: we'd need to provide something to the provider that enabled it to fetch tools from the mongo-based storage
[06:51] rogpeppe: so we'd have to do away with "provider-state" file as well, right?
[06:51] jam1: other than that, i don't think providers rely much on storage, do they?
[06:51] rogpeppe: we use it for charms
[06:52] jam1: so... provider-state is *supposed* to be an implementation detail of a given provider
[06:52] sure
[06:52] it is in the "common code" path, but you wouldn't have to use it/could make that part optional
[06:53] jam1: we don't really rely on it much these days
[06:53] rogpeppe: we'd want bootstrap to cache the API creds and then we rely on it very little
[06:53] you'd lose the fallback path
[06:54] jam1: yeah, and we don't want to lose that entirely
[06:54] jam1: for a provider-state replacement, i'd like to see the fallback path factored out of the providers entirely
[06:54] well, it only works because there is a "known location" we can look in that is reasonably reliable. If a cloud doesn't provide its own storage, then any other location is just guesswork
[06:55] anyway, switching machines now
[06:55] jam1: ok
[06:56] axw: looking at http://paste.ubuntu.com/7252280/, in the first status machines 3 and 4 are up AFAICS.
[06:57] axw: and that's the status that i am presuming that ensure-availability was acting on
[06:57] rogpeppe: in the first one, yes, but how do you know when they went down?
[06:58] rogpeppe: my point was it could have changed since you did "juju status"
[06:58] axw: there was a very short time between the first status and calling ensure-availability. i don't see any particular reason for it to have gone down in that time period, although of course i can't be absolutely sure
[06:58] right, that's why I asked about the log. I'm really only guessing
[06:59] axw: luckily i still have all the machines up, so i can check the log
[06:59] rogpeppe: I see no reason why the agent would have gone down after calling ensure-availability either
[06:59] cool
[07:00] axw: it would necessarily go down after calling ensure-availability, because mongo reconfigures itself and agents get thrown out
[07:01] rogpeppe: for *all* machines? not just the shunned ones?
[07:01] axw: yeah
[07:01] axw: we could really do with some logging in ensure-availability to give us some insight into why it's making the decisions it is
[07:02] yeah, fair enough
=== vladk|offline is now known as vladk
[07:16] axw: here's the relevant log: http://paste.ubuntu.com/7252375/
[07:17] axw: the relevant EnsureAvailability call is the second one, i think
[07:17] axw: it's surprising that the connection goes down so quickly after that call
[07:17] rogpeppe: wrong pastebin?
[07:18] axw: ha, yes: http://paste.ubuntu.com/7253848/
[07:38] rogpeppe: machine-3's API workers have dialled to machine-0's API server ...
[07:38] rogpeppe: not saying that's the cause, but it's strange I think
[07:39] axw: that's not ideal, but it's understandable
[07:39] axw: one change i want to make is to make every environ manager machine dial the API server only on its own machine
[07:39] yep
[07:42] axw: rogpeppe: right, we originally only wrote "localhost" into the agent.conf. I think the bug is that the connection caching logic is overwriting that ?
[07:42] jam: yeah - each agent watches the api addresses and caches them
[07:43] rogpeppe: I thought when we spec'd the work we were going to explicitly skip overwriting when the agents were "localhost"
[07:43] jam: but also, the first API address received by a new agent is not going to be localhost
[07:43] rogpeppe: well, the thing that monitors it could just do if self.IsMaster() => localhost
[07:44] jam: i don't remember that explicitly
[07:44] or not run the address poller if IsMaster
[07:44] sorry
[07:44] IsManager
[07:44] not Master
[07:44] jam: i don't think it's IsMaster - i think it's is-environ-manager
[07:44] jam: right
[07:45] jam: i've been thinking about whether to run the address poller if we're an environ manager
[07:45] s/poller/watcher/
[07:45] jam: my general feeling is that it is probably worth it anyway
[07:46] jam: because machines can lose their environment manager status
[07:46] jam: even though we don't fully support that yet
[07:47] rogpeppe: won't they get bounced under that circumstance?
[07:47] anyway, we can either simplify it by what we write in agent.conf, or we could detect that we are IsManager and if so force localhost at api.Open time.
[07:48] jam: they'll get bounced, but if they do we want them to know where the other API hosts are
[07:48] jam: i was thinking of going for your latter option above
[07:51] rogpeppe: I can't really see much from the logs, I'm afraid. there is one interesting thing: "dialled mongo successfully" just after FullStatus and before EnsureAvailability
[07:51] axw: i couldn't glean much from them either
[07:51] axw: i'm just doing a branch that adds some logging to EnsureAvailability
[07:52] axw: then i'll try the live tests again to see if i can see what's going on
[07:55] rogpeppe: any idea why agent-state shows up as "down" just after I bootstrap? should FullStatus be forcing a resynchronisation of state?
[07:55] axw: i think it's because the presence data hasn't caught up
[07:55] rogpeppe: oh. I wonder if that's it? FullStatus may be reporting wrong agent state in your test too
[07:55] axw: we should definitely look into that
[07:56] axw: i think that FullStatus probably sees the same agent state that the ensure availability function is seeing
[07:56] rogpeppe: yeah, true
[08:09] rogpeppe: https://codereview.appspot.com/88030043
[08:09] axw: nice one! looking.
[08:10] jam: I've reverted your change from last night that eats admin login errors; this CL adds machine-0 to the admin db if it isn't there already
[08:11] axw: any chance that we could get the port from mongo rather than passing it in?
[08:11] rogpeppe: this is just the bare minimum, will follow up with maybeInitiateMongoServer, etc.
[08:11] jam: can do, but it requires parsing and I thought it may as well get passed in since it's already known to the caller
[08:12] axw: well we can have mongo start on port "0" and dynamically allocate, rather than our current ugly hack of allocating a port, and then closing it and hoping we don't race.
[08:12] jam: I assume you are referring to the EnsureAdminUserParams.Port field
[08:12] axw: if it is clumsy to parse, then we can pass it in.
[08:12] oh I see what you mean
[08:13] umm. dunno. I will take a look
[08:13] we *can* just start on port 37017, but that means other goroutines will also think that mongo is up, and for noauth stuff, we really want as little as possible to connect to it.
[08:14] axw: I always get thrown off by "upstart.NewService" because *no we don't want to create a new upstart service*
[08:14] but that is just "create a new memory representation of an upstart service"
[08:15] jam: heh yeah, it is a misleading name
[08:15] axw: I'm not sure why upstart specifically throws me off.
[08:15] as I certainly know the pattern.
[08:16] axw: can "defer cmd.Process.Kill()" do bad things if the process has already died ?
[08:16] axw: is it possible to do EnsureAdminUser as an upgrade step rather than doing it on all boots?
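Aside on the "allocating a port, and then closing it and hoping we don't race" hack mentioned at 08:12: a minimal sketch of that general technique using the standard net package. pickFreePort is a hypothetical name, and this is not the juju helper itself.

```go
package main

import (
	"fmt"
	"net"
)

// pickFreePort (hypothetical) asks the kernel for an unused TCP port on the
// loopback interface by listening on port 0, then closes the listener and
// returns the port number. Nothing reserves the port after Close, so another
// process can grab it before the caller starts its own server on it; that
// window is the race being complained about.
func pickFreePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := pickFreePort()
	if err != nil {
		fmt.Println("could not pick a port:", err)
		return
	}
	fmt.Println("picked port", port, "(not reserved once the listener is closed)")
}
```

Having the server itself bind to port 0 and report the port it got would avoid the window, but that depends on the server supporting it.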
[08:17] jam: if the pid got reused very quickly, yes I think so
[08:17] axw: I'm not particularly worried about PID reuse that fast
[08:17] jam: not really feasible as an upgrade step, as they require an API connection
[08:17] I'm more wondering about a panic because the PID didn't exist
[08:17] then there's all sorts of horrible interactions with workers dying and restarting all the others, etc.
[08:19] jam: I'm pretty certain it's safe, but I'll double check
[08:19] jam: hi, any update on the precise tests failures?
[08:22] jam: late Kill does not cause a panic
[08:23] wallyworld: they pass with mongo 2.4.6 from cloud-archive:tools, they fail with 2.2.4 from ppa:juju/stable
[08:23] on all machines that matter we use cloud-archive:tools
[08:23] wallyworld: so CI should be using that one
[08:23] great, so we can look to release 1.19
[08:23] wallyworld: and axw has a patch that replaces my change anyway.
[08:23] wallyworld: the replicaset failure isn't one that I could reproduce...
[08:24] since it is flaky
[08:24] hmmm. i hate those
[08:24] CI could reproduce it
[08:24] wallyworld: it is *possible* we just need to wait longer, but I hate those as well :)
[08:24] jam: this is what happens if you try to use "--port 0" in mongod: http://paste.ubuntu.com/7254007/
[08:25] axw: bleh.... ok
[08:25] I don't think we want to use the "default mongo port of 27017" so we might as well use our own since we know we just stopped the machine
[08:25] stopped the service
[08:33] axw: reviewed
[08:34] thanks
[08:37] jam: using info.StatePort seems right to me (at least in production).
[08:38] rogpeppe: for "bring this up in single user mode so we can poke at secrets and then restart it" I'd prefer it was more hidden than that, but I can live with StatePort being good-enough.
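Aside on the 08:16-08:22 question about a late "defer cmd.Process.Kill()": a minimal sketch, assuming a Unix-like system with sh on the PATH, of what happens when Kill is called on a process that has already exited and been waited for. killAfterExit is a hypothetical name; the command is a throwaway, not mongod.

```go
package main

import (
	"fmt"
	"os/exec"
)

// killAfterExit (hypothetical) runs a short-lived command to completion and
// then calls Kill on its process. Run waits for the process, so by the time
// Kill is called the os package already knows the process has finished.
// Kill reports that as an ordinary error value rather than panicking.
func killAfterExit() (killErr, runErr error) {
	cmd := exec.Command("sh", "-c", "exit 0")
	if runErr = cmd.Run(); runErr != nil {
		return nil, runErr
	}
	return cmd.Process.Kill(), nil
}

func main() {
	killErr, runErr := killAfterExit()
	fmt.Println("run error:", runErr)
	fmt.Println("late Kill error:", killErr)
}
```

A deferred Kill therefore only needs its error ignored. The PID-reuse concern raised at 08:17 is a separate matter that this sketch does not exercise.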
[08:39] jam: if there's someone sitting on localhost waiting for the fraction of a second during which we add the admin user, i think the user is probably not going to be happy anyway
[08:40] jam: note that the vulnerability is *only* to processes running on the local machine
[08:40] jam: and if there are untrusted processes running on the bootstrap machine, they're in trouble anyway
[08:41] rogpeppe: I'm actually more worried about the other goroutines in the existing process waking up, connecting, thinking to do work, and then getting shut down again.
[08:42] rogpeppe: more from a cleanliness than a "omg we broke security" perspective
[08:42] jam: what goroutines would those be?
[08:43] rogpeppe: so this is more about "lets not force ourselves to think critically about everything we are doing and be extra careful that we never run something we thought we weren't". Vs "just don't expose something we don't want exposed so we can trust nothing can be connected to it."
[08:44] jam: AFAIK there are only two goroutines that connect to the state - the StateWorker (which we're in, and which hasn't started anything yet) and the upgrader (which requires an API connection, which we can't have yet because the StateWorker hasn't come up yet).
[08:45] jam: even if we *are* allowed to connect to the mongo, i don't think we can do anything nasty accidentally
[08:46]