waigani | davecheney: I'm new, as you know, so I'm asking to understand | 00:00 |
---|---|---|
davecheney | sure | 00:00 |
davecheney | basically the rule is | 00:00 |
davecheney | if a function/method returns an error | 00:00 |
davecheney | you generally cannot make any assertions about the state of any other values it returns or the instance itself | 00:01 |
davecheney | *generally* | 00:01 |
davecheney | there are exceptions | 00:01 |
davecheney | so, there is no point in cleaning up the internal state of that instance as it is broke | 00:01 |
davecheney | there is value in cleaning up the files we poo'd on disk | 00:02 |
davecheney | because that is something we can do to make it possible to run the test again later | 00:02 |
waigani | ah okay | 00:02 |
davecheney | waigani: this ticket, https://bugs.launchpad.net/juju-core/+bug/1299969 | 00:13 |
_mup_ | Bug #1299969: launchpad.net/juju-core/provider/manual: ssh tests are not properly isolated <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1299969> | 00:13 |
davecheney | is a blocker on getting the common and manual provider tests working | 00:14 |
davecheney | do you think you could add it to your todo list ? | 00:14 |
waigani | davecheney: sure. I actually hit that problem and thought I was being stupid because I did not have keys on the vm | 00:15 |
davecheney | waigani: right, but the solution is not to add keys to the vm | 00:16 |
davecheney | that will just push the problem off onto someone else's plate | 00:17 |
waigani | davecheney: yeah I get that now :) | 00:17 |
davecheney | the goal is we want to run these tests during the build on the lp build servers | 00:17 |
davecheney | which are scrupulously clean | 00:17 |
waigani | right | 00:17 |
davecheney | so any screwing with the environment beforehand will have to be expunged | 00:17 |
waigani | davecheney: provider/local is now passing :) proposing my branch now | 00:28 |
perrito666 | sinzui: nite, question, am I supposed/allowed to close the bug and also, could you open the new one with the new bug info? (such as the exit of jenkins) | 00:32 |
davecheney | waigani: sweet | 00:32 |
sinzui | perrito666, I will do that | 00:48 |
davecheney | thumper: waigani wallyworld http://paste.ubuntu.com/7219526/ | 00:50 |
davecheney | i'd like to log a bug about this | 00:50 |
sinzui | perrito666, The bug is fix committed. It cannot be closed until the milestone is released/a package is distributed to the users. | 00:51 |
davecheney | i see far too many of these errors in the log when working with the local provider | 00:51 |
davecheney | i think it's more than a harmless warning | 00:51 |
* sinzui reports the new bug | 00:51 | |
thumper | davecheney: which bit exactly? | 00:51 |
perrito666 | sinzui: tx | 00:51 |
wallyworld | yeah, doesn't seem right | 00:51 |
davecheney | thumper: 2014-04-08 00:35:07 DEBUG juju.worker.logger logger.go:45 reconfiguring logging from "<root>=DEBUG" to "<root>=WARNING;unit=DEBUG" | 00:52 |
davecheney | ^ what generates this line | 00:52 |
thumper | davecheney: the logging config worker | 00:52 |
thumper | davecheney: that is the default logging level | 00:53 |
davecheney | thumper: http://paste.ubuntu.com/7219530/ | 00:53 |
thumper | that is the logging level specified in state | 00:53 |
davecheney | thumper: this is the repro | 00:53 |
davecheney | ok | 00:53 |
davecheney | /dev/sda 9.9G 7.6G 1.9G 81% / | 00:53 |
davecheney | 10gb vm isn't really enough to debug on ... | 00:53 |
* davecheney waves to axw | 00:55 | |
davecheney | welcome to daylights saving | 00:55 |
thumper | davecheney: do you have a question for me? | 00:55 |
axw | davecheney: howdy | 00:55 |
axw | no daylight savings over here | 00:56 |
davecheney | thumper: how can I set the logging back to debug everywhere ? | 00:56 |
davecheney | i'm trying to get more details on issue 1303787 | 00:57 |
thumper | davecheney: "juju bootstrap --debug" or "juju bootstrap --logging-config='juju=debug'" | 00:57 |
thumper | davecheney: or... if you want more | 00:57 |
davecheney | thumper: can't change it after the fact ? | 00:57 |
thumper | davecheney: yes | 00:57 |
davecheney | config-set ? | 00:58 |
thumper | juju set-env logging-config=juju=debug | 00:58 |
axw | oh ffs unity | 00:58 |
davecheney | thumper: thanks | 00:58 |
davecheney | ok, environment setup, now apparently we wait 7 hours ... | 01:00 |
thumper | davecheney: I have been talking about that issue | 01:01 |
thumper | davecheney: how much memory does your VM have? | 01:01 |
davecheney | 8 gb | 01:01 |
thumper | ok... perhaps that'll be fine | 01:01 |
thumper | 24 was bad, 16 was good | 01:01 |
davecheney | hmm, that is odd | 01:01 |
thumper | I'm wondering if there is a fundamental issue elsewhere | 01:01 |
thumper | as in, not us | 01:01 |
davecheney | more likely | 01:01 |
thumper | but yes, keep an eye on it | 01:02 |
davecheney | i don't doubt there are bugs in the compiler | 01:02 |
davecheney | that is why i need machine-0.log | 01:02 |
davecheney | all machines via rsyslog is eating the stack trace | 01:02 |
davecheney | thumper: could you make sure if there is any conversation about that issue that you cc me | 01:02 |
davecheney | i've been trying to get everyone to communicate via the issue | 01:02 |
thumper | the original machine that was a problem has now got 16gig of ram and a new kernel | 01:02 |
thumper | davecheney: it was all voice conversation | 01:02 |
davecheney | but am currently managing 3 independent email threads by three separate groups working the problem | 01:03 |
thumper | I was chatting with hazmat about it | 01:03 |
thumper | huh... | 01:03 |
thumper | that isn't fun | 01:03 |
davecheney | thumper: i've done more funner things | 01:05 |
thumper | :) | 01:05 |
davecheney | i know this issue is critical | 01:05 |
davecheney | but saying it's critical isn't enough | 01:05 |
davecheney | we need details | 01:05 |
axw | fairly trivial review anyone? deleting 1.16 compat code: https://codereview.appspot.com/84520046/ | 01:09 |
tvansteenburgh | davecheney: re http://pastebin.ubuntu.com/7219597/ | 01:17 |
tvansteenburgh | (re https://bugs.launchpad.net/juju-core/+bug/1303787) | 01:17 |
_mup_ | Bug #1303787: hook failures - nil pointer dereference <hooks> <local-provider> <ppc64el> <juju-core:Incomplete by dave-cheney> <https://launchpad.net/bugs/1303787> | 01:17 |
tvansteenburgh | i'm guessing that doesn't help much :P | 01:17 |
thumper | tvansteenburgh: have you encountered issues since the vm was resized? | 01:18 |
davecheney | tvansteenburgh: the command you want is | 01:18 |
davecheney | dpkg -l | grep gccgo | 01:18 |
davecheney | sorry | 01:18 |
davecheney | dpkg -l | grep gccgo | 01:19 |
tvansteenburgh | thumper: honestly i don't know for sure as i haven't been on the machine today, but from convos on irc i've inferred that there are still probs | 01:20 |
davecheney | tvansteenburgh: can you get me the machine-0.log file from the machine | 01:20 |
davecheney | tvansteenburgh: or | 01:20 |
davecheney | better | 01:20 |
davecheney | can you do ssh-import-id dave-cheney | 01:20 |
davecheney | and I can do it myself | 01:20 |
tvansteenburgh | davecheney: dpkg -l | grep gccgo prints nothing | 01:20 |
tvansteenburgh | davecheney: imported your key | 01:21 |
davecheney | tvansteenburgh: is this on a ppc64el machine ? | 01:21 |
tvansteenburgh | yes | 01:21 |
davecheney | tvansteenburgh: this is very worrying | 01:21 |
davecheney | what you are telling me doesn't add up | 01:21 |
davecheney | the example from the weekend showed that juju was compiled locally | 01:22 |
davecheney | but you're saying that there is no compiler on this machine | 01:22 |
tvansteenburgh | don't shoot the newbie :) | 01:22 |
davecheney | tvansteenburgh: sorry mate | 01:22 |
tvansteenburgh | i didn't install juju on this machine so i'm not sure what to tell you | 01:22 |
tvansteenburgh | but keep in mind that juju was upgraded since the original bug report | 01:22 |
davecheney | tvansteenburgh: ok | 01:23 |
tvansteenburgh | so it's possible that it was from source before, and now pre-compiled binary | 01:23 |
davecheney | i'll check out the machine myself | 01:23 |
davecheney | tvansteenburgh: are you going to be doing the demo, or hazmat ? | 01:23 |
tvansteenburgh | hazmat | 01:23 |
davecheney | roger | 01:24 |
thumper | davecheney: no, hazmat ;-P | 01:24 |
* thumper snickers at his joke | 01:24 | |
* hazmat hides | 01:24 | |
davecheney | tvansteenburgh: what is the internal ip of wolfe-01 | 01:26 |
davecheney | i have the wrong config | 01:26 |
tvansteenburgh | 10.245.66.193 | 01:28 |
davecheney | ta | 01:28 |
davecheney | tvansteenburgh: thanks, i'm grabbing the log files now | 01:30 |
hazmat | thumper, local provider detects btrfs and just uses it if it's at /var/lib/lxc ? | 01:34 |
thumper | yup | 01:34 |
hazmat | thumper, awesome | 01:34 |
* hazmat gives it a whirl | 01:35 | |
thumper | hazmat: will still take time to create the template first time around | 01:35 |
thumper | wallyworld: got 10 minutes? I need help solving a tools issue | 01:37 |
wallyworld | sure | 01:37 |
* thumper starts a hangout | 01:37 | |
thumper | https://plus.google.com/hangouts/_/7acpj3v9ejfcghmeigfq41jfc4?hl=en | 01:38 |
hazmat | thumper, understood, i updated the lxc cache image directly as well | 01:40 |
hazmat | well that's cute.. juju set-env random-string="abc" just works.. | 01:42 |
hazmat | and is retrievable via juju get-env | 01:42 |
davecheney | http://paste.ubuntu.com/7219672/ | 01:48 |
davecheney | grep -c panic | 01:48 |
davecheney | eeerk | 01:48 |
davecheney | sinzui: where is the packaging branch for juju-core ? | 01:51 |
davecheney | tvansteenburgh: i'd take a punt and say wolfe-01 got nuked and recreated | 01:54 |
davecheney | that is why there is no compiler | 01:55 |
davecheney | nuke; recreate; [sudo] apt-get install juju-core | 01:55 |
tvansteenburgh | davecheney: yeah i think that /is/ what happened | 01:57 |
davecheney | tvansteenburgh: that makes sense then | 01:57 |
davecheney | jamespage: where is the packaging branch for juju-core ? | 02:03 |
=== arosales_ is now known as arosales | ||
waigani | thumper: wallyworld: axw: my internet connection is crapping out on me | 02:38 |
wallyworld | yay for NZ internet | 02:39 |
thumper | works for me | 02:44 |
hazmat | thumper, davecheney so good news.. i've run through the demo about a dozen times.. zero panics | 03:24 |
thumper | \o/ | 03:24 |
hazmat | thumper, and using btrfs :-) | 03:24 |
thumper | on power? | 03:24 |
hazmat | thumper, yup | 03:24 |
thumper | awesome | 03:24 |
thumper | how do I find out which package provides an executable again? | 03:39 |
thumper | wallyworld https://codereview.appspot.com/85220043 | 03:45 |
wallyworld | looking | 03:45 |
thumper | now this works | 03:45 |
thumper | for some value of works | 03:45 |
thumper | it tries to create a trusty container and fails | 03:45 |
thumper | for other reasons | 03:46 |
thumper | I'm asking on #ubuntu-server for answers to those reasons | 03:46 |
davecheney | ubuntu@winton-02:~$ juju status | 04:17 |
davecheney | ERROR failed verification of local provider prerequisites: exec: "mongod": executable file not found in $PATH | 04:17 |
davecheney | MongoDB server must be installed to enable the local provider: | 04:17 |
davecheney | sudo apt-get install mongodb-server | 04:17 |
davecheney | still telling me to install mongodb-server | 04:17 |
davecheney | did that branch land ? | 04:17 |
thumper | not yet | 04:19 |
davecheney | ok | 04:20 |
davecheney | https://codereview.appspot.com/85150044 | 05:13 |
davecheney | trivial fix | 05:13 |
jam | davecheney: LGTM, all the other files in that directory use the bson: syntax | 05:19 |
davecheney | jam: that is why i feel pretty confident that the fix is ok | 05:21 |
jam | davecheney: me too | 05:21 |
davecheney | ├─jujud─┬─lxc-create───lxc-ubuntu-clou───ubuntu-cloudimg───wget | 05:38 |
davecheney | │ └─11*[{jujud}] | 05:39 |
davecheney | what the balls is it downloading | 05:39 |
davecheney | and is there a way to speed it up | 05:39 |
=== vladk|offline is now known as vladk | ||
jam | davecheney: if you are using LXC, it has to download the Ubuntu image for the LXC instance | 05:53 |
davecheney | it's hung up trying to download some small releases text file from cdimages | 05:53 |
davecheney | ubuntu@ip-10-251-8-60:~$ juju deploy ubuntu --debug | 05:55 |
davecheney | 2014-04-08 05:54:20 INFO juju.cmd supercommand.go:296 running juju-1.19.0-trusty-amd64 [gccgo] | 05:55 |
davecheney | 2014-04-08 05:54:20 DEBUG juju api.go:189 trying cached API connection settings | 05:55 |
davecheney | 2014-04-08 05:54:20 INFO juju api.go:259 connecting to API addresses: [localhost:17070 10.0.3.1:17070] | 05:55 |
davecheney | 2014-04-08 05:54:20 INFO juju.state.api apiclient.go:194 dialing "wss://localhost:17070/" | 05:55 |
davecheney | 2014-04-08 05:54:20 INFO juju.state.api apiclient.go:141 connection established to "wss://localhost:17070/" | 05:55 |
davecheney | ^ hung | 05:55 |
davecheney | hang on | 05:55 |
davecheney | why is it talking to localhost ? | 05:55 |
davecheney | the api server is running on 10.0.3.1 | 05:55 |
davecheney | axw: wasn't there a bug logged about this over the weekend ? | 05:57 |
* axw looks up | 05:57 | |
axw | um | 05:58 |
axw | davecheney: localhost is the "public address" for the local provider's machine-0 | 05:58 |
axw | 10.0.3.1 is the internal | 05:58 |
axw | davecheney: is there a problem with that? | 05:59 |
=== vladk is now known as vladk|offline | ||
davecheney | axw: not sure | 06:02 |
davecheney | there will be when proxies are involved | 06:02 |
davecheney | we always say | 06:02 |
davecheney | no_proxy="10.0.3.1" | 06:02 |
davecheney | not no_proxy="localhost" | 06:02 |
axw | no_proxy=localhost only makes sense if you're outside the environment | 06:03 |
axw | i.e. using the CLI | 06:03 |
axw | I suppose we could manage the CLI's environment specifically for the local provider | 06:03 |
davecheney | axw: ok | 06:04 |
davecheney | nm | 06:04 |
davecheney | probably not the problem at the moment | 06:04 |
davecheney | 2014-04-08 06:07:56 ERROR juju.provisioner provisioner_task.go:438 cannot start instance for machine "3": error executing "lxc-start": command get_cgroup failed to receive response | 06:24 |
davecheney | any ideas ? | 06:24 |
davecheney | I wonder if it is related to https://bugs.launchpad.net/bugs/1304167 | 06:25 |
_mup_ | Bug #1304167: syntax error, trusty beta-2 cloud image <apparmor (Ubuntu):New> <https://launchpad.net/bugs/1304167> | 06:25 |
=== vladk|offline is now known as vladk | ||
jam | davecheney: that sounds like nested-lxc issues, but I'm guessing you're not nesting? | 06:32 |
davecheney | nup | 06:33 |
davecheney | just rebooted | 06:33 |
davecheney | there was an apparmor snafu installing the lxc package | 06:33 |
davecheney | maybe that was the cause | 06:33 |
davecheney | http://paste.ubuntu.com/7220297/ | 06:36 |
davecheney | nope, lxc is broken | 06:36 |
rogpeppe1 | mornin' all | 06:47 |
rogpeppe1 | i was just looking at this change: https://github.com/juju/testing/pull/3/files | 06:47 |
rogpeppe1 | is there any way i can see the whole file diff on github? | 06:47 |
rogpeppe1 | or do i have to pull it down to do that? | 06:48 |
fwereade | jamespage, jam: so, I think I've tracked down that relations-on-upgrade bug -- but I think (1) it only hits relations without other members, and (2) it's always existed | 07:13 |
fwereade | jamespage, were there any other units in that peer relation? | 07:13 |
fwereade | jamespage, jam: that's bug 1303697 ^^ | 07:14 |
_mup_ | Bug #1303697: peer relation disappears during upgrade of juju <juju-core:Triaged by fwereade> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303697> | 07:14 |
axw | fwereade: I'm creating an instance.Placement structure for placement directives, but I'd like to leave SSH out of it because it only makes sense in add-machine (for now at least) | 07:32 |
axw | is that okay with you? | 07:32 |
axw | fwereade: http://paste.ubuntu.com/7220424/ -- currently adding a new method to state to add a machine with one of these | 07:33 |
fwereade | axw, I feel it's mildly suboptimal, because I think it *should* be reasonable to deploy --to ssh:blah -- but I can see how it would pull in a lot more work, so, fair enough | 07:35 |
fwereade | axw, about the kv pairs in environment placement... upsides and downsides | 07:36 |
axw | I'm pretty sure we can add it later if necessary | 07:36 |
fwereade | axw, and, hmm, I don't quite love the separation between ContainerPlacement/EnvironmentPlacement | 07:36 |
axw | fwereade: the idea was that a container placement can encapsulate any other, the others being terminals | 07:37 |
fwereade | axw, (1) I think the conception of lxc/kvm as pseudo-providers is quite nice; (2) I think that --to us-east-1a is nice, but that's not a kv pair | 07:37 |
axw | fwereade: so you could do lxc:maas:name=abc | 07:37 |
fwereade | axw, ha, that's an interesting thought | 07:38 |
fwereade | axw, I'm a little bit -1 on it -- I think it's ok that the domain of the lxc/kvm providers is restricted to existing machines | 07:39 |
fwereade | axw, I can see nice things about it but it also makes my spidey sense tingle a little | 07:40 |
axw | okay, I'll collapse them for now | 07:40 |
axw | fwereade: so just some kind of scope/provider and a value, and leave it to the environ to parse k=v if it wants to? | 07:40 |
fwereade | axw, yeah, I think that's the right separation of concerns | 07:41 |
axw | ok | 07:41 |
fwereade | axw, and it does leave the providers free to have us-east-1a for now, and later to allow zone=us-east-1a,somethingelse=somethingelse | 07:41 |
axw | yep, fair enough | 07:42 |
fwereade | axw, fwiw I think scope is a nice term for it | 07:42 |
axw | okey dokey well that simplifies the changes needed to state then | 07:44 |
* fwereade is getting really quite grumpy with the half-assed attempts to make the uniter work on multiple goroutines | 07:44 | |
fwereade | axw, I don't suppose you know why we call u.proxy.SetEnvironmentValues() in updatePackageProxy, do you? | 07:54 |
* axw looks | 07:54 | |
axw | fwereade: I think so the hooks run with the proxy env vars set? | 07:55 |
fwereade | axw, we explicitly pass the proxy settings in when we create a hook context | 07:55 |
axw | sorry, not really sure tbh. better off asking thumper - he implemented that I think | 07:56 |
fwereade | axw, np, cheers | 07:56 |
axw | fwereade: is this better? http://paste.ubuntu.com/7220551/ | 08:04 |
axw | machine-id is a special case, because there isn't really a scope/provider/whatever | 08:05 |
axw | for empty scope, "juju add-machine" will fill it in with the current environment | 08:05 |
axw | so "juju add-machine --to maas-node-123" will implicitly have the current env's name as the placement scope | 08:06 |
fwereade | axw, I'm wondering whether it would be excessively evil to consider "0" or whatever to be in some special scope itself | 08:08 |
fwereade | axw, "--to 0" == --to "machine:0" as it were | 08:09 |
fwereade | axw, just trying to figure out the consequences of a collision between DWIM vs law of least surprise | 08:10 |
fwereade | axw, or juju:0, or something | 08:10 |
axw | fwereade: I suppose we could just have an instance.MachineScope constant, = "-" or some character that is not valid in an environment name | 08:10 |
dimitern | fwereade, hey | 08:11 |
fwereade | dimitern, heyhey | 08:11 |
dimitern | fwereade, https://codereview.appspot.com/85060043/diff/1/provider/maas/environ_whitebox_test.go#newcode537 re that comment | 08:11 |
fwereade | dimitern, oh yes | 08:11 |
dimitern | fwereade, what exported method did you mean? | 08:11 |
fwereade | dimitern, StartInstance | 08:11 |
dimitern | fwereade, but to do that we need to land a few things in gomaasapi | 08:12 |
fwereade | dimitern, expand please? sorry I'm not up on all the context there | 08:12 |
dimitern | fwereade, well, to test the new features we need them to be supported by the test server of gomaasapi | 08:13 |
dimitern | fwereade, that's lshw machine details and list_connected_macs for a network | 08:14 |
dimitern | fwereade, i wanted to save some time and land these changes (after testing them live on my local maas ofc) so we can have MVP, and later polish it | 08:15 |
fwereade | dimitern, ok, I see -- as long as there's a card for it, then :) | 08:15 |
dimitern | fwereade, sure, will add one now | 08:15 |
fwereade | axw, yeah, something like that maybe | 08:16 |
axw | fwereade: I'll go with that and see how it turns out. | 08:17 |
fwereade | axw, cheers | 08:18 |
vladk | dimitern: morning | 08:38 |
dimitern | vladk, hey morning | 08:38 |
vladk | dimitern: I am working on linking GetNetworkInfo and extractInterfaces | 08:40 |
vladk | dimitern: I do not want both of us to make the same changes | 08:43 |
davecheney | anyone: if I do juju destroy-environment, does this remove /var/log/juju-$ENV ? | 08:47 |
rogpeppe | davecheney: i'm not sure it does. there might have been a discussion about this recently. jam? fwereade? | 08:48 |
jam | davecheney: thumper explicitly designed it so that destroy-env does *not* delete /var/log/juju* (though it does delete /var/lib). | 08:49 |
jam | var/log will be deleted on another bootstrap | 08:49 |
rogpeppe | axw: i am thinking that "mongodb" in utils/apt.go, cloudArchivePackages, should be "mongodb-server". does that seem right to you? | 08:49 |
dimitern | vladk, ah, sorry, I misunderstood you perhaps | 08:49 |
davecheney | jam: hang on | 08:49 |
davecheney | you just said two things | 08:49 |
davecheney | and they conflicted | 08:50 |
dimitern | vladk, I have the branch that returns []NetworkInfo proposed already | 08:50 |
davecheney | you said /var/log/juju-ubuntu-local would not be deleted | 08:50 |
davecheney | then you said it would | 08:50 |
rogpeppe | any else know about what package we should be installing for mongodb? | 08:50 |
jam | davecheney: /var/log is not deleted during destroy, but /var/lib is | 08:50 |
dimitern | vladk, but we do need some changes in gomaasapi | 08:50 |
jam | davecheney: during bootstrap /var/log is cleaned up | 08:50 |
davecheney | jam: /var/log/juju-ubuntu-local is truncated | 08:50 |
rogpeppe | s/any else/anyone else/ | 08:50 |
davecheney | so the all machines log of my environment is only 5 mins long | 08:50 |
dimitern | vladk, to extend the test server, so it supports /version/ (capabilities), /network/<name>/?op=list_connected_macs and /nodes/<id>/?op=details (for the XML lshw data) | 08:51 |
axw | rogpeppe: sorry, just a moment | 08:51 |
vladk | dimitern: I see your last code review, do you have something behind that? | 08:51 |
dimitern | vladk, I have 2 more - the provisioner to take []NetworkInfo and add NICs/networks in state | 08:52 |
axw | rogpeppe: yes, I think you're right | 08:52 |
dimitern | vladk, and the last one is the cloudinit scripts, where I believe you did some work on and we can do it together? | 08:52 |
rogpeppe | axw: but i don't see mongodb-server mentioned in http://reqorts.qa.ubuntu.com/reports/ubuntu-server/cloud-archive/cloud-tools_versions.html | 08:52 |
davecheney | jam: ok, thanks for confirming | 08:53 |
davecheney | arosales: you online ? | 08:53 |
rogpeppe | axw: but... it looks like that's what we're using in current bootstrap, so i guess it should work | 08:53 |
axw | rogpeppe: well, mongodb depends on mongodb-server. so we get what we need either way | 08:53 |
rogpeppe | axw: ah | 08:53 |
rogpeppe | axw: ok, trying that | 08:54 |
rogpeppe | axw: i've added mongodb-server to cloudArchivePackages | 08:54 |
axw | cool | 08:54 |
vladk | dimitern: we need one more change to gomaasapi, to not panic in testing mode when bootstrapping a node without networks | 08:57 |
vladk | dimitern: how can I see your code? | 08:59 |
dimitern | vladk, yeah, that's why we need to change gomaasapi | 09:01 |
dimitern | vladk, what code specifically? | 09:02 |
dimitern | fwereade, I replied to your comment https://codereview.appspot.com/85060043/ does it make sense? | 09:02 |
fwereade | dimitern, yeah, sgtm | 09:04 |
dimitern | fwereade, cool, submitting then | 09:04 |
vladk | dimitern: I can't work on cloudinit script before you land your changes on NIC->VLAN mapping | 09:17 |
dimitern | vladk, it's approved and on its way to land now | 09:19 |
natefinch | morning all | 09:24 |
natefinch | axw: thanks for doing some testing on my branch. Don't need to apologize that it works ;) | 09:27 |
rogpeppe | hmm, it occurs to me that since we don't need the tools until we connect to the instance, we could start the instance first, then concurrently upload the tools. that would make bootstrap --upload-tools about twice as quick for me. | 09:30 |
rogpeppe | as i wait for yet another bootstrap | 09:30 |
rogpeppe | fwereade: where does the machiner get the addresses from now? | 10:02 |
fwereade | rogpeppe, net.InterfaceAddrs | 10:03 |
fwereade | rogpeppe, and it sets them all with NetworkUnknown | 10:03 |
rogpeppe | fwereade: as an alternative, perhaps the uniter should wait until its machine has an address before starting | 10:03 |
fwereade | rogpeppe, I'm not sure exactly what handwaving we do with the local provider to make those addresses show up usefully | 10:03 |
fwereade | rogpeppe, yeah, maybe that's the way -- ask repeatedly for your own address until you have one, then carry on | 10:04 |
fwereade | rogpeppe, and I think we can do that with existing apis | 10:04 |
rogpeppe | fwereade: yeah | 10:04 |
natefinch | axw: not yet... | 10:05 |
rogpeppe | fwereade: istm that if you're the uniter, you really don't want to start until other units can talk to you | 10:05 |
axw | natefinch: cat ~/.juju/local/cloud-init-output.log and see if there's anything interesting there | 10:05 |
fwereade | rogpeppe, yeah -- and that used to work, but we moved stuff around | 10:06 |
fwereade | rogpeppe, although, hmm | 10:06 |
fwereade | rogpeppe, unaddressable containers should be fine, this might render them less fine | 10:06 |
* fwereade goes to dig and find out exactly how we collate addresses | 10:06 | |
rogpeppe | fwereade: hmm, interesting | 10:06 |
axw | natefinch: also, you do have juju-mongodb installed right? | 10:07 |
rogpeppe | fwereade: i think we need to be able to explicitly mark a container as unaddressable | 10:07 |
fwereade | rogpeppe, yeah | 10:07 |
rogpeppe | fwereade: some special form of address might work | 10:08 |
fwereade | rogpeppe, and add a bunch of consistency stuff so we don't accidentally put workloads that need to relate into them etc etc | 10:08 |
rogpeppe | fwereade: yeah | 10:08 |
fwereade | rogpeppe, well, they'll still have some sort of address on the machine network at least | 10:08 |
natefinch | axw: yeah, there's nothing interesting in the cloud init output.... last thing is just "opening environment local" | 10:08 |
fwereade | rogpeppe, that'll all come if/when we do explicit proxy charm support I think | 10:08 |
natefinch | axw: I have mongodb built with SSL in the correct spot, but it's not specifically juju-mongodb | 10:09 |
rogpeppe | natefinch: what are you having problems with? | 10:09 |
axw | natefinch: mk. is mongo even running? | 10:10 |
natefinch | axw: mongo runs, yeah | 10:10 |
natefinch | rogpeppe: bootstrapping local | 10:10 |
natefinch | axw: maybe I should just install juju-local | 10:11 |
axw | wouldn't hurt :) | 10:11 |
natefinch | :) | 10:11 |
axw | natefinch: do you have an http proxy set? just wondering if that breaks something... | 10:11 |
axw | otherwise I am out of ideas | 10:11 |
natefinch | no proxy | 10:11 |
rogpeppe | natefinch: try putting some more debug statements in the dial code | 10:12 |
axw | gotta make dinner, bbl | 10:12 |
=== axw is now known as axw-afk | ||
natefinch | well, installing juju-local let me bootstrap, but I can't destroy the environment now. Sigh | 10:18 |
fwereade | wallyworld, I think we'll need to TOPIC this, but let me read further | 10:19 |
wallyworld | sure | 10:19 |
fwereade | wallyworld, yeah, I think we need to make the fallback logic provider-specific too | 10:20 |
fwereade | wallyworld, consider env:arch=i386, service:instance-type=t1.micro | 10:21 |
wallyworld | fwereade: i think it is, or? | 10:21 |
wallyworld | all providers except ec2 and openstack just use the current WithFallbacks as they don't yet support instance-type | 10:21 |
wallyworld | ec2 and openstack have specific logic | 10:22 |
fwereade | wallyworld, ah! ok, I misread something, let me keep going | 10:22 |
wallyworld | ok, no rush | 10:22 |
fwereade | wallyworld, we should probably move WithFallbacks off the Constraints type though | 10:22 |
wallyworld | sure | 10:22 |
wallyworld | that's easily done | 10:23 |
fwereade | wallyworld, and I think there's a subtlety wrt which values need to be masked when | 10:23 |
wallyworld | ok, if we can encapsulate the business rules, i'll implement them | 10:23 |
wallyworld | we have the means now that there's provider specific behaviour we can plug in | 10:25 |
fwereade | wallyworld, yeah -- did you get a chance to look at the python implementation by any chance? | 10:25 |
wallyworld | fwereade: yeah, but a lot of it was pretty much not that relevant | 10:26 |
fwereade | wallyworld, it's not a perfect model, but it does allow for defining conflicts between particular constraints | 10:26 |
fwereade | wallyworld_, as I was saying before we were so rudely interrupted | 10:27 |
fwereade | wallyworld, it's not a perfect model, but it does allow for defining conflicts between particular constraints | 10:27 |
fwereade | wallyworld_, ie attempting to define mem and instance-type together fails | 10:27 |
fwereade | wallyworld_, and defining instance-type in service constraints masks out only those env constraints that conflict with it | 10:28 |
wallyworld_ | fwereade: yes, although i have been more strict | 10:28 |
dimitern | fwereade, mgz, vladk, perrito666, https://codereview.appspot.com/85380043 - a small necessary step before the provisioner can start adding networks/NICs | 10:28 |
fwereade | wallyworld_, I think it becomes too strict though | 10:28 |
wallyworld_ | ie if the combined constraint has inst type and (mem or cpu-cores or cpu-power etc), it ignores instance type | 10:29 |
fwereade | wallyworld_, it looked like it was also masking arch | 10:29 |
wallyworld_ | fwereade: not for ec2 | 10:29 |
natefinch | rogpeppe: btw, I landed the namespace branch | 10:29 |
rogpeppe | natefinch: thanks a lot | 10:29 |
fwereade | wallyworld_, ok, I just fail at reading then :) | 10:30 |
wallyworld_ | fwereade: it masks for openstack but i can change that | 10:30 |
wallyworld_ | if needed | 10:30 |
fwereade | wallyworld_, possibly all we need is conflict resolution within single-level constraints | 10:30 |
wallyworld_ | fwereade: for ec2, if there's an arch, it checks that the instance type can support it | 10:30 |
natefinch | rogpeppe: merging into HA now | 10:30 |
rogpeppe | natefinch: cool | 10:30 |
wallyworld_ | fwereade: i'll let you digest and we can topic | 10:30 |
wallyworld_ | gotta put kid to bed | 10:31 |
rogpeppe | natefinch: i'm just trying out a spike which tries to integrate everything to see if it all actually works, BTW | 10:31 |
axw-afk | wallyworld_: I just sent another update on placement directives | 10:32 |
natefinch | rogpeppe: seems like a good thing to do | 10:32 |
perrito666 | dimitern: small? | 10:39 |
dimitern | perrito666, well, at least it's straightforward I hope :) | 10:40 |
mgz | :) | 10:40 |
wallyworld_ | axw-afk: great, thanks :-) | 10:42 |
jam | dimitern: standup ? | 10:48 |
dimitern | jam, sorry, coming | 10:50 |
dimitern | fwereade, I still think this should land https://codereview.appspot.com/85380043, but I'll change https://codereview.appspot.com/85220044/ to implement the SetProvisionedWithNetworks API as agreed | 11:32 |
jam | rogpeppe: natefinch: I'm getting a strange failure trying to do this in a test case: apiMachine, err := s.State.AddMachine("quantal", state.JobManageEnviron) | 11:44 |
jam | if I do it in SetUpTest it works fine | 11:45 |
jam | but in TestFoo | 11:45 |
jam | it gives:"cannot add a new machine: state server jobs specified without calling EnsureAvailability" | 11:45 |
rogpeppe | jam: you can't do that if it's not the first machine | 11:45 |
rogpeppe | jam: only the bootstrap machine can be explicitly added with JobManageEnviron | 11:45 |
jam | rogpeppe: so even if the first machine *isn't* a JobManageEnviron | 11:45 |
rogpeppe | jam: yes | 11:45 |
rogpeppe | jam: to allow that would complicate the logic for no gain apart from in test code | 11:46 |
rogpeppe | jam: (and it's usually easy enough to work around the issue in tests) | 11:46 |
rogpeppe | jam: i still think that EnsureAvailability isn't a great idea in general, but this is one of the implications of it | 11:47 |
rogpeppe | jam: just wanted to run something by you for sanity checking | 12:07 |
jam | ? | 12:07 |
rogpeppe | jam: i thought that mgo.Strong implied that you'd always be talking to the mongo primary | 12:07 |
mgz | fwereade: do you have a mo? | 12:07 |
rogpeppe | jam: so IsMaster *should* always report that the current session is master | 12:08 |
rogpeppe | jam: at least that's what i assumed | 12:08 |
rogpeppe | jam: but it doesn't seem to be the case | 12:08 |
mgz | fwereade: your review for horacio's network verification branch, you say you want it done in state | 12:08 |
rogpeppe | jam: can you think of some reason that's not a reasonable assumption? | 12:08 |
mgz | fwereade: but that means the first time juju knows it can't do what the user asked is in the provisioner | 12:08 |
jam | rogpeppe: which 'IsMaster' are we talking about? | 12:08 |
mgz | fwereade: which strikes me as pretty pantsy feedback | 12:09 |
rogpeppe | jam: replicaset.IsMaster (and the corresponding mongo api call) | 12:09 |
jam | rogpeppe: I'm digging | 12:12 |
rogpeppe | jam: me too | 12:12 |
jam | rogpeppe: so the one thing I see is that session.SetMode(consistency, refresh) | 12:13 |
jam | can take "consistency=Strong" but "refresh= false" | 12:13 |
jam | which means it will set slaveOk=false | 12:13 |
jam | but doesn't seem to actually unset the current connection | 12:13 |
rogpeppe | jam: we never actually call SetMode explicitly AFAIK | 12:13 |
jam | rogpeppe: DialWithInfo ends with (SetMode(Strong, true)) | 12:14 |
jam | newSession takes consistency as a parameter | 12:14 |
jam | EnsureIndex calls Clone and then SetMode(Strong, false) | 12:14 |
rogpeppe | jam: oh, i think i see what might be going on | 12:14 |
rogpeppe | jam: replicaset itself calls SetConsistency | 12:15 |
rogpeppe | jam: SetMode | 12:15 |
rogpeppe | natefinch: why does replicaset.CurrentConfig set the session's consistency mode to Monotonic? | 12:16 |
jam | rogpeppe: I think if you want to dial a specific Mongo server | 12:19 |
jam | you have to use Monotonic | 12:19 |
jam | otherwise, whatever you read would actually go to the master | 12:19 |
rogpeppe | jam: hmm, yes. | 12:19 |
rogpeppe | jam: i think that it's wrong that it sets the argument session's mode though | 12:19 |
rogpeppe | jam: i've just changed it to clone the session first | 12:19 |
jam | rogpeppe: so the mgo way of doing it is that you always clone before changing stuff like that | 12:19 |
rogpeppe | jam: we'll see if that fixes the problem | 12:19 |
jam | rogpeppe: right | 12:20 |
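Editor's sketch of the clone-before-changing-mode pattern discussed above. The `Session` type here is a hypothetical stand-in written for illustration; it is not mgo's type, and `currentConfig` is an assumed helper name, not the real replicaset function. It only shows why working on a clone leaves the caller's consistency mode untouched.

```go
package main

import "fmt"

// Mode is a stand-in for a session consistency mode.
type Mode int

const (
	Strong Mode = iota
	Monotonic
)

// Session is a hypothetical stand-in for a database session (not mgo's type).
type Session struct {
	mode Mode
}

// Clone returns an independent copy of the session.
func (s *Session) Clone() *Session { c := *s; return &c }

// SetMode changes the consistency mode of this session only.
func (s *Session) SetMode(m Mode) { s.mode = m }

// Mode reports the session's current consistency mode.
func (s *Session) Mode() Mode { return s.mode }

// Close releases the session; a no-op in this sketch.
func (s *Session) Close() {}

// currentConfig (assumed name) needs Monotonic mode for its read, so it
// switches mode on a clone and leaves the caller's session as it was.
func currentConfig(s *Session) Mode {
	c := s.Clone()
	defer c.Close()
	c.SetMode(Monotonic)
	return c.Mode()
}

func main() {
	s := &Session{mode: Strong}
	used := currentConfig(s)
	fmt.Println(used == Monotonic, s.Mode() == Strong)
}
```

Had `currentConfig` called `SetMode` on its argument directly, the caller's session would silently stop being Strong, which matches the symptom rogpeppe describes.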
perrito666 | fwereade: hey, I was discussing your input with mgz about the --to error on case of network incompatibility and we got some doubts | 12:23 |
fwereade | perrito666, heyhey | 12:39 |
fwereade | perrito666, go on | 12:40 |
fwereade | perrito666, mgz: ah I see | 12:40 |
fwereade | perrito666, mgz, not sure I follow | 12:40 |
fwereade | perrito666, mgz: if you're assigning a unit to a machine that doesn't exist, the network spec comes from the service, easy | 12:41 |
fwereade | perrito666, mgz: if you're assigning to a machine that exists but hasn't been provisioned, all you have to go on is whatever network spec the machine was created with | 12:41 |
fwereade | perrito666, mgz: if you're assigning to a provisioned machine, you can directly check include/exclude against the actual networks | 12:42 |
mgz | fwereade: sure, (apart from the last one, we don't yet populate that) | 12:42 |
fwereade | perrito666, mgz: in any case you should be able to abort at unit-assignment time, right? | 12:42 |
fwereade | mgz, indeed :) | 12:42 |
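Editor's sketch of the include/exclude check for the provisioned-machine case fwereade describes. The function name `checkNetworks` and the plain string slices are assumptions for illustration; the real juju state types and the way networks are stored are not shown in this log.

```go
package main

import "fmt"

// checkNetworks (hypothetical helper) reports an error if the machine's
// networks do not satisfy the unit's include/exclude lists.
func checkNetworks(include, exclude, machineNets []string) error {
	have := make(map[string]bool)
	for _, n := range machineNets {
		have[n] = true
	}
	// Every included network must be present on the machine.
	for _, n := range include {
		if !have[n] {
			return fmt.Errorf("machine lacks required network %q", n)
		}
	}
	// No excluded network may be present on the machine.
	for _, n := range exclude {
		if have[n] {
			return fmt.Errorf("machine has excluded network %q", n)
		}
	}
	return nil
}

func main() {
	machine := []string{"net1", "net2"}
	fmt.Println(checkNetworks([]string{"net1"}, []string{"net3"}, machine))
	fmt.Println(checkNetworks([]string{"net3"}, nil, machine))
	fmt.Println(checkNetworks(nil, []string{"net2"}, machine))
}
```

For the other two cases in the discussion, the same check would presumably run against the service's requested networks or the networks the machine was created with, rather than the actual ones.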
mgz | fwereade: the issue is if we're doing this in state, that won't trigger till in response to the user's api call, right? | 12:43 |
fwereade | perrito666, mgz: it sucks a bit that it's hard to do at unit *creation* time (or service creation for that matter) | 12:43 |
fwereade | mgz, yes | 12:43 |
fwereade | mgz, that's still plenty of time to send an error back down to the user, surely? | 12:43 |
mgz | fwereade: so, the api call will succeed, then the worker will go, "oh hey, I can't actually do that"... then the user needs to run status and see that nothing actually happened | 12:43 |
jam | rogpeppe: do you remember the earlier question about "is the address actually changing"? I'm seeing this in the local provider: updater.go:249 machine "0" has new addresses: [public:localhost local-cloud:10.0.3.1] | 12:43 |
jam | but that slice doesn't ever change | 12:43 |
fwereade | mgz, what worker? the assignment completes before the API call returns | 12:43 |
natefinch | rogpeppe, jam: thanks for fixing that. | 12:44 |
mgz | fwereade: ah, does it? that's the bit I'm not clear on. | 12:44 |
fwereade | mgz, it's in the juju package iirc -- we create unit, then assign to machine, and the result of all that gets sent over the api | 12:44 |
mgz | fwereade: basically perrito666 doesn't have enough to go on from your review to actually change the code how you want it done | 12:44 |
jam | fwereade: mgz: I'm sure it is done in the same call, but it is not atomic | 12:45 |
jam | "juju add-unit --to N" will fail | 12:45 |
fwereade | mgz, the sucky bit is that we don't generally check assignment sanity before creating the unit, so you can end up with unassigned units, which is rubbish | 12:45 |
jam | but leave you with a Unit in Pending status | 12:45 |
rogpeppe | jam: that is odd | 12:45 |
jam | fwereade: right | 12:46 |
jam | that happens with --to | 12:46 |
jam | so it isn't *worse* than with --network :) | 12:46 |
fwereade | jam, and also just in general with deploy and add-unit -- they're not transactions and can fail partway through | 12:46 |
jam | fwereade: yeah, it is just surprising when the command line args aren't valid, but it creates some stuff anyway | 12:47 |
fwereade | jam, most other things are transactional iirc (nothing is atomic :-/) | 12:47 |
fwereade | jam, stricter checking in the api is the only response I'm really comfortable with there at this stage | 12:47 |
fwereade | jam, no argument that we should have it | 12:47 |
fwereade | jam, but some stuff -- like auto assignment in particular -- is likely to be tricky to get right in a single transaction | 12:48 |
rogpeppe | good news: i just stopped the juju-db service running on machine 0 and juju status continues to work | 12:49 |
natefinch | nice :) | 12:49 |
fwereade | rogpeppe, sweet | 12:49 |
fwereade | let's ship it | 12:49 |
fwereade | ;p | 12:49 |
jam | rogpeppe: presumably in HA first, right ? :) | 12:51 |
rogpeppe | jam: yea | 12:51 |
rogpeppe | h | 12:51 |
jamespage | fwereade, as the font of all knowledge - I've noticed that juju in our internal cloud picks i386 by default - is that intentional? | 12:52 |
jamespage | we did not hit this before because I was local tool building for amd64 only | 12:52 |
jamespage | now we sync from streams | 12:52 |
fwereade | jamespage, hmmmmmm, that is suboptimal, I think that is a casualty of arch-selection changes in response to ppc | 12:52 |
jamespage | fwereade, hmmmmm | 12:53 |
mgz | I thought we selected amd64 first... | 12:53 |
jamespage | fwereade, I have to force with --constraints now | 12:53 |
jamespage | fwereade, mgz: sounds like a bug to me | 12:53 |
fwereade | mgz, we certainly did once, and I *thought* we still did | 12:54 |
fwereade | jamespage, concur | 12:54 |
* jamespage goes to raise a bug | 12:54 | |
mgz | maybe we had the constraint default to amd64, which was wrong (given non-intely clouds), and just lost it as a preference when fixing? | 12:55 |
jam | mgz: note we have a MaaS bug open for arm clouds related to that | 12:57 |
jam | apparently if you don't specify arch= we thought we were starting an amd64 there | 12:57 |
jam | fwereade: https://codereview.appspot.com/85450043 Upgrader returns version.Current if we haven't upgraded yet | 12:57 |
jam | tested using the local provider and, indeed, the API server upgrades first. | 12:58 |
jam | they all wake up in response to changing the agent-version in config, but they're all told to not do anything, and they all ask again when machine-0's agent restarts and they all reconnect | 12:59 |
rogpeppe | fwereade: do you know what happens about all-machines.log in the presence of multiple API servers, by any chance? | 13:00 |
fwereade | rogpeppe, hopefully you update the rsyslog config to sync to the other state-servers ;) | 13:01 |
jam | rogpeppe: that might explain the results where DesiredVersion was confusing you in the past | 13:01 |
jam | rogpeppe: the *unit* agents get the version their Machine is currently running. | 13:01 |
rogpeppe | jam: i'm not sure how that explains how DesiredVersion could go backwards | 13:02 |
jam | rogpeppe: DesiredVersion for the machine is 1.19.2, but until the machine upgrades DesiredVersion for the unit is 1.19.1 | 13:02 |
jam | so it shouldn't go backwards | 13:02 |
jam | for a given agent | 13:02 |
jam | but it might look like it | 13:02 |
jam | if we aren't tweezing out the calls | 13:02 |
rogpeppe | jam: but it really did go backwards, because it caused the unit agent to break because of that | 13:03 |
fwereade | jam, LGTM | 13:05 |
jam | rogpeppe: that's why.... Old Unit agents *don't* watch their machine versions | 13:07 |
jam | so if the Unit agent upgrades before the machine agent | 13:07 |
jam | and slightly out of sync with the api | 13:07 |
jam | so you get upgraded Unit and API but *not* machine | 13:07 |
rogpeppe | jam: ah! | 13:07 |
rogpeppe | jam: brilliant, well done | 13:08 |
jamespage | fwereade, jam: bug 1304407 | 13:10 |
_mup_ | Bug #1304407: juju bootstrap on openstack cloud defaults to i386 <amd64> <apport-bug> <ec2-images> <trusty> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1304407> | 13:10 |
jam | jamespage: any chance that your juju client is i386 ? | 13:11 |
jamespage | jam: definitely not | 13:11 |
jam | fwereade: so I think we know why bug #1299802 happened, but I'd be interested to brainstorm how we might fix it. | 13:12 |
_mup_ | Bug #1299802: upgrade-juju 1.16.6 -> 1.18 (tip) fails <juju-core:Incomplete> <https://launchpad.net/bugs/1299802> | 13:12 |
fwereade | jam, oh yes? | 13:12 |
jam | fwereade: so in 1.18 the Unit agent gets the Machine agent's actual version as its DesiredVersion, right? | 13:12 |
jam | but in 1.16 the Unit agents just get the global desiredversion | 13:12 |
jam | which means that if the API server and the Unit agent upgrade before the Machine agent | 13:13 |
fwereade | jam, ah, yes, an old server will send the wrong info | 13:13 |
jam | fwereade: well, not an old server | 13:13 |
jam | but just an about-to-be-upgraded one | 13:13 |
fwereade | jam, indeed, a server running the old version | 13:13 |
jam | so technically the fix I just put up would help here, but we can't "put that genie back in the bottle" for existing 1.16.6 servers. | 13:13 |
jam | fwereade: refuse to downgrade would be the easiest fix | 13:14 |
fwereade | jam, indeed | 13:14 |
jam | fwereade: given we actually have 0 experience making downgrades work | 13:14 |
fwereade | jam, +1 | 13:14 |
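Editor's sketch of the "refuse to downgrade" check agreed on above. `Number` and `checkUpgrade` are hypothetical names for illustration and are not taken from juju's version package or upgrader; the sketch only shows the comparison an agent could make before acting on a desired version.

```go
package main

import "fmt"

// Number is a hypothetical three-part version, for illustration only.
type Number struct {
	Major, Minor, Patch int
}

// Less reports whether a is an older version than b.
func (a Number) Less(b Number) bool {
	if a.Major != b.Major {
		return a.Major < b.Major
	}
	if a.Minor != b.Minor {
		return a.Minor < b.Minor
	}
	return a.Patch < b.Patch
}

// checkUpgrade (assumed name) returns true if the agent should upgrade,
// and an error if the desired version would be a downgrade.
func checkUpgrade(current, desired Number) (bool, error) {
	if desired == current {
		return false, nil
	}
	if desired.Less(current) {
		return false, fmt.Errorf("refusing to downgrade from %d.%d.%d to %d.%d.%d",
			current.Major, current.Minor, current.Patch,
			desired.Major, desired.Minor, desired.Patch)
	}
	return true, nil
}

func main() {
	// A unit agent already on 1.18.0 told to run 1.16.6 by a not-yet-upgraded server.
	fmt.Println(checkUpgrade(Number{1, 18, 0}, Number{1, 16, 6}))
	// The normal upgrade direction.
	fmt.Println(checkUpgrade(Number{1, 16, 6}, Number{1, 18, 0}))
}
```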
jam | fwereade: refresh rawMachine ? the issue is that there *isn't* a record in the DB for the raw and api server machine versions | 13:17 |
jam | and SetEnvironAgentVersion checks that everything has the same version | 13:17 |
jam | before it lets you set it to something else | 13:17 |
jam | and nil != version.Current | 13:17 |
jam | fwereade: (context is the patch you just reviewed) | 13:17 |
fwereade | jam, doh, I see, they are now the same machine | 13:18 |
fwereade | jam, forget I said anything | 13:18 |
fwereade | jam, s/now/not/ | 13:18 |
jam | "not" right | 13:18 |
fwereade | jam, yeah | 13:19 |
fwereade | jam, sorry, half my brain is taken up trying to figure out wtf has happened to the uniter | 13:19 |
jam | fwereade: np, too many threads as always | 13:20 |
rogpeppe | jam, fwereade: refusing to downgrade is the easy answer - it's a useful sanity constraint anyway | 13:39 |
jam | rogpeppe: so I have lp:///~jameinel/juju-core/1.18-refuse-downgrade-1299802, the question is testing it... | 13:39 |
=== axw_ is now known as axw | ||
rogpeppe | jam: i'm not sure that the log message should mention the bug. it might not be due to the bug. | 13:42 |
rogpeppe | jam: testing it shouldn't be too hard, i'd have thought | 13:42 |
rogpeppe | jam: it should sit nicely with the other upgrader tests | 13:42 |
dimitern | fwereade, jam, mgz, vladk, I still need a review on https://codereview.appspot.com/85380043 pleasre | 13:43 |
dimitern | *please | 13:43 |
rogpeppe | we need to decide how we want to represent environ-worker machines in status | 13:44 |
axw | fwereade: if you have any time today, I'd appreciate a glance over https://codereview.appspot.com/85040046. I can get a full review off someone tomorrow if you're okay with it in general | 13:48 |
axw | fwereade: this is placement for add-machine only at this stage; I will follow up with deploy and add-unit | 13:49 |
fwereade | dimitern, I thought we didn't need to send those errors over the API? I'm strongly -1 on ignoring them at state/api level -- we should catch them at apiserver level, or handle them at agent level, but I really don't like just ignoring them in the api client | 13:49 |
fwereade | dimitern, and tbh I'm getting less sure it's helpful, even at the state level -- surely we should be checking that every field is identical; and if they are it really doesn't seem worth an error, and if they aren't there's a clear fuckup that's more serious than just "already exists" | 13:53 |
dimitern | fwereade, actually with the next CL your concerns will be moot, because i'm removing both AddNetworks and AddNetworkInterfaces from state/api/provisioner | 13:55 |
fwereade | dimitern, haha, ok -- at the state level, though? | 13:55 |
rogpeppe | natefinch: how are you getting on with merging trunk into MA-HA ? | 13:55 |
dimitern | fwereade, i've added machine.SetProvisionedWithNetworks at both state and api level | 13:56 |
fwereade | dimitern, awesome -- it feels like that should completely replace this CL then? | 13:58 |
dimitern | fwereade, more or less, but AlreadyExistsError I think should stay | 13:58 |
dimitern | fwereade, I'm live testing it on maas and will propose soon | 13:58 |
fwereade | dimitern, until someone's consuming it I'm not 100% sure there | 13:59 |
mgz | fwereade: can you give perrito666 guidance on writing really complicated asserts with the mgo shizzle? | 13:59 |
fwereade | mgz, hah, yes, I can try | 13:59 |
mgz | because I'm pretty sure we have no docs for that, and that's what you're asking for with the review | 13:59 |
dimitern | fwereade, *sigh* it'll be a bit difficult to replace the current CL with it, but i'll try | 14:00 |
fwereade | mgz, perrito666: there's a bit in doc/hacking-state.txt, but indeed not a great deal | 14:00 |
fwereade | perrito666, 2 mins, would you start a hangout please? | 14:01 |
perrito666 | certainly | 14:01 |
fwereade | dimitern, I don't follow what dependencies on that will continue to exist? | 14:01 |
mgz | perrito666: invitemetoo! | 14:01 |
perrito666 | suddenly I am so popular, everyone wants to hang out with me :p | 14:01 |
dimitern | fwereade, well the error itself, which will still be returned in state AddNetwork/AddNetworkInterface, but will be ignored at the API level (i need it on the state level though for SetProvisionedWithNetworks) | 14:02 |
dimitern | fwereade, ahem.. "ignored" as in SetProvisionedWithNetworks will not fail if you try to add an existing network or interface | 14:03 |
fwereade | dimitern, indeed, I see that -- but I'm worried that in state we should either be checking every field -- in which case, ehh, why return any error if the state perfectly matches the effect of having called successfully | 14:04 |
fwereade | dimitern, sorry, s/either// | 14:04 |
fwereade | dimitern, s/in which case/if they match/ | 14:04 |
perrito666 | mgz: fwereade google says you are both not available | 14:05 |
fwereade | dimitern, and if they don't match that's probably a real error | 14:05 |
fwereade | perrito666, just paste the link, it'll work | 14:05 |
dimitern | fwereade, and how do you solve the case when not everything is the same, but the network name exists? | 14:05 |
fwereade | dimitern, something's all fucked up then, isn't it? | 14:05 |
mgz | perrito666: I'm making one | 14:05 |
dimitern | fwereade, or we could be trying to change it, which is not supported now | 14:06 |
fwereade | dimitern, yeah I wondered whether it's cleaner as an Update-style thing | 14:06 |
dimitern | fwereade, and in any case, this "ignore if it exists" is a temporary shortcut to the MVP, ideally, we want to discover and add all networks before deploy (probably at bootstrap time or shortly after) | 14:07 |
fwereade | dimitern, good point, well made | 14:07 |
dimitern | fwereade, that will allow us to have much better picture of the reality in state and make sanity checks much better | 14:07 |
fwereade | dimitern, ok, sgtm, AlreadyExists handling will smooth the transition | 14:08 |
dimitern | fwereade, yeah, thanks | 14:08 |
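Editor's sketch of the outcome agreed above: adding an existing network yields an "already exists" error at the state level, and the provisioning call tolerates exactly that error. All names and signatures here (`state`, `addNetwork`, `setProvisionedWithNetworks`, `errAlreadyExists`) are hypothetical stand-ins; the real juju methods mentioned in the log are not reproduced.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists is a hypothetical sentinel standing in for AlreadyExistsError.
var errAlreadyExists = errors.New("already exists")

// network is a minimal illustrative record.
type network struct {
	Name, CIDR string
}

// state is an in-memory stand-in for the state database.
type state struct {
	networks map[string]network
}

// addNetwork fails with errAlreadyExists if the name is already taken.
func (st *state) addNetwork(n network) error {
	if _, ok := st.networks[n.Name]; ok {
		return errAlreadyExists
	}
	st.networks[n.Name] = n
	return nil
}

// setProvisionedWithNetworks adds each network, ignoring only the
// "already exists" case; any other error aborts the call.
func (st *state) setProvisionedWithNetworks(nets []network) error {
	for _, n := range nets {
		if err := st.addNetwork(n); err != nil && err != errAlreadyExists {
			return err
		}
	}
	return nil
}

func main() {
	st := &state{networks: make(map[string]network)}
	nets := []network{{"net1", "10.0.1.0/24"}, {"net2", "10.0.2.0/24"}}
	fmt.Println(st.setProvisionedWithNetworks(nets))
	// A second machine reporting the same networks does not fail.
	fmt.Println(st.setProvisionedWithNetworks(nets))
	fmt.Println(len(st.networks))
}
```

This does not address fwereade's point about comparing every field; a stricter variant could return a distinct error when the name exists but the other fields differ.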
natefinch | rogpeppe: sorry, was helping with family stuff. The merge is complete, fixed some merge problems, checking the tests now. Looks like there's a few more compilation issues in the tests, hope to have it in a decent state shortly. | 14:15 |
rogpeppe | axw: ping | 14:35 |
axw | rogpeppe: yo | 14:36 |
rogpeppe | axw: did you have something to do with stateOpened in the machine agent? | 14:36 |
rogpeppe | axw: i'm just wondering what the justification for it is | 14:37 |
axw | rogpeppe: yeah I think I did that. broken is it? | 14:37 |
axw | ah | 14:37 |
rogpeppe | axw: 'fraid so | 14:37 |
* axw tries to remember | 14:37 | |
rogpeppe | axw: i can see why you did it, but i'm pretty sure it's broken anyway | 14:37 |
rogpeppe | axw: (i thought so before, but a runtime panic in a live environment assures me that it really is) | 14:38 |
axw | ah yeah, so the upgrade steps for state servers could get a state connection | 14:38 |
rogpeppe | axw: yeah. | 14:38 |
rogpeppe | axw: i think that the solution is probably for the upgrader to make an independent connection to the state | 14:38 |
rogpeppe | axw: i'm just wondering if that might end up being awkward. | 14:38 |
axw | rogpeppe: that would probably be fine, but I honestly can't remember the finer details at the moment | 14:39 |
rogpeppe | axw: ok | 14:40 |
rogpeppe | axw: FWIW, the StateWorker can be called multiple times. the second time it gets called, it panics because it closes the stateOpened channel again | 14:40 |
axw | agh :( | 14:40 |
axw | sorry | 14:41 |
rogpeppe | axw: that's ok - i only saw it relatively recently, thought "must fix that some time" but didn't realise it was quite as bad as it was | 14:41 |
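Editor's sketch of the failure rogpeppe describes: closing an already-closed channel panics in Go, so a worker start function that can run more than once must guard the close. The `agent` type and `stateWorker` method are simplified, hypothetical stand-ins, and `sync.Once` is one common guard; the log does not say which fix was actually used (rogpeppe suggests an independent state connection instead).

```go
package main

import (
	"fmt"
	"sync"
)

// agent is a simplified stand-in for the machine agent.
type agent struct {
	stateOpened chan struct{}
	openOnce    sync.Once
}

func newAgent() *agent {
	return &agent{stateOpened: make(chan struct{})}
}

// stateWorker may be called several times (for example when the worker
// is restarted). The Once ensures the channel is closed a single time;
// a bare close(a.stateOpened) here would panic on the second call.
func (a *agent) stateWorker() {
	a.openOnce.Do(func() { close(a.stateOpened) })
}

func main() {
	a := newAgent()
	a.stateWorker()
	a.stateWorker() // safe: no second close
	<-a.stateOpened // receive on a closed channel returns immediately
	fmt.Println("state opened")
}
```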
rogpeppe | axw: i've nearly fixed it, BTW | 14:46 |
natefinch | yay for tests. Somehow dropped a codepath in the merge and a test detected it. | 14:48 |
axw | rogpeppe: cool, thank you. I'll check out what you did in the morning | 14:48 |
=== hatch__ is now known as hatch | ||
sinzui | fwereade, I see a report that says I will release https://launchpad.net/juju-core/+milestone/1.19.0 tomorrow. 1/3 of the targeted bugs will not be ready. | 15:02 |
sinzui | fwereade, jam: and then I see 1.19.1 for Friday, which is impossible | 15:03 |
natefinch | rogpeppe: all MA-HA code builds and tests all pass, doing live tests. | 15:03 |
rogpeppe | natefinch: great | 15:03 |
rogpeppe | natefinch: if you propose it, i'll have a last look and then hopefully it can go in | 15:03 |
sinzui | fwereade, jam: I can release 1.19.0 this week as it is, or we can push all non-essential bugs to 1.19.1 today. I think we already did push the non-essential bugs to 1.19.1 though | 15:04 |
natefinch | rogpeppe: https://codereview.appspot.com/72500043 | 15:06 |
fwereade | sinzui, yeah, I don't think we can reasonably push that tomorrow | 15:06 |
fwereade | sinzui, we will have to see what gets done overnight wrt those bugs | 15:06 |
sinzui | fwereade, bug 1303583 is the only azure bug. The report I see defines azure availability sets as the definition of 1.19.0 | 15:08 |
_mup_ | Bug #1303583: provider/azure: new test failure <gccgo> <juju-core:Triaged> <https://launchpad.net/bugs/1303583> | 15:08 |
rogpeppe | hmm, i'm seeing this when i lbox propose: error: Failed to load data for branch lp:juju-core: Get https://api.launchpad.net/devel/branches?url=lp%3Ajuju-core&ws.op=getByUrl: dial tcp 91.189.89.225:443: connection refused | 15:18 |
natefinch | hmm... canonical twiddling with their SSL certs right now perhaps? | 15:19 |
natefinch | FWIW, lbox worked ok for me 10 minutes ago | 15:19 |
natefinch | hm... apt-get update is failing on amazon | 15:20 |
natefinch | Unable to connect to us-east-1.ec2.archive.ubuntu.com:http: | 15:21 |
natefinch | Unable to connect to ubuntu-cloud.archive.canonical.com:http: | 15:21 |
natefinch | Unable to connect to security.ubuntu.com:http: [IP: 91.189.92.200 80] | 15:21 |
natefinch | wonder if canonical's servers are getting slammed from everyone in the world doing an apt-get update today | 15:21 |
rogpeppe | natefinch: #is are dealing with it | 15:24 |
natefinch | rogpeppe: could you try bootstrapping local with my branch? My local environment is still hosed, so I can't test it. I'll try to fix it, but I don't want to gate HA on my stupid environment. | 15:28 |
rogpeppe | natefinch: will do | 15:28 |
natefinch | rogpeppe: of course, if juju can't apt-get update, we can't test in the cloud either | 15:32 |
rogpeppe | natefinch: true 'nuff | 15:32 |
natefinch | update worked this time, let's see what happens with upgrade | 15:35 |
natefinch | gah, it's not going to finish before I have to go. Looks like it upgraded and installed everything just fine. It's sitting at "Bootstrapping Juju machine agent", so it may or may not work (since that seems to be where we fall over a lot). | 15:40 |
rogpeppe | natefinch: reviewed | 15:40 |
natefinch | Gotta go pick up my daughter from preschool, back in like half hour or 45 minutes. | 15:40 |
rogpeppe | natefinch: ok | 15:40 |
=== bodie_ is now known as Guest15351 | ||
rogpeppe | fwereade, dimitern, mgz: i'd appreciate a review of this, if possible - it fixes a live panic i saw (and cleans some test code up slightly): https://codereview.appspot.com/85450044 | 15:47 |
mgz | rogpeppe: that branch makes some sense to me | 15:56 |
mgz | would prefer if someone else looked at it too | 15:56 |
rogpeppe | mgz: thanks. | 15:56 |
dimitern | rogpeppe, looking | 15:56 |
rogpeppe | dimitern: ta! | 15:56 |
dimitern | rogpeppe, LGTM | 15:59 |
rogpeppe | dimitern: thanks | 15:59 |
stokachu | so im seeing an issue with a local provider running kvm as machines and deploying charms within those machines to lxc containers. problem is you can not access the lxc containers outside of the parent machine http://paste.ubuntu.com/7222158/ | 16:01 |
stokachu | should there be some sort of tunneling setup to allow juju to be able to add-relations between those containers or be accessible outside of the machine hosting the containers? | 16:01 |
dimitern | fwereade, mgz, vladk, if anyone of you is still here, I'd appreciate a review on https://codereview.appspot.com/85220044/ | 16:52 |
natefinch | rogpeppe: so, my bootstrap on amazon failed to open state. | 17:03 |
rogpeppe | natefinch: oh. i thought you'd tried that. | 17:03 |
natefinch | rogpeppe: that's what I was trying as I left, but it didn't finish before I had to go. | 17:04 |
rogpeppe | natefinch: have you looked at the logs on the bootstrap machine?: | 17:04 |
natefinch | rogpeppe: not yet, the bootstrap machine was destroyed when I got back. I guess I should have suspended the client before I left | 17:05 |
rogpeppe | natefinch: well, just checked - the local provider seems to work ok with your branch | 17:17 |
natefinch | rogpeppe: thanks | 17:17 |
natefinch | rogpeppe: hmm no mongo running on | 17:18 |
natefinch | rogpeppe: on the bootstrap node in amazon | 17:19 |
rogpeppe | natefinch: have you looked at /var/log/upstart ? | 17:19 |
rogpeppe | natefinch: that's where mongod errors seem to go | 17:19 |
natefinch | rogpeppe: yeah, I thought it should be in rsyslog.log, but that file is almost empty | 17:20 |
rogpeppe | natefinch: so... do you see anything? | 17:21 |
natefinch | rogpeppe: /var/log/upstart/rsyslog.log contains only a single log line: Skipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd | 17:21 |
rogpeppe | natefinch: there should be a juju-db.log file, i think | 17:21 |
marcoceppi | so, can someone jump on a hangout real quickly, 1.18 appears to break local charm deployments | 17:22 |
rogpeppe | marcoceppi: i'm afraid i'm too close to EOD | 17:22 |
natefinch | marcoceppi: I can jump on | 17:22 |
marcoceppi | natefinch: thanks, inviting | 17:22 |
marcoceppi | natefinch: https://plus.google.com/hangouts/_/7ecpje3khh45s9i32nccdngavk | 17:23 |
natefinch | rogpeppe: there's no upstart juju-db.conf either... something is wonky | 17:23 |
rogpeppe | natefinch: what does your cloud-init-output look like? | 17:23 |
natefinch | rogpeppe: sorry, gotta work on this call | 17:26 |
rogpeppe | natefinch: it all seems to work fine for me | 17:29 |
natefinch | rogpeppe: weird, ok. | 17:30 |
=== bodie_ is now known as Guest92342 | ||
rogpeppe | natefinch: i have to go now. perhaps we could pair tomorrow on trying to get the tests written for the spiked code i've got that actually puts it all together. | 17:33 |
natefinch | rogpeppe: cool, yeah, that would be good. | 17:34 |
natefinch | rogpeppe: I'll try to figure out what's wrong with my environment | 17:34 |
natefinch | rogpeppe: if you've tested amazon and local, I'll make the tweaks you suggested and land HA, if you think that's ok? | 17:34 |
=== Guest92342 is now known as bodie_ | ||
wwitzel3 | natefinch: is your branch up to date, I have some spare time if you want me to kick the tires with it on maas as well | 17:35 |
rogpeppe | natefinch: sgtm | 17:35 |
rogpeppe | wwitzel3: that would be great | 17:35 |
* rogpeppe leaves | 17:36 | |
rogpeppe | g'night all | 17:36 |
wwitzel3 | rogpeppe: see ya rogpeppe | 17:36 |
rogpeppe | wwitzel3: have a great pycon | 17:36 |
natefinch | wwitzel3: that would be awesome. Just a bootstrap and quick deploy would be great. I have to figure out why my environment is borked | 17:36 |
wwitzel3 | rogpeppe: thanks :) | 17:37 |
wwitzel3 | natefinch: ok, I'll do that | 17:37 |
wwitzel3 | natefinch: I'm having some weird issues connecting to cloud-images | 17:50 |
natefinch | wwitzel3: I know IS was having some problems with their servers.... likely due to heartbleed, one way or another | 17:50 |
wwitzel3 | natefinch: yeah, I will keep trying | 17:51 |
marcoceppi | natefinch: fyi: http://askubuntu.com/a/445101/41 | 17:52 |
natefinch | marcoceppi: so the only difference is specifying the series, it looks like? Is this intended behavior or is it a bug? | 17:54 |
natefinch | (obviously the error message is terrible even if it is intended) | 17:54 |
marcoceppi | natefinch: it's still a bug, but not nearly as critical, see 1303880 | 17:54 |
marcoceppi | bug 1303880 | 17:55 |
marcoceppi | dear mup, where are you | 17:55 |
natefinch | bug #1303880 | 17:55 |
_mup_ | Bug #1303880: Juju 1.18.0, can not deploy local charms without series <juju-core:Triaged> <https://launchpad.net/bugs/1303880> | 17:55 |
_mup_ | Bug #1303880: Juju 1.18.0, can not deploy local charms without series <juju-core:Triaged> <https://launchpad.net/bugs/1303880> | 17:55 |
natefinch | heh just slow | 17:55 |
sinzui | cmars, bug 1303880 relates to your recent changes to support charms with ambiguous series | 17:59 |
_mup_ | Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <regression> <series> <juju-core:Triaged> <https://launchpad.net/bugs/1303880> | 17:59 |
* cmars looks | 18:00 | |
sinzui | cmars, I will talk with others to discuss how serious this issue is. I would like to classify the issue as a documentation problem | 18:01 |
cmars | sinzui, got it. thanks. let me know the outcome | 18:01 |
sinzui | or a UI problem. The people would have fixed the issue themselves if the UI said a series/os-version must be specified when deploying a local charm | 18:01 |
=== vladk is now known as vladk|offline | ||
natefinch | sinzui: grossly incorrect error message is essentially a bug. At least it should be easy to fix | 18:08 |
perrito666 | fwereade: hey, you most likely told me this before but, when checking networks against Machine doc, should I query for Addresses or MachineAddresses ? looks to me as if both should have the same content, but famous last words | 18:18 |
=== vladk|offline is now known as vladk | ||
fwereade | perrito666, hey, you still around? | 19:12 |
perrito666 | fwereade: always, I have no life | 19:12 |
bac | anyone having trouble using lbox today after launchpad got new keys? i'm getting error: Get https://api.launchpad.net/devel/people/+me: x509: certificate signed by unknown authority | 19:17 |
bac | sinzui: ^^ any thoughts? | 19:17 |
sinzui | bacL I haven't used lbox since last Friday | 19:18 |
sinzui | bac | 19:18 |
bac | sinzui: browsers can talk to https://launchpad just fine. i'm not sure what lplib is doing | 19:19 |
bac | oh, lbox probably rolled its own access, it can't be using lplib | 19:19 |
sinzui | bac, lbox was the first user of a go-based port of lplib | 19:20 |
bac | sinzui: so a CA that python knows about that go doesn't? /me grasps | 19:20 |
sinzui | natefinch, you landed something today? does lbox love you? | 19:21 |
natefinch | sinzui: it did this morning, but things may have changed in the last 5-ish hours | 19:21 |
natefinch | sinzui: let me see if I can run it now | 19:21 |
natefinch | sinzui, bac: works for me, re-proposing an already proposed branch with new changes. Don't have a new branch to propose to try that, if it's different. | 19:24 |
bac | sinzui: i am trying to 'lbox submit'. oddness. also it looks like the certs aren't newly generated today. | 19:25 |
sinzui | bac, did you get lbox from code? | 19:25 |
sinzui | bac "go get launchpad.net/lbox" | 19:26 |
bac | sinzui: no. i'll try that | 19:27 |
natefinch | sometimes I forget how awesome go get is, because I use it all the time, then I try to go build something else from source, and I'm like "oh yeah, this blows" | 19:27 |
marcoceppi | is there a max filesize for charms? | 19:47 |
perrito666 | fwereade: you vanished | 19:48 |
fwereade | perrito666, hey, sorry, my internets went all funny for a bit | 19:48 |
perrito666 | fwereade: branches you too? :p | 19:48 |
fwereade | perrito666, did you get my stuff about the networks from 30 mins ago? | 19:49 |
perrito666 | fwereade: nope, after I answered, you timed out | 19:49 |
perrito666 | well, not you, your irc client | 19:49 |
fwereade | perrito666, ha, I never saw your answer | 19:50 |
perrito666 | <perrito666> fwereade: always, I have no life | 19:50 |
fwereade | perrito666, haha | 19:50 |
fwereade | perrito666, when you're around: there's a separate collection, called (IIRC) linkednetworks, that has requested machine and service networks, keyed on the respective entity's globalKey | 19:50 |
fwereade | perrito666, that's where you get the unit's requested networks (from the service) and the networks the machine was itself requested to start with -- there will imminently be another collection of *actual* machine networks, but dimitern hasn't landed that yet | 19:50 |
fwereade | perrito666, if any of that is incomprehensible I can expand at arbitrary length on it all | 19:50 |
perrito666 | ok so, for the moment, as we spoke, I should only check that the requested nets of the unit and the requested nets of the machine are a match | 19:51 |
fwereade | perrito666, yeah -- and for the transaction, assert no changes in either document | 19:53 |
bac | hi marcoceppi, i'm trying to use lbox to submit the latest version of lp:~bac/charm-tools/cmd-line-server but it is not working for me due to launchpad certificate issues. would you mind trying to land it? | 19:53 |
marcoceppi | bac: yeah, give me 30 mins | 19:54 |
perrito666 | fwereade: please expand that last one a bit | 19:54 |
bac | marcoceppi: no rush. thanks. | 19:54 |
fwereade | perrito666, hangout? | 19:55 |
perrito666 | fwereade: sure, gimme a sec | 19:55 |
=== vladk is now known as vladk|offline | ||
natefinch | ahh shit | 20:53 |
natefinch | thumper, sinzui: are we supposed to be supporting go 1.1? Rog added a line that requires 1.2 :/ | 20:54 |
thumper | natefinch: yes, AFAIK, we are still 1.1 | 20:55 |
natefinch | dang | 20:55 |
sinzui | natefinch, I think we are. | 20:55 |
* sinzui checks cloud archive | 20:55 | |
sinzui | natefinch, thumper the PPAs that build depend on Go 1.1.2 | 20:59 |
thumper | thought so | 20:59 |
natefinch | yeah, me too, just hoped I was wrong | 20:59 |
thumper | also, what does gccgo support? | 20:59 |
natefinch | no idea | 21:00 |
thumper | natefinch: what was the line ? I'm curious | 21:01 |
natefinch | thumper: using sort.Stable to do a stable sort of instance addresses | 21:01 |
natefinch | thumper: to pick the one that was the most like what we wanted (cloud local, non-hostname) | 21:02 |
thumper | ah | 21:02 |
natefinch | I gotta run. Unfortunately that means HA won't go in tonight. But it should be easy to fix in the morning. That's the last blocker. | 21:03 |
mwhudson | gccgo in trusty supports go 1.2 | 21:07 |
mwhudson | and we really don't care about <trusty for gccgo | 21:07 |
bac | marcoceppi: ignore my earlier request. lbox finally decided to play nicely. | 21:27 |
marcoceppi | Bac ack | 21:27 |
bac | in black | 21:28 |
waigani | hi all | 21:56 |
waigani | when I run the tests on trunk I get a mgo.QueryError | 21:56 |
waigani | "exception: cannot run map reduce without the js engine" | 21:57 |
mwhudson | juju-mongodb does not support map reduce | 22:00 |
mwhudson | so that's interesting :) | 22:00 |
waigani | right, if I use mongodb-server, no problem | 22:01 |
thumper | waigani: is that in the store tests? | 22:03 |
waigani | thumper: yep | 22:03 |
thumper | yeah, we need to skip that test if using juju-mongodb | 22:04 |
thumper | the store code should be moving out of core AFAIK | 22:04 |
thumper | the store doesn't use juju-mongodb | 22:04 |
waigani | rightio | 22:04 |
davecheney | thumper: sinzui ubuntu@winton-02:/var/log/juju-ubuntu-local$ grep panic -c all-machines.log | 23:10 |
davecheney | 0 | 23:10 |
thumper | \o/ | 23:10 |
davecheney | 12 hours, no panic | 23:10 |
thumper | although, we do suck... | 23:10 |
davecheney | $ uname -a | 23:10 |
davecheney | Linux winton-02 3.13.0-8-generic #28-Ubuntu SMP Mon Feb 17 08:22:39 UTC 2014 ppc64le ppc64le ppc64le GNU/Linu | 23:11 |
thumper | I'm trying to get the new debug-log client to error on an old server | 23:11 |
davecheney | ^ magic kernel, accept no substitutes | 23:11 |
thumper | but it just blocks | 23:11 |
davecheney | :( | 23:11 |
thumper | and there is no way as the api client to know the version of the server | 23:11 |
thumper | and I can't add it to use it, because it needs to be there now | 23:11 |
thumper | so I can look from the future | 23:11 |
thumper | hmm... | 23:12 |
thumper | maybe I should add a method | 23:12 |
thumper | to ask for the remote version | 23:12 |
thumper | and if that fails, no debug-log for you | 23:12 |
thumper | it seems the websocket connection just hangs if the end point isn't there | 23:12 |
thumper | can't seem to get it to time out | 23:12 |
davecheney | thumper: hmm | 23:23 |
davecheney | that should be easy to fix | 23:23 |
davecheney | is there any log on the other side if you hit a nonexistent endpoint? | 23:23 |
davecheney | it probably doesn't get further than the rootMethod | 23:23 |
thumper | nope | 23:24 |
thumper | I'm adding a client call "Version" | 23:24 |
thumper | that returns version.Current | 23:24 |
thumper | we know that if you call a client end point that isn't there, then it doesn't work | 23:24 |
thumper | as it is the rpc layer | 23:24 |
thumper | but the websocket gets stuck. | 23:24 |
thumper | it is a bit shit, but necessary :-( | 23:24 |
waigani | hi davecheney | 23:32 |
waigani | First pass at ssh test isolation: https://codereview.appspot.com/85710043 | 23:32 |
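
Editor's note on the 20:54 to 21:02 exchange: `sort.Stable` was added to the standard library in Go 1.2, so it does not build on Go 1.1. One way to get a stable ordering with plain `sort.Sort` is to remember each element's original position and use it as a tie-breaker. The sketch below is only an illustration of that idea, not the fix that landed; the `address` type, its fields, and the `score` ranking are hypothetical stand-ins for juju's instance addresses and its "cloud local, non-hostname" preference.

```go
package main

import (
	"fmt"
	"sort"
)

// address is a hypothetical stand-in for an instance address.
type address struct {
	Value      string
	CloudLocal bool
	Hostname   bool
}

// score ranks an address; lower is better. The weights are assumptions
// chosen so that cloud-local, non-hostname addresses come first.
func score(a address) int {
	s := 0
	if !a.CloudLocal {
		s += 2
	}
	if a.Hostname {
		s++
	}
	return s
}

// indexed pairs an address with its position in the input slice.
type indexed struct {
	addr address
	pos  int
}

type byScore []indexed

func (b byScore) Len() int      { return len(b) }
func (b byScore) Swap(i, j int) { b[i], b[j] = b[j], b[i] }
func (b byScore) Less(i, j int) bool {
	si, sj := score(b[i].addr), score(b[j].addr)
	if si != sj {
		return si < sj
	}
	// Equal scores: fall back to input order, which makes the sort stable.
	return b[i].pos < b[j].pos
}

// sortAddresses returns a copy of addrs ordered by score, keeping the
// input order among addresses with the same score.
func sortAddresses(addrs []address) []address {
	tmp := make([]indexed, len(addrs))
	for i, a := range addrs {
		tmp[i] = indexed{addr: a, pos: i}
	}
	sort.Sort(byScore(tmp))
	out := make([]address, len(tmp))
	for i, t := range tmp {
		out[i] = t.addr
	}
	return out
}

func main() {
	addrs := []address{
		{Value: "example.internal", CloudLocal: true, Hostname: true},
		{Value: "203.0.113.5"},
		{Value: "10.0.0.1", CloudLocal: true},
		{Value: "10.0.0.2", CloudLocal: true},
	}
	for _, a := range sortAddresses(addrs) {
		fmt.Println(a.Value)
	}
}
```

The extra slice costs one allocation per call, which is usually negligible for the handful of addresses an instance reports.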