wallyworld | alexisb: free now if you are | 00:10 |
---|---|---|
wallyworld | anastasiamac: looks like you need to pull juju/testing etc to get latest versions | 00:11 |
alexisb | wallyworld, ok, brt | 00:12 |
anastasiamac | these elements that the build contains are in charm.v4 | 00:12 |
anastasiamac | but it looks like there is a conflict with gopkg.in/charm.v4 vs my github.com/juju/charm.v4?... have no idea how to resolve it | 00:13 |
anastasiamac | wallyworld: ^^ | 00:13 |
anastasiamac | i'll try pulling testing though I cannot see if/how it's related.. | 00:13 |
wallyworld | anastasiamac: the error indicates that juju/testing is out of date | 00:13 |
anastasiamac | k. thnx ;) | 00:14 |
anastasiamac | wallyworld: pulled testing but the error for charm.v4 is the same.. | 00:20 |
=== kadams54 is now known as kadams54-away | ||
katco | wallyworld: SHAM-WOW! http://reviews.vapour.ws/r/658/ | 00:25 |
thumper | wallyworld: https://bugs.launchpad.net/juju-core/+bug/1403689 | 00:42 |
mup | Bug #1403689: Server should store tools of unknown or unsupported series <upgrade-juju> <upload-tools> <juju-core:Triaged> <https://launchpad.net/bugs/1403689> | 00:42 |
wallyworld | thumper: just finished meeting, thanks, will look | 01:06 |
thumper | np | 01:06 |
wwitzel3 | ericsnow: ping, in moonstone | 01:44 |
thumper | ok, I'm turning distractions off and going to try to focus for an hour | 02:02 |
thumper | now if only the kids comply... | 02:02 |
* thumper switches music from random to heavy mix | 02:03 | |
mattyw | thumper, morning - mind if this kid distracts you for a moment? | 02:07 |
* thumper looks at mattyw | 02:07 | |
mattyw | thumper, I made this small change in juju/utils the other day - I'm not sure if it belongs there - but as it was small I thought - better to ask forgiveness than permission - what do you think? http://reviews.vapour.ws/r/634/ | 02:08 |
thumper | seems reasonable | 02:09 |
mattyw | thumper, do I $$merge$$ utils? | 02:13 |
thumper | I don't remember | 02:13 |
thumper | try it and if the bot doesn't say anything | 02:13 |
thumper | manually do it | 02:13 |
mattyw | thumper, thanks for the help - you can go back to <whatever it was you were doing> | 02:14 |
mattyw | thumper, me again - I don't have permission, can you hit merge for me? | 02:19 |
thumper | done | 02:19 |
mattyw | thanks very much | 02:19 |
jog | wallyworld, around? | 02:59 |
wallyworld | jog: in a meeting, can be with you soon | 03:00 |
jog | I'm considering setting bug 1396099 as a blocker on master, please see my latest comment when you're available | 03:01 |
mup | Bug #1396099: AWS/Joyent/manual/maas: juju deploy error "connection is shut down" <api> <ci> <deploy> <ec2-provider> <joyent-provider> <manual-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1396099> | 03:01 |
wwitzel3 | so in cloud-init.log I see some log lines that read writing /home/ubuntu/.ssh/authorized_keys .. but when I look in the file, the keys don't match my jenv. Any pointers on where to start poking around? | 03:05 |
jw4 | jog: can I send you some clickbait, or interesting wired articles or anything? | 03:05 |
jw4 | jog: just trying to buy some time before CI gets blocked | 03:06 |
jog | heh heh | 03:06 |
katco | davecheney: hey just reviewing some of your comments | 03:08 |
katco | davecheney: to maybe save you a bit of time: as this is refactoring, i don't want to change any existing code i don't have to touch | 03:09 |
katco | davecheney: e.g.: eitherState (which i agree is a little strange) | 03:09 |
davecheney | katco: if you don't clean it up now | 03:09 |
davecheney | then when ? | 03:09 |
davecheney | refactoring sounds like the perfect time to clean house | 03:10 |
katco | davecheney: i will be making changes to this area for awhile longer | 03:10 |
katco | davecheney: this piece is just so i can land some leadership functional tests | 03:10 |
katco | davecheney: once that's done, i will be circling back and giving this package some more TLC :) | 03:11 |
davecheney | sgtm | 03:12 |
katco | davecheney: i do appreciate your thoughtful reviews | 03:12 |
katco | davecheney: in fact, if you want to call anything out, but not open it as an issue, i'd love that so i can reference it | 03:12 |
mattyw | folks, the latest version of juju/utils contains a call to errors.UserNotFound https://github.com/juju/utils/blob/master/file_unix.go. But that call doesn't exist | 03:29 |
mattyw | I think we should just return err here - what does everyone else think | 03:30 |
menn0 | wallyworld, jog: I wonder if that's the new certupdater work. it can lead to the API server being restarted soon after the machine agent starts up. | 03:40 |
jw4 | mattyw: is errors.UserNotFound in a later version of juju/utils? | 03:41 |
jw4 | mattyw: can we just update | 03:41 |
menn0 | s/work/worker/ | 03:41 |
jw4 | mattyw: otherwise I'm in favor of just returning err too | 03:41 |
menn0 | wallyworld, jog: that issue has certainly been happening a lot in CI lately | 03:43 |
jog | yup, it's been a big problem for us lately | 03:44 |
mattyw | jw4, is in the latest version | 03:46 |
mattyw | jw4, you mean the latest version of juju/errors? | 03:47 |
mattyw | jw4, ah yes, it's in the latest version of juju errors - I thought I'd tried that, but apparently not | 03:48 |
jw4 | mattyw: I meant the latest version of juju/utils because I misunderstood you - but I'm glad you found it anyway :D | 03:50 |
ericsnow | davecheney: FYI, I've added a set.Ints (http://reviews.vapour.ws/r/659/) and updated the PortSet patch to use it (http://reviews.vapour.ws/r/617/) | 03:52 |
ericsnow | davecheney: thanks for the nudge | 03:52 |
wwitzel3 | ericsnow: nothing yet on why authorized_keys doesn't contain our keys | 03:52 |
ericsnow | wwitzel3: :( | 03:52 |
wwitzel3 | ericsnow: I did take care of the autoDelete though | 03:52 |
ericsnow | wwitzel3: sweet | 03:53 |
ericsnow | wwitzel3: I'm pretty sure once the connection issue is resolved we'll be bootstrapping on GCE!!! | 03:53 |
wwitzel3 | ericsnow: yeah, seems that way .. I'm going to login manually while the attempting to connect loops are running and add the key to authorized_keys and see if that fixes it | 03:54 |
ericsnow | wwitzel3: sneaky :) | 03:55 |
davecheney | ericsnow: ta | 03:55 |
ericsnow | davecheney: it was easier than expected :) | 03:55 |
wwitzel3 | ericsnow: yeah, the cloud-init.log file looks good, it appears to update the auth_keys file for both ubuntu and root | 03:55 |
wwitzel3 | ericsnow: but our keys never end up in there :( | 03:56 |
ericsnow | wwitzel3: ?!? | 03:56 |
ericsnow | wwitzel3: where does cloud-init get the stuff it's supposed to add? | 03:58 |
wwitzel3 | ericsnow: so far, no luck, I've added the keys, but the Attempting to connect is still just hanging. | 03:58 |
wwitzel3 | ericsnow: it is from the jenv for gce | 03:58 |
davecheney | thumper: i was trying to sort out my xmas leave | 03:58 |
davecheney | https://sites.google.com/a/canonical.com/operations/people-and-culture/dashboard | 03:58 |
davecheney | this page is now 404 | 03:59 |
ericsnow | wwitzel3: oh, right | 03:59 |
davecheney | where is the calendar that tells us what days we need to claim ? | 03:59 |
thumper | hr somewhere | 03:59 |
thumper | waigani: do you have that link somewhere? | 03:59 |
thumper | waigani: you had it the other day | 03:59 |
ericsnow | wwitzel3: what feeds that data to cloud-init on the new instance? | 03:59 |
thumper | waigani: the christmas leave page | 03:59 |
ericsnow | wwitzel3: I have a feeling it's that metadata :( | 04:00 |
wwitzel3 | ericsnow: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key | 04:00 |
waigani | thumper: https://sites.google.com/a/canonical.com/operations/people-and-culture/general?pli=1 | 04:00 |
thumper | davecheney: ^^^ | 04:01 |
thumper | waigani: ta | 04:01 |
wwitzel3 | ericsnow: it looks like we are connecting successfully, but that error is resulting in a disconnect | 04:02 |
ericsnow | wwitzel3: ah | 04:03 |
wwitzel3 | ericsnow: the keys still aren't getting populated correctly, but that error is also preventing us from connecting | 04:03 |
wwitzel3 | ericsnow: I'm going to keep swinging my bat around in this china shop for a while, I'll let you know what i come up with in the morning | 04:04 |
dimitern | ericsnow, hey | 04:04 |
ericsnow | wwitzel3: k | 04:04 |
ericsnow | dimitern: hey | 04:04 |
wallyworld | menn0: jog: sorry, just finished meeting | 04:04 |
dimitern | ericsnow, can you have a quick look at this please? http://reviews.vapour.ws/r/656/ | 04:04 |
ericsnow | dimitern: sure | 04:05 |
jog | wallyworld, just trying to decide what to do about bug 1396099 | 04:05 |
wallyworld | when juju starts, it will restart the state server api to accommodate newly known machine addresses | 04:05 |
mup | Bug #1396099: AWS/Joyent/manual/maas: juju deploy error "connection is shut down" <api> <ci> <deploy> <ec2-provider> <joyent-provider> <manual-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1396099> | 04:05 |
dimitern | ericsnow, thanks | 04:05 |
wallyworld | jog: if the CI scripts immediately connect to the state server api, they could come undone as the state server api will be restarted very shortly after the state server comes up | 04:06 |
wallyworld | this restart is necessary to accommodate a new server certificate being generated | 04:06 |
wallyworld | the certificate regeneration is needed to allow https connections over the state server IP addresses | 04:07 |
wallyworld | jog: is it plausible to add a small delay to the CI script? | 04:08 |
ericsnow | wallyworld: how long does it go before restarting after coming up? | 04:08 |
jog | wallyworld, yeah that sounds like what's happening, but if we add a sleep to the test script, customers who are also scripting will experience intermittent connection failures. | 04:08 |
bodie_ | could use a quick review on https://github.com/juju/charm/pull/81 if anyone has a few moments :) | 04:08 |
wallyworld | ericsnow: as soon as it learns about the state server machine addresses, not sure of the exact time | 04:08 |
* jog is actually testing a sleep now | 04:08 | |
bodie_ | a worthy Actions schema format | 04:08 |
dimitern | wallyworld, can't we use the upgrade blocker or something to block deployments until api server cert is regenerated? | 04:08 |
wallyworld | dimitern: it's a change listener, the addresses could change anytime | 04:09 |
davecheney | thumper: ta | 04:09 |
dimitern | wallyworld, hmm right | 04:09 |
wallyworld | dimitern: so before this, any https connection over the state server ip address would fail | 04:09 |
ericsnow | wallyworld: I've seen issues with the API client used in backups commands being good for only one request (and then the connection shows up as disconnected), so perhaps that's related | 04:09 |
ericsnow | wallyworld: ah | 04:10 |
wallyworld | ericsnow: the backup client, if run after the state server has fully come up, will not be affected by this | 04:10 |
ericsnow | wallyworld: yeah, that's what I gathered from what you just said :) | 04:11 |
wallyworld | well, by fail, i mean certificate verification would fail | 04:11 |
ericsnow | dimitern: LGTM | 04:13 |
dimitern | ericsnow, cheers | 04:13 |
dimitern | wallyworld, well, the api client could be made more robust I guess | 04:14 |
dimitern | wallyworld, I mean since this happens at every bootstrap the initial connection (or say first 3 attempts) might fail, but shouldn't be logged as errors | 04:15 |
dimitern | wallyworld, and the sleep could be in there, rather than in the ci script | 04:16 |
jw4 | ericsnow: thanks for that clarification - copied:= *meta... looked like a pointer assignment to me | 04:24 |
wallyworld | dimitern: the issue is that if it is really quick, a client might grab a connection which is then lost with the restart. we could look at delaying the state server start until after the first address change | 04:26 |
dimitern | wallyworld, that second part sgtm | 04:26 |
wallyworld | i'll raise a bug | 04:26 |
ericsnow | jw4: yep, * != &, but our brains don't handle that so well sometimes :) | 04:30 |
anastasiamac | ericsnow: this should be on a t-shirt :D | 04:34 |
jw4 | ericsnow: :-p | 04:35 |
jw4 | ericsnow: yours did | 04:35 |
ericsnow | jw4: sure, this time... :) | 04:36 |
jw4 | hehe | 04:36 |
wallyworld | jam1: hi, i see you guys did some work on the status spec. i made a couple of v1 compatibility comments near the top. i also see you didn't like "broken", which is to blocked as busy is to waiting | 05:07 |
jam1 | wallyworld: so it isn't entirely, we did discuss it a bit | 05:08 |
jam1 | specifically, Broken is actually "come look and fix something with *me*" | 05:08 |
jam1 | which is what we had as broken | 05:08 |
wallyworld | yes | 05:08 |
wallyworld | just like busy, which is waiting on me | 05:08 |
jam1 | (yes, it means you need to relate me to something else, but that distinction vs you need to fix my config isn't very compelling) | 05:08 |
jam1 | wallyworld: so still, not exactly | 05:08 |
jam1 | sorry, I had a typo | 05:09 |
wallyworld | not just my config, could also be disk space etc | 05:09 |
jam1 | *blocked* is come look at me | 05:09 |
wallyworld | doesn't matter, we can go with what's there | 05:09 |
wallyworld | i see there's a lot of extra unit states | 05:09 |
jam1 | wallyworld: so I would like you to understand how we put the rationale together, I realize in the scale of things the specifics aren't huge, but there was a fair discussion and I was happy with how the mapping worked out. | 05:11 |
wallyworld | i think i disagree that error shouldn't be on agent-state, as just because a hook fails, doesn't mean that the software isn't running, but you guys would have discussed that | 05:11 |
jam1 | wallyworld: so one guiding thing there is that we want to move to where we have 1 Juju agent for the machine and all its units | 05:11 |
jam1 | not 1 unit-agent per unit | 05:11 |
wallyworld | yes, that's true | 05:12 |
jam1 | wallyworld: so you do need a place to say "this unit failed its hook", without saying that all units are dead | 05:12 |
wallyworld | so error is best done on unit then | 05:12 |
wallyworld | fair point | 05:12 |
jam1 | wallyworld: IIRC there is *also* error on the agent for compat | 05:12 |
jam1 | and then we drop that in favor of "failed" | 05:12 |
jam1 | when the agent itself is unresponsive | 05:12 |
wallyworld | yes | 05:12 |
wallyworld | so you agree with my v1 compat comments? | 05:13 |
wallyworld | s/do/do | 05:13 |
wallyworld | s/so/do | 05:13 |
jam1 | I haven't gotten to them yet, just chatting with you here | 05:13 |
wallyworld | sure, sorry | 05:13 |
jam1 | wallyworld: np. So Mark feels strongly that we can drop pending, because nobody is actually depending on it, but we'll keep Started and Error | 05:14 |
jam1 | We don't need Installed, because nothing stayed there very long | 05:14 |
jam1 | Down is a fair point, though | 05:14 |
wallyworld | i figured that would be the case for Installed, just wanted to be 10000% sure | 05:14 |
wallyworld | i wasn't sure if we wanted to be really anal about keeping 100% compat | 05:15 |
wallyworld | i think we need to keep Stopped also | 05:15 |
wallyworld | s/need/should | 05:15 |
wallyworld | as we don't know who has scripts that depend on it | 05:16 |
jam1 | wallyworld: this is agent-state stopped, right? Are we actually able to depend on it? We can run it by fwereade or someone, but I thought Stopped only existed as long as the Database hadn't cleaned up that unit yet | 05:16 |
jam1 | We may need to keep it for the same purpose, though. | 05:17 |
wallyworld | jam1: yes, the current agent-state Stopped | 05:17 |
wallyworld | we might be able to safely drop it, as for Installed, just want to be sure | 05:17 |
bodie_ | I'd really like to land this so TheMue can land the changes in master tomorrow: https://github.com/juju/charm/pull/81 | 05:49 |
bodie_ | it's a pretty simple change which tweaks the way Charm parses actions schemas from yaml | 05:49 |
wwitzel3 | so when cloud-init writes the authorized_keys file, where does it get the information it writes in to there from? | 05:50 |
bodie_ | s/tomorrow/today | 05:51 |
bodie_ | (but a huge usability improvement from the charm author's perspective) | 05:52 |
dimitern | bodie_, looking | 05:52 |
jog | wallyworld, looks like something with 24c1b80d is affecting upgrades across multiple substrates | 05:53 |
bodie_ | dimitern, thanks! | 05:53 |
wallyworld | let me look at what the rev is | 05:53 |
wallyworld | jog: this PR https://github.com/juju/juju/pull/1291 ? | 05:54 |
jog | wallyworld, yes | 05:56 |
wallyworld | i have no knowledge of that work, what do the CI logs say is the problem? | 05:56 |
jog | wallyworld, joyent, aws, hp, azure, maas, KVM, ... all timing out after waiting 10 minutes for juju status after the upgrade... so at a minimum the time for an upgrade to complete has increased | 06:01 |
=== kadams54 is now known as kadams54-away | ||
wallyworld | wouldn't surprise me if it's more than that, more likely to be a breakage, the juju state server upgrade and machine log would be helpful | 06:02 |
wwitzel3 | so in the cloud-init log I see 014-12-18 05:51:56,497 - util.py[DEBUG]: Writing to /home/ubuntu/.ssh/authorized_keys - wb: [384] 1416 bytes | 06:03 |
wwitzel3 | which implies that it is writing the keys, but when I look at that file, there are only the keys from the provider | 06:03 |
wwitzel3 | I wonder if Google is overwriting them after we write them .. | 06:03 |
wallyworld | could be, i know nothing about gce | 06:04 |
wwitzel3 | hrmm actually I think maybe I am just not writing them to the proper place in the metadata | 06:06 |
jog | wallyworld, I can attach log or if you want to look sooner one instance is under the artifacts here: http://juju-ci.vapour.ws:8080/job/aws-upgrade-precise-amd64/2176/ | 06:06 |
wwitzel3 | but then what is cloud-init writing? .. hrmm | 06:06 |
wallyworld | jog: looking | 06:07 |
jog | wallyworld, this looks like the same issue menn0 fixed yesterday " login blocked because upgrade is in progress" | 06:10 |
wallyworld | i wasn't aware of that fix - do the logs look the same or similar? | 06:11 |
wallyworld | i can see a test fix | 06:13 |
wallyworld | the logs sure do have a lot of terminated connections | 06:15 |
jam1 | fwereade: ping for where you're at with active/goal | 06:15 |
dimitern | bodie_, reviewed | 06:16 |
bodie_ | dimitern, thanks, have been following along making a few changes :) | 06:16 |
bodie_ | I really ought to call it a day though | 06:16 |
bodie_ | I think most of this should be very straightforward -- can you sync up with TheMue? I have to get up early to travel in the morning | 06:17 |
bodie_ | he offered to pick it up since I'm leaving town | 06:17 |
bodie_ | or... expect him to ping you back in there | 06:17 |
jog | wallyworld, this was yesterdays fix https://github.com/mjs/juju/commit/f22f2f07ace804fbce81b66bfe938439a6878a29 | 06:17 |
bodie_ | or something which works well and makes everyone happy ;) | 06:17 |
dimitern | bodie_, ok, feel free to land this, but I'd like you to address the suggestions in a follow-up if you don't mind? | 06:17 |
dimitern | bodie_, I will sync up with TheMue | 06:19 |
wallyworld | jog: could be related, but doesn't seem like it. i can't see off hand from the logs what the issue is. more detailed investigation is required | 06:20 |
wallyworld | are these upgrade failures intermittent? | 06:20 |
dimitern | wallyworld, I might be able to help there, let me have a look at the logs | 06:21 |
wallyworld | ty :-) | 06:21 |
jog | wallyworld, dimitern nearly all substrates started failing upgrade tests with pull 1291... so not intermittent, rather very consistent | 06:24 |
dimitern | jog, why is the job destroying the environment? http://juju-ci.vapour.ws:8080/job/aws-upgrade-precise-amd64/2175/console after upgrade | 06:25 |
* wallyworld bbiab | 06:25 | |
dimitern | jog, this seems fishy 2014-12-18 03:27:58 INFO juju.provider.common destroy.go:15 destroying environment "aws-upgrade-precise-amd64" | 06:26 |
jog | it waits 10 minutes checking 'juju status' and then gives up and destroys the environment, so the resources are available for the next test | 06:26 |
dimitern | jog, ah, ok | 06:26 |
jog | dimitern, wallyworld, I think we should block on this, it might be harder to figure out if additional code lands | 06:28 |
bodie_ | dimitern, that sounds perfect, much appreciated! | 06:36 |
dimitern | jog, looking at the logs so far it seems the upgrade was completed on machine-0, but the upgrade block wasn't lifted | 06:39 |
jam1 | fwereade: poke when you're awake | 06:43 |
wwitzel3 | man, it got late. We are so close to getting GCE to bootstrap. | 06:44 |
dimitern | wwitzel3, \o/ | 06:46 |
bodie_ | dimitern, I'm going to go ahead and ship the charm fix, and open a new PR with the requested changes referenced. I also have a branch for updating tests on master to reflect this stuff | 06:46 |
dimitern | bodie_, sweet, thanks | 06:48 |
dimitern | jog, this is the issue: cannot set agent version for machine 0: not found or dead | 06:54 |
dimitern | jog, and machine-0 is obviously alive and well, so something around env-uuid changes in state recently does not work properly | 06:54 |
bodie_ | fwiw, the PR to fix the charm actions parsing in juju master is http://reviews.vapour.ws/r/661/ if anyone fancies having a look | 06:55 |
jog | dimitern, I opened bug https://bugs.launchpad.net/juju-core/+bug/1403738 | 06:55 |
mup | Bug #1403738: upgrade tests fail on multiple substrates with revision 24c1b80d <juju-core:Triaged> <https://launchpad.net/bugs/1403738> | 06:55 |
jog | dimitern, do you need anything else from me? If not my day is long over. | 07:07 |
dimitern | jog, can you re-run one of the failing jobs, but using logging-config: <root>=TRACE in envs.yaml ? | 07:07 |
dimitern | jog, with <root>=INFO we're practically losing all context during the upgrade - all upgrade jobs should run with at least logging-config: <root>=DEBUG | 07:08 |
jog | ok | 07:09 |
jam1 | wallyworld: fwereade: we feel pretty good about where the status spec is at (you should have gotten an email). So comments are welcome. | 07:09 |
wallyworld | jam1: ty, will look | 07:10 |
jam1 | wallyworld: are you around at all during the break? | 07:15 |
jam1 | I feel like it might be good to have a hangout to discuss finer points, but I know everyone is officially not-working | 07:15 |
wallyworld | jam1: sure, i can be available, what time is the break? | 07:17 |
jam1 | wallyworld: I mean Holiday break | 07:18 |
jam1 | eg, next week | 07:18 |
wallyworld | jam1: oh, right :-) that will be fine too | 07:18 |
wallyworld | maybe one evening my time, afternoon your time, which will be midday for william | 07:19 |
jam1 | wallyworld: k, I know I'm out of town from 25-1st, but I'll have Monday/Tues that I'm just relaxing around the house | 07:19 |
wallyworld | ok, maybe aim for monday depending on william's availability? | 07:19 |
=== urulama|out is now known as urulama | ||
wallyworld | fwereade: you free for a catchup in 10? | 07:22 |
dimitern | jog, updated the bug | 07:38 |
wallyworld | jam1: storage phase 1 spec has openstack cinder volumes in scope, yet the in scope providers are listed as maas, local, aws. i thought openstack was out of scope for phase 1 | 08:36 |
wallyworld | jam1: you may have missed my message when your connection bounced | 08:42 |
wallyworld | storage phase 1 spec has openstack cinder volumes in scope, yet the in scope providers are listed as maas, local, aws. i thought openstack was out of scope for phase 1 | 08:42 |
jam1 | wallyworld: thanks for the heads up, the network here likes to stay up for approximately 2min before needing to be reset… | 09:04 |
jam1 | I did try to flag that with a comment, can you make sure there is a note if mine didn't go through ? | 09:04 |
wallyworld | sure, will do, just about to be called for dinner, will do it straight after | 09:04 |
TheMue | morning | 09:26 |
voidspace | TheMue: o/ | 09:32 |
=== rogpeppe3 is now known as rogpeppe | ||
voidspace | dimitern: ping | 09:48 |
dimitern | voidspace, pong | 09:52 |
voidspace | dimitern: hey, hi | 09:54 |
voidspace | dimitern: davecheney suggests that network.SubnetInfo should use net.IP for the AllocatableIPLow and High | 09:54 |
voidspace | dimitern: what do you think? | 09:55 |
dimitern | voidspace, sgtm | 09:55 |
voidspace | dimitern: we have to convert back to strings where we *use them* | 09:55 |
voidspace | dimitern: in state and on the wire | 09:55 |
voidspace | dimitern: and it doesn't save us validation as constructing an IP doesn't return an error (you have to check for a nil value) | 09:55 |
voidspace | so I'm not sure what it buys, beyond more conversions | 09:56 |
voidspace | dimitern: TheMue: little girl just woken up - neighbour has agreed to babysit (wife out), but I have to set that up | 09:58 |
voidspace | dimitern: TheMue: will take a few minutes, so will be late to standup again... sorry | 09:58 |
TheMue | voi | 09:59 |
TheMue | voidspace: ok | 09:59 |
rogpeppe | is a unit's public-address also supposed to be accessible from within an environment? | 10:05 |
rogpeppe | dimitern: ^ | 10:05 |
voidspace | TheMue: dimitern: babysitter here, omw | 10:07 |
dimitern | rogpeppe, you mean like in ec2 automatic public ips ? | 10:07 |
rogpeppe | the question is really: is it reasonable to have a single address for a service endpoint that works both within the environment (from unit to unit) and from outside it? | 10:07 |
dimitern | rogpeppe, short answer - it depends | 10:07 |
rogpeppe | dimitern: i mean the public-address as reported by the unit-get public-address charm tool | 10:08 |
dimitern | rogpeppe, in joyent for example you can, but not in ec2 or openstack (depends on how floating ips are configured) | 10:08 |
rogpeppe | dimitern: i thought it worked ok in ec2 as the public-address resolves correctly whether you're inside or outside the cloud | 10:09 |
voidspace | TheMue: dimitern: struggling to join... | 10:09 |
rogpeppe | dimitern: but perhaps i'm misremembering? | 10:09 |
voidspace | "Trying to join the call. Please wait..." | 10:09 |
rogpeppe | dimitern: i'm more concerned with the intra-environment behaviour here | 10:10 |
dimitern | rogpeppe, you're talking about different things | 10:10 |
rogpeppe | dimitern: as i already know that public-address might not be accessible from outside the env - that's an issue we always need to deal with | 10:10 |
dimitern | rogpeppe, the ec2 instance dns name resolves to internal ip in ec2 or to the public ip outside | 10:10 |
rogpeppe | dimitern: and that's what we report for public-address, right? | 10:11 |
rogpeppe | dimitern: or has that changed? | 10:11 |
rogpeppe | perhaps i should phrase the question like this: can i be sure that a unit can connect to another unit's public-address as well as its private-address? | 10:12 |
rogpeppe | voidspace, dimitern: FWIW, I'm +1 on using net.IP when we know we've got IP addresses | 10:13 |
rogpeppe | voidspace: and net.ParseIP does validate, even though it doesn't return an explicit error | 10:14 |
dimitern | rogpeppe, it depends on the provider | 10:16 |
rogpeppe | dimitern: hmm, that's not great | 10:17 |
dimitern | rogpeppe, in standup now, i'll get back to you in a bit | 10:18 |
rogpeppe | dimitern: ta | 10:19 |
rogpeppe | another network-related question: is it possible for a unit to find out the public address of another unit that it's related to? | 10:45 |
rogpeppe | dimitern, fwereade: ^ | 10:45 |
dimitern | rogpeppe, so you can rely on a split horizon dns name to work internally and externally in ec2 and openstack (if so configured) | 10:46 |
dimitern | rogpeppe, not automatically - via relation settings | 10:47 |
rogpeppe | dimitern: and public-address will always return a split-horizon dns name | 10:47 |
rogpeppe | ? | 10:47 |
dimitern | rogpeppe, I wouldn't say always | 10:47 |
rogpeppe | dimitern: ok, so i can't rely on this at all then? | 10:47 |
dimitern | rogpeppe, in juju status - most likely, as unit.PublicAddress - if set by the provider (IIRC some providers were explicitly changed to either return dns name or ip for various reasons) | 10:48 |
rogpeppe | dimitern: my situation is that i have a web service that returns some data to the client which includes the address of another service for the client to connect to. | 10:49 |
rogpeppe | dimitern: i'd like that client to work correctly whether it's inside the environment or outside it | 10:49 |
dimitern | rogpeppe, right | 10:49 |
rogpeppe | dimitern: it currently looks like that's not possible without returning more than one address | 10:49 |
dimitern | rogpeppe, yes | 10:50 |
rogpeppe | dimitern: and that a standard http relation isn't sufficient to find out the public address | 10:50 |
dimitern | rogpeppe, hopefully this will change as more networking model stuff lands | 10:50 |
rogpeppe | dimitern: i'm guessing that it will only change if this specific requirement is on the roadmap | 10:50 |
dimitern | rogpeppe, alternatively, you can use custom networking config inside the charm | 10:51 |
rogpeppe | dimitern: how would that work? | 10:51 |
dimitern | rogpeppe, so your webservice charm returns this info via relation settings? | 10:52 |
rogpeppe | dimitern: no, via http | 10:52 |
dimitern | rogpeppe, ok, so the webservice is not running inside a charm? | 10:53 |
rogpeppe | dimitern: it's part of the http API that it exposes | 10:53 |
rogpeppe | dimitern: yes, it is running inside a charm | 10:53 |
rogpeppe | dimitern: all the services here are running as charms, possibly excluding the client | 10:53 |
dimitern | rogpeppe, and you want to return a single hostname/ip that's usable both internally and externally? | 10:53 |
rogpeppe | dimitern: ideally, yes | 10:54 |
dimitern | rogpeppe, so first, the only way I can think of is to use a split horizon dns name | 10:55 |
rogpeppe | dimitern: but i can't rely on that, right? | 10:55 |
dimitern | rogpeppe, and it needs to be supported either by the cloud itself (like ec2) or by another exposed service running a dns server | 10:55 |
dimitern | rogpeppe, not right now, because the addresses we store in state are not in a single place | 10:57 |
dimitern | rogpeppe, that's why it's unreliable - unit.PublicAddress() returns whatever its assigned machine's Addresses() method returns | 10:58 |
rogpeppe | dimitern: an arbitrary selection, presumably? | 10:58 |
rogpeppe | dimitern: because Addresses can return many addresses | 10:59 |
rogpeppe | dimitern: i think i'll just go with returning several addresses to the client and relying on them to try all of them | 10:59 |
dimitern | rogpeppe, nope | 11:00 |
rogpeppe | dimitern: anything else i think is being unreasonably optimistic/platform-dependent | 11:00 |
dimitern | rogpeppe, since a recent change I made all addresses are consistently ordered - public ips before hostnames, then cloud-local, etc. | 11:00 |
rogpeppe | dimitern: ok, so it's still an arbitrary selection but at least a stable choice, then? | 11:01 |
dimitern | rogpeppe, however the uncertainty still exists, as those addresses are merged from the instance addresses (coming from the provider) and machine ones (as discovered by the net package) | 11:01 |
dimitern | rogpeppe, it is stable yes | 11:01 |
rogpeppe | dimitern: hmm, so that means that IP addresses are always chosen over host names? | 11:01 |
rogpeppe | dimitern: so you'll never get a split-horizon DNS name? | 11:02 |
dimitern | rogpeppe, and provider addresses always shadow machine ones - so if the provider (e.g. like maas) adds dns names in addition to ips in response of calling instance.Addresses() - you'll get those, then machine addrs in a single list, ordered | 11:03 |
dimitern | rogpeppe, effectively, since that change it's even less likely (can only happen if there are only hostnames) | 11:03 |
rogpeppe | dimitern: so i'm right that it'll never return a DNS name when an IP address is available? | 11:04 |
rogpeppe | dimitern: "it" == "unit-get public-address" | 11:04 |
dimitern | rogpeppe, but this can change - preferring ips over hostnames was a requirement for api endpoints, but for intra-environment communication can be different | 11:04 |
rogpeppe | dimitern: in this case, this is about from-the-outside access to the environment | 11:05 |
dimitern | rogpeppe, it depends what ip - if it's public, yes - it will always come before any hostnames; if cloud-local - hostnames come first | 11:06 |
rogpeppe | dimitern: aren't ip addresses less stable than dns names? | 11:06 |
dimitern | rogpeppe, in maas'es case for example we have "vm0.maas" "192.168.10.1" and "127.0.0.1" - hostname will be chosen | 11:07 |
dimitern | rogpeppe, ips are absolute (much more so at least than hostnames that can resolve to anything) | 11:07 |
rogpeppe | dimitern: yeah, but ip addresses can be on short-term lease | 11:07 |
rogpeppe | dimitern: in this case i need a stable address that can be used to contact a service | 11:08 |
dimitern | rogpeppe, that's rarely an issue | 11:08 |
dimitern | rogpeppe, I agree | 11:08 |
dimitern | rogpeppe, and can suggest raising a bug about it :) | 11:08 |
rogpeppe | dimitern: ok | 11:09 |
dimitern | rogpeppe, i.e. have a way to get a hostname if possible for public-address | 11:09 |
rogpeppe | dimitern: it would be quite nice if a unit was able to get the public address of a related unit as well as its private address too | 11:10 |
dimitern | rogpeppe, it still can - if the remote unit sets its address into the relation settings | 11:11 |
dimitern | rogpeppe, also, there might not be a public address to get | 11:12 |
rogpeppe | dimitern: also, this means that "unit-get public-address" in ec2 will never return an address that is reachable from within the environment, right? | 11:12 |
dimitern | rogpeppe, e.g. in maas all ips are cloud-local | 11:12 |
rogpeppe | dimitern: that's true. i'm thinking that public-address is provided by default (when available) along with private-address | 11:13 |
dimitern | rogpeppe, that's like this since several months now actually - after DNSName got dropped from Environ | 11:14 |
rogpeppe | dimitern: just to be clear, there's currently no way to obtain the ec2 split-horizon DNS name within juju, right? | 11:14 |
dimitern | rogpeppe, ec2 now only reports ips (public and private) | 11:14 |
dimitern | rogpeppe, there are lots of ways :) - fetching a metadata url from the charm for example | 11:15 |
dimitern | rogpeppe, but not a "usual" way | 11:15 |
rogpeppe | dimitern: yeah, i don't want to write ec2-specific code in my charm | 11:16 |
rogpeppe | dimitern: that kinda loses the point of juju | 11:16 |
dimitern | rogpeppe, i can't recall what was the reasoning behind removing DNSName() from Environ | 11:16 |
dimitern | rogpeppe, true, i'm not suggesting it seriously | 11:17 |
rogpeppe | dimitern: i guess one way forward would be to expose a way to get all a unit's addresses from a unit, (and possibly from a related unit) | 11:17 |
dimitern | rogpeppe, yeah - a client api call "get all unit addresses" | 11:20 |
rogpeppe | dimitern: i'm actually thinking of: unit-get addresses | 11:21 |
dimitern | rogpeppe, which then can be called from a hook tool that does that | 11:21 |
rogpeppe | dimitern: and relation-get addresses | 11:21 |
dimitern | rogpeppe, the address-get hook tool that's on the roadmap for 15.04 will do this | 11:22 |
rogpeppe | dimitern: the client side is another question - the only way to get a unit's address is through status currently, right? | 11:22 |
dimitern | rogpeppe, by default it will return a single address; you can specify -r <rel-id> -maybe --all as well | 11:22 |
rogpeppe | dimitern: ah, replacing unit-get ? | 11:23 |
dimitern | rogpeppe, yep - unit-get will only live for backwards compatibility but will get dropped and aliased to something else in the mean time (most likely address-get) | 11:23 |
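The ordering dimitern describes above (public IPs first, then hostnames, then cloud-local IPs, then machine-local ones) can be sketched as a stable sort on a per-address rank. This is only an illustration and not juju's implementation: `rank` and `orderAddresses` are hypothetical names, and classifying scope by "parses as an IP / is loopback / is in a private range" is an assumption made for the example.

```go
package main

import (
	"fmt"
	"net"
	"sort"
)

// rank is a hypothetical scoring of an address string.
// Lower ranks sort first: public IP, hostname, cloud-local IP, machine-local IP.
func rank(addr string) int {
	ip := net.ParseIP(addr)
	switch {
	case ip == nil:
		return 1 // not an IP literal, so treat it as a hostname
	case ip.IsLoopback():
		return 3 // machine-local
	case ip.IsPrivate():
		return 2 // cloud-local (private address ranges)
	default:
		return 0 // public IP
	}
}

// orderAddresses returns a sorted copy; the stable sort keeps the
// original relative order of addresses that share a rank.
func orderAddresses(addrs []string) []string {
	out := append([]string(nil), addrs...)
	sort.SliceStable(out, func(i, j int) bool {
		return rank(out[i]) < rank(out[j])
	})
	return out
}

func main() {
	// The maas case from the discussion: only cloud-local IPs, so the hostname wins.
	fmt.Println(orderAddresses([]string{"127.0.0.1", "192.168.10.1", "vm0.maas"}))
	// With a public IP present (a documentation-range address here), it comes first.
	fmt.Println(orderAddresses([]string{"vm0.maas", "192.168.10.1", "203.0.113.10"}))
}
```

Picking element 0 of the ordered list then gives a stable, if still somewhat arbitrary, choice of "the" public address.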
fwereade | mm, heston blumenthal mass-produces a pretty decent christmas cake | 11:50 |
rogpeppe | fwereade: :) | 12:05 |
* rogpeppe has mince pies downstairs | 12:05 | |
rogpeppe | dimitern: as it turns out the public IP address also works ok internally in ec2 | 12:07 |
dimitern | rogpeppe, nice! :) | 12:08 |
jam2 | fwereade: sounds yummy. feeling better today ? | 12:14 |
fwereade | jam2, yeah, more-or-less with it again | 12:15 |
voidspace | dimitern: fancy giving my PR another quick look-over (including string to net.IP change) | 12:29 |
voidspace | dimitern: http://reviews.vapour.ws/r/644/ | 12:29 |
dimitern | voidspace, sure thing | 12:30 |
dimitern | voidspace, great, just 1 typo | 12:35 |
dimitern | well, not a typo - more like an omission | 12:35 |
voidspace | dimitern: ok, cool - thanks | 12:36 |
voidspace | dimitern: hmmm... looks like the change to SubnetInfo to use net.IP instead of string is causing a panic | 12:44 |
voidspace | dimitern: ... Panic: runtime error: hash of unhashable type network.SubnetInfo (PC=0x414676) | 12:45 |
voidspace | dimitern: investigating | 12:45 |
voidspace | dimitern: hah, jc.SameContents is panicking trying to compare | 12:45 |
dimitern | voidspace, use jc.DeepEquals instead? | 12:46 |
voidspace | dimitern: I presume SameContents is being used to not be order dependent | 12:47 |
voidspace | dimitern: (this is a pre-existing test) | 12:47 |
voidspace | dimitern: I'll try it though | 12:47 |
voidspace | it works | 12:47 |
voidspace | ShipIt! | 12:47 |
dimitern | voidspace, it's used for maps (or was it slices?) but works for gcc-go (ppc) and golang-go | 12:53 |
dimitern | voidspace, \o/ | 12:54 |
wwitzel3 | ericsnow: bug #1403662 lgtm, so I'm going to prepare the commit to backout the hack we have working around it | 14:23 |
wwitzel3 | ericsnow: also I still didn't get the ssh issue solved .. even with explicitly adding the sshKeys to the GCE metadata. | 14:24 |
wwitzel3 | ericsnow: it is like there is a step we are missing | 14:24 |
wwitzel3 | ericsnow: ok, so I have the keys being properly uploaded to GCE, now I'm just trying to resolve this "error: Could not load host key: /etc/ssh/ssh_host_ed25519_key" | 14:51 |
wwitzel3 | ericsnow: that is the error the juju ssh client keeps disconnecting on | 14:51 |
LinstatSDR | Good morning all. | 14:52 |
wwitzel3 | ericsnow: looks like we are running in to this bug https://bugs.launchpad.net/cloud-init/+bug/1382118 | 14:55 |
mup | Bug #1382118: Cloud-init doesn't support SSH ed25519 keys <cloud-init:New> <https://launchpad.net/bugs/1382118> | 14:55 |
ericsnow | wwitzel3: looks like you're on to something | 15:13 |
natefinch | well that looks less than promising | 15:15 |
ericsnow | natefinch: it does mean we are nearly done with bootstrapping (I think) | 15:20 |
natefinch | ericsnow: awesome. Any idea if that bug is going to be a major problem for you guys? | 15:21 |
ericsnow | natefinch: I imagine so but wwitzel3 may have a better idea of it | 15:22 |
wwitzel3 | natefinch: I don't know much about cloud-init .. can we send it a set of custom commands to run for us? | 15:28 |
wwitzel3 | natefinch: we either need the sshd_config on the instance to remove the line expecting ed25519 keys or we need to issue a ssh-keygen to create the expected key file. | 15:29 |
natefinch | man I wish we used github for releases.... trying to figure out what code is in what release in launchpad is a huge pain | 15:39 |
perrito666 | natefinch: tags dont have the same name as releases in lp? | 15:40 |
natefinch | perrito666: ha, I missed we had tags on juju... it's a separate tab in the list of branches. Thanks for that. | 15:41 |
=== kadams54 is now known as kadams54-away | ||
=== kadams54-away is now known as kadams54 | ||
ericsnow | fwereade: PTAL: http://reviews.vapour.ws/r/663/ | 16:29 |
ericsnow | fwereade: that drops the AZ from unit-get and adds it as an env variable | 16:30 |
ericsnow | fwereade: let me know if that's what you had in mind | 16:30 |
fwereade | ericsnow, catching up with the related code: I'm wondering (1) why we set context.availabilityZone in factory.updateContext, especially given the doc comment on updateContext and (2) why there aren't any tests for it in factory_test.go? | 16:34 |
ericsnow | fwereade: I'll take a look | 16:36 |
ericsnow | fwereade: FWIW, a lot of what I did for availability zones was following the precedent of what I considered to be the similar fields | 16:38 |
fwereade | ericsnow, hmm, what are the untested factory/context bits? I made an effort to fix them -- not saying I *did* catch them all, but I'd like to know so I can fix them ;) | 16:40 |
ericsnow | fwereade: oh, I probably just missed adding the AZ-related tests | 16:42 |
ericsnow | fwereade: if I see something missing I'll let you know :) | 16:42 |
katco | gah, trunk is currently closed? | 16:42 |
katco | the room's title lies! | 16:42 |
fwereade | ericsnow, I don't want to claim particular superiority of the existing code, though: in particular all the tests that directly construct a *HookContext are Bad Tests, and they're only like that because I was [focused on delivering business value elsewhere|too damn lazy] (strike out whichever does not apply) | 16:42 |
katco | we really should modify mup to report CI status. | 16:43 |
natefinch | katco: the only thing that never lies is this page: https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.importance%3Alist=CRITICAL&field.tag=ci+regression+&field.tags_combinator=ALL | 16:43 |
fwereade | ericsnow, (context: I added Factory and moved NewHookContext into export_test.go, but didn't fix all the tests that used it) | 16:43 |
* natefinch has that labeled in his bookmarks toolbar as "CI Blockers" | 16:43 | |
katco | it seems like this question comes up CONSTANTLY and from different people | 16:44 |
katco | it's clearly something we need to make more clear somehow | 16:44 |
natefinch | katco: yeah, I don't know enough about launchpad to know if there's some way to create a status page or something | 16:44 |
natefinch | katco: rather than a random filter of existing bugs | 16:45 |
katco | natefinch: there is; i wrote some emacs lisp to interface w/ launchpad | 16:45 |
katco | natefinch: ah, yeah that's probably what it would be: does this query return anything? blocked. no? not blocked. | 16:45 |
ericsnow | fwereade: got it | 16:45 |
katco | and then have mup periodically do that | 16:45 |
katco | and change the room topic | 16:45 |
katco | and maybe announce it | 16:46 |
katco | ~1m or so | 16:46 |
natefinch | please no announcements every minute :) | 16:46 |
katco | no i mean if it changes haha | 16:46 |
katco | "CI IS OPEN AND ALL IS WELL! HEAR YE HEAR YE!" | 16:47 |
fwereade | haha | 16:48 |
katco | ^ this guy gets it! | 16:48 |
natefinch | it would be nice, but not really sufficient, I think. It's easy to miss stuff in IRC, and you don't see the room title except on login.... I'd much rather have a webpage with a URL I can at least attempt to remember that I can point people to. | 16:48 |
katco | natefinch: +1; but fwiw, the topic is always up for me and i can set notifications on keywords/people | 16:49 |
voidspace | natefinch: if you use xchat you see the room title all the time | 16:49 |
katco | natefinch: but a website would be a great first step. i don't even care if it's blank and the background changes from #00FF00 to #FF0000 | 16:50 |
natefinch | Oh, yeah, there it is at the top of the screen... | 16:50 |
katco | and it should probably be hooked into reports.vapour or w/e | 16:50 |
natefinch | it's just like a foot above where I usually look on my IRC window | 16:50 |
voidspace | hah... | 16:50 |
voidspace | natefinch: you remember the question I asked you about netmasks the other day? How to work out the last ip in a subnet. | 16:51 |
voidspace | natefinch: I was taking the number of zeros in the netmask, then OR'ing (2 ** numZeros -1) with the first IP | 16:51 |
voidspace | natefinch: which works | 16:52 |
voidspace | natefinch: but instead you can do 1 << numZeros | 16:52 |
voidspace | natefinch: which gives you the number of IPs in the subnet | 16:52 |
natefinch | voidspace: ahh, bit twiddling | 16:52 |
voidspace | natefinch: and is a bit more elegant (then add that to the first IP) | 16:52 |
voidspace | natefinch: yeah, fun | 16:52 |
natefinch | voidspace: evidently useful for more than CS101 and job interviews ;) | 16:52 |
voidspace | hehe | 16:53 |
voidspace | and crypto | 16:53 |
natefinch | voidspace: well, I guess... though if you're writing your own bit twiddling code for crypto, you're probably doing it wrong. | 16:54 |
voidspace | natefinch: yeah probably | 16:54 |
natefinch | voidspace: but I get your meaning. Certainly bit twiddling is useful in many circumstances... I was mostly joking :) | 16:55 |
voidspace | natefinch: :-) | 16:55 |
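The netmask arithmetic voidspace describes can be sketched as follows, for IPv4 only. `1 << numZeros` gives the number of addresses in the subnet; the sketch assumes the last address is the first address plus that count minus one, since adding the full count would land on the first address of the next subnet. `lastIPv4` is a hypothetical helper name.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

// lastIPv4 returns the highest address in an IPv4 subnet and the
// number of addresses the subnet contains.
func lastIPv4(cidr string) (net.IP, uint64, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, 0, err
	}
	first := ipnet.IP.To4()
	ones, bits := ipnet.Mask.Size()
	if first == nil || bits != 32 {
		return nil, 0, fmt.Errorf("%q is not an IPv4 subnet", cidr)
	}

	numZeros := uint(bits - ones)  // host bits: the zeros in the netmask
	count := uint64(1) << numZeros // number of addresses in the subnet

	// Treat the first (network) address as a 32-bit integer and add count-1.
	base := binary.BigEndian.Uint32(first)
	last := make(net.IP, net.IPv4len)
	binary.BigEndian.PutUint32(last, base+uint32(count-1))
	return last, count, nil
}

func main() {
	for _, cidr := range []string{"192.168.10.0/24", "10.0.0.0/30"} {
		last, count, err := lastIPv4(cidr)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %d addresses, last is %s\n", cidr, count, last)
	}
}
```

Because the host bits of the network address are all zero, adding `count-1` gives the same result as OR'ing with `2**numZeros - 1`, the first approach mentioned above.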
dimitern | ericsnow, hey | 17:05 |
ericsnow | dimitern: what's up? | 17:05 |
dimitern | ericsnow, mattyw asked a question today about which repos are handled by the automatic RB diff creation from PRs | 17:06 |
ericsnow | dimitern: currently just core and utils | 17:06 |
dimitern | ericsnow, and now that I think of it - I wondered as well | 17:06 |
dimitern | ericsnow, ah, ok, thanks | 17:07 |
ericsnow | dimitern: It's been on my todo list to add the rest of the ones that RB knows about | 17:07 |
dimitern | ericsnow, and where is the bot/script that does that live? | 17:07 |
dimitern | s/live// | 17:07 |
dimitern | ericsnow, iirc it's running on some ec2 instance | 17:08 |
ericsnow | dimitern: it's a github webhook (pointing to a RB URL) | 17:08 |
ericsnow | dimitern: oh, the GH bot? I don't know | 17:08 |
dimitern | ericsnow, ah right | 17:08 |
dimitern | ericsnow, so each repo has a webhook configured? | 17:09 |
dimitern | ericsnow, each := juju and utils I mean | 17:09 |
ericsnow | dimitern: exactly | 17:09 |
dimitern | ericsnow, ok,thanks | 17:10 |
ericsnow | dimitern: np | 17:10 |
alexisb | wwitzel3, you around? | 17:14 |
=== kadams54 is now known as kadams54-away | ||
voidspace | alexisb: o/ | 17:19 |
voidspace | alexisb: hey, alexis - just saying hi | 17:20 |
voidspace | alexisb: happy christmas and see you next year :-) | 17:20 |
wwitzel3 | alexisb: yes, the bug ended up not being a blocker for gce | 17:20 |
voidspace | wwitzel3: and you Wayne - I'm signing off for the year shortly... have a good holiday | 17:21 |
wwitzel3 | voidspace: ahh, have a great one! :) | 17:22 |
katco | voidspace: happy new year! | 17:22 |
voidspace | katco: thanks :-) Have a good break and see you on the other side. | 17:22 |
alexisb | hey there voidspace we havent chatted in like forever | 17:23 |
katco | voidspace: you too! best to you and your family | 17:23 |
alexisb | you must not be on my calendar | 17:23 |
alexisb | howdy and happy holidays to you! | 17:23 |
voidspace | alexisb: I don't think I am... | 17:23 |
voidspace | alexisb: we can sort that out next year :-) | 17:23 |
alexisb | voidspace, I will fix that | 17:23 |
voidspace | coolio | 17:23 |
perrito666 | mgosuites are a torture | 17:41 |
wwitzel3 | so anyone have any pointers on why the /var/lib/juju folder and nonce.txt would not be being created? | 18:03 |
wwitzel3 | when bootstrapping | 18:04 |
hazmat | anybody seen these types of errors out of go 1.4 | 18:11 |
hazmat | $ go get -u -v github.com/docker/swarm/... | 18:11 |
hazmat | package github.com/docker/swarm: /home/kapil/src/github.com/docker/swarm is from https://github.com/docker/swarm/, should be from https://github.com/docker/swarm | 18:11 |
voidspace | right, EOY | 18:23 |
voidspace | bye all | 18:23 |
voidspace | have a great holiday and new year, and see you there... | 18:24 |
wwitzel3 | ericsnow: I'm going to grab some food, but I'll leave my session up | 18:34 |
ericsnow | wwitzel3: k | 18:34 |
ericsnow | wwitzel3: ping me when you're back | 18:34 |
ericsnow | perrito666: so it looks like restore has to be all committed by Jan 9... | 18:35 |
perrito666 | ericsnow: so it seems | 18:36 |
ericsnow | perrito666: or we'll need to disable backups in 1.22 like we did in 1.21 | 18:36 |
perrito666 | we are getting there if CI has enough non locked time | 18:36 |
ericsnow | perrito666: :) | 18:37 |
ericsnow | if anyone has a few minutes to spare, I could use a review on http://reviews.vapour.ws/r/659/ | 18:38 |
ericsnow | it's a basically a copy of utils/set.Strings with s/String/Int/ :) | 18:39 |
natefinch | ericsnow: ship it! | 18:39 |
ericsnow | natefinch: thanks | 18:39 |
natefinch | ericsnow: I had looked at it before but forgot to hit the ship it button | 18:39 |
wwitzel3 | ericsnow: back | 19:07 |
ericsnow | brt | 19:08 |
wwitzel3 | during bootstrap what is responsible for creating the /var/lib/juju on the instance? | 19:13 |
wwitzel3 | right now, during ssh, when we login, there is no /var/lib/juju/nonce.txt file | 19:14 |
wwitzel3 | in fact, there is no /var/lib/juju folder at all | 19:14 |
wwitzel3 | the finish bootstrap command expects that nonce.txt file | 19:14 |
ericsnow | (on GCE) | 19:14 |
wwitzel3 | so that is why GCE is failing atm | 19:14 |
natefinch | wwitzel3: cloud init makes it | 19:22 |
=== kadams54-away is now known as kadams54 | ||
katco | is there a good, clear example of using suites to set up a full juju stack? i'm getting nil-reference exceptions and it's not exactly clear to me why | 20:16 |
perrito666 | katco: mm? | 20:19 |
natefinch | There's the dummy provider, but proceed with caution... there be dragons | 20:19 |
katco | perrito666: so you know how we chain suites in tests? | 20:19 |
perrito666 | katco: I have no clue | 20:19 |
katco | perrito666: i am probably wording it poorly | 20:20 |
natefinch | embed this suite in that suite etc etc | 20:20 |
katco | yeah | 20:20 |
perrito666 | natefinch: I might have left you a review in priv | 20:31 |
natefinch | katco: I think it's the JujuConnSuite | 20:33 |
katco | natefinch: what is? | 20:33 |
natefinch | katco: the suite that gives you a bootstrapped env | 20:34 |
natefinch | katco: /home/nate/src/github.com/juju/juju/juju/testing/conn.go | 20:34 |
katco | natefinch: ah ironically that's where the panic happens :p | 20:34 |
thumper | hello folks | 20:34 |
perrito666 | thumper: morning mate | 20:34 |
thumper | katco: what is your error? | 20:34 |
katco | natefinch: it's the confluence of using that package with another suite i think | 20:34 |
waigani | thumper: http://reviews.vapour.ws/r/627/ | 20:34 |
perrito666 | natefinch: btw, if we have blocking bugs the topic might benefit from that info | 20:34 |
waigani | thumper: and follow up branch http://reviews.vapour.ws/r/657/ | 20:34 |
perrito666 | waigani: I answered one question on one of your reviews | 20:35 |
thumper | natefinch: is anyone on your team dealing with https://bugs.launchpad.net/juju-core/+bug/1396099 ? | 20:35 |
mup | Bug #1396099: AWS/Joyent/HP/manual/maas: juju deploy error "connection is shut down" <api> <ci> <deploy> <ec2-provider> <joyent-provider> <manual-provider> <regression> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1396099> | 20:35 |
thumper | I don't always trust the assignee | 20:35 |
=== kadams54_ is now known as kadams54-away | ||
perrito666 | thumper: btw, thanks for taking over yesterday | 20:35 |
waigani | perrito666: which review? | 20:36 |
waigani | perrito666: got it | 20:36 |
perrito666 | http://reviews.vapour.ws/r/645/diff/#http://reviews.vapour.ws/r/645/diff/# | 20:36 |
thumper | perrito666: that's fine | 20:36 |
perrito666 | waigani: that, but only once | 20:36 |
* perrito666 lends his computer a second to his mother in law to look up the recipe of a sweet christmas bread and is profusely ... criticized... for having the kb layout US | 20:37 | |
waigani | hehe | 20:38 |
waigani | perrito666: where are the tests for the Restore func? | 20:39 |
perrito666 | waigani: not yet pushed, sadly when I click fixed it submits immediately instead of remaining in my draft | 20:40 |
perrito666 | waigani: but you added a ? | 20:40 |
perrito666 | which is not very clear | 20:40 |
waigani | perrito666: sorry about that , I dropped it | 20:41 |
katco | thumper: so i'm trying to use the cmd/jujud/agent/testing/agent.go suite, and if i use that alone i get errors with various workers trying to operate on my host's real fs | 20:42 |
katco | thumper: and then the test just keeps trying to dial the state server which apparently didn't come up | 20:42 |
katco | thumper: so i tried to bring in juju/juju/testing, and that gives me a nil reference exception (hold) | 20:43 |
thumper | katco: do you have the code handy? | 20:43 |
katco | thumper: do you feel like doing a peer coding session? | 20:43 |
thumper | could do... | 20:44 |
=== kadams54-away is now known as kadams54_ | ||
katco | thumper: it's not pushed up anywhere yet | 20:44 |
thumper | ok | 20:44 |
perrito666 | waigani: no problem I was curious if that actually was a way to say wtf | 20:44 |
=== kadams54_ is now known as kadams54-away | ||
perrito666 | ericsnow: my mind is a bit clouded, the cool upload is already landed right? | 20:57 |
ericsnow | perrito666: right | 20:57 |
perrito666 | tx man | 20:59 |
menn0 | wallyworld: ping? | 21:02 |
perrito666 | ah wallyworld ping me with relative low priority when you have a moment | 21:03 |
ericsnow | where does nonce.txt get written to the new instance during bootstrap? | 21:18 |
natefinch | ericsnow: it gets added to the stuff cloud init does | 21:18 |
ericsnow | natefinch: but where? | 21:18 |
natefinch | ericsnow: github.com/juju/juju/environs/cloudinit/cloudinit_ubuntu.go#105 | 21:19 |
ericsnow | natefinch: that doesn't write anything to the new host though, right? | 21:20 |
ericsnow | natefinch: doesn't that happen in cloudinit/sshinit/configure.go? | 21:21 |
natefinch | ericsnow: yeah, it adds a line to cloud init... github.com/juju/juju/cloudinit/options.go#374 | 21:21 |
ericsnow | natefinch: right | 21:21 |
perrito666 | hey people heads up https://github.com/blog/1938-vulnerability-announced-update-your-git-clients | 21:30 |
ericsnow | natefinch: is there an initial /var/lib/juju/nonce.txt included in the cloud images? | 21:38 |
natefinch | ericsnow: no clue | 21:39 |
ericsnow | natefinch: :( | 21:39 |
wallyworld | perrito666: hi | 21:53 |
perrito666 | wallyworld: hi | 21:53 |
perrito666 | wallyworld: ill privmsg you | 21:53 |
menn0 | hmmm.... I think I've found another upgrade regression that's unrelated to what I'm looking at | 22:04 |
wallyworld | menn0: hi | 22:07 |
menn0 | wallyworld: hi | 22:08 |
menn0 | wallyworld: i think I figured out what i was going to ask you | 22:08 |
wallyworld | ask away | 22:08 |
menn0 | wallyworld: but i'm about to ask you to review a fix to one of the CI blockers | 22:08 |
wallyworld | sure | 22:08 |
menn0 | wallyworld: it's a bit of a reorg of the apiworker in the machine agent | 22:09 |
* katco looks at menn0 nervously | 22:09 | |
wallyworld | is this the upgrade bug with uuid? | 22:09 |
menn0 | yep | 22:09 |
alexisb | menn0, travel approved | 22:10 |
menn0 | wallyworld: but that change has exposed a "bug" in the machine agent | 22:10 |
menn0 | alexisb: thanks | 22:10 |
menn0 | wallyworld: the container setup code was running during upgrades | 22:10 |
wallyworld | menn0: so we are roomies, thumper will be jealous | 22:10 |
wallyworld | ah, container setup code should run after upgrades shouldn't it | 22:11 |
menn0 | thumper has nothing to worry about. you're all his :-p | 22:11 |
wallyworld | \o/ | 22:11 |
menn0 | wallyworld: yep | 22:11 |
menn0 | wallyworld: give me a few minutes and you can see what i've done. it's not completely straightforward to fix this. | 22:11 |
wallyworld | i guess till now the container setup didn't really matter when it ran so we got away with it | 22:12 |
menn0 | wallyworld: well kinda, | 22:12 |
menn0 | wallyworld: there was always a lurking bug there | 22:12 |
wallyworld | yeah | 22:13 |
wallyworld | i men it was luck nothing broke till now | 22:13 |
wallyworld | mean | 22:13 |
menn0 | wallyworld: if the machines collection migration happened at about the same time as the container setup it could have blown up anyway | 22:13 |
menn0 | wallyworld: it's just much more likely now | 22:13 |
wallyworld | menn0: one thing also that needs doing is to delay the api worker start up until after the first address change has come in | 22:14 |
wallyworld | to allow the server cert to be regenerated | 22:14 |
wallyworld | with the machine ip addresses | 22:14 |
wallyworld | thus avoiding a restart after any clients that manage to connect really, really quickly | 22:15 |
menn0 | wallyworld: yeah, the other CI problem | 22:15 |
wallyworld | oh, they raised a blocker for that? | 22:15 |
wallyworld | wtf | 22:15 |
wallyworld | sigh | 22:15 |
wallyworld | that will not be an issue in practice | 22:15 |
wallyworld | i wish blockers were added to the juju-dev topic | 22:16 |
wallyworld | so we could see them | 22:16 |
menn0 | wallyworld: they are but it seems to be a manual process | 22:30 |
menn0 | wallyworld: a bot should do it | 22:30 |
wallyworld | yes | 22:30 |
menn0 | wallyworld: here's that fix https://github.com/juju/juju/pull/1343 | 22:32 |
wallyworld | menn0: ta, in a meeting, will look soon | 22:33 |
menn0 | thumper: can you look at https://github.com/juju/juju/pull/1343 pls? wallyworld is probably best placed to review this but he's in a meeting but I'd like to get this CI blocker fixed. | 22:34 |
thumper | menn0: ack | 22:37 |
thumper | menn0: Ship It! | 22:41 |
menn0 | thumper: ta | 22:41 |
stokachu | when did juju rename its ssh keys to juju_id_rsa? | 22:50 |
stokachu | or has it always been like that | 22:50 |
wallyworld | menn0: change looks good, thanks for fixing | 22:52 |
menn0 | wallyworld: great | 22:52 |
wallyworld | stokachu: i'm not 100% sure, i thought it was always like that | 22:52 |
menn0 | wallyworld, thumper: thanks for the reviews | 22:52 |
stokachu | wallyworld: ok cool | 22:52 |
menn0 | wallyworld: this API server restart issue is affecting my personal test scripts too :) | 22:53 |
menn0 | wallyworld: it'll probably bite anyone who has scripted deployments | 22:53 |
wallyworld | i'm about to relocate, will look at it as soon as i'm online again in about 20 mins | 22:54 |
stokachu | i wonder, if using JUJU_HOME causes the sshkeys to be renamed | 22:55 |
* thumper sighs | 22:55 | |
thumper | I need to take the dog to the vet | 22:55 |
thumper | bbl | 22:55 |
stokachu | though i can't find anywhere in the code where juju_id_rsa is referenced | 22:56 |
perrito666 | mm, making a function take a variadic argument does not imply I get a slice of that type right? | 23:24 |
katco | perrito666: the argument is used as a slice, but you are not guaranteed its length is > 0 | 23:26 |
perrito666 | but, is it a slice? | 23:27 |
katco | yes | 23:27 |
katco | well... you mean like down at the AST level? | 23:27 |
katco | like if you used reflection would it be a slice? | 23:27 |
perrito666 | katco: like I want to append it to a [][]string :) | 23:27 |
perrito666 | katco: exactly | 23:27 |
katco | oh, yep. it's a slice :) | 23:27 |
perrito666 | so I have a slice of string slices and I append the variadic arg to it (tests) | 23:28 |
perrito666 | but when I try to DeepEqual each of the slices I get an error about the capacity | 23:28 |
katco | i am not sure how go constructs the capacity under the covers for variadic parameters | 23:29 |
katco | my guess is that it matches the number of parameters passed in exactly? | 23:29 |
katco | but it's almost certainly constant, so i'd allocate your test values with whatever capacity it says go has provided | 23:30 |
perrito666 | katco: It seems that it expects the subslices to have been properly allocated | 23:38 |
perrito666 | so I have to make them and then fill them with the contents of the variadic arg | 23:38 |
katco | perrito666: deep equals? | 23:38 |
perrito666 | it does not support direct assignment | 23:38 |
perrito666 | katco: yup | 23:38 |
katco | i guess that makes sense | 23:38 |
perrito666 | I would expect go to have that kind of information for those slices | 23:39 |
perrito666 | katco: https://github.com/juju/juju/pull/1326/files#diff-32f2baace5b89ccd33a7a5a4c0619b3bR67 | 23:43 |
perrito666 | I end up having to do that | 23:43 |
perrito666 | If anybody want to take a second look at http://reviews.vapour.ws/r/645/ Ill be thankful | 23:46 |
katco | perrito666: that code makes sense to me. what is unintuitive about it? | 23:47 |
perrito666 | katco: well it makes a string slice and copies a string slice | 23:47 |
perrito666 | but I am most likely de-referencing some pointers there which might be what was breaking my code | 23:48 |
katco | oh, so if you just do a mgoArgs = append(mgoArgs, mongoRestoreArgs...) it complains? | 23:49 |
perrito666 | I have not tried, if you look closely I dont want to append the elements of mongoRestoreArgs but the slice itself | 23:49 |
katco | perrito666: oh i think i see what you meant | 23:50 |
katco | perrito666: the variadic slice didn't have the correct capacity | 23:51 |
perrito666 | katco: exactly, which is odd | 23:51 |
katco | perrito666: i think there is a way to size it down... hm | 23:51 |
perrito666 | katco: well what I end up doing is clear so I will leave it that way | 23:51 |
katco | perrito666: maybe make the new slice, and then do a copy? | 23:53 |
katco | to elide the for loop? | 23:53 |
perrito666 | I somehow fear copy will blow in a similar way? | 23:54 |
katco | perrito666: http://stackoverflow.com/questions/12768744/re-slicing-slices-in-golang | 23:54 |
katco | perrito666: well if it's truly the capacity that's erroring out deepequals, as long as your destination slice is the correct capacity i think it should be fine | 23:54 |
katco | looks like even better is this: mongoRestoreArgs = mongoRestoreArgs[0:] | 23:55 |
perrito666 | katco: I might give it a try | 23:55 |
* perrito666 tries | 23:55 | |
katco | perrito666: should give you a slice ref with the correct cap. if not, try mongoRestoreArgs[0:len(mongoRestoreArgs)] | 23:55 |
katco | with a nice comment :) | 23:55 |
perrito666 | katco: blows | 23:57 |
katco | perrito666: both? | 23:57 |
perrito666 | katco: yup (I used a new variable because I find reassignation is a bit ugly) | 23:57 |