thumper | fark!!!! | 02:30 |
---|---|---|
thumper | rsyslog is fiddly | 02:30 |
axw | indeed, and the config format is arcane | 02:31 |
thumper | hmm... | 02:31 |
thumper | I think I broke something | 02:31 |
thumper | a bugger | 02:33 |
thumper | ugh | 02:33 |
thumper | copy and paste error | 02:33 |
* thumper does the destroy, make, bootstrap dance | 02:33 | |
axw | make? | 02:34 |
axw | oh, you use make to build? | 02:34 |
thumper | make install | 02:34 |
thumper | just does: go install ./... | 02:34 |
thumper | but easier to type | 02:34 |
thumper | make check does: go test ./... | 02:35 |
thumper | axw: so... when is debug-log going to use the API | 02:35 |
thumper | axw: I have a feeling we need this backchannel that wallyworld keeps talking about | 02:36 |
axw | thumper: it uses it now for getting the SSH address, but I guess that's not what you mean :) | 02:36 |
thumper | axw: so the api server looks at all-machines.log and filters, streaming info back to the client | 02:36 |
thumper | right | 02:36 |
axw | yeah, I don't know - still slogging through destroy-env atm | 02:36 |
thumper | axw: when we get it working properly streaming data over the api, then it'll work with the local provider | 02:36 |
axw | what back channel? | 02:36 |
thumper | I'm getting all-machines.log working for local | 02:36 |
thumper | there isn't one | 02:36 |
thumper | but we should have one | 02:36 |
axw | sorry, what does that mean in this context? | 02:37 |
thumper | it means we could get meaningful informational stuff back to the user | 02:37 |
thumper | probably easier to explain in a hangout :) | 02:37 |
axw | sure | 02:37 |
thumper | \o/ | 02:37 |
thumper | it works | 02:37 |
thumper | IT'S ALIVE!!!1!11!! | 02:38 |
axw | :) | 02:38 |
axw | hangout now, or are you busy atm? | 02:38 |
thumper | what a fucking mission that has been | 02:38 |
thumper | about to go and make a coffee and have a quick sit-down with Rache | 02:38 |
thumper | l | 02:38 |
thumper | but I'll be back after that | 02:38 |
axw | okey dokey | 02:38 |
thumper | and we can chat | 02:38 |
axw | sounds good, ttyl | 02:38 |
thumper | I could also talk you through these two branches I'll propose | 02:38 |
axw | cool | 02:39 |
thumper | one to make the rsyslog udp port configurable | 02:39 |
thumper | and the other to get the local provider to use rsyslog and all-machines.log | 02:39 |
thumper | I've also changed the output, so instead of just %HOSTNAME%, we get things like 'machine-1' and 'unit-ubuntu-0' | 02:39 |
* thumper goes for a bit | 02:39 | |
* thumper goes to fix all the broken tests... | 03:08 | |
thumper | dumb script checking in cloudinit | 03:08 |
axw | thumper: is there a reason why we don't delete the bootstrap state file if bootstrap fails? | 03:38 |
thumper | um... because we might be 'kinda set up' ?? | 03:39 |
axw | hmm I guess. it does stop the instance if it screws up (it's easier to tell that it failed once it's synchronous) | 03:39 |
thumper | local provider isn't as good, but could be made so | 03:43 |
thumper | axw: hangout? | 03:47 |
axw | sure, just a mo | 03:47 |
axw | thumper: https://plus.google.com/hangouts/_/76cpiq5hco79gtjcr755e8jsok?authuser=1&hl=en | 03:49 |
* thumper is happy that he has figured out some live kvm tests | 05:08 | |
* thumper is done for now, back for the team meeting | 05:08 | |
axw_ | fwereade: hiya. TheMue reviewed my state changes for environment Life, but I thought I'd better wait for you since you reviewed the parent branch | 07:36 |
axw_ | when you have some moments: https://codereview.appspot.com/28880043/ | 07:36 |
dimitern | morning all | 08:14 |
TheMue | axw_: using the uuid check for the id looks better now, thanks | 08:14 |
axw_ | TheMue: cool, thanks for the review | 08:14 |
axw_ | morning dimitern | 08:14 |
TheMue | morning | 08:15 |
rogpeppe | mornin' all | 08:40 |
axw_ | morning rogpeppe | 08:41 |
jam | morning | 08:43 |
jam | well, afternoon :) | 08:43 |
mgz | mornin' | 09:29 |
jam | morning mgz | 09:43 |
jam | mgz: did you see the recent discussion about maintaining compat? | 09:44 |
mgz | the API thread on the list? | 09:44 |
jam | mgz: yeah, abentley reasonably pointed out that Status is something we probably want multi-version compatible | 09:44 |
jam | so don't be too hasty to tear out the client side code yet | 09:44 |
mgz | I didn't fully understand his point | 09:45 |
jam | mgz: for CLI API changes that require us to add a new API, we'd like to keep the old code around in the 1.18 client in case it connects to a 1.16 server that doesn't have the new API | 09:46 |
mgz | but, I guess that goes for one upgrade step only? we need to remove the mongo access in two stages | 09:46 |
jam | mgz: right. I'm not 100% sure if we want a 1.16 client to be able to do status on a 1.18 but maybe | 09:47 |
fwereade | jam, mgz: I really don't think we can avoid it | 09:54 |
fwereade | jam, mgz: 2-way compatibility for minor+2 was always the plan | 09:54 |
fwereade | jam, mgz: it's just something that we seem to repeatedly fuck up | 09:54 |
fwereade | jam, mgz: and ofc once we hit 2.0 we need compatibility all the way back to 2.0 until 3.0 (and even that needs compat back to the final 2.x, I think) | 09:55 |
jam | fwereade: so I know that we didn't do it for several commands so far (I don't know them offhand, but juju destroy-machine --force comes to mind :) | 09:57 |
fwereade | jam, indeed so, I definitely fucked that up for 1.16.4 -- but it actually just turned out to be early warning of the same fuckup for 1.18 | 09:58 |
fwereade | jam, casual glance indicates that we've just been doing straight replacement with no fallbacks across the board | 09:59 |
jam | fwereade: that has definitely been the pattern. I did bring it up to you before | 09:59 |
jam | it was fine for all the ones that didn't need a new API | 09:59 |
jam | but the others are all incompatible | 09:59 |
fwereade | jam, well, fuck, I'm sorry -- I don't recall that, and if I told you to do something stupid then that's all on me | 10:00 |
jam | fwereade: weekly standup, btw | 10:01 |
jam | TheMue: are you coming to the standup? | 10:05 |
TheMue | jam: oh, yep, forgot that it is now, sorry | 10:06 |
jam | https://plus.google.com/hangouts/_/calendar/bWFyay5yYW1tLWNocmlzdGVuc2VuQGNhbm9uaWNhbC5jb20.09gvki7lhmlucq76s2d0lns804 | 10:06 |
jam | TheMue: ^^ | 10:06 |
jamespage | can anyone help me with the impact and test case for https://bugs.launchpad.net/juju-core/+bug/1239508 | 10:16 |
_mup_ | Bug #1239508: Empty constraint value lost during some cloud-init step <cloud-init> <juju-core:Fix Committed by thumper> <juju-core 1.16:Fix Released by thumper> <juju-core (Ubuntu):Fix Released> <juju-core (Ubuntu Saucy):New> <juju-core (Ubuntu Trusty):Fix Released> <https://launchpad.net/bugs/1239508> | 10:16 |
jamespage | (working the SRU for saucy/1.16.3) | 10:16 |
thumper | committed by me? | 10:17 |
* thumper looks | 10:17 | |
thumper | jamespage: look at my comment, I marked it as invalid | 10:19 |
thumper | jamespage: the bug is erroneous | 10:19 |
thumper | I filed it, but the bug itself is wrong | 10:19 |
thumper | the fundamental problem was the local provider wasn't starting | 10:20 |
thumper | and I thought that this was the problem | 10:20 |
thumper | but it wasn't | 10:20 |
jamespage | thumper, ah - ok | 10:20 |
thumper | the problem was that the startup process was changed | 10:20 |
jamespage | thumper, I thought that was the case but I was not sure | 10:20 |
jamespage | thanks | 10:20 |
thumper | meaning that the local provider needed to set up the storage earlier | 10:20 |
thumper | hence the branch connected to that bug just has a "setup bootstrap storage" method | 10:21 |
thumper | np | 10:21 |
jamespage | OK - uploaded for SRU team review - let's see how this goes | 10:34 |
jamespage | MRE soonish if this goes all OK | 10:34 |
jamespage | ehw, ^^ | 10:38 |
ehw | jamespage: \o/ | 10:38 |
jamespage | fwereade, who's a good person to talk to about what juju-core is doing CI wise? | 10:43 |
jamespage | I need to evidence sufficient upstream testing as part of the Minor Release Exception we need for Ubuntu | 10:43 |
jam | jamespage: sinzui and abentley seem to be driving that | 10:46 |
jam | jamespage: I'm the one that set up the trunk robot for passing the test suite before landing to truk | 10:47 |
jam | trunk | 10:47 |
jamespage | jam: unit testing right? or does that exercise with each provider directly? | 10:48 |
jam | jamespage: it tests each provider against its double, but not against a live service | 10:48 |
jam | (eg, not against Amazon) | 10:48 |
jam | testing against Amazon is on Curtis's team | 10:48 |
jamespage | jam: does the robot cover unit testing pre-landing for stable branches as well? | 10:49 |
jam | jamespage: currently es | 10:50 |
jam | yes | 10:50 |
sinzui | CI tests that the candidate works on LXC, AWS, HP, Azure, and Canonistack. It also nominally tests that stable and next are compatible | 10:50 |
jam | so 1.16 passes the test suite before accepting patches, etc | 10:50 |
jamespage | thumper, sinzui: I marked bug 1239508 as invalid all round - I got confused so figured others would as well | 10:50 |
_mup_ | Bug #1239508: Empty constraint value lost during some cloud-init step <cloud-init> <juju-core:Invalid by thumper> <juju-core 1.16:Invalid by thumper> <juju-core (Ubuntu):Invalid> <juju-core (Ubuntu Saucy):Invalid> <juju-core (Ubuntu Trusty):Invalid> <https://launchpad.net/bugs/1239508> | 10:50 |
sinzui | jamespage, CI also verifies the tarball works with Ubuntu packaging. We are verifying the package upgrade (and downgrade) | 10:53 |
jamespage | sinzui, is this publicly visible somewhere | 10:54 |
jamespage | useful for my email to the tech board | 10:54 |
sinzui | visible, but the results are not very intelligible at http://162.213.34.53:8080/ | 10:54 |
sinzui | jamespage, I just drafted this as the kind of report we want managers and engineers to see https://docs.google.com/a/canonical.com/spreadsheet/ccc?key=0AoY1kjOB7rrcdEl3dWl0NUM3RzE2dXFxcGxwbVZtUFE#gid=0 | 10:55 |
dimitern | g+ kicked me out for some reason and I can't reconnect | 10:55 |
fwereade | dimitern, not to worry, we'renearly deon | 10:58 |
* fwereade can has typing? apparently not | 10:58 | |
dimitern | fwereade, sticky keyboard? :) | 10:58 |
natefinch | TheMue: forgot to mention. your package shipped, Monday, I believe, so it should be there sometime in the next 3-30 days :) | 11:03 |
TheMue | natefinch: fantastic, have to thank you | 11:18 |
natefinch | TheMue: it's no problem :) | 11:21 |
TheMue | natefinch: will be paid at least with one (or some more) beer | 11:23 |
natefinch | TheMue: bring some coffee, I'm not a big beer guy :) | 11:24 |
TheMue | natefinch: yeah, already thought so. will do ;) | 11:25 |
arosales | thumper, http://newraycom.com/how-to-set-up-google-hangout-lower-third/ | 11:36 |
thumper | arosales: cheers | 11:37 |
* thumper goes to bed | 11:37 | |
jam | sinzui: I set up a 1.17.1 milestone, for things that I know we need before we can do a 1.18.0 final release. | 12:02 |
jam | sinzui: is that a reasonable way to do it ? | 12:02 |
sinzui | +1 | 12:02 |
jam | axw_: poke if you're still around | 12:05 |
jam | I think the summary for https://bugs.launchpad.net/juju-core/+bug/1246905 needs to be updated | 12:05 |
_mup_ | Bug #1246905: status for manual provider hangs <manual-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1246905> | 12:05 |
axw_ | jam: yo, just making dinner - what's up? | 12:05 |
axw_ | ok | 12:05 |
* axw_ looks | 12:05 | |
jam | reading the text, it looks like it should be something about "juju should use network address from ?" | 12:05 |
axw_ | jam: yes, agreed - I'll update it in a bit | 12:06 |
jam | axw_: thanks, I was a bit confused reading the summary and looking at our trajectory. I think we can unschedule it from "stuff we are working on right now" unless you're looking closer at it | 12:07 |
jam | (so unset the milestone) | 12:07 |
jam | natefinch: are you still looking at bug #1218616 ? | 12:07 |
_mup_ | Bug #1218616: all-machines.log is oversized on juju node <logging> <juju-core:Triaged> <https://launchpad.net/bugs/1218616> | 12:07 |
jam | I thought I remembered a ping on your logrotate branch | 12:08 |
jam | but I don't know where that's at | 12:08 |
axw_ | jam: yeah, sorry, it should be unset | 12:08 |
jam | axw_: yeah no problem, I was just going through the list and figured I'd check it with you | 12:08 |
natefinch | jam: I should pick it back up.... all it really needs is a signal to juju to have it reopen the log file so it starts writing to a new one, rather than the old one. Not exactly sure how hard that'll be to implement, since I don't know the core logging code that well | 12:10 |
jam | natefinch: I'm assigning https://bugs.launchpad.net/juju-core/+bug/1248329 to you. Is that ok ? | 12:10 |
_mup_ | Bug #1248329: juju destroy-environment does not accept -e <ci> <destroy-environment> <regression> <ui> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1248329> | 12:10 |
jam | it sounded like you were going to pick it up at the meeting today | 12:11 |
natefinch | jam: yep | 12:11 |
natefinch | jam: so we'll support -e but not JUJU_ENV ? | 12:12 |
jam | natefinch: and not ~/.juju/environments | 12:13 |
jam | at least, that is what I'm going with | 12:13 |
jam | as that will be "you can run juju destroy-environment -e YYYY" on 1.16 and 1.18 | 12:14 |
jam | natefinch: so you *have* a transition step. Also specifying "-e env" still gets us "you had to type it manually" | 12:14 |
jam | which is what the old bug was about | 12:14 |
jam | I can respect the "required arguments should be positional" but we still hold to the letter of the original request | 12:14 |
jam | (don't make it so easy to destroy the production env) | 12:14 |
natefinch | jam: right. Are we going to remove -e in 1.20 then, or keep it? | 12:15 |
jam | natefinch: *shrug* | 12:15 |
jam | If someone feels super strong that we must not have a required but flagged argument then we can remove it in 1.20 | 12:16 |
jam | natefinch: you can write a bug about it and target 1.19 if you like | 12:16 |
natefinch | jam: yeah, it does bug me to have required flags :) | 12:16 |
jam | natefinch: I think of them a bit like named arguments | 12:17 |
jam | I actually like them sometimes | 12:17 |
jam | especially vs a "cmd dosomething true" | 12:17 |
jam | cmd dosomething -tox=true | 12:17 |
jam | fwereade: do you feel bug #1233936 is very important for 1.18 ? | 12:17 |
_mup_ | Bug #1233936: worker/uniter: uniter restarts when relation removed <tech-debt> <juju-core:In Progress by fwereade> <https://launchpad.net/bugs/1233936> | 12:17 |
natefinch | jam: there are cases where it makes more sense, like if you have several arguments, or the value of the argument is something non-obvious, like a boolean or a number | 12:17 |
jam | (iow, target 1.17.1 or leave it at 1.19?) | 12:18 |
fwereade | jam, 1.17.1 | 12:18 |
jam | done | 12:18 |
fwereade | jam, it's not even hard -- find everywhere in worker/ and cmd/jujud/ that checks for NotFound, and cause it to also check for Unauthorized | 12:19 |
fwereade | jam, (ok that's more than the bug, but that bug is just a symptom of the broader problem) | 12:19 |
jam | fwereade: sure, but doing that work as part of the bug is fine | 12:20 |
jam | you could even land that today *wink* :) | 12:20 |
fwereade | jam, but actually coding doesn't seem to be something I can get away with doing at the moment | 12:20 |
jam | :) | 12:20 |
jam | I understand the feeling | 12:20 |
jam | fwereade: it does feel like something that is really missing test coverage | 12:20 |
jam | somehow | 12:20 |
jam | fwereade: I suppose our test suite doesn't assert that services don't bounce ? | 12:21 |
fwereade | jam, it was something we agreed and planned but missed in the actual hurly-burly of developing stuff | 12:21 |
fwereade | jam, well, I spotted an unchecked error return in the uniter.filter test suite that would have caught it | 12:22 |
fwereade | jam, but yeah, I don't think we have adequate coverage of such situations | 12:22 |
fwereade | jam, otherwise I don't think there'd be any bare NotFound checks | 12:22 |
jam | dimitern: ping | 12:45 |
jam | sinzui: bug #1253628 *might* be a blocker for juju 1.17.0, we could sort of go either way here | 12:46 |
_mup_ | Bug #1253628: juju upgrade-juju incompatible with 1.16 <compatibility> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1253628> | 12:46 |
jam | but not being able to upgrade 1.16 to 1.17 is a bit bad | 12:46 |
sinzui | yeah | 12:46 |
jam | sinzui: I *think* we don't have to block. because people can still test if 1.17 works for new workloads, and we advocate that people *never* upgrade a production env to a dev release | 12:47 |
jam | so not being able to, hey we just helped you out, right ? :) | 12:47 |
sinzui | jam, they cannot upgrade without the --dev flag | 12:48 |
sinzui | though... | 12:48 |
jam | sinzui: right, but they *can't do it at all* with a 1.17 that we release today :) | 12:48 |
sinzui | yesterday I upgraded the charmworld staging site from 1.13.3 and was surprised to see 1.15.0 selected. That was the aborted release. I then specified the upgrade to choose 1.14.1 then 1.16.3 to upgrade along a path I know worked | 12:49 |
jam | sinzui: right, we've explicitly stated that we will support stable => stable, but we don't guarantee from/to dev releases. | 12:50 |
jam | sinzui: I believe Dimiter was specifically doing upgrade work so that it wouldn't default dev => dev anymore | 12:50 |
sinzui | fab | 12:50 |
jam | (so 1.17 won't try to upgrade to anything but 1.17+ by default, not 1.19) | 12:50 |
jam | dimitern: I'd like you to pick up bug #1253628 if you can | 12:52 |
_mup_ | Bug #1253628: juju upgrade-juju incompatible with 1.16 <compatibility> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1253628> | 12:52 |
dimitern | jam, will take a look | 12:55 |
dimitern | jam, is it more important than the CLI stuff? | 12:55 |
jam | dimitern: it is the compatibility for the CLI stuff. | 12:55 |
jam | and yes | 12:55 |
jam | dimitern: essentially because you use Client.EnvironmentGet | 12:55 |
jam | you *can't* actually upgrade 1.16 to trunk | 12:55 |
jam | because that API doesn't *exist* in 1.16 | 12:55 |
axw_ | fwereade: not sure if you saw this earlier, because you got cut off shortly after I sent it: | 12:55 |
axw_ | <axw_> fwereade: hiya. TheMue reviewed my state changes for environment Life, but I thought I'd better wait for you since you reviewed the parent branch | 12:55 |
axw_ | <axw_> when you have some moments: https://codereview.appspot.com/28880043/ | 12:55 |
dimitern | jam, ok, will switch to it then | 12:56 |
fwereade | axw_, I did, and I started to make comments, and then meetings swept me away | 12:56 |
axw_ | no worries | 12:56 |
fwereade | axw_, sent | 13:04 |
axw_ | fwereade: thanks | 13:10 |
jam | sinzui: I'm upgrading bug #1250154 | 13:11 |
_mup_ | Bug #1250154: updateSecrets in juju-1.18 uses Client.EnvironmentGet which is not in 1.16 <ci> <intermittent-failure> <precise> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1250154> | 13:11 |
jam | sinzui: it turns out that axw_'s patch to try to pass secrets on demand | 13:12 |
jam | broke *every* cli command against 1.16 | 13:12 |
axw_ | :o | 13:12 |
sinzui | Yay! I like the kinds of changes that are 100% something. That's how I roll. | 13:13 |
jam | sinzui: no half-assed measures | 13:13 |
jam | we're gonna break it | 13:13 |
jam | and we're gonna do it right! | 13:13 |
jam | axw_: not particularly your fault, as we weren't being cautious and labeling what the new APIs were | 13:14 |
jam | axw_: can I assign it to you ? | 13:14 |
axw_ | jam: certainly | 13:14 |
sinzui | nope. Not one promulgated charm branch worked yesterday thanks to me | 13:14 |
jam | axw_: I *think* we can just fall back to finishing api.Open in that case | 13:14 |
jam | axw_: because we don't *have* to pass secrets. | 13:14 |
jam | (most of the time) | 13:15 |
axw_ | jam: ah, because it's existing | 13:15 |
axw_ | yep | 13:15 |
jam | axw_: so there is still the potential for someone bootstrapping with 1.16 and then we try to use 1.18 for everything else | 13:15 |
axw_ | I assume we don't have to cater for installed 1.16 without having ever connected to it? :) | 13:15 |
jam | but I think that window is small enough we can document it and WontFix | 13:15 |
axw_ | +1 | 13:15 |
axw_ | "delete and rebootstrap with 1.18" | 13:16 |
jam | axw_: worst case, yeah | 13:16 |
jam | as long as destroy-environment still works :) | 13:16 |
axw_ | hehe | 13:16 |
jam | axw_: that particular bug makes it hard for me to test what other commands we broke, because... they're all broken :) | 13:17 |
axw_ | jam: all the SSH-based ones, IIRC, use a new API | 13:18 |
axw_ | yep - I added the PublicAddress API | 13:18 |
axw_ | sigh | 13:19 |
jam | axw_: yeah, I'm just going through the "bzr diff -r juju-1.16.3..trunk" | 13:19 |
jam | so I'll get there eventually | 13:19 |
axw_ | jam: so that's ssh, scp, debug-hooks, debug-log | 13:19 |
jam | fwereade: compatibility is turning into quite a bit of work | 13:19 |
axw_ | can we just unbreak upgrade-juju and leave the rest? ;) | 13:20 |
jam | axw_: well updateSecrets is a definite fix, and upgrade-juju is | 13:20 |
jam | status hasn't been done yet | 13:20 |
jam | axw_: for the rest... I wanted to scope the work and then decide | 13:20 |
jam | but it is looking pretty big | 13:20 |
axw_ | ok | 13:20 |
fwereade | jam, isn't the secrets-pushing just one piece of work? ie if the api isn't there, fall back to a state connection to push them using the previous mechanism? | 13:21 |
jam | fwereade: so *that* is one of the "we must do" bits | 13:21 |
jam | fwereade: well, I don't think we *have* to fallback, but we could | 13:21 |
jam | fwereade: the big scope of the work is falling out from my auditing | 13:22 |
jam | I'm up to about 10 commands and counting | 13:22 |
fwereade | jam, that just didn't have APIs at all in 1.16!? | 13:22 |
fwereade | jam, how did the gui get anything done? | 13:22 |
jam | fwereade: well they were missing a key api. | 13:23 |
jam | fwereade: like "juju set-constraints" didn't support EnvironmentConstraints | 13:23 |
jam | fwereade: so the GUI could change a service but not the global | 13:23 |
jam | but they didn't need to | 13:23 |
jam | I guess | 13:23 |
jam | add-machine wasn't done by the gui | 13:23 |
jam | debug-hooks, ssh, scp weren't done by the gui | 13:24 |
jam | fwereade: service-set and unset needed new APIs to handle the empty string vs null appropriately | 13:24 |
jam | destroy-machines, | 13:24 |
jam | environment-get/set | 13:24 |
jam | so basically, the GUI didn't do machines or environment level things | 13:24 |
jam | dimitern: just to keep you in the loop, bug #1250154 is blocking *all* of trunk talking to 1.16 so if you go testing manually that needs to be fixed first. axw will pick it up, but you can do a quick hack if you want to test it. | 13:26 |
_mup_ | Bug #1250154: updateSecrets in juju trunk uses Client.EnvironmentGet which is not in 1.16 <ci> <intermittent-failure> <precise> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1250154> | 13:26 |
dimitern | jam, ok, so what's the fix we're looking for? | 13:29 |
dimitern | jam, migrate my set-/get-environment changes into 1.16? | 13:29 |
jam | dimitern: updateSecrets (in a first pass) should just keep going if it can't get EnvironmentGet | 13:29 |
jam | dimitern: because the 99% case is that we don't actually need to pass the secrets | 13:30 |
jam | (it is only needed if they bootstrapped with 1.16 and then never did anything but upgraded to 1.18) | 13:30 |
dimitern | i'm not sure I get you | 13:30 |
dimitern | in trunk we need to implement a workaround for using state.EnvironConfig if client.EnvironmentGet fails? | 13:30 |
jam | dimitern: no. The particular bug is that when doing NewAPIConnFromName we check if we need to pass secrets by calling Client.EnvironmentGet | 13:31 |
jam | but that API didn't exist in 1.16 | 13:31 |
jam | however, we *also* don't normally need to set secrets | 13:31 |
jam | so a quick fix | 13:31 |
jam | is that if we try to EnvironmentGet in updateSecrets, and it fails, just keep going. | 13:31 |
dimitern | jam, ah, so just skip it and ignore the error? | 13:31 |
axw_ | jam: did you mean to set the overall compat bug to me, or the updateSecrets one? | 13:32 |
axw_ | (you assigned the former) | 13:32 |
jam | axw_: just the updateSecrets one | 13:32 |
jam | I'll fix it | 13:32 |
axw_ | ok | 13:32 |
axw_ | ta | 13:32 |
jam | axw_: I'm making you responsible for all that is wrong in the world. Are you ok with that? y/Y ? | 13:33 |
axw_ | ;) | 13:33 |
jam | yay, DestroyRelation should be ok. Finally something that may not have broken (aside from the breaking of everything :) | 13:34 |
dimitern | jam, so I shouldn't look at bug 1250154, but at bug 1253628 only? | 13:34 |
_mup_ | Bug #1250154: updateSecrets in juju trunk uses Client.EnvironmentGet which is not in 1.16 <ci> <intermittent-failure> <precise> <regression> <test-failure> <juju-core:Triaged by axwalk> <https://launchpad.net/bugs/1250154> | 13:34 |
_mup_ | Bug #1253628: juju upgrade-juju incompatible with 1.16 <compatibility> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1253628> | 13:34 |
jam | dimitern: well, I think axw_ is officially done working for today. So if you want to pick up 1250154 before he gets to it. It is the more important one. But both are important and you have more context for upgrade-juju | 13:35 |
dimitern | I'm confused now :/ | 13:35 |
dimitern | what's the workaround for SetEnvironAgentVersion ? | 13:36 |
jam | dimitern: just focus on fixing juju upgrade-juju when running against 1.16 | 13:36 |
jam | dimitern: essentially juju upgrade-juju needs to fall back to poking the DB directly if it can't client.EnvironmentGet | 13:36 |
dimitern | 1.16 in the cloud when using cli from trunk? | 13:36 |
jam | since if it can't EnvironmentGet then it can't SetEnvironAgentVersion | 13:36 |
jam | dimitern: "juju-1.16 bootstrap; juju-1.16 deploy; juju-trunk upgrade-juju --dev" is broken | 13:37 |
axw_ | I've gotta go, I'll look at 1250154 first thing in the morning if it's still broken | 13:37 |
jam | axw_: have a good night | 13:37 |
dimitern | jam, upgrade-juju --dev is no longer present in trunk | 13:38 |
axw_ | cheers, good night jam, all | 13:38 |
jam | dimitern: "juju upgrade-juju" is just broken | 13:38 |
dimitern | axw_, g'night | 13:38 |
axw_ | later dimitern | 13:38 |
jam | dimitern: we *can't* upgrade a 1.16 installation to anything | 13:38 |
jam | because it doesn't have the API we added to do so | 13:38 |
jam | regardless --dev or whatever | 13:38 |
sinzui | I am uncertain about bug #1253576. I expect a relation error to eventually be shown in status if the hook didn't complete. | 13:38 |
_mup_ | Bug #1253576: Juju does not show relation status <add-relation> <juju-core:New> <https://launchpad.net/bugs/1253576> | 13:38 |
dimitern | jam, so the fix should be "try EnvironmentGet, if it fails, access the state directly?" | 13:39 |
jam | dimitern: right | 13:39 |
dimitern | jam, if it fails in a certain way that is - when it's missing | 13:39 |
jam | dimitern: well, we can just fallback and then worry about other reasons why it fails | 13:40 |
dimitern | jam, so we'll have 2 commands in 2 old upgrade-juju and the new one | 13:40 |
jam | dimitern: 2 commands? | 13:40 |
dimitern | jam, s/in 2/in 1/ | 13:40 |
jam | 2 code paths ? | 13:40 |
dimitern | yeah | 13:40 |
jam | dimitern: yes | 13:40 |
dimitern | ugh, how ugly... but well | 13:40 |
jam | dimitern: keep what you've done, we'll want it for upgrading a 1.18 to something newer, but we also need to support upgrading from 1.16 to 1.18 | 13:41 |
jam | dimitern: the "ugliness" is just beginning: https://bugs.launchpad.net/juju-core/+bug/1253619 | 13:41 |
_mup_ | Bug #1253619: juju 1.18 needs to support all CLI against a 1.16 server <compatibility> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1253619> | 13:41 |
jam | I'm up to 10 commands | 13:41 |
jam | that are broken against 1.16 | 13:41 |
dimitern | so due to compatibility the mess is getting worse for each minor version from now on forever | 13:41 |
jam | dimitern: no, we don't claim to support 1.16 in 1.20 | 13:42 |
dimitern | we'll have duplicated code that both has and hasn't direct db access | 13:42 |
jam | so for things that we have compatibility to 1.16 | 13:42 |
jam | we can delete it in 1.120 | 13:42 |
jam | 1.20 | 13:42 |
jam | things will be worse in the 2.0 series | 13:42 |
jam | where we *will* go for 2.X is compatible with 2.Y | 13:42 |
jam | but hopefully we've gotten our shit sorted a bit better by then | 13:42 |
fwereade | jam, that "10" number hasn't got worse in the last 20 minutes at least :/ | 13:43 |
dimitern | can we claim that upgrade-juju truly uses the api, if it has this workaround? it seems like a regression/security leak | 13:43 |
jam | dimitern: it is possible that we'll decide instead "juju upgrade-juju and juju status" are all that need to maintain compat | 13:43 |
jam | and the rest will just have new code | 13:43 |
jam | fwereade: too much talking on IRC :) | 13:43 |
jam | fwereade: but actually I've gotten to a few commands that were in the API | 13:44 |
* fwereade cheers | 13:44 | |
fwereade | dimitern, jam: and it's 2 steps -- *can* use the api in 1.18, followed by *must* in 1.20 | 13:44 |
fwereade | dimitern, jam: so we can drop the db access code starting in 1.19 | 13:45 |
dimitern | fwereade, ok, that's a small consolation at least | 13:45 |
fwereade | dimitern, take 'em where you can find 'em ;) | 13:46 |
dimitern | :) | 13:46 |
fwereade | jam, do you have a note somewhere for cutting off client mongo access for new deployments in 1.18 and for all in 1.20? as for agents in 14/16? | 13:51 |
jam | fwereade: 12 total | 13:52 |
jam | audit is done | 13:52 |
jam | fwereade: arguably the 1.18 thing is https://bugs.launchpad.net/juju/+bug/804284 | 13:53 |
_mup_ | Bug #804284: API for managing juju environments, aka expose cli as daemon <pyjuju:Triaged> <juju-core:In Progress by jameinel> <https://launchpad.net/bugs/804284> | 13:53 |
jam | but we need another for 1.20 | 13:53 |
jam | fwereade: is it 14/16 or is it 16/18 ? | 13:53 |
jam | fwereade: I *think* we actually need to cut off all access in the 1.18 code for all agents | 13:54 |
fwereade | jam, hmm, you may be right in fact | 13:54 |
fwereade | jam, ok, we need to do that too then :) | 13:54 |
jam | fwereade: yeah, filing the bug now | 13:56 |
fwereade | jam, cheers | 13:57 |
jam | fwereade: so, an issue of 1.16 to 1.18 to 1.20 compat for direct DB access | 13:58 |
jam | for clients specifically | 13:58 |
jam | if we want "juju-1.18 bootstrap; juju-1.16 status" to work | 13:58 |
jam | then we can't remove DB access yet | 13:58 |
jam | but we can remove access in 1.20 | 13:58 |
jam | (by default) | 13:58 |
fwereade | ha, good point | 13:59 |
jam | fwereade: but I think we can just do it forcibly | 14:00 |
jam | as in, upgrading 1.18 to 1.20 will remove access | 14:01 |
jam | which we *couldn't* do for agents | 14:01 |
jam | mostly because of the set of "people who might access this" is unknown | 14:01 |
jam | while for agents we knew who might | 14:01 |
jam | and could leave rights just for that set | 14:01 |
jam | fwereade: ok. so I've done the audit for bug #1253619 | 14:05 |
_mup_ | Bug #1253619: juju 1.18 needs to support all CLI against a 1.16 server <compatibility> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1253619> | 14:05 |
jam | fwereade: we have 12 commands that need compatibility | 14:05 |
jam | fwereade: they break down into "things that touch machine", "things that touch environment", "things that directly access a node", and "ServiceSet" allowed the empty string to indicate reverting a field to default, while we added Unset for that | 14:06 |
jam | and then we'll get upgrade-juju for SetAgentAPIVersion, "juju status" which the GUI did via the AllWatcher, and "juju deploy" because of the PutCharm stuff. | 14:07 |
abentley | sinzui: I think the EnvironmentGet thing is interfering with Azure uploads, but I thought that Azure uploads didn't use juju. http://162.213.34.53:8080/job/upgrade-and-deploy-specific/174/consoleFull#-19356225433c12a2ef-3702-44a9-9d42-48ac051c3e02 | 14:12 |
jam | abentley: You used to use gwacl directly there | 14:15 |
jam | so it couldn't be EnvironmentGet, but if you changed that maybe | 14:16 |
abentley | jam: we don't anymore. | 14:16 |
abentley | jam: Now we use azure_publish_tools.py from ci-cd-scripts2. | 14:16 |
dimitern | jam, fwereade: fix for bug 1253628 https://codereview.appspot.com/30300043/ | 14:17 |
_mup_ | Bug #1253628: juju upgrade-juju incompatible with 1.16 <compatibility> <regression> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1253628> | 14:17 |
jam | abentley: that is a pretty big log, any help to narrow down where you think the failure due to EnvironmentGet is? | 14:17 |
jam | abentley: I don't see azure_publish_tools in that file | 14:17 |
abentley | jam: The bit in red. | 14:17 |
abentley | jam: The link should have taken you to the bit in red. | 14:18 |
jam | abentley: I probably scrolled shortly after. So I can't tell what command is being run, but if it is "juju scp" yes, that would be broken | 14:19 |
abentley | jam: I don't think it is, but sinzui will know better, because he wrote the script being run. | 14:19 |
jam | abentley: unfortunately that log seems to hold the output of commands run, but not the commands themselves | 14:20 |
jam | abentley: anyway, bug #1250154 means that *any* command you run with a 1.17 client against a 1.16 environment will fail | 14:21 |
_mup_ | Bug #1250154: updateSecrets in juju trunk uses Client.EnvironmentGet which is not in 1.16 <ci> <intermittent-failure> <precise> <regression> <test-failure> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1250154> | 14:21 |
abentley | jam: Yes. That's because some of these scripts source credentials and I was being cautious about having credentials appear in the log. | 14:21 |
sinzui | abentley, jam: we use the official MS python azure library to upload the stream meta and tools | 14:21 |
abentley | sinzui: Okay, maybe the azure upload completes, and it's my wait_for_agent_update script that dies. | 14:22 |
abentley | sinzui: No, can't be that, that happens after we upgrade juju. | 14:22 |
sinzui | abentley, The log doesn't show --upload so there is no evidence that juju wanted to move things. | 14:23 |
abentley | sinzui: I'm going to change the script to output commands, since we now exec publish-public-tools, instead of sourcing it. | 14:25 |
abentley | sinzui: I think it might be upgrade-juju itself that's failing. | 14:26 |
sinzui | abentley, looking at the log and the error line, I see that upload to azure completed a few lines before. We can update the script to state it is done with everything | 14:26 |
sinzui | abentley, in juju-dev, jam predicted that upgrade-juju is dead because of a recent landing | 14:26 |
jam | dimitern: reviewed | 14:27 |
abentley | sinzui: That would explain it, then. | 14:27 |
jam | ideally we'd have you test with deploying with 1.16 but bug #1250154 will break for you | 14:27 |
_mup_ | Bug #1250154: updateSecrets in juju trunk uses Client.EnvironmentGet which is not in 1.16 <ci> <intermittent-failure> <precise> <regression> <test-failure> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1250154> | 14:27 |
jam | dimitern: if you want to pick that one up then we'll have both fixed which would be great | 14:27 |
jam | I need to head off to dinner now. | 14:27 |
sinzui | thanks jam, I stumbled because I wasn't sure how it was renamed | 14:27 |
dimitern | jam, I'm on it already | 14:28 |
dimitern | jam, added a single line to call IsNoSuchRequest in the only sensible test I can find, which tests for "no such request" - rpc/reflect_test.go, should be sufficient | 14:43 |
dimitern | jam, proposing now | 14:43 |
dimitern | jam, I don't want to complicate things too much for this - it's not easy to test it in params, because there are no api tests there, and in order to test it I need to implement some generic client call that accepts a method name and calls whatever you give it, or something | 14:46 |
dimitern | jam, fwereade: and this https://codereview.appspot.com/30330043 fixes bug 1250154 | 14:50 |
_mup_ | Bug #1250154: updateSecrets in juju trunk uses Client.EnvironmentGet which is not in 1.16 <ci> <intermittent-failure> <precise> <regression> <test-failure> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1250154> | 14:50 |
rogpeppe | dimitern: isn't there already such a client call? | 14:50 |
rogpeppe | dimitern: api.State.Call | 14:50 |
dimitern | rogpeppe, well, not directly accessible from tests I think | 14:51 |
rogpeppe | dimitern: isn't it? | 14:51 |
rogpeppe | dimitern: what about in the state/api tests? | 14:51 |
dimitern | rogpeppe, there are no tests that use Call directly | 14:51 |
rogpeppe | dimitern: i should look at the CL itself to see what jam is suggesting :-) | 14:52 |
dimitern | rogpeppe, yeah, do that please :) | 14:52 |
dimitern | jam, ping | 14:53 |
rogpeppe | dimitern: i think HasPrefix would be better than using regexp | 14:54 |
rogpeppe | dimitern: strings.HasPrefix, that is | 14:55 |
dimitern | rogpeppe, ah, yes, I forgot about that | 14:55 |
rogpeppe | dimitern: also the Is* functions should work correctly when passed a nil error | 14:55 |
dimitern | rogpeppe, not in this case - i'm not using ErrCode(err) | 14:56 |
rogpeppe | dimitern: all the Is* functions should work ok when passed a nil error | 14:56 |
rogpeppe | dimitern: that's a current invariant, and a useful one | 14:56 |
dimitern | rogpeppe, IsCode*, but this is not like the others | 14:57 |
rogpeppe | dimitern: it doesn't matter | 14:57 |
rogpeppe | dimitern: it's like os.IsNotExist and many others | 14:57 |
dimitern | rogpeppe, ok, I see your point | 14:57 |
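
The nil-safe prefix check rogpeppe describes can be sketched in Go roughly as follows. This is a minimal illustration, not the code from the CL under review: the `stringError` helper type and the message text after the "no such request" prefix are assumptions made only for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// noSuchRequestPrefix is the message prefix being matched. The wording
// comes from the discussion; the exact server message is an assumption.
const noSuchRequestPrefix = "no such request"

// stringError is a hypothetical helper so the example needs no
// external error types.
type stringError string

func (e stringError) Error() string { return string(e) }

// IsNoSuchRequest reports whether err looks like a "no such request"
// error. Like os.IsNotExist, it is safe to call with a nil error.
func IsNoSuchRequest(err error) bool {
	if err == nil {
		return false
	}
	// A plain prefix match is enough here; no regexp is needed.
	return strings.HasPrefix(err.Error(), noSuchRequestPrefix)
}

func main() {
	fmt.Println(IsNoSuchRequest(nil))
	fmt.Println(IsNoSuchRequest(stringError("no such request: hypothetical method")))
	fmt.Println(IsNoSuchRequest(stringError("permission denied")))
}
```

Keeping the nil check inside the predicate means callers can write `if IsNoSuchRequest(err)` without a separate `err != nil` guard.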
natefinch | sigh.... mgo doesn't like connecting to a mongod that has been started with --replSet. Trying to figure out why. | 14:58 |
rogpeppe | dimitern: reviewed | 15:02 |
dimitern | rogpeppe, thanks, take a look at the follow-up as well please https://codereview.appspot.com/30330043 | 15:03 |
rogpeppe | dimitern: i'm not entirely convinced by that | 15:04 |
rogpeppe | dimitern: are we sure that no one is going to bootstrap a 1.16 environment any more? | 15:04 |
rogpeppe | dimitern: and then talk to it with 1.18 tools | 15:04 |
rogpeppe | dimitern: if we can rule that out, then it seems ok | 15:04 |
rogpeppe | dimitern: but i'm not sure | 15:04 |
dimitern | rogpeppe, what do you suggest? | 15:05 |
rogpeppe | dimitern: well, the proper fix would be to push secrets as we did before | 15:05 |
dimitern | rogpeppe, expand please | 15:05 |
rogpeppe | dimitern: well, say someone bootstraps a 1.16 environment | 15:06 |
rogpeppe | dimitern: then they run a 1.18 juju command to try and get the status | 15:06 |
rogpeppe | dimitern: actually, that's a bad example maybe | 15:06 |
fwereade | rogpeppe, dimitern: I think we want +2 compatibility to work both ways | 15:07 |
fwereade | rogpeppe, dimitern: and I think that is a good example | 15:07 |
dimitern | fwereade, both ways? | 15:07 |
rogpeppe | fwereade: i think it should work both ways in that specific instance actually | 15:07 |
fwereade | rogpeppe, dimitern: yeah -- 1.18 client, 1.16 server; and 1.16 client, 1.18 server, are both important | 15:08 |
rogpeppe | fwereade: won't it work with the above CL? | 15:08 |
fwereade | rogpeppe, I don't know, I haven't looked -- been in a meeting | 15:08 |
rogpeppe | fwereade: the old client will push the secrets correctly because it talks to mongo directly | 15:08 |
natefinch | niemeyer: you around? | 15:09 |
rogpeppe | fwereade: the new client will push the secrets correctly because it should fall back to talking to mongo directly when the relevant API call isn't implemented | 15:09 |
fwereade | rogpeppe, yeah, that sounds just fine | 15:09 |
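
The fallback behaviour described above (try the API call, and use direct database access only when the server does not implement that call) might look something like this in Go. It is a generic sketch under stated assumptions: `callWithFallback`, `stringError` and the closures in `main` are hypothetical names, and nothing here is taken from the actual juju client code.

```go
package main

import "fmt"

// stringError is a hypothetical helper error type for the example.
type stringError string

func (e stringError) Error() string { return string(e) }

// callWithFallback tries viaAPI first. If that fails with an error the
// unsupported predicate recognises (for example "no such request" from
// an older server), it falls back to direct. Any other error is
// returned unchanged so real failures are not hidden.
func callWithFallback(
	viaAPI, direct func() (string, error),
	unsupported func(error) bool,
) (string, error) {
	result, err := viaAPI()
	if err == nil {
		return result, nil
	}
	if !unsupported(err) {
		return "", err
	}
	return direct()
}

func main() {
	unsupported := func(err error) bool {
		return err == stringError("no such request")
	}
	// Simulates an older server that lacks the API method.
	oldServerAPI := func() (string, error) {
		return "", stringError("no such request")
	}
	// Simulates the older direct-to-database path.
	directDB := func() (string, error) {
		return "config read directly from the database", nil
	}
	fmt.Println(callWithFallback(oldServerAPI, directDB, unsupported))
}
```

Passing the predicate in as a parameter keeps the fallback logic independent of how "not implemented" is detected.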
rogpeppe | fwereade: the only thing that falls through the cracks are calls that use the API in 1.16 | 15:09 |
rogpeppe | fwereade: but that's a problem in 1.16 anyway | 15:10 |
rogpeppe | fwereade: so i don't think there's a regression there | 15:10 |
fwereade | rogpeppe, 1.16 calls that use the api should be ok, shouldn't they? what's wrong there? | 15:10 |
rogpeppe | dimitern: i've convinced myself it's all fine :-) | 15:10 |
rogpeppe | fwereade: they don't push secrets | 15:10 |
dimitern | rogpeppe, ok, so it's good as is then | 15:10 |
rogpeppe | fwereade: so if it's the first call that's made after bootstrapping, they won't work | 15:10 |
fwereade | rogpeppe, I *think* that we specifically avoided calls that made sense as first ones there | 15:11 |
rogpeppe | fwereade: even better | 15:11 |
fwereade | rogpeppe, eg juju get -- meaningless without a service :) | 15:11 |
jam | rogpeppe: yeah, "juju get" and "juju add-unit" both don't work unless you've deployed | 15:13 |
jam | fwereade: so there is likely to be a gap that if you "juju-1.16 bootstrap" and immediately "juju-1.18 status" the env won't have the secrets | 15:13 |
jam | *but* you have no services | 15:13 |
jam | so either you "juju-1.16 status" | 15:13 |
jam | or you "juju-1.18 destroy-environment; juju-1.18 bootstrap" | 15:13 |
jam | and things will work ok | 15:13 |
rogpeppe | dimitern: reviewed | 15:13 |
dimitern | rogpeppe, thanks | 15:14 |
jam | dimitern: It matches what I thought we would need to do. I wonder if we could mock test it. Or possibly just test it live and I'll be happy enough | 15:14 |
rogpeppe | dimitern, fwereade: BTW if you're ever stuck debugging Eaborted with a large set of transaction operations, you might find this code useful for finding out which assertion actually failed: http://paste.ubuntu.com/6453685/ | 15:15 |
rogpeppe | dimitern, fwereade: i just found it very useful | 15:15 |
dimitern | rogpeppe, cheers, good to know | 15:16 |
fwereade | rogpeppe, nice | 15:16 |
rogpeppe | dimitern: in case it's not obvious, it returns the index of the operation with the first failed assertion | 15:16 |
dimitern | rogpeppe, this can probably live in some of the testing packages | 15:17 |
rogpeppe | dimitern: it's not actually that useful in a testing package because the operations rarely escape to the tests | 15:17 |
rogpeppe | dimitern: it might be useful in its own testing package i suppose, to be imported temporarily when debugging | 15:18 |
rogpeppe | dimitern: but in that case it might as well live somewhere external | 15:18 |
dimitern | rogpeppe, my point was, it's better to be somewhere in the code, so it can be reused if needed + some comments | 15:18 |
rogpeppe | dimitern: i agree it needs comments - i just hacked it up for my own use :-) | 15:18 |
rogpeppe | dimitern: just found that the 9th op out of 11 was the one that failed its assert | 15:20 |
dimitern | rogpeppe, sweet! | 15:23 |
rogpeppe | fwereade: i think you might be pleased to know i've just refactored all the addmachine logic. i can actually understand it now. hopefully others will be able to too. | 15:27 |
fwereade | rogpeppe, you rock, thank you | 15:29 |
rogpeppe | fwereade: it was some of the twistiest stuff i've had to deal with in a while | 15:29 |
jam | dimitern: so if you land those two patches, target the bugs to 1.17.0 since sinzui hasn't released it yet | 15:49 |
sinzui | thank you jam and dimitern | 15:50 |
dimitern | jam, ok, will change the milestone to 1.17.0 | 15:55 |
dimitern | sinzui, np, they were easy to fix at least | 15:56 |
niemeyer | natefinch: Yep | 15:57 |
natefinch | niemeyer: I am having trouble dialing into a mongod that has been started with --replSet, but that hasn't had replSetInitiate called yet. mgo connects, but then repeatedly tries <something> and eventually fails with " no reachable servers" | 15:58 |
niemeyer | natefinch: That's the right behavior.. it's not finding the master | 15:58 |
niemeyer | natefinch: If you want to force a connection to a slave, you can use the direct mode | 15:59 |
natefinch | niemeyer: but I need to be able to connect to call replSetInitiate so it can determine the master :/ | 15:59 |
natefinch | niemeyer: ahh, ok | 15:59 |
rogpeppe | fwereade: i'm looking at Unit.findCleanMachineQuery | 16:16 |
rogpeppe | fwereade: it first finds all machines with non-empty containers, then finds all machines which weren't found earlier | 16:16 |
rogpeppe | fwereade: do you know of anything that's stopping something jumping in and adding a container between those two steps? | 16:17 |
rogpeppe | dimitern, jam: ^ | 16:17 |
fwereade | rogpeppe, dimitern: if the eventual assignment doesn't or can't assert on the lack of a container, then no | 16:18 |
dimitern | rogpeppe, fwereade, yeah, that sounds right | 16:19 |
rogpeppe | fwereade: if the query can't, i'm not sure how the eventual assignment can | 16:19 |
rogpeppe | fwereade, dimitern: i think it's fundamentally racy unless there's some indication of "has a container" in machineDoc | 16:21 |
fwereade | rogpeppe, the eventual assignment surely can, because it can assert on the containerrefs document | 16:22 |
rogpeppe | fwereade: ah | 16:22 |
fwereade | rogpeppe, whether it actually does so is another question | 16:22 |
rogpeppe | fwereade: yeah, i can't see that it does, but good point. | 16:23 |
rogpeppe | fwereade: i think it should probably be another txn assert inside Unit.assignToMachine | 16:23 |
fwereade | rogpeppe, sgtm | 16:24 |
rogpeppe | fwereade, dimitern: https://bugs.launchpad.net/juju-core/+bug/1253704 | 16:31 |
_mup_ | Bug #1253704: state: unit assignment emptiness check is not transactional <tech-debt> <juju-core:New> <https://launchpad.net/bugs/1253704> | 16:31 |
fwereade | rogpeppe, triage it please :) | 16:31 |
dimitern | rogpeppe, thanks | 16:32 |
rogpeppe | fwereade: what importance would you give it? | 16:32 |
fwereade | rogpeppe, I'd call it medium because I can't decide between high and low | 16:32 |
rogpeppe | fwereade: :-) | 16:32 |
fwereade | rogpeppe, which is what medium means | 16:32 |
rogpeppe | fwereade: confirmed at medium | 16:33 |
jam | fwereade: rogpeppe: High is we should do it now, Low is we should do it, medium doesn't mean anything | 16:39 |
rogpeppe | jam: i hold no opinion in this matter :-) | 16:39 |
rogpeppe | jam: i will mark it however people think | 16:40 |
jam | rogpeppe: sinzui put up a good listing of what the different priorities mean | 16:40 |
fwereade | jam, then almost everything is low ;) | 16:40 |
jam | fwereade: exactly | 16:40 |
jam | saying "this is slightly higher than this other thing" doesn't mean much | 16:40 |
jam | it is either "stuff we're working on for the next releases" == High | 16:40 |
jam | or it is Low | 16:40 |
jam | fwereade: or it is "OMG we have to fix this now" = Critical | 16:41 |
rogpeppe | jam: personally i think that "low" should be reserved for things that won't break anything if they're not fixed | 16:41 |
jam | rogpeppe: Either it is High and we'll actually schedule it to be done, or we won't actually and then it doesn't matter | 16:42 |
jam | whether you call that "Low" or "Medium" they are effectively the same | 16:42 |
jam | because you're not getting to them in a reasonable amount of time | 16:42 |
rogpeppe | jam: hmm, perhaps it should be High then | 16:42 |
sinzui | High means we commit to fixing it in our plans for the cycle. It is easy to say the cycle is 6 months, but I think 3 months is more realistic for the purposes of planning. I cannot imagine more than two milestones personally. I think this, that, and them | 16:45 |
sinzui | Also, I don't believe 100 of our High bugs are really High. We just don't have enough of them in our heads to see which ones we won't fix | 16:46 |
hazmat | fwiw, there seem to be quite a few issues at the http client layer for go trunk and juju-core trunk wrt to ec2 bootstrapping and s3 interactions | 17:49 |
mgz | interesting as in terrifying? | 17:49 |
hazmat | sinzui, re ui here.. is there any way to sort the latest release at the top, people are just downloading the top link for 1.15 even though 1.16.3 is the latest there (at the bottom) https://launchpad.net/juju-core/+download | 18:07 |
sinzui | no | 18:07 |
sinzui | I keep bringing this up. Lp doesn't support what we do and our stable releases will fall off +downloads, breaking debwatch | 18:08 |
natefinch | sinzui: how are they sorted? | 18:09 |
sinzui | hazmat, to be blunt, Lp doesn't do the right thing for any project in this matter. | 18:09 |
sinzui | natefinch, according to the minds of salgado-grubbs-pool-albistte... | 18:10 |
natefinch | hazmat, sinzui: wow, that is, umm... like completely unacceptable. | 18:11 |
sinzui | natefinch, I believe we are seeing the project's focus-of-development series listed first, then ordered in z-a, then all other series in a-z with releases as z-a | 18:11 |
sinzui | natefinch, hazmat I get angry every time this comes up. I didn't have the power and time to prevent it | 18:12 |
natefinch | sinzui: I have no idea what that means. Is there a way we can hack our outputs so they sort the way we want? | 18:12 |
sinzui | hazmat, natefinch I /think/ we want the page listed by deb-version descending and offer an alternate page by date | 18:12 |
sinzui | the latest portlet would show highest deb-version for the series or all series | 18:13 |
sinzui | natefinch, the +downloads page was originally simple and practical. Then changes were made without checking use cases. I wept | 18:14 |
natefinch | sinzui, hazmat: in my experience, when stuff like this comes up, the best way to get people to listen is to do something incredibly drastic, so everyone has to go "why the hell would you do that?" And that gets the conversation going about how to fix things. | 18:14 |
sinzui | I don't know the use cases myself, but I think packagers are the first users that need to find what they need to get into distros | 18:15 |
sinzui | +downloads is broken by design. I haven't found a single project that it works for | 18:16 |
natefinch | sinzui: here's my suggestion - we delete all the downloads off that page and move them somewhere outside LP where we have control over sorting. Then email the juju list about the change.... and wait to see how long it takes someone to throw a hissy fit. | 18:16 |
sinzui | natefinch, the page exists for two of our users. Ubuntu and homebrew. Both pick up packages from that page. removing them will break ubuntu temporarily and homebrew forever | 18:18 |
natefinch | sinzui: well then, it would definitely get some attention ;) Sigh, yes, I guess we can't do that, then. | 18:22 |
sinzui | I have a 4 day weekend coming up. I can make a grand simplification to the page that stevenk and wgrant might accept. They are sympathetic to the needs of packagers. | 18:24 |
* sinzui sees that we are 2 unstable releases away from pushing stable releases to the second page | 18:25 | |
hazmat | natefinch, drastic would be to have release downloads on juju.ubuntu.com only and trash the lp download page. | 18:28 |
hazmat | sinzui, ^ | 18:29 |
hazmat | natefinch, oh.. yeah.. same thing you were suggesting | 18:29 |
hazmat | sinzui, we can fix homebrew | 18:29 |
hazmat | its an mp away. | 18:29 |
hazmat | er. pr | 18:29 |
sinzui | hazmat, yes we can do that. | 18:36 |
rogpeppe | fwereade: finally https://codereview.appspot.com/30390043 | 18:37 |
rogpeppe | and that's me for the night | 18:38 |
rogpeppe | g'night all | 18:38 |
natefinch | hazmat. sinzui: took me over 3 months, but I finally realized that when canonical says series, what they mean is "branch", at least from a code perspective. | 18:43 |
sinzui | not the same | 18:43 |
natefinch | sinzui: branch + sub-branches | 18:44 |
sinzui | natefinch, a series is metadata about intent. I have argued that Lp should let me state a branch is a fix, a feature, a series that I make releases from | 18:45 |
sinzui | natefinch, I would like to think a series is a superset of data about a branch, but distros have series and there are no branches. A series is a planning device that branches are arbitrarily associated with. I say "arbitrary" because Lp puts projects before branches, and treats secondary communities as having equal power to the primary developers. equal power without any responsibility | 18:48 |
sinzui | natefinch, I know why Lp does its non-sense. I was a collaborator. The developers spent too much time creating incomplete features without journey to guide design | 18:50 |
natefinch | sinzui: I get it. And yeah, I can see there's a lot started on LP that didn't get the spit and polish. | 18:55 |
=== gary_poster is now known as gary_poster|away | ||
natefinch | morning thumper | 19:25 |
natefinch | niemeyer: still around? I still can't get this working. replSetInitiate. mgo is returning "no reachable servers" which I know is not true, since it is correctly dialing. when I do rs.initiate() from the shell, I get "can't find self in the replset config" | 19:32 |
niemeyer | natefinch: How are you dialing? | 19:32 |
natefinch | niemeyer: here's the mongod command line and the code I'm running: http://pastebin.ubuntu.com/6454825/ | 19:32 |
niemeyer | natefinch: What's the output? | 19:33 |
natefinch | niemeyer: actually, I guess it is failing on the dial, I had thought I'd seen it made it past that, but I guess not. I'm getting "Error from dial: no reachable servers" (first half is my message) | 19:34 |
niemeyer | natefinch: In that case you have a server that is actually in a bad state | 19:35 |
niemeyer | natefinch: Oh, wait | 19:35 |
niemeyer | natefinch: You need to use a Monotonic session | 19:35 |
thumper | o/ | 19:36 |
niemeyer | natefinch: With a strong session, it won't allow anything but a master | 19:36 |
natefinch | niemeyer: how do I do that? I only see SetMode on the session, which I can only get with Dial, which is what is failing | 19:38 |
niemeyer | natefinch: Well, if you cannot even dial the server is likely in a bad state | 19:41 |
niemeyer | natefinch: Reset your server and try this command again | 19:41 |
natefinch | niemeyer: ok | 19:44 |
natefinch | niemeyer: ok, yeah, now I'm getting "couldn't initiate : can't find self in the replset config my port: 28000" | 19:49 |
thumper | mramm2: around somewhere? | 19:52 |
mramm2 | thumper: yea | 19:56 |
=== thumper is now known as thumper-afk | ||
niemeyer | natefinch: Cool | 20:12 |
natefinch | niemeyer: well, except for the error. | 20:15 |
niemeyer | natefinch: This means you're not configuring the replica set correctly | 20:15 |
niemeyer | natefinch: It's simply telling you the server isn't part of the configuration, so the config can't be valid | 20:15 |
=== gary_poster|away is now known as gary_poster | ||
natefinch | niemeyer: ok, I thought since I could do rs.initiate() that I could do a bare replSetInitiate and it would do the right thing, but I can give it a valid config too | 20:15 |
niemeyer | natefinch: Yeah, it likes valid configs :) | 20:16 |
natefinch | niemeyer: details details :) | 20:16 |
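
As a rough illustration of what a "valid config" means here, the sketch below builds a single-member replica set configuration that lists the server's own address, which is what the "can't find self in the replset config" error complains about when it is missing. It only constructs and prints the command document as JSON and does not talk to mongod. The replica set name `example-rs` and the host name `localhost` are assumptions; only the port 28000 comes from the error message above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// member is one entry in the replica set's member list.
type member struct {
	ID   int    `json:"_id"`
	Host string `json:"host"`
}

// replicaSetConfig mirrors the basic shape of a MongoDB replica set
// configuration document: a set name plus its members.
type replicaSetConfig struct {
	Name    string   `json:"_id"`
	Members []member `json:"members"`
}

// singleMemberConfig returns a config whose only member is selfAddr,
// so the server can find itself in the config.
func singleMemberConfig(name, selfAddr string) replicaSetConfig {
	return replicaSetConfig{
		Name:    name,
		Members: []member{{ID: 0, Host: selfAddr}},
	}
}

// includes reports whether addr appears in the member list.
func (c replicaSetConfig) includes(addr string) bool {
	for _, m := range c.Members {
		if m.Host == addr {
			return true
		}
	}
	return false
}

// commandJSON renders the replSetInitiate command document as JSON.
func (c replicaSetConfig) commandJSON() (string, error) {
	data, err := json.Marshal(map[string]replicaSetConfig{"replSetInitiate": c})
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	// "example-rs" and "localhost" are placeholder values.
	cfg := singleMemberConfig("example-rs", "localhost:28000")
	doc, err := cfg.commandJSON()
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(doc)
}
```

The set name in the config would need to match whatever name mongod was started with via `--replSet`.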
=== gary_poster is now known as gary_poster|away | ||
jam | sinzui: we could always make "stable" releases from the trunk series ... | 20:29 |
sinzui | jam we could | 20:30 |
jam | sinzui: or maybe the .0 from there? | 20:30 |
sinzui | I could also move the releases to trunk now... | 20:30 |
jam | it at least gets people 1.16 instead of 1.15 | 20:30 |
sinzui | jam ...but I sent an email to wgrant and stevenk proposing a fix that is easy for me to do and would make the page timeout less often | 20:31 |
jam | sinzui: if you can fix it, I'd rather not work around it :) | 20:32 |
sinzui | jam, the fix has always been political. I appealed to their love of speed and deb/ubuntu packaging | 20:32 |
natefinch | ahh, of course, I had to set direct=true and mode=monotonic, why didn't I think of that? #mongodb :/ | 21:10 |
=== BradCrittenden is now known as bac | ||
=== thumper-afk is now known as thumper | ||
thumper | abentley: still around? | 21:27 |
abentley | thumper: Hi. | 21:27 |
thumper | abentley: hey | 21:27 |
thumper | abentley: looking at this 'local provider not starting' bug | 21:27 |
thumper | abentley: do you have a machine it happens on reliably that I can get you to test a fix on? | 21:27 |
abentley | Yes, I was excited to see you think you have a solution. I am not sure how reliably it happens on my machine. | 21:28 |
abentley | thumper: It worked yesterday, but before that, it had been failing a lot. | 21:29 |
thumper | abentley: is the machine in question able to compile easily from source? | 21:29 |
thumper | abentley: I think it is a race condition | 21:29 |
thumper | so could well be impacted by other work the machine is doing at the time | 21:30 |
abentley | thumper: just failed. | 21:30 |
thumper | abentley: let me whip you up a branch | 21:30 |
abentley | thumper: Yes, I should be able to compile from source here. | 21:31 |
thumper | kk | 21:31 |
thumper | give me a few minutes | 21:31 |
thumper | wallyworld_: hi | 21:35 |
thumper | wallyworld_: enjoy the cricket? | 21:35 |
wallyworld_ | thumper: no :-( | 21:36 |
wallyworld_ | bloody poms are winning | 21:36 |
thumper | abentley: can I get you to hammer this a few times? lp:~thumper/juju-core/fix-upstart-start-race | 21:36 |
wallyworld_ | thumper: if you get a chance today, here's a branch which reworks the provisioner. the container stuff is moved out, only one switch statement now. https://codereview.appspot.com/29450043/ | 21:37 |
thumper | wallyworld_: cool, got time to hangout? | 21:38 |
wallyworld_ | ok | 21:38 |
* thumper starts one | 21:38 | |
thumper | wallyworld_: https://plus.google.com/hangouts/_/72cpjtpfsf6lleb295gbg9iff0?hl=en | 21:38 |
abentley | thumper: Okay, I got it, but I don't know how to build it; make complains that various packages are missing. | 21:40 |
thumper | abentley: ah bugger | 21:40 |
thumper | I've forgotten how to grab all that | 21:40 |
abentley | https://pastebin.canonical.com/100868/ | 21:40 |
thumper | abentley: try 'go get launchpad.net/juju-core' | 21:40 |
thumper | abentley: have you compiled from source before? | 21:42 |
abentley | thumper: Not sure. I grabbed the source long ago to make a small change. | 21:42 |
abentley | thumper: "no Go source files in /home/abentley/hacking/go/src/launchpad.net/juju-core" | 21:43 |
thumper | ugh... | 21:43 |
thumper | natefinch: do you remember the command? | 21:43 |
thumper | abentley: perhaps add /... to the end of the go get juju-core command | 21:43 |
natefinch | thumper, abentley: that error is actually right... we structure our code in a non-idiomatic way, so there actually is no go code in the root directory, and that's ok | 21:44 |
natefinch | thumper, abentley: you can pass go get the -d flag and it'll just download (by default it downloads and tries to build) | 21:44 |
abentley | natefinch: I passed -d and it completed immediately. | 21:45 |
natefinch | abentley: so, you have juju-core, but it doesn't automatically go get the dependencies | 21:45 |
abentley | natefinch: That's what we're trying to accomplish with go get. | 21:46 |
hatch | hey can someone confirm if juju support retry & remove on 'pending' units? | 21:48 |
natefinch | abentley: go get launchpad.net/godep | 21:50 |
natefinch | abentley: then go to launchpad.net/juju-core/ and run godep -u dependencies.tsv | 21:50 |
abentley | natefinch: godeps with an s? | 21:51 |
natefinch | abentley: oops, yeah, godeps | 21:51 |
natefinch | abentley: there's like three different tools named godep / godeps | 21:51 |
abentley | natefinch: "No command 'godep' found, did you mean:..." | 21:52 |
natefinch | godeps :) | 21:53 |
abentley | natefinch: "godeps: command not found" | 21:53 |
natefinch | abentley: is your $GOPATH/bin folder in your path? | 21:54 |
abentley | natefinch: No. | 21:54 |
natefinch | abentley: you should fix that :) | 21:55 |
natefinch | abentley: that's where go puts the binaries that it builds | 21:55 |
natefinch | (go get both downloads and builds) | 21:56 |
abentley | https://pastebin.canonical.com/100869/ | 21:56 |
abentley | natefinch: ^^ | 21:58 |
natefinch | abentley: arg, sounds like it will only update, not actually go get the thing in the first place. sigh | 21:59 |
natefinch | abentley: the hard way is to first simply do go get <path> where path is the first string in dependencies.tsv | 21:59 |
natefinch | (for each line in dependencies.tsv) | 22:00 |
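
The "hard way" natefinch describes (run go get on the first field of each line in dependencies.tsv) could be scripted along these lines. This is a sketch only: it assumes the file is tab-separated with the import path in the first column, as stated above, and the sample rows in `main` are made-up placeholders rather than real juju-core dependencies. It prints the commands instead of running them.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// importPaths returns the first tab-separated field of every
// non-empty line in tsv.
func importPaths(tsv string) []string {
	var paths []string
	scanner := bufio.NewScanner(strings.NewReader(tsv))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue // skip blank lines
		}
		fields := strings.Split(line, "\t")
		paths = append(paths, fields[0])
	}
	return paths
}

func main() {
	// Hypothetical sample content; a real run would read the
	// dependencies.tsv file from the juju-core checkout instead.
	sample := "example.com/hypothetical/pkg-a\tbzr\tsome-revision-id\t1\n" +
		"example.com/hypothetical/pkg-b\tgit\tanother-revision-id\t2\n"

	for _, path := range importPaths(sample) {
		// -d downloads without building, as mentioned earlier.
		fmt.Println("go get -d " + path)
	}
}
```

Once every dependency has been fetched this way, `godeps -u dependencies.tsv` can then update each one to the pinned revision.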
abentley | natefinch: https://pastebin.canonical.com/100870/ | 22:02 |
fwereade | hatch, retry no; remove yes | 22:02 |
hatch | fwereade: thanks for confirming | 22:03 |
hatch | fwereade: so when a charm gets 'stuck' what is a user to do? | 22:03 |
hatch | just remove it and try again? | 22:03 |
natefinch | abentley: that's probably ok... that's just the same thing as juju-core had | 22:03 |
fwereade | hatch, soon, they will be able to destroy-machine --force to take the whole thing down; in the meantime, assuming the agent is running, they can destroy the unit and repeatedly resolve hook errors (as opposed to retry) until it finally goes away | 22:06 |
natefinch | abentley: I have to go in a few minutes, FYI | 22:06 |
abentley | natefinch: That's alright. It's past EOD for me. | 22:07 |
fwereade | hatch, --force has landed, and will be in 1.17, but most people won't see it until 1.18, which we can't do until we've got some fairly significant api compatibility work done | 22:08 |
hatch | fwereade: ok so from the GUI if I were to give the user a 'remove' button for units in pending status that would suffice to match the functioning features? | 22:08 |
natefinch | good night all | 22:08 |
fwereade | hatch, yeah, I think so -- we favour the "destroy" terminology, but if you're calling it "remove" elsewhere it's probably best to stay consistent | 22:14 |
hatch | well from the gui it can't 'destroy' it can only remove the unit | 22:14 |
hatch | which will leave the machine | 22:14 |
hatch | at least for the time being | 22:14 |
hatch | at least I think that's the though process behind it | 22:15 |
hatch | thought* | 22:15 |
abentley | thumper: Got it to a state where make finishes with no output, but GOPATH/bin doesn't have an updated juju. The one there is from March 8. | 22:17 |
thumper | abentley: make, or make install? | 22:17 |
thumper | make just builds | 22:17 |
abentley | thumper: make install put it there. I'm used to make install putting it in /usr/bin, not a home directory. | 22:20 |
abentley | thumper: failed: ERROR failed to initialize state: no reachable servers | 22:21 |
* thumper shrugs, and looks at jtv | 22:21 | |
thumper | abentley: again? | 22:21 |
abentley | thumper: You want me to try again? | 22:21 |
thumper | as in, you are still getting it? | 22:21 |
abentley | thumper: I'm saying I just tried it and it just failed. | 22:21 |
thumper | ah, can you show me your command line? | 22:21 |
thumper | the sudo line | 22:22 |
abentley | thumper: "$ sudo $GOBIN/juju bootstrap -e local" | 22:22 |
thumper | I tend to use $ sudo $(which juju) bootstrap | 22:23 |
thumper | but I guess that works | 22:23 |
thumper | can you see if the local mongo service is running? | 22:23 |
abentley | thumper: Since GOPATH isn't in my PATH, that wouldn't work. | 22:23 |
thumper | inside /etc/init there is a juju-something-db.conf | 22:24 |
thumper | go 'sudo status juju-...-db' | 22:24 |
thumper | to see if it is up | 22:24 |
thumper | I think it probably isn't | 22:24 |
thumper | but now I'm more confused | 22:24 |
thumper | I thought that this would fix it | 22:24 |
abentley | thumper: https://pastebin.canonical.com/100872/ | 22:24 |
thumper | no, not mongodb | 22:25 |
thumper | like I have juju-db-tim-local-kvm.conf | 22:25 |
thumper | where 'local-kvm' is my environment | 22:25 |
thumper | so I'd type 'sudo status juju-db-tim-local-kvm' | 22:25 |
thumper | yours is probably juju-db-abently-local | 22:25 |
thumper | assuming your userid is abently | 22:26 |
* thumper forgot last e | 22:26 | |
thumper | sorry | 22:26 |
abentley | thumper: nw. | 22:26 |
abentley | thumper: https://pastebin.canonical.com/100874/ | 22:26 |
* thumper frowns | 22:27 | |
thumper | wtf | 22:27 |
thumper | ok, not sure, will have to come back to this | 22:27 |
abentley | thumper: okay. ttyl. | 22:28 |
thumper | kk | 22:28 |