[00:20] Hi davecheney
[00:22] sinzui: s'ok, wallyworld_ answered my question
[00:45] ericsnow: ok, done with dinner, taking a look at 708
[00:50] waigani_: I'm being bitten by the missing envuser now too
[00:50] waigani_: because I changed the 'add service' code to look for them :-|
[00:50] waigani_: how far away are you?
[00:50] thumper: just about done
[00:50] coolio
[00:51] thumper: I didn't see your messages but I implemented exactly what you suggested - even the param name :)
[00:51] waigani_: I didn't say them on the PR, just here
[01:02] thumper: currently if you pass in creator: "eric" to MakeUser it does not create a local user for eric
[01:02] waigani_: I think that is fine for now
[01:02] thumper: this now fails when you try to specify eric as the creator of the environuser
[01:03] waigani_: again, probably ok
[01:03] waigani_: fall back to the environment owner
[01:03] if not specified
[01:04] thumper: you mean if it doesn't exist as a local user?
[01:04] because it is being specified in the params
[01:04] I mean that if you pass a value in explicitly to the factory, it is expected to work
[01:05] if you haven't set it up right, then it is the test's fault
[01:05] not the factory
[01:05] thumper: right, so if you pass in "eric" as the creator you should have created "eric" as a local user?
[01:05] yes, that is what I'm saying
[01:06] got it, I'll update the test in that case
[01:15] thumper: do we still need factory.MakeEnvUser ?
[01:16] yes
[01:16] waigani_: there will be cases where we want an envuser, but they are not local
[01:16] all users are local users
[01:17] right, of course
[01:21] thumper: just fixing up all the call sites now, there are a few
[01:27] waigani_: here is one for your TODO list:
[01:27] func (s *Service) GetOwnerTag() string {
[01:27] from state/service.og
[01:27] s/og/go/
[01:28] please make it return a names.UserTag
[01:29] coffee time
[01:29] thumper: okay
[01:30] thumper: should the user now have a func to get the envuser?
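The TODO item above asks for GetOwnerTag to return a typed tag instead of a string. A minimal sketch of what that could look like follows. UserTag and ParseUserTag here are simplified, hypothetical stand-ins for names.UserTag and names.ParseUserTag, and the Service and serviceDoc types are reduced to a single field for illustration; none of this is the actual juju code. Returning an error alongside the tag is one possible choice, so that a malformed stored value is reported rather than ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// UserTag is a hypothetical stand-in for names.UserTag.
type UserTag struct{ name string }

// String renders the tag in "user-<name>" form.
func (t UserTag) String() string { return "user-" + t.name }

// ParseUserTag is a simplified stand-in for names.ParseUserTag.
// It accepts only strings of the form "user-<name>".
func ParseUserTag(s string) (UserTag, error) {
	name := strings.TrimPrefix(s, "user-")
	if name == s || name == "" {
		return UserTag{}, fmt.Errorf("%q is not a valid user tag", s)
	}
	return UserTag{name: name}, nil
}

// serviceDoc and Service are cut-down illustrations of the state types.
type serviceDoc struct{ OwnerTag string }

type Service struct{ doc serviceDoc }

// GetOwnerTag returns the owner as a typed tag rather than a raw string.
func (s *Service) GetOwnerTag() (UserTag, error) {
	return ParseUserTag(s.doc.OwnerTag)
}

func main() {
	svc := &Service{doc: serviceDoc{OwnerTag: "user-eric"}}
	tag, err := svc.GetOwnerTag()
	fmt.Println(tag, err)
}
```

Callers that only ever see well-formed documents can treat the error as unexpected, while still having somewhere to report it if the stored string is ever wrong.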
[01:37] waigani_: I don't think so
[01:37] thumper: https://github.com/juju/juju/pull/702
[01:42] * thumper looks
[01:46] waigani_: one change and one question
[01:47] thumper: api.Open does not return a NotFound err
[01:47] I tried to satisfy and it failed
[01:47] is that error from api.Open?
[01:47] waigani_: that's fine, that is why I asked :)
[01:47] right
[01:48] although...
[01:48] by the time it hits api.Open
[01:48] it should be "permission denied"
[01:48] and nothing else
[01:49] * thumper adds another comment
[01:49] thumper: why perm denied? I'm giving the user perms in the test
[01:49] waigani_: no, in the general case
[01:50] and you aren't giving the user perms, you are explicitly testing that they can't get in
[01:50] the error result should be "permission denied"
[01:52] right, because it's more info than you should share to say that the user is not found
[01:54] right
[02:03] oops, sorry wallyworld_, thought I had updated my blobstore when I added it to dependencies.tsv...
[02:03] no worries
[02:09] davecheney: I have rockne-02 up with the deb locally
[02:10] davecheney: but I can't remember how to install a deb
[02:10] anyone?
[02:12] sudo dpkg -i package.deb
[02:13] bcsaller: ta
[02:15] thumper: how the mighty have fallen
[02:15] davecheney: I don't claim to be mighty with dpkg
[02:15] nor have ever
[02:17] hmm...
[02:17] juju bootstrap tells me port 37017 is in use
[02:17] how can I get netstat to tell me if this is true
[02:17] I did 'netstat -a'
[02:17] but that didn't show the port in use
[02:17] am I missing something?
[02:18] actually, I see it now
[02:18] hmm...
[02:18] how to find out the process?
[02:22] davecheney: if I can bootstrap with the 1.18.4 deb specified with the local provider, and do status, is that verified fixed?
[02:23] mwhudson: hey, around?
[02:23] thumper: yes
[02:24] thumper: yes, i think so
[02:26] davecheney: cool
[02:56] thumper: are you sure rockne doesn't have 64k pages ?
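On the api.Open point earlier in this stretch (an unknown user should see "permission denied" and nothing else), a minimal sketch of that masking could look like the following. The environment type, its access map, Login, and both error values are hypothetical names invented for illustration; this is not the juju API.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPermissionDenied is the only authentication failure callers ever see.
var ErrPermissionDenied = errors.New("permission denied")

// errUserNotFound is internal and must not leak to the caller.
var errUserNotFound = errors.New("user not found")

// environment is a hypothetical store: user name -> has access.
type environment struct {
	access map[string]bool
}

// lookup reports whether the user has access, or errUserNotFound.
func (e *environment) lookup(user string) (bool, error) {
	allowed, ok := e.access[user]
	if !ok {
		return false, errUserNotFound
	}
	return allowed, nil
}

// Login returns the same error for a missing user and for a user
// without access, so the response does not reveal which users exist.
func (e *environment) Login(user string) error {
	allowed, err := e.lookup(user)
	switch {
	case errors.Is(err, errUserNotFound):
		return ErrPermissionDenied
	case err != nil:
		return err
	case !allowed:
		return ErrPermissionDenied
	}
	return nil
}

func main() {
	env := &environment{access: map[string]bool{"eric": true, "bob": false}}
	for _, u := range []string{"eric", "bob", "ghost"} {
		fmt.Printf("%s: %v\n", u, env.Login(u))
	}
}
```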
[02:56] if you hit it with the apt-get upgrade hammer
[02:56] it will be running 64k pages
[02:56] davecheney: yes, looked
[02:56] welp, shitter
[02:57] davecheney: I just did upgrade, and not dist-upgrade
[02:57] you want me to try that?
[02:57] nope
[02:57] uname -a
[02:57] * thumper is sshing in again
[02:57] getconf PAGESIZE
[02:57] Linux rockne-02 3.13.0-18-generic #38-Ubuntu SMP Mon Mar 17 21:41:16 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
[02:58] ah...
[02:58] wat
[02:58] * thumper did that just before
[02:58] but got a different result
[02:58] ubuntu@rockne-02:~$ getconf PAGESIZE
[02:58] 65536
[02:58] textbook definition of insanity
[02:58] it was 4096 when I looked just before
[02:58] maybe that was your own host
[02:59] perhaps
[02:59] i reckon it's not an issue
[02:59] you did the test right
[02:59] * thumper is bootstrapping again
[03:00] where did it fail last time?
[03:01] * thumper tags verification-done
[03:01] once any juju process had been running for > 5 mins
[03:01] juju ssh some unit
[03:01] wait for 20 mins
[03:01] no crash, all good
[03:02] oh, it has to run for some time?
[03:02] hmm
[03:02] * thumper bootstraps it and waits
[03:04] yup, the bug is when the scavenger runs, it will try to munmap(2) an area of memory that isn't a multiple of the page size
[03:04] this shows up on agents
[03:04] and using juju ssh as the juju ssh parent process just sits there quietly
[03:14] thumper: PTAL https://github.com/juju/juju/pull/709
[03:16] thumper: double underpants, check out dmesg
[03:16] make sure there are no oddball kernel messages there
[03:16] that's the canonical check
[03:17] davecheney: well, machine 0 has been up over 15 minutes
[03:17] had 'watch juju status' running
[03:18] nup, that won't show it
[03:18] juju status only runs for a few seconds
[03:18] so either the jujud daemons crash
[03:18] dmesg seems fine
[03:19] look, it's fixed
[03:19] it's been fixed for months
[03:19] if you use the right compiler
[03:19] * thumper has marked the bug as verified
[03:19] job done
[03:19] next
[03:26] any thoughts on why I can run lxc containers from inside a docker container but that the local provider fails to dial the state server on bootstrap?
[03:27] bcsaller: sounds like the networking is all fucked up
[03:29] davecheney: I was able to lxc-create/start etc. I manually brought up the lxcbr0 in the container and that seemed to work in the raw lxc case. w/o the bridge bootstrap was failing much sooner
[03:29] so it still might be, but I'm not sure that it is
[03:34] bcsaller: what addresses and networks do the various components have ?
[03:35] davecheney: juju.state open.go:101 connection failed, will retry: dial tcp 127.0.0.1:37017: connection refused
[03:35] is the failure I'm seeing x100
[03:35] so it's not getting very far I think
[03:36] I put lxcbr0 on 10.0.4.1
[03:37] and eth0 in the container is a 172. address
[03:39] menn0: did you figure out why your test was passing when you didn't think it should?
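The munmap(2) explanation earlier in this stretch comes down to arithmetic: a region sized in 4096-byte units is not necessarily a multiple of a 65536-byte page. The sketch below only illustrates that arithmetic. It is not the Go runtime's scavenger code, and isPageAligned, roundUpToPage, and the 4096-byte unit are assumptions made for the example; os.Getpagesize is the standard library call that reports the host's page size.

```go
package main

import (
	"fmt"
	"os"
)

// isPageAligned reports whether n is a whole multiple of pageSize.
func isPageAligned(n, pageSize int) bool {
	return n%pageSize == 0
}

// roundUpToPage rounds n up to the next multiple of pageSize.
func roundUpToPage(n, pageSize int) int {
	return (n + pageSize - 1) / pageSize * pageSize
}

func main() {
	// Assumed allocation unit for the example: code written with
	// 4096-byte pages in mind.
	const assumedUnit = 4096
	span := 3 * assumedUnit

	// Compare a 4K host, a 64K host, and whatever this host reports.
	for _, pageSize := range []int{4096, 65536, os.Getpagesize()} {
		fmt.Printf("page size %d: span %d aligned=%v, rounded up=%d\n",
			pageSize, span, isPageAligned(span, pageSize),
			roundUpToPage(span, pageSize))
	}
}
```

On a host where getconf PAGESIZE reports 65536, a 12288-byte span is not a multiple of the page size, which is the kind of mismatch described in the log.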
[03:40] davecheney: eh, looks like there still might be some issues with the lxc-container networking as well, so I'll keep debugging the setup
=== allenap_ is now known as allenap
=== psivaa_ is now known as psivaa
[04:12] _thumper_: do we need to handle the error from ParseUserTag? s.doc.OwnerTag is guaranteed to be in the right format, right?
=== _thumper_ is now known as thumper
[04:13] * thumper thinks of how to best handle this...
[04:15] waigani_: as much as I find it a little frustrating, I think the only real approach is to return (names.UserTag, error)
[04:15] and handle the error in the places where we need to
[04:16] which is exactly one place
[04:16] thumper: yep
[04:16] we shouldn't ever get an error
[04:16] but I'd rather return an error that may one day be real
[04:16] than panic
[04:16] yeah, for sure
[04:30] waigani_: what line ?
[04:31] davecheney: https://github.com/juju/juju/pull/713
[04:31] davecheney: state/service.go:628
[04:34] waigani_: is it too late to not call the document OwnerTag
[04:34] 'cos it's not
[04:35] davecheney: nop, what would you like it called?
[04:35] anything, as long as it doesn't end with Tag
[04:35] there are two reasons for this
[04:35] 1. the data in there is not in tag string format
[04:35] 2. william has decreed that tags shall not be stored in the database
[04:36] GetOwner ?
[04:36] thumper: ^?
[04:36] sgtm
[04:37] davecheney: unfortunately it is indeed a string version of a tag
[04:37] davecheney: and I think that 2.
is flexible if it refers to a generic entity
[04:37] but in this case it certainly doesn't
[04:37] it is only a user
[04:38] ok, if it is a tag
[04:38] so it is a little more complicated
[04:38] then it should be called OwnerTag and it must be passed through ParseUserTag
[04:38] there was the suggestion to remove it altogether
[04:39] and clean it up
[04:39] thumper: fair enough
[04:39] i don't know the background
[04:39] just eating what's in front of me
[04:39] it was an early attempt to deal with permissions
[04:39] * thumper nods
[04:39] s/eating/digesting
[04:42] waigani_: this is turning out to be much more of a PITA than I wanted
[04:42] * thumper is considering the whole kill it approach
[04:42] thumper: doing last round of testing
[04:42] nuke it from orbit
[04:42] it is the only way to be sure
[04:43] thumper: you want me to drop the branch?
[04:44] waigani_: I
[04:44] ugh
[04:44] I'm thinking we may be throwing good effort after bad
[04:44] and we should perhaps just clean up the mess
[04:44] ooooh
[04:44] rather than pushing it into a nice pile in the corner
[04:45] I'd like to clarify with fwereade
[04:45] waigani_: however, removing it has more changes
[04:45] as all the deploy helpers now take a service owner
[04:45] that we would no longer need
[04:45] thumper: I'm just about done with this, shall I finish it off and push it up for reference if nothing else?
[04:46] waigani_: if you like, and we should get input from fwereade
[04:46] waigani_: don't spend too much more on it though
[04:46] understood
[04:46] waigani_: instead look at auditing the user manager functions that we have
[04:47] thumper: okay, where should I start with that?
[04:47] waigani_: look at what functions are implemented,
[04:47] compare CLI, api client, api server
[04:47] and state
[04:47] and look at strings vs.
tag usage
[04:47] ah right, got it
[04:48] I know there isn't consistency, but I want to know how inconsistent we are
=== Guest9121 is now known as wallyworld
[04:50] axw_: can you connect to cloud-images.ubuntu.com ?
[04:50] wallyworld: yep
[04:50] sigh, i can't :-(
[04:55] thumper: sorry, just saw this... yes I figured out why that test was passing - the test setup was wrong so it was passing for the wrong reason
[04:56] menn0: ok, in which case you should be good to go
[04:56] menn0: cheers
[05:05] axw_: can you run "juju metadata validate-images" for me to look up a precise image id on ec2, since i can't access cloud-images
[05:05] seems there's a routing issue :0(
[05:05] sure
[05:06] axw_: ah, got connectivity again
[05:06] okey dokey
=== urulama-afk is now known as urulama
[05:21] axw_: it appears there's a problem with trunk - i bootstrap with default-series=precise and machine 0 comes up ok. i deploy a charm, and machine 1 can't start: "no matching tools available"
[05:21] hrm
[05:21] I'll take a look
[05:21] wallyworld: which provider?
[05:21] ok, ta
[05:21] aws
[05:21] and are you doing --upload-tools?
[05:22] yep
[05:22] hm, weird. ok
[05:22] and also --upload-series=precise,trusty
[05:22] that shouldn't do anything anymore
[05:22] i'm running from a utopic client
[05:22] thought so, just did it in case
[05:23] you should get a deprecation warning for --upload-series... you did right?
[05:23] yeah, i did
[05:23] ok. I'll try and repro in a sec
[05:36] wallyworld: what did you try to deploy?
[05:36] ubuntu?
[05:36] mysql
[05:36] you didn't specify series?
[05:37] no
[05:37] k
[05:37] "1":
[05:37] agent-state-info: no matching tools available
[05:37] instance-id: pending
[05:37] series: precise
[05:38] wallyworld: just worked for me...
:(
[05:38] wallyworld: can you check cloud-init-output.log on machine-0 for lines saying "Adding tools"
[05:38] ok, i'll try again a bit later and try and reproduce
[05:38] i may have destroyed, i'll check
[05:39] wallyworld: oh I have an idea what it might be
[05:39] ok
[05:39] if you uploaded, then your uploaded tools will have series=utopic.. does our code know about utopic already?
[05:39] actually, probably does...
[05:40] should do, but i wanted precise tools
[05:40] wallyworld: yeah, what happens is the CLI uploads the tools it can build, and the bootstrap machine explodes them into each of the series of the same OS
[05:40] by "the tools it can build" I mean the local series
[05:41] hrm, actually it should be the series of the bootstrap machine not the local machine... will have to check it's doing the right thing
[05:42] checking machine-0, the only tools entry in cloud-init-output is 3b20f9692616c75f4df7326aed49efcfe520cbdeddeb39b8e19a59696e2975f8 /var/lib/juju/tools/1.21-alpha1.1-precise-amd64/tools.tar.gz
[05:42] wallyworld: nothing saying "Adding tools"
[05:43] ?
[05:43] not that i can see
[05:43] ok... can you please cat /var/lib/juju/tools/1.21-alpha1.1-precise-amd64/downloaded-tools.txt
[05:44] {"version":"1.21-alpha1.1-precise-amd64","url":"file:///tmp/juju-tools260863187/tools/releases/juju-1.21-alpha1.1-utopic-amd64.tgz","sha256":"3b20f9692616c75f4df7326aed49efcfe520cbdeddeb39b8e19a59696e2975f8","size":8198295}
[05:45] ah look
[05:45] utopic
[05:45] right, that's a bug
[05:45] thanks
[05:45] yet machine 0 is precise
[05:46] yeah, that URL is wrong and precise doesn't know about utopic, so it doesn't know it's Ubuntu
[05:59] wallyworld: just live testing a fix now, do you want a patch while I write a unit test?
[05:59] axw_: it's ok, i have been able to test what i needed
[05:59] cool
[05:59] mongo syslog is being spammed :-(
[06:00] i've reduced it, but it's still logging regularly about authenticating a user
[06:01] hmm, actually that URL shouldn't make a difference, only the version should. hrrmmm.
[06:01] I'll try faking my series
[06:26] wallyworld: can you please review https://github.com/juju/utils/pull/28
[06:27] * axw_ checks OCR
[06:27] asleeping
[06:27] if you're too busy, I can wait
[06:28] master is not happy with the apt retries though
=== kwmonroe_ is now known as kwmonroe
[06:53] wallyworld: I can't reproduce the issue. I've forced my local series to utopic, still nothing. That URL doesn't matter, I was misremembering what it was used for
[06:54] bootstrapped ec2 with default-series=precise, and deployed mysql with no issue
[07:02] morning
[07:02] dimitern, ping?
[07:04] axw_: hmmmm, ok. i'll try again a bit later
[07:05] tasdomas, hey
[07:05] dimitern, you pinged me yesterday - was afk at that moment
[07:05] wallyworld: CI doesn't look particularly happy either, though.
[07:06] axw_: looks like the upgrade jobs at first glance
[07:06] tasdomas, yes, it was about the port ranges work, we'll be inheriting from you :)
[07:06] dimitern, right - I'm addressing fwereade's comments as we speak
[07:07] tasdomas, can you give me a quick status update?
[07:07] dimitern, fixing up the PR (https://github.com/juju/juju/pull/517)
[07:08] dimitern, it's a large PR, fwereade requested that it be split up into smaller ones, unfortunately I won't be able to do that
[07:08] tasdomas, right, so how much time do you need?
[07:09] dimitern, to finish fixing the PR?
[07:09] tasdomas, I can perhaps take over and finish it if you don't have the time?
[07:10] tasdomas, I heard your team is focusing on other things now
[07:10] dimitern, that would be great
[07:11] dimitern, I'll finish what I am working on at the moment
[07:11] dimitern, do you want to have a hangout to discuss the port ranges work? Or do you want a small write-up on what's been done and what still needs to be done?
[07:11] tasdomas, ok, cool, I'll have a look to remember what's what and how to continue
[07:12] tasdomas, what works better for you?
[07:12] dimitern, ok, ping me if you have any questions
[07:12] dimitern, it doesn't really make a difference for me
[07:12] dimitern, whatever works best for you
[07:13] tasdomas, ok, then I'd rather have the writeup summary, as I'm doing like 3 things now :)
[07:14] dimitern, ok - you'll have it by lunch time (2-3 hours)
[07:14] tasdomas, thanks!
[07:14] dimitern, no, thank you
[07:15] dimitern, also, I've updated the PR https://github.com/juju/juju/pull/667 - when you have a sec, could you take a look?
[07:15] tasdomas, sure, looking
[07:33] tasdomas, LGTM
[07:33] dimitern, thanks - I'll update the error message before landing
[07:34] tasdomas, sweet!
[07:44] wallyworld: I have charms deploying without provider storage :) needs some polishing and more testing before I can propose anything
[07:45] also upgrade steps required this time
[08:01] morning
[08:07] TheMue, morning
[08:09] dimitern: regarding the last comment yesterday: yes, the suite is running twice, once for v0 and once for v1, during the first run the test for a function introduced with v1 is skipped
[08:09] dimitern: this way it's easy to check if v1 doesn't break compatibility to v0
[08:09] TheMue, yeah, I've seen this, but doesn't that seem an awkward way of running the tests?
[08:10] TheMue, how is that better than having 2 separate v0- and v1-only suites?
[08:12] dimitern: I thought about it, but then you 1st need a base test you can embed into the real ones, and then 2nd you have one for v0 and one for v1 with almost the same content, in my case only one additional test. that's lots of redundant code
[08:12] dimitern: because each new version has to ensure that it doesn't break existing functionality
[08:17] TheMue, ok, that sounds good to me
[08:18] dimitern: yeah, spent some time yesterday how to organize it best and to see, where the lowest dependencies exist
[08:19] TheMue, cheers
[08:20] jam: would also like to discuss it with you, mast of API versioning :D
[08:23] s/mast/master/
[08:30] TheMue, well, new versions are surely there *because* we want to break existing functionality -- when things don't change, yes, you get a duplicated test; but when they do I think it will be very hard to adapt that style of test
[08:30] TheMue, I understand where you're coming from
[08:33] TheMue, might it make most sense to have per-method suites? so then you can run the same per-method suite against multiple versions, hopefully minimising duplication without falling into a situation where adding a new version involves adding a new layer of special-casing to an over-general full-facade suite?
[08:33] fwereade: yes, I simply want to ensure that all functions of a former version work like before while those which are added or changed surely behave different
[08:34] fwereade: could you please expand a bit?
[08:34] fwereade: did you take a look into my proposal?
[08:34] TheMue, so, the concern is that having a single full suite with one special-case for one new method is defining the direction we'll take in the future
[08:34] simply to synchronize better
[08:34] TheMue, next method will be another special case
[08:35] TheMue, and then next version there's a change in functionality for some method
[08:35] TheMue, and whoever implements it will...
add another special case
[08:36] TheMue, and *very soon indeed* it will become straight-up impossible to understand what's happening in this single godlike test suite that actually tests slightly different things for all the api versions
[08:36] hazmat: I'm assuming you succeeded in building tokumx, but I've been struggling a bit. Did you grab their source control branches? What version? And did you use cmake or scons, as it looks like they want to switch to cmake (mongo itself uses scons), but I keep running into errors trying to build 1.5.0
[08:36] TheMue, I haven't seen the code we're talking about, though, I'm just going by what you said above
[08:38] fwereade: please take a look here: https://github.com/TheMue/juju/blob/capability-detection-for-networker/apiserver/machine/machiner_test.go
[08:39] fwereade: and I would like to see an outline of a per-method suite. this term sadly doesn't tell me a lot. ;)
[08:39] TheMue: a "Suite" object for each method, rather than one "Suite" for each Facade
[08:39] TheMue, you have a suite that tests all the methods, but special-cases some of them
[08:39] TheMue, I'm suggesting having lots of suites, defining our expectations of the behaviour of a single method each
[08:40] TheMue, and registering explicitly only the tests we actually want to run
[08:40] jam: ah, thanks
[08:40] TheMue, rather than mixing the what-to-test in with the how-to-test
[08:44] * TheMue tries to imagine how the code base will look like for a number of methods that are robust over time.
[08:45] so a v0 test would be embedded into a v1 test and so on, and only when it breaks, e.g. at v7, a new implementation would be made?
[08:46] my goal is a good compromise of test reusage and flexibility for changes over time.
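The per-method suite idea described above (one check per method, explicitly registered against each facade version that should have that behaviour) could be outlined roughly as below. This is a stdlib-only sketch, not gocheck and not the juju test code: machinerV0, machinerV1, the lifer and machinesGetter interfaces, and the check table are all hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Hypothetical facade versions; real juju facades look different.
type machinerV0 struct{}

func (machinerV0) Life(id string) string { return "alive" }

type machinerV1 struct{}

func (machinerV1) Life(id string) string              { return "alive" }
func (machinerV1) GetMachines(ids ...string) []string { return ids }

// Each per-method check depends only on the one method it tests.
type lifer interface{ Life(id string) string }
type machinesGetter interface{ GetMachines(ids ...string) []string }

// checkLife states the expected behaviour of Life, for any version.
func checkLife(f lifer) error {
	if got := f.Life("0"); got != "alive" {
		return fmt.Errorf("Life(0) = %q, want alive", got)
	}
	return nil
}

// checkGetMachines states the expected behaviour of GetMachines.
func checkGetMachines(f machinesGetter) error {
	if got := f.GetMachines("0", "1"); len(got) != 2 {
		return fmt.Errorf("GetMachines returned %d results, want 2", len(got))
	}
	return nil
}

type check struct {
	name string
	run  func() error
}

// registered lists explicitly which check runs against which version.
// Adding a version means adding entries here, not special-casing checks.
func registered() []check {
	return []check{
		{"Life/v0", func() error { return checkLife(machinerV0{}) }},
		{"Life/v1", func() error { return checkLife(machinerV1{}) }},
		{"GetMachines/v1", func() error { return checkGetMachines(machinerV1{}) }},
	}
}

// runAll runs every registered check and returns the failures.
func runAll() []string {
	var failed []string
	for _, c := range registered() {
		if err := c.run(); err != nil {
			failed = append(failed, c.name+": "+err.Error())
		}
	}
	return failed
}

func main() {
	fmt.Println("failures:", runAll())
}
```

The registration table keeps what-to-test (which versions get which check) separate from how-to-test (the body of each check), so no check needs to skip itself based on a version number.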
[08:46] TheMue, I'd rather avoid embedding anything at all anywhere really
[08:47] TheMue, I'm imagining there'd be a TestGetMachines suite, which gets set up to run its tests against v1 of the API
[08:47] so let's say we have 5 suites for a v0, I add a new method, now have e.g. 6 suites for v1, and then in v2 I add two more and change one ...
[08:48] TheMue, and all the other suites test against both v1 and v0
[08:48] fwereade: no embedding, code duplication instead?
[08:48] TheMue, where did I suggest we duplicate code?
[08:48] fwereade: that's why I ask
[08:49] TheMue, you write one suite, that is capable of testing that some method implementation acts as expected
[08:49] fwereade: simply to get better aware of your thoughts ;)
[08:49] TheMue, you then feed all the facade versions that you expect to have that behaviour into that suite
[08:49] TheMue, so adding a new version is a matter of adding the new version to the suite for each method it still uses
[08:50] TheMue, new method? new suite, targeting just that facade
[08:50] fwereade: ok, that's what I'm doing (when I get your word right), but for the whole suite with more than one method to test
[08:50] TheMue, yes
[08:50] aaaaaaah
[08:50] TheMue, I just want more granularity
[08:51] fwereade: instead of using the skipping or evil switches based on the version number inside the tests
[08:51] TheMue, I think (particularly for the bigger facades) full-facade suites will become unmanageable really alarmingly fast
[08:51] fwereade: sounds cool
[08:53] TheMue, on a separate note, what does Machiner need GetMachines for?
[08:55] TheMue, ah, whether something's on a manual provider? what do we use that for?
[08:55] fwereade: IIRC for stuff that was on the Agent API but doesn't do anything for Unit agents, and thus is a Machiner responsibility
[08:55] fwereade: the whole branch is about the need for a safe networker. and here we needed the information if a machine is provisioned manually.
first approach has been to retrieve the information extra, as it isn't needed so often.
[08:56] jam, TheMue: then that is *definitely* not a machiner responsibility -- the machiner doesn't start the networker
[08:56] fwereade: but review and discussion feedback has been to not make an extra call, so I changed the way we retrieve a machine info on the client side of the API
[08:58] TheMue, jam: this feels like it should be a job, as communicated by the agent api, rather than tacking it onto an unrelated purpose-specific facade
[08:58] TheMue, jam: am I confused about something?
[08:58] fwereade: so, previously there was an API on Agent that was giving you the Life of the entity you wanted, and a bunch of other Machine related stuff that didn't make sense for Unit agents.
[08:58] fwereade: what is the task of the machiner API?
[08:58] fwereade: GetEntity IIRC, looking
[08:59] fwereade: naive, by taking the term "Machiner" I would expect machine related API calls, like retrieving information about a machine
[08:59] TheMue, set the machine to dead once it's marked as dying, and shut down
[08:59] TheMue, it also sends network addresses once on startup which is a bit yucky
[09:00] TheMue, the facades are all worker-specific
[09:00] TheMue, they should be exactly what's needed for a remote worker to fulfil its (ideally *single*) responsibility
[09:02] fwereade: here's my problem from a maintenance perspective. wanting to do something related to machines it always pulls me to the term "Machine" or "Machiner", but never to something called "Agent"
[09:02] fwereade: so today we have AgentGetEntitiesResult which has 1 field that is actually shared, and then 2 fields that aren't meaningful for Unit agents, we would have been adding a 3rd. It felt better to split that out for Machine-Agent specific responsibilities.
[09:03] I see your point that Machiner is the worker, not the Machine-Agent api
[09:03] but do we have a Facade for just machine agents (vs all agents in general), do we want one? Is it just better to pull it out of Agent.GetEntities and make it something Agent.GetMachineDetails sort of thing?
[09:03] fwereade: ^^
[09:03] jam, TheMue: IMO the separate existence of unit agents is the anomaly -- making the agent api more machine-agenty doesn't seem to me to be a particularly major issue, because it echoes where we want to go anyway
[09:03] jam, fwereade: so maybe there's a need for two facades: "Machiner"/"MachineWorker" and "Machine"
[09:03] TheMue, I don't think so
[09:03] TheMue, what's the worker that uses "Machine"
[09:03] ?
[09:04] fwereade: are only workers using the API?
[09:04] TheMue, and agents; and external clients; but essentially, yes
[09:04] TheMue, and an agent is almost a special case of a worker
[09:05] TheMue, it's the "worker" that starts other workers
[09:05] TheMue, and what we have hitherto done is (1) use the Jobs to figure out what to start
[09:05] TheMue, or (2) pull hacky shite out of the agent config instead
[09:06] TheMue, the latter is not good
[09:06]