[00:40] Bug #1582948 opened: juju status: base statuses of "unknown" are not instilling confidence
[03:10] yay, launchpad is having a fit and CI jobs are failing
[03:17] wallyworld: the charmstore should just set origin to store on the resource data it gives back. If it's possible some other jazzy values could be set in the future, let's just set the right thing now and not have to worry about it later. Having a field in the return struct that is explicitly not set is galling
[03:38] natefinch: that can be done in charmrepo right?
[03:39] natefinch: just saw the admin->controller rename - did you check with QA to see what impacts it would have on their scripts?
[03:40] wallyworld: got a ping from cheryl to hold up, so I'm not landing it for now. It honestly didn't occur to me that it would break their scripts.... I kinda wish we didn't have all these implicit dependencies in the CI code :/
[03:40] wallyworld: about the origin - it's the charmstore code itself that needs to return "store"
[03:40] natefinch: well, it's to be expected isn't it? the model name is a part of the output we need to query
[03:41] natefinch: the charmstore won't do that I don't think, so we'll just need to do it in charmrepo for now
[03:43] wallyworld: I'll fix it for now, but I think it's wrong. Either the charmstore should always populate it, or never populate it. Not populating it some of the time and expecting everyone to just understand that means origin=store is just plain bad.
[03:43] natefinch: not disagreeing :-) need to follow up this end. this will get us out of trouble though for now
[03:48] wallyworld: trouble for deleting a value that isn't set that only we read....
[03:49] http://data.whicdn.com/images/104484357/original.gif
[03:51] natefinch: it will be set in the future, but just not now
[03:52] https://imgflip.com/readImage?iid=9310200
=== JoseeAntonioR is now known as jose
[05:13] if tools.SHA256 == "" {
[05:13] logger.Warningf("no SHA-256 hash for %v", tools.SHA256)
[05:13] spot the bug
=== hazmat_ is now known as hazmat
=== bodie__ is now known as bodie_
=== meetingology` is now known as meetingology
=== cargonza_ is now known as cargonza
=== xnox_ is now known as xnox
=== Ursinha_ is now known as Ursinha
=== gsamfira_ is now known as gsamfira
[10:13] dimitern: ping
[10:14] voidspace: pong
[10:15] dimitern: I have a test for linklayerdevices that no longer fails, and I think the new behaviour is *correct*, but want to run it by you
[10:15] voidspace: sure
[10:15] dimitern: there is a test that calling SetLinkLayerDevices with a matching name and provider id to an existing linklayer device fails validation
[10:15] dimitern: this used to fail validation because the provider id matched an existing one which was checked before everything else
[10:16] dimitern: *however*, if the name matches (i.e. the docID is the same) it's treated as an update not an insert
[10:16] dimitern: and as the providerid is already set in the original the insert doesn't attempt to change the provider id
[10:16] dimitern: so the update works and the test *fails*
[10:17] dimitern: however, because this is actually an update not an insert I think the new behaviour is correct
[10:17] voidspace: well, that test won't be relevant anymore
[10:18] dimitern: yep, just checking you agreed I could just remove the test
[10:18] dimitern: I'm adding a new test that an update that attempts to add a duplicate provider id fails
[10:18] voidspace: because it tests 'failing validation', whereas now we fail as a side-effect of the txn.Ops triggering ErrAborted
[10:19] voidspace: I'd like to have a look at the code still, but so far your plan sounds good
[10:19] dimitern: cool
[10:19] voidspace: (when you're ready ofc)
[10:19] dimitern: I think all tests now pass
[10:19] dimitern: doing a run to check
[10:19] if they pass I'll propose the branch
[10:19] voidspace: sweet!
[10:20] dimitern: by the way - current behaviour for an update that attempts to change ProviderID is that the change is silently ignored
[10:20] dimitern: shall I change that to an error?
[10:21] (this is the current *pre-existing* behaviour - not a change I've made)
[10:21] voidspace: it should only be an error if the original prID != the new one we're trying to set (and the original is also != "")
[10:23] dimitern: yep, exactly
[10:24] dimitern: currently if original != "" then the new one is just ignored
[10:25] dimitern: ok, all state tests pass - I'll make this change and a test and then propose
[10:27] voidspace: please make sure you also run worker/provisioner/ and apiserver/common/networkingcommon/ tests as well to double check
[10:28] dimitern: well, the merge attempt runs everything...
[10:32] voidspace: true :)
[10:33] https://github.com/juju/testing/pull/99
[10:43] dimitern: http://reviews.vapour.ws/r/4859/
[10:44] Bug #1583109 opened: error: private-address not set
[10:47] dimitern: state, apiserver and provisioner tests pass
[10:53] voidspace: otp, will look in a bit
[10:59] Bug #1583109 changed: error: private-address not set
[11:08] Bug #1583109 opened: error: private-address not set
[11:12] voidspace: reviewed
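[Aside] A minimal sketch of the ProviderID rule voidspace and dimitern settled on above (10:20-10:24): changing an already-set provider ID on update becomes an error, while setting it for the first time stays allowed. All names here are invented for illustration; the real check lives in juju's state link-layer device code.

    package main

    import "fmt"

    // validateProviderIDChange applies the agreed rule: error only when the
    // original ID is non-empty and differs from the proposed one.
    func validateProviderIDChange(original, proposed string) error {
        if original != "" && original != proposed {
            return fmt.Errorf("cannot change provider ID from %q to %q", original, proposed)
        }
        return nil
    }

    func main() {
        fmt.Println(validateProviderIDChange("net-1", "net-2")) // error: already set
        fmt.Println(validateProviderIDChange("", "net-2"))      // <nil>: first-time set is fine
    }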
[11:27] dimitern, not sure whether you have cycles but that bug ^^ is killing our CI atm...
[11:28] jamespage: oh, ok I'll have a look in a bit
[11:28] dimitern, thank you
[11:37] jamespage: it looks like the machine hosting keystone/0 did not start ok so that's why the unit had no private address
[11:37] dimitern, nope - the machine did start ok
[11:37] it's running the install hook when that happens
[11:38] dimitern, I have the console output for the machine - one sec
[11:38] jamespage: I can see machine "7" stuck in "pending" from the linked paste of juju status
[11:38] dimitern, I think that's the effect the error has...
[11:39] dimitern, https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_amulet_full/openstack/charm-cinder/317913/1/2016-05-18_09-29-55/test_charm_amulet_full/logs/cinder-0-nova-console-1.bz2
[11:39] that's the console log from the machine
[11:40] jamespage: that looks like the log for machine-1 - do you have the log of machine-7 from that same run?
[11:41] dimitern, sorry crossed tests - that was from a different failed run
[11:42] dimitern, https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline/openstack/charm-ceph/317910/1/2016-05-18_08-48-05/test_charm_amulet_smoke/logs/keystone-0-nova-console-7.bz2
[11:42] jamespage: I'm looking at https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline/openstack/charm-ceph/317910/1/2016-05-18_08-48-05/juju-stat-yaml-collect.txt
[11:42] dimitern, https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline/openstack/charm-ceph/317910/1/2016-05-18_08-48-05/index.html
[11:42] jamespage: ah. ok
[11:42] for the full index of data
[11:43] dimitern,
[11:43] machine-7[3059]: 2016-05-18 09:13:45 INFO juju.worker runner.go:275 stopped "machiner", err: setting machine addresses: cannot set machine addresses of machine 7: state changing too quickly; try again soon
[11:43] machine-7[3059]: 2016-05-18 09:13:45 DEBUG juju.worker runner.go:203 "machiner" done: setting machine addresses: cannot set machine addresses of machine 7: state changing too quickly; try again soon
[11:43] machine-7[3059]: 2016-05-18 09:13:45 ERROR juju.worker runner.go:223 exited "machiner": setting machine addresses: cannot set machine addresses of machine 7: state changing too quickly; try again soon
[11:43] machine-7[3059]: 2016-05-18 09:13:45 INFO juju.worker runner.go:261 restarting "machiner" in 3s
[11:43] seems symptomatic of when this happens
[11:44] jamespage: right! that I know about
[11:44] dimitern, one might cause the other?
[11:44] jamespage: unfortunately I wasn't able to reproduce it at will, seems quite intermittent
[11:44] dimitern, intermittent - yes
[11:44] dimitern, but I hit this three times today - bear in mind we're spinning 1000's of units of charms a day...
[11:45] jamespage: indeed - that error causes the MA to bounce repeatedly, before setting status to "started", and because setting the addresses failed both the machine and unit will lack a private address
[11:46] jamespage: well, it will help to collect the machine-0 logs with logging-config='=TRACE' when it happens
[11:46] beisner, can we turn that on?
[11:46] ^^
[11:47] jamespage: beisner: juju set-env logging-config='=TRACE'
[11:47] (assuming 1.25)
[11:48] dimitern, will enable that for all future jobs
[11:48] jamespage: that will produce tons of extra logs, beware, but will also include details down to the actual mgo transactions
[11:48] we'll need to turn it back off as soon as we figure out what's going on
[11:49] beisner: cheers! with your setup it should be easy to catch why it happens, however intermittent it might be
[11:49] dimitern, we have enough activity that we're seeing it quite often actually
[11:49] dimitern, with each deploy being 10-25 units, chances are, 1 of those units will have this happen on about half the jobs.
[11:49] so we should be able to repro with logs in short order
[11:50] beisner, jamespage: when it happens again, machine-0.log (where the apiserver is) and machine-X.log (the one "pending") will both be very useful
[11:50] dimitern, our bot can't automatically grab the log from machine-x because the agent is awol, it has no address, and we can't ssh into it.
[11:50] so that will take a manual reproduction
[11:51] beisner: well, can you get to juju's machine-0.log?
[11:51] dimitern, definitely
[11:51] beisner: sweet! that's the important one actually (where the txns are logged)
=== \b is now known as benonsoftware
[11:58] jamespage, dimitern - hrm. this will be extremely tricky as we can't set-env logging until after bootstrap, but bootstrap is done by amulet, after our script has left the building.
[11:58] dimitern, is there an environments.yaml way to do this?
[12:00] beisner: sure, add `logging-config: '=TRACE'` to the env.yaml
[12:05] dimitern, ok cool - juju run sed fu for the win. we already had that =DEBUG :-) all set here jamespage
[12:10] beisner: thanks!
[12:10] beisner, jamespage: please, let me know when you repro the issue with the extra logging
[12:46] * dimitern steps out for ~1h
[13:45] Bug #1583170 opened: `juju help placement` does not exist
[13:47] #1583170 seems like #1580946
[13:47] Bug #1583170: `juju help placement` does not exist
[13:47] Bug #1580946: Juju 2 help commands for constraints or placement return ERROR unknown command
[13:54] OerHeks: definitely. i didn't see that
[13:55] i got distracted by juju/juju2
[14:15] Bug #1583170 changed: `juju help placement` does not exist
[14:49] dimitern, ping
[14:51] alexisb: hey
[15:04] * perrito666 returns from the dead
[15:06] wb zombie perrito666
[15:06] lazyPower: tx, hey are those brains you have there?
[15:07] indeed, have some seasoning salt
[15:07] you're going to inherit all my charms once you eat my brains though... i setup a deadman switch. #yolo
[15:08] * perrito666 eats a cracker
[15:10] ;) i thought that might dissuade you
=== tasdomas` is now known as tasdomas
[16:11] katco, ericsnow: oh yeah, I also did the rename admin model to controller, but it was put on hold.
[16:12] natefinch: mark doesn't want it anymore?
[16:12] katco: not sure. Cheryl posted to the PR "Please don't land this just yet. We're still getting feedback on this requested change."
[16:12] natefinch: k, thanks for doing that anyway
[16:13] katco: I should probably land all but the actual constant name, and then all we'd need to do is change the constant in another PR :)
[16:14] katco: but I'll wait for now, see how it shakes out
[16:14] natefinch: hey that's a good idea
[16:18] babbageclunk, voidspace, dooferlad: anyone still around? I'd appreciate a review on this (mostly removals of legacy/obsolete/unused code): http://reviews.vapour.ws/r/4865/
[16:18] that's a prerequisite to fixing bug 1574844
[16:18] Bug #1574844: juju2 gives ipv6 address for one lxd, rabbit doesn't appreciate it.
[16:18] Sure - looking now
[16:19] katco: here you go: branched, then put controller model name back to what it was before: https://github.com/juju/juju/pull/5428
[16:19] babbageclunk: thanks!
[16:19] ericsnow: this is your weekly notification that reviewboard is broken
[16:23] dimitern: I'm still around
[16:23] dimitern: ah, babbageclunk beat me to it
[16:24] natefinch: looks like it's just you
[16:24] voidspace: I'm sure you'd like it though :)
[16:24] ericsnow: weird: https://github.com/juju/juju/pull/5428
[16:24] natefinch: yeah, looking at it
[16:28] dimitern: reviewed, looks great!
[16:28] babbageclunk: awesome, tyvm!
[16:30] dimitern: (Well, there was one comment.)
[16:31] dimitern: In exchange, can you explain to me how watchers work? :)
[16:31] babbageclunk: sure, I can try :)
[16:32] dimitern: So the remaining failures I have of my changed code against mongo2.4 are watchers.
[16:32] babbageclunk: basically any change to a collection as a result of running []txn.Op is reported by watchers
[16:32] babbageclunk: and the changes (IIRC) are detect by looking into the txnLog
[16:33] detected*
[16:33] dimitern: ok, so they're goroutines that watch txns.log with specific filter criteria?
[16:34] babbageclunk: yeah, more or less - the nitty gritty low level details are in state/watcher/ (rather than state/watcher.go)
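[Aside] To make that "yeah, more or less" concrete: the technique is a goroutine reading new entries out of the capped txns.log collection and fanning matching document ids out to interested watchers. This is a deliberately simplified sketch using mgo's tailable cursors; the entry shape and all names are assumptions, and the real implementation in state/watcher/ (which decodes the actual txns.log schema) works differently in detail.

    package watchersketch

    import (
        "time"

        "gopkg.in/mgo.v2"
        "gopkg.in/mgo.v2/bson"
    )

    // txnLogEntry only loosely mirrors a txns.log entry; the real schema
    // is different (per-collection keys and revision numbers).
    type txnLogEntry struct {
        ID   bson.ObjectId `bson:"_id"`
        Docs []string      `bson:"docs"` // hypothetical: ids touched by the txn
    }

    // watchTxnLog tails txns.log and sends the ids of changed documents
    // that match filter, until the tailable cursor dies or errors.
    func watchTxnLog(txnLog *mgo.Collection, filter func(string) bool, out chan<- string) error {
        iter := txnLog.Find(nil).Sort("$natural").Tail(5 * time.Second)
        defer iter.Close()
        var entry txnLogEntry
        for {
            for iter.Next(&entry) {
                for _, id := range entry.Docs {
                    if filter(id) {
                        out <- id
                    }
                }
            }
            if err := iter.Err(); err != nil {
                return err
            }
            if !iter.Timeout() {
                return nil // cursor died; a real watcher would re-query here
            }
        }
    }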
[16:35] dimitern: Hmm. I clear out txns.log (actually, since it's a capped collection I drop and recreate it).
[16:35] dimitern: ...between tests I mean.
[16:36] dimitern: hah, yeah - looks good
[16:36] babbageclunk: there's also the "presence" thingy - which is related to watchers, but I'm less familiar with it (can't remember whether it was tracking active agent connections or life cycle changes alive->dying->dead in a collection)
[16:37] dimitern: can you really remove stuff from the agent/format-1.18.go though?
[16:37] babbageclunk: there's more to txns than the log, there are a few other internal collections used to track txn state, what was applied, whether it's being applied or aborted, etc (something called "stash" IIRC)
[16:38] voidspace: yeah, I think so - don't let the name mislead you :) that's the current format we're using
[16:38] dimitern: ah....
[16:40] dimitern: Thanks, that gives me some stuff to go on with.
[16:42] dimitern: my only worry would be that this might cause IPv6 addresses to leak out unwanted
[16:42] dimitern: but as far as I can tell that was a risk anyway
[16:42] dimitern: so I don't think this makes it worse
[16:43] voidspace: there are no changes to behavior - the removed code paths deal with preferIPv6=true, but as I pointed out in the PR description, it was hard-coded to false some time ago now
[16:44] dimitern: heh, ok
[16:44] natefinch: sorry was otp. lgtm
[16:45] katco: thanks :)
[16:46] babbageclunk: I've fwd you a couple of ML discussions around txns and pruning logs
[16:53] dimitern: Changing tack a bit - I've got a test that's failing here: https://github.com/juju/juju/blob/master/state/volume_test.go#L354
[16:54] dimitern: Saying got []string{"0/0", "0/1"}, expected []string{"0/1", "0/2"}
[16:55] dimitern: What are the values in the change list? Are they units?
[16:55] babbageclunk: ah, that sounds like a sequence is reset to 0
[16:56] babbageclunk: so some entities, e.g. machines and units, but also others rely on auto-incremented ids coming from the sequences collection
[16:56] dimitern: Sequences sound like some state that I might be clobbering now that wasn't before?
[16:57] dimitern: Aha, that sounds like a possibility. I'm going to dump what's in that collection in successful and failing runs.
[17:01] babbageclunk: yeah, it's either that or the test itself relies on volume entities starting from 1 rather than 0?
[17:03] dimitern: Yeah, those are the values in the assertion.
[17:04] dimitern: So maybe there's something that makes sure sequence is populated appropriately that I'm blowing away?
[17:05] babbageclunk: I doubt that - but have a look at StorageStateSuiteBase and/or ConnSuite it uses for possible clues
[17:12] * dimitern hits eod
[17:31] Bug #1583274 opened: Openstack base bundle 42 fails to start rabbitmq when deployed with Juju/MAAS 2.0
[18:07] I think I need another coffee before looking at your PRs ericsnow
[18:08] redir: k :)
[18:09] fwereade__: yt?
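[Aside] The sequences collection babbageclunk and dimitern discuss above holds one counter document per entity kind, bumped atomically to hand out the next id. A sketch of the pattern using mgo's findAndModify wrapper - the collection layout and field names here are assumptions, not juju's exact schema:

    package sequencesketch

    import (
        "gopkg.in/mgo.v2"
        "gopkg.in/mgo.v2/bson"
    )

    // nextSequence atomically increments and returns the counter stored
    // under name, creating it on first use so the first caller gets 0.
    func nextSequence(sequences *mgo.Collection, name string) (int, error) {
        var doc struct {
            Counter int `bson:"counter"`
        }
        _, err := sequences.FindId(name).Apply(mgo.Change{
            Update:    bson.M{"$inc": bson.M{"counter": 1}},
            Upsert:    true,
            ReturnNew: true,
        }, &doc)
        if err != nil {
            return 0, err
        }
        return doc.Counter - 1, nil
    }

Dropping and recreating the collection between tests resets every counter to zero, which matches the symptom above: volume ids coming back as "0/0", "0/1" where the assertion expected "0/1", "0/2".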
[18:17] wallyworld: problem
[18:18] katco: I've tried to attach a resource
[18:18] and it's not working
[18:19] marcoceppi: ok. can you pastebin the steps or something?
[18:20] katco: http://paste.ubuntu.com/16498534/
[18:20] marcoceppi: ty
[18:24] marcoceppi: i think you have to specify the --resource flag for each resource when publishing
[18:25] marcoceppi: attaching just makes the blob available to be referenced
[18:25] marcoceppi: publishing is the act of coupling the charm in the channel with a revision of a resource that has been attached
[18:28] marcoceppi: does that make sense?
[18:28] katco: what's the param for --resource in publish?
[18:29] marcoceppi: -
[18:30] marcoceppi: `charm list-resources ~marcoceppi/charm-svg` lists the revisions as -1. that's interesting.
[18:33] katco: marcoceppi: i noticed the -1 thing too and assumed it was because no resources had been published yet
[18:35] wallyworld: marcoceppi: ah that's exactly what it is. but: charm list-resources ~marcoceppi/charm-svg -c development
[18:35] wallyworld: marcoceppi: also lists -1. because you don't really publish to the development channel, that's just a dumping ground
[18:36] katco: i get a macaroon error
[18:36] marcoceppi: can you try: charm publish ~marcoceppi/charm-svg-3 --resource python-jujusvg-1 --resource webapp-1
[18:37] wallyworld: what version of the charm tool are you on?
[18:37] natefinch: FYI, looks like you already have a review request up (unpublished? discarded?) for that commit hash
[18:39] katco: recent version, casey is checking
[18:40] Bug #1255799 changed: juju installing cloud-tools archive on non-bootstrap nodes
[18:40] Bug #1258132 changed: [manual] bootstrap fails in livecd due to juju-db not starting
[18:52] Bug #1583304 opened: upload-tools appends instead of increment version numbers
[19:05] katco: thanks! wallyworld it's pushed
[19:05] marcoceppi: ty, will look. i can't log in to charm store atm though. sigh
[19:05] wallyworld: sounds like a personal problem?
[19:05] marcoceppi: maybe, but casey also can't log out
[19:06] so there's something weird happening
[19:12] story of my life wallyworld
[19:18] ericsnow: thanks... I really wish reviewboard/the bot would handle that differently. it's a different PR, it should just get a new review
[19:18] natefinch: agreed
[19:20] * perrito666 postpones all his meetings because he lost the ability to speak in English for long periods
[19:22] lol
[19:43] review anyone? http://reviews.vapour.ws/r/4801/ katco, this is the choose-a-series logic one that you already reviewed once.
[19:44] natefinch: tal
[19:47] natefinch: did you just move the code to different files? hard to see the diff
[19:48] katco: yes, and added comments to the fields in series selector. Mostly I punted on the supportedSeries changes, because it would complicate this code due to things that are really outside its purview
=== redir is now known as redir_lunch
[19:50] natefinch: understand. ship it
[19:50] katco: thanks
[19:55] katco: huzzah, now the change admin model name to "controller" is a single line change: https://github.com/juju/juju/pull/5419/files
[19:56] natefinch: :) sometimes DRY is a good thing
[20:25] ericsnow: hey, why are we registering the components in our tests? e.g. cmd/juju/service/bundle_resource_test.go ?
[20:26] katco: because of state and the full-stack testing
[20:27] ericsnow: can you elaborate a bit? fwereade__ got me curious
[20:28] katco: if you try to do resource-related stuff in state and resources haven't been registered in state then it blows up
[20:28] katco: some of our full-stack tests do resource-related stuff (e.g. bundle_resource_test.go)
[20:30] ericsnow: ah, so some of the existing tests, which are full stack, deal with state which expects resources to be there. therefore it must be registered.
[20:30] katco: yep
[20:30] fwereade__: there is your answer. we are not in a final state ^^
[20:31] fwereade__: this is a mid-step solution
=== redir_lunch is now known as redir
[22:47] anastasiamac_: redir hey, I can't talk for more than a few mins, I'll miss standup
[22:47] I got restore working, now working on the tests, cheers
[22:47] hope you feel better RSN perrito666
[22:59] perrito666: well done with restore \o/ get better :D
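[Aside] The registration issue katco and ericsnow walk through at 20:25-20:31, reduced to a toy registry. Everything here is invented for illustration and juju's actual component registration is more involved, but the failure mode is the same shape: state-level code looks up a handler that only exists if the component registered itself first, so full-stack tests have to perform that registration too.

    package registrysketch

    import "fmt"

    // handlers maps component names to their state-level hooks; the
    // resources component would add itself here at init time.
    var handlers = map[string]func() error{}

    // Register makes a component's hook visible to state-level code.
    // Full-stack tests call this explicitly before touching resources.
    func Register(component string, hook func() error) {
        handlers[component] = hook
    }

    // runHook is the "blows up" path: resource-related state operations
    // fail if the resources component was never registered.
    func runHook(component string) error {
        hook, ok := handlers[component]
        if !ok {
            return fmt.Errorf("component %q not registered", component)
        }
        return hook()
    }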