[00:32] <menn0> wallyworld: this doesn't actually need a review right? http://reviews.vapour.ws/r/1653/diff/
[00:48] <mup> Bug #1454468 was opened: Hyper-V Server 2012 R2 node deployed by maas but juju status remains pending <oil> <juju-core:New> <https://launchpad.net/bugs/1454468>
[01:15] <wallyworld> menn0: hey, sorry was out at accountant. anastasia has looked at it, i was just waiting for 1.24 to be unblocked
[01:15] <wallyworld> thanks for asking
[01:17] <wallyworld> ericsnow: it was master
[01:17] <davecheney> thumper: i have a horrible feeling I know what is going on with arm64 and juju
[01:17] <davecheney> long story short, the text of most of the net error messages has changed
[01:18] <davecheney> i'm sure we're doing string sniffing
[01:18] <davecheney> that's why it's not figuring out when ports are not available and stuff
[01:22] <menn0> wallyworld: ok cool
[01:23] <menn0> wallyworld: what's blocking 1.24? I don't see anything
[01:23] <wallyworld> menn0: was blocked yesterday, and i landed another one first this morning before i had to pop out. will land this one now
[01:25] <menn0> wallyworld: cool cool
[01:26] <davecheney> thumper: yup, i was right
[01:26] <davecheney> juju local provider tests fail on amd64
[01:26] <davecheney> under go 1.5
[01:26] <davecheney> (which is what we have to use for arm64)
[01:26] <davecheney> it's the changes to the net output text
[01:26] <davecheney> :-<
[01:26] <davecheney> grr
[01:32] <natefinch-afk> ...and what have we learned, kids?  Don't do string matching on error messages.
[01:36] <mup> Bug #1454359 changed: Unexpected api shutdown followed by panic <juju-core:New> <https://launchpad.net/bugs/1454359>
[01:39] <davecheney> natefinch-afk: sad trombone
[01:40] <natefinch> yeah....   it's unfortunate that gocheck makes it so easy to do string parsing on errors
[01:41] <thumper> davecheney: net output text?
[01:41] <davecheney> my guess is somewhere in the local provider we're looking at err.Error() to figure out what kind of failure it was
[01:42] <thumper> ugh
[01:43] <davecheney> yeah, this is going to suck for a while
[01:43] <davecheney> straddling a bunch of different go versions
[01:44] <natefinch> good for us... make us stop doing stupid things
[01:44] <davecheney> http://paste.ubuntu.com/11105279/
[01:45] <davecheney> lets hear it for juju/errors
[01:46] <menn0> davecheney: nice
[02:21] <natefinch> I need a smart diff tool that can diff two git patch files to prove a forward port really is just a copy of another PR.
[02:21] <natefinch> it has to be smart, because for some reason, the order of files in the patches is not constant, so you can't easily just do a text diff
[02:23] <natefinch> related... can someone review this forward port to 1.23 of a fix already merged to 1.22?  http://reviews.vapour.ws/r/1659/
[02:23] <natefinch> menn0: if you're not working on that customer issue? ^
[02:27] <menn0> natefinch: if it's a forward port of an already reviewed PR with minimal additional changes then it doesn't need another review
[02:27] <natefinch> menn0: even better
[02:27] <menn0> natefinch: I did see it :)
[02:28] <natefinch> :)
[02:38] <davecheney> natefinch: menn0 thumper https://github.com/juju/juju/pull/2304
[02:39] <natefinch> davecheney: wow, that seems kind of broken
[02:39] <davecheney> the contract for the net package says the errors will be of type *net.OpError
[02:40] <davecheney> natefinch: https://groups.google.com/d/topic/golang-dev/nHMOfuSBYLs/discussion
[02:41] <mwhudson> davecheney: is this why everything was timing out?
[02:43] <davecheney> yup
[02:43] <davecheney> well, sort of the opposite
[02:43] <davecheney> the test was expecting that ECONNREFUSED was ok
[02:43] <davecheney> err
[02:43] <davecheney> it was expecting to get ECONNREFUSED
[02:43] <davecheney> it got another error and freaked
[02:44] <natefinch> wish there had just been a IsConnectionRefused(error) bool
[02:45] <davecheney> i considered breaking that big block of logic out into a function
[02:45] <davecheney> if you review my PR
[02:45] <davecheney> you can ask me to do that
[02:45] <natefinch> davecheney: I meant in the stdlib :)
[02:46] <davecheney> hmm
[02:46] <davecheney> maybe
[02:46] <davecheney> but where does it end
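The helper natefinch wishes for can be sketched by inspecting error types instead of message text. This is only an illustration, not the code from the PR linked above: `isConnectionRefused` and `refusedDialError` are hypothetical names, and the sketch assumes the errno may sit either directly in `*net.OpError` or one layer down inside `*os.SyscallError`, depending on the Go release.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// isConnectionRefused reports whether err is a *net.OpError carrying
// ECONNREFUSED. It inspects types only, never the message text.
func isConnectionRefused(err error) bool {
	opErr, ok := err.(*net.OpError)
	if !ok {
		return false
	}
	inner := opErr.Err
	// Assumption: some Go releases wrap the errno in *os.SyscallError,
	// others store it directly, so unwrap one optional layer.
	if sysErr, ok := inner.(*os.SyscallError); ok {
		inner = sysErr.Err
	}
	errno, ok := inner.(syscall.Errno)
	return ok && errno == syscall.ECONNREFUSED
}

// refusedDialError builds a sample error shaped like a failed dial
// (hypothetical helper, used only to exercise the check).
func refusedDialError(wrapped bool) error {
	var inner error = syscall.ECONNREFUSED
	if wrapped {
		inner = os.NewSyscallError("connect", inner)
	}
	return &net.OpError{Op: "dial", Net: "tcp", Err: inner}
}

func main() {
	fmt.Println(isConnectionRefused(refusedDialError(false)))
	fmt.Println(isConnectionRefused(refusedDialError(true)))
	// A plain error with a matching message is deliberately not accepted.
	fmt.Println(isConnectionRefused(errors.New("connection refused")))
}
```

Because the check never reads `err.Error()`, a change in the wording of net error messages between Go versions would not affect it.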
[02:49] <natefinch> dang, I always forget how to do "fix it then ship it" :/
[02:49] <natefinch> there, close enough
[02:49] <menn0> davecheney: looking
[02:50] <natefinch> davecheney: lol, just saw your post to that golang-dev thread on this change.
[02:51] <davecheney> natefinch: pride comes before the fall
[02:58] <davecheney> natefinch: menn0 thanks for your review
[02:58] <davecheney> i've pulled the logic out into a function
[02:59] <davecheney> let's see if it'll fit through the CI needle
[03:21] <mup> Bug #1454481 was opened: juju log spams ERROR juju.worker.diskmanager lsblk.go:111 error checking if "sr0" is in use: open /dev/sr0: no medium found <juju-core:New> <https://launchpad.net/bugs/1454481>
[03:22]  * thumper snorts
[03:22] <thumper> davecheney: I also looked but you already had two shipits
[03:22] <thumper> so I didn't bother adding another
[03:23] <thumper> was more snorting at the "fit through the CI needle" comment :-)
[03:24] <anastasiamac> thumper: interesting how ur virtual reactions are so different to ur real life ones..
[03:27] <thumper> anastasiamac: or not so different...
[03:27] <thumper> anastasiamac: although to be honest, I have never stabbed anyone in the face in person
[03:27] <mup> Bug #1454481 changed: juju log spams ERROR juju.worker.diskmanager lsblk.go:111 error checking if "sr0" is in use: open /dev/sr0: no medium found <juju-core:New> <https://launchpad.net/bugs/1454481>
[03:28] <anastasiamac> can't imagine u  "snort"ing :D
[03:28]  * natefinch can.
[03:28]  * natefinch ducks
[03:28] <anastasiamac> u r brave, Nate :D
[03:29] <natefinch> anastasiamac: nah, I'm just very far away ;)
[03:30] <anastasiamac> :D
[03:36] <mup> Bug #1454481 was opened: juju log spams ERROR juju.worker.diskmanager lsblk.go:111 error checking if "sr0" is in use: open /dev/sr0: no medium found <juju-core:New> <https://launchpad.net/bugs/1454481>
[03:44] <thumper> WT actual F
[03:44] <thumper> ha
[03:45] <thumper> hmm
[03:45] <thumper> I should set an alias for 'got est' to be 'go test'
[03:45] <thumper> I hate timing based intermittent failures in our test suite
[03:45] <thumper> just sayin
[03:46] <thumper> ActionSuite.TestActionsWatcherEmitsInitialChanges
[03:46]  * thumper looks at jw4
[03:48] <anastasiamac> -*- thumper watches jw4 sleep...
[03:49] <thumper> fark
[03:49]  * thumper hates git at times
[03:50]  * thumper branches then does a git dance
[03:55] <davecheney> http://juju-ci.vapour.ws:8080/job/github-merge-juju/3219/console
[03:55] <davecheney> what gives
[03:55] <davecheney> is this a flake ?
[03:55] <davecheney> thumper: well, i'm testing with go trunk
[03:55] <thumper> probably one of the more boring fixes for a critical issue http://reviews.vapour.ws/r/1669/
[03:56] <davecheney> i need to check it'll fit through the 1.2 sieve
[03:56] <davecheney> git it
[03:56] <davecheney> hulk smash
[03:56] <thumper> davecheney: looks like an ec2 failure
[03:57] <thumper> menn0: you were reviewer right?
[03:57] <menn0> thumper: I am
[03:57] <thumper> menn0: care to look at the above review?
[03:58] <menn0> thumper: looking
[03:58] <thumper> menn0: cheers
[04:04] <menn0> thumper: wow so that was a bit of a screwup
[04:04] <thumper> yeah
[04:04] <thumper> naming things is hard, right?
[04:05] <menn0> thumper: that explains the huge number of SetAddresses calls
[04:05] <menn0> and the resulting txns
[04:05] <thumper> right
[04:05] <menn0> thumper: ship it... no comments
[04:05]  * thumper looks at something
[04:06] <thumper> this would add one transaction per machine per 15 minutes
[04:06] <thumper> so in an environment with 100 machines, 400 per hour
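thumper's estimate is simple arithmetic: one transaction per machine every 15 minutes is four per machine per hour. A minimal sketch of that calculation follows; `txnsPerHour` is a hypothetical helper, not anything in juju.

```go
package main

import "fmt"

// txnsPerHour estimates hourly transactions when every machine
// triggers one transaction per interval (given in minutes).
func txnsPerHour(machines, intervalMinutes int) int {
	perMachine := 60 / intervalMinutes
	return machines * perMachine
}

func main() {
	// 100 machines, one transaction each per 15 minutes.
	fmt.Println(txnsPerHour(100, 15))
}
```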
[04:09] <thumper> hmm...
[04:09] <thumper> I wonder if I got that __fixes__ thing right, or even if this fix needed it
[04:10] <natefinch> thumper: AFAIK nothing is blocked
[04:10] <thumper> merge accepted
[04:10]  * natefinch has landed stuff on 1.22, 1.23, 1.24, and master today
[04:11] <thumper> menn0: I'll look to forward port the fix once it has landed
[04:11] <menn0> thumper: sweet
[04:14] <jw4> thumper: Yeah, that WatcherEmitsIntialChanges test sux0rs
[04:14] <jw4> thumper: mea culpa
[04:15] <jw4> hopefully I'll get a chance to fix it if no-one else does this cycle
[04:15] <axw> menn0: would you please take a look at http://reviews.vapour.ws/r/1670/? trivial change, fixes a critical bug
[04:15] <jw4> I'll squeeze it in with Actions 2.0
[04:16] <menn0> axw: looking
[04:16] <wallyworld> axw: what about existing systems - do we need an upgrade step?
[04:18] <menn0> axw: looks good. what manual testing have you done to confirm it works?
[04:19] <jw4> anastasiamac: it's only 9:20 pm... but I feel like I *should* be sleeping
[04:20] <anastasiamac> jw4: well, if ur were working from an office, would u have been available?..
[04:20] <jw4> anastasiamac: nope
[04:20] <jw4> anastasiamac: at least not normally :)
[04:21] <davecheney> checking: go vet ...
[04:21] <davecheney> provider/vsphere/ova_import_manager.go:269: missing verb at end of format string in Debugf call
[04:21] <davecheney> what's the point of doing this check if it doesn't fail ?
[04:21] <anastasiamac> jw4: :D ur dedication and quick response is appreciated
[04:21] <jw4> davecheney: you ranted about that last week didn't you?
[04:21] <jw4> anastasiamac: ;)
[04:21] <davecheney> i rant about a lot of stuff
[04:21] <jw4> :)
[04:21] <davecheney> it's hard for me to keep track
[04:25] <wallyworld> davecheney: i already ranted about that earlier today too
[04:26] <wallyworld> davecheney: i think go vet doesn't always exit with an error (at least for the Go version used by the bot)
[04:26] <wallyworld> but ffs, why don't developers have it turned on locally
[04:26] <jw4> wallyworld, davecheney the scripts/verify.bash file does return non-zero... the error must be swallowed elsewhere in CI?
[04:26] <wallyworld> jw4: it depends on the version of go vet i believe
[04:27] <jw4> wallyworld: ah
[04:27] <wallyworld> earlier versions were broken
[04:27] <wallyworld> and CI uses Go 1.2.1
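To show what the "missing verb at end of format string" warning guards against, here is a small sketch of what `fmt` produces when an argument has no matching verb. The message text and path are hypothetical; the real Debugf call at ova_import_manager.go:269 is not quoted in the log. The format is passed as a parameter only so that the example itself stays vet-clean.

```go
package main

import "fmt"

// debugMessage formats path with the given format string, the way a
// Debugf-style logger would format its arguments.
func debugMessage(format, path string) string {
	return fmt.Sprintf(format, path)
}

func main() {
	path := "/tmp/image.ova" // hypothetical path

	// No verb for the argument: fmt appends an EXTRA marker.
	fmt.Println(debugMessage("importing OVA from: ", path))

	// With %s the argument is rendered as intended.
	fmt.Println(debugMessage("importing OVA from: %s", path))
}
```

The first call prints `importing OVA from: %!(EXTRA string=/tmp/image.ova)`, which is the kind of log noise the check is meant to catch before it lands.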
[04:27] <axw> wallyworld: worker/rsyslog will rewrite the config if changes, so no upgrade step required
[04:27] <wallyworld> axw: awesome thanks, just wanted to double check due to the sensitivity of the site
[04:28]  * thumper goes to walk the dog while the landing bot does its thing
[04:28] <axw> menn0: I confirmed that the config is updated, and that logging still works/is distributed. I couldn't repro; I left a note to that effect in the bug, will test manually if someone is able to give me the steps
[04:30] <menn0> axw: I suspect you need to have an HA setup where rsyslog isn't up on one or more of the state servers so that the outbound queue fills up. with your change, the queue should fill up to 512MB instead of to all free space.
[04:30] <axw> menn0: yeah, I tried that
[04:31] <menn0> axw: (the outbound queue should fill up on the hosts where rsyslog is working obviously)
[04:31] <axw> menn0: from the bug: "I tried creating an HA env, creating a giant machine-0.log and all-machines.log on machine 0, and then disabling rsyslog on the 2nd and 3rd state servers; didn't cause the spool to grow by more than a couple MB."
[04:32] <menn0> axw: then I don't know. jam might have better context (he's around)
[04:33] <axw> jam: ^^  any ideas on how to repro https://bugs.launchpad.net/juju-core/+bug/1453801
[04:33] <mup> Bug #1453801: /var/spool/rsyslog grows without bound <stakeholder> <juju-core:Triaged> <juju-core 1.22:In Progress by axwalk> <juju-core 1.23:New> <juju-core 1.24:New> <https://launchpad.net/bugs/1453801>
[04:35] <jam> axw: looking
[04:36] <jam> axw: put 4GB of data into machine-0.log
[04:36] <axw> jam: from the bug: "I tried creating an HA env, creating a giant machine-0.log and all-machines.log on machine 0, and then disabling rsyslog on the 2nd and 3rd state servers; didn't cause the spool to grow by more than a couple MB."
[04:37] <axw> (giant was ~512MB, not 4GB, though)
[04:37] <jam> axw: so to start with what version is your env?
[04:37] <axw> jam: 1.22.3
[04:38] <jam> axw: so to repro, I'd recommend using 1.20 if you can
[04:38] <jam> at the very least because I believe rsyslog might be configured slightly differently (forward messages from the log file vs connect to rsyslog and send messages directly)
[04:38] <jam> you're Right about ActionQueueMaxDiskSpace
[04:39] <axw> I'll give 1.20 a shot
[04:44] <jam> axw: do you see if 1.22 is reading machine-X.log and forwarding the content?
[04:45] <jam> axw: ah I have another way
[04:45] <jam> python!
[04:45] <axw> jam: it was when rsyslog was enabled, yes
[04:45] <jam> python -c "import syslog; syslog.openlog('juju-test'); syslog.syslog(syslog.LOG_WARNING, 'test this out')"
[04:45] <jam> axw should end up in the all-machines.log
[04:45] <jam> axw: and you can trivially loop over that to create as much logspam as you want
[04:46] <axw> jam: thanks, I'll try with that
[05:02] <thumper-bbs> grr... bad record mac
[05:03] <axw> jam: I think my previous attempts failed because I was creating log lines that were too long, not fitting into syslog messages
[05:03] <axw> this seems to be creating more spool files
[05:03] <jam> axw: so you need lots of short messages rather than few long ones ?
[05:03] <axw> jam: at least log lines smaller than 1MB :)
[05:04] <jam> axw: we probably have some sort of "max message length"
[05:04] <axw> yes probably. I'm testing with 1K now, seems to be doing the trick. now to see what happens when it gets to 512MB
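The logspam loop jam suggests, with the roughly 1K message size axw settled on, could look like the sketch below. It uses Go's standard `log/syslog` package (Unix only) rather than the Python one-liner from the chat; `logLine`, the `juju-test` tag reuse, and the small default count are assumptions for illustration.

```go
package main

import (
	"fmt"
	"log/syslog"
	"os"
	"strings"
)

// logLine returns a message of exactly size bytes: a sequence prefix
// followed by padding (or a truncated prefix if size is very small).
func logLine(seq, size int) string {
	prefix := fmt.Sprintf("logspam %08d ", seq)
	if size <= len(prefix) {
		return prefix[:size]
	}
	return prefix + strings.Repeat("x", size-len(prefix))
}

func main() {
	// Raise count to generate as much spam as the test needs.
	const count, size = 10, 1024

	w, err := syslog.New(syslog.LOG_WARNING|syslog.LOG_USER, "juju-test")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot connect to syslog:", err)
		return
	}
	defer w.Close()

	for i := 0; i < count; i++ {
		if err := w.Warning(logLine(i, size)); err != nil {
			fmt.Fprintln(os.Stderr, "write failed:", err)
			return
		}
	}
}
```

Keeping each message short matters here: axw's earlier attempts with very long lines did not fill the spool, apparently because they did not fit into syslog messages.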
[05:08] <davecheney> what the heck http://paste.ubuntu.com/11107599/
[05:10] <davecheney> ... error stack: github.com/juju/juju/environs/bootstrap/bootstrap.go:103: Juju cannot bootstrap because no tools are available for your environment.
[05:10] <davecheney> You may want to use the 'agent-metadata-url' configuration setting to specify the tools location.
[05:10] <davecheney> did juju just tell me to go fuck myself ?
[05:11] <thumper-bbs> davecheney: but nicely...
[05:12] <davecheney> ok, i think these are all symptoms of missing tools
[05:39] <jam> axw: how goes?
[05:40] <axw> jam: trying to figure out where my rsyslog config files went...they've disappeared out of /etc/rsyslog.d
[05:41] <jam> axw: that doesn't sound very good. I thought juju writes them on startup/etc.
[05:41] <axw> jam: yeah, worker/rsyslog is meant to write it out
[06:00] <thumper-bbs> wallyworld: we aren't reviewing forward ports of fixes are we?
[06:00]  * thumper-bbs forgot to change nick
[06:04]  * thumper guesses not
[06:09] <axw> jam: finally, verified. I think the worker/rsyslog code isn't very robust to rsyslog being restarted externally
[06:09] <axw> i.e. if it's not running, and tries to stop/restart it, it'll be unhappy
[06:10] <jam> axw: ah, we're probably just issuing a "stop" which will fail if it isn't running, and not handling the "stop a stopped service"
[06:10] <jam> axw: but you found you could overload the queue ?
[06:10] <axw> jam: yes, and it stopped growing when I added the patch
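jam's guess about "stop a stopped service" suggests making the stop step idempotent. The following is a sketch under that assumption only; the `service` interface, `ensureStopped`, and `fakeService` are hypothetical and are not taken from worker/rsyslog.

```go
package main

import (
	"errors"
	"fmt"
)

// service is a hypothetical stand-in for an init-system service handle.
type service interface {
	Running() bool
	Stop() error
}

// ensureStopped treats "already stopped" as success rather than an error.
func ensureStopped(s service) error {
	if !s.Running() {
		return nil
	}
	return s.Stop()
}

// fakeService fails on Stop when it is not running, mimicking the
// behaviour guessed at in the conversation.
type fakeService struct{ running bool }

func (f *fakeService) Running() bool { return f.running }

func (f *fakeService) Stop() error {
	if !f.running {
		return errors.New("stop: service is not running")
	}
	f.running = false
	return nil
}

func main() {
	svc := &fakeService{running: false}
	fmt.Println("naive stop:", svc.Stop())
	fmt.Println("ensureStopped:", ensureStopped(svc))
}
```

A real implementation would still need to cope with the service changing state between the `Running` check and the `Stop` call.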
[06:15] <jam> axw: can you confirm if the patch would obviously apply to a 1.20 branch?
[06:16] <jam> axw: and any thoughts on how we might test in "as realistic as we can for CI" ?
[06:17] <axw> jam: yes, the 1.20 code is pretty similar, so would work there
[06:17] <jam> axw: you're doing this on 1.22, right? Have you checked 1.24/master as well?
[06:17] <jam> I haven't heard the official statement from alexis and wes about what version we're targeting
[06:17] <axw> jam: ensure-availability, juju ssh 1 "sudo stop rsyslog", logspam on machine 0 until all-machines.log grows to 512MB, and then a bit more
[06:18] <axw> jam: I thought 1.22 was the important one, that's all I've looked into so far. AFAIK the others haven't changed dramatically
[06:18] <axw> I'll work on forward porting them now
[06:19] <jam> axw: I agree with that basic sentiment, wes and alexis were supposed to meet last night to discuss official upgrade plans for them
[06:19] <axw> wallyworld: do you know any more about ^^ ?
[06:31]  * thumper head desks
[06:32]  * thumper grunts
[06:33] <thumper> *another* damn intermittent failure
[06:33] <thumper> fuck fuck fuckity fuck
[06:34] <thumper> provider/vsphere tests on 1.24
[06:35] <thumper> and for extra hilarity
[06:35] <thumper> environ_broker_test.go:93:
[06:35] <thumper>     c.Assert(err, gc.ErrorMatches, "no mathicng images found for given constraints: .*")
[06:35] <thumper> ... error string = "invalid URL \"http://cloud-images.ubuntu.com/releases/streams/v1/index.json\" not found"
[06:35] <thumper> ... regex string = "no mathicng images found for given constraints: .*"
[06:35] <thumper> note the spelling mistake in the regex
[06:36]  * thumper fixes that first
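A short sketch of why the misspelled pattern matters: an error-message matcher that requires the whole message to match will reject a correctly spelled message. The anchoring used here is an assumption about how such a checker behaves, and the sample message is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// errorMatches mimics an anchored error-message check: the whole
// message must match the pattern, not just a substring.
func errorMatches(message, pattern string) bool {
	return regexp.MustCompile("^(" + pattern + ")$").MatchString(message)
}

func main() {
	typo := "no mathicng images found for given constraints: .*"
	fixed := "no matching images found for given constraints: .*"
	msg := "no matching images found for given constraints: arch=amd64" // hypothetical

	fmt.Println(errorMatches(msg, typo))  // false: pattern carries the typo
	fmt.Println(errorMatches(msg, fixed)) // true
}
```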
[06:42] <jam> axw: I'm happy with your patch, but I'm wondering why we ended up with lines added in 2 places
[06:42] <wallyworld> axw: we are going to patch 1.22 afaik
[06:42] <jam> ah non API vs API
[06:44] <wallyworld> jam: that's your understanding too right? we talked this morning at the release meeting that the imminent arrival of 1.22.x into trusty is now being delayed till these issues are fixed in 1.22
[06:44] <jam> wallyworld: Last I had heard it was going to be discussed at the release meeting, which I was not at
[06:44] <jam> and it was a "do we go for 1.22 and push back 1.24"
[06:45] <wallyworld> so, if i understood correctly, it is 1.22
[06:45] <wallyworld> as we want these fixes in trusty also
[06:47] <thumper> wallyworld: care to cast your eye over this commit? https://github.com/howbazaar/juju/commit/9757821b5070ff26510cedc58e7919450ebfa9a6
[06:47] <thumper> wallyworld: not sure why it was intermittently failing
[06:47] <wallyworld> looking
[06:47] <thumper> wallyworld: but with this patch, it passes all the time
[06:48] <thumper> wallyworld: the log showed that the file source was read, and didn't find an image
[06:48] <thumper> but sometimes it would try to get to cloud-images.ubuntu.com...
[06:48] <thumper> no idea why it was only sometimes
[06:48] <wallyworld> thumper: NFI about intermittent nature either
[06:49] <wallyworld> but good that you fixed the other stuff
[06:49] <thumper> this does seem to make it go away though
[06:49] <wallyworld> hmmm
[06:49] <wallyworld> if it works, but would like to understand the root cause
[06:49] <wallyworld> i'm looking at another bug related to this
[06:49] <wallyworld> i'll poke around a bit
[06:49] <thumper> yeah... me too
[06:50] <thumper> fix is taking a while to land
[06:50] <thumper> landed in 1.22
[06:50] <thumper> trying 1.23
[06:50] <wallyworld> thumper: i'm looking at bug 1452422
[06:50] <thumper> before I try 1.24
[06:50] <mup> Bug #1452422: Cannot boostrap from custom image-metadata-url or by specifying metadata-source <sts> <juju-core:Triaged> <https://launchpad.net/bugs/1452422>
[06:50] <wallyworld> we're not overlapping are we?
[06:50] <thumper> don't think so
[06:50] <thumper> not at all
[06:50] <thumper> all my changes have been around machine Addresses and SetAddresses
[06:50] <wallyworld> ok, just when you said me too i wasn't sure
[06:51] <thumper> just a slight diversion to fix the intermittent vsphere test failure
[06:51] <wallyworld> ok
[06:51] <thumper> me too was relating to wanting to know the root cause
[06:51] <wallyworld> ah
[06:51] <wallyworld> anyway, +1 on that fix
[06:51] <thumper> I hate weird shit like that
[06:51] <wallyworld> yup
[06:51] <thumper> cheers
[06:51] <thumper> just checking
[06:52] <wallyworld> the regexp typo made me laugh
[06:52] <wallyworld> talk about tweaking the test to match bad code :-)
[06:52] <wallyworld> you can tell it wasn't TDD :-)
[06:53] <jam> wallyworld: prob just copy and paste
[06:53] <wallyworld> yup
[07:04] <jam> thumper: do we know what menn0's plan is for working on https://bugs.launchpad.net/juju-core/+bug/1453785 ?
[07:04] <mup> Bug #1453785: transaction collection (txns) grows without bound <stakeholder> <juju-core:Triaged> <juju-core 1.22:In Progress by menno.smits> <juju-core 1.23:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1453785>
[07:04] <jam> I feel like we should have some coordination to find out when we can purge items from the txn collection
[07:04] <jam> it is possible that everything can be purged once they have been in APPLIED
[07:12] <thumper> jam, wallyworld: quick hangout to hand off?
[07:13] <wallyworld> sure
[07:13] <thumper> wallyworld, jam: https://plus.google.com/hangouts/_/canonical.com/handoff?authuser=0
[07:13] <jam> rogpeppe2: ^^
[07:17] <axw> wallyworld: when you're free, can you please pastebin the output of "sudo lsblk"? I don't have an optical drive :)
[07:17] <wallyworld> axw: me either
[07:18] <axw> wallyworld: oh, I thought sr0 was optical. well, anyway, I don't have one of them
[07:18] <wallyworld> axw: neither does my output
[07:19] <axw> wallyworld: same machine you had juju running on? juju just runs "lsblk"...
[07:19] <axw> wallyworld: juju just runs lsblk. you're on the same machine you had juju running, where the log was spammed?
[07:19] <axw> sorry, thought I was disconnected
[07:20] <axw> wallyworld: sorry, I'm an idiot
[07:20] <axw> wallyworld: you didn't send the bug report... :)
[07:23] <wallyworld> :-)
[09:40] <mup> Bug #1454599 was opened: firewaller gets an exception if a machine is not provisioned <cpec> <stakeholder> <juju-core:Triaged> <https://launchpad.net/bugs/1454599>
[09:47] <wallyworld> axw: very small review? http://reviews.vapour.ws/r/1677/
[09:48] <axw> looking
[09:48] <wallyworld> ty
[09:51] <axw> wallyworld: done
[09:52] <wallyworld> axw: tyvm
[09:53] <voidspace> wallyworld: axw: ping - know anything about lifecycle watchers?
[09:53] <wallyworld> a little
[09:53] <axw> voidspace: what about them?
[09:54] <voidspace> wallyworld: axw: I have a new watcher / worker combo watching for when IPAddresses become Dead and releasing them with the provider
[09:54] <wallyworld> ohh, nice
[09:54] <voidspace> machine removal marks the addresses as dead, which should trigger the worker to release and remove them
[09:54] <voidspace> the watcher / worker is tested - setting an IPAddress to Dead triggers its removal
[09:54] <voidspace> machine removal is tested
[09:55] <voidspace> removing a machine marks associated IP addresses as dead
[09:55] <voidspace> but an end-to-end test fails
[09:55] <voidspace> allocating an ip address to a machine and then removing the machine *does* mark the address as Dead
[09:55] <voidspace> but the watcher doesn't seem to notice it - it's not released
[09:55] <voidspace> I wonder if I'm missing anything obvious
[09:56] <wallyworld> it is most likely a resource catsing issue
[09:56] <wallyworld> casting
[09:56] <wallyworld> i've seen before
[09:56] <axw> voidspace: where's the worker?
[09:56] <voidspace> I figure it may be something about our test infrastructure - two states or something
[09:56] <voidspace> worker/addresser/worker.go
[09:56] <voidspace> let me link you to the current WIP
[09:56] <wallyworld> where if a struct was not castable to an interface (i can't remember which), events were rejected
[09:57] <voidspace> dammit
[09:57] <voidspace> axw: https://github.com/juju/juju/compare/1.23...voidspace:addresser-machine-destruction
[09:57] <voidspace> wallyworld: interesting
[09:57] <voidspace> the only difference is that when a machine is removed the ipaddress is set to Dead as part of a bigger transaction
[09:58] <voidspace> axw: TestMachineRemovalTriggersWorker is the failing test
[09:58] <axw> mk
[09:58] <voidspace> axw: it fails in waitForReleaseOp
[09:58] <voidspace> (and without waiting for the release op it fails because the address really isn't removed)
[09:58] <wallyworld> voidspace: i'll see if i can find the code
[09:59] <voidspace> if I change state.Machine.Remove to call address.EnsureDead (set the address to dead in its own transaction) the test *still fails*
[09:59] <voidspace> yet tests that do *exactly that* pass
[09:59] <voidspace> so I suspect test infrastructure problems
[09:59] <voidspace> wallyworld: ok, cool
[10:00] <voidspace> the debug output shows that the watcher never sees the event
[10:08] <axw> voidspace: what first came to mind was that you might need to do s.State.StartSync() just before waitForReleaseOp... but it looks like you're using just the one State, and not BackingState+State
[10:09] <axw> voidspace: nothing jumps out, sorry
[10:09] <wallyworld> voidspace: do you have the watcher code?
[10:10] <axw> wallyworld: it's a plain old lifecycle watcher: https://github.com/juju/juju/blob/master/state/watcher.go#L174
[10:11] <wallyworld> oh, right
[10:13] <voidspace> StartSync doesn't appear to help
[10:13] <voidspace> I see the ip address life set to Dead - watcher isn't triggered
[10:14] <voidspace> wallyworld: axw: ok, time to start digging into the event code
[10:15] <voidspace> hah
[10:15] <voidspace> axw: wallyworld: moving the StartSync to later in the test worked
[10:15] <voidspace> magically...
[10:15] <voidspace> axw: wallyworld: thanks...
[10:15] <voidspace> yay for mysterious magic
[10:15] <wallyworld> didn't do anything, glad you got it working
[10:15] <axw> voidspace: where in the test did you add it?
[10:15] <wallyworld> but i hate magic :-)
[10:15] <axw> I'm curious to know why that works
[10:16] <wallyworld> me too
[10:16] <voidspace> axw: just before the "machine.EnsureDead()"
[10:16] <axw> huh, that doesn't make any sense
[10:17] <voidspace> test now passes
[10:17] <axw> voidspace: I can't repro success with that change, is it passing reliably for you?
[10:17] <voidspace> celebration coffee
[10:18] <voidspace> axw: I also needed to tweak the instance ID I allocate to
[10:18] <voidspace> axw: just pushed a passing branch
[10:18] <voidspace> ah no
[10:18] <voidspace> axw: just failed
[10:19] <voidspace> axw: maybe it's a timing issue
[10:19] <voidspace> goddammit
[10:19] <axw> voidspace: if the sync were to make sense anywhere, it'd be after the machine.Remove()
[10:20] <axw> but that fails for me too
[10:20] <voidspace> I just had two passes
[10:20] <voidspace> now two fails
[10:23] <voidspace> axw: with *three* calls to StartSync it reliably passes, remove any one and it seems to fail
[10:23] <voidspace> after machine provisioning, after address creation and allocation and after machine removal
[10:24] <axw> voidspace: still fails with three for me; I suspect it's just adding enough time for it to see the event in your case
[10:24] <voidspace> that's awful
[10:24] <voidspace> hmmm... no, it seems like only the first two are needed
[10:24] <axw> sorry, didn't try in all those spots
[10:24] <voidspace> it's still reliably passing
[10:25] <voidspace> that kind of makes sense - it syncs the two new entities - the machine and the address
[10:26] <axw> voidspace: sorry not sure, gotta go help get kids ready for bed.. I'd be interested to know if you get to the bottom of it
[10:27] <voidspace> axw: well, that works...
[10:27] <voidspace> I can dig through the connsuite and try and see where we end up using the different states
[10:28] <voidspace> axw: laters o/
[10:50] <natefinch> is wallyworld on?  I want to bitch about launchpad some more ;)
[10:50] <wallyworld> \o/
[10:51] <natefinch> wallyworld:  I added a tag to a bug,  but when I click on the link for that tag that was created, it brings me to a list of bugs that doesn't include the bug I just tagged
[10:51] <natefinch> bug: https://bugs.launchpad.net/juju-core/+bug/1452285
[10:51] <mup> Bug #1452285: logs don't rotate <cpec> <logging> <stakeholder> <juju-core:Fix Released by natefinch> <juju-core 1.22:Fix Committed by natefinch> <juju-core 1.23:Fix Committed by natefinch> <juju-core 1.24:Fix Committed by natefinch> <https://launchpad.net/bugs/1452285>
[10:51] <natefinch> link from tag: https://bugs.launchpad.net/juju-core/+bugs?field.tag=cpec
[10:52] <wallyworld> is the bug assigned to the 1.24 series?
[10:52] <wallyworld> yes, i can see it is
[10:52] <mup> Bug #1452285 changed: logs don't rotate <cpec> <logging> <stakeholder> <juju-core:Fix Released by natefinch> <juju-core 1.22:Fix Committed by natefinch> <juju-core 1.23:Fix Committed by natefinch> <juju-core 1.24:Fix Committed by natefinch> <https://launchpad.net/bugs/1452285>
[10:52] <mup> Bug #1454627 was opened: presence shouldn't try to hold all possible Sequences at once <performance> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1454627>
[10:53] <wallyworld> natefinch: the default search criteria only includes open bugs i think
[10:54] <wallyworld> so if the bug is fix committed it won't show up (a guess)
[10:54] <wallyworld> you may need to go to advanced search so you can explicitly select the bug sates you want
[10:54] <wallyworld> states
[10:54] <wallyworld> eg fix committed, triaged, in progress etc
[10:55] <natefinch> ahh, that actually almost makes sense :)
[10:57] <wallyworld> natefinch: and i just tested it, advanced search works
[10:57] <wallyworld> if you click on advanced search, you'll see the default tags
[10:57] <natefinch> wallyworld: yeah, I did too.
[10:57] <wallyworld> s/tags/states
[10:57] <wallyworld> not obvious i agree
[10:58] <natefinch> it didn't help that I'd *just* set it to fix released...  plus the whole "only searching one series" thing.
[11:00] <wallyworld> yeah
[11:25] <dooferlad> TheMue: http://reviews.vapour.ws/r/1679/ - if you could take a look. This should be getting familiar now!
[11:25] <jam> wallyworld: or axw: question about "juju status" and agent up/down
[11:26] <wallyworld> yo
[11:26] <jam> this is older code, but it might be stuff that you guys have thought about recently
[11:27] <TheMue> dooferlad: sure, will do
[11:30] <TheMue> dooferlad: done
[11:30] <TheMue> dooferlad: a diff between two PRs would be nice, so that it more easily can be compared
[11:30]  * TheMue is at lunch now
[11:47] <mup> Bug #1454661 was opened: presence collection grows without bound <performance> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1454661>
[12:06] <anastasiamac> jam: ping
[12:07] <jam> hey anastasiamac, sorry I got my head deep in this sky stuff. Give me a sec and I'll be right there.
[12:07] <anastasiamac> jam: can reschedule if it's easier :D
[12:14] <jam> anastasiamac: joining now
[12:29] <voidspace> known problem in ubuntu 15.04: can't enter decryption password for encrypted hard drive on boot
[12:29] <voidspace> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1359689
[12:29] <mup> Bug #1359689: cryptsetup password prompt not shown <apport-collected> <iso-testing> <kernel-da-key> <kernel-fixed-upstream> <kernel-graphics> <rls-v-incoming> <utopic> <linux (Ubuntu):Triaged by mathieu-tl> <linux (Ubuntu Utopic):Triaged> <linux (Ubuntu Vivid):Triaged by mathieu-tl> <https://launchpad.net/bugs/1359689>
[12:29] <voidspace> *sigh*
[12:32] <voidspace> and there's no supported way to remove full disk encryption either
[12:32] <mup> Bug #1454676 was opened: failed to retrieve the template to clone - 500 Internal Server error - error creating container juju-trusty-lxc-template - <oil> <juju-core:New> <https://launchpad.net/bugs/1454676>
[12:41] <mup> Bug #1454676 changed: failed to retrieve the template to clone - 500 Internal Server error - error creating container juju-trusty-lxc-template - <oil> <juju-core:New> <https://launchpad.net/bugs/1454676>
[12:47] <mup> Bug #1454676 was opened: failed to retrieve the template to clone - 500 Internal Server error - error creating container juju-trusty-lxc-template - <oil> <juju-core:New> <https://launchpad.net/bugs/1454676>
[12:47] <mup> Bug #1454678 was opened: "relation-set --file -" doesn't seem to work <landscape> <juju-core:New> <https://launchpad.net/bugs/1454678>
[12:53] <mup> Bug #1454678 changed: "relation-set --file -" doesn't seem to work <landscape> <juju-core:New> <https://launchpad.net/bugs/1454678>
[12:59] <mup> Bug #1454678 was opened: "relation-set --file -" doesn't seem to work <landscape> <juju-core:New> <https://launchpad.net/bugs/1454678>
[13:07] <wwitzel3> ericsnow: ping
[13:38] <mup> Bug #1454697 was opened: jujud leaking file handles <cpec> <stakeholder> <juju-core:Triaged> <https://launchpad.net/bugs/1454697>
[13:47] <jcastro> mgz, any luck with dreamhost?
[13:51] <mgz> jcastro: expect to have some news shortly, I need to poke a bit more
[13:52] <jcastro> ack
[14:04] <katco> ericsnow: standup
[15:20] <katco> voidspace: hey are you in #juju on canonical's irc?
[15:24] <natefinch> katco: he hides there as mfoord
[15:42] <lazyPower> x-post from #juju -- "Man, actions + the new status  stuff in 1.24 is really nice. hattip @ jujucore for this"
[16:24] <rogpeppe2> anyone know how to change the debug level of a running juju environment?
[16:28] <mgz> rogpeppe2: `juju set-environment logging-config=<>`
[16:28] <rogpeppe2> mgz: ah, ok, i wondered if it was set-environment
[16:28] <rogpeppe2> mgz: the help could be more helpful there, i think :)
[16:28] <mgz> `juju help logging` isn't bad... just not super concise
[17:20] <natefinch> cherylj: how's the file handle bug going?
[17:48] <katco> cherylj: perrito666: btw if you don't think you'll get a patch up for your bugs before your EOD, please be sure to update the bug with any information so we can hand them off
[17:48] <perrito666> katco: sure
[17:56]  * perrito666 tries to reproduce a bug by having crappy db
[17:56] <perrito666> s/db/bw
[18:00] <cherylj> natefinch, katco:  sorry, was at an appointment.  I'm still digging into that bug
[18:11] <katco> wwitzel3: hey do you have some code i can look at for the container management stuff?
[18:13] <wwitzel3> katco: I do
[18:14] <wwitzel3> katco: https://github.com/wwitzel3/juju/tree/ww3-container-mgmt
[18:14] <perrito666> oh great, I am almost falling asleep on the kb and spotify decides to play total eclipse of the heart
[18:14]  * perrito666 makes coffee
[18:14] <katco> lol
[18:14] <wwitzel3> katco: that is fairly recent, I haven't pushed anything from today yet, will after it is actually compiling :)
[18:21] <katco> wwitzel3: can i suggest this (https://github.com/juju/juju/blob/master/apiserver/leadership/leadership.go#L41-L49) as a way of doing IoC for the apiserver stuff?
[18:23] <katco> wwitzel3: also wondering if we really need a state object in api/procmanager/procmanager.go? what functionality are we using from state?
[18:25] <katco> wwitzel3: and last observation, ericsnow and i have been talking about organizing code into modules, so would it make sense to put all of this code in a central spot, and then utilize the interesting bits in the various areas of Juju?
[18:30] <wwitzel3> katco: yeah, I like the idea of all the code being in the same package, we are going to be registering / unregistering the process with state, those methods aren't there yet
[18:30] <katco> wwitzel3: ah gotcha. so could we just pass in a few closures or an interface that handle the registration?
[18:31] <wwitzel3> katco: that is the idea, yeah
[18:31] <katco> wwitzel3: awesome... looking forward to seeing the next iteration of code
[18:33] <wwitzel3> katco: thanks for looking at the WIP, appreciate the review, it should all be a little more concrete tomorrow and we can shop it to everyone via a review
[18:33] <katco> wwitzel3: sweet
[18:33] <natefinch> cherylj: want some help?  I could help try to repro or something.
[18:34] <perrito666> natefinch: have a maas?
[18:34]  * perrito666 grins
[18:35] <natefinch> perrito666: nope.  I got a maas half set up at the sprint but then hit some problems and never got further with it
[18:35] <perrito666> meh, I thought you had a hardware maas
[18:36] <natefinch> perrito666: I have hardware that I could probably make into maas given time. ... but time is not something I have much of.
[18:36] <cherylj> natefinch: yeah, that would be helpful if you could do that...
[19:03] <natefinch> haha juju 1.20 does *not* appreciate all the extra garbage in my environments.yaml
[19:04] <natefinch> huge list of 'WARNING unknown config field "blah"'
[19:11] <natefinch> sinzui: what do we expect for backwards compatibility between juju versions?  I had a 1.22 local environment and tried to  juju status using a 1.20 client, and it just failed
[19:11] <natefinch> (hung forever)
[19:12] <sinzui> natefinch, that is very bad. I don't think CI has seen that though. The compatibility tests for 1.20 clients to 1.22 servers could get status and do other ops
[19:13] <natefinch> sinzui: I wonder if juju local is just special
[19:13] <sinzui> natefinch, shouldn't be, but since everyone's local is a little different it can be
[19:35] <natefinch> cherylj: for what it's worth, I can't seem to reproduce the file descriptor leak.  At least from the proposed hypothesis of it just being because the API server is down.
[19:35] <natefinch> cherylj: granted, I'm using 1.20.14, not 1.20.11 like at the customer site... I can try 1.20.11 and see if it changes anything though (also trying on juju local, but I can't imagine that matters).
[19:36] <cherylj> natefinch: yeah, I haven't had much luck with that either.
[19:40] <perrito666> and you don't have a lot of descriptors open?
[19:40] <perrito666> if it is a leak it is most likely showing even before it reaches a critical point
[19:43] <sinzui> natefinch, I think you have found a regression. I may need to block 1.22.1 going into trusty.
[19:44] <sinzui> natefinch, 1.20.x client sees this error talking to a 1.22.x env: x509: certificate is valid for localhost, juju-apiserver, juju-mongodb, not anything
[19:45] <natefinch> sinzui: oops
[19:46] <sinzui> natefinch, I wonder if this error is about closing a security issue, in which case, it is intentional
[19:54] <natefinch> sinzui:  I don't know
[19:54] <sinzui> natefinch, I think this is just for new envs. I am retesting upgraded envs
[19:57] <natefinch> sinzui: yes, this was not an upgraded environment
[19:58] <TheMue> anyone free for reviewing http://reviews.vapour.ws/r/1681/ ?
[19:59] <sinzui> natefinch, this is just that old clients cannot be guaranteed to talk to envs bootstrapped by newer/securer clients. upgraded envs continue to work. I just took 1.20.11 to 1.21, 1.22, then 1.23 and all is good
[20:00] <natefinch> sinzui: ok... I find that odd, but since I don't really care about backwards compatibility personally, I'm ok with it if you're ok with it ;)
[20:03] <perrito666> brb, new firmware for my wifi card
[20:04] <perrito666> well, no improvements, new hardware and linux is a nightmare
[20:36] <mup> Bug #1454829 was opened: 1.20.x client cannot communicate with 1.22.x env <compatibility> <status> <juju-core:Triaged> <https://launchpad.net/bugs/1454829>
[20:41] <natefinch> well, the good news is, we're going to have a brand new A/C unit.  That's also the bad news.
[20:43] <perrito666> natefinch: well, that fix gave you an extra year
[20:43] <natefinch> perrito666: yes, but it cost me $1000
[20:44] <natefinch> perrito666: I don't really want to pay $1000 for a year of A/C
[20:45] <perrito666> no warranty on the fix?
[20:46] <perrito666> the bad part is, if you had known this a couple of months ago you could have saved the time you spent trying to protect it from the falling ice
[20:49] <natefinch> perrito666: not sure about warranty, probably not (or not more than like 30-60 days)
[20:50] <natefinch> gotta run, Lily has an art show at her school
[20:50] <natefinch> cherylj: FWIW, I have some agents that are very very slowly gaining file handles (like one per half hour), not sure where though, so I'll leave them to run and see what happens.
[20:51] <perrito666> there is a joke around there about running, heat and AC but I cannot quite make it
[20:51] <natefinch> haha
[20:55] <perrito666> bbl
[21:27] <mup> Bug #1454658 was opened: TestUseLumberjack fails on windows <blocker> <ci> <regression> <test-failure> <windows> <juju-core:Triaged> <juju-core 1.23:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1454658>
[21:48] <thumper> FFS
[21:49] <thumper> waigani: I don't suppose you have a windows box handy to run tests on?
[21:49] <thumper> waigani: can I get you to look at bug 1454658 ?
[21:49] <mup> Bug #1454658: TestUseLumberjack fails on windows <blocker> <ci> <regression> <test-failure> <windows> <juju-core:Triaged> <juju-core 1.23:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1454658>
[21:49] <waigani> thumper: windows? what's that?
[21:49] <thumper> waigani: CI blocker, and very simple fix
[21:49] <thumper> this is the merge that brought in the failure: https://github.com/juju/juju/pull/2303/files
[21:49] <waigani> thumper: okay, I'm giving this upgrade bug one final pock (I may have Stockholm syndrome)
[21:50] <thumper> waigani: please leave it for a bit, and look at this critical bug
[21:50] <waigani> *poke
[21:50] <waigani> thumper: yep, will do
[21:50] <thumper> cmd/jujud/agent/machine_test.go needs two tweaks
[21:51] <thumper> func (FakeConfig) LogDir() string should wrap the file path in filepath.FromSlash
[21:52] <thumper> unit_test needs it too in the same place
[21:52] <thumper> and the test that checks the filename also needs filepath.FromSlash
[21:52] <thumper> I *think* that should be sufficient
[21:53] <thumper> to fix the windows issue
[21:53]  * thumper has other ports to fix
[21:53] <waigani> thumper: sure. I'm not going to be able to test it on windows though.
[21:53] <thumper> waigani: it is purely an assumption on slash file path separators
[21:53] <thumper> waigani: that's fine, as long as it passes locally, submit it as a fix for that bug and CI will tell us
[21:54] <thumper> I'm 98% sure this will fix it
[21:54] <waigani> ok, on it
[21:54] <thumper> cheers
[21:55] <thumper> waigani: also not that you should start on the 1.23 branch
[21:55] <thumper> and forward port through 1.24 and master
[21:55] <waigani> I was going to ask, okay
[21:55] <thumper> s/not/note/
[22:02] <wallyworld> alexisb: you joining sky handoff?
[22:06] <waigani> thumper:  http://reviews.vapour.ws/r/1683/
[22:07] <waigani> thumper: local unit tests pass
[22:35] <thumper> waigani: first one merged, now to forward port :)
[22:35] <waigani> thumper: okay. ports don't need reviews right?
[22:36] <thumper> waigani: as long as they apply cleanly (and it should)
[22:38] <waigani> thumper: https://github.com/juju/juju/pull/2323 - 1.24 port
[22:39] <thumper> waigani: LGTM
[22:39] <thumper> waigani: the one thing I'd add for the next one is to mention in the pull request that it is a forward port of a previously reviewed and landed fix
[22:42] <thumper> waigani: please also keep the bug tasks up to date :-) ta
[22:44] <waigani> thumper: https://github.com/juju/juju/pull/2324 - 1.25 port, added comment
[22:45] <thumper> waigani: nice, thanks - generally I wait for the previous target to merge before pushing the next in
[22:45] <waigani> thumper: yep. I haven't tried to merge 1.25, waiting for 1.24
[22:46] <thumper> cool
[22:57] <mup> Bug #1454870 was opened: Client last login time writes should not use mgo.txn <juju-core:In Progress by thumper> <juju-core 1.22:In Progress by thumper> <juju-core 1.23:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1454870>
[23:00] <thumper> menn0: do you know of examples in state where we write to the db without a transaction?
[23:03] <menn0> thumper: state/sequence.go does
[23:03] <thumper> ta
[23:03] <waigani> thumper: 1.24 landed, 1.25 landing... bug status updated
[23:04] <menn0> thumper: but that uses mgo.Change and Apply which is a little esoteric
[23:06] <menn0> thumper: State.AddCharm doesn't use a txn... and probably should. that looks like a bug to me.
[23:15] <thumper> menn0: if not mgo.Change and Apply then what?
[23:16] <menn0> thumper: someCollection.Insert/Update/RemoveId/etc etc
[23:17] <thumper> hmm..
[23:17] <menn0> there's lots of methods on collections which let you add, modify and delete docs
[23:17] <thumper> do we do an Update on a collection anywhere?
[23:17]  * thumper looks for the mgo docs
[23:17] <menn0> thumper: not in state proper
[23:19] <menn0> thumper: you probably want UpdateId
[23:20] <thumper> yeah, that is what i'm doing :)
[23:21] <menn0> thumper: i've just noticed that the collection type you get back due to the auto multi-env stuff doesn't support any of the Update* methods :(
[23:21] <menn0> thumper: easily added though
[23:22] <thumper> agh
[23:22] <menn0> thumper: i probably didn't implement them b/c we weren't using them anywhere
[23:22] <thumper> I don't think that is in 1.22
[23:22] <thumper> but I'll talk to you about adding it as I go
[23:22] <menn0> thumper: might not be, in which case you're ok ther
[23:22] <menn0> there
[23:23] <menn0> thumper: and the compiler will tell you when you get to the version that does have the multi-env collections stuff
[23:23] <menn0> :)
[23:23]  * thumper nods
[23:25] <thumper> menn0: oh shit
[23:25] <thumper> menn0: it is in 1.22
[23:25] <menn0> thumper: ha
[23:25] <menn0> thumper: ok, it's easy enough to add the required method(s)
[23:25]  * thumper nods
[23:25] <menn0> thumper: do you just need UpdateId?
[23:30] <thumper> I'm going to add update and updateid for consistency
[23:49] <thumper> menn0: could I get you to look at http://reviews.vapour.ws/r/1687/ for me plz?
[23:49] <thumper> much appreciated
[23:49]  * thumper goes to the gym
[23:55] <menn0> thumper: looking