[01:00] couple easy ones: http://reviews.vapour.ws/r/5325/ [01:01] http://reviews.vapour.ws/r/5323/ [01:29] one last easy straggler: http://reviews.vapour.ws/r/5323/ === natefinch-afk is now known as natefinch [02:01] thumper: TestHostedModelWorkers is sooo unreliable on Windows [02:02] I might take a look if I get a chance later one [02:02] on [02:06] ok [02:11] pretty simple review anyone? https://github.com/juju/schema/pull/13 [02:33] redir: thanks for the review btw [03:43] thumper: I just assigned you this as I think you knew what it was :-p https://bugs.launchpad.net/juju-core/+bug/1607599 [03:43] Bug #1607599: new unit ends up on wrong machine after migration [03:44] ok, no problem [03:50] Bug #1607599 opened: new unit ends up on wrong machine after migration [03:52] thumper: migrationmaster now does status reporting: http://reviews.vapour.ws/r/5326/ [03:53] natefinch: np [04:02] Bug #1501569 changed: MachineSuite failed [04:08] Bug #1501569 opened: MachineSuite failed [04:20] Bug #1501569 changed: MachineSuite failed [04:20] Bug #1582881 changed: destroying a subordinate returns an error but seems to work [04:20] Bug #1607601 opened: There is no way to see assigned user ACLs [04:29] Bug #1607601 changed: There is no way to see assigned user ACLs [04:29] Bug #1582881 opened: destroying a subordinate returns an error but seems to work [04:32] Bug #1582881 changed: destroying a subordinate returns an error but seems to work [04:32] Bug #1600301 opened: cmd/jujud/agent MachineSuite.TestHostedModelWorkers fails because compute-provisioner never really starts [04:32] Bug #1607601 opened: There is no way to see assigned user ACLs [04:32] Bug #1607608 opened: "juju shares" doesn't make sense any more [04:38] Bug #1600301 changed: cmd/jujud/agent MachineSuite.TestHostedModelWorkers fails because compute-provisioner never really starts [04:38] Bug #1607608 changed: "juju shares" doesn't make sense any more [04:45] Anyone know how I can work around this error? 
ERROR cannot add unit 1/1 to service "glance": cannot add unit to service "glance": inconsistent state [04:45] Seems to be coming from here: https://github.com/juju/juju/blob/1.25/state/service.go#L899 [04:46] It happened when I tried to "juju add-unit glance --to lxc:21" [04:50] Bug #1600301 opened: cmd/jujud/agent MachineSuite.TestHostedModelWorkers fails because compute-provisioner never really starts [04:50] Bug #1607608 opened: "juju shares" doesn't make sense any more [04:50] Bug #1607611 opened: User commands "See also" help sections need work [04:55] axw: you should really stop biting yourself [04:55] menn0: too bad you can't edit launchpad comments ;p [04:55] that's two people now [04:55] axw: https://bugs.launchpad.net/juju-core/+bug/1600301/comments/4 [04:55] Bug #1600301: cmd/jujud/agent MachineSuite.TestHostedModelWorkers fails because compute-provisioner never really starts [04:56] haha [05:17] axw: Why are you biting yourself? https://www.youtube.com/watch?v=W_bQ_vMtnbo [05:19] blahdeblah: :p [05:21] Bug #1607620 opened: vsphere provider doesn't use region/endpoint from clouds.yaml [05:36] axw: ping [05:37] menn0: pong [05:37] axw: I've been looking at bug 1600301 [05:37] Bug #1600301: cmd/jujud/agent MachineSuite.TestHostedModelWorkers fails because compute-provisioner never really starts [05:37] and I think I see what's happening [05:37] it's a miracle it even works at all [05:38] menn0: was I right with my comment about multi-model? or is it something else? 
[05:38] the only reason it works is that the test spins in a tight loop and just happens to see the compute-provisioner come up just long enough before it dies with a "model is not prepared" error [05:38] on Windows it fails a lot more because things are slower or the timings are different [05:39] if I reduce the test's loop frequency it fails reliably [05:39] axw: there appears to be some stuff in there to support multi-model [05:40] axw: maybe there just needs to be some extra setup done in the dummy provider... right now the test just creates the hosted model in state with no other setup [05:40] axw: I know very little about the dummy provider [05:42] menn0: ok. the code's a bit inscrutable, but it looks like we only enter things into the "state" map during bootstrap. and if there's nothing in there for a model UUID, we get "model is not prepared" [05:42] hence my comment. I may be missing something though [05:42] axw: yep that's what I had figured out too [05:42] * menn0 checks if there's another way to get something into dummy.state [05:43] axw: failing that, what would you say to having this test - and the other 2 or 3 that use a similar structure - skipped [05:43] axw: i'll keep working on it but that seems like a reasonable short term solution [05:44] menn0: seems fine. we need a better way of telling that workers are started [05:44] axw: I'll bring it up with will [05:44] axw: he last touched these tests [05:45] menn0: we'll find out pretty quick if the compute provisioner isn't working ;) [05:45] axw: I wonder if it's enough to just statically check the manifolds config to ensure the workers we expect are listed [05:45] axw: and then rely on other tests (featuretests and CI) if something isn't wired up [05:46] menn0: yeah, I think that'd probably be enough. it's not something that's likely to break often - and if it did, it would be very obvious [05:47] well, I think it would? maybe not. might be wondering why the machine never gets provisioned... 
[05:48] menn0: the other thing we could do is stop using the dummy provider... [05:48] axw: but what then? [05:48] for this test I mean [05:49] menn0: something more programmable. it wouldn't need to do much really - just as long as EnvironProvider.Open works [06:02] axw: here's the PR that skips the 3 flaky tests: http://reviews.vapour.ws/r/5328/ [06:02] axw: I'm emailing Will now regarding a better fix. [06:02] menn0: ok, ta [06:02] anastasiamac: So, re: juju add-unit, this failed on 8 services across 2 different nodes, but retrying them all again worked! \o/ [06:02] menn0: shipit, thanks [06:03] blahdeblah: \o/ [06:03] Which is kinda weird, but yay anyway [06:03] I was only ever adding one unit at a time [06:03] blahdeblah: worthy of a bug :) and the reference to the code that u've paster is brilliant! [06:03] plz add it to the bug too ;) [06:04] For the record, all I did was pattern match the error message + guess. I find go code incredibly hard to read, and couldn't trace the logic of it to save my life. [06:05] Is it really worth a bug report when it's not likely to get fixed? [06:06] blahdeblah: it's worth a bug and i think it's some kind of race :) [06:06] OK [06:06] will do [06:06] \o/ [06:09] axw: merging and i've emailed Will [06:09] axw: I need to EOD [06:09] axw: if the merge fails can you retry pls? [06:09] menn0: no worries. have a good weekend [06:10] axw: you too, cheers [07:13] dooferlad: you up? about? [08:54] Bug #1607689 opened: Cannot launch lxd container in MAAS provider with local image mirror [09:12] morning [09:48] why oh why does azure take so long to tear itself down [10:30] Bug #1607727 opened: JUJU_GUI_SIMPLESTREAMS_URL cannot be specified except by environment variable [10:35] dooferlad, macgreagoir: PTAL @ http://reviews.vapour.ws/r/5330/ [10:38] fwereade_: looks like your OCR - any opinion on http://reviews.vapour.ws/r/5330/ welcome... 
[10:38] frobware, ack [10:39] fwereade_: ty - needs +2 from somebody else on sapphire [10:39] frobware, yeah, I can't pretend to expertise there ;) [10:40] fwereade_: it's sooooo much better than what we have. in fact, it appears to work! OoO [10:40] frobware, nice :D [10:44] frobware: I'm retesting with my lp:1603473 -based env. [10:46] macgreagoir: is this different to the testing you were doing yesterday? [10:47] No, just now with the pr, really. [10:47] Any worth? [10:48] Bug #1607727 changed: JUJU_GUI_SIMPLESTREAMS_URL cannot be specified except by environment variable [11:06] Bug #1607689 changed: Cannot launch lxd container in MAAS provider with local image mirror [11:32] frobware, can we reasonably drop the e/n/i and e/n/i-juju globals and supply them explicitly in this CL? [11:33] frobware, in good healthy ioc style ;p [11:36] Bug #1607749 opened: juju bootstrap fails with MAAS trunk and juju beta12 [11:38] fwereade_: we can - I'm just wary that I'm a) away for two weeks and unsatifactorily w.r.t your comment I'm trying to close a few other PRs too. [11:38] fwereade_: it is just wrong [11:40] fwereade_: I very deliberately made -- func raiseJujuNetworkInterfacesScript(oldInterfacesFile, newInterfacesFile string) -- take arguments. :) [11:41] fwereade_: let me come back to this once I've closed on a few other things. [11:42] frobware, fair enough, but please do come back to it: bug#, or card, or whatever's most likely to ensure it gets dealt with as soon as practically possible [11:45] fwereade_: https://canonical.leankit.com/Boards/View/122969419/123547832 [11:46] frobware, <3 [11:47] frobware, LGTMed [11:52] Bug #1607749 changed: juju bootstrap fails with MAAS trunk and juju beta12 [11:55] Bug #1607749 opened: juju bootstrap fails with MAAS trunk and juju beta12 [12:31] Bug #1607766 opened: juju-upgrade-mongo should ask for confirmation [13:00] Hi, lads - can I get some traction (or at least info) on bug #1457575 ? 
[13:00] Bug #1457575: archive/tar: write too long [13:02] Mmike: weeelll... that's a confusing bug status [13:03] mgz: what can I do to un-confuse it? :) (the subject is a bit vague, yup) [13:03] I wish my link was still valid... [13:04] Mmike: I'm guessing we just marked it fixed because we stopped seeing it in CI [13:04] not because anyone actually fixed it [13:04] this is probably not much good to you unless I can work out what changed [13:05] mgz: I just tested it locally (in our stsstack), and while it is fixed in latest 1.24, it is still failing in 1.25.6 [13:05] welp [13:06] I wonder what the diff looks like if I try to merge 1.24 to 1.25 [13:06] Have a customer that is hitting this issue - I'll suggest a workaround, as to delete stale log files and then try backups again. But the customer would appreciate if he sees some traction on this bug ) [13:07] mgz: maybe it's matter of just some tactical include missing :D [13:07] * Mmike is joking, of course [13:08] there's no commit on the 1.24 branch mentioning that bug number or the symptom... [13:11] mgz: maybe the issue never existed in 1.24 (or was already fixed by the time bug was first reported), so it was 'Invalid' for 1.24 [13:11] s/was/should be/ [13:12] fwereade_: does my comment regarding the lack of default gateway need not be catastrophic, hence warning vis-a-vis error? [13:12] Mmike: maybe. we have one failure in CI on 1.24 from 2015-05 [13:12] the others are on master/feature branches (1.25) [13:13] Mmike: given you have repo steps, I think it's probably just something we need to get assigned to someone to fix [13:13] that's be neat [13:14] * Mmike would look to cheryl, for no apparent reason, had she been here when I wanted to look :D [13:14] * Mmike thinks his english broke [13:17] frobware, hmm... if it *is* reasonably expected, can or should it be an INFO? 
honestly I was pretty much convinced by davecheney's rubbishing of most logging levels [13:18] frobware, I think the place we generally need more sophistication is in the provenance of log messages [13:18] fwereade_: that logging statement I added today - I've had the patch for a while. One wonders whether we should log anything at all. And if so, let's make it just informational. [13:19] frobware, +1 [13:28] Bug #1607786 opened: juju backups won't backup whole /var/log/juju directory [13:29] Mmike: so... this looks reasonably fixable, though the difference between 1.24 and 1.25 isn't totally clear [13:29] they do have different utils versions but nothing in the log looks like a relevant utils/tar change [13:32] mgz: also, it seems that the actual size of the all-machines.log file which triggers the bug is not the same each time - I tested this against maas provider (so my bootstrap machine was a node in maas cluster), and I had to kick the all-machines.log to 1.5GB for the bug to kick in [13:33] now I'm testing against openstack provider (so my state machine is some kvm instance), and it worked ok for 300MB file, failed for 600MB file [13:34] Bug #1607794 opened: withoutControllerSuite.TestWatchMachineErrorRetry unexpected change [13:35] Mmike: oh...oooo [13:35] I wonder [13:36] I bet the size is just helping us hit the bug by making the process slower [13:36] Mmike: the log is still being written during backup right? [13:36] so... 
theory-of-bug [13:37] we start backup, pass the name all-machines.log to tar.TarFiles to stick in our tar [13:37] that code stats the file to get the size [13:37] mgz: keep in mind that if I repeat the test with the exact same file sizes, it doesn't fail on 1.24.7 [13:38] starts io.Copy the contents across [13:38] a new line goes into the log - it's now longer than it was when statt-ed [13:39] we hit the error case in tar.Write [13:39] which doesn't let you write more than you said you were going to [13:40] Bug #1607794 changed: withoutControllerSuite.TestWatchMachineErrorRetry unexpected change [13:41] Mmike: I can try a trivial patch to test that theory if you're up for trying a custom binary [13:42] mgz: sure thing, would be happy to [13:45] how are even the utils tests slow.. [13:46] 0.004s runtime 5 seconds compiling I guess [13:46] okay, passed [13:47] http://paste.ubuntu.com/21401859 [13:48] Mmike: building you some binaries... presume amd64 is fine? [13:49] frobware: If you have a moment, http://reviews.vapour.ws/r/5333/ is ready to fix that LXD gateway bug [13:49] * dooferlad goes to get tea [13:49] mgz: yup, amd64 [13:50] dooferlad: I wasn't sure what stage things were for you so I was also working this: http://reviews.vapour.ws/r/5331/ [13:50] ...now I wish I added some more debug statements as well [13:51] build takes aaages [13:52] Bug #1607794 opened: withoutControllerSuite.TestWatchMachineErrorRetry unexpected change [13:53] does anyone know how to use juju/schema? I can't understand the API [13:58] rogpeppe: ^ [13:59] natefinch: sure, what don't you understand? [13:59] natefinch: FWIW it's one of the very oldest packages in juju... [13:59] Mmike: okay, I went back and added a bunch of debugging in case I'm wrong [14:00] rogpeppe: I expect there to be some sort of "Validate" method or something... 
I'm adding some logic to validation options, except that I can't find code that actually does validation [14:00] natefinch: Coerce is the method to use [14:00] natefinch: it's not just validation, but it also converts to a standard form [14:01] s/method/function/ [14:01] mgz: ack, just let me know where to get the binary from [14:05] rogpeppe: ahh, I think it's the FieldMap function that is the entrypoint I'm trying to find [14:06] mgz: how often is the daily(??) PPA generated? [14:06] lol, I hope there's only one answer to that [14:06] frobware: by default, when master is blessed [14:06] mgz: ty [14:07] so, 99 days ago [14:12] hm, what happened to lillypilly [14:21] Mmike: so, I have a file, but I don't have routing from my build machine to mombin, which is the new lillypilly [14:21] Mmike: so, if you can get to my canonistack box, pull it straight from there? [14:22] mgz: how large is it, can you email it to me? [14:22] mgz: or you can add my keys from https://launchpad.net/~mariosplivalo/+sshkeys [14:22] Mmike: scp 10.55.60.255:~/juju-1.25-gztardebug.tar.xz . [14:23] done ssh-import-id already [14:23] ack [14:24] mgz: what username should I use? [14:24] ubuntu [14:25] has the juju and jujud binaries for 1.25 tip with this change applied, so unpack somewhere and ./juju bootstrap --upload-tools [14:26] ack [14:26] it's downloading [14:27] expectation is either no error with your steps, or same error but with +GZ and extra details on the end [14:33] Building tools to upload (1.25.6.1-trusty-amd64) [14:33] mgz: shouldn't the version stting be changed? 
[14:33] eh, sorry, my fault [14:34] wrong binary :) [14:36] :D [14:37] it should report 1.25.7.1 [14:46] mgz: testing it now [14:49] -rw------- 101/4 1572890664 2016-07-29 14:45 var/log/juju/all-machines.log [14:49] mgz: it worked ok, it put that large file into root.tar [14:49] sec, I'll pastebin [14:50] mgz: http://paste.ubuntu.com/21407167/ [14:50] Mmike: now I want to build one that I think will fail with the extra debugging to be sure... [14:51] sure thing [14:51] i'm going to be around for the next hour and a half [14:51] I guess it looks like we have the fix though [14:53] hm, the other thing that could matter is the go version [14:58] Mmike: scp ubuntu@10.55.60.255:~/juju-1.25-gztarbad.tar.xz . [14:59] I worry that may actually pass too given it's go 1.6 but hopefully now [14:59] *not [15:07] mgz: few mins, pls [15:10] Bug #1605714 opened: juju2 beta11: LXD containers always pending on ppc64el systems === frankban is now known as frankban|afk [15:39] this is glorious https://twitter.com/fatih/status/759049292531109889 [15:43] mgz: sorry, had mtgs, testing it now [15:47] Mmike: no worries, thanks for all your help [15:52] mgz: it failed, as expected: http://paste.ubuntu.com/21412575/ [15:54] perrito666: gah, I should have finished my code to do that: https://github.com/natefinch/graffiti [15:55] Mmike: woho [15:55] that's a good 'woho', right? :) [15:56] Mmike: yeah, though... I did get my debug statement slightly wrong, annoyingly [15:56] Mmike: I'm happy to propose this [15:57] will be in 1.25.7 and it seems like the best workaround is minimising the logs/avoid getting the logfile written to mid-backup [15:57] mgz: excellent, thank you very much! [15:57] mgz: may I ask you to throw a short update on the bug so that I can point the customer there too? [15:58] Mmike: I shall indeed summarzie from irc and assign to me [15:58] mgz: muchos gracias, senor! 
[16:06] * Mmike signs of for the week [16:10] Bug #1457575 opened: archive/tar: write too long [16:15] perrito666: so... you want to review some code for me? [16:16] I have a slight issue in that I can fix this much easier than I can test it [16:25] Bug #1457575 changed: archive/tar: write too long [16:31] Bug #1457575 opened: archive/tar: write too long [16:31] Bug #1607855 opened: introspectionSuite.SetUpTest unable to listen to socket [16:31] Bug #1607858 opened: MachinerStateSuite.TestSetsStatusWhenDying timeout waiting for status to change [16:48] any OCR today? [16:49] mgz: fwereade_ or what is left of himm [16:51] ...what did you do to him... [16:52] perrito666: this one is right up your alley though, https://github.com/juju/utils/pull/227 (and 228) [16:53] mgz, LGTM [16:55] I looked quite seriously at adding a test, but that meant rewriting so much of the implementation to take non io/os things [16:55] mgz, so I imagined [16:55] mgz, perhaps a comment is warranted? [16:58] // Limit data copied to the size of the file on first stat to prevent ErrWriteTooLong from tar if file grows [16:58] or something like that? [16:58] mgz: oh, that is not right [16:58] from archive/tar I guess [16:59] mgz: where in the world did you see tar not whining about files changing during copy? [16:59] perrito666: for the log case it's reasonable. for other cases of getting files rewritten under you it's perhaps a higher level code fault, but can only be recovered by abandoning the tar and recreating [16:59] mgz, yeah, I had "CopyN lest f grow during Copy" in my head, which is unhelpfully terse [17:00] perrito666: it does whine, that's the issue. breaks backup. 
[17:00] mgz, perhaps it's worth a doc comment actually [17:01] mgz: do add a doc comment, out of the top of my head I can think quickly in another file that might change and we definitely dont want to truncate [17:01] yeah, doesn't hurt to be explicit about the function behaviour [17:01] which is agent.conf [17:01] mgz, "if it succeeds, it has copied at least the full contents of the file at the time it was passed" or something [17:01] fwereade_: not entirely true [17:02] Bug #1607859 opened: MachineWithCharmsSuite.TearDownTest left sockets in a dirty state [17:02] if the file is being in-place mutated we're at the mercy of the filesystem implementation anyway [17:02] if it succeeds will copy at least size(file) at the time it was passed <- this is more realistic [17:02] heh, yes [17:02] if we're doing rewriting correctly by linux rules, agent.conf is not a problem [17:02] because we have a handle to the old file [17:03] so we just back up the old version [17:03] mgz: I suddenly fear we are not [17:03] not the agent.conf.new that gets renamed agent.conf [17:03] perrito666: well, that's a bug then :) [17:03] are we actually doing that? [17:03] we really should be [17:03] someone, not me, should check [17:03] perrito666, we darn well should be [17:03] truncate and write is asking for bugs on linux [17:03] fwereade_: never even loked at that part of the code [17:04] perrito666, agent/agent.go:622 has utils.AtomicWriteFile [17:04] sweet [17:12] relatively easy review anyone? https://github.com/juju/schema/pull/13/files === ses is now known as Guest16266 [17:18] fwereade_: added some text on the master branch for your delectation [17:25] abentley, sinzui is there a way to populate my local db with issues? [17:26] issue #1: I have lost my irc nick [17:26] I suspect you can dump the live reports db, or at least some of it, and pull it in locally [17:27] Guest16266: import issues form reports.vapour.ws? 
=== ses is now known as Guest24158 [17:50] gah, juju/schema is so weird [17:50] k bbl, about 1h [18:02] Bug #1607895 opened: juju2, maas2, two credentials yields confusing error message [20:12] if anyone is alive still, I have the utils dep bump branches up [20:13] natefinch: you have a vested interest in reviewing, also picks up one of your bug fixes [20:13] mgz: link me? [20:14] http://reviews.vapour.ws/r/5338/ http://reviews.vapour.ws/r/5337/ [20:18] mgz: the CopyN thing is weird.... what causes that? [20:18] mgz: i.e. - when would Copy copy more than that? [20:18] the way the writeContents function works is that it opens the file at the start [20:18] stats it [20:19] that information (including the filesize) is then put in the tar header [20:19] then we kick off io.Copy with the file handle [20:19] if io.Copy takes long enough, the file might have been appended to by the time it finishes [20:20] so, we're writing more data than we promised we would in the header [20:20] I get it [20:20] to prevent borked tarfiles, the archive/tar package has a check for that and returns an error [20:20] as we're talking logfiles in practice, just taking their contents (length) at time of opening is sane [20:21] anything else should be using atomic file updates [20:21] right [20:21] lgtm'd [20:21] natefinch: ta! === natefinch is now known as natefinch-afk [21:13] /j #lxcontainers [21:31] mgz: Y U ESCALATE ON FRIDAY? [21:32] perrito666: so I can land and get a CI run while I frolick over the weekend? :) [21:33] you are a sick puppy (and just sent me to look up a word) [21:35] perrito666: much of my pleasure on irc is making you learn more obscure English language and culture [21:35] though, this one really is pretty innocent (and lose the k when not being archaic) [21:36] mgz: you won't stop until you see me having afternoon tea with scones, will you? [21:36] I feel scones are not a great stretch for you [21:37] mgz: can I get more English than that? 
[21:37] * perrito666 has an actual tea pot in his work desk [21:41] Bug #1607964 opened: juju2, maas2, lxd containers started with wrong IP, rely on dhclient to switch things [22:14] Bug #1607971 opened: [juju 2.0 ] cannot remove or destroy machine in pending state [23:13] I am going EoW soon and will be on Holiday next week. See you all the following week.