[07:53] <cjwatson> tomwardill: lxc-attach seems to pass output straight through without modification, so I think you can just do lxc-attach -n "$container_name" -- env blah | subunit-2to1, much as you had above
[07:53] <tomwardill> ah, right :)
[07:53] <tomwardill> will give that a try once I've worked out why postgres is refusing to start
[07:54] <tomwardill> thanks :)
[07:54] <cjwatson> Though consider error handling
[07:54] <cjwatson> As in, what happens if lxc-attach exits non-zero
[07:54] <cjwatson> Pipes tend to lose that unless you take special care
[07:55] <cjwatson> You still want to stop the container, but should preserve the exit code
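A minimal sketch of the error-handling point above, assuming the pipeline is driven from Python rather than from the shell script under discussion. The name `run_piped` and the stand-in commands are hypothetical. It runs a producer piped into a consumer, always runs a cleanup step, and returns the producer's exit code rather than the consumer's.

```python
import subprocess
import sys


def run_piped(producer, consumer, cleanup):
    """Run `producer | consumer`, always call cleanup, return producer's exit code."""
    try:
        first = subprocess.Popen(producer, stdout=subprocess.PIPE)
        second = subprocess.Popen(consumer, stdin=first.stdout)
        # Close our copy of the pipe so the producer gets SIGPIPE if the
        # consumer exits early.
        first.stdout.close()
        second.wait()
        return first.wait()
    finally:
        # Stand-in for "stop the container": runs on success and on failure.
        cleanup()


if __name__ == "__main__":
    # Hypothetical stand-ins: the producer exits 3, the consumer copies
    # stdin to stdout and exits 0.
    producer = [sys.executable, "-c", "print('test output'); raise SystemExit(3)"]
    consumer = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
    code = run_piped(producer, consumer, lambda: print("container stopped"))
    print("exit code:", code)
```

In a shell script the same effect is usually obtained with `set -o pipefail` or `${PIPESTATUS[0]}` in bash, plus a `trap` for the cleanup.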
[08:51] <SpecialK|Canon> someone with more familiarity with traversal want to review https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/385489 ?
[08:51] <SpecialK|Canon> (because hi)
[09:32] <tomwardill> cjwatson: before I break out pdb again, any idea what might be causing: https://pastebin.canonical.com/p/djCTpmdCwF/
[09:33] <cjwatson> tomwardill: bad working directory maybe?
[09:33] <tomwardill> `pwd` agrees with my current directory
[09:33] <cjwatson> or permissions?
[09:33] <cjwatson> open_for_writing swallows any IOError that isn't ENOENT!
[09:34] <cjwatson> if you add an else: raise into open_for_writing (which really ought to be there anyway), you might get a better error message
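A hedged sketch of the `else: raise` suggestion above. This is not the Launchpad implementation; `open_for_writing_sketch` is a hypothetical helper with the same general shape (retry after creating missing parent directories), written to show why the `else` branch matters.

```python
import errno
import os
import tempfile


def open_for_writing_sketch(filename, mode="w"):
    """Open filename for writing, creating missing parent directories."""
    try:
        return open(filename, mode)
    except IOError as e:
        if e.errno == errno.ENOENT:
            # A parent directory is missing: create it and retry.
            os.makedirs(os.path.dirname(filename))
            return open(filename, mode)
        else:
            # Without this branch, errors such as EACCES fall through and
            # the function silently returns None.
            raise


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    with open_for_writing_sketch(os.path.join(base, "logs", "out.txt")) as f:
        f.write("ok\n")
```

With the re-raise in place, a permissions problem surfaces as the original exception at the point of failure instead of a later error on a `None` file object.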
[09:35] <tomwardill> hmm, I appear to have a bunch of files that are owned by root
[09:35] <tomwardill> I suspect that is not ideal
[09:35] <cjwatson> Oh, your lxc-attach arrangements don't seem to switch user
[09:36] <cjwatson> add '$PWD/utilities/run-as buildbot' just before 'env', maybe?
[09:36] <tomwardill> yeah, this is from pre-that step I think
[09:36] <tomwardill> trying to work out which step is doing it
[09:37] <cjwatson> Might be leftovers from a previous run?
[09:37] <tomwardill> yeah, think so
[09:39] <cjwatson> And yeah, lxc-start-ephemeral -u ... meant "switch to this user", and part of the reason I added utilities/run-as was that at least at the time there was nothing else with quite exactly the right semantics
[09:42] <cjwatson> (lxc-attach -u and lxc exec --user both take uids rather than usernames; setpriv(1) didn't exist yet)
[09:46] <tomwardill> the good news is that I'm just about at the point where I can hack master.cfg and it will repeatedly get to the same state so I can debug it
[09:50] <tomwardill> ... I should pick a faster subsection of the test suite to test this
[09:51] * tomwardill twiddles thumbs a bit more
[10:18] <tomwardill> run-as is giving me a permissions error on chdir to the build directory, but only when run via buildbot
[10:20] <cjwatson> User namespaces can cause much confusion sometimes, maybe ...
[10:47] <tomwardill> yeah, something weird going on
[10:48] <tomwardill> works fine run from a terminal
[10:48] * tomwardill sighs at the amount of shell/environment/namespace learning I don't know
[10:57] <tomwardill> unsure how I'm getting permission denied changing to the directory that is cwd
[10:57] <StevenK> That is clearly perms
[10:57] <StevenK> You either aren't the owner, or in the group, or there's no +x
[10:59] <tomwardill> drwxr-xr-x 20 buildbot buildbot 4.0K Jun 11 10:55 build
[10:59] <StevenK> All the way up to / ?
[11:00] <tomwardill> hmm, no, buildbot ownership stops at /var/lib
[11:01] <StevenK> I'd expect that, but hopefully everything has +x
[11:02] <tomwardill> looks like it
[11:02] <tomwardill> I can be in that directory quite happily in a shell
[11:02] <tomwardill> oh, wait
[11:02] <tomwardill> maybe I can't in this situation
[11:02] <tomwardill> root@lptests-xenial_tfbWfo:/var/lib/buildbot/lp-devel-xenial/build# su buildbot
[11:02] <tomwardill> Cannot execute /bin/bash: Permission denied
[11:04] <cjwatson> buildbot outside and inside the container might not be the same thing
[11:05] <cjwatson> I would get the most minimal possible reproducer you can manage and strace it
[11:06] <cjwatson> and also make sure to be looking at permissions by id (ls -nl) inside the container
[11:08] <tomwardill> so, yeah
[11:08] <tomwardill> `/` was 0700
[11:08] <tomwardill> ... that's a thing
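The "all the way up to /" check and the numeric-id advice above can be sketched as follows. The function names are hypothetical, and the check is deliberately crude: it only flags directories lacking the "others" execute (search) bit, which is what a mode of 0700 on `/` implies for any user other than its owner.

```python
import os
import stat


def ancestor_modes(path):
    """Return (directory, octal mode, uid, gid) for path and each parent up to /."""
    rows = []
    current = os.path.abspath(path)
    while True:
        st = os.stat(current)
        # Numeric uid/gid, as with `ls -nl`, so that names which differ
        # inside and outside a container cannot mislead.
        rows.append((current, oct(stat.S_IMODE(st.st_mode)), st.st_uid, st.st_gid))
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return rows


def missing_other_execute(path):
    """List ancestors that lack the 'others' execute (search) bit."""
    return [row[0] for row in ancestor_modes(path)
            if not int(row[1], 8) & stat.S_IXOTH]


if __name__ == "__main__":
    for row in ancestor_modes(os.getcwd()):
        print(*row)
    print("no o+x:", missing_other_execute(os.getcwd()))
```

A directory listed by `missing_other_execute` is not necessarily the culprit, since the owner and group bits may still grant access, but it is a reasonable first place to look.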
[11:10] * tomwardill gets lunch, leaves it for future tom to worry about
[11:10] <ilasc> and in this context of twom dealing with complex issues, I come along and ask the rudimentary question: in LP how do we split a large MP in several smaller MPs ? Do I just create separate git branches and open MPs for each new branch or is there something I can do at the level of the large MP that I'm not yet aware of?
[11:11] <StevenK> Years and years ago one of my friends did 'chmod 644 .*' as root in a top-level directory and then wondered why no one could log in
[11:12] <cjwatson> ilasc: Separate branches and open MPs for each.  Are you familiar enough with the git-level operations here?
[11:15] <cjwatson> (Also, prerequisites in MPs may be useful, depending on how you lay out the branch structure)
[11:16] <ilasc> thanks cjwatson, hmmm good question :) just to make sure I start on the right path, I assume I start creating the smaller new git branches from master ?
[11:17] <ilasc> indeed figured prerequisites in MPs will be necessary
[11:17] <cjwatson> Normally from master, yes.
[11:17] <cjwatson> The "splitting commits" section of "man git rebase" may be useful.
[11:20] <cjwatson> If the bits you need to split up are separate enough in their respective files, you can often manage it with "git add -p" or whatever equivalent exists in your IDE.  Failing that I sometimes resort to just dumping out the overall patch and editing it down to the bits I want before applying it, but editing patches by hand certainly isn't for everyone
[11:24] <ilasc> great, ok, thanks Colin! it sounds like our approaches are similar in this case, I always go for editing patches by hand :)
[11:26] <cjwatson> Oh, I'm glad I'm not the only person who puts up with that
[11:26] <cjwatson> I do need it slightly less since I found tools to let me do line-by-line rather than hunk-by-hunk changes to the git index
[11:29] <ilasc> :)
[11:33] <cjwatson> Also keep a git ref to the original thing around, then you can't lose it
[11:44] <ilasc> good idea :)
[11:49] <SpecialK|Canon> `git add --patch`'s editor option is <3
[11:54] <cjwatson> I prefer vimagit since I discovered it last year sometime, but same sort of idea
[13:28] <tomwardill> cjwatson: any idea where I need the umask change?
[13:31] <cjwatson> I'm not quite sure, it was just a hunch as to how you might end up with mysterious mode 700
[13:32] <cjwatson> Is the base container like this or just the ephemeral copy?
[13:33] <tomwardill> good question, sec
[13:33] <cjwatson> If the former, look in your build pipeline, if the latter, start from lp-setup-lxc-test and trace down
[13:34] <tomwardill> yeah, it's the latter
[13:34] <cjwatson> Not actually what I expected
[13:34] <tomwardill> which makes sense, as lp-setup-lxc-test is the only bit I've actually changed
[13:34] <cjwatson> Though it probably should have been since you reported different behaviour when running from a terminal
[13:34] <tomwardill> a hack fix would be to just chmod / ;)
[13:35] <cjwatson> buildbot's buildslave runs with umask 077 by default unless you say --umask=022
[13:35] <cjwatson> Maybe relevant?
[13:35] <cjwatson> But you could also just umask 022 at the top of lp-setup-lxc-test ...
[13:36] <cjwatson> I suspect that'll do it
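A small illustration of the umask hunch above, assuming nothing about how lp-setup-lxc-test actually creates the container's root directory. The helper name is hypothetical. It shows that a directory created under umask 077 ends up with mode 700, while the same call under umask 022 gives 755.

```python
import os
import stat
import tempfile


def mkdir_mode_under_umask(mask):
    """Create a directory under the given umask and return its rwx permission bits."""
    base = tempfile.mkdtemp()
    old = os.umask(mask)
    try:
        target = os.path.join(base, "rootfs")
        os.mkdir(target)  # requested mode defaults to 0o777, then the umask is applied
        return stat.S_IMODE(os.stat(target).st_mode) & 0o777
    finally:
        os.umask(old)  # restore the caller's umask


if __name__ == "__main__":
    print(oct(mkdir_mode_under_umask(0o077)))
    print(oct(mkdir_mode_under_umask(0o022)))
```

This is consistent with the difference seen between a terminal session (typically umask 022) and a process inheriting a stricter umask from its parent.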
[13:37] <cjwatson> I could be wrong here, because I thought our buildbot worker config already did umask 022, but it's been some time since I looked at that and maybe it got lost somewhere along the way
[13:37] <tomwardill> I'll have a look and give that a try
[13:37] <cjwatson> puppet modules/launchpad/templates/buildbot.tac.erb has it
[13:38] <cjwatson> Hm.  Did you write buildbot.tac or whatever the modern equivalent is for the workers yourself?  Or where did you get it from?
[13:38] <tomwardill> I didn't write it
[13:38] <tomwardill> came from lpsetup I think
[13:38] <cjwatson> It might be a good idea to get sluagh:/srv/buildbot/lpbuildbot/buildbot.tac and compare
[13:39] <cjwatson> lpsetup's might be wrong
[13:39] <tomwardill> and it has an interesting thing:
[13:39] <tomwardill> `umask = None`
[13:39] <cjwatson> That might be from lpbuildbot demo/slave/buildbot.tac
[13:39] <cjwatson> Which I'm not certain is in sync
[13:39] <tomwardill> yeah, that makes sense
[13:39] <tomwardill> asked for the real one
[13:50] <tomwardill> well, it gets further, now to see if postgres works
[13:50] <tomwardill> it's running tests!
[13:51] <tomwardill> now just subunit to work out
[13:53] <ilasc> ... can't type :P
[13:56] <tomwardill> now, how do I make it stop
[13:56] * tomwardill reboots the worker
[14:59] <tomwardill> okay, might need to teach the test step about subunit 2
[14:59] <cjwatson> --subunit-v2 | subunit-2to1 you mean?  or something else?
[15:00] <tomwardill> piping the lxc-attach output through subunit-2to1 just reproduces the same problem of testr not understanding the output
[15:00] <tomwardill> and trying to pipe the testr output through it still results in weird stdout in the logs and the step not understanding how many tests have run
[15:01] <cjwatson> Ah, hm
[15:01] <cjwatson> Maybe testr adds too much extra stuff
[15:01] <tomwardill> hmm, or maybe I've done something wrong somewhere
[15:02] <tomwardill> as `testr run --parallel --concurrency=2 --subunit --full-results '|' subunit-2to1` looks a bit weird, given the escaping around the pipe
[15:02] <cjwatson> Where did you put the subunit-2to1 in that case?
[15:03] <tomwardill> in the master.cfg
[15:03] <cjwatson> What's the diff?
[15:03] <tomwardill> command=['testr', 'run', '--parallel', '--concurrency=2', '--subunit', '--full-results', '|', 'subunit-2to1']))
[15:03] <cjwatson> Well, yes
[15:04] <cjwatson> That's an argv
[15:04] <cjwatson> More or less
[15:04] <cjwatson> It's not passed to a shell, so doesn't understand |
[15:04] <tomwardill> which makes sense
[15:05] * cjwatson looks at buildbot.steps.shell
[15:05] <cjwatson> So ... there's no fiddly quoting required for the arguments themselves there
[15:06] <cjwatson> You *could* just try:
[15:06] <cjwatson> command=['sh', '-c', 'testr run --parallel --concurrency=20 --subunit --full-results | subunit-2to1']
[15:06] <cjwatson> Definitely a workaround, but ought to help
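The argv-versus-shell distinction above can be demonstrated with the standard `subprocess` module, used here as a stand-in for how a build step might execute a command list (an assumption; buildbot's own execution path is not shown). The function names and the tiny child program are hypothetical.

```python
import subprocess
import sys

# Child program that reports the arguments it received.
SHOW_ARGS = "import sys; print(sys.argv[1:])"


def run_as_argv():
    """Pass '|' in an argv list: no shell is involved, so it is a plain argument."""
    cmd = [sys.executable, "-c", SHOW_ARGS, "--subunit", "|", "subunit-2to1"]
    return subprocess.check_output(cmd, text=True).strip()


def run_via_shell():
    """Wrap the pipeline in sh -c so that a shell interprets the '|'."""
    cmd = ["sh", "-c", "printf 'a\\nb\\n' | wc -l"]
    return subprocess.check_output(cmd, text=True).strip()


if __name__ == "__main__":
    print(run_as_argv())    # the child sees '|' and 'subunit-2to1' as arguments
    print(run_via_shell())  # the shell builds a real pipeline
```

In the first case the child process receives the pipe character as an ordinary argument, which matches the quoted `'|'` seen in the step's logged command line.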
[15:06] <tomwardill> the docs spec that you can give command as a single string
[15:06] <tomwardill> and it does basically that
[15:07] <tomwardill> (although that's in the latest docs)
[15:08] * tomwardill tries
[15:08] <cjwatson> I am a bit suspicious 'cos I can't find what implements that, but maybe
[15:09] <cjwatson> But the sh -c trick should definitely work if that doesn't
[15:17] <tomwardill> the stdout is good too
[15:17] <tomwardill> okay, so I think that's all the problems worked through
[15:18] <tomwardill> now I just need to document what they were, work out patches and file an RT to try this...
[15:21] <tomwardill> concurrency 5 is making my computer VERY LOUD
[15:25] <cjwatson> Out of interest, does this fix the "unknown worker (bug in our subunit output?)" thing that we currently get?  Looked like it might from your image ...
[15:25] <tomwardill> it seems to...
[15:25] <tomwardill> we have a list of workers too!
[15:25] <cjwatson> Ooh, does this let us download independent subunit streams from each worker?
[15:25] <tomwardill> ooh, which tells you which worker ran which tests
[15:26] <tomwardill> I think the only 'stream' we get is the list of tests
[15:26] <cjwatson> That will make debugging certain kinds of test isolation bugs so much easier
[15:26] <tomwardill> if we upgrade from precise to xenial, do we need to rebuild the xenial LXC that we already have?
[15:27] <cjwatson> Well, even separate lists of tests for each worker is a lot better than nothing
[15:27] <cjwatson> I have no idea
[15:27] <cjwatson> Hopefully not
[15:27] <tomwardill> indeed, as I don't really want to have to try and maintain this script :)
[15:27] <cjwatson> As long as you have something working locally, I think it's OK to debug it into existence a little bit on production if necessary
[15:28] <tomwardill> hmm, getting some 'App server startup timed out' failures, but that may well be due to the load on the VM/machine
[15:28] <cjwatson> Yeah, likely
[15:28] <tomwardill> it's at 350% cpu usage and has eaten all the ram allocated to it
[15:28] <cjwatson> om nom nom
[15:28] <tomwardill> they're not on the same tests as the ones I had in the last run, so points towards that at least
[15:30] <tomwardill> wish I'd left this machine in the basement now
[15:31] <cjwatson> This is great though, super-happy to see these improvements
[15:31] <tomwardill> getting this out and working, then transcribing over to LXD will be super nice
[15:31] <cjwatson> And hopefully LXD won't be too difficult after this
[15:31] <tomwardill> and cleaning up/sorting lpsetup along the way
[16:32] <cjwatson> I think I've decided I don't have enough brain to review https://code.launchpad.net/~pappacena/turnip/+git/turnip/+merge/385158 today.  I've reviewed the Launchpad bits that need to precede that ...
[16:46] <tomwardill> fixed container cleanup and exit code return too
[16:56] <tomwardill> okay, will work out a plan and extract / update the files required tomorrow morning
[16:56] <tomwardill> but it's looking good/feasible now
[20:21] <cjwatson> Non-lcy01 bionic image builders aren't working.  I've (belatedly) deployed staging equivalents to test this.  lgw01 is failing due to a glance API difference, bos02 is possibly something else but I haven't worked it out yet.
[20:36] <cjwatson> wgrant: ^- could I have a quick review of https://code.launchpad.net/~cjwatson/canonical-is-charms/gss-glance-v2-private/+merge/385608 ?
[20:46] <cjwatson> Looks like bos02 is probably the same thing after all.
[20:46] * cjwatson cowboys on lgw01 bionic staging to test
[20:52] <cjwatson> Looks like that fixes it on lgw01, indeed
[21:41] <wgrant> cjwatson: Ah, fun
