/srv/irclogs.ubuntu.com/2020/06/11/#launchpad-dev.txt

cjwatson	tomwardill: lxc-attach seems to pass output straight through without modification, so I think you can just do lxc-attach -n "$container_name" -- env blah | subunit-2to1, much as you had above	07:53
tomwardill	ah, right :)	07:53
tomwardill	will give that a try once I've worked out why postgres is refusing to start	07:53
tomwardill	thanks :)	07:54
cjwatson	Though consider error handling	07:54
cjwatson	As in, what happens if lxc-attach exits non-zero	07:54
cjwatson	Pipes tend to lose that unless you take special care	07:54
cjwatson	You still want to stop the container, but should preserve the exit code	07:55
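
Note: a minimal Python sketch of the point above about pipes losing the exit code. It runs one command piped into a filter, always runs a cleanup step, and returns the first command's exit status rather than the filter's. The name run_piped and the cleanup callback are hypothetical, not taken from lp-setup-lxc-test.

```python
import subprocess


def run_piped(producer_argv, filter_argv, cleanup):
    """Pipe producer into filter and return the producer's exit code.

    cleanup() always runs, even if a process fails to start.
    """
    try:
        producer = subprocess.Popen(producer_argv, stdout=subprocess.PIPE)
        consumer = subprocess.Popen(filter_argv, stdin=producer.stdout)
        # Close our copy of the pipe so the producer gets SIGPIPE if the
        # filter exits early.
        producer.stdout.close()
        consumer.wait()
        # The filter's status is deliberately ignored; the caller wants
        # to know whether the producer (the test run) succeeded.
        return producer.wait()
    finally:
        cleanup()
```

In the situation discussed here, the producer would be the lxc-attach command, the filter subunit-2to1, and cleanup whatever stops the container; the caller would then exit with the returned code.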
tomwardill	right	07:56
SpecialK|Canon	someone with more familiarity with traversal want to review https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/385489 ?	08:51
SpecialK|Canon	(because hi)	08:51
tomwardill	cjwatson: before I break out pdb again, any idea what might be causing: https://pastebin.canonical.com/p/djCTpmdCwF/	09:32
cjwatson	tomwardill: bad working directory maybe?	09:33
tomwardill	`pwd` agrees with my current directory	09:33
cjwatson	or permissions?	09:33
cjwatson	open_for_writing swallows any IOError that isn't ENOENT!	09:33
cjwatson	quality	09:34
cjwatson	if you add an else: raise into open_for_writing (which really ought to be there anyway), you might get a better error message	09:34
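
Note: a sketch of the "else: raise" suggestion. This is a reconstruction of what such a helper might look like, assuming it retries after creating missing parent directories; the signature and the dirmode parameter are assumptions, and the real Launchpad function may differ.

```python
import errno
import os


def open_for_writing(filename, mode, dirmode=0o777):
    """Open filename for writing, creating missing parent directories."""
    try:
        return open(filename, mode)
    except IOError as e:
        if e.errno == errno.ENOENT:
            # A parent directory is missing: create it and retry once.
            os.makedirs(os.path.dirname(filename), mode=dirmode)
            return open(filename, mode)
        else:
            # Without this branch, errors such as EACCES are swallowed
            # and the function silently returns None.
            raise
```

With the else branch in place, a permissions problem surfaces as an exception at the open call instead of as a confusing failure later on.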
tomwardill	hmm, I appear to have a bunch of files that are owned by root	09:35
tomwardill	I suspect that is not ideal	09:35
cjwatson	Oh, your lxc-attach arrangements don't seem to switch user	09:35
cjwatson	add '$PWD/utilities/run-as buildbot' just before 'env', maybe?	09:36
tomwardill	yeah, this is from pre-that step I think	09:36
tomwardill	trying to work out which step is doing it	09:36
cjwatson	Might be leftovers from a previous run?	09:37
tomwardill	yeah, think so	09:37
tomwardill	poking	09:37
cjwatson	And yeah, lxc-start-ephemeral -u ... meant "switch to this user", and part of the reason I added utilities/run-as was that at least at the time there was nothing else with quite exactly the right semantics	09:39
cjwatson	(lxc-attach -u and lxc exec --user both take uids rather than usernames; setpriv(1) didn't exist yet)	09:42
tomwardill	the good news is that I'm just about at the point where I can hack master.cfg and it will repeatedly get to the same state so I can debug it	09:46
tomwardill	... I should pick a faster subsection of the test suite to test this	09:50
* tomwardill twiddles thumbs a bit more		09:51
tomwardill	run-as is giving me a permissions error on chdir to the build directory, but only when run via buildbot	10:18
tomwardill	wtf	10:18
cjwatson	User namespaces can cause much confusion sometimes, maybe ...	10:20
tomwardill	yeah, something weird going on	10:47
tomwardill	works fine run from a terminal	10:48
* tomwardill sighs at the amount of shell/environment/namespace learning I don't know		10:48
tomwardill	unsure how I'm getting permission denied changing to the directory that is cwd	10:57
StevenK	That is clearly perms	10:57
StevenK	You either aren't the owner, or in the group, or there's no +x	10:57
tomwardill	drwxr-xr-x 20 buildbot buildbot 4.0K Jun 11 10:55 build	10:59
StevenK	All the way up to / ?	10:59
tomwardill	hmm, no, buildbot ownership stops at /var/lib	11:00
StevenK	I'd expect that, but hopefully everything has +x	11:01
tomwardill	looks like it	11:02
tomwardill	I can be in that directory quite happily in a shell	11:02
tomwardill	oh, wait	11:02
tomwardill	maybe I can't in this situation	11:02
tomwardill	wtf	11:02
tomwardill	root@lptests-xenial_tfbWfo:/var/lib/buildbot/lp-devel-xenial/build# su buildbot	11:02
tomwardill	Cannot execute /bin/bash: Permission denied	11:02
cjwatson	buildbot outside and inside the container might not be the same thing	11:04
cjwatson	I would get the most minimal possible reproducer you can manage and strace it	11:05
cjwatson	and also make sure to be looking at permissions by id (ls -nl) inside the container	11:06
tomwardill	so, yeah	11:08
tomwardill	`/` was 0700	11:08
tomwardill	... that's a thing	11:08
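
Note: a small Python sketch of the check described above, looking at every path component up to / by numeric id (in the spirit of ls -nl). The function names are hypothetical. It assumes a POSIX filesystem and would need to run inside the container to be meaningful.

```python
import os
import stat


def ancestors(path):
    """Yield path and each parent directory up to the filesystem root."""
    path = os.path.abspath(path)
    while True:
        yield path
        parent = os.path.dirname(path)
        if parent == path:
            return
        path = parent


def report(path):
    """Return (path, octal mode, uid, gid, world_searchable) per component."""
    rows = []
    for p in ancestors(path):
        st = os.stat(p)
        mode = stat.S_IMODE(st.st_mode)
        # A user who is neither owner nor in the group needs o+x on every
        # component in order to chdir into the final directory.
        rows.append((p, oct(mode), st.st_uid, st.st_gid,
                     bool(mode & stat.S_IXOTH)))
    return rows
```

Printing each row of report("/var/lib/buildbot/lp-devel-xenial/build") would show a / with mode 0o700 as the one component that is not world-searchable.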
* tomwardill gets lunch, leaves it for future tom to worry about		11:10
ilasc	and in this context of twom dealing with complex issues, I come along and ask the rudimentary question: in LP how do we split a large MP in several smaller MPs ? Do I just create separate git branches and open MPs for each new branch or is there something I can do at the level of the large MP that I'm not yet aware of?	11:10
StevenK	Years and years ago one of my friends did 'chmod 644 .*' as root in a top-level directory and then wondered why no one could log in	11:11
ilasc	:)	11:11
cjwatson	ilasc: Separate branches and open MPs for each. Are you familiar enough with the git-level operations here?	11:12
cjwatson	(Also, prerequisites in MPs may be useful, depending on how you lay out the branch structure)	11:15
ilasc	thanks cjwatson, hmmm good question :) just to make sure I start on the right path, I assume I start creating the smaller new git branches from master ?	11:16
ilasc	indeed figured prerequisites in MPs will be necessary	11:17
cjwatson	Normally from master, yes.	11:17
cjwatson	The "splitting commits" section of "man git rebase" may be useful.	11:17
cjwatson	If the bits you need to split up are separate enough in their respective files, you can often manage it with "git add -p" or whatever equivalent exists in your IDE. Failing that I sometimes resort to just dumping out the overall patch and editing it down to the bits I want before applying it, but editing patches by hand certainly isn't for everyone	11:20
ilasc	great, ok, thanks Colin! it sounds like our approaches are similar in this case, I always go for editing patches by hand :)	11:24
cjwatson	Oh, I'm glad I'm not the only person who puts up with that	11:26
cjwatson	I do need it slightly less since I found tools to let me do line-by-line rather than hunk-by-hunk changes to the git index	11:26
ilasc	:)	11:29
cjwatson	Also keep a git ref to the original thing around, then you can't lose it	11:33
ilasc	good idea :)	11:44
SpecialK|Canon	`git add --patch`'s editor option is <3	11:49
cjwatson	I prefer vimagit since I discovered it last year sometime, but same sort of idea	11:54
tomwardill	cjwatson: any idea where I need the umask change?	13:28
cjwatson	I'm not quite sure, it was just a hunch as to how you might end up with mysterious mode 700	13:31
cjwatson	Is the base container like this or just the ephemeral copy?	13:32
tomwardill	good question, sec	13:33
cjwatson	If the former, look in your build pipeline, if the latter, start from lp-setup-lxc-test and trace down	13:33
cjwatson	(probably)	13:33
tomwardill	yeah, it's the latter	13:34
cjwatson	Not actually what I expected	13:34
tomwardill	which makes sense, as lp-setup-lxc-test is the only bit I've actually changed	13:34
cjwatson	Though it probably should have been since you reported different behaviour when running from a terminal	13:34
tomwardill	yeah	13:34
tomwardill	a hack fix would be to just chmod / ;)	13:34
cjwatson	buildbot's buildslave runs with umask 077 by default unless you say --umask=022	13:35
cjwatson	Maybe relevant?	13:35
cjwatson	But you could also just umask 022 at the top of lp-setup-lxc-test ...	13:35
cjwatson	I suspect that'll do it	13:36
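
Note: a minimal Python illustration of why an inherited umask of 077 could plausibly produce a mode-700 directory. Whether this is exactly how the ephemeral container's / ended up that way is the hunch from the conversation, not something this sketch proves. The helper name is hypothetical.

```python
import os
import tempfile


def mode_of_new_dir(umask):
    """Create a directory under the given umask and return its mode bits."""
    parent = tempfile.mkdtemp()
    old = os.umask(umask)
    try:
        target = os.path.join(parent, "rootfs")
        os.mkdir(target)  # requested mode defaults to 0o777
        return os.stat(target).st_mode & 0o777
    finally:
        os.umask(old)  # restore the caller's umask


print(oct(mode_of_new_dir(0o077)))  # 0o700
print(oct(mode_of_new_dir(0o022)))  # 0o755
```

The umask is inherited by child processes, so a script started by a worker running with 077 creates directories this way unless it resets the umask itself.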
cjwatson	I could be wrong here, because I thought our buildbot worker config already did umask 022, but it's been some time since I looked at that and maybe it got lost somewhere along the way	13:37
tomwardill	I'll have a look and give that a try	13:37
cjwatson	puppet modules/launchpad/templates/buildbot.tac.erb has it	13:37
cjwatson	Hm. Did you write buildbot.tac or whatever the modern equivalent is for the workers yourself? Or where did you get it from?	13:38
tomwardill	I didn't write it	13:38
tomwardill	came from lpsetup I think	13:38
cjwatson	It might be a good idea to get sluagh:/srv/buildbot/lpbuildbot/buildbot.tac and compare	13:38
cjwatson	lpsetup's might be wrong	13:39
tomwardill	and it has an interesting thing:	13:39
tomwardill	`umask = None`	13:39
cjwatson	That might be from lpbuildbot demo/slave/buildbot.tac	13:39
cjwatson	Which I'm not certain is in sync	13:39
tomwardill	yeah, that makes sense	13:39
tomwardill	asked for the real one	13:39
tomwardill	well, it gets further, now to see if postgres works	13:50
tomwardill	it's running tests!	13:50
tomwardill	weeee	13:50
tomwardill	now just subunit to work out	13:51
ilasc	+!	13:53
ilasc	+1	13:53
ilasc	... can't type :P	13:53
tomwardill	now, how do I make it stop	13:56
* tomwardill reboots the worker		13:56
tomwardill	okay, might need to teach the test step about subunit 2	14:59
cjwatson	--subunit-v2 | subunit-2to1 you mean? or something else?	14:59
tomwardill	piping the lxc-attach output through subunit-2to1 just reproduces the same problem of testr not understanding the output	15:00
tomwardill	and trying to pipe the testr output through it still results in weird stdout in the logs and the step not understanding how many tests have run	15:00
cjwatson	Ah, hm	15:01
cjwatson	Maybe testr adds too much extra stuff	15:01
tomwardill	hmm, or maybe I've done something wrong somewhere	15:01
tomwardill	as `testr run --parallel --concurrency=2 --subunit --full-results '|' subunit-2to1` looks a bit weird, given the escaping around the pipe	15:02
cjwatson	Where did you put the subunit-2to1 in that case?	15:02
tomwardill	in the master.cfg	15:03
cjwatson	What's the diff?	15:03
tomwardill	command=['testr', 'run', '--parallel', '--concurrency=2', '--subunit', '--full-results', '|', 'subunit-2to1']))	15:03
cjwatson	Oh	15:03
cjwatson	Well, yes	15:03
cjwatson	That's an argv	15:04
cjwatson	More or less	15:04
cjwatson	It's not passed to a shell, so doesn't understand |	15:04
tomwardill	which makes sense	15:04
* cjwatson looks at buildbot.steps.shell		15:05
cjwatson	So ... there's no fiddly quoting required for the arguments themselves there	15:05
cjwatson	You could just try:	15:06
cjwatson	command=['sh', '-c', 'testr run --parallel --concurrency=20 --subunit --full-results | subunit-2to1']	15:06
cjwatson	Definitely a workaround, but ought to help	15:06
tomwardill	the docs spec that you can give command as a single string	15:06
tomwardill	and it does basically that	15:06
tomwardill	(although that's in the latest docs)	15:07
* tomwardill tries		15:08
cjwatson	I am a bit suspicious 'cos I can't find what implements that, but maybe	15:08
tomwardill	running	15:09
cjwatson	But the sh -c trick should definitely work if that doesn't	15:09
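
Note: a short Python demonstration of the argv-versus-shell distinction, using the standard subprocess module rather than buildbot itself. The commands are stand-ins (assumptions for the demo), not the real testr or subunit-2to1 invocations.

```python
import subprocess
import sys

# Stand-in program that just prints the arguments it received.
SHOW_ARGS = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]

# 1. As an argv list: "|" is handed to the program as an ordinary argument,
#    and "tr" is never started.
direct = subprocess.run(
    SHOW_ARGS + ["|", "tr", "a-z", "A-Z"], capture_output=True, text=True
).stdout.strip()

# 2. Through a shell: sh parses "|" and builds a real pipeline.
via_shell = subprocess.run(
    ["sh", "-c", "echo hello | tr a-z A-Z"], capture_output=True, text=True
).stdout.strip()

print(direct)     # ['|', 'tr', 'a-z', 'A-Z']
print(via_shell)  # HELLO
```

This is why wrapping the whole pipeline in 'sh', '-c', '...' works: the pipe is interpreted by the shell instead of being passed to testr as a stray argument.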
tomwardill	https://usercontent.irccloud-cdn.com/file/7xneaob9/image.png	15:16
tomwardill	success!	15:16
cjwatson	Progress!	15:16
tomwardill	the stdout is good too	15:17
tomwardill	okay, so I think that's all the problems worked through	15:17
tomwardill	now I just need to document what they were, work out patches and file an RT to try this...	15:18
tomwardill	concurrency 5 is making my computer VERY LOUD	15:21
cjwatson	Nice	15:24
cjwatson	Out of interest, does this fix the "unknown worker (bug in our subunit output?)" thing that we currently get? Looked like it might from your image ...	15:25
tomwardill	it seems to...	15:25
tomwardill	we have a list of workers too!	15:25
cjwatson	Ooh, does this let us download independent subunit streams from each worker?	15:25
tomwardill	ooh, which tells you which worker ran which tests	15:25
cjwatson	EXCELLENT	15:26
tomwardill	I think the only 'stream' we get is the list of tests	15:26
cjwatson	That will make debugging certain kinds of test isolation bugs so much easier	15:26
tomwardill	if we upgrade from precise to xenial, do we need to rebuild the xenial LXC that we already have?	15:26
cjwatson	Well, even separate lists of tests for each worker is a lot better than nothing	15:27
cjwatson	I have no idea	15:27
cjwatson	Hopefully not	15:27
tomwardill	indeed, as I don't really want to have to try and maintain this script :)	15:27
cjwatson	As long as you have something working locally, I think it's OK to debug it into existence a little bit on production if necessary	15:27
tomwardill	hmm, getting some 'App server startup timed out' failures, but that may well be due to the load on the VM/machine	15:28
cjwatson	Yeah, likely	15:28
tomwardill	it's at 350% cpu usage and has eaten all the ram allocated to it	15:28
cjwatson	om nom nom	15:28
tomwardill	they're not on the same tests as the ones I had in the last run, so points towards that at least	15:28
tomwardill	wish I'd left this machine in the basement now	15:30
cjwatson	Heh	15:30
cjwatson	This is great though, super-happy to see these improvements	15:31
tomwardill	getting this out and working, then transcribing over to LXD will be super nice	15:31
cjwatson	And hopefully LXD won't be too difficult after this	15:31
cjwatson	Yeah	15:31
tomwardill	and cleaning up/sorting lpsetup along the way	15:31
cjwatson	I think I've decided I don't have enough brain to review https://code.launchpad.net/~pappacena/turnip/+git/turnip/+merge/385158 today. I've reviewed the Launchpad bits that need to precede that ...	16:32
tomwardill	fixed container cleanup and exit code return too	16:46
SpecialK|Canon	nice	16:48
tomwardill	okay, will work out a plan and extract / update the files required tomorrow morning	16:56
tomwardill	but it's looking good/feasible now	16:56
cjwatson	Non-lcy01 bionic image builders aren't working. I've (belatedly) deployed staging equivalents to test this. lgw01 is failing due to a glance API difference, bos02 is possibly something else but I haven't worked it out yet.	20:21
cjwatson	wgrant: ^- could I have a quick review of https://code.launchpad.net/~cjwatson/canonical-is-charms/gss-glance-v2-private/+merge/385608 ?	20:36
cjwatson	Looks like bos02 is probably the same thing after all.	20:46
* cjwatson cowboys on lgw01 bionic staging to test		20:46
cjwatson	Looks like that fixes it on lgw01, indeed	20:52
wgrant	cjwatson: Ah, fun	21:41
