/srv/irclogs.ubuntu.com/2009/07/02/#ubuntu-testing.txt

ara	good morning!	07:00
ara	schwuk: ping	11:53
cr3	hey folks, I would appreciate your feedback to improve the concept of attachments in Checkbox!	15:23
cr3	currently, attachments can be expressed either as a filename or a command. the problem is that a command makes for a horrible download filename	15:24
cr3	so, we need to find a way to still make it possible to express commands but also relate a name to that command somehow	15:25
fader	cr3: There are a couple of things that immediately come to mind...	15:25
fader	cr3: 1. Add an (optional?) parameter to attachments to provide a filename	15:26
cr3	fader: attachments are currently expressed this way in a test definition file: attachments: cat /foo	15:26
fader	cr3: 2. Generate the filename by some simple method, e.g. "$SYSTEM_ID-$DATE-$TESTNAME"	15:26
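A minimal sketch of fader's second suggestion, generating the download filename from "$SYSTEM_ID-$DATE-$TESTNAME". The function name, the argument names, and the sanitising rule are assumptions made for illustration; nothing here is taken from Checkbox itself.

```python
from datetime import date


def attachment_filename(system_id: str, test_name: str, day: date) -> str:
    """Build a download filename of the form SYSTEM_ID-DATE-TESTNAME (hypothetical helper)."""

    def safe(text: str) -> str:
        # Replace slashes and collapse whitespace so the result is one path component.
        return "_".join(text.replace("/", "_").split())

    return f"{safe(system_id)}-{day.isoformat()}-{safe(test_name)}"


print(attachment_filename("sys42", "cat foo", date(2009, 7, 2)))
```

A generated name avoids the "horrible download filename" problem, but it cannot tell apart several attachments that belong to the same test, which is the case cr3 raises next.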
cr3	fader: also, a single test definition can define multiple attachments	15:27
cr3	aha! what if we had attachment definition files!?	15:28
fader	cr3: So could you say: attachments: filename cat /foo ?	15:28
cr3	fader: what if filename contains a space?	15:29
fader	cr3: "Don't do that" :P	15:29
fader	cr3: Seriously, didn't you just write some escaping code?	15:29
cr3	fader: currently, the values for definitions do not support the concept of a dictionary, ie key/value pairs such as filename/command	15:30
cr3	what if we introduce another type of definition, like: type: attachment; filename: foo; command: cat /foo	15:31
fader	Hey, I like that... it leaves flexibility for future expansion	15:31
cr3	fader: only type and filename would be required, command would be optional	15:32
cr3	furthermore, in order to link an attachment to one or more tests: tests: foo bar baz	15:32
fader	If you don't provide a command it just attaches the filename specified?	15:32
cr3	fader: right	15:32
fader	cr3: Hmm, I'm not sure if that's a good idea. You have the same keyword performing two fairly different functions	15:33
fader	1. grabbing a file, 2. specifying a filename for output of a command	15:33
cr3	fader: ok, so you'd have this then, where parts in square brackets are optional: type: attachment; name: foo; [filename: /tmp/foo;] [command: cat /tmp/foo;] [tests: foo bar baz;]	15:34
cr3	fader: filename and command would be mutually exclusive	15:34
fader	cr3: I like it.	15:35
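The attachment definition agreed on above (type and name required; filename, command and tests optional; filename and command mutually exclusive) could be checked roughly as follows. This is a sketch only: the semicolon-separated syntax is copied from cr3's one-line example, and parse_attachment is a hypothetical name, not Checkbox's parser.

```python
def parse_attachment(text: str) -> dict:
    """Parse 'key: value; key: value' into a dict and validate it (illustrative only)."""
    fields = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = part.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {part!r}")
        fields[key.strip()] = value.strip()

    if fields.get("type") != "attachment":
        raise ValueError("type must be 'attachment'")
    if "name" not in fields:
        raise ValueError("name is required")
    if "filename" in fields and "command" in fields:
        raise ValueError("filename and command are mutually exclusive")

    # tests is an optional, space-separated list of test names.
    fields["tests"] = fields.get("tests", "").split()
    return fields


print(parse_attachment(
    "type: attachment; name: foo; command: cat /tmp/foo; tests: foo bar baz"
))
```

One limitation of this sketch is that a command containing a semicolon would be split in the wrong place; a real definition file with one field per line would not have that problem.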
cr3	I think I'm comfortable with that too. there is one caveat though: if tests are specified, this means that filename or command needs to be evaluated immediately when the test is finished, not at the end of running all tests	15:36
fader	Why is that?	15:37
cr3	so, it is possible that the same attachment definition will produce different attachment instances	15:37
cr3	fader: this was a problem raised by eeejay where mago overwrites the same log file for every test, or somesuch	15:37
fader	cr3: Oh. That's ugly :(	15:37
cr3	fader: this is a valid concern though, an attachment should indeed attempt to capture the state of the system immediately after the test was run	15:38
fader	I won't argue with you... you're the one who has to code it ;)	15:38
cr3	fader: heh, it does indeed put additional burden on the coding, the result should be transparent to everyone else	15:39
fader	cr3: See, this is why everybody puts up with you working through weekends and national holidays.	15:40
fader	Wait, hang on, that doesn't make sense...	15:40
cr3	fader: I think I finally understood the purpose of national holidays: it's not so that I can have a holiday, it's so that other people can have a holiday from me.	15:48
fader	Hehe	15:48
fader	Is there a way to see what image an installed system was built from? Something more specific than /etc/lsb-release?	17:04
=== Erkan_Yilmaz__ is now known as Erkan_Yilmaz
=== fader is now known as fader|lunch
cr3	eeejay: yo, got a minute to bounce ideas?	18:29
=== fader|lunch is now known as fader
cr3	fader: what would you think of changing the requires field in a test definition to explicit: packages and devices?	19:42
cr3	these would still take registry expressions so that I can do something like: packages: package.name == 'firefox' and int(package.version) > 2	19:43
fader	cr3: Only those two specifically? What if I want to require e.g. processor scaling support to do a test?	19:43
fader	Or does that fall under 'devices'?	19:43
cr3	fader: is that information available in lshal or only in cpuinfo?	19:44
fader	cr3: Good question. Let me check.	19:44
fader	Looks like it's in lshal: info.capabilities = {'cpufreq_control'} (string list)	19:45
cr3	all the use cases I have so far relate to either packages or devices. so, that's why I'm thinking that from our experience it might make sense to be explicit about both	19:45
cr3	but requires is pretty powerful though, maybe I should keep it	19:45
fader	cr3: What about existence of a file?	19:45
cr3	the existence of a file is not necessarily contained in the registry	19:46
fader	cr3: Ah, so one could still express requirements for things that are not specifically contained within the registry then, right?	19:47
cr3	no, the other way around: one could not express requirements for things...	19:47
cr3	or, to avoid the double negative: one can only express requirements for things specifically contained within the registry	19:48
* fader tries to think of a use case where this would be a problem.		19:48
cr3	typically, when this has posed a problem, a new registry was created	19:48
fader	Heh, y'know, that's obviously the right way to handle it :)	19:48
fader	I'm just concerned that there might potentially be a case where you'd want some information that's not from the package manager or lshal et al. But I can't think of a specific case.	19:50
cr3	coming back to my grid testing idea to take over the world, I just remembered that some job description languages support the concept of expressing requirements as a form of boolean query. so, I'm going to keep "requires"	19:50
cr3	fader: right, it's not because our current limited experience has not come up with a valid use case that there are none	19:50
fader	cr3: Yeah, that.	19:51
cr3	fader: however, I am tempted to remove "architectures" and "categories" which are basically just shorthands for requirements. those seem to just add noise to the test definition format	19:51
cr3	and I don't even recall architectures ever being used	19:51
fader	cr3: So you'd just formulate them like e.g. "requires: category=server"?	19:52
fader	I can imagine cases where architecture would be useful, especially with e.g. LPIA devices	19:52
cr3	fader: something that correlates to the registry in a boolean expression	19:52
cr3	so that would use "==" rather than "=" :)	19:52
fader	Hehe	19:52
cr3	dpkg.architecture == "i386"	19:53
cr3	there is a difference between the architecture of the system and the architecture of the packages installed on the system, dpkg.architecture is already provided in the registry to return the latter information	19:54
cr3	so for the rare times a test might care about the architecture, having to specify that boolean expression should not cause anyone carpal tunnel syndrome	19:54
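To illustrate how a "requires" boolean expression can be checked against registry data, here is a rough sketch. The registry is faked with SimpleNamespace objects and a single package entry, and evaluate_requires is a hypothetical name; how Checkbox actually builds and queries its registry is not shown in this log.

```python
from types import SimpleNamespace


def evaluate_requires(expression: str, registry: dict) -> bool:
    """Evaluate a boolean expression against registry entries (illustrative only)."""
    # Expose only the registry entries and int(); all other builtins are hidden.
    # eval() is still not a sandbox, so this is only for trusted test definitions.
    namespace = {"__builtins__": {}, "int": int}
    namespace.update(registry)
    return bool(eval(expression, namespace))


# Assumed registry contents, made up for this example.
registry = {
    "dpkg": SimpleNamespace(architecture="i386"),
    "package": SimpleNamespace(name="firefox", version="3"),
}

print(evaluate_requires('dpkg.architecture == "i386"', registry))
print(evaluate_requires("package.name == 'firefox' and int(package.version) > 2", registry))
```

Anything the expression names must exist in the registry, which matches cr3's point that requirements can only be expressed for things the registry contains.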
fader	cr3: Cool, I like it.	19:59
cr3	fader: another question for you: do you think that timeout should be part of the test definition format or part of the command: timeout 10 some_command	20:45
fader	cr3: Hmm... I'm assuming it would be optional?	20:46
cr3	fader: it's optional, but does it belong as an attribute of the test definition or as part of the command itself?	20:49
fader	cr3: Would Checkbox terminate the test if it runs over the timeout value?	20:49
cr3	fader: yes, but the timeout command could do the same	20:50
fader	cr3: It seems to make more sense to me at the test level. If you have multiple commands that should each have a timeout, then you should either split them into separate tests or handle that inside your test itself IMO	20:50
cr3	the only potential reason to make it an attribute of the test definition is whether we care to formalize the difference between a failure because the test failed or because it timed out	20:51
cr3	fader: what do you mean by "handle that inside the test itself"?	20:52
fader	cr3: I mean that if the conditions of your test are complex enough that multiple different timeout values are required but you still can't split the test into multiple tests, you can't expect Checkbox to do it all for you :)	20:53
fader	cr3: Yeah, the difference between 'failed' and 'timed out' seems important to me	20:53
cr3	fader: when you say "the test", do you mean the command being called?	20:54
fader	cr3: Yes, the script that is being called	20:54
cr3	fader: timeout 10 the_script	20:54
cr3	fader: the "timeout" script can essentially do the same: first argument is the timeout in seconds, the rest is the command to run and its arguments	20:55
fader	cr3: Right, either way works for simple cases. It's only cases where you might want to say "this test will run 3 commands. Let the first one run for 10s, the second for 30s, the third for 10s" that I am saying it should be handled inside the test script	20:56
fader	Which if the timeout is defined at the test definition level and is optional, everything is fine	20:56
cr3	fader: or that timeout command could actually relieve the script of that responsibility: timeout 10 first_one && timeout 30 second_one && timeout 10 third_one	20:57
cr3	but then, in that complex situation, you lose that granularity of 'failure' vs 'timed out'	20:57
fader	cr3: Good point. But it doesn't seem to give the ability to track the difference between 'timed out' and 'failed'	20:57
fader	Heh	20:57
fader	It seems cleaner to me to do it at the test definition level, but that's just an aesthetic distinction	20:58
cr3	referring to my test-result-codes blog post, I'm starting to agree with the importance of distinguishing between 'failure' and 'timed out'. I think the latter might fall under the code UNRESOLVED or INCOMPLETE	20:59
cr3	I'm really glad I took the time to enumerate all those darn test result codes, good reference for myself :)	21:00
cr3	ok, I'm convinced, timeout stays	21:00
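The argument for keeping timeout in the test definition is that the runner can then report 'timed out' separately from 'failed'. A minimal sketch of that distinction using Python's subprocess module follows; run_test and the three status strings are assumptions for illustration, not Checkbox's actual result codes.

```python
import subprocess
import sys
from typing import List, Optional


def run_test(command: List[str], timeout: Optional[float] = None) -> str:
    """Run a command and classify the outcome (illustrative only)."""
    try:
        completed = subprocess.run(command, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        return "timed out"
    return "passed" if completed.returncode == 0 else "failed"


# A child process that sleeps past its limit, and one that exits non-zero.
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
broken = [sys.executable, "-c", "raise SystemExit(1)"]
print(run_test(slow, timeout=0.5))
print(run_test(broken, timeout=5))
```

When the timeout is instead hidden inside the command line, as in "timeout 10 the_script", the runner only sees an exit status and has to guess which of the two cases occurred.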
cr3	fader: for your example of three tests, they should be expressed with dependencies between each other so that if one times out, the others aren't run	21:01
fader	cr3: Ooh, slick	21:01
cr3	and they should use the timeout feature of the test definition to distinguish 'failure' from 'timed out'	21:01
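cr3's suggestion for the three-command case, separate tests linked by dependencies so that later ones do not run after a timeout, might look like the sketch below. The "depends" key, the 'skipped' status and run_suite are hypothetical names chosen for this example, and the tests are assumed to be listed in dependency order.

```python
from typing import Callable, Dict, List


def run_suite(tests: List[dict], run: Callable[[dict], str]) -> Dict[str, str]:
    """Run tests in order, skipping any whose dependencies did not pass."""
    results: Dict[str, str] = {}
    for test in tests:
        # A dependency that failed, timed out, was skipped or never ran blocks this test.
        blocked = [dep for dep in test.get("depends", []) if results.get(dep) != "passed"]
        if blocked:
            results[test["name"]] = "skipped"
        else:
            results[test["name"]] = run(test)
    return results


tests = [
    {"name": "first_one", "timeout": 10},
    {"name": "second_one", "timeout": 30, "depends": ["first_one"]},
    {"name": "third_one", "timeout": 10, "depends": ["second_one"]},
]


def fake_run(test: dict) -> str:
    # Stand-in runner: pretend the second test exceeds its timeout.
    return "timed out" if test["name"] == "second_one" else "passed"


print(run_suite(tests, fake_run))
```

Each test keeps its own timeout value, so the per-command limits fader asked about (10s, 30s, 10s) stay visible in the definitions rather than inside a script.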
fader	Man, you've thought about this stuff :)	21:01
cr3	I did some things right, but there's plenty I did wrong too :)	21:02
fader	And your blog posts are really in-depth. It's good for me to read them... maybe I can learn something.	21:02
fader	:)	21:02
cr3	I've learned a lot myself in the process :)	21:03
cr3	yesterday, I was googling for some test related problem and my blog actually came up as the third result :)	21:03
fader	Hehe	21:06
fader	You'll end up writing a book. Just wait.	21:07
cr3	fader: I already have a few people lined up for a book about my little humiliating stories :)	21:09
fader	cr3: Not quite what I had in mind, but that works too.	21:09
fader	You need to start including those on the blog as well!	21:09
cr3	fader: maybe I can combine both: Testing in underwear	21:10
fader	You know, you can tag the entries so only the testing related ones get syndicated on planet.u.c... :)	21:10
cr3	fader: I've made it a point that my blog and micro-blog posts will be strictly testing related, I didn't want to fall into inanities like where I'm scratching myself right now	21:11
cr3	fader: besides, you already know where I'm scratching most of the time...	21:12
fader	cr3: You don't know how often I lay awake at night wondering what you had for breakfast or how often you went to the bathroom today. Enquiring twitterers want to know!	21:13
cr3	I think that crosses the fine line between twitterers and just twits	21:14
fader	cr3: I know we've been over this but I've confused myself. Checkbox executes tests as root whether run interactively (e.g. checkbox-gtk) or not (e.g. kicked off after boot from certify-web) right?	21:23
fader	(I ask as some of the security qa-regression-test tests refuse to run as root and explicitly call sudo in the scripts, which will have to change if everything is run as root)	21:26
cr3	fader: checkbox-(gtk|cli) runs tests as the user unless overridden by a specific user in the test definition. checkbox-(compatibility|certification)* runs everything as root	21:26
cr3	fader: calling sudo in scripts run as the normal user won't work either	21:26
cr3	fader: if the script prompts, we're screwed. I need to create a bug to disable all interactivity possibly assumed by scripts	21:27
fader	cr3: It seems like the best way to handle this is to remove any prompts in the script, which means we may end up maintaining our own version of some tests :(	21:27
cr3	fader: or maybe they weren't written well in the first place...	21:27
fader	cr3: Heh, I'll let you have that fight with kees :)	21:28
cr3	tests simply shoved in a directory are probably not written properly in the first place. if the same tests were written within any test suite like checkbox, subunit, or whatever, these problems would've been caught early on	21:29
cr3	tests in a directory are scripts, not tests	21:29
cr3	it's not really a fight that I'm looking for, it's just that the sru team was probably under pressure to just get something done which implies some eventual migration process	21:30
cr3	if we can help with this migration process, I'm sure it will be much appreciated	21:30
fader	cr3: Right, I'm trying to migrate some of those myself and wanted to make sure I was on the right track	21:34
fader	It'll just be a question of if we can make the tests in checkbox the authoritative source for those tests	21:35
cr3	fader: it doesn't have to be though, any test suite could be integrated. however, checkbox is one of the rare ones which supports interactive testing in addition to automated testing, so that might be the deciding factor	21:39
cr3	the only problem is that the author is a real pain to deal with	21:39
fader	cr3: The author of checkbox you mean? :)	21:40
cr3	yeah, avoid him if possible	21:40
fader	He's not so bad... you just have to rough him up a little.	21:40
fader	But anyway, it seems like anything that refuses to run as root but explicitly calls sudo will need to be modified. It's just a question of whether we can push those modifications upstream or if we have to maintain them	21:41
fader	And/or make our version the authoritative version and accept patches for other changes	21:42
sbeattie	fader: I don't have a problem merging them with upstream.	21:43
sbeattie	fader: tricky bit will be the tests for sudo itself.	21:43
fader	sbeattie: Yeah, there are some that are obviously ill-suited for this treatment. I'm taking baby steps right now :)	21:43
sbeattie	fader: excellent. How can I help?	21:43
sbeattie	do you have work-in-progress stuffed anywhere?	21:44
fader	I'm also starting right now by purposefully ignoring any destructive tests (those which overwrite config files, etc.)	21:44
fader	sbeattie: Just my laptop right now. I don't have anything usable yet, just my prototype bits that suck and need thrown out. I'll try to have something to point you at next week though.	21:45
fader	= was aiming for this week but that didn't happen :(	21:46
fader	s/=/I/	21:46
sbeattie	fader: alright, but I'm keenly interested in getting this going, so I don't mind looking at junk that needs to be thrown out.	21:46
sbeattie	(I briefly started on it at one point, but operator error prevented me from getting my tests noticed by checkbox)	21:48
cr3	interactive tests as for sudo perhaps could be wrapped in an (py)expect script to be fully automated	21:58
cr3	or, they could remain a manual test by asking the user to perform a series of steps	21:58
fader	cr3: I'd rather keep them automated and just remove the requirement for sudo for the ones that will work when run as root	22:00
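One way to picture fader's preference, dropping the sudo requirement for commands that already run as root, is a small wrapper like the one below. It is a sketch under stated assumptions: effective_command is a hypothetical helper, it handles only a bare leading "sudo" with no options, and it relies on os.geteuid, which is available on Unix only.

```python
import os
from typing import List, Optional


def effective_command(command: List[str], euid: Optional[int] = None) -> List[str]:
    """Drop a leading 'sudo' when the caller is already root (illustrative only)."""
    if euid is None:
        euid = os.geteuid()
    if euid == 0 and command and command[0] == "sudo":
        return list(command[1:])
    # Not root, or no sudo prefix: leave the command unchanged.
    return list(command)


print(effective_command(["sudo", "apt-get", "update"], euid=0))
print(effective_command(["sudo", "apt-get", "update"], euid=1000))
```

This does not help with scripts that refuse to run as root at all; those still need the check removed in the script itself, as discussed above.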
fader	sbeattie: Maybe we can come to an arrangement where you can beat on the test scripts and I can work on the suite definitions and make everything run from Checkbox :)	22:01
fader	(Or more likely annoy cr3 until he tells me what I'm doing wrong, but whatever.)	22:01
=== plars_ is now known as plars
fader	sbeattie: I cleaned up what I've been playing with a bit so you can at least see what I'm doing, but it will still make the baby cr3 cry so don't show it to him	22:32
fader	sbeattie: https://code.launchpad.net/~fader/checkbox-certification/security-tests	22:32
fader	This is just the glibc test; I've also poked a bit at the gcc test but I'd rather hold onto it for a day or two	22:32
fader	(As it's in worse shape and doesn't run right at all)	22:33
fader	NB that you'll need build-essential installed to run the test. That's not in a 'requires' field for the suite yet. (Another reason not to tell cr3)	22:34
sbeattie	fader: cool, thanks.	22:54
fader	sbeattie: I'll be interested to hear if it works for you. :)	22:55
fader	You can look at ~/.checkbox/submission.xml when it prompts you to enter a secure ID, which you probably don't have	22:56
fader	A quick and dirty way to see if the glibc script ran is to look for /tmp/glibc-security, which I haven't bothered to clean up at the end of the script yet :(	22:56
fader	Okay, time to find food and be interactive in meatspace for a bit... sniff you jerks later :)	23:00
