/srv/irclogs.ubuntu.com/2009/07/02/#ubuntu-testing.txt

aragood morning!07:00
araschwuk: ping11:53
cr3hey folks, I would appreciate your feedback to improve the concept of attachments in Checkbox!15:23
cr3currently, attachments can be expressed either as a filename or a command. the problem is that a command makes for a horrible download filename15:24
cr3so, we need to find a way to still make it possible to express commands but also relate a name to that command somehow15:25
fadercr3: There are a couple of things that immediately come to mind...15:25
fadercr3: 1. Add an (optional?) parameter to attachments to provide a filename15:26
cr3fader: attachments are currently expressed this way in a test definition file: attachments: cat /foo15:26
fadercr3: 2. Generate the filename by some simple method, e.g. "$SYSTEM_ID-$DATE-$TESTNAME"15:26
cr3fader: also, a single test definition can define multiple attachments15:27
cr3aha! what if we had attachment definition files!?15:28
fadercr3: So could you say: attachments: filename cat /foo ?15:28
cr3fader: what if filename contains a space?15:29
fadercr3: "Don't do that" :P15:29
fadercr3: Seriously, didn't you just write some escaping code?15:29
cr3fader: currently, the values for definitions do not support the concept of a dictionary, ie key/value pairs such as filename/command15:30
cr3what if we introduce another type of definition, like: type: attachment; filename: foo; command: cat /foo15:31
faderHey, I like that... it leaves flexibility for future expansion15:31
cr3fader: only type and filename would be required, command would be optional15:32
cr3furthermore, in order to link an attachment to one or more tests: tests: foo bar baz15:32
faderIf you don't provide a command it just attaches the filename specified?15:32
cr3fader: right15:32
fadercr3: Hmm, I'm not sure if that's a good idea.  You have the same keyword performing two fairly different functions15:33
fader1. grabbing a file, 2. specifying a filename for output of a command15:33
cr3fader: ok, so you'd have this then, where parts in square brackets are optional: type: attachment; name: foo; [filename: /tmp/foo;] [command: cat /tmp/foo;] [tests: foo bar baz;]15:34
cr3fader: filename and command would be mutually exclusive15:34
fadercr3: I like it.15:35
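[Editor's sketch: the attachment definition agreed on above could be checked roughly as follows. The field names (type, name, filename, command, tests) come from the discussion; the function name, the semicolon-splitting parser and the returned dict are assumptions made for illustration, not Checkbox's actual code. A value containing ";" would need the escaping fader mentions, which this sketch does not attempt.]

```python
# Hypothetical validator for the attachment definition discussed in the log.
# parse_attachment and its return layout are illustrative assumptions.

def parse_attachment(text):
    """Parse 'key: value; key: value' pairs and apply the rules from the chat."""
    fields = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()

    if fields.get("type") != "attachment":
        raise ValueError("type must be 'attachment'")
    if not fields.get("name"):
        raise ValueError("name is required")
    if "filename" in fields and "command" in fields:
        raise ValueError("filename and command are mutually exclusive")

    # 'tests' is an optional whitespace-separated list of test names.
    fields["tests"] = fields.get("tests", "").split()
    return fields


definition = parse_attachment(
    "type: attachment; name: foo; command: cat /tmp/foo; tests: foo bar baz;"
)
print(definition["name"], definition["tests"])
```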
cr3I think I'm comfortable with that too. there is one caveat though: if tests are specified, this means that filename or command needs to be evaluated immediately when the test is finished, not at the end of running all tests15:36
faderWhy is that?15:37
cr3so, it is possible that the same attachment definition will produce different attachment instances15:37
cr3fader: this was a problem raised by eeejay where mago overwrites the same log file for every test, or somesuch15:37
fadercr3: Oh.  That's ugly :(15:37
cr3fader: this is a valid concern though, an attachment should indeed attempt to capture the state of the system immediately after the test was run15:38
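[Editor's sketch: why the attachment has to be evaluated right after each test. Three stand-in tests overwrite one shared log file, as described for mago, and the runner reads the file immediately after each one. All names here (run_and_capture, make_test, shared.log) are hypothetical.]

```python
import os
import tempfile


def run_and_capture(tests, attachment_path):
    """Run each test and read the attachment file straight afterwards."""
    captured = {}
    for name, run in tests:
        run()
        # Read now, before the next test overwrites the same file.
        with open(attachment_path) as handle:
            captured[name] = handle.read()
    return captured


def make_test(name, path):
    """Build a stand-in test that overwrites the shared log file."""
    def run():
        with open(path, "w") as handle:
            handle.write("log for " + name + "\n")
    return run


log_path = os.path.join(tempfile.mkdtemp(), "shared.log")
names = ["foo", "bar", "baz"]
captured = run_and_capture([(n, make_test(n, log_path)) for n in names], log_path)
print(sorted(captured))
```

[Reading the file only once, after all tests, would leave just the "baz" contents, which is the overwrite problem raised above.]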
faderI won't argue with you... you're the one who has to code it ;)15:38
cr3fader: heh, it does indeed put additional burden on the coding, the result should be transparent to everyone else15:39
fadercr3: See, this is why everybody puts up with you working through weekends and national holidays.15:40
faderWait, hang on, that doesn't make sense...15:40
cr3fader: I think I finally understood the purpose of national holidays: it's not so that I can have a holiday, it's so that other people can have a holiday from me.15:48
faderHehe15:48
faderIs there a way to see what image an installed system was built from?  Something more specific than /etc/lsb-release?17:04
=== Erkan_Yilmaz__ is now known as Erkan_Yilmaz
=== fader is now known as fader|lunch
cr3eeejay: yo, got a minute to bounce ideas?18:29
=== fader|lunch is now known as fader
cr3fader: what would you think of changing the requires field in a test definition to explicit: packages and devices?19:42
cr3these would still take registry expressions so that I can do something like: packages: package.name == 'firefox' and int(package.version) > 219:43
fadercr3: Only those two specifically?  What if I want to require e.g. processor scaling support to do a test?19:43
faderOr does that fall under 'devices'?19:43
cr3fader: is that information available in lshal or only in cpuinfo?19:44
fadercr3: Good question.  Let me check.19:44
faderLooks like it's in lshal:   info.capabilities = {'cpufreq_control'} (string list)19:45
cr3all the use cases I have so far relate to either packages or devices. so, that's why I'm thinking that from our experience it might make sense to be explicit about both19:45
cr3but requires is pretty powerful though, maybe I should keep it19:45
fadercr3: What about existence of a file?19:45
cr3the existence of a file is not necessarily contained in the registry19:46
fadercr3: Ah, so one could still express requirements for things that are not specifically contained within the registry then, right?19:47
cr3no, the other way around: one could not express requirements for things...19:47
cr3or, to avoid the double negative: one can only express requirements for things specifically contained within the registry19:48
* fader tries to think of a use case where this would be a problem.19:48
cr3typically, when this has posed a problem, a new registry was created19:48
faderHeh, y'know, that's obviously the right way to handle it :)19:48
faderI'm just concerned that there might potentially be a case where you'd want some information that's not from the package manager or lshal et al.  But I can't think of a specific case.19:50
cr3coming back to my grid testing idea to take over the world, I just remembered that some job description languages support the concept of expressing requirements as a form of boolean query. so, I'm going to keep "requires"19:50
cr3fader: right, it's not because our current limited experience has not come up with a valid use case that there are none19:50
fadercr3: Yeah, that.19:51
cr3fader: however, I am tempted to remove "architectures" and "categories" which are basically just shorthands for requirements. those seem to just add noise to the test definition format19:51
cr3and I don't even recall architectures ever being used19:51
fadercr3: So you'd just formulate them like e.g. "requires: category=server"?19:52
faderI can imagine cases where architecture would be useful, especially with e.g. LPIA devices19:52
cr3fader: something that correlates to the registry in a boolean expression19:52
cr3so that would use "==" rather than "=" :)19:52
faderHehe19:52
cr3dpkg.architecture == "i386"19:53
cr3there is a difference between the architecture of the system and the architecture of the packages installed on the system, dpkg.architecture is already provided in the registry to return the latter information19:54
cr3so for the rare times a test might care about the architecture, having to specify that boolean expression should not cause anyone carpal tunnel syndrome19:54
fadercr3: Cool, I like it.19:59
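[Editor's sketch: one way a "requires" boolean expression could be checked against registry data. The two expressions are the ones quoted in the log. The registry contents, the satisfies helper and the use of eval are assumptions for illustration; the log does not show how Checkbox evaluates these expressions, and eval should not be trusted with untrusted input.]

```python
from types import SimpleNamespace


def satisfies(expression, registry):
    """Evaluate a requires expression against registry objects."""
    # Expose only int() as a builtin; registry entries are looked up by name.
    allowed = {"__builtins__": {}, "int": int}
    return bool(eval(expression, allowed, dict(registry)))


# Made-up registry data standing in for what the real registries would report.
registry = {
    "dpkg": SimpleNamespace(architecture="i386"),
    "package": SimpleNamespace(name="firefox", version="3"),
}

print(satisfies('dpkg.architecture == "i386"', registry))
print(satisfies("package.name == 'firefox' and int(package.version) > 2", registry))
```

[Both expressions evaluate to True for this made-up registry. Anything not present in the registry, such as the existence of a file, simply cannot be named in the expression, which matches the limitation discussed above.]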
cr3fader: another question for you: do you think that timeout should be part of the test definition format or part of the command: timeout 10 some_command20:45
fadercr3: Hmm... I'm assuming it would be optional?20:46
cr3fader: it's optional, but does it belong as an attribute of the test definition or as the command itself20:49
fadercr3: Would Checkbox terminate the test if it runs over the timeout value?20:49
cr3fader: yes, but the timeout command could do the same20:50
fadercr3: It seems to make more sense to me at the test level.  If you have multiple commands that should each have a timeout, then you should either split them into separate tests or handle that inside your test itself IMO20:50
cr3the only potential reason to make it an attribute of the test definition is whether we care to formalize the difference between a failure because the test failed or because it timed out20:51
cr3fader: what do you mean by "handle that inside the test itself"?20:52
fadercr3: I mean that if the conditions of your test are complex enough that multiple different timeout values are required but you still can't split the test into multiple tests, you can't expect Checkbox to do it all for you :)20:53
fadercr3: Yeah, the difference between 'failed' and 'timed out' seems important to me20:53
cr3fader: when you say "the test", do you mean the command being called?20:54
fadercr3: Yes, the script that is being called20:54
cr3fader: timeout 10 the_script20:54
cr3fader: the "timeout" script can essentially do the same: first argument is the timeout in seconds, the rest is the command to run and its arguments20:55
fadercr3: Right, either way works for simple cases.  It's only cases where you might want to say "this test will run 3 commands.  Let the first one run for 10s, the second for 30s, the third for 10s" that I am saying it should be handled inside the test script20:56
faderWhich if the timeout is defined at the test definition level and is optional, everything is fine20:56
cr3fader: or that timeout command could actually relieve the script of that responsibility: timeout 10 first_one && timeout 30 second_one && timeout 10 third_one20:57
cr3but then, in that complex situation, you lose that granularity of 'failure' vs 'timed out'20:57
fadercr3: Good point.  But it doesn't seem to give the ability to track the difference between 'timed out' and 'failed'20:57
faderHeh20:57
faderIt seems cleaner to me to do it at the test definition level, but that's just an aesthetic distinction20:58
cr3referring to my test-result-codes blog post, I'm starting to agree with the importance of the difference between 'failure' and 'timed out'. I think the latter might fall under the code UNRESOLVED or INCOMPLETE20:59
cr3I'm really glad I took the time to enumerate all those darn test result codes, good reference for myself :)21:00
cr3ok, I'm convinced, timeout stays21:00
cr3fader: for your example of three tests, they should be expressed with dependencies between each other so that if one times out, the others aren't run21:01
fadercr3: Ooh, slick21:01
cr3and they should use the timeout feature of the test definition to distinguish 'failure' from 'timed out'21:01
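[Editor's sketch: a timeout kept at the test-definition level lets the runner tell "failed" from "timed out", and a dependency chain skips later tests once one does not pass. The status strings and the run_test and run_chain helpers are hypothetical; the log does not settle on result codes beyond mentioning UNRESOLVED or INCOMPLETE.]

```python
import subprocess
import sys


def run_test(command, timeout=None):
    """Run one command and report passed, failed or timed out."""
    try:
        completed = subprocess.run(command, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this.
        return "timed out"
    return "passed" if completed.returncode == 0 else "failed"


def run_chain(tests):
    """Run tests in order; once one does not pass, the rest are not run."""
    results = {}
    blocked = False
    for name, command, timeout in tests:
        if blocked:
            results[name] = "not run"
            continue
        results[name] = run_test(command, timeout)
        blocked = results[name] != "passed"
    return results


python = sys.executable
results = run_chain([
    ("first_one", [python, "-c", "pass"], 10),
    ("second_one", [python, "-c", "import time; time.sleep(5)"], 0.5),
    ("third_one", [python, "-c", "pass"], 10),
])
print(results)
```

[With "timeout 10 first_one && timeout 30 second_one" inside a single command, the runner would only see one exit status for the whole chain, which is the loss of granularity mentioned above.]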
faderMan, you've thought about this stuff :)21:01
cr3I did some things right, but there's plenty I did wrong too :)21:02
faderAnd your blog posts are really in-depth.  It's good for me to read them... maybe I can learn something.21:02
fader:)21:02
cr3I've learned a lot myself in the process :)21:03
cr3yesterday, I was googling for some test related problem and my blog actually came up as the third result :)21:03
faderHehe21:06
faderYou'll end up writing a book.  Just wait.21:07
cr3fader: I already have a few people lined up for a book about my little humiliating stories :)21:09
fadercr3: Not *quite* what I had in mind, but that works too.21:09
faderYou need to start including those on the blog as well!21:09
cr3fader: maybe I can combine both: Testing in underwear21:10
faderYou know, you can tag the entries so only the testing related ones get syndicated on planet.u.c... :)21:10
cr3fader: I've made it a point that my blog and micro-blog posts will be strictly testing related, I didn't want to fall into inanities like where I'm scratching myself right now21:11
cr3fader: besides, you already know where I'm scratching most of the time...21:12
fadercr3: You don't know how often I lay awake at night wondering what you had for breakfast or how often you went to the bathroom today.  Enquiring twitterers want to know!21:13
cr3I think that crosses the fine line between twitterers and just twits21:14
fadercr3: I know we've been over this but I've confused myself.  Checkbox executes tests as root whether run interactively (e.g. checkbox-gtk) or not (e.g. kicked off after boot from certify-web) right?21:23
fader(I ask as some of the security qa-regression-test tests refuse to run as root and explicitly call sudo in the scripts, which will have to change if everything is run as root)21:26
cr3fader: checkbox-(gtk|cli) runs tests as the user unless overridden by a specific user in the test definition. checkbox-(compatibility|certification)* runs everything as root21:26
cr3fader: calling sudo in scripts run as the normal user won't work either21:26
cr3fader: if the script prompts, we're screwed. I need to create a bug to disable all interactivity possibly assumed by scripts21:27
fadercr3: It seems like the best way to handle this is to remove any prompts in the script, which means we may end up maintaining our own version of some tests :(21:27
cr3fader: or maybe they weren't written well in the first place...21:27
fadercr3: Heh, I'll let you have that fight with kees :)21:28
cr3tests simply shoved in a directory are probably not written properly in the first place. if the same tests were written within any test suite like checkbox, subunit, or whatever, these problems would've been caught early on21:29
cr3tests in a directory are scripts, not tests21:29
cr3it's not really a fight that I'm looking for, it's just that the sru team was probably under pressure to just get something done which implies some eventual migration process21:30
cr3if we can help with this migration process, I'm sure it will be much appreciated21:30
fadercr3: Right, I'm trying to migrate some of those myself and wanted to make sure I was on the right track21:34
faderIt'll just be a question of if we can make the tests in checkbox the authoritative source for those tests21:35
cr3fader: it doesn't have to be though, any test suite could be integrated. however, checkbox is one of the rare ones which supports interactive testing in addition to automated testing, so that might be the deciding factor21:39
cr3the only problem is that the author is a real pain to deal with21:39
fadercr3: The author of checkbox you mean? :)21:40
cr3yeah, avoid him if possible21:40
faderHe's not so bad... you just have to rough him up a little.21:40
faderBut anyway, it seems like anything that refuses to run as root but explicitly calls sudo will need to be modified.  It's just a question of whether we can push those modifications upstream or if we have to maintain them21:41
faderAnd/or make our version the authoritative version and accept patches for other changes21:42
sbeattiefader: I don't have a problem merging them with upstream.21:43
sbeattiefader: tricky bit will be the tests for sudo itself.21:43
fadersbeattie: Yeah, there are some that are obviously ill-suited for this treatment.  I'm taking baby steps right now :)21:43
sbeattiefader: excellent. How can I help?21:43
sbeattiedo you have work-in-progress stuffed anywhere?21:44
faderI'm also starting right now by purposefully ignoring any destructive tests (those which overwrite config files, etc.)21:44
fadersbeattie: Just my laptop right now.  I don't have anything usable yet, just my prototype bits that suck and need thrown out.  I'll try to have something to point you at next week though.21:45
fader= was aiming for this week but that didn't happen :(21:46
faders/=/I/21:46
sbeattiefader: alright, but I'm keenly interested in getting this going, so I don't mind looking at junk that needs to be thrown out.21:46
sbeattie(I briefly started on it at one point, but operator error prevented me from getting my tests noticed by checkbox)21:48
cr3interactive tests as for sudo perhaps could be wrapped in an (py)expect script to be fully automated21:58
cr3or, they could remain a manual test by asking the user to perform a series of steps21:58
fadercr3: I'd rather keep them automated and just remove the requirement for sudo for the ones that will work when run as root22:00
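[Editor's sketch: one way a test wrapper could drop the sudo requirement when it is already running as root. The privileged helper is hypothetical. "sudo -n" is sudo's non-interactive flag, so a missing credential makes the call fail instead of prompting; whether that suits the security tests in question is an assumption.]

```python
import os


def privileged(command, euid=None):
    """Return the command unchanged when root, otherwise prefix it with sudo -n."""
    if euid is None:
        euid = os.geteuid()
    if euid == 0:
        return list(command)
    # -n: never prompt; fail instead if a password would be needed.
    return ["sudo", "-n"] + list(command)


print(privileged(["cat", "/etc/shadow"], euid=0))
print(privileged(["cat", "/etc/shadow"], euid=1000))
```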
fadersbeattie: Maybe we can come to an arrangement where you can beat on the test scripts and I can work on the suite definitions and make everything run from Checkbox :)22:01
fader(Or more likely annoy cr3 until he tells me what I'm doing wrong, but whatever.)22:01
=== plars_ is now known as plars
fadersbeattie: I cleaned up what I've been playing with a bit so you can at least see what I'm doing, but it will still make the baby cr3 cry so don't show it to him22:32
fadersbeattie: https://code.launchpad.net/~fader/checkbox-certification/security-tests22:32
faderThis is just the glibc test; I've also poked a bit at the gcc test but I'd rather hold onto it for a day or two22:32
fader(As it's in worse shape and doesn't run right at all)22:33
faderNB that you'll need build-essential installed to run the test.  That's not in a 'requires' field for the suite yet.  (Another reason not to tell cr3)22:34
sbeattiefader: cool, thanks.22:54
fadersbeattie: I'll be interested to hear if it works for you.  :)22:55
faderYou can look at ~/.checkbox/submission.xml when it prompts you to enter a secure ID, which you probably don't have22:56
faderA quick and dirty way to see if the glibc script ran is to look for /tmp/glibc-security, which I haven't bothered to clean up at the end of the script yet :(22:56
faderOkay, time to find food and be interactive in meatspace for a bit... sniff you jerks later :)23:00
