Subj : Re: Auto-grading aptitude tests (was: Software Job Market Myths)
To   : comp.programming,comp.software-eng
From : rem642b
Date : Tue Aug 23 2005 04:20 pm

> From: Richard Heathfield
> I wouldn't bother with anything that required the candidate to
> provide more than one token, such as short-answer questions or essay
> questions, except perhaps as a tie-breaker, primarily because I
> wouldn't trust a computer to get the marking right;

Well, if that's all you want, I already have the technology you need (and more). Of course, when you compose the questions, you have to be really sure there isn't another synonym you didn't think of which is just as good an answer as the ones you did think of.

  J2SE, J2EE and J2ME are all versions of the **** programming language.

(Only one possible correct answer there.)

  Something like a String but varying in length and efficient for building
  up a textual sequence piece by piece in consecutive sequence is a
  ************.

(Again, only one possible correct answer there.)

  In memory management on the Macintosh, a pointer to an allocated but
  unmovable block which contains a pointer to a movable block is called
  a ******.

(Within the jargon, again only one possible correct answer.)

  strcmp returns a positive number if the second argument is ******* than
  the first argument.

(Problem: more than one possible answer, and having to count the letters of the word to figure out which synonym is the desired one is unreasonable if you are testing for understanding of the function, while trying to accept all possible synonyms might be prone to mistakes.)

Suppose the computer-run test were truly interactive: the applicant could fill in something, then the program would check it, discard the parts that don't fit the desired pattern, and give one free letter as a clue to what's missing; then the applicant would get another chance, and so on, as many clues as needed until the applicant gets the correct answer. Suppose the preliminary score is determined by subtracting two letters for each one letter given for free; for example, a 16-letter answer requiring 3 free letters would score only 10/16, scaled to whatever the actual total value of that question was. All the preliminary guesses are kept in the data file to be used as tiebreakers if the preliminary score is close to the threshold for passing the test.

If the applicant truly understands the technology but starts off with the wrong wording of the answer, he/she should be able to figure out from a couple of free letters what the expected wording was, and adjust to circumstances by finishing it. This would handle synonyms gracefully, especially in long answers where the troublesome synonym is only a small fraction of the whole answer, so even if the applicant's favorite synonym isn't included in the correct answer set, the deduction for that wouldn't be a major problem.
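
To make the scoring arithmetic concrete, here is a rough sketch in Java of how one interactive question could be graded. It's a simplification, not the actual test software: the class and method names are made up, and instead of pattern-matching partial answers it just checks the whole answer and reveals one free letter per wrong guess, deducting two points per free letter as described above.

  // Sketch only: interactive fill-in-the-blank grading with free-letter clues.
  // Each wrong guess reveals one more letter of the expected answer; the
  // preliminary score deducts two points per free letter, scaled to the
  // question's point value.  Names here are illustrative, not real software.
  import java.util.Scanner;

  public class ClueGrader {

      /** Run one fill-in-the-blank question interactively on stdin/stdout. */
      static double gradeQuestion(String prompt, String answer,
                                  double pointValue, Scanner in) {
          int freeLetters = 0;
          StringBuilder clue = new StringBuilder();       // letters revealed so far
          while (true) {
              System.out.println(prompt.replace("****", clue + "..."));
              String guess = in.nextLine().trim();
              if (guess.equalsIgnoreCase(answer)) break;  // correct: stop giving clues
              if (freeLetters >= answer.length()) break;  // whole answer already revealed
              clue.append(answer.charAt(freeLetters));    // give one free letter
              freeLetters++;
          }
          // e.g. a 16-letter answer needing 3 free letters scores 10/16 of the points
          double raw = Math.max(0, answer.length() - 2.0 * freeLetters);
          return pointValue * raw / answer.length();
      }

      public static void main(String[] args) {
          Scanner in = new Scanner(System.in);
          double score = gradeQuestion(
              "Something efficient for building text piece by piece is a ****",
              "StringBuffer", 10.0, in);
          System.out.printf("Preliminary score: %.2f%n", score);
      }
  }

The same loop could also append every guess to the data file, so the tiebreaking on near-threshold scores described above has the full guess history to work from.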