[HN Gopher] Executable Examples for Programming Problem Comprehe...
___________________________________________________________________
Executable Examples for Programming Problem Comprehension [pdf]
Author : luu
Score : 31 points
Date : 2022-05-13 20:07 UTC (1 day ago)
(HTM) web link (cs.brown.edu)
(TXT) w3m dump (cs.brown.edu)
| jswrenn wrote:
| Oh, whoa, I'm the author of this! Happy to answer any questions.
| acbart wrote:
| I saw this talk at ICER, and I really loved how it led to the
| idea of evaluating tests against "wheats" (correct programs,
| which the tests should accept) and "chaffs" (buggy programs,
| which the tests should reject). They describe this in terms of
| Thoroughness and Validity.
|
| > A suite is valid if it accepts (i.e., its assertions pass) all
| correct implementations... In order for a suite to be valid for
| all implementations of median, it must not include any assertions
| involving empty input lists. We can accurately identify such
| assertions as invalid by checking them against two correct
| implementations (henceforth wheats [24])... If a student asserts
| that implementations should produce an error on empty inputs,
| their suite will reject the wheat that produces 0 (and vice
| versa). Provided that the set of wheats completely exercises the
| space of underspecified behaviors permitted by the specification,
| accepting all wheats guarantees that a suite is valid and will
| accept all correct implementations.
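The validity check quoted above could be sketched roughly as follows. This is only an illustration, not the paper's tooling: the names `wheat_zero`, `wheat_error`, `passes`, and `is_valid` are hypothetical, and a suite is assumed to be a list of functions that each take an implementation and return True when the assertion holds.

```python
from statistics import median as _median


def wheat_zero(xs):
    """Correct median; returns 0 on empty input (one permitted behavior)."""
    return _median(xs) if xs else 0


def wheat_error(xs):
    """Correct median; raises on empty input (another permitted behavior)."""
    if not xs:
        raise ValueError("median of empty list")
    return _median(xs)


# Hypothetical wheat set covering both behaviors on empty input.
WHEATS = [wheat_zero, wheat_error]


def passes(assertion, impl):
    """True if the assertion holds for impl; an exception counts as failure."""
    try:
        return bool(assertion(impl))
    except Exception:
        return False


def is_valid(suite, wheats=WHEATS):
    """A suite is valid if every wheat passes every assertion in it."""
    return all(passes(a, w) for w in wheats for a in suite)


good_suite = [
    lambda f: f([1, 3, 2]) == 2,
    lambda f: f([1, 2, 3, 4]) == 2.5,
]
# Adds an assertion about empty input, which the spec leaves open.
overspecified_suite = good_suite + [lambda f: f([]) == 0]

print(is_valid(good_suite))           # True
print(is_valid(overspecified_suite))  # False: wheat_error rejects it
```

Under these assumptions, any assertion about empty input is rejected by one of the two wheats, which is how the invalid assertion gets flagged.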
|
| > A suite is thorough if it rejects (i.e., its assertions do not
| pass) buggy implementations. We assess the thoroughness of a
| suite by running it against a curated set of buggy
| implementations (henceforth chaffs [24]). The thoroughness of a
| suite is measured as the proportion of chaffs it rejects. To
| assess test suites, the set of chaffs should include subtly buggy
| implementations. To assess examples, we take a different
| perspective: the set of chaffs should exercise logical
| misunderstandings that students are likely to make. For instance,
| to assess the thoroughness of examples for median, the set of
| chaffs could include implementations of mean and mode.
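As a rough sketch of the thoroughness measure quoted above, the snippet below scores a set of examples by the proportion of chaffs it rejects. The chaffs are the mean and mode implementations the quote suggests; the names `chaff_mean`, `chaff_mode`, `rejects`, and `thoroughness`, and the choice to represent examples as (input, expected output) pairs, are assumptions made for illustration.

```python
from collections import Counter


def chaff_mean(xs):
    """Chaff: computes the mean instead of the median."""
    return sum(xs) / len(xs)


def chaff_mode(xs):
    """Chaff: computes the most common element instead of the median."""
    return Counter(xs).most_common(1)[0][0]


# Hypothetical chaff set modeling likely misunderstandings of "median".
CHAFFS = [chaff_mean, chaff_mode]


def rejects(examples, impl):
    """True if impl gets at least one example wrong (or raises on it)."""
    for args, expected in examples:
        try:
            if impl(args) != expected:
                return True
        except Exception:
            return True
    return False


def thoroughness(examples, chaffs=CHAFFS):
    """Proportion of chaffs rejected by the examples."""
    return sum(rejects(examples, c) for c in chaffs) / len(chaffs)


# Median and mode coincide here, so only the mean chaff is caught.
weak_examples = [([1, 1, 1, 5], 1)]
# Median differs from both mean and mode in each of these inputs.
strong_examples = [([1, 2, 6], 2), ([1, 1, 2, 3], 1.5)]

print(thoroughness(weak_examples))    # 0.5
print(thoroughness(strong_examples))  # 1.0
```

The weak example is a correct statement about median, yet it cannot tell median apart from mode, which is the kind of gap the chaffs are meant to expose.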
|
| I want to see this used in more curricula and tools. I need to
| see if there's been any follow-up on this research and learn how
| it's gone.
| np_tedious wrote:
| Seems roughly analogous to consistency and completeness in a
| logic system
___________________________________________________________________
(page generated 2022-05-14 23:01 UTC)