https://quorumlanguage.com/evidence.html

skip to main content

  * HomeQuorum
  * Learn
  * Hour of Code
  * Reference
  * Libraries
  * Download
  * Bugs
  * My Projects
  * Login

[                    ][Search]
Quorum Login
[                    ]
[                    ]
LoginCancel
Forgot password?Sign up for an account
Programming Languages and Learning

A quick primer on human-factors evidence in programming language
design

The goal of this page

This page is designed to provide an overview of recent evidence on
human factors evidence in programming language design. In some cases,
our intent is to dispel myths. In others, it is to provide the result
of research lines. The community of scholars advocating for and
participating in evidence-based programming language design is
growing. This page is not comprehensive, but covers several of the
broad trends and highlights a number of empirical studies that have
been performed.

Computer science for all

In the United States, one broad trend is toward computer science
education for all. As such, students in K-12 or beyond are learning
to program. Despite this trend, the programming language community is
highly fractured, with thousands of products used for a variety of
purposes. Teachers and students participating in CS for all may not
have the training to evaluate the feature sets of various language
products or their impact on people (e.g., students, professionals,
those with disabilities). Modern computer science should
significantly re-examine issues involved in the programming language
wars. Programming languages make up the foundation of software
technology. As we push toward computer science for all, the design of
our languages should be based increasingly on evidence.

The Language Wars are Insufficiently Studied

A systematic search and selection process of peer-reviewed papers
found only 156 publications (reporting 137 empirical primary studies
and 19 secondary studies of empirical primary studies) somehow
comparing language design decisions, broadly construed, with respect
to human-factors efficacy measures. When limited to studies comparing
features, and excluding certain research designs that arguably aren't
empirical, the number goes down to 65 primary studies. When further
limited to controlled experiments, 35 primary studies remain; further
restrict to randomized trials, and we have only 22 primary studies
from the 1950s up until 2012 [2].

This graph represents the number of randomized controlled trials in
programming language design over time that were identified in
KaiJanaho's meta-analysis.

Static vs. Dynamic Typing

Experiments have shown a positive impact of statically typed
languages

A number of experiments have been performed that show a positive
impact of static type systems over dynamic type systems. This benefit
was found in situations where developers had to use an API that was
new to them. In two studies, the positive type system effect was
compared with the documentation effect (also, the effect achieved by
code completion). In both experiments, the positive impact of type
systems was significant and much larger than the documentation or
code completion effect. Static typing also showed a positive impact
on debugging time for type errors comparing Java and Groovy (where
Groovy was used as a dynamically typed Java). The results showed that
the Java group was significantly faster fixing the type error [3, 4].

This figure shows data from an experiment on static vs. dynamic
typing, with or without documentation. Overall, the figure shows that
static typing led to faster programming times compared to dynamic.
Documentation had a smaller impact than the type system.This figure
shows the amount of time it took in an experiment on static vs.
dynamic typing to debug problems related to the type system. Static
typing, again, appears to have a positive impact.

Software Transactional Memory (STM)

STM can reduce programming time and prevent synchronization bugs

In a study by Pankratius and Adl-Tabatabai, student teams developed a
desktop search engine with either locks or STM. The STM teams were
among the first to have an operational implementation and spent less
than half the time debugging segmentation faults, but had more
problems tuning performance and implementing queries. Code
inspections were suggestive that STM code was easier to understand
than locks code, because the locks teams used many locks (up to
thousands) to improve performance [5].

Notation

The notation used in programming languages has a large impact on
novices

In a study of six programming languages using novices, one randomized
controlled trial found that accuracy rates for certain C-style
languages (Perl, Java) were not significantly higher than a language
with randomly generated keywords and symbols, while languages that
deviated from this style did (Quorum, Ruby, Python). Statistical
procedures called Token Accuracy Mapping now exist that can predict
which tokens contribute, positively or negatively, to the overall
effect [6].

This figure shows a token accuracy map from a study on syntax. Token
accuracy maps can predict which tokens users did, and did not, use
correctly when completing a programming task. In this figure, 7
tokens were identified in the language that novices were unable to
use correctly.

Inheritance

Inheritance depth is not a significant driver of software maintenance
effort

Code maintenance effort is not strongly correlated with inheritance
depth. Instead, factors such as the number of methods to change and
the experience of the maintainer appear more related. In one
controlled experiment, Prechelt et al. studied the impacts on
maintenance with five levels of inheritance depth compared to
"flattened" versions of the same program with three and zero levels.
The effort level was most highly correlated with the number of
methods that needed to be understood for the change, followed by the
maintainer's experience and then inheritance depth, which made little
difference. Reducing or eliminating this depth does not seem to have
much effect on the effort required for understanding software [7].

Early Learning and Visualization/Blocks

Curriculum for elementary education and early learning

When analyzing a curriculum, look at what earlier skills lead to
success in the large and teach those in isolation before introducing
more complex programming tasks. While most first grade students
cannot program 'real' programs, students can still learn to think in
ways that lead to strong computational thinking skills. They already
do activities that have some relationship to computing -
color-by-number pictures, following algorithms, classifying objects
by type. The key is to identify how to frame those activities to make
connections with computational thinking, as well as identify what
bridge activities could come between existing activities and more
mature activities [9].

Block Languages may impact learners differently than text

Visual Block-based languages, though lacking some of the
functionality of text-based languages, have been shown to lead to
knowledge transfer to text-based languages under some conditions.
Initialization of state, however, is one concept that is challenging
for students within block-based languages and does not match
text-based languages. [1, 8, 9].

Compiler Error Design

Compiler error messages are an important tool that a language offers
its programmers, and for novices their feedback is particularly
critical. It is not possible to continue development until a program
compiles, and these messages are often the primary source of feedback
on the errors students make. Yet, they pose several challenges.
Novices often struggle to interpret them and they have been shown to
cause considerable frustration. There is a wide variability between
programming languages, and for some there is not a one-to-one mapping
between the error in code and the resulting error message. There is
also a significant disparity between the errors that educators think
students make and those that students actually commit [10]. It has
been shown that there is significant complexity involved in reading
compiler error messages which demands a large amount of time and has
been linked to performance. Importantly these challenges are not
limited to students, as compiler and build errors have also been
shown to cause considerable issues for professional developers [11].

The barriers that compiler error messages present is in part a design
issue. They are normally worded to indicate the technicalities of how
an error violates a language specification, not what is wrong from
the programmer's point of view. Worse, some (cascading) compiler
error messages result from compiler confusion and can be misleading,
often indicating the presence of spurious errors. Although some
recent efforts to enhance compiler errors to make them more useful
have met mixed results, this is a nascent research area and promising
progress has been made [12].

Contributors

  * Andreas Stefik: University of Nevada, Las Vegas
  * Patrick Daleiden: University of Nevada, Las Vegas
  * Diana Franklin: University of Chicago
  * Stefan Hanenberg: University of Duisburg-Essen
  * Antti-Juhani Kaijanaho: University of Jyvaskyla
  * Walter Tichy: Karlsruhe Institute of Technology
  * Brett Becker: University College Dublin

References

 1. David Weintrop and Uri Wilensky. 2015. Using Commutative
    Assessments to Compare Conceptual Understanding in Blocks-based
    and Text-based Programs. In Proceedings of the eleventh annual
    International Conference on International Computing Education
    Research (ICER '15). ACM, New York, NY, USA, 101-110.
 2. Kaijanaho, A.-J., Evidence-Based Programming Language Design: A
    Philosophical and Methodological Exploration. PhD Diss.,
    Information Technology Faculty, University of Jyvaskyla, 2015.
 3. Lars Fischer and Stefan Hanenberg. 2015. An empirical
    investigation of the effects of type systems and code completion
    on API usability using TypeScript and JavaScript in MS visual
    studio. In Proceedings of the 11th Symposium on Dynamic Languages
    (DLS 2015). ACM, New York, NY, USA, 154-167.
 4. Stefan Endrikat, Stefan Hanenberg, Romain Robbes, and Andreas
    Stefik. 2014. How do API documentation and static typing affect
    API usability? In Proceedings of the 36th International
    Conference on Software Engineering (ICSE 2014). ACM, New York,
    NY, USA, 632-642.
 5. Victor Pankratius and Ali-Reza Adl-Tabatabai. 2014. Software
    Engineering with Transactional Memory Versus Locks in Practice.
    Theor. Comp. Sys. 55, 3 (October 2014), 555-590.
 6. Andreas Stefik and Susanna Siebert. 2013. An Empirical
    Investigation into Programming Language Syntax. ACM Transactions
    on Computing Education 13, 4, Article 19 (November 2013), 40
    pages.
 7. Lutz Prechelt, Barbara Unger, Michael Philippsen, and Walter
    Tichy. 2003. A controlled experiment on inheritance depth as a
    cost factor for code maintenance. J. Syst. Softw. 65, 2 (February
    2003), 115-126.
 8. Christopher D. Hundhausen, Sean F. Farley, and Jonathan L. Brown.
    2009. Can direct manipulation lower the barriers to computer
    programming and promote transfer of training?: An experimental
    study. ACM Trans. Comput.-Hum. Interact. 16, 3, Article 13
    (September 2009), 40 pages.
 9. Franklin, D., Hill, C., Dwyer, H., Hansen, A., Iveland, A., and
    Harlow, D. Initialization in Scratch: Seeking Knowledge Transfer,
    SIGCSE, 2016.
10. Neil C. C. Brown and Amjad Altadmri. 2017. Novice Java
    programming mistakes: large-scale data vs. educator beliefs.
    Trans. Comput. Educ. 17, 2, Article 7 (May 2017), 21 pages. DOI:
    https://doi.org/10.1145/2994154
11. Seo, Hyunmin, Caitlin Sadowski, Sebastian Elbaum, Edward
    Aftandilian, and Robert Bowdidge. 2014. Programmers' build
    errors: a case study (at Google). In Proceedings of the 36th
    International Conference on Software Engineering (ICSE 2014).
    ACM, New York, NY, USA, 724-734. DOI: https://doi.org/10.1145/
    2568225.2568255
12. Becker, Brett A., Glanville, Graham, Iwashima, Ricardo,
    McDonnell, Claire, Goslin, Kyle and Mooney, Catherine. 2016.
    Effective compiler error message enhancement for novice
    programming students, Computer Science Education 26, 2-3,
    148-175. DOI: http://dx.doi.org/10.1080/08993408.2016.1225464

The Quorum Programming LanguageCopyright (c) 2014-2021. Funding for
Quorum comes largely from the National Science Foundation (NSF #
0940521, #1440878, #1644491, #1640131, #1738259 and #1738252, #
2023739, #2048356, #2106392, #2121993). Other contributions can be
found on the contributors page.