Can you be sure to clear a line at Tetris?
2022-04-19 21:36
Summary: it is possible to play Tetris and guarantee that you will
score at least one line, no matter which pieces are given to you,
i.e., even assuming they are chosen adversarially.
There are many things that can be found online about the game Tetris:
high scores (see for instance this great documentary), implementation
details (did you know there was a specification for Tetris games?),
intellectual property surprises (this post is not related to or
endorsed by The Tetris Company)... But for many years I have not been
able to find on the Internet the answer to this question: can you be
sure to clear a line at Tetris?
Of course, in practice, most Tetris players seem to be able to clear
lines at least some of the time^[citation needed]. But maybe they are
just being lucky! Maybe an evil computer or extreme bad luck could
prevent you from ever clearing a line, no matter how you played. For
instance, consider the game Hatetris, which is programmed to give you
unpleasant pieces. Clearing a line in Hatetris is much more
challenging, though good players can do it. Could a worse version of
Hatetris, with a more clever opponent, give you pieces that will
always make you lose without scoring a single line?
Mathematically, we can formulate this as a sequential zero-sum game
with perfect information. We have a standard Tetris board of 10
columns by 20 rows. The computer and the human play alternately: the
computer gives a piece to the human (one of the seven tetrominoes),
and the human responds by positioning it somewhere according to the
usual Tetris rules. If the screen fills up then the human has lost,
and if the human completes a line then the human has won. There are
then only two possibilities:
* either the computer has a strategy to give pieces (depending on
how the human has played so far) that guarantees that the human
will lose without completing a single line, no matter what they
do;
* or the human has a strategy to place pieces that guarantees that
they will score a line before losing, no matter which pieces they
receive from the computer.
My question was to figure out which one of the two is true. After
some coding and much computation, it turns out that the human wins:
it is possible to guarantee a non-zero score when playing Tetris. If
you want to see an example strategy, you can test it with the Tetris
board below. You play the computer, i.e., you select which piece to
give, and the chosen piece will be placed according to a winning
strategy. Try preventing your opponent from scoring a line! (the
point of this post is that this is impossible)
Your browser does not support iframes, please go here.
An alternative challenge: try preventing the computer from scoring a
line in the 5 bottom rows! This is possible for precisely 759 piece
sequences out of the 427513 winning sequences. Can you find them?
In the rest of the post, I explain some details about the exact
problem statement, and give more details about the strategy and how
to check it. Then, I explain how I computed the strategy, possible
improvements to the computation, and then present related work and
remaining open problems.
Technical details
Here are some details about the problem and the strategy:
* I restrict the human player to only rotate and then drop pieces
vertically, i.e., playing a piece means choosing a rotation and a
column where to drop it. (You cannot slide a block under another
block, no t-spins, etc.) Of course, if the human can win under
this restriction, they can win even when we allow more moves.
* I do not require the computer to show the next piece, i.e.,
commit to what the next piece will be before the human has
dropped their current piece. Of course, if the human can win in
such conditions, they can also win with a next piece display.
* Under such circumstances, the human can play so as to guarantee
that they can complete one of the lowest 6 rows of the board. The
exploration also shows that the computer can prevent the human
from scoring one of the bottom 5 lines, subject to the above
restrictions, and assuming that the human does not play "above"
the fifth line, i.e., dropped pieces cannot touch a full column
which has height at least 5. (See below to know more about this
limitation.^1)
* In the worst case, the human needs to position 13 pieces to
complete their first line among the bottom 6 rows. This is optimal
under the above restrictions: when only considering completion of
the first 6 rows, dropping pieces without sliding them, and
prohibiting dropped pieces from touching a full column. I think
it is very likely that the human can win faster if we allow the
player to slide pieces; and possible that the human can win
faster by going above the 6th row.
* Where several moves achieve the same distance to a win, they are
chosen arbitrarily, following the order in which the possible
choices of where to put a piece are considered (first by
rotation, then by column). This is why the strategy above is not
symmetric, e.g., between l-shaped and j-shaped tetrominoes.
Checking the strategy
I provide the strategy as a file describing how the human can play
and guarantee a win. Each line of this file corresponds to a possible
"game state", numbered from 0 to 5249 (the file has 5250 lines^2).
Each line consists of 21 integers, at positions numbered from 0 to 20
as 3p+q with 0<=p<7 and 0<=q<3. The triple at positions 3p, 3p+1,
3p+2 explains what to do when receiving piece p: the integer at
position 3p gives the rotation 0<=r<4 to use, the one at position
3p+1 gives the column -2<=c<10 where to drop the piece, and the one
at position 3p+2 gives the game state from which we should follow the
rest of the winning strategy (the special value "-1" indicating that
the player has won). The pieces, rotations, and column offsets are
according to the following description of the pieces:
== piece 0 rotation 0 ==
#...
#...
#...
#...
== piece 0 rotation 1 ==
....
....
....
####
== piece 0 rotation 2 ==
...#
...#
...#
...#
== piece 0 rotation 3 ==
####
....
....
....
== piece 1 rotation 0 ==
.#..
##..
#...
....
== piece 1 rotation 1 ==
....
....
##..
.##.
== piece 1 rotation 2 ==
....
...#
..##
..#.
== piece 1 rotation 3 ==
.##.
..##
....
....
== piece 2 rotation 0 ==
.##.
##..
....
....
== piece 2 rotation 1 ==
....
#...
##..
.#..
== piece 2 rotation 2 ==
....
....
..##
.##.
== piece 2 rotation 3 ==
..#.
..##
...#
....
== piece 3 rotation 0 ==
###.
.#..
....
....
== piece 3 rotation 1 ==
....
#...
##..
#...
== piece 3 rotation 2 ==
....
....
..#.
.###
== piece 3 rotation 3 ==
...#
..##
...#
....
== piece 4 rotation 0 ==
###.
#...
....
....
== piece 4 rotation 1 ==
....
#...
#...
##..
== piece 4 rotation 2 ==
....
....
...#
.###
== piece 4 rotation 3 ==
..##
...#
...#
....
== piece 5 rotation 0 ==
###.
..#.
....
....
== piece 5 rotation 1 ==
....
##..
#...
#...
== piece 5 rotation 2 ==
....
....
.#..
.###
== piece 5 rotation 3 ==
...#
...#
..##
....
== piece 6 rotation 0 ==
##..
##..
....
....
== piece 6 rotation 1 ==
....
....
##..
##..
== piece 6 rotation 2 ==
....
....
..##
..##
== piece 6 rotation 3 ==
..##
..##
....
....
The strategy thus describes a DAG (with labeled edges). In
particular, the same game state can be reached by different paths,
and the same game state can even correspond to different board
configurations from which the same winning strategy can be followed.
The strategy in this file is the one used in the Javascript game
above. I have written a program (with helper file) that checks the
strategy by systematically going over all possible moves by the
computer, playing according to the strategy, and checking that the
human indeed wins when the strategy says they do. Hence, the
mathematical "proof" that the human has a winning strategy would
consist of this file and of the verification program.
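To make the format concrete, here is a small C++ sketch (not the
author's checker) which reads the strategy file and follows it for a
sequence of pieces given on standard input. The file name
"strategy.txt" is a placeholder, and the sketch assumes that the
initial game state is the one on the first line (state 0):
#include <array>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main() {
    // Each line of the strategy file holds 21 integers: for piece p, the
    // entries at positions 3p, 3p+1, 3p+2 give the rotation, the drop
    // column, and the next game state (-1 meaning a line is completed).
    std::ifstream in("strategy.txt");  // placeholder name for the strategy file
    std::vector<std::array<int, 21>> strat;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        std::array<int, 21> row;
        for (int& x : row) ss >> x;
        strat.push_back(row);
    }
    // Read the pieces (numbered 0 to 6 as above) chosen by the "computer"
    // and print the move played by the winning strategy, starting from the
    // game state described on the first line of the file.
    int state = 0, piece;
    while (state != -1 && std::cin >> piece) {
        std::cout << "piece " << piece
                  << ": rotation " << strat[state][3 * piece]
                  << ", column " << strat[state][3 * piece + 1] << "\n";
        state = strat[state][3 * piece + 2];
    }
    if (state == -1) std::cout << "a line has been completed\n";
}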
Finding the strategy
I could stop here and say that the strategy and the verification
program are an answer to the question I had posed :) but let me say a
few words about how I found it. This is performed using the minimax
algorithm, i.e., systematically exploring the game tree. Formally, we
define inductively who wins on a given board state, among the human
(H) and computer (C):
* Base 1: H wins on board states where one of the bottom 6 rows is
filled
* Base 2: C wins on board states where we cannot place a piece
without touching a full column, i.e., a column whose blocks already
stack up to the 6th row.
* Induction 1: In a given board state b where H is given some piece
p, if H can drop it and get to a board state which is winning for
H, then (b,p) is winning for H; if all moves by H get to a board
state which is losing for H, then (b,p) is losing for H.
* Induction 2: In a given board state b, if C can give H a piece p
such that (b,p) is losing for H, then b is losing for H; if every
piece p that C can give is such that (b,p) is winning for H, then b
is winning for H.
This is just a simple recursive exploration. The naive way to
implement this would be to compute who wins on every board state,
remembering in the board state if each cell of the bottom 6 rows is
filled or not: this would be 2^(6x10) possible board states, where 10
is the number of columns. This amounts to 1153 peta states, which is
too much.
Since we are only allowing pieces to be dropped (without sliding
them), we can do better: in each column, we only need to remember the
height of the topmost filled cell^3. The status of the cells
below the topmost filled cell of a column has no influence on which
moves are possible. The only extra thing to remember is the set of
rows having a "hole", i.e., an empty cell above which there is a
filled cell. As we cannot slide pieces, these holes can never be
filled, so we must remember that the rows that have a hole cannot be
completed. We say that these lines are "sacrificed".
As an example, consider the state of the board after dropping a
vertical I-tetromino in the second column, pictured below. We
represent this as the sequence of heights written at the bottom.
[config0]
Imagine now that we drop a vertically oriented z-block, getting to
the configuration below:
[config1]
Columns 2 and 3 are now full, i.e., the blocks stack up to the 6th
row. Hence, we store their height as 6, and forget what happens above
the 6th row (i.e., we forget the block in column 3 with a black
cross) -- because of this, we will no longer allow blocks to touch
one of these full columns. Further, the new block has now created
holes -- the gray cells with red crosses. The sequence of heights
does not account for these holes; to remember that they are there, we
store that the bottom 5 rows have been "sacrificed", as shown by
the red crosses to the left of the board. This is sufficient, because
our way to drop pieces will never allow H to slide a piece and fill
these holes. Now, the 6th row is the only row where H can ever hope
to complete a line. (In fact, the configuration is now clearly losing
for H: if C no longer gives H any I-tetromino, H can never fill the
cell at column 1 and row 6, so H can never complete the only
non-sacrificed line.)
To summarize, the board state consists of the height of each column
(between 0 and 6 inclusive) and the set of sacrificed lines, for a
total of 7^10 x 2^6 board states. This is much better: there are now
18 billion states. There are other small savings, e.g., board states
where all lines are sacrificed are immediately losing. In practice, a
complete exploration considers around 2 billion states, i.e., only
around 11% of the possible board states are reached.
For efficient implementation, a board state is represented as a
64-bit integer consisting of a 16-bit mask describing the sacrificed
lines (only the 6 lowest bits are useful), and 10 4-bit integers
describing the heights (only the 3 lowest bits are useful).
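As an illustration, here is a minimal C++ sketch of such a packing;
the exact bit layout below (mask in the upper 16 bits, one 4-bit
field per column in the lower bits) is an assumption and need not
match the author's program:
#include <array>
#include <cstdint>
#include <cstdio>

// Pack ten column heights (0..6) and the 6-bit mask of sacrificed rows into
// one 64-bit key: mask in the upper 16 bits, one 4-bit field per column.
uint64_t pack(const std::array<int, 10>& heights, int sacrificed) {
    uint64_t key = uint64_t(sacrificed) << 48;
    for (int c = 0; c < 10; c++) key |= uint64_t(heights[c]) << (4 * c);
    return key;
}

// Inverse operation, recovering the board state from its key.
void unpack(uint64_t key, std::array<int, 10>& heights, int& sacrificed) {
    sacrificed = int(key >> 48);
    for (int c = 0; c < 10; c++) heights[c] = int((key >> (4 * c)) & 0xF);
}

int main() {
    // An arbitrary example: two full columns, five bottom rows sacrificed.
    std::array<int, 10> heights = {0, 4, 6, 6, 0, 0, 0, 0, 0, 0};
    int sacrificed = 0b011111;
    std::printf("key = %#018llx\n", (unsigned long long)pack(heights, sacrificed));
}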
As the same board state can be reached by many different paths in the
game tree, we use memoization: when we are done exploring the tree
from some board state, we remember whether this state is losing or
winning, and if it is winning we remember in how many moves. This is
done with a hash table (a C++ unordered set). This means that the
program requires a large quantity of RAM to run (around 20-30 GB):
for this, I thank the INFRES department at Telecom Paris for giving
us access to suitable computing resources.
This explains how the winning strategy is found. It is fast (one
hour) to check that a winning strategy exists by terminating the
search for a board state and piece as soon as a winning move is
found. It is longer (18 hours) to find a strategy that wins as fast
as possible, because this requires us to explore the full tree. To
speed things up, we use a form of alpha-beta pruning: when trying to
place a piece, once we have found a winning option, we still consider
the other options, but only explore them up to a depth that would
give a strictly shorter strategy than the best option known so far.
This brings the running time down to 10 hours and reduces the number
of explored states from 2 billion to around 750 million, of which 9
million are winning. The resulting program is here. (Sorry, it is not
very clean...)
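To make the exploration more concrete, here is a self-contained C++
sketch in the spirit of what is described above (it is not the
author's program): it only checks that the human has a winning
strategy, stopping at the first winning move, and it does not record
or minimize the strategy. The piece profiles, their ordering, and the
exact packing of states into 64-bit keys are my own conventions, and,
like the real program, running it to completion would require a lot
of time and memory:
#include <algorithm>
#include <array>
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

constexpr int W = 10;  // number of columns
constexpr int H = 6;   // number of rows that matter

// One column of a tetromino rotation: offset of its lowest cell above the
// lowest cell of the whole piece, and number of cells in that column.
struct Col { int bottom, height; };
using Rotation = std::vector<Col>;  // columns of the piece, from left to right

// The seven tetrominoes, each given by its distinct rotations as column
// profiles (the ordering and orientations here are mine).
const std::vector<std::vector<Rotation>> PIECES = {
    {{{0,4}}, {{0,1},{0,1},{0,1},{0,1}}},                                     // I
    {{{0,2},{0,2}}},                                                          // O
    {{{0,1},{0,2},{1,1}}, {{1,2},{0,2}}},                                     // S
    {{{1,1},{0,2},{0,1}}, {{0,2},{1,2}}},                                     // Z
    {{{0,1},{0,1},{0,2}}, {{0,3},{0,1}}, {{0,2},{1,1},{1,1}}, {{2,1},{0,3}}}, // L
    {{{0,2},{0,1},{0,1}}, {{0,3},{2,1}}, {{1,1},{1,1},{0,2}}, {{0,1},{0,3}}}, // J
    {{{0,1},{0,2},{0,1}}, {{0,3},{1,1}}, {{1,1},{0,2},{1,1}}, {{1,1},{0,3}}}, // T
};

// A board state: height of each column (0..6, where 6 means "full") and the
// bitmask of sacrificed rows (rows with a hole; bit 0 is the bottom row).
struct State { std::array<int, W> heights{}; int sacrificed = 0; };

// Pack a state into a 64-bit key for memoization (as described in the text).
uint64_t pack(const State& s) {
    uint64_t key = uint64_t(s.sacrificed) << 48;
    for (int c = 0; c < W; c++) key |= uint64_t(s.heights[c]) << (4 * c);
    return key;
}

std::unordered_map<uint64_t, bool> memo;  // packed state -> winning for the human?

bool human_wins(const State& s);

// Given piece p in board state s, can the human drop it somewhere and win?
bool wins_with(const State& s, int p) {
    for (const Rotation& rot : PIECES[p]) {
        int w = (int)rot.size();
        for (int c = 0; c + w <= W; c++) {  // leftmost column of the piece
            bool legal = true;
            int base = 0;  // height at which the lowest cell of the piece lands
            for (int i = 0; i < w; i++) {
                if (s.heights[c + i] >= H) legal = false;  // cannot touch a full column
                base = std::max(base, s.heights[c + i] - rot[i].bottom);
            }
            if (!legal) continue;
            State t = s;
            for (int i = 0; i < w; i++) {
                int bottom = base + rot[i].bottom;  // lowest piece cell in this column
                for (int r = s.heights[c + i]; r < bottom && r < H; r++)
                    t.sacrificed |= 1 << r;  // holes created below the piece
                t.heights[c + i] = std::min(H, bottom + rot[i].height);
            }
            // A non-sacrificed row filled in every column is a completed line.
            for (int r = 0; r < H; r++) {
                if (t.sacrificed & (1 << r)) continue;
                bool full = true;
                for (int cc = 0; cc < W; cc++) full = full && t.heights[cc] > r;
                if (full) return true;
            }
            // If some row can still be completed, explore the resulting state.
            if (t.sacrificed != (1 << H) - 1 && human_wins(t)) return true;
        }
    }
    return false;  // no placement of p wins (in particular, if none is legal)
}

// The human wins iff they have a winning answer to every piece C may give.
bool human_wins(const State& s) {
    uint64_t key = pack(s);
    auto it = memo.find(key);
    if (it != memo.end()) return it->second;
    bool win = true;
    for (int p = 0; p < 7 && win; p++) win = wins_with(s, p);
    return memo[key] = win;
}

int main() {
    // Per the result of this post, this should eventually report that the
    // human wins -- after a long computation using a lot of memory.
    std::cout << (human_wins(State{}) ? "the human wins" : "the computer wins") << "\n";
}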
Once the program has produced a strategy, we need to "compress" it.
To do so, we first "trim" it by removing unreachable board states: we
go from 9 million states to just 21 thousand states: this is done
here. Then we minimize it by merging states from which the strategy
is the same: this is done here. Since this graph is acyclic, we can
perform this in linear time simply by processing it bottom-up and
hashing configurations. We get to the 5250 states of the strategy
file.
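For illustration, here is a simple C++ sketch of these two steps done
at once (it is not the author's scripts): it reads a file in the
strategy format above, recursively computes a canonical id for every
state reachable from state 0, and merges states whose 21 entries
become identical once their next-state pointers are replaced by
canonical ids. It uses an ordered map rather than hashing for
simplicity, and it renumbers the states (the new index of the initial
state is printed separately); the input file name is a placeholder:
#include <array>
#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

std::vector<std::array<int, 21>> rows;     // the unminimized strategy, line by line
std::map<std::array<int, 21>, int> canon;  // canonical row -> merged state id
std::vector<std::array<int, 21>> merged;   // the minimized strategy
std::vector<int> newid;                    // old state id -> merged id (-2 = not seen)

// Return the merged id of old state s; as the strategy is a DAG, the recursion
// terminates, and only states reachable from the initial state are visited.
int minimize(int s) {
    if (newid[s] != -2) return newid[s];
    std::array<int, 21> row = rows[s];
    for (int p = 0; p < 7; p++)  // replace next-state pointers by merged ids
        if (row[3 * p + 2] != -1) row[3 * p + 2] = minimize(row[3 * p + 2]);
    auto it = canon.find(row);
    if (it == canon.end()) {     // first state with this behavior: keep it
        it = canon.emplace(row, (int)merged.size()).first;
        merged.push_back(row);
    }
    return newid[s] = it->second;
}

int main() {
    std::ifstream in("strategy_raw.txt");  // placeholder name of the unminimized file
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        std::array<int, 21> row;
        for (int& x : row) ss >> x;
        rows.push_back(row);
    }
    newid.assign(rows.size(), -2);
    int start = minimize(0);               // assume the initial state is on the first line
    std::cerr << merged.size() << " states kept, initial state is now " << start << "\n";
    for (const auto& row : merged) {       // print the minimized strategy
        for (int i = 0; i < 21; i++) std::cout << row[i] << (i < 20 ? ' ' : '\n');
    }
}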
Other possible optimizations
The following optimizations would have been possible, but I did not
implement them:
* There would be an easy saving of a factor of 2 by breaking the
left-right symmetry, which would also probably make the winning
strategy more concise (but it would be less convenient to
reconstruct).
* When checking the cache of possible board states, we could check
"shifted" configurations where we remove any number d of the lowest
lines, provided each column has a height of at least d. We could
also check configurations with a subset, or superset, of
sacrificed rows: if a configuration with identical column heights
and with a subset of sacrificed rows is known to be losing then
the current configuration also is, and if a configuration with
identical column heights and a superset of sacrificed rows is
known to be winning then the current configuration also is.
* Rather than exploring all possibilities to the maximal depth
before giving up, a better solution would be to use iterative
deepening. I did this in an earlier version of the code that followed
a different approach. Likewise, ordering the possible moves of a
piece using some heuristic (e.g., putting the piece as low as
possible, or sacrificing as few new lines as possible) would possibly
make the search faster (thanks to alpha-beta pruning).
Related work
For the related question of whether the player can win at Tetris,
i.e., play indefinitely, it is known that this is not the case. The
computer can force the player to lose, even with a strategy which is
oblivious to how the human plays and simply alternates S-shaped and
Z-shaped pieces: see Burgiel's 1997 paper "How to Lose at Tetris" or
this page. This does not contradict the result presented here, which
says we can clear a line (but eventually lose). Optimizing the number
of lines cleared (in the worst case or in expectation) and
guaranteeing that you make one line are different goals that may be
at odds with each other.
On the other hand, with stronger assumptions on how the computer can
propose pieces, it is possible to play forever: see this page.
There are other results of this kind in Brzustowski's 1992 master's
thesis Can you win at Tetris?
There is a 2002 study by Demaine et al. of the computational
complexity of Tetris play, which does not seem related to the results
here. For more related work on Tetris, there is a great literature
review in this FUN'2022 paper by Dallant and Iacono.
There are many Tetris implementations that try to give the worst
possible pieces to the player: at least Hatetris and bastet.
The most relevant study of Tetris to what I present here seems to be
the study of "adversarial Tetris", introduced at a reinforcement
learning competition in 2009 (see here), and follow-up work, e.g.,
Beatris. However, I wasn't able to find work in this line that
considers the question of whether the human can guarantee scoring at
least one line assuming perfect play.
Open problems
I do not know the minimal number of rows and/or the maximal
necessary height to score a line if the human can slide or rotate
pieces as they fall; the latter would depend on the exact rotation
rules implemented, which is somewhat unpleasant (similar subtleties
are considered in Demaine et al.'s paper).
I do not know what happens for a different number of columns. An odd
number of columns clearly allows the computer to win by giving only
square tetrominoes, and two columns clearly allow the human to win
(every piece can score a line by itself in every configuration,
except I-tetrominoes, for which this can be achieved in two moves).
Of course one can generalize the problem to different polyomino sets,
etc.
I do not know if the human can guarantee scoring multiple lines at
once: if the computer only gives O-shaped blocks then the human
cannot hope to score three or four lines at once^4; I don't know if
guaranteeing that you can score two lines at once is possible.
I do not know if the result implies that the human can score any
arbitrary number of lines provided that the board is sufficiently
high. The strategy I presented does not ensure this, because it only
works from the empty board; it could be the case (although unlikely)
that the board states reached after scoring one line in this strategy
no longer themselves have a winning strategy (no matter the height).
More generally, one can ask how the number of lines that the player
can guarantee behaves as a function of the number of rows of the
board.
---------------------------------------------------------------------
1. Thanks to Louis for clarifying this point. -
2. The number 5250 is not optimized by the program; it is possible
that a more concise strategy exists -- and I am not sure of how
to find it. In any case, the strategy could probably be
compressed further if it were encoded in a more clever way, e.g.,
by breaking symmetries. -
3. Thanks to Louis for a discussion where he proposed this
bruteforce approach, which worked better than what I was
attempting to do. -
4. Thanks to Ted for this remark. -
comments welcome at a3nm@a3nm.net