\documentclass[12pt,reqno]{article} \usepackage[usenames]{color} \usepackage{amssymb} \usepackage{graphicx} \usepackage{amscd} \usepackage[colorlinks=true, linkcolor=webgreen, filecolor=webbrown, citecolor=webgreen]{hyperref} \definecolor{webgreen}{rgb}{0,.5,0} \definecolor{webbrown}{rgb}{.6,0,0} \usepackage{color} \usepackage{fullpage} \usepackage{float} \usepackage{psfig} \usepackage{graphics,amsmath,amssymb} \usepackage{amsthm} \usepackage{amsfonts} \usepackage{latexsym} \usepackage{epsf} \setlength{\textwidth}{6.5in} \setlength{\oddsidemargin}{.1in} \setlength{\evensidemargin}{.1in} \setlength{\topmargin}{-.1in} \setlength{\textheight}{8.4in} \newcommand{\seqnum}[1]{\href{http://oeis.org/#1}{\underline{#1}}} \def\longto{\mathop{\longrightarrow}\limits} \begin{document} \begin{center} \epsfxsize=4in \leavevmode\epsffile{logo129.eps} \end{center} \theoremstyle{plain} \newtheorem{theorem}{Theorem} \newtheorem{corollary}[theorem]{Corollary} \newtheorem{lemma}[theorem]{Lemma} \newtheorem{proposition}[theorem]{Proposition} \theoremstyle{definition} \newtheorem{definition}[theorem]{Definition} \newtheorem{example}[theorem]{Example} \newtheorem{conjecture}[theorem]{Conjecture} \theoremstyle{remark} \newtheorem{remark}[theorem]{Remark} \begin{center} \vskip 1cm{\LARGE\bf The Asymptotics of Useless Hint Sequences \\ \vskip .08in in Nonograms } \vskip 1cm \large Daniel Berend\footnote{Research supported in part by the Milken Families Foundation Chair in Mathematics.} \\ Departments of Mathematics and Computer Science \\ Ben-Gurion University \\ Beer-Sheva 84105 \\ Israel \\ \href{mailto:berend@math.bgu.ac.il}{\tt berend@math.bgu.ac.il}\\ \ \\ Shira Zucker \\ Department of Computer Science \\ Sapir Academic College\\ M.P. Hof Ashkelon 79165 \\ Israel \\ \href{mailto: zucker.shira@gmail.com}{\tt zucker.shira@gmail.com} \end{center} \vskip .2 in \begin{abstract} In this paper we consider a question raised by Mullen regarding Nonogram puzzles. An instance of the puzzle consists of a $p\times n$ grid, whose cells are to be colored black or white, according to some hints. These hints specify the lengths of the blocks of consecutive black cells in each row and column. Mullen studied the asymptotic probability that a random hint sequence of a single row uniquely determines the color of at least one cell in that row, and gave lower and upper bounds on this probability. In this paper we tighten his bounds. \end{abstract} \section{Introduction} A {\it Nonogram} is a classic logic puzzle, in which cells in a grid must be colored black or left uncolored (or, equivalently, colored white) according to numbers at the side of the grid. (Wikipedia~\cite{wiki} lists dozens of other names by which the puzzle is known.) In this puzzle, a rectangular grid is given with a finite sequence of numbers alongside each row and above each column. These numbers indicate the lengths of the blocks of consecutive black cells in the rows and columns, respectively. For example, the sequence 3, 2 signifies that in this row (or column) there are three adjacent black cells, a gap of at least one uncolored cell, and then two adjacent black cells. Note that there may be any number of uncolored cells before the first block of black cells and/or any number of uncolored cells after the last block. Thus, if the given sequence is of length $k$, then in addition to the blocks of black cells, there must be $k-1$ uncolored blocks in between, and perhaps one or two uncolored blocks on the sides. Table~\ref{1example} provides an example of a small solved puzzle. Initially, we note that the second row needs to contain 5 black cells, and has only 5 places. Therefore, all cells in this row need to be colored black. This completes filling the second and fourth columns, which should contain only one black cell each. Similarly, the third column should contain three black cells and has only three places. Therefore, all cells of the third column are colored black. This completes filling the third row. The first row should contain three black cells, with at least one uncolored cell between each two. Thus there is only one way to color the first row, and we are done. \begin{table}[h] \centering \begin{tabular}{c|c|c|c|c|c|} & 2 & 1 & 3 & 1 & 2 \\ \hline $1,1,1$ & x & & x & & x \\ \hline $5$ & x & x & x & x & x \\ \hline $1$ & & & x & & \\ \hline % inserts single-line \end{tabular} \caption{Example of a small puzzle} \label{1example} \end{table} Assume, for example, that the given sequence for some row is $2,1,4$. In this case, if the length of the corresponding row is $9$, then we know exactly how the cells are to be colored. If $n=10$, then there are four possible configurations, corresponding to the possible locations of the ``extra" uncolored cell. However, in all these configurations, the cells in places $2,7,8,9$ of the row are colored black. In order to color the remaining cells, we need to use data regarding the rows and the other columns. If $n\geq 13$, then we cannot deduce the color of any cell without using data from the rest of the puzzle. The puzzle was studied from several points of view. The decision problem of unique solvability of a puzzle is NP-complete~\cite{UN}. Batenburg and Kosters~\cite{BK} showed that one can find all cells of a partially colored row that must be colored in some way (black or white), according to the hints regarding that row, in polynomial time. Berend, Pomeranz, Rabani, and Raziel~\cite{B} showed that, for a large randomly colored puzzle, the probability that even the color of a single cell is uniquely determined, by the hints relating to the corresponding row or column only, is very small. (However, when we do consider all hints simultaneously, the probability of having some cells with a uniquely determined color is bounded away from~0.) It was also proved that such a puzzle usually admits a huge number of solutions. Benton, Snow, and Wallach~\cite{rowsums} studied the number of possible puzzles if, instead of recording the length of all blocks of black cells in all rows and columns, one records only the total number of black cells in each row and column. When solving a nonogram puzzle, it is natural to start by looking for rows (and columns) whose hint sequences determine uniquely the color of some of the cells along the row, without resort to data from the rest of the puzzle. Thus, it is natural to study the number of hint sequences, for rows of any length $n$, that provide immediate information on at least some of the cells. It follows from Mullen~\cite{M} that most hint sequences satisfy this property. Hence it is more convenient to study the complement, namely the number of sequences that give no such information --- {\it useless sequences} ({\it forceless sequences} in Mullen's terminology). Mullen~\cite{M} studied the asymptotics of the number of useless sequences as a function of the row length. More precisely, he first observed that the total number of hint sequences, for a length-$n$ row, is $F_{n+2}$, where $(F_n)_{n=1}^\infty$ is the Fibonacci sequence, defined as follows: $$ \begin{array}{ll} F_1=F_2=1,\cr F_n=F_{n-1}+F_{n-2}, \qquad n\geq 3. \end{array} $$ Note that there are obviously~$2^n$ ways to color a length-$n$ row, but we care only about hint sequences. As many distinctly colored rows give rise to the same hint sequence, the total number of hint sequences is much smaller than~$2^n$. Recall that~$F_n$ behaves asymptotically as $\phi^n/\sqrt{5}$, where $\phi=\left(1+\sqrt{5}\right)/2$ is the golden ratio. Letting~$G_n$ denote the number of useless sequences for rows of length~$n$, Mullen~\cite{M} proved \begin{theorem}\label{mullenFormula} \quad \it{For appropriate constants} $c_1,c_2>0,$ \begin{equation} G_n \geq c_1 \cdot \frac{\phi^n}{n} \label{eq11} \end{equation} and \begin{equation} G_n \leq c_2 \cdot \frac{\phi^n\ln n}{n}, \end{equation} for all $n\geq 2$. \end{theorem} The sequence $(G_n)$ is sequence \seqnum{A304179} in the {\it On-Line Encyclopedia of Integer Sequences}. In this paper we show that the right-hand side of \eqref{eq11} provides the correct order of magnitude of $G_n$. In Section~\ref{main} we state this result formally, and in Section~\ref{the-proof}, we prove it. Section~\ref{new} presents a heuristic discussion that tries to explain even better the asymptotic behavior of~$G_n$, and in particular that this behavior is somewhat irregular. An abridged version of this paper was presented at the 28th Canadian Conference on Computational Geometry~\cite{BZconf}. We express our gratitude to the editor and to the referee for their helpful comments on the initial version of the paper. \section{The main result}\label{main} Our main result, employing the notations in the preceding section, is \begin{theorem}\label{mainTheorem} There exist two constants $C_1, C_2 > 0$ such that $$C_1\cdot\frac{\phi^n}{n}\leq G_n\leq C_2\cdot\frac{\phi^n}{n}, \ \ \ n\geq 1.$$ \end{theorem} Theorem~\ref{mainTheorem} can be interpreted as giving an asymptotic estimate of the probability of a random hint sequence being useless. In fact, as the total number of hint sequences is $F_{n+2}$, the probability of a random hint sequence being useless is $G_n / F_{n+2}$. Since $F_{n+2}\approx \phi^{n+2}/\sqrt{5},$ the probability in question is approximately $\frac{G_n \sqrt{5}}{\phi^{n+2}},$ which by the theorem is $\Theta (\frac{1}{n})$. Note, however, the difference between this result, and the previously mentioned result of Berend et al.~\cite{B}, according to which a random coloring usually gives rise to a useless hint sequence. In the model of Berend et al.~\cite{B}, one selects randomly a coloring of the row out of all $2^n$ colorings, and considers the corresponding hint sequence. Here, as in Mullen's results~\cite{M}, we pick randomly one of all $F_{n+2}$ hint sequences and consider it. The two probability spaces are very different, and thus the difference in the outcome is not contradictory. In Table~\ref{G_n_table} we provide the value of $G_n / F_{n+2}$, normalized by multiplying it by~$n$, for several $n$'s up to~$6000$. (The computations were carried out using Formulas~(\ref{amk}) and~(\ref{G_n}) below.) According to Theorem~A, these values are bounded. A quick glance at Table~\ref{G_n_table} may lead one to believe that $nG_n / F_{n+2}$ increases to some limit as $n\rightarrow\infty$. In Section~\ref{new} we will argue heuristically that, in fact, the sequence oscillates between two very close values, but does not actually converge. \begin{table}[h] \centering \begin{tabular}{|c|c|c|} \hline $n$ & $G_n$ & $nG_n/F_{n+2}$\\ \hline 10 & 33 & 2.2917 \\ 20 & 2258 & 2.5498 \\ 50 & $1.8\cdot 10^{9}$ & 2.7331 \\ 100 & $2.6\cdot 10^{19}$ & 2.8007 \\ 200 & $1.0\cdot 10^{40}$ & 2.8358 \\ 500 & $2.1\cdot 10^{102}$ & 2.8573 \\ 1000 & $3.3\cdot 10^{202}$ & 2.8646 \\ 2000 & $1.6\cdot 10^{415}$ & 2.8682 \\ 3000 & $1.0\cdot 10^{624}$ & 2.8694 \\ 4000 & $7.5\cdot 10^{832}$ & 2.8700\\ 5000 & $5.8\cdot 10^{1041}$ & 2.8704 \\ 6000 & $4.7\cdot 10^{1250}$ & 2.8706 \\ \hline \end{tabular} \caption{The probability of being useless for some small $n$} \label{G_n_table} \end{table} \section{Proof of Theorem~\ref{mainTheorem}}\label{the-proof} Following Mullen's paper~\cite{M}, let the {\it norm} of a hint sequence be the maximal term in that sequence, i.e., the maximal length of a block of cells to be colored black in a row. The {\it length} of a hint sequence is the least number of columns needed for that sequence to fit in a puzzle. (For example, the norm of the hint sequence $3,10,5,10,2$ is $10$ and its length is $34$.) Let $A_{m,k}$ denote the number of hint sequences of length~$m$ with norm at most~$k$. By Mullen~\cite[p.~6]{M}, \begin{equation}\label{amk} A_{m,k} = \begin{cases} F_m, \qquad& m\leq k;\\ \sum_{j=m-k-1}^{m-2} A_{j,k}, & m>k. \end{cases} \end{equation} To count all useless sequences, we need only sum $A_{m,n-m}$ over all possible lengths~\cite{M}, that is \begin{equation}\label{G_n} G_n=\sum_{m=0}^{n-1}A_{m,n-m}. \end{equation} For a fixed $k$, let $\phi_k$ be the largest zero of the characteristic polynomial of $\left(A_{m,k}\right)_{m=1}^\infty$, namely $x^{k+1}-x^{k-1}-x^{k-2}-\cdots-1$. Mullen~\cite{M} noted that this is in fact the only positive zero larger than~1. We will first establish an upper bound on $A_{m,k}$. \begin{proposition}\label{prop} $A_{m,k}\leq 2\phi_k^m$ for every $m,k\geq 1$. \end{proposition} \begin{proof} For $m\leq k$, by~(\ref{amk}) and the definition of $\phi$, we know that $A_{m,k}=F_m \leq \phi^m$. Now,~\cite[Prop.~5.3]{M} gives $$\phi\leq \phi_k \left(1+ \frac{1}{\sqrt{5}\phi^{k+1}-(k+2)}\right),$$ which implies $$\phi\leq \phi_k \left(1+\frac{1}{2\phi^{k+1}}\right), \qquad k\geq 8.$$ Thus, $$\phi^m \leq \phi_k^m \left(1+\frac{1}{2\phi^{k+1}}\right)^m\leq \phi_k^m \left(1+\frac{1}{2k}\right)^k \leq \phi_k^m \cdot \sqrt{e} \leq 2\phi_k^m, \qquad m\leq k.$$ \noindent For $m>k$ we proceed by induction. If $m=k+1$ then $$A_{m,k}=\sum_{j=k+1-k-1}^{k+1-2}A_{j,k} = \sum_{j=0}^{k-1}A_{j,k}\leq 2\sum_{j=0}^{k-1}\phi_k^j.$$ Since $\phi_k$ is a zero of the polynomial $x^{k+1}-x^{k-1}-x^{k-2}-\cdots-1$, we have $\sum_{j=0}^{k-1}\phi_k^j = \phi_k^{k+1}$ and therefore $$A_{m,k}\leq 2\phi_k^{k+1}=2\phi_k^{m},$$ \noindent as required. Now suppose that the inequality holds for each $m'$ such that $m'0$, and recall that $\phi^2=\phi+1$, we obtain \begin{equation} \begin{array}{ll} P_k\left(\phi-\frac{L}{\phi^k}\right) = (\phi-\frac{L}{\phi^k})^{k+2}-(\phi-\frac{L}{\phi^k})^{k+1}-(\phi-\frac{L}{\phi^k})^k +1 \cr \qquad \qquad \quad \ \ = (\phi-\frac{L}{\phi^k})^{k}\cdot [ (\phi-\frac{L}{\phi^k})^{2}-(\phi-\frac{L}{\phi^k})-1] +1 \cr \qquad \qquad \quad \ \ = (\phi-\frac{L}{\phi^k})^{k}\cdot [ \phi^2-\phi-1 +\frac{L}{\phi^k}-\frac{2L}{\phi^{k-1}}+\frac{L^2}{\phi^{2k}}] +1 \cr \qquad \qquad \quad \ \ = (1-\frac{L}{\phi^{k+1}})^{k}\cdot L(1-2\phi+\frac{L}{\phi^k}) +1 \cr \qquad \qquad \quad \ \ = 1- L(2\phi -1 - \frac{L}{\phi^k})(1-\frac{L}{\phi^{k+1}})^k \cr \qquad \qquad \quad \ \ = 1- L(\sqrt{5} - \frac{L}{\phi^k})(1-\frac{L}{\phi^{k+1}})^k, \cr \end{array} \end{equation} \noindent which is positive for $L=\frac{1}{\sqrt{5}}$. Since $\phi_k$ is a zero of $P_k$, this completes the proof of the lemma. \end{proof} \subsection{Conclusion of the proof of Theorem~\ref{mainTheorem}} \noindent By Proposition~\ref{prop} and Lemma~\ref{phi_lemma}: $$A_{m,k}\leq 2\phi_k^m \leq 2(\phi-\frac{1}{\sqrt{5}\phi^k})^m, \qquad m,k\geq 1.$$ \noindent In particular, $$A_{m,n-m}\leq 2(\phi -\frac{1}{\sqrt{5}\phi^{n-m}})^m = 2\phi^m (1-\frac{1}{\sqrt{5}\phi^{n-m+1}})^m, \qquad 1\leq m\leq n.$$ As will become evident in the sequel, for a fixed large~$n$, the significant terms $A_{m,n-m}$ on the right-hand side of~(\ref{G_n}) are those with~$m$ close to $n-\log_\phi n$, while the other terms are relatively negligible. Hence we will write~$m$ in the form $m=n-\log_\phi n +d$, where $d$ varies between approximately $-n +\log_\phi n$ and $\log_\phi n$. (Note that~$d$ is non-integer.) We have \begin{equation}\label{eqa} \begin{array}{ll} \phi^m(1-\frac{1}{\sqrt{5}\phi^{n-m+1}})^m = \displaystyle\phi^{n-\log_{\phi}n+d}\cdot\left(1-\frac{1}{\sqrt{5}\phi^{\log_{\phi}n-d+1}}\right)^{m} \cr \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ = \displaystyle\frac{\phi^n}{\phi ^{\log_{\phi}n}}\cdot \phi ^d \cdot \left(1-\frac{1}{\sqrt{5}n/\phi^{d-1}}\right)^{m} \cr \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ = \displaystyle\frac{\phi^n}{n}\cdot \phi^d \cdot \left(1-\frac{1}{\sqrt{5}n/\phi^{d-1}}\right)^{\frac{\sqrt{5}n}{\phi^{d-1}}\cdot \frac{\phi^{d-1}}{\sqrt{5}n}\cdot m}. \cr \end{array} \end{equation} For $d\geq 0$ (and $n$ sufficiently large) this gives $$\phi^m \left(1-\frac{1}{\sqrt{5}\phi^{n-m+1}}\right)^m\leq \frac{\phi^n}{n}\cdot \frac{\phi^d}{e^{\phi^{d-1}/3}}.$$ Hence $$\sum_{m\geq n-\log_{\phi}n}A_{m,n-m}\leq \frac{\phi^n}{n}\cdot 2\sum_{d=0}^\infty\frac{\phi^{d+1}}{e^{\phi^{d-1}/3}}. $$ As the series on the right-hand side converges, this means \begin{equation}\label{fin1} \sum_{m\geq n-\log_{\phi}n}A_{m,n-m} = O\left(\frac{\phi^n}{n}\right). \end{equation} For $d<0$ we obtain by~(\ref{eqa}) $$\phi^m\left(1-\frac{1}{\sqrt{5}\phi^{n-m+1}}\right)\leq \frac{\phi^n}{n}\cdot \phi^d,$$ so that \begin{equation}\label{fin2} \sum_{m