\magnification=1200 \hsize=4in \overfullrule=0pt \input amssym %\def\frac2{{#1\over #2}} \def\emph#1{{\it #1}} \def\em{\it} \nopagenumbers \noindent % % {\bf Avraham Goldstein, Petr Kolman and Jie Zheng} % % \medskip \noindent % % {\bf Minimum Common String Partition Problem: Hardness and Approximations} % % \vskip 5mm \noindent % % % % String comparison is a fundamental problem in computer science, with applications in areas such as computational biology, text processing and compression. In this paper we address the minimum common string partition problem, a string comparison problem with tight connection to the problem of sorting by reversals with duplicates, a key problem in genome rearrangement. A {\em partition} of a string $A$ is a sequence ${\cal P} = (P_1,P_2,\dots,P_m)$ of strings, called the \emph{blocks}, whose concatenation is equal to $A$. Given a partition ${\cal P}$ of a string $A$ and a partition ${\cal Q}$ of a string $B$, we say that the pair $\langle{{\cal P},{\cal Q}}\rangle$ is a {\em common partition} of $A$ and $B$ if ${\cal Q}$ is a permutation of ${\cal P}$. The {\em minimum common string partition} problem ({\tt MCSP}) is to find a common partition of two strings $A$ and $B$ with the minimum number of blocks. The restricted version of {\tt MCSP} where each letter occurs at most $k$ times in each input string, is denoted by $k$-{\tt MCSP}. In this paper, we show that $2$-{\tt MCSP} (and therefore {\tt MCSP}) is NP-hard and, moreover, even APX-hard. We describe a $1.1037$-approximation for $2$-{\tt MCSP} and a linear time $4$-approximation algorithm for $3$-{\tt MCSP}. We are not aware of any better approximations. \bye .