\count100= 1 \count101= 37 \input satmacros.tex %\draft \def\updated{November 2, 2004} %% This needs to be defined. \title{The Weierstrass Approximation Theorems} \author{Allan Pinkus} \def\shorttitle{The Weierstrass Approximation Theorems} \def\shortauthor{Allan Pinkus} \def\alp{\alpha} \def\Alp{\Alpha} \def\bet{\beta} \def\gam{\gamma} \def\Gam{\Gamma} \def\del{\delta} \def\Del{\Delta} \def\eps{\varepsilon} \def\tet{\theta} \def\Tet{\Theta} \def\lam{\lambda} \def\Lam{\Lambda} \def\sig{\sigma} \def\Sig{\Sigma} \def\ome{\omega} \def\Ome{\Omega} \def\rchi{\raise 2pt\hbox{$\chi$}} \def\bfa{{\bf a}} \def\bfA{{\bf A}} \def\bfb{{\bf b}} \def\bfB{{\bf B}} \def\bfc{{\bf c}} \def\bfC{{\bf C}} \def\bfd{{\bf d}} \def\bfD{{\bf D}} \def\bfe{{\bf e}} \def\bfE{{\bf E}} \def\bff{{\bf f}} \def\bfF{{\bf F}} \def\bfg{{\bf g}} \def\bfG{{\bf G}} \def\bfh{{\bf h}} \def\bfH{{\bf H}} \def\bfi{{\bf i}} \def\bfI{{\bf I}} \def\bfj{{\bf j}} \def\bfJ{{\bf J}} \def\bfk{{\bf k}} \def\bfK{{\bf K}} \def\bfl{{\bf l}} \def\bfL{{\bf L}} \def\bfm{{\bf m}} \def\bfM{{\bf M}} \def\bfn{{\bf n}} \def\bfN{{\bf N}} \def\bfo{{\bf o}} \def\bfO{{\bf O}} \def\bfp{{\bf p}} \def\bfP{{\bf P}} \def\bfq{{\bf q}} \def\bfQ{{\bf Q}} \def\bfr{{\bf r}} \def\bfR{{\bf R}} \def\bfs{{\bf s}} \def\bfS{{\bf S}} \def\bft{{\bf t}} \def\bfT{{\bf T}} \def\bfu{{\bf u}} \def\bfU{{\bf U}} \def\bfv{{\bf v}} \def\bfV{{\bf V}} \def\bfw{{\bf w}} \def\bfW{{\bf W}} \def\bfx{{\bf x}} \def\bfX{{\bf X}} \def\bfy{{\bf y}} \def\bfY{{\bf Y}} \def\bfz{{\bf z}} \def\bfZ{{\bf Z}} \def\CC{{\rlap {\raise 0.4ex \hbox{$\scriptscriptstyle |$}} \hskip -0.1em C}} \def\FF{{I\!\!F}} \def\NN{{I\!\!N}} \def\PP{{I\hskip-2pt P}} \def\QQ{{\rlap {\raise 0.4ex \hbox{$\scriptscriptstyle |$}} \hskip -0.1em Q}} \def\RR{{I\!\!R}} \def\ZZ{{Z\!\!\! Z}} \def\AA{{\hskip -3pt A}} \def\all{\forall} \def\func#1#2#3{#1\colon \,#2\rightarrow #3} %functio:from...to... 
\def\incl{\subseteq} \def\isom{\cong} \def\nek{,\ldots,} \def\onto{\mapsto} \def\union{\bigcup} \def\sqr#1#2{{\vcenter{\hrule height.#2pt\hbox{\vrule width.#2pt height#1pt \kern#1pt \vrule width.#2pt}\hrule height.#2pt}}} \def\square{\mathchoice\sqr56\sqr56\sqr{3.2}3\sqr{2.3}3} \def\span{{\rm span}} \def\\{{\backslash}} \def\tilC{{\widetilde C}} \def\tilf{{\widetilde f}} \def\oA{{\overline A}} \def\W{{Weierstrass}} \overfullrule=0pt \abstract {This is a survey of the {\W} Approximation Theorems and their various proofs.} \sect{Introduction} This survey is about the Weierstrass Approximation Theorems. We consider these results within a historical context and also discuss in detail many of the subsequent proofs. This is a shorter version of the paper Pinkus [2000] with some alterations. The Weierstrass Approximation Theorems are two theorems that Weierstrass (1815--1897) published in 1885 in Weierstrass [1885] when he was 70 years old. They prove the density of algebraic polynomials in the space of continuous real-valued functions on a finite interval in the uniform norm, and the density of trigonometric polynomials in the space of $2\pi$-periodic continuous real-valued functions on $\RR$ in the uniform norm. These theorems did not arise from nowhere. They were born within a historical context and it is of some interest to try to understand their origins and their impact. It has been said that two main themes stand out in {\W}' work. The first is called the {\it arithmetization of analysis}. This was a program to separate the calculus from geometry and to provide it with a proper solid analytic foundation. Providing a logical basis for the real numbers, for functions and for calculus was a necessary stage in the development of analysis. {\W} was one of the leaders of this movement in his lectures and in his papers. He not only brought a new standard of rigour to his own mathematics, but attempted to do the same to much of mathematical analysis. 
The second theme which is ever-present in {\W}' work is that of power series (and function series). {\W} is said to have stated that his own work in analysis was ``nothing but power series'', see Bell [1936, p.~462]. In fact we will see how Weierstrass perceived his approximation theorems as theorems on convergent series. These approximation theorems were also a counterbalance to {\W}' famous example of a continuous nowhere differentiable function. It is a generally accepted fact that the existence of continuous nowhere differentiable functions was known and lectured upon by {\W} in 1861. The approximation theorems are in a sense its converse. Every continuous function on $\RR$ is a limit not only of infinitely differentiable or even analytic functions, but in fact of polynomials. Furthermore, this limit is uniform if we restrict the approximation to any finite interval. Thus the set of continuous functions contains very, very non-smooth functions, but they can each be approximated arbitrarily well by the ultimate in smooth functions. It is this dichotomy which very much lies at the heart of approximation theory. \sect{The Fundamental Theorems of Approximation Theory} In this section we review the contents of {\W}' [1885] and its variants. We first fix some notation. $C(\RR)$ will denote the class of continuous real-valued functions on all of $\RR$, $C[a,b]$, $-\infty<a<b<\infty$, the class of continuous real-valued functions on the closed interval $[a,b]$, and $\tilC[a,b]$ the class of functions in $C[a,b]$ satisfying $f(a)=f(b)$. When {\W} [1885] was reprinted in {\W}' Mathematische Werke there were two notable additions, to which we return at the end of this section. For $f$ continuous and bounded on $\RR$, {\W} considers the functions $$F(x, k) = {1\over {2k\omega}} \int_{-\infty}^\infty f(u) \psi\left({{u-x}\over k}\right)\dd u,\qquad \omega =\int_0^\infty \psi(x)\dd x,$$ where $\psi$ is an entire function that is nonnegative, integrable and even on $\RR$, chosen so that $F(\cdot,k)$ is entire for each $k>0$. He explicitly states that $\psi(x)=e^{-x^2}$ is an example thereof. The consequence of the above is the following. \proclaim Theorem A. Let $f$ be continuous and bounded on $\RR$. Then there exists a sequence of entire functions $F(x,k)$ (as functions of $x$ for each positive $k$) such that for each $x$ $$\lim_{k\to 0^+} F(x, k) = f(x).$$ \smallskip {\W} seems very much taken with this result that every bounded continuous function on $\RR$ is a pointwise limit of entire functions. In fact he prefaces Theorem A with the statement that this theorem ``strikes me as remarkable and fruitful''. 
For unknown reasons this sentence, and only this sentence, was deleted from the paper when it was reprinted in {\W}' Mathematische Werke. As mentioned, on any finite interval, one may obtain uniform convergence. Furthermore, since $F(\cdot,k)$ is entire, the truncated power series of $F(\cdot,k)$ uniformly converges to $F(\cdot,k)$ on any finite interval. Each of the above statements is easily proved and gives: \proclaim Theorem B. Let $f$ be continuous and bounded on $\RR$. Given a finite interval $[a,b]$ and an $\eps>0$, there exists an algebraic polynomial $p$ for which $$|f(x)-p(x)|<\eps$$ for all $x\in [a,b]$.\nopf Throughout the first part of {\W} [1885] and for much of the second part, {\W} is concerned with functions defined on all of $\RR$. However later in the second part he does note that given any $f\in C[a,b]$, $-\infty<a<b<\infty$, one may extend $f$ to a function that is continuous and bounded on all of $\RR$, e.g., by setting $f(x)=f(a)$ for $x<a$ and $f(x)=f(b)$ for $x>b$. Theorem B thus gives: \proclaim First Fundamental Theorem of Approximation Theory. Let $f\in C[a,b]$, $-\infty<a<b<\infty$. Given $\eps>0$, there exists an algebraic polynomial $p$ for which $$|f(x)-p(x)|<\eps$$ for all $x\in [a,b]$.\nopf Returning to {\W} [1885], and bounded $f\in C(\RR)$, {\W} considers two sequences of positive values $\{c_n\}$ and $\{\eps_n\}$, for which $\lim_{n\to\infty} c_n=\infty$, and $\sum_{n=1}^\infty \eps_n<\infty$. From Theorem B it follows that for $f$ as above there exists a polynomial $p_n$ such that $$|f(x)-p_n(x)|< \eps_n$$ on $[-c_n,c_n]$. Set $q_0=p_1$ and $q_m=p_{m+1}-p_m$, $m=1,2,\ldots$ . Then $$\sum_{m=0}^n q_m(x)=p_{n+1}(x)$$ and, thus, in a pointwise sense $$f(x)=\sum_{m=0}^\infty q_m(x).\eqno(2.1)$$ Furthermore, let $[a,b]$ be a finite interval. Then for all $m$ sufficiently large (namely, once $[a,b]\incl [-c_m,c_m]$) $$|f(x)-p_m(x)|< \eps_m$$ for all $x\in [a,b]$, implying also $$|q_m(x)| < \eps_m + \eps_{m+1}$$ for all $x\in [a,b]$. Thus for some $M$ $$\sum_{m=M}^\infty |q_m(x)| < 2\sum_{m=M}^\infty \eps_m$$ for all $x\in [a,b]$ and the series $$\sum_{m=0}^\infty q_m(x)$$ therefore converges absolutely and uniformly to $f$ on $[a,b]$. This {\W} states as Theorem C. That is, \proclaim Theorem C. Let $f$ be continuous and bounded on $\RR$. 
Then $f$ may be represented, in many ways, by an infinite series of polynomials. This series converges absolutely for every value of $x$, and uniformly in every finite interval.\nopf {\W} and subsequent authors would often phrase or rephrase these approximation or density results (in this case Theorem B) in terms of infinite series. It was only many years later that this equivalent form went out of fashion. In fact such a phrasing was at the time significant. One should also recall that it was only a few years earlier that du Bois-Reymond had constructed a continuous function whose Fourier series diverged at a point, see du Bois-Reymond [1876]. {\W}' theorem was considered by many, including {\W} himself, to be a ``representation theorem''. The theorem was seen as a means of reconciling the ``analytic'' and ``synthetic'' viewpoints that had divided late 19th century mathematics, see Gray [1984] and also Siegmund-Schultze [1988]. Much of the remainder of {\W} [1885] is concerned with the construction (in some sense) of a good polynomial approximant or a good representation for $f$ (as in (2.1)). {\W} was well aware that he could not possibly construct a good power series representation for $f$, but he did find, in some sense, a reasonable expansion of $f$ in terms of Legendre polynomials. In the latter part of {\W} [1885], {\W} proves the density of trigonometric polynomials in $\tilC[0,2\pi]$. His proof is interesting and proceeds as follows using complex function theory. Let $\psi$ be an entire function that is nonnegative, integrable and even on $\RR$ and has the following property. Given an $f\in \tilC[0,2\pi]$, the functions $$F(z, k) = {1\over {2k\omega}} \int_{-\infty}^\infty f(u) \psi\left({{u-z}\over k}\right)\dd u,$$ where $$\omega =\int_0^\infty \psi(x)\dd x,$$ are entire for each $k>0$ (as a function of $z\in \CC$) and satisfy $$\lim_{k\to 0^+} F(x, k) = f(x)$$ uniformly on $[0,2\pi]$. 
{\W} notes that such functions $\psi$ exist, e.g., $\psi(u)= e^{-u^2}$. Since $f$ is $2\pi$-periodic so is $F$, i.e., $$F(z+2\pi,k)= F(z,k)$$ for all $z\in \CC$ and $k>0$. For each fixed $k>0$, set $$G(z, k) = F({{\log z}\over i}, k).$$ In general, since $\log z$ is a multiple-valued function, $G$ would also be a multiple-valued function. However from the $2\pi$-periodicity of $F$, it follows that $G$ is single-valued and thus is an analytic function on $\CC \\ \{0\}$. Consequently, $G$ has a Laurent series expansion of the form $$G(z, k) = \sum_{n=-\infty}^{\infty} c_{n,k} z^n$$ which converges absolutely and uniformly to $G$ on every domain bounded away from $0$ and $\infty$. We will consider this expansion on the unit circle $|z|=1$. Setting $z=e^{ix}$, it follows that $$F(x,k) = \sum_{n=-\infty}^{\infty} c_{n,k} e^{inx}$$ where the series converges absolutely and uniformly to $F(x,k)$ for all real $x$. (In fact, it may be shown that if $\psi(u)= e^{-u^2}$, then $c_{n,k} = c_n e^{-n^2 k^2/4}$, where the $\{c_n\}$ are the Fourier coefficients of $f$.) In other words, {\W} has given a proof of the fact that for $F(x, k)$ $2\pi$-periodic and entire, its Fourier series converges absolutely and uniformly to $F(x, k)$ on $\RR$. We now truncate this series to get an arbitrarily good approximant to $F(x,k)$ which itself, by a suitable choice of $k$, was an arbitrarily good approximant to $f$. The truncated series is a trigonometric polynomial. This completes {\W}' proof, the result of which we now formally state. \proclaim Second Fundamental Theorem of Approximation Theory. Let $f\in \tilC[0,2\pi]$. Given $\eps>0$, there exists a trigonometric polynomial $t$ for which $$|f(x)-t(x)|<\eps$$ for all $x\in [0, 2\pi]$.\nopf As we stated at the beginning of this section, when {\W} [1885] was reprinted in {\W}' Mathematische Werke there were two notable additions. These are of interest and worth mentioning. 
We recall that while this reprint appeared in 1903 there is reason to assume that {\W} himself edited this paper. The first addition was a short (half page) ``introduction''. We quote it (verbatim in meaning if not in fact). \noindent {\it The main result of this paper, restricted to the one variable case, can be summarized as follows: Let $f\in C(\RR)$. Then there exists a sequence $f_1, f_2, \ldots$ of entire functions for which $$f(x)=\sum_{i=1}^\infty f_i(x)$$ for each $x\in \RR$. In addition the convergence of the above sum is uniform on every finite interval.} We can assume that this is the emphasis which {\W} wished to give his paper. It is a repeat of Theorem C (although the boundedness condition on $f$ seems to have been overlooked) and curiously without mention of the fact that the $f_i$ may be assumed to be polynomials. The second addition is 10 pages appended to the end of the paper. In these 10 pages {\W} shows how to extend the results of this paper (or, to be more precise, the results concerning algebraic polynomials) to approximating continuous functions of several variables. He does this by setting $F(x_1\nek x_n, k)$ equal to $$ {1\over {2^nk^n\omega^n}} \int_{-\infty}^\infty \!\cdots \!\int_{-\infty}^\infty \!\!\!f(u_1\nek u_n) \psi({{u_1-x_1}\over {k}}) \cdots \psi({{u_n-x_n}\over {k}}) \dd u_1\cdots \dd u_n$$ and then essentially mimicking the proofs of Theorems A and B. However, Picard [1891a] had already published, in 1891, an alternative proof of {\W}' theorems and shown how to extend the results to functions of several variables. As such, {\W}' priority to this result is somewhat in question. \sect{Additional Proofs of the Fundamental Theorems} In this section we present various alternative proofs of {\W}' theorems on the density of algebraic and trigonometric polynomials on finite intervals in $\RR$. We believe that the echo of these proofs has an abiding value. 
Some of the papers we cite contain additional results or emphasize other points of view. We ignore such digressions. The proofs we present divide roughly into three groups. The first group contains proofs that, in one form or another, are based on singular integrals. The proofs of {\W}, Picard, Fej\'er, Landau, and de la Vall\'ee Poussin belong here. The second group of proofs is based on the idea of approximating a particular function. In this group we find the proofs of Runge/Phragm\'en, Lebesgue, Mittag-Leffler, and Lerch. Finally, there is the third group, which contains the proofs that do not quite belong to either of the above groups. Here we find proofs due to Lerch, Volterra and Bernstein. These are what we term the ``early proofs''. They all appeared prior to 1913. Note the pantheon of names that were drawn to this theorem. The main focus of these proofs is the Weierstrass theorems themselves rather than any far-reaching generalizations thereof. There are later proofs coming from different and broader formulations. However we discuss only one of these later proofs. It is that due to Kuhn which we consider to be wonderfully elegant and simple. For historical consistency we have chosen to present these proofs in more or less chronological order. This lengthens the paper, but we hope the advantages of this approach offset the deficiencies. We start by formally stating certain facts which will be obvious to most readers, but perhaps not to everyone. The first two statements follow from a change of variables, and are stated without proof. \proclaim Proposition 1. Algebraic polynomials are dense in $C[a,b]$ iff they are dense in $C[0,1]$.\nopf Analogously we have the less used: \proclaim Proposition 2. 
The trigonometric polynomials $$\span\{ 1, \sin x, \cos x, \sin 2x, \cos 2x,\ldots\}$$ are dense in $\tilC[0,2\pi]$ iff $$\span\{ 1, \sin {{2\pi x}\over {b-a}}, \cos {{2\pi x}\over {b-a}}, \sin 2{{2\pi x}\over {b-a}}, \cos 2{{2\pi x}\over {b-a}},\ldots\}$$ are dense in $\tilC[a,b]$.\nopf We now show that the density of algebraic polynomials in $C[a,b]$, and trigonometric polynomials in $\tilC[0,2\pi]$, are in fact equivalent statements. That is, we prove that each of the fundamental theorems follows from the other, see also Natanson [1964, p.~16--19]. \proclaim Proposition 3. If trigonometric polynomials are dense in $\tilC[0,2\pi]$, then algebraic polynomials are dense in $C[a,b]$. \pf We present two proofs of this result. The first proof may be found in Picard [1891a]. Assume, without loss of generality, that $0\le a< b<2\pi$. Extend $f\in C[a,b]$ to some $\tilf\in \tilC[0,2\pi]$. Since trigonometric polynomials are dense in $\tilC[0,2\pi]$, there exists a trigonometric polynomial $t$ that is arbitrarily close to $\tilf$ on $[0,2\pi]$, and thus to $f$ on $[a,b]$. Every trigonometric polynomial is a finite linear combination of $\sin nx$ and $\cos nx$. As such each is an entire function. Thus $t$ is an entire function having an absolutely and uniformly convergent power series expansion. By suitably truncating this power series we obtain an algebraic polynomial that is arbitrarily close to $t$, and thus ultimately to $f$. A slight variant on the above bypasses the need to extend $f$ to $\tilf$. Assume $f\in C[0,2\pi]$, and define $$g(x)=f(x) + {{f(0) - f(2\pi)}\over {2\pi}} x.$$ Then $g\in \tilC[0,2\pi]$. We now apply the reasoning of the previous paragraph to obtain an algebraic polynomial $p$ arbitrarily close to $g$ on $[0,2\pi]$, whence it follows that $$p(x) - {{f(0) - f(2\pi)}\over {2\pi}} x$$ is arbitrarily close to $f$ on $[0,2\pi]$. A different and more commonly quoted proof is the following which does not depend upon the truncation of a power series. 
According to de la Vall\'ee Poussin [1918], [1919], the idea in this proof is due to Bernstein. Given $f\in C[-1,1]$, set $$g(\tet) = f(\cos \tet),\qquad -\pi\le \tet\le \pi.$$ Then $g\in \tilC[-\pi,\pi]$ and $g$ is even. As such given $\eps>0$ there exists a trigonometric polynomial $t$ for which $$|g(\tet)-t(\tet)|<\eps$$ for all $\tet\in [-\pi,\pi]$. We divide $t$ into its even and odd parts, i.e., $$t_e(\tet) = {{t(\tet) + t(-\tet)}\over 2}$$ $$t_o(\tet) = {{t(\tet) - t(-\tet)}\over 2}$$ and note that $t_e$ and $t_o$ are also trigonometric polynomials. (Equivalently, $t_e$ is composed of the cosine terms of $t$, while $t_o$ is composed of the sine terms of $t$.) Since $g$ is even we have $$\max \{ |(g-t)(\tet)|, |(g-t)(-\tet)|\}$$ $$= \max \{ |(g-t_e)(\tet)-t_o(\tet)|, |(g-t_e)(\tet) + t_o(\tet)|\} \ge |(g-t_e)(\tet)|,$$ and, thus, $$|g(\tet) - t_e(\tet)|< \eps$$ for all $\tet\in [-\pi,\pi]$. In other words, since $g$ is even we may assume that $t$ is even. Let $$t(\tet) = \sum_{m=0}^n a_m \cos m\tet.$$ Each $\cos m\tet$ is a polynomial of exact degree $m$ in $\cos \tet$. In fact $$\cos m\tet = T_m(\cos \tet)$$ where the $T_m$ are the Chebyshev polynomials (see e.g., Rivlin [1974]). Setting $$p(x) = \sum_{m=0}^n a_m T_m(x),$$ we have $$|f(x)-p(x)| <\eps$$ for all $x\in [-1,1]$. \eop \proclaim Proposition 4. If algebraic polynomials are dense in $C[a,b]$, then trigonometric polynomials are dense in $\tilC[0,2\pi]$. \pf The first proof of this fact was the one given by {\W}, as described in Section 2. To our surprise (and chagrin) we have essentially found only one other proof of this result, and it is not simple. The proof we give here is de la Vall\'ee Poussin's [1918], [1919] variation on a proof in Lebesgue [1898]. Let $f\in \tilC[0,2\pi]$ and consider $f$ as being defined on all of $\RR$. Set $$g(\tet)={{f(\tet)+f(-\tet)}\over 2}$$ and $$h(\tet)={{f(\tet)-f(-\tet)}\over 2}\sin \tet.$$ Both $g$ and $h$ are continuous even functions of period $2\pi$. 
Define $$\phi(x) = g(\arccos x),\qquad \psi(x) = h(\arccos x).$$ These are well-defined functions in $C[-1,1]$. Thus, given $\eps>0$ there exist algebraic polynomials $p$ and $q$ for which $$|\phi(x) - p(x)|< {\eps \over 4},\qquad |\psi(x) - q(x)|< {\eps \over 4}$$ for all $x\in [-1,1]$. As $g$ and $h$ are even, it follows that $$|g(\tet) - p (\cos \tet)|<{\eps \over 4},\qquad |h(\tet) - q (\cos \tet)|<{\eps \over 4}$$ for all $\tet$. From the definition of $g$ and $h$, we obtain $$\left|f(\tet)\sin^2\tet -\left[p(\cos \tet)\sin^2\tet + q(\cos \tet)\sin \tet\right]\right|<{\eps\over 2}$$ for all $\tet$. We apply this same analysis to the function $f(\tet+ \pi/2)$ to obtain algebraic polynomials $r$ and $s$ for which $$\left|f(\tet+{\pi\over 2})\sin^2\tet -\left[r(\cos \tet)\sin^2\tet + s (\cos \tet)\sin \tet\right]\right|<{\eps\over 2}$$ for all $\tet$. Replacing $\tet$ by $\tet - \pi/2$ gives $$\left|f(\tet)\cos^2\tet -\left[r(\sin \tet)\cos^2\tet - s(\sin \tet)\cos \tet\right]\right|<{\eps\over 2}.$$ Thus the trigonometric polynomial $$p(\cos \tet)\sin^2\tet + q(\cos \tet)\sin \tet + r(\sin \tet)\cos^2\tet - s(\sin \tet)\cos \tet$$ is an $\eps$-approximant to $f$. \eop After these preliminaries we can now look at the inherent methods and ideas used in various alternative proofs of either of the two Weierstrass fundamental theorems of approximation theory. We present these proofs in more or less the order in which they appeared in print. \medskip\noindent {\bf Picard.} \'Emile Picard (1856--1941) (Hermite's son-in-law) had an abiding interest in {\W}' theorems and in Picard [1891a] gave the first in a series of different proofs of the Weierstrass theorems. This proof also appears in Picard's famous textbook [1891b]. Later editions of this textbook expanded upon this, often including other methods of proof, but not always with complete references. Picard's proof, like that of Weierstrass, is based on a smoothing procedure using singular integrals. 
Picard, however, chose to use the Poisson integral. His proof proceeds as follows. Assume $f\in \tilC [0,2\pi]$. As $f$ is continuous and $2\pi$-periodic on $\RR$, it is uniformly continuous thereon. As such, given $\eps>0$ there exists a $\del>0$ such that for $|x-\tet|<\del$ we have $|f(x)-f(\tet)|<\eps$. Let $$P(r, \tet) = {1\over {2\pi}} \int_0^{2\pi} {{1-r^2}\over { 1-2r\cos (x-\tet) +r^2}} f(x)\dd x$$ denote the Poisson integral of $f$. We claim that, with the above notation, $$|P(r,\tet)-f(\tet)| < \eps + {{\|f\|(1-r^2)}\over {r(1-\cos \del)}}$$ for all $\tet$. This may be explicitly proven as follows. $$P(r,\tet)-f(\tet) = {1\over {2\pi}} \int_0^{2\pi} {{1-r^2}\over { 1-2r\cos (x-\tet) +r^2}} [f(x)-f(\tet)]\dd x$$ $$\!\!\!={1\over {2\pi}} \int_{|x-\tet|<\del} {{1-r^2}\over { 1-2r\cos (x-\tet) +r^2}} [f(x)-f(\tet)]\dd x$$ $$\phantom{123}+ {1\over {2\pi}} \int_{\del\le |x-\tet|\le \pi} {{1-r^2}\over { 1-2r\cos (x-\tet) +r^2}} [f(x)-f(\tet)]\dd x.$$ Now $${1\over {2\pi}} \int_{|x-\tet|<\del} {{1-r^2}\over { 1-2r\cos (x-\tet) +r^2}}|f(x)-f(\tet)|\dd x$$ $$ < {\eps \over {2\pi}} \int_0^{2\pi} {{1-r^2}\over { 1-2r\cos(x-\tet) +r^2}}\dd x = \eps.$$ In addition $${1\over {2\pi}} \int_{\del\le|x-\tet|\le \pi} {{1-r^2}\over { 1-2r\cos (x-\tet) +r^2}} |f(x)-f(\tet)|\dd x$$ $$\le 2\|f\| {1\over {2\pi}} \int_{\del\le|x-\tet|\le \pi} {{1-r^2}\over { 1-2r\cos (x-\tet) +r^2}}\dd x \le {{\|f\|(1-r^2)}\over {r(1-\cos \del)}}.$$ This last inequality is a consequence of $$1-2r\cos(x-\tet) +r^2\ge 2r -2r\cos \del =2r(1-\cos \del)$$ which holds for all $x,\tet$ satisfying $\del \le |x-\tet| \le \pi$. As a function of $r$, $${{\|f\|(1-r^2)}\over {r(1-\cos \del)}}$$ decreases to zero as $r$ increases to $1$. Choose some $r_1<1$ for which $${{\|f\|(1-r_1^2)}\over {r_1(1-\cos \del)}}< \eps.$$ Thus $$|f(\tet)-P(r_1,\tet)|<2\eps$$ for all $\tet$. Let $$a_0/2 +\sum_{n=1}^\infty \left[ a_n \cos nx + b_n \sin nx\right]$$ denote the Fourier series of $f$. 
Recall that the Fourier series of $P(r, \tet)$ is given by $$a_0/2 +\sum_{n=1}^\infty r^n\left[ a_n \cos n\tet + b_n \sin n\tet\right].$$ Since the $a_n$ and $b_n$ are uniformly bounded, the above Fourier series converges absolutely, and uniformly converges to $P(r,\tet)$ for each $r<1$. Thus there exists an $m$ for which $$\left| P(r_1,\tet) - \left[a_0/2 +\sum_{n=1}^m r_1^n (a_n \cos n\tet + b_n \sin n\tet)\right]\right|<\eps$$ for all $\tet$. Set $$g(\tet) = a_0/2 +\sum_{n=1}^m r_1^n(a_n \cos n\tet + b_n \sin n\tet).$$ We have ``constructed'' a trigonometric polynomial satisfying $$|f(\tet)-g(\tet)|<3\eps$$ for all $\tet$. In other words we have proven that in the uniform norm, trigonometric polynomials are dense in the space of continuous $2\pi$-periodic functions. As noted in the proof of Proposition 3, Picard then proves the {\W} theorem for algebraic polynomials based on the above result. Picard ends his paper by noting that the same procedure can be used to obtain parallel results for continuous functions of many variables. He was the first to publish an extension of the {\W} theorems to several variables. As Picard [1891a] states, this proof is based on an inequality obtained by H.~A.~Schwarz in his well-known paper Schwarz [1871]. In fact, as Cakon [1987] points out, almost the entire Picard proof can be found in Schwarz [1871]. What is perhaps surprising is that Weierstrass did not notice this connection. \medskip\noindent {\bf Lerch I.} M.~Lerch (1860--1922) was a Czech mathematician of some renown (see Skrasek [1960] and MacTutor [2004]) who attended some of Weierstrass' lectures. Lerch wrote two papers, Lerch [1892] and Lerch [1903], that included proofs of the Weierstrass theorem for algebraic polynomials. Unfortunately the paper Lerch [1892] is in Czech, difficult to procure, and I have found no reference to it anywhere in the literature except in Lerch [1903] and in a footnote in Borel [1905] (but Borel did not see the paper). 
Subsequent authors mentioned in this work were seemingly totally ignorant of this paper. Many of these authors quote Volterra [1897], although Lerch [1892] contains a similar proof with the same ideas. It is for the reader to decide whether, in these circumstances, Lerch deserves prominence or only precedence. We here explain the proof as is essentially contained in Lerch [1892]. We defer the discussion of Lerch [1903] to a more appropriate place. Let $f\in C[a,b]$. Since $f$ is uniformly continuous on $[a,b]$, it can be uniformly approximated thereon by a polygonal (piecewise linear) line. Lerch notes that every polygonal line $g$ may be uniformly approximated by a Fourier cosine series of the form $${{a_0}\over 2} + \sum_{n=1}^\infty a_n \cos{{x-a}\over {b-a}}n\pi,$$ where $$a_n = {2\over {b-a}} \int_a^b g(x)\cos{{x-a}\over {b-a}}n\pi \dd x.$$ It was, at the time, well-known to any mathematician worth his salt that the Fourier cosine series of a continuous function with a finite number of maxima and minima uniformly converges to the function. This result goes back to Dirichlet [1829], see e.g. Sz.-Nagy [1965, p.~399]. Alternatively it is today a standard result contained in every Fourier series text that if the derivative of a continuous function is piecewise continuous with one-sided derivatives at each point, then its Fourier cosine series converges uniformly. Both these results follow from the analogous results for periodic functions and the usual Fourier series. Both these results hold for our polygonal line. As this Fourier cosine series converges uniformly to our polygonal line we may truncate it to obtain a trigonometric polynomial (but not a trigonometric polynomial as in Proposition 2) which approximates our polygonal line arbitrarily well. Finally, as the trigonometric polynomial is an entire function we can suitably truncate its power series expansion to obtain our desired algebraic polynomial approximant. 
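\medskip\noindent To make the steps of this argument concrete, here is a small worked instance. The choice $g(x)=x$ on $[a,b]=[0,1]$ is ours, made purely for illustration, and is not taken from Lerch [1892]. With the formula for $a_n$ given above, integration by parts yields $$a_0=2\int_0^1 x\dd x=1,\qquad a_n = 2\int_0^1 x\cos n\pi x\dd x = {{2\left[(-1)^n-1\right]}\over {n^2\pi^2}},\quad n\ge 1,$$ so that $a_n=0$ for even $n\ge 2$ and $a_n=-4/(n^2\pi^2)$ for odd $n$. The Fourier cosine series of $g$ is therefore $${1\over 2} - {4\over {\pi^2}}\sum_{j=0}^\infty {{\cos (2j+1)\pi x}\over {(2j+1)^2}}.$$ Granting that this series converges to $g$ (as the result of Dirichlet guarantees), the partial sum $S_N$ consisting of the constant term and the terms $j=0\nek N$ satisfies $$|x-S_N(x)|\le {4\over {\pi^2}}\sum_{j=N+1}^\infty {1\over {(2j+1)^2}} < {4\over {\pi^2}}\int_N^\infty {{\dd t}\over {(2t+1)^2}} = {2\over {\pi^2(2N+1)}}$$ for all $x\in[0,1]$. The bound does not depend on $x$, which is the uniform convergence used in the proof. Each $S_N$ is a finite linear combination of the entire functions $\cos (2j+1)\pi x$, and truncating their power series then gives an algebraic polynomial approximant to $g$.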
\medskip\noindent {\bf Volterra.} The next published proof of {\W}' theorems is due to Volterra [1897]. V.~Volterra (1860--1940) proved only the density of trigonometric polynomials in $\tilC[0,2\pi]$. As he was aware of Picard [1891a], this should not detract from his proof. Volterra was unaware of Lerch [1892], but his proof is much the same. Let $f\in \tilC[0,2\pi]$. Since $f$ is continuous on a closed interval, it is also uniformly continuous thereon. As such, it is possible to find a polygonal line that approximates $f$ arbitrarily well. One can also assume that the polygonal line is $2\pi$-periodic. It thus suffices to prove that one can arbitrarily well approximate any continuous, $2\pi$-periodic, polygonal line by trigonometric polynomials. As stated in the proof of Lerch, the Fourier series of the polygonal line uniformly converges to the function. We now suitably truncate the Fourier series to obtain the desired approximation. \medskip C.~Runge (1856--1927), E.~Phragm\'en (1863--1937), H.~Lebesgue (1875--1941) and G.~Mittag-Leffler (1846--1927) all contributed proofs of the Weierstrass approximation theorems, and their proofs are related both in character and idea. What did each do? Mittag-Leffler, in 1900, was the last of the above four to publish on this subject. However he seems to have been the first to point out, in print, Runge and Phragm\'en's contributions. As such we start this story with Mittag-Leffler. The paper Mittag-Leffler [1900] is an ``extract from a letter to E.~Picard''. This was, at the time, a not uncommon format for an article. Journals were still in their infancy, but were replacing correspondence as the primary mode of dissemination of mathematical research. Thus this combination of these two forms. The article came in response to what Picard had written in his ``Lectures on Mathematics'' given at the Decennial Celebration at Clark University, Picard [1899]. 
In this grand review Picard mentions the importance, in the development of the understanding of functions, of Weierstrass' example of a continuous nowhere differentiable function, and of {\W}' theorem on the representation of every continuous function on a finite interval as an absolutely and uniformly convergent series of polynomials. Picard then goes on to mention his own proof and that of Volterra [1897]. Mittag-Leffler [1900] points out that Weierstrass' theorem also follows from work of Runge [1885, 1885/86] although, as he notes, it is not explicitly contained anywhere in either of these two papers. He then explains his own proof, to which we shall return later. How did Mittag-Leffler know about {\W}' theorem following from the work of Runge? Firstly, Mittag-Leffler was the editor of Acta Mathematica and, as he writes, he was the one who published Runge's paper. (Mittag-Leffler founded Acta Mathematica in 1882 and was its editor for 45 years.) Moreover in the paper of Mittag-Leffler [1900] there is a very interesting long footnote which seems to have been somewhat overlooked. It starts as follows: {\it I found on this subject among my papers an article of Phragm\'en, from the year 1886, which goes thus}. What follows is two pages where Phragm\'en (who was 23 years old at the time) explains how Weierstrass' theorem can follow from Runge's work, Phragm\'en's simplification thereof, and also how to get from this the Weierstrass theorem on the density of trigonometric polynomials in $\tilC[0,2\pi]$ (with some not insignificant additional work). Before we explain this in detail, let us start with the general idea behind these various proofs. Let $f\in C[0,1]$. Since $f$ is continuous on a closed interval, it is also uniformly continuous thereon. 
As Lerch and Volterra pointed out, it is thus possible to find a polygonal line $g$ (which today we might also call a spline of degree 1 with simple knots) that approximates $f$ uniformly to within any given $\eps>0$, i.e., for which $$|f(x)-g(x)|<\eps,$$ for all $x\in [0,1]$. This polygonal line is the first idea in these proofs. The second idea is to show that there is an arbitrarily good polynomial approximant to the relatively ``simpler'' $g$. This will then suffice to prove that we can find a polynomial that approximates our original $f$ arbitrarily well. The third and more fundamental idea is to reduce the problem of finding a good polynomial approximant to $g$ (which depends upon $f$) to that of finding a good polynomial approximant to one and only one function, independent of $f$. Each of Runge, Mittag-Leffler and Lebesgue does this in a different way. \medskip\noindent {\bf Runge/Phragm\'en.} We first fix some notation. Let $0=x_0<x_1<\cdots<x_m=1$ denote the abscissae of the knots of the polygonal line $g$, and let $g_i$ denote the linear polynomial that agrees with $g$ on $[x_{i-1},x_i]$, $i=1\nek m$. Then $$g(x)=g_1(x) + \sum_{i=1}^{m-1} \left[ g_{i+1}(x) - g_i(x)\right] h(x-x_i),\eqno(3.1)$$ where $$h(x)=\cases{ 1,& $x\ge 0$\cr 0,& $x<0$\cr}.$$ Runge approximates $h$ by means of the rational functions $$\phi_n(x)={1\over {1+x^{2n}}},$$ for which $$\lim_{n\to\infty} \phi_n(x)=\cases{ 1,& $|x|<1$\cr 1/2,& $|x|=1$\cr 0,& $|x|>1$\cr}.$$ Set $\psi_n(x) = 1-\phi_n(1+x)$. Then restricted to $[-1,1]$ we have $$\lim_{n\to\infty} \psi_n(x)=\cases{ 1,& $0<x\le 1$\cr 1/2,& $x=0$\cr 0,& $-1\le x<0$\cr}.$$ Moreover, as $\psi_{n+1}(x)> \psi_n(x)$ for $x\in (0,1]$, while $\psi_{n+1}(x)< \psi_n(x)$ for $x\in (-1,0)$, it follows that the functions $\psi_n$ are bounded on $[-1,1]$ and, for any given small $\del>0$, uniformly converge to the function $h$ on $[-1,-\del] \union [\del,1]$. Since the linear polynomial $g_{i+1} - g_i$ vanishes at $x_i$, a short calculation verifies that for each $x_i\in (0,1)$ $$\left[ g_{i+1}(x) - g_i(x)\right] \psi_n(x-x_i)$$ uniformly converges to $$\left[ g_{i+1}(x) - g_i(x)\right] h(x-x_i)$$ on $[0,1]$. Replacing the $h$ in (3.1) by $\psi_n$ we obtain a sequence of functions which uniformly approximate $g$. These functions $$\Psi_n(x)=g_1(x) + \sum_{i=1}^{m-1} \left[ g_{i+1}(x) - g_i(x)\right] \psi_n(x-x_i)$$ are not polynomials or entire functions. But they are rational functions. Thus any continuous function on a finite real interval can be uniformly approximated by rational functions. 
This is the main result of Runge [1885/86]. It was published the same year as Weierstrass' paper. Runge also discussed what could be said in the case of continuous functions on all of $\RR$. In that context he noted that from one of his results in Runge [1885] one could always replace $\Psi_n$ by another rational function, real on $\RR$, with exactly two conjugate poles. Phragm\'en in the above-mentioned footnote in Mittag-Leffler [1900] (but according to Mittag-Leffler written in 1886), remarks that apparently Runge overlooked in Runge [1885/86] (or did not think important) the fact that he could replace rational functions by polynomials. Runge quite explicitly had the tools to do this from Runge [1885]. What is the relevant result from Runge [1885]? It is the following, which we state in an elementary form. Assume $D$ is a compact set and $\CC\\ D$ is connected. Let $R$ be a rational function with poles outside $D$. Then given any point $w\in \CC\\ D$ there are rational functions, with only the one pole $w$, that approximate $R$ arbitrarily well on $D$. This is not a difficult result to prove. Here, essentially, is Runge's proof. The rational function $R$ can be decomposed as $R=\sum_{j=1}^n R_j$ where each $R_j$ is a rational function with only one pole $w_j$. We now show how to move each $w_j$ to $w$ in a series of finite steps. For each $j$ we choose $a_0\nek a_m$, where $a_0=w_j$ and $a_m=w$, and the $a_i$ are chosen so that $$|a_{i-1}-a_i|< |z-a_i |,\qquad i=1\nek m$$ for all $z\in D$. This can be done. At each stage we will construct a rational function $G_i$ ($G_0=R_j$) with only the simple pole $a_i$, and such that $G_i$ is arbitrarily close to $G_{i-1}$. This follows from the fact that for given $k\in \NN$ the function $${1\over {(z-a_{i-1})^k}}$$ can be arbitrarily well approximated on $D$ by $$\left[ {1\over {(z-a_{i-1})}} \left[ 1 - \left({{a_{i-1}-a_i}\over {z-a_i}} \right)^n\right]\right]^k$$ by taking $n$ sufficiently large. 
Note that the latter is a rational function with a pole only at $a_i$. Runge further noted that by a linear fractional transformation (and a bit of care) the pole could be shifted to $\infty$, whence the rational function becomes a polynomial. As Phragm\'en points out, if the function $f$ to be approximated on $[0,1]$ is real, we can replace the polynomial approximant $G$ obtained above by ${\sl Re}\,G$ on $[0,1]$, which is also a polynomial and which approximates $f$ at least as well thereon. Thus {\W}' theorem is proved. Phragm\'en also notes that it is really not necessary to use the results of Runge [1885]. If we go back to Runge [1885/86] and consider his construction therein, we see that each of the rational approximants is real on $[0,1]$, and has denominator $1+ (1+x)^{2n}$ for some $n$. Any such $R$ may be decomposed as $$R= g + r_1+ r_2$$ where $g$ is a polynomial, $r_1$ is a rational function, all of whose poles lie in the upper half-plane, and $r_2(z)={\overline {r_1(\overline{z})}}$ is a rational function, all of whose poles are conjugate to the poles of $r_1$ and lie in the lower half-plane. It is possible to choose a point $z_1$ in the lower half-plane such that there exists a circle centered at $z_1$ containing $[0,1]$, but not containing any poles of $r_1$. As such the Taylor series of $r_1$ about $z_1$ converges uniformly to $r_1$ on $[0,1]$. Truncate it to obtain a polynomial $p_1$ that approximates $r_1$ arbitrarily well on $[0,1]$. It follows that $p_2(z)={\overline {p_1(\overline{z})}}$ has the corresponding property with respect to $r_2$. As such $$P=g+p_1+p_2$$ is a real polynomial that can be chosen to approximate $f$ arbitrarily well. Another simple option, not mentioned by Phragm\'en, is to use the result of Runge [1885] to move the poles of any rational approximant away from $[0,1]$ so that a circle can be put about $[0,1]$ which does not contain any poles, and then use the truncated power series as above. 
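\medskip\noindent To see why the pole-shifting step in Runge's argument works, here is a short verification for $k=1$ (a sketch of our own; the auxiliary symbols $r$ and $q$ are introduced only for this remark and do not appear in Runge's paper). Put $r(z)=(a_{i-1}-a_i)/(z-a_i)$, so that $1-r(z)=(z-a_{i-1})/(z-a_i)$. Then $${1\over {z-a_{i-1}}}\left[ 1-r(z)^n\right] = {1\over {z-a_i}}\,{{1-r(z)^n}\over {1-r(z)}} = {1\over {z-a_i}} \sum_{l=0}^{n-1} r(z)^l,$$ which has a pole only at $a_i$, while $${1\over {z-a_{i-1}}} - {1\over {z-a_{i-1}}}\left[ 1-r(z)^n\right] = {{r(z)^n}\over {z-a_{i-1}}}.$$ Since $D$ is compact and $|a_{i-1}-a_i|<|z-a_i|$ on $D$, there is a $q<1$ with $|r(z)|\le q$ for all $z\in D$, and $|z-a_{i-1}|$ is bounded away from zero on $D$. Hence the error tends to zero uniformly on $D$ as $n\to\infty$. The case of general $k$ follows by taking $k$th powers, since the functions involved are uniformly bounded on $D$. \medskip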
Phragm\'en's proof of the density of trigonometric polynomials in $\tilC[0,2\pi]$ is more complicated and we will not present it here. In any case, as we have seen, the algebraic Weierstrass theorem is a fairly simple consequence of Runge's [1885] and [1885/86] results. It is unfortunate and somewhat astonishing that Runge did not think of it. \medskip\noindent {\bf Lebesgue.} Let us now give Lebesgue's proof of {\W}' theorem as found in Lebesgue [1898]. This is one of the more elegant and cited proofs of {\W}' theorem. It is interesting to note that this was Lebesgue's first published paper. He was, at the time of publication, a 23 year old student at the \'Ecole Normale Sup\'erieure. He obtained his doctorate in 1902. A more ``modern'' form of writing the $g$ of (3.1) is as a spline. That is, $$g(x) = ax+b +\sum_{i=1}^{m-1} c_i(x-x_i)^1_+$$ where $$x^1_+ =\cases{ x,& $x\ge 0$\cr 0,& $x<0$\cr}$$ and $ax+b=g_1(x)$. (This easily follows from the form (3.1). As $g_{i+1}(x)-g_i(x)$ is a linear polynomial that vanishes at $x_i$, it is necessarily of the form $c_i(x-x_i)$ for some constant $c_i$.) Since $$2x^1_+= |x| + x$$ the above form of $g$ may also be rewritten as $$g(x) = Ax+B +\sum_{i=1}^{m-1} C_i|x-x_i|\eqno(3.2)$$ for some real constants $A$, $B$, and $C_i$. Lebesgue [1898] considers the form (3.2) of $g$, and argues as follows. To approximate $g$ arbitrarily well by a polynomial it suffices to be able to approximate $|x|$ arbitrarily well by a polynomial in $[-1,1]$ (or in fact in any neighbourhood of the origin). If for given $\eta>0$ there exists a polynomial $p$ satisfying $$\left| |x|-p(x)\right|<\eta$$ for all $x\in [-1,1]$, then $$\left| |x-x_i|-p(x-x_i)\right|<\eta$$ for all $x\in [0,1] \subset [x_i-1, x_i+1]$ (since $0\le x_i\le 1$). By a judicious choice of $\eta$, depending on the predetermined constants $C_i$ in (3.2), it then follows that $$\left| g(x) -\left[ Ax+B +\sum_{i=1}^{m-1} C_ip(x-x_i)\right]\right| <\eps$$ for all $x\in [0,1]$. 
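Explicitly (a sketch; this particular choice of $\eta$ is ours and is only one of many that work), the triangle inequality applied to (3.2) gives $$\left| g(x) -\left[ Ax+B +\sum_{i=1}^{m-1} C_ip(x-x_i)\right]\right| \le \sum_{i=1}^{m-1} |C_i|\, \left| |x-x_i|-p(x-x_i)\right| \le \eta \sum_{i=1}^{m-1} |C_i|$$ for all $x\in [0,1]$, so it suffices to take, for example, $$\eta = {\eps\over {1+\sum_{i=1}^{m-1} |C_i|}}.$$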
Thus our problem has been reduced to that of approximating just the one function $|x|$. How can this be done? As Lebesgue explains, one can write $$|x|=\sqrt{x^2}=\sqrt{1-(1-x^2)}= \sqrt{1-z}$$ where $z=1-x^2$, and then expand the above radical by the binomial formula to obtain a power series in $z=1-x^2$ which converges uniformly to $|x|$ in $[-1,1]$. One finally just truncates the power series. To be more explicit, we have $$(1-z)^{1/2} = \sum_{n=0}^\infty {{1/2}\choose n} (-z)^n$$ where $${{1/2}\choose n} = {{{1\over 2}({1\over 2}-1)\cdots ({1\over 2} - n+1)} \over {n!}} = {{(-1)^{n-1} {1\over 2}{1\over 2}{3\over 2}\cdots {{2n-3}\over 2}}\over {n!}}.$$ Thus $$(1-z)^{1/2} = 1 -\sum_{n=1}^\infty a_nz^n$$ with $a_1= 1/2$, and $$a_n = {{(2n-3)!}\over {2^{2n-2} n! (n-2)!}},\qquad n=2,3,\ldots$$ This power series converges absolutely and uniformly to $(1-z)^{1/2}$ in $|z|\le 1$. It is easily checked that the radius of convergence of this power series is 1. An application of Stirling's formula shows that $$a_n = {1\over {2\sqrt{\pi}}}{1\over {n^{3/2}}}(1+ o(1))$$ so that the series also has the correct convergence properties for $|z|=1$. A different proof of this same fact may be found in Todd [1961, p.~11]. This finishes Lebesgue's proof. An alternative argument (see Ostrowski [1951, p.~168] or Feinerman, Newman [1974, p.~5]) gets around the more delicate analysis at $|z|=1$ by noting that $(1-z)^{1/2}$ may be uniformly approximated on $[0,1]$ by $(1-\rho z)^{1/2}$ as $\rho\uparrow 1$. (In fact it is easily checked that for $0<\rho<1$ $$| (1- z)^{1/2} - (1-\rho z)^{1/2} | \le (1-\rho)^{1/2}$$ for all $z\in [0,1]$.) Now the power series for $(1-\rho z)^{1/2}$, namely $$(1-\rho z)^{1/2} = 1 -\sum_{n=1}^\infty a_n\rho^n z^n,$$ is absolutely and uniformly convergent in $|z|< \rho^{-1}$ and thus in $|z|\le 1$. Bourbaki [1949, p.~55] (see also Dieudonn\'e [1969, p.~137]) presents an ingenious argument to obtain a sequence of polynomials which uniformly approximate $|x|$. 
For $t\in [0,1]$ define a sequence of polynomials recursively as follows. Let $p_0(t)\equiv 0$ and $$p_{n+1}(t)= p_n(t) +{1\over 2} (t- p^2_n(t)),$$ $n=0,1,2,\ldots$. It is readily verified that for each fixed $t\in [0,1]$, $\{p_n(t)\}$ is an increasing sequence bounded above by $\sqrt{t}$. The former is a consequence of the latter which is proven as follows. Assume $0\le p_n(t) \le \sqrt{t}$. Then $$\eqalign{ \sqrt{t} - p_{n+1}(t) = & \sqrt{t} - p_{n}(t) -{1\over 2} (t-p_n^2(t))\cr =& (\sqrt{t} - p_{n}(t)) (1- {1\over 2}(\sqrt{t} + p_{n}(t)))\cr \ge & 0\cr}$$ since $\sqrt{t} + p_{n}(t) \le 2\sqrt{t} \le 2$ for $t\in [0,1]$. Thus for each $t\in [0,1]$ $$\lim_{n\to\infty} p_n(t) = p(t)$$ exists. Since $p(t)$ is nonnegative and satisfies $$p(t)= p(t) + {1\over 2} (t-p^2(t))$$ we have $p(t)=\sqrt{t}$. The $\{p_n\}$ are real-valued continuous functions (polynomials) which increase, and converge pointwise to a continuous function $p$. This implies that the convergence is uniform (Dini's theorem). Let $q_n(x)=p_n(x^2)$ for $x\in [-1,1]$. Then the polynomials $\{q_n\}$ converge uniformly to $\sqrt{x^2}= |x|$ on $[-1,1]$. A similar and equivalent proof may be found in Sz.-Nagy [1965, p.~77]. (Sz.-Nagy attributes his procedure to C.~Visser.) \medskip\noindent {\bf Mittag-Leffler.} The proof by Mittag-Leffler as given in Mittag-Leffler [1900] is the following. He also considers the $g$ as given in (3.1), and sets $$\rchi_n(x)= 1- 2^{1-(1+x)^n}.$$ It is easily checked that $$\lim_{n\to \infty} \rchi_n(x) =\cases{ 1,& $0<x\le 1$\cr 0,& $x=0$\cr -1,& $-1\le x<0$\cr}.$$ Since $\rchi_{n+1}(x)> \rchi_n(x)$ for $x\in (0,1]$, while $\rchi_{n+1}(x)< \rchi_n(x)$ for $x\in (-1,0)$, it follows that given $\del>0$, small, the function $\rchi_n$ uniformly converges to 1 on $[\del,1]$ and to $-1$ on $[-1,-\del]$. Thus the functions $$h_n = {{\rchi_n+1}\over 2}$$ are bounded on $[-1,1]$ and uniformly approximate the function $h$ of (3.1) on $[-1,-\del] \union [\del,1]$ for any given $\del$. 
Furthermore the $\rchi_n$ and thus the $h_n$ are entire (analytic) functions. As previously, since $g_{i+1} - g_i$ is a linear polynomial vanishing at $x_i$, a short calculation verifies that for each $x_i\in (0,1)$ $$\left[ g_{i+1}(x) - g_i(x)\right] h_n(x-x_i)$$ uniformly converges to $$\left[ g_{i+1}(x) - g_i(x)\right] h(x-x_i)$$ on $[0,1]$. Replacing the $h$ in (3.1) by $h_n$ we obtain a series of functions $\{H_n\}$ that uniformly approximate $g$. Finally, since $h_n$ is an entire function, each of the functions $H_n$ is an entire function. As such they may be approximated arbitrarily well by a truncation of their power series. This again proves Weierstrass' theorem. \medskip\noindent {\bf Fej\'er.} L.~Fej\'er (1880--1959) was a student of H.~A.~Schwarz. What we will report on here is taken from Fej\'er [1900] (he had just turned 20 when the paper appeared). This fundamental paper formed the basis for Fej\'er's doctoral thesis obtained in 1902 from the University of Budapest. The paper contains what is today described as the ``classic'' theorem on Ces\`aro ($C,1$) summability of Fourier series. As we are interested in {\W}' theorem, we will restrict ourselves, a priori, to $f\in \tilC[0,2\pi]$, and prove that the Ces\`aro sum of the Fourier series of any such $f$ converges uniformly to $f$. Note that this is the first proof of {\W}' theorem (in the trigonometric polynomial case) that actually provides, by a linear process, a sequence of easily calculated approximants. Let $\sig_0(x)=1/2$, and $$\sig_m(x) = {1\over 2} + \cos x +\cos 2x + \cdots + \cos mx$$ for $m=1,2,\ldots\,$. 
Set $$G_n(x) = {{\sig_0(x)+\cdots + \sig_{n-1}(x)}\over n}.$$ A calculation shows that $$G_n(x) = {1\over {2n}} {{1-\cos nx}\over {1-\cos x}} = {1\over {2n}} \left[ {{\sin\left({{nx}\over 2}\right)}\over {\sin\left({{x}\over 2}\right)}}\right]^2.$$ Furthermore it is easily seen that $${1\over \pi} \int_0^{2\pi} G_n(x)\dd x = 1.$$ $G_n$ is a nonnegative kernel that integrates to 1 (and, as we shall show approaches the Dirac-Delta function at $0$ as $n$ tends to infinity, i.e., convolution against $G_n$ approaches the identity operator). Assume $f\in \tilC[0,2\pi]$. Let $${{a_0}\over 2} + \sum_{k=1}^\infty a_k \cos kx + b_k \sin kx$$ denote the Fourier series of $f$. Let $s_0(x) = a_0/2$, and $$s_m(x) ={{a_0}\over 2} + \sum_{k=1}^m a_k \cos kx + b_k \sin kx$$ denote the partial sums of the Fourier series of $f$. The functions $s_m$ do not necessarily converge uniformly, or pointwise, to $f$ as $m\to\infty$. This is a well-known result of du Bois-Reymond [1876]. However let us now set $$S_n(x) ={{s_0(x)+\cdots + s_{n-1}(x)}\over n} = {1\over \pi} \int_0^{2\pi} f(y) G_n(y-x) \dd y .$$ Explicitly the $S_n$ are given by $$S_n(x) = {{a_0}\over 2} + \sum_{k=1}^{n-1} \left(1 - {k\over n}\right) \left[a_k \cos kx + b_k \sin kx\right].$$ Surprisingly, the $S_n$ always converge uniformly to $f$. \proclaim Theorem 5. For each $f\in \tilC[0,2\pi]$, the trigonometric polynomials $S_n$ converge uniformly to $f$ as $n\to\infty$. \pf From the above $$S_n(x) = {1\over \pi} \int_0^{2\pi} f(y) G_n(y-x) \dd y = {1\over {2n\pi}} \int_0^{2\pi} f(y) {{1-\cos n(y-x)}\over {1-\cos (y-x)}}\dd y .$$ Since $f\in\tilC[0,2\pi]$, $f$ may be considered to be uniformly continuous on all of $\RR$. Thus given $\eps>0$ there exists a $\del>0$ such that if $|x-y|<\del$, then $$|f(x)-f(y)|< {\eps\over 2} .$$ In what follows we assume $\del< \pi/2$. 
Since $G_n$ integrates to 1 we have $$S_n(x)-f(x)= {1\over {\pi}} \int_0^{2\pi} [f(y)-f(x)] G_n(y-x) \dd y $$ $$= {1\over {\pi}} \int_{|y-x|<\del} [f(y)-f(x)] G_n(y-x) \dd y + {1\over {\pi}} \int_{\del\le |y-x|\le \pi} [f(y)-f(x)] G_n(y-x)\dd y.$$ We estimate each of the above two integrals. On $|y-x|<\del$ we have $|f(x)-f(y)|< {\eps\over 2}$. Thus $$\left| {1\over {\pi}} \int_{|y-x|<\del} [f(y)-f(x)] G_n(y-x) \dd y\right| < {\eps\over 2} {1\over {\pi}} \int_{|y-x|<\del} G_n(y-x) \dd y$$ $$< {\eps\over 2} {1\over {\pi}} \int_0^{2\pi} G_n(y-x) \dd y= {\eps\over 2}.$$ We have here used the crucial fact that $G_n$ is nonnegative and integrates to 1 over any interval of length $2\pi$. From the explicit form of $G_n$ and the inequality $|f(y)-f(x)|\le 2\|f\|$ we have $$\left| {1\over {\pi}} \int_{\del\le |y-x|\le \pi} [f(y)-f(x)] G_n(y-x) \dd y\right| \le {{2\|f\|}\over {2n\pi}} \int_{\del\le |y-x|\le \pi} {{1-\cos n(y-x)}\over {1-\cos (y-x)}}\dd y.$$ Now $|1-\cos n(y-x)|\le 2$, while on $\del\le |y-x|\le \pi$ we have $1-\cos(y-x) \ge 1-\cos \del$. Thus $$\left| {1\over {\pi}} \int_{\del\le |y-x|\le \pi} [f(y)-f(x)] G_n(y-x) \dd y\right| \le {{2\|f\|}\over {2n\pi}}{2\over {1-\cos \del}}2\pi= {{4\|f\|}\over {n(1-\cos\del)}}.$$ For $n$ sufficiently large $${{4\|f\|}\over {n(1-\cos\del)}}<{\eps\over 2}.$$ Thus for such $n$ $$|S_n(x)-f(x)|<\eps.\meop$$ Applying the method of the (second) proof of Proposition 3 to the above we see that to each $f\in C[-1,1]$ we may obtain a sequence of algebraic polynomials $$p_n(x) = {{a_0}\over 2} + \sum_{k=1}^{n-1} \left(1 - {k\over n}\right) a_k T_k(x)$$ where $$a_k = {2\over \pi} \int^1_{-1} {{f(x)T_k(x)}\over {\sqrt{1-x^2}}}\dd x,$$ $k=0,1,\ldots$. These explicitly defined $p_n$ (each of degree at most $n-1$) uniformly approximate $f$. \medskip\noindent {\bf Lerch II.} The paper Lerch [1903] contains yet another proof of the density of algebraic polynomials in $C[0,1]$. 
In his previous proof, in Lerch [1892], Lerch had used general properties of Fourier series to prove the {\W} theorem for algebraic polynomials. His proof here is different in that while the same general scheme is used, he only needs to consider the Fourier series of two specific functions, and their properties. In this sense it is more elementary than his previous proof. We recall from Lerch [1892] that it suffices to be able to arbitrarily approximate the polygonal line $g$ as given in (3.1). Lerch rewrites (3.1) in the form $$g(x) = \sum_{i=1}^m \ell_i(x)$$ where $$\ell_i(x) =\cases{ 0, & $x< x_{i-1}$\cr y_{i-1} + \left({{x-x_{i-1}}\over { x_i-x_{i-1}}}\right) (y_i-y_{i-1}), & $x_{i-1}\le x < x_{i}$\cr 0, & $x_{i}\le x$\cr}$$ (when defining $\ell_m$ we should, for precision, define it to equal $y_m$ at $x_m=1$). As we mentioned, Lerch bases his proof on quite explicit Fourier series. It is well known and easily checked that $${1\over 2} - x = \sum_{n=1}^\infty {{\sin 2n\pi x}\over {n\pi}},\qquad 0 0$. Since $f$ is uniformly continuous on $[0,1]$ there exists a $\del>0$ such that if $x,y\in [0,1]$ satisfies $|x-y|<\del$, then $$|f(x)-f(y)|<\eps/3.$$ Assume $0<\del< \min\{a,1-b\}$. Choose $N$ so that for all $n\ge N$ $$2\|f\| \sqrt{n} (1-\del^2)^n \left(1-{1\over n}\right)^{-n} < \eps/3.$$ For every $x\in [a,b]$, $$|p_n(x)-f(x)| = \left|{1\over {k_n}} \int_0^1 f(y) \left[1 -(x-y)^2\right]^n \dd y -f(x)\right|$$ $$\le {1\over {k_n}} \int_0^1 |f(y)-f(x)| \left[1 -(x-y)^2\right]^n \dd y + |f(x)| \left| 1 - {1\over {k_n}} \int_0^1 \left[1 -(x-y)^2\right]^n \dd y \right|.$$ We bound the integral $${1\over {k_n}} \int_0^1 |f(y)-f(x)| \left[1 -(x-y)^2\right]^n \dd y$$ by considering separately integration over $\{y: |x-y|<\del\}$ and over $\{ y:\del\le |x-y|\}$ for $y\in [0,1]$. 
Now $${1\over {k_n}} \int_{|x-y|<\del} |f(y)-f(x)| \left[1 -(x-y)^2\right]^n \dd y$$ $$\phantom{1234}< {\eps\over 3}{1\over {k_n}} \int_{|x-y|<\del} \left[1 -(x-y)^2\right]^n \dd y < {\eps\over 3}.$$ Furthermore $${1\over {k_n}} \int_{{\del\le |x-y|}\atop {0\le y\le 1}} |f(y)-f(x)| \left[1 -(x-y)^2\right]^n \dd y \le {{2\|f\|}\over {k_n}} \int_{\del\le |u|\le 1} [1 -u^2]^n \dd u $$ $$\le 2\|f\| \sqrt{n} (1-\del^2)^n \left(1-{1\over n}\right)^{-n} < \eps/3.$$ Finally $$\!\!\! |f(x)| \left| 1 - {1\over {k_n}} \int_0^1 \left[1 -(x-y)^2\right]^n \dd y \right|$$ $$\phantom{1234} \le {\|f\|\over {k_n}} \left| \int_{-1}^1 [1 -u^2]^n \dd u - \int_{-x}^{1-x} [1 -u^2]^n \dd u \right|.$$ Since $x\in [a,b]$ and $ \del< \min\{a,1-b\}$, we have $${\|f\|\over {k_n}} \left| \int_{-1}^1 [1 -u^2]^n \dd u - \int_{-x}^{1-x} [1 -u^2]^n \dd u \right| \le {\|f\|\over {k_n}} \int_{\del\le |u|\le 1} [1 -u^2]^n \dd u$$ $$\le \|f\| \sqrt{n} (1-\del^2)^n \left(1-{1\over n}\right)^{-n} < \eps/3.$$ This proves the result.\eop \medskip For completeness and as a matter of interest, it easily follows from integration by parts that $$k_n= \int_{-1}^1 [1 -u^2]^n \dd u = {{2^{2n+1} (n!)^2}\over {(2n+1)!}}.$$ Applying Stirling's formula it may be shown that $$\lim_{n\to\infty} \sqrt{n} k_n = \sqrt{\pi}.$$ The following is a variation on and simplification of Landau's proof. It is due to Jackson [1934]. As above, assume $f\in C[a,b]$ with $00$ there exists a $\del>0$ such that $$|x-y|<\del$$ implies $$|f(x)-f(y)|<{\eps \over 2}$$ for all $x,y\in [0,1]$. 
Set $${\overline f}(x) = \max \{ f(y): y\in [x-\del,x+\del]\cap [0,1]\}$$ and $${\underline f}(x) = \min \{ f(y): y\in [x-\del,x+\del]\cap [0,1]\}.$$ Thus for each $x\in [0,1]$ $$0\le {\overline f}(x)- f(x) < {\eps\over 2},$$ and $$0\le f(x) - {\underline f}(x) < {\eps\over 2}.$$ For fixed $\del>0$ as above, set $$\eta_n(x) = \sum_{\{m: |x-(m/n)|>\del\}} {n \choose m} x^m (1-x)^{n-m}.$$ From the decomposition $$B_n(x) = \sum_{m=0}^n f\left({m \over n}\right) {n \choose m} x^m (1-x)^{n-m}$$ $$\!\!\!\! =\sum_{\{m: |x-(m/n)|\le \del\}} f\left({m \over n}\right) {n \choose m} x^m (1-x)^{n-m} $$ $$\phantom{12345}+ \sum_{\{m: |x-(m/n)|>\del\}} f\left({m \over n}\right) {n \choose m} x^m (1-x)^{n-m},$$ it easily follows that $${\underline f}(x)[1-\eta_n(x)] - \|f\|\eta_n(x) \le B_n(x) \le {\overline f}(x)[1-\eta_n(x)] + \|f\|\eta_n(x).$$ Bernstein then states that according to Bernoulli's theorem there exists an $N$ such that for all $n>N$ and all $x\in [0,1]$ we have $$\eta_n(x)< {\eps \over {4\|f\|}}.$$ Thus as a consequence of $$f(x) + [{\underline f}(x)- f(x)] -\eta_n(x)[\|f\| + {\underline f}(x)] \le B_n(x)$$ and $$ B_n(x) \le f(x) +[{\overline f}(x)-f(x)] +\eta_n(x)[\|f\|- {\overline f}(x)],$$ we obtain $$f(x) - {\eps\over 2} - {\eps\over {4\|f\|}} 2\|f\| < B_n(x) < f(x) + {\eps\over 2} + {\eps\over {4\|f\|}}2\|f\|,$$ which gives $$|B_n(x)-f(x)|<\eps$$ for all $x\in [0,1]$. For completeness we now verify Bernstein's statement regarding $\eta_n(x)$. (For a probabilistic explanation of this quantity and estimate, see e.~g.~Levasseur [1984].) 
To this end confirm that $$\sum_{m=0}^n {n \choose m} x^m (1-x)^{n-m} = 1$$ $$\sum_{m=0}^n {m\over n} {n \choose m} x^m (1-x)^{n-m} = x$$ and $$\sum_{m=0}^n {{m^2}\over {n^2}} {n \choose m} x^m (1-x)^{n-m} = x^2 + {{x(1-x)}\over n}.$$ Then $$\eqalign{\eta_n(x) & = \sum_{\{m: |x-(m/n)|>\del\}} {n \choose m} x^m (1-x)^{n-m}\cr & \le \sum_{\{m: |x-(m/n)|>\del\}} \left({{x- {m\over n}}\over \del}\right)^2 {n \choose m} x^m (1-x)^{n-m}\cr & \le {1\over {\del^2}}\sum_{m=0}^n \left(x- {m\over n}\right)^2 {n \choose m} x^m (1-x)^{n-m}\cr & = {1\over {\del^2}} \left[ x^2 - 2x\cdot x + x^2 + {{x(1-x)}\over n} \right]\cr & = {{x(1-x)}\over {n\del^2}}\cr & \le {1\over {4n\del^2}}.\cr}$$ for all $x\in [0,1]$. Thus for each fixed $\del>0$ we can in fact choose $N$ such that for all $n\ge N$ and all $x\in [0,1]$ $$\eta_n(x) < {\eps \over {4\|f\|}}.$$ This ends Bernstein's proof. \medskip Bernstein's proof is beautiful and elegant! It constructs in a simple, linear (but unexpected) manner a sequence of approximating polynomials depending explicitly on the values of $f$ at rational points. No further information regarding $f$ is used. This was not the first attempt to find a proof of the Weierstrass theorem using a suitable partition of unity. In Borel [1905, p.~79--82], which seems to have been the first textbook devoted mainly to approximation theory, we find the following formula for constructing a sequence of polynomials approximating every $f\in C[0,1]$. E.~Borel (1871--1956) proved that the sequence of polynomials $$p_n(x) = \sum_{m=0}^n f\left({m\over n}\right) q_{n,m}(x)$$ uniformly approximates $f$ where the $q_{n,m}$ are fixed polynomials independent of $f$. His $q_{n,m}$ are constructed as follows. Set $$g_{n,m}(x) =\cases{0, & $\left|x- {m\over n}\right| > {1\over n}$\cr nx-(m-1), & ${{m-1}\over n} \le x\le {m\over n}$\cr -nx+(m+1), & ${{m}\over n} \le x\le {{m+1}\over n}.$\cr}$$ Note that the $g_{n,m}$ are non-negative, sum to 1, and $g_{n,m}(m/n)=1$. 
Let (by the {\W} theorem) $q_{n,m}$ be any polynomial satisfying $$|g_{n,m}(x)-q_{n,m}(x)| < {1\over {n^2}}$$ for all $x\in [0,1]$. It is now not difficult to verify that the $p_n$ do approximate $f$. However the Bernstein polynomials are so much more satisfying in so many ways. \medskip\noindent {\bf Kuhn's Proof.} There are many elegant and simple proofs of {\W}' theorem. But perhaps the most elementary proof (of which we are aware) is the following due to Kuhn [1964]. Kuhn's proof uses one basic inequality, namely Bernoulli's inequality $$(1+h)^n \ge 1+ nh$$ which is valid for $h\ge -1$ and $n\in \NN$. We present Kuhn's proof except that we save a step by recalling (see (3.1)) that we need only approximate continuous polygonal lines which we can write as $$g(x) = g_1(x) +\sum_{i=1}^{m-1} [ g_{i+1}(x)-g_i(x)]h(x-x_i)$$ where the $0=x_0<x_1<\cdots<x_m=1$ are the knots of $g$, the $g_i$ are linear polynomials, and $$h(x)=\cases{ 1,& $x\ge 0$\cr 0,& $x<0$\cr}.$$ As in the proofs of Runge and Mittag-Leffler, it therefore suffices to exhibit a sequence of polynomials $\{p_n\}$ that is bounded on $[-1,1]$ and converges uniformly to $h$ on $[-1,-\del] \union [\del,1]$ for every $\del>0$. Kuhn simply writes down such a sequence of polynomials, namely $$p_n(x) =\left[ 1 - \left({{1-x}\over 2}\right)^n\right]^{2^n}.$$ (Note that the polynomials $\{x[2p_n(x)-1]\}$ uniformly converge to $|x|$ on $[-1,1]$. See Lebesgue's proof.) It is more convenient to consider the simpler $$q_n(x) = (1-x^n)^{2^n},$$ which is just a shift and rescale of $p_n$. On $[0,1]$ the $q_n$ are decreasing and satisfy $q_n(0)=1$, $q_n(1)=0$. The requisite facts concerning the $p_n$ therefore reduce to showing $$\lim_{n\to\infty} q_n(x) = \cases{1,& $0\le x < 1/2$ \cr 0, & $1/2<x\le 1$\cr}.$$ For $x\in [0,1/2)$ Bernoulli's inequality gives $$1\ge q_n(x) = (1-x^n)^{2^n} \ge 1-(2x)^n,$$ and as $2x<1$ it follows that $$\lim_{n\to\infty} q_n(x)=1.$$ For $x\in (1/2,1)$ we have, again by Bernoulli's inequality, $${1\over {q_n(x)}} = \left( 1+ {{x^n}\over {1-x^n}}\right)^{2^n} \ge 1 + {{(2x)^n}\over {1-x^n}} > (2x)^n$$ and thus $$0< q_n(x) < {1\over {(2x)^n}}.$$ As $2x>1$, it follows that $$\lim_{n\to\infty} q_n(x)=0.$$ The monotonicity of the $q_n$ implies that this approximation is appropriately uniform. This ends Kuhn's proof. 
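\medskip\noindent As a concrete check of the two regimes in Kuhn's argument (a worked instance of our own, not taken from Kuhn [1964]; the sample points $x=1/4$ and $x=3/4$ are arbitrary), Bernoulli's inequality with $h=-4^{-n}$ gives $$q_n(1/4) = (1-4^{-n})^{2^n} \ge 1 - 2^n 4^{-n} = 1-2^{-n},$$ while with $h=(3/4)^n/(1-(3/4)^n)$ it gives $${1\over {q_n(3/4)}} = \left( 1+ {{(3/4)^n}\over {1-(3/4)^n}}\right)^{2^n} \ge 1 + {{(3/2)^n}\over {1-(3/4)^n}} > \left({3\over 2}\right)^n,$$ so that $0<q_n(3/4)<(2/3)^n$. For $n=4$ these bounds read $q_4(1/4)\ge 15/16$ and $q_4(3/4)< 16/81$.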
\References \refB Baillaud, B., Bourget, H.; Correspondance d'Hermite et de Stieltjes; Tome I, Gauthier-Villars (Paris); 1905 ; \refB Bell, E.~T.; Men of Mathematics; Scientific Book Club (London); 1936; \refJ Bernstein, S.~N.; D\'emonstration du th\'eor\`eme de Weierstrass fond\'ee sur le calcul des probabilit\'es; Comm.\ Soc.\ Math.\ Kharkow; 13; 1912/13; 1--2; Also appears in Russian translation in Bernstein's Collected Works. \refJ du Bois-Reymond, P.; Untersuchungen \"uber die Convergenz und Divergenz der Fourierschen Darstellungsformeln; Abhandlungen der Mathematisch-Physicalischen Classe der K.\ Bayerische Akademie der Wissenshaften; 12; 1876; 1--13; \refB Borel, \'E.; Lecons sur les Fonctions de Variables R\'eelles et les D\'eveloppe\-ments en S\'eries de Polynomes; Gauthier-Villars (Paris); 1905; (2nd edition, 1928). \refB Bourbaki, N.; Topologie G\'en\'erale (Livre III). Espaces Fonctionnels Dictionnaire (Chapitre X); Hermann \& Cie (Paris); 1949; \refJ Butzer, P.~L., Nessel, R.~J.; Aspects of de la Vall\'ee Poussin's work in approximation and its influence; Archive Hist.\ Exact Sciences; 46; 1993; 67--95; \refQ Butzer, P.~L., Stark, E.~L.; The singular integral of Landau alias the Landau polynomials - Placement and impact of Landau's article ``\"Uber die Approximation einer stetigen Funktion durch eine ganze rationale Funktion''; (Edmund Landau, Collected Works, Volume 3), P.~T.~Bateman, L.~Mirsky, H.~L.~Montgomery, W.~Schall, I.~J.~Schoenberg, W.~Schwarz, H.~Wefelscheid (eds.), Thales-Verlag (Essen); 1986; 83--111; \refD Cakon, R.; Alternative Proofs of Weierstrass Theorem of Approximation: An Expository Paper; Master's Thesis, Department of Mathematics, The Pennsylvania State University; 1987; \refB Dieudonn\'e, J.; Foundations of Modern Analysis; Academic Press (New York); 1969; \refJ Dirichlet, L.; Sur la convergence des s\'eries trigonom\'etriques qui servent \`a repr\'esenter une fonction arbitraire entre des limites donn\'ees; J.\ f\"ur Reine und 
Angewandte Math.; 4; 1829; 157--169; \refB Feinerman, R.~P., Newman, D.~J.; Polynomial Approximation; Will\-iams and Wilkins Co. (Baltimore); 1974; \refJ Fej\'er, L.; Sur les fonctions born\'ees et int\'egrables; Comptes Rendus Acad.\ Sci.\ Paris; 131; 1900; 984--987; \refJ Gray, J.~D.; The shaping of the Riesz representation theorem: A chapter in the history of analysis; Arch.\ Hist.\ Exact Sciences; 31; 1984; 127--187; \refJ Jackson, D.; A proof of Weierstrass's theorem; Amer.\ Math.\ Monthly; 41; 1934; 309--312; \refJ Kuhn, H.; Ein elementarer Beweis des Weierstrassschen Approximationssatzes; Arch.\ Math.; 15; 1964; 316--317; \refJ Landau, E.; \"Uber die Approximation einer stetigen Funktion durch eine ganze rationale Funktion; Rend.\ Circ.\ Mat.\ Palermo; 25; 1908; 337--345; \refJ Lebesgue, H.; Sur l'approximation des fonctions; Bull.\ Sciences Math.; 22; 1898; 278--287; \refJ Lebesgue, H.; Sur la repr\'esentation approch\'ee des fonctions; Rend.\ Circ.\ Mat.\ Palermo; 26; 1908; 325--328; \refJ Lebesgue, H.; Sur les int\'egrales singuli\`eres; Ann.\ Fac.\ Sci.\ Univ.\ Toulouse; 1; 1909; 25--117; \refJ Lerch, M.; O hlavni vete theorie funkci vytvorujicich (On the main theorem on generating functions); Rozpravy Ceske Akademie v.~Praze; 1; 1892; 681--685; \refJ Lerch, M.; Sur un point de la th\'eorie des fonctions g\'en\'eratrices d'Abel; Acta Math.; 27; 1903; 339--351; \refJ Levasseur, K.~M.; A probabilistic proof of the Weierstrass approximation theorem; Amer.\ Math.\ Monthly; 91; 1984; 249--250; \refX MacTutor. 
[2004] {\tt http://www-groups.dcs.st-and.ac.uk/$\sim$history} \refJ Mittag-Leffler, G.; Sur la repr\'esentation analytique des functions d'une variable r\'eelle; Rend.\ Circ.\ Mat.\ Palermo; 14; 1900; 217--224; \refB Natanson, I.~P.; Constructive Function Theory, Volume I; Frederick Ungar (New York); 1964; \refB Ostrowski, A.; Vorlesungen \"uber Differential-und Integralrechnung, Volume II; Birkh\"auser (Zurich); 1951; \refJ Picard, E.; Sur la repr\'esentation approch\'ee des fonctions; Comptes Rendus Acad.\ Sci.\ Paris; 112; 1891a; 183--186; \refB Picard, E.; Trait\'e D'Analyse; Tome I, Gauthier-Villars (Paris); 1891b; (Many subsequent editions followed). \refQ Picard, E.; Lectures on Mathematics; (Clark University 1880-1899 Decennial Celebration), W.~E.~Story and L.~N.~Wilson (eds.), Norwood Press (Norwood, Mass.); 1899; 207--259; \refJ Pinkus, A.; Weierstrass and Approximation Theory; J.\ Approx.\ Theory; 107; 2000; 1-66; \refB Rivlin, T.~J.; The Chebyshev Polynomial; John Wiley (New York); 1974; \refJ Runge, C.; Zur Theorie der eindeutigen analytischen Functionen; Acta Math.; 6; 1885; 229--244; \refJ Runge, C.; \"Uber die Darstellung willk\"urlicher Functionen; Acta Math.; 7; 1885/86; 387--392; \refJ Schwarz, H.~A.; Zur Integration der partiellen Differentialgleichung $\partial^2 u/ \partial x^2$ $+ \partial^2 u/ \partial y^2= 0$; J.\ f\"ur Reine und Angewandte Math.; 74; 1871; 218--253; \refJ Siegmund-Schultze, R.; Der Beweis des Weierstrasschen Approximationssatzes 1885 vor dem Hintergrund der Entwicklung der Fourieranalysis; Historia Math.; 15; 1988; 299--310; \refJ Skrasek, J.; Le centenaire de la naissance de Matyas Lerch; Czech.\ Math.\ J.; 10; 1960; 631--635; \refQ Stark, E.~L.; Bernstein-Polynome, 1912--1955; (Functional Analysis and Approximation), P.~L.~Butzer, B.~Sz.-Nagy, and E.~G\"orlich (eds.), ISNM 60, Birkh\"auser (Basel); 1981; 443--461; \refB Sz.-Nagy, B.; Introduction to Real Functions and Orthogonal Expansions; Oxford Univ.~Press (New York); 
1965; \refB Todd, J.; Introduction to the Constructive Theory of Functions; CalTech Lecture Notes (); 1961; \refJ de la Vall\'ee Poussin, Ch.~J.; Sur l'approximation des fonctions d'une variable r\'eelle et leurs d\'eriv\'ees par des polynomes et des suites limit\'ees de Fourier; Bull.\ Acad.\ Royale Belgique; 3; 1908; 193--254; \refJ de la Vall\'ee Poussin, Ch.~J.; L'approximation des fonctions d'une variable r\'eelle; L'Enseign.\ Math.; 20 ; 1918 ; 5--29; \refB de la Vall\'ee Poussin, Ch.~J.; Le\c cons sur L'Approximation des Fonctions d'une Variable R\'eelle ; Gauthier-Villars (Paris); 1919; Also in ``L'Approximation'', Chelsea, New York, 1970. \refJ Volterra, V.; Sul principio di Dirichlet; Rend.\ Circ.\ Mat.\ Palermo; 11; 1897; 83--86; \refJ Weierstrass, K.; \"Uber die analytische Darstellbarkeit sogenannter will\-k\"ur\-li\-cher Functionen einer reellen Ver\"anderlichen; Sitzungsberichte der Akademie zu Berlin; ; 1885; 633--639 and 789--805; (This appeared in two parts. An expanded version of this paper with ten additional pages also appeared in Weierstrass' {\sl Mathematische Werke}, {\bf Vol.~3}, 1--37, Mayer \& M\"uller, Berlin, 1903.) \refJ Weierstrass, K.; Sur la possibilit\'e d'une repr\'esentation analytique des fonctions dites arbitraires d'une variable r\'eelle; J.\ Math.\ Pure et Appl.; 2; 1886; 105--113 and 115-138; (This is a translation of Weierstrass [1885] and, as the original, it appeared in two parts and in subsequent issues, but under the same title. This journal was, at the time, called {\sl Journal de Liouville}) { \bigskip\obeylines Allan Pinkus Department of Mathematics Technion, I.~I.~T. Haifa, 32000 Israel {\tt pinkus@tx.technion.ac.il} {\tt http://www.math.technion.ac.il/\~{}pinkus} } \end .