You are currently browsing the category archive for the ‘Random matrix theory’ category.

During the past week, while at Hawaii for the IEEE 802.11 interim, I happened to glance this NY times article. The story is about a New Hampshire professor Yitang Zhang coming out with a recent proof on establishing a finite bound on the gap between prime numbers.  While browsing the details, there are more details emerging as several pieces of blog and articles and reviews are being written (and some are still being written). Now, looks like the claim is more or less accepted by pundits in the field, and termed as a beautiful mathematical breakthrough. As an outsider sitting with curiosity, I feel nice to scratch the surface of this new finding.

The subject surrounding this news is number theory, prime numbers to be precise. The question of interest is on the gap between adjacent prime numbers. We know that $2$ and $3$ are prime with a gap of $1$, but this is truly a special case and unique per definition. The gap between $3$ and $5$ is 2. Similarly $5$ and $7$ differ by $2$. One may have thought that, the gap between successive primes go up as we flee along the number line. Not quite. For example, we can see that there are a lot of pairs with a gap of 2.  The easy ones are (3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), (71, 73), (101, 103), (107, 109), (137, 139) and the list goes on. It was conjectured that there are infinitely many such pairs, but the proof of that is not quite as easy as yet! It is known that there are precisely $808,675,888,577,436$ below $10^{18}$, but infinity is still a lot far from $10^{18}$! An interesting quest was to really prove that there are infinitely many twin primes, but this still remain as an open conjecture.

Now the new discovery by Zhang is not quite proving the twin conjecture, but a close relative of that. Twin conjectures are strictly about prime pairs separated by $2$. A related question is, how about prime pairs $p$ and $q$ which are separated by $k$ where $k$ could be a finite number. When $k=2$, then we have the  special case of the classical twin prime case. Can we at least prove mathematically that there exists infinitely many primes such as $(p,q=p+k)$ for some $k$. If so,  what is the smallest $k$ where this holds true? Zhang now has a proof that for $k$ as small as $70$ million. Mathematically, if we denote $p(n)$ is the $n$th prime, then the new claim says (stated crisply in the paper abstract),

$\lim_{n \to \infty} {} \left(p_{n+1}-p_{n}\right) <70 \times 10^{6}.$

$70$ million is still a large gap, but as Dorian Goldfeld says, is still finite and nothing compared to infinity! In future, it is not unlikely that  we may get to see this gap coming down and perhaps to the best case of $k=2$. Who knows?

The result is still interesting, even to general interesting folks like us. This kind of says that, the gap between prime numbers is worst case bounded by a finite number. If we really plot the prime numbers, then we will see a saturation like behavior!  Like many other things at asymptotic (for example, the eigenvalues of a large random matrices exhibit very interesting properties, when the size goes to infinity), things at infinity may exhibit some charm, after all!

The  paper is accessible here, but as expected the proof is hard (for me at least). Hopefully we will have some diluted explanation of its essence from experts in the coming days. Already,Terrence Tao had this sktech, a couple of weeks ago on his google+ feed. Over the last week or so, finer details on the new break through are still emerging. Terry Tao also has initiated an online Wiki collaboration in an effort to improve upon from this work (For experts that is, not for me!).

Congratulations Professor Zhang.

On a lighter note, here is a very interesting blog piece, bringing some sort of similarity to the giant component behaviour on a pile of staple pins. This is sort of a layman view of the Erdos Renyi explanation.

In one version of the Erdős-Rényi process, one start with a set of $n$ isolated vertices and then keep adding random edges one at a time;  More specifically, at each stage one choose two vertices at random from among all pairs that are not already connected, then draw an edge between them. It turns out there’s a dramatic change in the nature of the graph when the number of edges reaches $n/2$ . Below this threshold, the graph consists of many small, isolated components; above $n/2$, the fragments coalesce into one giant component that includes almost all the vertices. “The Birth of the Giant Component” was later described in greater detail in an even bigger paper–it filled an entire issue of Random Structures and Algorithms (1993, 4:233–358)–by Svante Janson, Donald E. Knuth, Tomasz Luczak and Boris Pittel.

Ofcourse Joel Spencer has written a beautiful article on AMS commemurating the golden anniversity of the giant component.

Some interesting tips on binary random matrices again. Kolchin’s book also discusses some aspects of this (see Page 126 for example). It is interesting to see the asymptotic behaviour.

The probability that a random matrix ${k \times (k+m)}$ binary matrix with each element chosen uniformly (${0}$ and ${1}$ picked equally likely), is of full rank ${ k}$ for ${m \ge 0}$ is

$\displaystyle\rho =\prod_{i=m+1}^{\infty}{\left(1-\frac{1}{2^{i}}\right)}, \quad m=0,1,2,\ldots\ \ \ \ \ (1)$

The probability that a random matrix $k \times (k+m)$ binary matrix with each element chosen uniformly (${0}$ and $1$ picked equally likely), is of full rank $k+m$  for $m < 0$ is

$\displaystyle\rho =\prod_{i=1}^{k+m}{\left(1-\frac{1}{2^{k-i+1}}\right)}, \quad m=-1,-2,\ldots\ \ \ \ \ (2)$

Glanced upon a paper from the 2009 ISIT, titled “LT Codes Decoding: Design and Analysis” by Feng Lu et al. The authors discusses LT decoding strategy which is Peeling decoder (conventional simple LT decoding) followed by an extension based on the well known Widemann algorithm solving a sparse homogeneous linear system. Here is a brief summary as I understood.

On packet erasure channels (internet being one close realization of a model of this kind, where, the TCP/IP suit, one can either gets packets correctly delivered or lost in the wild), Luby Transform (LT) codes forms an easy way to transfer information.  It is well known that, asymptotically (When the number of information symbols  (or packets)  $k$ and the number of packets received $r$ are large) near full recovery is made possible by simple peeling decoder which grows in complextity at $\mathcal{O}(k)$. The claim is that, for $n >10000$  or so, with $r= k(1+\delta)$ , on average  we can, with high probability, recover all of  $k$ information packets  by simple unwrapping of xor-ed packets. A more precise statement will read as: If from $k$ information packets, each coded packet constructed independently by xor operation (the number of packets taken for xor operation follows a distribution; Plain vanila version of LT codes uses Soliton distribution), then the content of the original information source can be recovered from any $k+\mathcal{O} \left(\sqrt{k} \ln^{2} \left(\frac{k}{\epsilon}\right)\right)$ packets with probability $1-\epsilon$ by an average of $\mathcal{O} \left(k \ln \frac{k}{\epsilon}\right)$ symbol operations. However, when the number of input symbols is not large (say less, $n < 1000$ ) the overhead term $\delta$ is not small. It is not too surprising, since we need sufficiently large $n$ and $r$ to converge the performance in expectation.  One can think of solving the system of linear equations using Gaussian elimination and get the best figures, but the question mark stays on the complexity $\mathcal{O}(k^{3})$ of this method over the cheaper peeling decoder complexity $\mathcal{O}(k)$.

Conventional LT decoding based on Peeling decoding rule terminate, once the number of degree -1 packets (ripple as it is called in graph theory parlance) are void. One can hope to continue from there, by using Gaussian elimination and decode a a few more (if the graph matrix rank permits it). If the original graph is full rank, one can technically decode all of them.  Gauissian eliminatio being heavy, one hope to find an easy, preferably recursive method. Indeed there comes Wiedemann and that is what the authors use.  The authors propose what they call as full rank  decoding, which essentially is LT decoding, until there are no more degree one packets. From there, they borrow a packet and then uses Wiedemann algorithm which is somewhat light on computational complexity.

The larger aspect on whether this scheme do any better than conventional LT peeling decoder is the other question they answer. To achieve full recovery we need to have a full rank matrix. Now, the matrix being a sparse matrix defined over $\mathbb{F}_{2}^{k \times r}$ where $r$ is the number of received packets. The probability of the matrix having a full rank will directly help us to infer the decoding merits. This is an interesting piece since I had looked into the computation of the rank of a binary matrix and also the probability of a random matrix rank being equal to an arbitrary value. All this was done during the Modern Coding Theory doctoral course by Ruediger Urbanke at EPFL.  For binary random matrix, we can also setup a recursive formula and thereby a good inference on the probability of decoding can be arrived at.

An interesting bound (problem from Rudi’s  Modern coding theory book) regarding the chromatic number of a random graph.  I first attempted it during the course.  Here is the problem statement:

C.5 (Concentration of the Chromatic Number – Spencer and Shamir). Consider a graph $G$ on $n$ vertices.  The chromatic number of a graph $G$, denoted by $\chi(G)$, is the smallest number of colors needed to color all vertices so that no two vertices which are joined by an edge have the same color. Consider the standard ensemble of random graphs on $n$ vertices with parameter $p$: to sample from this ensemble, pick $n$ vertices and connect each of the $\binom{n}{2}$ ordered pairs of vertices independently from all other connections with probability $p$. Show that for this ensemble

$\mathbb{P}\left(\lvert \chi(G)-\mathbb{E}\left[\chi(G)\right] \rvert >\lambda \sqrt{n-1}\right) \le 2e^{\frac{-\lambda^2}{2}}$.

My solution is as folllows: (PDF typeset as single file is available here. Look for problem 3)

Let $G$ is a random graph with $n$ vertices.  If $p$ is the probability that a given edge is in $G$. The probability space be $G(n,p)$.  Let $\chi$ be a filter on the set of all such random graphs. We can define a Martingale as follows:

$X_{0}=\mathbb{E}[X(G)]$

$X_{i}=\mathbb{E}[X(G)\lvert 1_{1},1_{2},\ldots,1_{i}],\forall 1\le i\le \binom{n}{2}$

where

$1_{i}=\begin{cases}1,&\text{if edge} \quad e_{i}\in G\\ 0, &\text{if edge} \quad e_{i}\notin G\end{cases}$

and $\chi(G)$ is the Chromatic number of $G$. Chromatic number changes at most by one , when the information about the new edge comes in. Clearly, $\chi$ satisfies the conditions for Azuma’s inequality. $\{X_i\}_{i\ge 0}$

is a Martingale, with $\lvert X_i-X_0\rvert \le 1$).  Let $Z_i=X_i-E[X_i]$. Clearly

$E[Z_i]=E[X_i]-E[x_i]=0$

$Z_m=X_m-E[X_m]=\mathbb{E}[\chi(G)|1_1,1_2,\ldots,1_m]-\mathbb{E}[\chi(G)]$

$=\chi(G)-\mathbb{E}[\chi(G)]$

Now we can use the Azuma’s inequality on $\{Z_i\}$ to get,

$\mathbb{P}\left(\lvert Z_n-Z_0\rvert \ge \lambda \sqrt{n}\right)=\mathbb{P}\left(\lvert \chi(G)-\mathbb{E}[\chi(G)] \rvert \ge \lambda \sqrt{n}\right)\le 2e^{\frac{-\lambda^2}{2}}$.

Since $\mathbb{P}\left(\lvert \chi(G)-\mathbb{E}[\chi(G)] \rvert \ge \lambda \sqrt{n}\right)=\mathbb{P}\left(\lvert \chi(G)-\mathbb{E}[\chi(G)] \rvert > \lambda \sqrt{n-1}\right)$, the result

$\mathbb{P}\left(\lvert \chi(G)-\mathbb{E}\left[\chi(G)\right] \rvert >\lambda \sqrt{n-1}\right) \le 2e^{\frac{-\lambda^2}{2}}$

follows.

Here is an interesting riddle on random matrices.

(Rank of Random Binary Matrix). Let $R(l,m,k)$ denote the number of binary matrices of dimension $l \times m$ and rank $k$, so that by symmetry $R(l,m,k)=R(m,l,k)$.  This is a repost of the solution that I have arrived at (certainly not the first!) and submitted as part of a homework (9) problem from the doctoral course  Modern coding theory (by Rudiger Urbanke) at  EPFL.  The sumbitted solution in PDF is available here.

Rank of a matrix $G$ is essentially the number of nonzero rows when the matrix $G$ is expressed in echelon form. So, we just need to compute the ways these matrices can be created with $k$ non zero rows. Since the elements of the matrix are binary (from $\mathbb{F}_{q=2}$), we can simply do a counting.

It is trivial to compute $R(l,m,k)$ for $k=0$ and $k>l$. For  $k=0$, only all zero matrix possible, and only one such matrix exist. Hence $R(l,m,0)=1$. For  $l>k>0$, since  $k>\min(l,m)$, no matrix exist, which means $R(l,m,k)=0$ .

Now we consider $l=k>0$.  How many ways? We have $l=k$  non zero rows of the $l\times m$  matrix, which means all rows must be nonzero. Without loss of generality, for counting, we could assume that, the rows are ordered. The last row ($l^{th}$ row can be be done in $2^{m}-1$,  since there anything other than all $0$ vector (of size $m$) is allowed. On $(l-1)$-th row, anything other than that of row $l$ is allowed. There are $2^{m}-2$ ways here. $l-2$-th row can have anything except any linear combination of the rows $l$ and $l-1$. This is nothing but $2^m-\left({\binom{2}{0}+\binom{2}{1}+\binom{2}{2}}\right)=2^m-2^2$. Row $l-3$ then have $2^m-\left(\binom{3}{0}+\binom{3}{1}+\binom{3}{2}\right)=2^m-2^3$ and so on. In all, Following the same procedure, we can have a total of

$= \left(2^m-1\right) \left(2^m-2^1\right)\left(2^m-2^2\right)\ldots \left(2^m-2^{l-1}\right)$

$=\left(2^m-1\right) 2^{1} \left(2^{m-1}-1\right) 2^{2} \left(2^{m-2}-1\right) \ldots 2^{l-1}\left(2^{m-l+1}-1\right)$

$=2^{0} 2^{1} 2^{2} \ldots 2^{l-1}\left(2^m-1\right)\left(2^{m-1}-1\right)\left(2^{m-2}-1\right)\ldots\left(2^{m-l+1}-1\right)$

$=\prod_{i=0}^{l-1}{{2^i}\left(2^{m-i}-1\right)}$

$=\prod_{i=0}^{l-1}{\left(2^{m}-2^{i}\right)}$

$=\prod_{i=0}^{l-1}{2^m \left(1-2^{i-m}\right)}$

$=2^{ml} \prod_{i=0}^{l-1}{ \left(1-2^{i-m}\right)}$

ways.  For $l>k>0$, we can construct a rank $k$ matrix of size $l \times m$ in any of the following ways:

1.  Take a rank $k-1$ matrix of size $(l-1) \times m$ and add an independent row.
2.  Take a rank $k$ matrix of size $(l-1) \times m$ and add a dependent row.

For every $(l-1) \times m$ matrix,

$2^{m}-1+\binom{k-1}{1}+\binom{k-1}{2}+\ldots +\binom{k-1}{k-1}=\left(2^m-2^{k-1}\right)$

and hence,

$R(l-1,m,k-1) \left(2^m-2^{k-1}\right)= R_{1}(l,m,k)$

ways. (Essentially avoid all possible linear combinations of existing $k-1$ rows).  Using the second (item 2 above) method, we can have $1+\binom{k}{1}+\binom{k}{2}+\ldots +\binom{k}{k} = 2^k$ and

$R_{2}(l,m,k)= 2^k R(l-1,m,k)$

different ways a rank $k$ matrix can be formed. Where,the first term ($=1$) is when the all zero row is picked as the new row. In$\binom{k}{1}$ ways we can pick any one of the exisiting row as a dependent (new row). In general for $0\le j\le k$ we can have combination of $j$ existing rows  out of $k$ in $\binom{k}{j}$ different ways to make a dependent (new) row.

So using (1) and (2) we get,

$R(l,m,k)=2^k R(l-1,m,k)+\left(2^m-2^{k-1}\right)R(l-1,m,k-1)$

Putting everything together,

$R(l,m,k) = \begin{cases} 1, & k=0, \\2^{ml} \displaystyle \prod_{i=0}^{l-1}{ \left(1-2^{i-m}\right)} , & l=k>0 \\ 2^k R(l-1,m,k) + \left(2^m-2^{k-1}\right) R(l-1,m,k-1) &l>k>0 \\ 0 & k>l>0 \end{cases}$

 leeyoongu on LT codes decoding using Wiedem… vinod kumar on Yesudas and Rafi singing same… Ramesh on The split lot corner stor… sreelesh. on Yesudas and Rafi singing same… The new Prime gap |… on Improved bounds on Prime numbe…