You are currently browsing the category archive for the 'Coding theory' category.
The memorial service for Ralf Koetter held at UCSD is video archived. Quite many of the stalwarts in Information theory field found it difficult to control their emotions when they spoke about him. Such was the level of closeness many people had with him. I have never got to directly interact with Ralf , but was aware about his stellar contributions to many areas in and related to coding. The most notable thing other than his well known research contribution is his amazing teaching skills. The two guest lectures given by him during David Forney’s MIT class in 2005 were simply stunning. He then had talked about Reed Solomon codes and that is by far the best lucid presentation of such a difficult topic, that I have ever seen. His sudden and untimely demise leaves an irreplaceable void on this planet. He was that good. So woefully cut short by cancer.
Alex Vardy knitted down a fitting tribute to his friend and colleague.
Todays IPG seminar had Fritz Eisenbrand (the Disctete Opt chair, Math department EPFL) talking about Diameter of Polyhedra:Limits of Abstraction. I don’t think I followed the topic too well, but this is a share of what I understood.
The topic is about a convex geometric problem on the diameter of a polyhedra. The question of whether the diameter of a polyhedron is polynomial or not seemed to be a longstanding open problem. The largest diameter of a
dimensional polyhedron with
facets has known upper and lower bounds.
.
The lower bound is due to Klee and Walkup and upper bound to Kalai and Kleitman. These bounds also hold good for combinatorial abstractions of the 1-skeleton of non-degenerate polyhedra (Polyhedron here is called non-degenrate). What Fritz and his colleagues have done is to look into the gap between these known lower and upper bounds. Apparently, the gap is wide and they have made some progress to get a super linear lower bound if
is allowed to grow with
.
The way they showed this bound is by establishing the bound for the largest diemeter of a graph in a base abstraction family. Let us say, the abstraction family of connected graphs be denoted by .The largest diameter of a graph in
is denoted by
. They find that,
and then using the fact that
, they conclude the bound
I have not had a chance to see their paper yet. I must say, the proof was not all that within my grab during the talk. However it appeared that it is based on some layering and combinatorics. He said some applications to covering problem, in particular disjoint covering design which I didn’t follow that well. Sometimes I get the feeling that I am a little dumb to grasp these ideas during a talk. I wonder whether others understand it very well on a first shot presentation. I have put it in my agenda (among the millions of other papers to read) to see through this problem and proof, one day! His presentation was very clear and legible though.
From this blog piece, I came to know that the smart MIT theoretical computer scientist Madhu Sudan is making a move from MIT to industry. He is set to take up a research position with Microsoft. At this economy troubled days, lesser mortals would take the conservative route that ensure stability and so on. They would say a move from a tenured professorship to a more volatile industry is risky. But then one of the smartest mind in the world can have a world revolve around him, if need be. So no surprises here. On the positive side it is a gain for industry, while it is a big loss for MIT, if Madhu decides to stay away from academia for too long.
Interestingly, on the very same blog, someone commented about other famous moves. Apparently, Venkatesan Guruswami, Madhu’s celebrated student is also making a permanent move from UWash to CMU. In industry, we are often associated with frequent hops. Academia is not too immune to attrition either. However, I see no harm in making smart moves. It is going to help the world, atleast in expectation.
As in EPFL too, there is imminent big fish attrition(s). Tom Henzinger and his wife Monika Henzinger are about to leave EPFL to take up a permanent position in Austria. The awesome twosome will be missed in EPFL.
An interesting bound (problem from Rudi’s Modern coding theory book) regarding the chromatic number of a random graph. I first attempted it during the course. Here is the problem statement:
C.5 (Concentration of the Chromatic Number – Spencer and Shamir). Consider a graph on
vertices. The chromatic number of a graph
, denoted by
, is the smallest number of colors needed to color all vertices so that no two vertices which are joined by an edge have the same color. Consider the standard ensemble of random graphs on
vertices with parameter
: to sample from this ensemble, pick
vertices and connect each of the
ordered pairs of vertices independently from all other connections with probability
. Show that for this ensemble
.
My solution is as folllows: (PDF typeset as single file is available here. Look for problem 3)
Let is a random graph with
vertices. If
is the probability that a given edge is in
. The probability space be
. Let
be a filter on the set of all such random graphs. We can define a Martingale as follows:
where
and is the Chromatic number of
. Chromatic number changes at most by one , when the information about the new edge comes in. Clearly,
satisfies the conditions for Azuma’s inequality.
is a Martingale, with ). Let
. Clearly
Now we can use the Azuma’s inequality on to get,
.
Since , the result
follows.
Here is an interesting riddle on random matrices.
(Rank of Random Binary Matrix). Let denote the number of binary matrices of dimension
and rank
, so that by symmetry
. This is a repost of the solution that I have arrived at (certainly not the first!) and submitted as part of a homework (9) problem from the doctoral course Modern coding theory (by Rudiger Urbanke) at EPFL. The sumbitted solution in PDF is available here.
Rank of a matrix is essentially the number of nonzero rows when the matrix
is expressed in echelon form. So, we just need to compute the ways these matrices can be created with
non zero rows. Since the elements of the matrix are binary (from
), we can simply do a counting.
It is trivial to compute for
and
. For
, only all zero matrix possible, and only one such matrix exist. Hence
. For
, since
, no matrix exist, which means
.
Now we consider . How many ways? We have
non zero rows of the
matrix, which means all rows must be nonzero. Without loss of generality, for counting, we could assume that, the rows are ordered. The last row (
row can be be done in
, since there anything other than all
vector (of size
) is allowed. On
-th row, anything other than that of row
is allowed. There are
ways here.
-th row can have anything except any linear combination of the rows
and
. This is nothing but
. Row
then have
and so on. In all, Following the same procedure, we can have a total of
ways. For , we can construct a rank
matrix of size
in any of the following ways:
- Take a rank
matrix of size
and add an independent row.
- Take a rank
matrix of size
and add a dependent row.
For every matrix,
and hence,
ways. (Essentially avoid all possible linear combinations of existing rows). Using the second (item 2 above) method, we can have
and
different ways a rank matrix can be formed. Where,the first term (
) is when the all zero row is picked as the new row. In
ways we can pick any one of the exisiting row as a dependent (new row). In general for
we can have combination of
existing rows out of
in
different ways to make a dependent (new) row.
So using (1) and (2) we get,
Putting everything together,
While trying to understand the Luby transform (LT) code, I stumbled upon the well known coupon collector’s problem. It took a while for me to figure out the connection, but as it turned out, there is a stronger connection between these two. In LT parlance, if we were to use only degree one packets (that is, packets sent and collected as it is) what is the expected number of packets to be collected (when collected randomly, one at a time) such that all the required packets are collected atleast once. For illustration let us say we have information packets at the transmitter. The receiver collectes these one by one at random. How many packets on the average, we need to collect until we have collected all of the
different information packets. Remember we are collecting the packets randomly (On the other hand, if we were to collect things deterministically, we just need to collect
packets to get all these
, when done without replacement).
Assume that there are distinct coupon types. We have, a large pool of these coupons at disposal. Every time you go to the shop you collect a coupon picked uniformly at random from the pool. The picked coupon has equal probability of being any of the
types. Naturally, some of the collected coupons (over multiple visits to the shop) may be of the same type. The question asked is this: Suppose the coupon collector aims to have coupons of all types. How many (number of visits) coupons he has to collect till he possess all the
distinct types of coupons?
In expectation, the coupon collector should make visits to the shop in order to have atleast one copy of all
distinct types of coupons . This coupon collector problem can sound a little confusing to a fresh reader. For simplicity sake we can assume that, there are
differently coloured coupons at the shop. The question then is, on average (i.e., expectation) how many times one needs to visit (each visit fetch a coupon) the shop so that all coloured coupons are fetched atleast once.
There are different type of coupons. The coupon collector collects a coupon upon each visit. The collected coupon is among the
types, picked uniformly at random (from a set of possibly large pool of coupons) . Since the coupon is drawn uniformly at random, there is a non zero probability that some of the collected coupons over multiple visits may be of the same type. Suppose that at some stage, the coupon collector has
different type of coupons collected. The probability that his next visit fetch a new coupon type (not of the
types he already have in the kitty) is
. So, the expected number of coupons to be collected to fetch a new coupon type is
. Let us denote this number by
.
The expected value . From this we can compute the expected value of
. In other words,
, the expected number of coupons to be collected (i.e, number of visits to the shop!) so that, the he would have all the different
types of coupons is:
So, what is the trouble? This number is prohibitively high a number to be acceptable (as decoding time of
is significantly higher than the wishful linear time
!). So, simply using degree
is not a good idea. This is why Luby went ahead and identified some smarter distribution like Soliton (and its variants proposed later on, such as robust soliton and then the recent raptor codes by Amin).
Today, as part of EPFL annual research day, there were 3 interesting talks. In the morning Prakash Narayan gave a very interesting talk titled “Common randomness, multiuser secrecy and tree packing”. Essentially it covered three distinct problems and he showed a connection among the three. The first problem setup is the following: A set of terminals observe separate but correlated signals. The classical Slepian and Wolf formulation of the data compression then is essentially the problem where a subset of the given terminals seeking to acquire the signals observed by all the terminals. And this is done by means of efficiently compressed inter terminal communication. This is a problem of generating common randomness. This of course does not involve any secrecy constraints. Now suppose a secret key generation problem. There the same subset of terminals seek to devise “secret” common randomness or a secret key through public communication. Assume here that an eavesdropper can observe this. So the setup is such that the key is concealed from the eavesdropper. Such a secret key can be used for subsequent encryption. Prakash’s talk was then to explain the connection between the two problems. He went on to establish the connection to a problem in computer science namely the maximal packing og Steiner trees in an associated multi graph. I dont think I figured out the details that well, but it triggered some curiosity to read the work a little more detail. I hope to do that sometime soon.
The afternoon session had two talks. One was by Shamai who talked about Broadcast approach in communication systems. It went over time. I thought I focused well in the beginning to follow him, but partly because of the post lunch effect and partly because of the tiredness I lost the flow. From what I understood, he outlined a lot of communication scenarios incorporating the broadcast strategy. Some examples were MIMO rate diversity trade off, ARQ, multilayer schemes etc. A lot the work seems to have gone in this direction, especially Suhas and Sanket etc (from the citation) and David Tse, L. Zheng, Al-Dahir and Shamai himself. I am somewhat amazed by the areas Shamai worked on. He seems to have covered a broad spectrum of research and yet produced some stellar work.
After Shamai, it was an interesting talk by Amos Lapidoth. He presented handsomely. I was attentive enough to follow this. Also, it happened to be a talk of different kind. He talked about the well known Matched filter used in communication. He sort of started with a little story. The story of a man from a village, venturing out of that place with a mission to find the meaning of life. So he goes to the mountains with a resolve not to come back until he finds the meaning of life. So days passed, months passed and years passed. Even after 10 years no sign of him. Finally he comes back after 11 years or so. The whole village feels curious: Aha he has come back. They ask him, wow, you have figured out the meaning of life. Please share us what is it? He says, with a pause: Life is (he pauses again)…. : Villages out of patience ask him, : ” You please go on .. life is …”. The man completes and says ” Life is like a train!”. Then they ask what you mean by “life is like a train”. Then to the surprise of the entire village he says, “may be not!”.
That was simply amazing a prelude for the talk. The talk abstract is the following:
One of the key results of Digital Communications can be paraphrased very roughly as follows: “in guessing which of two deterministic signals is being observed in white Gaussian noise, the inner products between the observed waveform and each of the signals form a sufficient statistic. Consequently, it is optimal to base one’s decision on these two inner products.” It is surprising that this basic result is never formulated as a theorem in any of the textbooks on the subject. This may be because of the difficulties in defining white Gaussian noise, in defining sufficient statistics for waveform observations, and in relating sufficiency to optimal detection. In this talk I shall describe a number of approaches to formulating the above statement as a theorem and point out some of their shortcomings. I will finally describe my proposed approach, formulate the theorem, and prove it from first principles. The proposed approach does not rely on the Ito Calculus, on Brownian Motion, or on generalized stochastic processes. It does not introduce non-physical infinite-power noise processes. Moreover, it is suitable for rigorously treating colored noise.
He gave a counter example where we can do better than matched filter. He says a Gaussian noise, but choose a point at random where the noise is made zero. Since it is randomly chosen (the null point) he claims it is still Gaussian. To me, that will result in SNR to blow up to infinity. So, are we missing something. I cant wait to read the full paper presentation of this. Otherwise, it seem to be a very very interesting way to look at matched filter, without needing the sojourn mathematical machinery.
Anyway all these talks are available (schedule at the moment) at [1]
[1]http://ic.epfl.ch/page65253-fr.html

Recent Comments