You are currently browsing the category archive for the ‘Technical’ category.

The 2012 Turing award goes to two cryptography pioneers Shafi Goldwasser and Silvio Micali. I don’t do much of cryptography (for that matter I never did anything serious on that front, besides doing some C programming to demonstrate RSA and later on some class project involving elliptic curve cryptography and mathematica programming at EPFL). But I always was fascinated by the topic of Cryptography, largely for the many interesting toy like, yet deeply thought provocative combinatorial as well as probabilistic problems it addressed. Several of these problems stay at the very intersection of CS theory and cryptography.

One such classical problem that we all learned in graduate classes is the zero knowledge proof. The idea is pretty cute. In the most simple scenario, this involve two parties Alice and Bob, who don’t trust each other, but one of them (say Alice) have to convince the other (bob) that she has knowledge (say a mathematical proof) of something without really revealing it. The fact that it is not revealed explains why it is zero knowledge (Bob in the end never get to know what Alice know, but just gets convinced that she knows!). So, it will work more like an interactive protocol wherein a lot of questions are being asked by Bob, chosen randomly, depending on prior answers from Alice. Doing this way for long, can Alice convince Bob, with an overwhelmingly high probability, that she knows what was claimed? Such an interactive protocol constitute what is coined as a zero knowledge proof. An example would be a novel ATM machine where, you don’t have to enter the PIN (unlike the ones that we have), but you can still convince the machine that you know your correct PIN. Sound cool and yet thought provoking, right? Well, that is why I said it is cute and interesting. This zero knowledge interactive proof idea was first conceived by the new Turning award winners. The names didn’t strike me initially with the Turning award news, but after reading the details, it occurred to me that, I had read their classic paper , as part of my coursework at EPFL.

A bit more formally stated, the two entities are namely a Prover and a Verifier. In the ATM example, the ATM machine is a verifier and you the user is a Prover. Prover knows the proof of a mathematical statement and verifier tries to verify whether the claim is indeed correct with high probability. The machinery of zero knowledge proof is that, if the prover is trying to trick, the verifier will be able to find that out as well, with high probability. There are many examples illustrating this beautiful idea. A classic example is the problem of helping a blind man in identifying whether two otherwise identical balls are of different colors? Can you convince him, without really telling which is which? Now there are variants of their original idea and myriads of practical significance have emerged or are still emerging.

The ACM award page has pretty enthralling account of these two pioneers (Shafi and Micali). Now, here is an interesting family trivia. Shafi Goldwasser and her husband Nir Shavit together now keeps three Gödel Prize in their shelves, besides adding the Turing award aroma, now to their household!

Through Anand Sarwate’s blog and this piece from Sergio Verdu, I came to know that the well known Information and Coding theorist Jim Massey has passed away. I don’t have any direct personal recollection of Massey, other than seeing him once at an Information theory workshop and also last year at the Marconi award ceremony at Irvine. The one thing I always remember (besides the Berlekamp Massey algorithm and transform decoding paper) is his notes at ETH. I have enormously benefited from his lecture notes on Cryptography when I was trying to learn the topic at EPFL. So lucid, crisp and intuitive were his scribes. How I always wished to sit in one of his live lectures! RIP!

I am sure detailed writing on his life and work will appear at some point. I recall Rudi Urbanke once mentioned the early impact of Massey’s work (as a graduate student ?) on threshold decoding of convolutional code, having spurred interest in industry. Codex corporation (to which, he was a co-founder, I learned recently.) once wanted to implement it into their line modems. Not sure whether I have all the details intact here, but prior to the Viterbi algorithm, his threshold decoding scheme must have been a big hit in communication! To have industry interested in a graduate student work must be special, any day, anywhere!

In his blog Sergio Verdu, has pointed to the IEEE oral history interview archive, which I happened to read last year almost same time.

Info theory website has further details including the funeral info.

If you have not seen this yet, a fascinating talk (Cryptography- Science or Magic) he did at MIT several years ago, is archived here. Boy! who did the speaker introduction? Another true connoisseur Peter Elias! First time, I saw a video of Elias.

Most of you may have been following this new prototype being developed and deployed by Google. I am talking about project Loon, an idea conceived by Google to help connect the few billion friends around the world who are still deprived of internet benefits. The idea at first may spell like fiction, but this one is for real. Already, some pilot projects are on the way, in New Zealand. Let us watch out for this to spread its wings in the coming months and years!

Anyone remember the old Motorola/Iridium initiative? It scooped and failed for many a reasons, but the idea that time was to have the entire world voice connected, but project Loon is a bit more than that in intention, technology and economic viability. Besides, Loon is backed by a highly successful technology driven company. The goal in itself is to have pretty much every corner of the world to stay connected by internet, the holy grail of global networking. Whereas, Iridium needed sophisticated lower orbit satellites, project Loon can get the job done through a set of balloons equipped with wireless communication technologies. The number of balloons may be much larger than the number 66 or 70 satellites, but the latter is a lot less expensive and green than the failed initiative!

So what goes into the making of project Loon? Logistic wise it needs deployment of enough number of helium powered balloons into the sky, the stratosphere layer of earth atmosphere to be precise. Why stratosphere? Because, the balloons will make use of the wind flow that prevail at stratosphere layers to steer and position it around a certain location locked to ground. The balloons are not quite stationary; they instead will move around, but on the average a certain number of balloons will stay put up in location range to provide a reasonable coverage for any given location. All the balloons are equipped with enough circuitry to perform necessary wireless communication networking jobs. The balloons are all the time connected (wireless that is) to neighboring balloons and some of them will talk to an available ground station terminals through which it will establish connection to the internet backbone and thus to rest of the connected world!

The balloons may have varying shapes and orientation. The shape of the balloon and the wind pattern may come into the equation to steer them and stay around (or move around the earth) at the atmosphere. They may, not only move around the earth, but also can potentially move up and down in the stratosphere layers. Each of these balloons are of approximately 15 meters in diameter which will float at about 20 km altitude from earth surface. For record, this height is more than double the distance where we can spot the farthest cloud or for that matter the highest altitude where airplanes fly! The task involves gyration, ballon steering and of course quite a lot of wireless mesh networking as well as co-ordination prospects. At the user side, you will have specialized antenna (or antennas, depending on whether MIMO comes in) to talk to one of the balloons above your location and we are all set to go. When fully operational, everything else will be transparent! Pretty much the energy for operation at balloons all will come from solar energy. The other natural resource used is wind. Both are green and free and almost universal!

I am very excited about the prospect of this coming off in full force in the near future. On the beneficiary side, one it will help reaching the far corners of our planet. More than that this may well serve as an inexpensive way for many billion folks to reap the benefits of internet and staying connected. Of all, the children of a lesser world can as well get to bite a share of a better world. Imagine a remote village school in Burundi or Bangladesh getting access to better educational tools through internet! Wouldn’t that be a beautiful? Corporations will make money, but when less privileged ones also benefit, that is something to cheer. In the end a model will sustain and everyone can have a share, monetary or otherwise.

Check out more details at the project Loon page. The google+ page has more updates pouring in.

In a lighter vein, what is the main downside of this everywhere connectedness? Here is a potential spoilsport scenario! You will agree with me here:-)

During the past week, while at Hawaii for the IEEE 802.11 interim, I happened to glance this NY times article. The story is about a New Hampshire professor Yitang Zhang coming out with a recent proof on establishing a finite bound on the gap between prime numbers. While browsing the details, there are more details emerging as several pieces of blog and articles and reviews are being written (and some are still being written). Now, looks like the claim is more or less accepted by pundits in the field, and termed as a beautiful mathematical breakthrough. As an outsider sitting with curiosity, I feel nice to scratch the surface of this new finding.

The subject surrounding this news is number theory, prime numbers to be precise. The question of interest is on the gap between adjacent prime numbers. We know that and are prime with a gap of , but this is truly a special case and unique per definition. The gap between and is 2. Similarly and differ by . One may have thought that, the gap between successive primes go up as we flee along the number line. Not quite. For example, we can see that there are a lot of pairs with a gap of 2. The easy ones are (3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), (71, 73), (101, 103), (107, 109), (137, 139) and the list goes on. It was conjectured that there are infinitely many such pairs, but the proof of that is not quite as easy as yet! It is known that there are precisely below , but infinity is still a lot far from ! An interesting quest was to really prove that there are infinitely many twin primes, but this still remain as an open conjecture.

Now the new discovery by Zhang is not quite proving the twin conjecture, but a close relative of that. Twin conjectures are strictly about prime pairs separated by . A related question is, how about prime pairs and which are separated by where could be a finite number. When , then we have the special case of the classical twin prime case. Can we at least prove mathematically that there exists infinitely many primes such as for some $k$. If so, what is the smallest where this holds true? Zhang now has a proof that for as small as million. Mathematically, if we denote is the th prime, then the new claim says (stated crisply in the paper abstract),

million is still a large gap, but as Dorian Goldfeld says, is still finite and nothing compared to infinity! In future, it is not unlikely that we may get to see this gap coming down and perhaps to the best case of . Who knows?

The result is still interesting, even to general interesting folks like us. This kind of says that, the gap between prime numbers is worst case bounded by a finite number. If we really plot the prime numbers, then we will see a saturation like behavior! Like many other things at asymptotic (for example, the eigenvalues of a large random matrices exhibit very interesting properties, when the size goes to infinity), things at infinity may exhibit some charm, after all!

The paper is accessible here, but as expected the proof is hard (for me at least). Hopefully we will have some diluted explanation of its essence from experts in the coming days. Already,Terrence Tao had this sktech, a couple of weeks ago on his google+ feed. Over the last week or so, finer details on the new break through are still emerging. Terry Tao also has initiated an online Wiki collaboration in an effort to improve upon from this work (For experts that is, not for me!).

Congratulations Professor Zhang.

I love Youtube. Every day, more or less on the average, I ends up spending (or at times wasting) some time there. Today was no exception, yet I was pleasantly surprised to hit upon some video taped lectures of Richard Hamming. These are apparently recorded in 1995 and are on a wide variety of topics. I didn’t get to go through all of them, which hopefully I will do sometime. I particularly liked this one on Discrete Evolution. The depth of knowledge these folks have is immense. What is astonishing is their ability and skill on connecting their point of expertise to vast majority of physical analogies. Long live internet!

A sad end to what looked like a promising and prodigious mind, complicated by many wizardly , perhaps at times turbulent actions and more so haste reactions from various corners of our society including the law enforcement offices. The news of Aaron Swart’s death at the young age of 26 is disturbing. The man, who at the age of 14 sparked into stardom by creating the now popular tool RSS for information subscription is no more! More than the wizardly invention, his name unfortunately caught into wider limelight perhaps through the MIT/JSTOR documents retrieving case. He had later championed to several causes on free information access. The right to free information in the internet era once again had caught the worldwide attention with that drive. It is difficult to keep side of this case, because it is strangled with multiple levels of complications involving the right to information, ethics, social stigma, law of the land, money, business,a wizardly mind and of course the turbulence of human mind!

I read his Uncle’s statement, “He looked at the world, and had a certain logic in his brain, and the world didn’t necessarily fit in with that logic, and that was sometimes difficult.” I couldn’t agree more to these words of Mr Wolf on Swartz. Don’t forget he was an ardent contributor to Wikipedia as well. Rest in peace Aaron!

This year’s Marconi foundation prize is being awarded to our company founder Henry Samueli. With last year’s price awarded to the other connoisseur Irwin Jacob (jointly with another stalwart Jack Wolf), now we have the two stellar communication company founders getting the prestigious award in consecutive years!. Feel proud to be part of the company he founded. Broadcom simply has a lot of vibrancy and part of this must surely be due to the champion founder. You can see the energy when Henry Samueli talk. I could feel a similar charm when Aart De Geus (founder and CEO of my earlier employer Synopsys) talks too. Congratulations Dr. Samueli, we are proud of you.

The first mail this morning (from Nihar Jindal) brought this very sad news that Tom Cover has passed away. A giant in this field who contributed immensely to many flavours of Information theory will be missed. Everything he touched had class written all over, gracefulness, simplicity, elegance and all the more depth.

A tremendous loss! His legacy will continue.

Yesterday evening, during the dinner at a restaurant at Hawaii, I and my colleagues (Eric, Jun, Nihar and myself) along with a fellow colleague (Neycer) from Motorola were having some random ramblings. Somewhere in the course,came the topic on history of OFDM. It was indeed fascinating to trace the history. I did a bit of Googling later on and also traced some old notes from the discussion with Emre Telatar (who to me is a walking encylcopedia on several things). My information may not be too acurate, but roughly this is what I gathered after all the pile collection.

The origin of OFDM idea as such is largely attributed to Chang 1970. Saltzberg had identified the problem of ISI and in came the notion of guard interval. Apparently, there is also a patent filed on this idea. The idea of cyclic prefix, the killer beauty which made OFDM ridiculously easy for equalization, was brought in by Peled and Ruiz in 1980. It was then Weinstein and Ebert who came up with the possibility of using FFT into OFDM. This traces back to the summer of 1971.

There are a few more interesting pre-work prior to these official OFDM milestones. Even though they are not really related, but hindlisht, we can still bring in similarities on how ideas shaped over time and different eras. For instance, the concept of parallel transmission was realized even in a product form in 1957 by a company Collins Radio Company. It was known as a Kineplex system. And the very idea of splitting to multiple carriers and power filling have signs of Gallager’s work and even waterfilling:-)

There is a Globecom paper which discusses all these. All these and may be more are neatly discussed there.

The SODA 2012 paper by Dina Katabi, Piotr Indyk et al promises a relatively faster way to compute DFT of sparse signals. This is getting traction from outside the community too, after the MIT broadcast. The well know FFT originated from Gauss and laregely resurrected by Cooley and Tukey has the time complexity of , which offers significant gains over the conventional DFT complexity of when the size is large.

If we really dissect the FFT complexity limits, it is already pretty good. With points to compute, the complexity will be proportional to and roughly, the per node complexity is .

Now, what the new scheme promises is not a change in the order of complexity, but a clever way of simplifying the complexity depending on the inherent signal sparsity. When the signal is sparse (i.e., only among is significantly different from zero ), it is fanciful to ask whether we can indeed get to the complexity and Katabi, Indyk et al have indeed reached there. Quite remarkable achievement this is, considering that the gain could be as good as the compressibility limit in most of the real world signals we deal today. Signals such as audio and video are the leading candidates in the practical signal signal processing world and they both are sparse in some transform basis. Recall that, the recent compressed sensing results for sparse signals showed the potential benefits of sparse signal processing and this new scheme will help to realize many things in a more meaningful way. One good thing with this is that this generalizes the conventional FFT. In that way, this is not just for sparse signals, but something which holds for any and in the limit when , the complexity is as good as the old mate FFT!

I want to try this out sometime for some communication problem that I have in mind. At the moment, I am hopeful!

Another stalwart, the founding father of Unix, C and in many ways one of the founding fathers of computing itself, Dennis Ritchie passed away. For me, Ritchie along with Kernigham was one of the first few names registered in mind, since learning computer programming. The first book I have ever seen on a programming language is this duo’s book on C. And boy wasn’t that most concise book in technology, every so compact and yet rich and clean!

Come to think of it, the impact of Ritchie to modern science and technology is enormous. He may not have been a very public figure, but his contributions indeed is the touchstone on the modern world, especially the information network world. Much of the internet, the Google, the iPhone’s and what more, almost everything we can think of runs on his stuffs or its variants. Quite remarkable.

I think the most apt summary of Ritchie’s contribution is heard from Brian Kernighan himself. He said “The tools that Dennis built -and their direct descendants – run pretty much everything today”

Windowing techniques have always offered confusion to me. The hardest one was to remember what is what and how they looked in frequency domain (the time domain view was something thankfully I recollect from the names). I hit upon this yet again while trying to map the theoretical spectrum to spectrum computed from discrete samples (for an OFDM modulated signal) and then to analog spectrum measured by an instrument (By the way, I figured out that, the closest discrete technique which maps to the spectrum computation in spectrum analyzer is the Danielle method, which for some reason is not there in Matlab!) . For ages, I was using the pwelch (now spectrum.pwelch) function (pwelch method to estimate the spectrum) to compute/plot the spectrum in Matlab. Usually, some windowing as well is used to adjust the mean square versus resolution. What window function is more suitable is something that I’ve never mastered. These days, doing a Wikipedia search to find a temporary fix and then move on is the adopted, yet not entirely satisfying strategy. The frequency domain characteristic of the popular window functions are now in here for reference, thanks to Marcel Müller. Since I may need it once in a while, I’ll keep a copy here for my own selfish quick reference. If you ever need to grab a copy, please be aware that, the ownership of this is with them.

It was in 2006, I guess, I had written a mail to Mathworks on this topic. I did get an acknowledgment from them saying that they will incorporate this into the future version. I don’t see it yet. I wonder why! Anyway, here is the Email I had sent to them. Unfortunately, I couldn’t trace their Email reply since it was sent to my earlier official ID, which has changed! Hopefully, I will ping Mathworks again and get this up.

I believe the theoretical BER under fading scenarios, provided by the ‘berfading’ function in Communication toolbox is not quite accurate, for frequency flat fading, when Doppler shift to be considered. Currently, for example, the berfading gives out the BER for Rayleigh (or Rice) fading channel, without considering the impact of Doppler spread. In the presence of Doppler, there may be an irreducible error floor for certain modulation schemes, especially differential modulation schemes. There are theoretical results (not closed form) to reflect such impacts. Closer approximations of the average bit error probability are derived in closed form for some of the differential schemes (say DBPSK, DQPSK etc).

Just to illustrate the point, I am enclosing here, a snapshot of the comparative BER results. For DBPSK, the exact BER, considering maximum Doppler frequency, for a uniform scattering model (Jakes spectrum) can be better approximated to Pb=(0.5./(1+EbN0) ) .*(1+ (EbN0.*(1-besselj(0,2*pi*Ts*fd)) )); The simulation results closely match this result, as against the berfading result “berfading(SNR,’dpsk’,M,1)”

The BER curves are shown in the figure, attached with. Blue curve is what Matlab ‘berfading’ gives. The pink (Pb_doppler_theory) is the approximate and more correct bound, for Jakes spectrum (See [1], [2], [3]). The latter result is in close unison with the simulation as well.It would be nice if Mathworks can modify (may be in the next release) berfading such that, the theoretical BER expected take care of such fading characteristics (in this case Doppler). I am using 2006b version of matlab. In this only Jakes spectrum can be compared. For other spectrum (I read in the Mathworks website that other spectrums can be configured in later

versions of Matlab/toolboxes), appropriate modifications to be done.

[1] P. Y. Kam, Tight bounds on the bit-error probabilities of 2DPSK and 4DPSK in nonselective Rician fading,IEEE Trans. Commun., pp. 860-862, July 1998.

[2] P. Y. Kam, Bit error probabilities of MDPSK over the nonselective Rayleigh fading channel with diversity reception,IEEE Trans. Commun., pp. 220-224, Feb. 1991.

[3] M. K. Simon and M.-S. Alouini, Digital Communication over Fading Channels A Unified Approach to Performance Analysis, Wiley 2000.

It came as a shock, when I heard the demise of Rudi Ahlswede, first from here. He passed away in December 2009, just two months after the ITW in Dublin where he looked jovial as ever. Interestingly, I had lunch with him, in one of those ITW days and it was fun conversing him. A great loss to Applied math and Information theory in particular. With his demise a massive figure has been lost. Very sad indeed. Besides the large contribution to the field, the special thing which standout is his humility and the sense of respect to fellow researchers, and the occasional humour in light conversations. His lectures were very jovial and filled with the innocence of an enthused child.

A friend of mine, asked me whether I know anything about IPNLMS. Honestly, I didnt even hear about this. He said it is something like progressive NLMS. We did a bit of googling and then found that it is somewhat a new adaptive scheme.

Well, IPNLMS stands for Improved Proportionate Normalized Least Mean Square scheme, which is a modified version of the well known NLMS technique used in adaptive filter theory. OK, the idea of IPNLMS as I understood is, as follows. Each of the coefficient (tap) is independently updated, where the adaptation step is proportional to the estimated filter coefficient. How is this helping?

On the Christmas day, out of blue I bumped across an old archive of Robert Fanos’s interview (oral history). Beautiful one.

Recently (almost a month back), I happened to ask a simple question to an interview candidate. The candidate apparently worked, among other things, on the design of scrambler and descrambler blocks in 3G/LTE terminal development .Naturally, I expected him to know in and out of the block and of course the scrambling idea itself. To my innocence, the answer came: “Scrambling adds protection to the information source”. That was a bit of foot in the mouth reply, which I seldom expected from such a candidate profile. Anyway, come to think of it I shouldn’t overly blame the candidate because many a time, engineers overlook the primary goal of placing a block or unit in the overall design, even when they do a glory traverse with it. I personally think this is ignorance on part of an individual engineer if he or she doesn’t spend time to figure out the reasoning behind every single unit in design, especially the one he or she works on, if not to an expert level, the intuitive idea gathering is paramount.

Anyway, let us get some facts right with scramblers. Scramblers are there in almost all the standardized data communication systems. It is some kind of randomize acting upon the information bit stream (Well, one can also think of it operating on non-binary symbol sequence or packets, but let us consider binary for the time being, for simplicity.). The input stream upon the act of a scrambler changes the patter of the original bit stream. Naturally, at the receiver should be doing a reverse operation (descrambler) to make sense of the source information. One can argue that, a receiver not having the right descrambler (descrambler is very well based on the scrambler algorithm) cannot retrieve the intended information, adding protection/security on the data stream is NOT the reason for the existence a scrambler/descrambler unit in a communication system. The why? Let us see why.

A typical receiver will have several algorithmic stages such as synchronization, equalization, decoding etc ( These are broad classifications. One can go into finer details. For the discussion on scrambling we can suffice with this much). All these stages are usually designed with the assumption that the source stream is i.i.d (continuous stream of identically independent ‘0’s and ‘1’s). For example, a long stint of ‘0’s or ‘1’s can degrade the timing/clock synchronizer performance or may even result in loss of synchronization.

The other important argument on the scrambler is the spectral density (spectral mask as defined in standard documents) requirements, usually imposed by the standard specification and (country specific) regulation requirements. Any periodicity (short or long), can result in spectral spikes, which in turn may result in co-channel and adjacent channel interference due to various nonlinear blocks present in the transceiver chain.

So, scrambler’s job is to make an uniform i.i.d stream so that, the spectrum stays as close to white. Similarly, the receivers do not misbehave because of arbitrary source distribution.

I didn’t see this book before. While trying to dig a little deeper into the sigma delta modulation theory, I bumped across this book by Robert Gray. The book, first published in 1990 hasn’t really become a mainstream reference on source coding, but the book is awesome. I didn’t read the whole book, but the chapter on uniform quantization noise is simply a treat for someone who loves the theory. Among other things, it discusses the Benetts’s conditions for the exactness of the uniform quantization error. I am now going through the noise analysis of the delta modulation and sigma delta modulation schemes.

Initially, I just managed to read a near full chapter content at books.google.com, but later I was convinced myself to get the book at Amazon. After losing the 7.5 USD used book offer, I finally had to content myself from ordering the next cheapest option of 40USD. I am expecting this to arrive in a few days. A brand new book would cost 130 bucks anyway!

The P versus NP has popped up again in to the mainstream. The HP scientist Vinay Deolaikar has recently prepared a manuscript which claims, it is after all P ≠ NP. Greg’s blog seems to be the one which first broadcasted this to general public. I have not had much time to digest the proof claims or counter arguments in detail. Here is a fascinating series of arguments discussing the proof claims. I am going to do some digesting on this in days to come! We will wait for the experts to have their say. I am going to relish this discussion.

On a lighter note, here is a very interesting blog piece, bringing some sort of similarity to the giant component behaviour on a pile of staple pins. This is sort of a layman view of the Erdos Renyi explanation.

In one version of the Erdős-Rényi process, one start with a set of isolated vertices and then keep adding random edges one at a time; More specifically, at each stage one choose two vertices at random from among all pairs that are not already connected, then draw an edge between them. It turns out there’s a dramatic change in the nature of the graph when the number of edges reaches . Below this threshold, the graph consists of many small, isolated components; above , the fragments coalesce into one giant component that includes almost all the vertices. “The Birth of the Giant Component” was later described in greater detail in an even bigger paper–it filled an entire issue of *Random Structures and Algorithms* (1993, 4:233–358)–by Svante Janson, Donald E. Knuth, Tomasz Luczak and Boris Pittel.

Ofcourse Joel Spencer has written a beautiful article on AMS commemurating the golden anniversity of the giant component.

I am very thrilled to learn that Ruediger Urbanke has won the 2011 (Koji) Kobayashi award. He and Tom Richardson are named the 2010 receipients of the famous Kobayashi award. Rudi and Tom are awarded Kobayashi prize “for developing the theory and practice of transmitting data reliably at rates approaching channel capacity.” They truly deserve this. Looking at the list of earlier Kobayashi award winners, it really is a place of pantheon of greats. Gottfried Ungerboeck, Don Coppersmith, Rivest, Shamir, Addleman, Jack Wolf, Berlekamp and so on are among the famous awardees of the past.

When pointed this to Rudi, he was as usual every modest about these. I am sure I will get to have a coffee treat from him, in Lausanne! Place Palud or Ouchy?

An interesting and twisted version of the twenty question game came across recently. I saw it in the 2009 ISIT paper on the error correction under Feedback with list decoding, by Ofer Shayevitz. The problem goes like this (I am going by the nice narration of Ofer with the famous Alice and Bob duo).

The game is being played by Alice and Bob. Alice selects one of possible objects, and Bob can ask binary questions (where the answer can be either yes or no only), in a quest to identify the object selected. If Alice is always truthful, then questions are obviously both necessary and sufficient. Now, suppose Alice is allowed to lie up to times. How many questions does Bob need to get the correct object? Alternatively, given questions, what is the maximal number of objects that Bob can separate? This version of the “twenty questions game” with lies is known as Ulam’s game.

Yesterday I attended a very nice talk during the Broadcom internal seminar series. It was a nice talk discussing the semiconductor fabrication. For me, it was truly whale swallowing kind of a talk, since a lot of new things learned. The term “tapeout” is there in the common vocabulary in semiconductor industry and we all use it with such ease that no one really bothered to search the origin of the term itself. I had done it once by doing a wikipedia, but then as usual had forgotten all about it. Anyway, here is the legend behind tapeout. Mike Magee said it all very cleanly, but let me sort of state it again.

Well, tapeout refers to that fancy thing the designers deliver to the (semiconductor fabrication) fab companies in Taiwan (well, I can say so because most of the fab companies are in Taiwan). The fancy algorithm we conceptualized in plain equations, went through the LaTex pages and then C programs finally took shape into RTL and gates and transistors. That then further went through the backend processes to eventually a set of masks. These mask description is what is given to the semiconductor manufacturing companies. These days, it is just a matter of file transfer using ftp or so. Then you can rightfully ask:why there is a term called tape out? Well there is history to it, fair enough I will say!

Glanced upon a paper from the 2009 ISIT, titled “LT Codes Decoding: Design and Analysis” by Feng Lu et al. The authors discusses LT decoding strategy which is Peeling decoder (conventional simple LT decoding) followed by an extension based on the well known Widemann algorithm solving a sparse homogeneous linear system. Here is a brief summary as I understood.

On packet erasure channels (internet being one close realization of a model of this kind, where, the TCP/IP suit, one can either gets packets correctly delivered or lost in the wild), Luby Transform (LT) codes forms an easy way to transfer information. It is well known that, asymptotically (When the number of information symbols (or packets) and the number of packets received $r$ are large) near full recovery is made possible by simple peeling decoder which grows in complextity at . The claim is that, for or so, with , on average we can, with high probability, recover all of information packets by simple unwrapping of xor-ed packets. A more precise statement will read as: If from information packets, each coded packet constructed independently by xor operation (the number of packets taken for xor operation follows a distribution; Plain vanila version of LT codes uses Soliton distribution), then the content of the original information source can be recovered from any packets with probability by an average of symbol operations. However, when the number of input symbols is not large (say less, ) the overhead term is not small. It is not too surprising, since we need sufficiently large and to converge the performance in expectation. One can think of solving the system of linear equations using Gaussian elimination and get the best figures, but the question mark stays on the complexity of this method over the cheaper peeling decoder complexity .

Conventional LT decoding based on Peeling decoding rule terminate, once the number of degree -1 packets (ripple as it is called in graph theory parlance) are void. One can hope to continue from there, by using Gaussian elimination and decode a a few more (if the graph matrix rank permits it). If the original graph is full rank, one can technically decode all of them. Gauissian eliminatio being heavy, one hope to find an easy, preferably recursive method. Indeed there comes Wiedemann and that is what the authors use. The authors propose what they call as full rank decoding, which essentially is LT decoding, until there are no more degree one packets. From there, they borrow a packet and then uses Wiedemann algorithm which is somewhat light on computational complexity.

The larger aspect on whether this scheme do any better than conventional LT peeling decoder is the other question they answer. To achieve full recovery we need to have a full rank matrix. Now, the matrix being a sparse matrix defined over where is the number of received packets. The probability of the matrix having a full rank will directly help us to infer the decoding merits. This is an interesting piece since I had looked into the computation of the rank of a binary matrix and also the probability of a random matrix rank being equal to an arbitrary value. All this was done during the Modern Coding Theory doctoral course by Ruediger Urbanke at EPFL. For binary random matrix, we can also setup a recursive formula and thereby a good inference on the probability of decoding can be arrived at.

While trying to analyze the effect of PA bias on the linearity aspects of power amplifier, I came across an analytical model of a BJT amplifier. It is known as the Gummel Poon model. A quick Wikipedia check on the history suggests that, the model is about 40 years old. I need to figure out the model in detail. I am not sure how good they are as a model equivalent for a cmos. In any case I reckon this may be useful to analyze things such as RF power amplifier nonlinearity?

I’ve stumbled upon to this nice article (titled Principles of Effective Research) (I saw it through Mahdi’s blog) by Michael Nielson (yes, the same Nielson who is known for his work on Quantum computation: the author of the popular (technical) book Quantum computation and quantum information). Even though, the title states “effective research”, I thought the underlying methods are applicable to any work, not merely research work. If you trust my word, then many of the points jotted down by Nielson applies well to life and work in general. It appears that he is writing a book on “The future of science“. I am looking forward to that too!

A pretty cool handwritten tex symbol identifier software is unleased and it is known as Detexify. I thought this is such a handy piece of online suit for the TeX community. One can try to write symbol and it simply display a list of nearest matching symbols. There is no absolute guarantee that it display the intended latex symbol immediately, but it does the job pretty well on most occasions.

The memorial service for Ralf Koetter held at UCSD is video archived. Quite many of the stalwarts in Information theory field found it difficult to control their emotions when they spoke about him. Such was the level of closeness many people had with him. I have never got to directly interact with Ralf , but was aware about his stellar contributions to many areas in and related to coding. The most notable thing other than his well known research contribution is his amazing teaching skills. The two guest lectures given by him during David Forney’s MIT class in 2005 were simply stunning. He then had talked about Reed Solomon codes and that is by far the best lucid presentation of such a difficult topic, that I have ever seen. His sudden and untimely demise leaves an irreplaceable void on this planet. He was that good. So woefully cut short by cancer.

Alex Vardy knitted down a fitting tribute to his friend and colleague.

I am eagerly waiting for this new search and compute engine promised by Stephen Wolfram. They call it wolfram|alpha (If google always went with the beta release, Wolfram is going even early).This, if it work in the promised lines is going to rock the Internet evolution. From the outset, this is not just a search engine. It is kind of an intelligent searcher who can loosely understand the human requirements.

For long, it was perceived that a search engine driven by natural language processing is the way forward. But it is pretty hard to build such a system since natural language processing is no mean business. Wolfram’s idea is to create an abstraction and then algorithm of these realizable models. Once we can do a mapping of the requirements to algorithm that is computable, at least in principle we can build such a system. But that is a whole lot of heavy statements already. How easy it is to build all these methods and models into an algorithmic framework? He is using the New Kind of Science (NKS) armoury to realize that. We have to wait to get the full rainbow, but when he promises we can confidently expect something big.

Now once the algorithmic mapping (and implementation) is done, then the question of natural interacting between humans and the system comes. Natural language is the way, but according to him we don’t have to worry about doing that as such. Once the knowledge of the individual is made into a computational framework, then that is enough. I am not an expert in this natural language processing and NKS framework, but for sure this is pretty exciting,both from an algorithmic point of view as well as a practical MontBlanc. As Wolfram himself pointed out *Pulling all of this together to create a true computational knowledge engine is a very difficult task*. Indeed it is still being considered a difficult problem, both in academia and industry. So there is excitement aplenty in the offing. I am eagerly waiting for this to hit soon.

Considering that, the big wig search engine houses including Google are still struggling to make that dream natural language engines (the many pseudo ones in the market are not quite approved). I remember http://www.ask.com started their business in those lines, but never seemed to have crossed that elusive mark of acceptance, atleast not to an extend to capture a world wide wow! If Wolfram has a new way to get this through, that will be a big breakthrough. I cant wait to see that. Wolfram promises that it is going to be very soon. He says it is in May 2009. My guess is that they will release it on May 14,2009.

I had earlier promised to update on the Xitip, when a windows setup is ready. Though delayed, I have something to say now. I have finally made a windows installer for the (Information theoretic inequality prover) Xitip software, which was working pretty smoothly on linux, cygwin and mac for a while. I was not too keen on making this windows installer since a few DLL files are involved with it. Besides it was a bit painful to include these nasty DLL files which would unnecessarily increase the bundle size. Some of these may not be required if Gtk is already installed on the machine, but anyway I made one double click style version to suit the layman windows users in information theory community.

Vaneet Aggarwal is the one who motivated me to make this up since he uses Windows. He showed some interest to use it, should a windows version be available. If atleast one user benefit from it, why not make it. In the process, I got to learn about an easy way to produce a windows install (setup maker) program. I used the freeware *Install creator* to produce it.

I will put this installer available at the xitip website, but for the time being you can access it from here. A lot of people suggested to revamp the xitip webpage which is pretty unclean at the moment. May be a short tutorial is impending. That will take a while; the next two and a half months are out of equation since I am pretty busy till then.

It was today. I’ve just come back to office, after the dinner party hosted as part of the I&C anniversary celebrations at EPFL. Andrew Viterbi was the guest of honour and largely because of his fame, there was considerable crowd attending the function. Martin Vetterli made a nice colourful, flashy presentation illustrating the history of I&C in EPFL as well as scientific progress in Switzerland. He mentioned the names including Jim Massey, Ungerboek who are undoubtedly pioneers of modern communication theory and practice. He began saying that “…Ungerboek is our friend, and now not quite..I will come to that in a minute…”. And of course he didnt come back and fill the circumstance in which the friendship derailed. But I reckon it was a casual remark, perhaps to indicate that Ungerboek, now with Broadcom is a bitter rival to Qualcomm. Since Qualcomm recently established a scientific partnership with EPFL and Viterbi being a Qualcom founder and associate, he perhaps just jotted that remark. It was a nice, usual interesting presentation by Martin.

He also mentioned a nice story about the current EPFL president Patrick Aebischer. Interestingly Patrick Aebischer after an MD (Medical science) degree was fond of computer science and decided to venture into taking a MS degree in CS . He then decided to test his luck at EPFL and approached the admission committee with a formal application. CS was affiliated to the Math department in those days. EPFL politely rejected his application and in due course that ended Patrick’s quest for an EPFL CS degree. He then moved to the US, as a successful surgeon and took a career path of entirely different trace. Years later, as one would say, due to the uncertain turn of things in the great cycle of life, he became the EPFL president and now ruling not only the CS department, but the whole school.

Viterbi talked about the Digital Communication history. He started giving a perspective of this field starting from the days of Maxwell, Rao, Cramer, Wiener and Nyquist. Then he discussed the impact of Shannon’s work. He said the three driving force which made this digital mobile revolution are

1) Shannon’s framework (1948)

2) Satellite (Sparked by the Sputnik success in 1957)

3) Moores’s law, which is more of a socio economic law, which dramatically kept driving the industry so successfully.

The talk as such wasn’t too attention gathering, but he made a rather comprehensive presentation discussing the impact of digital communication evolution spurred since Shannon’s days (and even early) knitting a dramatic success story of digital wireless world with millions of cell phones and similar devices, which showcased literally the realization of theoretical promise Shannon made in 1948. He himself has his name etched in part of that success story, at least in the form of Viterbi algorithm, which is (one of the instance of it) an algorithm used to detect sequences when perturbed by a medium.

Quite a lot of fun activities were organized by the committee. It was quite fun. Since many programs (especially the fun part) were in french, the appeal was considerably deaf to non-french speakers. But then the rationale given was that, the alumni in good percentage are french! I found it funfilled , mainly to see these successful people like Viterbi sharing their views in real. After all we can learn from history. Not many people can claim to have done so well in everything he touched. In the case of Viterbi, he is an academician, researcher, successful entrepreneur and now a venture capitalist, all scaled to the possible limits. Incredible role model, whichever way we look.

Todays IPG seminar had Fritz Eisenbrand (the Disctete Opt chair, Math department EPFL) talking about Diameter of Polyhedra:Limits of Abstraction. I don’t think I followed the topic too well, but this is a share of what I understood.

The topic is about a convex geometric problem on the diameter of a polyhedra. The question of whether the diameter of a polyhedron is polynomial or not seemed to be a longstanding open problem. The largest diameter of a dimensional polyhedron with facets has known upper and lower bounds.

.

The lower bound is due to Klee and Walkup and upper bound to Kalai and Kleitman. These bounds also hold good for combinatorial abstractions of the 1-skeleton of non-degenerate polyhedra (Polyhedron here is called non-degenrate). What Fritz and his colleagues have done is to look into the gap between these known lower and upper bounds. Apparently, the gap is wide and they have made some progress to get a super linear lower bound if is allowed to grow with .

The way they showed this bound is by establishing the bound for the largest diemeter of a graph in a base abstraction family. Let us say, the abstraction family of connected graphs be denoted by .The largest diameter of a graph in is denoted by . They find that, and then using the fact that , they conclude the bound

I have not had a chance to see their paper yet. I must say, the proof was not all that within my grab during the talk. However it appeared that it is based on some layering and combinatorics. He said some applications to covering problem, in particular disjoint covering design which I didn’t follow that well. Sometimes I get the feeling that I am a little dumb to grasp these ideas during a talk. I wonder whether others understand it very well on a first shot presentation. I have put it in my agenda (among the millions of other papers to read) to see through this problem and proof, one day! His presentation was very clear and legible though.

While the talk and boom about multimode multiband phone in CMOS is turning greener, there should be a natural question around it. How about doing all these in software? Rather add a level of programmability such that a great deal of issues from a hardwired implementation are shifted to more flexible firmware. Without contention, pros and cons with the idea of programmability still prevail. Clearly, one definite advantage I see with programmable design is the significant cost reduction and reuse. Additionally a migration or upgrade, which is imminent from a future gadget design point of view, can get done with relative ease with a programmable multimode chip. Building a suitable processor architecture to suit the modulations schemes (say an OFDM based scheme can have an inbuilt FFT engine or a WCDMA can have a correlator engine). Aren’t anyone working seriously in these directions? I am sure there are many, atleast startup ventures. Vaanu and Icera indeed are two things coming to my mind. How about the big boys? There were lot of furies about software programmable baseband chips being developed. Not quite sure what is the latest in that front. Isn’t it the next big thing in the offing? I am sure the EDA big houses have thought ahead for building tools for a heavily software oriented design, at least for years ahead. Or is it that, I am jumping the gun a little too far? However, I see some top level bottlenecks in making this programmable multimode chips realizable at an easier pace than a textbook concept. One of them is difficulty in getting away the analog front end. As a matter of fact, now I feel that, analog is going to stay.

So where are we heading to? Clearly, an all CMOS multiband multimode single chip (baseband and analog) with a near perfect RF and a software architecture would be the ultimate holy grail of cellular chip design. How many bands and how many modes to be incorporated becomes less important, if the programmability aspect is assured. Challenges within a single chip concept are themselves many. Clearly the RF portion is expected to take up lesser share of the overall chip size. An all digital front end is aimed in that direction. While a direct digitization of radio signal of high frequency eliminates analog life process significantly, there are several practical bottlenecks with this Utopian design model. We are not quite there to say good bye to analog entirely. Analog signal processing is still critical and inevitable, even for a programmable multimode dream. I will give you some numerical facts to substantiate my claim:

Suppose we decide to build programmable all digital zero if receiver for a 2GHz system (around the UMTS band). Then, Shannon Nyqusit sampling would demand at-least 4 G samples/second. Even with a processor which clocks 4Ghz and say 8 operations per cycle, our full steam purchase is going to be a maximum 32000000 operations per second. This theoretical figure is based on the assumption that processor memory is fully utilized. At the sampling rate of 4G samples/second, we only are going to get operations per sample. How are we going to have all the fancy radio algorithms shape life with this? Even to implement realistic functionality of a typical modern radio, this is inadequate. Another important thing is the imminent power dissipation to run a processor at . For a portable gadget, where these chip are targeted for, we still need more and more hand in hand optimization and integration with analog processing, software as well as digital processing, in addition to an optimized system architecture. My feeling is that, the analog front end is going to stay for some more time, if not for ever. At least on the immediate future, we need more inroads from analog processing, to realize the small size, cost effective multiband multi mode chip dream.

From this blog piece, I came to know that the smart MIT theoretical computer scientist Madhu Sudan is making a move from MIT to industry. He is set to take up a research position with Microsoft. At this economy troubled days, lesser mortals would take the conservative route that ensure stability and so on. They would say a move from a tenured professorship to a more volatile industry is risky. But then one of the smartest mind in the world can have a world revolve around him, if need be. So no surprises here. On the positive side it is a gain for industry, while it is a big loss for MIT, if Madhu decides to stay away from academia for too long.

Interestingly, on the very same blog, someone commented about other famous moves. Apparently, Venkatesan Guruswami, Madhu’s celebrated student is also making a permanent move from UWash to CMU. In industry, we are often associated with frequent hops. Academia is not too immune to attrition either. However, I see no harm in making smart moves. It is going to help the world, atleast in expectation.

As in EPFL too, there is imminent big fish attrition(s). Tom Henzinger and his wife Monika Henzinger are about to leave EPFL to take up a permanent position in Austria. The awesome twosome will be missed in EPFL.

Today, there appeared an interestng (and perhaps general) question posted on the Linkedin Analog RFmixed signal group. The question was this “Regarding multi-mode multiband RF transmitters for handsets (CMOS), what do you think are the hot issues (besides PA)?” I have given a short overview of the challenges that I could see when a multi mode phone is to be designed on CMOS: The phone has to support a wide range of frequency bands as well as multiple standards/technologies/modulation/air interface. Here is what I wrote. I am not sure whether the discussion is accessible to public. Hence I repost here.

Integrating the RF transmitter and receiver circuits is a challenging thing since we have to support multiple bands (within a single mode. Say GSM/EDGE should support GSM900 to 1900 bands) as well as support for multiple phone modes. For instance a natural multi mode multi band phone supporting GSM/GPRS/EDGE/WCDMA/LTE will have to consider a wide frequency ranges from 850MHz to over 2GHz. If we were to consider incorporating GPS and WLAN, add that extra consideration. Not just the transceiver circuitry, but also other components such as oscillators, filters, passive components, frequency synthesizers and power amplifiers. Another thing is that, for multi mode, the sensitivity requirements are much more stringent than a single mode, multi band design.

Since CMOS offers low cost, better performance and better scaling, to me that is the way forward. The natural choice of transceiver in CMOS would be the direct conversion/Zero IF, since it eliminates the costly SAW filters, and also reduce the number of on chip oscillators and mixers. Now, there would be several key design issues to be considered now with direct conversion architecture. Most notable ones are the well known ghost “DC offset” and the 1/f noise. Designers will have the task cut out to get a cleaner front end and as well as near ideal oscillators.

Now I see another problem with multi mode, depending on what level of flexibility we prefer on this integration. Do we need the phone to operate in multiple modes simultaneously? Say a voice call on GSM and at the same time a multimedia streaming on LTE. In such a case, the question of sharing components are completely ruled out. If not, say some components such as synthesizers and mixers (if in the same band for multiple modes) can be shared. Clearly, simultaneous mode operation will ask for increased silicon die size as well as cost. Challenges may be there for circuit isolation for different modes as well.

In all, depending on the level of sophistication (and of course all these things will have to be scaled economically too) the design,partitioning, architecture challenges are aplenty. Now the choice between a single chip (containing both analog baseband and digital baseband) versus two chips (analog and digital partitioned) will get a little more trickier with multiple modes. With multiple antennas (MIMO), add another dimension to this whole thing:-(.

https://ratnuu.wordpress.com

http://people.epfl.ch/rethnakaran.pulikkoonattu

The latest talk/demo at TED opened up a fresh life to the possibility of a sixth sense. The MIT Media labs now have unveiled a prototype of the sixth sense setup. The whole thing is reasonably economical already and all indications are that this is going to rock some day. Incredible idea which went all the way to realization. Kudos to Pranav Mistry, Pattie Meas and their team. One thing I am really hoping out of it is that, this paving way to assist disabled people. For instance a blind, deaf or dumb person finding avenues to get a sixth sense aid would be really helpful.

http://www.ted.com/talks/pattie_maes_demos_the_sixth_sense.html

An interesting bound (problem from Rudi’s Modern coding theory book) regarding the chromatic number of a random graph. I first attempted it during the course. Here is the problem statement:

C.5 (Concentration of the Chromatic Number – Spencer and Shamir). Consider a graph on vertices. The chromatic number of a graph , denoted by , is the smallest number of colors needed to color all vertices so that no two vertices which are joined by an edge have the same color. Consider the standard ensemble of random graphs on vertices with parameter : to sample from this ensemble, pick vertices and connect each of the ordered pairs of vertices independently from all other connections with probability . Show that for this ensemble

.

My solution is as folllows: (PDF typeset as single file is available here. Look for problem 3)

Let is a random graph with vertices. If is the probability that a given edge is in . The probability space be . Let be a filter on the set of all such random graphs. We can define a Martingale as follows:

where

and is the Chromatic number of . Chromatic number changes at most by one , when the information about the new edge comes in. Clearly, satisfies the conditions for Azuma’s inequality.

is a Martingale, with ). Let . Clearly

Now we can use the Azuma’s inequality on to get,

.

Since , the result

follows.

Here is an interesting riddle on random matrices.

(Rank of Random Binary Matrix). Let denote the number of binary matrices of dimension and rank , so that by symmetry . This is a repost of the solution that I have arrived at (certainly not the first!) and submitted as part of a homework (9) problem from the doctoral course Modern coding theory (by Rudiger Urbanke) at EPFL. The sumbitted solution in PDF is available here.

Rank of a matrix is essentially the number of nonzero rows when the matrix is expressed in echelon form. So, we just need to compute the ways these matrices can be created with non zero rows. Since the elements of the matrix are binary (from ), we can simply do a counting.

It is trivial to compute for and . For , only all zero matrix possible, and only one such matrix exist. Hence . For , since , no matrix exist, which means .

Now we consider . How many ways? We have non zero rows of the matrix, which means all rows must be nonzero. Without loss of generality, for counting, we could assume that, the rows are ordered. The last row ( row can be be done in , since there anything other than all vector (of size ) is allowed. On -th row, anything other than that of row is allowed. There are ways here. -th row can have anything except any linear combination of the rows and . This is nothing but . Row then have and so on. In all, Following the same procedure, we can have a total of

ways. For , we can construct a rank matrix of size in any of the following ways:

- Take a rank matrix of size and add an independent row.
- Take a rank matrix of size and add a dependent row.

For every matrix,

and hence,

ways. (Essentially avoid all possible linear combinations of existing rows). Using the second (item 2 above) method, we can have and

different ways a rank matrix can be formed. Where,the first term () is when the all zero row is picked as the new row. In ways we can pick any one of the exisiting row as a dependent (new row). In general for we can have combination of existing rows out of in different ways to make a dependent (new) row.

So using (1) and (2) we get,

Putting everything together,

Phew! After the heck of debates and discussions (over years) on the standard evolution of the IEEE 802.11n for multiple antennas (MIMO), now it appears that we are all in for a single stream (single antenna) chip. It sounds more like an 11g upgrade, or perhaps a conservative lead from there on? If Atheros believe this is is the way to go I have my belief that Broadcom and Marvell have it in the delivery line too. Here is that interesting new story at EEtimes.

Interesting and truly encouraging news about the MIT researchers who have now the idea to generate electricity round the clock, from solar energy.

While trying to understand the Luby transform (LT) code, I stumbled upon the well known coupon collector’s problem. It took a while for me to figure out the connection, but as it turned out, there is a stronger connection between these two. In LT parlance, if we were to use only degree one packets (that is, packets sent and collected as it is) what is the expected number of packets to be collected (when collected randomly, one at a time) such that all the required packets are collected atleast once. For illustration let us say we have information packets at the transmitter. The receiver collectes these one by one at random. How many packets on the average, we need to collect until we have collected all of the different information packets. Remember we are collecting the packets randomly (On the other hand, if we were to collect things deterministically, we just need to collect packets to get all these , when done without replacement).

Assume that there are distinct coupon types. We have, a large pool of these coupons at disposal. Every time you go to the shop you collect a coupon picked uniformly at random from the pool. The picked coupon has equal probability of being any of the types. Naturally, some of the collected coupons (over multiple visits to the shop) may be of the same type. The question asked is this: Suppose the coupon collector aims to have coupons of all types. How many (number of visits) coupons he has to collect till he possess all the distinct types of coupons?

In expectation, the coupon collector should make visits to the shop in order to have atleast one copy of all distinct types of coupons . This coupon collector problem can sound a little confusing to a fresh reader. For simplicity sake we can assume that, there are differently coloured coupons at the shop. The question then is, on average (i.e., expectation) how many times one needs to visit (each visit fetch a coupon) the shop so that all coloured coupons are fetched atleast once.

There are different type of coupons. The coupon collector collects a coupon upon each visit. The collected coupon is among the types, picked uniformly at random (from a set of possibly large pool of coupons) . Since the coupon is drawn uniformly at random, there is a non zero probability that some of the collected coupons over multiple visits may be of the same type. Suppose that at some stage, the coupon collector has different type of coupons collected. The probability that his next visit fetch a new coupon type (not of the types he already have in the kitty) is . So, the expected number of coupons to be collected to fetch a new coupon type is . Let us denote this number by .

The expected value . From this we can compute the expected value of . In other words, , the expected number of coupons to be collected (i.e, number of visits to the shop!) so that, the he would have all the different types of coupons is:

So, what is the trouble? This number is prohibitively high a number to be acceptable (as decoding time of is significantly higher than the wishful linear time !). So, simply using degree is not a good idea. This is why Luby went ahead and identified some smarter distribution like Soliton (and its variants proposed later on, such as robust soliton and then the recent raptor codes by Amin).

Stumbled upon the news on New York times: it is about a new search engine being developed by some former Google folks. First there is excitement when it comes to a startup idea when you know that they know how it is to be confronting their former employers in business. Anyway, the new engine is called cuil (pronounced just like ‘cool’). I am all for new ideas. Hopefully we are into better search engines. Since these folks are also from Google, you can expect a certain Google standard guaranteed. Google undoubtedly changed the search engine business, by simply scaling the internet to a level hitherto unimagined. Yet again, a Stanford connection to a new startup. Tom Costello and his wife Anna Patterson (former Google architect) surely will know this business better than us (correction, better than me to say the least).

If their motto of producing a more appropriate search engine, bettering Google, then we should feel happy and proud of this adventure. Surely Google cant relax either. In all it is a win win for the world. A preliminary look at the search engine game me a good feel. I am not sure whether the change in appearance (after being stuck and used to Google search for so long) gives me this impression. Anyway I look forward to see their progress.

I leave it to you to try out for a comparison. I did a Cuil on “compressed sensing” and found this where as a google of “compressed sensing” displayed this. Google displayed the search result as a list (rows) where as the Cuil results to a tabular form. Too early to say anything discrete, but I am going to try the new one as well. Google is by far the fastest (at the moment).

While searching for the book Information theory, Coding theorems for discrete memoryless systems by Imre Csiszar and Janos Korner, I came across several sites which echoes the fact that this book is one of the rarest specimen on earth. However in the process, I found a blog forum which lists a whole lot of out of print books. This book, as expected is in demand already. We can even vote to request to bring the out of print books to a possible reprint. I am not sure how effective this is, but there is no harm in trying! We can suggest any books you may want to have a reprint. To me, this is a whole new and welcome idea. For learning, we should have access to the good books. Already quite a few demands (See this blog for instance) for the Csiszar and Korner book. Man, the world sometimes think alike!

I have been posting my random thoughts, somewhat on irregular terms at Ratna’s ergodic view. The way I was led to this new blogging door is motivated by Compressed sensing. Well, I must not mislead anyone. I have heard about compressed sensing as an active research area, but today’s talk by Martin Wainwright on *Statistical estimation in high-dimensional settings: Practical and information-theoretic limits *in a way did more help than expected. It was indeed a very very motivating talk. My professor Rudiger Urbanke had earlier suggested to attend this talk and had promised it to be a remarkable one. Martin proved him dead right. Well, that is not the point. During the talk Martin touched upon examples and possible applications of estimations in high dimensional settings in compressed sensing. He described the compressed sensing problem in a simple setting. In all, the talk triggered me to study what is this subject of research namely, compressed sensing. First thing we must do is google for compressed sensing. It leads to the obvious pages. The IEEE signal processing magazine had a small tutorial paper published in 2007 July edition. It was quite an easy read which gave me a good idea on this topic. Then I chanced upon seeing the blog of Professor Terence Tao. Incredible timing and I am so glad that this visit had two immediate effects. First I am heaven pleased with his humbleness to share his thoughts to the world, in spite of being busy with his work and research. In this blog entry I came across, he describe the problem of compressed sensing with a very very simple example. He has not only presented it well for the target audience, but also gave it in a very very motivating manner to aid the interested audience to explore further. To be very honest, I did not expect a Fields medalist to spare his valuable time to help a larger audience by writing about a subject which is not easy to understand. His lucid presentation style is indeed a lesson I like to take as valuable note when writing articles.

In his blog Terence talks about a whole lot of things and I would very very highly recommend anyone, especially students and folks who are eager to learn just about anything. In one blog, he writes about the importance of writing down just about anything we learn, doest matter whether it is a simple thing you understood or a part of proof gathered. I have been following this method for a while and I found it amazingly useful as well. Well, I have decided to strictly enforce this on a more regular term.

Just aside, the beautiful thing about blogging in wordpress is its simplicity to incorporate mathematical expressions (Latex style editing is simple awesome). I have been searching for a simple way and here is the way. I am glad. Say for example this is pretty. I am going to enjoy this:-)

I promise to write more about compressed sensing (and about several other things as I always) in future blogs. For now, this much is good enough for a first blog!

Today, as part of EPFL annual research day, there were 3 interesting talks. In the morning Prakash Narayan gave a very interesting talk titled “Common randomness, multiuser secrecy and tree packing”. Essentially it covered three distinct problems and he showed a connection among the three. The first problem setup is the following: A set of terminals observe separate but correlated signals. The classical Slepian and Wolf formulation of the data compression then is essentially the problem where a subset of the given terminals seeking to acquire the signals observed by all the terminals. And this is done by means of efficiently compressed inter terminal communication. This is a problem of generating common randomness. This of course does not involve any secrecy constraints. Now suppose a secret key generation problem. There the same subset of terminals seek to devise “secret” common randomness or a secret key through public communication. Assume here that an eavesdropper can observe this. So the setup is such that the key is concealed from the eavesdropper. Such a secret key can be used for subsequent encryption. Prakash’s talk was then to explain the connection between the two problems. He went on to establish the connection to a problem in computer science namely the maximal packing og Steiner trees in an associated multi graph. I dont think I figured out the details that well, but it triggered some curiosity to read the work a little more detail. I hope to do that sometime soon.

The afternoon session had two talks. One was by Shamai who talked about Broadcast approach in communication systems. It went over time. I thought I focused well in the beginning to follow him, but partly because of the post lunch effect and partly because of the tiredness I lost the flow. From what I understood, he outlined a lot of communication scenarios incorporating the broadcast strategy. Some examples were MIMO rate diversity trade off, ARQ, multilayer schemes etc. A lot the work seems to have gone in this direction, especially Suhas and Sanket etc (from the citation) and David Tse, L. Zheng, Al-Dahir and Shamai himself. I am somewhat amazed by the areas Shamai worked on. He seems to have covered a broad spectrum of research and yet produced some stellar work.

After Shamai, it was an interesting talk by Amos Lapidoth. He presented handsomely. I was attentive enough to follow this. Also, it happened to be a talk of different kind. He talked about the well known Matched filter used in communication. He sort of started with a little story. The story of a man from a village, venturing out of that place with a mission to find the meaning of life. So he goes to the mountains with a resolve not to come back until he finds the meaning of life. So days passed, months passed and years passed. Even after 10 years no sign of him. Finally he comes back after 11 years or so. The whole village feels curious: Aha he has come back. They ask him, wow, you have figured out the meaning of life. Please share us what is it? He says, with a pause: Life is (he pauses again)…. : Villages out of patience ask him, : ” You please go on .. life is …”. The man completes and says ” Life is like a train!”. Then they ask what you mean by “life is like a train”. Then to the surprise of the entire village he says, “may be not!”.

That was simply amazing a prelude for the talk. The talk abstract is the following:

One of the key results of Digital Communications can be paraphrased very roughly as follows: “in guessing which of two deterministic signals is being observed in white Gaussian noise, the inner products between the observed waveform and each of the signals form a sufficient statistic. Consequently, it is optimal to base one’s decision on these two inner products.” It is surprising that this basic result is never formulated as a theorem in any of the textbooks on the subject. This may be because of the difficulties in defining white Gaussian noise, in defining sufficient statistics for waveform observations, and in relating sufficiency to optimal detection. In this talk I shall describe a number of approaches to formulating the above statement as a theorem and point out some of their shortcomings. I will finally describe my proposed approach, formulate the theorem, and prove it from first principles. The proposed approach does not rely on the Ito Calculus, on Brownian Motion, or on generalized stochastic processes. It does not introduce non-physical infinite-power noise processes. Moreover, it is suitable for rigorously treating colored noise.

He gave a counter example where we can do better than matched filter. He says a Gaussian noise, but choose a point at random where the noise is made zero. Since it is randomly chosen (the null point) he claims it is still Gaussian. To me, that will result in SNR to blow up to infinity. So, are we missing something. I cant wait to read the full paper presentation of this. Otherwise, it seem to be a very very interesting way to look at matched filter, without needing the sojourn mathematical machinery.

Anyway all these talks are available (schedule at the moment) at [1]

[1]http://ic.epfl.ch/page65253-fr.html

Last winter Etienne Perron, Suhas Diggavi and myself together, have developed a tool suit to prove inequalities in information theory. The tool is adapted from the previous work of Raymond Yeung and Ying-On Yan at Cornell. We have made it a complete C based software and removed the matlab dependency in the back end. There is also a pre-parser (using lex and yacc) built in to have flexibility on choosing random variable names. More importantly, a graphical front end is developed (using Gtk), which works well across the platform. Even though the beta version was ready in late 2007, for many reasons, including exhaustive testing (we always find scope for improvement) it was delayed. Last month, we finally made an official release. The original xitip project page in IPG has a short description and pointer to the exclusive Xitip page in EPFL (http://xitip.epfl.ch). A lot of things still need to be done, before we could say it is satisfactory. One of the main thing pending is the user guide and some kind of exemplified documentation. There is a technical report, I have prepared, but that is a bit too technical at the moment. Of course Raymond yeung’s amazing papers introducing the theoretical idea behind this prover and his book are valuable resources. I have tried to provide a little more easy understanding of the concept using some illustration and toy examples. I hope to put this report anyway in the EPFL repository sometime. The first version of the project discussing the background is available here in PDF form.

The software is open source. If you are not bothered to compile and make an executable yourself, then please download the binary executable and just run. It is just a matter of double click in the latter case. We have Linux, Windows, Windows(Cygwin) and Mac versions available. There are two different linear programming software used. One is a Gnu open source GLPK and the other one is Qsopt (developed at Gatech). The Qsopt version is faster than the GLPK. Just in case you are obsessed with a perfect open source model, you could avail the GLPK [5] version.

Hopefully during this summer we will get to complete the pending work on this project. If any of you happen to find it interesting please don’t forget to update us, on what you though about the software (Comments can be good, bad and ugly!).

Aside, I better mention this: Xitip is a software useful for proving (verifying) Information theoretic inequalities [7] only. Such inequalities contain expressions involving measures such as entropy, mutual information etc. It is a pretty handy tool if you are trying to prove some limiting bounds in information theory. In reality, there is broad classification of Shannon type and non-Shannon type inequalities. Non-Shannon type inequalities are not many, but they exist. Xitip at the moment is equipped to solve only the Shannon type inequalities. You can expect more information on this at the Xitip home page [2]

[1]http://ipg.epfl.ch/doku.php?id=en:research:xitip

[2]http://xitip.epfl.ch

[3]http://www2.isye.gatech.edu/~wcook/qsopt/

[4]http://user-www.ie.cuhk.edu.hk/~ITIP/

[5]http://www.gnu.org/software/glpk/

[6]http://en.wikipedia.org/wiki/Information_theory

[7]http://en.wikipedia.org/wiki/Inequalities_in_information_theory

On the day before yesterday, while on travel I picked up two books from the Bangalore airport book shop. One was the Helen Keller autobiography “The Story of My Life”. This was perhaps the first full volume autobiography book I had read, when I was a child. I still remember, the used copy of this book being given to me, by a family friend and teacher, when I was in early school. The vivid recollection of Helen Keller learning the alphabets and notably, words like ‘rain’ sill echoes somewhere in the back of my mind. I decided to buy this book, with the plan to gift it to someone, who I come across, preferably young kid in school or so. Eventually, I gifted this to a friend of mine who, when I mentioned about the same, was all excited to grab it. In a way I felt happy because I could instigate interest in some one about Helen Keller. Keller’s life has been a message in itself to humanity. Her strives and constant efforts to learn to react to this world is touching to say the least.

The other book which I have picked along is on the story of “Nicolas Bourbaki”. It is titled “The artist, the mathematician: The story of Nicolas Bourbaki, The genius mathematician who never existed”. It was authored by Amir Aczel, the author of the book on ‘The Fermats Last theorem’. I never heard about this book before, either in reviews or from friends. Well, I must confess that I didnt really search for books or reviews on a topic of this kind, to really hear about such a book. Anyway, the book is a recent print and the news may be just coming out only.

This book turned out to be quite an interesting one. On the flight and while on wait, I could quickly finish reading the book. Being a math and mathematician related book, I didn’t feel sleepy over the book. That helped to complete the reading at a stretch. The book is about Nicolas Bourbaki. Out of my innocence, I never heard about this character before, either. As the title says, there was no specimen of this kind in human form lived upon earth. Yet, there was an identity to this Nicolas Bourbaki. The real truth about this name is revealed a little later in the book. Nicolas Bourbaki was not just one person, but an identity for a collection of mathematicians, mostly French. Together they have published a lot of work in mathematics, under the identity of one name “Nicolas Bourbaki”. How does that sound? Well, it sounds a little weird and a little interesting to me. Can we say, Nicolas Bourbaki was an institution itself? (Well there is a story about Nicolas Bourbaki being applied to get an AMS, American Mathematical Society and gets a reply to pay the institutional pay to get the same. Of course, the AMS president that time, had known the truth about the name Nicolas Bourbaki!)

So, Nicolas Bourbaki represented a group of Mathematicians, post 1935 (between first and second world war period), who formulated a systematic structure to the study of mathematics itself. They have worked predominantly on the Modern mathematics (modern algebraic concepts) including the concept of ‘sets’. As you would expect (because of the time line and French connection), the star mathematician André Weil was one of the founding members of this group. Some of the other well known French (connected) mathematicians also figured in the celebrated group. They iuncluded names like Henri Cartan, Claude Chevalley, Jean Coulomb, Jean Delsarte, Jean Dieudonné, Charles Ehresmann, René de Possel, Szolem Mandelbrojt. Some later entrants to the Bourbaki group had the likes of Laurent Schwartz, Jean-Pierre Serre, Alexander Grothendieck, Samuel Eilenberg, Serge Lang and Roger Godement. The group would meet regularly to discuss and debate on modern mathematical topics and eventually publish under one name Nicolas Bourbaki!

Even though Bourbaki does not exist (at least actively) today, it is believed that this has brought some ‘structure’ in not only mathematics, but in European culture in general. Besides, the stellar contributions on the field of algebra and number theory by this group is acknowledged very well.

Now, another reason, why this book took my interest is because of the presence of the mysterious Alexander Grothendieck. The book talks rather in detail about this mystery man. He was regarded as a genius and had contributed immensely to the world of mathematics. His fields medal in 1966 and related absence from the Moscow event on his own Fields medal confer perhaps tells more volumes about this man. I took by surprise to hear about the current life of this once world star of mathematics. Post 1991 he is believed to be living in a remote jungle/outskirts of Southern France, far from any worldly contacts. I quickly recollected the 2006 Fields medal winner Grigori Perelman, when he declined the medal and began to live away from mathematics crowd. Like Perelman, who (now accepted by everyone) proved the Poincaré Conjecture, Grothendieck as well, produced some monumental contributions. He was considered as a Genius and it was recognized too. But in close resemblance to what Perelman did in 2006, he as well decided to quit mathematics altogether and decided to live a mysterious and aloof life.

What makes, genius folks like Perelman and Grothendieck to abandon themselves from the world? It is surely not fear, simply because they have contributed so much to us and can only be proud of it. What else then? There may be arguments that Grothendieck was disappointed by the fact that, he couldn’t get the results he wanted from his political activities (He had strong political views on certain things against the establishment). Others say that, the childhood miseries and leading an abandoned life early on triggered him to do an isolated life, out of frustration and despair. In Perelman’s case, it is surely a hurt feeling and the one that of ignorance by the mathematical community, for questioning his credentials on the Poincaré conjuncture proof. But can the human mind, which is otherwise so highly brilliant all of a sudden taken aback by a feel of hurt? I feel sorry for these two fine mathematicians, who could have produced much more (They have already done so much, but….) to the world. The life and fate of these two tells the story of the society we all are in, where jealousy, politics, lack of respect to others can completely eradicate a gem, which we should have preserved at all cost.

Tears and anxiety glues through my mind on the current whereabouts of Grothendieck. I hope he lives a happy life around Montpelier! Human mind, man is a highly complex black box. What all it takes and what all it breaks?

I was a little surprised to learn that Sanskrit fits the bill to become a computer language. Apparently, Forbes in 1987 claimed to have reported (or may have researched and produced a report) that Sanskrit is very suitable to use in computer (as a programming language?) because of its perfect syntax. Interestingly, Sanskrit has very little room for error as well. Well, I am not a linguistics or syntax expert, but this amazes me. How could a language so very perfect in grammatical sense become so obsolete? That too, a language originated and well used in a huge country (with huge population) went on to become obsolete! Well, the more quoted reasoning is that, the language and learning itself was restricted to the elite class in the earlier Indian society. Whatever be the past, there is scope for redemption now, then!

Now, I chanced upon to see another piece of report on Sanskrit and its usability on computer. This time, it is from none other than NASA. This report seconds, Forbes claim. Nasa’s study was mainly from the feasibility of using Sanskrit in artificial intelligence (AI). According to a Nasa researcher [1,2],

“In ancient India the intention to discover truth was so consuming, that in the process, they discovered perhaps the most perfect tool for fulfilling such a search that the world has ever known — the Sanskrit language. There is at least one language, Sanskrit, which for the duration of almost 1000 years was a living spoken language with a considerable literature of its own. Besides works of literary value, there was a long philosophical and grammatical tradition that has continued to exist with undiminished vigor until the present century. Among the accomplishments of the grammarians can be reckoned a method for paraphrasing Sanskrit in a manner that is identical not only in essence but in form with current work in Artificial Intelligence. This article demonstrates that a natural language can serve as an artificial language also, and that much work in AI has been reinventing a wheel millennia old.

The discovery is of monumental significance. It is mind-boggling to consider that we have available to us a language which has been spoken for 4-7000 years that appears to be in every respect a perfect language designed for enlightened communication. But the most stunning aspect of the discovery is this: NASA the most advanced research center in the world for cutting edge technology has discovered that Sanskrit, the world’s oldest spiritual language is the only unambiguous spoken language on the planet. Considering Sanskrit’s status as a spiritual language, a further implication of this discovery is that the age old dichotomy between religion and science is an entirely unjustified one. It is also relevant to note that in the last decade physicists have begun to comment on the striking similarities between their own discoveries and the discoveries made thousands of years ago in India which went on to form the basis of most Eastern religions.

OK, then, what makes a spoken language suitable as a programming language. Does it mean that a language so perfect in grammar make it as a perfect candidate in computer parlance. In loose term, this make sense, since the syntax and semantic description can be defined a priori. After all, computer is a dummy box! But the truth could be a little deeper. Anyway, I cant wait to understand a little bit of those rationale behind the suitability of a good computer language.

[1]http://www.hinduwisdom.info/Sanskrit.htm

[2]**http://www.americansanskrit.com**

## Recent Comments