where X(1) = empty sequence and X(n) = alpha followed by X(alpha) where alpha is chosen uniformly from {1…n}.

One of the most intriguing phenomena exposed in this paper is the following. Let B_{i,n} be the event that i appears in the sequence X(n). Question: for 1 <= i < j <= n, are the events B_{i,n} and B_{j,n} independent? Very counterintuitively (at least to me), the paper notices that they are, and then builds on this!
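For readers who want to see the independence numerically, here is a minimal simulation sketch. It reads the definition literally (alpha uniform on {1, ..., n}, endpoints included, with X(1) empty); the function names `sample_sequence` and `estimate`, the trial count, and the seed are my own illustrative choices, not anything from the paper.

```python
import random

def sample_sequence(n, rng):
    """Draw X(n): X(1) is empty; otherwise emit alpha, then continue as X(alpha)."""
    seq = []
    while n != 1:
        alpha = rng.randint(1, n)  # uniform on {1, ..., n}, both ends inclusive
        seq.append(alpha)
        n = alpha
    return seq

def estimate(i, j, n, trials=20000, seed=0):
    """Empirical frequencies of B_{i,n}, B_{j,n}, and of both events together."""
    rng = random.Random(seed)
    hit_i = hit_j = hit_both = 0
    for _ in range(trials):
        seen = set(sample_sequence(n, rng))
        in_i, in_j = i in seen, j in seen
        hit_i += in_i
        hit_j += in_j
        hit_both += in_i and in_j
    return hit_i / trials, hit_j / trials, hit_both / trials

if __name__ == "__main__":
    p_i, p_j, p_both = estimate(3, 6, 10)
    # Independence predicts p_both to be close to p_i * p_j.
    print(p_i, p_j, p_both, p_i * p_j)
```

If the events are independent, the joint frequency and the product of the two marginal frequencies should agree up to sampling noise.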

– Madhu

The original Goldreich-Levin algorithm (as in the paper) is best explained in the “Fourier coefficient” language. Kushilevitz and Mansour introduced this interpretation and thereby paved the way for major progress in learning theory. In this language, the problem we are trying to solve is to find all “large” Fourier coefficients of your function B, where the a-th Fourier coefficient is Bhat(a) = Pr_x[B(x) = ⟨a,x⟩] - Pr_x[B(x) != ⟨a,x⟩], with ⟨a,x⟩ denoting the inner product mod 2. Applying Parseval’s identity, one sees that sum_a Bhat(a)^2 = 1, and we are looking for all a’s such that Bhat(a)^2 >= eps^2. [The Chebyshev-based upper bound on the number of a’s follows in this language from this interpretation of Parseval.]

The key idea of Goldreich-Levin is to build a list of prefixes a_0 (of length i) such that for each a_0 in the list, there is a suffix a_1 such that for a = a_0 . a_1, the Fourier coefficient satisfies Bhat(a)^2 >= eps^2. (Strictly speaking, they build a list which includes all such a_0’s but may include some spurious ones as well; the list size is never large.) A simple way to do this in the Fourier language is to estimate, for any given a_0, the quantity sum_{a_1} Bhat(a_0 . a_1)^2. Using simple Fourier expansions, this quantity turns out to be linearly related to Exp_{p_1,p_2,s} [ B(p_1 s) parity B(p_2 s) parity ⟨a_0, p_1 + p_2⟩ ], where p_1, p_2 are random prefixes of length i, s is a random suffix, and ⟨.,.⟩ is the inner product mod 2. This latter quantity one can estimate by random sampling.
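The sampling step can be sketched as follows. This is only an illustration under stated assumptions: B is taken to be a 0/1-valued Python function on bit tuples, and I use the standard form of the identity, in which the sampled parity also includes the a_0-dependent term ⟨a_0, p_1 + p_2⟩ (inner product mod 2). The names `inner` and `estimate_prefix_weight`, the sample count, and the seed are hypothetical.

```python
import random

def inner(a, x):
    """Inner product mod 2 of two equal-length bit tuples."""
    return sum(ai & xi for ai, xi in zip(a, x)) % 2

def estimate_prefix_weight(B, n, a0, samples=4000, rng=None):
    """Estimate sum over suffixes a1 of Bhat(a0.a1)^2 by random sampling.

    B maps an n-bit tuple to 0 or 1; a0 is a prefix (tuple of bits) of length i.
    """
    rng = rng or random.Random(0)
    i = len(a0)
    total = 0
    for _ in range(samples):
        p1 = tuple(rng.randint(0, 1) for _ in range(i))
        p2 = tuple(rng.randint(0, 1) for _ in range(i))
        s = tuple(rng.randint(0, 1) for _ in range(n - i))
        # Parity of the two evaluations and the a0-dependent correction bit.
        bit = B(p1 + s) ^ B(p2 + s) ^ inner(a0, p1) ^ inner(a0, p2)
        total += 1 - 2 * bit  # map parity bit {0, 1} to sign {+1, -1}
    return total / samples
```

As a sanity check, if B(x) = ⟨a*, x⟩ is itself linear, the estimate is exactly 1 when a_0 is a prefix of a* and close to 0 otherwise.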

To use this estimation, one can imagine a complete binary tree of depth n with leaves labelled by strings a of length n with value Bhat(a)^2, and internal nodes carrying a value equal to the sum of their two children. The Goldreich-Levin-Kushilevitz-Mansour algorithm can be viewed as exploring every node with value at least eps^2, starting from the root and going downwards. The values on any level sum to 1, so any level has at most 1/eps^2 nodes of value at least eps^2, and the total number of such nodes explored is at most n/eps^2.
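The tree walk itself can be sketched in a few lines. To keep the example short and deterministic, I compute node values exactly by brute force over all 2^n inputs, which is only feasible for tiny n; the actual algorithm would use a sampled estimate instead. The function names here are hypothetical, and B is assumed to be a 0/1-valued function on bit tuples.

```python
from itertools import product

def fourier_coefficient(B, n, a):
    """Bhat(a) = Pr[B(x) = <a,x>] - Pr[B(x) != <a,x>] over uniform x in {0,1}^n."""
    total = 0
    for x in product((0, 1), repeat=n):
        dot = sum(ai & xi for ai, xi in zip(a, x)) % 2
        total += 1 if B(x) == dot else -1
    return total / 2 ** n

def prefix_weight(B, n, a0):
    """Value of tree node a0: sum of Bhat(a0.a1)^2 over all suffixes a1."""
    return sum(fourier_coefficient(B, n, a0 + a1) ** 2
               for a1 in product((0, 1), repeat=n - len(a0)))

def heavy_coefficients(B, n, eps):
    """Walk the tree level by level, keeping only nodes of value >= eps^2."""
    frontier = [()]  # the root is the empty prefix
    for _ in range(n):
        frontier = [a0 + (b,) for a0 in frontier for b in (0, 1)
                    if prefix_weight(B, n, a0 + (b,)) >= eps ** 2]
    return frontier  # all a of length n with Bhat(a)^2 >= eps^2
```

For example, with B the majority of three bits and eps = 0.4, the walk should return the four strings 100, 010, 001 and 111, each of which has squared coefficient 1/4.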

Incidentally, there have been many other interpretations/variations of Goldreich-Levin, including some of my own work. In work with Goldreich and Rubinfeld, we extended this work to computing linear and higher-degree functions over other fields. A work with Trevisan and Vadhan did a much better job of the higher-degree case. In work with Dinur, Grigorescu and Kopparty, we viewed the G-L paper as decoding “homomorphisms between groups” and tried to extend it (some questions are still open here). Finally, the most recent works along these lines are due to Gopalan-Klivans-Zuckerman and Gopalan; again, many elegant ideas can be found in these papers, along with more open questions.
