# Eric Miles: Goldreich-Levin

I think the algorithm I was most surprised by when I first saw it is the Goldreich-Levin algorithm.

The problem is, given an oracle B() such that B(x) = &lt;x,r&gt; with probability 1/2 + eps over random x, for some unknown r, use B to find r with probability poly(eps). The idea is to choose x1,…,xm uniformly at random for m = O(log 1/eps), and then “guess” bits b1,…,bm such that &lt;xi,r&gt; = bi; these guesses will all be correct with probability 2^{-m} = poly(eps). Assuming they are correct, we then get poly(1/eps) pairwise independent vectors, for which we know the inner product with r, by taking x_S = sum_{i in S} xi and b_S = sum_{i in S} bi over all nonempty subsets S of {1,…,m}. Finally, we can compute the ith bit of r by taking the majority vote of b’ + B(x’ + ei), where x’,b’ range over the pairwise independent elements and ei is the bit vector whose only 1 is in the ith position. The majority vote is correct whp by Chebyshev, because of the pairwise independence and the success probability of B.
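The steps above can be sketched in code. This is only an illustrative toy, not the paper's algorithm verbatim: the oracle interface, parameter defaults, and helper names below are my assumptions, and the output is a candidate list in which r appears whenever the guesses for the correct branch are right.

```python
import itertools
import random

def goldreich_levin(B, n, m=4):
    """Sketch of the algorithm described above (assumed interface).

    B: oracle taking a length-n 0/1 tuple x, returning a bit that agrees
       with <x, r> on a 1/2 + eps fraction of inputs, for an unknown r.
    Returns a list of 2^m candidate vectors; on the branch where the
    guesses b1..bm are all correct (probability 2^-m), the majority
    votes recover r whp.
    """
    # Step 1: pick x1..xm uniformly at random.
    xs = [tuple(random.randrange(2) for _ in range(n)) for _ in range(m)]

    def subset_sum(vectors):
        # XOR (sum over GF(2)) of a list of length-n 0/1 vectors.
        out = [0] * n
        for v in vectors:
            out = [a ^ b for a, b in zip(out, v)]
        return tuple(out)

    candidates = []
    # Step 2: try every guess b1..bm for the bits <xi, r>.
    for guess in itertools.product((0, 1), repeat=m):
        # Step 3: expand the m guesses into 2^m - 1 pairwise independent
        # pairs (x_S, b_S) via subset sums; if the guesses are correct,
        # then b_S = <x_S, r> for every nonempty S.
        pairs = []
        for k in range(1, m + 1):
            for S in itertools.combinations(range(m), k):
                x_S = subset_sum([xs[i] for i in S])
                b_S = sum(guess[i] for i in S) % 2
                pairs.append((x_S, b_S))
        # Step 4: recover bit i of r by majority vote of b_S + B(x_S + ei).
        r = []
        for i in range(n):
            votes = 0
            for x_S, b_S in pairs:
                x_query = list(x_S)
                x_query[i] ^= 1  # x_S + ei
                votes += b_S ^ B(tuple(x_query))
            r.append(1 if 2 * votes > len(pairs) else 0)
        candidates.append(tuple(r))
    return candidates
```

With a noiseless oracle (eps = 1/2) the true r is always in the returned list; with a noisy oracle the Chebyshev argument says it appears whp, at the price of 2^m spurious candidates.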

I found this algorithm surprising mainly for two reasons. One is simply that pairwise independence is much more powerful than it at first appears. The second is that this way of generating an exponential number of vectors and inner product bits that are “random enough”, but still correct assuming the original guesses were correct, is very clever, and it was the first time I had seen this trick.


I like this algorithm very much as well, though in the form you describe it, it is actually an algorithm attributed to Rackoff (and also, independently, to Levin; I am not sure).

The original Goldreich-Levin algorithm (as in the paper) is best explained in the “Fourier coefficient” language. Kushilevitz and Mansour introduced this interpretation and thereby paved the way for major progress in learning theory. In this language, the problem we are trying to solve is to find all “large” Fourier coefficients of your function B, where the a-th Fourier coefficient is

Bhat(a) = Pr_x[B(x) = &lt;a,x&gt;] – Pr_x[B(x) != &lt;a,x&gt;].

Using Parseval’s appropriately, one notices that sum_a Bhat(a)^2 = 1, and we are looking for all a’s such that Bhat(a)^2 >= eps^2. [The Chebyshev-based upper bound on the number of such a’s follows in this language from this interpretation of Parseval.]
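To make the definition concrete, here is a brute-force computation of the coefficients Bhat(a) for a small n, together with the Parseval check sum_a Bhat(a)^2 = 1. The function name and interface are mine, chosen for illustration.

```python
import itertools

def fourier_coefficients(B, n):
    """Compute Bhat(a) = Pr[B(x) = <a,x>] - Pr[B(x) != <a,x>]
    for every a in {0,1}^n, by brute force over all 2^n inputs.
    B maps a length-n 0/1 tuple to a bit."""
    points = list(itertools.product((0, 1), repeat=n))
    coeffs = {}
    for a in points:
        agree = sum(1 for x in points
                    if B(x) == sum(u * v for u, v in zip(a, x)) % 2)
        # agreements minus disagreements, normalized by 2^n
        coeffs[a] = (2 * agree - len(points)) / len(points)
    return coeffs
```

For a linear function B(x) = &lt;r,x&gt; this puts all the Fourier weight on the single coefficient Bhat(r) = 1; for any Boolean B, the squared coefficients sum to 1.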

The key idea of Goldreich-Levin is to build a list of prefixes a0 (of length i) such that for each a0 in the list, there is a suffix a1 such that, for a = a0.a1, the Fourier coefficient satisfies Bhat(a)^2 >= eps^2. (Strictly speaking they build a list which includes all such a0’s but includes some spurious ones as well, but the list size is never large.) A simple way to do this in the Fourier language is to estimate, for any given a0, the quantity sum_{a1} Bhat(a0.a1)^2. Using simple Fourier expansions, this quantity turns out to be linearly related to Exp_{p1,p2,s} [ B(p1 s) parity B(p2 s) parity &lt;a0, p1+p2&gt; ], where p1, p2 range over random length-i prefixes and s over random suffixes, and this latter quantity one can estimate by random sampling.
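A sampling estimator for this quantity can be sketched as follows, under my assumed oracle interface; the identity used is that the subtree weight sum_{a1} Bhat(a0.a1)^2 equals E_{p1,p2,s} [ (-1)^( B(p1 s) + B(p2 s) + &lt;a0, p1+p2&gt; ) ].

```python
import random

def weight_estimate(B, n, a0, samples=2000):
    """Estimate sum_{a1} Bhat(a0.a1)^2 by random sampling.
    a0 is a 0/1 prefix of length i; p1, p2 are random length-i
    prefixes and s is a random length-(n-i) suffix."""
    i = len(a0)
    total = 0.0
    for _ in range(samples):
        p1 = tuple(random.randrange(2) for _ in range(i))
        p2 = tuple(random.randrange(2) for _ in range(i))
        s = tuple(random.randrange(2) for _ in range(n - i))
        # prefix character: <a0, p1 + p2> over GF(2)
        twist = sum(a * (u ^ v) for a, u, v in zip(a0, p1, p2)) % 2
        bit = B(p1 + s) ^ B(p2 + s) ^ twist
        total += 1 if bit == 0 else -1
    return total / samples
```

For a noiseless linear B(x) = &lt;r,x&gt;, the estimate is exactly 1 when a0 matches the length-i prefix of r, and concentrates near 0 otherwise.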

To use this estimation one can imagine a complete binary tree of depth n, with leaves labelled by strings a of length n with value Bhat(a)^2, and internal nodes carrying a value equal to the sum of their two children. The Goldreich-Levin-Kushilevitz-Mansour algorithm can be viewed as exploring every node with value at least eps^2, starting from the root and going downwards. Any level has at most 1/eps^2 nodes of value at least eps^2, and so the total number of nodes explored is at most n/eps^2.
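The tree exploration can be sketched directly. For clarity this toy version computes each subtree weight exactly by brute force, where the real algorithm would use the sampling estimate; the function names and interface are mine.

```python
import itertools

def km_search(B, n, eps):
    """Sketch of the Kushilevitz-Mansour tree search described above.
    Explores, level by level, every prefix a0 whose subtree weight
    sum_{a1} Bhat(a0.a1)^2 is at least eps^2; after n levels the
    surviving prefixes are exactly the a with Bhat(a)^2 >= eps^2."""
    points = list(itertools.product((0, 1), repeat=n))

    def bhat(a):
        agree = sum(1 for x in points
                    if B(x) == sum(u * v for u, v in zip(a, x)) % 2)
        return (2 * agree - len(points)) / len(points)

    def weight(a0):
        # Exact subtree weight at prefix a0 (brute force, not sampling).
        m = n - len(a0)
        return sum(bhat(a0 + a1) ** 2
                   for a1 in itertools.product((0, 1), repeat=m))

    frontier = [()]  # start at the root (empty prefix)
    for _ in range(n):
        frontier = [a0 + (b,) for a0 in frontier for b in (0, 1)
                    if weight(a0 + (b,)) >= eps ** 2]
    return frontier
```

On a linear function all the weight sits on one root-to-leaf path, so the frontier never grows; on majority-of-3 the search finds the four coefficients of squared value 1/4.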

Incidentally, there have been many other interpretations/variations of Goldreich-Levin, including some of my own work. In work with Goldreich and Rubinfeld, we extended this work to computing linear and higher-degree functions over other fields. A work with Trevisan and Vadhan did a much better job of the higher-degree case. In work with Dinur, Grigorescu and Kopparty we viewed the G-L paper as decoding “homomorphisms between groups” and tried to extend it (some questions are still open here). Finally the most recent works along these lines are due to Gopalan-Klivans-Zuckerman and Gopalan – again many elegant ideas can be found in these papers, and more open questions.

– Madhu