A matrix $M : A \times X \to \{-1, 1\}$ corresponds to the following learning problem: An unknown element $x \in X$ is chosen uniformly at random. A learner tries to learn $x$ from a stream of samples, $(a_1, b_1), (a_2, b_2), \ldots$, where for every $i$, $a_i \in A$ is chosen uniformly at random and $b_i = M(a_i, x)$.
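As a concrete illustration (not taken from the paper), the classical parity-learning problem fits this template with $A = X = \{0,1\}^n$ and $M(a, x) = (-1)^{\langle a, x \rangle \bmod 2}$. A minimal sketch of the sample stream, with all names hypothetical:

```python
import random

n = 8  # dimension; for the parity instance, A = X = {0,1}^n
x = [random.randrange(2) for _ in range(n)]  # unknown element, chosen uniformly

def M(a, x):
    # Parity instance: M(a, x) = (-1)^(<a, x> mod 2)
    return 1 - 2 * (sum(ai * xi for ai, xi in zip(a, x)) % 2)

def sample():
    # One sample (a_i, b_i): a_i uniform over A, b_i = M(a_i, x)
    a = [random.randrange(2) for _ in range(n)]
    return a, M(a, x)

stream = [sample() for _ in range(5)]
```

The learner sees only `stream`; learning $x$ amounts to solving the linear system the samples encode, which is easy with $n$ bits of memory per equation but, as the results below show, hard with little memory and few samples.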
Assume that $k, \ell, r$ are such that any submatrix of $M$ of at least $2^{-k} \cdot |A|$ rows and at least $2^{-\ell} \cdot |X|$ columns has a bias of at most $2^{-r}$. We show that any learning algorithm for the learning problem corresponding to $M$ requires either a memory of size at least $\Omega(k \cdot \ell)$, or at least $2^{\Omega(r)}$ samples. The result holds even if the learner has an exponentially small success probability (of $2^{-\Omega(r)}$).
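The bias of a submatrix here is the absolute value of the average of its entries. A small sketch (my own illustration, not from the paper) computing this for the full $n = 3$ parity matrix, whose bias is exactly $2^{-n}$ because every nonzero row sums to zero:

```python
from itertools import product

n = 3
rows = list(product([0, 1], repeat=n))  # A = {0,1}^n
cols = list(product([0, 1], repeat=n))  # X = {0,1}^n

def M(a, x):
    # Parity matrix entry: (-1)^(<a, x> mod 2)
    return 1 - 2 * (sum(ai * xi for ai, xi in zip(a, x)) % 2)

def bias(row_subset, col_subset):
    # |average entry| over the chosen rows and columns
    total = sum(M(a, x) for a in row_subset for x in col_subset)
    return abs(total) / (len(row_subset) * len(col_subset))

full_bias = bias(rows, cols)  # equals 2**-n: only the all-zero row is biased
```

Tiny submatrices can of course have bias $1$ (e.g. the single entry at $a = x = 0$); the hypothesis only constrains submatrices with at least $2^{-k}|A|$ rows and $2^{-\ell}|X|$ columns.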
In particular, this shows that for a large class of learning problems, any learning algorithm requires either a memory of size at least $\Omega\big((\log |X|) \cdot (\log |A|)\big)$ or an exponential number of samples, achieving a tight $\Omega\big((\log |X|) \cdot (\log |A|)\big)$ lower bound on the size of the memory, rather than a bound of $\Omega\big(\min\{(\log |X|)^2, (\log |A|)^2\}\big)$ obtained in previous works [R17, MM17b].
Moreover, our result implies all previous memory-samples lower bounds, as well as a number of new applications.