We say a subset C 1 2 k n is a k -hash code (also called k -separated) if for every subset of k codewords from C , there exists a coordinate where all these codewords have distinct values. Understanding the largest possible rate (in bits), defined as ( log 2 C ) n , of a k -hash code is a classical problem. It arises in two equivalent contexts: (i) the smallest size possible for a perfect hash family that maps a universe of N elements into 1 2 k , and (ii) the zero-error capacity for decoding with lists of size less than k for a certain combinatorial channel.
A general upper bound of k ! k k − 1 on the rate of a k -hash code (in the limit of large n ) was obtained by Fredman and Komlós in 1984 for any k 4 . While better bounds have been obtained for k = 4 , their original bound has remained the best known for each k 5 . In this work, we obtain the first improvement to the Fredman-Komlós bound for every k 5 . While we get explicit (numerical) bounds for k = 5 6 , for larger k we only show that the FK bound can be improved by a positive, but unspecified, amount. Under a conjecture on the optimum value of a certain polynomial optimization problem over the simplex, our methods allow an effective bound to be computed for every k .