We propose a decentralized method for obtaining the vision graph of a distributed, ad-hoc camera network, in which each edge of the graph connects two cameras that image a sufficiently large part of the same environment. Each camera encodes a spatially well-distributed set of distinctive, approximately viewpoint-invariant feature points into a fixed-length “feature digest” that is broadcast throughout the network. Each receiving camera robustly matches its own features against the decompressed digest and decides whether sufficient evidence exists to form a vision graph edge. We also show how a camera calibration algorithm that passes messages only along vision graph edges can recover accurate 3D structure and camera positions in a distributed manner. We analyze the performance of different message formation schemes on a simulated 60-node outdoor camera network, and show that high detection rates (> 0.8) can be achieved while maintaining low false alarm rates (< 0.05).
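To make the edge decision and the two reported rates concrete, the following is a minimal sketch rather than the method evaluated here. It assumes that "sufficient evidence" can be reduced to a count of robust feature matches compared against a hypothetical threshold `min_matches`, that the detection rate is the fraction of true edges recovered, and that the false alarm rate is the fraction of true non-edges wrongly declared edges. The camera names, match counts, and ground-truth edges are invented for illustration.

```python
from itertools import combinations


def build_vision_graph(match_counts, min_matches):
    """Declare an undirected edge for every camera pair whose number of
    robust feature matches reaches min_matches (a hypothetical threshold)."""
    return {frozenset(pair) for pair, n in match_counts.items() if n >= min_matches}


def detection_and_false_alarm(estimated, truth, cameras):
    """Return (detection rate, false alarm rate) under the assumed definitions:
    detected true edges / true edges, and false edges / true non-edges."""
    all_pairs = {frozenset(p) for p in combinations(cameras, 2)}
    non_edges = all_pairs - truth
    p_d = len(estimated & truth) / len(truth) if truth else 0.0
    p_fa = len(estimated & non_edges) / len(non_edges) if non_edges else 0.0
    return p_d, p_fa


# Illustrative data only: four cameras and made-up match counts.
cameras = ["A", "B", "C", "D"]
match_counts = {
    ("A", "B"): 40, ("A", "C"): 3, ("A", "D"): 0,
    ("B", "C"): 25, ("B", "D"): 1, ("C", "D"): 12,
}
truth = {frozenset(("A", "B")), frozenset(("B", "C")), frozenset(("B", "D"))}

estimated = build_vision_graph(match_counts, min_matches=10)
p_d, p_fa = detection_and_false_alarm(estimated, truth, cameras)
print(f"detection rate = {p_d:.2f}, false alarm rate = {p_fa:.2f}")
```

Raising `min_matches` in this sketch trades detection rate for a lower false alarm rate, which is the trade-off the reported operating point (> 0.8 detection at < 0.05 false alarms) summarizes.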