Here I present the background to, and a description of, a newly developed database of historical and contemporary lexical data for Australian languages (Chirila), concentrating on the Pama-Nyungan family (the largest family in the country). While the database was initially developed in order to facilitate research on cognate words and reconstructions, it has had many uses beyond its original purpose, in synchronic theoretical linguistics, language documentation, and language reclamation. Creating a multi-audience database of this type has been challenging, however. Some of the challenges stemmed from success: as the size of the database grew, the original data structure became unwieldy. Other challenges grew from the difficulties in anticipating future needs, in keeping track of materials, and in coping with diverse input formats for so many highly endangered languages.
In this paper I document the structure of the database, provide an overview of its uses in both diachronic and synchronic research, and discuss some of the issues that arose during the project and the choices that had to be made as the database was created, compiled, curated, and shared. I address here the major problems that arise with linguistic data, particularly with databases created for diverse audiences, from diverse data, and with little infrastructure support.