摘要:Some natural product leads of drugs (NPLDs) have been found to congregate in the chemical space. The extent, detailed patterns, and mechanisms of this congregation phenomenon have not been fully investigated and their usefulness for NPLD discovery needs to be more extensively tested. In this work, we generated and evaluated the distribution patterns of 442 NPLDs of 749 pre-2013 approved and 263 clinical trial small molecule drugs in the chemical space represented by the molecular scaffold and fingerprint trees of 137,836 non-redundant natural products. In the molecular scaffold trees, 62.7% approved and 37.4% clinical trial NPLDs congregate in 62 drug-productive scaffolds/scaffold-branches. In the molecular fingerprint tree, 82.5% approved and 63.0% clinical trial NPLDs are clustered in 60 drug-productive clusters (DCs) partly due to their preferential binding to 45 privileged target-site classes. The distribution patterns of the NPLDs are distinguished from those of the bioactive natural products. 11.7% of the NPLDs in these DCs have remote-similarity relationship with the nearest NPLD in their own DC. The majority of the new NPLDs emerge from preexisting DCs. The usefulness of the derived knowledge for NPLD discovery was demonstrated by the recognition of the new NPLDs of 2013–2014 approved drugs.