Automating ecological database searches

Exploring tools for automating and compiling species lists from ecological databases.

This research project explored how to automate the compilation of species lists for a specific area by integrating data from multiple biological databases. The team developed Python scripts to request, collect, and merge records from sources such as the Atlas of Living Australia, the Global Biodiversity Information Facility, Species of National Environmental Significance, WildNet, BioNet, and the Victorian Biodiversity Atlas. This replaced the slow process of downloading from each database individually and cross-checking the results by hand. The scripts recognized and consolidated data fields common to the sources, handled spatial queries via bounding boxes derived from shapefiles, and removed duplicate records, greatly streamlining the workflow.
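As a rough illustration of the field consolidation and bounding-box steps described above, the sketch below renames source-specific columns to shared Darwin Core terms and keeps only records inside a study-area bounding box. It is not the project's actual code: the source names, source column names, and helper functions are all hypothetical, and the polygon vertices are assumed to have already been read from a shapefile.

```python
# Hypothetical source-to-Darwin-Core field maps. The source column names are
# invented for illustration; the target names are Darwin Core terms.
FIELD_MAPS = {
    "source_a": {
        "Scientific Name": "scientificName",
        "Latitude": "decimalLatitude",
        "Longitude": "decimalLongitude",
        "Date": "eventDate",
    },
    "source_b": {
        "species": "scientificName",
        "lat": "decimalLatitude",
        "lon": "decimalLongitude",
        "observed_on": "eventDate",
    },
}


def to_darwin_core(record, source):
    """Rename one raw record's fields to Darwin Core terms and tag its source.

    Assumes every record carries coordinates; fields not in the map are dropped.
    """
    mapping = FIELD_MAPS[source]
    renamed = {dwc: record[raw] for raw, dwc in mapping.items() if raw in record}
    for term in ("decimalLatitude", "decimalLongitude"):
        renamed[term] = float(renamed[term])  # exports often store numbers as text
    renamed["datasetName"] = source
    return renamed


def bounding_box(vertices):
    """Return (min_lon, min_lat, max_lon, max_lat) for (lon, lat) polygon vertices."""
    lons, lats = zip(*vertices)
    return min(lons), min(lats), max(lons), max(lats)


def in_bbox(record, bbox):
    """True if a Darwin Core record's coordinates fall inside the bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return (min_lon <= record["decimalLongitude"] <= max_lon
            and min_lat <= record["decimalLatitude"] <= max_lat)


def compile_records(batches, bbox):
    """Merge per-source record batches into one list restricted to the bbox."""
    merged = []
    for source, records in batches.items():
        for raw in records:
            record = to_darwin_core(raw, source)
            if in_bbox(record, bbox):
                merged.append(record)
    return merged
```

A bounding box is a coarse filter: it can include records that fall outside an irregular study-area polygon, so a point-in-polygon check may still be needed afterwards.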


Key learnings included adopting the Darwin Core standard for biological terms, improving the team's understanding of how the databases overlap and interact, and working around challenges such as the manual downloads required for Victorian databases and administrative barriers in NSW. The project also highlighted how complex it is to define duplicate records across datasets. We hope to keep improving our understanding of biological databases across different states and their limitations, and to increase the efficiency of species list compilation across our projects.
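To make the duplicate-definition problem concrete, here is a minimal sketch showing how the number of "unique" records depends on which fields make up the matching key. The function names and parameters are hypothetical and the sample records are invented; the project's real matching rules may well differ.

```python
def duplicate_key(record, coord_places=None, use_date=True):
    """Build a comparison key from Darwin Core fields.

    coord_places: decimal places to round coordinates to, or None to ignore location.
    use_date: whether eventDate must also match for two records to be duplicates.
    """
    # Normalise case and whitespace so trivial spelling variants compare equal.
    key = [" ".join(record["scientificName"].lower().split())]
    if coord_places is not None:
        key.append(round(record["decimalLatitude"], coord_places))
        key.append(round(record["decimalLongitude"], coord_places))
    if use_date:
        key.append(record["eventDate"])
    return tuple(key)


def deduplicate(records, **key_options):
    """Keep the first record seen for each key, preserving input order."""
    first_seen = {}
    for record in records:
        first_seen.setdefault(duplicate_key(record, **key_options), record)
    return list(first_seen.values())


# Invented sample records, for illustration only.
records = [
    {"scientificName": "Phascolarctos cinereus", "decimalLatitude": -27.4701,
     "decimalLongitude": 153.0211, "eventDate": "2020-01-05"},
    {"scientificName": "phascolarctos  cinereus", "decimalLatitude": -27.4703,
     "decimalLongitude": 153.0209, "eventDate": "2020-01-05"},
    {"scientificName": "Phascolarctos cinereus", "decimalLatitude": -27.4701,
     "decimalLongitude": 153.0211, "eventDate": "2021-03-10"},
]

strict = deduplicate(records, coord_places=4)                       # exact place and date
loose = deduplicate(records, coord_places=3)                        # nearby place, same date
species_only = deduplicate(records, coord_places=None, use_date=False)
```

With these sample records the strict key keeps all three, the looser coordinate rounding merges the first two, and the species-only key collapses everything to a single entry. A species list needs only the last, whereas an occurrence dataset usually needs one of the stricter definitions.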