LILA BC is a repository for archival data sets related to biology and conservation. Our intention is to create a valuable resource for the scientific community, including machine learning researchers and those who want to harness machine learning for biology and conservation.

Machine learning depends on labeled data, but getting access to such data in biology and conservation is a challenge. Consequently, everyone benefits when more labeled data is made available. Biology and conservation teams benefit by having more data to train on, and they can multiply the scientific impact of the data they gather when others are able to use it (we suggest listing this added benefit in grant proposals written to fund data collection). Machine learning researchers benefit by having data to experiment with.

LILA BC is for both creators of data sets, giving them a free way to share their valuable data with the scientific community, and scientists looking to benefit from labeled data sets, giving them a central resource for finding such data. We ask that if you use a data set, you give credit to the data set's creator in the manner they specify (e.g. by citing a paper describing the data set).

For more information, or to inquire about adding a data set, email

LILA BC is maintained by a working group that includes representatives from Zooniverse, the Evolving AI Lab at the University of Wyoming, the University of Minnesota Lion Center, Snapshot Safari, and Microsoft AI for Earth. Hosting is provided by Microsoft AI for Earth.