LILA BC is a repository for data sets related to biology and conservation, intended as a resource for both machine learning (ML) researchers and those who want to harness ML for biology and conservation.
Machine learning depends on labeled data, but accessing such data in biology and conservation is a challenge. Consequently, everyone benefits when labeled data is made available. Biologists and conservation scientists benefit by having data to train on, and free hosting allows teams to multiply the impact of their data (we suggest listing this benefit in grant proposals that fund data collection). ML researchers benefit by having data to experiment with.
LILA BC is intended to host data from a variety of modalities, with an emphasis on labeled images; we currently host over ten million of them.
If you use a data set hosted on LILA BC, we ask that you credit the data set owner in the manner listed on the data set’s landing page.
For more information, or to inquire about adding a data set, email info@lila.science.
We also maintain a list of other labeled data sets related to conservation.
LILA BC is maintained by Dan Morris. LILA BC was created by a working group that includes representatives from Ecologize, Zooniverse, the Evolving AI Lab, and Snapshot Safari.
Hosting on Microsoft Azure is provided by the Microsoft AI for Good Lab. Hosting on Google Cloud is provided by the Google Cloud Public Datasets program. Hosting on AWS is provided by Source Cooperative.