All the camera trap datasets use their own category names… have they been mapped to a common taxonomy?
Yes! This .csv file defines a mapping from every category in every camera trap dataset on LILA to the iNaturalist taxonomy. The goal is to let users train classifiers that span datasets, e.g. to be able to find all the images of bears, or all the images of green herons, or all the images of reptiles, etc.
Some information is necessarily lost in this mapping, so it isn’t intended to replace the original categories. For example, some datasets use a category called “bird of prey”, which was presumably meaningful to whoever created the dataset and describes a group of birds with clear visual similarities, but which does not correspond to a single taxonomic subtree, so it is mapped only to “birds”.
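As a rough illustration of how such a mapping supports cross-dataset queries, here is a minimal sketch in Python. The column names (dataset_name, original_category, and one column per taxonomic level) and the sample rows are assumptions made up for this example, not the schema of the actual .csv file, so check the real header before adapting it. Note how the hypothetical “bird of prey” row is filled in only down to the class level, mirroring the information loss described above.

```python
import csv
import io

# Hypothetical excerpt of a mapping file. The column names and rows are
# illustrative assumptions, not the schema or contents of the real .csv.
SAMPLE_MAPPING = """dataset_name,original_category,class,order,family,genus,species
Dataset A,black bear,mammalia,carnivora,ursidae,ursus,ursus americanus
Dataset B,bear,mammalia,carnivora,ursidae,,
Dataset B,green heron,aves,pelecaniformes,ardeidae,butorides,butorides virescens
Dataset C,bird of prey,aves,,,,
Dataset C,rattlesnake,reptilia,squamata,viperidae,crotalus,
"""


def categories_in_taxon(rows, level, name):
    """Return (dataset, original category) pairs mapped at or below a taxon."""
    return [
        (row["dataset_name"], row["original_category"])
        for row in rows
        if row[level] == name
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_MAPPING)))

# All categories, in any dataset, that fall under the bear family.
bears = categories_in_taxon(rows, "family", "ursidae")

# "bird of prey" is only resolved to the class level, so it is found by a
# class-level query but not by any order- or family-level query.
birds = categories_in_taxon(rows, "class", "aves")
reptiles = categories_in_taxon(rows, "class", "reptilia")
```

With the real file, the same filter would be applied to whatever taxonomic-level columns it actually provides, and the resulting (dataset, category) pairs would then be used to select images from each dataset's metadata.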
Code to create and explore this .csv file is here.
If you see anything suspicious in these mappings, email us!
A preliminary mapping to the Wildlife Insights (WI) taxonomy is here. Note that this mapping is not interchangeable with the iNaturalist mapping, and is not reversible, because the iNaturalist taxonomy allows for more levels (e.g. suborder, variety) than the WI taxonomy.
If they’re all in a common taxonomy, why should I use the individual dataset metadata files?
You shouldn’t! Or at least, most of the time, you shouldn’t. The lines between datasets are a historical artifact, and we’re encouraging users to treat all the camera trap data on LILA as one big dataset. This much larger .csv file has at least one row for every camera trap image on LILA, including a full URL to that image, and the full mapping for that image into the iNaturalist taxonomy. Multiple rows may be present for the same image if multiple species are present.
This combined dataset (with all the camera trap data on LILA) is also available as a Hugging Face dataset.
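Because an image can appear on more than one row of the combined .csv file, it usually helps to group rows by image before training. The sketch below shows one way to do that with the Python standard library. The column names (url, dataset_name, original_label, scientific_name), the example.org URLs, and the rows themselves are placeholders invented for this example, not the real file's schema or contents.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of a per-image file. Column names, URLs, and rows are
# illustrative assumptions, not the schema or contents of the real .csv.
SAMPLE_IMAGES = """url,dataset_name,original_label,scientific_name
https://example.org/a/0001.jpg,Dataset A,black bear,ursus americanus
https://example.org/b/0002.jpg,Dataset B,white-tailed deer,odocoileus virginianus
https://example.org/b/0002.jpg,Dataset B,wild turkey,meleagris gallopavo
"""


def labels_by_image(rows):
    """Collect the set of species names attached to each image URL."""
    labels = defaultdict(set)
    for row in rows:
        labels[row["url"]].add(row["scientific_name"])
    return dict(labels)


rows = list(csv.DictReader(io.StringIO(SAMPLE_IMAGES)))
image_labels = labels_by_image(rows)

# Images that appear on several rows because several species are present.
multi_species = sorted(
    url for url, names in image_labels.items() if len(names) > 1
)
```

The real file is much larger than this excerpt, so in practice it may be preferable to stream it row by row rather than load it into a list; the grouping logic stays the same.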
I want to train a species classifier on cropped animals. Are MegaDetector results available for LILA data?