MegaDetector results for camera trap datasets

I was planning to run MegaDetector on some of these datasets… any chance you’ve already done this?

Yes! We’ve run MegaDetector on every camera trap dataset on LILA. These results are intended to support classifier training or detector fine-tuning. There are two reasons we don’t recommend using these results to evaluate MegaDetector’s accuracy:

Most of these datasets have been used as part of MegaDetector’s training data, so although they give a rough sense of MegaDetector’s performance in various ecosystems, they may provide a biased view of MegaDetector’s accuracy. You can see the list of specific datasets used to train MegaDetector here.
Because some of these results files are very large, we have thresholded them in some cases (albeit at levels much lower than anyone would care about for most applications).

What are the different results links for each dataset?

For each dataset, you’ll see links to one or more sets of “raw” results (MDv4, MDv5a, MDv5b, and/or MDv1000-redwood). You’ll also see a set of results called “MDv[something] with RDE”. These represent a version of the results where we’ve spent about five minutes per million images on the semi-manual repeat detection elimination (RDE) process, which gets rid of a lot of false positives that are repeated over and over. There is a very small risk of getting rid of some true positives in this process as well, but the number of animals lost should be negligible. So, all other things being equal, for almost everything you might want to do with the results on this page, use the RDE results. RDE results are not included for some datasets because those datasets don’t have the location IDs that the RDE process requires.

The results files contain relative paths… what are they relative to?

This .csv file contains metadata for every camera trap dataset on LILA; the “image_base_url” column has the Azure base URL for each dataset. Append a relative filename to that base URL to get a complete image URL.
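If it helps, here is a minimal Python sketch of that lookup. Only the “image_base_url” column name comes from this page; the “name” column, the sample rows, and the example.com URLs are placeholders made up for illustration, so check the real .csv for the actual column names and values.

```python
import csv
import io

# Hypothetical two-row excerpt of the metadata .csv. Only "image_base_url" is a
# column name taken from this page; "name" and both URLs are placeholders.
SAMPLE_CSV = """name,image_base_url
Example Dataset A,https://example.com/lila/dataset-a/
Example Dataset B,https://example.com/lila/dataset-b
"""

def load_base_urls(csv_text):
    """Map each dataset name to its image base URL."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: row["image_base_url"] for row in reader}

def image_url(base_url, relative_path):
    """Join a base URL and a relative filename with exactly one slash."""
    return base_url.rstrip("/") + "/" + relative_path.lstrip("/")

base_urls = load_base_urls(SAMPLE_CSV)
# "site01/img_0001.jpg" stands in for a relative path from a results file.
print(image_url(base_urls["Example Dataset B"], "site01/img_0001.jpg"))
```

Stripping and re-adding the slash is just a precaution in case a base URL is stored without a trailing slash.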

Clicking on this page like 20 times seems like an inefficient way to access these links… are they available in a structured format?

Yes! The following columns in the .csv file mentioned above contain URLs for MegaDetector results for each dataset:

mdv5a_results_raw
mdv5b_results_raw
md1000-redwood_results_raw
md_results_with_rde

As noted above, for almost everything anyone would want to do with these results, you should use the “md_results_with_rde” column; that’s the “best” set of results for each dataset.
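Since some datasets have no RDE results, you may want a fallback. Here is a rough Python sketch that picks one results URL per row, preferring “md_results_with_rde”. The four column names are the ones listed above; the fallback order among the raw columns is our own assumption rather than a recommendation, and the sample rows and URLs are invented.

```python
# Column names as listed above, in the order this sketch prefers them.
# Only "RDE first" comes from this page; the rest of the order is an assumption.
RESULT_COLUMNS = [
    "md_results_with_rde",
    "md1000-redwood_results_raw",
    "mdv5a_results_raw",
    "mdv5b_results_raw",
]

def best_results_url(row):
    """Return (column, url) for the first non-empty results column, else (None, None)."""
    for column in RESULT_COLUMNS:
        url = (row.get(column) or "").strip()
        if url:
            return column, url
    return None, None

# Hypothetical rows, shaped like what csv.DictReader would yield for the .csv.
with_rde = {
    "mdv5a_results_raw": "https://example.com/a_mdv5a.zip",
    "md_results_with_rde": "https://example.com/a_rde.zip",
}
without_rde = {
    "mdv5a_results_raw": "https://example.com/b_mdv5a.zip",
    "md_results_with_rde": "",
}

print(best_results_url(with_rde))
print(best_results_url(without_rde))
```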

Image filenames in these results files are relative to the same base folder as the .json metadata for each dataset. But we’re generally encouraging folks to ignore the individual dataset metadata going forward, and treat all the camera trap data on LILA as one big dataset (more information here about common taxonomy mapping and a unified data table). These MegaDetector results break the rules just a little, because we provide results for each dataset, but only because otherwise the results would be mega-massively-enormous.

Enough chit-chat, can you just give me a list of links to MegaDetector results?

Can do!

Caltech Camera Traps (MDv5a) (MDv5b) (MDv5a with RDE)
Channel Islands Camera Traps (MDv5a) (MDv5b) (MDv5a with RDE)
ENA24 (MDv5a) (MDv5b)
Idaho Camera Traps (MDv5a) (MDv5b) (MDv5a with RDE)
Island Conservation Camera Traps (MDv5a) (MDv5b) (MDv5a with RDE)
Missouri Camera Traps (MDv5a) (MDv5b)
NACTI (MDv5a) (MDv5b) (MDv5a with RDE)
Orinoquía Camera Traps (MDv5a) (MDv5b) (MDv5a with RDE)
Snapshot Camdeboo (MDv5a) (MDv5b) (MDv5a with RDE)
Snapshot Enonkishu (MDv5a) (MDv5b) (MDv5a with RDE)
Snapshot Karoo (MDv5a) (MDv5b) (MDv5a with RDE)
Snapshot Kgalagadi (MDv5a) (MDv5b) (MDv5a with RDE)
Snapshot Kruger (MDv5a) (MDv5b) (MDv5a with RDE)
Snapshot Mountain Zebra (MDv5a) (MDv5b) (MDv5a with RDE)
Snapshot Serengeti (MDv1000-redwood) (MDv1000-redwood with RDE)
SWG Camera Traps (MDv5a) (MDv5b) (MDv5a with RDE)
Trail Camera Images of New Zealand Animals (MDv5a) (MDv5b) (MDv5a with RDE)
WCS Camera Traps (MDv5a) (MDv5b) (MDv5a with RDE)
Wellington Camera Traps (MDv5a) (MDv5b) (MDv5a with RDE)
Desert Lion Conservation Camera Traps (MDv5a)
Snapshot Safari 2024 Expansion (except for Snapshot Serengeti) (MDv5a) (MDv5a with RDE)
Snapshot Safari 2024 Expansion (Snapshot Serengeti only) (MDv1000-redwood) (MDv1000-redwood with RDE)
Seattle-ish Camera Traps (MDv1000-redwood) (MDv1000-redwood with RDE)
UNSW Predators (MDv1000-redwood) (MDv1000-redwood with RDE)
Nkhotakota Camera Traps (MDv1000-redwood) (MDv1000-redwood with RDE)

Which locations were used in MD training?

Really, we don’t encourage you to use these results to evaluate MD, because despite best efforts, we can’t guarantee that currently-available annotations reflect historical MD training splits. That said, there are >1.1M animal boxes here, representing most of what’s on LILA and was also used for MDv5 training, with a “split” field associated with each image (always “train”, “val”, or “test”). A “preview” of these boxes is available here.
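As a quick sanity check on the “split” field, something like the following Python sketch counts images per split. It assumes the file has a top-level “images” list in which each image carries “split”; the tiny in-memory dict below is a made-up stand-in for the real file, which you would load with json.load instead.

```python
from collections import Counter

def count_splits(coco):
    """Count images per value of the "split" field ("unknown" if it is missing)."""
    return Counter(image.get("split", "unknown") for image in coco["images"])

# Made-up stand-in for the real .json file; only the "split" field and its three
# possible values come from this page.
sample = {
    "images": [
        {"id": "a", "file_name": "x/a.jpg", "split": "train"},
        {"id": "b", "file_name": "x/b.jpg", "split": "train"},
        {"id": "c", "file_name": "y/c.jpg", "split": "val"},
        {"id": "d", "file_name": "z/d.jpg", "split": "test"},
    ]
}

counts = count_splits(sample)
print(dict(counts))
```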

The .json file is in COCO Camera Traps format, except that in lieu of the normal “bbox” field, this file uses a field called “bbox_relative” to indicate that coordinates are already normalized.
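If you need pixel coordinates, you can scale the normalized values by the image size. The Python sketch below assumes “bbox_relative” keeps the usual COCO ordering of [x, y, width, height], with x and width normalized by image width and y and height by image height; that ordering is our assumption, so verify it against a few images before relying on it.

```python
def to_pixel_bbox(bbox_relative, image_width, image_height):
    """Convert a normalized [x, y, w, h] box to pixel units.

    Assumes COCO-style ordering (top-left x, top-left y, width, height).
    """
    x, y, w, h = bbox_relative
    return [x * image_width, y * image_height, w * image_width, h * image_height]

# Example with invented numbers: a box covering the lower-middle of an 800x600 image.
print(to_pixel_bbox([0.25, 0.5, 0.5, 0.25], 800, 600))
```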

The filenames in this .json file are relative to the unified LILA base URL, which is any of the following:

https://storage.googleapis.com/public-datasets-lila/
http://us-west-2.opendata.source.coop.s3.amazonaws.com/agentmorris/lila-wildlife/
https://lilawildlife.blob.core.windows.net/lila-wildlife/

So for example, this is one of the filenames in the .json file:

wellington-unzipped/images/2906151635420888S482.JPG

That’s available at any of the following URLs:
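Each URL is just one of the base URLs above followed by the filename. A minimal Python sketch of that concatenation (the function name mirror_urls is ours, not part of any LILA tooling):

```python
# The three unified LILA base URLs listed above.
BASE_URLS = [
    "https://storage.googleapis.com/public-datasets-lila/",
    "http://us-west-2.opendata.source.coop.s3.amazonaws.com/agentmorris/lila-wildlife/",
    "https://lilawildlife.blob.core.windows.net/lila-wildlife/",
]

# The example filename from the .json file.
FILENAME = "wellington-unzipped/images/2906151635420888S482.JPG"

def mirror_urls(filename):
    """Return the full URL for a filename on each of the base URLs."""
    return [base + filename for base in BASE_URLS]

for url in mirror_urls(FILENAME):
    print(url)
```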

Also see…

…this page, about mapping all of the LILA camera trap data to a common taxonomy, and ideally eliminating the lines between datasets and treating all the camera trap data on LILA as one big dataset.