Caltech Camera Traps

This data set contains 243,100 images from 140 camera locations in the Southwestern United States. Labels cover 21 animal categories (plus empty), primarily at the species level; the most common labels are opossum, raccoon, and coyote. The data set also includes approximately 66,000 bounding box annotations. Approximately 70% of images are labeled as empty.

More information about this data set is available here.

If you use this data set, please cite the associated manuscript:

Sara Beery, Grant Van Horn, Pietro Perona. Recognition in Terra Incognita. Proceedings of the 15th European Conference on Computer Vision (ECCV 2018).

Annotations are provided in COCO Camera Traps .json format.
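
As a rough illustration of working with these annotations, the sketch below counts how many distinct images carry each label. It assumes the usual COCO Camera Traps layout: top-level "images", "annotations", and "categories" lists, with each annotation carrying "image_id" and "category_id" and an optional "bbox". The function name, category IDs, and file names in the example are made up for illustration and are not taken from this data set.

```python
from collections import Counter


def count_images_per_category(cct):
    """Count distinct images per category name in a COCO Camera Traps dict.

    Assumes `cct` has "annotations" and "categories" lists, as in the
    COCO Camera Traps format. An image with several boxes of the same
    category is counted once for that category.
    """
    id_to_name = {c["id"]: c["name"] for c in cct["categories"]}
    seen = set()  # (image_id, category_id) pairs already counted
    counts = Counter()
    for ann in cct["annotations"]:
        key = (ann["image_id"], ann["category_id"])
        if key in seen:
            continue
        seen.add(key)
        counts[id_to_name[ann["category_id"]]] += 1
    return counts


# Tiny hand-made example; IDs and file names are hypothetical.
example = {
    "images": [
        {"id": "a", "file_name": "a.jpg", "location": "38"},
        {"id": "b", "file_name": "b.jpg", "location": "38"},
        {"id": "c", "file_name": "c.jpg", "location": "40"},
    ],
    "categories": [
        {"id": 1, "name": "coyote"},
        {"id": 2, "name": "empty"},
    ],
    "annotations": [
        # Two boxes on the same image: counted as one coyote image.
        {"id": "a1", "image_id": "a", "category_id": 1, "bbox": [10, 20, 50, 40]},
        {"id": "a2", "image_id": "a", "category_id": 1, "bbox": [80, 25, 30, 30]},
        # Image-level labels without a bounding box.
        {"id": "b1", "image_id": "b", "category_id": 2},
        {"id": "c1", "image_id": "c", "category_id": 1},
    ],
}

example_counts = count_images_per_category(example)
```

To run this on the real annotations, load the downloaded .json file with json.load and pass the resulting dict to the function.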

A link to a zipfile is provided below, but images are also available for direct download from the base folder:

For example, the image referred to in metadata as:


…is available for direct download from:

We have also divided locations (i.e., cameras) into training and validation splits to allow for consistent benchmarking on this data set. The file describing this split specifies a train/val split for all locations in the data set, and also provides the train/val split used in the ECCV paper listed above. The “eccv_train” split here corresponds to the “train” locations and all “cis” locations in the ECCV paper; the “eccv_val” split here corresponds to all “trans” locations in the ECCV paper.
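
A minimal sketch of applying a location-based split is shown below. The layout of the split file is not described here, so the code assumes it can be reduced to a mapping from split name to a list of location IDs, and that each image record has a "location" field; both are assumptions, as are the function name and the sample values. Because the same location can appear in both the train/val scheme and the eccv_train/eccv_val scheme, pass one scheme at a time.

```python
def split_images_by_location(images, split_locations):
    """Group image file names by split, based on each image's location.

    `images` is a list of image records with "file_name" and "location".
    `split_locations` maps a split name to a list of location IDs
    (an assumed layout, for one split scheme only).
    Images whose location is in no split are skipped.
    """
    location_to_split = {}
    for split_name, locations in split_locations.items():
        for loc in locations:
            # Compare locations as strings so "38" and 38 match.
            location_to_split[str(loc)] = split_name

    result = {name: [] for name in split_locations}
    for im in images:
        split_name = location_to_split.get(str(im["location"]))
        if split_name is not None:
            result[split_name].append(im["file_name"])
    return result


# Hypothetical sample data, not taken from the real split file.
sample_images = [
    {"file_name": "a.jpg", "location": "38"},
    {"file_name": "b.jpg", "location": "38"},
    {"file_name": "c.jpg", "location": "40"},
    {"file_name": "d.jpg", "location": "99"},  # location not in any split
]
sample_split = {"train": ["38"], "val": ["40"]}

by_split = split_images_by_location(sample_images, sample_split)
```

Splitting by location rather than by image keeps every image from a given camera on one side of the split, which is the point of the location-level division described above.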

This data set is released under the Community Data License Agreement (permissive variant).

For questions about this data set, contact

Download links:

Having trouble downloading? Check out our FAQ.