This data set contains about one million thermal/RGB image pairs, representing a 2016 aerial survey of sea ice habitat in U.S. waters of the Chukchi Sea, conducted by NOAA Fisheries. Annotations indicate the locations of approximately 7000 seals in these images. This data set is provided to encourage the machine learning community to advance the state of the art in detection with an extremely imbalanced data set (the vast majority of images are empty), image registration (thermal and RGB images are not perfectly aligned), and multimodal fusion in detection.
This data set is released under the Community Data License Agreement (permissive variant).
Data archives called “TrainingAnimals” contain images with annotated animals. “Training” here refers to the fact that NOAA maintains a test set that is not included here. Each animal was annotated once, but may have occurred in multiple images due to overlap between sequential images. Therefore, data archives called “PotentialAnimals” contain images adjacent to images with animals, and may or may not contain animals. Data archives called “TrainingBackground” should not contain animals… email us if you find any!
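As a rough illustration of how the three archive categories might be handled when assembling a training set, here is a minimal sketch that maps an archive name to its labeling status. The function name, the status strings, and the example file names are all hypothetical; only the three category names (“TrainingAnimals”, “PotentialAnimals”, “TrainingBackground”) come from the description above.

```python
def archive_label_status(archive_name):
    """Map an archive name to a labeling status based on its category.

    The status strings are illustrative, not part of the data set:
      "annotated"  -> images with annotated animals
      "unknown"    -> adjacent images that may or may not contain animals
      "background" -> images that should not contain animals
    """
    name = archive_name.lower()
    if "traininganimals" in name:
        return "annotated"
    if "potentialanimals" in name:
        return "unknown"
    if "trainingbackground" in name:
        return "background"
    return "unrecognized"


# Hypothetical archive file names, for illustration only.
for example in ["TrainingAnimals_ir.zip",
                "PotentialAnimals_rgb.zip",
                "TrainingBackground_ir.zip"]:
    print(example, "->", archive_label_status(example))
```

Treating “PotentialAnimals” as unknown rather than as negatives avoids penalizing a detector for finding a seal that was annotated only in a neighboring image.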
Also check out the associated GitHub repo, which contains preliminary code for detector training, visualization, etc.
Metadata is provided as a .csv file containing a list of animals; each animal is represented by both a pixel location (in an IR image) and a bounding box (in an RGB image). A complete list of all images and the zipfiles that contain them is provided in a separate .csv file (listed below as the “manifest”).
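The sketch below shows one way the two .csv files could be combined: animals are grouped by RGB image, and the manifest is used to look up which zipfile holds each image. All column names, file names, and rows here are assumptions made for illustration; check the headers of the actual files before adapting this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical annotation rows: one animal per row, with an IR pixel
# location and an RGB bounding box. Real column names may differ.
ANNOTATIONS_CSV = """ir_image,ir_x,ir_y,rgb_image,rgb_left,rgb_top,rgb_right,rgb_bottom
a_ir.png,120,80,a_rgb.jpg,1500,900,1560,950
a_ir.png,300,210,a_rgb.jpg,3900,2500,3970,2560
b_ir.png,45,60,b_rgb.jpg,600,700,655,748
"""

# Hypothetical manifest rows: one image per row, with its zipfile.
MANIFEST_CSV = """filename,zipfile
a_ir.png,TrainingAnimals_ir.zip
a_rgb.jpg,TrainingAnimals_rgb.zip
b_ir.png,TrainingAnimals_ir.zip
b_rgb.jpg,TrainingAnimals_rgb.zip
"""


def load_manifest(text):
    """Return a dict mapping image file name to the zipfile containing it."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["filename"]: row["zipfile"] for row in reader}


def group_animals_by_rgb_image(text):
    """Return a dict mapping RGB image name to a list of animal records."""
    by_image = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        box = tuple(int(row[key]) for key in
                    ("rgb_left", "rgb_top", "rgb_right", "rgb_bottom"))
        by_image[row["rgb_image"]].append({
            "ir_image": row["ir_image"],
            "ir_point": (int(row["ir_x"]), int(row["ir_y"])),
            "rgb_box": box,
        })
    return dict(by_image)


manifest = load_manifest(MANIFEST_CSV)
animals = group_animals_by_rgb_image(ANNOTATIONS_CSV)

for image, records in animals.items():
    print(image, "in", manifest[image], "with", len(records), "animal(s)")
```

With the real files, the inline strings would be replaced by `open(...)` calls on the downloaded .csv files; the grouping and lookup logic stays the same.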
Having trouble downloading? Check out our FAQ.
Posted by Dan Morris.