Overview
This data set contains 77,739 images sampled from video collected on and around shellfish aquaculture farms in an estuary in the Northeast Pacific, in which 67,990 objects (fish and crustaceans) have been annotated on 30,384 images (the remainder have been annotated as “empty”). Bounding boxes mark individual objects, but labeling was done at the image level, so each object is labeled as one of “fish”, “crab”, or “fish_and_crab”, where “fish_and_crab” means that both categories were present in that image.
This data set was used to develop a computer vision model to detect fish, allowing specialists from NOAA to examine images in which fish were detected to classify and quantify their species more efficiently. Incorporating artificial intelligence into ecological and resource management fields will advance our understanding of potential changes in the marine environment in the context of fisheries and aquaculture expansion, shoreline development, and climate change.
These data were collected in a collaborative effort between the NOAA Northwest Fisheries Science Center, The Nature Conservancy, and shellfish aquaculture farms in WA, USA. Funding was provided by the NOAA Office of Aquaculture Grant (NA17OAR4170218) and Washington Sea Grant (UWSC10159). The data were labelled in a collaborative effort between the NOAA Northwest Fisheries Science Center and the Microsoft AI for Good Research Lab.
Code for fine-tuning YOLOv5 on this data is available here.
Citation, license, and contact information
If you use these data in a publication or report, please use the following citation to refer to the data collection process:
Ferriss B, Veggerby K, Bogeberg M, Conway-Cranos L, Hoberecht L, Kiffney P, Litle K, Toft J, Sanderson B. Characterizing the habitat function of bivalve aquaculture using underwater video. Aquaculture Environment Interactions. 2021 Nov 18;13:439-54.
…and/or the following citation to refer to the annotations and public data set:
Farrell DM, Ferriss B, Sanderson B, Veggerby K, Robinson L, Trivedi A, Pathak S, Muppalla S, Wang J, Morris D, Dodhia R. A labeled data set of underwater images of fish and crab species from five mesohabitats in Puget Sound WA USA. Scientific Data. 2023 Nov 13;10(1):799.
For questions about this data set, contact Beth Sanderson (NOAA Northwest Fisheries Science Center) and Bridget Ferriss (NOAA Alaska Fisheries Science Center).
This data set is released under the Community Data License Agreement (permissive variant).
Data format
Annotations are provided in COCO Camera Traps .json format.
Some deployments used a green-blocking filter; the Boolean “filter” property on each image in the data set indicates whether that image was captured with a filter.
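As a rough illustration of how the annotation file might be inspected, the sketch below counts annotations per category and tallies images captured with the filter. It assumes the usual COCO-style top-level keys (“images”, “annotations”, “categories”), the “image_id”, “category_id”, and “bbox” fields on each annotation, and that “empty” images carry an annotation without a “bbox”; only the “filter” property is taken from the description above. The function names are hypothetical and are not part of any released code.

```python
import json
from collections import Counter


def load_annotations(path):
    """Read a COCO-style .json annotation file into a dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def summarize(coco):
    """Return basic counts for a COCO Camera Traps style dict.

    Assumes COCO-style keys: 'images', 'annotations', 'categories'.
    """
    # Map numeric category IDs to names such as "fish" or "crab"
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}

    # Count annotations per category name
    per_category = Counter(
        id_to_name[a["category_id"]] for a in coco["annotations"]
    )

    # Images that have at least one bounding box (assumption: annotations
    # on "empty" images have no "bbox" field)
    images_with_boxes = {
        a["image_id"] for a in coco["annotations"] if "bbox" in a
    }

    # Images whose Boolean "filter" property is true
    n_filtered = sum(1 for im in coco["images"] if im.get("filter"))

    return {
        "n_images": len(coco["images"]),
        "n_images_with_boxes": len(images_with_boxes),
        "annotations_per_category": dict(per_category),
        "n_filtered_images": n_filtered,
    }
```

For example, summarize(load_annotations("annotations.json")) would return the counts for a downloaded annotation file, where "annotations.json" is a placeholder for wherever the file was saved.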
Downloading the data
Download links:
Download images from GCP (7GB)
Download annotations from GCP (4MB)
Download images from AWS (7GB)
Download annotations from AWS (4MB)
Download images from Azure (7GB)
Download annotations from Azure (4MB)
Having trouble downloading? Check out our FAQ.
Posted by Dan Morris.