Great Zebra and Giraffe Count and ID


This dataset contains images of plains zebra (Equus quagga) and Masai giraffe (Giraffa tippelskirchi) with bounding boxes and individual animal identifications. The images were taken during a two-day census of Nairobi National Park, located just south of the airport in Nairobi, Kenya. The “Great Zebra and Giraffe Count” (GZGC) photographic census took place on February 28 and March 1, 2015, with the participation of 27 teams of citizen scientists and 55 photographers in total. Only images containing zebras or giraffes were included in this dataset, for a total of 4,948 images. Bounding boxes are provided only for the individual animals that have ID metadata, so some animals that appear in the images have no box; the dataset is therefore not intended for training or testing object detectors. Every animal annotation also includes a viewpoint. All ID assignments were completed using the HotSpotter algorithm (Crall et al. 2013) by visually matching the stripes and spots seen on the body of the animal. A total of 2,056 combined names are released for 6,286 individual zebra sightings and 639 giraffe sightings. This dataset is a challenging counterpart to the whale shark dataset, since it contains a significantly higher number of animals that are seen only once during the survey.

Data format

The dataset is released in the Microsoft COCO .json format. We have collapsed the entire dataset into a single “train” label and have left “val” and “test” empty; we do this as an invitation to researchers to experiment with their own novel approaches for dealing with the unbalanced and chaotic distribution of the number of sightings per individual. All of the images in the dataset have been resized to have a maximum dimension of 3,000 pixels. Each animal sighting is defined by an axis-aligned bounding box, and its metadata includes the rotation of the box (theta), the viewpoint of the animal, a species (category) ID, a source image ID, an individual string ID name, and other miscellaneous values. The temporal ordering of the images, and an anonymized ID for the original photographer, can be determined from the metadata for each image.
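
As a starting point for exploring the distribution of sightings per individual, the following Python sketch counts annotations per individual in a COCO-style dictionary and lists the individuals seen only once. The top-level "annotations" list follows the standard COCO layout; the per-annotation key holding the individual ID is assumed here to be "name" and should be checked against the released file. The function names, the example records, and the commented-out file path are hypothetical and exist only for illustration.

```python
import json
from collections import Counter


def sightings_per_individual(coco, name_key="name"):
    """Count annotations per individual in a COCO-style dict.

    name_key is an assumption: verify which annotation field stores the
    individual string ID in the released .json file.
    """
    counts = Counter()
    for ann in coco.get("annotations", []):
        name = ann.get(name_key)
        if name is not None:
            counts[name] += 1
    return counts


def singletons(counts):
    """Return the sorted IDs of individuals with exactly one sighting."""
    return sorted(name for name, n in counts.items() if n == 1)


# Tiny synthetic example in COCO layout (not real GZGC records).
example = {
    "images": [{"id": 1}, {"id": 2}],
    "categories": [{"id": 0, "name": "example_species"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 0,
         "bbox": [10, 20, 100, 50], "name": "ind_a"},
        {"id": 2, "image_id": 2, "category_id": 0,
         "bbox": [5, 5, 80, 40], "name": "ind_a"},
        {"id": 3, "image_id": 2, "category_id": 0,
         "bbox": [200, 30, 90, 60], "name": "ind_b"},
    ],
}

counts = sightings_per_individual(example)
once_only = singletons(counts)

# To run on the real metadata, load the file first (hypothetical path):
# with open("path/to/annotations.json") as f:
#     counts = sightings_per_individual(json.load(f))
```

Because every annotation carries the "train" label, a count like this is one possible basis for building custom splits, for example by holding out individuals with a single sighting.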

Citation, license, and contact information

For research or press contact, please direct all correspondence to Wild Me. Wild Me is a registered 501(c)(3) not-for-profit based in Portland, Oregon, USA that brings state-of-the-art computer vision tools to ecology researchers working around the globe on wildlife conservation.

This dataset is released under the Community Data License Agreement (permissive variant).

If you use this dataset in published work, please cite as:

Parham J, Crall J, Stewart C, Berger-Wolf T, Rubenstein DI. Animal population censusing at scale with citizen science and photographic identification. In AAAI Spring Symposium-Technical Report 2017 Jan 1.

Downloading the data

Data download links:

Images and metadata (GCP link) (10GB)
Images and metadata (Azure link) (10GB)
Images and metadata (AWS link) (10GB)

Having trouble downloading? Check out our FAQ.

Posted by Dan Morris.