Forest Damages – Larch Casebearer


The larch casebearer, Coleophora laricella, is a moth that mainly attacks larch trees and has caused significant damage in larch stands in Västergötland, Sweden.

The Swedish Forest Agency, supported by Microsoft’s AI for Earth program, started a project that utilizes artificial intelligence for identifying, inventorying, mapping, and monitoring forest areas affected by the larch casebearer. The primary intention of the project is to help forest caretakers to quickly identify threats and react to prevent further forest damage.

This dataset is an outcome of the project. It contains 1,543 images captured on two drone flight occasions over five affected areas in Västergötland, Sweden. The dataset is organized into 10 batches, numbered 1 to 10.

All batches contain bounding box annotations around trees, categorized as Larch and Other. These annotations can be used to train AI models for tree identification. Across the 1,543 images, there are 101,878 annotated trees.

Batches 1-5 also contain annotations describing tree damage in four categories: Healthy (H), Light Damage (LD), High Damage (HD), and Other. These annotations can be used to train models for damage classification. Across the 840 images in these batches, there are 44,522 larch trees annotated with a damage level.

An overview of the project is available here (video).

Figure 1: Data set statistics

Data format

There are ten folders in the zipfile, each corresponding to a drone survey, named according to:


  • area_name is one of “Bebehojd”, “Ekbacka”, “Jallasvag”, “Kampe”, “Nordkap”
  • capture_date is the capture date, formatted as yyyymmdd

Within each survey folder are two folders: “Annotations” and “Images”.

Image files are named according to:


  • batch_number is a two-digit batch number, from 01 to 10. Each survey folder contains only a single batch number.
  • image_number is a five-digit image number

Annotation files are named according to:


  • batch_number is a two-digit batch number, from 01 to 10, which will match the corresponding “Images” folder.
  • image_number is a five-digit image number
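The folder layout described above can be walked with a few lines of Python. This is a minimal sketch, not an official loader: it assumes that an image and its annotation share the same file stem (batch number plus image number) and differ only in extension, and that annotations end in ".xml". The function name `pair_survey_files` is hypothetical.

```python
from pathlib import Path


def pair_survey_files(survey_dir):
    """Return (image_path, annotation_path) pairs for one survey folder.

    Assumption: an image and its annotation have the same file stem,
    so they can be matched without parsing the file names.
    """
    survey_dir = Path(survey_dir)
    images = {p.stem: p for p in (survey_dir / "Images").iterdir() if p.is_file()}
    annotations = {p.stem: p for p in (survey_dir / "Annotations").glob("*.xml")}
    # Keep only stems present in both folders, in a stable order.
    shared = sorted(images.keys() & annotations.keys())
    return [(images[stem], annotations[stem]) for stem in shared]
```

Images without a matching annotation (and vice versa) are silently skipped here; a stricter script might report them instead.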

Annotations are in the Pascal VOC XML format for object detection. A typical individual tree object would be annotated as:


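Because the annotations follow Pascal VOC, tree boxes can be read with Python's standard XML parser. The sketch below assumes the usual Pascal VOC element names (`object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`). The sample XML is made up for illustration and is not copied from the dataset, and the sketch reads only the tree label and box, not the damage level. The names `SAMPLE_XML` and `read_tree_boxes` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Illustrative annotation with invented values, not taken from the dataset.
SAMPLE_XML = """
<annotation>
  <filename>example.JPG</filename>
  <object>
    <name>Larch</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>140</ymax></bndbox>
  </object>
  <object>
    <name>Other</name>
    <bndbox><xmin>200.0</xmin><ymin>50.0</ymin><xmax>260.0</xmax><ymax>120.0</ymax></bndbox>
  </object>
</annotation>
"""


def read_tree_boxes(xml_text):
    """Parse one Pascal VOC annotation string into a list of box dicts."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # skip objects that carry no bounding box
        box = {"label": obj.findtext("name")}
        for key in ("xmin", "ymin", "xmax", "ymax"):
            # Parse via float first so values such as "200.0" are accepted.
            box[key] = int(float(bndbox.findtext(key)))
        boxes.append(box)
    return boxes


boxes = read_tree_boxes(SAMPLE_XML)
label_counts = Counter(box["label"] for box in boxes)
```

For a file on disk, the same function can be called on `Path(annotation_path).read_text()`. Summing `label_counts` over all annotation files gives a per-class tree count that can be compared against the totals quoted earlier.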

If you use these data in a publication or report, please use the following citation:

Swedish Forest Agency (2021): Forest Damages – Larch Casebearer 1.0. National Forest Data Lab. Dataset.

Contact information

For questions about this data set, contact Halil Radogoshi at the Swedish Forest Agency.



The organizations responsible for generating and funding this dataset make no representations of any kind, including, but not limited to, the warranties of merchantability or fitness for a particular use, nor are any such warranties to be implied with respect to the data. Although every effort has been made to ensure the accuracy of information, errors may be reflected in data supplied. The user must be aware of data conditions and bear responsibility for the appropriate use of the information with respect to possible errors, collection methodology, currency of data, and other conditions. Credit should always be given to the data source when this data is transferred, altered, or used for analysis.

Labels are released under the Community Data License Agreement (permissive variant).


For the drone images, the Swedish Forest Agency has secured a permit for the dissemination of geographical data from Lantmäteriet, an authority under the Ministry of Finance in Sweden.

Downloading the data

This dataset is provided as a single zipfile:


Posted by Dan Morris.