Towards Less Labels By Active Learning, Exploiting Unlabeled Data and Learned Augmentation (TOLEDA)
Abstract
Within the DARPA Learning with Less Labels (LwLL) program, TNO carried out the TOLEDA project: Towards Less Labels by Active Learning, Exploiting Unlabeled Data and Learned Augmentation. This report presents the findings of that research. The goal of the LwLL program was to reduce the number of labeled samples needed by a factor of 1,000 (Phase I) to 1,000,000 (Phase II). Our research aimed at these goals, i.e., using very few or no labels. We developed a method that finds clusters and uses the cluster centers as the initial samples to label. Due to the nature of clustering, such labels are representative and diverse. At the same time, the clusters provide a good set of pseudo-labels. This leads to a strong image classification system that needs only very few labeled samples (down to one labeled sample per class) to obtain fairly good results. Another innovation is the use of semantic construction: we exploit a network pretrained on a widely available, large-scale dataset, in combination with a text embedding of the class labels, to construct a new classifier. This provides a zero-shot capability for image classification and object detection. In addition, we explored the use of emerging externally trained language-vision foundation models such as CLIP and GLIP. These provided very powerful capabilities compared to the technologies developed within LwLL.
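The cluster-center idea in the abstract can be illustrated with a minimal sketch. It is not the report's implementation. It assumes image features are already available as plain float tuples, uses ordinary k-means with a farthest-point initialization (chosen here only to keep the example deterministic), and introduces hypothetical names throughout (`kmeans`, `select_and_pseudo_label`, `oracle`). The `oracle` callable stands in for a human annotator who labels one sample per cluster; every other sample inherits that label as a pseudo-label.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on a list of equal-length float tuples."""
    # Farthest-point initialization (an assumption, used for determinism).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )

    def nearest(p):
        # Index of the center closest to point p.
        return min(range(k), key=lambda c: math.dist(p, centers[c]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    # Final assignment against the final centers.
    return centers, [nearest(p) for p in points]


def select_and_pseudo_label(features, k, oracle):
    """Query one label per cluster and spread it over the cluster.

    `oracle(i)` is a hypothetical stand-in for a human labeling sample i.
    Returns the indices sent to the oracle and a pseudo-label per sample.
    """
    centers, assign = kmeans(features, k)
    queries = {}
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            # The real sample nearest the cluster center is the one to label.
            queries[c] = min(
                members, key=lambda i: math.dist(features[i], centers[c])
            )
    cluster_label = {c: oracle(i) for c, i in queries.items()}
    pseudo = [cluster_label[a] for a in assign]
    return sorted(queries.values()), pseudo


# Toy data: two well-separated groups standing in for two image classes.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
true_labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

queried, pseudo = select_and_pseudo_label(
    features, k=2, oracle=lambda i: true_labels[i]
)
```

In this toy run only two samples are sent to the oracle, one per class, which mirrors the "one labeled sample per class" setting. On real data the pseudo-labels would be noisy and would typically serve as a starting point for further training rather than as final predictions.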
Document Details
- Document Type: Technical Report
- Publication Date: Oct 16, 2023
- Accession Number: AD1212952
Entities
People
- Gertjan Burghouts
- Klamer Schutte
- Maarten Kruithof
- Wyke Pereboom
Organizations
- Netherlands Organisation for Applied Scientific Research