NUS-WIDE

Descriptions of the Files Included in NUS-WIDE

Low-Level Features

Each row in every low-level feature file corresponds to the image in the corresponding row of Imagelist.

 

        Normalized_CH :           64-D color histogram; each row represents one image.

        Normalized_CORR :      144-D color correlogram; each row represents one image.

        Normalized_EDH :         75-D edge direction histogram; each row represents one image.

        Normalized_WT :          128-D wavelet texture; each row represents one image.

        Normalized_CM55 :       225-D block-wise color moments; each row represents one image.

        BoW_int :                     500-D bag of words; each row represents one image.

 

The dataset is split into two parts: the first contains 161,789 images for training and the second contains 107,859 images for evaluation. Each low-level feature file is therefore split into two files, which contain 161,789 and 107,859 rows of features respectively.

        Normalized_CH     ->    Train_Normalized_CH       +    Test_Normalized_CH

        Normalized_CORR   ->    Train_Normalized_CORR     +    Test_Normalized_CORR

        Normalized_EDH    ->    Train_Normalized_EDH      +    Test_Normalized_EDH

        Normalized_WT     ->    Train_Normalized_WT       +    Test_Normalized_WT

        Normalized_CM55   ->    Train_Normalized_CM55     +    Test_Normalized_CM55

        BoW_int           ->    BoW_Train_int             +    BoW_Test_int
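
The dimensions and row counts above can serve as a sanity check when reading a feature file. The following is a minimal sketch, assuming the files are plain text with one image per line and whitespace-separated values; this README does not state the on-disk format, so adjust the parsing if the files differ. The function name parse_feature_rows is hypothetical.

```python
from typing import Iterable, List

# Feature dimensions as listed in this README.
FEATURE_DIMS = {
    "Normalized_CH": 64,
    "Normalized_CORR": 144,
    "Normalized_EDH": 75,
    "Normalized_WT": 128,
    "Normalized_CM55": 225,
    "BoW_int": 500,
}

# Row counts of the training and evaluation parts, from this README.
TRAIN_ROWS = 161_789
TEST_ROWS = 107_859


def parse_feature_rows(lines: Iterable[str], expected_dim: int) -> List[List[float]]:
    """Parse feature rows and check that each row has expected_dim values.

    Assumption: one image per line, values separated by whitespace.
    """
    rows: List[List[float]] = []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        values = [float(v) for v in line.split()]
        if len(values) != expected_dim:
            raise ValueError(
                f"line {line_no}: expected {expected_dim} values, got {len(values)}"
            )
        rows.append(values)
    return rows
```

An open file handle can be passed as lines. After parsing a training file, len(rows) should equal TRAIN_ROWS; for a test file it should equal TEST_ROWS.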

Groundtruth

AllLabels :  This folder includes 81 files, one for each of the 81 concepts.  For each concept, every row in the file gives the groundtruth of the image in the corresponding row of Imagelist.txt.

TrainTestLabels :  Every groundtruth file is split into two parts according to the training/testing split of the dataset.  Thus this folder includes 162 files, corresponding to the training and testing groundtruth of the 81 concepts.
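
Because each concept has its own groundtruth file, a common first step is to stack the per-concept columns into one image-by-concept matrix. The following is a minimal sketch, assuming each file has already been read into a list of 0/1 integers (one per image); the 0/1 encoding and the function name build_label_matrix are assumptions, not something this README specifies.

```python
from typing import Dict, List, Sequence


def build_label_matrix(
    columns: Dict[str, Sequence[int]], concepts: Sequence[str]
) -> List[List[int]]:
    """Stack per-concept label columns into one row of labels per image.

    columns maps a concept name to its label column (assumed 0/1 values).
    concepts fixes the column order of the result, e.g. the order in Concepts81.
    """
    if not concepts:
        return []
    num_images = len(columns[concepts[0]])
    for name in concepts:
        if len(columns[name]) != num_images:
            raise ValueError(f"concept {name!r} has {len(columns[name])} rows, "
                             f"expected {num_images}")
    # Row i holds the labels of image i for every concept, in the given order.
    return [[int(columns[name][i]) for name in concepts] for i in range(num_images)]
```

With all 81 columns loaded, row i of the result lines up with row i of the image list used for that split.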

Tags

AllTags81 :  This file includes the labels for the 81 concepts, extracted from the tags associated with the images. Each column represents a concept, and each row includes the 81 labels extracted from the tags of the corresponding image. AllTags81(i, j) = 1 means that the tags of the i-th image include the j-th concept in Concepts81.txt; AllTags81(i, j) = 0 means that they do not.

        AllTags81      ->        Train_Tags81      +       Test_Tags81

Train_Tags81, Test_Tags81 :  The file AllTags81 is separated into two parts according to the separation of the dataset. 
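
The AllTags81(i, j) convention can be illustrated with a small lookup. This is a sketch under the assumption that one row of the file has been parsed into a list of 81 integers and that the concept names have been read from Concepts81 in order; the function name concepts_for_image is hypothetical. Note that the README counts rows and columns from 1, while Python lists are indexed from 0.

```python
from typing import List, Sequence


def concepts_for_image(tag_row: Sequence[int], concepts: Sequence[str]) -> List[str]:
    """Return the concepts whose entry in tag_row equals 1.

    tag_row is one row of AllTags81; concepts is the list from Concepts81,
    in the same order as the columns.
    """
    if len(tag_row) != len(concepts):
        raise ValueError("tag_row and concepts must have the same length")
    return [name for flag, name in zip(tag_row, concepts) if flag == 1]
```

The same function applies to rows of Train_Tags81 and Test_Tags81, since they share the column order of AllTags81.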

AllTags1k :  This file includes the labels for the top 1,000 concepts in the Fianl_Tag_List, extracted from the tags associated with the images.

        AllTags1k        ->      Train_Tags1k         +      Test_Tags1k

Train_Tags1k, Test_Tags1k  :  The file AllTags1k is separated into two parts according to the separation of the dataset. 

AllTags :  Each row includes the raw tags crawled from www.flickr.com for the image in the corresponding row of Imagelist.txt. 

Fianl_Tag_List :  The tag list includes 5,018 tags extracted from the associated tags of the dataset.  The tags are sorted according to their frequencies.
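
The relationship between the raw tags, the frequency-sorted tag list, and a top-k indicator file such as AllTags1k can be sketched as follows. This is an illustration only, not the procedure used to build the dataset: it assumes the tag list is sorted with the most frequent tag first, that an image's raw tags are available as a list of strings, and that tags are matched by exact string equality. The function names are hypothetical.

```python
from typing import List, Sequence


def top_k_tags(tag_list: Sequence[str], k: int) -> List[str]:
    """Return the first k tags of a list assumed sorted most-frequent first."""
    return list(tag_list[:k])


def tag_indicator_row(raw_tags: Sequence[str], vocabulary: Sequence[str]) -> List[int]:
    """Build a 0/1 row: entry j is 1 if vocabulary[j] is among raw_tags."""
    present = set(raw_tags)  # set lookup keeps the membership test cheap
    return [1 if tag in present else 0 for tag in vocabulary]
```

For the 1,000-concept case, k would be 1000 and each resulting row would have 1,000 entries.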

Concept List

Concepts81:  This file includes the 81 concepts in alphabetical order.

Image List

Imagelist :  The list of raw images extracted from http://www.flickr.com/

        Imagelist     ->     TrainImagelist         +        TestImagelist

TrainImagelist :  The list of images for training.

TestImagelist :  The list of images for testing.




If you have any questions about the NUS-WIDE dataset, please contact Dr. Jinhui Tang:

tangjh@comp.nus.edu.sg