Hi, you mentioned in issue 28 that the ImageNet-21K data is cleaned, but I could not find any relevant cleaning information online. Can you provide a cleaned train.txt and val.txt with the corresponding image paths and labels?
Hi, it was a typo; the ImageNet-21k dataset we use has approximately 12.4M images. Note that this is slightly fewer than 14M because ImageNet-21k contains duplicate images, which we merge together.
Label files can be organized into the following structure, with each line containing an <image path, label> pair:
If you have your own format, providing the files in that format is also fine.
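For concreteness, a train.txt in this style might look like the lines below (the paths and wnid labels are purely illustrative, not actual entries from the dataset):

```
n01440764/n01440764_10026.JPEG n01440764
n01443537/n01443537_10007.JPEG n01443537
n01484850/n01484850_1054.JPEG n01484850
```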
By the way, is each image single-label or multi-label in the ImageNet-21K pre-training?
@andsteing Can you help me?
Images are single-label.
The dataset that we used for the pre-training is not currently in public TFDS, but maybe @lucasb-eyer can share the deduplication IDs that you requested.
@andsteing thank you for your reply.
@lucasb-eyer Can you share the duplicate IDs with me? I don't need the original image files, just train.txt and val.txt.
Hi, it's not "cleaned"; rather, the (exact) same image may appear under multiple folders (labels), so we count it as one image with N labels. That's why the mentioned image count is smaller: it is the unique-image count.
I don't have this lying around in a file, and even if I did, I couldn't simply share a file like this publicly without approval, which I don't have the bandwidth to obtain right now, sorry. But this should be really simple for you to derive from the files/folder structure in the tar file. Basically throw everything into a defaultdict(list).
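The defaultdict(list) approach above can be sketched as follows. This is a minimal illustration, assuming the extracted ImageNet-21k layout `root/<wnid>/<image>` where the folder name is the label; it groups exact byte-level duplicates by content hash, so each group becomes one unique image with N labels:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def group_duplicates(root):
    """Map content hash -> list of (path, wnid) for exact-duplicate images.

    Assumes a root/<wnid>/<image> folder layout (the root path is whatever
    directory you extracted the tar file into). Images whose bytes are
    identical land in the same group, giving one unique image with N labels.
    """
    groups = defaultdict(list)
    for img in Path(root).glob("*/*"):
        if img.is_file():
            digest = hashlib.md5(img.read_bytes()).hexdigest()
            groups[digest].append((str(img), img.parent.name))
    return groups
```

Each group with more than one entry corresponds to a duplicated image; writing one line per group (picking one path and collecting all wnids) yields a deduplicated label file.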
@lucasb-eyer So how do I select a label for the same image that appears in multiple folders (labels)?
I mean, how do I choose one label out of the multiple labels of the same image? Is it random, or is there a preferred way to choose?
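If a single label per image is needed, one simple deterministic tie-break (an illustrative assumption, not necessarily what the authors did) is to sort the wnids and keep the first, so the choice is reproducible across runs:

```python
def pick_label(labels):
    """Pick one label from a duplicate image's label list.

    Sorting and taking the first wnid is a deterministic, reproducible
    tie-break; this is an illustrative choice, not the authors' method.
    """
    return sorted(labels)[0]
```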
@andsteing @lucasb-eyer @akolesnikoff Can you help me? I want to reproduce the ViT-L pre-training accuracy on ImageNet-21K; see #62 (comment).
I have reproduced the accuracy of ViT-L/16 pre-trained on ImageNet-21K and fine-tuned on ImageNet-1K with the above suggestions and guidance: ImageNet21K data preparation for pre-training ViT-L
Thank you @lucasb-eyer