Table of Links
2 MindEye2 and 2.1 Shared-Subject Functional Alignment
2.2 Backbone, Diffusion Prior, & Submodules
2.3 Image Captioning and 2.4 Fine-tuning Stable Diffusion XL for unCLIP
3 Results and 3.1 fMRI-to-Image Reconstruction
3.3 Image/Brain Retrieval and 3.4 Brain Correlation
6 Acknowledgements and References
A Appendix
A.2 Additional Dataset Information
A.3 MindEye2 (not pretrained) vs. MindEye1
A.4 Reconstruction Evaluations Across Varying Amounts of Training Data
A.5 Single-Subject Evaluations
A.7 OpenCLIP BigG to CLIP L Conversion
A.9 Reconstruction Evaluations: Additional Information
A.10 Pretraining with Less Subjects
A.11 UMAP Dimensionality Reduction
A.13 Human Preference Experiments
A.9 Reconstruction Evaluations: Additional Information
Two-way comparisons were performed for AlexNet (Krizhevsky et al., 2012) (second and fifth layers), InceptionV3 (Szegedy et al., 2016) (last pooling layer), and CLIP (final layer of ViT-L/14). We followed the same image preprocessing and two-way identification steps as Ozcelik and VanRullen (2023) and Scotti et al. (2023). For two-way identification, for each model we computed the Pearson correlation between the embeddings of the ground truth image and its reconstruction, as well as the correlation between the ground truth image and a different reconstruction from elsewhere in the test set. If the former correlation was higher than the latter, the comparison was marked as correct. For each test sample, performance was averaged across all possible pairwise comparisons against the other 999 reconstructions to ensure no bias from random sample selection. This yielded 1,000 averaged percent-correct scores, which were themselves averaged to obtain the metrics reported in Table 1.
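To make the procedure concrete, below is a minimal sketch of the two-way identification metric in Python with NumPy. The function name `two_way_identification` and the arrays `gt_embeds` and `recon_embeds` (model embeddings for the 1,000 ground-truth test images and their reconstructions, e.g. from AlexNet, InceptionV3, or CLIP) are illustrative assumptions and not taken from the MindEye2 codebase.

```python
import numpy as np

def two_way_identification(gt_embeds: np.ndarray, recon_embeds: np.ndarray) -> float:
    """Average percent correct over all pairwise two-way comparisons.

    gt_embeds, recon_embeds: (N, D) arrays of model embeddings for the
    N ground-truth test images and their reconstructions (names assumed).
    """
    n, d = gt_embeds.shape

    # Pearson correlation between every ground-truth embedding and every
    # reconstruction embedding: corr[i, j] = r(gt_i, recon_j).
    gt_z = gt_embeds - gt_embeds.mean(axis=1, keepdims=True)
    gt_z /= gt_z.std(axis=1, keepdims=True)
    rec_z = recon_embeds - recon_embeds.mean(axis=1, keepdims=True)
    rec_z /= rec_z.std(axis=1, keepdims=True)
    corr = gt_z @ rec_z.T / d

    per_sample_acc = np.empty(n)
    for i in range(n):
        own = corr[i, i]                # r(gt_i, recon_i)
        others = np.delete(corr[i], i)  # r(gt_i, recon_j) for all j != i
        # Fraction of the other n-1 reconstructions whose correlation with
        # gt_i is lower than that of its own reconstruction.
        per_sample_acc[i] = np.mean(own > others)

    # Averaging the n per-sample accuracies gives the reported metric.
    return float(per_sample_acc.mean())
```

With N = 1,000 test samples this reproduces the structure described above: 999 pairwise comparisons per sample, averaged per sample and then across samples.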
This paper is available on arXiv under a CC BY 4.0 DEED license.
Authors:
(1) Paul S. Scotti, Stability AI and Medical AI Research Center (MedARC);
(2) Mihir Tripathy, Medical AI Research Center (MedARC), core contributor;
(3) Cesar Kadir Torrico Villanueva, Medical AI Research Center (MedARC), core contributor;
(4) Reese Kneeland, University of Minnesota, core contributor;
(5) Tong Chen, The University of Sydney and Medical AI Research Center (MedARC);
(6) Ashutosh Narang, Medical AI Research Center (MedARC);
(7) Charan Santhirasegaran, Medical AI Research Center (MedARC);
(8) Jonathan Xu, University of Waterloo and Medical AI Research Center (MedARC);
(9) Thomas Naselaris, University of Minnesota;
(10) Kenneth A. Norman, Princeton Neuroscience Institute;
(11) Tanishq Mathew Abraham, Stability AI and Medical AI Research Center (MedARC).