publications
2023
- Canonical dimensions of vision. Zirui Chen and Michael Bonner. CCN, 2023
What accounts for the representational similarities between deep neural networks (DNNs) and visual cortex? While these similarities are often attributed to shared constraints imposed by specific learning objectives or architectures, DNNs with widely varied designs appear to perform equally well as representational models of visual cortex. Here, we suggest that a more global perspective is needed to understand the relationship between DNNs and visual cortex. We reasoned that the most essential visual representations are general-purpose and thus naturally emerge from diverse visual systems. This leads to a specific hypothesis: it is possible to identify canonical dimensions, extensively learned by many DNNs, that best explain cortical visual representations. To test this hypothesis, we developed a novel metric, called canonical strength, that quantifies the degree to which a representational feature in a DNN can be observed in the latent space of many other DNNs. We computed this metric for over 150,000 feature dimensions from DNNs with diverse optimization constraints. Our results showed a striking positive association between the canonical strength of feature dimensions and their shared variance with cortical visual representations. These results suggest that biologically relevant visual features are generically learnable independent of the learning and architectural constraints of DNNs.
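The canonical-strength idea above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the following is an assumption: score a feature dimension by how well it can be linearly read out (ridge regression R²) from each other DNN's latent space, and average across DNNs. The function name and the toy data are hypothetical.

```python
import numpy as np

def canonical_strength(feature, other_latents, ridge=1e-3):
    """Hedged sketch of a canonical-strength-like score: the mean R^2
    when linearly predicting one DNN feature dimension (n_stimuli,)
    from the latent spaces of other DNNs (each (n_stimuli, n_units)).
    The paper's exact metric may differ."""
    scores = []
    for Z in other_latents:
        # ridge regression of the feature onto the other model's latents
        w = np.linalg.solve(Z.T @ Z + ridge * np.eye(Z.shape[1]),
                            Z.T @ feature)
        pred = Z @ w
        # R^2 of the fit = how "observable" the feature is in this space
        ss_res = np.sum((feature - pred) ** 2)
        ss_tot = np.sum((feature - feature.mean()) ** 2)
        scores.append(1.0 - ss_res / ss_tot)
    # average observability across the other DNNs
    return float(np.mean(scores))

# toy example: a feature dimension shared across two synthetic "DNNs"
rng = np.random.default_rng(0)
shared = rng.standard_normal(100)
latents = [np.column_stack([shared + 0.1 * rng.standard_normal(100),
                            rng.standard_normal((100, 4))])
           for _ in range(2)]
print(canonical_strength(shared, latents))  # high: near 1 for a shared feature
```

A feature idiosyncratic to one network (absent from the other latent spaces) would score near zero under this sketch, matching the intuition that canonical dimensions are those many systems learn.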
- A stimulus-driven approach reveals vertical luminance gradient as a stimulus feature that drives human cortical scene selectivity. Annie Cheng, Zirui Chen, and Daniel D. Dilks. NeuroImage, 2023
Human neuroimaging studies have revealed a dedicated cortical system for visual scene processing. But what is a “scene”? Here, we use a stimulus-driven approach to identify a stimulus feature that selectively drives cortical scene processing. Specifically, using fMRI data from BOLD5000, we examined the images that elicited the greatest response in the cortical scene processing system, and found that there is a common “vertical luminance gradient” (VLG), with the top half of a scene image brighter than the bottom half; moreover, across the entire set of images, VLG systematically increases with the neural response in the scene-selective regions (Study 1). Thus, we hypothesized that VLG is a stimulus feature that selectively engages cortical scene processing, and directly tested the role of VLG in driving cortical scene selectivity using tightly controlled VLG stimuli (Study 2). Consistent with our hypothesis, we found that the scene-selective cortical regions—but not an object-selective region or early visual cortex—responded significantly more to images of VLG over control stimuli with minimal VLG. Interestingly, such selectivity was also found for images with an “inverted” VLG, resembling the luminance gradient in night scenes. Finally, we also tested the behavioral relevance of VLG for visual scene recognition (Study 3); we found that participants even categorized tightly controlled stimuli of both upright and inverted VLG to be a place more than an object, indicating that VLG is also used for behavioral scene recognition. Taken together, these results reveal that VLG is a stimulus feature that selectively engages cortical scene processing, and provide evidence for a recent proposal that visual scenes can be characterized by a set of common and unique visual features.
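The VLG feature described above has a straightforward reading that can be sketched in a few lines. The abstract does not specify the exact computation, so this is an assumption: VLG as the mean luminance of the top half of a grayscale image minus that of the bottom half. The function name and toy image are hypothetical.

```python
import numpy as np

def vertical_luminance_gradient(image):
    """Hedged sketch of the VLG feature: mean luminance of the top
    half of a grayscale image minus that of the bottom half (rows
    ordered top to bottom). Positive VLG -> top brighter, as in
    daytime scenes; negative ("inverted") VLG resembles night
    scenes. The paper's exact computation may differ."""
    image = np.asarray(image, dtype=float)
    mid = image.shape[0] // 2
    return image[:mid].mean() - image[mid:].mean()

# toy example: bright "sky" over darker "ground"
scene = np.vstack([np.full((4, 8), 0.9),   # top half, luminance 0.9
                   np.full((4, 8), 0.2)])  # bottom half, luminance 0.2
print(vertical_luminance_gradient(scene))  # ~0.7 (top brighter)
```

Under this sketch, an upright scene yields a positive VLG and a night-like or inverted scene a negative one, mirroring the upright/inverted manipulation tested in Studies 2 and 3.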