Evaluating the Representational Hub of Language and Vision Models