Exploiting language models for visual recognition