Joint Language Models for Automatic Speech Recognition and Understanding