The LAMBADA dataset: Word prediction requiring a broad discourse context