Disentangling Monocular 3D Object Detection: From Single to Multi-Class Recognition