Attention-based Fusion for Multi-source Human Image Generation