Multi-task-based sound localization model