UniTR: A unified transformer-based framework for co-object and multi-modal saliency detection