Flexible CTU-level parallel motion estimation by CPU and GPU pipeline for HEVC