A low-complexity distributed cyclic redundancy check (CRC) architecture for the CRC-aided early stopping unit is proposed. In the previous distributed CRC unit, the general high-order Galois field (GF) multiplier occupies almost the entire area of the CRC unit and incurs a high hardware cost and a long critical-path delay. Accordingly, a computation algorithm based on GF arithmetic is analysed, and an optimal CRC unit with a low-order GF multiplier and a newly designed linear feedback shift register is proposed. The proposed CRC architecture is implemented in a 65 nm CMOS process for radix-2² and radix-2⁴ parallel turbo decoders based on LTE-Advanced. In the radix-2² system, reductions of about 57.1% in gate count, 31.7% in critical-path delay and 44.1% in power consumption are achieved compared with the previous work.
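As background for the distributed CRC idea, the following is a minimal Python sketch of one generic way to combine per-segment CRC remainders with GF(2) polynomial arithmetic. It is not the architecture proposed here, and it makes no attempt to reproduce the low-order multiplier or the new shift register design. The function names, the zero initial register state and the choice of the LTE CRC-24A generator polynomial are assumptions made for illustration only.

```python
# Assumption: LTE CRC-24A generator polynomial (degree 24), zero initial state.
POLY = 0x1864CFB
DEG = 24
MASK = (1 << DEG) - 1


def poly_mod(value: int) -> int:
    """Reduce a GF(2) polynomial (stored as int bits) modulo POLY."""
    while value.bit_length() > DEG:
        value ^= POLY << (value.bit_length() - DEG - 1)
    return value


def gf_mul(a: int, b: int) -> int:
    """Carry-less multiply of two residues, then reduce modulo POLY."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return poly_mod(product)


def crc_bits(bits) -> int:
    """Bit-serial LFSR: remainder of bits(x) * x^DEG modulo POLY."""
    reg = 0
    for bit in bits:
        feedback = ((reg >> (DEG - 1)) & 1) ^ bit
        reg = (reg << 1) & MASK
        if feedback:
            reg ^= POLY & MASK
    return reg


def distributed_crc(segments) -> int:
    """Combine per-segment CRCs; segments are listed first-transmitted first."""
    total = 0
    trailing = sum(len(seg) for seg in segments)
    for seg in segments:
        trailing -= len(seg)              # bits that follow this segment
        weight = poly_mod(1 << trailing)  # x^trailing modulo POLY
        total ^= gf_mul(crc_bits(seg), weight)
    return total


if __name__ == "__main__":
    bits = [(i * 7 + 3) % 5 % 2 for i in range(200)]
    segments = [bits[:30], bits[30:110], bits[110:]]
    print(distributed_crc(segments) == crc_bits(bits))
```

In this sketch each segment's remainder can be computed independently, as the sub-decoders of a parallel turbo decoder would do, and the only cross-segment work is one GF multiplication per segment by a position-dependent constant. That multiplication is the general high-order multiplier whose cost the abstract identifies as the bottleneck.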