Treats left and right as 4-component vectors of signed 8-bit integers (Int8), each packed into a 32-bit uint, and computes dot(left, right) + acc as a signed 32-bit integer.
int dot4add_i8packed(uint left, uint right, int acc);
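
The arithmetic described above can be sketched on the CPU in plain C. This is only an illustrative emulation, not the shader compiler's or driver's implementation: the names dot4add_i8packed_ref, pack_i8x4 and dot4add_i8packed_selfcheck are hypothetical helpers introduced here, and the wrap-around on 32-bit overflow is an assumption of this sketch rather than something stated in the description. The result does not depend on which byte holds which component, as long as both operands are packed the same way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs four signed 8-bit values into one 32-bit word,
   with x in the lowest byte. Any consistent order gives the same dot product. */
static uint32_t pack_i8x4(int8_t x, int8_t y, int8_t z, int8_t w)
{
    return  (uint32_t)(uint8_t)x
         | ((uint32_t)(uint8_t)y << 8)
         | ((uint32_t)(uint8_t)z << 16)
         | ((uint32_t)(uint8_t)w << 24);
}

/* Hypothetical CPU reference for dot(left, right) + acc over packed Int8 lanes. */
static int32_t dot4add_i8packed_ref(uint32_t left, uint32_t right, int32_t acc)
{
    int32_t sum = 0;
    for (int i = 0; i < 4; ++i) {
        /* Extract byte i of each operand as an unsigned value 0..255. */
        int32_t a = (int32_t)((left  >> (8 * i)) & 0xFFu);
        int32_t b = (int32_t)((right >> (8 * i)) & 0xFFu);
        /* Sign-extend: values 128..255 represent -128..-1. */
        if (a >= 128) a -= 256;
        if (b >= 128) b -= 256;
        sum += a * b; /* |sum| <= 4 * 128 * 128, so this cannot overflow */
    }
    /* Assumption: the accumulate wraps modulo 2^32; done in unsigned
       arithmetic to avoid signed-overflow undefined behavior in C. */
    return (int32_t)((uint32_t)acc + (uint32_t)sum);
}

/* Small usage example: each assert checks one hand-computed case. */
static void dot4add_i8packed_selfcheck(void)
{
    /* 1*5 + 2*6 + 3*7 + 4*8 = 70, plus acc = 10 */
    assert(dot4add_i8packed_ref(pack_i8x4(1, 2, 3, 4),
                                pack_i8x4(5, 6, 7, 8), 10) == 80);
    /* Negative lanes are sign-extended: -1 - 2 - 3 - 4 = -10 */
    assert(dot4add_i8packed_ref(pack_i8x4(-1, -2, -3, -4),
                                pack_i8x4(1, 1, 1, 1), 0) == -10);
}
```

Calling dot4add_i8packed_selfcheck() from a test harness exercises both positive and negative lanes. A reference like this can be useful for validating shader output read back from the GPU.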