Fix Clang Tbprobe Miscompilation
Recent changes to the Square enum (reducing it from int32_t to int8_t)
now allow the compiler to vectorize loops that were previously too wide
for targets below AVX-512. However, the vectorized code that Clang
generates for these loops is incorrect, resulting in a miscompilation.
Disable this vectorization.
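As a rough illustration only, one way to turn off vectorization for a single loop under Clang is the `#pragma clang loop vectorize(disable)` hint. The sketch below is not the actual tbprobe code: the `Square` stand-in, the `flip_squares` helper, and its parameters are hypothetical names chosen for the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for the narrowed enum (int8_t underlying type).
enum Square : std::int8_t { SQ_A1 = 0, SQ_NONE = 64 };

// Hypothetical helper: XOR every square in the array with a mask.
// The pragma asks Clang not to vectorize the loop that follows it;
// other compilers never see it because of the guard.
inline void flip_squares(Square* squares, int count, int mask) {
#if defined(__clang__)
    #pragma clang loop vectorize(disable)
#endif
    for (int i = 0; i < count; ++i)
        squares[i] = Square(int(squares[i]) ^ mask);
}
```

The hint is scoped to the one loop it precedes, so other loops in the same translation unit remain eligible for vectorization.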
This particular issue was noticeable with Clang 15 and Clang 19,
on AVX2 as well as Apple silicon.
Ref: #6063
Original Clang Issue: llvm/llvm-project#80494
First reported by #6528, though misinterpreted.
closes #6529
No functional change