Improve update_piece_threats
passed on avx512ICL:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 30240 W: 8026 L: 7726 D: 14488
Ptnml(0-2): 95, 3235, 8171, 3513, 106
https://tests.stockfishchess.org/tests/view/69281d9ab23dfeae38cfeeb8
passed on generic architectures:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 73184 W: 19183 L: 18821 D: 35180
Ptnml(0-2): 258, 7988, 19744, 8338, 264
https://tests.stockfishchess.org/tests/view/6928ba11b23dfeae38cff276
subsequent cleanups tested as:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 72480 W: 18678 L: 18502 D: 35300
Ptnml(0-2): 242, 7925, 19718, 8125, 230
https://tests.stockfishchess.org/tests/view/692a26adb23dfeae38cff566
We add an argument noRaysContaining, which skips all discoveries which contain
all bits in the argument; if the argument is from | to, then this will
eliminate the discovery. Separately, on AVX512ICL we can speed up the
computation of DirtyThreats by moving from a pop_lsb loop to a vector
extraction with vpcompressb. See PR for details.
closes https://github.com/official-stockfish/Stockfish/pull/6453
No functional change