AVX2 optimized code for AstroBWT

Added "astrobwt-avx2" parameter to config.json. It's turned off ("false") by default.

4-5% speedup on CPUs with proper AVX2 support (AMD Ryzen starting with Zen2, Intel Core starting with Haswell).

There will be no speedup on the following CPUs:

- Intel Pentium/Celeron don't support AVX2
- AMD Zen/Zen+ have only half-speed AVX

The GCC-compiled version is faster without AVX2; the MSVC-compiled version is faster with AVX2.
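
The notes above imply three independent conditions before the AVX2 code can pay off: it was compiled in, the user opted in through "astrobwt-avx2", and the CPU reports AVX2. A minimal sketch of that gate follows. It is not the code from this commit: `kAvx2CompiledIn`, `cpu_has_avx2` and `use_avx2_path` are hypothetical names, and only the `ASTROBWT_AVX2` define comes from the build change here. The CPU check assumes GCC or Clang on x86 and conservatively reports false elsewhere.

```cpp
// ASTROBWT_AVX2 is the define added by the CMake change in this commit;
// everything else in this sketch is a hypothetical illustration.
#ifdef ASTROBWT_AVX2
constexpr bool kAvx2CompiledIn = true;
#else
constexpr bool kAvx2CompiledIn = false;
#endif

// Hypothetical helper: reports whether the CPU advertises AVX2.
// Only GCC/Clang on x86 are handled; other toolchains report false.
static bool cpu_has_avx2()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

// Hypothetical selection rule: use the AVX2 path only when it was built,
// the user enabled it in the config, and the CPU supports it.
static bool use_avx2_path(bool compiled_in, bool config_flag, bool cpu_ok)
{
    return compiled_in && config_flag && cpu_ok;
}
```

A caller would evaluate `use_avx2_path(kAvx2CompiledIn, flag_from_config, cpu_has_avx2())` once at startup. Zen/Zen+ also report AVX2, so a CPU check alone cannot filter out the half-speed case, which is one reason an explicit opt-in flag is useful.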
SChernykh 2020-03-10 22:03:16 +01:00
parent 8698b73036
commit e22f798085
14 changed files with 563 additions and 15 deletions

@@ -23,6 +23,16 @@ if (WITH_ASTROBWT)
src/crypto/astrobwt/salsa20_ref/salsa20.c
)
else()
if (CMAKE_SIZEOF_VOID_P EQUAL 8)
add_definitions(/DASTROBWT_AVX2)
if (CMAKE_C_COMPILER_ID MATCHES MSVC)
enable_language(ASM_MASM)
list(APPEND SOURCES_CRYPTO src/crypto/astrobwt/sha3_256_avx2.asm)
else()
list(APPEND SOURCES_CRYPTO src/crypto/astrobwt/sha3_256_avx2.S)
endif()
endif()
list(APPEND HEADERS_CRYPTO
src/crypto/astrobwt/Salsa20.hpp
)