- Use only steady timestamp counters to guarantee correctness
- CPU backend: directly measure total hashrate using raw hash counters from each thread; update data more often on ARM CPUs because they're slower
- GPU backends: directly measure total hashrate too, but use interpolator with 4 second lag to fix variance from batches of hashes
Total hashrate is now measured directly (realtime for CPU, 4 seconds lag for GPU), so it might differ a bit from the sum of all thread hashrates because data points are taken at different moments in time.
Overhead is reduced a lot since it doesn't have to go through all threads to calculate max total hashrate on every timer tick (2 times a second).
Set to false by default, gives 0.2% boost on Ryzen 7 3700X with 16 threads, but hashrate might be unstable on Ryzen between launches. Use with caution.
`scratchpad_prefetch_mode` can have 4 values:
0: off
1: use `prefetcht0` instruction (default, same as previous XMRig versions)
2: use `prefetchnta` instruction (faster on Coffee Lake and a few other CPUs)
3: use `mov` instruction
Added "astrobwt-avx2" parameter in config.json, it's turned off ("false") by default.
4-5% speedup on CPUs with proper AVX2 support (AMD Ryzen starting with Zen2, Intel Core starting with Haswell).
There will be no speedup on the following CPUs:
- Intel Pentium/Celeron don't support AVX2
- AMD Zen/Zen+ have only half-speed AVX
GCC compiled version is faster without AVX2, MSVC compiled version is faster with AVX2
Skips hashes with large stage 2 size. Added configurable `astrobwt-max-size` parameter, default value is 550, min 400, max 1200, optimal value ranges from 500 to 600 depending on CPU.
- Intel CPUs get 20-25% speedup
- 1st- and 2nd-gen Ryzens get 30% speedup
- 3rd-gen Ryzens get up to 50% speedup