Commit graph

9 commits

Author SHA1 Message Date
SChernykh
ff82ca57f2 RandomX: rewrote dataset read code
Unified code for AMD and Intel
1% faster on Intel
0.15% faster on AMD Ryzen
2021-05-20 12:45:42 +02:00
SChernykh
d443dd86f1 RandomX: added BMI2 version for scratchpad prefetch
Saves 1 instruction and 1 byte in the main loop.
2021-05-19 17:52:16 +02:00
SChernykh
3477f9fbc1 RandomX: optimized IMUL_RCP instruction
+0.4% on AMD Zen2
+0.3% on AMD Zen3
+0.1% on Intel SandyBridge
+0.3% on rx/wow on Intel SandyBridge
2021-04-19 17:43:58 +02:00
SChernykh
515a85e66c Dataset initialization with AVX2 (WIP) 2020-12-18 14:53:54 +01:00
SChernykh
f80177cbd3 Optimizations for AMD Bulldozer
- Added support for XOP instructions
- Enabled Ryzen code for Bulldozer because it's faster there too
2020-01-15 13:04:26 +01:00
SChernykh
ffec421408 Fixed indentation 2019-12-08 16:20:46 +01:00
SChernykh
d0df824599 Optimized dataset read for Ryzen CPUs
Removed register dependency in dataset read, +0.8% speedup on average.
2019-12-08 16:14:02 +01:00
SChernykh
2322e3bcf7 RandomX: optimized loading from scratchpad
Prefetches scratchpad data as soon as possible to calculate data address for the next load.

Up to ~1.4% speedup on Ryzen 7 3700X @ 4.1 GHz, RAM 3200 MHz 14-14-14-28 with optimized sub-timings:
Variant|Before H/S|After H/S
-------|----------|---------
rx/0|8663|8777
rx/wow|9867|10009
rx/loki|8652|8731
2019-09-11 19:10:01 +02:00
SChernykh
6eb9d0963b Integrated RandomX, added RandomXL (Loki) 2019-07-01 20:11:51 +02:00