- Port field operations assembler from libsecp256k1 (field_amd64.s,
field_amd64_bmi2.s) with MULX/ADCX/ADOX instructions
- Add AVX2 scalar and affine point operations in avx/ package
- Implement CPU feature detection (cpufeatures.go) for AVX2/BMI2
- Add libsecp256k1.so via purego for native C library comparison
- Create comprehensive SIMD benchmark suite comparing btcec, P256K1
pure Go, P256K1 ASM, and libsecp256k1
- Add BENCHMARK_SIMD.md documenting performance across implementations
- Remove BtcecSigner, consolidate on P256K1Signer as primary impl
- Add field operation tests and benchmarks (field_asm_test.go,
field_bench_test.go)
- Update GLV endomorphism with wNAF scalar multiplication
- Add scalar assembly (scalar_amd64.s) for optimized operations
- Clean up dependencies and update benchmark reports
This commit introduces several optimizations for elliptic curve operations in the secp256k1 library. Key changes include the implementation of the `ecmultStraussGLV` function for efficient scalar multiplication using the Strauss algorithm with GLV endomorphism, and the addition of windowed multiplication techniques to improve performance. Additionally, the benchmark tests have been updated to focus on the P256K1Signer implementation, streamlining the comparison process and enhancing clarity in performance evaluations.
This commit updates the BENCHMARK_REPORT.md to reflect the latest performance improvements following the implementation of optimized windowed multiplication for ECDH and verification. Key changes include a new generation date, updated operation times, and a detailed analysis of the performance of P256K1Signer, BtcecSigner, and NextP256K across various operations. Notably, P256K1Signer now shows significant improvements in ECDH (33% faster) and verification (20% faster), establishing it as the fastest pure Go implementation across all operations.
This commit introduces a new `ecmultWindowedVar` function that implements optimized windowed multiplication for scalar multiplication, significantly improving performance during verification operations. The existing `Ecmult` function is updated to utilize this new implementation, converting points to affine coordinates for efficiency. Additionally, the `EcmultConst` function is retained for constant-time operations. The changes also include enhancements to the generator multiplication context, utilizing precomputed byte points for improved efficiency. Overall, these optimizations lead to a notable reduction in operation times for cryptographic computations.
This commit deletes the `benchmark_results.txt` file, which contained performance metrics for various cryptographic operations. Additionally, the Go module has been updated to version 1.25.0, and new dependencies have been added, including `btcec` for enhanced signing capabilities. The `go.sum` file has also been updated to reflect these changes. A new benchmark report has been introduced to provide a comprehensive comparison of signer implementations.