This commit deletes the `benchmark_results.txt` file, which contained performance metrics for various cryptographic operations. Additionally, the Go module has been updated to version 1.25.0, and new dependencies have been added, including `btcec` for enhanced signing capabilities. The `go.sum` file has also been updated to reflect these changes. A new benchmark report has been introduced to provide a comprehensive comparison of signer implementations.
# Benchmark Comparison Report

## Signer Implementation Comparison

This report compares three signer implementations for secp256k1 operations:

1. **P256K1Signer** - This repository's new port from Bitcoin Core secp256k1 (pure Go)
2. **BtcecSigner** - Pure Go wrapper around btcec/v2
3. **NextP256K Signer** - CGO version using next.orly.dev/pkg/crypto/p256k (CGO bindings to libsecp256k1)

**Generated:** 2025-11-01
**Platform:** linux/amd64
**CPU:** AMD Ryzen 5 PRO 4650G with Radeon Graphics
**Go Version:** go1.25.3

---

## Summary Results

| Operation | P256K1Signer | BtcecSigner | NextP256K | Winner (vs P256K1Signer) |
|-----------|-------------|-------------|-----------|--------|
| **Pubkey Derivation** | 232,922 ns/op | 63,317 ns/op | 295,599 ns/op | Btcec (3.7x faster) |
| **Sign** | 136,560 ns/op | 216,808 ns/op | 53,454 ns/op | NextP256K (2.6x faster) |
| **Verify** | 268,771 ns/op | 160,894 ns/op | 38,423 ns/op | NextP256K (7.0x faster) |
| **ECDH** | 158,730 ns/op | 130,804 ns/op | 124,998 ns/op | NextP256K (1.3x faster) |

---

## Detailed Results

### Public Key Derivation

Deriving the public key from a private key (32-byte private key → 32-byte x-only pubkey).

| Implementation | Time per op | Memory | Allocations | Speedup vs P256K1 |
|----------------|-------------|--------|-------------|-------------------|
| **P256K1Signer** | 232,922 ns/op | 256 B/op | 4 allocs/op | 1.0x (baseline) |
| **BtcecSigner** | 63,317 ns/op | 368 B/op | 7 allocs/op | **3.7x faster** |
| **NextP256K** | 295,599 ns/op | 983,395 B/op | 9 allocs/op | 0.8x (1.3x slower) |

**Analysis:**
- Btcec is fastest for key derivation (3.7x faster than P256K1)
- NextP256K is slowest, likely due to CGO overhead for small operations
- P256K1 has the lowest memory overhead (256 B/op, 4 allocs/op)

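The "Speedup vs P256K1" column can be read as the baseline time divided by the implementation's time, so values below 1.0x mean the implementation is slower than P256K1Signer. A minimal sketch of that arithmetic, using the pubkey derivation timings from the table (the `speedup` helper is a name introduced here for illustration, not part of the repository):

```go
package main

import "fmt"

// speedup returns how many times faster an implementation is than the
// baseline: baseline time divided by implementation time. Values below 1
// mean the implementation is slower than the baseline.
func speedup(baselineNs, implNs float64) float64 {
	return baselineNs / implNs
}

func main() {
	const p256k1 = 232922.0 // ns/op, P256K1Signer pubkey derivation (baseline)

	fmt.Printf("BtcecSigner: %.1fx\n", speedup(p256k1, 63317))  // prints 3.7x
	fmt.Printf("NextP256K:   %.1fx\n", speedup(p256k1, 295599)) // prints 0.8x
}
```

The same formula applies to the signing, verification, and ECDH tables.
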
### Signing (Schnorr)

Creating BIP-340 Schnorr signatures (32-byte message → 64-byte signature).

| Implementation | Time per op | Memory | Allocations | Speedup vs P256K1 |
|----------------|-------------|--------|-------------|-------------------|
| **P256K1Signer** | 136,560 ns/op | 1,152 B/op | 17 allocs/op | 1.0x (baseline) |
| **BtcecSigner** | 216,808 ns/op | 2,193 B/op | 38 allocs/op | 0.6x (1.6x slower) |
| **NextP256K** | 53,454 ns/op | 128 B/op | 3 allocs/op | **2.6x faster** |

**Analysis:**
- NextP256K is fastest (2.6x faster than P256K1), benefiting from the optimized C implementation
- P256K1 is second fastest, showing good performance for pure Go
- Btcec is slowest, likely due in part to its higher allocation count (38 allocs/op vs 17 for P256K1)
- NextP256K has the lowest memory usage (128 B/op vs 1,152 B/op for P256K1)

### Verification (Schnorr)

Verifying BIP-340 Schnorr signatures (32-byte message + 64-byte signature).

| Implementation | Time per op | Memory | Allocations | Speedup vs P256K1 |
|----------------|-------------|--------|-------------|-------------------|
| **P256K1Signer** | 268,771 ns/op | 576 B/op | 9 allocs/op | 1.0x (baseline) |
| **BtcecSigner** | 160,894 ns/op | 1,120 B/op | 18 allocs/op | 1.7x faster |
| **NextP256K** | 38,423 ns/op | 96 B/op | 2 allocs/op | **7.0x faster** |

**Analysis:**
- NextP256K is fastest by a wide margin (7.0x faster than P256K1), showing the advantage of the optimized C library for verification
- Btcec is second fastest (1.7x faster than P256K1)
- P256K1 is slowest at 268,771 ns/op, which still allows about 3,700 verifications per second on one core
- NextP256K has a minimal memory footprint (96 B/op vs 576 B/op for P256K1)

### ECDH (Shared Secret Generation)

Generating a shared secret using Elliptic Curve Diffie-Hellman.

| Implementation | Time per op | Memory | Allocations | Speedup vs P256K1 |
|----------------|-------------|--------|-------------|-------------------|
| **P256K1Signer** | 158,730 ns/op | 241 B/op | 6 allocs/op | 1.0x (baseline) |
| **BtcecSigner** | 130,804 ns/op | 832 B/op | 13 allocs/op | 1.2x faster |
| **NextP256K** | 124,998 ns/op | 160 B/op | 3 allocs/op | **1.3x faster** |

**Analysis:**
- All three implementations are within about 1.3x of each other
- NextP256K has a slight edge (1.3x faster than P256K1)
- NextP256K has the lowest memory usage (160 B/op); P256K1 is close behind at 241 B/op
- The performance difference is marginal for this operation


---

## Performance Analysis

### Overall Winner: NextP256K (CGO)

The CGO-based NextP256K implementation wins in 3 out of 4 operations; it is the slowest only for pubkey derivation:
- **Signing:** 2.6x faster than P256K1
- **Verification:** 7.0x faster than P256K1 (largest advantage)
- **ECDH:** 1.3x faster than P256K1

### Best Pure Go: Mixed Results

For pure Go implementations:
- **Btcec** wins for key derivation (3.7x faster than P256K1)
- **P256K1** wins for signing among pure Go (1.6x faster than Btcec, though still slower than CGO)
- **Btcec** is faster for verification (1.7x faster than P256K1)
- Both are comparable for ECDH (Btcec is 1.2x faster)

### Memory Efficiency

| Implementation | Avg Memory per Operation | Notes |
|----------------|-------------------------|-------|
| **NextP256K** | ~246 KB avg (~128 B excluding pubkey derivation) | Minimal allocations except in pubkey derivation |
| **P256K1Signer** | ~556 B avg | Low memory footprint |
| **BtcecSigner** | ~1.1 KB avg | Higher allocations, but acceptable |

**Note:** NextP256K shows high memory in pubkey derivation (983 KB/op), likely due to one-time CGO initialization overhead that is amortized across operations. In the other three operations it allocates 96 to 160 B/op.

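The per-implementation averages can be recomputed from the B/op columns of the four detailed tables. The sketch below takes the plain arithmetic mean over the four operations and, for NextP256K, also the mean without pubkey derivation, which shows how strongly that single figure dominates. The `meanBytes` helper is introduced here for illustration only.

```go
package main

import "fmt"

// meanBytes returns the arithmetic mean of per-operation allocation sizes.
func meanBytes(bytesPerOp ...float64) float64 {
	sum := 0.0
	for _, b := range bytesPerOp {
		sum += b
	}
	return sum / float64(len(bytesPerOp))
}

func main() {
	// B/op in table order: pubkey derivation, sign, verify, ECDH.
	fmt.Printf("P256K1Signer: %.0f B\n", meanBytes(256, 1152, 576, 241))
	fmt.Printf("BtcecSigner:  %.0f B\n", meanBytes(368, 2193, 1120, 832))
	fmt.Printf("NextP256K:    %.0f B\n", meanBytes(983395, 128, 96, 160))

	// Same NextP256K figures with the pubkey derivation outlier left out.
	fmt.Printf("NextP256K (no pubkey derivation): %.0f B\n", meanBytes(128, 96, 160))
}
```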

---

## Recommendations

### Use NextP256K (CGO) when:
- Maximum performance is critical
- CGO is acceptable in your build environment
- Low per-operation memory is important (outside pubkey derivation)
- Verification speed is critical (7x faster)

### Use P256K1Signer when:
- Pure Go is required (no CGO)
- You want a balance of performance and simplicity
- Fewer allocations than Btcec are preferred
- You want to avoid external C dependencies

### Use BtcecSigner when:
- Pure Go is required
- Key derivation performance matters (3.7x faster)
- You're already using btcec in your project
- Verification needs to be faster than P256K1 but CGO isn't available

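The recommendations above can be read as a small decision rule. The sketch below is one possible encoding, not a prescribed policy: the `Requirements` type, its fields, and the `recommend` function are hypothetical names, and the ordering of the cases is an interpretation of the lists above.

```go
package main

import "fmt"

// Requirements is a hypothetical description of a project's constraints.
type Requirements struct {
	CGOAllowed       bool // the build can link against libsecp256k1
	DerivationHeavy  bool // pubkey derivation dominates the workload
	AlreadyUsesBtcec bool // btcec is already a dependency
}

// recommend maps requirements to one of the three benchmarked signers.
// This is an interpretation of the recommendations, not an official rule.
func recommend(r Requirements) string {
	switch {
	case r.DerivationHeavy:
		// Btcec is fastest for key derivation, even compared with CGO.
		return "BtcecSigner"
	case r.CGOAllowed:
		// CGO wins signing, verification, and ECDH.
		return "NextP256K"
	case r.AlreadyUsesBtcec:
		return "BtcecSigner"
	default:
		// Pure Go, no existing btcec dependency.
		return "P256K1Signer"
	}
}

func main() {
	fmt.Println(recommend(Requirements{CGOAllowed: true}))
	fmt.Println(recommend(Requirements{DerivationHeavy: true}))
	fmt.Println(recommend(Requirements{}))
}
```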

---

## Conclusion

The benchmarks demonstrate that:

1. **CGO implementations (NextP256K) provide significant performance advantages** for signing and verification (2.6x and 7.0x faster than P256K1), but not for key derivation

2. **Pure Go implementations are competitive** for key derivation and ECDH, with Btcec the stronger of the two in key derivation and verification

3. **P256K1Signer** provides a good middle ground with reasonable performance and a clean API

4. **Memory efficiency** varies by operation, with NextP256K the most efficient in every operation except key derivation

The choice between implementations depends on your specific requirements:
- **Performance-critical applications:** Use NextP256K (CGO)
- **Pure Go requirements:** Choose between Btcec (faster for key derivation, verification, and ECDH) and P256K1 (faster signing, cleaner API)
- **Balance:** P256K1Signer offers good performance with pure Go simplicity

---

## Running the Benchmarks

To reproduce these benchmarks:

```bash
# Run all benchmarks
CGO_ENABLED=1 go test -tags=cgo ./bench -bench=. -benchmem

# Run specific operation
CGO_ENABLED=1 go test -tags=cgo ./bench -bench=BenchmarkSign

# Run specific implementation (pattern quoted so the shell does not expand the *)
CGO_ENABLED=1 go test -tags=cgo ./bench -bench='Benchmark.*_P256K1'
```

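To turn the output of these commands into a comparison table, each result line can be split into its metrics. The sketch below parses one line in the standard `go test -bench -benchmem` layout (name, iteration count, then "value unit" pairs). The sample line is made up: the `BenchmarkSign_P256K1-12` name and the iteration count are assumptions for illustration, and `Result` and `parseLine` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Result holds the metrics from one line of `go test -bench -benchmem` output.
type Result struct {
	Name        string
	NsPerOp     float64
	BytesPerOp  int64
	AllocsPerOp int64
}

// parseLine parses a line such as
//
//	BenchmarkSign_P256K1-12  8784  136560 ns/op  1152 B/op  17 allocs/op
//
// After the name and iteration count, fields come in "value unit" pairs.
func parseLine(line string) (Result, error) {
	f := strings.Fields(line)
	if len(f) < 4 || !strings.HasPrefix(f[0], "Benchmark") {
		return Result{}, fmt.Errorf("not a benchmark line: %q", line)
	}
	r := Result{Name: f[0]}
	for i := 2; i+1 < len(f); i += 2 {
		v, err := strconv.ParseFloat(f[i], 64)
		if err != nil {
			return Result{}, err
		}
		switch f[i+1] {
		case "ns/op":
			r.NsPerOp = v
		case "B/op":
			r.BytesPerOp = int64(v)
		case "allocs/op":
			r.AllocsPerOp = int64(v)
		}
	}
	return r, nil
}

func main() {
	// Sample line; the name suffix and iteration count are illustrative.
	line := "BenchmarkSign_P256K1-12 \t 8784 \t 136560 ns/op \t 1152 B/op \t 17 allocs/op"
	r, err := parseLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", r)
}
```

For comparing several runs with statistical summaries, the `benchstat` tool from `golang.org/x/perf` is the more usual choice; this parser only illustrates the line format.
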

**Note:** All benchmarks require CGO to be enabled (`CGO_ENABLED=1`) and the `cgo` build tag.

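
As a rough picture of how a per-implementation benchmark of this kind can be structured, the sketch below measures a signing call through a small interface using `testing.Benchmark`, which runs a benchmark function outside of `go test`. The `Signer` interface and `stubSigner` are hypothetical: the stub only hashes the message so the example runs without external dependencies, and the repository's real signer API and benchmark code may look different.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"testing"
)

// Signer is a hypothetical minimal interface; the real signer API in this
// repository may differ.
type Signer interface {
	Sign(msg []byte) ([]byte, error)
}

// stubSigner stands in for a real implementation so the sketch is
// self-contained. It hashes the message; it is NOT a signature scheme.
type stubSigner struct{}

func (stubSigner) Sign(msg []byte) ([]byte, error) {
	sum := sha256.Sum256(msg)
	return sum[:], nil
}

// benchSign returns a benchmark function that signs a fixed 32-byte message.
func benchSign(s Signer) func(b *testing.B) {
	return func(b *testing.B) {
		msg := make([]byte, 32)
		b.ReportAllocs() // record B/op and allocs/op, as -benchmem does
		b.ResetTimer()   // exclude setup from the measured time
		for i := 0; i < b.N; i++ {
			if _, err := s.Sign(msg); err != nil {
				b.Fatal(err)
			}
		}
	}
}

func main() {
	res := testing.Benchmark(benchSign(stubSigner{}))
	fmt.Printf("stub: %d ns/op, %d B/op, %d allocs/op\n",
		res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
}
```

Passing each real signer through the same `benchSign` function would keep the message size and measurement setup identical across implementations, which is what makes the per-operation numbers comparable.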