Abstract
Efficient prime number generation is a cornerstone of computational number theory and cryptography. This paper presents a comparative analysis of classical and modern sieve algorithms at three input scales (10⁶, 10⁹, and 10¹²), focusing on time complexity, space complexity, and generation efficiency. We evaluate nine algorithms, including traditional methods such as the Sieve of Eratosthenes, advanced optimizations such as the Segmented Wheel Sieve (P41), and GPU-accelerated sieves such as Zakiya's Primegen.
1. Introduction
Prime numbers are essential in numerous fields, especially cryptography and primality testing. Over time, many sieve algorithms have been proposed to make prime generation more efficient. This study systematically compares these algorithms to identify their strengths and limitations in different computational scenarios.
2. Sieve Algorithms Compared
The following sieve algorithms were analysed:
3. Methodology
Each algorithm was tested at three input sizes: 10⁶, 10⁹, and 10¹². Time was measured in seconds, space in megabytes (MB), and efficiency as primes generated per second. The GPU implementations were executed on modern CUDA-compatible hardware. Performance metrics were gathered under standardized conditions to ensure fairness in evaluation.
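The three metrics described above can be collected with a small harness. The following is a minimal sketch and not the benchmark code used in this study: the names `simple_sieve` and `benchmark` are hypothetical, the sieve is a plain CPU Sieve of Eratosthenes used only as a stand-in, and `tracemalloc` reports Python-level allocations, which only approximates the memory footprint of a native or GPU implementation.

```python
import time
import tracemalloc


def simple_sieve(limit):
    """Plain Sieve of Eratosthenes (stand-in algorithm); returns all primes <= limit."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)  # one byte per number
    flags[0] = flags[1] = 0
    p = 2
    while p * p <= limit:
        if flags[p]:
            # Cross off p*p, p*p + p, ... in one slice assignment.
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
        p += 1
    return [i for i, flag in enumerate(flags) if flag]


def benchmark(sieve, limit):
    """Measure seconds, peak MB, and primes per second for one run (hypothetical helper)."""
    tracemalloc.start()
    start = time.perf_counter()
    primes = sieve(limit)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "limit": limit,
        "primes": len(primes),
        "seconds": elapsed,
        "peak_mb": peak_bytes / 1e6,
        "primes_per_second": len(primes) / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    print(benchmark(simple_sieve, 10**6))
```

Because tracing allocations slows execution, a more careful setup would probably time and measure memory in separate runs and repeat each measurement several times.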
4. Results
4.1 Time Complexity
The P41 Segmented GPU and Zakiya's Primegen demonstrated the best performance for large inputs, achieving significant speed-ups over CPU-based methods. Basic methods such as the Sieve of Eratosthenes and the Sieve of Atkin became infeasible at 10¹² due to memory and computation constraints.
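The memory constraint at 10¹² can be made concrete with rough arithmetic. The sketch below is illustrative only: `sieve_memory_bytes` is a hypothetical helper, and it assumes a flat, non-segmented flag array, which is not necessarily how the implementations tested here store their state.

```python
import math


def sieve_memory_bytes(limit, bits_per_candidate=8, candidate_fraction=1.0):
    """Approximate bytes needed for a flat flag array covering 1..limit.

    candidate_fraction is the share of numbers that are stored at all
    (1.0 for every number, 0.5 for odd numbers only).
    """
    return limit * candidate_fraction * bits_per_candidate / 8


for n in (10**6, 10**9, 10**12):
    byte_flags = sieve_memory_bytes(n)                 # one byte per number
    odd_bits = sieve_memory_bytes(n, 1, 0.5)           # one bit per odd number
    base_limit = math.isqrt(n)                         # largest base prime a segmented sieve needs
    print(f"n={n:.0e}: byte flags {byte_flags / 1e9:.3f} GB, "
          f"odd-only bits {odd_bits / 1e9:.4f} GB, base primes up to {base_limit}")
```

At 10¹² a byte-per-number array needs 10¹² bytes (1 TB), and even one bit per odd number needs 62.5 GB, whereas a segmented sieve only has to keep the base primes up to 10⁶ plus one working segment in memory.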
4.2 Space Complexity
Bitwise and wheel-optimized sieves drastically reduced memory usage. P41-based methods showed the lowest memory footprints, making them highly suitable for resource-constrained environments.
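To illustrate where these savings come from, the sketch below packs one bit per odd number and, separately, lists the residues a wheel keeps. It is a minimal illustration under stated assumptions: `odd_bit_sieve` and `wheel_residues` are hypothetical names, the wheel shown uses the small basis (2, 3, 5) rather than the P41 set, and none of this is the code benchmarked in this study.

```python
import math


def odd_bit_sieve(limit):
    """Primes <= limit using one bit per odd number (about limit/16 bytes)."""
    if limit < 2:
        return []
    n_odds = (limit + 1) // 2  # index i represents the odd number 2*i + 1
    bits = bytearray([0xFF]) * ((n_odds + 7) // 8)

    def clear(i):
        bits[i >> 3] &= ~(1 << (i & 7)) & 0xFF

    def is_set(i):
        return (bits[i >> 3] >> (i & 7)) & 1

    clear(0)  # the number 1 is not prime
    i = 1
    while (2 * i + 1) ** 2 <= limit:
        if is_set(i):
            p = 2 * i + 1
            # p*p is odd, so its index is p*p // 2; a step of p in index
            # space is a step of 2*p in number space (odd multiples only).
            for j in range(p * p // 2, n_odds, p):
                clear(j)
        i += 1
    return [2] + [2 * i + 1 for i in range(1, n_odds) if is_set(i)]


def wheel_residues(basis_primes):
    """Residues coprime to the product of basis_primes; only these need flags."""
    modulus = math.prod(basis_primes)
    residues = [r for r in range(1, modulus) if math.gcd(r, modulus) == 1]
    return modulus, residues


if __name__ == "__main__":
    modulus, residues = wheel_residues((2, 3, 5))
    print(len(odd_bit_sieve(10**6)), modulus, residues)
```

With the basis (2, 3, 5) only 8 of every 30 numbers are candidates, so a bit-packed wheel sieve over that basis stores 8 bits per block of 30 integers; larger bases reduce the stored fraction further at the cost of a larger residue table.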
4.3 Generation Efficiency
GPU-based methods excelled in terms of primes generated per second. P41 Segmented GPU achieved up to 150,000 primes/second at 10¹², outperforming all other methods. The balance of memory efficiency and computational speed in the P41 GPU implementation
5. Discussion
While traditional sieves are easy to implement and effective for small inputs, they fail to scale to the levels required for contemporary cryptographic applications. Wheel factorization and segmentation provide meaningful improvements. The P41 model, especially when implemented on GPU hardware, represents a state-of-the-art method for large-scale prime generation.
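The segmentation idea mentioned above can be sketched on the CPU as follows. This is a minimal illustration, not the P41 or GPU implementation evaluated here: `small_primes` and `segmented_sieve` are hypothetical names, the default segment size is an arbitrary assumption, and wheel factorization and bit packing are deliberately left out to keep the example short.

```python
import math


def small_primes(limit):
    """Plain sieve used only to produce the base primes <= limit."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(flags) if flag]


def segmented_sieve(limit, segment_size=1 << 16):
    """Yield primes <= limit, holding only one segment of flags at a time."""
    if limit < 2:
        return
    base = small_primes(math.isqrt(limit))
    low = 2
    while low <= limit:
        high = min(low + segment_size - 1, limit)
        segment = bytearray([1]) * (high - low + 1)
        for p in base:
            if p * p > high:
                break
            # First multiple of p inside [low, high], never p itself.
            start = max(p * p, (low + p - 1) // p * p)
            segment[start - low :: p] = bytearray(len(range(start, high + 1, p)))
        for offset, flag in enumerate(segment):
            if flag:
                yield low + offset
        low = high + 1


if __name__ == "__main__":
    print(sum(1 for _ in segmented_sieve(10**6)))
```

Working memory here depends on `segment_size` and the number of base primes up to √limit rather than on the limit itself, which is what lets a sieve reach ranges where a flat array no longer fits in memory.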
6. Conclusion
This study confirms the importance of hybrid optimization techniques (segmentation + wheel + bit compression) for large-scale prime generation. The results highlight the effectiveness of GPU-accelerated implementations, particularly those based on the P41 coprime set. Future work includes refining P41 GPU kernels and extending benchmarks to distributed systems and hybrid CPU-GPU clusters.
7. References