swSpTRSV: A fast sparse triangular solve with sparse level tile layout on Sunway architectures
Sparse triangular solve (SpTRSV) is one of the most important kernels in many real-world
applications. Currently, much research on parallel SpTRSV focuses on level-set construction …
Parallelization and optimization of NSGA-II on Sunway TaihuLight system
X Liu, J Sun, L Zheng, S Wang, Y Liu… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Sunway TaihuLight system is the first supercomputer offering a peak performance over 100
PFlops, which can be utilized to parallelize Non-dominated Sorting Genetic Algorithm II …
Parallel optimization and application of unstructured sparse triangular solver on new generation of Sunway architecture
Large-scale sparse linear equation solver plays an important role in both numerical
simulation and artificial intelligence, and sparse triangular equation solver is a key step in …
Increasing the efficiency of massively parallel sparse matrix-matrix multiplication in first-principles calculation on the new-generation Sunway supercomputer
X Chen, Y Gao, H Shang, F Li, Z Xu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
The first-principles approach based on density-functional theory (DFT)/density-functional
perturbation theory (DFPT) is widely used in calculations of the systems' ground state …
Modified fast inverse square root and square root approximation algorithms: The method of switching magic constants
Many low-cost platforms that support floating-point arithmetic, such as microcontrollers and
field-programmable gate arrays, do not include fast hardware or software methods for …
Evaluating the SW26010 many-core processor with a micro-benchmark suite for performance optimizations
The inadequate public information of China's SW26010 processor's micro-architecture
prevents global researchers from improving application performances on the TaihuLight …
Enabling highly efficient batched matrix multiplications on SW26010 many-core processor
L Jiang, C Yang, W Ma - ACM Transactions on Architecture and Code …, 2020 - dl.acm.org
We present a systematic methodology for optimizing batched matrix multiplications on
SW26010 many-core processor of the Sunway TaihuLight supercomputer. Five surrogate …
A modification of the fast inverse square root algorithm
We present a new algorithm for the approximate evaluation of the inverse square root for
single-precision floating-point numbers. This is a modification of the famous fast inverse …
Algorithms for calculating the square root and inverse square root based on the second-order Householder's method
This article proposes a set of algorithms for calculating the square root and inverse square
root for normalized single and double precision floating-point numbers. They are based on …
Efficient floating-point square root and reciprocal square root algorithms
Several algorithms for calculating square roots and inverse square roots are developed.
These are oriented on normalized numbers with a floating point for single and double …