My Project
programmer's documentation
Anthony M. Castaldo, R. Clint Whaley, and Anthony T. Chronopoulos. Reducing floating point error in dot product using the superblock family of algorithms. SIAM J. Sci. Comput., 31(2):1156–1174, 2008.
G.C. Fox and W. Furmanski. Hypercube algorithms for neural network simulation: the crystal accumulator and the crystal router. In Proceedings of the third conference on Hypercube concurrent computers and applications: Architecture, software, computer systems, and general issues - Volume 1, 1988.
T. Hoefler, C. Siebert, and A. Lumsdaine. Scalable communication protocols for dynamic sparse data exchange. In PPoPP '10: Proceedings of the 15th ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 159–168. ACM, 2010.
W. Kahan. Pracniques: further remarks on reducing truncation errors. Communications of the ACM, 8(1), 1965.
Y. Notay and A. Napov. A massively parallel solver for discrete Poisson-like problems. J. Comput. Physics, 281:237–250, 2015.