12th International Conference on Parallel Computing Technologies (PaCT), St Petersburg, Russia, 30 September - 04 October 2013, vol.7979, pp.1-12
A parallel implementation of Strassen's matrix multiplication algorithm is proposed for massively parallel supercomputers with 2D all-port torus interconnection networks. The proposed algorithm employs conflict-free routing patterns and operates on completely-connected subnetworks in order to reduce the latency cost L of the algorithm down to L(n, P) = 6log(7) P on allport 2D torus architectures having P nodes, which compares favorably to the latency cost L(n, P) = 36log(7) P of the recently proposed CAPS algorithm.