Matrix multiplication shared memory

onkar_shedge · April 5, 2015, 3:32am

Do the tile dimensions and block dimensions have to be the same for shared memory matrix multiplication?

little_jimmy · April 5, 2015, 7:17am

it is generally easier (i did not necessarily say better) to have the tile dimensions == block dimensions, i would think
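For reference, here is a minimal sketch of the tile dimensions == block dimensions case, assuming square n x n row-major matrices. The names `TILE`, `matmul_tiled` and `max_abs_error` are illustrative and not taken from the thread or the programming guide. Each thread loads exactly one element of each tile and produces exactly one output element, which is what makes this layout the easy one.

```cuda
#include <cassert>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16  // assumed tile width; the block is launched as TILE x TILE threads

// C = A * B for square n x n row-major matrices, tile == block.
__global__ void matmul_tiled(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int a_col = t * TILE + threadIdx.x;
        int b_row = t * TILE + threadIdx.y;
        // One load per thread per tile; out-of-range elements are padded with zero.
        As[threadIdx.y][threadIdx.x] = (row < n && a_col < n) ? A[row * n + a_col] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (b_row < n && col < n) ? B[b_row * n + col] : 0.0f;
        __syncthreads();  // tiles must be fully loaded before anyone reads them
        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // everyone must finish reading before the next load
    }
    if (row < n && col < n) C[row * n + col] = acc;
}

// Hypothetical host helper: runs the kernel and returns the largest absolute
// difference from a plain CPU reference. Error checking of CUDA calls is omitted.
float max_abs_error(int n) {
    assert(n > 0);
    std::vector<float> hA(n * n), hB(n * n), hC(n * n);
    for (int i = 0; i < n * n; ++i) {
        hA[i] = (i % 7) * 0.5f;
        hB[i] = (float)(i % 5) - 2.0f;
    }
    size_t bytes = (size_t)n * n * sizeof(float);
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);  // block dimensions equal the tile dimensions
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    matmul_tiled<<<grid, block>>>(dA, dB, dC, n);
    cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);

    float err = 0.0f;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float ref = 0.0f;
            for (int k = 0; k < n; ++k) ref += hA[i * n + k] * hB[k * n + j];
            err = std::fmax(err, std::fabs(ref - hC[i * n + j]));
        }
    return err;
}
```

Calling `max_abs_error(100)` from a `main` exercises the edge tiles too, since 100 is not a multiple of 16.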

running with your thought, if the tile dimension != block dimension, then it would be block dimension < tile dimension; i cannot picture a useful case of tile dimension < block dimension

if the block dimension < tile dimension, each thread would have to handle more than one tile element, so you would have to steer the block over the tile, likely through iteration
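A sketch of that iteration, assuming a 32 x 32 tile covered by a 32 x 8 block, so the block is stepped over the tile in four passes. The names `TILE`, `BLOCK_ROWS`, `matmul_tiled_strided` and `max_abs_error_strided` are made up for illustration, and the matrices are assumed square and row-major. It is one possible arrangement, not the only one.

```cuda
#include <cassert>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

#define TILE 32        // assumed tile width
#define BLOCK_ROWS 8   // assumed block height; the block is TILE x BLOCK_ROWS threads

// C = A * B for square n x n row-major matrices, block smaller than tile.
__global__ void matmul_tiled_strided(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int col = blockIdx.x * TILE + threadIdx.x;
    int row0 = blockIdx.y * TILE;            // first output row of this block's tile
    float acc[TILE / BLOCK_ROWS] = {0.0f};   // one accumulator per output row owned

    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        // Steer the 32 x 8 block over the 32 x 32 tile: each thread loads 4 rows.
        for (int r = threadIdx.y; r < TILE; r += BLOCK_ROWS) {
            int a_row = row0 + r, a_col = t * TILE + threadIdx.x;
            int b_row = t * TILE + r;
            As[r][threadIdx.x] = (a_row < n && a_col < n) ? A[a_row * n + a_col] : 0.0f;
            Bs[r][threadIdx.x] = (b_row < n && col < n) ? B[b_row * n + col] : 0.0f;
        }
        __syncthreads();
        // Same stepping for the compute: each thread updates 4 output elements.
        for (int i = 0; i < TILE / BLOCK_ROWS; ++i) {
            int r = threadIdx.y + i * BLOCK_ROWS;
            for (int k = 0; k < TILE; ++k)
                acc[i] += As[r][k] * Bs[k][threadIdx.x];
        }
        __syncthreads();
    }
    for (int i = 0; i < TILE / BLOCK_ROWS; ++i) {
        int row = row0 + threadIdx.y + i * BLOCK_ROWS;
        if (row < n && col < n) C[row * n + col] = acc[i];
    }
}

// Hypothetical host helper: largest absolute difference from a CPU reference.
// Error checking of CUDA calls is omitted.
float max_abs_error_strided(int n) {
    assert(n > 0);
    std::vector<float> hA(n * n), hB(n * n), hC(n * n);
    for (int i = 0; i < n * n; ++i) {
        hA[i] = (i % 7) * 0.5f;
        hB[i] = (float)(i % 5) - 2.0f;
    }
    size_t bytes = (size_t)n * n * sizeof(float);
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, BLOCK_ROWS);  // fewer threads than tile elements
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    matmul_tiled_strided<<<grid, block>>>(dA, dB, dC, n);
    cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);

    float err = 0.0f;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float ref = 0.0f;
            for (int k = 0; k < n; ++k) ref += hA[i * n + k] * hB[k * n + j];
            err = std::fmax(err, std::fabs(ref - hC[i * n + j]));
        }
    return err;
}
```

The grid is still sized by the tile, not by the block: one block per 32 x 32 output tile, with each thread responsible for `TILE / BLOCK_ROWS` output elements.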

why do you ask this?

onkar_shedge · April 5, 2015, 8:22am

I am having a hard time understanding shared memory (tiled) matrix multiplication

little_jimmy · April 5, 2015, 8:38am

there is likely a matrix transpose buried in there
thus, if you first understand how to do a matrix transpose, and why it is done the way it is done, the matrix multiplication should be easier to follow, i would think
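As a starting point, a minimal sketch of a shared memory transpose, assuming a row-major `rows` x `cols` input and a 32 x 32 tile. The names `TILE`, `transpose_tiled` and `transpose_mismatches` are illustrative only. What it has in common with the tiled multiplication is the staging pattern: load a tile into shared memory, synchronize, then read it back in a different order.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

#define TILE 32  // assumed tile width; the block is launched as TILE x TILE threads

// out (cols x rows) = transpose of in (rows x cols), both row-major.
__global__ void transpose_tiled(const float* in, float* out, int rows, int cols) {
    // The extra column is padding, commonly used to avoid shared memory bank conflicts.
    __shared__ float tile[TILE][TILE + 1];

    int x = blockIdx.x * TILE + threadIdx.x;  // input column
    int y = blockIdx.y * TILE + threadIdx.y;  // input row
    if (x < cols && y < rows)
        tile[threadIdx.y][threadIdx.x] = in[y * cols + x];
    __syncthreads();  // the whole tile must be staged before it is read transposed

    // Swap the block indices so that both the read and the write stay coalesced.
    x = blockIdx.y * TILE + threadIdx.x;  // output column (an input row index)
    y = blockIdx.x * TILE + threadIdx.y;  // output row (an input column index)
    if (x < rows && y < cols)
        out[y * rows + x] = tile[threadIdx.x][threadIdx.y];
}

// Hypothetical host helper: returns the number of elements that differ from a
// CPU transpose. Error checking of CUDA calls is omitted.
int transpose_mismatches(int rows, int cols) {
    assert(rows > 0 && cols > 0);
    std::vector<float> hIn(rows * cols), hOut(rows * cols);
    for (int i = 0; i < rows * cols; ++i) hIn[i] = (float)i;
    size_t bytes = (size_t)rows * cols * sizeof(float);
    float *dIn, *dOut;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, hIn.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((cols + TILE - 1) / TILE, (rows + TILE - 1) / TILE);
    transpose_tiled<<<grid, block>>>(dIn, dOut, rows, cols);
    cudaMemcpy(hOut.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);

    int bad = 0;
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            if (hOut[c * rows + r] != hIn[r * cols + c]) ++bad;
    return bad;
}
```

Calling `transpose_mismatches(100, 70)` from a `main` covers a non-square input with partial edge tiles.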

Robert_Crovella · April 5, 2015, 1:42pm

There’s a writeup in the programming guide that covers matrix multiplication using shared memory that may be of interest:

Programming Guide :: CUDA Toolkit Documentation

onkar_shedge · April 6, 2015, 2:23am

Thank You!!!
