稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

發(fā)布時(shí)間：2016-07-28 14:03

本文關(guān)鍵詞：稀疏矩陣，由筆耕文化傳播整理發(fā)布。

稀疏矩陣是指矩陣中的元素大部分是0的矩陣，事實(shí)上，實(shí)際問題中大規(guī)模矩陣基本上都是稀疏矩陣，很多稀疏度在90%甚至99%以上。因此我們需要有高效的稀疏矩陣存儲格式。本文總結(jié)幾種典型的格式：COO,CSR,DIA,ELL,HYB。

（1）Coordinate（COO）

這是最簡單的一種格式，每一個(gè)元素需要用一個(gè)三元組來表示，分別是（行號，列號，數(shù)值），對應(yīng)上圖右邊的一列。這種方式簡單，但是記錄單信息多（行列），每個(gè)三元組自己可以定位，因此空間不是最優(yōu)。

（2）Compressed Sparse Row (CSR)

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

CSR是比較標(biāo)準(zhǔn)的一種，也需要三類數(shù)據(jù)來表達(dá)：數(shù)值，列號，以及行偏移。CSR不是三元組，而是整體的編碼方式。數(shù)值和列號與COO一致，表示一個(gè)元素以及其列號，行偏移表示某一行的第一個(gè)元素在values里面的起始偏移位置。如上圖中，第一行元素1是0偏移，第二行元素2是2偏移，第三行元素5是4偏移，第4行元素6是7偏移。在行偏移的最后補(bǔ)上矩陣總的元素個(gè)數(shù)，本例中是9。

CSC是和CSR相對應(yīng)的一種方式，即按列壓縮的意思。

以上圖中矩陣為例：

Values： [1 5 7 2 6 8 3 9 4]

Row Indices：[0 2 0 1 3 1 2 2 3]

Column Offsets：[0 2 5 7 9]

再來看一個(gè)CSR的例子[4]：

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

（3）ELLPACK (ELL)

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

用兩個(gè)和原始矩陣相同行數(shù)的矩陣來存：第一個(gè)矩陣存的是列號，第二個(gè)矩陣存的是數(shù)值，行號就不存了，用自身所在的行來表示；這兩個(gè)矩陣每一行都是從頭開始放，如果沒有元素了就用個(gè)標(biāo)志比如*結(jié)束。上圖中間矩陣有誤，第三行應(yīng)該是 0 2 3。

注：這樣如果某一行很多元素，那么后面兩個(gè)矩陣就會很胖，其他行結(jié)尾*很多，浪費(fèi)。可以存成數(shù)組，比如上面兩個(gè)矩陣就是：

0 1 * 1 2 * 0 2 3 * 1 3 *

1 7 * 2 8 * 5 3 9 * 6 4 *

但是這樣要取一行就比較不方便了

（4）Diagonal (DIA)

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

對角線存儲法，按對角線方式存，列代表對角線，行代表行。省略全零的對角線。(從左下往右上開始：第一個(gè)對角線是零忽略，第二個(gè)對角線是5，6，第三個(gè)對角線是零忽略，，第四個(gè)對角線是1，2，3，4，第五個(gè)對角線是7，8，9，第六第七個(gè)對角線忽略)。[3]

這里行對應(yīng)行，所以5和6是分別在第三行第四行的，前面補(bǔ)上無效元素*。如果對角線中間有0，存的時(shí)候也需要補(bǔ)0，所以如果原始矩陣就是一個(gè)對角性很好的矩陣那壓縮率會非常高，比如下圖，但是如果是隨機(jī)的那效率會非常糟糕。

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

（5）Hybrid (HYB) ELL + COO

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

為了解決（3）ELL中提到的，如果某一行特別多，造成其他行的浪費(fèi)，那么把這些多出來的元素（比如第三行的9，其他每一行最大都是2個(gè)元素）用COO單獨(dú)存儲。

選擇稀疏矩陣存儲格式的一些經(jīng)驗(yàn)[2]：

一些特殊類型矩陣的存儲效率（數(shù)值越小說明壓縮率越高，即存儲效率越高）:

Structured Mesh

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

Unstructured Mesh

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

Random matrix

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

Power-Law Graph

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

格式適用性總結(jié)：

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB

下面摘自[2]

6. Skyline Storage Format

The skyline storage format is important for the direct sparse solvers, and it is well suited for Cholesky or LU decomposition when no pivoting is required.

The skyline storage format accepted in Intel MKL can store only triangular matrix or triangular part of a matrix. This format is specified by two arrays: values and pointers . The following table describes these arrays:

values

A scalar array. For a lower triangular matrix it contains the set of elements from each row of the matrix starting from the first non-zero element to and including the diagonal element. For an upper triangular matrix it contains the set of elements from each column of the matrix starting with the first non-zero element down to and including the diagonal element. Encountered zero elements are included in the sets.

pointers

An integer array with dimension ( m +1) , where m is the number of rows for lower triangle (columns for the upper triangle). pointers ( i ) - pointers (1)+1 gives the index of element in values that is first non-zero element in row (column) i . The value of pointers ( m +1) is set to nnz + pointers (1) , where nnz is the number of elements in the array values .

7. Block Compressed Sparse Row Format (BSR)

The Intel MKL block compressed sparse row (BSR) format for sparse matrices is specified by four arrays: values , columns , pointerB , and pointerE . The following table describes these arrays.

values

A real array that contains the elements of the non-zero blocks of a sparse matrix. The elements are stored block-by-block in row-major order. A non-zero block is the block that contains at least one non-zero element. All elements of non-zero blocks are stored, even if some of them is equal to zero. Within each non-zero block elements are stored in column-major order in the case of one-based indexing, and in row-major order in the case of the zero-based indexing.

columns

Element i of the integer array columns is the number of the column in the block matrix that contains the i -th non-zero block.

pointerB

Element j of this integer array gives the index of the element in the columns array that is first non-zero block in a row j of the block matrix.

pointerE

Element j of this integer array gives the index of the element in the columns array that contains the last non-zero block in a row j of the block matrix plus 1.

[1] Sparse Matrix Representations & Iterative Solvers, Lesson 1 by Nathan Bell.

[2]

[3]

[4]

[5] Implementing Sparse Matrix-Vector Multiplication on Throughput-Oriented Processors, Nathan Bell and Michael Garland, Proceedings of Supercomputing '09

[6] Efficient Sparse Matrix-Vector Multiplication on CUDA, Nathan Bell and Michael Garland, NVIDIA Technical Report NVR-2008-004, December 2008

本文關(guān)鍵詞：稀疏矩陣，由筆耕文化傳播整理發(fā)布。

本文編號：77464

資料下載

論文發(fā)表

支付寶下載

Download by Alipay
微信下載

Download by Wechat
會員下載

Download by Member

本文鏈接：http://sikaile.net/wenshubaike/jyzy/77464.html

上一篇：過年去云南旅游好嗎？春節(jié)期間去云南旅游穿什么衣服好？
下一篇：稀疏矩陣 matlab_稀疏矩陣的壓縮存儲_稀疏矩陣轉(zhuǎn)置

論文發(fā)表

·知網(wǎng)|萬方|維普|龍?jiān)磡省級|國家級|科技核心|北大核心|南大核心CSSCI|EI|SCI|SSCI|

天堂国产午夜亚洲专区-少妇人妻综合久久蜜臀-国产成人户外露出视频在线-国产91传媒一区二区三区

稀疏矩陣的數(shù)據(jù)結(jié)構(gòu)_稀疏矩陣 matlab_稀疏矩陣存儲格式總結(jié)+存儲效率對比:COO,CSR,DIA,ELL,HYB