Research on Data Storage and Optimization Techniques for Distributed Online Social Networks
[Abstract]: In recent years, online social networks (OSNs) have achieved great success and now serve billions of users worldwide. Through an OSN, users can make new friends and share information with their existing friends. In a centralized OSN (COSN), all user data is stored on servers operated and maintained by the service provider. The provider can use and analyze this data, or even sell it directly to third parties, which violates user privacy. Distributed online social networks (DOSNs) have been proposed to solve this privacy leakage problem. Although DOSNs are not as popular or mature as COSNs, research on them is very active. In a DOSN, user data is stored and forwarded directly within the user's friend circle, bypassing any server, in order to protect privacy. This prevents service providers from leaking private user data, but it causes low data availability: when the users holding the data go offline, other users cannot access it. Improving data availability under privacy protection constraints therefore requires data storage schemes and optimization strategies designed for DOSN scenarios, which is one of the biggest challenges in DOSN research.

One characteristic of DOSNs is that social data consists mainly of small items that are rarely modified. An in-depth study of existing work on DOSN data storage and storage optimization shows that it focuses mainly on user dynamics and ignores the effect of other characteristics on storage optimization goals. This thesis systematically studies DOSN data storage and storage optimization, with the main objective of improving data availability under privacy protection constraints. It covers the following four aspects.

1. Storage-capacity-sensitive modeling and analysis of DOSN data availability. Existing DOSN data storage schemes usually assume that friends always provide enough storage capacity to hold the data users publish. This assumption is inappropriate in a DOSN. To avoid disclosing private information, unprotected private user data can be stored only within the friend circle, and friends' devices usually have limited storage capacity. Intuitively, a limited total storage capacity across friends reduces data availability, but this rough conclusion is not enough. We also need to know how strongly storage capacity affects data availability in order to decide whether storage optimization is necessary. Before designing a DOSN data storage scheme, the relationship between the total storage capacity contributed by the friend circle and the achievable data availability must be analyzed quantitatively; this is the first problem addressed in this thesis. In addition, the highly dynamic online behavior of friends changes the total storage capacity the friend circle can contribute, which in turn makes data availability highly dynamic. To address this, the thesis predicts real-time data availability by predicting the real-time total storage capacity of the friend circle. Extensive experiments verify the validity of the storage-capacity-sensitive data availability model. Given an expected data availability, the model determines the minimum total storage capacity the friend circle must provide, and hence the average minimum capacity each friend needs to contribute, which provides a basis for allocating storage capacity in applications. Conversely, given the total storage capacity of the friend circle, the model determines the maximum data availability the circle can achieve, which indicates whether the expected data availability can be met.

2. Cadros, a cloud-assisted DOSN data storage scheme. As noted above, to keep user privacy from leaking, unprotected data in a DOSN can be stored redundantly only within the friend circle. However, a DOSN is a highly dynamic network: users can add and delete friends at any time, and friends can go online and offline at any time, so the set of online friends and the total storage capacity they contribute are both limited and constantly changing. Designing a data storage scheme suited to these conditions is the second key problem addressed in this thesis. Based on the storage-capacity-sensitive data availability model, the thesis proposes Cadros, a cloud-assisted DOSN data storage scheme. Cloud servers are introduced to improve data availability when the friend circle cannot meet the data storage demand, and the scheme is designed to prevent cloud service providers from obtaining the original data so that user privacy is protected. The thesis quantitatively studies the data storage capability of Cadros and analyzes its data availability, which establishes the feasibility and validity of the scheme theoretically. It also builds a probabilistic model of the dynamic behavior of friends in the friend circle. By predicting the future storage capacity and storage demand of the friend circle, it establishes a real-time data availability prediction model for Cadros, which provides the basis for designing the data storage strategy in the next step.

3. Storage optimization of social data in the DOSN. The real-time data availability prediction only shows that Cadros is able to reach the corresponding data availability for a given total storage capacity of the friend circle. Whether that availability is actually reached also depends on the data storage strategy: even if the friend circle provides enough storage capacity, the desired data availability cannot be achieved without a good strategy. The question is how to design, within Cadros, a suitable data storage strategy based on the real-time availability predictions and the behavior characteristics of DOSN users. To answer it, the thesis further optimizes Cadros and studies storage optimization of social data in the DOSN. First, it proposes an overhead-sensitive data partitioning method and storage strategy that determine which data is stored on friends and which on cloud servers, making full use of the available storage capacity of the friend circle to minimize system overhead. Then it proposes an availability-driven DOSN data replica placement method that places data within the friend circle so as to reach the expected data availability, balance system load, and reduce the maintenance overhead of data availability.

4. Storage optimization of social data on cloud servers. As noted above, Cadros not only stores user data redundantly in the friend circle but also stores some data on cloud servers when the friend circle cannot meet the storage demand. Cloud servers are highly available over the long term, so the availability of data stored on them is approximately 100% and data availability is not a concern. However, access performance is poor when users access social data on cloud servers. Social data is mainly small and rarely modified, and improving the access performance of small social data on cloud servers is the fourth key problem addressed in this thesis. The thesis first studies the performance bottleneck that distributed file systems face when handling large numbers of small social data items, and then proposes iFlatLFS, a lightweight file system. iFlatLFS greatly simplifies the metadata structure and the data access process. The new metadata amounts to only a small fraction of the original metadata and can be cached in server memory, which eliminates the addressing overhead for small data and improves performance. Finally, a prototype of iFlatLFS is implemented on the CentOS 5.5 operating system and integrated into the open source distributed file system TFS. Extensive experiments show that iFlatLFS optimizes the storage of large amounts of small social data and greatly improves data access performance.

In summary, this thesis first quantitatively analyzes the relationship between the total storage capacity contributed by the friend circle and the achievable data availability. On this basis, it proposes Cadros, a cloud-assisted DOSN data storage scheme, which addresses the low data availability caused by the limited total storage capacity of the friend circle while protecting user privacy; it proves the feasibility and validity of Cadros theoretically and establishes a real-time data availability prediction model. It then studies storage optimization of social data in the friend circle and proposes, based on the prediction results, an overhead-sensitive data partitioning method and storage strategy together with an availability-driven data placement method, which reach the expected data availability, balance system load, and reduce the maintenance overhead of data availability. Finally, it studies storage optimization of social data on cloud servers and designs iFlatLFS, an efficient lightweight file system that improves the access performance of social data on cloud servers.
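The interplay between limited friend-circle capacity, achievable availability, and cloud fallback can be illustrated with a small sketch. It is not the thesis's model or the Cadros algorithm, whose details the abstract does not give. It assumes that each friend is online independently with a fixed probability, that an item is available when at least one replica holder is online, and that a simple greedy rule chooses holders. The names `Friend`, `availability`, and `place` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Friend:
    name: str
    p_online: float   # assumption: independent probability of being online
    capacity: int     # storage contributed by this friend, arbitrary units
    stored: list = field(default_factory=list)


def availability(holders):
    """Probability that at least one replica holder is online."""
    p_all_offline = 1.0
    for f in holders:
        p_all_offline *= 1.0 - f.p_online
    return 1.0 - p_all_offline


def place(items, friends, target):
    """Greedy placement (illustrative only): replicate each item on the
    most-available friends that still have room; if the friend circle
    cannot reach the target availability, send the item to the cloud."""
    in_circle, in_cloud = {}, []
    for item_id, size in items:
        holders = []
        for f in sorted(friends, key=lambda f: f.p_online, reverse=True):
            if availability(holders) >= target:
                break
            if f.capacity >= size:
                holders.append(f)
        if availability(holders) >= target:
            # Commit the replicas only when the target is reachable.
            for f in holders:
                f.capacity -= size
                f.stored.append(item_id)
            in_circle[item_id] = availability(holders)
        else:
            in_cloud.append(item_id)
    return in_circle, in_cloud


friends = [Friend("a", 0.5, 2), Friend("b", 0.5, 2), Friend("c", 0.5, 1)]
items = [("x", 1), ("y", 1), ("z", 1)]
in_circle, in_cloud = place(items, friends, target=0.75)
print(in_circle)  # {'x': 0.75, 'y': 0.75}
print(in_cloud)   # ['z']
```

With three friends who are each online half the time, two replicas give an availability of 0.75. The first two items fit on friends `a` and `b`; the third finds only one friend with free capacity, cannot reach the target, and falls back to the cloud. This is the capacity effect that the abstract argues must be quantified before a storage scheme is designed.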
【Degree-granting institution】: National University of Defense Technology
【Degree level】: Doctoral
【Year degree granted】: 2014
【Classification number】: TP393.09; TP333
Fu Songling (付松齡); Research on Data Storage and Optimization Techniques for Distributed Online Social Networks [D]; National University of Defense Technology; 2014
Article ID: 2211281
Article link: http://sikaile.net/guanlilunwen/ydhl/2211281.html