Hadoop Cloud Drive System

whbsdu 2015-01-08


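Project name: "Hadoop Cloud Drive System"

Platform: a Hadoop distributed system running on Linux

Development environment: Ubuntu 11.04 + Hadoop 0.20.2 + JDK 1.6 + Eclipse 3.3.2.

Technologies used: Hadoop + Java

Demo page: http://blog.csdn.net/jtlyuan/article/details/7980826

Project description: 1. Completed independently as a spare-time hobby project, covering all of the design, analysis, coding, and optimization.

2. Implemented features: file upload, download, deletion, and maintenance; folder creation; file path tracking; personal file search; and browsing files by category.

3. A file management system for a distributed cloud platform, built on Hadoop.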

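I. Overview:

1. This is a personal hobby project; I did all of the design, analysis, coding, and optimization myself.

2. Analyzed and wrote up the workflows according to the actual requirements.

3. Developed and coded the features that the required business functions call for.

4. Optimized the program's business logic to improve performance.

Project difficulties: 1. Search is implemented as a backtracking-style recursive walk over all files; a character-containment check decides whether a file is added to the result list.

2. Implementing the directory-tracking display.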

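II. I modeled this "Hadoop Cloud Drive System" on Baidu Netdisk and built it with Hadoop

As the figures below show, Baidu Netdisk is compared with my own "Hadoop Cloud Drive System".

Now for my own "Cloud Drive":

Summary: the interface is clean and tidy, easy to operate, and gives a good user experience.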

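III. Main Hadoop cluster configuration and startup procedure

1. Check the cluster's master node configuration. First start Hadoop on Linux, then run jps to list the running Java processes and confirm that the system started normally.

2. Open core-site.xml to check the master node configuration.

3. In a browser, open http://192.168.236.132:50030/ (the JobTracker page) and http://192.168.236.132:50070/ (the NameNode page) to check the cluster's startup status. Once everything looks correct, the program can be launched from Eclipse.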

IV. Partial system tests and walkthrough of the main code

1. File upload: a file is copied from the local file system into HDFS, into the directory that is currently open.

Main code:


// Requires: javax.swing.JFileChooser, java.io.*, org.apache.hadoop.fs.Path,
// org.apache.hadoop.io.IOUtils, org.apache.hadoop.util.Progressable
JFileChooser chooser = new JFileChooser();
int returnVal = chooser.showOpenDialog(null);
if (returnVal == JFileChooser.APPROVE_OPTION) { // the user confirmed the selection
    String localPath = chooser.getSelectedFile().getPath();
    String filename = chooser.getSelectedFile().getName();
    InputStream in = null;
    OutputStream out = null;
    try {
        // input stream over the local file
        in = new BufferedInputStream(new FileInputStream(localPath));
        // output stream to the target path in HDFS; progress() is called back as data is written
        out = hdfs.create(new Path(currentPath + "/" + filename),
                new Progressable() {
                    public void progress() {
                        System.out.print(".");
                    }
                });
        // upload with a 4096-byte buffer; 'true' closes both streams when the copy ends
        IOUtils.copyBytes(in, out, 4096, true);
        // refresh the file table for the current path once the upload finishes
        showTable(currentPath);
    } catch (IOException e1) {
        e1.printStackTrace();
        // close whatever was opened before the failure (closeStream ignores null)
        IOUtils.closeStream(in);
        IOUtils.closeStream(out);
    }
}
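To make clearer what the copy step in the upload does, here is a minimal plain-JDK sketch of a buffered stream copy with a progress callback. It is not Hadoop's implementation: the class `StreamCopySketch`, the interface `ProgressListener`, and the method `copy` are hypothetical names chosen for illustration, and in-memory streams stand in for the local file and the HDFS output stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: a plain-JDK picture of a buffered copy with a progress hook. */
class StreamCopySketch {

    /** Callback invoked once per buffer written (the name is an assumption). */
    interface ProgressListener {
        void progress(long bytesSoFar);
    }

    /** Copies everything from in to out, closes both streams, and returns the byte count. */
    static long copy(InputStream in, OutputStream out, int bufferSize,
                     ProgressListener listener) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (InputStream src = in; OutputStream dst = out) {
            int read;
            while ((read = src.read(buffer)) != -1) {
                dst.write(buffer, 0, read);
                total += read;
                listener.progress(total);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink, 4096,
                bytes -> System.out.print("."));
        System.out.println(" copied " + copied + " bytes");
    }
}
```

In the real upload, the source would be the local `FileInputStream` and the destination the stream returned by `hdfs.create(...)`; the loop itself stays the same.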

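2. File download: files are copied from HDFS to the local file system.

Main code (the listing shown here is the handler for the delete menu item, which removes the selected file or folder from HDFS):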


if (e.getSource() == deleItem) {
    // a null parent centers the dialog on screen and avoids constructing a second MainWindow
    int ensuce = JOptionPane.showConfirmDialog(null,
            "Delete the selected file?", "Confirm", JOptionPane.YES_NO_OPTION);
    if (ensuce != JOptionPane.YES_OPTION) {
        return; // the user chose No or closed the dialog
    }
    if (fileList.getSelectedRow() >= 0) {
        // build the path of the file to delete from the first column of the selected row
        String temp = currentPath + "/"
                + fileList.getValueAt(fileList.getSelectedRow(), 0);
        try {
            hdfs.delete(new Path(temp), true); // true: delete directories recursively
            showTable(currentPath); // refresh the table after the deletion
        } catch (IOException e1) {
            e1.printStackTrace();
        }
    }
}

3. File table display

Main code:


/* Show the attributes of every file and folder under currentPath in the table. */
private void showTable(String currentPath) throws IOException {
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm");
    DecimalFormat df = new DecimalFormat("0.00"); // always print a leading digit, e.g. 0.50
    Path inputDir = new Path(currentPath); // directory to list
    FileStatus[] status = hdfs.listStatus(inputDir); // entries directly under that directory
    DefaultTableModel model = (DefaultTableModel) fileList.getModel(); // table model
    model.setRowCount(0); // clear any rows left from the previous listing
    if (status == null) { // guard: some Hadoop releases return null for a missing path
        return;
    }
    for (int i = 0; i < status.length; i++) {
        String filename = status[i].getPath().getName();
        long len = status[i].getLen(); // file size in bytes
        String lenStr;
        if (status[i].isDir()) {
            lenStr = "-"; // no size is shown for directories
        } else if (len > 1024L * 1024 * 1024) {
            lenStr = " " + df.format(len / (1024.0 * 1024 * 1024)) + "GB";
        } else if (len > 1024L * 1024) {
            lenStr = " " + df.format(len / (1024.0 * 1024)) + "MB";
        } else if (len > 1024) {
            lenStr = " " + df.format(len / 1024.0) + "KB";
        } else {
            lenStr = " " + len + "B";
        }
        String modifDate = sdf.format(new Date(status[i].getModificationTime()));
        // add the file name, size, and last-modified time to the table
        model.addRow(new Object[] { filename, lenStr, modifDate });
    }
}
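The size column in showTable chooses a unit by comparing the byte count against powers of 1024. As a rough sketch, the same idea can be pulled out into a small standalone helper. The class `FileSizeFormatter` and its `format` method are hypothetical names, not part of the project, and unlike the listing this version switches units at exactly 1024 rather than above it.

```java
import java.util.Locale;

/** Hypothetical helper that turns a byte count into a short human-readable string. */
class FileSizeFormatter {
    private static final String[] UNITS = {"B", "KB", "MB", "GB", "TB"};

    static String format(long bytes) {
        if (bytes < 1024) {
            return bytes + " B"; // small files are shown as a whole number of bytes
        }
        double value = bytes;
        int unit = 0;
        // divide by 1024 until the value fits the current unit or the largest unit is reached
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.2f %s", value, UNITS[unit]);
    }

    public static void main(String[] args) {
        System.out.println(format(1536));
        System.out.println(format(5L * 1024 * 1024 * 1024));
    }
}
```

Looping over a unit array keeps the thresholds in one place, so adding a larger unit only means extending `UNITS`.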

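4. File search (implemented with a backtracking-style recursive traversal plus character matching)

Let's test it first. Entering the keyword "Hadoop" searches for matching files.

As another example, entering "數學" (mathematics) and searching gives the following results.

Main implementation code: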


showAllResult(target); // entry point: display every match for the keyword

/* Collect the paths of all files that match the keyword. */
private List<String> findAllFile(String target) {
    List<String> result = new ArrayList<String>();
    char[] tar = target.toCharArray();
    String findPath = currentPath; // by default, search the tree rooted at the currently open directory
    getAllFile(tar, result, findPath);
    return result;
}

/* Recursively walk the directory tree, checking each file name (reuses the class's hdfs field). */
private void getAllFile(char[] tar, List<String> result, String findPath) {
    try {
        Path path = new Path(findPath);
        if (hdfs.isFile(path)) {
            // a file: test its name against the keyword and stop descending
            if (isFind(tar, path.getName().toCharArray())) {
                result.add(findPath); // a match: add it to the result list
            }
            return;
        }
        FileStatus[] sta = hdfs.listStatus(path);
        if (sta == null) { // nothing to list
            return;
        }
        for (int i = 0; i < sta.length; i++) { // recurse into every child
            getAllFile(tar, result, sta[i].getPath().toString());
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

/* Check whether the characters of the keyword are contained in the file name. */
private boolean isFind(char[] tar, char[] sour) {
    int all = 0; // number of keyword characters found in the name so far
    for (int i = 0; i < tar.length; i++) {
        int j = 0;
        for (; j < sour.length; ++j) {
            if (tar[i] == sour[j]) {
                all++;
                break;
            }
        }
        // fuzzy match: a missing character is tolerated only once the hits cover at least 75% of the name length
        if (j == sour.length && (double) all / sour.length < 0.75) {
            return false;
        }
    }
    return true;
}
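What the search above calls backtracking is, in effect, a depth-first recursive walk: test a file's name, or descend into each child of a directory. The sketch below shows that shape on the local file system with `java.io.File`, standing in for the HDFS calls so that it runs without a cluster. The class `FileSearchSketch` and its methods are hypothetical, and it swaps the character-by-character check for a plain case-insensitive substring match, which is a simplification and not the project's matching rule.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical local-file-system stand-in for the recursive HDFS search. */
class FileSearchSketch {

    /** Returns the paths of all regular files under root whose name contains keyword. */
    static List<String> findAll(File root, String keyword) {
        List<String> result = new ArrayList<>();
        collect(root, keyword.toLowerCase(Locale.ROOT), result);
        return result;
    }

    private static void collect(File node, String keyword, List<String> result) {
        if (node.isFile()) {
            // leaf: compare the file name, ignoring case
            if (node.getName().toLowerCase(Locale.ROOT).contains(keyword)) {
                result.add(node.getPath());
            }
            return;
        }
        File[] children = node.listFiles(); // null if node is not a readable directory
        if (children == null) {
            return;
        }
        for (File child : children) {
            collect(child, keyword, result); // depth-first descent into each child
        }
    }

    public static void main(String[] args) {
        for (String hit : findAll(new File("."), "hadoop")) {
            System.out.println(hit);
        }
    }
}
```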

5. Browsing files by category

a. Category view: choosing "Documents" lists every document in the system

b. "Pictures"

c. "Music"

And so on.
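The post does not show how files are assigned to categories. One plausible approach, sketched here purely as an assumption, is to map each file name's extension to a category. The class `FileCategorySketch`, the extension lists, and the "Other" fallback are all hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical extension-based classifier for the category views. */
class FileCategorySketch {
    private static final Map<String, String> CATEGORY_BY_EXTENSION = new HashMap<>();

    static {
        // the extension lists are illustrative, not taken from the project
        for (String ext : new String[] {"doc", "docx", "pdf", "txt", "ppt", "xls"}) {
            CATEGORY_BY_EXTENSION.put(ext, "Documents");
        }
        for (String ext : new String[] {"jpg", "jpeg", "png", "gif", "bmp"}) {
            CATEGORY_BY_EXTENSION.put(ext, "Pictures");
        }
        for (String ext : new String[] {"mp3", "wav", "wma", "flac"}) {
            CATEGORY_BY_EXTENSION.put(ext, "Music");
        }
    }

    /** Returns the category for a file name, or "Other" when the extension is unknown or missing. */
    static String categoryOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "Other";
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return CATEGORY_BY_EXTENSION.getOrDefault(ext, "Other");
    }

    public static void main(String[] args) {
        System.out.println(categoryOf("report.PDF"));
        System.out.println(categoryOf("song.mp3"));
    }
}
```

A category view could then reuse the recursive file walk from the search feature and keep only the files whose category matches the selected tab.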

6. Other features: directory tracking, file maintenance, and so on:
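Directory tracking is listed as one of the project's difficulties, but its code is not shown. A minimal sketch of one way to do it, assuming the current location is a plain slash-separated path (not a full hdfs:// URI), is to expand the path into its chain of ancestors, each of which the UI could render as a clickable level. The class `BreadcrumbSketch` and the method `breadcrumbs` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that expands a path into its ancestor chain. */
class BreadcrumbSketch {

    /** Returns every ancestor of path, from the root "/" down to path itself. */
    static List<String> breadcrumbs(String path) {
        List<String> crumbs = new ArrayList<>();
        crumbs.add("/");
        StringBuilder current = new StringBuilder();
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the leading slash and any doubled slashes
            }
            current.append('/').append(part);
            crumbs.add(current.toString());
        }
        return crumbs;
    }

    public static void main(String[] args) {
        System.out.println(breadcrumbs("/user/hadoop/docs"));
    }
}
```

Clicking a level would then amount to calling the existing showTable method with the corresponding entry.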

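* Finally, a few words on how this system handles user information: solving it with MapReduce

First, login does not have to be implemented with a conventional database alone. There are several options:

Option 1: Dedicate a separate machine to database operations. If a database is installed in a Hadoop cluster, it should not be placed on the NameNode but on another machine, because the NameNode already has plenty to do. With the database on the NameNode, database operations during multi-user logins would consume the NameNode's memory and interfere with its management and scheduling of the DataNodes. The database should therefore go on a designated machine.

Option 2: Use HBase, the distributed database in the Hadoop ecosystem. This is undoubtedly the best solution, since it is a distributed database designed for cloud technology. However, I am still only getting to know HBase, so I did not use it.

Final approach: HBase was not an option, but that is fine. For now the system only handles information for a single user, and the volume is tiny. I can imitate Hive and turn data operations into MapReduce jobs, using MapReduce to join one table with another.

For example, take the following tables. Info table:

Login table: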
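To illustrate how MapReduce can join two tables, here is a plain-Java simulation of a reduce-side join; it does not use the Hadoop API, so it runs without a cluster. The row layouts ("id,loginName" for the login table and "id,email" for the info table) are assumptions, since the actual columns are not given in the text, and `ReduceSideJoinSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory simulation of a reduce-side join on a shared id column. */
class ReduceSideJoinSketch {

    /** Joins "id,loginName" rows with "id,email" rows on id (inner join). */
    static List<String> join(List<String> loginRows, List<String> infoRows) {
        // Map phase: emit (id, tagged value); the tag records which table the value came from.
        // The TreeMap plays the role of the shuffle, grouping values by key.
        Map<String, List<String>> shuffled = new TreeMap<>();
        for (String row : loginRows) {
            String[] f = row.split(",", 2);
            shuffled.computeIfAbsent(f[0], k -> new ArrayList<>()).add("L:" + f[1]);
        }
        for (String row : infoRows) {
            String[] f = row.split(",", 2);
            shuffled.computeIfAbsent(f[0], k -> new ArrayList<>()).add("I:" + f[1]);
        }
        // Reduce phase: for each id, pair every login value with every info value.
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : shuffled.entrySet()) {
            List<String> logins = new ArrayList<>();
            List<String> infos = new ArrayList<>();
            for (String v : e.getValue()) {
                (v.startsWith("L:") ? logins : infos).add(v.substring(2));
            }
            for (String login : logins) {
                for (String info : infos) {
                    joined.add(e.getKey() + "," + login + "," + info);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        System.out.println(join(
                Arrays.asList("1,alice", "2,bob"),
                Arrays.asList("1,alice@example.com")));
    }
}
```

In a real job, the two loops that fill `shuffled` would be the mapper, the grouping would be done by the framework's shuffle, and the final loop would be the reducer.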

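Feature modules still to be added: "My Shares". A file could be shared for everyone to download, or shared with one specific user.

Implementation: create a share table that records the sharer, the recipient, and the path to the file (or the parameters needed to fetch it).

As follows: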
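Since the sharing module is still only planned, the following is just a sketch of how such a share table might look in memory. The three fields follow the description above; the `"*"` marker for "shared with everyone", the class `ShareTableSketch`, and the method `canDownload` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory share table: who shared which path with whom. */
class ShareTableSketch {
    /** Marker meaning "shared with everyone" (an assumption of this sketch). */
    static final String EVERYONE = "*";

    /** One row of the share table. */
    static final class ShareRecord {
        final String sharer;
        final String recipient;
        final String path;

        ShareRecord(String sharer, String recipient, String path) {
            this.sharer = sharer;
            this.recipient = recipient;
            this.path = path;
        }
    }

    private final List<ShareRecord> records = new ArrayList<>();

    /** Adds a row; pass EVERYONE as the recipient for a public share. */
    void share(String sharer, String recipient, String path) {
        records.add(new ShareRecord(sharer, recipient, path));
    }

    /** True if some row shares the path with this user or with everyone. */
    boolean canDownload(String user, String path) {
        for (ShareRecord r : records) {
            if (r.path.equals(path)
                    && (r.recipient.equals(EVERYONE) || r.recipient.equals(user))) {
                return true;
            }
        }
        return false;
    }
}
```

A persistent version could store the same three columns as a file in HDFS and query it with the MapReduce approach described earlier.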
