小男孩‘自慰网亚洲一区二区,亚洲一级在线播放毛片,亚洲中文字幕av每天更新,黄aⅴ永久免费无码,91成人午夜在线精品,色网站免费在线观看,亚洲欧洲wwwww在线观看

分享

SAP HANA中PAL算法使用入門(mén) | SAP Blogs

 飛鳥(niǎo)的天空zwp 2019-04-15

1 應(yīng)用場(chǎng)合

         SAP HANA作為一款內(nèi)存數(shù)據(jù)庫(kù)產(chǎn)品, 使得數(shù)據(jù)常駐內(nèi)存, 物理磁盤(pán)的存儲(chǔ)作為數(shù)據(jù)備份與日志記錄, 以防斷電內(nèi)存中數(shù)據(jù)丟失. 這種構(gòu)架大大的縮短了數(shù)據(jù)存取的時(shí)間, 使得SAP HANA高速”.

         在傳統(tǒng)數(shù)據(jù)模型中,數(shù)據(jù)庫(kù)只是作為存取數(shù)據(jù)一個(gè)工具,對(duì)于類(lèi)似下圖所示的應(yīng)用, 客戶端從Database獲取數(shù)據(jù),然后計(jì)算,最后再把結(jié)果寫(xiě)回Database, 如果數(shù)據(jù)量過(guò)大, 數(shù)據(jù)傳輸?shù)拈_(kāi)銷(xiāo)過(guò)大,并且如果客戶端的內(nèi)存不夠, 計(jì)算分析的過(guò)程也將非常緩慢./wp-content/uploads/2014/03/flow_406433.png

         借助于大內(nèi)存的優(yōu)勢(shì), SAP HANA的解決方案是把數(shù)據(jù)敏感的相關(guān)計(jì)算邏輯都移動(dòng)到SAP HANA內(nèi), 從而省去了數(shù)據(jù)傳輸?shù)拈_(kāi)銷(xiāo). 典型的框架如下:

/wp-content/uploads/2014/03/hanasys_406434.png

對(duì)于一些簡(jiǎn)單的計(jì)算分析, 可以利用SQLScript腳本完成, SQLScript提供了基本的變量定義語(yǔ)句,流程控制語(yǔ)句. 但是對(duì)于復(fù)雜的分析與計(jì)算, 單純使用SQLScript可能不是特別方便, 比如對(duì)1T的數(shù)據(jù)表作聚類(lèi)分析. 為此 ,SAP HANA提供  AFL (Application Function Library) ,  把一些常見(jiàn)的分析任務(wù)用C++實(shí)現(xiàn),作為庫(kù)函數(shù)的形式, 提供給SQLScript調(diào)用,極大地豐富了SQLScript的功能.

2 PAL簡(jiǎn)介

         PAL (Predictive Analysis Library)SAP HANAAFL (Application Function Library)框架下的一個(gè)函數(shù)庫(kù), 主要用于數(shù)據(jù)預(yù)測(cè)與分析, 提供了很多數(shù)據(jù)挖掘算法的實(shí)現(xiàn). 按應(yīng)用的場(chǎng)景進(jìn)行分類(lèi),PAL函數(shù)主要包括以下類(lèi)別:

聚類(lèi)

分類(lèi)

關(guān)聯(lián)分析

時(shí)間序列分析

數(shù)據(jù)預(yù)處理

統(tǒng)計(jì)分析

社會(huì)網(wǎng)絡(luò)分析

具體到每個(gè)類(lèi)別下面, 有常見(jiàn)算法的實(shí)現(xiàn), 比如聚類(lèi)下面的K-means算法.

值得一提的是, AFL是一個(gè)單獨(dú)的包, 需要另外進(jìn)行安裝. 另外AFL 的版本號(hào)需要與SAP HANA的版本號(hào)匹配.

3 基本使用步驟

     PAL函數(shù)的使用包括三個(gè)步驟:

(1)    生成AFL_WRAPPER_GENERATOR AFL_WRAPPER_ERASER存儲(chǔ)過(guò)程.

       對(duì)于具體的某個(gè)算法, 在使用之前,利用AFL_WRAPPER_GENERATOR生成該算法的一個(gè)包裝器,然后才能進(jìn)行調(diào)用, 可以理解為生成該算法的一個(gè)實(shí)例, AFL_WRAPPER_ERASER作用是刪除這個(gè)算法的實(shí)例.

      這兩個(gè)存儲(chǔ)過(guò)程的生成很簡(jiǎn)單. AFL插件的目錄下有afl_wrapper_generator.sql, afl_wrapper_eraser.sql兩個(gè)腳文文件, 把它們的內(nèi)容拷貝到SAP HANA StudioSQL Console,然后執(zhí)行以及即可.然后為用戶分配執(zhí)行權(quán)限:

   GRANT EXECUTE ON system.afl_wrapper_generator to USER1;

   GRANT EXECUTE ON system.afl_wrapper_eraser to USER1;

這個(gè)步驟只需要首次利用使用AFL時(shí)執(zhí)行一次,對(duì)于后續(xù)其他的算法使用,就不需要執(zhí)行了.

(2)    生成算法的實(shí)例

          CALL SYSTEM.AFL_WRAPPER_GENERATOR(

              ‘<procedure_name>’,        

              ‘<area_name>’,

              ‘<function_name>’, <signature_table>);

Procedure_name:自定義的名稱(chēng);

Area_name:通常為AFLPAL;

Function_name:算法名稱(chēng);

Signature_table:指定一個(gè)用戶表,作為方法簽名的信息;

(3)  調(diào)用算法實(shí)例

CALL <procedure_name>(

<data_input_table> {,…},   

<parameter_table>,

<output_table> {,…}) with overview;

Procedure_name:算法實(shí)例名;

Data_input_table:輸入數(shù)據(jù)表;

Parameter_table:參數(shù)表;

Output_table:輸出表;

4 示例Demo

  下面以DBSCAN聚類(lèi)算法來(lái)說(shuō)明說(shuō)PAL算法的調(diào)用過(guò)程.(因測(cè)試機(jī)AFL_WRAPPER_GENERATOR存儲(chǔ)過(guò)程已經(jīng)存在,故第一步不再執(zhí)行,另假定SchemaTEST)

  DBSCAN聚類(lèi)算法是一個(gè)基于密度的聚類(lèi)算法,該算法有很好的降噪能力,有關(guān)該算法的更多介紹,請(qǐng)參考http://en./wiki/DBSCAN



/*創(chuàng)建數(shù)據(jù)表類(lèi)型 ,id,屬性1,屬性2 */
CREATE TYPE PAL_DBSCAN_DATA_T AS TABLE ( ID integer, ATTRIB1 double, ATTRIB2
double);
/*創(chuàng)建算法控制參數(shù)類(lèi)型*/
CREATE TYPE PAL_CONTROL_T AS TABLE( NAME varchar(50), INTARGS integer, DOUBLEARGS
double, STRINGARGS varchar(100));
/*結(jié)構(gòu)表類(lèi)型,ID,類(lèi)簇編號(hào)*/
CREATE TYPE PAL_DBSCAN_RESULTS_T AS TABLE( ID integer, RESULT integer);
/*創(chuàng)建方法參數(shù)表*/
CREATE COLUMN TABLE PAL_DBSCAN_PDATA_TBL( "ID" INT, "TYPENAME" VARCHAR(100), "DIRECTION" VARCHAR(100) );
/*向參數(shù)表中插入相關(guān)參數(shù)數(shù)據(jù)*/
INSERT INTO PAL_DBSCAN_PDATA_TBL VALUES (1, 'TEST.PAL_DBSCAN_DATA_T', 'in');
INSERT INTO PAL_DBSCAN_PDATA_TBL VALUES (2, 'TEST.PAL_CONTROL_T', 'in');
INSERT INTO PAL_DBSCAN_PDATA_TBL VALUES (3, 'TEST.PAL_DBSCAN_RESULTS_T', 'out');
/*分配權(quán)限*/
GRANT SELECT ON DM_PAL.PAL_DBSCAN_PDATA_TBL to SYSTEM;
/*生成PAL_DBSCAN9算法實(shí)例*/
call SYSTEM.afl_wrapper_eraser('PAL_DBSCAN9');
call SYSTEM.afl_wrapper_generator('PAL_DBSCAN9', 'AFLPAL', 'DBSCAN', PAL_DBSCAN_PDATA_TBL);
/* 創(chuàng)建數(shù)據(jù)表*/
CREATE COLUMN TABLE PAL_DBSCAN_DATA_TBL ( ID integer, ATTRIB1 double, ATTRIB2 double);
/*插入測(cè)試數(shù)據(jù)*/
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(1,0.10,0.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(2,0.11,0.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(3,0.10,0.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(4,0.11,0.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(5,0.12,0.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(6,0.11,0.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(7,0.12,0.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(8,0.12,0.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(9,0.13,0.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(10,0.13,0.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(11,0.13,0.14);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(12,0.14,0.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(13,10.10,10.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(14,10.11,10.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(15,10.10,10.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(16,10.11,10.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(17,10.11,10.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(18,10.12,10.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(19,10.12,10.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(20,10.12,10.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(21,10.13,10.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(22,10.13,10.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(23,10.13,10.14);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(24,10.14,10.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(25,4.10,4.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(26,7.11,7.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(27,-3.10,-3.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(28,16.11,16.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(29,20.11,20.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(30,15.12,15.11);
/*用臨時(shí)表來(lái)存儲(chǔ)算法的輸入?yún)?shù)*/
CREATE LOCAL TEMPORARY COLUMN TABLE #PAL_CONTROL_TBL( NAME varchar(50), INTARGS
integer, DOUBLEARGS double, STRINGARGS varchar(100));
/*指定DBSCAN算法的輸入?yún)?shù)*/
/*線程數(shù)18*/
INSERT INTO #PAL_CONTROL_TBL VALUES('THREAD_NUMBER',18,null,null);
/*自動(dòng)確定MINPTS與RADIUS參數(shù)*/
INSERT INTO #PAL_CONTROL_TBL VALUES('AUTO_PARAM',null,null,'true');
/*點(diǎn)與點(diǎn)之間的距離采用Manhattan距離*/
INSERT INTO #PAL_CONTROL_TBL VALUES('DISTANCE_METHOD',1,null,null);
/*結(jié)果表*/
CREATE COLUMN TABLE PAL_DBSCAN_RESULTS_TBL( ID integer, RESULT integer);
/*調(diào)用DBSCAN算法*/
CALL _SYS_AFL.PAL_DBSCAN9(PAL_DBSCAN_DATA_TBL, "#PAL_CONTROL_TBL",
PAL_DBSCAN_RESULTS_TBL) with overview;
/*查看結(jié)果*/
SELECT * FROM PAL_DBSCAN_RESULTS_TBL; 
/*創(chuàng)建數(shù)據(jù)表類(lèi)型 ,id,屬性1,屬性2 */
CREATE TYPE PAL_DBSCAN_DATA_T AS TABLE ( ID integer, ATTRIB1 double, ATTRIB2
double);
/*創(chuàng)建算法控制參數(shù)類(lèi)型*/
CREATE TYPE PAL_CONTROL_T AS TABLE( NAME varchar(50), INTARGS integer, DOUBLEARGS
double, STRINGARGS varchar(100));
/*結(jié)構(gòu)表類(lèi)型,ID,類(lèi)簇編號(hào)*/
CREATE TYPE PAL_DBSCAN_RESULTS_T AS TABLE( ID integer, RESULT integer);
/*創(chuàng)建方法參數(shù)表*/
CREATE COLUMN TABLE PAL_DBSCAN_PDATA_TBL( "ID" INT, "TYPENAME" VARCHAR(100), "DIRECTION" VARCHAR(100) );
/*向參數(shù)表中插入相關(guān)參數(shù)數(shù)據(jù)*/
INSERT INTO PAL_DBSCAN_PDATA_TBL VALUES (1, 'TEST.PAL_DBSCAN_DATA_T', 'in');
INSERT INTO PAL_DBSCAN_PDATA_TBL VALUES (2, 'TEST.PAL_CONTROL_T', 'in');
INSERT INTO PAL_DBSCAN_PDATA_TBL VALUES (3, 'TEST.PAL_DBSCAN_RESULTS_T', 'out');
/*分配權(quán)限*/
GRANT SELECT ON DM_PAL.PAL_DBSCAN_PDATA_TBL to SYSTEM;
/*生成PAL_DBSCAN9算法實(shí)例*/
call SYSTEM.afl_wrapper_eraser('PAL_DBSCAN9');
call SYSTEM.afl_wrapper_generator('PAL_DBSCAN9', 'AFLPAL', 'DBSCAN', PAL_DBSCAN_PDATA_TBL);
/* 創(chuàng)建數(shù)據(jù)表*/
CREATE COLUMN TABLE PAL_DBSCAN_DATA_TBL ( ID integer, ATTRIB1 double, ATTRIB2 double);
/*插入測(cè)試數(shù)據(jù)*/
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(1,0.10,0.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(2,0.11,0.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(3,0.10,0.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(4,0.11,0.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(5,0.12,0.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(6,0.11,0.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(7,0.12,0.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(8,0.12,0.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(9,0.13,0.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(10,0.13,0.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(11,0.13,0.14);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(12,0.14,0.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(13,10.10,10.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(14,10.11,10.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(15,10.10,10.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(16,10.11,10.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(17,10.11,10.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(18,10.12,10.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(19,10.12,10.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(20,10.12,10.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(21,10.13,10.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(22,10.13,10.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(23,10.13,10.14);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(24,10.14,10.13);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(25,4.10,4.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(26,7.11,7.10);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(27,-3.10,-3.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(28,16.11,16.11);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(29,20.11,20.12);
INSERT INTO PAL_DBSCAN_DATA_TBL VALUES(30,15.12,15.11);
/*用臨時(shí)表來(lái)存儲(chǔ)算法的輸入?yún)?shù)*/
CREATE LOCAL TEMPORARY COLUMN TABLE #PAL_CONTROL_TBL( NAME varchar(50), INTARGS
integer, DOUBLEARGS double, STRINGARGS varchar(100));
/*指定DBSCAN算法的輸入?yún)?shù)*/
/*線程數(shù)18*/
INSERT INTO #PAL_CONTROL_TBL VALUES('THREAD_NUMBER',18,null,null);
/*自動(dòng)確定MINPTS與RADIUS參數(shù)*/
INSERT INTO #PAL_CONTROL_TBL VALUES('AUTO_PARAM',null,null,'true');
/*點(diǎn)與點(diǎn)之間的距離采用Manhattan距離*/
INSERT INTO #PAL_CONTROL_TBL VALUES('DISTANCE_METHOD',1,null,null);
/*結(jié)果表*/
CREATE COLUMN TABLE PAL_DBSCAN_RESULTS_TBL( ID integer, RESULT integer);
/*調(diào)用DBSCAN算法*/
CALL _SYS_AFL.PAL_DBSCAN9(PAL_DBSCAN_DATA_TBL, "#PAL_CONTROL_TBL",
PAL_DBSCAN_RESULTS_TBL) with overview;
/*查看結(jié)果*/
SELECT * FROM PAL_DBSCAN_RESULTS_TBL; 








DBSCANresult.png

如果執(zhí)行無(wú)誤,將看到如上圖所示的結(jié)果,記錄被聚成三類(lèi),0,1,-1各代表一個(gè)類(lèi)簇.

5結(jié)束語(yǔ)

本文介紹了SAP HANAPAL算法的使用,以DBSCAN聚類(lèi)算法作為具體的實(shí)現(xiàn)例子.其他的相關(guān)算法使用流程上與上述流程都相似,主要的工作在于準(zhǔn)備數(shù)據(jù)表,根據(jù)算法的接口文檔定義相關(guān)的參數(shù),并將參數(shù)存入?yún)?shù)表.最后調(diào)用算法實(shí)例即可.

從效率上講,在SAP HANA中使用PAL,一方面利用了大內(nèi)存的優(yōu)勢(shì),另一方面利用了C++作為編譯型語(yǔ)言本身的高效性,如果使用得當(dāng),對(duì)于大數(shù)據(jù)的相關(guān)分析任務(wù),在速度上將會(huì)有一個(gè)很大的飛躍!

[: 本文的測(cè)試案例所使用的SAP HANA版本為SAP HANA SPS06]

想獲取更多SAP HANA學(xué)習(xí)資料或有任何疑問(wèn),請(qǐng)關(guān)注新浪微博@HANAGeek!我們歡迎你的加入!

    本站是提供個(gè)人知識(shí)管理的網(wǎng)絡(luò)存儲(chǔ)空間,所有內(nèi)容均由用戶發(fā)布,不代表本站觀點(diǎn)。請(qǐng)注意甄別內(nèi)容中的聯(lián)系方式、誘導(dǎo)購(gòu)買(mǎi)等信息,謹(jǐn)防詐騙。如發(fā)現(xiàn)有害或侵權(quán)內(nèi)容,請(qǐng)點(diǎn)擊一鍵舉報(bào)。
    轉(zhuǎn)藏 分享 獻(xiàn)花(0

    0條評(píng)論

    發(fā)表

    請(qǐng)遵守用戶 評(píng)論公約

    類(lèi)似文章