Symptoms:
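After the system was upgraded from Fedora 8 to FC17, LVS in tunnel mode stopped working properly, even though the LVS configuration and the architecture were unchanged.
Symptoms in detail (FC17, kernel 3.3.7):
1. Pinging the VIP address from the client works.
2. Telnet to the application port 80 on the VIP fails with a connection timeout.
3. Exactly the same architecture and configuration work correctly on FC8 with kernel 2.6.26.
System layout diagram: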
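2 The data flow in the diagram above should be as follows:
2.1 The client accesses the LVS VIP address. At this point the packet's source address is the client address 192.168.91.196 and its destination address is the LVS VIP address 192.168.91.204.
2.2 LVS uses its configured scheduling algorithm to decide which real server below it receives the request. Because tunnel mode is used, the packet forwarded from here on is encapsulated: the outer source address is the internal address of the LVS director, 192.168.91.209, and the outer destination address is the eth0 address of the real server, 192.168.91.78 (assuming the request is assigned to the real server on the left).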
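2.3 eth0 on the real server receives the packet addressed to its own IP and hands it to the upper layer. The protocol stack inspects the payload, finds that it is an IPIP-encapsulated packet, strips the outer header, and passes the inner packet to tunl0. The packet seen on tunl0 is therefore the decapsulated one, with the real client address 192.168.91.196 as its source and the VIP 192.168.91.204 as its destination. tunl0 accepts the packet and hands it to TCP.
2.4 The reply to this request is produced as follows: the application layer generates the response and hands it to the TCP/IP layer, and the IP layer builds the packet. The source address of this packet is the VIP 192.168.91.204 and its destination address is the real client address 192.168.91.196.
2.5 The packet is then routed by the matching route. In the diagram above this is the route 192.168.0.0/16 via 192.168.91.65 dev eth0, so the packet is sent through eth0 to the gateway C-Router 192.168.91.65 (if the kernel allows it), goes straight out to the outside world, and is finally received by the real client.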
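Analysis
1 Pinging the VIP 192.168.91.204 from the client succeeds, which shows that the VIP address is up.
2 From the client, telnet repeatedly to port 80 of 192.168.91.204 to simulate access to the application, and at the same time capture packets on the real server on the left of the diagram above. The results:
- On eth0, the encapsulated IP packets sent from the LVS director to the real server are captured, with source and destination addresses 192.168.91.209 and 192.168.91.78.
- On tunl0, the decapsulated IP packets are captured, with source and destination addresses 192.168.91.196 and 192.168.91.204.
This shows that the packets from the client were correctly encapsulated by the LVS director and forwarded to the real server, and that the IP stack on the real server correctly received the encapsulated packets and decapsulated them for IPIP processing. No reply packets were captured on eth0, however, so the problem lies on the reply side.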
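3 Once the TCP port on the real server was confirmed to be up, suspicion fell on the reply packets: either they were not being routed correctly or they were simply being dropped. This quickly brought the kernel parameter rp_filter to mind (for what rp_filter is, see http://en./wiki/Reverse_path_filtering). In short: if rp_filter is enabled and the server receives a packet on an interface, the route back to that packet's source address must leave through the same interface. If the outgoing interface of that route differs from the receiving interface, the kernel discards the received packet, so no reply is sent.
So the next step was to check the kernel configuration on the real server: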
    $ sudo /sbin/sysctl -a | fgrep .rp_filter
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 1
    net.ipv4.conf.lo.rp_filter = 1
    net.ipv4.conf.eth0.rp_filter = 1
    net.ipv4.conf.tunl0.rp_filter = 1    <--- rp_filter is indeed enabled and needs to be turned off
    #### The command to turn it off:
    echo 0 > /proc/sys/net/ipv4/conf/tunl0/rp_filter
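4 After the rp_filter setting was changed, connecting from the client to port 80 on the VIP worked again, so the problem was solved. One question was still unexplained, though: why did the same configuration cause no problem under FC8?
5 The following information quickly turned up online:
Between Linux kernel 2.6.30 and kernel 2.6.31, both the definition of the kernel parameter rp_filter and the algorithm for computing its effective value changed. The changes are: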
Before kernel 2.6.30 [2]:
rp_filter - BOOLEAN
    1 - do source validation by reversed path, as specified in RFC1812
        Recommended option for single homed hosts and stub network
        routers. Could cause troubles for complicated (not loop free)
        networks running a slow unreliable protocol (sort of RIP),
        or using static routes.

    0 - No source validation.

    conf/all/rp_filter must also be set to TRUE to do source validation
    on the interface

    Default value is 0. Note that some distributions enable it
    in startup scripts.
Since kernel 2.6.31 [3]:
rp_filter - INTEGER
    0 - No source validation.
    1 - Strict mode as defined in RFC3704 Strict Reverse Path
        Each incoming packet is tested against the FIB and if the interface
        is not the best reverse path the packet check will fail.
        By default failed packets are discarded.
    2 - Loose mode as defined in RFC3704 Loose Reverse Path
        Each incoming packet's source address is also tested against the FIB
        and if the source address is not reachable via any interface
        the packet check will fail.

    Current recommended practice in RFC3704 is to enable strict mode
    to prevent IP spoofing from DDos attacks. If using asymmetric routing
    or other complicated routing, then loose mode is recommended.

    conf/all/rp_filter must also be set to non-zero to do source validation
    on the interface

    Default value is 0. Note that some distributions enable it
    in startup scripts.
Before kernel 2.6.31 :
Actual rp_filter for <interface> = net.ipv4.conf.<interface>.rp_filter AND net.ipv4.conf.all.rp_filter
I.e. reverse path filtering is enabled in strict mode if rp_filter=1 for both "all" and the interface.
Since kernel 2.6.31 :
Actual rp_filter for <interface> = MAX(net.ipv4.conf.<interface>.rp_filter, net.ipv4.conf.all.rp_filter)
I.e. reverse path filtering is enabled in strict mode if rp_filter=1 for either "all" or the interface.
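As the above shows, in versions before 2.6.31 whether rp_filter is in effect on an interface is decided by a logical AND of that interface's rp_filter and net.ipv4.conf.all.rp_filter, whereas in later versions the effective value is the maximum of the interface's rp_filter value and net.ipv4.conf.all.rp_filter.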
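The two rules can be compared side by side with a small shell sketch. The function name effective_rp_filter and the mode labels "and" and "max" are made up for this illustration and are not kernel or sysctl names; the function only mirrors the two formulas quoted above.

```shell
#!/bin/sh
# effective_rp_filter MODE ALL_VALUE IFACE_VALUE
#   MODE "and": rule quoted for kernels before 2.6.31
#   MODE "max": rule quoted for kernel 2.6.31 and later
# (hypothetical helper, for illustration only)
effective_rp_filter() {
    mode=$1
    all=$2
    iface=$3
    case "$mode" in
        and)
            # Boolean AND: validation is on only if both values are non-zero.
            if [ "$all" -ne 0 ] && [ "$iface" -ne 0 ]; then echo 1; else echo 0; fi
            ;;
        max)
            # Integer MAX: the larger of the two values is used.
            if [ "$all" -gt "$iface" ]; then echo "$all"; else echo "$iface"; fi
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

effective_rp_filter and 0 1   # all=0, interface=1: prints 0
effective_rp_filter max 0 1   # all=0, interface=1: prints 1
```

With identical settings (all = 0, interface = 1) the old rule leaves filtering off while the new rule turns strict mode on.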
6 Now look again at the system's default parameter settings.
FC8 and FC17 both ship with the following default kernel parameters:
    $ sudo /sbin/sysctl -a | fgrep .rp_filter
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 1
    net.ipv4.conf.lo.rp_filter = 1
    net.ipv4.conf.eth0.rp_filter = 1
    net.ipv4.conf.tunl0.rp_filter = 1
The effective rp_filter value for tunl0 is therefore computed as follows:
- Kernel 2.6.26: net.ipv4.conf.all.rp_filter = 0 AND net.ipv4.conf.tunl0.rp_filter = 1 gives 0, so rp_filter is not enabled on tunl0. Reply packets can therefore be sent out through eth0, and LVS works correctly.
- Kernel 3.3.7: the maximum of net.ipv4.conf.all.rp_filter = 0 and net.ipv4.conf.tunl0.rp_filter = 1 is 1, so rp_filter is enabled on tunl0. For a packet received on tunl0, the route back to its source address (the path of the reply, whose destination is the received packet's source and whose source is the received packet's destination) must then also leave through tunl0; otherwise the check fails and the packet is discarded.
In this case the route selects eth0 rather than tunl0, so the packet is dropped and LVS does not work.
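To see what this means on a running real server, the effective value can be derived per interface from the files under /proc/sys/net/ipv4/conf. The following is a minimal sketch that assumes kernel 2.6.31 or later (the MAX rule). The function name report_effective_rp_filter and its optional directory argument are hypothetical; the argument exists only so the logic can be tried against a scratch directory.

```shell
# report_effective_rp_filter [CONF_DIR]
# Prints the configured and effective rp_filter value of every interface.
# CONF_DIR defaults to /proc/sys/net/ipv4/conf (hypothetical helper).
report_effective_rp_filter() {
    conf_dir=${1:-/proc/sys/net/ipv4/conf}
    all=$(cat "$conf_dir/all/rp_filter") || return 1
    for dir in "$conf_dir"/*/; do
        iface=$(basename "$dir")
        # "all" is the global setting and "default" the template for new
        # interfaces; neither is a real interface, so skip both.
        case "$iface" in all|default) continue ;; esac
        [ -r "${dir}rp_filter" ] || continue
        val=$(cat "${dir}rp_filter")
        # MAX rule: the larger of conf/all and the interface value is used.
        if [ "$all" -gt "$val" ]; then eff=$all; else eff=$val; fi
        echo "$iface: configured=$val effective=$eff"
    done
}
```

Called without an argument on a host with the default settings listed above, it would be expected to report tunl0 as configured=1 effective=1, which matches the failure seen on FC17.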
Summary:
To be on the safe side, configure the kernel uniformly as follows whenever LVS is set up in tunnel mode:
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 0
    net.ipv4.conf.lo.rp_filter = 0
    net.ipv4.conf.eth0.rp_filter = 0
    net.ipv4.conf.tunl0.rp_filter = 0
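Interface names can differ from host to host, so setting each one by hand is easy to get wrong. As a sketch, the loop below writes 0 to every rp_filter file it finds. The function name disable_rp_filter and the optional directory argument are assumptions made for illustration, and on a real system it has to be run as root.

```shell
# disable_rp_filter [CONF_DIR]
# Writes 0 to every <interface>/rp_filter file below CONF_DIR.
# CONF_DIR defaults to /proc/sys/net/ipv4/conf (hypothetical helper).
disable_rp_filter() {
    conf_dir=${1:-/proc/sys/net/ipv4/conf}
    for f in "$conf_dir"/*/rp_filter; do
        [ -e "$f" ] || continue      # the glob matched nothing
        echo 0 > "$f" || return 1    # stop on the first failed write
    done
}

# On a real server, as root:
#   disable_rp_filter
```

Values written under /proc are lost on reboot. To keep them, the settings listed above can be placed in /etc/sysctl.conf and loaded with sysctl -p.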
References:
- [1]. http://en./wiki/Reverse_path_filtering
- [2]. http://lxr./source/Documentation/networking/ip-sysctl.txt?v=2.6.29#L702
- [3]. http://lxr./source/Documentation/networking/ip-sysctl.txt?v=2.6.30#L702
- [4]. http://www./lists/linux-net/msg17162.html
- [5]. http://www./lists/netfilter/msg47124.html
- [6]. http://patchwork./patch/23513/
-- edit by andy.chou