The previous post covered how the NetworkPolicy resource works and how to use it on k8s; for a refresher, see https://www.cnblogs.com/qiuhom-1874/p/14227660.html. This post is about Pod scheduling policies.

k8s has a very important component, kube-scheduler. Its main job is to watch the apiserver for Pods whose nodeName field is empty. An empty nodeName means the Pod has not been scheduled yet, so kube-scheduler picks the most suitable node in the cluster based on the attributes in the Pod definition, fills that node's name into the Pod's nodeName field, and stores the Pod back to the apiserver. The kubelet on that node, which watches the apiserver, then reads the Pod manifest and calls the local container runtime (Docker in this environment) to run the Pod as the manifest describes. It then reports the Pod's status to the apiserver, which stores it in etcd. Throughout this process, kube-scheduler's role is to schedule the Pod and report the decision to the apiserver. So how does kube-scheduler judge which node is best suited to run a given Pod?

The scheduler makes its decision with scheduling algorithms; different algorithms use different criteria and produce different results. When the scheduler finds an unscheduled Pod on the apiserver, it runs every node through the predicate functions and eliminates the nodes that cannot run the Pod. This is the filtering (Predicate) phase. The nodes that remain enter the scoring (Priority) phase: each priority function scores each node, the scores are added up per node, and the node with the highest total becomes the scheduling result. If several nodes tie for the highest total, the scheduler picks one of them at random; this is the selection (Select) phase. In short, scheduling goes through three phases: filtering eliminates the nodes that cannot run the Pod, scoring ranks the remaining nodes with the priority functions, and selection picks one node at random from those with the top score. The figure below sketches the overall flow.
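As a rough illustration of the three phases described above, here is a minimal Python sketch. It is not kube-scheduler's actual code: the `schedule` function, the predicate and priority functions, and the node and Pod dictionaries are all hypothetical names made up for this example.

```python
import random

def schedule(pod, nodes, predicates, priorities, rng=random):
    """Pick a node for `pod` in three phases: filter, score, select."""
    # Phase 1 (filter): a node survives only if every predicate accepts it.
    feasible = [n for n in nodes if all(p(pod, n) for p in predicates)]
    if not feasible:
        return None  # no feasible node: the Pod would stay Pending
    # Phase 2 (score): add up the scores from every priority function.
    totals = {n["name"]: sum(f(pod, n) for f in priorities) for n in feasible}
    # Phase 3 (select): choose at random among the nodes tied for the top score.
    best = max(totals.values())
    return rng.choice(sorted(name for name, s in totals.items() if s == best))

# Hypothetical predicate and priority functions, for illustration only.
def is_ready(pod, node):
    return node["ready"]

def has_enough_cpu(pod, node):
    return node["free_cpu"] >= pod["cpu"]

def prefer_idle_node(pod, node):
    return node["free_cpu"]

nodes = [
    {"name": "node01", "ready": True, "free_cpu": 4},
    {"name": "node02", "ready": True, "free_cpu": 1},
    {"name": "node03", "ready": False, "free_cpu": 8},
]
pod = {"cpu": 2}
chosen = schedule(pod, nodes, [is_ready, has_enough_cpu], [prefer_idle_node])
print(chosen)
```

In this toy cluster node02 fails the CPU predicate and node03 fails the readiness predicate, so only node01 reaches the scoring phase.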
Note: filtering works as a veto. If a node fails any single predicate function, it is eliminated outright. The nodes that pass filtering enter the scoring phase, where the priority functions score every node and a total is computed per node. The scheduler then picks the node with the highest total as the final result; if several nodes share the highest total, it picks one of them at random and reports the result to the apiserver.

Factors that affect scheduling

NodeName: nodeName is the most direct way to influence where a Pod runs. The scheduler decides whether a Pod has been scheduled by checking whether its nodeName field is empty. If the manifest sets nodeName explicitly, the field is non-empty, so the scheduler treats the Pod as already scheduled and skips it. In effect, the user binds the Pod to a node by hand.

NodeSelector: nodeSelector is looser than nodeName and is another important factor. If a Pod defines nodeSelector, only nodes whose labels match the selector can run the Pod; if no node matches, the Pod stays in the Pending state.

Node Affinity: node affinity defines a Pod's affinity for nodes, that is, which nodes the Pod would rather, or would rather not, run on. Its scheduling logic is finer-grained than nodeName and nodeSelector.

Pod Affinity: pod affinity defines affinity between Pods, that is, which Pod or Pods a Pod would rather be placed with. The opposite, which Pods a Pod would rather not be placed with, is called pod anti-affinity. "Placed with" means being in the same location as the other Pod, and a location can be delimited by hostname or by zone. How the location is defined is therefore essential when stating that Pods should or should not be together, and it is also what determines where the Pod can run.

Taints and tolerations: a taint is a mark on a node, and tolerations express which node taints a Pod can tolerate. If a Pod tolerates a node's taints it can run on that node; otherwise it cannot. Scheduling here combines the node's taints with the Pod's tolerations.
Example: scheduling with nodeName

[root@master01 ~]# cat pod-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
spec:
  nodeName: node01.k8s.org
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
[root@master01 ~]#
Note: nodeName pins the Pod to a specific node directly, without involving the default scheduler. The manifest above runs nginx-pod on node01.k8s.org.

Apply the manifest

[root@master01 ~]# kubectl apply -f pod-demo.yaml
pod/nginx-pod created
[root@master01 ~]# kubectl get pods -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod   1/1     Running   0          10s   10.244.1.28   node01.k8s.org   <none>           <none>
[root@master01 ~]#

Note: the Pod runs on exactly the node we specified by hand.

Example: scheduling with nodeSelector

[root@master01 ~]# cat pod-demo-nodeselector.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeselector
spec:
  nodeSelector:
    disktype: ssd
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
[root@master01 ~]#
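The label match that nodeSelector asks for in the manifest above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions; the function names and the node dictionary are hypothetical, and the real scheduler does considerably more.

```python
def node_matches_selector(node_labels, node_selector):
    """True when every key/value pair in the selector is present on the node."""
    return all(node_labels.get(k) == v for k, v in node_selector.items())

def candidate_nodes(nodes, node_selector):
    """Names of the nodes whose labels satisfy the selector."""
    return [name for name, labels in nodes.items()
            if node_matches_selector(labels, node_selector)]

# Hypothetical cluster state: no node carries disktype=ssd yet.
nodes = {
    "node01.k8s.org": {"app": "nginx-1.14-alpine"},
    "node02.k8s.org": {},
    "node03.k8s.org": {},
}
selector = {"disktype": "ssd"}

before = candidate_nodes(nodes, selector)   # empty list: the Pod would stay Pending
nodes["node02.k8s.org"]["disktype"] = "ssd" # same effect as labeling the node
after = candidate_nodes(nodes, selector)
print(before, after)
```

An empty candidate list corresponds to a Pending Pod; once a node gains the label, it becomes the only candidate.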
Note: nodeSelector matches against node labels. A node that carries the given label can have the Pod scheduled to it; a node without it cannot. If no node qualifies, the Pod stays Pending until some node gets the label, at which point the Pod is scheduled to that node.

Apply the manifest

[root@master01 ~]# kubectl apply -f pod-demo-nodeselector.yaml
pod/nginx-pod-nodeselector created
[root@master01 ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          9m38s   10.244.1.28   node01.k8s.org   <none>           <none>
nginx-pod-nodeselector   0/1     Pending   0          16s     <none>        <none>           <none>           <none>
[root@master01 ~]#

Note: the Pod stays Pending because no node in the cluster carries the label that the node selector requires.

Verify: add the label to node02 and see whether the Pod gets scheduled to node02.
[root@master01 ~]# kubectl get nodes --show-labels
NAME               STATUS   ROLES                  AGE   VERSION   LABELS
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master01.k8s.org,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=
node01.k8s.org     Ready    <none>                 29d   v1.20.0   app=nginx-1.14-alpine,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node01.k8s.org,kubernetes.io/os=linux
node02.k8s.org     Ready    <none>                 29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node02.k8s.org,kubernetes.io/os=linux
node03.k8s.org     Ready    <none>                 29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node03.k8s.org,kubernetes.io/os=linux
node04.k8s.org     Ready    <none>                 19d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node04.k8s.org,kubernetes.io/os=linux
[root@master01 ~]# kubectl label node node02.k8s.org disktype=ssd
node/node02.k8s.org labeled
[root@master01 ~]# kubectl get nodes --show-labels
NAME               STATUS   ROLES                  AGE   VERSION   LABELS
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master01.k8s.org,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=
node01.k8s.org     Ready    <none>                 29d   v1.20.0   app=nginx-1.14-alpine,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node01.k8s.org,kubernetes.io/os=linux
node02.k8s.org     Ready    <none>                 29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node02.k8s.org,kubernetes.io/os=linux
node03.k8s.org     Ready    <none>                 29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node03.k8s.org,kubernetes.io/os=linux
node04.k8s.org     Ready    <none>                 19d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node04.k8s.org,kubernetes.io/os=linux
[root@master01 ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          12m     10.244.1.28   node01.k8s.org   <none>           <none>
nginx-pod-nodeselector   1/1     Running   0          3m26s   10.244.2.18   node02.k8s.org   <none>           <none>
[root@master01 ~]#

Note: once node02 is labeled disktype=ssd, the Pod is scheduled to node02 and runs there.

Example: scheduling with nodeAffinity under affinity

[root@master01 ~]# cat pod-demo-affinity-nodeaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeaffinity
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: Exists
            values: []
        - matchExpressions:
          - key: disktype
            operator: Exists
            values: []
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 10
        preference:
          matchExpressions:
          - key: foo
            operator: Exists
            values: []
      - weight: 2
        preference:
          matchExpressions:
          - key: disktype
            operator: Exists
            values: []
[root@master01 ~]#
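To make the matchExpressions operators and the preference weights in the manifest above more concrete, here is a small Python sketch. It is a simplification, not the scheduler's implementation: `expression_matches` and `preference_score` are hypothetical helpers, the node labels are made up, and the real scheduler normalizes this score and combines it with other scoring functions.

```python
def expression_matches(labels, expr):
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op, values = expr["key"], expr["operator"], expr.get("values", [])
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        # A node without the key also counts as "not in" the set.
        return key not in labels or labels[key] not in values
    if op in ("Gt", "Lt"):
        # Gt/Lt compare the label value, read as an integer, with one value.
        try:
            actual, bound = int(labels[key]), int(values[0])
        except (KeyError, IndexError, ValueError):
            return False
        return actual > bound if op == "Gt" else actual < bound
    raise ValueError(f"unknown operator: {op}")

def preference_score(labels, preferred):
    """Sum the weights of every preference whose expressions all match."""
    return sum(
        term["weight"]
        for term in preferred
        if all(expression_matches(labels, e)
               for e in term["preference"]["matchExpressions"])
    )

# The two preferences from the manifest above.
preferred = [
    {"weight": 10, "preference": {"matchExpressions": [
        {"key": "foo", "operator": "Exists", "values": []}]}},
    {"weight": 2, "preference": {"matchExpressions": [
        {"key": "disktype", "operator": "Exists", "values": []}]}},
]
# Hypothetical node labels.
nodes = {
    "node01.k8s.org": {"foo": "bar"},
    "node02.k8s.org": {},
    "node03.k8s.org": {"disktype": "ssd"},
}
scores = {name: preference_score(labels, preferred) for name, labels in nodes.items()}
print(scores)
```

With these labels the node carrying key foo collects the larger weight, so it would be preferred over the node carrying only disktype.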
Note: nodeAffinity has two kinds of constraint. The first is the hard constraint, defined with the requiredDuringSchedulingIgnoredDuringExecution field. This field is an object whose only field is nodeSelectorTerms, a list. Each entry can use matchExpressions to define expressions that match node labels. The available operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt: In and NotIn test whether the label's value is in a given set, Exists and DoesNotExist test whether the label key is present, and Gt and Lt compare the label's value, interpreted as an integer, with a single given value. An entry can also use matchFields to match node fields. A hard constraint means a node must satisfy the label expressions or field selectors before the Pod can be scheduled to it; if no node does, the Pod stays Pending.

The second is the soft constraint, defined with the preferredDuringSchedulingIgnoredDuringExecution field. This field is a list. In each entry, weight sets the weight of the preference and preference defines its match condition; for every node that matches a preference, the scheduler adds that weight to the node's score during the scoring phase.

When both are used, the soft constraint only matters if the hard constraint matches more than one node. It is a second pass on top of the hard constraint that ranks the nodes that passed it. If the weights and conditions produce a single highest-scoring node, that node is the result; if they cannot break the tie, the default mechanism picks one node at random from those with the top score. In short, the soft constraint helps the hard constraint choose among nodes. When only soft constraints are used, the Pod is scheduled preferentially to the node that matches the conditions with the most weight; if the weights come out equal, the scheduler falls back to its default rules and picks the node with the highest total.

In the example above, the hard constraint requires the node to have a label with key foo or a label with key disktype. If no node matches, the Pod is not scheduled and stays Pending. Among the nodes that do match, a node with key foo gets 10 added to its total and a node with key disktype gets 2, so the Pod prefers nodes that have the foo label. Note that nodeAffinity has no node anti-affinity counterpart; to get anti-affinity behavior, match with the NotIn or DoesNotExist operator.

Apply the manifest

[root@master01 ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0
node01.k8s.org     Ready    <none>                 29d   v1.20.0
node02.k8s.org     Ready    <none>                 29d   v1.20.0         ssd
node03.k8s.org     Ready    <none>                 29d   v1.20.0
node04.k8s.org     Ready    <none>                 19d   v1.20.0
[root@master01 ~]# kubectl apply -f pod-demo-affinity-nodeaffinity.yaml
pod/nginx-pod-nodeaffinity created
[root@master01 ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          122m   10.244.1.28   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity   1/1     Running   0          7s     10.244.2.22   node02.k8s.org   <none>           <none>
nginx-pod-nodeselector   1/1     Running   0          113m   10.244.2.18   node02.k8s.org   <none>           <none>
[root@master01 ~]#

Note: after the manifest is applied, the Pod is scheduled to node02. It lands there because node02 has a label with key disktype, which satisfies the Pod's hard constraint.

Verify: delete the Pod, remove the disktype label from node02, apply the manifest again, and see how the Pod is scheduled.
[root@master01 ~]# kubectl delete -f pod-demo-affinity-nodeaffinity.yaml
pod "nginx-pod-nodeaffinity" deleted
[root@master01 ~]# kubectl label node node02.k8s.org disktype-
node/node02.k8s.org labeled
[root@master01 ~]# kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-pod                1/1     Running   0          127m
nginx-pod-nodeselector   1/1     Running   0          118m
[root@master01 ~]# kubectl get node -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0
node01.k8s.org     Ready    <none>                 29d   v1.20.0
node02.k8s.org     Ready    <none>                 29d   v1.20.0
node03.k8s.org     Ready    <none>                 29d   v1.20.0
node04.k8s.org     Ready    <none>                 19d   v1.20.0
[root@master01 ~]# kubectl apply -f pod-demo-affinity-nodeaffinity.yaml
pod/nginx-pod-nodeaffinity created
[root@master01 ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          128m   10.244.1.28   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity   0/1     Pending   0          9s     <none>        <none>           <none>           <none>
nginx-pod-nodeselector   1/1     Running   0          118m   10.244.2.18   node02.k8s.org   <none>           <none>
[root@master01 ~]#

Note: after the original Pod and the label on node02 are removed and the manifest is applied again, the Pod stays Pending. No node in the cluster satisfies the Pod's hard constraint, so it cannot be scheduled.

Verify: delete the Pod, label node01 with key foo and node03 with key disktype, apply the manifest again, and see how the Pod is scheduled.
[root@master01 ~]# kubectl delete -f pod-demo-affinity-nodeaffinity.yaml
pod "nginx-pod-nodeaffinity" deleted
[root@master01 ~]# kubectl label node node01.k8s.org foo=bar
node/node01.k8s.org labeled
[root@master01 ~]# kubectl label node node03.k8s.org disktype=ssd
node/node03.k8s.org labeled
[root@master01 ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0
node01.k8s.org     Ready    <none>                 29d   v1.20.0   bar
node02.k8s.org     Ready    <none>                 29d   v1.20.0
node03.k8s.org     Ready    <none>                 29d   v1.20.0         ssd
node04.k8s.org     Ready    <none>                 19d   v1.20.0
[root@master01 ~]# kubectl apply -f pod-demo-affinity-nodeaffinity.yaml
pod/nginx-pod-nodeaffinity created
[root@master01 ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          132m   10.244.1.28   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity   1/1     Running   0          5s     10.244.1.29   node01.k8s.org   <none>           <none>
nginx-pod-nodeselector   1/1     Running   0          123m   10.244.2.18   node02.k8s.org   <none>           <none>
[root@master01 ~]#

Note: when several nodes satisfy the hard constraint, the Pod goes to the node that matches the soft preference with the larger weight. In other words, when the hard constraint alone cannot single out a node, the node matching the heaviest soft preference is scheduled first.

Verify: remove the label from node01 and see whether the Pod is evicted or rescheduled to another node.
[root@master01 ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0
node01.k8s.org     Ready    <none>                 29d   v1.20.0   bar
node02.k8s.org     Ready    <none>                 29d   v1.20.0
node03.k8s.org     Ready    <none>                 29d   v1.20.0         ssd
node04.k8s.org     Ready    <none>                 19d   v1.20.0
[root@master01 ~]# kubectl label node node01.k8s.org foo-
node/node01.k8s.org labeled
[root@master01 ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0
node01.k8s.org     Ready    <none>                 29d   v1.20.0
node02.k8s.org     Ready    <none>                 29d   v1.20.0
node03.k8s.org     Ready    <none>                 29d   v1.20.0         ssd
node04.k8s.org     Ready    <none>                 19d   v1.20.0
[root@master01 ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          145m   10.244.1.28   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity   1/1     Running   0          12m    10.244.1.29   node01.k8s.org   <none>           <none>
nginx-pod-nodeselector   1/1     Running   0          135m   10.244.2.18   node02.k8s.org   <none>           <none>
[root@master01 ~]#

Note: once the Pod is running, it is neither evicted nor moved to another node even if its node later stops satisfying the hard constraint. Node affinity takes effect at scheduling time only (the IgnoredDuringExecution part of the field names); after scheduling completes, a change in node labels does not trigger eviction or rescheduling. In short, nodeAffinity cannot reschedule a Pod that has already been placed.

How node affinity rules combine

1. When nodeAffinity and nodeSelector are used together, they are ANDed: a node must satisfy both before it is considered for running the Pod.

Example: defining a Pod scheduling policy with both nodeAffinity and nodeSelector

[root@master01 ~]# cat pod-demo-affinity-nodesector.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeaffinity-nodeselector
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: Exists
            values: []
  nodeSelector:
    disktype: ssd
[root@master01 ~]#
Note: the manifest above requires the Pod to run on a node that has a label with key foo and also has the label disktype=ssd.

Apply the manifest

[root@master01 ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0
node01.k8s.org     Ready    <none>                 29d   v1.20.0
node02.k8s.org     Ready    <none>                 29d   v1.20.0
node03.k8s.org     Ready    <none>                 29d   v1.20.0         ssd
node04.k8s.org     Ready    <none>                 19d   v1.20.0
[root@master01 ~]# kubectl apply -f pod-demo-affinity-nodesector.yaml
pod/nginx-pod-nodeaffinity-nodeselector created
[root@master01 ~]# kubectl get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                             1/1     Running   0          168m   10.244.1.28   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity                1/1     Running   0          35m    10.244.1.29   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity-nodeselector   0/1     Pending   0          7s     <none>        <none>           <none>           <none>
nginx-pod-nodeselector                1/1     Running   0          159m   10.244.2.18   node02.k8s.org   <none>           <none>
[root@master01 ~]#

Note: the Pod stays Pending after creation because no node has both a label with key foo and the label disktype=ssd, so it cannot be scheduled.

2. When nodeAffinity specifies multiple nodeSelectorTerms, the terms are ORed. Each matchExpressions list is a separate term with its own conditions, and a node only needs to satisfy one of the terms.

[root@master01 ~]# cat pod-demo-affinity2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeaffinity2
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: Exists
            values: []
        - matchExpressions:
          - key: disktype
            operator: Exists
            values: []
[root@master01 ~]#
Note: the example above requires the Pod to run on a node that has a label with key foo or a label with key disktype.

Apply the manifest

[root@master01 ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0
node01.k8s.org     Ready    <none>                 29d   v1.20.0
node02.k8s.org     Ready    <none>                 29d   v1.20.0
node03.k8s.org     Ready    <none>                 29d   v1.20.0         ssd
node04.k8s.org     Ready    <none>                 19d   v1.20.0
[root@master01 ~]# kubectl apply -f pod-demo-affinity2.yaml
pod/nginx-pod-nodeaffinity2 created
[root@master01 ~]# kubectl get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                             1/1     Running   0          179m   10.244.1.28   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity                1/1     Running   0          46m    10.244.1.29   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity-nodeselector   0/1     Pending   0          10m    <none>        <none>           <none>           <none>
nginx-pod-nodeaffinity2               1/1     Running   0          6s     10.244.3.21   node03.k8s.org   <none>           <none>
nginx-pod-nodeselector                1/1     Running   0          169m   10.244.2.18   node02.k8s.org   <none>           <none>
[root@master01 ~]#

Note: the Pod is scheduled to node03. It can run there because node03 satisfies "has key foo or has key disktype": it carries the disktype label.

3. Multiple conditions inside the same matchExpressions are ANDed. The list holds several key entries, and a node must satisfy all of them.

Example: specifying several conditions under one matchExpressions

[root@master01 ~]# cat pod-demo-affinity3.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeaffinity3
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: Exists
            values: []
          - key: disktype
            operator: Exists
            values: []
[root@master01 ~]#
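The three combination rules in this part (nodeSelector AND nodeAffinity, OR across nodeSelectorTerms, AND inside one matchExpressions list) can be checked with a short Python sketch. It is deliberately simplified and not the scheduler's code: it only models the Exists operator, each term is written as a plain list of label keys, and `node_is_eligible` is a hypothetical helper name.

```python
def term_matches(labels, term):
    """One nodeSelectorTerm: every listed key must exist on the node (AND)."""
    return all(key in labels for key in term)

def node_is_eligible(labels, node_selector, terms):
    """nodeSelector AND (term1 OR term2 OR ...)."""
    selector_ok = all(labels.get(k) == v for k, v in node_selector.items())
    affinity_ok = not terms or any(term_matches(labels, t) for t in terms)
    return selector_ok and affinity_ok

# Hypothetical state: only node03 is labeled, with disktype=ssd.
node03 = {"disktype": "ssd"}

# Rule 1: nodeSelector disktype=ssd AND a term requiring key foo.
rule1 = node_is_eligible(node03, {"disktype": "ssd"}, [["foo"]])
# Rule 2: two terms, key foo OR key disktype.
rule2 = node_is_eligible(node03, {}, [["foo"], ["disktype"]])
# Rule 3: one term requiring key foo AND key disktype.
rule3 = node_is_eligible(node03, {}, [["foo", "disktype"]])
print(rule1, rule2, rule3)
```

Only the second case accepts node03, which is consistent with a Pod that can run under rule 2 but stays Pending under rules 1 and 3 when no node carries the foo label.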
Note: the manifest above requires the Pod to run on a node that has both a label with key foo and a label with key disktype.

Apply the manifest

[root@master01 ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0
node01.k8s.org     Ready    <none>                 29d   v1.20.0
node02.k8s.org     Ready    <none>                 29d   v1.20.0
node03.k8s.org     Ready    <none>                 29d   v1.20.0         ssd
node04.k8s.org     Ready    <none>                 19d   v1.20.0
[root@master01 ~]# kubectl apply -f pod-demo-affinity3.yaml
pod/nginx-pod-nodeaffinity3 created
[root@master01 ~]# kubectl get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                             1/1     Running   0          3h8m    10.244.1.28   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity                1/1     Running   0          56m     10.244.1.29   node01.k8s.org   <none>           <none>
nginx-pod-nodeaffinity-nodeselector   0/1     Pending   0          20m     <none>        <none>           <none>           <none>
nginx-pod-nodeaffinity2               1/1     Running   0          9m38s   10.244.3.21   node03.k8s.org   <none>           <none>
nginx-pod-nodeaffinity3               0/1     Pending   0          7s      <none>        <none>           <none>           <none>
nginx-pod-nodeselector                1/1     Running   0          179m    10.244.2.18   node02.k8s.org   <none>           <none>
[root@master01 ~]#

Note: the Pod stays Pending after creation because no node has both a label with key foo and a label with key disktype.

Pod affinity works and is used much like node affinity. It also has hard and soft constraints with the same logic: when a hard affinity rule is defined, the soft affinity rules help it choose the node, and if the hard rule cannot be satisfied the Pod stays Pending. When only soft rules are used, the Pod prefers the node that matches the soft rules with the most weight; if no node matches any soft rule, the default scheduling rules pick the highest-scoring node to run the Pod.

Example: scheduling with the hard constraint of podAffinity under affinity

[root@master01 ~]# cat require-podaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-1
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["nginx"]}
        topologyKey: kubernetes.io/hostname
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
[root@master01 ~]#
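How labelSelector and topologyKey work together in a hard podAffinity rule like the one above can be sketched as follows. This is an illustrative simplification, not the scheduler's implementation: the selector is reduced to a `{key: [allowed values]}` mapping (the In operator only), and the function name and the cluster data are hypothetical.

```python
def nodes_satisfying_pod_affinity(nodes, pods, selector, topology_key):
    """Return the nodes that share a topology domain with a matching Pod."""
    # Step 1: collect the topology domains that already host a matching Pod.
    domains = set()
    for pod in pods:
        if all(pod["labels"].get(k) in allowed for k, allowed in selector.items()):
            value = nodes[pod["node"]].get(topology_key)
            if value is not None:
                domains.add(value)
    # Step 2: keep the nodes whose topology label falls in one of those domains.
    return sorted(name for name, labels in nodes.items()
                  if labels.get(topology_key) in domains)

# Hypothetical cluster: every node carries its own hostname label.
nodes = {f"node0{i}.k8s.org": {"kubernetes.io/hostname": f"node0{i}.k8s.org"}
         for i in range(1, 5)}
pods = [{"name": "nginx-pod", "node": "node04.k8s.org", "labels": {"app": "nginx"}}]
selector = {"app": ["nginx"]}

with_nginx = nodes_satisfying_pod_affinity(nodes, pods, selector, "kubernetes.io/hostname")
without_nginx = nodes_satisfying_pod_affinity(nodes, [], selector, "kubernetes.io/hostname")
print(with_nginx, without_nginx)
```

Because the topology key is the hostname, each node is its own domain, so only the node that already runs the app=nginx Pod qualifies; with no matching Pod, no node qualifies and the new Pod would stay Pending.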
Note: the manifest above shows the hard constraint of podAffinity. Pod affinity is defined under spec.affinity with the podAffinity field. requiredDuringSchedulingIgnoredDuringExecution defines the hard constraint and is a list. In each entry, labelSelector selects the Pods that this Pod must be placed with, and topologyKey states how "the same location" is delimited; its value is the key of a node label. The manifest requires that the node running myapp already runs a Pod labeled app=nginx. In other words, wherever the app=nginx Pod runs, myapp runs too. If no such Pod exists, this Pod stays Pending.

Apply the manifest

[root@master01 ~]# kubectl get pods -L app -o wide
NAME        READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES   APP
nginx-pod   1/1     Running   0          8m25s   10.244.4.25   node04.k8s.org   <none>           <none>            nginx
[root@master01 ~]# kubectl apply -f require-podaffinity.yaml
pod/with-pod-affinity-1 created
[root@master01 ~]# kubectl get pods -L app -o wide
NAME                  READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES   APP
nginx-pod             1/1     Running   0          8m43s   10.244.4.25   node04.k8s.org   <none>           <none>            nginx
with-pod-affinity-1   1/1     Running   0          6s      10.244.4.26   node04.k8s.org   <none>           <none>
[root@master01 ~]#

Note: the Pod runs on node04 because that node already hosts a Pod labeled app=nginx, which satisfies the hard constraint in podAffinity.

Verify: delete both Pods, apply the manifest again, and see whether the Pod can still run.
[root@master01 ~]# kubectl delete all --all
pod "nginx-pod" deleted
pod "with-pod-affinity-1" deleted
service "kubernetes" deleted
[root@master01 ~]# kubectl apply -f require-podaffinity.yaml
pod/with-pod-affinity-1 created
[root@master01 ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
with-pod-affinity-1   0/1     Pending   0          8s    <none>   <none>   <none>           <none>
[root@master01 ~]#

Note: the Pod is Pending because no node runs a Pod labeled app=nginx, so the hard constraint in podAffinity is not satisfied.

Example: scheduling with the soft constraint of podAffinity under affinity

[root@master01 ~]# cat prefernece-podaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-2
spec:
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: rack
      - weight: 20
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: zone
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
[root@master01 ~]#
Note: the soft constraint of podAffinity is defined with the preferredDuringSchedulingIgnoredDuringExecution field. weight sets the weight of each soft condition: a node that satisfies the condition has that weight added to its score. In the manifest above, when locations are delimited by the node label key rack, a node whose location hosts a Pod labeled app=db gets 80 added to its total; when locations are delimited by the node label key zone, a node whose location hosts a Pod labeled app=db gets 20 added. If no node satisfies either condition, the default scheduling rules apply.

Apply the manifest

[root@master01 ~]# kubectl get node -L rack,zone
NAME               STATUS   ROLES                  AGE   VERSION   RACK   ZONE
master01.k8s.org   Ready    control-plane,master   30d   v1.20.0
node01.k8s.org     Ready    <none>                 30d   v1.20.0
node02.k8s.org     Ready    <none>                 30d   v1.20.0
node03.k8s.org     Ready    <none>                 30d   v1.20.0
node04.k8s.org     Ready    <none>                 20d   v1.20.0
[root@master01 ~]# kubectl get pods -o wide -L app
NAME                  READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES   APP
with-pod-affinity-1   0/1     Pending   0          22m   <none>   <none>   <none>           <none>
[root@master01 ~]# kubectl apply -f prefernece-podaffinity.yaml
pod/with-pod-affinity-2 created
[root@master01 ~]# kubectl get pods -o wide -L app
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
with-pod-affinity-1   0/1     Pending   0          22m   <none>        <none>           <none>           <none>
with-pod-affinity-2   1/1     Running   0          6s    10.244.4.28   node04.k8s.org   <none>           <none>
[root@master01 ~]#

Note: the Pod starts normally and is scheduled to node04. In this run the soft conditions played no part and the default scheduling rules decided the placement, because no node satisfied either soft condition.

Verify: delete the Pod, add a rack label to node01 and a zone label to node03, run the Pod again, and see how it is scheduled.
[root@master01 ~]# kubectl delete -f prefernece-podaffinity.yaml
pod "with-pod-affinity-2" deleted
[root@master01 ~]# kubectl label node node01.k8s.org rack=group1
node/node01.k8s.org labeled
[root@master01 ~]# kubectl label node node03.k8s.org zone=group2
node/node03.k8s.org labeled
[root@master01 ~]# kubectl get node -L rack,zone
NAME               STATUS   ROLES                  AGE   VERSION   RACK     ZONE
master01.k8s.org   Ready    control-plane,master   30d   v1.20.0
node01.k8s.org     Ready    <none>                 30d   v1.20.0   group1
node02.k8s.org     Ready    <none>                 30d   v1.20.0
node03.k8s.org     Ready    <none>                 30d   v1.20.0            group2
node04.k8s.org     Ready    <none>                 20d   v1.20.0
[root@master01 ~]# kubectl apply -f prefernece-podaffinity.yaml
pod/with-pod-affinity-2 created
[root@master01 ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
with-pod-affinity-1   0/1     Pending   0          27m   <none>        <none>           <none>           <none>
with-pod-affinity-2   1/1     Running   0          9s    10.244.4.29   node04.k8s.org   <none>           <none>
[root@master01 ~]#

Note: the Pod is still scheduled to node04, which shows that location labels on nodes do not change the result by themselves: no Pod labeled app=db is running yet, so neither soft condition matches.

Verify: delete the Pod, create one Pod labeled app=db on node01 and one on node03, apply the manifest again, and see how the Pod is scheduled.

[root@master01 ~]# kubectl delete -f prefernece-podaffinity.yaml
pod "with-pod-affinity-2" deleted
[root@master01 ~]# cat pod-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: redis-pod1
  labels:
    app: db
spec:
  nodeSelector:
    rack: group1
  containers:
  - name: redis
    image: redis:4-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: redis
      containerPort: 6379
---
apiVersion: v1
kind: Pod
metadata:
  name: redis-pod2
  labels:
    app: db
spec:
  nodeSelector:
    zone: group2
  containers:
  - name: redis
    image: redis:4-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: redis
      containerPort: 6379
[root@master01 ~]# kubectl apply -f pod-demo.yaml
pod/redis-pod1 created
pod/redis-pod2 created
[root@master01 ~]# kubectl get pods -L app -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES APP
redis-pod1 1/1 Running 0 34s 10.244.1.35 node01.k8s.org <none> <none> db
redis-pod2 1/1 Running 0 34s 10.244.3.24 node03.k8s.org <none> <none> db
with-pod-affinity-1 0/1 Pending 0 34m <none> <none> <none> <none>
[root@master01 ~]# kubectl apply -f prefernece-podaffinity.yaml
pod/with-pod-affinity-2 created
[root@master01 ~]# kubectl get pods -L app -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES APP
redis-pod1 1/1 Running 0 52s 10.244.1.35 node01.k8s.org <none> <none> db
redis-pod2 1/1 Running 0 52s 10.244.3.24 node03.k8s.org <none> <none> db
with-pod-affinity-1 0/1 Pending 0 35m <none> <none> <none> <none>
with-pod-affinity-2 1/1 Running 0 9s 10.244.1.36 node01.k8s.org <none> <none>
[root@master01 ~]#
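The way the two weighted preferences lead to this placement can be approximated with a short Python sketch. It is a rough model under stated assumptions, not the scheduler's algorithm: each preference contributes its weight at most once per node, the real scheduler's normalization and other scoring functions are ignored, and the function name and data structures are hypothetical.

```python
def pod_affinity_score(node_name, nodes, pods, preferred_terms):
    """Add a term's weight when the node shares that term's topology
    domain with at least one Pod matching the term's label selector."""
    score = 0
    for term in preferred_terms:
        domain = nodes[node_name].get(term["topology_key"])
        if domain is None:
            continue  # node has no label for this topology key
        for pod in pods:
            same_domain = nodes[pod["node"]].get(term["topology_key"]) == domain
            selected = pod["labels"].get(term["label_key"]) in term["values"]
            if same_domain and selected:
                score += term["weight"]
                break
    return score

# Cluster state mirroring the transcript: rack on node01, zone on node03.
nodes = {
    "node01.k8s.org": {"rack": "group1"},
    "node02.k8s.org": {},
    "node03.k8s.org": {"zone": "group2"},
    "node04.k8s.org": {},
}
pods = [
    {"name": "redis-pod1", "node": "node01.k8s.org", "labels": {"app": "db"}},
    {"name": "redis-pod2", "node": "node03.k8s.org", "labels": {"app": "db"}},
]
preferred_terms = [
    {"weight": 80, "label_key": "app", "values": ["db"], "topology_key": "rack"},
    {"weight": 20, "label_key": "app", "values": ["db"], "topology_key": "zone"},
]
scores = {name: pod_affinity_score(name, nodes, pods, preferred_terms) for name in nodes}
best = max(scores, key=scores.get)
print(scores, best)
```

Under this model node01 collects the weight-80 preference and node03 only the weight-20 one, so node01 comes out on top.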
Note: the Pod runs on node01. That node runs a Pod labeled app=db, which satisfies the soft condition, and it also has a node label with key rack, so it meets the condition with weight 80. The Pod therefore prefers node01.

Example: scheduling with both the hard and soft constraints of podAffinity under affinity

[root@master01 ~]# cat require-preference-podaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-3
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["db"]}
        topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: rack
      - weight: 20
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: zone
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
[root@master01 ~]#
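How a manifest like this one is evaluated can be sketched as a small toy model. This is only an illustration, not the scheduler's actual code: the names `NODES`, `in_same_domain` and `schedule` are hypothetical, and the cluster state is assumed from this walkthrough (node01 labeled rack=group1, node03 labeled zone=group2). The real scheduler also combines many other scoring functions and picks randomly among tied nodes, all of which the sketch ignores.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical node labels, loosely mirroring the cluster in this walkthrough.
NODES: Dict[str, Dict[str, str]] = {
    "node01": {"kubernetes.io/hostname": "node01", "rack": "group1"},
    "node02": {"kubernetes.io/hostname": "node02"},
    "node03": {"kubernetes.io/hostname": "node03", "zone": "group2"},
    "node04": {"kubernetes.io/hostname": "node04"},
}

# An existing pod: (name of the node it runs on, pod labels).
Pod = Tuple[str, Dict[str, str]]


def in_same_domain(node: str, pods: List[Pod], topology_key: str, app: str) -> bool:
    """True if some pod labeled app=<app> runs in the node's topology domain."""
    domain = NODES[node].get(topology_key)
    if domain is None:  # a node without the label belongs to no domain
        return False
    return any(
        labels.get("app") == app and NODES[host].get(topology_key) == domain
        for host, labels in pods
    )


def schedule(pods: List[Pod]) -> Optional[str]:
    """Filter on the hard term, then rank the survivors by the soft terms."""
    feasible = [
        n for n in NODES if in_same_domain(n, pods, "kubernetes.io/hostname", "db")
    ]
    if not feasible:
        return None  # no node passes the hard term: the pod stays Pending

    soft_terms = [(80, "rack"), (20, "zone")]

    def score(node: str) -> int:
        return sum(w for w, key in soft_terms if in_same_domain(node, pods, key, "db"))

    return max(feasible, key=score)
```

With app=db pods on node01 and node03, `schedule` keeps only those two nodes and ranks node01 (score 80) above node03 (score 20); with no app=db pod at all it returns `None`, which corresponds to the Pending state.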
Note: the manifest above requires the pod to run on a node that already runs a pod labeled app=db (with topologyKey kubernetes.io/hostname, each node is its own topology domain); if no node qualifies, the pod stays Pending. If several nodes qualify, the soft constraints rank them: a node in the same rack domain as an app=db pod, that is, one sharing the same value of the node label with key rack, gets 80 added to its score, and a node in the same zone domain gets 20.

Apply the manifest

[root@master01 ~]# kubectl get pods -o wide -L app
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-pod1            1/1     Running   0          13m   10.244.1.35   node01.k8s.org   <none>           <none>            db
redis-pod2            1/1     Running   0          13m   10.244.3.24   node03.k8s.org   <none>           <none>            db
with-pod-affinity-1   0/1     Pending   0          48m   <none>        <none>           <none>           <none>
with-pod-affinity-2   1/1     Running   0          13m   10.244.1.36   node01.k8s.org   <none>           <none>
[root@master01 ~]# kubectl apply -f require-preference-podaffinity.yaml
pod/with-pod-affinity-3 created
[root@master01 ~]# kubectl get pods -o wide -L app
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-pod1            1/1     Running   0          14m   10.244.1.35   node01.k8s.org   <none>           <none>            db
redis-pod2            1/1     Running   0          14m   10.244.3.24   node03.k8s.org   <none>           <none>            db
with-pod-affinity-1   0/1     Pending   0          48m   <none>        <none>           <none>           <none>
with-pod-affinity-2   1/1     Running   0          13m   10.244.1.36   node01.k8s.org   <none>           <none>
with-pod-affinity-3   1/1     Running   0          6s    10.244.1.37   node01.k8s.org   <none>           <none>
[root@master01 ~]#
Note: the pod was scheduled onto node01, because that node satisfies the hard constraint and also matches the soft constraint with the largest weight.

Verification: delete the pods above and apply the manifest again. Does the pod still run normally?
[root@master01 ~]# kubectl delete all --all
pod "redis-pod1" deleted
pod "redis-pod2" deleted
pod "with-pod-affinity-1" deleted
pod "with-pod-affinity-2" deleted
pod "with-pod-affinity-3" deleted
service "kubernetes" deleted
[root@master01 ~]# kubectl apply -f require-preference-podaffinity.yaml
pod/with-pod-affinity-3 created
[root@master01 ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
with-pod-affinity-3   0/1     Pending   0          5s    <none>        <none>           <none>           <none>
[root@master01 ~]#
Note: the newly created pod is Pending, because no node satisfies its hard scheduling constraint; the pod cannot be scheduled and is left waiting.

Example: scheduling with podAntiAffinity

[root@master01 ~]# cat require-preference-podantiaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-4
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["db"]}
        topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: rack
      - weight: 20
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: zone
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
[root@master01 ~]#
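The difference between the hard and the soft anti-affinity terms in a manifest like this can be sketched with another toy model. It is a simplification under stated assumptions, not the scheduler's implementation: `NODE_LABELS`, `shares_domain` and `place` are hypothetical names, the node labels are assumed from this walkthrough, and only the anti-affinity terms are considered. The hard term removes nodes from the candidate list, while the soft terms only add a penalty.

```python
from typing import Dict, List, Optional

# Hypothetical node labels, assumed from this walkthrough.
NODE_LABELS: Dict[str, Dict[str, str]] = {
    "node01": {"rack": "group1"},
    "node02": {},
    "node03": {"zone": "group2"},
    "node04": {},
}


def shares_domain(node: str, db_hosts: List[str], key: str) -> bool:
    """True if an app=db pod runs in the same topology domain as `node`."""
    if key == "kubernetes.io/hostname":  # every node is its own domain
        return node in db_hosts
    value = NODE_LABELS[node].get(key)
    return value is not None and any(
        NODE_LABELS[host].get(key) == value for host in db_hosts
    )


def place(db_hosts: List[str], hard: bool = True) -> Optional[str]:
    """Pick a node under anti-affinity; None means the pod stays Pending.

    db_hosts lists the nodes that currently run a pod labeled app=db.
    """
    candidates = [
        n for n in NODE_LABELS
        if not (hard and shares_domain(n, db_hosts, "kubernetes.io/hostname"))
    ]
    if not candidates:
        return None

    def penalty(node: str) -> int:
        # Soft terms only lower a node's rank; they never exclude it.
        soft_terms = [(80, "rack"), (20, "zone")]
        return sum(w for w, key in soft_terms if shares_domain(node, db_hosts, key))

    return min(candidates, key=penalty)
```

When every node runs an app=db pod, `place` returns `None` with the hard term enabled, whereas `place(..., hard=False)` still returns one of the unpenalized nodes (node02 or node04 in this model).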
Note: podAntiAffinity is written the same way as podAffinity, but its logic is reversed: podAffinity attracts the pod to nodes that match, podAntiAffinity keeps it away from them. The manifest above means the pod must never run on a node that already runs a pod labeled app=db (hard constraint). In addition, it prefers to avoid nodes that share a rack domain (weight 80) or a zone domain (weight 20) with an app=db pod; these are soft constraints, so they only lower a node's ranking. If every node is excluded by the hard constraint, the pod stays Pending. With soft constraints alone, the pod would still be scheduled, reluctantly, onto the least-penalized node.

Apply the manifest

[root@master01 ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
with-pod-affinity-3   0/1     Pending   0          22m   <none>        <none>           <none>           <none>
[root@master01 ~]# kubectl apply -f require-preference-podantiaffinity.yaml
pod/with-pod-affinity-4 created
[root@master01 ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
with-pod-affinity-3   0/1     Pending   0          22m   <none>        <none>           <none>           <none>
with-pod-affinity-4   1/1     Running   0          6s    10.244.4.30   node04.k8s.org   <none>           <none>
[root@master01 ~]# kubectl get node -L rack,zone
NAME               STATUS   ROLES                  AGE   VERSION   RACK     ZONE
master01.k8s.org   Ready    control-plane,master   30d   v1.20.0
node01.k8s.org     Ready    <none>                 30d   v1.20.0   group1
node02.k8s.org     Ready    <none>                 30d   v1.20.0
node03.k8s.org     Ready    <none>                 30d   v1.20.0            group2
node04.k8s.org     Ready    <none>                 20d   v1.20.0
[root@master01 ~]#
Note: the pod was scheduled onto node04. None of the three terms counts against node04: it runs no app=db pod and has neither a rack nor a zone label. node02 would have been an equally valid choice. Keep in mind that no app=db pod exists at this point (with-pod-affinity-3 carries no app label and is still Pending), so the anti-affinity terms do not yet exclude any node.

Verification: delete the pods above, run one pod labeled app=db on each of the four nodes, then apply the manifest again and see how the pod is scheduled.

[root@master01 ~]# kubectl delete all --all
pod "with-pod-affinity-3" deleted
pod "with-pod-affinity-4" deleted
service "kubernetes" deleted
[root@master01 ~]# cat pod-demo.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: redis-ds
  labels:
    app: db
spec:
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: redis
        image: redis:4-alpine
        ports:
        - name: redis
          containerPort: 6379
[root@master01 ~]# kubectl apply -f pod-demo.yaml
daemonset.apps/redis-ds created
[root@master01 ~]# kubectl get pods -L app -o wide
NAME             READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-ds-4bnmv   1/1     Running   0          44s   10.244.2.26   node02.k8s.org   <none>           <none>            db
redis-ds-c2h77   1/1     Running   0          44s   10.244.1.38   node01.k8s.org   <none>           <none>            db
redis-ds-mbxcd   1/1     Running   0          44s   10.244.4.32   node04.k8s.org   <none>           <none>            db
redis-ds-r2kxv   1/1     Running   0          44s   10.244.3.25   node03.k8s.org   <none>           <none>            db
[root@master01 ~]# kubectl apply -f require-preference-podantiaffinity.yaml
pod/with-pod-affinity-5 created
[root@master01 ~]# kubectl get pods -o wide -L app
NAME                  READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-ds-4bnmv        1/1     Running   0          2m29s   10.244.2.26   node02.k8s.org   <none>           <none>            db
redis-ds-c2h77        1/1     Running   0          2m29s   10.244.1.38   node01.k8s.org   <none>           <none>            db
redis-ds-mbxcd        1/1     Running   0          2m29s   10.244.4.32   node04.k8s.org   <none>           <none>            db
redis-ds-r2kxv        1/1     Running   0          2m29s   10.244.3.25   node03.k8s.org   <none>           <none>            db
with-pod-affinity-5   0/1     Pending   0          9s      <none>        <none>           <none>           <none>
[root@master01 ~]#
Note: no node can run the pod, so it is Pending. Every node now runs an app=db pod and is therefore excluded by the hard anti-affinity constraint.

The verification steps above lead to the following summary. Whether the affinity is between pod and node or between pod and pod, once a hard affinity is defined the pod only ever runs on a node that satisfies it; if no node does, the pod stays Pending. If only soft affinity is defined, the pod prefers the nodes that match the soft terms with the largest weights; if no node matches any soft term, scheduling falls back to the default policy and picks the highest-scoring node.

Anti-affinity follows the same logic in reverse: a node that matches a hard anti-affinity term is excluded outright, and a node that matches a soft term is avoided where possible.

One more point deserves attention. When pod-to-pod affinity is used in a cluster with many nodes, the rules should not be overly fine-grained; a moderate granularity is enough. Overly detailed rules make node filtering consume more resources at scheduling time and degrade the performance of the whole cluster. In large clusters, node affinity is the recommended choice. |
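The recommendation to prefer node affinity in large clusters follows from what each check has to look at. As a rough illustration, and not the scheduler's code, a node affinity term can be decided from the candidate node's own labels alone, whereas a pod affinity term must also take the pods already running in the cluster into account. The helper name `node_matches` and the tuple layout are hypothetical; the four operators are the common set-based ones used in matchExpressions.

```python
from typing import Dict, List, Tuple

# One match expression: (key, operator, values), mirroring the fields of a
# matchExpressions entry in a manifest.
Expression = Tuple[str, str, List[str]]


def node_matches(labels: Dict[str, str], expressions: List[Expression]) -> bool:
    """Return True if the node's labels satisfy every expression (logical AND)."""
    for key, operator, values in expressions:
        if operator == "In":
            ok = labels.get(key) in values
        elif operator == "NotIn":
            ok = labels.get(key) not in values
        elif operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not ok:
            return False
    return True
```

For example, `node_matches({"zone": "group2"}, [("zone", "In", ["group2"])])` is true without inspecting any pod, which is why this kind of rule stays cheap as the cluster grows.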