|
https://dev./doc/refman/5.7/en/innodb-deadlocks.html 什么是mysql的死鎖? A deadlock is a situation where different transactions are unable to proceed because each holds a lock that the other needs. Because both transactions are waiting for a resource to become available, neither ever release the locks it holds. 簡單來說可以提煉出2個詞:環(huán)路等待(each holds a lock that the other needs)和不可剝奪(neither ever release the locks it holds)。 其實廣泛意義上死鎖的四個必要條件也可以直接簡化為上述兩個條件,剩下的互斥和請求保持條件只是兩個眾所周知的補充。 一、一個簡單的死鎖示例: 會話A: mysql> CREATE TABLE t (i INT) ENGINE = InnoDB;
Query OK, 0 rows affected (1.07 sec)
mysql> INSERT INTO t (i) VALUES(1);
Query OK, 1 row affected (0.09 sec)
mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT * FROM t WHERE i = 1 LOCK IN SHARE MODE;
+------+
| i |
+------+
| 1 |
+------+
會話B: mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)
mysql> DELETE FROM t WHERE i = 1;
此時會話B會被阻塞(直到鎖請求超時)。 此時會話A繼續(xù)執(zhí)行:
DELETE FROM t WHERE i = 1;
會話B會被立馬rollback,因為產(chǎn)生了死鎖,最近的死鎖信息可以通過show engine innodb status\G看到。 打開innodb_print_all_deadlocks參數(shù)之后,死鎖信息還會在error日志里打印。鑒于本例過于簡單就不占用篇幅分析死鎖信息了。 set @@global.innodb_print_all_deadlocks=on;
innodb會選擇耗費資源較少的事務(wù)進行回滾(取決于DML涉及的行數(shù)和size)。 二、一個實際的死鎖示例: error日志里顯示的死鎖日志為: InnoDB: transactions deadlock detected, dumping detailed information.
*** (1) TRANSACTION:
TRANSACTION 209262583957, ACTIVE 1 sec starting index read
mysql tables in use 2, locked 2
LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s)
MySQL thread id 129183854, OS thread handle 0x7f1aeae7a700, query id 68320628504 <服務(wù)器A信息> updating
update tb_authorize_info set account_balance=account_balance- 100.00
where (SELECT a.account_balance from
(select account_balance from tb_authorize_info a where appId = '49E5BD695F853DC3' )a) - 100.00 > 0
and appId = '49E5BD695F853DC3'
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1845 page no 4 n bits 96 index `PRIMARY` of table `xxx`.`tb_authorize_info` trx id 209262583957 lock_mode X locks rec but not gap waiting
Record lock, heap no 18 PHYSICAL RECORD: n_fields 32; compact format; info bits 0
......
*** (2) TRANSACTION:
TRANSACTION 209262584968, ACTIVE 1 sec starting index read
mysql tables in use 2, locked 2
4 lock struct(s), heap size 1184, 2 row lock(s)
MySQL thread id 129183879, OS thread handle 0x7f198b208700, query id 68320632234 <服務(wù)器B信息> updating
update tb_authorize_info set account_balance=account_balance- 100.00
where (SELECT a.account_balance from
(select account_balance from tb_authorize_info a where appId = '49E5BD695F853DC3' )a) - 100.00 > 0
and appId = '49E5BD695F853DC3'
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1845 page no 4 n bits 96 index `PRIMARY` of table `xxx`.`tb_authorize_info` trx id 209262584968 lock mode S locks rec but not gap
Record lock, heap no 18 PHYSICAL RECORD: n_fields 32; compact format; info bits 0
......
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1845 page no 4 n bits 96 index `PRIMARY` of table `xxx`.`tb_authorize_info` trx id 209262584968 lock_mode X locks rec but not gap waiting
Record lock, heap no 18 PHYSICAL RECORD: n_fields 32; compact format; info bits 0
......
*** WE ROLL BACK TRANSACTION (2)
這個死鎖屬于簡單的死鎖,由于網(wǎng)絡(luò)或其他延遲導(dǎo)致應(yīng)用請求發(fā)送到了2臺負(fù)載均衡的應(yīng)用服務(wù)器,兩個應(yīng)用程序同時請求數(shù)據(jù)庫執(zhí)行SQL,兩者都根據(jù)where條件先獲取到了S鎖,然后準(zhǔn)備升級為(或新增)X鎖以便更新,但是各自被對方的S鎖阻塞,因此形成死鎖,不過死鎖很快被mysql殺掉,事務(wù)1正常執(zhí)行完畢,事務(wù)二回滾,前臺業(yè)務(wù)除了一點點延遲基本沒啥影響。 三、stackoverflow上另一個死鎖: 有人在stackoverflow上發(fā)了一個死鎖的信息,嘗試直接解析此類信息對分析高并發(fā)下的SQL卡慢會有幫助因此嘗試自己解析。 https://dba./questions/39550/when-and-why-can-this-kind-of-deadlock-occur LATEST DETECTED DEADLOCK ------------------------ 130409 0:40:58 *** (1) TRANSACTION: TRANSACTION 3D61D41F, ACTIVE 3 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 43 lock struct(s), heap size 6960, 358 row lock(s), undo log entries 43 MySQL thread id 17241690, OS thread handle 0x7ffd3469a700, query id 860259163 localhost root update ############# INSERT INTO `notification` (`other_grouped_notifications_count`, `user_id`, `notifiable_type`, `action_item`, `action_id`, `created_at`, `status`, `updated_at`) VALUES (0, 4442, 'MATCH', 'MATCH', 224716, 1365448255, 1, 1365448255) ############# *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 0 page no 272207 n bits 1272 index `user_id` of table `notification` trx id 3D61D41F lock_mode X locks gap before rec insert intention waiting Record lock, heap no 69 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 4; hex 8000115b; asc [;; 1: len 4; hex 0005e0bb; asc ;; -- 事務(wù)1欲插入數(shù)據(jù)user_id=4442,因此首先獲取了對應(yīng)主鍵(lower_bound,4443]范圍上的插入意向鎖,然后想要在輔助索引(lower_bound,4443]的范圍上加insert intention lock,但被阻塞,推斷這個范圍上已經(jīng)有了其他事務(wù)的行鎖 -- 事務(wù)1需要獲取2個插入意向鎖后才會開始插入操作,這兩個鎖的獲取是不可分割的 *** (2) TRANSACTION: TRANSACTION 3D61C472, ACTIVE 15 sec starting index read mysql tables in use 1, locked 1 3 lock struct(s), heap size 1248, 2 row lock(s) MySQL thread id 17266704, OS thread handle 0x7ffd34b01700, query id 860250374 localhost root Updating ############# UPDATE `notification` SET `status`=0 WHERE user_id = 4443 and status=1 ############# *** (2) HOLDS THE LOCK(S): -- 事務(wù)2的update語句要更新user_id=4443的記錄,因此首先在user_id索引的(lower_bound,4443]范圍添加了X模式的next-key行鎖,事務(wù)1就是被這個next-key行鎖阻塞的 RECORD LOCKS space id 0 page no 272207 n bits 1272 index `user_id` of table `notification` trx id 3D61C472 lock_mode X Record lock, heap no 69 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 4; hex 8000115b; asc [;; 1: len 4; hex 0005e0bb; asc ;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: -- 當(dāng)事務(wù)2嘗試更新主鍵數(shù)據(jù)時要獲取user_id=4443對應(yīng)主鍵的行鎖,但是發(fā)現(xiàn)主鍵的(lower_bound,4443]范圍上已經(jīng)被事務(wù)1加了insert intention lock,因此被阻塞 -- 同樣事務(wù)2獲取輔助索引的next-key和主鍵的record鎖也是不可分割的,只有都獲取完畢才能進行update RECORD LOCKS space id 0 page no 261029 n bits 248 index `PRIMARY` of table `notification` trx id 3D61C472 lock_mode X locks rec but not gap waiting Record lock, heap no 161 PHYSICAL RECORD: n_fields 16; compact format; info bits 0 0: len 4; hex 0005e0bb; asc ;; 1: len 6; hex 00000c75178f; asc u ;; 2: len 7; hex 480007c00c1d10; asc H ;; 3: len 4; hex 8000115b; asc [;; 4: len 8; hex 5245474953544552; asc REGISTER;; 5: SQL NULL; 6: SQL NULL; 7: SQL NULL; 8: len 4; hex d117dd91; asc ;; 9: len 4; hex d117dd91; asc ;; 10: len 1; hex 80; asc ;; 11: SQL NULL; 12: SQL NULL; 13: SQL NULL; 14: SQL NULL; 15: len 4; hex 80000000; asc ;; *** WE ROLL BACK TRANSACTION (2) 所以這個死鎖的出現(xiàn)就很容易理解了,事務(wù)1先獲取了4442位置主鍵的插入意向鎖,在獲取輔助索引上的插入意向鎖時被事務(wù)2 update語句的next-key行鎖阻塞導(dǎo)致插入意向鎖獲取失敗,而事務(wù)2的update獲取了索引的next-key行鎖后嘗試更新主鍵(即在主鍵上加非gap行鎖)卻被事務(wù)1的插入意向鎖阻塞。 兩個事務(wù)都不能放棄自己已有的資源,都請求與對方不兼容的鎖,不可剝奪且形成環(huán)路等待因此死鎖。
這個死鎖的根源就在于事務(wù)2的update語句持續(xù)的時間過長,導(dǎo)致后繼insert語句卡死。
四、如何避免死鎖? 其實官網(wǎng)有一篇完整的介紹:https://dev./doc/refman/5.7/en/innodb-deadlocks-handling.html 但是內(nèi)容有點多,我還是習(xí)慣用幾句話總結(jié)下: 1、盡可能優(yōu)化SQL的查詢性能使得事務(wù)盡可能的短小。 2、如果不介意幻讀可以使用read committed隔離級別以禁止范圍鎖。 3、如果前兩者都做不到或者SQL優(yōu)化的空間比較小,那么盡量分表分庫,通過增加資源(或者叫分散資源)減少資源沖突的幾率。 五、總結(jié): 由于mysql innodb特殊的行鎖機制,死鎖通常都是涉及到插入意向鎖和next-key鎖的,因為這兩個鎖是范圍鎖,范圍鎖設(shè)計的目的就是為避免幻讀,這會鎖定一些自己不需要操作的記錄。 不過在mysql中死鎖從來都不是大問題,死鎖通常都是數(shù)據(jù)庫卡慢的果,而非因。而且由于數(shù)據(jù)庫中普遍存在的死鎖查殺機制,死鎖產(chǎn)生后會很快被查殺。 真正可能引發(fā)數(shù)據(jù)庫性能問題的,是高并發(fā)下的長事務(wù),這種事務(wù)會導(dǎo)致undo等資源的爭用,會占用binlog的提交隊列導(dǎo)致后繼事務(wù)處于commit階段無法提交,即便強制kill也會引發(fā)長時間的rollback操作。 因此高并發(fā)下的長事務(wù)和低性能SQL才是死鎖的主因,因為他們慢且作為一個整體在完成之前不會釋放資源產(chǎn)生環(huán)路等待。 |
|
|