联系:手机/微信(+86 17813235971) QQ(107644445)
标题:手动提交分布式事务一例
作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]
一.alert文件中出现了很多类此记录
Fri Feb 10 05:25:01 2012 Errors in file /tmp/recover/ahcx216_reco_7956.trc: ORA-12154: TNS:could not resolve service name Fri Feb 10 05:25:01 2012 Errors in file /tmp/recover/ahcx216_reco_7956.trc: ORA-12154: TNS:could not resolve service name
这里可以看出来两个信息:
1.出错的进程是rec0进程,而该进程的作用是解决分布式事务失败后遗留问题(事务提交或者回滚)
2.错误信息是tns不能被解析
二.查看trace文件
/tmp/recover/ahcx216_reco_7956.trc Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production With the Partitioning, OLAP and Oracle Data Mining options JServer Release 9.2.0.8.0 - Production ORACLE_HOME = /opt/oracle/product/9.2.0/db_1 System name: Linux Node name: localhost.localdomain Release: 2.6.9-89.0.0.0.1.ELhugemem Version: #1 SMP Tue May 19 04:38:38 EDT 2009 Machine: i686 Instance name: ahcx216 Redo thread mounted by this instance: 1 Oracle process number: 7 Unix process pid: 7956, image: oracle@localhost.localdomain (RECO) *** SESSION ID:(6.1) 2012-02-10 04:58:24.886 *** 2012-02-10 04:58:24.886 ERROR, tran=6.1.712757, session#=1, ose=0: ORA-12154: TNS:could not resolve service name ……………… *** 2012-02-10 05:25:01.580 ERROR, tran=6.1.712757, session#=1, ose=0: ORA-12154: TNS:could not resolve service name *** 2012-02-10 05:25:01.627 ERROR, tran=12.19.99059, session#=1, ose=0: ORA-12154: TNS:could not resolve service name
通过这里我们可以看出事务id分别为12.19.99059和6.1.712757有问题
三.查看dba_2pc_pending视图
SQL> select local_tran_id,state,fail_time,retry_time from dba_2pc_pending; LOCAL_TRAN_ID STATE FAIL_TIME RETRY_TIME -------------- ---------------- -------------- -------------- 6.1.712757 collecting 2010/12/1 13:39:03 2012/2/10 5:38:52 12.19.99059 collecting 2010/12/1 15:56:26 2012/2/10 5:38:52
可以看出,果然有两个分布式事务在2010年12月1日出了问题(本库是一个问题库,在把库拉起来后发现该问题)
因为是异机恢复,而且间隔时间较长,很多tns的信息都已经不存在,所以需要手工提交分布式事务
四.手动提交事务
SQL> BEGIN 2 DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757'); 3 DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059'); 4 END; 5 / BEGIN * 第 1 行出现错误: ORA-30019: Illegal rollback Segment operation in Automatic Undo mode ORA-06512: at "SYS.DBMS_TRANSACTION", line 65 ORA-06512: at "SYS.DBMS_TRANSACTION", line 85 ORA-06512: at line 2 SQL> alter session set "_smu_debug_mode"=4; Session altered. ------------------------------------------------------------- --设置UNDO_SUPPRESS_ERRORS=true也可以解决此问题 alter system set UNDO_SUPPRESS_ERRORS = TRUE; EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('<事务ID>'); commit; alter system set UNDO_SUPPRESS_ERRORS = false; -------------------------------------------------------------- SQL> commit; SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757'); PL/SQL procedure successfully completed. SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059'); BEGIN DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059'); END; * ERROR at line 1: ORA-01453: SET TRANSACTION must be first statement of transaction ORA-06512: at "SYS.DBMS_TRANSACTION", line 65 ORA-06512: at "SYS.DBMS_TRANSACTION", line 85 ORA-06512: at line 1 --第一个分布式事务处理后,未提交导致 SQL> commit; Commit complete. SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059'); PL/SQL procedure successfully completed. SQL> commit; Commit complete.
五.补充说明
开始在另一个会话中,执行失败原因
SQL> alter session set "_smu_debug_mode"=4; 会话已更改。 SQL> commit; 提交完成。 SQL> BEGIN 2 DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757'); 3 DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059'); 4 END; 5 / BEGIN * 第 1 行出现错误: ORA-01453: SET TRANSACTION must be first statement of transaction ORA-06512: at "SYS.DBMS_TRANSACTION", line 65 ORA-06512: at "SYS.DBMS_TRANSACTION", line 85 ORA-06512: at line 3
因为这里的begin end中包含了两个事务的清理,在清理完第一个事务之后,需要提交才能够清理第二个,这里因为没有提交,导致ORA-01453错误。