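Contact: Mobile/WeChat (+86 17813235971) QQ (107644445)
Title: Root-cause analysis of ORA-19502/ORA-27063 errors during an RMAN backup
Author: 惜分飞 © All rights reserved [Do not reproduce in any form without the author's consent; the author reserves the right to pursue legal action.]
RMAN backup fails with ORA-19502/ORA-27063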
RMAN> 2> 3> 4> 5> 6> 7> 8> 9> 10> 11> 12> 13> 14> 15> 16> 17> 18> 19> 20>
allocated channel: t11
channel t11: sid=824 instance=ncdb1 devtype=DISK
allocated channel: t12
channel t12: sid=838 instance=ncdb1 devtype=DISK
allocated channel: t13
channel t13: sid=809 instance=ncdb1 devtype=DISK
allocated channel: t14
channel t14: sid=886 instance=ncdb1 devtype=DISK
allocated channel: t15
channel t15: sid=620 instance=ncdb1 devtype=DISK
allocated channel: t16
channel t16: sid=599 instance=ncdb1 devtype=DISK
allocated channel: t17
channel t17: sid=482 instance=ncdb1 devtype=DISK
allocated channel: t18
channel t18: sid=506 instance=ncdb1 devtype=DISK
--8 channels are allocated in total

channel t12: starting full datafile backupset
channel t12: specifying datafile(s) in backupset
input datafile fno=00008 name=/dev/rnc32g_39
input datafile fno=00016 name=/dev/rnc32g_47
input datafile fno=00024 name=/dev/rnc32g_57
input datafile fno=00032 name=/dev/rnc32g_25
input datafile fno=00040 name=/dev/rnc32g_33
input datafile fno=00048 name=/dev/rnc32g_3
input datafile fno=00056 name=/dev/rnc32g_11
input datafile fno=00064 name=/dev/rnc32g_19
input datafile fno=00072 name=/dev/rnc32g_67
input datafile fno=00080 name=/dev/rnc32g_106
input datafile fno=00088 name=/dev/rnc32g_114
input datafile fno=00096 name=/dev/rnc32g_87
input datafile fno=00104 name=/dev/rnc32g_95
input datafile fno=00112 name=/dev/rnc32g_103
input datafile fno=00120 name=/dev/rnc32g_75
input datafile fno=00003 name=/dev/rnc50_sysaux
input datafile fno=00130 name=/dev/rnc32g_119
channel t12: starting piece 1 at 14-MAY-12
--channel t12 starts backing up its datafiles

channel t17: starting full datafile backupset
channel t17: specifying datafile(s) in backupset
input datafile fno=00002 name=/dev/rnc32g_22
input datafile fno=00013 name=/dev/rnc32g_44
input datafile fno=00021 name=/dev/rnc32g_54
input datafile fno=00029 name=/dev/rnc32g_62
input datafile fno=00037 name=/dev/rnc32g_30
input datafile fno=00045 name=/dev/rnc32g_38
input datafile fno=00053 name=/dev/rnc32g_8
input datafile fno=00061 name=/dev/rnc32g_16
input datafile fno=00069 name=/dev/rnc32g_64
input datafile fno=00077 name=/dev/rncundo_33g_4
input datafile fno=00085 name=/dev/rnc32g_111
input datafile fno=00093 name=/dev/rnc32g_84
input datafile fno=00101 name=/dev/rnc32g_92
input datafile fno=00109 name=/dev/rnc32g_100
input datafile fno=00117 name=/dev/rnc32g_72
input datafile fno=00006 name=/dev/rnc50_4g_1
channel t17: starting piece 1 at 14-MAY-12
--channel t17 starts backing up its datafiles

channel t15: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mpnb04jl_1_1 tag=TAG20120514T204954 comment=NONE
channel t15: backup set complete, elapsed time: 06:07:59
channel t11: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mlnb04jk_1_1 tag=TAG20120514T204954 comment=NONE
channel t11: backup set complete, elapsed time: 06:17:25
channel t16: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mqnb04jm_1_1 tag=TAG20120514T204954 comment=NONE
channel t16: backup set complete, elapsed time: 06:34:49
channel t14: finished piece 1 at 15-MAY-12
piece handle=/rman/db_monb04jl_1_1 tag=TAG20120514T204954 comment=NONE
channel t14: backup set complete, elapsed time: 06:40:05
channel t18: finished piece 1 at 15-MAY-12
piece handle=/rman/db_msnb04jn_1_1 tag=TAG20120514T204954 comment=NONE
channel t18: backup set complete, elapsed time: 06:43:38
channel t13: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mnnb04jl_1_1 tag=TAG20120514T204954 comment=NONE
channel t13: backup set complete, elapsed time: 07:40:56
--At this point channels t11/t13/t14/t15/t16/t18 have finished their backup sets; only channels t12 and t17 are still running.

RMAN-03009: failure of backup command on t12 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30481025 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30480897 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t12 disabled, job failed on it will be run on another channel
--channel t12 fails (no space left on disk)

channel t11: starting full datafile backupset
channel t11: specifying datafile(s) in backupset
input datafile fno=00008 name=/dev/rnc32g_39
input datafile fno=00016 name=/dev/rnc32g_47
input datafile fno=00024 name=/dev/rnc32g_57
input datafile fno=00032 name=/dev/rnc32g_25
input datafile fno=00040 name=/dev/rnc32g_33
input datafile fno=00048 name=/dev/rnc32g_3
input datafile fno=00056 name=/dev/rnc32g_11
input datafile fno=00064 name=/dev/rnc32g_19
input datafile fno=00072 name=/dev/rnc32g_67
input datafile fno=00080 name=/dev/rnc32g_106
input datafile fno=00088 name=/dev/rnc32g_114
input datafile fno=00096 name=/dev/rnc32g_87
input datafile fno=00104 name=/dev/rnc32g_95
input datafile fno=00112 name=/dev/rnc32g_103
input datafile fno=00120 name=/dev/rnc32g_75
input datafile fno=00003 name=/dev/rnc50_sysaux
input datafile fno=00130 name=/dev/rnc32g_119
channel t11: starting piece 1 at 15-MAY-12
--When channel t12 failed, channel t11 had already finished its own backup set, so it picks up the datafiles that t12 failed on

RMAN-03009: failure of backup command on t17 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753793 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753665 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t17 disabled, job failed on it will be run on another channel
--channel t17 also fails because of disk space

channel t13: starting full datafile backupset
channel t13: specifying datafile(s) in backupset
input datafile fno=00002 name=/dev/rnc32g_22
input datafile fno=00013 name=/dev/rnc32g_44
input datafile fno=00021 name=/dev/rnc32g_54
input datafile fno=00029 name=/dev/rnc32g_62
input datafile fno=00037 name=/dev/rnc32g_30
input datafile fno=00045 name=/dev/rnc32g_38
input datafile fno=00053 name=/dev/rnc32g_8
input datafile fno=00061 name=/dev/rnc32g_16
input datafile fno=00069 name=/dev/rnc32g_64
input datafile fno=00077 name=/dev/rncundo_33g_4
input datafile fno=00085 name=/dev/rnc32g_111
input datafile fno=00093 name=/dev/rnc32g_84
input datafile fno=00101 name=/dev/rnc32g_92
input datafile fno=00109 name=/dev/rnc32g_100
input datafile fno=00117 name=/dev/rnc32g_72
input datafile fno=00006 name=/dev/rnc50_4g_1
channel t13: starting piece 1 at 15-MAY-12
--channel t13 likewise tries to back up the datafiles that t17 failed on

RMAN-03009: failure of backup command on t11 channel at 05/15/2012 04:39:59
ORA-19504: failed to create file "/rman/db_mtnb104u_1_1"
ORA-27044: unable to write the header block of file
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: 3
Addition
--With no free space left, channel t11 terminates.
--At this point RMAN itself aborts, so the termination record for channel t13 never makes it into the log.
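After reading the RMAN log, the cause looks obvious: the filesystem holding the RMAN backups ran out of space, which triggered the whole chain of errors.
Checking the remaining disk space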
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4          10.00      9.75    3%     6548     1% /
/dev/hd2          10.00      4.55   55%    84383     8% /usr
/dev/hd9var        5.00      4.04   20%     6290     1% /var
/dev/hd3           5.00      3.87   23%     1551     1% /tmp
/dev/hd1          10.00      9.91    1%      382     1% /home
/proc                 -         -    -         -     -  /proc
/dev/hd10opt       5.00      4.89    3%     3502     1% /opt
/dev/archalv      99.00     82.98   17%       96     1% /archa
/dev/fslv01       40.00     19.49   52%    72324     2% /ora10
/dev/fslv00     1800.00    467.25   75%       10     1% /rman
Now this is puzzling: /rman still has 467.25 GB free, so why did RMAN report that the device was out of space?
Analyzing the cause
RMAN-03009: failure of backup command on t12 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30481025 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30480897 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t12 disabled, job failed on it will be run on another channel

RMAN-03009: failure of backup command on t17 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753793 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753665 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t17 disabled, job failed on it will be run on another channel
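For longer logs it can help to pull the failing block numbers out mechanically instead of reading them by eye. The following is a minimal Python sketch, not part of the original troubleshooting: the `LOG` string is just the four ORA-19502 lines copied from the excerpt above, and the function name `first_failed_block` is made up for illustration. Each piece reports two block numbers, and the sketch keeps the lower one per piece, on the assumption that the lowest reported block marks where the write first failed.

```python
import re

# The four ORA-19502 lines copied from the log excerpt (two per failing piece).
LOG = '''\
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30481025 (blocksize=8192)
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30480897 (blocksize=8192)
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753793 (blocksize=8192)
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753665 (blocksize=8192)
'''

# Captures: piece file name, block number, block size.
PATTERN = re.compile(
    r'ORA-19502: write error on file "([^"]+)", blockno (\d+) \(blocksize=(\d+)\)'
)


def first_failed_block(log_text):
    """Return {piece: (lowest failing blockno, blocksize)} (illustrative helper)."""
    result = {}
    for piece, blockno, blocksize in PATTERN.findall(log_text):
        blockno, blocksize = int(blockno), int(blocksize)
        # Assumption: the lowest block number is where the write first failed.
        if piece not in result or blockno < result[piece][0]:
            result[piece] = (blockno, blocksize)
    return result


failed = first_failed_block(LOG)
for piece, (blockno, blocksize) in failed.items():
    print(piece, blockno, blocksize)
```

The two block numbers this yields, 30480897 and 30753665, are the ones the size estimate relies on.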
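Both channels were writing their backup pieces to disk when, at 05/15/2012 04:39:58, the filesystem ran out of space. The writes failed at blocks 30480897 and 30753665 respectively, which means the two channels had already written 30480896 and 30753664 blocks. The combined size of the two pieces at that moment was therefore (30480896+30753664)*8192/1024/1024/1024=467.1826171875 GB. In other words, of the 467.25 GB that the filesystem now shows as free, 467.1826171875 GB had already been filled with backup data when the failure occurred, and the next write had nowhere to go. To keep the backup valid, RMAN then automatically deleted the 467.1826171875 GB of incomplete backup pieces it had written. That is why, after the backup ended, the filesystem showed plenty of free space even though RMAN had reported that the device was full.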
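The arithmetic can be double-checked with a few lines of Python. This is only a sketch under stated assumptions: block numbers are taken to start at 1 and to be written sequentially, so every block before the first failing one is counted as already on disk, and the `df` figures are treated as binary gigabytes (1024^3 bytes). The variable and function names are illustrative.

```python
BLOCK_SIZE = 8192          # blocksize reported in the ORA-19502 messages
FREE_AFTER_GIB = 467.25    # free space on /rman reported by df after the failure

# Lowest failing block number per channel, taken from the log.
first_failed = {"t12": 30480897, "t17": 30753665}


def gib_written(first_failed_block, block_size=BLOCK_SIZE):
    """GiB on disk before the failing block.

    Assumption: blocks are numbered from 1 and written in order, so
    (first_failed_block - 1) blocks were written successfully.
    """
    return (first_failed_block - 1) * block_size / 1024**3


per_channel = {ch: gib_written(blk) for ch, blk in first_failed.items()}
total_gib = sum(per_channel.values())

print(f"total written by t12 + t17: {total_gib:.10f} GiB")
print(f"free space reported afterwards: {FREE_AFTER_GIB} GiB")
print(f"difference: {FREE_AFTER_GIB - total_gib:.10f} GiB")
```

The total comes out at 467.1826171875, matching the figure in the analysis; the small remainder against the 467.25 reported by `df` is well under 0.1 GB.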