分类目录归档:数据库

PostgreSQL恢复工具—pdu恢复单个表文件

在某些情况下,比如我们需要对单个的PostgreSQL库的表文件进行恢复(比如文件系统损坏,drop库/表,truncate表等原因,然后找到了部分oid文件),可以使用pdu对其进行完美恢复(相比pg_filedump也方便很多),具体操作步骤:
1. 由于只有单个表文件,无法获取字典信息,因此需要应用厂商/客户提供具体表创建语句

his5_dms=#    CREATE TABLE t_xff (
his5_dms(#         id                       bigint,
his5_dms(#         hospital_id              bigint,
his5_dms(#         parent_id                bigint,
his5_dms(#         disease_code             varchar(60),
his5_dms(#         disease_name             varchar(60),
his5_dms(#         type                     smallint,
his5_dms(#         py                       varchar(60),
his5_dms(#         wb                       varchar(60),
his5_dms(#         sc                       varchar(20),
his5_dms(#         order_no                 int,
his5_dms(#         state                    smallint,
his5_dms(#         create_datetime          timestamp(6),
his5_dms(#         create_id                bigint,
his5_dms(#         edit_datetime            timestamp(6),
his5_dms(#         edit_id                  bigint,
his5_dms(#         search_path              varchar(300),
his5_dms(#         diagnosis_sort           int,
his5_dms(#         category_name            varchar(40),
his5_dms(#         input_option             varchar(40),
his5_dms(#         category_class           smallint,
his5_dms(#         memo1                    varchar(300),
his5_dms(#         memo2                    varchar(300),
his5_dms(#         other_code               varchar(60),
his5_dms(#         other_name               varchar(60),
his5_dms(#         special_disease_flag     smallint
his5_dms(#    );
CREATE TABLE

2. 把oid文件pdu放到restore库中

[root@xifenfeidg public]# pwd
/tmp/pdu/restore/public
[root@xifenfeidg public]# ls -l
total 7144
-rw-r--r--. 1 root root 7315456 Mar  2 21:04 123456
[root@xifenfeidg public]# 

3. 使用add语句在pdu加载数据类型

restore.public=# add 123456 t_xff bigint,bigint,bigint,varchar,varchar,smallint,varchar,varchar,varchar,
int,smallint,timestamp,bigint,timestamp,bigint,varchar,int,varchar,varchar,
smallint,varchar,varchar,varchar,varchar,smallint;
添加完成,请用\dt;查看可unload的表
restore.public=# \dt;
|--------------------------------------------------|
|               表名                  |  表大小    |
|--------------------------------------------------|
|    t_xff                            |  6.98 MB   |
|--------------------------------------------------|

        仅显示表大小排名前 1 的表名

4.使用pdu恢复表数据

restore.public=# unload t_xff;
正在解析表 <t_xff>. 已解析数据页: 893, 已解析数据: 46998 条
<t_xff>-<restore/public/123456> 解析完成, 894 个数据页 ,共计 46998 条数据. 成功 46998 条; 失败【0】条 
 COPY文件路径为:<restore/public/t_xff.csv>
restore.public=# unload COPY;

COPY命令导出完成, 文件路径: restore/COPY/public_copy.sql

5.导入数据到pg库中

his5_dms=# \i restore/COPY/public_copy.sql
SET
COPY 46998
his5_dms=# select count(1) from t_xff;
 count 
-------
 46998
(1 row)
his5_dms=# \x
Expanded display is on.
his5_dms=# select * from t_xff limit 1;
-[ RECORD 1 ]--------+---------------------------
id                   | 323839
hospital_id          | 0
parent_id            | 301
disease_code         | 57.8900x003
disease_name         | 腹腔镜下膀胱颈悬吊术
type                 | 2
py                   | fqjxpgjxds
wb                   | eeqgeeceks
sc                   | 
order_no             | 0
state                | 1
create_datetime      | 2022-09-29 15:22:58.588492
create_id            | 
edit_datetime        | 
edit_id              | 
search_path          | 301,
diagnosis_sort       | 
category_name        | 
input_option         | 
category_class       | 3
memo1                | 
memo2                | 
other_code           | 
other_name           | 
special_disease_flag | 0
发表在 PostgreSQL恢复 | 标签为 , , , , | 留下评论

PostgreSQL恢复工具—pdu工具介绍

张晨同学开发了一个PostgreSQL数据恢复工具PDU(PDU: Postgresql Data Rescue Tool),我这边配合做一些测试
使用帮助命令

PDU.public=# ;

PDU数据拯救工具 | 命令帮助
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
**基础操作**
b;                                      │ 初始化数据库元信息
exit;                                   │ 退出工具

**数据库切换**
use <db>;                               │ 指定目标数据库 (例: use logs;)
set <schema>;                           │ 指定操作模式 (例: set recovery;)

**元数据展示**
\l;                                     │ 列出所有数据库
\dn;                                    │ 显示当前数据库模式
\dt;                                    │ 列出当前模式下的表
\d+ <table>;                            │ 查看表结构详情 (例: \d+ users;)
\d <table>;                             │ 查看表列类型 (例: \d users;)

**数据导出**
unload <table>;                         │ 导出表数据 → ./<表名>.csv (例: unload orders;)
unload SCH;                             │ 导出当前模式所有数据
unload DDL;                             │ 生成模式结构定义文件
unload COPY;                            │ 生成PSQL COPY语句脚本

**误删数据恢复**
scan t1;                                │ 扫描被误删的表
restore del <Tx Number>;                │ 通过事务号恢复被误删的数据
--------------------------------------------------------------------------------------
scan drop;                              │ 扫描wal日志中的drop事务
restore drop <Tx Number>;               │ 通过事务号恢复被drop的表
add <oid> <tablename> <attibutes>;      │ 将表信息手动添加到restore库中
例如: <add 12345 t1 varchar,varchar,timestamp,varchar,numeric,varchar,varchar,varchar,numeric;>

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
语法规则
◈ 指令后缀必须带 `;`

加载PostgreSQL元数据

PDU.public=# b;
开始初始化...
 -pg_database:</var/lib/pgsql/12/data/global/1262>
    【postgres】
      -pg_schema:</var/lib/pgsql/12/data/base/14399/2615>
      -pg_class:</var/lib/pgsql/12/data/base/14399/1259>,共55行
      -pg_attribute:</var/lib/pgsql/12/data/base/14399/1249>,共2913行
      模式:
        -->public,2张表
    【his5_dms】
      -pg_schema:</var/lib/pgsql/12/data/base/16386/2615>
      -pg_class:</var/lib/pgsql/12/data/base/16386/1259>,共793行
      -pg_attribute:</var/lib/pgsql/12/data/base/16386/1249>,共31329行
      模式:
        -->public,660张表
        -->hiscrm,55张表
        -->pgagent,8张表
        -->report,7张表
        -->statistics,10张表

查看当前有哪些库

PDU.public=# \l;
|------------------|
|     数据库名     |
|------------------|
|    postgres      |
|    template1     |
|    template0     |
|    his5_dms      |
|    restore       |
|------------------|

      5 rows selected

进入某个库

PDU.public=# use his5_dms;
|----------------------------------------|
|          模式             |  表数量    |
|----------------------------------------|
|    public                 |  660       |
|    hiscrm                 |  55        |
|    pgagent                |  8         |
|    report                 |  7         |
|    statistics             |  10        |
|----------------------------------------|

进入某个模式

his5_dms.public=# set hiscrm;
|--------------------------------------------------|
|               表名                  |  表大小    |
|--------------------------------------------------|
|    t_patient_other                  |  600.00 KB |
|    t_sys_oper_log                   |  88.00 KB  |
|    t_auth                           |  88.00 KB  |
|    t_setting_user                   |  56.00 KB  |
|    t_field_define                   |  32.00 KB  |
|    t_oper_log                       |  16.00 KB  |
|    t_role                           |  16.00 KB  |
|    t_sys_login_log                  |  8.00 KB   |
|    t_appointment_item               |  8.00 KB   |
|    t_clinic                         |  8.00 KB   |
|    t_dept                           |  8.00 KB   |
|    t_employee                       |  8.00 KB   |
|    t_menu                           |  8.00 KB   |
|    t_patient_label                  |  8.00 KB   |
|    t_patient_label_detail_tpl       |  8.00 KB   |
|    t_patient_source_ref             |  8.00 KB   |
|    t_return_visit_tpl               |  8.00 KB   |
|    t_setting_clinic                 |  8.00 KB   |
|    t_setting_notify                 |  8.00 KB   |
|    t_sms_template_category          |  8.00 KB   |
|--------------------------------------------------|

        仅显示表大小排名前 50 的表名

显示部分表

his5_dms.hiscrm=# \dt;
|--------------------------------------------------|
|               表名                  |  表大小    |
|--------------------------------------------------|
|    t_patient_other                  |  600.00 KB |
|    t_sys_oper_log                   |  88.00 KB  |
|    t_auth                           |  88.00 KB  |
|    t_setting_user                   |  56.00 KB  |
|    t_field_define                   |  32.00 KB  |
|    t_oper_log                       |  16.00 KB  |
|    t_role                           |  16.00 KB  |
|    t_sys_login_log                  |  8.00 KB   |
|    t_appointment_item               |  8.00 KB   |
|    t_clinic                         |  8.00 KB   |
|    t_dept                           |  8.00 KB   |
|    t_employee                       |  8.00 KB   |
|    t_menu                           |  8.00 KB   |
|    t_patient_label                  |  8.00 KB   |
|    t_patient_label_detail_tpl       |  8.00 KB   |
|    t_patient_source_ref             |  8.00 KB   |
|    t_return_visit_tpl               |  8.00 KB   |
|    t_setting_clinic                 |  8.00 KB   |
|    t_setting_notify                 |  8.00 KB   |
|    t_sms_template_category          |  8.00 KB   |
|--------------------------------------------------|

        仅显示表大小排名前 50 的表名

显示某个表的信息

his5_dms.hiscrm=# \d+ t_auth;
----------------------------------------------------------------
|                            建表语句                           |
----------------------------------------------------------------
   CREATE TABLE t_auth (
        id                       bigint,
        clinic_id                bigint,
        group_id                 bigint,
        parient_id               varchar(64),
        menu_id                  varchar(64),
        auth_key                 varchar(60),
        auth_name                varchar(64),
        uris                     varchar,
        rely                     varchar(255),
        state                    bigint,
        sort                     bigint,
        tag                      bigint,
        explain                  varchar(255),
        desc                     varchar(255)
   );
----------------------------------------------------------------
|                                                              |
----------------------------------------------------------------
his5_dms.hiscrm=# \d t_auth;
----------------------------------------------------------------
|                            列类型                             |
----------------------------------------------------------------
bigint,bigint,bigint,varchar,varchar,varchar,varchar,varchar,varchar,bigint,bigint,bigint,varchar,varchar

恢复表数据

his5_dms.hiscrm=# unload t_auth;
正在解析表 <t_auth>. 已解析数据页: 11, 已解析数据: 492 条
<t_auth>-</var/lib/pgsql/12/data/base/16386/16895> 解析完成, 12 个数据页 ,共计 492 条数据. 成功 492 条; 失败【0】条 
 COPY文件路径为:<his5_dms/hiscrm/t_auth.csv>

确认恢复表的数据情况

[root@xifenfeidg hiscrm]# wc -l t_auth.csv
492 t_auth.csv

QQ20250228-224837


PostgreSQL中查询表实际数据情况

his5_dms=# select count(1) from hiscrm.t_auth;
 count 
-------
   492
(1 row)
is5_dms=# \d hiscrm.t_auth;
                                Table "hiscrm.t_auth"
   Column   |          Type          | Collation | Nullable |         Default         
------------+------------------------+-----------+----------+-------------------------
 id         | bigint                 |           | not null | 
 clinic_id  | bigint                 |           |          | 
 group_id   | bigint                 |           |          | 
 parient_id | character varying(64)  |           | not null | 
 menu_id    | character varying(64)  |           | not null | 
 auth_key   | character varying(60)  |           | not null | 
 auth_name  | character varying(64)  |           | not null | 
 uris       | text                   |           | not null | 
 rely       | character varying(255) |           |          | NULL::character varying
 state      | bigint                 |           | not null | 
 sort       | bigint                 |           | not null | 
 tag        | bigint                 |           |          | '0'::bigint
 explain    | character varying(255) |           |          | NULL::character varying
 desc       | character varying(255) |           |          | NULL::character varying

truncate表

his5_dms=# truncate table hiscrm.t_auth;
TRUNCATE TABLE
his5_dms=# select count(1) from hiscrm.t_auth;
 count 
-------
     0
(1 row)

导入数据,并验证恢复效果

his5_dms=# \i /tmp/pdu/his5_dms/COPY/hiscrm_copy.sql 
SET
COPY 492
his5_dms=# select count(1) from  hiscrm.t_auth;
 count 
-------
   492
(1 row)

通过验证pdu可以在数据库离线的情况下,恢复PostgreSQL数据库中表的数据,更加方便和灵活的实现替代pg_filedump功能,而且pdu还在不断完善和新功能更新中

发表在 PostgreSQL恢复 | 标签为 , , , , | 留下评论

近1万个数据文件的恢复case

朋友介绍一个恢复case,数据库发生过硬件故障,做过硬件恢复之后,数据库无法正常启动.我恢复的已经不是第一现场,客户和我反馈说找过三批人进行恢复,都没有正常打开数据库.数据库整体不大(1T左右),但是数据文件近1万个(9895个数据文件),我看了下alert日志,主要报错有:
ORA-600 kcratr_scan_lastbwr,该错误比较常见,一般是由于坏块或者redo和数据文件不匹配导致,在某些情况下recover下就可以解决,有些时候不行,看人品

Mon Feb 17 15:51:15 2025
Started redo scan
Hex dump of (file 3, block 240) in trace file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc
Reading datafile 'F:\ORACLEDATA\ORCL\UNDOTBS01.DBF' for corruption at rdba: 0x00c000f0 (file 3, block 240)
Reread (file 3, block 240) found same corrupt data (logically corrupt)
Write verification failed for File 3 Block 240 (rdba 0xc000f0)
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc  (incident=293029):
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in: F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_293029\orcl_ora_10508_i293029.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Aborting crash recovery due to error 600
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc:
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Mon Feb 17 15:51:22 2025
Sweep [inc2][293029]: completed
Mon Feb 17 15:51:25 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc:
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []

ORA-600 krr_parse_3错误,官方没有查询到资料,但是从报错的位置分析,应该和redo的应用有直接关系

Thu Feb 20 11:45:03 2025
ALTER DATABASE RECOVER  datafile 116  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2084282 Reading mem 0
  Mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10840.trc  (incident=321616):
ORA-00600: 内部错误代码, 参数: [krr_parse_3], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 116  ...
ALTER DATABASE RECOVER  datafile 1168  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2084282 Reading mem 0
  Mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10840.trc  (incident=321617):
ORA-00600: 内部错误代码, 参数: [krr_parse_3], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 1168  ...

上述两个错误,由于数据库部分文件被offline,而且屏蔽一致性打开等操作,绕过了上述的两个ORA-600错误,现在停留在ORA-00604 ORA-00376 ORA-01110故障导致数据库无法打开的情况,该错误是由于数据库启动过程中有事务,需要使用被offline的undo文件.

Fri Feb 21 07:42:37 2025
minact-scn: got error during useg scan e:376 usn:1
minact-scn: useg scan erroring out with error e:376
Fri Feb 21 07:44:02 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m007_11464.trc:
ORA-51106: 由于出错, 检查无法完成。请查看下面的错误
ORA-48223: 已请求中断 - 提取已中止 - 返回代码 [12751] [HM_FINDING]
Fri Feb 21 07:45:12 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_14108.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00376: 此时无法读取文件 3
ORA-01110: 数据文件 3: 'F:\ORACLEDATA\ORCL\UNDOTBS01.DBF'

分析数据库文件状态,有25个数据文件被offline,而且这些文件的resetlogs信息均不对(截取了部分文件)

SQL> set lines 150
SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_TIME                          FUZZY  CHECKPOINT_CHANGE#          ROW_NUM
-------------- ---------------------------------------- ------ ------------------ ----------------
OFFLINE        2025-02-11 15:27:00                      YES            1909526545               22
OFFLINE        2025-02-17 17:24:14                      YES            1909551234                2
OFFLINE        2025-02-17 17:27:35                      NO             1909551234                1
ONLINE         2025-02-22 17:29:25                      YES            2095190672             9869

wrong-resetlogs
offline
对于这种情况,最简单的解决方法就是使用开发的小工具Oracle Recovery Tools(Oracle Recovery Tools工具一键解决ORA-00376 ORA-01110故障(文件offline)),对这些offline的文件头信息进行修改
ora-tool
对于这类缺少归档数据文件offline的故障Oracle Recovery Tools可以快速傻瓜式恢复
尝试直接open数据库

SQL> STARTUP MOUNT PFILE='D:/PFILE.TXT'
ORACLE 例程已经启动。

Total System Global Area      82309009408 bytes
Fixed Size                        2290160 bytes
Variable Size                 12884905488 bytes
Database Buffers              69256347648 bytes
Redo Buffers                    165466112 bytes
数据库装载完毕。
SQL> RECOVER DATAFILE 3;
完成介质恢复。
SQL> RECOVER datafile 6601,7043,7044,7045,7050,
   7053,7054,7055,7056,7059,7060,7061,7062,7063,7064,7071,7072,7187
  ,7188,7190,7191,7192,7244,9501 ;
完成介质恢复。
SQL> alter database datafile 3,6601,7043,7044,7045,7050,
  7053,7054,7055,7056,7059,7060,7061,7062,7063,7064,7071,7072,7187
 ,7188,7190,7191,7192,7244,9501 online;
SQL> ALTER DATABASE OPEN;

数据库已更改。

QQ20250222-184800

Sat Feb 22 18:38:26 2025
alter database mount exclusive
Sat Feb 22 18:38:26 2025
MMNL started with pid=25, OS id=7524 
Successful mount of redo thread 1, with mount id 3367723362
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Sat Feb 22 18:42:34 2025
Thread 1 opened at log sequence 5
  Current log# 2 seq# 5 mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Successful open of redo thread 1
Sat Feb 22 18:42:34 2025
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Feb 22 18:42:34 2025
SMON: enabling cache recovery
[7960] Successfully onlined Undo Tablespace 12273.
Undo initialization finished serial:0 start:98760972 end:98761612 diff:640 (6 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Feb 22 18:42:41 2025
QMNC started with pid=29, OS id=8116 
Sat Feb 22 18:42:45 2025
Completed: alter database open
Sat Feb 22 18:42:47 2025
Starting background process CJQ0
Sat Feb 22 18:42:47 2025
CJQ0 started with pid=31, OS id=3264 
Sat Feb 22 18:42:47 2025
db_recovery_file_dest_size of 4977 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.

数据库已经open,后续收尾工作比较简单,不再累赘.
对于这类缺少归档数据文件offline的故障Oracle Recovery Tools可以快速傻瓜式恢复,还是比较方便的
软件下载:OraRecovery下载
使用说明:使用说明

发表在 非常规恢复 | 标签为 , , , , | 留下评论