标签云
asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 kfed MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-01110 ORA-01555 ORA-01578 ORA-08103 ORA-600 2131 ORA-600 2662 ORA-600 2663 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 Oracle 恢复 ORACLE恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (102)
- 数据库 (1,682)
- DB2 (22)
- MySQL (73)
- Oracle (1,544)
- Data Guard (52)
- EXADATA (8)
- GoldenGate (24)
- ORA-xxxxx (159)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (15)
- ORACLE 21C (3)
- Oracle 23ai (7)
- Oracle ASM (67)
- Oracle Bug (8)
- Oracle RAC (53)
- Oracle 安全 (6)
- Oracle 开发 (28)
- Oracle 监听 (28)
- Oracle备份恢复 (565)
- Oracle安装升级 (92)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (79)
- PostgreSQL (18)
- PostgreSQL恢复 (6)
- SQL Server (27)
- SQL Server恢复 (8)
- TimesTen (7)
- 达梦数据库 (2)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (37)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (20)
-
最近发表
- 断电引起的ORA-08102: 未找到索引关键字, 对象号 39故障处理
- ORA-00227: corrupt block detected in control file
- 手工删除19c rac
- 解决oracle数据文件路径有回车故障
- .wstop扩展名勒索数据库恢复
- Oracle Recovery Tools工具一键解决ORA-00376 ORA-01110故障(文件offline)
- OGG-02771 Input trail file format RELEASE 19.1 is different from previous trail file form at RELEASE 11.2.
- OGG-02246 Source redo compatibility level 19.0.0 requires trail FORMAT 12.2 or higher
- GoldenGate 19安装和打patch
- dd破坏asm磁盘头恢复
- 删除asmlib磁盘导致磁盘组故障恢复
- Kylin Linux 安装19c
- ORA-600 krse_arc_complete.4
- Oracle 19c 202410补丁(RUs+OJVM)
- ntfs MFT损坏(ntfs文件系统故障)导致oracle异常恢复
- .mkp扩展名oracle数据文件加密恢复
- 清空redo,导致ORA-27048: skgfifi: file header information is invalid
- A_H_README_TO_RECOVER勒索恢复
- 通过alert日志分析客户自行对一个数据库恢复的来龙去脉和点评
- ORA-12514: TNS: 监听进程不能解析在连接描述符中给出的SERVICE_NAME
分类目录归档:PostgreSQL恢复
PostgreSQL恢复系列:pg_filedump基本使用
当PostgreSQL遇到重大故障,使用各种方法都无法直接启动数据库,可以考虑使用类似oracle dul工具,直接离线方式读取文件进行恢复.这个工具为pg_filedump
pg_filedump安装
[root@xifenfei ~]# yum install pg_filedump_14.x86_64 Loaded plugins: langpacks, ulninfo Resolving Dependencies --> Running transaction check ---> Package pg_filedump_14.x86_64 0:14.1-1.rhel7 will be installed --> Finished Dependency Resolution Dependencies Resolved ====================================================================================================================== Package Arch Version Repository Size ====================================================================================================================== Installing: pg_filedump_14 x86_64 14.1-1.rhel7 pgdg14 43 k Transaction Summary ====================================================================================================================== Install 1 Package Total download size: 43 k Installed size: 81 k Is this ok [y/d/N]: y Downloading packages: pg_filedump_14-14.1-1.rhel7.x86_64.rpm | 43 kB 00:00:02 Running transaction check Running transaction test Transaction test succeeded Running transaction Installing : pg_filedump_14-14.1-1.rhel7.x86_64 1/1 Verifying : pg_filedump_14-14.1-1.rhel7.x86_64 1/1 Installed: pg_filedump_14.x86_64 0:14.1-1.rhel7 Complete! -bash-4.2$ pg_filedump Version 14.1 (for PostgreSQL 8.x .. 14.x) Copyright (c) 2002-2010 Red Hat, Inc. Copyright (c) 2011-2022, PostgreSQL Global Development Group Usage: pg_filedump [-abcdfhikxy] [-R startblock [endblock]] [-D attrlist] [-S blocksize] [-s segsize] [-n segnumber] file Display formatted contents of a PostgreSQL heap/index/control file Defaults are: relative addressing, range of the entire file, block size as listed on block 0 in the file The following options are valid for heap and index files: -a Display absolute addresses when formatting (Block header information is always block relative) -b Display binary block images within a range (Option will turn off all formatting options) -d Display formatted block content dump (Option will turn off all other formatting options) -D Decode tuples using given comma separated list of types Supported types: bigint bigserial bool char charN date float float4 float8 int json macaddr name numeric oid real serial smallint smallserial text time timestamp timestamptz timetz uuid varchar varcharN xid xml ~ ignores all attributes left in a tuple -f Display formatted block content dump along with interpretation -h Display this information -i Display interpreted item details -k Verify block checksums -o Do not dump old values. -R Display specific block ranges within the file (Blocks are indexed from 0) [startblock]: block to start at [endblock]: block to end at A startblock without an endblock will format the single block -s Force segment size to [segsize] -t Dump TOAST files -v Ouput additional information about TOAST relations -n Force segment number to [segnumber] -S Force block size to [blocksize] -x Force interpreted formatting of block items as index items -y Force interpreted formatting of block items as heap items The following options are valid for control files: -c Interpret the file listed as a control file -f Display formatted content dump along with interpretation -S Force block size to [blocksize] Additional functions: -m Interpret file as pg_filenode.map file and print contents (all other options will be ignored) Report bugs to <pgsql-bugs@postgresql.org>
创建测试表
-bash-4.2$ psql psql (14.3) Type "help" for help. postgres=# create table pg_xifenfei(id int,name varchar(100)); CREATE TABLE postgres=# insert into pg_xifenfei values(1,'www.xifenfei.com'); INSERT 0 1 postgres=# insert into pg_xifenfei values(2,'xienfei_pg_recovery'); INSERT 0 1 postgres=# select * from pg_xifenfei; id | name ----+--------------------- 1 | www.xifenfei.com 2 | xienfei_pg_recovery (2 rows) postgres=#
pg_filedump恢复数据
-bash-4.2$ pg_filedump /var/lib/pgsql/14/data/base/14487/16384 ******************************************************************* * PostgreSQL File/Block Formatted Dump Utility * * File: /var/lib/pgsql/14/data/base/14487/16384 * Options used: None ******************************************************************* Block 0 ******************************************************** <Header> ----- Block Offset: 0x00000000 Offsets: Lower 32 (0x0020) Block: Size 8192 Version 4 Upper 8096 (0x1fa0) LSN: logid 0 recoff 0x16299cf0 Special 8192 (0x2000) Items: 2 Free Space: 8064 Checksum: 0x0000 Prune XID: 0x00000000 Flags: 0x0000 () Length (including item array): 32 <Data> ----- Item 1 -- Length: 45 Offset: 8144 (0x1fd0) Flags: NORMAL Item 2 -- Length: 48 Offset: 8096 (0x1fa0) Flags: NORMAL *** End of File Encountered. Last Block Read: 0 *** -bash-4.2$ pg_filedump -D int,charn /var/lib/pgsql/14/data/base/14487/16384|grep COPY COPY: 1 www.xifenfei.com COPY: 2 xienfei_pg_recovery -bash-4.2$ pg_filedump -D int,charn /var/lib/pgsql/14/data/base/14487/16384|grep COPY > |awk '{$1=null;print $0}'>/tmp/pg_xifenfei_rec -bash-4.2$ sed -i 's/^[ ]*//g' /tmp/pg_xifenfei_rec
导入数据验证
postgres=# truncate table pg_xifenfei; TRUNCATE TABLE postgres=# select * from pg_xifenfei; id | name ----+------ (0 rows) postgres=# copy pg_xifenfei from '/tmp/pg_xifenfei_rec'(DELIMITER ' '); COPY 2 postgres=# select * from pg_xifenfei; id | name ----+--------------------- 1 | www.xifenfei.com 2 | xienfei_pg_recovery (2 rows)
通过上述简单测试证明,在PG数据库出现极端情况下,可以使用该方法进行最后的数据恢复,减少因为数据丢失带来的损失.
PostgreSQL恢复系列:pg_control异常恢复
在PG中pg_control文件类似oracle数据库的control文件(控制文件),在Oracle中如果该文件丢失/损坏,可以通过alter database create controlfile命令进行创建,对于PG数据库来说也可以通过pg_resetwal命令来实现创建,由于pg_control文件损坏,需要人工指定一些参数完成pg_resetwal相关操作
pg_resetwal 使用说明
-bash-4.2$ pg_resetwal --help pg_resetwal resets the PostgreSQL write-ahead log. Usage: pg_resetwal [OPTION]... DATADIR Options: -c, --commit-timestamp-ids=XID,XID set oldest and newest transactions bearing commit timestamp (zero means no change) [-D, --pgdata=]DATADIR data directory -e, --epoch=XIDEPOCH set next transaction ID epoch -f, --force force update to be done -l, --next-wal-file=WALFILE set minimum starting location for new WAL -m, --multixact-ids=MXID,MXID set next and oldest multitransaction ID -n, --dry-run no update, just show what would be done -o, --next-oid=OID set next OID -O, --multixact-offset=OFFSET set next multitransaction offset -u, --oldest-transaction-id=XID set oldest transaction ID -V, --version output version information, then exit -x, --next-transaction-id=XID set next transaction ID --wal-segsize=SIZE size of WAL segments, in megabytes -?, --help show this help, then exit Report bugs to <pgsql-bugs@lists.postgresql.org>. PostgreSQL home page: <https://www.postgresql.org/>
确认现在业务表记录情况
-bash-4.2$ psql psql (14.3) Type "help" for help. postgres=# select count(1) from ac_event; count -------- 246266 (1 row)
模拟pg_control文件异常
-bash-4.2$ ps -ef|grep postgres postgres 37178 1 0 09:58 ? 00:00:00 /usr/pgsql-14/bin/postgres -D /var/lib/pgsql/14/data postgres 37179 37178 0 09:58 ? 00:00:00 postgres: logger postgres 37181 37178 0 09:58 ? 00:00:00 postgres: checkpointer postgres 37182 37178 0 09:58 ? 00:00:00 postgres: background writer postgres 37183 37178 0 09:58 ? 00:00:00 postgres: walwriter postgres 37184 37178 0 09:58 ? 00:00:00 postgres: autovacuum launcher postgres 37185 37178 0 09:58 ? 00:00:00 postgres: stats collector postgres 37186 37178 0 09:58 ? 00:00:00 postgres: logical replication launcher root 41368 41314 0 11:06 pts/1 00:00:00 su - postgres postgres 41369 41368 0 11:06 pts/1 00:00:00 -bash postgres 45071 41369 0 12:07 pts/1 00:00:00 ps -ef postgres 45072 41369 0 12:07 pts/1 00:00:00 grep --color=auto postgres -bash-4.2$ kill -9 37178 -bash-4.2$ ps -ef|grep postgres root 41368 41314 0 11:06 pts/1 00:00:00 su - postgres postgres 41369 41368 0 11:06 pts/1 00:00:00 -bash postgres 45095 41369 0 12:08 pts/1 00:00:00 ps -ef postgres 45096 41369 0 12:08 pts/1 00:00:00 grep --color=auto postgres -bash-4.2$ pwd /var/lib/pgsql/14/data/global -bash-4.2$ ls -l pg_control -rw-------. 1 postgres postgres 8192 May 30 12:04 pg_control -bash-4.2$ rm -rf pg_control -bash-4.2$ ls -l pg_control ls: cannot access pg_control: No such file or directory
PG启动失败
-bash-4.2$ pg_ctl start pg_ctl: another server might be running; trying to start server anyway waiting for server to start....postgres: could not find the database system Expected to find it in the directory "/var/lib/pgsql/14/data", but could not open file "/var/lib/pgsql/14/data/global/pg_control": No such file or directory stopped waiting pg_ctl: could not start server Examine the log output.
创建空pg_control文件启动依旧失败
-bash-4.2$ touch /var/lib/pgsql/14/data/global/pg_control -bash-4.2$ pg_ctl start pg_ctl: another server might be running; trying to start server anyway waiting for server to start....2022-05-30 12:09:43.953 CST [45215] PANIC: could not read file "global/pg_control": read 0 of 296 stopped waiting pg_ctl: could not start server Examine the log output.
设置next-wal-file
-l, –next-wal-file=WALFILE,这个参数设置下一个新的WAL文件的最小值,这个值可以从$PGDATA/pg_wal目录下去看最后一个WAL 文件,这个文件的id+1即可
-bash-4.2$ pwd /var/lib/pgsql/14/data/pg_wal -bash-4.2$ ls -l total 16384 -rw-------. 1 postgres postgres 16777216 May 30 12:04 000000010000000000000014 drwx------. 2 postgres postgres 6 May 24 02:20 archive_status -bash-4.2$
这个文件+1,-l 000000010000000000000015
设置next-transaction
-x, –next-transaction-id=XID,这个参数设置pg_control中的下一个XID的值,这个值可以从pg_xact目录下的文件中查询
-bash-4.2$ pwd /var/lib/pgsql/14/data/pg_xact -bash-4.2$ ls -ltr total 8 -rw-------. 1 postgres postgres 8192 May 30 12:03 0000
最后一个是0000,那么下一个XID就是0001,然后乘以 1048576 (0×100000),实际上后面直接加5个0就行了。注意,这个值是16进制的。-x 0×000100000
multixact-ids设置
-m, –multixact-ids=MXID1,MXID2,这个参数包含两个部分,MXID1和MXID2,都可以从$PGDATA/pg_multixact/offsets目录下获得。MXID1的值,首先找到最大值,+1,再乘以 65536 (0×10000,相当于后面加4个0)作为这个参数的前半部分。找到最小的值,后面加4个0,作为MXID2的值
-bash-4.2$ pwd /var/lib/pgsql/14/data/pg_multixact/offsets -bash-4.2$ ls -ltr total 8 -rw-------. 1 postgres postgres 8192 May 29 22:06 0000 -bash-4.2$
-m 0×00010000, 0×00000000(由于oldest multitransaction ID不能为0,因此后续这个值需要适当调整)
multixact-offset设置
-O, –multixact-offset=OFFSET,这个参数可以从$PGDATA/pg_multixact/members目录下获得。找到最大值,+1,乘以 52352 (0xCC80)
-bash-4.2$ pwd /var/lib/pgsql/14/data/pg_multixact/members -bash-4.2$ ls -ltr total 8 -rw-------. 1 postgres postgres 8192 May 24 02:20 0000
-O 0xCC80
尝试执行pg_resetwal
-bash-4.2$ pg_resetwal -l 000000010000000000000015 -x 0x000100000 -m 0x00010000,0x00000000 -O 0xCC80 $PGDATA pg_resetwal: error: oldest multitransaction ID (-m) must not be 0
multixact-ids值不对,进行调整后处理
postmaster.pid文件需要清理
由于PG库异常关闭,需要人工清理掉该文件
-bash-4.2$ pg_resetwal -l 000000010000000000000015 -x 0x000100000 -m 0x00020000,0x00010000 -O 0xCC80 $PGDATA pg_resetwal: error: lock file "postmaster.pid" exists -bash-4.2$ rm -rf postmaster.pid
pg_resetwal结果预览
-bash-4.2$ pg_resetwal -l 000000010000000000000015 -x 0x000100000 -m 0x00020000,0x00010000 -O 0xCC80 $PGDATA pg_resetwal: warning: pg_control exists but is broken or wrong version; ignoring it Guessed pg_control values: pg_control version number: 1300 Catalog version number: 202107181 Database system identifier: 7103392535324046312 Latest checkpoint's TimeLineID: 1 Latest checkpoint's full_page_writes: off Latest checkpoint's NextXID: 0:3 Latest checkpoint's NextOID: 12000 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 3 Latest checkpoint's oldestXID's DB: 0 Latest checkpoint's oldestActiveXID: 0 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 0 Latest checkpoint's oldestCommitTsXid:0 Latest checkpoint's newestCommitTsXid:0 Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Size of a large-object chunk: 2048 Date/time type storage: 64-bit integers Float8 argument passing: by value Data page checksum version: 0 Values to be changed: First log segment after reset: 000000010000000000000015 NextMultiXactId: 131072 OldestMultiXid: 65536 OldestMulti's DB: 0 NextMultiOffset: 52352 NextXID: 1048576 OldestXID: 3 OldestXID's DB: 0 If these values seem acceptable, use -f to force reset.
pg_resetwal进行创建pg_control并启动PG
-bash-4.2$ pg_resetwal -l 000000010000000000000015 -x 0x000100000 -m 0x00020000,0x00010000 -O 0xCC80 -f $PGDATA pg_resetwal: warning: pg_control exists but is broken or wrong version; ignoring it Write-ahead log reset -bash-4.2$ pg_ctl start waiting for server to start....2022-05-30 13:33:28.266 CST [51437] LOG: redirecting log output to logging collector process 2022-05-30 13:33:28.266 CST [51437] HINT: Future log output will appear in directory "log". done server started
验证数据
-bash-4.2$ psql psql (14.3) Type "help" for help. postgres=# select count(1) from ac_event; count -------- 245275 (1 row)
这种方法恢复之后,建议理解dump数据,然后导入到新库中
PostgreSQL恢复系列:wal日志丢失恢复
WAL是Write Ahead Log的简写,和oracle的redo日志类似,存放在$PGDATA/pg_xlog中,10版本以后在$PGDATA/pg_wal目录.在oracle数据库中,如果redo丢失,分为active/current和inactive的redo,分别有不同的处理方式,对于oracle需要实例恢复的redo丢失,需要屏蔽数据库一致性,强制打开数据库,对于PG数据库这部分日志丢失该如何恢复,主要是通过pg_resetwal/pg_resetxlog(10以前版本)命令来实现,这里通过一个测试来验证
创建测试表并强制kill数据库
-bash-4.2$ psql psql (14.3) Type "help" for help. postgres=# create table t_xifenfei as select * from pg_database; SELECT 4 postgres=# select count(1) from t_xifenfei; count ------- 4 (1 row) postgres=# \q -bash-4.2$ ps -ef|grep post root 1819 1 0 May28 ? 00:00:00 /usr/libexec/postfix/master -w postfix 1838 1819 0 May28 ? 00:00:00 qmgr -l -t unix -u postgres 11102 1 0 05:49 ? 00:00:00 /usr/pgsql-14/bin/postgres -D /var/lib/pgsql/14/data postgres 11103 11102 0 05:49 ? 00:00:00 postgres: logger postgres 11105 11102 0 05:49 ? 00:00:00 postgres: checkpointer postgres 11106 11102 0 05:49 ? 00:00:00 postgres: background writer postgres 11107 11102 0 05:49 ? 00:00:00 postgres: walwriter postgres 11108 11102 0 05:49 ? 00:00:00 postgres: autovacuum launcher postgres 11109 11102 0 05:49 ? 00:00:01 postgres: stats collector postgres 11110 11102 0 05:49 ? 00:00:00 postgres: logical replication launcher root 22743 22300 0 18:26 pts/3 00:00:00 su - postgres postgres 22744 22743 0 18:26 pts/3 00:00:00 -bash postgres 22937 22744 0 18:28 pts/3 00:00:00 psql postgres 22938 11102 0 18:28 ? 00:00:00 postgres: postgres postgres [local] idle postfix 32623 1819 0 21:10 ? 00:00:00 pickup -l -t unix -u root 33032 32912 0 21:15 pts/2 00:00:00 su - postgres postgres 33033 33032 0 21:15 pts/2 00:00:00 -bash postgres 35210 33033 0 21:51 pts/2 00:00:00 ps -ef postgres 35211 33033 0 21:51 pts/2 00:00:00 grep --color=auto post -bash-4.2$ kill -9 11102
删除wal日志
-bash-4.2$ pwd /var/lib/pgsql/14/data/pg_wal -bash-4.2$ ls -ltr total 311296 drwx------. 2 postgres postgres 6 May 24 02:20 archive_status -rw-------. 1 postgres postgres 16777216 May 28 21:29 000000010000000000000014 -rw-------. 1 postgres postgres 16777216 May 28 21:29 000000010000000000000015 -rw-------. 1 postgres postgres 16777216 May 28 21:29 000000010000000000000016 -rw-------. 1 postgres postgres 16777216 May 28 21:29 000000010000000000000017 -rw-------. 1 postgres postgres 16777216 May 28 21:29 000000010000000000000018 -rw-------. 1 postgres postgres 16777216 May 28 21:29 000000010000000000000019 -rw-------. 1 postgres postgres 16777216 May 28 21:29 00000001000000000000001A -rw-------. 1 postgres postgres 16777216 May 28 21:29 00000001000000000000001B -rw-------. 1 postgres postgres 16777216 May 28 21:29 00000001000000000000001C -rw-------. 1 postgres postgres 16777216 May 28 21:29 00000001000000000000001D -rw-------. 1 postgres postgres 16777216 May 28 21:29 00000001000000000000001E -rw-------. 1 postgres postgres 16777216 May 28 21:29 00000001000000000000001F -rw-------. 1 postgres postgres 16777216 May 28 21:29 000000010000000000000020 -rw-------. 1 postgres postgres 16777216 May 28 21:29 000000010000000000000021 -rw-------. 1 postgres postgres 16777216 May 28 21:29 000000010000000000000022 -rw-------. 1 postgres postgres 16777216 May 28 21:30 000000010000000000000023 -rw-------. 1 postgres postgres 16777216 May 28 21:30 000000010000000000000024 -rw-------. 1 postgres postgres 16777216 May 28 21:30 000000010000000000000025 -rw-------. 1 postgres postgres 16777216 May 29 21:51 000000010000000000000013 -bash-4.2$ rm -rf 0000000100000000000000* -bash-4.2$ ls archive_status
查询当时数据库需要的最小wal记录
-bash-4.2$ pg_controldata pg_control version number: 1300 Catalog version number: 202107181 Database system identifier: 7100998319216817119 Database cluster state: in production pg_control last modified: Sat 28 May 2022 09:36:11 PM CST Latest checkpoint location: 0/13692F80 Latest checkpoint's REDO location: 0/13692F48 Latest checkpoint's REDO WAL file: 000000010000000000000013 <===需要的记录 Latest checkpoint's TimeLineID: 1 Latest checkpoint's PrevTimeLineID: 1 Latest checkpoint's full_page_writes: on Latest checkpoint's NextXID: 0:17824 Latest checkpoint's NextOID: 32769 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 727 Latest checkpoint's oldestXID's DB: 1 Latest checkpoint's oldestActiveXID: 17824 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 1 Latest checkpoint's oldestCommitTsXid:0 Latest checkpoint's newestCommitTsXid:0 Time of latest checkpoint: Sat 28 May 2022 09:31:41 PM CST
尝试启动PG
-bash-4.2$ pg_ctl start pg_ctl: another server might be running; trying to start server anyway waiting for server to start....2022-05-29 21:52:22.926 CST [35270] LOG: redirecting log output to logging collector process 2022-05-29 21:52:22.926 CST [35270] HINT: Future log output will appear in directory "log". . stopped waiting pg_ctl: could not start server Examine the log output.
启动pg失败,查看日志记录
2022-05-29 21:52:22.926 CST [35270] LOG: starting PostgreSQL 14.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit 2022-05-29 21:52:22.927 CST [35270] LOG: listening on IPv6 address "::1", port 5432 2022-05-29 21:52:22.927 CST [35270] LOG: listening on IPv4 address "127.0.0.1", port 5432 2022-05-29 21:52:22.929 CST [35270] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2022-05-29 21:52:22.931 CST [35270] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2022-05-29 21:52:22.936 CST [35272] LOG: database system was interrupted; last known up at 2022-05-28 21:36:11 CST 2022-05-29 21:52:23.049 CST [35272] LOG: invalid primary checkpoint record 2022-05-29 21:52:23.049 CST [35272] PANIC: could not locate a valid checkpoint record 2022-05-29 21:52:24.211 CST [35270] LOG: startup process (PID 35272) was terminated by signal 6: Aborted 2022-05-29 21:52:24.211 CST [35270] LOG: aborting startup due to startup process failure 2022-05-29 21:52:24.218 CST [35270] LOG: database system is shut down
错误比较明显,无法定位到有效的checkpoint记录,在oracle里面的意思可以理解为无法进行实例恢复,pg启动失败
重设wal
由于数据库为不一致状态,需要使用-f进行强制重设
-bash-4.2$ pg_resetwal $PGDATA The database server was not shut down cleanly. Resetting the write-ahead log might cause data to be lost. If you want to proceed anyway, use -f to force reset. -bash-4.2$ pg_resetwal -f $PGDATA Write-ahead log reset
启动PG成功
-bash-4.2$ pg_ctl start -D $PGDATA waiting for server to start....2022-05-29 22:01:02.647 CST [37178] LOG: redirecting log output to logging collector process 2022-05-29 22:01:02.647 CST [37178] HINT: Future log output will appear in directory "log". done server started
日志记录
2022-05-29 22:01:02.647 CST [37178] LOG: starting PostgreSQL 14.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit 2022-05-29 22:01:02.648 CST [37178] LOG: listening on IPv6 address "::1", port 5432 2022-05-29 22:01:02.648 CST [37178] LOG: listening on IPv4 address "127.0.0.1", port 5432 2022-05-29 22:01:02.649 CST [37178] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2022-05-29 22:01:02.651 CST [37178] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2022-05-29 22:01:02.653 CST [37180] LOG: database system was shut down at 2022-05-29 22:00:47 CST 2022-05-29 22:01:02.661 CST [37178] LOG: database system is ready to accept connections
查看wal日志,产生新记录
-bash-4.2$ pwd /var/lib/pgsql/14/data/pg_wal -bash-4.2$ ls -ltr total 16384 drwx------. 2 postgres postgres 6 May 24 02:20 archive_status -rw-------. 1 postgres postgres 16777216 May 29 22:01 000000010000000000000014
验证刚刚创建测试表
-bash-4.2$ psql psql (14.3) Type "help" for help. postgres=# select count(1) from t_xifenfei; ERROR: relation "t_xifenfei" does not exist LINE 1: select count(1) from t_xifenfei; ^
由于需要进行实例恢复的wal日志丢失导致这表记录也丢失.由此可见这类操作可能导致数据丢失风险,对于生产环境,需要慎重,