模拟dataguard failover及flush redo功能介绍

长平狐 发布于 2013/09/17 15:10
阅读 928
收藏 0

DG目前正常运行在最大可用模式

注:Failover 是失败切换。 这种情况下切换对redo 的处理,就显的很重要。如果处理好,就不会有数据丢失。 否则就会有数据丢失。

Oracle 11g里,Data Guard 切换多了一个新的功能:flush redo。---我这里未使用。

 Flush 能把没有发送的redo 从主库传送到standby库。 只要主库能启动到mount 状态,那么Flush 就可以把没有发送的归档和current online redo 发送到备库。

Flush语法:SQL> ALTER SYSTEM FLUSH REDO TO target_db_name;

 这里的target_db_name 是我们在主库的db_unique_name 名称。 也就是在tnsnames.ora 文件配置的。 Flush 会将未发送的redo 从主库传到备库,并且等待redo 在standby 库上apply 之后返回成功。 所以只要Flush成功,那么Failover 就没有数据丢失。

 如果说我们的Primary 已经不能启动到mount 状态,那么就只能按照下面的方法来手动复制相应归档及REDO文件。


环境: DG1-主库
[oracle@dg1 ~]$ sqlplus / as sysdba
SYS@dg1>select open_mode from v$database;
OPEN_MODE
--------------------
READ WRITE
SYS@dg1>select protection_mode,database_role from v$database;
PROTECTION_MODE      DATABASE_ROLE
-------------------- ----------------
MAXIMUM AVAILABILITY PRIMARY
DG2-备库
SYS@dg2>select protection_mode,database_role from v$database;
PROTECTION_MODE      DATABASE_ROLE
-------------------- ----------------
MAXIMUM AVAILABILITY PHYSICAL STANDBY
SYS@dg2>select open_mode from v$database;
OPEN_MODE
--------------------
MOUNTED

1.在主库切换日志,备库可以正常应用归档。

---模拟failover的话这一步可以不做的,只是证明下DG工作正常。

DG1
SYS@dg1>alter system switch logfile;
System altered.
SYS@dg1>col name for a40
SYS@dg1>select * from (select name,recid from v$archived_log  order by recid desc) where rownum<5;
NAME                                          RECID
---------------------------------------- ----------
/u01/archivelog/arc_1_159_821829622.arc         310
dg2                                             309
                                                308
dg2                                             307
SYS@dg1>select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
           159
SYS@dg2>select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
           159
SYS@dg2>col name for a40
SYS@dg2>select * from (select name,recid from v$archived_log  order by recid desc) where rownum<5;
NAME                                          RECID
---------------------------------------- ----------
/u01/archivelog/arc_1_159_821829622.arc         166
                                                165
                                                164
                                                163
日志:
[oracle@dg1 ~]$ cat alert_dg.log
Tue Aug 06 09:12:50 2013
LGWR: Standby redo logfile selected to archive thread 1 sequence 160
LGWR: Standby redo logfile selected for thread 1 sequence 160 for destination LOG_ARCHIVE_DEST_2
Thread 1 advanced to log sequence 160 (LGWR switch)
  Current log# 1 seq# 160 mem# 0: /u01/oradata/dg/redo01.log
Tue Aug 06 09:12:51 2013
Archived Log entry 310 added for thread 1 sequence 159 ID 0x6776473b dest 1:
[oracle@dg2 ~]$ cat alert_dg.log
Tue Aug 06 09:12:50 2013
Standby controlfile consistent with primary
Tue Aug 06 09:12:50 2013
Media Recovery Waiting for thread 1 sequence 160 (in transit)
RFS[1]: Selected log 5 for thread 1 sequence 160 dbid 1735160627 branch 821829622
Tue Aug 06 09:12:51 2013
Archived Log entry 166 added for thread 1 sequence 159 ID 0x6776473b dest 1:
Recovery of Online Redo Log: Thread 1 Group 5 Seq 160 Reading mem 0
  Mem# 0: /u01/oradata/dg/standbyredo05.log

2.断开备库DG2的网络,同时在DG1上切换日志来触发归档,然后shutdown abort。

DG1:
SYS@dg1>alter system switch logfile;
System altered.
SYS@dg1>alter system switch logfile;
System altered.
SYS@dg1>select * from (select name,recid from v$archived_log  order by recid desc) where rownum<5;
NAME                                          RECID
---------------------------------------- ----------
/u01/archivelog/arc_1_161_821829622.arc         312
/u01/archivelog/arc_1_160_821829622.arc         311
/u01/archivelog/arc_1_159_821829622.arc         310
dg2                                             309
在表里插入一条数据,测试REDO能否恢复。
SYS@dg1>conn bys/bys
Connected.
BYS@dg1>select * from test;
         A
----------
       999
BYS@dg1>insert into test values(888);
1 row created.
BYS@dg1>commit;
Commit complete.
BYS@dg1>select * from test;
         A
----------
       999
       888
BYS@dg1>conn / as sysdba
Connected.
SYS@dg1>shutdown abort;
ORACLE instance shut down.
日志:
[oracle@dg1 ~]$ cat alert_dg.log

Tue Aug 06 09:12:51 2013
Archived Log entry 310 added for thread 1 sequence 159 ID 0x6776473b dest 1:
Tue Aug 06 09:19:40 2013
ORA-16198: LGWR received timedout error from KSR
LGWR: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (16198)
LGWR: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned----从这一句也能判断出是DG2网络有问题
Errors in file /u01/diag/rdbms/dg1/dg/trace/dg_lgwr_6182.trc:
ORA-16198: Timeout incurred on internal channel during remote archival
Error 16198 for archive log file 1 to 'dg2'
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Failed to archive log 1 thread 1 sequence 160 (16198)
Thread 1 advanced to log sequence 161 (LGWR switch)
  Current log# 2 seq# 161 mem# 0: /u01/oradata/dg/redo02.log
Tue Aug 06 09:19:45 2013
Archived Log entry 311 added for thread 1 sequence 160 ID 0x6776473b dest 1:
Tue Aug 06 09:22:27 2013------------------------从这次日志可以判断,161号归档文件已经归档完成。当前使用的是redo03.log,对应的归档日志162号。
Thread 1 advanced to log sequence 162 (LGWR switch)
  Current log# 3 seq# 162 mem# 0: /u01/oradata/dg/redo03.log
Tue Aug 06 09:22:27 2013
Archived Log entry 312 added for thread 1 sequence 161 ID 0x6776473b dest 1:

Tue Aug 06 09:23:43 2013
Shutting down instance (abort)
License high water mark = 11
USER (ospid: 7015): terminating the instance
Instance terminated by USER, pid = 7015
Tue Aug 06 09:23:44 2013
Instance shutdown complete


3.打开备库dg2的网络并查询相关状态

[oracle@dg2 ~]$ sqlplus / as sysdba
SYS@dg2>select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
            159
SYS@dg2>col name for a40
SYS@dg2>select * from (select name,recid from v$archived_log  order by recid desc) where rownum<5;
NAME                                          RECID
---------------------------------------- ----------
/u01/archivelog/arc_1_159_821829622.arc         166
/u01/archivelog/arc_1_158_821829622.arc         165
/u01/archivelog/arc_1_157_821829622.arc         164
/u01/archivelog/arc_1_156_821829622.arc         163

4.从主库将未传到备库的归档及REDO发送过去

[oracle@dg1 ~]$ cd /u01/archivelog/
[oracle@dg1 archivelog]$ ll -t
total 3472
-rw-r----- 1 oracle oinstall   91648 Aug  6 09:22 arc_1_161_821829622.arc
-rw-r----- 1 oracle oinstall 2673152 Aug  6 09:19 arc_1_160_821829622.arc
-rw-r----- 1 oracle oinstall  772608 Aug  6 09:12 arc_1_159_821829622.arc
[oracle@dg1 archivelog]$ cd /u01/oradata/dg
[oracle@dg1 dg]$ ls
control01.ctl  standbyredo04.log  sysaux01.dbf
example01.dbf  standbyredo05.log  system01.dbf
redo01.log     standbyredo06.log  temp01.dbf
redo02.log     standbyredo07.log  undotbs01.dbf
redo03.log     standctl01.ctl     users01.dbf
[oracle@dg1 dg]$ scp redo0* dg2:/u01/archivelog/
oracle@dg2's password:
redo01.log            100%   50MB  10.0MB/s   00:05   
redo02.log            100%   50MB   7.1MB/s   00:07   
redo03.log            100%   50MB   6.3MB/s   00:08   
[oracle@dg1 dg]$ cd /u01/archivelog/
[oracle@dg1 archivelog]$ scp arc_1_160_821829622.arc arc_1_161_821829622.arc  dg2:/u01/archivelog
oracle@dg2's password:
arc_1_160_821829622.a 100% 2611KB   2.6MB/s   00:00   
arc_1_161_821829622.a 100%   90KB  89.5KB/s   00:00  

5.备库上注册复制来的归档日志--解决GAP问题

注意注意注意:
此时备库打开应用方式如果是alter database recover managed standby database disconnect from session;的话,会自动应用归档日志。
如果是alter database recover managed standby database using current logfile disconnect from session;不会自动应用归档日志。


SYS@dg2>alter database register logfile '/u01/archivelog/arc_1_160_821829622.arc';
Database altered.
SYS@dg2>alter database recover managed standby database cancel;
Database altered.
SYS@dg2>alter database recover managed standby database disconnect from session;
Database altered.
SYS@dg2>alter database register logfile '/u01/archivelog/arc_1_161_821829622.arc';
Database altered.
SYS@dg2>alter database recover managed standby database cancel;
Database altered.
SYS@dg2>select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
            161               -------------------------已经 恢复到162号。
对应的日志:
Tue Aug 06 09:19:43 2013
RFS[1]: Possible network disconnect with primary database
Tue Aug 06 09:31:22 2013--------------------  这里就是成功注册了归档日志,但是没有自动应用。
alter database register logfile '/u01/archivelog/arc_1_160_821829622.arc'
There are 1 logfiles specified.
ALTER DATABASE REGISTER [PHYSICAL] LOGFILE
Completed: alter database register logfile '/u01/archivelog/arc_1_160_821829622.arc'

Tue Aug 06 09:32:19 2013
alter database recover managed standby database cancel
Tue Aug 06 09:32:19 2013
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/diag/rdbms/dg2/dg/trace/dg_mrp0_6370.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
----------------------------------- 没有自动应用。
Recovery interrupted!
Recovered data files to a consistent state at change 1639340
Errors in file /u01/diag/rdbms/dg2/dg/trace/dg_mrp0_6370.trc:
ORA-16037: user requested cancel of managed recovery operation
MRP0: Background Media Recovery process shutdown (dg)
Waiting for MRP0 pid 6370 to terminate
Managed Standby Recovery Canceled (dg)
Completed: alter database recover managed standby database cancel
T ue Aug 06 09:32:41 2013
alter database recover managed standby database disconnect from session
   -----使用这种方式开始恢复
Attempt to start background Managed Standby Recovery process (dg)
Tue Aug 06 09:32:41 2013
MRP0 started with pid=26, OS id=7093
MRP0: Background Managed Standby Recovery process started (dg)
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /u01/archivelog/arc_1_160_821829622.arc
Media Recovery Waiting for thread 1 sequence 161
Completed: alter database recover managed standby database disconnect from session
Tue Aug 06 09:33:14 2013-                                                                                                         -----------可以看到新注册的归档日志自动进行了恢复应用。
alter database register logfile '/u01/archivelog/arc_1_161_821829622.arc'
There are 1 logfiles specified.
ALTER DATABASE REGISTER [PHYSICAL] LOGFILE
Resynchronizing thread 1 from sequence 160 to 161
Completed: alter database register logfile '/u01/archivelog/arc_1_161_821829622.arc'

Tue Aug 06 09:33:18 2013
Media Recovery Log /u01/archivelog/arc_1_161_821829622.arc
Media Recovery Waiting for thread 1 sequence 162


6.使用主库的当前REDO恢复,failover并打开数据库。验证数据是否完整。

SYS@dg2>recover standby database until cancel;
ORA-00279: change 1639530 generated at 08/06/2013 09:22:27 needed for thread 1
ORA-00289: suggestion : /u01/archivelog/arc_1_162_821829622.arc
ORA-00280: change 1639530 for thread 1 is in sequence #162

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/archivelog/redo03.log                                                                                       --------------输入日志名(从主库传送来的主库当前REDO),前面DG1 的日志中有提到。
Log applied.
Media recovery complete.
SYS@dg2>alter database activate standby database;
Database altered.
SYS@dg2>alter database open;
Database altered.
SYS@dg2>select protection_mode,database_role,open_mode from v$database;
PROTECTION_MODE      DATABASE_ROLE    OPEN_MODE
-------------------- ---------------- --------------------
MAXIMUM AVAILABILITY PRIMARY          READ WRITE
查看DG1崩溃前的DML操作,数据无丢失。
SYS@dg2>conn bys/bys
Connected.
BYS@dg2>select * from test;
         A
----------
       999

       888


7.相关日志:
Tue Aug 06 09:36:28 2013
ALTER DATABASE RECOVER  standby database until cancel

Media Recovery Start
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
ORA-279 signalled during: ALTER DATABASE RECOVER  standby database until cancel  ...
Tue Aug 06 09:36:55 2013
ALTER DATABASE RECOVER CANCEL
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL
Tue Aug 06 09:37:31 2013
ALTER DATABASE RECOVER  standby database until cancel  
Media Recovery Start
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
ORA-279 signalled during: ALTER DATABASE RECOVER  standby database until cancel  ...
Tue Aug 06 09:37:51 2013
ALTER DATABASE RECOVER    LOGFILE '/u01/archivelog/redo03.log'  

Media Recovery Log /u01/archivelog/redo03.log
Incomplete recovery applied all redo ever generated.
Recovery completed through change 1639636 time 08/06/2013 09:23:42
Media Recovery Complete (dg)
Completed: ALTER DATABASE RECOVER    LOGFILE '/u01/archivelog/redo03.log'

Tue Aug 06 09:43:40 2013
alter database activate standby database

ALTER DATABASE ACTIVATE [PHYSICAL] STANDBY DATABASE (dg)
krsv_proc_kill: Killing 2 processes (all RFS)
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
RESETLOGS after complete recovery through change 1639636
Resetting resetlogs activation ID 1735804731 (0x6776473b)
Online log /u01/oradata/dg/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/oradata/dg/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u01/oradata/dg/redo03.log: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 1639634
Tue Aug 06 09:43:40 2013
Setting recovery target incarnation to 4
ACTIVATE STANDBY: Complete - Database mounted as primary
Completed: alter database activate standby database
原文链接:http://blog.csdn.net/q947817003/article/details/11526957
加载中
返回顶部
顶部