Thursday, February 26, 2015

Backdated Oracle Scheduler



Oracle scheduler is Oracle's job queue framework for scheduling various jobs. With scheduler, we can even schedule the jobs outside the scope of the database, like running a perl or shell script. While creating the job, we define the job's run interval and as per that only the job runs. We can query DBA_SCHEDULER_JOBS view to see all the jobs in system and their corresponding schedules. Here is a sample output of the same from my server:

 SQL> select JOB_NAME, STATE, LAST_START_DATE, NEXT_RUN_DATE from DBA_SCHEDULER_JOBS where OWNER <> 'SYS';  
 JOB_NAME                     STATE         LAST_START_DATE                         NEXT_RUN_DATE  
 ---------------------------------------- --------------- ---------------------------------------- ----------------------------------------  
 ORA_ASM_REPORT               SCHEDULED    21-FEB-15 01.00.00.984664 AM US/CENTRAL 22-FEB-15 01.00.00.000000 AM US/CENTRAL  
 ORA_BACKUP_DAILY             SCHEDULED    21-FEB-15 02.15.00.639829 AM US/CENTRAL 22-FEB-15 03.30.00.700000 AM US/CENTRAL  
 SQL> select SYSDATE from dual;  
 SYSDATE  
 ---------  
 26-FEB-15  

The DBA_SCHEDULER_JOBS.LAST_START_DATE stores the date information related to the last run of that job and similarly DBA_SCHEDULER_JOBS.NEXT_RUN_DATE stores the date for the next scheduled run of that job. As you can see in above example that the current date is Feb-26-14 and the NEXT_RUN_DATE is set to Feb-22-14, means these job's next run date is in past and these will never run if we don't fix that.

This scenario can happen mainly if the actual server was in long maintenance and it missed the actual scheduled NEXT_RUN_DATE because of that and next time when we start the server, it still keep waiting for that old date only. There is no inbuilt trigger or intellegence in database to find out that the NEXT_RUN_DATE has already been past and it needs to be reset.

Now, in order to reset it manually, we need to first disable the job queue completely at the system level:

 SQL> show parameter job_queue_processes  
 NAME                 TYPE    VALUE  
 ------------------------------------ ----------- ------------------------------  
 job_queue_processes         integer   100  
 SQL> alter system set job_queue_processes=0;  
 System altered.  

Now we can set the job_queue_processes back to 100 to enable it and it will reset the date as well but the key here is that it will kick off all the jobs immediately regardless of it's actual timings. If you are doing this in the middle of day then you really don't want to kick off those jobs in business hours. So in order to avoid that, next we need to disable the jobs manually:

 SQL> exec dbms_scheduler.disable('ORA_ASM_REPORT');  
 PL/SQL procedure successfully completed.  
 SQL> exec dbms_scheduler.disable('ORA_BACKUP_DAILY');  
 PL/SQL procedure successfully completed.  

And now, we can set the job_queue_processes back to it's original value and since the jobs are disabled, it won't kick off immediately:

 SQL> alter system set job_queue_processes=100;  
 System altered.  
 SQL> show parameter job_queue_processes  
 NAME                 TYPE    VALUE  
 ------------------------------------ ----------- ------------------------------  
 job_queue_processes         integer   100  

And finally we can enable the jobs as well and that should reset the NEXT_RUN_DATE as per the defined scheduled of the job without running them immediately:

 SQL> exec dbms_scheduler.enable('ORA_ASM_REPORT');  
 PL/SQL procedure successfully completed.  
 SQL> exec dbms_scheduler.enable('ORA_BACKUP_DAILY');  
 PL/SQL procedure successfully completed.  

Now let's query the DBA_SCHEDULER_JOBS view again to check the schedule:

 SQL> select JOB_NAME, STATE, LAST_START_DATE, NEXT_RUN_DATE from DBA_SCHEDULER_JOBS where OWNER <> 'SYS';  
 JOB_NAME                      STATE        LAST_START_DATE                         NEXT_RUN_DATE  
 ---------------------------------------- --------------- ---------------------------------------- ----------------------------------------  
 ORA_ASM_REPORT               SCHEDULED    21-FEB-15 01.00.00.984664 AM US/CENTRAL 26-FEB-15 01.00.00.000000 AM US/CENTRAL  
 ORA_BACKUP_DAILY             SCHEDULED    21-FEB-15 02.15.00.639829 AM US/CENTRAL 26-FEB-15 03.30.00.700000 AM US/CENTRAL  

And we have everything back on track once again!!


Thanks
Daljit Singh

Monday, February 23, 2015

ASM Filter Driver - From Scratch


CONCEPT

With Oracle 12.1.0.2, Oracle released ASM Filter Driver. As Oracle ASMLib, this feature is also available only on Linux but it gets installed along with the Oracle Grid Software, there is no need to install extra packages like for ASMLib.

Oracle ASM Filter Driver (Oracle ASMFD) is a kernel module that resides in the I/O path of the Oracle ASM disks. This is definitely NOT a replacement of ASMLib but DEFINITELY a better option. You can still go with any of the below configuration:

a) Use ASMLib
b) Use ASM Filter Driver (ASMFD)
c) Don't use any of these at all

In other words, like ASMLib, ASMFD is also an optional configuration but when used, it gives you the capability to scan & label ASM disks and also can be used for device persistence.

There is one major difference between ASMLib and ASMFD, which is, ASMFD rejects any I/O requests that are invalid or not coming through Oracle. This action eliminates accidental overwrites of Oracle ASM disks that would cause corruption in the disks and files within the disk group. I personally like this feature a lot because it eliminates the possibility of accidental damage of ASM disks and this alone is very good reason for me to use this feature in an enterprise.

CONFIGURATION

Now I am going to explain how we can start using ASMFD, if it is not configured on your already configured/running environment.

Here is how we can check if ASMFD is in use or not:

As Grid Infrastructure software owner, run the following command:

 oracle@linux-ora-01:+ASM>env | grep ORACLE  
 ORACLE_SID=+ASM  
 ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/grid  
 oracle@linux-ora-01:+ASM>$ORACLE_HOME/bin/asmcmd afd_state  
 ASMCMD-9526: The AFD state is 'NOT INSTALLED' and filtering is 'DEFAULT' on host 'linux-ora-01'  

The above output confirmed that ASMFD is not installed on this box. We can also check the same using sqlplus by login to ASM instance as sysasm:

 oracle@linux-ora-01:+ASM>sqlplus / as sysasm  
 SQL*Plus: Release 12.1.0.2.0 Production on Mon Feb 23 16:53:40 2015  
 Copyright (c) 1982, 2014, Oracle. All rights reserved.  
 Connected to:  
 Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production  
 With the Automatic Storage Management option  
 SQL> SELECT SYS_CONTEXT('SYS_ASMFD_PROPERTIES', 'AFD_STATE') FROM DUAL;  
 SYS_CONTEXT('SYS_ASMFD_PROPERTIES','AFD_STATE')  
 --------------------------------------------------------------------------------  
 NOT AVAILABLE  

And we have the same results.

Here is all the ASM disk layout of my system:

 SQL> select name, path, total_mb, free_mb from v$asm_disk;  
 NAME                   PATH                    TOTAL_MB  FREE_MB  
 ------------------------------ ------------------------------ ---------- ----------  
 DATA_0000              /dev/sdb                  51492   43658  
 DATA_0001              /dev/sdc                  51492   43658  
 FRADG_0000              /dev/sdd                  51492   50906  
 6 rows selected.  

And as of now, my ASM discovery string is set to /dev/sd* in ASM resource profile:

 oracle@linux-ora-01:+ASM>srvctl config asm  
 ASM home: <CRS home>  
 Password file: +DATA/orapwasm  
 ASM listener: LISTENER  
 Spfile: +DATA/ASM/ASMPARAMETERFILE/registry.253.867758081  
 ASM diskgroup discovery string: /dev/sd*  

As Grid software owner, set the ASM discovery string to use the labels as well:

 oracle@linux-ora-01:+ASM>asmcmd dsset 'AFD:*','/dev/sd*'  
 oracle@linux-ora-01:+ASM>  
 oracle@linux-ora-01:+ASM>  
 oracle@linux-ora-01:+ASM>asmcmd dsget  
 parameter:AFD:*, /dev/sd*  
 profile:AFD:*,/dev/sd*  

Note: Why we are using "AFD:"? Because once we label the disk, it will show up as AFD: and that name will be used for disk discovery.

We must include the existing ASM disks along with new "AFD:" diskstring, otherwise the command will fail with following error:

 oracle@linux-ora-01:+ASM>asmcmd dsset 'AFD:*'  
 ORA-02097: parameter cannot be modified because specified value is invalid  
 ORA-15014: path '/dev/sdb' is not in the discovery set (DBD ERROR: OCIStmtExecute)  

Stop the HAS stack and configure ASMFD:

 oracle@linux-ora-01:+ASM>crsctl stop has -f  
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'linux-ora-01'  
 CRS-2673: Attempting to stop 'ora.test.db' on 'linux-ora-01'  
 CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'linux-ora-01' succeeded  
 CRS-2677: Stop of 'ora.test.db' on 'linux-ora-01' succeeded  
 CRS-2673: Attempting to stop 'ora.DATA.dg' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.DATA.dg' on 'linux-ora-01' succeeded  
 CRS-2673: Attempting to stop 'ora.evmd' on 'linux-ora-01'  
 CRS-2673: Attempting to stop 'ora.FRADG.dg' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.FRADG.dg' on 'linux-ora-01' succeeded  
 CRS-2673: Attempting to stop 'ora.asm' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.evmd' on 'linux-ora-01' succeeded  
 CRS-2677: Stop of 'ora.asm' on 'linux-ora-01' succeeded  
 CRS-2673: Attempting to stop 'ora.cssd' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.cssd' on 'linux-ora-01' succeeded  
 CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'linux-ora-01' has completed  
 CRS-4133: Oracle High Availability Services has been stopped.  
 oracle@linux-ora-01:+ASM>  

Configure the ASMFD as root:

 [root@linux-ora-01 ~]# /u01/app/oracle/product/12.1.0.2/grid/bin/asmcmd afd_configure  
 Connected to an idle instance.  
 AFD-620: AFD is not supported on this operating system version: 'unknown'  
 ASMCMD-9524: AFD configuration failed 'ERROR: afdroot install failed'  

I am using Oracle Unbreakable Linux 7 and it's still not working for me:

 [root@linux-ora-01 ~]# uname -r  
 3.10.0-123.13.2.el7.x86_64  
 [root@linux-ora-01 ~]# uname -a  
 Linux linux-ora-01 3.10.0-123.13.2.el7.x86_64 #1 SMP Thu Dec 18 10:51:45 PST 2014 x86_64 x86_64 x86_64 GNU/Linux  
 [root@linux-ora-01 ~]# cat /etc/redhat-release 
 Red Hat Enterprise Linux Server release 7.0 (Maipo)

The above error is due to Oracle bug: 18321597 (Metalink Doc: 1951850.1). Download patch 18321597 and install on Grid home.

PATCHING

Once you have the patch and latest OPatch (12.1.0.1.2 or later) downloaded, apply the patch:

Here is my opatch version:

 oracle@linux-ora-01:+ASM>./opatch version  
 OPatch Version: 12.1.0.1.6  
 OPatch succeeded.  

As you can see I have no interim patches applied so far:

 oracle@linux-ora-01:+ASM>./opatch lsinventory  
 OPatch could not create/open history file for writing.  
 Oracle Interim Patch Installer version 12.1.0.1.6  
 Copyright (c) 2015, Oracle Corporation. All rights reserved.  
 This is a read-only home and users need to run OPatch with -customLogDir option.  
 Oracle Home    : /u01/app/oracle/product/12.1.0.2/grid  
 Central Inventory : /u01/app/oraInventory  
   from      : /u01/app/oracle/product/12.1.0.2/grid/oraInst.loc  
 OPatch version  : 12.1.0.1.6  
 OUI version    : 12.1.0.2.0  
 Log file location : /u01/app/oracle/product/12.1.0.2/grid/cfgtoollogs/opatch/opatch2015-02-23_19-01-22PM_1.log  
 Lsinventory Output file location : /u01/app/oracle/product/12.1.0.2/grid/cfgtoollogs/opatch/lsinv/lsinventory2015-02-23_19-01-22PM.txt  
 --------------------------------------------------------------------------------  
 Installed Top-level Products (1):   
 Oracle Grid Infrastructure 12c                    12.1.0.2.0  
 There are 1 products installed in this Oracle Home.  
 There are no Interim patches installed in this Oracle Home.  
 --------------------------------------------------------------------------------  


Apply the patch as root user:

Note: If you don't already have Oracle Configuration Management response file (ocm.rsp) then you can create one by executing:
$ORACLE_HOME/OPatch/ocm/bin/emocmrsp

 [root@linux-ora-01 OPatch]# export ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/grid  
 [root@linux-ora-01 OPatch]# export PATH=$ORACLE_HOME/OPatch:$ORACLE_HOME/bin:$PATH  
 [root@linux-ora-01 OPatch]# which opatch  
 /u01/app/oracle/product/12.1.0.2/grid/OPatch/opatch  
 [root@linux-ora-01 OPatch]# which opatchauto  
 /u01/app/oracle/product/12.1.0.2/grid/OPatch/opatchauto  
 [root@linux-ora-01 OPatch]# opatchauto apply /u02/media/patches/18321597 -oh /u01/app/oracle/product/12.1.0.2/grid -ocmrf /u02/media/ocm.rsp  
 OPatch Automation Tool  
 Copyright (c)2014, Oracle Corporation. All rights reserved.  
 OPatchauto Version : 12.1.0.1.6  
 OUI Version    : 12.1.0.2.0  
 Running from    : /u01/app/oracle/product/12.1.0.2/grid  
 opatchauto log file: /u01/app/oracle/product/12.1.0.2/grid/cfgtoollogs/opatchauto/18321597/opatch_gi_2015-02-23_19-06-27_deploy.log  
 Parameter Validation: Successful  
 Configuration Validation: Successful  
 Patch Location: /u02/media/patches/18321597  
 Grid Infrastructure Patch(es): 18321597   
 It does not contain any DB patch  
 Patch Validation: Successful  
 User specified following Grid Infrastructure home:  
 /u01/app/oracle/product/12.1.0.2/grid  
 Performing prepatch operations on SIHA Home... Successful  
 Applying patch(es) to "/u01/app/oracle/product/12.1.0.2/grid" ...  
 Patch "/u02/media/patches/18321597/18321597" successfully applied to "/u01/app/oracle/product/12.1.0.2/grid".  
 Performing postpatch operations on SIHA Home... Successful  
 Apply Summary:  
 Following patch(es) are successfully installed:  
 GI Home: /u01/app/oracle/product/12.1.0.2/grid: 18321597  
 opatchauto succeeded.  

Check the recently applied patch as Grid software owner user:

 oracle@linux-ora-01:+ASM>./opatch lsinventory  
 OPatch could not create/open history file for writing.  
 Oracle Interim Patch Installer version 12.1.0.1.6  
 Copyright (c) 2015, Oracle Corporation. All rights reserved.  
 This is a read-only home and users need to run OPatch with -customLogDir option.  
 Oracle Home    : /u01/app/oracle/product/12.1.0.2/grid  
 Central Inventory : /u01/app/oraInventory  
   from      : /u01/app/oracle/product/12.1.0.2/grid/oraInst.loc  
 OPatch version  : 12.1.0.1.6  
 OUI version    : 12.1.0.2.0  
 Log file location : /u01/app/oracle/product/12.1.0.2/grid/cfgtoollogs/opatch/opatch2015-02-23_19-17-04PM_1.log  
 Lsinventory Output file location : /u01/app/oracle/product/12.1.0.2/grid/cfgtoollogs/opatch/lsinv/lsinventory2015-02-23_19-17-04PM.txt  
 --------------------------------------------------------------------------------  
 Installed Top-level Products (1):   
 Oracle Grid Infrastructure 12c                    12.1.0.2.0  
 There are 1 products installed in this Oracle Home.  
 Interim patches (1) :  
 Patch 18321597   : applied on Mon Feb 23 19:12:51 CST 2015  
 Unique Patch ID: 18187560.1  
 Patch description: "ACFS Patch Set Update : 12.1.0.2.1 (18321597)"  
   Created on 2 Dec 2014, 21:31:40 hrs PST8PDT  
   Bugs fixed:  
    19127216, 19475588, 18957085, 19353057, 19279106, 19201087, 19270227  
    19335268, 18951113, 19184398, 19149476, 19450090, 19013966, 19183802  
    19051391, 18321597, 19557156, 19195735, 18877486, 18510745, 19355146  
    19001684, 18955907, 19060056, 19134464  
 --------------------------------------------------------------------------------  
 OPatch failed with error code 15  
 oracle@linux-ora-01:+ASM>  

This patch also starts the Oracle HAS stack, so we need to bring that down again as oracle user:

 oracle@linux-ora-01:+ASM>crsctl stop has -f  
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'linux-ora-01'  
 CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'linux-ora-01'  
 CRS-2673: Attempting to stop 'ora.test.db' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'linux-ora-01' succeeded  
 CRS-2677: Stop of 'ora.test.db' on 'linux-ora-01' succeeded  
 CRS-2673: Attempting to stop 'ora.FRADG.dg' on 'linux-ora-01'  
 CRS-2673: Attempting to stop 'ora.DATA.dg' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.DATA.dg' on 'linux-ora-01' succeeded  
 CRS-2677: Stop of 'ora.FRADG.dg' on 'linux-ora-01' succeeded  
 CRS-2673: Attempting to stop 'ora.asm' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.asm' on 'linux-ora-01' succeeded  
 CRS-2673: Attempting to stop 'ora.evmd' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.evmd' on 'linux-ora-01' succeeded  
 CRS-2673: Attempting to stop 'ora.cssd' on 'linux-ora-01'  
 CRS-2677: Stop of 'ora.cssd' on 'linux-ora-01' succeeded  
 CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'linux-ora-01' has completed  
 CRS-4133: Oracle High Availability Services has been stopped.  
 oracle@linux-ora-01:+ASM>  

Let's try Configuring the ASMFD again as root:

 [root@linux-ora-01 ~]# /u01/app/oracle/product/12.1.0.2/grid/bin/asmcmd afd_configure  
 Connected to an idle instance.  
 AFD-627: AFD distribution files found.  
 AFD-636: Installing requested AFD software.  
 AFD-637: Loading installed AFD drivers.  
 AFD-9321: Creating udev for AFD.  
 AFD-9323: Creating module dependencies - this may take some time.  
 AFD-9154: Loading 'oracleafd.ko' driver.  
 AFD-649: Verifying AFD devices.  
 AFD-9156: Detecting control device '/dev/oracleafd/admin'.  
 AFD-638: AFD installation correctness verified.  
 Modifying resource dependencies - this may take some time.  
 ASMCMD-9524: AFD configuration failed 'ERROR: OHASD start failed'  

We can ignore the last error for now but we do have a working ASMFD now.

 oracle@linux-ora-01:+ASM>asmcmd afd_state  
 Connected to an idle instance.  
 ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'DEFAULT' on host 'linux-ora-01'  

Note: ASMLib creates a directory called /dev/oracleasm and ASMFD creates /dev/oraclefd. When we label a disk using ASMFD, a pointer of that will get stored in /dev/oraclefd/disks directory.

Let's start the HAS stack manually:

 oracle@linux-ora-01:+ASM>crsctl start has  
 CRS-4123: Oracle High Availability Services has been started.  


LABELING THE DISKS

Label and migrate the existing disks:

 oracle@linux-ora-01:+ASM>srvctl stop diskgroup -diskgroup FRADG  
 oracle@linux-ora-01:+ASM>asmcmd afd_label FRA_DISK1 '/dev/sdd' --migrate  
 oracle@linux-ora-01:+ASM>srvctl stop diskgroup -diskgroup DATA  
 oracle@linux-ora-01:+ASM>asmcmd afd_label DATA_DISK1 '/dev/sdb' --migrate  
 oracle@linux-ora-01:+ASM>asmcmd afd_label DATA_DISK2 '/dev/sdc' --migrate  

Rescan the disks and start the diskgroups:

 oracle@linux-ora-01:+ASM>asmcmd afd_scan  
 oracle@linux-ora-01:+ASM>srvctl start diskgroup -diskgroup DATA  
 oracle@linux-ora-01:+ASM>srvctl start diskgroup -diskgroup FRADG  

We can list the disks being used by ASMFD:

 oracle@linux-ora-01:+ASM>asmcmd afd_lsdsk  
 --------------------------------------------------------------------------------  
 Label           Filtering  Path  
 ================================================================================  
 FRA_DISK1          ENABLED  /dev/sdd  
 DATA_DISK1         ENABLED  /dev/sdb  
 DATA_DISK2         ENABLED  /dev/sdc  
 oracle@linux-ora-01:+ASM>ls -l /dev/oracleafd/disks/  
 total 12  
 -rw-r--r--. 1 oracle dba 9 Feb 23 19:48 DATA_DISK1  
 -rw-r--r--. 1 oracle dba 9 Feb 23 19:48 DATA_DISK2  
 -rw-r--r--. 1 oracle dba 9 Feb 23 19:43 FRA_DISK1  

These are actually just the text files containing the actual path of the disk:

 oracle@linux-ora-01:+ASM>file DATA_DISK2  
 DATA_DISK2: ASCII text  
 oracle@linux-ora-01:+ASM>cat DATA_DISK2   
 /dev/sdc  

We have successfully migrated the ASM instance to use ASM Filter Driver.

And now we can also reset the ASM diskstring to scan only AFD disks:

 oracle@linux-ora-01:+ASM>asmcmd dsset 'AFD:*'  
 oracle@linux-ora-01:+ASM>  

DESTRUCTION

Now let's test the feature where it won't let any non-oracle I/O to go through and corrupt the disks managed by ASMFD:

 oracle@linux-ora-01:+ASM>dd if=/dev/zero of=/dev/sdb bs=1M count=1000  
 1000+0 records in  
 1000+0 records out  
 1048576000 bytes (1.0 GB) copied, 1.58866 s, 660 MB/s  
 oracle@linux-ora-01:+ASM>sqlplus / as sysdba  
 SQL*Plus: Release 12.1.0.2.0 Production on Mon Feb 23 20:17:39 2015  
 Copyright (c) 1982, 2014, Oracle. All rights reserved.  
 Connected to:  
 Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production  
 With the Automatic Storage Management option  
 SQL> col path format a30 trunc  
 SQL> set lines 120  
 SQL> select name, path, total_mb, free_mb from v$asm_disk;  
 NAME                   PATH                    TOTAL_MB  FREE_MB  
 ------------------------------ ------------------------------ ---------- ----------  
 FRADG_0000              AFD:FRA_DISK1                 51492   50906  
 DATA_0000              AFD:DATA_DISK1                 51492   48829  
 DATA_0001              AFD:DATA_DISK2                 51492   43658  

As you can see, the "dd" command did succeed but it didn't actually wipe the disks. Oracle rejected all the I/O which was coming from "dd" command because it was not coming through Oracle software stack.

Same applied to root user as well, even if we run the same "dd" command as root, it will still be bypassed and the disks will remain intact, absolutely amazing!!



Thanks
Daljit Singh

RMAN Incremental Backup SCN's


For all the archived logs, RMAN/Oracle manage the start and end SCN numbers. By having a quick view of these numbers, one can easily find out if anything is missing there which can jeopardize the recovery process.

Unfortunately the same doesn't apply to database backupsets. When we take RMAN incremental backup of database, it also does has the start and end SCN associated with it but that information is not easily visible to user for a quick look. When we take RMAN L1 backup, it actually backup all the blocks with SCN number greater than the end SCN of last L1 or L0 (depending of differential vs cumulative backups). This information is not directly available to users, in other words, one can't just have a look on the backupsets and determine if the continuity has been maintained or if something is missing/deleted in between available L1 backups.

Here is the output of RMAN archive log backup showing the START (Low) and END (Next) SCN of each available archive log:

 RMAN> list backup of archivelog all;  
 using target database control file instead of recovery catalog  
 List of Backup Sets  
 ===================  
 BS Key Size Device Type Elapsed Time Completion Time  
 ------- ---------- ----------- ------------ -------------------  
 690 9.17M DISK 00:00:06 02/23/2015 09:03:33  
 BP Key: 18910 Status: AVAILABLE Compressed: YES Tag: TAG20150223T081600  
 Piece Name: /media/rmanbackup/backup0gpvvu6v_1_1  
 List of Archived Logs in backup set 18902  
 Thrd Seq Low SCN Low Time Next SCN Next Time  
 ---- ------- ---------- ------------------- ---------- ---------  
 1 62083 362640841 02/23/2015 05:00:00 362643077 02/23/2015 06:00:00  
 1 62084 362643077 02/23/2015 06:00:00 362646450 02/23/2015 07:00:00  
 1 62085 362646450 02/23/2015 07:00:00 362651834 02/23/2015 08:00:03  
 1 62086 362651834 02/23/2015 08:00:03 362654121 02/23/2015 08:15:58  

And here is the L1 backup of database:

 RMAN> list backup of database;  
 List of Backup Sets  
 ===================  
 BS Key Type LV Size    Device Type Elapsed Time Completion Time  
 ------- ---- -- ---------- ----------- ------------ -------------------  
 694  Incr 1 12.4G   DISK    00:20:20   02/23/2015 08:23:32  
 BP Key: 18912  Status: AVAILABLE Compressed: YES Tag: TAG20150223T091530  
 Piece Name: /u02/PROD/backupset/2015_02_23/o1_mf_nnnd1_tgph2lw_.bkp  
 List of Datafiles in backup set 18904  
 File LV Type <b>Ckp SCN</b>  <b>Ckp Time</b>      Name  
 ---- -- ---- ---------- ------------------- ----  
 1  1 Incr 300407179 02/23/2015 08:43:13 +ORADATA/prod/system_01.dbf  
 2  1 Incr 300407179 02/23/2015 08:43:13 +ORADATA/prod/undo_01.dbf  
 3  1 Incr 300407179 02/23/2015 08:43:13 +ORADATA/prod/sysaux_01.dbf  
 4  1 Incr 300407179 02/23/2015 08:43:13 +ORADATA/prod/prod_data_01.dbf  
 5  1 Incr 300407179 02/23/2015 08:43:13 +ORADATA/prod/prod_index_01.dbf  

We do have Checkpoint SCN which is kind of start SCN of the backupset.

Oracle does store the start and end SCN of database backupsets as well and we can see that while doing the recovery in debug mode. Here are the steps to see that:

1) Before invoking rman, set the following environment variable to have the date in certain format:
export NLS_DATE_FORMAT='DD-MON-YYYY HH24:MI:SS'

2) Invoke RMAN and set the debug ON

 $rman trace=recovery.trc log=recovery.log  
 RMAN> connect target /  
 RMAN> set echo on  
 RMAN> debug on  
 RMAN> recover database;  
 RMAN> debug off  

3) Now open the recovery.trc file and find the string "bsKey=", there should be a section like this in debug file:

 DBGRCVMAN: DUMPING RECOVERY CONTAINER  
 DBGRCVMAN: Incremental Backup Set  
 DBGRCVMAN:  <b>bsKey=694</b> bsRecid=694 bsStamp=868787809 setStamp=868586419 setCount=2283 site_key=0  
 DBGRCVMAN:  bsLevel=1 bsType=I pieceCount=1  
 DBGRCVMAN:  multi_section=N  
 DBGRCVMAN:  key=82 recid=82 stamp=868787809  
 DBGRCVMAN:  compTime=23-FEB-15  
 DBGRCVMAN:  blocks=64000 blockSize=8192  
 DBGRCVMAN:  <b>fromSCN=300407179</b> <b>toSCN=300409375</b> toTime=23-FEB-15 level=1 section_size=0  
 DBGRCVMAN:  rlgSCN=1 rlgTime=22-MAY-13 dbincKey=  
 DBGRCVMAN:  afzSCN=0  
 DBGRCVMAN:  dfNumber=1 creationSCN=10 pluginSCN=0 foreignDbid=0 pluggedRonly=0  
 DBGRCVMAN:  cfType=B  
 DBGRCVMAN:  keep_options=0 keep_until=NULL  
 DBGRCVMAN:    fetchCursor1RecoveryAction real filter rejected - trying next  
 DBGRCVMAN:    fetchCursor1RecoveryAction seekNext  
 DBGRCVMAN:    fetchCursor1RecoveryAction rcvRecCursor1_c record  

Here "bsKey=694" is the backupset key and "fromSCN" & "toSCN" fields shows the START & END SCN contained in that backupset.

So far, this information is only available while doing actual recovery with debug mode. I wish if Oracle can somehow give this information directly in the "LIST BACKUP" command for ease of use.

I know there are many other ways to check the validity of the backups but this information would also help a lot and take less time to check the continuity of the backsets.



Thanks
Daljit Singh

Friday, February 6, 2015

Oracle 12c Grid Infrastructure Installation

Following are the steps to install Oracle 12c Grid Infrastructure for standalone server on Oracle Linux 6.

We need Oracle Grid Infrastructure on standalone box to use ASM, so the very first thing which we need to configure is the disks which we are going to use with ASM to store Oracle files.

Here is how disks are setup on my machine:

[root@orarac01 ~]# fdisk -l | grep /dev
Disk /dev/sda: 128.8 GB, 128849018880 bytes
/dev/sda1 * 1 64 512000 83 Linux
/dev/sda2 64 15666 125316096 8e Linux LVM
Disk /dev/sdb: 12.9 GB, 12884901888 bytes
Disk /dev/sdc: 12.9 GB, 12884901888 bytes

/dev/sdb and /dev/sdc are the disks which I am going to use with ASM. In this exercise, I am going to create 1 ASM diskgroup with NORMAL redundancy and will use these 2 disks for 2 failure groups.

We need to create 1 primary partition on each of the disk, covering the whole disk and will use that partition in our ASM disk creation process.

Here are the steps to create 1 primary partition on disk /dev/sdb:

fdisk /dev/sdb
Type n (for new  partition) and press enter
Type p (for primary partition) and press enter
Type 1 (for partition number) and press enter
Type p (to print the partition table) and press enter
Type w (to write the partition table to disk permanently) and press enter

Here is the complete action log of above mentioned steps for disk /dev/sdb:

[root@orarac01 ~]# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x9018e049.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): m
Command action
   a   toggle a bootable flag
   b   edit bsd disklabel
   c   toggle the dos compatibility flag
   d   delete a partition
   l   list known partition types
   m   print this menu
   n   add a new partition
   o   create a new empty DOS partition table
   p   print the partition table
   q   quit without saving changes
   s   create a new empty Sun disklabel
   t   change a partition's system id
   u   change display/entry units
   v   verify the partition table
   w   write table to disk and exit
   x   extra functionality (experts only)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1566, default 1): 
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1566, default 1566): 
Using default value 1566

Command (m for help): p

Disk /dev/sdb: 12.9 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x9018e049

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        1566    12578863+  83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.


Repeat the above mentioned steps for all the disks you are planning to use with ASM.

Now we need to make the SATA devices trusted by adding following line in /etc/scsi_id.config file:

[root@orarac01 rules.d]# cat /etc/scsi_id.config
options=-g

If the /etc/scsi_id.config file doesn't exist, then just create one and add only the above line.

Check the SCSI ID of the disks which we are going to use in ASM:

[root@orarac01 rules.d]# /sbin/scsi_id -g -u -d /dev/sdb1
1ATA_VBOX_HARDDISK_VBdda4193b-5e1d9aed
[root@orarac01 rules.d]# /sbin/scsi_id -g -u -d /dev/sdc1
1ATA_VBOX_HARDDISK_VB9d7a8cf7-04f8bb36

We need to make sure that the disks are owned by Oracle software owner and to make these permissions permanent, we need to add udev rules by creating a new rules file for asm disks.

Create a new file called something like 90-asmdisks.rules in /etc/udev/rules.d directory and paste the disk information (SCSI ID which we retrieved in previous step).

Here is what i have created for my environment:

[root@orarac01 rules.d]# cat /etc/udev/rules.d/90-asmdisks.rules
KERNEL=="sdb1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VBdda4193b-5e1d9aed", NAME="asm-disk1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sdc1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VB9d7a8cf7-04f8bb36", NAME="asm-disk2", OWNER="oracle", GROUP="dba", MODE="0660"

The NAME parameter in above file defines the name of the block device which will be created after applying these rules. I am calling it asm-disk1 and asm-disk2, so that these can be easily identified in the device listing.

After creating the new rules, we can test the newly created rule using following command for each disk:

# /sbin/udevadm test /dev/sdb1
# /sbin/udevadm test /dev/sdc1

Now we should be able to see the newly created devices in /dev directory and the permissions of devices should also be owned by oracle/dba:

[root@orarac01 rules.d]# ls -l /dev/asm*
brw-rw----. 1 oracle dba 8, 17 Jul 21 23:23 /dev/asm-disk1
brw-rw----. 1 oracle dba 8, 33 Jul 21 23:23 /dev/asm-disk2

The above output tells that our new rules just worked without any issue and the disk setup completes here and we are ready to launch the OUI to install Grid Infrastructure now.

Go to the Oracle 12c media directory and run the installer from there to launch the OUI:

[oracle@orarac01 grid]$ ./runInstaller
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 120 MB. Actual 43003 MB Passed
Checking swap space: must be greater than 150 MB. Actual 2015 MB Passed
Checking monitor: must be configured to display at least 256 colors. Actual 16777216 Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2013-07-21_11-38-28PM. Please wait ...[oracle@orarac01 grid]$ You can find the log of this install session at:
 /u01/app/oraInventory/logs/installActions2013-07-21_11-38-28PM.log


This should bring the following installer 12c Grid Infrastructure Install window:







In the above window you have 3 options:

First is to provide your metalink credentials, so the the installer can pull all the recommended upgrades for your product.

Second to apply already downloaded software updates

And the last to skip the software update completely.

Choose an option as per your requirement, I am going with option 3 to skip the updates completely. 

Select the appropriate radio button and hit Next.



In the above window, select "Install and Configure Grid Infrastructure for Standalone Server" and hit Next.


English language is selected by default, so hit Next.


In the above window, enter the diskgroup name you want to create for ASM. 

Then select the diskgroup redundancy: 

High -> 3 way mirroring
Normal -> 2 way mirroring
External -> No mirroring at all

And the allocation unit size (au_size). If you are not going to host really large database then 1MB is good enough.

Since there are no disks in the available disk list, click "Change Discovery Path" button to change the ASM diskstring.


Since the default diskstring is set to /dev/sd*, it wasn't finding any of our previously configured disks.

Change the discovery string to /dev/asm* because the devices which we created for ASM are /dev/asm-disk1 and /dev/asm-disk2.



Hit Ok.



Now we can see our disks. Select all the disks and hit Next.


In the above window, either specify different passwords for SYS and ASMSNMP users OR use the same password for both of the users and hit Next.


If you chose same password then you will get above warning window, just hit Yes.


In above window, select the OS group for SYSASM, SYSDBA & SYSOPER operational users. 

If you have separate team who deals with just storage then you need separate roles setup, otherwise just use the same and hit Next.


If you chose same OS group then you will get above warning window, just hit Yes.


In the above window, specify the location/path of ORACLE_BASE & ORACLE_HOME and hit Next.


In the above window, specify the location/path of Oracle inventory and hit Next.


In the above window, if you want the installer to run the root scripts for you then either provide root password or sudo access credentials to do the same, otherwise just leave it unchecked and hit Next.






In the above window, you can view the results of Oracle pre-req check run. I am going to ignore all of the above failed pre-reqs but it's not advisable for production boxes.


You will get the above warning window, if you ignore any failed pre-req, just hit Yes.


In the above window, it's showing the summary of all the choices we made for the install. Hit the Install button to actually install as per the given instructions.


The above window shows the overall progress & status of all the selected operations.



The above window is asking for the confirmation to run the root specific operations as privileged user (root). Hit Yes to continue.



In the above window, you can see that ASMCA failed for some reason and I just continued the installation. Will fix this manually after the install.




In the above window, install finished successfully. Now we need to fix the Automatic Storage Management Configuration Assistant error, we got earlier while installing.

Troubleshooting

Even though our installation failed on configuring ASM, it still configured HAS and other resources for us. We can query that as oracle user:

[oracle@orarac01 ~]$ crsctl status res
NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.cssd
TYPE=ora.cssd.type
TARGET=ONLINE
STATE=OFFLINE

NAME=ora.diskmon
TYPE=ora.diskmon.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.evmd
TYPE=ora.evm.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.ons
TYPE=ora.ons.type
TARGET=OFFLINE
STATE=OFFLINE

As you can see that the CSSD service is down, we need to start that up before starting ASM:

[oracle@orarac01 dbs]$ crsctl start res ora.cssd
CRS-2672: Attempting to start 'ora.cssd' on 'orarac01'
CRS-2672: Attempting to start 'ora.diskmon' on 'orarac01'
CRS-2676: Start of 'ora.diskmon' on 'orarac01' succeeded
CRS-2676: Start of 'ora.cssd' on 'orarac01' succeeded


Now we need to manually create a pfile in $ORACLE_HOME/dbs directory for ASM and start the instance using that:

This is what I put in my ASM init file:

[oracle@orarac01 dbs]$ cat init+ASM.ora
instance_type=ASM
db_unique_name=+ASM
asm_diskstring='/dev/asm*'
asm_diskgroups='DATA'
sga_target=400M

If you want you can use memory_target as well, it just needs the /dev/shm partition mount point.

SQL> startup pfile=/u01/app/oracle/product/12.1.0/grid/dbs/init+ASM.ora
ASM instance started

Total System Global Area 1135747072 bytes
Fixed Size 2297344 bytes
Variable Size 1108283904 bytes
ASM Cache 25165824 bytes
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

We can ignore the above error because we know that DATA diskgroup hasn't been created yet.

Now, If I try to create spfile from memory or pfile, it will fail with following error:

SQL> create spfile='/u01/app/oracle/product/12.1.0/grid/dbs/spfile+ASM.ora' from pfile='/u01/app/oracle/product/12.1.0/grid/dbs/init+ASM.ora';
create spfile='/u01/app/oracle/product/12.1.0/grid/dbs/spfile+ASM.ora' from pfile='/u01/app/oracle/product/12.1.0/grid/dbs/init+ASM.ora'
*
ERROR at line 1:
ORA-29786: SIHA attribute GET failed with error [Attribute 'SPFILE' sts[200]
lsts[0]]

We are getting the above error because ASM is not registered with OCR yet, we need to register it using SRVCTL and then only we can create spfile.

Here is how to register ASM with OCR:

[oracle@orarac01 dbs]$ srvctl add asm -listener LISTENER -spfile '/u01/app/oracle/product/12.1.0/grid/dbs/spfile+ASM.ora'

Eventhough we don't have spfile yet, just mention the complete path of it while registering the ASM, it will work and we won't have to change it after creating the spfile created.

Now since ASM is registered with OCR, let's try creating the spfile again:

[oracle@orarac01 dbs]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.1.0 Production on Mon Jul 22 00:54:17 2013

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Automatic Storage Management option

SQL> create spfile='/u01/app/oracle/product/12.1.0/grid/dbs/spfile+ASM.ora' from pfile='/u01/app/oracle/product/12.1.0/grid/dbs/init+ASM.ora';

File created.


We have the spfile now and the ASM configuration in OCR is already pointing to it, so that's good as well.

And here are our disks which we configured earlier:

SQL> col path format a30 trunc
SQL> select path, header_status from v$asm_disk;

PATH HEADER_STATU
------------------------------ ------------
/dev/asm-disk2 CANDIDATE
/dev/asm-disk1 CANDIDATE

Everything is setup correctly so far, ASM can see our disks as CANDIDATE, which means we can use them to create new diskgroup.

Now we can create the diskgroup manually:

SQL> create diskgroup DATA normal redundancy failgroup DATA_FG1 disk '/dev/asm-disk1' name DATA_FG1_DISK1 failgroup DATA_FG2 disk '/dev/asm-disk2' name DATA_FG2_DISK1;

Diskgroup created.

SQL> select name,total_mb,free_mb,path,header_status,failgroup from v$asm_disk;

NAME TOTAL_MB FREE_MB PATH HEADER_STATU FAILGROUP
------------------------------ ---------- ---------- ------------------------------ ------------ -------------
DATA_FG2_DISK1 12284 12233 /dev/asm-disk2 MEMBER DATA_FG2
DATA_FG1_DISK1 12284 12233 /dev/asm-disk1 MEMBER DATA_FG1

Now, I am shutting it down, so that we can start it from CRS resource:

SQL> shut immediate
ASM diskgroups volume disabled
ASM diskgroups dismounted
ASM instance shutdown
SQL> quit


Start the ASM using crsctl command:

[oracle@orarac01 dbs]$ crsctl start res ora.asm
CRS-2672: Attempting to start 'ora.asm' on 'orarac01'
CRS-2676: Start of 'ora.asm' on 'orarac01' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'orarac01'
CRS-2676: Start of 'ora.DATA.dg' on 'orarac01' succeeded
[oracle@orarac01 dbs]$ crsctl status res
NAME=ora.DATA.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.cssd
TYPE=ora.cssd.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.diskmon
TYPE=ora.diskmon.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.evmd
TYPE=ora.evm.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.ons
TYPE=ora.ons.type
TARGET=OFFLINE
STATE=OFFLINE


And at the very end, just to check all the resource dependencies, I am bouncing whole HAS stack:

[oracle@orarac01 dbs]$ crsctl stop has
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'orarac01'
CRS-2673: Attempting to stop 'ora.evmd' on 'orarac01'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'orarac01'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'orarac01'
CRS-2677: Stop of 'ora.DATA.dg' on 'orarac01' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'orarac01'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'orarac01' succeeded
CRS-2677: Stop of 'ora.asm' on 'orarac01' succeeded
CRS-2677: Stop of 'ora.evmd' on 'orarac01' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'orarac01'
CRS-2677: Stop of 'ora.cssd' on 'orarac01' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'orarac01' has completed
CRS-4133: Oracle High Availability Services has been stopped.

See how all the services got stopped by HAS in proper order, same applied to startup as well:

[oracle@orarac01 dbs]$ crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[oracle@orarac01 dbs]$
[oracle@orarac01 dbs]$ crsctl status res
NAME=ora.DATA.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.cssd
TYPE=ora.cssd.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.diskmon
TYPE=ora.diskmon.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.evmd
TYPE=ora.evm.type
TARGET=ONLINE
STATE=ONLINE on orarac01

NAME=ora.ons
TYPE=ora.ons.type
TARGET=OFFLINE
STATE=OFFLINE

All the registered services are up and everybody seems to be happy now. :)



Thanks
Daljit Singh