Cover V05, I07
Article
Listing 1
Listing 2
Sidebar 1
Sidebar 2

jul96.tar


Oracle Database Backup

Curtis Preston

Trying to back up today's databases is like trying to catch a charging rhino with a butterfly net. Real world databases have become screaming behemoths that are very difficult to capture (back up), and standard UNIX utilities are like butterfly nets. If you try to use these utilities for the job, not only will you not capture the data -- you'll break your arm! This article is my attempt to build a better rhino trap. It uses many advanced shell scripting features, so you may find it useful even if you do not have Oracle.

Hot or Cold Backups?

One of the first decisions a database administrator must make is whether to perform "cold" or "hot" backups. A cold backup involves:

1. Shutting down the instance 2. Backing up its filesystems or raw devices 3. Starting up the instance

Here are the steps to perform a hot backup. (Note that a hot backup requires much more knowledge of Oracle's internal structure, so a basic definition of helpful terms is given in the sidebar "Definitions.") First, you must determine the name of each tablespace in the instance and the files that are within it. Then, for each tablespace, you:

1. Put that tablespace in backup mode; 2. Backup that tablespace's files or raw partitions (using cp, tar, dd, etc.); 3. Take that tablespace out of backup mode.

Once all tablespaces are done:

1. Switch redolog files; and 2. Backup the control file.

It is much easier to perform cold backups, so why do hot backups at all? Simply stated, they increase your backup window, or the time in which you are allowed to perform backups. Because cold backups require a shutdown of the database, your users will probably allow only a very small backup window. Hot backups allow you to back up without shutting down, remaining virtually invisible to the user, and thereby significantly enlarging your window. I have even performed hot backups of production data during primetime hours. Try that with a cold backup!

Be Careful

If you are going to use hot backups, though, you must follow certain precautions. Failure to do so could mean you would only be able to restore up to your last cold backup -- if you made one.

During a cold backup, all data files have been synchronized. A restore simply requires copying all files into place and restarting the database. However, during a hot backup, each file is backed up at a different time; thus each has a different time stamp. You must have some method of syncing all of those files to the same point during a restore. That is what Oracle's Redo Logs, (also known as archiving, or Archive Logs) are for. Archiving continually records all database changes in what is called the On-line Redo log. Once that log is full, it is copied to an Archived Redo log. (For more information, see the sidebar "Should I Use Archiving?")

When restoring from a hot backup, you first replace the damaged database files with their respective backup copies. (You only need to replace the damaged files.) You then "play" the redo logs, effectively "redoing" all the transactions that affected those files. All files are then synchronized, and the database can be brought on line.

With all the logs in place, this restore can be done with one command. However, if just one redolog is lost, your restore could range from extremely difficult to impossible. If you do not have the set of redo logs that span the time of your backup, you will not recover at all.

This is why you must treat your redo logs as if they were gold. Some of the methods of protecting them are:

  • Mirroring the filesystem on which they are stored

  • Storing them on an NFS filesystem mounted from another server (can result in a performance loss)

  • Performing hourly incremental backups (to tape or disk) of that filesystem

    It is not uncommon, or unwise, to use all of the methods listed above to protect your redo logs.

    A Bird's Eye View

    To automate Oracle backups, you must know the location of tablespaces, raw partitions, data files, control files, redolog files, and whether archiving is running. To make things more difficult, each installation will have different answers to these questions. Oraback.sh (Listing 1) addresses all of the problems above by obtaining most of its information from Oracle. The script will do the following for each instance:

  • Use sqldba to obtain configuration information

  • Perform a cold or hot backup (as described above), based on the day of the week

  • Make a backup copy of essential redologs

    Oraback.sh supports these features:

  • Backups to disk or tape

  • Backups of raw partitions

  • Multitasking, which can reduce backup time by up to 75%

  • Mail-based error notification

    The script assumes that all instances want hot backups, which require redologs. Oraback.sh will warn you of any instances that do not have logs. It will continue to warn you until you correct the problem or customize your setup for that instance. Customization is done with oraback.conf (Listing 2).

    Looking in the Backup Window

    Now that I've explained the overall purpose of oraback.sh, here is a description of each step within the script. Oraback.sh begins by checking oraback.conf to see whether: the user running the backup is Oracle, or a user listed in field 7. It then looks to see whether the word "skip" is in field 2. If so, it skips the backup once, and removes the word from oraback.conf. The script then does one of the following, depending on which arguments, if any, it received:

    $ oraback.sh at -- If called with at as an argument (you would usually do this with cron), the script checks oraback.conf to see whether the day in field 3 (the cold backup day) is today. If so, it will set the variable $TIME to the time in field 4. If not, it will set it to the time in field 5. The script then schedules an at job that will run oraback.sh at the time specified by $TIME.

    $ oraback.sh [$SID1 $SID2...] -- If given one or more instance names (ORACLE_SID's, or SID's for short) as arguments, the script will perform a backup of each of them. If given no arguments, it performs a backup of all SID's listed in Oracle's oratab file. (The latter is what the at job will do.)

    Oraback.sh first checks the $HOST.master line in oraback.conf to find out if: (a) there is a tape device in field 6. If so, it labels the tape, making sure the no-rewind device is being used; and (b) there is a number in field 8. If so, it will perform that many simultaneous copies of database files.

    Next, oraback.sh asks Oracle questions that will determine how (or if) the SID will be backed up. It asks:

  • if the instance is on-line; if so, oraback.sh makes a set of configuration files needed later

  • if archiving is running

  • if the instance is excluded from hot backups

  • if this is a cold backup; if so, it shuts down the instance

    Hot Backup

    If the instance is online, logs are on, and the instance is not excluded from hot backups, the script will put each tablespace into backup mode by using the sqldba command begin backup. The script then copies that tablespace's files to the backup device. When the copying is done, it takes the tablespace out of backup mode with the sqldba command end backup.

    These three steps are the core of the hot backup. While the tablespace is in backup mode, the files will not change. Any changes made to the tablespace while it is in backup mode are sent to the redo logs, and are applied once it is taken out of backup mode. This is all done automatically by Oracle.

    The script supports simultaneous copying of individual data files to dramatically increase the speed of the backup. Depending on how many files there are per tablespace, there may be one or more tablespaces in backup mode at once. This will be done in order to perform the number of concurrent file copies (using dd) that are specified in field 8 of oraback.conf, if you are using this feature.

    Next, the script forces a checkpoint and archive log switch, which causes the online redo log to be sent to an archived redo log. It then backs up the control files to disk using both sqldba and a manual copy. Finally, the script makes copies of the essential redo logs and compresses them. (Essential logs are those that span the time of your backup.) If the backup device is a tape, it then backs up all files that were sent to disk to the tape device.

    Cold Backup

    If the instance is off-line, oraback.sh only needs to copy the datafiles to the backup device. The script then backs up the control files and redo logs the same way the hot backup does.

    Special Cases

    If the instance is online, but archiving is off, it checks oraback.conf for a line that reads:

    $HOST:$SID:NOARCHIVELOG

    If that line is not found, oraback.sh complains and sends you mail. If the line is found, it looks for the word "offline" on that same line. If "offline" is there, oraback.sh will shut down the instance, perform a cold backup, and then restart the instance. (This is one of the ways you can customize oraback.sh for different instances.)

    If the instance is online, oraback.sh also looks for a line in oraback.conf that reads

    $HOST:$SID:::nohotbackup

    If that word is found, the script will skip this instance when performing a hot backup.

    In summary, oraback.sh can back up from one to all of the instances in oratab. To back up all instances, run it with no arguments. To back up one or more, run it with those SID's as arguments. To schedule a backup for the time in oraback.conf, run it with the at argument.

    Installing Oraback.sh

    Check the following values in the site-specific section at the top of the script. They must be changed to fit your site.

    a. BINDIR must be set to where you install oraback.sh.

    b. ORATAB must be set to the name and location of Oracle's oratab file. (It must be located in a directory that Oracle can write to.)

    c. ORACONF must be set to the name and location of your oraback.conf file.

    You should also review and confirm all other preference variables in the site-specific section, including ORADIR, TMP, LOG, and PATH.

    You must also verify that:

    a. You know which instances are using archiving

    b. There is a valid archive_dest in each instances's config.ora (not the default /?/dbs/arch)

    c. Oracle can create directories in the filesystem that you are using for backups

    Customizing Your Backups

    Oraback.conf is the main configuration file for oraback.sh. If you want to use the at argument, you must have the following line in that file:

    $HOST.master::Sun:1200:1730:::5 :/dmp::Mail -s Error_oracle:

    These and other fields are described below:

  • $HOST=Hostname generated by uname -n

  • Field 2: If you want all backups skipped on this host tonight, put "skip" here

  • Field 3: The day you want to do cold backups. This can be a day of the week (Fri) or month (03), or blank if you do not want cold backups.

  • Field 4: The time to do cold backups

  • Field 5: The time to do hot backups

  • Field 6: No-rewind tape device, if you want to backup to tape. (Blank for disk-only backup)

  • Field 7: A "|" separated list of userid's permitted to run sqldba, such as "oracle|dba" (Blank allows only Oracle)

  • Field 8: The number of simultaneous copies you want to run. (Blank=1)

  • Field 9: Must be set to a filesystem that has enough room for a compressed copy of all database files. (If you are backing up to tape, you still need some space on disk.)

  • Field 10: A "Y" means to compress the files before they are written to tape or disk

  • Field 11: Set this to a complete mail command. (e.g., /usr/ucb/Mail -s Backup_Error oracle, dba@herworkstation.com)

    There are four predefined functions, unused by default, which are listed in the site-specific section of oraback.sh. They are listed below, with their respective usage times:

    Preback: Runs before entire backup starts

    Preshut: Runs before shutdown of each instance

    Poststart:Runs after startup of each instance

    Postback: Runs after entire backup finishes

    If you wanted to restart orasrv, for instance, after an instance is restarted (with a script called rc.orasrv.sh,) you would change the Poststart function to read:

    Poststart(){
    rc.orasrv.sh
    }

    Customizing Backups of an Instance

    If you have an instance that will not run archiving or hot backups, you need a line in oraback.conf that starts with:

    $HOST:$SID::::

  • Put "NOARCHIVELOG" in field 3 if the instance does not use archiving

  • Put "offline" in field 4 if the instance does not use archiving, and you want a cold backup done every time

  • Put "nohotbackups" in field 5 to skip hot backups for this instance

    Testing the Backup

    For testing, select a small instance and run the following command as Oracle:

    $ /usr/local/bin/oraback.sh <instance>

    Be aware that if the word "offline," is in oraback.conf, or if the day that you specified for cold backups is today, the instance will be shut down!

    If this test functions properly, you can put an entry in Oracle's cron. Call this command:

    /usr/local/bin/oraback.sh at

    at around 1200. It will check the oraback.conf and schedule an at job that will perform the backup.

    Testing the Restore

    Never accept that a new backup strategy works until you have fully tested it under all circumstances! Make sure you test all of the following until you are comfortable:

  • Complete restore from a cold backup

  • Complete restore from a hot backup

  • Partial restore from a hot backup, restoring only a few of your data files.

  • Simulate the loss of a redo log, following the instructions in the Oracle manual for this scenario. This one is extremely painstaking!

  • Restore from an older backup, using the redo logs to roll forward to today

    Once your restore test is successful, you will have caught and tamed the rhino! This script represents more than three years of work, and its development is a continual process. I hope you find it useful.

    About the Author

    Curtis Preston is a consultant for Pencom Systems Administration, and has been doing Unix Systems Administration for 3 years. He specializes in automating backups of heterogeneous networks. He may be reached at (312) 781-1111 or curtis@pencom.com.


     



  •