Understanding Oracle Backup & Recovery

W. Curtis Preston

This month's article will explain the elements of Oracle architecture that make backup and recovery possible. Understanding these architectural elements is key to being able to back up and recover Oracle successfully.

Historically, Oracle did not have a standalone backup utility like Informix's ontape or Sybase's dump, opting instead for commands that allow the DBA to use any backup utility. Oracle7 introduced the EBU, or Enterprise Backup Utility, but it is designed to work only with other commercial backup utilities. (Oracle8 now comes bundled with an OEM version of Legato Networker, which means that you do have another free option now, but this setup is still not as easy to use as ontape or dump.) Oracle8 introduced the Recovery Manager (rman), which also is designed to work with commercial backup utilities and adds considerably more functionality.

Environments without a commercial utility must use backup scripts of some kind. This method is certainly the least user-friendly and most difficult to learn if you are new to Oracle and scripting, but also allows for the greatest flexibility during both backup and restore. This complexity, of course, requires a bit more explanation.

This article will use the Oracle8 command svrmgr for interfacing with Oracle databases. If you are running Oracle7, the command is sqldba.

Oracle Architecture

It is important to understand the design of the database that is being backed up. We will start with the power user's view of the database, and then continue with that of the database administrator. This article uses Oracle-specific terms. As much as possible, these architectural elements are presented in a "building-block" order. Elements that are used to explain other elements will be presented first. For example, I explain what a segment is before explaining what the rollback segment is.

Power User's View

Unless a power user wants to start doing the DBA's job of putting a database together, the following terms should be all he or she needs to know. This view also could be called the "logical" view, since many of the elements described in this view don't exist in a physical sense.

Instance -- An instance is a set of processes through which the Oracle database talks to shared memory. In UNIX, there is often more than one instance on a single system. On a UNIX system, an instance can be identified by a set of processes whose names follow the pattern ora_PROCESS_ORACLE_SID (for example, ora_pmon_ORACLE_SID), where ORACLE_SID is the instance name. When an Oracle instance is started, the database within it becomes available. On a UNIX system, an instance is started with the dbstart command and shut down with the dbshut command. NT databases are started via the Service Startup Utility.
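As a rough illustration of identifying instances by their processes, the sketch below reads ps-style output and prints one ORACLE_SID per running instance. It assumes that each instance has a process monitor whose name looks like ora_pmon_ORACLE_SID; the function name list_instances is made up for this example.

```shell
# list_instances: read `ps -ef`-style output on stdin and print the
# ORACLE_SID of every instance that has a PMON background process.
# Assumption: the process name has the form ora_pmon_<ORACLE_SID>.
list_instances() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^ora_pmon_/) {
                sub(/^ora_pmon_/, "", $i)   # strip the prefix, leaving the SID
                print $i
            }
    }' | sort -u
}
```

On a live system this might be run as ps -ef | list_instances.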

Database -- The database is what most people think about when they are using Oracle because it is the database that contains the data. It contains all the tables, indexes, and other important database objects. In Oracle, there is a one-to-one relationship between instances and databases. A database resides in only one instance, and there is only one database within an instance. That is why an Oracle DBA or user may use the two terms interchangeably. (You will find the terms used interchangeably in this article as well. The term "instance" actually is used rather sparingly, since the term "database" is more widely known.) Technically, however, the instance is a set of processes through which the database talks to shared memory, whereas the database is the collection of data itself.

Table -- A table is a collection of related rows that all have the same attributes. In Oracle, a table can be "partitioned," or spread across multiple tablespaces. Other than that, Oracle tables are the same as those in any other RDBMS.

Index -- A database index is analogous to an index in a book -- it allows Oracle to find data quickly. Again, an Oracle index is the same as anyone else's index, and it presents no unique backup requirements. An index is a derived table. It is created based on the attributes in another table, so it could be re-created during a restore. However, it almost always is going to be quicker to restore it than to re-create it.

BLOB datatypes -- Oracle8 has special datatypes called BLOB, CLOB, and BFILE for storing large objects such as text or graphics. The Binary Large OBject (BLOB) and Character Large OBject (CLOB) datatypes present no special backup requirements since they are stored within the database itself. (A BLOB typically contains image files, and a CLOB normally contains text data.) However, the BFILE datatype stores only a pointer inside the database to a file that actually resides somewhere in the filesystem. This does require some special attention during backups.

Row -- A row is a collection of related attributes, such as all the information about a specific customer. Oracle DBAs also may refer to this as a "record."

Attribute -- An attribute is any specific value (also known as a "column" or "field") within a row.

DBA's View

Now that I have covered the logical structure of an Oracle database, I will concentrate on the physical structure. Since only the DBA should need to know this information, I will call it the "DBA's View".

Block -- A block is the smallest piece of data that can be moved within the database. Oracle allows a custom block size for each instance; the size can range from 1024 to 8192 bytes. A block is referred to as a page in other RDBMSs.

Extent -- An extent is a collection of Oracle blocks that are treated as one unit. The size of each extent is determined by the DBA.

Segment -- A segment is the collection of extents dedicated to a database object (table). Depending on the type of table, extents may be allocated or taken away to meet the storage needs of a given table. A perfect example is the rollback segment, described later, which would be all the extents on which the rollback logs are stored. The size of the rollback segment may increase or decrease depending on how many uncommitted transactions are currently open. Oracle adds extents to (or subtracts extents from) the rollback segment as it needs them.

Datafile -- An Oracle datafile can be either a raw (disk device) or cooked (filesystem) file. Once they are created, the syntax to work with raw and cooked datafiles is the same. However, backup scripts do have to take the type of datafile into account. If the backup script is going to support datafiles on raw partitions, it will need to use dd or some other command that can back up a raw partition. Using cp or tar will not work, since they support only filesystem files.
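A backup script might make the raw-versus-cooked decision with a small helper like the one below. This is only a sketch: the function name backup_command is hypothetical, and it reports which tool to use rather than performing the copy.

```shell
# backup_command FILE
# Print the copy tool a backup script should use for FILE:
#   "dd" for a raw device (block or character special file),
#   "cp" for a cooked (regular filesystem) file.
backup_command() {
    if [ -b "$1" ] || [ -c "$1" ]; then
        echo dd
    elif [ -f "$1" ]; then
        echo cp
    else
        echo "unknown file type: $1" >&2
        return 1
    fi
}
```

A script could loop over its list of datafiles and branch on the output of backup_command for each one.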

Each Oracle datafile contains a special header block that holds that datafile's System Change Number (SCN). This SCN is updated every time a change is made to the datafile, and the controlfile keeps track of the current SCN. When an instance is started, the current SCN is checked against the SCN markers in each datafile. (See the definition of controlfile later in this article.)

Tablespace -- This is the virtual area onto which a DBA creates tables. It consists of one or more datafiles and is created with a command of the form create tablespace tablespace_name datafile 'devicea', 'deviceb', 'devicec' (size clauses are omitted here). A tablespace may contain several tables. The space that each table occupies within that tablespace is a segment (see the earlier definition of segment). Every Oracle instance has at least one tablespace -- the system tablespace. The files that make up the system tablespace must be specified when creating a new Oracle instance. The system tablespace stores the data dictionary, PL/SQL programs, view definitions, the system rollback segment, and other types of instance-wide information. When it comes to backup and recovery, the main difference between the system tablespace and the rest of the tablespaces is that it must be recovered offline because the instance cannot be brought online without the system tablespace. Other tablespaces can be recovered after the instance has been brought online.

Partition -- A table can be spread out across multiple tablespaces. When this is done, each piece of the table is referred to as a partition, and each partition can reside in a different tablespace.

Controlfile -- The controlfile is a database (of sorts) that keeps track of the status of all the objects within the database. It knows about all tablespaces, datafiles, and redologs within the database. It also knows the current state of each of these objects by tracking each object's SCN. Every time Oracle makes a change to a file, the SCN is incremented both in the controlfile and in the actual datafile. (See the definition of datafile earlier in the article.) That way, when the system reboots and the instance is starting up, the controlfile has a record of what SCN the file should be at, and Oracle checks that against the SCN that the file has. This is how it "notices" that a file is older than the controlfile and is in need of media recovery. Also, if an older controlfile is put in place, Oracle will see that the SCNs of the datafiles are higher than those that it has recorded in the controlfile. That's when Oracle displays the "datafile is more recent than the controlfile" error.
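The comparison described above can be reduced to a three-way test. The following sketch is a simplified model, not Oracle's actual startup logic; the function name scn_status and the plain integer SCNs are assumptions made for illustration.

```shell
# scn_status CONTROLFILE_SCN DATAFILE_SCN
# Simplified model of the startup check: compare the SCN recorded in
# the controlfile with the SCN found in the datafile header.
scn_status() {
    if [ "$2" -lt "$1" ]; then
        # Datafile is older than the controlfile expects.
        echo "datafile needs media recovery"
    elif [ "$2" -gt "$1" ]; then
        # An older controlfile has been put in place.
        echo "datafile is more recent than the controlfile"
    else
        echo "consistent"
    fi
}
```

For example, scn_status 5000 4800 reports that the datafile needs media recovery, which is the situation after restoring an old copy of a datafile.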

Controlfiles can be backed up using the alter database backup controlfile to 'filename' command in svrmgr, but restoring controlfiles is a bit tricky. The mechanics of this recovery are well beyond the scope of this article. It is best to avoid having to recover or rebuild a controlfile at all. Oracle7 introduced a way to do this: mirrored controlfiles, in which there can be multiple copies of the controlfile, each of which is updated simultaneously by Oracle. Make sure that this feature is being used. Mirrored controlfiles take up almost no space, and provide an incredible amount of recovery flexibility.
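For scripts, one cautious approach is to generate the controlfile backup commands first so they can be reviewed before being run. The sketch below only prints text; the function name and the example path are hypothetical, and it assumes the script connects with connect internal and uses the alter database form of the backup controlfile command.

```shell
# controlfile_backup_commands FILE
# Print (but do not run) Server Manager commands that back up the
# controlfile to FILE. FILE must not contain a single quote.
controlfile_backup_commands() {
    printf '%s\n' \
        "connect internal" \
        "alter database backup controlfile to '$1';" \
        "exit"
}
```

The output of controlfile_backup_commands /backups/control.bkp could then be piped into the Server Manager command named earlier in this article.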

Transaction -- A transaction is any activity by a user or a DBA that changes one or more attributes in an Oracle database. (If a set of commands is contained between a begin transaction and end transaction statement, the entire set of commands is treated as one transaction.) Logically, a transaction modifies one or more attributes, but what actually occurs physically is a modification to one or more blocks within the Oracle database.

Rollback segment -- Remember that a segment comprises all the extents allocated to a database object. A rollback segment, then, is all the extents allocated to a rollback log. Before a page is physically changed on disk, the "before" image (its image before it was changed) needs to be recorded in case the transaction must be "rolled back". This before image is stored in a rollback log, which is contained within a rollback segment. (There can be several rollback segments within a given instance, and a transaction may even be told which rollback segment to use.) Oracle writes to only one extent within the rollback segment at a time. It also writes to these extents in a cyclical fashion, filling each extent one by one until all the extents are full, then returning to the "first" extent and overwriting it. However, it cannot start writing to an extent if there is an uncommitted transaction whose "before" images are found in that extent, because the before images must be preserved until the transaction is committed. Oracle then must assign additional extents so that additional before images can be saved. (This typically happens with a long transaction whose before images will span several extents within the rollback segment.) Once all transactions that need the before images in a particular extent are committed, that extent then is available for use by the rollback segment. If the number of extents needed by the rollback segment decreases, Oracle can release extents as necessary to shrink the rollback segment.
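The cyclical reuse of extents can be modeled in a few lines. This is a toy model under stated assumptions, not a description of Oracle internals: extents are numbered from 1, each argument after the first is the count of uncommitted transactions with before images in that extent, and the function name next_extent is invented for the example.

```shell
# next_extent CURRENT COUNT...
# CURRENT is the number of the extent being written now. Each COUNT is
# the number of uncommitted transactions whose before images live in
# extent 1, 2, 3, ... Prints "reuse N" if the next extent in the cycle
# is free, or "allocate" if a new extent must be added instead.
next_extent() {
    current=$1
    shift
    total=$#
    next=$(( current % total + 1 ))   # wrap from the last extent to the first
    eval "open=\${$next}"             # look up the COUNT for that extent
    if [ "$open" -eq 0 ]; then
        echo "reuse $next"
    else
        echo "allocate"
    fi
}
```

With three extents and nothing uncommitted in the first, next_extent 3 0 1 0 wraps around and reuses extent 1. If a long transaction still has before images in extent 2, next_extent 1 0 2 0 reports that a new extent must be allocated.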

There is always at least one rollback segment created -- the system rollback segment -- which is stored in the system tablespace. Neither this rollback segment nor the tablespace in which it is stored is sufficient for a normal production database. Therefore, the DBA will create additional rollback segments in other tablespaces and take the system rollback segment offline. A common practice is to create a tablespace that will contain nothing but rollback segments. Oracle assigns rollback segments to transactions on a round-robin basis or to a specific rollback segment specified manually by the transaction. Taking the system rollback segment offline makes sure that no transactions will be assigned to it. This allows the system tablespace to concentrate on other matters, without being slowed down to record rollback information.

The main reason to understand rollback segments (and where they go) is their unique role in a database recovery. Remember that the rollback segments store the before images of all changed blocks. After a crash or recovery, these before images are essential to return the database to a consistent state. They are needed in order to roll back any uncommitted transactions and return the necessary blocks to their before-transaction status. (This is the entire purpose of the rollback segment.)

The result of this restriction is that, while a rollback segment can be recovered online, a normal tablespace cannot be brought online until the rollback segment that it uses is completely restored. Therefore, Oracle does not allow the instance to be brought online unless all defined rollback segments are available. If you try to open the database without any of them, Oracle gives the error "rollback segment segment_name specified not available". This is covered in more detail later.

Checkpoint -- A checkpoint is the point at which all data kept in memory is flushed to disk. In Oracle, a DBA can force a checkpoint with the alter system checkpoint command, but a checkpoint also is done automatically every time the database switches redolog files.

Redolog -- If the rollback segment contains a rollback log, the redolog could be called a "roll-forward" log. Every time that Oracle needs to change a block on disk, it records the change vector in the redolog; that is, it records how it changed the block, not the value it changed it to. A mathematical explanation may be helpful here.

Suppose that you had a variable with a value of 100 and added 1 to it. To record the change vector, you would record +1; to record the changed value, you would record 101. During normal operation, Oracle records change vectors in the redologs. When a tablespace is in backup mode, however, Oracle starts recording the changed value (e.g., 101 in the example above), rather than the change vector.
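The arithmetic can be sketched as follows. It is only an analogy using integers, with a made-up function name; the point is that replaying change vectors requires a correct starting value, whereas a recorded changed value can be applied without knowing what was there before.

```shell
# apply_vectors START VECTOR...
# Replay a list of change vectors (such as +1 or -3) against a starting
# value and print the result. The answer is only right if START is right.
apply_vectors() {
    value=$1
    shift
    for vector in "$@"; do
        value=$(( $value + ($vector) ))
    done
    echo "$value"
}
```

Starting from 100, apply_vectors 100 +1 yields 101, the changed value from the example above.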

In times of recovery, the redolog is used to "redo" transactions that have occurred since the last checkpoint or since the backup that is being used for a restore. Oracle has both online redologs and offline (archived) redologs. The online and archived redologs are essential to recovering from a crash or disk failure. Learn everything you can about how they work and protect them as if they were gold.

Originally, the online redologs were three or more files to which Oracle wrote the logs of each transaction. (Oracle requires only two logs, but the typical practice is to have three or more. That allows one log to be active, one to be completely inactive, and one to be in the process of being archived.) The problem with this approach is that the log to which Oracle was currently writing always contained the only copy of the most recent transaction logs. If the disk on which this log was stored were to crash, Oracle would not be able to recover up to the point of failure.

Log Groups

Oracle7 introduced the concept of log groups. A log group is a set of two or more files that are written to simultaneously by Oracle -- essentially a mirror for the redologs. A set of log files is called a "log group", and the separate files within that log group are referred to as "members". Each log group is treated as a single log file, and all transaction records are simultaneously written to all disks within the currently active log group. Now, instead of three or more separate files, any one of which could render the database useless if damaged, there are three or more separate log groups of mirrored files. If each log group is assigned more than one member, every transaction is being recorded in more than one place. After a crash, Oracle can read any one of these members to perform crash recovery.
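A quick way to audit this is to count the members in each group. The sketch below assumes the group numbers and member filenames have already been written to a text file, one "group member" pair per line (for example, spooled from a query against v$logfile); that input format and the function name unmirrored_groups are assumptions.

```shell
# unmirrored_groups: read "GROUP MEMBER_PATH" lines on stdin and print
# the number of every log group that has only one member.
unmirrored_groups() {
    awk '{ members[$1]++ }
         END { for (g in members) if (members[g] < 2) print g }' | sort -n
}
```

Any group number it prints is a log group whose loss would leave no second copy of the transactions it holds.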

Oracle writes to the log groups in a cyclical fashion. It writes to one log group until that log group is full. It then performs a log switch and starts writing to the next log group. As soon as this happens, the log group that was just filled is then copied to an archived redolog file, if automatic archiving is enabled. If automatic archiving is not enabled, this file is not copied and is simply overwritten the next time that Oracle needs to write to that log. Each of the online redologs is copied to the filename pattern specified by the LOG_ARCHIVE_DEST parameter in the initORACLE_SID.ora file, followed by an incremented string specified by the LOG_ARCHIVE_FORMAT parameter in the initORACLE_SID.ora file.

For example, assume that LOG_ARCHIVE_DEST is set to /archivelogs/arch and LOG_ARCHIVE_FORMAT is set to %s.log, where %s is Oracle's variable for the current sequence number. If the current sequence number is 293, a listing of the archivelogs directory might show the following:

# cd /archivelogs
# ls arch*
arch291.log
arch292.log
arch293.log
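A script that needs to predict the name of an archived redolog can build it the same way. This sketch handles only the %s variable from the example above and does no zero-padding; the function name archived_log_name is hypothetical.

```shell
# archived_log_name DEST FORMAT SEQUENCE
# Join the LOG_ARCHIVE_DEST prefix to LOG_ARCHIVE_FORMAT, replacing the
# first %s in FORMAT with the log sequence number.
archived_log_name() {
    printf '%s%s\n' "$1" "$(printf '%s' "$2" | sed "s/%s/$3/")"
}
```

With the settings above, archived_log_name /archivelogs/arch %s.log 293 prints /archivelogs/arch293.log.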

Depending on how much activity a database has, there may be hundreds of files in the archive log destination directory over time. Nothing is done by Oracle to manage this area, so a cron job must be set up to clean this directory. As long as these files are being backed up to some kind of backup media, they can be removed after a few days. However, the more logs there are on disk, the better off the database will be. That is because it sometimes may be necessary to restore from a backup that is not the most current one. (For example, this could happen if the current backup volume is damaged.) If all the archive logs since the time that old backup was taken are online, it's a breeze. If they aren't, they have to be restored as well. That can create an available-space problem, which is why I recommend having enough space to store enough archive logs to span two backup cycles. For example, if the system does a full database backup once a night, there should be enough space to have at least two days' worth of redologs online. If it backs up once a week, then there should be enough storage for two weeks' worth of transaction logs. (This is yet another reason for backing up every night.)
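The cleanup job might start from something like the sketch below, which only lists candidates and deletes nothing. The arch*.log pattern matches the example filenames used earlier and the function name old_archive_logs is made up; both would need adjusting for a real site.

```shell
# old_archive_logs DIR DAYS
# List (without deleting) archived redologs in DIR that were last
# modified more than DAYS days ago. Assumes names of the form arch*.log.
old_archive_logs() {
    find "$1" -type f -name 'arch*.log' -mtime +"$2" -print | sort
}
```

Following the two-backup-cycle guideline, a site with nightly backups might call old_archive_logs /archivelogs 2 from cron and remove the listed files only after confirming that they are on backup media.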

In summary, the online redologs are usually three or more log groups that Oracle cycles through to write the current transaction log data. A log group is a set of one or more logs that Oracle treats as one redolog. (Oracle always uses the term "log groups", even if a log group has only one member.) Log groups should have more than one member, since that minimizes the chance for data corruption in the case of disk failure. Once Oracle fills up one online redolog group, it copies that redolog to the archive log destination as a separate file with a sequence number contained in the filename. It makes this copy only if automatic archiving is enabled.

As you can see, there are many parts to Oracle's architecture, each of which is integral in the backup and recovery process. Hopefully this column has helped you to understand these pieces of architecture a little bit better.

W. Curtis Preston has specialized in storage for over eight years, and has designed and implemented storage systems for several Fortune 100 companies. He is the owner of Storage Designs, the Webmaster of Backup Central (http://www.backupcentral.com), and the author of two books on storage. He may be reached at curtis@backupcentral.com. (Portions of some articles may be excerpted from Curtis's books.)