Error Management Definition/Meaning:
Error Management of magnetic tape. The procedure followed when an error is detected in data read by a tape transport, either in the course of a read
operation or in the course of the read-while-write check associated with a
write operation (see magnetic tape). The procedure is usually controlled by the
host system software, but in buffered tape transports it may be partly or
entirely autonomous, i.e. controlled by the magnetic tape subsystem, which will
usually either count error occurrences or notify the host of each one
individually. Error management is carried out on a block-by-block basis.
Read error recovery usually consists of repeated attempts to read the block
containing the error, and this involves starting and stopping the tape for each
attempt. Various physical parameters of the tape transport (including the
direction of tape motion) may be varied for successive attempts. Typically the
procedure allows for up to about 10 attempts, or retries, before the error is
regarded as irrecoverable (see error rate). Many tape formats make provision
for logical error recovery by redundant coding, which permits recovery without
rereading the tape; this is attempted (either autonomously or with software
assistance, depending on the particular format) before the main error recovery
procedure is invoked.
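
The read recovery sequence described above can be sketched roughly as follows. This is a minimal illustration, not any particular transport's firmware or driver: the names recover_read, reread, ecc_correct and ReadResult are hypothetical, and the limit of 10 retries is simply the typical figure quoted above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_READ_RETRIES = 10  # "up to about 10 attempts"; the real limit is format-specific


@dataclass
class ReadResult:
    data: Optional[bytes]  # block payload, or None if the block stayed unreadable
    attempts: int          # number of physical re-reads performed
    transient: bool        # True if the error was recovered


def recover_read(
    reread: Callable[[int], Optional[bytes]],
    ecc_correct: Callable[[], Optional[bytes]],
    max_retries: int = MAX_READ_RETRIES,
) -> ReadResult:
    """Hypothetical read error recovery for a single block.

    ecc_correct() tries logical recovery from redundant coding, without
    moving the tape. reread(attempt) repositions the tape and reads the
    block again; the attempt index lets the caller vary transport
    parameters (for example the direction of tape motion) on each try.
    Both return the block data on success and None on failure.
    """
    # Logical recovery is attempted before the main retry procedure.
    data = ecc_correct()
    if data is not None:
        return ReadResult(data, 0, True)

    # Main procedure: repeated physical re-reads, each a stop/start cycle.
    for attempt in range(1, max_retries + 1):
        data = reread(attempt)
        if data is not None:
            return ReadResult(data, attempt, True)

    # Retries exhausted: the error is regarded as irrecoverable.
    return ReadResult(None, max_retries, False)
```

Passing the attempt number to reread is one possible way of letting the caller change parameters between tries; a real subsystem may organise this quite differently.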
Write error recovery is usually preceded by one or more
attempts at read error recovery, since the error may have occurred in the read
part of the read-while-write check; it is then usual to erase the block
containing the error (which will be the last valid block on the tape, or at
least in the file) and rewrite it, starting a predefined distance down the tape
to avoid any flaw in the medium, and thus leaving an elongated interblock gap.
This process will be repeated a predetermined number of times (typically five)
before the error is regarded as irrecoverable.
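
The erase-and-rewrite loop can be sketched as below. It is an illustration under stated assumptions: recover_write, write_and_check, erase and the skip distance are hypothetical names and units, the limit of five is the typical figure quoted above, and the preliminary attempts at read error recovery are omitted for brevity.

```python
from typing import Callable

MAX_WRITE_RETRIES = 5  # "typically five"; the real limit is format-specific
GAP_EXTENSION = 1      # hypothetical skip distance, in arbitrary position units


def recover_write(
    write_and_check: Callable[[int], bool],
    erase: Callable[[int], None],
    position: int,
    max_retries: int = MAX_WRITE_RETRIES,
    skip: int = GAP_EXTENSION,
) -> int:
    """Hypothetical write error recovery for one block.

    position is where the failed block was written. erase(pos) erases the
    block at pos. write_and_check(pos) writes the block at pos and returns
    True if the read-while-write check passes. Returns the position at
    which the block was finally written, or raises OSError if the error
    is irrecoverable.
    """
    for _ in range(max_retries):
        erase(position)    # remove the failed copy
        position += skip   # move down the tape, lengthening the interblock gap
        if write_and_check(position):
            return position
    raise OSError("irrecoverable write error")
```

Each pass through the loop leaves a longer interblock gap behind it, which is the cost of stepping past a flaw in the medium.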
Some recent tape format definitions, particularly for streaming cartridge tape,
allow on-the-fly write error recovery where the block containing the error is not
erased but simply repeated, without stopping the tape; in practice, timing
considerations may require that two blocks are repeated. Block numbers in the
block headers allow repeated blocks to be detected on reading.
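
One way a reader might use block numbers to discard repeated blocks is sketched below. It assumes that block numbers increase along the tape and that the later copy of a repeated block is the valid one; neither assumption is taken from a specific format definition, and drop_repeated_blocks is a hypothetical name.

```python
from typing import Iterable, List, Tuple

Block = Tuple[int, bytes]  # (block number from the header, payload)


def drop_repeated_blocks(blocks: Iterable[Block]) -> List[Block]:
    """Return the blocks in tape order with superseded copies removed.

    When a block number is not greater than the previous one, the writer
    must have repeated one or more blocks, so every earlier copy from
    that number onwards is discarded in favour of the later copy.
    """
    kept: List[Block] = []
    for number, payload in blocks:
        # A number that does not increase marks the start of a repeat.
        while kept and kept[-1][0] >= number:
            kept.pop()
        kept.append((number, payload))
    return kept
```

For example, if block 2 failed its check while block 3 was already being written, the tape would hold blocks 1, 2, 3, 2, 3, 4, and the function would keep 1, then the second copies of 2 and 3, then 4.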
The design of the error recovery procedure has an effect on the permanent error
rate, since the more attempts that are made at recovering an error - especially
if the tape transport parameters are varied - the greater the chance that the
recovery will succeed and the error be classified as transient rather than
permanent. Thus the error rate quoted for a subsystem is valid only if the error
recovery procedure prescribed by the hardware manufacturer is followed. Error
rates and error recovery are also affected by block length.
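
The dependence of the permanent error rate on the number of attempts can be illustrated with a deliberately simple model. It assumes that every attempt fails independently with the same probability, which real tape errors need not obey (a flaw in the medium tends to fail repeatedly); the function name and the figures used are illustrative only.

```python
def permanent_error_probability(p_fail: float, attempts: int) -> float:
    """Probability that an error survives every recovery attempt.

    Simplifying assumption: each attempt fails independently with
    probability p_fail, so the error is classified as permanent only
    if all attempts fail.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return p_fail ** attempts
```

Under this model, with p_fail = 0.5 a single attempt leaves half of all errors permanent, while ten attempts leave 0.5 ** 10, or about one in a thousand. This is why an error rate measured with one recovery procedure cannot be compared directly with a rate measured using another.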