Data corruption

From ArticleWorld


Data corruption refers to errors in computer data that arise during transmission or through handling by users or operating systems. Corrupted data differs from the original in unintended ways, and the results are erratic. Corruption cannot be ruled out entirely, but it can be detected with checksums, made less likely by careful transfer and storage practices, and recovered from with backups.

Causes

Data corruption commonly arises from errors during transmission or retrieval of the data. For example, when a transfer is interrupted and one of the parties is not aware of this, the received data may be incomplete or corrupt. External conditions can also play a part: satellite transmissions can be disrupted by very heavy clouds or solar wind, and data transfer in wireless networks can suffer interference from devices such as microwave ovens.

Detection and prevention

Data corruption can be detected using checksums. Suppose corruption behaves like a Poisson process, in which each bit independently has a very low chance of being altered, for example during a file transfer over the Internet. The sender calculates a checksum from the original data and transmits it along with the data. The receiver recalculates the checksum from the data received; if the result differs from the original checksum, the data has most likely been corrupted.
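The checksum comparison described above can be sketched in a few lines of Python. This is only an illustration: CRC-32 from the standard zlib module stands in for whatever checksum a real protocol would use, and the function names and the sample payload are hypothetical.

```python
import zlib

def checksum(payload: bytes) -> int:
    """Checksum the sender computes over the original data (CRC-32 here)."""
    return zlib.crc32(payload)

def is_intact(received: bytes, expected: int) -> bool:
    """Receiver side: recompute the checksum and compare it with the original."""
    return zlib.crc32(received) == expected

# Sender: compute the checksum and send it along with the data.
payload = b"hello, world"      # hypothetical data
sent_checksum = checksum(payload)

# Simulate a single corrupted bit in transit (lowest bit of the first byte).
damaged = bytes([payload[0] ^ 0x01]) + payload[1:]

ok_when_intact = is_intact(payload, sent_checksum)    # checksums match
ok_when_damaged = is_intact(damaged, sent_checksum)   # checksums differ
```

A matching checksum does not prove that the data is intact, since different data can occasionally produce the same checksum; a mismatch, however, is a strong sign of corruption.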

No method of preventing data corruption is failure-free. However, many corruptions can be avoided by keeping the transmission environment free from external interference and internal defects, by using specially designed transfer protocols, and by ensuring that operating systems and programs manipulate the data correctly. Some programs are specifically designed to avoid data corruption when writing to disk.
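One common way for a program to protect a file while writing it is to write the new contents to a temporary file and then rename it over the old one, so that an interrupted write leaves the previous version in place. The sketch below assumes this technique, which the article does not name specifically; the function name atomic_write is hypothetical, and the guarantees ultimately depend on the operating system and file system.

```python
import os
import tempfile

def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data` without exposing a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # ask the OS to push the bytes to disk
        os.replace(tmp_path, path)   # swap the new file into place
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A caller would use atomic_write("settings.bin", new_bytes) in place of opening the target file directly. If the program stops partway through, the original file is still the old, complete version.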

Backups (traditional or continuous) or RAID arrays are recommended, so that data damaged beyond repair can be restored from another copy. This is especially important for encrypted data (for example, encrypted compressed data in a banking system), because a single error in the file can make it completely unusable.
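The fragility of such data can be illustrated with compression alone. The sketch below uses Python's standard zlib module as a stand-in (no encryption is involved, and the record contents are hypothetical). It inverts a single bit in the header of a compressed stream, where zlib's header check is certain to notice, and the decompressor then refuses the whole stream.

```python
import zlib

def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of `data` with one bit inverted."""
    damaged = bytearray(data)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)

def try_decompress(blob: bytes):
    """Return the decompressed bytes, or None if zlib rejects the stream."""
    try:
        return zlib.decompress(blob)
    except zlib.error:
        return None

# Hypothetical record, for illustration only.
original = b"account=0001;balance=100.00\n" * 100
packed = zlib.compress(original)

# Invert a single bit in the first header byte of the compressed stream.
damaged = flip_bit(packed, 0, 0)

intact_result = try_decompress(packed)     # the original bytes
damaged_result = try_decompress(damaged)   # None: the stream is rejected
```

Errors elsewhere in the stream are usually caught as well, either as an invalid stream or by the integrity check zlib stores at the end of the data. In every such case nothing usable comes back, which is why a separate intact copy is needed.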