Sunday 6 March 2011

Experiments in Data Corruption

Glitch Art involves the use of accidental data corruption (pure glitch), as well as the simulation of pure glitches by creating scenarios where glitches can occur. This can involve the use of redundant or faulty technologies to create new data or disrupt the output of existing information. The other route is to hack into the raw data code of digital image files and apply treatments that will result in the random corruption of the viewable image.
Data corruption, or databending, is an interesting and relevant area of investigation.

Databending involves the misuse of digital information. The most common types of databending are:
• Reinterpretation: converting a file from one medium to another or from one file format to a dissimilar format.
• Sonification: the reinterpretation of non-audio data as audio data.
• Forced error: forcing an application or piece of hardware to fail in the hope that it will behave unexpectedly or that the data will corrupt.
• Incorrect editing: editing a file using software/hardware intended for a different form of data; say, editing non-text files in a text editor [1]
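As a small illustration of the sonification route, here is a minimal sketch in Python that wraps the raw bytes of any file in a WAV header so that an audio player or editor will treat them as sound. The function name sonify, the 8000 Hz sample rate and the 8-bit mono format are my own assumptions for the example, not part of any databending standard.

```python
import wave


def sonify(data: bytes, out_path: str, sample_rate: int = 8000) -> int:
    """Write arbitrary bytes as 8-bit unsigned mono PCM audio.

    Returns the number of audio frames written (one frame per input byte).
    """
    with wave.open(out_path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(1)            # 1 byte per sample = 8-bit unsigned PCM
        wav.setframerate(sample_rate)  # assumed rate; any value will play
        wav.writeframes(data)          # the file's bytes become the samples
    return len(data)
```

Reading an image with open(path, "rb").read() and passing the result to sonify gives a WAV file that can be opened in an audio editor. Going the other way (applying audio effects and saving the samples back as an image) would need the original header preserved, which this sketch does not attempt.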

I have therefore undertaken a number of experiments to investigate whether I can generate corrupted image files through manual intervention or source code manipulation. This logbook is a record of these investigations, complete with reflections on the resulting outcomes.

Application Creation
Through a series of contacts in the data programming industry, I have enlisted help to investigate the practicalities of creating an application which can automate a process in which the JPEG code is randomly adjusted to produce corrupted, viewable image files.
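To make the idea concrete, here is a rough sketch in Python of what such an automated process might look like. It is not the application under development, only my own assumption of one possible approach: walk the JPEG marker segments to the first Start of Scan (SOS) header, then flip single bits at random positions in the compressed data that follows, leaving the headers and the final two bytes (assumed to be the End of Image marker) untouched. The function names and the n_errors and seed parameters are hypothetical.

```python
import random


def find_scan_start(data: bytes) -> int:
    """Return the offset of the first compressed byte after the first SOS header."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:  # optional fill byte before a marker
            pos += 1
            continue
        # The 2-byte big-endian length counts itself but not the marker.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        if marker == 0xDA:  # SOS: compressed data follows its header
            return pos + 2 + length
        pos += 2 + length
    raise ValueError("no SOS marker found")


def corrupt_jpeg(data: bytes, n_errors: int = 5, seed=None) -> bytes:
    """Flip one bit in each of n_errors randomly chosen compressed bytes."""
    rng = random.Random(seed)
    start = find_scan_start(data)
    end = len(data) - 2  # assumption: the last two bytes are the EOI marker
    out = bytearray(data)
    # Skip 0xFF bytes and the byte after them, so existing markers and
    # stuffed zero bytes are left alone.
    candidates = [i for i in range(start, end)
                  if out[i] != 0xFF and out[i - 1] != 0xFF]
    for i in rng.sample(candidates, min(n_errors, len(candidates))):
        while True:
            new = out[i] ^ (1 << rng.randrange(8))
            if new != 0xFF:  # never create a new marker prefix
                break
        out[i] = new
    return bytes(out)
```

Only the first scan is located, so progressive JPEGs (which contain several scans) would need more work. Whether the output is still viewable depends on the decoder; some viewers give up at the first error while others draw whatever they can decode.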

Because of the way JPEG compression is designed, images are stored in tightly-packed streams of binary bits (not bytes). Each pixel can be represented by as few as 2 bits to as many as 26 bits (dictated by the variable-length Huffman Coding scheme). To make matters worse, in an effort to keep the compression as efficient as possible, there is virtually nothing to indicate where you are in the stream of bits (unless Restart Markers are used). Therefore, as soon as a single bit is encountered wrong, the millions of bits that follow will be decoded incorrectly as well. The manner in which DC and AC coefficients are arranged in MCUs means that this corruption often shows up in shearing, wild color shifts and many other visual phenomena. [2]
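The knock-on effect described above can be seen with a toy variable-length code. The sketch below is in Python and uses a four-symbol prefix code that I have made up for illustration; real JPEG Huffman tables are stored in each file and are far larger. It encodes a short message as a string of bits, flips one bit, and decodes the result.

```python
# Toy prefix code, invented for this example (not a real JPEG Huffman table).
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
DECODE = {bits: sym for sym, bits in CODE.items()}


def encode(message: str) -> str:
    """Concatenate the code words with nothing marking where each one ends."""
    return "".join(CODE[ch] for ch in message)


def decode(bits: str) -> str:
    """Read bits one at a time, emitting a symbol whenever a code word matches."""
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in DECODE:
            out.append(DECODE[current])
            current = ""
    return "".join(out)  # an incomplete trailing code word is dropped


def flip_bit(bits: str, index: int) -> str:
    """Return a copy of the bit string with the bit at index inverted."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]
```

Flipping bit 1 of encode("abcdabcd") turns the single symbol "b" into two "a" symbols, so every later symbol arrives one position late. This toy code happens to fall back into step straight away. In a JPEG the extra or missing symbol misplaces the coefficients that follow, and because each DC value is stored as a difference from the previous one, an error in one block carries into the blocks after it.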

This work is ongoing.

The remainder of this log records my attempts to achieve corruption by a variety of methods.

