dd is a common Unix program whose primary purpose is the low-level copying and conversion of raw data. The name alludes to the DD ("data definition") statement of IBM JCL, and the command's syntax is meant to be reminiscent of it.[1]
dd is used to copy a specified number of bytes or blocks, performing on-the-fly byte order conversions, as well as more esoteric EBCDIC to ASCII conversions.[2] dd can also be used to copy regions of raw device files, e.g. backing up the boot sector of a hard disk, or to read fixed amounts of data from special files like /dev/zero or /dev/random.[3]
It is jokingly said to stand for "disk destroyer", "data destroyer", or "delete data": because dd is used for low-level operations on hard disks, a small mistake, such as reversing the if and of parameters, can destroy some or all of the data on a disk.[2]
Usage
The command line syntax of dd is significantly different from that of most other Unix programs, and because of its ubiquity it has resisted recent attempts to enforce a common syntax for all command line tools. Generally, dd uses an option=value format, whereas most Unix programs use either an -option value or an --option=value format. Also, dd's input is specified using the "if" (input file) option and its output using the "of" (output file) option, while most programs simply take file names as arguments. The syntax is rumored to have been based on IBM's JCL, and though it may have been a joke,[1] there seems never to have been any effort to write a more Unix-like replacement.
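As a small illustration of the operand style described above, the following sketch passes only option=value operands. It relies on dd reading standard input when if= is omitted and writing standard output when of= is omitted; the sample string and the variable name are arbitrary.

```shell
# No if= or of= operand: dd reads standard input and writes standard output.
# bs=1 count=5 copies exactly five one-byte blocks.
first_five=$(printf 'hello world' | dd bs=1 count=5 2>/dev/null)
echo "$first_five"   # hello
```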
Example use of dd command to create an ISO disk image from a CD-ROM:
dd if=/dev/cdrom of=/home/sam/myCD.iso bs=2048 conv=sync,notrunc
Note that an attempt to copy the entire disk image using cp may omit the final block if it is an unexpected length; dd will always complete the copy if possible.
Using dd to wipe an entire disk with random data:
dd if=/dev/urandom of=/dev/hda
Alternatively, to overwrite the disk seven times in a row:
for n in {1..7}; do dd if=/dev/urandom of=/dev/sda bs=8b conv=notrunc; done
Using dd to duplicate one hard disk partition to another hard disk:
dd if=/dev/sda2 of=/dev/sdb2 bs=4096 conv=notrunc,noerror
Note that notrunc means "do not truncate the output file": if the output file already exists, dd replaces only the bytes it writes and leaves the rest of the file intact. For example, if the file /abc is 3 megabytes long, the command "dd if=/dev/zero of=/abc bs=1 count=1 conv=notrunc" replaces the first byte of /abc with a null (ASCII 0) and keeps the remaining bytes, whereas the command "dd if=/dev/zero of=/abc bs=1 count=1" truncates /abc and leaves a file consisting of a single null byte. noerror means to keep going if there is a read error (though a better tool for this would be ddrescue).
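The effect of notrunc can be checked safely on a scratch file rather than on a real disk. The sketch below is illustrative only; the temporary file and the variable names are assumptions, not part of the examples above.

```shell
# Hypothetical scratch file; nothing here touches a real device.
tmp=$(mktemp)
printf 'ABCDEFGH' > "$tmp"

# With conv=notrunc only the first byte is overwritten; the length stays 8.
dd if=/dev/zero of="$tmp" bs=1 count=1 conv=notrunc 2>/dev/null
size_notrunc=$(($(wc -c < "$tmp")))

# Without notrunc the file is truncated when opened, leaving a single byte.
dd if=/dev/zero of="$tmp" bs=1 count=1 2>/dev/null
size_trunc=$(($(wc -c < "$tmp")))

echo "notrunc: $size_notrunc bytes, default: $size_trunc bytes"
rm -f "$tmp"
```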
To duplicate a disk partition as a disk image file on a different partition:
dd if=/dev/sdb2 of=/home/sam/partition.image bs=4096 conv=notrunc,noerror
To duplicate a disk partition as a disk image file on a remote machine over a secure ssh connection:
dd if=/dev/sdb2 | ssh user@host "dd of=/home/user/partition.image"
Create a 1 GiB file containing only zeros (bs is the block size, count is the number of blocks):
dd if=/dev/zero of=file1G.tmp bs=1G count=1
To verify that a drive is really zeroed out:
dd if=/dev/sda | hexdump -C | head
The output of this command will resemble the following if the drive is blank:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
201f78000
16841664+0 records in
16841664+0 records out
8622931968 bytes (8.6 GB) copied, 1247.05 s, 6.9 MB/s
If the drive is blank, one line of blank bytes will be printed, followed by a '*' signifying repeated blank lines, followed by a line indicating the address of the line which ends the repetition, followed by the statistics which are printed after the output. The numbers in the statistics above are illustrative. If the drive is not entirely blank, there will be more than one line of data output.
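The same check can be tried without a real drive. The sketch below is an illustration under stated assumptions: it uses a small scratch file in place of /dev/sda and substitutes od for hexdump -C, since od collapses repeated lines into a '*' in the same way.

```shell
# Hypothetical scratch file standing in for a drive; no real device is read.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=8 2>/dev/null

# od, like hexdump, collapses runs of identical lines into a single '*'.
# An all-zero file therefore prints 3 lines: data, '*', and the end offset.
blank_lines=$(($(od -A x -t x1 "$img" | wc -l)))

# Plant one non-zero byte in the middle without truncating the file.
printf 'X' | dd of="$img" bs=1 seek=2048 conv=notrunc 2>/dev/null
dirty_lines=$(($(od -A x -t x1 "$img" | wc -l)))

echo "blank: $blank_lines lines, after modification: $dirty_lines lines"
rm -f "$img"
```

A line count above three is a quick sign that the data is not entirely zero.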
To duplicate the first 2 sectors of a floppy disk:
dd if=/dev/fd0 of=/home/sam/MBRboot.image bs=512 count=2
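To duplicate only the boot code of the master boot record (the first 446 bytes of the 512-byte sector, which excludes the partition table and the boot signature):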
dd if=/dev/sda of=/home/sam/MBR.image bs=446 count=1
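The classic MBR layout places 446 bytes of bootstrap code first, followed by a 64-byte partition table and a 2-byte signature, so bs=446 count=1 captures the boot code only while bs=512 count=1 captures the whole sector. The following sketch shows how skip= and count= select each region; it works on a made-up 512-byte image (all file and variable names are hypothetical), not on a real disk.

```shell
# Hypothetical 512-byte image standing in for sector 0 of a disk.
mbr=$(mktemp); code=$(mktemp); table=$(mktemp)
dd if=/dev/zero of="$mbr" bs=512 count=1 2>/dev/null
# Write the boot signature 0x55 0xAA at offset 510 (octal escapes 125, 252).
printf '\125\252' | dd of="$mbr" bs=1 seek=510 conv=notrunc 2>/dev/null

# Bytes 0-445: bootstrap code only.
dd if="$mbr" of="$code" bs=446 count=1 2>/dev/null
# Bytes 446-509: the four 16-byte partition table entries.
dd if="$mbr" of="$table" bs=1 skip=446 count=64 2>/dev/null
# Bytes 510-511: the signature, shown as hex.
sig=$(dd if="$mbr" bs=1 skip=510 count=2 2>/dev/null | od -A n -t x1 | tr -d ' \n')

code_size=$(($(wc -c < "$code")))
table_size=$(($(wc -c < "$table")))
echo "code=$code_size table=$table_size signature=$sig"
rm -f "$mbr" "$code" "$table"
```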
To benchmark a drive and analyze its sequential write and read performance:
dd if=/dev/zero bs=1024 count=1000000 of=/home/sam/1Gb.file
dd if=/home/sam/1Gb.file bs=64k | dd of=/dev/null
To make a file of 100 random bytes:
dd if=/dev/urandom of=/home/sam/myrandom bs=100 count=1
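To convert a file to uppercase in place (conv=notrunc is needed when the input and output are the same file, because otherwise dd truncates the file before reading it):
dd if=filename of=filename conv=ucase,notrunc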
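A self-contained way to try conv=ucase is to use separate input and output files, which avoids any interaction between reading and writing the same file. The file names and sample text below are placeholders.

```shell
# Hypothetical scratch files; input and output are deliberately distinct.
in=$(mktemp); out=$(mktemp)
printf 'hello, dd\n' > "$in"

# conv=ucase maps lowercase letters to uppercase while copying.
dd if="$in" of="$out" conv=ucase 2>/dev/null
upper=$(cat "$out")
echo "$upper"   # HELLO, DD
rm -f "$in" "$out"
```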
To search the system memory:
dd if=/dev/mem | hexdump -C | grep 'some-string-of-words-in-the-file-you-forgot-to-save-before-you-hit-the-close-button'
Image a disk to another machine:
On source machine:
dd if=/dev/hda bs=16065b | netcat targethost-IP 1234
On target machine:
netcat -l -p 1234 | dd of=/dev/hdc bs=16065b
Two measures speed up a disk-to-disk copy. The first is to raise the block size from the default of 512 bytes. The second addresses the problem that a single dd process is, at any given moment, either reading or writing: if the first dd is piped into a second one, reads and writes can overlap, and the copy runs at the maximum speed of the slowest device.
dd if=/dev/ad2 conv=noerror,sync bs=64k | dd of=/dev/ad3 bs=64k
Sending a USR1 signal to a running dd process (GNU dd) makes it print I/O statistics to standard error and then resume copying:
$ dd if=/dev/zero of=/dev/null& pid=$!
$ kill -USR1 $pid
18335302+0 records in
18335302+0 records out
9387674624 bytes (9.4 GB) copied, 34.6279 seconds, 271 MB/s
Create a 1GB sparse file or resize an existing file to 1GB without overwriting:
dd if=/dev/zero of=mytestfile.out bs=1 count=0 seek=1G
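The resize-without-overwriting behaviour can be observed on a small scale. The sketch below is an illustration with assumed names: it extends a scratch file to 1 MiB instead of 1 GiB and spells the offset out as a plain number so that it does not depend on size suffixes.

```shell
# Hypothetical scratch file; 1 MiB is used instead of 1 GiB to keep the demo small.
f=$(mktemp)
printf 'data' > "$f"

# count=0 copies nothing; seek moves the output position to 1 MiB (bs=1),
# and dd then sets the file length to that offset.
dd if=/dev/zero of="$f" bs=1 count=0 seek=$((1024 * 1024)) 2>/dev/null

size=$(($(wc -c < "$f")))
head4=$(dd if="$f" bs=1 count=4 2>/dev/null)
echo "size=$size first bytes=$head4"
rm -f "$f"
```

Whether the extended region actually occupies disk blocks depends on the file system; on file systems that support sparse files it does not.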
Some implementations understand x as a multiplication operator in the block size and count parameters:
dd bs=2x80x18b if=/dev/fd0 of=/tmp/floppy.image
where the "b" suffix indicates that the units are 512-byte blocks. Unix block devices use this as their allocation unit by default.
The following suffixes may be appended to a decimal number in the value of bs:
w multiplies by 2
b multiplies by 512
k multiplies by 1024
M multiplies by 1024*1024
G multiplies by 1024*1024*1024
Hence bs=2x80x18b means 2*80*18*512 = 1474560 bytes, which is the exact size of a 1440 KiB floppy disk.
To mount that image:
mount -o loop floppy.image /mntpoint
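The block-size arithmetic for the floppy image can be verified with ordinary shell arithmetic, which avoids relying on the x operator or the b suffix being supported by a particular dd. The variable names are illustrative.

```shell
# 2 sides x 80 tracks x 18 sectors per track x 512 bytes per sector
floppy_bytes=$((2 * 80 * 18 * 512))
floppy_kib=$((floppy_bytes / 1024))
echo "$floppy_bytes bytes = $floppy_kib KiB"   # 1474560 bytes = 1440 KiB
```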
Speed
Small block sizes (bs=) take much longer because each transfer request carries a fixed overhead. Above roughly 32k, 64k, or 128k (depending on the machine) there is little further to be gained, since larger blocks have to be split up anyway. The "sweet spot" is often around 64k, which is much larger than the default bs=512. As a rule of thumb, bs=64k is a reasonable choice for large files where you don't need to count blocks. The final partial block, smaller than 64k, is still copied.
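That a trailing partial block is copied can be confirmed with a file whose size is not a multiple of the block size. The following sketch uses hypothetical scratch files and a 100000-byte source, which is one full 64k block plus a remainder.

```shell
# Hypothetical scratch files; 100000 bytes is not a multiple of 65536.
src=$(mktemp); dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=1000 count=100 2>/dev/null

# Copy with a 64k block size: one full block plus one partial block.
dd if="$src" of="$dst" bs=64k 2>/dev/null

# cmp -s is silent and succeeds only if the two files are byte-identical.
if cmp -s "$src" "$dst"; then copy_ok=yes; else copy_ok=no; fi
dst_size=$(($(wc -c < "$dst")))
echo "identical=$copy_ok size=$dst_size"
rm -f "$src" "$dst"
```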
Output messages
The documentation of the GNU variant of dd, as supplied with Linux, does not describe the format of the messages printed on standard error on completion; they are, however, described by other implementations, e.g. the one supplied with BSD.
Each of the "records in" and "records out" lines shows the number of complete blocks transferred, a plus sign, and the number of partial blocks. A partial block occurs, for example, when the physical medium ends before a complete block has been read.
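The full-plus-partial notation can be reproduced with a scratch file whose size is not a multiple of the block size. This is a sketch with assumed names; LC_ALL=C is set so that the messages are not translated.

```shell
# Hypothetical scratch file of 1300 bytes: two full 512-byte blocks
# plus one partial block of 276 bytes.
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=1300 count=1 2>/dev/null

# dd writes its statistics to standard error, so redirect it to capture them.
stats=$(LC_ALL=C dd if="$tmp" of=/dev/null bs=512 2>&1)
records_in=$(printf '%s\n' "$stats" | sed -n '1p')
records_out=$(printf '%s\n' "$stats" | sed -n '2p')
echo "$records_in"    # 2+1 records in
echo "$records_out"   # 2+1 records out
rm -f "$tmp"
```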
ATA Disks over 128 GB
Seagate documentation warns, "Certain disc utilities, such as DD, which depend on low-level disc access may not support 48-bit LBAs until they are updated."[4] 48-bit LBA is required for ATA hard drives over 128 GiB (about 137 GB) in size.
Recovery-oriented variants of dd
Open-source Unix-based programs for data rescue include dd_rescue and dd_rhelp (which work together), savehd7, and GNU ddrescue.
Antonio Diaz Diaz (the developer of GNU ddrescue) compares[5] the variants of dd for the task of rescuing:
The standard utility dd does a linear read of the drive, so it can take a long time or even fry the drive without rescuing anything if the errors are at the beginning of the drive. Kurt Garloff's dd_rescue does basically the same thing as dd, only more efficiently. LAB Valentin's dd_rhelp is a complex shell script that runs Garloff's dd_rescue many times, trying to be strategic about copying the drive, but it is very inefficient.
- dd_rhelp first extracts all the readable data, and saves it to a file, inserting zeros where bytes cannot be read. Then it tries to re-read the invalid data and update this file.
- GNU ddrescue can be used to copy data directly to a new disk if needed, just like Linux dd.
dd_rhelp or GNU ddrescue will yield a complete disk image, faster but possibly with some errors. GNU ddrescue is generally much faster, as it is written entirely in C++, whereas dd_rhelp is a shell script wrapper around dd_rescue. Both dd_rhelp and GNU ddrescue aim to copy data fast where there are no errors, then copy in smaller blocks and with retries where there are errors. GNU ddrescue is easy to use with default options, and can easily be downloaded and compiled on Linux-based Live CDs such as Knoppix, and can be used with SystemRescueCD.
GNU ddrescue example [6]
# first, grab most of the error-free areas in a hurry:
ddrescue -n /dev/old_disk /dev/new_disk rescued.log
# then try to recover as much of the dicey areas as possible:
ddrescue -r 1 /dev/old_disk /dev/new_disk rescued.log
Kernels differ considerably in how they process disk errors. FreeBSD, NetBSD, OpenBSD, Solaris, and different Linux kernels (e.g. hda vs. sda (<2.6.20)) behave differently. Also, Linux lacks "raw" disk devices such as the BSDs have, which makes it less desirable for low-level data recovery. Non-raw devices read larger blocks than requested, obscuring the actual location where the error occurred. You may wish to use "dmesg -n8" to see the error messages on the console.