Background
At my day job I deal with a fair amount of image data. We typically ship the data out on hard drives, thumb drives, or via SFTP, and on occasion we’ll burn it to a CD or DVD. Until today every shipment was either large (200-400GB) or small (under 1-2GB). Today’s shipment, however, was 18GB. What to do? I didn’t have a spare USB thumb drive handy, so I thought I’d just throw it on a few single-layer DVDs. My first order of business was to figure out how many. As is usual with Linux/UNIX, there’s already a tool for pretty much everything, if only you look hard enough 8-).
For this particular shipment all the image data was organized into a couple dozen folders, each weighing in at ~100-200MB. I quickly figured that 5 DVDs should be more than enough, but how to optimally fill each DVD? Luckily there’s a program called dirsplit which made this a breeze.
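The "5 DVDs" figure is just a ceiling division of the total size by the usable space per disc. A minimal sketch, assuming the 18GB total expressed in MB and a per-disc capacity of 4488 MB (the default medium size that dirsplit's help reports); the function name `discs_needed` is made up for illustration:

```shell
#!/bin/sh
# discs_needed TOTAL_MB CAPACITY_MB
# Ceiling division: how many media of CAPACITY_MB are needed for TOTAL_MB.
# (Hypothetical helper, not part of cdrkit.)
discs_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# 18GB of data (18 * 1024 MB) on media holding 4488 MB each
discs_needed $((18 * 1024)) 4488    # prints 5
```

Treat the result as a lower bound: files can't be cut at arbitrary points, so a real split may leave some space unused on each disc.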
Solution
Yet another tool I’d never heard of, dirsplit is a Perl script that analyzes a directory and works out an efficient way to split it across a set of DVDs. Once it’s done analyzing, it writes a set of .list files, one per DVD’s worth of files. dirsplit is part of the cdrkit package, which includes the following programs:
- dirsplit: dirsplit utility
- genisoimage: Creates an image of an ISO9660 filesystem
- icedax: A utility for sampling/copying .wav files from digital audio CDs
- wodim: A command line CD/DVD recording program – (“write optical disk media”) – a cdrecord replacement
It can get a little confusing because cdrkit, at least under Fedora & CentOS, is split into 4 individual RPMs. We only need the dirsplit one here. I installed it like so:
    yum install dirsplit
dirsplit’s basic usage:
    % dirsplit [options] [advanced options] < directory >
     -H|--longhelp Show the long help message with more advanced options
     -n|--no-act   Only print the commands, no action (implies -v)
     -s|--size     NUMBER - Size of the medium (default: 4488M)
     -e|--expmode  NUMBER - directory exploration mode (recommended, see long help)
     -m|--move     Move files to target dirs (default: create mkisofs catalogs)
     -p|--prefix   STRING - first part of catalog/directory name (default: vol_)
     -h|--help     Show this option summary
     -v|--verbose  More verbosity

    The complete help can be displayed with the --longhelp (-H) option.
    The default mode is creating file catalogs useable with:
        mkisofs -D -r --joliet-long -graft-points -path-list CATALOG

    Example:
     dirsplit -m -s 700M -e2 random_data_to_backup/
Once installed, cd <image data directory>, and run the following command:
    # -e takes a number (1-4). In our case we're using 2
    # 2: like 1, but all files in directory are put together (as "atom") onto the
    # same medium. This does not apply to subdirectories, however.

    # analyze current directory, i.e. the dot
    % dirsplit -e2 .
    Building file list, please wait...
    Calculating, please wait...
    ....................
    Calculated, using 5 volumes.
    Wasted: 7827961 Byte (estimated, check mkisofs -print-size ...)
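If you'd rather see what dirsplit intends to do before letting it loose, its help output lists a -n/--no-act switch. A small sketch wrapping it; `preview_split` is a name I made up, and the -e2 mode and 4488M default simply mirror the run and help text above:

```shell
#!/bin/sh
# preview_split [SIZE] [DIR]
# Run dirsplit with --no-act, which per its help only prints the commands
# and takes no action. SIZE defaults to 4488M, DIR to the current directory.
# (preview_split is a hypothetical wrapper name.)
preview_split() {
    dirsplit --no-act -e2 -s "${1:-4488M}" "${2:-.}"
}
```

For example, `preview_split 700M .` would preview a split for CD-sized media instead of DVDs.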
In addition to telling us how many DVDs we’ll need, it also reports how much space will be wasted, and writes one .list file per DVD. For the above run, dirsplit generated the following 5 .list files:
    -rwxrwxr-x 1 root root 1679528 Feb 18 21:18 vol_1.list
    -rwxrwxr-x 1 root root 1689556 Feb 18 21:18 vol_2.list
    -rwxrwxr-x 1 root root 1694680 Feb 18 21:18 vol_3.list
    -rwxrwxr-x 1 root root 1694300 Feb 18 21:18 vol_4.list
    -rwxrwSr-x 1 root root   17110 Feb 18 21:18 vol_5.list
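To double-check how much data each catalog actually covers, you can add up the sizes of the files it names. A rough sketch, assuming each line of a .list file has the graft-point form `target=source` (I haven't verified this against every dirsplit version) and that the target part contains no "="; `catalog_bytes` is a hypothetical helper name:

```shell
#!/bin/sh
# catalog_bytes LISTFILE
# Sum the sizes (in bytes) of the source files named in a catalog.
# ASSUMPTION: each line looks like "graft_point=source_path".
# Lines whose source is not a regular file are skipped.
# The body runs in a subshell so its variables don't leak out.
catalog_bytes() (
    total=0
    while IFS= read -r line; do
        src=${line#*=}                 # everything after the first "="
        [ -f "$src" ] || continue
        total=$(( total + $(wc -c < "$src") ))
    done < "$1"
    echo "$total"
)

# Usage (run where the .list files live):
#   for f in vol_*.list; do echo "$f: $(catalog_bytes "$f") bytes"; done
```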
The beauty of dirsplit is that the .list files can be fed to mkisofs to generate .iso files, one per .list file. mkisofs has an option, -path-list, which takes the .list file as its argument. I used the following command to generate a single .iso from the 1st .list file, vol_1.list.
    % mkisofs -o ~/backup1.iso -D -r --joliet-long -V "BACKUP DISC1" -graft-points -path-list vol_1.list
    INFO: UTF-8 character encoding detected by locale settings.
    Assuming UTF-8 encoded filenames on source filesystem,
    use -input-charset to override.
    0.22% done, estimate finish Sat Feb 18 04:41:56 2012
    0.44% done, estimate finish Sat Feb 18 04:41:57 2012
    0.65% done, estimate finish Sat Feb 18 04:41:57 2012
    0.87% done, estimate finish Sat Feb 18 04:41:57 2012
    1.09% done, estimate finish Sat Feb 18 04:40:25 2012
    1.31% done, estimate finish Sat Feb 18 04:40:41 2012
    ...
    ...
    99.46% done, estimate finish Sat Feb 18 04:41:58 2012
    99.68% done, estimate finish Sat Feb 18 04:41:57 2012
    99.90% done, estimate finish Sat Feb 18 04:41:57 2012
    Total translation table size: 0
    Total rockridge attributes bytes: 1086116
    Total directory bytes: 2136064
    Path table size(bytes): 4858
    Max brk space used bfe000
    2292383 extents written (4477 MB)
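The last line of the mkisofs output gives the image size in extents, which makes for a quick sanity check before burning. A rough sketch, assuming 2048-byte extents (consistent with the 4477 MB shown above) and a single-layer DVD+R capacity of 2295104 sectors; that capacity figure and the helper names are my assumptions, not something mkisofs reports:

```shell
#!/bin/sh
# ASSUMPTION: 2295104 sectors of 2048 bytes is the capacity of a
# single-layer DVD+R (DVD-R holds slightly more). Adjust for your media.
DVD_SECTORS=2295104

# extents_to_mb EXTENTS: convert 2048-byte extents to whole MB
# (2048 / 1048576 = 1/512, so dividing by 512 avoids large intermediates)
extents_to_mb() {
    echo $(( $1 / 512 ))
}

# fits_on_dvd EXTENTS: succeed if an image of that many extents fits on one disc
fits_on_dvd() {
    [ "$1" -le "$DVD_SECTORS" ]
}

extents_to_mb 2292383                  # prints 4477
fits_on_dvd 2292383 && echo "fits"     # prints fits
```

The dirsplit output earlier also points at `mkisofs -print-size`, which can give you the extent count without writing the image first.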
You could use something more advanced to generate all the .iso files:
    #!/bin/bash

    for i in $(seq 1 5); do
        mkisofs -o ~/backup${i}.iso -D -r --joliet-long -V "BACKUP DISC${i}" -graft-points -path-list vol_${i}.list
    done
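If you don't want to hard-code the number of volumes, you can loop over whatever .list files dirsplit left behind. A sketch under the assumption that the default vol_ prefix was used; `disc_number` and `build_all_isos` are hypothetical names, and the mkisofs flags are the same ones used above:

```shell
#!/bin/sh
# disc_number LISTFILE: map e.g. "vol_3.list" to "3"
# (assumes dirsplit's default "vol_" prefix)
disc_number() (
    n=${1##*/}        # drop any leading directory
    n=${n#vol_}       # drop the prefix
    echo "${n%.list}" # drop the suffix
)

# build_all_isos: one mkisofs run per catalog in the current directory
build_all_isos() (
    for list in vol_*.list; do
        [ -e "$list" ] || continue     # the glob matched nothing
        n=$(disc_number "$list")
        mkisofs -o "$HOME/backup${n}.iso" -D -r --joliet-long \
            -V "BACKUP DISC${n}" -graft-points -path-list "$list"
    done
)
```

Run `build_all_isos` from the directory that holds the .list files. It picks up however many volumes there are, so a re-run of dirsplit with a different medium size needs no script changes.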
Once you’ve got .iso files, you can use your favorite burning software to write them to DVDs. I usually just do something like this:
    # in dir. where the .iso are
    % sudo growisofs -Z /dev/dvd=backup1.iso
References
links
- cdrkit – portable command-line CD/DVD recorder software
- Splits directory into multiple with equal size for ISO burning purpose – cyberciti.biz