
USB thumb drive RAID

By January Weiner

I have a database that I'm working on, and sometimes I need to work on it on my laptop. However, the database is really demanding, and it is just too slow on my laptop's hard disk. I quickly found out that the bottleneck was the speed of the hard drive, not the CPU. What I needed was a fast external drive. Besides, I had always wanted to play with a RAID system.

Hard Disk Performance

There are three parameters of drive speed: read speed, write speed, and access time.

I do not need fast read/write speed, as the amount of information that I retrieve from the database is tiny and the db is almost entirely read-only. However, I do need fast access time: the database is huge, and I need to retrieve information from different positions in the database very quickly. That is, I need very low access times, acceptable reading speed, and I do not care about writing.

Solution

It is well known that so-called "solid-state disks" (SSDs) have very low access times. I could have tried to buy an SSD, but being a tinkerer, I decided on another option. Thumb drives / flash drives / pen drives are also a kind of SSD, one could say, but they have lousy transfer rates. In the end, I decided to create a software RAID using four 2GB USB drives. I bought

Setting up the Software RAID

Prerequisites: you need the mdadm tool (in Debian, simply run apt-get install mdadm).

Insert the drives into the hub, and attach the hub to the computer. Note: if GNOME or whatever mounts the disks automatically, unmount them before continuing. First, it is necessary to find out the names of the devices that were attached:

dmesg | grep "Attached SCSI"
sd 56:0:0:0: [sde] Attached SCSI removable disk
sd 57:0:0:0: [sdf] Attached SCSI removable disk
sd 58:0:0:0: [sdg] Attached SCSI removable disk
sd 59:0:0:0: [sdh] Attached SCSI removable disk
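
With several drives attached, picking the device names out of the kernel log by eye gets tedious. A minimal sketch that does it automatically, assuming the kernel messages look like the ones above; the function name attached_devices is made up for this illustration:

```shell
# Print the /dev node for every "Attached SCSI" message read on stdin.
# Assumes lines of the form "... [sde] Attached SCSI removable disk".
attached_devices() {
    sed -n 's|.*\[\(sd[a-z][a-z]*\)] Attached SCSI.*|/dev/\1|p'
}

# Usage on the real machine:
#   dmesg | attached_devices
```

Note that the kernel log may also list drives that were plugged in earlier, so the result is worth a second look before it is fed to any command that writes to the devices.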

OK, the devices are /dev/sde, /dev/sdf, /dev/sdg, and /dev/sdh. I want a RAID-0; that is, no redundancy, and 4 x 2GB = 8GB of space. Creating the RAID is simple:

mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sd{e,f,g,h}

This way, we have a new block device that can be formatted. I use ext2, since reliability / journaling plays no role:

mkfs.ext2 /dev/md0
tune2fs -c 0 -i 0 /dev/md0
mount /dev/md0 /mnt

The first command creates the filesystem ("formats" the device); the second disables regular checks. Finally, the third command mounts the RAID on the filesystem so we can write data to it and read from it.
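
To confirm that the regular checks really are switched off, the filesystem settings can be read back with tune2fs -l. A rough sketch, assuming the usual "Maximum mount count" and "Check interval" lines in that output; the function name periodic_checks_disabled is invented here:

```shell
# Read `tune2fs -l` output on stdin. Succeed only if both the
# mount-count check and the time-based check are disabled.
periodic_checks_disabled() {
    awk -F': *' '
        /^Maximum mount count:/ { count = $2 + 0; have_count = 1 }
        /^Check interval:/      { interval = $2 + 0; have_interval = 1 }
        END {
            # A mount count of 0 or -1 and an interval of 0 both mean "never".
            ok = have_count && have_interval && count <= 0 && interval == 0
            exit !ok
        }
    '
}

# Usage (as root):
#   tune2fs -l /dev/md0 | periodic_checks_disabled && echo "checks are off"
```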

Stopping and Starting the Array

Stopping the Array

Before you stop the array, run the following (and save the output somewhere):

mdadm --detail /dev/md0

To stop the array that is running, first unmount the directory (umount /mnt) and then stop the array:

mdadm --stop /dev/md0

Now, you can safely remove the disks and, for example, plug them into another machine.

Starting the Array, Again

Before you can use your RAID again, you need to "assemble" it. This is easy if you have not removed the disks and are assembling the array on the same machine. In that case, you can just type:

mdadm --verbose -A /dev/md0 /dev/sd{e,f,g,h}

However, what if the device letters have changed (e.g., not e-h, but i, j, k, l)? Well, you could find out again what the letters are. But there is a better solution. Remember I told you to save the output from "mdadm --detail"? It contained a line like this:

           UUID : d7ea744f:c3963d02:982f0012:7010779c

Based on this UUID, we can easily "assemble" the array on just about any computer:

mdadm --verbose -A /dev/md0 -u d7ea744f:c3963d02:982f0012:7010779c

You can also enter this information in the config file /etc/mdadm/mdadm.conf.
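
As a rough sketch of how the UUID could be pulled out of the saved output and turned into a config entry: the two helper names below are made up for this example, and the ARRAY line uses the device name and UUID from this article.

```shell
# Print the array UUID from `mdadm --detail` output read on stdin.
# Assumes a line of the form "UUID : d7ea744f:..." as quoted above.
extract_uuid() {
    sed -n 's/^[[:space:]]*UUID : \([0-9a-f:]*\).*/\1/p' | head -n 1
}

# Print an ARRAY line for mdadm.conf that identifies the array by UUID.
array_conf_line() {   # $1 = md device, $2 = UUID
    printf 'ARRAY %s UUID=%s\n' "$1" "$2"
}

# Usage (as root, while the array is still running):
#   uuid=$(mdadm --detail /dev/md0 | extract_uuid)
#   array_conf_line /dev/md0 "$uuid" >> /etc/mdadm/mdadm.conf
# Later, on a machine with that config entry:
#   mdadm --assemble --scan
```

mdadm can also print such ARRAY lines itself with "mdadm --detail --scan", which is probably the more convenient route when the array is running.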

Performance Tests

Test      Description     Results     Comment
hdparm    reading         52 MB/s     This is twice as good as my laptop, and worse than the 70 MB/s of my SATA disk in my workstation
dd        writing         28 MB/s     Half of what my workstation disk can do
seeker    random access   0.8-1 ms    This is 10-20 times better than an ordinary hard disk
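
To put figures like these in perspective, two small conversions can help. This is only a sketch: the helper names are invented, and the dd invocation in the comment is one common way to time a write, not necessarily the one used for the table.

```shell
# Convert a byte count and an elapsed time in seconds to MB/s.
mb_per_s() {   # $1 = bytes transferred, $2 = seconds
    LC_ALL=C awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# Convert an average access time in milliseconds to random lookups per second.
seeks_per_s() {   # $1 = access time in ms
    LC_ALL=C awk -v ms="$1" 'BEGIN { printf "%.0f\n", 1000 / ms }'
}

# Example of a timed write on the mounted array (assumption, run as root):
#   dd if=/dev/zero of=/mnt/testfile bs=1M count=256 conv=fsync
# dd reports bytes and seconds itself; mb_per_s repeats that arithmetic.
```

For a workload that is dominated by small random reads, the lookups-per-second figure matters far more than the MB/s figure: an access time of 1 ms allows roughly a thousand random lookups per second.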

Notes for the Tests

Alternatives and Outlook

I have explained here how to create a RAID-0 from four USB thumb drives. However, most of what I have explained applies equally to other RAID types and other disk drives. In fact, you can combine just about any devices into a RAID. It only really makes sense if the devices have similar sizes, but (i) you can create a RAID out of RAIDs (e.g., join two 2GB USB sticks into a RAID-0 /dev/md0, then join /dev/md0 with a 4GB USB stick to get a RAID-0 with a size of 8GB) and (ii) you can combine devices of different sizes using LVM (the Logical Volume Manager).
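
The RAID-of-RAIDs idea could look roughly like the sketch below. The device names are placeholders, /dev/md1 is simply the next free md device assumed for the outer array, and the commands are only printed unless DRY_RUN is set to 0:

```shell
# Commands are only printed by default; set DRY_RUN=0 to execute them (as root).
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Join two small sticks into /dev/md0, then join /dev/md0 with a larger stick.
nested_raid0() {   # $1, $2 = the two 2GB sticks, $3 = the 4GB stick
    run mdadm --create /dev/md0 --level=0 --raid-devices=2 "$1" "$2"
    run mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/md0 "$3"
}

# Usage with hypothetical device names:
#   nested_raid0 /dev/sdX /dev/sdY /dev/sdZ
```

The filesystem would then go on /dev/md1, the outer array, rather than on /dev/md0.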

Problems

Apart from some mistakes I made because I did not know 'mdadm', there were no problems. If you run into any, generally two things are of immense help:

Links

Keywords: usb flash stick thumb drive pendrive linux raid raid0 mdadm



[BIO]

January Weiner is a biologist who uses computational tools to investigate evolutionary processes. He is a postdoc in a bioinformatics group.


Copyright © 2008, January Weiner. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 151 of Linux Gazette, June 2008
