Convert “bare Partition” To RAID1 / Mdadm?


I have a large disk full of data that I’d like to upgrade to SW RAID 1
with a minimum of downtime. Taking it offline for a day or more to rsync all the files over is a non-starter. Since I’ve mounted SW RAID1 drives directly with “mount -t ext3 /dev/sdX” it would seem possible to flip the process around, perhaps change the partition type with fdisk or parted, and remount as SW RAID1?

I’m not trying to move over the O/S, just a data partition with LOTS of data. So far, Google pounding has resulted in howtos like this one, which is otherwise quite useful but has a big “copy all your data over” step I’d like to skip:

http://sysadmin.compxtreme.ro/how-to-migrate-a-single-disk-linux-system-to-software-raid1/

But it would seem to me that a sequence roughly like this should work without having to recopy all the files.

1) umount /var/data
2) parted /dev/sdX (change the partition type to fd – Linux RAID autodetect)
3) Set some volume parameters so it’s seen as a degraded RAID1 partition. (parted?)
4) ??? Insert mdadm magic here ??? (one guess is sketched below)
5) Profit! `mount /dev/md1 /var/data`
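
One shape the step-4 magic might take – a sketch only, untested, with /dev/sdX1 standing in for the real partition – leans on --metadata=1.0 putting the RAID superblock at the *end* of the partition, after first shrinking the filesystem slightly so the superblock doesn’t land on live data:

    umount /var/data
    e2fsck -f /dev/sdX1
    resize2fs /dev/sdX1 <slightly under the partition size>   # leave room at the end
    mdadm --create /dev/md1 --level=1 --raid-devices=2 \
        --metadata=1.0 /dev/sdX1 missing
    mount /dev/md1 /var/data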

Wondering if anybody has done anything like this before…

-Ben

23 thoughts on - Convert “bare Partition” To RAID1 / Mdadm?

  • Even if I found the magic place to change to make the drive think it was a raid member, I don’t think I would trust getting it right with my only copy of the data. Note that you don’t really have to be offline for the full duration of an rsync to copy it. You can add another drive as a raid with a ‘missing’ member, mount it somewhere, and rsync with the system live to get most of the data over. Then you can shut down all the applications that might be changing data for another rsync pass to pick up any changes – and that one should be fast. Then move the raid to the real mount point and either (safer) swap in a new disk, keeping the old one as a backup, or (more dangerous) change the partition type on the original, add it into the raid set, and let the data sync up.
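
    Roughly, with /dev/sdY1 as the new disk’s partition (device names are hypothetical):

    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdY1 missing
    mkfs.ext4 /dev/md1
    mount /dev/md1 /mnt/newdata
    rsync -a /var/data/ /mnt/newdata/            # live pass: most of the data
    # stop the apps, then a fast catch-up pass:
    rsync -a --delete /var/data/ /mnt/newdata/
    # the riskier route: repurpose the original partition afterwards
    mdadm --manage /dev/md1 --add /dev/sdX1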

  • I would, of course, have backups. And the machine being upgraded is one of several redundant file stores, so the risk is near zero of actual data loss even if it should not work. :)

    And I’ve done what you suggest: rsync “online”, take apps offline, rsync, swap, and bring it all back up. But the data set in question is about 100 million small files (PDFs) and even an rsync -van takes a day or more, downtime I’d like to avoid. A sibling data store is running LVM2 so the upgrade without downtime is underway, another sibling is using ZFS which breezed right through the upgrade so fast I wasn’t sure it had even worked!

    So… is it possible to convert an EXT4 partition to a RAID1 partition without having to copy the files over?

  • For data partitions, a lot of the stuff in that howto is not applicable.

    With respect to the mdadm steps – creating degraded arrays, making filesystems on those degraded arrays, and then copying the data over – it is spot on IMO.

    I would recommend following the steps in the above tutorial to really be assured that none of the data is corrupted.

    ‘mdadm’ starts initializing the array (writing to the disk) as soon as you create it, overwriting your file system on that partition.

    I would not recommend it, but you can try it and see what happens with your experiment. It should be a no-brainer since you have secondary backups of the data elsewhere (stated in this thread).

    — Arun Khan

  • What happens if you mount the partition of a raid1 member directly instead of the md device? I’ve only done that read-only, but it does seem to work.

  • You can also try this:
    1- Convert your ext4 partition to btrfs.
    2- Make raid1 with btrfs. With btrfs you can convert a “bare partition”
    to almost any raid level, given the proper number of hard disks.
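
    Something like this, for instance – a sketch, untested, with device names as placeholders:

    umount /var/data
    btrfs-convert /dev/sdX1        # in-place ext4 -> btrfs; keeps a rollback image
    mount /dev/sdX1 /var/data
    btrfs device add /dev/sdY1 /var/data
    btrfs balance start -dconvert=raid1 -mconvert=raid1 /var/data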

  • This is the flip side of the OP’s use case, i.e. you already have a RAID device and are mounting one of its members.

    — Arun Khan

  • As I originally stated, I’ve done this successfully many times with a command like:

    mount -t ext{2,3,4} /dev/sdXY /media/temp -o rw

    Recently, it seems that RHEL/CentOS is smart enough to automagically create /dev/mdX when inserting a drive “hot” (e.g. USB or hot-swap SATA), so I haven’t had to do this for a while. You can, however, do this, which seems to be logically equivalent:

    mdadm --manage /dev/mdX --stop
    mount -t ext{2,3,4} /dev/sdXY /media/temp -o rw

    -Ben

  • But if you write to it, can you clobber the raid superblock? That is, is it somehow allocated as used space in the filesystem, or is there a difference in the space available on the md device and the direct partition, or something else?

  • Is there some reason that the existing files cannot be accessed while they are being copied to the raid?

  • Sheer volume. With something in the range of 100,000,000 small files, it takes a good day or two to rsync. This means that getting a consistent image without significant downtime is impossible. I can handle a few minutes, maybe an hour. Much more than that and I have to explore other options. (In this case, it looks like we’ll be biting the bullet and switching to ZFS)

    -Ben

  • Benjamin Smith wrote:
    takes a good day or two to rsync. This means that getting a consistent image without significant downtime is impossible. I can handle a few minutes, maybe an hour. Much more than that and I have to explore other options. (In this case, it looks like we’ll be biting the bullet and switching to ZFS)

    Not a big deal. I’ve done this a number of times, especially when moving a researcher’s home directory or putting in a larger drive. I rsync it all over. *Then*, right after it’s done, I have them log off, do a final rsync, and bring it back up. The upshot is that they won’t have created thousands of huge files while the big rsync was taking place with the system still up. If I’m really worried, I do a second rsync after the first huge one, *then* do the “log off and final rsync”.

    mark

  • Rsync is really pretty good at that, especially the 3.x versions. If you’ve just done a live rsync (or a few so there won’t be much time for changes during the last live run), the final one with the system idle shouldn’t take much more time than a ‘find’ traversing the same tree. If you have space and time to test, I’d time the third pass or so before deciding it won’t work (unless even find would take too long).

  • Thanks for your feedback – it’s advice I would have given myself just a few years ago. We have *literally* in the range of one hundred million small PDF documents. The simple command

    find /path/to/data > /dev/null

    takes between 1 and 2 days, system load depending. We had to give up on rsync for backups in this context a while ago – we just couldn’t get a “daily” backup more often than about 2x per week. Now we’re using ZFS + send/receive to get daily backup times down into the “sub 60 minutes” range, and I’m just going to bite the bullet and synchronize everything at the application level over the next week.
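
    (For the curious, the ZFS incremental amounts to something like this – pool and host names made up:

    zfs snapshot tank/data@today
    zfs send -i tank/data@yesterday tank/data@today | \
        ssh backuphost zfs receive backup/data

    Each pass only ships the blocks changed since the previous snapshot, which is why it beats a full tree walk.)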

    Was just looking for a shortcut…

  • Here is an evil thought. Is this possible for you to do?

    1) Set up a method to obtain a RW lock for updates on the original filesystem

    2) Use rsync to create a gross copy of the original (yes, it will be slightly out of phase, but stick with me for a bit) on the new filesystem on top of LVM2 on top of a RAID1 volume, to make the next step much more efficient.

    3) Perform the following loop:
    a) Set the updates lock on original filesystem
    b) rsync a *subset* sub-directory of the original filesystem such that you can complete it in, at worst, only a second or two
    c) Rename the original directory to some safe alternative (safety first)…
    d) Put a symlink in place of the original directory pointing to the newly synced file system sub-directory
    e) Release the mutex lock
    f) Repeat a-e until done

    4) Switch over operations to the new filesystem
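
    A rough shell sketch of step 3 – take_update_lock/release_update_lock stand in for whatever mechanism step 1 provides, and the paths are made up:

    SRC=/var/data            # original filesystem
    DST=/var/newdata         # new filesystem on LVM2 on RAID1
    for dir in "$SRC"/*/; do
        name=$(basename "$dir")
        take_update_lock "$name"                 # hypothetical lock from step 1
        rsync -a --delete "$dir" "$DST/$name/"   # small subset, so it's quick
        mv "$SRC/$name" "$SRC/.$name.orig"       # keep the original (safety first)
        ln -s "$DST/$name" "$SRC/$name"          # symlink into the new filesystem
        release_update_lock "$name"
    done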

    Another approach would be to leverage something like UnionFS (see http://en.wikipedia.org/wiki/UnionFS ) to allow you to both use the filesystem *and* automatically propagate all updates to the new volume during the migration.
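
    With unionfs-fuse, for instance, that might look like this (mount points made up; copy-on-write pushes modified files onto the new volume):

    unionfs-fuse -o cow /var/newdata=RW:/var/data=RO /var/merged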

    – Jerry Franz

  • From: Benjamin Smith

    What about:
    1. Set up inotify (no idea how it would behave with your millions of files)
    2. One big rsync
    3. Bring it down and copy the few modified files reported by inotify.

    Or lsyncd?
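
    With inotifywait from inotify-tools, the idea might look like this – a sketch, untested at this scale, paths made up, and deletions would need separate handling:

    inotifywait -m -r -e modify,create,delete,move \
        --format '%w%f' /var/data > /tmp/changes &
    rsync -a /var/data/ /mnt/newdata/            # 2. the one big rsync
    # 3. bring the apps down, then replay only what changed:
    kill $!
    sed 's|^/var/data/||' /tmp/changes | sort -u | \
        rsync -a --files-from=- /var/data/ /mnt/newdata/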

    JD

  • How about something like this:
    Use find to process each file with a script that does something like this:

        if foo not soft link:
            if foo open for output (lsof?):
                add foo to todo list
            else:
                make foo read-only
                if foo open for output:
                    add foo to todo list
                    restore foo’s permissions
                else:
                    copy foo to raid
                    replace original with a soft link into raid
                    give copy correct permissions
    Move the todo list to where it will not be written by the script, process the todo-list files with the same script (making a new todo list), and rinse and repeat until the todo list is empty.

    For the endgame, make the entire source read-only and run find again; this time there is no need for most of the tests:

        if foo not soft link:
            copy foo to raid
            replace original with a soft link into raid
            give copy correct permissions

    After the last copy, make the entire source unreadable and unwritable, wait for the last user to close their files, rename the old files’ top directory, rename the raid’s top directory, and let users back in.
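
    A rough shell rendering of one pass of the above – paths made up, the lsof test is only a best-effort “open for output” check, and the make-read-only refinement is skipped:

    SRC=/var/data; DST=/mnt/raid; TODO=/var/tmp/todo.list
    find "$SRC" -type f | while IFS= read -r foo; do
        rel=${foo#"$SRC"/}
        # an FD column entry like "3w" or "4u" means open for writing
        if lsof -- "$foo" 2>/dev/null | \
                awk '$4 ~ /[0-9]+[wu]/ {found=1} END {exit !found}'; then
            echo "$foo" >> "$TODO"       # busy: retry on a later pass
            continue
        fi
        mkdir -p "$DST/$(dirname "$rel")"
        cp -p "$foo" "$DST/$rel"         # copy, keeping permissions and times
        ln -sf "$DST/$rel" "$foo"        # replace original with a soft link
    done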

  • In thinking about this some more, I had an idea that (a) I’m not totally sure would work, and (b) strikes me as dangerous.

    1. Use dmsetup to create a logical device that consists of an 8 KiB prefix followed by your existing partition with the ext4 filesystem.

    2. Create your RAID1 array using the above logical device as the first member and with the second member missing.

    3. Unmount the current filesystem and mount the RAID device in its place.

    4. Add a new device to the (currently degraded) RAID array, and let the RAID system spend the next couple of days recovering data onto the new device.

    Eventually, you would remove the dmsetup device from the RAID array and add a new device in its place.
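
    Step 1 might look something like this – a sketch only, with device names made up and 16 sectors = 8 KiB:

    dd if=/dev/zero of=/var/tmp/prefix.img bs=4096 count=2   # scratch prefix
    LOOP=$(losetup -f --show /var/tmp/prefix.img)
    SECTORS=$(blockdev --getsz /dev/sdX1)                    # 512-byte sectors
    printf '%s\n' \
        "0 16 linear $LOOP 0" \
        "16 $SECTORS linear /dev/sdX1 0" | dmsetup create md-shim

    The RAID member in step 2 would then be /dev/mapper/md-shim.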

    I have a feeling you will not want to risk your data to the above procedure. ;-) Trying to reboot a system with that cobbled together RAID member might prove an interesting exercise.

  • rsync breaks silently, or sometimes noisily, on big directory/file structures; it depends on how the OP’s files are distributed. We organised our files in a client/year/month/day hierarchy and run a number of rsyncs on separate parts of it. Older stuff doesn’t need to be rsynced but gets backed up every so often.

    But it depends whether or not the OP’s data is arranged so that he could do something like that.
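
    For a layout like that, the split might be as simple as this (paths made up):

    # one rsync per client branch instead of one monster run
    for client in /var/data/*/; do
        rsync -a "$client" /mnt/newdata/"$(basename "$client")"/
    done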

    Cheers,

    Cliff

  • That’s essentially what we do to re-sync our production file stores. Once I move to ZFS, though, it won’t be an issue.

  • lsyncd is interesting, but for our use case it isn’t nearly as efficient as ZFS with send/receive. For one, lsyncd is only useful after the first rsync (which in this case takes days), so we would effectively start out with an out-of-sync system and then have to deal with millions of follow-up syncs as the monstrous number of queued-up inotify events get handled.

    I’m moving ahead with the rsync-a-few-directories-at-a-time method Jerry Franz put forward, as it’s fundamentally compatible with our setup. (Our “subdir” is called a “client” – we have hundreds – and we do each one during the off-hours when that client’s business is closed, so nobody notices.) But it takes a week or two to fully resync a file store… in the meantime we’re at N+1 redundancy instead of the usual N+2.