XFS Not Getting It Right?

XFS is supposed to detect the layout of an md-RAID device when creating the file system, but it doesn’t seem to do that:

# cat /proc/mdstat
Personalities : [raid1]
md10 : active raid1 sde[1] sdd[0]
      499976512 blocks super 1.2 [2/2] [UU]
      bitmap: 0/4 pages [0KB], 65536KB chunk

# mkfs.xfs /dev/md10p2
meta-data=/dev/md10p2            isize=512    agcount=4, agsize=30199892 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=120799568, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=58984, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# mkfs.xfs -f -d su=64m,sw=2 /dev/md10p2
meta-data=/dev/md10p2            isize=512    agcount=16, agsize=7553024 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=120799568, imaxpct=25
         =                       sunit=16384  swidth=32768 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=58984, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

The 64MB chunk size was picked by mdadm automatically. The device is made from two disks, and XFS either doesn’t figure that out or decides to ignore the layout of the underlying RAID.
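
For reference, the sunit and swidth figures in the second run are simply the su and sw options converted to 4096-byte file system blocks. A minimal sketch of that arithmetic, where the function name stripe_blocks is made up for illustration and the input values are taken from the command and output above:

```shell
#!/bin/sh
# Convert the mkfs.xfs stripe options to the block counts shown in its output.
# stripe_blocks is a hypothetical helper, not part of xfsprogs.
stripe_blocks() {
    su_mib=$1   # stripe unit in MiB (su=64m)
    sw=$2       # stripe width as a multiple of su (sw=2)
    bsize=$3    # file system block size in bytes (bsize=4096)
    sunit=$(( su_mib * 1024 * 1024 / bsize ))
    swidth=$(( sunit * sw ))
    echo "sunit=$sunit swidth=$swidth blks"
}

stripe_blocks 64 2 4096
```

With the values above this prints sunit=16384 swidth=32768 blks, matching the data section of the second mkfs.xfs run.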

Am I doing something wrong here, or is XFS in CentOS somehow different?
Must we always specify the appropriate values for su and sw, or did XFS ignore them because what it picked is better?

8 thoughts on - XFS Not Getting It Right?

  • I don’t know enough to answer, but I do have a question: what were you expecting XFS to do (and which filesystems do that)? Thanks.


    Stephen J Smoogen.

  • Once upon a time, hw said:

    RAID 1 has no “layout” (for RAID, that usually refers to striping in RAID levels 0/5/6), so there’s nothing for a filesystem to detect or optimize for.

    The chunk size above is for the md-RAID write-intent bitmap; that’s not exposed information (for any RAID system that I’m aware of, software or hardware) or something that filesystems can optimize for.

  • Chris Adams wrote:

    Are you saying there is no difference between a RAID1 and a non-raid device as far as xfs is concerned?

    What if you use hardware RAID?

    When you look at [1], it tells you to specify su and sw with hardware RAID and says it detects everything automatically with md-RAID. It doesn’t have an example with RAID1 but one with RAID10. However, why would that make a difference? Aren’t there stripes in a RAID1? If you read from both disks in a RAID1 simultaneously, you have to wait out the latency of both disks before you get the data at full speed, and it might be better to use stripes with them as well and read multiple parts of the data at the same time.

    [1]: http://xfs.org/index.php/XFS_FAQ#Q:_How_to_calculate_the_correct_sunit.2Cswidth_values_for_optimal_performance

    > The chunk size above is for the md-RAID write-intent bitmap; that’s not
    > exposed information (for any RAID system that I’m aware of, software or
    > hardware) or something that filesystems can optimize for.

    Oh, ok. How do you know what stripe size was picked by mdadm? It seemed a good idea to go with defaults as far as possible.
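
One way to see what the kernel actually advertises for the array, rather than inferring it from /proc/mdstat, is to read the device's sysfs attributes. As far as I know, mkfs.xfs derives its default sunit and swidth from the I/O size hints (minimum_io_size and optimal_io_size), so a sketch like the one below shows roughly what it has to work with. The function name show_topology and the SYSFS_ROOT override are made up for illustration; the attribute names are the standard ones for block and md devices.

```shell
#!/bin/sh
# Print the I/O topology hints and md settings a block device exposes in sysfs.
# show_topology and SYSFS_ROOT are illustrative names, not an existing tool.
show_topology() {
    dev=$1
    root=${SYSFS_ROOT:-/sys}   # override only to try the function on a fake tree
    for f in queue/minimum_io_size queue/optimal_io_size md/level md/chunk_size; do
        path="$root/block/$dev/$f"
        if [ -r "$path" ]; then
            printf '%s=%s\n' "$f" "$(cat "$path")"
        else
            printf '%s=unavailable\n' "$f"
        fi
    done
}

# Defaults to md10, the array from the original post.
show_topology "${1:-md10}"
```

On a striped array, mdadm --detail on the md device should also report a chunk size; on a plain mirror there is no stripe chunk to report.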

  • RAID1 is mirroring. There is nothing to stripe because the virtual device is almost identical to the physical drives, and they both see the same read and write instructions in parallel.

  • RAID1 is simply two or more drives with the same stuff on each drive. It’s a simple form of redundancy. Striping would put some of the data on one drive and more on another, and so forth for as many drives as you use, with some kind of redundancy or checksumming. RAID1 is much simpler: simply make all the drives carry the same data.

    RAID1 wouldn’t have that problem, necessarily. Since all drives carry the same data, it is necessary to read from only one of them. It is during a write operation that all drives are written, and even then they may not be written at exactly the same time: one can be written while the data for the other sits in the buffer cache until the system gets a chance to write it.

    I don’t know the details of the above for Linux software RAID, but I also have a USB-attached hardware RAID box (JMicron chip), and I can watch the lights on the drives and see it doing exactly that. I would find it hard to believe that software RAID in Linux is significantly different (in fact, I rather suspect that the external RAID box is running some older version of Linux).

  • Once upon a time, hw said:

    Yes.

    No difference – same result.

    Because RAID level 1 and RAID level 10 are different. I suggest you read:

    https://en.wikipedia.org/wiki/RAID#Standard_levels

    What is called “RAID 10” is really a combination of level 1 and level 0 (which one is higher/lower varies between implementations).

    RAID level 1 has the same data on both drives. You wouldn’t be reading the same data from both drives at the same time; reads would be spread between the drives (I know the Linux software RAID tries to keep read load fairly balanced between drives; I assume most hardware RAID implementations do the same).

    Again, RAID level 1 has no stripes.
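
To make the difference between the levels concrete: sw is meant to be the number of data-bearing disks in a stripe, and that number depends on the RAID level. A rough sketch, assuming the usual conventions (one disk's worth of parity for RAID5, two for RAID6, two copies of each block for RAID10); the helper name data_disks is hypothetical:

```shell
#!/bin/sh
# Number of data-bearing disks per stripe for common RAID levels.
# data_disks is an illustrative helper, not an mdadm or xfsprogs command.
data_disks() {
    level=$1
    disks=$2
    case "$level" in
        raid0)  echo "$disks" ;;
        raid5)  echo $(( disks - 1 )) ;;   # one disk's worth of parity
        raid6)  echo $(( disks - 2 )) ;;   # two disks' worth of parity
        raid10) echo $(( disks / 2 )) ;;   # assumes two copies of each block
        raid1)  echo 0 ;;                  # pure mirror: nothing is striped
        *)      echo "unknown level: $level" >&2; return 1 ;;
    esac
}

data_disks raid10 16
data_disks raid1 2
```

A 16-disk RAID10 gives 8, which would be the sw value, while the two-disk RAID1 from the original post gives 0, consistent with mkfs.xfs reporting sunit=0 and swidth=0.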