Optimum Block Size To Use

Hi All

We use CentOS 6.6 for our application. I have profiled the application and found that it has a heavy requirement in terms of disk writes. When our application operates at a certain load, I observe that the disk write rate averages around 2 Mbps.

The block size is set to 4K:

*******************
[root@localhost ~]# blockdev --getbsz /dev/sda3
4096
*******************

OS and kernel version:

*****************
[root@localhost ~]# uname -a
Linux localhost 2.6.32-504.el6.x86_64 #1 SMP Wed Oct 15 04:27:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

[root@localhost ~]# cat /etc/*release
CentOS release 6.6 (Final)
CentOS release 6.6 (Final)
CentOS release 6.6 (Final)
*****************

File system being used:

********************
[root@localhost ~]# df -T
Filesystem    Type   1K-blocks     Used Available Use% Mounted on
/dev/sda3     ext3   100934056 20298152  75508688  22% /
/dev/sda1     ext3      198337    34459    153638  19% /boot
tmpfs         tmpfs   16440152        0  16440152   0% /dev/shm
*********************

I have a few queries with respect to the block size being set in the system:

1. Is 4K the optimum block size, considering the number of writes per second the application performs?

2. How do I find out the optimum block size, given the application load in terms of reads/writes per second?

3. If there is a better block size that I can use, can you suggest one?

4. What are the pros/cons of changing the default block size?

5. We use ext3 as the file system for the partition which has heavy writes per second. Should we migrate it to ext4? Any pros/cons for it?

I'd appreciate any responses or pointers in this regard.

Thanks Jatin

7 thoughts on - Optimum Block Size To Use

  • Initial thought is, do you really care? 2 Mbps is peanuts, so personally I’d leave everything at the defaults. There’s really no need to optimise everything.

    Obviously the exact type of writes is important (lots of small writes written and flushed vs fewer big unsynced writes), so you’d want to poke it with iostat to see what kind of writes you’re talking about.
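    For example, something like this would show whether it’s many small writes or fewer large ones (a sketch, assuming the sysstat package is installed; the device name and interval are just illustrations):

    *******************
    # Extended per-device stats for sda every 5 seconds.
    # w/s = writes per second; avgrq-sz = average request size
    # in 512-byte sectors (small values => lots of small writes).
    [root@localhost ~]# iostat -dx sda 5
    *******************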

    jh

  • On 19.08.2015 at 10:24, John Hodrien wrote:

    To address this we tune these sysctl settings:

    vm.dirty_expire_centisecs
    vm.dirty_writeback_centisecs
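    For instance (values here are purely illustrative, not a recommendation):

    *****************
    # Show the current writeback settings
    [root@localhost ~]# sysctl vm.dirty_expire_centisecs vm.dirty_writeback_centisecs
    # Example: expire dirty pages after 10 s, run the flusher every 5 s
    [root@localhost ~]# sysctl -w vm.dirty_expire_centisecs=1000
    [root@localhost ~]# sysctl -w vm.dirty_writeback_centisecs=500
    # Add the same keys to /etc/sysctl.conf to persist across reboots
    *****************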

    Furthermore, check the filesystem alignment with the underlying disk …

  • [Jatin]
    These options deal with “caching” the writes, correct me if I am wrong. If they do indeed deal with caching, then I think they will not help, because the application workload generates a lot of data that is always new. The application logic continuously generates new data to be written to disk. Thanks Jatin

  • Is your application running fast enough?
    If so, I echo: do you really care?

    Let’s suppose your application is not running fast enough. Is the disk drive a bottleneck? If not, you need to fix something else.

    Let’s suppose the disk drive is a bottleneck. Are your writes sequential? If so, I’d expect that drive-internal caching would favor large block sizes, e.g. 4K. Since that is what you have, I expect your writes are not sequential. Find a way to make them sequential. The “and flush” is what will make that hard.

    Instead of writing X to location j in the main file, write (j, X) to the next sequential location in a cache file. When the cache has enough data, do an in-memory stable sort and start writing to the main file, then clear the cache file. Unless the cache file is large enough, I expect that this will largely duplicate what the disk drive does internally.
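    A minimal sketch of that cache-and-replay idea in shell (file names and the record format are made up for illustration; a real application would keep the cache in memory in its own language):

    *******************
    #!/bin/sh
    # Hypothetical sketch: log each write as "offset<TAB>data" instead of
    # seeking around the main file, then periodically replay the log in
    # offset order so the main file is written sequentially.
    CACHE=write-cache.log
    MAIN=main.dat

    log_write() {                        # log_write OFFSET DATA
        printf '%s\t%s\n' "$1" "$2" >> "$CACHE"
    }

    flush_cache() {
        # Stable numeric sort by offset: the replay touches MAIN
        # sequentially, and later writes to the same offset still win.
        sort -s -n -k1,1 "$CACHE" | while IFS="$(printf '\t')" read -r off data; do
            printf '%s' "$data" |
                dd of="$MAIN" bs=1 seek="$off" conv=notrunc 2>/dev/null
        done
        : > "$CACHE"                     # clear the cache file
    }
    *******************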

    It might be simpler just to get a faster disk drive.

  • On x86, it’s effectively fixed at 4096 bytes. There is a clustering option in ext4 called bigalloc, which isn’t the same thing as block size but might be what you’re looking for if you have a lot of large-file writes happening. But it implies moving to CentOS 7 to get that feature.
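    For reference, creating such a filesystem would look something like this (the device name is a placeholder; bigalloc was still marked experimental at the time, so test carefully before trusting data to it):

    *******************
    # -O bigalloc enables cluster allocation; -C sets the cluster
    # size in bytes (here 64 KiB) while the block size stays 4 KiB.
    [root@localhost ~]# mkfs.ext4 -O bigalloc -C 65536 /dev/sdXN
    *******************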

    Piles of pros, and no meaningful cons. Just use ext4 with defaults.

  • This is very important. Certain workloads and certain AF drive firmware can really suck when there’s a lot of read-modify-write done by the drive (internally) if the fs block is not aligned to the physical sector size. I’m pretty sure parted and fdisk on CentOS 6 do properly align, whereas they don’t on CentOS 5. Proper alignment is when the partition start LBA is divisible by 8, so a start LBA of 63 is not aligned, whereas 2048 is aligned and now common.
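    Two quick ways to check (device and partition names are examples):

    *****************
    # Start LBA of the first partition; divisible by 8 means 4 KiB aligned
    [root@localhost ~]# cat /sys/block/sda/sda1/start
    2048
    # parted can check this directly, on versions that support align-check
    [root@localhost ~]# parted /dev/sda align-check opt 1
    1 aligned
    *****************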