Performance Test by RAID Level

 

 

Someone at work claimed that "RAID 6 does three writes for every write, so write I/O drops to 1/3!", and that is what got this test started.

Luckily the company had a 12-bay disk array and an idle server lying around.

I also took two days off before the weekend, so there was plenty of time.

 

 

The test cases were:

Disks: 2 to 12

RAID: 0, 10, 5, 6

R/W: read, write, R70:W30

for a total of 312 test cases.

 

 

First, the specs of the server, HBA (disk controller), and disk array…

1. Server

PowerEdge R620

Architecture: x86_64

CPU op-mode(s): 32-bit, 64-bit

Byte Order: Little Endian

CPU(s): 24

On-line CPU(s) list: 0-23

Thread(s) per core: 2

Core(s) per socket: 6

Socket(s): 2

NUMA node(s): 2

Vendor ID: GenuineIntel

CPU family: 6

Model: 45

Model name: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz

Stepping: 7

CPU MHz: 1200.000

BogoMIPS: 4004.06

Virtualization: VT-x

L1d cache: 32K

L1i cache: 32K

L2 cache: 256K

L3 cache: 15360K

NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22

NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23

 

 

Memory: 8 GB x 22 = 176 GB

Memory Device

Array Handle: 0x1000

Error Information Handle: Not Provided

Total Width: 72 bits

Data Width: 64 bits

Size: 8192 MB

Form Factor: DIMM

Set: 1

Locator: DIMM_A4

Bank Locator: Not Specified

Type: DDR3

Type Detail: Synchronous Registered (Buffered)

Speed: 1600 MHz

Manufacturer: 00CE00B300CE

Serial Number:

Asset Tag: 03132463

Part Number: M393B1G70BH0-YK0

Rank: 1

Configured Clock Speed: 1333 MHz

 

 

2. HBA

Name : PERC H810 Adapter

Slot ID : PCI Slot 1

State : Ready

Firmware Version : 21.3.2-0005

Minimum Required Firmware Version : Not Applicable

Driver Version : 06.811.02.00-rh1

Minimum Required Driver Version : Not Applicable

Storport Driver Version : Not Applicable

Minimum Required Storport Driver Version : Not Applicable

Number of Connectors : 2

Rebuild Rate : 30%

BGI Rate : 30%

Check Consistency Rate : 30%

Reconstruct Rate : 30%

Alarm State : Not Applicable

Cluster Mode : Not Applicable

SCSI Initiator ID : Not Applicable

Cache Memory Size : 1024 MB

Patrol Read Mode : Auto

Patrol Read State : Stopped

Patrol Read Rate : 30%

Patrol Read Iterations : 0

Abort Check Consistency on Error : Disabled

Allow Revertible Hot Spare and Replace Member : Enabled

Load Balance : Auto

Auto Replace Member on Predictive Failure : Disabled

Redundant Path view : Not Applicable

CacheCade Capable : Yes

Persistent Hot Spare : Disabled

Encryption Capable : Yes

Encryption Key Present : No

Encryption Mode : None

Preserved Cache : Not Applicable

Spin Down Unconfigured Drives : Disabled

Spin Down Hot Spares : Disabled

Spin Down Configured Drives : Disabled

Automatic Disk Power Saving (Idle C) : Disabled

Start Time (HH:MM) : Not Applicable

Time Interval for Spin Up (in Hours) : Not Applicable

T10 Protection Information Capable : No

Non-RAID HDD Disk Cache Policy : Not Applicable

 

 

3. OS

[root@testlab1 youngju]# lsb_release -a

LSB Version: :core-4.1-amd64:core-4.1-noarch

Distributor ID: RedHatEnterpriseServer

Description: Red Hat Enterprise Linux Server release 7.3 (Maipo)

Release: 7.3

Codename: Maipo

[root@testlab1 youngju]# uname -a

Linux testlab1 3.10.0-514.10.2.el7.x86_64 #1 SMP Mon Feb 20 02:37:52 EST 2017 x86_64 x86_64 x86_64 GNU/Linux

[root@testlab1 youngju]#

 

 

The performance test was run with vdbench.

vdbench script:

sd=sd1,lun=/dev/sdc

wd=wd1,sd=sd1,rdpct=100,xfersize=512

rd=run1,wd=wd1,iorate=max,elapsed=120,interval=5,openflags=o_direct,forrdpct=(100,0,70),forxfersize=(512,4096)
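In short: the sd line points vdbench at the raw device, the wd line defines the base workload, and the rd line drives it at iorate=max for 120 seconds with O_DIRECT (bypassing the page cache), looping over read percentages of 100/0/70 and transfer sizes of 512 and 4096 bytes, i.e. six sub-runs per vdisk. A single invocation against one vdisk then looks like the line below, taken from the test script further down; the output directory name here is only an example.

# -f parameter file, -o report directory, -w warmup seconds before measurement starts
vdbench/vdbench -f youngju-test.param-512 -o out-raid6-disk8 -w 10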

 

 

4. Disk: 1 TB, 7200 rpm SAS

ID : 1:0:11

Status : Ok

Name : Physical Disk 1:0:11

State : Online

Power Status : Spun Up

Bus Protocol : SAS

Media : HDD

Part of Cache Pool : Not Applicable

Remaining Rated Write Endurance : Not Applicable

Failure Predicted : No

Revision : GS0A

Driver Version : Not Applicable

Model Number : Not Applicable

T10 PI Capable : No

Certified : Yes

Encryption Capable : No

Encrypted : Not Applicable

Progress : Not Applicable

Mirror Set ID : Not Applicable

Capacity : 931.00 GB (999653638144 bytes)

Used RAID Disk Space : 1.20 GB (1288437760 bytes)

Available RAID Disk Space : 929.80 GB (998365200384 bytes)

Hot Spare : No

Vendor ID : DELL(tm)

Product ID : ST1000NM0023

Serial No. :

Part Number :

Negotiated Speed : 6.00 Gbps

Capable Speed : 6.00 Gbps

PCIe Negotiated Link Width : Not Applicable

PCIe Maximum Link Width : Not Applicable

Sector Size : 512B

Device Write Cache : Not Applicable

Manufacture Day : 07

Manufacture Week : 29

Manufacture Year : 2013

SAS Address : 5000C50056FBA12D

Non-RAID HDD Disk Cache Policy : Not Applicable

Disk Cache Policy : Not Applicable

Form Factor : Not Available

Sub Vendor : Not Available

ISE Capable : No

 

 

The physical disk details above were collected with: omreport storage pdisk controller=1

 

 

Shell scripts used for the test:

[root@testlab1 youngju]# grep -iv '^#[a-zA-Z]' create-vdisk.sh
#!/bin/bash

# usage: create-vdisk.sh <raid level> <number of disks> <size>
RAID=${1}
DISKN=`echo $(($2-1))`
PDISK=`seq -s "," -f 1:0:%g 0 $DISKN`    # pdisk list 1:0:0,...,1:0:(N-1)
SIZE=$3

echo -e "\e[93m----------- dell omsa RAID=$RAID PDISK=$2 vdisk create -----------\e[0m"
omconfig storage controller action=createvdisk controller=1 raid=r${RAID:=0} size=${SIZE:=5g} pdisk=${PDISK:=1:0:0,1:0:1} stripesize=64kb readpolicy=nra writepolicy=wt name=yj-r${RAID}-${2}disk
if [ $? = 0 ]; then
    echo -e "\e[93m----------- dell omsa RAID=$RAID PDISK=$2 vdisk create done -----------\e[0m"
else
    echo -e "\e[91m----------- dell omsa RAID=$RAID PDISK=$2 vdisk create fail -----------\e[0m"
    exit 1
fi
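For example, the test loop further down calls this script as follows; the particular values (RAID 6, eight disks, 12 GB) are just an illustration.

# creates vdisk yj-r6-8disk on pdisks 1:0:0 through 1:0:7, 64 KB stripe, no read-ahead, write-through
bash create-vdisk.sh 6 8 12g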

 

 

[root@testlab1 youngju]# grep -iv '^#[a-zA-Z]' delete-vdisk.sh
#!/bin/bash

# usage: delete-vdisk.sh [vdisk id]   (defaults to vdisk 1)
VDISK=$1
VNAME=`bash status-vdisk.sh |grep -i name |head -n1|awk '{print $3}'`

echo -e "\e[93m----------- dell omsa vdisk ${VNAME:=no vdisk} delete -----------\e[0m"
omconfig storage vdisk action=deletevdisk controller=1 vdisk=${VDISK:=1}
echo -e "\e[93m----------- dell omsa vdisk ${VNAME:=no vdisk} delete done -----------\e[0m"

 

 

[root@testlab1 youngju]# grep -iv '^#[a-zA-Z]' status-vdisk.sh
#!/bin/bash

omreport storage vdisk controller=1 vdisk=1

 

 

[root@testlab1 youngju]# grep -iv '^[[:space:]]*#\|^$' raid-test.sh
#!/bin/bash
RSTD=test-result-`date +%Y%m%d-%H%M`
RSTD512=test-512-result-`date +%Y%m%d-%H%M`
RSTDCACHE=test-cache-result-`date +%Y%m%d-%H%M`
mkdir $RSTD512
for R in 0 10 5 6
do
    for D in `seq 2 12`
    do
        sleep 2
        bash delete-vdisk.sh
        sleep 2
        bash create-vdisk.sh $R $D 12g
        if [ $? = 0 ] ; then
            sleep 3
            omconfig storage vdisk action=slowinit controller=1 vdisk=1
            sleep 2
            while [ `bash status-vdisk.sh |grep -i state|head -n1 |awk '{print $3}'` != Ready ]
            do
                echo -e "\e[91m ----- vdisk is initializing --------\e[0m"
                bash status-vdisk.sh |grep -i progress
                sleep 5
            done
            echo
            echo -e "\e[93m ----- vdisk is initialized --------\e[0m"
            sleep 1
            vdbench/vdbench -f youngju-test.param-512 -o ${RSTD512}/raid${R}-disk${D} -w 10
            sleep 1
        else
            echo vdisk create fail raid $R disk $D
            echo
        fi
        echo -e "\e[96m----------- test raid $R disk $D done ------------\e[0m"
        echo
        echo
    done
done
[root@testlab1 youngju]#

 

 

The test was set up to avoid cache effects as much as possible (O_DIRECT on the host side, and no-read-ahead / write-through policies on the controller).

The results were as follows:

xfersize(B)   RAID level   disks   R/W   IOPS

512 raid 0 2 read 514.26

512 raid 0 3 read 677.07

512 raid 0 4 read 830.22

512 raid 0 5 read 947.13

512 raid 0 6 read 1027.46

512 raid 0 7 read 1108.98

512 raid 0 8 read 1121.44

512 raid 0 9 read 1207.18

512 raid 0 10 read 1265.49

512 raid 0 11 read 1286.46

512 raid 0 12 read 1335.8

512 raid 10 4 read 815.64

512 raid 10 6 read 975.67

512 raid 10 8 read 1119.82

512 raid 10 10 read 1212.33

512 raid 10 12 read 1303.5

512 raid 5 3 read 635.9

512 raid 5 4 read 783.8

512 raid 5 5 read 911.86

512 raid 5 6 read 1010.51

512 raid 5 7 read 1074.68

512 raid 5 8 read 1145.63

512 raid 5 9 read 1187.66

512 raid 5 10 read 1242.5

512 raid 5 11 read 1289.53

512 raid 5 12 read 1310.89

512 raid 6 4 read 729.6

512 raid 6 5 read 850.5

512 raid 6 6 read 983.22

512 raid 6 7 read 1053.27

512 raid 6 8 read 1126.92

512 raid 6 9 read 1179.58

512 raid 6 10 read 1217.82

512 raid 6 11 read 1274.77

512 raid 6 12 read 1296.48

There are more rows than this, but these will do for now…

Plotting these as graphs…
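(The charts below can be reproduced with a small gnuplot script along these lines; raid0.dat, raid10.dat, raid5.dat and raid6.dat are hypothetical two-column files (disks, IOPS) copied out of the table above, so treat this as a sketch rather than the exact plotting setup used.)

#!/bin/bash
# plot 512B read IOPS against the number of disks, one line per RAID level
gnuplot <<'EOF'
set terminal png size 800,600
set output 'read-iops.png'
set xlabel 'number of disks'
set ylabel 'IOPS (512B read)'
set key left top
plot 'raid0.dat'  using 1:2 with linespoints title 'raid 0', \
     'raid10.dat' using 1:2 with linespoints title 'raid 10', \
     'raid5.dat'  using 1:2 with linespoints title 'raid 5', \
     'raid6.dat'  using 1:2 with linespoints title 'raid 6'
EOF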

 

 

[Figure 1: read IOPS]

[Figure 2: write IOPS]

[Figure 3: read vs. write comparison]

 

The results came out roughly as shown above.

The test was actually run in three variants: 512-byte transfers, 4096-byte transfers, and with the HBA write policy set to write-back. The results were all similar. 512 and 4096 behave alike presumably because a page is 4 KB either way, and the write-back runs showed a slight performance gain but still ended up close to the others.
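For the write-back variant, presumably the only thing that needs to change is the writepolicy in the createvdisk call inside create-vdisk.sh, e.g. (shown for a 4-disk RAID 6 purely as an illustration; everything except writepolicy=wb is identical to the command above):

# identical to the write-through command, except the controller cache is set to write-back (writepolicy=wb)
omconfig storage controller action=createvdisk controller=1 raid=r6 size=12g pdisk=1:0:0,1:0:1,1:0:2,1:0:3 stripesize=64kb readpolicy=nra writepolicy=wb name=yj-r6-4disk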

 

 

Here is the formula for the theoretical performance of each RAID level, assuming I/O never has to wait at all.

N = number of disks (8 in this example)

X = IOPS a single disk can deliver (125)

Reads deliver about N×X IOPS on every RAID level, since parity imposes no read penalty.

Write RAID 0 = N×X = 8×125 = 1000

Write RAID 10 = N×X/2 = 8×125/2 = 500

Write RAID 5 = N×X/4 = 8×125/4 = 250

Write RAID 6 = N×X/6 = 8×125/6 ≈ 166

Reference:

https://www.storagecraft.com/blog/raid-performance/

RAID 5 incurs four operations per write: read the old data, read the old parity, write the new data, and write the new parity, hence the division by 4. RAID 6 has one more parity block, so you divide by 6.
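The same arithmetic as a small shell sketch (N=8 and X=125 are just the example values above, and the penalty values follow the operation counts described here):

#!/bin/bash
# theoretical write IOPS = N * X / write penalty, assuming no waiting anywhere
N=8      # number of disks
X=125    # IOPS a single disk can deliver
for RAID in 0 10 5 6
do
    case $RAID in
        0)  P=1 ;;   # plain striping, no extra I/O per write
        10) P=2 ;;   # every write goes to both halves of a mirror
        5)  P=4 ;;   # read data + read parity + write data + write parity
        6)  P=6 ;;   # as RAID 5, plus reading and writing a second parity block
    esac
    echo "raid $RAID write IOPS = $(( N * X / P ))"
done
# prints 1000, 500, 250 and 166, which matches the figures above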

 

 

The measured results came out roughly in line with this.

[Figure 4: write performance relative to read, by operation size]

 

Conclusion

1. As long as the HBA can keep up, I/O performance keeps rising as you add disks. According to the storage vendor, the HBA's cache size determines how many disks it can drive effectively; the array used here only takes 12 disks, so I could only test up to 12.

2. The site above recommends against RAID 5 for safety reasons, and in practice the write-vs-read gap between RAID 5 and RAID 6 is not that large: RAID 5 delivers roughly 25% of its read IOPS on writes and RAID 6 about 20%, and the gap narrows as the number of disks grows. So just use RAID 6.

 
