(StarWind Software, Volodymyr Khrystenko, July 20, 2022)
The new build of StarWind SAN & NAS has just been released, and it brings the long-awaited FC (Fibre Channel) support. StarWind SAN & NAS is designed to give new life to your existing hardware. Installed either on top of the hypervisor of your choice or on bare metal, it turns your server into a fast shared storage pool that can be accessed over iSCSI, SMB, or NFS, and now, with the new build, over FC as well. It uses the same SDS engine as StarWind vSAN, which means high performance, and it also adds new features such as ZFS support for building highly resilient storage systems on commodity hardware.
This was a great chance to test how fast StarWind SAN & NAS can go over FC. The folks at StorageReview were kind enough to provide us with the testing infrastructure on which we performed the benchmark. Thanks once again, StorageReview team!
We tested the performance of shared storage presented over FC to client nodes from a dedicated storage node packed with NVMe drives and running StarWind SAN & NAS on top. We decided to include only the good old FCP (Fibre Channel Protocol) benchmark results in this article, since the NVMe-FC results were at the same level (and on certain patterns even lower than FCP). To combine the NVMe drives into a redundant storage array, we used MDRAID and Graid and tested them separately. MDRAID is the Linux software RAID that ships as part of StarWind SAN & NAS and serves to combine drives into a redundant array. Graid is an extremely fast NVMe/NVMe-oF RAID card designed to deliver the full potential of PCIe Gen4 systems.
It is worth mentioning that Graid SupremeRAID is, as of now, the only NVMe RAID card capable of delivering close to the full performance of the SSDs by removing the usual RAID performance bottlenecks. What is the difference, you may wonder? Well, Graid SupremeRAID SR-1010 is based on an NVIDIA A2000 GPU. In most respects that alone does not make the solution special, but when it comes to NVMe RAID bottlenecks, the GPU gives it a head start over many alternatives. In particular, SupremeRAID offloads all RAID I/O processing to the GPU, and we don't need to tell you how much that frees up CPU resources. Standard RAID cards are simply no match for the computing potential of a GPU. Even though the Graid solution is a software RAID, the NVIDIA GPU card is essential to many of the benefits Graid has to offer. Additionally, thanks to the specifics of the Graid software architecture, data flows directly from the CPU straight to the storage, bypassing the Graid card.
Traditionally, NVIDIA cards serve various purposes. They are in demand for gaming, video acceleration, cryptocurrency mining, and professional tools such as VDI. NVIDIA also produces GPUs for vehicles. And now NVIDIA hardware powers storage appliances as well. This marks nothing less than a breakthrough in applying the computing potential of the GPU to a whole new field.
Here is the list of the hardware and software used in our benchmark.
Storage node:
Hardware |
sw-sannas-01 | Dell PowerEdge R750 |
CPU | Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz |
Sockets | 2 |
Cores/Threads | 80/160 |
RAM | 1024 GB |
Storage | 8x NVMe – PBlaze6 6926 12.8TB |
GRAID | SupremeRAID SR-1010 |
HBAs | 4x Marvell® QLogic® 2772 Series Enhanced 32GFC Fibre Channel Adapters |
Software |
StarWind SAN & NAS | Version 1.0.2 (build 2175 – FC) |
Client nodes:
Hardware |
win-cli-{01..04} | Dell PowerEdge R740xd |
CPU | Intel® Xeon® Gold 6130 Processor @ 2.10GHz |
Sockets | 2 |
Cores/Threads | 32/64 |
RAM | 256 GB |
HBAs | 1x Marvell® QLogic® 2772 Series Enhanced 32GFC Fibre Channel Adapter |
Software |
OS | Windows Server 2019 Standard Edition |
Communication between the storage node and the client nodes was carried out over a 32GFC Fibre Channel fabric. The storage node had 4x Marvell® QLogic® 2772 Series Enhanced 32GFC Fibre Channel Adapters, while each client node had one. The storage and client nodes were connected through two Brocade G620 Fibre Channel switches to ensure resilience.
An interesting thing about the Marvell QLogic 2772 Fibre Channel adapters is that their ports are independently resourced, which adds an extra layer of resilience. Complete port-level isolation across the FC controller architecture prevents errors and firmware crashes from propagating across all ports. Find out more about the Marvell QLogic 2772 Fibre Channel adapters in terms of high availability and reliability.
We combined the 8 NVMe drives on the storage node into a RAID5 array, first with MDRAID:
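MDRAID arrays are typically created with the mdadm utility; a minimal sketch of how such an array could be assembled (device names are illustrative, and the exact options used in this setup may have differed):

# Combine 8 NVMe namespaces into one RAID5 MD device (device names are illustrative)
mdadm --create /dev/md0 --level=5 --raid-devices=8 \
  /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 \
  /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1
# Watch the initial parity sync and check the array details
cat /proc/mdstat
mdadm --detail /dev/md0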
And then, with Graid correspondingly:
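For Graid, array creation goes through SupremeRAID's graidctl utility. The sketch below follows the general physical-drive / drive-group / virtual-drive workflow; treat the exact arguments as illustrative and check the SupremeRAID user guide for the syntax of your driver version:

# Register the NVMe SSDs as SupremeRAID physical drives (illustrative device list)
sudo graidctl create physical_drive /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 \
  /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1
# Build a RAID5 drive group from physical drives 0-7 and expose a virtual drive on top of it
sudo graidctl create drive_group raid5 0-7
sudo graidctl create virtual_drive 0
# Verify the drive group and the resulting virtual drive
sudo graidctl list drive_group
sudo graidctl list virtual_drive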
Once the RAID arrays were ready, we sliced them into 32 LUNs of 1TB each, distributed as 8 LUNs per client node. We did this because a single LUN has a performance limitation, and we wanted to squeeze the maximum out of our storage (an illustrative slicing is sketched below).
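The LUNs themselves are created and exported through the StarWind SAN & NAS management console, so the snippet below is purely an illustration of the resulting layout: one hypothetical way to carve a RAID device into 32 volumes of 1TB each using LVM (names are ours, not the product's):

# Illustration only: slice the RAID device into 32 x 1 TiB logical volumes
pvcreate /dev/md0
vgcreate vg_luns /dev/md0
for i in $(seq -w 1 32); do
  lvcreate -L 1T -n lun$i vg_luns
done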
Here is an example of the 8 LUNs connected on one client node:
The benchmark was run using the fio utility. fio is a cross-platform, industry-standard benchmarking tool used to test both local and shared storage.
1. Testing single NVMe drive performance:
Talking about these NVMe SSDs, an interesting detail is that they support flexible power management from 10W to 35W, with a 25W power mode by default. Basically, Memblaze's NVMe drives can increase sequential write performance by consuming more power, which gives a flexible way to tune drive performance for a specific workload.
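As a hedged illustration, the power states a drive advertises and its current Power Management setting can be inspected with nvme-cli (the device name is illustrative):

# List the power states reported by the controller (maximum power per state, latencies, etc.)
nvme id-ctrl /dev/nvme0 | grep "^ps"
# Show the current Power Management feature (feature ID 0x02) in human-readable form
nvme get-feature /dev/nvme0 -f 0x02 -H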
We obtained the optimal (speed/latency) performance of a single NVMe drive at the following Numjobs and IOdepth values (a sample fio invocation is sketched after the table):
1 NVMe PBlaze6 D6926 Series 12.8TB
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) |
4k random read | 16 | 32 | 1514000 | 5914 | 0,337 |
4k random write | 4 | 4 | 537000 | 2096 | 0,029 |
64k random read | 4 | 8 | 103000 | 6467 | 0,308 |
64k random write | 2 | 2 | 42000 | 2626 | 0,094 |
1M read | 1 | 4 | 6576 | 6576 | 0,607 |
1M write | 1 | 2 | 5393 | 5393 | 0,370 |
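For reference, the 4k random read row above corresponds to an fio invocation along these lines (device name and runtime are illustrative; the exact job options used may have differed):

# 4k random read on a raw NVMe device: 16 jobs, queue depth 32, aggregated results
fio --name=4k-randread --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 \
  --rw=randread --bs=4k --numjobs=16 --iodepth=32 \
  --time_based --runtime=60 --group_reporting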
Before running the actual tests, we have determined the time needed to warm up the NVMe drives to Steady State:
P.S. More information about Performance States.
From the graph, it was visible that the NVMe drives should be warmed up for around 2 hours.
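A minimal sketch of the kind of preconditioning job that drives a device toward steady state, assuming a roughly two-hour sustained random write (parameters are illustrative):

# Warm-up / preconditioning: sustained 4k random writes across the whole device for ~2 hours
fio --name=precondition --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 \
  --rw=randwrite --bs=4k --numjobs=4 --iodepth=32 \
  --time_based --runtime=7200 --group_reporting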
2. Testing MD and Graid RAID arrays performance locally:
Fewer words, more numbers. Heading to MDRAID and Graid local performance tests.
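The local tests reuse the single-drive fio template, pointed at the array device instead, and sweep the Numjobs and IOdepth combinations shown in the tables. A minimal sketch, assuming /dev/md0 for MDRAID (the Graid virtual drive has its own device node, e.g. /dev/gdg0n1; the name is illustrative):

# Sweep Numjobs x IOdepth against the array device (values and device name are illustrative)
DEV=/dev/md0            # or the Graid virtual drive device
for jobs in 16 32 64; do
  for depth in 16 32 64; do
    fio --name=randread-j${jobs}-q${depth} --filename=$DEV --ioengine=libaio --direct=1 \
      --rw=randread --bs=4k --numjobs=$jobs --iodepth=$depth \
      --time_based --runtime=60 --group_reporting
  done
done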
4k random read:
Table result:
| | | MDRAID5 | | | | GRAID5 | | | | Comparison (GRAID5 as % of MDRAID5) | | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage |
4k random read | 16 | 16 | 2670000 | 10430 | 0,095 | 7% | 1281000 | 5004 | 0,198 | 3% | 48% | 48% | 208% | 46% |
4k random read | 16 | 32 | 3591000 | 14027 | 0,141 | 10% | 2451000 | 9574 | 0,207 | 6% | 68% | 68% | 147% | 60% |
4k random read | 32 | 32 | 4049000 | 15816 | 0,250 | 20% | 4474000 | 17477 | 0,227 | 10% | 110% | 110% | 91% | 50% |
4k random read | 32 | 64 | 4032000 | 15750 | 0,504 | 30% | 7393000 | 28879 | 0,275 | 16% | 183% | 183% | 55% | 53% |
4k random read | 64 | 64 | 4061000 | 15863 | 0,998 | 40% | 10800000 | 42188 | 0,377 | 25% | 266% | 266% | 38% | 63% |
Graphs:
4k random write:
Table result:
* – in order to get the maximum performance of 1.5M IOPs with Graid SR-1010, you need a PCIe Gen4 x16 slot. Our server, however, had only PCIe Gen4 x8 slots.
4k random read/write 70/30:
Table result:
| | | MDRAID5 | | | | GRAID5 | | | | Comparison (GRAID5 as % of MDRAID5) | | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage |
4k random read/write 70/30 | 8 | 16 | 765000 | 2988 | 0,202 | 5% | 429000 | 1676 | 0,344 | 1% | 56% | 56% | 170% | 31% |
4k random read/write 70/30 | 16 | 16 | 1078000 | 4211 | 0,285 | 14% | 776000 | 3031 | 0,382 | 2% | 72% | 72% | 134% | 14% |
4k random read/write 70/30 | 16 | 32 | 1100000 | 4297 | 0,518 | 17% | 1253000 | 4895 | 0,470 | 3% | 114% | 114% | 91% | 18% |
4k random read/write 70/30 | 32 | 32 | 1147000 | 4480 | 0,960 | 30% | 1944000 | 7594 | 0,608 | 5% | 169% | 169% | 63% | 15% |
4k random read/write 70/30 | 32 | 64 | 1154000 | 4508 | 1,847 | 30% | 2686000 | 10492 | 0,882 | 6% | 233% | 233% | 48% | 20% |
4k random read/write 70/30 | 64 | 64 | 1193000 | 4660 | 5,298 | 49% | 3140000 | 12266 | 1,529 | 8% | 263% | 263% | 29% | 15% |
Graphs:
64k random read:
Table result:
| | | MDRAID5 | | | | GRAID5 | | | | Comparison (GRAID5 as % of MDRAID5) | | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage |
64k random read | 8 | 8 | 186000 | 11625 | 0,343 | 5% | 175000 | 10938 | 0,364 | 1% | 94% | 94% | 106% | 16% |
64k random read | 8 | 16 | 188000 | 11750 | 0,679 | 5% | 292000 | 18250 | 0,438 | 2% | 155% | 155% | 65% | 30% |
64k random read | 16 | 16 | 196000 | 12250 | 1,309 | 10% | 461000 | 28813 | 0,554 | 2% | 235% | 235% | 42% | 20% |
64k random read | 16 | 32 | 195000 | 12188 | 2,624 | 10% | 646000 | 40375 | 0,792 | 3% | 331% | 331% | 30% | 30% |
64k random read | 32 | 32 | 195000 | 12188 | 5,242 | 20% | 740000 | 46250 | 1,382 | 3% | 379% | 379% | 26% | 15% |
Graphs:
64k random write:
Table result:
| | | MDRAID5 | | | | GRAID5 | | | | Comparison (GRAID5 as % of MDRAID5) | | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage |
64k random write | 8 | 8 | 92200 | 5763 | 0,693 | 7% | 67400 | 4213 | 0,948 | 1% | 73% | 73% | 137% | 10% |
64k random write | 8 | 16 | 118000 | 7375 | 1,081 | 14% | 104000 | 6500 | 1,229 | 1% | 88% | 88% | 114% | 10% |
64k random write | 16 | 16 | 117000 | 7313 | 2,179 | 16% | 135000 | 8438 | 1,895 | 2% | 115% | 115% | 87% | 11% |
64k random write | 16 | 32 | 117000 | 7313 | 4,369 | 16% | 146000 | 9125 | 3,496 | 2% | 125% | 125% | 80% | 13% |
1M read:
Table result:
| | | MDRAID5 | | | | GRAID5 | | | | Comparison (GRAID5 as % of MDRAID5) | | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage |
1M read | 4 | 4 | 10000 | 10000 | 1,592 | 3% | 18200 | 18200 | 0,880 | 0% | 182% | 182% | 55% | 12% |
1M read | 8 | 4 | 11000 | 11000 | 2,673 | 5% | 28600 | 28600 | 1,120 | 1% | 260% | 260% | 42% | 10% |
1M read | 8 | 8 | 11900 | 11900 | 5,393 | 5% | 39400 | 39400 | 1,623 | 1% | 331% | 331% | 30% | 10% |
1M read | 8 | 16 | 12100 | 12100 | 10,563 | 5% | 44700 | 44700 | 2,865 | 1% | 369% | 369% | 27% | 12% |
1M read | 16 | 16 | 12100 | 12100 | 21,156 | 10% | 47000 | 47000 | 5,442 | 1% | 388% | 388% | 26% | 6% |
1M write:
Table result:
| | | MDRAID5 | | | | GRAID5 | | | | Comparison (GRAID5 as % of MDRAID5) | | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage | IOPs | MiB/s | Latency (ms) | CPU usage |
1M write | 4 | 4 | 6938 | 6938 | 2,300 | 9% | 5363 | 5363 | 2,981 | 1% | 77% | 77% | 130% | 9% |
1M write | 8 | 4 | 6730 | 6730 | 4,753 | 11% | 8251 | 8251 | 3,876 | 1% | 123% | 123% | 82% | 12% |
1M write | 8 | 8 | 6782 | 6782 | 9,434 | 12% | 10100 | 10100 | 6,312 | 2% | 149% | 149% | 67% | 17% |
1M write | 8 | 16 | 6780 | 6780 | 18,870 | 12% | 11100 | 11100 | 11,530 | 2% | 164% | 164% | 61% | 17% |
1M write | 16 | 16 | 7071 | 7071 | 36,182 | 17% | 11400 | 11400 | 22,490 | 3% | 161% | 161% | 62% | 15% |
Graphs:
MDRAID shows decent performance at low Numjobs and IOdepth values, but as the workload increases, latency grows and performance stops scaling. Graid, on the other hand, gives better results at high Numjobs and IOdepth values: on the 4k random read pattern, we received an incredible 10.8M IOPs at a latency of just 0.377 ms. That is essentially the speed of 7 of the 8 NVMe drives. On large-block reads, Graid reaches a throughput of 40 GiB/s (64k) and 47 GiB/s (1M), while MDRAID hits a ceiling at around 12 GiB/s.
3. Running benchmark remotely from client nodes:
Having received such impressive local storage results, we were ready to give FCP a try and see whether it could deliver comparable performance on the client nodes.
In the results below, the Numjobs value is given as the total across all 32 LUNs.
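On the Windows client nodes, fio addresses the FC LUNs as raw physical drives through the windowsaio engine. A minimal per-client sketch, assuming the eight LUNs appear as PhysicalDrive1 through PhysicalDrive8 (drive numbers and job values are illustrative; ^ is the line-continuation character in the Windows command prompt):

fio --name=fc-4k-randread --ioengine=windowsaio --thread --direct=1 ^
  --filename=\\.\PhysicalDrive1:\\.\PhysicalDrive2:\\.\PhysicalDrive3:\\.\PhysicalDrive4:\\.\PhysicalDrive5:\\.\PhysicalDrive6:\\.\PhysicalDrive7:\\.\PhysicalDrive8 ^
  --rw=randread --bs=4k --numjobs=4 --iodepth=16 --time_based --runtime=60 --group_reporting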
4k random read:
Table result:
| | | MDRAID5 | | | GRAID5 | | | Comparison (GRAID5 as % of MDRAID5) | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) |
4k random read | 16 | 16 | 1664285 | 6501 | 0,132 | 1067226 | 4169 | 0,230 | 64% | 64% | 174% |
4k random read | 32 | 16 | 3184359 | 12439 | 0,141 | 2104438 | 8221 | 0,233 | 66% | 66% | 165% |
4k random read | 64 | 16 | 3531393 | 13795 | 0,274 | 3687970 | 14406 | 0,264 | 104% | 104% | 96% |
4k random read | 128 | 16 | 3544646 | 13847 | 0,563 | 4563635 | 17827 | 0,430 | 129% | 129% | 76% |
4k random read | 16 | 32 | 1783060 | 6965 | 0,199 | 1772981 | 6926 | 0,261 | 99% | 99% | 131% |
4k random read | 32 | 32 | 3500411 | 13674 | 0,253 | 3475477 | 13576 | 0,268 | 99% | 99% | 106% |
4k random read | 64 | 32 | 3532084 | 13797 | 0,563 | 4459783 | 17421 | 0,436 | 126% | 126% | 77% |
4k random read | 128 | 32 | 3549901 | 13867 | 1,139 | 4578663 | 17886 | 0,873 | 129% | 129% | 77% |
4k random write:
Table result:
| | | MDRAID5 | | | GRAID5 | | | Comparison (GRAID5 as % of MDRAID5) | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) |
4k random write | 16 | 16 | 204612 | 799 | 1,241 | 304228 | 1188 | 0,833 | 149% | 149% | 67% |
4k random write | 32 | 16 | 238109 | 930 | 2,143 | 513988 | 2008 | 0,988 | 216% | 216% | 46% |
4k random write | 64 | 16 | 271069 | 1059 | 3,769 | 514719 | 2011 | 1,980 | 190% | 190% | 53% |
4k random write | 128 | 16 | 331108 | 1294 | 6,176 | 511970 | 2000 | 3,991 | 155% | 155% | 65% |
4k random write | 16 | 32 | 247398 | 966 | 2,059 | 307504 | 1201 | 1,657 | 124% | 124% | 80% |
4k random write | 32 | 32 | 285527 | 1115 | 3,578 | 512118 | 2001 | 1,992 | 179% | 179% | 56% |
4k random write | 64 | 32 | 341017 | 1332 | 5,996 | 491534 | 1920 | 4,157 | 144% | 144% | 69% |
4k random write | 128 | 32 | 385361 | 1506 | 10,617 | 498065 | 1946 | 8,212 | 129% | 129% | 77% |
4k random read/write 70/30:
Table result:
| | | MDRAID5 | | | GRAID5 | | | Comparison (GRAID5 as % of MDRAID5) | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) |
4k random read/write 70/30 | 16 | 16 | 538622 | 2104 | 0,683 | 646787 | 2527 | 0,470 | 120% | 120% | 69% |
4k random read/write 70/30 | 32 | 16 | 670407 | 2619 | 1,136 | 1109071 | 4332 | 0,554 | 165% | 165% | 49% |
4k random read/write 70/30 | 64 | 16 | 805986 | 3149 | 1,955 | 1072219 | 4188 | 1,370 | 133% | 133% | 70% |
4k random read/write 70/30 | 128 | 16 | 927080 | 3622 | 3,493 | 1089414 | 4256 | 2,912 | 118% | 118% | 83% |
4k random read/write 70/30 | 16 | 32 | 700225 | 2735 | 1,065 | 644987 | 2520 | 1,133 | 92% | 92% | 106% |
4k random read/write 70/30 | 32 | 32 | 817516 | 3194 | 1,928 | 1103024 | 4309 | 1,329 | 135% | 135% | 69% |
4k random read/write 70/30 | 64 | 32 | 933090 | 3645 | 3,471 | 1098277 | 4290 | 2,888 | 118% | 118% | 83% |
4k random read/write 70/30 | 128 | 32 | 997943 | 3899 | 6,616 | 1061938 | 4149 | 6,202 | 106% | 106% | 94% |
64k random read:
Table result:
| | | MDRAID5 | | | GRAID5 | | | Comparison (GRAID5 as % of MDRAID5) | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) |
random read 64K | 8 | 8 | 192015 | 12001 | 0,326 | 149755 | 9360 | 0,420 | 78% | 78% | 129% |
random read 64K | 8 | 16 | 193967 | 12123 | 0,652 | 260821 | 16302 | 0,483 | 134% | 134% | 74% |
random read 64K | 8 | 32 | 194089 | 12131 | 1,311 | 397736 | 24859* | 0,634 | 205% | 205% | 48% |
* – throughput limitation of our FC adapters (3,200 MB/s × 8 ports = 25,600 MB/s).
64k random write:
Table result:
| | | MDRAID5 | | | GRAID5 | | | Comparison (GRAID5 as % of MDRAID5) | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) |
random write 64K | 8 | 8 | 37343 | 2334 | 1,705 | 61839 | 3865 | 1,027 | 166% | 166% | 60% |
random write 64K | 8 | 16 | 51048 | 3191 | 2,497 | 100093 | 6256 | 1,269 | 196% | 196% | 51% |
random write 64K | 16 | 16 | 65517 | 4095 | 3,895 | 132669 | 8292 | 1,915 | 202% | 202% | 49% |
random write 64K | 16 | 32 | 85255 | 5330 | 5,992 | 138609 | 8664 | 3,677 | 163% | 163% | 61% |
1M read:
Table result:
| | | MDRAID5 | | | GRAID5 | | | Comparison (GRAID5 as % of MDRAID5) | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) |
1M read | 4 | 2 | 9690 | 9690 | 0,803 | 8542 | 8542 | 0,915 | 88% | 88% | 114% |
1M read | 4 | 4 | 10495 | 10495 | 1,503 | 14799 | 14799 | 1,059 | 141% | 141% | 70% |
1M read | 4 | 8 | 11018 | 11018 | 2,874 | 19841 | 19841 | 1,584 | 180% | 180% | 55% |
1M read | 4 | 16 | 11713 | 11713 | 5,442 | 25150 | 25150* | 2,520 | 215% | 215% | 46% |
* – throughput limitation of our FC adapters (3,200 MB/s × 8 ports = 25,600 MB/s).
Graphs:
1M write:
Table result:
| | | MDRAID5 | | | GRAID5 | | | Comparison (GRAID5 as % of MDRAID5) | | |
Pattern | Numjobs | IOdepth | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) |
1M write | 4 | 2 | 6028 | 6028 | 1,284 | 2991 | 2991 | 2,633 | 50% | 50% | 205% |
1M write | 4 | 4 | 7222 | 7222 | 2,167 | 4497 | 4497 | 3,509 | 62% | 62% | 162% |
1M write | 4 | 8 | 6992 | 6992 | 4,521 | 6748 | 6748 | 4,684 | 96% | 97% | 104% |
1M write | 4 | 16 | 6819 | 6819 | 9,310 | 8902 | 8902 | 7,125 | 131% | 131% | 77% |
1M write | 8 | 16 | 7144 | 7144 | 17,832 | 10493 | 10493 | 12,117 | 147% | 147% | 68% |
Graphs:
In the tables below, we provide the best result achieved in each test in terms of the performance/latency ratio. The full benchmark results are provided above.
MDRAID:
| MDRAID5 – local | | | MDRAID5 – FCP | | | Comparison (FCP as % of local) | | |
Pattern | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) |
4k random read | 4049000 | 15816 | 0,250 | 3531393 | 13795 | 0,274 | 87% | 87% | 110% |
4k random write | 478000 | 1867 | 0,535 | 341017 | 1332 | 5,996 | 71% | 71% | 1121% |
4k random read/write 70/30 | 1078000 | 4211 | 0,285 | 927080 | 3622 | 3,493 | 86% | 86% | 1226% |
64k random read | 186000 | 11625 | 0,343 | 192015 | 12001 | 0,326 | 103% | 103% | 95% |
64k random write | 118000 | 7375 | 1,081 | 85255 | 5330 | 5,992 | 72% | 72% | 554% |
1M read | 11900 | 11900 | 5,393 | 11709 | 11709 | 5,442 | 98% | 98% | 101% |
1M write | 6938 | 6938 | 2,300 | 7221 | 7221 | 2,167 | 104% | 104% | 94% |
GRAID:
| GRAID5 – local | | | GRAID5 – FCP | | | Comparison (FCP as % of local) | | |
Pattern | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) | IOPs | MiB/s | Latency (ms) |
4k random read | 10800000 | 42188 | 0,377 | 4563635 | 17827 | 0,430 | 42% | 42% | 114% |
4k random write | 975000 | 3809 | 2,100 | 514719 | 2011 | 1,980 | 53% | 53% | 94% |
4k random read/write 70/30 | 3140000 | 12266 | 1,529 | 1109071 | 4332 | 0,554 | 35% | 35% | 36% |
64k random read | 740000 | 46250 | 1,382 | 397736 | 24859 | 0,634 | 54% | 54% | 46% |
64k random write | 135000 | 8438 | 1,895 | 132669 | 8292 | 1,915 | 98% | 98% | 101% |
1M read | 47000 | 47000 | 5,442 | 25150 | 25150* | 2,520 | 54% | 54% | 46% |
1M write | 11100 | 11100 | 11,530 | 10493 | 10493 | 12,117 | 95% | 95% | 105% |
* – throughput limitation of our FC adapters (3,200 MB/s × 8 ports = 25,600 MB/s).
Essentially, the most impressive shared storage performance was delivered by a redundant Graid storage array full of PBlaze6 6920 Series NVMe SSDs, with StarWind SAN & NAS on top, presented over Fibre Channel to the client nodes through Marvell QLogic 2772 Fibre Channel adapters. Graid is arguably the only technology right now that can deliver close to the highest performance software-defined shared storage can offer. The Graid build managed to deliver around 50% of the local RAID array performance at approximately the same latency as local storage. The only reason the 64k/1M large-block read results differ is the natural technical limitation of running at or near the maximum bandwidth of a 32GFC Fibre Channel environment.
Locally, Graid shows outstanding results under heavy workloads: it delivered a seemingly impossible 10.8M IOPs with a latency of just 0.377 ms on the 4k random read pattern. Also, since Graid offloads I/O request processing to the GPU, CPU usage on the storage node is 2-10 times lower than with MDRAID, which leaves free CPU resources for other tasks. With MDRAID over FCP, we practically achieved the full performance the RAID array could provide locally, but at the cost of significantly higher latency.
If you want to unleash the full performance potential of Graid, we would advise looking into NVMe-oF and RDMA, which will be added in subsequent StarWind SAN & NAS builds. You can find out more about NVMe-oF and StarWind NVMe-oF Initiator performance in one of our upcoming articles.
Graid Technology™️ is headquartered in Silicon Valley, California, with an office in Ontario, California, and an R&D center in Taipei, Taiwan. Named one of the Ten Hottest Data Storage Startups of 2021 by CRN, Graid SupremeRAID performance is breaking world records as the first NVMe and NVMeoF RAID card to unlock the full potential of your SSD performance: a single SupremeRAID card delivers 19 million IOPS and 110GB/s of throughput. For more information on Graid Technology, visit Graid Technology or connect with us on Twitter or LinkedIn.
Learn more about award-winning GPU-based NVMe RAID controller SupremeRAID™ by Graid Technology. We’re ushering in the future of high storage capacity and extreme performance for mission critical and performance-demanding workloads. Contact us today to chat with a sales representative in your region.