November 7, 2011

All nodes are now configured with Write Through on the SSD logical disk. SNMP polling of the Brocade is now done at 1-minute intervals.

cacti_notes_5x5_2011-011-07.png

Figure 3: We are now seeing above 9 Gbps disk to disk on all boxes. The write on scdemo03 is the slowest in the cluster by about 1 Gb/s; it's not immediately obvious what the problem is.

Completed a 5-to-5 disk-to-disk test:

                    CFQ Scheduler                   NOOP Scheduler
Transfer            Avg Rate      Avg Reverse Rate  Avg Rate      Avg Reverse Rate
scdemo00->scdemo05  9.174 Gb/s    9.435 Gb/s        9.544 Gb/s    9.489 Gb/s
scdemo01->scdemo06  9.172 Gb/s    9.175 Gb/s        9.490 Gb/s    9.492 Gb/s
scdemo02->scdemo07  9.121 Gb/s    9.173 Gb/s        9.490 Gb/s    9.549 Gb/s
scdemo03->scdemo08  9.225 Gb/s    8.087 Gb/s        9.382 Gb/s    8.332 Gb/s
scdemo04->scdemo09  9.382 Gb/s    9.170 Gb/s        9.545 Gb/s    9.545 Gb/s

Table 1: Switching to the noop scheduler added roughly another 3 percent in throughput, averaged over the five pairs (individual pairs gained between about 0.6 and 4 percent).
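
The averages in Table 1 can be cross-checked with a few lines of shell. This is only a sketch: the helper names `mean` and `pct_gain` are made up for illustration, and the numbers are copied from the table rows above.

```shell
#!/bin/sh
# Mean of the rates (Gb/s) passed as arguments, printed to 3 decimals.
mean() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.3f\n", s / NR }'
}

# Percent change from the first value to the second, printed to 1 decimal.
pct_gain() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

# Per-direction averages over the five node pairs in Table 1.
cfq_fwd=$(mean 9.174 9.172 9.121 9.225 9.382)
noop_fwd=$(mean 9.544 9.490 9.490 9.382 9.545)
cfq_rev=$(mean 9.435 9.175 9.173 8.087 9.170)
noop_rev=$(mean 9.489 9.492 9.549 8.332 9.545)

echo "forward: CFQ $cfq_fwd -> NOOP $noop_fwd Gb/s ($(pct_gain "$cfq_fwd" "$noop_fwd")%)"
echo "reverse: CFQ $cfq_rev -> NOOP $noop_rev Gb/s ($(pct_gain "$cfq_rev" "$noop_rev")%)"
```

Averaging per direction keeps the slow scdemo03 write from being hidden inside a single cluster-wide number.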

Switching to the noop scheduler:

echo noop > /sys/block/sdb/queue/scheduler
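
Reading the same sysfs file back shows the available schedulers, with the active one in square brackets. The following is a minimal sketch for confirming that the change took effect; the helper name `active_scheduler` is hypothetical, and `sdb` is simply the device used above.

```shell
#!/bin/sh
# Extract the active scheduler (the bracketed entry) from a line such as
# "noop anticipatory deadline [cfq]".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Usage on a live node (the write needs root):
#   echo noop > /sys/block/sdb/queue/scheduler
#   active_scheduler "$(cat /sys/block/sdb/queue/scheduler)"
```

A setting written this way takes effect immediately but is not expected to survive a reboot, so it would need to be reapplied at boot if the nodes are restarted during testing.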

cacti_notes__noop_2011-11-07.png

Figure 4: Improvement when moving from the CFQ to the noop kernel disk scheduler.

November 6, 2011

cacti_notes_2011-011-06.png

Figure 1: Preliminary results from testing on Sunday morning, Nov 6. In summary, simultaneous iperf is good from all hosts (9.9+ Gbps). Reading from disk and writing to memory is reasonable at 7.5-8.5 Gbit/s, but I think it could do with some improvement. Disk-to-disk performance is only 5.0 Gbps, which is not really adequate for the test. We are using 2000 ATLAS files of around 500 MB on each node for the transfers. We had better performance with 10G files written with 'dd'. Graph pulled from cacti for the Brocade.
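
For reference, a set of large 'dd' test files could be produced along these lines. This is a sketch under assumptions: the notes do not record the exact dd invocation, so the zero-filled input, the block size, the function name `make_test_files`, and the `/ssd` target in the commented example are all guesses. Only the `10GfileNN.dat` naming pattern follows the file name that appears in the transfer command further down.

```shell
#!/bin/sh
# Create COUNT files of SIZE_MIB mebibytes each in DIR, named 10GfileNN.dat.
# Zero-filled data is an assumption; the notes do not say what dd read from.
make_test_files() {
    dir=$1; count=$2; size_mib=$3
    i=1
    while [ "$i" -le "$count" ]; do
        name=$(printf '10Gfile%02d.dat' "$i")
        # bs is given in bytes (1 MiB) so the same line works with GNU and BSD dd.
        dd if=/dev/zero of="$dir/$name" bs=1048576 count="$size_mib" 2>/dev/null || return 1
        i=$((i + 1))
    done
}

# Example (commented out because it writes 250 GiB): 25 files of 10 GiB each.
# make_test_files /ssd 25 10240
```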

Now attempting to improve disk performance by removing LVM and changing the RAID controller to write-through. Also note that scdemo00 and scdemo01 had their RAID stripe size set to 64 KB.

 Delete a logical drive:
 [root@scdemo00 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdDel -L1 -a0
 
 Get the enclosure device ID:
 [root@scdemo00 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -EncInfo -aALL
 
 Create a logical drive:
 [root@scdemo00 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r0 [32:2,32:3,32:4,32:5,32:6] WT ADRA Direct -strpsz1024 -a0
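
The bracketed argument in the create command is a list of enclosure:slot pairs, here enclosure device ID 32 and slots 2 to 6. When repeating this on several nodes it may help to generate that list rather than type it. The sketch below uses a hypothetical helper, `pd_list`, and only prints the command for review instead of running it.

```shell
#!/bin/sh
# Build a physical-drive list "[E:S,E:S,...]" for one enclosure and a
# contiguous slot range. Helper name and arguments are hypothetical.
pd_list() {
    enc=$1; first=$2; last=$3
    list=""
    slot=$first
    while [ "$slot" -le "$last" ]; do
        # Append "enc:slot", adding a comma only after the first entry.
        list="${list:+$list,}$enc:$slot"
        slot=$((slot + 1))
    done
    printf '[%s]\n' "$list"
}

# Print, rather than run, the RAID 0 creation command used in these notes.
echo /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r0 "$(pd_list 32 2 6)" WT ADRA Direct -strpsz1024 -a0
```

Printing first gives a chance to compare the enclosure ID against the -EncInfo output before anything is changed on the controller.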

Dramatic improvement with these new disk settings, but still using large files.

[igable@scdemo09 ssd]$ fdtclient -c 10.200.0.50 10Gfile01.dat * -d /ssd/batch0


FDT [ 0.9.23-201107290935 ] STARTED ... 


Nov 06, 2011 5:04:15 PM lia.util.net.common.Config <init>
INFO: Using lia.util.net.copy.PosixFSFileChannelProviderFactory as FileChannelProviderFactory
Nov 06, 2011 5:04:15 PM lia.util.net.common.Config <init>
INFO: FDT started in client mode
<snip>

INFO: Requested window size -1. Using window size: 43690
06/11 17:04:25   Net Out: 9.929 Gb/s   Avg: 9.929 Gb/s
06/11 17:04:30   Net Out: 9.408 Gb/s   Avg: 9.668 Gb/s


<snip>

06/11 17:07:45   Net Out: 9.892 Gb/s   Avg: 9.597 Gb/s 93.30% ( 15s )
06/11 17:07:50   Net Out: 9.923 Gb/s   Avg: 9.605 Gb/s 95.61% ( 09s )
06/11 17:07:55   Net Out: 8.892 Gb/s   Avg: 9.588 Gb/s 97.68% ( 05s )
06/11 17:08:00   Net Out: 9.891 Gb/s   Avg: 9.595 Gb/s 99.98% ( 00s )
Nov 06, 2011 5:08:03 PM lia.util.net.copy.FDTReaderSession handleEndFDTSession
INFO: [ FDTReaderSession ] Remote FDTWriterSession for session [ f72d49d9-db0e-4b27-9264-0247e1bc6864 ] finished OK!
06/11 17:08:05   Net Out: 90.602 Mb/s   Avg: 9.384 Gb/s 100.00% ( 00s )


FDTReaderSession ( f72d49d9-db0e-4b27-9264-0247e1bc6864 ) final stats:
 Started: Sun Nov 06 17:04:15 PST 2011
 Ended:   Sun Nov 06 17:08:06 PST 2011
 Transfer period:   03m 50s
 TotalBytes: 268435456000
 TotalNetworkBytes: 268435456000
 Exit Status: OK

Nov 06, 2011 5:08:06 PM lia.util.net.copy.FDTReaderSession doPostProcessing
INFO: [ FDTReaderSession ] Post Processing started
Nov 06, 2011 5:08:06 PM lia.util.net.copy.FDTReaderSession doPostProcessing
INFO: [ FDTReaderSession ] No post processing filters defined/processed.
 [ Sun Nov 06 17:08:07 PST 2011 ] - GracefulStopper hook started ... Waiting for the cleanup to finish
 [ Sun Nov 06 17:08:07 PST 2011 ]  - GracefulStopper hook finished!

 [ Sun Nov 06 17:08:07 PST 2011 ]  FDT Session finished OK.
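
As a sanity check, the overall rate can be recomputed from the TotalBytes and Transfer period lines in the final stats above. This is a rough sketch: the helper names are made up, and Gb/s is taken to mean 10^9 bits per second, which is an assumption about FDT's units.

```shell
#!/bin/sh
# Convert an FDT "Transfer period" string such as "03m 50s" to seconds.
period_to_seconds() {
    printf '%s\n' "$1" | awk '{
        m = $1; s = $2
        sub(/m$/, "", m)   # strip the trailing "m" from the minutes field
        sub(/s$/, "", s)   # strip the trailing "s" from the seconds field
        print m * 60 + s
    }'
}

# Average rate in Gb/s (10^9 bits per second) for BYTES moved in SECONDS.
gbps() {
    awk -v b="$1" -v t="$2" 'BEGIN { printf "%.3f\n", b * 8 / t / 1000000000 }'
}

secs=$(period_to_seconds "03m 50s")
echo "period: ${secs}s, average: $(gbps 268435456000 "$secs") Gb/s"
```

The result comes out slightly below FDT's own running average, presumably because FDT averages over its reporting intervals rather than the whole session.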

Now do scdemo00->scdemo09 with 1 TB:

6/11 17:42:46  Net In: 6.342 Gb/s      Avg: 8.583 Gb/s 100.00% ( 00s )
Nov 06, 2011 5:42:49 PM lia.util.net.copy.transport.ControlChannel run
INFO:  ControlThread for ( 35c60688-40f0-4480-8025-bde1fcee9b25 ) /10.200.0.50:55421 FINISHED


FDTWriterSession ( 35c60688-40f0-4480-8025-bde1fcee9b25 ) final stats:
 Started: Sun Nov 06 17:26:44 PST 2011
 Ended:   Sun Nov 06 17:42:50 PST 2011
 Transfer period:   16m 06s
 TotalBytes: 1030792151040
 TotalNetworkBytes: 1030792151040
 Exit Status: OK

scdemo09->scdemo00

INFO: [ FDTReaderSession ] Remote FDTWriterSession for session [ 94f47223-967c-4275-a737-a71929e5dddb ] finished OK!
06/11 19:36:35  Net Out: 5.450 Gb/s     Avg: 9.383 Gb/s 100.00% ( 00s )


FDTReaderSession ( 94f47223-967c-4275-a737-a71929e5dddb ) final stats:
 Started: Sun Nov 06 19:21:55 PST 2011
 Ended:   Sun Nov 06 19:36:38 PST 2011
 Transfer period:   14m 42s
 TotalBytes: 1030792151040
 TotalNetworkBytes: 1030792151040
 Exit Status: OK

cacti_notes_good_rate_2011-11-06.png

Figure 2: After changing the RAID configuration to write-through and using large 10G files created with 'dd', we see much improved disk-to-disk throughput (as shown in the two FDT outputs immediately above). Strangely, one direction is nearly 0.8 Gbps faster than the other. I don't understand the reason for this yet.
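
The size of the asymmetry can be recomputed from the two final-stats blocks, since both transfers moved the same 1030792151040 bytes. A small sketch, with `rate_gap` as a hypothetical helper and Gb/s assumed to mean 10^9 bits per second:

```shell
#!/bin/sh
# Average rates (Gb/s) of two transfers of the same size, and their difference.
# Arguments: BYTES SECONDS_A SECONDS_B
rate_gap() {
    awk -v b="$1" -v ta="$2" -v tb="$3" 'BEGIN {
        ra = b * 8 / ta / 1000000000
        rb = b * 8 / tb / 1000000000
        printf "%.3f %.3f %.3f\n", ra, rb, rb - ra
    }'
}

# scdemo00->scdemo09 took 16m 06s (966 s); scdemo09->scdemo00 took 14m 42s (882 s).
rate_gap 1030792151040 966 882
```

The three numbers printed are the scdemo00->scdemo09 rate, the scdemo09->scdemo00 rate, and the gap between them.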

Topic revision: r6 - 2011-11-08 - igable
 