Difference: HundredGigabitTestingLog (1 vs. 14)

Revision 14 2011-11-15 - crlb

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"
Line: 14 to 14
 
  • The command is: /opt/versions/scdemo/bin/atlas-to-scdemo <file-lists-directory>
  • The content of the "<file-lists-directory>" is one or more files, each containing a list of paths of the ATLAS files to be copied. The name of each file within this directory is the fully qualified host name of the destination host, i.e., the list of ATLAS files contained within scdemo06.heprc.uvic.ca will be copied via FDT to scdemo06 (see the sketch after this list).
Added:
>
>
  • The command must be run from elephant10 or elephant11; they have the connection to the Xyratex.
  • You must ensure you have passwordless access to the scdemo nodes.
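A minimal sketch of how the file-lists directory could be prepared and the prerequisites checked. The /xyratex/atlas source paths and the /tmp/file-lists location below are illustrative assumptions, not part of the tooling:

# Build a file-lists directory: one file per destination, named with the
# destination's fully qualified host name, one ATLAS file path per line.
# (Source paths here are assumptions for illustration.)
mkdir -p /tmp/file-lists
ls /xyratex/atlas/batch06/*.root > /tmp/file-lists/scdemo06.heprc.uvic.ca
ls /xyratex/atlas/batch07/*.root > /tmp/file-lists/scdemo07.heprc.uvic.ca

# Confirm passwordless SSH to each destination before starting the copy.
for host in scdemo06 scdemo07; do
    ssh -o BatchMode=yes ${host} true || echo "no passwordless access to ${host}"
done

# Run the copy from elephant10 or elephant11.
/opt/versions/scdemo/bin/atlas-to-scdemo /tmp/file-lists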
 

November 11, 2011

Revision 13 2011-11-15 - crlb

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"
Line: 21 to 21
 Testing of MegaCLI -PDClear:

  • Test nodes consisted of scdemo06 and scdemo07. In all tests, scdemo06 was used as the client FDT node, pushing 89 x 11-gigabyte ATLAS ROOT files from the /ssd filesystem to the /ssd filesystem on scdemo07, which was running the FDT server. Both client and server were running under non-privileged accounts, the scheduling algorithm was set to "noop", and the /ssd filesystems on both systems were XFS on a hardware RAID0 of six OCZ Deneva2 drives, with a stripe size of 1 MB and write-through cache.
Added:
>
>
  • The script /opt/versions/scdemo/bin/mrPDClearStart will reset/clear all the drives in bays 2 through 7. The script contains the following:
#!/bin/bash
   set -e
   # Unmount the SSD filesystem before touching the RAID configuration.
   umount /ssd
   # Delete logical drive 1 on adapter 0.
   /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdDel -L1 -a0
   # Start a physical-drive clear on the drives in bays 2 through 7.
   /opt/MegaRAID/MegaCli/MegaCli64 -PDClear -Start -PhysDrv[32:2,32:3,32:4,32:5,32:6,32:7] -a0
  • Afterward, you will need to redefine the RAID set, partition it, and recreate the filesystem; a sketch of those steps follows.
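A sketch of the rebuild after the clears complete, reusing the CfgLdAdd parameters recorded later in this log (write through, 1 MB stripe); the -ShowProg query and the assumption that the logical drive appears as /dev/sdb are mine:

# Check progress of the clears; wait until all six drives report complete.
/opt/MegaRAID/MegaCli/MegaCli64 -PDClear -ShowProg -PhysDrv[32:2,32:3,32:4,32:5,32:6,32:7] -a0
# Redefine the RAID0 set (write through, adaptive read ahead, direct I/O, 1 MB stripe).
/opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r0 [32:2,32:3,32:4,32:5,32:6,32:7] WT ADRA Direct -strpsz1024 -a0
# Recreate and remount the XFS filesystem (partitioning step, if any, elided).
mkfs.xfs -f /dev/sdb
mount /dev/sdb /ssd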
 
  1. Last test prior to PDClear: Avg: 5.784 Gb/s after 100.00%.
  2. PDClear all drives on scdemo07. RAID0 redefined. XFS filesystem /ssd recreated and remounted.

Revision 12 2011-11-14 - igable

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"
Added:
>
>
 

November 14, 2011

http://sc-repo.uslhcnet.org

Line: 8 to 10
  Memory-to-memory testing started the afternoon of the 13th. The key to improving the performance was moving from hash-based to packet-based load balancing on the Caltech Brocade.
Added:
>
>
Moving the ATLAS data to the scdemo nodes:

  • The command is: /opt/versions/scdemo/bin/atlas-to-scdemo <file-lists-directory>
  • The content of the "<file-lists-directory>" is one or more files, each containing a list of paths of the ATLAS files to be copied. The name of each file within this directory is the fully qualified host name of the destination host, i.e., the list of ATLAS files contained within scdemo06.heprc.uvic.ca will be copied via FDT to scdemo06.

November 11, 2011

Testing of MegaCLI -PDClear:

  • Test nodes consisted of scdemo06 and scdemo07. In all tests, scdemo06 was used as the client FDT node, pushing 89 x 11-gigabyte ATLAS ROOT files from the /ssd filesystem to the /ssd filesystem on scdemo07, which was running the FDT server. Both client and server were running under non-privileged accounts, the scheduling algorithm was set to "noop", and the /ssd filesystems on both systems were XFS on a hardware RAID0 of six OCZ Deneva2 drives, with a stripe size of 1 MB and write-through cache. (A sketch of such a run follows the results below.)

  1. Last test prior to PDClear: Avg: 5.784 Gb/s after 100.00%.
  2. PDClear all drives on scdemo07. RAID0 redefined. XFS filesystem /ssd recreated and remounted.
  3. The first test following PDClear was terminated prematurely because the performance was atrocious (Avg: 3.575 Gb/s after 07.10%). The server had been running as root and the default scheduling algorithm had been in effect.
  4. Corrected the scheduling algorithm and the uid of the server and ran the second test to completion. Result was poor, Avg: 4.655 Gb/s after 100.00%. However, the start had been very good (Avg: 6.539 Gb/s after 01.68%).
  5. A third test was conducted to see if a previously used disk (after the PDClear) performed better. Result gave previously expected level of performance: Avg: 5.661 Gb/s after 100.00%.
  6. Target /ssd erased, completely filled with zeros, erased again, and the test run once more. Result: Avg: 5.790 Gb/s after 100.00%
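A minimal sketch of how one such client/server pair might have been launched, assuming FDT's standard server and client invocations (the exact flags used in these runs are not recorded here):

# On scdemo07: start the FDT server as a non-privileged user
# (server mode is the default when no -c option is given).
java -jar ~/fdt/fdt.jar

# On scdemo06: push the ATLAS ROOT files from /ssd to scdemo07's /ssd.
java -jar ~/fdt/fdt.jar -c scdemo07 -d /ssd /ssd/*.root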
 

November 8, 2011

Line: 272 to 293
 Figure 2: After changing the RAID configuration to be write through and using large 10G files created with 'dd' we see a much improved disk to disk throughput (as shown in the two FDT outputs immediately above). Strangely we see that one direction is nearly 0.8 Gbps faster than the other. I don't understand the reason for this yet.
Deleted:
<
<

November 11, 2011

Testing of MegaCLI -PDClear.

  • Test nodes consisted of scdemo06 and scdemo07. In all tests, scdemo06 was used as the client FDT node, pushing 89 x 11-gigabyte ATLAS ROOT files from the /ssd filesystem to the /ssd filesystem on scdemo07, which was running the FDT server. Both client and server were running under non-privileged accounts, the scheduling algorithm was set to "noop", and the /ssd filesystems on both systems were XFS on a hardware RAID0 of six OCZ Deneva2 drives, with a stripe size of 1 MB and write-through cache.

  1. Last test prior to PDClear: Avg: 5.784 Gb/s after 100.00%.
  2. PDClear all drives on scdemo07. RAID0 redefined. XFS filesystem /ssd recreated and remounted.
  3. The first test following PDClear was terminated prematurely because the performance was atrocious (Avg: 3.575 Gb/s after 07.10%). The server had been running as root and the default scheduling algorithm had been in effect.
  4. Corrected the scheduling algorithm and the uid of the server and ran the second test to completion. Result was poor, Avg: 4.655 Gb/s after 100.00%. However, the start had been very good (Avg: 6.539 Gb/s after 01.68%).
  5. A third test was conducted to see if a previously used disk (after the PDClear) performed better. Result gave previously expected level of performance: Avg: 5.661 Gb/s after 100.00%.
  6. Target /ssd erased, completely filled with zeros, erased again, and the test run once more. Result: Avg: 5.790 Gb/s after 100.00%

November 14, 2011

 
Deleted:
<
<

Moving the ATLAS data to the scdemo nodes.

 
Deleted:
<
<
  • The command is: /opt/versions/scdemo/bin/atlas-to-scdemo <file-lists-directory>
  • The content of the "<file-lists-directory>" is one or more files, each containing a list of paths of the ATLAS files to be copied. The name of each file within this directory is the fully qualified host name of the destination host, i.e., the list of ATLAS files contained within scdemo06.heprc.uvic.ca will be copied via FDT to scdemo06.
 
META FILEATTACHMENT attachment="cacti_notes_2011-011-06.png" attr="" comment="" date="1320622186" name="cacti_notes_2011-011-06.png" path="cacti_notes_2011-011-06.png" size="80579" user="igable" version="1"
META FILEATTACHMENT attachment="cacti_notes_good_rate_2011-11-06.png" attr="" comment="" date="1320640987" name="cacti_notes_good_rate_2011-11-06.png" path="cacti_notes_good_rate_2011-11-06.png" size="69346" user="igable" version="2"

Revision 11 2011-11-14 - crlb

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"

November 14, 2011

Line: 285 to 285
 
  1. A third test was conducted to see if a previously used disk (after the PDClear) performed better. Result gave previously expected level of performance: Avg: 5.661 Gb/s after 100.00%.
  2. Target /ssd erased, completely filled with zeros, erased again, and the test run once more. Result: Avg: 5.790 Gb/s after 100.00%
Added:
>
>

November 14, 2011

Moving the ATLAS data to the scdemo nodes.

  • The command is: /opt/versions/scdemo/bin/atlas-to-scdemo <file-lists-directory>
  • The content of the "<file-lists-directory>" is one or more files, each containing a list of paths of the ATLAS files to be copied. The name of each file within this directory is the fully qualified host name of the destination host, i.e., the list of ATLAS files contained within scdemo06.heprc.uvic.ca will be copied via FDT to scdemo06.
 
META FILEATTACHMENT attachment="cacti_notes_2011-011-06.png" attr="" comment="" date="1320622186" name="cacti_notes_2011-011-06.png" path="cacti_notes_2011-011-06.png" size="80579" user="igable" version="1"
META FILEATTACHMENT attachment="cacti_notes_good_rate_2011-11-06.png" attr="" comment="" date="1320640987" name="cacti_notes_good_rate_2011-11-06.png" path="cacti_notes_good_rate_2011-11-06.png" size="69346" user="igable" version="2"
META FILEATTACHMENT attachment="cacti_notes_5x5_2011-011-07.png" attr="" comment="Shows some asymetric throughput from one node" date="1320713240" name="cacti_notes_5x5_2011-011-07.png" path="cacti_notes_5x5_2011-011-07.png" size="126763" user="igable" version="1"

Revision 10 2011-11-14 - igable

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"
Added:
>
>

November 14, 2011

http://sc-repo.uslhcnet.org

http://supercomputing.uvic.ca/

Memory-to-memory testing started the afternoon of the 13th. The key to improving the performance was moving from hash-based to packet-based load balancing on the Caltech Brocade.

 

November 8, 2011

Mark managed to dump some packets down our 100G link this morning using a test set. We didn't observe any packet loss. He was sending at 19 Gbit/s.

Revision 9 2011-11-11 - crlb

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"

November 8, 2011

Line: 263 to 263
 Figure 2: After changing the RAID configuration to be write through and using large 10G files created with 'dd' we see a much improved disk to disk throughput (as shown in the two FDT outputs immediately above). Strangely we see that one direction is nearly 0.8 Gbps faster than the other. I don't understand the reason for this yet.
Added:
>
>

November 11, 2011

 
Added:
>
>

Testing of MegaCLI -PDClear.

  • Test nodes consisted of scdemo06 and scdemo07. In all tests, scdemo06 was used as the client FDT node, pushing 89 x 11-gigabyte ATLAS ROOT files from the /ssd filesystem to the /ssd filesystem on scdemo07, which was running the FDT server. Both client and server were running under non-privileged accounts, the scheduling algorithm was set to "noop", and the /ssd filesystems on both systems were XFS on a hardware RAID0 of six OCZ Deneva2 drives, with a stripe size of 1 MB and write-through cache.

  1. Last test prior to PDClear: Avg: 5.784 Gb/s after 100.00%.
  2. PDClear all drives on scdemo07. RAID0 redefined. XFS filesystem /ssd recreated and remounted.
  3. The first test following PDClear was terminated prematurely because the performance was atrocious (Avg: 3.575 Gb/s after 07.10%). The server had been running as root and the default scheduling algorithm had been in effect.
  4. Corrected the scheduling algorithm and the uid of the server and ran the second test to completion. Result was poor, Avg: 4.655 Gb/s after 100.00%. However, the start had been very good (Avg: 6.539 Gb/s after 01.68%).
  5. A third test was conducted to see if a previously used disk (after the PDClear) performed better. Result gave previously expected level of performance: Avg: 5.661 Gb/s after 100.00%.
  6. Target /ssd erased, completely filled with zeros, erased again, and the test run once more. Result: Avg: 5.790 Gb/s after 100.00%
 
META FILEATTACHMENT attachment="cacti_notes_2011-011-06.png" attr="" comment="" date="1320622186" name="cacti_notes_2011-011-06.png" path="cacti_notes_2011-011-06.png" size="80579" user="igable" version="1"
META FILEATTACHMENT attachment="cacti_notes_good_rate_2011-11-06.png" attr="" comment="" date="1320640987" name="cacti_notes_good_rate_2011-11-06.png" path="cacti_notes_good_rate_2011-11-06.png" size="69346" user="igable" version="2"

Revision 8 2011-11-09 - igable

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"

November 8, 2011

Line: 7 to 7
  100G-2011-11-08.png
Added:
>
>

Files produced via dd and /dev/zero have some interesting properties when stored on the SSDs. It would appear that writing zeros to an SSD gives artificially good results. We haven't discovered the cause of this, but it definitely has an impact. Since we switched to using files produced by /dev/urandom, our disk to disk performance has deteriorated significantly:

| 5 drive config D to D | 6 drive config D to D | 5 drive D to M | 6 drive D to M |
| Avg: 5.536 Gb/s | Avg: 6.172 Gb/s | Avg: 8.525 Gb/s | Avg: 9.787 Gb/s |

In an effort to get this back to something more reasonable we pulled two drives out of scdemo08 and placed them in scdemo06 and scdemo07. We improved performance by about 12 percent. (We might be able to add a few more drives).

We did some more investigation and discovered that we get different results doing pure read of all zero files vs random files. For example:

| Read a random file: | 955.402 MB/s | 7.46 Gb/s |
| Read a zeroed file: | 1.205 GB/s | 9.64 Gb/s |
| Write a /dev/zero file: | 1.372 GB/s | 10.976 Gb/s |

scdemo07->scdemo00
Avg: 5.536 Gb/s 

FDTReaderSession ( 2df13142-1621-496a-96be-4ded64eb9645 ) final stats:
 Started: Tue Nov 08 17:10:41 PST 2011
 Ended:   Tue Nov 08 17:23:37 PST 2011
 Transfer period:   12m 56s
 TotalBytes: 536870913830
 TotalNetworkBytes: 536870913830
 Exit Status: OK


scdemo06->scdemo09
Avg: 6.172 Gb/s 100.00% ( 00s )


FDTReaderSession ( 7abaf554-401a-4ba3-85d0-16fe0eae35ab ) final stats:
 Started: Tue Nov 08 17:02:29 PST 2011
 Ended:   Tue Nov 08 17:14:05 PST 2011
 Transfer period:   11m 35s
 TotalBytes: 536870913830
 TotalNetworkBytes: 536870913830
 Exit Status: OK

Writing to disk with the /dev/zero device:

[igable@scdemo00 ~]$ java -cp ~/fdt/fdt.jar lia.util.net.common.DDCopy if=/dev/zero of=/ssd/10Goutputfile5 bs=10M count=10240
[Tue Nov 08 19:09:03 PST 2011] Current Speed = 1.416 GB/s Avg Speed: 1.416 GB/s Total Transfer: 2.832 GB
[Tue Nov 08 19:09:05 PST 2011] Current Speed = 1.328 GB/s Avg Speed: 1.371 GB/s Total Transfer: 5.605 GB
[Tue Nov 08 19:09:07 PST 2011] Current Speed = 1.313 GB/s Avg Speed: 1.352 GB/s Total Transfer: 8.232 GB
[Tue Nov 08 19:09:09 PST 2011] Current Speed = 1.309 GB/s Avg Speed: 1.341 GB/s Total Transfer: 10.85 GB
[Tue Nov 08 19:09:11 PST 2011] Current Speed = 1.284 GB/s Avg Speed: 1.33 GB/s Total Transfer: 13.418 GB
[Tue Nov 08 19:09:13 PST 2011] Current Speed = 1.401 GB/s Avg Speed: 1.342 GB/s Total Transfer: 16.221 GB
[Tue Nov 08 19:09:15 PST 2011] Current Speed = 1.401 GB/s Avg Speed: 1.35 GB/s Total Transfer: 19.023 GB
[Tue Nov 08 19:09:17 PST 2011] Current Speed = 1.391 GB/s Avg Speed: 1.355 GB/s Total Transfer: 21.816 GB
[Tue Nov 08 19:09:19 PST 2011] Current Speed = 1.392 GB/s Avg Speed: 1.359 GB/s Total Transfer: 24.6 GB
[Tue Nov 08 19:09:21 PST 2011] Current Speed = 1.396 GB/s Avg Speed: 1.363 GB/s Total Transfer: 27.393 GB
[Tue Nov 08 19:09:23 PST 2011] Current Speed = 1.396 GB/s Avg Speed: 1.366 GB/s Total Transfer: 30.186 GB
[Tue Nov 08 19:09:25 PST 2011] Current Speed = 1.387 GB/s Avg Speed: 1.368 GB/s Total Transfer: 32.959 GB
[Tue Nov 08 19:09:27 PST 2011] Current Speed = 1.391 GB/s Avg Speed: 1.369 GB/s Total Transfer: 35.742 GB
[Tue Nov 08 19:09:29 PST 2011] Current Speed = 1.382 GB/s Avg Speed: 1.37 GB/s Total Transfer: 38.506 GB
[Tue Nov 08 19:09:31 PST 2011] Current Speed = 1.386 GB/s Avg Speed: 1.371 GB/s Total Transfer: 41.279 GB
[Tue Nov 08 19:09:33 PST 2011] Current Speed = 1.377 GB/s Avg Speed: 1.372 GB/s Total Transfer: 44.033 GB
[Tue Nov 08 19:09:35 PST 2011] Current Speed = 1.381 GB/s Avg Speed: 1.372 GB/s Total Transfer: 46.797 GB
[Tue Nov 08 19:09:37 PST 2011] Current Speed = 1.382 GB/s Avg Speed: 1.373 GB/s Total Transfer: 49.561 GB
[Tue Nov 08 19:09:39 PST 2011] Current Speed = 1.357 GB/s Avg Speed: 1.372 GB/s Total Transfer: 52.275 GB
^C

 Total Transfer: 52.812 GBytes ( 56706990080 bytes )
 Time: 38 seconds
 Avg Speed: 1.372 GB/s

Read a random file:

[igable@scdemo00 rbatch0]$ java -cp ~/fdt/fdt.jar lia.util.net.common.DDCopy if=/ssd/10Grandom03.dat of=/dev/null bs=10M count=10240
Got exception: 
java.io.FileNotFoundException: /ssd/10Grandom03.dat (No such file or directory)
   at java.io.RandomAccessFile.open(Native Method)
   at java.io.RandomAccessFile.<init>(Unknown Source)
   at java.io.RandomAccessFile.<init>(Unknown Source)
   at lia.util.net.common.DDCopy.main(DDCopy.java:371)
[igable@scdemo00 rbatch0]$ java -cp ~/fdt/fdt.jar lia.util.net.common.DDCopy if=/ssd/rbatch0/10Grandom03.dat of=/dev/null bs=10M count=10240
[Tue Nov 08 19:16:10 PST 2011] Current Speed = 760 MB/s Avg Speed: 760 MB/s Total Transfer: 1.484 GB
[Tue Nov 08 19:16:12 PST 2011] Current Speed = 1,008.646 MB/s Avg Speed: 886.82 MB/s Total Transfer: 3.535 GB
[Tue Nov 08 19:16:14 PST 2011] Current Speed = 989.505 MB/s Avg Speed: 920.598 MB/s Total Transfer: 5.469 GB
[Tue Nov 08 19:16:16 PST 2011] Current Speed = 990 MB/s Avg Speed: 937.771 MB/s Total Transfer: 7.402 GB
[Tue Nov 08 19:16:18 PST 2011] Current Speed = 1,000 MB/s Avg Speed: 950.114 MB/s Total Transfer: 9.355 GB


 Total Transfer: 10 GBytes ( 10737418240 bytes )
 Time: 10 seconds
 Avg Speed: 955.402 MB/s

Now read a file made with /dev/zero:

[igable@scdemo00 rbatch0]$ java -cp ~/fdt/fdt.jar lia.util.net.common.DDCopy if=/ssd/rbatch0/10Gfile01.dat of=/dev/null 
[Tue Nov 08 19:40:14 PST 2011] Current Speed = 1.163 GB/s Avg Speed: 1.164 GB/s Total Transfer: 2.328 GB
[Tue Nov 08 19:40:16 PST 2011] Current Speed = 1.194 GB/s Avg Speed: 1.179 GB/s Total Transfer: 4.753 GB
[Tue Nov 08 19:40:18 PST 2011] Current Speed = 1.229 GB/s Avg Speed: 1.196 GB/s Total Transfer: 7.211 GB
[Tue Nov 08 19:40:20 PST 2011] Current Speed = 1.231 GB/s Avg Speed: 1.204 GB/s Total Transfer: 9.674 GB


 Total Transfer: 10 GBytes ( 10737418240 bytes )
 Time: 8 seconds
 Avg Speed: 1.205 GB/s
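Given the compressibility effect shown above, here is a sketch of how a batch of incompressible test files could be generated with plain dd, assuming the /ssd/rbatch0 directory from the log (file names are illustrative):

# Create ten 10 GB files of incompressible data from /dev/urandom.
# /dev/urandom is slow, so this is a one-time cost paid before testing.
for i in $(seq -w 0 9); do
    dd if=/dev/urandom of=/ssd/rbatch0/10Grandom0${i}.dat bs=10M count=1024
done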

 

November 7, 2011

All nodes are now configured with the Write Through on the SSD logical disk. SNMP polling to the Brocade now done in 1 min intervals.

Revision 7 2011-11-08 - igable

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"
Added:
>
>

November 8, 2011

Mark managed to dump some packets down our 100G link this morning using a test set. We didn't observe any packet loss. He was sending at 19 Gbit/s.

100G-2011-11-08.png

 

November 7, 2011

All nodes are now configured with the Write Through on the SSD logical disk. SNMP polling to the Brocade now done in 1 min intervals.

Line: 149 to 156
 
META FILEATTACHMENT attachment="cacti_notes_good_rate_2011-11-06.png" attr="" comment="" date="1320640987" name="cacti_notes_good_rate_2011-11-06.png" path="cacti_notes_good_rate_2011-11-06.png" size="69346" user="igable" version="2"
META FILEATTACHMENT attachment="cacti_notes_5x5_2011-011-07.png" attr="" comment="Shows some asymetric throughput from one node" date="1320713240" name="cacti_notes_5x5_2011-011-07.png" path="cacti_notes_5x5_2011-011-07.png" size="126763" user="igable" version="1"
META FILEATTACHMENT attachment="cacti_notes__noop_2011-11-07.png" attr="" comment="" date="1320725234" name="cacti_notes__noop_2011-11-07.png" path="cacti_notes__noop_2011-11-07.png" size="88878" user="igable" version="1"
Added:
>
>
META FILEATTACHMENT attachment="100G-2011-11-08.png" attr="" comment="100 G Now 8 morning" date="1320775590" name="100G-2011-11-08.png" path="100G-2011-11-08.png" size="32325" user="igable" version="1"

Revision 6 2011-11-08 - igable

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"

November 7, 2011

Changed:
<
<
All nodes are now configured with the Write Through on the SSD logical disk.
>
>
All nodes are now configured with the Write Through on the SSD logical disk. SNMP polling to the Brocade now done in 1 min intervals.

cacti_notes_5x5_2011-011-07.png

Figure 3: We are now seeing above 9 Gbps disk to disk on all boxes. The write on scdemo03 is the slowest in the cluster by about 1 Gb/s; it's not immediately obvious what the problem is.

  Completed a 5-to-5 disk test:
Deleted:
<
<
| Transfer | Rate | Reverse Rate |
| scdemo00->scdemo05 | Avg: 9.174 Gb/s | Avg: 9.435 Gb/s |
| scdemo01->scdemo06 | Avg: 9.172 Gb/s | Avg: 9.175 Gb/s |
| scdemo02->scdemo07 | Avg: 9.121 Gb/s | Avg: 9.173 Gb/s |
| scdemo03->scdemo08 | Avg: 9.225 Gb/s | Avg: 8.087 Gb/s |
| scdemo04->scdemo09 | Avg: 9.382 Gb/s | Avg: 9.170 Gb/s |
 
Added:
>
>
| Transfer | CFQ Scheduler Rate | CFQ Reverse Rate | NOOP Scheduler Rate | NOOP Reverse Rate |
| scdemo00->scdemo05 | Avg: 9.174 Gb/s | Avg: 9.435 Gb/s | Avg: 9.544 Gb/s | Avg: 9.489 Gb/s |
| scdemo01->scdemo06 | Avg: 9.172 Gb/s | Avg: 9.175 Gb/s | Avg: 9.490 Gb/s | Avg: 9.492 Gb/s |
| scdemo02->scdemo07 | Avg: 9.121 Gb/s | Avg: 9.173 Gb/s | Avg: 9.490 Gb/s | Avg: 9.549 Gb/s |
| scdemo03->scdemo08 | Avg: 9.225 Gb/s | Avg: 8.087 Gb/s | Avg: 9.382 Gb/s | Avg: 8.332 Gb/s |
| scdemo04->scdemo09 | Avg: 9.382 Gb/s | Avg: 9.170 Gb/s | Avg: 9.545 Gb/s | Avg: 9.545 Gb/s |

Table 1: Switching to the noop scheduler has added about another 5 percent improvement in performance.

Switching to noop:

echo noop > /sys/block/sdb/queue/scheduler
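A sketch for applying and verifying the setting across the whole cluster, assuming the /ssd RAID device is /dev/sdb on every node as it is here:

# Set the noop elevator on each scdemo node and read it back;
# the bracketed entry in the output is the active scheduler.
for host in scdemo0{0..9}; do
    ssh root@${host} 'echo noop > /sys/block/sdb/queue/scheduler; cat /sys/block/sdb/queue/scheduler'
done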

cacti_notes__noop_2011-11-07.png

 
Added:
>
>
Figure 4: Shows the improvement when moving from the CFQ to the noop kernel disk scheduler.
 

November 6, 2011

Line: 124 to 142
  Figure 2: After changing the RAID configuration to be write through and using large 10G files created with 'dd' we see a much improved disk to disk throughput (as shown in the two FDT outputs immediately above). Strangely we see that one direction is nearly 0.8 Gbps faster than the other. I don't understand the reason for this yet.
Changed:
<
<
  • Shows some asymmetric throughput from one node:
    cacti_notes_5x5_2011-011-07.png
>
>
 
META FILEATTACHMENT attachment="cacti_notes_2011-011-06.png" attr="" comment="" date="1320622186" name="cacti_notes_2011-011-06.png" path="cacti_notes_2011-011-06.png" size="80579" user="igable" version="1"
META FILEATTACHMENT attachment="cacti_notes_good_rate_2011-11-06.png" attr="" comment="" date="1320640987" name="cacti_notes_good_rate_2011-11-06.png" path="cacti_notes_good_rate_2011-11-06.png" size="69346" user="igable" version="2"
META FILEATTACHMENT attachment="cacti_notes_5x5_2011-011-07.png" attr="" comment="Shows some asymetric throughput from one node" date="1320713240" name="cacti_notes_5x5_2011-011-07.png" path="cacti_notes_5x5_2011-011-07.png" size="126763" user="igable" version="1"
Added:
>
>
META FILEATTACHMENT attachment="cacti_notes__noop_2011-11-07.png" attr="" comment="" date="1320725234" name="cacti_notes__noop_2011-11-07.png" path="cacti_notes__noop_2011-11-07.png" size="88878" user="igable" version="1"

Revision 5 2011-11-08 - igable

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"

November 7, 2011

Line: 6 to 6
  Completed a 5-to-5 disk test:
Changed:
<
<
| Transfer | Rate |
| scdemo00->scdemo05 | Avg: 9.174 Gb/s |
| scdemo01->scdemo06 | Avg: 9.172 Gb/s |
| scdemo02->scdemo07 | Avg: 9.121 Gb/s |
| scdemo03->scdemo08 | Avg: 9.225 Gb/s |
| scdemo04->scdemo09 | Avg: 9.382 Gb/s |
>
>
| Transfer | Rate | Reverse Rate |
| scdemo03->scdemo08 | Avg: 9.225 Gb/s | Avg: 8.087 Gb/s |
| scdemo04->scdemo09 | Avg: 9.382 Gb/s | Avg: 9.170 Gb/s |
| scdemo02->scdemo07 | Avg: 9.121 Gb/s | Avg: 9.173 Gb/s |
| scdemo01->scdemo06 | Avg: 9.172 Gb/s | Avg: 9.175 Gb/s |
| scdemo00->scdemo05 | Avg: 9.174 Gb/s | Avg: 9.435 Gb/s |
 

November 6, 2011

Line: 122 to 124
  Figure 2: After changing the RAID configuration to be write through and using large 10G files created with 'dd' we see a much improved disk to disk throughput (as shown in the two FDT outputs immediately above). Strangely we see that one direction is nearly 0.8 Gbps faster than the other. I don't understand the reason for this yet.
Added:
>
>
  • Shows some asymmetric throughput from one node:
    cacti_notes_5x5_2011-011-07.png
 
META FILEATTACHMENT attachment="cacti_notes_2011-011-06.png" attr="" comment="" date="1320622186" name="cacti_notes_2011-011-06.png" path="cacti_notes_2011-011-06.png" size="80579" user="igable" version="1"
META FILEATTACHMENT attachment="cacti_notes_good_rate_2011-11-06.png" attr="" comment="" date="1320640987" name="cacti_notes_good_rate_2011-11-06.png" path="cacti_notes_good_rate_2011-11-06.png" size="69346" user="igable" version="2"
Added:
>
>
META FILEATTACHMENT attachment="cacti_notes_5x5_2011-011-07.png" attr="" comment="Shows some asymetric throughput from one node" date="1320713240" name="cacti_notes_5x5_2011-011-07.png" path="cacti_notes_5x5_2011-011-07.png" size="126763" user="igable" version="1"

Revision 4 2011-11-07 - igable

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"
Added:
>
>

November 7, 2011

All nodes are now configured with the Write Through on the SSD logical disk.

Completed a 5-to-5 disk test:

| Transfer | Rate |
| scdemo00->scdemo05 | Avg: 9.174 Gb/s |
| scdemo01->scdemo06 | Avg: 9.172 Gb/s |
| scdemo02->scdemo07 | Avg: 9.121 Gb/s |
| scdemo03->scdemo08 | Avg: 9.225 Gb/s |
| scdemo04->scdemo09 | Avg: 9.382 Gb/s |
 

November 6, 2011

cacti_notes_2011-011-06.png

Revision 3 2011-11-07 - igable

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"

November 6, 2011

cacti_notes_2011-011-06.png

Changed:
<
<
Figure 1: This shows preliminary results from testing from Sunday Nov 6. The summary is that simultaneous iperf is good from all hosts (9.9+ Gbps). Reading from disk and writing to memory is reasonable, with performance of 7.5-8.5 Gbit/s, but I think it could do with some improvement. Disk to disk performance is only 5.0 Gbps and not really adequate for the test. We are using 2000 ATLAS files of around 500 MB on each node for the transfers. We had better performance with 10G files written with 'dd'. Graph pulled from cacti for the Brocade.
>
>
Figure 1: This shows preliminary results from testing from Sunday morning Nov 6. The summary is that simultaneous iperf is good from all hosts (9.9+ Gbps). Reading from disk and writing to memory is reasonable, with performance of 7.5-8.5 Gbit/s, but I think it could do with some improvement. Disk to disk performance is only 5.0 Gbps and not really adequate for the test. We are using 2000 ATLAS files of around 500 MB on each node for the transfers. We had better performance with 10G files written with 'dd'. Graph pulled from cacti for the Brocade.
 

Now attempting to improve disk performance by removing LVM and changing the RAID controller to write through. Also note that scdemo00 and scdemo01 had their RAID stripe set at 64 kB.

Line: 22 to 22
  Dramatic improvement with these new disk settings, but still using large files.
Added:
>
>
[igable@scdemo09 ssd]$ fdtclient -c 10.200.0.50 10Gfile01.dat * -d /ssd/batch0


FDT [ 0.9.23-201107290935 ] STARTED ... 


Nov 06, 2011 5:04:15 PM lia.util.net.common.Config <init>
INFO: Using lia.util.net.copy.PosixFSFileChannelProviderFactory as FileChannelProviderFactory
Nov 06, 2011 5:04:15 PM lia.util.net.common.Config <init>
INFO: FDT started in client mode
<snip>

INFO: Requested window size -1. Using window size: 43690
06/11 17:04:25   Net Out: 9.929 Gb/s   Avg: 9.929 Gb/s
06/11 17:04:30   Net Out: 9.408 Gb/s   Avg: 9.668 Gb/s


<snip>

06/11 17:07:45   Net Out: 9.892 Gb/s   Avg: 9.597 Gb/s 93.30% ( 15s )
06/11 17:07:50   Net Out: 9.923 Gb/s   Avg: 9.605 Gb/s 95.61% ( 09s )
06/11 17:07:55   Net Out: 8.892 Gb/s   Avg: 9.588 Gb/s 97.68% ( 05s )
06/11 17:08:00   Net Out: 9.891 Gb/s   Avg: 9.595 Gb/s 99.98% ( 00s )
Nov 06, 2011 5:08:03 PM lia.util.net.copy.FDTReaderSession handleEndFDTSession
INFO: [ FDTReaderSession ] Remote FDTWriterSession for session [ f72d49d9-db0e-4b27-9264-0247e1bc6864 ] finished OK!
06/11 17:08:05   Net Out: 90.602 Mb/s   Avg: 9.384 Gb/s 100.00% ( 00s )


FDTReaderSession ( f72d49d9-db0e-4b27-9264-0247e1bc6864 ) final stats:
 Started: Sun Nov 06 17:04:15 PST 2011
 Ended:   Sun Nov 06 17:08:06 PST 2011
 Transfer period:   03m 50s
 TotalBytes: 268435456000
 TotalNetworkBytes: 268435456000
 Exit Status: OK

Nov 06, 2011 5:08:06 PM lia.util.net.copy.FDTReaderSession doPostProcessing
INFO: [ FDTReaderSession ] Post Processing started
Nov 06, 2011 5:08:06 PM lia.util.net.copy.FDTReaderSession doPostProcessing
INFO: [ FDTReaderSession ] No post processing filters defined/processed.
 [ Sun Nov 06 17:08:07 PST 2011 ] - GracefulStopper hook started ... Waiting for the cleanup to finish
 [ Sun Nov 06 17:08:07 PST 2011 ]  - GracefulStopper hook finished!

 [ Sun Nov 06 17:08:07 PST 2011 ]  FDT Session finished OK.
 
Added:
>
>
Now do scdemo00->scdemo09 with 1TB

06/11 17:42:46  Net In: 6.342 Gb/s      Avg: 8.583 Gb/s 100.00% ( 00s )
Nov 06, 2011 5:42:49 PM lia.util.net.copy.transport.ControlChannel run
INFO:  ControlThread for ( 35c60688-40f0-4480-8025-bde1fcee9b25 ) /10.200.0.50:55421 FINISHED


FDTWriterSession ( 35c60688-40f0-4480-8025-bde1fcee9b25 ) final stats:
 Started: Sun Nov 06 17:26:44 PST 2011
 Ended:   Sun Nov 06 17:42:50 PST 2011
 Transfer period:   16m 06s
 TotalBytes: 1030792151040
 TotalNetworkBytes: 1030792151040
 Exit Status: OK
 
Added:
>
>
scdemo09->scdemo00

INFO: [ FDTReaderSession ] Remote FDTWriterSession for session [ 94f47223-967c-4275-a737-a71929e5dddb ] finished OK!
06/11 19:36:35  Net Out: 5.450 Gb/s     Avg: 9.383 Gb/s 100.00% ( 00s )


FDTReaderSession ( 94f47223-967c-4275-a737-a71929e5dddb ) final stats:
 Started: Sun Nov 06 19:21:55 PST 2011
 Ended:   Sun Nov 06 19:36:38 PST 2011
 Transfer period:   14m 42s
 TotalBytes: 1030792151040
 TotalNetworkBytes: 1030792151040
 Exit Status: OK
 
Changed:
<
<
-- IanGable - 2011-11-06
  • cacti_notes_2011-011-06.png:
>
>
cacti_notes_good_rate_2011-11-06.png
 
Added:
>
>
Figure 2: After changing the raid configuration to be write through and using large 10G files created with 'dd' we see a much improved disk to disk throughput (as shown in the two FDT outputs immediately above). Strangely we see that one direction is nearly 0.8 Gbps faster then the other. I don't understand the reason for this yet.
 
META FILEATTACHMENT attachment="cacti_notes_2011-011-06.png" attr="" comment="" date="1320622186" name="cacti_notes_2011-011-06.png" path="cacti_notes_2011-011-06.png" size="80579" user="igable" version="1"
Added:
>
>
META FILEATTACHMENT attachment="cacti_notes_good_rate_2011-11-06.png" attr="" comment="" date="1320640987" name="cacti_notes_good_rate_2011-11-06.png" path="cacti_notes_good_rate_2011-11-06.png" size="69346" user="igable" version="2"

Revision 2 2011-11-07 - igable

Line: 1 to 1
 
META TOPICPARENT name="HundredGigabit"

November 6, 2011

Line: 7 to 7
 Figure 1: This shows preliminary results from testing from Sunday Nov 6. The summary is that simultaneous iperf is good from all hosts (9.9+ Gbps). Reading from disk and writing to memory is reasonable, with performance of 7.5-8.5 Gbit/s, but I think it could do with some improvement. Disk to disk performance is only 5.0 Gbps and not really adequate for the test. We are using 2000 ATLAS files of around 500 MB on each node for the transfers. We had better performance with 10G files written with 'dd'. Graph pulled from cacti for the Brocade.
Added:
>
>
Now attempting to improve disk performance by removing LVM and changing the RAID controller to write through. Also note that scdemo00 and scdemo01 had their RAID stripe set at 64 kB.

 Delete a logical drive:
 [root@scdemo00 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdDel -L1 -a0
 
 Get the enclosure device ID:
 [root@scdemo00 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -EncInfo -aALL
 
 Create a logical drive (RAID0; WT = write through, ADRA = adaptive read ahead, Direct = direct I/O, strpsz1024 = 1 MB stripe):
 [root@scdemo00 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r0 [32:2,32:3,32:4,32:5,32:6] WT ADRA Direct -strpsz1024 -a0
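To confirm the new cache policy and stripe size took effect, the logical drive properties can be read back (a standard MegaCli query, suggested here rather than taken from the log):

 Verify the logical drive settings:
 [root@scdemo00 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aAll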

Dramatic improvement with these new disk settings, but still using large files.

 

-- IanGable - 2011-11-06

Revision 1 2011-11-06 - igable

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="HundredGigabit"

November 6, 2011

cacti_notes_2011-011-06.png

Figure 1: This shows preliminary results from testing from Sunday Nov 6. The summary is that simultaneous iperf is good from all hosts (9.9+ Gbps). Reading from disk and writing to memory is reasonable, with performance of 7.5-8.5 Gbit/s, but I think it could do with some improvement. Disk to disk performance is only 5.0 Gbps and not really adequate for the test. We are using 2000 ATLAS files of around 500 MB on each node for the transfers. We had better performance with 10G files written with 'dd'. Graph pulled from cacti for the Brocade.

-- IanGable - 2011-11-06

  • cacti_notes_2011-011-06.png:

META FILEATTACHMENT attachment="cacti_notes_2011-011-06.png" attr="" comment="" date="1320622186" name="cacti_notes_2011-011-06.png" path="cacti_notes_2011-011-06.png" size="80579" user="igable" version="1"
 