OMPM2001 Result: SGI SGI Altix 3000 (1500MHz, Itanium 2)

Benchmark	Reference Time	Base Runtime	Base Ratio	Peak Runtime	Peak Ratio
310.wupwise_m	6000	148	40558	148	40558
312.swim_m	6000	84.5	71024	84.5	71024
314.mgrid_m	7300	146	49931	146	49931
316.applu_m	4000	101	39780	101	39780
318.galgel_m	5100	441	11559	417	12242
320.equake_m	2600	119	21796	89.8	28947
324.apsi_m	3400	142	23938	124	27393
326.gafort_m	8700	472	18446	391	22223
328.fma3d_m	4600	233	19769	190	24178
330.art_m	6400	87.4	73243	87.4	73243
332.ammp_m	7000	559	12520	559	12520
SPECompMbase2001	28853
SPECompMpeak2001	31210
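
The relationship between the per-benchmark ratios and the two composite figures can be checked with a short script. The sketch below is illustrative and not part of the SPEC tools: it assumes the composites are geometric means of the eleven ratios, and that each ratio is roughly 1000 x reference time / runtime. The second assumption is inferred from the numbers in the table, not stated in this report. The names RESULTS, geometric_mean, and approx_ratio are hypothetical. Because the table prints rounded runtimes and ratios, the results only agree approximately with the published values.

```python
import math

# Rows copied from the result table above:
# name -> (reference time, base runtime, base ratio, peak runtime, peak ratio)
RESULTS = {
    "310.wupwise_m": (6000, 148.0, 40558, 148.0, 40558),
    "312.swim_m": (6000, 84.5, 71024, 84.5, 71024),
    "314.mgrid_m": (7300, 146.0, 49931, 146.0, 49931),
    "316.applu_m": (4000, 101.0, 39780, 101.0, 39780),
    "318.galgel_m": (5100, 441.0, 11559, 417.0, 12242),
    "320.equake_m": (2600, 119.0, 21796, 89.8, 28947),
    "324.apsi_m": (3400, 142.0, 23938, 124.0, 27393),
    "326.gafort_m": (8700, 472.0, 18446, 391.0, 22223),
    "328.fma3d_m": (4600, 233.0, 19769, 190.0, 24178),
    "330.art_m": (6400, 87.4, 73243, 87.4, 73243),
    "332.ammp_m": (7000, 559.0, 12520, 559.0, 12520),
}


def geometric_mean(values):
    """Geometric mean computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def approx_ratio(reference_time, runtime):
    """Assumed scaling (inferred from the table): 1000 * reference / runtime."""
    return 1000.0 * reference_time / runtime


base_composite = geometric_mean(row[2] for row in RESULTS.values())
peak_composite = geometric_mean(row[4] for row in RESULTS.values())

if __name__ == "__main__":
    print(f"base composite from table ratios: {base_composite:.0f}")
    print(f"peak composite from table ratios: {peak_composite:.0f}")
    for name, (ref, base_rt, base_ratio, _, _) in RESULTS.items():
        print(f"{name}: published {base_ratio}, estimated {approx_ratio(ref, base_rt):.0f}")
```

Comparing the printed composites with SPECompMbase2001 and SPECompMpeak2001 gives a quick consistency check on a transcribed table; small differences are expected from rounding.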

Hardware

Hardware Vendor: SGI
Model Name: SGI Altix 3000 (1500MHz, Itanium 2)
CPU: Intel Itanium 2
CPU MHz: 1500
FPU: Integrated
CPU(s) enabled: 32 cores, 32 chips, 1 core/chip
CPU(s) orderable: 4-256
Primary Cache: 16KBI + 16KBD (on chip) per core
Secondary Cache: 256KB (on chip) per core
L3 Cache: 6.0MB (on chip) per core
Other Cache: N/A
Memory: 128 GB (32*512MB PC2700 DIMMS per 4 core module)
Disk Subsystem: 1 x 36 GB SCSI (Seagate Cheetah 15k rpm)
Other Hardware: None

Software

OpenMP Threads:
Parallel: OpenMP
Operating System: SGI ProPack(TM) 3
Compiler: Intel(R) Fortran Compiler for Linux 8.0 (Build 20040519)
          Intel(R) C++ Compiler for Linux 8.0 (Build 20040519)
File System: xfs
System State: Multi-user

Notes / Tuning Information

 Baseline optimization flags: 
   C programs:       -openmp -O3 -ipo -ansi -ansi_alias -auto_ilp32 (ONESTEP)
   Fortran programs: -openmp -O3 -ipo (ONESTEP)
   OpenMP runtime library libguide.a statically linked
 
 Portability Flags:
   318.galgel_m: -FI -132
 
 Extra Flags:
   330.art_m: -DINTS_PER_CACHELINE=32 -DDBLS_PER_CACHELINE=16

 Baseline user environment:
   OMP_NUM_THREADS 32
   limit stacksize 64000
   KMP_STACKSIZE 31M
   KMP_LIBRARY TURNAROUND 
   OMP_DYNAMIC FALSE
   KMP_SCHEDULE static,balanced


 Peak optimization flags:
    310.wupwise_m: basepeak=true
    312.swim_m: basepeak=true
    314.mgrid_m: basepeak=true
    316.applu_m: basepeak=true
    318.galgel_m: -openmp -O3 -ipo (ONESTEP)
                  OMP_NUM_THREADS=16
    320.equake_m: -openmp -O3 -ipo -ansi -ansi_alias -auto_ilp32 (ONESTEP)
    324.apsi_m: -openmp -O3 -ipo (ONESTEP)
    326.gafort_m: -openmp -O3 -ipo (ONESTEP)
    328.fma3d_m: -openmp -O3 -ipo (ONESTEP)
    330.art_m: basepeak=true
    332.ammp_m: basepeak=true

 Alternate sources:
 Add critical region around update of linked list in parallel loop.
 Approved src.alt available as ompm-purdue1-20040324.tar.gz
 Used for 330.art_m, base and peak.
 
 Peak sources:
 SPEC OMPL2001 source for 64bit systems modified for SPEC OMPM2001.
 Available as ompl src.alt in SPEC OMP v3.0
 Used for 320.equake_m, 324.apsi_m, 326.gafort_m, and 328.fma3d_m.
 
 For all benchmarks, threads were bound to cores using the following submit command:
   dplace -x2 -cNTM1,0 $command
 where NTM1 is the number of threads minus 1.
 This binds threads in order of creation, beginning with the master
 thread on core NTM1, the first slave thread on core NTM1-1, and so on.
 The -x2 flag instructs dplace to skip placement of the lightweight
 OpenMP monitor thread, which is created prior to the slave threads.
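
 The binding order described above can be illustrated with a small sketch.
 It is not part of the submission: the function names placement and
 submit_command are hypothetical, and the script only reproduces the
 mapping stated in the notes (thread i in creation order, monitor thread
 excluded, lands on core NTM1 - i) and substitutes NTM1 into the command
 template exactly as written in this report. It does not invoke dplace.

```python
def placement(num_threads):
    """Map worker thread index (creation order, master = 0) to a core.

    Follows the notes: master on core NTM1, the first slave thread on
    core NTM1-1, and so on. The OpenMP monitor thread is not placed
    and is therefore not part of this mapping.
    """
    ntm1 = num_threads - 1
    return {thread: ntm1 - thread for thread in range(num_threads)}


def submit_command(num_threads, command="$command"):
    """Fill NTM1 into the submit command template quoted in the notes."""
    ntm1 = num_threads - 1
    return f"dplace -x2 -c{ntm1},0 {command}"


if __name__ == "__main__":
    # Base runs use OMP_NUM_THREADS 32; peak 318.galgel_m uses 16.
    for threads in (32, 16):
        mapping = placement(threads)
        print(submit_command(threads))
        print(f"  master thread -> core {mapping[0]}")
        print(f"  last thread   -> core {mapping[threads - 1]}")
```

 With 32 threads the master lands on core 31 and the last thread on
 core 0; with 16 threads (peak 318.galgel_m) the range is core 15 to core 0.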

First published at SPEC.org on 23-Jun-2004

Generated on Wed Jun 23 11:07:13 2004 by SPEC OMP2001 HTML formatter v1.01