SPECjbb2000 User's Guide
1 Introduction
2 Installation and Setup of SPECjbb2000
3 Running SPECjbb2000
4 Customizing the properties files
5 Operational Validity
6 The Metric
7 Results Reports
8 Troubleshooting
9 Performance Tuning
10 Submitting results
1 Introduction
This document is a practical guide for setting up and running a SPEC Java Business Benchmark (SPECjbb2000) test. To submit SPECjbb2000 results the benchmarker must adhere to the rules contained in the "Run and Reporting Rules" contained in the kit. For an overview of the benchmark architecture, see the SPECjbb2000 whitepaper.
This document is targeted at people trying to run the SPECjbb2000 benchmark in order to accurately measure their Java system, comprising a JVM and the underlying operating system and hardware.
1.1 Background
SPECjbb2000 is a Java program emulating a 3-tier system with emphasis on the middle tier. Random input selection represents the first (user) tier. SPECjbb2000 fully implements the middle tier business logic. The third tier is represented by binary trees rather than a separate database.
The motivation behind SPECjbb2000 is that it is, and will continue to be, common to use Java as middleware between a database and customers. SPECjbb2000 is representative of a middle tier system, with simplifications to isolate it for benchmarking. This strategy saves the benchmarker the expense of having to invest in a fast database system in order to measure various JVM systems. The implication is that combining a JVM which proves to be fast on SPECjbb2000, with a database system which is proven to be fast on TPC-C, will provide a business with a fast and robust overall multi-tier environment. The decomposition into a separate middleware test makes testing accessible to more people.
SPECjbb2000 is inspired by the TPC-C benchmark and loosely follows the TPC-C specification for its schema, input generation, and transaction profile. SPECjbb2000 replaces database tables with Java classes and replaces data records with Java objects. SPECjbb2000 does no disk IO.
SPECjbb2000 runs in a single JVM in which threads represent terminals, and each thread independently generates random input before calling transaction-specific logic. There is no network IO in SPECjbb2000.
1.2 General Concepts
A warehouse is a unit of stored data. It contains roughly 25MB of data stored in many objects in many Btrees. A thread represents a terminal user within a warehouse. There is a one-to-one mapping between warehouses and threads, plus a few threads for SPECjbb2000 main and various JVM functions. As the number of warehouses increases during the full benchmark run, so does the number of threads.
A "point" represents a two-minute measurement at a given number of warehouses. A full benchmark run consists of a sequence of measurement points with an increasing number of warehouses (and thus an increasing number of threads).
2 Installation and Setup of SPECjbb2000
2.1 Installation of SPECjbb2000
2.1.1 Installing the benchmark
This section describes the steps necessary to set up SPECjbb2000. The setup instructions for most platforms are very similar to these generic setup instructions.
- Make sure Java is correctly installed on the test machine.
- Put the CD in the CDROM drive.
- Mount it (if necessary) and enter the top-level cdrom directory.
- Run InstallShield:
java setup
or double-click on setup.exe
- Follow the instructions from there.
Alternately, the installer can be run without the GUI, in console mode.
There should now be several jar files (jbb.jar, jbb_no_precompile.jar, check.jar, and reporter.jar), documentation in doc, and source and class files in src/spec/jbb and src/spec/reporter.
Do NOT recompile (javac). You are expected to use the bytecodes provided. Recompiling the benchmark will cause validation errors.
2.1.2 Trial run of SPECjbb2000
Go to the directory containing the benchmark, then try running it using either run.sh or run.bat, as appropriate to your operating system. These scripts are provided as examples, and may require minor modifications for your particular environment.
Alternately, set
CLASSPATH=./jbb.jar:./jbb_no_precompile.jar:./check.jar:./reporter.jar:$CLASSPATH
for Unix or
CLASSPATH=.\jbb.jar;.\jbb_no_precompile.jar;.\check.jar;.\reporter.jar;%CLASSPATH%
for Windows.
To run the benchmark up to 8 points, type:
java -ms256m -mx256m spec.jbb.JBBmain -propfile SPECjbb.props
The benchmarker will probably not want to keep these numbers, since the heap is fairly small and the properties files (SPECjbb.props and SPECjbb_config.props) don't yet reflect the system under test.
2.2 Setup
Having done a trial run of the benchmark, the benchmarker should now learn what the changeable parameters are, and where to find and change them in the properties files. Also, the documentation (and publishability) of the runs will be improved by editing the descriptive properties file described below.
2.2.1 Properties File Setup
SPECjbb2000 takes two properties files as input: a control properties file and a descriptive properties file. The control properties file is used to modify the operation of the benchmark, for example altering the length of the measurement interval. The descriptive properties file is used to document the system under test; this documentation is required for publishable results, and is reflected in the output files. The values of the descriptive properties do not affect the workload.
The default name for the control properties file is SPECjbb.props, but it may be overridden with a different name using the -propfile command line option (see section 3 for an example). The name for the descriptive properties file is specified in the control properties file, using the input.include_file property. The default name, as distributed in the sample control properties file included with the kit, is SPECjbb_config.props. See below for a brief description of how to modify properties in these files.
A sample control properties file and a sample descriptive properties file are distributed with the SPECjbb2000 kit, using the default names. Before modifying these files, first make a copy of the originals to enable easy recovery of the default benchmark configuration. Also, if running on several platforms, the benchmarker will probably want to take advantage of the naming features described in the previous paragraph (e.g., SPECjbb_config.ALPHA_400.props, SPECjbb_config.NT_500.props).
2.2.1.1 Properties Format
Each line of a properties file is either blank, a comment, or a property assignment. Comments begin with the # character. Each property assignment is of the form name=value. "name" is the property identifier; "name" and "property" are often used synonymously. Property names are specific to the benchmark and must not be changed. See below for a discussion of the control properties that the benchmarker is likely to want to change and for a list of control properties that must not be changed in order for the SPECjbb2000 result to be publishable. See section 4 for examples of the specification of property values.
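As an illustration of this format, the standard java.util.Properties class reads exactly this kind of file: lines beginning with # are comments, and each remaining line is split at the = into a name and a value. The class and method names in this sketch are illustrative, not part of the benchmark.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropsDemo {
    // Parse a properties-format string: '#' lines are comments,
    // everything else is a name=value assignment.
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for a StringReader
        }
        return p;
    }

    public static void main(String[] args) {
        String fragment =
            "# length of the measured interval, in seconds\n" +
            "input.measurement_seconds=120\n" +
            "input.forcegc=true\n";
        Properties p = parse(fragment);
        System.out.println(p.getProperty("input.measurement_seconds")); // 120
        System.out.println(p.getProperty("input.forcegc"));             // true
    }
}
```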
2.2.1.2 Control Properties
The control properties file allows modification of three distinct benchmark behaviors: length of run, warehouse sequence, and garbage collection behavior. The benchmarker may experiment with any of these behaviors, but for publishable results, there are restrictions on the modifications. See the following paragraphs for these restrictions. The control properties file also contains a property specifying the benchmark name, but this should not be changed.
Length of run is controlled by two properties: input.ramp_up_seconds and input.measurement_seconds. input.ramp_up_seconds specifies, in seconds, the length of the warmup period that precedes each measured interval. input.measurement_seconds specifies, in seconds, the length of the measured interval used to produce a point (throughput value at a particular number of warehouses). For a result to be publishable, input.ramp_up_seconds must be 30 and input.measurement_seconds must be 120.
Warehouse sequence may be controlled in either of two ways. The usual method for specifying warehouse sequence is the set of three properties, input.starting_number_warehouses, input.increment_number_warehouses, and input.ending_number_warehouses , which causes the sequence of warehouses to progress from input.starting_number_warehouses to input.ending_number_warehouses, incrementing by input.increment_number_warehouses. The alternative method of specifying warehouse sequence is input.sequence_of_number_of_warehouses, which allows specification of an arbitrary list of positive integers in increasing order. For a publishable result the warehouse sequence must begin at 1 and increment by 1. See Section 4.1 for a discussion of requirements on the total number of warehouses that must be run.
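The triple form is just shorthand for an explicit list; expanding it mechanically yields the same sequence the alternative property would specify. A minimal sketch (the class and method names are mine, not part of the benchmark):

```java
import java.util.ArrayList;
import java.util.List;

public class WarehouseSequence {
    // Expand input.starting_number_warehouses / input.increment_number_warehouses /
    // input.ending_number_warehouses into the equivalent explicit sequence.
    static List<Integer> expand(int start, int increment, int end) {
        List<Integer> seq = new ArrayList<>();
        for (int w = start; w <= end; w += increment) {
            seq.add(w);
        }
        return seq;
    }

    public static void main(String[] args) {
        // start=1, increment=1, end=8 matches the default SPECjbb.props, and is
        // the same as input.sequence_of_number_of_warehouses=1 2 3 4 5 6 7 8
        System.out.println(expand(1, 1, 8)); // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```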
There is limited control over garbage collection behavior. Specifically, the benchmarker may choose to force a garbage collection between measurement intervals. This is controlled by the Boolean property input.forcegc, where the value "true" means that there will be a forced GC between measurement intervals and "false" means that no GC will be forced. Note that Boolean values must be lower case. The choice of value for this property has no effect on publishability of the result.
3 Running SPECjbb2000
After completing setup as described in the previous section, the benchmark is ready to run.
3.2 Running the benchmark
Try running the benchmark using either
run.sh
or
run.bat
as appropriate to your operating system. These are provided as examples, and
may require minor modifications for your particular environment.
Alternately, set CLASSPATH:
CLASSPATH=./jbb.jar:./jbb_no_precompile.jar:./check.jar:./reporter.jar:$CLASSPATH
for Unix, or
CLASSPATH=.\jbb.jar;.\jbb_no_precompile.jar;.\check.jar;.\reporter.jar;%CLASSPATH%
for Windows,
then run the following line:
java -ms<min> -mx<max> spec.jbb.JBBmain -propfile SPECjbb.props
Most operating systems should be able to use a similar command line. Specifying more heap is allowed, and will probably help to get higher scores. Some Java 2 JVMs use -Xms/-Xmx instead of -ms/-mx to specify heap size.
The benchmark output appears in the results directory by default. The output types are raw, results, html, and ascii. The files are tagged with a sequential number, as
- SPECjbb.<num>.raw,
- SPECjbb.<num>.results,
- SPECjbb.<num>.html,
- and SPECjbb.<num>.asc
respectively. The JPG graph generation also produces SPECjbb.<num>.jpg, which SPECjbb.<num>.html refers to.
3.3 Running the Reporter
Usually one doesn't need to run the Reporter manually, as the benchmark automatically creates all the file formats. However, if needed, a raw file can be processed with the Reporter tool to create: html with an html graph, html with a jpg graph, or nicely-formatted text. The html with a jpg graph requires Java 2, and (under unix) DISPLAY to be set to a working X display. (This may require setting xhost + on the display.) The html with an html graph is provided for those without Java 2, or without a display device. When the benchmark calls the Reporter, it will default to that which is available in the environment.
Additionally, the Reporter can be used to compare two results.
The most common way of running the Reporter is
java spec.reporter.Reporter -e -r results/SPECjbb.<num>.raw -o results/SPECjbb.<num>.html
There are a number of other options:
Usage: java spec.reporter.Reporter [options]
Options are:
  -a         Plain ASCII text output
             default: generate HTML output with JPG graph
  -e         Do NOT echo raw results properties in HTML output
             default: raw results inserted as HTML comments
  -h         Create graph in HTML rather than JPG
             default: use JPG if have Java 2 and a DISPLAY
  -l Label   Label to infix into the JPG name: GraphImage.label.jpg
             default: a random number
  -o Output  Output file for generated HTML
             default: written to System.out
  -r Raw     A SPEC raw file, generated by a benchmark run.
             May be in a mail message with mail headers.
             default: read from System.in
  -c Second  Second raw file, to compare
             default: none
  -v         Verbose. List extra config.testx.* properties
             default: extra properties are not reported
So, comparing two results is done as follows:
java spec.reporter.Reporter -e -r results/SPECjbb.<num>.raw -c results/SPECjbb.<other-num>.raw -o results/compare.html
Results submitted to SPEC will appear on the SPEC website in the asc and html(-jpg) formats. The Reporter output has to be redirected or specified with -o filename. The Reporter will use only the last value when the same point (number of warehouses) is repeated multiple times.
4 Customizing the properties files
Section 2 introduced the properties files. These can be modified to control operation of the benchmark. There are two properties files: SPECjbb.props and SPECjbb_config.props. Their relationship is described in more detail in Section 2. Section 2 also describes the general format of the properties lines, and how to change them.
------------------------------SPECjbb.props--------------------------------
################################################################################
#                                                                              #
#        Control parameters for SPECjbb2000 benchmark                          #
#                                                                              #
################################################################################
#
# This file has 2 sections: changeable parameters and fixed parameters. The
# fixed parameters exist so that you may run tests any way you want, however
# in order to have a valid, reportable run of SPECjbb2000, you must reset
# them to their original values.
#
################################################################################
#                                                                              #
#        Changeable input parameters                                           #
#                                                                              #
################################################################################
# Warehouse sequence may be controlled in either of two ways. The more
# usual method for specifying warehouse sequence is the triple
# input.starting_number_warehouses, input.increment_number_warehouses,
# and input.ending_number_warehouses, which causes the sequence of
# warehouses to progress from input.starting_number_warehouses to
# input.ending_number_warehouses, incrementing by
# input.increment_number_warehouses.
# The alternative method of specifying warehouse sequence is
# input.sequence_of_number_of_warehouses, which allows specification of
# an arbitrary list of positive integers in increasing order.
# For a publishable result the warehouse sequence must begin at 1, increment by
# 1 and go to at least 8 warehouses
input.starting_number_warehouses=1
input.increment_number_warehouses=1
input.ending_number_warehouses=8
#input.sequence_of_number_of_warehouses=1 2 3 4 5 6 7 8
#
# 'forcegc' controls whether to garbage collect between each number of
# warehouses.
#
input.forcegc=true
#
# 'include_file' is the name for the descriptive properties file.
# On systems where the file separator is \, use \\ as the file separator here
#
# Examples:
# input.include_file=SPECjbb_config.props
# input.include_file=/path/to/SPECjbb_config.props
# input.include_file=c:\\path\\to\\SPECjbb_config.props
#
input.include_file=SPECjbb_config.props
#
# 'output_directory' is the directory to store output files
# On systems where the file separator is \, use \\ as the file separator here
#
# Examples:
# input.output_directory=results
# input.output_directory=/path/to/results
# input.output_directory=c:\\path\\to\\results
#
input.output_directory=results
################################################################################
#                                                                              #
#        Fixed input parameters                                                #
#                                                                              #
#        YOUR RESULTS WILL BE INVALID IF YOU CHANGE THESE PARAMETERS           #
#                                                                              #
################################################################################
# DON'T CHANGE THIS PARAMETER, OR ELSE !!!!
input.suite=SPECjbb
#
# If you need to collect stats or profiles, it may be useful to increase
# the 'measurement_seconds'. This will, however, invalidate your results
#
# Amount of time to run each point prior to the measurement window
input.ramp_up_seconds=30
# Time of measurement window
input.measurement_seconds=120
------------------------------end of SPECjbb.props--------------------------------

------------------------------SPECjbb_config.props--------------------------------
#
# SPECjbb2000 properties file
# This is a SAMPLE file which you can use to specify characteristics of
# a particular system under test, and to control benchmark
# operation. You can reuse this file repeatedly, and have a version for
# each system setup you use. You should edit the reporting fields appropriately.
# All of this can still be edited in the output properties file after
# you run the test, but putting the values in here can save you some
# typing for attributes which do not change from test to test.
#
################################################################################
#
# System Under Test hardware
#
################################################################################
# Company which sells the hardware
config.hw.vendor=Neptune Ocean King Systems
# Home page for company which sells the hardware
config.hw.vendor.url=http://www.neptune.com
# What type of system was used
config.hw.model=TurboBlaster 2
# What type of processor(s) the system had
config.hw.processor=ARM
# MegaHertz rating of the chip. Usually an integer
config.hw.MHz=300
# Number of CPUs in the system
config.hw.ncpu=1
# Amount of physical memory in the system, in Megabytes. DO NOT USE MB or
# GB, IT WILL CONFUSE THE REPORTER
config.hw.memory=4096
# Amount of level 1 cache for instruction and data on each CPU
config.hw.primaryCache=4KBI+4KBD
# Amount of level 2 cache, for instruction and data on each CPU
config.hw.secondaryCache=64KB(I+D) off chip
# Amount of level 3 cache (or above)
config.hw.otherCache=
# The file system the class files reside on
config.hw.fileSystem=UFS
# Size and type of disk on which the benchmark and OS reside
config.hw.disk=1 x 4GB SCSI (classes) 1 x 12GB SCSI (OS)
# Any other hardware you think is performance-relevant. That is, you would
# need this to reproduce the test
config.hw.other=AT&T Rotary Telephone
config.hw.available=Jan-1997
################################################################################
#
# System Under Test software
#
################################################################################
# The company that makes the JVM software
config.sw.vendor=Phobos Ltd
# Home page for the company that makes the JVM software
config.sw.vendor.url=http://www.phobos.uk.co
# Name of the JVM software product (including version)
config.sw.JVM=Phobic Java 1.2.2
# Date when the JVM software product is shipping and generally available
# to the public
config.sw.JVMavailable=Jan-1997
# How many megabytes used by the JVM heap. "Unlimited" or "dynamic" are
# allowable values for JVMs that adjust automatically
config.sw.JVMheapInitial=1024
config.sw.JVMheapMax=1024
# Command line to invoke the benchmark
# On systems where the file separator is \, use \\ as the file separator here
config.sw.command_line=java -ms256m -mx1024m spec.jbb.JBBmain -propfile Test1
# Name of precompiler used
config.sw.precompiler=Phobic Java Compiler
# Command line to invoke the precompiler
# On systems where the file separator is \, use \\ as the file separator here
config.sw.precompiler_command_line=phobic-jc spec/jbb/Jbbmain.java -exclude class.list
# Method or command used to exclude the methods not allowed to be optimized
# or precompiled (see Run Rules section 2.1.1)
config.sw.precompiler_class_excluder_method=11 classes listed in file class.list
# Operating system (including version)
config.sw.OS=Phobos DOS V33.333 patch-level 78
# Date when the OS version is shipping and generally available to the public
config.sw.OSavailable=May-2000
# State of the system, such as "single-user mode", or "minimal boot"
config.sw.systemState=normal
# Free text description of what sort of tuning one has to do to either
# the OS or the JVM to get these results. This is where kernel tunables
# belong. Use HTML list tags, if you want a list on the report page
config.sw.tuning=Operating system tunings<UL><LI>bufcache=1024</LI> <LI>maxthreads_per_user=65536</LI> </UL>
# Any additional software that you think is needed to reproduce the performance
# measured on this test
config.sw.other=Neptune JIT Accelerator 2.3b6
# Date when the other software is shipping and generally available to the public
config.sw.otherAvailable=Feb-98
################################################################################
#
# Tester information
#
################################################################################
# The company that ran and submitted the result
config.test.testedBy=Neptune Corp.
# The person who ran and submitted this result (name does not go on public pages)
config.testx.testedByName=Willie the Mailboy
# A web page where people within the aforementioned company might get more
# information
# On systems where the file separator is \, use \\ as the file separator here
config.test.internalReference=http://pluto.eng/specpubs/mar97/
# The company's SPEC license number
config.test.specLicense=50
# Physically, where the results were gathered
config.test.location=Santa Monica, CA
------------------------------end of SPECjbb_config.props--------------------------------
These parameters can be categorized as:
- Control parameters - Variables that describe the load to put on the server. Several of these are "constants" in the sense that a particular value is required for use in publishable runs. These are included in the control properties file, SPECjbb.props.
- Configuration description - Details about the system that are necessary for the final report, such as CPU types, memory sizes, etc. As distributed in the sample kit, these are in the included properties file, SPECjbb_config.props.
4.1 Control parameters
These are the parameters which specify the workload.
- starting_number_warehouses, increment_number_warehouses, ending_number_warehouses
- Parameters to specify a series of numbers of warehouses to run at once. For example, with the settings start=1, increment=1, and end=10 respectively, the benchmark will run first 1 warehouse, then 2 concurrent warehouses, then 3, and so on up to 10. Since the metric averages points from the peak to twice as many warehouses, if the curve peaks beyond 5 warehouses, the benchmark must be run beyond 10. Roughly, the peak number of warehouses is often near the number of CPUs.
Note that publication requires all the points from 1 to 2*N warehouses, where N is the number of warehouses at the peak throughput, and that the metric is computed from the points N to 2*N. In some cases, where the system under test is unable to successfully run all points up to 2*N, the runs may still be valid and publishable. See section 2.3 of the run and reporting rules.
- input.sequence_of_number_of_warehouses=7 10 12
- An alternative to the above. If using this, comment out one of starting_number_warehouses, increment_number_warehouses, or ending_number_warehouses. This allows skipping numbers of warehouses in a non-periodic fashion, or repeating a number of warehouses more than once. The list is assumed to be increasing; if it is not, the results may not appear as intended.
- input.forcegc=true
- Whether to garbage collect between each number of warehouses. Either way is legal for publication; JVMs differ on whether this feature helps their scores. Note: booleans must be specified in lowercase.
- input.include_file=SPECjbb_config.props
- The name of a file which contains other input parameters. The intended use is to allow segregation of "config" parameters from "input" parameters, easing changing one set without the other. This way, one can have several system configuration files for different systems, and simply change which one to include for a given run. Similarly, one system can easily have several sets of input parameters all referencing the same hardware/software descriptions. In fact, the parameters may be divided between this file and the specified propfile some other way if you want. Be careful not to lose any parameters. Duplicated parameters will take on the last value seen, with the include file following the propfile.
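The last-value-seen rule matches what java.util.Properties does when both files are loaded into the same object in propfile-then-include order. The sketch below illustrates that rule under this assumption; whether JBBmain merges the files in exactly this way internally is not specified here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class IncludeOrder {
    // Load the control properties first, then the include file into the same
    // Properties object; a duplicated name keeps the value loaded last.
    static Properties merge(String propfile, String includeFile) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(propfile));
            p.load(new StringReader(includeFile)); // overrides duplicates
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for a StringReader
        }
        return p;
    }

    public static void main(String[] args) {
        Properties p = merge("config.hw.ncpu=1\n", "config.hw.ncpu=4\n");
        System.out.println(p.getProperty("config.hw.ncpu")); // 4 -- the include file wins
    }
}
```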
The next two are "constants". Normally, you will not change these. They are provided for information only. Any changes to the benchmark constants will invalidate the run. SPEC does not endorse such changes, and results obtained with modified benchmark constants are not publishable.
The examples shown are the default and also the required values for valid runs.
- input.ramp_up_seconds=30
- How long the worker threads run before the measurement begins. This avoids boundary conditions. It may get records into memory, but is probably not as influential as ramp-up time on an actual database benchmark.
- input.measurement_seconds=120
- How long to measure each number of warehouses for.
4.2 Configuration description parameters
The Configuration description section should contain a full description of the testbed. It should have enough information to repeat the test results, including all necessary hardware, software and tuning parameters. All parameters with non-default values must be reported in the Notes section of the description.
The configuration description has the following categories.
- Server Hardware
- JVM Software
- Test Information
Each category contains variables for describing it. For example, Hardware contains variables for CPUs, caches, controllers, etc. If a variable doesn't apply, leave it blank. If no variable exists for a part you need to describe, add some text to the notes section. The notes sections can contain multiple lines.
The properties file included in the kit contains examples for filling out these fields.
Hardware:
- config.hw.vendor=SAMPLE vendor
- This is the company which sells the hardware.
- config.hw.vendor.url=http://www.neptune.com
- This is web page for the company which sells the hardware.
- config.hw.model=TurboBlaster 2
- What type of system was used.
- config.hw.processor=ARM
- What type of processor the system had.
- config.hw.MHz=300
- MegaHertz rating of the chip. Usually an integer, except for Intel.
- config.hw.ncpu=1
- The number of CPUs in the system.
- config.hw.memory=4096
- The amount of physical memory in the system under test, in megabytes. That's important to the reporter; saying "4G" or "2Gb" will confuse it, and your memory will be reported incorrectly.
- config.hw.primaryCache=4KBI+4KBD
- The amount of level 1 cache, for instruction and data, on each CPU.
- config.hw.secondaryCache=64KB(I+D)
- The amount of level 2 cache, for instruction and data, on each CPU. This example shows a combined cache.
- config.hw.otherCache=none
- Level 3 cache or above.
- config.hw.fileSystem=UFS
- The file system the classes reside on.
- config.hw.disk=2 x 4GB SCSI
- Size and type of disk on which the benchmark and the OS reside. This is to help with reproduction of results.
- config.hw.other=none
- Anything you think is performance-relevant hardware-wise that someone would need to be told about in order to reproduce the result.
- config.hw.available=Jan-2000
- The date when the hardware is shipping and generally available to the public.
Software:
- config.sw.vendor=Phobos Ltd
- This is the company which sells the JVM software.
- config.sw.vendor.url=http://www.phobos.uk.co
- This is the web page for the company which sells the JVM software.
- config.sw.JVM=Phobic Java
- This is the name of the JVM software product.
- config.sw.JVMavailable=Jan-97
- The date when the product is shipping and generally available to the public.
- config.sw.JVMheapInitial=1024
- How many megabytes of memory were used for the initial heap (-ms). "Unlimited" or "dynamic" are also allowed for JVMs which adjust automatically.
- config.sw.JVMheapMax=1024
- How many megabytes of memory were used for the maximum heap (-mx). "Unlimited" or "dynamic" are also allowed for JVMs which adjust automatically.
- config.sw.OS=Phobos
- The operating system on the system under test.
- config.sw.OSavailable=Jan-97
- The date when the operating system is shipping and generally available to the public.
- config.sw.tuning=none
- Free text description of what sort of tuning one has to do to either the OS or the JVM to get these results. This is where kernel tunables belong.
- config.sw.command_line="java -ms256m -mx1024m spec.jbb.JBBmain"
- The line used to invoke the benchmark, to help reproduce scores.
- config.sw.precompiler="Phobic Java Compiler"
- The tool used to precompile the benchmark, if any.
- config.sw.precompiler_command_line="phobic-jc spec/jbb/Jbbmain.java -exclude class.list"
- The line used to precompile the benchmark, if any.
- config.sw.precompiler_class_excluder_method=11 classes listed in file class.list
- Method or command used to exclude the methods not allowed to be optimized or precompiled (see Run Rules section 2.2.1)
- config.sw.other=Neptune JIT Accelerator 2.3b6
- Any additional software that you think is performance-relevant that someone would need to be told about in order to reproduce.
- config.sw.otherAvailable=Feb-98
- When it's shipping and generally available.
And, the test information:
- config.test.testedBy=Neptune Corp.
- The company that ran and submitted this result.
- config.testx.testedByName=Willie the Mailboy
- The person who ran and submitted this result.
- config.test.internalReference=http://pluto.eng/specpubs/mar97/
- A web page where people within the aforementioned company might get more information.
- config.test.specLicense=50
- The company's SPEC license number.
- config.test.location=Santa Monica, CA
- Physically, where the results were gathered.
5 Operational Validity
In order to be a publishable result, or directly comparable to existing published results, a run must pass several runtime validity checks:
- The conformance tests that the benchmark runs must be passed.
- The checksum on jbb.jar must indicate that it is the same bytecodes which SPEC shipped.
- Elapsed time of each measurement interval must fall within the range from 0.5% below to 10% above the specified measurement interval length (120 s).
- The number of transactions done by each separate thread must not differ by more than 30% between maximum thread and minimum thread, for all points up to and including the peak.
- The run must include all points to the minimum of 8 warehouses or twice as many warehouses as at the peak.
- The Java environment must pass the partial conformance testing done by the benchmark prior to running any points.
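Two of these checks can be sketched as simple predicates. This is illustrative only, not the benchmark's actual validation code; in particular, taking the 30% spread relative to the maximum thread's count is an assumption of this sketch.

```java
public class ValidityChecks {
    // Each measured interval must run from 0.5% below to 10% above the
    // nominal 120-second measurement window.
    static boolean elapsedOk(double elapsedSeconds) {
        return elapsedSeconds >= 120.0 * 0.995 && elapsedSeconds <= 120.0 * 1.10;
    }

    // Per-thread transaction counts must not differ by more than 30%
    // between the busiest and least busy thread. The 30% is taken
    // relative to the maximum here (an assumption).
    static boolean spreadOk(long[] perThreadTransactions) {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (long t : perThreadTransactions) {
            min = Math.min(min, t);
            max = Math.max(max, t);
        }
        return (max - min) <= 0.30 * max;
    }

    public static void main(String[] args) {
        System.out.println(elapsedOk(120.3));              // within the window
        System.out.println(elapsedOk(135.0));              // more than 10% over
        System.out.println(spreadOk(new long[]{100, 80})); // 20% spread
        System.out.println(spreadOk(new long[]{100, 60})); // 40% spread
    }
}
```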
6 The Metric
Assuming a run is valid (or invalid in a way in which the metric is still meaningful, although not publishable), the metric is the numerical representation of the performance of the system.
The SPECjbb2000 metric is calculated as follows:
- All points (numbers of warehouses) are run, from 1 up to at least twice the number of warehouses expected to produce the peak throughput. At a minimum all points from 1 to 8 must be run.
- The peak is observed to be at N warehouses.
- The throughputs for all the points from N warehouses to 2*N inclusive warehouses are averaged. This average is the SPECjbb2000 metric. As explained in section 2.3 of the run and reporting rules, results from systems that are unable to run all points up to 2*N warehouses are still considered valid. For all the missing points in the range N+1 to 2*N, the throughput is considered to be 0 ops/second in the metric computation.
The reporting tool contained within SPECjbb2000 produces a graph of the throughput at all the measured points with warehouses on the horizontal axis and throughputs on the vertical axis. All points from 1 to the minimum of 8 or 2*N are required to be run and reported. Missing points in the range N+1 to 2*N will be reported to have a throughput of 0 ops/second. The points being averaged for the metric will be marked on the report.
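The calculation above can be sketched in a few lines: find the peak warehouse count N, then average the throughputs from N to 2*N inclusive, counting any missing point as 0 ops/second. The class and method names are illustrative, not part of the benchmark.

```java
import java.util.Map;
import java.util.TreeMap;

public class JbbMetric {
    // Compute the SPECjbb2000 metric from per-warehouse throughputs.
    // Keys are warehouse counts; values are ops/second for that point.
    static double metric(Map<Integer, Double> points) {
        // Find the peak point, N.
        int peak = 0;
        double best = -1.0;
        for (Map.Entry<Integer, Double> e : points.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                peak = e.getKey();
            }
        }
        // Average the N+1 points from N to 2*N inclusive;
        // missing points count as 0 ops/second.
        double sum = 0.0;
        for (int w = peak; w <= 2 * peak; w++) {
            sum += points.getOrDefault(w, 0.0);
        }
        return sum / (peak + 1);
    }

    public static void main(String[] args) {
        Map<Integer, Double> pts = new TreeMap<>();
        pts.put(1, 100.0);
        pts.put(2, 200.0); // peak at N=2
        pts.put(3, 160.0);
        pts.put(4, 120.0);
        // Metric averages points 2..4: (200 + 160 + 120) / 3
        System.out.println(metric(pts)); // 160.0
    }
}
```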
7 Results Reports
Benchmark results are available in several forms:
- Screen output
- Html file
- Raw file
- Ascii report page
All of these are just different ways of presenting the same basic information. All include the overall score, scores at individual numbers of warehouses, validation or error messages, information about the configuration, information about the input parameters, and details about each point.
For this reason, these are described in detail only once, in the HTML Result Pages section. The descriptions of the other formats refer back to the classifications of information described in the HTML section. Sample output pages are included with the kit (ASCII, HTML-jpeg, HTML-only, raw, results). Errors in the creation of any format (other than raw) do not prevent publication.
How to create the other formats from the raw file is described in Section 3.3 .
7.1 Raw file
The SPECjbb.<num>.raw file contains all the inputs from the properties files and results from the test. The reporter tool uses this file to generate all the other file formats. The raw file is submitted to SPEC.
If you intend to postprocess the results, this is the best file to start from. All information is stored in field=value pairs. It can be read into a Java program using the Properties class, or processed with Perl, awk, or shell scripts using pattern matching or grep.
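A sketch of such postprocessing with java.util.Properties (the class name is invented, and the exact field names vary between runs, so inspect your own raw file for the keys of interest):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

class RawReader {
    // Load a SPECjbb raw results file. Because every line is a field=value
    // pair, java.util.Properties parses it directly.
    public static Properties load(String path) {
        Properties p = new Properties();
        try (FileInputStream in = new FileInputStream(path)) {
            p.load(in);
        } catch (IOException e) {
            throw new RuntimeException("Cannot read raw file: " + path, e);
        }
        return p;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java RawReader <path-to-raw-file>");
            return;
        }
        // Dump every field=value pair from the given raw file.
        Properties raw = load(args[0]);
        raw.list(System.out);
    }
}
```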
7.2 HTML Result Pages
The HTML result pages contain the following elements.
SPECjbb2000 Metric
- See Section 6 for how the metric is calculated.
Performance for each point in the benchmark.
- Performance for each point, in the form of a chart and graph. (ASCII and the screen output do not include a graph.) The dashed line (in the jpg graph) is at the level of the overall metric, stretching across the points which were averaged together to get the metric.
Hardware and Software information
- The hardware vendor and model name, the OS vendor and name/version, and the JVM software vendor and software name/version. Enough information to reproduce the environment.
Availability and Configuration information
- All the other information from the Configuration description entries in the descriptive properties file.
- See the section on Availability in the Run and Reporting Rules for information on how to determine whether results on your system meet the availability criteria for publication.
Notes and Tuning Information
- The notes and tuning entries from the descriptive properties file.
Operational Validity / Errors
- Any errors encountered in the test inputs or results. To be valid, the test must have no errors reported in this section. Any errors will cause the word 'Invalid' to appear next to the SPECjbb2000 result number (ASCII and HTML formats). See Section 5 for details on validity checks.
If a thread ran out of memory, it will be reported at the point in the output where the condition occurred.
Details of Individual Points
- For each number of warehouses, the counts of each type of transaction, their total and maximum response times, the percentage spread between the transactions done by different threads, and the amount of heap used are presented. The average is not given, because it tends to be very small; you can compute it from the total and the count of transactions.
7.3 ASCII report file
This is automatically generated by the benchmark run. You can also run
java spec.reporter.Reporter -a -e -r results/SPECjbb.<num>.raw
to get the ASCII report format. This is one of the formats available on the SPECjbb2000 result pages website. It is very similar to the HTML report, except that it lacks the graph. It is primarily for viewing or printing on a system without a browser.
7.4 Screen Output
The benchmark outputs data to the screen as it runs. This output also appears in the results/SPECjbb.<num>.results file. It provides the same information as the HTML file, but in the order in which the benchmark has information available. It is therefore somewhat more verbose. The following sections appear on the screen.
Test verification and summary of benchmark settings
- See Section 5 for a description of the runtime validity checks. Some of these cannot be checked until the end of the run. Such checks are reported at the end of the .results file.
The conformance check and its results come out on the screen, but do not go into the .results file. Next, the parameters scroll down the screen.
Results from each point
The results from each point in the benchmark print out the following:
- Information about the minimum and maximum amount of work done by the threads/warehouses at that point.
- Numerical results for each transaction type.
- 'Min', 'Max', and 'Avg' are the minimum, maximum, and average response times for each of the 5 transaction types.
- 'Count' is the number of transactions of each type done during the run for this number of warehouses.
- 'Total' is the total time spent in this type of transaction, so Avg is Total / Count.
- The 'result' is the sum of the Counts divided by the number of seconds.
Calculating results
Minimum transactions by a thread = 213695
Maximum transactions by a thread = 214847
Minimum transactions by a warehouse = 213695
Maximum transactions by a warehouse = 214847
Difference = 1152 (0.5390861%)
===============================================================================
TOTALS FOR: COMPANY with 2 warehouses and 1 terminals each
................... SPECjbb2000 1.0 Results (time in seconds) ...................
               Count   Total    Min    Max    Avg    Heap Space
New Order:    186319  140.30  0.000  0.200  0.001   total 778.6MB
Payment:      186326   59.38  0.000  0.180  0.000   used  108.5MB
OrderStatus:   18632    6.52  0.000  0.020  0.000
Delivery:      18632    9.60  0.000  0.020  0.001
Stock Level:   18633   16.04  0.000  0.180  0.001
SPECjbb2000 result = 3570.59
===============================================================================
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
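The arithmetic in the sample can be checked directly. A minimal sketch (the class and method names are invented; the counts come from the sample above, and the elapsed time of roughly 120 seconds is implied by the printed result rather than shown in the output):

```java
class PointMath {
    // The per-point 'result' (ops/second) is the sum of all transaction
    // counts divided by the elapsed measurement time in seconds.
    public static double result(long[] counts, double seconds) {
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        return total / seconds;
    }

    // 'Avg' for a transaction type is 'Total' divided by 'Count'.
    public static double avg(double totalSeconds, long count) {
        return totalSeconds / count;
    }
}
```

For example, summing the five sample counts gives 428542 transactions; dividing that by the printed result of 3570.59 ops/second implies an elapsed measurement time of about 120 seconds.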
Overall SPECjbb2000 result
The equivalent of the ASCII report is printed to the screen and to the .results file at the end of the run, summarizing it.
Configuration Information
All the descriptive properties are echoed to the screen and results file. Also, system and JVM properties acquired by System.getProperties() are echoed.
8 Troubleshooting
Error messages are explained in greater detail below, listed by key words in the message.
Messages from the benchmark:
- "VALIDATION ERROR" -- The JVM has produced a screen which looks
different than expected for the transaction type specified in the message.
Further messages show the difference between expected and produced lines
in that screen. Determine why the JVM behaves differently from other JVMs.
This message means the run is not publishable, even though the run proceeds.
- "Out of memory" -- There was not enough memory for an allocation.
Increasing your heap space is recommended to fix this. In cases with a very
large number of warehouses, it may be that you need to increase your allowed
number of threads in your JVM or operating system.
- "Property file error" -- Some property in SPECjbb.props, SPECjbb_config.props,
or the file which was specified as the properties file, was deleted, misspelled,
or given a value outside of its expected range. In most cases there
is a further message indicating which one. Compare against a known good copy
of the properties file. "Unrecognized property" most often indicates
a misspelling. Note that booleans should be specified in lower case.
- "No valid warehouse sequence specified" -- specify (and uncomment)
either sequence_of_number_of_warehouses or all three of starting_warehouse_number,
ending_warehouse_number, increment_warehouse_number in the properties file.
Warehouse numbers (in sequence_of_number_of_warehouses) should be in increasing
order.
- "Error opening output file" -- Check that the directory of the
specified output file exists and is writable, and that there is not already
a file of the given name with write permission turned off.
- "INVALID: Run will NOT be compliant" -- One of the "fixed" properties
was assigned a value other than that required for valid runs. See Section
4.1.
- "An I/O Exception occurred" -- Check whether there is enough
disk space in the output directory disk or filesystem.
- "JVM Check detected error" -- This means that the small subset
of JVM conformance testing failed. Further messages indicate what the discrepancy
is.
- "Fails validation" -- Either jbb.jar was not first in the CLASSPATH, or the checksum of jbb.jar was not the expected value. This suggests that it has been recompiled or corrupted. If it is CLASSPATH, change CLASSPATH; otherwise, reinstall the kit.
Messages in the reports:
- "not compliant" -- The "fixed" parameters were not
the right values for a valid run. See Section 4.1.
- "JVM failed operational validity check" -- The JVM has produced
a screen which looks different than expected for the transaction type specified
in the message. Messages in the benchmark output (or results file) show the
difference between expected and produced lines in that screen. Determine
why the JVM behaves differently from other JVMs. This message means
the run is not publishable, even though the run proceeded.
- "conformance checking returned negative" -- The small subset
of JVM conformance testing found something non-conformant.
- "recompiled or jbb.jar not first in CLASSPATH" -- Either jbb.jar
was not first in the CLASSPATH, or the checksum of jbb.jar was not the expected
value. This suggests that it has been recompiled or corrupted. If it is CLASSPATH,
change CLASSPATH; otherwise, reinstall the kit.
- "Not having all required points" -- Points below the peak were
missing. They are required for a publishable run.
- "0's will be averaged in for points not measured" -- Points
between the peak and twice the number of warehouses at peak were missing.
If it is possible to collect a run including these points, the score will
be higher.
- "At least points 1 to 8 are required" -- Specify a sequence
of warehouses including all of the numbers from 1 to 8. This is to make publications
on the SPEC site more informative.
- "Measurement interval failing to end in close enough time limits" --
Further messages specify whether the measurement interval for a point ran
too long or too short compared to the intended amount of time, and which point.
Check that there is no other load on the system being measured, and that
the JVM scheduler is not starving the thread which signals the end of the
interval.
- "Max_warehouses_transactions - min_warehouses_transactions is __% of max_warehouses_transactions" -- Some threads completed significantly fewer transactions than others. Unfairness in the JVM scheduler may affect the score, so the score is not publishable. Investigate JVM scheduling issues.
- Missing file separator: On systems where the file separator is \, use \\ as the file separator in the properties files.
Still have a problem that makes no sense? Contact support@spec.org.
9 Performance Tuning
To tune the benchmark, analyze the workload and look for possible bottlenecks in the configuration. There are a number of factors or bottlenecks that could affect the performance of the benchmark on the system.
The following tuning tips are for configuring the system to generate the best possible performance numbers for SPECjbb2000. Please note that these should not necessarily be used to configure and size real world systems.
- JVM Software
- This is the critical component in determining the performance of the system for this workload. Use JVM profiling (java -prof or java -Xprof) to identify the most heavily called API routines. Other tools may be available on your system.
- CLASSPATH
- Extra entries in the CLASSPATH can degrade performance on some JVMs.
- Memory
- SPECjbb2000 needs a heap of at least 198m to 300m and is known to benefit from additional memory. Heap size is set via the -ms/-mx (or -Xms/-Xmx) options to java in many JVMs.
- Threads
- The number of threads is approximately equal to the number of warehouses. For very large numbers of warehouses, you may need to increase operating system limits on threads.
- JVM Locking
- At a larger number of warehouses (and hence threads), the synchronization of application methods, or the JVM's internal use of locking, may prevent the JVM from using all available CPU cycles. If there is a bottleneck without consuming 100% of CPU, lock profiling with JVM or operating system tools may be helpful.
- Network
- SPECjbb2000 does not run over the network.
- Disk I/O
- SPECjbb2000 does not write to disk as part of the measured workload. Classes are read from disk, of course, but that should be a negligible part of the workload.
10 Submitting results
Upon a successful run, the results may be submitted to the SPEC Java sub-committee for review by mailing the SPECjbb.<num>.raw file to subjbb2000@spec.org. Include the raw file in the body of the mail message rather than as an attachment, and mail only one result per email message.
Note: The SPECjbb.<num>.raw file carries the configuration and parameter information from the properties files. Please edit the properties files with the correct information prior to running the benchmark for submission.
Every submission goes through a two-week review process. During the review, members of the sub-committee may ask for further information/clarifications on the submission. Once the result has been reviewed and approved by the sub-committee, it is displayed on the SPEC web site at www.spec.org.