Netperf3 Run Rules
Table of Contents
What Netperf does
How to run Netperf3
Appendix 1 – Kernel Compilation and Installation
Appendix 2 – Contents of Config script files
Appendix 3 – Test Script files
Appendix 4 – Generate Netperf Results
Netperf is a benchmark that measures various aspects of networking performance between a single pair of machines. Its primary focus is on streaming and request/response tests using either the TCP or UDP protocol. Netperf3 is an experimental version, available at www.netperf.org, that offers both a thread model and a process model. Netperf3+ is an upgraded version, which needs to be built, that adds multi-adapter, tcp_rr and tcp_crr support. At present Netperf3+ does not synchronize a test among multiple clients, so we run the test between only two machines (a client and a server).
What Netperf3 does
Netperf3 is a client/server network application that measures network throughput through TCP/UDP streaming, TCP request/response and TCP connect/request/response tests, using multiple adapters on both the server and the client. The throughput is measured in Mbits/second. The throughput statistics are reported at the client end.
CPU: 4 Intel Pentium III 500 MHz
CACHE: 2 MB
NETWORK: Intel Ether pro cards using 100mb
OS: RedHat 6.2+kernel 2.4.0/Kernel 2.4.4/Kernel 2.4.7
CPU: 8 Intel Pentium III 700 MHz
NETWORK: Intel Ether Pro cards using 100 mb
OS: RedHat 6.2+Kernel 2.4.0/Kernel 2.4.4/Kernel 2.4.7
The NICs are connected using cross-over cables for the present workload. For future workloads we will use a 24-port, full-duplex Fast Ethernet Foundry Networks FastIron workgroup switch. The network traffic is balanced across all the adapters through static host routes and permanent arps.
How do I run Netperf3?
a. Linux Kernel: Get the correct kernel at kernel.org
Compile and install the kernel (see Appendix 1)
b. Installation of Netperf and its script files:
Download netperf3, available at www.netperf.org under the experimental directory, and apply netperf_adp.patch to get multi-adapter support. Build Netperf3+ and Netserver3+ with the pthread flag turned on. In this document the Netperf3+ server is referred to as netserver3 and the client as netperf3.
Copy the server (netserver3) to the /usr/local/bin directory on the server machine and the client (netperf3) to /usr/local/bin on the client machine. Netserver is a daemon that can be run either standalone or under the inetd daemon. Add /usr/local/bin to the PATH.
On the client, also copy 4adp_1thread.sh, 2adp_1thread.sh, 1adp_1thread.sh, 4adp_100thread.sh, 2adp_100thread.sh, 1adp_100thread.sh, start.sh, stop.sh, bsetup.sh, esetup.sh and vm.sh to /usr/local/bin. On the server, copy serv4.sh, serv2.sh, serv1.sh, start.sh, stop.sh, bsetup.sh, esetup.sh and vm.sh to /usr/local/bin.
Make a separate directory for each test configuration. For example, to run 4-adapter stream tests using kernel 2.4.0, make the directories s240_4adp_1way_1thread, s240_4adp_2way_1thread and s240_4adp_4way_1thread on both the server and client machines.
Since I run different tests back-to-back, I chose to run the server daemon standalone rather than through inetd. I also use the default port for Netperf. (Change the script files to take the port, remote host and local host as inputs.)
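The per-configuration directories described above can be created with a small helper. This is only a sketch: it assumes the naming pattern seen in the example (s<kernel>_<N>adp_<M>way_<T>thread), and the function name make_test_dirs and the BASE_DIR variable are hypothetical, not part of the Netperf3 scripts.

```shell
#!/bin/sh
# Sketch: create one result directory per test configuration.
# make_test_dirs and BASE_DIR are illustrative names, not part of Netperf3.
BASE_DIR="${BASE_DIR:-.}"

# usage: make_test_dirs <kernel-tag> <adapters> <threads> <way>...
make_test_dirs() {
    kernel="$1"; adapters="$2"; threads="$3"
    shift 3
    for way in "$@"; do
        dir="${BASE_DIR}/s${kernel}_${adapters}adp_${way}way_${threads}thread"
        # Create the directory (and parents) and report its name.
        mkdir -p "$dir" && echo "$dir"
    done
}

# Example: 4-adapter stream tests on kernel 2.4.0 with 1-, 2- and 4-way servers:
# make_test_dirs 240 4 1 1 2 4
```

Run the same call on both the server and the client so that the directory names match on the two machines.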
Linux Kernel: Build the kernel, selecting the correct drivers for your network cards under the Network device option in menuconfig. If you are using IBM Etherjet or Intel Ether pro adapters, increase TX_RING_SIZE and RX_RING_SIZE to 128 before building the kernel; otherwise the adapter will fail with an "out of resource" error. This will be fixed in later kernels (it is not fixed in the 2.4.0/2.4.4 kernels used for the test).
2. Prepare Test Hardware
a. Kernel Selection and Server/client Reboot:
1. Logon to the server/client as root
2. Edit /etc/lilo.conf to boot the correct kernel
3. Run "lilo"
4. Run "reboot" or "shutdown -r now" to reboot the server
b. Load Balance Network Traffic:
Copy the route.sh file to /usr/local/bin and execute it from the /etc/rc.local file so that the static routes and permanent arps are set every time the system starts.
How to set Static Routes and Permanent Arps:
Type "netstat -rn" to see the existing routes:
Execute /usr/local/bin/route.sh to set the static routes and permanent arps on both the client and server systems whenever they need to be refreshed, for example after restarting the network. route.sh is also executed from the startup files. route.sh has the following commands:
route add -host remote_ipaddress interface_device_name
arp -s remote_host remote_hwaddr
Issue the above "route" command for each interface_device_name (in our case eth1, eth2, eth3 and eth4) and the above "arp" command for each remote interface. After setting the static routes, type "netstat -rn" to see them and issue "arp -a" to see the arp entries.
Make sure that all the adapters are functioning using the following commands:
ifconfig -- shows all the interfaces that are up
ping -- ping each remote host on both the server and client
netstat -i -- shows per-interface statistics such as packets transmitted and received, errors, collisions, etc.
Each interface should show an equal number of packets as a result of the ping commands executed prior to this.
cat /proc/net/dev -- also shows the statistics for each interface, including the number of bytes transferred.
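To compare the packet counts across interfaces after the pings, the counters can be read from /proc/net/dev. The sketch below assumes the usual Linux layout of that file (two header lines, then one line per interface with the receive and transmit counters); the function name iface_packets is hypothetical.

```shell
#!/bin/sh
# Sketch: print received/transmitted packet counts per interface.
# iface_packets is an illustrative name; the file layout is assumed to be
# the standard Linux /proc/net/dev format.

# usage: iface_packets [file]   (defaults to /proc/net/dev)
iface_packets() {
    awk 'NR > 2 {
        # "eth1:123" and "eth1: 123" both occur, so split on the colon first.
        sub(/:/, " ")
        # After the split: $1 name, $3 receive packets, $11 transmit packets.
        printf "%s rx_packets=%s tx_packets=%s\n", $1, $3, $11
    }' "${1:-/proc/net/dev}"
}

# Example: iface_packets
```

If the test adapters show clearly different counts after the same number of pings, recheck the static routes and arp entries before starting a test.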
Avoid other network interference: Bring down the interfaces that are not used for the test, especially the ones connected to the backbone, by issuing "ifdown interface" (eth0 in our case). The interface that is connected to the backbone can be identified using the "ifconfig" command.
Set up hosts file: Update the /etc/hosts file to add the remote hostnames and their IP addresses on both the server and client systems.
Config changes for 1000 connection test:
Execute the following commands at each session where you want to run netperf and netserver.
ulimit -n 8192 -- increases the number of open files per process to 8192 (the default is 1024)
echo "32767" > /proc/sys/fs/file-max -- increases the system-wide maximum number of open file handles
Make sure that /usr/local/bin is in the PATH environment variable.
The convention followed for the directory setup on both the client and the server is:
For example, s240_4adp_1way_1connection/
Go to the test directory on the server and execute serv4.sh to start the server. In another session, start vm.sh just before starting the test on the client machine. The script vm.sh collects the system information at the beginning of the test, collects vmstat output (CPU usage) while the test is being executed, and finally collects network statistics after the test completes.
On the client machine, open a session, go to the test-specific directory and execute vm.sh, which collects system information, CPU usage every 4 seconds continuously, and network statistics at the end. Open another session, go to the test-specific directory and execute the test-specific script file. Note that changing to the test-specific directory, such as /usr/local/bin/stream240/4adp_1way_1thread, is important for collecting test-specific data.
Note: When you open a session you need to log in as root to set static routes, permanent arps, etc.
Since netperf3 does not include a convergence test, each test is run 3 times to make sure the results converge to within 5% variation. Take the average of the 3 runs as the result.
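The three-run check can be sketched as below. The document does not define how the 5% variation is computed, so this sketch assumes it means (max - min) / mean; the function name check_runs is hypothetical.

```shell
#!/bin/sh
# Sketch: average three throughput results and flag runs that vary too much.
# Assumption: "variation" = (max - min) / mean, expressed in percent.
# check_runs is an illustrative name, not part of Netperf3.

# usage: check_runs <run1> <run2> <run3>   (throughput in Mbits/second)
check_runs() {
    awk -v a="$1" -v b="$2" -v c="$3" 'BEGIN {
        a += 0; b += 0; c += 0            # force numeric comparison
        mean = (a + b + c) / 3
        max = a; if (b > max) max = b; if (c > max) max = c
        min = a; if (b < min) min = b; if (c < min) min = c
        spread = (max - min) / mean * 100
        printf "mean=%.2f spread=%.2f%% %s\n", mean, spread,
               (spread <= 5 ? "OK" : "RERUN")
    }'
}

# Example: check_runs 94.0 95.0 96.0
```

A result marked RERUN means the three runs differ by more than 5% under this definition and the test should be repeated before the average is reported.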
1. On the client under the test directory, network throughput log files for each msg size (4,16,1024,2048,4096,8192,32768) are collected along with vmstat.log, network statistics, system information before test start and after the test completes.
2. On the server under the test directory, vmstat.log file, network statistics, system information before test and after test are collected.
All of this data needs to be saved for later review and verification and for future reference.
Appendix 1: Build and Install a Linux Kernel
Get the kernel
Get the Linux kernel from ftp://www.kernel.org. Kernel images are usually located at /pub/linux/kernel/v2.4. Kernels are available in gzip and bzip2 compressed format.
Unpack the Kernel Tar File
Change directory to ‘/usr/src’ and unpack the tarball (substitute “2.4.0” with the version you are using).
For gzip compressed kernels: ‘tar zxvf linux-2.4.0.tar.gz’
For bzip2 compressed kernels: ‘bunzip2 -c linux-2.4.0.tar.bz2 | tar xvf -’
This will create a new directory called ‘linux’. Move linux to linux-<version>: ‘mv linux linux-2.4.0’
Create a symbolic link ‘linux’ to the new kernel directory: ‘ln -s linux-2.4.0 linux’
Configure the Kernel
Make sure to have no stale .o files and dependencies lying around:
‘make mrproper’
Use “make menuconfig” to configure the kernel using the default options plus:
· Select SMP option for SMP kernels
· For UP kernels, enable APIC support
· Add 4GB mem support.
· Add IBM ServeRAID support (in kernel, not module) as needed.
· Add Adaptec AIC7xxx support (in kernel, not module) as needed.
· Add appropriate network NIC support (in kernel, not module) as needed
Make the Kernel
Set up all the dependencies and build a compressed kernel image:
‘make dep’
‘make bzImage’
If you configured any parts of the kernel as ‘modules’, you also need to run:
‘make modules’
‘make modules_install’
Boot the Kernel
To boot the new kernel, you need to copy the kernel image bzImage that you just built to the directory where your bootable kernel is normally found, e.g., /boot, and rename it to bzImage-240-SMP
‘cp arch/i386/boot/bzImage /boot/bzImage-240-SMP’
You also need to save the file System.map:
‘cp System.map /boot/System.map’
Configure the lilo boot loader by editing the file /etc/lilo.conf and adding the following entry.
image=/boot/bzImage-240-SMP # the kernel image is /boot/bzImage-240-SMP
label=240SMP # give it the name “240SMP”
root=/dev/hda1 # /dev/hda1 is the root filesystem partition
# change /dev/hda1 to your root filesystem
Install lilo on your drive by running
‘lilo’
Shut down and reboot the system
‘shutdown -r now’
Appendix 2: Test Preparation Script Files:
An example of each script file is given below:
Route.sh is used to set the host routes and permanent arps:
route add -host 18.104.22.168 eth1
route add -host 22.214.171.124 eth2
route add -host 126.96.36.199 eth3
route add -host 188.8.131.52 eth4
arp -s 184.108.40.206 00:04:AC:D8:37:FB
arp -s 220.127.116.11 00:04:AC:D8:81:D6
arp -s 18.104.22.168 00:04:AC:D8:83:2C
arp -s 22.214.171.124 00:04:AC:D8:38:3A
Serv4.sh is used to start the server using 4 adapters:
netserver3 -H perf1,perf2,perf3,perf4
rstart.sh collects data before the test starts. Its contents are:
start.sh -- collects system information
netstat -i &> netbb -- collects network statistics per interface
cat /proc/stat &> bb -- collects cpu statistics
vmstat 4 |tee vmstat.log -- collects cpu statistics for every 4 seconds continuously
Rstop.sh file collects data after the test run and its contents are:
stop.sh -- collects system information
netstat -i &> netee -- network statistics
cat /proc/stat &> ee -- cpu utilization during the test duration
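The two /proc/stat snapshots (bb before the test, ee after it) can be turned into a CPU utilization figure for the test duration. The sketch below assumes the aggregate "cpu" line carries user, nice, system and idle counters in that order, as on 2.4 kernels; later kernels append further fields, which this sketch ignores. The function name cpu_util is hypothetical.

```shell
#!/bin/sh
# Sketch: CPU busy percentage between two /proc/stat snapshots.
# Assumption: the "cpu" line is "cpu user nice system idle ..." (2.4 layout);
# any additional fields on newer kernels are ignored.
# cpu_util is an illustrative name, not part of the test scripts.

# usage: cpu_util <before-file> <after-file>
cpu_util() {
    awk '
    # First file: counters at test start.
    $1 == "cpu" && NR == FNR { u0 = $2; n0 = $3; s0 = $4; i0 = $5 }
    # Second file: counters at test end.
    $1 == "cpu" && NR != FNR { u1 = $2; n1 = $3; s1 = $4; i1 = $5 }
    END {
        busy  = (u1 - u0) + (n1 - n0) + (s1 - s0)
        total = busy + (i1 - i0)
        if (total > 0) printf "cpu_busy=%.1f%%\n", busy / total * 100
        else print "cpu_busy=n/a"
    }' "$1" "$2"
}

# Example, run in the test directory after the test completes:
# cpu_util bb ee
```

This gives one average over all processors for the whole run; vmstat.log remains the source for the 4-second samples.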
The contents of start.sh:
cat /proc/meminfo > meminfo-start
cat /proc/slabinfo > slabinfo-start
cat /proc/cpuinfo > cpuinfo
cat /proc/version > version
cat /proc/interrupts > interrupts-start
cat /proc/net/netstat > netstat-start
uname -a > uname
ifconfig -a > ifconfig-start 2>&1
ps axu > ps-start 2>&1
The contents of stop.sh:
ps axu > ps-end 2>&1
cat /proc/meminfo > meminfo-end
cat /proc/slabinfo > slabinfo-end
cat /proc/interrupts > interrupts-end
cat /proc/net/netstat > netstat-end
ifconfig -a > ifconfig-end 2>&1
Appendix 3: Test Script Files
There are three sets of script files for the TCP protocol tests. Right now we use only the stream test files.
s_4adp_1thread.sh, s_2adp_1thread.sh, s_1adp_1thread.sh, s_4adp_100thread.sh, s_2adp_100thread.sh, s_1adp_100thread.sh
rr_4adp_1thread.sh, rr_2adp_1thread.sh, rr_1adp_1thread.sh, rr_4adp_100thread.sh, rr_2adp_100thread.sh, rr_1adp_100thread.sh
crr_4adp_1thread.sh, crr_2adp_1thread.sh, crr_1adp_1thread.sh, crr_4adp_100thread.sh, crr_2adp_100thread.sh, crr_1adp_100thread.sh
Appendix 4: Generate Netperf Results:
Collect the throughput log files for each test run from the test-specific directory. For example, for a 4-adapter, 1-way server, 1-connection test, copy the msgsize.log files (where msgsize is 4, 16, 1024, etc.) from the s240_4adp_1way_1thread directory. The network throughput is collected per thread, so in test runs that use multiple threads (multiple connections) the throughput needs to be summed. Right now we use a single client, so the results are collected only at the client end.
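Summing the per-thread throughput can be sketched as follows. The layout of the msgsize.log files is not described in this document, so the sketch assumes the per-thread throughput values have already been extracted to one number per line; the function name sum_throughput is hypothetical.

```shell
#!/bin/sh
# Sketch: add up per-thread throughput values read from standard input.
# Assumption: one throughput value (Mbits/second) per line; the real log
# layout is not specified here, so extract the values first.
# sum_throughput is an illustrative name, not part of Netperf3.

# usage: sum_throughput < values-file
sum_throughput() {
    awk '
    # Count only lines whose first field is a plain decimal number.
    $1 ~ /^[0-9]+(\.[0-9]+)?$/ { total += $1; n++ }
    END { printf "threads=%d total=%.2f\n", n, total }'
}

# Example: printf "23.5\n24.5\n22.0\n" | sum_throughput
```

Comparing the reported thread count with the number of connections in the test is a quick way to catch a log that is missing entries.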
Appendix 5: Sample Results:
Netperf3 Stream Test - 1 Connection/Adapter
2P 500 MHz 770 MB Mem
4 adapter test