| Strategies |
|---|
| Web Polygraph |
Correct measurement of proxy performance and identification of its bottlenecks are often impossible in a real-world environment because numerous factors affect proxy behavior. Benchmarking under controlled conditions is hence an essential tool for performance analysis and evaluation.
There are at least two major benchmarking strategies as applied to proxy performance. First, one can try to simulate real-world conditions as closely as possible and then report the performance of a proxy under a close-to-reality workload. The obvious advantage of this approach is that the tester can claim the results are very close to real-world performance. Unfortunately, real-world simulation has a few important shortcomings:
The sequence of tests below follows a different strategy. We study proxy performance under several important workloads, including various combinations of:
The suggested sequence of tests attempts to isolate performance bottlenecks using a step-by-step approach. Most of the modeled conditions are rare in practice. However, bottleneck identification allows for a much better understanding of a proxy's real-world potential. Finally, variations of the "mix" workload can be considered simple real-world tests.
Our test sequence is just an example. We are still searching for the best way(s) to benchmark a caching proxy.
The discussion assumes the use of Web Polygraph (version 1.x) and includes necessary configuration options. However, the proposed tests are not poly-specific, and any good benchmark should be able to model similar conditions.
Please consider reporting performance figures for all tests back to us if possible. We will be building a database of such results. Clearly, there are privacy and non-disclosure concerns. However, at least the first tests of the sequence may not be subject to any restrictions. Interesting anonymous submissions (i.e., submissions without identifying the proxy under test) or preliminary results are also welcome.
The configuration options have been updated to reflect changes in Poly 1.0. Please e-mail us if you find typos. The text with old Poly 0.0 options is still available.
The options were last synchronized with Poly 1.0p5 on 03/07/1999.
Polygraph parameters are typeset in fixed size font.
The following variables are used:
| Name | Possible Value | Comments |
|---|---|---|
| $cl | 200 | number of concurrent clients (see note below) |
| $Goal | 500000 | number of requests to generate; higher throughput requires larger goals |
| $Proxy | 10.0.0.13:80 | where to send the requests |
| $Origin | 10.0.1.17:80 | hostname part of URLs |
Some details are omitted. Not all options we list are required (in theory), but all are there for a reason. These are just examples and must be tuned to your environment!
All tests should be repeated with various numbers of clients unless specified otherwise. Start with an obviously small number of clients and finish with enough clients to significantly decrease proxy or server performance.
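One way to organize such a sweep is sketched below. It is only an illustration: the doubling schedule, the function names, and the 10% drop threshold are assumptions made for this example, not Polygraph features or recommended values.

```python
def client_sweep(start, limit, factor=2):
    """Return client counts start, start*factor, ... that do not exceed limit."""
    if start < 1 or factor <= 1:
        raise ValueError("need start >= 1 and factor > 1")
    counts = []
    n = start
    while n <= limit:
        counts.append(n)
        n *= factor
    return counts


def saturation_point(results, drop=0.10):
    """Find where throughput starts to fall.

    results: list of (clients, requests_per_second) pairs in sweep order.
    Returns the first client count whose rate is more than `drop` (an
    assumed threshold) below the best rate seen so far, or None.
    """
    best = 0.0
    for clients, rate in results:
        if best > 0 and rate < best * (1 - drop):
            return clients
        best = max(best, rate)
    return None
```

Each value returned by `client_sweep` would be used as `$cl` for one run; once `saturation_point` reports a client count, the sweep has probably gone far enough.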
Forget about your proxy for a moment if you can. Before you go ahead with benchmarking, you need to verify the actual conditions in your network. For example, you should know and test the sustained network bandwidth between the hosts that will run Polygraph. Watch for problems such as a supposedly full-duplex Fast Ethernet connection that actually runs in half-duplex mode.
A simple yet efficient way of testing network bandwidth is to transfer huge files over the network and measure how long each transfer takes. The size of a file should be close to a gigabyte for a 100Mbps link for the results to be reliable. You may want to repeat the experiment several times with different file sizes to double-check that you have reached the saturation point. Programs like ttcp are very handy for measuring and reporting the sustained bandwidth.
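The arithmetic behind such a transfer test is simple enough to sketch. The function names are made up for this example, and a megabit is taken to be 10**6 bits.

```python
def sustained_mbps(bytes_transferred, seconds):
    """Sustained bandwidth in megabits per second (1 Mbit = 10**6 bits)."""
    if seconds <= 0:
        raise ValueError("transfer time must be positive")
    return bytes_transferred * 8 / seconds / 1e6


def ideal_transfer_seconds(bytes_to_send, link_mbps):
    """Lower bound on transfer time at full link speed, ignoring protocol overhead."""
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    return bytes_to_send * 8 / (link_mbps * 1e6)
```

For instance, a file of 10**9 bytes cannot cross a 100Mbps link in less than 80 seconds. If the measured transfer takes much longer than that, the link is not delivering its nominal bandwidth and is worth investigating before any proxy is involved.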
Before you try to explain the results of other tests, you must verify that your test suite is up to the task of stressing the proxy. There are at least two major components in your test suite: machine(s) with Polygraph software and network infrastructure. This test determines the performance of these two components without proxy intervention.
Configure Polygraph client(s) to talk directly to Polygraph server(s). Use random object ids, as they produce a reply size distribution typical of the other tests. You may want to play with other configuration parameters as well. The more you know about "raw" benchmark performance, the better.
    $ polysrv --goal $BigGoal --port $OriginPort
    $ polyclt --unique_urls 1 --ports 1024:30000 \
        --proxy $Origin --origin $Origin --robots $cl --goal $BigGoal
As a rule of thumb, "no-proxy" performance of Polygraph should be significantly better than proxy performance. The bigger that gap, the better. Otherwise, you will not be able to stress the proxy and will end up testing the performance of the benchmark! If the performance of the test suite is unsatisfactory at this point, you've got a problem. You probably should not proceed any further until the problem is identified and fixed.
The third important test suite component, not tested in the previous experiment, is the environment your proxy will run in. If you have a software solution, this includes the network connections to the proxy box and the box itself. For proxies running on dedicated hardware, this test may not be possible. However, you may want to replace the proxy box with a high-performance Unix machine and still run the test.
Configure Polygraph client(s) to go through a proxy box. However, instead
of using the real proxy, install the null-proxy called tunnel
from the Polygraph 0.0 distribution (to be added to 1.0 shortly). Tunnel is a
primitive proxy that exchanges bytes between client and server connections
without caching or even understanding much of HTTP. Repeat tests similar to
the ones you just did with no-proxy traffic.
    $ polysrv --goal $BigGoal --port $OriginPort
    $ tunnel $ProxyPort $Origin
    $ polyclt --unique_urls 1 --ports 1024:30000 \
        --proxy $Proxy --origin $Origin --robots $cl --goal $BigGoal
If the "no-proxy" and "null-proxy" tests show the same results, you must be dreaming. Null-proxy introduces extra overhead for processing each client-server "connection" at the application level and extra delays for establishing new connections with origin servers. These delays mean larger response times and, hence, smaller request rates.
The purpose of this test is to get as close to the proxy working conditions as possible. We want to check that no network or host bottlenecks exist on the way from the Polygraph simulators to your proxy. Note that the first test did not stress at least some portions of that path.
If performance of a test suite is unsatisfactory (i.e., close to what you expect from or measured with the real proxy under test) at this point, you still might be OK. Indeed, a null-proxy might behave worse than the real one. However, the latter is highly unlikely, especially if your proxy does not use any dedicated hardware.
Finally, we get to the tests involving the real proxy. The first two tests in this group study the performance of the proxy when its cache is empty. If your cache is already full and you are in a hurry, skip this test. Otherwise, wipe your cache clean.
Note that all proxies employ some form of garbage collection to free disk space. Garbage collection algorithms vary from "lazy" background tasks to active on-demand actions. The tests you are about to run should not depend on garbage collection algorithms much because the cache will remain mostly empty. Later, we will compare the results with similar experiments with full cache to see how much garbage collection (and presence of a large object index) affects proxy performance.
To simulate miss-only workload, instruct Polygraph clients to add unique suffixes to generated URLs. Run two tests (for each number of clients): cachable and uncachable reply generation. Cachability of replies is controlled on the client side. Replies are uncachable by default. Uncachable miss-only tests should be run first.
    $ polysrv --goal $Goal --port $OriginPort
    $ polyclt --unique_urls 1 --world_id unique --ports 1024:30000 \
        --proxy $Proxy --origin $Origin --robots $cl --goal $Goal
    # to get cachable miss-only workload, add ``--rep_cachable 100p'' on the client side
    # empty cache after each cachable run
Note that uncachable miss-only workload is probably the easiest for a proxy to handle as objects do not have to be stored on disk. The second workload (cachable miss-only) tests the ability of a proxy to write objects to disk.
If you have to report the results of the benchmarking tomorrow, and your cache is already full, you may skip this test. However, the statistics collected during this test may be invaluable for the future analysis. We recommend that you wipe your cache clean and run this test.
This is one of the longest tests you will have to run. However, with a decent setup and good proxy performance, one can fill 1GB of cache in, say, 20-30 minutes. If you do not have time to fill the cache several times while varying the number of clients, select the number of clients that gave the best performance in the previous tests with the cachable miss-only workload. To estimate the time needed for a given number of clients, use the result of the corresponding cachable miss-only experiment with an empty cache (an optimistic estimate).
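The estimate can be sketched as follows. The mean reply size is an input that this text does not specify; it has to come from your own measurements. The function names and the numbers in the usage note are purely illustrative.

```python
import math


def fill_eta_seconds(cache_bytes, requests_per_second, mean_reply_bytes):
    """Optimistic time to fill the cache, assuming every reply is stored
    and the empty-cache request rate is sustained for the whole run."""
    if requests_per_second <= 0 or mean_reply_bytes <= 0:
        raise ValueError("rate and mean reply size must be positive")
    return cache_bytes / (requests_per_second * mean_reply_bytes)


def goal_for_fill(cache_bytes, mean_reply_bytes):
    """Smallest number of requests whose replies add up to cache_bytes."""
    if mean_reply_bytes <= 0:
        raise ValueError("mean reply size must be positive")
    return math.ceil(cache_bytes / mean_reply_bytes)
```

With a hypothetical 10**9-byte cache, 100 requests per second, and 10,000-byte replies on average, `fill_eta_seconds` gives 1000 seconds and `goal_for_fill` gives 100000 requests. The real run will probably take longer, since performance may drop as the disks fill up.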
The cache size, number of disks, etc. should match the base configuration you are going to test and work with in the future. It is a good idea to study simple configurations first, of course.
Start with an empty cache. Simulate cachable miss-only workload. The number of requests should be large enough to fill your cache. During the test, periodically (e.g. every 60 seconds) take a snapshot of major proxy measurements you have access to.
    # empty the cache first
    $ polysrv --goal $HugeGoal --port $OriginPort
    $ polyclt --unique_urls 1 --ports 1024:30000 --rep_cachable 100p \
        --proxy $Proxy --origin $Origin --robots $cl --goal $HugeGoal
    # use unique World Id(s) (default) so you do not have to worry about
    # the cached objects later when running miss-only tests
If possible, try to prevent proxy garbage collection algorithms (if any) from kicking in (e.g., stop the test when cache space utilization reaches 90%). Also, the performance of a proxy may change dramatically as the disks fill up for reasons other than garbage collection (e.g., on Unix more objects may mean reduced caching of i-nodes and longer directory lookups). That is why we recommend taking performance snapshots often enough to detect potential problems. The aggregate statistics may not be good enough for this workload, where the situation may change with every object swapped out to disk.
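A periodic snapshot loop might look like the sketch below. How the measurements are read depends entirely on the proxy under test, so `read_stats` is a hypothetical callable that you would have to supply yourself; it is not part of Polygraph.

```python
import time


def collect_snapshots(read_stats, count, interval=60.0,
                      sleep=time.sleep, clock=time.time):
    """Call read_stats() `count` times, `interval` seconds apart.

    read_stats is a hypothetical, tester-supplied function returning
    whatever proxy measurements are available. sleep and clock are
    parameters only so that the loop can be exercised without waiting.
    Returns a list of (timestamp, stats) pairs.
    """
    snapshots = []
    for i in range(count):
        snapshots.append((clock(), read_stats()))
        if i + 1 < count:
            sleep(interval)  # no need to wait after the last snapshot
    return snapshots
```

Keeping the timestamp next to each snapshot makes it easier to line up a performance change with how full the cache was at that moment.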
If your network is fast (and it should be for the benchmarks to make sense), some proxies may not cache all the traffic that comes through. It is important to study the proxy logs or other sources of information to see how many objects were actually cached. If the proxy does not provide you with relevant information, you may want to repeat the same test several times (preserving test parameters between runs!). The number of misses during the subsequent tests may be used as an estimate of "skipped" document count (swap-out skip ratio). However, be advised that a proxy can also skip swap-in disk requests (also affecting number of misses) so more advanced testing may be required in this case.
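The estimate described above amounts to a single division, sketched here with illustrative names. It assumes the repeated run requests exactly the same objects as the first one.

```python
def swap_out_skip_ratio(first_run_requests, repeat_run_misses):
    """Estimate the share of objects the proxy did not store on the first run.

    Assumes the repeat run asks for the same objects as the first run, so
    every miss on the repeat is counted as an object that was skipped.
    """
    if first_run_requests <= 0:
        raise ValueError("first run must contain at least one request")
    if not 0 <= repeat_run_misses <= first_run_requests:
        raise ValueError("misses must be between 0 and the request count")
    return repeat_run_misses / first_run_requests
```

Because skipped swap-in requests also show up as misses, the value may overstate the swap-out skip ratio; treat it as a rough estimate.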
    $ polysrv --goal $Goal --port $OriginPort
    $ polyclt --unique_urls 1 --ports 1024:30000 \
        --proxy $Proxy --origin $Origin --robots $cl --goal $Goal
    # to get cachable workload, add ``--rep_cachable 100p'' on the client side
    # do not empty the cache
TBD: Experiment with cachable replies and Zipf-distributed object ids. Some may consider this workload to be close to real-world traffic, but see "Bursty traffic" section as well.
    $ polysrv --goal $Goal --port $OriginPort
    $ polyclt --rep_cachable 100p --ports 1024:30000 \
        --proxy $Proxy --origin $Origin --robots $cl --goal $Goal
Repeat the previous experiment several times using the same parameters until the hit ratio is close to 100%. This usually requires 2-3 passes for each number of clients (depending on the length of an experiment, of course).
    $ polysrv --goal $Goal --port $OriginPort
    $ polyclt --order $rnd --world_id hit-only.full.$cl --ports 1024:30000 \
        --rep_cachable 100p \
        --proxy $Proxy --origin $Origin --robots $cl --goal $Goal
    # repeat several times (for each $cl) varying $rnd for each run
[Note: if your version of Poly complains about the --order option (we have temporarily disabled it), omit that option for now and use the --rnd_seed option instead.]
Previous experiments had a constant number of concurrent requests. Experiments in this group model bursty traffic with a given average rate of request submissions.
    $ polysrv --goal $Goal --port $OriginPort
    $ polyclt ... --robots 1 --req_rate $Rate
    # other parameters depend on the experiment
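To get a feel for what "bursty traffic with a given average rate" can look like, here is a minimal sketch that draws request submission times with exponentially distributed gaps (a Poisson arrival process). That model is an assumption chosen for illustration; this text does not say how Polygraph spaces requests when a request rate is given.

```python
import random


def request_times(rate, count, seed=0):
    """Submission times in seconds for `count` requests.

    Gaps between requests are exponential with mean 1/rate, so requests
    sometimes cluster into bursts while the long-run average stays close
    to `rate` requests per second. This Poisson model is an assumption.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    rng = random.Random(seed)  # seeded so that a run can be repeated
    now = 0.0
    times = []
    for _ in range(count):
        now += rng.expovariate(rate)
        times.append(now)
    return times
```

Unlike a fixed number of concurrent clients, such a schedule does not slow down when the proxy does, so pending requests can pile up during a burst.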
TBD: DNS delays; slow clients and servers; 10 boxes simulating 10 clients each versus 1 box simulating 100 clients, etc.
There are some real-world phenomena we do not know how to simulate (and/or lack understanding of their impact):
$Id: strategies.html,v 1.7 1999/05/31 06:01:10 rousskov Exp $