| PGL: Types |
|---|
| Web Polygraph |
PGL supports many generic and domain-specific types. Supported types are listed below with pointers to their detailed descriptions where available.
This page has been synchronized with Polygraph version 2.2.1.
| Generic | Domain Specific |
|---|---|
| Type: | addr | ||||
|---|---|---|---|---|---|
| Fields: | |||||
|
Network addresses are represented using the addr type. An address can store IP or FQDN information along with an optional port number. Address constants are usually specified using 'single quoted strings' as shown below.

    addr them = '204.71.200.245';        // no port number
    addr theirServer = '204.71.200.245:80';
    theirServer.host = "209.162.76.5";   // change host name only
    theirServer.host = them;             // Error: type mismatch!

Arrays of addresses can be formed using regular array operations. To create an array with many "similar" addresses, a handy address range notation can be used. The a-b.c-d.e-f.g-h notation instructs PGL to produce an array of IP addresses that belong to a range specification. At least two ranges (or points) must be specified. A single range spec can span more than 255 IP addresses. IP addresses that end with .0 or .255 are skipped!

    addr[] srv_ips = [ '10.100.1-2.1-250:8080' ]; // 500 IP addresses
    addr[] rbt_ips = [ '10.100.1-500' ];          // 498 IP addresses
    addr[] dummy = [
        '10.100.0.1-254',
        '10.100.1.1-244'                          // same as rbt_ips
    ];

While surrounding a single address range with array [brackets] is not required, you do need them to concatenate two or more ranges. One can also specify a subnet. The latter may be useful for the aka tool.

    clt01> sudo aka fxp0 10.100.2.1-250/16
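The range notation can be pictured with a short Python sketch. It is an illustrative model only, not Polygraph's parser: the function name `expand_range` is hypothetical, and the treatment of a trailing range that spans more than one octet is an assumption chosen to reproduce the address counts quoted in the examples above.

```python
from itertools import product


def expand_range(spec):
    """Expand an 'a-b.c-d.e-f.g-h[:port]' spec into a list of address strings."""
    host, _, port = spec.partition(":")
    bounds = []
    for comp in host.split("."):
        lo, _, hi = comp.partition("-")
        bounds.append((int(lo), int(hi or lo)))  # a single point is a one-element range
    # Assumption: when fewer than four components are given, the last one
    # is a number spread over the remaining octets (base 256).
    tail_octets = 4 - len(bounds) + 1
    addresses = []
    for *head, tail in product(*(range(lo, hi + 1) for lo, hi in bounds)):
        spread = [(tail >> (8 * i)) & 0xFF for i in reversed(range(tail_octets))]
        octets = list(head) + spread
        if octets[-1] in (0, 255):  # addresses ending in .0 or .255 are skipped
            continue
        addr = ".".join(map(str, octets))
        addresses.append(f"{addr}:{port}" if port else addr)
    return addresses


srv_ips = expand_range("10.100.1-2.1-250:8080")
rbt_ips = expand_range("10.100.1-500")
dummy = expand_range("10.100.0.1-254") + expand_range("10.100.1.1-244")
print(len(srv_ips), len(rbt_ips), rbt_ips == dummy)
```

The sketch does not model the subnet suffix used with the aka tool.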
| Type: | array |
|---|---|
| See also: | list, selector |
Array is simply a list of items of the same type. Polygraph extends arrays dynamically to accommodate all items so no array size specifications are supported. One cannot extract an element from an array (such a capability seems unnecessary because PGL does not support loops).

    int[] numbers;                            // a declaration of an array of integers
    time[] alarms = [];                       // an empty array of time values
    addr[] ips = [ '10.0.1.1', '10.0.2.2' ];  // an array of two addresses

Arrays do automatic interpolation of sub-arrays. That is, when an array A is evaluated, an item I of array type is interpolated into A just as if each individual element of I were a member of A. Thus, arrays lose their identity in an array environment. (This feature and its explanation were borrowed from the Perl language.)

    // the following two arrays are equivalent
    int[] A1 = [ 1, 2, 3, 4 ];
    int[] A2 = [ 1, [2, [3]], 4];
    // A1 becomes a concatenation of A1 and A2:
    A1 = [ A1, A2 ];
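Sub-array interpolation amounts to recursive flattening. The Python sketch below illustrates the idea; the function name `interpolate` is hypothetical and the code is not taken from Polygraph.

```python
def interpolate(items):
    """Flatten nested lists the way PGL arrays absorb their sub-arrays."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(interpolate(item))  # a sub-array loses its identity
        else:
            flat.append(item)
    return flat


a1 = interpolate([1, 2, 3, 4])
a2 = interpolate([1, [2, [3]], 4])
a1_concat = interpolate([a1, a2])  # models A1 = [ A1, A2 ]
print(a1 == a2, a1_concat)
```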
| Type: | bwidth |
|---|---|
| See also: | size, time, rate |
Bwidth type is nothing else but a size/time fraction:

    bwidth bw = 100Mb/sec;  // 100BaseTX (100 Mbit per second)
    size sz = 500Kb;
    time tm = 10sec;
    bwidth bw2 = sz/tm;     // 50Kbps, naturally
    bwidth bw3 = 13/sec;    // Error: type mismatch
| Type: | bool |
|---|
Boolean type can take the following values, with obvious interpretation: true, false, yes, no, on, and off. Simply use whatever value is appropriate for a given situation.
| Type: | distr |
|---|---|
| See also: | ObjLifeCycle |
Distr type allows you to specify a random distribution of a well-known shape. In PGL, distributions are "typed". That is, you must specify the type of the values along with the shape of the distribution. Polygraph is usually able to guess the value type by examining the parameters of the distribution function.

    size_distr repSize = exp(13KB);  // exponential distribution of sizes
    int_distr connLen = zipf(64);    // Zipf-distributed connection lengths

The following distribution shapes are recognized:
- Constant: const(mean)
- Uniform: unif(min, max)
- Exponential: exp(mean)
- Normal: norm(mean, std_dev)
- Lognormal: logn(mean, std_dev)
- Zipf(1): zipf(world_size)
- Sequential: seq(max)
When a time distribution is used to specify Object Life Cycle parameters, it can be augmented by special qualifiers. The following qualifiers are supported:
- now -- current time
- lmt -- last modification time
- nmt -- next modification time
The value of the nmt qualifier is what lmt would read after the object is modified once. That is, it is the "next last modified time". This qualifier is handy for specifying truthful Expires header fields:

    // object life cycle for "HTML" content
    ObjLifeCycle olcHTML = {
        birthday = now + exp(-0.5year);  // born about half a year ago
        length = logn(7day, 1day);       // heavy tail, weekly updates
        variance = 33%;
        with_lmt = 100%;                 // all responses have LMT
        expires = [nmt + const(0sec)];   // everything expires when modified
    };
| Type: | float |
|---|---|
| See also: | int, int() |
Floating point values are represented using float type. Common arithmetic operations are supported. Integer values are implicitly converted to floating point in a float context. There is no implicit or default conversion from floating point values to integers. Use the int() function for an explicit cast.

    float f = 5/10;    // f is equal to 0.0
    float f = 5.0/10;  // f is equal to 0.5
    int i = f;         // Error: no default conversion from float to int

Internally, Polygraph stores floating point values using "double precision" (usually 8 bytes per variable).
| Type: | int |
|---|---|
| See also: | float, int() |
Integer values are represented using int type. Common arithmetic operations are supported for integers. The important thing to remember about integer arithmetic is that all calculations are done with integer precision. For example, 3/2 yields 1 and 3*(2/3) yields zero.
There is no implicit or default conversion from floating point values to integers. Use the int() function for an explicit cast.
An integer value of zero can be implicitly converted to many other types, resulting in a "none" or "nil" value. Note that the latter is not the same as an "undefined" value. Polygraph may replace undefined values with appropriate defaults, but a zero value cannot be silently replaced or ignored.

    int i = 5/10;                 // OK; i is equal to 0
    int i = 5.0/10;               // Error; no default conversion from float
    int i = int(10*(5.0/10));     // OK; i is equal to 5
    time_distr xactThinkTime = 0; // no delays
| Type: | list |
|---|---|
| See also: | array |
List is a comma-separated enumeration of items. List items can be of different types. Lists are used in function and procedure calls, but you should not attempt to declare a list variable.
| Type: | rate |
|---|---|
| See also: | float, time, bwidth |
Rate type is nothing else but a float/time fraction.

    rate req_rate = 10.1/sec;       // about 10 requests per second
    rate xact_rate = 3/5min;        // 3 xactions in 5 min interval
    rate rep_rate = 0;              // no replies at all
    float dummy = xact_rate * sec;  // that many xactions each second
    rate r = 13/5;                  // Error: type mismatch
| Type: | selector |
|---|---|
| See also: | array |
Selector is an array with probabilities associated with every item. By default, all probabilities are unknown. When actual probabilities are needed, the items with unknown probabilities will absorb whatever is left from 100%, in a fair fashion.

    addr[] servers = [
        '10.0.2.1:80' : 30%,  // this server will be used in 30% of cases
        '10.0.2.2:80' : 50%,  // this server will be used in 50% of cases
        '10.0.2.3:80'         // 100-30-50 = 20% is everything that is left
                              // for the last server
    ];

If probabilities add up to less than 100%, they are adjusted proportionally to their absolute values.

    // the following two selectors are equivalent:
    Phase[] scheduleA = [ ph1 : 20%, ph2 : 60% ];
    Phase[] scheduleB = [ ph1 : 25%, ph2 : 75% ];

Note that Polygraph does not complain if you specify probabilities in an array where none are expected. Such probabilities are silently ignored.
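The two adjustment rules (unknown probabilities share the remainder; explicit probabilities below 100% are scaled up) can be sketched in Python as follows. This is an illustrative model under the assumptions stated in the comments, not Polygraph's code; `resolve_probabilities` is a hypothetical name, and `None` stands for an unknown probability.

```python
def resolve_probabilities(probs):
    """Return final percentages for selector items; None means 'unknown'."""
    known = sum(p for p in probs if p is not None)
    unknown = sum(1 for p in probs if p is None)
    if unknown:
        # Items without a probability split whatever is left of 100% evenly.
        share = (100.0 - known) / unknown
        return [share if p is None else float(p) for p in probs]
    if 0 < known < 100:
        # All probabilities are explicit but fall short of 100%: scale them up.
        return [p * 100.0 / known for p in probs]
    # Totals above 100% are not covered by the text; values are left as given.
    return [float(p) for p in probs]


print(resolve_probabilities([30, 50, None]))
print(resolve_probabilities([20, 60]))
```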
| Type: | size |
|---|---|
| See also: | time |
For size constants, Polygraph understands the following scales:

| Suffix | Bytes |
|---|---|
| Byte | 1 |
| Kb | 128 |
| KB | 1024 |
| Mb | 131072 |
| MB | 1048576 |
| Gb | 134217728 |
| GB | 1073741824 |

Scale suffixes can be shortened to the first two letters (e.g. 5KB) except for the Bytes suffix, which cannot be shortened.
A scale suffix can be applied to integer and floating point numbers. In the case of floating point numbers, the final number of bytes is rounded down to the nearest integer.

    size s0 = 3KB + 1Mb;
    size s1 = 2.5Bytes;  // OK; truncated to 2 bytes
    size s2 = 10 * s1;   // s2 holds 20 bytes
    size s3 = s0/s1;     // Error: type mismatch

PGL can handle sizes up to 4611686016279904256 bytes on machines with 4 byte integers, which is approximately 4 exabytes. However, Polygraph objects cannot handle sizes larger than 2GB unless noted otherwise.
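The scale table translates directly into a lookup. The Python sketch below is a hypothetical helper (the names `SCALES` and `size` are not part of PGL) that reproduces the arithmetic of the examples above, with lowercase `b` suffixes counting bits and uppercase `B` suffixes counting bytes.

```python
# Bytes per unit, copied from the scale table above.
SCALES = {
    "Byte": 1,
    "Kb": 128, "KB": 1024,
    "Mb": 131072, "MB": 1048576,
    "Gb": 134217728, "GB": 1073741824,
}


def size(value, suffix):
    """Convert a scaled constant to whole bytes, truncating any fraction."""
    return int(value * SCALES[suffix])


s0 = size(3, "KB") + size(1, "Mb")
s1 = size(2.5, "Byte")
s2 = 10 * s1
bw2 = size(500, "Kb") / 10  # bytes per second for 500Kb sent in 10 seconds
print(s0, s1, s2, bw2)
```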
| Type: | string |
|---|
String constants are specified using "double quoted strings". At the time of writing, no interesting operations on strings were supported.
| Type: | time |
|---|---|
| See also: | size |
For time constants, Polygraph understands the following scales:

| Suffix | Abbreviation | Value |
|---|---|---|
| msec | ms | millisecond (1/1000 second) |
| sec | s | second |
| min | | minute |
| hour | hr | 60 minutes |
| day | | 24 hours |
| year | | 365 days |

A scale suffix can be applied to integer and floating point numbers. In the case of floating point numbers, the closest approximation is chosen to represent integer seconds and milliseconds.

    time t0 = 5min + 1sec;
    time t1 = 0.5sec;  // OK; 500 milliseconds
    time t3 = t0/t1;   // Error: type mismatch

PGL also allows for "absolute time" constants. Absolute constants are specified using single quoted strings and come in one of two formats: 'YYYY/MM/DD' or 'YYYY/MM/DD HH:MM:SS'.

    time today = '1999/08/23 13:10:30';  // absolute date

Absolute times are assumed to represent Coordinated Universal Time (UTC).
| Type: | Agent | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fields: | |||||||||||||
| |||||||||||||
| See also: | Robot, Server, Proxy, use() |
Agent is a base type for PGL Robots, Servers, and Proxies. In other words, Agents have properties common to those three types. Usually, you will not use the Agent type directly, but knowing its properties helps in Robot and Server manipulation.
The kind field is a label used for information purposes only.
The idle_pconn_tout field specifies the delay after which an idle persistent connection (i.e., a connection with no pending messages) will be closed.
If the hosts field is defined, Polygraph will start the agents on the specified hosts (network interfaces) only. Otherwise, an agent will be activated on all hosts where the corresponding Poly process (polyclt or polysrv) is running.
Proxy agents currently ignore all but the hosts field of their parent type. It is a bug.
| Type: | Cache | ||||
|---|---|---|---|---|---|
| Fields: | |||||
| |||||
| See also: | Proxy |
The Cache type is used to configure a proxy cache.
The capacity field specifies the maximum size of the cache. When the sum of content lengths of all cached objects exceeds the configured capacity, some objects may be purged to free space for the incoming traffic. Setting capacity to zero effectively disables the cache.
When set, icp_port instructs the cache object to listen for ICP queries on the specified port and reply to those queries according to the cache contents. At the time of writing, misses are replied with the miss-no-fetch ICP opcode.
Cache admission policy admits every cachable object at most capacity in size. The replacement policy is LRU.
Polygraph allocates about 80 bytes of housekeeping information per cache entry and assumes that average object size is 10KB. It is a good idea to make sure that your benchmarking environment has more than enough memory for the configured cache capacity.
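A back-of-the-envelope estimate of the housekeeping memory follows from those two numbers. The Python sketch below is only that: a rough calculation using the 80-byte and 10KB figures quoted above, with hypothetical names, and not a description of how Polygraph allocates memory internally.

```python
ENTRY_OVERHEAD = 80            # approximate housekeeping bytes per cache entry
AVG_OBJECT_SIZE = 10 * 1024    # assumed average object size (10KB)


def cache_housekeeping_bytes(capacity_bytes):
    """Estimate housekeeping memory needed for a cache of the given capacity."""
    entries = capacity_bytes // AVG_OBJECT_SIZE
    return entries * ENTRY_OVERHEAD


ten_gb = 10 * 1024 ** 3
print(cache_housekeeping_bytes(ten_gb))  # bytes of housekeeping for a 10GB cache
```

Under these assumptions a 10GB cache needs roughly 80MB of housekeeping memory, before any other Polygraph memory use is counted.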
Polygraph cache does not store object content, of course. If needed, "cached" content can be generated from scratch, using the corresponding origin server configuration. This content regeneration is the responsibility of the server side of a proxy. If you are using the cache, make sure that the origin servers in the Proxy configuration file are exactly the same as the origin servers used in the experiment!
| Type: | Content | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fields: | |||||||||||||
| |||||||||||||
| See also: | Server |
The Content type accumulates details about such Web object properties as MIME type, size, cachability, etc.
The May_contain field specifies embedded types that the content type may contain. For example, HTML objects may contain various images and audio files.
Embedded_obj_cnt distribution is used to determine the number of embedded objects in the container of the corresponding content type.
| Type: | Goal | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Fields: | |||||||||
| |||||||||
| See also: | Phase |
Goal specifies one or more simulation goals for a given phase. Individual sub-goals are ORed together. That is, reaching one sub-goal is enough to reach the entire goal.
All sub-goals except errors are called "positive" sub-goals. Specifying errors, the "negative" sub-goal, is somewhat tricky. If the errors value is less than 1.0, then it is treated as an error ratio. Otherwise, it is treated as an error count. For example, a value of 0.03 would mean that getting at least 3% of errors is enough to reach the goal, while a value of 3 would mean that at least 3 errors are enough.
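The ratio-versus-count interpretation of the errors sub-goal, and the ORing of sub-goals, can be sketched in Python. The function names and the way transactions are counted are assumptions made for illustration; the text does not specify what the error ratio is measured against, so the sketch assumes errors per transaction.

```python
def errors_goal_reached(errors_goal, error_count, xact_count):
    """Check the 'negative' sub-goal: values below 1.0 are ratios, others counts."""
    if errors_goal < 1.0:
        # Assumption: the ratio is errors divided by total transactions so far.
        return xact_count > 0 and error_count / xact_count >= errors_goal
    return error_count >= errors_goal


def goal_reached(sub_goal_results):
    """Sub-goals are ORed: reaching any one of them reaches the whole goal."""
    return any(sub_goal_results)


print(errors_goal_reached(0.03, 4, 100))  # ratio interpretation
print(errors_goal_reached(3, 3, 100))     # count interpretation
```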
| Type: | Mime | ||||
|---|---|---|---|---|---|
| Fields: | |||||
| |||||
| See also: | Content |
Mime type groups together Web object properties related to the MIME standard. Type specifies the string to be used for the Content-Type: HTTP header. The extensions field is ignored by Polygraph version 2.2.9 and earlier.
| Type: | NetPipe | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fields: | |||||||||||||||||
| |||||||||||||||||
| See also: | use() |
NetPipe type describes the parameters of a network "pipe". The set of parameters is based on the dummynet interface for FreeBSD. Currently, pipe specifications are used by the piper tool only.
If the value of outgoing field is undefined (not true or false), the pipe is assumed to be symmetric. In that case, piper will assign two identical pipes to each address in the hosts field.
As with robots and servers, one must call use() to select which pipes will actually be used by piper.
| Type: | ObjLifeCycle | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Fields: | |||||||||||
| |||||||||||
| See also: | Content |
ObjLifeCycle specifies the parameters for the Object Life Cycle model.

    // object life cycle for "HTML" content
    ObjLifeCycle olcHTML = {
        birthday = now + exp(-0.5year);  // born about half a year ago
        length = logn(7day, 1day);       // heavy tail, weekly updates
        variance = 33%;
        with_lmt = 100%;                 // all responses have LMT
        expires = [nmt + const(0sec)];   // everything expires when modified
    };

See the distribution type for a list of supported qualifiers for time distributions (lmt, now, nmt, etc.).
| Type: | Phase | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fields: | |||||||||||||||||||
| |||||||||||||||||||
| See also: | schedule() |
Most Polygraph measurements are split into Phases. Phases also make it possible to vary the overall load created by Polygraph to model complex load patterns.
Phase name is used for informational purposes only. Do not use the name "All", which is an lx macro that stands for "all phases". Also, if you are going to make graphs based on console output (rather than binary logs), avoid phase names with whitespace. Whitespace effectively changes the number of columns in console stats lines and confuses plotting tools.
Phase goal specifies the duration of the phase. See Goal type description for details.
Load factors affect the load generated by Polygraph Robots. Load level can be varied from 0% to 100% and beyond, relative to what load an individual Robot is generating. In other words, load factor tells each Robot to adjust its activity accordingly.
If load_factor_beg is not equal to load_factor_end, then the load level is adjusted linearly during the phase. That is, the load is increased (or decreased) from load_factor_beg to load_factor_end. Load change requires a positive phase goal.
There are a couple of simple "load preservation" rules that make load factors easy to specify. All these rules apply only when a factor is not explicitly defined:
- For load_factor_beg, use load_factor_end of the previous phase.
- For load_factor_end, use load_factor_beg of the current phase.
- If a load factor is still undefined, it is set to 100%.
These rules eliminate repetitions of load factor entries for consecutive phases. Only changes in load levels have to be specified.
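The load preservation rules can be sketched as a small Python function. It is an illustrative model, assuming the rules are applied in the listed order, phase by phase; the name `resolve_load_factors` is hypothetical, factors are in percent, and `None` marks a factor that was not explicitly defined.

```python
def resolve_load_factors(phases):
    """Fill in undefined (None) begin/end load factors for consecutive phases."""
    resolved = []
    prev_end = None  # there is no previous phase before the first one
    for beg, end in phases:
        if beg is None:
            beg = prev_end   # rule 1: inherit the end factor of the previous phase
        if end is None:
            end = beg        # rule 2: keep the level of the current phase
        if beg is None:
            beg = 100.0      # rule 3: anything still undefined is 100%
        if end is None:
            end = 100.0
        resolved.append((beg, end))
        prev_end = end
    return resolved


# Four phases: steady start, ramp down to 50%, steady, then an explicit 200%.
schedule = [(None, None), (None, 50.0), (None, None), (200.0, None)]
print(resolve_load_factors(schedule))
```

Only the second and fourth phases name a factor; the remaining values follow from the rules.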
Other factors behave in a similar fashion. Recur_factor is applied to the recurrence_ratio of a Robot. Special_req_factor is applied to the portion of "special requests" such as "IMS" or "Reload". The latter can be specified using the "req_type" field of a Robot.
Finally, the log_stats flag tells Polygraph if statistics collected during the phase should be recorded in a log file. This flag defaults to true.
| Type: | PopDistr |
|---|---|
| See also: | PopModel |
The PopDistr type is similar to the distribution type. Popularity distribution specifies how to select the next object to be requested from a group of objects that were requested before. In other words, it specifies which objects are more popular than others (i.e., requested more often) within a certain group of objects.

    PopModel R;
    R.pop_model = pmZipf(0.6);

The following popularity models are supported:
- pmUnif() -- Uniform: all objects have equal chance of being selected
- pmZipf(skew_factor) -- Zipf: zipf-like power law with the specified skew
PopDistr was called PopModel in Polygraph version 2.2.9 and earlier. The PopModel type is now more complex.
| Type: | PopModel | ||||||
|---|---|---|---|---|---|---|---|
| Fields: | |||||||
| |||||||
| See also: | Robot, working_set_length() |
Popularity model specifies how to select the next object to be requested among all objects that were requested before. In other words, it specifies which objects are more popular than others (i.e., requested more often).
The selection of the object to be requested is done in two stages. First, Polygraph determines whether the object should come from a "hot set". That decision is positive with a probability specified by the hot_set_prob field. During the second step, the popularity distribution specified by the pop_distr field is used to select a particular object. If the object is selected among "hot" objects, the selection is limited by the hot set size. Otherwise, the entire working set is used. The hot set size is a fraction of the current working set size specified by the hot_set_frac field.

    PopModel popModel = {
        pop_distr = pmUnif();
        hot_set_frac = 1%;   // hot set is 1/100th of the working set size
        hot_set_prob = 10%;  // every 10th object is requested from the hot set
    };
    Robot R;
    R.pop_model = popModel;

The PopDistr type was called PopModel in Polygraph version 2.2.9 and earlier.
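The two-stage selection can be modeled with a few lines of Python. This is a sketch under loud assumptions: objects are identified by index, the hot set is taken to be the lowest indices of the working set (the text does not say which objects are hot), and `select_object` and `pm_unif` are hypothetical names rather than Polygraph functions.

```python
import random


def pm_unif(limit, rng):
    """Uniform popularity: every object below the limit is equally likely."""
    return rng.randrange(limit)


def select_object(working_set_size, hot_set_frac, hot_set_prob, pop_distr, rng):
    """Pick an object index using the two-stage hot set / popularity scheme."""
    hot_set_size = max(1, int(working_set_size * hot_set_frac))
    # Stage 1: decide whether this request goes to the hot set.
    from_hot_set = rng.random() < hot_set_prob
    # Stage 2: apply the popularity distribution to the chosen group.
    limit = hot_set_size if from_hot_set else working_set_size
    return pop_distr(limit, rng)


rng = random.Random(1)  # fixed seed so that repeated runs agree
picks = [select_object(1000, 0.01, 0.10, pm_unif, rng) for _ in range(5)]
print(picks)
```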
| Type: | Proxy | ||||||
|---|---|---|---|---|---|---|---|
| Is an: | Agent | ||||||
| Fields: | |||||||
| |||||||
| See also: | Agent, use() |
Proxy agent simulates a proxy cache. The client side (i.e., the side that sends requests to and receives replies from the servers) is configured using a Robot agent. Similarly, the server side (i.e., the side that receives requests from and sends replies to clients) is configured using a Server agent. Finally, a proxy may have a cache to store some of the proxied traffic.
The client side attempts to cache every cachable object it fetches. The server side attempts to resolve every request from the cache. See the Cache type description for important caveats of using the cache.
There is no direct connection between ICP ports of the client side and the cache (see Robot and Cache types for the description of those fields). However, in most cases, these two ports should be set to the same value because a real proxy usually sends and receives ICP queries using the same UDP port.
Note that the hosts field of the proxy agent overwrites the hosts fields of client and server configurations. Other fields inherited from the Agent type are currently ignored. The latter is a bug.
Proxies are activated by the polypxy program.
| Type: | Robot | ||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Is an: | Agent | ||||||||||||||||||||||||||||
| Fields: | |||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||
| See also: | Agent, Server, use(), place() |
Derived from the Agent type, Robot (a.k.a. "user" or "client") is the main logical thread of execution in polyclt. Robots submit requests and receive replies. The frequency and nature of the submissions depend on the workload.
The origins field lists addresses of origin servers to be contacted. Launch_win specifies the initial delay for a robot to start (i.e., the delay before the very first request is submitted by a robot).
When req_rate is specified, a robot will emit a Poisson request stream with the specified mean rate, subject to phase load levels. The req_inter_arrival field can be used to specify a request arrival stream other than Poisson. Naturally, the two fields are mutually exclusive.
If neither req_rate nor req_inter_arrival is set, a Robot will use the "best effort" approach, submitting the next request immediately after a reply to the previous request has been received.
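A Poisson request stream has exponentially distributed gaps between consecutive requests. The Python sketch below generates such submission times; it illustrates the statistical model only, the name `poisson_arrivals` is hypothetical, and phase load factors are not modeled.

```python
import random


def poisson_arrivals(req_rate, count, rng):
    """Return `count` submission times (seconds) for a Poisson stream."""
    now = 0.0
    times = []
    for _ in range(count):
        now += rng.expovariate(req_rate)  # exponential gap with mean 1/req_rate
        times.append(now)
    return times


rng = random.Random(42)  # fixed seed so that repeated runs agree
times = poisson_arrivals(10.1, 20000, rng)
mean_gap = times[-1] / len(times)
print(round(1 / mean_gap, 1))  # observed rate, close to the requested 10.1/sec
```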
Public_interest ratio specifies how often a robot would request a URL that is "known" to (and can be requested by) other robots. Robots are usually independent of each other in their actions. However, they may access the same objects on the same servers. If public_interest is zero, a robot would request only "private" objects from all origin servers, resulting in no overlap of URL sets requested by individual robots. Note that both public and private objects can be requested more than once and hence produce a hit.
Recurrence ratio is simply how often a robot should re-visit a URL. In other words, how often a robot should request an object that was accessed before (possibly by other robots). Note that recurrence ratio is usually higher than hit ratio because many objects are uncachable and repeated requests to uncachable objects do not result in a hit.
The embed_recur field specifies the probability of requesting an embedded object when the reference to the latter is found in the reply.
Pop_model specifies which "popularity model" to use when requesting an object that has already been requested before. You must specify a popularity model if you specify positive recurrence.
When the unique_urls flag is set, each request submitted by Polygraph will be for a different URL. Note that this option is applied last and changes a URL without affecting the object id part. Object ids are responsible for generating various object properties. Thus, for filling-the-cache experiments, it may be a good idea to use this option in conjunction with other options like recurrence and public_interest. The latter would generate objects similar to production tests (but with zero hit ratio).
Open_conn_lmt is the maximum number of open connections (in any state, to any server) a robot may have at any given time. A robot will postpone new transactions if the limit is reached. This limit simulates typical behavior of browsers like Netscape Communicator that have a hard limit on the total number of open connections. See Pei Cao's experimental study for more information.
Wait_xact_lmt is only useful when open_conn_lmt is specified. If a robot reaches its open connections limit, it will queue the extra transactions. When the queue length grows beyond wait_xact_lmt, new transactions will simply be ignored (with an appropriate error message).
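The interplay of the two limits can be modeled as a counter plus a bounded queue. The Python sketch below is a simplified illustration, assuming every transaction needs its own connection (persistent connection reuse is ignored); the class and method names are hypothetical.

```python
from collections import deque


class RobotLimits:
    """Toy model of open_conn_lmt and wait_xact_lmt for a single robot."""

    def __init__(self, open_conn_lmt, wait_xact_lmt):
        self.open_conn_lmt = open_conn_lmt
        self.wait_xact_lmt = wait_xact_lmt
        self.open_conns = 0
        self.waiting = deque()

    def submit(self, xact):
        if self.open_conns < self.open_conn_lmt:
            self.open_conns += 1          # a connection slot is available
            return "started"
        if len(self.waiting) < self.wait_xact_lmt:
            self.waiting.append(xact)     # postpone until a connection closes
            return "queued"
        return "ignored"                  # queue is full: the transaction is dropped

    def finish(self):
        """Close one connection and start the oldest waiting transaction, if any."""
        self.open_conns -= 1
        if self.waiting:
            self.waiting.popleft()
            self.open_conns += 1


robot = RobotLimits(open_conn_lmt=2, wait_xact_lmt=1)
outcomes = [robot.submit(n) for n in range(4)]
print(outcomes)
```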
If launch_win is specified, the robot will submit its first transaction within the given time window. The actual time of submission is drawn from a uniform distribution. Launch window is useful to prevent all robots from starting at the same time, creating a big burst of requests.
The peer_icp address enables ICP module of the robot; the robot will send ICP queries for all to-be-requested objects from the icp_port to that address. The peer_http address specifies where to send HTTP queries after an ICP peer returns a hit.
Note that if only the peer_icp address is set, the robot will send ICP queries to the specified address, but will not fetch objects from a peer. Setting only peer_http does not make sense at the time of writing; use the --proxy option instead. At most one ICP and at most one HTTP peer can be configured. Using completely different addresses for the two peers is allowed, but usually does not make sense.
| Type: | Server | ||||||
|---|---|---|---|---|---|---|---|
| Is an: | Agent | ||||||
| Fields: | |||||||
| |||||||
| See also: | Agent, Robot, use(), place() |
Server objects model origin servers in polysrv. Servers receive requests and send replies.
Accept_lmt specifies the limit for consecutive attempts to accept(2) an incoming connection. The attempts are terminated with the first unsuccessful accept() call or when the limit is reached. By default and when the limit is negative, all pending connections are accepted.
Contents is a content selector. It specifies the distribution (or relative popularity) of content types for the server. Each content type must be "accessible". That is, each type must be in the closure of the direct_access selector described below.
Direct_access array specifies what content types can be accessed directly by a robot (i.e., not as an embedded object). The configuration below describes a simplified relationship among the three most popular content types.

    #include "contents.pg"
    Server S = {
        contents = [ cntImage : 70%, cntHTML : 10%, cntOther ];
        direct_access = [ cntHTML : 95%, cntOther ];
    };
| Type: | Socket | ||||
|---|---|---|---|---|---|
| Fields: | |||||
| |||||
| See also: | Agent |
Socket object encapsulates network socket options.
Nagle is an equivalent of the TCP_NODELAY TCP option. See Unix Socket FAQ for more info.
Linger_tout is an equivalent of the SO_LINGER socket option. Undefined linger_tout results in the default TCP behavior. In Poly releases after 1.0p0, setting the timeout to zero is equivalent to setting SO_LINGER to "disabled" (see below).
SO_LINGER controls the action taken when unsent messages are queued on a socket and a close(2) is performed. If SO_LINGER is set, the system will block the process on the close(2) attempt until it is able to transmit the data or until it decides it is unable to deliver the information within the specified timeout. If SO_LINGER is set to "disabled" and a close(2) is issued, the system will process the close in a manner that allows the process to continue as quickly as possible.
$Id: types.sml,v 1.6 2000/06/20 14:04:54 rousskov Exp rousskov $