| PolyDocs: Object Life Cycle Model |
|---|
| Web Polygraph |
This page explains the object life cycle model supported by Web Polygraph. Documentation has been synchronized with Poly 1.2.1.
The object life cycle model simulates object modification, expiration, and similar events in a Web object's ``life''. The model affects the outcome of If-Modified-Since (IMS) requests and of prefetching or validation algorithms that depend on object freshness.
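The model has three components: an object creation time simulator, a modification time simulator, and an expiration time simulator.
All three simulators are described below. Object modification times usually depend on object creation time. Expiration times may depend on object modification times.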
2.1 Object Creation Time
Every Web object is assumed to have been created some time in the past. The birthday is drawn from a random distribution specified with the --obj_bday option. Birthdays can be absolute or relative to program start.
Object creation time corresponds to the middle of the very first life cycle. See the ``Modification Time'' section for a discussion of object life cycles.
2.2 Modification Time
Polygraph assumes that Web objects (entities) have a cyclic life: modifications happen with a certain periodicity. For example, a daily news page may be modified every 24 hours, a personal home page may be stable for a month or so, and a page with old rock group lyrics might remain constant for years. Let us define a cycle as a time period that contains exactly one modification of an object. The cycle period is then the average cycle length.
We further observe that the period of a cycle is object specific. The modification pattern of a given object is usually stable and often independent of other objects.
Clearly, for many objects, modifications do not happen at constant intervals. Polygraph allows you to model the variability of object modification times while keeping the cycle period constant. Variability is expressed as a percentage of the cycle period. Zero percent means no variability: every modification happens exactly at the middle of its cycle. One hundred percent variability means that, within a given cycle, the object may be modified at any time from the beginning to the end of that cycle. Variability higher than 100% indicates a problem at the simulated server: modification events for an object may appear in the wrong order or in the future (from the client's point of view).
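The cycle and variability rules above can be sketched in a few lines of Python. This is an illustration only, not Polygraph's implementation: the function name, its parameters, and the choice of uniform jitter around the middle of a cycle are assumptions made for this example.

```python
import random


def modification_time(birthday, period, variability, cycle, rng):
    """Return the time of the single modification inside one cycle.

    birthday    -- object creation time; the middle of cycle 0
    period      -- cycle period, in seconds
    variability -- fraction of the period (0.0 means 0%, 1.0 means 100%)
    cycle       -- cycle index, 0 for the very first life cycle
    rng         -- random.Random instance; uniform jitter is an assumption
    """
    middle = birthday + cycle * period
    # Jitter spans at most `variability * period`, centered on the middle.
    jitter = (rng.random() - 0.5) * variability * period
    return middle + jitter
```

With `variability` at 0.0 every modification lands exactly on the middle of its cycle. With values up to 1.0 the modification stays inside its own cycle, so consecutive modifications remain ordered; above 1.0 they can overlap with neighboring cycles, which matches the "wrong order" case described above.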
The picture below illustrates the object modification model. Note that we show several degrees of modification time variability, but the simulated variability is, of course, constant for a given object.

Every object has a last modification time that is known to Polygraph. However, real Web servers often do not include the Last-Modified: entity-header field in replies. Polygraph allows you to specify the portion of objects that announce their modification times. For a given object, Polygraph either always includes or always excludes the Last-Modified: field, as a real Web server would.
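One way to obtain an "always includes or always excludes" decision is to derive it from the object's identity rather than from a fresh random draw on each reply. The sketch below is hypothetical: the function name, the use of a string identifier, and the SHA-256 hash are assumptions for illustration and say nothing about how Polygraph itself makes the choice.

```python
import hashlib


def announces_lmt(object_id, portion):
    """Decide whether an object sends Last-Modified, stably per object.

    object_id -- any string that identifies the object (assumed here)
    portion   -- fraction of objects that announce their LMT, e.g. 0.9
    """
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    # Keep 53 bits so the division below is exact and stays in [0, 1).
    position = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    return position < portion
```

Because the answer depends only on `object_id` and `portion`, repeated requests for the same object always get the same treatment, while roughly `portion` of all objects announce their modification times.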
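To summarize, Poly allows you to specify the cycle period (--obj_life_cycle), the variability of modification times within a cycle (--obj_life_cycle_var), and the portion of objects that announce their modification times (--obj_with_lmt).
See examples below.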
2.3 Object Expiration Time
Object expiration time is reported via the Expires: entity-header field. Since Polygraph knows the future modification times of objects, it would be very easy to report precise expiration times, reducing the guesswork for proxies. However, hard-coding this nice algorithm into Polygraph would lead to unrealistic simulations.
Indeed, real Web servers cannot predict future modification times. Hence, in most cases, servers lie about the expiration times of objects. A server generates Expires: fields based on several configuration parameters. Usually, there is a way to tell a server to compute the Expires: value according to one of the following two formulas:
    Expires = last modification time + delta
    Expires = time of access + delta
Using the formulas above, one can request that an object ``expires'' delta seconds after it was last modified or last accessed. The first formula expires all cached copies of a given object at the same absolute time. The second formula expires cached copies when they reach a given ``age'' (after the last revalidation).
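The difference between the two policies is easy to see with numbers. The sketch below assumes that the first formula adds delta to the last modification time and the second adds delta to the time of the request; the function names are made up for this example.

```python
def expires_from_lmt(last_modified, delta):
    """First policy: every copy expires at the same absolute time."""
    return last_modified + delta


def expires_from_access(request_time, delta):
    """Second policy: every copy expires once it reaches age `delta`."""
    return request_time + delta


# Two caches fetch the same object (last modified at t=1000) at t=5000
# and t=6000, with delta = 3600 seconds.
same_deadline = {expires_from_lmt(1000, 3600) for _ in (5000, 6000)}
per_copy_deadlines = [expires_from_access(t, 3600) for t in (5000, 6000)]
```

Under the first policy both copies share one deadline regardless of when they were fetched. Under the second policy each copy gets its own deadline, but both are allowed the same lifetime.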
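The Polygraph server implements both formulas. Using one or two --obj_expire options, one can specify the portion of objects whose expiration time is computed from the last modification time (lmt+delta) and the portion whose expiration time is computed from the time of the request (now+delta), each with its own distribution for delta.
The third ``portion'' (unknown expiration time) simply absorbs whatever is left from the first two. See examples below.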
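The way the third portion absorbs the remainder can be sketched as a simple partition. The function below is hypothetical: it assumes each object has a stable position in the range 0 to 99 and that portions are given as whole percentages; neither detail comes from Polygraph.

```python
def expiration_policy(position, lmt_percent, now_percent):
    """Map an object's position (0..99) to an expiration policy.

    lmt_percent -- share of objects using the lmt-based formula
    now_percent -- share of objects using the now-based formula
    Whatever is left over reports no expiration time at all.
    """
    if lmt_percent + now_percent > 100:
        raise ValueError("portions must not exceed 100 percent")
    if position < lmt_percent:
        return "lmt"
    if position < lmt_percent + now_percent:
        return "now"
    return "unknown"


# Portions of 75% and 10% leave 15% of objects with unknown expiration.
policies = [expiration_policy(p, 75, 10) for p in range(100)]
```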
Object modification times are honestly used when handling If-Modified-Since (IMS) requests. Since all objects have last modification times, Polygraph can generate an appropriate 200 OK or 304 Not Modified response for any IMS request.
For a given object, the presence of the Last-Modified: field in past replies is irrelevant for the 304/200 reply choice. Furthermore, the presence and value of the Expires: field in past replies is also irrelevant. This behavior mimics real world conditions. See the ``Object Expiration Time'' section for details.
Note that the above assumes that generation of object modification times is enabled using the --obj_life_cycle option. Without that option, Polygraph replies with a 200 OK response to any IMS request because the object's last modification time is unknown.
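The reply choice described above comes down to one comparison. This is a minimal sketch, not Polygraph code: the function name is invented, timestamps are plain numbers, and `None` stands for the case where modification times are not being generated.

```python
def ims_status(last_modified, if_modified_since):
    """Pick the status code for an If-Modified-Since request.

    last_modified     -- the object's last modification time, or None
                         when no life cycle model is configured
    if_modified_since -- the timestamp sent by the client
    """
    if last_modified is None:
        # Modification time unknown: always send the full object.
        return 200
    if last_modified > if_modified_since:
        # The object changed after the client's copy was obtained.
        return 200
    return 304
```

Note that the function takes no argument describing earlier Last-Modified: or Expires: headers, reflecting the point that they play no part in the 304/200 choice.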
Here we give several typical applications of the Object Life Cycle model. Polygraph configuration allows for only one model specification. Essentially, only one ``kind'' of server can be simulated per polysrv process. We are working on adding multiple server functionality to Poly. Furthermore, per-MIME-type configuration may also be required.
We show only the server side options that are relevant to this discussion.
E-Zine content is updated every month with low (2%) variability. Most expiration times are easy and safe to predict. Most (75%) content expires after one cycle, and some objects (say ads, 10%) can be cached for about 1 hour. The rest of the objects (100-75-10=15%) have unknown expiration times.
# In lmt+const:1, the 1 means one cycle, not 1 second.
$ polysrv ... \
    --obj_life_cycle const:30day --obj_life_cycle_var 2p \
    --obj_with_lmt 90p \
    --obj_expire 75p=lmt+const:1 \
    --obj_expire 10p=now+norm:1hour,20min
A CNN-like server posts hot news and generates revenue by displaying advertisements. Content is updated sporadically (60% variability) with a 2-hour average life cycle. Life cycles differ a lot from object to object (an exponential distribution with a 2-hour mean is used). Expiration times are mostly unknown (80%) or very conservative.
# In lmt+exp:0.5, the 0.5 means half a cycle, on average.
$ polysrv ... \
    --obj_life_cycle exp:2hour --obj_life_cycle_var 60p \
    --obj_with_lmt 33p \
    --obj_expire 5p=lmt+exp:0.5 \
    --obj_expire 15p=now+const:15min
The PolyMix-1 workload was used during the first bake-off. For cachable objects, the time of last modification was set to about one year before the bake-off date. The expiration time was set to about one year after the bake-off date. The old behavior is no longer the default, but it can be emulated with good accuracy.
# --obj_with_lmt 100p: all objects have the LMT header.
$ polysrv ... \
    --obj_bday const:-1year \
    --obj_with_lmt 100p \
    --obj_life_cycle const:2year \
    --obj_expire 100p=lmt+const:1
Note that here we must use an explicit --obj_bday option. There is no other way to make the timestamps of all objects the same. Fortunately, such similarity is rarely needed.
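The arithmetic behind this configuration can be checked by hand or with a short sketch. The code below assumes zero modification time variability (the command line sets none), so the only past modification sits at the middle of the first cycle, which is the birthday. The function name and the 365-day year are assumptions for illustration.

```python
YEAR = 365 * 24 * 3600  # seconds; a 365-day year is assumed


def polymix1_timestamps(start):
    """Return (last_modified, expires) for the options shown above."""
    birthday = start - 1 * YEAR   # --obj_bday const:-1year
    period = 2 * YEAR             # --obj_life_cycle const:2year
    # With no variability, the first modification is the birthday itself.
    last_modified = birthday
    # --obj_expire 100p=lmt+const:1 adds one full cycle to the LMT.
    expires = last_modified + 1 * period
    return last_modified, expires
```

Under these assumptions the last modification time is one year before program start and the expiration time is one year after it, matching the description of the first bake-off.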
The DataComm-1 workload description contains more information on emulating the first bake-off workload using the Object Life Cycle model.
$Id: objlife.sml,v 1.4 1999/05/14 05:06:42 rousskov Exp $