libdap Updated for version 3.20.11
libdap4 is an implementation of OPeNDAP's DAP protocol.
#include <HTTPCache.h>
Public Member Functions | |
void | lock_cache_interface () |
void | unlock_cache_interface () |
virtual | ~HTTPCache () |
Static Public Member Functions | |
static HTTPCache * | instance (const string &cache_root, bool force=false) |
Friends | |
class | HTTPCacheInterruptHandler |
class | HTTPCacheTest |
class | HTTPConnectTest |
Accessors and Mutators for various properties. | |
string | get_cache_root () const |
void | set_cache_enabled (bool mode) |
bool | is_cache_enabled () const |
void | set_cache_disconnected (CacheDisconnectedMode mode) |
CacheDisconnectedMode | get_cache_disconnected () const |
void | set_expire_ignored (bool mode) |
bool | is_expire_ignored () const |
void | set_max_size (unsigned long size) |
unsigned long | get_max_size () const |
void | set_max_entry_size (unsigned long size) |
unsigned long | get_max_entry_size () const |
void | set_default_expiration (int exp_time) |
int | get_default_expiration () const |
void | set_always_validate (bool validate) |
bool | get_always_validate () const |
void | set_cache_control (const vector< string > &cc) |
vector< string > | get_cache_control () |
bool | cache_response (const string &url, time_t request_time, const vector< string > &headers, const FILE *body) |
void | update_response (const string &url, time_t request_time, const vector< string > &headers) |
bool | is_url_valid (const string &url) |
vector< string > | get_conditional_request_headers (const string &url) |
FILE * | get_cached_response (const string &url, vector< string > &headers, string &cacheName) |
FILE * | get_cached_response (const string &url, vector< string > &headers) |
FILE * | get_cached_response (const string &url) |
void | release_cached_response (FILE *response) |
void | purge_cache () |
Implements a multi-process MT-safe HTTP 1.1 compliant (mostly) cache.
Clients that run as users lacking a writable HOME directory MUST disable this cache. Use Connect::set_cache_enabled(false).
The original design of this class was taken from the W3C libwww software, written by Henrik Frystyk Nielsen, Copyright MIT.
This cache does not implement range checking. Partial responses should not be cached (HFN's version did, but it doesn't mesh well with the DAP for which this is being written).
The cache uses the local file system to store responses. If it is used in an MT application, take care that the number of available file descriptors is not exceeded.
In addition, when used in an MT program, only one thread should use the mutators to set property values. Even though the methods are robust with respect to MT software, having several threads change the values of the cache's properties will lead to odd behavior on the part of the cache. Many of the public methods lock access to the class' interface. This is noted in the documentation for those methods.
Even though the public interface to the cache is typically locked when accessed, an extra locking mechanism is in place for the ‘entries’ that are accessed. If a thread accesses an entry, that response must be locked to prevent it from being updated until the thread tells the cache that it is no longer using it. The get_cached_response() methods lock an entry; use release_cached_response() to release the lock. Entries are locked using a combination of a counter and a mutex. The following methods block when called on a locked entry: is_url_valid(), get_conditional_request_headers(), update_response(). (The locking scheme could be modified so that a distinction is made between reading from and writing to an entry. In this case is_url_valid() and get_conditional_request_headers() would only lock when an entry is in use for writing. But I haven't done that.)
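The counter-plus-mutex entry lock described above can be sketched in standard C++. This is a minimal illustration and not libdap's implementation: the class name EntryLock, its member names, and the use of a condition variable to make writers wait are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical illustration of a counter-plus-mutex entry lock.
// Not libdap code; every name here is invented for the sketch.
class EntryLock {
public:
    // A reader takes the entry (compare get_cached_response()).
    void acquire_read() {
        std::lock_guard<std::mutex> guard(d_mutex);
        ++d_readers;
    }

    // The reader is done with the entry (compare release_cached_response()).
    void release_read() {
        std::lock_guard<std::mutex> guard(d_mutex);
        assert(d_readers > 0 && "release without a matching acquire");
        if (--d_readers == 0)
            d_idle.notify_all();
    }

    // Blocks until no reader holds the entry, then runs `update` while the
    // mutex is held (compare update_response() blocking on a locked entry).
    template <typename Fn>
    void with_exclusive(Fn update) {
        std::unique_lock<std::mutex> lock(d_mutex);
        d_idle.wait(lock, [this] { return d_readers == 0; });
        update();
    }

    // Non-blocking check, in the spirit of purge_cache() refusing to run
    // while any entry is still in use.
    bool in_use() {
        std::lock_guard<std::mutex> guard(d_mutex);
        return d_readers > 0;
    }

private:
    std::mutex d_mutex;
    std::condition_variable d_idle;
    int d_readers = 0;
};
```

In this sketch several readers may hold the same entry at once, and an update waits until the count drops to zero. Whether libdap allows concurrent readers of one entry is not stated here, so treat that as a property of the sketch only.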
Definition at line 103 of file HTTPCache.h.
virtual libdap::HTTPCache::~HTTPCache ()
Destroy an instance of HTTPCache. This writes the cache index and frees the in-memory cache table structure. The persistent cache (the response headers and bodies and the index file) is not removed. To remove it, either erase the directory that contains the cache using a file system command or use the purge_cache() method (which leaves the cache directory structure in place but removes all the cached information).
This class uses the singleton pattern. Clients should never call this method. The HTTPCache::instance() method arranges to call HTTPCache::delete_instance() using atexit(). If delete is called more than once, the result will likely be a corrupt index file.
Definition at line 302 of file HTTPCache.cc.
bool libdap::HTTPCache::cache_response (const string &url, time_t request_time, const vector< string > &headers, const FILE *body)
Add a new response to the cache, or replace an existing cached response with new data. This method returns true if the information for url was added to the cache. A response might not be cacheable; in that case this method returns false. (For example, the response might contain the 'Cache-Control: no-cache' header.)
Note that the FILE *body is rewound so that the caller can re-read it without using fseek or rewind.
If a response for url is already present in the cache, it will be replaced by the new headers and body. To update a response in the cache with new meta data, use update_response().
This method locks the class' interface.
url | A string which holds the request URL. |
request_time | The time when the request was made, in seconds since 1 Jan 1970. |
headers | A vector of strings which hold the response headers. |
body | A FILE * to a file which holds the response body. |
InternalErr | Thrown if there was an I/O error while writing to the persistent store. |
Definition at line 1157 of file HTTPCache.cc.
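The cacheability check mentioned above (a response carrying 'Cache-Control: no-cache' is not stored) might look roughly like the following. This is a sketch under stated assumptions, not the test libdap performs: the helper names to_lower and looks_cacheable are hypothetical, and only the no-cache directive named in the text is examined.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Lower-case copy of s (ASCII only), used for case-insensitive matching.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical helper: returns false when any Cache-Control header carries
// the no-cache directive, and true otherwise.
bool looks_cacheable(const std::vector<std::string> &headers) {
    const std::string name = "cache-control:";
    for (const std::string &raw : headers) {
        const std::string h = to_lower(raw);
        if (h.compare(0, name.size(), name) != 0)
            continue;  // some other header
        if (h.find("no-cache", name.size()) != std::string::npos)
            return false;
    }
    return true;
}
```

A caller would pass the same vector of header strings that it hands to cache_response(), one header per string.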
bool libdap::HTTPCache::get_always_validate () const
Should every cache entry be validated before each use?
Definition at line 850 of file HTTPCache.cc.
vector< string > libdap::HTTPCache::get_cache_control ()
Get the Cache-Control headers.
Definition at line 920 of file HTTPCache.cc.
CacheDisconnectedMode libdap::HTTPCache::get_cache_disconnected () const
Get the cache's disconnected mode property.
Definition at line 676 of file HTTPCache.cc.
string libdap::HTTPCache::get_cache_root () const
Get the current cache root directory.
Definition at line 516 of file HTTPCache.cc.
FILE * libdap::HTTPCache::get_cached_response (const string &url)
Get a pointer to a cached response body. This is a convenience method that calls the three-parameter version of get_cached_response().
This method locks the class' interface.
url | Find the body associated with this URL. |
Error | Thrown if the URL is not in the cache. |
InternalErr | Thrown if an I/O error is detected. |
Definition at line 1552 of file HTTPCache.cc.
FILE * libdap::HTTPCache::get_cached_response (const string &url, vector< string > &headers)
Get information from the cache. This is a convenience method that calls the three-parameter version of get_cached_response().
This method locks the class' interface.
url | Get response information for this URL. |
headers | Return the response headers in this parameter |
Error | Thrown if the URL's response is not in the cache. |
InternalErr | Thrown if the persistent store cannot be opened. |
Definition at line 1535 of file HTTPCache.cc.
FILE * libdap::HTTPCache::get_cached_response (const string &url, vector< string > &headers, string &cacheName)
Get information from the cache. For a given URL, get the headers, cache object name and body stored in the cache. Note that this method increments the hit counter for url's entry and locks that entry. To release the lock, the method release_cached_response() must be called. Methods that block on a locked entry are: get_conditional_request_headers(), update_response() and is_url_valid(). In addition, purge_cache() throws Error if it's called and any entries are locked. The garbage collection system will not reclaim locked entries (but works fine when some entries are locked).
This method locks the class' interface.
This method does not check to see that the response is valid, just that it is in the cache. To see if a cached response is valid, use is_url_valid(). The FILE* returned can be used for both reading and writing. The latter allows a client to update the body of a cached response without having to first dump it all to a separate file and then copy it into the cache (using cache_response()).
url | Get response information for this URL. |
headers | Return the response headers in this parameter |
cacheName | A value-result parameter; the name of the cache file |
Error | Thrown if the URL's response is not in the cache. |
InternalErr | Thrown if the persistent store cannot be opened. |
Definition at line 1481 of file HTTPCache.cc.
vector< string > libdap::HTTPCache::get_conditional_request_headers (const string &url)
Build the headers to send along with a GET request to make that request conditional. This method examines the headers for a given response in the cache and formulates the correct headers for a valid HTTP 1.1 conditional GET request. See RFC 2616, Section 13.3.4.
Rules: If an ETag is present, it must be used. Use If-None-Match. If a Last-Modified header is present, use it. Use If-Modified-Since. If both are present, use both (this means that HTTP 1.0 daemons are more likely to work). If a Last-Modified header is not present, use the value of the Cache-Control max-age or Expires header(s). Note that a 'Cache-Control: max-age' header overrides an Expires header (Sec 14.9.3).
This method locks the cache interface and the cache entry.
url | Get the HTTPCacheTable::CacheEntry for this URL. |
Error | Thrown if the url is not in the cache. |
Definition at line 1250 of file HTTPCache.cc.
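The first rules above (ETag maps to If-None-Match, Last-Modified maps to If-Modified-Since, and both are sent when both are present) can be sketched as follows. This is a simplified illustration, not libdap's code: the function names header_value and conditional_headers are hypothetical, header names are matched case-sensitively, and the max-age/Expires fallback is deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the value of the first header named `name` (matched as the
// literal prefix "name: "), or an empty string if it is absent.
std::string header_value(const std::vector<std::string> &headers,
                         const std::string &name) {
    const std::string prefix = name + ": ";
    for (const std::string &h : headers)
        if (h.compare(0, prefix.size(), prefix) == 0)
            return h.substr(prefix.size());
    return "";
}

// Hypothetical sketch: builds conditional GET headers from the headers
// stored with a cached response.
std::vector<std::string> conditional_headers(
        const std::vector<std::string> &cached) {
    std::vector<std::string> out;
    const std::string etag = header_value(cached, "ETag");
    const std::string modified = header_value(cached, "Last-Modified");
    if (!etag.empty())
        out.push_back("If-None-Match: " + etag);
    if (!modified.empty())
        out.push_back("If-Modified-Since: " + modified);
    return out;
}
```

An empty result in this sketch simply means neither validator was stored with the response; the real method has further fallbacks, as the rules describe.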
int libdap::HTTPCache::get_default_expiration () const
Get the default expiration time used by the cache.
Definition at line 831 of file HTTPCache.cc.
unsigned long libdap::HTTPCache::get_max_entry_size () const
Get the maximum size of an individual entry in the cache.
Definition at line 803 of file HTTPCache.cc.
unsigned long libdap::HTTPCache::get_max_size () const
How big is the cache? The value returned is the size in megabytes.
Definition at line 758 of file HTTPCache.cc.
static HTTPCache * libdap::HTTPCache::instance (const string &cache_root, bool force=false)
Get a pointer to the HTTP 1.1 compliant cache. If not already instantiated, this creates an instance of the HTTP cache object and initializes it to use cache_root as the location of the persistent store. If there's an index (.index) file in that directory, it is read as part of the initialization. If the cache has already been initialized, this method returns a pointer to that instance. Note that HTTPCache uses the singleton pattern; a process may have only one instance of this object. Also note that HTTPCache is MT-safe. However, if the force parameter is set to true, it may be possible for two or more processes to access the persistent store at the same time, resulting in undefined behavior.
Default values: is_cache_enabled(): true, is_cache_protected(): false, is_expire_ignored(): false, the total size of the cache is 20M, 2M of that is reserved for response headers, during GC the cache is reduced to at least 18M (total size - 10% of the total size), and the max size for an individual entry is 3M. It is possible to change the size of the cache, but not to make it smaller than 5M. If expiration information is not sent with a response, it is assumed to expire in 24 hours.
cache_root | The fully qualified pathname of the directory which will hold the cache data (i.e., the persistent store). |
force | Force access to the persistent store if true. By default false. Use this only if you're sure no one else is using the same cache root! This is included so that programs may use a cache that was left in an inconsistent state. |
Error | Thrown if the cache root cannot be set. |
Definition at line 129 of file HTTPCache.cc.
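The singleton-with-atexit arrangement described for instance() and the destructor can be illustrated with a small stand-alone class. This is only a sketch: TinyCache and its members are hypothetical names, the persistent store and the force parameter are omitted, and unlike HTTPCache this version makes no attempt to be thread-safe.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for a singleton cache object; not libdap's code.
class TinyCache {
public:
    // Returns the single instance, creating it on first use. Later calls
    // ignore `root` and return the existing object. Not thread-safe.
    static TinyCache *instance(const std::string &root) {
        if (s_instance == nullptr) {
            s_instance = new TinyCache(root);
            std::atexit(delete_instance);  // clean up once, at process exit
        }
        return s_instance;
    }

    const std::string &root() const { return d_root; }

    TinyCache(const TinyCache &) = delete;
    TinyCache &operator=(const TinyCache &) = delete;

private:
    explicit TinyCache(const std::string &root) : d_root(root) {}
    ~TinyCache() = default;  // private, so clients cannot delete the object

    static void delete_instance() {
        delete s_instance;
        s_instance = nullptr;  // guards against a second delete
    }

    static TinyCache *s_instance;
    std::string d_root;
};

TinyCache *TinyCache::s_instance = nullptr;
```

Making the destructor private is a design choice of this sketch; it enforces at compile time the rule that clients never destroy the instance themselves.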
bool libdap::HTTPCache::is_cache_enabled () const
Is the cache currently enabled?
Definition at line 647 of file HTTPCache.cc.
bool libdap::HTTPCache::is_expire_ignored () const
Definition at line 703 of file HTTPCache.cc.
bool libdap::HTTPCache::is_url_valid (const string &url)
Look in the cache and return the status (validity) of the cached response. This method should be used to determine if a cached response requires validation.
This method locks the class' interface and the cache entry.
url | Find the cached response associated with this URL. |
Error | Thrown if the URL's response is not in the cache. |
Definition at line 1389 of file HTTPCache.cc.
inline void libdap::HTTPCache::lock_cache_interface ()
Definition at line 208 of file HTTPCache.h.
void libdap::HTTPCache::purge_cache ()
Purge both the in-memory cache table and the contents of the cache on disk. This method deletes every entry in the persistent store but leaves the structure intact. The client of HTTPCache is responsible for making sure that all threads have released any responses they pulled from the cache. If this method is called when a response is still in use, it will throw an Error object and not purge the cache.
This method locks the class' interface.
Error | Thrown if an attempt is made to purge the cache when an entry is still in use. |
Definition at line 1601 of file HTTPCache.cc.
void libdap::HTTPCache::release_cached_response (FILE *body)
Call this method to inform the cache that a particular response is no longer in use. When a response is accessed using get_cached_response(), it is locked so that updates and removal (e.g., by the garbage collector) are not possible. Calling this method frees that lock.
This method locks the class' interface.
body | Release the lock on the response information associated with this FILE *. |
Error | Thrown if body does not belong to an entry in the cache or if the entry was already released. |
Definition at line 1572 of file HTTPCache.cc.
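Because every successful get_cached_response() call must be paired with a release_cached_response() call, a small scope guard can make the pairing harder to forget. The class below is a hypothetical helper, not part of libdap: ResponseGuard and its members are invented for this sketch, and the release action is passed in as a callback so the example stays self-contained.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <utility>

// Hypothetical scope guard: runs a release callback on a FILE* when the
// guard goes out of scope. Not part of libdap.
class ResponseGuard {
public:
    using Release = std::function<void(std::FILE *)>;

    ResponseGuard(std::FILE *body, Release release)
        : d_body(body), d_release(std::move(release)) {
        assert(d_release && "a release callback is required");
    }

    ~ResponseGuard() {
        if (d_body == nullptr)
            return;
        try {
            d_release(d_body);
        } catch (...) {
            // Destructors must not throw; a real program would log this.
        }
    }

    ResponseGuard(const ResponseGuard &) = delete;
    ResponseGuard &operator=(const ResponseGuard &) = delete;

    std::FILE *get() const { return d_body; }

private:
    std::FILE *d_body;
    Release d_release;
};
```

With a cache pointer in hand, the callback would presumably be a lambda that forwards to release_cached_response(), so the entry is unlocked even if reading the body throws.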
void libdap::HTTPCache::set_always_validate (bool validate)
Should every cache entry be validated?
validate | True if every cache entry should be validated before being used. |
Definition at line 841 of file HTTPCache.cc.
void libdap::HTTPCache::set_cache_control (const vector< string > &cc)
Set the request Cache-Control headers. If a request must be satisfied using HTTP, these headers should be included in the request since they might be pertinent to a proxy cache.
Ignored headers: no-transform, only-if-cached. These headers are not used by HTTPCache and are not recorded. However, if present in the vector passed to this method, they will be present in the vector returned by get_cache_control.
This method locks the class' interface.
cc | A vector of strings, each string holds one Cache-Control header. |
InternalErr | Thrown if one of the strings in cc does not start with 'Cache-Control: '. |
Definition at line 872 of file HTTPCache.cc.
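The input format this method expects (one complete 'Cache-Control: ' header per string, with no-transform and only-if-cached passed over) can be sketched like this. The function name recognized_directives is hypothetical, and std::invalid_argument stands in for libdap's InternalErr so that the example compiles on its own.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch: checks that every string is a Cache-Control header
// and returns the directives that are not on the ignored list.
std::vector<std::string> recognized_directives(
        const std::vector<std::string> &cc) {
    const std::string prefix = "Cache-Control: ";
    std::vector<std::string> out;
    for (const std::string &h : cc) {
        if (h.compare(0, prefix.size(), prefix) != 0)
            throw std::invalid_argument(
                "Expected a Cache-Control header, got: " + h);
        const std::string directive = h.substr(prefix.size());
        if (directive == "no-transform" || directive == "only-if-cached")
            continue;  // ignored by the cache, per the description above
        out.push_back(directive);
    }
    return out;
}
```

Note that the prefix includes the trailing space, matching the 'Cache-Control: ' form quoted in the exception description.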
void libdap::HTTPCache::set_cache_disconnected (CacheDisconnectedMode mode)
Set the cache's disconnected property. The cache can operate either disconnected from the network or using a proxy cache (but tell that proxy not to use the network).
This method locks the class' interface.
mode | One of DISCONNECT_NONE, DISCONNECT_NORMAL or DISCONNECT_EXTERNAL. |
Definition at line 664 of file HTTPCache.cc.
void libdap::HTTPCache::set_cache_enabled (bool mode)
Enable or disable the cache. The cache can be temporarily suspended using the enable/disable property. This does not prevent the cache from being enabled or disabled at a later point in time.
Default: yes
This method locks the class' interface.
mode | True if the cache should be enabled, False if it should be disabled. |
Definition at line 635 of file HTTPCache.cc.
void libdap::HTTPCache::set_default_expiration (int exp_time)
Set the default expiration time. Use the default expiration property to determine when a cached response becomes stale if the response lacks the information necessary to compute a specific value.
Default: 24 hours (86,400 seconds)
This method locks the class' interface.
exp_time | The time in seconds. |
Definition at line 819 of file HTTPCache.cc.
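How a default expiration fits into a freshness calculation can be sketched along the lines of RFC 2616: a max-age value takes precedence, otherwise Expires minus Date is used, and only when neither is available does the default apply. The function below is an illustration, not libdap's code; the name freshness_lifetime, the ABSENT sentinel, and the use of plain long values in seconds are assumptions of the sketch.

```cpp
#include <cassert>

// Sentinel meaning "this header was not present" (an assumption of the sketch).
constexpr long ABSENT = -1;

// Returns the freshness lifetime in seconds.
//   max_age : value of Cache-Control max-age, or ABSENT
//   expires : Expires header as seconds since 1 Jan 1970, or ABSENT
//   date    : Date header as seconds since 1 Jan 1970, or ABSENT
long freshness_lifetime(long max_age, long expires, long date,
                        long default_expiration = 86400) {
    assert(default_expiration >= 0);
    if (max_age != ABSENT)
        return max_age;         // max-age overrides Expires
    if (expires != ABSENT && date != ABSENT)
        return expires - date;  // may be negative: already stale
    return default_expiration;  // no expiration information was sent
}
```

The default argument of 86400 mirrors the 24 hour default quoted above.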
void libdap::HTTPCache::set_expire_ignored (bool mode)
How should the cache handle the Expires header? Default: no
This method locks the class' interface.
mode | True if a response's Expires header should be ignored, False otherwise. |
Definition at line 690 of file HTTPCache.cc.
void libdap::HTTPCache::set_max_entry_size (unsigned long size)
Set the maximum size for a single entry in the cache.
Default: 3M
This method locks the class' interface.
size | The size in megabytes. |
Definition at line 772 of file HTTPCache.cc.
void libdap::HTTPCache::set_max_size (unsigned long size)
Cache size management. The default cache size is 20M. The minimum size is 5M in order not to get into weird problems while writing the cache. The size is given in megabytes. Note that reducing the size of the cache may trigger a garbage collection operation.
This method locks the class' interface.
size | The maximum size of the cache in megabytes. |
Definition at line 724 of file HTTPCache.cc.
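The size arithmetic quoted in this reference (a 5M minimum, and garbage collection reducing the cache to the total size minus 10%) works out as in the sketch below. The names MEGA, MIN_CACHE_MB, CacheSizes and sizes_for are hypothetical, as are the choice of 1M = 2^20 bytes and the decision to clamp a too-small request up to the minimum rather than reject it.

```cpp
#include <cassert>

// Assumption of the sketch: one "M" is 2^20 bytes.
constexpr unsigned long MEGA = 1024UL * 1024UL;
// Minimum total cache size in megabytes, from the text above.
constexpr unsigned long MIN_CACHE_MB = 5;

struct CacheSizes {
    unsigned long total_bytes;      // maximum size of the whole cache
    unsigned long gc_target_bytes;  // size a collection pass aims for
};

// Converts a requested size in megabytes to byte limits, raising requests
// below the minimum to the minimum (an assumption of this sketch).
CacheSizes sizes_for(unsigned long size_mb) {
    if (size_mb < MIN_CACHE_MB)
        size_mb = MIN_CACHE_MB;
    const unsigned long total = size_mb * MEGA;
    return CacheSizes{total, total - total / 10};
}
```

For the default of 20M this gives a collection target of 18M, which matches the figure in the description of instance().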
inline void libdap::HTTPCache::unlock_cache_interface ()
Definition at line 213 of file HTTPCache.h.
void libdap::HTTPCache::update_response (const string &url, time_t request_time, const vector< string > &headers)
Update the meta data for a response already in the cache. This method provides a way to merge response headers returned from a conditional GET request, for the given URL, with those already present.
This method locks the class' interface and the cache entry.
url | Update the meta data for this cache entry. |
request_time | The time (Unix time, seconds since 1 Jan 1970) that the conditional request was made. |
headers | New headers, one header per string, returned in the response. |
Error | Thrown if the url is not in the cache. |
Definition at line 1320 of file HTTPCache.cc.
friend class HTTPCacheInterruptHandler
Definition at line 143 of file HTTPCache.h.
friend class HTTPCacheTest
Definition at line 140 of file HTTPCache.h.
friend class HTTPConnectTest
Definition at line 141 of file HTTPCache.h.