FM Cache (Work in Progress)

Frank Mitchell

Posted: 2023-02-11
Word Count: 3448
Tags: java programming

Table of Contents

Origins

When I worked at Hotel.com my manager asked me to write a simple cache library. Assembling all the information about a hotel – description, amenities, price quotes, etc. – too a lot of time, so it would speed things up, and ease the burden on databases and SOAP interfaces, to keep the results around. But not too long: price and availability change in an instant, and even though we doublecheck when someone tries to book a hotel we don’t want the information to get too stale.

So I wrote one, and it was fairly successful until we all got laid off. If I had to guess, I’d attribute it to these factors:

When scraping the rust off my coding skills, I decided to reproduce this library, but with a few “improvements”:

With these thoughts, I began with something almost like hope.

Design

Right now, I’m targeting Java 8. I should probably move to 11 at least, but the last time I tried to set up modules I ended up locking out my test code, so I’ll wait on that.

In any case, I settled on a few key abstractions, represented by six interfaces below. Four of them pertain to the mechanics of caching:

The other two concern how clients find the right cache, and how the parameters update:

Implementation

Reality

… is ongoing.

As of this writing (2023-02-11) I’m blocked just trying to get caches to expire entries in a unit test. It’s either a problem with the unit test or the two algorithms in the two Cache implementations I’m using to do it. One is too slow to use in production – walk through every entry in the cache – but it should work

The Plan

Once I figure out what stupid mistake I made, this is the plan:

  1. Continue to test caching scenarios.
  2. Test the CacheManager implementation, to make sure it all works smoothly.
  3. Implement a Configuration that watches a properties file. (Or JSON or YAML or ELTN, but watching the file is the easy part.)
  4. Implement a Configuration that opens an HTTP port to manage the cache. Possible implementations include:
    • A bare-bones HTML page with forms to clear caches, set parameters, and refresh the view automatically with JavaScript.
    • A template-based response to vend the same essential functions in a customizable HTML, XML, or other response format.
    • JSON-RPC queries and responses, maybe to implement the previous two.

Things I’m not going to do unless someone asks or I feel the need:

API

Cache

package com.frank_mitchell.cache;

import java.util.Collection;
import java.util.Map;
import java.util.Set;

/**
 * Instances cache a specified number of values by key for a specified time.
 * Entries are evicted when the entry has not been updated or accessed
 * in a specified interval, or when the cache is beyond its maximum size.
 * The cache will quietly remove expired entries, then remove the oldest
 * or least used (depending on a flag) until it's within its maximum size.
 * Removed items simply disappear; no query will show they were ever added.
 *
 * Implementations can either use timers to remove items in a timely manner
 * or lazily remove items if they've expired or if a new item would push
 * the cache size beyond its maximum.  Performance issues aside,
 * as long as the user doesn't <strong>see</strong> expired items and
 * more items than the maximum, they were never there.
 * 
 * @param <K> type of keys in cache
 * @param <V> type of values in cache
 */
public interface Cache<K,V> extends Map<K,V> {
    /**
     * The cache's contents.
     * This Set behaves identically to {#link Map#entrySet()};
     * changing the results or its contents <strong>does</strong> change
     * the underlying cache.
     * 
     * This operation may require creating intermediate objects or
     * maintaining a lot of state.  It's included mainly for debugging
     * purposes, either by developers or operations personnel.
     *
     * @return cache entries
     * @see Map#entrySet()
     */
    public Set<CacheEntry<K,V>> cacheEntrySet();

     /**
     * A partial snapshot of the cache's contents.
     * Unlike {#link Map#entrySet()} changing this method does not change
     * the underlying cache.
     * 
     * This operation may require creating intermediate objects or
     * maintaining a lot of state.  It's included mainly for debugging
     * purposes, either by developers or operations personnel.
     *
     * @return collection of cache views
     */
    public Collection<CacheView<K,V>> cacheViews();

   /**
     * Remove all expired entries from the cache.
     *
     * If the implementation uses timers and threads to expire items,
     * this method may do nothing.
     * If it removes objects lazily, this forces it to remove references
     * to expired items, which may reduce memory.
     */
    public void clearExpired();
    
    /**
     * Provide an interface to update the cache's parameters.
     * Changes to the return value <strong>directly</strong>
     * updates the cache's state.
     * 
     * @return the interface to update the cache's parameters
     */
    public CacheParameters getParameters();
}

CacheEntry

package com.frank_mitchell.cache;

import java.time.Instant;
import java.util.Map;

/**
 * An entry in a {@link Cache}, including access and update times.
 * The access time changes whenever the value of an entry is read,
 * and the update time changes whenever the value of an entry is written.
 * 
 * @author Frank Mitchell
 * @param <K> key type
 * @param <V> value type
 */
public interface CacheEntry<K, V> extends Map.Entry<K, V> {

    /**
     * Time entry was last accessed.
     * This is defined as calling the {@link Map.Entry#getValue()} method.
     *
     * @return timestamp of last access
     */
    Instant getAccess();

    /**
     * Time entry was last updated.
     * This is defined as calling the {@link Map.Entry#setValue()} method.
     *
     * @return timestamp of last update
     */
    Instant getUpdate();
}

CacheView

package com.frank_mitchell.cache;

import java.time.Instant;

/**
 * An interface providing a "safe" read-only facade to inspect a cache.
 * 
 * @author Frank Mitchell
 * @param <K> key type
 */
public interface CacheView<K,V> {
    /**
     * The key in the cache for this entry.
     * 
     * @return the key in the cache
     * @see Map.Entry#getKey()
     */
    K getKey();

    /**
     * A "safe" look in the cache for this entry.
     * This call will <strong>not</strong> update {@link #getAccess()}.
     * 
     * @return the key in the cache
     * @see Map.Entry#getValue()
     */
    V getValueSnapshot();

    /**
     * Time entry was last accessed.
     * This is defined as calling the {@link Map.Entry#getValue()} method.
     *
     * @return timestamp of last access
     */
    Instant getAccess();

    /**
     * Time entry was last updated.
     * This is defined as calling the {@link Map.Entry#setValue()} method.
     *
     * @return timestamp of last update
     */
    Instant getUpdate();
}

CacheParameters

package com.frank_mitchell.cache;

import java.time.Duration;

/**
 * Represents the configurable parameters of a {@Link Cache} instance.
 */
public interface CacheParameters {

    /**
     * Whether this instance is disabled.
     * A disabled cache contains no entries and accepts none.
     * @return whether cache is disabled
     */
    public boolean isDisabled();

    /**
     * Whether this instance is enabled.
     * 
     * @return whether cache is not disabled
     * @see #isDisabled() 
     */
    default public boolean isEnabled() {
        return !isDisabled();
    }
    
    /**
     * Set whether this cache is disabled.
     * 
     * @param disabled whether this cache should be enabled.
     * @see #isDisabled() 
     */
    public void setDisabled(boolean disabled);
    
    /**
     * Set whether this cache is enabled.
     * 
     * @param enabled whether this cache should be enabled.
     * @see #isDisabled() 
     */
    default public void setEnabled(boolean enabled) {
        setDisabled(!enabled);
    }

    /**
     * Whether this cache was configured to drop the least recently read
     * entry.  The default is to drop the least recently written entry.
     *
     * @return whether dropping least recently read.
     */
    public boolean isLastAccessedDroppedFirst();

    /*
     * Set whether the cache drops the least recently accessed entry.
     * The default is to drop the least recently updated entry.
     */
    public void setLastAccessedDroppedFirst(boolean value);

    /**
     * The maximum interval between now and an entry's last access time 
     * before an entry will be removed.
     *
     * @return maximum read interval
     */
    public Duration getLastAccessLimit();

    /**
     * Set the maximum interval between now and an 
     * entry's last access time before an entry will be removed.
     * 
     * @param value new maximum interval
     */
    public void setLastAccessLimit(Duration value);

    /**
     * The maximum interval between now and an entry's last update time 
     * before an entry will be removed.
     *
     * @return maximum write interval
     */
    public Duration getLastUpdateLimit();

    /**
     * Set maximum interval between now and an entry's last update time 
     * before an entry will be removed.
     * 
     * @param value new maximum interval
     */
    public void setLastUpdateLimit(Duration value);

    /**
     * The maximum number of entries in the cache before it drops an
     * old one.  "Old" depends on whether the cache deletes by read time
     * or write time.
     *
     * @return maximum number of cache entries.
     */
    public int getMaximumSize();

    /**
     * Set the maximum number of entries in the cache before it drops an
     * old one.  "Old" depends on whether the cache deletes by read time
     * or write time.
     * 
     * @param value new maximum size
     */
    public void setMaximumSize(int value);
}

CacheManager

package com.frank_mitchell.cache;

import java.util.*;

/**
 * A CacheManager manages a collection of {@link Cache}s by name.
 */
public interface CacheManager {

    /**
     * Get a cache named <i>name</i> that maps <i>keys</i> to <i>values</i>.
     *
     * @param <K> the type of keys
     * @param <V> the type of values
     * @param name name of a new or existing cache
     * @param keys the class for keys
     * @param values the class for values
     * @return a cache that stores the specified classes
     * @throws IllegalArgumentException if either class argument is missing
     *   or inconsistent with the method's last invocation for <i>name</i>.
     */
    public <K,V> Cache<K,V> getCache(String name, Class<K> keys, Class<V> values)
        throws IllegalArgumentException;

    /**
     * Get a list of all cache names in use.
     * 
     * @return a set of cache names
     */
    public SortedSet<String> getCacheList();

    /**
     * Get a set of configurations used by this factory.
     * 
     * @return a set of configurations.
     */
    public CacheConfiguration[] getConfigurations();

    /**
     * Register a new configuration.
     * The implementer should call either
     * {@link CacheConfiguration#addClient(CacheManager)} or
     * {@link CacheConfiguration#addClient(CacheManager, java.lang.String)}
     * to register itself for updates.
     * 
     * @param cf configuration to add.
     */
    public void addConfiguration(CacheConfiguration cf);

    /**
     * Remove a configuration.
     * The implementer should call 
     * {@link CacheConfiguration#removeClient(CacheManager)}
     * to register itself for updates.
     * 
     * @param cf configuration to remove.
     */
    public void removeConfiguration(CacheConfiguration cf);

    /**
     * Get the named cache to update its parameters.
     * If that cache doesn't exist yet, the implementer should either create it
     * on the fly or create a dummy configuration for when it does exist.
     * 
     * @param name
     * @return a parameters object
     */
    public CacheParameters getParameters(String name);
}

CacheConfiguration

package com.frank_mitchell.cache;

import java.util.*;

/**
 * A CacheConfiguration watches a specific resource for changes, then
 * reconfigures {@link Cache}s based on those changes.
 * In the most common use case a CacheConfiguration monitors a configuration
 * file.  When the file contents change, it rereads the file and configures
 * caches based on those changes.
 *
 * The application may use one configuration per file, network connection,
 * etc. and configuring caches may require multiple files.
 * In the general case, then, multiple configurations may support
 * a single {@link CacheManager}.
 * Likewise, while one factory per application may be the simplest option,
 * nothing in the API mandates that.  Different subsystems may allocate
 * their own factories, yet all may use the same external resources.
 *
 * In the general case, then, multiple factories may rely on multiple
 * configurations, and update multiple caches with the same name.
 * To avoid this last case, factories may register with a prefix.
 * For example if a factory has a prefix "foo" and the configuration
 * sees a change in "foobar", the configuration should only update
 * cache "bar" for factory "foo".
 *
 * Implementations must avoid changing cache parameters unnecessarily,
 * perhaps by retaining the previous values and only reconfiguring a cache
 * if its values change.
 * This is not a hard-and-fast rule, however.  Certain file formats
 * that describe object graphs may unavoidably remove and recreate
 * caches.
 */
public interface CacheConfiguration {

    /**
     * Get the current list of {@Link CacheManager} instances relying on this
     * instance.
     *
     * @return set of clients being watched
     */
    public Set<CacheManager> getClients();
    
    /**
     * Whether this client is currently in the configuration's client list.
     * 
     * @param client potential client
     * @return whether client is in list
     */
    public boolean hasClient(CacheManager client);

    /**
     * Get the client for this prefix.
     * 
     * @param prefix
     * @return the client or <code>null<code> if no such client
     */
    public CacheManager getClientByPrefix(String prefix);

    /**
     * Add a client with a zero-length prefix.
     * 
     * @param client 
     */
    default public void addClient(CacheManager client) {
        addClient(client, "");
    }

    /**
     * Add a client with a zero-length prefix.
     * 
     * @param client client for this configuration
     * @param prefix prefix to distinguish multiple caches with similar names
     */
    public void addClient(CacheManager client, String prefix);

    /**
     * Remove a client from the client list.
     * 
     * @param client client to remove
     */
    public void removeClient(CacheManager client);

    public void refresh();
}

MIT License

Copyright 2023 Frank Mitchell

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


  1. I can’t remember what it was called, but it was a big deal in the Java / server management world. ↩︎

  2. Which isn’t a word, as my IDE informed me. Neither, oddly enough, is Mitchell. ↩︎

  3. Java 5 introduced generics and a few other things, and we were using the Java 6 VM by the end. Yet we stuck with Java 4 or less. Por que? Well, we couldn’t use generics because of all our legacy code; once code is compiled all that’s left are some class casts and assumptions. We couldn’t use Annotations for similar reasons (not that I’m a fan). Reflection was too complicated for our junior programmers to maintain. We couldn’t use code generation to create myriad Data Transfer Objects (similar to records in modern Java) even though our architecture demanded them because our build process was too convoluted. And so on. No wonder some Hungarians rewrote our entire site in a few months. ↩︎

  4. Table in Lua, Object in JavaScript, Dictionary in Python, Hash in Ruby, etc. etc. ↩︎

  5. In the current implementation, each name maps to only one cache, and to one pair of key and value types. It denies requests for the same name with different types than last time. (Or maybe silently hands them a dummy cache. I’m not sure.) Someone will have to explain why two systems – two programmers – would use the same name to cache different things as opposed to, I don’t know, picking different names between them. At this point in my life I won’t enable teams that can’t talk to each other. ↩︎

  6. This is just like Java Bean properties, except that the beans have a name, not just an in-memory identity. I may use bean properties in the implementation, but so far I’m not there yet. ↩︎