Key distribution

An indexed storage system uses keys to determine where it should store, or look for, the associated data. One strategy for optimizing access is to spread the data over a number of storage locations. Because each location then holds less data, finding something within a given location is faster. Consider one million records: if they must be scanned one by one, searching for a given record will, in the worst case, require the system to look at all one million elements. If those one million records were instead divided evenly into one thousand groups, being able to limit the search to a single group would reduce that worst case to only one thousand elements.
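The arithmetic above can be checked with a minimal sketch. It assumes integer keys, a linear scan within each location, and a hypothetical placement rule (key modulo the number of groups); none of these details come from the text, they are only chosen to make the comparison concrete.

```python
NUM_RECORDS = 1_000_000
NUM_GROUPS = 1_000


def linear_search(records, key):
    """Scan records in order; return (value, number of records examined)."""
    examined = 0
    for record_key, value in records:
        examined += 1
        if record_key == key:
            return value, examined
    return None, examined


# One flat collection holding every record.
records = [(i, f"value-{i}") for i in range(NUM_RECORDS)]

# The same records spread over groups. The modulo rule is an assumption
# made for this illustration, not a rule prescribed by the text.
groups = [[] for _ in range(NUM_GROUPS)]
for record_key, value in records:
    groups[record_key % NUM_GROUPS].append((record_key, value))

# The last key is the worst case for a scan of the flat collection,
# and it is also the last entry in its group.
worst_key = NUM_RECORDS - 1
_, flat_cost = linear_search(records, worst_key)
_, grouped_cost = linear_search(groups[worst_key % NUM_GROUPS], worst_key)

print(f"flat scan examined {flat_cost} records")
print(f"grouped scan examined {grouped_cost} records")
```

The key itself selects the group, so only that one group has to be scanned. The saving depends on the groups being roughly equal in size.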

The relative probability that a key will direct a search to a given location is known as the "key distribution". With an even distribution, any given key is equally likely to be directed to each location. The data is then spread evenly, no single location grows much larger than the others, and finding data is therefore more efficient.
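One way to make the idea concrete is to estimate the key distribution empirically: place a sample of keys and record the fraction that lands on each location. The sketch below is illustrative only. The key format ("user-0", "user-1", ...), the ten locations, and both placement rules (a SHA-256 based rule and a deliberately poor rule based on key length) are assumptions introduced for this example.

```python
import hashlib
from collections import Counter

NUM_LOCATIONS = 10


def place_by_hash(key):
    """Hypothetical rule: derive the location from a SHA-256 digest of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_LOCATIONS


def place_by_length(key):
    """Hypothetical, deliberately poor rule: use the length of the key."""
    return len(key) % NUM_LOCATIONS


def key_distribution(keys, place):
    """Return the fraction of keys directed to each location."""
    counts = Counter(place(key) for key in keys)
    return [counts.get(loc, 0) / len(keys) for loc in range(NUM_LOCATIONS)]


keys = [f"user-{i}" for i in range(10_000)]

even = key_distribution(keys, place_by_hash)
skewed = key_distribution(keys, place_by_length)

print("hash rule:  ", [round(p, 3) for p in even])
print("length rule:", [round(p, 3) for p in skewed])
```

With these sample keys, the length-based rule directs 90% of the keys to a single location, because 9,000 of the 10,000 keys are nine characters long. A search for most keys therefore lands in one crowded location and gains little from the partitioning. The hash-based rule should give each location a share close to one tenth.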