Memory-optimized hash index architecture

### Configure the hash index bucket

A hash index consists of an array of pointers, and each element of the array is called a hash bucket.

- Each bucket is 8 bytes, which store the memory address of a linked list of key entries.
- Each entry is a value for an index key, plus the address of its corresponding row in the underlying memory-optimized table.
- Each entry points to the next entry in a linked list of entries, all chained to the current bucket.
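The structure described above can be sketched in a few lines of Python. This is only an illustrative model, not the Database Engine's implementation: the names `Entry`, `HashIndex`, `insert`, `lookup`, and `chain_length` are hypothetical, Python's built-in `hash()` stands in for the engine's hash function, and a plain string stands in for a row address. Small integer keys are used so that the collision is predictable in CPython.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Entry:
    key: Any                        # index key value
    row: Any                        # stand-in for the address of the table row
    next: Optional["Entry"] = None  # next entry chained to the same bucket


class HashIndex:
    """Toy model: an array of buckets, each pointing to a chain of entries."""

    def __init__(self, bucket_count: int):
        # Every bucket starts empty (no chain yet).
        self.buckets = [None] * bucket_count

    def _bucket_of(self, key: Any) -> int:
        # Assumption: hash() replaces the engine's real hash function.
        return hash(key) % len(self.buckets)

    def insert(self, key: Any, row: Any) -> None:
        i = self._bucket_of(key)
        # The new entry becomes the head of the chain and links to the old head.
        self.buckets[i] = Entry(key, row, self.buckets[i])

    def lookup(self, key: Any) -> list:
        # Walk the whole chain; colliding keys share it, so compare each key.
        rows = []
        entry = self.buckets[self._bucket_of(key)]
        while entry is not None:
            if entry.key == key:
                rows.append(entry.row)
            entry = entry.next
        return rows

    def chain_length(self, bucket: int) -> int:
        n, entry = 0, self.buckets[bucket]
        while entry is not None:
            n, entry = n + 1, entry.next
        return n


index = HashIndex(bucket_count=4)
index.insert(1, "row-1")
index.insert(5, "row-5")   # 5 % 4 == 1, so this shares bucket 1 with key 1
index.insert(2, "row-2")
```

In this toy model, keys 1 and 5 end up in the same chain, so a lookup for either key has to step over the other one. That is the cost the bucket count is meant to keep small.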

The number of buckets must be specified at index creation time:

- The lower the ratio of buckets to table rows or to distinct values, the longer the average bucket linked list is.
- Short linked lists perform faster than long linked lists.
- The maximum number of buckets in hash indexes is 1,073,741,824.
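The arithmetic behind these points can be made concrete with a rough sketch. The helper names `bucket_array_bytes` and `average_chain_length` are hypothetical, and the memory figure covers only the array of 8-byte buckets, not the entries or the rows themselves.

```python
MAX_BUCKET_COUNT = 1_073_741_824  # maximum number of buckets stated above
BUCKET_SIZE_BYTES = 8             # each bucket is a single 8-byte pointer


def bucket_array_bytes(bucket_count: int) -> int:
    """Memory taken by the bucket array alone, in bytes."""
    if not 1 <= bucket_count <= MAX_BUCKET_COUNT:
        raise ValueError("bucket_count must be between 1 and 1,073,741,824")
    return bucket_count * BUCKET_SIZE_BYTES


def average_chain_length(row_count: int, bucket_count: int) -> float:
    """Average number of entries chained per bucket (rows / buckets)."""
    return row_count / bucket_count


# Halving the bucket count for the same table doubles the average chain length.
print(average_chain_length(2_000_000, 2_000_000))  # 1.0
print(average_chain_length(2_000_000, 1_000_000))  # 2.0
print(bucket_array_bytes(1_048_576))               # 8388608 bytes (8 MiB)
```

The trade-off is visible in the two helpers: more buckets shorten the chains but cost 8 bytes each, whether or not a bucket is ever used.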

The hash function is applied to the index key columns, and the result of the function determines which bucket that key falls into. Each bucket has a pointer to rows whose hashed key values are mapped to that bucket.

The hashing function used for hash indexes has the following characteristics:

- The Database Engine has one hash function that is used for all hash indexes.
- The hash function is deterministic. The same input key value is always mapped to the same bucket in the hash index.
- Multiple index keys might be mapped to the same hash bucket.
- The hash function is balanced, meaning that the distribution of index key values over hash buckets typically follows a Poisson or bell curve distribution, not a flat linear distribution.

A Poisson distribution isn't an even distribution, so index key values aren't evenly distributed across the hash buckets.

If two index keys are mapped to the same hash bucket, there's a hash collision. A large number of hash collisions can have a performance effect on read operations. A realistic goal is for 30 percent of the buckets to contain two different key values.
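To get a feel for how collisions arise under a Poisson distribution, the expected share of buckets holding zero, one, two, or more distinct keys can be estimated from the ratio of distinct keys to buckets. This is a minimal sketch under the assumption that the hash function spreads keys independently and uniformly, so that the number of keys per bucket is approximately Poisson with mean `distinct_keys / bucket_count`. The function names are hypothetical, and the output is a model estimate, not a measurement of a real index.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def bucket_occupancy(distinct_keys: int, bucket_count: int) -> dict:
    """Estimated fraction of buckets holding 0, 1, 2, or 3+ distinct keys."""
    lam = distinct_keys / bucket_count  # average distinct keys per bucket
    empty = poisson_pmf(0, lam)
    one = poisson_pmf(1, lam)
    two = poisson_pmf(2, lam)
    return {
        "empty": empty,
        "one_key": one,
        "two_keys": two,
        "three_or_more": 1.0 - empty - one - two,
    }


# Compare a bucket count equal to the number of distinct keys with one twice as large.
for buckets in (1_000_000, 2_000_000):
    shares = bucket_occupancy(distinct_keys=1_000_000, bucket_count=buckets)
    print(buckets, {name: round(share, 3) for name, share in shares.items()})
```

Running the model for a few candidate bucket counts shows how quickly the share of multi-key buckets falls as buckets are added, which can then be weighed against the collision goal above.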



Tip

To determine the right BUCKET_COUNT for your data, see Configure the hash index bucket.