🍁
Data Structures and Algorithms
  • Introduction
  • Introduction to Algorithms Analysis
    • Growth Rates
    • Big-O, Little-o, Theta, Omega
    • Analysis of Linear Search
    • Analysis of Binary Search
  • Recursion
    • The runtime stack
    • How to Write a Recursive Function
      • Example: the Factorial Function
    • Drawbacks of Recursion and Caution
  • Lists
    • Implementation
    • Linked List
      • Nodes
      • Iterator
      • Template Singly Linked List
      • Doubly Linked List
      • Circular Linked List
  • Stacks
    • Stack Operations
    • Stack Implementations
    • Stack Applications
  • Queue
    • Queue Operations
    • Queue Implementations
    • Queue Applications
  • Tables
    • Simple Table
    • Hash Table
      • Bucketing
      • Chaining
      • Linear Probing
      • Quadratic Probing and Double Hashing
  • Sorting
    • Simple Sorts
      • Bubble Sort
      • Insertion Sort
      • Selection Sort
    • Merge Sort
      • Merge Sort Implementation
    • Quick Sort
    • Heap Sort
      • Binary heap
      • Binary heap basics
      • Insertion into a binary heap
      • Delete from a binary heap
      • Implementation
      • Sorting
  • Introduction to Trees, Binary Search Trees
    • Definitions
    • Tree Implementations
    • Binary Trees
    • Binary Search Trees
      • Insertion
      • Removal
      • Traversals
  • AVL Trees
    • Height Balance
    • Insertion
    • Why it works
  • Red Black Trees
    • Insertion Example
  • 2-3 Trees
  • Graphs
    • Representation
  • Complexity Theory
  • Appendix: Mathematics Review
Powered by GitBook
On this page
  • Hash Functions
  • Load Factor
  • Collisions

Was this helpful?

  1. Tables

Hash Table

A hash table uses the key of each record to determine the location in an array structure. To do this, the key is passed into a hash function which will then return a numeric value based on the key.

Hash Functions

A hash function must be designed so that given a certain key it will always return the same numeric value. Furthermore, the hash function will ideally distribute all possible keys in the keyspace uniformly over all possible locations.

For example suppose that we wanted to create a table for storing customer information at store. For the key, a customer's telephone number is used. The table can hold up to 10,000 records and thus valid indexes for an array of that size would be [0 - 9999]

Telephone numbers are 10 digits (###) ###-####

The first 3 of which is an area code.

Now, if your hash function was: use the first 4 digits of the phone number (area code + first digit of number) that hash function would not be very good because most people in the same area would have the same area code. Most people in the Toronto for example have area code of 416 or 647... so there would be very little variation in the records. However the last 4 digits of a phone number is much more likely to be different between users.

Generally speaking a good hash function should be:

  • uniform (all indices are equally likely for the given set of possible keys)

  • random (not predictable)

Load Factor

The load factor denoted by the symbol λ\lambdaλ measures the fullness of the hash table. It is calculated by the formula:

λ=number of records in tablenumber of locations.\lambda = {number~of~records~in~table \over number~of~locations}.λ=number of locationsnumber of records in table​.

Collisions

A hash function translates all possible keys into indexes in an array. This typically means that there are many many more keys than there are indexes. Thus, it is always possible that two or more keys will be translated into exactly the same index. When this happens you have a collision. Generally speaking, collisions are unavoidable. The key is to have a method of resolving them when they do happen. The rest of this chapter look at different ways to deal with collisions.

PreviousSimple TableNextBucketing

Last updated 5 years ago

Was this helpful?