week 12

Week 12

Quiz

Homework

o Who is planning on using input data file and standard format?

o Agree no changes without everyone’s ok

Trees – cont.

o Algorithm for node deletion

o Show example program

Searching

o Very common operation

o Linear searches

o Go down list sequentially

o Best case time O(1) – don’t really factor these

o Worst case O(n)

o Average case

o On average n*(n+1)/2 divided by n = O(n)

o Binary searches

o Must have ordered list/array

o Best case O(1)

o Worst case O(log n) (really O(log n+1)

o Average case

o To motivate, show a binary tree and note that most of the nodes are in the bottom two depths, so average is O(log n)

o Hash tables

o Problem: have data to search for but not ‘contiguous’

o E.g. part numbers (book uses 000, 100, 200, 300, …, 4900)

o Want to store in an array

o Define a hash function that operates on a key

o Hash function returns an index into the hash array

§ Must return valid index

o Collisions occur when a two keys return the same hash value

§ In example above, 350, 325 may both return 3

o Resolve collision by storing object in the next open array index

§ open-address hashing or open addressing

§ when go off the end of the array, must go to 0

o Show a high level algorithm

o Non-integer keys

§ Have function that converts key to integer, then hash

§ Java has .hashCode() method for all objects

o Abstract data type hashtable

§ Methods

· Constructor with the size

· numfree() (book calls capacity)

· getsize()

· put(key, element)

o replaces objects with same key

· containskey(key)

· get(key)

· remove(key)

§ definition

public class hashtable

{

private Object[] data;

private Object[] keys;

private int manyitems;

private boolean[] hasbeenused;

}

§ private methods

· hash(key)

· nextIndex(i)

· findIndex(key)

o like get but returns an index

o returns -1 if can’t find

§ do hash method for put()

· first use findIndex() to see if already there

· use if (manyitems < data.length) to determine if there’s space

§ do remove, using findIndex()

· return the object and set the array values = null

§ do findIndex

· basically hash and then go through each item sequentially as long has hasbeenused is true

· ask why use hasbeenused rather than simply empty position? – (hint, some item may have been removed!)

· need to keep counter so that can test if < data.length for case when table is full (so don’t loop forever)

o Choosing a hash function

§ Want to avoid clustering (lots of collisions)

§ Division hash function (e.g. mod) want size = prime # = 4k +3

§ Mid-square hash (times itself, take middle digits

§ Multiplicative hash (time #<1, take first n digits)

§ Avoiding collisions:

· Double hashing

· Compute a hash2 to be the index increment to try next

· Must avoid coming back to same spot

· See text for using prime numbers

· Chained hashing

o Create a linked list at each hash index!

o Run-time analysis

§ See text p. 559

§ Run time is proportional to the load factor or percent full

In class (perhaps)

Split class into groups of 3
Assignment is to implement the following problem
- They may meet for n minutes initially and then must work independently until they have code to integrate and test
- Each person must have code to produce
- If a problem is found and need to communicate, must escalate to me
Problem:
- Create a hash table to store student names and grades
- Driven by applet with buttons for
  - Insert
  - Find
  - Remove
  - Print table (should print out <empty> if nothing there)