Week 12

 

Quiz

 

Homework

o       Who is planning on using input data file and standard format?

o       Agree no changes without everyone’s ok

 

Trees – cont.

o       Algorithm for node deletion

o       Show example program

 

Searching

        o      Very common operation

        o      Linear searches

o       Go down list sequentially

o       Best case time O(1) – don’t really factor these

o       Worst case O(n)

o       Average case

o       On average n*(n+1)/2 divided by n = O(n)

      o      Binary searches

o       Must have ordered list/array

o       Best case O(1)

o       Worst case O(log n)  (really O(log n+1)

o       Average case

o       To motivate, show a binary tree and note that most of the nodes are in the bottom two depths, so average is O(log n)

o      Hash tables

o       Problem: have data to search for but not ‘contiguous’

o       E.g. part numbers  (book uses 000, 100, 200, 300, …, 4900)

o       Want to store in an array

o       Define a hash function that operates on a key

o       Hash function returns an index into the hash array

§         Must return valid index

o       Collisions occur when a two keys return the same hash value

§         In example above, 350, 325 may both return 3

o       Resolve collision by storing object in the next open array index

§         open-address hashing  or open addressing

§         when go off the end of the array, must go to 0

o       Show a high level algorithm

o       Non-integer keys

§         Have function that converts key to integer, then hash

§         Java has .hashCode() method for all objects

o       Abstract data type hashtable

§         Methods

·        Constructor with the size

·        numfree()  (book calls capacity)

·        getsize()

·        put(key, element)

o       replaces objects with same key

·        containskey(key)

·        get(key)

·        remove(key)

§         definition

public class hashtable

{

      private Object[] data;

      private Object[] keys;

      private int manyitems;

      private boolean[] hasbeenused;

}

§         private methods

·        hash(key)

·        nextIndex(i)

·        findIndex(key)

o       like get but returns an index

o       returns -1 if can’t find

§         do hash method for put()

·        first use findIndex() to see if already there

·        use if (manyitems < data.length) to determine if there’s space

§         do remove, using findIndex()

·        return the object and set the array values = null

§         do findIndex

·        basically hash and then go through each item sequentially as long has hasbeenused is true

·        ask why use hasbeenused rather than simply empty position? – (hint, some item may have been removed!)

·        need to keep counter so that can test if < data.length for case when table is full (so don’t loop forever)

o       Choosing a hash function

§         Want to avoid clustering (lots of collisions)

§         Division hash function (e.g. mod) want size = prime # = 4k +3

§         Mid-square hash (times itself, take middle digits

§         Multiplicative hash (time #<1, take first n digits)

§         Avoiding collisions:

·        Double hashing

·        Compute a hash2 to be the index increment to try next

·        Must avoid coming  back to same spot

·        See text for using prime numbers

·        Chained hashing

o       Create a linked list at each hash index!

o       Run-time analysis

§         See text p. 559

§         Run time is proportional to the load factor or percent full

 

In class (perhaps)