Chapter 1: Arrays and Strings

Part of a series covering topics from Cracking the Coding Interview – this time we’ll look at Arrays and Strings.

Hash Tables

So let’s start with Hash Tables.

A hash table (or hash map) is a Data Structure that maps keys to values by using the key of each record/value to determine the location of the record/value in an array. A hash table uses a hash function to compute an index (since the key is passed into a hash function which returns a  linked number)

Swift has a built-in version – the Dictionary class.

Kotlin has the HashMap class.

Hash Table = collection of key-value pairs stored as an array indexed with the hash values of its keys.

How Do Hash-Tables Work? and what do we do about Collisions?

Hash Function?

“A hash function must be designed so that given a certain key it will always return the same numeric value. Furthermore, the hash function will ideally distribute all possible keys in the keyspace uniformly over all possible locations”(Ref #: C).

A good hash function should typically be uniform and random, meaning that all the indices are made equally likely for the given set of possible keys, and that also these are not predictable (Ref#: C).

HashFunction

 

 

 

 

 

Collisions

“A hash function translates all possible keys into indexes in an array. This typically means that there are many many more keys than there are indexes. Thus, it is always possible that two or more keys will be translated into exactly the same index. When this happens you have a collision. Generally speaking, collisions are unavoidable. The key is to have a method of resolving them when they do happen” (Ref #: C).

So, since hash tables can have collisions, how do we accommodate for this possibility?

Chaining with Linked Lists

So the most common way of solving the collision problem is basically having each hash connected to its own linked list, so for these, we can just add items to these linked lists. So this means that as long as the number of collisions isn’t that big, the efficiency will be decent, however in the worst case (with some strange high collision causing data) the efficiency could go to O(n) lookup.

Chaining with Binary Search Trees

Instead of storing collisions in a linked list, we can pop in a Binary Search Tree to store collisions. Although this means our worst-case lookup time down to O(log n) there’s typically no need to go this far with our solution, unless we anticipate a very non-uniform distribution in the data we hope to store in our Hash Table.

Open Addressing with Linear Probing 

Insertion, Searching, and Removal

(Ref#: B)

Quadratic Probing and Double Hashing

“Quadratic Probing is similar to Linear probing. The difference is that if you were to try to insert into a space that is filled you would first check 1^2 = 1 element away then 2^2 = 4 elements away, then 3^2 =9elements away then 4^2=16 elements away and so on.

With linear probing, we know that we will always find an open spot if one exists (It might be a long search but we will find it). However, this is not the case with quadratic probing unless you take care in the choosing of the table size” (Ref#: A).

[…]

(Source: Cracking the Coding Interview)

ArrayList and Resizable Arrays

Some languages will give you resizable arrays (or lists) by default, others like Java will give you fixed-length arrays with the size being determined when you initialize the array.

An ArrayList is an array that will be able to resize itself when we add items to it or take items away from it, yet is still able to provide O(1) access. Most typically (such as in Java) the way this actually works is that the array will double it’s size when it gets full, although this operation will take O(n) time since it’s rare, that means that the amortized insertion time will naturally still be O(1) time. As mentioned the “resizing factor” will typically be 2 but this can also vary by programming language.  To understand why we don’t really use the O(N) operation in stating the overall time complexity see my Big-O Articles.

An ArrayList in Swift is just called an Array, and in Objective-C, this would probably be NSMutableArray.

String Builder

Interview Questions

Part A: Questions

1/ Is Unique – An algorithm to work out if a given string has all unique characters, what about if no additional data structures can be used?

2/ Check Permutation – An algorithm that takes two strings and can decide if they are permutations of each other.

3/ URLify

4/ Palindrome Permutation

5/ One Away

6/ String Compression

7/ Rotate Matrix

8/ Zero Matrix

9/ String Rotation

 

Part B: Swift Based Answers

https://github.com/careercup/CtCI-6th-Edition-Swift

Part C: Python Answers

https://github.com/careercup/CtCI-6th-Edition-Python

Part D: C# Answers

https://github.com/careercup/CtCI-6th-Edition-CSharp

 

References

A: https://cathyatseneca.gitbooks.io/data-structures-and-algorithms/content/tables/quadratic_probing_and_double_hashing.html

B: https://cathyatseneca.gitbooks.io/data-structures-and-algorithms/content/tables/linear_probing.html

C: https://cathyatseneca.gitbooks.io/data-structures-and-algorithms/content/tables/hash_table.html

D: https://www.raywenderlich.com/206-swift-algorithm-club-hash-tables

Last modified: July 16, 2019