Algorithms as Tools For Solving Problems Pt. 2

We left off last time with the realization that our word-finding algorithm was too slow for large inputs.

We can improve its performance by minimizing the number of page turns required to find a word.

Amazingly, using binary search we can reduce the runtime from O(n) to O(logN)!

Note: In bigO notation, logN always refers to log base 2

Binary Search

Binary search is a common algorithm used for searching through large, ordered data sets, like a dictionary.

Here's how it works:

Let's say we're trying to find the word 'Zebra'.
Instead of going through each individual page, starting in the A's and ending in the Z's, we're going to start right in the middle of the book, probably somewhere near the M's.

A quick scan reveals that Zebra is not on the page, but since the book is ordered alphabetically, we have effectively ruled out the entire first half of the book in one step!

However, Zebra is still hiding somewhere in the second half so we will simply split that into two more pieces, perhaps somewhere in the T's. Again, it's not there, but we are already 75% of the way through the book in only two steps.

Repeating the above steps we will eventually reach the Z section and hopefully find our word on the very last page.

How much more efficient is binary search compared to a simple search?:

For a 1024 page dictionary, it will take simple search 1024 page turns to find a word in the worst case scenario.

For binary search it will take one page turn every time we cut it in half:

1024/2 = 512 ; 1 page turn
512/2 = 256 ; 2 page turns
256/2 = 128 ; 3 page turns
128/2 = 64 ; 4 page turns
64/2 = 32 ; 5 page turns
32/16 = 16 ; 6 page turns
16/2 = 8 ; 7 page turns
8/2 = 4 ; 8 page turns
4/2 = 2 ; 9 page turns
2/2 = 1 ; 10 page turns

Holy crap! In only 10 page turns we accomplished what we used to in over 1000!

The efficiency of binary search is O(logN) time because as the size of the dictionary increases, our runtime increases only by logN.

For a 2 page dictionary our runtime is log2 (2 to the first power equals 1); for a 4 page dictionary our runtime is log4 (2 to the 2nd power equals 4), etc.

Now that we know how binary search works, how can we implement it in code?

We know our function will take an array and a string as arguments. It will return true or false based on whether the word was found.

let binarySearch = function(arr, word){

  //return true or false
};

Since we're going to be splicing out large portions of the array, we'll loop through it only as long as it has a length:

let binarySearch = function(arr, word){
  while(arr.length){
    //search logic 
  }
};

Next, we'll check whether middle index matches the word:

let middleIndex = Math.floor(arr.length/2);
if(arr[middleIndex] === word){
return true ;
};

If it doesn't, then there are two possibilities:

  • The word comes after the middle index, in which case we splice out the second half of the array and set it as the new array,
if(arr[middleIndex] < word){
  arr = arr.splice(middleIndex + 1);
}
  • Or the word comes before the middle index, in which case we do the same for the first half instead.
else {
  arr = arr.splice(0, middleIndex);
     };

We should also prepare for the possibility that the word is not in the dictionary. After the while loop simply put:

return false;

This will be the default output if the while loop doesn't return true.

Recap###

Binary search is just one example of how we can improve the performance of our code.

The goal of this series was to show how bigO notation lets us measure the performance of our algorithms. You should now understand that writing good algorithms entails much more than just outputting the right answer. Runtime and memory must be taken into account, especially for large inputs.


Exercises

  1. How many page turns using binary search would be required to find a word in a 1,048,576 page dictionary? Assume the worst case scenario.

  2. You're given two ordered data sets. One has 8192 members, the other has 2048 members. How many more operations would you have to perform on the larger set than the smaller set using binary search?

3)Why do you think its necessary that a data set be ordered for binary search to work effectively?

  1. Copy this code into your text editor and use the array it produces as an argument in the function from earlier. Time how long it takes to search for the number 9999999. How much quicker is it than when you did it in the previous exercise?
let largeArray = [];  
for(let i = 0; i < 10000000; i++){  
largeArray.push(i);  
};