What The Heck Are Data Structures And Algorithms Anyway?

I first learned to program in PHP. In PHP everything is either a class instance (in PHP an object is sort of like an object but with the addition that it makes you frustrated a lot) or text with PHP in it. Those were like the two PHP "data types". And I got a lot done with them. And when I couldn't I had arrays! And when arrays didn't work I had strings!

Then I found Ruby, and everything was an object and it was glorious. Some people saw a RAM usage but I saw objects everywhere. Everything became an object. My car was an object! I didn't drive, I used my foot to call car.increase_speed!. My dogs were no longer dogs but an Animal class with a bark() method. Ruby was life changing. My life became scripts to restart memory leaking Rails apps.

Then I found Clojure and everything became a LIST, and everything was a LIST. Instead of calling methods, I mapped and reduced everything. The world was data and everything went into lists! Even code was lists! It was all lists!

Data structures are so invisible that we rarely notice them but they are there all the time. Every thing might feel like a class, an object, or a LIST, but they are not.

What is a Data Structure?

A data structure is essentially just a type of object for a specific type of data. If we have linear data we might use an array. If we have React components we might need a tree. If we have a map of cities we will need a Graph.

The data structures we choose for a type of data influences the type of operations we can perform on it, their efficiency, their performance, and even the way we conceptualize data or build interfaces for it. Algorithms are derived from their data structures.

What is an Algorithm?

Algorithm is just a clever word for a standardized set of operations. What defines an algorithm is the type of operations, the data structures for their state, and how and when those operations are performed.

When we talk about doing a MergeSort or QuickSort, what we are really talking about is doing a bunch of operations on that data. An algorithm defines "the operations and this how you should do them" and sometimes "this what you should use to store the state for the algorithm".

Most of the mistakes we encounter when tackling a HackerRank type challenge are because we do not fully understand the algorithm or we choose the wrong one. Most of us when presented with a problem that involves sorting, will know or be able to figure some way to sort that data. But we might choose a MergeSort and discover we missed the detail about the data set never exceeding 13 items.

The type of data you have, its constraints, its structure, will reveal algorithm choices. Often we hit upon the wrong solution simply because we chose the wrong data structure or missed a detail about our data. Learning to reason about data and knowing data structures cold will get you a long way towards grokking algorithms.

An Example

Imagine that you have a list of numbers and you want to find the max. We might implement that like this:

def max(numbers)
  max = 0
  numbers.each do |i|
    if i > max
      max = i
    end
  end
  return max
end

This is an algorithm. A simple one with just three operations but an algorithm still. This one is a standard linear search algorithm. Chances are you have used this algorithm a zillion times in some form. Maybe you were iterating over a list of names, a list of animals, a list of dogs, a list of books, a list of lists, or your collection of High School Musical blu-rays.

I feel like Cheesy Disney Musical in Space should be a thing. Like Teen Beach movie but in space. Slightly off topic, but did you know Johnny Tsunami got a sequel? Never seen it but it exists, I don't know why, but it does. Brink! is still the greatest Disney channel original movie.

When you woke up this morning, did you say to yourself 'Today, I'm gonna talk.' or 'Today, I'm gonna skate!'?

The 90s were amazing.

This algorithm is very generic and works for a large variety of data types. In a language like Ruby, it would work with any objects that could be compared.

If we knew more about the data we could make different choices. We might know that the data is ordered and that the first or last element is the biggest, in which case, we wouldn't even need the above algorithm at all; we could simply retrieve the last element.

Types of Data

Data structures can be classified into five and 1/2 categories;

  1. Fundamental Types
  2. Linear data structures
  3. Trees
  4. Hashes
  5. Graphs
  6. aka 5 and 1/2 Other or Misc. The Uncategorized

Fundamental types are stuff like ints, floats, booleans etc. Linear data structures are arrays and lists. Trees are those parent and children node objects; React is a tree of components. Hashes are those cool key/value associative things. Graphs are like trees but have a small difference; nodes in a tree point to only one other node, but nodes in a graph can connect to a variety of other nodes.

The distinction between a tree and graph might feel like it doesn't matter, but language is hugely important for breaking down interview challenges. Programmers are like Siths, as much as possible they try to deal in absolutes. The puzzles will get you if you don't read the small print about the inputs.

Seriously. Lion King 1.5?! They ran that franchise into the ground and ruined my childhood.

How it All Fits Together

Everything at its core begins with the data structures. We can think of algorithms as standard methods on data structures and their operations as how those methods are implemented. A List is a data structure with QuickSort, MergeSort, methods. If we thought of it as code it might look like this:

class List
  def quick_sort
  end

  def merge_sort
  end
end

When you think of algorithms as derived from data structures it makes it easier to structure our learning. We can learn data structures together with their algorithms. The data structures and algorithms are not separate, they are one and the same.

If we break down our learning into the categories of data structures, then further into specific data structures, we can learn one how to implement one data structure at a time and their accompanying algorithms.

We can learn further through example problems that make use of just that data structure. Most interview puzzles require you to make use of multiple data structures. Being able to break down puzzles into individual data structures (even though every thing might feel like an object, every thing is not an object) their algorithms, their operations, and how they relate to other data structures; is basically how you solve a puzzle.