Grokking Algorithms #6: Breadth-First Search

This post covers Chapter 6 of Grokking Algorithms, which introduces the graph data structure and the breadth-first search algorithm.

Shortest Path

Imagine you’re going on vacation and you need someone to watch your pet gecko while you’re gone. His name is Gordon and he’s very picky about who takes care of him.

His preference would be to have one of your friends watch him. But if none of them are available, he would settle for a friend of a friend or maybe even a friend of a friend of a friend.

So you first ask all of your friends. If one of them can watch Gordon, great! If not, you ask friends of your friends, and so on. This is breadth-first search.

Graphs

Graphs are used to represent entities and the relationships between them. The entities are represented as nodes in a graph and can be connected to any number of other nodes. Those connections are represented using edges in a graph.

Directed graphs contain edges that point in a single direction (unidirectional), while undirected graphs contain edges that point in both directions (bidirectional).

The mango seller example in the book covers undirected graphs.

Mango Sellers

We can represent the graph of mango sellers from the book’s example using a map in Clojure.

(def graph
  {"you"    '("alice" "bob" "claire")
   "bob"    '("anuj" "peggy")
   "alice"  '("peggy")
   "claire" '("thom" "jonny")
   "anuj"   ()
   "peggy"  ()
   "thom"   ()
   "jonny"  ()})

I chose to represent the neighbors for each person using a list rather than a vector because we won’t need to access individual elements by their index at random. Rather, we will traverse the list in order starting from the first element1.

Keep in mind that in Clojure, you need to prefix list data with an apostrophe, since lists that don’t start with an apostrophe get evaluated immediately2. For empty lists, the apostrophe is optional, so I omitted those.

Breadth-First Search

To implement our breadth-first search algorithm, we’ll start by defining a search function.

(defn search
  [name]
  (loop [search-queue (get graph name)
         searched #{}]
    ))

We include a loop at the top of our function to give us a recursion target with two arguments, search-queue and searched. We initialize search-queue to the starting node’s list of neighbors and searched to an empty set.

Next, we’ll need to determine our base cases and our recursive cases.

  1. Base case: Found a mango seller.
  2. Recursive case: Didn’t find a mango seller; more people to check.
  3. Base case: No more people to check.

Since we have three conditions, a cond is a good option to branch between them.

(defn search
  [name]
  (loop [search-queue (get graph name)
         searched #{}]
    (cond
      (and (not (contains? searched person))
           (person-is-seller? person))
      (do
        (println person "is a mango seller!")
        true)
      
      (not (empty? new-search-queue))
      (recur new-search-queue new-searched)

      :else
      false)
    ))

Our first condition will be true if we haven’t already searched the current person and that person is a mango seller. In this case, we print a message and return true. We haven’t yet defined person or the person-is-seller? function, but we’ll get to that shortly3.

The second condition will be triggered if the current person is not a mango seller (since the first condition didn’t pass) and if there are more people to check in the search queue. In this case, recur with the updated search-queue and searched arguments. Again, we haven’t yet defined new-search-queue or new-searched, but we’ll get to that.

If we didn’t find a mango seller after checking all the connected people, return false.

Now let’s go back and define the symbols we didn’t yet define.

(defn search
  [name]
  (loop [search-queue (get graph name)
         searched #{}]
    (let [[person & rest-of-search-queue] search-queue
          new-search-queue (concat rest-of-search-queue
                                   (get graph person))
          new-searched (conj searched person)]
      (cond
        (and (not (contains? searched person))
             (person-is-seller? person))
        (do
          (println person "is a mango seller!")
          true)
        
        (not (empty? new-search-queue))
        (recur new-search-queue new-searched)
        
        :else
        false))))

We wrapped the cond in a let to define the variables that we used previously. For starters, we destructure the first person from the search queue and the remaining items into person and rest-of-search-queue, respectively.

Next, we define new-search-queue as the concatenation of the rest of search-queue and the neighbors of person.

Finally, we define new-searched as searched with person added to it.

Now we can define the person-is-seller? function.

(defn person-is-seller?
  [name]
  (= \m (last name)))

Like in the book, we’re getting the last letter from the person’s name and checking if it’s the character m4. This would return true only for the name "thom" out of all the people in our map.

In a real-world scenario, we would likely want to keep track of the mango sellers in a separate data structure. A set works well for this.

(def mango-sellers
  #{"thom"})

(defn person-is-seller?
  [name]
  (contains? mango-sellers name))

We defined mango-sellers as a set containing only the name "thom" and we updated the person-is-seller? function to check if name is contained in mango-sellers.

Key Takeaways

  • Graphs allow us to represent many-to-many relationships between entities.
  • A graph can be represented in Clojure by a map.
  • Breadth-first search finds the shortest path between two nodes.

Footnotes

  1. When I originally learned Clojure, I was confused about when to use a list and when to use a vector. The general rule is, if you need to access elements at random, use a vector. If you’re only going to traverse the elements starting from the head, use a list. ↩︎
  2. When calling a function in Clojure, we specify a list containing the name of the function followed by its arguments. That list is not prefixed with an apostrophe, and therefore gets evaluated rather than being treated as data. ↩︎
  3. This approach of specifying the high-level logic and flow of a piece of code, then filling in the details later is called top-down design. ↩︎
  4. Getting the last character from a string returns a character, since a Clojure string is a sequence of characters. Additionally, the string "m" and the character \m are not equal. ↩︎

Leave a comment