Simple Hierarchical Clustering

Given a set of points, hierclust organizes them into clusters. It can either keep merging until every cluster has been folded into a larger one, or stop once the clusters are separated by at least a given minimum distance.

Useful for taking a large set of points to be plotted on a map, and reducing them to a smaller number of clusters, separated enough so that the map remains legible.
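The two stopping modes described above can be sketched in plain Ruby. This is only an illustration of the general idea (repeatedly merging the two closest clusters), not hierclust's own code; the names naive_cluster, centroid and distance are made up for the example, and measuring separation between centroids is an assumption.

```ruby
# Illustrative only: a naive agglomerative clusterer, not hierclust's own code.
# A cluster is represented as an array of [x, y] points.

def centroid(cluster)
  n = cluster.size.to_f
  [cluster.sum { |p| p[0] } / n, cluster.sum { |p| p[1] } / n]
end

def distance(a, b)
  Math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2)
end

# Merge the two closest clusters repeatedly. With min_separation = nil the
# loop runs until one cluster remains; otherwise it stops as soon as the
# closest pair of centroids is at least min_separation apart.
def naive_cluster(points, min_separation = nil)
  clusters = points.map { |p| [p] }
  while clusters.size > 1
    pairs = clusters.combination(2).to_a
    a, b = pairs.min_by { |x, y| distance(centroid(x), centroid(y)) }
    break if min_separation && distance(centroid(a), centroid(b)) >= min_separation
    clusters = clusters.reject { |c| c.equal?(a) || c.equal?(b) } << (a + b)
  end
  clusters
end

points = [[0, 0], [1, 0], [10, 10], [11, 10]]
p naive_cluster(points, 5).size # => 2
p naive_cluster(points).size    # => 1
```

This version recomputes every pairwise distance on each pass, which is fine for a handful of map markers but slow for large point sets.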


sudo gem install hierclust

The basics

require 'hierclust'

points = (1..6).map { Hierclust::Point.new(rand(10), rand(10)) }
clusterer = Hierclust::Clusterer.new(points)
puts clusterer.clusters # e.g. => [[[(4, 9), (4, 8)], (9, 6)], [[(1, 4), (3, 1)], (6, 3)]]

Demonstration of usage

Let’s say you have an existing set of objects with latitudes and longitudes, and you want to organize them into clusters that are separated by at least 5 degrees (for simplicity’s sake we’ll pretend that latitude and longitude form a rectangular grid).

require 'hierclust'

Start by extending the built-in Point class so that it can maintain a reference to your data:

class Hierclust::Point
  attr_accessor :data
end

Then turn your data into a set of points:

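dataset = MyGeocodedThing.find(:all)
points = dataset.map do |thing|
  point = Hierclust::Point.new(thing.longitude, thing.latitude) # accessor names assumed
  point.data = thing
  point
end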

Then tell Hierclust to cluster those points to at least 5 degrees separation:

clusterer = Hierclust::Clusterer.new(points, 5)
clusters = clusterer.clusters
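To sanity-check a result, it can help to measure how far apart the resulting positions really are. The sketch below is plain Ruby and does not call hierclust: Center is a stand-in Struct for anything that responds to x and y, min_separation is a hypothetical helper name, and treating separation as the straight-line distance between positions is an assumption that matches the flat-grid simplification used in this walkthrough.

```ruby
# Stand-in for objects that respond to x and y (hypothetical, for illustration).
Center = Struct.new(:x, :y)

# Smallest straight-line distance between any two items, in the same units
# as x and y (degrees in this walkthrough). Returns nil for fewer than two items.
def min_separation(items)
  items.combination(2).map { |a, b| Math.hypot(a.x - b.x, a.y - b.y) }.min
end

centers = [Center.new(0, 0), Center.new(6, 8), Center.new(20, 0)]
puts min_separation(centers) # prints 10.0
```

If the value printed is below the separation you asked for, the distance measure assumed here probably differs from the one the clusterer uses.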

Then do what you will with your clusters:

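map = MyMap.new # hypothetical map object from your own application
clusters.each do |cluster|
  map.add_marker( # hypothetical method name
    :x => cluster.x,
    :y => cluster.y,
    :label => "#{cluster.points.size} Things"
  )
end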


API documentation: RDoc


Source code

You can browse the source at

How to submit patches

Read the 8 steps for fixing other people’s code. For section 8b (Submit patch to Google Groups), use the Google Group above.


This code is free to use under the terms of the MIT license.


Comments are welcome. Send an email to Brandt Kurowski via the forum.

Brandt Kurowski, 6th February 2008
Theme extended from Paul Battley