Embrace the histogram

Let’s say we are comparing two suppliers of some service to our business – any service: Temp Labor, Facilities Management, Security, IT, etc. One important measure is each supplier’s hourly rate. This may vary based on the seniority and skills of the staff, the task performed and the location of the work.

We start by gathering a large sample of invoices from each supplier, chosen so that the mix of work is the same for both. That is, the hourly rates still vary, as described above, but the sources of variation are identical for each supplier. Given this, how would we compare the suppliers’ hourly rates?

The obvious answer is to take an average.

Fine. So we do that and find supplier A’s average hourly rate is $62.50/hour and supplier B’s is $71.07/hour. Is there any other analysis we would do?

Anyone? Anyone? Bueller?

Most of us would stop there and conclude that supplier B is pricier.

But we’d be better off if we first looked at a histogram of the data. The histogram, as you may recall, looks at the frequency of different values. How many tasks were billed at $60/hr? How many at $65/hr? And so on up and down the range.
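If you'd like to see what that counting looks like in practice, here is a minimal sketch in Python. The rates, the $5 bin width and the function names are all made up for illustration; they are not taken from real invoices or from any particular tool.

```python
from collections import Counter


def histogram(rates, bin_width=5.0):
    """Count how many hourly rates fall into each bin of width bin_width."""
    counts = Counter()
    for rate in rates:
        # Label each bin by its lower edge, e.g. 62.50 lands in the 60.0 bin.
        lower_edge = int(rate // bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


def print_histogram(counts):
    """Print one row of '#' marks per bin, a poor man's bar chart."""
    for lower_edge, n in counts.items():
        print(f"${lower_edge:>6.2f}/hr  {'#' * n}")


# Hypothetical hourly rates, invented purely for this example.
sample_rates = [58.0, 61.5, 60.0, 63.0, 59.5, 88.0, 91.0, 90.0, 62.0]

counts = histogram(sample_rates)
print_histogram(counts)
```

Even in a toy sample like this one, the printout shows two separate clumps of rates, which a single average would hide.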

Here are histograms comparing suppliers A and B in my hypothetical example:

[Figure: histograms of hourly rates for suppliers A and B]

The average hourly rates of the two suppliers are different, but is supplier B truly pricier than supplier A? What is going on in that subgroup of tasks performed by supplier B at higher hourly rates? If you look closely enough, you’ll see those tasks are clustered around $90/hr, whereas the larger, remaining set of tasks is clustered around $60/hr. Yes, that’s it: those higher-priced tasks were performed on overtime, at a 50% premium over the $60/hr regular rate.

Supplier B is NOT more expensive than supplier A; they’ve just performed a bunch of overtime work. This could be for good or bad reasons. Were they slow in their work and always behind schedule? Or are they the more reliable supplier we turn to when the going gets tough?
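To make the arithmetic concrete, here is a rough sketch of how overtime can inflate an average. It assumes we can tell overtime tasks apart with a simple cutoff read off the histogram (the $75 threshold is my assumption, as are the rates and the function name), and it converts overtime rates back to their straight-time equivalent by dividing out the 50% premium.

```python
from statistics import mean

OVERTIME_MULTIPLIER = 1.5  # the 50% overtime premium from the example


def straight_time_equivalent(rates, threshold):
    """Divide every rate at or above threshold by the overtime multiplier.

    threshold is a hypothetical cutoff picked by eye from the histogram;
    real invoices would ideally flag overtime explicitly instead.
    """
    return [r / OVERTIME_MULTIPLIER if r >= threshold else r for r in rates]


# Hypothetical rates for supplier B: four regular tasks, three on overtime.
supplier_b = [58.0, 60.0, 62.0, 60.0, 87.0, 90.0, 93.0]

raw_average = mean(supplier_b)
adjusted_average = mean(straight_time_equivalent(supplier_b, threshold=75.0))

print(f"Raw average:      ${raw_average:.2f}/hr")
print(f"Adjusted average: ${adjusted_average:.2f}/hr")
```

With these invented numbers the raw average comes out near $72.86/hr, while the straight-time average is $60.00/hr. The gap comes entirely from how much overtime was worked, not from a higher underlying rate.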

Who knows? What’s important is that those questions would never have been asked, and we’d have settled on an incorrect conclusion, if we hadn’t looked at our histograms.

Sadly, the histogram is terribly underutilized. So I proclaim March Hug a Histogram month. Next time you’re looking at a set of data, please be kind and run a histogram. You might be surprised by what you see.
