Reaves.dev

v0.1.0

built using

Phoenix v1.7.17

Linear Time Median

Stephen M. Reaves

::

2025-01-09

Using the Divide and Conquer approach to find the median of an unsorted list in linear time

Summary

We can find the median of a list of numbers without sorting by combining ideas of QuickSort and Sample Sort.

Contents

Problem

Given an unsorted list A=[a1,,an] A = \left[ a_1, \ldots, a_n \right] of n numbers, find the median of A A .

median
n2 \left\lceil\frac{n}{2}\right\rceil th smallest element

When n is odd:

n=2l+1 n = 2l + 1

median=(l+1)st smallest \text{median} = \left(l + 1\right)^{st} \text{ smallest}

More generally,

Given an unsorted list A=[a1,,an] A = \left[ a_1, \ldots, a_n \right] of n numbers and an integer k where 1kn 1 \le k \le n , find the kth smallest of A.

Basic Approach

QuickSort style

  1. Choose a pivot p
  2. Partition A into A<p,A=p,A>p A < p, A=p, A>p
  3. Recursively sort A<p&A>p A <p &amp; A>p

In order to make our median selecting algorithm O(n) O(n) instead of O(nlogn) O\left( n\log{n} \right) , we only need to recursively search one of the sub lists, not both.

Psuedocode

QuickSelect(A,k):

Reverse Engineering

good pivot
A<p3n4&A>p3n4 |A<p| \le \frac{3n}{4} \& |A>p| \le \frac{3n}{4}

If we randomly select a pivot, there’s a 50% chance its good. We can instead choose a representative sample to choose the pivot from to gaurantee it’s good.

Representative Sample

We want to choose a subset S, then find a median of S, and have that median be an approximation of the median of A.

for each xS x \in S , we want a few (2) elements of A are x \le x and a few (2) elements that are x \ge x

Break A into n5 \frac{n}{5} groups of 5 elements each. Then sort each group and find the median of the middle group.

FastSelect