Wednesday, January 11, 2012

Efficient and scalable select of a random record in PostgreSQL


This is a code snippet for the select of a random record in a efficient and scalable way when using PostgreSQL:

   SELECT * FROM table_name
   LIMIT 1
   OFFSET (SELECT Random() * Count(*) FROM table_name);

According with the manual, the Random() function return a random value in the range 0.0 <= x < 1.0 .
So the effect of the expression (Random() * Count(*)) is to return a value between 0 and Count(*): we can use this value as offset to skip a random values of records.

This work in PostgreSQL 8.* and 9.*.

This solution required two queries, but if you have many records, is more efficient than this other one :

   SELECT * FROM table_name
   ORDER BY Random()
   LIMIT 1;

In this solution the Random() function is called for each row, so with many records it could be inefficient in term of execution time and memory usage.