IT (dis)adventures: Efficient and scalable select of a random record in PostgreSQL

Wednesday, January 11, 2012

Efficient and scalable select of a random record in PostgreSQL

This is a code snippet for the select of a random record in a efficient and scalable way when using PostgreSQL:

   SELECT * FROM table_name
   LIMIT 1
   OFFSET (SELECT Random() * Count(*) FROM table_name);

According with the manual, the Random() function return a random value in the range 0.0 <= x < 1.0 .
So the effect of the expression (Random() * Count(*)) is to return a value between 0 and Count(*): we can use this value as offset to skip a random values of records.

This work in PostgreSQL 8.* and 9.*.

This solution required two queries, but if you have many records, is more efficient than this other one :

   SELECT * FROM table_name
   ORDER BY Random()
   LIMIT 1;

In this solution the Random() function is called for each row, so with many records it could be inefficient in term of execution time and memory usage.

IT (dis)adventures

Wednesday, January 11, 2012

Efficient and scalable select of a random record in PostgreSQL

No comments:

Post a Comment