pg_repack
1. Overview
pg_repack is an extension for PostgreSQL used to rebuild tables and indexes online and eliminate storage bloat. It can achieve effects similar to VACUUM FULL (reclaiming space), while barely blocking business read/write operations.
Both the PG mode and Oracle compatibility mode of IvorySQL have been adapted for pg_repack.
Documentation: https://reorg.github.io/pg_repack/
License: BSD-3-Clause License
2. Principle
VACUUM FULL lacks a mechanism for capturing concurrent changes during the rewrite. It holds an ACCESS EXCLUSIVE lock on the entire table for the whole duration, during which the table cannot be read or written, and the lock duration grows with the table size. pg_repack compresses the strong lock into two brief moments at the beginning and end by means of "shadow table + triggers + file node swap", thereby enabling online rebuild.
2.1. Change Capture and Data Copy
-
First, install row-level triggers on the original table to record subsequent
INSERT/UPDATE/DELETEoperations into a log table (repack.log_<oid>) — this step requires a briefACCESS EXCLUSIVElock to atomically install the triggers, which is released immediately after installation; -
Create a shadow table with the same structure as the original table (
repack.table_<oid>), and bulk-copy the original table’s data under a consistent MVCC snapshot; this phase holds onlySHARE UPDATE EXCLUSIVE, so the original table continues to allow normal reads and writes; -
Rebuild indexes on the shadow table, then replay the log so that the shadow table catches up with the latest state of the original table.
2.2. Swap and Cleanup
-
Acquire a brief
ACCESS EXCLUSIVElock, replay any remaining log entries, then swap the physical file nodes (relfilenode) of the original table and the shadow table at the system catalog level — the table’s OID remains unchanged, and the swap is instantaneous; -
Drop the triggers and log tables, then drop the shadow table (which now points to the old heap, so dropping it reclaims the old physical files); finally, run
ANALYZEon the table.
|
Prerequisites for use: the target table must have a primary key or a non-null unique key; the operation requires approximately twice the table size of temporary disk space during the process. |
3. Installation and Enabling
IvorySQL 5 or above is already installed in the environment. pg_repack consists of two parts: the server-side extension (pg_repack.so and SQL scripts) and the client-side command-line tool pg_repack.
It is assumed that IvorySQL 5 is currently installed and the pg_config tool is available.
3.1. Source Code Installation
Pull the pg_repack source code:
git clone --branch ver_1.5.3 https://github.com/reorg/pg_repack.git
Perform compilation and installation:
cd pg_repack make make install
3.2. Create the Extension
[ivorysql@localhost ivorysql]$ psql psql (18.0) Type "help" for help. ivorysql=# CREATE EXTENSION pg_repack; CREATE EXTENSION
|
|
4. Usage
pg_repack is driven through the command-line client. The connection parameters are the same as those of psql (-h host, -p port, -U user, -d database, or use the PGHOST / PGPORT / PGUSER environment variables).
4.2. Rebuild a Specified Table
The schema.table form is recommended:
pg_repack -d ivorysql -t public.big_table