With the ever-growing demand for data science work in recent years, PostgreSQL has gained enormous popularity, especially in areas where extensive geospatial/GIS (geographic information system) functionality is needed. At a previous startup venture, we initially adopted MySQL, and I went through the trouble of migrating to PostgreSQL mainly for the sophisticated geospatial features PostGIS offers.
PostgreSQL offers a lot of goodies, although it does have a few things that I wish were done differently. Most notable to me is that while its SELECT statement supports the SQL-92 Standard's JOIN syntax, its UPDATE statement does not. For instance, the following UPDATE statement does not work in PostgreSQL:
--
-- SQL-92 compliant JOIN
--
UPDATE X
    INNER JOIN Y ON X.yid = Y.id
    LEFT JOIN Z ON Y.zid = Z.id
SET X.yname = Y.name,
    X.zname = Z.name
WHERE X.id IN (101, 102, 103);
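That said, PostgreSQL does support multi-table updates through its own UPDATE ... FROM clause; a sketch of how the same statement could be rewritten for PostgreSQL:

--
-- Equivalent update using PostgreSQL's UPDATE ... FROM
--
UPDATE X
SET yname = Y.name,
    zname = Z.name
FROM Y
    LEFT JOIN Z ON Y.zid = Z.id
WHERE X.yid = Y.id
    AND X.id IN (101, 102, 103);

Note that in PostgreSQL the target columns in the SET clause must not be table-qualified, and the join back to the target table is expressed in the WHERE clause.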
Partial indexing
Nevertheless, for general performance and scalability, PostgreSQL remains one of the top candidates in the world of open-source RDBMS, with a proven track record. There is a wide variety of approaches to scaling up a PostgreSQL database, and suitable indexing is probably one of the first things to look into, starting with planning out column orders in indexes so that they are optimal for the frequently used queries.
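As a quick illustration with hypothetical column names, a query that frequently filters orders by customer and sorts by date is best served by a composite index whose column order mirrors that access pattern:

--
-- Composite index matching a frequent query pattern (hypothetical names)
--
CREATE INDEX ix_orders_customer_date ON orders (customer_id, order_date);
-- serves queries like:
--   SELECT * FROM orders WHERE customer_id = 123 ORDER BY order_date;

Beyond composite indexes, PostgreSQL provides another indexing feature for handling large datasets.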
Partial indexing allows an index to be built over a subset of a table based on a conditional expression. For instance:
--
-- Partial index
--
CREATE INDEX ix_unshipped_orders ON orders (order_num)
    WHERE shipped IS FALSE;
For a table with a large number of rows, this feature can make an otherwise gigantic index much smaller, and thus more efficient for queries against the selectively indexed data.
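For example, a query whose WHERE clause matches the index predicate can be served by the small partial index, while the many already-shipped orders never take up index space:

--
-- A query that can use the partial index
--
SELECT order_num FROM orders WHERE shipped IS FALSE;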
Scaling up with table partitioning
However, when a table grows to a certain volume, say, beyond a couple of hundred million rows, and periodically archiving data off the table isn't an option, even an applicable indexing strategy may no longer be enough. In many cases, it becomes necessary to change the table structure itself, and table partitioning is often a good solution.
There are a few approaches to partitioning a PostgreSQL table. Among them, partitioning by means of table inheritance is perhaps the most popular. A master table is created as a template that defines the table structure. The master table remains empty, while a number of child tables inheriting from it actually host the data.
The partitioning is based on a partition key, which can be a column or a combination of columns. In common use cases, partition keys are often date-time related. For instance, a partition key could be defined in a table to partition all sales orders by month with a constraint like the following:
order_date >= '2016-12-01 00:00:00' AND order_date < '2017-01-01 00:00:00'
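With the inheritance-based approach described below, such a predicate would typically appear as a CHECK constraint on a monthly child table; a minimal sketch, assuming a hypothetical sales master table:

--
-- Monthly child table via a CHECK constraint (hypothetical sales table)
--
CREATE TABLE sales_2016_12 (
    CHECK ( order_date >= '2016-12-01 00:00:00'
        AND order_date <  '2017-01-01 00:00:00' )
) INHERITS (sales);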
Other common cases include partitioning by geography.
A table partitioning example
When I was with a real estate startup building an application that involved over 100 million nationwide properties, each with multiple attributes of interest, table partitioning was employed to handle the demanding data volume. Below is a simplified example of how the property sale transaction table was partitioned to maintain a billion rows of data.
First, create the master table, which will serve as the template for the table structure.
--
-- Create master table
--
DROP TABLE IF EXISTS property_sale CASCADE;
CREATE TABLE property_sale (
    "id"            bigserial primary key,
    "state"         character varying(20) not null,
    "property_id"   integer,
    "property_type" character varying(100),
    "sale_date"     timestamp with time zone,
    "sale_price"    integer
);
Next, create child tables that inherit from the master table, one for each state, plus a catch-all child table for unmatched rows. For simplicity, I set up only 24 states for the performance evaluation.
--
-- Create child tables (for 24 selected states)
--
CREATE TABLE property_sale_ca ( CHECK ( state = 'CA' )) INHERITS (property_sale);
CREATE TABLE property_sale_ny ( CHECK ( state = 'NY' )) INHERITS (property_sale);
CREATE TABLE property_sale_tx ( CHECK ( state = 'TX' )) INHERITS (property_sale);
CREATE TABLE property_sale_il ( CHECK ( state = 'IL' )) INHERITS (property_sale);
CREATE TABLE property_sale_wa ( CHECK ( state = 'WA' )) INHERITS (property_sale);
CREATE TABLE property_sale_fl ( CHECK ( state = 'FL' )) INHERITS (property_sale);
CREATE TABLE property_sale_va ( CHECK ( state = 'VA' )) INHERITS (property_sale);
CREATE TABLE property_sale_co ( CHECK ( state = 'CO' )) INHERITS (property_sale);
CREATE TABLE property_sale_oh ( CHECK ( state = 'OH' )) INHERITS (property_sale);
CREATE TABLE property_sale_nv ( CHECK ( state = 'NV' )) INHERITS (property_sale);
CREATE TABLE property_sale_or ( CHECK ( state = 'OR' )) INHERITS (property_sale);
CREATE TABLE property_sale_pa ( CHECK ( state = 'PA' )) INHERITS (property_sale);
CREATE TABLE property_sale_ut ( CHECK ( state = 'UT' )) INHERITS (property_sale);
CREATE TABLE property_sale_ma ( CHECK ( state = 'MA' )) INHERITS (property_sale);
CREATE TABLE property_sale_ct ( CHECK ( state = 'CT' )) INHERITS (property_sale);
CREATE TABLE property_sale_la ( CHECK ( state = 'LA' )) INHERITS (property_sale);
CREATE TABLE property_sale_wi ( CHECK ( state = 'WI' )) INHERITS (property_sale);
CREATE TABLE property_sale_wy ( CHECK ( state = 'WY' )) INHERITS (property_sale);
CREATE TABLE property_sale_nm ( CHECK ( state = 'NM' )) INHERITS (property_sale);
CREATE TABLE property_sale_nj ( CHECK ( state = 'NJ' )) INHERITS (property_sale);
CREATE TABLE property_sale_nh ( CHECK ( state = 'NH' )) INHERITS (property_sale);
CREATE TABLE property_sale_mi ( CHECK ( state = 'MI' )) INHERITS (property_sale);
CREATE TABLE property_sale_md ( CHECK ( state = 'MD' )) INHERITS (property_sale);
CREATE TABLE property_sale_dc ( CHECK ( state = 'DC' )) INHERITS (property_sale);
CREATE TABLE property_sale_err (
    CHECK ( state NOT IN ( 'CA', 'NY', 'TX', 'IL', 'WA', 'FL', 'VA', 'CO',
                           'OH', 'NV', 'OR', 'PA', 'UT', 'MA', 'CT', 'LA',
                           'WI', 'WY', 'NM', 'NJ', 'NH', 'MI', 'MD', 'DC' ))
) INHERITS (property_sale);
Nothing magical so far, until a suitable trigger for propagating inserts is put in place. The trigger essentially redirects insert requests against the master table to the corresponding child tables.
--
-- Create trigger for insert
--
DROP FUNCTION IF EXISTS fn_insert_property_sale();
CREATE OR REPLACE FUNCTION fn_insert_property_sale() RETURNS TRIGGER AS $$
BEGIN
    CASE NEW.state
        WHEN 'CA' THEN INSERT INTO property_sale_ca VALUES (NEW.*);
        WHEN 'NY' THEN INSERT INTO property_sale_ny VALUES (NEW.*);
        WHEN 'TX' THEN INSERT INTO property_sale_tx VALUES (NEW.*);
        WHEN 'IL' THEN INSERT INTO property_sale_il VALUES (NEW.*);
        WHEN 'WA' THEN INSERT INTO property_sale_wa VALUES (NEW.*);
        WHEN 'FL' THEN INSERT INTO property_sale_fl VALUES (NEW.*);
        WHEN 'VA' THEN INSERT INTO property_sale_va VALUES (NEW.*);
        WHEN 'CO' THEN INSERT INTO property_sale_co VALUES (NEW.*);
        WHEN 'OH' THEN INSERT INTO property_sale_oh VALUES (NEW.*);
        WHEN 'NV' THEN INSERT INTO property_sale_nv VALUES (NEW.*);
        WHEN 'OR' THEN INSERT INTO property_sale_or VALUES (NEW.*);
        WHEN 'PA' THEN INSERT INTO property_sale_pa VALUES (NEW.*);
        WHEN 'UT' THEN INSERT INTO property_sale_ut VALUES (NEW.*);
        WHEN 'MA' THEN INSERT INTO property_sale_ma VALUES (NEW.*);
        WHEN 'CT' THEN INSERT INTO property_sale_ct VALUES (NEW.*);
        WHEN 'LA' THEN INSERT INTO property_sale_la VALUES (NEW.*);
        WHEN 'WI' THEN INSERT INTO property_sale_wi VALUES (NEW.*);
        WHEN 'WY' THEN INSERT INTO property_sale_wy VALUES (NEW.*);
        WHEN 'NM' THEN INSERT INTO property_sale_nm VALUES (NEW.*);
        WHEN 'NJ' THEN INSERT INTO property_sale_nj VALUES (NEW.*);
        WHEN 'NH' THEN INSERT INTO property_sale_nh VALUES (NEW.*);
        WHEN 'MI' THEN INSERT INTO property_sale_mi VALUES (NEW.*);
        WHEN 'MD' THEN INSERT INTO property_sale_md VALUES (NEW.*);
        WHEN 'DC' THEN INSERT INTO property_sale_dc VALUES (NEW.*);
        ELSE INSERT INTO property_sale_err VALUES (NEW.*);
    END CASE;
    -- Returning NULL cancels the original insert into the (empty) master table
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tr_insert_property_sale
    BEFORE INSERT ON property_sale
    FOR EACH ROW EXECUTE PROCEDURE fn_insert_property_sale();
Let’s test inserting data into the partitioned tables via the trigger:
--
-- Check created tables, functions, triggers
--
\dt property_sale*
\df fn_*
select tgname from pg_trigger where tgname like 'tr_%';

--
-- Test insert
--
INSERT INTO property_sale
    ("state", "property_id", "property_type", "sale_date", "sale_price")
VALUES
    ('CA', 1008, 'Single Family House', '1990-05-05 09:30:00', 750000),
    ('NY', 2505, 'Apartment', '2002-01-30 12:00:00', 800000),
    ('ZZ', 9999, 'Single Family House', '2012-06-28 14:00:00', 500000),
    ('CA', 1008, 'Condominium', '2015-11-02 16:00:00', 1200000),
    ('TX', 3030, 'Condominium', '2006-04-20 11:15:00', 500000);
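If the trigger works as intended, the master table itself stays empty while the inserted rows land in the matching child tables. A quick way to verify is to query the master with the ONLY keyword, which excludes child tables:

--
-- Verify the insert routing
--
SELECT count(*) FROM ONLY property_sale;  -- expect 0; the master holds no rows
SELECT count(*) FROM property_sale_ca;    -- expect 2
SELECT count(*) FROM property_sale_err;   -- expect 1 (the unmatched 'ZZ' row)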
A Python program for data import
Now that the master table and its child tables are functionally in place, we're going to populate them with large-scale data for testing. First, write a simple program in Python (or any other programming/scripting language) to generate simulated data in a tab-delimited file for import:
#!/usr/bin/python
import getopt, sys
import csv
import random
import time

def usage():
    print("Usage: %s [-h] [-r rows -f file]" % sys.argv[0])
    print(" e.g.: %s -r 20000 -f /db/data/test/pg_partitioning_infile.txt" % sys.argv[0])

# Return a random timestamp between start and end, formatted for PostgreSQL
def randDateTime(start, end, rand):
    format = '%Y-%m-%d %H:%M:%S'
    startTime = time.mktime(time.strptime(start, format))
    endTime = time.mktime(time.strptime(end, format))
    randTime = startTime + rand * (endTime - startTime)
    return time.strftime(format, time.localtime(randTime))

if __name__ == "__main__":
    rows = 0
    txtFile = ""
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hr:f:", ["help", "rows=", "file="])
    except getopt.GetoptError as err:
        usage()
        sys.exit(2)
    for okey, oval in opts:
        if okey in ("-h", "--help"):
            usage()
            sys.exit()
        elif okey in ("-r", "--rows"):
            rows = int(oval)
        elif okey in ("-f", "--file"):
            txtFile = oval
        else:
            assert False, "unhandled option"
    print("rows = %d" % rows)

    stateList = ['CA', 'NY', 'TX', 'IL', 'WA', 'FL', 'VA', 'CO', 'OH', 'NV', 'OR', 'PA',
                 'UT', 'MA', 'CT', 'LA', 'WI', 'WY', 'NM', 'NJ', 'NH', 'MI', 'MD', 'DC']
    propTypeList = ['Single Family', 'Condominium', 'Multi (2-4 units)', 'Duplex', 'Triplex',
                    'Quadruplex', 'Miscellaneous', 'Mobile Home', 'Residential Vacant Land']

    with open(txtFile, 'w') as f:
        w = csv.writer(f, dialect = 'excel-tab')
        rowCount = 0
        for propId in range(1000, 200000000):
            # Each simulated property gets 1-8 sales around a random base price
            state = random.choice(stateList)
            propType = random.choice(propTypeList)
            numSales = random.randint(1, 8)
            randPrice = random.randint(250, 2500) * 1000
            saleCount = 0
            while rowCount < rows and saleCount < numSales:
                saleDate = randDateTime("1980-01-01 00:00:00", "2017-01-01 00:00:00", random.random())
                # Vary each sale price by up to +/-20%, rounded to the nearest $1,000
                salePrice = randPrice * (1.0 + random.randint(-20, 20) / 100.0)
                salePrice = int(salePrice / 1000.0) * 1000
                w.writerow([state, propId, propType, saleDate, salePrice])
                rowCount += 1
                saleCount += 1
            if rowCount >= rows:
                break
Run the Python program to generate up to 1 billion rows of property sale data. Given the huge output, make sure the generated file goes on a storage device with plenty of space. Since the task is going to take quite some time to finish, it is best run in the background, perhaps with mail notification, like the following:
#
# Run pg_partitioning_infile.py to create the tab-delimited infile
#
cd /db/app/test/
nohup python pg_partitioning_infile.py -r 1000000000 \
    -f /db/data/test/pg_partitioning_infile_1b.txt 2>&1 \
    | mail -s "pg partitioning infile - creating 1-billion-row infile" me@mydomain.com &
Next, load data from the generated infile into the partitioned tables using psql. If indexes have already been created on the partitioned tables, it is generally much more efficient to drop them first and recreate them after the data is loaded, as in the following:
#
# Drop indexes, if they exist, to speed up loading
#
psql -d mydb -U dbu -h localhost

DROP INDEX IF EXISTS property_sale_ca_prop_type_id;
DROP INDEX IF EXISTS property_sale_ny_prop_type_id;
DROP INDEX IF EXISTS property_sale_tx_prop_type_id;
DROP INDEX IF EXISTS property_sale_il_prop_type_id;
DROP INDEX IF EXISTS property_sale_wa_prop_type_id;
DROP INDEX IF EXISTS property_sale_fl_prop_type_id;
DROP INDEX IF EXISTS property_sale_va_prop_type_id;
DROP INDEX IF EXISTS property_sale_co_prop_type_id;
DROP INDEX IF EXISTS property_sale_oh_prop_type_id;
DROP INDEX IF EXISTS property_sale_nv_prop_type_id;
DROP INDEX IF EXISTS property_sale_or_prop_type_id;
DROP INDEX IF EXISTS property_sale_pa_prop_type_id;
DROP INDEX IF EXISTS property_sale_ut_prop_type_id;
DROP INDEX IF EXISTS property_sale_ma_prop_type_id;
DROP INDEX IF EXISTS property_sale_ct_prop_type_id;
DROP INDEX IF EXISTS property_sale_la_prop_type_id;
DROP INDEX IF EXISTS property_sale_wi_prop_type_id;
DROP INDEX IF EXISTS property_sale_wy_prop_type_id;
DROP INDEX IF EXISTS property_sale_nm_prop_type_id;
DROP INDEX IF EXISTS property_sale_nj_prop_type_id;
DROP INDEX IF EXISTS property_sale_nh_prop_type_id;
DROP INDEX IF EXISTS property_sale_mi_prop_type_id;
DROP INDEX IF EXISTS property_sale_md_prop_type_id;
DROP INDEX IF EXISTS property_sale_dc_prop_type_id;
DROP INDEX IF EXISTS property_sale_err_prop_type_id;
\q

#
# Load data from the infile into property_sale
#
nohup psql -d mydb -U dbu -h localhost -c "\copy property_sale(state, property_id, property_type, sale_date, sale_price) from '/db/data/test/pg_partitioning_infile_1b.txt' delimiter E'\t'" 2>&1 | mail -s "pg partitioning test - loading 1 billion rows" me@mydomain.com &

psql -d mydb -U dbu -h localhost -c "select count(*) from property_sale;"
psql -d mydb -U dbu -h localhost -c "select count(*) from property_sale where state = 'NY';"

#
# Recreate indexes (run in background)
#
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_ca_prop_type_id ON property_sale_ca (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_ny_prop_type_id ON property_sale_ny (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_tx_prop_type_id ON property_sale_tx (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_il_prop_type_id ON property_sale_il (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_wa_prop_type_id ON property_sale_wa (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_fl_prop_type_id ON property_sale_fl (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_va_prop_type_id ON property_sale_va (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_co_prop_type_id ON property_sale_co (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_oh_prop_type_id ON property_sale_oh (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_nv_prop_type_id ON property_sale_nv (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_or_prop_type_id ON property_sale_or (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_pa_prop_type_id ON property_sale_pa (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_ut_prop_type_id ON property_sale_ut (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_ma_prop_type_id ON property_sale_ma (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_ct_prop_type_id ON property_sale_ct (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_la_prop_type_id ON property_sale_la (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_wi_prop_type_id ON property_sale_wi (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_wy_prop_type_id ON property_sale_wy (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_nm_prop_type_id ON property_sale_nm (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_nj_prop_type_id ON property_sale_nj (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_nh_prop_type_id ON property_sale_nh (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_mi_prop_type_id ON property_sale_mi (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_md_prop_type_id ON property_sale_md (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_dc_prop_type_id ON property_sale_dc (property_type, id);" &
nohup psql -d mydb -U dbu -h localhost -c "CREATE INDEX property_sale_err_prop_type_id ON property_sale_err (property_type, id);" &
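Once the background jobs have finished, the pg_indexes system view offers a quick way to confirm that all the per-partition indexes are back in place:

--
-- Confirm the recreated indexes
--
psql -d mydb -U dbu -h localhost -c "select tablename, indexname from pg_indexes where tablename like 'property_sale%' order by tablename;"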
Querying with constraint exclusion
Prior to querying the tables, make sure the query optimization parameter, constraint_exclusion, is enabled.
--
-- Turn on query optimization for partitioned tables
--
SET constraint_exclusion = on;

--
-- Query test (note: SET applies per session, hence issued together with the query here)
--
psql -d mydb -U dbu -h localhost -c "SET constraint_exclusion = on; select count(*) from property_sale where state = 'NY' and property_type = 'Condominium';"
With constraint exclusion enabled, the query planner is smart enough to examine the query constraints and exclude from scanning those partitions that cannot match them. Unfortunately, if the constraints involve comparisons against non-constants such as the NOW() function, the planner doesn't have enough information at planning time to filter out unwanted partitions, and hence can't take advantage of the optimization.
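An easy way to verify the behavior is to EXPLAIN a query that pins the partition key to a constant; with constraint exclusion in effect, the plan should touch only the matching child table (plus the empty master), rather than all two dozen partitions:

--
-- Constraint exclusion at work (actual plan output will vary)
--
EXPLAIN SELECT count(*) FROM property_sale WHERE state = 'NY';
-- expect scans of only property_sale (the empty master) and property_sale_ny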
Final notes
With a suitable partitioning scheme applied to a big table, query performance can be improved by an order of magnitude. As illustrated in the above case, the entire partitioning scheme centers on the partition key, so it's critical to plan out carefully which column (or combination of columns) to partition on. The number of partitions should also be thought through: too few partitions might not help much, whereas too many would create excessive overhead.