http://postgres-data-types.pvh.ca/
(If you want a reference or see a bug, let me know!)
— Knuth
bigserial or uuid for your primary key.*If your table is small, the extra size doesn't matter.
If your table is big, you're going to need it anyway.
CREATE EXTENSION "uuid-ossp";SELECT uuid_generate_v4();
CREATE EXTENSION pgcrypto;SELECT gen_random_uuid();
(Or create the UUIDs in the client.)
text!varchar, char, and anything else.An index supports prefixes (LIKE 'Peter%')
CREATE INDEX ON users (name);SELECT * FROM accounts WHERE email LIKE 'Peter%';An index supports prefixes (LIKE 'Peter%')
CREATE INDEX ON users (name);SELECT * FROM accounts WHERE email LIKE 'Peter%';
A functional index can support suffix lookups.
CREATE INDEX backsearch ON users (reverse(email));SELECT * FROM accounts WHERE reverse(email) LIKE reverse('%doe.com');`A tsvectors with a GIN index will find individual words efficiently
CREATE INDEX ON products USING gin(tsv);
Regular expressions: ~
SELECT * FROM users WHERE name ~ '(John|Jane)\s+Doe';
(You can get some index capability via pg_trgm/GIN index)
timestamptztimestamptz. (Not time.)timestamptztimestamptz. (Not time.)dateA query that always starts at the beginning of the current week
SELECT date_trunc('week', now())::date;A query that always starts at the beginning of the current week
SELECT date_trunc('week', now())::date;
A query that counts users created by day
SELECT date_trunc('day', created_at), count(*)FROM users GROUP BY date_trunc('day', created_at)When was a year ago?
SELECT now() - '1 year'::interval;When was a year ago?
SELECT now() - '1 year'::interval;
All the days last month (good for joining):
SELECT generate_series(date_trunc('month', now() - '1 month'::interval), date_trunc('month', now()), '1 day'::interval)bool, not bitbool, not bitbytea(Do consider if Postgres is the right solution...)
Mostly you will leave these alone and treat them as opaque.
Handy Exception:
SELECT md5(binary_data) FROM table;money: not up to modern standardstimestamp: use timestamptztime: you probably meant timestamptzserial: use bigserialfloat / integer: use numericvarchar, char: use text, it's fasterbitstring: premature optimizationxml: libxml2 is awful, but...json: you probably want jsonb typename_{in, out}()typename_{send, recv}() optionallyhost(), unnest(), average())<, =, ~)'rain', '{1, 3, 5}'

Large values are compressed and moved to a special storage location called a TOAST table automatically.
This assumes large values shouldn't be kept with their row anyway.
tsvector on textGIN on tsvector / jsonbarrayjsonbenumrangeIt's an array.
It's an array. In a value.
It's an array. In a value. It's not really rocket science.
It's an array. In a value. It's not really rocket science.
Useful for:
Make an array.
SELECT ARRAY[1, 2, 3];
Make it another way.
SELECT '{1, 2, 3}'::numeric[];
Extend your array.
SELECT ARRAY[1,2] || 3;SELECT ARRAY[1,2] || ARRAY[3, 4];
Aggregate from records.
SELECT array_agg(distinct users) from events where name = 'PGDay.RU';Make an array.
SELECT ARRAY[1, 2, 3];
Make it another way.
SELECT '{1, 2, 3}'::numeric[];
Extend your array.
SELECT ARRAY[1,2] || 3;SELECT ARRAY[1,2] || ARRAY[3, 4];
Aggregate from records.
SELECT array_agg(distinct users) from events where name = 'PGDay.RU';
You can make N-dimensional arrays too.
(Remember SQL arrays are 1-indexed!)
SELECT ('{one, two, three}'::text[])[1]; -- one
Use containment operator to find things.
CREATE INDEX ON table USING gin(tags); -- because fast!SELECT * FROM table WHERE tags @> array['sometag'];
(GIN indexes are great here)
— Oleg Bartunov
Convenient short-lived or occasional attribute storage.
{"weird_user?": true, "abuse_tags": ["new", "possible_abuser"]}
(Honour demands that I insist you make these columns if they get used often.)
Hierarchical data.
{"name": "car", "id": 91371 "parts": [ { "name": "engine" "id": 3241, "parts": [ { "name": "crankshaft", "id": 23991, ...The data is already JSON.
Why pull it apart?
jsonb vs jsonPrefer jsonb, not json.
ALTER TABLE users ADD COLUMN attrs jsonb;
Usually add a GIN index.
CREATE INDEX ON users USING gin(attrs);Probably don't do this...
CREATE TABLE bad_table(id serial, json json);Particularly if..
=# select distinct jsonb_object_keys(json) from bad_table; jsonb_object_keys------------------- id created_at updated_at name email owner_id ...
Performance will suffer and your queries will be more complex.
Input is easy.
select '{"a": "b"}'::jsonb
Adding members too.
UPDATE documents SET contents = content || '{"new_field": "goes here"}' WHERE document_id = 32421;Get to elements with -> and ->>
select attrs->>'presenter_irc_nick' from talks attrs------- pvh
Using a single -> returns the internal JSON.
select attrs->'previous_presentation'->>'location' from talks attrs----------- "Ottawa"
Or send a whole path
SELECT jsonb_extract_path_text(attrs, 'previous_presentation', 'location')Turning JSON into records (requires column definition).
select * from json_to_recordset( '[{"a":1,"b":"foo"},{"a":"2","c":"bar"}]') as x(a int, b text);Find documents with a certain key,
select * from talks where attrs ? 'previous_presentation'
or documents with a whole sub-document in common.
SELECT * FROM talks WHERE attrs @> '{"previous_presentation": "PGCon"}'CREATE TYPE weekdays AS ('Mon', 'Tue', 'Wed', 'Thu', 'Fri');
Enums are conceptually similar to domains.
Fast, transparent mapping of words to integers.
(Data lives in pg_enum.)
Enums provide a limited set of labels.
Storage is quite efficient - enums are stored as numbers.
Enums work like enums in other languages.
CREATE TYPE server_states AS ENUM ('running', 'uncertain', 'offline', 'restarting');Enums work like enums in other languages.
CREATE TYPE server_states AS ENUM ('running', 'uncertain', 'offline', 'restarting');
You can treat them like their text values.
INSERT INTO servers(state) VALUES ('offline');Enums work like enums in other languages.
CREATE TYPE server_states AS ENUM ('running', 'uncertain', 'offline', 'restarting');
You can treat them like their text values.
INSERT INTO servers(state) VALUES ('offline');
Enums block unexpected values from insertion.
INSERT INTO servers(state) VALUES ('mystery');ERROR: invalid input value for enum server_states: "mystery"LINE 1: insert into servers(state) values ('mystery');Enums work like enums in other languages.
CREATE TYPE server_states AS ENUM ('running', 'uncertain', 'offline', 'restarting');
You can treat them like their text values.
INSERT INTO servers(state) VALUES ('offline');
Enums block unexpected values from insertion.
INSERT INTO servers(state) VALUES ('mystery');ERROR: invalid input value for enum server_states: "mystery"LINE 1: insert into servers(state) values ('mystery');
But you can add new values.
ALTER TYPE server_states ADD VALUE 'abandoned' AFTER 'offline';Enums work like enums in other languages.
CREATE TYPE server_states AS ENUM ('running', 'uncertain', 'offline', 'restarting');
You can treat them like their text values.
INSERT INTO servers(state) VALUES ('offline');
Enums block unexpected values from insertion.
INSERT INTO servers(state) VALUES ('mystery');ERROR: invalid input value for enum server_states: "mystery"LINE 1: insert into servers(state) values ('mystery');
But you can add new values.
ALTER TYPE server_states ADD VALUE 'abandoned' AFTER 'offline';
(NB: You can treat enums as numbers and compare with > and < but... don't.)
Ranges are a simple but powerful variation on composite types.
INSERT INTO reservation VALUES ('[2016-01-01 11:30, 2016-01-01 15:00)');
Always has upper / lower, inclusive / exclusive bounds.
Benefits include:
Some useful predefined range types:
Or define your own:
CREATE TYPE inetrange AS RANGE ( SUBTYPE = inet);Ranges can be created mid-query.
SELECT tstzrange(created_at, deleted_at) FROM users;Ranges can be created mid-query.
SELECT tstzrange(created_at, deleted_at) FROM users;
(NULLs are helpfully treated as -/+ infinity.)
Ranges can be created mid-query.
SELECT tstzrange(created_at, deleted_at) FROM users;
(NULLs are helpfully treated as -/+ infinity.)
Find users with active accounts on Christmas Day
SELECT * FROM usersWHERE tstzrange(created_at, deleted_at) @> '2015-12-25'::timestamptzRanges can be created mid-query.
SELECT tstzrange(created_at, deleted_at) FROM users;
(NULLs are helpfully treated as -/+ infinity.)
Find users with active accounts on Christmas Day
SELECT * FROM usersWHERE tstzrange(created_at, deleted_at) @> '2015-12-25'::timestamptz
Indexing makes this very fast.
CREATE INDEX ON users USING GIST( tstzrange(created_at, deleted_at) );Understand how events unfold...
WITH events_duration AS ( SELECT tstzrange( at_time, lead(at_time, 1) OVER (partition by event_source order by at_time) ) as range, * FROM events)SELECT * FROM events WHERE range @> now() - '4 hours';Ranges are also great for exclusion constraints.
Prevent double-booking!
ALTER TABLE reservation ADD CONSTRAINT EXCLUDE USING gist(room with =, during with &&)Everyone has email addresses, here's how you deal with them today.
SELECT * FROM users WHERE email LIKE '%@bigcorp.com'Everyone has email addresses, here's how you deal with them today.
SELECT * FROM users WHERE email LIKE '%@bigcorp.com'
This is always a full table scan. Sad trombone.
Everyone has email addresses, here's how you deal with them today.
SELECT * FROM users WHERE email LIKE '%@bigcorp.com'
This is always a full table scan. Sad trombone.
We can try our reverse b-tree trick.
SELECT * FROM users WHERE reverse(email) LIKE reverse('%@bigcorp.com')Everyone has email addresses, here's how you deal with them today.
SELECT * FROM users WHERE email LIKE '%@bigcorp.com'
This is always a full table scan. Sad trombone.
We can try our reverse b-tree trick.
SELECT * FROM users WHERE reverse(email) LIKE reverse('%@bigcorp.com')
This is... clever, but ugly and hard to dream up.
emailaddrSELECT user(email) FROM users WHERE host(email) = 'bigcorp.com'
Bonus: By default, email addresses are sorted domain/user, so this is fast with no tricks!
uriuriSELECT path(referrer) FROM signups WHERE host = 'interestingblog.com'SELECT 'http://google.com/'::uri
SELECT host('http://google.com/'::uri)Postgres has a wide selection of geometry types.
+, -, *, /@-@ (length), # (intersects)@> (contains), <-> (distance)&<|)SELECT diameter(circle) FROM circles WHERE center(circle) = '(2, 3)'geomgeogSELECT sum(ST_Length(r.the_geom))/1000 AS kilometersFROM bc_roads r, bc_municipality mWHERE r.name = 'Roseberry St' AND m.name = 'VICTORIA' AND ST_Contains(m.the_geom, r.the_geom) ;192.168.0.100)::ffff:1.2.3.0/120)08:00:2b:01:02:03)https://github.com/RhodiumToad/ip4r
Adds index support for ranges, lighter-weight single IP type. (And lots more.)
Bio-informatic types for DNA querying
https://colab.mpi-bremen.de/wiki/display/pbis/PostBIS
Very stale (6 years) but a solid foundation!
https://github.com/samv/pg-currency
For when you need a fixed-precision decimal (performance)
https://github.com/2ndQuadrant/fixeddecimal
Not really a data type, but a data structure for estimating distinct values.
https://github.com/aggregateknowledge/postgresql-hll
Create your own types that constrain existing types.
Create your own types that constrain existing types.
Create your own types that constrain existing types.
CREATE DOMAIN email AS TEXTCHECK( VALUE ~ '.+\@.+');
CREATE DOMAIN web_url AS uri CHECK ( (uri_scheme(value) = 'http' OR uri_scheme(value) = 'https') AND uri_host(value) IS NOT null);CREATE DOMAIN email AS TEXTCHECK( VALUE ~ '.+\@.+');
CREATE DOMAIN web_url AS uri CHECK ( (uri_scheme(value) = 'http' OR uri_scheme(value) = 'https') AND uri_host(value) IS NOT null);
(A better email address would check other things like length.)
Create your own types that combine existing types.
Create your own types that combine existing types.
Every row is also a composite type.
SELECT (users) FROM users;
But you can create your own:
CREATE TYPE point3d AS (x float, y float, z float);CREATE TABLE threespace (location point3d);Referring to a row type (composite type named for table) requires parens
SELECT (users).email FROM users;
Row types are handy in window expressions
SELECT (agent_statuses) AS current, lead((agent_statuses), 1) OVER (partition by agent_uuid order by time) AS next FROM agent_statuses;Defining a new function is straightforward
CREATE OR REPLACE FUNCTION host(email) RETURNS text LANGUAGE sql AS $$ select split_part($1, '@', 2) as return; $$;Defining a new function is straightforward
CREATE OR REPLACE FUNCTION host(email) RETURNS text LANGUAGE sql AS $$ select split_part($1, '@', 2) as return; $$;
PL/PGSQL is very slow.
PL/V8 is relatively fast.
PL/SQL is quite fast.
Define an initial condition, a per-value function, and a final function (optional).
CREATE AGGREGATE avg (float8)( sfunc = float8_accum, stype = float8[], finalfunc = float8_avg, initcond = '{0,0,0}');An operator is any number of non-ASCII characters.
CREATE OPERATOR @-> ( LEFTARG = 'email', PROCEDURE = 'host');SELECT email@-> FROM emails LIMIT 1;money is not a good type, and we can do better.
SELECT justify('$34.03 USD, $12.01 CAD', 'USD', '{"CAD": 0.77}');Postgres users don't crash rovers into Mars.
SELECT '77 km/h'::physicalSELECT to_unit('ml', '542 fluid ounces'::physical);SELECT '4 m/s'::physical + '1 ounce'::physical; -- errorUPDATE graphs SET image = render_graph(SELECT * FROM data WHERE run = 117) ;
SELECT gif_frame(image, 3) FROM gifs WHERE name = 'space-kaboom';
Image classification, sorting, organizing, resampling, and so on in-DB could be useful.
SELECT genre FROM mp3s;SELECT max(sound_length(sound)) FROM samples;
Could be great for people with large music collections, DJs, researchers.
Consider building on taglib?
text, timestamptz, bigserial/uuid
jsonb, array, tstzrange/numrange
Domains, composite types, functions, operators, aggregates.
PostGIS, ip4r, email, uri, pgxn
Feedback is welcome and invited!
thanks to: peter eisentraut, jim mlodgenski, jonathan katz, bruce momjian, josh berkus, will leinweber, rachid belaid
http://postgres-data-types.pvh.ca/
(If you want a reference or see a bug, let me know!)
Keyboard shortcuts
| ↑, ←, Pg Up, k | Go to previous slide |
| ↓, →, Pg Dn, Space, j | Go to next slide |
| Home | Go to first slide |
| End | Go to last slide |
| b / m / f | Toggle blackout / mirrored / fullscreen mode |
| c | Clone slideshow |
| p | Toggle presenter mode |
| t | Restart the presentation timer |
| ?, h | Toggle this help |
| Esc | Back to slideshow |