Table with smaller data types seems to take more space on disk?


I have these two tables, identical except for the column types:

                  Table "public.region"
   Column    |   Type    | Collation | Nullable | Default 
-------------+-----------+-----------+----------+---------
 r_regionkey | integer   |           | not null | 
 r_name      | char(25)  |           |          | 
 r_comment   | char(152) |           |          | 
Indexes:
    "region_pkey" PRIMARY KEY, btree (r_regionkey)

and

                  Table "public.region2"
   Column    |   Type   | Collation | Nullable | Default 
-------------+----------+-----------+----------+---------
 r_regionkey | smallint |           | not null | 
 r_name      | text     |           |          | 
 r_comment   | text     |           |          | 
Indexes:
    "region_pkey" PRIMARY KEY, btree (r_regionkey)

I am using smallint and text in order to save space, but weirdly this is the result:

select pg_size_pretty(pg_table_size('region'))

returns 8192 bytes while

select pg_size_pretty(pg_table_size('region2'))

returns 48 kB.

Why is region2 taking more space, even though I am using smallint instead of integer and text instead of char(n)?

How to solve:

Method 1

After running VACUUM FULL public.region; and VACUUM FULL public.region2;, test again with:

SELECT pg_relation_size('public.region');

Three possible issues:

  1. The obvious reason: table bloat from updates or deletes. Removed by VACUUM FULL.

  2. Schema-qualified names ('public.region' instead of just 'region') make sure you don't measure a table of the same name in a different schema by accident. Probably not the case here.

  3. pg_table_size() includes auxiliary relation forks (files), which may be filled for one table, but not for the other. For your purpose, the more accurate test is with pg_relation_size().
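Points 1 and 2 can be checked directly before reaching for VACUUM FULL. This is a minimal sketch using the standard catalogs and statistics views; it assumes the two tables are ordinary tables named region and region2, as in the question, and that the statistics are reasonably current.

```sql
-- Point 1: dead tuples left behind by UPDATE / DELETE.
-- n_dead_tup is an estimate from the statistics system, not an exact count.
SELECT schemaname, relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum
FROM   pg_stat_user_tables
WHERE  relname IN ('region', 'region2');

-- Point 2: every ordinary table with one of these names, in any schema.
SELECT n.nspname AS schema_name, c.relname AS table_name
FROM   pg_class c
JOIN   pg_namespace n ON n.oid = c.relnamespace
WHERE  c.relkind = 'r'
AND    c.relname IN ('region', 'region2');

-- Shows which schema an unqualified name resolves to first.
SHOW search_path;
```

If the second query returns more than one row per table name, an unqualified 'region' may resolve to a different table than the one you meant to measure.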

The manual:

pg_table_size ( regclass ) → bigint

Computes the disk space used by the specified table, excluding indexes
(but including its TOAST table if any, free space map, and visibility
map).

And:

pg_relation_size ( relation regclass [, fork text ] ) → bigint

Computes the disk space used by one “fork” of the specified relation.
(Note that for most purposes it is more convenient to use the
higher-level functions pg_total_relation_size or pg_table_size,
which sum the sizes of all forks.) With one argument, this returns the
size of the main data fork of the relation. The second argument can be
provided to specify which fork to examine:

  • main returns the size of the main data fork of the relation.

  • fsm returns the size of the Free Space Map (see Section 70.3) associated with the relation.

  • vm returns the size of the Visibility Map (see Section 70.4) associated with the relation.

  • init returns the size of the initialization fork, if any, associated with the relation.

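Since the rows in your example are far too small for anything to be moved to TOAST storage, it is mainly fsm and vm that make a difference (plus the small index of an otherwise empty TOAST table, if the table has one). Those are typically negligible in size for bigger tables, but relevant for your minimal test. Both fsm and vm may go down to "0 bytes" after VACUUM FULL.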
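To see where the bytes reported by pg_table_size() come from, the forks can be listed side by side. This is a sketch rather than a definitive diagnostic; the column aliases are made up for illustration, and toast_bytes counts the TOAST table together with its index.

```sql
SELECT c.relname,
       pg_relation_size(c.oid, 'main') AS main_bytes,   -- heap data only
       pg_relation_size(c.oid, 'fsm')  AS fsm_bytes,    -- free space map
       pg_relation_size(c.oid, 'vm')   AS vm_bytes,     -- visibility map
       CASE WHEN c.reltoastrelid <> 0                   -- 0 means no TOAST table
            THEN pg_total_relation_size(c.reltoastrelid)
            ELSE 0
       END                             AS toast_bytes,
       pg_table_size(c.oid)            AS table_bytes   -- what the question measured
FROM   pg_class c
WHERE  c.oid IN ('public.region'::regclass, 'public.region2'::regclass);
```

Running it before and after VACUUM FULL shows which of the components changes.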

Test with more rows (thousands).
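One way to run such a test without touching the real data is to fill temporary copies of both tables. The table names region_t and region2_t and the generated values below are hypothetical; only the column definitions are taken from the question.

```sql
-- Scratch copies with the same column definitions (indexes are not copied).
CREATE TEMP TABLE region_t  (LIKE public.region);
CREATE TEMP TABLE region2_t (LIKE public.region2);

-- 10000 synthetic rows each; the keys stay within the smallint range.
INSERT INTO region_t
SELECT g, 'region ' || g, 'comment for region ' || g
FROM   generate_series(1, 10000) AS g;

INSERT INTO region2_t
SELECT g, 'region ' || g, 'comment for region ' || g
FROM   generate_series(1, 10000) AS g;

-- Compare the main data fork only.
SELECT pg_size_pretty(pg_relation_size('region_t'))  AS region_size,
       pg_size_pretty(pg_relation_size('region2_t')) AS region2_size;
```

Keep in mind that char(n) values are stored blank-padded to n characters, so short strings like these favor the text columns more than strings that fill the declared length would.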



All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
