Parsing DATE while copying csv file into PostgreSQL table

I have a long series of .csv files that I want to import into a local database. I believe my query is correct, but the DATE and TIMESTAMP columns are not parsed properly. PostgreSQL expects these columns in ISO format ("yyyy-mm-dd"), but my data uses another format: "dd/mm/yyyy".

I read online and on other Stack Overflow answers that one can SET the datestyle to be different, but it’s not recommended.

Is there a way to specify the format of the columns to import? Also, I do not need to import all columns from the csv file: can I leave some out?

Details

First, I wrote the code to create the table (sorry if column names are in Italian, but it’s not important):

CREATE TABLE IF NOT EXISTS bikes (
    bici INT,
    tipo_bici VARCHAR(20),
    cliente_anonimizzato INT,
    data_riferimento_prelievo DATE,
    data_prelievo TIMESTAMP,
    numero_stazione_prelievo INT,
    nome_stazione_prelievo TEXT,
    slot_prelievo SMALLINT,
    data_riferimento_restituzione DATE,
    data_restituzione TIMESTAMP,
    numero_stazione_restituzione INT,
    nome_stazione_restituzione TEXT,
    slot_restituzione SMALLINT,
    durata VARCHAR(10),
    distanza_totale REAL,
    co2_evitata REAL,
    calorie_consumate REAL,
    penalità CHAR(2)
);

Then I add the query to copy data into the table:

COPY bikes(
    bici,
    tipo_bici,
    cliente_anonimizzato,
    data_riferimento_prelievo,
    data_prelievo,
    numero_stazione_prelievo,
    nome_stazione_prelievo,
    slot_prelievo,
    data_riferimento_restituzione,
    data_restituzione,
    numero_stazione_restituzione,
    nome_stazione_restituzione,
    slot_restituzione,
    durata,
    distanza_totale,
    co2_evitata,
    calorie_consumate,
    penalità
)
FROM '/Users/luca/tesi/data/2019q3.csv'
DELIMITER ','
CSV HEADER;

The code seems fine, except the following error pops up:

ERROR:  date/time field value out of range: "31/07/2019"
HINT:  Perhaps you need a different "datestyle" setting.
CONTEXT:  COPY bikes, line 25296, column data_riferimento_restituzione: "31/07/2019"
SQL state: 22008

How can I specify the date format to parse in the CREATE TABLE portion of the code? Also, I do not actually need all the columns of this csv; how do I leave some out? I tried to specify only those I need, but I get an import error:

ERROR:  extra data after last expected column

How to solve:

Method 1

Set the datestyle parameter to ISO, DMY, and day-first dates such as "31/07/2019" will be parsed as you want. There is nothing wrong with setting that parameter; do it with SET right before you COPY.
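
A minimal sketch of that advice, reusing the table and file path from the question. The transaction wrapper and SET LOCAL are my own additions, not part of the original answer: they are one way to keep the setting from outliving the import. Whether the TIMESTAMP columns in the file parse cleanly under DMY depends on their exact layout, which the question does not show.

```sql
BEGIN;

-- SET LOCAL lasts only until the end of this transaction.
-- 'ISO' is the output format; 'DMY' makes ambiguous input day-first.
SET LOCAL datestyle = 'ISO, DMY';

-- Quick check: this literal raises "out of range" under MDY ordering,
-- but is accepted as 31 July 2019 under DMY.
SELECT '31/07/2019'::date;

-- Without a column list, COPY fills all table columns in declared order.
COPY bikes
FROM '/Users/luca/tesi/data/2019q3.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');

COMMIT;
```

Note that the error in the question only appeared at line 25296. Under month-first ordering, an earlier value such as "01/07/2019" is a valid date (7 January), so it would have been accepted with the wrong meaning rather than rejected. Setting the field order explicitly avoids that kind of silent misreading.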

COPY has no way to skip columns of the CSV file. Add the extra columns to the table and drop them after the import; that is cheap.
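
As a sketch of the second step: once every file has been loaded, the unwanted columns can be dropped in a single statement. The three columns chosen here are purely hypothetical, since the question does not say which ones are unneeded.

```sql
-- Hypothetical choice of unwanted columns; substitute your own.
-- Run this only after all CSV files are imported, because COPY
-- still needs one table column per CSV field.
ALTER TABLE bikes
    DROP COLUMN co2_evitata,
    DROP COLUMN calorie_consumate,
    DROP COLUMN penalità;
```

DROP COLUMN is cheap because PostgreSQL does not rewrite the table; it only makes the column invisible to SQL operations, and the space is reclaimed gradually as rows are updated.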

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
