PostgreSQL - UPSERT Operation

In PostgreSQL, the UPSERT operation means either UPDATE or INSERT operation. The UPSERT operation allows us to either insert a row or skip the insert operation if a row already exists and update that row instead.

Suppose you want to insert bulk data from one table to another table that already has some data. In this case, there might be some rows that already exist in the table or some rows with old data which you want to update. You can either skip the insert operation or update some fields if records already exist with the same primary key or update some fields for that rows based on some criteria.

PostgreSQL implements the UPSERT operation using ON CONFLICT clause of the INSERT statement. The ON CONFLICT specifies alternative action to be taken in case of conflict occurring during the insert operation.

Syntax:
INSERT INTO <table_name> (column_list)
    VALUES(value_list)
[ON CONFLICT <conflict_targe> <conflict_action>,]
[RETURNING * or <column_name>];

In the above INSERT statement, The conflict_target can be

  • A column name that is a primary key or has a unique constraint or has a unique index. It cannot be other non-unique columns.
  • ON CONSTRAINT constraint_name where constraint_name can be a unique constraint name
  • WHERE clause with a predicate condition that returns TRUE or FALSE.
  • DO NOTHING: It skips the insert operation and does nothing. If you use the DO NOTHING clause, then conflict_target is not needed.
  • DO UPDATE statement: Use the DO UPDATE SET column1 = value1,... WHERE - to update fields of the table. ON CONFLICT DO UPDATE ensure that either insert or update operation is definitely performed.

To see how the UPSERT operation works, let's create the following employee table.

Example: Create Demo Table
CREATE TABLE employee (
    emp_id 	INT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    gender CHAR(1),
    birthdate DATE,
    email VARCHAR(100) UNIQUE,
    salary INT
);

Now let's insert a row into the employee table using the insert statement.

Example: Insert Data
INSERT INTO employee
VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000);

This will insert a row into the employee table, as shown below.

Now, if you execute the same INSERT statement again then it will raise an error

The above employee table has the emp_id as a Primary key and data already there for emp_id = 1. If you try to insert a new record with emp_id = 1 then it will raise an error.

ON CONFLICT DO NOTHING

If you use the ON CONFLICT DO NOTHING statement with the INSERT statement then it will not raise any error and will not insert any rows into the employee table, as we are asking to DO NOTHING in case of conflict.

Example: On Conflict Do Nothing
INSERT INTO employee
VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT DO NOTHING;

The above statement would display the following result in pgAdmin:

In the above result, INSERT 0 0 indicates that it has inserted 0 and updated 0 rows.

You can achieve the same functionality as above using the ON CONFLICT(column_name) DO NOTHING, as shown below. Here the column_name should be the primary key column or has a unique constraint that ensures all the values are unique.

Example: UPSERT with PK Column
INSERT INTO employee
VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT(emp_id) DO NOTHING;

Alternatively, you can also specify the primary key or unique constraint name with the ON CONFLICT ON CONSTRAINT clause, as shown below.

Example: UPSERT with PK Constraint Name
INSERT INTO employee
VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT ON CONSTRAINT employee_pkey DO NOTHING;

In the above example, the primary key constraint name employee_pkey is used with DO NOTHING which indicates to skip the insert operation if a row violates the primary key constraint.

ON CONFLICT DO UPDATE

The ON CONFLICT DO UPDATE statement executes the specified update statement if a conflict occurs. For example, the following updates the last_name column in the existing row if a row with the same emp_id exists.

Example: ON UPDATE DO UPDATE
INSERT INTO employee
VALUES(1,'Annie','Rizzolo','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT(emp_id) DO UPDATE SET last_name = 'Rizzolo';

The same thing can be achieved using the EXCLUDED table object instead of specifying explicit value in the update statement, as shown below.

Example: Using EXCLUDED Table
INSERT INTO employee
VALUES(1,'Annie','Rizzolo','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT(emp_id) DO UPDATE SET last_name = EXCLUDED.last_name;

In the above result, INSERT 0 1 indicates 0 rows inserted and 1 row updated. PostgreSQL will try to insert a record, but because the emp_id column already contains 1, the conflict will occur and it will execute the DO UPDATE SET last_name = EXCLUDED.last_name statement.

The EXCLUDED is a table object that points to the reference values of the specified INSERT statement. Here, the EXCLUDED.last_name returns 'Rizzolo'.

Let's check the data in the table, you can see the last_name is changed to 'Rizzolo'.

ON CONFLICT WHERE

Use the ON CONFLICT WHERE clause to skip or do update operation based on some condition.

For example, the following executes the update statement UPDATE SET salary = 0 if a conflict occurs and also if salary > 0 in the existing row. If the salary is 0 then skip then it will skip the insert or update operation.

Example:
INSERT INTO employee
VALUES(1,'Annie','Rizzolo','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT(emp_id) where salary > 0 
DO UPDATE SET salary = 0;

The above would update the salary field to 0, as shown below.

Note that you cannot use EXCLUDED table with the WHERE clause.

Thus, you can perform the UPSERT operation using the INSERT statement and manage the conflicts.