PostgreSQL - UPSERT Operation
In PostgreSQL, the UPSERT operation means either UPDATE or INSERT operation. The UPSERT operation allows us to either insert a row or skip the insert operation if a row already exists and update that row instead.
Suppose you want to insert bulk data from one table to another table that already has some data. In this case, there might be some rows that already exist in the table or some rows with old data which you want to update. You can either skip the insert operation or update some fields if records already exist with the same primary key or update some fields for that rows based on some criteria.
PostgreSQL implements the UPSERT operation using
ON CONFLICT clause of the INSERT statement.
ON CONFLICT specifies alternative action to be taken in case of conflict occurring during the insert operation.
INSERT INTO <table_name> (column_list) VALUES(value_list) [ON CONFLICT <conflict_targe> <conflict_action>,] [RETURNING * or <column_name>];
In the above INSERT statement, The conflict_target can be
- A column name that is a primary key or has a unique constraint or has a unique index. It cannot be other non-unique columns.
- ON CONSTRAINT constraint_name where constraint_name can be a unique constraint name
- WHERE clause with a predicate condition that returns TRUE or FALSE.
- DO NOTHING: It skips the insert operation and does nothing. If you use the DO NOTHING clause, then conflict_target is not needed.
DO UPDATE statement: Use the DO UPDATE SET column1 = value1,... WHERE
- to update fields of the table. ON CONFLICT DO UPDATE ensure that either insert or update operation is definitely performed.
To see how the UPSERT operation works, let's create the following
CREATE TABLE employee ( emp_id INT PRIMARY KEY, first_name VARCHAR(50) NOT NULL, last_name VARCHAR(50) NOT NULL, gender CHAR(1), birthdate DATE, email VARCHAR(100) UNIQUE, salary INT );
Now let's insert a row into the
employee table using the insert statement.
INSERT INTO employee VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000);
This will insert a row into the
employee table, as shown below.
Now, if you execute the same INSERT statement again then it will raise an error
employee table has the
emp_id as a Primary key and data already there for
emp_id = 1.
If you try to insert a new record with
emp_id = 1 then it will raise an error.
ON CONFLICT DO NOTHING
If you use the ON CONFLICT DO NOTHING statement with the INSERT statement then it will not raise any error and will not insert any rows into the employee table, as we are asking to DO NOTHING in case of conflict.
INSERT INTO employee VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000) ON CONFLICT DO NOTHING;
The above statement would display the following result in pgAdmin:
In the above result,
INSERT 0 0 indicates that it has inserted 0 and updated 0 rows.
You can achieve the same functionality as above using the
ON CONFLICT(column_name) DO NOTHING, as shown below. Here the
column_name should be the primary key column or has a unique constraint that ensures all the values are unique.
INSERT INTO employee VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000) ON CONFLICT(emp_id) DO NOTHING;
INSERT INTO employee VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000) ON CONFLICT ON CONSTRAINT employee_pkey DO NOTHING;
In the above example, the primary key constraint name
employee_pkey is used with DO NOTHING which indicates to skip the insert operation if a row violates the primary key constraint.
ON CONFLICT DO UPDATE
ON CONFLICT DO UPDATE statement executes the specified update statement if a conflict occurs.
For example, the following updates the
last_name column in the existing row if a row with the same
INSERT INTO employee VALUES(1,'Annie','Rizzolo','F', DATE '1988-01-09', '[email protected]',5000) ON CONFLICT(emp_id) DO UPDATE SET last_name = 'Rizzolo';
The same thing can be achieved using the
EXCLUDED table object instead of specifying explicit value in the update statement, as shown below.
INSERT INTO employee VALUES(1,'Annie','Rizzolo','F', DATE '1988-01-09', '[email protected]',5000) ON CONFLICT(emp_id) DO UPDATE SET last_name = EXCLUDED.last_name;
In the above result,
INSERT 0 1 indicates 0 rows inserted and 1 row updated.
PostgreSQL will try to insert a record, but because the
emp_id column already contains 1, the conflict will occur and it will execute the
DO UPDATE SET last_name = EXCLUDED.last_name statement.
EXCLUDED is a table object that points to the reference values of the specified INSERT statement.
EXCLUDED.last_name returns 'Rizzolo'.
Let's check the data in the table, you can see the
last_name is changed to 'Rizzolo'.
ON CONFLICT WHERE
ON CONFLICT WHERE clause to skip or do update operation based on some condition.
For example, the following executes the update statement
UPDATE SET salary = 0 if a conflict occurs and also if
salary > 0 in the existing row.
salary is 0 then skip then it will skip the insert or update operation.
INSERT INTO employee VALUES(1,'Annie','Rizzolo','F', DATE '1988-01-09', '[email protected]',5000) ON CONFLICT(emp_id) where salary > 0 DO UPDATE SET salary = 0;
The above would update the salary field to 0, as shown below.
Note that you cannot use
EXCLUDED table with the WHERE clause.
Thus, you can perform the UPSERT operation using the INSERT statement and manage the conflicts.