Overview
Managing data effectively within a database is crucial for the integrity and functionality of the applications that rely on it. UPSERT is a database operation that combines two steps into one. The name is a blend of UPDATE and INSERT and means ‘update the row if it exists, insert it if it does not’. MySQL has no dedicated UPSERT command, but the same behavior is available through INSERT ON DUPLICATE KEY UPDATE or, with caveats, the REPLACE statement, which deletes the conflicting row and inserts a new one rather than updating it in place.
What is Upsert?
In simple terms, UPSERT inserts a new record into a MySQL table if no matching record exists, and updates the existing record if one does. "Matching" here means having the same value in a PRIMARY KEY or UNIQUE index. The operation is commonly needed when you want to keep exactly one row per key while still being able to update that row as necessary.
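Because REPLACE is listed as one way to get this behavior, it helps to see how it differs from a true in-place update. The following is a minimal sketch, assuming a scratch table with the hypothetical name products_demo. REPLACE deletes the conflicting row and inserts the new one, so any column left out of the statement falls back to its default instead of keeping its old value.

```sql
-- Scratch table with a hypothetical name, used only for this illustration.
DROP TABLE IF EXISTS products_demo;
CREATE TABLE products_demo (
  id INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL,
  quantity INT NOT NULL DEFAULT 0
);

INSERT INTO products_demo (id, name, quantity) VALUES (1, 'Widget', 10);

-- id = 1 already exists, so REPLACE deletes that row and inserts this one.
-- quantity is not listed, so it falls back to its DEFAULT (0) instead of
-- keeping the previous value of 10.
REPLACE INTO products_demo (id, name) VALUES (1, 'Widget v2');

SELECT id, name, quantity FROM products_demo;
-- one row: id = 1, name = 'Widget v2', quantity = 0
```

If existing column values must survive the operation, INSERT ON DUPLICATE KEY UPDATE is usually the safer choice.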
The Basics of UPSERT in MySQL
1. Using INSERT ON DUPLICATE KEY UPDATE
MySQL offers the INSERT ON DUPLICATE KEY UPDATE clause, which is the closest equivalent to a standard UPSERT statement. MySQL first attempts to insert the new row. If the insertion would create a duplicate entry in a PRIMARY KEY or UNIQUE index, MySQL updates the existing row instead. The general syntax is:
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...)
ON DUPLICATE KEY UPDATE
column1 = value1,
column2 = value2,
...;
Example: Suppose you have a table products with fields id, name, and quantity. Here’s how you would insert a new product, or add to its quantity if it already exists.
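-- On a conflict, quantity on the right-hand side is the stored value and
-- VALUES(quantity) is the value from the VALUES list (10 here).
INSERT INTO products (id, name, quantity)
VALUES (1, 'Widget', 10)
ON DUPLICATE KEY UPDATE
quantity = quantity + VALUES(quantity);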
Here, if a product with id=1 already exists, its quantity is increased by 10. Otherwise, a new product is inserted with a quantity of 10.
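The VALUES() function in this position is deprecated as of MySQL 8.0.20. From MySQL 8.0.19 onward, the inserted row can be given an alias instead. The sketch below shows the alias form of the same idea, assuming MySQL 8.0.19 or later and a scratch table with the hypothetical name products_demo. The alias new is an arbitrary choice.

```sql
-- Scratch table with a hypothetical name, used only for this illustration.
DROP TABLE IF EXISTS products_demo;
CREATE TABLE products_demo (
  id INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL,
  quantity INT NOT NULL
);

-- First run: no row with id = 1 exists, so the row is inserted with quantity 10.
-- Second run: id = 1 conflicts, so the stored quantity becomes 10 + 10 = 20.
-- "new" is the alias for the row being inserted (requires MySQL 8.0.19+).
INSERT INTO products_demo (id, name, quantity)
VALUES (1, 'Widget', 10) AS new
ON DUPLICATE KEY UPDATE
  quantity = products_demo.quantity + new.quantity;
```

Qualifying the stored column with the table name keeps it distinct from the incoming value, which shares the same column name under the alias.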
Working with Unique Constraints
It’s important to note that an INSERT ON DUPLICATE KEY UPDATE statement only takes the update path when the table has a PRIMARY KEY or UNIQUE index that the new row would violate. Without one, every statement simply inserts a new row. For example:
ALTER TABLE products ADD UNIQUE (name);
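With the above line, we’ve added a UNIQUE constraint to the name column of the products table, meaning that no two products can have the same name. An upsert whose name matches an existing row now updates that row, even if the id differs. When a new row could conflict with different existing rows through different indexes, only one of them is updated, so upserts against tables with several unique indexes deserve extra care.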
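To illustrate how a UNIQUE index, and not only the PRIMARY KEY, can trigger the update path, here is a minimal sketch. It assumes a scratch table with the hypothetical name products_demo that carries the same UNIQUE constraint on name. It uses VALUES() so that it also runs on MySQL 5.7, although that function is deprecated in this position from 8.0.20.

```sql
-- Scratch table with a hypothetical name, used only for this illustration.
DROP TABLE IF EXISTS products_demo;
CREATE TABLE products_demo (
  id INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL,
  quantity INT NOT NULL,
  UNIQUE (name)
);

INSERT INTO products_demo (id, name, quantity) VALUES (1, 'Widget', 10);

-- id = 2 is new, but name = 'Widget' collides with the UNIQUE index,
-- so the existing row (id = 1) is updated and no row with id = 2 is created.
INSERT INTO products_demo (id, name, quantity)
VALUES (2, 'Widget', 5)
ON DUPLICATE KEY UPDATE
  quantity = quantity + VALUES(quantity);

SELECT id, name, quantity FROM products_demo;
-- one row: id = 1, name = 'Widget', quantity = 15
```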
Advanced UPSERT Operations
When the UPSERT logic becomes more complex, you may need more advanced techniques, such as staging rows in a temporary table or using conditional expressions within the ON DUPLICATE KEY UPDATE clause.
Using a Temporary Table for Bulk UPSERTs
When you need to UPSERT many rows at once, one viable method is to load the records into a temporary table and then apply them to the target table with a single INSERT ... SELECT statement. No row-by-row iteration is needed.
-- Stage the incoming rows in a temporary copy of the target table.
CREATE TEMPORARY TABLE tmp_products LIKE products;
INSERT INTO tmp_products (id, name, quantity) VALUES
(1, 'Widget', 15),
(2, 'Gadget', 30);
-- On a conflict, add the staged quantity to the quantity already stored.
INSERT INTO products (id, name, quantity)
SELECT id, name, quantity FROM tmp_products
ON DUPLICATE KEY UPDATE
products.quantity = products.quantity + tmp_products.quantity;
DROP TEMPORARY TABLE tmp_products;
Here, you first copy the structure of your target table to a temporary one, insert your data into the temporary table, then proceed with the UPSERT operation using the temporary table as the source.
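For small batches, a temporary table may be more machinery than needed, because ON DUPLICATE KEY UPDATE also works with a multi-row VALUES list. The following sketch assumes a scratch table with the hypothetical name products_demo and adds the incoming quantity to the stored one on a conflict.

```sql
-- Scratch table with a hypothetical name, used only for this illustration.
DROP TABLE IF EXISTS products_demo;
CREATE TABLE products_demo (
  id INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL,
  quantity INT NOT NULL
);

INSERT INTO products_demo (id, name, quantity) VALUES (1, 'Widget', 10);

-- Each row in the list is handled independently: insert if new, update if not.
INSERT INTO products_demo (id, name, quantity)
VALUES
  (1, 'Widget', 15),   -- conflicts with the existing row: 10 + 15 = 25
  (2, 'Gadget', 30)    -- no conflict: inserted as a new row
ON DUPLICATE KEY UPDATE
  quantity = quantity + VALUES(quantity);

SELECT id, name, quantity FROM products_demo ORDER BY id;
-- id = 1: quantity 25; id = 2: quantity 30
```

The temporary-table route tends to pay off when the rows come from a bulk load or need cleaning before they reach the target table.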
Error Handling
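In scenarios where the UPSERT changes nothing, you may want to treat that as an error. MySQL has conditional logic constructs such as IF and CASE statements that are useful here, but they are only valid inside stored programs (procedures, functions, triggers, and events), so the check has to be wrapped in one.
Note that beginning with MySQL 5.5, you can use the SIGNAL statement inside a stored program to raise an error. DELIMITER in the example below is a command of the mysql command-line client, not an SQL statement:
DELIMITER //
-- upsert_gizmo is a hypothetical procedure name
CREATE PROCEDURE upsert_gizmo()
BEGIN
  INSERT INTO products (id, name, quantity)
  VALUES (3, 'Gizmo', 20)
  ON DUPLICATE KEY UPDATE
    quantity = VALUES(quantity) + 10;
  -- ROW_COUNT(): 1 = inserted, 2 = updated, 0 = existing row left unchanged
  IF ROW_COUNT() = 0 THEN
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'No rows affected';
  END IF;
END //
DELIMITER ;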
This executes the insert or update, then explicitly raises an error if no rows were affected. That happens when a duplicate key is found but the update would leave the existing row with the values it already has. A genuine failure of the INSERT itself raises its own error before the check is reached.
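The check above relies on the affected-rows count that MySQL reports for INSERT ON DUPLICATE KEY UPDATE. The sketch below walks through the three possible values for a single row, assuming a scratch table with the hypothetical name products_demo and a client that does not set the CLIENT_FOUND_ROWS connection flag (with that flag, the unchanged case reports 1 instead of 0).

```sql
-- Scratch table with a hypothetical name, used only for this illustration.
DROP TABLE IF EXISTS products_demo;
CREATE TABLE products_demo (
  id INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL,
  quantity INT NOT NULL
);

-- 1) No row with id = 3 yet: the row is inserted.
INSERT INTO products_demo (id, name, quantity) VALUES (3, 'Gizmo', 20)
ON DUPLICATE KEY UPDATE quantity = VALUES(quantity);
SELECT ROW_COUNT();  -- 1

-- 2) Row exists and the new quantity differs: the row is updated.
INSERT INTO products_demo (id, name, quantity) VALUES (3, 'Gizmo', 25)
ON DUPLICATE KEY UPDATE quantity = VALUES(quantity);
SELECT ROW_COUNT();  -- 2

-- 3) Row exists and already holds the same quantity: nothing changes.
INSERT INTO products_demo (id, name, quantity) VALUES (3, 'Gizmo', 25)
ON DUPLICATE KEY UPDATE quantity = VALUES(quantity);
SELECT ROW_COUNT();  -- 0
```

ROW_COUNT() has to be read immediately after the statement of interest, since any later statement overwrites it.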
Conclusion
UPSERT operations in MySQL let you insert new data and update existing data in a single statement, without raising an error when a uniqueness constraint is hit. Understanding and using UPSERT well can significantly reduce the complexity of your scripts and help keep your data consistent.