PostgreSQL: Using SELECT INTO to create a new table from an existing table

Updated: January 5, 2024 By: Guest Contributor Post a comment

Introduction

As a powerful feature of PostgreSQL, SELECT INTO allows users to create a new table and fill it with data derived from a query on an existing table, streamlining data duplication and backup processes.

Understanding SELECT INTO

The SELECT INTO statement in PostgreSQL is used to create a new table and insert data into it from an existing table. This can be especially useful when you want to make a copy of a table with or without its data for testing purposes or when archiving records. The basic syntax looks like this:

SELECT * INTO new_table FROM existing_table;

This statement would create a new table called new_table with the same column definitions as existing_table and insert all rows from the existing table into the new one.

Basic Use Cases

Creating a Full Copy of a Table

SELECT * INTO customers_backup FROM customers;

With the above command, we make an exact copy of the customers table, including all columns and rows, into customers_backup.

Copying Only Certain Columns

SELECT customer_id, customer_name INTO customer_info_backup FROM customers;

This will create a table customer_info_backup that only has the customer_id and customer_name columns from the original customers table.

Copying Data Based on Condition

SELECT * INTO customers_2021 FROM customers
WHERE registration_date >= '2021-01-01' AND registration_date < '2022-01-01';

This command clones rows where customers registered in the year 2021 into a new table called customers_2021.

Advanced Use Cases

Creating a Table with Aggregated Data

SELECT customer_id, SUM(orders_amount) INTO customer_totals
FROM orders
GROUP BY customer_id;

In this example, we are creating a new table customer_totals that will consist of each customer’s ID and the total amount of their orders.

Customizing the New Table

You are not limited to creating an identical copy of the existing structure and data. You can customize the new table in various ways:

Adding Constraints

SELECT customer_id, customer_name, email INTO customer_contacts
FROM customers
WHERE customer_id > 100;
ALTER TABLE customer_contacts
ADD PRIMARY KEY (customer_id);

After copying the desired data into customer_contacts, we add a primary key constraint to the customer_id column.

Include OR Exclude Table Constraints

In PostgreSQL, you can include or exclude constraints such as NOT NULL, DEFAULT values, and more when copying data. To include constraints from the source table, we use:

CREATE TABLE new_table AS
TABLE existing_table
WITH NO DATA;

This will create a new table with the same column definitions and constraints, but without the data.

Performance Considerations

When using SELECT INTO, be mindful of system resources. Large tables can take up considerable disk space and could affect system performance during the copy operation. Efficiency can be improved with proper indexing and by only selecting the necessary rows or columns.

Handling Duplicates

If a table with the same name already exists, PostgreSQL will return an error. It’s important to check for existing table names or to use temporary tables. You might also use conditional statements or transaction blocks to handle such situations gracefully.

Rename and Adjusting Table Properties

After creating a table using SELECT INTO, you can further refine it by renaming it or adjusting table properties like ownership and tablespace.

ALTER TABLE customer_backup RENAME TO customer_archive;
ALTER TABLE customer_archive OWNER TO new_owner;
ALTER TABLE customer_archive SET TABLESPACE new_tablespace;

These commands will rename the table, change its owner, and move it to a new tablespace respectively.

Conclusion

The SELECT INTO statement in PostgreSQL is a versatile tool for database administrators and developers. Easy to use for simple copy operations, it also scales up to more complex data duplication and transformation tasks. While valuable, be aware of its impact on database performance and always ensure the operation fits within your system’s capabilities.