Slow inserts across a linked server?
Linked Servers offer a great way to connect two SQL Servers together, allowing remote querying and DML operations. Frequently, this is used to copy data from production to reporting. However, the temptation is to run the copy operation on the production, or source side. If you do that, even with a single INSERT INTO
statement, SQL Server will process each individual row as a discrete INSERT INTO
statement via a cursor operation. This makes for very slow inserts across a linked server. Running the operation from the destination server means SQL Server can simply query the remote source table for all the rows, inserting them as a set into the destination table. The difference in speed can be eye-watering.
Take the following MCVE as an example. It drops-and-creates a src_db database on the source server, and a dest_db database on the destination database. It also drops-and-recreates the required linked server, so don’t run this on a production SQL Server. Once the source and destination databases have been created, it copies around 800,000 rows from the source to the destination.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
USE master; IF DB_ID(N'src_db') IS NOT NULL BEGIN ALTER DATABASE src_db SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE src_db; END GO CREATE DATABASE src_db ON (NAME = N'src_db_primary' , FILENAME = N'G:\Database\data\src_db_primary.mdf' , SIZE = 100MB , MAXSIZE = 1000MB , FILEGROWTH = 100MB) LOG ON (NAME = N'src_db_log' , FILENAME = N'F:\Database\data\src_db_log.ldf' , SIZE = 100MB , MAXSIZE = 1000MB , FILEGROWTH = 100MB); GO USE src_db; CREATE TABLE dbo.src_table ( i int NOT NULL CONSTRAINT src_table_pk PRIMARY KEY CLUSTERED IDENTITY(1,1) , s varchar(87) NOT NULL ) ON [PRIMARY]; INSERT INTO dbo.src_table (s) SELECT REPLICATE('S', 87) FROM sys.syscolumns sc1 CROSS JOIN sys.syscolumns sc2; GO |
Here, we’re creating the linked server to the destination SQL Server:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
IF EXISTS(SELECT 1 FROM sys.servers WHERE name = 'DESTINATION_SERVER') BEGIN EXEC sys.sp_dropserver @server = N'DESTINATION_SERVER' , @droplogins = 'droplogins'; END EXEC sys.sp_addlinkedserver @server = N'DESTINATION_SERVER' , @srvproduct = N'SQL Server'; EXEC sys.sp_addlinkedsrvlogin @rmtsrvname = N'DESTINATION_SERVER' , @useself = 'true'; EXEC master.dbo.sp_serveroption @server = N'DESTINATION_SERVER' , @optname = N'rpc' , @optvalue = N'true'; EXEC master.dbo.sp_serveroption @server = N'DESTINATION_SERVER' , @optname = N'rpc out' , @optvalue = N'true'; GO |
This code creates the destination database and table on the destination SQL Server, via the linked server:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
DECLARE @cmd nvarchar(max); SET @cmd = N' IF DB_ID(N''dest_db'') IS NOT NULL BEGIN ALTER DATABASE dest_db SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE dest_db; END CREATE DATABASE dest_db ON (NAME = N''dest_db_primary'' , FILENAME = N''G:\Database\data\dest_db_primary.mdf'' , SIZE = 100MB , MAXSIZE = 1000MB , FILEGROWTH = 100MB) LOG ON (NAME = N''dest_db_log'' , FILENAME = N''F:\Database\data\dest_db_log.ldf'' , SIZE = 100MB , MAXSIZE = 1000MB , FILEGROWTH = 100MB); '; EXEC [DESTINATION_SERVER].master.sys.sp_executesql @cmd; --create the destination database in the destination server SET @cmd = N' USE dest_db; CREATE TABLE dbo.dest_table ( i int NOT NULL CONSTRAINT dest_table_pk PRIMARY KEY CLUSTERED , s varchar(87) NOT NULL ) ON [PRIMARY]; '; EXEC [DESTINATION_SERVER].master.sys.sp_executesql @cmd; --create the table in the destination database GO |
Now, the INSERT
operation, being ran at the source server:
1 2 3 4 |
INSERT INTO [DESTINATION_SERVER].dest_db.dbo.dest_table (i, s) SELECT i, s FROM dbo.src_table; |
I stopped the insert after it ran for more than 7 minutes. FYI, it also took several minutes to roll back the operation.
Next, switch over to the destination server, and run this code to create a linked server to the source SQL Server:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
USE master; IF EXISTS(SELECT 1 FROM sys.servers WHERE name = 'SOURCE_SERVER') BEGIN EXEC sys.sp_dropserver @server = N'SOURCE_SERVER' , @droplogins = 'droplogins'; END EXEC sys.sp_addlinkedserver @server = N'SOURCE_SERVER' , @srvproduct = N'SQL Server'; EXEC sys.sp_addlinkedsrvlogin @rmtsrvname = N'SOURCE_SERVER' , @useself = 'true'; EXEC master.dbo.sp_serveroption @server = N'SOURCE_SERVER' , @optname = N'rpc' , @optvalue = N'true'; EXEC master.dbo.sp_serveroption @server = N'SOURCE_SERVER' , @optname = N'rpc out' , @optvalue = N'true'; GO |
Then run the INSERT
from there:
1 2 3 4 |
INSERT INTO dest_db.dbo.dest_table (i, s) SELECT i, s FROM [SOURCE_SERVER].src_db.dbo.src_table; |
The above insert completed in just 11 seconds. 800,000 rows, in eleven seconds.
When you see a “remote insert” operator in a query plan, investigate if you can move the work to the destination server.
See the other posts in our series on performance.
[…] Hannah Vernon recommends a pull rather than push model when you need to insert cross-server: […]