Bulk Insert Operations on MSSQL in the Cloud
I am looking to lift one of my SQL server database instances to the cloud. A limitation I will have is the inability of performing bulk insert operations directly from the SQL server. To accommodate this limitation, I will leverage both BCP and SQLCMD utilities and invoke the operations from an on-premises utility server via a Schedule Task.
More information on these utilities can be found at the following links:
- https://learn.microsoft.com/en-us/sql/tools/bcp-utility?view=sql-server-ver16
- https://learn.microsoft.com/en-us/sql/tools/sqlcmd-utility?view=sql-server-ver16
Three processes will be required:
- Bulk insert data into a local MSSQL database for initial staging purposes
- Extract the data from the local MSSQL database to be bulk inserted to the cloud MSSQL instance
- Bulk insert data into cloud MSSQL instance
Stage 1: Bulk insert data into a local MSSQL database for initial staging purposes
This process will make use of a custom stored procedure in the local staging database. The stored procedure will perform the initial bulk insert operation and be invoked from a command script using SQLCMD in the next process. The content of the stored procedure would be structured similar to the following:
USE [database_name]
GO
SET ANSI_NULLS ON
GO
SET
QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE
[dbo].[stored_procedure_name]
AS
BEGIN
SET NOCOUNT
ON;
-- WARNING:
Uncomment only if you wish to truncate the target table
IF EXISTS
(SELECT * FROM sys.objects WHERE object_id =
OBJECT_ID(N'dbo.table_name') AND type in (N'U'))
TRUNCATE
TABLE dbo.table_name
INSERT INTO
dbo.table_name( [column1], [column2], [column3], [...])
SELECT a.column1, a.column2, a.column3, ...,
FROM
OPENROWSET( BULK 'data_file',
FORMATFILE = 'format_file') AS a;
SELECT 'Rows
Imported via BULK Insert = ' + CAST(COUNT(*) AS VARCHAR) FROM
dbo.table_name
END
Additional Requirements
The trusted or named SQL login will require the following privileges on the source MSSQL database:
- GRANT EXECUTE ON OBJECT::[dbo].[stored_procedure_name] TO [user_name];
- GRANT ALTER ON OBJECT:: dbo.table_name TO [user_name];
- ALTER SERVER ROLE [BULKADMIN] ADD MEMBER [user_name];
Where:
- database_name is the name of your source on-premises database
- stored_procedure_name is the name of the local stored procedure
- table_name is the name of the target table name truncation and bulk operations will be performed against
- data_file is the full path of the data file whose data is to be copied into the target table
- format_file is the full path of a format file
- user_name is the trusted login or SQL login that process will execute as
WARNING: the script above includes TRUNCATE TABLE which will permanently delete all rows from the target table. Use your better judgment before including this in your process. Alternatively you may wish to use a DELETE operation but will ultimately be dependent on your specific circumstances.
For more information on BULK INSERT operations, refer to https://learn.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-ver16.
Stage 2: Extract the records to be bulk inserted to the cloud MSSQL instance
This process will both invoke the bulk insert operation above on the local MSSQL database and output the row results from the source table to a local data file. The local output data file will be referenced by the bulk insert operation in the next process.
To invoke the initial bulk insert operation using trusted credentials, issue the following command using SQLCMD:
- SQLCMD -Q "EXEC database_name.dbo.stored_procedure_name;" -S server_name[\instance_name] -d database_name -E
For non-trusted connections, issue the following command:
- SQLCMD -Q "EXEC database_name.dbo.stored_procedure_name;" -S server_name[\instance_name] -d database_name –U user_name –P password
To output the row results of the source table to a local file using trusted credentials, issue the following command using BCP:
- BCP.exe "SELECT * FROM database_name.dbo.table_name" queryout output_data_file -T -c -S server_name -C RAW
For non-trusted connections, issue the following command:
- BCP.exe "SELECT * FROM database_name.dbo.table_name" queryout output_data_file -c -S server_name -C RAW –U user_name –P password
Where:
- database_name is the name of your source on-premises database
- stored_procedure_name is the name of the local stored procedure
- table_name is the name of the source table name records will be selected from
- output_data_file is the name of the result output file generated
- server_name is the name of your local MSSQL database server
- instance_name is the optional MSSQL instance name used on your MSSQL database. You could alternatively reference a port_number rather than specifying a instance name
- user_name is the optional SQL login user name if non-trusted connection is used
- password is the optional SQL login password for the user_name above
Bulk insert data into cloud MSSQL instance
This final process will initially truncate the target cloud MSSQL database table and then perform the bulk insert against the cloud MSSQL instance. The output data file created above will be used as the source data for the bulk insert operation.
To invoke the initial table truncation using trusted credentials, issue the following command using SQLCMD:
- SQLCMD -Q "TRUNCATE TABLE database_name.dbo.table_name;" -S server_name -d database_name -E
For non-trusted connections, issue the following command:
- SQLCMD -Q "TRUNCATE TABLE database_name.dbo.table_name;" -S server_name -d database_name -U user_name -P password
To invoke the initial bulk insert operation using trusted credentials, issue the following command using BCP:
- BCP.exe database_name.dbo.table_name IN input_data_file -S server_name -T -c
For non-trusted connections, issue the following command:
- BCP.exe database_name.dbo.table_name IN input_data_file -S server_name -U user_name -P password -c
Where:
- database_name is the name of your source on-premises database
- table_name is the name of the target table name truncation and bulk operations will be performed against
- input_data_file is the name of the result output file generated in the previous process
- server_name is the name of your local MSSQL database server
- instance_name is the optional MSSQL instance name used on your MSSQL database. You could alternatively reference a port_number rather than specifying a instance name
- user_name is the optional SQL login user name if non-trusted connection is used
- password is the optional SQL login password for the user_name above
WARNING: the command above includes TRUNCATE TABLE which will permanently delete all rows from the target table. Use your better judgment before including this in your process. Alternatively you may wish to use a DELETE operation but will ultimately be dependent on your specific circumstances.
NOTE: In Microsoft Azure SQL, you must either grant db_datareader privilege on the master database to the login or explicitly identify the database to connect to using the -d switch when using SQLCMD. The above SQLCMD statements include the -d swich. In my situation, I opted to using the -d switch and limited access to the master database.
Additional Requirements
The trusted or named SQL login will require the following privileges on the target MSSQL database along with the db_datareader and db_datawriter roles:
- GRANT ALTER ON OBJECT:: dbo.table_name TO [user_name];
- ALTER ROLE [db_datareader] add member [user_name];
- ALTER ROLE [db_datawriter] add member [user_name];
The above three processes I have packaged in a single CMD script that is invoked via a trigger from an on-premises utility server trigger using a Scheduled Task.
Comments
Post a Comment