Transferring data from MySQL to MsSQL DataBase
In this article, we describe different ways to transfer data from a MySQL database to a Microsoft (MsSQL) database.
Context
At EnergiaSimple we manage data for Chile’s electric market in a MySQL database. To produce some client reports we use PowerBI, but the amount of data is considerable so we need to access it via DirectQuery, and PowerBI doesn’t allow it from a MySQL database. Our solution was to create an MsSQL database and copy the needed data on it.
Our main database is an MySQL 5.7, running in a db.t3.large (2 CPU, 8 Gb RAM y 500 Gb disk) in AWS, RDS service. For the MsSQL database, we used a t3.medium machine (2 CPU, 4 Gb RAM y 250Gb EBS disk) in EC2, running Windows 10 with SQL Server Web 2019.
Transfer data from MySQL to MsSQL
Row by row
The most obvious way to move data from one DB to another is by consulting the main one and writing the resulting rows in the other one.
$inserter = new BatchActiveRecordInserter([
'db' => $mssqlBalanceDb,
'batchSize' => 1000, // 1000 records is the max allowed in MsSql
'ignoreSaveErrors' => false, // InsertIgnore isn't valid in MsSql
]);$iterator = new ActiveQueryOffsetIterator([
'query' => $modelClass::find()->andWhere(['BETWEEN', $dateColumn, $init_date, $end_date]),
'batchSize' => 100000,
]);foreach ($iterator as $measure) {
$inserter->push($measure);
}
This method works, but it took more than 12 hours to copy one month of data. We also tried to divide the month into weeks, so we executed 4–5 processes in parallel to make the insertions, but the database connections locked and blocked other processes to operate correctly.
Dump file (DOESN’T WORK)
When we want to copy a big amount of data between MySQL the best way to do it is by mysqldump
, generating a file, and inserting it in the other DB. We get the data from our main DB in 1 min (getting 1 Gb for one month), so the lecture from the DB was not a problem.
mysqldump -u USER -h HOST --no-create-info -p DATABASE TABLE --where='datetime_clt BETWEEN "2022-02-01 00" AND "2022-02-28 23"' > mysql_dump_feb.sql
The --no-create-info
option is because we didn’t want to delete what is currently in the table, we want to incorporate new data. And with the where
parameter we extract only one month.
We tried to insert the dump into MsSQL using sqlcmd
, but the format of the INSERT
MySQL statement was not the correct one for the MsSQL.
sqlcmd -S HOST -U USER -P PASSWORD -d DATABASE -i mysql_dump_feb.sql
MsSQL linked to MySQL
Microsoft SQL Server 2019 can be linked with other DB using the ODBC MySQL connector. Start your Management Studio and choose your SQL Server instance. In the Object Explorer panel, expand the Server Objects, right-click on Linked Servers and then click on New Linked Server. Configure your linked server in the dialog box:
- Give a name for your server in the Linked server field.
- Under Server type, select Other data source.
- Choose Microsoft OLE DB Provider for ODBC Drivers in the Provider drop-down list.
- In the Provider string field write:
Driver={MySQL ODBC 8.0 Unicode Driver};Server=HOST;Database=DATABASE;uid=USER;password=PASSWORD;
Once the connection was established, we can use the instruction Insert INTO
to get rows from one DB and insert them into the other one. This method worked, but the amount of data transfer was too low for what we needed. We copied one day in 30 minutes (31.464 rows).
INSERT INTO [DATABASE].[dbo].[TABLE]
SELECT * FROM [MYSQL].[DATABASE2]..[TABLE2]
WHERE [datehour_clt] = '2022-02-28'GO
CSV and Bulk Insert
Our last resource was to get the data in CSV format from MySQL and insert it into the MsSQL database via Bulk Insert
. To do it, we need to extract the data via a SELECT
statement combine with sed
because we are using an RDS DB, and we can’t access the disk.
mysql -B --host=HOST --user=USERNAME -pPASSWORD DBNAME -e 'SELECT * FROM TABLE WHERE datetime_clt BETWEEN "2022-02-01 00" AND "2022-02-28 23"' | sed "s/'//;s/\t/,/g;s/\n//g" > measures_feb.csv
Finally, we write the results in the MsSQL with the BULK INSERT
statement.
BULK INSERT [DATABASE].[dbo].[TABLE]
FROM 'C:\route\measures_feb.csv'
WITH ( FIRSTROW=2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a' );
The query and CSV generation from the MySQL DB took 2 minutes and the insertion took 4 minutes (6.663.128 rows).