1 year ago
#10724
A. Stam
How do I read a Windows-1252 encoded CSV into Azure SQL server?
I have CSV files in an Azure Blob container that I want to load into an Azure SQL server database using BULK INSERT. The CSV files contain special characters like ë and are encoded using Windows-1252. When I run BULK INSERT, I cannot get the correct result in the SQL table.
Here's the code I'm using:
TRUNCATE TABLE [dbo].[test];
BULK INSERT [dbo].[test]
FROM 'test.csv'
WITH (
DATA_SOURCE = 'blobConnection',
CODEPAGE = '1252',
FORMAT = 'CSV',
DATAFILETYPE = 'CHAR',
FIELDTERMINATOR = ';',
FIRSTROW = 2
);
The dbo.test table has one column of type VARCHAR(MAX). The database has collation SQL_Latin1_General_CP1_CI_AS. The file test.csv contains three lines:
test
test
tëst
Using the code snippet above, the third line is shown as t�st. I have tried using a column of NVARCHAR type, but the issue remains. I have tried several alternative values for the CODEPAGE parameter:
- RAW: same result (�)
- ACP: same result
- 65001: returns t?st
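To make the symptoms concrete, here is a small Python sketch of how the third line of test.csv looks at the byte level in each encoding. It only illustrates the encodings themselves and makes no claim about what BULK INSERT does internally; the variable names are made up for illustration.

```python
# The third line of test.csv, written with an escape so the script does not
# depend on the encoding of its own source file.
text = "t\u00ebst"  # "tëst"

# Windows-1252 stores ë as the single byte 0xEB.
cp1252_bytes = text.encode("cp1252")
print(cp1252_bytes)  # b't\xebst'

# UTF-8 stores ë as the two bytes 0xC3 0xAB.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)  # b't\xc3\xabst'

# A lone 0xEB is not a complete UTF-8 sequence, so reading the
# Windows-1252 bytes as UTF-8 cannot recover the character.
as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")
print(as_utf8 == "t\ufffdst")  # True: ë became the replacement character
```

If the file really is Windows-1252, a hex dump of test.csv should show a single 0xEB byte for ë, which is a quick way to confirm the encoding before changing any BULK INSERT options.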
I can't figure out what is causing this behavior. All the answers I can find online tell me to change the CODEPAGE parameter to one of these values, but that doesn't fix the issue.
The only solution I've found is to encode the CSV file in UTF-8. However, these CSV files are created by other users rather than by me, and I'd really prefer a solution that doesn't require me to explain the concept of encoding to them.
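For reference, the UTF-8 workaround could be automated so the other users never have to think about encodings. The following is a minimal Python sketch, assuming the files can be downloaded and re-uploaded as ordinary local files. The function name transcode_to_utf8 and the file names are hypothetical, and the blob download/upload step is not shown.

```python
import tempfile
from pathlib import Path


def transcode_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode a Windows-1252 text file as UTF-8 (hypothetical helper)."""
    # newline="" disables newline translation on both sides, so the
    # original line endings (e.g. \r\n) are written back unchanged.
    with open(src, "r", encoding="cp1252", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)


if __name__ == "__main__":
    # Demo with a throwaway copy of the three-line test file.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "test.csv"
        dst = Path(tmp) / "test_utf8.csv"
        src.write_bytes(b"test\r\ntest\r\nt\xebst\r\n")
        transcode_to_utf8(src, dst)
        print(dst.read_bytes())
```

Python's cp1252 codec raises UnicodeDecodeError on the few byte values that Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), so a file that is not actually Windows-1252 may fail loudly instead of being silently mangled. The converted file would then be loaded with a UTF-8 code page setting such as CODEPAGE = '65001'. That setting is an assumption, because I have not stated above which one I used with the UTF-8 file.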
sql-server
azure
csv
encoding
windows-1252
0 Answers