T-SQL help – doing a large update from a CSV file to a table in SSMS


I have been given a CSV file containing only IDs, and those IDs exist in a SQL Server table with columns that require updating. Of the 90,000 IDs in that table, only 83,000 require updating from the CSV file.

The main table is:

id    enabled    agreed
1     1          no
2     1          no
3     1          no
4     0          yes

I know I can do an update on the table from the IDs I have been given in the CSV, such as:

update table1
set enabled = 0,
    agreed = 'yes'
where id in ('1','2','3')

However, my problem is that there are 83,000 of them that require updating to those specific values, and I wanted to find the best way to do this. I have been told to write a script to import the CSV into a temp table, then join that temp table onto the main table and perform the update, rather than hard-coding a list of IDs in a script.

How can I do this? Does anyone know a better way?

How to solve:

Method 1

There have been some solid answers to this post, but as an added extra: if you know this is going to be a regular task, and the file is going to be dropped in a specific location each week with the same format and name, it is possible to query flat files such as CSV or TXT directly from SQL using the OPENROWSET feature, and then operate on the data with inserts and updates. (BULK INSERT is also an option and can be simpler, but in this case I opted for OPENROWSET for the benefit of easy updates.)

This requires a little setup, but once you have it working, you can create a view that reads your CSV with a fixed output, or even use a stored procedure to update your table from then on, so you can automate the load. You can even join your CSV to existing data, although it won't be indexed, so you may want to consider using the feature to stage the data into a table, index that table, and then perform your update.
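As a rough illustration of that staging idea, the following sketch selects the CSV through OPENROWSET into a staging table, indexes it, and then joins for the update. The staging table name, index name, and file paths are assumptions for illustration, not from the original post:

```sql
-- Sketch only: stage the CSV into an indexed table before updating.
-- dbo.CSVStaging, the index name, and the C:\Import\ paths are hypothetical.
SELECT ID, Enabled, Agreed
INTO dbo.CSVStaging
FROM OPENROWSET(
    BULK 'C:\Import\Load.CSV',                 -- adjust to your file location
    FORMATFILE = 'C:\Import\BCP_FileName.XML'  -- the format file described below
) AS t1;

-- Index the staging table so the join to the main table can seek on ID
CREATE CLUSTERED INDEX IX_CSVStaging_ID ON dbo.CSVStaging (ID);

UPDATE st
SET st.Enabled = cs.Enabled,
    st.Agreed  = cs.Agreed
FROM SourceTable st
INNER JOIN dbo.CSVStaging cs ON cs.ID = st.ID;
```

For a one-off 83,000-row update the index may not matter much, but for a recurring weekly load against a large main table it keeps the join cheap.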

That said, if you are going to build a proper pipeline you should look into SSIS, as it lets you route errors properly through the pipeline and send erroneous inputs to output locations.

I am going to make some assumptions with your data.

  • You’re using tabs as the field separator.
  • You’re using CRLF as the line terminator.
  • I am assuming the second column is INT rather than BIT.

SQL Server has a requirement that more lax systems do not, but that requirement enables all of its strengths: it needs explicit definitions for the columns, and those definitions aren’t contained in the CSV file itself. So you create the definitions in an XML format file and point SQL Server to it — in this case a BCPFORMAT file.

The definitional XML file would be called: BCP_FileName.XML

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <RECORD>
    <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\t" />
    <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\t" />
    <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="\r\n" />
  </RECORD>
  <ROW>
    <COLUMN SOURCE="1" NAME="ID" xsi:type="SQLINT" />
    <COLUMN SOURCE="2" NAME="Enabled" xsi:type="SQLINT" />
    <COLUMN SOURCE="3" NAME="Agreed" xsi:type="SQLVARYCHAR" />
  </ROW>
</BCPFORMAT>

The CSV would be called Load.CSV

The eventual SELECT statement might be:

  SELECT ID, Enabled, Agreed
  FROM OPENROWSET(
      BULK 'C:\Import\Load.CSV',                 -- adjust to your file location
      FORMATFILE = 'C:\Import\BCP_FileName.XML'  -- adjust to your format file location
  ) AS t1

This would enable you to put it into a view with a CREATE VIEW statement:

  CREATE VIEW dbo.CSVQuery
  AS
  SELECT ID, Enabled, Agreed
  FROM OPENROWSET(
      BULK 'C:\Import\Load.CSV',
      FORMATFILE = 'C:\Import\BCP_FileName.XML'
  ) AS t1

With that view you can do…

UPDATE st
SET st.Enabled = cq.Enabled,
    st.Agreed = cq.Agreed
FROM SourceTable st
INNER JOIN dbo.CSVQuery cq ON cq.ID = st.ID

This has the benefit of being able to query the view so you can sense check the data and all sorts.
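For instance, a couple of sanity checks you might run against the view before the update (SourceTable is the target table name used in the update above; the expected counts are from the question):

```sql
-- How many rows did the CSV produce? (the question expects roughly 83,000)
SELECT COUNT(*) AS RowsInCsv FROM dbo.CSVQuery;

-- Are there any IDs in the CSV with no matching row in the target table?
SELECT cq.ID
FROM dbo.CSVQuery cq
LEFT JOIN SourceTable st ON st.ID = cq.ID
WHERE st.ID IS NULL;
```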

Of course, if you are regularly staging important data, you may want to opt for SSIS, as the above means of importing data can be all or nothing, but if that serves your purpose, then it may do.

One thing you should get is Notepad++, so you can accurately check the hidden characters for CRLF, because some CSV files use LF on its own, or CR.

NOTE: You must give the SQL Server instance access to the file location if you intend to use this as a regular loading process. To do this, go onto your server, open Services, and check which account is running the SQL Server service. Make sure it is a domain-authenticated account if the file location is not on the same server, so the service can use that domain user to access the file pickup location.

NOTE 2: Be aware that this process updates data, so it is entirely possible for a malicious file to do real damage; make sure you restrict the input process to trusted users.

Here is a resource for the XSI types:

Here is the OpenRowSet overview.

Method 2

You can use the Import Flat File feature in SSMS to import a CSV file into a table in your database. (Import Data is another feature you can use, but it’s a little bulkier and older.) Either feature can be found by right-clicking the database you want to import into, clicking Tasks, then clicking Import Flat File (or Import Data if you prefer). Then it’s just a matter of following the import wizard, which is only a few steps.

If the updates you’re making to your main table are the same values for all 83,000 rows being updated, then after you import your CSV file you can just update them with a hard-coded update query similar to this:

UPDATE M
SET M.enabled = 0, -- Hard-coded value for all rows
    M.agreed = 'yes' -- Hard-coded value for all rows
FROM MainTable AS M
INNER JOIN CsvImportedTable AS C
    ON M.id = C.id

Otherwise, if you need to set different values for different rows within your list of 83,000 IDs, you’ll need to pre-define those values in new columns in the CSV file first (using Excel, for example) before you import it. Then, after the import, your CSV table will have those columns as well, so you can do a more dynamic update like this:

UPDATE M
SET M.enabled = C.enabledNewValue, -- Values you entered in your CSV
    M.agreed = C.agreedNewValue -- Values you entered in your CSV
FROM MainTable AS M
INNER JOIN CsvImportedTable AS C
    ON M.id = C.id

Additional resources on importing CSVs and data into a database via SSMS:

  1. SQLShack – Importing and Working with CSV Files in SQL Server

  2. Microsoft Books Online on the Import Data feature

  3. MSSQLTips – Simple way to import data into SQL Server

Method 3

  1. Create the table schema that matches the data in your CSV file (i.e. specify columns and datatypes).

  2. Use BULK INSERT to load the data from the CSV file into the table.

  3. Write an UPDATE statement with a JOIN.

  4. Inside a transaction, do the UPDATE and validate the results:

  • valid results: run COMMIT TRAN
  • invalid results: run ROLLBACK TRAN
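A minimal sketch of those steps (the staging table name, file path, and terminators are assumptions, not from the original post):

```sql
-- 1. Staging table matching the CSV layout (names and types assumed)
CREATE TABLE dbo.CsvStage (id INT, enabled INT, agreed VARCHAR(3));

-- 2. Load the CSV (adjust the path; add FIRSTROW = 2 if the file has a header row)
BULK INSERT dbo.CsvStage
FROM 'C:\Import\Load.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\r\n');

-- 3/4. Update inside a transaction and validate before committing
BEGIN TRAN;

UPDATE M
SET M.enabled = C.enabled,
    M.agreed  = C.agreed
FROM MainTable AS M
INNER JOIN dbo.CsvStage AS C ON M.id = C.id;

SELECT @@ROWCOUNT AS RowsUpdated; -- should be around 83,000 per the question

-- COMMIT TRAN;   -- run this if the results look valid
-- ROLLBACK TRAN; -- run this instead if they do not
```

Leaving the COMMIT/ROLLBACK commented out forces you to inspect the row count and spot-check the data before making the change permanent.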


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.
