Skip to main content

C# Tutorial - Using The Built In OLEDB CSV Parser [Beginner]


So, a while ago, one of the other writers here wrote a small tutorial on parsing simple CSV files inC#. It mostly just showed off the string split method, and only worked on really simple CSV files - no quoted fields, etc. Well, we got a comment asking about that, so today I sat down thinking I would write up a more robust parser. I thought to myself, am I suffering from NIH (not invented here) syndrome? Do I really need to write a full CSV parser?

And, as you might expect, the answer is no. Not only did I not need to write a parser, I found that there is one that is built into OLEDB subsystem of Windows! And it actually takes fewer lines of code to use this built in parser than it does to do the simple string split algorithm that was in the previous tutorial (and it feels a lot nicer too). I was originally planning on this being a decently long tutorial, when I thought I would write my own parser - but you are actually in for a really short and simple one today. Less for me to write, less for you to read, and more functionality to boot!

So, without further ado, I think we can jump straight into the code:
using System;
using System.Data;
using System.IO; //not used by default
using System.Data.OleDb; //not used by default

namespace CSVParserExample
{
  class CSVParser
  {
    public static DataTable ParseCSV(string path)
    {
      if (!File.Exists(path))
        return null;

      string full = Path.GetFullPath(path);
      string file = Path.GetFileName(full);
      string dir = Path.GetDirectoryName(full);

      //create the "database" connection string 
      string connString = "Provider=Microsoft.Jet.OLEDB.4.0;"
        + "Data Source=\"" + dir + "\\\";"
        + "Extended Properties=\"text;HDR=No;FMT=Delimited\"";

      //create the database query
      string query = "SELECT * FROM " + file;

      //create a DataTable to hold the query results
      DataTable dTable = new DataTable();

      //create an OleDbDataAdapter to execute the query
      OleDbDataAdapter dAdapter = new OleDbDataAdapter(query, connString);

      try
      {
        //fill the DataTable
        dAdapter.Fill(dTable);
      }
      catch (InvalidOperationException /*e*/)
      { }

      dAdapter.Dispose();

      return dTable;
    }
  }
}
 
So, first off, you will need to add the namespaces System.IO and System.Data.OleDb. The first we need because we will be doing some path manipulation, and the second gives us access to what we will need to do CSV parsing. I've created a nice static function here that takes a path to a CSV file and returns the contents of the file in a DataTable (which is really easy to view and manipulate).

The weird thing about all of this is that the CSV file gets treated as a database table. We need to create a connection string with the Jet OLEDB provider, and you set the Data Source to be the directory that contains the CSV file. Under extended properties, 'text' means that we are parsing a text file (as opposed to, say, an Excel file), the HDR property can be set to 'Yes' (the first row in the CSV files is header information) or 'No' (there is no header row), and setting the FMT property set to 'Delimited' essentially says that we will be working with a comma separated value file. You can also set FMT to 'FixedLength', for files where the fields are fixed length - but that wouldn't be a CSV file anymore, would it?

The next part to do is create the actual query. In this case, we want everything, so we have a "SELECT *". What are we selecting from? Well, in this somewhat twisted worldview, the directory is the database, so the file is the table we are selecting from.

Now we are into normal OLEDB territory - we create a DataTable that we will be filling with results, and we create a OleDbDataAdapter to actually execute the query. Then (inside of a try block, because it can throw an exception) we fill the data table with the results of the query. Afterwords, we clean up after ourselves by disposing the OleDbDataAdapter, and we return the now filled data table.
And using the now filled data table is extremely simple - we actually talk about it here.

And there you go! I'm not sure how parsing CSV could be much easier.

Comments

Popular posts from this blog

C# Snippet - Shuffling a Dictionary [Beginner]

Randomizing something can be a daunting task, especially with all the algorithms out there. However, sometimes you just need to shuffle things up, in a simple, yet effective manner. Today we are going to take a quick look at an easy and simple way to randomize a dictionary, which is most likely something that you may be using in a complex application. The tricky thing about ordering dictionaries is that...well they are not ordered to begin with. Typically they are a chaotic collection of key/value pairs. There is no first element or last element, just elements. This is why it is a little tricky to randomize them. Before we get started, we need to build a quick dictionary. For this tutorial, we will be doing an extremely simple string/int dictionary, but rest assured the steps we take can be used for any kind of dictionary you can come up with, no matter what object types you use. Dictionary < String , int > origin = new Dictionary < string , int >(); ...

C# WPF Printing Part 2 - Pagination [Intermediate]

About two weeks ago, we had a tutorial here at SOTC on the basics of printing in WPF . It covered the standard stuff, like popping the print dialog, and what you needed to do to print visuals (both created in XAML and on the fly). But really, that's barely scratching the surface - any decent printing system in pretty much any application needs to be able to do a lot more than that. So today, we are going to take one more baby step forward into the world of printing - we are going to take a look at pagination. The main class that we will need to do pagination is the DocumentPaginator . I mentioned this class very briefly in the previous tutorial, but only in the context of the printing methods on PrintDialog , PrintVisual (which we focused on last time) and PrintDocument (which we will be focusing on today). This PrintDocument function takes a DocumentPaginator to print - and this is why we need to create one. Unfortunately, making a DocumentPaginator is not as easy as...

C# WPF Tutorial - Implementing IScrollInfo [Advanced]

The ScrollViewer in WPF is pretty handy (and quite flexible) - especially when compared to what you had to work with in WinForms ( ScrollableControl ). 98% of the time, I can make the ScrollViewer do what I need it to for the given situation. Those other 2 percent, though, can get kind of hairy. Fortunately, WPF provides the IScrollInfo interface - which is what we will be talking about today. So what is IScrollInfo ? Well, it is a way to take over the logic behind scrolling, while still maintaining the look and feel of the standard ScrollViewer . Now, first off, why in the world would we want to do that? To answer that question, I'm going to take a an example from a tutorial that is over a year old now - Creating a Custom Panel Control . In that tutorial, we created our own custom WPF panel (that animated!). One of the issues with that panel though (and the WPF WrapPanel in general) is that you have to disable the horizontal scrollbar if you put the panel in a ScrollV...