
C# - Getting Image Metadata [Intermediate]


Everyone has images. Maybe you're a photographer, a graphic artist, or you simply had a really awesome vacation, but chances are you have a lot of them. What you may or may not know is that every image contains extra information about itself: when it was taken or created, the size of the image, and sometimes even details such as the focal length and type of lens used. Almost every image has some sort of extra data, called metadata, and using C# we can write a function or even a class to help us get this metadata from the image.

Image metadata is formatted according to a set of published standards. Basically, each piece of metadata is referred to as a property and has four attributes: Id, Type, Value, and Length. The Id is a number that identifies the property, e.g. the image width property has an Id of 0x0100. Since an image file is binary, probably the most important attribute is the Type, which tells us what kind of data the Value attribute holds, like a string for example. The Value is the raw data itself, and finally the Length tells you how many bytes long the Value is. With those four attributes, you can get, format, and display metadata.

The Image we want Metadata from

Before we go diving into a mess of code, we need to set some things up first. The Id and Type are really just numbers that correspond to something useful, so in order to display and format the metadata, we need to know what number is what. There are plenty of resources for this, but the best I found was Microsoft’s metadata Property Tag table. In the end, though, we need something we can use in our code. There are a couple of ways we can go from a number to something readable, but the best way is to make an enum.

If you glanced at Microsoft’s metadata info, you may notice there are a lot of numbers there. So do I expect you to spend an hour typing out an enum for all that? No, not really. There is actually a resource that already has all this info in a nice neat file. All you need to do is include the ImageConstants.cs file from the source download at the bottom of the page. If you open the file, you will notice that all the enums are in a namespace, which you can rename to anything you like or you can just leave it. For this tutorial they are in the namespace PFP.Imaging, so in order to use the enums all you have to do is simply add ‘using PFP.Imaging’ at the top of your file.
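To give a feel for what such enums might look like, here is a heavily trimmed sketch. It is not the contents of ImageConstants.cs: the numeric values follow Microsoft's property tag and type tables, but the member names marked as assumed are my own guesses, and the real file covers far more tags.

```csharp
using System;

// Hypothetical, trimmed-down stand-ins for the enums in ImageConstants.cs.
// Values follow the GDI+ property type table; names marked "assumed"
// are illustrative and may differ from the real file.
public enum PropertyTagType : short
{
    Byte = 1,        // assumed name
    ASCII = 2,
    Int16 = 3,       // assumed to map to the 16-bit "short" type
    Int32 = 4,       // assumed to map to the 32-bit "long" type
    Rational = 5,
    Undefined = 7,
    SLONG = 9,
    SRational = 10
}

public enum PropertyTagId
{
    ImageWidth = 0x0100,   // assumed name
    ImageHeight = 0x0101,  // assumed name
    EquipMake = 0x010F,    // assumed name
    EquipModel = 0x0110    // assumed name
}

public static class EnumSketch
{
    public static void Main()
    {
        // Casting a raw number to the enum gives us a readable name.
        Console.WriteLine((PropertyTagId)0x0100);  // prints "ImageWidth"
        Console.WriteLine((PropertyTagType)2);     // prints "ASCII"
    }
}
```

The point is only that a cast turns an opaque number into a name we can read and compare against.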

OK, so hopefully the enums are all set up now and we can move on to some real code. The first thing we need to do is add some new ‘using’ statements. We will be using a couple of non-standard namespaces for this one:
using System.Drawing.Imaging;   // PropertyItem
using System.Collections.Generic;   // Dictionary and KeyValuePair
using PFP.Imaging;   // our enums
 
These three using statements will allow us to use some pretty neat stuff, including our enums and the dictionary object. Our next step actually involves a dictionary object, and a pretty crazy one as well. Now, if you know what a Hashtable is, then you know what a Dictionary is. The only difference is that a Hashtable stores its keys and values as plain objects, while a Dictionary is generic, so you set the types for both. Our Dictionary object will have a key of the type PropertyTagId (one of our enums) and the value will be another key/value pair. This key/value pair will have a key of type PropertyTagType (another of our enums) corresponding to the type of value, then finally the value itself as an object. This will be returned from a function, so our function declaration will look like this:
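If the nested types are hard to picture, the following stand-alone sketch may help. It fills a dictionary of the same shape by hand and looks up one entry by its enum key. The two miniature enums and the sample values are made up for illustration and are not the ones from PFP.Imaging.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryShapeSketch
{
    // Hypothetical miniature enums standing in for the PFP.Imaging ones.
    public enum PropertyTagId { ImageWidth = 0x0100, EquipMake = 0x010F }
    public enum PropertyTagType : short { ASCII = 2, Int16 = 3 }

    // Builds a small dictionary with the same shape the tutorial uses:
    // tag id -> (tag type, decoded value).
    public static Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, object>> BuildSample()
    {
        var props = new Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, object>>();
        props.Add(PropertyTagId.ImageWidth,
            new KeyValuePair<PropertyTagType, object>(PropertyTagType.Int16, (short)640));
        props.Add(PropertyTagId.EquipMake,
            new KeyValuePair<PropertyTagType, object>(PropertyTagType.ASCII, "ExampleCam"));
        return props;
    }

    public static void Main()
    {
        var props = BuildSample();

        // Look up one specific property by its enum key.
        KeyValuePair<PropertyTagType, object> entry;
        if (props.TryGetValue(PropertyTagId.EquipMake, out entry))
            Console.WriteLine(entry.Key + ": " + entry.Value);  // prints "ASCII: ExampleCam"
    }
}
```

Using TryGetValue rather than the indexer avoids an exception when an image simply does not carry the tag you asked for.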
public static Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, Object>>
    BuildPropsHash(Image metaImage)
{
}
 
One pretty crazy function, but it will return a dictionary with everything that you will ever need involving metadata, presented in a way that is readable. Now we have to fill this function with the code that will fill our dictionary. As with any function, we have to start by declaring our variables, or in this case our return value. We are obviously returning a beastly Dictionary, so let’s get that all set up:
Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, Object>> returnImageProps = 
    new Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, object>>();

I already explained how this Dictionary is set up, so I don’t think a repeat is needed, but remember that PropertyTagId and PropertyTagType are the enums defined in the PFP.Imaging namespace. All they do is take a number and make it more readable to us.

The next thing we have to do is finally start to get the binary data that corresponds to the metadata. This is not that hard. In fact, the Image class has a property called PropertyItems that we can loop through. It is an array of PropertyItem objects, a class found in the System.Drawing.Imaging namespace. Each PropertyItem object represents a piece of metadata, so we can use the PropertyItems array to get all the metadata for an image. The loop will start like this:
foreach (PropertyItem property in metaImage.PropertyItems)
{
}
 
Just keep in mind that metaImage is our argument for the function, in the form of an image object. What happens inside the loop is the magical and yet tricky part of getting the metadata to be legible. We are going to use a monster of a switch statement to read the binary data, all based on what type of data it is. I will go ahead and give you the garglemesh, because it will be much easier to explain once you see what the code is:
Object propValue = new Object();

switch ((PropertyTagType)property.Type)
{
  case PropertyTagType.ASCII:
    // Drop the trailing null terminator; guard against an empty value
    ASCIIEncoding encoding = new ASCIIEncoding();
    propValue = encoding.GetString(property.Value, 0, Math.Max(property.Len - 1, 0));
    break;
  case PropertyTagType.Int16:
    propValue = BitConverter.ToInt16(property.Value, 0);
    break;
  case PropertyTagType.SLONG:
  case PropertyTagType.Int32:
    propValue = BitConverter.ToInt32(property.Value, 0);
    break;
  case PropertyTagType.SRational:
  case PropertyTagType.Rational:
    // First 4 bytes are the numerator, next 4 the denominator.
    // SRational parts are signed, Rational parts are unsigned.
    bool isSigned = (PropertyTagType)property.Type == PropertyTagType.SRational;
    double numerator = isSigned
        ? (double)BitConverter.ToInt32(property.Value, 0)
        : (double)BitConverter.ToUInt32(property.Value, 0);
    double denominator = isSigned
        ? (double)BitConverter.ToInt32(property.Value, 4)
        : (double)BitConverter.ToUInt32(property.Value, 4);

    if (denominator != 0)
      propValue = (numerator / denominator).ToString();
    else
      propValue = "0";

    if (propValue.ToString() == "NaN")
      propValue = "0";
    break;
  case PropertyTagType.Undefined:
    propValue = "Undefined Data";
    break;
}

// Store the decoded value under its tag id, along with its type
returnImageProps.Add(NumToEnum<PropertyTagId>(property.Id), 
    new KeyValuePair<PropertyTagType, object>(
        NumToEnum<PropertyTagType>(property.Type), propValue));

OK, so the first thing we do is declare an object that will hold our value. The value type will vary, but that doesn’t matter as long as we use the base object type. Next is our switch statement, which I could talk about for hours on end, but I will keep it simple. The switch handles 7 types of metadata, a couple of which can be grouped together. First we take any ASCII values and convert them to strings. Then we take any 16-bit integers and use the BitConverter class to convert the binary data to an integer. The BitConverter class is a super-cool conversion class that converts binary data into a whole host of values, mostly numerical. Moving on, we have SLongs and Int32s, which are pretty much the same type of value, so we convert both to Int32s. After that, we test for and convert SRationals and Rationals, which are fractional values.
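The first two conversions can be tried in isolation on hand-made byte arrays, without loading an image at all. The sketch below is only an illustration: the helper names DecodeAscii and DecodeInt16 are made up, and it assumes the ASCII value ends in a null terminator, which is why the last byte is skipped.

```csharp
using System;
using System.Text;

public static class SimpleDecodeSketch
{
    // Decodes an ASCII value, assuming the final byte is a null terminator.
    public static string DecodeAscii(byte[] value)
    {
        if (value == null || value.Length == 0)
            return string.Empty;
        return Encoding.ASCII.GetString(value, 0, value.Length - 1);
    }

    // Reads the first two bytes of the value as a 16-bit integer.
    public static short DecodeInt16(byte[] value)
    {
        return BitConverter.ToInt16(value, 0);
    }

    public static void Main()
    {
        // Hand-made sample data standing in for PropertyItem.Value.
        byte[] make = Encoding.ASCII.GetBytes("Camera\0");
        byte[] width = BitConverter.GetBytes((short)640);

        Console.WriteLine(DecodeAscii(make));   // prints "Camera"
        Console.WriteLine(DecodeInt16(width));  // prints "640"
    }
}
```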

Up until now the conversion process, as you can see, has been fairly straightforward: you take in some binary data and convert it using a one-liner with BitConverter. Rationals are decimals, but they are stored a little strangely. The two Rational types are actually fractions that, when divided, give you the decimal value. If you are proficient with C# you will notice that the first 4 bytes are the numerator of the fraction and the second set of 4 bytes is the denominator. We convert them to doubles, then divide. Before we divide, we check the denominator in case it is zero, and if it is we just set the ending value to zero. Dividing doubles by zero does not throw an exception in C#, but it does produce Infinity or NaN, neither of which is a useful value to display. As a last safety net, if we still end up with “NaN” for our value, we return 0 in that case as well.
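To make the fraction layout concrete, here is a small self-contained sketch that packs a numerator and denominator into 8 bytes and decodes them again. The Pack and DecodeRational helpers are hypothetical names used only for this illustration, and the isSigned flag reflects the assumption that SRational parts are signed while Rational parts are unsigned.

```csharp
using System;

public static class RationalSketch
{
    // Packs two 32-bit integers into the 8-byte layout described above:
    // numerator in bytes 0-3, denominator in bytes 4-7.
    public static byte[] Pack(int numerator, int denominator)
    {
        byte[] bytes = new byte[8];
        BitConverter.GetBytes(numerator).CopyTo(bytes, 0);
        BitConverter.GetBytes(denominator).CopyTo(bytes, 4);
        return bytes;
    }

    // Decodes the fraction to a double, returning 0 for a zero denominator.
    public static double DecodeRational(byte[] value, bool isSigned)
    {
        double numerator = isSigned
            ? (double)BitConverter.ToInt32(value, 0)
            : (double)BitConverter.ToUInt32(value, 0);
        double denominator = isSigned
            ? (double)BitConverter.ToInt32(value, 4)
            : (double)BitConverter.ToUInt32(value, 4);

        if (denominator == 0)
            return 0;
        return numerator / denominator;
    }

    public static void Main()
    {
        Console.WriteLine(DecodeRational(Pack(1, 2), false));   // one half
        Console.WriteLine(DecodeRational(Pack(-1, 4), true));   // minus one quarter
        Console.WriteLine(DecodeRational(Pack(5, 0), false));   // zero denominator gives 0
    }
}
```

Reading a negative SRational with the unsigned conversion would turn it into a very large positive number, which is why the sketch keeps the two cases apart.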

The last test in the switch is for a value that has no format, and therefore we won’t know how to read or display it. We can either ignore it or display a friendly message. In this case we just return “Undefined Data”. The final thing we do in the loop is add an entry to our dictionary, putting in all the information needed. You will notice that we use a function called NumToEnum, which converts a number, such as the Property Tag Id or Type, into the relevant enum value.
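NumToEnum comes from the source download rather than from the .NET Framework. As a rough idea of what such a helper could look like, here is a minimal sketch built on Enum.ToObject. It is an assumption about the shape of the function, not the original implementation, and the tiny enum is made up for the example.

```csharp
using System;

public static class NumToEnumSketch
{
    // Hypothetical miniature enum for the demonstration.
    public enum PropertyTagId { ImageWidth = 0x0100, ImageHeight = 0x0101 }

    // Converts a raw number into the enum type T.
    // Numbers with no matching member still convert; they just have no name.
    public static T NumToEnum<T>(int number) where T : struct
    {
        return (T)Enum.ToObject(typeof(T), number);
    }

    public static void Main()
    {
        Console.WriteLine(NumToEnum<PropertyTagId>(0x0100));  // prints "ImageWidth"

        // An unknown id prints as a plain number; Enum.IsDefined can detect that case.
        Console.WriteLine(NumToEnum<PropertyTagId>(0x9999));
        Console.WriteLine(Enum.IsDefined(typeof(PropertyTagId), 0x9999));  // prints "False"
    }
}
```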

The last thing you do with any function with a return value is... return a value. So after we add that, the final outcome is:
public static Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, Object>>
    BuildPropsHash(Image metaImage)
{
  Dictionary<PropertyTagId,KeyValuePair<PropertyTagType, Object>> returnImageProps =
      new Dictionary<PropertyTagId,KeyValuePair<PropertyTagType, object>>();

  foreach (PropertyItem property in metaImage.PropertyItems)
  {
    Object propValue = new Object();
    switch ((PropertyTagType)property.Type)
    {
      case PropertyTagType.ASCII:
        // Drop the trailing null terminator; guard against an empty value
        ASCIIEncoding encoding = new ASCIIEncoding();
        propValue = encoding.GetString(property.Value, 0, Math.Max(property.Len - 1, 0));
        break;
      case PropertyTagType.Int16:
        propValue = BitConverter.ToInt16(property.Value, 0);
        break;
      case PropertyTagType.SLONG:
      case PropertyTagType.Int32:
        propValue = BitConverter.ToInt32(property.Value, 0);
        break;
      case PropertyTagType.SRational:
      case PropertyTagType.Rational:
        // First 4 bytes are the numerator, next 4 the denominator.
        // SRational parts are signed, Rational parts are unsigned.
        bool isSigned = (PropertyTagType)property.Type == PropertyTagType.SRational;
        double numerator = isSigned
            ? (double)BitConverter.ToInt32(property.Value, 0)
            : (double)BitConverter.ToUInt32(property.Value, 0);
        double denominator = isSigned
            ? (double)BitConverter.ToInt32(property.Value, 4)
            : (double)BitConverter.ToUInt32(property.Value, 4);

        if (denominator != 0)
          propValue = (numerator / denominator).ToString();
        else
          propValue = "0";

        if (propValue.ToString() == "NaN")
          propValue = "0";
        break;
      case PropertyTagType.Undefined:
        propValue = "Undefined Data";
        break;
    }
    // Store the decoded value under its tag id, along with its type
    returnImageProps.Add(NumToEnum<PropertyTagId>(property.Id),
        new KeyValuePair<PropertyTagType, object>(
            NumToEnum<PropertyTagType>(property.Type), propValue));
  }
  return returnImageProps;
}
 
This function will return a slightly large and complex dictionary. As long as you know how to use it, it can be quite powerful. For simplicity's sake, I will give you some code to loop through it and display all the values:
Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, Object>> imageMeta = 
    ImageInfo.BuildPropsHash(newImage);

foreach (KeyValuePair<PropertyTagId, KeyValuePair<PropertyTagType, Object>>
    property in imageMeta)
{
  richTextBox1.Text += property.Key.ToString() + ": " + 
      property.Value.ToString() + "\n";
}
 
This will fill a RichTextBox with the metadata from newImage. I put my functions inside a class, so ImageInfo.BuildPropsHash is a method call to the function we just built. Besides the ridiculous Dictionary, the loop is relatively simple. If you play around with the Dictionary object that was returned, you will find that getting metadata from it, using our nice enums, is quite easy. The output of the code above might look similar to:

Metadata App Screenshot

Hopefully this screenshot helps you realize what you can do with the dictionary full of metadata. You can actually use our enums to help find a specific piece of metadata, even without knowing the exact id number. And with that this tutorial closes. Below are some resources I used to help create this tutorial, along with the Visual Studio Solution. If there are any questions, feel free to ask and remember that when you need coding help or advice, just switch on the code.

Source Files:
