Everyone has images. Maybe you're a photographer, a graphic artist, or
you simply had a really awesome vacation, but chances are you have a lot
of them. What you may or may not know is that every image contains extra
information about itself: when it was taken or created, the size of the
image, and sometimes even details such as the focal length and type of
lens used. Almost every image has some sort of extra data, called
metadata, and using C# we can write a function or even a class to help
us get this metadata from the image.
Image metadata is formatted according to published standards such as
EXIF. Each piece of metadata is referred to as a property and has four
attributes: Id, Type, Value, and Length. The Id is a number that
identifies what the property is, e.g. the image width property has an Id
of 0x0100. Because an image file is binary, probably the most important
attribute is the Type, which tells us what kind of data the Value
attribute holds, like a string for example. Finally you have the Length,
which tells you how many bytes long the Value is. With those four
attributes, you can get, format, and display metadata.
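To make the four attributes concrete, here is a minimal sketch of one property as a plain class. The class name MetaProperty and its Describe method are made up for illustration and are not part of any library; the sample values in the usage note are invented too.

```csharp
using System;

// Hypothetical stand-in for one metadata property. The field names mirror
// the four attributes described above and are not taken from any library.
public sealed class MetaProperty
{
    public int Id;        // which property this is, e.g. 0x0100 = image width
    public short Type;    // numeric code for how Value is encoded
    public int Length;    // number of bytes in Value
    public byte[] Value;  // the raw binary data

    // Formats the three numeric attributes; Value stays raw until decoded.
    public string Describe()
    {
        return string.Format("Id=0x{0:X4} Type={1} Length={2}", Id, Type, Length);
    }
}
```

For example, a property with Id 0x0100, Type 3 and a two-byte Value would be described as "Id=0x0100 Type=3 Length=2". Notice that nothing here tells you what the Value means yet, which is exactly why the Type matters.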
Before we go diving into a mess of code, we need to set some things up
first. The Id and Type are really just numbers that correspond to
something useful, so in order to display and format the metadata, we
need to know what number is what. There are plenty of resources for
this, but the best I found was Microsoft’s metadata Property Tag table.
But in the end we need something we can use in our code. There are a
couple of ways we can go from a number to something readable, but the
best way is to make an enum.
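As a rough sketch of the idea, here is a tiny enum covering a handful of Ids. The enum name SampleTagId and its member names are my own assumptions and may not match what the real file uses; the numbers are the ones I understand Microsoft's table to list for these tags.

```csharp
using System;

// Illustrative subset only. The member names are assumptions and may not
// match the ones in ImageConstants.cs.
public enum SampleTagId
{
    ImageWidth = 0x0100,
    ImageHeight = 0x0101,
    EquipMake = 0x010F,
    EquipModel = 0x0110,
    DateTime = 0x0132
}

public static class EnumDemo
{
    // Casting a raw number to the enum gives it a readable name.
    public static string NameOf(int rawId)
    {
        return ((SampleTagId)rawId).ToString();
    }
}
```

Calling EnumDemo.NameOf(0x0100) gives back "ImageWidth". A number that is not in the enum simply comes back as the number itself, which is a handy hint that the enum is missing an entry.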
If you glanced at Microsoft’s metadata info, you may have noticed there
are a lot of numbers there. So do I expect you to spend an hour typing
out an enum for all that? No, not really. There is actually a resource
that already has all this info in a nice neat file.
All you need to do is include the ImageConstants.cs file from the source
download at the bottom of the page. If you open the file, you will
notice that all the enums are in a namespace, which you can rename to
anything you like or just leave as it is. For this tutorial they are in
the namespace PFP.Imaging, so in order to use the enums all you have to
do is add ‘using PFP.Imaging;’ at the top of your file.
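OK, so hopefully the enums are all set up now and we can move on to some
real code. The first thing we need to do is add some ‘using’
statements. Besides our own namespace, we need a few namespaces that a
new file may not include by default:
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.Text;
using PFP.Imaging;
These using statements will allow us to use some pretty neat stuff,
including our enums (PFP.Imaging), the generic Dictionary object
(System.Collections.Generic), the Image and PropertyItem classes
(System.Drawing and System.Drawing.Imaging), BitConverter (System), and
ASCIIEncoding (System.Text). Our next step actually involves a
dictionary object, and a pretty crazy one as well.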
Now, if you know what a Hashtable is, then you know what a dictionary
is. The only difference is that a Hashtable stores its keys and values
as plain objects, while a Dictionary is generic, so you set the types
for the key and the value. Our Dictionary object will have a key of the
type PropertyTagId (one of our enums) and the value will be another
key/value pair. This key/value pair will have a key of a
PropertyTagType (another of our enums) corresponding to the type of
value, then finally the value as an object. This will be returned from a
function, so our function declaration will look like this:
public static Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, Object>>
    BuildPropsHash(Image metaImage)
{
}
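If the nested types are hard to picture, here is a small sketch that builds the same shape of dictionary by hand. The enums DemoTagId and DemoTagType are tiny stand-ins I made up so the example compiles on its own, and the two values stored are invented sample data, not real metadata.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the tutorial's enums, kept tiny for the demo.
public enum DemoTagId { ImageWidth = 0x0100, EquipMake = 0x010F }
public enum DemoTagType : short { ASCII = 2, Short = 3 }

public static class NestedDictionaryDemo
{
    public static Dictionary<DemoTagId, KeyValuePair<DemoTagType, object>> Build()
    {
        var props = new Dictionary<DemoTagId, KeyValuePair<DemoTagType, object>>();

        // Outer key: which property. Inner pair: how it was stored, and the value.
        props[DemoTagId.ImageWidth] =
            new KeyValuePair<DemoTagType, object>(DemoTagType.Short, (ushort)800);
        props[DemoTagId.EquipMake] =
            new KeyValuePair<DemoTagType, object>(DemoTagType.ASCII, "ExampleCam");

        return props;
    }
}
```

Reading it back is just two hops: props[DemoTagId.EquipMake].Key tells you the value was stored as ASCII, and props[DemoTagId.EquipMake].Value is the value itself.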
One pretty crazy function, but it will return a dictionary with
everything that you will ever need involving metadata, presented in a
way that is readable. Now we have to fill this function with the code
that will fill our dictionary. As with any function, we have to start by
declaring our variables, or in this case our return value. We are
obviously returning a beastly Dictionary, so let’s get that all set up:
Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, Object>> returnImageProps =
new Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, object>>();
I already explained how this Dictionary is set up, so I don’t think a
repeat is needed, but remember that PropertyTagId and PropertyTagType
are the enums defined in the PFP.Imaging namespace. All they do is take
a number and make it more readable to us.
The next thing we have to do is finally start to get the binary data
that corresponds to the metadata. This is not that hard; in fact, the
Image class has a property called PropertyItems that we can loop
through. It is an array of PropertyItem objects, a class in the
System.Drawing.Imaging namespace. Each PropertyItem object represents a
piece of metadata, so we can use the PropertyItems array to get all the
metadata for an image. The loop will start like this:
foreach (PropertyItem property in metaImage.PropertyItems)
{
}
Just keep in mind that metaImage is the argument to our function, in the
form of an Image object. What happens inside the loop is the magical and
yet tricky part of getting the metadata to be legible. We are going to
use a monster of a switch statement to read the binary data, all based
on what type of data it is. I will go ahead and give you the
garglemesh, because it will be much easier to explain once you see what
the code is:
Object propValue = new Object();
switch ((PropertyTagType)property.Type)
{
    case PropertyTagType.ASCII:
        // Skip the trailing null terminator; guard against empty values.
        ASCIIEncoding encoding = new ASCIIEncoding();
        propValue = encoding.GetString(property.Value, 0,
            Math.Max(property.Len - 1, 0));
        break;
    case PropertyTagType.Int16:
        propValue = BitConverter.ToInt16(property.Value, 0);
        break;
    case PropertyTagType.SLONG:
    case PropertyTagType.Int32:
        propValue = BitConverter.ToInt32(property.Value, 0);
        break;
    case PropertyTagType.SRational:
    case PropertyTagType.Rational:
        // First 4 bytes: numerator. Next 4 bytes: denominator.
        // SRational parts are signed, Rational parts are unsigned.
        bool isSigned =
            (PropertyTagType)property.Type == PropertyTagType.SRational;
        double numerator = isSigned
            ? (double)BitConverter.ToInt32(property.Value, 0)
            : (double)BitConverter.ToUInt32(property.Value, 0);
        double denominator = isSigned
            ? (double)BitConverter.ToInt32(property.Value, 4)
            : (double)BitConverter.ToUInt32(property.Value, 4);
        if (denominator != 0)
            propValue = (numerator / denominator).ToString();
        else
            propValue = "0";
        if (propValue.ToString() == "NaN")
            propValue = "0";
        break;
    case PropertyTagType.Undefined:
        propValue = "Undefined Data";
        break;
}
returnImageProps.Add(NumToEnum<PropertyTagId>(property.Id),
    new KeyValuePair<PropertyTagType, object>(
        NumToEnum<PropertyTagType>(property.Type), propValue));
OK, so the first thing we do is declare an object that will hold our
value. The value type will vary, but it doesn’t matter as long as we use
the base object type. Next is our switch statement, which I could talk
about for hours on end, but I will keep it simple. The switch handles 7
type values, with a couple of those able to be grouped together. First
we take any ASCII type values and convert them to strings, leaving off
the last byte because the stored length includes the string's null
terminator. Then we take any 16-bit integers and use the BitConverter
class to convert the binary data to an integer. The BitConverter class
is a super-cool conversion class that converts binary data into a whole
host of values, mostly numerical data. Moving on, we have SLongs and
Int32s, which are pretty much the same type of value, so we convert
those values to Int32s. After that, we test for and convert SRationals
and Rationals, which are decimals.
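To see the simple conversions in isolation, here is a small sketch that works on plain byte arrays instead of a real image. The class and method names are made up for this example; only Encoding.ASCII and BitConverter come from .NET.

```csharp
using System;
using System.Text;

public static class RawValueDemo
{
    // ASCII values carry a trailing null byte, so the last byte is skipped.
    public static string DecodeAscii(byte[] value)
    {
        int count = Math.Max(value.Length - 1, 0);
        return Encoding.ASCII.GetString(value, 0, count);
    }

    // Reads the first two bytes as a 16-bit integer, in the machine's byte order.
    public static short DecodeInt16(byte[] value)
    {
        return BitConverter.ToInt16(value, 0);
    }

    // Reads the first four bytes as a 32-bit integer, in the machine's byte order.
    public static int DecodeInt32(byte[] value)
    {
        return BitConverter.ToInt32(value, 0);
    }
}
```

Feeding DecodeAscii the bytes for 'C', 'a', 'm' followed by a zero byte gives back "Cam". Keep in mind that BitConverter always uses the byte order of the machine it runs on.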
Up until now the conversion process, as you can see, has been fairly
straightforward. You take in some binary data and convert it into a
usable value using a one-liner with BitConverter. Rationals are
decimals, but they are stored a little strangely. The two Rational
types are actually fractions that, when divided, will give you the
decimal value. The first 4 bytes are the numerator of the fraction and
the second set of 4 bytes is the denominator. We convert them to
doubles, then divide. Before we divide, we check the denominator in case
it is zero, and if it is we just set the ending value to zero. Dividing
a double by zero does not throw an exception in C#, but it gives you
Infinity or NaN, neither of which is a useful value to display. As a
last safeguard we also check for a result of “NaN” and return 0 in that
case as well.
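Here is the fraction logic on its own, as a sketch that takes a byte array rather than a PropertyItem. The names RationalDemo, DecodeRational and Pack are hypothetical, and the isSigned flag is my way of telling the two Rational types apart: signed reads for SRational, unsigned reads for Rational.

```csharp
using System;

public static class RationalDemo
{
    // value holds two 32-bit integers: numerator first, then denominator.
    // isSigned selects signed (SRational-style) or unsigned (Rational-style) reads.
    public static double DecodeRational(byte[] value, bool isSigned)
    {
        double numerator = isSigned
            ? (double)BitConverter.ToInt32(value, 0)
            : (double)BitConverter.ToUInt32(value, 0);
        double denominator = isSigned
            ? (double)BitConverter.ToInt32(value, 4)
            : (double)BitConverter.ToUInt32(value, 4);

        // A zero denominator would give Infinity or NaN, so fall back to 0.
        return denominator == 0 ? 0.0 : numerator / denominator;
    }

    // Helper for trying it out: packs two 32-bit integers into 8 bytes.
    public static byte[] Pack(int numerator, int denominator)
    {
        byte[] bytes = new byte[8];
        BitConverter.GetBytes(numerator).CopyTo(bytes, 0);
        BitConverter.GetBytes(denominator).CopyTo(bytes, 4);
        return bytes;
    }
}
```

So a fraction stored as 1 and 4 decodes to 0.25, and a signed fraction stored as -1 and 2 decodes to -0.5. Reading that second one with unsigned conversions would turn the -1 into a number over four billion, which is why the signed flag matters.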
The last case in the switch is for a value that has no defined format,
and therefore we won’t know how to read or display it. We can either
ignore it or display a friendly message; in this case we just return
“Undefined Data”. The final thing we do in the loop is add an entry to
our dictionary, putting in all the information needed. You will notice
that we use a function called NumToEnum, which converts a number, such
as the Property Tag Id, into the relevant enum value.
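The body of NumToEnum is not shown in this article, so here is one way it could be written. This is only a guess at the idea, not the version from the source download, and the HelperDemoTag enum exists purely so the example can be tried on its own.

```csharp
using System;

public static class EnumHelper
{
    // Hypothetical version of NumToEnum: turns a raw number into the
    // matching value of whatever enum type T is.
    public static T NumToEnum<T>(int number) where T : struct
    {
        return (T)Enum.ToObject(typeof(T), number);
    }
}

// Sample enum for trying the helper out; not part of the tutorial's code.
public enum HelperDemoTag { ImageWidth = 0x0100 }
```

With this version, EnumHelper.NumToEnum<HelperDemoTag>(0x0100) returns HelperDemoTag.ImageWidth. A number with no matching member still converts without an error; it just prints as the bare number.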
The last thing you do with any function with a return value is, of
course, return a value. So after we add that, the final outcome is:
public static Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, Object>>
    BuildPropsHash(Image metaImage)
{
    Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, Object>> returnImageProps =
        new Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, object>>();
    foreach (PropertyItem property in metaImage.PropertyItems)
    {
        Object propValue = new Object();
        switch ((PropertyTagType)property.Type)
        {
            case PropertyTagType.ASCII:
                // Skip the trailing null terminator; guard against empty values.
                ASCIIEncoding encoding = new ASCIIEncoding();
                propValue = encoding.GetString(property.Value, 0,
                    Math.Max(property.Len - 1, 0));
                break;
            case PropertyTagType.Int16:
                propValue = BitConverter.ToInt16(property.Value, 0);
                break;
            case PropertyTagType.SLONG:
            case PropertyTagType.Int32:
                propValue = BitConverter.ToInt32(property.Value, 0);
                break;
            case PropertyTagType.SRational:
            case PropertyTagType.Rational:
                // First 4 bytes: numerator. Next 4 bytes: denominator.
                // SRational parts are signed, Rational parts are unsigned.
                bool isSigned =
                    (PropertyTagType)property.Type == PropertyTagType.SRational;
                double numerator = isSigned
                    ? (double)BitConverter.ToInt32(property.Value, 0)
                    : (double)BitConverter.ToUInt32(property.Value, 0);
                double denominator = isSigned
                    ? (double)BitConverter.ToInt32(property.Value, 4)
                    : (double)BitConverter.ToUInt32(property.Value, 4);
                if (denominator != 0)
                    propValue = (numerator / denominator).ToString();
                else
                    propValue = "0";
                if (propValue.ToString() == "NaN")
                    propValue = "0";
                break;
            case PropertyTagType.Undefined:
                propValue = "Undefined Data";
                break;
        }
        returnImageProps.Add(NumToEnum<PropertyTagId>(property.Id),
            new KeyValuePair<PropertyTagType, object>(
                NumToEnum<PropertyTagType>(property.Type), propValue));
    }
    return returnImageProps;
}
This function will return a slightly large and complex dictionary. As
long as you know how to use it, it can be quite powerful. For
simplicity’s sake, I will give you some code to loop through it and
display all the values:
Dictionary<PropertyTagId, KeyValuePair<PropertyTagType, Object>> imageMeta =
    ImageInfo.BuildPropsHash(newImage);
foreach (KeyValuePair<PropertyTagId, KeyValuePair<PropertyTagType, Object>>
    property in imageMeta)
{
    richTextBox1.Text += property.Key.ToString() + ": " +
        property.Value.ToString() + "\n";
}
This will fill a RichTextBox with the metadata of newImage. I put my
functions inside a class named ImageInfo, so ImageInfo.BuildPropsHash is
a call to the function we just built. Calling it through the class name
like this requires the method to be declared static. Besides the
ridiculous Dictionary, the loop is relatively simple. If you play around
with the Dictionary object that was returned, you will find that getting
metadata from it, using our nice enums, is quite easy. The output of the
code above might look similar to:
Hopefully this screenshot helps you realize what you can do with the
dictionary full of metadata. You can actually use our enums to help find
a specific piece of metadata, even without knowing the exact id number.
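As a quick sketch of that kind of lookup, the helper below pulls one entry out of a dictionary shaped like ours and falls back to a default when the image does not carry that tag. The LookupTagId and LookupTagType enums and the GetOrDefault helper are made up for this example; with the real code you would use PropertyTagId and PropertyTagType instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the tutorial's enums.
public enum LookupTagId { ImageWidth = 0x0100, EquipModel = 0x0110 }
public enum LookupTagType : short { ASCII = 2, Short = 3 }

public static class LookupDemo
{
    // Returns the stored value as text, or the fallback when the tag is absent.
    public static string GetOrDefault(
        Dictionary<LookupTagId, KeyValuePair<LookupTagType, object>> props,
        LookupTagId id,
        string fallback)
    {
        KeyValuePair<LookupTagType, object> entry;
        if (props.TryGetValue(id, out entry) && entry.Value != null)
            return entry.Value.ToString();
        return fallback;
    }
}
```

Using TryGetValue rather than the indexer matters here, because not every image carries every tag and the indexer throws when the key is missing.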
And with that, this tutorial closes. Below are some resources I used to
help create this tutorial, along with the Visual Studio Solution. If
there are any questions, feel free to ask, and remember that when you
need coding help or advice, just switch on the code.
Source Files: