Extract Metadata with Adobe XMP [Part 2]

In the previous post about Extracting Metadata with Adobe XMP we extracted the complete metadata from a file and saved it to an XML file. In this post we will look at how to extract just the elements we want. Also, don’t forget to grab the Adobe XMP Specification and visit the Adobe XMP Developer Center for additional info about XMP. Before continuing, I suggest you extract the metadata from a file to XML; you can also find the same structure in ‘Links panel -> (Right click on image link) -> XMP File Info -> Raw Data (tab)’. Having it in front of you makes it easier to understand and find the data we want to extract. So, let’s get started.

First, we are going to load ‘AdobeXMPScript’ through ‘ExternalObject’.

function loadXMPLibrary(){
    if ( !ExternalObject.AdobeXMPScript ){
        try{ExternalObject.AdobeXMPScript = new ExternalObject('lib:AdobeXMPScript');}
        catch (e){alert('Unable to load the AdobeXMPScript library!'); return false;}
    }
    return true;
}

Also, if you want to unload ‘AdobeXMPScript’ at the end, you can do it like this:

function unloadXMPLibrary(){
    if( ExternalObject.AdobeXMPScript ){
        try{ExternalObject.AdobeXMPScript.unload(); ExternalObject.AdobeXMPScript = undefined;}
        catch (e){alert('Unable to unload the AdobeXMPScript library!');}
    }
}

Good. Now we are going to load the XMP data from the selected image frame, like we did last time. We will also close the file and unload the XMP library after loading the metadata:

// Requires exactly one selected frame that contains a placed graphic.
if(loadXMPLibrary() && app.selection.length == 1 && app.selection[0].contentType == ContentType.GRAPHIC_TYPE){
    var myFile = File(app.selection[0].graphics[0].itemLink.filePath);
    // Open the linked file read-only and pull out its XMP packet.
    var xmpFile = new XMPFile(myFile.fsName, XMPConst.UNKNOWN, XMPConst.OPEN_FOR_READ);
    var myXmp = xmpFile.getXMP();
    xmpFile.closeFile(XMPConst.CLOSE_UPDATE_SAFELY);
    unloadXMPLibrary();
}

OK, so now ‘myXmp’ holds all the metadata. XMP is organized into ‘namespaces’, so to load the data we want, we have to know which namespace it lives in. Here is a list of the most common ones:

| Namespace value | Description | URI |
| --- | --- | --- |
| NS_XMP | The XML namespace for the XMP basic schema | http://ns.adobe.com/xap/1.0/ |
| NS_XMP_MM | The XML namespace for the XMP digital asset management schema | http://ns.adobe.com/xap/1.0/mm/ |
| NS_XMP_RIGHTS | The XML namespace for the XMP copyright schema | http://ns.adobe.com/xap/1.0/rights/ |
| NS_IPTC_CORE | The XML namespace for the IPTC Core schema | http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/ |
| NS_DC | The XML namespace for the Dublin Core schema | http://purl.org/dc/elements/1.1/ |
| NS_CAMERA_RAW | The XML namespace for the Camera Raw schema | http://ns.adobe.com/camera-raw-settings/1.0/ |
| NS_PHOTOSHOP | The XML namespace for the Adobe Photoshop custom schema | http://ns.adobe.com/photoshop/1.0/ |
| NS_TIFF | The XML namespace for Adobe’s TIFF schema | http://ns.adobe.com/tiff/1.0/ |
| NS_EXIF | The XML namespace for Adobe’s EXIF schema | http://ns.adobe.com/exif/1.0/ |
| NS_EXIF_AUX | The XML namespace for Adobe’s EXIF auxiliary schema | http://ns.adobe.com/exif/1.0/aux/ |
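
As a quick reference, the list above can be mirrored in a plain lookup object. This is only a sketch: ‘NAMESPACE_URIS’ and ‘resolveNamespace’ are hypothetical names that are not part of AdobeXMPScript, and inside InDesign you would normally just use the ‘XMPConst’ values directly.

```javascript
// Hypothetical lookup table mirroring the namespace list above.
// Not part of AdobeXMPScript; shown only to make the name/URI pairing explicit.
var NAMESPACE_URIS = {
    NS_XMP:        'http://ns.adobe.com/xap/1.0/',
    NS_XMP_MM:     'http://ns.adobe.com/xap/1.0/mm/',
    NS_XMP_RIGHTS: 'http://ns.adobe.com/xap/1.0/rights/',
    NS_IPTC_CORE:  'http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/',
    NS_DC:         'http://purl.org/dc/elements/1.1/',
    NS_CAMERA_RAW: 'http://ns.adobe.com/camera-raw-settings/1.0/',
    NS_PHOTOSHOP:  'http://ns.adobe.com/photoshop/1.0/',
    NS_TIFF:       'http://ns.adobe.com/tiff/1.0/',
    NS_EXIF:       'http://ns.adobe.com/exif/1.0/',
    NS_EXIF_AUX:   'http://ns.adobe.com/exif/1.0/aux/'
};

// Accepts either a constant name from the list or a full URI.
// Returns the URI, or null when the name is not in the list.
function resolveNamespace(nameOrUri) {
    if (NAMESPACE_URIS.hasOwnProperty(nameOrUri)) {
        return NAMESPACE_URIS[nameOrUri];
    }
    // Anything that already looks like a URI is passed through unchanged.
    return nameOrUri.indexOf('://') !== -1 ? nameOrUri : null;
}
```

A custom namespace URI copied from the raw metadata passes through unchanged, so the same helper works for schemas that are not in the list.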

You can find additional information on the Adobe InDesign Scripting site in the Scripting section, and in the Adobe Creative Suite JavaScript Tools Guide on page 262. Now, let’s see how to use these ‘namespaces’ to load the data we want. Instead of the ‘XMPConst’ constant we can also pass the ‘namespace URI’ itself as the ‘namespace’ value. The ‘namespace URI’ can be found in the metadata at the beginning of each description node (the ‘xmlns’ attribute).

We are going to use the basic ‘XMP’ schema to load ‘CreatorTool’. Other properties of the basic ‘XMP Schema’ can be found in the Adobe XMP Specification on page 38. Properties of the ‘IPTC Schema’ can be found on the IPTC website. I’m using a sample image that can be downloaded from the IPTC web site. Here is the part of the ‘XMP’ metadata stored in the file:

<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.0-c060 61.134777, 2010/02/12-17:32:00">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

    <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
      <xmp:CreateDate>2005-02-01T00:30:44-06:00</xmp:CreateDate>
      <xmp:ModifyDate>2010-07-14T03:38:26-05:00</xmp:ModifyDate>
      <xmp:MetadataDate>2010-07-16T03:24:33+01:00</xmp:MetadataDate>
      <xmp:CreatorTool>Adobe Photoshop CS5 Windows</xmp:CreatorTool>
      <xmp:Rating>3</xmp:Rating>
      <xmp:Label>Red</xmp:Label>
    </rdf:Description>

  </rdf:RDF>
</x:xmpmeta>

And let’s take a look at the code we need to load the ‘CreatorTool’ node:

if(myXmp){
    var myCreatorTool = myXmp.getProperty(XMPConst.NS_XMP,"CreatorTool");
}

Here we are using the ‘getProperty(schemaNS, propName[, valueType]);’ function of the ‘XMPMeta Object’. More info about it can be found in the Adobe Creative Suite JavaScript Tools Guide on page 280. The first parameter is the ‘namespace’, the second is the name of the node we are targeting, and the optional third one is the value type. We are going to use just the first two.
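
If a property is missing from the file, it helps to have a fallback instead of working with an empty result. Here is a minimal sketch of such a wrapper. ‘readSimpleProperty’ and ‘fallback’ are hypothetical names of mine, and the sketch assumes that ‘getProperty’ returns an object with a ‘value’ member, or nothing at all when the property does not exist.

```javascript
// Reads a simple (non-array, non-struct) property and returns it as a string.
// `xmp` is assumed to be an XMPMeta object such as `myXmp`.
// `readSimpleProperty` and `fallback` are hypothetical names for this sketch.
function readSimpleProperty(xmp, schemaNS, propName, fallback) {
    var prop = xmp.getProperty(schemaNS, propName);
    // Assumption: a missing property comes back as undefined (or null).
    if (prop === undefined || prop === null) {
        return fallback;
    }
    // Assumption: the returned object carries the text in its `value` member.
    return String(prop.value);
}
```

It could be called like ‘readSimpleProperty(myXmp, XMPConst.NS_XMP, "CreatorTool", "")’ while the XMP library is still loaded.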

Good, now let’s try one more example. We are going to load ‘Location’ from the ‘IPTC Schema’. We only need to change the second line of the previous script: we pass the ‘IPTC namespace’ and the new node name. Here is the part of the ‘IPTC’ metadata stored in the file:

<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.0-c060 61.134777, 2010/02/12-17:32:00">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

    <rdf:Description xmlns:Iptc4xmpCore="http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/" rdf:about="">
      <Iptc4xmpCore:IntellectualGenre>Profile</Iptc4xmpCore:IntellectualGenre>
      <Iptc4xmpCore:Location>Moore family farm</Iptc4xmpCore:Location>
      <Iptc4xmpCore:CountryCode>US</Iptc4xmpCore:CountryCode>
      <Iptc4xmpCore:CreatorContactInfo rdf:parseType="Resource">
        <Iptc4xmpCore:CiAdrExtadr>Big Newspaper, 123 Main Street</Iptc4xmpCore:CiAdrExtadr>
        <Iptc4xmpCore:CiAdrCity>Boston</Iptc4xmpCore:CiAdrCity>
        <Iptc4xmpCore:CiAdrRegion>Massachusetts</Iptc4xmpCore:CiAdrRegion>
        <Iptc4xmpCore:CiAdrPcode>02134</Iptc4xmpCore:CiAdrPcode>
        <Iptc4xmpCore:CiAdrCtry>United States</Iptc4xmpCore:CiAdrCtry>
        <Iptc4xmpCore:CiTelWork>+1 (800) 1234567</Iptc4xmpCore:CiTelWork>
        <Iptc4xmpCore:CiEmailWork>johndoe@bignewspaper.com</Iptc4xmpCore:CiEmailWork>
        <Iptc4xmpCore:CiUrlWork>http://www.bignewspaper.com</Iptc4xmpCore:CiUrlWork>
      </Iptc4xmpCore:CreatorContactInfo>
      <Iptc4xmpCore:Scene>
        <rdf:Bag>
          <rdf:li>011900</rdf:li>
        </rdf:Bag>
      </Iptc4xmpCore:Scene>
      <Iptc4xmpCore:SubjectCode>
        <rdf:Bag>
          <rdf:li>04001000</rdf:li>
          <rdf:li>04001001</rdf:li>
        </rdf:Bag>
      </Iptc4xmpCore:SubjectCode>
    </rdf:Description>

  </rdf:RDF>
</x:xmpmeta>

And here is the code:

    var myLocation = myXmp.getProperty(XMPConst.NS_IPTC_CORE,"Location");

Good, now let’s get to some trickier parts. For example, ‘CreatorContactInfo’ from the ‘IPTC Schema’ contains nested nodes, so we also have to address the child nodes. A child node is addressed as ‘parent/prefix:child’. If we want to load the complete ‘CreatorContactInfo’, we need to go through each child node separately. The following line loads just the first child node:

    var myCreatorAddress = myXmp.getProperty(XMPConst.NS_IPTC_CORE,"CreatorContactInfo/Iptc4xmpCore:CiAdrExtadr");
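
To go through all child nodes, the same path pattern can be built in a loop. The sketch below is one possible way to do it: ‘CONTACT_FIELDS’, ‘structFieldPath’ and ‘readContactInfo’ are hypothetical names, the field list is simply copied from the sample metadata above, and it assumes ‘getProperty’ returns an object with a ‘value’ member or nothing for a missing node.

```javascript
// Child fields of CreatorContactInfo, copied from the sample metadata above.
var CONTACT_FIELDS = [
    'CiAdrExtadr', 'CiAdrCity', 'CiAdrRegion', 'CiAdrPcode',
    'CiAdrCtry', 'CiTelWork', 'CiEmailWork', 'CiUrlWork'
];

// Builds a path such as 'CreatorContactInfo/Iptc4xmpCore:CiAdrCity'.
function structFieldPath(structName, fieldPrefix, fieldName) {
    return structName + '/' + fieldPrefix + ':' + fieldName;
}

// Reads every listed child field into a plain object; missing fields are skipped.
// `xmp` is assumed to be an XMPMeta object such as `myXmp`.
function readContactInfo(xmp, schemaNS) {
    var result = {};
    for (var i = 0; i < CONTACT_FIELDS.length; i++) {
        var field = CONTACT_FIELDS[i];
        var path = structFieldPath('CreatorContactInfo', 'Iptc4xmpCore', field);
        var prop = xmp.getProperty(schemaNS, path);
        if (prop !== undefined && prop !== null) {
            result[field] = String(prop.value);
        }
    }
    return result;
}
```

With the sample image, ‘readContactInfo(myXmp, XMPConst.NS_IPTC_CORE)’ should give an object whose ‘CiAdrCity’ entry holds the city from the metadata.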

Great. The last thing we are going to look at is so-called ‘Array valued XMP properties’. More info can be found on the Adobe XMP Developer Center in the Part 1, Data model, Serialization, and Core Properties document on page 21. This is the part of the ‘Dublin Core Schema’ metadata stored in the file:

    <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
      <dc:format>image/jpeg</dc:format>
      <dc:description>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">After digging the furrows another ten yards with the tractor, Jim Moore hops off to hand-set more leeks and onions.</rdf:li>
        </rdf:Alt>
      </dc:description>
      <dc:creator>
        <rdf:Seq>
          <rdf:li>John Doe</rdf:li>
        </rdf:Seq>
      </dc:creator>
      <dc:subject>
        <rdf:Bag>
          <rdf:li>agriculture</rdf:li>
          <rdf:li>farm laborer</rdf:li>
          <rdf:li>farmer</rdf:li>
          <rdf:li>field hand</rdf:li>
          <rdf:li>field worker</rdf:li>
          <rdf:li>humans</rdf:li>
          <rdf:li>occupation</rdf:li>
          <rdf:li>people</rdf:li>
          <rdf:li>agricultural</rdf:li>
          <rdf:li>agronomy</rdf:li>
        </rdf:Bag>
      </dc:subject>
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">01661gdx</rdf:li>
        </rdf:Alt>
      </dc:title>
      <dc:rights>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">©2010 Big Newspaper, all rights reserved</rdf:li>
        </rdf:Alt>
      </dc:rights>
    </rdf:Description>

Almost all data stored here is in ‘Array valued XMP properties’. For example, if we want to get ‘description’ and we use the method from above,

    var myDescription = myXmp.getProperty(XMPConst.NS_DC,'description');

we will get an empty string. That’s because the ‘description’ data is inside an ‘XMP Array’. So, how do we retrieve data stored there? The first way is to add the ‘Array’ index at the end of the node name, like this:

    var myDescription = myXmp.getProperty(XMPConst.NS_DC,'description[1]');

Notice that ‘XMP Array’ indexes start at 1, not at 0 like other ‘Arrays’!
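
Because the rest of a script usually counts from 0, it is easy to be off by one here. A tiny helper can do the conversion in one place. ‘arrayItemPath’ is a hypothetical name, not something from AdobeXMPScript.

```javascript
// Converts a normal 0-based JavaScript index into a 1-based XMP array path,
// for example ('description', 0) becomes 'description[1]'.
// `arrayItemPath` is a hypothetical helper name for this sketch.
function arrayItemPath(arrayName, zeroBasedIndex) {
    return arrayName + '[' + (zeroBasedIndex + 1) + ']';
}
```

The result can be passed as the node name, as in ‘myXmp.getProperty(XMPConst.NS_DC, arrayItemPath("description", 0))’.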

But how do we check how many items the ‘Array’ has? The ‘XMPMeta Object’ has a function ‘countArrayItems();’ that counts the items in an ‘Array’. Let’s count the number of elements in ‘subject’:

    var mySubjectCount = myXmp.countArrayItems(XMPConst.NS_DC,'subject');

Now ‘mySubjectCount’ holds the number of ‘Array’ items, so we can loop through the ‘subject’ node and get the data. The last ‘XMPMeta Object’ function we are going to use is ‘getArrayItem();’. This function is very easy to use in a loop. Its first and second parameters target the ‘namespace’ and the node/array name, and the third is the (1-based) index of the ‘Array’ item we want to retrieve.

    var mySubject = myXmp.getArrayItem(XMPConst.NS_DC,'subject', 1);
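
Putting ‘countArrayItems’ and ‘getArrayItem’ together, a loop over the whole ‘Array’ could look like the sketch below. ‘collectArrayItems’ is a hypothetical name, and the sketch assumes each returned item exposes its text through a ‘value’ member.

```javascript
// Collects all items of an XMP array into a normal JavaScript array of strings.
// `xmp` is assumed to be an XMPMeta object such as `myXmp`.
// `collectArrayItems` is a hypothetical helper name for this sketch.
function collectArrayItems(xmp, schemaNS, arrayName) {
    var items = [];
    var count = xmp.countArrayItems(schemaNS, arrayName);
    // XMP arrays are 1-based, so the loop runs from 1 to count inclusive.
    for (var i = 1; i <= count; i++) {
        var item = xmp.getArrayItem(schemaNS, arrayName, i);
        // Assumption: the item carries its text in a `value` member.
        items.push(String(item.value));
    }
    return items;
}
```

For the sample image, ‘collectArrayItems(myXmp, XMPConst.NS_DC, "subject")’ should return the ten keywords listed in the ‘subject’ node, in the order they are stored.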

Well, that’s it. I hope I covered all the basics of accessing data stored in metadata. Also, don’t forget that there are custom metadata ‘namespaces’ too; you can always find their ‘namespace URIs’ by extracting the complete metadata or through the ‘Links panel’.

That’s it! I hope you learned something, at least I did 😉

Have fun! 😀
