Embedding metadata in an image
I have been considering the question, "How can I provide a search engine with the metadata for my digital assets?" It came up recently at the ECCHRD meeting, in the context of adding human rights images, audio and video assets to Hurisearch.
I came across rdfpic today. The rdfpic proposal (which looks dead in the water...) approaches the problem by using "content negotiation" filters in the HTTP server. Crawlers are instructed to ask for photo.jpg with the _application/rdf_ mimetype. The provided demo does not work, but the principle seems sound. Basically, we could provide a set of filters for popular web servers that look for embedded metadata in an asset and, when a search engine requests it, serve the metadata instead of the asset.
A second option would be for the search engine to download the asset and extract the metadata from the XMP fields.
A third option would be for the metadata to be published separately as photo.xml, containing an rdf:subject property pointing to photo.jpg. Some mechanism would be required to ensure that a search engine could find photo.xml.
Of course, all three options are long-range aspirations at the moment. Still, if we tag our assets with embedded metadata now, then as content management systems proliferate we can hope that a future approach to publishing asset metadata can be added on without recataloguing our assets.
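To make the first two options concrete, here is a minimal sketch in Python of what such a filter might do. It is not rdfpic's implementation. It assumes the metadata is embedded as an XMP packet whose RDF block uses the usual rdf: prefix, it takes application/rdf+xml (the registered media type for RDF/XML) as the type a crawler would ask for, and it finds the packet with a naive byte scan rather than a real XMP parser. The function names extract_rdf and negotiate are made up for illustration.

```python
import re

# XMP is stored as plain XML text inside the asset, so a naive byte scan
# can find it. Assumption: the RDF block uses the conventional "rdf:" prefix.
RDF_PATTERN = re.compile(rb"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL)

# Assumption: crawlers ask for RDF using the registered RDF/XML media type.
RDF_TYPE = "application/rdf+xml"


def extract_rdf(asset: bytes):
    """Return the first embedded RDF block as bytes, or None if there is none."""
    match = RDF_PATTERN.search(asset)
    return match.group(0) if match else None


def negotiate(accept_header: str, asset: bytes, asset_type: str = "image/jpeg"):
    """Choose what to serve for one request.

    Returns (content_type, body). If the client asks for RDF and the asset
    carries embedded metadata, the metadata is served; otherwise the asset
    itself is served unchanged.
    """
    if RDF_TYPE in accept_header.lower():
        rdf = extract_rdf(asset)
        if rdf is not None:
            return RDF_TYPE, rdf
    return asset_type, asset
```

A web server filter would call negotiate with the request's Accept header and the bytes of photo.jpg. A search engine taking the second route could call extract_rdf directly on a downloaded asset. A production version would need proper Accept header parsing and a real XMP library, since large packets can be split across several segments of a JPEG file.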
Re: Embedding metadata in an image
The software HuriSearch runs on has been used in some interesting multimedia applications; just search their site.
Below is something I received from one of their engineers:
"In principle we can extract meta data (like EXIF, XMP, ...) from multimedia files how Damon describes. However it is not out-of-the-box but has to be done by Solution Customer Services.
There is some existing open source code in the web which we can use but it will still need some days to integrate that.
Furthermore we have solutions for searching directly in video and audio streams. For example you can search for words spoken by somebody in a video. You then can jump to exactly that position in the video."
If it can do that, it sounds pretty neat!
Rdfpic and XMP
Rdfpic is not completely dead yet. The latest version uses XMP. I'm using it for all my photos. The software has some rough edges (there is no installer yet, for example), and lack of time is keeping us from finishing it for the moment. (Maybe this northern summer…) But if you know how to compile Java, you can try the version from CVS (see instructions).