Alfresco Developer Guide


Metadata Extractors


Metadata extractors (see Chapter 4) inspect a piece of content when it is uploaded to the repository, extract data from it, and store that data in properties on the content's node.

Metadata extractors are defined in content-services-context.xml. The following table lists the out-of-the-box metadata extractors and shows, for each one, which fields are extracted and how they are mapped to node properties:

Bean ID: extracter.PDFBox
Class: org.alfresco.repo.content.metadata.PdfBoxMetadataExtracter
Property Map:
    author=cm:author
    title=cm:title
    subject=cm:description
    created=cm:created

Bean ID: extracter.Office
Class: org.alfresco.repo.content.metadata.OfficeMetadataExtracter
Property Map:
    author=cm:author
    title=cm:title
    subject=cm:description
    createDateTime=cm:created
    lastSaveDateTime=cm:modified

Bean ID: extracter.Mail
Class: org.alfresco.repo.content.metadata.MailMetadataExtracter
Property Map:
    sentDate=cm:sentdate
    originator=cm:originator, cm:author
    addressee=cm:addressee
    addressees=cm:addressees
    subjectLine=cm:subjectline, cm:description

Bean ID: extracter.Html
Class: org.alfresco.repo.content.metadata.HtmlMetadataExtracter
Property Map:
    author=cm:author
    title=cm:title
    description=cm:description

Bean ID: extracter.MP3
Class: org.alfresco.repo.content.metadata.MP3MetadataExtracter
Property Map:
    songTitle=music:songTitle, cm:title
    albumTitle=music:albumTitle
    artist=music:artist, cm:author
    description=cm:description
    comment=music:comment
    yearReleased=music:yearReleased
    trackNumber=music:trackNumber
    genre=music:genre
    composer=music:composer
    lyrics=music:lyrics

Bean ID: extracter.OpenDocument
Class: org.alfresco.repo.content.metadata.OpenDocumentMetadataExtracter
Property Map:
    creationDate=cm:created
    creator=cm:author
    date=
    description=
    generator=
    initialCreator=
    keyword=
    language=
    printDate=
    printedBy=
    subject=cm:description
    title=cm:title

Bean ID: extracter.OpenOffice
Class: org.alfresco.repo.content.metadata.OpenOfficeMetadataExtracter
Property Map:
    author=cm:author
    title=cm:title
    description=cm:description
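
Each property map entry in the table above has the form extractedField=targetProperty, and a single extracted field can feed more than one node property when the targets are separated by commas (for example, originator=cm:originator, cm:author). The following is a minimal Java sketch of how such entries could be read into a lookup structure. It uses only the standard java.util.Properties class. The class name PropertyMapSketch and its parse method are hypothetical names chosen for illustration and are not part of the Alfresco API. Alfresco's own extracter classes may handle their mappings differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the Alfresco API.
public class PropertyMapSketch {

    // Parses entries such as "originator=cm:originator, cm:author" into a map
    // from extracted field name to the list of target node properties.
    public static Map<String, List<String>> parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }

        Map<String, List<String>> mapping = new LinkedHashMap<>();
        // Iterate over sorted keys so the resulting order is predictable.
        for (String field : new TreeSet<>(props.stringPropertyNames())) {
            List<String> targets = new ArrayList<>();
            for (String target : props.getProperty(field).split(",")) {
                String trimmed = target.trim();
                if (!trimmed.isEmpty()) {
                    targets.add(trimmed);
                }
            }
            // An entry with nothing after "=" ends up with an empty target list.
            mapping.put(field, targets);
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Entries copied from the extracter.Mail row of the table.
        String mailMap = String.join("\n",
                "sentDate=cm:sentdate",
                "originator=cm:originator, cm:author",
                "subjectLine=cm:subjectline, cm:description");
        parse(mailMap).forEach((field, targets) ->
                System.out.println(field + " -> " + targets));
    }
}
```

In this sketch, an entry with an empty right-hand side, such as date= in the extracter.OpenDocument row, simply produces an empty list of targets. That is a choice made for the example, not a statement about how Alfresco treats such entries.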