Posted on 2014-Apr-01
Why eBooks Need Metadata
eBook metadata is information that describes the book, such as its title, author, and blurb. Proprietary software platforms can store this data in a database and use it to organize catalogs of books or to enhance discoverability by vendors, libraries, and third-party platforms. One convenient feature of the EPUB specification is that it allows the metadata to be embedded within the eBook itself rather than relying on an external database. The EPUB standard is open and publicly documented, so software developers can readily implement retrieval of the metadata. When an EPUB eBook is converted to MOBI for the Kindle platform, the metadata is ostensibly carried over.
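As a rough illustration of how little code such retrieval takes, the Python sketch below opens an EPUB as a ZIP archive, follows META-INF/container.xml to the package (OPF) file, and collects the Dublin Core elements it finds. The function names and the in-memory demo archive are assumptions made for this example, and the demo archive is deliberately not a complete EPUB.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
DC = "{http://purl.org/dc/elements/1.1/}"


def read_epub_metadata(epub_file):
    """Return (name, text) pairs for every Dublin Core element in an EPUB."""
    with zipfile.ZipFile(epub_file) as archive:
        # META-INF/container.xml names the package (OPF) document.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        package = ET.fromstring(archive.read(rootfile.get("full-path")))
    return [
        (element.tag[len(DC):], (element.text or "").strip())
        for element in package.iter()
        if element.tag.startswith(DC)
    ]


def build_demo_archive():
    """Hypothetical stand-in for a real file: just enough structure to parse.

    This is NOT a valid EPUB (no mimetype entry, manifest, or spine).
    """
    container_xml = (
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles>'
        "</container>"
    )
    package_xml = (
        '<package xmlns="http://www.idpf.org/2007/opf" version="2.0" '
        'unique-identifier="BookId">'
        '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:title>The Wizard of Oz</dc:title>"
        "<dc:creator>L. Frank Baum</dc:creator>"
        "<dc:language>en</dc:language>"
        # Placeholder identifier, not a real book's UUID.
        '<dc:identifier id="BookId">'
        "urn:uuid:00000000-0000-4000-8000-000000000000"
        "</dc:identifier>"
        "</metadata></package>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("META-INF/container.xml", container_xml)
        archive.writestr("OEBPS/content.opf", package_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, value in read_epub_metadata(build_demo_archive()):
        print(f"{name}: {value}")
```

With a real book, passing the path of the .epub file to read_epub_metadata in place of the demo archive should work the same way, provided the file is not encrypted.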
Unfortunately, it is unclear how the various platforms use the metadata that is embedded in the eBooks, and most vendors and eReading software makers do not document how the metadata is accessed. Therefore, when publishing at the different eBook vendors, the metadata that gets entered separately during the upload process is usually more important than anything that is embedded in the eBook package.
Since it is difficult to predict how eReading software will evolve in the future, it is considered a best practice to properly embed accurate metadata so that future reading systems can utilize it appropriately. Some reading systems already do utilize portions of the metadata and some examples are highlighted below.
The Different Bits of eBook Metadata
The EPUB specification developed by the International Digital Publishing Forum (IDPF) allows the embedding of metadata using XML. If you do not know what XML is, please don’t be afraid. It is simply a way to encode data that both humans and computers can easily recognize. Below is an example of some metadata in an EPUB2 and EPUB3 type eBook for The Wizard of Oz:
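For readers who prefer to see the markup as text, here is a minimal sketch of what such metadata can look like in the two versions, held in Python strings and parsed with the standard library. The values are illustrative: the UUID is a placeholder, the modification date is made up, and the variable and function names are assumptions for this example. The visible difference is that EPUB2 attaches details such as the author's role as opf: attributes, while EPUB3 uses separate meta elements that refine another element.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Illustrative EPUB2-style metadata (placeholder UUID).
EPUB2_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>The Wizard of Oz</dc:title>
    <dc:creator opf:role="aut" opf:file-as="Baum, L. Frank">L. Frank Baum</dc:creator>
    <dc:language>en</dc:language>
    <dc:identifier id="BookId" opf:scheme="UUID">urn:uuid:00000000-0000-4000-8000-000000000000</dc:identifier>
  </metadata>
</package>
"""

# Illustrative EPUB3-style metadata (placeholder UUID and made-up modified date).
EPUB3_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>The Wizard of Oz</dc:title>
    <dc:creator id="creator">L. Frank Baum</dc:creator>
    <meta refines="#creator" property="role" scheme="marc:relators">aut</meta>
    <dc:language>en</dc:language>
    <dc:identifier id="BookId">urn:uuid:00000000-0000-4000-8000-000000000000</dc:identifier>
    <meta property="dcterms:modified">2014-04-01T00:00:00Z</meta>
  </metadata>
</package>
"""


def dublin_core(opf_xml):
    """Map each Dublin Core element name to its text (last one wins)."""
    package = ET.fromstring(opf_xml)
    return {
        element.tag[len(DC):]: (element.text or "").strip()
        for element in package.iter()
        if element.tag.startswith(DC)
    }


if __name__ == "__main__":
    for label, opf in (("EPUB2", EPUB2_OPF), ("EPUB3", EPUB3_OPF)):
        print(label, dublin_core(opf))
```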
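The following metadata should be present in every EPUB2, EPUB3, and MOBI/KF8 eBook (the EPUB specifications strictly mandate only the title, language, and unique identifier, but the author should always be supplied as well):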
- Title: the book’s title
- Language: the language that the book is in
- Author: the author’s name or multiple authors’ names if a group project/anthology
- Unique Identifier: this can be a Universally Unique Identifier (UUID) or an ISBN; more details on uniquely identifying an eBook here
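A check for the four fields listed above is easy to automate before uploading a book. The following is a minimal sketch, assuming the package (OPF) file is already available as a string; the function name, the label mapping, and the sample draft are hypothetical, and an empty element is treated the same as a missing one.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Dublin Core element name -> label used in the list above.
REQUIRED = {
    "title": "Title",
    "language": "Language",
    "creator": "Author",
    "identifier": "Unique Identifier",
}


def missing_required(opf_xml):
    """Return the labels of required fields that are absent or empty."""
    package = ET.fromstring(opf_xml)
    missing = []
    for tag, label in REQUIRED.items():
        values = [(e.text or "").strip() for e in package.iter(DC + tag)]
        if not any(values):
            missing.append(label)
    return missing


# Hypothetical draft: no language element and an empty author element.
DRAFT_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>The Wizard of Oz</dc:title>
    <dc:creator></dc:creator>
    <dc:identifier id="BookId">urn:uuid:00000000-0000-4000-8000-000000000000</dc:identifier>
  </metadata>
</package>
"""

if __name__ == "__main__":
    print(missing_required(DRAFT_OPF))
```

For the draft above, the function reports the Language and Author fields as missing.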
The following metadata is optional, but strongly encouraged to be used in all eBook editions to accurately describe the book:
- Publication Date: the date that the eBook was created
- Publisher: the name of the publisher or the author’s name if independently published
- Contributors: this includes the editor, cover artist, book designer, and many more; a complete list of the contributor roles that can be used is provided by the Library of Congress
- Rights: a copyright notice; more on copyright here
- Description: a blurb about the book; if no blurb is available, the description can be something like The Wizard of Oz: A Novel
- Keywords: these can be the same as what you type in at the eBook vendors (seven keywords is a good rule of thumb); you can also pick and choose from the predefined BISAC Subject Headings to make things easy (e.g. Fiction/Romance/Erotica)
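To make the optional fields concrete, here is a minimal sketch that builds them as EPUB2-style Dublin Core elements with Python's standard library. Several things are assumptions for illustration: the function name, the sample values, writing each keyword as its own dc:subject element (a common convention), and marking a contributor's role with an opf:role attribute holding a MARC relator code such as "edt" for an editor.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OPF_NS = "http://www.idpf.org/2007/opf"

# Ask ElementTree to serialize with the customary dc: and opf: prefixes.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("opf", OPF_NS)


def build_optional_metadata(date, publisher, rights, description,
                            keywords, contributors):
    """Return a <metadata> element holding the optional fields.

    contributors maps a display name to a MARC relator code (e.g. "edt").
    """
    metadata = ET.Element(f"{{{OPF_NS}}}metadata")

    def add(tag, text, attributes=None):
        element = ET.SubElement(metadata, f"{{{DC_NS}}}{tag}", attributes or {})
        element.text = text
        return element

    add("date", date)
    add("publisher", publisher)
    add("rights", rights)
    add("description", description)
    for keyword in keywords:
        add("subject", keyword)  # one element per keyword
    for name, role in contributors.items():
        add("contributor", name, {f"{{{OPF_NS}}}role": role})
    return metadata


if __name__ == "__main__":
    element = build_optional_metadata(
        date="2014-04-01",
        publisher="Example Press",           # hypothetical publisher
        rights="Public domain",
        description="The Wizard of Oz: A Novel",
        keywords=["Fiction", "Fantasy"],     # hypothetical keywords
        contributors={"Jane Doe": "edt"},    # hypothetical editor
    )
    print(ET.tostring(element, encoding="unicode"))
```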
Additionally, EPUB3 allows extra metadata that describes the book more accurately when it is part of a series or is a revision. The additional metadata is as follows:
- Short Title
- Extended Title
- Edition Number
- Series that the book is a part of
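In EPUB3 these extra details are typically expressed as several dc:title elements, each labeled by a meta element that refines it. The sketch below shows one way to resolve those labels in Python. The sample titles are invented, and the title-type values used here (main, short, collection, edition) are an assumption about the vocabulary; check the revision of the specification you are targeting for the exact values it defines, including the one for an extended title.

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical EPUB3 title metadata; series and edition names are invented.
EPUB3_TITLES = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title id="t-main">The Wizard of Oz</dc:title>
    <meta refines="#t-main" property="title-type">main</meta>
    <dc:title id="t-short">Oz</dc:title>
    <meta refines="#t-short" property="title-type">short</meta>
    <dc:title id="t-series">Example Series</dc:title>
    <meta refines="#t-series" property="title-type">collection</meta>
    <dc:title id="t-edition">Second Edition</dc:title>
    <meta refines="#t-edition" property="title-type">edition</meta>
  </metadata>
</package>
"""


def titles_by_type(opf_xml):
    """Map each title-type label to the text of the title it refines."""
    package = ET.fromstring(opf_xml)

    # First pass: remember which element id each title-type meta points at.
    type_of = {}
    for meta in package.iter(OPF + "meta"):
        target = meta.get("refines")
        if meta.get("property") == "title-type" and target:
            type_of[target.lstrip("#")] = (meta.text or "").strip()

    # Second pass: label every title, treating unlabeled ones as "main".
    result = {}
    for title in package.iter(DC + "title"):
        label = type_of.get(title.get("id"), "main")
        result[label] = (title.text or "").strip()
    return result


if __name__ == "__main__":
    for label, text in titles_by_type(EPUB3_TITLES).items():
        print(f"{label}: {text}")
```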
It is important to be as accurate as possible in metadata, since a pesky typo in the author’s name may go unnoticed in some reading systems (e.g. Adobe Digital Editions), but show up at the top of every left-hand page in other reading systems (e.g. iBooks).
Example of Metadata in Action
While most reading systems ignore much of the metadata embedded in the eBook, some reading systems use portions of it (most notably the title and author metadata). Below are a few examples from different reading systems:
The Kindle devices place the title metadata embedded in the eBook on the top of every page like a running header.
iBooks places the author metadata on left-hand pages and the eBook’s title metadata on right-hand pages, in the fashion typically seen for running headers in novels.
Readium allows access to the publication date, identifier, and publisher metadata when the user peruses the “Details” in their eBook library.
Calibre probably makes the best use of metadata and allows access to both the description and keyword metadata when the user browses their eBook library.
Since there are so many reading systems on the market and the metadata is part of the open EPUB standard, it is considered a best practice to always embed as much accurate metadata as possible. At BB eBooks, we’re happy to handle this for our clients as part of our conversion services.
Label: Technical and Design