Meaningful Bibliographic Metadata (M2B) -- Recommendations of a set of metadata properties and encoding vocabularies BETA VERSION

M2B is intended to assist content providers in selecting appropriate metadata properties for the creation, management and exchange of meaningful bibliographic information in open repositories. Its objectives include:

  • To provide a set of common metadata properties;
  • To encourage the use of authority data, controlled vocabularies, and syntax encoding standards;
  • To recommend the use of URIs as names for things [1], especially for data values, when they are available.

Conceptual model

To give an overall picture and a common understanding of the entities and relationships involved in bibliographic descriptions, M2B has established a general conceptual model [2] (Figure 1) that provides a high level of abstraction focusing on the bibliographic resource entity. Major relations can be identified between a resource instance (e.g., an article or a report) and the agent(s) (e.g., a personal author or a research team) responsible for the creation of the content and the dissemination of the resource, as well as the thema(s) (i.e., the things that are the subjects or topics of an article). As a result, three core entities are presented in the model: resource, agent, and thema. The model presented in Figure 2 is based on the general concept model and gives examples of possible relationships between and among the instances of different entities.

The models convey the following meanings (entity names are presented in italics):

  • Basic entities and their relationships. The resource entity is the centre of every description here. The model does not enumerate sub-entities; for example, the sub-entities of resource would be the various resource types. Relationships are established between the resource entity and two other major entities: agent and thema.
  • Relationships between instances within the same entity. Relationships between instances of an entity also exist. For example, a resource may be related to another resource. An agent may be related to another agent. Such relationships are demonstrated in the model. 
  • Relationships between instances of different entities. Relationships between any pair of instances vary and can be found at different levels. The sample relationships illustrated in Figure 2 are demonstrative and may apply at different levels of the bibliographic resource entity. For example, an agent may provide the funding for the creation of an original work, for the translation of a work, or for the production of a new format of a translation.
  • Control of values. Authority control is considered an important element of the model. The agents, regardless of their roles in relation to a resource, should be managed through name authority files. Concepts, topics, and geographic places that are the themas of a resource should be controlled with value vocabularies. Given the context of this report, the model does not emphasize authority control of the titles of bibliographic resources, but controlling resource uniform titles is also a logical step.

    More and more name authority files, controlled vocabularies, and resource datasets are becoming available as Linked Open Data (LOD). The model intentionally sets an extracted piece of the LOD cloud as the background for each entity, as a reminder of this reality.

The conceptual model holds the key for sharing the common understanding of the important entities and relationships for bibliographic data.  It can be used with different data models that have different implementation approaches.
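
As an illustration only, the three core entities and their relationships could be sketched in Python as follows. The class names, field names, role strings, and example.org URIs are assumptions made for this sketch; they are not part of the M2B recommendations, and other data models could implement the same conceptual model differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    """A person, team, or organization responsible for a resource."""
    name: str
    authority_uri: Optional[str] = None  # URI from a name authority file, if available


@dataclass
class Thema:
    """A concept, topic, or place that a resource is about."""
    label: str
    vocabulary_uri: Optional[str] = None  # URI from a value vocabulary, if available


@dataclass
class Relation:
    """A typed link from a resource to an agent, a thema, or another resource."""
    role: str       # hypothetical role names, e.g. "creator", "subject", "translation of"
    target: object  # an Agent, Thema, or Resource instance


@dataclass
class Resource:
    """The bibliographic resource at the centre of every description."""
    title: str
    relations: List[Relation] = field(default_factory=list)

    def related(self, role: str) -> list:
        """Return the targets linked to this resource under the given role."""
        return [r.target for r in self.relations if r.role == role]


# Hypothetical instances; the example.org URIs are placeholders, not real identifiers.
author = Agent("Example Research Team", "http://example.org/agents/42")
topic = Thema("soil erosion", "http://example.org/concepts/soil-erosion")

original = Resource("An example report")
original.relations.append(Relation("creator", author))
original.relations.append(Relation("subject", topic))

# A resource-to-resource relationship: a translation of the original report.
translation = Resource("Un rapport d'exemple")
translation.relations.append(Relation("translation of", original))
```

Keeping the URI next to the human-readable name or label reflects the recommendation to use authority data and URIs as names for things when they are available, while still allowing uncontrolled values.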




Figure 1. The LODE-BD general concept model

Figure 2. The implication of the general concept model in the LODE-BD v.1.1. case

 

Groups of Common Properties


Common properties for describing bibliographic resources are identified and organized into nine groups based on our comprehensive studies of several open repositories. About two dozen properties used for describing a bibliographic resource are included in Groups 1 to 8. Two sets of properties for describing relations between bibliographic resources or between agents are included in Group 9. In the following list of the groups, some selected properties are emphasized in italic format. In the report, the word resource is used to represent bibliographic resource, the primary resource type to be described.

  • 1. Title Information. Title is one of the most important and relevant access points for any resource. The information is usually supplied through a number of properties, including title and alternative title (which handles parallel titles, translated titles, transliterated titles, etc.).
  • 2. Responsible Body. This group contains the properties associated with any agent who is responsible for the creation and/or publication of the content of the resource, for example, the creator, contributor, and publisher or issuer of a resource.
  • 3. Physical Characteristics. Properties that describe the appearance and the characteristics of the physical form of a resource are placed into this group. They are: date, identifier, language, format, and edition/version.
  • 4. Location (physical location). It is considered important for a resource to be located and obtained in the information exchange.  Properties that record the location and availability information are taken into account in this unique group.
  • 5. Subject. In contrast to the physical characteristics, the Subject group embraces the properties that describe or otherwise help the discovery of what the resource is about or denotes, in the form of subject term, classification/category, freely assigned keyword, and geographic term.
  • 6. Description of content. Two major types of descriptions that focus on the content of the resource rather than the physical object are considered in this group: a) any representative description of the content, usually in the form of abstract, summary, note, and table of contents; and b) the type or genre of the resource.
  • 7. Intellectual property. Any property that deals with an aspect of intellectual property rights relating to access and use of a resource is included in this group, with special regard to rights, terms of use, and access condition.
  • 8. Usage. Properties that are related to the use of a resource, rather than the characteristics of the resource itself, are considered to belong to this group. Typical properties are: audience, literary indication, and education level.
  • 9. Relation. Unlike the other groups, which focus on describing the resource itself, this group describes the relations between two resources or between two agents. Because there are so many such properties, no specific properties are listed under the Relation group in the following table.

These groups of information are listed together in Table 1, with the specific properties included in each group. Special attention should also be given to the additional recommendations on cardinality, value control, and important attributes.  Table 1 comprises the following components in corresponding columns:

  • A. Groups of properties
  • B. Properties included in each group. Two special styles are used to signify the importance of the properties: two plus signs “++” (also in red colour) for the mandatory property; one plus sign “+” (also in blue colour) for the highly recommended property in the context of bibliographic information exchange. The rest are recommended or optional.
  • C. Requirements of properties in the context of both non-analytical and analytical bibliographic descriptions, specified with (M)andatory; (H)ighly-(R)ecommended; (R)ecommended; and (O)ptional marked for either process.
  • D. Recommendation on the control of values, indicating: (n)ot controlled; should use a name authority or a controlled vocabulary; or should follow a syntax encoding rule.
  • E. Some important attributes associated with individual properties, with special regard to the language and scheme attributes. A scheme can be either a value encoding scheme or a syntax encoding scheme.

Table 1. Groups of Common Properties

Requirement values: M = mandatory; HR = highly recommended; R = recommended; O = optional.

| A. Group | B. Property | C. Requirement (non-analytical) | C. Requirement (analytical) | D. Value control | E. Important attributes |
|---|---|---|---|---|---|
| 1. Title Information | title++ | M | M | n | language |
| | alternative title | O | O | n | |
| 2. Responsible Body | creator+ | HR | HR | n or Name authority (personal, corporate body, conference) | scheme |
| | contributor | O | O | n or Name authority | |
| | publisher/issuer+ | HR | R | n or Name authority | |
| 3. Physical Characteristics | date++ | M | M | Syntax encoding rule | scheme |
| | identifier+ | HR | HR | Syntax encoding rule | scheme |
| | language++ | M | M | Controlled list | scheme |
| | format/medium+ | HR | HR | Controlled list | scheme |
| | edition/version | R | R | n | |
| | source+ | HR | R | n | |
| 4. Location | location++ | M | M | n or Rule [Holding unit names may be managed through a controlled list] | |
| 5. Subject | subject term+ | HR | HR | Controlled vocabulary | language, scheme |
| | classification | O | O | Controlled vocabulary, Classification system | scheme |
| | [freely assigned] keyword | R | R | n | language |
| | geographic term | O | O | Controlled vocabulary | language, scheme |
| 6. Description of content | description/abstract (or note/summary/table of contents) | R | R | n | language |
| | type/form/genre | R | R | Controlled vocabulary | language, scheme |
| 7. Intellectual property | rights+, term of use, access condition | R | R | n [Rights holders may be managed through name authorities] | |
| 8. Usage | audience | O | O | Controlled list | scheme |
| | literary indication | O | O | Controlled list | scheme |
| | education level | O | O | Controlled list | scheme |
| 9. Relation | [relation between resources]+ | O | HR | Controlled resource IDs | |
| | [relation between agents] | O | O | n or Name authority | |
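
The requirement levels in Table 1 lend themselves to a simple completeness check. The following is a minimal sketch, assuming a record is a plain Python dictionary keyed by property name. The function name `check_record`, the dictionary layout, and the sample values are hypothetical and not part of the recommendations. Only properties that are mandatory or highly recommended in at least one column are transcribed.

```python
# Requirement levels transcribed from Table 1 as (non-analytical, analytical).
# Only properties rated M or HR in at least one column are listed.
REQUIREMENTS = {
    "title": ("M", "M"),
    "date": ("M", "M"),
    "language": ("M", "M"),
    "location": ("M", "M"),
    "creator": ("HR", "HR"),
    "identifier": ("HR", "HR"),
    "format/medium": ("HR", "HR"),
    "subject term": ("HR", "HR"),
    "publisher/issuer": ("HR", "R"),
    "source": ("HR", "R"),
    "relation between resources": ("O", "HR"),
}


def check_record(record, analytical=False):
    """Return (missing mandatory, missing highly recommended) property names."""
    column = 1 if analytical else 0
    missing = {"M": [], "HR": []}
    for prop, levels in REQUIREMENTS.items():
        level = levels[column]
        # Properties rated R or O in this column are not checked.
        if level in missing and not record.get(prop):
            missing[level].append(prop)
    return missing["M"], missing["HR"]


# Hypothetical record; the values are illustrative only.
record = {
    "title": "An example report",
    "date": "2011-05-01",
    "language": "en",
    "creator": "Example Research Team",
}
errors, warnings = check_record(record)
print("missing mandatory:", errors)
print("missing highly recommended:", warnings)
```

Treating missing mandatory properties as errors and missing highly recommended properties as warnings is one possible policy; a repository could equally choose to reject or accept records on other grounds.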

 

 

[1] See Berners-Lee, T. (2006). Linked Data - Design Issues.

[2] The conceptual model is built on a FRBR-based model previously developed by the FAO AIMS team, substantially extended and reconsidered for the current recommendations.