TEI P5: Guidelines for Electronic Text Encoding and Interchange: 3 Elements Available in All TEI Documents

3 Elements Available in All TEI Documents

This chapter describes elements which may appear in any kind of text and the tags used to mark them in all TEI documents. Most of these elements are freely floating phrases, which can appear at any point within the textual structure, although they must generally be contained by a higher-level element of some kind (such as a paragraph). A few of the elements described in this chapter (for example, bibliographic citations and lists) have a comparatively well-defined internal structure, but most of them have no consistent inner structure of their own. In the general case, they contain only a few words, and are often identifiable in a conventionally printed text by the use of typographic conventions such as shifts of font, use of quotation or other punctuation marks, or other changes in layout.

This chapter begins by describing the p tag used to mark paragraphs, the prototypical formal unit for running text in many TEI modules. This is followed, in section 3.2 Treatment of Punctuation, by a discussion of some specific problems associated with the interpretation of conventional punctuation, and the methods proposed by the Guidelines for resolving ambiguities therein.

The next section (section 3.3 Highlighting and Quotation) describes a number of phrase-level elements commonly marked by typographic features (and thus well-represented in conventional markup languages). These include features commonly marked by font shifts (section 3.3.2 Emphasis, Foreign Words, and Unusual Language) and features commonly marked by quotation marks (section 3.3.3 Quotation) as well as such features as terms, cited words, and glosses (section 3.3.4 Terms, Glosses, Equivalents, and Descriptions).

Section 3.4 Simple Editorial Changes introduces some phrase-level elements which may be used to record simple editorial interventions, such as emendation or correction of the encoded text. The elements described here constitute a simple subset of the full mechanisms for encoding such information (described in full in chapter 11 Representation of Primary Sources), which should be adequate to most commonly encountered situations.

The next section (section 3.5 Names, Numbers, Dates, Abbreviations, and Addresses) describes several phrase-level and inter-level elements which, although often of interest for analysis or processing, are rarely explicitly identified in conventional printing. These include names (section 3.5.1 Referring Strings), numbers and measures (section 3.5.3 Numbers and Measures), dates and times (section 3.5.4 Dates and Times), abbreviations (section 3.5.5 Abbreviations and Their Expansions), and addresses (section 3.5.2 Addresses).
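As a rough illustration of what explicit identification of such features can look like, the following sketch tags an invented sentence. The sentence, the attribute values, and the address are all assumptions made up for this example; only the element and attribute names come from the module described in this chapter.

```xml
<!-- Invented sentence; all names, values, and the address are illustrative only -->
<p><name type="person">Mr. Smith</name> ordered
  <measure quantity="2" unit="kg" commodity="flour">two kilos of flour</measure>
  and <num type="cardinal" value="12">twelve</num> eggs on
  <date when="2001-01-15">15 January 2001</date>, asking the
  <abbr>Co.</abbr> to deliver them to
  <address>
    <addrLine>1 Example Street</addrLine>
    <addrLine>Exampletown</addrLine>
  </address>.</p>
```

None of these features would normally be distinguished typographically in print, yet each is now available to a processor as a name, a measure, a number, a date, an abbreviation, or an address.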

In the same way, the following section (section 3.6 Simple Links and Cross-References) presents only a subset of the facilities available for the encoding of cross-references or text-linkage. The full story may be found in chapter 16 Linking, Segmentation, and Alignment; the tags presented here are intended to be usable for a wide variety of simple applications.

Sections 3.7 Lists, and 3.8 Notes, Annotation, and Indexing, describe two kinds of quasi-structural elements: lists and notes. These may appear either within chunk-level elements such as paragraphs, or between them. Lists of several kinds, and of arbitrary complexity, are catered for. The section on notes discusses both notes found in the source and simple mechanisms for adding annotations of an interpretive nature during the encoding; again, only a subset of the facilities described in full elsewhere (specifically, in chapter 17 Simple Analytic Mechanisms) is discussed.

Section 3.9 Graphics and other non-textual components introduces some simple ways of representing graphic or other non-textual content found in a text. A fuller discussion of the multimedia facilities supported by these Guidelines may be found in chapters 14 Tables, Formulæ, and Graphics and 16 Linking, Segmentation, and Alignment.

Next, section 3.10 Reference Systems, describes methods of encoding within a text the conventional system or systems used when making references to the text. Some reference systems have attained canonical authority and must be recorded to make the text useable in normal work; in other cases, a convenient reference system must be created by the creator or analyst of an electronic text.

Like lists and notes, the bibliographic citations discussed in section 3.11 Bibliographic Citations and References, may be regarded as structural elements in their own right. A range of possibilities is presented for the encoding of bibliographic citations or references, which may be treated as simple phrases within a running text, or as highly-structured components suitable for inclusion in a bibliographic database.

Additional elements for the encoding of passages of verse or drama (whether prose or verse) are discussed in section 3.12 Passages of Verse or Drama.

The chapter concludes with a technical overview of the structure and organization of the module described here. This should be read in conjunction with chapter 1 The TEI Infrastructure, describing the structure of the TEI document type definition.

3.1 Paragraphs

The paragraph is the fundamental organizational unit for all prose texts, being the smallest regular unit into which prose can be divided. Prose can appear in all TEI texts, even those that are primarily of another genre (e.g., verse); thus the paragraph is described here, as an element which can appear in any kind of text.

Paragraphs can contain any of the other elements described within this chapter, as well as some other elements which are specific to individual text types. We distinguish phrase-level elements, which must be entirely contained within a paragraph and cannot appear except within one, from chunks, which can appear between, but not within, paragraphs, and from inter-level elements, which can appear either within a single paragraph or between paragraphs. The class of phrases includes emphasized or quoted phrases, names, dates, etc. The class of inter-level elements includes bibliographic citations, notes, lists, etc. The class of chunks includes the paragraph itself, and other elements which have similar structural properties, notably the ab (anonymous block) element (described in 16.3 Blocks, Segments, and Anchors), which may be used as an alternative to the paragraph in some kinds of texts.
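The three classes can be sketched with a small invented fragment. The wording and attribute values below are assumptions for illustration only: date, name, and emph stand in for phrase-level elements, list for an inter-level element, and p for a chunk.

```xml
<body>
  <!-- p is a chunk: it stands between other chunks, never inside another p -->
  <p>On <date when="2001-01-15">15 January 2001</date>,
    <name>Ada</name> was <emph>not</emph> persuaded, for two reasons:
    <!-- list is inter-level: here it sits inside a paragraph -->
    <list>
      <item>the evidence was thin;</item>
      <item>the claim was premature.</item>
    </list>
  </p>
  <!-- the same inter-level element may also stand between paragraphs -->
  <list>
    <item>First point</item>
    <item>Second point</item>
  </list>
  <p>A second paragraph follows the list.</p>
</body>
```

Moving the date, name, or emph elements out of the paragraph so that they sat directly inside body would not be permitted, whereas the list is acceptable in either position.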

Because paragraphs may appear in different base or additional tag sets, their possible contents may differ in different kinds of documents. In particular, additional elements not listed in this chapter may appear in paragraphs in certain kinds of text. However, the elements described in this chapter are always by default available in all kinds of text.

The paragraph is marked using the p element:

p (paragraph) marks paragraphs in prose.

If a consistent internal subdivision of paragraphs is desired, the seg (‘segment’) or s elements may be used, as discussed in chapters 16 Linking, Segmentation, and Alignment and 17 Simple Analytic Mechanisms respectively. More usually, however, paragraphs have no firm internal structure, but contain prose encoded as a mix of characters, entity references, phrases marked as described in the rest of this chapter, and embedded elements like lists, figures, or tables.
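A minimal sketch of such a subdivision, reusing the sentence from the single-paragraph example in this section and splitting it at the semicolon. The type and n values are illustrative choices rather than prescribed ones, and seg is available only when the linking module described in chapter 16 is included in the schema.

```xml
<p>
  <!-- each seg marks one clause; the division point is an encoder's decision -->
  <seg type="clause" n="1">I fully appreciate Gen. Pope's splendid achievements
    with their invaluable results;</seg>
  <seg type="clause" n="2">but you must know that Major Generalships in the
    Regular Army, are not as plenty as blackberries.</seg>
</p>
```

The s element could be used in a similar way where the units of interest are sentence-like rather than arbitrary segments.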

Since paragraphs are usually explicitly marked in Western texts, typically by indentation, the application of the p tag usually presents few problems.

In some cases, the body of a text may comprise but a single paragraph:

<body>
<p>I fully appreciate Gen. Pope's splendid achievements with their
invaluable results; but you must know that Major Generalships in the
Regular Army, are not as plenty as blackberries.</p>
</body>

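direct	indicates whether the quoted matter is regarded as direct or indirect speech.
aloud	indicates whether the quoted matter is regarded as having been vocalized or signed.

uri	(uniform resource identifier) identifies the concept that the parent element represents by means of an external identifier.
filter	references an external script that transforms instances of this element into canonical XML data.
name	names the concept that the parent element represents.

cert	(certainty) indicates the degree of certainty associated with the interpretation or intervention.
resp	(responsible party) indicates who is responsible for the interpretation or intervention, for example an editor or translator.
evidence	indicates the nature of the evidence supporting the reliability or accuracy of the interpretation or intervention.

type	indicates the classification of the element.
subtype	provides a sub-classification of the element, if needed.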

key	provides an externally-defined means of identifying the entity (or entities) being named, using a coded value of some kind.
ref	(reference) provides an explicit means of locating a full definition for the entity being named by means of one or more URIs.

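type	indicates the type of numeric value.
value	supplies the value of the number in standard form.

quantity	specifies the number of units of measurement.
unit	indicates the unit of measurement, usually by means of its standard symbol.
commodity	indicates what is being measured.

target	specifies the destination of the pointer by supplying one or more URIs.
cRef	(canonical reference) specifies the destination of the pointer by supplying a canonical reference from a scheme defined in a refsDecl element in the TEI header.

target	specifies the destination of the reference by supplying one or more URIs.
cRef	(canonical reference) specifies the destination of the reference by supplying a canonical reference from a scheme defined in a refsDecl element in the TEI header.

type	indicates the kind of pointer.
evaluate	specifies the intended meaning when the target of the pointer is itself a pointer.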

P5: TEI Guidelines

3 Elements Available in All TEI Documents

3.1 Paragraphs

3.2 Treatment of Punctuation

3.3 Highlighting and Quotation

3.3.1 What Is Highlighting?

3.3.2 Emphasis, Foreign Words, and Unusual Language

3.3.2.1 Foreign Words or Expressions

3.3.2.2 Emphatic Words and Phrases

3.3.2.3 Other Linguistically Distinct Material

3.3.3 Quotation

3.3.4 Terms, Glosses, Equivalents, and Descriptions

3.3.5 Some Further Examples

3.4 Simple Editorial Changes

3.4.1 Apparent Errors

3.4.2 Regularization and Normalization

3.4.3 Additions, Deletions, and Omissions

3.5 Names, Numbers, Dates, Abbreviations, and Addresses

3.5.1 Referring Strings

3.5.2 Addresses

3.5.3 Numbers and Measures

3.5.4 Dates and Times

3.5.5 Abbreviations and Their Expansions

3.6 Simple Links and Cross-References

3.7 Lists

3.8 Notes, Annotation, and Indexing

3.8.1 Notes and Simple Annotation

3.8.2 Index Entries

3.8.2.1 Pre-existing indexes

3.8.2.2 Auto-generated indexes

3.9 Graphics and other non-textual components

3.10 Reference Systems

3.10.1 Using the xml:id and n Attributes

3.10.2 Creating New Reference Systems

3.10.3 Milestone Elements

3.10.4 Declaring Reference Systems

3.11 Bibliographic Citations and References

3.11.1 Elements of Bibliographic References

3.11.2 Components of Bibliographic References

3.11.2.1 Analytic, Monographic, and Series Levels

3.11.2.2 Authors, Titles, and Editors

3.11.2.3 Imprint, Pagination, and Other Details

3.11.2.4 Series Information

3.11.2.5 Related items

3.11.2.6 Notes and Other Additional Information

3.11.2.7 Order of Components within References

3.11.3 Bibliographic Pointers

3.11.4 Relationship to Other Bibliographic Schemes

3.12 Passages of Verse or Drama

3.12.1 Core Tags for Verse

3.12.2 Core Tags for Drama

3.13 Overview of the Core Module