14. Document interchange

14.2 Procedure sets

Change the first paragraph as follows:

This feature has been considered unnecessary since PDF 1.4 and was deprecated in PDF 2.0.

14.3 Metadata

14.3.2 Metadata streams

Change the paragraph below Table 347 as follows:

The contents of a metadata stream shall be the metadata represented in Extensible Markup Language (XML) and the grammar of the XML representing the metadata shall be defined according to the extensible metadata platform specification (ISO 16684-1). All XMP metadata in PDF shall be encoded as UTF-8.

...

14.3.3 Document information dictionary

Change the EXAMPLE below Table 349 as follows:

101 0 obj % document information dictionary
<< /CreationDate (D:20140314124211+01'00) /ModDate (D:20140924212303+02'00) >>
endobj
102 0 obj % document level metadata stream
<< /Type /Metadata /Subtype /XML /Length 103 0 R >>
stream
<?xpacket begin="… UTF-8 value of U+FEFF (efbbbf) …" id="W5M0MpCehiHzreSzNTczkc9d"?>
...
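The byte sequence named in the EXAMPLE can be checked with a short sketch. It assumes, as the EXAMPLE indicates, that the begin attribute of the xpacket header carries U+FEFF; the helper name xpacket_begin is hypothetical and only serves this illustration.

```python
def xpacket_begin(packet_id: str) -> bytes:
    """Build an xpacket header whose begin attribute holds U+FEFF, encoded as UTF-8.

    Hypothetical helper for illustration only; it is not part of any PDF or XMP library.
    """
    header = f'<?xpacket begin="\ufeff" id="{packet_id}"?>'
    return header.encode("utf-8")


# U+FEFF encoded as UTF-8 is the three-byte sequence EF BB BF.
bom_bytes = "\ufeff".encode("utf-8")
header = xpacket_begin("W5M0MpCehiHzreSzNTczkc9d")
print(bom_bytes.hex())
```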

14.4 File identifiers

Change the first paragraph as follows:

PDF file identifiers shall be defined by the ID entry in a PDF file’s trailer dictionary (see 7.5.5, "File trailer"). The value of this entry shall be an array of two byte strings. The first byte string shall be a permanent identifier based on the contents of the PDF file at the time it was originally created and shall not change when the PDF file is updated. The second byte string shall be a changing identifier based on the PDF file’s contents at the time it was last updated (see 7.5.6, "Incremental updates"). When a PDF file is first written, both identifiers shall be set to the same value. If the first identifier in the reference matches the first identifier in the referenced file’s ID entry, and the last identifier in the reference matches the last identifier in the referenced file’s ID entry, it is very likely that the correct and unchanged PDF file has been found. If only the first identifier matches, a different version of the correct PDF file has been found.
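The comparison described above can be sketched as follows. This is a minimal illustration, not a normative procedure: each ID array is modelled as a pair of byte strings, and the function and enumeration names are hypothetical. The paragraph does not say what to conclude when the first identifiers differ; treating that case as "no match" is an assumption of this sketch.

```python
from enum import Enum


class IdMatch(Enum):
    SAME_FILE = "very likely the correct and unchanged file"
    DIFFERENT_VERSION = "a different version of the correct file"
    NO_MATCH = "first identifiers differ (outcome assumed, not stated in the text)"


def compare_file_ids(reference_id, file_id):
    """Compare the ID pair stored in a reference with the ID pair of a candidate file.

    Both arguments are (first, last) pairs of byte strings, mirroring the
    two-element ID array in the trailer dictionary.
    """
    ref_first, ref_last = reference_id
    file_first, file_last = file_id
    if ref_first != file_first:
        return IdMatch.NO_MATCH
    if ref_last == file_last:
        return IdMatch.SAME_FILE
    return IdMatch.DIFFERENT_VERSION


# A file that was first written (both identifiers equal) and later updated.
original = (b"\x01\x02", b"\x01\x02")
updated = (b"\x01\x02", b"\x0a\x0b")
result = compare_file_ids(original, updated)
```

Here result is IdMatch.DIFFERENT_VERSION, because only the permanent (first) identifier matches.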

...

14.5 Page-piece dictionaries

Change the paragraph above Table 350 as follows:

As "Table 350 - Entries in a page-piece dictionary" shows, a page-piece dictionary may contain any number of entries. Each key should be a second-class name, the name of a distinct PDF processor, or the name of a well-known data type recognised by a family of PDF processors. The value associated with each key shall be a data dictionary containing the private data that shall be used by the PDF processor. The Private entry may have a value of any data type, but typically it is a dictionary containing all of the private data needed by the PDF processor other than the actual content of the document, page, or form.

Change Table 350 as follows:

Table 350 - Entries in a page-piece dictionary
Key Type Value
any valid second-class name (recommended), any conforming product name or well known data type dictionary A data dictionary (see "Table 351 - Entries in a data dictionary").

Insert new NOTE below Table 350 as follows:

NOTE: the definition of page-piece dictionary keys was updated to also support the same definition as in ISO 32000-1:2008 to allow easier document upgrades to PDF 2.0; however, second-class names are strongly recommended.

14.6 Marked content

14.6.1 General

EDITOR NOTE: notes in subclause 14.6.1 will be renumbered.

Delete NOTE 1 (original numbering) as follows:

NOTE 1 This is a sequence not simply of bytes in the content stream but of complete graphics objects. Each object is fully qualified by the parameters of the graphics state in which it is rendered.

...

Change the paragraph below NOTE 2 (original numbering) as follows:

All marked-content operators except EMC shall take a tag operand indicating the role or significance of the marked-content element to the PDF processor. For tags not defined in either ISO publications or Logical Structure (see 14.7, "Logical structure"), those tags should use second-class names (see Annex E, "Extending PDF") to avoid conflicts between different applications marking the same content stream. In addition to the tag operand, the DP and BDC operators shall specify a property list containing further information associated with the marked-content. Property lists are discussed further in 14.6.2, "Property lists".

Delete NOTE 3 (original numbering) as follows:

NOTE 3 The tag operand of marked-content operators have no relationship to Tagged PDF (see 14.8 "Tagged PDF") and thus is not role mapped.

...

Change the paragraph below NOTE 3 (original numbering) as follows:

Marked-content operators may appear only between graphics objects in the content stream at the content stream level or within a text object as shown in "Figure 9 — Graphics objects". They may not occur within a graphics object or between a graphics state operator and its operands. Marked-content sequences may be nested one within another, but each sequence shall be entirely contained within a single content stream. "Table 352 — Marked-content operators" summarises the marked-content operators.

...

Change Table 352 as follows:

Table 352 - Marked content operators
Operands Operator Description
tag BMC Begin a marked-content sequence terminated by a balancing EMC operator. tag shall be a name object indicating the role or significance of the sequence.

Add a new table and informative NOTE below Table 352 as follows:

Table 352a - Marked content tags defined in PDF 2.0 (informative)
PDF feature PDF version Subclause tag
Associated files PDF 2.0 14.13.5 AF
Artifacts PDF 2.0 14.8.2.2.2 Artifact
Optional content PDF 1.5 8.11.3.2 OC
Reverse order show strings PDF 1.4 14.8.2.5.3 ReversedChars
Alternate description PDF 1.5 14.9.3 Span
Replacement text PDF 1.4 14.9.4 Span
Expansion of abbreviations and acronyms PDF 1.5 14.9.5 Span
Variable text field replacement PDF 1.2 12.7.4.3 Tx

NOTE: PDF 1.7 defined various clipping related tags (see 14.6.3, "Clip tags" in ISO 32000-1:2008) and a TagSuspect tag (see 14.8.2.3, "Page Content Order" in ISO 32000-1:2008) which were intentionally removed in PDF 2.0.

Change the paragraph below new Table 352a as follows:

When the marked-content operators BMC, BDC, and EMC are combined with the text object operators BT and ET (see 9.4, "Text objects"); the compatibility operators BX and EX (see "Table 33 - Compatibility operators"); or the graphics state save and restore operators q and Q (see "Table 56 - Graphics state operators"), each pair of matching operators (BMC…EMC, BDC…EMC, BT…ET, BX…EX or q…Q) shall be properly (separately) nested. ...
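The nesting requirement can be illustrated with a stack-based check. This is a simplified sketch: it works on a list of operator names only (operands and all other operators are assumed to have been filtered out or are ignored), the function name is hypothetical, and it covers only the pairing rule stated above, not any further conditions in the elided remainder of the paragraph.

```python
# Opening operator -> the closing operator that must balance it.
PAIRS = {"BMC": "EMC", "BDC": "EMC", "BT": "ET", "BX": "EX", "q": "Q"}
CLOSERS = set(PAIRS.values())


def properly_nested(operators):
    """Return True if every opening operator is closed by its own closer, innermost first."""
    expected = []  # stack of closing operators still owed
    for op in operators:
        if op in PAIRS:
            expected.append(PAIRS[op])
        elif op in CLOSERS:
            # A closer must match the most recently opened pair.
            if not expected or expected.pop() != op:
                return False
    return not expected  # nothing may remain open at the end


ok = properly_nested(["BMC", "BT", "ET", "EMC"])
crossed = properly_nested(["BT", "BMC", "ET", "EMC"])
```

The first sequence is properly nested (ok is True); the second closes the text object while the marked-content sequence opened inside it is still open, so crossed is False.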

14.7 Logical structure

14.7.2 Structure hierarchy

Change Table 354 as follows:

Table 354 - Entries in the structure tree root
Key Type Value
K dictionary or array (Optional) The immediate child or children of the structure tree root in the structure hierarchy. The value shall be either a dictionary representing a single structure element or an array of such dictionaries. Values or array elements shall not be null.
ParentTree number tree (Required if any structure element contains content items) A number tree (see 7.9.7, "Number trees") used in finding the structure elements to which content items belong. Each integer key in the number tree shall correspond to a single page of the document or to an individual object (such as an annotation or an XObject) that is a content item in its own right. The integer key shall be the value of the StructParent or StructParents entry in that object (see 14.7.5.4, "Finding structure elements from content items"). The form of the associated value shall depend on the nature of the object: For an object that is a content item in its own right, the value shall be an indirect reference to the object’s parent element (the structure element that contains it as a content item). For a page object or content stream containing marked-content sequences that are content items, the value shall be an array of references to the parent elements of those marked-content sequences. This array may contain elements that are null. See 14.7.5.4, "Finding structure elements from content items" for further discussion.
Namespaces array (Required if any structure elements have namespace identifiers; PDF 2.0) An array of at least all namespaces used within the document as referenced from structure elements in the structure hierarchy (see 14.7.4.2, "Namespace dictionary").

Change Table 355 as follows:

Table 355 - Entries in a structure element dictionary
Key Type Value
R integer (Optional; deprecated in PDF 2.0) The current revision number of this structure element (see 14.7.6.3, "Attribute revision numbers"). The value shall be a non-negative integer. Default value: 0.
Ref array

...

The array shall not contain elements that are null.

K (various)

(Optional) The children of this structure element (shall not be null). The value of this entry may be one of the following objects or an array consisting of one or more of the following objects in any combination:

...

A (various)

...

Attribute objects and revisions shall not be null.

C name or array

...

Attribute class names and revisions shall not be null.

NS dictionary (Optional; PDF 2.0) An indirect reference to a namespace dictionary defining the namespace this element belongs to (see 14.7.4, "Namespaces") that shall also be an element in the structure tree root Namespaces array (see "Table 354 - Entries in the structure tree root"). If NS is not present, the element shall be considered to be in the default standard structure namespace (see 14.8.6, "Standard structure namespaces").

14.7.3 Structure types

Change the third paragraph as follows:

The RoleMap dictionary shall be comprised of a set of keys representing structure element types role mapped to other structure element types. The corresponding value for each of these keys shall be a single name identifying the target structure element type.

...

14.7.5 Structure content

14.7.5.1 General

Move entire subclause 14.7.5.1.1 up one heading level to become 14.7.5.2 and renumber later subclauses of 14.7.5 appropriately. Subclause text is otherwise unchanged:

14.7.5.1.1 Content items

14.7.5.2 Content items

...

14.7.5.2 Marked-content sequences as content items

Change Table 357 as follows:

Table 357 - Entries in a marked-content reference dictionary
Key Type Value
Pg dictionary

(Sometimes required; shall be an indirect reference) The page object representing the page on which the graphics objects in the marked-content sequence shall be rendered. This entry takes precedence over any Pg entry in the structure element containing the marked content reference. This entry is required if the structure element containing the object reference has no Pg entry.

...

14.7.5.3 PDF objects as content items

Change Table 358 as follows:

Table 358 - Entries in an object reference dictionary
Key Type Value
Pg dictionary

(Sometimes required; shall be an indirect reference) The page object of the page on which the object shall be rendered. This entry takes precedence over any Pg entry in the structure element containing the object reference. This entry is required if the structure element containing the object reference has no Pg entry.

14.7.5.4 Finding structure elements from content items

Change the second paragraph as follows:

The parent tree is a number tree (see 7.9.7, "Number trees"), accessed from the ParentTree entry in a document’s structure tree root ("Table 354 — Entries in the structure tree root"). The tree shall contain an entry for each object that is a content item of a structure element and for each content stream containing at least one marked-content sequence that is a content item. The key for each entry shall be an integer given as the value of the StructParent or StructParents entry in the object (see "Table 359 — Additional dictionary entries for structure element access"). The values of these entries shall be as follows:

Change the second bullet point as follows:

  • ...
  • For a content stream containing marked-content sequences that are content items, the value shall be an array of indirect references to the sequences’ parent structure elements or null for unused marked content identifiers (MCIDs) or those that do not have a structural parent. The array element corresponding to each sequence shall be found by using the sequence’s marked-content identifier (MCID) as a zero-based index into the array.
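The lookup described in the bullet above can be sketched as follows. The model is an assumption made for illustration: the parent tree is flattened into a plain dictionary from integer keys to values, indirect references are represented as strings, and PDF null is represented as None. The function name is hypothetical.

```python
def parent_of_marked_content(parent_tree, struct_parents, mcid):
    """Find the parent structure element of a marked-content sequence.

    parent_tree: dict mapping integer keys to values (stand-in for the number tree).
    struct_parents: the StructParents value of the content stream's object.
    mcid: the sequence's marked-content identifier, used as a zero-based index.
    Returns the reference, or None if the slot is null, unused, or missing.
    """
    entry = parent_tree.get(struct_parents)
    if not isinstance(entry, list):
        return None  # no array is registered under this key
    if 0 <= mcid < len(entry):
        return entry[mcid]
    return None


# Key 0 is a page whose MCID 1 has no structural parent (null slot).
tree = {0: ["501 0 R", None, "502 0 R"]}
found = parent_of_marked_content(tree, 0, 2)
unused = parent_of_marked_content(tree, 0, 1)
```

In this sketch found is "502 0 R", while unused is None because the array holds null at index 1.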

Add a new NOTE 2 below existing NOTE as follows (existing NOTE to be renumbered as NOTE 1):

NOTE 2: MCIDs are scoped by content stream and must start at zero, so the same MCID may reappear across pages or XObjects. Thus ensuring MCIDs are contiguous for any given page allows for efficient creation of StructParents without excessive null objects in the structure tree root ParentTree number-tree.

...

Change the paragraph before Table 359 as follows:

To locate the relevant parent tree entry, each object or content stream that is represented in the tree shall contain a special dictionary entry, StructParent or StructParents (see "Table 359 — Additional dictionary entries for structure element access"). At most one of the entries StructParent or StructParents shall be present. An object may be either a content item in its entirety or a container for marked-content sequences that are content items, but not both. Depending on the type of content item, this entry may appear in the page object of a page containing marked-content sequences, in the stream dictionary of a form or image XObject, or in an annotation dictionary. Its value shall be the integer key under which the entry corresponding to the object shall be found in the structural parent tree.

Change Table 359 as follows:

Table 359 - Additional dictionary entries for structure element access
Key Type Value
StructParents integer (Required for all content streams containing marked-content sequences that are structural content items; PDF 1.3) The integer key of this object’s entry in the structural parent tree. At most one of these two entries shall be present in a given object. An object may be either a content item in its entirety or a container for marked-content sequences that are content items, but not both.

14.7.6 Structure attributes

14.7.6.1 General

Change the first paragraph as follows:

A PDF processor that processes logical structure may attach additional information, called attributes, to any structure element. The attribute information shall be held in one or more attribute objects associated with the structure element. An attribute object shall be a dictionary or stream that includes an O entry (see "Table 360 — Entries common to all attribute object dictionaries") identifying the owner of the attribute information. Other entries, except the NS entry, shall represent the attributes: the keys shall be attribute names, and values shall be the corresponding attribute values. To facilitate the interchange of content among conforming products, PDF defines a set of standard structure attributes identified by specific standard owners; see 14.8.5, "Standard structure attributes". In addition, attributes may be used to represent user properties (see 14.7.6.4, "User properties").

Change Table 360 as follows:

Table 360 - Entries common to all attribute object dictionaries
Key Type Value
O name (Required) The name of the owner of the attribute data. The value shall be either NSO, UserProperties (see "Table 361 — Additional entries in an attribute object dictionary for user properties"), one of the values from "Table 376 — Standard structure attribute owners", or conform to the guidelines described in Annex E, "Extending PDF".

If the value for the O entry is NSO then the NS entry shall be present, and shall identify the owner of the attribute object.

...

Change the last paragraph as follows:

When an array of attribute objects is provided, the value of the O and NS keys may be repeated across attribute objects. If a given attribute for a specific owner (as defined by the O and NS entries) is specified more than once, the later (in array order) entry shall take precedence.
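The precedence rule for an array of attribute objects can be sketched as below. Attribute objects are modelled as plain dictionaries, and the owner is taken to be the pair of O and NS values, following the wording above; the function name is hypothetical.

```python
def effective_attributes(attribute_objects):
    """Resolve an array of attribute objects into one value per (owner, attribute).

    Later objects in array order overwrite earlier ones for the same owner
    and attribute name. The O and NS entries identify the owner and are not
    themselves attributes.
    """
    resolved = {}
    for obj in attribute_objects:
        owner = (obj.get("O"), obj.get("NS"))
        for key, value in obj.items():
            if key in ("O", "NS"):
                continue
            resolved[(owner, key)] = value  # later entry takes precedence
    return resolved


attrs = effective_attributes([
    {"O": "Layout", "TextAlign": "Start", "SpaceBefore": 6},
    {"O": "Table", "RowSpan": 2},
    {"O": "Layout", "TextAlign": "Center"},
])
```

Here the Layout owner's TextAlign resolves to Center, because the third object comes later in the array, while SpaceBefore and the Table owner's RowSpan are unaffected.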

14.7.6.2 Attribute classes

Change the last paragraph as follows:

The C entry in a structure element dictionary (see "Table 355 — Entries in a structure element dictionary") shall contain a class name or an array of class names (possibly accompanied by revision numbers as well (deprecated in PDF 2.0); see 14.7.6.3, "Attribute revision numbers"). For each class named in the C entry, the corresponding attribute object or objects shall be considered to be attached to the given structure element, along with those identified in the element’s A entry. Attribute objects included through a class and through an array of classes within the C entry may have the value of O and NS repeated. If a given attribute is specified more than once across the attribute objects, the later (in array order) shall take precedence. If both the A and C entries are present and a given attribute is specified by both, the one specified by the A entry shall take precedence.

Add a new NOTE after the last paragraph as follows:

NOTE If a class identified through the C entry is not present in the ClassMap, it is an empty class, which defines no additional attributes.
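The interaction of the A entry, the C entry and the ClassMap can be sketched as follows. This is an illustration under stated assumptions: attribute objects are plain dictionaries, the ClassMap is a dictionary from class name to one attribute object or a list of them, revision numbers (deprecated in PDF 2.0) are ignored, and attributes are keyed by owner (O and NS) as well as by name. All function names are hypothetical.

```python
def _as_list(value):
    """Treat a single object, a list of objects, or nothing uniformly as a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def _merge(attribute_objects, into):
    """Apply attribute objects in array order; later values overwrite earlier ones."""
    for obj in attribute_objects:
        owner = (obj.get("O"), obj.get("NS"))
        for key, value in obj.items():
            if key not in ("O", "NS"):
                into[(owner, key)] = value


def resolve_element_attributes(a_entry, c_entry, class_map):
    """Combine class attributes (C) with directly attached attributes (A)."""
    resolved = {}
    for class_name in _as_list(c_entry):
        # A class missing from the ClassMap is an empty class: it adds nothing.
        _merge(_as_list(class_map.get(class_name)), resolved)
    # A is applied last so that it takes precedence over anything from C.
    _merge(_as_list(a_entry), resolved)
    return resolved


class_map = {
    "Body": {"O": "Layout", "TextAlign": "Justify", "SpaceAfter": 4},
    "Tight": {"O": "Layout", "SpaceAfter": 0},
}
result = resolve_element_attributes(
    a_entry={"O": "Layout", "TextAlign": "Start"},
    c_entry=["Body", "Tight", "Undefined"],
    class_map=class_map,
)
```

In this sketch SpaceAfter resolves to 0 (the later class wins), TextAlign resolves to Start (the A entry wins), and the class named Undefined contributes nothing.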

14.8 Tagged PDF

14.8.2 Tagged PDF and page content

14.8.2.2 Real content and Artifacts

14.8.2.2.2 Specification of Artifacts

Change/split the paragraph below NOTE 1 into a new NOTE 2 and normative sentence as follows:

For artifacts defined using the marked-content sequence method, the form indicated in EXAMPLE 1 shall be used to identify a generic artifact; the form indicated in EXAMPLE 2 shall be used for those artifacts that have an associated property list.

NOTE 2: For artifacts defined using the marked-content sequence method, the form indicated in EXAMPLE 1 is used to identify a generic artifact; the form indicated in EXAMPLE 2 is used for those artifacts that have an associated property list.

"Table 363 — Property list entries for artifacts" shows the properties that may be included in the property list.

...

14.8.3 Basic layout model

14.8.3.3 Progression direction

Change first paragraph as follows:

The meaning of the terms block-progression direction and inline-progression direction depends on the writing system in use, as specified by the standard structure attribute WritingMode (see 14.8.5.4.2, "General Layout Attributes"). In Western writing systems, the block direction is from top to bottom and the inline direction is from left to right. Other writing systems use different directions for laying out content.

...

14.8.4 Standard structure types

14.8.4.3 Document level structure types

Add a new informative NOTE 1 after the second paragraph (before Table 364) as follows:

NOTE 1: A document can consist of zero, one or more sub-documents and document fragments. However, regardless of the presence of sub-document or document fragments, the entire content of a single PDF file is considered a logical document. Annex L "Parent-child relationships between the standard structure elements in the standard structure namespace for PDF 2.0" contains the requirement that a document utilizing tags from the PDF 2.0 standard structure namespace have a single structure element of type Document as the root of the document.

...

14.8.4.4 Grouping level structure types

Change Table 365 as follows:

Table 365 - Grouping level structure types
Structure Type Category Description
Part Grouping

Encloses a grouping of structure elements without consideration for their hierarchy.

NOTE 1 The non-hierarchical aspect of Part is similar to Div. However, unlike Div, the grouping of elements enclosed in a Part structure element has semantic value.

A structure element with the type of Part shall inherit the containment requirements and limitations of its parent element. Where the parent element is itself a structure element of type Part, then the inheritance shall recurse to the first parent element whose type is not Part.

NOTE 2 The semantic value of a structure element of type Part is determined by the elements enclosed within, in addition to the grouping nature of Part.
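The recursion described for Part can be sketched as a walk up the structure tree. Structure elements are modelled here as dictionaries with hypothetical "type" and "parent" keys; this is an illustration of the inheritance rule only, not of the containment rules themselves.

```python
def containment_source(part_element):
    """Return the ancestor whose containment rules a Part element inherits.

    Walks up from the Part's parent and skips every ancestor that is itself
    a Part, stopping at the first ancestor of any other type. Returns None
    if no such ancestor exists in the modelled tree.
    """
    ancestor = part_element.get("parent")
    while ancestor is not None and ancestor["type"] == "Part":
        ancestor = ancestor.get("parent")
    return ancestor


document = {"type": "Document", "parent": None}
sect = {"type": "Sect", "parent": document}
outer_part = {"type": "Part", "parent": sect}
inner_part = {"type": "Part", "parent": outer_part}

source = containment_source(inner_part)
```

Here source is the Sect element: the inner Part inherits through the outer Part to the first non-Part ancestor.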

...

14.8.4.7 Inline level structure types

14.8.4.7.2 General inline level structure types

Change Table 368 as follows:

Table 368 - General inline level structure types
Structure Type Category Description
Strong Inline

(PDF 2.0) Encloses content for the purpose of strong importance, seriousness or urgency for its contents.

EXAMPLE 3 In this example the Strong element is used to denote the content that the user is intended to read first is more important:

...

Annot Grouping, Block or Inline

Either an association between the content enclosed by the Annot structure element and one or more corresponding PDF annotations (see 12.5, "Annotations"), or a mechanism to include one or more PDF annotations in the structure tree. Encloses one or more PDF annotations and associated content, if any.

...

Form Grouping, Block or Inline

Either an association between content enclosed by the Form structure element and a corresponding widget annotation or a mechanism to include a widget annotation in the structure tree. Encloses a PDF widget annotation and associated content, if any.

...

Insert a new sub-clause heading "14.8.4.7.3 Link elements" below NOTE 1 and modify the existing text below NOTE 1 as follows:

14.8.4.7.3 Link elements

Tagged PDF link elements (standard structure type Link) use PDF's logical structure facilities to establish the association between content items and link annotations, providing functionality comparable to HTML hypertext links. The following items may be children of a link element:

  • One or more content items or other ILSEs (except other links) if A, Dest and PA keys of all of them have identical values.
  • Object references (see 14.7.5.3, "PDF objects as content items") to one or more link annotations associated with the content.

NOTE 1 An SD entry in the GoTo or GoToR action of a Link annotation facilitates linking directly to a target structure element as opposed to just targeting an area on a page.

When a Link structure element describes a span of text to be associated with a link annotation and that span wraps from the end of one line to the beginning of another, the Link structure element shall include a single object reference that associates the span with the associated link annotation. Further, the link annotation shall use the QuadPoint entry to denote the active areas on the page.

EXAMPLE 1 The Link structure element references a link annotation that includes a QuadPoint entry that boxes the strings "with a" and "link". That is, the QuadPoint entry contains 16 numbers: the first 8 numbers describe a quadrilateral for "with a", and the next 8 describe a quadrilateral for "link".

Here is some text with a
link inside.
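The grouping of the 16 numbers into two quadrilaterals can be sketched as follows. The coordinates are made-up values for illustration and do not come from the EXAMPLE; the function name is hypothetical.

```python
def split_quads(quad_points):
    """Split a flat list of numbers into quadrilaterals of 8 numbers each."""
    if len(quad_points) % 8 != 0:
        raise ValueError("expected a multiple of 8 numbers")
    return [tuple(quad_points[i:i + 8]) for i in range(0, len(quad_points), 8)]


# Illustrative values only: one quadrilateral per line of wrapped link text.
points = [
    300, 760, 350, 760, 300, 745, 350, 745,   # "with a" at the end of line one
    10, 740, 40, 740, 10, 725, 40, 725,       # "link" at the start of line two
]
quads = split_quads(points)
```

With 16 numbers, quads holds two tuples of 8 numbers, one for each active area of the single link annotation.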

NOTE 2 Beginning with PDF 1.7, use of the Link structure element to enclose multiple link annotations on a single page is deprecated.

EXAMPLE 2 Consider the following fragment of HTML code, which produces a line of text containing a hypertext link:

<html>
<body>
<p>
Here is some text
<a href="https://www.pdfa.org">with a link</a>
inside.
</p>
</body>
</html>

This code sample shows an equivalent fragment of PDF using a link element, whose text it displays in blue and underlined.

/P << /MCID 0 >> % Marked-content sequence 0 (paragraph)
BDC % Begin marked-content sequence
BT % Begin text object
/T1_0 1 Tf % Set text font and size
14 0 0 14 10.000 753.976 Tm % Set text matrix
0.0 0.0 0.0 rg % Set nonstroking colour to black
(Here is some text ) Tj % Show text preceding link
ET % End text object
EMC % End marked-content sequence
/Link << /MCID 1 >> % Marked-content sequence 1 (link)
BDC % Begin marked-content sequence
0.7 w % Set line width
[ ] 0 d % Solid dash pattern
111.094 751.8587 m % Move to beginning of underline
174.486 751.8587 l % Draw underline
0.0 0.0 1.0 RG % Set stroking colour to blue
S % Stroke underline
BT % Begin text object
14 0 0 14 111.094 753.976 Tm % Set text matrix
0.0 0.0 1.0 rg % Set nonstroking colour to blue
(with a link) Tj % Show text of link
ET % End text object
EMC % End marked-content sequence
/P << /MCID 2 >> % Marked-content sequence 2 (paragraph)
BDC % Begin marked-content sequence
BT % Begin text object
14 0 0 14 174.486 753.976 Tm % Set text matrix
0.0 0.0 0.0 rg % Set nonstroking colour to black
( inside.) Tj % Show text following link
ET % End text object
EMC % End marked-content sequence

EXAMPLE 3 This example shows an excerpt from the associated logical structure hierarchy.

501 0 obj % Structure element for paragraph
<< /Type /StructElem
   /S /P
   ...
   /K [ 0 % Three children: marked-content sequence 0
        502 0 R % Link
        2 % Marked-content sequence 2
      ]
>>
endobj
502 0 obj % Structure element for link
<< /Type /StructElem
   /S /Link
   ...
   /K [ 1 % Two children: marked-content sequence 1
        503 0 R % Object reference to link annotation
      ]
>>
endobj
503 0 obj % Object reference to link annotation
<< /Type /OBJR
   /Obj 600 0 R % Link annotation (not shown)
>>
endobj

14.8.4.7.4 Ruby and warichu elements

EDITOR NOTE: Clause is renumbered - existing text is unchanged.

14.8.4.8.3 Table structure types

Change Table 371 as follows:

Table 371 - Table standard structure types
Structure Type Category Description
TR Internal to a Table structure A row of table header cells (TH) or table data cells (TD) or both in a table.

Update all text below Table 371 as follows:

If the Headers attribute (see 14.8.5, "Standard structure attributes") is not specified, any cell in a table may have multiple headers associated with it. These headers are defined either explicitly by the Headers attribute, or implicitly, by the following algorithm: When a cell does not explicitly identify its headers through a Headers attribute, the following algorithm may be used to determine the cell’s headers, if any:

Given a data or header cell, begin from the current cell position and use the current value of WritingMode to search towards the first cell in the appropriate horizontal and vertical directions. The search terminates when any of these conditions is reached:

  • the edge of the table is reached
  • a data cell is found after a header cell
  • a header cell is found that has the Headers attribute set — the headers that are specified are appended to the row or column list that is currently being built

However, cells with a value for Scope that conflicts with the search's direction should not terminate this section of the search and such a cell (irrespective of its Headers attribute) should not be added to the headers list being created.

EXAMPLE: If searching up a given column reveals a cell with a Scope value of Row this cell should not cause this column search to terminate.

When a header cell is found in the search and the (implicit or explicit) Scope attribute of the header cell matches the current search direction (i.e., is either Both, or is Row or Column when searching for row or column headers, respectively), then the header cell is appended to the end of the list of row or column headers, resulting in a list of headers ordered from most specific to most general.

NOTE: This algorithm works for languages with different intrinsic directionality of the script (such as right-to-left) because the structure always reflects the logical content order of the table.
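The implicit header search can be sketched as follows. This is a simplified illustration, not the normative algorithm. Its assumptions: the table is a rectangular grid without spanning cells; the writing mode is such that row headers lie to the left and column headers lie above; every header cell carries an already resolved Scope value (Row, Column or Both); and a header cell whose Scope conflicts with the search direction does not count as "a header cell" for the data-cell termination rule. The cell model and function names are hypothetical.

```python
def collect_headers(cells_toward_edge, direction):
    """Search one direction, starting next to the current cell and moving to the table edge.

    direction is "Row" when collecting row headers, "Column" for column headers.
    """
    found = []
    seen_header = False
    for cell in cells_toward_edge:
        if cell["kind"] == "TD":
            if seen_header:
                break  # a data cell after a header cell ends the search
            continue
        # Header cell whose Scope conflicts with this direction:
        # it neither terminates the search nor joins the list.
        if cell.get("scope") not in (direction, "Both"):
            continue
        if cell.get("headers"):
            found.extend(cell["headers"])  # explicit Headers are appended, then stop
            break
        found.append(cell["id"])
        seen_header = True
    return found  # ordered from most specific to most general


def headers_for(grid, row, col):
    """Return the implicit row and column headers of the cell at grid[row][col]."""
    row_cells = [grid[row][c] for c in range(col - 1, -1, -1)]
    col_cells = [grid[r][col] for r in range(row - 1, -1, -1)]
    return {
        "row": collect_headers(row_cells, "Row"),
        "column": collect_headers(col_cells, "Column"),
    }


def th(cell_id, scope, headers=None):
    return {"kind": "TH", "id": cell_id, "scope": scope, "headers": headers}


def td(cell_id):
    return {"kind": "TD", "id": cell_id}


grid = [
    [th("corner", "Both"), th("c1", "Column"), th("c2", "Column")],
    [th("r1", "Row"), td("a1"), td("a2")],
    [th("r2", "Row"), td("b1"), td("b2")],
]
data_cell = headers_for(grid, 2, 2)
header_cell = headers_for(grid, 2, 0)
```

For the data cell at the bottom right the sketch yields row header r2 and column header c2. For the header cell r2, the upward search skips r1 (its Scope is Row, which conflicts with a column search) and reaches the corner cell, mirroring the EXAMPLE above.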

14.8.4.8.4 Caption structure types

Change Table 372 as follows:

Table 372 - Standard structure type Caption
Structure Type Category Description
Caption Grouping or Block

...

A structure element is understood to be "captioned" when a Caption structure element exists as an immediate child of that structure element. The Caption shall be the first or the last structure element inside its parent structure element. The number of captions cannot exceed 1.
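The placement rule for Caption can be expressed as a small check. The input is assumed to be the list of structure types of the immediate structure-element children of one parent, in order; content items are assumed to be excluded. The function name is hypothetical.

```python
def caption_placement_ok(child_types):
    """Check that at most one Caption is present and that it is the first or last child."""
    positions = [i for i, t in enumerate(child_types) if t == "Caption"]
    if not positions:
        return True  # an element without a caption is simply not captioned
    if len(positions) > 1:
        return False  # the number of captions cannot exceed 1
    return positions[0] in (0, len(child_types) - 1)


leading = caption_placement_ok(["Caption", "P", "P"])
middle = caption_placement_ok(["P", "Caption", "P"])
```

In this sketch leading is True and middle is False, since a Caption between two other children is neither first nor last.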

While captions are often used with figures or formulas, they may be associated with any type of content.

NOTE 1 In principle, captions can appear in a nested fashion. For example, several smaller images belonging to a group of images can each be accompanied by a caption, and the group of these images as a whole is accompanied by a caption as well.

NOTE 2 If an Artifact structure element is present, and needs to be associated with a Caption, then the Artifact structure element needs to be a descendant of the Caption.

14.8.4.8.6 Formula structure types

Delete the entire second paragraph as follows:

The standard structure type Formula shall not appear between the BT and ET operators delimiting a text object (see 9.4, "Text objects").

Change the third paragraph as follows:

A Formula element may have logical substructure, including other Formula elements. It should have a BBox attribute (see 14.8.5, "Standard structure attributes") and can then for repurposing purposes be treated as visually static, without examining its internal contents.

...

14.8.5 Standard structure attributes

EDITOR NOTE: as a result of Errata #226 throughout all subclauses of 14.8.5, "Layout", "Table", "PrintField" and "Artifact" terms are incorrectly formatted as bold indicating a key name, when they are O (owner) key values and thus should be italic.

14.8.5.1 General

Change reference to subclause 14.8.5.2 in sixth paragraph as follows:

In addition to the standard structure attributes described in 14.8.5.2, "Standard structure attribute owners" there are several other optional entries – Lang, Alt, ActualText, and E – that are described in 14.9, "Repurposing and accessibility support" but are useful to other PDF consumers as well. They appear in the following places in a PDF file (rather than in attribute dictionaries):

  • ...

Change title of subclause 14.8.5.2 as follows:

14.8.5.2 Standard structure attribute owners

Change first paragraph as follows:

Each attribute object has an owner, specified by the object's O entry, or, if the value of O is NSO, by the object’s NS entry, which determines the interpretation of the attributes defined in the object's dictionary. Multiple owners may define like-named attributes with different value types or interpretations. Tagged PDF defines a set of standard structure attribute owners as shown in "Table 376 — Standard structure attribute owners".

...

Change Table 376 as follows:

Table 376 - Standard structure attribute owners
Owner value for the attribute object’s O entry Description
CSS-1 Additional attributes governing translation to a format using CSS, level 1
CSS-2 Additional attributes governing translation to a format using CSS, level 2, revision 1
CSS-3 Additional attributes governing translation to a format using CSS, level 3

Add a new NOTE immediately below Table 376 as follows:

NOTE: other values for the owner (O) entry are defined in "Table 360 — Entries common to all attribute object dictionaries".

Add a new NOTE after the current note below Table 376 as follows:

NOTE The attribute owner, defined through the O and NS entries in the attribute object, defines an owner for each attribute, but does not provide information on transformation of those attributes into other formats. When considering formats such as HTML and MathML, attributes would be transformed to meet the syntactic requirements of those formats.

...

14.8.5.3 Attribute values and inheritance

...

Change NOTE 1 as follows:

NOTE 1 The description of each of the standard structure attributes in this subclause specifies whether their values are inheritable.

...

14.8.5.4 Layout attributes

14.8.5.4.1 General

Replace Table 377 as follows:

EDITOR NOTE: as a result of Errata #226, only the second column in Table 377 should be bold (indicating key names). The first column should not have bold typeface unless a key name is used. A fourth column is also to be added containing cross-references to Tables in ISO 32000-2 where each attribute is defined.

Table 377 - Standard layout attributes
Structure Elements Attributes key name Inheritable References
Any structure element Placement No Table 378
WritingMode Yes Table 378
BackgroundColor No Table 378
BorderColor No Table 378
BorderStyle Yes Table 378
Color Yes Table 378
Padding No Table 378
Any BLSE; ILSEs with Placement other than Inline SpaceBefore No Table 379
SpaceAfter No Table 379
StartIndent Yes Table 379
EndIndent Yes Table 379
BLSEs containing text TextIndent Yes Table 379
TextAlign Yes Table 379
Figure, Form, Formula, Artifact, and Table elements BBox No Tables 379 and 385
Figure, Form, Formula, Table, TH (Table Header) and TD (Table data) Width No Table 379
Height No Table 379
TH (Table Header) and TD (Table data) BlockAlign Yes Table 379
InlineAlign Yes Table 379
TBorderStyle Yes Table 379
TPadding Yes Table 379
Any ILSE; BLSEs containing ILSEs or containing direct or nested content items LineHeight Yes Table 380
BaselineShift No Table 380
TextDecorationType Yes, only for directly nested ILSEs Table 380
TextPosition Yes Table 380
TextDecorationColor Yes Table 380
TextDecorationThickness Yes Table 380
Grouping elements ColumnCount No Table 381
ColumnWidth No Table 381
ColumnGap No Table 381
Any structure element containing text whose inline-progression direction is top to bottom or bottom to top. GlyphOrientationVertical Yes Table 380
RB, RT, RP (Ruby text) RubyAlign Yes Tables 369 and 380
RubyPosition Yes Tables 369 and 380

...

14.8.5.4.2 General layout attributes

Change Table 378 as follows:

Table 378 - Standard layout attributes common to all standard structure types
Key Type Value
Placement name

(Optional; not inheritable) The positioning of the element with respect to the enclosing reference area and other content (see 14.8.3.3, "Progression direction"). The value shall be one of the following:

...

Default value: Block for BLSEs, Inline for ILSEs.

NOTE The default value depends on the context in which the structure element is used (see 14.8.4.1, "General").

EXAMPLE 1 A Figure structure element occurring within a P structure element is an ILSE, and therefore has a default value of Inline.

EXAMPLE 2 A Figure structure element occurring within a Sect structure element is a BLSE, and therefore has a default value of Block.

WritingMode name

(Optional; inheritable) Indicates the directions of layout progression inside Block Level Structure Elements (BLSEs) (inline progression) and regarding the sequence of BLSEs (block progression) (see 14.8.3.3, "Progression direction"). WritingMode may be used as an attribute for any structure element. The value shall be one of the following:

...

14.8.5.4.3 Layout Attributes for BLSEs

Change Table 379 as follows:

Table 379 - Standard layout attributes specific to block-level structure elements
Key Type Value
BBox rectangle ...

NOTE 4: Artifact attributes also define a BBox attribute (see "Table 385 — Standard artifact attributes").

14.8.5.4.4 Layout Attributes for ILSEs

Change Table 380 as follows:

Table 380 - Standard layout attributes specific to inline-level structure elements
Key Type Value
GlyphOrientationVertical integer or name

...

14.8.5.5 List attributes

Change Table 382 as follows:

Table 382 - Standard list attributes
Key Type Value
ContinuedList boolean (Optional; not inheritable; PDF 2.0) ...
ContinuedFrom ID (byte string) (Optional; not inheritable; PDF 2.0) ...

14.8.5.8 Artifact attributes

Change Table 385 as follows:

Table 385 - Standard artifact attributes
Key Type Value
Type name (Optional; not inheritable; PDF 2.0) ...
BBox rectangle (Optional; not inheritable; PDF 2.0) ...

NOTE: BLSE attributes also define a BBox attribute (see "Table 379 — Additional standard layout attributes specific to block-level structure elements").

Subtype name (Optional; not inheritable; PDF 2.0) ...

Insert a new NOTE below Table 385 as follows:

NOTE: the meanings of the keys Type and Subtype in "Table 385 - Standard artifact attributes" do not follow the documented conventions described in 7.3.7, "Dictionary objects". This is due to alignment of "Table 385 - Standard artifact attributes" (PDF 2.0) with the previously existing "Table 363 - Property list entries for artifacts".

14.8.6 Standard structure namespaces

14.8.6.1 Namespaces for standard structure types and attributes

Modify the paragraph above the current NOTE as follows:

To facilitate conversion of documents created against versions of the PDF standard earlier than PDF 2.0, the default standard structure namespace shall be "http://iso.org/pdf/ssn". When a namespace is not explicitly specified for a given structure element or attribute, it shall be assumed to be within this default standard structure namespace. When a structure element does not have a specified namespace, after transitively applying any role map present (see 14.8.6.2, "Role maps and namespaces"), the final element type shall be considered to be within the default standard structure namespace and shall be one of the standard structure types defined in the default standard structure namespace.

Add a new NOTE 1 above the current NOTE as follows:

NOTE 1 The original structure type is still considered to be in an undefined namespace, which means it is exempt from restrictions on role mapping within the same namespace.

EDITOR NOTE: the current note is renumbered as NOTE 2.

Insert a new NOTE 3 after the last paragraph as follows:

The term standard structure namespaces refers to either of the two namespaces defined above.

NOTE 3 Namespaces are designed to provide greater interchange of PDFs including logical structure, providing a means to identify the custom namespace for each element, if appropriate. However, structure element types in undefined namespaces continue to be permitted.

14.8.6.2 Role maps and namespaces

...

Modify the 2nd bullet in the bulleted list as follows:

In a tagged PDF, all structure elements shall be in at least one of the standard structure namespaces or in a namespace identified in 14.8.6.3, “Other namespaces”. An element shall be considered to be in one of these namespaces if:

  • they directly identify one of these namespaces through their NS entry;
  • they are in the default standard structure namespace (after any role mapping);
  • ...

Insert a new EXAMPLE 1 at the end of the subclause as follows:

EXAMPLE 1: use of namespaces

17 0 obj
<<
  /Type /StructElem
  /S /section
  /P 5 0 R
  /NS 15 0 R
>>
endobj
15 0 obj
<<
  /Type /Namespace
  /NS (urn:uuid:A63861E-9F7-4FCB-9B27-C3BC8D9BFB06)
  /RoleMapNS 16 0 R
>>
endobj
16 0 obj
<<
  /section [/H1 11 0 R]
  ...
>>
endobj

Insert a new EXAMPLE 2 at the end of the subclause as follows:

EXAMPLE 2: Role mapping of structure elements with no explicitly identified namespace

13 0 obj % A structure tree with a role map for elements within an undefined namespace
<<
  /Type /StructTreeRoot
  /RoleMap << % The "Global" role map is applied to elements in an undefined namespace.
    /Foo /Bar % The element type "Foo" in an undefined namespace maps to "Bar".
    /Bar /P   % The "Bar" element in an undefined namespace maps to P, which is defined in
              % the default standard structure namespace. This means that elements of
              % type "Foo" transitively map to "P" through the "Bar" element type.
  >>
  ...
>>
endobj
14 0 obj % A structure element with an undefined namespace of structure type "Foo"
<<
  /Type /StructElem % Structure Element with no defined namespace.
  /S /Foo % The "Foo" element has an undefined namespace and isn't defined in either
          % the PDF 1.7 or PDF 2.0 namespaces.
  ...
>>
endobj
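
The transitive mapping that EXAMPLE 2 describes can be sketched in Python. This is a hedged illustration rather than normative behaviour: the role map is modelled as a plain dict, the names resolve_role and STANDARD_TYPES are hypothetical, and STANDARD_TYPES holds only a small illustrative subset of the standard structure types.

```python
# Illustrative subset only; the real set of standard structure types is larger.
STANDARD_TYPES = {"P", "H1", "Sect", "Span"}


def resolve_role(struct_type, role_map, standard_types=STANDARD_TYPES):
    """Follow the global role map transitively for an element with no namespace.

    role_map is a hypothetical dict model of the RoleMap dictionary,
    for example {"Foo": "Bar", "Bar": "P"}.
    """
    seen = set()
    current = struct_type
    while current in role_map:
        if current in seen:
            # Guard against circular mappings so the loop always terminates.
            raise ValueError(f"circular role mapping at {current!r}")
        seen.add(current)
        current = role_map[current]
    if current not in standard_types:
        raise ValueError(f"{struct_type!r} does not map to a standard structure type")
    return current
```

With the role map from EXAMPLE 2, resolve_role("Foo", {"Foo": "Bar", "Bar": "P"}) follows Foo to Bar and then Bar to P.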

14.8.6.3 Other namespaces

Replace the paragraph below NOTE 1 as follows:

NOTE 1 MathML is the only domain-specific namespace defined in PDF 2.0.

When including mathematics structured as MathML 3.0, the math structure element type, as defined in MathML 3.0, shall be used to enclose the formula under the Formula structure element type. All MathML structure element types and their attributes shall have the MathML 3.0 namespace explicitly defined (see 14.7.4.2, "Namespace dictionary").

...

14.9 Repurposing and accessibility support

14.9.2 Natural language specification

14.9.2.2 Language identifiers

Add a new NOTE below the last bullet point as follows:

NOTE: Lang entries are defined as text strings which include Unicode-encoded strings as shown in "Figure 7 - Relationship between string types". Lang entry text strings can be represented as either literal strings or hexadecimal strings (see 7.3.4, "String objects").

14.9.4 Replacement text

Add a new bullet as the 3rd (i.e. last) item in the bullet list as follows:

Replacement text may be specified for the following items:

  • A structure element (see 14.7.2, “Structure Hierarchy”), by means of the optional ActualText entry (PDF 1.4) of the structure element dictionary.
  • (PDF 1.5) A marked-content sequence (see 14.6, “Marked Content”), through an ActualText entry in a property list attached to the marked-content sequence with a Span tag.
  • (PDF 1.5) A marked-content sequence (see 14.6, "Marked content"), through an ActualText entry in a property list attached to the marked-content sequence with an Artifact tag (see 14.8.2.2.2, "Artifact").

...

Add a new EXAMPLE above the existing EXAMPLE and number as follows:

EXAMPLE 1 This example shows the use of replacement text to indicate the correct character content in a case where the SPACE character does not appear in the text content.

/Span << /ActualText (Missing ) >> BDC % add missing space character
(Missing) Tj
EMC
44 0 Td % Simulate space by moving the drawing position
(space) Tj

Correct the EXAMPLE as follows:

EXAMPLE 2 This example shows the use of replacement text to indicate the correct character content in a case where hyphenation changes the spelling of a word (in German, up until spelling reforms, the word "Drucker" when hyphenated was rendered as "Druk-" and "ker").

(Dru) Tj
/Span << /ActualText (c) >> BDC
(k-) Tj
EMC
(ker) '
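
How a consumer might apply the replacement text in EXAMPLE 2 during text extraction can be sketched in Python. This is a simplified illustration under stated assumptions: the content stream is assumed to be already parsed into (shown text, ActualText or None) pairs, and the function name extract_text is hypothetical.

```python
def extract_text(runs):
    """Join text runs, preferring ActualText over the shown glyphs.

    runs is a hypothetical pre-parsed list of (shown_text, actual_text) pairs,
    where actual_text is None for runs without replacement text.
    """
    return "".join(
        actual if actual is not None else shown
        for shown, actual in runs
    )


# The hyphenated word from EXAMPLE 2: "k-" is replaced by "c".
runs = [("Dru", None), ("k-", "c"), ("ker", None)]
print(extract_text(runs))  # prints "Drucker"
```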

14.10 Web capture

14.10.5 Source information

14.10.5.3 Command dictionaries

Change Table 393 as follows:

Table 393 - Entries in a Web Capture command dictionary
Key Type Value
F integer

(Optional) A set of flags specifying various characteristics of the command (see "Table 394 - Web Capture command flags"). Default value: 0.

14.11 Prepress support

14.11.5 Output intents

Change Table 402 as follows:

Table 402 - Entries in a DestOutputProfileRef dictionary
Key Type Value
URLs array

(Optional; PDF 2.0) An array, containing at least one element, where each element shall be an embedded file specification (7.11.4, "Embedded file streams") or a URL file specification dictionary (see "Table 43 - Entries in a file specification dictionary") with an FS entry having the value URL and an F entry being a PDFDocEncoded text string limited to RFC 3986 (see 7.11.5, "URL specifications").

NOTE: ICC profiles referenced via the URLs array do not have to conform to the ICCBased requirements of "Table 67 — ICC profile types" and thus can also support N-component output profiles.

14.11.7 Open prepress interface (OPI)

Change Table 407 as follows:

Table 407 - Entries in a version 2.0 OPI dictionary
Key Type OPI Comment Value
Inks name or array %%ImageInks

(Optional) A name object or array specifying the colourants to be applied to the image. The value may be the name full_color or registration or an array of the form

[/monochrome name_1 tint_1 … name_n tint_n]

where each name is a string representing the name of a colourant and each tint is a real number in the range 0.0 to 1.0 specifying the concentration of that colourant to be applied.

14.12 Document parts

14.12.4 Data structures

14.12.4.1 General

Change Table 409 as follows:

Table 409 - Entries in a DPart dictionary
Key Type Value
Metadata stream

(Optional; PDF 2.0; shall be an indirect reference) A metadata stream that shall contain metadata for this document part (see 14.3.2, "Metadata streams").

XMP metadata streams (see 14.3.2, "Metadata streams") shall not be used in DPart dictionaries.

NOTE: the Metadata key was allowed in earlier editions of PDF 2.0.

14.12.4.2 Document part metadata

Change the last paragraph as follows:

The values of keys present in the DPM dictionary, or of any dictionary or array object present in the DPM dictionary, shall only be of type text string, date string, name, array, dictionary, boolean, integer or real as defined in 7.3, "Objects". All key values that are PDF name objects, after expansion of character sequences escaped with a NUMBER SIGN (23h), if any, shall be valid UTF-8 character sequences. Other PDF value types shall not be used.
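
The name-object requirement above can be checked with a short Python sketch. It is illustrative only: it assumes the raw bytes of the name (without the leading SOLIDUS) are already available, and the function names expand_name_escapes and is_valid_dpm_name are hypothetical.

```python
import re


def expand_name_escapes(raw: bytes) -> bytes:
    """Replace each #xx escape (NUMBER SIGN plus two hex digits) with its byte."""
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        raw,
    )


def is_valid_dpm_name(raw: bytes) -> bool:
    """Return True if the expanded name is a valid UTF-8 character sequence."""
    try:
        expand_name_escapes(raw).decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For instance, the name bytes Caf#C3#A9 expand to the UTF-8 encoding of "Café" and pass the check, while Bad#FF fails because FFh never occurs in valid UTF-8.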

14.13 Associated files

14.13.5 Associated files linked to graphics objects

...

Change the paragraph below NOTE 2 as follows:

Unlike other types of marked-content tags, the DP or MP marked-content operators shall not be used with the AF tag when that tag is used to refer to an array of file specification dictionaries.

Change NOTE 3 as follows:

NOTE 3 The combination of a DP or MP operator with an AF tag (when used to refer to an array of file specification dictionaries) is forbidden, as these operators only mark a single point and thus don’t enable connections between any specific sequence of content operators and their associated file.

Change the paragraph below NOTE 3 as follows:

The property list associated with the marked-content shall specify a dictionary with an MCAF entry defining an array of file specification dictionaries (see 7.11.3, "File specification dictionaries") to which the content is associated. The named resource in the property list (see 14.6.2, "Property lists") shall specify this dictionary to which the content is associated. The relationship that the associated files have to the PDF content is supplied by the AFRelationship key in each file specification dictionary in the array.

Insert a new Table, new NOTE 4, and a new EXAMPLE below the paragraph below NOTE 3 as follows:

NOTE 4 As defined in Table 34 and section 14.6.2, "Property Lists", a marked-content property list is always a dictionary. Furthermore, entries in Associated File file specification dictionaries use an indirect reference to the embedded file stream and, since indirect references are not allowed in content streams, named property resources are always used.

Table 409a - Property list entries for associated files
Key Type Value
MCAF array (Optional, PDF 2.0) An array of one or more file specification dictionaries (7.11.3, "File specification dictionaries") which denote the associated files for this marked-content sequence. Each file specification dictionary in the array shall have an AFRelationship entry.

EXAMPLE:
10 0 obj
<<
  /Resources <<
    /Properties <<
      /MF1 <<
        /MCAF [
          << /Type /Filespec /AFRelationship /Data /EF ... >>
          << /Type /Filespec /AFRelationship /Schema /EF ... >>
        ]
      >>
      /MF2 << /MCAF [ ... ] >>
    >>
  >>
>>
stream
...
/AF /MF1 BDC
...
EMC
...
endstream
endobj

Change the second last paragraph as follows:

Although the marked-content tag shall be AF, other applications of marked-content are not precluded from using AF as a tag. The marked-content is connected with associated files only if the tag is AF and the named property list is defined as a valid array of file specification dictionaries according to “Table 409a - Property list entries for associated files”. To avoid conflict with other features that use marked-content, such as 14.7, "Logical structure", where content is to be tagged with source content markers as well as other markers, the other markers should be nested inside the source content markers.
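
The connection test described in the paragraph above can be sketched in Python. This is a minimal illustration, not a conforming validator: PDF dictionaries and arrays are modelled as Python dicts and lists, and the function name is_associated_file_link is hypothetical.

```python
def is_associated_file_link(tag, prop_list):
    """Decide whether a marked-content sequence is connected to associated files.

    tag is the marked-content tag as a string (without the SOLIDUS);
    prop_list is a hypothetical dict model of the named property list.
    """
    if tag != "AF" or not isinstance(prop_list, dict):
        return False
    mcaf = prop_list.get("MCAF")
    # MCAF shall be an array of one or more file specification dictionaries.
    if not isinstance(mcaf, list) or not mcaf:
        return False
    # Each file specification dictionary shall have an AFRelationship entry.
    return all(
        isinstance(filespec, dict) and "AFRelationship" in filespec
        for filespec in mcaf
    )
```

A sequence tagged AF whose property list lacks a valid MCAF array is simply treated as some other use of the tag, which matches the statement that other applications are not precluded from using AF.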

...


Last modified: 13 November 2024