7. Syntax

7.2 Lexical conventions

7.2.2 Representation

Change the first paragraph and the 3rd (last) bullet in the subsequent list as follows:

A non-encrypted PDF file can be entirely represented using byte values corresponding to the visible printable subset of the ASCII character set defined in INCITS 4-1986 (R2017), plus white-space characters. However, a PDF file is not restricted to the ASCII character set; it may contain arbitrary bytes, subject to the following considerations:

  • ...
  • ...
  • A PDF file containing binary data shall be transported as a binary file rather than as a text file to ensure that all bytes of the file are faithfully preserved.

...

Change Table 2 and add the footnote below Table 2 as follows:

Table 2 - Delimiter characters
Glyph Decimal Hexadecimal Octal Name
{ 123 7B 173 LEFT CURLY BRACKET a
} 125 7D 175 RIGHT CURLY BRACKET a

(a) The delimiter characters { and } (LEFT CURLY BRACKET (7Bh) and RIGHT CURLY BRACKET (7Dh)) are additional delimiter characters only within Type 4 PostScript calculator functions (see 7.10.5 "Type 4 (PostScript calculator) functions").

7.3 Objects

7.3.3 Numeric objects

Add the following EBNF figure and embedded attachment above EXAMPLE 1:

EBNF railroad diagram for a PDF integer object 📎
Figure 1a - EBNF diagram for a PDF integer object

Change EXAMPLE 1 as follows:

EXAMPLE 1 Integer objects

123 43445 +17 -98 0 00987
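As an informal illustration (not part of the normative text), the integer forms in EXAMPLE 1 can be approximated by a regular expression: an optional sign followed by one or more decimal digits. The names PDF_INTEGER and is_pdf_integer are assumptions introduced for this sketch; Figure 1a remains the authoritative grammar.

```python
import re

# Assumed pattern: optional PLUS SIGN or HYPHEN-MINUS, then one or more digits.
PDF_INTEGER = re.compile(rb"[+-]?[0-9]+")


def is_pdf_integer(token: bytes) -> bool:
    """Return True if the whole token matches the assumed integer syntax."""
    return PDF_INTEGER.fullmatch(token) is not None
```

Under this sketch, leading zeros (as in 00987) are accepted, while a bare sign or a token containing a period is rejected.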

...

Add the following EBNF figure and embedded attachment above EXAMPLE 2:

EBNF railroad diagram for PDF real object 📎
Figure 1b - EBNF diagram for a PDF real object

Change EXAMPLE 2 as follows:

EXAMPLE 2 Real objects

34.5 -3.62 +123.6 4. -.002 0 009.87
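A minimal sketch of how the tokens in EXAMPLE 2 might be recognised, assuming a numeric token is an optional sign followed by digits with at most one period and no exponent. PDF_NUMBER and parse_pdf_number are hypothetical names used only here; Figure 1b remains the authoritative grammar.

```python
import re

# Assumed pattern: optional sign, then either digits with an optional
# period and optional fraction digits, or a period followed by digits.
PDF_NUMBER = re.compile(rb"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)")


def parse_pdf_number(token: bytes) -> float:
    """Convert a numeric token to a float, rejecting other syntax."""
    if PDF_NUMBER.fullmatch(token) is None:
        raise ValueError(f"not a PDF numeric token: {token!r}")
    return float(token.decode("ascii"))
```

Exponent notation such as 1e5 is rejected by this sketch, as is a lone period.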

...

7.3.4 String objects

7.3.4.2 Literal strings

Change first paragraph as follows:

A literal string shall be written as an arbitrary number of characters enclosed in parentheses (LEFT PARENTHESIS (28h) and RIGHT PARENTHESIS (29h)). Any characters may appear in a string except unbalanced parentheses and the backslash (REVERSE SOLIDUS (5Ch)), which shall be treated specially as described in this subclause. Balanced pairs of parentheses within a string require no special treatment.
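The handling of balanced parentheses and the backslash can be sketched as follows. This is an illustrative scanner only, assuming the input is a byte string and that any byte following a REVERSE SOLIDUS is skipped when searching for the closing parenthesis; the function name literal_string_end is hypothetical.

```python
def literal_string_end(data: bytes, start: int) -> int:
    """Return the index just past the closing parenthesis of the literal
    string that opens at data[start]."""
    if data[start:start + 1] != b"(":
        raise ValueError("literal string must start with LEFT PARENTHESIS")
    depth = 0
    i = start
    while i < len(data):
        ch = data[i]
        if ch == 0x5C:        # REVERSE SOLIDUS: skip the escaped byte
            i += 2
            continue
        if ch == 0x28:        # LEFT PARENTHESIS opens a nested level
            depth += 1
        elif ch == 0x29:      # RIGHT PARENTHESIS closes a level
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced parentheses in literal string")
```

Balanced inner parentheses need no escaping under this scheme, whereas an unbalanced one has to be preceded by a backslash.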

...

7.3.7 Dictionary objects

...

Add a new NOTE 2 below the third paragraph as follows:

Multiple entries in the same dictionary shall not have the same key.

NOTE 2 Due to 2-digit hexadecimal code escaping in PDF names, there are different ways to write the same key (see 7.3.5, "Name objects").
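To illustrate NOTE 2, the sketch below expands 2-digit hexadecimal escapes before comparing keys, so that differently written forms of the same name are detected as duplicates. It assumes names are supplied as byte strings without the leading SOLIDUS; decode_name and has_duplicate_keys are hypothetical helpers.

```python
import re

_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_name(raw: bytes) -> bytes:
    """Expand each #xx escape in a name token to the byte it represents."""
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def has_duplicate_keys(raw_keys) -> bool:
    """Return True if two keys are equal once their escapes are expanded."""
    decoded = [decode_name(key) for key in raw_keys]
    return len(set(decoded)) != len(decoded)
```

For instance, the tokens Type and T#79pe decode to the same key and would be flagged.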

...

7.3.8 Stream objects

7.3.8.2 Stream extent

Insert a new NOTE immediately after the first paragraph and before the EXAMPLE as follows:

NOTE: The 'encoded data' of a stream encompasses all enveloping markers of the encoding, e.g. end-of-data markers, if the encoding scheme uses them.

Change Table 5 as follows:

Table 5 - Entries common to all stream dictionaries
Key Type Value
F file specification

(Optional; PDF 1.2) The file containing the stream data. If this entry is present, the bytes between stream and endstream shall be ignored. However, the Length entry shall still specify the number of those bytes (usually, there are no bytes and Length is 0). The filters that are applied to the file data shall be specified by FFilter and the filter parameters shall be specified by FDecodeParms.

7.3.10 Indirect objects

Change the first bulleted list as follows:

Any object in a PDF file may be labelled as an indirect object. This gives the object a unique object identifier by which other objects can refer to it (for example, as an element of an array or as the value of a dictionary entry). The object identifier shall consist of two parts:

  • A PDF object number is a positive (non-zero) decimal integer comprised only of digits. It shall not have a leading PLUS SIGN ("+", 2Bh) and shall not start with leading zeros ("0"). Indirect objects may be numbered sequentially within a PDF file, but this is not required; object numbers may be assigned in any arbitrary order.
    EBNF railroad diagram for a PDF object number 📎
    Figure 1c - EBNF diagram for a PDF object number
  • A PDF generation number is a non-negative decimal integer: its syntax requirements are identical to those of a PDF object number (above), except that the single digit "0" shall also be permitted. In a newly created file, all indirect objects shall have generation numbers of 0. Non-zero generation numbers may be introduced when the file is later updated; see 7.5.4, "Cross-reference table" and 7.5.6, "Incremental updates".
    EBNF railroad diagram for a PDF generation number 📎
    Figure 1d - EBNF diagram for a PDF generation number
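The two syntaxes above can be sketched as regular expressions. This is an informal approximation, with Figures 1c and 1d remaining authoritative; OBJECT_NUMBER, GENERATION_NUMBER and the two helper functions are names assumed for illustration.

```python
import re

# Object number: digits only, no sign, no leading zero, therefore non-zero.
OBJECT_NUMBER = re.compile(rb"[1-9][0-9]*")
# Generation number: same syntax, but the single digit "0" is also allowed.
GENERATION_NUMBER = re.compile(rb"0|[1-9][0-9]*")


def is_object_number(token: bytes) -> bool:
    return OBJECT_NUMBER.fullmatch(token) is not None


def is_generation_number(token: bytes) -> bool:
    return GENERATION_NUMBER.fullmatch(token) is not None
```

Tokens such as +1 and 012 are rejected by both patterns, and 0 is accepted only as a generation number.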

...

7.4 Filters

7.4.1 General

Change the first paragraph as follows:

Stream filters are introduced in 7.3.8, "Stream objects". An option when reading stream data is to decode it using a filter to produce the original non-encoded data. Whether to do so and which decoding filter or filters to use shall be specified in the stream dictionary. All stream data shall follow the appropriate format(s) as described below.

...

Change the paragraph introducing the bulleted list above NOTE 1 as follows:

PDF processors shall support a standard set of filters that fall into two main categories:

  • ...
  • ...

7.4.3 ASCII85Decode filter

Change the second paragraph as follows:

The ASCII base-85 encoding shall use the ASCII characters ! through u ((21h) - (75h)) and the character z (7Ah), with the 2-character sequence ~> (7Eh)(3Eh) as its EOD marker. The ASCII85Decode filter shall ignore all white-space characters (see 7.2, "Lexical conventions"). If the ASCII85Decode filter encounters the character ~ in its input, the next character shall be > and the filter will reach EOD. Any other characters, and any character sequences that represent impossible combinations in the ASCII base-85 encoding, shall cause an error.

Insert a new NOTE below the second paragraph as follows:

NOTE: the Adobe PostScript Language Reference Manual (PLRM), Third Edition, clause 3.13.3 defines the above parsing and error requirements for the ASCII base-85 EOD ~> marker.

Change last bulleted list as follows:

The following conditions shall never occur in a correctly encoded byte sequence:

  • The value represented by a group of 5 characters is greater than 2^32 - 1.
  • A z character occurs in the middle of a group.
  • A final partial group contains only one character.
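The error conditions listed above can be illustrated with a small decoder. This is a sketch under stated assumptions rather than a reference implementation: it treats a missing EOD marker as an error, uses the six PDF white-space bytes, and pads a final partial group with the character u before truncating. The names ascii85_decode and _group_value are hypothetical.

```python
WHITESPACE = b"\x00\t\n\x0c\r "


def _group_value(digits) -> int:
    """Combine five base-85 digits and enforce the 2^32 - 1 limit."""
    value = 0
    for d in digits:
        value = value * 85 + d
    if value > 2**32 - 1:
        raise ValueError("group value exceeds 2^32 - 1")
    return value


def ascii85_decode(data: bytes) -> bytes:
    out = bytearray()
    group = []
    i = 0
    while i < len(data):
        c = data[i]
        i += 1
        if c in WHITESPACE:
            continue
        if c == 0x7E:                      # '~' shall be followed by '>'
            if data[i:i + 1] != b">":
                raise ValueError("'~' not followed by '>'")
            break
        if c == 0x7A:                      # 'z' stands for four zero bytes
            if group:
                raise ValueError("'z' in the middle of a group")
            out += b"\x00\x00\x00\x00"
            continue
        if not 0x21 <= c <= 0x75:
            raise ValueError("character outside the range ! through u")
        group.append(c - 0x21)
        if len(group) == 5:
            out += _group_value(group).to_bytes(4, "big")
            group = []
    else:
        # Assumption of this sketch: input without an EOD marker is rejected.
        raise ValueError("missing EOD marker")
    if group:
        if len(group) == 1:
            raise ValueError("final partial group contains only one character")
        n = len(group)
        padded = group + [84] * (5 - n)    # pad with 'u' (digit value 84)
        out += _group_value(padded).to_bytes(4, "big")[:n - 1]
    return bytes(out)
```

A final partial group of n characters yields n - 1 bytes, which is why a single trailing character cannot represent any data.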

7.4.4 LZWDecode and FlateDecode filters

...

7.4.4.3 LZWDecode and FlateDecode parameters

...

Change Table 8 as follows:

Table 8 - Optional parameters for LZWDecode and FlateDecode filters
Key Type Value
BitsPerComponent integer

(May be used only if Predictor is greater than 1) The number of bits used to represent each colour component in a sample. Valid values are 1, 2, 4, 8, and (PDF 1.5) 16. Default value: 8.

NOTE there is no relationship between this parameter and the similarly named key in image dictionaries.

7.4.9 JPXDecode filter

Change paragraph below NOTE 5 as follows:

Data used in PDF image XObjects shall be limited to the JPX baseline set of features, excluding enumerated colour space 19 (CIEJab). In addition, enumerated colour space 12 (CMYK), which is part of JPX but not JPX baseline, shall be supported in a PDF file. JPX file structures used in PDF files shall conform to the JPEG 2000 specification.

...

7.5 File structure

7.5.2 File header

Change the last paragraph as follows:

If a PDF file contains binary data, as most do (see 7.2, "Lexical conventions"), the header line shall be immediately followed by a line containing only a comment that starts with at least four binary characters–that is, characters whose codes are 128 or greater. This ensures proper behaviour of file transfer applications that inspect data near the beginning of a file to determine whether to treat the file’s contents as text or as binary.
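A check for the binary comment line might look like the sketch below. It assumes the header is the first line of the data and that the comment's PERCENT SIGN is immediately followed by the binary characters; has_binary_marker is a hypothetical helper, not a function of any particular library.

```python
def has_binary_marker(data: bytes) -> bool:
    """Return True if the line after the %PDF- header is a comment whose
    first four bytes after '%' all have codes of 128 or greater."""
    lines = data.splitlines()
    if len(lines) < 2 or not lines[0].startswith(b"%PDF-"):
        return False
    second = lines[1]
    return (
        second.startswith(b"%")
        and len(second) >= 5
        and all(byte >= 128 for byte in second[1:5])
    )
```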

7.5.4 Cross reference table

Change the first paragraph as follows:

The cross-reference table contains information that permits random access to indirect objects within the PDF file so that the entire PDF file need not be read to locate any particular object. The table comprises a one-line entry for each indirect object, specifying the byte offset of that object within the body of the PDF file. Beginning with PDF 1.5, some or all of the cross-reference information may alternatively be contained in cross-reference streams; see 7.5.8, "Cross-reference streams".

Change NOTE 1 as follows and move NOTE 1 below the second paragraph:

NOTE 1 Cross-reference sections are the only part of a PDF file with a fixed format, which permits entries in the sections to be accessed randomly.

Change NOTE 3 as follows:

NOTE 3 The subsection structure is useful for incremental updates, since it allows a new cross-reference section to be added to the PDF file, containing entries only for objects that have been added, modified or deleted. This also means that cross reference subsections of incremental updates can never have an object number of zero.

Change paragraph below NOTE 3 as follows:

Each cross-reference subsection shall contain entries for a contiguous range of object numbers. The subsection shall begin with a line containing only two non-negative integers separated by a single SPACE (20h) and terminated by an end-of-line marker (see 7.2.3, "Character set"). The two non-negative integers denote (respectively) the object number of the first object in this subsection and the number of entries in the subsection.

Change various paragraphs below EXAMPLE 1 as follows:

...

where:

nnnnnnnnnn shall be a 10-digit byte offset in the PDF file

...

The byte offset in the PDF file shall be a 10-digit number, padded with leading zeros if necessary, giving the number of bytes from the beginning of the PDF file to the beginning of the object. ...
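The subsection header line and the fixed-width entry format can be sketched with two regular expressions. The 10-digit offset comes from the text above; the 5-digit generation number and the n or f keyword belong to the parts of the subclause elided here and are stated from the base specification. The function names are hypothetical, and end-of-line handling is simplified.

```python
import re

# Two non-negative integers separated by exactly one SPACE.
SUBSECTION_HEADER = re.compile(rb"([0-9]+) ([0-9]+)")
# 10-digit offset, SPACE, 5-digit generation number, SPACE, keyword n or f.
ENTRY = re.compile(rb"([0-9]{10}) ([0-9]{5}) ([nf])")


def parse_subsection_header(line: bytes):
    """Return (first object number, entry count) for a header line
    supplied without its end-of-line marker."""
    m = SUBSECTION_HEADER.fullmatch(line)
    if m is None:
        raise ValueError(f"malformed subsection header: {line!r}")
    return int(m.group(1)), int(m.group(2))


def parse_entry(line: bytes):
    """Return (byte offset, generation number, in_use) for one entry."""
    m = ENTRY.fullmatch(line.rstrip(b" \r\n"))
    if m is None:
        raise ValueError(f"malformed cross-reference entry: {line!r}")
    return int(m.group(1)), int(m.group(2)), m.group(3) == b"n"
```

A header written with two spaces, such as "0  6", is rejected by this sketch, matching the single SPACE requirement.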

Change EXAMPLE 2 as follows:

EXAMPLE 2 The cross-reference table sub-section line requires a single SPACE between "0" and "6".

Change EXAMPLE 3 as follows:

EXAMPLE 3 The cross-reference table first sub-section line requires a single SPACE between "0" and "1".

EDITOR NOTE: The typeface of Example 3 should be all monospaced and with single SPACEs between all cross-reference fields, and thus all cross-reference data fields vertically aligned.

7.5.5 File trailer

Change first paragraph as follows:

The trailer of a PDF file enables a PDF processor to quickly find the cross-reference table and certain special objects. PDF processors should read a PDF file from its end. The last line of the file shall contain only the end-of-file marker, %%EOF. The two preceding lines shall contain, one per line and in order, the keyword startxref and the byte offset from the beginning of the PDF file to the beginning of the xref keyword in the last cross-reference section or the beginning of the previous cross-reference stream (see 7.5.8, "Cross-reference streams"). The startxref line shall be preceded by the trailer dictionary, consisting of the keyword trailer followed by a series of key-value pairs enclosed in double angle brackets (<<...>>) (using LESS-THAN SIGNs (3Ch) and GREATER-THAN SIGNs (3Eh)). Thus, the trailer has the following overall structure:

Change Table 15 as follows:

Table 15 - Entries in the file trailer dictionary
Key Type Value
Size integer

(Required; shall not be an indirect reference) This value shall be 1 greater than the highest object number defined in the PDF file.

NOTE 1: this is equivalent to the total number of entries in the PDF file’s cross-reference table, as defined by the combination of the original section and all update sections (see 7.5.4 "Cross-reference table").

Any object in a cross-reference section whose number is greater than this value shall be ignored and defined to be missing by a PDF reader.

Prev integer

(Optional; present only if the file has more than one cross-reference section; shall be a direct object) The byte offset from the beginning of the PDF file to the beginning of the previous cross-reference section.

Info dictionary

(Optional; shall be an indirect reference) ...

7.5.6 Incremental updates

...

Change NOTE 1 as follows:

NOTE 1 The main advantage to updating a PDF file in this way is that small changes to a large document can be saved quickly. There are additional advantages such as when editing a document across an HTTP connection or using OLE embedding (a Microsoft Windows™ specific technology), or when a PDF processor cannot overwrite the contents of the original PDF file. Incremental updates are used to save changes to documents in these contexts.

...

Change the paragraph above NOTE 4 and NOTE 4 as follows:

In versions of PDF 1.4 or later a PDF writer may use the Version entry in the document’s catalog dictionary (see 7.7.2, "Document catalog dictionary") to upgrade the current version of the PDF specification to which the document conforms (considering both the document header (see 7.5.2, "File header") and the catalog dictionary Version key value, if already present). The catalog of an incremental update shall not reduce the version of the document with the value, or absence, of the Version entry. A PDF writer may also need to update the Extensions dictionary, see 7.12, "Extensions dictionary", if the update either deleted or added developer-defined extensions.

NOTE 4 The Version entry enables the version to be upgraded when performing an incremental update.
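One way a reader might combine the two version sources is sketched below, assuming versions are written as "major.minor" strings and that the later of the header version and the catalog Version value applies. The function effective_version is hypothetical and ignores error handling for malformed values.

```python
def effective_version(header_version: str, catalog_version=None) -> str:
    """Return the later of the header version and the catalog Version value."""

    def as_tuple(version: str):
        major, minor = version.split(".")
        return int(major), int(minor)

    if catalog_version is None:
        return header_version
    return max(header_version, catalog_version, key=as_tuple)
```

Under this reading, a catalog Version value lower than the header version has no effect, which is consistent with the rule that an update shall not reduce the version.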

...

7.5.7 Object streams

Append a new bullet to the bulleted list as follows:

The following objects shall not be stored in an object stream:

  • ...
  • The document catalog (see 7.7.2 Document catalog dictionary) in an encrypted document

Append the following paragraph after the bulleted list as follows:

Any entry's value in an ObjStm dictionary shall be either a direct object or an indirect uncompressed object.

NOTE 3 Indirect references to objects inside object streams use the normal syntax: for example, 14 0 R. Access to these objects requires a different way of storing cross-reference information; see 7.5.8, "Cross-reference streams". Use of compressed objects requires a PDF 1.5 PDF reader. However, compressed objects can be stored in a manner that a PDF 1.4 PDF reader can ignore.

Insert the following new NOTE 4 after NOTE 3 as follows:

NOTE 4: Including the document catalog in an object stream has interoperability implications, particularly for encrypted documents. If the catalog dictionary is part of an object stream, a PDF processor reading the document must first process that object stream before it can access potentially relevant document metadata, including the declared PDF version, developer extensions and XMP metadata.

...

7.5.8 Cross-reference streams

7.5.8.2 Cross-reference stream dictionary

...

Change the first bullet point as follows:

  • The values of all entries shown in "Table 17 - Additional entries specific to a cross-reference stream dictionary" shall be direct objects; indirect references shall not be permitted. The values of all entries shown in "Table 5 - Entries common to all stream dictionaries" shall also be direct objects. For arrays, all array elements shall be direct objects and for dictionaries, all key values shall be direct objects as well. The F entry defined in Table 5 shall not be used.

    Append new informative NOTE below the first bullet as follows:

    NOTE: Metadata streams (see 14.3.2, "Metadata streams") and Associated Files (see 14.13, "Associated Files") are thus not allowed in cross-reference stream dictionaries.

  • ...

7.5.8.4 Compatibility with applications that do not support compressed reference streams

Change Table 19 as follows:

Table 19 - Additional entries in a hybrid-reference file’s trailer dictionary
Key Type Value
XRefStm integer

(Optional) The byte offset to the cross-reference stream, calculated from the beginning of the PDF file.

...

7.6 Encryption

7.6.2 Application of encryption

Add new NOTE 1 after the 4th bullet in the first bulleted list below the first paragraph as follows:

Encryption applies to all strings and streams in the document's PDF file, with the following exceptions:

  • ...
  • Any hexadecimal strings representing the value of the Contents key in a Signature dictionary

NOTE 1 For the signature schemes enumerated in ISO 32000-1 and in this document, the value of the Contents key in a Signature dictionary is always a hexadecimal string (see "Table 255 — Entries in a signature dictionary").

Encryption is not applied to other object types such as integers and boolean values, which are used primarily to convey information about the document's structure rather than its contents. ...

...

7.6.3 General encryption algorithm

7.6.3.1 General

Change NOTE 1 as follows:

NOTE 1 The name RC4™ is a registered trademark of RSA Security Inc. and cannot be used by third parties creating implementations of the algorithm. Proprietary implementations of the RC4 encryption algorithm are available under license from RSA Security Inc. For licensing information, contact: RSA Security Inc. 2955 Campus Drive, Suite 400, San Mateo, CA 94403-2507, USA, or http://www.rsasecurity.com/.

...

7.6.4 Standard security handler

7.6.4.1 General

Change the second paragraph above NOTE 2 as follows:

If a security handler of revision 4 or 5 is specified, the standard security handler shall support crypt filters (see 7.6.6, "Crypt filters"). The support shall be limited to the Identity crypt filter (see "Table 26 - Standard crypt filter names") and a crypt filter named StdCF whose dictionary contains an AuthEvent value of DocOpen. For revision 4, the filter CFM value shall be V2 (RC4) or AESV2 (AES-128). For revision 6, the filter CFM value shall be AESV3 (AES-256). Public-Key security handlers in this case shall use a crypt filter named DefaultCryptFilter when all document content is encrypted, and shall use a crypt filter named DefEmbeddedFile when file attachments only are encrypted in place of StdCF name. This nomenclature shall not be used as an indicator of the type of the security handler or encryption. Use of security handler revisions 1, 2, 3, 4 and 5 is deprecated in PDF 2.0.

...

7.6.4.3.2 Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)

Change NOTE 2 as follows:

NOTE 2 The first element of the ID array, as used in 7.6.4.3.2, "Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)", step e, generally remains unchanged across revisions of a given document. However, since this is not guaranteed, use of the ID in computation of the file encryption key, as required when using 7.6.4.3.2, "Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)", can complicate updates to the document. For this reason, security handlers are encouraged to use Algorithm 2.A or higher, which do not use the ID in file encryption key computation.

Insert new NOTE 3 immediately below NOTE 2 as follows:

NOTE 3 This algorithm, when applied to the user password string, produces the file encryption key used to encrypt or decrypt string and stream data according to 7.6.3.2, "Algorithm 1: Encryption of data using the RC4 or AES algorithms". Parts of this algorithm are also used in the algorithms described in 7.6.4.4, "Password algorithms".

7.6.4.3.3 Algorithm 2.A: Retrieving the file encryption key from an encrypted document in order to decrypt it (revision 6 and later)

Insert new NOTE below bullet (f) as follows:

  1. Decrypt the 16-byte Perms string using AES-256 in ECB mode with an initialization vector of zero and the file encryption key as the key. ...

NOTE This algorithm, when applied to the user password string, produces the file encryption key used to encrypt or decrypt string and stream data according to 7.6.3.3, "Algorithm 1.A: Encryption of data using the AES algorithms". Parts of this algorithm are also used in the algorithms described in 7.6.4.4, "Password algorithms".

7.6.4.3.4 Algorithm 2.B: Computing a hash (revision 6 and later)

Change bullet (a) as follows:

  1. Make a new string K0 as follows:

    • When checking the owner password or creating the owner key, K0 is the concatenation of the input password, K, and the 48-byte user key.
    • Otherwise, K0 is the concatenation of the input password and K.

    Next, set K1 to 64 repetitions of K0.

  2. ...
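The construction of K0 and K1 in the first step can be sketched as follows. This covers only that step, not the remainder of Algorithm 2.B; the function name build_k1 and its parameters are assumptions made for illustration, with K passed in as a byte string.

```python
def build_k1(password: bytes, k: bytes, user_key: bytes = b"",
             for_owner: bool = False) -> bytes:
    """Return K1, which is 64 repetitions of K0.

    for_owner is True when checking the owner password or creating the
    owner key; only then is the 48-byte user key appended to K0.
    """
    if for_owner:
        if len(user_key) != 48:
            raise ValueError("the user key must be 48 bytes")
        k0 = password + k + user_key
    else:
        k0 = password + k
    return k0 * 64
```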

7.6.4.4.9 Algorithm 10: Computing the encryption dictionary's Perms (permissions) value (Security handlers of revision 6)

Change bullet (f) as follows:

  1. Encrypt the 16-byte block using AES-256 in ECB mode, using the file encryption key as the key. The result (16 bytes) is stored as the Perms string, and checked for validity when the file is opened.

7.6.4.4.12 Algorithm 13: Validating the permissions (Security handlers of revision 6)

Change bullet (a) as follows:

  1. Decrypt the 16-byte Perms string using AES-256 in ECB mode with the file encryption key as the key. ...

7.6.5 Public-key security handlers

7.6.5.1 General

...

7.6.5.2 Public-key security dictionary

...

Change the paragraph below the NOTE as follows:

Permitted values of the SubFilter entry for use with conforming public-key security handlers are adbe.pkcs7.s3 (PDF 1.3), adbe.pkcs7.s4 (PDF 1.4), which shall be used when not using crypt filters (see 7.6.6, "Crypt filters") and adbe.pkcs7.s5 (PDF 1.5), which shall be used when using crypt filters.

...

Insert a new subclause heading immediately below the NOTE below Table 23 as follows:

7.6.5.3 Public-key security permissions

EDITOR NOTE: current text and Table 24 remain unchanged.

Renumber the next clause appropriately:

7.6.5.4 Public-key encryption algorithms

Change the second bullet as follows:

  • ...
  • A 4-byte value defining the permissions, most significant byte first. See 7.6.5.3, "Public-key security permissions" and "Table 24 — Public-key security handler user access permissions" for the possible permission values.
  • ...

Add two new notes at the very end of the sub-clause as follows:

NOTE 1: This means that step c) only applies when both of the following conditions are met:

  • the key is being generated for the crypt filter named DefaultCryptFilter (i.e. the crypt filter used as the value for StmF in the encryption dictionary);
  • the EncryptMetadata entry of the associated crypt filter dictionary is set to false.

NOTE 2: Since crypt filters are not supported when SubFilter is set to adbe.pkcs7.s3 or adbe.pkcs7.s4 in the encryption dictionary, there is no way to specify that metadata is to be left unencrypted in these cases. In particular, step c) is always skipped for these SubFilter values.

7.6.6 Crypt filters

Change the first bullet in the sub-clause as follows:

PDF 1.5 introduces crypt filters, which provide finer granularity control of encryption within a PDF file. The use of crypt filters involves the following structures:

  • The encryption dictionary (see "Table 20 - Entries common to all encryption dictionaries") contains entries that enumerate the crypt filters in the document (CF) and specify which ones are used by default to decrypt all the streams (StmF) and strings (StrF) in the document. In addition, the value of the V entry shall be 4 or 5 to use crypt filters.

...

Change Table 25 as follows:

Table 25 - Entries common to all crypt filter dictionaries
Key Type Value
Length integer

(Required; deprecated in PDF 2.0) ...

When CFM is AESV2, the Length key shall have the value of 128 for public-key security handlers, and 16 for the standard security handler. When CFM is AESV3, the Length key shall have a value of 256 for public-key security handlers, and 32 for the standard security handler.

...
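The Length values stated for AESV2 and AESV3 can be captured in a small lookup. This is a sketch only: the handler labels "public-key" and "standard" and the function length_is_valid are names assumed here, and other CFM values are simply not checked.

```python
# Expected Length values, keyed by (CFM value, kind of security handler).
EXPECTED_LENGTH = {
    ("AESV2", "public-key"): 128,
    ("AESV2", "standard"): 16,
    ("AESV3", "public-key"): 256,
    ("AESV3", "standard"): 32,
}


def length_is_valid(cfm: str, handler: str, length: int) -> bool:
    """Return True if Length matches the value required for this CFM and
    handler, or if this sketch has no requirement recorded for them."""
    expected = EXPECTED_LENGTH.get((cfm, handler))
    return expected is None or length == expected
```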

Change Table 27 as follows:

Table 27 - Additional crypt filter dictionary entries for public-key security handlers
Key Type Value
Recipients byte string or array

(Required) If the crypt filter is referenced from StmF or StrF in the encryption dictionary, this entry shall be an array of byte strings, where each byte string shall be a binary-encoded CMS object that shall ...

...

If the crypt filter is referenced from a Crypt filter decode parameter dictionary (see "Table 14 - Optional parameters for Crypt filters"), this entry shall be a byte string that shall be a binary-encoded CMS object that shall ...

...

Correct the last example in subclause 7.6.6 as follows (a SLASH was missing from the key value name "V2"):

...
8 0 obj                            %Encryption dictionary
<</Filter /MySecurityHandlerName
  /V 4                             %Version 4: allow crypt filters
  /CF                              %List of crypt filters
    <</MyFilter0
      <</Type /CryptFilter
        /CFM /V2>>                 %Uses the standard algorithm
    >>
...
...

7.7 Document structure

7.7.1 General

Move Figure 5 from subclause 7.7.2 to after the first paragraph, and update Figure 5 as follows:

Updated Figure 5 - Structure of a PDF document, additionally showing Document Part in PDF DOM
Figure 5 - Structure of a PDF document

7.7.2 Document catalog dictionary

Append a new sentence to the end of the first paragraph as follows:

The root of a document’s object hierarchy is the catalog dictionary, located by means of the Root entry in the trailer of the PDF file (see 7.5.5, "File trailer"). The catalog dictionary contains references to other objects defining the document’s contents, outline, article threads, named destinations, and other attributes. In addition, it contains information about how the document shall be displayed on the screen, such as whether its outline and thumbnail page images shall be displayed automatically and whether some location other than the first page shall be shown when the document is opened. "Table 29 — Entries in the catalog dictionary" shows the entries in the catalog dictionary. For encrypted documents, the catalog dictionary shall not be in an object stream (see 7.5.7 Object streams).

Move Figure 5 - Structure of a PDF document from here to subclause 7.7.1:

Figure 5 - Structure of a PDF document

Change Table 29 as follows:

Table 29 - Entries in the catalog dictionary
Key Type Value
Extensions dictionary (Optional; shall be a direct object; ISO 32000-1) ...
Dests dictionary (Optional; PDF 1.1; shall be an indirect reference) ...
Outlines dictionary (Optional; shall be an indirect reference) ...
Threads array (Optional; PDF 1.1; shall be an indirect reference) ...
Lang text string

(Optional; PDF 1.4) A language identifier that shall specify the natural language for all text in the document except where overridden by language specifications for structure elements or marked-content (see 14.9.2, "Natural language specification"). If this entry is absent or invalid (see 14.9.2, "Natural language specification"), the language shall be considered unknown.

NOTE All text in a document includes PDF text strings (see 7.9.2.2 "Text string type") as well as textual content.

StructTreeRoot dictionary (Optional; PDF 1.3; shall be an indirect reference) ...

7.7.3 Page tree

7.7.3.2 Page tree nodes

Change Table 30 as follows:

Table 30 - Required entries in a page tree node
Key Type Value
Kids array (Required) An array of indirect references to the immediate children of this node. The children shall only be page objects or other page tree nodes (null entries shall not be present). The length of the array shall be at least one.
Count integer (Required) The number of leaf nodes (page objects) that are descendants of this node within the page tree which shall be 1 or greater.

NOTE Since the number of pages descendent from a Pages dictionary can be accurately determined by examining the tree itself using the Kids arrays, the Count entry is redundant.

A PDF writer shall ensure that the value of the Count key is consistent with the number of entries in the Kids array and its descendants which definitively determines the number of descendant pages.
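The consistency requirement on Count can be sketched as a recursive check. The sketch assumes the page tree has already been resolved into nested Python dictionaries, with Type given as "Page" or "Pages" and Kids holding the child dictionaries directly; count_pages and count_is_consistent are hypothetical helpers.

```python
def count_pages(node: dict) -> int:
    """Count the page objects (leaf nodes) at or below this node."""
    if node.get("Type") == "Page":
        return 1
    return sum(count_pages(kid) for kid in node["Kids"])


def count_is_consistent(node: dict) -> bool:
    """Check Count against the Kids arrays for every page tree node."""
    if node.get("Type") == "Page":
        return True
    return (
        node.get("Count") == count_pages(node)
        and all(count_is_consistent(kid) for kid in node["Kids"])
    )
```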

7.7.3.3 Page objects

Change Table 31 as follows:

Table 31 - Entries in a page object
Key Type Value
Contents stream or array

(Optional) A content stream (see 7.8.2, "Content streams") that shall describe the contents of this page. If this entry is absent, the page shall be empty.

NOTE If the Contents key is not present, a Resources dictionary must still be present, either directly or through inheritance, in the pages tree.

...

ID byte string (Optional; PDF 1.3; indirect reference preferred) ...
B array (Optional; PDF 1.1; recommended if the page contains article beads) An array that shall contain indirect references to all article beads appearing on the page (see 12.4.3, "Articles"). The beads shall be listed in the array in natural reading order. Objects of Type Template shall have no B key.

...

7.7.4 Name dictionary

...

Change all occurrences of "name string" to "string" in Table 32 as follows:

Table 32 - Entries in the name dictionary
Key Type Value
Dests name tree

(Optional; PDF 1.2) name tree mapping name strings to destinations (see 12.3.2.4, "Named destinations").

AP name tree

(Optional; PDF 1.3) name tree mapping name strings to annotation appearance streams (see 12.5.5, "Appearance streams").

JavaScript name tree

(Optional; PDF 1.3) name tree mapping name strings to document-level ECMAScript actions (see 12.6.4.17, "ECMAScript actions").

Pages name tree

(Optional; PDF 1.3) name tree mapping name strings to visible pages for use in interactive forms (see 12.7.7, "Named pages").

Templates name tree

(Optional; PDF 1.3) name tree mapping name strings to invisible (template) pages for use in interactive forms (see 12.7.7, "Named pages").

EmbeddedFiles name tree

(Optional; PDF 1.4) name tree mapping name strings to file specifications for embedded file streams (see 7.11.4, "Embedded file streams"). ...

(PDF 2.0) For unencrypted wrapper documents for an encrypted payload document (see 7.6.7, "Unencrypted wrapper document") the name strings provided in this tree shall not contain or be derived from the encrypted payload document’s actual file name. This is to avoid potential disclosure of sensitive information in the original filename. The name string should match the value of F or UF in the referenced File Specification dictionary.

AlternatePresentations name tree

(Optional; PDF 1.4) name tree mapping name strings to alternate presentations (see 13.5, "Alternate presentations").

Renditions name tree

(Optional; PDF 1.5) A name tree mapping name strings (which shall have a UTF-16BE encoding) to rendition objects (see 13.2.3, "Renditions").

7.8.2 Content streams

Change the paragraph above Table 33 as follows:

Ordinarily, when a PDF reader encounters an operator in a content stream that it does not recognise, an error shall occur. A pair of compatibility operators, BX and EX (PDF 1.1), shall modify this behaviour (see "Table 33 — Compatibility operators"). These operators shall occur in pairs and may be nested. They bracket a compatibility section, a portion of a content stream within which unrecognised operators shall be ignored without error. This mechanism enables a PDF processor to use operators defined in later versions of PDF without sacrificing compatibility with older applications. It should be used only in cases where ignoring such newer operators is the appropriate thing to do. The BX and EX operators are not themselves part of any graphics object (see 8.2, "Graphics objects") or of the graphics state (8.4, "Graphics state"). All pairs of matching operators (marked-content operators BMC, BDC, and EMC (see 14.6, "Marked content"); text object operators BT and ET (see 9.4, "Text objects"); the compatibility operators BX and EX (see "Table 33 - Compatibility operators") and the graphics state save and restore operators q and Q (see "Table 56 - Graphics state operators")) shall be properly (separately) nested.
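As an informal illustration of the compatibility-section rule above, the following Python sketch walks a list of operator tokens and tracks the BX/EX nesting depth. The function name, the token-list input and the set of known operators are assumptions made for this example; operand handling and the other paired operators are left out.

```python
def check_operators(operators, known):
    """Return the unrecognised operators that were ignored inside BX ... EX.

    `operators` is a list of operator tokens and `known` is the set of
    operators the (hypothetical) reader recognises.
    """
    depth = 0      # current nesting depth of compatibility sections
    ignored = []
    for op in operators:
        if op == "BX":
            depth += 1
        elif op == "EX":
            if depth == 0:
                raise ValueError("EX without matching BX")
            depth -= 1
        elif op not in known:
            if depth == 0:
                # Outside any compatibility section this is an error.
                raise ValueError(f"unrecognised operator {op!r}")
            ignored.append(op)  # inside BX ... EX: ignored without error
    if depth != 0:
        raise ValueError("BX without matching EX")
    return ignored
```

Because BX and EX may be nested, a depth counter rather than a single flag is used, so an inner EX does not end the outer compatibility section.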

7.8.3 Resource dictionaries

Change the bulleted list below EXAMPLE 1 as follows:

A resource dictionary shall be associated with a content stream in one of the following ways:

  • For a content stream that is the value of a page's Contents entry (or is an element of an array that is the value of that entry), the resource dictionary shall be designated by the page dictionary's Resources entry or is inherited, as described under 7.7.3.4, "Inheritance of page attributes" from some ancestor node of the page object. PDF writers should not use this inheritance feature of PDF as its use can cause undue complexity for a PDF reader. A PDF writer should only include resource definitions for resources that are actually referenced by the content streams of the associated page in the Resources dictionary. If the content streams of multiple pages require exactly the same set of resources, a single Resources dictionary may be shared between them by using indirect references. If each page requires different sets of resources, then each should be written with its own Resources dictionary.
  • Content streams that define the glyph descriptions of a Type 3 font shall include a Resources entry in the Type 3 font dictionary specifying all the resources used by all the content streams in the CharProcs dictionary of a Type 3 font.

    If a glyph description content stream in the CharProcs entry of a Type 3 font uses named resources directly then those resources shall be present in the resource dictionary designated by the first Resources entry found in the following search order:

    1. the stream dictionary of that glyph description content stream;
    2. the parent Type 3 font dictionary that contained the CharProcs entry with the glyph description content stream;

    If there is no Resources dictionary explicitly associated with the Type 3 glyph description content stream or Type 3 font dictionary:

    1. the parent page dictionary on which the Type 3 font is used;
    2. resource inheritance from ancestor nodes of the parent page dictionary (see 7.7.3.4 "Inheritance of page attributes").

    NOTE 2: Named resources referenced by a resource, such as an XObject referenced from a glyph description content stream, would be included in the Resources dictionary of that resource rather than in the designated resource dictionary of the glyph description content stream.

  • For other types of content streams, a PDF writer shall include a Resources entry in the stream's dictionary specifying a resource dictionary which contains all named resources used by that content stream. This shall apply to content streams that define form XObjects (see "Table 93 — Additional entries specific to a Type 1 form dictionary"), patterns (see "Table 74 — Additional entries specific to a Type 1 pattern dictionary"), and annotation appearances (see 12.5.5 "Appearance streams").
  • PDF files written obeying earlier versions of PDF may have omitted the Resources entry in all form XObjects and Type 3 fonts used on a page. All resources that are referenced from those forms and fonts shall be inherited from the resource dictionary of the page on which they are used.

NOTE 3 PDF files written obeying earlier versions of PDF may have omitted the Resources entry in form XObjects, Type 3 glyph descriptions or annotation appearance streams used on a page. Those earlier versions state that resources that were referenced from those content streams can be inherited from the resource dictionary of the page on which they are used.

NOTE 4 Linearized PDF files impose additional requirements on resources - see "Annex F - (normative) Linearized PDF".
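The search order given above for the resources of a Type 3 glyph description can be sketched in Python as follows. Dictionaries are modelled as plain Python dicts, a page tree node reaches its ancestor through a Parent entry, and the function name is hypothetical; this is an illustration of the lookup order only, not a normative algorithm.

```python
def find_glyph_resources(glyph_stream_dict, font_dict, page_node):
    """Return the first Resources dictionary found in the search order."""
    # 1. the glyph description's own stream dictionary,
    # 2. the parent Type 3 font dictionary.
    for candidate in (glyph_stream_dict, font_dict):
        if "Resources" in candidate:
            return candidate["Resources"]
    # Otherwise: the page on which the font is used, then its ancestors
    # (resource inheritance through the page tree).
    node = page_node
    while node is not None:
        if "Resources" in node:
            return node["Resources"]
        node = node.get("Parent")
    return None
```

For example, with no Resources entry in the glyph stream, the font or the page, the lookup falls through to the nearest page tree ancestor that has one.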

...

Change Table 34 as follows:

Table 34 - Entries in resource dictionary
Key Type Value
ColorSpace dictionary

(Optional) A dictionary that maps each resource name to either the name of a colour space with no additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, or Pattern), or an array describing a colour space (see 8.6, "Colour spaces").

7.9 Common data structures

7.9.1 General

Replace Table 35 with the following table and NOTE:

Table 35 - PDF data types (informative)
Type Description Subclause
ASCII string A string object containing bytes encoded as ASCII characters. 7.3.4, 7.9.2
array An array object. 7.3.6
boolean A Boolean object. 7.3.2
byte string A string object containing bytes where the encoding is determined by the context. 7.3.4, 7.9.2, 7.9.2.4
date A string object that represents a date. 7.3.4, 7.9.2, 7.9.4
dictionary A dictionary object. 7.3.7
file specification A file specification (dictionary or string) 7.11
function A function object (dictionary or stream) that represents a parameterised function, including mathematical formulas or sampled representations with arbitrary resolution. 7.10
integer An integer number object 7.3.3
name A name object 7.3.5
name tree A name tree data structure (dictionary) 7.9.6
null The null object 7.3.9
number A numeric object (integer or real). NOTE: the term "real" may also be used in this specification to represent a numeric object. 7.3.3
number tree A number tree data structure (dictionary) 7.9.7
PDFDocEncoded string A type of text string containing information intended to be human-readable that is encoded using the single-byte PDFDocEncoding. 7.9.2, 7.9.2.3
rectangle A rectangle (array with 4 numeric elements) 7.9.5
stream A stream object (including the stream extent dictionary) 7.3.8
string A string object that may be further qualified as either a text string, an ASCII string, or a byte string. 7.3.4, 7.9.2
text string A type of string object containing information that is intended to be human-readable, and that is encoded as either PDFDocEncoding, UTF-16BE, or UTF-8 (PDF 2.0) 7.9.2, 7.9.2.2
text stream A text stream object (including the stream extent dictionary) 7.9.3

NOTE: unless otherwise stated in this specification, all string objects may be written as either a literal string or a hexadecimal string as described in "7.3.4 - String objects".

7.9.2 String object types

7.9.2.2 Text string type

7.9.2.2.1 General

Change EXAMPLE 1 as follows:

EXAMPLE 1 A PDF dictionary containing key 'Key' with the value that is the text string "text‰" will look like

<</Key (text\213) >>

where the character '‰' after 'text' is represented by the hex code 8Bh (octal code 213), according to "D.2 Latin character set and encodings".

...

Change EXAMPLE 2 as follows:

EXAMPLE 2 A PDF dictionary containing key 'Key' with the value that is the text string "тест" (the Russian word for 'test') will look like

<</Key <FEFF0442043504410442> >>

where the hexadecimal string is the sequence of bytes with hex codes FE, FF, 04, 42, 04, 35, 04, 41, 04, 42.
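The byte values quoted in the two examples above can be reproduced with a short Python sketch. The helper names are hypothetical. The only encoding facts relied on are the ones the examples state: byte 8Bh (octal 213) for the per mille sign in PDFDocEncoding, and the FE FF prefix followed by UTF-16BE code units.

```python
import codecs


def utf16be_text_string(text):
    """Bytes of a text string: FE FF byte order marker, then UTF-16BE."""
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


def as_hex_string(data):
    """Write bytes as a PDF hexadecimal string."""
    return "<" + data.hex().upper() + ">"


def octal_escape(byte_value):
    """Backslash escape of one byte for use inside a literal string."""
    return "\\%03o" % byte_value


example_1 = "(text" + octal_escape(0x8B) + ")"          # (text\213)
example_2 = as_hex_string(utf16be_text_string("тест"))  # <FEFF0442043504410442>
```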

...

Change NOTE 4 as follows:

NOTE 4 This mechanism precludes beginning a string using PDFDocEncoding with the three characters idieresis, guillemotright, questiondown, which is unlikely to be a meaningful beginning of a word or phrase.

Delete NOTE 5 as follows:

NOTE 5 It is important not to confuse UTF-16BE with UCS2 (i.e. wchar_t). UTF-16 is not a fixed width encoding scheme.

7.9.2.4 Byte string type

Change the first paragraph as follows, including adding an EXAMPLE and a new NOTE:

The byte string type shall be used for binary data that shall be represented as a series of bytes, where each byte may be any value representable in 8 bits. Byte string type is a subtype of string type. Unless otherwise stated in this document, a byte string may be either a literal string (see 7.3.4.2, "Literal strings") or a hexadecimal string (see 7.3.4.3, "Hexadecimal strings").

EXAMPLE Byte strings are used to define a file identifier (see 14.4, "File identifiers") that is specified in the ID entry of the PDF file trailer (see "Table 15 — Entries in the file trailer dictionary"). If written in hexadecimal form, an ID array entry looks like:

<B6FB54F3F8554D478DC874F11DAD0F11>

NOTE 1 The Contents entry of a Signature dictionary can be required to be a hexadecimal string (see "Table 255 - Entries in a signature dictionary").

NOTE 2 The string can represent characters but the encoding is not known. The bytes of the string do not have to represent characters.

7.9.4 Dates

...

Change the last paragraph before the EXAMPLE as follows:

The prefix “D:” shall be present, the year field (YYYY) shall be present and all other fields may be present but only if all of their preceding fields are also present. The APOSTROPHE following the hour offset field (HH) shall only be present if the HH field is present. The minute offset field (mm) shall only be present if the APOSTROPHE following the hour offset field (HH) is present. The default values for MM and DD shall be both 01; all other numerical fields shall default to zero values. A PLUS SIGN as the value of the O field signifies that local time is now and later than UT, a HYPHEN-MINUS signifies that local time is earlier than UT, and the LATIN CAPITAL LETTER Z signifies that local time is equal to UT. If no UT information is specified, the missing timezone offset shall be assumed to be the same as Greenwich Mean Time's timezone offset (+0'00). Regardless of whether the time zone is specified, the rest of the date shall be specified in local time.
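The presence and default rules above can be illustrated with a small Python parser. It assumes the field layout D:YYYYMMDDHHmmSSOHH'mm from the part of 7.9.4 not reproduced here. The function name and the use of a regular expression are choices made for this sketch only.

```python
import re
from datetime import datetime, timedelta, timezone

# Each optional field is nested inside the one before it, so a field can
# only match when all of its preceding fields are present.
_DATE_RE = re.compile(
    r"^D:(\d{4})"        # YYYY (required)
    r"(?:(\d{2})"        # MM
    r"(?:(\d{2})"        # DD
    r"(?:(\d{2})"        # HH
    r"(?:(\d{2})"        # mm
    r"(?:(\d{2})"        # SS
    r"(?:([+\-Z])"       # O
    r"(?:(\d{2})"        # hour offset
    r"(?:'(\d{2})?)?"    # APOSTROPHE, then optional minute offset
    r")?)?)?)?)?)?)?$"
)


def parse_pdf_date(value):
    """Parse a PDF date string into a timezone-aware datetime."""
    m = _DATE_RE.match(value)
    if m is None:
        raise ValueError(f"not a valid PDF date: {value!r}")
    year, month, day, hour, minute, second, o, off_h, off_m = m.groups()
    if o in (None, "Z"):
        # Missing offset is treated as +0'00; Z means equal to UT.
        offset = timedelta(0)
    else:
        offset = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
        if o == "-":
            offset = -offset
    return datetime(
        int(year),
        int(month) if month else 1,   # MM defaults to 01
        int(day) if day else 1,       # DD defaults to 01
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=timezone(offset),
    )
```

For instance, `parse_pdf_date("D:2024")` yields midnight on 1 January 2024 with a zero offset, while a string such as `D:20240913Z` is rejected because the O field appears without the preceding time fields.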

...

7.9.6 Name trees

...

Change Table 36 as follows:

Table 36 - Entries in a name tree node dictionary
Key Type Value
Names array

(Root and leaf nodes only; required in leaf nodes; present in the root node if and only if Kids is not present) Shall be an array of the form

[key1 value1 key2 value2 ...keyn valuen]

where each keyi shall be a string and the corresponding valuei shall be the object associated with that key. The keys shall be sorted in lexical order, as described below. Keys shall not be the null object.

Change the paragraph below Table 36 as follows:

The Kids entries in the root and intermediate nodes define the tree’s structure by identifying the immediate children of each node. The Names entries in the leaf (or root) nodes shall contain the tree’s keys and their associated values, arranged in key-value pairs and shall be sorted lexically in ascending order by key. Shorter keys shall appear before longer ones beginning with the same byte sequence. Any encoding of the keys may be used as long as it is self-consistent; keys shall be compared for equality on a simple byte-by-byte basis.
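The ordering rule above matches the way Python compares `bytes` values: byte by byte, with a shorter key sorting before a longer key that begins with the same bytes. The following sketch checks and searches a single Names array on that basis. The helper names are hypothetical, and walking the Kids hierarchy is not shown.

```python
from bisect import bisect_left


def split_names(names):
    """Split [key1 value1 key2 value2 ...] into parallel key and value lists."""
    if len(names) % 2 != 0:
        raise ValueError("Names array must hold key-value pairs")
    return names[0::2], names[1::2]


def keys_are_sorted(keys):
    """True if the byte-string keys are in ascending lexical order."""
    return all(a <= b for a, b in zip(keys, keys[1:]))


def lookup(names, key):
    """Binary search for `key` in one sorted Names array."""
    keys, values = split_names(names)
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return values[i]
    return None
```

Because equality is decided on raw bytes, two keys that denote the same text in different encodings are distinct keys, which is why the encoding has to be self-consistent within a tree.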

...

7.9.7 Number trees

Change Table 37 as follows:

Table 37 - Entries in a number tree node dictionary
Key Type Value
Nums array

(Root and leaf nodes only; shall be required in leaf nodes; present in the root node if and only if Kids is not present) Shall be an array of the form

[key1 value1 key2 value2 ...keyn valuen]

where each keyi shall be an integer and the corresponding valuei shall be the object associated with that key. The keys shall be sorted in numerical order, analogously to the arrangement of keys in a name tree as described in 7.9.6, "Name trees". Keys shall not be the null object.

7.10 Functions

7.10.3 Type 2 (exponential interpolation) functions

Change the paragraph below Table 40 as follows:

Values of Domain shall constrain x in such a way that:

  • if N is not an integer, all values of x will be non-negative; and
  • if N is negative, no value of x will be zero.

Typically, Domain is declared as [0.0 1.0], and N is a positive number. To clip the output to a specified range the Range attribute shall be used.
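The two constraints on Domain can be expressed as a simple check. The sketch below assumes a single-input function whose Domain is the pair [xmin xmax]; the function name is hypothetical.

```python
def domain_is_valid(domain, n):
    """Check a Type 2 function's Domain against its exponent N."""
    xmin, xmax = domain
    if xmin > xmax:
        raise ValueError("Domain must satisfy xmin <= xmax")
    # Non-integer N: every value of x has to be non-negative.
    if not float(n).is_integer() and xmin < 0:
        return False
    # Negative N: no value of x may be zero.
    if n < 0 and xmin <= 0 <= xmax:
        return False
    return True
```

The typical declaration, Domain [0.0 1.0] with a positive N, passes both tests; the same Domain with a negative N does not, because x = 0 lies inside it.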

...

7.10.5 Type 4 (PostScript calculator) functions

7.10.5.1 General

Change NOTE 1 as follows:

NOTE 1 Although any function can be sampled (in a Type 0 PDF function) and others can be described with exponential functions (Type 2 in PDF), Type 4 functions offer greater flexibility and potentially greater accuracy. For example, a tint transformation function for a hexachrome (six-component) DeviceN colour space with an alternative colour space of DeviceCMYK (see 8.6.6.5, "DeviceN colour spaces") requires a 6-in, 4-out function. If such a function were sampled with m values for each input variable, the number of samples, 4 × m⁶, could be prohibitively large. In practice, such functions can often be written as short, simple PostScript language functions.
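To give a sense of scale for the sample count mentioned in the note, the arithmetic can be written out as below. The value m = 16 is an arbitrary illustrative choice and the function name is hypothetical.

```python
def sample_count(inputs, outputs, m):
    """Number of stored values when each input is sampled at m points."""
    return outputs * m ** inputs


# 6-in, 4-out function with 16 samples per input variable:
hexachrome_samples = sample_count(inputs=6, outputs=4, m=16)  # 67108864
```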

...

7.10.5.2 Operators and operands

Change Table 42 as follows:

Table 42 - Operators in Type 4 functions
Operator Type Operators
Conditional operators if ifelse

7.11 File specifications

7.11.3 File specification dictionaries

Change Table 43 as follows:

Table 43 - Entries in a file specification dictionary
Key Type Value
Type name

(Required if an EF, EP or RF entry is present; recommended always; PDF 1.3) The type of PDF object that this dictionary describes; shall be Filespec for a file specification dictionary.

AFRelationship name (Optional; PDF 2.0) A name value that represents the relationship between the component of this PDF document that refers to this file specification (via an AF array) and the associated file denoted by this file specification dictionary. See 14.13, "Associated files" for more details. These values represent the following relationships:

...

7.11.4 Embedded file streams

7.11.4.1 General

Change Table 44 as follows:

Table 44 - Additional entries in an embedded file stream dictionary
Key Type Value
Subtype name

(Optional, required in the case of an embedded file stream used as an associated file (see 14.13 "Associated files") or as an asset of a RichMedia annotation (see "13.7 Rich media")) ...

Change Table 45 as follows:

Table 45 - Entries in an embedded file parameter dictionary
Key Type Value
CheckSum byte string

(Optional) 16-byte string that is the checksum of the bytes of the uncompressed embedded file. The checksum shall be calculated by applying the standard MD5 message-digest algorithm (defined in Internet RFC 1321) to the bytes of the embedded file stream.

NOTE This is strictly a checksum, and is not used for security purposes.
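A CheckSum value of this kind can be computed with Python's standard `hashlib` module, as sketched below. The helper names are hypothetical, and writing the digest as a hexadecimal string is one possible representation of the 16-byte string, chosen here for readability.

```python
import hashlib


def embedded_file_checksum(file_bytes):
    """16-byte MD5 digest of the uncompressed embedded file bytes."""
    return hashlib.md5(file_bytes).digest()


def as_hex_string(data):
    """Write bytes as a PDF hexadecimal string."""
    return "<" + data.hex().upper() + ">"


checksum = embedded_file_checksum(b"example file contents")
entry = "/CheckSum " + as_hex_string(checksum)
```

The digest is taken over the file data before any stream filters are applied, in line with the "uncompressed embedded file" wording above.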

7.11.6 Collection items

Change Table 47 as follows:

Table 47 - Entries in a collection subitem dictionary
Key Type Value
D text string, date or number

(Optional) The data corresponding to the related entry in the collection field dictionary (see "Table 155 - Entries in a collection field dictionary"). The type of data shall match the data type identified by the corresponding collection field dictionary. Default: none.

P text string

(Optional) A prefix string that shall be concatenated with the text string presented to the user. This entry is ignored when an interactive PDF processor sorts the items in the collection. Default: none.

7.12 Extensions dictionary

7.12.1 General

Add new informative NOTE below the first paragraph as follows:

NOTE: due to the above requirement for direct objects, Metadata streams (see 14.3.2, "Metadata streams") and Associated Files (see 14.13, "Associated Files") cannot be included in extensions dictionaries.

7.12.2 Extensions dictionary

Change Table 48 as follows:

Table 48 - Entries in an extensions dictionary
Key Type Value
Type name (Optional, shall be a direct object if present) The type of PDF object that this dictionary describes; if present, shall be Extensions.

7.12.3 Developer extensions dictionary

Add the following note below Table 49 as follows:

NOTE The URL and ExtensionRevision entries are not exempt from encryption so if a developer extension defines a new PDF encryption algorithm and a PDF is configured to encrypt strings with that algorithm, then those values will not be meaningful to processors that do not support that developer extension.

...

7.12.4 BaseVersion

Change the first paragraph as follows:

The value of the BaseVersion entry shall be a name and shall be consistent with the syntax used for the Version entry value of the catalog dictionary (see 7.7.2, "Document catalog dictionary"). The value of BaseVersion, when treated as a version number, shall be less than or equal to the version of the PDF specification to which this document conforms (see "Table 29 - Entries in the catalog dictionary", considering both the document header (see 7.5.2, "File header") and the catalog dictionary Version key value, if present). The value of BaseVersion may be different from the version number in the document header or that supplied by the Version key in the catalog dictionary. This is because it reflects the version of the standard that has been extended and not the version of this particular file.
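A minimal sketch of the comparison described above follows. It assumes version names of the form major.minor, and assumes the document's effective version is the later of the header version and the catalog Version value, if present. The function names are hypothetical.

```python
def parse_version(name):
    """Turn a version name such as '1.7' into a comparable (major, minor) tuple."""
    major, minor = name.split(".")
    return int(major), int(minor)


def effective_version(header_version, catalog_version=None):
    """Assumed rule: the later of the header and catalog versions applies."""
    version = parse_version(header_version)
    if catalog_version is not None:
        version = max(version, parse_version(catalog_version))
    return version


def base_version_ok(base_version, header_version, catalog_version=None):
    """True if BaseVersion does not exceed the document's effective version."""
    return parse_version(base_version) <= effective_version(
        header_version, catalog_version
    )
```

Comparing tuples rather than strings or floats keeps the check correct should a minor version ever reach two digits.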

...

7.12.5 ExtensionLevel

Change the paragraph as follows:

The value of the ExtensionLevel entry shall be an integer, which shall be interpreted with respect to the BaseVersion value. If a developer has released multiple extensions against the same BaseVersion value, they should be ordered over time and the ExtensionLevel numbers should increase over time.


Last modified: 13 Sept 2024