7. Syntax

7.2 Lexical conventions

7.2.2 Representation

Change the first paragraph and the 3rd (last) bullet in the subsequent list as follows:

A non-encrypted PDF file can be entirely represented using byte values corresponding to the visible printable subset of the ASCII character set defined in INCITS 4-1986 (R2017), plus white-space characters. A PDF file is not restricted to the ASCII character set; it may contain arbitrary bytes, subject to the following considerations:

  • ...
  • ...
  • A PDF file containing binary data shall be transported as a binary file rather than as a text file to ensure that all bytes of the file are faithfully preserved.

...

7.3 Objects

7.3.4 String objects

7.3.4.2 Literal strings

Change first paragraph as follows:

A literal string shall be written as an arbitrary number of characters enclosed in parentheses (LEFT PARENTHESIS (28h) and RIGHT PARENTHESIS (29h)). Any characters may appear in a string except unbalanced parentheses and the backslash (REVERSE SOLIDUS (5Ch)), which shall be treated specially as described in this subclause. Balanced pairs of parentheses within a string require no special treatment.
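The balanced-parentheses rule can be illustrated with a small scanner. This is a minimal sketch, not a full PDF tokenizer: the function name `literal_string_end` is hypothetical, and it assumes that a backslash always protects the single byte that follows it.

```python
def literal_string_end(data: bytes, start: int) -> int:
    """Return the index just past the ")" that closes the literal string
    opening at data[start]. Balanced inner parentheses need no escaping."""
    if data[start:start + 1] != b"(":
        raise ValueError("a literal string must start with '('")
    depth = 0
    i = start
    while i < len(data):
        c = data[i]
        if c == 0x5C:        # REVERSE SOLIDUS: skip the byte it escapes
            i += 2
            continue
        if c == 0x28:        # LEFT PARENTHESIS opens a nesting level
            depth += 1
        elif c == 0x29:      # RIGHT PARENTHESIS closes a nesting level
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced parentheses in literal string")
```

For instance, `literal_string_end(b"(a (b) c) rest", 0)` stops after the outer closing parenthesis, while an escaped `\)` inside the string does not end it.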

...

7.3.8.2 Stream extent

Change Table 5 as follows:

Table 5 - Entries common to all stream dictionaries
Key Type Value
F file specification

(Optional; PDF 1.2) The file containing the stream data. If this entry is present, the bytes between stream and endstream shall be ignored. However, the Length entry shall still specify the number of those bytes (usually, there are no bytes and Length is 0). The filters that are applied to the file data shall be specified by FFilter and the filter parameters shall be specified by FDecodeParms.

7.4.3 ASCII85Decode filter

Change last bulleted list as follows:

The following conditions shall never occur in a correctly encoded byte sequence:

  • The value represented by a group of 5 characters is greater than 2^32 - 1.
  • A z character occurs in the middle of a group.
  • A final partial group contains only one character.
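The three conditions above can be checked mechanically. The following is a rough sketch under stated assumptions: the function name `check_ascii85_groups` is hypothetical, the input is assumed to be the encoded data with white space and the end-of-data marker already removed, and the extra check for characters outside `!` to `u` is an addition that is not one of the three listed conditions.

```python
def check_ascii85_groups(encoded: bytes) -> None:
    """Raise ValueError if the encoded bytes hit one of the listed conditions."""
    group = []
    for byte in encoded:
        ch = chr(byte)
        if ch == "z":
            # "z" abbreviates a whole all-zero group, so it may only
            # appear on a group boundary.
            if group:
                raise ValueError("'z' in the middle of a group")
            continue
        if not "!" <= ch <= "u":
            raise ValueError(f"character outside the range ! to u: {ch!r}")
        group.append(byte - 33)
        if len(group) == 5:
            value = 0
            for digit in group:
                value = value * 85 + digit
            if value > 2**32 - 1:
                raise ValueError("group value exceeds 2^32 - 1")
            group = []
    if len(group) == 1:
        raise ValueError("final partial group contains only one character")
```

The group `s8W-!` encodes exactly 2^32 - 1 and passes, whereas `uuuuu` (85^5 - 1) is rejected.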

7.4.9 JPXDecode filter

Change paragraph below NOTE 5 as follows:

Data used in PDF image XObjects shall be limited to the JPX baseline set of features, excluding enumerated colour space 19 (CIEJab). In addition, enumerated colour space 12 (CMYK), which is part of JPX but not JPX baseline, shall be supported in a PDF file. JPX file structures used in PDF files shall conform to the JPEG 2000 specification.

...

7.5 File structure

7.5.4 Cross reference table

Change NOTE 3 as follows:

NOTE 3 The subsection structure is useful for incremental updates, since it allows a new cross-reference section to be added to the PDF file, containing entries only for objects that have been added, modified or deleted. This also means that cross reference subsections of incremental updates can never have an object number of zero.

Change paragraph below NOTE 3 as follows:

Each cross-reference subsection shall contain entries for a contiguous range of object numbers. The subsection shall begin with a line containing only two non-negative integers separated by a single SPACE (20h) and terminated by an end-of-line marker (see 7.2.3, "Character set"). The two non-negative integers denote (respectively) the object number of the first object in this subsection and the number of entries in the subsection.

Change various paragraphs below EXAMPLE 1 as follows:

...

where:

nnnnnnnnnn shall be a 10-digit byte offset in the PDF file

...

The byte offset in the PDF file shall be a 10-digit number, padded with leading zeros if necessary, giving the number of bytes from the beginning of the PDF file to the beginning of the object. ...
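As an illustration of the subsection line and the zero-padded offset, here is a small sketch. The function names are hypothetical, and the remainder of the entry layout (a 5-digit generation number, the keyword n, and a 2-byte end-of-line) is assumed from the wider specification because this excerpt elides it.

```python
def subsection_header(first_object: int, count: int) -> bytes:
    """Two non-negative integers separated by exactly one SPACE (20h)."""
    if first_object < 0 or count < 0:
        raise ValueError("both integers must be non-negative")
    return b"%d %d\n" % (first_object, count)


def in_use_entry(offset: int, generation: int = 0) -> bytes:
    """One in-use entry: 10-digit byte offset in the PDF file, padded with
    leading zeros. The rest of the layout is an assumption, see above."""
    if not 0 <= offset <= 9_999_999_999:
        raise ValueError("offset does not fit in 10 digits")
    if not 0 <= generation <= 99_999:
        raise ValueError("generation does not fit in 5 digits")
    return b"%010d %05d n \n" % (offset, generation)
```

Under these assumptions `in_use_entry(17)` yields a 20-byte entry beginning with `0000000017`, and `subsection_header(0, 6)` yields the line `0 6`.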

Change EXAMPLE 2 as follows:

EXAMPLE 2 The cross-reference table sub-section line requires a single SPACE between "0" and "6".

Change EXAMPLE 3 as follows:

EXAMPLE 3 The cross-reference table first sub-section line requires a single SPACE between "0" and "1". The typeface of this example should be all monospaced and with single SPACEs between all cross-reference fields, and thus all fields vertically aligned.

7.5.5 File trailer

Change first paragraph as follows:

The trailer of a PDF file enables a PDF processor to quickly find the cross-reference table and certain special objects. PDF processors should read a PDF file from its end. The last line of the file shall contain only the end-of-file marker, %%EOF. The two preceding lines shall contain, one per line and in order, the keyword startxref and the byte offset from the beginning of the PDF file to the beginning of the xref keyword in the last cross-reference section or the beginning of the previous cross-reference stream (see 7.5.8, "Cross-reference streams"). The startxref line shall be preceded by the trailer dictionary, consisting of the keyword trailer followed by a series of key-value pairs enclosed in double angle brackets (<<...>>) (using LESS-THAN SIGNs (3Ch) and GREATER-THAN SIGNs (3Eh)). Thus, the trailer has the following overall structure:
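Reading from the end of the file can be sketched as follows. This is only an illustration: the function name `find_startxref` and the 1024-byte window are assumptions, and the regular expression accepts any ASCII white space between the three lines rather than enforcing the exact end-of-line rules.

```python
import re


def find_startxref(pdf_bytes: bytes, tail_size: int = 1024) -> int:
    """Return the byte offset that follows the last startxref keyword."""
    tail = pdf_bytes[-tail_size:]
    # An incrementally updated file has several trailers; the last one wins.
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not matches:
        raise ValueError("no startxref / %%EOF pair near the end of the file")
    return int(matches[-1])
```

The returned value is an offset counted from the beginning of the PDF file, so a caller would seek to it directly to reach the cross-reference data.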

Change Table 15 as follows:

Table 15 - Entries in the file trailer dictionary
Key Type Value
Prev integer

(Optional; present only if the file has more than one cross-reference section; shall be a direct object) The byte offset from the beginning of the PDF file to the beginning of the previous cross-reference section.

Info dictionary

(Optional; shall be an indirect reference) ...

7.5.7 Object streams

Append the following paragraph after the bulleted list as follows:

The following objects shall not be stored in an object stream:

  • ...

Any entry's value in an ObjStm dictionary shall be either a direct object or an indirect uncompressed object.

NOTE 3 Indirect references to objects inside object streams use the normal syntax: for example, 14 0 R. Access to these objects requires a different way of storing cross-reference information; see 7.5.8, "Cross-reference streams". Use of compressed objects requires a PDF 1.5 PDF reader. However, compressed objects can be stored in a manner that a PDF 1.4 PDF reader can ignore.

Insert the following new NOTE 4 after NOTE 3 as follows:

NOTE 4: Including the document catalog in an object stream has interoperability implications, particularly for encrypted documents. If the catalog dictionary is part of an object stream, a PDF processor reading the document must first process that object stream before it can access potentially relevant document metadata, including the declared PDF version, developer extensions and XMP metadata.

...

7.5.8 Cross-reference streams

7.5.8.4 Compatibility with applications that do not support compressed reference streams

Change Table 19 as follows:

Table 19 - Additional entries in a hybrid-reference file’s trailer dictionary
Key Type Value
XRefStm integer

(Optional) The byte offset to the cross-reference stream, calculated from the beginning of the PDF file.

...

7.6.2 Application of encryption

7.6.3 General encryption algorithm

Add new NOTE 1 after the 4th bullet in the first bulleted list below the first paragraph as follows:

Encryption applies to all strings and streams in the document's PDF file, with the following exceptions:

  • ...
  • Any hexadecimal strings representing the value of the Contents key in a Signature dictionary

NOTE 1 For the signature schemes enumerated in ISO 32000-1 and in this document, the value of the Contents key in a Signature dictionary is always a hexadecimal string (see "Table 255 — Entries in a signature dictionary").

Encryption is not applied to other object types such as integers and boolean values, which are used primarily to convey information about the document's structure rather than its contents. ...

...

7.6.3.1 General

Change NOTE 1 as follows:

NOTE 1 The name RC4™ is a registered trademark of RSA Security Inc. and cannot be used by third parties creating implementations of the algorithm. Proprietary implementations of the RC4 encryption algorithm are available under license from RSA Security Inc. For licensing information, contact: RSA Security Inc. 2955 Campus Drive, Suite 400, San Mateo, CA 94403-2507, USA, or http://www.rsasecurity.com/.

...

7.6.4 Standard security handler

7.6.4.1 General

Change the second paragraph above NOTE 2 as follows:

If a security handler of revision 4 or 5 is specified, the standard security handler shall support crypt filters (see 7.6.6, "Crypt filters"). The support shall be limited to the Identity crypt filter (see "Table 26 - Standard crypt filter names") and a crypt filter named StdCF whose dictionaries contain an AuthEvent value of DocOpen. For revision 4, the filter CFM value shall be V2 (RC4) or AESV2 (AES-128). For revision 6, the filter CFM value shall be AESV3 (AES-256). Public-Key security handlers in this case shall use a crypt filter named DefaultCryptFilter when all document content is encrypted, and shall use a crypt filter named DefEmbeddedFile when file attachments only are encrypted in place of StdCF name. This nomenclature shall not be used as an indicator of the type of the security handler or encryption. Use of security handler revisions 1, 2, 3, 4 and 5 is deprecated in PDF 2.0.

...

7.6.4.3.2 Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)

Change NOTE 2 as follows:

NOTE 2 The first element of the ID array, as used in 7.6.4.3.2, "Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)", step e, generally remains unchanged across revisions of a given document. However, since this is not guaranteed, use of the ID in computation of the file encryption key, as required when using 7.6.4.3.2, "Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)", can complicate updates to the document. For this reason, security handlers are encouraged to use Algorithm 2.A or higher, which do not use the ID in file encryption key computation.

Insert new NOTE 3 immediately below NOTE 2 as follows:

NOTE 3 This algorithm, when applied to the user password string, produces the file encryption key used to encrypt or decrypt string and stream data according to 7.6.3.2, "Algorithm 1: Encryption of data using the RC4 or AES algorithms". Parts of this algorithm are also used in the algorithms described in 7.6.4.4, "Password algorithms".

7.6.4.3.3 Algorithm 2.A: Retrieving the file encryption key from an encrypted document in order to decrypt it (revision 6 and later)

Insert new NOTE below bullet (f) as follows:

  1. Decrypt the 16-byte Perms string using AES-256 in ECB mode with an initialization vector of zero and the file encryption key as the key. ...

NOTE This algorithm, when applied to the user password string, produces the file encryption key used to encrypt or decrypt string and stream data according to 7.6.3.3, "Algorithm 1.A: Encryption of data using the AES algorithms". Parts of this algorithm are also used in the algorithms described in 7.6.4.4, "Password algorithms".

7.6.4.4.9 Algorithm 10: Computing the encryption dictionary's Perms (permissions) value (Security handlers of revision 6)

Change bullet (f) as follows:

  1. Encrypt the 16-byte block using AES-256 in ECB mode with an initialization vector of zero, using the file encryption key as the key. The result (16 bytes) is stored as the Perms string, and checked for validity when the file is opened.

7.6.4.4.12 Algorithm 13: Validating the permissions (Security handlers of revision 6)

Change bullet (a) as follows:

  1. Decrypt the 16 byte Perms string using AES-256 in ECB mode with an initialization vector of zero and the file encryption key as the key. ...

7.6.5 Public-key security handlers

7.6.5.1 General

...

7.6.5.2 Public-key security dictionary

...

Insert a new subclause heading immediately below the NOTE below Table 23 as follows:

7.6.5.3 Public-key security permissions

EDITOR NOTE: current text and Table 24 remain unchanged.

Renumber the next clause appropriately:

7.6.5.4 Public-key encryption algorithms

Change the second bullet as follows:

  • ...
  • A 4-byte value defining the permissions, most significant byte first. See 7.6.5.3, "Public-key security permissions" and "Table 24 — Public-key security handler user access permissions" for the possible permission values.
  • ...
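The byte order of the permissions value can be shown in one line of code. This is a minimal sketch: the function name `permissions_bytes` is hypothetical, and treating a negative integer as its 32-bit two's-complement pattern is an assumption made here for convenience.

```python
def permissions_bytes(permissions: int) -> bytes:
    """Serialise the permissions as 4 bytes, most significant byte first."""
    # Masking keeps only the low 32 bits, so a negative value is stored
    # as its two's-complement bit pattern (an assumption of this sketch).
    return (permissions & 0xFFFFFFFF).to_bytes(4, "big")
```

With this helper, a value of 1 serialises as the bytes 00 00 00 01.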

Add two new notes at the very end of the sub-clause as follows:

NOTE 1: This means that step c) only applies when both of the following conditions are met:

  • the key is being generated for the crypt filter named DefaultCryptFilter (i.e. the crypt filter used as the value for StmF in the encryption dictionary);
  • the EncryptMetadata entry of the associated crypt filter dictionary is set to false.

NOTE 2: Since crypt filters are not supported when SubFilter is set to adbe.pkcs7.s3 or adbe.pkcs7.s4 in the encryption dictionary, there is no way to specify that metadata is to be left unencrypted in these cases. In particular, step c) is always skipped for these SubFilter values.

7.6.6 Crypt filters

Change the first bullet in the sub-clause as follows:

PDF 1.5 introduces crypt filters, which provide finer granularity control of encryption within a PDF file. The use of crypt filters involves the following structures:

  • The encryption dictionary (see "Table 20 - Entries common to all encryption dictionaries") contains entries that enumerate the crypt filters in the document (CF) and specify which ones are used by default to decrypt all the streams (StmF) and strings (StrF) in the document. In addition, the value of the V entry shall be 4 or 5 to use crypt filters.

...

Change Table 25 as follows:

Table 25 - Entries common to all crypt filter dictionaries
Key Type Value
Length integer

(Required; deprecated in PDF 2.0) ...

When CFM is AESV2, the Length key shall have the value of 128 for public-key security handlers, and 16 for the standard security handler. When CFM is AESV3, the Length key shall have a value of 256 for public-key security handlers, and 32 for the standard security handler.

...

Change Table 27 as follows:

Table 27 - Additional crypt filter dictionary entries for public-key security handlers
Key Type Value
Recipients byte string or array

(Required) If the crypt filter is referenced from StmF or StrF in the encryption dictionary, this entry shall be an array of byte strings, where each byte string shall be a binary-encoded CMS object that shall ...

...

If the crypt filter is referenced from a Crypt filter decode parameter dictionary (see "Table 14 - Optional parameters for Crypt filters"), this entry shall be a byte string that shall be a binary-encoded CMS object that shall ...

7.7 Document structure

7.7.2 Document catalog dictionary

Change Table 29 as follows:

Table 29 - Entries in the catalog dictionary
Key Type Value
Dests dictionary (Optional; PDF 1.1; shall be an indirect reference) ...
Outlines dictionary (Optional; shall be an indirect reference) ...
Threads array (Optional; PDF 1.1; shall be an indirect reference) ...
Lang text string

(Optional; PDF 1.4) A language identifier that shall specify the natural language for all text in the document except where overridden by language specifications for structure elements or marked-content (see 14.9.2, "Natural language specification"). If this entry is absent or invalid (see 14.9.2, "Natural language specification"), the language shall be considered unknown.

NOTE All text in a document includes PDF text strings (see 7.9.2.2 "Text string type") as well as textual content.

7.7.3 Page tree

7.7.3.3 Page objects

Change Table 31 as follows:

Table 31 - Entries in a page object
Key Type Value
Contents stream or array

(Optional) A content stream (see 7.8.2, "Content streams") that shall describe the contents of this page. If this entry is absent, the page shall be empty.

NOTE If the Contents key is not present, a Resources dictionary must still be present, either directly or through inheritance, in the pages tree.

...

ID byte string (Optional; PDF 1.3; indirect reference preferred) ...

7.8.3 Resource dictionaries

Change the first bullet in the first bulleted list as follows:

A resource dictionary shall be associated with a content stream in one of the following ways:

  • For a content stream that is the value of a page's Contents entry (or is an element of an array that is the value of that entry), the resource dictionary shall be designated by the page dictionary's Resources entry or is inherited, as described under 7.7.3.4, "Inheritance of page attributes" from some ancestor node of the page object. ...

...

7.9.2.2 Text string type

7.9.2.2.1 General

Change EXAMPLE 1 as follows:

EXAMPLE 1 A PDF dictionary containing key 'Key' with the value that is the text string "text‰" will look like

<</Key (text\213) >>

where the character '‰' after 'text' is represented by the hex code 8Bh (octal code 213) according to "D.2 Latin character set and encodings".

...

Change EXAMPLE 2 as follows:

EXAMPLE 2 A PDF dictionary containing key 'Key' with the value that is the text string "тест" (that is what the word in Russian with the translation to English as 'test') will look like

<</Key <FEFF0442043504410442> >>

where the characters in parentheses is the sequence of bytes with hex codes FE, FF, 04, 42, 04, 35, 04, 41, 04, 42.
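The byte sequences in EXAMPLE 1 and EXAMPLE 2 can be reproduced with a short sketch. Both function names are hypothetical. The first writes every non-printable byte as a three-digit octal escape, which is one possible choice among several valid ones; the second prefixes the UTF-16BE bytes with the byte order marker FE FF and writes them as a hexadecimal string.

```python
def literal_with_octal_escapes(data: bytes) -> str:
    """Write bytes as a literal string, escaping delimiters, the backslash
    and every byte outside printable ASCII as a \\ddd octal escape."""
    out = []
    for b in data:
        if b in b"()\\" or b < 0x20 or b > 0x7E:
            out.append("\\%03o" % b)
        else:
            out.append(chr(b))
    return "(" + "".join(out) + ")"


def utf16be_hex_string(text: str) -> str:
    """Write text as a hexadecimal string of UTF-16BE bytes with a BOM."""
    data = b"\xfe\xff" + text.encode("utf-16-be")
    return "<" + data.hex().upper() + ">"
```

Here `literal_with_octal_escapes(b"text\x8b")` gives `(text\213)` and `utf16be_hex_string("тест")` gives `<FEFF0442043504410442>`, matching the two examples.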

...

Change NOTE 4 as follows:

NOTE 4 This mechanism precludes beginning a string using PDFDocEncoding with the three characters idieresis, guillemotright, questiondown, which is unlikely to be a meaningful beginning of a word or phrase.

Delete NOTE 5 as follows:

NOTE 5 It is important not to confuse UTF-16BE with UCS2 (i.e. wchar_t). UTF-16 is not a fixed width encoding scheme.

7.10.3 Type 2 (exponential interpolation) functions

Change the paragraph below Table 40 as follows:

Values of Domain shall constrain x in such a way that:

  • if N is not an integer, all values of x will be non-negative; and
  • if N is negative, no value of x will be zero.

Typically, Domain is declared as [0.0 1.0], and N is a positive number. To clip the output to a specified range the Range attribute shall be used.
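The two Domain constraints can be checked before a function is evaluated. The sketch below is illustrative only: the function names are hypothetical, and the interpolation formula y_j = C0_j + x^N × (C1_j − C0_j) is assumed from the wider specification because this excerpt does not restate it. Clipping with Range is left out.

```python
def check_type2_domain(domain, n) -> None:
    """Raise ValueError if Domain lets x take a value the exponent N forbids."""
    low, high = domain
    if not float(n).is_integer() and low < 0:
        raise ValueError("non-integer N requires all x to be non-negative")
    if n < 0 and low <= 0 <= high:
        raise ValueError("negative N requires that x is never zero")


def eval_type2(x, c0, c1, n, domain=(0.0, 1.0)):
    """Evaluate the assumed formula after clipping x to Domain."""
    check_type2_domain(domain, n)
    x = min(max(x, domain[0]), domain[1])
    return [a + (x ** n) * (b - a) for a, b in zip(c0, c1)]
```

With the typical Domain of [0.0 1.0] and N = 2, an input of 0.5 interpolates a quarter of the way from C0 to C1.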

...

7.11.4 Embedded file streams

7.11.4.1 General

Change Table 44 as follows:

Table 44 - Additional entries in an embedded file stream dictionary
Key Type Value
Subtype name

(Optional, required in the case of an embedded file stream used as an associated file (see 14.13 "Associated files") or as an asset of a RichMedia annotation (see "13.7 Rich media")) ...

7.11.6 Collection items

Change Table 47 as follows:

Table 47 - Entries in a collection subitem dictionary
Key Type Value
D text string, date or number

(Optional) The data corresponding to the related entry in the collection field dictionary (see "Table 155 - Entries in a collection field dictionary"). The type of data shall match the data type identified by the corresponding collection field dictionary. Default: none.

P text string

(Optional) A prefix string that shall be concatenated with the text string presented to the user. This entry is ignored when an interactive PDF processor sorts the items in the collection. Default: none.

7.12.2 Extensions dictionary

Change Table 48 as follows:

Table 48 - Entries in an extensions dictionary
Key Type Value
Type name (Optional, shall be a direct object if present) The type of PDF object that this dictionary describes; if present, shall be Extensions.

Last modified: 2 Sept 2022