Syntax
7. Syntax
7.2 Lexical conventions
7.2.2 Representation
Change the first paragraph and the 3rd (last) bullet in the subsequent list as follows:
A non-encrypted PDF file can be entirely represented using byte values corresponding to the visible
printable subset of the ASCII character set defined in INCITS 4-1986 (R2017), plus white-space characters.
A PDF file is not restricted to the ASCII character set; it may contain arbitrary bytes, subject to the following considerations:
- ...
- ...
- A PDF file shall be transported as a binary file rather than as a text file to ensure that all bytes of the file are faithfully preserved.
...
7.3 Objects
7.3.4 String objects
7.3.4.2 Literal strings
Change first paragraph as follows:
A literal string shall be written as an arbitrary number of characters enclosed in parentheses (LEFT PARENTHESIS (28h) and RIGHT PARENTHESIS (29h)). Any characters may appear in a string except unbalanced parentheses and the backslash (REVERSE SOLIDUS (5Ch)), which shall be treated specially as described in this subclause. Balanced pairs of parentheses within a string require no special treatment.
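As an informal illustration of the rule above, the following Python sketch writes raw bytes as a literal string and checks whether parentheses are balanced. The function names are hypothetical, and the writer takes a conservative approach that the text does not require: it escapes every parenthesis, even balanced ones.

```python
def escape_literal_string(data: bytes) -> bytes:
    """Wrap raw bytes in parentheses as a PDF literal string.

    Conservative choice (an assumption of this sketch): every parenthesis
    is escaped, so the result is valid whether or not the parentheses in
    `data` were balanced.
    """
    out = bytearray(b"(")
    for byte in data:
        if byte in (0x28, 0x29, 0x5C):  # LEFT/RIGHT PARENTHESIS, REVERSE SOLIDUS
            out.append(0x5C)            # prefix with a backslash
        out.append(byte)
    out.append(0x29)
    return bytes(out)


def parentheses_balanced(data: bytes) -> bool:
    """Return True if the unescaped parentheses in `data` are balanced."""
    depth = 0
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0x5C:        # backslash: skip the byte it escapes
            i += 2
            continue
        if byte == 0x28:
            depth += 1
        elif byte == 0x29:
            depth -= 1
            if depth < 0:       # a closing parenthesis with no opener
                return False
        i += 1
    return depth == 0
```

A writer that wants the shortest output could call `parentheses_balanced` first and leave balanced parentheses unescaped.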
...
7.3.8.2 Stream extent
Change Table 5 as follows:
Key | Type | Value |
---|---|---|
F | file specification | (Optional; PDF 1.2) The file containing the stream data. If this entry is present, the bytes between stream and endstream shall be ignored. However, the Length entry |
7.4.3 ASCII85Decode filter
Change the last bulleted list as follows:
The following conditions shall never occur in a correctly encoded byte sequence:
- The value represented by a group of 5 characters is greater than 2^32 - 1.
- A z character occurs in the middle of a group.
- A final partial group contains only one character.
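The three conditions can be checked mechanically. The sketch below is illustrative only: the function name is hypothetical, and it assumes the input contains just the characters '!' through 'u', 'z' and white space, optionally followed by the '~>' end-of-data marker.

```python
def ascii85_violations(encoded: bytes) -> list[str]:
    """List which of the three forbidden conditions occur in `encoded`."""
    if encoded.endswith(b"~>"):
        encoded = encoded[:-2]          # drop the end-of-data marker
    problems = []
    group = []                          # base-85 digits of the current group
    for ch in encoded:
        if ch in b" \t\r\n\f\x00":
            continue                    # white space is ignored
        if ch == ord("z"):
            if group:
                problems.append("z inside a group")
            continue
        group.append(ch - 33)           # '!' maps to digit 0, 'u' to digit 84
        if len(group) == 5:
            value = 0
            for digit in group:
                value = value * 85 + digit
            if value > 2**32 - 1:
                problems.append("group value exceeds 2^32 - 1")
            group = []
    if len(group) == 1:
        problems.append("final partial group has one character")
    return problems
```

For instance, `s8W-!` encodes exactly 2^32 - 1 and passes, whereas `uuuuu` exceeds it.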
7.4.9 JPXDecode filter
Change paragraph below NOTE 5 as follows:
Data used in PDF image XObjects shall be limited to the JPX baseline set of features, excluding enumerated colour space 19 (CIEJab). In addition, enumerated colour space 12 (CMYK), which is part of JPX but not JPX baseline, shall be supported in a PDF file. JPX file structures used in PDF files shall conform to the JPEG 2000 specification.
...
7.5 File structure
7.5.4 Cross reference table
Change NOTE 3 as follows:
NOTE 3 The subsection structure is useful for incremental updates, since it allows a new cross-reference section to be added to the PDF file, containing entries only for objects that have been added, modified or deleted. This also means that cross reference subsections of incremental updates can never have an object number of zero.
Change paragraph below NOTE 3 as follows:
Each cross-reference subsection shall contain entries for a contiguous range of object numbers.
The subsection shall begin with a line containing only two non-negative integers separated by a single SPACE (20h) and terminated by an end-of-line marker (see 7.2.3, "Character set"). The two non-negative integers denote (respectively) the object number of the first object in this subsection and the number of entries in the subsection.
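A strict reading of this requirement can be sketched as follows. The names are hypothetical, and the sketch assumes the end-of-line marker has already been removed from the line.

```python
import re

# Two non-negative integers with exactly one SPACE (20h) between them
# and nothing else on the line.
SUBSECTION_HEADER = re.compile(rb"(\d+) (\d+)")


def parse_subsection_header(line: bytes) -> tuple[int, int]:
    """Return (first_object_number, entry_count) for a subsection header.

    `line` is assumed to have had its end-of-line marker stripped.
    """
    match = SUBSECTION_HEADER.fullmatch(line)
    if match is None:
        raise ValueError(f"not a valid subsection header: {line!r}")
    return int(match.group(1)), int(match.group(2))
```

With this pattern, `b"0 6"` is accepted, while a line with two spaces or trailing white space is rejected.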
Change various paragraphs below EXAMPLE 1 as follows:
...
where:
nnnnnnnnnn shall be a 10-digit byte offset in the PDF file
...
The byte offset in the PDF file shall be a 10-digit number, padded with leading zeros if necessary, giving the number of bytes from the beginning of the PDF file to the beginning of the object. ...
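The zero-padded field can be produced with ordinary integer formatting. This is a minimal sketch with a hypothetical function name; it covers only the offset field, not the rest of the entry.

```python
def format_offset(offset: int) -> bytes:
    """Render a byte offset as the 10-digit, zero-padded field."""
    if not 0 <= offset <= 9_999_999_999:
        raise ValueError("offset does not fit in 10 decimal digits")
    return b"%010d" % offset
```

For example, an object starting 17 bytes into the file yields `0000000017`.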
Change EXAMPLE 2 as follows:
EXAMPLE 2 The cross-reference table sub-section line requires a single SPACE between "0" and "6".
Change EXAMPLE 3 as follows:
EXAMPLE 3 The cross-reference table first sub-section line requires a single SPACE between "0" and "1".
EDITOR NOTE: The typeface of Example 3 should be all monospaced and with single SPACEs between all cross-reference fields, and thus all cross-reference data fields vertically aligned.
7.5.5 File trailer
Change first paragraph as follows:
The trailer of a PDF file enables a PDF processor to quickly find the cross-reference table and certain
special objects. PDF processors should read a PDF file from its end. The last line of the file shall contain
only the end-of-file marker, %%EOF. The two preceding lines shall contain, one per line and in order,
the keyword startxref and the byte offset
from the beginning of the PDF file to
the beginning of the xref keyword in the last cross-reference section
or the beginning of the previous cross-reference stream
(see 7.5.8, "Cross-reference streams"). The startxref line shall be
preceded by the trailer dictionary, consisting of the keyword trailer followed by a series of key-value
pairs enclosed in double angle brackets (<<...>>) (using LESS-THAN SIGNs (3Ch) and GREATER-THAN
SIGNs (3Eh)). Thus, the trailer has the following overall structure:
Change Table 15 as follows:
Key | Type | Value |
---|---|---|
Prev | integer | (Optional; present only if the file has more than one cross-reference section; shall be a direct object) The byte offset from the beginning of the PDF file to the beginning of the previous cross-reference |
Info | dictionary | (Optional |
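As a rough illustration of the end-of-file layout described in the first paragraph of 7.5.5, the sketch below reads the startxref offset from the tail of a file. The function name and the 1024-byte window are assumptions of this sketch, and it expects the file to end with exactly the three lines described: startxref, the offset, then %%EOF.

```python
def find_startxref(data: bytes, window: int = 1024) -> int:
    """Return the byte offset recorded after the final startxref keyword.

    `data` is the whole file; only its last `window` bytes are examined
    (the window size is an arbitrary choice for this sketch).
    """
    tokens = data[-window:].split()     # split on any white space, incl. EOLs
    if len(tokens) < 3 or tokens[-1] != b"%%EOF" or tokens[-3] != b"startxref":
        raise ValueError("file does not end with startxref / offset / %%EOF")
    return int(tokens[-2])
```

A processor would then seek to the returned offset to read the cross-reference data.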
7.5.7 Object streams
Append the following paragraph after the bulleted list as follows:
The following objects shall not be stored in an object stream:
- ...
Any entry's value in an ObjStm dictionary shall be either a direct object or an indirect uncompressed object.
NOTE 3 Indirect references to objects inside object streams use the normal syntax: for example, 14 0 R. Access to these objects requires a different way of storing cross-reference information; see 7.5.8, "Cross-reference streams". Use of compressed objects requires a PDF 1.5 PDF reader. However, compressed objects can be stored in a manner that a PDF 1.4 PDF reader can ignore.
Insert the following new NOTE 4 after NOTE 3 as follows:
NOTE 4: Including the document catalog in an object stream has interoperability implications, particularly for encrypted documents. If the catalog dictionary is part of an object stream, a PDF processor reading the document must first process that object stream before it can access potentially relevant document metadata, including the declared PDF version, developer extensions and XMP metadata.
...
7.5.8 Cross-reference streams
7.5.8.4 Compatibility with applications that do not support compressed reference streams
Change Table 19 as follows:
Key | Type | Value |
---|---|---|
XRefStm | integer | (Optional) The byte offset |
...
7.6.2 Application of encryption
7.6.3 General encryption algorithm
Add new NOTE 1 after the 4th bullet in the first bulleted list below the first paragraph as follows:
Encryption applies to all strings and streams in the document's PDF file, with the following exceptions:
- ...
- Any hexadecimal strings representing the value of the Contents key in a Signature dictionary
NOTE 1 For the signature schemes enumerated in ISO 32000-1 and in this document, the value of the Contents key in a Signature dictionary is always a hexadecimal string (see "Table 255 — Entries in a signature dictionary").
Encryption is not applied to other object types such as integers and boolean values, which are used primarily to convey information about the document's structure rather than its contents. ...
...
7.6.3.1 General
Change NOTE 1 as follows:
NOTE 1 The name RC4™ is a registered trademark of RSA Security Inc. and cannot be used by third parties creating implementations of the algorithm. Proprietary implementations of the RC4 encryption algorithm are available under license from RSA Security Inc. For licensing information, contact: RSA Security Inc. 2955 Campus Drive, Suite 400, San Mateo, CA 94403-2507, USA, or http://www.rsasecurity.com/.
...
7.6.4 Standard security handler
7.6.4.1 General
Change the second paragraph above NOTE 2 as follows:
If a security handler of revision 4 or 5 is specified, the standard security handler shall support crypt filters (see 7.6.6, "Crypt filters").
The support shall be limited to the Identity crypt filter (see "Table 26 - Standard crypt filter names") and a crypt filter named StdCF whose dictionaries contain an AuthEvent value of DocOpen. For revision 4, the filter CFM value shall be V2 (RC4) or AESV2 (AES-128). For revision 6, the filter CFM value shall be AESV3 (AES-256). Public-Key security handlers in this case shall use a crypt filter named DefaultCryptFilter when all document content is encrypted, and shall use a crypt filter named DefEmbeddedFile when file attachments only are encrypted in place of StdCF name. This nomenclature shall not be used as an indicator of the type of the security handler or encryption. Use of security handler revisions 1, 2, 3, 4 and 5 is deprecated in PDF 2.0.
...
7.6.4.3.2 Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)
Change NOTE 2 as follows:
NOTE 2 The first element of the ID array, as used in 7.6.4.3.2, "Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)", step e, generally remains unchanged across revisions of a given document. However, since this is not guaranteed, use of the ID in computation of the file encryption key, as required when using 7.6.4.3.2, "Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)", can complicate updates to the document. For this reason, security handlers are encouraged to use Algorithm 2.A or higher, which do not use the ID in file encryption key computation.
Insert new NOTE 3 immediately below NOTE 2 as follows:
NOTE 3 This algorithm, when applied to the user password string, produces the file encryption key used to encrypt or decrypt string and stream data according to 7.6.3.2, "Algorithm 1: Encryption of data using the RC4 or AES algorithms". Parts of this algorithm are also used in the algorithms described in 7.6.4.4, "Password algorithms".
7.6.4.3.3 Algorithm 2.A: Retrieving the file encryption key from an encrypted document in order to decrypt it (revision 6 and later)
Insert new NOTE below bullet (f) as follows:
- Decrypt the 16-byte Perms string using AES-256 in ECB mode and the file encryption key as the key. ...
NOTE This algorithm, when applied to the user password string, produces the file encryption key used to encrypt or decrypt string and stream data according to 7.6.3.3, "Algorithm 1.A: Encryption of data using the AES algorithms". Parts of this algorithm are also used in the algorithms described in 7.6.4.4, "Password algorithms".
7.6.4.4.9 Algorithm 10: Computing the encryption dictionary's Perms (permissions) value (Security handlers of revision 6)
Change bullet (f) as follows:
- Encrypt the 16-byte block using AES-256 in ECB mode, using the file encryption key as the key. The result (16 bytes) is stored as the Perms string, and checked for validity when the file is opened.
7.6.4.4.12 Algorithm 13: Validating the permissions (Security handlers of revision 6)
Change bullet (a) as follows:
- Decrypt the 16 byte Perms string using AES-256 in ECB mode and the file encryption key as the key. ...
7.6.5 Public-key security handlers
7.6.5.1 General
...
7.6.5.2 Public-key security dictionary
...
Change the paragraph below the NOTE as follows:
Permitted values of the SubFilter entry for use with conforming public-key security handlers are adbe.pkcs7.s3 (PDF 1.3), adbe.pkcs7.s4 (PDF 1.4), which shall be used when not using crypt filters (see 7.6.6, "Crypt filters") and adbe.pkcs7.s5 (PDF 1.5), which shall be used when using crypt filters.
...
Insert a new subclause heading immediately below the NOTE below Table 23 as follows:
7.6.5.3 Public-key security permissions
EDITOR NOTE: current text and Table 24 remain unchanged.
Renumber the next clause appropriately:
7.6.5.4 Public-key encryption algorithms (formerly 7.6.5.3)
Change the second bullet as follows:
- ...
- A 4-byte value defining the permissions, most significant byte first. See 7.6.5.3, "Public-key security permissions" and "Table 24 — Public-key security handler user access permissions" for the possible permission values.
- ...
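The byte order of the permissions value in the second bullet can be illustrated with a short sketch. The function name is hypothetical, and masking to 32 bits is an assumption made here so that negative integers are also accepted.

```python
def permissions_bytes(permissions: int) -> bytes:
    """Encode a 32-bit permissions value, most significant byte first."""
    # Masking keeps the low 32 bits, so a negative (signed) value is
    # written as its two's-complement bit pattern.
    return (permissions & 0xFFFFFFFF).to_bytes(4, "big")
```

For example, the value 0x00000204 is written as the bytes 00 00 02 04.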
Add two new notes at the very end of the sub-clause as follows:
NOTE 1: This means that step c) only applies when both of the following conditions are met:
- the key is being generated for the crypt filter named DefaultCryptFilter (i.e. the crypt filter used as the value for StmF in the encryption dictionary);
- the EncryptMetadata entry of the associated crypt filter dictionary is set to false.
NOTE 2: Since crypt filters are not supported when SubFilter is set to adbe.pkcs7.s3 or adbe.pkcs7.s4 in the encryption dictionary, there is no way to specify that metadata is to be left unencrypted in these cases. In particular, step c) is always skipped for these SubFilter values.
7.6.6 Crypt filters
Change the first bullet in the sub-clause as follows:
PDF 1.5 introduces crypt filters, which provide finer granularity control of encryption within a PDF file. The use of crypt filters involves the following structures:
- The encryption dictionary (see "Table 20 - Entries common to all encryption dictionaries") contains entries that enumerate the crypt filters in the document (CF) and specify which ones are used by default to decrypt all the streams (StmF) and strings (StrF) in the document. In addition, the value of the V entry shall be 4 or 5 to use crypt filters.
...
Change Table 25 as follows:
Key | Type | Value |
---|---|---|
Length | integer | (Required; deprecated in PDF 2.0) ... When CFM is AESV2, the Length key shall have the value of 128 for public-key security handlers, and 16 for the standard security handler. When CFM is AESV3, the Length key shall have a value of 256 for public-key security handlers, and 32 for the standard security handler. |
...
Change Table 27 as follows:
Key | Type | Value |
---|---|---|
Recipients | byte string or array | (Required) If the crypt filter is referenced from StmF or StrF in the encryption dictionary, this entry shall be an array of byte strings, where each byte string shall be a binary-encoded CMS object that shall ... ... If the crypt filter is referenced from a Crypt filter decode parameter dictionary (see "Table 14 - Optional parameters for Crypt filters"), this entry shall be a byte string that shall be a binary-encoded CMS object that shall ... |
7.7 Document structure
7.7.2 Document catalog dictionary
Change Table 29 as follows:
Key | Type | Value |
---|---|---|
Extensions | dictionary | (Optional; shall be a direct object; ISO 32000-1) ... |
Dests | dictionary | (Optional; PDF 1.1 |
Outlines | dictionary | (Optional |
Threads | array | (Optional; PDF 1.1 |
Lang | text string | (Optional; PDF 1.4) A language identifier that shall specify the natural language for all text in the document except where overridden by language specifications for structure elements or marked-content (see 14.9.2, "Natural language specification"). If this entry is absent or invalid (see 14.9.2, "Natural language specification"), the language shall be considered unknown. NOTE All text in a document includes PDF text strings (see 7.9.2.2 "Text string type") as well as textual content. |
7.7.3 Page tree
7.7.3.3 Page objects
Change Table 31 as follows:
Key | Type | Value |
---|---|---|
Contents | stream or array | (Optional) A content stream (see 7.8.2, "Content streams") that shall describe the contents of this page. If this entry is absent, the page shall be empty. NOTE If the Contents key is not present, a Resources dictionary must still be present, either directly or through inheritance, in the pages tree. ... |
ID | byte string | (Optional; PDF 1.3 |
7.7.4 Name dictionary
...
Change all occurrences of "name string" to "string" in Table 32 as follows:
Key | Type | Value |
---|---|---|
Dests | name tree | (Optional; PDF 1.2) name tree mapping |
AP | name tree | (Optional; PDF 1.3) name tree mapping |
JavaScript | name tree | (Optional; PDF 1.3) name tree mapping |
Pages | name tree | (Optional; PDF 1.3) name tree mapping |
Templates | name tree | (Optional; PDF 1.3) name tree mapping |
EmbeddedFiles | name tree | (Optional; PDF 1.4) name tree mapping (PDF 2.0) For unencrypted wrapper documents for an encrypted payload document (see 7.6.7, "Unencrypted wrapper document") the |
AlternatePresentations | name tree | (Optional; PDF 1.4) name tree mapping |
Renditions | name tree | (Optional; PDF 1.5) A name tree mapping |
7.8.3 Resource dictionaries
Change the first bullet in the first bulleted list as follows:
A resource dictionary shall be associated with a content stream in one of the following ways:
- For a content stream that is the value of a page's Contents entry (or is an element of an array that is the value of that entry), the resource dictionary shall be designated by the page dictionary's Resources entry or is inherited, as described under 7.7.3.4, "Inheritance of page attributes" from some ancestor node of the page object. ...
...
7.9 Common data structures
7.9.2 String object types
7.9.2.2 Text string type
7.9.2.2.1 General
Change EXAMPLE 1 as follows:
EXAMPLE 1 A PDF dictionary containing key 'Key' with the value that is the text string "text‰" will look like
<</Key (text\213) >>
where the character '‰' after the 'text' is represented by the hex code 8Bh (octal code 213), according to "D.2 Latin character set and encodings".
...
Change EXAMPLE 2 as follows:
EXAMPLE 2 A PDF dictionary containing key 'Key' with the value that is the text string "тест" (the Russian word for 'test') will look like
<</Key <FEFF0442043504410442> >>
where the string value is the sequence of bytes with hex codes FE, FF, 04, 42, 04, 35, 04, 41, 04, 42.
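The byte sequence in EXAMPLE 2 can be reproduced with a few lines of Python. This is an informal sketch with a hypothetical function name; it covers only the UTF-16BE form with its FE FF byte order marker, not PDFDocEncoding.

```python
def text_string_hex(text: str) -> str:
    """Return `text` as a PDF hexadecimal string in UTF-16BE with a BOM."""
    payload = b"\xfe\xff" + text.encode("utf-16-be")   # BOM, then the text
    return "<" + payload.hex().upper() + ">"
```

Calling `text_string_hex("тест")` gives the hexadecimal string used in the example.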
...
Change NOTE 4 as follows:
NOTE 4 This mechanism precludes beginning a string using PDFDocEncoding with the three characters idieresis, guillemotright, questiondown, which is unlikely to be a meaningful beginning of a word or phrase.
Delete NOTE 5 as follows:
NOTE 5 It is important not to confuse UTF-16BE with UCS2 (i.e. wchar_t). UTF-16 is not a fixed width encoding scheme.
7.9.6 Name trees
...
Change Table 36 as follows:
Key | Type | Value |
---|---|---|
Names | array | (Root and leaf nodes only; required in leaf nodes; present in the root node if and only if Kids is not present) Shall be an array of the form [key1 value1 key2 value2 ...keyn valuen] where each keyi shall be a string and the corresponding valuei shall be the object associated with that key. The keys shall be sorted |
Change the paragraph below Table 36 as follows:
The Kids entries in the root and intermediate nodes define the tree’s structure by identifying the immediate children of each node.
The Names entries in the leaf (or root) nodes shall contain the tree’s keys and their associated values, arranged in key-value pairs and shall be sorted
lexically in ascending order by key. Shorter keys shall appear before longer ones beginning with the same byte sequence.
Any encoding of the keys may be used as long as it is self-consistent; keys shall be compared for equality on a simple byte-by-byte basis.
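The ordering rule matches the way Python compares bytes objects: byte by byte, with a shorter key sorting before any longer key that begins with it. The sketch below uses hypothetical function names and treats a Names array as a flat Python list of alternating keys and values.

```python
def names_sorted(names: list) -> bool:
    """Check that the keys of [key1, value1, key2, value2, ...] ascend."""
    keys = names[0::2]                       # every second item is a key
    return all(a <= b for a, b in zip(keys, keys[1:]))


def sort_names(names: list) -> list:
    """Return a copy of the array with its key-value pairs sorted by key."""
    pairs = sorted(zip(names[0::2], names[1::2]), key=lambda pair: pair[0])
    return [item for pair in pairs for item in pair]
```

For example, the keys `b"a"`, `b"ab"`, `b"b"` are already in the required order.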
...
7.10 Functions
7.10.3 Type 2 (exponential interpolation) functions
Change the paragraph below Table 40 as follows:
Values of Domain shall constrain x in such a way that:
- if N is not an integer, all values of x will be non-negative; and
- if N is negative, no value of x will be zero.
Typically, Domain is declared as [0.0 1.0], and N is a positive number. To clip the output to a specified range the Range attribute shall be used.
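The two Domain constraints can be checked before a function is evaluated. The sketch below is illustrative: the function names are hypothetical, and the interpolation formula C0 + x^N × (C1 − C0) comes from the general definition of Type 2 functions, which is not reproduced in this section.

```python
def domain_is_valid(domain: tuple[float, float], n: float) -> bool:
    """Check the two Domain constraints for exponent `n`."""
    low, high = domain
    if not float(n).is_integer() and low < 0:
        return False            # non-integer N: x must be non-negative
    if n < 0 and low <= 0 <= high:
        return False            # negative N: x must never be zero
    return True


def evaluate_type2(x, c0, c1, n, domain=(0.0, 1.0), output_range=None):
    """Evaluate y_j = C0_j + x**N * (C1_j - C0_j) for each output j."""
    x = min(max(x, domain[0]), domain[1])            # clip input to Domain
    y = [a + x**n * (b - a) for a, b in zip(c0, c1)]
    if output_range is not None:                     # (min, max) per output
        y = [min(max(v, lo), hi) for v, (lo, hi) in zip(y, output_range)]
    return y
```

With the typical declaration Domain [0.0 1.0] and a positive N, both constraints hold.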
...
7.11.4 Embedded file streams
7.11.4.1 General
Change Table 44 as follows:
Key | Type | Value |
---|---|---|
Subtype | name | (Optional, required in the case of an embedded file stream used as an associated file (see 14.13 "Associated files") or as an asset of a RichMedia annotation (see "13.7 Rich media")) ... |
7.11.6 Collection items
Change Table 47 as follows:
Key | Type | Value |
---|---|---|
D | text string, date or number | (Optional) The data corresponding to the related entry in the collection field dictionary (see "Table 155 - Entries in a collection field dictionary"). The type of data shall match the data type identified by the corresponding collection field dictionary. |
P | text string | (Optional) A prefix string that shall be concatenated with the text string presented to the user. This entry is ignored when an interactive PDF processor sorts the items in the collection. |
7.12.2 Extensions dictionary
Change Table 48 as follows:
Key | Type | Value |
---|---|---|
Type | name | (Optional, shall be a direct |
7.12.5 ExtensionLevel
Change the paragraph as follows:
The value of the ExtensionLevel entry shall be an integer, which shall be interpreted with respect to the BaseVersion value.
If a developer has released multiple extensions against the same BaseVersion value, they should be ordered over time and the ExtensionLevel numbers should increase over time.
Last modified: 16 December 2022