XML Canonicalization Facts

This document collects facts about XML Canonicalization for use in the written part of my diploma thesis.

General

What is XML Canonicalization?

Article found: XML security: Implement security layers, Part 1 - Basic plumbing technologies [Ve03a]

Fact: You can create XML documents that appear to be different, but have identical data or identical semantic value. Differences may lie in entity structure, attribute ordering, character encoding, or insignificant whitespace. Because of such physical differences, equivalence testing cannot be done at byte level for arbitrary XML documents. Herein lies the problem: Digital signatures rely on byte-level equivalence, whereas it is possible to have two XML documents that are logically the same, but contain different byte sequences.
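The byte-level problem described above can be reproduced with a short sketch. It is only an illustration, not part of the cited article: the two sample documents are made up, and the canonical form is produced with Python's standard-library `xml.etree.ElementTree.canonicalize` (Python 3.8+), which implements C14N 2.0 rather than the older Canonical XML 1.0 usually used with XML signatures.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# Two hypothetical documents with the same logical content:
# they differ only in attribute order, quote style and whitespace inside the tag.
doc_a = '<order id="7" currency="EUR"><item>book</item></order>'
doc_b = "<order currency='EUR'   id='7'><item>book</item></order>"


def digest(text):
    """Return the SHA-256 hex digest of the UTF-8 bytes of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hashing the raw serializations: the byte sequences differ, so the digests differ.
raw_equal = digest(doc_a) == digest(doc_b)

# Hashing the canonical forms: attributes are sorted and quoting is normalized,
# so both documents serialize to the same byte sequence.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)
canonical_equal = digest(canon_a) == digest(canon_b)

print(raw_equal, canonical_equal)  # prints: False True
```

A signature computed over the canonical form therefore stays valid when a parser or intermediary re-serializes the document with different attribute order or quoting.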

Canonicalization comes in two forms:
  • Normal (inclusive) canonicalization: When a sub-part of the XML is serialized, the ancestor elements' context is included, that is, all namespace declarations in scope and all attributes in the xml namespace (such as xml:lang and xml:space).
  • Exclusive canonicalization: When a sub-part of the XML is serialized, the ancestor elements' context is not included; only the namespace declarations actually used by the serialized part are written out.
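The difference between the two forms can be sketched with the third-party `lxml` library, whose `etree.tostring` accepts `method="c14n"` and an `exclusive` flag. This is an assumption-laden illustration: the sample document and the `urn:a` / `urn:b` namespace URIs are invented, and `lxml` must be installed separately.

```python
from lxml import etree

# Hypothetical document: the root declares two namespaces,
# but the child element only uses the "b" prefix.
root = etree.fromstring(
    '<a:root xmlns:a="urn:a" xmlns:b="urn:b">'
    '<b:child>text</b:child>'
    '</a:root>'
)
child = root[0]

# Inclusive form: namespace declarations inherited from the ancestor
# are carried into the serialized sub-part, even the unused "a" prefix.
inclusive = etree.tostring(child, method="c14n")

# Exclusive form: only namespace declarations that the sub-part
# itself uses are written out.
exclusive = etree.tostring(child, method="c14n", exclusive=True)

print(inclusive.decode("utf-8"))
print(exclusive.decode("utf-8"))
```

The exclusive form is the one that survives moving the sub-part into a different enclosing document, for example a signed payload wrapped in a new envelope, because its serialization does not depend on declarations made by the surrounding elements.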