jcs

package module
v0.0.1 Latest
Published: Dec 14, 2025 License: MIT Imports: 9 Imported by: 0

README

JSON Canonicalization Scheme (JCS) in Go

Overview

The jcs package implements the JSON Canonicalization Scheme (JCS) as defined in RFC 8785. JCS is a method for converting JSON data into a canonical form, ensuring that the data is consistently represented in a way that allows for reliable comparisons and signatures. This package provides a Go implementation that allows encoding Go values into canonical JSON format, which can be used for digital signatures, data integrity checks, and other cryptographic applications.

Supported Types and Behavior

  1. nil The value nil is serialized as null.

  2. bool The value is serialized as the JSON literal true or false.

  3. string The string is serialized using UTF-8 encoding.

  4. Numeric types The following numeric types are supported and are serialized as JSON numbers (with conversion to float64 where necessary):

    • float64
    • float32 (converted to float64)
    • int (converted to float64)
    • int8, int16, int32, int64 (converted to float64)
    • uint, uint8, uint16, uint32, uint64 (converted to float64)

    For int and uint types that exceed the supported range for JSON numbers, the function returns the ErrNumberOOR error. Only integers v in the range [‑(2^53‑1), +(2^53‑1)] are valid.

  5. Arrays and Slices Slices of basic types (e.g., []int, []float64, []string, etc.) are recursively serialized as arrays in JSON format. Supported slice types include:

    • []int, []int8, []int16, []int32, []int64
    • []uint, []uint8, []uint16, []uint32, []uint64
    • []bool
    • []string
    • []float32, []float64
    • []any

    Each element of the slice is serialized individually, and the resulting canonicalized representation is appended to dst.

  6. time.Time A time.Time value is serialized in the RFC 3339 format (i.e., 2006-01-02T15:04:05Z07:00).

  7. map[string]any (Objects) A map is serialized as a JSON object. Keys must be valid UTF-8 strings, and values are serialized according to their types. Entries are written in sorted key order. RFC 8785 requires keys to be compared as UTF-16 code units, which affects where keys containing non-BMP characters (encoded as surrogate pairs in UTF-16) are placed.

  8. Unsupported Types If the value v is of an unsupported type, the function returns the error ErrUnsupportedType.

Error Handling

Encoding a Go value into canonical JSON can fail depending on the value's type or contents. This section outlines the errors the package may return so that callers can handle them.

The following errors are defined in the jcs package. Each error corresponds to a specific issue that might occur during the canonicalization process:


1. ErrUnsupportedType

Description:
This error occurs when the encoder encounters a value of an unsupported type. The jcs encoder supports only a subset of Go types, including basic types like integers, strings, booleans, slices, and maps. Custom structs, channels, and function types (among others) are not supported by this package and will trigger this error.

Possible Causes:

  • Attempting to encode unsupported types such as:
    • Function types
    • Channels
    • Structs not explicitly handled by the encoder
    • Map types other than map[string]any
  • Composite types that cannot be serialized into canonical JSON.

2. ErrNaN

Description:
Returned when the encoder encounters a NaN (Not a Number) value. According to RFC 8785, NaN values are not allowed in canonical JSON. If a NaN value is passed to the encoder, it triggers this error.

Possible Causes:

  • Attempting to encode NaN values, which are invalid in JCS.

3. ErrInf

Description:
Returned when the encoder encounters an infinity value, either positive (+Inf) or negative (-Inf). RFC 8785 disallows both, so the encoder rejects such values.

Possible Causes:

  • Attempting to encode positive or negative infinity values.
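
The NaN and infinity checks described above can be sketched with the standard math package. The names checkFinite, errNaN, and errInf are stand-ins invented for this example; the package's own sentinels are ErrNaN and ErrInf, and its internal check may be written differently.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Stand-in sentinel errors for this sketch. The real package exports
// ErrNaN and ErrInf, which play the same roles.
var (
	errNaN = errors.New("sketch: NaN")
	errInf = errors.New("sketch: Inf")
)

// checkFinite reports whether f may appear in canonical JSON.
func checkFinite(f float64) error {
	switch {
	case math.IsNaN(f):
		return errNaN
	case math.IsInf(f, 0): // sign 0 matches both +Inf and -Inf
		return errInf
	}
	return nil
}

func main() {
	for _, f := range []float64{1.5, math.NaN(), math.Inf(1), math.Inf(-1)} {
		err := checkFinite(f)
		switch {
		case errors.Is(err, errNaN):
			fmt.Println(f, "-> rejected: NaN")
		case errors.Is(err, errInf):
			fmt.Println(f, "-> rejected: infinity")
		default:
			fmt.Println(f, "-> ok")
		}
	}
}
```

Matching with errors.Is rather than comparing messages keeps the caller working if an error is later wrapped with extra context.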

4. ErrInvalidUTF8

Description:
Returned when the encoder encounters a string containing invalid UTF‑8 byte sequences. JCS requires that all strings be valid UTF‑8. Both appendUTF16 and appendString enforce this rule, but at different stages:

  • appendUTF16 (conversion to UTF‑16) rejects:

    • Invalid single‑byte sequences (e.g., 0xFF)
    • Truncated multi‑byte sequences (e.g., 0xE2 0x82)
    • Surrogate code points (U+D800–U+DFFF), which are not valid Unicode scalar values
  • appendString (string escaping):

    • Applies the same invalid UTF‑8 checks as above
    • Rejects surrogate code points during string escaping
    • Escapes control characters and rejects malformed sequences inline

Possible Causes:

  • Strings containing invalid or corrupted UTF‑8 byte sequences
  • Encodings that are not UTF‑8 (e.g., UTF‑16 or other encodings)
  • Malformed sequences such as:
    • Invalid single‑byte values (e.g., 0xFF)
    • Truncated multi‑byte sequences (e.g., 0xE2 0x82)
    • Surrogate code points (U+D800–U+DFFF)

Note: A valid U+FFFD replacement character (0xEF 0xBF 0xBD) is allowed, since it is a legitimate Unicode scalar value.
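
The distinction between a malformed sequence and a legitimate U+FFFD can be shown with unicode/utf8. This is a sketch under the assumption that validation walks the string rune by rune; firstInvalidByte is a hypothetical helper, not a function of the package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstInvalidByte returns the byte offset of the first malformed UTF-8
// sequence in s, or -1 if s is valid. Hypothetical helper for illustration.
func firstInvalidByte(s string) int {
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		// A decoding failure is reported as RuneError with size 1.
		// A genuine U+FFFD in the input decodes with size 3 and is accepted.
		if r == utf8.RuneError && size == 1 {
			return i
		}
		i += size
	}
	return -1
}

func main() {
	inputs := []string{
		"ok",
		"a\xff",         // invalid single byte
		"a\xe2\x82",     // truncated three-byte sequence
		"a\xed\xa0\x80", // UTF-8 encoding of surrogate U+D800
		"a\xef\xbf\xbd", // valid U+FFFD replacement character
	}
	for _, s := range inputs {
		fmt.Printf("%q -> %d\n", s, firstInvalidByte(s))
	}
}
```

The first and last inputs report -1 (valid); the three malformed inputs report offset 1, the position of the offending byte.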


5. ErrNumberOOR

Description:
The ErrNumberOOR (Out of Range) error is returned when a number falls outside the range that IEEE‑754 double‑precision format represents safely. RFC 8785 requires exact round‑trip encoding of numbers, so values whose magnitude exceeds 2^53 − 1 (MaxSafeNumber) are rejected with this error.

Possible Causes:

  • Numbers exceeding the precision limits of IEEE‑754 double‑precision floating‑point (approximately ±9.007 × 10^15).
  • Applies to both integer and floating‑point numbers.

Number Compliance

RFC 8785 Rules Enforced:
  • Zero normalization
    Both +0.0 and -0.0 are rendered as "0". RFC 8785 requires that negative zero not be distinguishable from positive zero in canonical JSON.

  • NaN and Infinity
    NaN, +Inf, and -Inf are explicitly disallowed. If encountered, the encoder returns ErrNaN or ErrInf.

  • Safe integer range Integers must lie within the IEEE‑754 double‑precision safe range: [-(2^53‑1), +(2^53‑1)]. Values outside this range cannot be represented exactly as float64 and will trigger ErrNumberOOR.

  • Shortest decimal representation
    Finite values are serialized using strconv.AppendFloat with mode 'f' and precision -1, producing the shortest correct decimal string. For very large or very small magnitudes, numberNormalizer is used to ensure canonical exponent formatting.

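  • Exponent normalization
    RFC 8785 follows the ECMAScript number‑to‑string rules, which use exponent notation only for magnitudes of 1e21 and above or below 1e-6. When an exponent is present, it must:

    • Always include a sign (+ or -).
    • Never contain leading zeros.
      Example: 1.23 × 10^21 must be represented as 1.23e+21, not 1.23e21 or 1.23e+021. The value 1230000000 is written without an exponent, as 1230000000.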
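
The number rules above can be sketched as follows. This is an illustration under stated assumptions, not the package's numberNormalizer: formatNumber and normalizeExponent are hypothetical names, and the choice to switch to Go's 'e' format outside the range [1e-6, 1e21) is this sketch's way of following the ECMAScript thresholds.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// normalizeExponent rewrites Go's exponent form ("1e-07") into the form
// without leading zeros ("1e-7"). Go already emits the exponent sign.
func normalizeExponent(s string) string {
	i := strings.IndexByte(s, 'e')
	if i < 0 || i+2 > len(s) {
		return s // no exponent part to normalize
	}
	mantissa, sign, digits := s[:i], s[i+1:i+2], s[i+2:]
	digits = strings.TrimLeft(digits, "0")
	if digits == "" {
		digits = "0"
	}
	return mantissa + "e" + sign + digits
}

// formatNumber renders f following the rules listed above.
func formatNumber(f float64) (string, error) {
	if math.IsNaN(f) || math.IsInf(f, 0) {
		return "", fmt.Errorf("sketch: %v is not representable", f)
	}
	if f == 0 {
		return "0", nil // covers both +0 and -0
	}
	abs := math.Abs(f)
	if abs >= 1e-6 && abs < 1e21 {
		// Plain decimal notation, shortest round-trip digits.
		return strconv.FormatFloat(f, 'f', -1, 64), nil
	}
	return normalizeExponent(strconv.FormatFloat(f, 'e', -1, 64)), nil
}

func main() {
	for _, f := range []float64{0, 1230000000, 0.000001, 1e-7, 1.23e21} {
		s, err := formatNumber(f)
		fmt.Println(s, err)
	}
}
```

Go's 'e' format pads the exponent to at least two digits, which is why a normalization pass is needed at all.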

Performance Notes

Benchmarks show predictable linear scaling with input size. The majority of runtime cost comes from Go’s strconv.AppendFloat implementation, which performs the heavy float‑to‑string conversion. The numberNormalizer adds a small overhead (~3–4%) to enforce RFC exponent formatting. While specialized algorithms like Ryu or Grisu3 can be faster, this approach is correct, stable, and maintainable.

Object Compliance and Performance

Canonicalization Rules for Objects

RFC 8785 requires that JSON objects (map[string]any in Go) be serialized in a canonical form:

  • Lexicographic ordering of keys
    Keys must be sorted by their UTF‑16 code units, not by Unicode code points; the two orders differ for keys that contain non‑BMP characters. This is achieved by converting each key into UTF‑16 code units and comparing them during sort.

  • UTF‑8 validation
    Keys must be valid UTF‑8. Invalid sequences trigger ErrInvalidUTF8.

  • Stable formatting

    • Objects are enclosed in {}.
    • Keys are quoted strings, followed by a colon :.
    • Values are serialized in canonical form (numbers, strings, arrays, nested objects).
    • Entries are separated by commas , with no extra whitespace.
  • Error propagation
    If any key or value fails to encode (e.g., unsupported type, invalid UTF‑8, NaN/Inf), serialization aborts and returns the error.
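
The key-ordering rule can be illustrated with unicode/utf16. The sketch below is not the package's implementation (which, per the notes that follow, keeps UTF‑16 buffers and kv metadata); lessUTF16 and sortKeys are hypothetical names, and the keys are assumed to be valid UTF‑8 already.

```go
package main

import (
	"fmt"
	"sort"
	"unicode/utf16"
)

// lessUTF16 reports whether a sorts before b when both are compared as
// sequences of UTF-16 code units. It assumes a and b are valid UTF-8.
func lessUTF16(a, b string) bool {
	ua, ub := utf16.Encode([]rune(a)), utf16.Encode([]rune(b))
	for i := 0; i < len(ua) && i < len(ub); i++ {
		if ua[i] != ub[i] {
			return ua[i] < ub[i]
		}
	}
	// One key is a prefix of the other: the shorter one sorts first.
	return len(ua) < len(ub)
}

// sortKeys sorts keys in place in UTF-16 code unit order.
func sortKeys(keys []string) {
	sort.Slice(keys, func(i, j int) bool { return lessUTF16(keys[i], keys[j]) })
}

func main() {
	// U+1F600 is encoded as the surrogate pair D83D DE00, so it sorts
	// before U+FB33 even though its code point is larger.
	keys := []string{"\ufb33", "\U0001F600", "b", "a"}
	sortKeys(keys)
	fmt.Printf("%+q\n", keys)
}
```

A plain sort.Strings would place U+FB33 before U+1F600, because Go compares strings bytewise and UTF‑8 byte order matches code point order. Re-encoding inside the comparator is simple but repeats work on every comparison; converting each key once up front avoids that.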

Observations:

  • Runtime grows slightly faster than linearly with object size, because the O(n log n) sort dominates.
  • Allocation count remains low (<50 even for 10M entries).
  • Memory usage scales with number of keys due to UTF‑16 buffers and key metadata.

Insights (TL;DR): performance is dominated by key sorting, which is unavoidable under the specification. Memory and allocation counts remain modest, and runtime scales predictably with object size. Approximate cost per stage:

  • UTF‑16 conversion (~6%)
    Each key is decoded into UTF‑16 code units for lexicographic comparison.

  • Key collection (~9%)
    Building the UTF‑16 buffer and metadata (kv structs).

  • Sorting (~45%)
    sort.Slice compares UTF‑16 slices to enforce canonical ordering. This is the dominant cost.

  • Serialization (~20%)
    Keys are written with appendString, values with Append. Linear in number of entries.

Documentation

Overview

Package jcs provides an implementation of the JSON Canonicalization Scheme (JCS) as defined in RFC 8785.

JCS defines a strict, deterministic serialization of JSON data so that logically equivalent values always produce the same byte sequence. This canonical form is essential for cryptographic applications such as digital signatures, hashing, and secure data exchange.

Features of this implementation:

  • Canonical serialization of primitive types (null, booleans, strings, numbers) following RFC 8785 rules.
  • Enforcement of IEEE‑754 double precision constraints: integers outside ±(2^53 − 1) cannot be represented exactly and return ErrNumberOOR.
  • Canonical ordering of object keys using UTF‑16 code unit comparison, ensuring correct handling of non‑BMP characters (surrogate pairs).
  • Support for slices of common Go types (ints, uints, floats, strings, bools, any) and maps with string keys.
  • Rejection of unsupported or non‑representable types with ErrUnsupportedType.

The core entry point is Append, which appends the canonical JSON representation of a Go value to a destination byte slice. Helper functions such as appendSlice and appendObject handle composite types. Errors are returned when values cannot be represented according to RFC 8785.

This package is intended for use in contexts where canonical JSON is required for interoperability, compliance, or cryptographic integrity.

Example

This example shows how to use Append.

var buf []byte

response := map[string]any{
	"user_id": "c3f65f70-eb2f-4979-ba73-24bcbde9fdd9",
	"age":     31,
}

// Append the canonical JSON form of the map to buf.
buf, _ = Append(buf, response)
fmt.Println(string(buf))

Output:

{"age":31,"user_id":"c3f65f70-eb2f-4979-ba73-24bcbde9fdd9"}

Constants

const MaxSafeNumber = 1<<53 - 1

MaxSafeNumber defines the largest integer that can be represented exactly in IEEE‑754 double precision (binary64), as required by RFC 8785 (JSON Canonicalization Scheme).

RFC 8785 3.2.2 mandates that all JSON numbers must be preserved exactly when serialized. Since canonical JSON relies on IEEE‑754 binary64, only integers in the range [‑(2^53‑1), +(2^53‑1)] are considered "safe". Any integer outside this range cannot be represented without precision loss and must cause canonicalization to fail.

MaxSafeNumber is therefore set to 2^53‑1 (9007199254740991).
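
Why the bound sits at 2^53 − 1 can be seen in a few lines. This is a minimal sketch: maxSafe mirrors the value of MaxSafeNumber, while toFloat64 and errOutOfRange are hypothetical names standing in for the package's internal check and ErrNumberOOR. Only int64 inputs are handled here.

```go
package main

import (
	"errors"
	"fmt"
)

const maxSafe = 1<<53 - 1 // same value as MaxSafeNumber

// errOutOfRange is a stand-in for the package's ErrNumberOOR.
var errOutOfRange = errors.New("sketch: integer out of safe range")

// toFloat64 converts n to float64, rejecting values outside
// [-(2^53-1), +(2^53-1)].
func toFloat64(n int64) (float64, error) {
	if n < -maxSafe || n > maxSafe {
		return 0, errOutOfRange
	}
	return float64(n), nil
}

func main() {
	// Above the safe range, neighbouring integers collapse to the same float64.
	a, b := int64(1<<53), int64(1<<53+1)
	fmt.Println(float64(a) == float64(b)) // true

	for _, n := range []int64{31, maxSafe, maxSafe + 1, -maxSafe - 1} {
		f, err := toFloat64(n)
		fmt.Println(n, f, err)
	}
}
```

Checking the integer before conversion matters: once the value has been converted, the lost low-order bit can no longer be detected.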

Variables

var (
	// ErrUnsupportedType is returned when the encoder encounters a value
	// of an unsupported type. The encoder only supports specific types
	// like integers, strings, maps, and slices. Custom structs or other
	// complex types may trigger this error.
	ErrUnsupportedType = errors.New("jcs: value has unsupported type")

	// ErrNaN is returned when the encoder encounters a NaN (Not a Number)
	// value. RFC 8785 disallows NaN values in the canonical JSON format,
	// so this error is triggered when an attempt is made to encode a NaN.
	ErrNaN = errors.New("jcs: cannot c14n NaN")

	// ErrInf is returned when the encoder encounters an Inf (infinity) value.
	// RFC 8785 also disallows Inf values (both +Inf and -Inf) in the canonical
	// JSON format, which leads to this error when such values are encountered.
	ErrInf = errors.New("jcs: cannot c14n Inf")

	// ErrInvalidUTF8 is returned when the encoder encounters a string containing
	// invalid UTF-8 byte sequences. JCS requires that all strings be valid UTF-8,
	// so this error is triggered if a string contains non-UTF-8 characters.
	ErrInvalidUTF8 = errors.New("jcs: value has invalid utf8 character")

	// ErrNumberOOR is returned when a number falls outside the range that the
	// IEEE-754 double precision format represents safely, ±(2^53 − 1).
	// Numbers beyond that range may lose precision, and JCS requires
	// exact round-trip encoding. This error occurs when such a number is encountered.
	ErrNumberOOR = errors.New("jcs: value number out of range (v ± 2^53)")
)

Functions

func Append

func Append(dst []byte, v any) ([]byte, error)

Append appends the canonical JSON representation of v, as defined by the JSON Canonicalization Scheme (RFC 8785), to the byte slice dst and returns the extended slice.

Supported types include:

  • nil → serialized as "null"
  • bool → serialized as "true" or "false"
  • string → serialized with proper escaping
  • float64 → serialized as a canonical JSON number
  • float32, int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64 → converted to float64 when within IEEE‑754 safe range (±(2^53 − 1)); otherwise ErrNumberOOR is returned
  • slices of common types (ints, uints, floats, strings, bools, any)
  • map[string]any → serialized as a JSON object with keys ordered by UTF‑16 code unit comparison, as required by RFC 8785

Errors:

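  • ErrNumberOOR is returned when an integer cannot be represented exactly in IEEE‑754 double precision.
  • ErrNaN and ErrInf are returned for NaN and infinite floating‑point values.
  • ErrInvalidUTF8 is returned when a string or object key contains invalid UTF‑8.
  • ErrUnsupportedType is returned when v is of a type not supported by this implementation.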

This function is the core entry point for canonical JSON serialization in the package. It ensures deterministic output suitable for cryptographic operations such as hashing and signing. Example:

	var buf []byte
	response := map[string]any{
		"user_id": "c3f65f70-eb2f-4979-ba73-24bcbde9fdd9",
		"age":     31,
	}
	buf, _ = jcs.Append(buf, response)
	fmt.Println(string(buf))
	// Output: {"age":31,"user_id":"c3f65f70-eb2f-4979-ba73-24bcbde9fdd9"}

Types

This section is empty.

Directories

Path Synopsis
cmd
jcscli command
