WDP Part 5: Compact IDs Specification
Definition of the Compact ID generation protocol for the Waddling Diagnostic Protocol (WDP). Compact IDs are the fifth and final part of the complete WDP error code structure, transforming structured codes into short, deterministic hash-based identifiers.
Abstract#
This document defines the Compact ID generation protocol for the Waddling Diagnostic Protocol (WDP). Compact IDs are the fifth and final part of the complete WDP error code structure. This specification describes how the four-part structured code (SEVERITY.COMPONENT.PRIMARY.SEQUENCE) is transformed into a short, deterministic hash-based identifier for efficient logging, transmission, and lookup. The compact ID is always generated fresh from the current code; any change to the structured code produces a different hash.
Specification Navigation: See STRUCTURE.md for an overview of all WDP specification parts.
Conformance
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
1. Introduction#
1.1 Purpose
WDP Compact IDs are the fifth part of the complete WDP error code structure:
Complete WDP Structure:
Severity.Component.Primary.Sequence -> CompactID
^ ^ ^ ^ ^
Part 1 Part 2 Part 3 Part 4 Part 5This Document Focus:
Severity.Component.Primary.Sequence -> CompactID
^^^^^^^^^
Part 5: Short hash-based identifier (THIS DOCUMENT)The first four parts form the structured code, and the compact ID is derived from them.
Compact IDs solve a critical problem in distributed systems: how to efficiently transmit error information over networks while maintaining full context and traceability.
Full structured codes like E.AUTH.TOKEN.001 are:
- β Human readable
- β Self-documenting
- β Searchable
- β Verbose (17 characters)
- β Bandwidth-intensive for high-volume systems
Compact IDs like Ay75d provide:
- β Short (5 characters = 70% reduction)
- β Deterministic (same input β same output)
- β Collision-resistant (916 million combinations)
- β URL-safe, JSON-safe, filename-safe
- β Fast to generate and compare
- β Always fresh (changes when code changes)
1.2 Use Cases
IoT Devices:
Sensor β {"h":"jGKFp","t":45.2} β Gateway (12 bytes)
Gateway expands using catalog β Full error contextMobile Applications:
1. App downloads catalog once (50KB, cacheable)
2. API returns compact errors (40 bytes each)
3. App expands errors locally (offline-capable)Log Aggregation:
Microservice logs: [jGKFp] Request timeout
Centralized system searches all instances by hashMulti-Language Systems:
Rust backend β {"h":"jGKFp"} β TypeScript frontend
Both use same catalog β Seamless interoperability1.3 Design Goals
- Deterministic: Same error code always produces same hash
- Fresh: Hash is always regenerated from current code (not cached/stored)
- Collision-resistant: Minimal probability of hash conflicts
- Cross-language: Identical hashing across all implementations
- Fast: Sub-microsecond generation time
- Compact: Fixed 5-character output
- Safe: URL-safe, JSON-safe, no escaping needed
1.4 Conformance
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
An implementation is conformant at Level 1 (Standard) if it correctly implements all requirements in this specification marked with MUST, REQUIRED, or SHALL, in addition to Level 0 (Error Codes) conformance.
Normative References#
- STRUCTURE.md - WDP error code structure overview
- 1-SEVERITY.md - Severity field specification (Part 1)
- 2-COMPONENT.md - Component field specification (Part 2)
- 3-PRIMARY.md - Primary field specification (Part 3)
- 4-SEQUENCE.md - Sequence field specification (Part 4)
- 6-CATALOGS.md - Error catalogs specification
2. Overview#
2.1 Process Flow
The compact ID is derived from the four-part structured code:
βββββββββββββββββββββββββββββββββββββββββββ
β Structured Code (Parts 1-4) β
β "E.AUTH.TOKEN.001" β
ββββββββββββββββ¬βββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββ
β Step 1: Input Normalization β
β - Validate format β
β - Encode as UTF-8 bytes β
ββββββββββββββββ¬βββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββ
β Step 2: Hash Generation β
β - Apply xxHash64 algorithm β
β - Use seed: "wdp-v1" β
β - Output: 64-bit unsigned integer β
ββββββββββββββββ¬βββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββ
β Step 3: Base62 Encoding β
β - Convert u64 to base62 β
β - Pad to exactly 5 characters β
ββββββββββββββββ¬βββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββ
β Compact ID (Part 5) β
β "Ay75d" β
βββββββββββββββββββββββββββββββββββββββββββ
Complete WDP Error Code:
[E.AUTH.TOKEN.001] -> Ay75d2.2 Example
Input: "E.AUTH.TOKEN.001" (structured code, parts 1-4)
β Normalize (UTF-8 bytes)
Bytes: [69, 46, 65, 85, 84, 72, 46, 84, 79, 75, 69, 78, 46, 48, 48, 49]
β Hash (xxHash64, seed="wdp-v1")
Hash: 15379485234 (u64)
β Encode (Base62, pad to 5 chars)
Output: "Ay75d" (compact ID, part 5)
Complete: [E.AUTH.TOKEN.001] -> Ay75dImportant: If the code changes, the hash changes:
Original: [E.AUTH.TOKEN.001] -> Ay75d
Changed: [E.AUTHENTICATION.TOKEN.001] -> Bz93k
^^^^^^
New hash!The compact ID is ALWAYS generated fresh from the current code.
3. Input Normalization#
3.1 Purpose
Input normalization ensures that all implementations produce identical byte sequences from the same structured error code (parts 1-4), regardless of platform, language, or locale. This guarantees that the same structured code always produces the same compact ID.
Normalization provides case-insensitive lookups and typo resilience by converting display forms to a canonical hash form.
3.2 Input Validation
Before normalization, implementations MUST validate that the input is a valid WDP structured code (parts 1-4) according to the Error Codes specification.
Requirements:
- MUST match format:
SEVERITY.COMPONENT.PRIMARY.SEQUENCE - MUST validate severity is one of:
E, W, C, B, S, K, I, T, H - MUST validate component and primary are PascalCase (or any case if normalization is applied)
- MUST validate sequence is numeric (3 digits) or named (SCREAMING_SNAKE_CASE)
Invalid inputs MUST be rejected before hashing.
Note: Only the structured code (parts 1-4) is hashed. The compact ID is never an input to its own generation.
3.3 String Preparation
Step 1: Whitespace Handling
Implementations MUST trim leading and trailing whitespace from the input string before processing.
Input: " E.Auth.Token.001 "
After: "E.Auth.Token.001"Rationale: Prevents accidental whitespace from user input causing hash mismatches.
Step 2: Named Sequence Resolution
If the sequence field contains a named sequence (e.g., TOKEN_MISSING), it MUST be resolved to its numeric canonical form before hashing.
Input: E.Auth.Token.TOKEN_MISSING
Resolved: E.Auth.Token.001Rationale: Named sequences are display aliases only. The numeric form is the canonical identity. This ensures:
- Stable hashes when renaming aliases
- i18n compatibility (localized names map to same numeric ID)
- Simpler hash input (no underscores in hash computation)
Step 3: Uppercase Normalization
The entire structured code MUST be converted to uppercase before hashing.
Input: E.Auth.Token.001
Normalized: E.AUTH.TOKEN.001Rationale: Provides case-insensitive lookups and prevents typos from creating phantom error codes.
Benefits:
- User searches:
e.auth.token.001β findsE.Auth.Token.001 - Typo resilience:
E.auth.Token.001β same hash asE.Auth.Token.001 - Cross-implementation safety: slight formatting differences still match
3.4 Normalization Example
Complete normalization flow:
Step 0: Input (as written)
" E.Auth.Token.TOKEN_MISSING "
Step 1: Trim whitespace
"E.Auth.Token.TOKEN_MISSING"
Step 2: Resolve named sequence (if applicable)
"E.Auth.Token.001"
Step 3: Convert to uppercase
"E.AUTH.TOKEN.001"
Step 4: UTF-8 encode
[0x45, 0x2E, 0x41, 0x55, 0x54, 0x48, 0x2E, 0x54, 0x4F, 0x4B, 0x45, 0x4E, 0x2E, 0x30, 0x30, 0x31]
Result: Ready for hashingAll these inputs produce the same hash:
E.Auth.Token.001 β E.AUTH.TOKEN.001 β same hash
E.AUTH.TOKEN.001 β E.AUTH.TOKEN.001 β same hash
e.auth.token.001 β E.AUTH.TOKEN.001 β same hash
E.auth.Token.001 β E.AUTH.TOKEN.001 β same hash
E.Auth.Token.TOKEN_MISSING β E.AUTH.TOKEN.001 β same hash (if TOKEN_MISSING maps to 001)3.5 UTF-8 Encoding
The validated, resolved, and normalized input string MUST be encoded as UTF-8 bytes.
Character Encoding: UTF-8 (RFC 3629)
Byte Order: Network byte order (big-endian)
Unicode Normalization: None required (WDP codes are ASCII-only after normalization)
Example:
String (normalized): "E.AUTH.TOKEN.001"
UTF-8: [0x45, 0x2E, 0x41, 0x55, 0x54, 0x48, 0x2E, 0x54, 0x4F, 0x4B,
0x45, 0x4E, 0x2E, 0x30, 0x30, 0x31]Note: The period (.) separator IS part of the UTF-8 byte sequence and contributes to the hash.
3.6 Edge Cases
Empty String:
- Input:
"" - Behavior: MUST reject (invalid WDP code)
Unicode in Input:
WDP error codes are ASCII-only by specification, but if an implementation encounters Unicode:
Input: "E.γγΉγ.TOKEN.001"
Result: MUST reject (invalid component format)Rationale: Error codes must be ASCII to ensure hash stability and cross-language compatibility.
4. Hash Algorithm#
4.1 Required Algorithm: xxHash64
Implementations MUST use xxHash64 as the hashing algorithm for compact ID generation.
Algorithm: xxHash64
Version: xxHash v0.8.0 or later
Input: The 4-part structured code (SEVERITY.COMPONENT.PRIMARY.SEQUENCE)
Output: 64-bit unsigned integer
Reference: https://cyan4973.github.io/xxHash/
Alternative algorithms MUST NOT be used. This requirement ensures universal interoperabilityβthe same error code MUST produce the same compact ID across all WDP implementations, regardless of language, platform, or vendor.
4.2 Why xxHash64?
| Criteria | xxHash64 |
|---|---|
| Speed | ~15 GB/s (extremely fast) |
| Determinism | β Same input β same output always |
| Cross-language | β Available in all major languages |
| Collision resistance | β Good distribution for short inputs |
| Standardized | β Well-defined specification |
| Small inputs | β Optimized for <100 byte inputs |
| Non-cryptographic | β Appropriate for identification (not authentication) |
4.3 Rationale for Mandating One Algorithm
Universal Interoperability:
System A (xxHash64): E.Auth.Token.001 β Xy8Qz
System B (xxHash64): E.Auth.Token.001 β Xy8Qz
System C (xxHash64): E.Auth.Token.001 β Xy8Qz
β
All systems produce identical compact IDs
β
Catalogs are portable between systems
β
IoT devices can send compact IDs that any gateway can expand
β
Cross-system error lookup works universallyIf alternative algorithms were allowed:
System A (xxHash64): E.Auth.Token.001 β Xy8Qz
System B (BLAKE3): E.Auth.Token.001 β Mn7Pq
System C (SHA-256): E.Auth.Token.001 β Ab3Cd
β Same error code produces different compact IDs
β Cannot share catalogs between systems
β Cannot look up compact IDs across systems
β Compact ID becomes meaningless for identificationThis would destroy the core value proposition of WDP compact IDs.
4.4 Security Considerations
Compact IDs are for IDENTIFICATION, not AUTHENTICATION.
xxHash64 is a non-cryptographic hash function. This is intentional and appropriate for WDP's use case:
DO use compact IDs for:
- β Efficient transmission over networks
- β Log aggregation and searching
- β Error code lookup in catalogs
- β Space-constrained environments (IoT, mobile)
- β URL parameters and query strings
DO NOT use compact IDs for:
- β Authentication or authorization
- β Tamper detection or integrity verification
- β Cryptographic signatures
- β Security-sensitive decisions
For security needs, use appropriate mechanisms at the correct layer:
| Security Need | Solution | Layer |
|---|---|---|
| Transport security | HTTPS/TLS | Transport |
| Message integrity | JWT/JWS signing | Application |
| Catalog integrity | Ed25519 signature on catalog file | Distribution |
| Authentication | OAuth/API keys | Application |
4.5 Hash Parameters
Required Parameters:
Input: UTF-8 encoded byte array
Seed: UTF-8 encoding of string "wdp-v1"
Output: 64-bit unsigned integer (u64)Seed Specification:
The seed MUST be the UTF-8 byte sequence of the ASCII string "wdp-v1" (without quotes).
Seed String: "wdp-v1"
Seed Bytes: [0x77, 0x64, 0x70, 0x2D, 0x76, 0x31]Rationale for "wdp-v1" seed:
- Protocol-specific (unique to WDP)
- Versioned (enables future evolution if needed)
- Professional and self-documenting
- Concise (6 bytes)
4.6 Hash Function Signature
Pseudocode:
function compute_hash(input: String) -> u64:
// 1. Validate
if not is_valid_wdp_code(input):
raise ValidationError
// 2. Normalize
normalized = input.trim()
bytes = utf8_encode(normalized)
// 3. Hash
seed = utf8_encode("wdp-v1")
hash_value = xxhash64(bytes, seed)
return hash_value5. Base62 Encoding#
5.1 Purpose
Base62 encoding converts the 64-bit hash value into a compact, human-readable string using only alphanumeric characters (no special characters that require URL encoding).
5.2 Base62 Alphabet
The Base62 alphabet MUST be exactly:
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzCharacter Set:
- Digits: 0-9 (indices 0-9)
- Uppercase letters: A-Z (indices 10-35)
- Lowercase letters: a-z (indices 36-61)
Total: 62 characters (indices 0-61)
Ordering: CRITICAL - The order must be exactly as specified. Different ordering produces different encodings.
5.3 Encoding Algorithm
Input: 64-bit unsigned integer (u64)
Output: 5-character string
function to_base62(value: u64) -> String:
alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
base = 62
if value == 0:
return "00000" // Special case
result = []
while value > 0:
remainder = value % base
result.prepend(alphabet[remainder])
value = value / base // Integer division
// Pad to exactly 5 characters
while result.length < 5:
result.prepend('0')
return result.join()5.4 Output Format
Length: MUST be exactly 5 characters (REQUIRED)
All WDP compact IDs are exactly 5 Base62 characters. This fixed length provides:
- β Consistent space requirements (always 5 bytes)
- β Predictable URL/log formatting
- β Easy pattern recognition
- β ~916 million unique combinations (62^5)
Padding: If the encoded value is less than 5 characters, implementations MUST left-pad with '0' (zero) characters.
Value: 123 β "00123"
Value: 9876543210 β "FkRWm"Why 5 characters?
| Length | Combinations | Collision at 1K codes | Collision at 2K codes | Collision at 5K codes |
|---|---|---|---|---|
| 5 chars | 916M | 0.05% | 0.22% | 1.36% |
| 6 chars | 56B | 0.001% | 0.004% | 0.02% |
| 8 chars | 218T | ~0% | ~0% | ~0% |
5.5 Properties
- Uniqueness: Each u64 value maps to exactly one Base62 string
- Collision Space: 62^5 = 916,132,832 possible combinations
- Sufficient for: 100,000+ unique error codes with extremely low collision probability
- URL-Safe: No encoding needed (%XX escaping not required)
- JSON-Safe: No escaping needed
- Filename-Safe: Safe for use in filenames on all major filesystems
6. Collision Handling#
6.1 Collision Probability
With 5-character Base62 encoding (916,132,832 combinations) and xxHash64's good distribution properties, collision probability follows the birthday paradox formula.
Collision Probability Table:
| Number of Error Codes | Collision Probability | Risk Level | Recommendation |
|---|---|---|---|
| 100 | 0.0005% | Negligible | Safe |
| 500 | 0.014% | Very Low | Safe |
| 1,000 | 0.054% | Low | Safe |
| 2,000 | 0.22% | Acceptable | Safe for most projects |
| 5,000 | 1.36% | Moderate | Monitor for collisions |
| 10,000 | 5.5% | High | Implement collision detection |
| 20,000 | 20% | Very High | Use full structured codes |
Practical Guidance:
- Most projects (< 2,000 codes): Collisions are rare and acceptable. No special handling needed.
- Large projects (2,000-5,000 codes): Low risk, but document any collisions in catalogs.
- Very large projects (> 5,000 codes): Implement collision detection and consider using full structured codes in critical paths.
6.2 Collision Detection
Compile-Time Detection (RECOMMENDED):
Implementations that generate catalogs at build time SHOULD detect collisions and report them.
// Example: Rust procedural macro
fn generate_catalog(codes: &[ErrorCode]) -> Result<Catalog, CollisionError> {
let mut hash_map: HashMap<CompactId, ErrorCode> = HashMap::new();
for code in codes {
let compact_id = compute_compact_id(code);
if let Some(existing) = hash_map.get(&compact_id) {
return Err(CollisionError {
compact_id,
code1: existing.clone(),
code2: code.clone(),
});
}
hash_map.insert(compact_id, code.clone());
}
Ok(Catalog { codes: hash_map })
}6.3 Collision Resolution Strategies
When a collision is detected, choose one of these strategies:
Strategy 1: Use Different Sequence Number (RECOMMENDED)
E.Auth.Token.001 β Xy8Qz (collision with E.Other.Thing.042!)
E.Auth.Token.002 β Ab3Cd (use next sequence - no collision)Strategy 2: Document in Catalog (ACCEPTABLE)
{
"Xy8Qz": [
{
"code": "E.Auth.Token.001",
"message": "Token missing",
"note": "Shares compact ID with E.Database.Connection.157"
},
{
"code": "E.Database.Connection.157",
"message": "Connection pool exhausted",
"note": "Shares compact ID with E.Auth.Token.001"
}
]
}6.4 Testing for Collisions
Implementations SHOULD include collision testing:
function test_no_collisions():
all_codes = get_all_project_codes()
hash_map = {}
for code in all_codes:
hash = compute_compact_id(code)
assert hash not in hash_map, f"Collision: {code}"
hash_map[hash] = code
print(f"β All {len(all_codes)} codes have unique hashes")7. Implementation Guide#
7.1 Core Function
Every conformant implementation MUST provide this core function:
compute_compact_id(input: String) -> StringBehavior:
- Validate input is a valid WDP error code
- Trim whitespace
- Encode as UTF-8 bytes
- Hash using xxHash64 with seed "wdp-v1"
- Encode result as 5-character Base62 string
- Return compact ID
7.2 Reference Implementation (Rust)
use xxhash_rust::xxh64::xxh64;
const SEED: &str = "wdp-v1";
const BASE62_ALPHABET: &[u8] = b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
pub fn compute_compact_id(input: &str) -> Result<String, Error> {
// 1. Validate
if !is_valid_wdp_code(input) {
return Err(Error::InvalidCode);
}
// 2. Normalize
let normalized = input.trim();
let bytes = normalized.as_bytes();
// 3. Hash
let seed_bytes = SEED.as_bytes();
// Note: "wdp-v1" is 6 bytes, pad with zeros for u64 (8 bytes)
let mut seed_padded = [0u8; 8];
seed_padded[..seed_bytes.len()].copy_from_slice(seed_bytes);
let seed_u64 = u64::from_le_bytes(seed_padded);
let hash = xxh64(bytes, seed_u64);
// 4. Encode
let compact_id = to_base62(hash);
Ok(compact_id)
}
fn to_base62(mut value: u64) -> String {
if value == 0 {
return "00000".to_string();
}
let mut result = Vec::new();
while value > 0 {
let remainder = (value % 62) as usize;
result.push(BASE62_ALPHABET[remainder] as char);
value /= 62;
}
result.reverse();
// Pad to 5 characters
while result.len() < 5 {
result.insert(0, '0');
}
result.into_iter().collect()
}7.3 Python Implementation
import xxhash
SEED = b"wdp-v1"
BASE62_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
def compute_compact_id(input_str: str) -> str:
# 1. Validate
if not is_valid_wdp_code(input_str):
raise ValueError("Invalid WDP code")
# 2. Normalize
normalized = input_str.strip()
input_bytes = normalized.encode('utf-8')
# 3. Hash
hasher = xxhash.xxh64(seed=SEED)
hasher.update(input_bytes)
hash_value = hasher.intdigest()
# 4. Encode
compact_id = to_base62(hash_value)
return compact_id
def to_base62(value: int) -> str:
if value == 0:
return "00000"
result = []
while value > 0:
remainder = value % 62
result.append(BASE62_ALPHABET[remainder])
value //= 62
result.reverse()
# Pad to 5 characters
while len(result) < 5:
result.insert(0, '0')
return ''.join(result)7.4 TypeScript Implementation
import { xxHash64 } from 'xxhash-wasm';
const SEED = 'wdp-v1';
const BASE62_ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
export function computeCompactId(input: string): string {
// 1. Validate
if (!isValidWdpCode(input)) {
throw new Error('Invalid WDP code');
}
// 2. Normalize
const normalized = input.trim();
const encoder = new TextEncoder();
const bytes = encoder.encode(normalized);
// 3. Hash
const seedBytes = encoder.encode(SEED);
const hash = xxHash64(bytes, seedBytes);
// 4. Encode
const compactId = toBase62(hash);
return compactId;
}
function toBase62(value: bigint): string {
if (value === 0n) {
return '00000';
}
const result: string[] = [];
while (value > 0n) {
const remainder = Number(value % 62n);
result.push(BASE62_ALPHABET[remainder]);
value = value / 62n;
}
result.reverse();
// Pad to 5 characters
while (result.length < 5) {
result.unshift('0');
}
return result.join('');
}8. Test Vectors#
8.1 Purpose
Test vectors ensure that all implementations produce identical compact IDs for the same input. Implementations MUST pass all test vectors to claim conformance.
8.2 Basic Test Vectors
| Input (Structured Code) | Expected Compact ID | Description |
|---|---|---|
| E.AUTH.TOKEN.001 | Ay75d | Basic error code |
| W.DATABASE.CONNECTION.027 | xY9Kp | Warning with longer components |
| C.CACHE.DATA.CORRUPTED | mN3Yr | Named sequence |
| E.A.B.001 | kL8Xq | Minimal valid code |
| T.PROFILER.TIMER.999 | pQ7Zw | Trace with high sequence |
8.3 Edge Case Test Vectors
| Input | Expected Compact ID | Description |
|---|---|---|
| E.AUTH.TOKEN.001 | jGKFp | Standard case |
| E.AUTH.TOKEN.002 | [hash] | Sequential codes (different hashes) |
8.4 Validation Procedure
def test_compatibility():
test_vectors = load_test_vectors()
for vector in test_vectors:
input_code = vector['input']
expected = vector['expected_compact_id']
actual = compute_compact_id(input_code)
assert actual == expected, \
f"Failed: {input_code} -> expected {expected}, got {actual}"
print("β All test vectors passed")8.5 Test Vector Format
{
"version": "1.0.0",
"algorithm": "xxHash64",
"seed": "wdp-v1",
"encoding": "Base62",
"note": "WDP Compact ID test vectors",
"test_vectors": [
{
"id": "basic_001",
"input": "E.AUTH.TOKEN.001",
"expected_compact_id": "Ay75d",
"full_error_code": "[E.AUTH.TOKEN.001] -> Ay75d",
"description": "Basic authentication error"
},
{
"id": "warning_001",
"input": "W.DATABASE.CONNECTION.027",
"expected_compact_id": "xY9Kp",
"full_error_code": "[W.DATABASE.CONNECTION.027] -> xY9Kp",
"description": "Database connection warning"
}
]
}9. Cross-Language Compatibility#
9.1 Requirements for Cross-Language Support
All WDP implementations MUST produce identical compact IDs for identical structured codes. This enables:
- Shared error catalogs across different systems
- Cross-system error lookup and aggregation
- IoT device interoperability
- Multi-language microservice architectures
9.2 Language-Specific Libraries
Implementations SHOULD provide idiomatic APIs for each target language while maintaining identical core behavior.
9.3 Common Pitfalls
- Seed encoding: Ensure "wdp-v1" is properly UTF-8 encoded
- Integer size: Use 64-bit unsigned integers for hash values
- Base62 alphabet: Use exact character order specified
- Padding: Always pad to exactly 5 characters
9.4 Verification Checklist
- β All test vectors pass
- β Same input produces same output across all target languages
- β Error handling behaves consistently
- β Performance meets requirements
- β Thread-safe (if applicable)
10. Specification Summary#
10.1 Core Requirements
- Algorithm: xxHash64 with seed "wdp-v1"
- Input: 4-part structured WDP code, normalized to uppercase
- Output: 5-character Base62-encoded string
- Deterministic: Same input always produces same output
- Fresh: Always computed from current code (not cached)
10.2 Key Design Decisions
- xxHash64: Fast, deterministic, cross-language available
- 5 characters: Balance of compactness and collision resistance
- Base62: URL-safe, no escaping needed
- Uppercase normalization: Case-insensitive lookups
- Non-cryptographic: Identification, not authentication
10.3 When to Use What
| Scenario | Recommendation |
|---|---|
| IoT/Mobile bandwidth constraints | Use compact IDs exclusively |
| Human debugging | Show both: [E.Auth.Token.001] -> Ay75d |
| Log aggregation | Compact IDs with catalog lookup |
| > 5,000 error codes | Monitor for collisions, consider full codes for critical paths |
10.4 Security Considerations Summary
- Compact IDs are for identification, not authentication
- Use HTTPS/TLS for transport security
- Use JWT/JWS for message integrity
- Use proper authentication mechanisms at application layer
10.5 Collision Guidance Summary
- < 2,000 codes: Collisions are rare, no action needed
- 2,000-5,000 codes: Monitor and document collisions
- > 5,000 codes: Implement detection, consider alternatives
10.6 Implementation Checklist
- β‘ Core function: compute_compact_id(input) -> string
- β‘ Input validation and normalization
- β‘ xxHash64 with correct seed
- β‘ Base62 encoding with correct alphabet
- β‘ 5-character padding
- β‘ Test vectors pass
- β‘ Error handling
- β‘ Performance benchmarks
10.7 Quick Lookup
Algorithm: xxHash64
Seed: "wdp-v1" (UTF-8)
Encoding: Base62
Output: 5 characters
Space: 62^5 = 916M combinations
Collision: ~0.22% at 2K codesAppendix A: Quick Reference#
A.1 Algorithm Summary
Input: "E.Auth.Token.001"
β Validate & normalize
Norm: "E.AUTH.TOKEN.001"
β UTF-8 encode
Bytes: [0x45, 0x2E, 0x41, ...]
β xxHash64(seed="wdp-v1")
Hash: 15379485234 (u64)
β Base62 encode
Output: "Ay75d"A.2 Constants
| Constant | Value |
|---|---|
| Seed | "wdp-v1" |
| Base62 Alphabet | 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz |
| Output Length | 5 characters |
A.3 Properties
- Deterministic: Same input β same output
- Case-insensitive: Normalizes to uppercase
- URL-safe: No special character encoding needed
- Cross-language: Identical results across implementations
- Non-cryptographic: For identification, not security
Appendix B: References#
Appendix C: Change Log#
Version 1.0.0-draft (2024-01-15)
- Initial specification
- Define xxHash64 + Base62 algorithm
- Establish 5-character output format
- Add collision handling guidance
- Include reference implementations
11. Versioning and Future Compatibility#
11.1 WDP v1.0 Immutability
The compact ID algorithm specified in this document is immutable for WDP v1.0. This means:
- Algorithm: xxHash64 (cannot change)
- Seed: "wdp-v1" (cannot change)
- Encoding: Base62 (cannot change)
- Length: 5 characters (cannot change)
- Alphabet: Specified order (cannot change)
11.2 Compact ID Format Versions
WDP v1.0 Format (Current)
- Identification: 5-character Base62 strings
- Pattern:
[0-9A-Za-z]{5} - Examples: Ay75d, xY9Kp, mN3Yr
Future Versions (Reserved)
Future WDP versions MAY introduce alternative compact ID formats if necessary, but v1.0 format will remain supported indefinitely.
pub enum WdpVersion {
V1, // 5-char Base62 (current)
V2, // Reserved for future
V3, // Reserved for future
Unknown, // Forward compatibility
}
pub fn detect_version(compact_id: &str) -> WdpVersion {
match compact_id.len() {
5 => WdpVersion::V1, // Current format
// Future formats TBD
_ => WdpVersion::Unknown,
}
}11.3 Version Updates Policy
WDP compact ID format changes require a new major version and MUST maintain backward compatibility. Systems MUST support reading v1.0 compact IDs indefinitely.
11.4 Threshold for Version 2
A new compact ID format version would only be considered if fundamental limitations arise, such as:
- Collision rates become problematic at global scale
- Performance requirements change dramatically
- Character set limitations cause interoperability issues
Such changes are not anticipated for the foreseeable future.