Serialization and Deserialization with Solace Schema Registry
Serialization means converting an application's data objects from their native format into a binary format suitable for transmission or storage. For example, a serializer might take a Java object and convert it to Avro binary format using a schema from the registry.
Deserialization is converting the binary data back into its original format for application processing.
Serialization and deserialization of messages with Solace Messaging APIs is supported by the SERDES Collection. These APIs allow you to specify the schema you want to use to perform SERDES operations on Solace event messages. Each different API and payload format pair needs its own SERDES.
You deploy the SERDES Collection alongside the Solace Messaging API for the programming language and schema format you are using to develop your application.
The following topics provide language-agnostic information about SERDES configuration and concepts that apply across all messaging protocols and programming languages.
- Using the SERDES Collection with Solace Schema Registry
- API and Protocol-Specific Implementation Guides
- Getting Started with SERDES
- Choosing Schema Resolution Strategies
- Optimizing Performance
- Advanced Configuration
- Cross-Protocol Message Translation
Using the SERDES Collection with Solace Schema Registry
In modern messaging systems, managing data formats across distributed applications can present challenges. As you scale your applications, differences in how message producers and consumers serialize and interpret data can lead to issues with data consistency and version compatibility. Without a shared understanding of the message format, mismatches such as missing fields, unexpected data types, or incompatible versions can result in runtime errors. To prevent these problems and streamline data exchange in Solace messaging environments, we recommend you use Solace Schema Registry with the SERDES Collection.
Understanding schemas
A schema defines the structure, data types, and validation rules for messages exchanged between producers and consumers, ensuring that they agree on the format of the data being exchanged. In formats like Avro or JSON Schema, a schema explicitly describes each field's name, type, and whether it is required. This enables compatibility checks, validation, and tooling support, and ensures consistent data interpretation across services, especially when schemas are managed in a schema registry.
Understanding the details of Schema Registry
A schema registry is a centralized service that stores and manages schemas used for serializing and deserializing structured data. It ensures that applications exchanging messages follow a consistent data format, which prevents compatibility issues between message producers and consumers. By using Solace Schema Registry, you can enforce data validation, track schema versions, and support schema evolution in a way that maintains compatibility with applications using older schema versions.
Understanding SERDES and Schema Registry Interactions and Workflows
When integrated with the SERDES Collection, your applications can automatically retrieve and apply the correct schema from Solace Schema Registry during message serialization and deserialization. When a message producer sends a message, the serializer component encodes the data according to a registered schema, including a schema identifier stored in the SERDES header appended to your message. On the message consumer side, the application extracts the schema identifier from the header, retrieves the corresponding schema from the registry, and ensures the message is correctly parsed. This header-based approach allows for efficient schema resolution without modifying the message payload itself. This interaction enables dynamic schema resolution, eliminates the need to hardcode schemas in applications, and simplifies the management of structured data formats like Avro and JSON Schema in your messaging applications.
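The header-based flow described above can be illustrated with a small in-memory simulation. This is only a sketch: the class, the `schema-id` header name, and the map standing in for the registry are hypothetical and are not part of the SERDES Collection. It shows the producer attaching a schema identifier to a header while leaving the payload untouched, and the consumer using that identifier to look up the schema.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** In-memory illustration only; this is not the Solace SERDES client. */
class HeaderResolutionSketch {
    // Hypothetical header name, used only in this sketch.
    static final String SCHEMA_ID_HEADER = "schema-id";

    // schemaId -> schema definition, standing in for the schema registry.
    static final Map<Long, String> REGISTRY = new HashMap<>();

    /** A message whose payload is untouched; the schema ID travels in a header. */
    static final class Message {
        final Map<String, Object> headers = new HashMap<>();
        byte[] payload;
    }

    /** Producer side: encode the payload and record the schema ID in a header. */
    static Message serialize(String value, long schemaId) {
        if (!REGISTRY.containsKey(schemaId)) {
            throw new IllegalStateException("Schema " + schemaId + " is not registered");
        }
        Message message = new Message();
        message.payload = value.getBytes(StandardCharsets.UTF_8);
        message.headers.put(SCHEMA_ID_HEADER, schemaId);
        return message;
    }

    /** Consumer side: read the ID from the header and fetch the matching schema. */
    static String resolveSchema(Message message) {
        Object id = message.headers.get(SCHEMA_ID_HEADER);
        if (!(id instanceof Long)) {
            throw new IllegalStateException("Missing schema ID header");
        }
        String schema = REGISTRY.get((Long) id);
        if (schema == null) {
            throw new IllegalStateException("Unknown schema ID: " + id);
        }
        return schema;
    }

    public static void main(String[] args) {
        REGISTRY.put(1L, "{\"type\":\"record\",\"name\":\"User\"}");
        Message message = serialize("{\"name\":\"Ada\"}", 1L);
        System.out.println("Resolved schema: " + resolveSchema(message));
    }
}
```

In a real application, the serializer and deserializer objects from the SERDES Collection perform these steps for you; the sketch only makes the division of responsibilities visible.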
Steps for Using the SERDES Collection with Solace Schema Registry
- Set up a Solace Schema Registry instance, then define and register your schemas.
- Choose a serialization format. Solace currently supports Avro and JSON Schema.
- Configure your SERDES objects (Avro or JSON Schema).
- JSON Schema and Avro share a common SERDES configuration API for registry access, resolution, and caching.
- Use a serializer to serialize outbound messages and a deserializer to deserialize inbound messages.
For more information, see Solace Schema Registry.
API and Protocol-Specific Implementation Guides
The following pages provide protocol-specific implementation details, including code examples, Java imports, and language-specific considerations for using SERDES with different messaging protocols:
- Serializing and Deserializing Messages with the Solace JCSMP API—Java implementation for JCSMP messaging with complete code examples and JCSMP-specific considerations.
- Leveraging REST Messaging with Solace Schema Registry: Serializing and Deserializing Messages in Java—Java implementation for REST messaging with HTTP client/server examples and REST-specific considerations.
- Leveraging AMQP Messaging with Solace Schema Registry: Serializing and Deserializing Messages in Java—Java implementation for AMQP 1.0 messaging using Apache Qpid JMS with AMQP-specific considerations.
Getting Started with SERDES
Before configuring advanced features like resolution strategies or performance optimization, you need to connect to your Schema Registry instance and understand basic SERDES configuration. This section covers the essential setup steps to get started.
All SERDES properties are set in a configuration map. For protocol-specific examples of how to create and configure this map, see API and Protocol-Specific Implementation Guides.
Connecting to Schema Registry
The Generic SERDES for Java supports multiple authentication methods for connecting to Solace Schema Registry. This section covers how to configure authentication credentials for your SERDES objects, including basic username and password authentication, and how to implement them in both secure and non-secure connection contexts. You specify the authentication settings when you configure your schema registry client, which allows you to control access permissions and maintain the security of your schemas.
The following properties configure how your application connects to Solace Schema Registry:
- REGISTRY_URL: The URL of Solace Schema Registry where schemas are stored. This property is required in all serializer and deserializer configurations. The registry URL must point to the Registry REST API endpoint. The default path for this endpoint is /apis/registry/v3.
- AUTH_USERNAME: The username to use to authenticate with Solace Schema Registry.
- AUTH_PASSWORD: The password to use to authenticate with Solace Schema Registry.
- TRUST_STORE_PATH: The file path to the truststore containing SSL/TLS certificates for secure connections.
- TRUST_STORE_PASSWORD: The password required to access the truststore file.
- VALIDATE_CERTIFICATE: A flag that determines whether SSL certificate validation is enabled (true) or disabled (false).
We recommend that you never disable VALIDATE_CERTIFICATE in production environments because it creates a security vulnerability.
The following sections describe how to set these properties based on which authentication option you use:
- Authentication, plain text connection: Used in restricted environments where network access is controlled but authentication is still required. Set REGISTRY_URL to an HTTP endpoint (for example, http://localhost:8081/apis/registry/v3), and provide AUTH_USERNAME and AUTH_PASSWORD credentials.
- Authentication, secure connection: The recommended production setup. This setup ensures both data encryption and access control for secure schema management. Set REGISTRY_URL to an HTTPS endpoint (for example, https://{your_hostname}:443/apis/registry/v3), provide AUTH_USERNAME and AUTH_PASSWORD credentials, configure TRUST_STORE_PATH and TRUST_STORE_PASSWORD for TLS, and set VALIDATE_CERTIFICATE to true.
- Advanced authentication: Used when you configure Schema Registry with an external identity provider such as Microsoft Entra ID. When using advanced authentication flows, you configure the same AUTH_USERNAME and AUTH_PASSWORD properties. In this case, AUTH_USERNAME maps to your OAuth Client ID, and AUTH_PASSWORD maps to your OAuth Client Secret from your OAuth configuration. For more information about configuring external identity providers with Schema Registry, see Deploying and Configuring Solace Schema Registry with Docker or Podman.
For protocol-specific code examples showing how to configure these properties, see API and Protocol-Specific Implementation Guides.
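As a rough, library-independent illustration of the connection options above, the following sketch builds the configuration map for the plain text and secure cases. The string keys are placeholders that simply reuse the property names from this page; in a real application you would use the property constants provided by the SERDES library, and the host name, credentials, and truststore path shown here are made-up example values.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: keys are placeholder strings, not the library's constants. */
class RegistryConnectionConfigSketch {
    static final String REGISTRY_URL = "REGISTRY_URL";
    static final String AUTH_USERNAME = "AUTH_USERNAME";
    static final String AUTH_PASSWORD = "AUTH_PASSWORD";
    static final String TRUST_STORE_PATH = "TRUST_STORE_PATH";
    static final String TRUST_STORE_PASSWORD = "TRUST_STORE_PASSWORD";
    static final String VALIDATE_CERTIFICATE = "VALIDATE_CERTIFICATE";

    /** Authentication over a plain text (HTTP) connection. */
    static Map<String, Object> plainTextConfig(String url, String user, String password) {
        Map<String, Object> config = new HashMap<>();
        config.put(REGISTRY_URL, url);
        config.put(AUTH_USERNAME, user);
        config.put(AUTH_PASSWORD, password);
        return config;
    }

    /** Authentication over a secure (HTTPS) connection with certificate validation. */
    static Map<String, Object> secureConfig(String url, String user, String password,
                                            String trustStorePath, String trustStorePassword) {
        if (!url.startsWith("https://")) {
            throw new IllegalArgumentException("Secure configuration requires an HTTPS registry URL");
        }
        Map<String, Object> config = plainTextConfig(url, user, password);
        config.put(TRUST_STORE_PATH, trustStorePath);
        config.put(TRUST_STORE_PASSWORD, trustStorePassword);
        // Keep certificate validation enabled, as recommended for production.
        config.put(VALIDATE_CERTIFICATE, true);
        return config;
    }

    public static void main(String[] args) {
        Map<String, Object> config = secureConfig(
                "https://registry.example.com:443/apis/registry/v3", // example host
                "example-user", "example-password",
                "/path/to/truststore.jks", "truststore-password");
        System.out.println("Configured properties: " + config.keySet());
    }
}
```

For advanced authentication, the same map shape applies, with the OAuth Client ID and Client Secret supplied as the username and password values.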
Basic Configuration Setup
All SERDES configuration properties are set in a configuration map object. The specific implementation varies by protocol and programming language, but the concept remains consistent across all implementations. You create a configuration map, populate it with the necessary properties (such as registry URL, authentication credentials, and format-specific settings), and then use this map to configure your serializer and deserializer objects.
For detailed examples of how to create and configure the configuration map for your specific protocol, see API and Protocol-Specific Implementation Guides. These pages provide complete code examples showing how to instantiate the configuration map, set properties, and configure serializers and deserializers.
Choosing Schema Resolution Strategies
Artifact resolver strategies determine how the serializer selects which schema to use when serializing messages. The strategy you choose depends on your application architecture, topic structure, and how you organize schemas in your registry. This section helps you understand the available strategies and choose the right one for your use case.
- Understanding Resolution Strategies
- Destination ID Strategy
- Solace Topic ID Strategy
- Solace Topic ID Strategy with Profiles
- Record ID Strategy (Avro Only)
Understanding Resolution Strategies
Artifact resolver strategies define how a schema or artifact is dynamically resolved at runtime for serialization operations performed by the SERDES Collection. These strategies determine which schema to fetch from Solace Schema Registry based on metadata like Solace topics, destination IDs, and record IDs. Artifact resolver strategies are only applied during serialization, because the serializer is responsible for selecting or creating the artifact ID to register or reference a schema. Deserializers simply use the schema ID embedded in the incoming message to retrieve the correct schema, without needing to determine how that ID was formed.
Key points about resolution strategies:
- Strategies only apply during serialization (deserializers use the schema ID from the message)
- The strategy determines how to construct an ArtifactReference from message metadata
- Different strategies offer different levels of flexibility and complexity
Some important terms:
- record: A structured data object, for example an Avro or JSON Schema message, that is serialized or deserialized using a schema from Solace Schema Registry.
- artifact: A named and versioned container in Solace Schema Registry that holds one or more schema versions. Each artifact is identified by an artifactId and optionally grouped using a groupId.
- ArtifactReference: A pointer to an existing artifact in Solace Schema Registry. When the artifact contains a schema, this reference can be used to locate and apply it during serialization or deserialization.
- artifactId: A unique identifier for an artifact in Solace Schema Registry, which typically contains a single schema.
- groupId: A logical grouping mechanism for artifacts in Solace Schema Registry. It allows organizing related schemas, for example all schemas for a specific application. If no groupId is specified, the value is default.
Strategy Comparison
The following table compares the available resolution strategies to help you choose the right one for your use case:
| Strategy | Best For |
|---|---|
| Destination ID | Simple destination-to-schema mapping |
| Solace Topic ID | Direct topic name to schema mapping |
| Topic ID with Profiles | Multiple topics sharing schemas, wildcard patterns |
| Record ID (Avro) | Schema information embedded in record |
Destination ID Strategy
Use the Destination ID Strategy when you have a simple one-to-one mapping between message destinations and schemas. This works well when each REST endpoint, AMQP queue, or messaging destination consistently uses the same schema, and your schema registry follows a naming convention where artifact IDs match your destination names.
The DestinationIdStrategy automatically determines which schema to use during serialization. It extracts the destination name from the record metadata to use as the artifactId, while applying a default value as the groupId to construct the complete ArtifactReference. This approach eliminates the need for explicit schema specification on a per message basis because it creates a direct mapping between message destinations and schema artifacts stored in the registry. For successful implementation, Solace Schema Registry must contain schemas with artifactIds that correspond to the destination names used in your application's messaging infrastructure. The strategy is valuable in systems where different message types are routed to different destinations, as it reduces configuration overhead and decouples serialization logic from schema-specific details.
In summary, the DestinationIdStrategy:
- Uses the destination name defined by the parameter passed into the serialize() method call. In REST and AMQP implementations, this is typically an endpoint name or path specified directly by the user. In Solace JCSMP API implementations, this destination is automatically extracted from the message topic via the serialize() method.
- Uses this destination name as the artifactId in the ArtifactReference.
- Uses a default value as the groupId in the ArtifactReference.
- Uses the constructed ArtifactReference to locate the correct schema in Solace Schema Registry.
Solace Schema Registry must contain schemas with artifactIds that match the destination names used in your application.
To configure a serializer to use the DestinationIdStrategy, set the ARTIFACT_RESOLVER_STRATEGY property to DestinationIdStrategy.class. For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
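The mapping that the DestinationIdStrategy performs can be pictured with a few lines of plain Java. This is a conceptual sketch, not the library's implementation: the `Ref` class is a hypothetical stand-in for an ArtifactReference, and the only behavior modeled is the one described above (destination name becomes the artifactId, groupId takes the default value).

```java
/** Conceptual sketch of destination-to-artifact mapping; not the library code. */
class DestinationIdSketch {
    /** The groupId value applied when none is specified. */
    static final String DEFAULT_GROUP_ID = "default";

    /** Hypothetical stand-in for an ArtifactReference. */
    static final class Ref {
        final String groupId;
        final String artifactId;

        Ref(String groupId, String artifactId) {
            this.groupId = groupId;
            this.artifactId = artifactId;
        }

        @Override
        public String toString() {
            return groupId + ":" + artifactId;
        }
    }

    /** Uses the destination name as the artifactId and the default groupId. */
    static Ref resolve(String destinationName) {
        if (destinationName == null || destinationName.isEmpty()) {
            throw new IllegalArgumentException("A destination name is required");
        }
        return new Ref(DEFAULT_GROUP_ID, destinationName);
    }

    public static void main(String[] args) {
        // A schema registered with artifactId "solace/samples" would be selected.
        System.out.println(resolve("solace/samples"));
    }
}
```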
Solace Topic ID Strategy
Use the Solace Topic ID Strategy when your Solace topic names directly correspond to schema artifact IDs in your registry. This is ideal for Solace applications where topic hierarchies are well-defined and schema naming follows the same structure.
The SolaceTopicIdStrategy automatically determines which schema to use during serialization by mapping the destination name directly to the ArtifactReference, setting the destination string as the artifactId. This approach simplifies schema resolution by creating a direct correlation between Solace topic destinations and schema artifacts stored in the registry. For successful implementation, Solace Schema Registry must contain schemas with artifactIds that correspond to the topic names used in your Solace messaging infrastructure.
In summary, the SolaceTopicIdStrategy:
- Uses the destination name defined by the parameter passed into the serialize() method call. In REST and AMQP implementations, this is typically an endpoint name or path specified directly by the user. In Solace JCSMP API implementations, this destination is automatically extracted from the message topic via the serialize() method.
- Maps this destination name directly as the artifactId in the ArtifactReference.
- Uses the constructed ArtifactReference to locate the correct schema in Solace Schema Registry.
Solace Schema Registry must contain schemas with artifactIds that match the topic names used in your Solace messaging application.
To configure a serializer to use the SolaceTopicIdStrategy without a profile, set the ARTIFACT_RESOLVER_STRATEGY property to SolaceTopicIdStrategy.class. For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
Solace Topic ID Strategy with Profiles
Use the Solace Topic ID Strategy with Profiles when you need flexible topic-to-schema mappings, such as when multiple topics should use the same schema, or when you want to use wildcard patterns to map topic hierarchies to schemas. This strategy is the most flexible and is recommended when you have complex topic structures or need to map many topics to fewer schemas.
For more advanced mapping scenarios, the SolaceTopicIdStrategy can be used with a SolaceTopicProfile to provide flexible topic-to-schema mappings. This approach allows for wildcard topic expressions and custom artifact references, giving you greater control over schema resolution.
Using profiles with SolaceTopicIdStrategy provides several benefits:
- Support for wildcard topic expressions to match multiple topics with a single mapping. You can use * for single-level wildcards and > for multi-level wildcards. For more information, see Wildcard Characters in Topic Subscriptions.
- The ability to map different topic patterns to specific schemas.
- Control over groupId, artifactId, and versioning for schema selection.
The SolaceTopicProfile can be configured with different types of mappings:
Topic Expression Only Mapping
This method maps a Solace topic expression directly to an artifactId that matches the topic expression itself. The groupId is left as default. Use this method when your schema registry follows a convention where schemas are registered with an artifactId that matches your topic hierarchy.
To implement this mapping:
- Create a new instance of SolaceTopicProfile.
- Create mappings with topic expressions using SolaceTopicArtifactMapping.create() with patterns like:
  - Exact match: "solace/samples"
  - Single-level wildcard: "solace/*/sample"
  - Multi-level wildcard: "solace/>"
- Add each mapping to the profile.
- Set the STRATEGY_TOPIC_PROFILE property to the configured profile.
For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
Topic Expression with Custom Artifact ID
This method maps a Solace topic expression to a specific, custom artifactId. The groupId is left as "default". Use this method when multiple topics should use the same schema, but your topic names do not match your schema registry naming conventions.
To implement this mapping:
- Create a new instance of SolaceTopicProfile.
- Create mappings with topic expressions and custom artifact IDs using SolaceTopicArtifactMapping.create(topicExpression, artifactId). For example:
  - Map "solace/samples" to artifact ID "User"
  - Map "solace/*/sample" to artifact ID "NewUser"
  - Map "solace/>" to artifact ID "OldUser"
- Add each mapping to the profile.
- Set the STRATEGY_TOPIC_PROFILE property to the configured profile.
For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
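To make the wildcard mappings above more concrete, here is a self-contained sketch of how topic expressions could be matched against a published topic. It does not use the SolaceTopicProfile class. The matching rules are simplified (a * is treated as a whole-level wildcard, and > must be the last level and matches one or more remaining levels), and the first-match-wins ordering is an assumption made for this sketch rather than documented library behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Simplified model of topic-expression-to-artifact mappings; not the library code. */
class TopicProfileSketch {
    // Topic expression -> artifactId, kept in insertion order.
    private final Map<String, String> mappings = new LinkedHashMap<>();

    TopicProfileSketch add(String topicExpression, String artifactId) {
        mappings.put(topicExpression, artifactId);
        return this;
    }

    /** Simplified matching: "*" matches one whole level, a trailing ">" matches one or more levels. */
    static boolean matches(String expression, String topic) {
        String[] exprLevels = expression.split("/");
        String[] topicLevels = topic.split("/");
        for (int i = 0; i < exprLevels.length; i++) {
            boolean lastLevel = i == exprLevels.length - 1;
            if (lastLevel && exprLevels[i].equals(">")) {
                return topicLevels.length > i; // at least one more level must remain
            }
            if (i >= topicLevels.length) {
                return false;
            }
            if (!exprLevels[i].equals("*") && !exprLevels[i].equals(topicLevels[i])) {
                return false;
            }
        }
        return exprLevels.length == topicLevels.length;
    }

    /** Assumption for this sketch: the first mapping that matches wins. */
    Optional<String> artifactIdFor(String topic) {
        for (Map.Entry<String, String> mapping : mappings.entrySet()) {
            if (matches(mapping.getKey(), topic)) {
                return Optional.of(mapping.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        TopicProfileSketch profile = new TopicProfileSketch()
                .add("solace/samples", "User")
                .add("solace/*/sample", "NewUser")
                .add("solace/>", "OldUser");
        System.out.println(profile.artifactIdFor("solace/emea/sample"));
    }
}
```

Because "solace/samples" also matches "solace/>", the order in which mappings are evaluated matters in this model; check the protocol-specific guides for how the actual profile resolves overlapping expressions.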
Topic Expression with Full ArtifactReference
This method maps a Solace topic expression to a complete ArtifactReference with a custom groupId, artifactId, and optional version for exact schema selection. Use this method when you need complete control over schema resolution, especially when you have multiple schema groups or when specific versions must be used.
To implement this mapping:
- Create a new instance of SolaceTopicProfile.
- Use an ArtifactReferenceBuilder to create artifact references with:
  - groupId (for example, "com.solace.samples.serdes.avro.schema")
  - artifactId (for example, "User")
  - Optional version (for example, "0.0.1") for version-specific references
- Create mappings using SolaceTopicArtifactMapping.create(topicExpression, artifactReference).
- Add each mapping to the profile.
- Set the STRATEGY_TOPIC_PROFILE property to the configured profile.
For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
Record ID Strategy (Avro Only)
The RecordIdStrategy automatically determines which schema to use during serialization by extracting information directly from the record's payload. It uses the schema name as the artifactId and the schema namespace as the groupId to construct the complete ArtifactReference. This approach eliminates the need for explicit schema specification on a per-message basis because it creates a direct mapping between the data structure of your records and schema artifacts stored in the registry. For successful implementation, Solace Schema Registry must contain schemas with an artifactId and a groupId that correspond to the schema name and namespace used in your application's data models. The strategy is valuable in systems where the schema information is inherently contained within the record structure, as it reduces configuration overhead and ensures that the correct schema is always used for each record type.
Use the Record ID Strategy when your Avro records contain schema information (name and namespace) that should determine which schema to use. This strategy is valuable when the schema information is inherently part of your data model and you want the serializer to automatically extract it from each record.
In summary, the RecordIdStrategy:
- Extracts the schema from the record's payload.
- Uses the schema name as the artifactId in the ArtifactReference.
- Uses the schema namespace as the groupId in the ArtifactReference.
- Uses the constructed ArtifactReference to locate the correct schema in Solace Schema Registry.
Solace Schema Registry must contain schemas with artifactIds that match the schema names and groupIds that match the schema namespaces used in your application.
To configure a serializer to use the RecordIdStrategy, set the ARTIFACT_RESOLVER_STRATEGY property to RecordIdStrategy.class. For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
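As an illustration of the name and namespace mapping described above, the following sketch derives a groupId and artifactId from an Avro-style full name such as com.solace.samples.serdes.avro.schema.User. It is not the RecordIdStrategy implementation and does not use the Avro library. Falling back to the default group when a schema has no namespace is an assumption made for this sketch.

```java
/** Conceptual sketch of the record-to-artifact mapping; not the library code. */
class RecordIdSketch {
    /**
     * Splits an Avro-style full name into {groupId, artifactId}.
     * The namespace becomes the groupId and the simple name becomes the artifactId.
     */
    static String[] resolve(String schemaFullName) {
        int lastDot = schemaFullName.lastIndexOf('.');
        // Assumption: a schema without a namespace maps to the "default" group.
        String groupId = lastDot < 0 ? "default" : schemaFullName.substring(0, lastDot);
        String artifactId = schemaFullName.substring(lastDot + 1);
        return new String[] { groupId, artifactId };
    }

    public static void main(String[] args) {
        String[] ref = resolve("com.solace.samples.serdes.avro.schema.User");
        System.out.println("groupId=" + ref[0] + ", artifactId=" + ref[1]);
    }
}
```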
Optimizing Performance
Once your SERDES implementation is working, you can optimize performance through caching and lookup configuration. These settings control how frequently your application queries the schema registry and how it handles schema resolution, allowing you to balance performance, freshness, and resilience.
Schema Caching Options
To reduce the overhead of repeated schema registry lookups, clients using the SERDES Collection can configure schema caching options. These settings control how long schemas are stored locally and how different types of lookups interact with the schema registry cache.
- CACHE_TTL_MS: Determines how long schema artifacts remain valid in the cache before they need to be fetched again from the registry on the next relevant lookup. The default value is 30000 ms (30 seconds). Longer TTLs improve performance by reducing registry calls but risk using outdated schemas; shorter TTLs ensure fresher schemas but increase registry load. A zero TTL disables caching entirely, requiring registry fetches for every request. This property accepts:
  - Long value (for example, 5000L for 5 seconds)
  - String value (for example, "5000")
  - Duration object (for example, 5 seconds)
  - Zero (0L) to disable caching completely
- USE_CACHED_ON_ERROR: Controls whether to use cached schemas when schema registry lookup errors occur. When enabled (true), schema resolution uses cached schemas instead of throwing exceptions after retry attempts are exhausted, improving resilience during registry outages. When disabled (false, the default), the API throws an exception when registry lookup errors occur.
- CACHE_LATEST (Serializers Only): Controls whether schema lookups that specify latest or do not include an explicit version (no-version) create additional cache entries. When enabled (true, the default), these lookups create additional cache entries, so that subsequent latest or no-version lookups can use the cached schema without contacting the registry. When disabled (false), only the resolved version is cached, meaning every latest or versionless lookup must query the registry to determine the current latest version.
  - When an explicit schema version is specified, the CACHE_LATEST setting has no effect on schema lookup results.
  - CACHE_LATEST only affects caching for serialization operations. It does not apply to schema references, which are always resolved through direct registry lookups, bypassing the cache.
For protocol-specific code examples showing how to configure these properties, see API and Protocol-Specific Implementation Guides.
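The interaction between a cache TTL and the use-cached-on-error fallback can be sketched with a small stand-alone cache. This is a simplified model, not the SERDES Collection's cache: the class name, constructor parameters, and the injected lookup function and clock are all hypothetical, and retries are left out to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Simplified model of TTL caching with an error fallback; not the library cache. */
class TtlSchemaCacheSketch {
    private static final class Entry {
        final String schema;
        final long fetchedAtMs;

        Entry(String schema, long fetchedAtMs) {
            this.schema = schema;
            this.fetchedAtMs = fetchedAtMs;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMs;                               // 0 disables caching
    private final boolean useCachedOnError;
    private final Function<String, String> registryLookup;  // stands in for the registry call
    private final LongSupplier clock;                        // injected so time can be controlled

    TtlSchemaCacheSketch(long ttlMs, boolean useCachedOnError,
                         Function<String, String> registryLookup, LongSupplier clock) {
        this.ttlMs = ttlMs;
        this.useCachedOnError = useCachedOnError;
        this.registryLookup = registryLookup;
        this.clock = clock;
    }

    String get(String artifactId) {
        long now = clock.getAsLong();
        Entry cached = entries.get(artifactId);
        // Serve from the cache while the entry is younger than the TTL.
        if (ttlMs > 0 && cached != null && now - cached.fetchedAtMs < ttlMs) {
            return cached.schema;
        }
        try {
            String fresh = registryLookup.apply(artifactId);
            if (ttlMs > 0) {
                entries.put(artifactId, new Entry(fresh, now));
            }
            return fresh;
        } catch (RuntimeException e) {
            // Fall back to the expired entry only when explicitly enabled.
            if (useCachedOnError && cached != null) {
                return cached.schema;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        TtlSchemaCacheSketch cache = new TtlSchemaCacheSketch(
                30_000L, true, id -> "schema-for-" + id, System::currentTimeMillis);
        System.out.println(cache.get("User"));
    }
}
```

With a zero TTL nothing is stored in this model, so there is no cached entry to fall back on when the lookup fails.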
Schema Lookup Options
When you use the SERDES Collection with Solace Schema Registry, the serializer must determine which schema to use when writing data. You can configure schema lookup options to control how the appropriate schema is resolved from the registry. The following schema lookup options are available:
- FIND_LATEST_ARTIFACT: A boolean flag that determines whether your serializer should attempt to locate the most recent artifact in the registry for the given groupId and artifactId combination. The default value is false.
- EXPLICIT_ARTIFACT_VERSION: A string value that represents the artifact version used for querying an artifact in the registry (for example, "1.0.0"). This property overrides the version returned by the ArtifactReferenceResolverStrategy. If this property and FIND_LATEST_ARTIFACT are both set, this property takes precedence. The default value is null.
- EXPLICIT_ARTIFACT_ARTIFACT_ID: A string value that represents the artifactId used for querying an artifact in the registry (for example, "my-schema"). This property overrides the artifactId returned by the ArtifactReferenceResolverStrategy. The default value is null.
- EXPLICIT_ARTIFACT_GROUP_ID: A string value that represents the groupId used for querying an artifact in the registry (for example, "com.example"). This property overrides the groupId returned by the ArtifactReferenceResolverStrategy. The default value is null.
- REQUEST_ATTEMPTS: Specifies the number of attempts to make when communicating with the schema registry before giving up. Valid values are any number between 1 and Long.MAX_VALUE. When used with USE_CACHED_ON_ERROR, this property specifies the number of attempts before falling back to the last cached value. The default value is 3.
- REQUEST_ATTEMPT_BACKOFF_MS: Specifies the backoff time in milliseconds between retry attempts when communicating with the schema registry. Valid values are any number between 0 and Long.MAX_VALUE. This property accepts:
  - Long value (for example, 500L)
  - String value (for example, "500")
  - Duration object (for example, 500 milliseconds)
For protocol-specific code examples showing how to configure these properties, see API and Protocol-Specific Implementation Guides.
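The precedence rules and retry settings above can be summarized in a short sketch. The class and method names are hypothetical and this is not how the library is implemented. In particular, treating FIND_LATEST_ARTIFACT as taking priority over a version returned by the resolver strategy is an assumption of this sketch; the text above only states that an explicit version takes precedence over FIND_LATEST_ARTIFACT.

```java
import java.util.function.Supplier;

/** Conceptual sketch of lookup precedence and retries; not the library code. */
class SchemaLookupSketch {
    /** An explicit groupId or artifactId overrides the value from the resolver strategy. */
    static String override(String explicitValue, String fromStrategy) {
        return explicitValue != null ? explicitValue : fromStrategy;
    }

    /**
     * Chooses the version to query. Returns null to mean "latest".
     * An explicit version always wins over the find-latest flag.
     */
    static String version(String explicitVersion, boolean findLatest, String fromStrategy) {
        if (explicitVersion != null) {
            return explicitVersion;
        }
        // Assumption: find-latest ignores any version supplied by the strategy.
        return findLatest ? null : fromStrategy;
    }

    /** Runs a registry call up to 'attempts' times, pausing 'backoffMs' between attempts. */
    static <T> T withRetries(long attempts, long backoffMs, Supplier<T> lookup) {
        if (attempts < 1 || backoffMs < 0) {
            throw new IllegalArgumentException("attempts must be >= 1 and backoffMs must be >= 0");
        }
        RuntimeException lastError = null;
        for (long attempt = 1; attempt <= attempts; attempt++) {
            try {
                return lookup.get();
            } catch (RuntimeException e) {
                lastError = e;
                if (attempt < attempts) {
                    try {
                        Thread.sleep(backoffMs);
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("Interrupted while waiting to retry", interrupted);
                    }
                }
            }
        }
        throw lastError;
    }

    public static void main(String[] args) {
        String artifactId = override("my-schema", "solace/samples");
        String version = version("1.0.0", true, null);
        String result = withRetries(3, 500L, () -> artifactId + "@" + version);
        System.out.println(result);
    }
}
```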
Advanced Configuration
The following configurations support specialized use cases and advanced scenarios. Most applications won't need these settings, but they provide fine-grained control over SERDES behavior when needed. These configuration options allow you to control automatic schema registration, customize header formats for cross-protocol compatibility, and configure format-specific encoding behaviors. They are useful for streamlining development workflows, optimizing message translation between protocols, and handling complex schema structures in Avro and JSON Schema applications.
- Auto-Registration Configuration
- SERDES Header Configuration
- Avro-Specific Configuration
- JSON Schema-Specific Configuration
Auto-Registration Configuration
Auto-registration allows serializers to automatically register schemas in Solace Schema Registry when they don't exist during serialization operations. This feature simplifies schema management by eliminating the need to manually register schemas before using them in your applications. When auto-registration is enabled, the serializer will attempt to register the schema if it cannot find a matching schema in the registry for the given artifact reference.
Auto-registration is particularly useful in development environments where schemas are frequently updated, or in scenarios where you want to streamline the deployment process by allowing applications to self-register their schemas. However, in production environments, you may want to disable auto-registration to maintain strict control over schema evolution and prevent unauthorized schema modifications.
- AUTO_REGISTER_ARTIFACT: A boolean flag that controls whether schemas should be automatically registered when they don't exist in the registry during serialization. When set to true, the serializer will attempt to register the schema if it's not found in the registry. When set to false (the default), the serializer will throw an exception if the schema is not found. This property only affects serialization operations.
- AUTO_REGISTER_ARTIFACT_IF_EXISTS: Controls the behavior when auto-registering artifacts that might already exist in the registry. This property works in conjunction with AUTO_REGISTER_ARTIFACT and only takes effect when auto-registration is enabled. The available options are:
  - IfArtifactExists.CREATE_VERSION: If a schema with the same content already exists in the registry, create a new version of that schema. This is useful when you want to maintain version history even for identical schemas.
  - IfArtifactExists.FAIL: If a schema with the same name already exists in the registry, the registration will fail and throw a SerializationException. This provides strict control over schema registration.
  - IfArtifactExists.FIND_OR_CREATE_VERSION: If a schema with the same content already exists in the registry, use the existing schema. Otherwise, create a new version. This is the default behavior and provides the most flexible approach to schema management.
For protocol-specific code examples showing how to configure these properties, see API and Protocol-Specific Implementation Guides. For JSON Schema-specific auto-registration requirements, see JSON Schema Auto-Registration.
Important considerations when using auto-registration:
- Auto-registration only affects serialization operations. Deserializers always use the schema ID from the message to fetch the corresponding schema from the registry.
- The SERDES user must have read-write/sr-developer permissions to auto-register schemas. Using a read-only/sr-readonly user will cause auto-registration to fail.
- In production environments, consider disabling auto-registration to maintain strict control over schema evolution and prevent unauthorized schema modifications.
- When using auto-registration with schema references, ensure that all referenced schemas are also available in the registry or can be auto-registered.
- Auto-registration works with all artifact resolver strategies.
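The three if-exists behaviors described in this section can be compared side by side with a small in-memory model. This sketch is illustrative only: the `IfExists` enum is a local stand-in that mirrors the documented option names, the map stands in for the registry, and an IllegalStateException is thrown where the real serializer is described as throwing a SerializationException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory model of the auto-registration options; not the library code. */
class AutoRegisterSketch {
    /** Local stand-in mirroring the three documented options. */
    enum IfExists { CREATE_VERSION, FAIL, FIND_OR_CREATE_VERSION }

    // artifactId -> schema contents; list index + 1 is the version number.
    private final Map<String, List<String>> artifacts = new HashMap<>();

    /** Returns the version number the serializer would end up using. */
    int register(String artifactId, String schema, IfExists ifExists) {
        List<String> versions = artifacts.computeIfAbsent(artifactId, key -> new ArrayList<>());
        if (versions.isEmpty()) {
            versions.add(schema); // first registration of this artifact
            return 1;
        }
        if (ifExists == IfExists.FAIL) {
            throw new IllegalStateException("Artifact already exists: " + artifactId);
        }
        if (ifExists == IfExists.FIND_OR_CREATE_VERSION) {
            int existing = versions.indexOf(schema);
            if (existing >= 0) {
                return existing + 1; // reuse the version with identical content
            }
        }
        // CREATE_VERSION, or FIND_OR_CREATE_VERSION with new content.
        versions.add(schema);
        return versions.size();
    }

    public static void main(String[] args) {
        AutoRegisterSketch registry = new AutoRegisterSketch();
        String schema = "{\"type\":\"record\",\"name\":\"User\"}";
        registry.register("User", schema, IfExists.FIND_OR_CREATE_VERSION);
        int version = registry.register("User", schema, IfExists.FIND_OR_CREATE_VERSION);
        System.out.println("Version in use: " + version);
    }
}
```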
SERDES Header Configuration
The SCHEMA_HEADER_IDENTIFIERS property allows you to select different SERDES header formats for schema identification in message headers. This configuration involves a tradeoff between efficiency and interoperability, enabling you to optimize your messaging system based on your specific requirements and integration scenarios.
When a message producer sends a message, the serializer component encodes the data according to a registered schema and includes a schema identifier in the message headers. The format of this schema identifier can be configured to balance performance optimization with cross-protocol compatibility needs.
The SCHEMA_HEADER_IDENTIFIERS property accepts values from the SchemaHeaderId enum, which provides two header format options:
- SchemaHeaderId.SCHEMA_ID (default): Efficient long header configuration optimized for performance.
- SchemaHeaderId.SCHEMA_ID_STRING: String header configuration for optimal interoperability across protocols.
Efficiency vs. Interoperability Considerations
The choice between header formats represents a tradeoff:
- Efficiency: The default SCHEMA_ID option provides the best performance and efficiency by using a compact 8-byte long value for schema identification. This binary format minimizes header overhead and processing time, making it ideal for high-throughput Solace-to-Solace messaging scenarios.
- Interoperability: The SCHEMA_ID_STRING option ensures optimal compatibility between different messaging protocols by using human-readable string values. While this introduces some additional header processing overhead, it enables message translation and consumption across different messaging protocol environments. REST messaging serves as a specific example where string-only headers enable easier interoperability between protocols. When messages need to be consumed by REST endpoints or translated between different messaging systems, string-based schema identifiers provide better compatibility and easier debugging capabilities.
Configuration
To configure the SCHEMA_HEADER_IDENTIFIERS property:
- Set to SchemaHeaderId.SCHEMA_ID (the default) for maximum efficiency.
- Set to SchemaHeaderId.SCHEMA_ID_STRING for optimal interoperability.
For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
Important considerations when configuring SERDES headers:
- The SCHEMA_HEADER_IDENTIFIERS property only affects serialization operations. Deserializers automatically detect and handle both header formats.
- Both header formats reference the same schema content; only the identifier representation differs.
- You can change the header format configuration without affecting existing schemas in your registry.
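The following is a minimal sketch of how a consumer could accept both header formats, so that a producer can switch formats without breaking it. It is not the library's implementation: the header name `example.schema.id` is a hypothetical placeholder, a plain map stands in for message headers, and the string form is assumed to hold decimal digits.

```java
import java.util.Map;

/** Sketch only: reads a schema ID from a header that may hold a Long or a String. */
final class SchemaIdHeaderSketch {

    // Hypothetical header name, chosen for this example only.
    static final String SCHEMA_ID_HEADER = "example.schema.id";

    /** Returns the schema ID whether the producer wrote a long or a string header. */
    static long readSchemaId(Map<String, ?> headers) {
        Object value = headers.get(SCHEMA_ID_HEADER);
        if (value instanceof Long) {
            // SCHEMA_ID style: the identifier travels as a numeric header.
            return (Long) value;
        }
        if (value instanceof String) {
            // SCHEMA_ID_STRING style: assumed here to be the decimal form of the ID.
            return Long.parseLong(((String) value).trim());
        }
        throw new IllegalArgumentException("Missing or unsupported schema ID header: " + value);
    }

    public static void main(String[] args) {
        System.out.println(readSchemaId(Map.of(SCHEMA_ID_HEADER, 42L)));
        System.out.println(readSchemaId(Map.of(SCHEMA_ID_HEADER, "42")));
    }
}
```

Both calls resolve to the same identifier, which mirrors the point that only the representation of the identifier differs between the two formats.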
Avro-Specific Configuration
Apache Avro is a data serialization system that provides a compact binary data format and schema evolution capabilities. Avro serialization allows you to encode and decode complex data structures efficiently—encoding before publishing them as messages and decoding when consuming messages. This ensures consistent data formatting across your messaging applications and enables schema evolution as your application requirements change.
This section covers Avro-specific configuration properties. For common SERDES configuration properties, see Getting Started with SERDES.
All SERDES properties are set in a configuration map. For protocol-specific examples of how to create and configure this map, see API and Protocol-Specific Implementation Guides.
Dereferencing Schemas
The DEREFERENCED_SCHEMA property controls how serializers handle Avro schemas that contain references to other schemas. In Avro schema design, there are two primary approaches to managing complex schemas:
- Referenced schemas: These schemas are structured to reference or include other shared schema components, enabling a modular and reusable design. Common types can be defined once and reused across multiple schemas. Referenced schemas are reusable and easy to maintain.
- Dereferenced schemas: These schemas, also called flat schemas, have all references expanded inline, creating a single, self-contained definition with no external dependencies. Dereferenced schemas simplify consumption by eliminating the need to resolve multiple schema references at runtime.
When set to true (the default), the DEREFERENCED_SCHEMA property tells the serializer to treat the schema in the record as fully dereferenced and use it directly for registry lookups. The serializer does not attempt to extract or manage referenced sub-schemas, which simplifies schema handling in your application.
Important considerations when using the DEREFERENCED_SCHEMA property:
- If FIND_LATEST_ARTIFACT is set to true, the DEREFERENCED_SCHEMA property is ignored.
- This property only affects serializers. Deserializers do not embed schema information; they use the schema ID to fetch the corresponding schema from the registry.
- When DEREFERENCED_SCHEMA is enabled, all schemas in your registry must be stored in their dereferenced form, as referenced schemas cannot be properly resolved.
This configuration is useful in scenarios where you want to:
- simplify schema handling in your application by avoiding schema reference resolution.
- optimize performance by reducing the number of schema registry lookups.
When DEREFERENCED_SCHEMA is set to false, the serializer will treat the provided schema as one that needs to be decomposed into reference schemas. This allows applications to use one Avro schema file to represent multiple Avro schemas in the registry, which can reduce storage of common sub-schemas in the registry.
To configure a serializer to decompose schemas into references, set DEREFERENCED_SCHEMA to false. For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
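To make the difference between referenced and dereferenced schemas concrete, here is a toy sketch of dereferencing. It does not use Avro or the SERDES library: schemas are modeled as hypothetical maps from field name to type name, and the sketch assumes there are no recursive references.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: expands named type references inline to produce a flat schema. */
final class DereferenceSketch {

    /**
     * Returns the schema called `name` with every field whose type names another
     * schema replaced by that schema's own (recursively expanded) fields.
     */
    static Map<String, Object> dereference(String name, Map<String, Map<String, String>> schemas) {
        Map<String, Object> flat = new LinkedHashMap<>();
        schemas.get(name).forEach((field, type) ->
                // A type that matches a known schema name is a reference: inline it.
                flat.put(field, schemas.containsKey(type) ? dereference(type, schemas) : type));
        return flat;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> schemas = Map.of(
                "Address", Map.of("city", "string"),
                "User", Map.of("name", "string", "home", "Address"));
        // The "home" field now holds the Address fields instead of the name "Address".
        System.out.println(dereference("User", schemas));
    }
}
```

In this model the referenced form stores Address once and points to it by name, while the dereferenced form is self-contained at the cost of repeating the shared definition wherever it is used.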
Avro Encoding Types
The ENCODING_TYPE property allows you to specify the encoding format used by the AvroSerializer to convert data to bytes. Avro supports two primary encoding formats, each with different characteristics and use cases:
- AvroEncoding.BINARY: Uses Avro's DirectBinaryEncoder to produce a compact binary representation of the data. This is the default encoding and provides the most efficient serialization in terms of message size and processing performance.
- AvroEncoding.JSON: Uses Avro's JsonEncoder to produce a human-readable JSON representation of the data. While less efficient than binary encoding, this format is useful for debugging, logging, and interoperability with systems that expect JSON.
The encoding type affects only the wire format of the serialized data; it does not change how the schema is resolved or how the data is structured. Both encoding types maintain full compatibility with the Avro schema system.
The encoding type is set as an Avro SERDES message header. This impacts the AvroDeserializer, which reads the assigned header value and attempts to use the appropriate decoder. If no header is present, the AvroDeserializer assumes the message was encoded with binary encoding and attempts to decode it accordingly.
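The header-driven decoder selection can be sketched as follows. This is an illustration, not the AvroDeserializer's code: the header name `example.avro.encoding` and the local `Encoding` enum are stand-ins, and the choice to throw on an unrecognized value is an assumption of this sketch.

```java
import java.util.Locale;
import java.util.Map;

/** Sketch only: picks a decoder type from a message header, defaulting to binary. */
final class AvroEncodingHeaderSketch {

    /** Local stand-in for the library's AvroEncoding values. */
    enum Encoding { BINARY, JSON }

    // Hypothetical header name, chosen for this example only.
    static final String ENCODING_HEADER = "example.avro.encoding";

    static Encoding resolveEncoding(Map<String, ?> headers) {
        Object value = headers.get(ENCODING_HEADER);
        if (value == null) {
            // No header present: assume the payload was binary encoded.
            return Encoding.BINARY;
        }
        // Unknown values raise IllegalArgumentException in this sketch.
        return Encoding.valueOf(value.toString().trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(resolveEncoding(Map.of()));
        System.out.println(resolveEncoding(Map.of(ENCODING_HEADER, "JSON")));
    }
}
```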
To configure a serializer to use JSON encoding, set ENCODING_TYPE to AvroEncoding.JSON. For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
When choosing an encoding type, consider these factors:
- Use BINARY encoding when optimizing for performance, bandwidth efficiency, or message throughput.
- Use JSON encoding when debugging applications or when human readability is important.
If the ENCODING_TYPE property is not specified, the serializer defaults to BINARY encoding.
JSON Schema-Specific Configuration
JSON Schema is a vocabulary that defines how to annotate and validate JSON documents. JSON Schema serialization allows you to encode and decode structured JSON data efficiently—encoding before publishing them as messages and decoding when consuming messages. This ensures consistent data formatting across your messaging applications and enables schema evolution as your application requirements change.
This section covers JSON Schema-specific configuration properties. For common SERDES configuration properties, see Getting Started with SERDES.
All SERDES properties are set in a configuration map. For protocol-specific examples of how to create and configure this map, see API and Protocol-Specific Implementation Guides.
JSON Schema Auto-Registration
When using auto-registration with JSON Schema serialization, you must also specify the schema location using SchemaResolverProperties.SCHEMA_LOCATION, which provides the classpath resource path to the schema file. Unlike Avro records, which carry their schema with them, JSON data objects carry no schema information, so JSON Schema requires an explicit schema file reference for auto-registration to work properly.
To configure the schema location, set AUTO_REGISTER_ARTIFACT to true and specify the classpath resource path using SCHEMA_LOCATION (for example, "json-schema/user.json"). For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
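As a rough sketch of the pairing described above, the helper below builds a configuration map that always sets the schema location together with auto-registration. The string keys are hypothetical placeholders for the library's AUTO_REGISTER_ARTIFACT and SCHEMA_LOCATION constants, whose actual values are not assumed here.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: keeps auto-registration and schema location together in one map. */
final class JsonAutoRegisterConfigSketch {

    // Hypothetical keys standing in for the library's property constants.
    static final String AUTO_REGISTER_ARTIFACT = "example.auto-register-artifact";
    static final String SCHEMA_LOCATION = "example.schema-location";

    /** Builds a config map, rejecting a missing or blank classpath resource path. */
    static Map<String, Object> jsonAutoRegisterConfig(String schemaResourcePath) {
        if (schemaResourcePath == null || schemaResourcePath.trim().isEmpty()) {
            throw new IllegalArgumentException(
                    "JSON Schema auto-registration needs a schema location");
        }
        Map<String, Object> config = new HashMap<>();
        config.put(AUTO_REGISTER_ARTIFACT, true);
        config.put(SCHEMA_LOCATION, schemaResourcePath);
        return config;
    }

    public static void main(String[] args) {
        System.out.println(jsonAutoRegisterConfig("json-schema/user.json"));
    }
}
```

Failing fast on a blank path is a design choice of this sketch; it surfaces the missing schema location at configuration time rather than at the first serialization attempt.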
JSON Schema Validation
The VALIDATE_SCHEMA property controls whether JSON schema validation is performed during serialization and deserialization operations. When enabled, the serializer and deserializer will validate JSON data against its registered schema to ensure data conformity and catch schema violations early in the processing pipeline.
Schema validation provides several benefits:
- Data integrity—Ensures that JSON data conforms to the expected schema structure before processing.
- Early error detection—Catches schema violations during serialization/deserialization rather than downstream processing.
- Consistency enforcement—Guarantees that all messages follow the defined data contract.
However, enabling validation does introduce some performance overhead, so you may want to disable it in performance-critical scenarios where data integrity is guaranteed by other means.
The VALIDATE_SCHEMA property accepts a boolean value:
- true (default): Enables JSON schema validation during serialization and deserialization operations.
- false: Disables JSON schema validation, improving performance but removing data integrity checks.
To configure JSON schema validation, set VALIDATE_SCHEMA to true (the default) to enable validation, or false to disable it for improved performance. For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
Important considerations when configuring schema validation:
- Validation is enabled by default to ensure data integrity in most use cases.
- Disabling validation can improve performance but removes an important data quality safeguard.
- Consider your application's requirements for data integrity versus performance when configuring this property.
Java Type Property Configuration
The TYPE_PROPERTY configuration specifies the JSON schema property name that contains the Java type class path for deserialization. The JsonSchemaDeserializer uses this property to determine the target Java class when converting JSON data back to Java objects, enabling proper type reconstruction during the deserialization process.
This property is particularly important when:
- Your JSON schema includes type information for Java object reconstruction.
- You need to deserialize JSON data into specific Java classes rather than generic JSON objects.
- Your application uses polymorphic types that require runtime type resolution.
The TYPE_PROPERTY accepts a string value that specifies the property name in your JSON schema:
- Default value: "javaType". The deserializer will look for a property named "javaType" in the JSON schema.
- Custom value: Any valid JSON property name that contains the Java class path information.
To configure the Java type property, set TYPE_PROPERTY to the desired property name (default is "javaType"). You can use custom property names like "className" or "targetClass" based on your schema design. For protocol-specific code examples, see API and Protocol-Specific Implementation Guides.
The following example shows a JSON schema that includes the javaType property. When the JsonSchemaDeserializer processes JSON data conforming to this schema, it will use the javaType value (com.example.User) to instantiate the correct Java class during deserialization:
{
  "$schema": "https://json-schema.org/draft-07/schema",
  "type": "object",
  "javaType": "com.example.User",
  "properties": {
    "name": {
      "type": "string"
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "age": {
      "type": "integer",
      "minimum": 0
    }
  },
  "required": ["name", "email"]
}
Important considerations when configuring the type property:
- The property name must exist in your JSON schema and contain a valid Java class path.
- The specified Java class must be available on the classpath during deserialization.
- If the type property is not found, deserialization falls back to a generic JsonNode. If the property is present but contains an invalid class path, deserialization throws an exception.
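The fallback behavior in the last consideration can be sketched like this. It is an illustration under assumptions rather than the JsonSchemaDeserializer's code: the schema is modeled as a plain map, an empty Optional stands for the generic JsonNode fallback, and the exception type is chosen for this example only.

```java
import java.util.Map;
import java.util.Optional;

/** Sketch only: resolves the target class named by a schema's type property. */
final class TypePropertySketch {

    static final String DEFAULT_TYPE_PROPERTY = "javaType";

    /**
     * Returns the class named by `typeProperty`, or empty when the property is
     * absent (the caller would then fall back to a generic JSON tree).
     * A present but unloadable class name is reported as an exception.
     */
    static Optional<Class<?>> resolveTargetClass(Map<String, ?> schema, String typeProperty) {
        Object value = schema.get(typeProperty);
        if (value == null) {
            return Optional.empty();
        }
        try {
            return Optional.<Class<?>>of(Class.forName(value.toString()));
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class not on classpath: " + value, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> schema = Map.of("type", "object", "javaType", "java.lang.String");
        System.out.println(resolveTargetClass(schema, DEFAULT_TYPE_PROPERTY));
    }
}
```

To use a custom property name such as "className", pass that name as the second argument instead of the default.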
Cross-Protocol Message Translation
SERDES supports cross-protocol messaging, allowing messages to be published using one protocol and consumed using another. The Solace broker performs automatic message translation and header mapping to preserve SERDES schema information across protocols.
Message Type Translation Rules
The broker translates messages between protocols according to the following rules. For detailed information on protocol metadata and payload encoding interactions, see Protocol Metadata and Payload Encoding Interactions:
- AMQP to REST Translation: AMQP application properties are mapped to HTTP headers with the solace-user-property- prefix, preserving SERDES schema information for REST consumers.
- AMQP to SMF Translation: AMQP application properties are mapped to Solace Message Format (SMF) user properties, allowing Solace Messaging API consumers to access SERDES headers.
- REST to SMF Translation: HTTP headers with the solace-user-property- prefix are mapped to SMF user properties, preserving SERDES schema information for Solace Messaging API consumers.
- Message Payload Preservation: Message payloads are preserved during protocol translation, ensuring that serialized data remains intact regardless of the consuming protocol. Note that during conversion, the event broker may wrap the payload in additional bytes as required by the target protocol's message format.
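The prefix-based mapping in these rules can be pictured with the sketch below. The broker performs this translation itself, so application code does not need anything like it; the sketch only illustrates the naming convention. It treats every value as a string and matches the prefix case-insensitively, both of which are simplifying assumptions, and the property name `example.schema.id` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch only: models user-property to HTTP-header name mapping by prefix. */
final class UserPropertyHeaderSketch {

    static final String PREFIX = "solace-user-property-";

    /** Application or user properties become prefixed HTTP headers. */
    static Map<String, String> toHttpHeaders(Map<String, String> properties) {
        Map<String, String> headers = new LinkedHashMap<>();
        properties.forEach((name, value) -> headers.put(PREFIX + name, value));
        return headers;
    }

    /** Prefixed HTTP headers become user properties; other headers are ignored. */
    static Map<String, String> toUserProperties(Map<String, String> httpHeaders) {
        Map<String, String> properties = new LinkedHashMap<>();
        httpHeaders.forEach((name, value) -> {
            if (name.toLowerCase(Locale.ROOT).startsWith(PREFIX)) {
                properties.put(name.substring(PREFIX.length()), value);
            }
        });
        return properties;
    }

    public static void main(String[] args) {
        Map<String, String> headers = toHttpHeaders(Map.of("example.schema.id", "42"));
        System.out.println(headers);
        System.out.println(toUserProperties(headers));
    }
}
```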
SERDES Compatibility with Message Types
Different protocols support different message payload types for SERDES operations:
- Binary Payload Support (Binary Attachment - Unstructured Bytes): All protocols support binary message payloads (BytesMessage in JCSMP, application/octet-stream in REST, BytesMessage in AMQP), where the SERDES deserializer processes the byte array directly. This is the recommended approach for optimal performance and compatibility.
- Text Payload Support (Binary Attachment - Text): Some protocols support text payloads for deserialization (TextMessage in JCSMP, text-based Content-Types in REST, TextMessage in AMQP), where the deserializer converts the UTF-8 text content to bytes before processing. Text payloads are not supported for serialization. This enables flexibility in message handling and cross-protocol scenarios.
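The two payload paths can be sketched as a single normalization step that hands bytes to the deserializer. This is an assumption-laden illustration: a plain Object stands in for the protocol-specific message types, and the helper name is hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Sketch only: normalizes a binary or text payload to bytes for deserialization. */
final class PayloadBytesSketch {

    static byte[] toBytes(Object payload) {
        if (payload instanceof byte[]) {
            // Binary payload: used directly, with no conversion.
            return (byte[]) payload;
        }
        if (payload instanceof String) {
            // Text payload: converted from UTF-8 text to bytes before processing.
            return ((String) payload).getBytes(StandardCharsets.UTF_8);
        }
        throw new IllegalArgumentException("Unsupported payload type: " + payload);
    }

    public static void main(String[] args) {
        System.out.println(toBytes("{\"name\":\"Ada\"}").length);
    }
}
```

The binary path avoids a copy and a charset conversion, which is one reason binary payloads are the recommended choice.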
This cross-protocol compatibility allows applications using different protocols to work seamlessly together without requiring changes to the application code. Schema validation and message deserialization work consistently across all supported protocols.
Important considerations for cross-protocol messaging:
- SERDES headers are preserved during message translation only when using proper header mapping (AMQP application properties, REST user properties, SMF user properties).
- Message translation is performed automatically by the broker and does not require any configuration changes.
- Binary message types are recommended for optimal performance and cross-protocol compatibility.
- For protocol-specific cross-protocol considerations and troubleshooting, see API and Protocol-Specific Implementation Guides.