Protocol Specification
This page describes the wire format for HiveMind messages, including the binary framing introduced in protocol version 1.
Message envelope
Every HiveMind message is a HiveMessage with:
msg_type— aHiveMessageTypeenum value (see Protocol Concepts)payload— the message content (an OVOSMessageobject, a nestedHiveMessage, or raw bytes)context— optional routing metadata dict
Protocol versions
| Version | Transport encoding | Key exchange | Compression |
|---|---|---|---|
| v0 | JSON | Pre-shared AES key only | None |
| v1 | JSON or binary | PBKDF2 password handshake + PGP | Optional zlib |
Binary framing (v1)
When both sides negotiate binarize: true in the handshake, messages are framed in a compact binary format instead of JSON.
Header layout
<uint:1=start_marker> <uint:1=versioned_bit> [<uint:8=protocol_version>] <uint:5=msg_type> <uint:1=compression_bit> <uint:8=metadata_len>
| Field | Bits | Description |
|---|---|---|
| Start marker | 1 | Always 1; used for alignment |
| Versioned flag | 1 | 1 if protocol version follows |
| Protocol version | 8 | Present only if versioned flag is 1 |
| Message type | 5 | HiveMessageType encoded as 5-bit uint (up to 32 types) |
| Compression flag | 1 | 1 if payload is zlib-compressed |
| Metadata length | 8 | Length of metadata block in bytes |
Followed by: metadata bytes, then payload bytes. Padding zeros are added to align to a byte boundary.
Message type encoding
| Value | Type |
|---|---|
| 0 | HANDSHAKE |
| 1 | BUS |
| 2 | SHARED_BUS |
| 3 | BROADCAST |
| 4 | PROPAGATE |
| 5 | ESCALATE |
| 6 | INTERCOM |
| 7 | PING |
| 8 | PONG |
| 9 | HELLO |
| 10 | THIRDPRTY |
| 12 | BINARY |
Binary payload type
For BINARY (msg_type = 12) messages, a 4-bit unsigned integer immediately after the metadata block indicates the binary content type:
| Value | Name | Description |
|---|---|---|
| 0 | UNDEFINED | Opaque binary content |
| 1 | RAW_AUDIO | Continuous microphone stream |
| 2 | NUMPY_IMAGE | Numpy array image (e.g., webcam frame) |
| 3 | FILE | File transfer; see context for filename |
| 4 | STT_AUDIO_TRANSCRIBE | Full audio utterance — return transcript only |
| 5 | STT_AUDIO_HANDLE | Full audio utterance — transcribe and handle intent |
| 6 | TTS_AUDIO | Synthesized speech audio (hub → satellite) |
Example: BUS message (uncompressed, versioned)
1 | 1 | 00000001 | 00001 | 0 | 00000000 | <metadata> | <payload>
1— start marker1— versioned flag00000001— protocol version 100001— BUS (type 1)0— not compressed00000000— no metadata<payload>— UTF-8 JSON string
Example: BINARY message (raw audio)
1 | 1 | 00000001 | 01100 | 0 | 00000000 | <metadata> | 0001 | <audio_bytes>
01100— BINARY (type 12)0001— RAW_AUDIO binary payload type<audio_bytes>— PCM audio data
Compression
When the compression flag is set, both the metadata and payload are compressed independently with zlib. Compression is most effective on large payloads (reduces size by up to ~50%); it adds overhead for small messages.
Session context
See Protocol Concepts — Session and context keys for the full reference of keys injected into Message.context by the hub.
OVOS messages (payload format)
OVOS Message objects are the standard payload for BUS messages. The structure:
{
"type": "recognizer_loop:utterance",
"data": {
"utterances": ["what time is it?"],
"lang": "en-us"
},
"context": {
"session": {...},
"source": "hive",
"destination": "skills"
}
}
The full OVOS message specification is maintained at OpenVoiceOS/message_spec.
Transports
The protocol runs over any transport that can carry byte streams:
| Transport | Plugin | Default port |
|---|---|---|
| WebSocket | hivemind-websocket-plugin |
5678 |
| HTTP (polling) | hivemind-http-plugin |
5679 |
| MQTT | hivemind-mqtt-protocol |
— |
| Usenet (NNTP) | hivemind-usenet |
— |