MQTT - Part 1

In a previous post, I mentioned the MQTT standard.

MQTT used to be an acronym for MQ Telemetry Transport, but the name no longer stands for anything. It is now an OASIS standard for machine-to-machine (M2M) data sharing. The protocol was invented by Andy Stanford-Clark (IBM) and Arlen Nipper (Arcom, now Cirrus Link) in 1999 and was designed for connections with remote locations where a small code footprint is required or network bandwidth is limited.


Tutorials

The best explanations I found on the subject come from a series of blog posts by HiveMQ (even if I don't use their product):

There are other posts about security and clients, but I think the ten referenced above are a very good introduction.

Basic concepts of MQTT

There is also an IBM Redbook called Building Smarter Planet Solutions with MQTT..., which is a little old and geared towards the IBM WebSphere MQ product, but it is still interesting and gives another good presentation of the protocol. Moreover, it is free and available in numerous formats.

Extract from the book (chapter 1.2.3):

The MQTT protocol is built upon several basic concepts, all aimed at assuring message delivery while keeping the messages themselves as lightweight as possible.

Publish/subscribe

The MQTT protocol is based on the principle of publishing messages and subscribing to topics, which is typically referred to as a publish/subscribe model. Clients can subscribe to topics that pertain to them and thereby receive whatever messages are published to those topics. Alternatively, clients can publish messages to topics, thus making them available to all subscribers to those topics.

Topics and subscriptions

Messages in MQTT are published to topics, which can be thought of as subject areas. Clients, in turn, sign up to receive particular messages by subscribing to a topic. Subscriptions can be explicit, which limits the messages that are received to the specific topic at hand or can use wildcard designators, such as a number sign (#) to receive messages for a variety of related topics.
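
As an aside to the extract, here is a minimal Python sketch of how a topic filter with wildcards can be matched against a topic name. It is not taken from the book: `topic_matches` is a hypothetical helper, and besides the multi-level `#` wildcard it also handles the single-level `+` wildcard that MQTT defines. Special handling of topics starting with `$` is deliberately left out.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT-style topic filter matches a concrete topic."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level of the filter;
            # it matches the parent level and everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # The filter is longer than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # "+" matches exactly one level; anything else must be identical.
            return False
    # Without a trailing "#", both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


if __name__ == "__main__":
    topic = "cm/alecto1/data/temperature"
    for topic_filter in ("cm/#", "cm/+/data/temperature", "z-wave/#"):
        print(topic_filter, topic_matches(topic_filter, topic))
```

In practice the broker does this matching, so a client never needs such a function; the sketch is only meant to show what a wildcard subscription selects.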

Quality of service levels

MQTT defines three quality of service (QoS) levels for message delivery, with each level designating a higher level of effort by the server to ensure that the message gets delivered. Higher QoS levels ensure more reliable message delivery but might consume more network bandwidth or subject the message to delays due to issues such as latency.

Retained messages

With MQTT, the server keeps the message even after sending it to all current subscribers. If a new subscription is submitted for the same topic, any retained messages are then sent to the new subscribing client.

Clean sessions and durable connections

When an MQTT client connects to the server, it sets the clean session flag. If the flag is set to true, all of the client's subscriptions are removed when it disconnects from the server. If the flag is set to false, the connection is treated as durable, and the client's subscriptions remain in effect after any disconnection. In this event, subsequent messages that arrive carrying a high QoS designation are stored for delivery after the connection is reestablished. Using the clean session flag is optional.

Wills

When a client connects to a server, it can inform the server that it has a will, or a message, that should be published to a specific topic or topics in the event of an unexpected disconnection. A will is particularly useful in alarm or security settings where system managers must know immediately when a remote sensor has lost contact with the network.
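
To make the retained-message and will behaviour from the extract more concrete, here is a toy in-memory model in Python. It is only a sketch under strong assumptions: `ToyBroker` and its methods are hypothetical names, there is no networking, no QoS and no wildcard matching, and the `offline` will payload is invented for the example. Note that in the real protocol a message is retained only when the publisher sets the retain flag, which the sketch mirrors with the `retain` argument.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory model of retained messages and wills (exact topics only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks
        self.retained = {}                    # topic -> last retained payload
        self.wills = {}                       # client_id -> (topic, payload)

    def connect(self, client_id, will=None):
        # A client may register a will, as a (topic, payload) pair, on connect.
        if will is not None:
            self.wills[client_id] = will

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)
        # A new subscriber immediately receives the retained message, if any.
        if topic in self.retained:
            callback(topic, self.retained[topic])

    def publish(self, topic, payload, retain=False):
        if retain:
            self.retained[topic] = payload
        for callback in list(self.subscribers.get(topic, [])):
            callback(topic, payload)

    def disconnect(self, client_id, graceful=True):
        will = self.wills.pop(client_id, None)
        # The will is published only when the client vanishes unexpectedly.
        if will is not None and not graceful:
            self.publish(*will)


if __name__ == "__main__":
    broker = ToyBroker()
    # Published before anyone subscribes, but kept because retain=True.
    broker.publish("cm/alecto1/data/temperature", "21.5", retain=True)
    broker.subscribe("cm/alecto1/data/temperature", print)

    broker.connect("lcd", will=("cm/lcd/status/heartbeat", "offline"))
    broker.subscribe("cm/lcd/status/heartbeat", print)
    broker.disconnect("lcd", graceful=False)  # triggers the will
```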

Naming of Topics

The best practices are fairly well defined. Mainly:

  • No leading forward slash
  • No spaces in a topic
  • Only ASCII printable characters
  • Short topics
  • Client ID embedded

Finding a naming convention, however, is a real headache: every project seems to use its own format. See, for example, the IBM Internet of Things Foundation, the HiveMQ examples, a blog, Eurotech and BB SmartSensing.

Through trial and error, I finally settled on the following formats:

  • group_id/sensor_id/action/action
  • group_id/sensor_id/data/measurement
  • group_id/sensor_id/status/heartbeat

For example:

  • cm/alecto1/data/temperature
  • cm/alecto1/data/humidity
  • cm/doorlock/data/status
  • cm/lcd/status/heartbeat
  • z-wave/fgk101-1/data/status
  • z-wave/fgwpe101-1/action/set
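
As an illustration, the convention and some of the best practices listed earlier can be enforced with a small Python helper. This is only a sketch: `Topic`, `build_topic` and `parse_topic` are hypothetical names, and the rule that every level must be non-empty printable ASCII without spaces or wildcard characters is my reading of the best practices, not something mandated by MQTT.

```python
from typing import NamedTuple

KINDS = {"action", "data", "status"}  # third level of the convention


class Topic(NamedTuple):
    group_id: str
    sensor_id: str
    kind: str
    name: str


def _check_level(level: str) -> None:
    # Non-empty levels rule out a leading slash and doubled slashes.
    if not level:
        raise ValueError("empty topic level")
    for ch in level:
        # 33..126 is printable ASCII without the space character.
        if not 33 <= ord(ch) <= 126 or ch in "/+#":
            raise ValueError(f"invalid character {ch!r} in level {level!r}")


def build_topic(group_id: str, sensor_id: str, kind: str, name: str) -> str:
    """Build a topic of the form group_id/sensor_id/kind/name."""
    if kind not in KINDS:
        raise ValueError(f"unknown kind {kind!r}")
    for level in (group_id, sensor_id, kind, name):
        _check_level(level)
    return "/".join((group_id, sensor_id, kind, name))


def parse_topic(topic: str) -> Topic:
    """Split a topic that follows the convention into its four levels."""
    levels = topic.split("/")
    if len(levels) != 4:
        raise ValueError(f"expected 4 levels, got {len(levels)}")
    return Topic(*levels)._replace(kind=_check_kind(levels[2]))


def _check_kind(kind: str) -> str:
    if kind not in KINDS:
        raise ValueError(f"unknown kind {kind!r}")
    return kind


if __name__ == "__main__":
    print(build_topic("cm", "alecto1", "data", "temperature"))
    print(parse_topic("z-wave/fgwpe101-1/action/set"))
```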

Payload

Another problem is the payload format. Sometimes the format is part of the topic, usually as a suffix (as with IBM or Xively). That might make sense when the sources of data are diverse and unknown, but I really don't like it, especially for a fully integrated system.

At first, I tried to prefix all payloads with an indication of the format, but the result was very different from all the examples and production systems I could see around.

Thinking about it, I needed only 2 main "forms" of data:

  • single value
  • dictionary-type data: key1="value1" key2="value2" ...

JSON handles the latter case by default. For single values, it decodes all integer and float values directly. Strings are a bit more complicated, as a JSON string must be enclosed in quotes. So the idea now is simply:

  • First try JSON decoding
  • If there is an exception, fall back to the raw value
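
The two steps above can be sketched in Python as follows. `decode_payload` is a hypothetical name, and the sketch assumes that payloads are UTF-8 encoded text.

```python
import json


def decode_payload(payload: bytes):
    """Decode an MQTT payload: try JSON first, fall back to the raw text."""
    text = payload.decode("utf-8")
    try:
        # Handles numbers, quoted strings and dictionary-type data.
        return json.loads(text)
    except json.JSONDecodeError:
        # Not valid JSON (for example an unquoted string): keep the raw value.
        return text


if __name__ == "__main__":
    for raw in (b"21.5", b"open", b'{"temperature": 21.5, "humidity": 40}'):
        print(repr(decode_payload(raw)))
```

One caveat of this approach: an unquoted payload that happens to be valid JSON, such as `true`, `null` or `42`, is decoded as the corresponding JSON value and not kept as a string.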

Brokers

There are quite a few brokers around (see here or here) but so far Mosquitto has been perfect on the Raspberry Pi: easy to install, fast and irreproachably stable.

Even the bridge mode between two remote locations worked perfectly the first time.

TBC...