When things go (not quite) web native — thoughts on Mozilla IoT

Updated 2018-11-07 with a few clarifications from Kathy Giori on the Mozilla IoT team.

The Internet-of-Things landscape is a mess. Everybody is trying to build the dominating IoT castle in the cloud, collecting (and almost never giving back) as much user data as possible; user wishes and needs be damned. It’s no surprise that consumer IoT has not seen a lot of “production” (as opposed to play-around-with) uptake so far.

Kai Kreuzer, of the OpenHAB open-source IoT project, conveys this sorry state of affairs brilliantly in what he calls his “depressingly rainy day” slide:

So when I first heard some years ago that Mozilla had an IoT project, I was excited. More so than any other leading tech organization, Mozilla stands for the open web, no lock-in, privacy, user advocacy and the like. Unfortunately, it has taken them many changes of plans and management to have “something” in IoT. But now it’s worth taking a look.

[I’m interested because my company, Indie Computing, recently started shipping home servers that follow values very similar to those of Mozilla products. Having IoT software run on a user-controlled home server, in the home, makes a lot more sense to me, both architecturally and in terms of being responsive to user needs, than today’s predominant rainy stovepipe architecture from Kai’s slide. So we’re trying to figure out what their approach and code would mean for us. I’ve also been talking about the need for an “Indie IoT” for some years.]

Here is what I learned so far about the approach they are taking.

[Disclaimer: There seems to be substantial overlap between Mozilla’s work and the IoT work in the W3C. I don’t know where exactly the line is between what Mozilla originated and what others in the W3C did. (Happy to be enlightened if somebody tells me.) In this post, I am just looking at the info published by Mozilla.]

Let’s first talk about URLs:

Remember the first time you saw a URL? (In my case, that was 25 years ago this month, during a visit to JPL of all places! And I was blown away!) You would have this slightly odd-looking text string starting with http://, and if you typed it into this new piece of software called a web browser, it would go on the internet, as far as it needed to go, and fetch a text document, or a picture, and show it to you right in front of you. And then you could click on text with a blue underline, and it would fetch some other document, possibly from a computer in a completely different part of the globe. It was magic: the simplicity of it, and how easily you could weave information together that was held on far-away computers, without having to ask anybody for permission.

None of the power of URLs from back then has gone away since (although many have been trying very hard to stuff the open web back into corporate-controlled walled gardens), and we have found many novel uses since. For example, not only can we point to documents with URLs, but also to companies (e.g. my company) or people (e.g. me), and entire protocol stacks have been built for those additional uses of URLs.

The big idea behind the Mozilla IoT project (and the W3C IoT project) is to also use URLs to point to IoT devices, and use standard web protocols to interact with them. If the traditional web is any indication, this can open up unprecedented innovation and opportunity in IoT as well. Basically nobody (historically and currently) does this so far, and if you think back to the slide above, it’s quite clear why: If your goal as an IoT vendor is to dominate and control your users and the entire industry, the last thing you want is that anybody can connect to your products as easily as putting a URL into a web page. So the intriguing premise behind this project is to blow IoT wide open, leave the locked-down cloud castles behind, and let 1000 flowers bloom. Just like the Web 25 years ago.

It seems this project is worth spending some time with, right?

So how would one interact with an IoT device that’s on the web? Well, it depends on what kind of device it is and what it can do. Here are the main device categories:

Some devices are (just) sensors: they observe something, and we interact with the device, in order to learn what it observed. Examples would be:
- a switch (is it on or off?);
- a slider (at what value between 0 and 100% is it currently?);
- a thermometer (how hot is it currently?) or
- a power meter (how much power is currently consumed through this power outlet?).
Some devices are (just) actuators: we interact with the device, and the device then does something to the environment. Examples would be:
- a simple light bulb, or a simple motor, pump, or valve (we can turn it on or off);
- a dimmed light bulb, or a variable-speed motor, pump or valve (we can set it to any value between 0 and 100% brightness);
- a device on which we can trigger more complex behaviors, like a garage door opener (we can trigger it to go up or down; it will finish on its own);
Some more complex devices include both sensors and actuators. For example:
- an air conditioning system has temperature sensors, dampers, fans and other things (and we interact with it on a higher level, such as by changing the set temperature).

So far, Mozilla focuses on more “elemental” things: those that sense or actuate on one value (not sure whether there are any plans for complex things that cannot easily be represented as a collection of simple things). For those, the following basic interactions have been defined:

A thing may have properties, which may be read or written.
An action may be performed on a thing.
A thing may raise an event.

Does that all sound very familiar? Yep, it’s the basic API of certain kinds of “objects” (like Java Beans) applied to IoT devices. Check, that will work. This is all very straightforward and elegant. The only difference now is that we use HTTP for RESTful “query” operations and WebSockets for notifications. The details are documented here, and quite straightforward, so no need to discuss them here.
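To make the object analogy concrete, here is a minimal in-memory sketch of a thing with properties, actions and events. The class and method names (`Thing`, `set_property`, `perform_action`, `raise_event`) and the `propertyChanged` event are my own illustrative assumptions, not Mozilla’s API, and the HTTP and WebSocket transport is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Thing:
    """Toy model of a thing: readable/writable properties, actions, events."""
    name: str
    properties: Dict[str, Any] = field(default_factory=dict)
    actions: Dict[str, Callable[..., None]] = field(default_factory=dict)
    listeners: List[Callable[[str, Any], None]] = field(default_factory=list)

    def get_property(self, key: str) -> Any:
        return self.properties[key]

    def set_property(self, key: str, value: Any) -> None:
        self.properties[key] = value
        # Assumption: every write is announced as an event to all listeners.
        self.raise_event("propertyChanged", {key: value})

    def perform_action(self, action: str, **kwargs: Any) -> None:
        self.actions[action](**kwargs)

    def raise_event(self, event: str, data: Any) -> None:
        for listener in self.listeners:
            listener(event, data)


# Hypothetical dimmable lamp with one action that changes a property.
lamp = Thing(name="hall lamp", properties={"on": False, "brightness": 0})
lamp.actions["fade"] = lambda level: lamp.set_property("brightness", level)

seen = []  # stands in for a WebSocket subscriber
lamp.listeners.append(lambda event, data: seen.append((event, data)))

lamp.set_property("on", True)
lamp.perform_action("fade", level=50)
print(seen)
```

In a real deployment the property reads and writes would map to HTTP requests and the listener would be a WebSocket client; the shape of the interaction stays the same.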

More interesting is to ask: how do we know what properties an IoT device has, which actions it can perform, or which events it might raise?

Easy: let the device tell us! And the simplest way is to describe this in a JSON file that can be requested via HTTP from the device. Which is what they have done. So if you want to interact with an unknown device at a certain URL, you first retrieve that JSON file from that URL, which tells you all you need to know, and now you can interact with the device. Or software that interprets that file for you can interact with the device on your behalf.
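As a rough sketch of what such interpreting software might do, the following reads a description document and works out which URLs to talk to. The sample document, its field names (`name`, `properties`, `href`, `actions`, `events`) and the host `thermometer.example` are assumptions for illustration; check the published spec for the real schema.

```python
import json
import urllib.request
from urllib.parse import urljoin

# Hypothetical description document; the field names are assumptions.
SAMPLE_DESCRIPTION = """
{
  "name": "Attic thermometer",
  "properties": {
    "temperature": {"type": "number", "unit": "celsius",
                    "href": "/properties/temperature"}
  },
  "actions": {},
  "events": {"overheated": {"description": "Temperature crossed a threshold"}}
}
"""


def fetch_description(device_url: str) -> str:
    """Ask the device for its JSON description (needs a real device to run)."""
    request = urllib.request.Request(
        device_url, headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def summarize(description_text: str, device_url: str) -> dict:
    """Turn a description into absolute property URLs plus action/event names."""
    doc = json.loads(description_text)
    return {
        "name": doc.get("name", "(unnamed)"),
        "properties": {
            key: urljoin(device_url, meta["href"])
            for key, meta in doc.get("properties", {}).items()
        },
        "actions": sorted(doc.get("actions", {})),
        "events": sorted(doc.get("events", {})),
    }


# Uses the canned sample rather than the network, so it runs anywhere.
summary = summarize(SAMPLE_DESCRIPTION, "http://thermometer.example/")
print(summary)
```

Nothing here is specific to a manufacturer: the client only needs the device URL and the description it serves.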

Note the loose coupling and the self-description: to interact with the device, I don’t need to interact with anybody else. I certainly don’t need to visit the manufacturer’s website, read complicated data sheets, or install networking device drivers. Note that I don’t need to know anything about the manufacturer at all! This is very nice.

I have one misgiving, however: the JSON file containing all the metadata is served from the root URL of the device. What a user experience that makes! “Hi mom, I got this new thermometer and it got its own URL and I plug it in and I go to this URL and look how hot … this JSON looks!” (And that JSON doesn’t contain the temperature at all.) I strongly believe that the metadata should have been put somewhere other than the base URL of the device! The base URL should serve a human-level HTML page so I can type the URL into my browser and interact with the device directly. (Yes, I realize the protocol says this only happens if you ask for application/json. But even if you return HTML otherwise, that HTML would have to have semantically very different content: for usability reasons, it needs to show the actual current state of the device, not just metadata. And if it does that, the different semantics when accessing the same URL appear to violate HTTP principles. So the metadata document needs to move to a different URL, just like we have .well-known and don’t put metadata at /.)

This is the reason why I titled this post “… (not quite) web native”. For me, the web is HTML, not JSON, and there is nothing in the spec that speaks about HTML. This could be fixed easily, however: simply add to the spec that when accessing any of the URLs defined in the spec with a regular browser, a human-readable version of the JSON needs to be displayed. And then fix inconsistencies between expected user experience and locations of JSON in the namespace (for example, the root of the device URL needs to show the status of the device, not the metadata).
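Here is a small sketch of the layout I am arguing for: the root URL returns a human-readable HTML page with the current state, and the metadata lives at its own URL. The path `/.well-known/thing-description` is a made-up placeholder, not something from the spec, and `respond` is a hypothetical handler rather than code from Mozilla’s gateway.

```python
import json
from html import escape

# Hypothetical metadata location; not defined by any spec.
METADATA_PATH = "/.well-known/thing-description"


def respond(path: str, state: dict, metadata: dict):
    """Return (status, content_type, body) for a GET request to the device."""
    if path == "/":
        # Root URL: show the live state to a human in a browser.
        rows = "".join(
            f"<li>{escape(str(key))}: {escape(str(value))}</li>"
            for key, value in state.items()
        )
        body = f"<h1>{escape(metadata['name'])}</h1><ul>{rows}</ul>"
        return 200, "text/html", body
    if path == METADATA_PATH:
        # Separate URL: machine-readable self-description.
        return 200, "application/json", json.dumps(metadata)
    return 404, "text/plain", "not found"


state = {"temperature": 21.5}
metadata = {"name": "Attic thermometer"}

root_status, root_type, root_body = respond("/", state, metadata)
meta_status, meta_type, meta_body = respond(METADATA_PATH, state, metadata)
print(root_type, root_body)
print(meta_type, meta_body)
```

With this split, each URL means one thing regardless of the Accept header, which is the property I find missing today.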

And finally, Mozilla has been writing some software, most notably a web application intended to run on a gateway device. This piece of software does a lot of things. (Is it monolithic? Should it rather be a larger collection of small, independent pieces? I have not had the time yet to pick it apart, so I may be wrong here.) It includes:

a user interface for the user to interact with their things on their home network;
add/remove device functionality;
protocol conversion software, so the gateway software can surface IoT devices that do not natively speak this protocol (i.e. all of them) and we can pretend they actually do;
a basic rules engine, so you can graphically connect sensors and actuators and “script” your home;
enable / disable SSH on the device? What does this do in a web app here? Confused… (Kathy: “just a [developer] convenience… not necessary for products”)
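To illustrate the rules engine item from the list above, here is a minimal sketch of what “connecting a sensor to an actuator” could boil down to. The `Rule` structure, the `apply_rules` function and the device names are hypothetical; I have not looked at how the gateway actually represents rules.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Rule:
    """If sensor.prop satisfies condition, set actuator.target_prop."""
    sensor: str
    prop: str
    condition: Callable[[Any], bool]
    actuator: str
    target_prop: str
    target_value: Any


def apply_rules(rules: List[Rule], things: Dict[str, Dict[str, Any]]) -> None:
    """Evaluate every rule once against the current property values."""
    for rule in rules:
        if rule.condition(things[rule.sensor][rule.prop]):
            things[rule.actuator][rule.target_prop] = rule.target_value


# Hypothetical home: property values keyed by device name.
things = {
    "attic_thermometer": {"temperature": 35.0},
    "attic_fan": {"on": False},
}
rules = [
    Rule("attic_thermometer", "temperature", lambda t: t > 30.0,
         "attic_fan", "on", True),
]

apply_rules(rules, things)
print(things["attic_fan"])
```

A real engine would be driven by events from the devices instead of a single pass over a dictionary, but the sensor-condition-actuator triple is the core of it.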

So the beginnings of a system based on this architecture are appearing. There are still many open questions, however. Here are some:

If I buy a future IoT device (say a wall switch) that natively speaks these protocols, how does it get provisioned into the system? For example:
- Assuming that getting on WiFi is out of scope for this project, once the device is on WiFi, how does it get identified and named on the network? For example, all computers on my own home network have DNS entries that look like XXX.aviatis.com. How do I make that switch accessible as, say, switch2.fronthall.aviatis.com? (And get a TLS cert in the process, too.)
How does security get set up? How do I prevent my kids from turning on, over the network, certain machinery in the house? Or from turning the HVAC down to below freezing? (Kathy: “There should eventually be privileges that are per user.”)
How do I keep track of which device is which? Say I repaint the house, and as part of this project, I temporarily remove all light bulbs. How do I put the right bulb back into the right socket?
If I run a “gateway” app as they envision, but my individual IoT devices all speak this protocol natively, how exactly does the situation stay consistent if changes can be made from all sorts of places? In the extreme case, if a fan (speaking this protocol) has been made a part of the HVAC system, but its speed is changed directly (by talking to its URL), will the HVAC blow up? Perhaps the system needs some kind of exclusive access reservation, so only the HVAC software would be allowed to interact with it?
Where does historical information get stored? Almost all home IoT devices that I can think of have some kind of logging and display function. (Kathy: “One of our new feature goals to implement for the 0.7 gateway release. Suggestions?”)
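The exclusive access reservation floated in the fan question above could look something like the following. This is purely a thought experiment: `ReservableFan` and its methods are hypothetical, clients are identified by plain strings, and nothing like this exists in the spec as far as I know.

```python
from typing import Optional


class ReservableFan:
    """Fan whose speed can only be changed by whoever holds the reservation."""

    def __init__(self) -> None:
        self.speed = 0
        self._owner: Optional[str] = None

    def reserve(self, client: str) -> bool:
        """Grant the reservation if the fan is free or already held by client."""
        if self._owner is None or self._owner == client:
            self._owner = client
            return True
        return False

    def release(self, client: str) -> None:
        """Only the current holder can give the reservation back."""
        if self._owner == client:
            self._owner = None

    def set_speed(self, client: str, speed: int) -> None:
        if self._owner is not None and self._owner != client:
            raise PermissionError(f"fan is reserved by {self._owner}")
        self.speed = speed


fan = ReservableFan()
fan.reserve("hvac")
fan.set_speed("hvac", 60)

try:
    fan.set_speed("phone-app", 100)  # direct access while the HVAC holds the fan
    rejected = False
except PermissionError:
    rejected = True

print(fan.speed, rejected)
```

Whether the device itself or the gateway should enforce this, and how a reservation expires if its holder crashes, are exactly the kinds of questions that remain open.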

To be sure, this is not intended as criticism but as feedback for a project that is still quite early. That’s the whole point of release early and often! I have some smaller feedback, too, which I will file as issues on GitHub as they request.

Next step: see what it would take to make my Pool Timer and my attic fan speak this protocol; each of them is controlled by its own Raspberry Pi.
