In a previous post, I mentioned that we did some analysis to detect IoT devices versus human users. The labelling was based on the International Mobile Equipment Identity (IMEI) Type Allocation Code (TAC). In a database of TAC values, there is a field per device that specifies the type of device. We treated devices of type “mobile phone” as used by humans, and devices of type “M2M” or “module” as IoT devices. We left out “Router”, “Tablet”, “Dongle” and “unknown” since it was not clear whether these were humans or machines. In the absence of ground truth, this seems like a reasonable approach.
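To make the labelling rule concrete, here is a minimal sketch in Python. The device-type strings are taken from the description above, but the exact spelling in a real TAC database may differ, and the function names `tac_of` and `label_device` are made up for illustration. The only outside fact used is that the TAC is the first 8 digits of the IMEI.

```python
# Device-type strings as described in the post; real TAC databases may
# spell or capitalise them differently (assumption).
HUMAN_TYPES = {"mobile phone"}
MACHINE_TYPES = {"m2m", "module"}


def tac_of(imei):
    """Return the Type Allocation Code: the first 8 digits of the IMEI."""
    return imei[:8]


def label_device(device_type):
    """Map a TAC device type to 'human', 'machine', or None.

    None covers the types left out of the labelled subset
    (Router, Tablet, Dongle, unknown).
    """
    key = device_type.strip().lower()
    if key in HUMAN_TYPES:
        return "human"
    if key in MACHINE_TYPES:
        return "machine"
    return None
```

Returning `None` for the ambiguous types keeps them out of the human/machine subset rather than forcing a guess.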
In a dataset of 195,000 unique devices taken from a large mobile network operator, we noticed that the majority of the devices were “mobile phone”, which makes sense given our understanding of user distribution. When we created a subset with only devices designated as human or machine (IoT), we ended up with 95% of the sample being human.
The full feature set for our data had 126 different features, with daily observations of device usage over a 12-day period. The main insight from this analysis was that machines and humans have different levels of:
Average Revenue Per User (ARPU) level (ordinal ranking from our collection system)
- humans higher than machines
Data download (DL) usage
- most machines do not have any DL reports over the 12-day period
Internet service usage
- most machines do not have any service usage (makes sense since they have no data DL).
As mentioned in the other post, I made a classifier based on ARPU levels and the presence of downloaded data, which was reasonably accurate. But there were significant minorities in each group that acted like the other and contributed to the error in the classifier. I named these error groups:
- Humanoids: Machines that act like humans (8.31% of the machines). These are devices that download data like users of a mobile phone, and have significant internet service usage.
- Cyborgs: Humans that act like machines (10.84% of the humans). These are humans using mobile devices who never or very rarely download data or use internet services: people who use their smartphones to make calls and send SMS but never connect to data.
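The two error groups fall out of comparing the TAC-based label with the classifier's prediction. Here is a rough sketch of that comparison. The decision rule in `predict` (any downloaded data, or an ARPU level at or above a threshold, means human) is my assumption for illustration, not the exact rule from the original classifier, and the threshold, function names and toy rows are all made up.

```python
def predict(arpu_level, dl_bytes, arpu_threshold=3):
    """Assumed rule: downloaded data or a high ARPU level suggests a human."""
    if dl_bytes > 0 or arpu_level >= arpu_threshold:
        return "human"
    return "machine"


def error_group(tac_label, predicted):
    """Name the misclassified groups; return None when label and prediction agree."""
    if tac_label == "machine" and predicted == "human":
        return "humanoid"  # machine that behaves like a human
    if tac_label == "human" and predicted == "machine":
        return "cyborg"  # human that behaves like a machine
    return None


# Toy rows, invented for illustration: (device_id, tac_label, arpu_level, dl_bytes)
devices = [
    ("d1", "human", 4, 5_000_000),
    ("d2", "human", 1, 0),
    ("d3", "machine", 1, 0),
    ("d4", "machine", 2, 9_000_000),
]

groups = {
    dev_id: error_group(label, predict(arpu, dl))
    for dev_id, label, arpu, dl in devices
}
```

In this toy data, `d2` lands in the cyborg group (a phone with no data and a low ARPU level) and `d4` in the humanoid group (a module that downloads data).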
A little digging into the data yielded some insights about these groups.
On investigating a few Humanoids, we found that they were modules that could be used in laptop computers or IoT devices. If these modules have dual use, it makes sense that devices containing them could be either human-operated or IoT.
In the case of Cyborgs, it was a little less clear because all of them had smartphones, so in theory they should be using data services. However, in another recent investigation with an operator, we found that approximately 18% of the subscribers had no significant data usage, despite having a data plan. It seems our Cyborgs are humans who are non-users of internet technology, much like the 13% of Americans that still don’t use the internet (see my other post). This raises the question, “Why not?”, but I don’t have an answer for that at this time.
The last thing I wanted to mention was the relative utility of using IMEI TAC to identify human versus IoT users in a mobile network. Before seeing these results, most folks in the industry would affirm that IMEI TAC is a good way to distinguish IoT devices from humans. But because there are dual-use devices that serve both IoT and humans, this is not a very good way to classify. In fact, for the 22 different device types in our sample that were considered “Machine” devices:
- 60.0% were used by devices that only acted like IoT machines
- 8.6% were used by devices that only acted like humans
- 31.4% were used in devices that acted like both humans and IoT devices.
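A breakdown like the one above can be computed by grouping the machine-labelled devices by device type and checking which behaviours show up within each type. This is only a sketch: the function name `behaviour_mix`, the `'iot'`/`'human'` behaviour tags and the input shape are assumptions for illustration, not the code behind the numbers above.

```python
from collections import Counter, defaultdict


def behaviour_mix(observations):
    """Share of device types that act IoT-only, human-only, or both.

    observations: iterable of (device_type, behaviour) pairs, one per device,
    where behaviour is 'iot' or 'human' (hypothetical tags).
    Returns percentages over the distinct device types.
    """
    seen = defaultdict(set)
    for device_type, behaviour in observations:
        seen[device_type].add(behaviour)

    counts = Counter()
    for behaviours in seen.values():
        if behaviours == {"iot"}:
            counts["iot_only"] += 1
        elif behaviours == {"human"}:
            counts["human_only"] += 1
        else:
            counts["mixed"] += 1

    total = len(seen)
    keys = ("iot_only", "human_only", "mixed")
    if total == 0:
        return {k: 0.0 for k in keys}
    return {k: 100.0 * counts[k] / total for k in keys}
```

Note that the percentages are over device types, not over individual devices, which matches how the list above is framed.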
Moral of the story: IMEI TAC does not tell you with accuracy if a device is an IoT device or not. And a lot of humans don’t surf the web on their mobile devices.
Here is a graphic of the relative allocation of humanoid and cyborg device information.