Distinguishing between public and private places

Public places

In terms of privacy, streets, parks, stores, and museums are public places. In contrast, residences are private places.

What determines whether a given location is public is whether it is open to the public.

Fees don’t make a public place private; a road is public, even if it has tolls. And a museum is public, even if the entrance charge is high.

Ownership is not relevant to determining whether a given location is public or private. A privately-owned museum is a public place. A store is a public place, even if the land it sits on is privately owned. Conversely, a residence is private, even if it is owned by the government.

Public places and privacy

People have fewer privacy rights in public places. The specifics vary by country and there are endless questions that arise. But, to take an example, Google’s Street View is able to show all that it does because streets are public. And I doubt that Google would get as far with Bedroom View or Bathroom View.

So, distinguishing between public and private spaces might help us effectuate appropriate privacy policies and practices.

Detailed maps

Happily, there exist maps with detail on property uses down to the parcel level. In essence, these maps are souped-up versions of the map in your car navigation device which can help you find gas stations, tourist destinations, hospitals, etc.

Some governmental units make this map data available for free. And commercial map providers enhance and standardize these offerings. So, the data is often available.

Complication – Privacy in Public Places

It’s surprisingly easy to identify someone solely from a track of their movements. The location that someone visits most frequently is usually their home. And, by combining that location with publicly available data (e.g. phone directories), it is often possible to identify the individual (Krumm, 2007).

And, even if we removed from the track all locations within the person’s home, the track would still show the person’s travels to and from their home. So, even a track showing only movements in the public space would compromise anonymity.
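
The home-inference step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure used in Krumm (2007): the function name infer_home_cell, the grid-cell approach, and the 0.001-degree cell size are all assumptions made for the example.

```python
from collections import Counter


def infer_home_cell(track, cell_size=0.001):
    """Return the grid cell that appears most often in a track.

    track: iterable of (lat, lon) readings.
    cell_size: grid resolution in degrees (hypothetical default).
    """
    cells = Counter(
        (round(lat / cell_size), round(lon / cell_size)) for lat, lon in track
    )
    cell, _count = cells.most_common(1)[0]
    return cell


# Made-up readings: three near one spot, two elsewhere.
track = [
    (40.7130, -74.0060),
    (40.7130, -74.0061),
    (40.7131, -74.0060),
    (40.7580, -73.9855),
    (40.7306, -73.9866),
]
likely_home = infer_home_cell(track)
```

The most-visited cell is only a guess at the home location, but once it is known, a directory lookup on that cell is what turns an anonymous track into a named person.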

Mobile Carriers’ Data

In fact, it is the relative ease of identifying someone from a track of their movements that blocks cell phone companies from using their detailed and voluminous tracking data. Carriers are required to be able to locate any subscriber who calls for emergency services. And they have to keep records of a subscriber’s location each time they make or receive a call.

But carriers don’t generally resell this data because it is ‘personally identifiable,’ even if it does not include the subscriber’s name or address (Zhang, et al. 2011).

The Aggregation Solution

The simplest and most common way to prevent identification from an individual track is to get rid of individual tracks by aggregating (or combining) many of them. Once many tracks have been combined, it is no longer possible to discern the movements of a single individual or to identify that individual. Aggregation provides effective privacy protection.

However, aggregation involves large data loss. When you combine lots of detailed tracks, you lose a lot of specificity about where a given trip started, where the people making a given trip started their day, etc.

To some degree, the data loss is intentional; aggregation is trying to lose the data which makes it possible to identify individuals. But the concern is threefold: the data loss is greater than needed, aggregate data are difficult to analyze, and the data loss limits the insight that can be obtained.
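
To make the trade-off concrete, here is a minimal sketch of aggregation in Python. It assumes each track is a list of (lat, lon) readings and that aggregation means counting how many tracks touch each grid cell; the function name aggregate_tracks and the 0.01-degree cell size are hypothetical choices for the example.

```python
from collections import Counter


def aggregate_tracks(tracks, cell_size=0.01):
    """Count how many tracks pass through each grid cell.

    tracks: iterable of tracks, each a list of (lat, lon) readings.
    cell_size: grid resolution in degrees (hypothetical default).
    """
    counts = Counter()
    for track in tracks:
        # A set ensures each track counts once per cell, however long it lingers.
        cells = {
            (round(lat / cell_size), round(lon / cell_size)) for lat, lon in track
        }
        counts.update(cells)
    return counts


tracks = [
    [(40.71, -74.00), (40.72, -74.00)],
    [(40.71, -74.00), (40.71, -74.00)],
    [(40.71, -74.00)],
]
cell_counts = aggregate_tracks(tracks)
```

The result says how busy each cell is, but nothing in it records which cells belonged to the same trip or in what order they were visited. That is the specificity that aggregation gives up.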

Anonymizing tracking data

With two tweaks, it is possible to use the distinction between public and private places to anonymize individual location tracks. If the method succeeds, then individual tracks can be used as data, without aggregation.

First, while minor streets may be public places, they typically don’t have enough traffic to provide sufficient anonymization. So, it’s necessary to blur locations on minor streets, as well as locations in the home itself.

The individual’s locations within this residential neighborhood are blurred to the point in the center. The blurred appearance of the pin there indicates that this pin represents many GPS readings.

 

Figure 1. Track segment with the locations in the home neighborhood blurred to one point.

The track segment in Figure 1 shows the person’s precise location on the busy, major road. But, all of the locations within the residential neighborhood are represented by the single point in the center.

In the same vein, some public places (e.g. restaurants) may not provide sufficient anonymization. And the decision may be made to treat these locations as private.
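
The blurring step can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: it assumes a parcel-level map lookup has already flagged each reading as anonymous enough or not (the anonymous_enough field is hypothetical), it handles a single blurred area, and it uses the average of the hidden readings as the shared point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    lat: float
    lon: float
    anonymous_enough: bool  # hypothetical flag from a parcel-level map lookup


def blur_track(readings):
    """Replace every reading that is not anonymous enough with one shared point."""
    hidden = [r for r in readings if not r.anonymous_enough]
    if not hidden:
        return [(r.lat, r.lon) for r in readings]
    # Assumption: the shared point is the average of the hidden readings.
    center = (
        sum(r.lat for r in hidden) / len(hidden),
        sum(r.lon for r in hidden) / len(hidden),
    )
    return [(r.lat, r.lon) if r.anonymous_enough else center for r in readings]


readings = [
    Reading(1.0, 1.0, True),   # busy major road: kept precise
    Reading(2.0, 2.0, False),  # minor street: blurred
    Reading(4.0, 4.0, False),  # home: blurred
]
blurred = blur_track(readings)
```

A track that passes through several low-traffic areas would need one shared point per area rather than a single average, so a fuller version would group the hidden readings by neighborhood first.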

Second, the longer one is followed, the more distinctive one’s track becomes. Tracking someone for only one day seems to reduce the de-anonymization risk to an acceptable level (Zhang, et al. 2011). Happily, the regularity in people’s travel patterns limits the data lost by shortening the tracking period (Gonzalez, et al. 2008).
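
Limiting the tracking period is simple to express in code. The sketch below assumes each reading is a (timestamp, lat, lon) tuple and keeps only the readings that fall within one day of the first; the function name limit_to_window and the tuple layout are assumptions for the example.

```python
from datetime import datetime, timedelta


def limit_to_window(track, window=timedelta(days=1)):
    """Keep only the readings within `window` of the earliest reading.

    track: list of (timestamp, lat, lon) tuples, in any order.
    """
    if not track:
        return []
    ordered = sorted(track, key=lambda reading: reading[0])
    start = ordered[0][0]
    return [reading for reading in ordered if reading[0] - start < window]


start = datetime(2012, 1, 1, 8, 0)
track = [
    (start + timedelta(hours=30), 3.0, 3.0),  # next day: dropped
    (start, 1.0, 1.0),
    (start + timedelta(hours=23), 2.0, 2.0),
]
one_day = limit_to_window(track)
```

Anchoring the window at the first reading is one choice among several; a calendar-day boundary would work equally well if tracks need to line up across people.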

Uses for the method

Based on where the individual begins or ends his day, it is possible to probabilistically assign each track a demographic profile. For example, one could assume that there is a 70% probability that the owner of a track that began and ended its day in an upper middle class area where 70% of the people are college educated has completed college.
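
The probabilistic assignment above can be sketched like this. The area names and shares are made up for illustration, and the function names are hypothetical; the only idea taken from the text is that a track inherits the demographic share of the area where its day begins and ends.

```python
def college_probabilities(home_areas, college_share):
    """Give each track the college-completion share of its home area.

    home_areas: list with one home-area label per track.
    college_share: dict mapping area label to the share of college graduates.
    """
    return [college_share[area] for area in home_areas]


def expected_graduates(home_areas, college_share):
    """Expected number of college graduates among the tracks."""
    return sum(college_probabilities(home_areas, college_share))


# Made-up shares: 70% of area_a residents completed college, 30% of area_b.
college_share = {"area_a": 0.7, "area_b": 0.3}
home_areas = ["area_a", "area_a", "area_b"]

per_track = college_probabilities(home_areas, college_share)
expected = expected_graduates(home_areas, college_share)
```

No single track is labeled a graduate or a non-graduate. Each carries a probability, and the probabilities only become useful in sums like the expected count, which here comes to about 1.7 of the three tracks.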

Individual tracks might be useful for retail site selection, transportation planning, and profiling attendees at a mall or sporting event.

Identifying public spaces can also help provide services to the user. For example, a person is typically more comfortable sharing her location with her friends when she is in a public place (Toch, et al. 2010). So, using detailed maps to distinguish public and private places might be useful in location sharing apps.

Also, it may emerge that people are more willing to receive advertisements or offers when they are in public.

Limitation – data accuracy

The method outlined here is called ‘location anonymization.’

In addition to the detailed maps, location anonymization requires tracking data precise enough to make the maps useful. If the tracking data is accurate only to within 100 m, then the random error will swamp distinctions relying on parcels that average 25 m.

So, it seems unlikely that the method would work with cell-tower data. But GPS-derived data would seem to have sufficient accuracy.
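
A rough calculation shows why the error size matters. The sketch below rests on strong simplifying assumptions that are not in the original: position error is treated as one-dimensional and normally distributed, the stated accuracy is treated as the standard deviation of that error, and the person is standing at the center of the parcel. Under those assumptions, the share of readings that land inside the correct parcel is erf(half_width / (sd * sqrt(2))).

```python
import math


def share_within_parcel(error_sd_m, parcel_width_m):
    """Share of readings that fall inside the correct parcel.

    Assumes 1-D Gaussian error with standard deviation error_sd_m and a
    true position at the center of a parcel parcel_width_m wide.
    """
    half_width = parcel_width_m / 2
    return math.erf(half_width / (error_sd_m * math.sqrt(2)))


coarse = share_within_parcel(error_sd_m=100, parcel_width_m=25)
fine = share_within_parcel(error_sd_m=5, parcel_width_m=25)
```

With a 100 m error, only about one reading in ten lands in the right 25 m parcel, while with a 5 m error (an illustrative figure, not a measured GPS accuracy) well over 90% do.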

Opt-in Data Collection

Of course, one must obtain the tracks before anonymizing them. And, even if more than 50% of people are carrying GPS-enabled smart phones, consent should be obtained before using these devices for data collection.

Potentially, the movements of those using a location sharing app could provide the data input. Alternatively, the data input could come from location data gathered from people’s use of a search engine or mapping/traffic application. In those situations, opt-in consent could be obtained as part of the registration process, and subjects would be receiving a benefit in exchange for their data.

Alternatively, a random sample of individuals could be recruited and offered an incentive to allow themselves to be tracked for 24 hours, with the understanding that the resulting track would be anonymized. Presumably, this method would involve a larger up-front cost, but the opt-in consent procedure would be less susceptible to criticism, and randomly selecting the individuals would likely result in a more representative sample and better data.

Prospects

Location anonymization is patented in the US (Wood, 2012/I) and a small study seems to show that it works with GPS-derived tracks (Wood, 2012/II). But it has not been implemented on a large scale, the anonymization risks have not been quantified, and it remains to be seen how the method competes against big data approaches relying on aggregation.

References

  1. Gonzalez, M. C., Hidalgo, C. A., and Barabasi, A.-L. Understanding individual human mobility patterns. Nature 453, (2008), 779-782.

  2. Krumm, J. Inference attacks on location tracks. In PERVASIVE’07: Proceedings of the 5th international conference on Pervasive computing. Springer-Verlag, (2007), 127-143.

  3. Toch, E., Cranshaw, J., Drielsma, P. H., Tsai, J. Y., Kelley, P. G., Springfield, J., Cranor, L., Hong, J., and Sadeh, N. Empirical Models of Privacy in Location Sharing. UbiComp. ACM, (2010).

  4. Wood, J. Method of Providing Location-Based Information from Portable Devices. United States Patent 8,185,131. (2012).

  5. Wood, J. Preserving Location Privacy by Distinguishing between Public and Private Spaces. UbiComp poster. (2012).