The Importance of Location in Real Estate, Weather, and Machine Learning

Real estate experts like to say that the three most important features of a property are location, location, location! Likewise, weather events are highly location-dependent. We will see below that a similar perspective applies to machine learning algorithms.

In real estate, the buyer is first and foremost concerned about location for at least three reasons: (a) the desirability of the surrounding neighborhood; (b) the proximity to schools, businesses, services, etc.; and (c) the value of properties in that area. Similarly, meteorologists tell us that all weather is local. Location is significant in weather for at least three reasons as well: (a) specific weather events are almost impossible to predict because of the massive complexity of micro-scale interactions of atmospheric phenomena that are spread over macro-scales of hundreds of miles; (b) the specific outcome of a weather prediction may occur only in highly localized areas; and (c) the minute details of a location (topography, hydrology, structures) are too specific to be included in regional models, and yet they are very significant variables in micro-weather events. Side note: we might have a good start here on generating some predictive models (for real estate sales or for weather), if we could parameterize the above location-based features and score them appropriately.

Another aspect of “location” is the boundary region between different areas. This boundary region can affect real estate sales, especially if a desirable area is adjacent to an undesirable area. While conditions (prices, market factors, resale values) may be well understood deep within each of the two areas, there is more uncertainty in the boundary region. The same is true for the weather, as was especially evident in the big snow and ice storms that swept across the United States on March 3, 2014. Those of us in the Baltimore-Washington region were expecting significant snow, ice, rain, and sleet. What we received was a moderate amount of snow across most of the region and not much else. This “less significant” weather event was partially due to the fact that cold dry air from the north won the battle against warm wet air from the south, pushing drier air into the region than was expected. Wrong predictions of massive snowfalls are not unusual in this part of the country, primarily because this latitude is often within the boundary region between the northern and southern weather circulation patterns. It is often difficult to predict reliably which weather pattern will win the battle in the boundary region during any particular storm.

Location is also very important in many machine learning algorithms. The simplest classification (supervised learning) algorithms in machine learning are location-based: classify a data point based on its location on one side or the other of some decision boundary (Decision Tree), or classify a data point based on the classes of its nearest neighbors (K-nearest neighbors, or KNN). Furthermore, clustering (unsupervised learning) is intrinsically location-based, using distance metrics to ascertain similarity or dissimilarity among intra-cluster and inter-cluster members. All of this is a natural consequence of the fact that humans place things into different categories (or classes) when we see that different categories of items are clearly separated in some feature space (i.e., they occupy different locations in that space). The challenge for data scientists is to find the best feature space for distinguishing, disentangling, and disambiguating different classes of behavior. Sometimes (though not often) those “best” features are the ones that we measured at the beginning, but we can usually discover improved classification features as we explore different combinations (linear and nonlinear) of the initial measured attributes.
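To make the location idea concrete, here is a minimal Python sketch under stated assumptions. The toy data (two classes placed on concentric rings), the helper names `ring`, `knn_classify`, and `threshold_accuracy`, and the choice of k = 3 are all hypothetical illustrations, not anything taken from a real dataset or library. The sketch classifies a point by the labels of its nearest neighbors, and then compares a single decision boundary drawn on a raw measured attribute (x) with the same kind of boundary drawn on a nonlinear combination of the attributes (the radius).

```python
import math
from collections import Counter

def ring(radius, n, label):
    """Return n evenly spaced, labeled points on a circle of the given radius."""
    return [((radius * math.cos(2 * math.pi * i / n),
              radius * math.sin(2 * math.pi * i / n)), label)
            for i in range(n)]

# Hypothetical toy data: two classes that occupy different "locations"
# in feature space (an inner ring and an outer ring).
training = ring(1.0, 16, "inner") + ring(3.0, 16, "outer")

def knn_classify(point, training, k=3):
    """Label a point by majority vote among its k nearest training points."""
    nearest = sorted(training, key=lambda item: math.dist(point, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def threshold_accuracy(feature, threshold, data):
    """Accuracy of a one-split rule: 'inner' if feature(point) < threshold."""
    hits = sum(("inner" if feature(p) < threshold else "outer") == label
               for p, label in data)
    return hits / len(data)

# Try many single cuts on the raw attribute x and on the derived radius.
thresholds = [t / 10 for t in range(-35, 36)]
best_on_x = max(threshold_accuracy(lambda p: p[0], t, training)
                for t in thresholds)
best_on_r = max(threshold_accuracy(lambda p: math.hypot(*p), t, training)
                for t in thresholds)

print(knn_classify((0.2, -0.1), training))  # a point near the center
print(knn_classify((2.8, 0.5), training))   # a point near the outer ring
print(best_on_x, best_on_r)
```

In this toy setup, no single cut on x separates the two rings, while a single cut on the radius separates them completely. That is a small instance of the point made above: the classes were always in different locations, but only the derived feature space makes the separation simple to express.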
