Module 6: Data Classification


The purpose of this week’s lab is to use a dataset in ArcGIS Pro to explore four common methods of data classification: equal interval, quantile, standard deviation, and natural breaks. Each method has advantages and disadvantages and changes how the data are interpreted. The method selected for a final map depends on the data, the purpose of the map, and the intended audience.

For the Natural Breaks method, data are put into classes so that the values within each class are as similar to each other as possible while the classes themselves are as different from each other as possible. Because the class breaks follow the “natural” gaps that already exist in the data, this method can represent real groupings and trends more accurately on choropleth maps. In the Miami-Dade map, the tracts with the highest and lowest percentages of seniors, as well as all those in between, can be seen.
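To make the idea of "similar within a class, different between classes" concrete, here is a minimal sketch in Python. It is not the algorithm ArcGIS Pro runs internally; it is a brute-force search that tries every possible way of cutting the sorted values and keeps the grouping with the smallest total squared deviation from each class mean, which is the goal natural breaks aims for. The tract percentages and the function names are made up for illustration.

```python
from itertools import combinations

def sum_sq_dev(group):
    """Sum of squared deviations of a group's values from the group mean."""
    m = sum(group) / len(group)
    return sum((v - m) ** 2 for v in group)

def natural_breaks_classes(values, n_classes):
    """Brute-force search for the grouping with the least within-class spread.

    Only practical for small lists; real GIS software uses a faster method.
    """
    ordered = sorted(values)
    n = len(ordered)
    best_cost, best_classes = None, None
    # Each combination of cut positions splits the sorted list into n_classes groups.
    for cuts in combinations(range(1, n), n_classes - 1):
        bounds = (0, *cuts, n)
        classes = [ordered[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sum_sq_dev(c) for c in classes)
        if best_cost is None or cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes

# Hypothetical percentages of seniors per tract (not the lab data).
pct_seniors = [4.0, 5.0, 6.0, 20.0, 21.0, 22.0, 84.0]
classes = natural_breaks_classes(pct_seniors, 3)
print(classes)  # [[4.0, 5.0, 6.0], [20.0, 21.0, 22.0], [84.0]]
```

The breaks land in the gaps between the clusters, and the one very high tract ends up in a class of its own.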

For the Equal Interval method, data are put into classes that each cover an equal range, or "interval," of values, so the breaks are produced in an unbiased way. This method is good for showing values that are over- or underrepresented. However, it does not consider how the data are distributed, so some classes can have many more observations than others, and some can be empty. In the Miami-Dade map, the area with the highest percentage is clearly visible, but it appears as if there are not many senior citizens in other tracts.
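A short Python sketch shows why one extreme tract can dominate an equal interval map. The percentages below are invented for illustration, and the helper names are my own, not anything from ArcGIS Pro.

```python
from bisect import bisect_left

def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class when every class has the same width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]

def class_counts(values, breaks):
    """Count how many values fall at or below each upper bound."""
    counts = [0] * len(breaks)
    for v in values:
        # Clamp the index so the maximum value lands in the last class.
        idx = min(bisect_left(breaks, v), len(breaks) - 1)
        counts[idx] += 1
    return counts

# Hypothetical percentages of seniors per tract, with one high outlier.
pct_seniors = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 84.0]
breaks = equal_interval_breaks(pct_seniors, 4)
counts = class_counts(pct_seniors, breaks)
print(breaks)  # [24.0, 44.0, 64.0, 84.0]
print(counts)  # [7, 0, 0, 1]
```

Seven of the eight tracts fall into the lowest class and two classes are empty, which matches the impression on the map that almost every tract has few seniors.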

For the Quantile method, data are put into classes so that each class has an equal number of observations. No class is empty, and every class is represented on the map. However, the values within a class can be similar or very different from each other, so the range of values in a class can be small or large. In the Miami-Dade map, the tracts shown in the highest class might not all have truly high percentages, because tracts are assigned to classes by rank rather than by how far apart their values are. The Miami-Dade map also obscures the census tract with the highest percentage, which is an outlier on the other maps, by grouping it with other tracts.
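The same effect can be sketched in a few lines of Python. This is a simplified version that splits the sorted values into equal-sized groups and does nothing special about tied values, so it may not match ArcGIS Pro's breaks exactly. The data and function name are hypothetical.

```python
def quantile_classes(values, n_classes):
    """Split the sorted values into classes with (nearly) equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    classes = []
    for i in range(n_classes):
        start = i * n // n_classes
        end = (i + 1) * n // n_classes
        classes.append(ordered[start:end])
    return classes

# Hypothetical percentages of seniors per tract, with one high outlier.
pct_seniors = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 84.0]
classes = quantile_classes(pct_seniors, 4)
ranges = [c[-1] - c[0] for c in classes]
print(classes)  # [[4.0, 6.0], [8.0, 10.0], [12.0, 14.0], [16.0, 84.0]]
print(ranges)   # [2.0, 2.0, 2.0, 68.0]
```

Every class holds two tracts, but the top class spans 68 percentage points and the outlier at 84 shares a color with a tract at 16.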

For the Standard Deviation method, data are put into classes by calculating the mean and then setting class breaks at intervals of the standard deviation above and below the mean. This means that most of the observations will be put in one class around the mean and there will be fewer and fewer observations in classes farther away from the mean, so outliers can be easily seen on the map. The method takes into account how data are distributed, but it works best for data that are normally distributed on a bell curve. Maps made with this method are difficult for most audiences to interpret because they require a basic understanding of statistics. In the Miami-Dade map, we can immediately see which two tracts have the highest and lowest percentages of seniors.
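Here is a rough Python sketch of how such breaks can be computed. It assumes one class centered on the mean with breaks at 0.5 and 1.5 standard deviations on either side, and it uses the population standard deviation; ArcGIS Pro offers several interval sizes and may compute the statistic differently. The data and function names are invented for illustration.

```python
from bisect import bisect_left
from statistics import mean, pstdev

def std_dev_breaks(values):
    """Breaks at -1.5, -0.5, +0.5 and +1.5 standard deviations from the mean."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation (an assumption)
    return [m + k * s for k in (-1.5, -0.5, 0.5, 1.5)]

def class_counts(values, breaks):
    """Count values per class; four breaks define five classes."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[bisect_left(breaks, v)] += 1
    return counts

# Hypothetical percentages of seniors per tract, with one high outlier.
pct_seniors = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 84.0]
breaks = std_dev_breaks(pct_seniors)
counts = class_counts(pct_seniors, breaks)
print([round(b, 1) for b in breaks])
print(counts)  # [0, 2, 5, 0, 1]
```

Most tracts sit in the class around the mean and the single tract more than 1.5 standard deviations above it stands out. Notice also that the lowest break is negative, which cannot happen with a percentage and is a sign that these made-up values are not normally distributed.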
