Introduction to Spatial Analysis

00:31:57

https://www.youtube.com/watch?v=yZzJWXb8pWs

Summary

TL;DR: The video introduces spatial analysis in GIS: applying GIS operations to geographic data to solve real-world problems. It first reviews earlier material (the vector and raster data models, how geographic features and locations are represented, and how data is collected and managed). It then turns to spatial analysis, where existing data such as census and land use layers are combined and turned into actionable knowledge. Example applications include choosing a suitable location for a new hospital and identifying high-risk areas for disease outbreaks. The lecture covers the three categories of GIS operations (local, neighborhood, and global) and two basic operations, selection and classification. Selection can be based on attributes or on spatial relationships, and classification groups data by predefined criteria. Finally, four methods for classifying numeric data (equal interval, quantile, natural breaks, and manual) are explained and shown to produce different maps from the same data.

Takeaways

🔍 Introduction to spatial analysis in GIS
🗺 Understanding GIS data models and coordinate systems
📊 Importance of turning geographic data into actionable knowledge
🌍 Examples include locating new hospitals and managing disease outbreaks
🔗 Concept of chaining GIS operations to solve complex problems
📍 Types of analysis: local, neighborhood, global
📌 Selection by attribute/location in spatial analysis
🔄 Classification into categories based on conditions
📈 Various classification methods: equal interval, quantile, natural breaks
🛠 GIS operations as building blocks for spatial problem-solving

Timeline

00:00:00 - 00:05:00
Introduction to spatial analysis in GIS, reviewing previous topics like GIS data models and data preparation. Moving forward to applying spatial analysis on GIS data to solve real-world problems, such as determining the best location for a hospital.
00:05:00 - 00:10:00
Explaining the spatial analysis technique as the application of GIS operations to solve real-world problems. Introducing the concept of chaining GIS operations and how they form a GIS model to perform complicated tasks.
00:10:00 - 00:15:00
Discussing the three categories of GIS operations: local, neighborhood, and global. Each type is explained with examples, like calculating population density or adjacency between states.
00:15:00 - 00:20:00
Introduction to selection GIS operations, focusing on selection by attribute and location. Decision criteria involve attribute values or spatial relationships, applicable primarily to vector data models.
00:20:00 - 00:25:00
Further detail on using selection by attribute in raster data, and how GIS processes these queries to produce binary grids. Introduction to classification as a GIS operation to summarize features.
00:25:00 - 00:31:57
Different methods of classification, including equal interval, quantile, natural break, and manual classification. Discussion on application of these methods and how map visualizations can vary based on classification methods.

Video Q&A

What is spatial analysis in GIS?
Spatial analysis is the application of GIS operations to solve real-world problems by analyzing geographic data.
What are basic operations in spatial analysis?
Basic operations include selection and classification, which involve choosing subsets of data and grouping data based on criteria.
How is GIS data represented?
GIS data is represented using vector and raster data models with geographic and projected coordinate systems.
What types of data are used in spatial analysis?
Data types include census data, land use, land cover, road maps, and digital elevation models among others.
What is a GIS model?
A GIS model is a chain of GIS operations where the output of one operation serves as the input for the next to solve complex problems.
What are the types of GIS operations?
GIS operations are categorized into local, neighborhood, and global operations based on the spatial extent of data they analyze.
How is selection by attribute performed?
It involves selecting data subsets by querying attributes using operators like AND, OR, greater than, or equal to.
What is the purpose of classification in GIS?
Classification assigns new values to existing data based on conditions, simplifying and organizing the data into categories.
How do classification methods differ?
Methods like equal interval, quantile, and natural breaks differ in dividing data into categories, affecting visual results.
Why are there different classification techniques?
Different techniques highlight various data patterns, useful for making interpretations or decisions based on specific needs.

Subtitles

00:00:00
all right today we're going to become
00:00:01
familiar with this spatial analysis
00:00:03
which is one of the fanciest topics in
00:00:05
GIS first we will learn what is the
00:00:08
spatial analysis and then some basic
00:00:10
operations in spatial analysis
00:00:13
including selection and classification
00:00:17
so far we have talked about the
00:00:21
definition of GIS GIS data models vector
00:00:25
data model and raster data model how to
00:00:28
represent geographic features and also
00:00:32
locations with coordinate geographic
00:00:35
coordinate system and projected
00:00:37
coordinate system how to manage
00:00:39
attributes into database and how to
00:00:43
produce GIS data from scratch
00:00:46
collecting GPS waypoints as a control
00:00:49
points then georeferencing the scanned map and
00:00:52
finding and digitizing features so all
00:00:56
of them all of these topics that we
00:00:59
covered for about the data collection
00:01:02
and data preparation data production
00:01:05
from scratch and data representation as
00:01:09
map now we are ready to move to the next
00:01:11
step which is spatial analysis we have
00:01:15
the GIS data like census data land use
00:01:21
data land cover road maps
00:01:25
topography data such as digital
00:01:27
elevation model disease data such as
00:01:31
disease count prevalence incidence and
00:01:33
the next step is to combine these data
00:01:36
and turn the data into knowledge to
00:01:39
solve real world problem for example
00:01:42
where is the best place to build a new
00:01:45
hospital based on several criteria
00:01:47
factors like slope proximity to high
00:01:50
population proximity to road should be
00:01:53
combined or where are the high risk
00:01:55
areas for disease outbreak we can
00:01:58
allocate resources budget to these areas
00:02:00
to control the outbreak which route is
00:02:03
the shortest route between your house to
00:02:05
the University so these are knowledge
00:02:07
that we can extract from the data so
00:02:11
what is the spatial analysis technique
00:02:13
the spatial analysis is the application
00:02:16
of one or more GIS operation in order to
00:02:19
solve real-world problem such as what is
00:02:22
the shortest path between your house and
00:02:24
department by car how many people are
00:02:27
living within one kilometer around the
00:02:29
hazardous site for risk assessment or
00:02:33
health disparity which areas have lowest
00:02:36
access to healthcare or another
00:02:38
real-world application in public health
00:02:40
according to Virginia Department of
00:02:42
Health Virginia coupled GIS mapping
00:02:46
with spatial analysis to identify areas
00:02:49
where infant mortality rates are the
00:02:52
highest the extent of racial and ethnic
00:02:55
disparities in infant deaths the
00:02:57
underlying causes of those infant deaths
00:02:59
and how to best intervene so these are
00:03:02
knowledge you can extract from the
00:03:04
prepared data to support your
00:03:06
decision-making to make decision based
00:03:08
on the fact technically a spatial
00:03:13
analysis is a chain of GIS operations a
00:03:16
chain basically means the output from
00:03:18
one GIS operation serves as the
00:03:20
input of the second GIS operation and so
00:03:23
on so operations are linked together to
00:03:26
solve the real-world problems so here
00:03:30
for example the input layer are we have
00:03:33
an input layer a spatial analysis is
00:03:36
done on the input layer so we will have
00:03:38
an output layer the output layer will be
00:03:40
the input for the second special
00:03:42
operation and gives us the output layer
00:03:44
and this output layer serves as input
00:03:47
for the spatial operation and gives us a
00:03:50
final output layer so you can see the
00:03:52
operations are linked together to solve
00:03:54
the real-world problem and by chaining
00:03:59
GIS operation you will have a
00:04:01
complicated GIS flowchart like this and
00:04:04
a couple of input layers with many
00:04:07
operations linked together it is called
00:04:10
a GIS model in the future lecture we
00:04:13
will build sophisticated GIS model in
00:04:16
ArcGIS and solve complicated real world
00:04:19
problems that are not easy to solve
00:04:22
without GIS
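The chaining idea described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the three operations, the feature attributes, and the thresholds (1000 people, 100 people per unit area) are hypothetical and are not taken from the lecture or from ArcGIS.

```python
from functools import reduce

def select_high_population(layer):
    """Selection: keep only features with more than 1000 people (assumed threshold)."""
    return [f for f in layer if f["population"] > 1000]

def add_density(layer):
    """Local operation: density computed from each feature's own attributes."""
    return [{**f, "density": f["population"] / f["area"]} for f in layer]

def classify_density(layer):
    """Classification: label each feature 'high' or 'low' (assumed cutoff of 100)."""
    return [
        {**f, "density_class": "high" if f["density"] >= 100 else "low"}
        for f in layer
    ]

def run_model(layer, operations):
    """Chain the operations: the output of one is the input of the next."""
    return reduce(lambda current, op: op(current), operations, layer)

# Hypothetical input layer: a list of features with attributes.
input_layer = [
    {"name": "A", "population": 5000, "area": 20.0},
    {"name": "B", "population": 800, "area": 10.0},
    {"name": "C", "population": 3000, "area": 60.0},
]

result = run_model(
    input_layer, [select_high_population, add_density, classify_density]
)
for feature in result:
    print(feature["name"], feature["density"], feature["density_class"])
```

Each function takes a layer and returns a new layer, which is what makes the operations easy to link into a longer model.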
00:04:24
so basically there are three categories
00:04:27
of GIS operations we will talk about
00:04:30
a little today we just scratch the
00:04:33
surface and with more details in the
00:04:36
future lectures so GIS operations can be
00:04:39
divided into three categories local
00:04:41
operations neighborhood operations and
00:04:43
global operations local operations
00:04:46
basically mean the output is determined
00:04:50
based on the input in the same location
00:04:53
so for input and output are for one
00:04:56
location for instance this gray polygon
00:05:02
is the state of Utah we want to
00:05:04
calculate population density of this
00:05:07
state and the output here only depends
00:05:11
on the population size and also the area
00:05:14
of that state so we only need
00:05:16
information about Utah State to
00:05:18
calculate population density of the same
00:05:21
state so that's called local operation
00:05:24
the second type is called neighborhood
00:05:26
operation and this operation uses data
00:05:29
from both input location plus the nearby
00:05:32
or neighborhood locations to determine
00:05:36
the output value so that's why it's
00:05:38
called neighborhood operation for
00:05:40
example let's say let's say you want to
00:05:45
find out the number of adjacent states to
00:05:48
the Utah State okay or in other words
00:05:51
how many states share a boundary with Utah
00:05:54
in this case we not only consider the
00:05:58
location of Utah State but also the
00:06:00
surrounding states as well to calculate
00:06:02
how many adjacent states to Utah
00:06:04
State which is one two three four five
00:06:08
six and the output will be six right so
00:06:13
that is called neighborhood operation
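As a rough sketch of the difference between the two operation types described so far, the following Python snippet computes a local result (population density) and a neighborhood result (number of adjacent states). The population and area figures are rounded, illustrative assumptions, and the adjacency table is typed in by hand; a real GIS would derive adjacency from the polygon geometry.

```python
# Illustrative, rounded attribute values (assumptions, not data from the lecture).
states = {
    "Utah": {"population": 1_700_000, "area_sq_mi": 85_000},
    "California": {"population": 30_000_000, "area_sq_mi": 160_000},
}

# Hand-written adjacency table. Corner contact (Utah and New Mexico at the
# Four Corners) is counted as adjacent here, which gives the six neighbors
# counted in the lecture.
neighbors = {
    "Utah": ["Idaho", "Wyoming", "Colorado", "New Mexico", "Arizona", "Nevada"],
    "California": ["Oregon", "Nevada", "Arizona"],
}

def population_density(name):
    """Local operation: uses only the attributes of the state itself."""
    state = states[name]
    return state["population"] / state["area_sq_mi"]

def adjacent_count(name):
    """Neighborhood operation: the result depends on the surrounding states."""
    return len(neighbors[name])

print(population_density("Utah"))
print(adjacent_count("Utah"), adjacent_count("California"))
```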
00:06:15
and neighborhood operation involves
00:06:17
neighbors of the study region for
00:06:20
calculation for example for California
00:06:22
it shares a border with one two three so
00:06:26
the output will be three and depends on
00:06:29
the neighbors third type is global
00:06:33
operation which means this operation
00:06:35
uses the
00:06:36
value from the entire input layer to
00:06:39
determine each output value for example
00:06:42
here here is let's say we want to rank
00:06:47
Western states by their total
00:06:51
population for the year 1990 in this
00:06:54
case we not only consider the population
00:06:56
size of for example Utah but also the
00:07:00
population size for all of the Western
00:07:04
states all of this study area and then
00:07:06
we can rank them in order and find out
00:07:09
Utah actually has a rank order of six
00:07:12
okay and California has the largest
00:07:15
population ranked number one among the
00:07:18
other state the second one is Washington
00:07:21
and so on so global operation involve
00:07:25
entire study area for calculation GIS
00:07:35
operations are building blocks for
00:07:38
complicated spatial analysis or spatial
00:07:41
analysis consists of several GIS
00:07:43
operation which aims to address
00:07:45
real-world issue basically there are
00:07:47
four types of GIS operations selection
00:07:50
classification and proximity and overlay
00:07:53
which are applicable for both vector and
00:07:56
raster data set we're going to talk
00:07:58
about the operations one by one and in
00:08:01
today's class we will talk only about
00:08:04
selection and classification and
00:08:07
proximity and overlays for the future
00:08:10
lectures let's start with the selection
00:08:15
so selection is a GIS operation that
00:08:19
creates a subset of data that satisfies
00:08:22
certain criteria so there are two types
00:08:25
of selection selection by attribute and
00:08:27
selection by location selection by
00:08:30
attribute that we already have worked
00:08:32
with we use that to select a specific
00:08:34
type of cancer based on location and it
00:08:38
selected specific records or feature
00:08:40
based on attribute values but the
00:08:44
selection by location selects record
00:08:46
based on a spatial relationship like
00:08:49
composition intersection adjacency
00:08:52
or in general the topological
00:08:56
relationship that we talked about before
00:08:59
so selection by location is only
00:09:02
applicable for vector data model because
00:09:05
raster data do not have any
00:09:08
spatial topological information so let's
00:09:13
start with the selection by attribute
00:09:15
for vector data there are a number of
00:09:18
operators you can use like and not or
00:09:21
greater than less than equal to and so
00:09:24
on to write your statement or query so
00:09:28
we can use these operators to build a
00:09:30
query a statement to specify
00:09:33
the selection criteria so we've worked with
00:09:36
that in the previous labs so basically
00:09:39
in select by attribute GIS applies the
00:09:42
query on each feature and then compares
00:09:44
the attributes of each feature with the
00:09:47
criteria the feature or records that
00:09:49
meet the criteria will be selected for
00:09:52
example here we want to select all
00:09:54
counties with attribute name of the
00:09:57
state of Vermont so it is based on the
00:10:00
attribute table or we want to select all
00:10:03
counties that their attribute name is
00:10:05
not New York so this operator is shows
00:10:10
not or we want to select all
00:10:13
counties with areas larger than or equal
00:10:16
to 1,000 square miles so you can see
00:10:20
that so it's based on the area based
00:10:22
based on the attribute table there is a
00:10:24
field that the name of area so based on
00:10:26
that one we can select all counties or
00:10:29
select the counties that population
00:10:31
density is less than 250 persons per
00:10:34
square mile
00:10:35
so these are the selected counties
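The queries described above can be mimicked in plain Python to show how a selection by attribute is evaluated feature by feature. This is a sketch, not ArcGIS syntax: the county records, field names, and the helper `select_by_attribute` are hypothetical.

```python
# Hypothetical attribute table: one record per county.
counties = [
    {"name": "County A", "state": "Vermont", "area_sq_mi": 700, "pop_density": 90},
    {"name": "County B", "state": "New York", "area_sq_mi": 1200, "pop_density": 400},
    {"name": "County C", "state": "New Hampshire", "area_sq_mi": 1800, "pop_density": 40},
]

def select_by_attribute(features, predicate):
    """Apply the query to each feature and keep those that satisfy it."""
    return [f for f in features if predicate(f)]

# state = 'Vermont'
in_vermont = select_by_attribute(counties, lambda f: f["state"] == "Vermont")

# NOT state = 'New York'
not_new_york = select_by_attribute(counties, lambda f: not f["state"] == "New York")

# area >= 1000
large = select_by_attribute(counties, lambda f: f["area_sq_mi"] >= 1000)

# area >= 1000 AND density < 250
large_and_sparse = select_by_attribute(
    counties, lambda f: f["area_sq_mi"] >= 1000 and f["pop_density"] < 250
)

print([f["name"] for f in large_and_sparse])
```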
00:10:37
selection by attribute can also be
00:10:40
applied to raster data so let's say we
00:10:43
have a very simple raster dataset which
00:10:46
describes land use type for a study area
00:10:49
one represents agriculture two
00:10:51
represents forests and three represents
00:10:54
urbanized area let's say we want we only
00:10:57
want to tease out the urbanized areas
00:11:00
okay we can define the SQL statement
00:11:03
like land use type equals three and
00:11:06
then GIS will read this criteria and
00:11:09
then compare that compared every single
00:11:12
pixel one by one so let's start with the
00:11:15
first cell the first cell is one and one
00:11:19
doesn't satisfy the criteria because
00:11:22
land use type is not equal to three so
00:11:24
the output will be 0 which means false
00:11:28
and then we go to the next cell the next
00:11:32
cell again is 1 and doesn't satisfy so 0
00:11:35
the second the third one is 2 and
00:11:38
doesn't satisfy it's not 3 so it's 0 so
00:11:42
until we reach this cell this cell is 3
00:11:47
and 3
00:11:48
land use type is urban area so it
00:11:50
satisfies the criteria so the output
00:11:53
will be 1 so 1 means that criteria is
00:11:57
satisfied and so that's how GIS that's
00:12:01
how GIS can scan and do
00:12:04
the select by attribute for the raster
00:12:06
dataset okay so keep in mind that the
00:12:09
output of selection by attribute for
00:12:12
raster dataset is always a binary grid
00:12:14
it's either 0 or 1
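The cell-by-cell scan described above can be sketched as follows. The 3x3 grid is a made-up example (only its first row, 1, 1, 2, follows the walk-through), and `select_raster` is a hypothetical helper, not a GIS library function.

```python
# Land use codes: 1 = agriculture, 2 = forest, 3 = urbanized area.
land_use = [
    [1, 1, 2],
    [2, 3, 3],
    [1, 2, 3],
]

def select_raster(grid, predicate):
    """Test every cell against the criterion: 1 if satisfied, 0 otherwise."""
    return [[1 if predicate(cell) else 0 for cell in row] for row in grid]

# Criterion: land use type = 3 (urbanized area).
urban = select_raster(land_use, lambda value: value == 3)

for row in urban:
    print(row)
```

Whatever the criterion, the output has the same shape as the input and contains only 0 and 1.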
00:12:18
okay or how about this one land use type
00:12:21
greater than 1 so it compares each cell
00:12:24
to the criteria and if satisfies this
00:12:27
criteria the result will be 1
00:12:29
otherwise it will be 0 so again the
00:12:32
output is only binary raster so let's
00:12:38
move on to the second type of selection
00:12:40
which is selection by location so here
00:12:43
our selection criteria is not defined
00:12:45
based on attribute information it's
00:12:48
defined based on a spatial relationship
00:12:50
or topological information spatial
00:12:53
relationship between two features we
00:12:56
have talked about before include the
00:12:57
intersection of two lines or adjacency
00:13:01
between two polygons or composition of
00:13:03
polygons or line or containments so
00:13:06
these are spatial relationship and
00:13:09
selection by location usually involves
00:13:12
two or
00:13:14
more layers as the input so we need at
00:13:17
least two layers for the selection by
00:13:19
location for the vector data let's see
00:13:23
one example of selection by location
00:13:25
based on adjacency selection so we want
00:13:29
to know which states are adjacent to
00:13:31
Missouri and so we have a shapefile of
00:13:35
Missouri and a shapefile of all of the
00:13:37
US states now we're not looking at
00:13:40
attribute of the Missouri state but we
00:13:43
look at the spatial relationship between
00:13:46
Missouri state and the surrounding
00:13:48
states and then the surrounding polygon
00:13:50
can be teased out like this or
00:13:54
containment selection is another
00:13:56
topological relationship for instance
00:13:59
which states contain the Mississippi
00:14:02
River so this is the Mississippi River
00:14:06
okay and and we have it as a shapefile
00:14:12
and also we have used US states as
00:14:15
another shapefile and all states that
00:14:17
contain part of Mississippi River the
00:14:21
entire state will be selected out so
00:14:23
these gray states okay all of these
00:14:26
gray states they contain part of
00:14:29
Mississippi River
00:14:30
so in selection by location we're
00:14:32
looking at the spatial relationship
00:14:34
between the features for example
00:14:36
adjacency or containment sometimes it's
00:14:41
intersection sometimes it's composition
00:14:43
so this is called selection by location
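To make the adjacency case concrete, here is a toy Python sketch of selection by location. It assumes every feature is an axis-aligned rectangle given as (xmin, ymin, xmax, ymax), and it treats two features as adjacent only when they share an edge of positive length (corner contact does not count in this sketch). The zone names and coordinates are invented; real GIS software tests arbitrary polygons.

```python
def shares_boundary(a, b):
    """True if two rectangles touch along an edge without overlapping."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    x_overlap = min(ax1, bx1) - max(ax0, bx0)
    y_overlap = min(ay1, by1) - max(ay0, by0)
    touch_vertical_edge = x_overlap == 0 and y_overlap > 0
    touch_horizontal_edge = y_overlap == 0 and x_overlap > 0
    return touch_vertical_edge or touch_horizontal_edge

def select_by_location(layer, target_geom, relation):
    """Keep the features whose geometry has the given relation to the target."""
    return sorted(
        name for name, geom in layer.items() if relation(geom, target_geom)
    )

# Hypothetical polygon layer.
zones = {
    "West": (0, 0, 2, 2),
    "Center": (2, 0, 4, 2),
    "East": (4, 0, 6, 2),
    "North": (2, 2, 4, 4),
    "Far": (8, 8, 9, 9),
}

adjacent_to_center = select_by_location(zones, zones["Center"], shares_boundary)
print(adjacent_to_center)
```

No attribute value is consulted anywhere in the selection; only the geometry of the two layers matters.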
00:14:45
we're not looking at the attribute table
00:14:47
but we're looking at the relationship
00:14:50
between the features and
00:14:53
ArcGIS provides a user-friendly
00:14:55
interface to help us to construct SQL
00:14:59
statement to specify selection
00:15:01
criteria and this dialog helps us to
00:15:05
define how to select features based on
00:15:08
the attribute or based on the spatial
00:15:13
relationship so here is select by
00:15:15
attributes and here is select by location
00:15:18
we already have worked with select by
00:15:21
attribute and in this lab we will also
00:15:23
work with select by location so in
00:15:26
summary for select by
00:15:28
attribute for vector data we apply the
00:15:29
criteria on each feature if the features
00:15:33
satisfy the criteria it will be selected
00:15:35
and select by attribute for raster we
00:15:38
apply the criteria on each cell each
00:15:40
pixel and if the feature satisfies the
00:15:43
criteria it will be 1 otherwise it will
00:15:46
be 0 and select by location has nothing
00:15:52
to do with the attributes it's only
00:15:54
applicable for vector data set it cannot
00:15:57
be used for the raster dataset because
00:16:00
it's based on the relationship or is
00:16:02
based on the topological relationship
00:16:04
and topological relationship can only be
00:16:06
defined for the vector data model not
00:16:10
for the raster data model another widely
00:16:13
used GIS operation is classification
00:16:16
classification is also known as
00:16:18
reclassification or recoding and this
00:16:21
operation will summarize the features
00:16:23
into several groups based on predefined
00:16:26
conditions so let's say we have an input
00:16:29
raster layer and we want to simply use
00:16:34
new values to replace all values based
00:16:37
on a predefined condition and here is a
00:16:39
lookup table and lookup table basically
00:16:43
defines how old values will be replaced
00:16:46
by new values all values between 1 to 3
00:16:50
will be replaced by 5 and all values
00:16:54
between 3 to 7 will be replaced by new
00:16:57
values 3 and so on so for example first
00:17:02
cell has a value of 3 which is the old
00:17:07
value then you look up look at the
00:17:09
lookup table the cell falls into this
00:17:12
category right so the new value should
00:17:15
be 5 okay in the output and output will
00:17:18
be saved as 5 here for this cell and if
00:17:22
you look at so the second one is also
00:17:25
the same 5 the third one is 19 and 19
00:17:28
falls into this category and the new
00:17:31
value for this value for this for 19
00:17:34
will be 5 again so we can continue this
00:17:38
process for every single cell one by one to
00:17:42
reclassify this image okay so this is
00:17:45
called classification or
00:17:47
reclassification process which is widely
00:17:49
used for summarizing and displaying
00:17:52
datasets so there are many examples or
00:17:56
applications for classification for
00:17:59
example we can classify zip codes into
00:18:02
low crime rate high crime rates so two
00:18:06
categories we can summarize cancer type
00:18:09
such as brain cancer into medium and
00:18:14
high and low rate of the cancer or we
00:18:19
can summarize habitat suitability for
00:18:23
red fox to very high high medium low
00:18:26
very low suitability so you can see
00:18:29
there are numerous applications for
00:18:31
classification so now let's talk about
00:18:35
classification for all attribute types we
00:18:39
have talked about different types of
00:18:40
attributes before which were nominal
00:18:43
ordinal interval and ratio so we are
00:18:46
start with nominal and ordinal
00:18:48
attributes nominal and ordinal
00:18:51
attributes are also referred to as
00:18:54
categorical variable or categorical
00:18:57
attributes for classification of
00:19:00
categorical attributes is very easy we
00:19:03
should use a lookup table for
00:19:04
classification like this and lookup
00:19:07
table basically defines how old values
00:19:10
will be replaced by new values for
00:19:12
example let's say we're going to recode
00:19:14
all the states in the west of main
00:19:18
branch of Mississippi River as one and
00:19:21
and all states in the east of the river as
00:19:25
zero so here is a classification table
00:19:28
all values are categorical values right
00:19:33
so these are just names and new values
00:19:37
are specified as either one or zero so
00:19:41
here is the classification result
00:19:43
okay so that's what categorical
00:19:46
attributes summarize or group the data
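A lookup table for categorical attributes maps naturally onto a Python dictionary. The sketch below recodes a handful of states as 1 (west of the Mississippi River) or 0 (east of it); the choice of six states and the helper name `reclassify` are assumptions made for illustration, and states that straddle the river are deliberately left out.

```python
# Lookup table: old categorical value -> new value.
lookup = {
    "Iowa": 1,
    "Missouri": 1,
    "Texas": 1,
    "Illinois": 0,
    "Tennessee": 0,
    "Ohio": 0,
}

def reclassify(values, table):
    """Replace every old value with the new value given by the lookup table."""
    return [table[value] for value in values]

state_names = ["Texas", "Ohio", "Iowa", "Illinois"]
recoded = reclassify(state_names, lookup)
print(recoded)
```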
00:19:48
now let's move on to the interval and
00:19:53
ratio attributes which
00:19:55
are also referred to as numeric
00:19:57
attributes which is more sophisticated
00:20:02
so for example let's say we have a study
00:20:06
area this is this is our study area this
00:20:08
is a community with more than 1,000
00:20:12
polygons and each polygon has a
00:20:15
population and population size ranges
00:20:18
from 0 to 5,133 and let's
00:20:25
say you want to reclassify our study
00:20:28
area into three categories based on
00:20:31
population size low medium and high
00:20:34
population size so three categories mean
00:20:37
we need to define or determine two
00:20:40
cutoff values okay a and B so below a is
00:20:45
assigned to low category between a and B
00:20:51
polygons will be assigned to the
00:20:53
medium population size and above B we
00:20:57
consider that consider this polygon as
00:20:59
high population size so question is how
00:21:03
to determine the cutoff values a and B
00:21:05
and the answer is we need some automatic
00:21:09
classification method to help us to
00:21:11
determine or define threshold values
00:21:14
there are a variety of ways for
00:21:17
classification and the first method is
00:21:19
called equal interval classification
00:21:22
which means the range of each category
00:21:23
is going to be the same or equally
00:21:26
split the range is from 0 to 5,133
00:21:30
which is the maximum population so you
00:21:34
can see the entire study area is divided
00:21:36
into three categories and from low with
00:21:40
white color medium with gray color and
00:21:44
high with black color and if you look at
00:21:46
the range of each category from minimum
00:21:49
to maximum ranges of each class or
00:21:51
category is equal that's called equal
00:21:54
interval classification maybe this
00:21:58
picture is not really clear so let's see
00:22:01
the next picture the next slide that
00:22:04
percentage of population under
00:22:07
five so has been classified into five
00:22:11
categories and when you look at the
00:22:14
ranges the ranges are between 3 to 6 6
00:22:17
to 9 9 to 12 12 to 15 15 to 18 so the
00:22:22
ranges are equal or they have equal
00:22:26
interval okay so that's the first type
00:22:30
of classification the second method that
00:22:34
help us to determine the cutoff value is
00:22:37
called quantile classification so in
00:22:40
this classification method the threshold
00:22:42
or cutoff values are set so that each
00:22:45
category is going to have the same
00:22:47
number of spatial features spatial
00:22:50
features the same number of features for
00:22:53
example in this color and the same
00:22:55
number of features in this color and so
00:22:58
forth and so on so this classification method
00:23:01
is suitable for linearly distributed
00:23:03
data so you can see the categories will
00:23:06
have however their range for each
00:23:08
categories are different within each
00:23:12
category the number of features is
00:23:14
almost the same or the polygons are
00:23:17
approximately the same that's called
00:23:19
equal number classification or quantile
00:23:22
classification so let's see an
00:23:25
application of quantile classification in
00:23:27
GIS for example we want to rank US
00:23:31
states based on their areas into five
00:23:33
categories so we use quantile
00:23:36
classification if you generate five
00:23:39
classes this means that ten states will
00:23:42
reside in each class each class the
00:23:46
first class will have the ten largest
00:23:49
states in terms of land mass and the
00:23:52
last class or last category will have
00:23:55
the ten smallest states in terms
00:23:57
of land mass so when you use the
00:24:00
quantile map classification with five
00:24:02
classes it will look like this map so it
00:24:06
is easy to see see that it is easy to
00:24:10
see that states like Texas or
00:24:12
California or New Mexico are in the top
00:24:17
ten and
00:24:18
East Coast states are obviously the
00:24:22
smallest so quantile classification is
00:24:26
ideal when there is an order in the data
00:24:28
it is suitable for ordinal data and
00:24:32
the third method that helps us to
00:24:35
determine the cutoff value is called
00:24:37
natural break classification and this
00:24:40
might be the most widely used
00:24:41
classification technique this
00:24:43
classification is based on the histogram
00:24:45
and looks for the obvious or largest
00:24:48
gaps in the data so let's say in the
00:24:51
same study area this is a histogram of
00:24:55
the same study area and x-axis indicates
00:25:00
population size from zero to maximum
00:25:03
which was 5133
00:25:05
and the y-axis tells us the frequency of
00:25:10
the occurrence of the population which
00:25:12
means number of polygons you have for
00:25:15
this population size then if you look at
00:25:18
histogram here we can identify the
00:25:20
number of natural gaps for example here
00:25:22
is a gap and also here is another
00:25:25
largest gap in the data so the natural
00:25:29
break classification uses mathematical
00:25:32
formula to help us to find the largest
00:25:34
gaps in histogram and then we use these
00:25:37
gaps as cutoff values so technically it
00:25:40
minimizes variance within each class and
00:25:44
maximizes variance between the classes
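The three automatic methods discussed so far differ only in how they pick the cutoff values, which a short Python sketch can make visible. Two caveats: the sample values are invented, and `largest_gap_breaks` is a deliberately simplified stand-in that cuts at the widest gaps in the sorted data; it is not the variance-minimizing natural breaks algorithm that GIS packages implement.

```python
from bisect import bisect_left

def equal_interval_breaks(values, k):
    """Cutoffs that split the range min..max into k classes of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]

def quantile_breaks(values, k):
    """Cutoffs chosen so each class holds about the same number of features."""
    ordered = sorted(values)
    n = len(ordered)
    # Each cutoff is the largest value that still belongs to the lower class.
    return [ordered[(n * i) // k - 1] for i in range(1, k)]

def largest_gap_breaks(values, k):
    """Simplified natural-breaks stand-in: cut in the middle of the k-1 widest gaps."""
    ordered = sorted(set(values))
    gaps = sorted(
        ((ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)),
        reverse=True,
    )[: k - 1]
    return sorted((ordered[i] + ordered[i + 1]) / 2 for _, i in gaps)

def classify(value, breaks):
    """Class index of a value; a value equal to a cutoff goes to the lower class."""
    return bisect_left(breaks, value)

# Invented sample with three visible clusters.
sample = [1, 2, 3, 10, 11, 12, 50, 51, 52]

for name, method in [
    ("equal interval", equal_interval_breaks),
    ("quantile", quantile_breaks),
    ("largest gaps", largest_gap_breaks),
]:
    breaks = method(sample, 3)
    print(name, breaks, [classify(v, breaks) for v in sample])
```

With this sample, equal interval leaves the middle class empty, while the other two methods put three values in each class, which is the kind of difference that changes how the resulting map looks.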
00:25:46
the last classification method is called
00:25:49
manual or defined interval
00:25:51
classification method so the choice of
00:25:54
cutoff values is up to you and you
00:25:58
define intervals or cutoff values so we
00:26:05
talked about four different type of
00:26:07
classification method for numeric
00:26:09
attributes manual defined equal interval
00:26:14
quantile and natural break so you can
00:26:17
see for the same data you got different
00:26:19
visual results or different pattern
00:26:22
that's why people say maps can be
00:26:25
misleading if I change cutoff value for
00:26:28
each class I will get different result
00:26:30
if I change the class
00:26:31
vacation technique I will get different
00:26:33
results so the question is which one
00:26:36
gives us the best display of data and
00:26:39
here are some criteria or guidelines for
00:26:43
you to select the best classification
00:26:45
method so you always have to look at the
00:26:47
data when when do we use natural break
00:26:51
so we use natural breaks when the
00:26:53
attributes like I don't know population
00:26:56
size are distributed unevenly across
00:26:59
overall range of the data so then when
00:27:03
you look at the histogram of the data
00:27:04
you will see some peaks the distribution
00:27:07
will be like this and in that case
00:27:11
natural breaks might be the best choice
00:27:14
so you choose numbers that best reflects
00:27:18
natural gaps within your data and how
00:27:22
about equal interval equal interval
00:27:24
classification is suitable when you want
00:27:28
to have all classes with the same range
00:27:31
for example if you want to display
00:27:32
categories for every 1000 increment and
00:27:35
then interval is 1000 for every category
00:27:38
in that case equal interval is
00:27:41
suitable the quantile classification
00:27:46
basically produces the same number of
00:27:48
the geographic feature for every
00:27:50
category and the best time you can use
00:27:52
this classification is when your
00:27:55
attributes are linearly distributed
00:27:57
across the range which means that if you
00:27:59
draw out a histogram there is no peak or
00:28:04
basically it's flat for histogram or
00:28:06
probably there is a very it has a very
00:28:08
gentle slope and the manual
00:28:11
classification and the manual defined
00:28:13
classification can be used when you want
00:28:15
your classes to break at a specific
00:28:17
values so that's a guideline for
00:28:20
classification method so let's open up
00:28:24
ArcGIS to show you classification
00:28:27
techniques first of all we need to
00:28:30
download US state boundaries from TIGER/Line
00:28:34
so if I here type tiger line
00:28:39
and then go to the tiger line website
00:28:44
the most recent data web interface and
00:28:54
states and equivalent submit download
00:29:00
national file and then let's extract it
00:29:21
and loaded into ArcGIS
00:29:39
we're going to classify this shapefile
00:29:44
based on the land area so if you open up
00:29:48
attribute table of their states there is
00:29:52
a field with name of ALAND I think this
00:29:58
area land and I'm going to classify it
00:30:02
based on the symbology symbology and it's
00:30:10
quantities right because it's a numeric
00:30:13
value and then graduated colors and based
00:30:19
on ALAND so here the classification
00:30:25
technique is natural breaks right and
00:30:27
here we have five number of the classes
00:30:30
if I click apply and then okay so I can
00:30:34
see that so these areas you have higher
00:30:36
so remember this picture okay so if I
00:30:40
here instead of natural break with five
00:30:42
classes I use I don't know click on
00:30:46
classify and then I select equal
00:30:49
interval okay and then okay apply okay
00:30:55
so you can see that changed the shape
00:30:58
completely changed right so depending on
00:31:01
the classification technique we have
00:31:03
different cutoff values and then we have
00:31:07
different Maps right so or we can change
00:31:11
number of the classes instead of five
00:31:13
classes we can change it to three
00:31:16
classes and then we get different
00:31:17
results right or here we have another
00:31:21
type of classification you can manually
00:31:23
define the break values here okay you
00:31:26
can change them or you can use equal
00:31:30
interval defined interval natural break
00:31:33
or other type of classification
00:31:35
techniques and each one gives us
00:31:38
different maps so so you can create
00:31:42
numerous maps with the same data that's
00:31:44
why in elections or advertisement they
00:31:48
use maps to deceive people the data are
00:31:52
the same
00:31:52
and true but classification techniques
00:31:55
are different