Introduction to Spatial Analysis

00:31:57
https://www.youtube.com/watch?v=yZzJWXb8pWs

Summary

TLDR: The video introduces spatial analysis in GIS: applying GIS operations to geographic data to solve real-world problems. It first reviews earlier material: the vector and raster data models, how geographic features and locations are represented, and how data is collected and managed. With that foundation, spatial analysis combines existing data, such as census and land use layers, and turns it into actionable knowledge. Example applications include finding suitable locations for infrastructure and identifying areas at risk of a health crisis. The session covers two basic operations, selection and classification. Selection picks a subset of data by attribute values or by spatial relationship, and classification groups data into categories based on predefined conditions. It also describes the three categories of GIS operations (local, neighborhood, and global) and how each draws on a different spatial extent of the input. Finally, it explains methods for classifying numeric data, such as equal interval, quantile, and natural breaks, and shows how the same data can produce different maps under each method.
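
As a rough illustration of selection by attribute on vector data, the sketch below filters a small list of county records with comparison and logical operators. The county names, attribute fields, and values are hypothetical, and `select_by_attribute` is an illustrative helper, not an ArcGIS function.

```python
# Hypothetical county records; names, fields, and values are made up for illustration.
counties = [
    {"name": "County A", "state": "Vermont", "area_sq_mi": 720, "density": 95},
    {"name": "County B", "state": "New York", "area_sq_mi": 1310, "density": 410},
    {"name": "County C", "state": "New York", "area_sq_mi": 640, "density": 180},
    {"name": "County D", "state": "Maine", "area_sq_mi": 2500, "density": 40},
]


def select_by_attribute(features, predicate):
    """Apply the query to each feature and keep those that satisfy it."""
    return [f for f in features if predicate(f)]


# Equality, negation (NOT), and a compound query (AND) over attribute values.
in_vermont = select_by_attribute(counties, lambda f: f["state"] == "Vermont")
not_new_york = select_by_attribute(counties, lambda f: not f["state"] == "New York")
large_and_sparse = select_by_attribute(
    counties, lambda f: f["area_sq_mi"] >= 1000 and f["density"] < 250
)

print([f["name"] for f in in_vermont])        # ['County A']
print([f["name"] for f in not_new_york])      # ['County A', 'County D']
print([f["name"] for f in large_and_sparse])  # ['County D']
```

Each query only reads the attribute table; no geometry is involved, which is what separates selection by attribute from selection by location.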

Key takeaways

  • 🔍 Introduction to spatial analysis in GIS
  • 🗺 Understanding GIS data models and coordinate systems
  • 📊 Importance of turning geographic data into actionable knowledge
  • 🌍 Examples include locating new hospitals and managing disease outbreaks
  • 🔗 Concept of chaining GIS operations to solve complex problems
  • 📍 Types of analysis: local, neighborhood, global
  • 📌 Selection by attribute/location in spatial analysis
  • 🔄 Classification into categories based on conditions
  • 📈 Various classification methods: equal interval, quantile, natural breaks
  • 🛠 GIS operations as building blocks for spatial problem-solving
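
The difference between local, neighborhood, and global operations can be sketched on a toy layer. The region names, populations, areas, and neighbor lists below are hypothetical, and the three functions are illustrative stand-ins for the density, adjacency, and ranking examples given in the lecture.

```python
# Hypothetical toy layer: names, figures, and neighbor lists are made up.
regions = {
    "A": {"population": 1200, "area": 40.0, "neighbors": ["B", "C"]},
    "B": {"population": 300, "area": 10.0, "neighbors": ["A"]},
    "C": {"population": 4500, "area": 90.0, "neighbors": ["A"]},
}


def population_density(layer, name):
    """Local operation: uses only the attributes of the region itself."""
    region = layer[name]
    return region["population"] / region["area"]


def adjacent_count(layer, name):
    """Neighborhood operation: depends on the regions sharing a boundary."""
    return len(layer[name]["neighbors"])


def population_rank(layer, name):
    """Global operation: needs every region in the layer to rank one of them."""
    ordered = sorted(layer, key=lambda n: layer[n]["population"], reverse=True)
    return ordered.index(name) + 1


print(population_density(regions, "A"))  # 30.0
print(adjacent_count(regions, "A"))      # 2
print(population_rank(regions, "A"))     # 2
```

Changing the population of region C would leave the density and neighbor count of A untouched but could change its rank, which is the practical meaning of "global".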

Timeline

  • 00:00:00 - 00:05:00

    Introduction to spatial analysis in GIS, reviewing previous topics like GIS data models and data preparation. Moving forward to applying spatial analysis on GIS data to solve real-world problems, such as determining the best location for a hospital.

  • 00:05:00 - 00:10:00

    Explaining the spatial analysis technique as the application of GIS operations to solve real-world problems. Introducing the concept of chaining GIS operations and how they form a GIS model to perform complicated tasks.

  • 00:10:00 - 00:15:00

    Discussing the three categories of GIS operations: local, neighborhood, and global. Each type is explained with examples, like calculating population density or adjacency between states.

  • 00:15:00 - 00:20:00

    Introduction to selection operations, focusing on selection by attribute and selection by location. Selection criteria are either attribute values or spatial relationships; selection by location is presented as applicable only to the vector data model.

  • 00:20:00 - 00:25:00

    Further detail on using selection by attribute in raster data, and how GIS processes these queries to produce binary grids. Introduction to classification as a GIS operation to summarize features.

  • 00:25:00 - 00:31:57

    Different methods of classification, including equal interval, quantile, natural breaks, and manual classification. Discussion of when each method applies and how map visualizations vary with the method chosen.
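
A minimal sketch of how equal interval and quantile classification pick their cutoff values, assuming a small made-up list of population sizes. The function names and the convention that a value equal to a cutoff falls in the lower class are assumptions for illustration, not the rules of any particular GIS package.

```python
def equal_interval_breaks(values, n_classes):
    """Cutoffs that split the value range into classes of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]


def quantile_breaks(values, n_classes):
    """Cutoffs chosen so each class holds about the same number of features."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // n_classes - 1] for i in range(1, n_classes)]


def classify(value, breaks):
    """Class index of a value: the number of cutoffs it exceeds."""
    return sum(value > b for b in breaks)


def class_counts(values, breaks):
    """How many features land in each class."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[classify(v, breaks)] += 1
    return counts


# Hypothetical population sizes for nine polygons (skewed on purpose).
populations = [0, 50, 100, 150, 200, 250, 300, 1500, 3000]

ei = equal_interval_breaks(populations, 3)
qt = quantile_breaks(populations, 3)
print(ei, class_counts(populations, ei))  # [1000.0, 2000.0] [7, 1, 1]
print(qt, class_counts(populations, qt))  # [100, 250] [3, 3, 3]
```

On skewed data the equal interval classes have equal ranges but very unequal counts, while the quantile classes have equal counts but unequal ranges, which is why the same data can yield visibly different maps.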

Frequently asked questions

  • What is spatial analysis in GIS?

    Spatial analysis is the application of GIS operations to solve real-world problems by analyzing geographic data.

  • What are basic operations in spatial analysis?

    Basic operations include selection and classification, which involve choosing subsets of data and grouping data based on criteria.

  • How is GIS data represented?

    GIS data is represented using vector and raster data models with geographic and projected coordinate systems.

  • What types of data are used in spatial analysis?

    Data types include census data, land use, land cover, road maps, and digital elevation models among others.

  • What is a GIS model?

    A GIS model is a chain of GIS operations where the output of one operation serves as the input for the next to solve complex problems.

  • What are the types of GIS operations?

    GIS operations are categorized into local, neighborhood, and global operations based on the spatial extent of data they analyze.

  • How is selection by attribute performed?

    It involves selecting data subsets by querying attributes using operators like AND, OR, greater than, or equal to.

  • What is the purpose of classification in GIS?

    Classification assigns new values to existing data based on conditions, simplifying and organizing the data into categories.

  • How do classification methods differ?

    Methods like equal interval, quantile, and natural breaks differ in dividing data into categories, affecting visual results.

  • Why are there different classification techniques?

    Different techniques highlight various data patterns, useful for making interpretations or decisions based on specific needs.
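
The raster answers above can be tied together in a short sketch: a selection by attribute that produces a binary grid, followed by a reclassification through a lookup table, chained so that the output of the first is the input of the second. The land use codes follow the lecture (1 = agriculture, 2 = forest, 3 = urbanized); the grid layout, the function names, and the text labels are hypothetical.

```python
import operator

# Land use codes follow the lecture; the 3x3 grid layout itself is hypothetical.
land_use = [
    [1, 1, 2],
    [2, 3, 3],
    [1, 3, 2],
]


def select_by_attribute(grid, compare, value):
    """Test every cell against the criterion; 1 means satisfied, 0 means not."""
    return [[1 if compare(cell, value) else 0 for cell in row] for row in grid]


def reclassify(grid, lookup):
    """Replace each old cell value with the new value from a lookup table."""
    return [[lookup[cell] for cell in row] for row in grid]


urban = select_by_attribute(land_use, operator.eq, 3)     # land use type = 3
not_agri = select_by_attribute(land_use, operator.gt, 1)  # land use type > 1

# Chaining: the output layer of the selection is the input of the reclassification.
labeled = reclassify(urban, {0: "other", 1: "urban"})

print(urban)       # [[0, 0, 0], [0, 1, 1], [0, 1, 0]]
print(not_agri)    # [[0, 0, 1], [1, 1, 1], [0, 1, 1]]
print(labeled[1])  # ['other', 'urban', 'urban']
```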

Subtitles (en)
  • 00:00:00
    all right today we're going to become
  • 00:00:01
    familiar with spatial analysis
  • 00:00:03
    which is one of the fanciest topics in
  • 00:00:05
    GIS first we will learn what is the
  • 00:00:08
    spatial analysis and then some basic
  • 00:00:10
    operations in spatial analysis
  • 00:00:13
    including selection and classification
  • 00:00:17
    so far we have talked about the
  • 00:00:21
    definition of GIS GIS data models vector
  • 00:00:25
    data model and raster data model how to
  • 00:00:28
    represent geographic features and also
  • 00:00:32
    locations with coordinate geographic
  • 00:00:35
    coordinate system and projected
  • 00:00:37
    coordinate system how to manage
  • 00:00:39
    attributes into database and how to
  • 00:00:43
    produce GIS data from scratch
  • 00:00:46
    collecting GPS waypoints as control
  • 00:00:49
    points then georeferencing scanned maps and
  • 00:00:52
    finding and digitizing features so all
  • 00:00:56
    of them all of these topics that we
  • 00:00:59
    covered were about data collection
  • 00:01:02
    and data preparation data production
  • 00:01:05
    from scratch and data representation as
  • 00:01:09
    map now we are ready to move to the next
  • 00:01:11
    step which is spatial analysis we have
  • 00:01:15
    the GIS data like census data land use
  • 00:01:21
    data land cover road maps
  • 00:01:25
    topography data such as digital
  • 00:01:27
    elevation model disease data such as
  • 00:01:31
    disease count prevalence incidence and
  • 00:01:33
    the next step is to combine these data
  • 00:01:36
    and turn the data into knowledge to
  • 00:01:39
    solve real world problem for example
  • 00:01:42
    where is the best place to build a new
  • 00:01:45
    hospital based on several criteria
  • 00:01:47
    factors like slope proximity to high
  • 00:01:50
    population proximity to road should be
  • 00:01:53
    combined or where are the high risk
  • 00:01:55
    areas for disease outbreak we can
  • 00:01:58
    allocate resources budget to these areas
  • 00:02:00
    to control the outbreak which route is
  • 00:02:03
    the shortest route from your house to
  • 00:02:05
    the University so these are knowledge
  • 00:02:07
    that we can extract from the data so
  • 00:02:11
    what is the spatial analysis technique
  • 00:02:13
    the spatial analysis is the application
  • 00:02:16
    of one or more GIS operation in order to
  • 00:02:19
    solve real-world problem such as what is
  • 00:02:22
    the shortest path between your house and
  • 00:02:24
    department by car how many people are
  • 00:02:27
    living within one kilometer around the
  • 00:02:29
    hazardous site for risk assessment or
  • 00:02:33
    health disparity which areas have lowest
  • 00:02:36
    access to healthcare or another
  • 00:02:38
    real-world application in public health
  • 00:02:40
    according to Virginia Department of
  • 00:02:42
    Health Virginia coupled GIS mapping
  • 00:02:46
    and spatial analysis to identify areas
  • 00:02:49
    where infant mortality rates are the
  • 00:02:52
    highest the extent of racial and ethnic
  • 00:02:55
    disparities in infant deaths the
  • 00:02:57
    underlying causes of those infant deaths
  • 00:02:59
    and how to best intervene so these are
  • 00:03:02
    knowledge you can extract from the
  • 00:03:04
    prepared data to support your
  • 00:03:06
    decision-making to make decision based
  • 00:03:08
    on the fact technically a spatial
  • 00:03:13
    analysis is a chain of GIS operations a
  • 00:03:16
    chain basically means the output from
  • 00:03:18
    one GIS operation serves as the
  • 00:03:20
    input of the second GIS operation and so
  • 00:03:23
    on so operations are linked together to
  • 00:03:26
    solve the real-world problems so here
  • 00:03:30
    for example we have
  • 00:03:33
    an input layer a spatial analysis is
  • 00:03:36
    done on the input layer so we will have
  • 00:03:38
    an output layer the output layer will be
  • 00:03:40
    the input for the second spatial
  • 00:03:42
    operation and gives us the output layer
  • 00:03:44
    and this output layer serves as input
  • 00:03:47
    for the spatial operation and gives us a
  • 00:03:50
    final output layer so you can see the
  • 00:03:52
    operations are linked together to solve
  • 00:03:54
    the real-world problem and by chaining
  • 00:03:59
    GIS operation you will have a
  • 00:04:01
    complicated GIS flowchart like this and
  • 00:04:04
    a couple of input layers with many
  • 00:04:07
    operations linked together it is called
  • 00:04:10
    a GIS model in the future lecture we
  • 00:04:13
    will build sophisticated GIS model in
  • 00:04:16
    ArcGIS and solve complicated real world
  • 00:04:19
    problems that are not easy to solve
  • 00:04:22
    without GIS
  • 00:04:24
    so basically there are three categories
  • 00:04:27
    of GIS operations we will talk about
  • 00:04:30
    a little today we just scratch the
  • 00:04:33
    surface and with more details in the
  • 00:04:36
    future lectures so GIS operations can be
  • 00:04:39
    divided into three categories local
  • 00:04:41
    operations neighborhood operations and
  • 00:04:43
    global operations local operations
  • 00:04:46
    basically mean the output is determined
  • 00:04:50
    based on the input in the same location
  • 00:04:53
    so for input and output are for one
  • 00:04:56
    location for instance this gray polygon
  • 00:05:02
    is the state of Utah we want to
  • 00:05:04
    calculate population density of this
  • 00:05:07
    state and the output here only depends
  • 00:05:11
    on the population size and also the area
  • 00:05:14
    of that state so we only need
  • 00:05:16
    information about Utah State to
  • 00:05:18
    calculate population density of the same
  • 00:05:21
    state so that's called local operation
  • 00:05:24
    the second type is called neighborhood
  • 00:05:26
    operation and this operation uses data
  • 00:05:29
    from both input location plus the nearby
  • 00:05:32
    or neighborhood locations to determine
  • 00:05:36
    the output value so that's why it's
  • 00:05:38
    called neighborhood operation for
  • 00:05:40
    example let's say let's say you want to
  • 00:05:45
    find out the number of adjacent states to
  • 00:05:48
    the Utah State okay or in other words
  • 00:05:51
    how many states share a boundary with Utah
  • 00:05:54
    in this case we not only consider the
  • 00:05:58
    location of Utah State but also the
  • 00:06:00
    surrounding states as well to calculate
  • 00:06:02
    how many adjacent states to Utah
  • 00:06:04
    State which is one two three four five
  • 00:06:08
    six and the output will be six right so
  • 00:06:13
    that is called neighborhood operation
  • 00:06:15
    and neighborhood operation involves
  • 00:06:17
    neighbors of the study region for
  • 00:06:20
    calculation for example for California
  • 00:06:22
    it shares a border with one two three so
  • 00:06:26
    the output will be three and depends on
  • 00:06:29
    the neighbors third type is global
  • 00:06:33
    operation which means this operation
  • 00:06:35
    uses the
  • 00:06:36
    value from the entire input layer to
  • 00:06:39
    determine each output value for example
  • 00:06:42
    here here is let's say we want to rank
  • 00:06:47
    Western states by their total
  • 00:06:51
    population for the year 1990 in this
  • 00:06:54
    case we not only consider the population
  • 00:06:56
    size of for example Utah but also the
  • 00:07:00
    population size for all of the Western
  • 00:07:04
    states all of the study area and then
  • 00:07:06
    we can rank them in order and find out
  • 00:07:09
    Utah actually has a rank order of six
  • 00:07:12
    okay and California has the largest
  • 00:07:15
    population ranked number one among the
  • 00:07:18
    other states the second one is Washington
  • 00:07:21
    and so on so global operations involve the
  • 00:07:25
    entire study area for calculation GIS
  • 00:07:35
    operations are building blocks for
  • 00:07:38
    complicated spatial analysis or spatial
  • 00:07:41
    analysis consists of several GIS
  • 00:07:43
    operations which aim to address
  • 00:07:45
    real-world issue basically there are
  • 00:07:47
    four types of GIS operations selection
  • 00:07:50
    classification and proximity and overlay
  • 00:07:53
    which are applicable for both vector and
  • 00:07:56
    raster data set we're going to talk
  • 00:07:58
    about the operations one by one and in
  • 00:08:01
    today's class we will talk only about
  • 00:08:04
    selection and classification and
  • 00:08:07
    proximity and overlays for the future
  • 00:08:10
    lectures let's start with the selection
  • 00:08:15
    so selection is a GIS operation that
  • 00:08:19
    creates a subset of data that satisfies
  • 00:08:22
    certain criteria so there are two types
  • 00:08:25
    of selection selection by attribute and
  • 00:08:27
    selection by location selection by
  • 00:08:30
    attribute that we already have worked
  • 00:08:32
    with we use that to select a specific
  • 00:08:34
    type of cancer based on location and it
  • 00:08:38
    selected specific records or feature
  • 00:08:40
    based on attribute values but the
  • 00:08:44
    selection by location selects record
  • 00:08:46
    based on a spatial relationship like
  • 00:08:49
    composition intersection
  • 00:08:52
    adjacency or in general a topological
  • 00:08:56
    relationship that we talked about before
  • 00:08:59
    so selection by location is only
  • 00:09:02
    applicable for vector data model because
  • 00:09:05
    raster data do not have any
  • 00:09:08
    spatial topological information so let's
  • 00:09:13
    start with the selection by attribute
  • 00:09:15
    for vector data there are a number of
  • 00:09:18
    operators you can use like AND NOT OR
  • 00:09:21
    greater than less than equal to and so
  • 00:09:24
    on to write your statement or query so
  • 00:09:28
    we can use these operators to build a
  • 00:09:30
    query a statement to specify
  • 00:09:33
    the selection criteria so we've worked with
  • 00:09:36
    that in the previous labs so basically
  • 00:09:39
    in select by attribute GIS applies the
  • 00:09:42
    query on each feature and then compares
  • 00:09:44
    the attribute values of each feature with the
  • 00:09:47
    criteria the feature or records that
  • 00:09:49
    meet the criteria will be selected for
  • 00:09:52
    example here we want to select all
  • 00:09:54
    counties with attribute name of the
  • 00:09:57
    state of Vermont so it is based on the
  • 00:10:00
    attribute table or we want to select all
  • 00:10:03
    counties that their attribute name is
  • 00:10:05
    not New York so this operator means
  • 00:10:10
    NOT or we want to select all
  • 00:10:13
    counties with areas larger than or equal
  • 00:10:16
    to 1,000 square miles so you can see
  • 00:10:20
    that so it's based on the area based
  • 00:10:22
    based on the attribute table there is a
  • 00:10:24
    field that the name of area so based on
  • 00:10:26
    that one we can select all counties or
  • 00:10:29
    select the counties that population
  • 00:10:31
    density is less than 250 persons per
  • 00:10:34
    square mile
  • 00:10:35
    so these are the selected counties
  • 00:10:37
    selection by attribute can also be
  • 00:10:40
    applied to raster data so let's say we
  • 00:10:43
    have a very simple raster dataset which
  • 00:10:46
    describes land use type for a study area
  • 00:10:49
    one represents agriculture two
  • 00:10:51
    represents forests and three represents
  • 00:10:54
    urbanized area let's say we want we only
  • 00:10:57
    want to tease out the urbanized areas
  • 00:11:00
    okay we can define the SQL statement
  • 00:11:03
    like land use type equals three and
  • 00:11:06
    then GIS will read this criteria and
  • 00:11:09
    then compare it with every single
  • 00:11:12
    pixel one by one so let's start with the
  • 00:11:15
    first cell the first cell is one and one
  • 00:11:19
    doesn't satisfy the criteria because
  • 00:11:22
    land use type is not equal to three so
  • 00:11:24
    the output will be 0 which means false
  • 00:11:28
    and then we go to the next cell the next
  • 00:11:32
    cell again is 1 and doesn't satisfy so 0
  • 00:11:35
    the third one is 2 and
  • 00:11:38
    doesn't satisfy it's not 3 so it's 0 so
  • 00:11:42
    until we reach this cell this cell is 3
  • 00:11:47
    and 3
  • 00:11:48
    land use type is urbanized area so it
  • 00:11:50
    satisfies the criteria so the output
  • 00:11:53
    will be 1 so 1 means that criteria is
  • 00:11:57
    satisfied and so that's
  • 00:12:01
    how GIS can scan and do
  • 00:12:04
    the select by attribute for the raster
  • 00:12:06
    dataset okay so keep in mind that the
  • 00:12:09
    output of selection by attribute for
  • 00:12:12
    raster dataset is always a binary grid
  • 00:12:14
    it's either 0 or 1
  • 00:12:18
    okay or how about this one land use type
  • 00:12:21
    greater than 1 so it compares each cell
  • 00:12:24
    to the criteria and if it satisfies this
  • 00:12:27
    criteria the result will be 1
  • 00:12:29
    otherwise it will be 0 so again the
  • 00:12:32
    output is only binary raster so let's
  • 00:12:38
    move on to the second type of selection
  • 00:12:40
    which is selection by location so here
  • 00:12:43
    our selection criteria is not defined
  • 00:12:45
    based on attribute information it's
  • 00:12:48
    defined based on a spatial relationship
  • 00:12:50
    or topological information spatial
  • 00:12:53
    relationship between two features we
  • 00:12:56
    have talked about before include the
  • 00:12:57
    intersection of two lines or adjacency
  • 00:13:01
    between two polygons or composition of
  • 00:13:03
    polygons or lines or containment so
  • 00:13:06
    these are spatial relationship and
  • 00:13:09
    selection by location usually involves
  • 00:13:12
    two or
  • 00:13:14
    more layers as the input so we need at
  • 00:13:17
    least two layers for the selection by
  • 00:13:19
    location for the vector data let's see
  • 00:13:23
    one example of selection by location
  • 00:13:25
    based on adjacency selection so we want
  • 00:13:29
    to know which states are adjacent to
  • 00:13:31
    Missouri and so we have a shapefile of
  • 00:13:35
    Missouri and a shapefile of all of the
  • 00:13:37
    US states now we're not looking at
  • 00:13:40
    attributes of Missouri state but we
  • 00:13:43
    look at the spatial relationship between
  • 00:13:46
    Missouri state and the surrounding
  • 00:13:48
    states and then the surrounding polygon
  • 00:13:50
    can be teased out like this so
  • 00:13:54
    containment selection is another
  • 00:13:56
    topological relationship for instance
  • 00:13:59
    which states contain the Mississippi
  • 00:14:02
    River so this is the Mississippi River
  • 00:14:06
    okay and and we have it as a shapefile
  • 00:14:12
    and also we have US states as
  • 00:14:15
    another shapefile and all states that
  • 00:14:17
    contain part of Mississippi River the
  • 00:14:21
    entire state will be selected out so
  • 00:14:23
    these gray states okay all of these
  • 00:14:26
    gray states contain part of
  • 00:14:29
    Mississippi River
  • 00:14:30
    so in selection by location we're
  • 00:14:32
    looking at the spatial relationship
  • 00:14:34
    between the features for example
  • 00:14:36
    adjacency or containment sometimes it's
  • 00:14:41
    intersection sometimes it's composition
  • 00:14:43
    so this is called selection by location
  • 00:14:45
    we're not looking at the attribute table
  • 00:14:47
    but we're looking at the relationship
  • 00:14:50
    between the features and
  • 00:14:53
    ArcGIS provides a user-friendly
  • 00:14:55
    interface to help us to construct SQL
  • 00:14:59
    statement to specify selection
  • 00:15:01
    criteria and this dialog helps us to
  • 00:15:05
    define how to select features based on
  • 00:15:08
    the attribute or based on the spatial
  • 00:15:13
    relationship so here is select by
  • 00:15:15
    attributes and here is select by location
  • 00:15:18
    we already have worked with select by
  • 00:15:21
    attribute and in this lab we will also
  • 00:15:23
    work with select by location so in
  • 00:15:26
    summary for select by
  • 00:15:28
    attribute for vector data we apply the
  • 00:15:29
    criteria on each feature if the features
  • 00:15:33
    satisfy the criteria it will be selected
  • 00:15:35
    and select by attribute for raster we
  • 00:15:38
    apply the criteria on each cell each
  • 00:15:40
    pixel and if the cell satisfies the
  • 00:15:43
    criteria it will be 1 otherwise it will
  • 00:15:46
    be 0 and select by location has nothing
  • 00:15:52
    to do with the attributes it's only
  • 00:15:54
    applicable for vector data set it cannot
  • 00:15:57
    be used for the raster dataset because
  • 00:16:00
    it's based on the relationship or is
  • 00:16:02
    based on the topological relationship
  • 00:16:04
    and topological relationship can only be
  • 00:16:06
    defined for the vector data model not
  • 00:16:10
    for the raster data model another widely
  • 00:16:13
    used GIS operation is classification
  • 00:16:16
    classification is also known as
  • 00:16:18
    reclassification or recoding and this
  • 00:16:21
    operation will summarize the features
  • 00:16:23
    into several groups based on predefined
  • 00:16:26
    conditions so let's say we have an input
  • 00:16:29
    raster layer and we want to simply use
  • 00:16:34
    new values to replace all values based
  • 00:16:37
    on a predefined condition and here is a
  • 00:16:39
    lookup table and lookup table basically
  • 00:16:43
    defines how old values will be replaced
  • 00:16:46
    by new values all values between 1 to 3
  • 00:16:50
    will be replaced by 5 and all values
  • 00:16:54
    between 3 to 7 will be replaced by new
  • 00:16:57
    value 3 and so on so for example the first
  • 00:17:02
    cell has a value of 3 which is the old
  • 00:17:07
    value then you look at the
  • 00:17:09
    lookup table the cell falls into this
  • 00:17:12
    category right so the new value should
  • 00:17:15
    be 5 okay in the output and output will
  • 00:17:18
    be saved as 5 here for this cell and if
  • 00:17:22
    you look at so the second one is also
  • 00:17:25
    the same 5 the third one is 19 and 19
  • 00:17:28
    falls into this category and the new
  • 00:17:31
    value for 19
  • 00:17:34
    will be 5 again so we can continue this
  • 00:17:38
    process for every single cell one by one to
  • 00:17:42
    reclassify this image okay so this is
  • 00:17:45
    called classification or
  • 00:17:47
    reclassification process which is widely
  • 00:17:49
    used for summarizing and displaying
  • 00:17:52
    datasets so there are many examples or
  • 00:17:56
    applications for classification for
  • 00:17:59
    example we can classify zip codes into
  • 00:18:02
    low crime rate and high crime rate so two
  • 00:18:06
    categories we can summarize cancer type
  • 00:18:09
    such as brain cancer into medium and
  • 00:18:14
    high and low rate of the cancer or we
  • 00:18:19
    can summarize habitat suitability for
  • 00:18:23
    red fox to very high high medium low
  • 00:18:26
    very low suitability so you can see
  • 00:18:29
    there are numerous applications for
  • 00:18:31
    classification so now let's talk about
  • 00:18:35
    classification for all attribute types we
  • 00:18:39
    have talked about different types of
  • 00:18:40
    attributes before which were nominal
  • 00:18:43
    ordinal interval and ratio so we'll
  • 00:18:46
    start with nominal and ordinal
  • 00:18:48
    attributes nominal and ordinal
  • 00:18:51
    attributes are also referred to as
  • 00:18:54
    categorical variable or categorical
  • 00:18:57
    attributes classification of
  • 00:19:00
    categorical attributes is very easy we
  • 00:19:03
    should use a lookup table for
  • 00:19:04
    classification like this and lookup
  • 00:19:07
    table basically defines how old values
  • 00:19:10
    will be replaced by new values for
  • 00:19:12
    example let's say we're going to recode
  • 00:19:14
    all the states to the west of the main
  • 00:19:18
    branch of the Mississippi River as one and
  • 00:19:21
    all states to the east of the river as
  • 00:19:25
    zero so here is a classification table
  • 00:19:28
    old values are categorical values right
  • 00:19:33
    so these are just names and new values
  • 00:19:37
    are specified as either one or zero so
  • 00:19:41
    here is the classification result
  • 00:19:43
    okay so that's what categorical
  • 00:19:46
    attributes summarize or group the data
  • 00:19:48
    now let's move on to the interval and
  • 00:19:53
    ratio attributes which
  • 00:19:55
    are also referred to as numeric
  • 00:19:57
    attributes which is more sophisticated
  • 00:20:02
    so for example let's say we have a study
  • 00:20:06
    area this is this is our study area this
  • 00:20:08
    is a community with more than 1,000
  • 00:20:12
    polygons and each polygon has a
  • 00:20:15
    population and population size ranges
  • 00:20:18
    from 0 to 5,133 and let's
  • 00:20:25
    say you want to reclassify our study
  • 00:20:28
    area into three categories based on
  • 00:20:31
    population size low medium and high
  • 00:20:34
    population size so three categories mean
  • 00:20:37
    we need to define or determine two
  • 00:20:40
    cutoff values okay a and B so below a is
  • 00:20:45
    assigned to low category between a and B
  • 00:20:51
    polygons will be assigned to the
  • 00:20:53
    medium population size and above B we
  • 00:20:57
    consider this polygon as
  • 00:20:59
    high population size so question is how
  • 00:21:03
    to determine the cutoff values a and B
  • 00:21:05
    and the answer is we need some automatic
  • 00:21:09
    classification method to help us to
  • 00:21:11
    determine or define threshold values
  • 00:21:14
    there are a variety of ways for
  • 00:21:17
    classification and the first method is
  • 00:21:19
    called equal interval classification
  • 00:21:22
    which means the range of each category
  • 00:21:23
    is going to be the same or equally
  • 00:21:26
    divided the range is from 0 to 5,133
  • 00:21:30
    which is the maximum population so you
  • 00:21:34
    can see the entire study area is divided
  • 00:21:36
    into three categories and from low with
  • 00:21:40
    white color medium with gray color and
  • 00:21:44
    high with black color and if you look at
  • 00:21:46
    the range of each category from minimum
  • 00:21:49
    to maximum ranges of each class or
  • 00:21:51
    category is equal that's called equal
  • 00:21:54
    interval classification maybe this
  • 00:21:58
    picture is not really clear so let's see
  • 00:22:01
    the next picture the next slide that
  • 00:22:04
    percentage of population under
  • 00:22:07
    five has been classified into five
  • 00:22:11
    categories and when you look at the
  • 00:22:14
    ranges the ranges are between 3 to 6 6
  • 00:22:17
    to 9 9 to 12 12 to 15 15 to 18 so the
  • 00:22:22
    ranges are equal or they have equal
  • 00:22:26
    interval okay so that's the first type
  • 00:22:30
    of classification the second method that
  • 00:22:34
    helps us to determine the cutoff value is
  • 00:22:37
    called quantile classification so in
  • 00:22:40
    this classification method the threshold
  • 00:22:42
    or cutoff values are set so that each
  • 00:22:45
    category is going to have the same
  • 00:22:47
    number of spatial features spatial
  • 00:22:50
    features the same number of features for
  • 00:22:53
    example in this color and the same
  • 00:22:55
    number of features in this color and so
  • 00:22:58
    forth and so on so this classification method
  • 00:23:01
    is suitable for linearly distributed
  • 00:23:03
    data so you can see
  • 00:23:06
    however the range for each
  • 00:23:08
    category is different within each
  • 00:23:12
    category the number of features is
  • 00:23:14
    almost the same or the polygons are
  • 00:23:17
    approximately the same that's called
  • 00:23:19
    equal number classification or quantile
  • 00:23:22
    classification so let's see an
  • 00:23:25
    application of quantile classification in
  • 00:23:27
    GIS for example we want to rank US
  • 00:23:31
    states based on their areas into five
  • 00:23:33
    categories so we use quantile
  • 00:23:36
    classification if you generate five
  • 00:23:39
    classes this means that ten states will
  • 00:23:42
    reside in each class each class the
  • 00:23:46
    first class will have the ten largest
  • 00:23:49
    states in terms of land mass and the
  • 00:23:52
    last class or last category will have
  • 00:23:55
    the ten smallest states in terms
  • 00:23:57
    of land mass so when you use the
  • 00:24:00
    quantile map classification with five
  • 00:24:02
    classes it will look like this map so it
  • 00:24:06
    is easy to see that
  • 00:24:10
    states like Texas or
  • 00:24:12
    California or New Mexico are in the top
  • 00:24:17
    ten and
  • 00:24:18
    the East Coast states are obviously the
  • 00:24:22
    smallest so quantile classification is
  • 00:24:26
    ideal when there is an order in the data
  • 00:24:28
    it is suitable for ordinal data and
  • 00:24:32
    the third method that helps us to
  • 00:24:35
    determine the cutoff value is called
  • 00:24:37
    natural break classification and this
  • 00:24:40
    might be the most widely used
  • 00:24:41
    classification technique this
  • 00:24:43
    classification is based on the histogram
  • 00:24:45
    and looks for the obvious or largest
  • 00:24:48
    gaps in the data so let's say in the
  • 00:24:51
    same study area this is a histogram of
  • 00:24:55
    the same study area and the x-axis indicates
  • 00:25:00
    population size from zero to maximum
  • 00:25:03
    which was 5133
  • 00:25:05
    and the y-axis tells us the frequency of
  • 00:25:10
    the occurrence of the population which
  • 00:25:12
    means number of polygons you have for
  • 00:25:15
    this population size then if you look at
  • 00:25:18
    histogram here we can identify a
  • 00:25:20
    number of natural gaps for example here
  • 00:25:22
    is a gap and also here is another
  • 00:25:25
    large gap in the data so the natural
  • 00:25:29
    break classification uses a mathematical
  • 00:25:32
    formula to help us to find the largest
  • 00:25:34
    gaps in histogram and then we use these
  • 00:25:37
    gaps as cutoff values so technically it
  • 00:25:40
    minimizes variance within each class and
  • 00:25:44
    maximizes variance between the classes
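The idea of minimizing variance within each class can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a brute-force search over every possible split, which is practical for a handful of values but is not how ArcGIS or any particular GIS package implements natural breaks. The `population` list is made-up sample data (only the maximum, 5133, comes from the lecture), and the function names are hypothetical.

```python
from itertools import combinations


def within_class_ssd(groups):
    """Sum of squared deviations from each class mean, added over all classes."""
    total = 0.0
    for group in groups:
        mean = sum(group) / len(group)
        total += sum((value - mean) ** 2 for value in group)
    return total


def natural_breaks_bruteforce(values, k):
    """Split the sorted values into k contiguous classes so that the total
    within-class squared deviation is as small as possible."""
    data = sorted(values)
    n = len(data)
    best_groups, best_cost = None, float("inf")
    # Try every way of placing k - 1 split points between sorted values.
    for splits in combinations(range(1, n), k - 1):
        edges = (0, *splits, n)
        groups = [data[a:b] for a, b in zip(edges, edges[1:])]
        cost = within_class_ssd(groups)
        if cost < best_cost:
            best_groups, best_cost = groups, cost
    return best_groups


# Hypothetical population sizes with two obvious gaps in the histogram.
population = [120, 150, 180, 900, 950, 1000, 4800, 5000, 5133]
classes = natural_breaks_bruteforce(population, 3)
upper_bounds = [group[-1] for group in classes]  # cutoff value of each class
print(upper_bounds)
```

With this sample the cutoffs fall exactly at the two large gaps, which is the behavior described for the histogram above. For real datasets a dynamic-programming version of the same objective is normally used, because the number of possible splits grows very quickly.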
  • 00:25:46
    the last classification method is called
  • 00:25:49
    manual or defined interval
  • 00:25:51
    classification method so the choice of
  • 00:25:54
    cutoff values is up to you and you
  • 00:25:58
    define intervals or cutoff values so we
  • 00:26:05
    talked about four different types of
  • 00:26:07
    classification methods for numeric
  • 00:26:09
    attributes manual defined equal interval
  • 00:26:14
    quantile and natural break so you can
  • 00:26:17
    see for the same data you get different
  • 00:26:19
    visual results or different patterns
  • 00:26:22
    that's why people say maps can be
  • 00:26:25
    misleading if I change cutoff value for
  • 00:26:28
    each class I will get different result
  • 00:26:30
    if I change the
  • 00:26:31
    classification technique I will get different
  • 00:26:33
    results so the question is which one
  • 00:26:36
    gives us the best display of data and
  • 00:26:39
    here are some criteria or guidelines for
  • 00:26:43
    you to select the best classification
  • 00:26:45
    method so you always have to look at the
  • 00:26:47
    data so when do we use natural breaks
  • 00:26:51
    so we use natural breaks when the
  • 00:26:53
    attributes like I don't know population
  • 00:26:56
    size are distributed unevenly across
  • 00:26:59
    overall range of the data so then when
  • 00:27:03
    you look at the histogram of the data
  • 00:27:04
    you will see some peaks the distribution
  • 00:27:07
    will be like this and in that case
  • 00:27:11
    natural breaks might be the best choice
  • 00:27:14
    so you choose numbers that best reflect
  • 00:27:18
    natural gaps within your data and how
  • 00:27:22
    about equal interval equal interval
  • 00:27:24
    classification is suitable when you want
  • 00:27:28
    to have all classes with the same range
  • 00:27:31
    for example if you want to display
  • 00:27:32
    categories in 1000 increments and
  • 00:27:35
    the interval is 1000 for every category
  • 00:27:38
    in that case equal interval is
  • 00:27:41
    suitable the quantile classification
  • 00:27:46
    basically produces the same number of
  • 00:27:48
    geographic features for every
  • 00:27:50
    category and the best time you can use
  • 00:27:52
    this classification is mainly when your
  • 00:27:55
    attributes are linearly distributed
  • 00:27:57
    across the range which means that if you
  • 00:27:59
    draw out a histogram there is no peak or
  • 00:28:04
    basically the histogram is flat or
  • 00:28:06
    it has a very
  • 00:28:08
    gentle slope and
  • 00:28:11
    the manual defined
  • 00:28:13
    classification can be used when you want
  • 00:28:15
    your classes to break at specific
  • 00:28:17
    values so that's a guideline for
  • 00:28:20
    classification methods so let's open up
  • 00:28:24
    ArcGIS to show you classification
  • 00:28:27
    techniques first of all we need to
  • 00:28:30
    download US state boundaries from TIGER/Line
  • 00:28:34
    so if I type TIGER/Line here
  • 00:28:39
    and then go to the TIGER/Line website
  • 00:28:44
    the most recent data web interface and
  • 00:28:54
    states and equivalent submit download
  • 00:29:00
    national file and then let's extract it
  • 00:29:21
    and load it into ArcGIS
  • 00:29:39
    we're going to classify this shapefile
  • 00:29:44
    based on the land area so if you open up
  • 00:29:48
    attribute table of the states there is
  • 00:29:52
    a field named ALAND I think this is
  • 00:29:58
    the land area and I'm going to classify it
  • 00:30:02
    in the symbology so symbology and it's
  • 00:30:10
    quantities right because it's a numeric
  • 00:30:13
    value and then graduated colors and based
  • 00:30:19
    on ALAND so here the classification
  • 00:30:25
    technique is natural breaks right and
  • 00:30:27
    here we have five classes
  • 00:30:30
    if I click apply and then okay so I can
  • 00:30:34
    see that so these areas have higher values
  • 00:30:36
    so remember this picture okay so if I
  • 00:30:40
    here instead of natural break with five
  • 00:30:42
    classes I use I don't know click on
  • 00:30:46
    classify and then I select equal
  • 00:30:49
    interval okay and then okay apply okay
  • 00:30:55
    so you can see that the shape
  • 00:30:58
    completely changed right so depending on
  • 00:31:01
    the classification technique we have
  • 00:31:03
    different cutoff values and then we have
  • 00:31:07
    different maps right so or we can change
  • 00:31:11
    the number of classes instead of five
  • 00:31:13
    classes we can change it to three
  • 00:31:16
    classes and then we get different
  • 00:31:17
    results right or here we have another
  • 00:31:21
    type of classification you can manually
  • 00:31:23
    define the break values here okay you
  • 00:31:26
    can change them or you can use equal
  • 00:31:30
    interval defined interval natural break
  • 00:31:33
    or other type of classification
  • 00:31:35
    techniques and each one gives us
  • 00:31:38
    different maps so so you can create
  • 00:31:42
    numerous maps with the same data that's
  • 00:31:44
    why in elections or advertisements they
  • 00:31:48
    use maps to deceive people the data are
  • 00:31:52
    the same
  • 00:31:52
    and true but classification techniques
  • 00:31:55
    are different
Tags
  • GIS
  • Spatial Analysis
  • Data Models
  • Selection
  • Classification
  • Geographic Coordination
  • Real-world Problems
  • Data Operations
  • Classification Methods
  • GIS Operations