Skip to main content
Geosciences LibreTexts

6.3: Searches and Queries

  • Page ID
    20587

    \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    \( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

    ( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\id}{\mathrm{id}}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\kernel}{\mathrm{null}\,}\)

    \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\)

    \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\)

    \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    \( \newcommand{\vectorA}[1]{\vec{#1}}      % arrow\)

    \( \newcommand{\vectorAt}[1]{\vec{\text{#1}}}      % arrow\)

    \( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vectorC}[1]{\textbf{#1}} \)

    \( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

    \( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

    \( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

    \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    Access to robust search and query tools is essential to examining the general trends of a dataset. Queries are essentially questions posed to a database. The selective display and retrieval of information based on these queries are essential components of any geographic information system (GIS). There are three basic methods for searching and querying attribute data: (1) selection, (2) query by attribute, and (3) query by geography.

    Selection

    Selection represents the easiest way to search and query spatial data in a GIS. Selecting features highlights those attributes of interest, both on-screen and in the attribute table, for subsequent display or analysis. For example, one selects points, lines, and polygons simply by using the cursor to “point-and-click” the feature of interest or by dragging a box around those features. Alternatively, one can select features by using a graphic object, such as a circle, line, or polygon, to highlight all of those features that fall within the object. Advanced options for selecting subsets of data from the larger dataset include creating a new selection, selecting from the currently selected features, adding to the current selection, and removing from the current selection.

    Query by Attribute

    Map features and their associated data can be retrieved via the query of attribute information within the data tables. For example, search and query tools allow a user to show all the census tracts with a population density of 500 or greater, show all counties that are less than or equal to 100 square kilometers, or show all convenience stores within 1 mile of an interstate highway.

    Specifically, SQL (Structured Query Language) is a computer language developed to query attribute data within a relational database management system. Created by IBM in the 1970s, SQL allows retrieving a subset of attribute information based on specific, user-defined criteria via implementing particular language elements. More recently, the use of SQL has been extended for use in a GIS (Shekhar and Chawla 2003).[1] One important note related to the use of SQL is that the exact expression used to query a dataset depends on the GIS file format being examined. For example, ANSI SQL is a particular version used to query ArcSDE geodatabases, while Jet SQL is used to access personal geodatabases. Similarly, shapefiles, coverages, and dBASE tables use a restricted version of SQL that does not support all the features of ANSI SQL or Jet SQL.

    As discussed in Chapter 5, “Geospatial Data Management,” Section 5.2 “Geospatial Database Management,” all attribute tables in a relational database management system (RDBMS) used for an SQL query must contain primary and foreign keys for proper use. In addition to these keys, SQL implements clauses to structure database queries. A clause is a language element that includes the SELECT, FROM, WHERE, ORDER BY, and HAVING query statements.

    • SELECT denotes what attribute table fields you wish to view.
    • FROM denotes the attribute table in which the information resides.
    • WHERE denotes the user-defined criteria for the attribute information that must be met in order for it to be included in the output set.
    • ORDER BY denotes the sequence in which the output set will be displayed.
    • HAVING denotes the predicate used to filter output from the ORDER BY clause.

    While the SELECT and FROM clauses are mandatory statements in an SQL query, the WHERE is an optional clause to limit the output set. Likewise, the ORDER BY and HAVING are optional clauses used to present the information in an interpretable manner.

    The following is a series of SQL expressions and results applied to “Personal Addresses in “ExampleTable” Attribute Table.” The title of the attribute table is “ExampleTable.” Note that the asterisk (*) denotes a special case of SELECT whereby all columns for a given record are selected: SELECT * FROM ExampleTable WHERE City = “Upland”

    This statement returns the following:

    Consider the following statement:

    SELECT LastName FROM ExampleTable WHERE State = “CA” ORDER BY FirstName

    This statement results in the following table sorted in ascending order by the FirstName column (not included in the output table as directed by the SELECT clause):

    In addition to clauses, SQL allows for the inclusion of specific operators to delimit the query’s result further. These operators can be relational, arithmetic, or Boolean and will typically appear inside conditional statements in the WHERE clause. A relational operator employs the statements equal to (=), less than (<), less than or equal to (<=), greater than (>), or greater than or equal to

    (>=). Arithmetic operators are those mathematical functions that include addition (+), subtraction (−), multiplication (*), and division (/). Boolean operators (also called Boolean connectors) include the statements AND, OR, XOR, and NOT. The AND connector selects records from the attribute table that satisfies both expressions. The OR connector selects records that satisfy either one or both expressions. The XOR connector selects records that satisfy one and only one of the expressions (the functional opposite of the AND connector). Lastly, the NOT connector is used to negate (or unselect) an expression that would otherwise be true. Put into the language of probability, the AND connector is used to represent an intersection OR represents a union, and NOT represents a complement. “Venn Diagram of SQL Operators” illustrates the logic of these connectors, where circles A and B represent two sets of intersecting data. Remember that SQL is a very exacting language, and minor inconsistencies in the statement, such as additional spaces, can result in a failed query.

    These operators combine to provide the GIS user with powerful and flexible search and query options. With this in mind, can you determine the output set of the following SQL query as it is applied to “Histogram Showing the Frequency Distribution of Exam Scores”?

    SELECT LastName, FirstName, StreetNumber FROM ExampleTable WHERE StreetNumber >= 10000 AND StreetNumber < 100 ORDER BY LastName

    The following are the results:

    Query by Geography

    Query by geography, also known as a “spatial query,” allows one to highlight particular features by examining their position relative to other features. For example, a GIS provides robust tools that determine the number of schools within 10 miles of a home. Several spatial query options are available, as outlined here. Throughout this discussion, the “target layer” refers to the feature dataset whose attributes are selected, while the “source layer” refers to the feature dataset on which the spatial query is applied. For example, if we used a state boundary polygon feature dataset to select highways from a line feature dataset (e.g., select all the highways that run through Arkansas), the state layer is the source, while the highway layer is the target.

    Intersect

    This often used spatial query technique selects all features in the target layer that share a common locale with the source layer. For example, the “intersect” query allows points, lines, or polygon layers as both the source and target layers.

    The highlighted blue and yellow features are selected because they intersect the red features.

    Are Within a Distance Of

    This technique requires the user to specify some distance value, which is then used to buffer the source layer. All features that intersect this buffer are highlighted in the target layer. The “are within a distance of” query allows points, lines, or polygon layers for the source and target layers.

    The highlighted blue and yellow features are selected because they are within the selected distance of the red features; tan areas represent buffers around the various features.

    Completely Contain

    This spatial query technique returns those features that are entirely within the source layer. This query type does not select features with coincident boundaries. For example, the “completely contain” query allows for points, lines, or polygons as the source layer, but only polygons can be used as a target layer.

    The highlighted blue and yellow features are selected because they contain the red features entirely.

    Are Completely Within

    This query selects those features in the target layer whose entire spatial extent occurs within the geometry of the source layer. For example, the “are completely within” query allows for points, lines, or polygons as the target layer, but only polygons can be used as a source layer.

    The highlighted blue and yellow features are selected because they are entirely within the red features.

    Have Their Center In

    This technique selects target features whose center, or centroid, is located within the boundary of the source feature dataset. The “have their center in” query allows points, lines, or polygon layers to be used as the source and target layers.

    The highlighted blue and yellow features are selected because they have their centers in the red features.

    Share a Line Segment

    This spatial query selects target features whose boundary geometries share a minimum of two adjacent vertices with the source layer. The “share a line segment” query allows for line or polygon layers for either source and target layers.

    The highlighted blue and yellow features are selected because they share a line segment with the red features.

    Touch the Boundary Of

    This methodology is similar to the INTERSECT spatial query; however, it selects line and polygon features that share a common boundary with the target layer. The “touch the boundary of” query allows line or polygon layers to be used as both the source and target layers.

    The highlighted blue and yellow features are selected because they touch the boundary of the red features.

    Are Identical To

    This spatial query returns features that have the exact geographic location. The “are identical to” query can be used on points, lines, or polygons, but the target layer type must be the same source layer type.

    The highlighted blue and yellow features are selected because they are identical to the red features.

    Are Crossed by the Outline Of

    This selection criterion returns features that share a single vertex but not an entire line segment. The “are crossed by the outline of” query allows line or polygon layers to be used as source and target layers.

    The highlighted blue and yellow features are selected because the outline of the red features crosses them.

    Contain

    This method is similar to the COMPLETELY CONTAIN spatial query; however, features in the target layer will be selected even if the boundaries overlap. For example, the “contain” query allows for point, line, or polygon features in the target layer when points are used as a source, when line and polygon target layers with a line source, and when only polygon target layers with a polygon source.

    The highlighted blue and yellow features are selected because they contain the red features.

    Are Contained By

    This method is similar to the ARE COMPLETELY WITHIN spatial query; however, features in the target layer will be selected even if the boundaries overlap. For example, the “is contained by” query allows for point, line, or polygon features in the target layer when polygons are used as a source when pointing and line target layers with a line source, and when only point target layers with a point source.

    The highlighted blue and yellow features are selected because the red features contain them.


    This page titled 6.3: Searches and Queries is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Adam Dastrup.

    • Was this article helpful?