4.3: Principles Of Database Management – Vector

    Let’s turn our discussion from characteristics of data to how these values are organized within a data file. Data files are the basic “database” for many programs, including spreadsheets, statistics programs, and GIS. Within a GIS, there is a data file for each particular type of geographic feature (e.g. streets, street lights, buildings, and parcels of land). Each data file is the database’s version of your features. The data files are automatically created when feature layers are defined in your GIS, and you place into them the attributes related to the features.

    Data files, often called “tables,” arrange attributes within a matrix of fields and records. Fields form the columns of a data file (see Figure 4.1), and they contain the values for each specific attribute you are collecting. For example, parcels might include attributes such as area, land use, and Assessor’s Parcel Number (APN). In this example, you would have at least three fields: one called area, another titled land use, and one labeled APN.
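    The parcel example above can be sketched in a few lines of Python. This is only an illustration of the field/record idea, not how any particular GIS stores its tables; the parcel values and the name `parcels` are invented for the example.

```python
# Hypothetical parcel table: each dict is one record (row) and
# each key is one field (column). All values are made up.
parcels = [
    {"APN": "012-345-678", "area": 5200.0, "land_use": "residential"},
    {"APN": "012-345-679", "area": 12850.5, "land_use": "commercial"},
    {"APN": "012-345-680", "area": 7400.0, "land_use": "residential"},
]

# The field names are the column headings of the table.
fields = list(parcels[0].keys())

# Reading down one field gives every record's value for that attribute.
land_use_column = [record["land_use"] for record in parcels]

print(fields)
print(land_use_column)
```

    Reading across one dictionary gives a record; reading the same key in every dictionary gives a field.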

    Figure 4.1:  Key parts of a data file.

    Remember from Chapter 2 that each of these fields has a specific “data format” that defines the type and length of the value that can be directly entered into the data file. Frequently attributes are coded as one of the following, but there are many data formats and the specific name of the data format often changes from one software program to another. Broad data format categories include:

    Figure 4.2:  Data format categories.

    A single record, a row in the data file, represents the database’s version of a single feature, including all of its specific attribute values (see Figure 4.1). A few of these attributes may be system variables that the GIS needs for data integrity reasons and to link the data file to the feature’s spatial files. In addition, some GIS programs automatically generate length calculations for line features and both area and perimeter calculations for polygon features. Each data file should have a key identifier field that uniquely identifies each feature (i.e. each record). The remaining attributes are up to you and the purpose of your study.
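    Because the key identifier must be unique for every record, it can be worth checking a table before relying on it. The following is a minimal sketch of such a check; the function name `duplicate_keys`, the field name `light_id`, and the sample records are assumptions made for illustration.

```python
from collections import Counter


def duplicate_keys(records, key_field):
    """Return the key values that appear in more than one record."""
    counts = Counter(record[key_field] for record in records)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical street light records; the key 2 is deliberately repeated.
street_lights = [
    {"light_id": 1, "wattage": 100},
    {"light_id": 2, "wattage": 150},
    {"light_id": 2, "wattage": 100},
]

print(duplicate_keys(street_lights, "light_id"))
```

    An empty result means every record can be identified unambiguously by that field.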

    Data files are a collection of related records. If you have 25 street lights within your GIS, you will have 25 street light records in its attribute file. As briefly described above, a largely empty data file is created when a new layer is defined within a GIS program. It is your job to add fields and attribute values to the data file. These descriptive attributes can be entered by hand or imported from external sources. You will likely enter some attributes by hand (which can be time-consuming and tedious), but many, if not most, of the attributes you seek will be imported or “joined” from separate, non-GIS data files. One reason is that many non-spatial data files predate the need to incorporate them into a GIS. Another is that data manipulation within GIS is clumsy: since most GIS users are familiar with data management programs like Excel and Access, they prefer to work in those programs, export their data, and “join” the external data file to the GIS data file. The joining process is described later in this chapter.

    These external data files are coded in one of many “file formats”. Some file formats are specific to a particular software program while others are somewhat universal. Even those using a program’s proprietary format can export the data file into one of many formats that most GIS programs can read. Some of the file formats that can be read by most GIS programs include:

    dBase – This industry-standard format is read by just about every GIS program. Many GIS programs use this format internally rather than creating their own.

    Excel and Access – Microsoft’s file formats for Excel and Access can be read by many GIS programs. If your GIS program does not read these formats, open the data file in Excel or Access and export it into a format that your system reads.

    ASCII (American Standard Code for Information Interchange) – Since most computers use ASCII to represent text, it is possible to transfer data from one computer to another in this format. It is also read and written by most GIS programs, but it is rarely used as the primary GIS file format (with the exception of some raster-based GIS programs). Some government data sets are contained in this file format. Text files come in several different “delimited” forms, and all may include numeric or alphanumeric content (see “Joining Data Files” later in this chapter).
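    As an illustration of a delimited text file, the sketch below reads comma-delimited text into records using Python's standard `csv` module. The sample text, the field names, and the helper `read_delimited` are hypothetical; `io.StringIO` stands in for a file on disk.

```python
import csv
import io

# Hypothetical comma-delimited text; the first line holds the field names.
text = (
    "APN,owner,assessed_value\n"
    "012-345-678,Rivera,350000\n"
    "012-345-679,Chen,910000\n"
)


def read_delimited(handle, delimiter=","):
    """Read delimited text into a list of records keyed by the header row."""
    return list(csv.DictReader(handle, delimiter=delimiter))


records = read_delimited(io.StringIO(text))

# Every value arrives as text, so numeric fields are converted explicitly.
for record in records:
    record["assessed_value"] = int(record["assessed_value"])

print(records)
```

    Passing `delimiter="\t"` reads a tab-delimited file in the same way.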

    Data files contain a matrix of fields and records for each feature layer. A database is a collection of several related data files (like parcels, street lights, and buildings). In other words, databases contain data files for related layers. Accessing these data files is done either through the GIS software or, increasingly, through external database management systems (DBMS) that are linked to the GIS. DBMS are specialized programs that organize, manipulate, and report non-spatial data and help you store your data more efficiently. They are particularly valuable when working with large data sets because you can select a subset of your records and fields to work with; the entire attribute file does not have to be used. Examples of external DBMS programs include Access, Oracle, Ingres, SQL Server, Informix, and, to a lesser degree, Excel, which can serve as an elementary database program. Regardless of whether you are accessing the data files within the GIS software or from an external DBMS, all databases have standard operations, which include sorting and selecting records, deleting records and fields, and editing fields and attributes.
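    The standard operations named above (selecting, sorting, editing, and deleting) can be sketched with SQLite, a small relational DBMS that ships with Python. SQLite is used here only because it is convenient and is not one of the programs listed in the text; the table name, field names, and values are invented for illustration.

```python
import sqlite3

# An in-memory database holding one hypothetical data file (table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (apn TEXT PRIMARY KEY, area REAL, land_use TEXT)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [
        ("012-345-678", 5200.0, "residential"),
        ("012-345-679", 12850.5, "commercial"),
        ("012-345-680", 7400.0, "residential"),
    ],
)

# Select a subset of records (residential only) and fields (apn, area),
# sorted by area from largest to smallest.
rows = conn.execute(
    "SELECT apn, area FROM parcels WHERE land_use = ? ORDER BY area DESC",
    ("residential",),
).fetchall()

# Edit one attribute value, then delete one record.
conn.execute(
    "UPDATE parcels SET land_use = ? WHERE apn = ?", ("mixed use", "012-345-679")
)
conn.execute("DELETE FROM parcels WHERE apn = ?", ("012-345-680",))
remaining = conn.execute("SELECT COUNT(*) FROM parcels").fetchone()[0]

print(rows)
print(remaining)
```

    The `SELECT` statement returns only the requested fields for the matching records, which is the "subset of your records and fields" idea in practice.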

    Different databases have different structures or ways to organize data. The hierarchical and network data models are two examples, but they are rarely used for GIS (and so will be skipped in this section). For vector systems, the relational database model is the most common data model, arguably because it is more flexible, its table structure is easy to understand and program, and, outside of GIS, data files are commonly held in relational databases.

    Linking or joining data files is the relational database model’s strength. Key identifiers, found in multiple data files, are used to link records from one data file to another. In other words, you cross reference multiple data files using common attributes and attach (or join) these external data files to your internal GIS data file. This link takes the selected fields in the data file you wish to join and relates them to the appropriate records in the GIS data file. This requires that each data file have at least one common field to perform a join. There are different names for the key identifier including key and primary key. This process is highlighted later in this chapter.
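    The join described above can be sketched in plain Python. This is a simplified illustration of the idea, not the procedure any specific GIS program uses; the function `join_tables`, the field names, and the sample records are assumptions. It assumes the key is unique in the external file (if a key repeats there, the last matching row is used).

```python
def join_tables(gis_records, external_records, key, fields):
    """Attach selected fields from external records to GIS records sharing a key.

    GIS records with no match are kept, with None for the joined fields.
    """
    # Index the external file by its key identifier for quick lookup.
    lookup = {row[key]: row for row in external_records}
    joined = []
    for record in gis_records:
        match = lookup.get(record[key], {})
        extra = {field: match.get(field) for field in fields}
        joined.append({**record, **extra})
    return joined


# Hypothetical internal GIS data file and external assessor file.
gis_parcels = [
    {"APN": "012-345-678", "area": 5200.0},
    {"APN": "012-345-679", "area": 12850.5},
    {"APN": "012-345-680", "area": 7400.0},
]
assessor = [
    {"APN": "012-345-678", "owner": "Rivera", "assessed_value": 350000},
    {"APN": "012-345-679", "owner": "Chen", "assessed_value": 910000},
]

result = join_tables(gis_parcels, assessor, key="APN", fields=["owner"])
print(result)
```

    Only the selected field (`owner`) is attached, and the third parcel, which has no matching APN in the external file, keeps its own attributes with an empty joined value.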

    Many, however, think that the relational database model does not adequately represent spatial data. For some, records in a relational data file are too discrete; they do not properly depict the continuous and multi-dimensional nature of the features they are representing. We use relational data models because they are simple and convenient, but we artificially bend geographic features to conform to existing database standards that were created for non-spatial data.

    This has led to the development of object-oriented data structures, which are seen as a more sophisticated database model. The database discards many of the foundational concepts that we have applied throughout this book. Features are defined differently; object-oriented features blur the line between points, lines, and polygons. Also, instead of having multiple files for each GIS layer, the geography and attribute data are integrated into a single file. This allows for simultaneous geographic and attribute editing and quicker processing. The more sophisticated model, however, is a more complex model, and that may have slowed its spread even though “object-oriented” databases were one of the hottest topics in GIS in the 1990s. It may still be the touted successor of the relational model, but it seems that the relational model, despite its drawbacks, has significant pluses—including its ease of use—that will help it dominate at least into the near future.


    This page titled 4.3: Principles Of Database Management – Vector is shared under a CC BY-SA 3.0 license and was authored, remixed, and/or curated by via source content that was edited to the style and standards of the LibreTexts platform.