Geosciences LibreTexts

1.6: Data Model

    Current GIS programs represent points, lines, and polygons differently. There are two fundamental models: raster and vector. Each model has its advantages and disadvantages, and neither is superior to the other in every situation. One data model may fit certain types of data and applications better than the other.

    Raster

    A matrix of rows and columns, the raster data model covers sections of the Earth’s surface and represents features with cells or pixels. Pixels are the building blocks of the raster data model, and they are usually uniformly square and of consistent size within each layer. Each pixel represents a precise chunk of the Earth’s surface, so the geographic position of any cell can be determined. A specific attribute value, representing the condition of that specific portion of the Earth’s surface (see Figure 1.8), is associated with the pixel. If you need more than one attribute to describe the area contained within the pixel (and most likely you will), you need a second layer. The second raster layer gives you a second attribute. A third gives you a third attribute, and so on.
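    Because cells are uniform, the position of any cell follows from the layer's corner coordinates and the cell size. The sketch below is only an illustration: the function name `cell_center`, the corner coordinates, and the convention that row 0 is the top row are assumptions, not part of any particular GIS package.

```python
def cell_center(row, col, x_min, y_max, cell_size):
    """Return the (x, y) map coordinates of the center of a raster cell.

    Assumes row 0 is the top row and column 0 is the leftmost column,
    and that cells are square with side length cell_size.
    """
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return x, y


# Hypothetical layer: upper-left corner at (500000, 4200000), 30 m cells.
print(cell_center(0, 0, 500000.0, 4200000.0, 30.0))  # (500015.0, 4199985.0)
print(cell_center(2, 3, 500000.0, 4200000.0, 30.0))  # (500105.0, 4199925.0)
```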

    Individual cells and groups of cells represent the features of the real world (Figure 1.8). A point feature usually fills one cell while lines and polygons are constructed as a string or contiguous group of cells. Raster layers fill space; they describe what occurs everywhere in the study area. There are no blank spaces across the layer. “Empty” areas simply get a “0” value, but every pixel gets a value.

    Figure 1.8: The raster and vector data models.  Each stores features in a different way.


    Conceptually, the raster model is simple. You take a portion of the Earth’s surface, divide it into cells, and give each cell an attribute that represents that area. In the figure above, you might give each cell either a D (developed), P (park), or W (water). For those cells with both park and water, you can give these cells either another code PW (for park and water) or make a judgment as to what covers the majority of the cell. Another way to code these cells is with the percentage of the cell that is water. If 40 percent of the cell is covered by water, the cell gets a value of 40.
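    The two coding choices just described (majority cover versus percentage of water) can be sketched as follows. The cover fractions and the helper names `majority_code` and `percent_water` are made up for illustration.

```python
def majority_code(cover):
    """Return the code of the land cover that occupies the largest share of a cell."""
    return max(cover, key=cover.get)


def percent_water(cover):
    """Return the percentage of the cell covered by water (code 'W')."""
    return round(cover.get("W", 0.0) * 100)


# Hypothetical cells: each maps a cover code to the fraction of the cell it fills.
cells = [
    {"D": 1.0},            # fully developed
    {"P": 0.6, "W": 0.4},  # mostly park, some water
    {"W": 1.0},            # fully water
]

print([majority_code(c) for c in cells])  # ['D', 'P', 'W']
print([percent_water(c) for c in cells])  # [0, 40, 100]
```

    Either coding produces exactly one value per cell; storing both would take two layers.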

    Vector

    The vector data model uses discrete points and line segments to identify the locations of the Earth’s features. Vector objects usually do not fill space like raster layers do; they depict where features occur, and the space around those features is empty. Notice that there are white spaces in the vector model of Figure 1.8. No white spaces exist in the raster model; it covers the entire area.

    Vector features are located with x, y coordinates. A node (sometimes called a vertex) is a location in space that helps define the shape of point, line, and polygon features. Points are the simplest case: they have one node, a single coordinate pair that locates the feature in space. Lines have at least two nodes (their end points). Polygons have a minimum of three nodes to form an area. Lines and polygons usually have many more nodes that help define the course of the line or the polygon’s area.
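    As a rough sketch of these node counts, a vector feature can be held as a geometry type plus a list of coordinate pairs. The feature names, coordinates, and the `is_valid` helper are hypothetical and do not mirror any specific file format.

```python
# Minimum number of nodes each geometry type needs, as described in the text.
MIN_NODES = {"point": 1, "line": 2, "polygon": 3}


def is_valid(geom_type, nodes):
    """Check that a feature has enough nodes for its geometry type."""
    if geom_type == "point":
        return len(nodes) == 1  # a point is exactly one coordinate pair
    return len(nodes) >= MIN_NODES[geom_type]


# Hypothetical features: (geometry type, list of (x, y) nodes).
stop_sign = ("point", [(3.0, 4.0)])
road = ("line", [(0.0, 0.0), (2.0, 1.0), (5.0, 1.5)])
park = ("polygon", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)])
bad_polygon = ("polygon", [(0.0, 0.0), (4.0, 0.0)])

for name, feature in [("stop_sign", stop_sign), ("road", road),
                      ("park", park), ("bad_polygon", bad_polygon)]:
    print(name, is_valid(*feature))
```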

    Contrasting with raster systems that record one attribute per layer, the vector data model can handle many attributes for each feature type. Different software programs have varying ways of organizing vector digital files, but usually they have at least two files: one that stores spatial data and another that stores attributes.

    The link between the spatial and attribute data files is made with a unique identifier. Each feature on the map and its corresponding attributes has a unique identifier that links the map feature to its database attributes. A type of unique identifier, a “key”, which links attribute files, is discussed in Chapter 3.
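    A minimal sketch of that link, assuming the spatial data and the attributes live in two separate lookup tables keyed by the same identifier. The identifiers, field names, and values are invented for illustration.

```python
# Spatial "file": unique identifier -> list of (x, y) nodes.
geometries = {
    101: [(0.0, 0.0), (2.0, 1.0), (5.0, 1.5)],
    102: [(5.0, 1.5), (7.0, 4.0)],
}

# Attribute "file": the same unique identifier -> many attributes per feature.
attributes = {
    101: {"name": "Main St", "lanes": 2, "surface": "asphalt"},
    102: {"name": "Mill Rd", "lanes": 1, "surface": "gravel"},
}


def get_feature(fid):
    """Join a feature's geometry to its attributes through the shared identifier."""
    return {"id": fid, "nodes": geometries[fid], **attributes[fid]}


feature = get_feature(102)
print(feature["name"], feature["lanes"], len(feature["nodes"]))  # Mill Rd 1 2
```

    Note that one feature carries several attributes here, whereas a raster layer would need one layer per attribute.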

    Raster versus Vector

    Which is better? Although GIS users have their own favorite data model, asking which is “better” is an incomplete question. There are advantages and disadvantages to both data models, so a better question is which is better for particular applications or datasets. Some in the GIS industry use the slogan “Raster is faster, but vector is corrector.” While this is a good starting point, it conceals the details. Yes, your computer can process raster data more quickly, but today’s computer processors are so fast that the difference may be negligible. Yes, vector output looks more accurate, but you can increase pixel resolution to something resembling vector resolution (this, however, greatly increases the database size). The following are some of the advantages and disadvantages of the data models:

    Raster advantages:

    1. Easy to understand. Conceptually, the raster data model is easy to understand. It arranges data into columns and rows. Each pixel represents a piece of territory.
    2. Processing speed. Raster’s simple data structure and its uncomplicated math produce quick results. For example, to calculate a polygon’s area, the computer takes the area contained within a single cell (which remains consistent throughout the layer) and multiplies it by the number of cells making up the polygon. Likewise, many analysis processes, like overlay and buffering, run faster than in vector systems, which must use geometric equations.
    3. Data form. Remote sensing imagery is easily handled by raster-based systems because the imagery is provided in a raster format.
    4. Some analysis functions (surface analysis and neighborhood functions) are only feasible in raster systems. In addition, many new analysis functions appear in raster systems before migrating to vector systems because the math is simpler.
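    The area calculation mentioned under processing speed reduces to counting cells. The sketch below uses a made-up grid with the D/P/W codes from the earlier example and an assumed 30-meter cell size.

```python
def raster_area(grid, code, cell_size):
    """Area covered by a code: number of matching cells times the area of one cell."""
    count = sum(row.count(code) for row in grid)
    return count * cell_size ** 2


# Hypothetical layer: each character is one cell (D = developed, P = park, W = water).
grid = [
    "DDPP",
    "DPPW",
    "DPWW",
]

print(raster_area(grid, "P", 30.0))  # 4500.0 square meters (5 cells x 900)
print(raster_area(grid, "W", 30.0))  # 2700.0 square meters (3 cells x 900)
```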

    Raster disadvantages:

    1. Appearance. Cells “seem” to sacrifice too much detail (Figure 1.9). This disadvantage is largely aesthetic and can be remedied by increasing the layer’s resolution.

      Figure 1.9: Comparison of raster and vector data models. Raster layers often appear pixelated and thus less accurate.


    2. Accuracy. Sometimes accuracy is a problem due to the pixel resolution. Imagine if you had a raster layer with a 30 by 30 meter resolution, and you wanted to locate traffic stop signs in that layer. The entire 30 by 30 meter pixel would represent the single stop sign. If you converted this raster layer to vector, it might place the stop sign at what was the pixel’s center. Sometimes problems of accuracy (and appearance) can be resolved by selecting a smaller pixel resolution, but this has database consequences.
    3. Large database. As just described, accuracy and appearance can be enhanced by reducing pixel size (the area of the Earth’s surface covered by each cell), but this increases your layer’s file size. By halving the pixel size (say from 30 to 15 meters), your layer grows to four times its original size. Improve the resolution again by halving the pixel size (to 7.5 meters) and your layer will again quadruple (to 16 times the size of the original 30-meter layer). The layer quadruples because the number of cells doubles in both the x and y directions.
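    The growth in layer size can be checked with a quick count of cells. The 3,000 by 3,000 meter study area below is an assumed example, and the cell count stands in for file size (real formats may compress).

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover a rectangular study area."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols


# Hypothetical 3,000 m x 3,000 m study area at three resolutions.
base = cell_count(3000, 3000, 30)
for size in (30, 15, 7.5):
    n = cell_count(3000, 3000, size)
    print(f"{size} m cells: {n} cells, {n // base}x the 30 m layer")
```

    Each halving of the cell size doubles both the row count and the column count, which is where the factor of four comes from.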

    Vector advantages:

    1. Intuitive. In our minds, we picture features as discrete objects rather than as groups of contiguous square cells.
    2. Resolution. If the locations of features are precise and accurate, you can maintain that spatial accuracy. The features will not float somewhere within a cell.
    3. Topology. Although the raster data model preserves where features are located in relation to one another, it does not represent how they are related to one another. This more complex form of topology can be constructed in most vector systems, so you can track the connections in a municipal water network between pipe and valve features and thus track the direction and flow of water.
    4. Storage. Vector points, lines, and simple polygons use little disk space in comparison to raster systems. This was once a major consideration when hard-disk storage was limited and expensive.
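    The water network example under topology can be pictured as a small directed graph. This is only a sketch: the feature names and connections are invented, and real GIS topology models store far more than a simple adjacency list.

```python
from collections import deque

# Hypothetical network: each feature lists the features water flows to next.
network = {
    "reservoir": ["valve_A"],
    "valve_A": ["valve_B", "valve_C"],
    "valve_B": ["hydrant_1"],
    "valve_C": [],
    "hydrant_1": [],
}


def downstream(network, start):
    """Return every feature reachable from start, following the flow direction."""
    visited = {start}
    reached = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in network.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                reached.append(nxt)
                queue.append(nxt)
    return reached


# Closing valve_A would cut off everything listed here.
print(downstream(network, "valve_A"))  # ['valve_B', 'valve_C', 'hydrant_1']
```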

    Vector disadvantages:

    1. Geometry is complex. The geometrical algorithms needed for polygon overlay and the calculation of distances, depending on the projection/coordinate system used, require experienced programmers. This is not usually a problem for most GIS users since most functions are directly coded in the software.
    2. Slow response times. The vector data model can be slow to process complex datasets, especially on low-end computers.
    3. Less innovation. Since the math is more complex, new analysis functions may not surface on vector systems for a couple of years after they have debuted on raster systems.
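    To give a feel for the geometry involved, here is one of the simpler vector calculations: a polygon's area from its nodes using the standard shoelace formula. It assumes planar (projected) x, y coordinates and a simple, non-self-intersecting polygon; the function name and sample nodes are illustrative, and this is not taken from any particular GIS package.

```python
def polygon_area(nodes):
    """Area of a simple polygon from its (x, y) nodes using the shoelace formula.

    Assumes planar coordinates and nodes listed in order around the boundary.
    """
    total = 0.0
    n = len(nodes)
    for i in range(n):
        x1, y1 = nodes[i]
        x2, y2 = nodes[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical right triangle with legs of 4 and 3 units.
print(polygon_area([(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]))  # 6.0
```

    Overlay and distance calculations are considerably more involved than this, which is the point of the first disadvantage listed above.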

    This page titled 1.6: Data Model is shared under a CC BY-SA 3.0 license and was authored, remixed, and/or curated by via source content that was edited to the style and standards of the LibreTexts platform.