2.7: Analyze Your Results
Data analysis represents the worst of times and the best of times for scientists. Entering, organizing, reducing, summarizing, tabulating, graphing, and analyzing scientific observations and data (in some cases, millions upon millions of data points) can be time-consuming, frustrating, tedious, and a downright pain in the caboose. On the other hand, as the averages emerge, as the trends become clear, as the relationships make themselves known, and as something that no one has previously witnessed or discovered sees the light of day, data analysis can be one of the most exhilarating experiences in a scientist’s life. After all, the data tell the story. They refute or confirm the hypotheses and ideas. They tell the scientists whether they’re on the right track or whether a particular set of experiments is an utter failure. Data analysis is often the make-or-break moment in a scientific study. Occasionally, a scientist’s work results in something so extraordinary that it changes the world. Our modern, industrialized civilization, including the smartphone you’re checking as you read this, represents an accumulation of these moments in science.
Without question, the tools, methods, and approaches of data analysis have been transformed by the development of technology, notably computing. Scientists have been using computers since the 1950s, but computers’ speed and ease of use have changed dramatically in the past 20 years. As one oceanographer put it, computers did for the 20th century what adoption of Arabic numerals did for the 17th century: they transformed how we think (Warren 2006). Three examples illustrate Warren’s point: numerical models, big data, and artificial intelligence.
In recent decades numerical models, also known as mathematical or computer models, have evolved alongside theory and observation into what oceanographers call the third element of oceanographic research. Models generate data by solving sets of equations that represent—in the scientist’s mind—how a system works. When the model’s predictions (or forecasts) are compared to observations from nature, scientists can determine how well they do or don’t understand a system. When the model outputs fail to match observations, scientists know that some part of their model needs revision. This stepwise process—generating model predictions, comparing them to real-world observations, and revising the model—helps scientists improve their understanding of the natural world.
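The predict, compare, and revise cycle described above can be sketched in a few lines of Python. This is only a toy illustration, not an actual oceanographic model: the single-sinusoid "tide" model, the 12.42-hour period, the candidate amplitudes, and the observations are all assumptions made up for the example.

```python
import math

def model(t_hours, amplitude_m, period_hours=12.42):
    """Toy model (assumed for illustration): sea level as one sinusoid."""
    return amplitude_m * math.sin(2 * math.pi * t_hours / period_hours)

def rmse(predicted, observed):
    """Root-mean-square error: one common measure of model-data misfit."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical observations: a "true" amplitude of 1.0 m plus small made-up errors.
times = [0, 2, 4, 6, 8, 10, 12]
errors = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.02]
observed = [model(t, 1.0) + e for t, e in zip(times, errors)]

def misfit(amplitude_m):
    """Step 1: generate predictions. Step 2: compare them to observations."""
    predictions = [model(t, amplitude_m) for t in times]
    return rmse(predictions, observed)

# Step 3: revise the model by keeping the candidate with the smallest misfit.
candidates = [0.5, 0.75, 1.0, 1.25, 1.5]
best_amplitude = min(candidates, key=misfit)
best_error = misfit(best_amplitude)

print(f"best amplitude: {best_amplitude} m, RMSE: {best_error:.3f} m")
```

A large misfit signals that some part of the model needs revision. Here the only thing revised is one parameter, whereas real models may require changes to the equations themselves.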
Models serve a practical function too. A branch of applied ocean science known as operational oceanography serves as a kind of ocean weather forecasting service. Its models produce near-real-time output called nowcasts. To create nowcasts, oceanographers use observational data from various near-real-time electronic streams, such as buoys, floats, satellites, and even ships. Operational oceanography serves commercial marine interests (fishing and shipping, for example), the military (which depends on up-to-date knowledge of the ocean to conduct its operations), and recreational interests (like boaters and surfers).
Increasingly, oceanographers are turning to computer scientists for help managing large oceanographic data sets. Big data refers to an emerging subdiscipline of computer science that tackles the challenges of analyzing data sets whose size or complexity exceeds the capabilities of traditional software and computers. Hey et al. (2009) refer to this “data-intensive computing” as the fourth paradigm: the use of computers to explore and mine big data for information. In a sense, the fourth paradigm emerges from the ability of computers to think: they detect patterns that we cannot. Big data scientists emphasize five key properties of big data: volume (lots of it), variety (many different sources), velocity (coming at you fast), veracity (ensuring the data are real and free of errors), and value (what you can learn) (e.g., De Mauro et al. 2016). The terminology has largely developed in response to the proliferation of electronic devices such as smartphones, watches, tablets, and virtual assistants. These devices can track, store, and archive (via the cloud) practically everything you do on them, and the resultant data stream (popularly known as digital exhaust) amounts to mind-blowing quantities of information.
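As a small illustration of the veracity property, the sketch below flags readings that fall outside a plausible range. It is a minimal example under stated assumptions: the function name `flag_out_of_range`, the temperature readings, and the limits of -2.5 and 40.0 degrees Celsius are all hypothetical, and real quality-control procedures involve many more checks.

```python
def flag_out_of_range(values, low, high):
    """Pair each value with True if it lies within [low, high], else False."""
    return [(v, low <= v <= high) for v in values]

# Hypothetical sea-surface temperatures in degrees Celsius (made up for illustration).
readings = [12.4, 12.6, 99.9, 12.5, -40.0]

# Assumed plausible bounds for this example only.
checked = flag_out_of_range(readings, low=-2.5, high=40.0)

# Keep only the readings that passed the range check.
clean = [value for value, is_valid in checked if is_valid]
print(clean)
```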
Oceanographers have long faced the challenges of big data, or marine big data, as they refer to it (e.g., Huang et al. 2015, 2019). They note that the compilation of wind and current data from ships’ logs by American oceanographer Matthew Maury (1806–1873) contained 1.2 million data points. Results from the world’s first global oceanographic voyage, the Challenger expedition (1872–1876), spanned 50 volumes and more than 29,500 pages (Murray 1895). More recently, large, multi-institutional oceanographic expeditions churn out data by the petabyte. That’s a thousand times a terabyte, the typical storage capacity of the current generation of desktop computers. A single Earth-observing satellite, Aquarius, logs more data in two months than the first 125 years of ship and buoy data gathering produced (Huang et al. 2015). Developing the architecture and algorithms to collect, verify, store, process, integrate, and analyze such massive volumes will require new ways of thinking and a new generation of marine data science techies. Perhaps you will be among them.
One possible solution for extracting meaningful information from big data lies in artificial intelligence, or AI, a broad category of data science aimed at building computer and software systems that mimic human intention, intelligence, and adaptability (e.g., West and Allen 2018). While application of AI-based data analysis in the marine sciences has been slow to develop, the field holds great promise for extracting greater value from big data. Working side by side, humans and machines engaged in ocean science research may accelerate our knowledge and understanding of the world ocean as never before.