Evaluating Objectives for Data

This document will introduce you to some of the initial questions that are posed when evaluating the overall objectives for your data as they relate to processing and future use.

[wptabs style=”wpui-alma” mode=”vertical”] [wptabtitle] INTENDED AUDIENCE OF THIS GUIDE [/wptabtitle]

[wptabcontent]

Who will find this guide useful?

A growing number of digital technologies, described on the GMV and within these guides, allow for rapid and accurate documentation of sites and objects and can acquire substantial amounts of data in a relatively short time in the field. In general, the resulting data are part of a larger data life cycle structure and are acquired with an appreciation of the wide range of possible future uses.

By the time that you are reading this, we assume that you are familiar with the types of data on the GMV and the types of applications and projects in which CAST uses them as outlined on the Using the GMV page and within individual technology sections. You should now also be familiar with the main ideas within the table Survey Options for GMV Technologies. In order to make use of this guide, you need to have a basic to intermediate understanding of these technologies and ideas.

______________________________________________________________

ADDITIONAL RESOURCES :  In projects in which the quality and quantity of data to be collected are still being decided, this document is intended to be used in conjunction with the GMV’s Evaluating the Project Scope Document.  Ideally, the ultimate destination of the data would be considered in the planning and collection stages. However, whether or not you are making these decisions and collecting the data yourself, we hope that considering the objectives in this document will aid you in relating your data to the ‘big picture’.

The Archaeology Data Service / Digital Antiquity Guides to Good Practice provide much more detail about data management, while this document attempts to simplify very complex topics for a more generalized understanding.

[/wptabcontent]

[wptabtitle] OBJECTIVE [/wptabtitle] [wptabcontent]

Goal of this Guide

It is critical to note that the topics discussed here focus ONLY on general, basic processing of data that is necessary to make it intelligible to others and ready for an archive. That said, the data acquisition, processing and archiving referenced here are intended to be part of a larger data life cycle structure with an understanding of possible future uses. It is, of course, impossible to anticipate all future uses to which data may be applied but the objectives considered here are designed to obtain well documented and comprehensive data that can be expected to support a broad range of future applications and analyses, within heritage applications and beyond.

The primary objective of this document is to aid users in evaluating overall goals for data and how they relate to maintaining archivable and reusable data by considering :

I.     What is the overall life cycle of the data?

II.    How is preparing data for archival quality related to the products produced for end ‘consumption’?

III.   What types of error are involved with the data?

IV.    What are your overall goals for data processing? How does the data evolve over each stage of processing, from the original data to the final files or products that you are hoping to produce?

[/wptabcontent]

[wptabtitle] DATA LIFE CYCLE [/wptabtitle] [wptabcontent]

What is the Life Cycle of Your Data?

A Documentation Perspective : When properly considered, the life cycle of your data begins before any data is collected. The first concept of the data occurs early in project planning when decisions relating to the data, such as file types, naming conventions, and documentation methods, are made.  Many projects, and the great majority of heritage recordation efforts, move directly from acquisition and creation to the development and presentation of a specific set of work products. Here, we instead look at digital heritage and urban recordation data from a documentation perspective that keeps an ongoing life cycle in mind. This perspective involves multiple technologies in which the initial product is data of archival quality that allows for a variety of end ‘consumption’, including display, analysis and presentation products that are not covered here.

The data life cycle. Shaded portions of the life cycle are the topics of focus in this guide.

Heritage and Modern Environments : It should be noted that, in general, heritage guidelines for documenting these evolving technologies are far more advanced than the standards for projects involving modern environments. As professionals who deal with modern environments (such as architects, engineers and city planners) become increasingly aware of these technologies and begin utilizing them for building information and daily maintenance/operations objectives, standards for these applications will continue to advance. While the ultimate goals of heritage agendas often differ significantly from those of modern environment agendas, we propose that the basic documentation perspective applies to both. In all applications, if data is collected, processed and documented to meet basic archival quality, that data should be reusable and able to meet future needs.

[/wptabcontent]

[wptabtitle] ARCHIVING vs. CONSUMPTION [/wptabtitle] [wptabcontent]

How are archiving methods related to the end ‘consumption’ of data?

Future Needs : The methods of end ‘consumption’ and the final products used to analyze or display your data will be specific to your project. Although final archiving might not be your first priority as you begin processing data, we suggest that in all projects you maintain consistency and repeatability in processing and storage conventions so that the data can meet unknown, future needs. Maintaining consistency and keeping the data’s long-term life cycle in mind will help prepare your project for storing your data and/or meeting archival requirements when you realize they are needed.

Consistency at each stage of processing : All of the technologies on the GMV require basic data processing to make the data even minimally useful. Each technology section includes workflows for basic processing to achieve this minimal usability, with some sections delving into more advanced processes. Combining the steps for basic processing with the detailed information on Project Documentation in the Guides to Good Practice will ensure that your raw data will be adequate for archival purposes and will be accessible and usable by future investigators. Continuing these methodologies throughout the project will ensure that the raw data, the minimally processed data, and comprehensive metadata are archived in such a manner as to remain accessible and to allow the future development of whichever end ‘consumption’ products are chosen.

[/wptabcontent]

[wptabtitle] TYPES OF ERROR [/wptabtitle] [wptabcontent]

Are accuracy and precision equally important in your data?

This might seem like a strange question. It is tempting to think that you need both highly precise and highly accurate data. Ideally, the equipment being used and the people performing the survey would be perfectly accurate and precise throughout all processes. However, there is error involved in every process and understanding the sources and types of error involved will help you to determine what is acceptable for your purposes and how to track the errors.

The left image shows high accuracy but low precision, while the right image shows high precision but low accuracy.
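The distinction in the figure can be made concrete with a small calculation. The following is a minimal sketch, not a survey-grade method: it assumes repeated 2D measurements of a single control point whose true position is known, and it treats the offset of the mean measurement as a stand-in for accuracy and the scatter around that mean as a stand-in for precision. The function name and the sample coordinates are hypothetical.

```python
import math
import statistics


def accuracy_and_precision(measurements, reference):
    """Return (bias, spread) for repeated 2D measurements of one point.

    bias   : distance from the mean measured position to the reference
             position (a small bias suggests high accuracy)
    spread : root-mean-square distance of the measurements from their
             own mean (a small spread suggests high precision)
    """
    mean_x = statistics.fmean(x for x, _ in measurements)
    mean_y = statistics.fmean(y for _, y in measurements)
    bias = math.dist((mean_x, mean_y), reference)
    spread = math.sqrt(
        statistics.fmean(
            (x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in measurements
        )
    )
    return bias, spread


# Hypothetical measurements of a control point located at (0, 0).
reference = (0.0, 0.0)

# Scattered widely, but centered on the true position (left image).
accurate_imprecise = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]

# Tightly clustered, but offset from the true position (right image).
precise_inaccurate = [(3.0, 4.0), (3.1, 4.0), (2.9, 4.0), (3.0, 4.1), (3.0, 3.9)]

for label, points in [
    ("accurate, imprecise", accurate_imprecise),
    ("precise, inaccurate", precise_inaccurate),
]:
    bias, spread = accuracy_and_precision(points, reference)
    print(f"{label}: bias={bias:.3f}, spread={spread:.3f}")
```

Reporting the two numbers separately, rather than a single combined error, makes it easier to decide which one matters more for a given purpose.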

Project Documentation & Metadata : Being able to measure and trace the errors involved in each stage of your project is integral to determining the integrity of your data. This is important in all steps of your own processing as well as to those researchers who may use your data in the future. We recommend identifying and documenting possible sources of error early in project planning. Understanding these errors extends beyond knowing the technical specifications of the equipment you are using; it also means maintaining meticulous documentation of each stage in the project, from detailed field notes during collection to consistent lab notes during processing. This should include creating metadata throughout the life cycle of the data. If you are working with data that you did not collect, obtaining the complete history of documentation and metadata is essential to fully understanding the error involved in your final product(s).

See the Archaeology Data Service / Digital Antiquity Guides to Good Practice for more details and specific protocol for fully documenting your project.
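One way to keep error documentation traceable is to record it in a structured form alongside the data at every stage. The sketch below is only an illustration of that idea and does not follow any schema from the Guides to Good Practice: the class names, field names, stage labels and values are all hypothetical, and a real project would adopt the fields its chosen archive requires.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ErrorSource:
    """One documented source of error (all fields are illustrative)."""
    description: str
    magnitude: float
    unit: str


@dataclass
class StageRecord:
    """Notes and known error sources for one stage of the project."""
    stage: str
    operator: str
    notes: str
    error_sources: list = field(default_factory=list)


def undocumented_stages(records):
    """Return the names of stages that list no error sources yet."""
    return [r.stage for r in records if not r.error_sources]


def to_json(records):
    """Serialize the stage records so they can be stored with the data."""
    return json.dumps([asdict(r) for r in records], indent=2)


# Hypothetical project history; every value here is made up.
records = [
    StageRecord(
        stage="collection",
        operator="field team",
        notes="Detailed field notes kept for each setup.",
        error_sources=[
            ErrorSource("instrument specification (from data sheet)", 2.0, "mm"),
        ],
    ),
    StageRecord(
        stage="processing",
        operator="lab",
        notes="Lab notes started; errors not yet assessed.",
    ),
]

print(undocumented_stages(records))
print(to_json(records))
```

A check such as `undocumented_stages` can be run before data is handed on, so that gaps in the documentation are found while the people who did the work are still available.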

[/wptabcontent]

[wptabtitle] PROCESSING GOALS [/wptabtitle] [wptabcontent]

How will your data evolve?

Basic to Advanced Processing : As previously explained, all of the technologies on the GMV involve a basic level of processing. This is considered the very minimum needed to make the data intelligible and usable. Beyond this basic level, there is a huge variety of options for more advanced procedures. Common agendas in more advanced processing often involve deriving useful information, such as measurements and quantitative analyses, from the original data. Additionally, the original data is often used to create or to derive new data, including new spatial and/or semantic information.  In many cases, displaying and visualizing the data is needed in tandem with these processes. All more advanced processes involve some amount of interpretation, whether this interpretation is the result of a software algorithm or of human decisions.

Mapping out the data’s evolution : As early as possible in the planning process, it is highly recommended to identify how your data will evolve from its original structure to its final deliverable format to meet your overall goals. Early considerations that relate to the project’s data begin with the identification of the specific types of data that you will acquire or create and the file types that you will be using throughout your project. Once you are clear on the types of ‘original’ data that your project will utilize, each evolution of this data should be mapped to fully understand the interim products (such as basically processed data that will be imported into a separate software package), the final products (such as the vectorized geometries or semantic databases that will be derived from the original data), as well as the storage/archiving protocols. Understanding these evolutions in the data life cycle before you begin any collection or processing can greatly help you focus on where your time, attention and effort are best applied.

The Guides to Good Practice provide excellent details and discussion on planning for this data evolution.
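The mapping described above can be written down as a simple lineage table in which every product records what it was derived from. The following is a minimal sketch under stated assumptions: the product names, file type labels and the three roles (original, interim, final) are hypothetical placeholders, and each product is assumed to have at most one direct parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataProduct:
    """One planned data product (names and file types are illustrative)."""
    name: str
    file_type: str
    role: str  # "original", "interim", or "final"
    derived_from: Optional[str] = None  # name of the parent product, if any


def trace_lineage(products, name):
    """Return the chain of product names from the original data to `name`."""
    by_name = {p.name: p for p in products}
    chain = []
    seen = set()
    current = name
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in lineage at {current!r}")
        if current not in by_name:
            raise KeyError(f"unknown product {current!r}")
        seen.add(current)
        chain.append(current)
        current = by_name[current].derived_from
    chain.reverse()  # put the original data first
    return chain


# Hypothetical plan: original scans, a processed interim, a vector deliverable.
plan = [
    DataProduct("raw_scan", "vendor scan format", "original"),
    DataProduct("registered_cloud", "point cloud", "interim", "raw_scan"),
    DataProduct("vector_outlines", "vector geometry", "final", "registered_cloud"),
]

print(" -> ".join(trace_lineage(plan, "vector_outlines")))
```

Writing the plan down in this form before collection begins makes it easy to see which interim products must be archived for a final product to remain reproducible.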

[/wptabcontent]

[/wptabs]
