Data basics

Data is defined as facts or information that can be used for reporting, calculations, planning, or analysis. Data can be analyzed and interpreted using statistical procedures to answer “why” or “how.” Data is used to create new information and knowledge, and has the following characteristics:

  • "Disaggregated" collection of observations with one or more characteristics
  • Generally requires manipulation or extraction using utilities
  • Can be values or observations of characteristics
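
As a small illustration of these characteristics, the Python sketch below treats each observation as a record with several characteristics, then extracts and aggregates one of them. The records, field names, and the `average_age` helper are invented for illustration only and do not come from any real dataset.

```python
from statistics import mean

# Hypothetical disaggregated data: one record per observation,
# each with several characteristics (all values are made up).
observations = [
    {"respondent": 1, "age": 34, "city": "Houston"},
    {"respondent": 2, "age": 27, "city": "Austin"},
    {"respondent": 3, "age": 41, "city": "Houston"},
]


def average_age(records, city):
    """Extract the ages recorded for one city and aggregate them into a mean."""
    ages = [r["age"] for r in records if r["city"] == city]
    return mean(ages)


houston_mean = average_age(observations, "Houston")
print(houston_mean)
```

The individual records are the data; the mean is new information produced by manipulating them.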

Qualitative data describes qualities or characteristics of something. It is non-numerical and often collected through interviews, participant observation, and focus groups. It can be subjective and typically describes a perception or point of view. It is particularly useful for gaining cultural insight into social contexts and beliefs of a particular population. Qualitative data can take the form of field notes, audio, transcripts, and video.

Quantitative data attempts to answer a question with numbers. It is numerical and is often collected through measurements, surveys, and observations. Quantitative data is usually analyzed in programs such as Excel, R, SPSS, and Stata.

Open data and content can be freely used, modified, and shared by anyone for any purpose. Open data should:

  • have no monetary cost associated with use. That is, open data should be free.
  • be usable by as many people or organizations as possible. It must be available in a machine-readable format that computers can easily access and process.
  • be available for commercial and non-commercial purposes and for combining with other datasets.
  • require no login or personal account to access.
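
To make "machine-readable" concrete, the sketch below parses a small CSV file with Python's standard `csv` module. The dataset contents, the column names, and the `read_rows` helper are hypothetical; a real open dataset would typically be downloaded as a file rather than written inline.

```python
import csv
import io

# Hypothetical contents of an open dataset published as CSV (values are made up).
csv_text = "station,year,rainfall_mm\nA,2020,812\nB,2020,640\n"


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(csv_text)

# Because the format is structured, values can be processed without retyping them.
total_rainfall = sum(int(row["rainfall_mm"]) for row in rows)
print(total_rainfall)
```

The same figures locked inside a scanned image or a PDF table would have to be transcribed by hand before any such calculation could run.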

For more information about open data and examples of open data, see unlocking the power of open data.


Proprietary data are generally documented in contracts and legally should not be published or disclosed to outside entities. Proprietary data may be protected under copyright, patent, or trade secret laws. Examples of proprietary data include:

  • financial data
  • product research and development
  • computer software
  • business processes and marketing strategies

Data from library subscription databases are proprietary data. Using data from library databases requires authentication, and the data generally cannot be shared freely on the web or with people outside of the university.

Restricted-use data contain sensitive information (i.e., information that can cause potential harm if revealed) or information that enables the potential identification of respondents. Data may also be restricted-use because of confidentiality promises or proprietary status.

Examples of sensitive information are reports of sexual behavior, criminal history, drug use, mental health history, HIV status, information collected from minors, or other materials that warrant extra discretion.

Such data nevertheless offer considerable potential for research. The government and the University therefore want to ensure that restricted data are handled in a way that safeguards respondents and research subjects while still allowing research that benefits society as a whole. Files containing confidential information are available to researchers only under certain conditions and agreements. Standard requirements may include the following:

  • Your institution must classify you as a Principal Investigator (PI), eligible to lead a research project.
  • You must provide proof of Institutional Review Board (IRB) review. For more information about IRB, contact the UH IRB office.