One Guide to Rule Them All: Research Station

This guide was created as a way for all students at Ferrum College to get basic 'help' with research, tutorials and quick tip suggestions.

Welcome! Let us help you find data sets!

This guide is designed to help you find raw data sets for use in data analysis projects such as t-tests, regression analyses, and Pearson r analyses. This guide can help you find data from which to make statistics, but not statistics themselves.

Here you'll find:

  • Data by Topic: If you're interested in finding raw data about a research topic and analyzing it yourself, this is your spot. 
  • Search Data Repositories: Don't like any of the suggested data sources? Check the search options located here. 
  • Citing Data: No matter what you do with the data, you must cite it, and that is tricky. Check here for some guidelines. 
  • Data Tools: Data cleaning, open source analysis tools, and data visualization tools can be found here. 

What is the difference between Data and Statistics?

In everyday conversation, these words often mean the same thing. In class research, however, they differ in one major way: data is the "raw" material, the information from which statistics are created. Statistics interpret and summarize that data, turning it into something you can easily use in a paper or elsewhere.

Statistics

  • Statistical tables, charts, and graphs
  • Reported numbers and percentages in an article

If you’re looking for a quick number, you want a statistic. A statistic will answer “how much” or “how many.” Statistics are the results of data analysis, and they usually come in the form of a table or chart. This is what a statistical table looks like:

Table 1206. Adult Attendance at Sports Events by Frequency: 2007

Source: Statistical Abstract of the United States

Data

  • Datasets
  • Machine-readable data files, data files for statistical software programs

If you want to understand a phenomenon, you want data. Data can be analyzed and interpreted using statistical procedures to answer “why” or “how.” Data is used to create new information and knowledge.

Raw data is the direct result of research that was conducted as part of a study or survey. It is a primary source. It usually comes in the form of a digital data set that can be analyzed using software such as Excel, SPSS, SAS, and so on. This is what a data set looks like:

Dataset example: each cell in the spreadsheet represents an individual response to survey questions
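
To make the distinction concrete, here is a minimal Python sketch. The survey rows are invented purely for illustration (they are not taken from any real dataset), and the field names are assumptions. The list of rows plays the role of the raw data; the summary computed from it plays the role of the statistics.

```python
from statistics import mean

# Hypothetical raw data: one row per survey respondent (made up for illustration).
responses = [
    {"respondent": 1, "age": 19, "games_attended": 0},
    {"respondent": 2, "age": 22, "games_attended": 3},
    {"respondent": 3, "age": 20, "games_attended": 1},
    {"respondent": 4, "age": 21, "games_attended": 0},
    {"respondent": 5, "age": 23, "games_attended": 6},
]


def summarize(rows):
    """Turn raw rows (data) into summary numbers (statistics)."""
    attended = [row["games_attended"] for row in rows]
    # Share of respondents who attended at least one game.
    share_attending = sum(1 for n in attended if n > 0) / len(attended)
    return {
        "respondents": len(rows),
        "mean_games_attended": mean(attended),
        "percent_attending_any": round(100 * share_attending, 1),
    }


summary = summarize(responses)
print(summary)
```

The printed summary is the kind of number you would quote in a paper; the rows above it are what you would need in order to run your own t-test or regression.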


Find Your Own Data Sets

The best starting point for finding data repositories is the Registry of Research Data Repositories, where you can find repositories by subject, content type, and location. Search the registry with broad terms; once you find a repository you like, you can then search within it for specific data. Add the "Open Access" limiter to make sure you have access to the data and can use it how you choose.

If that's a little overwhelming, here are some repositories we love. 


Citing Data

Properly citing data gives data creators proper credit for their work, aids replication, provides permanent and reliable information about the data source, helps track the impact of the data, and facilitates resource discovery and access.

Citing Data From Others

In many cases, a data provider will include a recommended citation format (e.g. the U.S. Census, OECD, ICPSR, the Roper Center, and the Social Science Electronic Data Library). The recommended citation may come with the dataset or appear elsewhere on the provider's website. Also note that the producers of a particular dataset may request that users cite a publication in which the data are described, rather than the dataset itself (e.g. the Database of Political Institutions).

When a data provider does not recommend a citation format, we recommend these general citation guidelines:

  1. Author/Principal Investigator
  2. Year of Publication
  3. Title of the Data Source
  4. Edition/Version Number
  5. Format of the Data Source (e.g. [Computer File], [CD-ROM], [Online], etc.)
  6. Producer of the Data Source
  7. Distributor of the Data Source
  8. Identifier or permanent URL for the Data Source
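
As a rough illustration of how these eight elements fit together, the Python sketch below joins whichever elements are available, in the order listed above. The `DataCitation` class and `format_generic` function are hypothetical names invented for this example, and the output is a generic arrangement, not any official style; always check the result against your style guide. The sample values come from the Milberger dataset cited in the Chicago examples in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """The eight general citation elements; only the first three are required here."""
    author: str
    year: str
    title: str
    version: Optional[str] = None
    data_format: Optional[str] = None
    producer: Optional[str] = None
    distributor: Optional[str] = None
    identifier: Optional[str] = None


def format_generic(c: DataCitation) -> str:
    """Join the available elements in guideline order, skipping any that are missing."""
    fields = [
        c.author,
        c.year,
        c.title,
        c.version,
        f"[{c.data_format}]" if c.data_format else None,
        f"{c.producer} [producer]" if c.producer else None,
        f"{c.distributor} [distributor]" if c.distributor else None,
        c.identifier,
    ]
    # Strip trailing periods so the separator does not produce "..".
    return ". ".join(f.rstrip(".") for f in fields if f) + "."


citation = format_generic(DataCitation(
    author="Milberger, Sharon",
    year="2002",
    title="Evaluation of Violence Against Women With Physical Disabilities in Michigan, 2000-2001",
    version="ICPSR version",
    producer="Detroit: Wayne State University",
    distributor="Ann Arbor, MI: Inter-University Consortium for Political and Social Research",
    identifier="doi:10.3886/ICPSR03414",
))
print(citation)
```

Because no format element was supplied in this example, that slot is simply skipped rather than left blank.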

Check out the library's Citation Guide for help with the various styles used on campus.


Citation Examples Using General Guidelines

Use the recommended citation for the data set if one is provided either with the dataset or on the publisher's website (e.g. terms and conditions, frequently asked questions, etc.).

If there is no recommended citation and your style guide does not give specific requirements for data or other source types, adapt the style guide's format for books, which is generally treated as the generic format.

APA (6th Edition, p. 211)

Pew Hispanic Center. (2004). Changing channels and crisscrossing cultures: A survey of Latinos on the news media [Data file and code book].
Retrieved from http://pewhispanic.org/datasets/

MLA (7th Edition)

Smith, Tom W., Peter V. Marsden, and Michael Hout. General Social Survey, 1972-2010 Cumulative File. ICPSR31521-v1. Chicago, IL: National Opinion Research Center [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011. Web. 23 Jan 2012. doi:10.3886/ICPSR31521.v1

APSA (Revised 2006, p. 30)

Purdue University. 2007. Controversial Facilities in Japan, 1955-1995 [computer file] (Study #4725). ICPSR04725-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2007. doi:10.3886/ICPSR04725.

NLM (2nd Edition)

Entrez Genome [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information. [date unknown]. Haloarcula marismortui ATCC 43049 plasmid pNG200, complete sequence; [cited 2007 Feb 27]. Available from: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genome&cmd=Retrieve&dopt=Overview&list_uids=18013

Chicago (16th Edition, p. 693)

Bibliography style (based on documentation for books):

Milberger, Sharon. Evaluation of Violence Against Women With Physical Disabilities in Michigan, 2000-2001. ICPSR version. Detroit: Wayne State University, 2002. Distributed by Ann Arbor, MI: Inter-University Consortium for Political and Social Research, 2002. doi:10.3886/ICPSR03414.

Author-Date style:

Milberger, Sharon. 2002. Evaluation of Violence Against Women With Physical Disabilities in Michigan, 2000-2001. ICPSR version. Detroit: Wayne State University. Distributed by Ann Arbor, MI: Inter-University Consortium for Political and Social Research. doi:10.3886/ICPSR03414.

ACS

SciFinder, web; Wiley Subscription Services, Inc, 2014; RN 50-78-2.

GSA

Whitlow, J. W., 1969. Sample: AAM367, USGS National Geochemical Database. URL: http://mrdata.usgs.gov/ngdb/rock/show-ngdbrock.php?lab_id=AAM367. Accessed June 24, 2014.