Entering Data in a Spreadsheet:
Once your data are collected, they should be organized in a spreadsheet. As a rule, each case (individual observation) is entered as a single row, while each variable that you measured is entered in its own column. Columns and rows should be labeled.
Consider an example with three variables: 1) Survival Time, 2) Length, and 3) Number of Offspring. Each variable is entered into its own column, and the units are labeled. Individual cases, also referred to as samples, are entered into the rows. Here, each of the five individuals was assigned a number (1019, 1020...), which serves as the row label. The data are then entered into the individual cells with the number of decimal places appropriate to the precision of the measurements. Both 'survival time' and 'length' are continuous variables (variables that can assume any of the infinitely many values in an interval of the real line), while 'number of offspring' is a discrete variable (a variable that can assume only a countable number of distinct values).
Ideally, no cells should be left blank. Occasionally, however, data are missing, and those cells must be marked as such. Statistical software packages differ in how they handle missing values. Before entering missing values (as *, NA, or blank), check the package you will be using to determine how best to enter missing data.
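The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the IDs 1019 and 1020 come from the text, but the third ID, all measured values, the units (days, mm), the short column names, and the choice of `NA` as the missing-value code are hypothetical.

```python
import csv
import io

# One column per variable, with units in the label (units are assumed).
HEADER = ["id", "survival_days", "length_mm", "n_offspring"]

# One row per case. All measurements are made-up illustration values;
# None marks a measurement that is missing.
ROWS = [
    [1019, 12.5, 3.41, 4],
    [1020, 9.5, 2.98, None],   # offspring count missing
    [1021, None, 3.12, 0],     # survival time missing
]

MISSING = "NA"  # assumed code; check what your statistics package expects


def write_table(rows, missing=MISSING):
    """Serialize the rows to CSV text, writing the missing code for None."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(HEADER)
    for row in rows:
        writer.writerow([missing if value is None else value for value in row])
    return buf.getvalue()


def _parse(value, cast, missing):
    """Convert one cell, mapping the missing code back to None."""
    return None if value == missing else cast(value)


def read_table(text, missing=MISSING):
    """Parse CSV text into one dict per case."""
    records = []
    for rec in csv.DictReader(io.StringIO(text)):
        records.append({
            "id": int(rec["id"]),
            # continuous variables are read as real numbers
            "survival_days": _parse(rec["survival_days"], float, missing),
            "length_mm": _parse(rec["length_mm"], float, missing),
            # the discrete variable is read as an integer
            "n_offspring": _parse(rec["n_offspring"], int, missing),
        })
    return records


table_text = write_table(ROWS)
records = read_table(table_text)
```

Keeping the missing-value code in one named constant makes it easy to switch to `*` or an empty string if your package requires a different marker.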
Metadata provide a description of your data structure. What is the filename? Who entered the data? How were the data checked for accuracy during data entry? What do the rows and columns signify? What do variable names (usually 8 or fewer characters) actually mean? In what format are your data entered (e.g., integers, real numbers, alphanumeric characters)? When were your data collected? How? What are their precision and accuracy? (Be sure that you know the difference.) Metadata include any additional information that another user might find helpful if they were to access your data from a databank and then try to re-analyze them.
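One possible way to record the answers to these questions is a small metadata file stored alongside the data. The sketch below is not a standard format; every field name and every value in it is a hypothetical placeholder chosen for illustration.

```python
import json

# All names and values here are placeholders, not taken from a real data set.
metadata = {
    "filename": "example_data.csv",
    "entered_by": "name of the person who entered the data",
    "entry_check": "how the entries were verified against the original records",
    "rows": "one individual per row, labeled by ID number",
    "columns": "one measured variable per column",
    "collected": "when and how the data were collected",
    "missing_value_code": "NA",
    # Short variable names (8 or fewer characters) mapped to their meaning.
    "variables": {
        "survtime": {"meaning": "survival time", "units": "days", "format": "real"},
        "length": {"meaning": "body length", "units": "mm", "format": "real"},
        "noffspr": {"meaning": "number of offspring", "units": "count", "format": "integer"},
    },
}


def undocumented(columns, meta):
    """Return the column names that have no entry in the metadata."""
    return [name for name in columns if name not in meta["variables"]]


# Serialize to JSON text so the description can be saved next to the data file.
metadata_text = json.dumps(metadata, indent=2)
```

A check such as `undocumented(["survtime", "length", "weight"], metadata)` lists any column that a later user would not be able to interpret from the metadata alone.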