SPSS Basics - Part 1

Creating, Importing, Reading and Validating SPSS Data Files

 

March 31, 2009

April 2, 2009

 

Draft

March 30, 2009

 

 

This document and related materials will be available at http://www.lehman.edu/faculty/john/spss/

 



1.      About SPSS

a.       Comprehensive collection of tools for data analysis, reporting, and data manipulation
 

b.      SPSS workshop series and format

c.       Available versions

d.      Licensing at Lehman

e.       Locations where SPSS is installed




2.      An example of data analysis using SPSS and NORC’s 2008 General Social Survey (GSS)

a.       For further information on the GSS including a detailed codebook, visit http://www.norc.org

b.      Starting SPSS for Windows

c.       SPSS interface (menus, dialog boxes, and other standard Windows features)

d.      HELP!
 

e.       Opening SPSS window for data entry and display – Data Editor

f.       Opening an existing SPSS-format data file (“system file”)

g.      .sav file extension for SPSS data files

h.      Data View/Variable View

i.        Basic structure of an SPSS-format data file (units of analysis, cases, or observations as rows; columns as variables or measures; cells contain values)


 

 

j.        Subset of variables from the 2008 General Social Survey

 

1)                        age

2)                        attend (religious services)

3)                        class

4)                        confinan (confidence in banks/financial institutions)

5)                        consci

6)                        degree

7)                        educ

8)                        fatalism

9)                        finrela

10)                    geomobil

11)                    getahead

12)                    happy

13)                    health

14)                    income

15)                    intecon (interest in economics)

16)                    life (exciting/dull)

17)                    marital

18)                    newsfrom

19)                    partyid

20)                    polviews

21)                    pres04

22)                    racecen1

23)                    region

24)                    relig

25)                    satfin

26)                    sex

27)                    trust

28)                    vote04

29)                    wrkstat


 

 

 

k.      Running statistical procedures (Frequencies, Descriptives, Crosstabs, Means) to answer questions about the data

l.        SPSS window for output – Output/viewer window

m.    Reviewing results

n.      Saving output

o.      .spv file extension for output



3.      Creating your own SPSS data files in the Data Editor

 

a.       The following hypothetical height/weight data will be used:

 

ID    GENDER    HEIGHT      WEIGHT
                (inches)    (pounds)

1     M         70          155
2     F         61
3     F         64          125
4     M                     175
5     M         72          180
6     M         69          170
7     F         65          115
8     M         77          200
9     F         68          140
10    M         70

 

Note that some measurements are missing.                                 


 

 

b.      Define variables in Data Editor-Variable View

 

Name             ID                     GENDER      HEIGHT            WEIGHT
Type             Numeric                String      Numeric           Numeric
Width                                   1
Decimals         0                                  0
Variable labels  Identification Number              Height in inches  Weight in lbs
Value labels                            M Male
                                        F Female
Missing values                                      99                999

 

 

c.       Enter the following data in the Data Editor - Data View

 

ID    GENDER    HEIGHT    WEIGHT

1     M         70        155
2     F         61        999
3     f         64        125
4     M         99        175
5               72        180
6     m         69        170
7     F         65        115
8     M         775       200
9     F         68        140
10    M         70

 

 

d.      Structure of an SPSS data file – rectangular array or matrix with

Rows as cases (Units of analysis, observations)

Columns as variables

Values for particular cases on particular variables in cells at row-column intersection

 

e.       System missing versus user-defined missing values

 

f.       Saving your data file (.sav file extension)



 

4.      Data validation

a.       Running procedures to validate coding and data entry

b.      Frequencies

c.       Crosstabulation for contingent questions (not applicable here)
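The idea behind validating with Frequencies can be sketched outside SPSS. The Python fragment below is only an illustration, not part of the workshop materials: it tallies the GENDER values entered in step 3c and flags HEIGHT values that are neither the user-defined missing code (99) nor inside a plausible range. The range of 48 to 90 inches and all variable names are assumptions made for this sketch.

```python
from collections import Counter

# Rows as entered in step 3c: (ID, GENDER, HEIGHT, WEIGHT).
# None stands in for a numeric cell that was left blank.
rows = [
    (1, "M", 70, 155),
    (2, "F", 61, 999),
    (3, "f", 64, 125),
    (4, "M", 99, 175),
    (5, "", 72, 180),
    (6, "m", 69, 170),
    (7, "F", 65, 115),
    (8, "M", 775, 200),
    (9, "F", 68, 140),
    (10, "M", 70, None),
]

VALID_GENDER = {"M", "F"}   # codes defined under value labels in step 3b
HEIGHT_MISSING = 99         # user-defined missing code from step 3b
HEIGHT_RANGE = (48, 90)     # assumed plausible range in inches (not from the handout)

# A frequency table of GENDER exposes miscoded values such as "f", "m", and blank.
gender_freq = Counter(row[1] for row in rows)

# IDs of cases whose GENDER code is not one of the defined values.
bad_gender_ids = [row[0] for row in rows if row[1] not in VALID_GENDER]

# IDs of cases whose HEIGHT is present, is not the missing code, and is out of range.
bad_height_ids = [
    row[0]
    for row in rows
    if row[2] is not None
    and row[2] != HEIGHT_MISSING
    and not (HEIGHT_RANGE[0] <= row[2] <= HEIGHT_RANGE[1])
]

for value, count in sorted(gender_freq.items()):
    print(repr(value), count)
print(bad_gender_ids)   # [3, 5, 6]
print(bad_height_ids)   # [8]
```

The frequency table shows five distinct GENDER values where only two are defined, which is the kind of discrepancy the Frequencies procedure is meant to reveal.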

 

 

 

5.      Creating SPSS data files from other data sources

a.       Importing data from Excel

b.      Variable names in first row

c.       Variable type and potential for error

d.      Other sources: Access, SAS, etc.




6.      Creating SPSS data files from “ASCII” text files

a.       Common non-SPSS formats: delimited or fixed format plain text (“ASCII”) files

b.      File extensions for text files: .dat, .txt

c.       Notepad to view text files

In ASCII files, a line is generally referred to as a RECORD. The columns assigned to a variable are collectively referred to as a FIELD. HEALTH.DAT is an example of a fixed format ASCII file, since the same information is coded in the same location for every case. AGE, for example, is always found in columns 11-12 of the first and only record of a case. (Column positions may appear distorted when using proportional fonts in Word or other word processors.)





The use of the term column when describing the layout of an ASCII text file is different from the use of the term column when describing the contents of the Data Editor. A variable occupies a column in the Data Editor; a character occupies a column in an ASCII data file. The ASCII text file is not an SPSS data file.

 

 

 

 

d.      Record layout ("Codebook") for  HEALTH.DAT


 

description                variable    record    columns

Identification number      ID          1         1-2
Systolic blood pressure    SBP         1         3-5
Quetelet index*            QUET        1         6-10
Age in years               AGE         1         11-12
    98 = 98 or more
    99 = missing
Respondent smokes          SMK         1         13
    0 = no
    1 = yes

 

*Quetelet Index (a measure of size) = 100 * (weight/height**2)
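As a quick check on the formula, the Python sketch below computes the index for a single person. It assumes weight in pounds and height in inches (the handout does not state the units), and it borrows the hypothetical case 1 from the height/weight example in section 3 rather than anything in HEALTH.DAT. The function name is made up for this illustration.

```python
def quetelet_index(weight, height):
    """Return 100 * (weight / height**2), the formula given in the codebook note."""
    return 100 * (weight / height ** 2)

# Hypothetical case 1 from section 3: 155 pounds, 70 inches (units assumed).
q = quetelet_index(155, 70)
print(round(q, 3))   # 3.163
```

The result is of the same order as the QUET values in the data listing, which run from roughly 2.4 to 4.6.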

 


 

Fixed format file

 

011352.876450

021223.251410

031303.100490

041483.768520

051462.979541

061292.790471

071623.668601

081603.612481

091442.368441

101804.637641

111663.877591

121384.032511

131524.116990

141383.673560

151403.562541

161342.998501

171453.360491

18   3.024461

191353.171570

201423.401560

211503.628561

221443.751580

231373.296530

241323.210500

251493.301541

261323.017481

271202.789430

281262.956431

291613.800630

301704.132631

311523.962620

321644.010650

 

 

Source

Kleinbaum, David G. and Kupper, Lawrence L. (1978). Applied Regression Analysis and Other Multivariable Methods. Boston, Massachusetts: Duxbury Press. (p. 60)
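To make the idea of a record layout concrete, the sketch below slices a few of the fixed format records above at the column positions given in the codebook. It is written in Python purely as an illustration of how columns map to fields; it is not how SPSS reads the file, and the names LAYOUT and parse_record are hypothetical. Blank fields and the AGE code 99 are both turned into None here, which is a simplification of how SPSS distinguishes system-missing from user-defined missing values.

```python
# Column ranges from the record layout for HEALTH.DAT, 1-based and inclusive.
LAYOUT = {
    "ID": (1, 2),
    "SBP": (3, 5),
    "QUET": (6, 10),
    "AGE": (11, 12),
    "SMK": (13, 13),
}

def parse_record(line):
    """Slice one fixed format record into a dict of field values."""
    case = {}
    for name, (start, end) in LAYOUT.items():
        text = line[start - 1:end].strip()   # convert 1-based columns to a slice
        if text == "":
            case[name] = None                # blank field: no value was recorded
        elif name == "QUET":
            case[name] = float(text)         # QUET carries an explicit decimal point
        else:
            case[name] = int(text)
    if case["AGE"] == 99:                    # 99 is the codebook's missing code for AGE
        case["AGE"] = None
    return case

# Three records copied from the listing above (cases 1, 13, and 18).
records = ["011352.876450", "131524.116990", "18   3.024461"]
cases = [parse_record(record) for record in records]
for case in cases:
    print(case)
```

Case 13 shows the user-defined missing code for AGE, and case 18 shows a blank SBP field.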


 

Comma delimited format

 

1,135,2.876,45,0

2,122,3.251,41,0

3,130,3.1,49,0

4,148,3.768,52,0

5,146,2.979,54,1

6,129,2.79,47,1

7,162,3.668,60,1

8,160,3.612,48,1

9,144,2.368,44,1

10,180,4.637,64,1

11,166,3.877,59,1

12,138,4.032,51,1

13,152,4.116,99,0

14,138,3.673,56,0

15,140,3.562,54,1

16,134,2.998,50,1

17,145,3.36,49,1

18,,3.024,46,1

19,135,3.171,57,0

20,142,3.401,56,0

21,150,3.628,56,1

22,144,3.751,58,0

23,137,3.296,53,0

24,132,3.21,50,0

25,149,3.301,54,1

26,132,3.017,48,1

27,120,2.789,43,0

28,126,2.956,43,1

29,161,3.8,63,0

30,170,4.132,63,1

31,152,3.962,62,0

32,164,4.01,65,0
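In the comma delimited version, fields are located by their order on the line rather than by column position, and a missing value appears as two adjacent commas. The Python sketch below illustrates this with three lines copied from the listing above. It is an illustration only, not the Text Import Wizard's procedure; the variable order is taken from the record layout for HEALTH.DAT, and the names NAMES and to_number are hypothetical.

```python
import csv
import io

# Variable order assumed to match the record layout for HEALTH.DAT.
NAMES = ["ID", "SBP", "QUET", "AGE", "SMK"]

# Lines for cases 1, 13, and 18, copied from the listing above.
text = "1,135,2.876,45,0\n13,152,4.116,99,0\n18,,3.024,46,1\n"

def to_number(name, value):
    """Convert one field to a number; an empty field becomes None."""
    if value == "":
        return None
    return float(value) if name == "QUET" else int(value)

cases = []
for fields in csv.reader(io.StringIO(text)):
    cases.append({name: to_number(name, value) for name, value in zip(NAMES, fields)})

for case in cases:
    print(case)
```

Note that AGE for case 13 is still 99 after reading: a delimited file carries no information about missing-value codes, so they have to be declared separately after import.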

 



e.       Read an ASCII file using the Text Import Wizard



7.      Using command syntax to read a file



8.      More complex input file structures

a.       multiple lines per case

b.      hierarchical files (e.g. household record, followed by one record per member of household)

 

c.       different record types (e.g. personal data record, course records, financial data record)

d.      varying numbers of measures per unit




9.      Obtaining and using existing SPSS data files

a.       Example of GSS 2008

b.      Other data archive sites

http://www.icpsr.umich.edu

Contact William Bosworth (william.bosworth@lehman.cuny.edu) for further information





 

 

10.  Learning more about SPSS

a.       http://www.spss.com

b.      Manuals in pdf format provided with license

c.       Help > Tutorial etc.

d.      Academic web sites, e.g.

http://www.usc.edu/its/stats/spss/index.html
