An Example

Here is an example of probit analysis so you can see how all these elements go together.

This experiment was done to find the detection threshold for image compression. An image was compressed at five different levels. The compression, JPEG, is lossy, meaning that information in the image is lost when the image is compressed. This may affect the appearance of the image. For a particular image, a method of constant stimuli (MCS) experiment was performed to find the level of compression that produces a just noticeable change in the image. The MCS was implemented with a forced-choice procedure and analyzed using probit analysis in SAS.

Each compressed image is presented with the original to each subject twice, so each subject has 10 trials. The stimulus pairs were constructed before the experiment. The placement of the images within each pair was random, and the pairs were presented in a fixed random order. If you were doing this on a computer, you could use different random placements and orders for each subject.

The following shows the subjects' instructions:

 Instructions: There are 10 sample pairs. In each pair of images, one is an original and the other has been compressed to some degree using a JPEG algorithm. Choose which image (A or B) is the original and record your response on the attached response form.

The following shows the response form that the subject uses to record his responses and the scoring key that the experimenter uses to score them.

 Response Form          Scoring Key
 PAIR    CHOICE         PAIR    A   B   Correct
 1-1     A  B           1-1     3   0   B
 1-2     A  B           1-2     0   5   A
 1-3     A  B           1-3     5   0   B
 1-4     A  B           1-4     4   0   B
 1-5     A  B           1-5     0   1   A
 1-6     A  B           1-6     0   2   A
 1-7     A  B           1-7     2   0   B
 1-8     A  B           1-8     0   4   A
 1-9     A  B           1-9     1   0   B
 1-10    A  B           1-10    0   3   A

 Image           QF     CR      Size (kb)
 0 (original)    NA     1.0     432
 1               1      3.0     142
 2               14     9.0     48
 3               29     13.9    31
 4               44     17.9    24
 5               114    31.9    13

The bottom table shows the identity of each image and the amount of compression. Image 0 is the original. The QF is a number that was taken from the software that does the compression. It assigns a JPEG Quality Factor to the image for different levels of compression. A lower QF corresponds to less compression. We will use this as a stimulus strength variable. We will also use the compression ratio (CR) as a value for the magnitude of the stimuli. The compression ratio shows the ratio of file size for the original compared to the compressed version.
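As a rough check on the compression ratio definition, the short Python sketch below divides the original file size by each compressed file size from the table. The sizes in the table are rounded to whole kilobytes, so the computed ratios only approximate the listed CR values, most visibly for the smallest files. The function and variable names are illustrative.

```python
# File sizes in kb, copied from the image table (image 0 is the original).
sizes_kb = {0: 432, 1: 142, 2: 48, 3: 31, 4: 24, 5: 13}
# Compression ratios as listed in the table, for comparison.
table_cr = {1: 3.0, 2: 9.0, 3: 13.9, 4: 17.9, 5: 31.9}


def compression_ratio(original_kb, compressed_kb):
    """Original file size divided by compressed file size."""
    return original_kb / compressed_kb


computed_cr = {
    img: compression_ratio(sizes_kb[0], kb)
    for img, kb in sizes_kb.items()
    if img != 0
}

for img in sorted(computed_cr):
    print(f"image {img}: computed {computed_cr[img]:.1f}, table {table_cr[img]}")
```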

So for stimulus 1-5, for example, the left image (A) is the original and the right image (B) is image 1, which has QF = 1 and CR = 3. If the subject could tell them apart, he would have picked A as the original, which is the response marked correct in the scoring key. If the subject could not see a difference, he would have had a 50% chance of picking A or B. For each subject, you go through and tabulate the results, and then count the number of correct responses from all the subjects for each compressed image. Sample pairs 1-4 and 1-8 contain the same images but in different positions, so the experimenter has to keep track of which sample, A or B, was the compressed image in each pair.
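The scoring step can be sketched in a few lines of Python. The scoring key below is copied from the response form, and a response counts as correct when it matches the position of the original, as in the key's "Correct" column. The responses of the example subject are invented purely for illustration, and the function name is hypothetical.

```python
# Scoring key from the response form: pair -> (image shown as A, image shown as B).
# Image 0 is the original; images 1-5 are the compressed versions.
scoring_key = {
    "1-1": (3, 0), "1-2": (0, 5), "1-3": (5, 0), "1-4": (4, 0), "1-5": (0, 1),
    "1-6": (0, 2), "1-7": (2, 0), "1-8": (0, 4), "1-9": (1, 0), "1-10": (0, 3),
}


def score_subject(responses):
    """Return {compressed image: number of correct choices} for one subject."""
    correct = {img: 0 for img in range(1, 6)}
    for pair, choice in responses.items():
        image_a, image_b = scoring_key[pair]
        original_position = "A" if image_a == 0 else "B"
        compressed_image = image_b if image_a == 0 else image_a
        if choice == original_position:
            correct[compressed_image] += 1
    return correct


# Hypothetical responses from one subject (invented for illustration only).
example_responses = {
    "1-1": "B", "1-2": "A", "1-3": "B", "1-4": "B", "1-5": "B",
    "1-6": "A", "1-7": "A", "1-8": "A", "1-9": "B", "1-10": "A",
}
subject_tally = score_subject(example_responses)
print(subject_tally)
```

Summing these per-image counts over all subjects gives the number correct for each compressed image.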

Eighteen observers took part in the experiment. This means that each compressed image was compared with the original 36 times (2x per subject x 18 subjects). The raw data look like this:

 JPEG Experiment Tally Sheet
 Image    QF     CR      # Correct    # Observations
 1        1      3.0     20           36
 2        14     9.0     25           36
 3        29     13.9    30           36
 4        44     17.9    31           36
 5        114    31.9    36           36

These are the data used in the probit analysis. There are two physical measures of stimulus magnitude (the independent variable), QF and CR. A probit analysis will be performed with each.
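Before fitting anything, it can help to look at the observed proportion correct at each level, which is just the number correct divided by the number of observations. The sketch below uses the tally sheet values as given; the variable names are illustrative.

```python
# Tally sheet rows: (QF, CR, number correct, number of observations).
tally = [
    (1, 3.0, 20, 36),
    (14, 9.0, 25, 36),
    (29, 13.9, 30, 36),
    (44, 17.9, 31, 36),
    (114, 31.9, 36, 36),
]

# Observed proportion correct for each compressed image.
observed = [(qf, cr, correct / total) for qf, cr, correct, total in tally]

for qf, cr, p in observed:
    print(f"QF={qf:>3}  CR={cr:>4}  proportion correct={p:.3f}")
```

In a two-alternative forced-choice task, guessing alone gives about 0.5, so the proportions rise from a little above chance at the lowest compression to 1.0 at the highest.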

Here is one way to perform the probit analysis using SAS. First you need to create a text file. We will call the text file whatever.sas. Here is the text file for this example.

 /* Format line */
 OPTION LINESIZE=72 PAGESIZE=66;
 /* COLOR is the data set name */
 DATA COLOR;
      /* QFACTOR, OBSCOR, TOTALOBS are variable names referring to the three columns of data */
      INPUT QFACTOR OBSCOR TOTALOBS;
      /* PHAT: observed probabilities, calculated with this equation */
      PHAT=OBSCOR / TOTALOBS;
      OUTPUT;
      /* The data follow CARDS; don't forget the semicolon on the line after the data */
      CARDS;
 1 20 36
 14 25 36
 29 30 36
 44 31 36
 114 36 36
 ;
 /* Specify PROBIT procedure; omit "C=0.5" if not forced choice */
 PROC PROBIT C=0.5;
 MODEL OBSCOR/TOTALOBS=QFACTOR / LACKFIT INVERSECL ITPRINT;
 /* Specify output */
 OUTPUT OUT=B P=PROB STD=STD1 XBETA=XBETA;
 TITLE 'OUTPUT FROM EDM PROBIT';
 RUN;
 /* Specify plot */
 PROC PLOT;
      PLOT PHAT*QFACTOR='X' PROB*QFACTOR='P' / OVERLAY;
      TITLE 'OUTPUT PLOTS FROM EDM EXPERIMENT';
 RUN;

I am not interested in you (or me) becoming a SAS expert, so you can use this as a template for your analyses. Just replace the names and data with your own. Take out the C=0.5 option if the experiment is not forced choice, or change its value if it is not 2AFC. Basically, this file tells SAS what the data are, how to calculate the observed probability, what procedure to run (probit), and how to display the results.
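To see what the C=0.5 setting is doing, it may help to write the model out. In PROC PROBIT, C is the natural (threshold) response rate, so the probability of a correct response is modeled as C plus (1 - C) times the cumulative normal. The Python sketch below is only an illustration of that formula, not SAS code: the mu and sigma values are made up, and using C = 1/3 for a three-alternative task is an assumption that chance performance is one in three.

```python
from statistics import NormalDist


def p_correct(x, mu, sigma, c=0.5):
    """Probability of a correct response at stimulus level x.

    c is the guessing rate; the remaining (1 - c) is scaled by the
    cumulative normal tolerance distribution with mean mu and sd sigma.
    """
    return c + (1.0 - c) * NormalDist(mu, sigma).cdf(x)


# mu and sigma here are made-up values, used only to show the shape.
mu_demo, sigma_demo = 10.0, 5.0

at_threshold_2afc = p_correct(mu_demo, mu_demo, sigma_demo, c=0.5)
at_threshold_3afc = p_correct(mu_demo, mu_demo, sigma_demo, c=1 / 3)
far_below = p_correct(-100.0, mu_demo, sigma_demo, c=0.5)

print(at_threshold_2afc, at_threshold_3afc, far_below)
```

With C = 0.5, a stimulus at the threshold mu is answered correctly 75% of the time, and a stimulus far below threshold falls back to the 50% guessing rate.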

Upload this text file to the VAX if you didn't create it there. (I use "Fetch" on a Macintosh.) You can put it in your main directory or make a new one. To execute this file type: "SAS WHATEVER". Remember WHATEVER.SAS is the name of the text file. SAS runs the procedure and creates two files: WHATEVER.LOG and WHATEVER.LIS. The LOG file will contain any errors if you find you have a problem. The LIS file contains your results. Download this file and then you can look at it in an editing program. To look at it on the VAX, type "TYPE WHATEVER.LIS". (I don't know VMS so I can't help you much with editing on the VAX.)

The output in the LIS file contains a lot of stuff. Here is an excerpt:

 Goodness-of-Fit Tests

 Statistic             Value       DF    Prob>Chi-Sq
 ------------------    --------    --    -----------
 Pearson Chi-Square    1.0087      3     0.7992
 L.R. Chi-Square       1.0222      3     0.7959

 Response Levels: 2    Number of Covariate Values: 5

 NOTE: Since the chi-square is small (p > 0.1000), fiducial limits will
       be calculated using a t value of 1.96.

 Probit Model in Terms of Tolerance Distribution

 MU          SIGMA
 23.68467    26.90919

 Probit Procedure
 Probit Analysis on QFACTOR

 Probability    QFACTOR    95 Percent Fiducial Limits
                           Lower       Upper
 0.01           -38.915    -159.483    -11.823
 0.02           -31.580    -139.024     -7.205
 ...
 0.45            20.303       0.050     31.087
 0.50            23.685       7.208     35.489
 0.55            27.066      13.460     40.797
 ....
 0.98            78.949      56.978    174.646
 0.99            86.285      61.634    195.067
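One way to read the last table in this excerpt: the QFACTOR column appears to be the quantile of a normal distribution with the listed MU and SIGMA at each probability. The Python sketch below checks that reading against the rows shown, using values copied from the output. It is a reading aid, not a replacement for the SAS run, and it does not attempt to reproduce the fiducial limits.

```python
from statistics import NormalDist

# MU and SIGMA copied from the PROC PROBIT output.
tolerance = NormalDist(mu=23.68467, sigma=26.90919)

# Rows copied from the output table: probability -> listed QFACTOR.
table_rows = {
    0.01: -38.915, 0.02: -31.580, 0.45: 20.303,
    0.50: 23.685, 0.55: 27.066, 0.98: 78.949, 0.99: 86.285,
}

# Quantile of the tolerance distribution at each tabulated probability.
quantiles = {p: tolerance.inv_cdf(p) for p in table_rows}

for p, listed in table_rows.items():
    print(f"p={p:.2f}  computed={quantiles[p]:8.3f}  listed={listed:8.3f}")
```

The negative QFACTOR values at low probabilities are simply the lower tail of the fitted normal curve extending below zero.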

Results Summary

Using the Quality Factor metric, the chi-square is 1.01 with 3 degrees of freedom, and Prob > Chi-Square is 0.80. When this probability is greater than 0.1, your results show an OK fit. h = 1.01/3 = 0.337, but h is not used because the chi-square is low enough. The 50% level (corresponding to mu) is 23.7 with a standard deviation of 26.9. The fiducial limits on this value (see the table at the end of the output) range from 7.2 to 35.5.
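The Pearson chi-square and h can be approximately recomputed by hand from the fitted mu and sigma, which shows where the numbers come from. The Python sketch below assumes the forced-choice model with C = 0.5 (predicted proportion correct = C + (1 - C) times the cumulative normal) and takes the degrees of freedom as the number of stimulus levels minus the two fitted parameters. Small differences from the SAS value are expected because mu and sigma are rounded in the printout.

```python
from statistics import NormalDist

# Fitted values copied from the SAS output; C is the 2AFC guessing rate.
MU, SIGMA, C = 23.68467, 26.90919, 0.5

# Tally sheet data: (QF, number correct, number of observations).
data = [(1, 20, 36), (14, 25, 36), (29, 30, 36), (44, 31, 36), (114, 36, 36)]

tolerance = NormalDist(MU, SIGMA)
pearson = 0.0
for qf, correct, total in data:
    predicted = C + (1.0 - C) * tolerance.cdf(qf)  # predicted proportion correct
    observed = correct / total                     # observed proportion correct
    pearson += total * (observed - predicted) ** 2 / (predicted * (1.0 - predicted))

df = len(data) - 2  # five levels minus two fitted parameters
h = pearson / df

print(f"Pearson chi-square = {pearson:.3f}, df = {df}, h = {h:.3f}")
```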

Repeating this analysis using the compression ratio metric gives these results:

chi-square = 0.66
degrees of freedom = 3
P>chi-square = .88 (greater than 0.1 is OK)
h = 0.66/3=0.22 (not used since chi-square is low)
mu(50% level) = 11.1, standard deviation = 8.0
Fiducial limits, 6.5 - 14.9

So you can see that the fit is a little better with the compression ratio, but not by much.

Why do both analyses?

Which one would you recommend?
