Evaluation of SSIM

Lin Zhang, School of Software Engineering, Tongji University


Introduction

The SSIM (Structural SIMilarity) index is a well-known full-reference (FR) image quality assessment (IQA) metric, valued for its effectiveness and efficiency. It was proposed by Dr. Zhou Wang, Prof. A. C. Bovik, et al. in 2004 [1].


Source Code

Researchers sometimes report different evaluation results for SSIM on the same test dataset. One reason is that they use different implementations. Another is that a key step is often omitted: before SSIM is computed, the images need to be down-sampled to an appropriate scale. Dr. Zhou Wang has since released a final version of the SSIM implementation that encapsulates the down-sampling step. The source code can be downloaded from https://ece.uwaterloo.ca/~z70wang/research/ssim/ssim.m.
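
To make the down-sampling step concrete, here is a minimal sketch of what such a step can look like. It is not a substitute for the official ssim.m. It assumes the scale factor is chosen as f = max(1, round(min(M, N) / 256)) and that each image is smoothed with an f-by-f averaging filter before every f-th pixel is kept. The function name downsampleForSSIM is hypothetical, and imfilter requires the Image Processing Toolbox.

```matlab
function out = downsampleForSSIM(img)
% downsampleForSSIM  Hypothetical helper illustrating automatic down-sampling.
% Assumption: the scale factor is f = max(1, round(min(M, N) / 256)).
    img = double(img);
    [M, N] = size(img);                      % expects a 2-D gray-scale image
    f = max(1, round(min(M, N) / 256));      % down-sampling factor
    if f > 1
        lpf = ones(f, f) / (f * f);          % f-by-f averaging (low-pass) filter
        img = imfilter(img, lpf, 'symmetric', 'same');
        img = img(1:f:end, 1:f:end);         % keep every f-th pixel
    end
    out = img;
end
```

Under this rule a 512-by-768 image gives f = 2 and is reduced to 256 by 384, while an image whose shorter side is below 384 pixels is left unchanged. Both images being compared would have to go through the same step.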


Usage Notes

1. Besides the two images to be compared, Zhou Wang's SSIM implementation takes three other parameters. In general, do not change the default values of these parameters when running experiments; otherwise, your results will not be comparable with those of other researchers. In all of Dr. Zhou Wang's papers, SSIM is used with the default parameter settings.

2. Dr. Zhou Wang's original SSIM handles only gray-scale images with a luminance range of [0, 255]. For a color image, convert it to a gray-scale version in [0, 255] before calling SSIM. This can usually be done with the MATLAB routine rgb2gray.
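
The two notes above can be combined into a small preprocessing helper. This is a sketch under stated assumptions: toGray255 is a hypothetical name, the file names in the comment are placeholders, and rgb2gray and im2uint8 come from the Image Processing Toolbox. Floating-point input is assumed to lie in [0, 1] and is rounded to 8-bit levels by im2uint8.

```matlab
function gray = toGray255(img)
% toGray255  Hypothetical helper: gray-scale double image in [0, 255].
% Example (file names are placeholders):
%   ref   = toGray255(imread('reference.bmp'));
%   dist  = toGray255(imread('distorted.bmp'));
%   score = ssim(ref, dist);   % the three optional parameters keep their defaults
    if size(img, 3) == 3
        img = rgb2gray(img);        % RGB -> gray, class of the input is kept
    end
    gray = double(im2uint8(img));   % rescale any supported class to [0, 255]
end
```

Recent releases of the Image Processing Toolbox ship their own ssim function with a different interface, so it may be worth checking with `which ssim` that the downloaded file is the one actually being called.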


Evaluation Results

The results (in MATLAB .mat format) are provided here. Each result file contains an n-by-2 matrix, where n is the number of distorted images in the database. The first column holds the SSIM values, and the second column holds the MOS/DMOS values provided by the database. For example, the following MATLAB code calculates the SROCC and KROCC values for the SSIM scores obtained on the TID2008 database:

%%%%%%%%%%%%%%%

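% Load the n-by-2 result matrix: column 1 = SSIM, column 2 = MOS/DMOS
matData = load('SSIMOnTID.mat');
SSIMOnTID = matData.SSIMOnTID;
% Spearman rank-order correlation coefficient (SROCC)
SSIM_TID_SROCC = corr(SSIMOnTID(:,1), SSIMOnTID(:,2), 'type', 'spearman');
% Kendall rank-order correlation coefficient (KROCC)
SSIM_TID_KROCC = corr(SSIMOnTID(:,1), SSIMOnTID(:,2), 'type', 'kendall');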

%%%%%%%%%%%%%%%

The source code for calculating the PLCC and RMSE is also provided for each database. This requires a nonlinear regression procedure whose outcome depends on the initialization of its parameters. We adjust the initial parameters to obtain a high PLCC value, so the initialization may differ from one database to another. The nonlinear fitting function takes the form described in [2].
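
As an illustration of this procedure, the sketch below fits the five-parameter logistic function of [2] and then computes PLCC and RMSE on the mapped scores. It is not the per-database code provided on this page. The function name fitLogisticAndScore is hypothetical, the starting point beta0 must be supplied by the caller, and nlinfit and corr require the Statistics Toolbox.

```matlab
function [plcc, rmse, beta] = fitLogisticAndScore(scores, mos, beta0)
% fitLogisticAndScore  Hypothetical helper, not the per-database fitting code.
% Model from [2]:
%   q(x) = b1 * (1/2 - 1 / (1 + exp(b2 * (x - b3)))) + b4 * x + b5
    scores = scores(:);
    mos = mos(:);
    model = @(b, x) b(1) * (0.5 - 1 ./ (1 + exp(b(2) * (x - b(3))))) ...
                    + b(4) * x + b(5);
    beta = nlinfit(scores, mos, model, beta0);   % result depends on beta0
    predicted = model(beta, scores);             % objective scores after mapping
    plcc = corr(predicted, mos);                 % Pearson correlation by default
    rmse = sqrt(mean((predicted - mos) .^ 2));   % root mean squared error
end
```

With the matrix from the earlier example, a call might look like fitLogisticAndScore(SSIMOnTID(:,1), SSIMOnTID(:,2), beta0). A starting point such as beta0 = [max(mos), 10, mean(scores), 0.1, 0.1] is only an illustrative guess, not the initialization used for the reported results. Because the fit can settle in different local optima, trying a few starting points and comparing the resulting PLCC is a reasonable precaution.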

Evaluation results of SSIM on eight databases are given below. In addition, for each evaluation metric, we present its weighted average over all the test datasets, where the weight for each database is the number of distorted images it contains.

Database          Results          Nonlinear fitting code     SROCC    KROCC    PLCC     RMSE
TID2013           SSIMOnTID2013    NonlinearFittingTID2013    0.7417   0.5588   0.7895   0.7608
TID2008           SSIMOnTID2008    NonlinearFittingTID        0.7749   0.5768   0.7732   0.8511
CSIQ              SSIMOnCSIQ       NonlinearFittingCSIQ       0.8756   0.6907   0.8613   0.1334
LIVE              SSIMOnLIVE       NonlinearFittingLIVE       0.9479   0.7963   0.9449   8.9455
IVC               SSIMOnIVC        NonlinearFittingIVC        0.9018   0.7223   0.9119   0.4999
Toyama-MICT       SSIMOnMICT       NonlinearFittingMICT       0.8794   0.6939   0.8887   0.5738
A57               SSIMOnA57        NonlinearFittingA57        0.8066   0.6058   0.8017   0.1469
WIQ               SSIMOnWIQ        NonlinearFittingWIQ        0.7261   0.5569   0.7980   13.8046
Weighted-Average
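
The weighting rule described above, in which each database counts in proportion to its number of distorted images, can be sketched as follows. The function name weightedAverage is hypothetical. The counts are not listed on this page and would be taken from the row count n of each result matrix.

```matlab
function avg = weightedAverage(values, counts)
% weightedAverage  Hypothetical helper: size-weighted mean of per-database scores.
%   values : per-database metric values (e.g. SROCC), one entry per database
%   counts : number of distorted images in each database (the weights)
    values = values(:);
    counts = counts(:);
    avg = sum(values .* counts) / sum(counts);
end
```

As a toy example with made-up numbers, weightedAverage([0.9 0.7], [300 100]) gives (0.9 * 300 + 0.7 * 100) / 400 = 0.85, so the larger database dominates the average.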

 

         

References

[1] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. on Image Processing, vol. 13, no. 4, pp. 600-612, 2004.

[2] H. R. Sheikh, M. F. Sabir, and A. C. Bovik, "A statistical evaluation of recent full reference image quality assessment algorithms," IEEE Trans. on Image Processing, vol. 15, no. 11, pp. 3440-3451, 2006.


Created on: May 08, 2011

Last update: Dec. 02, 2013