Evaluation of SSIM
Lin Zhang, School of Software Engineering, Tongji University
Introduction
The SSIM (Structural SIMilarity) index is a well-known full-reference (FR) image quality assessment (IQA) metric, valued for its effectiveness and efficiency. It was proposed by Dr. Zhou Wang, Prof. A. C. Bovik, et al. in 2004 [1].
Source Code
Researchers sometimes report different evaluation results for SSIM on the same test dataset. One reason is that they use different implementations. Another is that a key step is often omitted: before computing SSIM, the user needs to down-sample the images to an appropriate scale. Recently, Dr. Zhou Wang has released a final version of the SSIM implementation that encapsulates the down-sampling step. The source code can be downloaded from https://ece.uwaterloo.ca/~z70wang/research/ssim/ssim.m.
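To make the down-sampling step concrete, here is a minimal sketch of one way it can be done. It assumes the commonly cited rule of choosing an integer factor so that the smaller image dimension becomes roughly 256 pixels, applying an averaging filter, and then decimating. The function name downsample_for_ssim is hypothetical, and the rule itself is an assumption rather than something stated on this page, so check it against ssim.m before relying on it. The call to imfilter requires the Image Processing Toolbox.

```matlab
function [img1, img2] = downsample_for_ssim(img1, img2)
% Hypothetical helper: down-sample two equally sized gray-scale images
% before computing SSIM.
% ASSUMPTION (not stated in the text above): choose an integer factor f
% so that min(rows, cols) / f is about 256.
img1 = double(img1);
img2 = double(img2);
[M, N] = size(img1);
f = max(1, round(min(M, N) / 256));   % f == 1 means no down-sampling
if f > 1
    lpf = ones(f, f) / (f * f);       % f-by-f averaging (box) filter
    img1 = imfilter(img1, lpf, 'symmetric', 'same');
    img2 = imfilter(img2, lpf, 'symmetric', 'same');
    img1 = img1(1:f:end, 1:f:end);    % keep every f-th pixel
    img2 = img2(1:f:end, 1:f:end);
end
end
```

With an implementation that already performs this step internally, the helper should not be applied a second time.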
Usage Notes
1. Besides the two images to be compared, Zhou Wang's SSIM implementation takes three other parameters. Please do not change their default values when running experiments; otherwise, your results will not be comparable with those of other researchers. In all of Dr. Zhou Wang's papers, SSIM is used with the default parameter settings.
2. Dr. Zhou Wang's original SSIM handles only gray-scale images with a luminance range of [0, 255]. For a color image, you therefore need to convert it to a gray-scale version in [0, 255] before calling SSIM. This can usually be done with the MATLAB routine rgb2gray.
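The two notes above can be sketched as a short script. This is only an illustration: the image file names are placeholders, and it assumes that the ssim function on the MATLAB path is Zhou Wang's ssim.m. Newer releases of the Image Processing Toolbox ship their own ssim function, so it is worth confirming with "which ssim" that the intended file is the one being called.

```matlab
% Sketch: compare a reference image with a distorted image using SSIM.
% 'reference.png' and 'distorted.png' are placeholder file names.
ref  = imread('reference.png');
dist = imread('distorted.png');

% Convert color images to gray-scale. For uint8 input, rgb2gray returns
% uint8 values in [0, 255], which is the range SSIM expects here.
if size(ref, 3) == 3
    ref = rgb2gray(ref);
end
if size(dist, 3) == 3
    dist = rgb2gray(dist);
end

% Pass only the two images so that the three optional parameters keep
% their default values.
score = ssim(double(ref), double(dist));
fprintf('SSIM = %.4f\n', score);
```

If the images are stored as floating-point values in [0, 1], they would need to be scaled to [0, 255] first.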
Evaluation Results
The results (in MATLAB .mat format) are provided here. Each result file contains an n-by-2 matrix, where n is the number of distorted images in the database. The first column holds the SSIM values, and the second column holds the MOS/DMOS values provided by the database. For example, the following MATLAB code calculates the SROCC and KROCC values for the SSIM values obtained on the TID2008 database:
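%%%%%%%%%%%%%%%
% Load the n-by-2 result matrix: column 1 = SSIM, column 2 = MOS/DMOS
matData = load('SSIMOnTID.mat');
SSIMOnTID = matData.SSIMOnTID;
% Rank-order correlations between SSIM and the subjective scores
SSIM_TID_SROCC = corr(SSIMOnTID(:,1), SSIMOnTID(:,2), 'type', 'spearman');
SSIM_TID_KROCC = corr(SSIMOnTID(:,1), SSIMOnTID(:,2), 'type', 'kendall');
%%%%%%%%%%%%%%%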
The source code to calculate the PLCC and RMSE is also provided for each database. These two metrics require a nonlinear regression procedure, which depends on the initialization of the parameters. We adjust the parameters to obtain a high PLCC value, so the parameter initialization may differ between databases. The nonlinear fitting function has the form described in [2].
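As a rough illustration of this procedure, the sketch below fits a five-parameter logistic function, which is assumed here to be the form given in [2], and then computes the PLCC and RMSE between the mapped SSIM values and the subjective scores. The starting values in beta0 are purely illustrative and are not the initialization used for the results on this page. The name logistic5 is hypothetical, and nlinfit requires the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: PLCC and RMSE after a nonlinear mapping of SSIM to MOS/DMOS.
matData = load('SSIMOnTID.mat');
scores  = matData.SSIMOnTID(:, 1);   % objective SSIM values
mos     = matData.SSIMOnTID(:, 2);   % subjective MOS/DMOS values

% Five-parameter logistic function (ASSUMED to match the form in [2]):
%   q(x) = b1 * (1/2 - 1 / (1 + exp(b2 * (x - b3)))) + b4 * x + b5
logistic5 = @(b, x) b(1) * (0.5 - 1 ./ (1 + exp(b(2) * (x - b(3))))) ...
                    + b(4) * x + b(5);

% Illustrative starting point only; it may need tuning per database.
beta0 = [max(mos) - min(mos), 10, mean(scores), 0.1, mean(mos)];

% Fit the mapping and evaluate it on the same data.
beta   = nlinfit(scores, mos, logistic5, beta0);
fitted = logistic5(beta, scores);

SSIM_TID_PLCC = corr(fitted, mos, 'type', 'pearson');
SSIM_TID_RMSE = sqrt(mean((fitted - mos) .^ 2));
```

Because the fit can converge to different local solutions, trying a few starting points and keeping the one with the highest PLCC is consistent with the procedure described above.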
Evaluation results of SSIM on eight databases are given below. For each evaluation metric, we also present its weighted average over all the test datasets, where the weight for each database is the number of distorted images in that database.
Database | SROCC | KROCC | PLCC | RMSE
TID2013 | 0.7417 | 0.5588 | 0.7895 | 0.7608
TID2008 | 0.7749 | 0.5768 | 0.7732 | 0.8511
CSIQ | 0.8756 | 0.6907 | 0.8613 | 0.1334
LIVE | 0.9479 | 0.7963 | 0.9449 | 8.9455
IVC | 0.9018 | 0.7223 | 0.9119 | 0.4999
Toyama-MICT | 0.8794 | 0.6939 | 0.8887 | 0.5738
A57 | 0.8066 | 0.6058 | 0.8017 | 0.1469
WIQ | 0.7261 | 0.5569 | 0.7980 | 13.8046
Weighted-Average | | | |
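The weighted average described above can be computed as in the following sketch, shown for SROCC. The per-database image counts are not given on this page; the values in numImages are an assumption based on the commonly cited sizes of these databases and should be verified against each database's documentation before use.

```matlab
% Sketch: weighted average of a metric over databases, with weights equal
% to the number of distorted images in each database.
% Order: TID2013, TID2008, CSIQ, LIVE, IVC, Toyama-MICT, A57, WIQ.
srocc = [0.7417 0.7749 0.8756 0.9479 0.9018 0.8794 0.8066 0.7261];

% ASSUMPTION: image counts are not stated in the text; verify them.
numImages = [3000 1700 866 779 185 168 54 80];

weightedSROCC = sum(srocc .* numImages) / sum(numImages);
fprintf('Weighted-average SROCC: %.4f\n', weightedSROCC);
```

The same two lines apply to KROCC and PLCC by swapping in the corresponding column of the table.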
References
[1] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. on Image Processing, vol. 13, no. 4, pp. 600-612, 2004.
[2] H. R. Sheikh, M. F. Sabir, and A. C. Bovik, "A statistical evaluation of recent full reference image quality assessment algorithms," IEEE Trans. on Image Processing, vol. 15, no. 11, pp. 3440-3451, 2006.
Created on: May 08, 2011
Last update: Dec. 02, 2013