Motivation of MVIAeval

Missing value imputation is important for microarray data analyses because missing values can significantly degrade the performance of downstream analyses. Although many microarray missing value imputation algorithms have been developed, an objective and comprehensive performance comparison framework is still lacking. Therefore, in our previous paper (Chiu et al. 2013), we proposed a framework for comprehensively comparing the performance of different existing algorithms; the same framework can also be applied to evaluate a newly developed algorithm. However, constructing the framework from scratch is not an easy task for interested researchers. To save researchers time and effort, we present here an easy-to-use web tool named MVIAeval (Missing Value Imputation Algorithm evaluator), which implements our performance comparison framework.



What is MVIAeval?

MVIAeval provides a user-friendly interface that allows users to upload the R code of their new algorithm and to select

(i) the test datasets from 20 benchmark microarray (time series or non-time series) datasets
(ii) the compared algorithms from 12 existing algorithms
(iii) the performance indices from three existing ones
(iv) the comprehensive performance scores from two possible choices
(v) the number of simulation runs

The comprehensive performance comparison results are then generated and shown as both figures and tables.


Twenty benchmark microarray datasets and twelve existing algorithms used for performance comparison

In MVIAeval, we collected 20 benchmark microarray datasets covering different species and data types.


GEO Dataset | Dim (genes*samples) | Type | Organism | Title
GDS3323 | 45101*6 | Non-time Series | Mus musculus | Na+/H+ exchanger 3 deficiency effect on the colon
GDS3215 | 12625*6 | Non-time Series | Homo sapiens | 13-cis retinoic acid effect on SEB-1 sebocyte cell line
GDS3485 | 45011*6 | Non-time Series | Mus musculus | Zinc transporter SLC39A13 deficiency effect on chondrocytes
GDS3476 | 45011*6 | Non-time Series | Mus musculus | NF-E2-related factor 2 Nrf2 activation effect on the liver
GDS3197 | 45101*6 | Non-time Series | Mus musculus | Transcriptional coactivator PGC-1beta hypomorphic mutation effect on the liver
GDS3149 | 45101*6 | Non-time Series | Mus musculus | Suppressor of cytokine signaling 3 deficiency effect on the regenerating liver
GDS2107 | 15923*6 | Non-time Series | Rattus norvegicus | Long-term ethanol consumption effect on pancreas
GDS3464 | 15617*6 | Non-time Series | Danio rerio | SPT5 mutant embryos
GDS3426 | 23015*6 | Non-time Series | Staphylococcus epidermidis | Staphylococcus epidermidis SarZ mutant
GDS3421 | 10208*6 | Non-time Series | Escherichia coli | Frag1 cells response to ionic and non-ionic hyperosmotic stress
GDS3360 | 22575*8 | Time Series | Homo sapiens | Chlamydia pneumoniae infection effect on HL epithelial cells: time course
GDS2863 | 31099*6 | Time Series | Rattus norvegicus | Tienilic acid effect on the liver: time course
GDS5057 | 34760*8 | Time Series | Mus musculus | Mepenzolate bromide effect on lung: time course
GDS5055 | 45307*10 | Time Series | Mus musculus | Histone demethylase KDM1A deficiency effect on 3T3-L1 preadipocytes: time course
GDS3428 | 22283*9 | Time Series | Homo sapiens | Immature dendritic cell response to butanol fraction of Echinacea purpurea: time course
GDS4484 | 45101*8 | Time Series | Mus musculus | Cerebellar neuronal cell response to thyroid hormone: time course
GDS3785 | 17589*8 | Time Series | Homo sapiens | Osteoarthritic chondrocytes and healthy mesenchymal stem cell during chondrogenic differentiation: time course
GDS3930 | 8799*9 | Time Series | Rattus norvegicus | Bone morphogenic protein effect on cultured sympathetic neurons: time course
GDS4321 | 10208*8 | Time Series | Escherichia coli | Escherichia coli O157:H7 response to cinnamaldehyde: time course
GDS3032 | 22277*8 | Time Series | Homo sapiens | Quercetin effect on intestinal cell differentiation in vitro: time course




In addition, we implemented 12 existing algorithms: two global-approach algorithms and ten local-approach algorithms.


Algorithm | Category | Year of Publication | Reference
SVD | Global | 2001 | [Troyanskaya et al. 2001]
BPCA | Global | 2003 | [Oba et al. 2003]
KNN | Local | 2001 | [Troyanskaya et al. 2001]
SKNN | Local | 2004 | [Kim et al. 2004]
IKNN | Local | 2007 | [Brás et al. 2007]
LS | Local | 2004 | [Bø et al. 2004]
LLS | Local | 2005 | [Kim et al. 2005]
ILLS | Local | 2006 | [Cai et al. 2006]
SLLS | Local | 2008 | [Zhang et al. 2008]
Shrinkage LLS | Local | 2013 | [Wang et al. 2013]
Shrinkage SLLS | Local | 2013 | [Wang et al. 2013]
Shrinkage ILLS | Local | 2013 | [Wang et al. 2013]


Three existing performance indices used for performance evaluation

In MVIAeval, we used three existing performance indices for performance evaluation.
First, the inverse of the normalized root mean square error (1/NRMSE) measures the numerical similarity between the imputed matrix (generated by an imputation algorithm) and the original complete matrix: the higher the 1/NRMSE value, the better the performance of an imputation algorithm.
Second, the cluster pair proportions (CPP) index measures the similarity between the gene clustering results of the imputed matrix and those of the complete matrix. A high CPP value means that the imputed matrix yields gene clustering results very similar to those of the complete matrix; therefore, the higher the CPP value, the better the performance.
Third, the biomarker list concordance index (BLCI) measures the similarity between the differentially expressed gene identification results of the imputed matrix and those of the complete matrix. A high BLCI value means that the differentially expressed genes identified using the imputed matrix are very similar to those identified using the complete matrix; therefore, the higher the BLCI value, the better the performance.
In summary, 1/NRMSE measures numerical similarity, while CPP and BLCI measure the similarity of the downstream analysis results (gene clustering and differentially expressed gene identification) obtained from the imputed matrix and the complete matrix.
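
To make these indices concrete, the following R sketch illustrates two of them under commonly used definitions; the function names and arguments are hypothetical, and MVIAeval's own implementation may differ in detail.

    # Minimal R sketches of two indices; function names and the exact
    # normalization are illustrative assumptions, not MVIAeval's source code.

    # 1/NRMSE: numerical similarity over the originally missing entries,
    # assuming NRMSE = RMSE normalized by the standard deviation of the
    # true values at those entries.
    inverse_nrmse <- function(complete, imputed, miss_idx) {
      truth <- complete[miss_idx]  # true values at the missing positions
      guess <- imputed[miss_idx]   # imputed values at the same positions
      nrmse <- sqrt(mean((guess - truth)^2)) / sd(truth)
      1 / nrmse                    # higher is better
    }

    # BLCI: concordance of differentially expressed gene (DEG) calls,
    # assuming deg_complete and deg_imputed are logical vectors marking
    # each gene as DEG in the complete and imputed matrices, respectively.
    blci <- function(deg_complete, deg_imputed) {
      sum(deg_complete & deg_imputed) / sum(deg_complete) +
        sum(!deg_complete & !deg_imputed) / sum(!deg_complete) - 1
    }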


Evaluating the performance of an algorithm for a benchmark microarray data matrix using a specific performance index

The simulation procedure for evaluating the performance of an imputation algorithm (e.g. KNN) for a given complete benchmark microarray data matrix using a performance index (e.g. CPP) is divided into four steps:

Step 1: randomly generate, from the complete matrix, five testing matrices with different missing value percentages (1%, 3%, 5%, 8% and 10%)
Step 2: generate five imputed matrices by imputing the missing values in the five testing matrices using KNN
Step 3: calculate five CPP scores using the complete matrix and five imputed matrices
Step 4: repeat Steps 1-3 B times, where B is the number of simulation runs per missing percentage.

Then the final CPP score of KNN for the given benchmark microarray data matrix is defined as the average of the 5*B CPP scores. The following figure illustrates the whole simulation procedure.
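
In R, this simulation procedure amounts to a short double loop. In the sketch below, generate_missing(), knn_impute() and cpp_score() are hypothetical placeholders for the corresponding components.

    # Minimal sketch of the four-step simulation procedure; the three
    # helper functions are hypothetical placeholders.
    evaluate_algorithm <- function(complete, B = 25,
                                   rates = c(0.01, 0.03, 0.05, 0.08, 0.10)) {
      scores <- numeric(0)
      for (b in seq_len(B)) {                       # Step 4: repeat B times
        for (r in rates) {                          # Step 1: five missing rates
          test    <- generate_missing(complete, r)  # punch random holes
          imputed <- knn_impute(test)               # Step 2: impute (e.g. KNN)
          scores  <- c(scores, cpp_score(complete, imputed))  # Step 3: score
        }
      }
      mean(scores)                                  # average of the 5*B scores
    }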



Two existing comprehensive performance scores

In MVIAeval, we implemented two existing comprehensive performance scores to summarize the overall performance comparison results for the selected performance indices and benchmark microarray datasets.
(I) Overall Ranking Score (ORS): the sum of the rankings of an algorithm for the selected performance indices and benchmark microarray datasets.
The ranking of an algorithm for a specific performance index and a specific benchmark microarray dataset is d if its performance is the d-th best among all the compared algorithms. For example, the ranking of the best performing algorithm is 1. Therefore, the smaller the ORS is, the better the overall performance of an algorithm is.

(II) Overall Normalized Score (ONS): the sum of the normalized scores for the selected performance indices and benchmark microarray datasets.
The ONS of the algorithm $k$ is calculated as follows:

$$ONS(k) = \sum_{i=1}^{I}\sum_{j=1}^{J} N_{ij}(k) = \sum_{i=1}^{I}\sum_{j=1}^{J} \frac{S_{ij}(k)}{\max(S_{ij}(1), S_{ij}(2), \ldots, S_{ij}(m))}$$ where $N_{ij}(k)$ and $S_{ij}(k)$ are the normalized score and the original score of algorithm $k$ for the selected performance index $i$ ($i=1$ for $\frac{1}{NRMSE}$, 2 for CPP, and 3 for BLCI) and benchmark microarray dataset $j$; $I$ is the number of selected indices; $J$ is the number of selected benchmark microarray datasets; and $m$ is the number of algorithms being compared. Note that $0 \le N_{ij}(k) \le 1$, and $N_{ij}(k) = 1$ if and only if algorithm $k$ is the best performing algorithm for the selected performance index $i$ and benchmark microarray dataset $j$ (i.e. $S_{ij}(k) = \max(S_{ij}(1), S_{ij}(2), \ldots, S_{ij}(m))$). The larger the ONS is, the better the overall performance of an algorithm is.
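
Both comprehensive scores are straightforward to compute from a matrix of original scores. The following R sketch assumes S is a hypothetical matrix with one row per (performance index, dataset) pair and one column per algorithm, where higher original scores mean better performance.

    # Minimal sketch of ORS and ONS; `S` is an assumed score matrix, not an
    # object taken from MVIAeval itself.
    overall_scores <- function(S) {
      # ORS: rank the algorithms within each row (rank 1 = best), then sum
      # over rows; smaller is better.
      ranks <- apply(S, 1, function(row) rank(-row, ties.method = "min"))
      ORS   <- rowSums(ranks)  # apply() puts algorithms in rows here
      # ONS: divide each row by its maximum (the best algorithm gets 1),
      # then sum over rows; larger is better.
      norms <- apply(S, 1, function(row) row / max(row))
      ONS   <- rowSums(norms)
      list(ORS = ORS, ONS = ONS)
    }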


Usage

The usage of MVIAeval is shown in the following figure.

The user-friendly web interface allows users to upload the R code of their newly developed algorithm. Then five settings of MVIAeval need to be specified. Users have to

(1) choose the test datasets from 20 benchmark microarray datasets
(2) choose the compared algorithms from 12 existing algorithms
(3) choose the performance indices from three existing ones (1/NRMSE, CPP, and BLCI)
(4) choose the comprehensive performance scores from two existing ones (ORS and ONS)
(5) determine the number of simulation runs

After submission, MVIAeval conducts a comprehensive performance comparison of the user's algorithm against the compared algorithms using the selected performance indices and benchmark datasets, and then generates a webpage presenting the comprehensive performance comparison results.
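
For reference, the uploaded R code should implement an imputation routine that fills in the missing values (NAs) of a numeric data matrix. The skeleton below, a naive row-mean imputer, is purely a hypothetical illustration; the exact function name and signature expected by MVIAeval are not specified here.

    # Hypothetical skeleton of an uploaded algorithm: take a numeric matrix
    # containing NAs and return the fully imputed matrix.
    impute <- function(x) {
      for (i in seq_len(nrow(x))) {
        miss <- is.na(x[i, ])
        if (any(miss) && !all(miss)) {
          x[i, miss] <- mean(x[i, !miss])  # naive baseline: row (gene) mean
        }
      }
      x
    }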


A case study

In MVIAeval, the R code of a sample algorithm is provided. For demonstration purposes, we regard the sample algorithm as the user's newly developed algorithm and use MVIAeval to conduct a comprehensive performance comparison of this new algorithm (denoted as USER) against various existing algorithms. For example, a user uploads the R code of the new algorithm,
selects two benchmark datasets,
selects the 12 existing algorithms,
selects the three performance indices,
selects the overall ranking score as the comprehensive performance score,
and uses 25 simulation runs.
After submission, the comprehensive comparison results are generated and shown as both tables and bar charts. The overall performance of the new algorithm ranks sixth among all 13 algorithms being compared.
Comprehensive Ranking Score: the sum of the rankings of each algorithm across the selected performance indices and selected datasets

Rankings of each algorithm across the selected performance indices and selected datasets

Performance Index | Data Type | Dataset | SLLS | ILLS | LS | LLS | KNN | USER | SVD | BPCA | IShrLLS | ShrLLS | IKNN | ShrSLLS | SKNN | Details
1/NRMSE | Non-time | GDS3215 | 2 | 1 | 8 | 3 | 6 | 5 | 7 | 4 | 9 | 10 | 12 | 11 | 13 | Details
1/NRMSE | Time | GDS3785 | 1 | 2 | 5 | 3 | 7 | 6 | 8 | 4 | 9 | 10 | 12 | 11 | 13 | Details
BLCI | Non-time | GDS3215 | 3 | 4 | 2 | 1 | 6 | 6 | 8 | 4 | 9 | 10 | 12 | 11 | 13 | Details
BLCI | Time | GDS3785 | 3 | 4 | 2 | 6 | 8 | 5 | 1 | 11 | 10 | 11 | 7 | 13 | 9 | Details
CPP | Non-time | GDS3215 | 1 | 5 | 3 | 8 | 2 | 4 | 5 | 7 | 10 | 9 | 12 | 11 | 13 | Details
CPP | Time | GDS3785 | 4 | 2 | 3 | 4 | 1 | 7 | 6 | 8 | 11 | 9 | 12 | 13 | 10 | Details
Comprehensive Ranking Score | | | 14 | 18 | 23 | 25 | 30 | 33 | 35 | 38 | 58 | 59 | 67 | 70 | 71 | -
Final Ranking | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | -
In addition, MVIAeval provides the performance comparison results for many different scenarios.
Performance Index | Benchmark Datasets | Ranking of USER Using ORS | Ranking of USER Using ONS | Detail
1/NRMSE | Five Time Series: GDS3360, GDS2863, GDS5057, GDS5055, GDS3428 | 5 | 6 | Detail
1/NRMSE | Five Non-time Series: GDS3323, GDS3215, GDS3485, GDS3476, GDS3197 | 6 | 6 | Detail
CPP | Five Time Series: GDS3360, GDS2863, GDS5057, GDS5055, GDS3428 | 7 | 9 | Detail
CPP | Five Non-time Series: GDS3323, GDS3215, GDS3485, GDS3476, GDS3197 | 11 | 8 | Detail
BLCI | Five Time Series: GDS3360, GDS2863, GDS5057, GDS5055, GDS3428 | 3 | 4 | Detail
BLCI | Five Non-time Series: GDS3323, GDS3215, GDS3485, GDS3476, GDS3197 | 7 | 7 | Detail
1/NRMSE+CPP+BLCI | Five Time Series: GDS3360, GDS2863, GDS5057, GDS5055, GDS3428 | 6 | 7 | Detail
1/NRMSE+CPP+BLCI | Five Non-time Series: GDS3323, GDS3215, GDS3485, GDS3476, GDS3197 | 6 | 6 | Detail
From these results, one can conclude that the new algorithm is mediocre: its performance always falls in the middle of all the compared algorithms across different data types (time series or non-time series), different performance indices (1/NRMSE, BLCI or CPP) and different comprehensive performance scores (ORS or ONS). Upon receiving the comprehensive comparison results from MVIAeval, researchers immediately know that there is plenty of room to improve the performance of their new algorithm.