\section{PERFORMANCE TESTS}
\label{sec:tests}

This section is organized as follows. Section \ref{sec:expSetup} details the design of the experiments used to analyze the OBAS benchmark, and Sections \ref{sec:responseTime} to \ref{sec:reliability} present the performance test results.

\subsection{Experimental Setup}
\label{sec:expSetup}

We chose XML for Analysis (XMLA) as the communication interface for the experimental analysis of OBAS. XMLA is a standard specification of Web services for OLAP that has only two methods: Discover and Execute. Consequently, every evaluated analysis service must support this interface.

We chose two analysis services that adopt XMLA: SQL Server Analysis Services\footnote{Microsoft SQL Server 2000 Analysis Services Operations Guide: \url{http://technet.microsoft.com/en-us/library/cc917607.aspx}} (hereinafter SSAS) and Pentaho Mondrian (hereinafter Mondrian). We did not intend to compare OLAP servers, but to describe the results of both and to observe whether the same trends occurred in both, aiming to highlight the features of OBAS that are independent of the OLAP server. Such description and observation are essential to corroborate that OBAS assists in the performance assessment of analysis services. To provide the XMLA communication interface, a web application server was needed to host the OLAP web service.

Internet Information Services (IIS)\footnote{Internet Information Services 6.0 Solution Center: \url{http://support.microsoft.com/ph/2097}} hosted the SSAS XMLA web service, while Mondrian used Apache Tomcat\footnote{Apache Tomcat: \url{http://tomcat.apache.org/}} to host both the analysis service and its XMLA web service. Because IIS is native to the Windows operating system and Tomcat has Windows versions, all tests were run under Windows XP. In addition, the evaluated analysis services stored their cubes in Relational OLAP mode because it is the storage format of Mondrian. This allowed us to evaluate SSAS and Mondrian under the same conditions.

To compare our findings with performance results derived from a real dataset, our experiments also used a data warehouse created for the mobile emergency care unit of the metropolitan area of the city of Recife. This real DW is called SAMU and was adapted to be compatible with the synthetic dataset generated by the OBAS data generator, which was also used in our tests. Both the synthetic and the real dimensional datasets contain a fact table with five measures and a set of four dimensions, each of which has a hierarchy of three aggregation levels. In addition, correspondence mappings between the data schemas of OBAS and SAMU (i.e. mappings between cube names, dimension names, names of levels and values of filters) were defined to enable the translation of the OBAS workload into queries to be processed over the SAMU schema. While SAMU is a real and skewed dataset, OBAS provides a uniformly distributed dataset.

In the experiments, the number of threads varied from 1 to 30, which allowed a detailed and progressive analysis of reliability. Moreover, an extra execution with 100 threads was performed to estimate and evaluate the behavior of the services when subjected to a higher load. Regarding data scalability, the experiments used 250,000, 500,000 and 1 million fact table records. The value of 250,000 was chosen as the initial scale because it corresponds to the size of the fact table of the real dataset (i.e. SAMU). Each remaining value doubles the previous one.

\begin{table}[b]
\caption{The OBAS Experimental Setup} % title of Table
\centering % used for centering table
\sffamily % CHANGES THE FONT WITHIN THE TABLE TO sans serif
\begin{tabular}{|l|l|l|r|r|} % columns (5 columns)
\hline %\hline %inserts double horizontal lines
\textbf{Mode} & \textbf{Service} & \textbf{Cache} & \textbf{Start} & \textbf{End} \\ \hline
Local & SSAS & Cache & 1 & 30  \\ \hline
Local & SSAS & Clear & 1 & 30  \\ \hline
Local & Mondrian & Cache & 1 & 30  \\ \hline
Local & Mondrian & Clear & 1 & 30  \\ \hline
Remote & SSAS & Cache & 1 & 30  \\ \hline
Remote & SSAS & Clear & 1 & 30  \\ \hline
Remote & Mondrian & Cache & 1 & 30  \\ \hline
Remote & Mondrian & Clear & 1 & 30  \\ \hline
Remote & SSAS & Clear & 100 & 100  \\ \hline
Remote & Mondrian & Clear & 100 & 100  \\ \hline
\end{tabular}
\label{table:tab6} % is used to refer this table in the text
\rmfamily % RESTORES TO TIMES, SO THAT CAPTION IS WRITTEN PROPERLY
\end{table}

Table \ref{table:tab6} lists 10 execution plans, each of which was processed against two data catalogs: SAMU and OBAS. Each plan is a combination of an execution mode (local or remote), a service (SSAS or Mondrian), a cache policy (using the cache or clearing it), and the starting and ending numbers of concurrent execution processes (threads). These combinations were used in the experiments with both the OBAS and SAMU catalogs at the scale of 250,000 records, while for the other data scales the same combinations were used only with the OBAS data catalog because it is scalable.

Experiments were conducted on a computer with a 2.0 GHz dual-core processor, Windows XP SP2, 3 GB of main memory and a 150 GB hard disk. The experiments were performed on a small scale, matching the size of the real fact table of the SAMU application, which has 250,000 records. Nevertheless, as the number of threads increased, errors were registered and used to investigate the reliability of the analysis services. Moreover, executing more than 10 threads in the local execution, in which processing resources are shared between the services and the OBAS application, made it possible to determine the robustness of the analysis services by testing beyond the limits of normal operation and to verify the presence of the errors used in the calculation of reliability.

The sequential execution of the 17 queries Q1 to Q17 was considered an iteration. The minimal sample number of 50 iterations was divided by the number of threads of each configuration. Figure \ref{fig4} displays iterations and configurations as follows. In the configuration of 1 thread, thread $t_{11}$ performed 50 iterations. In the configuration of 2 threads, threads $t_{21}$ and $t_{22}$ performed 25 iterations each (totaling 50 iterations). In the configuration of 3 threads, threads $t_{31}$, $t_{32}$ and $t_{33}$ performed 17 iterations each (totaling 51 iterations), and so on. This method was used in all tests.
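
As an illustrative sketch of this division (the symbols $n$ and $I(n)$ are introduced here only for exposition and are not part of the OBAS specification), the three configurations listed above are consistent with rounding the quotient up, so that a configuration of $n$ threads would assign to each thread
\[
I(n) = \left\lceil \frac{50}{n} \right\rceil
\quad \mbox{iterations, e.g.} \quad
I(1) = 50, \quad I(2) = 25, \quad I(3) = \lceil 16.67 \rceil = 17 .
\]
Under this assumption, the total number of iterations $n \cdot I(n)$ never falls below the minimal sample of 50 (for instance, $3 \times 17 = 51$).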

\begin{figure*}[b]
 \begin{center}
    \includegraphics[height=3cm,width=12cm]{Fig4}
    \caption{Configurations, threads and iterations of queries executions.}\label{fig4}
  \end{center}
\end{figure*}

Errors and outliers were treated according to the metric being computed. Errors are executions that fail; they were removed from the response time analysis because they are not complete and valid executions. Outliers are executions whose elapsed time is very high, exceeding 3 times the standard deviation; they were removed from the response time analysis because they would distort the average response times obtained. For the rate of execution under concurrency (Throughput), errors were discarded since what matters is the number of valid executions, whereas outliers were kept because a query execution must be counted regardless of whether it takes a longer or shorter time. For Power, outliers were discarded.
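
One possible reading of the outlier criterion, stated here as an assumption because the reference point of the threshold is not given explicitly, is that an execution with elapsed time $t_i$ is treated as an outlier when it lies more than three standard deviations above the mean elapsed time of its configuration:
\[
t_i > \bar{t} + 3\sigma ,
\]
where $\bar{t}$ and $\sigma$ are hypothetical shorthand for the sample mean and the sample standard deviation of the elapsed times observed for that configuration.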

\subsection{Response Time}
\label{sec:responseTime}

\begin{figure*}[t]
 \begin{center}
    \includegraphics[height=8cm,width=15cm]{Fig5}
    \caption{Response Times and Execution Rates for datasets with sizes of 250K}\label{fig5}
  \end{center}
\end{figure*}

The initial scale of 250K was used for both data catalogs, OBAS and SAMU. In this test, the average response times were calculated for each running combination and are shown in Figure \ref{fig5}a. The local execution time was shorter than the remote execution time. Also, the use of cache benefited both local and remote execution plans. Regarding the datasets, query processing over the OBAS data catalog took longer than over the SAMU data catalog. Note that SAMU is a real data catalog that has a uniform organization of data, while OBAS has a synthetic data catalog that is randomly generated. The execution of queries over the OBAS data catalog required more resources and processing than performing the same computations over the SAMU data catalog. These trends were observed for both analysis services.

Regarding the metrics on rate of execution, the results for Power are shown in Figure \ref{fig5}b, the results for Throughput are shown in Figure \ref{fig5}c, and the results for Composite are shown in Figure \ref{fig5}d. They revealed that local execution provides a greater rate of execution than remote execution because local execution is not impaired by network traffic, and that maintaining the cache is crucial to provide a higher rate of execution, since data is reused instead of fetched twice. In addition, the rate of execution over the SAMU catalog was greater than over the OBAS catalog, since their data distributions are uniform and random, respectively. These trends were observed for one single thread (Power), for concurrent processes (Throughput) and for their geometric average (Composite). Therefore, we concluded that local execution with cache on the SAMU catalog obtained the best performance. Furthermore, we emphasize that the discussed trends were similar for both analysis services.
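
As a reminder of what the geometric average implies (the symbols below are shorthand introduced here, and any normalization applied by OBAS to the individual metrics is not restated in this section), the Composite metric combines the two rates as
\[
\mathit{Composite} = \sqrt{\mathit{Power} \times \mathit{Throughput}} ,
\]
so that the Composite value always lies between the Power and Throughput values and decreases whenever either of them decreases.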

\subsection{Scalability and Reliability}
\label{sec:scalabilityandReliability}

Figure \ref{fig6} provides performance results derived from a general scalability analysis and shows that, as the data volume increased, the execution rates decreased. However, Figure \ref{fig6}a demonstrates that this reduction is not proportional to scale, i.e., the limitation on the rate of execution of the services is not proportional to the database size. In fact, multiplying the number of queries per hour by the scale factor produces values that are not close to each other across scales. Furthermore, since OBAS-QPH multiplies the reliability (which is a percentage) by the scale factor used in data generation, greater volumes of data obtained greater values of reliability in Figure \ref{fig6}b.

\begin{figure*}[t]
 \begin{center}
    \includegraphics[height=8cm,width=15cm]{Fig6}
    \caption{Performance Results of the Scalability Analysis}\label{fig6}
  \end{center}
\end{figure*}

Another test regarding scale was performed by calculating the Power test for groups of queries. In this analysis, the OBAS Group I of queries, shown in Figure \ref{fig6}c, was more sensitive to scale than the OBAS Group II, shown in Figure \ref{fig6}d, and this was more easily noticed in the local execution with cache. As expected, this occurs because the queries of Group I are based on selectivity variations, whose processing cost is proportional to the size of the dataset evaluated, while the queries of Group II vary the cost of processing OLAP functions, which is not directly related to the sizes of the datasets tested. In addition, according to our results, the queries of Group I required less processing than those of Group II, since the rate of execution for Group I was considerably higher than the rate of execution for Group II according to the Power test reported in Figures \ref{fig6}c and \ref{fig6}d. Note that the results reported in Figure \ref{fig6}d assess the performance of OLAP functions provided by analysis services by varying the OLAP processing costs instead of considering the query selectivity only.

\subsection{Concurrency}
\label{sec:concurrency}

\begin{figure*}[b]
 \begin{center}
    \includegraphics[height=8cm,width=15cm]{Fig7}
    \caption{Performance Results of the Throughput Analysis}\label{fig7}
  \end{center}
\end{figure*}

In this test, we examined the rate of execution with an increasing number of threads as concurrent processes, since OBAS employs throughput metrics defined according to the number of threads. We used the OBAS dataset with increasing data volumes. Figure \ref{fig7}a indicates that, in the local execution with cache, the increase in load caused degradation of the analysis services due to the sharing of local resources and the use of cache. Figure \ref{fig7}b reveals that, in the remote execution with cache, the performance of the evaluated analysis services tended to stabilize at a maximum rate of execution at approximately 15 threads, establishing a threshold for the analysis services. Figure \ref{fig7}c reports the results of the local execution without cache, which demonstrated that discarding the cache increased the rate of execution for more than 10 threads, thus establishing another threshold for the analysis services. Figure \ref{fig7}d concerns the results for remote execution without cache, which revealed that, with the number of threads varying from 1 to 30, the rate of execution increased for all data volumes. However, when the number of threads varied from 30 to 100, the rate of execution decreased for the most voluminous dataset of 1M, thus demonstrating a limitation of the analysis services.

Note that the data volumes of 250K, 500K and 1M had their rates of execution multiplied by 0.042, 0.084 and 0.126, respectively. These values are their scale factors SF, since SF=1 produces 6 million tuples in the fact table. Therefore, the data volume of 1M had greater rates of execution than the data volumes of 250K and 500K. Also, for the data volume of 1M, the rate of execution drastically decreased with more than 30 threads, which revealed a drawback of the analysis services regarding concurrency.
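
For reference, a worked computation of the ratio between each fact table size $N$ and the 6 million tuples produced by SF=1 (a sketch that assumes the scale factor is exactly this ratio, which is not stated explicitly) gives
\[
\frac{N}{6{,}000{,}000}: \qquad
\frac{250{,}000}{6{,}000{,}000} \approx 0.042, \qquad
\frac{500{,}000}{6{,}000{,}000} \approx 0.083, \qquad
\frac{1{,}000{,}000}{6{,}000{,}000} \approx 0.167 .
\]
The first ratio matches the factor reported for 250K, whereas the factors reported for 500K and 1M (0.084 and 0.126) equal $2 \times 0.042$ and $3 \times 0.042$, respectively.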

\subsection{Reliability}
\label{sec:reliability}

\begin{figure*}[b]
 \begin{center}
    \includegraphics[height=8cm,width=15cm]{Fig8}
    \caption{Performance Results of the Reliability Analysis}\label{fig8}
  \end{center}
\end{figure*}

In this section, we focus on reliability since OBAS incorporates reliability metrics to allow the experimental evaluation of analysis services that may produce errors. Figure \ref{fig8}a indicates that there was a slight reduction in reliability with increasing scales. In addition, the local reliability was degraded since computing resources were shared between the analysis service and the benchmark application, which ran locally. On the other hand, the remote reliability was approximately 100\% since there was no sharing of computing resources. Both locally and remotely, Mondrian was more sensitive to the variation of scale than SSAS. Figure \ref{fig8}b reports the results for the local execution without cache, which requires more processing. One specific query with a high computational cost, Q16, caused severe errors in Mondrian and decreased its reliability, particularly for the higher data volumes of 500K and 1M. In all other queries, SSAS had a lower reliability than Mondrian. In addition, the results reported in Figure \ref{fig8}b assess the performance of OLAP functions provided by analysis services by varying the OLAP processing costs instead of considering the query selectivity only. Figure \ref{fig8}c and Figure \ref{fig8}d show the results regarding the reliability of local executions and concurrent executions, preserving the cache and discarding the cache, respectively. They indicate that the degradation of the analysis services was observed mainly in executions with more than 10 threads, especially for SSAS when the load of simultaneous executions was increased and the cache was erased.

Running the reliability test on a machine with only two cores was crucial because we were interested in causing execution errors, mainly for higher numbers of threads (30 to 100). A machine with more cores would reduce these errors and would disregard the execution priority of each analysis service.


