\section{Related Work}
\label{sec:RelatedWork}

Few studies have addressed the benchmarking of multidimensional databases, and they differ from our work in their purpose and in the benchmark characteristics on which they focus. The OLAP benchmark APB-1 (Analytical Processing Benchmark) \cite{olap98} was designed to meet the need for an analytical processing benchmark by simulating a real OLAP business application. The main goal of APB-1 is to measure the overall performance of OLAP servers, while the execution of database transactions is not considered. To ensure relevance, this benchmark seeks to reflect common business operations. According to \cite{thomsen02}, APB-1 was the only standard OLAP benchmark in the public domain, and was used to report the performance of systems such as Applix, Hyperion and Oracle, and to determine whether service providers actually offer the minimum set of OLAP functionality required to be regarded as analytical services. However, APB-1 does not use a standard OLAP query language, does not consider runtime configuration parameters (e.g. number of concurrent processes, evaluation of cache usage, physical location of the evaluated services) and does not include a tool that implements the proposed benchmark.

The TPC Benchmark\textsuperscript{TM} H (TPC-H) \cite{tpc09} is a decision support benchmark that contains a set of 22 business-oriented queries (Q1 to Q22) and two update operations (called RF1 and RF2). It represents decision support systems that handle a large volume of data, submit complex queries and aim to answer critical business questions. The TPC-H contains two types of tests, the Power test and the Throughput test. The first performs the operations in the following order: RF1, queries Q1 to Q22, and RF2. After the execution of one Power test, the benchmark runs the Throughput test, which executes in parallel a minimum number of concurrent processes, defined by the size of the database used in the experiments. Each process corresponds to the sequential execution of all 22 queries. The computed metrics are: (1) Power, denoting the results of the Power test; (2) Throughput, corresponding to the results of the Throughput test; (3) Composite (QphH), which is the geometric mean of the Power and Throughput metrics; and (4) Price-per-QphH, which is the ratio between processing costs and QphH values. Regarding data scalability, the TPC-H scales the database size exponentially, with a ratio approximately equal to the square root of ten. The initial level corresponds to a Scale Factor (SF) equal to 1 and generates six million records in the fact table. However, data generated by the TPC-H is not modeled through a star schema, which is often used in DW applications.
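
As a rough illustration of how the composite and price metrics described above relate to each other, consider the following sketch. The symbols $P$, $T$ and $C$ are shorthand introduced here for the Power metric, the Throughput metric and the cost term, respectively; they are not notation taken from \cite{tpc09}.
\[
\mathrm{QphH} = \sqrt{P \times T}, \qquad \mbox{Price-per-QphH} = \frac{C}{\mathrm{QphH}}
\]
Under this form, doubling only one of $P$ or $T$ raises QphH by a factor of $\sqrt{2}$, so a system must perform well on both tests to obtain a high composite value.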

The Star Schema Benchmark (SSB) \cite{oneil09} was designed to aid the performance evaluation of data warehouse systems by proposing some modifications to the TPC-H approach. These changes include the denormalization of the data schema originally proposed by TPC-H to build a star schema, which is more commonly used in DW applications. Also, some tables and columns were dropped and a time dimension table was created. The SSB workload was also adapted from the TPC-H queries to obtain functional coverage and query selectivity. Functional coverage concerns the retrieval of data from one, two, three or four dimension tables, while query selectivity consists in varying the logical conditions of queries to retrieve different numbers of records from the fact table. To reduce cache effects in the successive execution of queries, the SSB queries have disjoint selections, so that each query retrieves different records. Also, its data generator, called DBGEN, generates synthetic data according to the SSB data schema. Because the goal of the SSB is to assist in the evaluation of the processing time of DBMS queries, the SSB workload allows varying the number of dimension tables processed and the number of records accessed by user queries in SQL (i.e. query selectivity). However, the SSB workload does not address OLAP processing performance and does not provide service performance metrics for conducting performance evaluations of analysis services. Similar to the work proposed here, \cite{labrie02} extends the data schema of an existing benchmark and defines a workload based on queries written in MDX. However, their work is based on the TPC-H instead of extending the SSB. This can be seen as a limitation because the TPC-H relational schema was not designed to be a star schema and does not conform to the dimensional modeling guidelines defined in \cite{kimball02}.
Consequently, completing the research detailed in \cite{labrie02} requires a considerable amount of data modeling adjustments and the creation of a dimensional dataset for each of the 22 TPC-H queries. These changes were started in 2002 and, to the best of our knowledge, are still under development.

Performance metrics were also not addressed in \cite{carniel12}, which translated the SSB queries to MDX to evaluate Pentaho Mondrian, which uses Relational OLAP as its storage format and XMLA as its communication interface. Furthermore, Pentaho Mondrian\footnote{Pentaho Analysis Services: Mondrian Project: \url{http://mondrian.pentaho.org/}} was compared to the BJIn OLAP Tool, which uses bitmap join indices as its storage format in a NoSQL\footnote{NoSQL: \url{http://nosql-database.org/}} approach and JavaScript Object Notation (JSON\footnote{JSON: \url{http://www.json.org/}}) to transmit data between the server and the web client. In contrast, we argue that an adequate comparison of analysis services should be carried out under the same settings, e.g. using MDX, a common storage format and a common communication interface.

In \cite{siqueira10}, Spadawan was proposed to provide an environment for the performance evaluation of spatial DWs and for investigating the processing costs related to the redundant storage of spatial data. The SpatialSSB \cite{nascimento11} is an extension of Spadawan that addresses the three main types of geometric data (i.e. points, lines and polygons), proposes a hybrid data schema, controls the selectivity of queries and the distribution of spatial data in the extent, obtains the number of objects that intersect an ad hoc query window, and regulates the increase in data volume by either raising the complexity of the geometries of spatial objects or increasing the number of spatial objects by means of a scale factor. However, these studies do not use MDX and do not enable the performance evaluation of OLAP engines.

The benchmarking of analysis services is the focus of the work described here, which implies the need to evaluate the impact of processing OLAP functions with an increasing number of calculated members and sets, instead of studying only the selectivity of queries, whose impact is more directly related to the DBMS than to the OLAP functionality offered by the services themselves. Also, the use of dimensional star schemas simplifies the creation of dimensional datasets to be tested with the analytical services being evaluated, and more realistically represents the dimensional databases of real DW applications. Finally, in this article, the application of service reliability metrics and the use of throughput metrics are regarded as important criteria for the performance evaluation of analytical services. The literature has paid little attention to these issues, which are addressed here.