Warehouse Dashboard Architecture

If Flex has a sweet spot, dashboards would be it. The technology provides a richness that makes engaging UIs very easy to implement.

For those evaluating possible solutions, I’m showcasing a simple architecture for a data warehouse / dashboard project. This architecture encapsulates the extract / transform steps (simple SQL => CSV), so they are the only pieces that need to be recreated for each environment (client) with different source systems.
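To make the SQL => CSV step concrete, here is a minimal sketch of what a per-client extract could look like. It is not the project's actual code: the in-memory SQLite database stands in for a client's source system, and the `enrollment` table, its columns, and the `extract_to_csv` helper are all hypothetical names chosen for illustration.

```python
import csv
import io
import sqlite3


def extract_to_csv(conn, query, out):
    """Run one extract query and write the result set as CSV with a header row."""
    cursor = conn.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    # Column names (or their SQL aliases) become the CSV header.
    writer.writerow([col[0] for col in cursor.description])
    rows = cursor.fetchall()
    writer.writerows(rows)
    return len(rows)


# Hypothetical stand-in for a client's source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (course TEXT, school TEXT, grade REAL)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [("Algebra I", "North High", 88.5), ("Biology", "South High", 92.0)],
)

# The only client-specific piece: a query that maps source columns
# onto the column names the warehouse load expects.
EXTRACT_SQL = (
    "SELECT course AS Course, school AS School, grade AS Grade "
    "FROM enrollment ORDER BY course"
)

buffer = io.StringIO()
row_count = extract_to_csv(conn, EXTRACT_SQL, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Under this layout, onboarding a new client would mean swapping the connection and the extract query, while everything downstream of the CSV stays the same.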

[Figure: Warehouse Dashboard Architecture]

Flex OLAP Take Two

After running some OLAP benchmarks, it was obvious that a more scalable OLAP solution was needed.

After some quick searching and downloading, I’ve wired up a scalable Flex OLAP UI.

I downloaded and set up Mondrian from Pentaho as my OLAP server. Mondrian supports XMLA (XML for Analysis), which makes integration easy.

I found a few Flex projects that interface with OLAP via XMLA. The most robust appears to be Grebulon, by the guys over at Sherlock Informatics.
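For anyone new to XMLA: it is SOAP over HTTP, and a query is an `Execute` request carrying an MDX statement. The sketch below only builds such an envelope so you can see its shape. It is written in Python for brevity and is not what Grebulon sends verbatim. The `DataSourceInfo` string, the catalog name, and the MDX query are assumptions based on a stock FoodMart setup, so check them against your own datasources configuration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"

# Use readable prefixes instead of the auto-generated ns0/ns1.
ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("xmla", XMLA_NS)


def build_execute_request(mdx, catalog, data_source_info):
    """Build an XMLA Execute envelope that wraps a single MDX statement."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")

    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = mdx

    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    prop_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    for name, value in (
        ("DataSourceInfo", data_source_info),
        ("Catalog", catalog),
        ("Format", "Multidimensional"),
        ("AxisFormat", "TupleFormat"),
    ):
        ET.SubElement(prop_list, f"{{{XMLA_NS}}}{name}").text = value

    return ET.tostring(envelope, encoding="unicode")


# Assumed values for a default FoodMart install; adjust to your setup.
MDX = (
    "SELECT {[Measures].[Unit Sales]} ON COLUMNS, "
    "{[Store].[Store Country].Members} ON ROWS FROM [Sales]"
)
request_xml = build_execute_request(
    MDX, "FoodMart", "Provider=Mondrian;DataSource=MondrianFoodMart;"
)
print(request_xml)
```

The envelope would be POSTed to the server's XMLA endpoint with a `text/xml` content type. The response comes back as a SOAP document describing axes and cells, which is what a Flex client then maps onto its grid.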

Mondrian setup required a few tweaks, as I’m running Java 6:

Added the following JARs:
axis.jar
commons-discovery-0.2.jar (removed commons-logging)
jaxrpc.jar
wsdl4j-1.5.1.jar
xalan.jar

Also set the following JVM system properties in Tomcat:
-Djavax.xml.soap.MessageFactory=org.apache.axis.soap.MessageFactoryImpl
-Djavax.xml.soap.SOAPConnectionFactory=org.apache.axis.soap.SOAPConnectionFactoryImpl
-Djavax.xml.soap.SOAPFactory=org.apache.axis.soap.SOAPFactoryImpl
-Djavax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl

With FoodMart as a reference, the setup was very straightforward.

I had to download the Grebulon source, as the .swcs had a few hardcoded references (PivotGrid) to the FoodMart datasource.

Commercially, I’d check out FlexMonster too 🙂

Flex OLAP Benchmarks

I ran some benchmarks for the OLAP components in Flex. The OLAP components allow you to perform your OLAP operations client side out of the box. You *can* use the SDK interfaces to move your processing to the backend, should you have too much data for the browser.

That said, the benchmarks below should help define the line for “too much data.”

Record Structure:

Course:String
School:String
GradeLevel:String
SubjectArea:String
Grade:Number (Our measure)

Records / Processing Time (s)

1,000 / 3 sec
5,000 / 12 sec
10,000 / 22 sec
15,000 / 33 sec
20,000 / 43 sec

The times listed above cover *client* side processing only; the data was loaded from the server before the benchmark started.

I ran the benchmarks at various rollup depths (multiple hierarchies), which the components handled very well: the processing time was not significantly affected.

Extending the dataset to 125K+ records caused script timeout errors.
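To put the table in perspective, the arithmetic below turns the five measurements into a per-record cost and a straight-line projection. It uses only the numbers listed above. The least-squares fit and the assumption that the trend stays linear past 20,000 records are mine, not something I measured.

```python
# (records, client-side processing seconds) taken from the benchmark table.
BENCHMARKS = [(1000, 3), (5000, 12), (10000, 22), (15000, 33), (20000, 43)]


def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(BENCHMARKS)
ms_per_record = slope * 1000
# Extrapolation only: assumes the linear trend holds at 125K records.
projected_125k = slope * 125_000 + intercept

for records, seconds in BENCHMARKS:
    print(f"{records:>6} records: {records / seconds:4.0f} records/sec")
print(f"cost per record: {ms_per_record:.2f} ms")
print(f"projected time for 125,000 records: {projected_125k:.0f} sec")
```

The fit works out to roughly 2 ms per record, so the cost grows about linearly with record count. Projected out to 125K records, that is well over four minutes of client-side processing, which fits with the timeouts showing up at that size.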

For datasets above 5-10K records, I’d suggest moving the processing to the server and passing the results to the UI (assuming you’re looking to leverage the OLAP components as provided).
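As a rough illustration of what "moving the processing to the server" can mean, the sketch below rolls the benchmark's record structure up by a chosen set of dimensions, so that only the aggregated rows travel to the UI. It is a simplification, not a replacement for a real OLAP server. The `rollup` helper, the sample records, and the choice of average as the aggregation for Grade are all assumptions made for the example.

```python
from collections import defaultdict


def rollup(records, dimensions, measure="Grade"):
    """Group records by the given dimensions and average the measure."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum, count]
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key][0] += record[measure]
        totals[key][1] += 1
    return [
        dict(zip(dimensions, key), Count=count, Average=total / count)
        for key, (total, count) in sorted(totals.items())
    ]


# Hypothetical sample rows using the benchmark's record structure.
records = [
    {"Course": "Algebra I", "School": "North High", "GradeLevel": "9",
     "SubjectArea": "Math", "Grade": 80.0},
    {"Course": "Geometry", "School": "North High", "GradeLevel": "10",
     "SubjectArea": "Math", "Grade": 90.0},
    {"Course": "Biology", "School": "North High", "GradeLevel": "9",
     "SubjectArea": "Science", "Grade": 70.0},
    {"Course": "Algebra I", "School": "South High", "GradeLevel": "9",
     "SubjectArea": "Math", "Grade": 100.0},
]

summary = rollup(records, ("School", "SubjectArea"))
for row in summary:
    print(row)
```

The payload sent to the browser then scales with the number of distinct dimension combinations rather than with the raw record count.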