Optimizing Your Application Configuration

Performance & Scaling

This is the fourth installment of our Performance & Scaling series, a collection of blog posts and technical articles on improving the speed and efficiency of your Exago BI environment. Last month, we reviewed how to optimize your application architecture. This month, we’ll narrow our focus and look at how configuration settings can impact performance.

 

 

Be sure to have all config- and report-level filtering in place before making changes to these settings so that users are already working with the minimum volume of requisite data. Defaults for the following are designed to satisfy 80% of use cases, so it’s likely that you will only need to adjust a few. Because all of these settings can be configured on a per-user basis via the API, it is possible to deviate from global settings in such a way that gives experienced users more control and protects novice users from making performance-taxing decisions. Modify those that will have the greatest impact on your particular environment and best serve the user in question.

Note, lastly, that each setting below is followed by italicized text defining the flag’s trade-offs. Performance enhancements are rarely gotten for free. Google Analytics, for example, gives users the choice between greater precision and a faster response time. The format used below would summarize this performance cost as: “Trade-off: reduced precision for faster reporting.

 

 

Database Timeout - Set the maximum amount of time the database will work on a single query before disconnecting. This is an emergency feature to prevent excessive and inappropriate queries from hanging the database. Hung databases affect all users attempting to query the database and may cause the server to reboot. If the database is being used to store session state information as well as reporting data, a crash will force all active users to begin new sessions and lose any unsaved work. If the database does not store session state, users will be presented with an error message only. If a database timeout is applied and the limit is reached, only the user whose query incited the delay will receive an error.

Trade-off: some dropped queries for system integrity

 

Database Row Limit - Set the maximum number of rows returned by any query. Reports that are too large are truncated without the user’s permission or knowledge. This setting can either be applied instead of or in addition to a database timeout, both of which serve to protect the database server from crashing. If both a row limit and timeout are applied, the database will abandon a query if either limit is met.

Trade-off: some dropped queries for system integrity

 

Row Limit Step Size - Set the maximum number of rows returned by a query. Setting step size to a value greater than zero automatically enables incremental loading, which allows users to execute reports in stages, starting with a small set of rows, and adding more in steps as desired. Users have the option of overriding incremental loading and generating the full data set by clicking a “Generate All” button in the Report Viewer. This setting is designed to protect the user experience rather than the database server. Users less concerned with a report’s content than its style or structure can view a subset of the data instead of waiting for the full report to generate. Row Limit Step Size can be used in tandem with other database settings.

Trade-off: Lack of data completeness for speed

 

Enable Special Cartesian Processing - This setting is relevant only in situations where the following describes the relationship between objects A, B, and C:

There is a one-to-many join from A to B, a one-to-many join from A to C, and no relationship between B and C. (Note: These one-to-many joins must be defined as such in the application configuration, not just in the database, for this setting to trigger.) Enable Special Cartesian Processing, which defaults to “True,” prevents the database from returning a cartesian product between objects B and C and instead handles joins server-side. The resulting data set is more intelligible to end users than what the database would produce, but the need for additional querying and data processing can certainly impact performance.

Trade-off: Less user-friendly query results for execution speed

 

Aggregate and Group in Database - Set aggregate and grouping calculations to be done in the database when possible (see our best practices). If enabled, qualifying reports will run more quickly and place less burden on the web server than they would otherwise. As long as all one-to-many joins are correctly defined in the admin console, there are no drawbacks to setting this flag to “True.”

Trade-off: Nothing for improved web server performance

 

Allow Editing ExpressView with Live Data - Set this flag to “False” to require that users only have access to sample data during ExpressView editing. This minimizes the number of times users query the database while they adjust their ExpressView design. This improves database and web server performance while simultaneously cushioning the user experience against frequent, potentially lengthy executions.

Trade-off: Reduced WYSIWYG for improved database performance, web server performance, and UX

 

Data Fields Search Mode - Set whether or not to show the data fields search box in the main left pane. This setting only provides a performance enhancement if column metadata is used and if the schema access type for the objects in question are set to “Metadata.” Without these measures, the search box’s type-to-search functionality causes the application to hit the database for schema information. Storing this information in the metadata reduces the likelihood that the search box will negatively impact performance and user experience. To prevent the metadata from going stale, admins must update it whenever columns are added, removed, or retitled.

Trade-off: Inconvenience of periodic manual updates to metadata for improved performance in search functionality

 

Fields Enabled in Data Fields Tree - This setting applies to ExpressViews only. Choose whether users may add any joinable field to the canvas or only those joined directly to fields already on the canvas. A joinable field is any field that bears some relation to the fields already on the report, even if they are separated by one or more intermediary objects. Allowing only direct joins (which require less processing than indirect joins) can improve database performance, especially if the database in question does not automatically identify and correct unnecessarily long join paths. Oracle, MsSQL, DB2, and other more mature databases typically do this on the fly by default.

Trade-off: More prescriptive join pathing for improved database performance

 

Maximum Number of Chart Data Points - By default, charts are set to display a maximum of 300 data points as limits exceeding that introduce the risk of browser slowdown. The setting can be lowered to mitigate browser performance issues caused by large charts. Users attempting to run a chart with more than the allotted number of data points will be prompted to modify the chart.

Trade-off: Decreased volume of data for improved browser performance

 

Use Sample Data for Dashboard Visualization Design - This setting is sister to “Allow Editing ExpressView with Live Data” above. Require that users building new ExpressView visualizations in the Dashboard Designer use sample data rather than live data during the design phase. Setting this flag to “False” will cause the visualization to execute against the database each time an integral (non-cosmetic) change is made. It’s worth noting here that there is a similar setting pertaining to Advanced Reports on dashboards, but this is configured on a report-by-report basis by the user in the Dashboard Designer proper.

Trade-off: Reduced WYSIWYG for improved database performance, web server performance, and UX

 

Save on Report Execution / Finish Press - If saving reports to the database takes a while, turn off autosave so that the execute and finish buttons work faster. Only the save button will save the report. Users who attempt to exit the report designer without saving changes will be prompted to save.

Trade-off: Less frequent saving for faster report design and execution

 

Allow Interactivity in Report Viewer - Exago BI’s interactive report viewer allows users to change report sorts, filters, and conditional formatting on the fly. Changes to sorts and filters cause report re-execution and can therefore result in delays. Use this flag to disable report viewer interactivity entirely.

Trade-off: Decreased functionality for increased performance

 

Max Number of Fields in Crosstab or Tabulation Header - Crosstab size multiplies with the addition of each Column Header, Row Header, and/or Tabulation Data Source because each successive field subdivides the one before it. For example, let’s say the first Row Header field is Employee Full Name. If the company in question only has ten employees, the crosstab will only contain ten detail rows. But let’s imagine that we’d like to see how much of each of the company’s 200 products each employee sold over the course of the year. Adding Product Name as the second Row Header field multiplies the number of products (200) by the number of employees (10) for a total of 2,000 rows. Capping the number of fields that can be added to any area of a crosstab helps prevent them from becoming too large and expensive to execute. The setting defaults to 5 fields.

Trade-off: Decreased crosstab complexity for increased crosstab performance

 

Read Database for Filter Values - By default, users adding a filter to their report may click a dropdown arrow to view a sorted list of values in that field. These values are pulled from the database, so there may be delays large or complex objects. One way of resolving this is to disable the filter dropdown feature, requiring that users key in the exact desired filter value (not case-sensitive). Alternatively, admins can assign large data objects less complex filter dropdown objects to help prevent slowdowns.

Trade-off: Diminished usability for increased speed in filter creation

 

Schedule Intraday Recurrence - Prevent users from making schedules that run multiple times per day. This helps protect the scheduler servers from overloading.

Trade-off: Diminished functionality for improved scheduler performance

 

Max Report Execution Time - Just as database timeouts protect the database server, max report execution time protects the execution server. An execution server is any server responsible for carrying out an execution, which might include the web server, remote execution server, and/or scheduler server. Use this setting to prevent server hangs.

Trade-off: Some terminated jobs for improved server performance and integrity

 

Cache External Services - Enable this setting to avoid performance issues with assemblies and web services, which can take time to load/compile as well as slow network traffic. Exago BI would refer to a cached file for the requisite data rather than call to the external source. The cache may be reset by restarting the web server (an option is to reload during low-traffic hours).

Trade-off: Data staleness for improved performance while using assembly and web service data sources

 

Bear in mind that enhanced speed can come at a cost to functionality. Improving speed may mean restricting access to features that are inherently performance-taxing. Optimizing your configuration means finding an ideal balance between performance and power.

Another caveat is that configuration optimizations should be made towards the end of a performance audit. They are not going to compensate for bottlenecks in hardware, network, data structure, or architecture. Prioritize those high-impact projects before making these finer adjustments.

 

Photo Credit: This modified version of "Soundboard" by Igor Putina are licensed under CC BY 2.0. 

 

 

 

Authored by
Nicole Hitner
Content Strategist
comments powered by Disqus
`