Performance optimization

This article identifies some points to watch out for when dealing with large documents (i.e. with large amounts of data or a large number of formulas).

Document

Version

In general, use the latest version of the modules. Improvements are constantly being made in terms of both functionality and performance.

Formulas

The number of formulas in the document is accessible in Design via the “Tools” menu.

Generally, if the document contains more than 100,000 formulas, it is advisable to try to optimise it.

Eliminate all redundant formulas: if a costly calculation (MATCH, COUNTIF, SUMIF, etc.) has already been carried out in another cell, point to that cell instead of repeating the calculation.
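
For example, instead of repeating the same costly calculation in several formulas (cell references and criteria are illustrative):

  C2: =IF(SUMIF($A:$A,"Total",$B:$B)>0,"OK","KO")
  C3: =IF(SUMIF($A:$A,"Total",$B:$B)>100,"High","Low")

compute it once in a helper cell and point to that cell:

  D1: =SUMIF($A:$A,"Total",$B:$B)
  C2: =IF($D$1>0,"OK","KO")
  C3: =IF($D$1>100,"High","Low")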

Use all costly functions with care:

Volatile functions used in too large a number or referenced too frequently: formulas that depend on cells containing a volatile calculation are recalculated on every input in any cell (a non-volatile substitution is sketched after the list below).

  • NOW
  • OFFSET
  • INDIRECT
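
As a sketch of the kind of substitution that avoids volatility (ranges are illustrative), a dynamic sum written with OFFSET is volatile, whereas the same range built with INDEX is not:

  Volatile:      =SUM(OFFSET($A$1,0,0,$C$1,1))
  Non-volatile:  =SUM($A$1:INDEX($A:$A,$C$1))

Both formulas sum the first $C$1 cells of column A, but the INDEX version is only recalculated when its precedents change.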

The lookup functions

  • VLOOKUP
  • HLOOKUP
  • LOOKUP
  • MATCH
  • COUNTIF
  • COUNTIFS
  • SUMIF
  • SUMIFS

In addition, the lookup functions COUNTIF, COUNTIFS, SUMIF and SUMIFS should not be used with numerically convertible text criteria on large ranges containing strings: as in Excel, each string will be subject to an attempted numeric conversion.
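
For instance (references illustrative), on a large range of strings a criterion such as "123" in

  =COUNTIF($A$1:$A$100000,"123")

forces an attempted numeric conversion of every string in the range before comparison. Storing genuinely numeric data as numbers, or avoiding numerically convertible text criteria on such ranges, sidesteps this cost.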

Always use an INDEX(MATCH(…)) combination instead of VLOOKUP (or LOOKUP).

Advantages of INDEX(MATCH(…)):

  • adding rows and columns in the lookup ranges does not require the column numbers to be revised (as it does with VLOOKUP)
  • the lookup result (MATCH) can be calculated in a separate cell if you wish to reuse it (see the sketch below)
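
A minimal sketch (sheet, range and cell references are illustrative):

  Instead of:
    =VLOOKUP($A2,Data!$A$2:$H$5000,5,FALSE)
  write the MATCH in a helper cell so that it can be reused:
    B2: =MATCH($A2,Data!$A$2:$A$5000,0)
    C2: =INDEX(Data!$E$2:$E$5000,$B2)
    D2: =INDEX(Data!$G$2:$G$5000,$B2)

Here the costly MATCH is computed once in B2 and reused by both INDEX formulas.
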
Reduce the size of string constants used in formulas, and/or write strings in separate cells and point to these cells in the formulas.

Components

Remove all unnecessary components (e.g. controls containing formulas whose results are not useful during integration)

For drop-down lists with a high volume (more than 2,000 entries):

  • Try to use linked lists as much as possible, with the OFFSET syntax for the data source.
  • Try to use open drop-down lists (allowing free input, via the E flag) and check that the entered item belongs to the list with a validation constraint (via a MATCH; see the sketch after this list). In some cases, the improvement during constraint checking can be significant.
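
A minimal sketch of such a check, assuming the list sits in List!$A$1:$A$2500 and the input cell is B3 (the exact way to declare the validation constraint depends on your GT version; see the product documentation):

  =NOT(ISNA(MATCH($B$3,List!$A$1:$A$2500,0)))

The formula returns TRUE when the entered item appears in the list; a single MATCH over the list is what keeps the check cheap.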

Use components without response (NR flag) when the component is a custom control whose value does not have to be recovered during integration (typically, reference lists in patterns that the correspondent cannot modify, add to or delete from).
This optimisation is only useful for GT versions earlier than 3.9.

Patterns

As soon as a pattern exceeds 5,000 rows, great care must be taken.

The base range of the pattern (the cells duplicated by the pattern) must be examined carefully to eliminate any redundant calculations or calculations that are not strictly necessary.

If a pattern has a large number of editable rows, its data will probably not be entered directly into the questionnaire but via an Excel import (few people will fill in more than 5,000 rows without copy/pasting from other rows or other data sources). In this case, do not hesitate to sacrifice the questionnaire's ergonomics (formatting, locks, etc.) for the sake of performance: the questionnaire then serves as a data vehicle and checker rather than as an input interface.

When importing from Excel, locks are very costly: they are checked on each imported row to ensure that GTAnswer has the right to enter the value in a given cell once the cells without locks have been filled in (a dependency calculation based on the lock formulas is carried out, but the check during the Excel import remains costly).
It is then preferable to replace the locks with constraints (gtconstraint); see the Excel import section below.

Multi-tabs

When the number of multi-tabs becomes large (and non-multi-tab summaries are required), usage limitations may come into play, both in GTAnswer and in the Excel export (see Multi-tab summary limitations).

Display

Specify only simple formulas in conditional formatting: the calculations themselves should be done in the sheet (see the example after this list). This makes it possible:

  • to increase the performance of the display
  • to factor out certain calculations
  • to locate dependencies between formulas easily (Excel's formula auditing tools do not trace dependencies in formulas contained in conditional formatting)
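
For example (cell references are illustrative), rather than putting a heavy condition directly in the conditional formatting rule:

  rule on B2: =COUNTIF($A$2:$A$5000,$B2)>1

compute it once in a helper cell in the sheet and keep the rule trivial:

  Z2 (helper): =COUNTIF($A$2:$A$5000,$B2)>1
  rule on B2:  =$Z2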

Avoid zooming below 50%.

Memory

Check the memory used by GTAnswer when the document is loaded.

Serious consideration should be given to restructuring a questionnaire that takes up more than 1 GB: splitting it into several questionnaires, moving certain elements into an input matrix, optimising redundant formulas or data, etc.

Transposition

If several transpositions with the same number of dimensions address a large number of components (several thousand or tens of thousands), it is preferable to add one or more dummy dimensions to one of them so that they can be differentiated by their number of dimensions.
This optimises processing by GTServer.

Excel import

Locks are very costly during the Excel import: they are checked (and therefore recalculated) after each imported value is entered, to make sure that GTAnswer has the right to enter the value in the given cell once the cells without locks have been filled in (a dependency calculation based on the lock formulas is carried out, but the check during the Excel import is still costly).
If a very large number of cells are involved in the Excel import (through transpositions, patterns or anything else), it is better to replace the locks with constraints (gtconstraint).
You can confirm that the wait times are due to locks by deleting all the locks and measuring the Excel-to-GTAnswer import times again. You can also identify which locks are bottlenecks by removing lock declarations by dichotomy, or by checking the number of cells containing input components that are involved in each lock's condition calculation.

Finally, if a very large number of components are declared in the questionnaire (which occurs, for example, when transpositions are used), each of these components must be given an equivalent field name in Excel. However, Excel does not manage several thousand field names in the same workbook optimally (counting both global and sheet-local fields in the case of a multi-tab questionnaire). The Excel import can therefore be slowed down by the Excel-GTAnswer communication if the transpositions concern a very large number of components.
Preferably use patterns where possible.
You can tell that the large number of named fields in Excel is the bottleneck of the Excel-GTAnswer import when, after launching the Excel import and choosing the workbook in GTAnswer, the dialogue box for selecting the sheets to import takes a long time to appear.

Data integration

Determine which integration steps consume resources. To do this, activate the SQL query dump in the logs (via the General tab in GTAdmin). The integration is carried out in several steps:
– Extraction of the data from the compartment
– Extraction of the response to be integrated (validated or last)
– Transposition
– Creation of the receiving table (or use of the intermediate integration table)
– Insertion of the records
– SQL construction of the key/pivot table and placement of indexes; in the case of updates on key fields, creation of the table of records to be inserted
– Deletion (if pivot) by SQL of records in the final table
– Insertion of the records from the intermediate/temporary table into the final table

Examining the logs will help determine the most costly steps and act accordingly (integrate only validated responses, modify transposition definitions, place indexes on certain tables, prepare certain tables, etc.).

Other suggestions:

Increase the memory available on the machine hosting GTServer.

Host GTServer and the database server (hosting both the GT database and the client database) on different physical machines.

Place indexes, unique if possible, on the pivot axes or on the key fields of the destination tables.
Ideally, the indexes should be unique in the case of key field updates.

Use intermediate integration tables (even without an intermediate integration view) on which indexes corresponding to pivot fields or keys should be built.
Ideally, these indexes should be unique and covering for key field updates.

Try to integrate only the validated response.

Maintenance of GTServer

See the installation guide, which contains sections on maintenance and on improving operations.
