To ensure the best possible updating (recalculation) performance for very large worksheets (containing millions of cells and large numbers of formulas),
you should follow a few rules:
- Avoid using formulas returning arrays, like
=b10:d2000000 * 5 + 1
as it can prevent GS-Calc from taking the full advantage of multicore calculations.
- Avoid using arrays for intermediate results in formulas, like
=match(v, b10:d2000000 <> 1, 0)
Instead, fill e.g. e10:g2000000 with the formulas =b10 <> 1, =c10 <> 1 ... etc. and use
=match(v, e10:e2000000, 0)
The above doesn't matter if your have only a few match() formulas, but if you have thousands of them, the whole worksheet can be executed tens of times (or more) faster.
- The rand() function and its variants can be used by a single thread only, forcing other threads/cores to wait.
If you use this function in large blocks to generate random data in very large worksheets with a large number of other time consuming formulas,
it may pay off to generate the random data as static values with Insert > Random Data command instead to
use multicore calculations effectively.
Make sure you allowed GS-Calc to use the maximum number of threads/cores in the Settings > Options > Calculation window
or in the Tools > Update Options dialog box. The former sets the default, global update options while the latter
can overwrite these options for each workbook individually.
If you use the vlookup(), match() functions for large data sets, always do one the following:
Use the default Tools > Update Options > VLOOKUP and MATCH mode > Fast with binary searches option;
this option is defined for each workbook individually and it causes all standard, basic versions of these functions in a given workbook
to utilize the fast binary searching. To enable this, GS-Calc performs internal (not changing the worksheet layout) background sorting of the searched ranges during
the first update and then keeps them up to date if they are later modified. This means that the first update might be slower and further updates will be
orders of magnitude faster.
Note that with this option you can't use regular expression as a search pattern because it requires sequential retrieving of the data.
Use the extended versions of these functions with additional parameters that explicitly specify the binary searching
mode individually for each their occurrence.
For the binary searching mode the performance gain increases significantly with the number of cells. Plain vlookup()/match() use the slow, sequential mode
require on average N/2 steps to find a value in an N-element range. In the binary mode only log2(N) steps can be required - for 10 million rows its over 200000 times less.
As the total execution time for lookup functions depends also on other pre- and post-searching operations, the real-life performance gains
will be most spectacular for large ranges or for updating large numbers of such functions.
A general memory usage tip: deleting a very large block of cells doesn't immediately free the memory occupied by it as it's placed
in the "Undo" buffer. Depending on the "Undo" level you define in Settings > Options, the RAM memory is released
when other editing actions replace it in that cyclic buffer. You can also reset that entire buffer manually at any moment.