Ensuring high performance

To ensure the best possible updating (recalculation) performance for very large worksheets (containing millions of cells and large numbers of formulas), you should follow a few rules:

  1. Avoid using formulas returning arrays, like
    =b10:d2000000 * 5 + 1
    as it can prevent GS-Calc from taking the full advantage of multicore calculations.
  2. Avoid using arrays for intermediate results in formulas, like
    =match(v, b10:d2000000 <> 1, 0)
    Instead, fill e.g. e10:g2000000 with the formulas =b10 <> 1, =c10 <> 1 ... etc. and use
    =match(v, e10:e2000000, 0)
    The above doesn't matter if your have only a few match() formulas, but if you have thousands of them, the whole worksheet can be executed tens of times (or more) faster.
  3. The rand() function and its variants can be used by a single thread only, forcing other threads/cores to wait. If you use this function in large blocks to generate random data in very large worksheets with a large number of other time consuming formulas, it may pay off to generate the random data as static values with Insert > Random Data command instead to use multicore calculations effectively.
  4. Make sure you allowed GS-Calc to use the maximum number of threads/cores in the Settings > Options > Calculation window or in the Tools > Update Options dialog box. The former sets the default, global update options while the latter can overwrite these options for each workbook individually.
  5. If you use the vlookup() and match() functions for large data sets, always do one the following:

    For the binary searching mode the performance gain increases significantly with the number of cells. Plain vlookup()/match() use the slow, sequential mode require on average N/2 steps to find a value in an N-element range. In the binary mode only log2(N) steps can be required - for 10 million rows its over 200000 times less.
    As the total execution time for lookup functions depends also on other pre- and post-searching operations, the real-life performance gains will be most spectacular for large ranges or for updating large numbers of such functions.

  6. If you use the sumIf() or countIf() functions for large data sets, by default they will apply fast binary searching for ranges specified as columns that include at least 16384 rows. For such ranges the functions will behave like the above look-up functions in the fast mode.
    You can change that row limit in the saved settings.xml configuration file modifying the if-calc-treshold value. If you set some value larger than the maximum number of rows, it'll effectively turn off fast binary searching for these two functions.
    The typical location of this file is:
    C:\Users\your-name\AppData\Local\GS-Calc
    If you choose a portable setup, the file will be in the folder you specify as the target setup folder for all files.

  7. A general memory usage tip: deleting a very large block of cells doesn't immediately free the memory occupied by it as it's placed in the "Undo" buffer. Depending on the "Undo" level you define in Settings > Options, the RAM memory is released when other editing actions replace it in that cyclic buffer. You can also reset that entire buffer manually at any moment.