3 min readfrom Microsoft Excel | Help & Support with your Formula, Macro, and VBA problems | A Reddit Community

Are whole-column references slow to calculate?

Our take

User Safety: safe

Question

An issue that often arises on r/excel is: Are whole-column references slow?

Recent improvements

The answer is not as clear-cut as it used to be. Before improvements in recent years, whole-column references (like A:A) were uniformly slow to calculate. That's because Excel looked at all 1 million+ rows, even if they were empty.

But, in some situations, Excel now recognizes the used range rather than using a whole-column reference. In those situations, there is little or no difference in calculation speed.

Excel also caches data for some calculations (like lookups), which helps to speed up repeat calculations.

For more information about some of the improvements, see "Excel performance: Performance and limit improvements" https://learn.microsoft.com/en-us/office/vba/excel/concepts/excel-performance/excel-performance-and-limit-improvements

Experiment

Do these improvements mean that whole-column references are no longer slow?

Let's do an experiment, using an example post from yesterday: "Is there a way to speed up excel? " https://www.reddit.com/r/excel/comments/1tyr4p3/is_there_a_way_to_speed_up_excel/ where the poster used the formula:

=xlookup(1,(sheet2!A:A=A2)*(sheet2!G:G=B1),sheet2!H:H,"") 

I created some random data and copied the data and formulae down 5,000 rows (like the OP did, though they had 16 columns of data). That is:

  • Columns A, B, and G contain the formula =RANDBETWEEN(1,1000) in rows 1 to 5000. I use random data to ensure that each calculation is independent.
  • Column H is hard-coded integers 1 to 5000 in rows 1 to 5000.
  • Column C contains a variation of the OP's formula in rows 1 to 5000.

I tested variations of the OP's formula to see how long it takes to recalculate the sheet. For all cases except the first, I recalculated the sheet 100 times using VBA to improve the accuracy of the timing. The average time per iteration is reported.

Results

The formula variations and average recalculation time results are:

Formula Description Time (seconds)
=XLOOKUP(1,(A:A=A2)*(G:G=B1),H:H,".") Whole-column references 51.797
=XLOOKUP(1,(A.:.A=A2)*(G.:.G=B1),H.:.H,".") Trim range operators 0.192
=XLOOKUP(1,(TRIMRANGE(A:A)=A2)*(TRIMRANGE(G:G)=B1),TRIMRANGE(H:H),".") TRIMRANGE function 0.192
=XLOOKUP(1,($A$1:$A$5000=A2)*($G$1:$G$5000=B1),$H$1:$H$5000,".") Exact range 0.133
=XLOOKUP(1,([ColA]=A3)*([ColD]=[@ColB]),[ColE],".") Structured references in a Table (except A3, which refers to the next row) 0.192

Test PC: 28 thread i7 CPU running Excel 365 on Windows 11.

Repeating the experiment produces small, insignificant variations in the times.

Conclusion

In this situation, the results are very clear:

  • Using the exact range is the fastest formula.
  • Using a Table with Structured References and the two trim range variations are the same speed. They are each 44% slower than using the exact range, though the difference is small in absolute terms.
  • Most importantly, the whole-column formula takes almost 400 times longer to recalculate compared with the exact range. That is, just under 1 minute, compared with almost instantaneous for all the other variations.

Thoughts?

submitted by /u/SolverMax
[link] [comments]

Read on the original site

Open the publisher's page for the full experience

View original article

Related Articles

Tagged with

#Excel alternatives for data analysis#Excel compatibility#Excel alternatives#big data performance#rows.com#generative AI for data analysis#AI formula generation techniques#real-time data collaboration#formula generator#big data management in spreadsheets#conversational data analysis#intelligent data visualization#data visualization tools#enterprise data management#data analysis tools#data cleaning solutions#natural language processing for spreadsheets#financial modeling with spreadsheets#real-time collaboration#no-code spreadsheet solutions