Are whole-column references slow to calculate?
Our take
User Safety: safe
Question
An issue that often arises on r/excel is: Are whole-column references slow?
Recent improvements
The answer is not as clear-cut as it used to be. Before improvements in recent years, whole-column references (like A:A) were uniformly slow to calculate. That's because Excel looked at all 1 million+ rows, even if they were empty.
But, in some situations, Excel now recognizes the used range rather than using a whole-column reference. In those situations, there is little or no difference in calculation speed.
Excel also caches data for some calculations (like lookups), which helps to speed up repeat calculations.
For more information about some of the improvements, see "Excel performance: Performance and limit improvements" https://learn.microsoft.com/en-us/office/vba/excel/concepts/excel-performance/excel-performance-and-limit-improvements
Experiment
Do these improvements mean that whole-column references are no longer slow?
Let's do an experiment, using an example post from yesterday: "Is there a way to speed up excel? " https://www.reddit.com/r/excel/comments/1tyr4p3/is_there_a_way_to_speed_up_excel/ where the poster used the formula:
=xlookup(1,(sheet2!A:A=A2)*(sheet2!G:G=B1),sheet2!H:H,"") I created some random data and copied the data and formulae down 5,000 rows (like the OP did, though they had 16 columns of data). That is:
- Columns A, B, and G contain the formula
=RANDBETWEEN(1,1000)in rows 1 to 5000. I use random data to ensure that each calculation is independent. - Column H is hard-coded integers 1 to 5000 in rows 1 to 5000.
- Column C contains a variation of the OP's formula in rows 1 to 5000.
I tested variations of the OP's formula to see how long it takes to recalculate the sheet. For all cases except the first, I recalculated the sheet 100 times using VBA to improve the accuracy of the timing. The average time per iteration is reported.
Results
The formula variations and average recalculation time results are:
| Formula | Description | Time (seconds) |
|---|---|---|
| =XLOOKUP(1,(A:A=A2)*(G:G=B1),H:H,".") | Whole-column references | 51.797 |
| =XLOOKUP(1,(A.:.A=A2)*(G.:.G=B1),H.:.H,".") | Trim range operators | 0.192 |
| =XLOOKUP(1,(TRIMRANGE(A:A)=A2)*(TRIMRANGE(G:G)=B1),TRIMRANGE(H:H),".") | TRIMRANGE function | 0.192 |
| =XLOOKUP(1,($A$1:$A$5000=A2)*($G$1:$G$5000=B1),$H$1:$H$5000,".") | Exact range | 0.133 |
| =XLOOKUP(1,([ColA]=A3)*([ColD]=[@ColB]),[ColE],".") | Structured references in a Table (except A3, which refers to the next row) | 0.192 |
Test PC: 28 thread i7 CPU running Excel 365 on Windows 11.
Repeating the experiment produces small, insignificant variations in the times.
Conclusion
In this situation, the results are very clear:
- Using the exact range is the fastest formula.
- Using a Table with Structured References and the two trim range variations are the same speed. They are each 44% slower than using the exact range, though the difference is small in absolute terms.
- Most importantly, the whole-column formula takes almost 400 times longer to recalculate compared with the exact range. That is, just under 1 minute, compared with almost instantaneous for all the other variations.
Thoughts?
[link] [comments]
Read on the original site
Open the publisher's page for the full experience
Related Articles
- How to deal with a bulky spreadsheet that is starting to hit the limits of Excel?Hello all, I have been venturing on quite the Excel journey the past year or so. I made a corporate spreadsheet that is approaching 500k formulas and that is starting to get serious speed issues at this point. It is 2026, so I conversed with ChatGPT several times regarding the speed issue, but realized I am way better off asking the experts here anyways. What is the problem So, my spreadsheet imports flat databases with specific information regarding objects that need further analysing. The imported flat databases run from say A tot CC or something, from which I probably draw about 12-15 datafields that are used for further analysis. It 'may' be more in the future. Afterwards, said data gets 'enriched' (manually) by things that aren't in the database, also because said data needs a human eye that cannot be automated. So far, so good. Right now, each object gets analysed from several different angles. As it stands, my spreadsheet runs from A until NA or something on the Formula Page. Many columns receive data from preceding columns, that are in the turn the result of many (slightly complex) logical IF or IFS tests, many of which are nested 3 or 4 deep. Often, they work in conjunction with X.LOOKUP to retrieve values, as the columns on the formula page are not equal. For example: A until BC on the Formula Page may analyze 150 objects, BD until DD may analyse 100 objects (from the same dataset, so narrower), and so forths. Thus a lot of X.LOOKUP is required, also because the first 'block' comes up with values that need to be found with X.LOOKUP. Also, values need to be retrieved from the flat database 'import' page with X.LOOKUP. Finally, X.LOOKUP is an insurance compared to FILTER, as I am not fully convinced that empty values in the flat database always contain a space (" "). To get to the point I use many IF, IFS, AND, and if need be, OR, formulas. Thinks: tens of thousands, probably in excess of 100k. These are compounded with X.LOOKUP, or X.LOOKUP gets used copiously without those. Here too, think tens of thousands. These formulas are - as much as possible - in array format, even though I find it controversial to do that as I consider how it can create a chain of updates throughout the spreadsheet. 'Dependencies' is the name of the game, with one object receiving many possible alterations / adjustments due to manual input data, for which the spreadsheet needs to provide. Right now, when I update a value, it may take up to 4 seconds to update the spreadsheet, which is already beyond the annoyance point for me. This leads me to these (hopefully) simple questions: Is it smart to use array formulas, knowing that each thing I change should only impact that one object line (for example, row 488) and none other? It is important to mention that object 1 does not influence object 488, or any other. Any manual data field only effects the object in the row it is in. In my mind, array formulas do not make sense in that regard, as it can result in a cascade of updates, but apparantly array formulas are 'way more efficient'. Is use of a VBA library the way to go to reduce lag and create more of an instant spreadsheet again? I am not able to code in VBA yet, but I am in the slow process of learning it regardless. Alternatively: should I use LET whenever a repeated lookup is needed in the same formula? Really looking for to your answers! submitted by /u/EvolvedRevolution [link] [comments]
- Excel Performance optimisation tips!Working in demand planning I have got it the point where I am making some pretty advanced files using a suite of techniques. My files often have lots of rows, with lots of Columns of complex formula including with sumifs, xloopup, ifs & Let. I’ve not advanced to using tables regularly though as I find the constraints & syntax annoying but am trying to get there & have started using power query to blend data for output analysis. The problem I am encountering is I filter ALOT drilling down into product groups etc, & excel tends to ‘hang’ a lot with ‘Not Responding’. Now I’m not sure it’s due to an underpowered machine (intel core i7 HP Elitebook) or, more likely lots of complex formula referencing ranges or tables. My question to the hive brain: share your optimisation tips & tricks! -Can Lamda combined with Let speed things up? -Are Tables vital to speeding up complex sumifs & lookups? - are match helper columns combined with Index leaner & faster than xlookup? Hit me with best tips & tricks! submitted by /u/NZGRAVELDAD [link] [comments]
- How to Identify Cause of Calculating Threads?I have a light excel file around 10mb. However, it takes a while to update / refresh with any formula given that it always shows Calculating threads (sometimes it's 6 threads, other times 14 threads) There are quite a number of worksheets - maybe around 10. formulas refer to other worksheets in the file and there are formulas linked to 10 external files (mostly VLOOKUP, XLOOKUP). Is there a quite way to identify what's causing the slowdown / calculating threads so I can address? by theory, what causes this? I'm tired of waiting 10mins just to get a simple sum or division. It seems like the Excel needs to recalculate all formulas every single time. Already tried making calculations manual instead of automatic. It helps not to update every single time, so I can wait for all formula adjustments before clicking F9. But, another 5-10mins wait once I do update. submitted by /u/Recent__Craft [link] [comments]