Extracting all formulas from a multi-sheet Excel file into R (best approach?)
I’m working with a large, convoluted Excel workbook that has multiple sheets, lots of formulas, and some circular references. The goal isn’t just to read the data but to understand and reconstruct the logic. I'm sick and tired of going back and forth between sheets, and I keep forgetting what the logic is, so I have to start all over again.
I'm currently going cell by cell and writing down all the formulas so that they will be easier to reference later.
But I don't know of a way to extract all the formulas at once into a Markdown document and build something like a map of them.
I use R quite a bit. Is there an easy way to extract all the formulas at once and map them?
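One possible starting point, sketched in Python rather than R and using only the standard library: an .xlsx file is a zip archive of XML parts, and each stored formula sits in an `<f>` element inside its cell. The function names (`extract_formulas`, `to_markdown`, `sheet_dependencies`) and the table layout are made up for illustration. Shared formulas are written only on their first cell, so the dependent cells are skipped here. The sheet-reference regex is a rough heuristic, not a real formula parser.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the SpreadsheetML parts inside an .xlsx archive.
MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"

# Rough heuristic: a quoted or bare sheet name followed by "!".
SHEET_REF = re.compile(r"(?:'([^']+)'|([A-Za-z_][\w.]*))!")


def extract_formulas(path):
    """Return (sheet, cell, formula) tuples for every formula stored in the file."""
    rows = []
    with zipfile.ZipFile(path) as zf:
        # Map relationship ids to the worksheet parts they point at.
        rels = ET.fromstring(zf.read("xl/_rels/workbook.xml.rels"))
        targets = {r.get("Id"): r.get("Target")
                   for r in rels.iter(f"{{{PKG_REL}}}Relationship")}
        workbook = ET.fromstring(zf.read("xl/workbook.xml"))
        for sheet in workbook.iter(f"{{{MAIN}}}sheet"):
            target = targets[sheet.get(f"{{{DOC_REL}}}id")]
            # Targets are normally relative to xl/, occasionally absolute.
            part = target.lstrip("/") if target.startswith("/") else "xl/" + target
            root = ET.fromstring(zf.read(part))
            for cell in root.iter(f"{{{MAIN}}}c"):
                f = cell.find(f"{{{MAIN}}}f")
                # Cells that only inherit a shared formula have no text; skip them.
                if f is not None and f.text:
                    rows.append((sheet.get("name"), cell.get("r"), "=" + f.text))
    return rows


def to_markdown(rows):
    """Render the extracted formulas as a Markdown table."""
    lines = ["| Sheet | Cell | Formula |", "| --- | --- | --- |"]
    for sheet, cell, formula in rows:
        # Escape pipes so a formula cannot break the table layout.
        escaped = formula.replace("|", "\\|")
        lines.append(f"| {sheet} | {cell} | `{escaped}` |")
    return "\n".join(lines)


def sheet_dependencies(rows):
    """Map each sheet to the set of other sheets its formulas mention."""
    deps = {}
    for sheet, _cell, formula in rows:
        for quoted, bare in SHEET_REF.findall(formula):
            other = quoted or bare
            if other != sheet:
                deps.setdefault(sheet, set()).add(other)
    return deps
```

Calling `to_markdown(extract_formulas("model.xlsx"))` (the file name is a placeholder) would give one table covering every sheet, and `sheet_dependencies` gives a coarse sheet-to-sheet map that can help locate where circular references cross sheets. For an R-native route, the tidyxl package's `xlsx_cells()` returns one row per cell with a `formula` column, which may be a more direct fit.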