Power Query for a large dataset
My company uses a horrible format for its daily production sheets, but the data can be pulled through Power Query.
I want to build a reporting tool to surface major trends that are currently being missed, ideally part efficiency by machine type plus some other descriptive data, such as efficiency by shift manager.
My problem is that even after cutting unnecessary columns and filtering out unnecessary rows, it takes forever to load anything. ChatGPT isn't all that helpful, so I'd like some expert advice please!
For info, there are roughly 50,000 rows of data per year, and I want to cover at least the last three years.
Sheets are all saved into a folder by month, within a folder by year.
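To make the layout concrete, here is a rough Python sketch of how the files sit on disk and how the ones for a given range of years could be picked out. It is only an illustration of the folder structure, not what Power Query does internally. The numeric year folder names (e.g. `2023`), the `.xlsx` extension, and the function name `list_workbooks` are all assumptions made for the example.

```python
from pathlib import Path


def list_workbooks(root, first_year, last_year, pattern="*.xlsx"):
    """Return workbook paths under root/<year>/<month>/ for the given year range.

    Assumes (hypothetically) that year folders have purely numeric names
    and that each one contains month folders holding the daily sheets.
    """
    found = []
    for year_dir in sorted(Path(root).iterdir()):
        # Skip anything that is not a folder with a numeric (year-like) name.
        if not (year_dir.is_dir() and year_dir.name.isdigit()):
            continue
        # Keep only the years inside the requested range.
        if not first_year <= int(year_dir.name) <= last_year:
            continue
        # Walk each month folder and collect the matching workbooks.
        for month_dir in sorted(p for p in year_dir.iterdir() if p.is_dir()):
            found.extend(sorted(month_dir.glob(pattern)))
    return found
```

Calling `list_workbooks(root, 2022, 2024)` would return only the files in those three year folders. At roughly 50,000 rows per year, that range works out to about 150,000 rows in total.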