Brainstorming a new table layout
Our take
Hey everyone.
I have a predicament at work and I can't figure it out. Reaching out to the internet for assistance.
I have a giant table for pricing that at a high level is rows of products (numbering around 500) and columns (around 300) representing cost, product sizes, ID's, and discounts. Currently these are broken out by a national section and 6 division (state groupings) sections that have their own discount columns. It's cumbersome due to its size but the person charged with entering the data likes it, so it has stayed this way.
However, the business has outgrown it and now wants to get more granular with the data. They need discounts at the state and city level. Which I could just expand our current table but I estimate that would create a file with around 2,000+ columns, and around 95% of those would be blank.
The reason it gets so long is that each discount has to have its own column and that discount could have all/none of the products. So every level of granularity just compounds this issue.
I'm here looking for better ways to handle this data.
Right now the best I've come up with is that I create a 2nd workbook that just handles the state/city level, it would still be awkward and add a lot of duplicate work though (maintaining product names/ids/etc)
Currently I run the workbook through a power query which condenses it all and spits out individual sales books based on region data. I would plan on combining these two books into one dataset in the future. And that's an issue for another time.
I'm decent with excel/power query. I'm the company "excel guy" but I know enough to know that I don't know much.
[link] [comments]
Read on the original site
Open the publisher's page for the full experience
Related Articles
- In series drop down filters from multiple sheetsPortion rant and question. Unfortunately, being able to create pivot tables has earned me the title of "Excel Wizard" in my office, and I have been tasked with creating a dashboard to pull filtered data from several sheets easily. The database I am pulling from is fairly large, and outside of my abilities, YouTube and online searches are not getting me the exact answer I need. In theory, the end result will be a dashboard with two dropdown filters. The first is to select the specific location (37 total), and the second is to select information from 10 separate sheets, like contact information, contract expirations, insurance policy information, equipment information, etc. They would also like each category to have its own sheet so the information can be looked at as a whole. I have pushed to have seperate excel files for each location with the information needed, but they want one place to view and edit all of the data. The other caveat to this is that since pivot tables are a mind-boggling creation, I fear any complex formulas or functions may get damaged as they try to edit/update information in the data sheets. My initial thought was to consolidate all of the information onto one sheet, but the different headers/information types stopped that plan quickly. Besides advocating more for some type of software to store this information and accomplish this "dashboard" need, is there a solution to my problem? submitted by /u/Few-Combination-9985 [link] [comments]
- What would you do with this task, and how long would it take you to do it?I'm going to describe a situation as specifically as I can. I am curious what people would do in this situation, I worry that I complicate things for myself. I'm describing the whole task as it was described to me and then as I discovered it. Ultimately, I'm here to ask you, what do you do, and how long does it take you to do it? I started a new role this month, I am new to advertising modeling methods like mmm, so I am reading a lot about how to apply the methods specific to mmm in R and python, I use VScode, I don't have a github copilot license, I get to use copilot through windows office license. Although this task did not involve modeling, I do want to ask about that kind of task another day if this goes over well. The task 5, excel sheets are to be provided. You are told that this is a clients data that was given to another party for some other analysis and augmentation. This is a quality assurance task. The previous process was as follows; the data the data structure: 1 workbook per industry for 5 industries 4 workbooks had 1 tab, 1 workbook had 3 tabs each tab had a table that had a date column in days, 2 categorical columns advertising_partner, line_of_business and at least 2 numeric columns per work book. some times data is updated from our side and the partner has to redownload the data and reprocess and share again the process this is done once per client, per quarter (but it's just this client for now) open each workbook navigate to each tab the data is in a "controllable" table bing bing home home impressions spend partner dropdown line of business dropdown where bing and home are controlled with drop down toggles, with a combination of 3-4 categories each. compare with data that is to be downloaded from a tableau dashboard end state: the comparison of the metrics in tableau to the excel tables to ensure that "the numbers are the same" the categories presented map 1 to 1 with the data you have downloaded from tableau aggregate the data in a pivot table, select the matching categories, make sure the values match additional info about the file the summary table is a complicated sumproduct look up table against an extremely wide table hidden to the left. the summary table can start as early as AK and as late as FE. there are 2 broadly different formats of underlying data in the 5 notebooks, with small structure differences between the group of 3. in the group of 3 the structure of this wide table is similar to the summary table with categories in the column headers describing the metric below it. but with additional categories like region, which is the same value for every column header. 1 of these tables has 1 more header category than the other 2 the left most columns have 1 category each, there are 3 date columns for day, quarter. REGION USA USA USA PARTNER bing bing google LOB home home auto impressions spend ...etc date quarter impressions spend ...etc 2023-01-01 q1 1 2 ...etc 2023-01-02 q1 3 4 ...etc in the group of 2 the left most categories are actually the categorical headers in the group of 3, and the metrics, the values in each category mach the dates are now the headers of this very wide table the header labels are separated from the start of the values by 1 column there is an empty row immediately below the final row for column headers. date Label 2023-01-01 2023-01-02 year 2023 2023 quarter q1 q1 blank row REGION PARTNER LOB measure blank row US bing home impressions 1 3 US bing home spend 2 4 US google auto ...etc ...etc ... etc The question is, what do you do, and how long does it take you to do it? I am being honest here, I wrote out this explaination basically in the order in which I was introduced to the information and how I discovered it. (Oh it's easy if it's all the same format even if it's weird, oh there are 2-ish different formatted files) the meeting of this task ended at 11:00AM. I saw this copy paste manual etl project and I simply didn't want to do it. So I outlined my task by identifying the elements of the table, column name ranges, value ranges, stacked / pivoted column ranges, etc... for an R script to extract that data. by passing the ranges of that content to an argument make_clean_table(left_columns="B4:E4", header_dims=c(..etc)) and functions that extract that convert that excel range into the correct position in the table to extract that element. Then the data was transformed to create a tidy long table. the function gets passed once per notebook extracting the data from each worksheet, building a single table with the columns for the workbook industry, the category in the tab, partner, line of business, spend, impressions, etc... IMO; ideally (if I have to check their data in excel that is), I'd like the partner to redo their report so that I received a workbook with the underlying data in a traditionally tabular form and their reporting page to use power query and table references and not cell ranges and formula. submitted by /u/TheTresStateArea [link] [comments]
- Power Query to tackle consolidating large database to reports or other approach?I currently manage getting some financial reports out to multiple manufacturing facilities from a large corporate snowflake database. I've struggled with multiple challenges in our current process and I'm rethinking this from the beginning, I have an idea but before spending tons of time trying to implement it and possibly hitting a brick wall I wanted to throw it out here to see if you can save me some time/pain. So currently I have data connections from snowflake to a big Master Excel file that selects the fiscal year and period and imports large amounts of data. This data is then cross references with other tables in the Master Excel file to group the transactions to certain categories and business units the plants want to see. This grouping data does not exist in the snowflake database. We then filter this data to each of the specific plants and copy/paste (which has a couple issues/problems I just cannot seem to solve) into workbooks that then present it the way they want to see it, sorted by categories and departments but still with all the individual transaction details for their plant to investigate curious issues. This is prepared and sent out weekly for the fiscal months (shows last week info and month to date info). I had previously set up an access database that I could then link to from excel to do something similar so I was considering building the links and filters into each plant and using power query to access/import the categorized/filtered data into each plants file eliminating the middle step and any copy/pasting. This is option 1. Option 2 and my real question, can I set a similar query up in an excel file to link to the snowflake databases, with the categorizing tables and such in it and then use power query to import from this Excel query master file to do something without using Access? Benefit being maintenance by other people in the org that are not familiar with access. My big concern being the snowflake databases are huge and without the current year/period filter would easily surpass the number of rows excel will handle. The month/period and plant filters would be in the individual plant files importing only current/relevant info, but the Master Excel Query would be open, set to import the whole history, but that file would never be opened/used except to possibly update a manual lookup table for a new account or business unit. Would this work over the access approach? submitted by /u/robbro9 [link] [comments]
- How to organize a sheet based on how many times a certain value in a column is duplicated, and have all other columns follow?Hope the title is descriptive enough... i feel like i always struggle to describe excel stuff efficiently. I am a complete excel beginner! So i have a dataset that is 3000+ rows long. For the sake of ease (and also to not share PHI), I made a shortened 27-row long example; this is what is shown in the screenshots. The data I'm working with is downloaded from a website we use to give people questionnaires. Sometimes, people do not have very straightforward answers to each question, so we type in "comments" in those cases to help clarify the exact answer participants gave. The data I'm working with lists the ID number we gave to each participant; the "variable" AKA the name of the question (tells me exactly which question it is in our questionnaire); which "session" AKA appointment the question is from (we repeat the questionnaire multiple times per participant throughout a year, some in-person, some over the phone); which coworker left the comment (commenter); and finally the actual comment itself. We are trying to see which questions/variables were given comments most often. This is a replica of what I have: https://preview.redd.it/9lvlnwplfgng1.png?width=1025&format=png&auto=webp&s=4716f993d500383af6307b8fb1724fef5b8626e1 This is what i WANT it to sort of look like: https://preview.redd.it/3d4w5a6ofgng1.png?width=980&format=png&auto=webp&s=dd32c658aceadd0afe5c427f0ec19177226d5081 In other words, I want: A count of how many times each type of variable repeats in the excel file. the dataset to be organized from which variable appears the MOST often at the top, and the ones that show up the LEAST at the bottom. to get rid of all rows that were for a "test" participant (notice the "test" rows 8, 18, and 26 in the first screenshot are gone in the second). to be able to find an exact question from the questionnaire website based on this sheet. for example, if I wanted to look more into the context behind row 5's comment, I would know to go to our questionnaire website, go to participant 111's questionnaires from the Lab 1 appointment and specifically look at the alcohol_amount1 question. In other words it is important to keep the participant number and session information. It doesn't need to look exactly like the second picture, that was just the first way to organize it that came to mind. As long as it fills the above requirements that all I need that's what matters. I was trying out pivot tables but I couldn't really get it to look in a way that made sense to me. I really don't know what else to do besides comb through all 3000+ rows one by one... sorry if any of this doesnt fit the exact posting rules. I tried. thanks for any help in advance🥹 submitted by /u/soupysyrup [link] [comments]