Power Query: how do I change a row value based on results of a count of all rows?
Our take
In the realm of data management, the ability to manipulate and analyze information efficiently is paramount. The recent inquiry into Power Query's capabilities highlights a common challenge faced by users: how to intelligently manage duplicates while considering varying comment values associated with specific data points. The situation described—where multiple records exist for a combination of symbols, dates, and short fields, each having different comments—illustrates the complexity found in seemingly straightforward data sets. The proposed solution to standardize comments based on the presence of "high" values before deduplication is not merely a technical exercise; it reflects a broader need for effective data governance in an increasingly data-driven world. This is particularly relevant as organizations aim to leverage data for strategic insights, making tools like Power Query indispensable for streamlining workflows.
The essence of the problem speaks to a larger trend in data management, where the emphasis is shifting from merely collecting data to deriving actionable insights. Users are not only looking for tools that can handle vast amounts of information but also for those that can provide clarity and coherence in their analysis. The issue raised by the user about how to preprocess comments before deduplication can serve as a microcosm of the challenges faced across various sectors. For instance, in financial analysis, where data accuracy and clarity can significantly impact decision-making, ensuring that only the most relevant and highest priority comments are retained is crucial. This aligns with the inquiry presented, where the distinction between "high," "low," and "bad" comments can influence overall assessments and recommendations.
Moreover, the technical challenge of implementing this logic in Power Query offers an opportunity for users to explore the deeper functionalities of the tool. The user's current approach, while commendable, might benefit from further refinement, particularly concerning conditional logic and grouping operations. By embracing the capabilities of Power Query, users can not only resolve their immediate challenges but also unlock a greater understanding of how to manipulate data effectively. This aligns with the ongoing discourse around leveraging AI and advanced analytics to enhance productivity. For those interested in enhancing their skills, resources such as “Power Query - is there a way to combine columns from separate sheets so that rows with the same 'name' in a specified columns always match?” provide valuable insights into harnessing these tools for complex data management tasks.
As we consider the implications of this inquiry, it becomes evident that there is a pressing need for improved educational resources and community support surrounding tools like Power Query. While the functionalities are robust, users often find themselves navigating a steep learning curve. This serves as a reminder of the importance of fostering a supportive environment where users can share their challenges and solutions, as seen in discussions around topics like “Your Prototype Is Not Being Honest With Your Users (And Here’s How To Fix It)” and others that highlight the user experience in technology.
Looking forward, the ability to seamlessly manage and analyze data will only become more critical as organizations rely on insights derived from complex data sets. As users continue to seek innovative solutions that simplify their workflows, the conversation around tools like Power Query will grow in significance. The question remains: how can further advancements in user-friendly data manipulation technologies empower users to transform their data practices effectively? This ongoing exploration will undoubtedly shape the future landscape of data management, fostering a culture of continuous improvement and innovation.
I am trying to take a table that contains 4 relevant fields (symbol, date, shortfield, and comment) and remove duplicates such that there is one record for each symbol, date, and shortfield.
Where a sym&date&shortfield exists with different non-blank comments, I would like to edit the comment before removing duplicates: if one of those comments is "high", change all comments for that combo to blank. otherwise, change those comments to "low".
Here's my current code:
let Source = Excel.CurrentWorkbook(){[Name="fidstats"]}[Content], #"SelectedTypes" = Table.SelectRows(Source, each ([Type] = "Ind Rank" or [Type] = "Metric" or [Type] = "Ratio Calc")), #"RevisedField" = Table.AddColumn(SelectedTypes, "ShortField", each if [Field] = "Valuation" then "val" else if [Field] = "Quality" then "qual" else if [Field] = "Stability" then "stab" else if [Field] = "Health" then "health" else if Text.Contains([Field],"EPS G") then "eps" else if [Field] = "Proj EPS" then "proj eps" else if [Field] = "Fwd EPS" then "fwd eps" else if Text.Contains([Field],"ROI") then "roi" else if Text.Contains([Field],"Ratio") then "ratio" else if Text.Contains([Field],"D/E") then "d/e" else "", type any), #"RevisedValue" = Table.AddColumn(RevisedField, "RevVal", each if [ShortField]="d/e" and [Type]="Ind Rank" then 1-([Value]/100) else if [Type]="Ind Rank" then [Value]/100 else [Value], type any), #"AddedComment" = Table.AddColumn(RevisedValue, "NewComm", each if [Type]="Ind Rank" then if [RevVal]<.25 then "bad" else if [RevVal]<0.35 then "low" else if [RevVal]>0.66 then "high" else "" else if [Type]="Ratio Calc" then if [RevVal]<0.6 then "bad" else if [RevVal]<1 then "low" else if [RevVal]>1.5 then "high" else "" else if [Type]="Metric" then if [RevVal] is null then "" else if [RevVal]<25 then "bad" else if [RevVal]<40 then "low" else if [RevVal]>70 then "high" else "" else "", type any), #"SelectedComments" = Table.SelectRows(AddedComment, each ([NewComm] <> "")), #"AddedCommRank" = Table.AddColumn(SelectedComments, "CommRank", each if [NewComm]="high" then 1 else if [NewComm]="low" then 2 else if [NewComm]="bad" then 3 else "", type number), #"SortedRank" = Table.Sort(AddedCommRank,{{"CommRank", Order.Ascending}, {"Date", Order.Descending}}), #"RemovedColumns" = Table.SelectColumns(SortedRank,{"Sym", "Date", "NewComm", "ShortField"}), #"RemovedDuplicates" = Table.Distinct(RemovedColumns), #"GroupedFields" = Table.Group(RemovedDuplicates, {"Sym", "Date", "NewComm"}, {{"Fields", each Text.Combine([ShortField],", "), type text}}), #"AddedCommFields" = Table.AddColumn(GroupedFields, "CommFields", each Text.Combine({[NewComm], " ", [Fields]}), type text), #"GroupedMonth" = Table.Group(AddedCommFields, {"Sym", "Date"}, {{"AddedCommFields", each Text.Combine([CommFields],"; "), type text}}), #"CombinedText" = Table.AddColumn(GroupedMonth, "MonthCommFields", each Text.Combine({DateTime.ToText([Date], "MMM"), ": ", [AddedCommFields]}), type text), #"GroupedSym" = Table.Group(CombinedText, {"Sym"}, {{"CombinedText", each Text.Combine([MonthCommFields],"#(lf)"), type text}}) in #"GroupedSym" And here are the current results:
| Sym | CombinedText |
|---|---|
| MSFT | Jun: high val <br> May: high roi, health, stab, qual |
| ETN | May: high roi; low qual, ratio, d/e, eps; bad d/e, stab |
| DTE | May: high ratio, proj eps; low d/e, health, eps, fwd eps; bad stab, d/e, roi, eps, ratio |
| EIX | May: high roi, eps, health, val; low proj eps; bad ratio, d/e, eps, stab |
(note that the <br> in MSFT combinedtext is an actual newline in the cell, which reddit can't display properly)
You can see that ETN and EIX both have the problem I'm trying to solve for: ETN has d/e appear as both low and bad, and EIX has eps return as both high and bad. I would like ETN to have d/e only appear as low, and I would like EIX to not list eps at all.
Before the grouping fields step, I'd like to do something like this pseudocode:
if count (sym,date,shortfield) <> count (sym,date,shortfield,newcomm) then if count(newcomm=high)>0,"","low") then remove blank newcomms and duplicate records
How do I accomplish this in Power Query?
[link] [comments]
Read on the original site
Open the publisher's page for the full experience