Built a dashboard to analyze how AI skills are showing up in data science job postings (open source)
I've been scraping thousands of U.S. data science jobs for the past couple of months and writing about the findings in my newsletter.
At some point, I figured the dashboard was more useful than anything I was writing, so I decided to open source it.
Here's what it covers:
- Top skills companies are actually hiring for, ranked by frequency
- Skills broken down by category (ML/DL, GenAI, Cloud, MLOps, etc.)
- What % of roles now require AI skills, broken down by seniority level
- Salary premium for candidates with AI skills
- An interactive explorer where you can browse individual postings with matched skills highlighted
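As a rough illustration of the AI-share and salary-premium views listed above, here is a minimal sketch. The field names (`seniority`, `requires_ai`, `salary`), the sample rows, and the choice of a median-based premium are all assumptions made for the example, not the repo's actual schema or method.

```python
from collections import defaultdict
from statistics import median

# Hypothetical postings; field names and values are illustrative only.
postings = [
    {"seniority": "junior", "requires_ai": False, "salary": 90_000},
    {"seniority": "junior", "requires_ai": True, "salary": 100_000},
    {"seniority": "senior", "requires_ai": True, "salary": 170_000},
    {"seniority": "senior", "requires_ai": True, "salary": 160_000},
    {"seniority": "senior", "requires_ai": False, "salary": 150_000},
]


def ai_share_by_seniority(rows):
    """Return {seniority: percent of postings that require AI skills}."""
    totals = defaultdict(int)
    ai_counts = defaultdict(int)
    for row in rows:
        totals[row["seniority"]] += 1
        ai_counts[row["seniority"]] += int(row["requires_ai"])
    return {level: 100 * ai_counts[level] / totals[level] for level in totals}


def salary_premium(rows):
    """Percent by which the median AI salary exceeds the median non-AI salary."""
    ai = [r["salary"] for r in rows if r["requires_ai"]]
    other = [r["salary"] for r in rows if not r["requires_ai"]]
    return 100 * (median(ai) / median(other) - 1)


shares = ai_share_by_seniority(postings)
premium = salary_premium(postings)
```

A median is used here because salary ranges in job postings tend to be skewed; a mean or a regression that controls for seniority would be equally reasonable choices.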
The skill extraction is built on around 230 curated keyword groups, so it's pretty granular.
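To give a feel for what keyword-group matching can look like, here is a minimal sketch. The three groups, the helper names, and the word-boundary matching rule are assumptions for illustration; the curated groups and the actual matching logic live in the repo.

```python
import re
from collections import Counter

# Hypothetical keyword groups: each canonical skill maps to its surface forms.
KEYWORD_GROUPS = {
    "LLMs": ["llm", "llms", "large language model", "large language models"],
    "RAG": ["rag", "retrieval augmented generation", "retrieval-augmented generation"],
    "PyTorch": ["pytorch"],
}


def compile_groups(groups):
    """Build one case-insensitive regex per group, longest keyword first."""
    patterns = {}
    for name, keywords in groups.items():
        ordered = sorted(keywords, key=len, reverse=True)
        alternation = "|".join(re.escape(k) for k in ordered)
        # The lookarounds stop "rag" from matching inside "storage" or "leverage".
        patterns[name] = re.compile(
            rf"(?<![A-Za-z0-9])(?:{alternation})(?![A-Za-z0-9])",
            re.IGNORECASE,
        )
    return patterns


def extract_skills(text, patterns):
    """Return the set of group names with at least one keyword in the text."""
    return {name for name, pattern in patterns.items() if pattern.search(text)}


def rank_skills(texts, patterns):
    """Count each skill once per posting and rank by frequency."""
    counts = Counter()
    for text in texts:
        counts.update(extract_skills(text, patterns))
    return counts.most_common()


patterns = compile_groups(KEYWORD_GROUPS)
sample_postings = [
    "Experience with LLMs, RAG pipelines, and PyTorch.",
    "Cloud storage and PyTorch required; leverage SQL daily.",
]
ranking = rank_skills(sample_postings, patterns)
```

Counting each group at most once per posting keeps a single keyword-stuffed listing from inflating the frequency ranking.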
Code and data are all in the repo if you want to fork it or dig into the methodology.
https://ai-in-ds.streamlit.app/
I'm scraping weekly and will soon upload all of the raw data to Kaggle. For now, you can find the data in the repo.
P.S. I already mentioned it to Luke Barousse, since some of these AI keyword groups could be worth adding to his dashboard.