
Please I really need your help on this guys [D]

Our take

In a recent discussion, a user sought guidance on a machine learning time series classification challenge. Initially, they achieved a public score of 0.85 but later improved their submission to a perfect score of 1.00 by leveraging an external dataset. The user now wonders if it’s feasible to replicate this success using only the original train and test datasets provided. They aim to understand how to derive the same ID-to-label mapping through machine learning techniques, rather than relying on external data.

My teacher gave us a machine learning time series classification problem.

At first, I tried solving it normally and got a public score of 0.85. But then I searched for the dataset used in the competition and managed to find it. Using that dataset, I generated a submission file that scored 1.00.

Now my question is:

Is it possible to recreate the submission file using only the provided train and test datasets, without relying on the external dataset I found?

In other words, I want to understand if there is a way to learn or reverse-engineer how to produce the same submission output (ID → label mapping) using only the original train/test files. I’m not sure if “reverse engineering the submission” is the correct term, but I want to figure out how to get the same result properly using machine learning rather than external data.
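One way to make the question measurable is to compare the ID → label mapping from a model trained only on the provided files against the mapping in the 1.00 submission. The snippet below is a minimal sketch of that comparison, not a method from the post: the column names `ID` and `label`, the helper names `load_mapping` and `agreement`, and the inline sample rows are all assumptions made up for illustration.

```python
import csv
import io


def load_mapping(handle, id_col="ID", label_col="label"):
    """Read a submission CSV into an {id: label} dict (column names are assumed)."""
    return {row[id_col]: row[label_col] for row in csv.DictReader(handle)}


def agreement(predicted, reference):
    """Fraction of reference IDs whose predicted label matches the reference label."""
    if not reference:
        raise ValueError("reference mapping is empty")
    hits = sum(1 for key, label in reference.items() if predicted.get(key) == label)
    return hits / len(reference)


# Made-up rows standing in for the two submission files;
# real files would be opened with open(path, newline="").
model_csv = io.StringIO("ID,label\n1,a\n2,b\n3,a\n4,b\n")
reference_csv = io.StringIO("ID,label\n1,a\n2,b\n3,b\n4,b\n")

score = agreement(load_mapping(model_csv), load_mapping(reference_csv))
print(score)
```

This only measures the gap between the two files. Tuning a model against the reference mapping would still be using the external data indirectly, so it does not count as solving the problem from the train/test files alone.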

Also, to clarify: for the submission I made, I had access to the full feature set, not just the IDs and labels. In other words, I also had the other features for the rows in the submission file.

I would really appreciate any help or guidance. If needed, I can share the train/test files or the submission file that achieved the 1.00 score.

Thanks in advance!

submitted by /u/Djistino