From Machine Learning

Question about PLS-DA hyperparameter tuning [R]

Our take

In this inquiry, a bioinformatician delves into the intricacies of sparse Partial Least Squares Discriminant Analysis (PLS-DA) while seeking clarity on hyperparameter tuning. After establishing a global model to guide feature selection, they observed unexpected results with their final model, where error rates did not decrease as anticipated with the addition of latent components. This confusion highlights the complexities of model performance assessment in machine learning.

Hi all! I am a bioinformatician and I am learning some ML tools for disease/biomarker work. I am working with sparse PLS-DA at the moment. Before actually tuning the model, I run an overall global model (without sparsity) to get an idea of what my data looks like and to get a starting point. Here is what that global model ends up looking like:

[Image: global model]

From this, I see that I should include 2 latent components in my model tuning, and I chose centroids.dist as the prediction distance. So I tune the model with two components, it gives me the number of features to keep on each component, and then I run the final model. However, when I do performance assessment on the final model, it looks like this:

[Image: final model (sparse)]
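
For readers unfamiliar with the centroids.dist choice mentioned above, here is a minimal sketch of the idea, not the poster's actual pipeline. It assumes a centroid distance rule means: average the latent-component scores of the training samples in each class, then assign a new sample to the class whose centroid is nearest in Euclidean distance. The function names, class labels, and score values below are all made up for illustration.

```python
from math import dist
from statistics import fmean


def class_centroids(scores, labels):
    """Mean component-score vector for each class, from training samples."""
    groups = {}
    for score, label in zip(scores, labels):
        groups.setdefault(label, []).append(score)
    # zip(*rows) walks the scores column by column (one column per component)
    return {
        label: tuple(fmean(column) for column in zip(*rows))
        for label, rows in groups.items()
    }


def predict_centroid(centroids, score):
    """Assign a sample to the class with the nearest centroid (Euclidean)."""
    return min(centroids, key=lambda label: dist(centroids[label], score))


def error_rate(centroids, scores, labels):
    """Fraction of samples whose predicted class differs from the true class."""
    wrong = sum(
        predict_centroid(centroids, score) != label
        for score, label in zip(scores, labels)
    )
    return wrong / len(labels)


# Hypothetical scores on 2 latent components (illustrative values only)
train_scores = [(-2.0, 0.1), (-1.5, -0.3), (1.8, 0.2), (2.1, -0.1)]
train_labels = ["control", "control", "disease", "disease"]
test_scores = [(-1.0, 0.5), (0.4, -0.2), (-0.2, 0.0)]
test_labels = ["control", "disease", "disease"]

centroids = class_centroids(train_scores, train_labels)
test_error = error_rate(centroids, test_scores, test_labels)
print(f"held-out error rate: {test_error:.3f}")
```

The error rate here is computed on samples that were not used to build the centroids, which is the same kind of held-out estimate a cross-validated performance assessment reports.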

I guess I am a little confused. From what I am reading online, and from my own data, error rates should go down with added components. It also doesn't make a ton of sense to me because I should have only picked the features that best distinguish two conditions, so again, I should be seeing error rates decrease.

Can someone please help me understand what I'm seeing here and what could be causing this? I am still learning how all of this works, so any sort of guidance is appreciated. Thank you!

submitted by /u/dacherrr
