Telegram Group & Telegram Channel
说到特征降维/选择的问题,大部分EDA的套路都是从model训练的loss来判断feature importance。其实有一个简单易行而且很有效的办法是在CV里面用做feature permutation,对原始特征shuffle得到shadow(也可以加一些噪音),在通过zscore比较两者差异来判断importance,不断遍历筛选。在ESLII中593页有提到这个办法。R里面有一个包Boruta可以做这件事,py也有:https://github.com/scikit-learn-contrib/boruta_py



tg-me.com/DataScienceArchive/94
Create:
Last Update:

说到特征降维/选择的问题,大部分EDA的套路都是从model训练的loss来判断feature importance。其实有一个简单易行而且很有效的办法是在CV里面用做feature permutation,对原始特征shuffle得到shadow(也可以加一些噪音),在通过zscore比较两者差异来判断importance,不断遍历筛选。在ESLII中593页有提到这个办法。R里面有一个包Boruta可以做这件事,py也有:https://github.com/scikit-learn-contrib/boruta_py

BY Data Science Archive


Warning: Undefined variable $i in /var/www/tg-me/post.php on line 283

Share with your friend now:
tg-me.com/DataScienceArchive/94

View MORE
Open in Telegram


Data Science Archive Telegram | DID YOU KNOW?

Date: |

The Singapore stock market has alternated between positive and negative finishes through the last five trading days since the end of the two-day winning streak in which it had added more than a dozen points or 0.4 percent. The Straits Times Index now sits just above the 3,060-point plateau and it's likely to see a narrow trading range on Monday.

Data Science Archive from ua


Telegram Data Science Archive
FROM USA