R-universe search: topic:feature-extraction

Showing 9 of 9 results

easystats

parameters:Processing of Model Parameters

Utilities for processing the parameters of various statistical models. Beyond computing p values, CIs, and other indices for a wide variety of models (see the list of supported models via the function 'insight::supported_models()'), this package implements features like bootstrapping or simulating parameters and models, feature reduction (feature extraction and variable selection), as well as functions to describe data and variable characteristics (e.g. skewness, kurtosis, smoothness or distribution).

Maintained by Daniel Lüdecke. Last updated 10 days ago.

beta bootstrap ci confidence-intervals data-reduction easystats fa feature-extraction feature-reduction hacktoberfest parameters pca pvalues regression-models robust-statistics standardize standardized-estimates statistical-models

454 stars 15.67 score 1.8k scripts 56 dependents
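
The skewness and kurtosis mentioned in the parameters description can be illustrated with a minimal moment-based sketch. This is not the package's implementation: the function names below are hypothetical, and the simple population-moment estimators used here are an assumption (the package may apply bias-corrected variants).

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Moment-based skewness: third central moment over variance^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: fourth central moment over variance^2, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical data: one symmetric sample, one with a long right tail
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 2.0, 2.0, 10.0]

summary = {
    "symmetric": (skewness(symmetric), excess_kurtosis(symmetric)),
    "right_tailed": (skewness(right_tailed), excess_kurtosis(right_tailed)),
}
```

A symmetric sample gives a skewness of zero, while a long right tail gives a positive value.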

bioc

iSEE:Interactive SummarizedExperiment Explorer

Create an interactive Shiny-based graphical user interface for exploring data stored in SummarizedExperiment objects, including row- and column-level metadata. The interface supports transmission of selections between plots and tables, code tracking, interactive tours, interactive or programmatic initialization, preservation of app state, and extensibility to new panel types via S4 classes. Special attention is given to single-cell data in a SingleCellExperiment object with visualization of dimensionality reduction results.

Maintained by Kevin Rue-Albrecht. Last updated 26 days ago.

cellbasedassays clustering dimensionreduction featureextraction geneexpression gui immunooncology shinyapps singlecell transcription transcriptomics visualization dimension-reduction feature-extraction gene-expression hacktoberfest human-cell-atlas shiny single-cell

225 stars 12.86 score 380 scripts 9 dependents

robjhyndman

tsfeatures:Time Series Feature Extraction

Methods for extracting various features from time series data. The features provided are those from Hyndman, Wang and Laptev (2013) <doi:10.1109/ICDMW.2015.104>, Kang, Hyndman and Smith-Miles (2017) <doi:10.1016/j.ijforecast.2016.09.004> and from Fulcher, Little and Jones (2013) <doi:10.1098/rsif.2013.0048>. Features include spectral entropy, autocorrelations, measures of the strength of seasonality and trend, and so on. Users can also define their own feature functions.

Maintained by Rob Hyndman. Last updated 8 months ago.

feature-extraction time-series

257 stars 11.55 score 268 scripts 22 dependents
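
Two of the features named in the tsfeatures description, autocorrelation and spectral entropy, can be sketched in a few lines. This is an illustrative sketch in plain Python, not the package's code: the function names are hypothetical, and the entropy is computed from a raw periodogram, which is an assumption (the package may estimate the spectral density differently).

```python
import cmath
import math


def acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


def spectral_entropy(x):
    """Shannon entropy of the normalized periodogram, scaled to [0, 1]."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    # Periodogram at the positive Fourier frequencies k = 1 .. n // 2 (direct DFT)
    power = []
    for k in range(1, n // 2 + 1):
        coef = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                   for t, c in enumerate(centered))
        power.append(abs(coef) ** 2)
    total = sum(power)
    probs = [p / total for p in power if p > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(power))


# Hypothetical series: a pure sine wave with period 12
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]
features = {"acf1": acf(series, 1), "entropy": spectral_entropy(series)}
```

A strongly periodic series concentrates its power at one frequency, so its spectral entropy is close to 0; a noisy series spreads power across frequencies and scores closer to 1.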

nanxstats

protr:Generating Various Numerical Representation Schemes for Protein Sequences

Comprehensive toolkit for generating various numerical features of protein sequences described in Xiao et al. (2015) <DOI:10.1093/bioinformatics/btv042>. For full functionality, the software 'ncbi-blast+' is needed; see <https://blast.ncbi.nlm.nih.gov/doc/blast-help/downloadblastdata.html> for more information.

Maintained by Nan Xiao. Last updated 7 months ago.

bioinformatics feature-engineering feature-extraction machine-learning peptides protein-sequences sequence-analysis

52 stars 10.02 score 173 scripts 3 dependents
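
As a rough idea of what a numerical representation of a protein sequence looks like, the sketch below computes amino acid composition, i.e. the fraction of each of the 20 standard residues. It is a plain-Python illustration with a hypothetical function name and a made-up sequence, not the protr API.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence."""
    seq = sequence.upper()
    unknown = set(seq) - set(AMINO_ACIDS)
    if not seq or unknown:
        raise ValueError(f"empty sequence or non-standard residues: {sorted(unknown)}")
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in AMINO_ACIDS}


# Hypothetical 10-residue peptide
aac = amino_acid_composition("MKTAYIAKQR")
```

The result is a fixed-length vector of 20 values regardless of sequence length, which is what makes it usable as input to a machine-learning model.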

bioc

Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery

A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.

Maintained by Nan Xiao. Last updated 5 months ago.

software dataimport datarepresentation featureextraction cheminformatics biomedicalinformatics proteomics go systemsbiology bioconductor bioinformatics drug-discovery feature-extraction fingerprint molecular-descriptors protein-sequences

37 stars 7.81 score 29 scripts

modeloriented

rSAFE:Surrogate-Assisted Feature Extraction

Provides a model-agnostic tool for training a white-box model on features extracted from a black-box model. For more information see: Gosiewska et al. (2020) <doi:10.1016/j.dss.2021.113556>.

Maintained by Alicja Gosiewska. Last updated 3 years ago.

feature-engineering feature-extraction iml interpretability machine-learning xai

28 stars 6.79 score 44 scripts

feiyoung

GFM:Generalized Factor Model

The generalized factor model is implemented for ultra-high dimensional data with mixed-type variables. Two algorithms, variational EM and alternate maximization, are provided to fit the generalized factor model. The factor matrix and loading matrix, together with the number of factors, can be well estimated. This model can be employed in social and behavioral sciences, economy and finance, and genomics, to extract interpretable nonlinear factors. For more details, see Wei Liu, Huazhen Lin, Shurong Zheng and Jin Liu (2021) <doi:10.1080/01621459.2021.1999818>.

Maintained by Wei Liu. Last updated 7 months ago.

approximate-factor-model feature-extraction nonlinear-dimension-reduction number-of-factors openblas cpp

2 stars 5.68 score 8 scripts 2 dependents

haghish

shapley:Weighted Mean SHAP and CI for Robust Feature Assessment in ML Grid

This R package introduces Weighted Mean SHapley Additive exPlanations (WMSHAP), an innovative method for calculating SHAP values for a grid of fine-tuned base-learner machine learning models as well as stacked ensembles, a method not previously available due to the common reliance on single best-performing models. By integrating the weighted mean SHAP values from individual base-learners comprising the ensemble or individual base-learners in a tuning grid search, the package weights SHAP contributions according to each model's performance, assessed by R squared (for both regression and classification models). Alternatively, the software can weight SHAP values by the area under the precision-recall curve (AUCPR), the area under the curve (AUC), or the F2 measure for binary classifiers. It further extends this framework to implement weighted confidence intervals for weighted mean SHAP values, offering a more comprehensive and robust feature importance evaluation over a grid of machine learning models, instead of solely computing SHAP values for the best model. This methodology is particularly beneficial for addressing the severe class imbalance (class rarity) problem by providing a transparent, generalized measure of feature importance that mitigates the risk of reporting SHAP values for an overfitted or biased model and maintains robustness under severe class imbalance, where there is no universal criterion for identifying the absolute best model. Furthermore, the package implements hypothesis testing to ascertain the statistical significance of SHAP values for individual features, as well as comparative significance testing of SHAP contributions between features.
Additionally, it tackles a critical gap in the feature selection literature by presenting criteria for the automatic selection of the most important features across a grid of models or stacked ensembles, eliminating the need for arbitrary determination of the number of top features to be extracted. This utility is invaluable for researchers analyzing feature significance, particularly within severely imbalanced outcomes where conventional methods fall short. It is also expected to report democratic feature importance across a grid of models, resulting in a more comprehensive and generalizable feature selection. The package further implements a novel method for visualizing SHAP values at both the subject level and the feature level, as well as a plot for feature selection based on the weighted mean SHAP ratios.

Maintained by E. F. Haghish. Last updated 13 days ago.

class-imbalance class-imbalance-problem feature-extraction feature-importance feature-selection machine-learning machine-learning-algorithms shap shap-analysis shap-values shapely shapley-additive-explanations shapley-decomposition shapley-value shapley-values shapleyvalue weighted-shap weighted-shap-confidence-interval weighted-shapley weighted-shapley-ci

15 stars 5.25 score 17 scripts
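
The core idea in the shapley description, weighting each model's SHAP contributions by its performance, can be sketched as a weighted mean. This is a simplified illustration, not the package's algorithm: the function name, the feature names and all numbers are hypothetical, and normalizing the performance scores to sum to one is an assumption.

```python
def weighted_mean_shap(shap_by_model, performance):
    """Combine per-model mean |SHAP| values into one weighted mean per feature.

    shap_by_model: list of dicts mapping feature name -> mean absolute SHAP value
    performance:   list of non-negative scores (e.g. R squared or AUCPR), one per model
    """
    total = sum(performance)
    if total <= 0:
        raise ValueError("at least one model needs a positive performance score")
    # Assumption: weights are the performance scores rescaled to sum to 1
    weights = [p / total for p in performance]
    features = shap_by_model[0].keys()
    return {
        f: sum(w * shap[f] for w, shap in zip(weights, shap_by_model))
        for f in features
    }


# Hypothetical values for three models in a tuning grid
shap_by_model = [
    {"age": 0.30, "income": 0.10},
    {"age": 0.20, "income": 0.20},
    {"age": 0.10, "income": 0.40},
]
performance = [0.5, 0.3, 0.2]
wmshap = weighted_mean_shap(shap_by_model, performance)
```

Better-performing models pull the combined importance toward their own SHAP values, so no single model in the grid determines the ranking alone.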

kleinomicslab

mrbin:Metabolomics Data Analysis Functions

A collection of functions for processing and analyzing metabolite data. The namesake function mrbin() converts 1D or 2D Nuclear Magnetic Resonance data into a matrix of values suitable for further data analysis and performs basic processing steps in a reproducible way. Negative values, a common issue in such data, can be replaced by positive values (<doi:10.1021/acs.jproteome.0c00684>). All parameters used are stored in a readable text file and can be restored from that file to enable exact reproduction of the data at a later time. The function fia() ranks features according to their impact on classifier models, especially artificial neural network models.

Maintained by Matthias Klein. Last updated 19 days ago.

artificial-neural-networks feature-extraction metabolomics nmr

2 stars 4.00 score 4 scripts