Showing 200 of 351 results
kisungyou
Rdimtools:Dimension Reduction and Estimation Methods
We provide linear and nonlinear dimension reduction techniques. Intrinsic dimension estimation methods for exploratory analysis are also provided. For more details on the package, see the paper by You and Shung (2022) <doi:10.1016/j.simpa.2022.100414>.
Maintained by Kisung You. Last updated 2 years ago.
dimension-estimation dimension-reduction manifold-learning subspace-learning openblas cpp openmp
44.0 match 52 stars 8.37 score 186 scripts 8 dependents
prodriguezsosa
conText:'a la Carte' on Text (ConText) Embedding Regression
A fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) <arXiv:1805.05388> and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021) <https://github.com/prodriguezsosa/EmbeddingRegression>.
Maintained by Pedro L. Rodriguez. Last updated 11 months ago.
37.9 match 104 stars 9.40 score 1.7k scripts
oscarkjell
text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning
Link R with Transformers from Hugging Face to transform text variables to word embeddings, where the word embeddings are used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions etc. For more information see <https://www.r-text.org>.
Maintained by Oscar Kjell. Last updated 3 days ago.
deep-learning machine-learning nlp transformers openjdk
27.0 match 146 stars 13.16 score 436 scripts 1 dependents
jonnob
rsetse:Strain Elevation Tension Spring Embedding
An R implementation for the Strain Elevation and Tension embedding algorithm from Bourne (2020) <doi:10.1007/s41109-020-00329-4>. The package embeds graphs and networks using the Strain Elevation and Tension embedding (SETSe) algorithm. SETSe represents the network as a physical system, where edges are elastic, and nodes exert a force either up or down based on node features. SETSe positions the nodes vertically such that the tension in the edges of a node is equal and opposite to the force it exerts for all nodes in the network. The resultant structure can then be analysed by looking at the node elevation and the edge strain and tension. This algorithm works on weighted and unweighted networks as well as networks with or without explicit node features. Edge elasticity can be created from existing edge weights or kept as a constant.
Maintained by Jonathan Bourne. Last updated 3 years ago.
embedding embedding-graphs graph-embedding igraph networks networkscience unsupervised-learning cpp openmp
57.1 match 7 stars 4.85 score 8 scripts
dselivanov
text2vec:Modern Text Mining Framework for R
Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), similarities. This package provides a source-agnostic streaming API, which allows researchers to perform analysis of collections of documents which are larger than available RAM. All core functions are parallelized to benefit from multicore machines.
Maintained by Dmitriy Selivanov. Last updated 7 months ago.
glove latent-dirichlet-allocation natural-language-processing text-mining topic-modeling vectorization word-embeddings word2vec cpp
14.6 match 860 stars 13.48 score 1.3k scripts 23 dependents
mlverse
torch:Tensors and Neural Networks with 'GPU' Acceleration
Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.
Maintained by Daniel Falbel. Last updated 5 days ago.
11.3 match 520 stars 16.52 score 1.4k scripts 38 dependents
bnosac
ruimtehol:Learn Text 'Embeddings' with 'Starspace'
Wraps the 'StarSpace' library <https://github.com/facebookresearch/StarSpace> allowing users to calculate word, sentence, article, document, webpage, link and entity 'embeddings'. By using the 'embeddings', you can perform text based multi-label classification, find similarities between texts and categories, do collaborative-filtering based recommendation as well as content-based recommendation, find out relations between entities, calculate graph 'embeddings' as well as perform semi-supervised learning and multi-task learning on plain text. The techniques are explained in detail in the paper: 'StarSpace: Embed All The Things!' by Wu et al. (2017), available at <arXiv:1709.03856>.
Maintained by Jan Wijffels. Last updated 1 year ago.
classification embeddings natural-language-processing nlp similarity starspace text-mining cpp
21.1 match 101 stars 6.65 score 44 scripts
exaexa
EmbedSOM:Fast Embedding Guided by Self-Organizing Map
Provides a smooth mapping of multidimensional points into low-dimensional space defined by a self-organizing map. Designed to work with 'FlowSOM' and flow-cytometry use-cases. See Kratochvil et al. (2019) <doi:10.12688/f1000research.21642.1>.
Maintained by Mirek Kratochvil. Last updated 1 month ago.
22.0 match 26 stars 6.02 score 8 scripts
edubruell
tidyllm:Tidy Integration of Large Language Models
A tidy interface for integrating large language model (LLM) APIs such as 'Claude', 'Openai', 'Groq', 'Mistral' and local models via 'Ollama' into R workflows. The package supports text and media-based interactions, interactive message history, batch request APIs, and a tidy, pipeline-oriented interface for streamlined integration into data workflows. Web services are available at <https://www.anthropic.com>, <https://openai.com>, <https://groq.com>, <https://mistral.ai/> and <https://ollama.com>.
Maintained by Eduard Brüll. Last updated 4 days ago.
14.6 match 68 stars 7.82 score 26 scripts
satijalab
Seurat:Tools for Single Cell Genomics
A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.
Maintained by Paul Hoffman. Last updated 1 year ago.
human-cell-atlas single-cell-genomics single-cell-rna-seq cpp
6.6 match 2.4k stars 16.86 score 50k scripts 73 dependents
sa-lee
liminal:Multivariate Data Visualization with Tours and Embeddings
Compose interactive visualisations designed for exploratory high-dimensional data analysis. With 'liminal' you can create linked interactive graphics to diagnose the quality of a dimension reduction technique and explore the global structure of a dataset with a tour. A complete description of the method is discussed in ['Lee' & 'Laa' & 'Cook' (2020) <arXiv:2012.06077>].
Maintained by Stuart Lee. Last updated 4 years ago.
embedding-algorithms interactive-visualizations t-sne tour
19.2 match 5 stars 5.31 score 41 scripts
eddelbuettel
littler:R at the Command-Line via 'r'
A scripting and command-line front-end is provided by 'r' (aka 'littler') as a lightweight binary wrapper around the GNU R language and environment for statistical computing and graphics. While R can be used in batch mode, the r binary adds full support for both 'shebang'-style scripting (i.e. using a hash-mark-exclamation-path expression as the first line in scripts) as well as command-line use in standard Unix pipelines. In other words, r provides the R language without the environment.
Maintained by Dirk Eddelbuettel. Last updated 1 month ago.
10.0 match 314 stars 9.49 score 17 scripts
bioc
singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data
The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.
Maintained by Joshua David Campbell. Last updated 23 days ago.
singlecell geneexpression differentialexpression alignment clustering immunooncology batcheffect normalization qualitycontrol dataimport gui
9.1 match 181 stars 10.16 score 252 scripts
jayanilakshika
cardinalR:Collection of Data Structures
A collection of simple simulation datasets designed for generating representations with Nonlinear Dimension Reduction techniques such as t-distributed Stochastic Neighbor Embedding and Uniform Manifold Approximation and Projection. These datasets serve as a valuable resource for understanding the reliability of Nonlinear Dimension Reduction representations in various contexts.
Maintained by Jayani P.G. Lakshika. Last updated 11 days ago.
19.9 match 4.54 score
eddelbuettel
RInside:C++ Classes to Embed R in C++ (and C) Applications
C++ classes to embed R in C++ (and C) applications. A C++ class providing the R interpreter is offered by this package making it easier to have "R inside" your C++ application. As R itself is embedded into your application, a shared library build of R is required. This works on Linux, OS X and even on Windows provided you use the same tools used to build R itself. Numerous examples are provided in the nine subdirectories of the examples/ directory of the installed package: standard, 'mpi' (for parallel computing), 'qt' (showing how to embed 'RInside' inside a Qt GUI application), 'wt' (showing how to build a "web-application" using the Wt toolkit), 'armadillo' (for 'RInside' use with 'RcppArmadillo'), 'eigen' (for 'RInside' use with 'RcppEigen'), and 'c_interface' for a basic C interface and 'Ruby' illustration. The examples use 'GNUmakefile(s)' with GNU extensions, so a GNU make is required (and will use the 'GNUmakefile' automatically). 'Doxygen'-generated documentation of the C++ classes is available at the 'RInside' website as well.
Maintained by Dirk Eddelbuettel. Last updated 5 months ago.
12.4 match 136 stars 7.17 score 17 scripts 1 dependents
immunogenomics
harmony:Fast, Sensitive, and Accurate Integration of Single Cell Data
Implementation of the Harmony algorithm for single cell integration, described in Korsunsky et al <doi:10.1038/s41592-019-0619-0>. Package includes a standalone Harmony function and interfaces to external frameworks.
Maintained by Ilya Korsunsky. Last updated 4 months ago.
algorithm data-integration scrna-seq openblas cpp
6.3 match 554 stars 13.74 score 5.5k scripts 8 dependents
bnosac
word2vec:Distributed Representations of Words
Learn vector representations of words by continuous bag of words and skip-gram implementations of the 'word2vec' algorithm. The techniques are detailed in the paper "Distributed Representations of Words and Phrases and their Compositionality" by Mikolov et al. (2013), available at <arXiv:1310.4546>.
Maintained by Jan Wijffels. Last updated 1 year ago.
embeddings natural-language-processing word2vec cpp
10.0 match 70 stars 8.36 score 227 scripts 6 dependents
hauselin
ollamar:'Ollama' Language Models
An interface to easily run local language models with 'Ollama' <https://ollama.com> server and API endpoints (see <https://github.com/ollama/ollama/blob/main/docs/api.md> for details). It lets you run open-source large language models locally on your machine.
Maintained by Hause Lin. Last updated 2 months ago.
8.7 match 84 stars 9.36 score 74 scripts 5 dependents
bioc
lemur:Latent Embedding Multivariate Regression
Fit a latent embedding multivariate regression (LEMUR) model to multi-condition single-cell data. The model provides a parametric description of single-cell data measured with treatment vs. control or more complex experimental designs. The parametric model is used to (1) align conditions, (2) predict log fold changes between conditions for all cells, and (3) identify cell neighborhoods with consistent log fold changes. For those neighborhoods, a pseudobulked differential expression test is conducted to assess which genes are significantly changed.
Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.
transcriptomics differentialexpression singlecell dimensionreduction regression openblas cpp
10.2 match 87 stars 7.80 score 81 scripts
igraph
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 2 days ago.
complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp
3.8 match 581 stars 21.10 score 31k scripts 1.9k dependents
psychbruce
PsychWordVec:Word Embedding Research Framework for Psychological Science
An integrative toolbox of word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a series of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation test of significance; (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <arXiv:1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <arXiv:1607.04606>; (5) a group of functions to download 'pre-trained' language models (e.g., 'GPT', 'BERT') and extract contextualized (dynamic) word vectors (based on the R package 'text').
Maintained by Han-Wu-Shuang Bao. Last updated 1 year ago.
bert cosine-similarity fasttext glove gpt language-model natural-language-processing nlp pretrained-models psychology semantic-analysis text-analysis text-mining tsne word-embeddings word-vectors word2vec openjdk
19.5 match 22 stars 4.04 score 10 scripts
jayanilakshika
quollr:Visualising How Nonlinear Dimension Reduction Warps Your Data
Constructs a model in 2D space from 2D embedding data and then lifts it to the high-dimensional space. Additionally, it provides tools to visualize the model in 2D space and to overlay the fitted model on data using the tour technique. Furthermore, it facilitates the generation of summaries of high-dimensional distributions.
Maintained by Jayani P.G. Lakshika. Last updated 1 month ago.
12.7 match 3 stars 6.18 score 7 scripts
ropensci
pkgmatch:Find R Packages Matching Either Descriptions or Other R Packages
Find R packages matching either descriptions or other R packages.
Maintained by Mark Padgham. Last updated 1 month ago.
embeddings llms natural-language-processing cpp
14.9 match 3 stars 5.23 score
tommyjones
textmineR:Functions for Text Mining and Topic Modeling
An aid for text mining in R, with a syntax that should be familiar to experienced R users. Provides a wrapper for several topic models that take similarly-formatted input and give similarly-formatted output. Has additional functionality for analyzing and diagnosing topic models.
Maintained by Tommy Jones. Last updated 2 years ago.
6.6 match 106 stars 10.83 score 310 scripts 7 dependents
jkrijthe
Rtsne:T-Distributed Stochastic Neighbor Embedding using a Barnes-Hut Implementation
An R wrapper around the fast T-distributed Stochastic Neighbor Embedding implementation by Van der Maaten (see <https://github.com/lvdmaaten/bhtsne/> for more information on the original implementation).
Maintained by Jesse Krijthe. Last updated 9 months ago.
5.0 match 256 stars 13.95 score 4.4k scripts 231 dependents
feiyoung
ProFAST:Probabilistic Factor Analysis for Spatially-Aware Dimension Reduction
Probabilistic factor analysis for spatially-aware dimension reduction across multi-section spatial transcriptomics data with millions of spatial locations. For more details, see Wei Liu, et al. (2023) <doi:10.1101/2023.07.11.548486>.
Maintained by Wei Liu. Last updated 1 month ago.
11.7 match 2 stars 5.86 score 12 scripts 1 dependents
kharchenkolab
sccore:Core Utilities for Single-Cell RNA-Seq
Core utilities for single-cell RNA-seq data analysis. Contained within are utility functions for working with differential expression (DE) matrices and count matrices, a collection of functions for manipulating and plotting data via 'ggplot2', and functions to work with cell graphs and cell embeddings. Graph-based methods include embedding kNN cell graphs into a UMAP <doi:10.21105/joss.00861>, collapsing vertices of each cluster in the graph, and propagating graph labels.
Maintained by Evan Biederstedt. Last updated 1 year ago.
10.4 match 12 stars 6.44 score 36 scripts 9 dependents
satijalab
SeuratObject:Data Structures for Single Cell Data
Defines S4 classes for single-cell genomic data and associated information, such as dimensionality reduction embeddings, nearest-neighbor graphs, and spatially-resolved coordinates. Provides data access methods and R-native hooks to ensure the Seurat object is familiar to other R users. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, and Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031> for more details.
Maintained by Paul Hoffman. Last updated 1 year ago.
5.5 match 25 stars 11.69 score 1.2k scripts 88 dependents
bnosac
doc2vec:Distributed Representations of Sentences, Documents and Topics
Learn vector representations of sentences, paragraphs or documents by using the 'Paragraph Vector' algorithms, namely the distributed bag of words ('PV-DBOW') and the distributed memory ('PV-DM') model. The techniques in the package are detailed in the paper "Distributed Representations of Sentences and Documents" by Mikolov et al. (2014), available at <arXiv:1405.4053>. The package also provides an implementation to cluster documents based on these embeddings using a technique called top2vec. Top2vec finds clusters in text documents by combining techniques to embed documents and words and density-based clustering. It does this by embedding documents in the semantic space as defined by the 'doc2vec' algorithm. Next it maps these document embeddings to a lower-dimensional space using the 'Uniform Manifold Approximation and Projection' (UMAP) clustering algorithm and finds dense areas in that space using a 'Hierarchical Density-Based Clustering' technique (HDBSCAN). These dense areas are the topic clusters which can be represented by the corresponding topic vector which is an aggregate of the document embeddings of the documents which are part of that topic cluster. In the same semantic space similar words can be found which are representative of the topic. More details can be found in the paper 'Top2Vec: Distributed Representations of Topics' by D. Angelov available at <arXiv:2008.09470>.
Maintained by Jan Wijffels. Last updated 3 years ago.
doc2vec embeddings natural-language-processing paragraph2vec word2vec cpp
11.0 match 48 stars 5.74 score 23 scripts
softwareliteracy
rEDM:Empirical Dynamic Modeling ('EDM')
An implementation of 'EDM' algorithms based on research software developed for internal use at the Sugihara Lab ('UCSD/SIO'). The package is implemented with 'Rcpp' wrappers around the 'cppEDM' library. It implements the 'simplex' projection method from Sugihara & May (1990) <doi:10.1038/344734a0>, the 'S-map' algorithm from Sugihara (1994) <doi:10.1098/rsta.1994.0106>, convergent cross mapping described in Sugihara et al. (2012) <doi:10.1126/science.1227079>, and 'multiview embedding' described in Ye & Sugihara (2016) <doi:10.1126/science.aag0863>.
Maintained by Joseph Park. Last updated 11 months ago.
10.3 match 2 stars 6.05 score 319 scripts 1 dependents
eagerai
fastai:Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 11 months ago.
audio collaborative-filtering darknet darknet-image-classification fastai medical object-detection tabular text vision
6.0 match 118 stars 9.40 score 76 scripts
jbgruber
rollama:Communicate with 'Ollama' to Run Large Language Models Locally
Wraps the 'Ollama' <https://ollama.com> API, which can be used to communicate with generative large language models locally.
Maintained by Johannes B. Gruber. Last updated 1 month ago.
6.0 match 110 stars 8.36 score 52 scripts
chaoranhu
smam:Statistical Modeling of Animal Movements
Animal movement models including Moving-Resting Process with Embedded Brownian Motion (Yan et al., 2014, <doi:10.1007/s10144-013-0428-8>; Pozdnyakov et al., 2017, <doi:10.1007/s11009-017-9547-6>), Brownian Motion with Measurement Error (Pozdnyakov et al., 2014, <doi:10.1890/13-0532.1>), Moving-Resting-Handling Process with Embedded Brownian Motion (Pozdnyakov et al., 2020, <doi:10.1007/s11009-020-09774-1>), Moving-Resting Process with Measurement Error (Hu et al., 2021, <doi:10.1111/2041-210X.13694>), Moving-Moving Process with two Embedded Brownian Motions.
Maintained by Chaoran Hu. Last updated 1 year ago.
animal-movement brownian-motion hidden-markov-model hidden-states measurement-error telegraph-process gsl cpp
11.1 match 3 stars 4.52 score 11 scripts
sokbae
sketching:Sketching of Data via Random Subspace Embeddings
Construct sketches of data via random subspace embeddings. For more details, see the following papers. Lee, S. and Ng, S. (2022). "Least Squares Estimation Using Sketched Data with Heteroskedastic Errors," Proceedings of the 39th International Conference on Machine Learning (ICML22), 162:12498-12520. Lee, S. and Ng, S. (2020). "An Econometric Perspective on Algorithmic Subsampling," Annual Review of Economics, 12(1): 45–80.
Maintained by Sokbae Lee. Last updated 3 years ago.
heteroskedasticity regression subspace-embedding cpp
10.9 match 7 stars 4.54 score 7 scripts
jlmelville
uwot:The Uniform Manifold Approximation and Projection (UMAP) Method for Dimensionality Reduction
An implementation of the Uniform Manifold Approximation and Projection dimensionality reduction by McInnes et al. (2018) <doi:10.48550/arXiv.1802.03426>. It also provides means to transform new data and to carry out supervised dimensionality reduction. An implementation of the related LargeVis method of Tang et al. (2016) <doi:10.48550/arXiv.1602.00370> is also provided. This is a complete re-implementation in R (and C++, via the 'Rcpp' package): no Python installation is required. See the uwot website (<https://github.com/jlmelville/uwot>) for more documentation and examples.
Maintained by James Melville. Last updated 19 days ago.
dimensionality-reduction umap cpp
3.1 match 328 stars 15.74 score 2.0k scripts 140 dependents
dustinstoltz
text2map:R Tools for Text Matrices, Embeddings, and Networks
This is a collection of functions optimized for working with various kinds of text matrices. Focusing on the text matrix as the primary object - represented either as a base R dense matrix or a 'Matrix' package sparse matrix - allows for a consistent and intuitive interface that stays close to the underlying mathematical foundation of computational text analysis. In particular, the package includes functions for working with word embeddings, text networks, and document-term matrices. Methods developed in Stoltz and Taylor (2019) <doi:10.1007/s42001-019-00048-6>, Taylor and Stoltz (2020) <doi:10.1007/s42001-020-00075-8>, Taylor and Stoltz (2020) <doi:10.15195/v7.a23>, and Stoltz and Taylor (2021) <doi:10.1016/j.poetic.2021.101567>.
Maintained by Dustin Stoltz. Last updated 3 months ago.
12.8 match 3.82 score 22 scripts
wadpac
GGIR:Raw Accelerometer Data Analysis
A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data is stored in csv format. The package also allows for external function embedding.
Maintained by Vincent T van Hees. Last updated 2 days ago.
accelerometer activity-recognition circadian-rhythm movement-sensor sleep
3.6 match 109 stars 13.20 score 342 scripts 3 dependents
tkonopka
umap:Uniform Manifold Approximation and Projection
Uniform manifold approximation and projection is a technique for dimension reduction. The algorithm was described by McInnes and Healy (2018) in <arXiv:1802.03426>. This package provides an interface for two implementations. One is written from scratch, including components for nearest-neighbor search and for embedding. The second implementation is a wrapper for 'python' package 'umap-learn' (requires separate installation, see vignette for more details).
Maintained by Tomasz Konopka. Last updated 11 months ago.
dimensionality-reduction umap cpp
3.7 match 132 stars 12.74 score 3.6k scripts 43 dependents
jeroen
V8:Embedded JavaScript and WebAssembly Engine for R
An R interface to V8 <https://v8.dev>: Google's open source JavaScript and WebAssembly engine. This package can be compiled either with V8 version 6 and up or NodeJS when built as a shared library.
Maintained by Jeroen Ooms. Last updated 1 day ago.
3.0 match 201 stars 15.80 score 508 scripts 336 dependents
trelliscope
trelliscope:Create Interactive Multi-Panel Displays
Trelliscope enables interactive exploration of data frames of visualizations.
Maintained by Ryan Hafen. Last updated 7 months ago.
7.2 match 29 stars 6.43 score 117 scripts
constantino-garcia
nonlinearTseries:Nonlinear Time Series Analysis
Functions for nonlinear time series analysis. This package permits the computation of the most-used nonlinear statistics/algorithms including generalized correlation dimension, information dimension, largest Lyapunov exponent, sample entropy and Recurrence Quantification Analysis (RQA), among others. Basic routines for surrogate data testing are also included. Part of this work was based on the book "Nonlinear time series analysis" by Holger Kantz and Thomas Schreiber (ISBN: 9780521529020).
Maintained by Constantino A. Garcia. Last updated 6 months ago.
chaos chaotic-systems nonlinear-dynamics nonlinear-time-series time-series openblas cpp
5.0 match 35 stars 8.98 score 123 scripts 7 dependents
bnosac
udpipe:Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit
This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser, namely functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.
Maintained by Jan Wijffels. Last updated 2 years ago.
conll dependency-parser lemmatization natural-language-processing nlp pos-tagging r-pkg rcpp text-mining tokenizer udpipe cpp
3.8 match 215 stars 11.83 score 1.2k scripts 9 dependents
ijlyttle
vembedr:Embed Video in HTML
A set of functions for generating HTML to embed hosted video in your R Markdown documents or Shiny applications.
Maintained by Ian Lyttle. Last updated 3 years ago.
box embed-videos rmarkdown shiny vimeo youtube
5.6 match 58 stars 7.83 score 520 scripts
bioc
StabMap:Stabilised mosaic single cell data integration using unshared features
StabMap performs single cell mosaic data integration by first building a mosaic data topology, and for each reference dataset, traverses the topology to project and predict data onto a common embedding. Mosaic data should be provided in a list format, with all relevant features included in the data matrices within each list object. The output of stabMap is a joint low-dimensional embedding taking into account all available relevant features. Expression imputation can also be performed using the StabMap embedding and any of the original data matrices for given reference and query cell lists.
Maintained by Shila Ghazanfar. Last updated 5 months ago.
singlecell dimensionreduction software
7.4 match 5.95 score 60 scripts
samuel-marsh
scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing
Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to 1) provide customized visualizations that are easier to use and produce more aesthetic and functional visuals, and 2) improve the speed/reproducibility of common tasks/pieces of code in scRNA-seq analysis with a single function or group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.
Maintained by Samuel Marsh. Last updated 3 months ago.
customization ggplot2 scrna-seq seurat single-cell single-cell-genomics single-cell-rna-seq visualization
5.0 match 242 stars 8.75 score 1.1k scripts
tidymodels
recipes:Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 5 days ago.
2.3 match 584 stars 18.71 score 7.2k scripts 380 dependents
rstudio
blastula:Easily Send HTML Email Messages
Compose and send out responsive HTML email messages that render perfectly across a range of email clients and device sizes. Helper functions let the user insert embedded images, web link buttons, and 'ggplot2' plot objects into the message body. Messages can be sent through an 'SMTP' server, through the 'Posit Connect' service, or through the 'Mailgun' API service <https://www.mailgun.com/>.
Maintained by Richard Iannone. Last updated 8 months ago.
easy-to-useemailhtmlmarkdownresponsive-emailsmtp
4.1 match 552 stars 10.27 score 348 scripts 5 dependents
bnosac
textplot:Text Plots
Visualise complex relations in texts. This is done by providing functionalities for displaying text co-occurrence networks, text correlation networks, dependency relationships as well as text clustering and semantic text 'embeddings'. Feel free to join the effort of providing interesting text visualisations.
Maintained by Jan Wijffels. Last updated 3 years ago.
6.1 match 54 stars 6.78 score 75 scripts 1 dependent
stscl
spEDM:Spatial Empirical Dynamic Modeling
Inferring causal associations in cross-sectional earth system data through empirical dynamic modeling (EDM), with extensions to convergent cross mapping from Sugihara et al. (2012) <doi:10.1126/science.1227079>, partial cross mapping as outlined in Leng et al. (2020) <doi:10.1038/s41467-020-16238-0>, and cross mapping cardinality as described in Tao et al. (2023) <doi:10.1016/j.fmre.2023.01.007>.
Maintained by Wenbo Lv. Last updated 3 hours ago.
causal-inferencecppempirical-dynamic-modelinggeoinformaticsgeospatial-causalityspatial-statisticsopenblascppopenmp
6.8 match 17 stars 6.11 score 2 scripts
jaytimm
textpress:A Lightweight and Versatile NLP Toolkit
A simple Natural Language Processing (NLP) toolkit focused on search-centric workflows with minimal dependencies. The package offers key features for web scraping, text processing, corpus search, and text embedding generation via the 'HuggingFace API' <https://huggingface.co/docs/api-inference/index>.
Maintained by Jason Timm. Last updated 5 months ago.
corpus-searchnlpopenai-embeddingsweb-scraping
9.8 match 3 stars 4.18 score
bioc
scDataviz:scDataviz: single cell dataviz and downstream analyses
In the single cell world, which includes flow cytometry, mass cytometry, single-cell RNA-seq (scRNA-seq), and others, there is a need to improve data visualisation and to bring analysis capabilities to researchers even from non-technical backgrounds. scDataviz attempts to fit into this space, while also catering for advanced users. Additionally, due to the way that scDataviz is designed, which is based on SingleCellExperiment, it has a 'plug and play' feel, and immediately lends itself as flexible and compatible with studies that go beyond scDataviz. Finally, the graphics in scDataviz are generated via the ggplot engine, which means that users can 'add on' features to these with ease.
Maintained by Kevin Blighe. Last updated 5 months ago.
singlecellimmunooncologyrnaseqgeneexpressiontranscriptionflowcytometrymassspectrometrydataimport
6.4 match 63 stars 6.30 score 16 scripts
rstudio
keras3:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 4 days ago.
2.9 match 845 stars 13.57 score 264 scripts 2 dependents
gesistsa
sweater:Speedy Word Embedding Association Test and Extras Using R
Conduct various tests for evaluating implicit biases in word embeddings: Word Embedding Association Test (Caliskan et al., 2017), <doi:10.1126/science.aal4230>, Relative Norm Distance (Garg et al., 2018), <doi:10.1073/pnas.1720347115>, Mean Average Cosine Similarity (Mazini et al., 2019) <arXiv:1904.04047>, SemAxis (An et al., 2018) <arXiv:1806.05521>, Relative Negative Sentiment Bias (Sweeney & Najafian, 2019) <doi:10.18653/v1/P19-1162>, and Embedding Coherence Test (Dev & Phillips, 2019) <arXiv:1901.07656>.
Maintained by Chung-hong Chan. Last updated 1 month ago.
bias-detectiontextanalysiswordembeddingcpp
7.5 match 30 stars 4.80 score 14 scripts
rstudio
tfdatasets:Interface to 'TensorFlow' Datasets
Interface to 'TensorFlow' Datasets, a high-level library for building complex input pipelines from simple, re-usable pieces. See <https://www.tensorflow.org/guide> for additional details.
Maintained by Tomasz Kalinowski. Last updated 4 days ago.
3.8 match 34 stars 9.32 score 656 scripts 3 dependents
mkearney
wactor:Word Factor Vectors
A user-friendly factor-like interface for converting strings of text into numeric vectors and rectangular data structures.
Maintained by Michael W. Kearney. Last updated 5 years ago.
texttext-classificationtext-processingtext-vectorizationword-embeddingsword-vectorsword2vec
7.5 match 33 stars 4.52 score 3 scripts
bioc
corral:Correspondence Analysis for Single Cell Data
Correspondence analysis (CA) is a matrix factorization method, and is similar to principal components analysis (PCA). Whereas PCA is designed for application to continuous, approximately normally distributed data, CA is appropriate for non-negative, count-based data that are in the same additive scale. The corral package implements CA for dimensionality reduction of a single matrix of single-cell data, as well as a multi-table adaptation of CA that leverages data-optimized scaling to align data generated from different sequencing platforms by projecting into a shared latent space. corral utilizes sparse matrices and a fast implementation of SVD, and can be called directly on Bioconductor objects (e.g., SingleCellExperiment) for easy pipeline integration. The package also includes additional options, including variations of CA to address overdispersion in count data (e.g., Freeman-Tukey chi-squared residual), as well as the option to apply CA-style processing to continuous data (e.g., proteomic TOF intensities) with the Hellinger distance adaptation of CA.
Maintained by Lauren Hsu. Last updated 5 months ago.
batcheffectdimensionreductiongeneexpressionpreprocessingprincipalcomponentsequencingsinglecellsoftwarevisualization
7.2 match 4.64 score 22 scripts
aphalo
gginnards:Explore the Innards of 'ggplot2' Objects
Extensions to 'ggplot2' providing low-level debug tools: statistics and geometries echoing their data argument. Layer manipulation: deletion, insertion, extraction and reordering of layers. Deletion of unused variables from the data object embedded in "ggplot" objects.
Maintained by Pedro J. Aphalo. Last updated 3 months ago.
datavizdebuggingggplot2-enhancementesggplot2-layer-manipulationinspection
3.6 match 25 stars 9.03 score 378 scripts 3 dependents
rikenbit
Vicus:Exploiting Local Structures to Improve Network-Based Analysis of Biological Data
Compared with similar graph embedding methods such as Laplacian Eigenmaps, 'Vicus' can exploit more local structures of graph data. For the details of the methods, see the reference section of 'GitHub' README.md <https://github.com/rikenbit/Vicus>.
Maintained by Koki Tsuyuzaki. Last updated 2 years ago.
8.8 match 1 star 3.70 score
anders-biostat
sleepwalk:Interactively Explore Dimension-Reduced Embeddings
A tool to interactively explore the embeddings created by dimension reduction methods such as Principal Components Analysis (PCA), Multidimensional Scaling (MDS), T-distributed Stochastic Neighbour Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP) or any other.
Maintained by Svetlana Ovchinnikova. Last updated 3 years ago.
5.6 match 110 stars 5.82 score 40 scripts
r-lib
rlang:Functions for Base Types and Core R and 'Tidyverse' Features
A toolbox for working with base types, core R features like the condition system, and core 'Tidyverse' features like tidy evaluation.
Maintained by Lionel Henry. Last updated 19 days ago.
1.6 match 517 stars 20.53 score 9.8k scripts 15k dependents
mlr-org
mlr3filters:Filter Based Feature Selection for 'mlr3'
Extends 'mlr3' with filter methods for feature selection. Besides standalone filter methods, built-in methods of any machine-learning algorithm are supported. Partial scoring of multivariate filter methods is supported.
Maintained by Marc Becker. Last updated 4 months ago.
feature-selectionfilterfiltersmlrmlr3variable-importance
3.6 match 20 stars 8.37 score 95 scripts 3 dependents
bioc
celda:CEllular Latent Dirichlet Allocation
Celda is a suite of Bayesian hierarchical models for clustering single-cell RNA-sequencing (scRNA-seq) data. It is able to perform "bi-clustering" and simultaneously cluster genes into gene modules and cells into cell subpopulations. It also contains DecontX, a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. A variety of scRNA-seq data visualization functions is also included.
Maintained by Joshua Campbell. Last updated 27 days ago.
singlecellgeneexpressionclusteringsequencingbayesianimmunooncologydataimportcppopenmp
2.8 match 147 stars 10.47 score 256 scripts 2 dependents
alaninglis
vivid:Variable Importance and Variable Interaction Displays
A suite of plots for displaying variable importance and two-way variable interaction jointly. Can also display partial dependence plots laid out in a pairs plot or 'zenplots' style.
Maintained by Alan Inglis. Last updated 8 months ago.
4.0 match 21 stars 7.39 score 39 scripts
cran
SELF:A Structural Equation Embedded Likelihood Framework for Causal Discovery
Provides the SELF criteria to learn causal structure. Please cite "Ruichu Cai, Jie Qiao, Zhenjie Zhang, Zhifeng Hao. SELF: Structural Equational Embedded Likelihood Framework for Causal Discovery. AAAI. 2018."
Maintained by Jie Qiao. Last updated 7 years ago.
5.1 match 5.74 score 16k scripts
bnaras
ECOSolveR:Embedded Conic Solver in R
R interface to the Embedded COnic Solver (ECOS), an efficient and robust C library for convex problems. Conic and equality constraints can be specified in addition to integer and boolean variable constraints for mixed-integer problems. This R interface is inspired by the python interface and has similar calling conventions.
Maintained by Balasubramanian Narasimhan. Last updated 9 months ago.
3.6 match 5 stars 7.86 score 30 scripts 58 dependents
casperhart
detourr:Portable and Performant Tour Animations
Provides 2D and 3D tour animations as HTML widgets. The user can interact with the widgets using orbit controls, tooltips, brushing, and timeline controls. Linked brushing is supported using 'crosstalk', and widgets can be embedded in Shiny apps or HTML documents.
Maintained by Casper Hart. Last updated 5 months ago.
5.9 match 7 stars 4.83 score 48 scripts
bioc
Banksy:Spatial transcriptomic clustering
Banksy is an R package that incorporates spatial information to cluster cells in a feature space (e.g. gene expression). To incorporate spatial information, BANKSY computes the mean neighborhood expression and azimuthal Gabor filters that capture gene expression gradients. These features are combined with the cell's own expression to embed cells in a neighbor-augmented product space which can then be clustered, allowing for accurate and spatially-aware cell typing and tissue domain segmentation.
Maintained by Joseph Lee. Last updated 12 days ago.
clusteringspatialsinglecellgeneexpressiondimensionreductionclustering-algorithmsingle-cell-omicsspatial-omics
3.1 match 90 stars 9.03 score 248 scripts
tyy20
KEPTED:Kernel-Embedding-of-Probability Test for Elliptical Distribution
Provides an implementation of a kernel-embedding of probability test for elliptical distribution. This is an asymptotic test for elliptical distribution under general alternatives, and the location and shape parameters are assumed to be unknown. Some side-products are posted, including the transformation between rectangular and polar coordinates and two product-type kernel functions. See Tang and Li (2024) <doi:10.48550/arXiv.2306.10594> for details.
Maintained by Yin Tang. Last updated 11 months ago.
6.7 match 4.18 score
kdpsingh
clinspacy:Clinical Natural Language Processing using 'spaCy', 'scispaCy', and 'medspaCy'
Performs biomedical named entity recognition, Unified Medical Language System (UMLS) concept mapping, and negation detection using the Python 'spaCy', 'scispaCy', and 'medspaCy' packages, and transforms extracted data into a wide format for inclusion in machine learning models. The development of the 'scispaCy' package is described by Neumann (2019) <doi:10.18653/v1/W19-5034>. The 'medspacy' package uses 'ConText', an algorithm for determining the context of clinical statements described by Harkema (2009) <doi:10.1016/j.jbi.2009.05.002>. Clinspacy also supports entity embeddings from 'scispaCy' and UMLS 'cui2vec' concept embeddings developed by Beam (2018) <arXiv:1804.01486>.
Maintained by Karandeep Singh. Last updated 4 years ago.
5.8 match 99 stars 4.81 score 13 scripts
r-spatial
gstat:Spatial and Spatio-Temporal Geostatistical Modelling, Prediction and Simulation
Variogram modelling; simple, ordinary and universal point or block (co)kriging; spatio-temporal kriging; sequential Gaussian or indicator (co)simulation; variogram and variogram map plotting utility functions; supports sf and stars.
Maintained by Edzer Pebesma. Last updated 10 days ago.
1.9 match 197 stars 14.78 score 4.8k scripts 57 dependents
citoverse
cito:Building and Training Neural Networks
The 'cito' package provides a user-friendly interface for training and interpreting deep neural networks (DNN). 'cito' simplifies the fitting of DNNs by supporting the familiar formula syntax and hyperparameter tuning under cross-validation, and it helps to detect and handle convergence problems. DNNs can be trained on CPU, GPU and MacOS GPUs. In addition, 'cito' has many downstream functionalities such as various explainable AI (xAI) metrics (e.g. variable importance, partial dependence plots, accumulated local effect plots, and effect estimates) to interpret trained DNNs. 'cito' optionally provides confidence intervals (and p-values) for all xAI metrics and predictions. At the same time, 'cito' is computationally efficient because it is based on the deep learning framework 'torch'. The 'torch' package is native to R, so no Python installation or other API is required for this package.
Maintained by Maximilian Pichler. Last updated 2 months ago.
machine-learningneural-network
3.0 match 42 stars 9.07 score 129 scripts 1 dependent
benwiseman
sentiment.ai:Simple Sentiment Analysis Using Deep Learning
Sentiment Analysis via deep learning and gradient boosting models with a lot of the underlying hassle taken care of to make the process as simple as possible. In addition to out-performing traditional, lexicon-based sentiment analysis (see <https://benwiseman.github.io/sentiment.ai/#Benchmarks>), it also allows the user to create embedding vectors for text which can be used in other analyses. GPU acceleration is supported on Windows and Linux.
Maintained by Ben Wiseman. Last updated 3 years ago.
10.0 match 2.70 score 7 scripts
rezakj
iCellR:Analyzing High-Throughput Single Cell Sequencing Data
A toolkit that allows scientists to work with data from single cell sequencing technologies such as scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST). Single (i) Cell R package ('iCellR') provides unprecedented flexibility at every step of the analysis pipeline, including normalization, clustering, dimensionality reduction, imputation, visualization, and so on. Users can design both unsupervised and supervised models to best suit their research. In addition, the toolkit provides 2D and 3D interactive visualizations, differential expression analysis, filters based on cells, genes and clusters, data merging, normalizing for dropouts, data imputation methods, correcting for batch differences, pathway analysis, tools to find marker genes for clusters and conditions, predict cell types and pseudotime analysis. See Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.05.05.078550> and Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.03.31.019109> for more details.
Maintained by Alireza Khodadadi-Jamayran. Last updated 8 months ago.
10xgenomics3dbatch-normalizationcell-type-classificationcite-seqclusteringclustering-algorithmdiffusion-mapsdropouticellrimputationintractive-graphnormalizationpseudotimescrna-seqscvdj-seqsingel-cell-sequencingumapcpp
4.9 match 121 stars 5.56 score 7 scripts 1 dependent
egenn
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 month ago.
data-sciencedata-visualizationmachine-learningmachine-learning-libraryvisualization
3.8 match 145 stars 7.09 score 50 scripts 2 dependents
hafen
trelliscopejs:Create Interactive Trelliscope Displays
Trelliscope is a scalable, flexible, interactive approach to visualizing data (Hafen, 2013 <doi:10.1109/LDAV.2013.6675164>). This package provides methods that make it easy to create a Trelliscope display specification for TrelliscopeJS. High-level functions are provided for creating displays from within 'tidyverse' or 'ggplot2' workflows. Low-level functions are also provided for creating new interfaces.
Maintained by Ryan Hafen. Last updated 1 year ago.
2.8 match 264 stars 9.50 score 1000 scripts 1 dependent
kharchenkolab
pagoda2:Single Cell Analysis and Differential Expression
Analyzing and interactively exploring large-scale single-cell RNA-seq datasets. 'pagoda2' primarily performs normalization and differential gene expression analysis, with an interactive application for exploring single-cell RNA-seq datasets. It performs basic tasks such as cell size normalization, gene variance normalization, and can be used to identify subpopulations and run differential expression within individual samples. 'pagoda2' was written to rapidly process modern large-scale scRNAseq datasets of approximately 1e6 cells. The companion web application allows users to explore which gene expression patterns form the different subpopulations within your data. The package also serves as the primary method for preprocessing data for conos, <https://github.com/kharchenkolab/conos>. This package interacts with data available through the 'p2data' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/pagoda2>. The size of the 'p2data' package is approximately 6 MB.
Maintained by Evan Biederstedt. Last updated 1 year ago.
scrna-seqsingle-cellsingle-cell-rna-seqtranscriptomicsopenblascppopenmp
3.3 match 222 stars 8.00 score 282 scripts
mhahsler
seriation:Infrastructure for Ordering Objects Using Seriation
Infrastructure for ordering objects with an implementation of several seriation/sequencing/ordination techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT). Hahsler et al (2008) <doi:10.18637/jss.v025.i03>.
Maintained by Michael Hahsler. Last updated 3 months ago.
combinatorial-optimizationordinationseriationfortran
1.9 match 77 stars 14.07 score 640 scripts 79 dependents
jdonaldson
tsne:T-Distributed Stochastic Neighbor Embedding for R (t-SNE)
A "pure R" implementation of the t-SNE algorithm.
Maintained by Justin Donaldson. Last updated 6 years ago.
2.8 match 58 stars 9.35 score 656 scripts 13 dependents
zeileis
exams2forms:Embedding 'exams' Exercises as Forms in 'rmarkdown' or 'quarto' Documents
Automatic generation of quizzes or individual questions as (interactive) forms within 'rmarkdown' or 'quarto' documents based on 'R/exams' exercises.
Maintained by Achim Zeileis. Last updated 2 days ago.
5.6 match 1 star 4.65 score 9 scripts
r-lib
testthat:Unit Testing for R
Software testing is important, but, in part because it is frustrating and boring, many of us avoid it. 'testthat' is a testing framework for R that is easy to learn and use, and integrates with your existing 'workflow'.
Maintained by Hadley Wickham. Last updated 16 days ago.
1.3 match 900 stars 20.97 score 74k scripts 465 dependents
paithiov909
apportita:Utility for Handling 'magnitude' Word Embeddings
A partial R port from 'magnitude', which is a fast, simple utility library for handling vector embeddings. The main goal of this package is to enable access to user's local magnitude data store.
Maintained by Akiru Kato. Last updated 2 months ago.
15.4 match 1 star 1.70 score 4 scripts
alishinski
lavaanPlot:Path Diagrams for 'Lavaan' Models via 'DiagrammeR'
Plots path diagrams from models in 'lavaan' using the plotting functionality from the 'DiagrammeR' package. 'DiagrammeR' provides nice path diagrams via 'Graphviz', and these functions make it easy to generate these diagrams from a 'lavaan' path model without having to write the DOT language graph specification.
Maintained by Alex Lishinski. Last updated 1 year ago.
3.1 match 40 stars 8.33 score 294 scripts
fishfollower
stockassessment:State-Space Assessment Model
Fitting SAM...
Maintained by Anders Nielsen. Last updated 13 days ago.
3.3 match 49 stars 7.76 score 324 scripts 2 dependents
peterwmacd
fase:Functional Adjacency Spectral Embedding
Latent process embedding for functional network data with the Functional Adjacency Spectral Embedding. Fits smooth latent processes based on cubic spline bases. Also generates functional network data from three models, and evaluates a network generalized cross-validation criterion for dimension selection. For more information, see MacDonald, Zhu and Levina (2022+) <arXiv:2210.07491>.
Maintained by Peter W. MacDonald. Last updated 9 months ago.
7.5 match 3.40 score 3 scripts
thothorn
HSAUR3:A Handbook of Statistical Analyses Using R (3rd Edition)
Functions, data sets, analyses and examples from the third edition of the book ''A Handbook of Statistical Analyses Using R'' (Torsten Hothorn and Brian S. Everitt, Chapman & Hall/CRC, 2014). The first chapter of the book, which is entitled ''An Introduction to R'', is completely included in this package; for all other chapters, a vignette containing all data analyses is available. In addition, Sweave source code for slides of selected chapters is included in this package (see HSAUR3/inst/slides). The publisher's web page is '<https://www.routledge.com/A-Handbook-of-Statistical-Analyses-using-R/Hothorn-Everitt/p/book/9781482204582>'.
Maintained by Torsten Hothorn. Last updated 7 months ago.
3.8 match 6 stars 6.72 score 120 scripts 2 dependents
bioc
ttgsea:Tokenizing Text of Gene Set Enrichment Analysis
Functional enrichment analysis methods such as gene set enrichment analysis (GSEA) have been widely used for analyzing gene expression data. GSEA is a powerful method to infer results of gene expression data at a level of gene sets by calculating enrichment scores for predefined sets of genes. GSEA depends on the availability and accuracy of gene sets. There are overlaps between terms of gene sets or categories because multiple terms may exist for a single biological process, and it can thus lead to redundancy within enriched terms. In other words, the sets of related terms are overlapping. Using deep learning, this package aims to predict enrichment scores for unique tokens or words from text in names of gene sets to resolve this overlapping set issue. Furthermore, we can coin a new term by combining tokens and find its enrichment score by predicting such combined tokens.
Maintained by Dongmin Jung. Last updated 5 months ago.
softwaregeneexpressiongenesetenrichment
5.1 match 4.95 score 3 scripts 3 dependents
bioc
HGC:A fast hierarchical graph-based clustering method
HGC (short for Hierarchical Graph-based Clustering) is an R package for conducting hierarchical clustering on large-scale single-cell RNA-seq (scRNA-seq) data. The key idea is to construct a dendrogram of cells on their shared nearest neighbor (SNN) graph. HGC provides functions for building graphs and for conducting hierarchical clustering on the graph. Users with an old R version can visit https://github.com/XuegongLab/HGC/tree/HGC4oldRVersion to get the HGC package built for R 3.6.
Maintained by XGlab. Last updated 5 months ago.
singlecellsoftwareclusteringrnaseqgraphandnetworkdnaseqcpp
5.3 match 4.70 score 25 scripts
tidymodels
dials:Tools for Creating Tuning Parameter Values
Many models contain tuning parameters (i.e. parameters that cannot be directly estimated from the data). These tools can be used to define objects for creating, simulating, or validating values for such parameters.
Maintained by Hannah Frick. Last updated 29 days ago.
1.8 match 114 stars 14.22 score 426 scripts 52 dependents
bioc
eiR:Accelerated similarity searching of small molecules
The eiR package provides utilities for accelerated structure similarity searching of very large small molecule data sets using an embedding and indexing approach.
Maintained by Thomas Girke. Last updated 1 month ago.
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsmicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportclusteringproteomicsmetabolomics
4.4 match 3 stars 5.51 score 12 scripts
joycekang
symphony:Efficient and Precise Single-Cell Reference Atlas Mapping
Implements the Symphony single-cell reference building and query mapping algorithms and additional functions described in Kang et al <https://www.nature.com/articles/s41467-021-25957-x>.
Maintained by Joyce Kang. Last updated 2 years ago.
6.3 match 3.83 score 134 scripts
wch
extrafont:Tools for Using Fonts
Tools for using fonts other than the standard PostScript fonts. This package makes it easy to use system TrueType fonts with PDF or PostScript output files, and with bitmap output files in Windows. extrafont can also be used with fonts packaged specifically for use with it, such as the fontcm package, which has Computer Modern PostScript fonts with math symbols.
Maintained by Winston Chang. Last updated 2 years ago.
1.7 match 324 stars 14.10 score 13k scripts 51 dependents
bioc
tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles
This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.
Maintained by Timothy Keyes. Last updated 5 months ago.
singlecellflowcytometrybioinformaticscytometrydata-sciencesingle-celltidyversecpp
3.3 match 19 stars 7.26 score 35 scripts
wgunderwood
motifcluster:Motif-Based Spectral Clustering of Weighted Directed Networks
Tools for spectral clustering of weighted directed networks using motif adjacency matrices. Methods perform well on large and sparse networks, and random sampling methods for generating weighted directed networks are also provided. Based on methodology detailed in Underwood, Elliott and Cucuringu (2020) <arXiv:2004.01293>.
Maintained by William George Underwood. Last updated 9 months ago.
4.2 match 26 stars 5.72 score 4 scripts
jwijffels
topicmodels.etm:Topic Modelling in Embedding Spaces
Find topics in texts which are semantically embedded using techniques like word2vec or Glove. This topic modelling technique models each word with a categorical distribution whose natural parameter is the inner product between a word embedding and an embedding of its assigned topic. The techniques are explained in detail in the paper 'Topic Modeling in Embedding Spaces' by Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei (2019), available at <arXiv:1907.04907>.
Maintained by Jan Wijffels. Last updated 3 years ago.
7.7 match 1 star 2.90 score 32 scripts
bioc
flowPlots:flowPlots: analysis plots and data class for gated flow cytometry data
Graphical displays with embedded statistical tests for gated ICS flow cytometry data, and a data class which stores "stacked" data and has methods for computing summary measures on stacked data, such as marginal and polyfunctional degree data.
Maintained by N. Hawkins. Last updated 5 months ago.
immunooncologyflowcytometrycellbasedassaysvisualizationdatarepresentation
6.8 match 3.30 score 1 script
irudnyts
openai:R Wrapper for OpenAI API
An R wrapper of OpenAI API endpoints (see <https://platform.openai.com/docs/introduction> for details). This package covers Models, Completions, Chat, Edits, Images, Embeddings, Audio, Files, Fine-tunes, Moderations, and legacy Engines endpoints.
Maintained by Iegor Rudnytskyi. Last updated 4 months ago.
2.8 match 172 stars 8.05 score 336 scripts 5 dependents
bioc
tricycle:tricycle: Transferable Representation and Inference of cell cycle
The package contains functions to infer and visualize cell cycle process using single-cell RNA-seq data. It exploits the idea of transfer learning, projecting new data to the previously learned biologically interpretable space. We provide a pre-learned cell cycle space, which could be used to infer cell cycle time of human and mouse single cell samples. In addition, we also offer functions to visualize cell cycle time on different embeddings and functions to build a new reference.
Maintained by Shijie Zheng. Last updated 5 months ago.
singlecellsoftwaretranscriptomicsrnaseqtranscriptionbiologicalquestiondimensionreductionimmunooncology
3.4 match 24 stars 6.52 score 46 scripts
shaelebrown
TDApplied:Machine Learning and Inference for Topological Data Analysis
Topological data analysis is a powerful tool for finding non-linear global structure in whole datasets. The main tool of topological data analysis is persistent homology, which computes a topological shape descriptor of a dataset called a persistence diagram. 'TDApplied' provides useful and efficient methods for analyzing groups of persistence diagrams with machine learning and statistical inference, and these functions can also interface with other data science packages to form flexible and integrated topological data analysis pipelines.
Maintained by Shael Brown. Last updated 5 months ago.
3.3 match 16 stars 6.60 score 8 scripts
doi-usgs
dataRetrieval:Retrieval Functions for USGS and EPA Hydrology and Water Quality Data
Collection of functions to help retrieve U.S. Geological Survey and U.S. Environmental Protection Agency water quality and hydrology data from web services. Data are discovered from National Water Information System <https://waterservices.usgs.gov/> and <https://waterdata.usgs.gov/nwis>. Water quality data are obtained from the Water Quality Portal <https://www.waterqualitydata.us/>.
Maintained by Laura DeCicco. Last updated 17 days ago.
1.5 match 280 stars 14.18 score 1.7k scripts 15 dependents
bioc
spatialHeatmap:spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions
The spatialHeatmap package offers the primary functionality for visualizing cell-, tissue- and organ-specific assay data in spatial anatomical images. Additionally, it provides extended functionalities for large-scale data mining routines and co-visualizing bulk and single-cell data. A description of the project is available here: https://spatialheatmap.org.
Maintained by Jianhai Zhang. Last updated 4 months ago.
spatialvisualizationmicroarraysequencinggeneexpressiondatarepresentationnetworkclusteringgraphandnetworkcellbasedassaysatacseqdnaseqtissuemicroarraysinglecellcellbiologygenetarget
3.4 match 5 stars 6.26 score 12 scripts
rstudio
sortable:Drag-and-Drop in 'shiny' Apps with 'SortableJS'
Enables drag-and-drop behaviour in Shiny apps, by exposing the functionality of the 'SortableJS' <https://sortablejs.github.io/Sortable/> JavaScript library as an 'htmlwidget'. You can use this in Shiny apps and widgets, 'learnr' tutorials as well as R Markdown. In addition, provides a custom 'learnr' question type - 'question_rank()' - that allows ranking questions with drag-and-drop.
Maintained by Andrie de Vries. Last updated 6 months ago.
1.8 match 135 stars 11.62 score 368 scripts 13 dependents
xd-deng
ECharts2Shiny:Embedding Interactive Charts Generated with ECharts Library into Shiny Applications
Embed interactive charts in Shiny applications. These charts are generated by the ECharts library developed by Baidu (<http://echarts.baidu.com/>). The current version supports line chart, bar chart, pie chart, scatter plot, gauge, word cloud, radar chart, tree map, and heat map.
Maintained by Xiaodong Deng. Last updated 4 years ago.
2.8 match 129 stars 7.42 score 135 scripts
eagerai
tfaddons:Interface to 'TensorFlow SIG Addons'
'TensorFlow SIG Addons' <https://www.tensorflow.org/addons> is a repository of community contributions that conform to well-established API patterns, but implement new functionality not available in core 'TensorFlow'. 'TensorFlow' natively supports a large number of operators, layers, metrics, losses, optimizers, and more. However, in a fast moving field like Machine Learning, there are many interesting new developments that cannot be integrated into core 'TensorFlow' (because their broad applicability is not yet clear, or it is mostly used by a smaller subset of the community).
Maintained by Turgut Abdullayev. Last updated 3 years ago.
deep-learningkerasneural-networkstensorflowtensorflow-addonstfa
4.0 match 20 stars 5.20 score 16 scripts
emilhvitfeldt
wordsalad:Provide Tools to Extract and Analyze Word Vectors
Provides access to various word embedding methods (GloVe, fasttext and word2vec) to extract word vectors using a unified framework to increase reproducibility and correctness.
Maintained by Emil Hvitfeldt. Last updated 4 years ago.
5.8 match 8 stars 3.60 score 9 scripts
bioc
scrapper:Bindings to C++ Libraries for Single-Cell Analysis
Implements R bindings to C++ code for analyzing single-cell (expression) data, mostly from various libscran libraries. Each function performs an individual step in the single-cell analysis workflow, ranging from quality control to clustering and marker detection. It is mostly intended for other Bioconductor package developers to build more user-friendly end-to-end workflows.
Maintained by Aaron Lun. Last updated 4 days ago.
normalizationrnaseqsoftwaregeneexpressiontranscriptomicssinglecellbatcheffectqualitycontroldifferentialexpressionfeatureextractionprincipalcomponentclusteringopenblascpp
3.8 match 5.55 score 32 scripts
haythorn
sr:Smooth Regression - The Gamma Test and Tools
Finds causal connections in precision data, finds lags and embeddings in time series, guides training of neural networks and other smooth models, evaluates their performance, and gives a mathematically grounded answer to the over-training problem. Smooth regression is based on the Gamma test, which measures smoothness in a multivariate relationship. Causal relations are smooth, noise is not. 'sr' includes the Gamma test and search techniques that use it. References: Evans & Jones (2002) <doi:10.1098/rspa.2002.1010>, AJ Jones (2004) <doi:10.1007/s10287-003-0006-1>.
Maintained by Wayne Haythorn. Last updated 2 years ago.
5.5 match 3.70 score 9 scripts
tidymodels
textrecipes:Extra 'Recipes' for Text Processing
Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf) and feature hashing.
Maintained by Emil Hvitfeldt. Last updated 8 days ago.
1.9 match 160 stars 10.87 score 964 scripts 1 dependents
mano-b
MicroSEC:Sequence Error Filter for Formalin-Fixed and Paraffin-Embedded Samples
Clinical sequencing of tumors is usually performed on formalin-fixed and paraffin-embedded samples, which have many sequencing errors. We found that the majority of these errors are detected in chimeric reads caused by single-stranded DNA with micro-homology. Our filtering pipeline focuses on the uneven distribution of the artifacts in each read and removes such errors in formalin-fixed and paraffin-embedded samples without over-eliminating the true mutations detected in fresh frozen samples.
Maintained by Masachika Ikegami. Last updated 3 months ago.
3.6 match 7 stars 5.66 score 8 scripts
andrija-djurovic
PDtoolkit:Collection of Tools for PD Rating Model Development and Validation
The goal of this package is to cover the most common steps in probability of default (PD) rating model development and validation. The main procedures available are those that refer to univariate, bivariate, multivariate analysis, calibration and validation. Along with the accompanying 'monobin' and 'monobinShiny' packages, 'PDtoolkit' provides functions which are suitable for different data transformation and modeling tasks such as: imputations, monotonic binning of numeric risk factors, binning of categorical risk factors, weights of evidence (WoE) and information value (IV) calculations, WoE coding (replacement of risk factor modalities with WoE values), risk factor clustering, area under curve (AUC) calculation and others. Additionally, the package provides a set of validation functions for testing homogeneity, heterogeneity, discriminatory and predictive power of the model.
Maintained by Andrija Djurovic. Last updated 1 years ago.
4.3 match 14 stars 4.78 score 86 scripts
bioc
veloviz:VeloViz: RNA-velocity informed 2D embeddings for visualizing cell state trajectories
VeloViz uses each cell's current observed and predicted future transcriptional states inferred from RNA velocity analysis to build a nearest neighbor graph between cells in the population. Edges are then pruned based on a cosine correlation threshold and/or a distance threshold and the resulting graph is visualized using a force-directed graph layout algorithm. VeloViz can help ensure that relationships between cell states are reflected in the 2D embedding, allowing for more reliable representation of underlying cellular trajectories.
Maintained by Lyla Atta. Last updated 5 months ago.
transcriptomicsvisualizationgeneexpressionsequencingrnaseqdimensionreductioncpp
5.0 match 4.00 score 6 scripts
bioc
DeProViR:A Deep-Learning Framework Based on Pre-trained Sequence Embeddings for Predicting Host-Viral Protein-Protein Interactions
Emerging infectious diseases, exemplified by the zoonotic COVID-19 pandemic caused by SARS-CoV-2, are grave global threats. Understanding protein-protein interactions (PPIs) between host and viral proteins is essential for therapeutic targets and insights into pathogen replication and immune evasion. While experimental methods like yeast two-hybrid screening and mass spectrometry provide valuable insights, they are hindered by experimental noise and costs, yielding incomplete interaction maps. Computational models, notably DeProViR, predict PPIs from amino acid sequences, incorporating semantic information with GloVe embeddings. DeProViR employs a Siamese neural network, integrating convolutional and Bi-LSTM networks to enhance accuracy. It overcomes the limitations of feature engineering, offering an efficient means to predict host-virus interactions, which holds promise for antiviral therapies and advancing our understanding of infectious diseases.
Maintained by Matineh Rahmatbakhsh. Last updated 5 months ago.
proteomicssystemsbiologynetworkinferenceneuralnetworknetwork
6.6 match 1 stars 3.00 score 1 scripts
crunch-io
crunch:Crunch.io Data Tools
The Crunch.io service <https://crunch.io/> provides a cloud-based data store and analytic engine, as well as an intuitive web interface. Using this package, analysts can interact with and manipulate Crunch datasets from within R. Importantly, this allows technical researchers to collaborate naturally with team members, managers, and clients who prefer a point-and-click interface.
Maintained by Greg Freedman Ellis. Last updated 10 days ago.
1.9 match 9 stars 10.53 score 200 scripts 2 dependents
brodieg
diffobj:Diffs for R Objects
Generate a colorized diff of two R objects for an intuitive visualization of their differences.
Maintained by Brodie Gaslam. Last updated 3 years ago.
1.5 match 232 stars 13.12 score 107 scripts 486 dependents
nucleic-acid
namedropR:Create Visual Citations for Presentations and Posters
Provides 'visual citations' containing the metadata of a scientific paper and a 'QR' code. A 'visual citation' is a banner containing title, authors, journal and year of a publication. This package can create such banners based on 'BibTeX' and 'BibLaTeX' references or call the reference metadata from the 'Crossref' API. The banners include a QR code pointing to the 'DOI'. The resulting HTML object or PNG image can be included in a presentation to point the audience to good resources for further reading. Styling is possible via predefined designs or via custom 'CSS'. This package is not intended as a replacement for proper reference manager packages, but as a tool to enrich scientific presentation slides and conference posters.
Maintained by Christian A. Gebhard. Last updated 2 years ago.
3.0 match 61 stars 6.44 score 8 scripts
cran
compositions:Compositional Data Analysis
Provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations) in the way proposed by J. Aitchison and V. Pawlowsky-Glahn.
Maintained by K. Gerald van den Boogaart. Last updated 1 years ago.
3.0 match 1 stars 6.35 score 36 dependents
robjhyndman
tsfeatures:Time Series Feature Extraction
Methods for extracting various features from time series data. The features provided are those from Hyndman, Wang and Laptev (2013) <doi:10.1109/ICDMW.2015.104>, Kang, Hyndman and Smith-Miles (2017) <doi:10.1016/j.ijforecast.2016.09.004> and from Fulcher, Little and Jones (2013) <doi:10.1098/rsif.2013.0048>. Features include spectral entropy, autocorrelations, measures of the strength of seasonality and trend, and so on. Users can also define their own feature functions.
Maintained by Rob Hyndman. Last updated 8 months ago.
1.6 match 254 stars 11.47 score 268 scripts 22 dependents
cogdisreslab
PAVER:PAVER: Pathway Analysis Visualization with Embedding Representations
Summary visualization using embedding representations to reveal underlying themes within sets of pathway terms.
Maintained by William G Ryan V. Last updated 8 months ago.
5.3 match 3.48 score 6 scripts
mkellerressel
hydra:Hyperbolic Embedding
Calculate an optimal embedding of a set of data points into low-dimensional hyperbolic space. This uses the strain-minimizing hyperbolic embedding of Keller-Ressel and Nargang (2019), see <arXiv:1903.08977>.
Maintained by Martin Keller-Ressel. Last updated 6 years ago.
8.4 match 2.15 score 20 scripts
tiledb-inc
tiledb:Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays
The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-value stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.
Maintained by Isaiah Norton. Last updated 4 days ago.
arrayhdfss3storage-managertiledbcpp
1.5 match 107 stars 11.96 score 306 scripts 4 dependents
dcourvoisier
doremi:Dynamics of Return to Equilibrium During Multiple Inputs
Provides models to fit the dynamics of a regulated system experiencing exogenous inputs. The underlying models use differential equations and linear mixed-effects regressions to estimate the coefficients of the equation. With them, the functions can provide an estimated signal. The package provides simulation and analysis functions and also print, summary, plot and predict methods, adapted to the function outputs, for easy implementation and presentation of results.
Maintained by Mongin Denis. Last updated 3 years ago.
3.8 match 4.48 score 25 scripts 1 dependents
wilart
SMARTbayesR:Bayesian Set of Best Dynamic Treatment Regimes and Sample Size in SMARTs for Binary Outcomes
Permits determination of a set of optimal dynamic treatment regimes and sample size for a SMART design in the Bayesian setting with binary outcomes. Please see Artman (2020) <arXiv:2008.02341>.
Maintained by William Artman. Last updated 3 years ago.
8.5 match 2.00 score 2 scripts
tidyverse
readxl:Read Excel Files
Import Excel files into R. Supports '.xls' via the embedded 'libxls' C library <https://github.com/libxls/libxls> and '.xlsx' via the embedded 'RapidXML' C++ library <https://rapidxml.sourceforge.net/>. Works on Windows, Mac and Linux without external dependencies.
Maintained by Jennifer Bryan. Last updated 9 days ago.
0.8 match 734 stars 20.85 score 160k scripts 815 dependents
mages
googleVis:R Interface to Google Charts
R interface to Google's chart tools, allowing users to create interactive charts based on data frames. Charts are displayed locally via the R HTTP help server. A modern browser with an Internet connection is required. The data remains local and is not uploaded to Google.
Maintained by Markus Gesmann. Last updated 10 months ago.
1.3 match 361 stars 12.98 score 2.4k scripts 11 dependents
luisdva
unheadr:Handle Data with Messy Header Rows and Broken Values
Verb-like functions to work with messy data, often derived from spreadsheets or parsed PDF tables. Includes functions for unwrapping values broken up across rows, relocating embedded grouping values, and annotating meaningful formatting in spreadsheet files.
Maintained by Luis D. Verde Arregoitia. Last updated 10 months ago.
2.5 match 61 stars 6.44 score 45 scripts
bioc
tomoda:Tomo-seq data analysis
This package provides many easy-to-use methods to analyze and visualize tomo-seq data. The tomo-seq technique is based on cryosectioning of tissue and performing RNA-seq on consecutive sections. (Reference: Kruse F, Junker JP, van Oudenaarden A, Bakkers J. Tomo-seq: A method to obtain genome-wide expression data with spatial resolution. Methods Cell Biol. 2016;135:299-307. doi:10.1016/bs.mcb.2016.01.006) The main purpose of the package is to find zones with similar transcriptional profiles and spatially expressed genes in a tomo-seq sample. Several visualization functions are available to create easy-to-modify plots.
Maintained by Wendao Liu. Last updated 5 months ago.
geneexpressionsequencingrnaseqtranscriptomicsspatialclusteringvisualization
4.0 match 4.00 score 2 scripts
bioc
monocle:Clustering, differential expression, and trajectory analysis for single-cell RNA-Seq
Monocle performs differential expression and time-series analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle also performs differential expression analysis, clustering, visualization, and other useful tasks on single cell expression data. It is designed to work with RNA-Seq and qPCR data, but could be used with other types as well.
Maintained by Cole Trapnell. Last updated 5 months ago.
immunooncologysequencingrnaseqgeneexpressiondifferentialexpressioninfrastructuredataimportdatarepresentationvisualizationclusteringmultiplecomparisonqualitycontrolcpp
1.8 match 8.89 score 1.6k scripts 2 dependents
insightsengineering
teal:Exploratory Web Apps for Analyzing Clinical Trials Data
A 'shiny' based interactive exploration framework for analyzing clinical trials data. 'teal' currently provides a dynamic filtering facility and different data viewers. 'teal' 'shiny' applications are built using standard 'shiny' modules.
Maintained by Dawid Kaledkowski. Last updated 20 days ago.
clinical-trialsnestshinywebapp
1.3 match 197 stars 12.68 score 176 scripts 5 dependents
pik-piam
gms:'GAMS' Modularization Support Package
A collection of tools to create, use and maintain modularized model code written in the modeling language 'GAMS' (<https://www.gams.com/>). Out-of-the-box 'GAMS' does not come with support for modularized model code. This package provides the tools necessary to convert a standard 'GAMS' model to a modularized one by introducing a modularized code structure together with a naming convention which emulates local environments. In addition, this package provides tools to monitor the compliance of the model code with modular coding guidelines.
Maintained by Jan Philipp Dietrich. Last updated 4 days ago.
1.8 match 1 stars 8.73 score 414 scripts 37 dependents
hfgolino
EGAnet:Exploratory Graph Analysis - a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics
Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.
Maintained by Hudson Golino. Last updated 9 days ago.
2.0 match 47 stars 7.80 score 61 scripts 1 dependents
ropensci
rsvg:Render SVG Images into PDF, PNG, (Encapsulated) PostScript, or Bitmap Arrays
Renders vector-based svg images into high-quality custom-size bitmap arrays using 'librsvg2'. The resulting bitmap can be written to e.g. png, jpeg or webp format. In addition, the package can convert images directly to various formats such as pdf or postscript.
Maintained by Jeroen Ooms. Last updated 6 months ago.
1.3 match 98 stars 11.66 score 868 scripts 45 dependents
mlr-org
mlr3fselect:Feature Selection for 'mlr3'
Feature selection package of the 'mlr3' ecosystem. It selects the optimal feature set for any 'mlr3' learner. The package works with several optimization algorithms e.g. Random Search, Recursive Feature Elimination, and Genetic Search. Moreover, it can automatically optimize learners and estimate the performance of optimized feature sets with nested resampling.
Maintained by Marc Becker. Last updated 2 months ago.
evolutionary-algorithmsexhaustive-searchfeature-selectionmachine-learningmlr3optimizationrandom-searchrecursive-feature-eliminationsequential-feature-selection
1.9 match 23 stars 8.25 score 70 scripts 2 dependents
drwolf85
spMC:Continuous-Lag Spatial Markov Chains
A set of functions is provided for 1) the stratum lengths analysis along a chosen direction, 2) fast estimation of continuous lag spatial Markov chains model parameters and probability computing (also for large data sets), 3) transition probability maps and transiograms drawing, 4) simulation methods for categorical random fields. More details on the methodology are discussed in Sartore (2013) <doi:10.32614/RJ-2013-022> and Sartore et al. (2016) <doi:10.1016/j.cageo.2016.06.001>.
Maintained by Luca Sartore. Last updated 2 years ago.
5.3 match 3 stars 2.92 score 55 scripts
jgarriga65
bigMap:Big Data Mapping
Unsupervised clustering protocol for large scale structured data, based on a low dimensional representation of the data. Dimensionality reduction is performed using a parallelized implementation of the t-Distributed Stochastic Neighbor Embedding algorithm (Garriga J. and Bartumeus F. (2018), <arXiv:1812.09869>).
Maintained by Joan Garriga. Last updated 9 months ago.
4.6 match 3.32 score 21 scripts
bioc
velociraptor:Toolkit for Single-Cell Velocity
This package provides Bioconductor-friendly wrappers for RNA velocity calculations in single-cell RNA-seq data. We use the basilisk package to manage Conda environments, and the zellkonverter package to convert data structures between SingleCellExperiment (R) and AnnData (Python). The information produced by the velocity methods is stored in the various components of the SingleCellExperiment class.
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
singlecellgeneexpressionsequencingcoveragerna-velocity
1.9 match 54 stars 8.06 score 52 scripts
bnprks
BPCells:Single Cell Counts Matrices to PCA
Efficient operations for single cell ATAC-seq fragments and RNA counts matrices. Interoperable with standard file formats, and introduces efficient bit-packed formats that allow large storage savings and increased read speeds.
Maintained by Benjamin Parks. Last updated 1 months ago.
2.0 match 184 stars 7.48 score 172 scripts
rstudio
tfhub:Interface to 'TensorFlow' Hub
'TensorFlow' Hub is a library for the publication, discovery, and consumption of reusable parts of machine learning models. A module is a self-contained piece of a 'TensorFlow' graph, along with its weights and assets, that can be reused across different tasks in a process known as transfer learning. Transfer learning can train a model with a smaller dataset, improve generalization, and speed up training.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
2.0 match 29 stars 7.46 score 73 scripts 1 dependents
nicholasdavies
luajr:'LuaJIT' Scripting
An interface to 'LuaJIT' <https://luajit.org>, a just-in-time compiler for the 'Lua' scripting language <https://www.lua.org>. Allows users to run 'Lua' code from 'R'.
Maintained by Nicholas Davies. Last updated 5 months ago.
2.3 match 25 stars 6.30 score 6 scripts
alexym1
fusionchartsR:Embedding FusionCharts in R
FusionCharts provides awesome and minimalist functions to make beautiful interactive charts <https://www.fusioncharts.com/>.
Maintained by Alex Yahiaoui Martinez. Last updated 3 months ago.
3.3 match 6 stars 4.40 score 42 scripts
kharchenkolab
conos:Clustering on Network of Samples
Wires together large collections of single-cell RNA-seq datasets, which allows for both the identification of recurrent cell clusters and the propagation of information between datasets in multi-sample or atlas-scale collections. 'Conos' focuses on the uniform mapping of homologous cell types across heterogeneous sample collections. For instance, users could investigate a collection of dozens of peripheral blood samples from cancer patients combined with dozens of controls, which perhaps includes samples of a related tissue such as lymph nodes. This package interacts with data available through the 'conosPanel' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/conos>. The size of the 'conosPanel' package is approximately 12 MB.
Maintained by Evan Biederstedt. Last updated 1 years ago.
batch-correctionscrna-seqsingle-cell-rna-seqopenblascppopenmp
2.0 match 204 stars 7.32 score 258 scripts
travis-barton
LilRhino:For Implementation of Feed Reduction, Learning Examples, NLP and Code Management
This is for code management functions, NLP tools, a Monty Hall simulator, and for implementing my own variable reduction technique called Feed Reduction. The Feed Reduction technique is not yet published, but is merely a tool for implementing a series of binary neural networks meant for reducing data into N dimensions, where N is the number of possible values of the response variable.
Maintained by Travis Barton. Last updated 3 years ago.
5.2 match 1 stars 2.78 score 12 scripts
bioc
snifter:R wrapper for the python openTSNE library
Provides an R wrapper for the implementation of FI-tSNE from the python package openTSNE. See Poličar et al. (2019) <doi:10.1101/731877> and the algorithm described by Linderman et al. (2018) <doi:10.1038/s41592-018-0308-4>.
Maintained by Alan OCallaghan. Last updated 5 months ago.
dimensionreductionvisualizationsoftwaresinglecellsequencing
2.9 match 3 stars 4.95 score 3 scripts
stephenslab
fastTopics:Fast Algorithms for Fitting Topic Models and Non-Negative Matrix Factorizations to Count Data
Implements fast, scalable optimization algorithms for fitting topic models ("grade of membership" models) and non-negative matrix factorizations to count data. The methods exploit the special relationship between the multinomial topic model (also, "probabilistic latent semantic indexing") and Poisson non-negative matrix factorization. The package provides tools to compare, annotate and visualize model fits, including functions to efficiently create "structure plots" and identify key features in topics. The 'fastTopics' package is a successor to the 'CountClust' package. For more information, see <doi:10.48550/arXiv.2105.13440> and <doi:10.1186/s13059-023-03067-9>. Please also see the GitHub repository for additional vignettes not included in the package on CRAN.
Maintained by Peter Carbonetto. Last updated 16 days ago.
1.7 match 79 stars 8.38 score 678 scripts 1 dependents
josherrickson
rlemon:R Access to LEMON Graph Algorithms
Allows easy access to the LEMON Graph Library set of algorithms, written in C++. See the LEMON project page at <https://lemon.cs.elte.hu/trac/lemon>. Current LEMON version is 1.3.1.
Maintained by Josh Errickson. Last updated 2 months ago.
2.0 match 8 stars 7.04 score 1 scripts 13 dependents
zdebruine
RcppML:Rcpp Machine Learning Library
Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.
Maintained by Zach DeBruine. Last updated 2 years ago.
clusteringmatrix-factorizationnmfrcpprcppeigensparse-matrixcppopenmp
1.3 match 104 stars 10.53 score 125 scripts 46 dependents
yusenzhang
qkerntool:Q-Kernel-Based and Conditionally Negative Definite Kernel-Based Machine Learning Tools
Nonlinear machine learning tool for classification, clustering and dimensionality reduction. It integrates 12 q-kernel functions and 15 conditionally negative definite kernel functions and includes the q-kernel and conditionally negative definite kernel versions of density-based spatial clustering of applications with noise, spectral clustering, generalized discriminant analysis, principal component analysis, multidimensional scaling, locally linear embedding, Sammon's mapping and t-distributed stochastic neighbor embedding.
Maintained by Yusen Zhang. Last updated 6 years ago.
6.4 match 1 stars 2.19 score 31 scripts
kumes
chatAI4R:Chat-Based Interactive Artificial Intelligence for R
The Large Language Model (LLM) represents a groundbreaking advancement in data science and programming, and also allows us to extend the world of R. A seamless interface for integrating the 'OpenAI' Web APIs into R is provided in this package. This package leverages LLM-based AI techniques, enabling efficient knowledge discovery and data analysis (see 'OpenAI' Web APIs details <https://openai.com/blog/openai-api>). The previous functions such as seamless translation and image generation have been moved to other packages 'deepRstudio' and 'stableDiffusion4R'.
Maintained by Satoshi Kume. Last updated 1 months ago.
aibioinformaticschatgptgptimageimage-generation
3.1 match 14 stars 4.45 score 3 scripts
bioc
systemPipeTools:Tools for data visualization
The systemPipeTools package extends the widely used systemPipeR (SPR) workflow environment with an enhanced toolkit for data visualization, including utilities to automate the data visualization for analysis of differentially expressed genes (DEGs). systemPipeTools provides data transformation and data exploration functions via scatterplots, hierarchical clustering heatmaps, principal component analysis, multidimensional scaling, generalized principal components, t-Distributed Stochastic Neighbor Embedding (t-SNE), and MA and volcano plots. All these utilities can be integrated with the modular design of the systemPipeR environment, which allows users to easily substitute any of these features with custom alternatives.
Maintained by Daniela Cassol. Last updated 5 months ago.
infrastructuredataimportsequencingqualitycontrolreportwritingexperimentaldesignclusteringdifferentialexpressionmultidimensionalscalingprincipalcomponent
3.4 match 4.00 score 4 scripts
mdsr-book
mdsr:Complement to 'Modern Data Science with R'
A complement to all editions of *Modern Data Science with R* (ISBN: 978-0367191498, publisher URL: <https://www.routledge.com/Modern-Data-Science-with-R/Baumer-Kaplan-Horton/p/book/9780367191498>). This package contains data and code to complete exercises and reproduce examples from the text. It also facilitates connections to the SQL database server used in the book. All editions of the book are supported by this package.
Maintained by Benjamin S. Baumer. Last updated 7 months ago.
1.9 match 38 stars 7.21 score 504 scripts
ltorgo
DMwR2:Functions and Data for the Second Edition of "Data Mining with R"
Functions and data accompanying the second edition of the book "Data Mining with R, learning with case studies" by Luis Torgo, published by CRC Press.
Maintained by Luis Torgo. Last updated 8 years ago.
1.7 match 27 stars 7.46 score 380 scripts 2 dependents
kloppen
rde:Reproducible Data Embedding
Allows caching of raw data directly in R code. This allows R scripts and R Notebooks to be shared and re-run on a machine without access to the original data. Cached data is encoded into an ASCII string that can be pasted into R code. When the code is run, the data is automatically loaded from the cached version if the original data file is unavailable. Works best for small datasets (a few hundred observations).
Maintained by Stefan Kloppenborg. Last updated 5 years ago.
3.3 match 1 stars 3.70 score 7 scripts
bioc
netSmooth:Network smoothing for scRNAseq
netSmooth is an R package for network smoothing of single cell RNA sequencing data. Using bio networks such as protein-protein interactions as priors for gene co-expression, netSmooth improves cell type identification from noisy, sparse scRNAseq data.
Maintained by Jonathan Ronen. Last updated 5 months ago.
networkgraphandnetworksinglecellrnaseqgeneexpressionsequencingtranscriptomicsnormalizationpreprocessingclusteringdimensionreductionbioinformaticsgenomicssingle-cell
1.7 match 27 stars 7.41 score 4 scripts
fkeck
subtools:Read and Manipulate Video Subtitles
A collection of functions to read, write and manipulate video subtitles. Supported formats include "srt", "subrip", "sub", "subviewer", "microdvd", "ssa", "ass", "substation", "vtt", and "webvtt".
Maintained by Francois Keck. Last updated 5 years ago.
3.5 match 45 stars 3.35 score
thomasp85
densityClust:Clustering by Fast Search and Find of Density Peaks
An improved implementation (based on k-nearest neighbors) of the density peak clustering algorithm, originally described by Alex Rodriguez and Alessandro Laio (Science, 2014 vol. 344). It can handle large datasets (> 100,000 samples) very efficiently. It was initially implemented by Thomas Lin Pedersen, with inputs from Sean Hughes and later improved by Xiaojie Qiu to handle large datasets with kNNs.
Maintained by Thomas Lin Pedersen. Last updated 1 years ago.
1.7 match 153 stars 7.14 score 75 scripts
r-forge
RHRV:Heart Rate Variability Analysis of ECG Data
Allows users to import data files containing heartbeat positions in the most broadly used formats, to remove outliers or points with unacceptable physiological values present in the time series, to plot HRV data, and to perform time domain, frequency domain and nonlinear HRV analysis. See Garcia et al. (2017) <DOI:10.1007/978-3-319-65355-6>.
Maintained by Leandro Rodriguez-Linares. Last updated 6 months ago.
1.7 match 6.79 score 63 scripts 1 dependent
carmonalab
GeneNMF:Non-Negative Matrix Factorization for Single-Cell Omics
A collection of methods to extract gene programs from single-cell gene expression data using non-negative matrix factorization (NMF). 'GeneNMF' contains functions to directly interact with the 'Seurat' toolkit and derive interpretable gene program signatures.
Maintained by Massimo Andreatta. Last updated 10 days ago.
1.8 match 102 stars 6.63 score 12 scripts
bioc
escheR:Unified multi-dimensional visualizations with Gestalt principles
The creation of effective visualizations is a fundamental component of data analysis. In biomedical research, new challenges are emerging to visualize multi-dimensional data in a 2D space, but current data visualization tools have limited capabilities. To address this problem, we leverage Gestalt principles to improve the design and interpretability of multi-dimensional data in 2D data visualizations, layering aesthetics to display multiple variables. The proposed visualization can be applied to spatially-resolved transcriptomics data, but also broadly to data visualized in 2D space, such as embedding visualizations. We provide this open source R package escheR, which is built off of the state-of-the-art ggplot2 visualization framework and can be seamlessly integrated into genomics toolboxes and workflows.
Maintained by Boyi Guo. Last updated 5 months ago.
spatial singlecell transcriptomics visualization software multidimensional single-cell spatial-omics
1.7 match 6 stars 6.74 score 153 scripts 1 dependent
rpuggaardrode
praatpicture:'Praat Picture' Style Plots of Acoustic Data
Quickly and easily generate plots of acoustic data aligned with transcriptions similar to those made in 'Praat' using either derived signals generated directly in R with 'wrassp' or imported derived signals from 'Praat'. Provides easy and fast out-of-the-box solutions but also a high degree of flexibility. Also provides options for embedding audio in figures and animating figures.
Maintained by Rasmus Puggaard-Rode. Last updated 20 days ago.
2.2 match 29 stars 5.28 score 3 scripts
alexander-pastukhov
eyelinkReader:Import Gaze Data for EyeLink Eye Tracker
Import gaze data from edf files generated by the SR Research <https://www.sr-research.com/> EyeLink eye tracker. Gaze data, both recorded events and samples, is imported per trial. The package allows extracting events of interest, such as saccades, blinks, etc., as well as recorded variables and custom events (areas of interest, triggers), into separate tables. The package requires the EDF API library, which can be obtained at <https://www.sr-research.com/support/>.
Maintained by Alexander Pastukhov. Last updated 3 months ago.
edf eye-tracking eyelink sr-research cpp
1.7 match 13 stars 6.52 score 34 scripts
kisungyou
maotai:Tools for Matrix Algebra, Optimization and Inference
The matrix is a universal and sometimes primary object/unit in applied mathematics and statistics. We provide a number of algorithms for selected problems in optimization and statistical inference. For a general exposition of the topic with a focus on the statistical context, see the book by Banerjee and Roy (2014, ISBN:9781420095388).
Maintained by Kisung You. Last updated 4 hours ago.
2.0 match 8 stars 5.51 score 15 scripts 9 dependents
duncanobrien
EWSmethods:Forecasting Tipping Points at the Community Level
Rolling and expanding window approaches to assessing abundance-based early warning signals, non-equilibrium resilience measures, and machine learning. See Dakos et al. (2012) <doi:10.1371/journal.pone.0041010>, Deb et al. (2022) <doi:10.1098/rsos.211475>, Drake and Griffen (2010) <doi:10.1038/nature09389>, Ushio et al. (2018) <doi:10.1038/nature25504> and Weinans et al. (2021) <doi:10.1038/s41598-021-87839-y> for methodological details. Graphical presentations of the outputs are also provided for clear and publishable figures. Visit the 'EWSmethods' website for more information and tutorials.
Maintained by Duncan OBrien. Last updated 7 months ago.
2.0 match 8 stars 5.51 score 20 scripts
bioc
CatsCradle:This package provides methods for analysing spatial transcriptomics data and for discovering gene clusters
This package addresses two broad areas. It allows for in-depth analysis of spatial transcriptomic data by identifying tissue neighbourhoods. These are contiguous regions of tissue surrounding individual cells. 'CatsCradle' allows for the categorisation of neighbourhoods by the cell types contained in them and the genes expressed in them. In particular, it produces Seurat objects whose individual elements are neighbourhoods rather than cells. In addition, it enables the categorisation and annotation of genes by producing Seurat objects whose elements are genes.
Maintained by Michael Shapiro. Last updated 1 month ago.
biologicalquestion statisticalmethod geneexpression singlecell transcriptomics spatial
1.7 match 3 stars 6.50 score
dvrbts
labdsv:Ordination and Multivariate Analysis for Ecology
A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.
Maintained by David W. Roberts. Last updated 2 years ago.
1.8 match 3 stars 6.08 score 452 scripts 13 dependents
minoo-asty
CINNA:Deciphering Central Informative Nodes in Network Analysis
Computing, comparing, and demonstrating top informative centrality measures within a network. "CINNA: an R/CRAN package to decipher Central Informative Nodes in Network Analysis" provides a comprehensive overview of the package functionality; see Ashtiani et al. (2018) <doi:10.1093/bioinformatics/bty819>.
Maintained by Minoo Ashtiani. Last updated 2 years ago.
3.3 match 1 star 3.29 score 98 scripts
jpfitzinger
tidyfit:Regularized Linear Modeling with Tidy Data
An extension to the 'R' tidy data environment for automated machine learning. The package allows fitting and cross validation of linear regression and classification algorithms on grouped data.
Maintained by Johann Pfitzinger. Last updated 2 months ago.
auto-ml classification machine-learning regression tidyverse
1.5 match 16 stars 7.22 score 26 scripts
bnosac
sentencepiece:Text Tokenization using Byte Pair Encoding and Unigram Modelling
Unsupervised text tokenizer for performing byte pair encoding and unigram modelling. Wraps the 'sentencepiece' library <https://github.com/google/sentencepiece> which provides a language-independent tokenizer to split text into words and smaller subword units. The techniques are explained in the paper "SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing" by Taku Kudo and John Richardson (2018) <doi:10.18653/v1/D18-2012>. Also provides straightforward access to pretrained byte pair encoding models and subword embeddings trained on Wikipedia using 'word2vec', as described in "BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages" by Benjamin Heinzerling and Michael Strube (2018) <http://www.lrec-conf.org/proceedings/lrec2018/pdf/1049.pdf>.
Maintained by Jan Wijffels. Last updated 2 years ago.
byte natural-language-processing sentencepiece word-segmentation cpp
2.6 match 25 stars 4.10 score 8 scripts
muschellij2
cifti:Toolbox for Connectivity Informatics Technology Initiative ('CIFTI') Files
Functions for the input/output and visualization of medical imaging data in the form of 'CIFTI' files <https://www.nitrc.org/projects/cifti/>.
Maintained by John Muschelli. Last updated 5 years ago.
1.9 match 3 stars 5.76 score 96 scripts
mengxu98
inferCSN:Inferring Cell-Specific Gene Regulatory Network
An R package for inferring cell-type-specific gene regulatory networks from single-cell RNA data.
Maintained by Meng Xu. Last updated 4 days ago.
2.3 match 3 stars 4.79 score 6 scripts
kisungyou
NetworkDistance:Distance Measures for Networks
Networks are a prevalent form of data structure in many fields. Many distance or metric measures have been proposed to define the concept of similarity between two networks as objects of analysis. We provide a number of distance measures for networks. See Jurman et al (2011) <doi:10.3233/978-1-60750-692-8-227> for an overview of the spectral class of inter-graph distance measures.
Maintained by Kisung You. Last updated 2 years ago.
distance network network-analysis openblas cpp openmp
1.9 match 9 stars 5.58 score 28 scripts 1 dependent
scottgigante
phateR:PHATE - Potential of Heat-Diffusion for Affinity-Based Transition Embedding
PHATE is a tool for visualizing high dimensional single-cell data with natural progressions or trajectories. PHATE uses a novel conceptual framework for learning and visualizing the manifold inherent to biological systems in which smooth transitions mark the progressions of cells from one state to another. To see how PHATE can be applied to single-cell RNA-seq datasets from hematopoietic stem cells, human embryonic stem cells, and bone marrow samples, check out our publication in Nature Biotechnology at <doi:10.1038/s41587-019-0336-3>.
Maintained by Scott Gigante. Last updated 4 years ago.
2.8 match 1 star 3.72 score 262 scripts
cran
Mercator:Clustering and Visualizing Distance Matrices
Defines the classes used to explore, cluster and visualize distance matrices, especially those arising from binary data. See Abrams and colleagues, 2021, <doi:10.1093/bioinformatics/btab037>.
Maintained by Kevin R. Coombes. Last updated 5 months ago.
2.4 match 4.33 score 12 scripts 1 dependent
antoniofabio
tseriesChaos:Analysis of Nonlinear Time Series
Routines for the analysis of nonlinear time series. This work is largely inspired by the TISEAN project, by Rainer Hegger, Holger Kantz and Thomas Schreiber: <http://www.mpipks-dresden.mpg.de/~tisean/>.
Maintained by Antonio Fabio Di Narzo. Last updated 6 years ago.
2.0 match 3 stars 5.12 score 106 scripts 6 dependents
tidyverse
haven:Import and Export 'SPSS', 'Stata' and 'SAS' Files
Import foreign statistical formats into R via the embedded 'ReadStat' C library, <https://github.com/WizardMac/ReadStat>.
Maintained by Hadley Wickham. Last updated 5 months ago.
0.5 match 427 stars 18.63 score 18k scripts 682 dependents
ravingmantis
unittest:TAP-Compliant Unit Testing
Concise TAP <http://testanything.org/> compliant unit testing package. Authored tests can be run using CMD check with minimal implementation overhead.
Maintained by Jamie Lentin. Last updated 7 months ago.
1.3 match 4 stars 7.43 score 224 scripts
bioc
RDRToolbox:A package for nonlinear dimension reduction with Isomap and LLE.
A package for nonlinear dimension reduction using the Isomap and LLE algorithms. It also includes a routine for computing the Davies-Bouldin index for cluster validation, a plotting tool and a data generator for microarray gene expression data and for the Swiss Roll dataset.
Maintained by Christoph Bartenhagen. Last updated 5 months ago.
dimensionreduction featureextraction visualization clustering microarray
2.0 match 4.88 score 54 scripts
bioc
scBFA:A dimensionality reduction tool using gene detection pattern to mitigate noisy expression profile of scRNA-seq
This package is designed to model the gene detection pattern of scRNA-seq data through a binary factor analysis model. The model allows the user to pass in a cell-level covariate matrix X and a gene-level covariate matrix Q to account for nuisance variance (e.g. batch effects), and it outputs a low-dimensional embedding matrix for downstream analysis.
Maintained by Ruoxin Li. Last updated 5 months ago.
singlecell transcriptomics dimensionreduction geneexpression atacseq batcheffect kegg qualitycontrol
2.3 match 4.30 score 4 scripts
core-bioinformatics
ClustAssess:Tools for Assessing Clustering
A set of tools for evaluating clustering robustness using proportion of ambiguously clustered pairs (Senbabaoglu et al. (2014) <doi:10.1038/srep06207>), as well as similarity across methods and method stability using element-centric clustering comparison (Gates et al. (2019) <doi:10.1038/s41598-019-44892-y>). Additionally, this package enables stability-based parameter assessment for graph-based clustering pipelines typical in single-cell data analysis.
Maintained by Andi Munteanu. Last updated 1 months ago.
software singlecell rnaseq atacseq normalization preprocessing dimensionreduction visualization qualitycontrol clustering classification annotation geneexpression differentialexpression bioinformatics genomics machine-learning parameter-optimization robustness single-cell unsupervised-learning cpp
1.7 match 23 stars 5.70 score 18 scripts
jsandube
DChaos:Chaotic Time Series Analysis
Chaos theory has been hailed as a revolution in thought and is attracting ever-increasing attention from scientists in diverse disciplines. Chaotic systems are nonlinear deterministic dynamic systems which can exhibit erratic and apparently random motion. A relevant field within chaos theory and nonlinear time series analysis is the detection of chaotic behaviour from empirical time series data. One of the main features of chaos is the well-known property of sensitivity to initial values. Methods for testing the hypothesis of chaos try to quantify this sensitivity by estimating the Lyapunov exponents. The DChaos package provides useful tools and efficient algorithms which robustly test the hypothesis of chaos based on the Lyapunov exponent, in order to determine whether the data-generating process behind a time series behaves chaotically.
Maintained by Julio E. Sandubete. Last updated 2 years ago.
4.8 match 1 star 2.00 score 9 scripts
mlverse
cuda.ml:R Interface for the RAPIDS cuML Suite of Libraries
R interface for RAPIDS cuML (<https://github.com/rapidsai/cuml>), a suite of GPU-accelerated machine learning libraries powered by CUDA (<https://en.wikipedia.org/wiki/CUDA>).
Maintained by Daniel Falbel. Last updated 3 years ago.
1.8 match 33 stars 5.27 score 57 scripts
nbarrowman
vtree:Display Information About Nested Subsets of a Data Frame
A tool for calculating and drawing "variable trees". Variable trees display information about nested subsets of a data frame.
Maintained by Nick Barrowman. Last updated 13 hours ago.
data-science data-visualization exploratory-data-analysis statistics
1.3 match 76 stars 7.09 score 65 scripts
carlganz
docuSignr:Connect to 'DocuSign' API
Connect to the 'DocuSign' REST API <https://www.docusign.com/p/RESTAPIGuide/RESTAPIGuide.htm>, which supports embedded signing and sending of documents.
Maintained by Carl Ganz. Last updated 6 years ago.
2.8 match 5 stars 3.40 score 10 scripts
bioc
iSEEde:iSEE extension for panels related to differential expression analysis
This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.
Maintained by Kevin Rue-Albrecht. Last updated 4 months ago.
software infrastructure differentialexpression bioconductor hacktoberfest iseeu
1.8 match 1 star 5.38 score 15 scripts
byzheng
rtiddlywiki:R Interface for 'TiddlyWiki'
'TiddlyWiki' is a unique non-linear notebook for capturing, organising and sharing complex information. 'rtiddlywiki' is an R interface to 'TiddlyWiki' <https://tiddlywiki.com> that creates a new tiddler from an R Markdown file and then puts it into a local 'TiddlyWiki' node.js server if one is available.
Maintained by Bangyou Zheng. Last updated 12 days ago.
1.7 match 3 stars 5.45 score 7 scripts
juba
robservable:Import an Observable Notebook as HTML Widget
Allows loading and displaying an Observable notebook (online JavaScript notebooks powered by <https://observablehq.com>) as an HTML Widget in an R session, 'shiny' application or 'rmarkdown' document.
Maintained by Julien Barnier. Last updated 7 months ago.
1.3 match 165 stars 7.00 score 40 scripts