R-universe search: embedded

kisungyou

Rdimtools:Dimension Reduction and Estimation Methods

We provide linear and nonlinear dimension reduction techniques. Intrinsic dimension estimation methods for exploratory analysis are also provided. For more details on the package, see the paper by You and Shung (2022) <doi:10.1016/j.simpa.2022.100414>.

Maintained by Kisung You. Last updated 2 years ago.

dimension-estimation dimension-reduction manifold-learning subspace-learning openblas cpp openmp

44.0 match 52 stars 8.37 score 186 scripts 8 dependents

prodriguezsosa

conText:'a la Carte' on Text (ConText) Embedding Regression

A fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) <arXiv:1805.05388> and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021)<https://github.com/prodriguezsosa/EmbeddingRegression>.

Maintained by Pedro L. Rodriguez. Last updated 11 months ago.

37.9 match 104 stars 9.40 score 1.7k scripts

oscarkjell

text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning

Link R with Transformers from Hugging Face to transform text variables to word embeddings; where the word embeddings are used to statistically test the mean difference between set of texts, compute semantic similarity scores between texts, predict numerical variables, and visual statistically significant words according to various dimensions etc. For more information see <https://www.r-text.org>.

Maintained by Oscar Kjell. Last updated 3 days ago.

deep-learning machine-learning nlp transformers openjdk

27.0 match 146 stars 13.16 score 436 scripts 1 dependents

jonnob

rsetse:Strain Elevation Tension Spring Embedding

An R implementation for the Strain Elevation and Tension embedding algorithm from Bourne (2020) <doi:10.1007/s41109-020-00329-4>. The package embeds graphs and networks using the Strain Elevation and Tension embedding (SETSe) algorithm. SETSe represents the network as a physical system, where edges are elastic, and nodes exert a force either up or down based on node features. SETSe positions the nodes vertically such that the tension in the edges of a node is equal and opposite to the force it exerts for all nodes in the network. The resultant structure can then be analysed by looking at the node elevation and the edge strain and tension. This algorithm works on weighted and unweighted networks as well as networks with or without explicit node features. Edge elasticity can be created from existing edge weights or kept as a constant.

Maintained by Jonathan Bourne. Last updated 3 years ago.

embedding embedding-graphs graph-embedding igraph networks networkscience unsupervised-learning cpp openmp

57.1 match 7 stars 4.85 score 8 scripts

dselivanov

text2vec:Modern Text Mining Framework for R

Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), similarities. This package provides a source-agnostic streaming API, which allows researchers to perform analysis of collections of documents which are larger than available RAM. All core functions are parallelized to benefit from multicore machines.

Maintained by Dmitriy Selivanov. Last updated 7 months ago.

glove latent-dirichlet-allocation natural-language-processing text-mining topic-modeling vectorization word-embeddings word2vec cpp

14.6 match 860 stars 13.48 score 1.3k scripts 23 dependents

mlverse

torch:Tensors and Neural Networks with 'GPU' Acceleration

Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.

Maintained by Daniel Falbel. Last updated 5 days ago.

autograd deep-learning torch cpp

11.3 match 520 stars 16.52 score 1.4k scripts 38 dependents

bnosac

ruimtehol:Learn Text 'Embeddings' with 'Starspace'

Wraps the 'StarSpace' library <https://github.com/facebookresearch/StarSpace> allowing users to calculate word, sentence, article, document, webpage, link and entity 'embeddings'. By using the 'embeddings', you can perform text based multi-label classification, find similarities between texts and categories, do collaborative-filtering based recommendation as well as content-based recommendation, find out relations between entities, calculate graph 'embeddings' as well as perform semi-supervised learning and multi-task learning on plain text. The techniques are explained in detail in the paper: 'StarSpace: Embed All The Things!' by Wu et al. (2017), available at <arXiv:1709.03856>.

Maintained by Jan Wijffels. Last updated 1 years ago.

classification embeddings natural-language-processing nlp similarity starspace text-mining cpp

21.1 match 101 stars 6.65 score 44 scripts

exaexa

EmbedSOM:Fast Embedding Guided by Self-Organizing Map

Provides a smooth mapping of multidimensional points into low-dimensional space defined by a self-organizing map. Designed to work with 'FlowSOM' and flow-cytometry use-cases. See Kratochvil et al. (2019) <doi:10.12688/f1000research.21642.1>.

Maintained by Mirek Kratochvil. Last updated 1 months ago.

cytometry som visualization cpp

22.0 match 26 stars 6.02 score 8 scripts

neurodata

lolR:Linear Optimal Low-Rank Projection

Supervised learning techniques designed for the situation when the dimensionality exceeds the sample size have a tendency to overfit as the dimensionality of the data increases. To remedy this High dimensionality; low sample size (HDLSS) situation, we attempt to learn a lower-dimensional representation of the data before learning a classifier. That is, we project the data to a situation where the dimensionality is more manageable, and then are able to better apply standard classification or clustering techniques since we will have fewer dimensions to overfit. A number of previous works have focused on how to strategically reduce dimensionality in the unsupervised case, yet in the supervised HDLSS regime, few works have attempted to devise dimensionality reduction techniques that leverage the labels associated with the data. In this package and the associated manuscript Vogelstein et al. (2017) <arXiv:1709.01233>, we provide several methods for feature extraction, some utilizing labels and some not, along with easily extensible utilities to simplify cross-validative efforts to identify the best feature extraction method. Additionally, we include a series of adaptable benchmark simulations to serve as a standard for future investigative efforts into supervised HDLSS. Finally, we produce a comprehensive comparison of the included algorithms across a range of benchmark simulations and real data applications.

Maintained by Eric Bridgeford. Last updated 4 years ago.

16.6 match 20 stars 7.28 score 80 scripts

edubruell

tidyllm:Tidy Integration of Large Language Models

A tidy interface for integrating large language model (LLM) APIs such as 'Claude', 'Openai', 'Groq','Mistral' and local models via 'Ollama' into R workflows. The package supports text and media-based interactions, interactive message history, batch request APIs, and a tidy, pipeline-oriented interface for streamlined integration into data workflows. Web services are available at <https://www.anthropic.com>, <https://openai.com>, <https://groq.com>, <https://mistral.ai/> and <https://ollama.com>.

Maintained by Eduard Brüll. Last updated 4 days ago.

14.6 match 68 stars 7.82 score 26 scripts

satijalab

Seurat:Tools for Single Cell Genomics

A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.

Maintained by Paul Hoffman. Last updated 1 years ago.

human-cell-atlas single-cell-genomics single-cell-rna-seq cpp

6.6 match 2.4k stars 16.86 score 50k scripts 73 dependents

sa-lee

liminal:Multivariate Data Visualization with Tours and Embeddings

Compose interactive visualisations designed for exploratory high-dimensional data analysis. With 'liminal' you can create linked interactive graphics to diagnose the quality of a dimension reduction technique and explore the global structure of a dataset with a tour. A complete description of the method is discussed in ['Lee' & 'Laa' & 'Cook' (2020) <arXiv:2012.06077>].

Maintained by Stuart Lee. Last updated 4 years ago.

embedding-algorithms interactive-visualizations t-sne tour

19.2 match 5 stars 5.31 score 41 scripts

eddelbuettel

littler:R at the Command-Line via 'r'

A scripting and command-line front-end is provided by 'r' (aka 'littler') as a lightweight binary wrapper around the GNU R language and environment for statistical computing and graphics. While R can be used in batch mode, the r binary adds full support for both 'shebang'-style scripting (i.e. using a hash-mark-exclamation-path expression as the first line in scripts) as well as command-line use in standard Unix pipelines. In other words, r provides the R language without the environment.

Maintained by Dirk Eddelbuettel. Last updated 1 months ago.

embedded examples littler

10.0 match 314 stars 9.49 score 17 scripts

bioc

singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data

The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users the ability access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.

Maintained by Joshua David Campbell. Last updated 23 days ago.

singlecell geneexpression differentialexpression alignment clustering immunooncology batcheffect normalization qualitycontrol dataimport gui

9.1 match 181 stars 10.16 score 252 scripts

jayanilakshika

cardinalR:Collection of Data Structures

A collection of simple simulation datasets designed for generating Nonlinear Dimension Reduction representations techniques such as t-distributed Stochastic Neighbor Embedding, and Uniform Manifold Approximation and Projection. These datasets serve as a valuable resource for understanding the reliability of Nonlinear Dimension Reduction representations in various contexts.

Maintained by Jayani P.G. Lakshika. Last updated 11 days ago.

19.9 match 4.54 score

eddelbuettel

RInside:C++ Classes to Embed R in C++ (and C) Applications

C++ classes to embed R in C++ (and C) applications A C++ class providing the R interpreter is offered by this package making it easier to have "R inside" your C++ application. As R itself is embedded into your application, a shared library build of R is required. This works on Linux, OS X and even on Windows provided you use the same tools used to build R itself. Numerous examples are provided in the nine subdirectories of the examples/ directory of the installed package: standard, 'mpi' (for parallel computing), 'qt' (showing how to embed 'RInside' inside a Qt GUI application), 'wt' (showing how to build a "web-application" using the Wt toolkit), 'armadillo' (for 'RInside' use with 'RcppArmadillo'), 'eigen' (for 'RInside' use with 'RcppEigen'), and 'c_interface' for a basic C interface and 'Ruby' illustration. The examples use 'GNUmakefile(s)' with GNU extensions, so a GNU make is required (and will use the 'GNUmakefile' automatically). 'Doxygen'-generated documentation of the C++ classes is available at the 'RInside' website as well.

Maintained by Dirk Eddelbuettel. Last updated 5 months ago.

c-plus-plus embedded cpp

12.4 match 136 stars 7.17 score 17 scripts 1 dependents

immunogenomics

harmony:Fast, Sensitive, and Accurate Integration of Single Cell Data

Implementation of the Harmony algorithm for single cell integration, described in Korsunsky et al <doi:10.1038/s41592-019-0619-0>. Package includes a standalone Harmony function and interfaces to external frameworks.

Maintained by Ilya Korsunsky. Last updated 4 months ago.

algorithm data-integration scrna-seq openblas cpp

6.3 match 554 stars 13.74 score 5.5k scripts 8 dependents

bnosac

word2vec:Distributed Representations of Words

Learn vector representations of words by continuous bag of words and skip-gram implementations of the 'word2vec' algorithm. The techniques are detailed in the paper "Distributed Representations of Words and Phrases and their Compositionality" by Mikolov et al. (2013), available at <arXiv:1310.4546>.

Maintained by Jan Wijffels. Last updated 1 years ago.

embeddings natural-language-processing word2vec cpp

10.0 match 70 stars 8.36 score 227 scripts 6 dependents

hauselin

ollamar:'Ollama' Language Models

An interface to easily run local language models with 'Ollama' <https://ollama.com> server and API endpoints (see <https://github.com/ollama/ollama/blob/main/docs/api.md> for details). It lets you run open-source large language models locally on your machine.

Maintained by Hause Lin. Last updated 2 months ago.

ai api llm llms ollama ollama-api

8.7 match 84 stars 9.36 score 74 scripts 5 dependents

bioc

lemur:Latent Embedding Multivariate Regression

Fit a latent embedding multivariate regression (LEMUR) model to multi-condition single-cell data. The model provides a parametric description of single-cell data measured with treatment vs. control or more complex experimental designs. The parametric model is used to (1) align conditions, (2) predict log fold changes between conditions for all cells, and (3) identify cell neighborhoods with consistent log fold changes. For those neighborhoods, a pseudobulked differential expression test is conducted to assess which genes are significantly changed.

Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.

transcriptomics differentialexpression singlecell dimensionreduction regression openblas cpp

10.2 match 87 stars 7.80 score 81 scripts

igraph

igraph:Network Analysis and Visualization

Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.

Maintained by Kirill Müller. Last updated 2 days ago.

complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp

3.8 match 581 stars 21.10 score 31k scripts 1.9k dependents

psychbruce

PsychWordVec:Word Embedding Research Framework for Psychological Science

An integrative toolbox of word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a series of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation test of significance; (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <arXiv:1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <arXiv:1607.04606>; (5) a group of functions to download 'pre-trained' language models (e.g., 'GPT', 'BERT') and extract contextualized (dynamic) word vectors (based on the R package 'text').

Maintained by Han-Wu-Shuang Bao. Last updated 1 years ago.

19.5 match 22 stars 4.04 score 10 scripts

jayanilakshika

quollr:Visualising How Nonlinear Dimension Reduction Warps Your Data

To construct a model in 2D space from 2D embedding data and then lift it to the high-dimensional space. Additionally, it provides tools to visualize the model in 2D space and to overlay the fitted model on data using the tour technique. Furthermore, it facilitates the generation of summaries of high-dimensional distributions.

Maintained by Jayani P.G. Lakshika. Last updated 1 months ago.

12.7 match 3 stars 6.18 score 7 scripts

ropensci

pkgmatch:Find R Packages Matching Either Descriptions or Other R Packages

Find R packages matching either descriptions or other R packages.

Maintained by Mark Padgham. Last updated 1 months ago.

embeddings llms natural-language-processing cpp

14.9 match 3 stars 5.23 score

xiaozhangryy

CAESAR.Suite:CAESAR: a Cross-Technology and Cross-Resolution Framework for Spatial Omics Annotation

Biotechnology in spatial omics has advanced rapidly over the past few years, enhancing both throughput and resolution. However, existing annotation pipelines in spatial omics predominantly rely on clustering methods, lacking the flexibility to integrate extensive annotated information from single-cell RNA sequencing (scRNA-seq) due to discrepancies in spatial resolutions, species, or modalities. Here we introduce the CAESAR suite, an open-source software package that provides image-based spatial co-embedding of locations and genomic features. It uniquely transfers labels from scRNA-seq reference, enabling the annotation of spatial omics datasets across different technologies, resolutions, species, and modalities, based on the conserved relationship between signature genes and cells/locations at an appropriate level of granularity. Notably, CAESAR enriches location-level pathways, allowing for the detection of gradual biological pathway activation within spatially defined domain types.

Maintained by Xiao Zhang. Last updated 6 months ago.

openblas cpp openmp

18.2 match 1 stars 3.95 score 2 scripts

tommyjones

textmineR:Functions for Text Mining and Topic Modeling

An aid for text mining in R, with a syntax that should be familiar to experienced R users. Provides a wrapper for several topic models that take similarly-formatted input and give similarly-formatted output. Has additional functionality for analyzing and diagnostics for topic models.

Maintained by Tommy Jones. Last updated 2 years ago.

cpp

6.6 match 106 stars 10.83 score 310 scripts 7 dependents

jkrijthe

Rtsne:T-Distributed Stochastic Neighbor Embedding using a Barnes-Hut Implementation

An R wrapper around the fast T-distributed Stochastic Neighbor Embedding implementation by Van der Maaten (see <https://github.com/lvdmaaten/bhtsne/> for more information on the original implementation).

Maintained by Jesse Krijthe. Last updated 9 months ago.

openblas cpp openmp

5.0 match 256 stars 13.95 score 4.4k scripts 231 dependents

feiyoung

ProFAST:Probabilistic Factor Analysis for Spatially-Aware Dimension Reduction

Probabilistic factor analysis for spatially-aware dimension reduction across multi-section spatial transcriptomics data with millions of spatial locations. More details can be referred to Wei Liu, et al. (2023) <doi:10.1101/2023.07.11.548486>.

Maintained by Wei Liu. Last updated 1 months ago.

openblas cpp

11.7 match 2 stars 5.86 score 12 scripts 1 dependents

kharchenkolab

sccore:Core Utilities for Single-Cell RNA-Seq

Core utilities for single-cell RNA-seq data analysis. Contained within are utility functions for working with differential expression (DE) matrices and count matrices, a collection of functions for manipulating and plotting data via 'ggplot2', and functions to work with cell graphs and cell embeddings. Graph-based methods include embedding kNN cell graphs into a UMAP <doi:10.21105/joss.00861>, collapsing vertices of each cluster in the graph, and propagating graph labels.

Maintained by Evan Biederstedt. Last updated 1 years ago.

cpp

10.4 match 12 stars 6.44 score 36 scripts 9 dependents

satijalab

SeuratObject:Data Structures for Single Cell Data

Defines S4 classes for single-cell genomic data and associated information, such as dimensionality reduction embeddings, nearest-neighbor graphs, and spatially-resolved coordinates. Provides data access methods and R-native hooks to ensure the Seurat object is familiar to other R users. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, and Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031> for more details.

Maintained by Paul Hoffman. Last updated 1 years ago.

cpp

5.5 match 25 stars 11.69 score 1.2k scripts 88 dependents

bnosac

doc2vec:Distributed Representations of Sentences, Documents and Topics

Learn vector representations of sentences, paragraphs or documents by using the 'Paragraph Vector' algorithms, namely the distributed bag of words ('PV-DBOW') and the distributed memory ('PV-DM') model. The techniques in the package are detailed in the paper "Distributed Representations of Sentences and Documents" by Mikolov et al. (2014), available at <arXiv:1405.4053>. The package also provides an implementation to cluster documents based on these embedding using a technique called top2vec. Top2vec finds clusters in text documents by combining techniques to embed documents and words and density-based clustering. It does this by embedding documents in the semantic space as defined by the 'doc2vec' algorithm. Next it maps these document embeddings to a lower-dimensional space using the 'Uniform Manifold Approximation and Projection' (UMAP) clustering algorithm and finds dense areas in that space using a 'Hierarchical Density-Based Clustering' technique (HDBSCAN). These dense areas are the topic clusters which can be represented by the corresponding topic vector which is an aggregate of the document embeddings of the documents which are part of that topic cluster. In the same semantic space similar words can be found which are representative of the topic. More details can be found in the paper 'Top2Vec: Distributed Representations of Topics' by D. Angelov available at <arXiv:2008.09470>.

Maintained by Jan Wijffels. Last updated 3 years ago.

doc2vec embeddings natural-language-processing paragraph2vec word2vec cpp

11.0 match 48 stars 5.74 score 23 scripts

softwareliteracy

rEDM:Empirical Dynamic Modeling ('EDM')

An implementation of 'EDM' algorithms based on research software developed for internal use at the Sugihara Lab ('UCSD/SIO'). The package is implemented with 'Rcpp' wrappers around the 'cppEDM' library. It implements the 'simplex' projection method from Sugihara & May (1990) <doi:10.1038/344734a0>, the 'S-map' algorithm from Sugihara (1994) <doi:10.1098/rsta.1994.0106>, convergent cross mapping described in Sugihara et al. (2012) <doi:10.1126/science.1227079>, and, 'multiview embedding' described in Ye & Sugihara (2016) <doi:10.1126/science.aag0863>.

Maintained by Joseph Park. Last updated 11 months ago.

openblas cpp

10.3 match 2 stars 6.05 score 319 scripts 1 dependents

eagerai

fastai:Interface to 'fastai'

The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research in to deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.

Maintained by Turgut Abdullayev. Last updated 11 months ago.

audio collaborative-filtering darknet darknet-image-classification fastai medical object-detection tabular text vision

6.0 match 118 stars 9.40 score 76 scripts

fberding

aifeducation:Artificial Intelligence for Education

In social and educational settings, the use of Artificial Intelligence (AI) is a challenging task. Relevant data is often only available in handwritten forms, or the use of data is restricted by privacy policies. This often leads to small data sets. Furthermore, in the educational and social sciences, data is often unbalanced in terms of frequencies. To support educators as well as educational and social researchers in using the potentials of AI for their work, this package provides a unified interface for neural nets in 'PyTorch' to deal with natural language problems. In addition, the package ships with a shiny app, providing a graphical user interface. This allows the usage of AI for people without skills in writing python/R scripts. The tools integrate existing mathematical and statistical methods for dealing with small data sets via pseudo-labeling (e.g. Cascante-Bonilla et al. (2020) <doi:10.48550/arXiv.2001.06001>) and imbalanced data via the creation of synthetic cases (e.g. Bunkhumpornpat et al. (2012) <doi:10.1007/s10489-011-0287-y>). Performance evaluation of AI is connected to measures from content analysis which educational and social researchers are generally more familiar with (e.g. Berding & Pargmann (2022) <doi:10.30819/5581>, Gwet (2014) <ISBN:978-0-9708062-8-4>, Krippendorff (2019) <doi:10.4135/9781071878781>). Estimation of energy consumption and CO2 emissions during model training is done with the 'python' library 'codecarbon'. Finally, all objects created with this package allow to share trained AI models with other people.

Maintained by Berding Florian. Last updated 1 months ago.

cpp

11.3 match 4.48 score 8 scripts

jbgruber

rollama:Communicate with 'Ollama' to Run Large Language Models Locally

Wraps the 'Ollama' <https://ollama.com> API, which can be used to communicate with generative large language models locally.

Maintained by Johannes B. Gruber. Last updated 1 months ago.

6.0 match 110 stars 8.36 score 52 scripts

chaoranhu

smam:Statistical Modeling of Animal Movements

Animal movement models including Moving-Resting Process with Embedded Brownian Motion (Yan et al., 2014, <doi:10.1007/s10144-013-0428-8>; Pozdnyakov et al., 2017, <doi:10.1007/s11009-017-9547-6>), Brownian Motion with Measurement Error (Pozdnyakov et al., 2014, <doi:10.1890/13-0532.1>), Moving-Resting-Handling Process with Embedded Brownian Motion (Pozdnyakov et al., 2020, <doi:10.1007/s11009-020-09774-1>), Moving-Resting Process with Measurement Error (Hu et al., 2021, <doi:10.1111/2041-210X.13694>), Moving-Moving Process with two Embedded Brownian Motions.

Maintained by Chaoran Hu. Last updated 1 years ago.

animal-movement brownian-motion hidden-markov-model hidden-states measurement-error telegraph-process gsl cpp

11.1 match 3 stars 4.52 score 11 scripts

sokbae

sketching:Sketching of Data via Random Subspace Embeddings

Construct sketches of data via random subspace embeddings. For more details, see the following papers. Lee, S. and Ng, S. (2022). "Least Squares Estimation Using Sketched Data with Heteroskedastic Errors," Proceedings of the 39th International Conference on Machine Learning (ICML22), 162:12498-12520. Lee, S. and Ng, S. (2020). "An Econometric Perspective on Algorithmic Subsampling," Annual Review of Economics, 12(1): 45–80.

Maintained by Sokbae Lee. Last updated 3 years ago.

heteroskedasticity regression subspace-embedding cpp

10.9 match 7 stars 4.54 score 7 scripts

jlmelville

uwot:The Uniform Manifold Approximation and Projection (UMAP) Method for Dimensionality Reduction

An implementation of the Uniform Manifold Approximation and Projection dimensionality reduction by McInnes et al. (2018) <doi:10.48550/arXiv.1802.03426>. It also provides means to transform new data and to carry out supervised dimensionality reduction. An implementation of the related LargeVis method of Tang et al. (2016) <doi:10.48550/arXiv.1602.00370> is also provided. This is a complete re-implementation in R (and C++, via the 'Rcpp' package): no Python installation is required. See the uwot website (<https://github.com/jlmelville/uwot>) for more documentation and examples.

Maintained by James Melville. Last updated 19 days ago.

dimensionality-reduction umap cpp

3.1 match 328 stars 15.74 score 2.0k scripts 140 dependents

dustinstoltz

text2map:R Tools for Text Matrices, Embeddings, and Networks

This is a collection of functions optimized for working with with various kinds of text matrices. Focusing on the text matrix as the primary object - represented either as a base R dense matrix or a 'Matrix' package sparse matrix - allows for a consistent and intuitive interface that stays close to the underlying mathematical foundation of computational text analysis. In particular, the package includes functions for working with word embeddings, text networks, and document-term matrices. Methods developed in Stoltz and Taylor (2019) <doi:10.1007/s42001-019-00048-6>, Taylor and Stoltz (2020) <doi:10.1007/s42001-020-00075-8>, Taylor and Stoltz (2020) <doi:10.15195/v7.a23>, and Stoltz and Taylor (2021) <doi:10.1016/j.poetic.2021.101567>.

Maintained by Dustin Stoltz. Last updated 3 months ago.

12.8 match 3.82 score 22 scripts

wadpac

GGIR:Raw Accelerometer Data Analysis

A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data file from any other sensor brand providing that the data is stored in csv format. Also the package allows for external function embedding.

Maintained by Vincent T van Hees. Last updated 2 days ago.

accelerometer activity-recognition circadian-rhythm movement-sensor sleep

3.6 match 109 stars 13.20 score 342 scripts 3 dependents

tkonopka

umap:Uniform Manifold Approximation and Projection

Uniform manifold approximation and projection is a technique for dimension reduction. The algorithm was described by McInnes and Healy (2018) in <arXiv:1802.03426>. This package provides an interface for two implementations. One is written from scratch, including components for nearest-neighbor search and for embedding. The second implementation is a wrapper for 'python' package 'umap-learn' (requires separate installation, see vignette for more details).

Maintained by Tomasz Konopka. Last updated 11 months ago.

dimensionality-reduction umap cpp

3.7 match 132 stars 12.74 score 3.6k scripts 43 dependents

jeroen

V8:Embedded JavaScript and WebAssembly Engine for R

An R interface to V8 <https://v8.dev>: Google's open source JavaScript and WebAssembly engine. This package can be compiled either with V8 version 6 and up or NodeJS when built as a shared library.

Maintained by Jeroen Ooms. Last updated 1 days ago.

javascript libv8 wasm nodejs cpp

3.0 match 201 stars 15.80 score 508 scripts 336 dependents

trelliscope

trelliscope:Create Interactive Multi-Panel Displays

Trelliscope enables interactive exploration of data frames of visualizations.

Maintained by Ryan Hafen. Last updated 7 months ago.

visualization

7.2 match 29 stars 6.43 score 117 scripts

constantino-garcia

nonlinearTseries:Nonlinear Time Series Analysis

Functions for nonlinear time series analysis. This package permits the computation of the most-used nonlinear statistics/algorithms including generalized correlation dimension, information dimension, largest Lyapunov exponent, sample entropy and Recurrence Quantification Analysis (RQA), among others. Basic routines for surrogate data testing are also included. Part of this work was based on the book "Nonlinear time series analysis" by Holger Kantz and Thomas Schreiber (ISBN: 9780521529020).

Maintained by Constantino A. Garcia. Last updated 6 months ago.

chaos chaotic-systems nonlinear-dynamics nonlinear-time-series time-series openblas cpp

5.0 match 35 stars 8.98 score 123 scripts 7 dependents

bnosac

udpipe:Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser. Namely functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns) sentiment scoring and semantic similarity analysis.

Maintained by Jan Wijffels. Last updated 2 years ago.

conll dependency-parser lemmatization natural-language-processing nlp pos-tagging r-pkg rcpp text-mining tokenizer udpipe cpp

3.8 match 215 stars 11.83 score 1.2k scripts 9 dependents

ijlyttle

vembedr:Embed Video in HTML

A set of functions for generating HTML to embed hosted video in your R Markdown documents or Shiny applications.

Maintained by Ian Lyttle. Last updated 3 years ago.

box embed-videos rmarkdown shiny vimeo youtube

5.6 match 58 stars 7.83 score 520 scripts

bioc

StabMap:Stabilised mosaic single cell data integration using unshared features

StabMap performs single cell mosaic data integration by first building a mosaic data topology, and for each reference dataset, traverses the topology to project and predict data onto a common embedding. Mosaic data should be provided in a list format, with all relevant features included in the data matrices within each list object. The output of stabMap is a joint low-dimensional embedding taking into account all available relevant features. Expression imputation can also be performed using the StabMap embedding and any of the original data matrices for given reference and query cell lists.

Maintained by Shila Ghazanfar. Last updated 5 months ago.

singlecell dimensionreduction software

7.4 match 5.95 score 60 scripts

samuel-marsh

scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing

Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to provide 1) Customized visualizations for aid in ease of use and to create more aesthetic and functional visuals. 2) Improve speed/reproducibility of common tasks/pieces of code in scRNA-seq analysis with a single or group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.

Maintained by Samuel Marsh. Last updated 3 months ago.

customization ggplot2 scrna-seq seurat single-cell single-cell-genomics single-cell-rna-seq visualization

5.0 match 242 stars 8.75 score 1.1k scripts

tidymodels

recipes:Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps provides preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.

Maintained by Max Kuhn. Last updated 5 days ago.

2.3 match 584 stars 18.71 score 7.2k scripts 380 dependents

rstudio

blastula:Easily Send HTML Email Messages

Compose and send out responsive HTML email messages that render perfectly across a range of email clients and device sizes. Helper functions let the user insert embedded images, web link buttons, and 'ggplot2' plot objects into the message body. Messages can be sent through an 'SMTP' server, through the 'Posit Connect' service, or through the 'Mailgun' API service <https://www.mailgun.com/>.

Maintained by Richard Iannone. Last updated 8 months ago.

easy-to-use email html markdown responsive-email smtp

4.1 match 552 stars 10.27 score 348 scripts 5 dependents

bnosac

textplot:Text Plots

Visualise complex relations in texts. This is done by providing functionalities for displaying text co-occurrence networks, text correlation networks, dependency relationships as well as text clustering and semantic text 'embeddings'. Feel free to join the effort of providing interesting text visualisations.

Maintained by Jan Wijffels. Last updated 3 years ago.

6.1 match 54 stars 6.78 score 75 scripts 1 dependents

stscl

spEDM:Spatial Empirical Dynamic Modeling

Inferring causal associations in cross-sectional earth system data through empirical dynamic modeling (EDM), with extensions to convergent cross mapping from Sugihara et al. (2012) <doi:10.1126/science.1227079>, partial cross mapping as outlined in Leng et al. (2020) <doi:10.1038/s41467-020-16238-0>, and cross mapping cardinality as described in Tao et al. (2023)<doi:10.1016/j.fmre.2023.01.007>.

Maintained by Wenbo Lv. Last updated 3 hours ago.

causal-inference cpp empirical-dynamic-modeling geoinformatics geospatial-causality spatial-statistics openblas cpp openmp

6.8 match 17 stars 6.11 score 2 scripts

jaytimm

textpress:A Lightweight and Versatile NLP Toolkit

A simple Natural Language Processing (NLP) toolkit focused on search-centric workflows with minimal dependencies. The package offers key features for web scraping, text processing, corpus search, and text embedding generation via the 'HuggingFace API' <https://huggingface.co/docs/api-inference/index>.

Maintained by Jason Timm. Last updated 5 months ago.

corpus-search nlp openai-embeddings web-scraping

9.8 match 3 stars 4.18 score

bioc

scDataviz:scDataviz: single cell dataviz and downstream analyses

In the single cell World, which includes flow cytometry, mass cytometry, single-cell RNA-seq (scRNA-seq), and others, there is a need to improve data visualisation and to bring analysis capabilities to researchers even from non-technical backgrounds. scDataviz attempts to fit into this space, while also catering for advanced users. Additonally, due to the way that scDataviz is designed, which is based on SingleCellExperiment, it has a 'plug and play' feel, and immediately lends itself as flexibile and compatibile with studies that go beyond scDataviz. Finally, the graphics in scDataviz are generated via the ggplot engine, which means that users can 'add on' features to these with ease.

Maintained by Kevin Blighe. Last updated 5 months ago.

singlecell immunooncology rnaseq geneexpression transcription flowcytometry massspectrometry dataimport

6.4 match 63 stars 6.30 score 16 scripts

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 4 days ago.

2.9 match 845 stars 13.57 score 264 scripts 2 dependents

gesistsa

sweater:Speedy Word Embedding Association Test and Extras Using R

Conduct various tests for evaluating implicit biases in word embeddings: Word Embedding Association Test (Caliskan et al., 2017), <doi:10.1126/science.aal4230>, Relative Norm Distance (Garg et al., 2018), <doi:10.1073/pnas.1720347115>, Mean Average Cosine Similarity (Mazini et al., 2019) <arXiv:1904.04047>, SemAxis (An et al., 2018) <arXiv:1806.05521>, Relative Negative Sentiment Bias (Sweeney & Najafian, 2019) <doi:10.18653/v1/P19-1162>, and Embedding Coherence Test (Dev & Phillips, 2019) <arXiv:1901.07656>.

Maintained by Chung-hong Chan. Last updated 1 months ago.

bias-detection textanalysis wordembedding cpp

7.5 match 30 stars 4.80 score 14 scripts

rstudio

tfdatasets:Interface to 'TensorFlow' Datasets

Interface to 'TensorFlow' Datasets, a high-level library for building complex input pipelines from simple, re-usable pieces. See <https://www.tensorflow.org/guide> for additional details.

Maintained by Tomasz Kalinowski. Last updated 4 days ago.

3.8 match 34 stars 9.32 score 656 scripts 3 dependents

mkearney

wactor:Word Factor Vectors

A user-friendly factor-like interface for converting strings of text into numeric vectors and rectangular data structures.

Maintained by Michael W. Kearney. Last updated 5 years ago.

text text-classification text-processing text-vectorization word-embeddings word-vectors word2vec

7.5 match 33 stars 4.52 score 3 scripts

bioc

corral:Correspondence Analysis for Single Cell Data

Correspondence analysis (CA) is a matrix factorization method, and is similar to principal components analysis (PCA). Whereas PCA is designed for application to continuous, approximately normally distributed data, CA is appropriate for non-negative, count-based data that are in the same additive scale. The corral package implements CA for dimensionality reduction of a single matrix of single-cell data, as well as a multi-table adaptation of CA that leverages data-optimized scaling to align data generated from different sequencing platforms by projecting into a shared latent space. corral utilizes sparse matrices and a fast implementation of SVD, and can be called directly on Bioconductor objects (e.g., SingleCellExperiment) for easy pipeline integration. The package also includes additional options, including variations of CA to address overdispersion in count data (e.g., Freeman-Tukey chi-squared residual), as well as the option to apply CA-style processing to continuous data (e.g., proteomic TOF intensities) with the Hellinger distance adaptation of CA.

Maintained by Lauren Hsu. Last updated 5 months ago.

batcheffect dimensionreduction geneexpression preprocessing principalcomponent sequencing singlecell software visualization

7.2 match 4.64 score 22 scripts

aphalo

gginnards:Explore the Innards of 'ggplot2' Objects

Extensions to 'ggplot2' providing low-level debug tools: statistics and geometries echoing their data argument. Layer manipulation: deletion, insertion, extraction and reordering of layers. Deletion of unused variables from the data object embedded in "ggplot" objects.

Maintained by Pedro J. Aphalo. Last updated 3 months ago.

dataviz debugging ggplot2-enhancementes ggplot2-layer-manipulation inspection

3.6 match 25 stars 9.03 score 378 scripts 3 dependents

cran

preregr:Specify (Pre)Registrations and Export Them Human- And Machine-Readably

Preregistrations, or more generally, registrations, enable explicit timestamped and (often but not necessarily publicly) frozen documentation of plans and expectations as well as decisions and justifications. In research, preregistrations are commonly used to clearly document plans and facilitate justifications of deviations from those plans, as well as decreasing the effects of publication bias by enabling identification of research that was conducted but not published. Like reporting guidelines, (pre)registration forms often have specific structures that facilitate systematic reporting of important items. The 'preregr' package facilitates specifying (pre)registrations in R and exporting them to a human-readable format (using R Markdown partials or exporting to an 'HTML' file) as well as human-readable embedded data (using 'JSON'), as well as importing such exported (pre)registration specifications from such embedded 'JSON'.

Maintained by Gjalt-Jorn Peters. Last updated 2 years ago.

8.3 match 3.98 score 1 scripts

rikenbit

Vicus:Exploiting Local Structures to Improve Network-Based Analysis of Biological Data

Compared with the similar graph embedding method such as Laplacian Eigenmaps, 'Vicus' can exploit more local structures of graph data. For the details of the methods, see the reference section of 'GitHub' README.md <https://github.com/rikenbit/Vicus>.

Maintained by Koki Tsuyuzaki. Last updated 2 years ago.

8.8 match 1 stars 3.70 score

anders-biostat

sleepwalk:Interactively Explore Dimension-Reduced Embeddings

A tool to interactively explore the embeddings created by dimension reduction methods such as Principal Components Analysis (PCA), Multidimensional Scaling (MDS), T-distributed Stochastic Neighbour Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP) or any other.

Maintained by Svetlana Ovchinnikova. Last updated 3 years ago.

5.6 match 110 stars 5.82 score 40 scripts

r-lib

rlang:Functions for Base Types and Core R and 'Tidyverse' Features

A toolbox for working with base types, core R features like the condition system, and core 'Tidyverse' features like tidy evaluation.

Maintained by Lionel Henry. Last updated 19 days ago.

1.6 match 517 stars 20.53 score 9.8k scripts 15k dependents

thomaschln

kgraph:Knowledge Graphs Constructions and Visualizations

Knowledge graphs enable to efficiently visualize and gain insights into large-scale data analysis results, as p-values from multiple studies or embedding data matrices. The usual workflow is a user providing a data frame of association studies results and specifying target nodes, e.g. phenotypes, to visualize. The knowledge graph then shows all the features which are significantly associated with the phenotype, with the edges being proportional to the association scores. As the user adds several target nodes and grouping information about the nodes such as biological pathways, the construction of such graphs soon becomes complex. The 'kgraph' package aims to enable users to easily build such knowledge graphs, and provides two main features: first, to enable building a knowledge graph based on a data frame of concepts relationships, be it p-values or cosine similarities; second, to enable determining an appropriate cut-off on cosine similarities from a complete embedding matrix, to enable the building of a knowledge graph directly from an embedding matrix. The 'kgraph' package provides several display, layout and cut-off options, and has already proven useful to researchers to enable them to visualize large sets of p-value associations with various phenotypes, and to quickly be able to visualize embedding results. Two example datasets are provided to demonstrate these behaviors, and several live 'shiny' applications are hosted by the CELEHS laboratory and Parse Health, as the KESER Mental Health application <https://keser-mental-health.parse-health.org/> based on Hong C. (2021) <doi:10.1038/s41746-021-00519-z>.

Maintained by Thomas Charlon. Last updated 24 days ago.

6.4 match 4.85 score

thomaschln

nlpembeds:Natural Language Processing Embeddings

Provides efficient methods to compute co-occurrence matrices, pointwise mutual information (PMI) and singular value decomposition (SVD). In the biomedical and clinical settings, one challenge is the huge size of databases, e.g. when analyzing data of millions of patients over tens of years. To address this, this package provides functions to efficiently compute monthly co-occurrence matrices, which is the computational bottleneck of the analysis, by using the 'RcppAlgos' package and sparse matrices. Furthermore, the functions can be called on 'SQL' databases, enabling the computation of co-occurrence matrices of tens of gigabytes of data, representing millions of patients over tens of years. Partly based on Hong C. (2021) <doi:10.1038/s41746-021-00519-z>.

Maintained by Thomas Charlon. Last updated 24 days ago.

6.0 match 4.98 score

mlr-org

mlr3filters:Filter Based Feature Selection for 'mlr3'

Extends 'mlr3' with filter methods for feature selection. Besides standalone filter methods built-in methods of any machine-learning algorithm are supported. Partial scoring of multivariate filter methods is supported.

Maintained by Marc Becker. Last updated 4 months ago.

feature-selection filter filters mlr mlr3 variable-importance

3.6 match 20 stars 8.37 score 95 scripts 3 dependents

bioc

celda:CEllular Latent Dirichlet Allocation

Celda is a suite of Bayesian hierarchical models for clustering single-cell RNA-sequencing (scRNA-seq) data. It is able to perform "bi-clustering" and simultaneously cluster genes into gene modules and cells into cell subpopulations. It also contains DecontX, a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. A variety of scRNA-seq data visualization functions is also included.

Maintained by Joshua Campbell. Last updated 27 days ago.

singlecell geneexpression clustering sequencing bayesian immunooncology dataimport cpp openmp

2.8 match 147 stars 10.47 score 256 scripts 2 dependents

alaninglis

vivid:Variable Importance and Variable Interaction Displays

A suite of plots for displaying variable importance and two-way variable interaction jointly. Can also display partial dependence plots laid out in a pairs plot or 'zenplots' style.

Maintained by Alan Inglis. Last updated 8 months ago.

4.0 match 21 stars 7.39 score 39 scripts

cran

SELF:A Structural Equation Embedded Likelihood Framework for Causal Discovery

Provides the SELF criteria to learn causal structure. Please cite "Ruichu Cai, Jie Qiao, Zhenjie Zhang, Zhifeng Hao. SELF: Structural Equational Embedded Likelihood Framework for Causal Discovery. AAAI. 2018."

Maintained by Jie Qiao. Last updated 7 years ago.

cpp

5.1 match 5.74 score 16k scripts

bnaras

ECOSolveR:Embedded Conic Solver in R

R interface to the Embedded COnic Solver (ECOS), an efficient and robust C library for convex problems. Conic and equality constraints can be specified in addition to integer and boolean variable constraints for mixed-integer problems. This R interface is inspired by the python interface and has similar calling conventions.

Maintained by Balasubramanian Narasimhan. Last updated 9 months ago.

3.6 match 5 stars 7.86 score 30 scripts 58 dependents

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package contains furthermore functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here, was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 9 days ago.

fortran cpp

1.7 match 87 stars 16.68 score 7.7k scripts 99 dependents

casperhart

detourr:Portable and Performant Tour Animations

Provides 2D and 3D tour animations as HTML widgets. The user can interact with the widgets using orbit controls, tooltips, brushing, and timeline controls. Linked brushing is supported using 'crosstalk', and widgets can be embedded in Shiny apps or HTML documents.

Maintained by Casper Hart. Last updated 5 months ago.

5.9 match 7 stars 4.83 score 48 scripts

bioc

Banksy:Spatial transcriptomic clustering

Banksy is an R package that incorporates spatial information to cluster cells in a feature space (e.g. gene expression). To incorporate spatial information, BANKSY computes the mean neighborhood expression and azimuthal Gabor filters that capture gene expression gradients. These features are combined with the cell's own expression to embed cells in a neighbor-augmented product space which can then be clustered, allowing for accurate and spatially-aware cell typing and tissue domain segmentation.

Maintained by Joseph Lee. Last updated 12 days ago.

clustering spatial singlecell geneexpression dimensionreduction clustering-algorithm single-cell-omics spatial-omics

3.1 match 90 stars 9.03 score 248 scripts

tyy20

KEPTED:Kernel-Embedding-of-Probability Test for Elliptical Distribution

Provides an implementation of a kernel-embedding of probability test for elliptical distribution. This is an asymptotic test for elliptical distribution under general alternatives, and the location and shape parameters are assumed to be unknown. Some side-products are posted, including the transformation between rectangular and polar coordinates and two product-type kernel functions. See Tang and Li (2024) <doi:10.48550/arXiv.2306.10594> for details.

Maintained by Yin Tang. Last updated 11 months ago.

6.7 match 4.18 score

kdpsingh

clinspacy:Clinical Natural Language Processing using 'spaCy', 'scispaCy', and 'medspaCy'

Performs biomedical named entity recognition, Unified Medical Language System (UMLS) concept mapping, and negation detection using the Python 'spaCy', 'scispaCy', and 'medspaCy' packages, and transforms extracted data into a wide format for inclusion in machine learning models. The development of the 'scispaCy' package is described by Neumann (2019) <doi:10.18653/v1/W19-5034>. The 'medspacy' package uses 'ConText', an algorithm for determining the context of clinical statements described by Harkema (2009) <doi:10.1016/j.jbi.2009.05.002>. Clinspacy also supports entity embeddings from 'scispaCy' and UMLS 'cui2vec' concept embeddings developed by Beam (2018) <arXiv:1804.01486>.

Maintained by Karandeep Singh. Last updated 4 years ago.

5.8 match 99 stars 4.81 score 13 scripts

r-spatial

gstat:Spatial and Spatio-Temporal Geostatistical Modelling, Prediction and Simulation

Variogram modelling; simple, ordinary and universal point or block (co)kriging; spatio-temporal kriging; sequential Gaussian or indicator (co)simulation; variogram and variogram map plotting utility functions; supports sf and stars.

Maintained by Edzer Pebesma. Last updated 10 days ago.

openblas

1.9 match 197 stars 14.78 score 4.8k scripts 57 dependents

citoverse

cito:Building and Training Neural Networks

The 'cito' package provides a user-friendly interface for training and interpreting deep neural networks (DNN). 'cito' simplifies the fitting of DNNs by supporting the familiar formula syntax, hyperparameter tuning under cross-validation, and helps to detect and handle convergence problems. DNNs can be trained on CPU, GPU and MacOS GPUs. In addition, 'cito' has many downstream functionalities such as various explainable AI (xAI) metrics (e.g. variable importance, partial dependence plots, accumulated local effect plots, and effect estimates) to interpret trained DNNs. 'cito' optionally provides confidence intervals (and p-values) for all xAI metrics and predictions. At the same time, 'cito' is computationally efficient because it is based on the deep learning framework 'torch'. The 'torch' package is native to R, so no Python installation or other API is required for this package.

Maintained by Maximilian Pichler. Last updated 2 months ago.

machine-learning neural-network

3.0 match 42 stars 9.07 score 129 scripts 1 dependents

benwiseman

sentiment.ai:Simple Sentiment Analysis Using Deep Learning

Sentiment Analysis via deep learning and gradient boosting models with a lot of the underlying hassle taken care of to make the process as simple as possible. In addition to out-performing traditional, lexicon-based sentiment analysis (see <https://benwiseman.github.io/sentiment.ai/#Benchmarks>), it also allows the user to create embedding vectors for text which can be used in other analyses. GPU acceleration is supported on Windows and Linux.

Maintained by Ben Wiseman. Last updated 3 years ago.

10.0 match 2.70 score 7 scripts

rezakj

iCellR:Analyzing High-Throughput Single Cell Sequencing Data

A toolkit that allows scientists to work with data from single cell sequencing technologies such as scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST). Single (i) Cell R package ('iCellR') provides unprecedented flexibility at every step of the analysis pipeline, including normalization, clustering, dimensionality reduction, imputation, visualization, and so on. Users can design both unsupervised and supervised models to best suit their research. In addition, the toolkit provides 2D and 3D interactive visualizations, differential expression analysis, filters based on cells, genes and clusters, data merging, normalizing for dropouts, data imputation methods, correcting for batch differences, pathway analysis, tools to find marker genes for clusters and conditions, predict cell types and pseudotime analysis. See Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.05.05.078550> and Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.03.31.019109> for more details.

Maintained by Alireza Khodadadi-Jamayran. Last updated 8 months ago.

10xgenomics 3d batch-normalization cell-type-classification cite-seq clustering clustering-algorithm diffusion-maps dropout icellr imputation intractive-graph normalization pseudotime scrna-seq scvdj-seq singel-cell-sequencing umap cpp

4.9 match 121 stars 5.56 score 7 scripts 1 dependents

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 months ago.

data-science data-visualization machine-learning machine-learning-library visualization

3.8 match 145 stars 7.09 score 50 scripts 2 dependents

hafen

trelliscopejs:Create Interactive Trelliscope Displays

Trelliscope is a scalable, flexible, interactive approach to visualizing data (Hafen, 2013 <doi:10.1109/LDAV.2013.6675164>). This package provides methods that make it easy to create a Trelliscope display specification for TrelliscopeJS. High-level functions are provided for creating displays from within 'tidyverse' or 'ggplot2' workflows. Low-level functions are also provided for creating new interfaces.

Maintained by Ryan Hafen. Last updated 1 years ago.

visualization

2.8 match 264 stars 9.50 score 1000 scripts 1 dependents

kharchenkolab

pagoda2:Single Cell Analysis and Differential Expression

Analyzing and interactively exploring large-scale single-cell RNA-seq datasets. 'pagoda2' primarily performs normalization and differential gene expression analysis, with an interactive application for exploring single-cell RNA-seq datasets. It performs basic tasks such as cell size normalization, gene variance normalization, and can be used to identify subpopulations and run differential expression within individual samples. 'pagoda2' was written to rapidly process modern large-scale scRNAseq datasets of approximately 1e6 cells. The companion web application allows users to explore which gene expression patterns form the different subpopulations within your data. The package also serves as the primary method for preprocessing data for conos, <https://github.com/kharchenkolab/conos>. This package interacts with data available through the 'p2data' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/pagoda2>. The size of the 'p2data' package is approximately 6 MB.

Maintained by Evan Biederstedt. Last updated 1 years ago.

scrna-seq single-cell single-cell-rna-seq transcriptomics openblas cpp openmp

3.3 match 222 stars 8.00 score 282 scripts

janoleko

LaMa:Fast Numerical Maximum Likelihood Estimation for Latent Markov Models

A variety of latent Markov models, including hidden Markov models, hidden semi-Markov models, state-space models and continuous-time variants can be formulated and estimated within the same framework via directly maximising the likelihood function using the so-called forward algorithm. Applied researchers often need custom models that standard software does not easily support. Writing tailored 'R' code offers flexibility but suffers from slow estimation speeds. We address these issues by providing easy-to-use functions (written in 'C++' for speed) for common tasks like the forward algorithm. These functions can be combined into custom models in a Lego-type approach, offering up to 10-20 times faster estimation via standard numerical optimisers. To aid in building fully custom likelihood functions, several vignettes are included that show how to simulate data from and estimate all the above model classes.

Maintained by Jan-Ole Koslik. Last updated 1 days ago.

openblas cpp openmp

3.4 match 9 stars 7.84 score 42 scripts

mhahsler

seriation:Infrastructure for Ordering Objects Using Seriation

Infrastructure for ordering objects with an implementation of several seriation/sequencing/ordination techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT). Hahsler et al (2008) <doi:10.18637/jss.v025.i03>.

Maintained by Michael Hahsler. Last updated 3 months ago.

combinatorial-optimization ordination seriation fortran

1.9 match 77 stars 14.07 score 640 scripts 79 dependents

jdonaldson

tsne:T-Distributed Stochastic Neighbor Embedding for R (t-SNE)

A "pure R" implementation of the t-SNE algorithm.

Maintained by Justin Donaldson. Last updated 6 years ago.

2.8 match 58 stars 9.35 score 656 scripts 13 dependents

zeileis

exams2forms:Embedding 'exams' Exercises as Forms in 'rmarkdown' or 'quarto' Documents

Automatic generation of quizzes or individual questions as (interactive) forms within 'rmarkdown' or 'quarto' documents based on 'R/exams' exercises.

Maintained by Achim Zeileis. Last updated 2 days ago.

5.6 match 1 stars 4.65 score 9 scripts

r-lib

testthat:Unit Testing for R

Software testing is important, but, in part because it is frustrating and boring, many of us avoid it. 'testthat' is a testing framework for R that is easy to learn and use, and integrates with your existing 'workflow'.

Maintained by Hadley Wickham. Last updated 16 days ago.

unit-testing cpp

1.3 match 900 stars 20.97 score 74k scripts 465 dependents

paithiov909

apportita:Utility for Handling 'magnitude' Word Embeddings

A partial R port from 'magnitude', which is a fast, simple utility library for handling vector embeddings. The main goal of this package is to enable access to user's local magnitude data store.

Maintained by Akiru Kato. Last updated 2 months ago.

embeddings

15.4 match 1 stars 1.70 score 4 scripts

alishinski

lavaanPlot:Path Diagrams for 'Lavaan' Models via 'DiagrammeR'

Plots path diagrams from models in 'lavaan' using the plotting functionality from the 'DiagrammeR' package. 'DiagrammeR' provides nice path diagrams via 'Graphviz', and these functions make it easy to generate these diagrams from a 'lavaan' path model without having to write the DOT language graph specification.

Maintained by Alex Lishinski. Last updated 1 years ago.

3.1 match 40 stars 8.33 score 294 scripts

fishfollower

stockassessment:State-Space Assessment Model

Fitting SAM...

Maintained by Anders Nielsen. Last updated 13 days ago.

stockassessment cpp

3.3 match 49 stars 7.76 score 324 scripts 2 dependents

peterwmacd

fase:Functional Adjacency Spectral Embedding

Latent process embedding for functional network data with the Functional Adjacency Spectral Embedding. Fits smooth latent processes based on cubic spline bases. Also generates functional network data from three models, and evaluates a network generalized cross-validation criterion for dimension selection. For more information, see MacDonald, Zhu and Levina (2022+) <arXiv:2210.07491>.

Maintained by Peter W. MacDonald. Last updated 9 months ago.

7.5 match 3.40 score 3 scripts

thothorn

HSAUR3:A Handbook of Statistical Analyses Using R (3rd Edition)

Functions, data sets, analyses and examples from the third edition of the book ''A Handbook of Statistical Analyses Using R'' (Torsten Hothorn and Brian S. Everitt, Chapman & Hall/CRC, 2014). The first chapter of the book, which is entitled ''An Introduction to R'', is completely included in this package, for all other chapters, a vignette containing all data analyses is available. In addition, Sweave source code for slides of selected chapters is included in this package (see HSAUR3/inst/slides). The publishers web page is '<https://www.routledge.com/A-Handbook-of-Statistical-Analyses-using-R/Hothorn-Everitt/p/book/9781482204582>'.

Maintained by Torsten Hothorn. Last updated 7 months ago.

3.8 match 6 stars 6.72 score 120 scripts 2 dependents

bioc

ttgsea:Tokenizing Text of Gene Set Enrichment Analysis

Functional enrichment analysis methods such as gene set enrichment analysis (GSEA) have been widely used for analyzing gene expression data. GSEA is a powerful method to infer results of gene expression data at a level of gene sets by calculating enrichment scores for predefined sets of genes. GSEA depends on the availability and accuracy of gene sets. There are overlaps between terms of gene sets or categories because multiple terms may exist for a single biological process, and it can thus lead to redundancy within enriched terms. In other words, the sets of related terms are overlapping. Using deep learning, this pakage is aimed to predict enrichment scores for unique tokens or words from text in names of gene sets to resolve this overlapping set issue. Furthermore, we can coin a new term by combining tokens and find its enrichment score by predicting such a combined tokens.

Maintained by Dongmin Jung. Last updated 5 months ago.

software geneexpression genesetenrichment

5.1 match 4.95 score 3 scripts 3 dependents

bioc

HGC:A fast hierarchical graph-based clustering method

HGC (short for Hierarchical Graph-based Clustering) is an R package for conducting hierarchical clustering on large-scale single-cell RNA-seq (scRNA-seq) data. The key idea is to construct a dendrogram of cells on their shared nearest neighbor (SNN) graph. HGC provides functions for building graphs and for conducting hierarchical clustering on the graph. The users with old R version could visit https://github.com/XuegongLab/HGC/tree/HGC4oldRVersion to get HGC package built for R 3.6.

Maintained by XGlab. Last updated 5 months ago.

singlecell software clustering rnaseq graphandnetwork dnaseq cpp

5.3 match 4.70 score 25 scripts

tidymodels

dials:Tools for Creating Tuning Parameter Values

Many models contain tuning parameters (i.e. parameters that cannot be directly estimated from the data). These tools can be used to define objects for creating, simulating, or validating values for such parameters.

Maintained by Hannah Frick. Last updated 29 days ago.

1.8 match 114 stars 14.22 score 426 scripts 52 dependents

r-gregmisc

gtools:Various R Programming Tools

Functions to assist in R programming, including: - assist in developing, updating, and maintaining R and R packages ('ask', 'checkRVersion', 'getDependencies', 'keywords', 'scat'), - calculate the logit and inverse logit transformations ('logit', 'inv.logit'), - test if a value is missing, empty or contains only NA and NULL values ('invalid'), - manipulate R's .Last function ('addLast'), - define macros ('defmacro'), - detect odd and even integers ('odd', 'even'), - convert strings containing non-ASCII characters (like single quotes) to plain ASCII ('ASCIIfy'), - perform a binary search ('binsearch'), - sort strings containing both numeric and character components ('mixedsort'), - create a factor variable from the quantiles of a continuous variable ('quantcut'), - enumerate permutations and combinations ('combinations', 'permutation'), - calculate and convert between fold-change and log-ratio ('foldchange', 'logratio2foldchange', 'foldchange2logratio'), - calculate probabilities and generate random numbers from Dirichlet distributions ('rdirichlet', 'ddirichlet'), - apply a function over adjacent subsets of a vector ('running'), - modify the TCP_NODELAY ('de-Nagle') flag for socket objects, - efficient 'rbind' of data frames, even if the column names don't match ('smartbind'), - generate significance stars from p-values ('stars.pval'), - convert characters to/from ASCII codes ('asc', 'chr'), - convert character vector to ASCII representation ('ASCIIfy'), - apply title capitalization rules to a character vector ('capwords').

Maintained by Ben Bolker. Last updated 9 months ago.

1.7 match 25 stars 14.47 score 11k scripts 1.1k dependents

bioc

eiR:Accelerated similarity searching of small molecules

The eiR package provides utilities for accelerated structure similarity searching of very large small molecule data sets using an embedding and indexing approach.

Maintained by Thomas Girke. Last updated 1 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics microtitreplateassay cellbasedassays visualization infrastructure dataimport clustering proteomics metabolomics

4.4 match 3 stars 5.51 score 12 scripts

joycekang

symphony:Efficient and Precise Single-Cell Reference Atlas Mapping

Implements the Symphony single-cell reference building and query mapping algorithms and additional functions described in Kang et al <https://www.nature.com/articles/s41467-021-25957-x>.

Maintained by Joyce Kang. Last updated 2 years ago.

openblas cpp

6.3 match 3.83 score 134 scripts

wch

extrafont:Tools for Using Fonts

Tools to using fonts other than the standard PostScript fonts. This package makes it easy to use system TrueType fonts and with PDF or PostScript output files, and with bitmap output files in Windows. extrafont can also be used with fonts packaged specifically to be used with, such as the fontcm package, which has Computer Modern PostScript fonts with math symbols.

Maintained by Winston Chang. Last updated 2 years ago.

1.7 match 324 stars 14.10 score 13k scripts 51 dependents

bioc

tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles

This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.

Maintained by Timothy Keyes. Last updated 5 months ago.

singlecell flowcytometry bioinformatics cytometry data-science single-cell tidyverse cpp

3.3 match 19 stars 7.26 score 35 scripts

wgunderwood

motifcluster:Motif-Based Spectral Clustering of Weighted Directed Networks

Tools for spectral clustering of weighted directed networks using motif adjacency matrices. Methods perform well on large and sparse networks, and random sampling methods for generating weighted directed networks are also provided. Based on methodology detailed in Underwood, Elliott and Cucuringu (2020) <arXiv:2004.01293>.

Maintained by William George Underwood. Last updated 9 months ago.

4.2 match 26 stars 5.72 score 4 scripts

jwijffels

topicmodels.etm:Topic Modelling in Embedding Spaces

Find topics in texts which are semantically embedded using techniques like word2vec or Glove. This topic modelling technique models each word with a categorical distribution whose natural parameter is the inner product between a word embedding and an embedding of its assigned topic. The techniques are explained in detail in the paper 'Topic Modeling in Embedding Spaces' by Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei (2019), available at <arXiv:1907.04907>.

Maintained by Jan Wijffels. Last updated 3 years ago.

7.7 match 1 stars 2.90 score 32 scripts

bioc

flowPlots:flowPlots: analysis plots and data class for gated flow cytometry data

Graphical displays with embedded statistical tests for gated ICS flow cytometry data, and a data class which stores "stacked" data and has methods for computing summary measures on stacked data, such as marginal and polyfunctional degree data.

Maintained by N. Hawkins. Last updated 5 months ago.

immunooncology flowcytometry cellbasedassays visualization datarepresentation

6.8 match 3.30 score 1 scripts

irudnyts

openai:R Wrapper for OpenAI API

An R wrapper of OpenAI API endpoints (see <https://platform.openai.com/docs/introduction> for details). This package covers Models, Completions, Chat, Edits, Images, Embeddings, Audio, Files, Fine-tunes, Moderations, and legacy Engines endpoints.

Maintained by Iegor Rudnytskyi. Last updated 4 months ago.

api ml nlp openai

2.8 match 172 stars 8.05 score 336 scripts 5 dependents

bioc

tricycle:tricycle: Transferable Representation and Inference of cell cycle

The package contains functions to infer and visualize cell cycle process using Single Cell RNASeq data. It exploits the idea of transfer learning, projecting new data to the previous learned biologically interpretable space. We provide a pre-learned cell cycle space, which could be used to infer cell cycle time of human and mouse single cell samples. In addition, we also offer functions to visualize cell cycle time on different embeddings and functions to build new reference.

Maintained by Shijie Zheng. Last updated 5 months ago.

singlecell software transcriptomics rnaseq transcription biologicalquestion dimensionreduction immunooncology

3.4 match 24 stars 6.52 score 46 scripts

shaelebrown

TDApplied:Machine Learning and Inference for Topological Data Analysis

Topological data analysis is a powerful tool for finding non-linear global structure in whole datasets. The main tool of topological data analysis is persistent homology, which computes a topological shape descriptor of a dataset called a persistence diagram. 'TDApplied' provides useful and efficient methods for analyzing groups of persistence diagrams with machine learning and statistical inference, and these functions can also interface with other data science packages to form flexible and integrated topological data analysis pipelines.

Maintained by Shael Brown. Last updated 5 months ago.

cpp

3.3 match 16 stars 6.60 score 8 scripts

doi-usgs

dataRetrieval:Retrieval Functions for USGS and EPA Hydrology and Water Quality Data

Collection of functions to help retrieve U.S. Geological Survey and U.S. Environmental Protection Agency water quality and hydrology data from web services. Data are discovered from National Water Information System <https://waterservices.usgs.gov/> and <https://waterdata.usgs.gov/nwis>. Water quality data are obtained from the Water Quality Portal <https://www.waterqualitydata.us/>.

Maintained by Laura DeCicco. Last updated 17 days ago.

usgs

1.5 match 280 stars 14.18 score 1.7k scripts 15 dependents

bioc

spatialHeatmap:spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions

The spatialHeatmap package offers the primary functionality for visualizing cell-, tissue- and organ-specific assay data in spatial anatomical images. Additionally, it provides extended functionalities for large-scale data mining routines and co-visualizing bulk and single-cell data. A description of the project is available here: https://spatialheatmap.org.

Maintained by Jianhai Zhang. Last updated 4 months ago.

spatial visualization microarray sequencing geneexpression datarepresentation network clustering graphandnetwork cellbasedassays atacseq dnaseq tissuemicroarray singlecell cellbiology genetarget

3.4 match 5 stars 6.26 score 12 scripts

rstudio

sortable:Drag-and-Drop in 'shiny' Apps with 'SortableJS'

Enables drag-and-drop behaviour in Shiny apps, by exposing the functionality of the 'SortableJS' <https://sortablejs.github.io/Sortable/> JavaScript library as an 'htmlwidget'. You can use this in Shiny apps and widgets, 'learnr' tutorials as well as R Markdown. In addition, provides a custom 'learnr' question type - 'question_rank()' - that allows ranking questions with drag-and-drop.

Maintained by Andrie de Vries. Last updated 6 months ago.

htmlwidget

1.8 match 135 stars 11.62 score 368 scripts 13 dependents

xd-deng

ECharts2Shiny:Embedding Interactive Charts Generated with ECharts Library into Shiny Applications

Embed interactive charts to their Shiny applications. These charts will be generated by ECharts library developed by Baidu (<http://echarts.baidu.com/>). Current version supports line chart, bar chart, pie chart, scatter plot, gauge, word cloud, radar chart, tree map, and heat map.

Maintained by Xiaodong Deng. Last updated 4 years ago.

shiny visualization

2.8 match 129 stars 7.42 score 135 scripts

eagerai

tfaddons:Interface to 'TensorFlow SIG Addons'

'TensorFlow SIG Addons' <https://www.tensorflow.org/addons> is a repository of community contributions that conform to well-established API patterns, but implement new functionality not available in core 'TensorFlow'. 'TensorFlow' natively supports a large number of operators, layers, metrics, losses, optimizers, and more. However, in a fast moving field like Machine Learning, there are many interesting new developments that cannot be integrated into core 'TensorFlow' (because their broad applicability is not yet clear, or it is mostly used by a smaller subset of the community).

Maintained by Turgut Abdullayev. Last updated 3 years ago.

deep-learning keras neural-networks tensorflow tensorflow-addons tfa

4.0 match 20 stars 5.20 score 16 scripts

emilhvitfeldt

wordsalad:Provide Tools to Extract and Analyze Word Vectors

Provides access to various word embedding methods (GloVe, fasttext and word2vec) to extract word vectors using a unified framework to increase reproducibility and correctness.

Maintained by Emil Hvitfeldt. Last updated 4 years ago.

5.8 match 8 stars 3.60 score 9 scripts

bioc

scrapper:Bindings to C++ Libraries for Single-Cell Analysis

Implements R bindings to C++ code for analyzing single-cell (expression) data, mostly from various libscran libraries. Each function performs an individual step in the single-cell analysis workflow, ranging from quality control to clustering and marker detection. It is mostly intended for other Bioconductor package developers to build more user-friendly end-to-end workflows.

Maintained by Aaron Lun. Last updated 4 days ago.

normalization rnaseq software geneexpression transcriptomics singlecell batcheffect qualitycontrol differentialexpression featureextraction principalcomponent clustering openblas cpp

3.8 match 5.55 score 32 scripts

haythorn

sr:Smooth Regression - The Gamma Test and Tools

Finds causal connections in precision data, finds lags and embeddings in time series, guides training of neural networks and other smooth models, evaluates their performance, gives a mathematically grounded answer to the over-training problem. Smooth regression is based on the Gamma test, which measures smoothness in a multivariate relationship. Causal relations are smooth, noise is not. 'sr' includes the Gamma test and search techniques that use it. References: Evans & Jones (2002) <doi:10.1098/rspa.2002.1010>, AJ Jones (2004) <doi:10.1007/s10287-003-0006-1>.

Maintained by Wayne Haythorn. Last updated 2 years ago.

5.5 match 3.70 score 9 scripts

tidymodels

textrecipes:Extra 'Recipes' for Text Processing

Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allows for tokenization, filtering, counting (tf and tfidf) and feature hashing.

Maintained by Emil Hvitfeldt. Last updated 8 days ago.

1.9 match 160 stars 10.87 score 964 scripts 1 dependents

cran

cencrne:Consistent Estimation of the Number of Communities via Regularized Network Embedding

The network analysis plays an important role in numerous application domains including biomedicine. Estimation of the number of communities is a fundamental and critical issue in network analysis. Most existing studies assume that the number of communities is known a priori, or lack of rigorous theoretical guarantee on the estimation consistency. This method proposes a regularized network embedding model to simultaneously estimate the community structure and the number of communities in a unified formulation. The proposed model equips network embedding with a novel composite regularization term, which pushes the embedding vector towards its center and collapses similar community centers with each other. A rigorous theoretical analysis is conducted, establishing asymptotic consistency in terms of community detection and estimation of the number of communities. Reference: Ren, M., Zhang S. and Wang J. (2022). "Consistent Estimation of the Number of Communities via Regularized Network Embedding". Biometrics, <doi:10.1111/biom.13815>.

Maintained by Mingyang Ren. Last updated 2 years ago.

10.2 match 2.00 score

mano-b

MicroSEC:Sequence Error Filter for Formalin-Fixed and Paraffin-Embedded Samples

Clinical sequencing of tumor is usually performed on formalin-fixed and paraffin-embedded samples and have many sequencing errors. We found that the majority of these errors are detected in chimeric read caused by single-strand DNA with micro-homology. Our filtering pipeline focuses on the uneven distribution of the artifacts in each read and removes such errors in formalin-fixed and paraffin-embedded samples without over-eliminating the true mutations detected in fresh frozen samples.

Maintained by Masachika Ikegami. Last updated 3 months ago.

3.6 match 7 stars 5.66 score 8 scripts

andrija-djurovic

PDtoolkit:Collection of Tools for PD Rating Model Development and Validation

The goal of this package is to cover the most common steps in probability of default (PD) rating model development and validation. The main procedures available are those that refer to univariate, bivariate, multivariate analysis, calibration and validation. Along with accompanied 'monobin' and 'monobinShiny' packages, 'PDtoolkit' provides functions which are suitable for different data transformation and modeling tasks such as: imputations, monotonic binning of numeric risk factors, binning of categorical risk factors, weights of evidence (WoE) and information value (IV) calculations, WoE coding (replacement of risk factors modalities with WoE values), risk factor clustering, area under curve (AUC) calculation and others. Additionally, package provides set of validation functions for testing homogeneity, heterogeneity, discriminatory and predictive power of the model.

Maintained by Andrija Djurovic. Last updated 1 years ago.

4.3 match 14 stars 4.78 score 86 scripts

bioc

veloviz:VeloViz: RNA-velocity informed 2D embeddings for visualizing cell state trajectories

VeloViz uses each cell’s current observed and predicted future transcriptional states inferred from RNA velocity analysis to build a nearest neighbor graph between cells in the population. Edges are then pruned based on a cosine correlation threshold and/or a distance threshold and the resulting graph is visualized using a force-directed graph layout algorithm. VeloViz can help ensure that relationships between cell states are reflected in the 2D embedding, allowing for more reliable representation of underlying cellular trajectories.

Maintained by Lyla Atta. Last updated 5 months ago.

transcriptomics visualization geneexpression sequencing rnaseq dimensionreduction cpp

5.0 match 4.00 score 6 scripts

bioc

DeProViR:A Deep-Learning Framework Based on Pre-trained Sequence Embeddings for Predicting Host-Viral Protein-Protein Interactions

Emerging infectious diseases, exemplified by the zoonotic COVID-19 pandemic caused by SARS-CoV-2, are grave global threats. Understanding protein-protein interactions (PPIs) between host and viral proteins is essential for therapeutic targets and insights into pathogen replication and immune evasion. While experimental methods like yeast two-hybrid screening and mass spectrometry provide valuable insights, they are hindered by experimental noise and costs, yielding incomplete interaction maps. Computational models, notably DeProViR, predict PPIs from amino acid sequences, incorporating semantic information with GloVe embeddings. DeProViR employs a Siamese neural network, integrating convolutional and Bi-LSTM networks to enhance accuracy. It overcomes the limitations of feature engineering, offering an efficient means to predict host-virus interactions, which holds promise for antiviral therapies and advancing our understanding of infectious diseases.

Maintained by Matineh Rahmatbakhsh. Last updated 5 months ago.

proteomics systemsbiology networkinference neuralnetwork network

6.6 match 1 stars 3.00 score 1 scripts

crunch-io

crunch:Crunch.io Data Tools

The Crunch.io service <https://crunch.io/> provides a cloud-based data store and analytic engine, as well as an intuitive web interface. Using this package, analysts can interact with and manipulate Crunch datasets from within R. Importantly, this allows technical researchers to collaborate naturally with team members, managers, and clients who prefer a point-and-click interface.

Maintained by Greg Freedman Ellis. Last updated 10 days ago.

1.9 match 9 stars 10.53 score 200 scripts 2 dependents

brodieg

diffobj:Diffs for R Objects

Generate a colorized diff of two R objects for an intuitive visualization of their differences.

Maintained by Brodie Gaslam. Last updated 3 years ago.

diff

1.5 match 232 stars 13.12 score 107 scripts 486 dependents

nucleic-acid

namedropR:Create Visual Citations for Presentations and Posters

Provides 'visual citations' containing the metadata of a scientific paper and a 'QR' code. A 'visual citation' is a banner containing title, authors, journal and year of a publication. This package can create such banners based on 'BibTeX' and 'BibLaTeX' references or call the reference metadata from 'Crossref'-API. The banners include a QR code pointing to the 'DOI'. The resulting HTML object or PNG image can be included in a presentation to point the audience to good resources for further reading. Styling is possible via predefined designs or via custom 'CSS'. This package is not intended as replacement for proper reference manager packages, but a tool to enrich scientific presentation slides and conference posters.

Maintained by Christian A. Gebhard. Last updated 2 years ago.

bibtex

3.0 match 61 stars 6.44 score 8 scripts

cran

compositions:Compositional Data Analysis

Provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations) in the way proposed by J. Aitchison and V. Pawlowsky-Glahn.

Maintained by K. Gerald van den Boogaart. Last updated 1 years ago.

openblas

3.0 match 1 stars 6.35 score 36 dependents

robjhyndman

tsfeatures:Time Series Feature Extraction

Methods for extracting various features from time series data. The features provided are those from Hyndman, Wang and Laptev (2013) <doi:10.1109/ICDMW.2015.104>, Kang, Hyndman and Smith-Miles (2017) <doi:10.1016/j.ijforecast.2016.09.004> and from Fulcher, Little and Jones (2013) <doi:10.1098/rsif.2013.0048>. Features include spectral entropy, autocorrelations, measures of the strength of seasonality and trend, and so on. Users can also define their own feature functions.

Maintained by Rob Hyndman. Last updated 8 months ago.

feature-extraction time-series

1.6 match 254 stars 11.47 score 268 scripts 22 dependents

cogdisreslab

PAVER:PAVER: Pathway Analysis Visualization with Embedding Representations

Summary visualization using embedding representations to reveal underlying themes within sets of pathway terms.

Maintained by William G Ryan V. Last updated 8 months ago.

5.3 match 3.48 score 6 scripts

mkellerressel

hydra:Hyperbolic Embedding

Calculate an optimal embedding of a set of data points into low-dimensional hyperbolic space. This uses the strain-minimizing hyperbolic embedding of Keller-Ressel and Nargang (2019), see <arXiv:1903.08977>.

Maintained by Martin Keller-Ressel. Last updated 6 years ago.

8.4 match 2.15 score 20 scripts

tiledb-inc

tiledb:Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays

The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-values stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.

Maintained by Isaiah Norton. Last updated 4 days ago.

array hdfs s3 storage-manager tiledb cpp

1.5 match 107 stars 11.96 score 306 scripts 4 dependents

comp-cogneuro-lang

LexFindR:Find Related Items and Lexical Dimensions in a Lexicon

Implements code to identify lexical competitors in a given list of words. We include many of the standard competitor types used in spoken word recognition research, such as functions to find cohorts, neighbors, and rhymes, amongst many others. The package includes documentation for using a variety of lexicon files, including those with form codes made up of multiple letters (i.e., phoneme codes) and also basic orthographies. Importantly, the code makes use of multiple CPU cores and vectorization when possible, making it extremely fast and able to handle large lexicons. Additionally, the package contains documentation for users to easily write new functions, allowing researchers to examine other relationships within a lexicon. Preprint: <https://osf.io/preprints/psyarxiv/8dyru/>. Open access: <doi:10.3758/s13428-021-01667-6>. Citation: Li, Z., Crinnion, A.M. & Magnuson, J.S. (2021). <doi:10.3758/s13428-021-01667-6>.

Maintained by ZhaoBin Li. Last updated 9 months ago.

4.0 match 4 stars 4.30 score 5 scripts

dcourvoisier

doremi:Dynamics of Return to Equilibrium During Multiple Inputs

Provides models to fit the dynamics of a regulated system experiencing exogenous inputs. The underlying models use differential equations and linear mixed-effects regressions to estimate the coefficients of the equation. With them, the functions can provide an estimated signal. The package provides simulation and analysis functions and also print, summary, plot and predict methods, adapted to the function outputs, for easy implementation and presentation of results.

Maintained by Mongin Denis. Last updated 3 years ago.

3.8 match 4.48 score 25 scripts 1 dependents

wilart

SMARTbayesR:Bayesian Set of Best Dynamic Treatment Regimes and Sample Size in SMARTs for Binary Outcomes

Permits determination of a set of optimal dynamic treatment regimes and sample size for a SMART design in the Bayesian setting with binary outcomes. Please see Artman (2020) <arXiv:2008.02341>.

Maintained by William Artman. Last updated 3 years ago.

8.5 match 2.00 score 2 scripts

thomaschln

opticskxi:OPTICS K-Xi Density-Based Clustering

Density-based clustering methods are well adapted to the clustering of high-dimensional data and enable the discovery of core groups of various shapes despite large amounts of noise. This package provides a novel density-based cluster extraction method, OPTICS k-Xi, and a framework to compare k-Xi models using distance-based metrics to investigate datasets with unknown number of clusters. The vignette first introduces density-based algorithms with simulated datasets, then presents and evaluates the k-Xi cluster extraction method. Finally, the models comparison framework is described and experimented on 2 genetic datasets to identify groups and their discriminating features. The k-Xi algorithm is a novel OPTICS cluster extraction method that specifies directly the number of clusters and does not require fine-tuning of the steepness parameter as the OPTICS Xi method. Combined with a framework that compares models with varying parameters, the OPTICS k-Xi method can identify groups in noisy datasets with unknown number of clusters. Results on summarized genetic data of 1,200 patients are in Charlon T. (2019) <doi:10.13097/archive-ouverte/unige:161795>. A short video tutorial can be found at <https://www.youtube.com/watch?v=P2XAjqI5Lc4/>.

Maintained by Thomas Charlon. Last updated 6 days ago.

3.4 match 4.90 score 1 scripts

tidyverse

readxl:Read Excel Files

Import excel files into R. Supports '.xls' via the embedded 'libxls' C library <https://github.com/libxls/libxls> and '.xlsx' via the embedded 'RapidXML' C++ library <https://rapidxml.sourceforge.net/>. Works on Windows, Mac and Linux without external dependencies.

Maintained by Jennifer Bryan. Last updated 9 days ago.

excel spreadsheet xls xlsx cpp

0.8 match 734 stars 20.85 score 160k scripts 815 dependents

andrija-djurovic

LGDtoolkit:Collection of Tools for LGD Rating Model Development

The goal of this package is to cover the most common steps in Loss Given Default (LGD) rating model development. The main procedures available are those that refer to bivariate and multivariate analysis. In particular two statistical methods for multivariate analysis are currently implemented – OLS regression and fractional logistic regression. Both methods are also available within different blockwise model designs and both have customized stepwise algorithms. Descriptions of these customized designs are available in Siddiqi (2016) <doi:10.1002/9781119282396.ch10> and Anderson, R.A. (2021) <doi:10.1093/oso/9780192844194.001.0001>. Although they are explained for PD model, the same designs are applicable for LGD model with different underlying regression methods (OLS and fractional logistic regression). To cover other important steps for LGD model development, it is recommended to use 'LGDtoolkit' package along with 'PDtoolkit', and 'monobin' (or 'monobinShiny') packages. Additionally, 'LGDtoolkit' provides set of procedures handy for initial and periodical model validation.

Maintained by Andrija Djurovic. Last updated 7 months ago.

4.3 match 5 stars 3.88 score 30 scripts

mages

googleVis:R Interface to Google Charts

R interface to Google's chart tools, allowing users to create interactive charts based on data frames. Charts are displayed locally via the R HTTP help server. A modern browser with an Internet connection is required. The data remains local and is not uploaded to Google.

Maintained by Markus Gesmann. Last updated 10 months ago.

1.3 match 361 stars 12.98 score 2.4k scripts 11 dependents

luisdva

unheadr:Handle Data with Messy Header Rows and Broken Values

Verb-like functions to work with messy data, often derived from spreadsheets or parsed PDF tables. Includes functions for unwrapping values broken up across rows, relocating embedded grouping values, and to annotate meaningful formatting in spreadsheet files.

Maintained by Luis D. Verde Arregoitia. Last updated 10 months ago.

2.5 match 61 stars 6.44 score 45 scripts

bioc

tomoda:Tomo-seq data analysis

This package provides many easy-to-use methods to analyze and visualize tomo-seq data. The tomo-seq technique is based on cryosectioning of tissue and performing RNA-seq on consecutive sections. (Reference: Kruse F, Junker JP, van Oudenaarden A, Bakkers J. Tomo-seq: A method to obtain genome-wide expression data with spatial resolution. Methods Cell Biol. 2016;135:299-307. doi:10.1016/bs.mcb.2016.01.006) The main purpose of the package is to find zones with similar transcriptional profiles and spatially expressed genes in a tomo-seq sample. Several visulization functions are available to create easy-to-modify plots.

Maintained by Wendao Liu. Last updated 5 months ago.

geneexpression sequencing rnaseq transcriptomics spatial clustering visualization

4.0 match 4.00 score 2 scripts

bioc

monocle:Clustering, differential expression, and trajectory analysis for single- cell RNA-Seq

Monocle performs differential expression and time-series analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle also performs differential expression analysis, clustering, visualization, and other useful tasks on single cell expression data. It is designed to work with RNA-Seq and qPCR data, but could be used with other types as well.

Maintained by Cole Trapnell. Last updated 5 months ago.

immunooncology sequencing rnaseq geneexpression differentialexpression infrastructure dataimport datarepresentation visualization clustering multiplecomparison qualitycontrol cpp

1.8 match 8.89 score 1.6k scripts 2 dependents

insightsengineering

teal:Exploratory Web Apps for Analyzing Clinical Trials Data

A 'shiny' based interactive exploration framework for analyzing clinical trials data. 'teal' currently provides a dynamic filtering facility and different data viewers. 'teal' 'shiny' applications are built using standard 'shiny' modules.

Maintained by Dawid Kaledkowski. Last updated 20 days ago.

clinical-trials nest shiny webapp

1.3 match 197 stars 12.68 score 176 scripts 5 dependents

pik-piam

gms:'GAMS' Modularization Support Package

A collection of tools to create, use and maintain modularized model code written in the modeling language 'GAMS' (<https://www.gams.com/>). Out-of-the-box 'GAMS' does not come with support for modularized model code. This package provides the tools necessary to convert a standard 'GAMS' model to a modularized one by introducing a modularized code structure together with a naming convention which emulates local environments. In addition, this package provides tools to monitor the compliance of the model code with modular coding guidelines.

Maintained by Jan Philipp Dietrich. Last updated 4 days ago.

1.8 match 1 stars 8.73 score 414 scripts 37 dependents

hfgolino

EGAnet:Exploratory Graph Analysis – a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics

Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.

Maintained by Hudson Golino. Last updated 9 days ago.

2.0 match 47 stars 7.80 score 61 scripts 1 dependents

ropensci

rsvg:Render SVG Images into PDF, PNG, (Encapsulated) PostScript, or Bitmap Arrays

Renders vector-based svg images into high-quality custom-size bitmap arrays using 'librsvg2'. The resulting bitmap can be written to e.g. png, jpeg or webp format. In addition, the package can convert images directly to various formats such as pdf or postscript.

Maintained by Jeroen Ooms. Last updated 6 months ago.

librsvg glib cairo

1.3 match 98 stars 11.66 score 868 scripts 45 dependents

mlr-org

mlr3fselect:Feature Selection for 'mlr3'

Feature selection package of the 'mlr3' ecosystem. It selects the optimal feature set for any 'mlr3' learner. The package works with several optimization algorithms e.g. Random Search, Recursive Feature Elimination, and Genetic Search. Moreover, it can automatically optimize learners and estimate the performance of optimized feature sets with nested resampling.

Maintained by Marc Becker. Last updated 2 months ago.

evolutionary-algorithms exhaustive-search feature-selection machine-learning mlr3 optimization random-search recursive-feature-elimination sequential-feature-selection

1.9 match 23 stars 8.25 score 70 scripts 2 dependents

pdhoff

amen:Additive and Multiplicative Effects Models for Networks and Relational Data

Analysis of dyadic network and relational data using additive and multiplicative effects (AME) models. The basic model includes regression terms, the covariance structure of the social relations model (Warner, Kenny and Stoto (1979) <DOI:10.1037/0022-3514.37.10.1742>, Wong (1982) <DOI:10.2307/2287296>), and multiplicative factor models (Hoff(2009) <DOI:10.1007/s10588-008-9040-4>). Several different link functions accommodate different relational data structures, including binary/network data, normal relational data, zero-inflated positive outcomes using a tobit model, ordinal relational data and data from fixed-rank nomination schemes. Several of these link functions are discussed in Hoff, Fosdick, Volfovsky and Stovel (2013) <DOI:10.1017/nws.2013.17>. Development of this software was supported in part by NIH grant R01HD067509.

Maintained by Peter Hoff. Last updated 4 years ago.

2.3 match 28 stars 6.81 score 153 scripts

drwolf85

spMC:Continuous-Lag Spatial Markov Chains

A set of functions is provided for 1) the stratum lengths analysis along a chosen direction, 2) fast estimation of continuous lag spatial Markov chains model parameters and probability computing (also for large data sets), 3) transition probability maps and transiograms drawing, 4) simulation methods for categorical random fields. More details on the methodology are discussed in Sartore (2013) <doi:10.32614/RJ-2013-022> and Sartore et al. (2016) <doi:10.1016/j.cageo.2016.06.001>.

Maintained by Luca Sartore. Last updated 2 years ago.

openblas openmp

5.3 match 3 stars 2.92 score 55 scripts

jgarriga65

bigMap:Big Data Mapping

Unsupervised clustering protocol for large scale structured data, based on a low dimensional representation of the data. Dimensionality reduction is performed using a parallelized implementation of the t-Stochastic Neighboring Embedding algorithm (Garriga J. and Bartumeus F. (2018), <arXiv:1812.09869>).

Maintained by Joan Garriga. Last updated 9 months ago.

openblas cpp

4.6 match 3.32 score 21 scripts

bioc

velociraptor:Toolkit for Single-Cell Velocity

This package provides Bioconductor-friendly wrappers for RNA velocity calculations in single-cell RNA-seq data. We use the basilisk package to manage Conda environments, and the zellkonverter package to convert data structures between SingleCellExperiment (R) and AnnData (Python). The information produced by the velocity methods is stored in the various components of the SingleCellExperiment class.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

singlecell geneexpression sequencing coverage rna-velocity

1.9 match 54 stars 8.06 score 52 scripts

bnprks

BPCells:Single Cell Counts Matrices to PCA

> Efficient operations for single cell ATAC-seq fragments and RNA counts matrices. Interoperable with standard file formats, and introduces efficient bit-packed formats that allow large storage savings and increased read speeds.

Maintained by Benjamin Parks. Last updated 1 months ago.

zlib hdf5 cpp

2.0 match 184 stars 7.48 score 172 scripts

rstudio

tfhub:Interface to 'TensorFlow' Hub

'TensorFlow' Hub is a library for the publication, discovery, and consumption of reusable parts of machine learning models. A module is a self-contained piece of a 'TensorFlow' graph, along with its weights and assets, that can be reused across different tasks in a process known as transfer learning. Transfer learning train a model with a smaller dataset, improve generalization, and speed up training.

Maintained by Tomasz Kalinowski. Last updated 3 years ago.

2.0 match 29 stars 7.46 score 73 scripts 1 dependents

nicholasdavies

luajr:'LuaJIT' Scripting

An interface to 'LuaJIT' <https://luajit.org>, a just-in-time compiler for the 'Lua' scripting language <https://www.lua.org>. Allows users to run 'Lua' code from 'R'.

Maintained by Nicholas Davies. Last updated 5 months ago.

cpp

2.3 match 25 stars 6.30 score 6 scripts

alexym1

fusionchartsR:Embedding FusionCharts in R

FusionCharts provides awesome and minimalist functions to make beautiful interactive charts <https://www.fusioncharts.com/>.

Maintained by Alex Yahiaoui Martinez. Last updated 3 months ago.

3.3 match 6 stars 4.40 score 42 scripts

kharchenkolab

conos:Clustering on Network of Samples

Wires together large collections of single-cell RNA-seq datasets, which allows for both the identification of recurrent cell clusters and the propagation of information between datasets in multi-sample or atlas-scale collections. 'Conos' focuses on the uniform mapping of homologous cell types across heterogeneous sample collections. For instance, users could investigate a collection of dozens of peripheral blood samples from cancer patients combined with dozens of controls, which perhaps includes samples of a related tissue such as lymph nodes. This package interacts with data available through the 'conosPanel' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/conos>. The size of the 'conosPanel' package is approximately 12 MB.

Maintained by Evan Biederstedt. Last updated 1 years ago.

batch-correction scrna-seq single-cell-rna-seq openblas cpp openmp

2.0 match 204 stars 7.32 score 258 scripts

travis-barton

LilRhino:For Implementation of Feed Reduction, Learning Examples, NLP and Code Management

This is for code management functions, NLP tools, a Monty Hall simulator, and for implementing my own variable reduction technique called Feed Reduction. The Feed Reduction technique is not yet published, but is merely a tool for implementing a series of binary neural networks meant for reducing data into N dimensions, where N is the number of possible values of the response variable.

Maintained by Travis Barton. Last updated 3 years ago.

5.2 match 1 stars 2.78 score 12 scripts

bioc

snifter:R wrapper for the python openTSNE library

Provides an R wrapper for the implementation of FI-tSNE from the python package openTNSE. See Poličar et al. (2019) <doi:10.1101/731877> and the algorithm described by Linderman et al. (2018) <doi:10.1038/s41592-018-0308-4>.

Maintained by Alan OCallaghan. Last updated 5 months ago.

dimensionreduction visualization software singlecell sequencing

2.9 match 3 stars 4.95 score 3 scripts

stephenslab

fastTopics:Fast Algorithms for Fitting Topic Models and Non-Negative Matrix Factorizations to Count Data

Implements fast, scalable optimization algorithms for fitting topic models ("grade of membership" models) and non-negative matrix factorizations to count data. The methods exploit the special relationship between the multinomial topic model (also, "probabilistic latent semantic indexing") and Poisson non-negative matrix factorization. The package provides tools to compare, annotate and visualize model fits, including functions to efficiently create "structure plots" and identify key features in topics. The 'fastTopics' package is a successor to the 'CountClust' package. For more information, see <doi:10.48550/arXiv.2105.13440> and <doi:10.1186/s13059-023-03067-9>. Please also see the GitHub repository for additional vignettes not included in the package on CRAN.

Maintained by Peter Carbonetto. Last updated 16 days ago.

openblas cpp

1.7 match 79 stars 8.38 score 678 scripts 1 dependents

josherrickson

rlemon:R Access to LEMON Graph Algorithms

Allows easy access to the LEMON Graph Library set of algorithms, written in C++. See the LEMON project page at <https://lemon.cs.elte.hu/trac/lemon>. Current LEMON version is 1.3.1.

Maintained by Josh Errickson. Last updated 2 months ago.

cpp

2.0 match 8 stars 7.04 score 1 scripts 13 dependents

zdebruine

RcppML:Rcpp Machine Learning Library

Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.

Maintained by Zach DeBruine. Last updated 2 years ago.

clustering matrix-factorization nmf rcpp rcppeigen sparse-matrix cpp openmp

1.3 match 104 stars 10.53 score 125 scripts 46 dependents

yusenzhang

qkerntool:Q-Kernel-Based and Conditionally Negative Definite Kernel-Based Machine Learning Tools

Nonlinear machine learning tool for classification, clustering and dimensionality reduction. It integrates 12 q-kernel functions and 15 conditional negative definite kernel functions and includes the q-kernel and conditional negative definite kernel version of density-based spatial clustering of applications with noise, spectral clustering, generalized discriminant analysis, principal component analysis, multidimensional scaling, locally linear embedding, sammon's mapping and t-Distributed stochastic neighbor embedding.

Maintained by Yusen Zhang. Last updated 6 years ago.

6.4 match 1 stars 2.19 score 31 scripts

kumes

chatAI4R:Chat-Based Interactive Artificial Intelligence for R

The Large Language Model (LLM) represents a groundbreaking advancement in data science and programming, and also allows us to extend the world of R. A seamless interface for integrating the 'OpenAI' Web APIs into R is provided in this package. This package leverages LLM-based AI techniques, enabling efficient knowledge discovery and data analysis (see 'OpenAI' Web APIs details <https://openai.com/blog/openai-api>). The previous functions such as seamless translation and image generation have been moved to other packages 'deepRstudio' and 'stableDiffusion4R'.

Maintained by Satoshi Kume. Last updated 1 months ago.

ai bioinformatics chatgpt gpt image image-generation

3.1 match 14 stars 4.45 score 3 scripts

bioc

systemPipeTools:Tools for data visualization

systemPipeTools package extends the widely used systemPipeR (SPR) workflow environment with an enhanced toolkit for data visualization, including utilities to automate the data visualizaton for analysis of differentially expressed genes (DEGs). systemPipeTools provides data transformation and data exploration functions via scatterplots, hierarchical clustering heatMaps, principal component analysis, multidimensional scaling, generalized principal components, t-Distributed Stochastic Neighbor embedding (t-SNE), and MA and volcano plots. All these utilities can be integrated with the modular design of the systemPipeR environment that allows users to easily substitute any of these features and/or custom with alternatives.

Maintained by Daniela Cassol. Last updated 5 months ago.

infrastructure dataimport sequencing qualitycontrol reportwriting experimentaldesign clustering differentialexpression multidimensionalscaling principalcomponent

3.4 match 4.00 score 4 scripts

mdsr-book

mdsr:Complement to 'Modern Data Science with R'

A complement to all editions of *Modern Data Science with R* (ISBN: 978-0367191498, publisher URL: <https://www.routledge.com/Modern-Data-Science-with-R/Baumer-Kaplan-Horton/p/book/9780367191498>). This package contains data and code to complete exercises and reproduce examples from the text. It also facilitates connections to the SQL database server used in the book. All editions of the book are supported by this package.

Maintained by Benjamin S. Baumer. Last updated 7 months ago.

1.9 match 38 stars 7.21 score 504 scripts

tughall

MaxWiK:Machine Learning Method Based on Isolation Kernel Mean Embedding

Incorporates Approximate Bayesian Computation to get a posterior distribution and to select a model optimal parameter for an observation point. Additionally, the meta-sampling heuristic algorithm is realized for parameter estimation, which requires no model runs and is dimension-independent. A sampling scheme is also presented that allows model runs and uses the meta-sampling for point generation. A predictor is realized as the meta-sampling for the model output. All the algorithms leverage a machine learning method utilizing the maxima weighted Isolation Kernel approach, or 'MaxWiK'. The method involves transforming raw data to a Hilbert space (mapping) and measuring the similarity between simulated points and the maxima weighted Isolation Kernel mapping corresponding to the observation point. Comprehensive details of the methodology can be found in the papers Iurii Nagornov (2024) <doi:10.1007/978-3-031-66431-1_16> and Iurii Nagornov (2023) <doi:10.1007/978-3-031-29168-5_18>.

Maintained by Yuri Nagornov. Last updated 4 months ago.

2.8 match 4.78 score 3 scripts

ltorgo

DMwR2:Functions and Data for the Second Edition of "Data Mining with R"

Functions and data accompanying the second edition of the book "Data Mining with R, learning with case studies" by Luis Torgo, published by CRC Press.

Maintained by Luis Torgo. Last updated 8 years ago.

1.7 match 27 stars 7.46 score 380 scripts 2 dependents

kloppen

rde:Reproducible Data Embedding

Allows caching of raw data directly in R code. This allows R scripts and R Notebooks to be shared and re-run on a machine without access to the original data. Cached data is encoded into an ASCII string that can be pasted into R code. When the code is run, the data is automatically loaded from the cached version if the original data file is unavailable. Works best for small datasets (a few hundred observations).

Maintained by Stefan Kloppenborg. Last updated 5 years ago.

3.3 match 1 stars 3.70 score 7 scripts

bioc

netSmooth:Network smoothing for scRNAseq

netSmooth is an R package for network smoothing of single cell RNA sequencing data. Using bio networks such as protein-protein interactions as priors for gene co-expression, netsmooth improves cell type identification from noisy, sparse scRNAseq data.

Maintained by Jonathan Ronen. Last updated 5 months ago.

network graphandnetwork singlecell rnaseq geneexpression sequencing transcriptomics normalization preprocessing clustering dimensionreduction bioinformatics genomics single-cell

1.7 match 27 stars 7.41 score 4 scripts

fkeck

subtools:Read and Manipulate Video Subtitles

A collection of functions to read, write and manipulate video subtitles. Supported formats include "srt", "subrip", "sub", "subviewer", "microdvd", "ssa", "ass", "substation", "vtt", and "webvtt".

Maintained by Francois Keck. Last updated 5 years ago.

3.5 match 45 stars 3.35 score

thomasp85

densityClust:Clustering by Fast Search and Find of Density Peaks

An improved implementation (based on k-nearest neighbors) of the density peak clustering algorithm, originally described by Alex Rodriguez and Alessandro Laio (Science, 2014 vol. 344). It can handle large datasets (> 100,000 samples) very efficiently. It was initially implemented by Thomas Lin Pedersen, with inputs from Sean Hughes and later improved by Xiaojie Qiu to handle large datasets with kNNs.

Maintained by Thomas Lin Pedersen. Last updated 1 years ago.

cpp

1.7 match 153 stars 7.14 score 75 scripts

r-forge

RHRV:Heart Rate Variability Analysis of ECG Data

Allows users to import data files containing heartbeat positions in the most broadly used formats, to remove outliers or points with unacceptable physiological values present in the time series, to plot HRV data, and to perform time domain, frequency domain and nonlinear HRV analysis. See Garcia et al. (2017) <DOI:10.1007/978-3-319-65355-6>.

Maintained by Leandro Rodriguez-Linares. Last updated 6 months ago.

1.7 match 6.79 score 63 scripts 1 dependents

carmonalab

GeneNMF:Non-Negative Matrix Factorization for Single-Cell Omics

A collection of methods to extract gene programs from single-cell gene expression data using non-negative matrix factorization (NMF). 'GeneNMF' contains functions to directly interact with the 'Seurat' toolkit and derive interpretable gene program signatures.

Maintained by Massimo Andreatta. Last updated 10 days ago.

1.8 match 102 stars 6.63 score 12 scripts

bioc

escheR:Unified multi-dimensional visualizations with Gestalt principles

The creation of effective visualizations is a fundamental component of data analysis. In biomedical research, new challenges are emerging to visualize multi-dimensional data in a 2D space, but current data visualization tools have limited capabilities. To address this problem, we leverage Gestalt principles to improve the design and interpretability of multi-dimensional data in 2D data visualizations, layering aesthetics to display multiple variables. The proposed visualization can be applied to spatially-resolved transcriptomics data, but also broadly to data visualized in 2D space, such as embedding visualizations. We provide this open source R package escheR, which is built off of the state-of-the-art ggplot2 visualization framework and can be seamlessly integrated into genomics toolboxes and workflows.

Maintained by Boyi Guo. Last updated 5 months ago.

spatial singlecell transcriptomics visualization software multidimensional single-cell spatial-omics

1.7 match 6 stars 6.74 score 153 scripts 1 dependents

rpuggaardrode

praatpicture:'Praat Picture' Style Plots of Acoustic Data

Quickly and easily generate plots of acoustic data aligned with transcriptions similar to those made in 'Praat' using either derived signals generated directly in R with 'wrassp' or imported derived signals from 'Praat'. Provides easy and fast out-of-the-box solutions but also a high extent of flexibility. Also provides options for embedding audio in figures and animating figures.

Maintained by Rasmus Puggaard-Rode. Last updated 20 days ago.

2.2 match 29 stars 5.28 score 3 scripts

alexander-pastukhov

eyelinkReader:Import Gaze Data for EyeLink Eye Tracker

Import gaze data from edf files generated by the SR Research <https://www.sr-research.com/> EyeLink eye tracker. Gaze data, both recorded events and samples, is imported per trial. The package allows to extract events of interest, such as saccades, blinks, etc. as well as recorded variables and custom events (areas of interest, triggers) into separate tables. The package requires EDF API library that can be obtained at <https://www.sr-research.com/support/>.

Maintained by Alexander Pastukhov. Last updated 3 months ago.

edf eye-tracking eyelink sr-research cpp

1.7 match 13 stars 6.52 score 34 scripts

kisungyou

maotai:Tools for Matrix Algebra, Optimization and Inference

Matrix is an universal and sometimes primary object/unit in applied mathematics and statistics. We provide a number of algorithms for selected problems in optimization and statistical inference. For general exposition to the topic with focus on statistical context, see the book by Banerjee and Roy (2014, ISBN:9781420095388).

Maintained by Kisung You. Last updated 4 hours ago.

openblas cpp openmp

2.0 match 8 stars 5.51 score 15 scripts 9 dependents

duncanobrien

EWSmethods:Forecasting Tipping Points at the Community Level

Rolling and expanding window approaches to assessing abundance based early warning signals, non-equilibrium resilience measures, and machine learning. See Dakos et al. (2012) <doi:10.1371/journal.pone.0041010>, Deb et al. (2022) <doi:10.1098/rsos.211475>, Drake and Griffen (2010) <doi:10.1038/nature09389>, Ushio et al. (2018) <doi:10.1038/nature25504> and Weinans et al. (2021) <doi:10.1038/s41598-021-87839-y> for methodological details. Graphical presentation of the outputs are also provided for clear and publishable figures. Visit the 'EWSmethods' website for more information, and tutorials.

Maintained by Duncan OBrien. Last updated 7 months ago.

2.0 match 8 stars 5.51 score 20 scripts

bioc

CatsCradle:This package provides methods for analysing spatial transcriptomics data and for discovering gene clusters

This package addresses two broad areas. It allows for in-depth analysis of spatial transcriptomic data by identifying tissue neighbourhoods. These are contiguous regions of tissue surrounding individual cells. 'CatsCradle' allows for the categorisation of neighbourhoods by the cell types contained in them and the genes expressed in them. In particular, it produces Seurat objects whose individual elements are neighbourhoods rather than cells. In addition, it enables the categorisation and annotation of genes by producing Seurat objects whose elements are genes.

Maintained by Michael Shapiro. Last updated 1 months ago.

biologicalquestion statisticalmethod geneexpression singlecell transcriptomics spatial

1.7 match 3 stars 6.50 score

dvrbts

labdsv:Ordination and Multivariate Analysis for Ecology

A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.

Maintained by David W. Roberts. Last updated 2 years ago.

fortran

1.8 match 3 stars 6.08 score 452 scripts 13 dependents

minoo-asty

CINNA:Deciphering Central Informative Nodes in Network Analysis

Computing, comparing, and demonstrating top informative centrality measures within a network. "CINNA: an R/CRAN package to decipher Central Informative Nodes in Network Analysis" provides a comprehensive overview of the package functionality Ashtiani et al. (2018) <doi:10.1093/bioinformatics/bty819>.

Maintained by Minoo Ashtiani. Last updated 2 years ago.

3.3 match 1 stars 3.29 score 98 scripts

jpfitzinger

tidyfit:Regularized Linear Modeling with Tidy Data

An extension to the 'R' tidy data environment for automated machine learning. The package allows fitting and cross validation of linear regression and classification algorithms on grouped data.

Maintained by Johann Pfitzinger. Last updated 2 months ago.

auto-ml classification machine-learning regression tidyverse

1.5 match 16 stars 7.22 score 26 scripts

bnosac

sentencepiece:Text Tokenization using Byte Pair Encoding and Unigram Modelling

Unsupervised text tokenizer allowing to perform byte pair encoding and unigram modelling. Wraps the 'sentencepiece' library <https://github.com/google/sentencepiece> which provides a language independent tokenizer to split text in words and smaller subword units. The techniques are explained in the paper "SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing" by Taku Kudo and John Richardson (2018) <doi:10.18653/v1/D18-2012>. Provides as well straightforward access to pretrained byte pair encoding models and subword embeddings trained on Wikipedia using 'word2vec', as described in "BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages" by Benjamin Heinzerling and Michael Strube (2018) <http://www.lrec-conf.org/proceedings/lrec2018/pdf/1049.pdf>.

Maintained by Jan Wijffels. Last updated 2 years ago.

byte natural-language-processing sentencepiece word-segmentation cpp

2.6 match 25 stars 4.10 score 8 scripts

muschellij2

cifti:Toolbox for Connectivity Informatics Technology Initiative ('CIFTI') Files

Functions for the input/output and visualization of medical imaging data in the form of 'CIFTI' files <https://www.nitrc.org/projects/cifti/>.

Maintained by John Muschelli. Last updated 5 years ago.

1.9 match 3 stars 5.76 score 96 scripts

mengxu98

inferCSN:Inferring Cell-Specific Gene Regulatory Network

An R package for inferring cell-type specific gene regulatory network from single-cell RNA data.

Maintained by Meng Xu. Last updated 4 days ago.

openblas cpp

2.3 match 3 stars 4.79 score 6 scripts

kisungyou

NetworkDistance:Distance Measures for Networks

Network is a prevalent form of data structure in many fields. As an object of analysis, many distance or metric measures have been proposed to define the concept of similarity between two networks. We provide a number of distance measures for networks. See Jurman et al (2011) <doi:10.3233/978-1-60750-692-8-227> for an overview on spectral class of inter-graph distance measures.

Maintained by Kisung You. Last updated 2 years ago.

distance network network-analysis openblas cpp openmp

1.9 match 9 stars 5.58 score 28 scripts 1 dependents

scottgigante

phateR:PHATE - Potential of Heat-Diffusion for Affinity-Based Transition Embedding

PHATE is a tool for visualizing high dimensional single-cell data with natural progressions or trajectories. PHATE uses a novel conceptual framework for learning and visualizing the manifold inherent to biological systems in which smooth transitions mark the progressions of cells from one state to another. To see how PHATE can be applied to single-cell RNA-seq datasets from hematopoietic stem cells, human embryonic stem cells, and bone marrow samples, check out our publication in Nature Biotechnology at <doi:10.1038/s41587-019-0336-3>.

Maintained by Scott Gigante. Last updated 4 years ago.

2.8 match 1 stars 3.72 score 262 scripts

cran

Mercator:Clustering and Visualizing Distance Matrices

Defines the classes used to explore, cluster and visualize distance matrices, especially those arising from binary data. See Abrams and colleagues, 2021, <doi:10.1093/bioinformatics/btab037>.

Maintained by Kevin R. Coombes. Last updated 5 months ago.

clustering

2.4 match 4.33 score 12 scripts 1 dependents

antoniofabio

tseriesChaos:Analysis of Nonlinear Time Series

Routines for the analysis of nonlinear time series. This work is largely inspired by the TISEAN project, by Rainer Hegger, Holger Kantz and Thomas Schreiber: <http://www.mpipks-dresden.mpg.de/~tisean/>.

Maintained by Antonio Fabio Di Narzo. Last updated 6 years ago.

2.0 match 3 stars 5.12 score 106 scripts 6 dependents

spgarbet

tangram:The Grammar of Tables

Provides an extensible formula system to quickly and easily create production quality tables. The processing steps are a formula parser, statistical content generation from data as defined by formula, followed by rendering into a table. Each step of the processing is separate and user definable thus creating a set of composable building blocks for highly customizable table generation. A user is not limited by any of the choices of the package creator other than the formula grammar. For example, one could chose to add a different S3 rendering function and output a format not provided in the default package, or possibly one would rather have Gini coefficients for their statistical content in a resulting table. Routines to achieve New England Journal of Medicine style, Lancet style and Hmisc::summaryM() statistics are provided. The package contains rendering for HTML5, Rmarkdown and an indexing format for use in tracing and tracking are provided.

Maintained by Shawn Garbett. Last updated 2 years ago.

1.7 match 68 stars 5.93 score 62 scripts

tidyverse

haven:Import and Export 'SPSS', 'Stata' and 'SAS' Files

Import foreign statistical formats into R via the embedded 'ReadStat' C library, <https://github.com/WizardMac/ReadStat>.

Maintained by Hadley Wickham. Last updated 5 months ago.

sas spss stata zlib cpp

0.5 match 427 stars 18.63 score 18k scripts 682 dependents

ravingmantis

unittest:TAP-Compliant Unit Testing

Concise TAP <http://testanything.org/> compliant unit testing package. Authored tests can be run using CMD check with minimal implementation overhead.

Maintained by Jamie Lentin. Last updated 7 months ago.

1.3 match 4 stars 7.43 score 224 scripts

bioc

RDRToolbox:A package for nonlinear dimension reduction with Isomap and LLE.

A package for nonlinear dimension reduction using the Isomap and LLE algorithm. It also includes a routine for computing the Davis-Bouldin-Index for cluster validation, a plotting tool and a data generator for microarray gene expression data and for the Swiss Roll dataset.

Maintained by Christoph Bartenhagen. Last updated 5 months ago.

dimensionreduction featureextraction visualization clustering microarray

2.0 match 4.88 score 54 scripts

bioc

scBFA:A dimensionality reduction tool using gene detection pattern to mitigate noisy expression profile of scRNA-seq

This package is designed to model gene detection pattern of scRNA-seq through a binary factor analysis model. This model allows user to pass into a cell level covariate matrix X and gene level covariate matrix Q to account for nuisance variance(e.g batch effect), and it will output a low dimensional embedding matrix for downstream analysis.

Maintained by Ruoxin Li. Last updated 5 months ago.

singlecell transcriptomics dimensionreduction geneexpression atacseq batcheffect kegg qualitycontrol

2.3 match 4.30 score 4 scripts

core-bioinformatics

ClustAssess:Tools for Assessing Clustering

A set of tools for evaluating clustering robustness using proportion of ambiguously clustered pairs (Senbabaoglu et al. (2014) <doi:10.1038/srep06207>), as well as similarity across methods and method stability using element-centric clustering comparison (Gates et al. (2019) <doi:10.1038/s41598-019-44892-y>). Additionally, this package enables stability-based parameter assessment for graph-based clustering pipelines typical in single-cell data analysis.

Maintained by Andi Munteanu. Last updated 1 months ago.

software singlecell rnaseq atacseq normalization preprocessing dimensionreduction visualization qualitycontrol clustering classification annotation geneexpression differentialexpression bioinformatics genomics machine-learning parameter-optimization robustness single-cell unsupervised-learning cpp

1.7 match 23 stars 5.70 score 18 scripts

jsandube

DChaos:Chaotic Time Series Analysis

Chaos theory has been hailed as a revolution of thoughts and attracting ever increasing attention of many scientists from diverse disciplines. Chaotic systems are nonlinear deterministic dynamic systems which can behave like an erratic and apparently random motion. A relevant field inside chaos theory and nonlinear time series analysis is the detection of a chaotic behaviour from empirical time series data. One of the main features of chaos is the well known initial value sensitivity property. Methods and techniques related to test the hypothesis of chaos try to quantify the initial value sensitive property estimating the Lyapunov exponents. The DChaos package provides different useful tools and efficient algorithms which test robustly the hypothesis of chaos based on the Lyapunov exponent in order to know if the data generating process behind time series behave chaotically or not.

Maintained by Julio E. Sandubete. Last updated 2 years ago.

4.8 match 1 stars 2.00 score 9 scripts

mlverse

cuda.ml:R Interface for the RAPIDS cuML Suite of Libraries

R interface for RAPIDS cuML (<https://github.com/rapidsai/cuml>), a suite of GPU-accelerated machine learning libraries powered by CUDA (<https://en.wikipedia.org/wiki/CUDA>).

Maintained by Daniel Falbel. Last updated 3 years ago.

gpu machine-learning cpp

1.8 match 33 stars 5.27 score 57 scripts

steverozen

ICAMS:In-Depth Characterization and Analysis of Mutational Signatures ('ICAMS')

Analysis and visualization of experimentally elucidated mutational signatures -- the kind of analysis and visualization in Boot et al., "In-depth characterization of the cisplatin mutational signature in human cell lines and in esophageal and liver tumors", Genome Research 2018, <doi:10.1101/gr.230219.117> and "Characterization of colibactin-associated mutational signature in an Asian oral squamous cell carcinoma and in other mucosal tumor types", Genome Research 2020 <doi:10.1101/gr.255620.119>. 'ICAMS' stands for In-depth Characterization and Analysis of Mutational Signatures. 'ICAMS' has functions to read in variant call files (VCFs) and to collate the corresponding catalogs of mutational spectra and to analyze and plot catalogs of mutational spectra and signatures. Handles both "counts-based" and "density-based" (i.e. representation as mutations per megabase) mutational spectra or signatures.

Maintained by Steve Rozen. Last updated 3 years ago.

1.8 match 8 stars 5.41 score 128 scripts

nbarrowman

vtree:Display Information About Nested Subsets of a Data Frame

A tool for calculating and drawing "variable trees". Variable trees display information about nested subsets of a data frame.

Maintained by Nick Barrowman. Last updated 13 hours ago.

data-science data-visualization exploratory-data-analysis statistics

1.3 match 76 stars 7.09 score 65 scripts

carlganz

docuSignr:Connect to 'DocuSign' API

Connect to the 'DocuSign' Rest API <https://www.docusign.com/p/RESTAPIGuide/RESTAPIGuide.htm>, which supports embedded signing, and sending of documents.

Maintained by Carl Ganz. Last updated 6 years ago.

curl docusign httr

2.8 match 5 stars 3.40 score 10 scripts

bioc

iSEEde:iSEE extension for panels related to differential expression analysis

This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.

Maintained by Kevin Rue-Albrecht. Last updated 4 months ago.

software infrastructure differentialexpression bioconductor hacktoberfest iseeu

1.8 match 1 stars 5.38 score 15 scripts

byzheng

rtiddlywiki:R Interface for 'TiddlyWiki'

'TiddlyWiki' is a unique non-linear notebook for capturing, organising and sharing complex information. 'rtiddlywiki' is a R interface of 'TiddlyWiki' <https://tiddlywiki.com> to create new tiddler from Rmarkdown file, and then put into a local 'TiddlyWiki' node.js server if it is available.

Maintained by Bangyou Zheng. Last updated 12 days ago.

tiddlywki

1.7 match 3 stars 5.45 score 7 scripts

juba

robservable:Import an Observable Notebook as HTML Widget

Allows loading and displaying an Observable notebook (online JavaScript notebooks powered by <https://observablehq.com>) as an HTML Widget in an R session, 'shiny' application or 'rmarkdown' document.

Maintained by Julien Barnier. Last updated 7 months ago.

htmlwidgets observable

1.3 match 165 stars 7.00 score 40 scripts