R-universe search: knowledge

chockemeyer

kstMatrix:Basic Functions in Knowledge Space Theory Using Matrix Representation

Knowledge space theory by Doignon and Falmagne (1999) <doi:10.1007/978-3-642-58625-5> is a set- and order-theoretical framework, which proposes mathematical formalisms to operationalize knowledge structures in a particular domain. The 'kstMatrix' package provides basic functionalities to generate, handle, and manipulate knowledge structures and knowledge spaces. In contrast to the 'kst' package, 'kstMatrix' uses matrix representations for knowledge structures. Furthermore, 'kstMatrix' contains several knowledge spaces developed by the research group around Cornelia Dowling through querying experts.

Maintained by Cord Hockemeyer. Last updated 2 months ago.

52.1 match 2 stars 3.43 score 15 scripts 1 dependents
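
To make the matrix representation concrete, here is a minimal sketch in Python (not R, and not the 'kstMatrix' API). It assumes the usual definitions from knowledge space theory: a knowledge structure contains the empty state and the full domain, and a knowledge space is additionally closed under union. The function names and the example matrix are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_knowledge_structure(states):
    """True if the family contains the empty state and the full item set."""
    rows = {tuple(r) for r in states}
    n_items = len(states[0])
    return (0,) * n_items in rows and (1,) * n_items in rows


def is_knowledge_space(states):
    """True if the family is a knowledge structure closed under union."""
    if not is_knowledge_structure(states):
        return False
    rows = {tuple(r) for r in states}
    # For a finite family, closure under pairwise union is sufficient.
    for a, b in combinations(rows, 2):
        union = tuple(x | y for x, y in zip(a, b))
        if union not in rows:
            return False
    return True


# Hypothetical example: columns are items a, b, c; each row is one
# knowledge state (1 = item mastered, 0 = not mastered).
structure = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
]
# Adding the union of {a} and {b} makes the family union-closed.
space = structure + [[1, 1, 0]]

print(is_knowledge_space(structure))  # False: {a} | {b} is missing
print(is_knowledge_space(space))      # True
```

One row per state and one column per item keeps the union of two states a simple element-wise OR, which is the main convenience of a matrix encoding.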

satopaa

metaggR:Calculate the Knowledge-Weighted Estimate

According to a phenomenon known as "the wisdom of the crowds," combining point estimates from multiple judges often provides a more accurate aggregate estimate than using a point estimate from a single judge. However, if the judges use shared information in their estimates, the simple average will over-emphasize this common component at the expense of the judges’ private information. Asa Palley & Ville Satopää (2021) "Boosting the Wisdom of Crowds Within a Single Judgment Problem: Selective Averaging Based on Peer Predictions" <https://papers.ssrn.com/sol3/Papers.cfm?abstract_id=3504286> propose a procedure for calculating a weighted average of the judges’ individual estimates such that the resulting aggregate estimate appropriately combines the judges' collective information within a single estimation problem. The authors use both simulation and data from six experimental studies to illustrate that the weighting procedure outperforms existing averaging-like methods, such as the equally weighted average, trimmed average, and median. This aggregate estimate -- known as "the knowledge-weighted estimate" -- takes as inputs a) judges' estimates of a continuous outcome (E) and b) predictions of others' average estimate of this outcome (P). In this R package, the function knowledge_weighted_estimate(E,P) implements the knowledge-weighted estimate. Its use is illustrated with a simple stylized example and on real-world experimental data.

Maintained by Ville Satopää. Last updated 3 years ago.

52.4 match 1 stars 2.85 score 14 scripts

chockemeyer

kst:Knowledge Space Theory

Knowledge space theory by Doignon and Falmagne (1999) <doi:10.1007/978-3-642-58625-5> is a set- and order-theoretical framework, which proposes mathematical formalisms to operationalize knowledge structures in a particular domain. The 'kst' package provides basic functionalities to generate, handle, and manipulate knowledge structures and knowledge spaces.

Maintained by Cord Hockemeyer. Last updated 2 years ago.

37.5 match 6 stars 3.36 score 38 scripts

patzaw

TKCat:Tailored Knowledge Catalog

Facilitate the management of data from knowledge resources that are frequently used alone or together in research environments. In 'TKCat', knowledge resources are manipulated as modeled database (MDB) objects. These objects provide access to the data tables along with a general description of the resource and a detailed data model documenting the tables, their fields and their relationships. These MDBs are then gathered in catalogs that can be easily explored and shared. Finally, 'TKCat' provides tools to easily subset, filter and combine MDBs and create new catalogs suited for specific needs.

Maintained by Patrice Godard. Last updated 2 days ago.

19.8 match 5 stars 6.08 score 27 scripts

soodoku

guess:Adjust Estimates of Learning for Guessing

Adjust Estimates of Learning for Guessing. The package provides standard guessing correction, and a latent class model that leverages informative pre-post transitions. For details of the latent class model, see <http://gsood.com/research/papers/guess.pdf>.

Maintained by Gaurav Sood. Last updated 3 years ago.

adjust-estimates bias learning

19.4 match 3 stars 4.29 score 13 scripts

kjhealy

gssrdoc:Document General Social Survey Variable

The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.

Maintained by Kieran Healy. Last updated 11 months ago.

36.3 match 2.28 score 38 scripts

bioc

fobitools:Tools for Manipulating the FOBI Ontology

A set of tools for interacting with the Food-Biomarker Ontology (FOBI). A collection of basic manipulation tools for biological significance analysis, graphs, and text mining strategies for annotating nutritional data.

Maintained by Pol Castellano-Escuder. Last updated 4 months ago.

massspectrometry metabolomics software visualization biomedicalinformatics graphandnetwork annotation cheminformatics pathways genesetenrichment biological-intrerpretation biological-knowledge biological-significance-analysis enrichment-analysis food-biomarker-ontology knowledge-graph nutrition obofoundry ontology text-mining

15.0 match 1 stars 5.08 score 5 scripts

thomaschln

kgraph:Knowledge Graphs Constructions and Visualizations

Knowledge graphs make it possible to efficiently visualize and gain insights into large-scale data analysis results, such as p-values from multiple studies or embedding data matrices. In the usual workflow, a user provides a data frame of association study results and specifies target nodes, e.g. phenotypes, to visualize. The knowledge graph then shows all the features which are significantly associated with the phenotype, with the edges being proportional to the association scores. As the user adds several target nodes and grouping information about the nodes such as biological pathways, the construction of such graphs soon becomes complex. The 'kgraph' package aims to enable users to easily build such knowledge graphs, and provides two main features: first, building a knowledge graph based on a data frame of concept relationships, be it p-values or cosine similarities; second, determining an appropriate cut-off on cosine similarities from a complete embedding matrix, so that a knowledge graph can be built directly from an embedding matrix. The 'kgraph' package provides several display, layout and cut-off options, and has already proven useful to researchers for visualizing large sets of p-value associations with various phenotypes and for quickly visualizing embedding results. Two example datasets are provided to demonstrate these behaviors, and several live 'shiny' applications are hosted by the CELEHS laboratory and Parse Health, such as the KESER Mental Health application <https://keser-mental-health.parse-health.org/> based on Hong C. (2021) <doi:10.1038/s41746-021-00519-z>.

Maintained by Thomas Charlon. Last updated 25 days ago.

14.5 match 4.85 score

spectra-to-knowledge

SpectraToQueries:Spectra to queries

SpectraToQueries provides the infrastructure to translate spectra to queries.

Maintained by Adriano Rutz. Last updated 21 days ago.

knowledge extraction spectral information querying system

22.5 match 1 stars 3.02 score

cran

pks:Probabilistic Knowledge Structures

Fitting and testing probabilistic knowledge structures, especially the basic local independence model (BLIM, Doignon & Falmagne, 1999) and the simple learning model (SLM), using the minimum discrepancy maximum likelihood (MDML) method (Heller & Wickelmaier, 2013 <doi:10.1016/j.endm.2013.05.145>).

Maintained by Florian Wickelmaier. Last updated 6 months ago.

24.3 match 1 stars 2.78 score 2 dependents

bioc

CellNOptR:Training of boolean logic models of signalling networks using prior knowledge networks and perturbation data

This package does optimisation of boolean logic networks of signalling pathways based on a previous knowledge network and a set of data upon perturbation of the nodes in the network.

Maintained by Attila Gabor. Last updated 5 months ago.

cellbasedassays cellbiology proteomics pathways network timecourse immunooncology

8.8 match 6.72 score 98 scripts 6 dependents

tkcaccia

KODAMA:Knowledge Discovery by Accuracy Maximization

An unsupervised and semi-supervised learning algorithm that performs feature extraction from noisy and high-dimensional data. It facilitates identification of patterns representing underlying groups on all samples in a data set. Based on Cacciatore S, Tenori L, Luchinat C, Bennett PR, MacIntyre DA. (2017) Bioinformatics <doi:10.1093/bioinformatics/btw705> and Cacciatore S, Luchinat C, Tenori L. (2014) Proc Natl Acad Sci USA <doi:10.1073/pnas.1220873111>.

Maintained by Stefano Cacciatore. Last updated 23 hours ago.

openblas cpp

8.1 match 1 stars 7.00 score 63 scripts 1 dependents

hope-data-science

akc:Automatic Knowledge Classification

A tidy framework for automatic knowledge classification and visualization. Currently, the core functionality of the framework is mainly supported by modularity-based clustering (community detection) in keyword co-occurrence network, and focuses on co-word analysis of bibliometric research. However, the designed functions in 'akc' are general, and could be extended to solve other tasks in text mining as well.

Maintained by Tian-Yuan Huang. Last updated 20 days ago.

9.7 match 15 stars 5.85 score 47 scripts

erossiter

catSurv:Computerized Adaptive Testing for Survey Research

Provides methods of computerized adaptive testing for survey researchers. See Montgomery and Rossiter (2020) <doi:10.1093/jssam/smz027>. Includes functionality for data fit with the classic item response methods including the latent trait model, Birnbaum's three parameter model, the graded response, and the generalized partial credit model. Additionally, includes several ability parameter estimation and item selection routines. During item selection, all calculations are done in compiled C++ code.

Maintained by Erin Rossiter. Last updated 10 months ago.

gsl cpp

11.5 match 12 stars 4.68 score 3 scripts

annajenul

UBayFS:A User-Guided Bayesian Framework for Ensemble Feature Selection (UBayFS)

Implements the user-guided Bayesian framework for ensemble feature selection (UBayFS) : Jenul et al., (2022) <doi:10.1007/s10994-022-06221-9>.

Maintained by Anna Jenul. Last updated 2 years ago.

bayesian-statistics ensemble-models feature-selection user-knowledge

10.3 match 5 stars 5.11 score 13 scripts

jimbrig

rtraining:R Training Resources, Guides, Tips, and Knowledge Base

Houses various materials related to teaching R.

Maintained by Jimmy Briggs. Last updated 2 years ago.

best-practices curation developer-tools development development-environment guide knowledge package-development setup shiny-apps tips-and-tricks training training-materials walkthrough

12.9 match 4 stars 3.60 score 6 scripts

friendly

vcdExtra:'vcd' Extensions and Additions

Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.

Maintained by Michael Friendly. Last updated 5 months ago.

categorical-data-visualization generalized-linear-models mosaic-plots

4.0 match 24 stars 10.34 score 472 scripts 3 dependents

nliulab

AutoScore:An Interpretable Machine Learning-Based Automatic Clinical Score Generator

A novel interpretable machine learning-based framework to automate the development of a clinical scoring model for predefined outcomes. Our novel framework consists of six modules: variable ranking with machine learning, variable transformation, score derivation, model selection, domain knowledge-based score fine-tuning, and performance evaluation. The details are described in our research paper <doi:10.2196/21798>. Users or clinicians could seamlessly generate parsimonious sparse-score risk models (i.e., risk scores), which can be easily implemented and validated in clinical practice. We hope to see its application in various medical case studies.

Maintained by Feng Xie. Last updated 15 days ago.

5.3 match 32 stars 7.70 score 30 scripts

beerda

nuggets:Extensible Data Pattern Searching Framework

Extensible framework for subgroup discovery (Atzmueller (2015) <doi:10.1002/widm.1144>), contrast patterns (Chen (2022) <doi:10.48550/arXiv.2209.13556>), emerging patterns (Dong (1999) <doi:10.1145/312129.312191>), association rules (Agrawal (1994) <https://www.vldb.org/conf/1994/P487.PDF>) and conditional correlations (Hájek (1978) <doi:10.1007/978-3-642-66943-9>). Both crisp (Boolean, binary) and fuzzy data are supported. It generates conditions in the form of elementary conjunctions, evaluates them on a dataset and checks the induced sub-data for interesting statistical properties. A user-defined function may be evaluated on each generated condition to search for custom patterns.

Maintained by Michal Burda. Last updated 4 days ago.

association-rule-mining contrast-pattern-mining data-mining fuzzy knowledge-discovery pattern-recognition cpp openmp

7.5 match 2 stars 5.38 score 10 scripts

bayesball

LearnBayes:Learning Bayesian Inference

Contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.

Maintained by Jim Albert. Last updated 7 years ago.

3.4 match 38 stars 11.34 score 690 scripts 31 dependents

bioc

OmnipathR:OmniPath web service client and more

A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).

Maintained by Denes Turei. Last updated 19 days ago.

graphandnetwork network pathways software thirdpartyclient dataimport datarepresentation genesignaling generegulation systemsbiology transcriptomics singlecell annotation kegg complexes enzyme-ptm networks networks-biology omnipath proteins quarto

3.6 match 126 stars 9.90 score 226 scripts 2 dependents

paballand

EconGeo:Computing Key Indicators of the Spatial Distribution of Economic Activities

Functions to compute a series of indices commonly used in the fields of economic geography, economic complexity, and evolutionary economics to describe the location, distribution, spatial organization, structure, and complexity of economic activities. Functions include basic spatial indicators such as the location quotient, the Krugman specialization index, the Herfindahl or the Shannon entropy indices but also more advanced functions to compute different forms of normalized relatedness between economic activities or network-based measures of economic complexity. Most of the functions use matrix calculus and are based on bipartite (incidence) matrices consisting of region - industry pairs.

Maintained by Pierre-Alexandre Balland. Last updated 2 years ago.

6.8 match 41 stars 4.96 score 44 scripts
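
As an illustration of the kind of index computed from a region-by-industry matrix, here is a minimal Python sketch of the location quotient and the Herfindahl index using their textbook definitions. It does not use or reproduce the 'EconGeo' functions; the function names and the toy matrix are hypothetical, and the sketch assumes every region and every industry has a non-zero total.

```python
def location_quotient(mat):
    """Location quotient for each region (row) and industry (column).

    LQ = (industry share within the region) / (industry share overall).
    Assumes all row and column totals are non-zero.
    """
    row_tot = [sum(row) for row in mat]
    col_tot = [sum(col) for col in zip(*mat)]
    total = sum(row_tot)
    return [
        [(x / row_tot[i]) / (col_tot[j] / total) for j, x in enumerate(row)]
        for i, row in enumerate(mat)
    ]


def herfindahl(mat):
    """Herfindahl index per region: sum of squared industry shares."""
    return [sum((x / sum(row)) ** 2 for x in row) for row in mat]


# Hypothetical employment counts: 2 regions (rows) x 2 industries (columns).
employment = [
    [10, 30],
    [30, 30],
]

lq = location_quotient(employment)
hhi = herfindahl(employment)
print(lq)
print(hhi)
```

A location quotient above 1 indicates that the region is more specialized in that industry than the economy as a whole; a Herfindahl index closer to 1 indicates that a region's activity is concentrated in fewer industries.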

bioc

cosmosR:COSMOS (Causal Oriented Search of Multi-Omic Space)

COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets based on prior knowledge of signaling, metabolic, and gene regulatory networks. It estimates the activities of transcription factors and kinases and finds a network-level causal reasoning. Thereby, COSMOS provides mechanistic hypotheses for experimental observations across multi-omics datasets.

Maintained by Attila Gabor. Last updated 5 months ago.

cellbiology pathways network proteomics metabolomics transcriptomics genesignaling data-integration metabolomic-data network-modelling phosphoproteomics

4.3 match 59 stars 7.22 score 35 scripts

gobbios

EloRating:Animal Dominance Hierarchies by Elo Rating

Provides functions to quantify animal dominance hierarchies. The major focus is on Elo rating and its ability to deal with temporal dynamics in dominance interaction sequences. For static data, David's score and de Vries' I&SI are also implemented. In addition, the package provides functions to assess transitivity, linearity and stability of dominance networks. See Neumann et al (2011) <doi:10.1016/j.anbehav.2011.07.016> for an introduction.

Maintained by Christof Neumann. Last updated 8 months ago.

cpp

4.5 match 4 stars 6.86 score 61 scripts 1 dependents
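
To show how Elo rating follows temporal dynamics in a sequence of interactions, here is a minimal Python sketch of the textbook logistic Elo update. It is not the 'EloRating' implementation, whose winning-probability function and defaults may differ; the start value, the constant k, and the function names are assumptions made for illustration.

```python
def expected_win(r_a, r_b):
    """Textbook logistic Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_sequence(interactions, start=1000.0, k=100.0):
    """Process (winner, loser) pairs in order and return final ratings.

    start and k are illustrative defaults, not values taken from any package.
    """
    ratings = {}
    for winner, loser in interactions:
        r_w = ratings.setdefault(winner, start)
        r_l = ratings.setdefault(loser, start)
        # The winner gains more when the win was less expected.
        delta = k * (1.0 - expected_win(r_w, r_l))
        ratings[winner] = r_w + delta
        ratings[loser] = r_l - delta
    return ratings


# Hypothetical interaction sequence between two individuals.
first = elo_sequence([("a", "b")])
print(first)  # {'a': 1050.0, 'b': 950.0}

# An upset (lower-rated "b" beats "a") shifts ratings by more than k / 2.
after_upset = elo_sequence([("a", "b"), ("b", "a")])
print(after_upset)
```

Because ratings are updated after every interaction, the order of the sequence matters, which is what lets the method track changes in a hierarchy over time.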

optimal-learning-lab

LKT:Logistic Knowledge Tracing

Computes Logistic Knowledge Tracing ('LKT') which is a general method for tracking human learning in an educational software system. Please see Pavlik, Eglington, and Harrel-Williams (2021) <https://ieeexplore.ieee.org/document/9616435>. 'LKT' is a method to compute features of student data that are used as predictors of subsequent performance. 'LKT' allows great flexibility in the choice of predictive components and features computed for these predictive components. The system is built on top of 'LiblineaR', which enables extremely fast solutions compared to base glm() in R.

Maintained by Philip I. Pavlik Jr. Last updated 9 months ago.

5.0 match 12 stars 5.84 score 29 scripts

enblacar

SCpubr:Generate Publication Ready Visualizations of Single Cell Transcriptomics Data

A system that provides a streamlined way of generating publication-ready plots for known single-cell transcriptomics data. That is, the goal is to automatically generate plots of the highest quality possible that can be used right away or with minimal modifications for a research article.

Maintained by Enrique Blanco-Carmona. Last updated 1 month ago.

software singlecell visualization data-visualization ggplot2 publication-quality-plots seurat single-cell single-cell-genomics single-cell-rna-seq

3.4 match 178 stars 8.71 score 194 scripts

pecanproject

PEcAn.priors:PEcAn Functions Used to Estimate Priors from Data

Functions to estimate priors from data.

Maintained by David LeBauer. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

2.9 match 216 stars 9.93 score 13 scripts 6 dependents

ropensci

refsplitr:author name disambiguation, author georeferencing, and mapping of coauthorship networks with 'Web of Science' data

Tools to parse and organize reference records downloaded from the 'Web of Science' citation database into an R-friendly format, disambiguate the names of authors, geocode their locations, and generate/visualize coauthorship networks. This package has been peer-reviewed by rOpenSci (v. 1.0).

Maintained by Emilio Bruna. Last updated 7 months ago.

name disambiguation bibliometrics coauthorship collaboration georeferencing metascience references scientometrics science of science web of science

5.1 match 55 stars 5.64 score 16 scripts

barnzilla

capl:Compute and Visualize CAPL-2 Scores and Interpretations

A toolkit for computing and visualizing CAPL-2 (Canadian Assessment of Physical Literacy, Second Edition; <https://www.capl-eclp.ca>) scores and interpretations from raw data.

Maintained by Joel Barnes. Last updated 3 years ago.

7.0 match 2 stars 4.00 score 2 scripts

cran

DAKS:Data Analysis and Knowledge Spaces

Functions and an example dataset for the psychometric theory of knowledge spaces. This package implements data analysis methods and procedures for simulating data and quasi orders and transforming different formulations in knowledge space theory. See package?DAKS for an overview.

Maintained by Ali Uenlue. Last updated 9 years ago.

14.0 match 2.00 score

sbgraves237

Ecdat:Data Sets for Econometrics

Data sets for econometrics, including political science.

Maintained by Spencer Graves. Last updated 4 months ago.

3.8 match 2 stars 7.25 score 740 scripts 3 dependents

bioc

decoupleR:decoupleR: Ensemble of computational methods to infer biological activities from omics data

Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.

Maintained by Pau Badia-i-Mompel. Last updated 5 months ago.

differentialexpression functionalgenomics geneexpression generegulation network software statisticalmethod transcription

2.3 match 230 stars 11.27 score 316 scripts 3 dependents

chockemeyer

kstIO:Knowledge Space Theory Input/Output

Knowledge space theory by Doignon and Falmagne (1999) <doi:10.1007/978-3-642-58625-5> is a set- and order-theoretical framework which proposes mathematical formalisms to operationalize knowledge structures in a particular domain. The 'kstIO' package provides basic functionalities to read and write KST data from/to files to be used together with the 'kst', 'kstMatrix', 'CDSS', 'pks', or 'DAKS' packages.

Maintained by Cord Hockemeyer. Last updated 2 months ago.

13.1 match 2.00 score 8 scripts

ropensci

ckanr:Client for the Comprehensive Knowledge Archive Network ('CKAN') API

Client for 'CKAN' API (<https://ckan.org/>). Includes interface to 'CKAN' 'APIs' for search, list, show for packages, organizations, and resources. In addition, provides an interface to the 'datastore' API.

Maintained by Francisco Alves. Last updated 2 years ago.

database open-data ckan api data dataset api-wrapper ckan-api

2.9 match 100 stars 8.67 score 448 scripts 4 dependents

racorreia

gkgraphR:Accessing the Official 'Google Knowledge Graph' API

A simple way to interact with and extract data from the official 'Google Knowledge Graph' API <https://developers.google.com/knowledge-graph/>.

Maintained by Ricardo Correia. Last updated 4 years ago.

5.5 match 5 stars 4.40 score 3 scripts

kwb-r

kwb.endnote:Helper Functions for Analysing KWB Endnote Library (Exported as .xml)

Helper Functions For Analysing KWB Endnote Library (Exported As .XML).

Maintained by Michael Rustler. Last updated 4 years ago.

endnote knowledge-repo literature-data-management project-fakin publication

7.5 match 3.00 score 2 scripts

moosa-r

rbioapi:User-Friendly R Interface to Biologic Web Services' API

Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services. In a way that insulates the user from the technicalities of using web services API and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; New databases and services will be added periodically. Feel free to suggest any databases or services you often use.

Maintained by Moosa Rezwani. Last updated 1 month ago.

api-client bioinformatics biology enrichment enrichment-analysis enrichr jaspar mieaa over-representation-analysis panther reactome string uniprot

3.0 match 20 stars 7.60 score 55 scripts

joliencremers

bpnreg:Bayesian Projected Normal Regression Models for Circular Data

Fitting Bayesian multiple and mixed-effect regression models for circular data based on the projected normal distribution. Both continuous and categorical predictors can be included. Sampling from the posterior is performed via an MCMC algorithm. Posterior descriptives of all parameters, model fit statistics and Bayes factors for hypothesis tests for inequality constrained hypotheses are provided. See Cremers, Mulder & Klugkist (2018) <doi:10.1111/bmsp.12108> and Nuñez-Antonio & Gutiérrez-Peña (2014) <doi:10.1016/j.csda.2012.07.025>.

Maintained by Jolien Cremers. Last updated 1 year ago.

openblas cpp openmp

3.6 match 14 stars 6.15 score 101 scripts

bioc

CNORfeeder:Integration of CellNOptR to add missing links

This package integrates literature-constrained and data-driven methods to infer signalling networks from perturbation experiments. It allows a given network to be extended with links derived from the data via various inference methods and uses information on physical interactions of proteins to guide and validate the integration of links.

Maintained by Attila Gabor. Last updated 5 months ago.

cellbasedassays cellbiology proteomics networkinference

6.0 match 3.60 score 9 scripts

bioc

deepSNV:Detection of subclonal SNVs in deep sequencing data.

This package provides quantitative variant callers for detecting subclonal mutations in ultra-deep (>=100x coverage) sequencing experiments. The deepSNV algorithm is used for a comparative setup with a control experiment of the same loci and uses a beta-binomial model and a likelihood ratio test to discriminate sequencing errors and subclonal SNVs. The shearwater algorithm computes a Bayes classifier based on a beta-binomial model for variant calling with multiple samples for precisely estimating model parameters - such as local error rates and dispersion - and prior knowledge, e.g. from variation databases such as COSMIC.

Maintained by Moritz Gerstung. Last updated 5 months ago.

geneticvariability snp sequencing genetics dataimport curl bzip2 xz-utils zlib cpp

3.3 match 6.53 score 38 scripts 1 dependents

cran

vannstats:Simplified Statistical Procedures for Social Sciences

Simplifies functions that assess normality for bivariate and multivariate statistical techniques. Includes functions designed to replicate plots and tables that would result from similar calls in 'SPSS', including hst(), box(), qq(), tab(), cormat(), and residplot(). Also includes simplified formulae, such as mode(), scatter(), p.corr(), ow.anova(), and rm.anova().

Maintained by Burrel Vann Jr. Last updated 2 months ago.

6.9 match 3.06 score

bioc

mixOmics:Omics Data Integration Project

Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.

Maintained by Eva Hamrud. Last updated 4 days ago.

immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project

1.5 match 182 stars 13.71 score 1.3k scripts 22 dependents

manueleleonelli

bnRep:A Repository of Bayesian Networks from the Academic Literature

A collection of Bayesian networks (discrete, Gaussian, and conditional linear Gaussian) collated from recent academic literature. The 'bnRep_summary' object provides an overview of the Bayesian networks in the repository and the package documentation includes details about the variables in each network. A Shiny app to explore the repository can be launched with 'bnRep_app()' and is available online at <https://manueleleonelli.shinyapps.io/bnRep>. For details see <https://github.com/manueleleonelli/bnRep>.

Maintained by Manuele Leonelli. Last updated 6 months ago.

4.0 match 5 stars 5.10 score 7 scripts

kwb-r

algoliar:Simple Access to Algolia Search REST API

Simple Access to Algolia REST API (https://www.algolia.com/doc/rest-api/search/).

Maintained by Michael Rustler. Last updated 6 years ago.

academic algolia api hugo knowledge-repo project-fakin search

7.5 match 2.70 score

kwb-r

kwb.twitter:Simplify Access to Twitter Messages

Simplify access to Twitter messages.

Maintained by Hauke Sonnenberg. Last updated 3 years ago.

knowledge-repo project-fakin publication social-network twitter

7.5 match 2.70 score

ralmond

Peanut:Parameterized Bayesian Networks, Abstract Classes

This provides support for learning conditional probability tables parameterized using CPTtools. It provides an object-oriented layer on top of CPTtools to facilitate calculations with parameterized models for Bayesian networks. Peanut is a collection of abstract classes and generic functions defining a protocol, with the intent that the protocol can be implemented with different Bayes net engines. The companion package PNetica provides an implementation using Netica and RNetica.

Maintained by Russell Almond. Last updated 2 years ago.

bayesian-network knowledge-representation

7.5 match 1 stars 2.48 score 4 scripts 2 dependents

openpharma

DoseFinding:Planning and Analyzing Dose Finding Experiments

The DoseFinding package provides functions for the design and analysis of dose-finding experiments (with focus on pharmaceutical Phase II clinical trials). It provides functions for: multiple contrast tests, fitting non-linear dose-response models (using Bayesian and non-Bayesian estimation), calculating optimal designs and an implementation of the MCPMod methodology (Pinheiro et al. (2014) <doi:10.1002/sim.6052>).

Maintained by Marius Thomas. Last updated 5 days ago.

openblas

1.8 match 8 stars 10.32 score 98 scripts 10 dependents

bioc

TOAST:Tools for the analysis of heterogeneous tissues

This package is devoted to analyzing high-throughput data (e.g. gene expression microarray, DNA methylation microarray, RNA-seq) from complex tissues. Current functionalities include 1. detect cell-type specific or cross-cell type differential signals 2. tree-based differential analysis 3. improve variable selection in reference-free deconvolution 4. partial reference-free deconvolution with prior knowledge.

Maintained by Ziyi Li. Last updated 5 months ago.

dnamethylation geneexpression differentialexpression differentialmethylation microarray genetarget epigenetics methylationarray

2.3 match 11 stars 8.01 score 104 scripts 3 dependents

bioc

PCAN:Phenotype Consensus ANalysis (PCAN)

Phenotypes comparison based on a pathway consensus approach. Assess the relationship between candidate genes and a set of phenotypes based on additional genes related to the candidate (e.g. Pathways or network neighbors).

Maintained by Matthew Page. Last updated 5 months ago.

annotation sequencing genetics functionalprediction variantannotation pathways network

4.2 match 4.15 score 7 scripts

bioc

wppi:Weighting protein-protein interactions

Protein-protein interaction data is essential for omics data analysis and modeling. Database knowledge is general, not specific for cell type, physiological condition or any other context determining which connections are functional and contribute to the signaling. Functional annotations such as Gene Ontology and Human Phenotype Ontology might help to evaluate the relevance of interactions. This package predicts functional relevance of protein-protein interactions based on functional annotations such as Human Phenotype Ontology and Gene Ontology, and prioritizes genes based on network topology, functional scores and a path search algorithm.

Maintained by Ana Galhoz. Last updated 5 months ago.

graphandnetwork network pathways software genesignaling genetarget systemsbiology transcriptomics annotation gene-ontology gene-prioritization human-phenotype-ontology omnipath ppi-networks random-walk-with-restart quarto

4.0 match 1 stars 4.30 score 4 scripts

r-forge

pcalg:Methods for Graphical Models and Causal Inference

Functions for causal structure learning and causal inference using graphical models. The main algorithms for causal structure learning are PC (for observational data without hidden variables), FCI and RFCI (for observational data with hidden variables), and GIES (for a mix of data from observational studies (i.e. observational data) and data from experiments involving interventions (i.e. interventional data) without hidden variables). For causal inference the IDA algorithm, the Generalized Backdoor Criterion (GBC), the Generalized Adjustment Criterion (GAC) and some related functions are implemented. Functions for incorporating background knowledge are provided.

Maintained by Markus Kalisch. Last updated 6 months ago.

openblas cpp

2.3 match 7.32 score 700 scripts 19 dependents

bioc

derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach

This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bound ChIP-seq peaks.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

differentialexpression sequencing rnaseq chipseq differentialpeakcalling software immunooncology coverage annotation-agnostic bioconductor derfinder

1.5 match 42 stars 10.03 score 78 scripts 6 dependents

bioc

recount:Explore and download data from the recount project

Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

coverage differentialexpression geneexpression rnaseq sequencing software dataimport immunooncology annotation-agnostic bioconductor count derfinder deseq2 exon gene human illumina junction recount

1.5 match 41 stars 9.57 score 498 scripts 3 dependents

cran

DCL:Claims Reserving under the Double Chain Ladder Model

Statistical modelling and forecasting in claims reserving in non-life insurance under the Double Chain Ladder framework by Martinez-Miranda, Nielsen and Verrall (2012).

Maintained by Maria Dolores Martinez-Miranda. Last updated 3 years ago.

13.7 match 1 stars 1.00 score

kwb-r

kwb.site:R Package for Scraping Our Official KWB Website (Before Re-Design in 2021)

This package contains functions for scraping our official [KWB website](https://kompetenz-wasser.de). The data for all projects and people can be collected to provide an overview of the website's content and to integrate that data into a KWB knowledge repo.

Maintained by Michael Rustler. Last updated 3 years ago.

knowledge-repo project-fakin r-selenium rvest web-scraping website

8.0 match 1.70 score 2 scripts

bioc

ALDEx2:Analysis Of Differential Abundance Taking Sample and Scale Variation Into Account

A differential abundance analysis for the comparison of two or more conditions. Useful for analyzing data from standard RNA-seq or meta-RNA-seq assays as well as selected and unselected values from in-vitro sequence selections. Uses a Dirichlet-multinomial model to infer abundance from counts, optimized for three or more experimental replicates. The method infers biological and sampling variation to calculate the expected false discovery rate, given the variation, based on a Wilcoxon Rank Sum test and Welch's t-test (via aldex.ttest), a Kruskal-Wallis test (via aldex.kw), a generalized linear model (via aldex.glm), or a correlation test (via aldex.corr). All tests report predicted p-values and posterior Benjamini-Hochberg corrected p-values. ALDEx2 also calculates expected standardized effect sizes for paired or unpaired study designs. ALDEx2 can now be used to estimate the effect of scale on the results and report on the scale-dependent robustness of results.

Maintained by Greg Gloor. Last updated 5 months ago.

differentialexpression rnaseq transcriptomics geneexpression dnaseq chipseq bayesian sequencing software microbiome metagenomics immunooncology scale simulation posterior p-value

1.3 match 28 stars 10.70 score 424 scripts 3 dependents

trinker

lexicon:Lexicons for Text Analysis

A collection of lexical hash tables, dictionaries, and word lists.

Maintained by Tyler Rinker. Last updated 3 years ago.

hash lexicon lookup names-frequent stopwords text-dictionaries text-mining

1.5 match 111 stars 8.80 score 224 scripts 25 dependents

bioc

KBoost:Inference of gene regulatory networks from gene expression data

Reconstructing gene regulatory networks and transcription factor activity is crucial to understand biological processes and holds potential for developing personalized treatment. Yet, it is still an open problem as state-of-the-art algorithms are often not able to handle large amounts of data. Furthermore, many of the present methods predict numerous false positives and are unable to integrate other sources of information such as previously known interactions. Here we introduce KBoost, an algorithm that uses kernel PCA regression, boosting and Bayesian model averaging for fast and accurate reconstruction of gene regulatory networks. KBoost can also use a prior network built on previously known transcription factor targets. We have benchmarked KBoost using three different datasets against other high performing algorithms. The results show that our method compares favourably to other methods across datasets.

Maintained by Luis F. Iglesias-Martinez. Last updated 5 months ago.

network graphandnetwork bayesian networkinference generegulation transcriptomics systemsbiology transcription geneexpression regression principalcomponent

2.8 match 4 stars 4.60 score 9 scripts

kalimu

GitAI:Extracts Knowledge from 'Git' Repositories

Scan multiple 'Git' repositories, pull the content of specified files and process it with large language models. You can summarize the content in a specific way, extract information and data, or find answers to your questions about the repositories. The output can be stored in a vector database and used for semantic search or as part of a RAG (Retrieval Augmented Generation) prompt.

Maintained by Kamil Wais. Last updated 25 days ago.

4.8 match 2.70 score 5 scripts

bioc

recount3:Explore and download data from the recount3 project

The recount3 package enables access to a large amount of uniformly processed RNA-seq data from human and mouse. You can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level with sample metadata and QC statistics. In addition we provide access to sample coverage BigWig files.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

coverage differentialexpression geneexpression rnaseq sequencing software dataimport annotation-agnostic bioconductor count derfinder exon gene human illumina junction mouse recount recount3

1.5 match 33 stars 8.03 score 216 scripts

cran

BKT:Bayesian Knowledge Tracing Model

Fitting, cross-validating, and predicting with Bayesian Knowledge Tracing (BKT) models. It is designed for analyzing educational datasets to trace student knowledge over time. The package includes functions for fitting BKT models, evaluating their performance using various metrics, and making predictions on new data. It provides functionality similar to that of the Python package pyBKT authored by Zachary A. Pardos (zp@berkeley.edu) at <https://github.com/CAHLR/pyBKT>.

Maintained by Yuhao Yuan. Last updated 1 months ago.

5.9 match 2.00 score

bioc

biocthis:Automate package and project setup for Bioconductor packages

This package expands the usethis package with the goal of helping automate the process of creating R packages for Bioconductor or making them Bioconductor-friendly.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

software reportwriting actions bioconductor biocthis github styler usethis

1.5 match 51 stars 7.78 score 4 scripts 1 dependents

bioc

benchdamic:Benchmark of differential abundance methods on microbiome data

Starting from a microbiome dataset (16S or WMS with absolute count values) it is possible to perform several analyses to assess the performance of many differential abundance detection methods. A basic and standardized version of the main differential abundance analysis methods is supplied but users can also add their own method to the benchmark. The analyses focus on 4 main aspects: i) the goodness of fit of each method's distributional assumptions on the observed count data, ii) the ability to control the false discovery rate, iii) the within and between method concordances, iv) the truthfulness of the findings if any a priori knowledge is given. Several graphical functions are available for result visualization.

Maintained by Matteo Calgaro. Last updated 4 months ago.

metagenomics microbiome differentialexpression multiplecomparison normalization preprocessing software benchmark differential-abundance-methods

2.0 match 6 stars 5.73 score 8 scripts

shanpengli

PDXpower:Time to Event Outcome in Experimental Designs of Pre-Clinical Studies

Conduct simulation-based customized power calculation for clustered time to event data in a mixed crossed/nested design, where a number of cell lines and a number of mice within each cell line are considered to achieve a desired statistical power, motivated by Eckel-Passow and colleagues (2021) <doi:10.1093/neuonc/noab137> and Li and colleagues (2024) <doi:10.48550/arXiv.2404.08927>. This package provides two commonly used models for powering a design, linear mixed effects and Cox frailty model. Both models account for within-subject (cell line) correlation while holding different distributional assumptions about the outcome. Alternatively, fixed effects counterparts of both models are also available, which produce similar estimates of statistical power.

Maintained by Shanpeng Li. Last updated 2 months ago.

3.1 match 1 stars 3.65 score 2 scripts

microsoft

wpa:Tools for Analysing and Visualising Viva Insights Data

Opinionated functions that enable easier and faster analysis of Viva Insights data. There are three main types of functions in 'wpa': (1) Standard functions create a 'ggplot' visual or a summary table based on a specific Viva Insights metric; (2) Report Generation functions generate HTML reports on a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover more specific applications (e.g. Subject Line text mining) of Viva Insights data. This package adheres to 'tidyverse' principles and works well with the pipe syntax. 'wpa' is built with beginner-to-intermediate R users in mind, and is optimised for simplicity.

Maintained by Martin Chan. Last updated 4 months ago.

workplace-analytics

1.7 match 30 stars 6.69 score 39 scripts 1 dependents

teachinglab

tlShiny:Supplies essential functions to Teaching Lab dashboards

A bunch of random functions I use in developing dashboards. Needs to vastly reduce the number of dependencies at the moment.

Maintained by Duncan Gates. Last updated 13 days ago.

3.6 match 3.04 score

jonesor

Rage:Life History Metrics from Matrix Population Models

Functions for calculating life history metrics using matrix population models ('MPMs'). Described in Jones et al. (2021) <doi:10.1101/2021.04.26.441330>.

Maintained by Owen Jones. Last updated 3 months ago.

1.3 match 11 stars 8.17 score 62 scripts 1 dependents

bioc

regionReport:Generate HTML or PDF reports for a set of genomic regions or DESeq2/edgeR results

Generate HTML or PDF reports to explore a set of regions such as the results from annotation-agnostic expression analysis of RNA-seq data at base-pair resolution performed by derfinder. You can also create reports for DESeq2 or edgeR results.

Maintained by Leonardo Collado-Torres. Last updated 2 months ago.

differentialexpression sequencing rnaseq software visualization transcription coverage reportwriting differentialmethylation differentialpeakcalling immunooncology qualitycontrol bioconductor derfinder deseq2 edger regionreport rmarkdown

1.5 match 9 stars 7.22 score 46 scripts

tlverse

tmle3:The Extensible TMLE Framework

A general framework supporting the implementation of targeted maximum likelihood estimators (TMLEs) of a diverse range of statistical target parameters through a unified interface. The goal is that the exposed framework be as general as the mathematical framework upon which it draws.

Maintained by Jeremy Coyle. Last updated 4 months ago.

causal-inference machine-learning targeted-learning variable-importance

1.3 match 38 stars 7.91 score 286 scripts 5 dependents

celehs

PheCAP:High-Throughput Phenotyping with EHR using a Common Automated Pipeline

Implement surrogate-assisted feature extraction (SAFE) and common machine learning approaches to train and validate phenotyping models. Background and details about the methods can be found at Zhang et al. (2019) <doi:10.1038/s41596-019-0227-6>, Yu et al. (2017) <doi:10.1093/jamia/ocw135>, and Liao et al. (2015) <doi:10.1136/bmj.h1885>.

Maintained by PARSE LTD. Last updated 4 years ago.

1.7 match 21 stars 6.02 score 8 scripts

jeroen

curl:A Modern and Flexible Web Client for R

Bindings to 'libcurl' <https://curl.se/libcurl/> for performing fully configurable HTTP/FTP requests where responses can be processed in memory, on disk, or streaming via the callback or connection interfaces. Some knowledge of 'libcurl' is recommended; for a more user-friendly web client see the 'httr2' package which builds on this package with HTTP-specific tools and logic.

Maintained by Jeroen Ooms. Last updated 23 days ago.

curl

0.5 match 224 stars 19.98 score 4.0k scripts 5.9k dependents

microsoft

vivainsights:Analyze and Visualize Data from 'Microsoft Viva Insights'

Provides a versatile range of functions, including exploratory data analysis, time-series analysis, organizational network analysis, and data validation, while implementing a set of best practices in analyzing and visualizing data specific to 'Microsoft Viva Insights'.

Maintained by Martin Chan. Last updated 24 days ago.

1.7 match 11 stars 6.12 score 68 scripts

vascobranco

gecko:Geographical Ecology and Conservation Knowledge Online

Includes a collection of geographical analysis functions aimed primarily at ecology and conservation science studies, allowing processing of both point and raster data. Now integrates SPECTRE (<https://biodiversityresearch.org/spectre/>), a dataset of global geospatial threat data, developed by the authors.

Maintained by Vasco V. Branco. Last updated 3 months ago.

conservation-science ecology spatial-analysis

3.0 match 5 stars 3.40 score 4 scripts

bioc

megadepth:megadepth: BigWig and BAM related utilities

This package provides an R interface to Megadepth by Christopher Wilks available at https://github.com/ChristopherWilks/megadepth. It is particularly useful for computing the coverage of a set of genomic regions across bigWig or BAM files. With this package, you can build base-pair coverage matrices for regions or annotations of your choice from BigWig files. Megadepth was used to create the raw files provided by https://bioconductor.org/packages/recount3.

Maintained by David Zhang. Last updated 3 months ago.

software coverage dataimport transcriptomics rnaseq preprocessing bam bigwig daspter megadepth recount2 recount3

1.5 match 12 stars 6.69 score 7 scripts 3 dependents

bioc

FELLA:Interpretation and enrichment for metabolomics data

Enrichment of metabolomics data using KEGG entries. Given a set of affected compounds, FELLA suggests affected reactions, enzymes, modules and pathways using label propagation in a knowledge model network. The resulting subnetwork can be visualised and exported.

Maintained by Sergio Picart-Armada. Last updated 5 months ago.

software metabolomics graphandnetwork kegg go pathways network networkenrichment

2.3 match 4.41 score 32 scripts

bioc

iSEEtree:Interactive visualisation for microbiome data

iSEEtree is an extension of iSEE for the TreeSummarizedExperiment. It leverages the functionality from the miaViz package for microbiome data visualisation to create panels that are specific for TreeSummarizedExperiment objects. Not surprisingly, it also depends on the generic panels from iSEE.

Maintained by Giulio Benedetti. Last updated 6 days ago.

microbiome software visualization gui shinyapps dataimport shiny-apps visualisation

1.5 match 3 stars 6.26 score 5 scripts

bioc

derfinderHelper:derfinder helper package

Helper package for speeding up the derfinder package when using multiple cores. This package is particularly useful when using BiocParallel and it helps reduce the time spent loading the full derfinder package when running the F-statistics calculation in parallel.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

differentialexpression sequencing rnaseq software immunooncology bioconductor derfinder

1.5 match 6.20 score 7 dependents

frictionlessdata

tableschema.r:Table Schema 'Frictionless Data'

Allows working with 'Table Schema' (<https://specs.frictionlessdata.io/table-schema/>). 'Table Schema' is well suited for use cases around handling and validating tabular data in text formats such as 'csv', but its utility extends well beyond this core usage, towards a range of applications where data benefits from a portable schema format. The 'tableschema.r' package can load and validate any table schema descriptor, allows the creation and modification of descriptors, and exposes methods for reading and streaming data that conforms to a 'Table Schema' via the 'Tabular Data Resource' abstraction.

Maintained by Kleanthis Koupidis. Last updated 2 years ago.

1.6 match 25 stars 5.70 score 101 scripts

malaga-fca-group

fcaR:Formal Concept Analysis

Provides tools to perform fuzzy formal concept analysis, presented in Wille (1982) <doi:10.1007/978-3-642-01815-2_23> and in Ganter and Obiedkov (2016) <doi:10.1007/978-3-662-49291-8>. It provides functions to load and save a formal context, extract its concept lattice and implications. In addition, one can use the implications to compute semantic closures of fuzzy sets and, thus, build recommendation systems.

Maintained by Domingo Lopez Rodriguez. Last updated 2 years ago.

formal-concept-analysis cpp

1.5 match 6 stars 6.02 score 70 scripts

ipbes-data

IPBES.R:Tool functions used by the Data and Knowledge Technical Support Unit of IPBES

More about what it does (maybe more than one line).

Maintained by Rainer M. Krug. Last updated 1 years ago.

4.4 match 2 stars 2.00 score 10 scripts

ropensci

bowerbird:Keep a Collection of Sparkly Data Resources

Tools to get and maintain a data repository from third-party data providers.

Maintained by Ben Raymond. Last updated 5 days ago.

ropensci antarctic southern ocean data environmental satellite climate peer-reviewed

1.2 match 50 stars 7.16 score 16 scripts 1 dependents

deboerk

cocron:Statistical Comparisons of Two or more Alpha Coefficients

Statistical tests for the comparison between two or more alpha coefficients based on either dependent or independent groups of individuals. A web interface is available at http://comparingcronbachalphas.org. A plugin for the R GUI and IDE RKWard is included. Please install RKWard from https://rkward.kde.org to use this feature. The respective R package 'rkward' cannot be installed directly from a repository, as it is a part of RKWard.

Maintained by Birk Diedenhofen. Last updated 9 years ago.

4.0 match 2.12 score 22 scripts

bioc

iSEEindex:iSEE extension for a landing page to a custom collection of data sets

This package provides an interface to any collection of data sets within a single iSEE web-application. The main functionality of this package is to define a custom landing page allowing app maintainers to list a custom collection of data sets that users can select from and directly load objects into an iSEE web-application.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software infrastructure bioconductor hacktoberfest

1.5 match 2 stars 5.65 score 8 scripts

paul-buerkner

brms:Bayesian Regression Models using 'Stan'

Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.

Maintained by Paul-Christian Bürkner. Last updated 3 days ago.

bayesian-inference brms multilevel-models stan statistical-models

0.5 match 1.3k stars 16.61 score 13k scripts 34 dependents

bioc

iSEEhub:iSEE for the Bioconductor ExperimentHub

This package defines a custom landing page for an iSEE app interfacing with the Bioconductor ExperimentHub. The landing page allows users to browse the ExperimentHub, select a data set, download and cache it, and import it directly into a Bioconductor iSEE app.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

dataimport immunooncology infrastructure shinyapps singlecell software bioconductor bioconductor-package hacktoberfest isee

1.5 match 3 stars 5.56 score 4 scripts

blw921

jeek:A Fast and Scalable Joint Estimator for Integrating Additional Knowledge in Learning Multiple Related Sparse Gaussian Graphical Models

Provides a fast and scalable joint estimator for integrating additional knowledge in learning multiple related sparse Gaussian Graphical Models (JEEK). The JEEK algorithm can be used to quickly estimate multiple related precision matrices at large scale. For instance, it can identify multiple gene networks from multi-context gene expression datasets. By performing data-driven network inference from high-dimensional and heterogeneous data sets, this tool can help users effectively translate aggregated data into knowledge that takes the form of graphs among entities. Please run demo(jeek) to learn the basic functions provided by this package. For further details, please read the original paper: Beilun Wang, Arshdeep Sekhon, Yanjun Qi "A Fast and Scalable Joint Estimator for Integrating Additional Knowledge in Learning Multiple Related Sparse Gaussian Graphical Models" (ICML 2018) <arXiv:1806.00548>.

Maintained by Beilun Wang. Last updated 7 years ago.

6.8 match 1.20 score 16 scripts

bioc

chevreulProcess:Tools for managing SingleCellExperiment objects as projects

Tools for analyzing SingleCellExperiment objects as projects, for input into the Chevreul app downstream. Includes functions for analysis of single cell RNA sequencing data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.

Maintained by Kevin Stachelek. Last updated 1 months ago.

coverage rnaseq sequencing visualization geneexpression transcription singlecell transcriptomics normalization preprocessing qualitycontrol dimensionreduction dataimport

1.5 match 5.38 score 2 scripts 2 dependents

bioc

iSEEde:iSEE extension for panels related to differential expression analysis

This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.

Maintained by Kevin Rue-Albrecht. Last updated 4 months ago.

software infrastructure differentialexpression bioconductor hacktoberfest iseeu

1.5 match 1 stars 5.38 score 15 scripts

bioc

qsvaR:Generate Quality Surrogate Variable Analysis for Degradation Correction

The qsvaR package contains functions for removing the effect of degradation in RNA-seq data from postmortem brain tissue. The package is equipped to help users generate principal components associated with degradation. The components can be used in differential expression analysis to remove the effects of degradation.

Maintained by Hedia Tnani. Last updated 3 months ago.

software workflowstep normalization biologicalquestion differentialexpression sequencing coverage bioconductor brain degradation human qsva

1.5 match 5.26 score 4 scripts

wjawaid

enrichR:Provides an R Interface to 'Enrichr'

Provides an R interface to all 'Enrichr' databases. 'Enrichr' is a web-based tool for analysing gene sets and returns any enrichment of common annotated biological features. Quoting from their website 'Enrichment analysis is a computational method for inferring knowledge about an input gene set by comparing it to annotated gene sets representing prior biological knowledge.' See <https://maayanlab.cloud/Enrichr/> for further details.

Maintained by Wajid Jawaid. Last updated 1 months ago.

0.8 match 90 stars 9.96 score 7 dependents

bioc

regutools:regutools: an R package for data extraction from RegulonDB

RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools provides researchers with the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks.

Maintained by Joselyn Chavez. Last updated 3 months ago.

generegulation geneexpression systemsbiology network networkinference visualization transcription bioconductor cdsb regulondb

1.5 match 4 stars 5.20 score 6 scripts

bioc

TREG:Tools for finding Total RNA Expression Genes in single nucleus RNA-seq data

RNA abundance and cell size parameters could improve RNA-seq deconvolution algorithms to more accurately estimate cell type proportions given the different cell type transcription activity levels. A Total RNA Expression Gene (TREG) can facilitate estimating total RNA content using single molecule fluorescent in situ hybridization (smFISH). We developed a data-driven approach using a measure of expression invariance to find candidate TREGs in postmortem human brain single nucleus RNA-seq. This R package implements the method for identifying candidate TREGs from snRNA-seq data.

Maintained by Louise Huuki-Myers. Last updated 3 months ago.

software singlecell rnaseq geneexpression transcriptomics transcription sequencing bioconductor deconvolution rnascope scrna-seq smfish snrna-seq treg

1.5 match 4 stars 5.20 score 5 scripts

patzaw

ReDaMoR:Relational Data Modeler

The aim of this package is to manipulate relational data models in R. It provides functions to create, modify and export data models in JSON format. It also allows importing models created with 'MySQL Workbench' (<https://www.mysql.com/products/workbench/>). These functions are accessible through a graphical user interface made with 'shiny'. Constraints such as types, keys, uniqueness and mandatory fields are automatically checked and corrected when editing a model. Finally, real data can be checked against a model to verify their compatibility.

Maintained by Patrice Godard. Last updated 24 days ago.

1.3 match 17 stars 6.24 score 17 scripts 1 dependents

bioc

snapcount:R/Bioconductor Package for interfacing with Snaptron for rapid querying of expression counts

snapcount is a client interface to the Snaptron webservices which support querying by gene name or genomic region. Results include raw expression counts derived from alignment of RNA-seq samples and/or various summarized measures of expression across one or more regions/genes per-sample (e.g. percent spliced in).

Maintained by Rone Charles. Last updated 5 months ago.

coverage geneexpression rnaseq sequencing software dataimport

1.5 match 3 stars 5.19 score 13 scripts

bioc

chevreulShiny:Tools for managing SingleCellExperiment objects as projects

Tools for managing SingleCellExperiment objects as projects. Includes functions for analysis and visualization of single-cell data. Also included is a shiny app for visualization of pre-processed scRNA data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.

Maintained by Kevin Stachelek. Last updated 14 days ago.

coverage rnaseq sequencing visualization geneexpression transcription singlecell transcriptomics normalization preprocessing qualitycontrol dimensionreduction dataimport

1.5 match 5.08 score

bioc

chevreulPlot:Plots used in the chevreulPlot package

Tools for plotting SingleCellExperiment objects in the chevreulPlot package. Includes functions for analysis and visualization of single-cell data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.

Maintained by Kevin Stachelek. Last updated 18 days ago.

coverage rnaseq sequencing visualization geneexpression transcription singlecell transcriptomics normalization preprocessing qualitycontrol dimensionreduction dataimport

1.5 match 5.08 score 2 scripts

tidymodels

hardhat:Construct Modeling Packages

Building modeling packages is hard. A large amount of effort generally goes into providing an implementation for a new method that is efficient, fast, and correct, but often less emphasis is put on the user interface. A good interface requires specialized knowledge about S3 methods and formulas, which the average package developer might not have. The goal of 'hardhat' is to reduce the burden around building new modeling packages by providing functionality for preprocessing, predicting, and validating input.

Maintained by Hannah Frick. Last updated 2 months ago.

0.5 match 103 stars 14.88 score 175 scripts 436 dependents

bioc

derfinderPlot:Plotting functions for derfinder

This package provides plotting functions for results from the derfinder package. This helps separate the graphical dependencies required for making these plots from the core functionality of derfinder.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

differentialexpression sequencing rnaseq software visualization immunooncology bioconductor derfinder

1.5 match 2 stars 5.00 score 5 scripts

bioc

netprioR:A model for network-based prioritisation of genes

A model for semi-supervised prioritisation of genes integrating network data, phenotypes and additional prior knowledge about TP and TN gene labels from the literature or experts.

Maintained by Fabian Schmich. Last updated 5 months ago.

immunooncology cellbasedassays preprocessing network

1.9 match 4.00 score 1 scripts

bioc

iSEEpathways:iSEE extension for panels related to pathway analysis

This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of pathway analysis results. This package does not perform pathway analysis. Instead, it provides methods to embed precomputed pathway analysis results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software infrastructure differentialexpression geneexpression gui visualization pathways genesetenrichment go shinyapps bioconductor hacktoberfest isee iseeu

1.5 match 1 stars 4.95 score 10 scripts

bioc

awst:Asymmetric Within-Sample Transformation

We propose an Asymmetric Within-Sample Transformation (AWST) to regularize RNA-seq read counts and reduce the effect of noise on the classification of samples. AWST comprises two main steps: standardization and smoothing. These steps transform gene expression data to reduce the noise of the lowly expressed features, which suffer from background effects and low signal-to-noise ratio, and the influence of the highly expressed features, which may be the result of amplification bias and other experimental artifacts.

Maintained by Davide Risso. Last updated 5 months ago.

normalization geneexpression rnaseq software transcriptomics sequencing singlecell

1.5 match 3 stars 4.95 score 15 scripts

program--

fipio:Lightweight Federal Information Processing System (FIPS) Code Information Retrieval

Provides a lightweight suite of functions for retrieving information about 5-digit or 2-digit US FIPS codes.
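To make the distinction between 2-digit and 5-digit codes concrete, here is a minimal Python sketch of how such a code decomposes: the first two digits identify the state, and the remaining three (when present) identify the county within it. The function name `split_fips` is hypothetical and is not part of 'fipio'.

```python
def split_fips(code: str) -> dict:
    """Split a US FIPS code into its state and county parts.

    Hypothetical helper for illustration only; it is not the 'fipio' API.
    A 2-digit code names a state; a 5-digit code is state (2) + county (3).
    """
    if not code.isdigit() or len(code) not in (2, 5):
        raise ValueError("expected a 2-digit or 5-digit numeric FIPS code")
    county = code[2:] or None  # empty string for 2-digit codes becomes None
    return {"state": code[:2], "county": county}
```

Codes are kept as strings throughout so that leading zeros (as in "06") are preserved.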

Maintained by Justin Singh-Mohudpur. Last updated 1 year ago.

information-retrieval spatial us-data

1.6 match 14 stars 4.77 score 14 scripts 2 dependents

cran

GoogleKnowledgeGraphR:Retrieve Information from 'Google Knowledge Graph' API

Allows you to retrieve information from the 'Google Knowledge Graph' API <https://www.google.com/intl/bn/insidesearch/features/search/knowledge.html> and process it in R in various forms. The 'Knowledge Graph Search' API lets you find entities in the 'Google Knowledge Graph'. The API uses standard 'schema.org' types and is compliant with the 'JSON-LD' specification.

Maintained by Daniel Schmeh. Last updated 7 years ago.

7.3 match 1.00 score

skranz

gtree:gtree basic functionality to model and solve games

gtree basic functionality to model and solve games

Maintained by Sebastian Kranz. Last updated 4 years ago.

economic-experiments economics gambit game-theory nash-equilibrium

1.9 match 18 stars 3.79 score 23 scripts 1 dependents

rrwen

nbc4va:Bayes Classifier for Verbal Autopsy Data

An implementation of the Naive Bayes Classifier (NBC) algorithm used for Verbal Autopsy (VA) built on code from Miasnikof et al (2015) <DOI:10.1186/s12916-015-0521-2>.

Maintained by Richard Wen. Last updated 3 years ago.

autopsy bayes cause classifier coded computer death estimate imputation learning machine mds million naive nbc probability study theory va verbal

1.5 match 4.60 score 79 scripts

rickhelmus

RDCOMClient:R-DCOM client

Provides dynamic client-side access to (D)COM applications from within R.

Maintained by Duncan Temple Lang. Last updated 1 year ago.

1.8 match 3.90 score 315 scripts

richardkwo

eff2:Efficient Least Squares for Total Causal Effects

Estimate a total causal effect from observational data under linearity and causal sufficiency. The observational data is supposed to be generated from a linear structural equation model (SEM) with independent and additive noise. The underlying causal DAG associated with the SEM is required to be known up to a maximally oriented partially directed graph (MPDAG), which is a general class of graphs consisting of both directed and undirected edges, including CPDAGs (i.e., essential graphs) and DAGs. Such graphs are usually obtained with structure learning algorithms with added background knowledge. The program is able to estimate every identified effect, including single and multiple treatment variables. Moreover, the resulting estimate has the minimal asymptotic covariance (and hence shortest confidence intervals) among all estimators that are based on the sample covariance.

Maintained by Richard Guo. Last updated 1 year ago.

1.8 match 3.70 score 3 scripts

fbrun-acta

KenSyn:Knowledge Synthesis in Agriculture - From Experimental Network to Meta-Analysis

Demo and dataset accompanying the books: De l'analyse des réseaux expérimentaux à la méta-analyse: Méthodes et applications avec le logiciel R pour les sciences agronomiques et environnementales (published 2018-06-28, Quae, French version) by David Makowski, Francois Piraux and Francois Brun - <https://www.quae.com/produit/1514/9782759228164/de-l-analyse-des-reseaux-experimentaux-a-la-meta-analyse>; and Knowledge Synthesis in Agriculture: from Experimental Network to Meta-Analysis (in preparation for 2018-06, Springer, English version) by David Makowski, Francois Piraux and Francois Brun. A full description of all the material is in both books. ACKNOWLEDGMENTS: The French network "RMT modeling and data analysis for agriculture" (<http://www.modelia.org>) has contributed to the development of this R package. This project and network are led by ACTA (French Technical Institute for Agriculture) and were funded by a grant from the Ministry of Agriculture and Fishing of France.

Maintained by Francois Brun (ACTA). Last updated 6 years ago.

5.1 match 1.30 score 20 scripts

jorgeklz

moc.gapbk:Multi-Objective Clustering Algorithm Guided by a-Priori Biological Knowledge

Implements the Multi-Objective Clustering Algorithm Guided by a-Priori Biological Knowledge (MOC-GaPBK) which was proposed by Parraga-Alava, J. et al. (2018) <doi:10.1186/s13040-018-0178-4>.

Maintained by Jorge Parraga-Alava. Last updated 7 months ago.

5.0 match 1.30 score 1 scripts

kalifa-manjang

GOxploreR:Structural Exploration of the Gene Ontology (GO) Knowledge Base

It provides an effective, efficient, and fast way to explore the Gene Ontology (GO). Given a set of genes, the package contains functions to assess the GO and obtain the terms associated with the genes and the levels of the GO terms. The package provides functions for the three different GO ontologies. We discussed the methods explicitly in the following article <doi:10.1038/s41598-020-73326-3>.

Maintained by Kalifa Manjang. Last updated 1 year ago.

2.9 match 2.26 score 18 scripts

bioc

slingshot:Tools for ordering single-cell sequencing

Provides functions for inferring continuous, branching lineage structures in low-dimensional data. Slingshot was designed to model developmental trajectories in single-cell RNA sequencing data and serve as a component in an analysis pipeline after dimensionality reduction and clustering. It is flexible enough to handle arbitrarily many branching events and allows for the incorporation of prior knowledge through supervised graph construction.

Maintained by Kelly Street. Last updated 5 months ago.

clustering differentialexpression geneexpression rnaseq sequencing software singlecell transcriptomics visualization

0.5 match 283 stars 12.01 score 1.0k scripts 4 dependents

ecmerkle

smdata:Data to Accompany Smithson & Merkle, 2013

Contains data files to accompany Smithson & Merkle (2013), Generalized Linear Models for Categorical and Continuous Limited Dependent Variables.

Maintained by Ed Merkle. Last updated 7 years ago.

4.0 match 1.46 score 29 scripts

edwinkipruto

mfp2:Multivariable Fractional Polynomial Models with Extensions

Multivariable fractional polynomial algorithm simultaneously selects variables and functional forms in both generalized linear models and Cox proportional hazard models. Key references are Royston and Altman (1994) <doi:10.2307/2986270> and Royston and Sauerbrei (2008, ISBN:978-0-470-02842-1). In addition, it can model a sigmoid relationship between variable x and an outcome variable y using the approximate cumulative distribution transformation proposed by Royston (2014) <doi:10.1177/1536867X1401400206>. This feature distinguishes it from a standard fractional polynomial function, which lacks the ability to achieve such modeling.

Maintained by Edwin Kipruto. Last updated 10 months ago.

1.1 match 3 stars 5.26 score 4 scripts 2 dependents

cran

DiceOptim:Kriging-Based Optimization for Computer Experiments

Efficient Global Optimization (EGO) algorithm as described in "Roustant et al. (2012)" <doi:10.18637/jss.v051.i01> and adaptations for problems with noise ("Picheny and Ginsbourger, 2012") <doi:10.1016/j.csda.2013.03.018>, parallel infill, and problems with constraints.

Maintained by Victor Picheny. Last updated 4 years ago.

1.9 match 4 stars 3.11 score 107 scripts 1 dependents

cubiczebra

TPMplt:Tool-Kit for Dynamic Materials Model and Thermal Processing Maps

Provides a simple approach for constructing the dynamic materials model suggested by Prasad and Gegel (1984) <doi:10.1007/BF02664902>. It can also easily generate various processing maps based on this model. The calculation results include the full set of material constants, the power dissipation efficiency factor, and rheological properties; they can be exported in full for further analysis and customized plots.

Maintained by Chen Zhang. Last updated 6 months ago.

1.2 match 2 stars 4.76 score 29 scripts

bioc

systemPipeShiny:systemPipeShiny: An Interactive Framework for Workflow Management and Visualization

systemPipeShiny (SPS) extends the widely used systemPipeR (SPR) workflow environment with a versatile graphical user interface provided by a Shiny App. This allows non-R users, such as experimentalists, to run many of systemPipeR’s workflow design, control, and visualization functionalities interactively without requiring knowledge of R. Most importantly, SPS has been designed as a general purpose framework for interacting with other R packages in an intuitive manner. Like most Shiny Apps, SPS can be used on local computers as well as centralized server-based deployments that can be accessed remotely as a public web service for using SPR’s functionalities with community and/or private data. The framework can integrate many core packages from the R/Bioconductor ecosystem. Examples of SPS’ current functionalities include: (a) interactive creation of experimental designs and metadata using an easy-to-use tabular editor or file uploader; (b) visualization of workflow topologies combined with auto-generation of R Markdown preview for interactively designed workflows; (d) access to a wide range of data processing routines; and (e) an extendable set of visualization functionalities. Complex visual results can be managed on a 'Canvas Workbench' allowing users to organize and compare plots in an efficient manner, combined with a session snapshot feature to continue work at a later time. The present suite of pre-configured visualization examples. The modular design of SPR makes it easy to design custom functions without any knowledge of Shiny, as well as extending the environment in the future with contributions from the community.

Maintained by Le Zhang. Last updated 5 months ago.

shinyapps infrastructure dataimport sequencing qualitycontrol reportwriting experimentaldesign clustering bioconductor bioconductor-package data-visualization shiny systempiper

0.8 match 33 stars 7.03 score 36 scripts

cran

FamilyRank:Algorithm for Ranking Predictors Using Graphical Domain Knowledge

Grows families of features by selecting features that maximize a weighted score calculated from empirical feature scores and graphical knowledge. The final weighted score for a feature is determined by summing a feature's family-weighted scores across all families in which the feature appears.

Maintained by Michelle Saul. Last updated 4 years ago.

cpp

5.1 match 1.00 score 6 scripts

jafarilab

NIMAA:Nominal Data Mining Analysis

Functions for nominal data mining based on bipartite graphs, which build a pipeline for analysis and missing values imputation. Methods are mainly from the paper Jafari, Mohieddin, et al. (2021) <doi:10.1101/2021.03.18.436040>; some new ones are also included.

Maintained by Mohieddin Jafari. Last updated 2 years ago.

1.1 match 4 stars 4.30 score 7 scripts

sanchezi

kfino:Kalman Filter for Impulse Noised Outliers

A method for detecting impulse-noised outliers with a Kalman filter and for prediction on cleaned data. 'kfino' is a robust sequential algorithm that can filter data containing a large number of outliers. This algorithm is based on simple latent linear Gaussian processes as in the Kalman Filter method and is devoted to detecting impulse-noised outliers. These are data points that differ significantly from other observations. 'ML' (Maximum Likelihood) and 'EM' (Expectation-Maximization) algorithms were implemented in 'kfino'. The method is described in full detail in the following arXiv e-Print: <arXiv:2208.00961>.

Maintained by Isabelle Sanchez. Last updated 2 years ago.

1.6 match 3.00 score 6 scripts

olgalezhnina

dtreg:Interact with Data Type Registries and Create Machine-Readable Data

You can load a schema from a DTR (data type registry) as an R object. Use this schema to write your data in JSON-LD (JavaScript Object Notation for Linked Data) format to make it machine readable.

Maintained by Olga Lezhnina. Last updated 30 days ago.

1.5 match 3.18 score 4 scripts

cran

wPerm:Permutation Tests

Supplies permutation-test alternatives to traditional hypothesis-test procedures such as two-sample tests for means, medians, and standard deviations; correlation tests; tests for homogeneity and independence; and more. Suitable for general audiences, including individual and group users, introductory statistics courses, and more advanced statistics courses that desire an introduction to permutation tests.
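As a rough illustration of what a permutation-test alternative to a two-sample test for means looks like, the following Python sketch repeatedly shuffles the pooled observations and counts how often the shuffled difference in means is at least as large as the observed one. It is a generic textbook version, not the 'wPerm' implementation; the function name, the default number of permutations, and the add-one p-value correction are assumptions made for this example.

```python
import random


def perm_test_mean_diff(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Illustrative sketch only; not the 'wPerm' API.
    Returns an estimated p-value in (0, 1].
    """
    rng = random.Random(seed)
    n_x = len(x)
    observed = abs(sum(x) / n_x - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the two groups
        mean_a = sum(pooled[:n_x]) / n_x
        mean_b = sum(pooled[n_x:]) / (len(pooled) - n_x)
        if abs(mean_a - mean_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

A small p-value indicates that the observed difference would be unusual if the group labels were exchangeable.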

Maintained by Neil A. Weiss. Last updated 9 years ago.

3.6 match 1.30 score

bioc

debCAM:Deconvolution by Convex Analysis of Mixtures

An R package for fully unsupervised deconvolution of complex tissues. It provides basic functions to perform unsupervised deconvolution on mixture expression profiles by Convex Analysis of Mixtures (CAM) and some auxiliary functions to help understand the subpopulation-specific results. It also implements functions to perform supervised deconvolution based on prior knowledge of molecular markers, S matrix or A matrix. Combining molecular markers from CAM and from prior knowledge can achieve semi-supervised deconvolution of mixtures.

Maintained by Lulu Chen. Last updated 5 months ago.

software cellbiology geneexpression openjdk

0.8 match 7 stars 5.69 score 14 scripts

bioc

sccomp:Tests differences in cell-type proportion for single-cell data, robust to outliers

A robust and outlier-aware method for testing differences in cell-type proportion in single-cell data. This model can infer changes in tissue composition and heterogeneity, and can produce realistic data simulations based on any existing dataset. This model can also transfer knowledge from a large set of integrated datasets to increase accuracy further.

Maintained by Stefano Mangiola. Last updated 2 days ago.

bayesian regression differentialexpression singlecell metagenomics flowcytometry spatial batch-correction composition cytof differential-proportion microbiome multilevel proportions random-effects single-cell unwanted-variation

0.5 match 99 stars 8.43 score 69 scripts

bendeivide

leem:Laboratory of Teaching to Statistics and Mathematics

An educational package for the teaching of statistics and mathematics in primary and higher education. The objective is to assist in teaching/learning for both student study planning and teacher teaching strategies. The leem package aims to bring simple yet in-depth knowledge of statistics and mathematics to everyone who wants to study these areas. The main function of the package is the 'leem' function.

Maintained by Ben Deivide. Last updated 17 days ago.

0.8 match 4 stars 5.33 score 152 scripts

causalinference

gfoRmula:Parametric G-Formula

Implements the non-iterative conditional expectation (NICE) algorithm of the g-formula algorithm (Robins (1986) <doi:10.1016/0270-0255(86)90088-6>, Hernán and Robins (2024, ISBN:9781420076165)). The g-formula can estimate an outcome's counterfactual mean or risk under hypothetical treatment strategies (interventions) when there is sufficient information on time-varying treatments and confounders. This package can be used for discrete or continuous time-varying treatments and for failure time outcomes or continuous/binary end of follow-up outcomes. The package can handle a random measurement/visit process and a priori knowledge of the data structure, as well as censoring (e.g., by loss to follow-up) and two options for handling competing events for failure time outcomes. Interventions can be flexibly specified, both as interventions on a single treatment or as joint interventions on multiple treatments. See McGrath et al. (2020) <doi:10.1016/j.patter.2020.100008> for a guide on how to use the package.

Maintained by Sean McGrath. Last updated 28 days ago.

0.5 match 165 stars 8.18 score 132 scripts

usepa

ctxR:Utilities for Interacting with the 'CTX' APIs

Access chemical, hazard, bioactivity, and exposure data from the Computational Toxicology and Exposure ('CTX') APIs <https://www.epa.gov/comptox-tools/computational-toxicology-and-exposure-apis>. 'ctxR' was developed to streamline the process of accessing the information available through the 'CTX' APIs without requiring prior knowledge of how to use APIs. Most data is also available on the CompTox Chemical Dashboard ('CCD') <https://comptox.epa.gov/dashboard/> and other resources found at the EPA Computational Toxicology and Exposure Online Resources <https://www.epa.gov/comptox-tools>.

Maintained by Paul Kruse. Last updated 2 months ago.

ccte comptox ord

0.5 match 10 stars 8.02 score 13 scripts 1 dependents

cogdisreslab

KinaseTauScore:Tau Scores For Human Kinases

This data package provides the tau scores for each kinase based on its activity in Alzheimer's disease samples. The data was generated by using an siRNA library to knock down individual kinases and then measuring the total Tau protein expression and the phospho-Tau protein expression. The resulting data was deposited online. This package processes the resulting data to create a meaningful Tau Score for each kinase based on its activity.

Maintained by Ali Sajid Imami. Last updated 3 years ago.

experimentdata proteome expressiondata

1.5 match 2.70 score

bioc

mistyR:Multiview Intercellular SpaTial modeling framework

mistyR is an implementation of the Multiview Intercellular SpaTial modeling framework (MISTy). MISTy is an explainable machine learning framework for knowledge extraction and analysis of single-cell, highly multiplexed, spatially resolved data. MISTy facilitates an in-depth understanding of marker interactions by profiling the intra- and intercellular relationships. MISTy is a flexible framework able to process a custom number of views. Each of these views can describe a different spatial context, i.e., define a relationship among the observed expressions of the markers, such as intracellular regulation or paracrine regulation. The views can also capture cell-type specific relationships, capture relations between functional footprints or focus on relations between different anatomical regions. Each MISTy view is considered as a potential source of variability in the measured marker expressions. Each MISTy view is then analyzed for its contribution to the total expression of each marker and is explained in terms of the interactions with other measurements that led to the observed contribution.

Maintained by Jovan Tanevski. Last updated 5 months ago.

software biomedicalinformatics cellbiology systemsbiology regression decisiontree singlecell spatial bioconductor biology intercellular machine-learning modular molecular-biology multiview spatial-transcriptomics

0.5 match 51 stars 7.87 score 160 scripts

bioc

ASSIGN:Adaptive Signature Selection and InteGratioN (ASSIGN)

ASSIGN is a computational tool to evaluate the pathway deregulation/activation status in individual patient samples. ASSIGN employs a flexible Bayesian factor analysis approach that adapts predetermined pathway signatures derived either from knowledge-based literature or from perturbation experiments to the cell-/tissue-specific pathway signatures. The deregulation/activation level of each context-specific pathway is quantified to a score, which represents the extent to which a patient sample encompasses the pathway deregulation/activation signature.

Maintained by Ying Shen. Last updated 5 months ago.

software geneexpression pathways bayesian

0.5 match 2 stars 7.37 score 65 scripts 1 dependents

celehs

kesernetwork:Visualization of the KESER Network

A shiny app to visualize the knowledge networks for the code concepts. Using co-occurrence matrices of EHR codes from Veterans Affairs (VA) and Massachusetts General Brigham (MGB), the knowledge extraction via sparse embedding regression (KESER) algorithm was used to construct knowledge networks for the code concepts. Background and details about the method can be found at Chuan et al. (2021) <doi:10.1038/s41746-021-00519-z>.

Maintained by Su-Chun Cheng. Last updated 2 years ago.

0.9 match 1 stars 4.00 score 7 scripts

bioc

scAnnotatR:Pretrained learning models for cell type prediction on single cell RNA-sequencing data

The package comprises a set of pretrained machine learning models to predict basic immune cell types. This enables all users to quickly get a first annotation of the cell types present in their dataset without requiring prior knowledge. scAnnotatR also allows users to train their own models to predict new cell types based on specific research needs.

Maintained by Johannes Griss. Last updated 5 months ago.

singlecell transcriptomics geneexpression supportvectormachine classification software

0.5 match 15 stars 6.73 score 20 scripts

mjanuario

evolved:Open Software for Teaching Evolutionary Biology at Multiple Scales Through Virtual Inquiries

"Evolutionary Virtual Education" - 'evolved' - provides multiple tools to help educators (especially at the graduate level or in advanced undergraduate level courses) apply inquiry-based learning in general evolution classes. In particular, the tools provided include functions that simulate evolutionary processes (e.g., genetic drift, natural selection within a single locus) or concepts (e.g., Hardy-Weinberg equilibrium, phylogenetic distribution of traits). Beyond simulation, the package also provides tools for students to analyze (e.g., measuring, testing, visualizing) datasets with characteristics that are common to many fields related to evolutionary biology. Importantly, the package is heavily oriented towards providing tools for inquiry-based learning - where students follow scientific practices to actively construct knowledge. For additional details, see the package's vignettes.
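For readers unfamiliar with one of the concepts named above, the Hardy-Weinberg equilibrium gives the expected genotype frequencies at a biallelic locus as p², 2pq and q², where p and q = 1 - p are the allele frequencies. The short Python sketch below computes them; it is a generic illustration with a hypothetical function name and is unrelated to the functions shipped in 'evolved'.

```python
def hardy_weinberg(p: float) -> dict:
    """Expected genotype frequencies at a biallelic locus under Hardy-Weinberg.

    `p` is the frequency of allele A; the frequency of allele a is q = 1 - p.
    Illustrative only; not a function from the 'evolved' package.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}
```

The three frequencies always sum to one, since (p + q)² = 1.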

Maintained by Matheus Januario. Last updated 1 month ago.

0.5 match 3 stars 6.73 score 23 scripts

reconhub

earlyR:Estimation of Transmissibility in the Early Stages of a Disease Outbreak

Implements a simple, likelihood-based estimation of the reproduction number (R0) using a branching process with a Poisson likelihood. This model requires knowledge of the serial interval distribution, and dates of symptom onsets. Infectiousness is determined by weighting R0 by the probability mass function of the serial interval on the corresponding day. It is a simplified version of the model introduced by Cori et al. (2013) <doi:10.1093/aje/kwt133>.
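The weighting described above can be sketched in a few lines of Python. Under the assumption that each past case contributes the serial-interval probability mass for the number of days since its symptom onset, the expected number of new cases on a given day is R0 times the sum of those contributions. The function and argument names, and the convention that `si_pmf[k]` is the probability of a k-day serial interval, are hypothetical choices for this example and not the 'earlyR' interface.

```python
def expected_new_cases(r0, onset_days, si_pmf, day):
    """Expected new cases on `day` in a simple branching-process model.

    r0         : reproduction number
    onset_days : symptom onset day (integer) of each past case
    si_pmf     : si_pmf[k] = assumed probability of a k-day serial interval
    Illustrative sketch only; not the 'earlyR' API.
    """
    infectiousness = 0.0
    for onset in onset_days:
        lag = day - onset
        if 0 < lag < len(si_pmf):  # only earlier cases within pmf support
            infectiousness += si_pmf[lag]
    return r0 * infectiousness
```

In a Poisson likelihood, this value would serve as the mean of the daily case count, which is what ties the observed onsets to a candidate value of R0.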

Maintained by Thibaut Jombart. Last updated 4 years ago.

0.5 match 9 stars 6.59 score 96 scripts

bioc

ViSEAGO:ViSEAGO: a Bioconductor package for clustering biological functions using Gene Ontology and semantic similarity

The main objective of the ViSEAGO package is to carry out data mining of biological functions and establish links between genes involved in the study. We developed ViSEAGO in R to facilitate functional Gene Ontology (GO) analysis of complex experimental designs with multiple comparisons of interest. It allows users to study large-scale datasets together and visualize GO profiles to capture biological knowledge. The acronym stands for three major concepts of the analysis: Visualization, Semantic similarity and Enrichment Analysis of Gene Ontology. It provides access to the latest GO annotations, which are retrieved from one of the NCBI EntrezGene, Ensembl or Uniprot databases for several species. Using available R packages and novel developments, ViSEAGO extends classical functional GO analysis to focus on functional coherence by aggregating closely related biological themes while studying multiple datasets at once. It provides both a synthetic and detailed view using interactive functionalities respecting the GO graph structure and ensuring functional coherence supplied by semantic similarity. ViSEAGO has been successfully applied to several datasets from different species with a variety of biological questions. Results can be easily shared between bioinformaticians and biologists, enhancing reporting capabilities while maintaining reproducibility.

Maintained by Aurelien Brionne. Last updated 2 months ago.

software annotation go genesetenrichment multiplecomparison clustering visualization

0.5 match 6.64 score 22 scripts

cran

arakno:ARAchnid KNowledge Online

Allows the user to connect with the World Spider Catalogue (WSC; <https://wsc.nmbe.ch/>) and the World Spider Trait (WST; <https://spidertraits.sci.muni.cz/>) databases. Also performs several basic functions such as checking names validity, retrieving coordinate data from the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/>), and mapping.

Maintained by Pedro Cardoso. Last updated 3 years ago.

3.3 match 1.00 score

cran

spidR:Spider Knowledge Online

Allows the user to connect with the World Spider Catalogue (WSC; <https://wsc.nmbe.ch/>) and the World Spider Trait (WST; <https://spidertraits.sci.muni.cz/>) databases. Also performs several basic functions such as checking names validity, retrieving coordinate data from the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/>), and mapping.

Maintained by Pedro Cardoso. Last updated 3 years ago.

3.3 match 1.00 score

bstatcomp

bayes4psy:User Friendly Bayesian Data Analysis for Psychology

Contains several Bayesian models for data analysis of psychological tests. A user friendly interface for these models should enable students and researchers to perform professional level Bayesian data analysis without advanced knowledge in programming and Bayesian statistics. This package is based on the Stan platform (Carpenter et al. 2017 <doi:10.18637/jss.v076.i01>).

Maintained by Jure Demšar. Last updated 1 year ago.

cpp

0.5 match 14 stars 6.44 score 33 scripts

luckinet

ontologics:Code-Logics to Handle Ontologies

Provides tools to build and work with an ontology of linked (open) data in a tidy workflow. It is inspired by the Food and Agriculture Organization's (FAO) caliper platform <https://www.fao.org/statistics/caliper/web/> and makes use of the Simple Knowledge Organisation System (SKOS).

Maintained by Steffen Ehrmann. Last updated 2 months ago.

interoperability ontology

0.5 match 3 stars 6.39 score 17 scripts 1 dependents

usepa

ccdR:Utilities for Interacting with the 'CTX' APIs

Access chemical, hazard, bioactivity, and exposure data from the Computational Toxicology and Exposure ('CTX') APIs <https://api-ccte.epa.gov/docs/>. 'ccdR' was developed to streamline the process of accessing the information available through the 'CTX' APIs without requiring prior knowledge of how to use APIs. Most data is also available on the CompTox Chemical Dashboard ('CCD') <https://comptox.epa.gov/dashboard/> and other resources found at the EPA Computational Toxicology and Exposure Online Resources <https://www.epa.gov/comptox-tools>.

Maintained by Paul Kruse. Last updated 8 months ago.

0.5 match 2 stars 6.38 score 7 scripts

data-cleaning

dcmodify:Modify Data Using Externally Defined Modification Rules

Data cleaning scripts typically contain a lot of 'if this change that' type of statements. Such statements are typically condensed expert knowledge. With this package, such 'data modifying rules' are taken out of the code and become instead parameters to the workflow. This allows one to maintain, document, and reason about data modification rules as separate entities.
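The idea of moving 'if this change that' statements out of code and into data can be sketched as follows. In this Python illustration each rule is a plain dictionary (which could equally be loaded from an external file), and a small generic function applies the rules to a list of records. The rule layout, the `apply_rules` name, and the supported operators are assumptions made for this example; they do not reflect the rule syntax or functions of 'dcmodify'.

```python
import operator

# Comparison operators that a rule may reference by name.
OPS = {"<": operator.lt, ">": operator.gt, "==": operator.eq}


def apply_rules(records, rules):
    """Apply externally defined modification rules to a list of dict records.

    Each rule looks like:
        {"when": {"field": ..., "op": ..., "value": ...}, "set": {...}}
    Hypothetical format for illustration; not the 'dcmodify' rule syntax.
    """
    cleaned = []
    for record in records:
        record = dict(record)  # copy so the input is left untouched
        for rule in rules:
            cond = rule["when"]
            current = record.get(cond["field"])
            if current is not None and OPS[cond["op"]](current, cond["value"]):
                record.update(rule["set"])
        cleaned.append(record)
    return cleaned
```

Because the rules are data rather than code, they can be versioned, documented, and reviewed separately from the script that applies them.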

Maintained by Mark van der Loo. Last updated 9 months ago.

0.5 match 10 stars 6.24 score 58 scripts

egeulgen

driveR:Prioritizing Cancer Driver Genes Using Genomics Data

Cancer genomes contain large numbers of somatic alterations but few genes drive tumor development. Identifying cancer driver genes is critical for precision oncology. Most current approaches identify driver genes either based on mutational recurrence or using estimated scores predicting the functional consequences of mutations. 'driveR' is a tool for personalized or batch analysis of genomic data for driver gene prioritization by combining genomic information and prior biological knowledge. As features, 'driveR' uses coding impact metaprediction scores, non-coding impact scores, somatic copy number alteration scores, hotspot gene/double-hit gene condition, 'phenolyzer' gene scores and memberships to cancer-related KEGG pathways. It uses these features to estimate the cancer-type-specific probability of each gene being a cancer driver using the related task of a multi-task learning classification model. The method is described in detail in Ulgen E, Sezerman OU. 2021. driveR: a novel method for prioritizing cancer driver genes using somatic genomics data. BMC Bioinformatics <doi:10.1186/s12859-021-04203-7>.

Maintained by Ege Ulgen. Last updated 2 years ago.

cancer-driverness driver driver-gene-prioritization identify-driver-genes ranking-genes scoring

0.5 match 15 stars 6.29 score 260 scripts

bioc

martini:GWAS Incorporating Networks

martini deals with the low power inherent to GWAS studies by using prior knowledge represented as a network. SNPs are the vertices of the network, and the edges represent biological relationships between them (genomic adjacency, belonging to the same gene, physical interaction between protein products). The network is scanned using SConES, which looks for groups of SNPs maximally associated with the phenotype, that form a close subnetwork.

Maintained by Hector Climente-Gonzalez. Last updated 5 months ago.

software genomewideassociation snp geneticvariability genetics featureextraction graphandnetwork network bioinformatics genomics gwas network-analysis snps systems-biology cpp

0.5 match 4 stars 6.16 score 30 scripts

g-rho

clustMixType:k-Prototypes Clustering for Mixed Variable-Type Data

Functions to perform k-prototypes partitioning clustering for mixed variable-type data according to Z. Huang (1998): Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Variables, Data Mining and Knowledge Discovery 2, 283-304.

Maintained by Gero Szepannek. Last updated 9 months ago.

0.5 match 1 stars 6.07 score 111 scripts 8 dependents

mmedl94

lionfish:Interactive 'tourr' Using 'python'

Extends the functionality of the 'tourr' package by an interactive graphical user interface. The interactivity allows users to effortlessly refine their 'tourr' results by manual intervention, which allows for integration of expert knowledge and aids the interpretation of results. For more information on 'tourr' see Wickham et al. (2011) <doi:10.18637/jss.v040.i02> or <https://github.com/ggobi/tourr>.

Maintained by Matthias Medl. Last updated 5 days ago.

data-sience data-visualization dimensionality-reduction exploratory-data-analysis interactive interactive-visualizations tourr

0.5 match 1 stars 5.96 score

nicwir

QurvE:Robust and User-Friendly Analysis of Growth and Fluorescence Curves

High-throughput analysis of growth curves and fluorescence data using three methods: linear regression, growth model fitting, and smooth spline fit. Analysis of dose-response relationships via smoothing splines or dose-response models. Complete data analysis workflows can be executed in a single step via user-friendly wrapper functions. The results of these workflows are summarized in detailed reports as well as intuitively navigable 'R' data containers. A 'shiny' application provides access to all features without requiring any programming knowledge. The package is described in further detail in Wirth et al. (2023) <doi:10.1038/s41596-023-00850-7>.

Maintained by Nicolas T. Wirth. Last updated 1 years ago.

0.5 match 25 stars 6.00 score 7 scripts

bioc

omicsViewer:Interactive and explorative visualization of SummarizedExperiment or ExpressionSet using omicsViewer

omicsViewer visualizes ExpressionSet (or SummarizedExperiment) in an interactive way. The omicsViewer has a separate back- and front-end. In the back-end, users need to prepare an ExpressionSet that contains all the necessary information for the downstream data interpretation. Some extra requirements on the headers of phenotype data or feature data are imposed so that the provided information can be clearly recognized by the front-end while keeping modifications to the existing ExpressionSet object to a minimum. The pure dependency on R/Bioconductor guarantees maximum flexibility in the statistical analysis in the back-end. Once the ExpressionSet is prepared, it can be visualized using the front-end, implemented with shiny and plotly. Both features and samples can be selected from (data) tables or graphs (scatter plot/heatmap). Different types of analyses, such as enrichment analysis (using the Bioconductor package fgsea or Fisher's exact test) and STRING network analysis, are performed on the fly and the results are visualized simultaneously. When a subset of samples and a phenotype variable are selected, a significance test on means (t-test or rank-based test; when the phenotype variable is quantitative) or a test of independence (chi-square or Fisher's exact test; when the phenotype variable is categorical) is performed to test the association between the phenotype of interest and the selected samples. Additionally, other analyses can be easily added as extra shiny modules. Therefore, omicsViewer greatly facilitates data exploration: many different hypotheses can be explored in a short time without the need for knowledge of R. In addition, the resulting data can be easily shared using a shiny server. Alternatively, a standalone version of omicsViewer together with designated omics data can be easily created by integrating it with portable R, which can be shared with collaborators or submitted as supplementary data together with a manuscript.

Maintained by Chen Meng. Last updated 2 months ago.

software visualization genesetenrichment differentialexpression motifdiscovery network networkenrichment

0.5 match 4 stars 6.02 score 22 scripts

bioc

BindingSiteFinder:Binding site definition based on iCLIP data

Precise knowledge of the binding sites of an RNA-binding protein (RBP) is key to understanding (post-)transcriptional regulatory processes. Here we present a workflow that describes how exact binding sites can be defined from iCLIP data. The package provides functions for binding site definition and result visualization. For details please see the vignette.

Maintained by Mirko Brüggemann. Last updated 13 hours ago.

sequencing geneexpression generegulation functionalgenomics coverage dataimport binding-site-classification binding-sites bioconductor-package iclip rna-binding-proteins

0.5 match 6 stars 5.73 score 3 scripts

strancsus

scCAN:Single-Cell Clustering using Autoencoder and Network Fusion

A single-cell clustering method using 'Autoencoder' and network fusion ('scCAN'), described in Bang Tran (2022) <doi:10.1038/s41598-022-14218-6>, for segregating the cells in high-dimensional 'scRNA-Seq' data. The software automatically determines the optimal number of clusters and then partitions the cells in such a way that the results are robust to noise and dropouts. 'scCAN' is fast and it supports Windows, Linux, and Mac OS.

Maintained by Bang Tran. Last updated 9 months ago.

1.1 match 2.70 score

weinijiahuan123

offlineChange:Detect Multiple Change Points from Time Series

Detect the number and locations of change points. The locations can be either exact or in terms of ranges, depending on the available computational resources. The method is based on Jie Ding, Yu Xiang, Lu Shen, Vahid Tarokh (2017) <doi:10.1109/TSP.2017.2711558>.

Maintained by Jiahuan Ye. Last updated 5 years ago.

cpp

1.1 match 2.70 score 3 scripts

giopogg

webSDM:Including Known Interactions in Species Distribution Models

A collection of tools to fit and work with trophic Species Distribution Models. Trophic Species Distribution Models combine knowledge of trophic interactions with Bayesian structural equation models that model each species as a function of its prey (or predators) and environmental conditions. It exploits the topological ordering of the known trophic interaction network to predict species distribution in space and/or time, where the prey (or predator) distribution is unavailable. The method implemented by the package is described in Poggiato, Andréoletti, Pollock and Thuiller (2022) <doi:10.22541/au.166853394.45823739/v1>.

Maintained by Giovanni Poggiato. Last updated 9 months ago.

0.5 match 17 stars 5.71 score 9 scripts

cran

MCPMod:Design and Analysis of Dose-Finding Studies

Implements a methodology for the design and analysis of dose-response studies that combines aspects of multiple comparison procedures and modeling approaches (Bretz, Pinheiro and Branson, 2005, Biometrics 61, 738-748, <doi:10.1111/j.1541-0420.2005.00344.x>). The package provides tools for the analysis of dose finding trials as well as a variety of tools necessary to plan a trial to be conducted with the MCP-Mod methodology. Please note: The 'MCPMod' package will not be further developed; all future development of the MCP-Mod methodology will be done in the 'DoseFinding' R-package.

Maintained by Bjoern Bornkamp. Last updated 5 years ago.

1.8 match 1.60 score

cyclestreets

cyclestreets:Cycle Routing and Data for Cycling Advocacy

An interface to the cycle routing/data services provided by 'CycleStreets', a not-for-profit social enterprise and advocacy organisation. The application programming interfaces (APIs) provided by 'CycleStreets' are documented at <https://www.cyclestreets.net/api/>. The focus of this package is the journey planning API, which aims to emulate the routes taken by a knowledgeable cyclist. An innovative feature of the routing service is its provision of fastest, quietest and balanced profiles. These represent routes taken to minimise time, avoid traffic and compromise between the two, respectively.

Maintained by Robin Lovelace. Last updated 3 months ago.

cycling routing transport transportation-planning

0.5 match 27 stars 5.62 score 31 scripts

statistikat

tatoo:Combine and Export Data Frames

Functions to combine data.frames in ways that require additional effort in base R, and to add metadata (id, title, ...) that can be used for printing and xlsx export. The 'Tatoo_report' class is provided as a convenient helper to write several such tables to a workbook, one table per worksheet. Tatoo is built on top of 'openxlsx', but intimate knowledge of that package is not required to use tatoo.

Maintained by Stefan Fleck. Last updated 2 years ago.

0.5 match 7 stars 5.53 score 24 scripts

mjwestgate

revtools:Tools to Support Evidence Synthesis

Researchers commonly need to summarize scientific information, a process known as 'evidence synthesis'. The first stage of a synthesis process (such as a systematic review or meta-analysis) is to download a list of references from academic search engines such as 'Web of Knowledge' or 'Scopus'. The traditional approach to systematic review is then to sort these data manually, first by locating and removing duplicated entries, and then screening to remove irrelevant content by viewing titles and abstracts (in that order). 'revtools' provides interfaces for each of these tasks. An alternative approach, however, is to draw on tools from machine learning to visualise patterns in the corpus. In this case, you can use 'revtools' to render ordinations of text drawn from article titles, keywords and abstracts, and interactively select or exclude individual references, words or topics.

Maintained by Martin J. Westgate. Last updated 5 years ago.

0.5 match 52 stars 5.57 score 72 scripts

bioc

diffuStats:Diffusion scores on biological networks

Label propagation approaches are a widely used procedure in computational biology for giving context to molecular entities using network data. Node labels, which can derive from gene expression, genome-wide association studies, protein domains or metabolomics profiling, are propagated to their neighbours in the network, effectively smoothing the scores through prior annotated knowledge and prioritising novel candidates. The R package diffuStats contains a collection of diffusion kernels and scoring approaches that facilitates their computation, characterisation and benchmarking.

Maintained by Sergio Picart-Armada. Last updated 5 months ago.

network geneexpression graphandnetwork metabolomics transcriptomics proteomics genetics genomewideassociation normalization cpp

0.5 match 5.40 score 42 scripts

magichead99

bread:Analyze Big Files Without Loading Them in Memory

A simple set of wrapper functions for data.table::fread() that allows subsetting or filtering rows and selecting columns of table-formatted files too large for the available RAM. 'b' stands for 'big files'. bread makes heavy use of Unix commands like 'grep', 'sed', 'wc', 'awk' and 'cut'. They are available by default in all Unix environments. For Windows, you need to install those commands externally in order to simulate a Unix environment and make sure that the executables are in the Windows PATH variable. To my knowledge, the simplest ways are to install 'RTools', 'Git' or 'Cygwin'. If they have been correctly installed (with the expected registry entries), they should be detected on loading the package and the correct directories will be added automatically to the PATH.

Maintained by Vincent Guegan. Last updated 2 years ago.

0.5 match 14 stars 5.37 score 56 scripts 2 dependents

bioc

ReactomeGraph4R:Interface for the Reactome Graph Database

Pathways, reactions, and biological entities in Reactome knowledge are systematically represented as an ordered network. Instances are represented as nodes and relationships between instances as edges; they are all stored in the Reactome Graph Database. This package serves as an interface to query the interconnected data from a local Neo4j database, with the aim of minimizing the usage of Neo4j Cypher queries.

Maintained by Chi-Lam Poon. Last updated 5 months ago.

dataimport pathways reactome network graphandnetwork

0.5 match 6 stars 5.26 score 6 scripts

zzawadz

DepthProc:Statistical Depth Functions for Multivariate Analysis

The data depth concept offers a variety of powerful and user-friendly tools for robust exploration and inference for multivariate data. Owing to their nature, the offered techniques may be used successfully when knowledge of the parametric models generating the data is lacking. The package consists of, among others, implementations of several data depth techniques involving multivariate quantile-quantile plots, multivariate scatter estimators, multivariate Wilcoxon tests and robust regressions.

Maintained by Zygmunt Zawadzki. Last updated 3 years ago.

depth-functions exploratory-data-analysis statistics openblas cpp openmp

0.5 match 6 stars 5.27 score 104 scripts 2 dependents

jsugarelli

flatxml:Tools for Working with XML Files as R Dataframes

On import, the XML information is converted to a dataframe that reflects the hierarchical XML structure. Intuitive functions allow navigation within this transparent XML data structure (without any knowledge of 'XPath'). 'flatXML' also provides tools to extract data from the XML into a flat dataframe that can be used to perform statistical operations. It also supports converting dataframes to XML.

Maintained by Joachim Zuckarelli. Last updated 4 years ago.

dataframe xml

0.5 match 24 stars 5.09 score 34 scripts 1 dependents

david-hammond

pmev:Calculates Earned Value for a Project Schedule

Given a project schedule and associated costs, this package calculates the earned value to date. It is an implementation of Project Management Body of Knowledge (PMBOK) methodologies (reference Project Management Institute. (2021). A guide to the Project Management Body of Knowledge (PMBOK guide) (7th ed.). Project Management Institute, Newtown Square, PA, ISBN 9781628256673 (pdf)).
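The standard earned value quantities can be sketched as below. This is a minimal Python illustration of the usual definitions (planned value, earned value, actual cost, and the derived variances and indices), not the 'pmev' interface: the `Task` fields and the `earned_value_summary` function are hypothetical names, and the example figures are made up for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    budget: float            # budget at completion for this task
    planned_fraction: float  # share scheduled to be complete by the status date
    actual_fraction: float   # share actually complete by the status date
    actual_cost: float       # cost incurred by the status date


def earned_value_summary(tasks: List[Task]) -> Dict[str, float]:
    """Aggregate the standard earned value metrics over all tasks."""
    pv = sum(t.budget * t.planned_fraction for t in tasks)  # planned value
    ev = sum(t.budget * t.actual_fraction for t in tasks)   # earned value
    ac = sum(t.actual_cost for t in tasks)                  # actual cost
    return {
        "PV": pv,
        "EV": ev,
        "AC": ac,
        "CV": ev - ac,                              # cost variance
        "SV": ev - pv,                              # schedule variance
        "CPI": ev / ac if ac else float("nan"),     # cost performance index
        "SPI": ev / pv if pv else float("nan"),     # schedule performance index
    }


# Illustrative figures only: one finished task that ran over budget,
# and one task that is behind schedule.
tasks = [
    Task(budget=1000.0, planned_fraction=1.0, actual_fraction=1.0, actual_cost=1100.0),
    Task(budget=2000.0, planned_fraction=0.5, actual_fraction=0.25, actual_cost=600.0),
]
summary = earned_value_summary(tasks)
```

A negative cost or schedule variance (equivalently, an index below 1) signals that the project is over budget or behind schedule at the status date.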

Maintained by David Hammond. Last updated 7 months ago.

0.8 match 3.30 score 4 scripts

hannahcomiskey

mcmsupply:Estimating Public and Private Sector Contraceptive Market Supply Shares

Family Planning programs and initiatives typically use nationally representative surveys to estimate key indicators of a country’s family planning progress. However, in recent years, routinely collected family planning services data (Service Statistics) have been used as a supplementary data source to bridge gaps in the surveys. The use of service statistics comes with the caveat that adjustments need to be made for missing private sector contributions to the contraceptive method supply chain. Evaluating the supply source of modern contraceptives often relies on Demographic Health Surveys (DHS), where many countries do not have recent data beyond 2015/16. Fortunately, in the absence of recent surveys we can rely on statistical model-based estimates and projections to fill the knowledge gap. We present a Bayesian, hierarchical, penalized-spline model with multivariate-normal spline coefficients, to account for across-method correlations, to produce country-specific, annual estimates for the proportion of modern contraceptive methods coming from the public and private sectors. This package provides a quick and convenient way for users to access the DHS modern contraceptive supply share data at national and subnational administration levels, and to estimate, evaluate and plot annual estimates with uncertainty for a sample of low- and middle-income countries. Methods for the estimation of method supply shares at the national level are described in Comiskey, Alkema, Cahill (2022) <arXiv:2212.03844>.

Maintained by Hannah Comiskey. Last updated 12 months ago.

jags cpp

0.5 match 2 stars 5.15 score 20 scripts

ingebogh

makemyprior:Intuitive Construction of Joint Priors for Variance Parameters

Tool for easy prior construction and visualization. It helps to formulate joint prior distributions for variance parameters in latent Gaussian models. The resulting prior is robust and can be created in an intuitive way. A graphical user interface (GUI) can be used to choose the joint prior, where the user can click through the model and select priors. An extensive guide is available in the GUI. The package allows for direct inference with the specified model and prior. Using a hierarchical variance decomposition, we formulate a joint variance prior that takes the whole model structure into account. In this way, existing knowledge can intuitively be incorporated at the level it applies to. Alternatively, one can use independent variance priors for each model component in the latent Gaussian model. Details can be found in the accompanying scientific paper: Hem, Fuglstad, Riebler (2024, Journal of Statistical Software, <doi:10.18637/jss.v110.i03>).

Maintained by Ingeborg Hem. Last updated 7 months ago.

0.5 match 1 stars 5.06 score 19 scripts

bioc

DESpace:DESpace: a framework to discover spatially variable genes

Intuitive framework for identifying spatially variable genes (SVGs) via edgeR, a popular method for performing differential expression analyses. Based on pre-annotated spatial clusters as summarized spatial information, DESpace models gene expression using a negative binomial (NB), via edgeR, with spatial clusters as covariates. SVGs are then identified by testing the significance of spatial clusters. The method is flexible and robust, and is faster than most SV methods. Furthermore, to the best of our knowledge, it is the only SV approach that allows (i) performing an SV test on each individual spatial cluster, hence identifying the key regions of the tissue affected by spatial variability; and (ii) jointly fitting multiple samples, targeting genes with consistent spatial patterns across replicates.

Maintained by Peiying Cai. Last updated 5 months ago.

spatial singlecell rnaseq transcriptomics geneexpression sequencing differentialexpression statisticalmethod visualization

0.5 match 4 stars 5.02 score 13 scripts

bioc

KnowSeq:KnowSeq R/Bioc package: The Smart Transcriptomic Pipeline

KnowSeq proposes a novel methodology that comprises the most relevant steps in transcriptomic gene expression analysis. KnowSeq aims to serve as an integrative tool for processing and extracting relevant biomarkers, as well as for assessing them through machine learning approaches. Finally, the last objective of KnowSeq is the extraction of biological knowledge from the biomarkers (Gene Ontology enrichment, pathway listing and visualization, and evidence related to the addressed disease). Although the package allows all the data to be analyzed manually, the main strength of KnowSeq is the possibility of generating an automatic and intelligent HTML report that collects all the involved steps in one document. It is important to highlight that the pipeline is fully modular and flexible, so it can be started from any of its steps. KnowSeq aims to serve as a novel tool that helps experts in the field acquire robust knowledge and conclusions about the data and diseases under study.

Maintained by Daniel Castillo-Secilla. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment dataimport classification featureextraction sequencing rnaseq batcheffect normalization preprocessing qualitycontrol genetics transcriptomics microarray alignment pathways systemsbiology go immunooncology

0.8 match 3.30 score 5 scripts

robson-fernandes

bnviewer:Bayesian Networks Interactive Visualization and Explainable Artificial Intelligence

Bayesian networks provide an intuitive framework for probabilistic reasoning, and their graphical nature can be interpreted quite clearly. Graph-based methods of machine learning are becoming more popular because they offer a richer model of knowledge that can be understood by a human in a graphical format. 'bnviewer' is an R package that allows the interactive visualization of Bayesian networks. The aim of this package is to improve Bayesian network visualization over the basic and static views offered by existing packages.

Maintained by Robson Fernandes. Last updated 5 years ago.

bayesian-inference bayesian-network bayesian-networks probabilistic-graphical-models

0.5 match 7 stars 4.86 score 69 scripts 1 dependents

bioc

MSstatsBioNet:Network Analysis for MS-based Proteomics Experiments

A set of tools for network analysis using mass spectrometry-based proteomics data and network databases. The package takes as input the output of MSstats differential abundance analysis and provides functions to perform enrichment analysis and visualization in the context of prior knowledge from past literature. Notably, this package integrates with INDRA, which is a database of biological networks extracted from the literature using text mining techniques.

Maintained by Anthony Wu. Last updated 1 months ago.

immunooncology massspectrometry proteomics software qualitycontrol networkenrichment network

0.5 match 4.85 score 3 scripts

lebebr01

highlightHTML:Highlight HTML Text and Tables

A tool to format R markdown with CSS ids for HTML output. The tool may be most helpful for those using markdown to create reproducible documents. The biggest limitation in formatting is the document authors' knowledge of CSS.

Maintained by Brandon LeBeau. Last updated 4 years ago.

html markdown r-markdown

0.5 match 5 stars 4.74 score 22 scripts

bioc

SCBN:A statistical normalization method and differential expression analysis for RNA-seq data between different species

This package provides a scale based normalization (SCBN) method to identify genes with differential expression between different species. It takes into account the available knowledge of conserved orthologous genes and the hypothesis testing framework to detect differentially expressed orthologous genes. The method in this package is described in the article 'A statistical normalization method and differential expression analysis for RNA-seq data between different species' by Yan Zhou, Jiadi Zhu, Tiejun Tong, Junhui Wang, Bingqing Lin, Jun Zhang (2018, pending publication).

Maintained by Yan Zhou. Last updated 5 months ago.

differentialexpression geneexpression normalization

0.5 match 4.78 score 1 scripts 1 dependents

christophergandrud

dpmr:Data Package Manager for R

Create, install, and summarise data packages that follow the Open Knowledge Foundation's Data Package Protocol.

Maintained by Christopher Gandrud. Last updated 8 years ago.

0.5 match 56 stars 4.45 score 5 scripts

cran

visit:Vaccine Phase I Design with Simultaneous Evaluation of Immunogenicity and Toxicity

Phase I clinical trials are the first step in drug development to test a new drug or drug combination on humans. Typical designs of Phase I trials use toxicity as the primary endpoint and aim to find the maximum tolerable dosage. However, these designs are poorly suited to the development of cancer therapeutic vaccines because the expected safety concerns for these vaccines are smaller than for cytotoxic agents. The primary objectives of a cancer therapeutic vaccine phase I trial thus often include determining whether the vaccine shows biologic activity and the minimum dose necessary to achieve a full immune or even clinical response. This package implements a Bayesian Phase I cancer vaccine trial design that allows simultaneous evaluation of safety and immunogenicity outcomes. See Wang et al. (2019) <DOI:10.1002/sim.8021> for further details.

Maintained by Chenguang Wang. Last updated 2 years ago.

cpp

1.2 match 2.00 score

openpharma

elaborator:A 'shiny' Application for Exploring Laboratory Data

A novel concept for generating knowledge and gaining insights into laboratory data. You will be able to efficiently and easily explore your laboratory data from different perspectives. Janitza, S., Majumder, M., Mendolia, F., Jeske, S., & Kulmann, H. (2021) <doi:10.1007/s43441-021-00318-4>.

Maintained by Bodo Kirsch. Last updated 6 months ago.

0.5 match 6 stars 4.56 score

liao961120

linguisticsdown:Easy Linguistics Document Writing with R Markdown

Provides 'Shiny gadgets' to search, type, and insert IPA symbols into documents or scripts, requiring only knowledge about phonetics or 'X-SAMPA'. Also provides functions to facilitate the rendering of IPA symbols in 'LaTeX' and PDF format, making IPA symbols properly rendered in all output formats. A minimal R Markdown template for authoring Linguistics related documents is also bundled with the package. Some helper functions to facilitate authoring with R Markdown are also provided.

Maintained by Yongfu Liao. Last updated 6 years ago.

linguistics rmarkdown rmarkdown-template

0.5 match 26 stars 4.59 score 30 scripts

bioc

iBBiG:Iterative Binary Biclustering of Genesets

iBBiG is a bi-clustering algorithm which is optimized for binary data analysis. We apply it to meta-gene set analysis of large numbers of gene expression datasets. The iterative algorithm extracts groups of phenotypes from multiple studies that are associated with similar gene sets. iBBiG does not require prior knowledge of the number or scale of clusters and allows discovery of clusters with diverse sizes.

Maintained by Aedin Culhane. Last updated 5 months ago.

clustering annotation genesetenrichment

0.5 match 4.56 score 3 scripts 2 dependents

admaldonado

MoTBFs:Learning Hybrid Bayesian Networks using Mixtures of Truncated Basis Functions

Learning, manipulation and evaluation of mixtures of truncated basis functions (MoTBFs), which include mixtures of polynomials (MOPs) and mixtures of truncated exponentials (MTEs). MoTBFs are a flexible framework for modelling hybrid Bayesian networks (I. Pérez-Bernabé, A. Salmerón, H. Langseth (2015) <doi:10.1007/978-3-319-20807-7_36>; H. Langseth, T.D. Nielsen, I. Pérez-Bernabé, A. Salmerón (2014) <doi:10.1016/j.ijar.2013.09.012>; I. Pérez-Bernabé, A. Fernández, R. Rumí, A. Salmerón (2016) <doi:10.1007/s10618-015-0429-7>). The package provides functionality for learning univariate, multivariate and conditional densities, with the possibility of incorporating prior knowledge. Structural learning of hybrid Bayesian networks is also provided. A set of useful tools is provided, including plotting, printing and likelihood evaluation. This package makes use of S3 objects, with two new classes called 'motbf' and 'jointmotbf'.

Maintained by Ana D. Maldonado. Last updated 3 years ago.

2.3 match 1 stars 1.00 score 1 scripts

bioc

BiocBook:Write, containerize, publish and version Quarto books with Bioconductor

A BiocBook can be created by authors (e.g. R developers, but also scientists, teachers, communicators, ...) who wish to 1) write (compile a body of biological and/or bioinformatics knowledge), 2) containerize (provide Docker images to reproduce the examples illustrated in the compendium), 3) publish (deploy an online book to disseminate the compendium), and 4) version (automatically generate specific online book versions and Docker images for specific Bioconductor releases).

Maintained by Jacques Serizay. Last updated 5 months ago.

infrastructure reportwriting software

0.5 match 3 stars 4.48 score 4 scripts

kumes

chatAI4R:Chat-Based Interactive Artificial Intelligence for R

The Large Language Model (LLM) represents a groundbreaking advancement in data science and programming, and also allows us to extend the world of R. A seamless interface for integrating the 'OpenAI' Web APIs into R is provided in this package. This package leverages LLM-based AI techniques, enabling efficient knowledge discovery and data analysis (see 'OpenAI' Web APIs details <https://openai.com/blog/openai-api>). The previous functions such as seamless translation and image generation have been moved to other packages 'deepRstudio' and 'stableDiffusion4R'.

Maintained by Satoshi Kume. Last updated 1 months ago.

ai bioinformatics chatgpt gpt image image-generation

0.5 match 14 stars 4.45 score 3 scripts

jhardenberg

rainfarmr:Stochastic Precipitation Downscaling with the RainFARM Method

An implementation of the RainFARM (Rainfall Filtered Autoregressive Model) stochastic precipitation downscaling method (Rebora et al. (2006) <doi:10.1175/JHM517.1>). Adapted for climate downscaling according to D'Onofrio et al. (2018) <doi:10.1175/JHM-D-13-096.1> and for complex topography as in Terzago et al. (2018) <doi:10.5194/nhess-18-2825-2018>. The RainFARM method is based on the extrapolation to small scales of the Fourier spectrum of a large-scale precipitation field, using a fixed logarithmic slope and random phases at small scales, followed by a nonlinear transformation of the resulting linearly correlated stochastic field. RainFARM allows the generation of ensembles of spatially downscaled precipitation fields which conserve precipitation at large scales and whose statistical properties are consistent with the small-scale statistics of observed precipitation, based only on knowledge of the large-scale precipitation field.

Maintained by Jost von Hardenberg. Last updated 3 years ago.

0.5 match 4 stars 4.48 score 5 dependents

bioc

transite:RNA-binding protein motif analysis

transite is a computational method that allows comprehensive analysis of the regulatory role of RNA-binding proteins in various cellular processes by leveraging preexisting gene expression data and current knowledge of binding preferences of RNA-binding proteins.

Maintained by Konstantin Krismer. Last updated 5 months ago.

geneexpression transcription differentialexpression microarray mrnamicroarray genetics genesetenrichment cpp

0.5 match 4.30 score 20 scripts

bioc

drugTargetInteractions:Drug-Target Interactions

Provides utilities for identifying drug-target interactions for sets of small molecule or gene/protein identifiers. The required drug-target interaction information is obtained from a local SQLite instance of the ChEMBL database. ChEMBL has been chosen for this purpose because it provides one of the most comprehensive and best annotated knowledge resources for drug-target information available in the public domain.

Maintained by Thomas Girke. Last updated 5 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics proteomics metabolomics

0.5 match 1 stars 4.34 score 11 scripts

bioc

ASURAT:Functional annotation-driven unsupervised clustering for single-cell data

ASURAT is software for single-cell data analysis. Using ASURAT, one can simultaneously perform unsupervised clustering and biological interpretation in terms of cell type, disease, biological process, and signaling pathway activity. Taking single-cell RNA-seq data and knowledge-based databases, such as Cell Ontology, Gene Ontology, KEGG, etc., as input, ASURAT transforms gene expression tables into original multivariate tables, termed sign-by-sample matrices (SSMs).

Maintained by Keita Iida. Last updated 5 months ago.

geneexpression singlecell sequencing clustering genesignaling cpp

0.5 match 4.32 score 21 scripts

frbcesab

popbayes:Bayesian Model to Estimate Population Trends from Counts Series

Infers the trends of one or several animal populations over time from series of counts. It does so by accounting for count precision (provided or inferred based on expert knowledge, e.g. guesstimates), smoothing the population rate of increase over time, and accounting for the maximum demographic potential of species. Inference is carried out in a Bayesian framework. This work is part of the FRB-CESAB working group AfroBioDrivers <https://www.fondationbiodiversite.fr/en/the-frb-in-action/programs-and-projects/le-cesab/afrobiodrivers/>.

Maintained by Nicolas Casajus. Last updated 1 year ago.

animal bayesian counts population precision temporal-trend jags cpp

0.5 match 1 stars 4.30 score

pbiecek

bgmm:Gaussian Mixture Modeling Algorithms and the Belief-Based Mixture Modeling

Two partially supervised mixture modeling methods, soft-label and belief-based modeling, are implemented. For completeness, the package also provides unsupervised, semi-supervised, and fully supervised mixture modeling. The package can also be used to select the best-fitting model from a set of models with different component numbers or constraints on their structures. For a detailed introduction see: Przemyslaw Biecek, Ewa Szczurek, Martin Vingron, Jerzy Tiuryn (2012), The R Package bgmm: Mixture Modeling with Uncertain Knowledge, Journal of Statistical Software <doi:10.18637/jss.v047.i03>.

Maintained by Przemyslaw Biecek. Last updated 2 years ago.

0.5 match 2 stars 4.22 score 55 scripts 1 dependents

jreisner

biclustermd:Biclustering with Missing Data

Biclustering is a statistical learning technique that simultaneously partitions and clusters rows and columns of a data matrix. Since the solution space of biclustering is infeasible to search completely with current computational mechanisms, this package uses a greedy heuristic. The algorithm featured in this package is, to the best of our knowledge, the first biclustering algorithm to work on data with missing values. Li, J., Reisner, J., Pham, H., Olafsson, S., and Vardeman, S. (2020) Biclustering with Missing Data. Information Sciences, 510, 304–316.

Maintained by John Reisner. Last updated 4 years ago.

0.5 match 3 stars 4.18 score 4 scripts

bioc

easier:Estimate Systems Immune Response from RNA-seq data

This package provides a workflow for the use of the EaSIeR tool, developed to assess patients' likelihood to respond to ICB therapies providing just the patients' RNA-seq data as input. We integrate RNA-seq data with different types of prior knowledge to extract quantitative descriptors of the tumor microenvironment from several points of view, including composition of the immune repertoire, and activity of intra- and extra-cellular communications. Then, we use multi-task machine learning trained on TCGA data to identify how these descriptors can simultaneously predict several state-of-the-art hallmarks of anti-cancer immune response. In this way we derive cancer-specific models and identify cancer-specific systems biomarkers of immune response. These biomarkers have been experimentally validated in the literature and the performance of EaSIeR predictions has been validated using independent datasets from four different cancer types with patients treated with anti-PD1 or anti-PDL1 therapy.

Maintained by Oscar Lapuente-Santana. Last updated 5 months ago.

geneexpression software transcription systemsbiology pathways genesetenrichment immunooncology epigenetics classification biomedicalinformatics regression experimenthubsoftware

0.5 match 4.20 score 16 scripts

thiyangt

tsdataleaks:Exploit Data Leakages in Time Series Forecasting Competitions

Forecasting competitions are of increasing importance as a means to learn best practices and gain knowledge. Data leakage is one of the most common issues found in competitions. Data leaks can happen when the training data contains information about the test data. For example: randomly chosen blocks of time series are concatenated to form a new time series, scale-shifts, repeating patterns in time series, white noise is added to the original time series to form a new time series, etc. The 'tsdataleaks' package can be used to detect data leakages in a collection of time series.

Maintained by Thiyanga S. Talagala. Last updated 1 year ago.

0.5 match 3 stars 4.18 score 8 scripts

ketsiaguichard

telraamStats:Retrieval and Visualization of Mobility Data from 'Telraam' Sensors

Streamline the processing of 'Telraam' data, sourced from open data mobility sensors. These tools range from data retrieval (without the need for API knowledge) to data visualization, including data preprocessing.

Maintained by Ketsia Guichard. Last updated 10 months ago.

0.5 match 1 stars 4.04 score 11 scripts

bioc

UNDO:Unsupervised Deconvolution of Tumor-Stromal Mixed Expressions

UNDO is an R package for unsupervised deconvolution of tumor and stromal mixed expression data. It detects marker genes and deconvolutes the mixed expression data without any prior knowledge.

Maintained by Niya Wang. Last updated 5 months ago.

software

0.5 match 4.00 score 6 scripts

cran

AMModels:Adaptive Management Model Manager

Helps enable adaptive management by codifying knowledge in the form of models generated from numerous analyses and data sets. Facilitates this process by storing all models and data sets in a single object that can be updated and saved, thus tracking changes in knowledge through time. A shiny application called AM Model Manager (modelMgr()) enables the use of these functions via a GUI.

Maintained by Jon Katz. Last updated 6 years ago.

0.8 match 2.58 score 19 scripts

rezamoammadi

liver:"Eating the Liver of Data Science"

Offers a suite of helper functions to simplify various data science techniques for non-experts. This package aims to enable individuals with only a minimal level of coding knowledge to become acquainted with these techniques in an accessible manner. Inspired by an ancient Persian idiom, we liken this process to "eating the liver of data science," suggesting a deep and intimate engagement with the field of data science. This package includes functions for tasks such as data partitioning for out-of-sample testing, calculating Mean Squared Error (MSE) to assess prediction accuracy, and data transformations (z-score and min-max). In addition to these helper functions, the 'liver' package also features several intriguing datasets valuable for multivariate analysis.

Maintained by Reza Mohammadi. Last updated 4 months ago.

0.5 match 4.00 score 67 scripts

jingxuanh

xtune:Regularized Regression with Feature-Specific Penalties Integrating External Information

Extends standard penalized regression (Lasso, Ridge, and Elastic-net) to allow feature-specific shrinkage based on external information with the goal of achieving a better prediction accuracy and variable selection. Examples of external information include the grouping of predictors, prior knowledge of biological importance, external p-values, function annotations, etc. The choice of multiple tuning parameters is done using an Empirical Bayes approach. A majorization-minimization algorithm is employed for implementation.

Maintained by Jingxuan He. Last updated 2 years ago.

0.5 match 3.90 score 16 scripts

pbiecek

proton:The Proton Game

'The Proton Game' is a console-based data-crunching game for younger and older data scientists. Act as a data-hacker and find Slawomir Pietraszko's credentials to the Proton server. You have to solve four data-based puzzles to find the login and password. There are many ways to solve these puzzles. You may use loops, data filtering, ordering, aggregation or other tools. Only basic knowledge of R is required to play the game, yet the more functions you know, the more approaches you can try. Knowledge of dplyr is not required but may be very helpful. This game is linked with the "Pietraszko's Cave" story available at http://biecek.pl/BetaBit/Warsaw. It is part of the Beta and Bit series. You will find more about the Beta and Bit series at http://biecek.pl/BetaBit.

Maintained by Przemysław Biecek. Last updated 9 years ago.

0.8 match 2.49 score 312 scripts

molaison

MantaID:A Machine-Learning Based Tool to Automate the Identification of Biological Database IDs

The number of biological databases is growing rapidly, but different databases use different IDs to refer to the same biological entity. The inconsistency in IDs impedes the integration of various types of biological data. To resolve the problem, we developed 'MantaID', a data-driven, machine-learning based approach that automates identifying IDs on a large scale. The 'MantaID' model's prediction accuracy was proven to be 99%, and it correctly and effectively predicted 100,000 ID entries within two minutes. 'MantaID' supports the discovery and exploitation of ID patterns from large numbers of databases (e.g., up to 542 biological databases). An easy-to-use, freely available, open-source R package, a user-friendly web application, and an API were also developed for 'MantaID' to improve applicability. To our knowledge, 'MantaID' is the first tool that enables an automatic, quick, accurate, and comprehensive identification of large quantities of IDs, and can therefore be used as a starting point to facilitate the complex assimilation and aggregation of biological data across diverse databases.

Maintained by Zhengpeng Zeng. Last updated 6 months ago.

0.5 match 3.78 score 2 scripts

jo-theo

shinySbm:'shiny' Application to Use the Stochastic Block Model

A 'shiny' interface for a simpler use of the 'sbm' R package. It also contains useful functions to easily explore the 'sbm' package results. With this package you should be able to use the stochastic block model without any knowledge of R, get automatic reports and nice visuals, and learn the basic functions of 'sbm'.

Maintained by Theodore Vanrenterghem. Last updated 1 year ago.

0.5 match 3.70 score 6 scripts

oxfordihtm

codigo:Interface to the International Classification of Diseases (ICD) API

The International Classification of Diseases (ICD) serves a broad range of uses globally and provides critical knowledge on the extent, causes and consequences of human disease and death worldwide via data that is reported and coded with the ICD. The ICD API allows programmatic access to the ICD. It is an HTTP-based REST API. This package provides functions that interface with the ICD API.

Maintained by Ernest Guevarra. Last updated 5 months ago.

diseases icd icd-10 icd-11

0.5 match 4 stars 3.68 score 6 scripts 1 dependents

laurabruckman

netSEM:Network Structural Equation Modeling

Network structural equation modeling conducts a network statistical analysis on a data frame of coincident observations of multiple continuous variables [1]. It builds a pathway model by exploring a pool of domain knowledge guided candidate statistical relationships between each of the variable pairs, selecting the 'best fit' on the basis of a specific criterion such as adjusted r-squared value. This material is based upon work supported by the U.S. National Science Foundation Award EEC-2052776 and EEC-2052662 for the MDS-Rely IUCRC Center, under the NSF Solicitation: NSF 20-570 Industry-University Cooperative Research Centers Program. [1] Bruckman, Laura S., Nicholas R. Wheeler, Junheng Ma, Ethan Wang, Carl K. Wang, Ivan Chou, Jiayang Sun, and Roger H. French. (2013) <doi:10.1109/ACCESS.2013.2267611>.

Maintained by Laura S. Bruckman. Last updated 2 years ago.

0.5 match 3.72 score 13 scripts

bioc

CNORfuzzy:Addon to CellNOptR: Fuzzy Logic

This package is an extension to CellNOptR. It contains additional functionality needed to simulate and train a prior knowledge network to experimental data using constrained fuzzy logic (cFL, rather than Boolean logic as is the case in CellNOptR). Additionally, this package will contain functions to use for the compilation of multiple optimization results (either Boolean or cFL).

Maintained by T. Cokelaer. Last updated 5 months ago.

network

0.5 match 3.60 score 7 scripts

sistia01

DWLS:Gene Expression Deconvolution Using Dampened Weighted Least Squares

The rapid development of single-cell transcriptomic technologies has helped uncover the cellular heterogeneity within cell populations. However, bulk RNA-seq continues to be the main workhorse for quantifying gene expression levels due to technical simplicity and low cost. To most effectively extract information from bulk data given the new knowledge gained from single-cell methods, we have developed a novel algorithm to estimate the cell-type composition of bulk data from a single-cell RNA-seq-derived cell-type signature. Comparison with existing methods using various real RNA-seq data sets indicates that our new approach is more accurate and comprehensive than previous methods, especially for the estimation of rare cell types. More importantly, our method can detect cell-type composition changes in response to external perturbations, thereby providing a valuable, cost-effective method for dissecting the cell-type-specific effects of drug treatments or condition changes. As such, our method is applicable to a wide range of biological and clinical investigations. Dampened weighted least squares ('DWLS') is an estimation method for gene expression deconvolution, in which the cell-type composition of a bulk RNA-seq data set is computationally inferred. This method corrects common biases towards cell types that are characterized by highly expressed genes and/or are highly prevalent, to provide accurate detection across diverse cell types. See: <https://www.nature.com/articles/s41467-019-10802-z.pdf> for more information about the development of 'DWLS' and the methods behind our functions.

Maintained by Adriana Sistig. Last updated 3 years ago.

0.5 match 2 stars 3.62 score 42 scripts

bioc

omada:Machine learning tools for automated transcriptome clustering analysis

Symptomatic heterogeneity in complex diseases reveals differences in molecular states that need to be investigated. However, selecting the numerous parameters of an exploratory clustering analysis in RNA profiling studies requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent and further gene association analyses need to be performed independently. We have developed a suite of tools to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning based functions. The efficiency of each tool was tested with four datasets characterised by different expression signal strengths. Our toolkit’s decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Even in datasets with less clear biological distinctions, stable subgroups with different expression profiles and clinical associations were found.

Maintained by Sokratis Kariotis. Last updated 5 months ago.

software clustering rnaseq geneexpression

0.5 match 3.60 score 5 scripts

tjebo

eyedata:Open Source Ophthalmic Data Sets Curated for R

Open source data allows for reproducible research and helps advance our knowledge. The purpose of this package is to collate open source ophthalmic data sets curated for direct use. This is real-life data from people receiving intravitreal injections of anti-vascular endothelial growth factor (anti-VEGF) for age-related macular degeneration or diabetic macular edema. Associated publications of the data sets: Fu et al. (2020) <doi:10.1001/jamaophthalmol.2020.5044>, Moraes et al. (2020) <doi:10.1016/j.ophtha.2020.09.025>, Fasler et al. (2019) <doi:10.1136/bmjopen-2018-027441>, Arpa et al. (2020) <doi:10.1136/bjophthalmol-2020-317161>, Kern et al. (2020) <doi:10.1038/s41433-020-1048-0>.

Maintained by Tjebo Heeren. Last updated 4 years ago.

0.5 match 4 stars 3.48 score 15 scripts

r4goodacademy

R4GoodPersonalFinances:Make Better Financial Decisions

Make informed, data-driven decisions for your personal or household finances. Use tools and methods that are selected carefully to align with academic consensus, bridging the gap between theoretical knowledge and practical application. They assist you in finding optimal asset allocation, preparing for retirement or financial independence, calculating optimal spending, and more. For more details see: Haghani V., White J. (2023, ISBN:978-1-119-74791-8), Idzorek T., Kaplan P. (2024, ISBN:9781952927379).

Maintained by Kamil Wais. Last updated 3 days ago.

financial-independence fire optimal-asset-allocations optimal-spending personal-finances retirement

0.5 match 1 stars 3.40 score