Showing 200 of 259 results
chockemeyer
kstMatrix:Basic Functions in Knowledge Space Theory Using Matrix Representation
Knowledge space theory by Doignon and Falmagne (1999) <doi:10.1007/978-3-642-58625-5> is a set- and order-theoretical framework, which proposes mathematical formalisms to operationalize knowledge structures in a particular domain. The 'kstMatrix' package provides basic functionalities to generate, handle, and manipulate knowledge structures and knowledge spaces. In contrast to the 'kst' package, 'kstMatrix' uses matrix representations for knowledge structures. Furthermore, 'kstMatrix' contains several knowledge spaces developed by the research group around Cornelia Dowling through querying experts.
Maintained by Cord Hockemeyer. Last updated 2 months ago.
52.1 match 2 stars 3.43 score 15 scripts 1 dependents
chockemeyer
kst:Knowledge Space Theory
Knowledge space theory by Doignon and Falmagne (1999) <doi:10.1007/978-3-642-58625-5> is a set- and order-theoretical framework, which proposes mathematical formalisms to operationalize knowledge structures in a particular domain. The 'kst' package provides basic functionalities to generate, handle, and manipulate knowledge structures and knowledge spaces.
Maintained by Cord Hockemeyer. Last updated 2 years ago.
37.5 match 6 stars 3.36 score 38 scripts
patzaw
TKCat:Tailored Knowledge Catalog
Facilitate the management of data from knowledge resources that are frequently used alone or together in research environments. In 'TKCat', knowledge resources are manipulated as modeled database (MDB) objects. These objects provide access to the data tables along with a general description of the resource and a detailed data model documenting the tables, their fields and their relationships. These MDBs are then gathered in catalogs that can be easily explored and shared. Finally, 'TKCat' provides tools to easily subset, filter and combine MDBs and create new catalogs suited for specific needs.
Maintained by Patrice Godard. Last updated 2 days ago.
19.8 match 5 stars 6.08 score 27 scripts
soodoku
guess:Adjust Estimates of Learning for Guessing
Adjust Estimates of Learning for Guessing. The package provides standard guessing correction, and a latent class model that leverages informative pre-post transitions. For details of the latent class model, see <http://gsood.com/research/papers/guess.pdf>.
Maintained by Gaurav Sood. Last updated 3 years ago.
19.4 match 3 stars 4.29 score 13 scripts
kjhealy
gssrdoc:Document General Social Survey Variable
The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.
Maintained by Kieran Healy. Last updated 11 months ago.
36.3 match 2.28 score 38 scripts
bioc
fobitools:Tools for Manipulating the FOBI Ontology
A set of tools for interacting with the Food-Biomarker Ontology (FOBI). A collection of basic manipulation tools for biological significance analysis, graphs, and text mining strategies for annotating nutritional data.
Maintained by Pol Castellano-Escuder. Last updated 4 months ago.
massspectrometry, metabolomics, software, visualization, biomedicalinformatics, graphandnetwork, annotation, cheminformatics, pathways, genesetenrichment, biological-intrerpretation, biological-knowledge, biological-significance-analysis, enrichment-analysis, food-biomarker-ontology, knowledge-graph, nutrition, obofoundry, ontology, text-mining
15.0 match 1 stars 5.08 score 5 scripts
spectra-to-knowledge
SpectraToQueries:Spectra to queries
SpectraToQueries provides the infrastructure to translate spectra to queries.
Maintained by Adriano Rutz. Last updated 21 days ago.
knowledge extraction, spectral information, querying system
22.5 match 1 stars 3.02 score
cran
pks:Probabilistic Knowledge Structures
Fitting and testing probabilistic knowledge structures, especially the basic local independence model (BLIM, Doignon & Falmagne, 1999) and the simple learning model (SLM), using the minimum discrepancy maximum likelihood (MDML) method (Heller & Wickelmaier, 2013 <doi:10.1016/j.endm.2013.05.145>).
Maintained by Florian Wickelmaier. Last updated 6 months ago.
24.3 match 1 stars 2.78 score 2 dependents
bioc
CellNOptR:Training of boolean logic models of signalling networks using prior knowledge networks and perturbation data
This package does optimisation of boolean logic networks of signalling pathways based on a previous knowledge network and a set of data upon perturbation of the nodes in the network.
Maintained by Attila Gabor. Last updated 5 months ago.
cellbasedassays, cellbiology, proteomics, pathways, network, timecourse, immunooncology
8.8 match 6.72 score 98 scripts 6 dependents
tkcaccia
KODAMA:Knowledge Discovery by Accuracy Maximization
An unsupervised and semi-supervised learning algorithm that performs feature extraction from noisy and high-dimensional data. It facilitates identification of patterns representing underlying groups on all samples in a data set. Based on Cacciatore S, Tenori L, Luchinat C, Bennett PR, MacIntyre DA. (2017) Bioinformatics <doi:10.1093/bioinformatics/btw705> and Cacciatore S, Luchinat C, Tenori L. (2014) Proc Natl Acad Sci USA <doi:10.1073/pnas.1220873111>.
Maintained by Stefano Cacciatore. Last updated 23 hours ago.
8.1 match 1 stars 7.00 score 63 scripts 1 dependents
hope-data-science
akc:Automatic Knowledge Classification
A tidy framework for automatic knowledge classification and visualization. Currently, the core functionality of the framework is mainly supported by modularity-based clustering (community detection) in keyword co-occurrence network, and focuses on co-word analysis of bibliometric research. However, the designed functions in 'akc' are general, and could be extended to solve other tasks in text mining as well.
Maintained by Tian-Yuan Huang. Last updated 20 days ago.
9.7 match 15 stars 5.85 score 47 scripts
erossiter
catSurv:Computerized Adaptive Testing for Survey Research
Provides methods of computerized adaptive testing for survey researchers. See Montgomery and Rossiter (2020) <doi:10.1093/jssam/smz027>. Includes functionality for data fit with the classic item response methods including the latent trait model, Birnbaum's three parameter model, the graded response, and the generalized partial credit model. Additionally, includes several ability parameter estimation and item selection routines. During item selection, all calculations are done in compiled C++ code.
Maintained by Erin Rossiter. Last updated 10 months ago.
11.5 match 12 stars 4.68 score 3 scripts
annajenul
UBayFS:A User-Guided Bayesian Framework for Ensemble Feature Selection (UBayFS)
Implements the user-guided Bayesian framework for ensemble feature selection (UBayFS): Jenul et al. (2022) <doi:10.1007/s10994-022-06221-9>.
Maintained by Anna Jenul. Last updated 2 years ago.
bayesian-statistics, ensemble-models, feature-selection, user-knowledge
10.3 match 5 stars 5.11 score 13 scripts
jimbrig
rtraining:R Training Resources, Guides, Tips, and Knowledge Base
Houses various material related to teaching R.
Maintained by Jimmy Briggs. Last updated 2 years ago.
best-practices, curation, developer-tools, development, development-environment, guide, knowledge, package-development, setup, shiny-apps, tips-and-tricks, training, training-materials, walkthrough
12.9 match 4 stars 3.60 score 6 scripts
friendly
vcdExtra:'vcd' Extensions and Additions
Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.
Maintained by Michael Friendly. Last updated 5 months ago.
categorical-data-visualization, generalized-linear-models, mosaic-plots
4.0 match 24 stars 10.34 score 472 scripts 3 dependents
nliulab
AutoScore:An Interpretable Machine Learning-Based Automatic Clinical Score Generator
A novel interpretable machine learning-based framework to automate the development of a clinical scoring model for predefined outcomes. Our novel framework consists of six modules: variable ranking with machine learning, variable transformation, score derivation, model selection, domain knowledge-based score fine-tuning, and performance evaluation. The details are described in our research paper <doi:10.2196/21798>. Users or clinicians could seamlessly generate parsimonious sparse-score risk models (i.e., risk scores), which can be easily implemented and validated in clinical practice. We hope to see its application in various medical case studies.
Maintained by Feng Xie. Last updated 15 days ago.
5.3 match 32 stars 7.70 score 30 scripts
beerda
nuggets:Extensible Data Pattern Searching Framework
Extensible framework for subgroup discovery (Atzmueller (2015) <doi:10.1002/widm.1144>), contrast patterns (Chen (2022) <doi:10.48550/arXiv.2209.13556>), emerging patterns (Dong (1999) <doi:10.1145/312129.312191>), association rules (Agrawal (1994) <https://www.vldb.org/conf/1994/P487.PDF>) and conditional correlations (Hájek (1978) <doi:10.1007/978-3-642-66943-9>). Both crisp (Boolean, binary) and fuzzy data are supported. It generates conditions in the form of elementary conjunctions, evaluates them on a dataset and checks the induced sub-data for interesting statistical properties. A user-defined function may be supplied to evaluate each generated condition in search of custom patterns.
Maintained by Michal Burda. Last updated 4 days ago.
association-rule-mining, contrast-pattern-mining, data-mining, fuzzy, knowledge-discovery, pattern-recognition, cpp, openmp
7.5 match 2 stars 5.38 score 10 scripts
bayesball
LearnBayes:Learning Bayesian Inference
Contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.
Maintained by Jim Albert. Last updated 7 years ago.
3.4 match 38 stars 11.34 score 690 scripts 31 dependents
bioc
OmnipathR:OmniPath web service client and more
A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).
Maintained by Denes Turei. Last updated 19 days ago.
graphandnetwork, network, pathways, software, thirdpartyclient, dataimport, datarepresentation, genesignaling, generegulation, systemsbiology, transcriptomics, singlecell, annotation, kegg, complexes, enzyme-ptm, networks, networks-biology, omnipath, proteins, quarto
3.6 match 126 stars 9.90 score 226 scripts 2 dependents
paballand
EconGeo:Computing Key Indicators of the Spatial Distribution of Economic Activities
Functions to compute a series of indices commonly used in the fields of economic geography, economic complexity, and evolutionary economics to describe the location, distribution, spatial organization, structure, and complexity of economic activities. Functions include basic spatial indicators such as the location quotient, the Krugman specialization index, the Herfindahl or the Shannon entropy indices but also more advanced functions to compute different forms of normalized relatedness between economic activities or network-based measures of economic complexity. Most of the functions use matrix calculus and are based on bipartite (incidence) matrices consisting of region - industry pairs.
Maintained by Pierre-Alexandre Balland. Last updated 2 years ago.
6.8 match 41 stars 4.96 score 44 scripts
bioc
cosmosR:COSMOS (Causal Oriented Search of Multi-Omic Space)
COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets based on prior knowledge of signaling, metabolic, and gene regulatory networks. It estimates the activities of transcription factors and kinases and finds network-level causal reasoning. Thereby, COSMOS provides mechanistic hypotheses for experimental observations across multi-omics datasets.
Maintained by Attila Gabor. Last updated 5 months ago.
cellbiology, pathways, network, proteomics, metabolomics, transcriptomics, genesignaling, data-integration, metabolomic-data, network-modelling, phosphoproteomics
4.3 match 59 stars 7.22 score 35 scripts
gobbios
EloRating:Animal Dominance Hierarchies by Elo Rating
Provides functions to quantify animal dominance hierarchies. The major focus is on Elo rating and its ability to deal with temporal dynamics in dominance interaction sequences. For static data, David's score and de Vries' I&SI are also implemented. In addition, the package provides functions to assess transitivity, linearity and stability of dominance networks. See Neumann et al (2011) <doi:10.1016/j.anbehav.2011.07.016> for an introduction.
Maintained by Christof Neumann. Last updated 8 months ago.
4.5 match 4 stars 6.86 score 61 scripts 1 dependents
optimal-learning-lab
LKT:Logistic Knowledge Tracing
Computes Logistic Knowledge Tracing ('LKT') which is a general method for tracking human learning in an educational software system. Please see Pavlik, Eglington, and Harrel-Williams (2021) <https://ieeexplore.ieee.org/document/9616435>. 'LKT' is a method to compute features of student data that are used as predictors of subsequent performance. 'LKT' allows great flexibility in the choice of predictive components and features computed for these predictive components. The system is built on top of 'LiblineaR', which enables extremely fast solutions compared to base glm() in R.
Maintained by Philip I. Pavlik Jr. Last updated 9 months ago.
5.0 match 12 stars 5.84 score 29 scripts
enblacar
SCpubr:Generate Publication Ready Visualizations of Single Cell Transcriptomics Data
A system that provides a streamlined way of generating publication-ready plots for known single-cell transcriptomics data. That is, the goal is to automatically generate plots of the highest possible quality that can be used right away, or with minimal modifications, in a research article.
Maintained by Enrique Blanco-Carmona. Last updated 1 months ago.
software, singlecell, visualization, data-visualization, ggplot2, publication-quality-plots, seurat, single-cell, single-cell-genomics, single-cell-rna-seq
3.4 match 178 stars 8.71 score 194 scripts
pecanproject
PEcAn.priors:PEcAn Functions Used to Estimate Priors from Data
Functions to estimate priors from data.
Maintained by David LeBauer. Last updated 3 days ago.
bayesian, cyberinfrastructure, data-assimilation, data-science, ecosystem-model, ecosystem-science, forecasting, meta-analysis, national-science-foundation, pecan, plants, jags, cpp
2.9 match 216 stars 9.93 score 13 scripts 6 dependents
ropensci
refsplitr:author name disambiguation, author georeferencing, and mapping of coauthorship networks with 'Web of Science' data
Tools to parse and organize reference records downloaded from the 'Web of Science' citation database into an R-friendly format, disambiguate the names of authors, geocode their locations, and generate/visualize coauthorship networks. This package has been peer-reviewed by rOpenSci (v. 1.0).
Maintained by Emilio Bruna. Last updated 7 months ago.
name disambiguation, bibliometrics, coauthorship, collaboration, georeferencing, metascience, references, scientometrics, science of science, web of science
5.1 match 55 stars 5.64 score 16 scripts
barnzilla
capl:Compute and Visualize CAPL-2 Scores and Interpretations
A toolkit for computing and visualizing CAPL-2 (Canadian Assessment of Physical Literacy, Second Edition; <https://www.capl-eclp.ca>) scores and interpretations from raw data.
Maintained by Joel Barnes. Last updated 3 years ago.
7.0 match 2 stars 4.00 score 2 scripts
cran
DAKS:Data Analysis and Knowledge Spaces
Functions and an example dataset for the psychometric theory of knowledge spaces. This package implements data analysis methods and procedures for simulating data and quasi orders and transforming different formulations in knowledge space theory. See package?DAKS for an overview.
Maintained by Ali Uenlue. Last updated 9 years ago.
14.0 match 2.00 score
sbgraves237
Ecdat:Data Sets for Econometrics
Data sets for econometrics, including political science.
Maintained by Spencer Graves. Last updated 4 months ago.
3.8 match 2 stars 7.25 score 740 scripts 3 dependents
bioc
decoupleR:decoupleR: Ensemble of computational methods to infer biological activities from omics data
Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.
Maintained by Pau Badia-i-Mompel. Last updated 5 months ago.
differentialexpression, functionalgenomics, geneexpression, generegulation, network, software, statisticalmethod, transcription
2.3 match 230 stars 11.27 score 316 scripts 3 dependents
chockemeyer
kstIO:Knowledge Space Theory Input/Output
Knowledge space theory by Doignon and Falmagne (1999) <doi:10.1007/978-3-642-58625-5> is a set- and order-theoretical framework which proposes mathematical formalisms to operationalize knowledge structures in a particular domain. The 'kstIO' package provides basic functionalities to read and write KST data from/to files to be used together with the 'kst', 'kstMatrix', 'CDSS', 'pks', or 'DAKS' packages.
Maintained by Cord Hockemeyer. Last updated 2 months ago.
13.1 match 2.00 score 8 scripts
ropensci
ckanr:Client for the Comprehensive Knowledge Archive Network ('CKAN') API
Client for 'CKAN' API (<https://ckan.org/>). Includes interface to 'CKAN' 'APIs' for search, list, show for packages, organizations, and resources. In addition, provides an interface to the 'datastore' API.
Maintained by Francisco Alves. Last updated 2 years ago.
database, open-data, ckan, api, data, dataset, api-wrapper, ckan-api
2.9 match 100 stars 8.67 score 448 scripts 4 dependents
racorreia
gkgraphR:Accessing the Official 'Google Knowledge Graph' API
A simple way to interact with and extract data from the official 'Google Knowledge Graph' API <https://developers.google.com/knowledge-graph/>.
Maintained by Ricardo Correia. Last updated 4 years ago.
5.5 match 5 stars 4.40 score 3 scripts
kwb-r
kwb.endnote:Helper Functions for Analysing KWB Endnote Library (Exported as .xml)
Helper Functions For Analysing KWB Endnote Library (Exported As .XML).
Maintained by Michael Rustler. Last updated 4 years ago.
endnote, knowledge-repo, literature-data-management, project-fakin, publication
7.5 match 3.00 score 2 scripts
moosa-r
rbioapi:User-Friendly R Interface to Biologic Web Services' API
Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services. In a way that insulates the user from the technicalities of using web services API and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; New databases and services will be added periodically. Feel free to suggest any databases or services you often use.
Maintained by Moosa Rezwani. Last updated 1 months ago.
api-client, bioinformatics, biology, enrichment, enrichment-analysis, enrichr, jaspar, mieaa, over-representation-analysis, panther, reactome, string, uniprot
3.0 match 20 stars 7.60 score 55 scripts
joliencremers
bpnreg:Bayesian Projected Normal Regression Models for Circular Data
Fitting Bayesian multiple and mixed-effect regression models for circular data based on the projected normal distribution. Both continuous and categorical predictors can be included. Sampling from the posterior is performed via an MCMC algorithm. Posterior descriptives of all parameters, model fit statistics and Bayes factors for hypothesis tests for inequality constrained hypotheses are provided. See Cremers, Mulder & Klugkist (2018) <doi:10.1111/bmsp.12108> and Nuñez-Antonio & Gutiérrez-Peña (2014) <doi:10.1016/j.csda.2012.07.025>.
Maintained by Jolien Cremers. Last updated 1 years ago.
3.6 match 14 stars 6.15 score 101 scripts
bioc
CNORfeeder:Integration of CellNOptR to add missing links
This package integrates literature-constrained and data-driven methods to infer signalling networks from perturbation experiments. It permits extending a given network with links derived from the data via various inference methods and uses information on physical interactions of proteins to guide and validate the integration of links.
Maintained by Attila Gabor. Last updated 5 months ago.
cellbasedassays, cellbiology, proteomics, networkinference
6.0 match 3.60 score 9 scripts
bioc
deepSNV:Detection of subclonal SNVs in deep sequencing data.
This package provides quantitative variant callers for detecting subclonal mutations in ultra-deep (>=100x coverage) sequencing experiments. The deepSNV algorithm is used for a comparative setup with a control experiment of the same loci and uses a beta-binomial model and a likelihood ratio test to discriminate sequencing errors and subclonal SNVs. The shearwater algorithm computes a Bayes classifier based on a beta-binomial model for variant calling with multiple samples for precisely estimating model parameters - such as local error rates and dispersion - and prior knowledge, e.g. from variation databases such as COSMIC.
Maintained by Moritz Gerstung. Last updated 5 months ago.
geneticvariability, snp, sequencing, genetics, dataimport, curl, bzip2, xz-utils, zlib, cpp
3.3 match 6.53 score 38 scripts 1 dependents
cran
vannstats:Simplified Statistical Procedures for Social Sciences
Simplifies functions that assess normality for bivariate and multivariate statistical techniques. Includes functions designed to replicate plots and tables that would result from similar calls in 'SPSS', including hst(), box(), qq(), tab(), cormat(), and residplot(). Also includes simplified formulae, such as mode(), scatter(), p.corr(), ow.anova(), and rm.anova().
Maintained by Burrel Vann Jr. Last updated 2 months ago.
6.9 match 3.06 score
bioc
mixOmics:Omics Data Integration Project
Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.
Maintained by Eva Hamrud. Last updated 4 days ago.
immunooncology, microarray, sequencing, metabolomics, metagenomics, proteomics, geneprediction, multiplecomparison, classification, regression, bioconductor, genomics, genomics-data, genomics-visualization, multivariate-analysis, multivariate-statistics, omics, r-pkg, r-project
1.5 match 182 stars 13.71 score 1.3k scripts 22 dependents
manueleleonelli
bnRep:A Repository of Bayesian Networks from the Academic Literature
A collection of Bayesian networks (discrete, Gaussian, and conditional linear Gaussian) collated from recent academic literature. The 'bnRep_summary' object provides an overview of the Bayesian networks in the repository and the package documentation includes details about the variables in each network. A Shiny app to explore the repository can be launched with 'bnRep_app()' and is available online at <https://manueleleonelli.shinyapps.io/bnRep>. For details see <https://github.com/manueleleonelli/bnRep>.
Maintained by Manuele Leonelli. Last updated 6 months ago.
4.0 match 5 stars 5.10 score 7 scripts
kwb-r
algoliar:Simple Access to Algolia Search REST API
Simple Access to Algolia REST API (https://www.algolia.com/doc/rest-api/search/).
Maintained by Michael Rustler. Last updated 6 years ago.
academic, algolia, api, hugo, knowledge-repo, project-fakin, search
7.5 match 2.70 score
kwb-r
kwb.twitter:Simplify Access to Twitter Messages
Simplify access to Twitter messages.
Maintained by Hauke Sonnenberg. Last updated 3 years ago.
knowledge-repo, project-fakin, publication, social-network, twitter
7.5 match 2.70 score
ralmond
Peanut:Parameterized Bayesian Networks, Abstract Classes
This provides support for learning conditional probability tables parameterized using CPTtools. This provides an object-oriented layer on top of CPTtools, to facilitate calculations with parameterized models for Bayesian networks. Peanut is a collection of abstract classes and generic functions defining a protocol, with the intent that the protocol can be implemented with different Bayes net engines. The companion package PNetica provides an implementation using Netica and RNetica.
Maintained by Russell Almond. Last updated 2 years ago.
bayesian-network, knowledge-representation
7.5 match 1 stars 2.48 score 4 scripts 2 dependents
openpharma
DoseFinding:Planning and Analyzing Dose Finding Experiments
The DoseFinding package provides functions for the design and analysis of dose-finding experiments (with focus on pharmaceutical Phase II clinical trials). It provides functions for: multiple contrast tests, fitting non-linear dose-response models (using Bayesian and non-Bayesian estimation), calculating optimal designs and an implementation of the MCPMod methodology (Pinheiro et al. (2014) <doi:10.1002/sim.6052>).
Maintained by Marius Thomas. Last updated 5 days ago.
1.8 match 8 stars 10.32 score 98 scripts 10 dependents
bioc
TOAST:Tools for the analysis of heterogeneous tissues
This package is devoted to analyzing high-throughput data (e.g. gene expression microarray, DNA methylation microarray, RNA-seq) from complex tissues. Current functionalities include: (1) detecting cell-type specific or cross-cell type differential signals; (2) tree-based differential analysis; (3) improving variable selection in reference-free deconvolution; (4) partial reference-free deconvolution with prior knowledge.
Maintained by Ziyi Li. Last updated 5 months ago.
dnamethylation, geneexpression, differentialexpression, differentialmethylation, microarray, genetarget, epigenetics, methylationarray
2.3 match 11 stars 8.01 score 104 scripts 3 dependents
bioc
PCAN:Phenotype Consensus ANalysis (PCAN)
Phenotypes comparison based on a pathway consensus approach. Assess the relationship between candidate genes and a set of phenotypes based on additional genes related to the candidate (e.g. Pathways or network neighbors).
Maintained by Matthew Page. Last updated 5 months ago.
annotation, sequencing, genetics, functionalprediction, variantannotation, pathways, network
4.2 match 4.15 score 7 scripts
bioc
wppi:Weighting protein-protein interactions
Protein-protein interaction data is essential for omics data analysis and modeling. Database knowledge is general, not specific for cell type, physiological condition or any other context determining which connections are functional and contribute to the signaling. Functional annotations such as Gene Ontology and Human Phenotype Ontology might help to evaluate the relevance of interactions. This package predicts functional relevance of protein-protein interactions based on functional annotations such as Human Phenotype Ontology and Gene Ontology, and prioritizes genes based on network topology, functional scores and a path search algorithm.
Maintained by Ana Galhoz. Last updated 5 months ago.
graphandnetwork, network, pathways, software, genesignaling, genetarget, systemsbiology, transcriptomics, annotation, gene-ontology, gene-prioritization, human-phenotype-ontology, omnipath, ppi-networks, random-walk-with-restart, quarto
4.0 match 1 stars 4.30 score 4 scripts
r-forge
pcalg:Methods for Graphical Models and Causal Inference
Functions for causal structure learning and causal inference using graphical models. The main algorithms for causal structure learning are PC (for observational data without hidden variables), FCI and RFCI (for observational data with hidden variables), and GIES (for a mix of data from observational studies (i.e. observational data) and data from experiments involving interventions (i.e. interventional data) without hidden variables). For causal inference the IDA algorithm, the Generalized Backdoor Criterion (GBC), the Generalized Adjustment Criterion (GAC) and some related functions are implemented. Functions for incorporating background knowledge are provided.
Maintained by Markus Kalisch. Last updated 6 months ago.
2.3 match 7.32 score 700 scripts 19 dependents
bioc
derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach
This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bounded ChIP-seq peaks.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
differentialexpression, sequencing, rnaseq, chipseq, differentialpeakcalling, software, immunooncology, coverage, annotation-agnostic, bioconductor, derfinder
1.5 match 42 stars 10.03 score 78 scripts 6 dependents
bioc
recount:Explore and download data from the recount project
Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
coveragedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaredataimportimmunooncologyannotation-agnosticbioconductorcountderfinderdeseq2exongenehumanilluminajunctionrecount
1.5 match 41 stars 9.57 score 498 scripts 3 dependents
cran
DCL:Claims Reserving under the Double Chain Ladder Model
Statistical modelling and forecasting in claims reserving in non-life insurance under the Double Chain Ladder framework by Martinez-Miranda, Nielsen and Verrall (2012).
Maintained by Maria Dolores Martinez-Miranda. Last updated 3 years ago.
13.7 match 1 star 1.00 score
kwb-r
kwb.site:R Package for Scraping Our Official KWB Website (Before Re-Design in 2021)
This package contains functions for scraping our official [KWB website](https://kompetenz-wasser.de). The data for all projects and people can be collected to provide an overview of the website's content and to integrate that data into a KWB knowledge repo.
Maintained by Michael Rustler. Last updated 3 years ago.
knowledge-repoproject-fakinr-seleniumrvestweb-scrapingwebsite
8.0 match 1.70 score 2 scripts
bioc
ALDEx2:Analysis Of Differential Abundance Taking Sample and Scale Variation Into Account
A differential abundance analysis for the comparison of two or more conditions. Useful for analyzing data from standard RNA-seq or meta-RNA-seq assays as well as selected and unselected values from in-vitro sequence selections. Uses a Dirichlet-multinomial model to infer abundance from counts, optimized for three or more experimental replicates. The method infers biological and sampling variation to calculate the expected false discovery rate, given the variation, based on a Wilcoxon Rank Sum test and Welch's t-test (via aldex.ttest), a Kruskal-Wallis test (via aldex.kw), a generalized linear model (via aldex.glm), or a correlation test (via aldex.corr). All tests report predicted p-values and posterior Benjamini-Hochberg corrected p-values. ALDEx2 also calculates expected standardized effect sizes for paired or unpaired study designs. ALDEx2 can now be used to estimate the effect of scale on the results and report on the scale-dependent robustness of results.
Maintained by Greg Gloor. Last updated 5 months ago.
differentialexpressionrnaseqtranscriptomicsgeneexpressiondnaseqchipseqbayesiansequencingsoftwaremicrobiomemetagenomicsimmunooncologyscale simulationposterior p-value
1.3 match 28 stars 10.70 score 424 scripts 3 dependents
trinker
lexicon:Lexicons for Text Analysis
A collection of lexical hash tables, dictionaries, and word lists.
Maintained by Tyler Rinker. Last updated 3 years ago.
hashlexiconlookupnames-frequentstopwordstext-dictionariestext-mining
1.5 match 111 stars 8.80 score 224 scripts 25 dependents
bioc
KBoost:Inference of gene regulatory networks from gene expression data
Reconstructing gene regulatory networks and transcription factor activity is crucial to understand biological processes and holds potential for developing personalized treatment. Yet, it is still an open problem as state-of-the-art algorithms are often not able to handle large amounts of data. Furthermore, many of the present methods predict numerous false positives and are unable to integrate other sources of information such as previously known interactions. Here we introduce KBoost, an algorithm that uses kernel PCA regression, boosting and Bayesian model averaging for fast and accurate reconstruction of gene regulatory networks. KBoost can also use a prior network built on previously known transcription factor targets. We have benchmarked KBoost using three different datasets against other high performing algorithms. The results show that our method compares favourably to other methods across datasets.
Maintained by Luis F. Iglesias-Martinez. Last updated 5 months ago.
networkgraphandnetworkbayesiannetworkinferencegeneregulationtranscriptomicssystemsbiologytranscriptiongeneexpressionregressionprincipalcomponent
2.8 match 4 stars 4.60 score 9 scripts
kalimu
GitAI:Extracts Knowledge from 'Git' Repositories
Scan multiple 'Git' repositories, pull the content of specified files and process it with large language models. You can summarize the content in a specific way, extract information and data, or find answers to your questions about the repositories. The output can be stored in a vector database and used for semantic search or as part of a RAG (Retrieval Augmented Generation) prompt.
Maintained by Kamil Wais. Last updated 25 days ago.
4.8 match 2.70 score 5 scripts
bioc
recount3:Explore and download data from the recount3 project
The recount3 package enables access to a large amount of uniformly processed RNA-seq data from human and mouse. You can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level with sample metadata and QC statistics. In addition we provide access to sample coverage BigWig files.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
coveragedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaredataimportannotation-agnosticbioconductorcountderfinderexongenehumanilluminajunctionmouserecountrecount3
1.5 match 33 stars 8.03 score 216 scripts
cran
BKT:Bayesian Knowledge Tracing Model
Fitting, cross-validating, and predicting with Bayesian Knowledge Tracing (BKT) models. It is designed for analyzing educational datasets to trace student knowledge over time. The package includes functions for fitting BKT models, evaluating their performance using various metrics, and making predictions on new data. It provides functionality similar to that of the Python package pyBKT authored by Zachary A. Pardos (zp@berkeley.edu) at <https://github.com/CAHLR/pyBKT>.
Maintained by Yuhao Yuan. Last updated 1 month ago.
5.9 match 2.00 score
bioc
biocthis:Automate package and project setup for Bioconductor packages
This package expands the usethis package with the goal of helping automate the process of creating R packages for Bioconductor or making them Bioconductor-friendly.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
softwarereportwritingactionsbioconductorbiocthisgithubstylerusethis
1.5 match 51 stars 7.78 score 4 scripts 1 dependent
bioc
benchdamic:Benchmark of differential abundance methods on microbiome data
Starting from a microbiome dataset (16S or WMS with absolute count values) it is possible to perform several analyses to assess the performance of many differential abundance detection methods. A basic and standardized version of the main differential abundance analysis methods is supplied but users can also add their own method to the benchmark. The analyses focus on 4 main aspects: i) the goodness of fit of each method's distributional assumptions on the observed count data, ii) the ability to control the false discovery rate, iii) the within and between method concordances, iv) the truthfulness of the findings if any a priori knowledge is given. Several graphical functions are available for result visualization.
Maintained by Matteo Calgaro. Last updated 4 months ago.
metagenomicsmicrobiomedifferentialexpressionmultiplecomparisonnormalizationpreprocessingsoftwarebenchmarkdifferential-abundance-methods
2.0 match 6 stars 5.73 score 8 scripts
shanpengli
PDXpower:Time to Event Outcome in Experimental Designs of Pre-Clinical Studies
Conduct simulation-based customized power calculation for clustered time to event data in a mixed crossed/nested design, where a number of cell lines and a number of mice within each cell line are considered to achieve a desired statistical power, motivated by Eckel-Passow and colleagues (2021) <doi:10.1093/neuonc/noab137> and Li and colleagues (2024) <doi:10.48550/arXiv.2404.08927>. This package provides two commonly used models for powering a design: a linear mixed effects model and a Cox frailty model. Both models account for within-subject (cell line) correlation while holding different distributional assumptions about the outcome. Alternatively, fixed effects counterparts of both models are also available, which produce similar estimates of statistical power.
Maintained by Shanpeng Li. Last updated 2 months ago.
3.1 match 1 star 3.65 score 2 scripts
microsoft
wpa:Tools for Analysing and Visualising Viva Insights Data
Opinionated functions that enable easier and faster analysis of Viva Insights data. There are three main types of functions in 'wpa': (1) Standard functions create a 'ggplot' visual or a summary table based on a specific Viva Insights metric; (2) Report Generation functions generate HTML reports on a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover more specific applications (e.g. Subject Line text mining) of Viva Insights data. This package adheres to 'tidyverse' principles and works well with the pipe syntax. 'wpa' is built with beginner-to-intermediate R users in mind, and is optimised for simplicity.
Maintained by Martin Chan. Last updated 4 months ago.
1.7 match 30 stars 6.69 score 39 scripts 1 dependent
teachinglab
tlShiny:Supplies essential functions to Teaching Lab dashboards
A bunch of random functions I use in developing dashboards. Needs to vastly reduce the number of dependencies at the moment.
Maintained by Duncan Gates. Last updated 13 days ago.
3.6 match 3.04 score
jonesor
Rage:Life History Metrics from Matrix Population Models
Functions for calculating life history metrics using matrix population models ('MPMs'). Described in Jones et al. (2021) <doi:10.1101/2021.04.26.441330>.
Maintained by Owen Jones. Last updated 3 months ago.
1.3 match 11 stars 8.17 score 62 scripts 1 dependent
bioc
regionReport:Generate HTML or PDF reports for a set of genomic regions or DESeq2/edgeR results
Generate HTML or PDF reports to explore a set of regions such as the results from annotation-agnostic expression analysis of RNA-seq data at base-pair resolution performed by derfinder. You can also create reports for DESeq2 or edgeR results.
Maintained by Leonardo Collado-Torres. Last updated 2 months ago.
differentialexpressionsequencingrnaseqsoftwarevisualizationtranscriptioncoveragereportwritingdifferentialmethylationdifferentialpeakcallingimmunooncologyqualitycontrolbioconductorderfinderdeseq2edgerregionreportrmarkdown
1.5 match 9 stars 7.22 score 46 scripts
tlverse
tmle3:The Extensible TMLE Framework
A general framework supporting the implementation of targeted maximum likelihood estimators (TMLEs) of a diverse range of statistical target parameters through a unified interface. The goal is that the exposed framework be as general as the mathematical framework upon which it draws.
Maintained by Jeremy Coyle. Last updated 4 months ago.
causal-inferencemachine-learningtargeted-learningvariable-importance
1.3 match 38 stars 7.91 score 286 scripts 5 dependents
celehs
PheCAP:High-Throughput Phenotyping with EHR using a Common Automated Pipeline
Implement surrogate-assisted feature extraction (SAFE) and common machine learning approaches to train and validate phenotyping models. Background and details about the methods can be found at Zhang et al. (2019) <doi:10.1038/s41596-019-0227-6>, Yu et al. (2017) <doi:10.1093/jamia/ocw135>, and Liao et al. (2015) <doi:10.1136/bmj.h1885>.
Maintained by PARSE LTD. Last updated 4 years ago.
1.7 match 21 stars 6.02 score 8 scripts
jeroen
curl:A Modern and Flexible Web Client for R
Bindings to 'libcurl' <https://curl.se/libcurl/> for performing fully configurable HTTP/FTP requests where responses can be processed in memory, on disk, or streaming via the callback or connection interfaces. Some knowledge of 'libcurl' is recommended; for a more user-friendly web client see the 'httr2' package which builds on this package with HTTP-specific tools and logic.
Maintained by Jeroen Ooms. Last updated 23 days ago.
0.5 match 224 stars 19.98 score 4.0k scripts 5.9k dependents
microsoft
vivainsights:Analyze and Visualize Data from 'Microsoft Viva Insights'
Provides a versatile range of functions, including exploratory data analysis, time-series analysis, organizational network analysis, and data validation, while implementing a set of best practices in analyzing and visualizing data specific to 'Microsoft Viva Insights'.
Maintained by Martin Chan. Last updated 24 days ago.
1.7 match 11 stars 6.12 score 68 scripts
vascobranco
gecko:Geographical Ecology and Conservation Knowledge Online
Includes a collection of geographical analysis functions aimed primarily at ecology and conservation science studies, allowing processing of both point and raster data. Now integrates SPECTRE (<https://biodiversityresearch.org/spectre/>), a dataset of global geospatial threat data, developed by the authors.
Maintained by Vasco V. Branco. Last updated 3 months ago.
conservation-scienceecologyspatial-analysis
3.0 match 5 stars 3.40 score 4 scripts
bioc
megadepth:megadepth: BigWig and BAM related utilities
This package provides an R interface to Megadepth by Christopher Wilks available at https://github.com/ChristopherWilks/megadepth. It is particularly useful for computing the coverage of a set of genomic regions across bigWig or BAM files. With this package, you can build base-pair coverage matrices for regions or annotations of your choice from BigWig files. Megadepth was used to create the raw files provided by https://bioconductor.org/packages/recount3.
Maintained by David Zhang. Last updated 3 months ago.
softwarecoveragedataimporttranscriptomicsrnaseqpreprocessingbambigwigdasptermegadepthrecount2recount3
1.5 match 12 stars 6.69 score 7 scripts 3 dependents
bioc
FELLA:Interpretation and enrichment for metabolomics data
Enrichment of metabolomics data using KEGG entries. Given a set of affected compounds, FELLA suggests affected reactions, enzymes, modules and pathways using label propagation in a knowledge model network. The resulting subnetwork can be visualised and exported.
Maintained by Sergio Picart-Armada. Last updated 5 months ago.
softwaremetabolomicsgraphandnetworkkegggopathwaysnetworknetworkenrichment
2.3 match 4.41 score 32 scripts
bioc
iSEEtree:Interactive visualisation for microbiome data
iSEEtree is an extension of iSEE for the TreeSummarizedExperiment. It leverages the functionality from the miaViz package for microbiome data visualisation to create panels that are specific for TreeSummarizedExperiment objects. Not surprisingly, it also depends on the generic panels from iSEE.
Maintained by Giulio Benedetti. Last updated 6 days ago.
microbiomesoftwarevisualizationguishinyappsdataimportshiny-appsvisualisation
1.5 match 3 stars 6.26 score 5 scripts
bioc
derfinderHelper:derfinder helper package
Helper package for speeding up the derfinder package when using multiple cores. This package is particularly useful when using BiocParallel and it helps reduce the time spent loading the full derfinder package when running the F-statistics calculation in parallel.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
differentialexpressionsequencingrnaseqsoftwareimmunooncologybioconductorderfinder
1.5 match 6.20 score 7 dependents
frictionlessdata
tableschema.r:Table Schema 'Frictionless Data'
Allows working with 'Table Schema' (<https://specs.frictionlessdata.io/table-schema/>). 'Table Schema' is well suited for use cases around handling and validating tabular data in text formats such as 'csv', but its utility extends well beyond this core usage, towards a range of applications where data benefits from a portable schema format. The 'tableschema.r' package can load and validate any table schema descriptor, create and modify descriptors, and expose methods for reading and streaming data that conforms to a 'Table Schema' via the 'Tabular Data Resource' abstraction.
Maintained by Kleanthis Koupidis. Last updated 2 years ago.
1.6 match 25 stars 5.70 score 101 scripts
malaga-fca-group
fcaR:Formal Concept Analysis
Provides tools to perform fuzzy formal concept analysis, presented in Wille (1982) <doi:10.1007/978-3-642-01815-2_23> and in Ganter and Obiedkov (2016) <doi:10.1007/978-3-662-49291-8>. It provides functions to load and save a formal context, extract its concept lattice and implications. In addition, one can use the implications to compute semantic closures of fuzzy sets and, thus, build recommendation systems.
Maintained by Domingo Lopez Rodriguez. Last updated 2 years ago.
1.5 match 6 stars 6.02 score 70 scripts
ipbes-data
IPBES.R:Tool functions used by the Data and Knowledge Technical Support Unit of IPBES
More about what it does (maybe more than one line).
Maintained by Rainer M. Krug. Last updated 1 year ago.
4.4 match 2 stars 2.00 score 10 scripts
ropensci
bowerbird:Keep a Collection of Sparkly Data Resources
Tools to get and maintain a data repository from third-party data providers.
Maintained by Ben Raymond. Last updated 5 days ago.
ropensciantarcticsouthern oceandataenvironmentalsatelliteclimatepeer-reviewed
1.2 match 50 stars 7.16 score 16 scripts 1 dependent
deboerk
cocron:Statistical Comparisons of Two or more Alpha Coefficients
Statistical tests for the comparison between two or more alpha coefficients based on either dependent or independent groups of individuals. A web interface is available at http://comparingcronbachalphas.org. A plugin for the R GUI and IDE RKWard is included. Please install RKWard from https://rkward.kde.org to use this feature. The respective R package 'rkward' cannot be installed directly from a repository, as it is a part of RKWard.
Maintained by Birk Diedenhofen. Last updated 9 years ago.
4.0 match 2.12 score 22 scripts
bioc
iSEEindex:iSEE extension for a landing page to a custom collection of data sets
This package provides an interface to any collection of data sets within a single iSEE web-application. The main functionality of this package is to define a custom landing page allowing app maintainers to list a custom collection of data sets that users can select from and directly load objects into an iSEE web-application.
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
softwareinfrastructurebioconductorhacktoberfest
1.5 match 2 stars 5.65 score 8 scripts
paul-buerkner
brms:Bayesian Regression Models using 'Stan'
Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.
Maintained by Paul-Christian Bürkner. Last updated 3 days ago.
bayesian-inferencebrmsmultilevel-modelsstanstatistical-models
0.5 match 1.3k stars 16.61 score 13k scripts 34 dependents
bioc
iSEEhub:iSEE for the Bioconductor ExperimentHub
This package defines a custom landing page for an iSEE app interfacing with the Bioconductor ExperimentHub. The landing page allows users to browse the ExperimentHub, select a data set, download and cache it, and import it directly into a Bioconductor iSEE app.
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
dataimportimmunooncology infrastructureshinyappssinglecellsoftwarebioconductorbioconductor-packagehacktoberfestisee
1.5 match 3 stars 5.56 score 4 scripts
bioc
chevreulProcess:Tools for managing SingleCellExperiment objects as projects
Tools for analyzing SingleCellExperiment objects as projects, for input into the Chevreul app downstream. Includes functions for analysis of single cell RNA sequencing data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Maintained by Kevin Stachelek. Last updated 1 month ago.
coveragernaseqsequencingvisualizationgeneexpressiontranscriptionsinglecelltranscriptomicsnormalizationpreprocessingqualitycontroldimensionreductiondataimport
1.5 match 5.38 score 2 scripts 2 dependents
bioc
iSEEde:iSEE extension for panels related to differential expression analysis
This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.
Maintained by Kevin Rue-Albrecht. Last updated 4 months ago.
softwareinfrastructuredifferentialexpressionbioconductorhacktoberfestiseeu
1.5 match 1 star 5.38 score 15 scripts
bioc
qsvaR:Generate Quality Surrogate Variable Analysis for Degradation Correction
The qsvaR package contains functions for removing the effect of degradation in RNA-seq data from postmortem brain tissue. The package is equipped to help users generate principal components associated with degradation. The components can be used in differential expression analysis to remove the effects of degradation.
Maintained by Hedia Tnani. Last updated 3 months ago.
softwareworkflowstepnormalizationbiologicalquestiondifferentialexpressionsequencingcoveragebioconductorbraindegradationhumanqsva
1.5 match 5.26 score 4 scripts
wjawaid
enrichR:Provides an R Interface to 'Enrichr'
Provides an R interface to all 'Enrichr' databases. 'Enrichr' is a web-based tool that analyses gene sets and returns any enrichment of common annotated biological features. Quoting from their website 'Enrichment analysis is a computational method for inferring knowledge about an input gene set by comparing it to annotated gene sets representing prior biological knowledge.' See <https://maayanlab.cloud/Enrichr/> for further details.
Maintained by Wajid Jawaid. Last updated 1 month ago.
0.8 match 90 stars 9.96 score 7 dependents
bioc
regutools:regutools: an R package for data extraction from RegulonDB
RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools provides researchers with the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks.
Maintained by Joselyn Chavez. Last updated 3 months ago.
generegulationgeneexpressionsystemsbiologynetworknetworkinferencevisualizationtranscriptionbioconductorcdsbregulondb
1.5 match 4 stars 5.20 score 6 scripts
bioc
TREG:Tools for finding Total RNA Expression Genes in single nucleus RNA-seq data
RNA abundance and cell size parameters could improve RNA-seq deconvolution algorithms to more accurately estimate cell type proportions given the different cell type transcription activity levels. A Total RNA Expression Gene (TREG) can facilitate estimating total RNA content using single molecule fluorescent in situ hybridization (smFISH). We developed a data-driven approach using a measure of expression invariance to find candidate TREGs in postmortem human brain single nucleus RNA-seq. This R package implements the method for identifying candidate TREGs from snRNA-seq data.
Maintained by Louise Huuki-Myers. Last updated 3 months ago.
softwaresinglecellrnaseqgeneexpressiontranscriptomicstranscriptionsequencingbioconductordeconvolutionrnascopescrna-seqsmfishsnrna-seqtreg
1.5 match 4 stars 5.20 score 5 scripts
patzaw
ReDaMoR:Relational Data Modeler
The aim of this package is to manipulate relational data models in R. It provides functions to create, modify and export data models in json format. It also allows importing models created with 'MySQL Workbench' (<https://www.mysql.com/products/workbench/>). These functions are accessible through a graphical user interface made with 'shiny'. Constraints such as types, keys, uniqueness and mandatory fields are automatically checked and corrected when editing a model. Finally, real data can be checked against a model to verify their compatibility.
Maintained by Patrice Godard. Last updated 24 days ago.
1.3 match 17 stars 6.24 score 17 scripts 1 dependent
bioc
snapcount:R/Bioconductor Package for interfacing with Snaptron for rapid querying of expression counts
snapcount is a client interface to the Snaptron web services, which support querying by gene name or genomic region. Results include raw expression counts derived from alignment of RNA-seq samples and/or various summarized measures of expression across one or more regions/genes per-sample (e.g. percent spliced in).
Maintained by Rone Charles. Last updated 5 months ago.
coveragegeneexpressionrnaseqsequencingsoftwaredataimport
1.5 match 3 stars 5.19 score 13 scripts
bioc
chevreulShiny:Tools for managing SingleCellExperiment objects as projects
Tools for managing SingleCellExperiment objects as projects. Includes functions for analysis and visualization of single-cell data. Also included is a shiny app for visualization of pre-processed scRNA data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Maintained by Kevin Stachelek. Last updated 14 days ago.
coveragernaseqsequencingvisualizationgeneexpressiontranscriptionsinglecelltranscriptomicsnormalizationpreprocessingqualitycontroldimensionreductiondataimport
1.5 match 5.08 score
bioc
chevreulPlot:Plots used in the chevreulPlot package
Tools for plotting SingleCellExperiment objects in the chevreulPlot package. Includes functions for analysis and visualization of single-cell data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Maintained by Kevin Stachelek. Last updated 18 days ago.
coveragernaseqsequencingvisualizationgeneexpressiontranscriptionsinglecelltranscriptomicsnormalizationpreprocessingqualitycontroldimensionreductiondataimport
1.5 match 5.08 score 2 scripts
tidymodels
hardhat:Construct Modeling Packages
Building modeling packages is hard. A large amount of effort generally goes into providing an implementation for a new method that is efficient, fast, and correct, but often less emphasis is put on the user interface. A good interface requires specialized knowledge about S3 methods and formulas, which the average package developer might not have. The goal of 'hardhat' is to reduce the burden around building new modeling packages by providing functionality for preprocessing, predicting, and validating input.
Maintained by Hannah Frick. Last updated 2 months ago.
0.5 match 103 stars 14.88 score 175 scripts 436 dependents
bioc
derfinderPlot:Plotting functions for derfinder
This package provides plotting functions for results from the derfinder package. This helps separate the graphical dependencies required for making these plots from the core functionality of derfinder.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
differentialexpressionsequencingrnaseqsoftwarevisualizationimmunooncologybioconductorderfinder
1.5 match 2 stars 5.00 score 5 scripts
bioc
netprioR:A model for network-based prioritisation of genes
A model for semi-supervised prioritisation of genes integrating network data, phenotypes and additional prior knowledge about TP and TN gene labels from the literature or experts.
Maintained by Fabian Schmich. Last updated 5 months ago.
immunooncologycellbasedassayspreprocessingnetwork
1.9 match 4.00 score 1 script
bioc
iSEEpathways:iSEE extension for panels related to pathway analysis
This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of pathway analysis results. This package does not perform pathway analysis. Instead, it provides methods to embed precomputed pathway analysis results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
softwareinfrastructuredifferentialexpressiongeneexpressionguivisualizationpathwaysgenesetenrichmentgoshinyappsbioconductorhacktoberfestiseeiseeu
1.5 match 1 star 4.95 score 10 scripts
bioc
awst:Asymmetric Within-Sample Transformation
We propose an Asymmetric Within-Sample Transformation (AWST) to regularize RNA-seq read counts and reduce the effect of noise on the classification of samples. AWST comprises two main steps: standardization and smoothing. These steps transform gene expression data to reduce the noise of the lowly expressed features, which suffer from background effects and low signal-to-noise ratio, and the influence of the highly expressed features, which may be the result of amplification bias and other experimental artifacts.
Maintained by Davide Risso. Last updated 5 months ago.
normalizationgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecell
1.5 match 3 stars 4.95 score 15 scripts
program--
fipio:Lightweight Federal Information Processing System (FIPS) Code Information Retrieval
Provides a lightweight suite of functions for retrieving information about 5-digit or 2-digit US FIPS codes.
Maintained by Justin Singh-Mohudpur. Last updated 1 year ago.
information-retrievalspatialus-data
1.6 match 14 stars 4.77 score 14 scripts 2 dependents
cran
GoogleKnowledgeGraphR:Retrieve Information from 'Google Knowledge Graph' API
Allows you to retrieve information from the 'Google Knowledge Graph' API <https://www.google.com/intl/bn/insidesearch/features/search/knowledge.html> and process it in R in various forms. The 'Knowledge Graph Search' API lets you find entities in the 'Google Knowledge Graph'. The API uses standard 'schema.org' types and is compliant with the 'JSON-LD' specification.
Maintained by Daniel Schmeh. Last updated 7 years ago.
7.3 match 1.00 score
skranz
gtree:gtree basic functionality to model and solve games
gtree basic functionality to model and solve games
Maintained by Sebastian Kranz. Last updated 4 years ago.
economic-experimentseconomicsgambitgame-theorynash-equilibrium
1.9 match 18 stars 3.79 score 23 scripts 1 dependent
rrwen
nbc4va:Bayes Classifier for Verbal Autopsy Data
An implementation of the Naive Bayes Classifier (NBC) algorithm used for Verbal Autopsy (VA) built on code from Miasnikof et al. (2015) <DOI:10.1186/s12916-015-0521-2>.
Maintained by Richard Wen. Last updated 3 years ago.
autopsybayescauseclassifiercodedcomputerdeathestimateimputationlearningmachinemdsmillionnaivenbcprobabilitystudytheoryvaverbal
1.5 match 4.60 score 79 scripts
rickhelmus
RDCOMClient:R-DCOM client
Provides dynamic client-side access to (D)COM applications from within R.
Maintained by Duncan Temple Lang. Last updated 1 year ago.
1.8 match 3.90 score 315 scripts
jorgeklz
moc.gapbk:Multi-Objective Clustering Algorithm Guided by a-Priori Biological Knowledge
Implements the Multi-Objective Clustering Algorithm Guided by a-Priori Biological Knowledge (MOC-GaPBK) which was proposed by Parraga-Alava, J. et al. (2018) <doi:10.1186/s13040-018-0178-4>.
Maintained by Jorge Parraga-Alava. Last updated 7 months ago.
5.0 match 1.30 score 1 scripts
kalifa-manjang
GOxploreR:Structural Exploration of the Gene Ontology (GO) Knowledge Base
It provides an effective, efficient, and fast way to explore the Gene Ontology (GO). Given a set of genes, the package contains functions to assess the GO and obtain the terms associated with the genes and the levels of the GO terms. The package provides functions for the three GO ontologies. The methods are discussed in detail in the following article <doi:10.1038/s41598-020-73326-3>.
Maintained by Kalifa Manjang. Last updated 1 year ago.
2.9 match 2.26 score 18 scripts
bioc
slingshot:Tools for ordering single-cell sequencing
Provides functions for inferring continuous, branching lineage structures in low-dimensional data. Slingshot was designed to model developmental trajectories in single-cell RNA sequencing data and serve as a component in an analysis pipeline after dimensionality reduction and clustering. It is flexible enough to handle arbitrarily many branching events and allows for the incorporation of prior knowledge through supervised graph construction.
Maintained by Kelly Street. Last updated 5 months ago.
clustering differentialexpression geneexpression rnaseq sequencing software singlecell transcriptomics visualization
0.5 match 283 stars 12.01 score 1.0k scripts 4 dependents
ecmerkle
smdata:Data to Accompany Smithson & Merkle, 2013
Contains data files to accompany Smithson & Merkle (2013), Generalized Linear Models for Categorical and Continuous Limited Dependent Variables.
Maintained by Ed Merkle. Last updated 7 years ago.
4.0 match 1.46 score 29 scripts
edwinkipruto
mfp2:Multivariable Fractional Polynomial Models with Extensions
Multivariable fractional polynomial algorithm simultaneously selects variables and functional forms in both generalized linear models and Cox proportional hazard models. Key references are Royston and Altman (1994) <doi:10.2307/2986270> and Royston and Sauerbrei (2008, ISBN:978-0-470-02842-1). In addition, it can model a sigmoid relationship between variable x and an outcome variable y using the approximate cumulative distribution transformation proposed by Royston (2014) <doi:10.1177/1536867X1401400206>. This feature distinguishes it from a standard fractional polynomial function, which lacks the ability to achieve such modeling.
Maintained by Edwin Kipruto. Last updated 10 months ago.
1.1 match 3 stars 5.26 score 4 scripts 2 dependents
cran
DiceOptim:Kriging-Based Optimization for Computer Experiments
Efficient Global Optimization (EGO) algorithm as described in "Roustant et al. (2012)" <doi:10.18637/jss.v051.i01> and adaptations for problems with noise ("Picheny and Ginsbourger, 2012") <doi:10.1016/j.csda.2013.03.018>, parallel infill, and problems with constraints.
Maintained by Victor Picheny. Last updated 4 years ago.
1.9 match 4 stars 3.11 score 107 scripts 1 dependents
cubiczebra
TPMplt:Tool-Kit for Dynamic Materials Model and Thermal Processing Maps
Provides a simple approach for constructing the dynamic materials model suggested by Prasad and Gegel (1984) <doi:10.1007/BF02664902>. It can also easily generate various processing maps based on this model. The calculation results contain the full set of materials constants, information about the power dissipation efficiency factor, and rheological properties; they can be exported in full for further analysis and customized plots.
Maintained by Chen Zhang. Last updated 6 months ago.
1.2 match 2 stars 4.76 score 29 scripts
bioc
systemPipeShiny:systemPipeShiny: An Interactive Framework for Workflow Management and Visualization
systemPipeShiny (SPS) extends the widely used systemPipeR (SPR) workflow environment with a versatile graphical user interface provided by a Shiny App. This allows non-R users, such as experimentalists, to run many of systemPipeR’s workflow design, control, and visualization functionalities interactively without requiring knowledge of R. Most importantly, SPS has been designed as a general purpose framework for interacting with other R packages in an intuitive manner. Like most Shiny Apps, SPS can be used on both local computers and centralized server-based deployments that can be accessed remotely as a public web service for using SPR’s functionalities with community and/or private data. The framework can integrate many core packages from the R/Bioconductor ecosystem. Examples of SPS’ current functionalities include: (a) interactive creation of experimental designs and metadata using an easy to use tabular editor or file uploader; (b) visualization of workflow topologies combined with auto-generation of R Markdown preview for interactively designed workflows; (c) access to a wide range of data processing routines; and (d) an extendable set of visualization functionalities. Complex visual results can be managed on a 'Canvas Workbench', allowing users to organize and compare plots in an efficient manner, combined with a session snapshot feature to continue work at a later time. The present suite of pre-configured visualization examples. The modular design of SPR makes it easy to design custom functions without any knowledge of Shiny, as well as to extend the environment in the future with contributions from the community.
Maintained by Le Zhang. Last updated 5 months ago.
shinyapps infrastructure dataimport sequencing qualitycontrol reportwriting experimentaldesign clustering bioconductor bioconductor-package data-visualization shiny systempiper
0.8 match 33 stars 7.03 score 36 scripts
cran
FamilyRank:Algorithm for Ranking Predictors Using Graphical Domain Knowledge
Grows families of features by selecting features that maximize a weighted score calculated from empirical feature scores and graphical knowledge. The final weighted score for a feature is determined by summing a feature's family-weighted scores across all families in which the feature appears.
Maintained by Michelle Saul. Last updated 4 years ago.
5.1 match 1.00 score 6 scripts
jafarilab
NIMAA:Nominal Data Mining Analysis
Functions for nominal data mining based on bipartite graphs, which build a pipeline for analysis and missing values imputation. Methods are mainly from the paper Jafari, Mohieddin, et al. (2021) <doi:10.1101/2021.03.18.436040>; some new ones are also included.
Maintained by Mohieddin Jafari. Last updated 2 years ago.
1.1 match 4 stars 4.30 score 7 scripts
sanchezi
kfino:Kalman Filter for Impulse Noised Outliers
A method for detecting impulse-noised outliers with a Kalman filter and for prediction on cleaned data. 'kfino' is a robust sequential algorithm that can filter data with a large number of outliers. This algorithm is based on simple latent linear Gaussian processes as in the Kalman Filter method and is devoted to detecting impulse-noised outliers. These are data points that differ significantly from other observations. 'ML' (Maximum Likelihood) and 'EM' (Expectation-Maximization) algorithms were implemented in 'kfino'. The method is described in full detail in the following arXiv e-Print: <arXiv:2208.00961>.
Maintained by Isabelle Sanchez. Last updated 2 years ago.
1.6 match 3.00 score 6 scripts
olgalezhnina
dtreg:Interact with Data Type Registries and Create Machine-Readable Data
You can load a schema from a DTR (data type registry) as an R object. Use this schema to write your data in JSON-LD (JavaScript Object Notation for Linked Data) format to make it machine readable.
Maintained by Olga Lezhnina. Last updated 30 days ago.
1.5 match 3.18 score 4 scripts
cran
wPerm:Permutation Tests
Supplies permutation-test alternatives to traditional hypothesis-test procedures such as two-sample tests for means, medians, and standard deviations; correlation tests; tests for homogeneity and independence; and more. Suitable for general audiences, including individual and group users, introductory statistics courses, and more advanced statistics courses that desire an introduction to permutation tests.
Maintained by Neil A. Weiss. Last updated 9 years ago.
3.6 match 1.30 score
bioc
debCAM:Deconvolution by Convex Analysis of Mixtures
An R package for fully unsupervised deconvolution of complex tissues. It provides basic functions to perform unsupervised deconvolution on mixture expression profiles by Convex Analysis of Mixtures (CAM) and some auxiliary functions to help understand the subpopulation-specific results. It also implements functions to perform supervised deconvolution based on prior knowledge of molecular markers, S matrix or A matrix. Combining molecular markers from CAM and from prior knowledge can achieve semi-supervised deconvolution of mixtures.
Maintained by Lulu Chen. Last updated 5 months ago.
software cellbiology geneexpression openjdk
0.8 match 7 stars 5.69 score 14 scripts
bioc
sccomp:Tests differences in cell-type proportion for single-cell data, robust to outliers
A robust and outlier-aware method for testing differences in cell-type proportion in single-cell data. This model can infer changes in tissue composition and heterogeneity, and can produce realistic data simulations based on any existing dataset. This model can also transfer knowledge from a large set of integrated datasets to increase accuracy further.
Maintained by Stefano Mangiola. Last updated 2 days ago.
bayesian regression differentialexpression singlecell metagenomics flowcytometry spatial batch-correction composition cytof differential-proportion microbiome multilevel proportions random-effects single-cell unwanted-variation
0.5 match 99 stars 8.43 score 69 scripts
bendeivide
leem:Laboratory of Teaching to Statistics and Mathematics
An educational package for the teaching of statistics and mathematics in primary and higher education. The objective is to assist in teaching/learning for both student study planning and teacher teaching strategies. The leem package aims to bring knowledge of statistics and mathematics, in a way that is both simple and in-depth, to everyone who wants to study these areas. The main function of the package is the 'leem' function.
Maintained by Ben Deivide. Last updated 17 days ago.
0.8 match 4 stars 5.33 score 152 scripts
usepa
ctxR:Utilities for Interacting with the 'CTX' APIs
Access chemical, hazard, bioactivity, and exposure data from the Computational Toxicology and Exposure ('CTX') APIs <https://www.epa.gov/comptox-tools/computational-toxicology-and-exposure-apis>. 'ctxR' was developed to streamline the process of accessing the information available through the 'CTX' APIs without requiring prior knowledge of how to use APIs. Most data is also available on the CompTox Chemical Dashboard ('CCD') <https://comptox.epa.gov/dashboard/> and other resources found at the EPA Computational Toxicology and Exposure Online Resources <https://www.epa.gov/comptox-tools>.
Maintained by Paul Kruse. Last updated 2 months ago.
0.5 match 10 stars 8.02 score 13 scripts 1 dependents
cogdisreslab
KinaseTauScore:Tau Scores For Human Kinases
This data package provides the tau scores for each kinase based on its activity in Alzheimer's disease samples. The data was generated by using an siRNA library to knock down individual kinases and then measuring the total Tau protein expression and the phospho-Tau protein expression. The resulting data was deposited online. This package processes the resulting data to create a meaningful Tau Score for each kinase based on its activity.
Maintained by Ali Sajid Imami. Last updated 3 years ago.
experimentdata proteome expressiondata
1.5 match 2.70 score
bioc
mistyR:Multiview Intercellular SpaTial modeling framework
mistyR is an implementation of the Multiview Intercellular SpaTial modeling framework (MISTy). MISTy is an explainable machine learning framework for knowledge extraction and analysis of single-cell, highly multiplexed, spatially resolved data. MISTy facilitates an in-depth understanding of marker interactions by profiling the intra- and intercellular relationships. MISTy is a flexible framework able to process a custom number of views. Each of these views can describe a different spatial context, i.e., define a relationship among the observed expressions of the markers, such as intracellular regulation or paracrine regulation. The views can also capture cell-type specific relationships, capture relations between functional footprints, or focus on relations between different anatomical regions. Each MISTy view is considered as a potential source of variability in the measured marker expressions. Each MISTy view is then analyzed for its contribution to the total expression of each marker and is explained in terms of the interactions with other measurements that led to the observed contribution.
Maintained by Jovan Tanevski. Last updated 5 months ago.
software biomedicalinformatics cellbiology systemsbiology regression decisiontree singlecell spatial bioconductor biology intercellular machine-learning modular molecular-biology multiview spatial-transcriptomics
0.5 match 51 stars 7.87 score 160 scripts
bioc
ASSIGN:Adaptive Signature Selection and InteGratioN (ASSIGN)
ASSIGN is a computational tool to evaluate the pathway deregulation/activation status in individual patient samples. ASSIGN employs a flexible Bayesian factor analysis approach that adapts predetermined pathway signatures derived either from knowledge-based literature or from perturbation experiments to the cell-/tissue-specific pathway signatures. The deregulation/activation level of each context-specific pathway is quantified to a score, which represents the extent to which a patient sample encompasses the pathway deregulation/activation signature.
Maintained by Ying Shen. Last updated 5 months ago.
software geneexpression pathways bayesian
0.5 match 2 stars 7.37 score 65 scripts 1 dependents
celehs
kesernetwork:Visualization of the KESER Network
A shiny app to visualize the knowledge networks for the code concepts. Using co-occurrence matrices of EHR codes from Veterans Affairs (VA) and Massachusetts General Brigham (MGB), the knowledge extraction via sparse embedding regression (KESER) algorithm was used to construct knowledge networks for the code concepts. Background and details about the method can be found at Chuan et al. (2021) <doi:10.1038/s41746-021-00519-z>.
Maintained by Su-Chun Cheng. Last updated 2 years ago.
0.9 match 1 stars 4.00 score 7 scripts
bioc
scAnnotatR:Pretrained learning models for cell type prediction on single cell RNA-sequencing data
The package comprises a set of pretrained machine learning models to predict basic immune cell types. This enables all users to quickly get a first annotation of the cell types present in their dataset without requiring prior knowledge. scAnnotatR also allows users to train their own models to predict new cell types based on specific research needs.
Maintained by Johannes Griss. Last updated 5 months ago.
singlecell transcriptomics geneexpression supportvectormachine classification software
0.5 match 15 stars 6.73 score 20 scripts
reconhub
earlyR:Estimation of Transmissibility in the Early Stages of a Disease Outbreak
Implements a simple, likelihood-based estimation of the reproduction number (R0) using a branching process with a Poisson likelihood. This model requires knowledge of the serial interval distribution, and dates of symptom onsets. Infectiousness is determined by weighting R0 by the probability mass function of the serial interval on the corresponding day. It is a simplified version of the model introduced by Cori et al. (2013) <doi:10.1093/aje/kwt133>.
Maintained by Thibaut Jombart. Last updated 4 years ago.
0.5 match 9 stars 6.59 score 96 scripts
bioc
ViSEAGO:ViSEAGO: a Bioconductor package for clustering biological functions using Gene Ontology and semantic similarity
The main objective of the ViSEAGO package is to carry out data mining of biological functions and establish links between genes involved in the study. We developed ViSEAGO in R to facilitate functional Gene Ontology (GO) analysis of complex experimental designs with multiple comparisons of interest. It allows large-scale datasets to be studied together and GO profiles to be visualized to capture biological knowledge. The acronym stands for three major concepts of the analysis: Visualization, Semantic similarity and Enrichment Analysis of Gene Ontology. It provides access to the latest GO annotations, which are retrieved from one of the NCBI EntrezGene, Ensembl or Uniprot databases for several species. Using available R packages and novel developments, ViSEAGO extends classical functional GO analysis to focus on functional coherence by aggregating closely related biological themes while studying multiple datasets at once. It provides both a synthetic and a detailed view using interactive functionalities that respect the GO graph structure and ensure functional coherence supplied by semantic similarity. ViSEAGO has been successfully applied to several datasets from different species with a variety of biological questions. Results can be easily shared between bioinformaticians and biologists, enhancing reporting capabilities while maintaining reproducibility.
Maintained by Aurelien Brionne. Last updated 2 months ago.
software annotation go genesetenrichment multiplecomparison clustering visualization
0.5 match 6.64 score 22 scripts
cran
arakno:ARAchnid KNowledge Online
Allows the user to connect with the World Spider Catalogue (WSC; <https://wsc.nmbe.ch/>) and the World Spider Trait (WST; <https://spidertraits.sci.muni.cz/>) databases. Also performs several basic functions such as checking name validity, retrieving coordinate data from the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/>), and mapping.
Maintained by Pedro Cardoso. Last updated 3 years ago.
3.3 match 1.00 score
cran
spidR:Spider Knowledge Online
Allows the user to connect with the World Spider Catalogue (WSC; <https://wsc.nmbe.ch/>) and the World Spider Trait (WST; <https://spidertraits.sci.muni.cz/>) databases. Also performs several basic functions such as checking name validity, retrieving coordinate data from the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/>), and mapping.
Maintained by Pedro Cardoso. Last updated 3 years ago.
3.3 match 1.00 score
bstatcomp
bayes4psy:User Friendly Bayesian Data Analysis for Psychology
Contains several Bayesian models for data analysis of psychological tests. A user-friendly interface for these models should enable students and researchers to perform professional-level Bayesian data analysis without advanced knowledge of programming and Bayesian statistics. This package is based on the Stan platform (Carpenter et al. 2017 <doi:10.18637/jss.v076.i01>).
Maintained by Jure Demšar. Last updated 1 year ago.
0.5 match 14 stars 6.44 score 33 scripts
luckinet
ontologics:Code-Logics to Handle Ontologies
Provides tools to build and work with an ontology of linked (open) data in a tidy workflow. It is inspired by the Food and Agriculture Organization's (FAO) caliper platform <https://www.fao.org/statistics/caliper/web/> and makes use of the Simple Knowledge Organisation System (SKOS).
Maintained by Steffen Ehrmann. Last updated 2 months ago.
0.5 match 3 stars 6.39 score 17 scripts 1 dependents
usepa
ccdR:Utilities for Interacting with the 'CTX' APIs
Access chemical, hazard, bioactivity, and exposure data from the Computational Toxicology and Exposure ('CTX') APIs <https://api-ccte.epa.gov/docs/>. 'ccdR' was developed to streamline the process of accessing the information available through the 'CTX' APIs without requiring prior knowledge of how to use APIs. Most data is also available on the CompTox Chemical Dashboard ('CCD') <https://comptox.epa.gov/dashboard/> and other resources found at the EPA Computational Toxicology and Exposure Online Resources <https://www.epa.gov/comptox-tools>.
Maintained by Paul Kruse. Last updated 8 months ago.
0.5 match 2 stars 6.38 score 7 scripts
data-cleaning
dcmodify:Modify Data Using Externally Defined Modification Rules
Data cleaning scripts typically contain a lot of 'if this change that' type of statements. Such statements are typically condensed expert knowledge. With this package, such 'data modifying rules' are taken out of the code and instead become parameters to the workflow. This allows one to maintain, document, and reason about data modification rules as separate entities.
Maintained by Mark van der Loo. Last updated 9 months ago.
0.5 match 10 stars 6.24 score 58 scripts
egeulgen
driveR:Prioritizing Cancer Driver Genes Using Genomics Data
Cancer genomes contain large numbers of somatic alterations but few genes drive tumor development. Identifying cancer driver genes is critical for precision oncology. Most current approaches identify driver genes either based on mutational recurrence or using estimated scores that predict the functional consequences of mutations. 'driveR' is a tool for personalized or batch analysis of genomic data for driver gene prioritization by combining genomic information and prior biological knowledge. As features, 'driveR' uses coding impact metaprediction scores, non-coding impact scores, somatic copy number alteration scores, hotspot gene/double-hit gene condition, 'phenolyzer' gene scores and memberships to cancer-related KEGG pathways. It uses these features to estimate the cancer-type-specific probability of each gene being a cancer driver using the related task of a multi-task learning classification model. The method is described in detail in Ulgen E, Sezerman OU. 2021. driveR: a novel method for prioritizing cancer driver genes using somatic genomics data. BMC Bioinformatics <doi:10.1186/s12859-021-04203-7>.
Maintained by Ege Ulgen. Last updated 2 years ago.
cancer-driverness driver driver-gene-prioritization identify-driver-genes ranking-genes scoring
0.5 match 15 stars 6.29 score 260 scripts
bioc
martini:GWAS Incorporating Networks
martini deals with the low power inherent to GWAS studies by using prior knowledge represented as a network. SNPs are the vertices of the network, and the edges represent biological relationships between them (genomic adjacency, belonging to the same gene, physical interaction between protein products). The network is scanned using SConES, which looks for groups of SNPs maximally associated with the phenotype, that form a close subnetwork.
Maintained by Hector Climente-Gonzalez. Last updated 5 months ago.
software genomewideassociation snp geneticvariability genetics featureextraction graphandnetwork network bioinformatics genomics gwas network-analysis snps systems-biology cpp
0.5 match 4 stars 6.16 score 30 scripts
g-rho
clustMixType:k-Prototypes Clustering for Mixed Variable-Type Data
Functions to perform k-prototypes partitioning clustering for mixed variable-type data according to Z.Huang (1998): Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Variables, Data Mining and Knowledge Discovery 2, 283-304.
Maintained by Gero Szepannek. Last updated 9 months ago.
0.5 match 1 stars 6.07 score 111 scripts 8 dependents
mmedl94
lionfish:Interactive 'tourr' Using 'python'
Extends the functionality of the 'tourr' package with an interactive graphical user interface. The interactivity allows users to effortlessly refine their 'tourr' results by manual intervention, which allows for integration of expert knowledge and aids the interpretation of results. For more information on 'tourr' see Wickham et al. (2011) <doi:10.18637/jss.v040.i02> or <https://github.com/ggobi/tourr>.
Maintained by Matthias Medl. Last updated 5 days ago.
data-sience data-visualization dimensionality-reduction exploratory-data-analysis interactive interactive-visualizations tourr
0.5 match 1 stars 5.96 score
nicwir
QurvE:Robust and User-Friendly Analysis of Growth and Fluorescence Curves
High-throughput analysis of growth curves and fluorescence data using three methods: linear regression, growth model fitting, and smooth spline fit. Analysis of dose-response relationships via smoothing splines or dose-response models. Complete data analysis workflows can be executed in a single step via user-friendly wrapper functions. The results of these workflows are summarized in detailed reports as well as intuitively navigable 'R' data containers. A 'shiny' application provides access to all features without requiring any programming knowledge. The package is described in further detail in Wirth et al. (2023) <doi:10.1038/s41596-023-00850-7>.
Maintained by Nicolas T. Wirth. Last updated 1 year ago.
0.5 match 25 stars 6.00 score 7 scripts
bioc
omicsViewer:Interactive and explorative visualization of SummarizedExperiment or ExpressionSet using omicsViewer
omicsViewer visualizes ExpressionSet (or SummarizedExperiment) in an interactive way. The omicsViewer has a separate back- and front-end. In the back-end, users need to prepare an ExpressionSet that contains all the necessary information for the downstream data interpretation. Some extra requirements on the headers of phenotype data or feature data are imposed so that the provided information can be clearly recognized by the front-end while keeping modifications to the existing ExpressionSet object to a minimum. The pure dependency on R/Bioconductor guarantees maximum flexibility in the statistical analysis in the back-end. Once the ExpressionSet is prepared, it can be visualized using the front-end, implemented by shiny and plotly. Both features and samples could be selected from (data) tables or graphs (scatter plot/heatmap). Different types of analyses, such as enrichment analysis (using Bioconductor package fgsea or Fisher's exact test) and STRING network analysis, will be performed on the fly and the results are visualized simultaneously. When a subset of samples and a phenotype variable is selected, a significance test on means (t-test or rank-based test; when phenotype variable is quantitative) or test of independence (chi-square or Fisher's exact test; when phenotype data is categorical) will be performed to test the association between the phenotype of interest with the selected samples. Additionally, other analyses can be easily added as extra shiny modules. Therefore, omicsViewer will greatly facilitate data exploration: many different hypotheses can be explored in a short time without the need for knowledge of R. In addition, the resulting data could be easily shared using a shiny server. Otherwise, a standalone version of omicsViewer together with designated omics data could be easily created by integrating it with portable R, which can be shared with collaborators or submitted as supplementary data together with a manuscript.
Maintained by Chen Meng. Last updated 2 months ago.
software visualization genesetenrichment differentialexpression motifdiscovery network networkenrichment
0.5 match 4 stars 6.02 score 22 scripts
bioc
BindingSiteFinder:Binding site definition based on iCLIP data
Precise knowledge of the binding sites of an RNA-binding protein (RBP) is key to understanding (post-) transcriptional regulatory processes. Here we present a workflow that describes how exact binding sites can be defined from iCLIP data. The package provides functions for binding site definition and result visualization. For details please see the vignette.
Maintained by Mirko Brüggemann. Last updated 13 hours ago.
sequencing geneexpression generegulation functionalgenomics coverage dataimport binding-site-classification binding-sites bioconductor-package iclip rna-binding-proteins
0.5 match 6 stars 5.73 score 3 scripts
strancsus
scCAN:Single-Cell Clustering using Autoencoder and Network Fusion
A single-cell clustering method using 'Autoencoder' and network fusion ('scCAN'), described in Bang Tran (2022) <doi:10.1038/s41598-022-14218-6>, for segregating the cells from the high-dimensional 'scRNA-Seq' data. The software automatically determines the optimal number of clusters and then partitions the cells in a way such that the results are robust to noise and dropouts. 'scCAN' is fast and it supports Windows, Linux, and Mac OS.
Maintained by Bang Tran. Last updated 9 months ago.
1.1 match 2.70 score
weinijiahuan123
offlineChange:Detect Multiple Change Points from Time Series
Detect the number and locations of change points. The locations can be either exact or in terms of ranges, depending on the available computational resource. The method is based on Jie Ding, Yu Xiang, Lu Shen, Vahid Tarokh (2017) <doi:10.1109/TSP.2017.2711558>.
Maintained by Jiahuan Ye. Last updated 5 years ago.
1.1 match 2.70 score 3 scripts
giopogg
webSDM:Including Known Interactions in Species Distribution Models
A collection of tools to fit and work with trophic Species Distribution Models. Trophic Species Distribution Models combine knowledge of trophic interactions with Bayesian structural equation models that model each species as a function of its prey (or predators) and environmental conditions. It exploits the topological ordering of the known trophic interaction network to predict species distribution in space and/or time, where the prey (or predator) distribution is unavailable. The method implemented by the package is described in Poggiato, Andréoletti, Pollock and Thuiller (2022) <doi:10.22541/au.166853394.45823739/v1>.
Maintained by Giovanni Poggiato. Last updated 9 months ago.
0.5 match 17 stars 5.71 score 9 scripts
cran
MCPMod:Design and Analysis of Dose-Finding Studies
Implements a methodology for the design and analysis of dose-response studies that combines aspects of multiple comparison procedures and modeling approaches (Bretz, Pinheiro and Branson, 2005, Biometrics 61, 738-748, <doi:10.1111/j.1541-0420.2005.00344.x>). The package provides tools for the analysis of dose-finding trials as well as a variety of tools necessary to plan a trial to be conducted with the MCP-Mod methodology. Please note: The 'MCPMod' package will not be further developed; all future development of the MCP-Mod methodology will be done in the 'DoseFinding' R-package.
Maintained by Bjoern Bornkamp. Last updated 5 years ago.
1.8 match 1.60 score
cyclestreets
cyclestreets:Cycle Routing and Data for Cycling Advocacy
An interface to the cycle routing/data services provided by 'CycleStreets', a not-for-profit social enterprise and advocacy organisation. The application programming interfaces (APIs) provided by 'CycleStreets' are documented at <https://www.cyclestreets.net/api/>. The focus of this package is the journey planning API, which aims to emulate the routes taken by a knowledgeable cyclist. An innovative feature of the routing service is its provision of fastest, quietest and balanced profiles. These represent routes taken to minimise time, avoid traffic and compromise between the two, respectively.
Maintained by Robin Lovelace. Last updated 3 months ago.
cycling routing transport transportation-planning
0.5 match 27 stars 5.62 score 31 scripts
statistikat
tatoo:Combine and Export Data Frames
Functions to combine data.frames in ways that require additional effort in base R, and to add metadata (id, title, ...) that can be used for printing and xlsx export. The 'Tatoo_report' class is provided as a convenient helper to write several such tables to a workbook, one table per worksheet. Tatoo is built on top of 'openxlsx', but intimate knowledge of that package is not required to use tatoo.
Maintained by Stefan Fleck. Last updated 2 years ago.
0.5 match 7 stars 5.53 score 24 scripts
mjwestgate
revtools:Tools to Support Evidence Synthesis
Researchers commonly need to summarize scientific information, a process known as 'evidence synthesis'. The first stage of a synthesis process (such as a systematic review or meta-analysis) is to download a list of references from academic search engines such as 'Web of Knowledge' or 'Scopus'. The traditional approach to systematic review is then to sort these data manually, first by locating and removing duplicated entries, and then screening to remove irrelevant content by viewing titles and abstracts (in that order). 'revtools' provides interfaces for each of these tasks. An alternative approach, however, is to draw on tools from machine learning to visualise patterns in the corpus. In this case, you can use 'revtools' to render ordinations of text drawn from article titles, keywords and abstracts, and interactively select or exclude individual references, words or topics.
Maintained by Martin J. Westgate. Last updated 5 years ago.
0.5 match 52 stars 5.57 score 72 scripts
bioc
diffuStats:Diffusion scores on biological networks
Label propagation approaches are a widely used procedure in computational biology for giving context to molecular entities using network data. Node labels, which can derive from gene expression, genome-wide association studies, protein domains or metabolomics profiling, are propagated to their neighbours in the network, effectively smoothing the scores through prior annotated knowledge and prioritising novel candidates. The R package diffuStats contains a collection of diffusion kernels and scoring approaches that facilitates their computation, characterisation and benchmarking.
Maintained by Sergio Picart-Armada. Last updated 5 months ago.
networkgeneexpressiongraphandnetworkmetabolomicstranscriptomicsproteomicsgeneticsgenomewideassociationnormalizationcpp
0.5 match 5.40 score 42 scripts
magichead99
bread:Analyze Big Files Without Loading Them in Memory
A simple set of wrapper functions for data.table::fread() that allows subsetting or filtering rows and selecting columns of table-formatted files too large for the available RAM. 'b' stands for 'big files'. bread makes heavy use of Unix commands like 'grep', 'sed', 'wc', 'awk' and 'cut'. They are available by default in all Unix environments. For Windows, you need to install those commands externally in order to simulate a Unix environment and make sure that the executables are in the Windows PATH variable. To my knowledge, the simplest ways are to install 'RTools', 'Git' or 'Cygwin'. If they have been correctly installed (with the expected registry entries), they should be detected on loading the package and the correct directories will be added automatically to the PATH.
Maintained by Vincent Guegan. Last updated 2 years ago.
0.5 match 14 stars 5.37 score 56 scripts 2 dependents
bioc
ReactomeGraph4R:Interface for the Reactome Graph Database
Pathways, reactions, and biological entities in Reactome knowledge are systematically represented as an ordered network. Instances are represented as nodes and relationships between instances as edges; they are all stored in the Reactome Graph Database. This package serves as an interface to query the interconnected data from a local Neo4j database, with the aim of minimizing the usage of Neo4j Cypher queries.
Maintained by Chi-Lam Poon. Last updated 5 months ago.
dataimportpathwaysreactomenetworkgraphandnetwork
0.5 match 6 stars 5.26 score 6 scripts
zzawadz
DepthProc:Statistical Depth Functions for Multivariate Analysis
The data depth concept offers a variety of powerful and user-friendly tools for robust exploration and inference for multivariate data. Owing to their nature, the offered techniques may be used successfully when we lack knowledge of the parametric models generating the data. The package consists of, among others, implementations of several data depth techniques involving multivariate quantile-quantile plots, multivariate scatter estimators, multivariate Wilcoxon tests and robust regressions.
Maintained by Zygmunt Zawadzki. Last updated 3 years ago.
depth-functionsexploratory-data-analysisstatisticsopenblascppopenmp
0.5 match 6 stars 5.27 score 104 scripts 2 dependents
jsugarelli
flatxml:Tools for Working with XML Files as R Dataframes
On import, the XML information is converted to a dataframe that reflects the hierarchical XML structure. Intuitive functions allow users to navigate within this transparent XML data structure (without any knowledge of 'XPath'). 'flatXML' also provides tools to extract data from the XML into a flat dataframe that can be used to perform statistical operations. It also supports converting dataframes to XML.
Maintained by Joachim Zuckarelli. Last updated 4 years ago.
0.5 match 24 stars 5.09 score 34 scripts 1 dependent
david-hammond
pmev:Calculates Earned Value for a Project Schedule
Given a project schedule and associated costs, this package calculates the earned value to date. It is an implementation of Project Management Body of Knowledge (PMBOK) methodologies (reference Project Management Institute. (2021). A guide to the Project Management Body of Knowledge (PMBOK guide) (7th ed.). Project Management Institute, Newtown Square, PA, ISBN 9781628256673 (pdf)).
Maintained by David Hammond. Last updated 7 months ago.
0.8 match 3.30 score 4 scripts
bioc
DESpace:DESpace: a framework to discover spatially variable genes
Intuitive framework for identifying spatially variable genes (SVGs) via edgeR, a popular method for performing differential expression analyses. Based on pre-annotated spatial clusters as summarized spatial information, DESpace models gene expression using a negative binomial (NB), via edgeR, with spatial clusters as covariates. SVGs are then identified by testing the significance of spatial clusters. The method is flexible and robust, and is faster than most SV methods. Furthermore, to the best of our knowledge, it is the only SV approach that allows: - performing a SV test on each individual spatial cluster, hence identifying the key regions of the tissue affected by spatial variability; - jointly fitting multiple samples, targeting genes with consistent spatial patterns across replicates.
Maintained by Peiying Cai. Last updated 5 months ago.
spatialsinglecellrnaseqtranscriptomicsgeneexpressionsequencingdifferentialexpressionstatisticalmethodvisualization
0.5 match 4 stars 5.02 score 13 scripts
bioc
KnowSeq:KnowSeq R/Bioc package: The Smart Transcriptomic Pipeline
KnowSeq proposes a novel methodology that comprises the most relevant steps in transcriptomic gene expression analysis. KnowSeq aims to serve as an integrative tool that allows users to process and extract relevant biomarkers, as well as to assess them through Machine Learning approaches. Finally, the last objective of KnowSeq is the extraction of biological knowledge from the biomarkers (Gene Ontology enrichment, Pathway listing and Visualization, and Evidences related to the addressed disease). Although the package allows analyzing all the data manually, the main strength of KnowSeq is the possibility of generating an automatic and intelligent HTML report that collects all the involved steps in one document. It is important to highlight that the pipeline is totally modular and flexible, hence it can be started from any of the different steps. KnowSeq aims to serve as a novel tool to help experts in the field acquire robust knowledge and conclusions for the data and diseases under study.
Maintained by Daniel Castillo-Secilla. Last updated 5 months ago.
geneexpressiondifferentialexpressiongenesetenrichmentdataimportclassificationfeatureextractionsequencingrnaseqbatcheffectnormalizationpreprocessingqualitycontrolgeneticstranscriptomicsmicroarrayalignmentpathwayssystemsbiologygoimmunooncology
0.8 match 3.30 score 5 scripts
robson-fernandes
bnviewer:Bayesian Networks Interactive Visualization and Explainable Artificial Intelligence
Bayesian networks provide an intuitive framework for probabilistic reasoning, and their graphical nature can be interpreted quite clearly. Graph-based methods of machine learning are becoming more popular because they offer a richer model of knowledge that can be understood by a human in a graphical format. 'bnviewer' is an R package that allows the interactive visualization of Bayesian networks. The aim of this package is to improve Bayesian network visualization over the basic and static views offered by existing packages.
Maintained by Robson Fernandes. Last updated 5 years ago.
bayesian-inferencebayesian-networkbayesian-networksprobabilistic-graphical-models
0.5 match 7 stars 4.86 score 69 scripts 1 dependent
bioc
MSstatsBioNet:Network Analysis for MS-based Proteomics Experiments
A set of tools for network analysis using mass spectrometry-based proteomics data and network databases. The package takes as input the output of MSstats differential abundance analysis and provides functions to perform enrichment analysis and visualization in the context of prior knowledge from past literature. Notably, this package integrates with INDRA, which is a database of biological networks extracted from the literature using text mining techniques.
Maintained by Anthony Wu. Last updated 1 month ago.
immunooncologymassspectrometryproteomicssoftwarequalitycontrolnetworkenrichmentnetwork
0.5 match 4.85 score 3 scripts
lebebr01
highlightHTML:Highlight HTML Text and Tables
A tool to format R markdown with CSS ids for HTML output. The tool may be most helpful for those using markdown to create reproducible documents. The biggest limitation in formatting is the document authors' knowledge of CSS.
Maintained by Brandon LeBeau. Last updated 4 years ago.
0.5 match 5 stars 4.74 score 22 scripts
bioc
SCBN:A statistical normalization method and differential expression analysis for RNA-seq data between different species
This package provides a scale-based normalization (SCBN) method to identify genes with differential expression between different species. It takes into account the available knowledge of conserved orthologous genes and the hypothesis testing framework to detect differentially expressed orthologous genes. The methods in this package are described in the article 'A statistical normalization method and differential expression analysis for RNA-seq data between different species' by Yan Zhou, Jiadi Zhu, Tiejun Tong, Junhui Wang, Bingqing Lin, Jun Zhang (2018, pending publication).
Maintained by Yan Zhou. Last updated 5 months ago.
differentialexpressiongeneexpressionnormalization
0.5 match 4.78 score 1 script 1 dependent
christophergandrud
dpmr:Data Package Manager for R
Create, install, and summarise data packages that follow the Open Knowledge Foundation's Data Package Protocol.
Maintained by Christopher Gandrud. Last updated 8 years ago.
0.5 match 56 stars 4.45 score 5 scripts
openpharma
elaborator:A 'shiny' Application for Exploring Laboratory Data
A novel concept for generating knowledge and gaining insights into laboratory data. You will be able to efficiently and easily explore your laboratory data from different perspectives. Janitza, S., Majumder, M., Mendolia, F., Jeske, S., & Kulmann, H. (2021) <doi:10.1007/s43441-021-00318-4>.
Maintained by Bodo Kirsch. Last updated 6 months ago.
0.5 match 6 stars 4.56 score
liao961120
linguisticsdown:Easy Linguistics Document Writing with R Markdown
Provides 'Shiny gadgets' to search, type, and insert IPA symbols into documents or scripts, requiring only knowledge about phonetics or 'X-SAMPA'. Also provides functions to facilitate the rendering of IPA symbols in 'LaTeX' and PDF format, making IPA symbols properly rendered in all output formats. A minimal R Markdown template for authoring Linguistics related documents is also bundled with the package. Some helper functions to facilitate authoring with R Markdown are also provided.
Maintained by Yongfu Liao. Last updated 6 years ago.
linguisticsrmarkdownrmarkdown-template
0.5 match 26 stars 4.59 score 30 scripts
bioc
iBBiG:Iterative Binary Biclustering of Genesets
iBBiG is a bi-clustering algorithm which is optimized for binary data analysis. We apply it to meta-gene set analysis of large numbers of gene expression datasets. The iterative algorithm extracts groups of phenotypes from multiple studies that are associated with similar gene sets. iBBiG does not require prior knowledge of the number or scale of clusters and allows discovery of clusters with diverse sizes.
Maintained by Aedin Culhane. Last updated 5 months ago.
clusteringannotationgenesetenrichment
0.5 match 4.56 score 3 scripts 2 dependents
bioc
BiocBook:Write, containerize, publish and version Quarto books with Bioconductor
A BiocBook can be created by authors (e.g. R developers, but also scientists, teachers, communicators, ...) who wish to 1) write (compile a body of biological and/or bioinformatics knowledge), 2) containerize (provide Docker images to reproduce the examples illustrated in the compendium), 3) publish (deploy an online book to disseminate the compendium), and 4) version (automatically generate specific online book versions and Docker images for specific Bioconductor releases).
Maintained by Jacques Serizay. Last updated 5 months ago.
infrastructurereportwritingsoftware
0.5 match 3 stars 4.48 score 4 scripts
kumes
chatAI4R:Chat-Based Interactive Artificial Intelligence for R
The Large Language Model (LLM) represents a groundbreaking advancement in data science and programming, and also allows us to extend the world of R. A seamless interface for integrating the 'OpenAI' Web APIs into R is provided in this package. This package leverages LLM-based AI techniques, enabling efficient knowledge discovery and data analysis (see 'OpenAI' Web APIs details <https://openai.com/blog/openai-api>). The previous functions such as seamless translation and image generation have been moved to other packages 'deepRstudio' and 'stableDiffusion4R'.
Maintained by Satoshi Kume. Last updated 1 month ago.
aibioinformaticschatgptgptimageimage-generation
0.5 match 14 stars 4.45 score 3 scripts
jhardenberg
rainfarmr:Stochastic Precipitation Downscaling with the RainFARM Method
An implementation of the RainFARM (Rainfall Filtered Autoregressive Model) stochastic precipitation downscaling method (Rebora et al. (2006) <doi:10.1175/JHM517.1>). Adapted for climate downscaling according to D'Onofrio et al. (2018) <doi:10.1175/JHM-D-13-096.1> and for complex topography as in Terzago et al. (2018) <doi:10.5194/nhess-18-2825-2018>. The RainFARM method is based on the extrapolation to small scales of the Fourier spectrum of a large-scale precipitation field, using a fixed logarithmic slope and random phases at small scales, followed by a nonlinear transformation of the resulting linearly correlated stochastic field. RainFARM generates ensembles of spatially downscaled precipitation fields which conserve precipitation at large scales and whose statistical properties are consistent with the small-scale statistics of observed precipitation, based only on knowledge of the large-scale precipitation field.
Maintained by Jost von Hardenberg. Last updated 3 years ago.
0.5 match 4 stars 4.48 score 5 dependents
bioc
transite:RNA-binding protein motif analysis
transite is a computational method that allows comprehensive analysis of the regulatory role of RNA-binding proteins in various cellular processes by leveraging preexisting gene expression data and current knowledge of binding preferences of RNA-binding proteins.
Maintained by Konstantin Krismer. Last updated 5 months ago.
geneexpressiontranscriptiondifferentialexpressionmicroarraymrnamicroarraygeneticsgenesetenrichmentcpp
0.5 match 4.30 score 20 scripts
bioc
drugTargetInteractions:Drug-Target Interactions
Provides utilities for identifying drug-target interactions for sets of small molecule or gene/protein identifiers. The required drug-target interaction information is obtained from a local SQLite instance of the ChEMBL database. ChEMBL has been chosen for this purpose because it provides one of the most comprehensive and best annotated knowledge resources for drug-target information available in the public domain.
Maintained by Thomas Girke. Last updated 5 months ago.
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsproteomicsmetabolomics
0.5 match 1 star 4.34 score 11 scripts
bioc
ASURAT:Functional annotation-driven unsupervised clustering for single-cell data
ASURAT is software for single-cell data analysis. Using ASURAT, one can simultaneously perform unsupervised clustering and biological interpretation in terms of cell type, disease, biological process, and signaling pathway activity. Given single-cell RNA-seq data and knowledge-based databases, such as Cell Ontology, Gene Ontology, KEGG, etc., ASURAT transforms gene expression tables into original multivariate tables, termed sign-by-sample matrices (SSMs).
Maintained by Keita Iida. Last updated 5 months ago.
geneexpressionsinglecellsequencingclusteringgenesignalingcpp
0.5 match 4.32 score 21 scripts
frbcesab
popbayes:Bayesian Model to Estimate Population Trends from Counts Series
Infers the trends of one or several animal populations over time from series of counts. It does so by accounting for count precision (provided or inferred based on expert knowledge, e.g. guesstimates), smoothing the population rate of increase over time, and accounting for the maximum demographic potential of species. Inference is carried out in a Bayesian framework. This work is part of the FRB-CESAB working group AfroBioDrivers <https://www.fondationbiodiversite.fr/en/the-frb-in-action/programs-and-projects/le-cesab/afrobiodrivers/>.
Maintained by Nicolas Casajus. Last updated 1 year ago.
animalbayesiancountspopulationprecisiontemporal-trendjagscpp
0.5 match 1 star 4.30 score
pbiecek
bgmm:Gaussian Mixture Modeling Algorithms and the Belief-Based Mixture Modeling
Two partially supervised mixture modeling methods, soft-label and belief-based modeling, are implemented. For completeness, we also equipped the package with the functionality of unsupervised, semi- and fully supervised mixture modeling. The package can also be applied to selection of the best-fitting model from a set of models with different component numbers or constraints on their structures. For a detailed introduction see: Przemyslaw Biecek, Ewa Szczurek, Martin Vingron, Jerzy Tiuryn (2012), The R Package bgmm: Mixture Modeling with Uncertain Knowledge, Journal of Statistical Software <doi:10.18637/jss.v047.i03>.
Maintained by Przemyslaw Biecek. Last updated 2 years ago.
0.5 match 2 stars 4.22 score 55 scripts 1 dependent
jreisner
biclustermd:Biclustering with Missing Data
Biclustering is a statistical learning technique that simultaneously partitions and clusters rows and columns of a data matrix. Since the solution space of biclustering is infeasible to search completely with current computational mechanisms, this package uses a greedy heuristic. The algorithm featured in this package is, to the best of our knowledge, the first biclustering algorithm to work on data with missing values. Li, J., Reisner, J., Pham, H., Olafsson, S., and Vardeman, S. (2020) Biclustering with Missing Data. Information Sciences, 510, 304–316.
Maintained by John Reisner. Last updated 4 years ago.
0.5 match 3 stars 4.18 score 4 scripts
bioc
easier:Estimate Systems Immune Response from RNA-seq data
This package provides a workflow for the use of the EaSIeR tool, developed to assess patients' likelihood to respond to ICB therapies, providing just the patients' RNA-seq data as input. We integrate RNA-seq data with different types of prior knowledge to extract quantitative descriptors of the tumor microenvironment from several points of view, including composition of the immune repertoire, and activity of intra- and extra-cellular communications. Then, we use multi-task machine learning trained in TCGA data to identify how these descriptors can simultaneously predict several state-of-the-art hallmarks of anti-cancer immune response. In this way we derive cancer-specific models and identify cancer-specific systems biomarkers of immune response. These biomarkers have been experimentally validated in the literature and the performance of EaSIeR predictions has been validated using independent datasets from four different cancer types with patients treated with anti-PD1 or anti-PDL1 therapy.
Maintained by Oscar Lapuente-Santana. Last updated 5 months ago.
geneexpressionsoftwaretranscriptionsystemsbiologypathwaysgenesetenrichmentimmunooncologyepigeneticsclassificationbiomedicalinformaticsregressionexperimenthubsoftware
0.5 match 4.20 score 16 scripts
thiyangt
tsdataleaks:Exploit Data Leakages in Time Series Forecasting Competitions
Forecasting competitions are of increasing importance as a means to learn best practices and gain knowledge. Data leakage is one of the most common issues found in competitions. Data leaks can happen when the training data contains information about the test data. For example: randomly chosen blocks of time series are concatenated to form a new time series, scale-shifts, repeating patterns in time series, white noise is added to the original time series to form a new time series, etc. The 'tsdataleaks' package can be used to detect data leakages in a collection of time series.
Maintained by Thiyanga S. Talagala. Last updated 1 year ago.
0.5 match 3 stars 4.18 score 8 scripts
ketsiaguichard
telraamStats:Retrieval and Visualization of Mobility Data from 'Telraam' Sensors
Streamline the processing of 'Telraam' data, sourced from open data mobility sensors. These tools range from data retrieval (without the need for API knowledge) to data visualization, including data preprocessing.
Maintained by Ketsia Guichard. Last updated 10 months ago.
0.5 match 1 star 4.04 score 11 scripts
bioc
UNDO:Unsupervised Deconvolution of Tumor-Stromal Mixed Expressions
UNDO is an R package for unsupervised deconvolution of tumor and stromal mixed expression data. It detects marker genes and deconvolutes the mixed expression data without any prior knowledge.
Maintained by Niya Wang. Last updated 5 months ago.
0.5 match 4.00 score 6 scripts
cran
AMModels:Adaptive Management Model Manager
Helps enable adaptive management by codifying knowledge in the form of models generated from numerous analyses and data sets. Facilitates this process by storing all models and data sets in a single object that can be updated and saved, thus tracking changes in knowledge through time. A shiny application called AM Model Manager (modelMgr()) enables the use of these functions via a GUI.
Maintained by Jon Katz. Last updated 6 years ago.
0.8 match 2.58 score 19 scripts
rezamoammadi
liver:"Eating the Liver of Data Science"
Offers a suite of helper functions to simplify various data science techniques for non-experts. This package aims to enable individuals with only a minimal level of coding knowledge to become acquainted with these techniques in an accessible manner. Inspired by an ancient Persian idiom, we liken this process to "eating the liver of data science," suggesting a deep and intimate engagement with the field of data science. This package includes functions for tasks such as data partitioning for out-of-sample testing, calculating Mean Squared Error (MSE) to assess prediction accuracy, and data transformations (z-score and min-max). In addition to these helper functions, the 'liver' package also features several intriguing datasets valuable for multivariate analysis.
Maintained by Reza Mohammadi. Last updated 4 months ago.
0.5 match 4.00 score 67 scripts
jingxuanh
xtune:Regularized Regression with Feature-Specific Penalties Integrating External Information
Extends standard penalized regression (Lasso, Ridge, and Elastic-net) to allow feature-specific shrinkage based on external information with the goal of achieving a better prediction accuracy and variable selection. Examples of external information include the grouping of predictors, prior knowledge of biological importance, external p-values, function annotations, etc. The choice of multiple tuning parameters is done using an Empirical Bayes approach. A majorization-minimization algorithm is employed for implementation.
Maintained by Jingxuan He. Last updated 2 years ago.
0.5 match 3.90 score 16 scripts
pbiecek
proton:The Proton Game
'The Proton Game' is a console-based data-crunching game for younger and older data scientists. Act as a data-hacker and find Slawomir Pietraszko's credentials to the Proton server. You have to solve four data-based puzzles to find the login and password. There are many ways to solve these puzzles. You may use loops, data filtering, ordering, aggregation or other tools. Only basic knowledge of R is required to play the game, yet the more functions you know, the more approaches you can try. Knowledge of dplyr is not required but may be very helpful. This game is linked with the "Pietraszko's Cave" story available at http://biecek.pl/BetaBit/Warsaw. It's a part of the Beta and Bit series. You will find more about the Beta and Bit series at http://biecek.pl/BetaBit.
Maintained by Przemysław Biecek. Last updated 9 years ago.
0.8 match 2.49 score 312 scripts
jo-theo
shinySbm:'shiny' Application to Use the Stochastic Block Model
A 'shiny' interface for simpler use of the 'sbm' R package. It also contains useful functions to easily explore the 'sbm' package results. With this package you should be able to use the stochastic block model without any knowledge of R, get automatic reports and nice visuals, as well as learn the basic functions of 'sbm'.
Maintained by Theodore Vanrenterghem. Last updated 1 year ago.
0.5 match 3.70 score 6 scripts
oxfordihtm
codigo:Interface to the International Classification of Diseases (ICD) API
The International Classification of Diseases (ICD) serves a broad range of uses globally and provides critical knowledge on the extent, causes and consequences of human disease and death worldwide via data that is reported and coded with the ICD. ICD API allows programmatic access to the ICD. It is an HTTP based REST API. This package provides functions that interface with the ICD API.
Maintained by Ernest Guevarra. Last updated 5 months ago.
0.5 match 4 stars 3.68 score 6 scripts 1 dependent
laurabruckman
netSEM:Network Structural Equation Modeling
Network structural equation modeling conducts a network statistical analysis on a data frame of coincident observations of multiple continuous variables [1]. It builds a pathway model by exploring a pool of domain-knowledge-guided candidate statistical relationships between each of the variable pairs, selecting the 'best fit' on the basis of a specific criterion such as the adjusted r-squared value. This material is based upon work supported by the U.S. National Science Foundation Award EEC-2052776 and EEC-2052662 for the MDS-Rely IUCRC Center, under the NSF Solicitation: NSF 20-570 Industry-University Cooperative Research Centers Program. [1] Bruckman, Laura S., Nicholas R. Wheeler, Junheng Ma, Ethan Wang, Carl K. Wang, Ivan Chou, Jiayang Sun, and Roger H. French. (2013) <doi:10.1109/ACCESS.2013.2267611>.
Maintained by Laura S. Bruckman. Last updated 2 years ago.
0.5 match 3.72 score 13 scripts
bioc
CNORfuzzy:Addon to CellNOptR: Fuzzy Logic
This package is an extension to CellNOptR. It contains additional functionality needed to simulate and train a prior knowledge network to experimental data using constrained fuzzy logic (cFL, rather than Boolean logic as is the case in CellNOptR). Additionally, this package will contain functions to use for the compilation of multiple optimization results (either Boolean or cFL).
Maintained by T. Cokelaer. Last updated 5 months ago.
0.5 match 3.60 score 7 scripts
bioc
omada:Machine learning tools for automated transcriptome clustering analysis
Symptomatic heterogeneity in complex diseases reveals differences in molecular states that need to be investigated. However, selecting the numerous parameters of an exploratory clustering analysis in RNA profiling studies requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent and further gene association analyses need to be performed independently. We have developed a suite of tools to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning based functions. The efficiency of each tool was tested with four datasets characterised by different expression signal strengths. Our toolkit’s decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Even in datasets with less clear biological distinctions, stable subgroups with different expression profiles and clinical associations were found.
Maintained by Sokratis Kariotis. Last updated 5 months ago.
softwareclusteringrnaseqgeneexpression
0.5 match 3.60 score 5 scripts
tjebo
eyedata:Open Source Ophthalmic Data Sets Curated for R
Open source data allows for reproducible research and helps advance our knowledge. The purpose of this package is to collate open source ophthalmic data sets curated for direct use. This is real life data of people with intravitreal injections with anti-vascular endothelial growth factor (anti-VEGF), due to age-related macular degeneration or diabetic macular edema. Associated publications of the data sets: Fu et al. (2020) <doi:10.1001/jamaophthalmol.2020.5044>, Moraes et al (2020) <doi:10.1016/j.ophtha.2020.09.025>, Fasler et al. (2019) <doi:10.1136/bmjopen-2018-027441>, Arpa et al. (2020) <doi:10.1136/bjophthalmol-2020-317161>, Kern et al. 2020, <doi:10.1038/s41433-020-1048-0>.
Maintained by Tjebo Heeren. Last updated 4 years ago.
0.5 match 4 stars 3.48 score 15 scripts
r4goodacademy
R4GoodPersonalFinances:Make Better Financial Decisions
Make informed, data-driven decisions for your personal or household finances. Use tools and methods that are selected carefully to align with academic consensus, bridging the gap between theoretical knowledge and practical application. They assist you in finding optimal asset allocation, preparing for retirement or financial independence, calculating optimal spending, and more. For more details see: Haghani V., White J. (2023, ISBN:978-1-119-74791-8), Idzorek T., Kaplan P. (2024, ISBN:9781952927379).
Maintained by Kamil Wais. Last updated 3 days ago.
financial-independencefireoptimal-asset-allocationsoptimal-spendingpersonal-financesretirement
0.5 match 1 star 3.40 score