Showing 142 of total 142 results (show query)

r-lib

desc:Manipulate DESCRIPTION Files

Tools to read, write, create, and manipulate DESCRIPTION files. It is intended for packages that create or manipulate other packages.

Maintained by Gábor Csárdi. Last updated 1 months ago.

5.3 match 123 stars 14.68 score 409 scripts 1.1k dependents

bioc

Biobase:Biobase: Base functions for Bioconductor

Functions that are needed by many other packages or which replace R functions.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructurebioconductor-packagecore-package

1.8 match 9 stars 16.45 score 6.6k scripts 1.8k dependents

bioc

MoonlightR:Identify oncogenes and tumor suppressor genes from omics data

Motivation: The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). Results: We present an R/bioconductor package called MoonlightR which returns a list of candidate driver genes for specific cancer types on the basis of TCGA expression data. The method first infers gene regulatory networks and then carries out a functional enrichment analysis (FEA) (implementing an upstream regulator analysis, URA) to score the importance of well-known biological processes with respect to the studied cancer type. Eventually, by means of random forests, MoonlightR predicts two specific roles for the candidate driver genes: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, MoonlightR can be used to discover OCGs and TSGs in the same cancer type. This may help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV) in breast cancer. In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments.

Maintained by Matteo Tiberti. Last updated 5 months ago.

dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksurvivalgenesetenrichmentnetworkenrichment

3.7 match 17 stars 6.57 score

huanglabumn

oncoPredict:Drug Response Modeling and Biomarker Discovery

Allows for building drug response models using screening data between bulk RNA-Seq and a drug response metric and two additional tools for biomarker discovery that have been developed by the Huang Laboratory at University of Minnesota. There are 3 main functions within this package. (1) calcPhenotype is used to build drug response models on RNA-Seq data and impute them on any other RNA-Seq dataset given to the model. (2) GLDS is used to calculate the general level of drug sensitivity, which can improve biomarker discovery. (3) IDWAS can take the results from calcPhenotype and link the imputed response back to available genomic (mutation and CNV alterations) to identify biomarkers. Each of these functions comes from a paper from the Huang research laboratory. Below gives the relevant paper for each function. calcPhenotype - Geeleher et al, Clinical drug response can be predicted using baseline gene expression levels and in vitro drug sensitivity in cell lines. GLDS - Geeleher et al, Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical models. IDWAS - Geeleher et al, Discovering novel pharmacogenomic biomarkers by imputing drug response in cancer patients from large genomics studies.

Maintained by Robert Gruener. Last updated 12 months ago.

svapreprocesscorestringrbiomartgenefilterorg.hs.eg.dbgenomicfeaturestxdb.hsapiens.ucsc.hg19.knowngenetcgabiolinksbiocgenericsgenomicrangesirangess4vectors

2.4 match 18 stars 6.47 score 41 scripts

pablobarbera

Rfacebook:Access to Facebook API via R

Provides an interface to the Facebook API.

Maintained by Pablo Barbera. Last updated 5 years ago.

1.7 match 351 stars 7.75 score 268 scripts

samlestrade2

MoLE:Modeling Language Evolution

Model for simulating language evolution in terms of cultural evolution (Smith & Kirby (2008) <DOI:10.1098/rstb.2008.0145>; Deacon 1997). The focus is on the emergence of argument-marking systems (Dowty (1991) <DOI:10.1353/lan.1991.0021>, Van Valin 1999, Dryer 2002, Lestrade 2015a), i.e. noun marking (Aristar (1997) <DOI:10.1075/sl.21.2.04ari>, Lestrade (2010) <DOI:10.7282/T3ZG6R4S>), person indexing (Ariel 1999, Dahl (2000) <DOI:10.1075/fol.7.1.03dah>, Bhat 2004), and word order (Dryer 2013), but extensions are foreseen. Agents start out with a protolanguage (a language without grammar; Bickerton (1981) <DOI:10.17169/langsci.b91.109>, Jackendoff 2002, Arbib (2015) <DOI:10.1002/9781118346136.ch27>) and interact through language games (Steels 1997). Over time, grammatical constructions emerge that may or may not become obligatory (for which the tolerance principle is assumed; Yang 2016). Throughout the simulation, uniformitarianism of principles is assumed (Hopper (1987) <DOI:10.3765/bls.v13i0.1834>, Givon (1995) <DOI:10.1075/z.74>, Croft (2000), Saffran (2001) <DOI:10.1111/1467-8721.01243>, Heine & Kuteva 2007), in which maximal psychological validity is aimed at (Grice (1975) <DOI:10.1057/9780230005853_5>, Levelt 1989, Gaerdenfors 2000) and language representation is usage based (Tomasello 2003, Bybee 2010). In Lestrade (2015b) <DOI:10.15496/publikation-8640>, Lestrade (2015c) <DOI:10.1075/avt.32.08les>, and Lestrade (2016) <DOI:10.17617/2.2248195>), which reported on the results of preliminary versions, this package was announced as WDWTW (for who does what to whom), but for reasons of pronunciation and generalization the title was changed.

Maintained by Sander Lestrade. Last updated 7 years ago.

4.0 match 2.46 score 58 scripts

bioc

Moonlight2R:Identify oncogenes and tumor suppressor genes from omics data

The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). We present an updated version of the R/bioconductor package called MoonlightR, namely Moonlight2R, which returns a list of candidate driver genes for specific cancer types on the basis of omics data integration. The Moonlight framework contains a primary layer where gene expression data and information about biological processes are integrated to predict genes called oncogenic mediators, divided into putative tumor suppressors and putative oncogenes. This is done through functional enrichment analyses, gene regulatory networks and upstream regulator analyses to score the importance of well-known biological processes with respect to the studied cancer type. By evaluating the effect of the oncogenic mediators on biological processes or through random forests, the primary layer predicts two putative roles for the oncogenic mediators: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As gene expression data alone is not enough to explain the deregulation of the genes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer through new functionalities in Moonlight2R. These functionalities analyze mutations in the cancer cohort and classifies these into driver and passenger mutations using the driver mutation prediction tool, CScape-somatic. Those oncogenic mediators with at least one driver mutation are retained as the driver genes. As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, Moonlight2R can be used to discover OCGs and TSGs in the same cancer type. This may for instance help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV). In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments. An additional mechanistic layer evaluates if there are mutations affecting the protein stability of the transcription factors (TFs) of the TSGs and OCGs, as that may have an effect on the expression of the genes.

Maintained by Matteo Tiberti. Last updated 2 months ago.

dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksurvivalgenesetenrichmentnetworkenrichment

1.0 match 5 stars 6.59 score 43 scripts

ifpri

ARIA:App for IMPACT (App foR ImpAct)

App for IMPACT (App foR ImpAct).

Maintained by Abhijeet Mishra. Last updated 10 months ago.

2.2 match 2.70 score 6 scripts

inqs909

csucistats:CSU Channel Islands R Tools

An R package containing functions for statistics courses at CSUCI.

Maintained by Isaac Quintanilla Salinas. Last updated 2 months ago.

1.6 match 2.99 score 14 scripts

nhs-r-community

NHSRtools:NHS-R Tools

Provides tools commonly used by Analysts within the NHS.

Maintained by Tom Jemmett. Last updated 4 years ago.

2.3 match 2 stars 2.00 score 1 scripts

ropengov

mpg:FuelEconomy.gov Data

Extract fuel economy data from FuelEconomy.gov.

Maintained by Thomas J. Leeper. Last updated 3 years ago.

ropengov

1.6 match 12 stars 2.78 score

lcrawlab

mvMAPIT:Multivariate Genome Wide Marginal Epistasis Test

Epistasis, commonly defined as the interaction between genetic loci, is known to play an important role in the phenotypic variation of complex traits. As a result, many statistical methods have been developed to identify genetic variants that are involved in epistasis, and nearly all of these approaches carry out this task by focusing on analyzing one trait at a time. Previous studies have shown that jointly modeling multiple phenotypes can often dramatically increase statistical power for association mapping. In this package, we present the 'multivariate MArginal ePIstasis Test' ('mvMAPIT') – a multi-outcome generalization of a recently proposed epistatic detection method which seeks to detect marginal epistasis or the combined pairwise interaction effects between a given variant and all other variants. By searching for marginal epistatic effects, one can identify genetic variants that are involved in epistasis without the need to identify the exact partners with which the variants interact – thus, potentially alleviating much of the statistical and computational burden associated with conventional explicit search based methods. Our proposed 'mvMAPIT' builds upon this strategy by taking advantage of correlation structure between traits to improve the identification of variants involved in epistasis. We formulate 'mvMAPIT' as a multivariate linear mixed model and develop a multi-trait variance component estimation algorithm for efficient parameter inference and P-value computation. Together with reasonable model approximations, our proposed approach is scalable to moderately sized genome-wide association studies. Crawford et al. (2017) <doi:10.1371/journal.pgen.1006869>. Stamp et al. (2023) <doi:10.1093/g3journal/jkad118>.

Maintained by Julian Stamp. Last updated 5 months ago.

cppepistasisepistasis-analysisgwasgwas-toolslinear-mixed-modelsmapitmvmapitvariance-componentsopenblascppopenmp

0.5 match 11 stars 6.90 score 17 scripts 1 dependents

amoeba

pkgsci:Does science on R pakcages

Various utility functions for analyzing coding practices in R packages.

Maintained by Bryce Mecum. Last updated 6 years ago.

1.6 match 1.70 score 1 scripts

bioc

GRaNIE:GRaNIE: Reconstruction cell type specific gene regulatory networks including enhancers using single-cell or bulk chromatin accessibility and RNA-seq data

Genetic variants associated with diseases often affect non-coding regions, thus likely having a regulatory role. To understand the effects of genetic variants in these regulatory regions, identifying genes that are modulated by specific regulatory elements (REs) is crucial. The effect of gene regulatory elements, such as enhancers, is often cell-type specific, likely because the combinations of transcription factors (TFs) that are regulating a given enhancer have cell-type specific activity. This TF activity can be quantified with existing tools such as diffTF and captures differences in binding of a TF in open chromatin regions. Collectively, this forms a gene regulatory network (GRN) with cell-type and data-specific TF-RE and RE-gene links. Here, we reconstruct such a GRN using single-cell or bulk RNAseq and open chromatin (e.g., using ATACseq or ChIPseq for open chromatin marks) and optionally (Capture) Hi-C data. Our network contains different types of links, connecting TFs to regulatory elements, the latter of which is connected to genes in the vicinity or within the same chromatin domain (TAD). We use a statistical framework to assign empirical FDRs and weights to all links using a permutation-based approach.

Maintained by Christian Arnold. Last updated 5 months ago.

softwaregeneexpressiongeneregulationnetworkinferencegenesetenrichmentbiomedicalinformaticsgeneticstranscriptomicsatacseqrnaseqgraphandnetworkregressiontranscriptionchipseq

0.5 match 5.40 score 24 scripts

briandk

granovaGG:Graphical Analysis of Variance Using ggplot2

Create what we call Elemental Graphics for display of anova results. The term elemental derives from the fact that each function is aimed at construction of graphical displays that afford direct visualizations of data with respect to the fundamental questions that drive the particular anova methods. This package represents a modification of the original granova package; the key change is to use 'ggplot2', Hadley Wickham's package based on Grammar of Graphics concepts (due to Wilkinson). The main function is granovagg.1w() (a graphic for one way ANOVA); two other functions (granovagg.ds() and granovagg.contr()) are to construct graphics for dependent sample analyses and contrast-based analyses respectively. (The function granova.2w(), which entails dynamic displays of data, is not currently part of 'granovaGG'.) The 'granovaGG' functions are to display data for any number of groups, regardless of their sizes (however, very large data sets or numbers of groups can be problematic). For granovagg.1w() a specialized approach is used to construct data-based contrast vectors for which anova data are displayed. The result is that the graphics use a straight line to facilitate clear interpretations while being faithful to the standard effect test in anova. The graphic results are complementary to standard summary tables; indeed, numerical summary statistics are provided as side effects of the graphic constructions. granovagg.ds() and granovagg.contr() provide graphic displays and numerical outputs for a dependent sample and contrast-based analyses. The graphics based on these functions can be especially helpful for learning how the respective methods work to answer the basic question(s) that drive the analyses. This means they can be particularly helpful for students and non-statistician analysts. But these methods can be of assistance for work-a-day applications of many kinds, as they can help to identify outliers, clusters or patterns, as well as highlight the role of non-linear transformations of data. In the case of granovagg.1w() and granovagg.ds() several arguments are provided to facilitate flexibility in the construction of graphics that accommodate diverse features of data, according to their corresponding display requirements. See the help files for individual functions.

Maintained by Brian A. Danielak. Last updated 1 years ago.

0.5 match 15 stars 4.90 score 35 scripts

bioc

NoRCE:NoRCE: Noncoding RNA Sets Cis Annotation and Enrichment

While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close-by on the genome are often regulated together. This genomic proximity on the sequence can hint to a functional association. We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding & ncRNAs specific to cancer. Additionally, the users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast.

Maintained by Gulden Olgun. Last updated 5 months ago.

biologicalquestiondifferentialexpressiongenomeannotationgenesetenrichmentgenetargetgenomeassemblygo

0.5 match 1 stars 4.60 score 6 scripts

bioc

DMCHMM:Differentially Methylated CpG using Hidden Markov Model

A pipeline for identifying differentially methylated CpG sites using Hidden Markov Model in bisulfite sequencing data. DNA methylation studies have enabled researchers to understand methylation patterns and their regulatory roles in biological processes and disease. However, only a limited number of statistical approaches have been developed to provide formal quantitative analysis. Specifically, a few available methods do identify differentially methylated CpG (DMC) sites or regions (DMR), but they suffer from limitations that arise mostly due to challenges inherent in bisulfite sequencing data. These challenges include: (1) that read-depths vary considerably among genomic positions and are often low; (2) both methylation and autocorrelation patterns change as regions change; and (3) CpG sites are distributed unevenly. Furthermore, there are several methodological limitations: almost none of these tools is capable of comparing multiple groups and/or working with missing values, and only a few allow continuous or multiple covariates. The last of these is of great interest among researchers, as the goal is often to find which regions of the genome are associated with several exposures and traits. To tackle these issues, we have developed an efficient DMC identification method based on Hidden Markov Models (HMMs) called “DMCHMM” which is a three-step approach (model selection, prediction, testing) aiming to address the aforementioned drawbacks.

Maintained by Farhad Shokoohi. Last updated 5 months ago.

differentialmethylationsequencinghiddenmarkovmodelcoverage

0.5 match 3.78 score 3 scripts

augustobrusaca

KHQ:Methods for Calculating 'KHQ' Scores and 'KHQ5D' Utility Index Scores

The King's Health Questionnaire (KHQ) is a disease-specific, self-administered questionnaire designed specific to assess the impact of Urinary Incontinence (UI) on Quality of Life. The questionnaire was developed by Kelleher and collaborators (1997) <doi:10.1111/j.1471-0528.1997.tb11006.x>. It is a simple, acceptable and reliable measure to use in the clinical setting and a research tool that is useful in evaluating UI treatment outcomes. The KHQ five dimensions (KHQ5D) is a condition-specific preference-based measure developed by Brazier and collaborators (2008) <doi:10.1177/0272989X07301820>. Although not as popular as the SF6D <doi:10.1016/S0895-4356(98)00103-6> and EQ-5D <https://euroqol.org/>, the KHQ5D measures health-related quality of life (HRQoL) specifically for UI, not general conditions like the others two instruments mentioned. The KHQ5D ca be used in the clinical and economic evaluation of health care. The subject self-rates their health in terms of five dimensions: Role Limitation (RL), Physical Limitations (PL), Social Limitations (SL), Emotions (E), and Sleep (S). Frequently the states on these five dimensions are converted to a single utility index using country specific value sets, which can be used in the clinical and economic evaluation of health care as well as in population health surveys. This package provides methods to calculate scores for each dimension of the KHQ; converts KHQ item scores to KHQ5D scores; and also calculates the utility index of the KHQ5D.

Maintained by Luiz Augusto Brusaca. Last updated 4 years ago.

0.5 match 3.70 score 4 scripts

florian-laroumagne

previsionio:'Prevision.io' R SDK

For working with the 'Prevision.io' AI model management platform's API <https://prevision.io/>.

Maintained by Florian Laroumagne. Last updated 3 years ago.

1.8 match 1.00 score 1 scripts

cran

NST:Normalized Stochasticity Ratio

To estimate ecological stochasticity in community assembly. Understanding the community assembly mechanisms controlling biodiversity patterns is a central issue in ecology. Although it is generally accepted that both deterministic and stochastic processes play important roles in community assembly, quantifying their relative importance is challenging. The new index, normalized stochasticity ratio (NST), is to estimate ecological stochasticity, i.e. relative importance of stochastic processes, in community assembly. With functions in this package, NST can be calculated based on different similarity metrics and/or different null model algorithms, as well as some previous indexes, e.g. previous Stochasticity Ratio (ST), Standard Effect Size (SES), modified Raup-Crick metrics (RC). Functions for permutational test and bootstrapping analysis are also included. Previous ST is published by Zhou et al (2014) <doi:10.1073/pnas.1324044111>. NST is modified from ST by considering two alternative situations and normalizing the index to range from 0 to 1 (Ning et al 2019) <doi:10.1073/pnas.1904623116>. A modified version, MST, is a special case of NST, used in some recent or upcoming publications, e.g. Liang et al (2020) <doi:10.1016/j.soilbio.2020.108023>. SES is calculated as described in Kraft et al (2011) <doi:10.1126/science.1208584>. RC is calculated as reported by Chase et al (2011) <doi:10.1890/ES10-00117.1> and Stegen et al (2013) <doi:10.1038/ismej.2013.93>. Version 3 added NST based on phylogenetic beta diversity, used by Ning et al (2020) <doi:10.1038/s41467-020-18560-z>.

Maintained by Daliang Ning. Last updated 3 years ago.

0.5 match 2 stars 2.85 score 35 scripts

harryyy213

descstatsr:Descriptive Univariate Statistics

It generates summary statistics on the input dataset using different descriptive univariate statistical measures on entire data or at a group level. Though there are other packages which does similar job but each of these are deficient in one form or other, in the measures generated, in treating numeric, character and date variables alike, no functionality to view these measures on a group level or the way the output is represented. Given the foremost role of the descriptive statistics in any of the exploratory data analysis or solution development, there is a need for a more constructive, structured and refined version over these packages. This is the idea behind the package and it brings together all the required descriptive measures to give an initial understanding of the data quality, distribution in a faster,easier and elaborative way.The function brings an additional capability to be able to generate these statistical measures on the entire dataset or at a group level. It calculates measures of central tendency (mean, median), distribution (count, proportion), dispersion (min, max, quantile, standard deviation, variance) and shape (skewness, kurtosis). Addition to these measures, it provides information on the data type, count on no. of rows, unique entries and percentage of missing entries. More importantly the measures are generated based on the data types as required by them,rather than applying numerical measures on character and data variables and vice versa. Output as a dataframe object gives a very neat representation, which often is useful when working with a large number of columns. It can easily be exported as csv and analyzed further or presented as a summary report for the data.

Maintained by Harish Kumar. Last updated 6 years ago.

0.5 match 1.00 score 1 scripts