Showing 200 of total 259 results (show query)

satopaa

metaggR:Calculate the Knowledge-Weighted Estimate

According to a phenomenon known as "the wisdom of the crowds," combining point estimates from multiple judges often provides a more accurate aggregate estimate than using a point estimate from a single judge. However, if the judges use shared information in their estimates, the simple average will over-emphasize this common component at the expense of the judges’ private information. Asa Palley & Ville Satopää (2021) "Boosting the Wisdom of Crowds Within a Single Judgment Problem: Selective Averaging Based on Peer Predictions" <https://papers.ssrn.com/sol3/Papers.cfm?abstract_id=3504286> proposes a procedure for calculating a weighted average of the judges’ individual estimates such that resulting aggregate estimate appropriately combines the judges' collective information within a single estimation problem. The authors use both simulation and data from six experimental studies to illustrate that the weighting procedure outperforms existing averaging-like methods, such as the equally weighted average, trimmed average, and median. This aggregate estimate -- know as "the knowledge-weighted estimate" -- inputs a) judges' estimates of a continuous outcome (E) and b) predictions of others' average estimate of this outcome (P). In this R-package, the function knowledge_weighted_estimate(E,P) implements the knowledge-weighted estimate. Its use is illustrated with a simple stylized example and on real-world experimental data.

Maintained by Ville Satopää. Last updated 3 years ago.

52.4 match 1 stars 2.85 score 14 scripts

thomaschln

kgraph:Knowledge Graphs Constructions and Visualizations

Knowledge graphs enable to efficiently visualize and gain insights into large-scale data analysis results, as p-values from multiple studies or embedding data matrices. The usual workflow is a user providing a data frame of association studies results and specifying target nodes, e.g. phenotypes, to visualize. The knowledge graph then shows all the features which are significantly associated with the phenotype, with the edges being proportional to the association scores. As the user adds several target nodes and grouping information about the nodes such as biological pathways, the construction of such graphs soon becomes complex. The 'kgraph' package aims to enable users to easily build such knowledge graphs, and provides two main features: first, to enable building a knowledge graph based on a data frame of concepts relationships, be it p-values or cosine similarities; second, to enable determining an appropriate cut-off on cosine similarities from a complete embedding matrix, to enable the building of a knowledge graph directly from an embedding matrix. The 'kgraph' package provides several display, layout and cut-off options, and has already proven useful to researchers to enable them to visualize large sets of p-value associations with various phenotypes, and to quickly be able to visualize embedding results. Two example datasets are provided to demonstrate these behaviors, and several live 'shiny' applications are hosted by the CELEHS laboratory and Parse Health, as the KESER Mental Health application <https://keser-mental-health.parse-health.org/> based on Hong C. (2021) <doi:10.1038/s41746-021-00519-z>.

Maintained by Thomas Charlon. Last updated 25 days ago.

14.5 match 4.85 score

sbgraves237

Ecdat:Data Sets for Econometrics

Data sets for econometrics, including political science.

Maintained by Spencer Graves. Last updated 4 months ago.

3.8 match 2 stars 7.25 score 740 scripts 3 dependents

bioc

mixOmics:Omics Data Integration Project

Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non exhaustive list of methods include variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.

Maintained by Eva Hamrud. Last updated 4 days ago.

immunooncologymicroarraysequencingmetabolomicsmetagenomicsproteomicsgenepredictionmultiplecomparisonclassificationregressionbioconductorgenomicsgenomics-datagenomics-visualizationmultivariate-analysismultivariate-statisticsomicsr-pkgr-project

1.5 match 182 stars 13.71 score 1.3k scripts 22 dependents

trinker

lexicon:Lexicons for Text Analysis

A collection of lexical hash tables, dictionaries, and word lists.

Maintained by Tyler Rinker. Last updated 3 years ago.

hashlexiconlookupnames-frequentstopwordstext-dictionariestext-mining

1.5 match 111 stars 8.80 score 224 scripts 25 dependents

ipbes-data

IPBES.R:Tool functions used by the Data and Knowledge Technical Support Unit of IPBES

More about what it does (maybe more than one line).

Maintained by Rainer M. Krug. Last updated 1 years ago.

4.4 match 2 stars 2.00 score 10 scripts

skranz

gtree:gtree basic functionality to model and solve games

gtree basic functionality to model and solve games

Maintained by Sebastian Kranz. Last updated 4 years ago.

economic-experimentseconomicsgambitgame-theorynash-equilibrium

1.9 match 18 stars 3.79 score 23 scripts 1 dependents

rickhelmus

RDCOMClient:R-DCOM client

Provides dynamic client-side access to (D)COM applications from within R.

Maintained by Duncan Temple Lang. Last updated 1 years ago.

1.8 match 3.90 score 315 scripts

bioc

systemPipeShiny:systemPipeShiny: An Interactive Framework for Workflow Management and Visualization

systemPipeShiny (SPS) extends the widely used systemPipeR (SPR) workflow environment with a versatile graphical user interface provided by a Shiny App. This allows non-R users, such as experimentalists, to run many systemPipeR’s workflow designs, control, and visualization functionalities interactively without requiring knowledge of R. Most importantly, SPS has been designed as a general purpose framework for interacting with other R packages in an intuitive manner. Like most Shiny Apps, SPS can be used on both local computers as well as centralized server-based deployments that can be accessed remotely as a public web service for using SPR’s functionalities with community and/or private data. The framework can integrate many core packages from the R/Bioconductor ecosystem. Examples of SPS’ current functionalities include: (a) interactive creation of experimental designs and metadata using an easy to use tabular editor or file uploader; (b) visualization of workflow topologies combined with auto-generation of R Markdown preview for interactively designed workflows; (d) access to a wide range of data processing routines; (e) and an extendable set of visualization functionalities. Complex visual results can be managed on a 'Canvas Workbench’ allowing users to organize and to compare plots in an efficient manner combined with a session snapshot feature to continue work at a later time. The present suite of pre-configured visualization examples. The modular design of SPR makes it easy to design custom functions without any knowledge of Shiny, as well as extending the environment in the future with contributions from the community.

Maintained by Le Zhang. Last updated 5 months ago.

shinyappsinfrastructuredataimportsequencingqualitycontrolreportwritingexperimentaldesignclusteringbioconductorbioconductor-packagedata-visualizationshinysystempiper

0.8 match 33 stars 7.03 score 36 scripts

bioc

ViSEAGO:ViSEAGO: a Bioconductor package for clustering biological functions using Gene Ontology and semantic similarity

The main objective of ViSEAGO package is to carry out a data mining of biological functions and establish links between genes involved in the study. We developed ViSEAGO in R to facilitate functional Gene Ontology (GO) analysis of complex experimental design with multiple comparisons of interest. It allows to study large-scale datasets together and visualize GO profiles to capture biological knowledge. The acronym stands for three major concepts of the analysis: Visualization, Semantic similarity and Enrichment Analysis of Gene Ontology. It provides access to the last current GO annotations, which are retrieved from one of NCBI EntrezGene, Ensembl or Uniprot databases for several species. Using available R packages and novel developments, ViSEAGO extends classical functional GO analysis to focus on functional coherence by aggregating closely related biological themes while studying multiple datasets at once. It provides both a synthetic and detailed view using interactive functionalities respecting the GO graph structure and ensuring functional coherence supplied by semantic similarity. ViSEAGO has been successfully applied on several datasets from different species with a variety of biological questions. Results can be easily shared between bioinformaticians and biologists, enhancing reporting capabilities while maintaining reproducibility.

Maintained by Aurelien Brionne. Last updated 2 months ago.

softwareannotationgogenesetenrichmentmultiplecomparisonclusteringvisualization

0.5 match 6.64 score 22 scripts

bioc

omicsViewer:Interactive and explorative visualization of SummarizedExperssionSet or ExpressionSet using omicsViewer

omicsViewer visualizes ExpressionSet (or SummarizedExperiment) in an interactive way. The omicsViewer has a separate back- and front-end. In the back-end, users need to prepare an ExpressionSet that contains all the necessary information for the downstream data interpretation. Some extra requirements on the headers of phenotype data or feature data are imposed so that the provided information can be clearly recognized by the front-end, at the same time, keep a minimum modification on the existing ExpressionSet object. The pure dependency on R/Bioconductor guarantees maximum flexibility in the statistical analysis in the back-end. Once the ExpressionSet is prepared, it can be visualized using the front-end, implemented by shiny and plotly. Both features and samples could be selected from (data) tables or graphs (scatter plot/heatmap). Different types of analyses, such as enrichment analysis (using Bioconductor package fgsea or fisher's exact test) and STRING network analysis, will be performed on the fly and the results are visualized simultaneously. When a subset of samples and a phenotype variable is selected, a significance test on means (t-test or ranked based test; when phenotype variable is quantitative) or test of independence (chi-square or fisher’s exact test; when phenotype data is categorical) will be performed to test the association between the phenotype of interest with the selected samples. Additionally, other analyses can be easily added as extra shiny modules. Therefore, omicsViewer will greatly facilitate data exploration, many different hypotheses can be explored in a short time without the need for knowledge of R. In addition, the resulting data could be easily shared using a shiny server. Otherwise, a standalone version of omicsViewer together with designated omics data could be easily created by integrating it with portable R, which can be shared with collaborators or submitted as supplementary data together with a manuscript.

Maintained by Chen Meng. Last updated 2 months ago.

softwarevisualizationgenesetenrichmentdifferentialexpressionmotifdiscoverynetworknetworkenrichment

0.5 match 4 stars 6.02 score 22 scripts

hannahcomiskey

mcmsupply:Estimating Public and Private Sector Contraceptive Market Supply Shares

Family Planning programs and initiatives typically use nationally representative surveys to estimate key indicators of a country’s family planning progress. However, in recent years, routinely collected family planning services data (Service Statistics) have been used as a supplementary data source to bridge gaps in the surveys. The use of service statistics comes with the caveat that adjustments need to be made for missing private sector contributions to the contraceptive method supply chain. Evaluating the supply source of modern contraceptives often relies on Demographic Health Surveys (DHS), where many countries do not have recent data beyond 2015/16. Fortunately, in the absence of recent surveys we can rely on statistical model-based estimates and projections to fill the knowledge gap. We present a Bayesian, hierarchical, penalized-spline model with multivariate-normal spline coefficients, to account for across method correlations, to produce country-specific,annual estimates for the proportion of modern contraceptive methods coming from the public and private sectors. This package provides a quick and convenient way for users to access the DHS modern contraceptive supply share data at national and subnational administration levels, estimate, evaluate and plot annual estimates with uncertainty for a sample of low- and middle-income countries. Methods for the estimation of method supply shares at the national level are described in Comiskey, Alkema, Cahill (2022) <arXiv:2212.03844>.

Maintained by Hannah Comiskey. Last updated 12 months ago.

jagscpp

0.5 match 2 stars 5.15 score 20 scripts

christophergandrud

dpmr:Data Package Manager for R

Create, install, and summarise data packages that follow the Open Knowledge Foundation's Data Package Protocol.

Maintained by Christopher Gandrud. Last updated 8 years ago.

0.5 match 56 stars 4.45 score 5 scripts

sistia01

DWLS:Gene Expression Deconvolution Using Dampened Weighted Least Squares

The rapid development of single-cell transcriptomic technologies has helped uncover the cellular heterogeneity within cell populations. However, bulk RNA-seq continues to be the main workhorse for quantifying gene expression levels due to technical simplicity and low cost. To most effectively extract information from bulk data given the new knowledge gained from single-cell methods, we have developed a novel algorithm to estimate the cell-type composition of bulk data from a single-cell RNA-seq-derived cell-type signature. Comparison with existing methods using various real RNA-seq data sets indicates that our new approach is more accurate and comprehensive than previous methods, especially for the estimation of rare cell types. More importantly,our method can detect cell-type composition changes in response to external perturbations, thereby providing a valuable, cost-effective method for dissecting the cell-type-specific effects of drug treatments or condition changes. As such, our method is applicable to a wide range of biological and clinical investigations. Dampened weighted least squares ('DWLS') is an estimation method for gene expression deconvolution, in which the cell-type composition of a bulk RNA-seq data set is computationally inferred. This method corrects common biases towards cell types that are characterized by highly expressed genes and/or are highly prevalent, to provide accurate detection across diverse cell types. See: <https://www.nature.com/articles/s41467-019-10802-z.pdf> for more information about the development of 'DWLS' and the methods behind our functions.

Maintained by Adriana Sistig. Last updated 3 years ago.

0.5 match 2 stars 3.62 score 42 scripts