Showing 200 of 354 total results
ropensci
spocc: Interface to Species Occurrence Data Sources
A programmatic interface to many species occurrence data sources, including Global Biodiversity Information Facility ('GBIF'), 'iNaturalist', 'eBird', Integrated Digitized 'Biocollections' ('iDigBio'), 'VertNet', Ocean 'Biogeographic' Information System ('OBIS'), and Atlas of Living Australia ('ALA'). Includes functionality for retrieving species occurrence data, and combining those data.
Maintained by Hannah Owens. Last updated 1 month ago.
specimens api web-services occurrences species taxonomy gbif inat vertnet ebird idigbio obis ala antweb bison data ecoengine inaturalist occurrence species-occurrence spocc
49.0 match 118 stars 10.09 score 552 scripts 5 dependents
ropensci
rgbif: Interface to the Global Biodiversity Information Facility API
A programmatic interface to the Web Service methods provided by the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/developer/summary>). GBIF is a database of species occurrence records from sources all over the globe. rgbif includes functions for searching for taxonomic names, retrieving information on data providers, getting species occurrence records, getting counts of occurrence records, and using the GBIF tile map service to make rasters summarizing huge amounts of data.
Maintained by John Waller. Last updated 3 days ago.
gbif specimens api web-services occurrences species taxonomy biodiversity data lifewatch oscibio spocc
34.1 match 161 stars 13.26 score 2.1k scripts 20 dependents
nowosad
comat: Creates Co-Occurrence Matrices of Spatial Data
Builds co-occurrence matrices based on spatial raster data. It includes creation of weighted co-occurrence matrices (wecoma) and integrated co-occurrence matrices (incoma; Vadivel et al. (2007) <doi:10.1016/j.patrec.2007.01.004>).
Maintained by Jakub Nowosad. Last updated 1 year ago.
34.2 match 6 stars 6.31 score 25 scripts 3 dependents
jbdorey
BeeBDC: Occurrence Data Cleaning
Flags and checks occurrence data that are in Darwin Core format. The package includes generic functions and data as well as some that are specific to bees. This package is meant to build upon and be complementary to other excellent occurrence cleaning packages, including 'bdc' and 'CoordinateCleaner'. This package uses datasets from several sources and particularly from the Discover Life Website, created by Ascher and Pickering (2020). For further information, please see the original publication and package website. Publication - Dorey et al. (2023) <doi:10.1101/2023.06.30.547152> and package website - Dorey et al. (2023) <https://github.com/jbdorey/BeeBDC>.
Maintained by James B. Dorey. Last updated 4 months ago.
35.4 match 3 stars 5.68 score 7 scripts
r-a-dobson
dynamicSDM: Species Distribution and Abundance Modelling at High Spatio-Temporal Resolution
A collection of novel tools for generating species distribution and abundance models (SDM) that are dynamic through both space and time. These highly flexible functions incorporate spatial and temporal aspects across key SDM stages; including when cleaning and filtering species occurrence data, generating pseudo-absence records, assessing and correcting sampling biases and autocorrelation, extracting explanatory variables and projecting distribution patterns. Throughout, functions utilise Google Earth Engine and Google Drive to minimise the computing power and storage demands associated with species distribution modelling at high spatio-temporal resolution.
Maintained by Rachel Dobson. Last updated 27 days ago.
dynamicsdm google-earth-engine googledrive sdm spatiotemporal spatiotemporal-data-analysis spatiotemporal-forecasting species-distribution-modelling species-distributions
32.0 match 6 stars 6.16 score 20 scripts
luomus
finbif: Interface for the 'Finnish Biodiversity Information Facility' API
A programmatic interface to the 'Finnish Biodiversity Information Facility' ('FinBIF') API (<https://api.laji.fi>). 'FinBIF' aggregates Finnish biodiversity data from multiple sources in a single open access portal for researchers, citizen scientists, industry and government. 'FinBIF' allows users of biodiversity information to find, access, combine and visualise data on Finnish plants, animals and microorganisms. The 'finbif' package makes the publicly available data in 'FinBIF' easily accessible to programmers. Biodiversity information is available on taxonomy and taxon occurrence. Occurrence data can be filtered by taxon, time, location and other variables. The data accessed are conveniently preformatted for subsequent analyses.
Maintained by William K. Morris. Last updated 6 days ago.
api biodiversity biodiversity-informatics biodiversity-information finbif finbif-access occurrences r-programming species specimens taxon taxonomy web-services
24.2 match 5 stars 8.15 score 42 scripts 3 dependents
divdyn
divDyn: Diversity Dynamics using Fossil Sampling Data
Functions to describe sampling and diversity dynamics of fossil occurrence datasets (e.g. from the Paleobiology Database). The package includes methods to calculate range- and occurrence-based metrics of taxonomic richness, extinction and origination rates, along with traditional sampling measures. A powerful subsampling tool is also included that implements frequently used sampling standardization methods in a multiple bin-framework. The plotting of time series and the occurrence data can be simplified by the functions incorporated in the package, as well as other calculations, such as environmental affinities and extinction selectivity testing. Details can be found in: Kocsis, A.T.; Reddin, C.J.; Alroy, J. and Kiessling, W. (2019) <doi:10.1101/423780>.
Maintained by Adam T. Kocsis. Last updated 4 months ago.
diversity extinction fossil-data occurrences origination paleobiology cpp
28.7 match 11 stars 6.48 score 137 scripts
gagolews
stringi: Fast and Portable Character String Processing Facilities
A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).
Maintained by Marek Gagolewski. Last updated 1 month ago.
icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi stringr text text-processing tidy-data unicode cpp
9.9 match 309 stars 18.31 score 10k scripts 8.6k dependents
biodiverse
unmarked: Models for Data from Unmarked Animals
Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
Maintained by Ken Kellner. Last updated 1 day ago.
12.0 match 4 stars 13.03 score 652 scripts 12 dependents
ecospat
ecospat: Spatial Ecology Miscellaneous Methods
Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre, core and post modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.
Maintained by Olivier Broennimann. Last updated 1 month ago.
15.8 match 32 stars 9.35 score 418 scripts 1 dependent
griffithdan
cooccur: Probabilistic Species Co-Occurrence Analysis in R
This R package applies the probabilistic model of species co-occurrence (Veech 2013) to a set of species distributed among a set of survey or sampling sites. The algorithm calculates the observed and expected frequencies of co-occurrence between each pair of species. The expected frequency is based on the distribution of each species being random and independent of the other species. The analysis returns the probabilities that a more extreme (either low or high) value of co-occurrence could have been obtained by chance. The package also includes functions for visualizing species co-occurrence results and preparing data for downstream analyses.
Maintained by Daniel M. Griffith. Last updated 7 years ago.
29.9 match 3 stars 4.63 score 142 scripts
b-cubed-eu
b3gbi: General Biodiversity Indicators for Biodiversity Data Cubes
Calculate general biodiversity indicators from GBIF data cubes. Includes many common indicators such as species richness and evenness, which can be calculated over time (trends) or space (maps).
Maintained by Shawn Dove. Last updated 13 days ago.
biodiversity-indicators data-cubes
21.7 match 3 stars 6.26 score 34 scripts 1 dependent
avrodrigues
naturaList: Classify Occurrences by Confidence Levels in the Species ID
Classify occurrence records based on confidence levels of species identification. In addition, implement tools to filter occurrences inside grid cells and to manually check for possible errors with an interactive shiny application.
Maintained by Arthur Vinicius Rodrigues. Last updated 1 year ago.
27.3 match 4.66 score 23 scripts
r-forge
wordspace: Distributional Semantic Models in R
An interactive laboratory for research on distributional semantic models ('DSM', see <https://en.wikipedia.org/wiki/Distributional_semantics> for more information).
Maintained by Stephanie Evert. Last updated 3 months ago.
24.0 match 4.95 score 150 scripts 2 dependents
ropensci
occCite: Querying and Managing Large Biodiversity Occurrence Datasets
Facilitates the gathering of biodiversity occurrence data from disparate sources. Metadata is managed throughout the process to facilitate reporting and enhanced ability to repeat analyses.
Maintained by Hannah L. Owens. Last updated 5 months ago.
biodiversity-data biodiversity-informatics biodiversity-standards citations museum-collection-specimens museum-collections museum-metadata
15.8 match 23 stars 7.30 score 43 scripts
frareb
inpdfr: Analyse Text Documents Using Ecological Tools
A set of functions to analyse and compare texts, using classical text mining functions, as well as those from theoretical ecology.
Maintained by Rebaudo Francois. Last updated 2 years ago.
25.7 match 2 stars 4.41 score 26 scripts
wallaceecomod
wallace: A Modular Platform for Reproducible Modeling of Species Niches and Distributions
The 'shiny' application Wallace is a modular platform for reproducible modeling of species niches and distributions. Wallace guides users through a complete analysis, from the acquisition of species occurrence and environmental data to visualizing model predictions on an interactive map, thus bundling complex workflows into a single, streamlined interface. An extensive vignette, which guides users through most package functionality, can be found on the package's GitHub Pages website: <https://wallaceecomod.github.io/wallace/articles/tutorial-v2.html>.
Maintained by Mary E. Blair. Last updated 9 days ago.
13.5 match 133 stars 8.36 score 96 scripts
tjheaton
carbondate: Calibration and Summarisation of Radiocarbon Dates
Performs Bayesian non-parametric calibration of multiple related radiocarbon determinations, and summarises the calendar age information to plot their joint calendar age density (see Heaton (2022) <doi:10.1111/rssc.12599>). Also models the occurrence of radiocarbon samples as a variable-rate (inhomogeneous) Poisson process, plotting the posterior estimate for the occurrence rate of the samples over calendar time, and providing information about potential change points.
Maintained by Timothy J Heaton. Last updated 2 months ago.
19.4 match 5 stars 5.78 score 20 scripts
palaeoverse
palaeoverse: Prepare and Explore Data for Palaeobiological Analyses
Provides functionality to support data preparation and exploration for palaeobiological analyses, improving code reproducibility and accessibility. The wider aim of 'palaeoverse' is to bring the palaeobiological community together to establish agreed standards. The package currently includes functionality for data cleaning, binning (time and space), exploration, summarisation and visualisation. Reference datasets (i.e. Geological Time Scales <https://stratigraphy.org/chart>) and auxiliary functions are also provided. Details can be found in: Jones et al., (2023) <doi: 10.1111/2041-210X.14099>.
Maintained by Lewis A. Jones. Last updated 5 months ago.
biodiversity fossil palaeobiology paleobiology
12.8 match 21 stars 8.57 score 44 scripts 1 dependent
bmaitner
BIEN: Tools for Accessing the Botanical Information and Ecology Network Database
Provides tools for accessing the Botanical Information and Ecology Network (BIEN) database. The BIEN database contains cleaned and standardized botanical data including occurrence, trait, plot and taxonomic data (see <https://bien.nceas.ucsb.edu/bien/> for more information). This package provides functions that query the BIEN database by constructing and executing optimized SQL queries.
Maintained by Brian Maitner. Last updated 1 month ago.
17.9 match 6.04 score 205 scripts 5 dependents
insightsengineering
tern: Create Common TLGs Used in Clinical Trials
Tables, Listings, and Graphs (TLG) library for common outputs used in clinical trials.
Maintained by Joe Zhu. Last updated 2 months ago.
clinical-trials graphs listings nest outputs tables
8.3 match 79 stars 12.62 score 186 scripts 9 dependents
ropensci
rvertnet: Search 'Vertnet', a 'Database' of Vertebrate Specimen Records
Retrieve, map and summarize data from the 'VertNet.org' archives (<https://vertnet.org/>). Functions allow searching by many parameters, including 'taxonomic' names, places, and dates. In addition, there is an interface for conducting spatially delimited searches, and another for requesting large 'datasets' via email.
Maintained by Dave Slager. Last updated 5 months ago.
species occurrences biodiversity maps vertnet mammals mammalia specimens api-wrapper specimen spocc
11.9 match 7 stars 8.51 score 35 scripts 6 dependents
ropensci
CoordinateCleaner: Automated Cleaning of Occurrence Records from Biological Collections
Automated flagging of common spatial and temporal errors in biological and paleontological collection data, for the use in conservation, ecology and paleontology. Includes automated tests to easily flag (and exclude) records assigned to country or province centroid, the open ocean, the headquarters of the Global Biodiversity Information Facility, urban areas or the location of biodiversity institutions (museums, zoos, botanical gardens, universities). Furthermore identifies per species outlier coordinates, zero coordinates, identical latitude/longitude and invalid coordinates. Also implements an algorithm to identify data sets with a significant proportion of rounded coordinates. Especially suited for large data sets. The reference for the methodology is: Zizka et al. (2019) <doi:10.1111/2041-210X.13152>.
Maintained by Alexander Zizka. Last updated 1 year ago.
8.4 match 82 stars 10.93 score 306 scripts 3 dependents
iobis
robis: Ocean Biodiversity Information System (OBIS) Client
Client for the Ocean Biodiversity Information System (<https://obis.org>).
Maintained by Pieter Provoost. Last updated 1 year ago.
11.9 match 41 stars 7.54 score 282 scripts
tbep-tech
tbeptools: Data and Indicators for the Tampa Bay Estuary Program
Several functions are provided for working with Tampa Bay Estuary Program data and indicators, including the water quality report card, tidal creek assessments, Tampa Bay Nekton Index, Tampa Bay Benthic Index, seagrass transect data, habitat report card, and fecal indicator bacteria. Additional functions are provided for miscellaneous tasks, such as reference library curation.
Maintained by Marcus Beck. Last updated 9 days ago.
data-analysis tampa-bay tbep water-quality
11.3 match 10 stars 7.86 score 133 scripts
malaria-atlas-project
malariaAtlas: An R Interface to Open-Access Malaria Data, Hosted by the 'Malaria Atlas Project'
A suite of tools to allow you to download all publicly available parasite rate survey points, mosquito occurrence points and raster surfaces from the 'Malaria Atlas Project' <https://malariaatlas.org/> servers as well as utility functions for plotting the downloaded data.
Maintained by Mauricio van den Berg. Last updated 8 months ago.
9.7 match 44 stars 9.10 score 118 scripts 3 dependents
insightsengineering
chevron: Standard TLGs for Clinical Trials Reporting
Provide standard tables, listings, and graphs (TLGs) libraries used in clinical trials. This package implements a structure to reformat the data with 'dunlin', create reporting tables using 'rtables' and 'tern' with standardized input arguments to enable quick generation of standard outputs. In addition, it also provides comprehensive data checks and script generation functionality.
Maintained by Joe Zhu. Last updated 24 days ago.
clinical-trials graphs listings nest reporting tables
9.8 match 12 stars 8.24 score 12 scripts
config-i1
smooth: Forecasting Using State Space Models
Functions implementing Single Source of Error state space models for purposes of time series analysis and forecasting. The package includes ADAM (Svetunkov, 2023, <https://openforecast.org/adam/>), Exponential Smoothing (Hyndman et al., 2008, <doi: 10.1007/978-3-540-71918-2>), SARIMA (Svetunkov & Boylan, 2019 <doi: 10.1080/00207543.2019.1600764>), Complex Exponential Smoothing (Svetunkov & Kourentzes, 2018, <doi: 10.13140/RG.2.2.24986.29123>), Simple Moving Average (Svetunkov & Petropoulos, 2018 <doi: 10.1080/00207543.2017.1380326>) and several simulation functions. It also allows dealing with intermittent demand based on the iETS framework (Svetunkov & Boylan, 2019, <doi: 10.13140/RG.2.2.35897.06242>).
Maintained by Ivan Svetunkov. Last updated 2 days ago.
arima arima-forecasting ces ets exponential-smoothing forecast state-space time-series openblas cpp
5.7 match 90 stars 11.87 score 412 scripts 25 dependents
bioc
seqPattern: Visualising oligonucleotide patterns and motif occurrences across a set of sorted sequences
Visualising oligonucleotide patterns and sequence motif occurrences across a large set of sequences centred at a common reference point and sorted by a user-defined feature.
Maintained by Vanja Haberle. Last updated 5 months ago.
14.0 match 4.78 score 12 scripts 7 dependents
rpolars
polars: Lightning-Fast 'DataFrame' Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Soren Welling. Last updated 3 days ago.
5.5 match 499 stars 12.01 score 1.0k scripts 2 dependents
edsandorf
spdesign: Designing Stated Preference Experiments
Contemporary software commonly used to design stated preference experiments is expensive and the code is closed source. This is a free software package with an easy-to-use interface to make flexible stated preference experimental designs using state-of-the-art methods. For an overview of stated choice experimental design theory, see e.g., Rose, J. M. & Bliemer, M. C. J. (2014) in Hess S. & Daly. A. <doi:10.4337/9781781003152>. The package website can be accessed at <https://spdesign.edsandorf.me>. We acknowledge funding from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant INSPiRE (Grant agreement ID: 793163).
Maintained by Erlend Dancke Sandorf. Last updated 5 months ago.
14.1 match 4.60 score 20 scripts
gawainantell
divvy: Spatial Subsampling of Biodiversity Occurrence Data
Divide taxonomic occurrence data into geographic regions of fair comparison, with three customisable methods to standardise area and extent. Calculate common biodiversity and range-size metrics on subsampled data. Background theory and practical considerations for the methods are described in Antell and others (2023) <doi:10.31223/X5997Z>.
Maintained by Gawain Antell. Last updated 1 year ago.
16.0 match 4.00 score 10 scripts
rorynolan
filesstrings: Handy File and String Manipulation
This started out as a package for file and string manipulation. Since then, the 'fs' and 'strex' packages emerged, offering functionality previously given by this package (but it's done better in these new ones). Those packages have hence almost pushed 'filesstrings' into extinction. However, it still has a small number of unique, handy file manipulation functions which can be seen in the vignette. One example is a function to remove spaces from all file names in a directory.
Maintained by Rory Nolan. Last updated 1 year ago.
7.3 match 22 stars 8.59 score 632 scripts 4 dependents
b-cubed-eu
gcube: Simulating Biodiversity Data Cubes
This R package provides a simulation framework for biodiversity data cubes. This can start from simulating multiple species distributed in a landscape over a temporal scope. In a second phase, the simulation of a variety of observation processes and effort can generate actual occurrence datasets. Based on their (simulated) spatial uncertainty, occurrences can then be designated to a grid to form a data cube.
Maintained by Ward Langeraert. Last updated 1 month ago.
biodiversity-informatics data-cubes simulations
13.5 match 6 stars 4.60 score 9 scripts
skembel
picante: Integrating Phylogenies and Ecology
Functions for phylocom integration, community analyses, null-models, traits and evolution. Implements numerous ecophylogenetic approaches including measures of community phylogenetic and trait diversity, phylogenetic signal, estimation of trait values for unobserved taxa, null models for community and phylogeny randomizations, and utility functions for data input/output and phylogeny plotting. A full description of package functionality and methods is provided by Kembel et al. (2010) <doi:10.1093/bioinformatics/btq166>.
Maintained by Steven W. Kembel. Last updated 2 years ago.
5.3 match 34 stars 11.42 score 1.1k scripts 16 dependents
jamiemkass
ENMeval: Automated Tuning and Evaluations of Ecological Niche Models
Runs ecological niche models over all combinations of user-defined settings (i.e., tuning), performs cross validation to evaluate models, and returns data tables to aid in selection of optimal model settings that balance goodness-of-fit and model complexity. Also has functions to partition data spatially (or not) for cross validation, to plot multiple visualizations of results, to run null models to estimate significance and effect sizes of performance metrics, and to calculate range overlap between model predictions, among others. The package was originally built for Maxent models (Phillips et al. 2006, Phillips et al. 2017), but the current version allows possible extensions for any modeling algorithm. The extensive vignette, which guides users through most package functionality but unfortunately has a file size too big for CRAN, can be found here on the package's Github Pages website: <https://jamiemkass.github.io/ENMeval/articles/ENMeval-2.0-vignette.html>.
Maintained by Jamie M. Kass. Last updated 2 months ago.
5.3 match 49 stars 11.25 score 332 scripts 2 dependents
xijianzheng
coefa: Meta Analysis of Factor Analysis Based on CO-Occurrence Matrices
Provide a series of functions to conduct a meta analysis of factor analysis based on co-occurrence matrices. The tool can be used to solve the factor structure (i.e. inner structure of a construct, or scale) debate in several disciplines, such as psychology, psychiatry, management, education, and so on. References: Shafer (2005) <doi:10.1037/1040-3590.17.3.324>; Shafer (2006) <doi:10.1002/jclp.20213>; Loeber and Schmaling (1985) <doi:10.1007/BF00910652>.
Maintained by Xijian Zheng. Last updated 2 years ago.
21.9 match 2.70 score 4 scripts
rorynolan
strex: Extra String Manipulation Functions
There are some things that I wish were easier with the 'stringr' or 'stringi' packages. The foremost of these is the extraction of numbers from strings. 'stringr' and 'stringi' make you figure out the regular expression for yourself; 'strex' takes care of this for you. There are many other handy functionalities in 'strex'. Contributions to this package are encouraged; it is intended as a miscellany of string manipulation functions that cannot be found in 'stringi' or 'stringr'.
Maintained by Rory Nolan. Last updated 6 months ago.
5.3 match 41 stars 10.59 score 1.2k scripts 18 dependents
mlammens
spThin: Functions for Spatial Thinning of Species Occurrence Records for Use in Ecological Models
A set of functions that can be used to spatially thin species occurrence data. The resulting thinned data can be used in ecological modeling, such as ecological niche modeling.
Maintained by Matthew E. Aiello-Lammens. Last updated 5 years ago.
6.9 match 2 stars 8.00 score 209 scripts 3 dependents
ropensci
paleobioDB: Download and Process Data from the Paleobiology Database
Includes functions to wrap most endpoints of the 'PaleobioDB' API and functions to visualize and process the fossil data. The API documentation for the Paleobiology Database can be found at <https://paleobiodb.org/data1.2/>.
Maintained by Adrián Castro Insua. Last updated 1 year ago.
8.9 match 42 stars 6.19 score 74 scripts
dwarton
ecostats: Code and Data Accompanying the Eco-Stats Text (Warton 2022)
Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.
Maintained by David Warton. Last updated 1 year ago.
8.3 match 8 stars 6.58 score 53 scripts
bioc
sarks: Suffix Array Kernel Smoothing for discovery of correlative sequence motifs and multi-motif domains
Suffix Array Kernel Smoothing (see https://academic.oup.com/bioinformatics/article-abstract/35/20/3944/5418797), or SArKS, identifies sequence motifs whose presence correlates with numeric scores (such as differential expression statistics) assigned to the sequences (such as gene promoters). SArKS smooths over sequence similarity, quantified by location within a suffix array based on the full set of input sequences. A second round of smoothing over spatial proximity within sequences reveals multi-motif domains. Discovered motifs can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.
Maintained by Dennis Wylie. Last updated 5 months ago.
motifdiscovery generegulation geneexpression transcriptomics rnaseq differentialexpression featureextraction openjdk
11.0 match 3 stars 4.78 score 3 scripts
mapme-initiative
mapme.biodiversity: Efficient Monitoring of Global Biodiversity Portfolios
Biodiversity areas, especially primary forest, serve a multitude of functions for the local economy, the regional functionality of ecosystems, and the global health of our planet. Recently, adverse changes in human land use practices and climatic responses to increased greenhouse gas emissions have put these biodiversity areas under a variety of threats. The present package helps to analyse a number of biodiversity indicators based on freely available geographical datasets. It supports computationally efficient routines that allow the analysis of potentially global biodiversity portfolios. The primary use case of the package is to support evidence-based reporting of an organization's effort to protect biodiversity areas under threat and to identify regions where intervention is most duly needed.
Maintained by Darius A. Görgen. Last updated 3 months ago.
environment eo gis mapme spatial sustainability
5.5 match 35 stars 9.24 score 287 scripts
tommyjones
textmineR: Functions for Text Mining and Topic Modeling
An aid for text mining in R, with a syntax that should be familiar to experienced R users. Provides a wrapper for several topic models that take similarly-formatted input and give similarly-formatted output. Has additional functionality for analyzing and diagnosing topic models.
Maintained by Tommy Jones. Last updated 2 years ago.
4.6 match 106 stars 10.83 score 310 scripts 7 dependents
sylvainschmitt
SSDM: Stacked Species Distribution Modelling
Maps species richness and endemism based on stacked species distribution models (SSDM). Individual SDMs can be created using a single or multiple algorithms (ensemble SDMs). For each species, an SDM can yield a habitat suitability map, a binary map, a between-algorithm variance map, and can assess variable importance, algorithm accuracy, and between-algorithm correlation. Methods to stack individual SDMs include summing individual probabilities and thresholding then summing. Thresholding can be based on a specific evaluation metric or by drawing repeatedly from a Bernoulli distribution. The SSDM package also provides a user-friendly interface.
Maintained by Sylvain Schmitt. Last updated 10 months ago.
7.0 match 44 stars 6.99 score 44 scripts
archaeostat
ArchaeoPhases: Post-Processing of Markov Chain Monte Carlo Simulations for Chronological Modelling
Statistical analysis of archaeological dates and groups of dates. This package allows post-processing of Markov Chain Monte Carlo (MCMC) simulations from 'ChronoModel' <https://chronomodel.com/>, 'Oxcal' <https://c14.arch.ox.ac.uk/oxcal.html> or 'BCal' <https://bcal.shef.ac.uk/>. It provides functions for the study of rhythms of the long term from the posterior distribution of a series of dates (tempo and activity plot). It also allows the estimation and visualization of time ranges from the posterior distribution of groups of dates (e.g. duration, transition and hiatus between successive phases) as described in Philippe and Vibet (2020) <doi:10.18637/jss.v093.c01>.
Maintained by Anne Philippe. Last updated 11 months ago.
archaeology bayesian-statistics geochronology markov-chain radiocarbon-dates
6.8 match 10 stars 6.90 score 66 scripts
gagolews
stringx: Replacements for Base String Functions Powered by 'stringi'
English is the native language for only 5% of the World population. Also, only 17% of us can understand this text. Moreover, the Latin alphabet is the main one for merely 36% of the total. The early computer era, now a very long time ago, was dominated by the US. Due to the proliferation of the internet, smartphones, social media, and other technologies and communication platforms, this is no longer the case. This package replaces base R string functions (such as grep(), tolower(), sprintf(), and strptime()) with ones that fully support the Unicode standards related to natural language and date-time processing. It also fixes some long-standing inconsistencies, and introduces some new, useful features. Thanks to 'ICU' (International Components for Unicode) and 'stringi', they are fast, reliable, and portable across different platforms.
Maintained by Marek Gagolewski. Last updated 2 months ago.
icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi text text-processing unicode
9.8 match 28 stars 4.75 score 1 script
lleisong
itsdm: Isolation Forest-Based Presence-Only Species Distribution Modeling
Collection of R functions to do purely presence-only species distribution modeling with isolation forest (iForest) and its variations such as Extended isolation forest and SCiForest. See the details of these methods in references: Liu, F.T., Ting, K.M. and Zhou, Z.H. (2008) <doi:10.1109/ICDM.2008.17>, Hariri, S., Kind, M.C. and Brunner, R.J. (2019) <doi:10.1109/TKDE.2019.2947676>, Liu, F.T., Ting, K.M. and Zhou, Z.H. (2010) <doi:10.1007/978-3-642-15883-4_18>, Guha, S., Mishra, N., Roy, G. and Schrijvers, O. (2016) <https://proceedings.mlr.press/v48/guha16.html>, Cortes, D. (2021) <arXiv:2110.13402>. Additionally, Shapley values are used to explain model inputs and outputs. See details in references: Shapley, L.S. (1953) <doi:10.1515/9781400881970-018>, Lundberg, S.M. and Lee, S.I. (2017) <https://dl.acm.org/doi/abs/10.5555/3295222.3295230>, Molnar, C. (2020) <ISBN:978-0-244-76852-2>, Štrumbelj, E. and Kononenko, I. (2014) <doi:10.1007/s10115-013-0679-x>. itsdm also provides functions to diagnose variable response, analyze variable importance, draw spatial dependence of variables and examine variable contribution. As utilities, the package includes a few functions to download bioclimatic variables including 'WorldClim' version 2.0 (see Fick, S.E. and Hijmans, R.J. (2017) <doi:10.1002/joc.5086>) and 'CMCC-BioClimInd' (see Noce, S., Caporaso, L. and Santini, M. (2020) <doi:10.1038/s41597-020-00726-5>).
Maintained by Lei Song. Last updated 2 years ago.
isolation-forestoutlier-detectionpresence-onlymodelshapley-valuespecies-distribution-modelling
8.1 match 4 stars 5.59 score 65 scripts
atlasoflivingaustralia
galah:Biodiversity Data from the GBIF Node Network
The Global Biodiversity Information Facility ('GBIF', <https://www.gbif.org>) sources data from an international network of data providers, known as 'nodes'. Several of these nodes - the "living atlases" (<https://living-atlases.gbif.org>) - maintain their own web services using software originally developed by the Atlas of Living Australia ('ALA', <https://www.ala.org.au>). 'galah' enables the R community to directly access data and resources hosted by 'GBIF' and its partner nodes.
Maintained by Martin Westgate. Last updated 1 months ago.
4.9 match 43 stars 9.17 score 275 scripts 1 dependents
bnosac
udpipe:Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit
This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser. Namely functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.
Maintained by Jan Wijffels. Last updated 2 years ago.
conlldependency-parserlemmatizationnatural-language-processingnlppos-taggingr-pkgrcpptext-miningtokenizerudpipecpp
3.8 match 215 stars 11.83 score 1.2k scripts 9 dependents
dracor-org
rdracor:Access to the 'DraCor' API
Provide an interface for 'Drama Corpora Project' ('DraCor') API: <https://dracor.org/documentation/api>.
Maintained by Ivan Pozdniakov. Last updated 6 months ago.
8.7 match 14 stars 5.05 score 40 scripts
theropod1
paleoDiv:Extracting and Visualizing Paleobiodiversity
Contains various tools for conveniently downloading and editing taxon-specific datasets from the Paleobiology Database <https://paleobiodb.org>, extracting information on abundance, temporal distribution of subtaxa and taxonomic diversity through deep time, and visualizing these data in relation to phylogeny and stratigraphy.
Maintained by Darius Nau. Last updated 5 months ago.
15.7 match 2 stars 2.78 score
biorgeo
bioregion:Comparison of Bioregionalisation Methods
The main purpose of this package is to propose a transparent methodological framework to compare bioregionalisation methods based on hierarchical and non-hierarchical clustering algorithms (Kreft & Jetz (2010) <doi:10.1111/j.1365-2699.2010.02375.x>) and network algorithms (Lenormand et al. (2019) <doi:10.1002/ece3.4718> and Leroy et al. (2019) <doi:10.1111/jbi.13674>).
Maintained by Maxime Lenormand. Last updated 11 days ago.
biogeographybioregionbioregionalizationcpp
6.9 match 7 stars 6.27 score 11 scripts
ecor
RGENERATEPREC:Tools to Generate Daily-Precipitation Time Series
The method 'generate()' is extended for spatial multi-site stochastic generation of daily precipitation. It generates precipitation occurrence in several sites using logit regression (Generalized Linear Models) and the approach by D.S. Wilks (1998) <doi:10.1016/S0022-1694(98)00186-3>.
Maintained by Emanuele Cordano. Last updated 7 months ago.
8.0 match 4 stars 5.26 score 45 scripts
equitable-equations
fqar:Floristic Quality Assessment Tools for R
Tools for downloading and analyzing floristic quality assessment data. See Freyman et al. (2015) <doi:10.1111/2041-210X.12491> for more information about floristic quality assessment and the associated database.
Maintained by Andrew Gard. Last updated 2 months ago.
7.0 match 5 stars 5.88 score 5 scripts
lbbe-software
Mondrian:A Simple Graphical Representation of the Relative Occurrence and Co-Occurrence of Events
The unique function of this package allows representing in a single graph the relative occurrence and co-occurrence of events measured in a sample. As examples, the package was applied to describe the occurrence and co-occurrence of different species of bacterial or viral symbionts infecting arthropods at the individual level. The graphic allows determining the prevalence of each symbiont and the patterns of multiple infections (i.e. whether different symbionts share the same individual hosts). We named the package after the famous painter as the graphical output recalls Mondrian’s paintings.
Maintained by Aurélie Siberchicot. Last updated 8 months ago.
10.2 match 2 stars 4.00 score 8 scripts
macroecology
letsR:Data Handling and Analysis in Macroecology
Handling, processing, and analyzing geographic data on species' distributions and environmental variables. Read Vilela & Villalobos (2015) <doi:10.1111/2041-210X.12401> for details.
Maintained by Bruno Vilela. Last updated 2 months ago.
4.5 match 29 stars 8.87 score 104 scripts
bmaitner
S4DM:Small Sample Size Species Distribution Modeling
Implements a set of distribution modeling methods that are suited to species with small sample sizes (e.g., poorly sampled species or rare species). While these methods can also be used on well-sampled taxa, they are united by the fact that they can be utilized with relatively few data points. More details on the currently implemented methodologies can be found in Drake and Richards (2018) <doi:10.1002/ecs2.2373>, Drake (2015) <doi:10.1098/rsif.2015.0086>, and Drake (2014) <doi:10.1890/ES13-00202.1>.
Maintained by Brian S. Maitner. Last updated 1 months ago.
open-sciencerange-modellingrare-speciesspecies-distribution-modelingspecies-distribution-modelling
6.7 match 4 stars 5.97 score 33 scripts
luomus
f2g:FinBIF to GBIF
Tools for publishing FinBIF data to GBIF.
Maintained by William K. Morris. Last updated 10 days ago.
13.1 match 1 stars 3.02 score
quanteda
quanteda:Quantitative Analysis of Textual Data
A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
Maintained by Kenneth Benoit. Last updated 2 months ago.
corpusnatural-language-processingquantedatext-analyticsonetbbcpp
2.3 match 851 stars 16.68 score 5.4k scripts 51 dependents
ohdsi
omock:Creation of Mock Observational Medical Outcomes Partnership Common Data Model
Creates mock data for testing and package development for the Observational Medical Outcomes Partnership common data model. The package offers functions crafted with pipeline-friendly implementation, enabling users to effortlessly include only the necessary tables for their testing needs.
Maintained by Mike Du. Last updated 1 months ago.
5.1 match 2 stars 7.44 score 45 scripts 1 dependents
bioc
mosbi:Molecular Signature identification using Biclustering
This package is an implementation of the biclustering ensemble method MoSBi (Molecular Signature Identification from Biclustering). MoSBi provides standardized interfaces for biclustering results and can combine their results with a multi-algorithm ensemble approach to compute robust ensemble biclusters on molecular omics data. This is done by computing similarity networks of biclusters and filtering for overlaps using a custom error model. After that, the Louvain modularity is used to extract bicluster communities from the similarity network, which can then be converted to ensemble biclusters. Additionally, MoSBi includes several network visualization methods to give an intuitive and scalable overview of the results. MoSBi comes with several biclustering algorithms, but can be easily extended to new biclustering algorithms.
Maintained by Tim Daniel Rose. Last updated 5 months ago.
softwarestatisticalmethodclusteringnetworkcpp
8.8 match 4.30 score 8 scripts
nataliepatten
gatoRs:Geographic and Taxonomic Occurrence R-Based Scrubbing
Streamlines downloading and cleaning biodiversity data from Integrated Digitized Biocollections (iDigBio) and the Global Biodiversity Information Facility (GBIF).
Maintained by Natalie N. Patten. Last updated 10 months ago.
6.0 match 11 stars 6.16 score 66 scripts
tesselle
tabula:Analysis and Visualization of Archaeological Count Data
An easy way to examine archaeological count data. This package provides several tests and measures of diversity: heterogeneity and evenness (Brillouin, Shannon, Simpson, etc.), richness and rarefaction (Chao1, Chao2, ACE, ICE, etc.), turnover and similarity (Brainerd-Robinson, etc.). It makes it easy to visualize count data and statistical thresholds: rank vs abundance plots, heatmaps, Ford (1962) and Bertin (1977) diagrams, etc.
Maintained by Nicolas Frerebeau. Last updated 13 days ago.
data-visualizationarchaeologyarchaeological-science
7.1 match 5.10 score 38 scripts 1 dependents
trinker
qdap:Bridging the Gap Between Qualitative Data and Quantitative Analysis
Automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis, however, many functions are applicable to other areas of Text Mining/ Natural Language Processing.
Maintained by Tyler Rinker. Last updated 4 years ago.
qdapquantitative-discourse-analysistext-analysistext-miningtext-plottingopenjdk
3.7 match 176 stars 9.61 score 1.3k scripts 3 dependents
jmestret
GeoThinneR:Simple Spatial Thinning for Ecological and Spatial Analysis
Provides efficient geospatial thinning algorithms to reduce the density of coordinate data while maintaining spatial relationships. Implements K-D Tree and brute-force distance-based thinning, as well as grid-based and precision-based thinning methods. For more information on the methods, see Elseberg et al. (2012) <https://hdl.handle.net/10446/86202>.
Maintained by Jorge Mestre-Tomás. Last updated 5 months ago.
6.2 match 9 stars 5.73 score 7 scripts 1 dependents
bjoelle
FossilSim:Simulation and Plots for Fossil and Taxonomy Data
Simulating and plotting taxonomy and fossil data on phylogenetic trees under mechanistic models of speciation, preservation and sampling.
Maintained by Joelle Barido-Sottani. Last updated 6 months ago.
6.7 match 1 stars 5.24 score 65 scripts 1 dependents
adamlilith
enmSdmX:Species Distribution Modeling and Ecological Niche Modeling
Implements species distribution modeling and ecological niche modeling, including: bias correction, spatial cross-validation, model evaluation, raster interpolation, biotic "velocity" (speed and direction of movement of a "mass" represented by a raster), interpolating across a time series of rasters, and use of spatially imprecise records. The heart of the package is a set of "training" functions which automatically optimize model complexity based on the number of available occurrences. These algorithms include MaxEnt, MaxNet, boosted regression trees/gradient boosting machines, generalized additive models, generalized linear models, natural splines, and random forests. To enhance interoperability with other modeling packages, no new classes are created. The package works with 'PROJ6' geodetic objects and coordinate reference systems.
Maintained by Adam B. Smith. Last updated 25 days ago.
bias-correctionbiogeographyecological-niche-modelingecological-niche-modellingniche-modelingniche-modellingspecies-distribution-modelingopenjdk
6.2 match 25 stars 5.62 score 37 scripts
eitsupi
neopolars:R Bindings for the 'polars' Rust Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Tatsuya Shima. Last updated 1 days ago.
7.0 match 40 stars 4.86 score 1 scripts
dvrbts
labdsv:Ordination and Multivariate Analysis for Ecology
A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.
Maintained by David W. Roberts. Last updated 2 years ago.
5.6 match 3 stars 6.08 score 452 scripts 13 dependents
azvoleff
glcm:Calculate Textures from Grey-Level Co-Occurrence Matrices (GLCMs)
Enables calculation of image textures (Haralick 1973) <doi:10.1109/TSMC.1973.4309314> from grey-level co-occurrence matrices (GLCMs). Supports processing images that cannot fit in memory.
Maintained by Alex Zvoleff. Last updated 5 years ago.
6.7 match 15 stars 5.05 score 74 scripts
luismurao
tenm:Temporal Ecological Niche Models
Implements methods and functions to calibrate time-specific niche models (multi-temporal calibration), letting users execute a strict calibration and selection process of niche models based on ellipsoids, as well as functions to project the potential distribution in the present and in global change scenarios. The 'tenm' package has functions to recover information that may be lost or overlooked while applying a data curation protocol. This curation involves preserving occurrences that may appear spatially redundant (occurring in the same pixel) but originate from different time periods. A novel aspect of this package is that it might reconstruct the fundamental niche more accurately than mono-calibrated approaches. The theoretical background of the package can be found in Peterson et al. (2011) <doi:10.5860/CHOICE.49-6266>.
Maintained by Luis Osorio-Olvera. Last updated 8 months ago.
5.8 match 5 stars 5.77 score 34 scripts
viralemergence
insectDisease:Ecological Database of the World's Insect Pathogens
David Onstad provided us with this insect disease database, sometimes referred to as the 'Ecological Database of the World's Insect Pathogens' or EDWIP. Files have been converted from 'SQL' to csv, and ported into 'R' for easy exploration and analysis. Thanks to the Macroecology of Infectious Disease Research Coordination Network (RCN) for funding and support. Data are also served online in a static format at <https://edwip.ecology.uga.edu/>.
Maintained by Tad Dallas. Last updated 2 months ago.
7.5 match 13 stars 4.41 score 2 scripts
config-i1
greybox:Toolbox for Model Building and Forecasting
Implements functions and instruments for regression model building and its application to forecasting. The main scope of the package is in variables selection and models specification for cases of time series data. This includes promotional modelling, selection between different dynamic regressions with non-standard distributions of errors, selection based on cross validation, solutions to the fat regression model problem and more. Models developed in the package are tailored specifically for forecasting purposes. As a result, there are several methods that allow producing forecasts from these models and visualising them.
Maintained by Ivan Svetunkov. Last updated 2 days ago.
forecastingmodel-selectionmodel-selection-and-evaluationregressionregression-modelsstatisticscpp
3.0 match 30 stars 11.03 score 97 scripts 34 dependents
mrmaxent
maxnet:Fitting 'Maxent' Species Distribution Models with 'glmnet'
Procedures to fit species distributions models from occurrence records and environmental variables, using 'glmnet' for model fitting. Model structure is the same as for the 'Maxent' Java package, version 3.4.0, with the same feature types and regularization options. See the 'Maxent' website <http://biodiversityinformatics.amnh.org/open_source/maxent> for more details.
Maintained by Steven Phillips. Last updated 2 years ago.
3.8 match 75 stars 8.68 score 169 scripts 7 dependents
emcramer
CHOIRBM:Plots the CHOIR Body Map
Collection of utility functions for visualizing body map data collected with the Collaborative Health Outcomes Information Registry.
Maintained by Eric Cramer. Last updated 1 years ago.
body-mapcbmchoirdata-visualizationvisualization
5.9 match 5 stars 5.51 score 26 scripts
henrikbengtsson
matrixStats:Functions that Apply to Rows and Columns of Matrices (and to Vectors)
High-performing functions operating on rows and columns of matrices, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized. There are also optimized vector-based methods, e.g. binMeans(), madDiff() and weightedMedian().
Maintained by Henrik Bengtsson. Last updated 2 months ago.
1.8 match 208 stars 18.09 score 20k scripts 2.3k dependents
mindthegap-erc
StratPal:Stratigraphic Paleobiology Modeling Pipelines
The fossil record is a joint expression of ecological, taphonomic, evolutionary, and stratigraphic processes (Holland and Patzkowsky, 2012, ISBN:978-0226649382). This package allows simulating biological processes in the time domain (e.g., trait evolution, fossil abundance), and examining how their expression in the rock record (stratigraphic domain) is influenced based on age-depth models, ecological niche models, and taphonomic effects. Functions simulating common processes used in modeling trait evolution or event type data such as first/last occurrences are provided and can be used standalone or as part of a pipeline. The package comes with example data sets and tutorials in several vignettes, which can be used as a template to set up one's own simulation.
Maintained by Niklas Hohmann. Last updated 24 days ago.
palaeobiologypalaeontologypaleobiologypaleontologystratigraphic-paleobiologystratigraphy
5.3 match 1 stars 5.88 score 18 scripts
brpetrucci
paleobuddy:Simulating Diversification Dynamics
Simulation of species diversification, fossil records, and phylogenies. While the literature on species birth-death simulators is extensive, including important software like 'paleotree' and 'APE', we concluded there were interesting gaps to be filled regarding possible diversification scenarios. Here we strove for flexibility over focus, implementing a large array of regimens for users to experiment with and combine. In this way, 'paleobuddy' can be used in complement to other simulators as a flexible jack of all trades, or, in the case of scenarios implemented only here, can allow for robust and easy simulations for novel situations. Environmental data modified from that in 'RPANDA': Morlon H. et al (2016) <doi:10.1111/2041-210X.12526>.
Maintained by Bruno do Rosario Petrucci. Last updated 1 months ago.
evolutionmacroevolutionpaleobiologypaleontologyphylogenetics
5.9 match 6 stars 4.95 score 4 scripts
bbolker
emdbook:Support Functions and Data for "Ecological Models and Data"
Auxiliary functions and data sets for "Ecological Models and Data", a book presenting maximum likelihood estimation and related topics for ecologists (ISBN 978-0-691-12522-0).
Maintained by Ben Bolker. Last updated 8 months ago.
3.6 match 4 stars 8.04 score 656 scripts 21 dependents
mstrimas
colorist:Coloring Wildlife Distributions in Space-Time
Color and visualize wildlife distributions in space-time using raster data. In addition to enabling display of sequential change in distributions through the use of small multiples, 'colorist' provides functions for extracting several features of interest from a sequence of distributions and for visualizing those features using HCL (hue-chroma-luminance) color palettes. Resulting maps allow for "fair" visual comparison of intensity values (e.g., occurrence, abundance, or density) across space and time and can be used to address questions about where, when, and how consistently a species, group, or individual is likely to be found.
Maintained by Matthew Strimas-Mackey. Last updated 11 months ago.
5.1 match 14 stars 5.60 score 19 scripts
gustavobio
flora:Tools for Interacting with the Brazilian Flora 2020
Tools to quickly compile taxonomic and distribution data from the Brazilian Flora 2020.
Maintained by Gustavo Carvalho. Last updated 1 years ago.
5.3 match 29 stars 5.37 score 54 scripts 1 dependents
vascobranco
red:IUCN Redlisting Tools
Includes algorithms to facilitate the assessment of extinction risk of species according to the IUCN (International Union for Conservation of Nature, see <https://www.iucn.org/> for more information) red list criteria.
Maintained by Vasco V. Branco. Last updated 3 months ago.
6.1 match 1 stars 4.54 score 29 scripts 1 dependents
usepa
pTITAN2:Permutations of Treatment Labels and TITAN2 Analysis
Permute treatment labels for taxa and environmental gradients to generate an empirical distribution of change points. This is an extension for the 'TITAN2' package <https://cran.r-project.org/package=TITAN2>.
Maintained by Peter DeWitt. Last updated 3 years ago.
7.5 match 1 stars 3.70 score 7 scripts
jlmelville
rnndescent:Nearest Neighbor Descent Method for Approximate Nearest Neighbors
The Nearest Neighbor Descent method for finding approximate nearest neighbors by Dong and co-workers (2010) <doi:10.1145/1963405.1963487>. Based on the 'Python' package 'PyNNDescent' <https://github.com/lmcinnes/pynndescent>.
Maintained by James Melville. Last updated 8 months ago.
approximate-nearest-neighbor-searchcpp
3.7 match 11 stars 7.31 score 75 scripts
smithsonian
gde:GBIF Dataset Explorer
Functions to explore datasets from the Global Biodiversity Information Facility (GBIF - <https://www.gbif.org/>) using a Shiny interface.
Maintained by Luis J Villanueva. Last updated 2 years ago.
biodiversity-informaticsdata-issuesgbifgbif-dataoccurrenceshinyshiny-r
10.0 match 1 stars 2.70 score
ptitle
rangeBuilder:Occurrence Filtering, Geographic Standardization and Generation of Species Range Polygons
Provides tools for filtering occurrence records, generating alpha-hull-derived range polygons and mapping species distributions.
Maintained by Pascal Title. Last updated 5 months ago.
5.1 match 9 stars 4.99 score 72 scripts 1 dependents
ghislainv
hSDM:Hierarchical Bayesian Species Distribution Models
User-friendly and fast set of functions for estimating parameters of hierarchical Bayesian species distribution models (Latimer and others 2006 <doi:10.1890/04-0609>). Such models allow interpreting the observations (occurrence and abundance of a species) as a result of several hierarchical processes including ecological processes (habitat suitability, spatial dependence and anthropogenic disturbance) and observation processes (species detectability). Hierarchical species distribution models are essential for accurately characterizing the environmental response of species, predicting their probability of occurrence, and assessing uncertainty in the model results.
Maintained by Ghislain Vieilledent. Last updated 2 years ago.
4.1 match 9 stars 6.04 score 41 scripts
rspatial
geodata:Download Geographic Data
Functions for downloading of geographic data for use in spatial analysis and mapping. The package facilitates access to climate, crops, elevation, land use, soil, species occurrence, accessibility, administrative boundaries and other data.
Maintained by Robert J. Hijmans. Last updated 1 months ago.
2.3 match 162 stars 10.75 score 1.5k scripts 7 dependents
ternaustralia
ausplotsR:TERN AusPlots Australian Ecosystem Monitoring Data
Extraction, preparation, visualisation and analysis of TERN AusPlots ecosystem monitoring data. Direct access to plot-based data on vegetation and soils across Australia, including physical sample barcode numbers. Simple function calls extract the data and merge them into species occurrence matrices for downstream analysis, or calculate things like basal area and fractional cover. TERN AusPlots is a national field plot-based ecosystem surveillance monitoring method and dataset for Australia. The data have been collected across a national network of plots and transects by the Terrestrial Ecosystem Research Network (TERN - <https://www.tern.org.au>), an Australian Government NCRIS-enabled project, and its Ecosystem Surveillance platform (<https://www.tern.org.au/tern-land-observatory/ecosystem-surveillance-and-environmental-monitoring/>).
Maintained by Greg Guerin. Last updated 1 years ago.
4.1 match 10 stars 6.07 score 59 scripts
b-cubed-eu
impIndicator:Impact Indicators of Alien Taxa
Compute impact indicators of alien taxa using GBIF occurrence cube and EICAT assessment of alien species. Aggregates species impact of various scores due to mechanism. Aggregates site impact of various scores due to species.
Maintained by Mukhtar Muhammed Yahaya. Last updated 13 hours ago.
biodiversity-indicatorsimpactinvasive-species
5.6 match 4.38 score 4 scripts
julienvollering
MIAmaxent:A Modular, Integrated Approach to Maximum Entropy Distribution Modeling
Tools for training, selecting, and evaluating maximum entropy (and standard logistic regression) distribution models. This package provides tools for user-controlled transformation of explanatory variables, selection of variables by nested model comparison, and flexible model evaluation and projection. It follows principles based on the maximum-likelihood interpretation of maximum entropy modeling, and uses infinitely-weighted logistic regression for model fitting. The package is described in Vollering et al. (2019; <doi:10.1002/ece3.5654>).
Maintained by Julien Vollering. Last updated 7 months ago.
3.8 match 14 stars 6.53 score 30 scripts
mhahsler
arules:Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Maintained by Michael Hahsler. Last updated 1 months ago.
arulesassociation-rulesfrequent-itemsets
1.7 match 194 stars 13.99 score 3.3k scripts 28 dependents
cmerow
rangeModelMetadata:Provides Templates for Metadata Files Associated with Species Range Models
Range Modeling Metadata Standards (RMMS) address three challenges: they (i) are designed for convenience to encourage use, (ii) accommodate a wide variety of applications, and (iii) are extensible to allow the community of range modelers to steer it as needed. RMMS are based on a data dictionary that specifies a hierarchical structure to catalog different aspects of the range modeling process. The dictionary balances a constrained, minimalist vocabulary to improve standardization with flexibility for users to provide their own values. Merow et al. (2019) <DOI:10.1111/geb.12993> describe the standards in more detail. Note that users who prefer to use the R package 'ecospat' can obtain it from <https://github.com/ecospat/ecospat>.
Maintained by Cory Merow. Last updated 8 months ago.
ecological-metadata-languageecological-modellingecological-modelsecologyspecies-distribution-modellingspecies-distributions
3.4 match 6 stars 6.96 score 16 scripts 3 dependents
rstudio
keras3:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 4 days ago.
1.8 match 845 stars 13.57 score 264 scripts 2 dependents
jmsigner
amt:Animal Movement Tools
Manage and analyze animal movement data. The functionality of 'amt' includes methods to calculate home ranges, track statistics (e.g. step lengths, speed, or turning angles), prepare data for fitting habitat selection analyses, and simulation of space-use from fitted step-selection functions.
Maintained by Johannes Signer. Last updated 4 months ago.
2.3 match 41 stars 10.54 score 418 scripts
farewe
virtualspecies:Generation of Virtual Species Distributions
Provides a framework for generating virtual species distributions, a procedure increasingly used in ecology to improve species distribution models. This package integrates the existing methodological approaches with the objective of generating virtual species distributions with increased ecological realism.
Maintained by Boris Leroy. Last updated 1 years ago.
3.5 match 17 stars 6.68 score 158 scripts 1 dependents
hope-data-science
akc:Automatic Knowledge Classification
A tidy framework for automatic knowledge classification and visualization. Currently, the core functionality of the framework is mainly supported by modularity-based clustering (community detection) in keyword co-occurrence network, and focuses on co-word analysis of bibliometric research. However, the designed functions in 'akc' are general, and could be extended to solve other tasks in text mining as well.
Maintained by Tian-Yuan Huang. Last updated 19 days ago.
4.0 match 15 stars 5.85 score 47 scripts
kpmainali
CooccurrenceAffinity:Affinity in Co-Occurrence Data
Computes a novel metric of affinity between two entities based on their co-occurrence (using binary presence/absence data). The metric and its MLE, alpha hat, were advanced in Mainali, Slud, et al, 2021 <doi:10.1126/sciadv.abj9204>. Various types of confidence intervals and median interval were developed in Mainali and Slud, 2022 <doi:10.1101/2022.11.01.514801>.
Maintained by Kumar Mainali. Last updated 2 years ago.
5.3 match 26 stars 4.39 score 19 scripts
massimoaria
bibliometrix:Comprehensive Science Mapping Analysis
Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis.
Maintained by Massimo Aria. Last updated 8 days ago.
bibliometric-analysisbibliometricscitationcitation-networkcitationsco-authorsco-occurenceco-word-analysiscorrespondence-analysiscouplingisi-webjournalmanuscriptquantitative-analysisscholarssciencescience-mappingscientificscientometricsscopus
1.8 match 545 stars 12.54 score 518 scripts 2 dependents
steffenmoritz
imputeTS:Time Series Missing Value Imputation
Imputation (replacement) of missing values in univariate time series. Offers several imputation functions and missing data plots. Available imputation algorithms include: 'Mean', 'LOCF', 'Interpolation', 'Moving Average', 'Seasonal Decomposition', 'Kalman Smoothing on Structural Time Series models', 'Kalman Smoothing on ARIMA models'. Published in Moritz and Bartz-Beielstein (2017) <doi:10.32614/RJ-2017-009>.
Maintained by Steffen Moritz. Last updated 3 years ago.
data-visualization imputation imputation-algorithm imputets missing-data time-series cpp
1.8 match 162 stars 12.18 score 1.9k scripts 27 dependents
ewenharrison
finalfit:Quickly Create Elegant Regression Results Tables and Plots when Modelling
Generate regression results tables and plots in final format for publication. Explore models and export directly to PDF and 'Word' using 'RMarkdown'.
Maintained by Ewen Harrison. Last updated 7 months ago.
1.9 match 270 stars 11.43 score 1.0k scripts
pythonicr
strs:'Python' Style String Functions
A comprehensive set of string manipulation functions based on those found in 'Python' without relying on 'reticulate'. It provides functions that intend to (1) make it easier for users familiar with 'Python' to work with strings, (2) reduce the complexity often associated with string operations, and (3) enable users to write more readable and maintainable code that manipulates strings.
Maintained by Garrett Shipley. Last updated 2 months ago.
5.5 match 2 stars 3.90 score 5 scripts
nicholasjclark
MRFcov:Markov Random Fields with Additional Covariates
Approximate node interaction parameters of Markov Random Fields graphical networks. Models can incorporate additional covariates, allowing users to estimate how interactions between nodes in the graph are predicted to change across covariate gradients. The general methods implemented in this package are described in Clark et al. (2018) <doi:10.1002/ecy.2221>.
Maintained by Nicholas J Clark. Last updated 12 months ago.
conditional-random-fields graphical-models machine-learning markov-random-field multivariate-analysis multivariate-statistics network-analysis networks
3.5 match 24 stars 6.03 score 30 scripts
tdaverse
ripserr:Calculate Persistent Homology with Ripser-Based Engines
Ports the Ripser <https://arxiv.org/abs/1908.02518> and Cubical Ripser <https://arxiv.org/abs/2005.12692> persistent homology calculation engines from C++. Can be used as a rapid calculation tool in topological data analysis pipelines.
Maintained by Raoul Wadhwa. Last updated 2 days ago.
algebraic-topology cohomology cpp cubical-complex persistent-homology pixel point-cloud r-language r-programming rcpp rips-complex ripser simplicial-complex simplicial-homology topological-data-analysis topology vietoris-complex voxel cpp
3.6 match 7 stars 5.80 score 6 scripts
billdenney
PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis
Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.
Maintained by Bill Denney. Last updated 16 days ago.
nca noncompartmental-analysis pharmacokinetics
1.7 match 73 stars 12.61 score 214 scripts 4 dependents
yiluheihei
RevEcoR:Reverse Ecology Analysis on Microbiome
An implementation of the reverse ecology framework. Reverse ecology refers to the use of genomics to study ecology with no a priori assumptions about the organism(s) under consideration, linking organisms to their environment. It allows researchers to reconstruct the metabolic networks and study the ecology of poorly characterized microbial species from their genomic information, and has substantial potential for microbial community ecological analysis.
Maintained by Yang Cao. Last updated 6 years ago.
3.5 match 6 stars 5.77 score 22 scripts 1 dependents
hugheylab
phers:Calculate Phenotype Risk Scores
Use phenotype risk scores based on linked clinical and genetic data to study Mendelian disease and rare genetic variants. See Bastarache et al. 2018 <doi:10.1126/science.aal4043>.
Maintained by Jake Hughey. Last updated 2 years ago.
6.8 match 3.00 score 1 scripts
eikeluedeling
decisionSupport:Quantitative Support of Decision Making under Uncertainty
Supporting the quantitative analysis of binary welfare based decision making processes using Monte Carlo simulations. Decision support is given on two levels: (i) The actual decision level is to choose between two alternatives under probabilistic uncertainty. This package calculates the optimal decision based on maximizing expected welfare. (ii) The meta decision level is to allocate resources to reduce the uncertainty in the underlying decision problem, i.e. to increase the current information to improve the actual decision making process. This problem is dealt with using the Value of Information Analysis. The Expected Value of Information for arbitrary prospective estimates can be calculated as well as Individual Expected Value of Perfect Information. The probabilistic calculations are done via Monte Carlo simulations. This Monte Carlo functionality can be used on its own.
Maintained by Eike Luedeling. Last updated 11 months ago.
3.9 match 6 stars 5.17 score 123 scripts
bernd-mueller
epos:Epilepsy Ontologies' Similarities
Analysis and visualization of similarities between epilepsy ontologies based on text mining results by comparing ranked lists of co-occurring drug terms in the BioASQ corpus. The ranked result lists of neurological drug terms co-occurring with terms from the epilepsy ontologies EpSO, ESSO, EPILONT, EPISEM and FENICS undergo further analysis. The source data to create the ranked lists of drug names is produced using the text mining workflows described in Mueller, Bernd and Hagelstein, Alexandra (2016) <doi:10.4126/FRL01-006408558>, Mueller, Bernd et al. (2017) <doi:10.1007/978-3-319-58694-6_22>, Mueller, Bernd and Rebholz-Schuhmann, Dietrich (2020) <doi:10.1007/978-3-030-43887-6_52>, and Mueller, Bernd et al. (2022) <doi:10.1186/s13326-021-00258-w>.
Maintained by Bernd Mueller. Last updated 1 years ago.
4.8 match 4.03 score 53 scripts
cran
NHPoisson:Modelling and Validation of Non Homogeneous Poisson Processes
Tools for modelling, ML estimation, validation analysis and simulation of non homogeneous Poisson processes in time.
Maintained by Ana C. Cebrian. Last updated 5 years ago.
7.1 match 2 stars 2.71 score 43 scripts 2 dependents
paballand
EconGeo:Computing Key Indicators of the Spatial Distribution of Economic Activities
Functions to compute a series of indices commonly used in the fields of economic geography, economic complexity, and evolutionary economics to describe the location, distribution, spatial organization, structure, and complexity of economic activities. Functions include basic spatial indicators such as the location quotient, the Krugman specialization index, the Herfindahl or the Shannon entropy indices but also more advanced functions to compute different forms of normalized relatedness between economic activities or network-based measures of economic complexity. Most of the functions use matrix calculus and are based on bipartite (incidence) matrices consisting of region - industry pairs.
Maintained by Pierre-Alexandre Balland. Last updated 2 years ago.
3.9 match 41 stars 4.96 score 44 scripts
philipmostert
intSDM:Reproducible Integrated Species Distribution Models Across Norway using 'INLA'
Integration of disparate datasets is needed in order to make efficient use of all available data and thereby address the issues currently threatening biodiversity. Data integration is a powerful modeling framework which allows us to combine these datasets together into a single model, yet retain the strengths of each individual dataset. We therefore introduce the package, 'intSDM': an R package designed to help ecologists develop a reproducible workflow of integrated species distribution models, using data both provided from the user as well as data obtained freely online. An introduction to data integration methods is discussed in Isaac, Jarzyna, Keil, Dambly, Boersch-Supan, Browning, Freeman, Golding, Guillera-Arroita, Henrys, Jarvis, Lahoz-Monfort, Pagel, Pescott, Schmucki, Simmonds and O’Hara (2020) <doi:10.1016/j.tree.2019.08.006>.
Maintained by Philip Mostert. Last updated 2 months ago.
3.1 match 5 stars 6.26 score 12 scripts
gabrielnakamura
FishPhyloMaker:Phylogenies for a List of Finned-Ray Fishes
Provides an alternative to facilitate the construction of a phylogeny for fish species from a list of species or a community matrix using as a backbone the phylogenetic tree proposed by Rabosky et al. (2018) <doi:10.1038/s41586-018-0273-1>.
Maintained by Gabriel Nakamura. Last updated 1 years ago.
3.5 match 8 stars 5.49 score 13 scripts
agi-lab
SynthETIC:Synthetic Experience Tracking Insurance Claims
Creation of an individual claims simulator which generates various features of non-life insurance claims. An initial set of test parameters, designed to mirror the experience of an Auto Liability portfolio, were set up and applied by default to generate a realistic test data set of individual claims (see vignette). The simulated data set then allows practitioners to back-test the validity of various reserving models and to prove and/or disprove certain actuarial assumptions made in claims modelling. The distributional assumptions used to generate this data set can be easily modified by users to match their experiences. Reference: Avanzi B, Taylor G, Wang M, Wong B (2020) "SynthETIC: an individual insurance claim simulator with feature control" <arXiv:2008.05693>.
Maintained by Melantha Wang. Last updated 1 years ago.
3.1 match 12 stars 6.22 score 23 scripts 2 dependents
cran
zetadiv:Functions to Compute Compositional Turnover Using Zeta Diversity
Functions to compute compositional turnover using zeta-diversity, the number of species shared by multiple assemblages. The package includes functions to compute zeta-diversity for a specific number of assemblages and to compute zeta-diversity for a range of numbers of assemblages. It also includes functions to explain how zeta-diversity varies with distance and with differences in environmental variables between assemblages, using generalised linear models, linear models with negative constraints, generalised additive models, shape constrained additive models, and I-splines.
Maintained by Guillaume Latombe. Last updated 3 years ago.
6.5 match 3 stars 2.89 score 64 scripts
brry
berryFunctions:Function Collection Related to Plotting and Hydrology
Draw horizontal histograms, color scattered points by 3rd dimension, enhance date- and log-axis plots, zoom in X11 graphics, trace errors and warnings, use the unit hydrograph in a linear storage cascade, convert lists to data.frames and arrays, fit multiple functions.
Maintained by Berry Boessenkool. Last updated 1 months ago.
2.0 match 13 stars 9.43 score 350 scripts 16 dependents
pharmaverse
admiral:ADaM in R Asset Library
A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).
Maintained by Ben Straub. Last updated 4 days ago.
cdisc clinical-trials open-source
1.3 match 236 stars 13.89 score 486 scripts 4 dependents
bioc
genomation:Summary, annotation and visualization of genomic data
A package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.
Maintained by Altuna Akalin. Last updated 5 months ago.
annotation sequencing visualization cpgisland cpp
1.7 match 75 stars 11.09 score 738 scripts 5 dependents
bioc
MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).
Maintained by Shuangbin Xu. Last updated 5 months ago.
visualization microbiome software multiplecomparison featureextraction microbiome-analysis microbiome-data
1.9 match 183 stars 9.70 score 126 scripts 1 dependents
hannahlowens
voluModel:Modeling Species Distributions in Three Dimensions
Facilitates modeling species' ecological niches and geographic distributions based on occurrences and environments that have a vertical as well as horizontal component, and projecting models into three-dimensional geographic space. Working in three dimensions is useful in an aquatic context when the organisms one wishes to model can be found across a wide range of depths in the water column. The package also contains functions to automatically generate marine model training regions using machine learning, and interpolate and smooth patchily sampled environmental rasters using thin plate splines. Davis Rabosky AR, Cox CL, Rabosky DL, Title PO, Holmes IA, Feldman A, McGuire JA (2016) <doi:10.1038/ncomms11484>. Nychka D, Furrer R, Paige J, Sain S (2021) <doi:10.5065/D6W957CT>. Pateiro-Lopez B, Rodriguez-Casal A (2022) <https://CRAN.R-project.org/package=alphahull>.
Maintained by Hannah L. Owens. Last updated 2 days ago.
2.8 match 9 stars 6.60 score 35 scripts
humaniverse
asylum:Data on Asylum and Resettlement for the UK
Data on Asylum and Resettlement for the UK, provided by the Home Office <https://www.gov.uk/government/statistical-data-sets/immigration-system-statistics-data-tables>.
Maintained by Matthew Gwynfryn Thomas. Last updated 17 days ago.
3.6 match 3 stars 4.99 score 36 scripts
schochastics
netrankr:Analyzing Partial Rankings in Networks
Implements methods for centrality related analyses of networks. While the package includes the possibility to build more than 20 indices, its main focus lies on index-free assessment of centrality via partial rankings obtained by neighborhood-inclusion or positional dominance. These partial rankings can be analyzed with different methods, including probabilistic methods like computing expected node ranks and relative rank probabilities (how likely is it that a node is more central than another?). The methodology is described in depth in the vignettes and in Schoch (2018) <doi:10.1016/j.socnet.2017.12.003>.
Maintained by David Schoch. Last updated 1 months ago.
network-analysis network-centrality openblas cpp openmp
1.9 match 49 stars 9.56 score 91 scripts 2 dependents
ropensci
SymbiotaR2:Downloading Data from Symbiota2 Portals into R
Download data from Symbiota2 portals using Symbiota's API. Covers the Checklists, Collections, Crowdsource, Exsiccati, Glossary, ImageProcessor, Key, Media, Occurrence, Reference, Taxa, Traits, and UserRoles API families. Each Symbiota2 portal owner can load their own plugins (and modified code), and so this package may not cover every possible API endpoint from a given Symbiota2 instance.
Maintained by Austin Koontz. Last updated 3 years ago.
database library specimen-records symbiota symbiota2 symbiota2-portal
5.3 match 2 stars 3.30 score 4 scripts
moderndive
moderndive:Tidyverse-Friendly Introductory Linear Regression
Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.
Maintained by Albert Y. Kim. Last updated 3 months ago.
1.5 match 88 stars 11.35 score 1.8k scripts
pakillo
rSDM:Species distribution and niche modelling in R
Functions for niche modelling and SDM.
Maintained by Francisco Rodriguez-Sanchez. Last updated 18 days ago.
5.4 match 7 stars 3.23 score 16 scripts
bioc
periodicDNA:Set of tools to identify periodic occurrences of k-mers in DNA sequences
This R package helps the user identify k-mers (e.g. di- or tri-nucleotides) present periodically in a set of genomic loci (typically regulatory elements). The functions of this package provide a straightforward approach to find periodic occurrences of k-mers in DNA sequences, such as regulatory elements. It is not aimed at identifying motifs separated by a conserved distance; for this type of analysis, please visit the MEME website.
Maintained by Jacques Serizay. Last updated 5 months ago.
sequencematching motifdiscovery motifannotation sequencing coverage alignment dataimport
3.3 match 6 stars 5.26 score 5 scripts
babaknaimi
sdm:Species Distribution Modelling
An extensible framework for developing species distribution models using individual and community-based approaches, generating ensembles of models, evaluating the models, and predicting species' potential distributions in space and time. For more information, please check the following paper: Naimi, B., Araujo, M.B. (2016) <doi:10.1111/ecog.01881>.
Maintained by Babak Naimi. Last updated 2 months ago.
1.8 match 24 stars 9.53 score 312 scripts 1 dependents
alrobles
cofid:Copepod Fish Interaction Database
A curated list of copepod-fish ecological interaction records. It contains the taxonomy of the copepod and the fish and the publication from which the information was obtained. This database contains only marine and brackish water fish species. It excludes fish species that inhabit only freshwater.
Maintained by Angel Robles. Last updated 4 months ago.
5.0 match 3.40 score 3 scripts
darwin-eu
PatientProfiles:Identify Characteristics of Patients in the OMOP Common Data Model
Identify the characteristics of patients in data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model.
Maintained by Marti Catala. Last updated 10 days ago.
1.7 match 1 stars 9.97 score 225 scripts 9 dependents
thecomeonman
ggTimeSeries:Time Series Visualisations Using the Grammar of Graphics
Provides additional display mediums for time series visualisations.
Maintained by Aditya Kothari. Last updated 6 years ago.
3.2 match 1 stars 5.23 score 112 scripts
alj1983
MaxentVariableSelection:Selecting the Best Set of Relevant Environmental Variables along with the Optimal Regularization Multiplier for Maxent Niche Modeling
Complex niche models show low performance in identifying the most important range-limiting environmental variables and in transferring habitat suitability to novel environmental conditions (Warren and Seifert, 2011 <DOI:10.1890/10-1171.1>; Warren et al., 2014 <DOI:10.1111/ddi.12160>). This package helps to identify the most important set of uncorrelated variables and to fine-tune Maxent's regularization multiplier. In combination, this allows users to constrain complexity and increase the performance of Maxent niche models, assessed by information criteria such as AICc (Akaike, 1974 <DOI:10.1109/TAC.1974.1100705>) and by the area under the receiver operating characteristic curve (AUC) (Fielding and Bell, 1997 <DOI:10.1017/S0376892997000088>). Users of this package should be familiar with Maxent niche modelling.
Maintained by Alexander Jueterbock. Last updated 6 years ago.
3.1 match 4 stars 5.34 score 11 scripts
alarm-redist
redist:Simulation Methods for Legislative Redistricting
Enables researchers to sample redistricting plans from a pre-specified target distribution using Sequential Monte Carlo and Markov Chain Monte Carlo algorithms. The package allows for the implementation of various constraints in the redistricting process such as geographic compactness and population parity requirements. Tools for analysis such as computation of various summary statistics and plotting functionality are also included. The package implements the SMC algorithm of McCartan and Imai (2023) <doi:10.1214/23-AOAS1763>, the enumeration algorithm of Fifield, Imai, Kawahara, and Kenny (2020) <doi:10.1080/2330443X.2020.1791773>, the Flip MCMC algorithm of Fifield, Higgins, Imai and Tarr (2020) <doi:10.1080/10618600.2020.1739532>, the Merge-split/Recombination algorithms of Carter et al. (2019) <arXiv:1911.01503> and DeFord et al. (2021) <doi:10.1162/99608f92.eb30390f>, and the Short-burst optimization algorithm of Cannon et al. (2020) <arXiv:2011.02288>.
Maintained by Christopher T. Kenny. Last updated 2 months ago.
geospatial gerrymandering redistricting sampling openblas cpp openmp
1.8 match 68 stars 9.17 score 259 scripts
mikldk
malan:MAle Lineage ANalysis
MAle Lineage ANalysis by simulating genealogies backwards and imposing short tandem repeats (STR) mutations forwards. Intended for forensic Y chromosomal STR (Y-STR) haplotype analyses. Numerous analyses are possible, e.g. number of matches and meiotic distance to matches. Refer to papers mentioned in citation("malan") (DOIs: <doi:10.1371/journal.pgen.1007028>, <doi:10.21105/joss.00684> and <doi:10.1016/j.fsigen.2018.10.004>).
Maintained by Mikkel Meyer Andersen. Last updated 1 years ago.
3.7 match 4.48 score 6 scripts
bioc
ShortRead:FASTQ input and manipulation
This package implements sampling, iteration, and input of FASTQ files. The package includes functions for filtering and trimming reads, and for generating a quality assessment report. Data are represented as DNAStringSet-derived objects, and easily manipulated for a diversity of purposes. The package also contains legacy support for early single-end, ungapped alignment formats.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
dataimport sequencing qualitycontrol bioconductor-package core-package zlib cpp
1.3 match 8 stars 12.08 score 1.8k scripts 49 dependents
bioc
spatzie:Identification of enriched motif pairs from chromatin interaction data
Identifies motifs that are significantly co-enriched from enhancer-promoter interaction data. While enhancer-promoter annotation is commonly used to define groups of interaction anchors, spatzie also supports co-enrichment analysis between preprocessed interaction anchors. Supports BEDPE interaction data derived from genome-wide assays such as HiC, ChIA-PET, and HiChIP. Can also be used to look for differentially enriched motif pairs between two interaction experiments.
Maintained by Jennifer Hammelman. Last updated 5 months ago.
dna3dstructure generegulation peakdetection epigenetics functionalgenomics classification hic transcription
3.7 match 4.30 score 5 scripts
rbarkerclarke
gtexture:Generalized Application of Co-Occurrence Matrices and Haralick Texture
Generalizes application of gray-level co-occurrence matrix (GLCM) metrics to objects outside of images. The current focus is to apply GLCM metrics to the study of biological networks and fitness landscapes that are used in studying evolutionary medicine and biology, particularly the evolution of cancer resistance. The package was used in our publication, Barker-Clarke et al. (2023) <doi:10.1088/1361-6560/ace305>. A general reference to learn more about mathematical oncology can be found at Rockne et al. (2019) <doi:10.1088/1478-3975/ab1a09>.
Maintained by Rowan Barker-Clarke. Last updated 12 months ago.
5.2 match 3.00 score 1 scripts
ediorg
ecocomDP:Tools to Create, Use, and Convert ecocomDP Data
Work with the Ecological Community Data Design Pattern. 'ecocomDP' is a flexible data model for harmonizing ecological community surveys, in a research question agnostic format, from source data published across repositories, and with methods that keep the derived data up-to-date as the underlying sources change. Described in O'Brien et al. (2021), <doi:10.1016/j.ecoinf.2021.101374>.
Maintained by Colin Smith. Last updated 7 months ago.
1.9 match 32 stars 8.22 score 77 scripts
ropensci
helminthR:Access London Natural History Museum Host-Helminth Record Database
Access to large host-parasite data is often hampered by the availability of data and difficulty in obtaining it in a programmatic way to encourage analyses. 'helminthR' provides a programmatic interface to the London Natural History Museum's host-parasite database, one of the largest host-parasite databases existing currently <https://www.nhm.ac.uk/research-curation/scientific-resources/taxonomy-systematics/host-parasites/>. The package allows the user to query by host species, parasite species, and geographic location.
Maintained by Tad Dallas. Last updated 2 years ago.
disease-networks helminth open-data parasites
3.5 match 7 stars 4.32 score 12 scripts
gavinsimpson
analogue:Analogue and Weighted Averaging Methods for Palaeoecology
Fits Modern Analogue Technique and Weighted Averaging transfer function models for prediction of environmental data from species data, and related methods used in palaeoecology.
Maintained by Gavin L. Simpson. Last updated 6 months ago.
1.7 match 14 stars 8.96 score 185 scripts 4 dependents
gabrielhoc
Mapinguari:Process-Based Biogeographical Analysis
Facilitates the incorporation of biological processes in biogeographical analyses. It offers conveniences in fitting, comparing and extrapolating models of biological processes such as physiology and phenology. These spatial extrapolations can be informative by themselves, but also complement traditional correlative species distribution models, by mixing environmental and process-based predictors. Caetano et al (2020) <doi:10.1111/oik.07123>.
Maintained by Gabriel Caetano. Last updated 2 years ago.
5.5 match 5 stars 2.70 score 4 scripts
plant-data
ritalic:Interface to the ITALIC Database of Lichen Biodiversity
A programmatic interface to the Web Service methods provided by ITALIC (<https://italic.units.it>). ITALIC is a database of lichen data in Italy and bordering European countries. 'ritalic' includes functions for retrieving information about lichen scientific names, geographic distribution, ecological data, morpho-functional traits and identification keys. More information about the data is available at <https://italic.units.it/?procedure=base&t=59&c=60>. The API documentation is available at <https://italic.units.it/?procedure=api>.
Maintained by Matteo Conti. Last updated 14 days ago.
biodiversity ecology fungi italic lichen
3.6 match 2 stars 4.04 score 6 scripts
traminer
TraMineR:Trajectory Miner: a Sequence Analysis Toolkit
Set of sequence analysis tools for manipulating, describing and rendering categorical sequences, and more generally mining sequence data in the field of social sciences. Although this sequence analysis package is primarily intended for state or event sequences that describe time use or life courses such as family formation histories or professional careers, its features also apply to many other kinds of categorical sequence data. It accepts many different sequence representations as input and provides tools for converting sequences from one format to another. It offers several functions for describing and rendering sequences, for computing distances between sequences with different metrics (among which optimal matching), original dissimilarity-based analysis tools, and functions for extracting the most frequent event subsequences and identifying the most discriminating ones among them. A user's guide can be found on the TraMineR web page.
Maintained by Gilbert Ritschard. Last updated 3 months ago.
1.8 match 11 stars 8.24 score 534 scripts 13 dependents
bioc
survcomp:Performance Assessment and Comparison for Survival Analysis
Assessment and Comparison for Performance of Risk Prediction (Survival) Models.
Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.
geneexpression differentialexpression visualization cpp
1.7 match 8.46 score 448 scripts 12 dependents
dkahle
TITAN2:Threshold Indicator Taxa Analysis
Uses indicator species scores across binary partitions of a sample set to detect congruence in taxon-specific changes of abundance and occurrence frequency along an environmental gradient as evidence of an ecological community threshold. Relevant references include Baker and King (2010) <doi:10.1111/j.2041-210X.2009.00007.x>, King and Baker (2010) <doi:10.1899/09-144.1>, and Baker and King (2013) <doi:10.1899/12-142.1>.
Maintained by David Kahle. Last updated 1 years ago.
2.2 match 13 stars 6.59 score 30 scripts
willgearty
deeptime:Plotting Tools for Anyone Working in Deep Time
Extends the functionality of other plotting packages (notably 'ggplot2') to help facilitate the plotting of data over long time intervals, including, but not limited to, geological, evolutionary, and ecological data. The primary goal of 'deeptime' is to enable users to add highly customizable timescales to their visualizations. Other functions are also included to assist with other areas of deep time visualization.
Maintained by William Gearty. Last updated 3 months ago.
geology ggplot2 paleontology visualization
1.3 match 92 stars 10.61 score 207 scripts 3 dependents
insightsengineering
cardx:Extra Analysis Results Data Utilities
Create extra Analysis Results Data (ARD) summary objects. The package supplements the simple ARD functions from the 'cards' package, exporting functions to put statistical results in the ARD format. These objects are used and re-used to construct summary tables, visualizations, and written reports.
Maintained by Daniel D. Sjoberg. Last updated 20 days ago.
1.7 match 19 stars 8.46 score 50 scripts
cran
CircNNTSR:Statistical Analysis of Circular Data using Nonnegative Trigonometric Sums (NNTS) Models
Includes functions for the analysis of circular data using distributions based on Nonnegative Trigonometric Sums (NNTS). The package includes functions for calculation of densities and distributions, for the estimation of parameters, for plotting and more.
Maintained by Maria Mercedes Gregorio-Dominguez. Last updated 2 years ago.
7.8 match 1.78 score 2 dependents
niklashohmann
DAIME:Effects of Changing Deposition Rates
Reverse and model the effects of changing deposition rates on geological data and rates. Based on Hohmann (2018) <doi:10.13140/RG.2.2.23372.51841>.
Maintained by Niklas Hohmann. Last updated 5 years ago.
4.6 match 3.00 score
kwb-r
kwb.utils:General Utility Functions Developed at KWB
This package contains some small helper functions that aim at improving the quality of code developed at Kompetenzzentrum Wasser gGmbH (KWB).
Maintained by Hauke Sonnenberg. Last updated 12 months ago.
1.9 match 8 stars 7.33 score 12 scripts 78 dependents
aibrt
FreqProf:Frequency Profiles Computing and Plotting
Tools for generating an informative type of line graph, the frequency profile, which allows single behaviors, multiple behaviors, or the specific behavioral patterns of individual subjects to be graphed from occurrence/nonoccurrence behavioral data.
Maintained by Ronald E. Robertson. Last updated 9 years ago.
4.0 match 2 stars 3.48 score 7 scripts
gavinsimpson
coenocliner:Coenocline Simulation
Simulate species occurrence and abundances (counts) along gradients.
Maintained by Gavin L. Simpson. Last updated 4 years ago.
2.2 match 12 stars 6.03 score 15 scripts 1 dependents
bioc
Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery
A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.
Maintained by Nan Xiao. Last updated 5 months ago.
software dataimport datarepresentation featureextraction cheminformatics biomedicalinformatics proteomics go systemsbiology bioconductor bioinformatics drug-discovery feature-extraction fingerprint molecular-descriptors protein-sequences
1.7 match 37 stars 7.81 score 29 scripts
codymarquart
rENA:Epistemic Network Analysis
ENA (Shaffer, D. W. (2017) Quantitative Ethnography. ISBN: 0578191687) is a method used to identify meaningful and quantifiable patterns in discourse or reasoning. ENA moves beyond the traditional frequency-based assessments by examining the structure of the co-occurrence, or connections in coded data. Moreover, compared to other methodological approaches, ENA has the novelty of (1) modeling whole networks of connections and (2) affording both quantitative and qualitative comparisons between different network models. Shaffer, D.W., Collier, W., & Ruis, A.R. (2016).
Maintained by Cody L Marquart. Last updated 1 years ago.
5.9 match 1 stars 2.26 score 36 scripts
calvagone
campsismod:Generic Implementation of a PK/PD Model
A generic, easy-to-use and expandable implementation of a pharmacokinetic (PK) / pharmacodynamic (PD) model based on the S4 class system. This package allows the user to read/write a pharmacometric model from/to files and adapt it further on the fly in the R environment. For this purpose, this package provides an intuitive API to add, modify or delete equations, ordinary differential equations (ODEs), model parameters or compartment properties (like infusion duration or rate, bioavailability and initial values). Finally, this package also provides a useful export of the model for use with simulation packages 'rxode2' and 'mrgsolve'. This package is designed and intended to be used with package 'campsis', a PK/PD simulation platform built on top of 'rxode2' and 'mrgsolve'.
Maintained by Nicolas Luyckx. Last updated 1 months ago.
2.0 match 5 stars 6.64 score 42 scripts 1 dependents
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 years ago.
1.6 match 3 stars 8.20 score 7.8k scripts 11 dependents
jenniniku
gllvm:Generalized Linear Latent Variable Models
Analysis of multivariate data using generalized linear latent variable models (gllvm). Estimation is performed using either the Laplace method, variational approximations, or extended variational approximations, implemented via TMB (Kristensen et al. (2016), <doi:10.18637/jss.v070.i05>).
Maintained by Jenni Niku. Last updated 13 hours ago.
1.3 match 52 stars 10.53 score 176 scripts 1 dependents
cwolock
survML:Tools for Flexible Survival Analysis Using Machine Learning
Statistical tools for analyzing time-to-event data using machine learning. Implements survival stacking for conditional survival estimation, standardized survival function estimation for current status data, and methods for algorithm-agnostic variable importance. See Wolock CJ, Gilbert PB, Simon N, and Carone M (2024) <doi:10.1080/10618600.2024.2304070>.
Maintained by Charles Wolock. Last updated 2 months ago.
1.6 match 16 stars 8.06 score 73 scripts 1 dependents
kevinstadler
cultevo:Tools, Measures and Statistical Tests for Cultural Evolution
Provides tools and statistics useful for analysing data from artificial language experiments. It implements the information-theoretic measure of the compositionality of signalling systems due to Spike (2016) <http://hdl.handle.net/1842/25930>, the Mantel test for distance matrix correlation (after Dietz 1983) <doi:10.1093/sysbio/32.1.21>, functions for computing string and meaning distance matrices as well as an implementation of the Page test for monotonicity of ranks (Page 1963) <doi:10.1080/01621459.1963.10500843> with exact p-values up to k = 22.
Maintained by Kevin Stadler. Last updated 1 years ago.
2.0 match 8 stars 6.50 score 131 scripts 1 dependents
kasperwelbers
corpustools:Managing, Querying and Analyzing Tokenized Text
Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.
Maintained by Kasper Welbers. Last updated 6 months ago.
1.7 match 31 stars 7.50 score 174 scripts 1 dependents
cranhaven
rock:Reproducible Open Coding Kit
The Reproducible Open Coding Kit ('ROCK', and this package, 'rock') was developed to facilitate reproducible and open coding, specifically geared towards qualitative research methods. Although it is a general-purpose toolkit, three specific applications have been implemented, specifically an interface to the 'rENA' package that implements Epistemic Network Analysis ('ENA'), means to process notes from Cognitive Interviews ('CIs'), and means to work with decentralized construct taxonomies ('DCTs'). The 'ROCK' and this 'rock' package are described in the ROCK book <https://rockbook.org> and more information, such as tutorials, is available at <https://rock.science>.
Maintained by Gjalt-Jorn Peters. Last updated 8 days ago.
3.8 match 5 stars 3.40 score
cran
SAPP:Statistical Analysis of Point Processes
Functions for statistical analysis of point processes.
Maintained by Masami Saga. Last updated 2 years ago.
4.0 match 3.18 score 15 scripts
andrew-plowright
ForestTools:Tools for Analyzing Remote Sensing Forest Data
Tools for analyzing remote sensing forest data, including functions for detecting treetops from canopy models, outlining tree crowns, and calculating textural metrics.
Maintained by Andrew Plowright. Last updated 1 months ago.
1.8 match 73 stars 7.01 score 103 scripts 1 dependents
roelandkindt
maxlike:Model Species Distributions by Estimating the Probability of Occurrence Using Presence-Only Data
Provides a likelihood-based approach to modeling species distributions using presence-only data. In contrast to the popular software program MAXENT, this approach yields estimates of the probability of occurrence, which is a natural descriptor of a species' distribution.
Maintained by Roeland Kindt. Last updated 12 months ago.
6.7 match 1.87 score 20 scripts
config-i1
legion:Forecasting Using Multivariate Models
Functions implementing multivariate state space models for purposes of time series analysis and forecasting. The focus of the package is on multivariate models, such as Vector Exponential Smoothing, Vector ETS (Error-Trend-Seasonal model) etc. It currently includes Vector Exponential Smoothing (VES, de Silva et al., 2010, <doi:10.1177/1471082X0901000401>), Vector ETS (Svetunkov et al., 2023, <doi:10.1016/j.ejor.2022.04.040>) and a simulation function for VES.
Maintained by Ivan Svetunkov. Last updated 1 months ago.
1.8 match 11 stars 6.90 score 1 scripts 1 dependents
inbo
ladybird:Analysis of Ladybird Occurrence Data
Analysis of ladybird occurrence data from Belgium, the Netherlands and the UK since 1990.
Maintained by Thierry Onkelinx. Last updated 4 years ago.
7.3 match 1.70 score 3 scripts
roelandkindt
BiodiversityR:Package for Community Ecology and Suitability Analysis
Graphical User Interface (via the R-Commander) and utility functions (often based on the vegan package) for statistical analysis of biodiversity and ecological communities, including species accumulation curves, diversity indices, Renyi profiles, GLMs for analysis of species abundance and presence-absence, distance matrices, Mantel tests, and cluster, constrained and unconstrained ordination analysis. A book on biodiversity and community ecology analysis is available for free download from the website. In 2012, methods for (ensemble) suitability modelling and mapping were expanded in the package.
Maintained by Roeland Kindt. Last updated 2 months ago.
1.7 match 16 stars 7.42 score 390 scripts 2 dependents
farewe
Rarity:Calculation of Rarity Indices for Species and Assemblages of Species
Allows calculation of rarity weights for species and indices of rarity for assemblages of species according to different methods (Leroy et al. 2012, Insect. Conserv. Divers. 5:159-168 <doi:10.1111/j.1752-4598.2011.00148.x>; Leroy et al. 2013, Divers. Distrib. 19:794-803 <doi:10.1111/ddi.12040>).
Maintained by Boris Leroy. Last updated 2 years ago.
3.4 match 1 stars 3.61 score 27 scripts 1 dependents
babaknaimi
usdm:Uncertainty Analysis for Species Distribution Models
This is a framework that aims to provide methods and tools for assessing the impact of different sources of uncertainty (e.g. positional uncertainty) on the performance of species distribution models (SDMs).
Maintained by Babak Naimi. Last updated 1 years ago.
1.8 match 3 stars 6.87 score 644 scripts 1 dependents
roux-ohdsi
allofus:Interface for 'All of Us' Researcher Workbench
Streamline use of the 'All of Us' Researcher Workbench (<https://www.researchallofus.org/data-tools/workbench/>) with tools to extract and manipulate data from the 'All of Us' database. Increase interoperability with the Observational Health Data Science and Informatics ('OHDSI') tool stack by decreasing reliance on 'All of Us' tools and allowing for cohort creation via 'Atlas'. Improve reproducible and transparent research using 'All of Us'.
Maintained by Rob Cavanaugh. Last updated 4 months ago.
1.7 match 16 stars 7.19 score 30 scripts
pythonicr
re:'Python' Style Regular Expression Functions
A comprehensive set of regular expression functions based on those found in 'Python' without relying on 'reticulate'. It provides functions that intend to (1) make it easier for users familiar with 'Python' to work with regular expressions, (2) reduce the complexity often associated with regular expression code, and (3) enable users to write more readable and maintainable code that relies on regular expression-based pattern matching.
Maintained by Garrett Shipley. Last updated 7 months ago.
3.8 match 1 stars 3.28 score 19 scripts
quanteda
quanteda.textplots:Plots for the Quantitative Analysis of Textual Data
Plotting functions for visualising textual data. Extends 'quanteda' and related packages with plot methods designed specifically for text data, textual statistics, and models fit to textual data. Plot types include word clouds, lexical dispersion plots, scaling plots, network visualisations, and word 'keyness' plots.
Maintained by Kenneth Benoit. Last updated 7 months ago.
1.8 match 7 stars 6.77 score 648 scripts
bioc
soGGi:Visualise ChIP-seq, MNase-seq and motif occurrence as aggregate plots Summarised Over Grouped Genomic Intervals
The soGGi package provides a toolset to create genomic interval aggregate/summary plots of signal or motif occurrence from BAM and bigWig files as well as PWM, rlelist, GRanges and GAlignments Bioconductor objects. soGGi allows for normalisation, transformation and arithmetic operation on and between summary plot objects as well as grouping and subsetting of plots by GRanges objects and user-supplied metadata. Plots are created using the ggplot2 library to allow user-defined manipulation of the returned plot object. Coupled together, soGGi features a broad set of methods to visualise genomics data in the context of groups of genomic intervals such as genes, superenhancers and transcription factor binding events.
Maintained by Tom Carroll. Last updated 5 months ago.
2.7 match 4.49 score 51 scripts 1 dependents
bblonder
netassoc:Inference of Species Associations from Co-Occurrence Data
Infers species associations from community matrices. Uses local and (optional) regional-scale co-occurrence data by comparing observed partial correlation coefficients between species to those estimated from regional species distributions. Extends Gaussian graphical models to a null modeling framework. Provides interface to a variety of inverse covariance matrix estimation methods.
Maintained by Benjamin Blonder. Last updated 3 years ago.
5.2 match 2 stars 2.30 score 9 scripts
hemingnm
SESraster:Raster Randomization for Null Hypothesis Testing
Randomization of presence/absence species distribution raster data with or without including spatial structure for calculating standardized effect sizes and testing null hypothesis. The randomization algorithms are based on classical algorithms for matrices (Gotelli 2000, <doi:10.2307/177478>) implemented for raster data.
Maintained by Neander Marcel Heming. Last updated 5 months ago.
null-models randomization raster spatial spatial-analysis species-distribution-modelling
1.8 match 7 stars 6.61 score 32 scripts 2 dependents
cysouw
qlcMatrix:Utility Sparse Matrix Functions for Quantitative Language Comparison
Extension of the functionality of the 'Matrix' package for using sparse matrices. Some of the functions are very general, while others are highly specific to the special data formats used for quantitative language comparison.
Maintained by Michael Cysouw. Last updated 9 months ago.
1.7 match 6 stars 6.98 score 256 scripts 1 dependents
philipmostert
PointedSDMs:Fit Models Derived from Point Processes to Species Distributions using 'inlabru'
Integrated species distribution modeling is a rising field in quantitative ecology thanks to significant rises in the quantity of data available, increases in computational speed and the proven benefits of using such models. Despite this, the general software to help ecologists construct such models in an easy-to-use framework is lacking. We therefore introduce the R package 'PointedSDMs', which provides the tools to help ecologists set up integrated models and perform inference on them. There are also functions within the package to help run spatial cross-validation for model selection, as well as generic plotting and predicting functions. An introduction to these methods is given in Isaac, Jarzyna, Keil, Dambly, Boersch-Supan, Browning, Freeman, Golding, Guillera-Arroita, Henrys, Jarvis, Lahoz-Monfort, Pagel, Pescott, Schmucki, Simmonds and O’Hara (2020) <doi:10.1016/j.tree.2019.08.006>.
Maintained by Philip Mostert. Last updated 2 months ago.
1.3 match 25 stars 8.57 score 50 scripts 1 dependents
red-list-ecosystem
redlistr:Tools for the IUCN Red List of Ecosystems and Species
A toolbox created by members of the International Union for Conservation of Nature (IUCN) Red List of Ecosystems Committee for Scientific Standards. Primarily, it is a set of tools suitable for calculating the metrics required for making assessments of species and ecosystems against the IUCN Red List of Threatened Species and the IUCN Red List of Ecosystems categories and criteria. See the IUCN website for detailed guidelines, the criteria, publications and other information.
Maintained by Calvin Lee. Last updated 1 years ago.
1.8 match 32 stars 6.35 score 35 scripts
njlyon0
supportR:Support Functions for Wrangling and Visualization
Suite of helper functions for data wrangling and visualization. The only theme for these functions is that they tend towards simple, short, and narrowly-scoped. These functions are built for tasks that often recur but are not large enough in scope to warrant an ecosystem of interdependent functions.
Maintained by Nicholas J Lyon. Last updated 4 months ago.
1.8 match 5 stars 6.22 score 15 scripts
microsoft
wpa:Tools for Analysing and Visualising Viva Insights Data
Opinionated functions that enable easier and faster analysis of Viva Insights data. There are three main types of functions in 'wpa': (1) Standard functions create a 'ggplot' visual or a summary table based on a specific Viva Insights metric; (2) Report Generation functions generate HTML reports on a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover more specific applications (e.g. Subject Line text mining) of Viva Insights data. This package adheres to 'tidyverse' principles and works well with the pipe syntax. 'wpa' is built with beginner-to-intermediate R users in mind, and is optimised for simplicity.
Maintained by Martin Chan. Last updated 4 months ago.
1.7 match 30 stars 6.69 score 39 scripts 1 dependents
jsanchezalv
WARDEN:Workflows for Health Technology Assessments in R using Discrete EveNts
Toolkit to support and perform discrete event simulations without resource constraints in the context of health technology assessments (HTA). The package focuses on cost-effectiveness modelling and aims to be submission-ready to relevant HTA bodies in alignment with 'NICE TSD 15' <https://www.sheffield.ac.uk/nice-dsu/tsds/patient-level-simulation>. More details and examples can be found on the package website <https://jsanchezalv.github.io/WARDEN/>.
Maintained by Javier Sanchez Alvarez. Last updated 3 months ago.
1.7 match 6 stars 6.69 score 9 scripts
matildabrown
rWCVP:Generating Summaries, Reports and Plots from the World Checklist of Vascular Plants
A companion to the World Checklist of Vascular Plants (WCVP). It includes functions to generate maps and species lists, as well as match names to the WCVP. For more details and to cite the package, see: Brown M.J.M., Walker B.E., Black N., Govaerts R., Ondo I., Turner R., Nic Lughadha E. (in press). "rWCVP: A companion R package to the World Checklist of Vascular Plants". New Phytologist.
Maintained by Matilda Brown. Last updated 1 years ago.
1.8 match 22 stars 6.17 score 45 scripts 1 dependents
mt1022
cubar:Codon Usage Bias Analysis
A suite of functions for rapid and flexible analysis of codon usage bias. It provides in-depth analysis at the codon level, including relative synonymous codon usage (RSCU), tRNA weight calculations, machine learning predictions for optimal or preferred codons, and visualization of codon-anticodon pairing. Additionally, it can calculate various gene-specific codon indices such as codon adaptation index (CAI), effective number of codons (ENC), fraction of optimal codons (Fop), tRNA adaptation index (tAI), mean codon stabilization coefficients (CSCg), and GC contents (GC/GC3s/GC4d). It also supports both standard and non-standard genetic code tables found in NCBI, as well as custom genetic code tables.
Maintained by Hong Zhang. Last updated 3 months ago.
bioinformatics codon-usage machine-learning sequence-analysis
1.9 match 6 stars 5.82 score 8 scripts
fabrice-rossi
mixvlmc:Variable Length Markov Chains with Covariates
Estimates Variable Length Markov Chains (VLMC) models and VLMC with covariates models from discrete sequences. Supports model selection via information criteria and simulation of new sequences from an estimated model. See Bühlmann, P. and Wyner, A. J. (1999) <doi:10.1214/aos/1018031204> for VLMC and Zanin Zambom, A., Kim, S. and Lopes Garcia, N. (2022) <doi:10.1111/jtsa.12615> for VLMC with covariates.
Maintained by Fabrice Rossi. Last updated 10 months ago.
machine-learning markov-chain markov-model statistics time-series cpp
1.8 match 2 stars 6.23 score 20 scripts
bioc
dominoSignal:Cell Communication Analysis for Single Cell RNA Sequencing
dominoSignal is a package developed to analyze cell signaling through ligand - receptor - transcription factor networks in scRNAseq data. It takes transcriptomic data as input, requiring counts, z-scored counts, and cluster labels, as well as information on transcription factor activation (such as from SCENIC) and a database of ligand and receptor pairings (such as from CellPhoneDB). This package creates an object storing ligand - receptor - transcription factor linkages by cluster and provides several methods for exploring, summarizing, and visualizing the analysis.
Maintained by Jacob T Mitchell. Last updated 5 months ago.
systemsbiology singlecell transcriptomics network
1.7 match 5 stars 6.50 score 5 scripts
danlwarren
ENMTools:Analysis of Niche Evolution using Niche and Distribution Models
Constructing niche models and analyzing patterns of niche evolution. Acts as an interface for many popular modeling algorithms, and allows users to conduct Monte Carlo tests to address basic questions in evolutionary ecology and biogeography. Warren, D.L., R.E. Glor, and M. Turelli (2008) <doi:10.1111/j.1558-5646.2008.00482.x> Glor, R.E., and D.L. Warren (2011) <doi:10.1111/j.1558-5646.2010.01177.x> Warren, D.L., R.E. Glor, and M. Turelli (2010) <doi:10.1111/j.1600-0587.2009.06142.x> Cardillo, M., and D.L. Warren (2016) <doi:10.1111/geb.12455> D.L. Warren, L.J. Beaumont, R. Dinnage, and J.B. Baumgartner (2019) <doi:10.1111/ecog.03900>.
Maintained by Dan Warren. Last updated 2 months ago.
1.6 match 105 stars 6.91 score 126 scripts