Showing 200 of total 207 results

kurthornik

NLP:Natural Language Processing Infrastructure

Basic classes and methods for Natural Language Processing.

Maintained by Kurt Hornik. Last updated 4 months ago.

13.4 match 6 stars 9.42 score 1.0k scripts 127 dependents

r-spatial

spdep:Spatial Dependence: Weighting Schemes, Statistics

A collection of functions to create spatial weights matrix objects from polygon 'contiguities', from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial 'autocorrelation', including global 'Moran's I' and 'Geary's C' proposed by 'Cliff' and 'Ord' (1973, ISBN: 0850860369) and (1981, ISBN: 0850860814), 'Hubert/Mantel' general cross product statistic, Empirical Bayes estimates and 'Assunção/Reis' (1999) <doi:10.1002/(SICI)1097-0258(19990830)18:16%3C2147::AID-SIM179%3E3.0.CO;2-I> Index, 'Getis/Ord' G ('Getis' and 'Ord' 1992) <doi:10.1111/j.1538-4632.1992.tb00261.x> and multicoloured join count statistics, 'APLE' ('Li et al.') <doi:10.1111/j.1538-4632.2007.00708.x>, local 'Moran's I', 'Geary's C' ('Anselin' 1995) <doi:10.1111/j.1538-4632.1995.tb00338.x> and 'Getis/Ord' G ('Ord' and 'Getis' 1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>, 'saddlepoint' approximations ('Tiefelsdorf' 2002) <doi:10.1111/j.1538-4632.2002.tb01084.x> and exact tests for global and local 'Moran's I' ('Bivand et al.' 2009) <doi:10.1016/j.csda.2008.07.021> and 'LOSH' local indicators of spatial heteroscedasticity ('Ord' and 'Getis') <doi:10.1007/s00168-011-0492-y>. The implementation of most of these measures is described in 'Bivand' and 'Wong' (2018) <doi:10.1007/s11749-018-0599-x>, with further extensions in 'Bivand' (2022) <doi:10.1111/gean.12319>. 'Lagrange' multiplier tests for spatial dependence in linear models are provided ('Anselin et al.' 1996) <doi:10.1016/0166-0462(95)02111-6>, as are 'Rao' score tests for hypothesised spatial 'Durbin' models based on linear models ('Koley' and 'Bera' 2023) <doi:10.1080/17421772.2023.2256810>. A local indicators for categorical data (LICD) implementation based on 'Carrer et al.' (2021) <doi:10.1016/j.jas.2020.105306> and 'Bivand et al.' (2017) <doi:10.1016/j.spasta.2017.03.003> was added in 1.3-7. From 'spdep' and 'spatialreg' versions >= 1.2-1, the model fitting functions previously present in this package are defunct in 'spdep' and may be found in 'spatialreg'.

Maintained by Roger Bivand. Last updated 1 month ago.

spatial-autocorrelation spatial-dependence spatial-weights

6.1 match 131 stars 16.59 score 6.0k scripts 106 dependents

rspatial

geosphere:Spherical Trigonometry

Spherical trigonometry for geographic applications. That is, compute distances and related measures for angular (longitude/latitude) locations.

Maintained by Robert J. Hijmans. Last updated 6 months ago.

cpp

5.6 match 36 stars 13.79 score 5.7k scripts 116 dependents

rstudio

htmltools:Tools for HTML

Tools for HTML generation and output.

Maintained by Carson Sievert. Last updated 11 months ago.

3.3 match 218 stars 17.61 score 10k scripts 4.5k dependents

graemetlloyd

Claddis:Measuring Morphological Diversity and Evolutionary Tempo

Measures morphological diversity from discrete character data and estimates evolutionary tempo on phylogenetic trees. Imports morphological data from #NEXUS (Maddison et al. (1997) <doi:10.1093/sysbio/46.4.590>) format with read_nexus_matrix(), and writes to both #NEXUS and TNT format (Goloboff et al. (2008) <doi:10.1111/j.1096-0031.2008.00217.x>). Main functions are test_rates(), which implements AIC and likelihood ratio tests for discrete character rates introduced across Lloyd et al. (2012) <doi:10.1111/j.1558-5646.2011.01460.x>, Brusatte et al. (2014) <doi:10.1016/j.cub.2014.08.034>, Close et al. (2015) <doi:10.1016/j.cub.2015.06.047>, and Lloyd (2016) <doi:10.1111/bij.12746>, and calculate_morphological_distances(), which implements multiple discrete character distance metrics from Gower (1971) <doi:10.2307/2528823>, Wills (1998) <doi:10.1006/bijl.1998.0255>, Lloyd (2016) <doi:10.1111/bij.12746>, and Hopkins and St John (2018) <doi:10.1098/rspb.2018.1784>. This also includes the GED correction from Lehmann et al. (2019) <doi:10.1111/pala.12430>. Multiple functions implement morphospace plots: plot_chronophylomorphospace() implements Sakamoto and Ruta (2012) <doi:10.1371/journal.pone.0039752>, plot_morphospace() implements Wills et al. (1994) <doi:10.1017/S009483730001263X>, plot_changes_on_tree() implements Wang and Lloyd (2016) <doi:10.1098/rspb.2016.0214>, and plot_morphospace_stack() implements Foote (1993) <doi:10.1017/S0094837300015864>. Other functions include safe_taxonomic_reduction(), which implements Wilkinson (1995) <doi:10.1093/sysbio/44.4.501>, map_dollo_changes(), which implements the Dollo stochastic character mapping of Tarver et al. (2018) <doi:10.1093/gbe/evy096>, and estimate_ancestral_states(), which implements the ancestral state options of Lloyd (2018) <doi:10.1111/pala.12380>. calculate_tree_length() and reconstruct_ancestral_states() implement the generalised algorithms from Swofford and Maddison (1992; no doi).

Maintained by Graeme T. Lloyd. Last updated 7 days ago.

5.4 match 13 stars 7.86 score 77 scripts 2 dependents

vegandevs

vegan:Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

Maintained by Jari Oksanen. Last updated 1 month ago.

ecological-modelling ecology ordination fortran openblas

2.0 match 476 stars 19.40 score 15k scripts 445 dependents

lbbe-software

MareyMap:Estimation of Meiotic Recombination Rates Using Marey Maps

Local recombination rates are graphically estimated across a genome using Marey maps.

Maintained by Aurélie Siberchicot. Last updated 29 days ago.

6.3 match 1 star 5.30 score 20 scripts

alanarnholt

BSDA:Basic Statistics and Data Analysis

Data sets for the book "Basic Statistics and Data Analysis" by Larry J. Kitchens.

Maintained by Alan T. Arnholt. Last updated 2 years ago.

3.4 match 7 stars 9.11 score 1.3k scripts 6 dependents

bioc

RBGL:An interface to the BOOST graph library

A fairly extensive and comprehensive interface to the graph algorithms contained in the BOOST library.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

graphandnetwork network cpp

3.5 match 8.59 score 320 scripts 132 dependents

gagolews

genieclust:Fast and Robust Hierarchical Clustering with Noise Points Detection

A retake on the Genie algorithm (Gagolewski, 2021 <DOI:10.1016/j.softx.2021.100722>), which is a robust hierarchical clustering method (Gagolewski, Bartoszuk, Cena, 2016 <DOI:10.1016/j.ins.2016.05.003>). It is now faster and more memory efficient; determining the whole cluster hierarchy for datasets of 10M points in low-dimensional Euclidean spaces or 100K points in high-dimensional ones takes only a minute or so. Allows clustering with respect to mutual reachability distances so that it can act as a noise point detector or a robustified version of 'HDBSCAN*' (that is able to detect a predefined number of clusters and hence does not depend on the somewhat fragile 'eps' parameter). The package also features an implementation of inequality indices (e.g., Gini and Bonferroni), external cluster validity measures (e.g., the normalised clustering accuracy, the adjusted Rand index, the Fowlkes-Mallows index, and normalised mutual information), and internal cluster validity indices (e.g., the Calinski-Harabasz, Davies-Bouldin, Ball-Hall, Silhouette, and generalised Dunn indices). See also the 'Python' version of 'genieclust' available on 'PyPI', which supports sparse data, more metrics, and even larger datasets.

Maintained by Marek Gagolewski. Last updated 11 days ago.

cluster-analysis clustering clustering-algorithm data-analysis data-mining data-science genie hdbscan hierarchical-clustering hierarchical-clustering-algorithm machine-learning machine-learning-algorithms mlpack nmslib python python3 sparse cpp openmp

3.5 match 61 stars 7.33 score 13 scripts 5 dependents

r-lib

marquee:Markdown Parser and Renderer for R Graphics

Provides the means to parse and render markdown text with grid, along with facilities to define the styling of the text.

Maintained by Thomas Lin Pedersen. Last updated 3 days ago.

cpp

3.0 match 86 stars 8.59 score 28 scripts 1 dependent

lukejharmon

geiger:Analysis of Evolutionary Diversification

Methods for fitting macroevolutionary models to phylogenetic trees; see Pennell (2014) <doi:10.1093/bioinformatics/btu181>.

Maintained by Luke Harmon. Last updated 2 years ago.

openblas cpp

2.3 match 1 star 7.84 score 2.3k scripts 28 dependents

elianhugh

quartools:Programmatic Element Creation For Quarto Documents

Programmatically generate Quarto-compliant markdown elements.

Maintained by Elian Thiele-Evans. Last updated 1 year ago.

markdown quarto

3.8 match 27 stars 3.13 score 3 scripts

mrc-ide

eppasm:Age-structured EPP Model for HIV Epidemic Estimates

What the package does (one paragraph).

Maintained by Jeff Eaton. Last updated 4 months ago.

1.8 match 6 stars 5.04 score 34 scripts 3 dependents

scholaempirica

reschola:The Schola Empirica Package

A collection of utilities, themes and templates for data analysis at Schola Empirica.

Maintained by Jan Netík. Last updated 6 months ago.

1.7 match 4 stars 4.83 score 14 scripts

loicschwaller

saturnin:Spanning Trees Used for Network Inference

Bayesian inference of graphical model structures using spanning trees.

Maintained by Loïc Schwaller. Last updated 10 years ago.

cpp

5.4 match 1.18 score 15 scripts

drelliesmall

smallstuff:Dr. Small's Functions

Functions used in courses taught by Dr. Small at Drew University.

Maintained by Ellie Small. Last updated 1 year ago.

4.1 match 1.48 score 2 scripts 1 dependent

wjschne

WJSmisc:Miscellaneous functions from W. Joel Schneider

Several functions I find useful.

Maintained by W. Joel Schneider. Last updated 2 years ago.

1.7 match 5 stars 2.40 score 10 scripts

openintrostat

usdata:Data on the States and Counties of the United States

Demographic data on the United States at the county and state levels spanning multiple years.

Maintained by Mine Çetinkaya-Rundel. Last updated 10 months ago.

data openintro

0.6 match 9 stars 6.89 score 294 scripts 1 dependent

jerryratcliffe

aoristic:Generates Aoristic Probability Distributions

It can sometimes be difficult to ascertain when some events (such as property crime) occur because the victim is not present when the crime happens. As a result, police databases often record a 'start' (or 'from') date and time, and an 'end' (or 'to') date and time. The time span between these date/times can be minutes, hours, or sometimes days, hence the term 'Aoristic'. Aoristic is one of the past tenses in Greek and represents an uncertain occurrence in time. For events with a location described by either a latitude/longitude or X,Y coordinate pair, and a start and end date/time, this package generates an aoristic data frame with aoristic weighted probability values for each hour of the week, for each observation. The coordinates are not necessary for the program to calculate aoristic weights; however, they are part of this package because a spatial component has been integral to aoristic analysis from the start. Dummy coordinates can be introduced if the user only has temporal data. Outputs include an aoristic data frame, as well as summary graphs and displays. For more information see: Ratcliffe, JH (2002) Aoristic signatures and the temporal analysis of high volume crime patterns, Journal of Quantitative Criminology. 18 (1): 23-43. Note: This package replaces an original 'aoristic' package (version 0.6) by George Kikuchi that has been discontinued with his permission.

Maintained by Jerry Ratcliffe. Last updated 2 years ago.

0.5 match 7 stars 3.54 score 9 scripts

zh2395

KPC:Kernel Partial Correlation Coefficient

Implementations of two empirical versions of the kernel partial correlation (KPC) coefficient and the associated variable selection algorithms. KPC is a measure of the strength of conditional association between Y and Z given X, with X, Y, Z being random variables taking values in general topological spaces. As the name suggests, KPC is defined in terms of kernels on reproducing kernel Hilbert spaces (RKHSs). The population KPC is a deterministic number between 0 and 1; it is 0 if and only if Y is conditionally independent of Z given X, and it is 1 if and only if Y is a measurable function of Z and X. One empirical KPC estimator is based on geometric graphs, such as K-nearest neighbor graphs and minimum spanning trees, and is consistent under very weak conditions. The other empirical estimator, defined using conditional mean embeddings (CMEs) as used in the RKHS literature, is also consistent under suitable conditions. Using KPC, a stepwise forward variable selection algorithm KFOCI (using the graph-based estimator of KPC) is provided, as well as a similar stepwise forward selection algorithm based on the RKHS-based estimator. For more details on KPC, its empirical estimators and its application to variable selection, see Huang, Z., N. Deb, and B. Sen (2022). “Kernel partial correlation coefficient – a measure of conditional dependence” (URL listed below). When X is empty, KPC measures the unconditional dependence between Y and Z, which has been described in Deb, N., P. Ghosal, and B. Sen (2020), “Measuring association on topological spaces using kernels and geometric graphs” <arXiv:2010.01768>, and it is implemented in the functions KMAc() and Klin() in this package. The latter can be computed in near linear time.

Maintained by Zhen Huang. Last updated 1 year ago.

0.5 match 4 stars 3.30 score 6 scripts

alighanbari26

GPRMortality:Gaussian Process Regression for Mortality Rates

A Bayesian statistical model for estimating child (under-five age group) and adult (15-60 age group) mortality. The main challenge is how to combine and integrate these different time series and how to produce unified estimates of mortality rates during a specified time span. GPR is a Bayesian statistical model for estimating child and adult mortality rates whose data likelihood is based on mortality rates from different data sources, such as the Death Registration System (DRS), censuses, or surveys. There are also various hyper-parameters for completeness of DRS, mean, covariance functions and variances as priors. This function produces estimates and uncertainty (95% or any desirable percentiles) based on sampling and non-sampling errors due to variation in data sources. The GP model utilizes Bayesian inference to update predicted mortality rates as a posterior in Bayes' rule by combining data and a prior probability distribution over parameters in the mean, covariance function, and regression model. This package uses Markov Chain Monte Carlo (MCMC) to sample from the posterior probability distribution using the 'rstan' package in R. Details are given in Wang H, Dwyer-Lindgren L, Lofgren KT, et al. (2012) <doi:10.1016/S0140-6736(12)61719-X>, Wang H, Liddell CA, Coates MM, et al. (2014) <doi:10.1016/S0140-6736(14)60497-9> and Mohammadi, Parsaeian, Mehdipour et al. (2017) <doi:10.1016/S2214-109X(17)30105-5>.

Maintained by Ali Ghanbari. Last updated 4 years ago.

0.5 match 2.70 score 7 scripts