Showing 13 of 13 results
oscarkjell
text:Analyses of Text using Transformer Models from HuggingFace, Natural Language Processing and Machine Learning
Link R with Transformers from Hugging Face to transform text variables into word embeddings; the word embeddings are then used to statistically test mean differences between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words along various dimensions. For more information see <https://www.r-text.org>.
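A minimal sketch of the embedding step the description refers to, assuming the package's `textEmbed()` interface (the model name and input texts are illustrative; the first call downloads the model):

```r
library(text)

# Transform a text variable into word embeddings via a Hugging Face model.
embeddings <- textEmbed(
  c("happy and calm", "sad and anxious"),
  model = "bert-base-uncased"
)
# The resulting embeddings feed the package's statistical tools,
# e.g. semantic similarity scores or mean-difference tests between texts.
```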
Maintained by Oscar Kjell. Last updated 3 days ago.
deep-learning · machine-learning · nlp · transformers · openjdk
13.7 match · 146 stars · 13.16 score · 436 scripts · 1 dependent
michelnivard
gptstudio:Use Large Language Models Directly in your Development Environment
Large language models are readily accessible via API. This package lowers the barrier to using the API inside your development environment. For more on the API, see <https://platform.openai.com/docs/introduction>.
Maintained by James Wade. Last updated 5 days ago.
chatgpt · gpt-3 · rstudio · rstudio-addin
12.5 match · 924 stars · 10.83 score · 43 scripts · 1 dependent
psychbruce
FMAT:The Fill-Mask Association Test
The Fill-Mask Association Test ('FMAT') <doi:10.1037/pspa0000396> is an integrative and probability-based method using Masked Language Models to measure conceptual associations (e.g., attitudes, biases, stereotypes, social norms, cultural values) as propositions in natural language. Supported language models include 'BERT' <doi:10.48550/arXiv.1810.04805> and its variants available at 'Hugging Face' <https://huggingface.co/models?pipeline_tag=fill-mask>. Methodological references and installation guidance are provided at <https://psychbruce.github.io/FMAT/>.
Maintained by Han-Wu-Shuang Bao. Last updated 5 months ago.
ai · artificial-intelligence · bert · bert-model · bert-models · contextualized-representation · fill-in-the-blank · fill-mask · huggingface · language-model · language-models · large-language-models · masked-language-models · natural-language-processing · natural-language-understanding · nlp · pretrained-models · transformer · transformers
10.5 match · 12 stars · 4.82 score · 2 scripts
jameshwade
gpttools:Extensions and Tools for gptstudio
gpttools is an R package that extends gptstudio with devtools-like functionality powered by the latest natural language processing (NLP) models. It aims to make package development easier by providing tools to improve the quality of your package's documentation, testing, and maybe even functionality.
Maintained by James Wade. Last updated 7 months ago.
chatgpt · nlp · openai · package-development · rstudio-addin
7.1 match · 293 stars · 7.06 score · 14 scripts
dyfanjones
sagemaker.mlframework:`sagemaker` Machine Learning Developed by Amazon
`sagemaker` machine learning developed by Amazon.
Maintained by Dyfan Jones. Last updated 3 years ago.
amazon-sagemaker · aws · machine-learning · sagemaker · sdk
5.0 match · 2.48 score · 2 dependents
mlverse
hfhub:Hugging Face Hub Interface
Provides functionality to download and cache files from 'Hugging Face Hub' <https://huggingface.co/models>. Uses the same caching structure so files can be shared between different client libraries.
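A hedged sketch of the download-and-cache workflow, assuming `hub_download()` is the entry point (the repo and file names here are illustrative):

```r
library(hfhub)

# Fetch a single file from a model repo on the Hub; the returned path
# points into the shared Hugging Face cache, so repeated calls hit the
# cache rather than the network.
path <- hub_download("bert-base-uncased", "config.json")
readLines(path, n = 3)
```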
Maintained by Daniel Falbel. Last updated 6 months ago.
2.4 match · 16 stars · 4.28 score · 24 scripts
ropensci
pangoling:Access to Large Language Model Predictions
Provides access to word predictability estimates using large language models (LLMs) based on 'transformer' architectures via integration with the 'Hugging Face' ecosystem. The package interfaces with pre-trained neural networks and supports both causal/auto-regressive LLMs (e.g., 'GPT-2'; Radford et al., 2019) and masked/bidirectional LLMs (e.g., 'BERT'; Devlin et al., 2019, <doi:10.48550/arXiv.1810.04805>) to compute the probability of words, phrases, or tokens given their linguistic context. By enabling a straightforward estimation of word predictability, the package facilitates research in psycholinguistics, computational linguistics, and natural language processing (NLP).
Maintained by Bruno Nicenboim. Last updated 4 days ago.
nlp · psycholinguistics · transformers
1.8 match · 8 stars · 4.90 score
dyfanjones
sagemaker:R SDK for `AWS Sagemaker`
A library for training and deploying machine learning models on Amazon SageMaker <https://aws.amazon.com/sagemaker/> from R through the `paws` SDK.
Maintained by Dyfan Jones. Last updated 3 years ago.
amazon-sagemaker · aws · machine-learning · sagemaker · sdk
3.0 match · 12 stars · 2.78 score · 6 scripts
psychbruce
PsychWordVec:Word Embedding Research Framework for Psychological Science
An integrative toolbox for word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a series of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with a permutation test of significance; (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <arXiv:1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <arXiv:1607.04606>; (5) a group of functions to download 'pre-trained' language models (e.g., 'GPT', 'BERT') and extract contextualized (dynamic) word vectors (based on the R package 'text').
Maintained by Han-Wu-Shuang Bao. Last updated 1 year ago.
bert · cosine-similarity · fasttext · glove · gpt · language-model · natural-language-processing · nlp · pretrained-models · psychology · semantic-analysis · text-analysis · text-mining · tsne · word-embeddings · word-vectors · word2vec · openjdk
1.8 match · 22 stars · 4.04 score · 10 scripts
atomashevic
transforEmotion:Sentiment Analysis for Text, Image and Video using Transformer Models
Implements sentiment analysis using Hugging Face <https://huggingface.co> transformer zero-shot classification model pipelines for text and image data. The default text pipeline is Cross-Encoder's DistilRoBERTa <https://huggingface.co/cross-encoder/nli-distilroberta-base> and the default image/video pipeline is OpenAI's CLIP <https://huggingface.co/openai/clip-vit-base-patch32>. All other zero-shot classification model pipelines can be used by supplying their model name from <https://huggingface.co/models?pipeline_tag=zero-shot-classification>.
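A minimal sketch of zero-shot scoring with the default text pipeline, assuming the package's `transformer_scores()` interface (the texts and class labels are illustrative):

```r
library(transforEmotion)

# Score each text against arbitrary labels with the default zero-shot
# pipeline (DistilRoBERTa); returns one score per class per text.
transformer_scores(
  text    = c("I loved every minute", "This was a waste of time"),
  classes = c("joy", "anger", "neutral")
)
```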
Maintained by Aleksandar Tomašević. Last updated 2 months ago.
1.0 match · 26 stars · 6.40 score · 12 scripts
jaytimm
textpress:A Lightweight and Versatile NLP Toolkit
A simple Natural Language Processing (NLP) toolkit focused on search-centric workflows with minimal dependencies. The package offers key features for web scraping, text processing, corpus search, and text embedding generation via the 'HuggingFace API' <https://huggingface.co/docs/api-inference/index>.
Maintained by Jason Timm. Last updated 5 months ago.
corpus-search · nlp · openai-embeddings · web-scraping
0.8 match · 3 stars · 4.18 score
gesistsa
grafzahl:Supervised Machine Learning for Textual Data Using Transformers and 'Quanteda'
Duct tape the 'quanteda' ecosystem (Benoit et al., 2018) <doi:10.21105/joss.00774> to modern Transformer-based text classification models (Wolf et al., 2020) <doi:10.18653/v1/2020.emnlp-demos.6>, in order to facilitate supervised machine learning for textual data. This package mimics the behaviors of 'quanteda.textmodels' and provides a function to setup the 'Python' environment to use the pretrained models from 'Hugging Face' <https://huggingface.co/>. More information: <doi:10.5117/CCR2023.1.003.CHAN>.
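A hedged sketch of the supervised workflow, assuming the package's `grafzahl()` entry point mimics the `quanteda.textmodels` style (the corpus objects, docvar name, and model argument are hypothetical):

```r
library(quanteda)
library(grafzahl)

# Fine-tune a pretrained Hugging Face model on a labelled quanteda
# corpus, then predict labels for held-out documents
# (training_corpus and heldout_corpus are hypothetical objects).
model <- grafzahl(x = training_corpus, y = "label",
                  model_name = "distilbert-base-uncased")
predict(model, newdata = heldout_corpus)
```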
Maintained by Chung-hong Chan. Last updated 25 days ago.
0.5 match · 41 stars · 5.91 score · 3 scripts
macmillancontentscience
wordpiece.data:Data for Wordpiece-Style Tokenization
Provides data to be used by the wordpiece algorithm in order to tokenize text into somewhat meaningful chunks. Included vocabularies were retrieved from <https://huggingface.co/bert-base-cased/resolve/main/vocab.txt> and <https://huggingface.co/bert-base-uncased/resolve/main/vocab.txt> and parsed into an R-friendly format.
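A minimal sketch of loading the bundled vocabulary, assuming a `wordpiece_vocab()` accessor with a cased/uncased switch (not verified against the package's exports):

```r
library(wordpiece.data)

# Load the BERT uncased vocabulary as an R object for use by
# the companion 'wordpiece' tokenizer.
vocab <- wordpiece_vocab(cased = FALSE)
length(vocab)  # the BERT vocabularies hold roughly 30k entries
```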
Maintained by Jon Harmon. Last updated 3 years ago.
0.8 match · 3.18 score · 5 scripts · 1 dependent