Showing 20 of 20 results
ropensci
tesseract: Open Source OCR Engine
Bindings to 'Tesseract': a powerful optical character recognition (OCR) engine that supports over 100 languages. The engine is highly configurable in order to tune the detection algorithms and obtain the best possible results.
Maintained by Jeroen Ooms. Last updated 5 months ago.
30.7 match 246 stars 10.06 score 482 scripts
pachadotdev
cpp11tesseract: Open Source OCR Engine
Bindings to 'tesseract': 'tesseract' (<https://github.com/tesseract-ocr/tesseract>) is a powerful optical character recognition (OCR) engine that supports over 100 languages. The engine is highly configurable in order to tune the detection algorithms and obtain the best possible results.
Maintained by Mauricio Vargas Sepulveda. Last updated 2 days ago.
cpp11-library ocr tesseract tesseract-ocr cpp
32.9 match 4 stars 6.28 score
hegghammer
daiR: Interface with Google Cloud Document AI API
R interface for the Google Cloud Services 'Document AI API' <https://cloud.google.com/document-ai/> with additional tools for output file parsing and text reconstruction. 'Document AI' is a powerful server-based OCR service that extracts text and tables from images and PDF files with high accuracy. 'daiR' gives R users programmatic access to this service and additional tools to handle and visualize the output. See the package website <https://dair.info/> for more information and examples.
Maintained by Thomas Hegghammer. Last updated 4 months ago.
15.5 match 42 stars 6.77 score 40 scripts
gojiplus
captr: Client for the Captricity API
Get text from images of text using the Captricity Optical Character Recognition (OCR) API. Captricity allows you to get text from handwritten forms (think surveys) and other structured paper documents. It can also output data in the form of a delimited file, keeping field information intact. For more information, read <https://shreddr.captricity.com/developer/overview/>.
Maintained by Gaurav Sood. Last updated 7 years ago.
11.5 match 14 stars 5.29 score 28 scripts
ropensci
magick: Advanced Graphics and Image-Processing in R
Bindings to 'ImageMagick': the most comprehensive open-source image processing library available. Supports many common formats (png, jpeg, tiff, pdf, etc) and manipulations (rotate, scale, crop, trim, flip, blur, etc). All operations are vectorized via the Magick++ STL meaning they operate either on a single frame or a series of frames for working with layers, collages, or animation. In RStudio images are automatically previewed when printed to the console, resulting in an interactive editing environment. The latest version of the package includes a native graphics device for creating in-memory graphics or drawing onto images using pixel coordinates.
Maintained by Jeroen Ooms. Last updated 20 days ago.
image-manipulation image-processing imagemagick cpp
3.3 match 468 stars 17.31 score 9.0k scripts 256 dependents
ropensci
pdftools: Text Extraction, Rendering and Converting of PDF Documents
Utilities based on 'libpoppler' <https://poppler.freedesktop.org> for extracting text, fonts, attachments and metadata from a PDF file. Also supports high quality rendering of PDF documents into PNG, JPEG, TIFF format, or into raw bitmap vectors for further processing in R.
Maintained by Jeroen Ooms. Last updated 13 days ago.
pdf-files pdf-format pdftools poppler poppler-library text-extraction cpp
2.0 match 529 stars 13.10 score 3.3k scripts 47 dependents
shotaochi
imagerExtra: Extra Image Processing Library Based on 'imager'
Provides advanced functions for image processing based on the package 'imager'.
Maintained by Shota Ochi. Last updated 2 years ago.
4.8 match 11 stars 5.22 score
jniedballa
camtrapR: Camera Trap Data Management and Preparation of Occupancy and Spatial Capture-Recapture Analyses
Management of and data extraction from camera trap data in wildlife studies. The package provides a workflow for storing and sorting camera trap photos (and videos), tabulates records of species and individuals, and creates detection/non-detection matrices for occupancy and spatial capture-recapture analyses with great flexibility. In addition, it can visualise species activity data and provides simple mapping functions with GIS export.
Maintained by Juergen Niedballa. Last updated 3 months ago.
occupancy-modeling spatial-capture-recapture wildlife
1.7 match 35 stars 8.65 score 178 scripts
cloudyr
googleCloudVisionR: Access to the 'Google Cloud Vision' API for Image Recognition, OCR and Labeling
Interact with the 'Google Cloud Vision' <https://cloud.google.com/vision/> API in R. Part of the 'cloudyr' <https://cloudyr.github.io/> project.
Maintained by Jeno Pal. Last updated 5 years ago.
2.8 match 7 stars 4.95 score 14 scripts 1 dependents
skranz
rmistral: Experimental R interface to Mistral AI API
Currently implements only OCR capabilities for PDF conversion.
Maintained by Sebastian Kranz. Last updated 9 days ago.
7.9 match 1.70 score
ropensci
jstor: Read Data from JSTOR/DfR
Functions and helpers to import metadata, ngrams and full-texts delivered by Data for Research by JSTOR.
Maintained by Thomas Klebel. Last updated 8 months ago.
jstor peer-reviewed text-analysis text-mining
1.5 match 47 stars 7.29 score 55 scripts
jamespeapen
ceas: Cellular Energetics Analysis Software
Measuring cellular energetics is essential to understanding a matrix’s (e.g. cell, tissue or biofluid) metabolic state. The Agilent Seahorse machine is a common method to measure real-time cellular energetics, but existing analysis tools are highly manual or lack functionality. The Cellular Energetics Analysis Software (ceas) R package fills this analytical gap by providing modular and automated Seahorse data analysis and visualization using the methods described by Mookerjee et al. (2017) <doi:10.1074/jbc.m116.774471>.
Maintained by Rachel House. Last updated 3 months ago.
1.3 match 1 stars 5.08 score 3 scripts
frankiethull
kuzco: LLM image classification using ollama in R
This package is designed to use local models for image classification. The prompts and functions are designed to take an input image and supply classification information as an output.
Maintained by Frank Hull. Last updated 2 months ago.
1.8 match 11 stars 3.34 score
gianlucafilippa
phenopix: Process Digital Images of a Vegetation Cover
A collection of functions to process digital images, depict greenness index trajectories and extract relevant phenological stages.
Maintained by Gianluca Filippa. Last updated 1 month ago.
1.8 match 5 stars 2.52 score 66 scripts
bioc
seahtrue: Seahtrue revives XF data for structured data analysis
Seahtrue organizes oxygen consumption and extracellular acidification analysis data from experiments performed on an XF analyzer into structured nested tibbles. This allows for detailed processing of raw data and advanced data visualization and statistics. Seahtrue introduces an open and reproducible way to analyze these XF experiments. It uses file paths to .xlsx files. These .xlsx files are supplied by the user and are generated by the user in the Wave software from Agilent from the assay result files (.asyr). The .xlsx file contains different sheets of important data for the experiment: 1. Assay Information - Details about how the experiment was set up. 2. Rate Data - Information about the OCR and ECAR rates. 3. Raw Data - The original raw data collected during the experiment. 4. Calibration Data - Data related to calibrating the instrument. Seahtrue focuses on getting the specific data needed for analysis. Once this data is extracted, it is prepared for calculations through preprocessing. To make sure everything is accurate, both the initial data and the preprocessed data go through thorough checks.
Maintained by Vincent de Boer. Last updated 5 months ago.
cellbasedassays functionalprediction datarepresentation dataimport cellbiology cheminformatics metabolomics microtitreplateassay visualization qualitycontrol batcheffect experimentaldesign preprocessing go
0.5 match 5.04 score 2 scripts
mdlincoln
salty: Turn Clean Data into Messy Data
Take real or simulated data and salt it with errors commonly found in the wild, such as pseudo-OCR errors, Unicode problems, numeric fields with nonsensical punctuation, bad dates, etc.
Maintained by Matthew Lincoln. Last updated 7 months ago.
0.5 match 64 stars 4.81 score 20 scripts
kwb-r
kwb.gocr: Interface to gocr Program
Wrapper functions to the gocr (Optical Character Recognition) program developed by Jens Schulenberg (https://www-e.ovgu.de/jschulen/ocr/).
Maintained by Hauke Sonnenberg. Last updated 3 years ago.
optical-character-recognition project-miacso
0.5 match 1.70 score
cran
podcleaner: Legacy Scottish Post Office Directories Cleaner
Attempts to clean optical character recognition (OCR) errors in legacy Scottish Post Office Directories. Further attempts to match records from trades and general directories.
Maintained by Olivier Bautheac. Last updated 3 years ago.
0.5 match 1.70 score
hsonne
magickx: Extension of the R-Package magick
Image manipulation based on the magick package. It contains functions to select or remove horizontal or vertical stripes from an image. These may be used to cut off undesired areas from an image, e.g. as preparation for optical character recognition (OCR).
Maintained by Hauke Sonnenberg. Last updated 5 years ago.
0.5 match 1.70 score 1 scripts
michael-scholz-dev
orderanalyzer: Extracting Order Position Tables from PDF-Based Order Documents
Functions for extracting text and tables from PDF-based order documents. It provides an n-gram-based approach for identifying the language of an order document. It also uses the R package 'pdftools' to extract the text from an order document. If the PDF document contains only an image (because it is a scanned document), the R package 'tesseract' is used for OCR. The package also provides functionality for identifying and extracting order position tables in order documents based on a clustering approach.
Maintained by Michael Scholz. Last updated 3 months ago.
0.5 match 1.00 score