Showing 2 of total 2 results (show query)
paithiov909
audubon:Japanese Text Processing Tools
A collection of Japanese text processing tools for filling Japanese iteration marks, Japanese character type conversions, segmentation by phrase, and text normalization which is based on rules for the 'Sudachi' morphological analyzer and the 'NEologd' (Neologism dictionary for 'MeCab'). These features are specific to Japanese and are not implemented in 'ICU' (International Components for Unicode).
Maintained by Akiru Kato. Last updated 11 days ago.
10 stars 5.65 score 3 scripts 1 dependentspaithiov909
gibasa:An Alternative 'Rcpp' Wrapper of 'MeCab'
A plain 'Rcpp' wrapper for 'MeCab' that can segment Chinese, Japanese, and Korean text into tokens. The main goal of this package is to provide an alternative to 'tidytext' using morphological analysis.
Maintained by Akiru Kato. Last updated 11 days ago.
15 stars 5.02 score 3 scripts