
[RubyML | RubyDataScience | RubyInterop]
Awesome NLP with Ruby
Useful resources for text processing in Ruby
This curated list collects awesome resources, libraries, and information sources on the computational processing of human-language text with the Ruby programming language. The field is often referred to as NLP, Computational Linguistics, or HLT (Human Language Technology), and it overlaps with Artificial Intelligence, Machine Learning, Information Retrieval, Text Mining, Knowledge Extraction, and other related disciplines.
This list comes from our day-to-day work on Language Models and NLP Tools. Read why this list is awesome. Our FAQ explains the important decisions behind it and answers questions you may have.
:sparkles: Every contribution is welcome! Add links through pull requests or create an issue to start a discussion.
Follow us on Twitter and please spread the word using the #RubyNLP hashtag!
Contents
- :sparkles: Tutorials
- NLP Pipeline Subtasks
  - Pipeline Generation
  - Multipurpose Engines
  - Language Identification
  - Segmentation
  - Lexical Processing
  - Phrasal Level Processing
  - Syntactic Processing
  - Semantic Analysis
  - Pragmatical Analysis
- High Level Tasks
  - Spelling and Error Correction
  - Text Alignment
  - Machine Translation
  - Sentiment Analysis
  - Numbers, Dates, and Time Parsing
  - Named Entity Recognition
  - Text-to-Speech-to-Text
- Dialog Agents, Assistants, and Chatbots
- Linguistic Resources
- Machine Learning Libraries
- Data Visualization
- Optical Character Recognition
- Text Extraction
- Full Text Search, Information Retrieval, Indexing
- Language Aware String Manipulation
- Articles, Posts, Talks, and Presentations
- Projects and Code Examples
- Books
- Community
- Needs your Help!
- Related Resources
- License
:sparkles: Tutorials
Please help us to fill out this section! :smiley:
NLP Pipeline Subtasks
An NLP pipeline starts with plain text.
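To make the idea of a pipeline concrete, here is a minimal sketch that uses only the Ruby standard library and none of the gems listed below. The stage names (`segment`, `tokenize`, `count`) and the regular expressions are illustrative assumptions; real segmenters and tokenizers handle far more cases.

```ruby
# Hypothetical stages for illustration; not the API of any listed gem.

# Split plain text into sentences at ., !, or ? followed by whitespace.
segment = ->(text) { text.strip.split(/(?<=[.!?])\s+/) }

# Turn each sentence into a list of lowercase alphabetic tokens.
tokenize = ->(sentences) { sentences.map { |s| s.downcase.scan(/[[:alpha:]]+/) } }

# Count how often each token occurs across all sentences.
count = ->(token_lists) { token_lists.flatten.tally }

# Proc#>> composes stages left to right (Ruby 2.6+; tally needs Ruby 2.7+).
pipeline = segment >> tokenize >> count

result = pipeline.call("Ruby is fun. Ruby is fast!")
puts result.inspect
```

Because each stage takes the previous stage's output as its only argument, stages can be swapped or extended without touching the rest of the chain.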
Pipeline Generation
- composable_operations - Definition framework for operation pipelines.
- ruby-spark - Spark bindings with an easy to understand DSL.
- phobos - Simplified Ruby Client for Apache Kafka.
- parallel - Supervisor for parallel execution on multiple CPUs or in many threads.
- pwrake - Rake extensions to run local and remote tasks in parallel.
Multipurpose Engines
- open-nlp - Ruby Bindings for the OpenNLP Toolkit.
- stanford-core-nlp -