Awesome BioIE
How to extract information from unstructured biomedical data and text.

What is BioIE? It includes any effort to extract structured information from unstructured (or at least inconsistently structured) biological, clinical, or other biomedical data. The data source is often a collection of text documents written in technical language. If the resulting information is verifiable and consistent across sources, we may then consider it knowledge. Extracting information and producing knowledge from biomedical data requires adapting methods developed for other types of unstructured data.

BioIE has changed substantially since the introduction of language models like BERT and, more recently, Large Language Models (LLMs; e.g., GPT-3/4, Llama 2/3, and Gemini).

Resources included here are preferentially those available at no monetary cost and with limited license restrictions. Methods and datasets should be publicly accessible and actively maintained.

See also awesome-nlp, awesome-biology and Awesome-Bioinformatics.

Please read the contribution guidelines before contributing. Please add your favourite resource by raising a pull request.

Contents

Research Overviews

LLMs in Biomedical IE

Pre-LLM Overviews