Awesome Computational Biology 
A knowledge collection of databases, software and papers related to computational biology.
Computational biology involves the development and application of data-analytical and theoretical methods, mathematical modelling and computational simulation techniques to the study of biological, ecological, behavioural, and social systems. - Wikipedia
Contents
- Databases
- scRNA
- Compound
- Pathway
- Mass Spectra
- Protein
- Genome
- Disease
- Interaction
- Clinical Trial
- API
- Preprocess
- Machine Learning Tasks and Models
- Drug Response Prediction
- Drug Repurposing
- Drug Target Interaction
- Compound Protein Interaction
- Pre-trained embedding
- LLM for biology
Databases
scRNA
- Gene Expression Omnibus - Public functional genomics database.
- Single Cell PORTAL - Public database for single-cell RNA data.
- Single Cell Expression Atlas - Public database for single-cell RNA data.
Compound
- PubChem - One of the largest chemical databases, covering compounds, genes, and proteins.
- ChEBI - Chemical database focused on small chemical compounds.
- ChEMBL - Database of bioactive molecules with drug-like properties.
- ChemSpider - Chemical structure database.
- KEGG COMPOUND - Collection of small molecules and biopolymers.
- LIPID MAPS - Database of lipids.
- Rhea - Database of chemical reactions.
- Drug Repurposing Hub - Collection of drug repurposing data, including drugs, mechanisms of action (MoA), and targets.
- Therapeutic Target Database - Collection of drug-target, target-disease, and drug-disease datasets.
- ZINC ligand discovery database - Free database of commercially available compounds for virtual screening.
- MoleculeNet - Benchmark for molecular machine learning.
- Ames Mutagenicity dataset - Dataset for predicting mutagenicity.
- ADCdb - Database for antibody-drug conjugates.
Pathway
- PathwayCommons - Database of Pathways and Interactions.
- KEGG PATHWAY - Collection of drawn pathway maps.
- WikiPathways - Database of biological pathways.
Mass Spectra
- MassBank - Open source databases and tools for mass spectrometry reference spectra.
- MoNA (MassBank of North America) - Meta-database of metabolite mass spectra, metadata, and associated compounds.
Protein
- THE HUMAN PROTEIN ATLAS - One of the largest human protein databases, covering cells, tissues, and organs.
- PROTEIN DATA BANK - Database of the 3D shapes of proteins, nucleic acids, and complex assemblies.
- UniProt - Collection of functional information on proteins.
- AlphaFold Protein Structure Database - Database of 3D protein structures.
- RCSB Protein Data Bank (PDB) - Repository of 3D structural data of large biological molecules.
- Critical Assessment of Structure Prediction (CASP) - Experiment for advancing the methods of predicting protein structure from sequence.
- Uniclust - Collection of clustered protein sequence databases.
- CATH database - Hierarchical classification of protein domain structures.
Genome
- Human Genome Resources at NCBI - Database of image, proteomics, transcriptomics and systems biology.
- GenBank - Database of genetic sequences offered by NCBI.
- UCSC Genome Browser - Genome browser offered by UCSC.
- cBioPortal - Database of cancer genomics, with aggregate overviews across large numbers of patients.
- 10x Genomics Dataset - Collection of single-cell datasets.