Natural Language Processing (NLP) entity extraction via open-source Stanza
Open-source LLMs (e.g. Mistral 7B or Llama 3.1); embeddings created with SciBERT
Open-source LLMs (e.g. Llama 3.1 and BLOOM) using LangChain and LlamaIndex
Latent Dirichlet Allocation (LDA) topic modeling, plus dimensionality reduction and clustering (UMAP and HDBSCAN)
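
To make the LDA item above concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written in pure Python. It is an illustration under stated assumptions, not the implementation used in this work: the function name `lda_gibbs`, the toy documents, and the hyperparameter values (`alpha`, `beta`, `iterations`) are all hypothetical, and a real pipeline would more likely rely on an established library.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA on tokenized documents with collapsed Gibbs sampling.

    Hypothetical helper for illustration; returns per-document topic
    proportions and per-topic word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                       # tokens per topic
    assignments = []                                     # topic of each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed topic proportions for each document.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return theta, topic_word


# Toy corpus (made up for this example).
docs = [
    "gene protein cell protein gene".split(),
    "cell gene expression protein".split(),
    "model training loss gradient model".split(),
    "gradient loss optimizer training".split(),
]
theta, topic_word = lda_gibbs(docs, num_topics=2)
for row in theta:
    print([round(p, 3) for p in row])
```

Each printed row is one document's estimated topic mixture. On a corpus this small the result depends heavily on the seed and hyperparameters, so the sketch only shows the mechanics of the sampler.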