Run Large Language Models on Colab with TextGen-WebUI
This blog post walks you through a handy GitHub repository that makes it easy to run Large Language Models (LLMs) on Google Colab with TextGen-WebUI.
Introduction
Tired of limited access to powerful LLMs? This repository by seyf1elislam offers a convenient solution! It provides Colab notebooks that allow you to run various LLMs with just a few clicks.
Repo link: Run LLM - OneClickColab
Available Notebooks
The repository offers two notebooks:
- Run GGUF LLM models in TextGen-webui: runs GGUF models through TextGen-WebUI. Link: Open in Github
- Run GPTQ and Exl2 LLM models in TextGen-webui: runs GPTQ and Exl2 models through TextGen-WebUI. Link: Open in Github
Getting Started
Here's how to get started:
1. Open the Colab notebook: Click the "Open in Colab" button for the desired notebook (GGUF or GPTQ/Exl2).
   - GGUF notebook (recommended ⭐)
   - GPTQ/Exl2 notebook
2. Choose your model: Select the LLM model you want to run from the list provided in the notebook.
3. Select the quantization type (if applicable): Choose the appropriate quantization type for your chosen model (if the notebook supports it).
4. Run the last cell: Execute the designated cell in the notebook (a rough sketch of what such a cell does follows these steps).
5. Visit the generated link: A link like https://****.trycloudflare.com/ will be displayed. Open this link in your browser.
6. Start the conversation: You're now ready to interact with your chosen LLM model through the TextGen-WebUI interface!
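If you're curious what that "last cell" is doing under the hood, here is a minimal sketch of the typical launch flow: clone text-generation-webui, put a model in its models/ folder, and start the server. The exact commands and flags in the repo's notebooks may differ, and they expose the UI via a cloudflare tunnel; this sketch uses Gradio's --share flag as a stand-in, and the model filename is a placeholder.

```python
# Minimal sketch of a Colab launch cell (illustrative only; the repo's
# notebooks may use different commands and a cloudflared tunnel instead
# of Gradio's --share link).
import subprocess

# 1. Fetch the web UI and install its dependencies.
subprocess.run(
    ["git", "clone", "https://github.com/oobabooga/text-generation-webui"],
    check=True,
)
subprocess.run(
    ["pip", "install", "-r", "requirements.txt"],
    cwd="text-generation-webui",
    check=True,
)

# 2. A GGUF model file is assumed to already sit in
#    text-generation-webui/models/ (see the download sketch below).

# 3. Launch the server; --share prints a public URL once the UI is up.
subprocess.run(
    ["python", "server.py", "--model", "my-model.Q4_K_M.gguf", "--share"],
    cwd="text-generation-webui",
    check=True,
)
```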
Quantized Models Sources
For optimal performance, consider using quantized models. You can find these on the Hugging Face repositories of:
- mradermacher (GGUF)
- bartowski (GGUF)
- LoneStriker (exl2, GGUF)
- QuantFactory (GGUF)
You can also search Hugging Face for "GGUF" to browse all available GGUF models; the GGUF docs page covers the format in detail: https://huggingface.co/docs/hub/en/GGUF
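As an example, here is a minimal sketch of pulling a single quantization file from one of these repos with the huggingface_hub library. The repo ID comes from the recommended list below, but the filename is an assumed example; check the repository's file listing on Hugging Face for the exact name.

```python
# Sketch: download one GGUF quantization file into the web UI's models
# folder. The filename is an assumed example; check the repo's file
# list on Hugging Face for the real name.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="QuantFactory/Mistral-Nemo-Instruct-2407-GGUF",
    filename="Mistral-Nemo-Instruct-2407.Q4_K_M.gguf",  # assumed name
    local_dir="text-generation-webui/models",
)
print("Saved to:", path)
```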
Recommended Models
Here are some exciting models to try:
Latest (11/2024):
- Qwen/Qwen2.5-Coder-14B-Instruct-GGUF: This Q5_K_M / Q4_K_M model is highly recommended for coding (⭐⭐⭐⭐⭐⭐).
- Qwen/Qwen2.5-Coder-7B-Instruct-GGUF: This Q8_0 model is highly recommended for coding (⭐⭐⭐⭐⭐).
Others :
- QuantFactory/Mistral-Nemo-Instruct-2407-GGUF (12B): This Q5_K_M / Q4_K_M model is highly recommended (⭐⭐⭐⭐).
- bartowski/Mistral-Small-Instruct-2409-GGUF (22B): This model runs well at Q3_K_M within 15 GB of VRAM (⭐).
- Meta-Llama Models (8B): Explore Meta-Llama-3.1-8B-Instruct-GGUF (Q8_0) and Meta-Llama-3-8B-Instruct-GGUF (Q8_0) (⭐⭐⭐⭐).
- bartowski/gemma-2-9b-it-GGUF (9B): This model offers Q8_0/Q6 quantization (⭐⭐⭐).
Requirements
There are no specific software requirements. Just open Colab in GPU mode, and the notebook will handle the necessary installations.
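To confirm the runtime is actually in GPU mode before running the notebook, a quick check like this helps (a sketch using PyTorch, which Colab ships with by default):

```python
# Verify the Colab runtime has a GPU attached
# (Runtime -> Change runtime type -> GPU) before launching the web UI.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")
else:
    print("No GPU detected -- switch the runtime type to GPU first.")
```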
Some Tips
- Free Colab T4 GPU (15 GB VRAM) recommendations (see the back-of-envelope estimate after this list):
  - 22B model: up to Q3_K_M (context up to ~8K)
  - 12B model: up to Q5_K_M (context up to 16K)
  - 8B/7B model: up to Q8_0 (context up to 16K, if supported)
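These limits follow from simple arithmetic: a quantized model's weights take roughly parameters × bits-per-weight ÷ 8 bytes, and the KV cache plus runtime buffers need headroom on top. Here is a back-of-envelope sketch; the bits-per-weight values are approximate averages for each GGUF quant type, and the overhead figure is a rough assumption, not an exact measurement.

```python
# Rough VRAM estimate: weights (params * bits / 8) plus fixed headroom
# for the KV cache and runtime buffers. Bit widths are approximate
# averages for each GGUF quantization type.
APPROX_BITS = {"Q3_K_M": 3.9, "Q4_K_M": 4.9, "Q5_K_M": 5.7, "Q8_0": 8.5}

def est_vram_gb(params_b: float, quant: str, overhead_gb: float = 2.5) -> float:
    """Estimate VRAM in GB for a model with params_b billion parameters."""
    weights_gb = params_b * APPROX_BITS[quant] / 8
    return weights_gb + overhead_gb

for params_b, quant in [(22, "Q3_K_M"), (12, "Q5_K_M"), (8, "Q8_0")]:
    print(f"{params_b}B @ {quant}: ~{est_vram_gb(params_b, quant):.1f} GB "
          f"(T4 offers ~15 GB)")
```

All three configurations land comfortably under the T4's ~15 GB, which is why they are the recommended ceilings above.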