bigscience-workshop/petals — 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and

Run large language models at home, BitTorrent-style.
Fine-tuning and inference up to 10x faster than offloading.

Generate text with distributed Llama 3.1 (up to 405B), Mixtral (8x22B), Falcon (40B+) or BLOOM (176B) and fine‑tune them for your own tasks — right from your desktop computer or Google Colab:

from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

# Choose any model available at https://health.petals.dev
model_name = "meta-llama/Meta-Llama-3.1-405B-Instruct"

# Connect to a distributed network hosting model layers
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

# Run the model as if it were on your computer
inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))  # A cat sat on a mat...
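
To make the "network hosting model layers" idea more concrete, here is a toy sketch of how an activation might pass through peers that each hold a contiguous slice of a model's layers. It is not Petals code: `ToyPeer`, `run_pipeline`, and the arithmetic "layers" are hypothetical names invented for illustration, and the real system also has to handle networking, peer discovery, overlapping slices, and fault tolerance, none of which is modeled here.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical toy types: none of these names come from the Petals codebase.
Layer = Callable[[float], float]


@dataclass
class ToyPeer:
    """A pretend server that hosts a contiguous slice of the model's layers."""
    name: str
    start: int            # index of the first hosted layer (inclusive)
    layers: List[Layer]

    @property
    def end(self) -> int:
        # Index one past the last hosted layer.
        return self.start + len(self.layers)

    def forward(self, hidden: float) -> float:
        # Apply every hosted layer in order.
        for layer in self.layers:
            hidden = layer(hidden)
        return hidden


def run_pipeline(peers: List[ToyPeer], hidden: float, num_layers: int) -> float:
    """Pass the activation through peers in layer order, checking full coverage."""
    next_layer = 0
    for peer in sorted(peers, key=lambda p: p.start):
        # Simplifying assumption: slices must tile the model with no gaps or overlaps.
        if peer.start != next_layer:
            raise ValueError(f"expected a peer starting at layer {next_layer}")
        hidden = peer.forward(hidden)
        next_layer = peer.end
    if next_layer != num_layers:
        raise ValueError(f"no peer serves layer {next_layer}")
    return hidden


# Six toy "layers": layer i adds i + 1 to the activation.
all_layers: List[Layer] = [lambda x, i=i: x + (i + 1) for i in range(6)]

# Three peers, listed out of order on purpose, together covering layers 0..5.
peers = [
    ToyPeer("peer-b", start=2, layers=all_layers[2:5]),
    ToyPeer("peer-a", start=0, layers=all_layers[0:2]),
    ToyPeer("peer-c", start=5, layers=all_layers[5:6]),
]

distributed = run_pipeline(peers, 0.0, num_layers=len(all_layers))

# Reference result: run every layer in a single process.
local = 0.0
for layer in all_layers:
    local = layer(local)

print(distributed, local)
```

In this sketch the chained result matches the single-process result, which is the property the client snippet above relies on when it calls `model.generate` as if the model were local.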

Alternatives

prompts.chat

hermes-agent

dify

open-webui

langchain

awesome-llm-apps

README