LongLLaMA

LongLLaMA is a large language model designed for handling extensive text contexts, capable of processing up to 256,000 tokens. It is based on OpenLLaMA and fine-tuned using the Focused Transformer (FoT) method. The repository offers a smaller 3B base variant of LongLLaMA under the Apache 2.0 license, which can serve as a drop-in replacement for LLaMA in existing implementations, along with code for instruction tuning and FoT continued pretraining. LongLLaMA's key innovation is its ability to handle contexts significantly longer than those seen during training, making it useful for tasks that demand extensive context understanding. It also integrates easily with Hugging Face for natural language processing tasks.

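As a rough sketch of what that Hugging Face integration looks like, the snippet below loads a LongLLaMA checkpoint through the standard transformers API. The model ID syzymon/long_llama_3b and the trust_remote_code flag are assumptions based on how custom-architecture checkpoints are commonly published on the Hugging Face Hub; check the LongLLaMA repository for the exact identifier.

```python
# Minimal sketch: loading LongLLaMA with Hugging Face transformers.
# The checkpoint ID below is an assumption; use the identifier
# published in the LongLLaMA repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "syzymon/long_llama_3b"  # assumed Hub identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float32,
    trust_remote_code=True,  # the FoT attention ships as custom modeling code
)

# Generate a short continuation; long-context use works the same way,
# just with a (much) longer input.
prompt = "LongLLaMA handles long contexts by"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
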
About the author
Rohan
