
Petals


Collaboratively run large language models faster with Petals.


Petals lets users collaboratively run large language models such as BLOOM-176B at home, providing fast single-batch inference as well as parallel inference. The platform offers flexibility beyond classic language model APIs, allowing custom execution paths through the model and access to its hidden states. Follow the development progress by subscribing via email or joining the Discord community.

Petals is a collaborative platform that enables users to run large language models like BLOOM-176B at home, BitTorrent-style. Each user loads a small part of the model and teams up with others serving the remaining parts, which yields fast single-batch inference as well as parallel inference. Parallel inference can reach hundreds of tokens per second, while single-batch inference runs at approximately one second per step (token), up to 10 times faster than offloading and fast enough for chatbots and other interactive applications.
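As a rough illustration, the Python sketch below shows what distributed generation with the Petals client can look like. The class name DistributedBloomForCausalLM and the model ID bigscience/bloom-petals follow Petals' published examples at the time of writing and may differ in newer releases; treat them as assumptions rather than a definitive API reference.

```python
# A minimal sketch of distributed inference with the Petals client.
# NOTE: the class name and model ID below are assumptions based on
# Petals' published examples and may change between releases.
from transformers import AutoTokenizer
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"  # public BLOOM-176B swarm (assumed ID)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# Only the small embedding layers are loaded locally; the transformer
# blocks are served by other peers in the swarm, BitTorrent-style.
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

inputs = tokenizer("A quick test of Petals:", return_tensors="pt")["input_ids"]
# Single-batch generation runs at roughly one second per token.
outputs = model.generate(inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0]))
```

Each call to generate() routes activations through other peers' transformer blocks, so no single machine ever needs to hold the full 176B-parameter model.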

Petals is more than a classic language model API. Users can employ any fine-tuning and sampling methods, execute custom paths through the model, or inspect its hidden states. This flexibility comes with API-like ease of use thanks to the integration with PyTorch, so users can implement, modify, and deploy language models while retaining access to the latest advancements in neural network research.
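Because the distributed model is meant to behave like an ordinary PyTorch module, custom training loops and sampling code carry over directly. The sketch below illustrates a single local fine-tuning step under that assumption; the exact forward signature is inferred from the Hugging Face-style interface Petals advertises, not taken from the source above.

```python
# A hedged sketch of one local fine-tuning step on a Petals-served model.
# Assumption: the distributed model exposes the usual Hugging Face causal-LM
# interface (forward() returning .logits), as Petals' PyTorch integration suggests.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

# Only parameters hosted on this machine (e.g. embeddings or a trainable
# prompt/adapter) receive gradient updates; the remote blocks stay frozen.
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)

batch = tokenizer("Petals runs BLOOM-176B collaboratively.", return_tensors="pt")
input_ids = batch["input_ids"]

outputs = model(input_ids)  # activations travel through the swarm and back
# Standard next-token prediction loss, computed locally from the returned logits.
loss = F.cross_entropy(
    outputs.logits[:, :-1].reshape(-1, outputs.logits.size(-1)),
    input_ids[:, 1:].reshape(-1),
)
loss.backward()             # gradients flow back through the remote blocks
optimizer.step()
optimizer.zero_grad()
```

The same pattern extends to custom sampling strategies or probing hidden states, since the intermediate activations pass through ordinary PyTorch tensors on the client side.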

The Petals platform is easy to use, requiring neither extensive coding knowledge nor expensive hardware. Users can try it in Colab, read the docs, subscribe via email, or join the Discord community. The platform was developed as part of the BigScience research workshop and is open source, making it accessible to anyone interested in advancing the field of natural language processing.

The benefits of Petals extend beyond faster processing times. Its collaborative, BitTorrent-style approach also reduces reliance on centralized data centers, since users can pool their computing resources to run large language models at home. This decentralized way of running and adapting language models helps protect user privacy and promotes ethical data usage practices.

In conclusion, Petals is a platform that allows users to collaboratively run large language models at home, BitTorrent-style. With fast single-batch inference, parallel inference, and broad flexibility in how models are used and fine-tuned, Petals empowers users to advance the field of natural language processing in a privacy-conscious and ethical manner. Try it now in Colab, browse the docs, or join the Petals community to stay up to date with the latest developments.