How to use BEE-spoke-data/pegasus-x-base-synthsumm_open-16k with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "summarization" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline
pipe = pipeline("summarization", model="BEE-spoke-data/pegasus-x-base-synthsumm_open-16k")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("BEE-spoke-data/pegasus-x-base-synthsumm_open-16k")
model = AutoModelForSeq2SeqLM.from_pretrained("BEE-spoke-data/pegasus-x-base-synthsumm_open-16k")
pegasus-x-base-synthsumm_open-16k
This is a text-to-text summarization model fine-tuned from pegasus-x-base on a dataset of long documents from various sources/domains and their synthetic summaries.
It performs surprisingly well as a general summarization model for its size. More details, a larger model, and the dataset will be released (as time permits).
Usage
Beam search decoding is recommended for this model. If interested, you can also use the textsum utility package, which abstracts most of this away for you:
pip install -U textsum
then:
from textsum.summarize import Summarizer
model_name = "BEE-spoke-data/pegasus-x-base-synthsumm_open-16k"
summarizer = Summarizer(model_name) # GPU auto-detected
text = "put the text you don't want to read here"
summary = summarizer.summarize_string(text)
print(summary)
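For anyone who prefers to call transformers directly, the beam search recommendation above could look roughly like the sketch below. The specific values (4 beams, 256 new tokens, the n-gram repetition limit) and the 16384-token input cap are illustrative assumptions, not settings published with this model, and the helper names `beam_search_kwargs` and `summarize` are hypothetical.

```python
def beam_search_kwargs(num_beams: int = 4, max_new_tokens: int = 256) -> dict:
    """Build keyword arguments for `model.generate` that enable beam search.

    All values here are illustrative assumptions, not tuned settings.
    """
    return {
        "num_beams": num_beams,            # >1 switches generate() to beam search
        "max_new_tokens": max_new_tokens,  # upper bound on summary length
        "no_repeat_ngram_size": 3,         # assumed value to curb repetition
        "early_stopping": True,            # stop once enough beams have finished
    }


def summarize(
    text: str,
    model_name: str = "BEE-spoke-data/pegasus-x-base-synthsumm_open-16k",
    max_input_tokens: int = 16384,  # assumption based on the "16k" in the name
) -> str:
    """Summarize `text` with beam search decoding."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Truncate overly long inputs rather than failing on them.
    inputs = tokenizer(
        text,
        truncation=True,
        max_length=max_input_tokens,
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, **beam_search_kwargs())
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `summarize(long_text)` downloads the checkpoint on first use and returns a single summary string; raising `num_beams` trades speed for a wider search.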
Architecture
Update May 2026:
The architecture of Pegasus-X is rather interesting and perhaps under-explored and rarely built upon. Additionally, one small innovation here (more on this with the larger variant) is that the original activation function was swapped for swish; the model was then healed as part of the fine-tuning process and works fine.
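For reference, swish is commonly defined as x · sigmoid(βx). The sketch below assumes β = 1 (the SiLU form); the model card does not say which variant was used, so treat that as an assumption.

```python
import math


def swish(x: float, beta: float = 1.0) -> float:
    """Swish activation: x * sigmoid(beta * x).

    beta=1.0 (the SiLU form) is an assumption; the exact variant used
    for this model is not stated.
    """
    return x / (1.0 + math.exp(-beta * x))


# Swish is smooth and slightly negative for small negative inputs.
for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"swish({value:+.1f}) = {swish(value):+.4f}")
```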
Here's a little glimpse of how this thing works and processes long sequences while being a small encoder-decoder:
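The Pegasus-X encoder replaces full self-attention with block-local attention plus a small set of global tokens, which is what keeps 16k-token inputs affordable. The back-of-the-envelope sketch below counts query-key pairs under each scheme. The block size of 512 and the 32 global tokens are assumed from the defaults of `PegasusXConfig` in transformers and may differ for this checkpoint; the staggering of blocks between layers is ignored, and both function names are hypothetical.

```python
import math


def full_attention_pairs(seq_len: int) -> int:
    """Query-key pairs when every token attends to every token."""
    return seq_len * seq_len


def block_local_pairs(seq_len: int, block_size: int = 512, num_global: int = 32) -> int:
    """Approximate query-key pairs for block-local attention with global tokens.

    block_size and num_global are assumed defaults, not values read from
    this checkpoint's config.
    """
    # Inputs are padded up to a whole number of blocks.
    padded = math.ceil(seq_len / block_size) * block_size
    local = padded * block_size                         # tokens attend within their block
    local_to_global = padded * num_global               # tokens also attend to global tokens
    global_to_all = num_global * (padded + num_global)  # global tokens attend to everything
    return local + local_to_global + global_to_all


for n in (1024, 4096, 16384):
    ratio = full_attention_pairs(n) / block_local_pairs(n)
    print(f"{n:>6} tokens: full attention needs about {ratio:.1f}x more pairs")
```

The local term grows linearly with sequence length while full attention grows quadratically, so the gap widens as inputs get longer.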
Model tree for BEE-spoke-data/pegasus-x-base-synthsumm_open-16k
Base model
google/pegasus-x-base