"Generated output from the GatorTron model mirrors the input prompt without generating additional text or meaningful content, resulting in repetitive

I’m using the GatorTron model from the UF NLP repository (UFNLP/gatortronS) for a text generation task. I’ve successfully loaded the tokenizer and model and can tokenize the input prompt without issues. However, when I generate text using the model, the output is nearly identical to the input prompt, followed by repetitive padding (e.g., ....................................................................................................). The generated token IDs also reflect this behavior, with a large number of repeated values.

Script:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Define a more detailed prompt to test
prompt = "What are the common symptoms associated with fever?"

# Load the GatorTron-S tokenizer
tokenizer = AutoTokenizer.from_pretrained('UFNLP/gatortronS')
print("Tokenizer Loaded Successfully")

# Tokenize the input prompt
inputs = tokenizer(prompt, return_tensors="pt").input_ids
print("Tokenized Inputs:", inputs)

# Load the GatorTron-S model as a decoder for generation
model = AutoModelForCausalLM.from_pretrained('UFNLP/gatortronS', is_decoder=True)
print("Model Loaded Successfully")

# Generate output using the model with adjusted parameters
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=100,
    do_sample=True,    # Enable sampling
    top_p=0.9,         # Nucleus sampling
    temperature=0.7    # Temperature for diversity
)
print("Generated Outputs (token IDs):", outputs)

# Decode the generated tokens to text
decoded_output = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print("Decoded Outputs:", decoded_output)

Output:
Generated Outputs (token IDs): tensor([[ 101, 3810, 307, 134, 2813, 1111, 1551, 189, 1754, 574, 102, 112,
112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112,
112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112,
112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112,
112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112,
112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112,
112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112,
112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112,
112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112, 112,
112, 112, 112]])
Decoded Outputs: ['what are the common symptoms associated with fever?..']
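
For completeness, the repeated id in the generated tensor can be inspected against the tokenizer's vocabulary like this (just a quick sanity check, not part of the failing script above):

# Quick check: which token does the repeated id 112 map to?
print(tokenizer.convert_ids_to_tokens([112]))
print(tokenizer.decode([112]))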

The expected output should be a continuation of the input prompt with relevant content, but the model instead generates the same text as the input prompt, followed by a series of repetitive padding characters. I have tried adjusting the generation parameters (e.g., do_sample, top_p, temperature), but the issue persists.
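
For reference, here is one of the variations I tried (greedy decoding instead of nucleus sampling, reusing the tokenizer, inputs, and model objects from the script above); it produced the same prompt-plus-repetition output:

# Variation: greedy decoding instead of sampling
outputs_greedy = model.generate(
    input_ids=inputs,
    max_new_tokens=100,
    do_sample=False  # Disable sampling; take the highest-probability token at each step
)
print("Decoded (greedy):", tokenizer.batch_decode(outputs_greedy, skip_special_tokens=True))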

Has anyone encountered a similar issue, or can anyone offer insight into how to resolve it?