
Saturday, 6 May 2023

Transformer summariser pipeline giving different results on same model with fixed seed

I am using a HuggingFace summariser pipeline and I noticed that if I train a model for 3 epochs and then evaluate the checkpoints from all 3 epochs with fixed random seeds, I get different results depending on whether I restart the Python console between evaluations or load each epoch's model onto the same summariser object in a loop. I would like to understand why this strange behaviour occurs.

While my results are based on ROUGE scores over a large dataset, I have made this small reproducible example to show the issue. Instead of using the weights of the same model at different training epochs, I demonstrate with two different summarization models, but the effect is the same. Grateful for any help.

Notice how in the first run I first use the facebook/bart-large-cnn model and then the lidiya/bart-large-xsum-samsum model without restarting the Python terminal. In the second run I use only the lidiya/bart-large-xsum-samsum model and get a different output (which should not be the case).

NOTE: this reproducible example won't work on a CPU-only machine: the CPU path does not appear to be sensitive to torch.use_deterministic_algorithms(True) and may give different results on every run, so the issue should be reproduced on a GPU.
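For reference, torch.use_deterministic_algorithms(True) on GPU also expects a CuBLAS workspace setting, and generation can draw from the CPU generator as well, so a fuller seeding block (kept minimal in the runs below) would look something like this sketch:

import os
# Required by PyTorch for deterministic CuBLAS ops on CUDA >= 10.2;
# must be set before the first CUDA call.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch

torch.manual_seed(42)            # CPU generator (sampling may draw from it)
torch.cuda.manual_seed_all(42)   # all GPU generators
torch.use_deterministic_algorithms(True)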

FIRST RUN

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
import torch

# random text taken from UK news website
text = """
The veteran retailer Stuart Rose has urged the government to do more to shield the poorest from double-digit inflation, describing the lack of action as “horrifying”, with a prime minister “on shore leave” leaving a situation where “nobody is in charge”.
Responding to July’s 10.1% headline rate, the Conservative peer and Asda chair said: “We have been very, very slow in recognising this train coming down the tunnel and it’s run quite a lot of people over and we now have to deal with the aftermath.”
Attacking a lack of leadership while Boris Johnson is away on holiday, he said: “We’ve got to have some action. The captain of the ship is on shore leave, right, nobody’s in charge at the moment.”
Lord Rose, who is a former boss of Marks & Spencer, said action was needed to kill “pernicious” inflation, which he said “erodes wealth over time”. He dismissed claims by the Tory leadership candidate Liz Truss’s camp that it would be possible for the UK to grow its way out of the crisis.
"""

seed = 42
torch.cuda.manual_seed_all(seed)          # seed all GPU generators
torch.use_deterministic_algorithms(True)  # force deterministic CUDA kernels
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")
model.eval()
summarizer = pipeline(
    "summarization", model=model, tokenizer=tokenizer, 
    num_beams=5, do_sample=True, no_repeat_ngram_size=3, device=0
)

output = summarizer(text, truncation=True)

tokenizer = AutoTokenizer.from_pretrained("lidiya/bart-large-xsum-samsum")
model = AutoModelForSeq2SeqLM.from_pretrained("lidiya/bart-large-xsum-samsum")
model.eval()
summarizer = pipeline(
    "summarization", model=model, tokenizer=tokenizer, 
    num_beams=5, do_sample=True, no_repeat_ngram_size=3, device=0
)

output = summarizer(text, truncation=True)
print(output)

Output from the lidiya/bart-large-xsum-samsum model:

[{'summary_text': 'The UK economy is in crisis because of inflation. The government has been slow to react to it. Boris Johnson is on holiday.'}]

SECOND RUN (you must restart Python to conduct the experiment)

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
import torch

text = """
The veteran retailer Stuart Rose has urged the government to do more to shield the poorest from double-digit inflation, describing the lack of action as “horrifying”, with a prime minister “on shore leave” leaving a situation where “nobody is in charge”.
Responding to July’s 10.1% headline rate, the Conservative peer and Asda chair said: “We have been very, very slow in recognising this train coming down the tunnel and it’s run quite a lot of people over and we now have to deal with the aftermath.”
Attacking a lack of leadership while Boris Johnson is away on holiday, he said: “We’ve got to have some action. The captain of the ship is on shore leave, right, nobody’s in charge at the moment.”
Lord Rose, who is a former boss of Marks & Spencer, said action was needed to kill “pernicious” inflation, which he said “erodes wealth over time”. He dismissed claims by the Tory leadership candidate Liz Truss’s camp that it would be possible for the UK to grow its way out of the crisis.
"""

seed = 42
torch.cuda.manual_seed_all(seed)
torch.use_deterministic_algorithms(True)

tokenizer = AutoTokenizer.from_pretrained("lidiya/bart-large-xsum-samsum")
model = AutoModelForSeq2SeqLM.from_pretrained("lidiya/bart-large-xsum-samsum")
model.eval()
summarizer = pipeline(
    "summarization", model=model, tokenizer=tokenizer, 
    num_beams=5, do_sample=True, no_repeat_ngram_size=3, device=0
)

output = summarizer(text, truncation=True)
print(output)

This time the output is:

[{'summary_text': 'The government has been slow to deal with inflation. Stuart Rose has urged the government to do more to shield the poorest from double-digit inflation.'}]

Why is the first output different from the second one?

ANSWER

You should re-seed the program after the bart-large-cnn pipeline runs. Otherwise the first pipeline consumes numbers from the seeded generator, so the generator is in a different state by the time your lidiya model samples, and the two scripts produce different outputs.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
import torch

# random text taken from UK news website
text = """
The veteran retailer Stuart Rose has urged the government to do more to shield the poorest from double-digit inflation, describing the lack of action as “horrifying”, with a prime minister “on shore leave” leaving a situation where “nobody is in charge”.
Responding to July’s 10.1% headline rate, the Conservative peer and Asda chair said: “We have been very, very slow in recognising this train coming down the tunnel and it’s run quite a lot of people over and we now have to deal with the aftermath.”
Attacking a lack of leadership while Boris Johnson is away on holiday, he said: “We’ve got to have some action. The captain of the ship is on shore leave, right, nobody’s in charge at the moment.”
Lord Rose, who is a former boss of Marks & Spencer, said action was needed to kill “pernicious” inflation, which he said “erodes wealth over time”. He dismissed claims by the Tory leadership candidate Liz Truss’s camp that it would be possible for the UK to grow its way out of the crisis.
"""

seed = 42
torch.cuda.manual_seed_all(seed)
torch.use_deterministic_algorithms(True)
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")
model.eval()
summarizer = pipeline(
    "summarization", model=model, tokenizer=tokenizer, 
    num_beams=5, do_sample=True, no_repeat_ngram_size=3, device=0
)

output = summarizer(text, truncation=True)

# re-seed so the second pipeline starts from the same generator state
seed = 42
torch.cuda.manual_seed_all(seed)
torch.use_deterministic_algorithms(True)

tokenizer = AutoTokenizer.from_pretrained("lidiya/bart-large-xsum-samsum")
model = AutoModelForSeq2SeqLM.from_pretrained("lidiya/bart-large-xsum-samsum")
model.eval()
summarizer = pipeline(
    "summarization", model=model, tokenizer=tokenizer, 
    num_beams=5, do_sample=True, no_repeat_ngram_size=3, device=0
)

output = summarizer(text, truncation=True)
print(output)
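To see why the re-seed matters, here is a minimal sketch with plain tensor draws standing in for the sampling steps of generation: every sampled token advances the CUDA generator, so whatever runs first changes the state the next model sees.

import torch

torch.cuda.manual_seed_all(42)
a = torch.rand(3, device="cuda")  # first pipeline's draws advance the state
b = torch.rand(3, device="cuda")  # different from a: the state has moved on

torch.cuda.manual_seed_all(42)    # re-seed, as in the fixed script above
c = torch.rand(3, device="cuda")  # identical to a again
print(torch.equal(a, c))          # True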
  • Hi @joe32140, that's a great point, thanks. May I ask as a follow-up why we need to do this? What happens if we don't re-seed? Why would the seed only be used by the first pipeline?
    – andrea
  • The seeding ensures the reproducibility of a single script. In your case you are running two different scripts, so we should not expect them to generate the same result. My example is a simple one to explain why you got different summaries from the model, but it might not be the right way to use seeds.
    – joe32140
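As a side note on joe32140's point, transformers ships a set_seed helper that seeds Python's random module, NumPy, and PyTorch (CPU and all GPUs) in one call; re-seeding with it before each pipeline run is a tidier version of the same fix:

from transformers import set_seed

set_seed(42)  # seeds random, numpy, torch and torch.cuda together
output = summarizer(text, truncation=True)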
