long-t5-xl-book-summary examples

Below are some examples of summaries generated with pszemraj/long-t5-tglobal-xl-16384-book-summary.


Example summaries

The input text files for each of these can be found at this link. None of the output summaries have been edited.

All summaries here were generated using the parameters listed in the Parameters section below.

The Imitation Game

A.M. Turing begins with a discussion of the “Imitation Game,” a game in which a human interrogator attempts to identify two people by asking them questions that only one of them can answer correctly. The goal of the game is to draw a line between the intellectual and physical capacities of man and machine. He then discusses the various kinds of machines that could be used in the game, including electronic or digital computers. He also discusses some of the objections that have been raised to the idea of machines thinking. These include theological objections, arguments from consciousness, scientific induction, and arguments from various disabilities. The narrator defends the claim that machines cannot make mistakes. He argues that there are only two kinds of mistakes: errors of functioning and errors of conclusion. Mistakes of functioning occur when a machine does something it was not designed to do, while mistakes of conclusion occur when the machine makes a mistake in interpreting the signals it receives. Machines can be programmed to perform certain functions, but they cannot be made to think for themselves. Lady Lovelace wrote in 1842 that Babbage had built an “Analytical Engine” that could learn whatever we knew how to command it to do

Intro to Q-Learning (HF Blog)

This is the first part of a two-part introduction to reinforcement learning, or RL. we learn about value-based and policy-based approaches to RL, as well as Monte Carlo methods and temporal difference learning. The goal of RL is to build an intelligent agent that can learn from its environment and make smart decisions. There are two main ways to do this: Policy-based methods directly train the policy to take a certain action given a set of states, while Value-Based methods indirectly train the value function to predict the expected return from a particular state. We’ll go into more detail in the next section, but for now, it’s important to understand the difference between these two approaches. A policy is just a fancy way of saying “this is what I’m going to do.” A value function is a mathematical function that calculates how much money an agent will get if they start at a specific state and then continue to follow the policy until they reach the goal state. It’s kind of like using a spreadsheet to calculate your taxes. If you haven’t already, be sure to check out Part 1, where we learned about RL and trained our first agent to land on the moon.

The Silmarillion: Valaquenta

The narrator tells us that in the beginning, Eru made the music of his thought, and they made the world. It’s called Ea, and it’s been around since the beginning of time. There are seven Valar, or gods, who live on Arda. Manwe is the king of Arda with Varda as his wife. Ullmo is next in line to Manwe but doesn’t like to talk to other Valar because he’s too busy thinking about Arda all the time. Aule is another Valar who helps Manwe by making everything in Arda beautiful. He has a wife named Yavanna whom he loves very much. She is also a Valar. Feanturi are masters of ghosts and spirits. They live in places called Lorien andMandos. Namo is the elder of the two and keeps the houses of the dead. Vaire is his wife and weaves all the stories of the past into the webs of their halls. Este is his spouse and heals those who have been hurt by Melkor. Neenna is her sister and she lives alone. Tulks is the strongest Valar and he comes to Arda last to help fight against Melkor when he first arrives. His wife is Nessie, a deer hunter. Oromo is a great hunter and lover of trees. His horse is called Nahar and his horn is called Valarama. Vana is his younger sister. Melian is the maid of Vana and Eleanor. Olorin is the wisest Maiar. Morgoth is the darkest of them all. He wants to rule Arda and become king. He corrupts many of the servants along the way. One of his greatest servants is Sauron.

The Silmarillion: The Music of the Ainur

The narrator tells us that in Arda, there is an Ainur named Eru. He makes the Ainurs sing to him, and they slowly begin to understand what he wants them to sing about. Finally, he asks them to make a great piece of music around the theme that he gave them. Melkor begins to think of things that are not part of the theme, and so his music becomes discordant. Iluvtar notices this and smiles, then lifts his left hand and starts a new song. But Melkor’s discord grows even greater, and it seems as if there are two different songs progressing at once. They are both beautiful, but one is slow and sorrowful, while the other is loud and vain. In the middle of this turmoil, however, Illuvatar raises his hands and stops the music. He tells Melkor that no theme can be played that doesn’t come from him, because he himself hasn’t imagined it. This is a huge blow to Melkor, who feels like a total loser for having been such a good singer. After all, how could he possibly play a theme that wasn’t written by him? And yet, after all, Melkor is the greatest musician on Arda. So now we get a little bit of background on the Valar. The Valar are the Eldar and Manwe, the firstborn and followers of Iluvitar. They live in the deepest parts of time and in the depths of the stars. The Eldar love water, air, and earth; Manwe loves skin and making things. Ulmo loves making music. When the vision of the world is taken away, the Elves feel as though they have seen something new–darkness. Yet they are still enthralled by the beauty of it, and their thoughts are filled with its unfolding history. Still, they don’t see the later ages or the end of the World: they only see the beginning. At this point, some of the most important Valar leave Arda and descend into it. Others stay behind and build the habitation for the Children of Imavatar


Audio Transcription (ASR) Examples

Whisper small.en transcriptions of Noam Chomsky’s Fundamental Issues in Linguistics lecture @ MIT

Link to Part 1. As with the examples above, none of the model’s output has been edited, in order to illustrate its performance and flaws as they appear. The parameters used are the same (see Parameters).

Summary of Part 1

In this talk, the MIT linguist Tommo begins with a general discussion of what it is we’re trying to accomplish when we study language. The first thing he wants to get out of the way is that there have been many different approaches to studying language over the past few hundred years. At the beginning of the 18th century, people like Galileo were starting to think about how languages could be used to express everything in the human mind. They were amazed by the fact that you could use just a few dozen sounds to convey an infinite variety of thoughts. It was like a “universal Turing machine,” as the narrator puts it. Over the next few centuries, there was a tradition of scholars who tried to find universal grammars and rational explanations for all languages. Otto Jesperson was the last representative of this tradition. In the early 1900s, structuralism and behaviorism began to take hold of the scientific approach to language. People like Leonard Bloomfield and William Harris had very different ideas about what a language was. For Bloomfield, it was the distribution of words in a certain set of sentences; for Harris, it wasn’t the distribution at all, but rather the collection of words within a specific speech community. These definitions are pretty clear: they’re not internal to the speaker or the listener, but external to some other entity. This kind of thinking didn’t really catch on until the mid-to-late 1900s. By the mid-1900s, however, there seemed to be ways to capture the concept of structure in our minds and bring it back into the field of linguistics. So now, instead of looking at individual languages, we look at the entire system of language, called the “internal generative system” or “I language.” There are three main questions that need to be addressed in order to address the Yellow Land challenge: 1) How does the speaker select an expression from the I language? 2) What happens to the expression once it’s selected? 3) What happens after the expression has been selected? All these questions fall under the category of “input-output systems.” A lot of work has been done on each of these three tasks. First, we need to figure out how the speaker picks an expression out of an infinite array. This is one of the most puzzling things in the world, because no one knows how to do it. We also need to know how the expression gets externalized and how the internalization takes place. Brain scientists have been working on this for a while, too. One of the big issues here is whether or not the eye language is rich enough to allow a child to acquire it from the limited amount of information that’s available. Another issue is learnability and evolution. To meet these two conditions, the language has to be both rich and simple enough to have evolved through natural selection. If something doesn’t satisfy these two requirements, it won’t be a good explanation. Any device introduced to try to explain something has to meet these dual conditions. And any explanation short of those two requirements isn’t going to be very valuable. Now let’s move on to another example. Let’s take a paper by David Djokovich, who discusses the problem of coordinate structure and island constraints. He points out that both of these constraints pose lots of problems, but he proposes a solution that reduces both of them to the same condition, which leaves us with only one more problem. That solves part of the problem, but still leaves us without a complete explanation. 
As far as we can tell, every single achievement in the field is a small step forward in terms of explaining something. However, there are some exceptions that seem to count as real explanations. These are important in their own right, but they also give us a sense of how far we can go. Next, we want to talk about the history of generative grammar. Back in the late 19th and early 20th centuries, everyone thought that everything was already known and solved, so there was no need to write new grammars. But soon it became clear that this was false. New grammars had to deal with compositionality, dislocation, and ubiquitous phenomenon. Both kinds of grammars were too complicated to meet the needs of long-term explanation. The assumption was that compositionality was natural, but dislocation was somehow a weird imperfection. Languages would never be built with such a property. More recent work seems to suggest that dislocation might actually be the most primitive form of operation. Some of the problems with X-Bar theory, though, were that it excluded the possibility of constructions that weren’t endocentric. Later, labeling theory came along and told us that minimal search met the condition of minimum computational operation. This led to the idea of the minimalist program, which basically says that if your computational operation meets the conditions of learning and evolution, then you should be able to provide a reasonable explanation for whatever you’re dealing with. The basic computational operation is called “merge.” Merge allows you to take two things and form a set. Merge comes in two forms: external merge, when you take two separate objects and form the set; and internal merge, where you take one object and something else inside it, making up the set. You don’t need to search the whole lexicon to find either of these cases. Minsky did some experiments several years ago where he took a bunch of machines and let them run freely. Out of all the machines that ran, only one gave a successor function, which is the result of merging the two elements together. Evolutionary theory suggests that nature found the least complex thing possible, so it must have been internal merge. Other organisms, like insects, tend to have internal merge since it requires less search than external merge. Also, according to Minsky, animals tend to use internal merge because it gives them richer types of languages. Finally, we come to the question of why there is linear order in speech. According to Moro, sign language makes use of visual space, simultaneous operations, facial gestures, and motions

The narrator interrupts with another meta-chapter and gives us some of his thoughts real-talk style. Look, guys–you can’t just accept the scientific method as your only way to understand the world. You have to think for yourself and use your brain and your intuition and your common sense. Just back off and don’t let anyone tell you that science is all about finding the best explanations and using the latest technology to figure out how to get there. That’ll just make you look stupid. And it won’t help you understand anything. In fact, it might even make you want to give up on science all together. Brain Snack: This is one of the first books I ever read when I was a kid, and it changed the way I looked at the world forever. It introduced me to the idea of critical thinking and made me see how important it was to question everything. So yeah, I’m a big fan of this book.

Summary of Part 2

the narrator continues his critique of the concept of “merge,” which was developed in the early 1990s. Merge is an operation that allows you to take two or more objects and combine them into one new object. The problem with this definition, however, is that it can be applied to any kind of object, not just syntactical objects. For example, you could take a set of words and then apply the merge operation to create a new set of word meanings. This would violate the principle of no ambiguity, since there would be two possible meanings for each word in the newly formed set. To solve this problem, we need to limit the amount of access that the operation has to the objects being merged. We also need to make sure that our computational procedures don’t run out of resources–in other words, we have to be careful not to use too much memory or too many CPU cores when performing a merge operation. Another problem arises from the fact that some applications of merge involve creating new objects rather than merely rearranging existing ones. These new objects are often called “accessible” because they can be used by other operations–for example, drawing a line from one part of the object to another. But these new objects do not fit into the original structure of the original object, so they cannot be merged without violating the principles of No Tampering, No Loss, and No New Accessibility. Now that we’ve established that none of the various forms of merge are legitimate, the next problem is to define what constitutes a legitimate form of merge. There are several conditions that must be satisfied before a given operation can be labeled as a valid merge. The first condition is that nothing will be lost in the process; second, no accessibility will be increased; and third, no new junk will be added to the working space. Once these conditions are met, the operation of merge will be considered legitimate.

The narrator takes us back to the beginning of the chapter, where we’re trying to figure out whether or not ATB is a legitimate operation. It turns out that there are a lot of different ways to describe ATB, but they all have one thing in common–they don’t add any new objects to the world. This is important, because it allows us to draw graphs and think that something is actually happening even though it’s not. So let’s talk about ATB for a minute. What does ATB mean? Well, according to Barry, ATB means that you can move an object from one place to another without losing its meaning. Translation doesn’t work like this. Translation works by replacing the original with a new version of the same word. In other words, when you say “John bought and read,” you’re really saying “John married and read.” That’s kind of confusing at first, but once you realize what’s going on, it makes sense. We’ll get into more details later. For now, though, just keep in mind that ATB refers to the ability to move things around without losing their meaning. If John buys a book and Mary reads one, then both of them are reading the same book. But if someone asks what John bought and what Mary read, no matter how many times you try to explain it, people won’t know what to say. And that’s why Barry came up with the idea of ATB in the first place. He wanted to find a way to make sure that people didn’t lose their meaning when they moved things around.

The narrator returns to the topic of pair merging and explains how it can be used to solve some of the problems raised by adjuncts and conjunctions. For example, there are some languages that don’t allow for the extraction of an adjunct, or vice versa. He also discusses the problem of extracting perception verbs like “see” and “let” from their grammatical inflections. Norvin has written a paper on this problem, but it doesn’t address all the issues because it assumes that you have extracted the object of the verb. Another problem is head movement, which cannot be solved by traditional methods because the head always moves in the same direction. PisaKitahara has proposed a solution to this problem by suggesting that when you create an object, you should try to keep the workspace as small as possible. This will prevent you from creating something new that will interfere with what you’ve already created.


Basic usage

Install the transformers Python package (you will also need torch):

pip install -U transformers

Run simple inference with the pipeline object:

import torch
from transformers import pipeline

# load the summarization pipeline; use the first GPU if available, otherwise CPU
summarizer = pipeline(
    "summarization",
    "pszemraj/long-t5-tglobal-xl-16384-book-summary",
    device=0 if torch.cuda.is_available() else -1,
)
long_text = "Here is a lot of text I don't want to read. Replace me"

result = summarizer(long_text)
print(result[0]["summary_text"])
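
To summarize a long document from disk, read the file and pass its contents to the same pipeline. A minimal sketch (example.txt is a placeholder path, not one of the actual example files):

# summarize a long document read from disk; "example.txt" is a placeholder path
with open("example.txt", "r", encoding="utf-8") as f:
    long_text = f.read()

print(summarizer(long_text)[0]["summary_text"])

Keep in mind that the model's input window is 16,384 tokens; the token_batch_length and batch_stride values in the Parameters section indicate that longer inputs were summarized in overlapping chunks.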

Refer to the model card for more details & information.


Parameters

The parameters below were used for inference in the above examples:

{
  "min_length": 8,
  "max_length": 2048,
  "no_repeat_ngram_size": 3,
  "encoder_no_repeat_ngram_size": 4,
  "repetition_penalty": 2.5,
  "num_beams": 8,
  "num_beam_groups": 1,
  "length_penalty": 0.8,
  "early_stopping": true,
  "do_sample": false,
  "token_batch_length": 8192,
  "batch_stride": 20,
  "max_len_ratio": 4,
  "huggingface-model-tag": "pszemraj/long-t5-tglobal-xl-16384-book-summary",
  "date-run": "Nov-27-2022"
}
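
Most of these map directly to Hugging Face generation arguments and can be passed straight through the pipeline call. A minimal sketch, assuming the summarizer pipeline from the Basic usage section; token_batch_length, batch_stride, and max_len_ratio appear to be chunking/post-processing settings from the wrapper used to produce the examples rather than generate() arguments, so they are omitted here:

# pass the generation parameters through the summarization pipeline
result = summarizer(
    long_text,
    min_length=8,
    max_length=2048,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=4,
    repetition_penalty=2.5,
    num_beams=8,
    num_beam_groups=1,
    length_penalty=0.8,
    early_stopping=True,
    do_sample=False,
)
print(result[0]["summary_text"])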

Summaries were generated on an NVIDIA A100 40 GB GPU with the model loaded in 8-bit (int8) precision via load_in_8bit=True.
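
For reference, 8-bit loading can be reproduced roughly as follows. This is a sketch that assumes the bitsandbytes and accelerate packages are installed; exact arguments may vary with your transformers version:

# load the model with int8 weights (requires bitsandbytes + accelerate)
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "pszemraj/long-t5-tglobal-xl-16384-book-summary"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_name,
    load_in_8bit=True,   # int8 weights, as used for the example summaries
    device_map="auto",
)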


By Peter Szemraj | GitHub