# -*- coding: utf-8 -*-
"""transformers_quick_start.ipynb
Automatically generated by Colab.
Original file is located at
https://colab.research.google.com/drive/1DoNJOflmG6BlGtO-g8CSAdb-iWhEnLpQ
https://huggingface.co/docs/transformers/quicktour
Quickstart
Transformers is designed to be fast and easy to use so that everyone can start learning or building with transformer models.
The number of user-facing abstractions is limited to only three classes for instantiating a model, and two APIs for inference or training. This quickstart introduces you to Transformers’ key features and shows you how to:
- load a pretrained model
- run inference with Pipeline
- fine-tune a model with Trainer

Set up
To start, we recommend creating a Hugging Face account. An account lets you host and access version controlled models, datasets, and Spaces on the Hugging Face Hub, a collaborative platform for discovery and building.
Create a User Access Token and log in to your account.
"""
#%%script echo Disabled
from huggingface_hub import notebook_login
notebook_login()
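"""If you run this script outside a notebook, you can log in programmatically instead. A minimal sketch, assuming a User Access Token is available; the token value below is a placeholder, not a real credential."""
# from huggingface_hub import login
# login(token="hf_xxx")  # placeholder token; alternatively set the HF_TOKEN environment variable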
"""Install a machine learning framework."""
# !pip install torch
import torch
import gc
"""
Then install an up-to-date version of Transformers and some additional libraries from the Hugging Face ecosystem for accessing datasets and vision models, evaluating training, and optimizing training for large models."""
!pip install -U transformers datasets evaluate accelerate timm
"""Pretrained models
Each pretrained model inherits from three base classes.
- PretrainedConfig: A file that specifies a model's attributes, such as the number of attention heads or vocabulary size.
- PreTrainedModel: A model (or architecture) defined by the model attributes from the configuration file. A pretrained model only returns the raw hidden states. For a specific task, use the appropriate model head to convert the raw hidden states into a meaningful result (for example, LlamaModel versus LlamaForCausalLM).
- Preprocessor: A class for converting raw inputs (text, images, audio, multimodal) into numerical inputs to the model. For example, PreTrainedTokenizer converts text into tensors and ImageProcessingMixin converts pixels into tensors.
We recommend using the AutoClass API to load models and preprocessors because it automatically infers the appropriate architecture for each task and machine learning framework based on the name or path to the pretrained weights and configuration file.
Use from_pretrained() to load the weights and configuration file from the Hub into the model and preprocessor class.
When you load a model, configure the following parameters to ensure the model is optimally loaded.
- device_map="auto" automatically allocates the model weights to your fastest device first, which is typically the GPU.
- torch_dtype="auto" directly initializes the model weights in the data type they're stored in, which can help avoid loading the weights twice (PyTorch loads weights in torch.float32 by default).
"""
from transformers import AutoModelForCausalLM, AutoTokenizer
# model_name = "meta-llama/Llama-2-7b-hf"
model_name = "Upstage/SOLAR-10.7B-Instruct-v1.0"
# model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
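"""The loaded objects can be inspected directly. A quick sanity check of the configuration and how the weights were loaded; hf_device_map is only set when device_map is used, hence the guarded access."""
print(model.config.vocab_size, model.config.num_attention_heads)
print(model.dtype)  # data type the weights were initialized in (torch_dtype="auto")
print(getattr(model, "hf_device_map", None))  # device placement chosen by device_map="auto"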
"""Tokenize the text and return PyTorch tensors with the tokenizer. Move the model to a GPU if it’s available to accelerate inference."""
model_inputs = tokenizer(["The secret to baking a good cake is "], return_tensors="pt").to(model.device)  # send inputs to the same device as the model
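"""The tokenizer returns a dictionary-like BatchEncoding of PyTorch tensors. A quick look at what it contains:"""
print(model_inputs.keys())  # typically input_ids and attention_mask
print(model_inputs["input_ids"].shape)  # (batch_size, sequence_length)
print(tokenizer.decode(model_inputs["input_ids"][0]))  # round-trip the ids back to text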
"""
The model is now ready for inference or training.
For inference, pass the tokenized inputs to generate() to generate text. Decode the token ids back into text with batch_decode()."""
generated_ids = model.generate(**model_inputs, max_length=30)
output = tokenizer.batch_decode(generated_ids)[0]
print(output)
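"""generate() accepts many decoding options. A minimal sketch of sampled generation; the parameter values below are illustrative, not tuned."""
sampled_ids = model.generate(
    **model_inputs,
    max_new_tokens=50,  # cap only the newly generated tokens
    do_sample=True,     # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.batch_decode(sampled_ids, skip_special_tokens=True)[0])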
"""
Skip ahead to the Trainer section to learn how to fine-tune a model.
Pipeline
The Pipeline class is the most convenient way to run inference with a pretrained model. It supports many tasks such as text generation, image segmentation, automatic speech recognition, document question answering, and more.
Refer to the Pipeline API reference for a complete list of available tasks.
Create a Pipeline object and select a task. By default, Pipeline downloads and caches a default pretrained model for a given task. Pass the model name to the model parameter to choose a specific model.
This example uses text generation; other tasks such as image segmentation and automatic speech recognition follow the same pattern.
Set device="cuda" to accelerate inference with a GPU."""
from transformers import pipeline
# pipeline = pipeline("text-generation", model=model_name, device="cuda")
# Initialize the pipeline with the already loaded model and tokenizer
# The pipeline will use the device mapping set when the model was loaded
pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)
"""Prompt Pipeline with some initial text to generate more text."""
output = pipeline("The secret to baking a good cake is ", max_length=50)
import textwrap
def wrap_preserve_newlines(text, width=80):
    # Split the text into lines at the original newlines
    lines = text.splitlines()
    # Wrap each line separately with textwrap.wrap
    wrapped_lines = [wrapped_line
                     for line in lines
                     for wrapped_line in textwrap.wrap(line, width=width)
                     or ['']]  # keep empty lines as empty output lines
    # Rejoin the wrapped lines
    return '\n'.join(wrapped_lines)
print(wrap_preserve_newlines(output[0]['generated_text']))
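"""Pipeline also accepts a list of prompts and returns one list of generations per prompt. A small sketch; the prompts and generation settings are illustrative."""
prompts = ["The secret to baking a good cake is ", "The best way to learn a new language is "]
outputs = pipeline(prompts, max_new_tokens=30)
for prompt_outputs in outputs:
    print(wrap_preserve_newlines(prompt_outputs[0]["generated_text"]))
    print("-" * 40)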
# Ensure the previous model is removed from memory
if 'model' in locals() and model is not None:
    # Move the model back to CPU if it's on the GPU
    if hasattr(model, 'to'):
        try:
            model.to('cpu')
        except Exception:
            pass  # Ignore errors if the model can't be moved
    del model  # Delete the model variable
    model = None  # Set to None

# Clear CUDA cache and run garbage collection
if torch.cuda.is_available():
    torch.cuda.empty_cache()
gc.collect()
"""Trainer
Trainer is a complete training and evaluation loop for PyTorch models. It abstracts away a lot of the boilerplate usually involved in manually writing a training loop, so you can start training faster and focus on training design choices. You only need a model, a dataset, a preprocessor, and a data collator to build batches of data from the dataset.
Use the TrainingArguments class to customize the training process. It provides many options for training, evaluation, and more. Experiment with training hyperparameters and features like batch size, learning rate, mixed precision, torch.compile, and more to meet your training needs. You could also use the default training parameters to quickly produce a baseline.
Load a model, tokenizer, and dataset for training.
"""
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from datasets import load_dataset
model = AutoModelForSequenceClassification.from_pretrained("distilbert/distilbert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
dataset1 = load_dataset("rotten_tomatoes")
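"""rotten_tomatoes is a small sentiment dataset with train/validation/test splits; each example has a text string and an integer label. A quick peek at its structure:"""
print(dataset1)
print(dataset1["train"][0])  # e.g. {'text': '...', 'label': 1}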
"""Create a function to tokenize the text and convert it into PyTorch tensors. Apply this function to the whole dataset with the map method."""
def tokenize_dataset(dataset):
    return tokenizer(dataset["text"])

dataset = dataset1.map(tokenize_dataset, batched=True)
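"""map adds the tokenizer outputs as new columns alongside the original ones. A quick check of what a tokenized example now contains:"""
print(dataset["train"].column_names)  # typically text, label, input_ids, attention_mask
print(dataset["train"][0]["input_ids"][:10])  # first few token ids of the first example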
"""Load a data collator to create batches of data and pass the tokenizer to it."""
from transformers import DataCollatorWithPadding
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
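"""The collator pads every batch to the longest sequence in that batch. A small sanity check; the raw text column is dropped first because the collator only expects tokenizer outputs and labels (Trainer removes unused columns for you)."""
features = [
    {k: v for k, v in dataset["train"][i].items() if k in ("input_ids", "attention_mask", "label")}
    for i in range(4)
]
batch = data_collator(features)
print(batch["input_ids"].shape)  # (4, longest_sequence_in_this_batch)
print(batch.keys())              # the 'label' column is collated into 'labels'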
"""Next, set up TrainingArguments with the training features and hyperparameters."""
from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir="distilbert-rotten-tomatoes",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=2,
    push_to_hub=True,
)
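"""Optionally, track accuracy during evaluation with the evaluate library installed earlier. A minimal sketch of a compute_metrics function that could be passed to Trainer via compute_metrics=compute_metrics."""
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # highest-scoring class per example
    return accuracy.compute(predictions=predictions, references=labels)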
"""Finally, pass all these separate components to Trainer and call train() to start."""
# Commented out IPython magic to ensure Python compatibility.
# %%script echo Disabled
#
# from transformers import Trainer
#
# trainer = Trainer(
#     model=model,
#     args=training_args,
#     train_dataset=dataset["train"],
#     eval_dataset=dataset["test"],
#     tokenizer=tokenizer,
#     data_collator=data_collator,
# )
#
# trainer.train()
"""Share your model and tokenizer to the Hub with push_to_hub()."""
# trainer.push_to_hub()
# Ensure the previous model is removed from memory
if 'model' in locals() and model is not None:
    # Move the model back to CPU if it's on the GPU
    if hasattr(model, 'to'):
        try:
            model.to('cpu')
        except Exception:
            pass  # Ignore errors if the model can't be moved
    del model  # Delete the model variable
    model = None  # Set to None

# Clear CUDA cache and run garbage collection
if torch.cuda.is_available():
    torch.cuda.empty_cache()
gc.collect()
"""Not all pretrained models are available in TensorFlow. Refer to a models API doc to check whether a TensorFlow implementation is supported.
Trainer doesn’t work with TensorFlow models, but you can still train a Transformers model implemented in TensorFlow with Keras. Transformers TensorFlow models are a standard tf.keras.Model, which is compatible with Keras’ compile and fit methods.
Load a model, tokenizer, and dataset for training.
"""
import tensorflow as tf
# Check for GPU availability
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        # Configure TensorFlow to use the first GPU
        tf.config.set_visible_devices(gpus[0], 'GPU')
        logical_gpus = tf.config.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPU")
        print("GPU is available and configured for TensorFlow.")
    except RuntimeError as e:
        # Visible devices must be set before GPUs have been initialized
        print(e)
        print("Could not configure GPU. TensorFlow might run on CPU.")
else:
    print("No GPU available. TensorFlow will run on CPU.")
from transformers import TFAutoModelForSequenceClassification, AutoTokenizer
model = TFAutoModelForSequenceClassification.from_pretrained("distilbert/distilbert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
"""Create a function to tokenize the text and convert it into TensorFlow tensors. Apply this function to the whole dataset with the map method."""
def tokenize_dataset(dataset):
    return tokenizer(dataset["text"])

dataset = dataset1.map(tokenize_dataset)
"""Transformers provides the prepare_tf_dataset() method to collate and batch a dataset."""
tf_dataset = model.prepare_tf_dataset(
    dataset["train"], batch_size=16, shuffle=True, tokenizer=tokenizer
)
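"""prepare_tf_dataset() returns a standard tf.data.Dataset. A quick look at one batch, assuming the default behavior of splitting the labels into the second element of each (features, labels) pair."""
for features, labels in tf_dataset.take(1):
    print({name: tensor.shape for name, tensor in features.items()})  # e.g. input_ids, attention_mask
    print(labels.shape)  # (batch_size,)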
"""Finally, call compile to configure the model for training and fit to start."""
# Commented out IPython magic to ensure Python compatibility.
# %%script echo Disabled
#
# from tensorflow.keras.optimizers import Adam
#
# model.compile(optimizer="adam")
# model.fit(tf_dataset)
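"""Transformers TensorFlow models can compute their loss internally, so compile() does not need a loss argument. A minimal sketch with an explicit optimizer (the learning rate is illustrative), kept commented out like the cell above."""
# from tensorflow.keras.optimizers import Adam
# model.compile(optimizer=Adam(learning_rate=3e-5))  # no loss passed; the model computes it internally
# model.fit(tf_dataset, epochs=2)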
"""Next steps
Now that you have a better understanding of Transformers and what it offers, it’s time to keep exploring and learning what interests you the most.
Base classes: Learn more about the configuration, model and processor classes. This will help you understand how to create and customize models, preprocess different types of inputs (audio, images, multimodal), and how to share your model.
Inference: Explore the Pipeline further, learn how to run inference and chat with LLMs and agents, and see how to optimize inference for your machine learning framework and hardware.
Training: Study the Trainer in more detail, as well as distributed training and optimizing training on specific hardware.
Quantization: Reduce memory and storage requirements with quantization and speed up inference by representing weights with fewer bits.
Resources: Looking for end-to-end recipes for how to train and run inference with a model for a specific task? Check out the task recipes!
"""