Machine Learning with PyTorch and Scikit-Learn: Develop machine learning and deep learning models with Python
Year of publication: 2022
Authors: Raschka S., Liu Y., Mirjalili V. (foreword by Dzhulgakov D.)
Publisher: Packt Publishing
ISBN: 978-1-80181-931-2
Language: English
Format: PDF, EPUB
Quality: Publisher's layout or text (eBook)
Interactive table of contents: Yes
Number of pages: 771
Description: This book of the bestselling and widely acclaimed Python Machine Learning series is a comprehensive guide to machine and deep learning using PyTorch's simple-to-code framework.
Key Features
Learn applied machine learning with a solid foundation in theory
Clear, intuitive explanations take you deep into the theory and practice of Python machine learning
Fully updated and expanded to cover PyTorch, transformers, XGBoost, graph neural networks, and best practices
Book Description
Machine Learning with PyTorch and Scikit-Learn is a comprehensive guide to machine learning and deep learning with PyTorch. It acts as both a step-by-step tutorial and a reference you’ll keep coming back to as you build your machine learning systems.
Packed with clear explanations, visualizations, and examples, the book covers all the essential machine learning techniques in depth. While some books teach you only to follow instructions, this machine learning book teaches the principles that allow you to build models and applications for yourself.
Why PyTorch?
PyTorch is the Pythonic way to learn machine learning, making it easier to learn and simpler to code with. This book explains the essential parts of PyTorch and how to create models using popular libraries, such as PyTorch Lightning and PyTorch Geometric.
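As a small taste of what that "Pythonic" style looks like, here is a minimal, illustrative sketch (not code from the book; the layer sizes and the random dummy data are invented for this example) that defines a tiny network and runs a single training step in plain PyTorch:

# Illustrative sketch only (not from the book): a tiny model and one
# training step, to show the flavor of the PyTorch API the book covers.
import torch
import torch.nn as nn

# Small multilayer perceptron: 4 input features, 3 output classes
# (sizes chosen arbitrarily for this example).
model = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 3),
)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Dummy batch: 8 examples with 4 features each, plus integer class labels.
X = torch.randn(8, 4)
y = torch.randint(0, 3, (8,))

# One gradient-descent step: forward pass, loss, backward pass, update.
optimizer.zero_grad()
loss = loss_fn(model(X), y)
loss.backward()
optimizer.step()
print(f"loss after one step: {loss.item():.4f}")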
You will also learn about generative adversarial networks (GANs) for generating new data and training intelligent agents with reinforcement learning. Finally, this new edition is expanded to cover the latest trends in deep learning, including graph neural networks and large-scale transformers used for natural language processing (NLP).
This PyTorch book is your companion to machine learning with Python, whether you’re a Python developer new to machine learning or want to deepen your knowledge of the latest developments.
What you will learn
Explore frameworks, models, and techniques for machines to ‘learn’ from data
Use scikit-learn for machine learning and PyTorch for deep learning (a short illustrative scikit-learn sketch follows this list)
Train machine learning classifiers on images, text, and more
Build and train neural networks, transformers, and boosting algorithms
Discover best practices for evaluating and tuning models
Predict continuous target outcomes using regression analysis
Dig deeper into textual and social media data using sentiment analysis
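To make the list above concrete, the following is a minimal, illustrative scikit-learn example (not taken from the book) that trains and evaluates a classifier on the Iris dataset, the same dataset the book uses in its early chapters:

# Illustrative sketch only (not from the book): a basic scikit-learn
# workflow -- load a dataset, split it, fit a classifier, evaluate it.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Standardize the features, then fit a logistic regression classifier.
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.3f}")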
Who this book is for
If you know some Python and you want to use machine learning and deep learning, pick up this book. Whether you want to start from scratch or extend your machine learning knowledge, this is an essential resource.
Written for developers and data scientists who want to build practical machine learning and deep learning applications with Python and PyTorch. This Python book is ideal for anyone who wants to teach computers how to learn from data.
Working knowledge of the Python programming language, along with a good understanding of calculus and linear algebra, is a must.
Code on GitHub
Table of Contents
Preface xxiii
Chapter 1: Giving Computers the Ability to Learn from Data 1
Building intelligent machines to transform data into knowledge 1
The three different types of machine learning 2
Making predictions about the future with supervised learning • 3
Classification for predicting class labels • 4
Regression for predicting continuous outcomes • 5
Solving interactive problems with reinforcement learning • 6
Discovering hidden structures with unsupervised learning • 7
Finding subgroups with clustering • 8
Dimensionality reduction for data compression • 8
Introduction to the basic terminology and notations 9
Notation and conventions used in this book • 9
Machine learning terminology • 11
A roadmap for building machine learning systems 12
Preprocessing – getting data into shape • 13
Training and selecting a predictive model • 13
Evaluating models and predicting unseen data instances • 14
Using Python for machine learning 14
Installing Python and packages from the Python Package Index • 14
Using the Anaconda Python distribution and package manager • 15
Packages for scientific computing, data science, and machine learning • 16
Summary 17
Chapter 2: Training Simple Machine Learning Algorithms for Classification 19
Artificial neurons – a brief glimpse into the early history of machine learning 19
The formal definition of an artificial neuron • 20
The perceptron learning rule • 22
Implementing a perceptron learning algorithm in Python 25
An object-oriented perceptron API • 25
Training a perceptron model on the Iris dataset • 29
Adaptive linear neurons and the convergence of learning 35
Minimizing loss functions with gradient descent • 37
Implementing Adaline in Python • 39
Improving gradient descent through feature scaling • 43
Large-scale machine learning and stochastic gradient descent • 45
Summary 51
Chapter 3: A Tour of Machine Learning Classifiers Using Scikit-Learn 53
Choosing a classification algorithm 53
First steps with scikit-learn – training a perceptron 54
Modeling class probabilities via logistic regression 59
Logistic regression and conditional probabilities • 60
Learning the model weights via the logistic loss function • 63
Converting an Adaline implementation into an algorithm for logistic regression • 66
Training a logistic regression model with scikit-learn • 70
Tackling overfitting via regularization • 73
Maximum margin classification with support vector machines 76
Maximum margin intuition • 77
Dealing with a nonlinearly separable case using slack variables • 77
Alternative implementations in scikit-learn • 79
Solving nonlinear problems using a kernel SVM 80
Kernel methods for linearly inseparable data • 80
Using the kernel trick to find separating hyperplanes in a high-dimensional space • 82
Decision tree learning 86
Maximizing IG – getting the most bang for your buck • 88
Building a decision tree • 92
Combining multiple decision trees via random forests • 95
K-nearest neighbors – a lazy learning algorithm 98
Summary 102
Chapter 4: Building Good Training Datasets – Data Preprocessing 105
Dealing with missing data 105
Identifying missing values in tabular data • 106
Eliminating training examples or features with missing values • 107
Imputing missing values • 108
Understanding the scikit-learn estimator API • 109
Handling categorical data 111
Categorical data encoding with pandas • 111
Mapping ordinal features • 111
Encoding class labels • 112
Performing one-hot encoding on nominal features • 113
Optional: encoding ordinal features • 116
Partitioning a dataset into separate training and test datasets 117
Bringing features onto the same scale 119
Selecting meaningful features 122
L1 and L2 regularization as penalties against model complexity • 122
A geometric interpretation of L2 regularization • 123
Sparse solutions with L1 regularization • 125
Sequential feature selection algorithms • 128
Assessing feature importance with random forests 134
Summary 137
Chapter 5: Compressing Data via Dimensionality Reduction 139
Unsupervised dimensionality reduction via principal component analysis 139
The main steps in principal component analysis • 140
Extracting the principal components step by step • 142
Total and explained variance • 144
Feature transformation • 146
Principal component analysis in scikit-learn • 149
Assessing feature contributions • 152
Supervised data compression via linear discriminant analysis 154
Principal component analysis versus linear discriminant analysis • 154
The inner workings of linear discriminant analysis • 156
Computing the scatter matrices • 156
Selecting linear discriminants for the new feature subspace • 158
Projecting examples onto the new feature space • 161
LDA via scikit-learn • 162
Nonlinear dimensionality reduction and visualization 163
Why consider nonlinear dimensionality reduction? • 164
Visualizing data via t-distributed stochastic neighbor embedding • 165
Summary 169
Chapter 6: Learning Best Practices for Model Evaluation and Hyperparameter Tuning 171
Streamlining workflows with pipelines 171
Loading the Breast Cancer Wisconsin dataset • 172
Combining transformers and estimators in a pipeline • 173
Using k-fold cross-validation to assess model performance 175
The holdout method • 175
K-fold cross-validation • 176
Debugging algorithms with learning and validation curves 180
Diagnosing bias and variance problems with learning curves • 180
Addressing over- and underfitting with validation curves • 183
Fine-tuning machine learning models via grid search 185
Tuning hyperparameters via grid search • 186
Exploring hyperparameter configurations more widely with randomized search • 187
More resource-efficient hyperparameter search with successive halving • 189
Algorithm selection with nested cross-validation • 191
Looking at different performance evaluation metrics 193
Reading a confusion matrix • 193
Optimizing the precision and recall of a classification model • 195
Plotting a receiver operating characteristic • 198
Scoring metrics for multiclass classification • 200
Dealing with class imbalance • 201
Summary 203
Chapter 7: Combining Different Models for Ensemble Learning 205
Learning with ensembles 205
Combining classifiers via majority vote 209
Implementing a simple majority vote classifier • 209
Using the majority voting principle to make predictions • 214
Evaluating and tuning the ensemble classifier • 217
Bagging – building an ensemble of classifiers from bootstrap samples 223
Bagging in a nutshell • 224
Applying bagging to classify examples in the Wine dataset • 225
Leveraging weak learners via adaptive boosting 229
How adaptive boosting works • 229
Applying AdaBoost using scikit-learn • 233
Gradient boosting – training an ensemble based on loss gradients 237
Comparing AdaBoost with gradient boosting • 237
Outlining the general gradient boosting algorithm • 237
Explaining the gradient boosting algorithm for classification • 239
Illustrating gradient boosting for classification • 241
Using XGBoost • 243
Summary 245
Chapter 8: Applying Machine Learning to Sentiment Analysis 247
Preparing the IMDb movie review data for text processing 247
Obtaining the movie review dataset • 248
Preprocessing the movie dataset into a more convenient format • 248
Introducing the bag-of-words model 250
Transforming words into feature vectors • 250
Assessing word relevancy via term frequency-inverse document frequency • 252
Cleaning text data • 254
Processing documents into tokens • 256
Training a logistic regression model for document classification 258
Working with bigger data – online algorithms and out-of-core learning 260
Topic modeling with latent Dirichlet allocation 264
Decomposing text documents with LDA • 264
LDA with scikit-learn • 265
Summary 268
Chapter 9: Predicting Continuous Target Variables with Regression Analysis 269
Introducing linear regression 269
Simple linear regression • 270
Multiple linear regression • 271
Exploring the Ames Housing dataset 272
Loading the Ames Housing dataset into a DataFrame • 272
Visualizing the important characteristics of a dataset • 274
Looking at relationships using a correlation matrix • 276
Implementing an ordinary least squares linear regression model 278
Solving regression for regression parameters with gradient descent • 278
Estimating the coefficient of a regression model via scikit-learn • 283
Fitting a robust regression model using RANSAC 285
Evaluating the performance of linear regression models 288
Using regularized methods for regression 292
Turning a linear regression model into a curve – polynomial regression 294
Adding polynomial terms using scikit-learn • 294
Modeling nonlinear relationships in the Ames Housing dataset • 297
Dealing with nonlinear relationships using random forests 299
Decision tree regression • 300
Random forest regression • 301
Summary 304
Chapter 10: Working with Unlabeled Data – Clustering Analysis 305
Grouping objects by similarity using k-means 305
k-means clustering using scikit-learn • 305
A smarter way of placing the initial cluster centroids using k-means++ • 310
Hard versus soft clustering • 311
Using the elbow method to find the optimal number of clusters • 313
Quantifying the quality of clustering via silhouette plots • 314
Organizing clusters as a hierarchical tree 319
Grouping clusters in a bottom-up fashion • 320
Performing hierarchical clustering on a distance matrix • 321
Attaching dendrograms to a heat map • 325
Applying agglomerative clustering via scikit-learn • 327
Locating regions of high density via DBSCAN 328
Summary 334
Chapter 11: Implementing a Multilayer Artificial Neural Network from Scratch 335
Modeling complex functions with artificial neural networks 335
Single-layer neural network recap • 337
Introducing the multilayer neural network architecture • 338
Activating a neural network via forward propagation • 340
Classifying handwritten digits 343
Obtaining and preparing the MNIST dataset • 343
Implementing a multilayer perceptron • 347
Coding the neural network training loop • 352
Evaluating the neural network performance • 357
Training an artificial neural network 360
Computing the loss function • 360
Developing your understanding of backpropagation • 362
Training neural networks via backpropagation • 363
About convergence in neural networks 367
A few last words about the neural network implementation 368
Summary 368
Chapter 12: Parallelizing Neural Network Training with PyTorch 369
PyTorch and training performance 369
Performance challenges • 369
What is PyTorch? • 371
How we will learn PyTorch • 372
First steps with PyTorch 372
Installing PyTorch • 372
Creating tensors in PyTorch • 373
Manipulating the data type and shape of a tensor • 374
Applying mathematical operations to tensors • 375
Split, stack, and concatenate tensors • 376
Building input pipelines in PyTorch 378
Creating a PyTorch DataLoader from existing tensors • 378
Combining two tensors into a joint dataset • 379
Shuffle, batch, and repeat • 380
Creating a dataset from files on your local storage disk • 382
Fetching available datasets from the torchvision.datasets library • 386
Building an NN model in PyTorch 389
The PyTorch neural network module (torch.nn) • 390
Building a linear regression model • 390
Model training via the torch.nn and torch.optim modules • 394
Building a multilayer perceptron for classifying flowers in the Iris dataset • 395
Evaluating the trained model on the test dataset • 398
Saving and reloading the trained model • 399
Choosing activation functions for multilayer neural networks 400
Logistic function recap • 400
Estimating class probabilities in multiclass classification via the softmax function • 402
Broadening the output spectrum using a hyperbolic tangent • 403
Rectified linear unit activation • 405
Summary 406
Chapter 13: Going Deeper – The Mechanics of PyTorch 409
The key features of PyTorch 410
PyTorch’s computation graphs 410
Understanding computation graphs • 410
Creating a graph in PyTorch • 411
PyTorch tensor objects for storing and updating model parameters 412
Computing gradients via automatic differentiation 415
Computing the gradients of the loss with respect to trainable variables • 415
Understanding automatic differentiation • 416
Adversarial examples • 416
Simplifying implementations of common architectures via the torch.nn module 417
Implementing models based on nn.Sequential • 417
Choosing a loss function • 418
Solving an XOR classification problem • 419
Making model building more flexible with nn.Module • 424
Writing custom layers in PyTorch • 426
Project one – predicting the fuel efficiency of a car 431
Working with feature columns • 431
Training a DNN regression model • 435
Project two – classifying MNIST handwritten digits 436
Higher-level PyTorch APIs: a short introduction to PyTorch-Lightning 439
Setting up the PyTorch Lightning model • 440
Setting up the data loaders for Lightning • 443
Training the model using the PyTorch Lightning Trainer class • 444
Evaluating the model using TensorBoard • 445
Summary 449
Chapter 14: Classifying Images with Deep Convolutional Neural Networks 451
The building blocks of CNNs 451
Understanding CNNs and feature hierarchies • 452
Performing discrete convolutions • 454
Discrete convolutions in one dimension • 454
Padding inputs to control the size of the output feature maps • 457
Determining the size of the convolution output • 458
Performing a discrete convolution in 2D • 459
Subsampling layers • 463
Putting everything together – implementing a CNN 464
Working with multiple input or color channels • 464
Regularizing an NN with L2 regularization and dropout • 467
Loss functions for classification • 471
Implementing a deep CNN using PyTorch 473
The multilayer CNN architecture • 473
Loading and preprocessing the data • 474
Implementing a CNN using the torch.nn module • 476
Configuring CNN layers in PyTorch • 476
Constructing a CNN in PyTorch • 477
Smile classification from face images using a CNN 482
Loading the CelebA dataset • 483
Image transformation and data augmentation • 484
Training a CNN smile classifier • 490
Summary 497
Chapter 15: Modeling Sequential Data Using Recurrent Neural Networks 499
Introducing sequential data 499
Modeling sequential data – order matters • 500
Sequential data versus time series data • 500
Representing sequences • 500
The different categories of sequence modeling • 501
RNNs for modeling sequences 502
Understanding the dataflow in RNNs • 502
Computing activations in an RNN • 504
Hidden recurrence versus output recurrence • 506
The challenges of learning long-range interactions • 509
Long short-term memory cells • 511
Implementing RNNs for sequence modeling in PyTorch 513
Project one – predicting the sentiment of IMDb movie reviews • 513
Preparing the movie review data • 513
Embedding layers for sentence encoding • 517
Building an RNN model • 520
Building an RNN model for the sentiment analysis task • 521
Project two – character-level language modeling in PyTorch • 525
Preprocessing the dataset • 526
Building a character-level RNN model • 531
Evaluation phase – generating new text passages • 533
Summary 537
Chapter 16: Transformers – Improving Natural Language Processing with Attention Mechanisms 539
Adding an attention mechanism to RNNs 540
Attention helps RNNs with accessing information • 540
The original attention mechanism for RNNs • 542
Processing the inputs using a bidirectional RNN • 543
Generating outputs from context vectors • 543
Computing the attention weights • 544
Introducing the self-attention mechanism 544
Starting with a basic form of self-attention • 545
Parameterizing the self-attention mechanism: scaled dot-product attention • 549
Attention is all we need: introducing the original transformer architecture 552
Encoding context embeddings via multi-head attention • 554
Learning a language model: decoder and masked multi-head attention • 558
Implementation details: positional encodings and layer normalization • 559
Building large-scale language models by leveraging unlabeled data 561
Pre-training and fine-tuning transformer models • 561
Leveraging unlabeled data with GPT • 563
Using GPT-2 to generate new text • 566
Bidirectional pre-training with BERT • 569
The best of both worlds: BART • 572
Fine-tuning a BERT model in PyTorch 574
Loading the IMDb movie review dataset • 575
Tokenizing the dataset • 577
Loading and fine-tuning a pre-trained BERT model • 578
Fine-tuning a transformer more conveniently using the Trainer API • 582
Summary 586
Chapter 17: Generative Adversarial Networks for Synthesizing New Data 589
Introducing generative adversarial networks 589
Starting with autoencoders • 590
Generative models for synthesizing new data • 592
Generating new samples with GANs • 593
Understanding the loss functions of the generator and discriminator networks in a GAN model • 594
Implementing a GAN from scratch 596
Training GAN models on Google Colab • 596
Implementing the generator and the discriminator networks • 600
Defining the training dataset • 604
Training the GAN model • 605
Improving the quality of synthesized images using a convolutional and Wasserstein GAN 612
Transposed convolution • 612
Batch normalization • 614
Implementing the generator and discriminator • 616
Dissimilarity measures between two distributions • 624
Using EM distance in practice for GANs • 627
Gradient penalty • 628
Implementing WGAN-GP to train the DCGAN model • 629
Mode collapse • 633
Other GAN applications 635
Summary 635
Chapter 18: Graph Neural Networks for Capturing Dependencies in Graph Structured Data 637
Introduction to graph data 638
Undirected graphs • 638
Directed graphs • 639
Labeled graphs • 640
Representing molecules as graphs • 640
Understanding graph convolutions 641
The motivation behind using graph convolutions • 641
Implementing a basic graph convolution • 644
Implementing a GNN in PyTorch from scratch 648
Defining the NodeNetwork model • 649
Coding the NodeNetwork’s graph convolution layer • 650
Adding a global pooling layer to deal with varying graph sizes • 652
Preparing the DataLoader • 655
Using the NodeNetwork to make predictions • 658
Implementing a GNN using the PyTorch Geometric library 659
Other GNN layers and recent developments 665
Spectral graph convolutions • 665
Pooling • 667
Normalization • 668
Pointers to advanced graph neural network literature • 669
Summary 671
Chapter 19: Reinforcement Learning for Decision Making in Complex Environments 673
Introduction – learning from experience 674
Understanding reinforcement learning • 674
Defining the agent-environment interface of a reinforcement learning system • 675
The theoretical foundations of RL 676
Markov decision processes • 677
The mathematical formulation of Markov decision processes • 677
Visualization of a Markov process • 679
Episodic versus continuing tasks • 679
RL terminology: return, policy, and value function • 680
The return • 680
Policy • 682
Value function • 682
Dynamic programming using the Bellman equation • 684
Reinforcement learning algorithms 684
Dynamic programming • 685
Policy evaluation – predicting the value function with dynamic programming • 686
Improving the policy using the estimated value function • 686
Policy iteration • 687
Value iteration • 687
Reinforcement learning with Monte Carlo • 687
State-value function estimation using MC • 688
Action-value function estimation using MC • 688
Finding an optimal policy using MC control • 688
Policy improvement – computing the greedy policy from the action-value function • 689
Temporal difference learning • 689
TD prediction • 689
On-policy TD control (SARSA) • 691
Off-policy TD control (Q-learning) • 691
Implementing our first RL algorithm 691
Introducing the OpenAI Gym toolkit • 692
Working with the existing environments in OpenAI Gym • 692
A grid world example • 694
Implementing the grid world environment in OpenAI Gym • 694
Solving the grid world problem with Q-learning • 701
A glance at deep Q-learning 706
Training a DQN model according to the Q-learning algorithm • 706
Replay memory • 707
Determining the target values for computing the loss • 708
Implementing a deep Q-learning algorithm • 710
Chapter and book summary 714
Other Books You May Enjoy 719
Index 723