Friday, May 29, 2020

Deep Learning - Convolutional Neural Network

Convolutional Neural Networks (CNNs) are mainly used for image processing, for example to read checks or to recognize objects in photos. In this post I describe my own experience with CNNs on two use cases:

  • The MNIST dataset covers the case of recognizing handwritten digits.
  • The CIFAR-10 dataset addresses the case of recognizing objects in photos.

There are plenty of articles explaining CNNs; here I simply share my own understanding through examples.

An image is a story of pixels

The decomposition of a digit image into a network of neurons can be represented like this:
In the above image, the digit "8" is an image of 28 x 28 pixels, which amounts to 784 input features for the network. The simplest way to recognize the digit is to use a plain Multilayer Perceptron (MLP) without convolution. Even with a simple MLP, you can achieve an error rate below 2%.
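A minimal sketch of such an MLP in Keras (my own illustration, not the code from the post; the layer sizes and hyperparameters are assumptions):

# Minimal MLP on MNIST with Keras (illustrative sketch).
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

# Load the data and flatten each 28 x 28 image into 784 input features.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0
y_train, y_test = to_categorical(y_train), to_categorical(y_test)

# One hidden layer is already enough to get close to a 2% error rate.
model = Sequential([
    Dense(512, activation="relu", input_shape=(784,)),
    Dense(10, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=10, batch_size=128, validation_split=0.1)
print(model.evaluate(x_test, y_test))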

Convolution

Understanding Convolutional Neural Networks (CNNs) is a challenge, but working through the MNIST dataset as an example helped me. Both Keras and MXNet (an Apache project) provide code examples.
Convolution is defined as a mathematical operation describing a rule for how to merge two sets of information. Does that help you? Me, not really :-(
An image is worth a thousand words:
Image processing with a CNN is a bit like photography: when you apply different filters to the same picture, you get different results. In a CNN, you apply different filters and then pool the results together to produce your classification.
In the above image the filters help the processing by emphasizing parts of the image. For example, the first filter spots the upper part of the image by putting a specific weight on the first row of the matrix. There are other ways to enhance performance, such as "image augmentation" techniques, for example using the Keras image augmentation API. You can implement a concrete example with Jason Brownlee's book.
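To make the "filter then pool" idea concrete, here is a hedged sketch of a small CNN on MNIST that also uses the Keras ImageDataGenerator for image augmentation (my own illustration; the layer sizes and augmentation settings are assumptions):

# Small CNN plus Keras image augmentation (illustrative sketch).
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype("float32") / 255.0
y_train, y_test = to_categorical(y_train), to_categorical(y_test)

# Each Conv2D layer applies a bank of learned filters; pooling then shrinks the feature maps.
model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dense(10, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# Image augmentation: random rotations and shifts create new variants of each digit.
datagen = ImageDataGenerator(rotation_range=10, width_shift_range=0.1, height_shift_range=0.1)
model.fit(datagen.flow(x_train, y_train, batch_size=128), epochs=5, validation_data=(x_test, y_test))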

With a CNN built on a pre-trained model you get very good accuracy:



On the importance of pre-training the model

If you compare the two images above, you will notice that two types of networks were tested: ResNet20 and ResNet56. In the first case the CNN was run without pre-training, and in the second case with pre-trained weights. The conclusion is that you get much better accuracy with pre-training. The importance of pre-training was demonstrated in this paper. You also have an explanation about pre-training. Pre-training, quickly defined: you compute the weights of your CNN on a first dataset, and then you "transfer" this knowledge to the new dataset, which means you do not start from scratch on the second dataset.
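To illustrate the "transfer" idea, here is a sketch of transfer learning in Keras with a ResNet pre-trained on ImageNet (not the ResNet20/56 SageMaker setup shown in the screenshots; the model choice and the number of classes are assumptions):

# Transfer learning sketch: reuse ImageNet weights instead of starting from scratch.
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.models import Model
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense

base = ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained convolutional layers

# Add a new classification head for the second dataset (10 classes assumed here).
x = GlobalAveragePooling2D()(base.output)
outputs = Dense(10, activation="softmax")(x)
model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(new_x_train, new_y_train, ...)  # only the new head is trained at first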

During that day we worked on a CNN with "transfer learning", which means that the CNN model had been trained on different but correlated problems. We used Amazon SageMaker built-in algorithms to train our model incrementally.
Once the model was trained, we prepared it for inference using an Amazon SageMaker endpoint.
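A minimal sketch of calling such an endpoint for inference with boto3 (the endpoint name and image file are hypothetical):

import boto3

# Send raw image bytes to a deployed SageMaker endpoint.
runtime = boto3.client("sagemaker-runtime")
with open("dog.jpg", "rb") as f:
    payload = f.read()

response = runtime.invoke_endpoint(
    EndpointName="image-classifier-demo",  # hypothetical endpoint name
    ContentType="application/x-image",
    Body=payload,
)
print(response["Body"].read())  # the predicted class probabilities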

Training the model


In the above image you will notice that the MXNet autograd package is used.
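For reference, here is a minimal example of MXNet's autograd package (my own illustration, not the notebook from the screenshot):

# Record a computation imperatively, then back-propagate through it.
from mxnet import nd, autograd

x = nd.array([[1.0, 2.0], [3.0, 4.0]])
x.attach_grad()                 # ask MXNet to allocate space for the gradient of x

with autograd.record():         # record the operations as they execute
    y = 2 * x * x

y.backward()                    # back-propagate
print(x.grad)                   # dy/dx = 4x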

Two paradigms: imperative programming vs declarative programming

This distinction is important for the construction of your neural network: in the declarative style you first declare the different layers and, in a second phase, you compile the network after adding them all; in the imperative style the code simply executes as it is written.
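Here is a short sketch contrasting the two paradigms in MXNet: the declarative (symbolic) style, where you declare the layers and only then bind and compile the graph, versus the imperative Gluon style, where the code runs as it is written (layer sizes are arbitrary):

import mxnet as mx
from mxnet.gluon import nn

# Declarative style: declare every layer first, then bind/compile the whole graph.
data = mx.sym.Variable("data")
fc1 = mx.sym.FullyConnected(data, num_hidden=128, name="fc1")
act1 = mx.sym.Activation(fc1, act_type="relu", name="relu1")
out = mx.sym.SoftmaxOutput(mx.sym.FullyConnected(act1, num_hidden=10, name="fc2"), name="softmax")
module = mx.mod.Module(symbol=out)  # the graph is only compiled/bound at this stage

# Imperative style (Gluon): each layer executes immediately, like ordinary Python code.
net = nn.Sequential()
net.add(nn.Dense(128, activation="relu"), nn.Dense(10))
net.initialize()
print(net(mx.nd.random.uniform(shape=(1, 784))))  # runs eagerly, no separate compile step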

Inference and deployment

All the screenshots in this post were taken during a one-day training at AWS, and the models were executed and deployed on AWS EC2 instances, using Jupyter Notebook and Amazon Elastic Container Service.
Every application is now deployed in containers, the most important artefact being the container registry, which is operated with Amazon SageMaker. With AWS Lambda you only pay for execution time (from 6 cents to $27 per hour for 64 cores). You upload your static website containing your model to Amazon S3 (Amazon Simple Storage Service).
With AWS Lambda you don't have to manage infrastructure; you manage services. It's serverless and provides fully managed, highly available services.
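As a sketch of this serverless pattern, here is a Lambda handler that forwards an uploaded image to the SageMaker endpoint (the endpoint name and the event format are assumptions on my side):

import base64
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def handler(event, context):
    # The static website sends the image as a base64 string in the request body.
    image_bytes = base64.b64decode(event["body"])
    response = runtime.invoke_endpoint(
        EndpointName="image-classifier-demo",  # hypothetical endpoint name
        ContentType="application/x-image",
        Body=image_bytes,
    )
    probabilities = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(probabilities)}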

Amazon SageMaker Ground Truth is a capability that helps because it reduces the time needed to prepare the data.

Final result

The final result is a static website in which you can upload your picture, and the CNN detects the objects in the image for you.
The dog was the first candidate for classification. Then I added the photo of a cat taken from the internet. For both the dog and the cat, you can see that the objects are well identified. Then I uploaded an image from my own photo library, a ladybird on a leaf, and here you can see that the classification is far from satisfying.
My conclusion is that images from the internet are already known and trained on, but when it comes to discovering genuinely new images, the CNN has more difficulty. Hence the continued need for pre-training a CNN. And the conclusion of my conclusion: keep your private images safe.

Sunday, May 24, 2020

Victor Hugo's Les Misérables with an Artificial Intelligence twist

Based on an idea and source code from Jason Brownlee's book, I ran a recurrent neural network on Victor Hugo's "Les Misérables, Tome V: Jean Valjean".

The goal of the exercise is for the machine to learn the different sequences of the book so that it can generate new sentences by itself.

To do this, I had to retrieve the text of Victor Hugo's work. The text is available on the Gutenberg site.

The text contains no fewer than 625,332 characters, with a vocabulary of 65 distinct characters.

Once the text was downloaded locally, an LSTM (Long Short-Term Memory) neural network analyzed it over 20 epochs. On my machine the analysis took 6 hours and 33 minutes.
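For reference, here is a sketch of the character-level LSTM along the lines of the book's example (not my exact script; the file name and hyperparameters are assumptions):

# Character-level LSTM trained to predict the next character of the novel.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.utils import to_categorical

raw_text = open("les_miserables_tome5.txt", encoding="utf-8").read().lower()  # hypothetical file name
chars = sorted(set(raw_text))
char_to_int = {c: i for i, c in enumerate(chars)}

# Cut the text into sequences of 100 characters; each sequence predicts the next character.
seq_length = 100
dataX, dataY = [], []
for i in range(len(raw_text) - seq_length):
    dataX.append([char_to_int[c] for c in raw_text[i:i + seq_length]])
    dataY.append(char_to_int[raw_text[i + seq_length]])
X = np.reshape(dataX, (len(dataX), seq_length, 1)) / float(len(chars))
y = to_categorical(dataY)

model = Sequential([LSTM(256, input_shape=(X.shape[1], X.shape[2])), Dense(len(chars), activation="softmax")])
model.compile(loss="categorical_crossentropy", optimizer="adam")

# Save the weights of the epoch with the lowest loss so that generation can reuse them later.
checkpoint = ModelCheckpoint("weights-{epoch:02d}-{loss:.4f}.hdf5", monitor="loss", save_best_only=True)
model.fit(X, y, epochs=20, batch_size=128, callbacks=[checkpoint])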

Once the analysis is done, we retrieve from it the epoch with the lowest loss. In my run, that was epoch 20/20:


The CPU activity during those 6 hours and 33 minutes is intense:




Then we have the model generate new sentences, this time without having to train it again (phew, no 6 hour 33 minute wait). This is possible by reloading the previously trained model.
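Here is a sketch of that generation step, reusing the names from the training sketch above (the weights file name is hypothetical):

# Reload the best weights and generate text character by character.
import numpy as np

int_to_char = {i: c for c, i in char_to_int.items()}
model.load_weights("weights-20-1.2345.hdf5")    # weights of the lowest-loss epoch
pattern = dataX[np.random.randint(len(dataX))]  # random seed sequence taken from the text

generated = []
for _ in range(100):
    x = np.reshape(pattern, (1, len(pattern), 1)) / float(len(chars))
    prediction = model.predict(x, verbose=0)
    index = int(np.argmax(prediction))
    generated.append(int_to_char[index])
    pattern = pattern[1:] + [index]             # slide the input window forward by one character

print("".join(generated))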

The model can generate as many sentences as we like. Below is an excerpt of a few sentences generated by the program (in French, like the training text):

" il se dit qu'il était probablement dans l'égout des halles; que, s'il
choisissait la gauche et suiva "

" est-à-dire braves. quand on est amoureux comme un tigre, c'est bien le
moins qu'on se batte comme un "

" il y aura des nuages de pourpre et d'or au-dessus
de leur tête, se déclarent contents, et qui sont d "

"  voir. je vais lui dire que
monsieur fauchelevent est là.

--non. ne lui dites pas que c'est moi. di "

"  être
réprimés. l'homme probe s'y dévoue, et, par amour même pour cette foule,
il la combat. mais co "

" vait pas à refuser. pourtant celui qui avait la clef parlementa,
uniquement pour gagner du temps. il "

" tait arrivé avec cosette, le portier n'avait pu
s'empêcher de confier à sa femme cet aparté: je ne s "

Personally, I find the result impressive for a basic model.

The model takes a long time to run on my iMac because the machine has no NVIDIA graphics card. NVIDIA relies on CUDA, which is not available on the Mac.

One way to improve performance would be either a hardware solution, buying a machine built specifically for machine learning such as those from the manufacturer AIME, or a software solution such as PlaidML, a framework that runs Keras on a GPU using OpenCL instead of CUDA. The latter is a good alternative for running machine learning on a Mac without an NVIDIA card.
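For the record, here is a minimal sketch of how PlaidML plugs into Keras (assuming the plaidml-keras package is installed and plaidml-setup has been run to select the OpenCL device):

# Switch the Keras backend to PlaidML before importing Keras itself.
import plaidml.keras
plaidml.keras.install_backend()

import keras  # from here on, Keras runs on the OpenCL device chosen in plaidml-setup
print(keras.backend.backend())  # shows which backend Keras picked up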


Saturday, May 23, 2020

Deep Learning With Python Jason Brownlee


  • In this post, I give my feedback on another excellent book, "Deep Learning With Python" by Jason Brownlee. I recommend this book if you want to make progress on deep learning with real examples rather than complex math.



  • You will need a minimum of 15 days to read the book and run through the different examples provided with it.
  • This book helped me progress in my understanding of important concepts such as the Multilayer Perceptron, the use of the Keras library, Convolutional Neural Networks and Recurrent Neural Networks.

    • I particularly appreciated the chapter on sentiment prediction. It deals with word representations, which are part of Natural Language Processing (NLP). Jason references an interesting paper about the model used: "Learning Word Vectors for Sentiment Analysis".
    • The further I progressed through the book, the more I needed to run the code on an external computer, because some programs take more than an hour. For example, the CIFAR-10 image classification ran in one hour on my iMac whereas it took only six minutes on Google Colab using a GPU. The Colab platform is easy to access and you can use your own data files stored in Google Drive. Here is the link on how to upload your own files.
    • This kind of experience helps in understanding the notion of tensors: a generalization of matrices, represented using n-dimensional arrays. A vector is a one-dimensional or first-order tensor, and a matrix is a two-dimensional or second-order tensor. Python, together with the numpy and pandas libraries, is perfect for handling matrices.
    • I learned the different techniques for using Keras, the library that wraps TensorFlow. The KerasClassifier wrapper class takes a function that creates and returns your neural network model.
    • I was also made aware of how to reduce overfitting and how to tune hyperparameters such as batch size, number of epochs, learning rate decay, momentum, and the choice of a time-based or a drop-based learning rate schedule (see the sketch after this list). All these notions are clearly explained and demonstrated with code examples.
    • While reading the book I also subscribed to Jason's emails, as there are always related subjects that helped me understand topics such as discretization, time series prediction, whitening, Principal Component Analysis (PCA) and Zero-phase Component Analysis (ZCA).
    • I am using the Anaconda distribution, which comes with Spyder, a Python editor. This environment is perfect for me as it provides a stable setup with up-to-date libraries. Setting up Anaconda is explained in the appendix of the book.

    • I will certainly explore the Google Seedbank for other examples of deep learning.
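As an example of those schedules, here is a sketch of a drop-based learning rate schedule in the spirit of the book (the initial rate, drop factor and step size are assumptions):

# Drop-based schedule: halve the learning rate every 10 epochs.
import math
from tensorflow.keras.callbacks import LearningRateScheduler

def step_decay(epoch):
    initial_lr = 0.1
    drop = 0.5
    epochs_drop = 10
    return initial_lr * math.pow(drop, math.floor(epoch / epochs_drop))

lr_schedule = LearningRateScheduler(step_decay)
# model.fit(X, y, epochs=50, batch_size=28, callbacks=[lr_schedule])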

Saturday, May 16, 2020

Machine Learning

The background
When we talk about artificial intelligence, the topic is very broad. It includes robots whose sensors are connected to a brain, the analysis of large amounts of data, voice assistants, self-driving cars, face recognition, translation, decision support, and so on. The broader the topic, the more people want to discuss it. I have also noticed that many people are very sceptical about artificial intelligence. The better informed people are about artificial intelligence, the better their opinion of it.

Machine learning
The topic of "machine learning" is not easy to explain. But I believe that understanding it is important, because machine learning is the heart of artificial intelligence. The topic can be clarified in several ways, for example with the help of an internet search, with the history of computing since 1950, or with examples. For today's explanation, I will try with my own story.


My story
When I was a young engineer in the last century, I programmed computers with a specific goal: the computer had to carry out several actions according to my programs. Now, with machine learning, the goal is completely different: the computer learns from data, its food, what it has to do. The most important thing to remember is that the data matter most; they matter more than the algorithms, which have been known for more than 20 years. First the computer has to analyze the data, and there are several stages: load the data, define the model, compile the model, fit the model, evaluate the model, and finally make a prediction or classification. For example, I analyzed my own data about running, heart rate and blood pressure. This first stage is the most important one in machine learning. For the data analysis there are several techniques that come from statistics. After the data analysis the computer must, with the help of the algorithm, make predictions. And then comes "deep learning", a technique built on neural networks. I am now working on and training in deep learning, and that is my current project.
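Those stages can be put into a minimal Keras sketch (the data file, its columns and the model are hypothetical; the last column is assumed to be a 0/1 label):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

data = np.loadtxt("running_heart_rate.csv", delimiter=",")            # 1. load the data
X, y = data[:, :-1], data[:, -1]

model = Sequential([Dense(12, activation="relu", input_shape=(X.shape[1],)),
                    Dense(1, activation="sigmoid")])                   # 2. define the model
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])  # 3. compile the model
model.fit(X, y, epochs=100, batch_size=10)                             # 4. fit the model
print(model.evaluate(X, y))                                            # 5. evaluate the model
print(model.predict(X[:5]))                                            # 6. predict / classify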
What does that mean for me? Can I make predictions with my own data? Can I, for example, predict how many years of life I have left? That is only possible if I have data from other people who live in a similar way to me. Unfortunately, I do not have that data.

The example

COVID-19 PATHOGEN HOT SPOT NEAR WUHAN

A current application of machine learning is, for example, COVID-19. First, by analyzing large amounts of data, we can visualize the geographical spread of the pathogen; see the map "COVID-19 pathogen hot spot near Wuhan".
This information is important for decision makers. But humans have cognitive biases, and under pressure or stress they make mistakes. Machines are more rational than humans, and when machines are fed with a lot of correct data, the decision is better. Machine learning therefore plays an important role in helping people make better decisions.

Conclusion
Nowadays the amount of data is decisive, and today data are the fuel of artificial intelligence. That is why the worldwide battle for data is so intense.
Processing and analyzing this data in order to make predictions or to delegate human tasks is now possible thanks to the power of computers and the volume of available data.

Saturday, May 02, 2020

Machine Learning Mastery with R Jason Brownlee


  • I have just finished this excellent eBook by Jason Brownlee. For me, every chapter was exciting. The learning curve is progressive, and you finish with end-to-end projects.
  • The book explains the fundamentals of machine learning metrics and algorithms, how to spot the best machine learning algorithms, and how to analyze the data before applying any algorithm. What I liked is that it is not a book you read passively: you can implement everything and see the results directly on your machine.
  • Prior to this book, I followed the MOOC "Introduction à la statistique avec R", which gives you the basics of statistics and of the R language.
  • You will go through K-Nearest Neighbors (KNN), part of the non-linear algorithm family (few assumptions about the function to be modeled), together with Naive Bayes, Support Vector Machines (SVM), and Classification and Regression Trees (CART). You will also go through the linear algorithms (strong assumptions about the shape of the modeled function) such as linear regression, logistic regression, Linear Discriminant Analysis (LDA) and regularized regression.
  • You will compare the performance of machine learning algorithms, and you will tune them.
  • The metrics needed to analyze your data are also explained in the book, in plain English rather than complex mathematical equations: RMSE (Root Mean Squared Error), the average deviation of the predictions from the observations; R2, the coefficient of determination (1 is perfect, 0 is worst); accuracy for classification (the percentage of correctly classified instances out of all instances); and Log Loss, used to evaluate binary classification and more common for multi-class classification algorithms.
  • I have been using the R language and the RStudio tool to copy and paste the snippets of code from the book into my environment. See below my RStudio environment with an example extracted from the book: