GopherCon 2019 - Machine Learning & AI with Go Workshop

conference, golang, gophercon2019, notes
https://github.com/dwhitena/gc-ml

These are some notes from my experiences at GopherCon 2019. I don’t expect these will be laid out in any particularly useful way; I am mostly taking them so I can remember some of the bits I found most useful in the future.


Presenter: Daniel Whitenack

Introduction to ML/AI

Benefits of Go for ML/AI

  • Type safety
  • Performance pretty good
  • Easy concurrency

Uses of AI in the World

  • Classification: input of images/text, output of labels or bounding boxes, etc.
  • Control systems (self-driving): input of images, output of control deltas
  • Translation: input of text, output of more text

All basically just input -> ML model -> output (data transformation)

  • Input data == features
  • Output data == labels, responses
  • ML model is just a function
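The “ML model is just a function” framing maps directly to Go: a trained model is nothing more than a function from features to a prediction. A minimal sketch (the weight/bias values here are made up for illustration):

```go
package main

import "fmt"

// A "model" is just a function: features in, prediction out.
type Model func(features []float64) float64

// newLinearModel returns a model closure over learned parameters
// (weights and a bias). The values passed in main are arbitrary.
func newLinearModel(weights []float64, bias float64) Model {
	return func(features []float64) float64 {
		y := bias
		for i, w := range weights {
			y += w * features[i]
		}
		return y
	}
}

func main() {
	model := newLinearModel([]float64{0.5, 2.0}, 1.0)
	fmt.Println(model([]float64{4.0, 3.0})) // 1.0 + 0.5*4 + 2.0*3 = 9
}
```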

ML Models

  • Definitions – equations, expressions, conditions (if an image is mostly color C, then a cat)
  • Parameters – weights, biases (the color C)
  • Hyperparameters – parameters that we choose but don’t subject to training (kind of a part of model selection)
  • Building ML/AI is basically trial and error to set parameter values

Two Major Pieces of ML/AI

(at least supervised ML/AI)

  • Inference / prediction – using the model
  • Training – generating the model

Training

  • Known Results (labels/responses for set inputs)
  • Automated trial and error to find the “best” parameter value(s)

Model Selection

  • How do we pick which definition is the best? Trial and error / domain knowledge

Machine Learning vs. AI

  • Not a great answer to this – not well differentiated

Common Blockers

  • Getting the data

    • Need annotations / known outputs for the training and evaluation data
  • Overfitting

    • Only works well on the data it knows about, not novel data
    • Set aside a validation set

    • Can still overfit on the validation set, though randomly re-selecting the validation set each time (as in cross-validation) mitigates this

    • Can have a really separate holdout set that is not used in training too

    • Can always increase model complexity to decrease training error too, at the risk of more overfitting

Kinds of ML/AI Problems

  • Object recognition / Classification
  • Prediction (customers etc -> sales)
  • Forecasting (last month sales -> this month sales)
  • Recommendation (the Netflix problem)
  • Clustering (group users by “similarity”)

Model Artifacts

  • After training, we save a model artifact file

    • some include both definition and parameters, other just parameters
    • various formats
    • newer format emerging: ONNX

Linear and Logistic Regression

Linear Regression

  • y = w * x + b (w == weights, b == biases)

    • example: number of users (x) -> actual sales (y)
  • pick initial values somehow

    • maybe random, maybe pick 2 points and draw that line
  • loss function

    • determines how good a line is
    • example: absolute vertical distance
  • data normalization

    • squish values to always be 0-1 (or some other known range)
  • profiling data / looking for intuition

    • for many-x worlds when wanting to pick a single x, try graphing all the pairs (gc-ml/linear_regression/example1)
  • before doing learning, reminder to pick out test data (gc-ml/linear_regression/example2)

    • might want to try and ensure that test data is representative and expand as necessary
  • Stochastic Gradient Descent training method (gc-ml/linear_regression/example3, gc-ml/linear_regression/example4, gc-ml/linear_regression/example5 (adds multi-linear regression))

    • epochs: number of training iterations (# of times through the training data)
    • gradient: more or less the derivative of goodness – move parameters in the direction of less error

    • derivatives of error loss wrt each parameter, adjusted w/ learning rate

    • learning rate: hyperparameter that helps prevent huge jumps

  • Evaluate data (gc-ml/linear_regression/example6)

    • test data set evaluated by RMSE (root mean squared error) when the loss function is MSE (mean squared error)

    • gets back into the units of the prediction

    • multi-linear regression sometimes doesn’t improve on simple linear regression, but it might

    • might want to un-normalize errors to better understand error numbers

Logistic Regression

  • Pretty similar to linear (gc-ml/logistic_regression)
  • Often used for classification where we need a step-function-like thing
  • Logistic (sigmoid) function: 1 / (1 + e^-(wx + b)) = 1 / (1 + e^-b * e^-wx)
  • Inflection at x = -b/w (set wx + b = 0 and solve for x)

  • Data cleaning is often necessary in real world (gc-ml/logistic_regression/example2)

  • Intuition generation again (gc-ml/logistic_regression/example3)

  • Don’t forget to create test/training splits (gc-ml/logistic_regression/example4)

  • Training (gc-ml/logistic_regression/example5)

  • Validation (gc-ml/logistic_regression/example6)

    • Accuracy – how many things did I get right?
    • Alternatives: precision, recall, sensitivity, AUC, false pos/neg, etc.
  • goml package to do a lot of this for you (gc-ml/logistic_regression/example7)

Neural Networks and Deep Learning

Neural Networks

  • Semi-black-box neurons acting as mini-models
  • “With enough parameters we can model just about any relationship”

    • Pile up logistic regressions (and other such things) to give enough freedom for more things
  • Terminology

    • Input layer
    • Hidden layers
    • Output layer
    • Feed forward – generate predictions
    • Backpropagation – calculate error, then adjust parameters
  • Architecture choice is usually finding one that someone found has worked well

  • Iris flower classification example (gc-ml/neural_networks/example{1,2})

    • uses “one-hot” encoding of correct species
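The “one-hot” encoding mentioned above is tiny in Go: each class index becomes a vector with a single 1 (e.g. the three iris species map to [1 0 0], [0 1 0], [0 0 1]):

```go
package main

import "fmt"

// oneHot encodes a class index as a vector with a single 1,
// which is the target format the network's output layer trains against.
func oneHot(class, numClasses int) []float64 {
	v := make([]float64, numClasses)
	v[class] = 1.0
	return v
}

func main() {
	for class := 0; class < 3; class++ {
		fmt.Println(oneHot(class, 3))
	}
	// [1 0 0]
	// [0 1 0]
	// [0 0 1]
}
```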

Deep Learning

  • As used here: pre-trained models that we might tweak or just use to solve problems
  • TensorFlow trained model from python used in go (gc-ml/deep_learning/example1) for object identification

    • can be pretty verbose
  • Using gocv / opencv to interface w/ tensorflow model (gc-ml/deep_learning/example2)

  • Using MachineBox to do classification via a REST service

ML Pipelines with Pachyderm

  • Pachyderm seems to make ML pipeline work pretty darn efficient and painless, but that is definitely just first impression