GopherCon 2019: Machine Learning & AI with Go Workshop
These are some notes from my experiences at GopherCon 2019. I don’t expect these will be laid out in any particularly useful way; I am mostly taking them so I can remember some of the bits I found most useful in the future.
Presenter: Daniel Whitenack
Introduction to ML/AI
Benefits of Go for ML/AI
 Type safety
 Performance pretty good
 Easy concurrency
Uses of AI in the World
 Classification: input of images/text, output of labels, bounding boxes, etc.
 Control systems (self-driving): input of images, output of control deltas
 Translation: input of text, output of more text
All basically just input > ML model > output (data transformation)
 Input data == features
 Output data == labels, responses
 ML model is just a function
ML Models
 Definitions – equations, expressions, conditions (if an image is mostly color C, then a cat)
 Parameters – weights, biases (the color C)
 Hyperparameters – parameters that we choose but don’t subject to training (kind of a part of model selection)
 Making ML/AI is basically trial and error to set parameter values
Two Major pieces to ML/AI
(at least supervised ML/AI)
 Inference / prediction – using the model
 Training – generating the model
Training
 Known Results (labels/responses for set inputs)
 Automated trial and error to find the “best” parameter value(s)
Model Selection
 How do we pick which definition is the best? Trial and error / domain knowledge
Machine Learning vs. AI
 Not a great answer to this – not well differentiated
Common Blockers

Getting the data
 Need annotations / known outputs for the training and evaluation data

Overfitting
 Only works well on the data it knows about, not novel data
 Set aside a validation set
  Can still overfit on the validation set; one mitigation is to randomly select the validation set each time
  Can also have a truly separate holdout set that is never used in training
 Can always increase model complexity to decrease error too (a trap to watch for)

Kinds of ML/AI Problems
 Object recognition / Classification
 Prediction (customers etc > sales)
 Forecasting (last month sales > this month sales)
 Recommendation (netflix problem)
 Clustering (group users by “similarity”)
Model Artifacts

After training, we save a model artifact file
 some include both definition and parameters, others just parameters
 various formats
 newer format emerging: ONNX
Linear and Logistic Regression
Linear Regression

y = w * x + b (w == weights, b == biases)
 example: number of users (x) > actual sales (y)

pick initial values somehow
 maybe random, maybe pick 2 points and draw that line

loss function
 determines how good a line is
 example: absolute vertical distance

data normalization
 squish values to always be 0–1 (or some other known range)
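The 0–1 squishing described here is min-max normalization; a minimal sketch in Go (the function name is mine, not from the workshop code):

```go
package main

import "fmt"

// minMaxNormalize squishes values into the 0–1 range based on the
// observed min and max of the slice.
func minMaxNormalize(xs []float64) []float64 {
	min, max := xs[0], xs[0]
	for _, x := range xs {
		if x < min {
			min = x
		}
		if x > max {
			max = x
		}
	}
	out := make([]float64, len(xs))
	for i, x := range xs {
		out[i] = (x - min) / (max - min)
	}
	return out
}

func main() {
	fmt.Println(minMaxNormalize([]float64{10, 15, 20})) // [0 0.5 1]
}
```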

profiling data / looking for intuition
 for many-x worlds when wanting to pick a single x, try graphing all the pairs (gcml/linear_regression/example1)

before doing learning, reminder to pick out test data (gcml/linear_regression/example2); might want to try to ensure that test data is representative and expand as necessary

Stochastic Gradient Descent training method (gcml/linear_regression/example3, gcml/linear_regression/example4, gcml/linear_regression/example5 (adds multilinear regression))
 epochs: number of training iterations (# of times through the training data)
 gradient: more or less the derivative of goodness – move parameters in the direction of less error
  derivatives of the error/loss w.r.t. each parameter, adjusted w/ the learning rate
 learning rate: hyperparameter that helps prevent huge jumps
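The SGD loop for simple linear regression can be sketched like this (my own minimal version, not the workshop's example code; assumes a squared-error loss):

```go
package main

import "fmt"

// sgdLinear fits y = w*x + b by stochastic gradient descent with a
// squared-error loss, updating after each sample for `epochs` passes
// through the data.
func sgdLinear(xs, ys []float64, lr float64, epochs int) (w, b float64) {
	for e := 0; e < epochs; e++ {
		for i := range xs {
			pred := w*xs[i] + b
			err := pred - ys[i]
			// Gradients of (pred-y)^2 w.r.t. w and b, scaled by the
			// learning rate to prevent huge jumps.
			w -= lr * 2 * err * xs[i]
			b -= lr * 2 * err
		}
	}
	return w, b
}

func main() {
	// Noise-free data from y = 2x + 1; SGD should recover w≈2, b≈1.
	xs := []float64{0, 0.25, 0.5, 0.75, 1}
	ys := []float64{1, 1.5, 2, 2.5, 3}
	w, b := sgdLinear(xs, ys, 0.1, 1000)
	fmt.Printf("w=%.2f b=%.2f\n", w, b)
}
```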


Evaluate data (gcml/linear_regression/example6)
 test data set evaluated by RMSE (root mean squared error); when the loss function is MSE (mean squared error), taking the root gets back into the units of the prediction

multilinear regression might not get you more than simple linear sometimes, but it might

might want to unnormalize errors to better understand error numbers

Logistic Regression

Pretty similar to linear (gcml/logistic_regression)
 Often used for classification where we need a step-function-like thing

Logistic function:
 1 / (1 + e^-(wx+b)) = 1 / (1 + e^-b * e^-wx)
 Inflection (midpoint) at x = -b/w, i.e. where wx + b = 0
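A quick sanity check of that inflection point in Go (my own sketch): the logistic output is exactly 0.5 where wx + b = 0, i.e. at x = -b/w.

```go
package main

import (
	"fmt"
	"math"
)

// logistic is 1 / (1 + e^-(wx+b)).
func logistic(w, b, x float64) float64 {
	return 1 / (1 + math.Exp(-(w*x + b)))
}

func main() {
	w, b := 3.0, -6.0
	// The midpoint/inflection sits at x = -b/w = 2, where the
	// exponent wx+b is zero and the output is exactly 1/2.
	fmt.Println(logistic(w, b, -b/w)) // 0.5
}
```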
Data cleaning is often necessary in the real world (gcml/logistic_regression/example2)

Intuition generation again (gcml/logistic_regression/example3)

Don’t forget to create test/training splits (gcml/logistic_regression/example4)

Training (gcml/logistic_regression/example5)

Validation (gcml/logistic_regression/example6)
 Accuracy – how many things did I get right?
 Alternatives: precision, recall, sensitivity, AUC, false pos/neg, etc.
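Accuracy, precision, and recall from predicted vs. true binary labels can be sketched like so (my own helper, not workshop code):

```go
package main

import "fmt"

// metrics computes accuracy, precision, and recall for binary
// predictions against true labels, from the confusion-matrix counts.
func metrics(pred, truth []bool) (acc, prec, rec float64) {
	var tp, tn, fp, fn float64
	for i := range pred {
		switch {
		case pred[i] && truth[i]:
			tp++ // true positive
		case pred[i] && !truth[i]:
			fp++ // false positive
		case !pred[i] && truth[i]:
			fn++ // false negative
		default:
			tn++ // true negative
		}
	}
	acc = (tp + tn) / (tp + tn + fp + fn)
	prec = tp / (tp + fp)
	rec = tp / (tp + fn)
	return
}

func main() {
	pred := []bool{true, true, false, false}
	truth := []bool{true, false, true, false}
	acc, prec, rec := metrics(pred, truth)
	fmt.Println(acc, prec, rec) // 0.5 0.5 0.5
}
```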

goml package to do a lot of this for you (gcml/logistic_regression/example7)
Neural Networks and Deep Learning
 Gorgonia: Go library along the lines of TensorFlow / Theano
Neural Networks

Semi-black-box neurons acting as mini-models

“With enough parameters we can model just about any relationship”
 Pile up logistic regressions (and other such things) to give enough freedom for more things

Terminology
 Input layer
 Hidden layers
 Output layer
 Feed forward – generate predictions
 Backpropagation – calculate error, then adjust parameters
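The feed-forward step above, for a single hidden layer with sigmoid activations, can be sketched like this (my own minimal version; real code would use Gorgonia or similar, and would add backpropagation for training):

```go
package main

import (
	"fmt"
	"math"
)

// sigmoid is the logistic activation function.
func sigmoid(x float64) float64 { return 1 / (1 + math.Exp(-x)) }

// feedForward runs inputs through one hidden layer and an output
// layer, both with sigmoid activations. w1[j][i] connects input i to
// hidden neuron j; w2[k][j] connects hidden neuron j to output k.
func feedForward(in []float64, w1, w2 [][]float64, b1, b2 []float64) []float64 {
	hidden := make([]float64, len(w1))
	for j := range w1 {
		sum := b1[j]
		for i, x := range in {
			sum += w1[j][i] * x
		}
		hidden[j] = sigmoid(sum)
	}
	out := make([]float64, len(w2))
	for k := range w2 {
		sum := b2[k]
		for j, h := range hidden {
			sum += w2[k][j] * h
		}
		out[k] = sigmoid(sum)
	}
	return out
}

func main() {
	out := feedForward(
		[]float64{1, 0},                       // input layer values
		[][]float64{{0.5, -0.5}, {-0.5, 0.5}}, // input→hidden weights
		[][]float64{{1, -1}},                  // hidden→output weights
		[]float64{0, 0},                       // hidden biases
		[]float64{0},                          // output bias
	)
	fmt.Printf("%.3f\n", out[0])
}
```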

Architecture choice is usually finding one that someone found has worked well

Iris flower classification example (gcml/neural_networks/example{1,2}); uses “one-hot” encoding of the correct species
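One-hot encoding maps each class label to a vector with a single 1, so the network's output layer can have one neuron per class (my own sketch):

```go
package main

import "fmt"

// oneHot returns a vector of length n with a 1 at index class and
// 0 everywhere else.
func oneHot(class, n int) []float64 {
	v := make([]float64, n)
	v[class] = 1
	return v
}

func main() {
	// e.g. three iris species: setosa=0, versicolor=1, virginica=2
	fmt.Println(oneHot(1, 3)) // [0 1 0]
}
```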
Deep Learning

As used here: pre-trained models that we might tweak or just use to solve problems

TensorFlow model trained from Python, used in Go (gcml/deep_learning/example1) for object identification; can be pretty verbose

Using GoCV / OpenCV to interface w/ a TensorFlow model (gcml/deep_learning/example2)
 Using MachineBox to do classification via a REST service
ML Pipelines with Pachyderm
 Pachyderm seems to make ML pipeline work pretty darn efficient and painless, but that is definitely just first impression