ENVIRONMENT SETUP
Preparation makes perfect
Before getting into the actual programming lessons, we first need to set up our coding environment.
Python is the most popular tool for machine learning, and it is what we will be using throughout this curriculum. Now you might be wondering: why use Python, a high-level programming language, for machine learning? Wouldn’t things run faster if we used C++?
It turns out that we can run our machine learning programs in Python nearly as fast as we could in C++, and the simplicity of Python makes it an attractive choice. In a typical deep learning program, almost all of the computational burden comes from large matrix multiplications, and Python hands that work off to compiled libraries. One of these is NumPy, a scientific computing library that offers a Python interface for low-level, fast math operations, especially on matrices and other linear algebra objects. The same is true for TensorFlow, the deep learning package that we will be using throughout most of our workshops. TensorFlow provides a Python interface for fast deep learning operations that can be performed on either the CPU or GPU (the GPU is a lot faster at matrix multiplications than the CPU, making it valuable for deep learning applications). Finally, we will also install scikit-learn, a general machine learning package for Python.
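To make the speed argument concrete, here is a minimal sketch that multiplies the same two matrices with a plain Python triple loop and with NumPy's np.dot. The matrix size n, the helper names matmul_pure_python and time_call, and the random seed are assumptions chosen for illustration, and the timings will differ from machine to machine. You can run it once NumPy is installed in the steps below.

```python
import time

import numpy as np


def matmul_pure_python(a, b):
    """Multiply two matrices stored as lists of lists, using plain Python loops."""
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for k in range(inner):
                total += a[i][k] * b[k][j]
            result[i][j] = total
    return result


def time_call(fn, *args):
    """Run fn(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - start


n = 60  # hypothetical matrix size, kept small so the pure Python loop finishes quickly
rng = np.random.RandomState(0)  # fixed seed so repeated runs use the same matrices
a = rng.rand(n, n)
b = rng.rand(n, n)

# Same multiplication, two ways: interpreted loops vs. NumPy's compiled routine.
slow, slow_seconds = time_call(matmul_pure_python, a.tolist(), b.tolist())
fast, fast_seconds = time_call(np.dot, a, b)

print("pure Python: {:.6f} s".format(slow_seconds))
print("NumPy:       {:.6f} s".format(fast_seconds))
print("results agree:", np.allclose(slow, fast))
```

Both versions compute the same product; the only difference is whether the inner loops run in the Python interpreter or inside NumPy's compiled code.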
These instructions will walk you through what you will need to install. In summary, we will be installing Anaconda, TensorFlow and scikit-learn, all using Python 3.5. These instructions should work for both Windows and any Unix system.
- A lot of the functionality in Python comes from external packages. Anaconda is a package manager that we will use to manage our environments and versions of Python. Download the Python 3.6 version of Anaconda from here. Note that if you currently have Python installed, you may need to uninstall your existing installation first in order to avoid any conflicts during the Anaconda setup. Lastly, if you are on Windows, make sure you check the box “Add Anaconda to my PATH environment variable” during setup. It is not the default option, so be sure not to miss it!
- Check that Anaconda is installed by running the following in your terminal:
conda info
- Create your conda environment. This specifies a particular version of Python to use and acts as a separate container (apart from your root installation) that holds all of your Python packages. Run
conda create -n caispp python=3.5
in your terminal. This command tells conda to use Python 3.5 for our virtual environment and names the environment ‘caispp’.
- Activate your environment. This tells your terminal session to use the version of Python and the packages in the conda environment. On Windows, run
activate caispp
On Unix systems, run
source activate caispp
You should see your prompt change, with the name of the environment to the left of the input line. Make sure this environment is activated while doing the remaining steps.
- Make sure that pip is installed by running
pip --version
Pip is an easy-to-use package manager built for Python, and we will use it to install several of the packages we will need in the future. If pip is not installed, follow the instructions here.
- Install TensorFlow. On Windows, run
pip install --ignore-installed --upgrade tensorflow
On Unix-based systems, just run
pip install tensorflow
- Check that TensorFlow was actually installed. Start up a Python instance in your terminal with the command
python
Once in the Python instance, import TensorFlow using
import tensorflow as tf
At this point, you might see some warning logs or other messages, but as long as it didn’t give an error, you are good to go! You can now exit Python by typing exit(), or by pressing ctrl-d on Unix systems or ctrl-z followed by Enter on Windows.
- Install scikit-learn. Scikit-learn is available through conda, so we just need to enter into the command line:
conda install scikit-learn
Once again, test what we just installed. Start another Python instance in your terminal and try importing the package:
import sklearn
- NumPy should have been installed as a dependency of the other packages, but it is a good idea to confirm that it is installed and working. Go ahead and launch another Python instance and type the following code:
import numpy as np
Again, if you didn’t get an error, then NumPy was installed correctly.
- Final installations: exit out of your Python instance and run these commands in the command line:
conda install nb_conda
(to make our conda environment compatible with Jupyter Notebooks),
pip install matplotlib
(a plotting library for Python),
pip install pandas
(a data table library), and
pip install keras
Keras is a high-level deep learning library that sits on top of TensorFlow and makes it significantly easier to write your own neural networks in just a couple of lines of code.
- Before we start writing some code, let’s restart our conda environment so that we can be sure that all the installations are complete:
source deactivate caispp
(to deactivate our environment), and then
source activate caispp
(to reactivate it). On Windows, omit the source prefix, as in the activation step.
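The individual import checks above can also be done in one pass. The following is a minimal sketch, not part of the original instructions: the helper name check_packages and the REQUIRED list are hypothetical, and the module names are simply the ones imported or installed in the steps above. Save it to a file and run it with python while the caispp environment is active.

```python
import importlib

# Module names for the packages installed in the steps above.
# Note that scikit-learn is imported under the name "sklearn".
REQUIRED = ["tensorflow", "sklearn", "numpy", "matplotlib", "pandas", "keras"]


def check_packages(names):
    """Try to import each module; map its name to a version string, or None on failure."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing or broken installation is recorded instead of stopping the check.
            report[name] = None
        else:
            # Not every module exposes __version__, so fall back to a placeholder.
            report[name] = str(getattr(module, "__version__", "unknown"))
    return report


if __name__ == "__main__":
    for name, version in check_packages(REQUIRED).items():
        if version is None:
            print("{:<12} could not be imported".format(name))
        else:
            print("{:<12} OK (version {})".format(name, version))
```

Any package reported as "could not be imported" can be reinstalled with the matching command from the list above.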
That's all! Your environment setup is done.
Thank you for reading.