`which python3` and then `which pip3`. Those commands should return a file path to your Anaconda installation.
`conda` is Anaconda’s package manager. Make sure that your conda installation worked by running `conda --version`. If that returns a version number, then you have installed Anaconda3 and conda correctly.
Every data science project you do will require some combination of external libraries, sometimes with specific versions that differ from the specific versions you used for other projects. If you were to have a single Python installation, these libraries would conflict and cause you all sorts of problems.
The standard solution is to use virtual environments, which are sandboxed Python environments that maintain their own versions of Python libraries (and, depending on how you set up the environment, of Python itself).
As a matter of good discipline, you should always work in a virtual environment, and never use the “base” Python installation.
Source: “Data Science from Scratch - 2nd Ed,” pages 37, 24
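One way to check that you are not working in the base installation is to inspect the `CONDA_DEFAULT_ENV` variable, which conda sets when an environment is activated. The snippet below is a minimal sketch for a POSIX shell; the messages it prints are illustrative only.

```shell
# CONDA_DEFAULT_ENV is set by `conda activate`; it is unset when no environment is active.
active_env="${CONDA_DEFAULT_ENV:-}"

if [ -z "$active_env" ]; then
    echo "No conda environment is active."
elif [ "$active_env" = "base" ]; then
    echo "You are in the base environment; activate a project environment instead."
else
    echo "Active environment: $active_env"
fi
```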
Create an `environment.yml` file in your project root directory
You can create and activate virtual environments with conda. (See Managing environments.)
First, create an `environment.yml` file in your project root directory. Here is an example of an `environment.yml` file:
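The exact contents depend on your project, so the following is only a minimal sketch of what such a file might contain. The environment name `my-project-env`, the Python version, and the package list are placeholders, not the original example; `jupyterlab` and `ipykernel` are included because the rest of this guide relies on them. The snippet writes the file from the shell so that it can be pasted into a terminal opened in your project root.

```shell
# Write a hypothetical environment.yml into the current directory.
# NOTE: this overwrites any existing environment.yml.
# All names and versions below are placeholders; adjust them for your project.
cat > environment.yml <<'EOF'
name: my-project-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - jupyterlab
  - ipykernel
  - pandas
EOF

# Show the file that was written.
cat environment.yml
```

The `name` field is what conda uses as the name of the virtual environment, so it is the value to look for when you list or activate environments later.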
What is `ipykernel`?
The Jupyter Notebook and other frontends automatically ensure that the IPython kernel is available. However, if you want to use a kernel with a different version of Python, or in a virtualenv or conda environment, you’ll need to install that manually. (Source: Installing the IPython kernel)
The `environment.yml` file assumes that you are using JupyterLab instead of Jupyter Notebooks, so you can run it inside a virtual environment in a fairly automated and easy way. I don’t think that is possible with Jupyter Notebooks, but I could be wrong.
NOTE: If you need to use Jupyter Notebooks (instead of JupyterLab), then I think you will have to install `ipykernel` and run Jupyter Notebooks inside a virtual environment in a different (and more manual) way. This process is detailed in the Preface of the book “Data Science for Marketing Analytics”.
For reference, here is an `environment.yml` file that could be used with a Python backend for web development, if you are not using Docker as your development environment:
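As a rough illustration of the shape such a file could take, here is a sketch under stated assumptions. The directory `web-backend`, the environment name `web-backend-env`, the Python version, and the choice of `flask` are all hypothetical placeholders, not the original listing. The nested `pip:` entry is the standard way to list pip-installed packages in a conda environment file.

```shell
# Write a hypothetical environment.yml for a web backend project.
# Every name and version here is a placeholder.
mkdir -p web-backend
cat > web-backend/environment.yml <<'EOF'
name: web-backend-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pip
  - pip:
      - flask
EOF

cat web-backend/environment.yml
```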
In a terminal window, `cd` into your project directory and run `conda env create -f environment.yml`.
This creates the virtual environment that is specified in your `environment.yml` file. If you see a prompt asking you to confirm before proceeding, type `y` and press Enter to continue creating the environment. Depending on your system configuration, it may take a while for the process to complete.
To confirm that the environment was created, run `conda env list` in your terminal.
You should see the name of your virtual environment in that list, which is the `name` specified in your `environment.yml` file.
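The create-and-verify steps above can be combined into one snippet that is safe to run more than once. This is a sketch under assumptions: `my-project-env` is a placeholder for the `name` in your `environment.yml`, and the `grep` pattern assumes that `conda env list` prints each environment name at the start of a line.

```shell
env_name="my-project-env"   # hypothetical; must match `name:` in environment.yml

if ! command -v conda >/dev/null 2>&1; then
    echo "conda not found on PATH" >&2
elif conda env list | grep -q "^${env_name}[[:space:]]"; then
    # The environment is already listed, so there is nothing to create.
    echo "Environment '${env_name}' already exists"
else
    # Create the environment described by environment.yml in the current directory.
    conda env create -f environment.yml || echo "Environment creation failed" >&2
fi
```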
There are a few different notebook options that you can use. These instructions show you how to use Jupyter Notebooks inside VS Code.
NOTE: Make sure that you have the Polyglot Notebooks extension installed in VS Code. This extension provides support for Jupyter Notebooks inside of VS Code.
1. Create a file named `<file-name>.ipynb`. VS Code will open the file inside a notebook.
2. Click `Select Kernel`. (NOTE: That button looks more like a label than a button.)
3. Choose `Python Environments...`, then look for the name of the virtual environment that you created previously and select it.
4. Confirm that `Select Kernel` has been replaced by the name of the virtual environment that you selected.
There are a few different notebook options that you can use. These instructions show you how to use JupyterLab inside your virtual environment.
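Activate your virtual environment with `conda activate <env-name>`, where `<env-name>` is the `name` specified in your `environment.yml` file.
Once your virtual environment has been activated, your command prompt should be prefixed with the name of your virtual environment in parentheses.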
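In an interactive terminal, `conda activate` usually works as is. Inside a shell script, conda's shell functions typically have to be loaded first. The following is a minimal sketch that assumes a bash shell and uses the placeholder name `my-project-env`.

```shell
env_name="my-project-env"   # hypothetical; use the name from your environment.yml

if command -v conda >/dev/null 2>&1; then
    # Load conda's shell functions so that `conda activate` works in a script (bash assumed).
    eval "$(conda shell.bash hook)"
    conda activate "$env_name" || echo "Could not activate '${env_name}'" >&2
else
    echo "conda not found on PATH" >&2
fi
```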
Run `jupyter-lab` inside your activated virtual environment.
When you run `jupyter-lab`, the JupyterLab server will run and an instance of JupyterLab will open up in a browser window. One indicator that you are running JupyterLab from inside a virtual environment is that the terminal output of `jupyter-lab` should include two lines indicating that JupyterLab is being served from the `/path/to/anaconda3/envs/` directory, which is where the virtual environments are stored.
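A complementary check, in the spirit of the `which python3` test from the installation step, is to look at where the `jupyter-lab` executable resolves from. The sketch below assumes that conda environments live under a directory named `envs/`, as in the path above.

```shell
# Resolve the jupyter-lab executable on PATH (empty if it is not installed).
lab_path="$(command -v jupyter-lab || true)"

case "$lab_path" in
    "")       echo "jupyter-lab was not found on PATH" ;;
    */envs/*) echo "jupyter-lab comes from a conda environment: $lab_path" ;;
    *)        echo "jupyter-lab is outside any envs/ directory: $lab_path" ;;
esac
```

Run it with your virtual environment activated; if the path does not contain `envs/`, you are probably using the base installation's JupyterLab.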
You may need to update your environment for a variety of reasons. For example, a dependency may have released a new version, or you may need to add or remove a package.
If any of these occur, all you need to do is update the contents of your `environment.yml` file accordingly and then run `conda env update --file environment.yml --prune` in your terminal.
Note: The `--prune` option causes conda to remove any dependencies that are no longer required from the environment.
When you are done working on a particular project, you can deactivate the virtual environment with `conda deactivate`. (See Deactivating an environment.)
First make sure your environment is deactivated. Then run `conda remove --name <env-name> --all`, replacing `<env-name>` with the name of your environment.
You can verify that your virtual environment has been deleted by running `conda env list`.
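The removal and the follow-up check can be sketched together as follows. This assumes the placeholder name `my-project-env` and uses `conda remove --name ... --all`, which is one of the ways conda offers to delete an environment. The `grep` pattern assumes that `conda env list` prints each environment name at the start of a line.

```shell
env_name="my-project-env"   # hypothetical; replace with your environment's name

if ! command -v conda >/dev/null 2>&1; then
    echo "conda not found on PATH" >&2
else
    # conda may ask for confirmation before deleting the environment.
    conda remove --name "$env_name" --all || echo "Removal did not complete" >&2

    # The name should no longer appear in the list of environments.
    if conda env list | grep -q "^${env_name}[[:space:]]"; then
        echo "Environment '${env_name}' still exists"
    else
        echo "Environment '${env_name}' has been removed"
    fi
fi
```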
Install JupyterLab with `pip` by running `pip install jupyterlab`.
Note: If you install JupyterLab with conda or mamba, it is recommended to use the `conda-forge` channel.
Check that JupyterLab installed correctly by running `jupyter lab --version`.
Once installed, launch JupyterLab with `jupyter lab`.