{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"execution": {},
"id": "view-in-github"
},
"source": [
"
"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Tutorial 3: Building and Evaluating Normative Encoding Models\n",
"\n",
"**Week 1, Day 5: Deep Learning**\n",
"\n",
"**By Neuromatch Academy**\n",
"\n",
"**Content creators**: Jorge A. Menendez, Yalda Mohsenzadeh, Carsen Stringer\n",
"\n",
"**Content reviewers**: Roozbeh Farhoodi, Madineh Sarvestani, Kshitij Dwivedi, Spiros Chavlis, Ella Batty, Michael Waskom\n",
"\n",
"**Production editors:** Spiros Chavlis"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Tutorial Objectives\n",
"\n",
"*Estimated timing of tutorial: 1 hour, 10 minutes*\n",
"\n",
"In this tutorial, we'll be using deep learning to build an encoding model of the visual system, and then compare its internal representations to those observed in neural data.\n",
"\n",
"Its parameters won't be directly optimized to fit the neural data. Instead, we will optimize its parameters to solve a particular visual task that we know the brain can solve. We therefore refer to it as a **\"normative\"** encoding model, since it is optimized for a specific behavioral task. It is the optimal model to solve the problem (optimal for the specified architecture). See Section 3 of the bonus tutorial to fit a convolutional neural network directly to neural data (an alternative approach for investigating encoding with deep neural networks).\n",
"\n",
"To then evaluate whether this normative encoding model is actually a good model of the brain, we'll analyze its internal representations and compare them to the representations observed in the mouse primary visual cortex. Since we understand exactly what the encoding model's representations are optimized to do, any similarities will hopefully shed light on why the representations in the brain look the way they do.\n",
"\n",
"More concretely, our goal will be learn how to:\n",
"* Visualize and analyze the internal representations of a deep network\n",
"* Quantify the similarity between distributed representations in a model and neural representations observed in recordings, using Representational Similarity Analysis (RSA)"
]
},
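{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"As a preview of the second objective, here is a minimal, self-contained sketch of the RSA logic (not the tutorial's exercise code). It assumes responses are stored as stimuli-by-units matrices; the variable names `resp_model` and `resp_v1` and the random data are placeholders for illustration only.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def rdm(resp):\n",
"  \"\"\"Correlation-distance RDM: 1 - correlation between the response\n",
"  patterns to each pair of stimuli. resp: n_stimuli x n_units array.\"\"\"\n",
"  return 1 - np.corrcoef(resp)\n",
"\n",
"# Placeholder response matrices standing in for model / V1 responses\n",
"rng = np.random.default_rng(0)\n",
"resp_model = rng.standard_normal((20, 50))   # 20 stimuli x 50 units\n",
"resp_v1 = rng.standard_normal((20, 100))     # 20 stimuli x 100 neurons\n",
"\n",
"rdm_model, rdm_v1 = rdm(resp_model), rdm(resp_v1)\n",
"\n",
"# Compare the two RDMs by correlating their off-diagonal entries\n",
"iu = np.triu_indices(rdm_model.shape[0], k=1)\n",
"similarity = np.corrcoef(rdm_model[iu], rdm_v1[iu])[0, 1]\n",
"print(f'RDM-RDM correlation: {similarity:.2f}')\n",
"```"
]
},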
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"remove-input"
]
},
"outputs": [],
"source": [
"# @markdown\n",
"from IPython.display import IFrame\n",
"from ipywidgets import widgets\n",
"out = widgets.Output()\n",
"with out:\n",
" print(f\"If you want to download the slides: https://osf.io/download/kwyvp/\")\n",
" display(IFrame(src=f\"https://mfr.ca-1.osf.io/render?url=https://osf.io/kwyvp/?direct%26mode=render%26action=download%26mode=render\", width=730, height=410))\n",
"display(out)"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Setup\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install and import feedback gadget\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Install and import feedback gadget\n",
"\n",
"!pip3 install vibecheck datatops --quiet\n",
"\n",
"from vibecheck import DatatopsContentReviewContainer\n",
"def content_review(notebook_section: str):\n",
" return DatatopsContentReviewContainer(\n",
" \"\", # No text prompt\n",
" notebook_section,\n",
" {\n",
" \"url\": \"https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab\",\n",
" \"name\": \"neuromatch_cn\",\n",
" \"user_key\": \"y1x3mpx5\",\n",
" },\n",
" ).render()\n",
"\n",
"\n",
"feedback_prefix = \"W1D5_T3\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "both",
"execution": {}
},
"outputs": [],
"source": [
"# Imports\n",
"import numpy as np\n",
"from scipy.stats import zscore\n",
"import matplotlib as mpl\n",
"from matplotlib import pyplot as plt\n",
"\n",
"import torch\n",
"from torch import nn, optim\n",
"from sklearn.decomposition import PCA\n",
"from sklearn.manifold import TSNE"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Figure Settings\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Figure Settings\n",
"import logging\n",
"logging.getLogger('matplotlib.font_manager').disabled = True\n",
"\n",
"%matplotlib inline\n",
"%config InlineBackend.figure_format='retina'\n",
"plt.style.use(\"https://raw.githubusercontent.com/NeuromatchAcademy/course-content/main/nma.mplstyle\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Plotting Functions\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Plotting Functions\n",
"\n",
"def show_stimulus(img, ax=None, show=False):\n",
" \"\"\"Visualize a stimulus\"\"\"\n",
" if ax is None:\n",
" ax = plt.gca()\n",
" ax.imshow(img, cmap=mpl.cm.binary)\n",
" ax.set_aspect('auto')\n",
" ax.set_xticks([])\n",
" ax.set_yticks([])\n",
" ax.spines['left'].set_visible(False)\n",
" ax.spines['bottom'].set_visible(False)\n",
" if show:\n",
" plt.show()\n",
"\n",
"\n",
"def plot_corr_matrix(rdm, ax=None, show=False):\n",
" \"\"\"Plot dissimilarity matrix\n",
"\n",
" Args:\n",
" rdm (numpy array): n_stimuli x n_stimuli representational dissimilarity\n",
" matrix\n",
" ax (matplotlib axes): axes onto which to plot\n",
"\n",
" Returns:\n",
" nothing\n",
"\n",
" \"\"\"\n",
" if ax is None:\n",
" ax = plt.gca()\n",
" image = ax.imshow(rdm, vmin=0.0, vmax=2.0)\n",
" ax.set_xticks([])\n",
" ax.set_yticks([])\n",
" cbar = plt.colorbar(image, ax=ax, label='dissimilarity')\n",
" if show:\n",
" plt.show()\n",
"\n",
"\n",
"def plot_multiple_rdm(rdm_dict):\n",
" \"\"\"Draw multiple subplots for each RDM in rdm_dict.\"\"\"\n",
" fig, axs = plt.subplots(1, len(rdm_dict),\n",
" figsize=(4 * len(resp_dict), 3.5))\n",
"\n",
" # Compute RDM's for each set of responses and plot\n",
" for i, (label, rdm) in enumerate(rdm_dict.items()):\n",
"\n",
" image = plot_corr_matrix(rdm, axs[i])\n",
" axs[i].set_title(label)\n",
" plt.show()\n",
"\n",
"\n",
"def plot_rdm_rdm_correlations(rdm_sim):\n",
" \"\"\"Draw a bar plot showing between-RDM correlations.\"\"\"\n",
" f, ax = plt.subplots()\n",
" ax.bar(rdm_sim.keys(), rdm_sim.values())\n",
" ax.set_xlabel('Deep network model layer')\n",
" ax.set_ylabel('Correlation of model layer RDM\\nwith mouse V1 RDM')\n",
" plt.show()\n",
"\n",
"\n",
"def plot_rdm_rows(ori_list, rdm_dict, rdm_oris):\n",
" \"\"\"Plot the dissimilarity of response to each stimulus with response to one\n",
" specific stimulus\n",
"\n",
" Args:\n",
" ori_list (list of float): plot dissimilarity with response to stimulus with\n",
" orientations closest to each value in this list\n",
" rdm_dict (dict): RDM's from which to extract dissimilarities\n",
" rdm_oris (np.ndarray): orientations corresponding to each row/column of RDMs\n",
" in rdm_dict\n",
"\n",
" \"\"\"\n",
" n_col = len(ori_list)\n",
" f, axs = plt.subplots(1, n_col, figsize=(4 * n_col, 4), sharey=True)\n",
"\n",
" # Get index of orientation closest to ori_plot\n",
" for ax, ori_plot in zip(axs, ori_list):\n",
" iori = np.argmin(np.abs(rdm_oris - ori_plot))\n",
"\n",
" # Plot dissimilarity curves in each RDM\n",
" for label, rdm in rdm_dict.items():\n",
" ax.plot(rdm_oris, rdm[iori, :], label=label)\n",
"\n",
" # Draw vertical line at stimulus we are plotting dissimilarity w.r.t.\n",
" ax.axvline(rdm_oris[iori], color=\".7\", zorder=-1)\n",
"\n",
" # Label axes\n",
" ax.set_title(f'Dissimilarity with response\\nto {ori_plot: .0f}$^o$ stimulus')\n",
" ax.set_xlabel('Stimulus orientation ($^o$)')\n",
"\n",
" axs[0].set_ylabel('Dissimilarity')\n",
" axs[-1].legend(loc=\"upper left\", bbox_to_anchor=(1, 1))\n",
" plt.tight_layout()\n",
" plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Helper Functions\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Helper Functions\n",
"\n",
"def load_data(data_name, bin_width=1):\n",
" \"\"\"Load mouse V1 data from Stringer et al. (2019)\n",
"\n",
" Data from study reported in this preprint:\n",
" https://www.biorxiv.org/content/10.1101/679324v2.abstract\n",
"\n",
" These data comprise time-averaged responses of ~20,000 neurons\n",
" to ~4,000 stimulus gratings of different orientations, recorded\n",
" through Calcium imaginge. The responses have been normalized by\n",
" spontanous levels of activity and then z-scored over stimuli, so\n",
" expect negative numbers. They have also been binned and averaged\n",
" to each degree of orientation.\n",
"\n",
" This function returns the relevant data (neural responses and\n",
" stimulus orientations) in a torch.Tensor of data type torch.float32\n",
" in order to match the default data type for nn.Parameters in\n",
" Google Colab.\n",
"\n",
" This function will actually average responses to stimuli with orientations\n",
" falling within bins specified by the bin_width argument. This helps\n",
" produce individual neural \"responses\" with smoother and more\n",
" interpretable tuning curves.\n",
"\n",
" Args:\n",
" bin_width (float): size of stimulus bins over which to average neural\n",
" responses\n",
"\n",
" Returns:\n",
" resp (torch.Tensor): n_stimuli x n_neurons matrix of neural responses,\n",
" each row contains the responses of each neuron to a given stimulus.\n",
" As mentioned above, neural \"response\" is actually an average over\n",
" responses to stimuli with similar angles falling within specified bins.\n",
" stimuli: (torch.Tensor): n_stimuli x 1 column vector with orientation\n",
" of each stimulus, in degrees. This is actually the mean orientation\n",
" of all stimuli in each bin.\n",
"\n",
" \"\"\"\n",
" with np.load(data_name) as dobj:\n",
" data = dict(**dobj)\n",
" resp = data['resp']\n",
" stimuli = data['stimuli']\n",
"\n",
" if bin_width > 1:\n",
" # Bin neural responses and stimuli\n",
" bins = np.digitize(stimuli, np.arange(0, 360 + bin_width, bin_width))\n",
" stimuli_binned = np.array([stimuli[bins == i].mean() for i in np.unique(bins)])\n",
" resp_binned = np.array([resp[bins == i, :].mean(0) for i in np.unique(bins)])\n",
" else:\n",
" resp_binned = resp\n",
" stimuli_binned = stimuli\n",
"\n",
" # only use stimuli <= 180\n",
" resp_binned = resp_binned[stimuli_binned <= 180]\n",
" stimuli_binned = stimuli_binned[stimuli_binned <= 180]\n",
"\n",
" stimuli_binned -= 90 # 0 means vertical, -ve means tilted left, +ve means tilted right\n",
"\n",
" # Return as torch.Tensor\n",
" resp_tensor = torch.tensor(resp_binned, dtype=torch.float32)\n",
" stimuli_tensor = torch.tensor(stimuli_binned, dtype=torch.float32).unsqueeze(1) # add singleton dimension to make a column vector\n",
"\n",
" return resp_tensor, stimuli_tensor\n",
"\n",
"\n",
"def grating(angle, sf=1 / 28, res=0.1, patch=False):\n",
" \"\"\"Generate oriented grating stimulus\n",
"\n",
" Args:\n",
" angle (float): orientation of grating (angle from vertical), in degrees\n",
" sf (float): controls spatial frequency of the grating\n",
" res (float): resolution of image. Smaller values will make the image\n",
" smaller in terms of pixels. res=1.0 corresponds to 640 x 480 pixels.\n",
" patch (boolean): set to True to make the grating a localized\n",
" patch on the left side of the image. If False, then the\n",
" grating occupies the full image.\n",
"\n",
" Returns:\n",
" torch.Tensor: (res * 480) x (res * 640) pixel oriented grating image\n",
"\n",
" \"\"\"\n",
"\n",
" angle = np.deg2rad(angle) # transform to radians\n",
"\n",
" wpix, hpix = 640, 480 # width and height of image in pixels for res=1.0\n",
"\n",
" xx, yy = np.meshgrid(sf * np.arange(0, wpix * res) / res, sf * np.arange(0, hpix * res) / res)\n",
"\n",
" if patch:\n",
" gratings = np.cos(xx * np.cos(angle + .1) + yy * np.sin(angle + .1)) # phase shift to make it better fit within patch\n",
" gratings[gratings < 0] = 0\n",
" gratings[gratings > 0] = 1\n",
" xcent = gratings.shape[1] * .75\n",
" ycent = gratings.shape[0] / 2\n",
" xxc, yyc = np.meshgrid(np.arange(0, gratings.shape[1]), np.arange(0, gratings.shape[0]))\n",
" icirc = ((xxc - xcent) ** 2 + (yyc - ycent) ** 2) ** 0.5 < wpix / 3 / 2 * res\n",
" gratings[~icirc] = 0.5\n",
"\n",
" else:\n",
" gratings = np.cos(xx * np.cos(angle) + yy * np.sin(angle))\n",
" gratings[gratings < 0] = 0\n",
" gratings[gratings > 0] = 1\n",
"\n",
" # Return torch tensor\n",
" return torch.tensor(gratings, dtype=torch.float32)\n",
"\n",
"\n",
"def filters(out_channels=6, K=7):\n",
" \"\"\" make example filters, some center-surround and gabors\n",
" Returns:\n",
" filters: out_channels x K x K\n",
" \"\"\"\n",
" grid = np.linspace(-K/2, K/2, K).astype(np.float32)\n",
" xx,yy = np.meshgrid(grid, grid, indexing='ij')\n",
"\n",
" # create center-surround filters\n",
" sigma = 1.1\n",
" gaussian = np.exp(-(xx**2 + yy**2)**0.5/(2*sigma**2))\n",
" wide_gaussian = np.exp(-(xx**2 + yy**2)**0.5/(2*(sigma*2)**2))\n",
" center_surround = gaussian - 0.5 * wide_gaussian\n",
"\n",
" # create gabor filters\n",
" thetas = np.linspace(0, 180, out_channels-2+1)[:-1] * np.pi/180\n",
" gabors = np.zeros((len(thetas), K, K), np.float32)\n",
" lam = 10\n",
" phi = np.pi/2\n",
" gaussian = np.exp(-(xx**2 + yy**2)**0.5/(2*(sigma*0.4)**2))\n",
" for i,theta in enumerate(thetas):\n",
" x = xx*np.cos(theta) + yy*np.sin(theta)\n",
" gabors[i] = gaussian * np.cos(2*np.pi*x/lam + phi)\n",
"\n",
" filters = np.concatenate((center_surround[np.newaxis,:,:],\n",
" -1*center_surround[np.newaxis,:,:],\n",
" gabors),\n",
" axis=0)\n",
" filters /= np.abs(filters).max(axis=(1,2))[:,np.newaxis,np.newaxis]\n",
" # convert to torch\n",
" filters = torch.from_numpy(filters)\n",
" # add channel axis\n",
" filters = filters.unsqueeze(1)\n",
"\n",
" return filters\n",
"\n",
"\n",
"class CNN(nn.Module):\n",
" \"\"\"Deep convolutional network with one convolutional + pooling layer followed\n",
" by one fully connected layer\n",
"\n",
" Args:\n",
" h_in (int): height of input image, in pixels (i.e. number of rows)\n",
" w_in (int): width of input image, in pixels (i.e. number of columns)\n",
"\n",
" Attributes:\n",
" conv (nn.Conv2d): filter weights of convolutional layer\n",
" pool (nn.MaxPool2d): max pooling layer\n",
" dims (tuple of ints): dimensions of output from pool layer\n",
" fc (nn.Linear): weights and biases of fully connected layer\n",
" out (nn.Linear): weights and biases of output layer\n",
"\n",
" \"\"\"\n",
"\n",
" def __init__(self, h_in, w_in):\n",
" super().__init__()\n",
" C_in = 1 # input stimuli have only 1 input channel\n",
" C_out = 6 # number of output channels (i.e. of convolutional kernels to convolve the input with)\n",
" K = 7 # size of each convolutional kernel\n",
" Kpool = 8 # size of patches over which to pool\n",
" self.conv = nn.Conv2d(C_in, C_out, kernel_size=K, padding=K//2) # add padding to ensure that each channel has same dimensionality as input\n",
" self.pool = nn.MaxPool2d(Kpool)\n",
" self.dims = (C_out, h_in // Kpool, w_in // Kpool) # dimensions of pool layer output\n",
" self.fc = nn.Linear(np.prod(self.dims), 10) # flattened pool output --> 10D representation\n",
" self.out = nn.Linear(10, 1) # 10D representation --> scalar\n",
" self.conv.weight = nn.Parameter(filters(C_out, K))\n",
" self.conv.bias = nn.Parameter(torch.zeros((C_out,), dtype=torch.float32))\n",
"\n",
" def forward(self, x):\n",
" \"\"\"Classify grating stimulus as tilted right or left\n",
"\n",
" Args:\n",
" x (torch.Tensor): p x 48 x 64 tensor with pixel grayscale values for\n",
" each of p stimulus images.\n",
"\n",
" Returns:\n",
" torch.Tensor: p x 1 tensor with network outputs for each input provided\n",
" in x. Each output should be interpreted as the probability of the\n",
" corresponding stimulus being tilted right.\n",
"\n",
" \"\"\"\n",
" x = x.unsqueeze(1) # p x 1 x 48 x 64, add a singleton dimension for the single stimulus channel\n",
" x = torch.relu(self.conv(x)) # output of convolutional layer\n",
" x = self.pool(x) # output of pooling layer\n",
" x = x.view(-1, np.prod(self.dims)) # flatten pooling layer outputs into a vector\n",
" x = torch.relu(self.fc(x)) # output of fully connected layer\n",
" x = torch.sigmoid(self.out(x)) # network output\n",
" return x\n",
"\n",
"\n",
"def train(net, train_data, train_labels,\n",
" n_epochs=25, learning_rate=0.0005,\n",
" batch_size=100, momentum=.99):\n",
" \"\"\"Run stochastic gradient descent on binary cross-entropy loss for a given\n",
" deep network (cf. appendix for details)\n",
"\n",
" Args:\n",
" net (nn.Module): deep network whose parameters to optimize with SGD\n",
" train_data (torch.Tensor): n_train x h x w tensor with stimulus gratings\n",
" train_labels (torch.Tensor): n_train x 1 tensor with true tilt of each\n",
" stimulus grating in train_data, i.e. 1. for right, 0. for left\n",
" n_epochs (int): number of times to run SGD through whole training data set\n",
" batch_size (int): number of training data samples in each mini-batch\n",
" learning_rate (float): learning rate to use for SGD updates\n",
" momentum (float): momentum parameter for SGD updates\n",
"\n",
" \"\"\"\n",
"\n",
" # Initialize binary cross-entropy loss function\n",
" loss_fn = nn.BCELoss()\n",
"\n",
" # Initialize SGD optimizer with momentum\n",
" optimizer = optim.SGD(net.parameters(), lr=learning_rate, momentum=momentum)\n",
"\n",
" # Placeholder to save loss at each iteration\n",
" track_loss = []\n",
"\n",
" # Loop over epochs\n",
" for i in range(n_epochs):\n",
"\n",
" # Split up training data into random non-overlapping mini-batches\n",
" ishuffle = torch.randperm(train_data.shape[0]) # random ordering of training data\n",
" minibatch_data = torch.split(train_data[ishuffle], batch_size) # split train_data into minibatches\n",
" minibatch_labels = torch.split(train_labels[ishuffle], batch_size) # split train_labels into minibatches\n",
"\n",
" # Loop over mini-batches\n",
" for stimuli, tilt in zip(minibatch_data, minibatch_labels):\n",
"\n",
" # Evaluate loss and update network weights\n",
" out = net(stimuli) # predicted probability of tilt right\n",
" loss = loss_fn(out, tilt) # evaluate loss\n",
" optimizer.zero_grad() # clear gradients\n",
" loss.backward() # compute gradients\n",
" optimizer.step() # update weights\n",
"\n",
" # Keep track of loss at each iteration\n",
" track_loss.append(loss.item())\n",
"\n",
" # Track progress\n",
" if (i + 1) % (n_epochs // 5) == 0:\n",
" print(f'epoch {i + 1} | loss on last mini-batch: {loss.item(): .2e}')\n",
"\n",
" print('training done!')\n",
"\n",
"\n",
"def get_hidden_activity(net, stimuli, layer_labels):\n",
" \"\"\"Retrieve internal representations of network\n",
"\n",
" Args:\n",
" net (nn.Module): deep network\n",
" stimuli (torch.Tensor): p x 48 x 64 tensor with stimuli for which to\n",
" compute and retrieve internal representations\n",
" layer_labels (list): list of strings with labels of each layer for which\n",
" to return its internal representations\n",
"\n",
" Returns:\n",
" dict: internal representations at each layer of the network, in\n",
" numpy arrays. The keys of this dict are the strings in layer_labels.\n",
"\n",
" \"\"\"\n",
"\n",
" # Placeholder\n",
" hidden_activity = {}\n",
"\n",
" # Attach 'hooks' to each layer of the network to store hidden\n",
" # representations in hidden_activity\n",
" def hook(module, input, output):\n",
" module_label = list(net._modules.keys())[np.argwhere([module == m for m in net._modules.values()])[0, 0]]\n",
" if module_label in layer_labels: # ignore output layer\n",
" hidden_activity[module_label] = output.view(stimuli.shape[0], -1).detach().numpy()\n",
" hooks = [layer.register_forward_hook(hook) for layer in net.children()]\n",
"\n",
" # Run stimuli through the network\n",
" pred = net(stimuli)\n",
"\n",
" # Remove the hooks\n",
" [h.remove() for h in hooks]\n",
"\n",
" return hidden_activity"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data retrieval and loading\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"#@title Data retrieval and loading\n",
"import os\n",
"import hashlib\n",
"import requests\n",
"\n",
"fname = \"W3D4_stringer_oribinned1.npz\"\n",
"url = \"https://osf.io/683xc/download\"\n",
"expected_md5 = \"436599dfd8ebe6019f066c38aed20580\"\n",
"\n",
"if not os.path.isfile(fname):\n",
" try:\n",
" r = requests.get(url)\n",
" except requests.ConnectionError:\n",
" print(\"!!! Failed to download data !!!\")\n",
" else:\n",
" if r.status_code != requests.codes.ok:\n",
" print(\"!!! Failed to download data !!!\")\n",
" elif hashlib.md5(r.content).hexdigest() != expected_md5:\n",
" print(\"!!! Data download appears corrupted !!!\")\n",
" else:\n",
" with open(fname, \"wb\") as fid:\n",
" fid.write(r.content)"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Section 1: Setting up deep network and neural data\n",
"\n",
"In the future sections, we will compare the activity in a deep network, specifically in a CNN, with neural activity. First, we need to understand the task we are using (Section 1.1), train our deep network (Section 1.2), and load in neural data (Section 1.3)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Video 1: Deep convolutional network for orientation discrimination\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"remove-input"
]
},
"outputs": [],
"source": [
"# @title Video 1: Deep convolutional network for orientation discrimination\n",
"from ipywidgets import widgets\n",
"from IPython.display import YouTubeVideo\n",
"from IPython.display import IFrame\n",
"from IPython.display import display\n",
"\n",
"\n",
"class PlayVideo(IFrame):\n",
" def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n",
" self.id = id\n",
" if source == 'Bilibili':\n",
" src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n",
" elif source == 'Osf':\n",
" src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n",
" super(PlayVideo, self).__init__(src, width, height, **kwargs)\n",
"\n",
"\n",
"def display_videos(video_ids, W=400, H=300, fs=1):\n",
" tab_contents = []\n",
" for i, video_id in enumerate(video_ids):\n",
" out = widgets.Output()\n",
" with out:\n",
" if video_ids[i][0] == 'Youtube':\n",
" video = YouTubeVideo(id=video_ids[i][1], width=W,\n",
" height=H, fs=fs, rel=0)\n",
" print(f'Video available at https://youtube.com/watch?v={video.id}')\n",
" else:\n",
" video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n",
" height=H, fs=fs, autoplay=False)\n",
" if video_ids[i][0] == 'Bilibili':\n",
" print(f'Video available at https://www.bilibili.com/video/{video.id}')\n",
" elif video_ids[i][0] == 'Osf':\n",
" print(f'Video available at https://osf.io/{video.id}')\n",
" display(video)\n",
" tab_contents.append(out)\n",
" return tab_contents\n",
"\n",
"\n",
"video_ids = [('Youtube', 'KlXtKJCpV4I'), ('Bilibili', 'BV1ip4y1i7Yo')]\n",
"tab_contents = display_videos(video_ids, W=730, H=410)\n",
"tabs = widgets.Tab()\n",
"tabs.children = tab_contents\n",
"for i in range(len(tab_contents)):\n",
" tabs.set_title(i, video_ids[i][0])\n",
"display(tabs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Submit your feedback\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Submit your feedback\n",
"content_review(f\"{feedback_prefix}_Deep_convolutional_network_for_orientation_discrimination_Video\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"## Section 1.1: Orientation discrimination task\n",
"\n",
"We will build our normative encoding model by optimizing its parameters to solve an orientation discrimination task.\n",
"\n",
"The task is to tell whether a given grating stimulus is tilted to the \"right\" or \"left\"; that is, whether its angle relative to the vertical is positive or negative, respectively. We show example stimuli below, which were constructed using the helper function `grating()`.\n",
"\n",
"Note that this is a task that we know many mammalian visual systems are capable of solving. It is therefore conceivable that the representations in a deep network model optimized for this task might resemble those in the brain. To test this hypothesis, we will compare the representations of our optimized encoding model to neural activity recorded in response to these very same stimuli, courtesy of [Stringer et al 2019](https://www.biorxiv.org/content/10.1101/679324v2.abstract)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" Execute this cell to plot example stimuli\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @markdown Execute this cell to plot example stimuli\n",
"\n",
"orientations = np.linspace(-90, 90, 5)\n",
"\n",
"h_ = 3\n",
"n_col = len(orientations)\n",
"h, w = grating(0).shape # height and width of stimulus\n",
"fig, axs = plt.subplots(1, n_col, figsize=(h_ * n_col, h_))\n",
"\n",
"for i, ori in enumerate(orientations):\n",
" stimulus = grating(ori)\n",
" axs[i].set_title(f'{ori: .0f}$^o$')\n",
" show_stimulus(stimulus, axs[i])\n",
"fig.suptitle(f'stimulus size: {h} x {w}')\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
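{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"To make the task concrete, here is a rough sketch of how a small set of training stimuli and tilt labels could be assembled with the `grating()` helper defined above. It assumes the labeling convention from the `train()` helper's docstring (1 = tilted right, 0 = tilted left); the number of samples and the orientation range are arbitrary choices for illustration, not the tutorial's actual training setup.\n",
"\n",
"```python\n",
"n_train = 8                                            # small, for illustration\n",
"train_oris = np.random.uniform(-90, 90, size=n_train)  # random orientations (degrees)\n",
"\n",
"# Stack gratings into an n_train x 48 x 64 tensor of stimuli\n",
"train_data = torch.stack([grating(ori) for ori in train_oris])\n",
"\n",
"# Label is 1. if tilted right (positive angle), 0. if tilted left\n",
"train_labels = torch.tensor(train_oris > 0, dtype=torch.float32).unsqueeze(1)\n",
"\n",
"print(train_data.shape, train_labels.squeeze())\n",
"```"
]
},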
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"## Section 1.2: A deep network model of orientation discrimination\n",
"\n",
"*Estimated timing to here from start of tutorial: 10 min*\n",
"\n",
"Our goal is to build a model that solves the orientation discrimination task outlined above. The model should take as input a stimulus image and output the probability of that stimulus being tilted right.\n",
"\n",
"To do this, we will use a **convolutional neural network (CNN)**, which is the type of network we saw in Tutorial 2. Here, we will use a CNN that performs *two-dimensional* convolutions on the raw stimulus image (which is a 2D matrix of pixels), rather than *one-dimensional* convolutions on a categorical 1D vector representation of the stimulus. CNNs are commonly used for image processing.\n",
"\n",
"The particular CNN we will use here has two layers:\n",
"1. a *convolutional layer*, which convolves the images with a set of filters\n",
"2. a *fully connected layer*, which transforms the output of this convolution into a 10-dimensional representation\n",
"\n",
"Finally, a set of output weights transforms this 10-dimensional representation into a single scalar $p$, denoting the predicted probability of the input stimulus being tilted right.\n",
"\n",
"
\n",
" \n",
"
\n",
" \n",
"