The following describes the contents of SmallCNN.tar.gz, a single archive file that will be uploaded to the PNNL DataHub.

Contents of SmallCNN.tar.gz:

###############################################################################
CK

/CK/CIFAR/
----SEED{i} i in {range(0,100)}, though only i in range(66,100) are populated
----CK_E{epoch}_SEED{i}.npy epoch in {range(0,200,4)}

/CK/FMNIST/
----SEED{i} i in {range(0,100)}, though only i in range(33,66) are populated
----CK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/CK/MNIST/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
----CK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/CK/MNIST2
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
----CK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

We calculate the Conjugate Kernel (CK) after every epoch or after a set number of epochs and record it here. Strictly speaking, we do not store the CK itself; we store the feature vectors necessary to compute it. For example, one can reconstruct the CK with:

```python
import numpy as np

# Load the saved feature array, then form the kernel from it.
A = np.load('/path/to/CK.npy')
CK = A.T @ A
```

###############################################################################
NTK

/NTK/CIFAR/
----SEED{i} i in {range(0,100)}, though only i in range(66,100) are populated
----NTK_E{epoch}_SEED{i}.npy epoch in {range(0,200,4)}

/NTK/FMNIST/
----SEED{i} i in {range(0,100)}, though only i in range(33,66) are populated
----NTK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/NTK/MNIST/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
----NTK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/NTK/MNIST2
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
----NTK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

We calculate the NTK after every other (or every 4th) epoch while training a small CNN on each of the datasets. The NTK is saved as the full calculated kernel matrix, so simply load it with np.load.

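The CK and NTK listings above share one naming scheme, so a small helper can build file paths and catch requests for epochs or seeds that were never saved. The following is only a sketch: the names `KERNEL_EPOCHS`, `POPULATED_SEEDS`, and `kernel_path` are hypothetical, and it assumes that each kernel file sits inside its SEED{i} directory, which the listing does not state explicitly. Adjust the path construction if the extracted archive is laid out differently.

```python
from pathlib import Path

# Epochs at which CK / NTK files were saved, taken from the listings above.
KERNEL_EPOCHS = {
    "CIFAR": range(0, 200, 4),
    "FMNIST": range(0, 100, 2),
    "MNIST": range(0, 100, 2),
    "MNIST2": range(0, 100, 2),
}

# Seeds whose directories are actually populated, per the listings above.
POPULATED_SEEDS = {
    "CIFAR": range(66, 100),
    "FMNIST": range(33, 66),
    "MNIST": range(0, 33),
    "MNIST2": range(0, 33),
}


def kernel_path(root, kind, dataset, seed, epoch):
    """Build the expected path of a CK or NTK file (hypothetical helper).

    ASSUMPTION: files are nested as {kind}/{dataset}/SEED{seed}/...
    """
    if kind not in ("CK", "NTK"):
        raise ValueError(f"kind must be 'CK' or 'NTK', got {kind!r}")
    if seed not in POPULATED_SEEDS[dataset]:
        raise ValueError(f"seed {seed} is not populated for {dataset}")
    if epoch not in KERNEL_EPOCHS[dataset]:
        raise ValueError(f"no {kind} file was saved at epoch {epoch} for {dataset}")
    return Path(root) / kind / dataset / f"SEED{seed}" / f"{kind}_E{epoch}_SEED{seed}.npy"
```

The returned path can be passed directly to np.load; for CK files, apply the `A.T @ A` step shown earlier to obtain the kernel.
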
###############################################################################
LP

#!!! NOTE: these are PyTorch (.pt) files to be opened with torch.load(), even though the file names listed below end in .npy

/LP/CIFAR/
----SEED{i} i in {range(0,100)}, though only i in range(66,100) are populated
----LP_E{epoch}_SEED{i}.npy epoch in {range(0,200,4)}

/LP/FMNIST/
----SEED{i} i in {range(0,100)}, though only i in range(33,66) are populated
----LP_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/LP/MNIST/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
----LP_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/LP/MNIST2
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
----LP_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

Each file is a PyTorch file representing the final state of the model after 5 epochs of linear probe training. The epoch number in the file name is the epoch of regular training from which the linear probing started.

###############################################################################
MODELS

/MODELS/CIFAR/
----SEED{i} i in {range(0,100)}, though only i in range(66,100) are populated
----CIFAR_E{epoch}_SEED{i}.pt epoch in {range(0,200,1)}

/MODELS/FMNIST/
----SEED{i} i in {range(0,100)}, though only i in range(33,66) are populated
----FMNIST_E{epoch}_SEED{i}.pt epoch in {range(0,100,1)}

/MODELS/MNIST/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
----MNIST_E{epoch}_SEED{i}.pt epoch in {range(0,100,1)}

/MODELS/MNIST2
----SEED{i} i in {range(0,33)}
----MNIST_E{epoch}_SEED{i}.pt epoch in {range(0,100,1)}

Each file contains the PyTorch state dictionary (model weights) saved at the end of each epoch.
For example, to get the model weights for the MNIST experiment at the end of epoch 4:

```python
import torch

# Load the saved state dictionary for seed 0, epoch 4.
ckpt = torch.load('./MODELS/MNIST/MNIST_E4_SEED0.pt')
model = MyModel()  # defined inside the Jupyter notebook files
model.load_state_dict(ckpt)
```

###############################################################################
LOG

/LOG/CIFAR/
----CIFAR_SEED{i}_history.npy for i in range(66,100)

/LOG/FMNIST/
----FMNIST_SEED{i}_history.npy for i in range(33,66)

/LOG/MNIST/
----MNIST_SEED{i}_history.npy for i in range(0,33)

/LOG/MNIST2/
----MNIST_SEED{i}_history.npy for i in range(0,33)
----MNIST_SEED{i}_histor_justnewLP.npy for i in range(0,33)

Log files are numpy object files that can contain extracted information such as the test accuracy of the model, the vector of predicted probabilities, the test accuracy of the linear probes, and the test accuracy of kernel machines trained on the CK and NTK kernels. This allows one to re-compute the main figure for the small CNN section without having to re-initialize all variables.
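
Because the log files are numpy object files, np.load needs allow_pickle=True to read them. The sketch below shows one way to do that; `load_history` is a hypothetical helper name, and the code assumes (without confirmation from the archive) that a log may be stored as a single pickled Python object, in which case it is unwrapped with `.item()`. The keys or fields inside each log are not documented here, so inspect the loaded object before relying on any particular name.

```python
import numpy as np


def load_history(path):
    """Load a *_history.npy log file (hypothetical helper).

    ASSUMPTION: the file is either a regular array or a 0-d object array
    wrapping one Python object (for example a dict).
    """
    # Object arrays are pickled on disk, so allow_pickle=True is required.
    arr = np.load(path, allow_pickle=True)
    if arr.dtype == object and arr.shape == ():
        return arr.item()  # unwrap the single stored object
    return arr
```

A call such as `history = load_history('./LOG/MNIST/MNIST_SEED0_history.npy')` followed by `print(type(history))` is a reasonable first step. Only use allow_pickle=True on files from a source you trust, since unpickling can execute arbitrary code.
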