The following describes the contents of SmallCNN.tar.gz, a single archive file
that will be uploaded to the PNNL DataHub.

contents of SmallCNN.tar.gz:
###############################################################################
CK 

/CK/CIFAR/
----SEED{i} i in {range(0,100)}, though only i in range(66,100) are populated
    ----CK_E{epoch}_SEED{i}.npy epoch in {range(0,200,4)}

/CK/FMNIST/
----SEED{i} i in {range(0,100)}, though only i in range(33,66) are populated
    ----CK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/CK/MNIST/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
    ----CK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/CK/MNIST2/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
    ----CK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}


We calculate the Conjugate Kernel (CK) after every epoch or after a fixed number
of epochs and record it here. More precisely, we record the feature vectors
necessary to compute the CK rather than the kernel itself. So, for example, one
can reconstruct the CK with:

```python
import numpy as np
A = np.load('/path/to/CK.npy')  # saved feature vectors, not the kernel itself
CK = A.T @ A
```
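
The directory layout listed above can be walked with a small helper. This is
only a sketch: the function name `ck_paths`, the `EPOCHS` table, and the
extraction root `SmallCNN` are hypothetical and not part of the archive; the
epoch grids are copied from the listing above.

```python
from pathlib import Path

# Epoch grids copied from the CK listing above.
EPOCHS = {
    "CIFAR": range(0, 200, 4),
    "FMNIST": range(0, 100, 2),
    "MNIST": range(0, 100, 2),
    "MNIST2": range(0, 100, 2),
}

def ck_paths(root, dataset, seed):
    """Return the expected CK file paths for one dataset and seed.

    `root` is wherever the archive was extracted (an assumption; adjust it).
    """
    seed_dir = Path(root) / "CK" / dataset / f"SEED{seed}"
    return [seed_dir / f"CK_E{e}_SEED{seed}.npy" for e in EPOCHS[dataset]]

# Example: the CIFAR feature files for seed 66.
paths = ck_paths("SmallCNN", "CIFAR", 66)
```

Only the populated seed ranges noted above will contain files, so check that a
path exists before loading it.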

###############################################################################
NTK

/NTK/CIFAR/
----SEED{i} i in {range(0,100)}, though only i in range(66,100) are populated
    ----NTK_E{epoch}_SEED{i}.npy epoch in {range(0,200,4)}

/NTK/FMNIST/
----SEED{i} i in {range(0,100)}, though only i in range(33,66) are populated
    ----NTK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/NTK/MNIST/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
    ----NTK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/NTK/MNIST2/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
    ----NTK_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

We calculate the NTK after every other (or every 4th) epoch while training a
small CNN on each of the datasets. The NTK is saved as the full calculated
kernel matrix, so it can be loaded directly with np.load.
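
A minimal loading sketch follows. It assumes the saved NTK is a 2-D square
array, which the archive description does not state explicitly; the helper name
`load_ntk` and the tolerances are hypothetical choices. The checks only report
what they find and do not modify the matrix.

```python
import numpy as np

def load_ntk(path):
    """Load a saved NTK and report basic sanity checks.

    Assumes (unverified) that the file holds a 2-D square kernel matrix.
    """
    K = np.load(path)
    is_square = K.ndim == 2 and K.shape[0] == K.shape[1]
    # Symmetry is only meaningful for a square matrix; tolerances are a guess.
    is_symmetric = bool(is_square and np.allclose(K, K.T, rtol=1e-4, atol=1e-6))
    info = {"shape": K.shape, "square": is_square, "symmetric": is_symmetric}
    return K, info
```

For example, `K, info = load_ntk('/path/to/NTK.npy')` returns the matrix
together with a small dictionary describing its shape.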

###############################################################################
LP  (NOTE: these are PyTorch .pt files, to be opened with torch.load())

/LP/CIFAR/
----SEED{i} i in {range(0,100)}, though only i in range(66,100) are populated
    ----LP_E{epoch}_SEED{i}.npy epoch in {range(0,200,4)}

/LP/FMNIST/
----SEED{i} i in {range(0,100)}, though only i in range(33,66) are populated
    ----LP_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/LP/MNIST/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
    ----LP_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}

/LP/MNIST2/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
    ----LP_E{epoch}_SEED{i}.npy epoch in {range(0,100,2)}


Each file is a PyTorch file holding the final state of the model after 5 epochs
of linear probe training, where the epoch number in the file name is the epoch
of regular training from which the linear probing started.


###############################################################################
MODELS

/MODELS/CIFAR/
----SEED{i} i in {range(0,100)}, though only i in range(66,100) are populated
    ----CIFAR_E{epoch}_SEED{i}.pt epoch in {range(0,200,1)}

/MODELS/FMNIST/
----SEED{i} i in {range(0,100)}, though only i in range(33,66) are populated
    ----FMNIST_E{epoch}_SEED{i}.pt epoch in {range(0,100,1)}

/MODELS/MNIST/
----SEED{i} i in {range(0,100)}, though only i in range(0,33) are populated
    ----MNIST_E{epoch}_SEED{i}.pt epoch in {range(0,100,1)}

/MODELS/MNIST2/
----SEED{i} i in {range(0,33)}
    ----MNIST_E{epoch}_SEED{i}.pt epoch in {range(0,100,1)}

Each file contains the PyTorch state dictionary (model weights) saved at the end
of each epoch. For example, to get the model weights for the MNIST experiment at
the end of epoch 4:

```python
import torch
ckpt = torch.load('./MODELS/MNIST/SEED0/MNIST_E4_SEED0.pt')

model = MyModel()  # defined inside the Jupyter notebook files
model.load_state_dict(ckpt)
```

###############################################################################
LOG

/LOG/CIFAR/
----CIFAR_SEED{i}_history.npy for i in range(66,100)

/LOG/FMNIST/
----FMNIST_SEED{i}_history.npy for i in range(33,66)

/LOG/MNIST/
----MNIST_SEED{i}_history.npy for i in range(0,33)

/LOG/MNIST2/
----MNIST_SEED{i}_history.npy for i in range(0,33)
----MNIST_SEED{i}_histor_justnewLP.npy for i in range(0,33)

Log files are NumPy object files that can contain extracted information such as
the test accuracy of the model, the vector of predicted probabilities, the test
accuracy of the linear probes, and the test accuracy of kernel machines trained
on the CK and NTK kernels. This allows one to recompute the main figure for the
small CNN section without having to re-initialize all variables.
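
Because the logs are object files, np.load needs `allow_pickle=True` to read
them. The sketch below assumes each log was written with np.save on a Python
object such as a dict, which this description does not confirm; the helper name
`load_history` is hypothetical. Unpickling can execute arbitrary code, so only
load files from a source you trust.

```python
import numpy as np

def load_history(path):
    """Load a log file saved as a NumPy object array.

    If the file holds a single pickled Python object (np.save wraps it in a
    0-d object array), return that object; otherwise return the array.
    """
    arr = np.load(path, allow_pickle=True)
    if arr.dtype == object and arr.shape == ():
        return arr.item()
    return arr
```

If the returned object is a dict, `sorted(history.keys())` lists the recorded
quantities; the key names are not documented here, so inspect them before use.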