Problem with loading data in extraction_assessment notebook and compatibility issues

Hello, I am taking the Exploring Adversarial Machine Learning course and I am unable to download the Food101 dataset in the extraction_assessment (3) notebook. I’m using the default code, which I haven’t modified. The dataset cannot be downloaded, and the notebook raises KeyError: ‘tags’.

The second issue I have is with the assessments_assessment (4) notebook. It’s the same situation – I’m running the provided code, and there are compatibility issues.
WARNING: Error parsing dependencies of cudf: .* suffix can only be used with `==` or `!=` operators
    cuda-python (>=12.*)
                 ~~~~~^
WARNING: Error parsing dependencies of pylibraft: .* suffix can only be used with `==` or `!=` operators
    cuda-python (>=12.*)
                 ~~~~~^
WARNING: Error parsing dependencies of rmm: .* suffix can only be used with `==` or `!=` operators
    cuda-python (>=12.*)
                 ~~~~~^
Installing collected packages: en-core-web-md
Successfully installed en-core-web-md-3.5.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_md')
⚠ As of spaCy v3.0, model symlinks are not supported anymore. You can load trained pipeline packages using their full names or from a directory path.
/usr/local/lib/python3.8/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Some weights of the model checkpoint at openai-community/roberta-base-openai-detector were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']

  • This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
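For anyone reading the log above: the three `.* suffix` warnings are about how the installed cudf, pylibraft, and rmm packages declare their cuda-python dependency, and the same log goes on to report a successful install. Under the PEP 440 version-specifier rules, a trailing `.*` is only valid with `==` or `!=`, so `>=12.*` gets flagged. Below is a minimal standard-library sketch of that rule. The function name `wildcard_is_allowed` and the regular expression are made up for illustration; this is not how pip or the packaging library implements the check.

```python
import re

# Hypothetical helper for illustration only: split a single specifier such as
# ">=12.*" into its operator and version parts. Longer operators come first so
# that "===" is not read as "==" and ">=" is not read as ">".
_SPECIFIER = re.compile(r"^\s*(===|==|!=|~=|<=|>=|<|>)\s*(\S+)\s*$")


def wildcard_is_allowed(specifier: str) -> bool:
    """Return False when a '.*' suffix is combined with an operator
    other than '==' or '!=' (the case the pip warning describes)."""
    match = _SPECIFIER.match(specifier)
    if match is None:
        raise ValueError(f"not a single version specifier: {specifier!r}")
    operator, version = match.groups()
    if not version.endswith(".*"):
        return True  # no wildcard, so this particular rule does not apply
    return operator in ("==", "!=")


for spec in (">=12.*", "==12.*", ">=12.0"):
    print(spec, "->", "ok" if wildcard_is_allowed(spec) else "wildcard not allowed")
```

Only the first specifier in the loop, the one that appears in the log, is reported as not allowed.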

Could you please take a look and resolve it, or restart my environment? Something might be wrong with it.

Food101 Dataset Assessment 3 - Exploring Adversarial Machine Learning

@member35 or @jolucas, can someone help solve this problem? I can get the dataset when using Google Colab. I’ve tried at least a dozen methods in the notebook, but none have worked yet.

Again, I’m just trying to get the Food101 dataset into Jupyter.

I need HF credentials to get the dataset. This is the code that works in Colab but not in JupyterLab.

!pip install datasets

import os
from datasets import load_dataset

# Hugging Face token (placeholder value)
os.environ["HUGGING_FACE_HUB_TOKEN"] = "hf_mysecrethuggingfacetoken"

# Then load the dataset
dataset = load_dataset("ethz/food101")

# These are the indexes corresponding to pizza and hotdog in the food101 data
f101_pizza = 76
f101_hotdog = 55

def ds_preprocess(example):
    # preprocess is not defined in this snippet; it is assumed to come from the course notebook
    example['image'] = preprocess(example['image'])
    return example

ds = load_dataset('food101')\
    .filter(lambda x, f101_pizza=f101_pizza, f101_hotdog=f101_hotdog: x['label'] in [f101_pizza, f101_hotdog])\
    .map(ds_preprocess).with_format('torch')
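Since the same code works in Colab but not in JupyterLab, it may be worth comparing the two environments before trying more download methods. The sketch below uses only the standard library to report which versions of a few packages are installed and whether a Hugging Face token variable is visible to the kernel. The function name `environment_report` and the package list are my own choices, not part of the course notebook. `HF_TOKEN` is included next to `HUGGING_FACE_HUB_TOKEN` on the assumption that the installed huggingface_hub may read either one.

```python
import os
from importlib import metadata

# Assumed to be the packages most relevant to load_dataset; adjust as needed.
PACKAGES = ("datasets", "huggingface_hub", "fsspec")
# Environment variables that huggingface_hub may read a token from.
TOKEN_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")


def environment_report(packages=PACKAGES, token_vars=TOKEN_VARS):
    """Collect installed package versions and whether token variables are set.

    Token values are never returned, only True/False, so the report is safe
    to paste into a forum post.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this kernel
    token_set = {var: bool(os.environ.get(var)) for var in token_vars}
    return {"versions": versions, "token_set": token_set}


print(environment_report())
```

Running this once in Colab and once in the course JupyterLab would show whether the two kernels differ in library versions or in token visibility, which narrows down where to look next.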

I just want to get the dataset, I can figure the rest out.

Any help would be greatly appreciated.

Thanks,
Bill

OMG, WHY DON’T I HAVE ACCESS TO THE FOOD101 DATASET? PLEASE HELP ME! IT’S THE LAST ASSESSMENT I NEED TO PASS.