EOFError: Compressed file ended before the end-of-stream marker was reached

Dear Sir or Madam:

When I run babi_rnn.py from keras/examples, the download from Amazon AWS is extremely slow and then stops with the error in the title. I found that the following directory does not contain the downloaded file.

Path:
/usr/lib/python3/dist-packages/keras/datasets

I then copied the manually downloaded file 'babi_tasks_1-20_v1-2.tar.gz' into the path above, but the same error occurred at runtime. As far as I know, an Ubuntu desktop computer has such a datasets path, but the Jetson Nano has no such path containing any file downloaded from Amazon AWS. Could you please tell me the actual path that holds the AWS-downloaded file in …keras/datasets?

  1. Code snippet:

try:
    path = get_file('babi-tasks-v1-2.tar.gz',
                    origin='https://s3.amazonaws.com/text-datasets/'
                           'babi_tasks_1-20_v1-2.tar.gz')
except:
    print('Error downloading dataset, please download it manually:\n'
          '$ wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2'
          '.tar.gz\n'
          '$ mv tasks_1-20_v1-2.tar.gz ~/.keras/datasets/babi-tasks-v1-2.tar.gz')
    raise

challenge = 'tasks_1-20_v1-2/en/qa2_two-supporting-facts_{}.txt'
with tarfile.open(path) as tar:
    train = get_stories(tar.extractfile(challenge.format('train')))
    test = get_stories(tar.extractfile(challenge.format('test')))

  2. Error message:

Traceback (most recent call last):
  File "babi_rnn.py", line 178, in <module>
    train = get_stories(tar.extractfile(challenge.format('train')))
  File "/usr/lib/python3.6/tarfile.py", line 2076, in extractfile
    tarinfo = self.getmember(member)
  File "/usr/lib/python3.6/tarfile.py", line 1750, in getmember
    tarinfo = self._getmember(name)
  File "/usr/lib/python3.6/tarfile.py", line 2335, in _getmember
    members = self.getmembers()
  File "/usr/lib/python3.6/tarfile.py", line 1761, in getmembers
    self._load() # all members, we first have to
  File "/usr/lib/python3.6/tarfile.py", line 2358, in _load
    tarinfo = self.next()
  File "/usr/lib/python3.6/tarfile.py", line 2289, in next
    self.fileobj.seek(self.offset - 1)
  File "/usr/lib/python3.6/gzip.py", line 368, in seek
    return self._buffer.seek(offset, whence)
  File "/usr/lib/python3.6/_compression.py", line 143, in seek
    data = self.read(min(io.DEFAULT_BUFFER_SIZE, offset))
  File "/usr/lib/python3.6/gzip.py", line 482, in read
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached
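
For anyone hitting the same traceback: the last frames are in gzip.py, which suggests the .tar.gz on disk was cut short (for example by an interrupted download) rather than that the requested member is missing from the archive. Below is a minimal sketch of a check for that, using only the standard library. The helper name gzip_is_complete and the temporary demo files are made up for illustration; they are not part of Keras or babi_rnn.py.

```python
import gzip
import os
import tempfile
import zlib


def gzip_is_complete(path, chunk_size=1024 * 1024):
    """Return True if the whole gzip stream at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end so a missing end-of-stream marker is detected.
            while fh.read(chunk_size):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        # EOFError: file ended early; OSError: not gzip; zlib.error: corrupt data.
        return False


# Demo on throwaway files (hypothetical data, not the bAbI archive).
workdir = tempfile.mkdtemp()
good_path = os.path.join(workdir, "good.gz")
bad_path = os.path.join(workdir, "bad.gz")

payload = b"".join(b"story %d\n" % i for i in range(20000))
with gzip.open(good_path, "wb") as fh:
    fh.write(payload)

# Simulate an interrupted download by keeping only the first half of the file.
with open(good_path, "rb") as src:
    raw = src.read()
with open(bad_path, "wb") as dst:
    dst.write(raw[: len(raw) // 2])

good_ok = gzip_is_complete(good_path)
bad_ok = gzip_is_complete(bad_path)
print(good_ok, bad_ok)
```

Pointing gzip_is_complete at the cached bAbI archive should tell you whether the file needs to be deleted and downloaded again.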

Appreciate your help in advance,

Mike


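I solved the issue. On the Jetson Nano (Ubuntu 18.04), Keras keeps its downloaded datasets in a hidden folder, .keras/datasets, in the home directory rather than in the dist-packages path above. The interrupted download had left an incomplete archive there, so deleting it forces a fresh download. The steps are as follows.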

  1. Open the home folder in the file manager.

  2. Press Ctrl+H to show hidden files.

  3. The hidden folder ".keras" becomes visible.

  4. Open ".keras" and find the "datasets" folder.

  5. Open "datasets" and delete "babi_tasks_1-20_v1-2.tar.gz".

  6. Re-run the application; it now completes successfully.
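
The same cleanup can be done from Python instead of the file manager. This is only a sketch: it assumes the default Keras cache location of ~/.keras/datasets (a non-default cache directory would need to be passed in explicitly), and remove_cached_dataset is a hypothetical helper name, not a Keras function. The demo runs against a temporary folder so it does not touch the real cache.

```python
import os
import tempfile


def remove_cached_dataset(fname, cache_dir=None):
    """Delete `fname` from the dataset cache; return True if a file was removed."""
    if cache_dir is None:
        # Assumed default cache location used by Keras.
        cache_dir = os.path.join(os.path.expanduser("~"), ".keras", "datasets")
    path = os.path.join(cache_dir, fname)
    if os.path.isfile(path):
        os.remove(path)
        return True
    return False


# Demo in a temporary folder standing in for ~/.keras/datasets.
demo_cache = tempfile.mkdtemp()
demo_file = os.path.join(demo_cache, "babi-tasks-v1-2.tar.gz")
with open(demo_file, "wb") as fh:
    fh.write(b"incomplete download")

first_call = remove_cached_dataset("babi-tasks-v1-2.tar.gz", cache_dir=demo_cache)
second_call = remove_cached_dataset("babi-tasks-v1-2.tar.gz", cache_dir=demo_cache)
print(first_call, second_call)
```

On a real system, call it without cache_dir and pass whichever file name actually appears in the folder: the snippet above saves the archive as babi-tasks-v1-2.tar.gz, while a manually copied file may be named babi_tasks_1-20_v1-2.tar.gz.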

Cheers,

Mike

Glad to know you solved the issue. Thanks.