Errors with ‘Accelerating End-to-End Data Science Workflows’ course

Hi,

I don’t think this was answered in my follow-up reply, but I have been having issues with the DBSCAN section of the project: the code produces no output even after more than 45 minutes (I just left it running). This is the code I ran:

infected_df['cluster'] = dbscan.fit_predict(infected_df[['northing', 'easting']])
infected_df['cluster'].nunique()

I am also constantly getting this error even after restarting the whole task:

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.core.typeinfer.CallConstraint object at 0x7f9fa63de4d0>.
module, class, method, function, traceback, frame, or code object was expected, got CPUDispatcher
During: resolving callee type: Function(<numba.cuda.compiler.DeviceFunctionTemplate object at 0x7f9fa63de150>)
During: typing of call at /opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/utils/cudautils.py (102)

Enable logging at debug level for details.

File "…/…/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/utils/cudautils.py", line 102:
def gpu_mark_found_int(arr, val, out, not_found):

    if i < arr.size:
        if check_equals_int(arr[i], val):

My time for the course was also used up during this troubleshooting, so I’d like to request more time as well. Thanks so much in advance.

Claire

Hi Claire,

Sorry for getting back to you so late; I only saw this post recently. Even though your error message suggests a bug, I had no issue completing the project. I did try to reproduce your error: after about an hour, I got a similar error to yours when I didn’t apply any filter to the original data. I believe the issue is that you’re trying to cluster too much data. It would appear DBSCAN doesn’t scale very well, which makes intuitive sense to me. Part of the instructions asks you to filter the data down, i.e.

infected_df = gdf[gdf['infected'] == True]

This will reduce the DataFrame from 58479894 rows to 18148 rows.
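To show the order of operations, here is a minimal sketch of filtering first and clustering second. It is not the course notebook: pandas and scikit-learn are used as CPU stand-ins for cudf and cuml so it runs anywhere, the data is synthetic, and the eps and min_samples values are placeholders I made up.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN  # CPU stand-in for the GPU DBSCAN used in the course

# Synthetic stand-in for the course data; every value here is made up.
rng = np.random.default_rng(0)
n = 10_000
gdf = pd.DataFrame({
    "northing": rng.uniform(0, 100_000, n),
    "easting": rng.uniform(0, 100_000, n),
    "infected": rng.random(n) < 0.01,
})

# Filter BEFORE clustering so DBSCAN only sees the infected rows.
# .copy() lets us add a column without touching the original frame.
infected_df = gdf[gdf["infected"] == True].copy()

# Quick sanity check on how much data is about to be clustered.
print(len(gdf), "rows ->", len(infected_df), "rows")

# eps and min_samples are placeholder values, not the course's settings.
dbscan = DBSCAN(eps=5_000, min_samples=3)
infected_df["cluster"] = dbscan.fit_predict(infected_df[["northing", "easting"]])

# Number of distinct labels; the noise label -1 counts as one if present.
print(infected_df["cluster"].nunique())
```

If the row count printed before clustering is still in the tens of millions, the filter was probably not applied to the frame that is being passed to fit_predict.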

Please let me know if you’re still interested in completing this course and I can work on that.

Best,

Kevin