Mars: MapReduce Framework & CUDA Toolkit 7.0

Hmm… I see…

Alright, so the make.sh and the Makefile in the WordCount directory contain nothing referring to the libraries, or even a gcc/g++/nvcc command. But I did find a line in common.mk (SDK/C/common/common.mk) that refers to these two libraries (-lcutil and -lshrutil). The two WordCount files are below, followed by the relevant section of common.mk:

Code for WordCount make.sh:

#!/bin/sh

#--------------------------------------------------------------------------
#usage: sh make.sh [emu=1]
#--------------------------------------------------------------------------

rm -f -r obj
make clean
make $1

Code for WordCount Makefile:

################################################################################
#
# Build script for project
#
################################################################################

# Add source files here
EXECUTABLE	:= WordCount/WordCount
# Cuda source files (compiled with cudacc)
CUFILES		:= main.cu MarsLib.cu MarsScan.cu MarsSort.cu
# C/C++ source files (compiled with gcc / c++)
CCFILES		:= MarsUtils.cpp

################################################################################
# Rules and targets

include ~/NVIDIA_GPU_Computing_SDK/C/common/common.mk

Code for common.mk:

# Lib/exe configuration
# Lib/exe configuration
# Lib/exe configuration
ifneq ($(STATIC_LIB),)
	TARGETDIR := $(LIBDIR)
	TARGET   := $(subst .a,_$(LIB_ARCH)$(LIBSUFFIX).a,$(LIBDIR)/$(STATIC_LIB))
	LINKLINE  = ar rucv $(TARGET) $(OBJS)
else
	ifneq ($(OMIT_CUTIL_LIB),1)
		LIB += -lcutil_$(LIB_ARCH)$(LIBSUFFIX) -lshrutil_$(LIB_ARCH)$(LIBSUFFIX)
	endif
	# Device emulation configuration
	ifeq ($(emu), 1)
		NVCCFLAGS   += -deviceemu
		CUDACCFLAGS += 
		BINSUBDIR   := emu$(BINSUBDIR)
		# consistency, makes developing easier
		CXXFLAGS		+= -D__DEVICE_EMULATION__
		CFLAGS			+= -D__DEVICE_EMULATION__
	endif
	TARGETDIR := $(BINDIR)/$(BINSUBDIR)
	TARGET    := $(TARGETDIR)/$(EXECUTABLE)
	LINKLINE  = $(LINK) -o $(TARGET) $(OBJS) $(LIB)
endif

I’m particularly referring to line 10 of that snippet (the LIB += -lcutil… -lshrutil… line). It doesn’t have anything like the -L that you were talking about… Should I be looking elsewhere, or do you think the solution may be in common.mk?

Thanks!
M

P.S.: Also, I just looked inside the Makefile located in SDK/C/ and there is a rule referring to libcutil.so:

lib/libcutil.so:
	@$(MAKE) -C common

I wonder why it’s spitting out a .a when the rule clearly says it’s supposed to make lib/libcutil.so… -_-

Alright, I think I figured it out!

In common.mk, after this section of code:

# Libs
ifneq ($(DARWIN),)
    LIB       := -L$(CUDA_INSTALL_PATH)/lib -L$(LIBDIR) -L$(COMMONDIR)/lib/$(OSLOWER) -L$(SHAREDDIR)/lib $(NVCUVIDLIB) 
else
  ifeq "$(strip $(HP_64))" ""
    ifeq ($(x86_64),1)
       LIB       := -L$(CUDA_INSTALL_PATH)/lib64 -L$(LIBDIR) -L$(COMMONDIR)/lib/$(OSLOWER) -L$(SHAREDDIR)/lib  
    else
       LIB       := -L$(CUDA_INSTALL_PATH)/lib -L$(LIBDIR) -L$(COMMONDIR)/lib/$(OSLOWER) -L$(SHAREDDIR)/lib
    endif
  else
    ifeq ($(i386),1)
       LIB       := -L$(CUDA_INSTALL_PATH)/lib -L$(LIBDIR) -L$(COMMONDIR)/lib/$(OSLOWER) -L$(SHAREDDIR)/lib
    else
       LIB       := -L$(CUDA_INSTALL_PATH)/lib64 -L$(LIBDIR) -L$(COMMONDIR)/lib/$(OSLOWER) -L$(SHAREDDIR)/lib
    endif
  endif
endif

I added the line:

LIB += -L/home/taylor/NVIDIA_GPU_Computing_SDK/shared/lib -L/home/taylor/NVIDIA_GPU_Computing_SDK/C/lib

This allowed me to compile the WordCount project.
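
For anyone following along, here is a rough way to check which directories actually hold the two libraries before hard-coding -L paths. It is only a sketch: SDK_ROOT and find_sdk_libs are names made up for this example, and the default path assumes the SDK sits in the home directory, as in the Makefile's include line above.

```shell
#!/bin/sh
# Assumption: the SDK lives under $HOME; override SDK_ROOT if yours does not.
SDK_ROOT="${SDK_ROOT:-$HOME/NVIDIA_GPU_Computing_SDK}"

# Hypothetical helper: list every cutil/shrutil library file below a directory.
# $1 = directory to search
find_sdk_libs() {
    find "$1" \( -name 'libcutil*' -o -name 'libshrutil*' \) 2>/dev/null | sort
}

# Print each directory that contains one of the libraries.
# These directories are the candidates for -L in common.mk.
find_sdk_libs "$SDK_ROOT" | while read -r lib; do
    dirname "$lib"
done | sort -u
```

If the directories printed here are not among the -L entries in the LIB lines above, the linker will not find -lcutil or -lshrutil, and an extra LIB += -L... line like mine is one way around that.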

This may be out of the scope of this forum now, but when I run the application (./WordCount test.txt) it fails with a segmentation fault (core dumped). If I remember correctly, that means the program is trying to read or write memory that it shouldn’t, right?

Or could this be related to CUDA issues?

Thanks,
M

I traced the segmentation fault down to a section of code where the Mars developers are freeing a bunch of device variables.

	printf("\n\nTESTTTT\n\n");
	cudaFree(d_interKeysSizePerTask);
	
	cudaFree(d_interValsSizePerTask);
	cudaFree(d_interCountPerTask);
	
	cudaFree(d_keyValOffsets);
	cudaFree(d_curIndex);
	
	//Segmentation Fault here
	cudaFree(d_inputKeys);
	
	
	cudaFree(d_inputVals);
	cudaFree(d_inputOffsetSizes);
	
	cudaFree(d_psKeySizes);
	cudaFree(d_psValSizes);
	cudaFree(d_psCounts);

	endTimer("Map", &mapTimer);
	return 0;

If the “\n\nTESTTTT\n\n” printf is at the top, it always shows, but if I move it to the middle or near the bottom, it only shows every now and then. So it seems the seg fault is happening on different cudaFree() calls from run to run.

Update: the problem was ‘fixed’ after a reboot… I’m not sure what happened, but maybe the GPU got locked up or something… Either way, I’ve moved on to debugging the application, because though it’s “running”, it doesn’t seem to actually be a working example of a word counter… It only counts the words that are at least 6 characters long. Pretty weird.

Currently, I’ve got the project compiling and running in Nsight Eclipse Edition, but I cannot get it to debug correctly. It lets me start the debug session because I’ve appended the flags -g -O0 to gcc and g++, and -G -g to nvcc. But once I reach a line that deals with device memory, debugging stops with:

All CUDA devices are used for display and cannot be used while debugging

I have a GTX 670M, and I’ve never run into this issue on the Windows side when I’ve debugged applications in VS2013. So, as far as I’m aware, my GPU should be capable of debugging while displaying.

Any ideas?

Search for CUDA DEBUG PREEMPTION.
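
To spell that out a little: as far as I recall from the cuda-gdb documentation, software preemption is switched on with an environment variable, set in the shell that launches the debugger. Treat the sketch below as an assumption to verify against the docs for your toolkit version. I believe the feature was beta in this era and limited to newer GPUs, so it may not be supported on every card.

```shell
#!/bin/sh
# Assumption: cuda-gdb reads this variable to enable software preemption,
# which lets it debug on the same GPU that is driving the display.
export CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1

# Launch the debugger (or Nsight) from this same shell so it inherits the
# variable, for example:
#   cuda-gdb ./WordCount
```

If the card does not support it, the usual fallbacks are to debug from a console with the desktop GUI stopped, or to use a second GPU for the display.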

This thread saved me. I just got Mars running myself. Thank you to all involved. Myg0t, did you ever solve your word count problem? Look in the file map.cu. There are two lines that say “wsize > 6”; the 6 can be replaced with any other minimum size.
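
If you would rather not edit the file by hand, something like this does the replacement for both lines at once. It is a sketch only: set_min_wsize is a helper name invented here, and it assumes the text “wsize > 6” appears in map.cu exactly as quoted above.

```shell
#!/bin/sh
# Hypothetical helper: change the minimum word size used in map.cu.
# Usage: set_min_wsize FILE NEW_MIN
set_min_wsize() {
    file=$1
    new_min=$2
    # Show the lines that will change; stop if the pattern is not present.
    grep -n 'wsize > 6' "$file" || return 1
    # Keep the original as FILE.bak, then write the edited copy back.
    cp "$file" "$file.bak" &&
        sed "s/wsize > 6/wsize > $new_min/g" "$file.bak" > "$file"
}

# Example (the value 2 is arbitrary):
#   set_min_wsize map.cu 2
```

After changing the threshold, rebuild with sh make.sh as before.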

Thanks again guys.