Stern Center for Research Computing

New York University • Leonard Stern School of Business

Running hadoop, hive and mahout at the Stern Center for Research Computing

First, you must have your Stern userid enabled for hadoop. To do that, please send an email to research@stern.nyu.edu, or call the help desk at 212-998-0180 and create a ticket for research computing.

To access hadoop, log in with ssh:

ssh yournetid@bigdata.stern.nyu.edu

Typing

hadoop fs -mkdir test

should create a directory “test” in /user/yournetid (which is your default folder in the hadoop file system).

Type

hadoop fs -lsr

and you will get a recursive list of all of your files in hadoop.
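As a rough illustration of how names without a leading slash resolve, the shell sketch below builds the full hdfs path for a given name. The function name hdfs_full_path is hypothetical (it is not part of hadoop), and the sketch assumes the default folder is /user/yournetid as described above. It only prints a path and does not contact the cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of hadoop): show where a name ends up in hdfs.
# Assumes the default folder is /user/<netid>, as described above.
hdfs_full_path() {
    netid=$1
    name=$2
    case "$name" in
        /*) printf '%s\n' "$name" ;;                    # absolute paths are used as given
        *)  printf '/user/%s/%s\n' "$netid" "$name" ;;  # relative names land in the default folder
    esac
}
```

For example, hdfs_full_path yournetid test prints /user/yournetid/test, which is where the directory created by the mkdir command should appear.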

hive

will start the hive command-line environment.

mahout options

will run a mahout job.

Important things to remember.

hadoop keeps all of its files in its own file system, called “hdfs”. You need to copy your files from linux to the hadoop file system with the

hadoop fs -put /mylocalpath/mylocalfile myhadoopfilename

command. That will copy the file at

/mylocalpath/mylocalfile

to

myhadoopfilename

in hdfs, at the full path hdfs:/user/yournetid/myhadoopfilename
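The copy step can be wrapped in a small script, sketched here under some assumptions. The names run, put_and_list and DRY_RUN are hypothetical and chosen only for this illustration; the hadoop fs -put and hadoop fs -ls commands are the ones shown above. By default the sketch only prints the commands, so it can be tried on a machine without hadoop.

```shell
#!/bin/sh
# Sketch: copy a local file into hdfs, then list it to confirm it arrived.
# run, put_and_list and DRY_RUN are hypothetical names for this illustration.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

put_and_list() {
    local_file=$1
    hdfs_name=$2
    # Stop if the copy fails, so the listing is not misleading.
    run hadoop fs -put "$local_file" "$hdfs_name" || return 1
    run hadoop fs -ls "$hdfs_name"
}
```

Calling put_and_list /mylocalpath/mylocalfile myhadoopfilename prints the two hadoop commands. Setting DRY_RUN=0 first should run them for real on bigdata.stern.nyu.edu.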

Good luck.