Applications can access HDFS in several ways. Natively, HDFS provides a Java API for applications to use, and a C language wrapper for this Java API is also available. In addition, an HTTP browser can be used to browse the files of an HDFS instance. Work is in progress to expose HDFS through the WebDAV protocol.
FS Shell
HDFS allows user data to be organized in the form of files and directories. It provides a command line interface called FS shell that lets a user interact with the data in HDFS. The syntax of this command set is similar to other shells (e.g. bash, csh) that users are already familiar with. Here are some sample action/command pairs:
| Action | Command |
| --- | --- |
| Create a directory named /foodir | bin/hadoop dfs -mkdir /foodir |
| Remove a directory named /foodir | bin/hadoop dfs -rmr /foodir |
| View the contents of a file named /foodir/myfile.txt | bin/hadoop dfs -cat /foodir/myfile.txt |
FS shell is targeted at applications that need a scripting language to interact with the stored data.
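As a minimal sketch of how a script might drive FS shell, the functions below wrap the three sample commands from the table. The function names (hdfs_mkdir, hdfs_cat, hdfs_rmr) and the HADOOP variable are hypothetical and introduced only for illustration; the launcher path defaults to bin/hadoop, as in the examples, and is assumed to be run from the Hadoop installation directory.

```shell
#!/bin/sh
# Hypothetical variable: path to the hadoop launcher. Defaults to the
# relative path used in the sample commands; override it as needed.
HADOOP="${HADOOP:-bin/hadoop}"

# Create a directory in HDFS, e.g. hdfs_mkdir /foodir
hdfs_mkdir() {
    "$HADOOP" dfs -mkdir "$1"
}

# Print the contents of an HDFS file to standard output,
# e.g. hdfs_cat /foodir/myfile.txt
hdfs_cat() {
    "$HADOOP" dfs -cat "$1"
}

# Remove an HDFS directory, e.g. hdfs_rmr /foodir
hdfs_rmr() {
    "$HADOOP" dfs -rmr "$1"
}
```

Each function returns the exit status of the underlying command, so a calling script can stop on the first failure, for example by chaining calls with &&.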
DFSAdmin
The DFSAdmin command set is used for administering an HDFS cluster. These commands are intended for use only by an HDFS administrator. Here are some sample action/command pairs:
| Action | Command |
| --- | --- |
| Put the cluster in Safemode | bin/hadoop dfsadmin -safemode enter |
| Generate a list of DataNodes | bin/hadoop dfsadmin -report |
| Recommission or decommission DataNode(s) | bin/hadoop dfsadmin -refreshNodes |
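The administrative commands can be scripted in the same way. The sketch below wraps the three sample commands and combines two of them into a purely illustrative sequence that puts the cluster in Safemode and then prints the DataNode report. All function names and the HADOOP variable are hypothetical, and the sequence is an assumption about how an administrator might chain the commands, not a recommended procedure.

```shell
#!/bin/sh
# Hypothetical variable: path to the hadoop launcher, defaulting to the
# relative path used in the sample commands.
HADOOP="${HADOOP:-bin/hadoop}"

# Put the cluster in Safemode.
enter_safemode() {
    "$HADOOP" dfsadmin -safemode enter
}

# Print the list of DataNodes.
cluster_report() {
    "$HADOOP" dfsadmin -report
}

# Apply recommissioning or decommissioning of DataNode(s).
refresh_nodes() {
    "$HADOOP" dfsadmin -refreshNodes
}

# Illustrative sequence: enter Safemode, then print the report.
# Stops with a non-zero status if Safemode cannot be entered.
safemode_report() {
    enter_safemode || return 1
    cluster_report
}
```

Because these commands change cluster state, a script like this would normally be run only under an HDFS administrator account.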
Browser Interface
A typical HDFS installation configures a web server to expose the HDFS namespace through a configurable TCP port. This allows a user to navigate the HDFS namespace and view the contents of its files using a web browser.