HDFS Commands Interview Questions
1). What is the difference between the commands hadoop dfs and hadoop fs?
hadoop dfs - Intended only for the HDFS file system, and deprecated. Use hdfs dfs instead.
hadoop fs - The generic file system shell. It can interact with any file system Hadoop supports, such as the local file system, HDFS, and S3.
2). HDFS or Hadoop command to display the contents of a directory
hadoop fs -ls <hdfs path>
(or)
hdfs dfs -ls <hdfs path>
3). HDFS or Hadoop command to make a directory in HDFS
hadoop fs -mkdir <path>
(or)
hdfs dfs -mkdir <path>
4). Hadoop or HDFS command to create an empty (zero-length) file in HDFS
hadoop fs -touchz <path/some_touch_file_name>
(or)
hdfs dfs -touchz <path/some_touch_file_name>
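As a minimal sketch of how -mkdir and -touchz are often combined, the function below creates a directory and then drops an empty marker file into it. The function name init_dir and the marker name _READY are hypothetical, chosen only for illustration; the -p option asks -mkdir to create missing parent directories.

```shell
# Create an HDFS directory (with parents) and place an empty marker file in it.
# Assumes the hdfs CLI is on PATH. "_READY" is a hypothetical marker name.
init_dir() {
    local dir="$1"
    # -p: create parent directories as needed, no error if the directory exists
    hdfs dfs -mkdir -p "$dir" && hdfs dfs -touchz "$dir/_READY"
}

# Usage (placeholder path):
#   init_dir /data/incoming
```

The marker file is created only if the directory was created successfully, because the two commands are chained with &&.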
5). Hadoop or HDFS command to copy a file from the local file system to HDFS
hadoop fs -copyFromLocal <local_source_file_path> <destination_hdfs_path>
(or)
hdfs dfs -copyFromLocal <local_source_file_path> <destination_hdfs_path>
6). Hadoop or HDFS command to copy a file from HDFS to the local file system
hadoop fs -copyToLocal <source_hdfs_path> <local_destination_file_path>
(or)
hdfs dfs -copyToLocal <source_hdfs_path> <local_destination_file_path>
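The two copy commands can be combined into a simple round trip. The sketch below uploads a local file and then downloads it to a second local path, stopping at the first failure. The function name round_trip and all paths are placeholders, not part of Hadoop.

```shell
# Upload a local file to HDFS, then download it again to a second local path.
# Assumes the hdfs CLI is on PATH; every path is supplied by the caller.
round_trip() {
    local local_src="$1"    # existing local file
    local hdfs_dst="$2"     # target path in HDFS
    local local_copy="$3"   # where to write the downloaded copy

    hdfs dfs -copyFromLocal "$local_src" "$hdfs_dst" || return 1
    hdfs dfs -copyToLocal "$hdfs_dst" "$local_copy" || return 1
}

# Usage (placeholder paths):
#   round_trip ./report.csv /user/me/report.csv ./report_copy.csv
```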
7). Hadoop or HDFS command to display or view contents of a file in HDFS
hadoop fs -cat <hdfs_file_path>
(or)
hdfs dfs -cat <hdfs_file_path>
8). Hadoop or HDFS command to copy files from one location to another location in HDFS
hadoop fs -cp <hdfs_source_path> <hdfs_destination_path>
(or)
hdfs dfs -cp <hdfs_source_path> <hdfs_destination_path>
9). Hadoop or HDFS command to move files from one location to another in HDFS
hadoop fs -mv <hdfs_source_path> <hdfs_destination_path>
(or)
hdfs dfs -mv <hdfs_source_path> <hdfs_destination_path>
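10). Hadoop or HDFS command to move files from the local file system to HDFS
hadoop fs -moveFromLocal <local_source_path> <hdfs_destination_path>
(or)
hdfs dfs -moveFromLocal <local_source_path> <hdfs_destination_path>
Note: Unlike -copyFromLocal, -moveFromLocal deletes the local source after it has been copied.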
11). Hadoop or HDFS command to merge files in HDFS into a single file on the local file system
hadoop fs -getmerge -nl <hdfs_file_path_1> <hdfs_file_path_2> <local_destination_path>
(or)
hdfs dfs -getmerge -nl <hdfs_file_path_1> <hdfs_file_path_2> <local_destination_path>
Note: The optional -nl flag adds a newline at the end of each merged file.
12). Hadoop or HDFS command to append the contents of one or more local files to a destination file in HDFS
hadoop fs -appendToFile <local_source_file_path_1> <local_source_file_path_2> <hdfs_destination_file_path>
(or)
hdfs dfs -appendToFile <local_source_file_path_1> <local_source_file_path_2> <hdfs_destination_file_path>
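-appendToFile reads its sources from the local file system, and it also reads standard input when the source is given as "-". A minimal sketch of appending a single line that way is shown below; the function name append_line is hypothetical.

```shell
# Append one line of text to a file in HDFS by piping it through stdin.
# Assumes the hdfs CLI is on PATH; "-" tells -appendToFile to read stdin.
append_line() {
    local line="$1"   # text to append
    local dst="$2"    # destination file in HDFS
    printf '%s\n' "$line" | hdfs dfs -appendToFile - "$dst"
}

# Usage (placeholder path):
#   append_line "job finished" /logs/jobs.log
```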
13). Hadoop or HDFS command to check the health or status of a file in HDFS
hdfs fsck <hdfs_path>
Note: fsck is a subcommand of hdfs itself. It is not an option of hadoop fs or hdfs dfs.
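A health check can be scripted by running fsck through the hdfs entry point (hdfs fsck) and inspecting its report. The sketch below assumes the summary line of the report contains the text "is HEALTHY" for a healthy path, which is how the report usually ends; verify this against the output of your Hadoop version. The function name is_healthy is hypothetical.

```shell
# Succeed (exit status 0) if the fsck report for a path says it is healthy.
# Assumes the hdfs CLI is on PATH and that the report summary contains
# "is HEALTHY"; this wording is an assumption about the report format.
is_healthy() {
    hdfs fsck "$1" | grep -q "is HEALTHY"
}

# Usage (placeholder path):
#   is_healthy /user/me/data && echo "ok"
```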
14). Hadoop or HDFS command to count the directories, files, and bytes under a path
hadoop fs -count <hdfs_path>
(or)
hdfs dfs -count <hdfs_path>
Note: The output columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME.
15). Hadoop or HDFS command to check disk usage of files in HDFS
hadoop fs -du <hdfs_path>
(or)
hdfs dfs -du <hdfs_path>
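The output of -count and -du is column based, so individual values can be picked out with awk. The sketch below relies on the second column of -count being the file count and the first column of -du -s being the total size in bytes. The function names are hypothetical.

```shell
# Print the number of files under an HDFS path.
# -count prints: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
file_count() {
    hdfs dfs -count "$1" | awk '{ print $2 }'
}

# Print the total size in bytes of an HDFS path.
# -du -s prints one summary line whose first column is the size in bytes.
dir_size_bytes() {
    hdfs dfs -du -s "$1" | awk '{ print $1 }'
}

# Usage (placeholder path):
#   file_count /user/me/data
#   dir_size_bytes /user/me/data
```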
16). Hadoop or HDFS command to check if a path is a directory or not
hadoop fs -test -d <hdfs_path>
(or)
hdfs dfs -test -d <hdfs_path>
17). Hadoop or HDFS command to check if a path is a file or not
hadoop fs -test -f <hdfs_path>
(or)
hdfs dfs -test -f <hdfs_path>
18). Hadoop or HDFS command to check if a path exists or not
hadoop fs -test -e <hdfs_path>
(or)
hdfs dfs -test -e <hdfs_path>
19). Hadoop or HDFS command to check if a file is empty or not in HDFS
hadoop fs -test -z <hdfs_file_path>
(or)
hdfs dfs -test -z <hdfs_file_path>
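The -test command prints nothing; it reports its answer through the exit status (0 when the check is true). A minimal sketch of combining the -e, -d, and -z checks to classify a path is shown below. The function name describe_path is hypothetical.

```shell
# Classify an HDFS path using the exit status of hdfs dfs -test.
# Assumes the hdfs CLI is on PATH. Exit status 0 means the check is true.
describe_path() {
    local path="$1"
    if ! hdfs dfs -test -e "$path"; then
        echo "missing"            # path does not exist
    elif hdfs dfs -test -d "$path"; then
        echo "directory"
    elif hdfs dfs -test -z "$path"; then
        echo "empty file"         # exists, not a directory, zero length
    else
        echo "non-empty file"
    fi
}

# Usage (placeholder path):
#   describe_path /user/me/data/part-00000
```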
21). Hadoop or HDFS command to delete a file in HDFS
hadoop fs -rm <hdfs_file_path>
(or)
hdfs dfs -rm <hdfs_file_path>
Note: To delete a directory and its contents, add the -r option: hdfs dfs -rm -r <hdfs_directory_path>
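Because plain -rm refuses to remove a directory, a script can check the path type first and add -r only when needed. A minimal sketch, with the hypothetical function name remove_path:

```shell
# Delete an HDFS path, adding -r only when the path is a directory.
# Assumes the hdfs CLI is on PATH.
remove_path() {
    local path="$1"
    if hdfs dfs -test -d "$path"; then
        hdfs dfs -rm -r "$path"   # directory: remove recursively
    else
        hdfs dfs -rm "$path"      # regular file
    fi
}

# Usage (placeholder path):
#   remove_path /user/me/old_output
```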
22). Hadoop or HDFS command to check the metadata (size, replication, owner, and so on) of a file in HDFS
hadoop fs -stat %b <hdfs_path>
(or)
hdfs dfs -stat %b <hdfs_path>
Note: %b prints the file size in bytes. Other format specifiers report other metadata: %r -> replication factor, %g -> group, %u -> owner user name, %y -> last modification time.
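Several format specifiers can be combined with literal text in one -stat format string. The sketch below prints a one-line summary for a path; the function name file_summary and the label text are hypothetical.

```shell
# Print size, replication, owner, group, and modification time in one line.
# Assumes the hdfs CLI is on PATH. The format is quoted so the shell passes
# it to -stat as a single argument.
file_summary() {
    hdfs dfs -stat "size=%b replication=%r owner=%u group=%g modified=%y" "$1"
}

# Usage (placeholder path):
#   file_summary /user/me/data/part-00000
```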
23). Hadoop or HDFS command to permanently delete files from the trash in HDFS
hadoop fs -expunge
(or)
hdfs dfs -expunge
24). Hadoop or HDFS command to change replication factor of a file
hadoop fs -setrep -w 3 <hdfs_file_path>
(or)
hdfs dfs -setrep -w 3 <hdfs_file_path>
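The -w flag makes -setrep wait until replication has completed, which can take a long time on large files. A minimal sketch of changing the replication factor and then confirming it with -stat %r is shown below; the function name set_replication is hypothetical.

```shell
# Set the replication factor of an HDFS file and confirm the new value.
# Assumes the hdfs CLI is on PATH. Returns non-zero if -setrep fails or the
# factor reported by -stat %r differs from the requested one.
set_replication() {
    local factor="$1"
    local path="$2"
    hdfs dfs -setrep -w "$factor" "$path" || return 1
    [ "$(hdfs dfs -stat %r "$path")" = "$factor" ]
}

# Usage (placeholder path):
#   set_replication 3 /user/me/data/part-00000
```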