Until now, we have referred to HDFS as the Hadoop filesystem. In reality, Hadoop has a rather abstract notion of a filesystem: HDFS is only one of several implementations of the `org.apache.hadoop.fs.FileSystem` Java abstract class. A list of available filesystems can be found at https://hadoop.apache.org/docs/r2.5.0/api/org/apache/hadoop/fs/FileSystem.html. The following table summarizes some of these filesystems, along with the corresponding URI scheme and Java implementation class.
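| Filesystem | URI scheme | Java implementation |
|---|---|---|
| Local | `file` | `org.apache.hadoop.fs.LocalFileSystem` |
| HDFS | `hdfs` | `org.apache.hadoop.hdfs.DistributedFileSystem` |
| S3 (native) | `s3n` | `org.apache.hadoop.fs.s3native.NativeS3FileSystem` |
| S3 (block-based) | `s3` | `org.apache.hadoop.fs.s3.S3FileSystem` |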
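To make the role of the URI scheme concrete, here is a minimal sketch of how a scheme could select an implementation class. It uses only the Java standard library and is a simplified illustration, not Hadoop's actual lookup code. The class name `SchemeLookup`, the method `implementationFor`, and the sample host and bucket names are hypothetical; the scheme strings and class names are the ones Hadoop 2 ships for these four filesystems.

```java
import java.net.URI;
import java.util.Map;

public class SchemeLookup {
    // URI scheme -> implementation class name for four Hadoop 2 filesystems.
    // A hard-coded map is an assumption made for illustration only.
    private static final Map<String, String> IMPLEMENTATIONS = Map.of(
        "file", "org.apache.hadoop.fs.LocalFileSystem",
        "hdfs", "org.apache.hadoop.hdfs.DistributedFileSystem",
        "s3n", "org.apache.hadoop.fs.s3native.NativeS3FileSystem",
        "s3", "org.apache.hadoop.fs.s3.S3FileSystem");

    // Returns the implementation class name for the scheme of the given location.
    static String implementationFor(String location) {
        String scheme = URI.create(location).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("No URI scheme in: " + location);
        }
        String implementation = IMPLEMENTATIONS.get(scheme);
        if (implementation == null) {
            throw new IllegalArgumentException("Unknown filesystem scheme: " + scheme);
        }
        return implementation;
    }

    public static void main(String[] args) {
        // Hypothetical locations; only the scheme matters for the lookup.
        System.out.println(implementationFor("file:///tmp/input.txt"));
        System.out.println(implementationFor("hdfs://namenode:8020/user/data"));
        System.out.println(implementationFor("s3n://my-bucket/logs/part-00000"));
    }
}
```

In Hadoop itself, client code does not perform this lookup by hand: calling `FileSystem.get(URI, Configuration)` returns an instance of the implementation that matches the scheme of the URI.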
There are two implementations of the S3 filesystem. The native implementation, `s3n`, is used to read and write regular files. Data stored using `s3n` can be accessed by any tool and conversely can be used to read data generated...