The Hadoop Distributed File System (HDFS) is a distributed, scalable, and portable file system written in Java for the Hadoop framework.[1] HDFS usually runs on multiple servers: most are data servers that store the bulk of the data, while one so-called 'name node' keeps track of the directory structure.
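The division of labour between the name node and the data servers can be illustrated with a toy model. This is only a sketch and not the HDFS implementation: the class names, the helper functions `write_file` and `read_file`, and the tiny `BLOCK_SIZE` are all made up for illustration, and replication is left out entirely. It assumes that a file is split into fixed-size blocks, that the data servers hold the block contents, and that the name node holds only the mapping from a path to its blocks and their locations.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # bytes; deliberately tiny, hypothetical value for the demo


@dataclass
class DataNode:
    """Stores block contents only; knows nothing about file paths."""
    name: str
    blocks: dict = field(default_factory=dict)  # block_id -> bytes


@dataclass
class NameNode:
    """Stores metadata only: path -> ordered list of (block_id, node name)."""
    namespace: dict = field(default_factory=dict)


def write_file(name_node, data_nodes, path, data):
    """Split data into blocks, spread them round-robin, record the locations."""
    locations = []
    for offset in range(0, len(data), BLOCK_SIZE):
        index = offset // BLOCK_SIZE
        block_id = f"{path}#{index}"
        node = data_nodes[index % len(data_nodes)]
        node.blocks[block_id] = data[offset:offset + BLOCK_SIZE]
        locations.append((block_id, node.name))
    name_node.namespace[path] = locations


def read_file(name_node, data_nodes, path):
    """Ask the name node where the blocks are, then fetch them in order."""
    by_name = {node.name: node for node in data_nodes}
    return b"".join(
        by_name[node_name].blocks[block_id]
        for block_id, node_name in name_node.namespace[path]
    )


name_node = NameNode()
data_nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
write_file(name_node, data_nodes, "/logs/a.txt", b"hello world")
restored = read_file(name_node, data_nodes, "/logs/a.txt")
```

In this model the name node never touches file contents, which is why a single server can be enough for it while the bulk of the storage is spread over many machines.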

HDFS integrates well with other parts of the Hadoop stack and ecosystem. It is, in fact, generally only used in conjunction with other parts of the stack. These tools provide data-locality-aware access and simple APIs for performing distributed bulk operations on the data.
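Data-locality-aware access means, roughly, that work is sent to a machine that already stores the data instead of moving the data over the network. The following is a minimal sketch of that idea under simplifying assumptions; `pick_worker` and the host names are hypothetical, and this is not the scheduling logic of any particular Hadoop component.

```python
def pick_worker(block_hosts, free_workers):
    """Return a free worker, preferring one that already stores the block.

    block_hosts:  set of host names that hold a copy of the block
    free_workers: list of host names that can currently accept a task
    """
    for worker in free_workers:
        if worker in block_hosts:
            return worker  # local read, no network transfer needed
    # No local candidate: fall back to any free worker (remote read).
    return free_workers[0] if free_workers else None


local_choice = pick_worker({"dn2", "dn3"}, ["dn1", "dn3"])
remote_choice = pick_worker({"dn2"}, ["dn1"])
no_choice = pick_worker({"dn2"}, [])
```

Here `local_choice` is the worker that holds the block, `remote_choice` falls back to a worker that must read the block remotely, and `no_choice` is `None` because no worker is free.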

HDFS is well suited to very large data, in the realm of multiple gigabytes; it is a big data solution, after all.