ApacheCon NA 2015 has ended
Tuesday, April 14 • 3:00pm - 3:50pm
Keep Me in the Loop: INotify in the Apache Hadoop Distributed Filesystem - Colin McCabe, Cloudera

An elephant never forgets, at least not if that elephant is Apache Hadoop. The Hadoop Distributed Filesystem (HDFS) can store petabytes of data. Services that run on top of HDFS often want to cache or index some of that data. When files in HDFS change, or when more files are added, these services need to update their caches and indices.

The new HDFS inotify API allows applications to listen for changes to files stored in HDFS. Instead of periodically rescanning the filesystem, applications can simply receive notifications about changes. In this talk, I will cover the design goals for INotify and how we accomplished them. I will talk about how other projects can make effective use of the new API. Finally, I'll discuss some ideas we might explore in the future.


Colin McCabe

Software Engineer, Cloudera
Colin McCabe is a Platform Software Engineer at Cloudera, where he works on HDFS and related technologies. He is a committer on HDFS. Prior to joining Cloudera, he worked on the Ceph Distributed Filesystem and the Linux kernel, among other things. He studied Computer Science and...

Tuesday April 14, 2015 3:00pm - 3:50pm CDT
Texas VI
