ApacheCon NA 2015 has ended

What's New At the ASF
Monday, April 13

10:45am CDT

Apache Incubator: Where It Is Coming From and Where It Is Going - Roman Shaposhnik, Pivotal
If you think you know the Apache Incubator and its role in the ASF, think again! All the projects in the foundation are constantly evolving, and the Apache Incubator is no exception. Until recently, it was the only gateway into the ASF family for new projects. While that is still predominantly the case, the emergence of pTLPs and a fundamental rethinking of the mentorship approach make this the most exciting time for new projects to come to the foundation. This presentation will cover the old-school Incubator policies and will point out how they are changing and what new alternatives are now available for podling communities. It will also focus on areas where we are still experimenting with the process, how it relates to the ASF board of directors, and how you can help speed things up. Finally, a few battle stories will be shared and wounds put on display. This outgoing Chair has a few to show.

Roman Shaposhnik

Director of Open Source, Linux Foundation
Apache Software Foundation and Data, oh but also unikernels

Monday April 13, 2015 10:45am - 11:35am CDT
Texas V

11:45am CDT

Introduction to Zeppelin - Moon soo Lee, NFLabs
Apache Zeppelin (incubating) is an interactive data analytics environment for distributed data processing systems. It provides a beautiful interactive web-based interface, data visualization, a collaborative work environment, and many other nice features to make your data analytics more fun and enjoyable. Moon soo Lee will demo Zeppelin's features to show how it helps data analytics.

Zeppelin provides integration with Apache Spark, yet its flexible architecture can support various data processing backends. This presentation will describe, with an example, how other projects integrate Zeppelin and leverage it.

It will also discuss the current project roadmap.

Moon soo Lee

CTO, NFLabs
Moon soo Lee is the creator of Apache Zeppelin and a co-founder and CTO at NFLabs. For the past few years he has been working on bootstrapping the Zeppelin project and its community. His recent focus is growing the Zeppelin community and driving adoption.

Monday April 13, 2015 11:45am - 12:35pm CDT
Texas V

2:00pm CDT

Integrating Event Streams and File Data with Apache Flume and Apache NiFi - Joseph Echeverria, Scaling Data
Large-scale data analysis often requires merging event-based data and file-based data. Often two or more tools are required to ingest these types of data. In this presentation, Joey Echeverria will explore how to integrate Apache Flume and Apache NiFi, which is currently undergoing incubation at the ASF, to build complex data flows that enable advanced analysis.


Joseph Echeverria

Platform Technical Lead, Rocana
Joey Echeverria builds systems for reliably ingesting and transforming large amounts of data. He is the platform technical lead at Rocana where he builds scalable applications for monitoring IT operations. At Rocana he has applied his data infrastructure expertise to design and implement...

Monday April 13, 2015 2:00pm - 2:50pm CDT
Texas V

3:00pm CDT

Apache Lens: Cut Data Analytics Silos in your Enterprise - Sharad Agarwal, FlipKart & Amareshwari Sriramadasu, Inmobi
Apache Lens enables multi-dimensional queries in a unified way over datasets stored in multiple warehouses. It allows queries to be executed where the data resides while providing a logical data cube abstraction. In a typical enterprise, multiple data warehouses co-exist, as a single one does not address the needs of all workloads in a cost-effective way. Apache Hive is one of the most widely used data warehouses in the Hadoop ecosystem. Traditional columnar data warehouses complement Apache Hive for summarized and very frequently accessed data. Having multiple data warehouses leads to data silos, which Lens aims to cut within the enterprise by providing holistic, unified access.
In this talk, Sharad Agarwal and Amareshwari Sriramadasu will present the current and upcoming features. They will also give a live demonstration of the salient features of Apache Lens.

Sharad Agarwal

Sharad is an Apache Hadoop committer and PMC member and has been active in the Hadoop community for over 5 years. He has been involved with YARN since it was in the concept stage. He is the author of the YARN core runtime libraries and the Map-Reduce Application Master. Sharad is the founding member, committer...

Amareshwari Sriramadasu

Architect, Inmobi
Amareshwari is currently working as an Architect on the data team at Inmobi, where she works on Hadoop and related projects for data collection and analytics. She is a member of the ASF, Apache Incubator PMC, Apache Hadoop PMC, Apache Lens PMC and Apache Falcon PMC, and is an Apache Hive committer...

Monday April 13, 2015 3:00pm - 3:50pm CDT
Texas V

4:00pm CDT

Apache MRQL (incubating): Advanced Query Processing for Complex, Large-Scale Data Analysis - Leonidas Fegaras, Univ. of Texas at Arlington
Apache MRQL (incubating) is a new query processing system for large-scale, distributed data analysis. MRQL is more powerful than other query languages for distributed data analysis because it can operate on more complex data and supports more powerful query constructs. With MRQL, users are able to express complex data analysis tasks, such as PageRank, k-means clustering, and matrix factorization, using declarative queries exclusively, while the MRQL query processing system is able to compile these queries to efficient Java code that can run on multiple distributed processing platforms (currently Apache Hadoop MapReduce, Hama, Spark, and Flink). In this presentation, Leonidas Fegaras will give a brief overview of the MRQL query language and query processing architecture, will compare MRQL with other related Apache projects, and will discuss current state and future plans.

Leonidas Fegaras

Associate Professor, Univ. of Texas at Arlington
Leonidas Fegaras is a Computer Science Professor at the University of Texas at Arlington (UTA). Prior to joining UTA, he was a Senior Research Scientist at the Oregon Graduate Institute in Portland, Oregon. He is a committer and PPMC member of Apache MRQL (incubating). Leonidas has...

Monday April 13, 2015 4:00pm - 4:50pm CDT
Texas V

5:00pm CDT

Introducing Apache HTrace: An End-to-End Tracing Framework for Distributed Systems - Colin McCabe, Cloudera
Apache HTrace is a new Incubator project that makes it easier to monitor and understand the performance of distributed systems. HTrace aims to provide a truly end-to-end, cluster-wide view of how requests are processed in a production distributed system, similar to Google's Dapper or the XTrace network tracing framework.

I will talk about the architecture of HTrace and how it fits into the stack. There is a lot happening in the HTrace project, and I will discuss some of the new features that are on the horizon, such as the web interface and htraced daemon. We are working on integrating HTrace into a few Apache projects such as HDFS, HBase, and Accumulo, and we hope to have many more in the future. I will talk about how developers and users can get involved with the HTrace community.


Colin McCabe

Software Engineer, Cloudera
Colin McCabe is a Platform Software Engineer at Cloudera, where he works on HDFS and related technologies. He is a committer on HDFS. Prior to joining Cloudera, he worked on the Ceph Distributed Filesystem, and the Linux kernel, among other things. He studied Computer Science and...

Monday April 13, 2015 5:00pm - 5:50pm CDT
Texas V
Tuesday, April 14

10:40am CDT

But We're Already Open Source! Why Would I Want To Bring My Code To Apache? - Nick Burch, Quanticate
So, your business has already open sourced some of its code? Great! But now, someone's asking you about giving it to these Apache people? What's up with that, and why isn't just being open source enough?

In this talk, we'll look at several real world examples of where companies have chosen to contribute their existing open source code to the Apache Software Foundation. We'll see the advantages they got from it, the problems they faced along the way, why they did it, and how it helped their business. We'll also look briefly at where it may not be the right fit.

If you are wondering how to take your business's open source involvement to the next level, and whether contributing to projects at the Apache Software Foundation will deliver ROI, then this is the talk for you!

Nick Burch

CTO, Quanticate
Nick began contributing to Apache projects in 2003, and hasn't looked back since! Most of the projects Nick has worked in belong in the "Content" space, such as Apache POI (ex-PMC Chair), Apache Tika and Apache Chemistry. As well as coding projects, Nick is also involved in a number...

Tuesday April 14, 2015 10:40am - 11:30am CDT
Texas V

11:40am CDT

Pulsar: Realtime Analytics at Scale Leveraging Kafka, Hadoop and Kylin - Tony Ng, eBay
Enterprises are increasingly demanding realtime analytics and insights to power use cases like personalization, monitoring and marketing. We will present Pulsar, a realtime streaming system used at eBay which can scale to millions of events per second with high availability and SQL-like language support, enabling realtime data enrichment, filtering and multi-dimensional metrics aggregation.

We will discuss how Pulsar integrates with a number of open source Apache technologies like Kafka, Hadoop and Kylin (Apache Incubator) to achieve high scalability, availability and flexibility. We use Kafka to replay unprocessed events to avoid data loss and to stream realtime events into Hadoop, enabling reconciliation of data between realtime and batch. We use Kylin to provide multi-dimensional OLAP capabilities.


Tony Ng

Tony Ng is a Director of Engineering at eBay, Inc where he leads the User Behavior Analytics, Experimentation and Marketing Platform products. At eBay, Tony has been involved in building eBay's core platforms and services, including cloud, big data analytics, real-time streaming...

Tuesday April 14, 2015 11:40am - 12:30pm CDT
Texas V

3:00pm CDT

Apache Brooklyn: from YAML Blueprints to Autonomic Management - Alex Heneveld, Cloudsoft
Apache Brooklyn (incubating) provides a deploy-and-manage framework for any application. We'll show how blueprints for standing up complex software (on metal, in clouds, Docker or PaaS) are written in Brooklyn, using the YAML syntax or the Java libraries, and then how at runtime a blueprint becomes a model of the running software. This view is essentially a control plane for applications, and this talk will cover autonomic control policies including scaling, failover, and upgrades, as part of the testable, source-controlled blueprint. With example blueprints including Apache Spark, Kafka, Hadoop and Ambari, Cassandra and ActiveMQ, Brooklyn can offer many projects better integration testing, a streamlined user first-touch experience, and runtime ops dashboards. And as an incubation project, we're on the hunt for more beneficiaries and contributors.

Alex Heneveld

Co-founder & CTO, Cloudsoft
Alex Heneveld is one of the creators of Apache Brooklyn, and CTO and co-founder at Cloudsoft Corporation where he works with companies large and small to build their application management strategy. With the surprisingly controversial view that applications are more important than...

Tuesday April 14, 2015 3:00pm - 3:50pm CDT
Texas V

4:20pm CDT

Apache Ignite (incubating): Anatomy of an In-Memory Data Fabric - Dmitriy Setrakyan, GridGain
In this presentation, we will describe the strategy and architecture behind Apache Ignite™ (incubating), a high-performance, distributed in-memory data management software layer that has been designed to operate between both new and existing data sources and applications, boosting application performance and scale by orders of magnitude. We will dive into the technical details of distributed clusters and compute grids as well as distributed data grids, and provide code samples for each. As integral parts of an In-Memory Data Fabric, we will also cover distributed streaming, CEP and Hadoop acceleration. This presentation is particularly relevant for software developers and architects who work on the front lines of high-speed, low-latency big data systems, high-performance transactional systems and real-time analytics applications. Apache Ignite is either a registered trademark or a trademark of the Apache Software Foundation in the United States and/or other countries.

Dmitriy Setrakyan

Co-Founder and EVP of Engineering, GridGain
Dmitriy Setrakyan is co-founder and EVP of Engineering at GridGain Systems. Dmitriy has been designing, architecting and developing software and applications for over 15 years and has expertise in the development of distributed computing systems, middleware platforms, financial trading...

Tuesday April 14, 2015 4:20pm - 5:10pm CDT
Texas I

4:20pm CDT

Apache Olingo - From Incubation to a Real Olingo (Apache TLP) - Michael Bolz
The Apache Olingo project currently contains Java and JavaScript libraries that support implementing an OData service as a server, as well as consuming an OData service as a client.
Before the Apache Olingo project started, there was an earlier Open Source project from which we learnt that it's best to start from scratch and thereby avoid the failures of the past. By starting afresh, we could do everything better than before. This is how Apache Olingo was born.
In his presentation, Michael will explain…
* Why it was decided to start a new Open Source project (to learn from past mistakes)
* Why Apache was chosen
* The Apache journey (from Incubation to a Top-Level Project)
* The future of Olingo (not the attic)

Michael will explain the pros and cons of an open source project and talk about his own experiences and lessons learned as be

Michael Bolz

Developer, SAP SE
Michael Bolz has been working within the OData context for two years and during this time he has focused on the implementation of OData specification versions 2 and 4 as an Open Source Library named Apache Olingo.

Tuesday April 14, 2015 4:20pm - 5:10pm CDT
Texas V

5:20pm CDT

Apache NiFi: Better Analytics Demands Better Dataflow - Joseph Witt, Onyara
In this presentation, Joe Witt will outline the fundamental challenges of enterprise dataflow at scale and the resulting implications for analytics. Key capabilities of Apache NiFi (incubating) are designed to solve these dataflow challenges. Joe will address the importance of flow-based programming concepts, real-time command and control, and data provenance to provide a powerful platform to automate the flow of data between critical infrastructure systems in a complex globally distributed enterprise.


Joseph Witt

Joe Witt, Onyara Inc. In 2006, Joe Witt created a dataflow framework that grew into a community and evolved over eight years into what became Apache NiFi (incubating). Following NiFi's open source release by the NSA in 2014, Joe has become an active committer and member of the Apache...

Tuesday April 14, 2015 5:20pm - 6:10pm CDT
Texas V
Wednesday, April 15

2:15pm CDT

Subversion Error Messages Demystified - Stefan Sperling, elego Software Solutions GmbH
This talk presents a case study of obscure error messages raised by Apache Subversion, based on questions raised by Subversion users who ran into them. We'll discuss what developers and users can do to help raise the quality of error messages in Subversion and other projects.

Stefan Sperling

Stefan Sperling is a freelance Open Source Software developer and consultant based in Berlin. He has been involved in Apache Subversion development since 2007 and provides training and consulting services around Apache Subversion in partnership with elego Software Solutions Gm...

Wednesday April 15, 2015 2:15pm - 3:05pm CDT
Texas VI