Double Down with Apache Cassandra and Spark
Patrick McFadin
Chief Evangelist for Apache Cassandra, DataStax
Apache Cassandra has proven to be one of the best solutions for storing and retrieving time series data at high velocity and high volume. This talk gives an overview of the core Apache Cassandra concepts you need to use it successfully. We will discuss why Cassandra's storage model is well suited to this pattern and walk through examples of how best to build data models. There will also be examples of how to use Apache Spark alongside Apache Cassandra to create a real-time data analytics platform. It’s so easy, you will be shocked and ready to try it yourself.
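As a rough illustration of why the storage model suits time series, the sketch below mimics it with plain Python: readings are grouped into one partition per sensor per day, and rows inside a partition stay sorted by timestamp, so a time-range read is a contiguous slice. This is a toy in-memory model, not the Cassandra driver or CQL; the names `partitions`, `insert_reading`, and `range_query`, and the per-day bucketing, are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict
from datetime import datetime

# Toy model: partition key (sensor_id, day) -> rows sorted by timestamp,
# loosely mirroring a partition key plus a clustering column.
partitions = defaultdict(list)

def insert_reading(sensor_id, ts, value):
    """Store a reading in the partition for (sensor_id, day), keeping rows sorted."""
    insort(partitions[(sensor_id, ts.date())], (ts, value))

def range_query(sensor_id, start, end):
    """Return readings with start <= ts <= end; both bounds must fall on the same day."""
    rows = partitions.get((sensor_id, start.date()), [])
    lo = bisect_left(rows, (start,))                # first row with ts >= start
    hi = bisect_right(rows, (end, float("inf")))    # one past the last row with ts <= end
    return rows[lo:hi]

# Hypothetical readings, written out of order on purpose.
insert_reading("sensor-1", datetime(2015, 1, 1, 12, 0), 20.5)
insert_reading("sensor-1", datetime(2015, 1, 1, 10, 0), 19.0)
insert_reading("sensor-1", datetime(2015, 1, 2, 9, 0), 18.0)

result = range_query("sensor-1",
                     datetime(2015, 1, 1, 9, 0),
                     datetime(2015, 1, 1, 12, 0))
```

Bucketing by day keeps any single partition from growing without bound, while the sort order inside the partition makes "latest N readings" and time-window reads cheap.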
Patrick McFadin is one of the leading experts on Apache Cassandra and data modeling techniques. As the Chief Evangelist for Apache Cassandra and a consultant for DataStax, he has helped build some of the largest and most exciting deployments in production. Before DataStax, he was Chief Architect at Hobsons, an education services company, where he spoke often on web application design and performance.
Why Spark Is the Next Top (Compute) Model
Dean Wampler, Ph.D.
Architect for Big Data Products and Services, Typesafe
Spark is an open-source computation platform for Big Data that has emerged as the successor to MapReduce, the aging standard in Hadoop. This talk explains why this change was necessary.
Spark provides a concise core API that enables large MapReduce programs to be rewritten as small "scripts". Spark has excellent performance, with speedups over MapReduce of up to 100 times on some workloads. The core abstractions of Spark facilitate special-purpose APIs, such as Spark SQL, which combines SQL-based queries for asking questions with the core API for general-purpose programming. Other examples are the machine learning library, MLlib, and the graph algorithms library, GraphX. Finally, Spark supports event stream processing.
Using examples, we'll also see that the secret to Spark's success is its roots in the Scala programming language and the "combinators" from Functional Programming, which together provide concise, powerful primitives for composing a wide variety of high-performance applications.
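To give a feel for what "combinators" means here, the following is a minimal word-count sketch built from small composable functions. It uses plain Python collections as a stand-in, not Spark or Scala; the helpers `flat_map` and `reduce_by_key` are hypothetical names written for this example, chosen to echo the shape of the `flatMap`, `map`, and `reduceByKey` operations in Spark's core API. The input lines are made up.

```python
from functools import reduce
from itertools import chain

# Made-up input standing in for lines of a text file.
lines = ["spark makes big data simple", "big data needs simple tools"]

def flat_map(f, xs):
    """Apply f to each element and flatten the resulting sequences."""
    return chain.from_iterable(map(f, xs))

def reduce_by_key(f, pairs):
    """Combine the values of (key, value) pairs that share a key using f."""
    def step(acc, kv):
        k, v = kv
        acc[k] = f(acc[k], v) if k in acc else v
        return acc
    return reduce(step, pairs, {})

# The whole "program" is a short pipeline of combinators.
words = flat_map(str.split, lines)                  # lines -> words
pairs = map(lambda w: (w, 1), words)                # word  -> (word, 1)
counts = reduce_by_key(lambda a, b: a + b, pairs)   # sum the 1s per word
```

Each stage takes a collection and returns a collection, which is what lets a large hand-written MapReduce job shrink to a few chained calls.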
Dean Wampler, Ph.D., is the Architect for Big Data Products and Services at Typesafe, the company behind Scala, Akka, and Play. He specializes in scalable, distributed, data-centric application development, Big Data or otherwise, including large-scale Internet of Things systems. He applies Functional Programming principles with the Typesafe Reactive Platform, Hadoop, Apache Spark, and other tools. Dean is a contributor to several open source projects and the founder of the Chicago-Area Scala Enthusiasts. He is the author of Programming Scala, 2nd Edition and Functional Programming for Java Developers, and the co-author of Programming Hive, all from O’Reilly. He pontificates on Twitter, @deanwampler.