Components

Apache Apex

AT A GLANCE A Hadoop YARN-native big data processing platform, Apex enables real-time stream as well as batch processing for your big data. Apex provides the following benefits: high scalability and performance; fault tolerance and state management; Hadoop-native YARN & HDFS implementation; event processing guarantees; separation …

Apache Zeppelin

AT A GLANCE Said to be a collaborative data analytics and visualization tool for Apache Spark and Apache Flink, Apache’s incubating Zeppelin project is a web-based tool for data scientists to collaborate over large-scale data exploration. Zeppelin is independent of the execution framework, and its interpreter allows any language or data-processing …

PredictionIO

AT A GLANCE Said to “eliminate the friction between software development, data science and production deployment,” PredictionIO is an open-source Machine Learning server for developers and data scientists to build and deploy predictive applications. The core part of the tool is an engine deployment platform built on top of Apache …

Apache Kudu

AT A GLANCE A new addition to the open-source Apache Hadoop ecosystem, Apache Kudu (incubating) completes Hadoop’s storage layer to enable fast analytics on fast data. Currently, a limited-functionality version of Kudu is available as a beta. Like most modern analytic data stores, Kudu internally organizes its data by …

Pivotal HAWQ

AT A GLANCE Pivotal’s HAWQ is a closed-source product offered as part of its PivotalHD stack, the company’s proprietary distribution of Hadoop. Pivotal claims that HAWQ is the ‘world’s fastest SQL engine on Hadoop’ and that it has been in development for 10 years. PROS: full SQL syntax support; interoperability with Hive …

Apache Ignite

AT A GLANCE The new (late 2014) web site says that Apache’s Ignite in-memory fabric is a “high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies.” The goal of an In-Memory …

Apache Mesos

AT A GLANCE Apache’s Mesos project is built using the same principles as the Linux kernel, only at a different level of abstraction. The Mesos kernel runs on every machine and provides applications (e.g., Hadoop, Spark, Kafka, MPI, Hypertable, Elasticsearch) with APIs for resource management and scheduling across entire …

Pivotal HD

AT A GLANCE Pivotal, the EMC spin-off company pursuing modern application development in the context of cloud computing and big-data analysis, sponsors PivotalHD, a Hadoop distribution incorporating an in-memory database and a battery of analysis components. PivotalHD 2.0 is the vendor’s first distribution based on Apache Hadoop 2.2, the latest …

Actian (ParAccel)

AT A GLANCE Actian boldly claims that its SQL-on-Hadoop product, formerly known as ParAccel, is the “#1 SQL in Hadoop Analytics Platform”. Actian positions its new platform as an industrialized platform that is priced disruptively and delivers rich, easy-to-use functionality. In particular, Actian says that the new platform is not …

Amazon Redshift

AT A GLANCE Amazon claims that its Redshift is a “fast, fully managed, petabyte-scale data warehouse solution that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools.” It’s a paid Amazon service: you can start small for about $0.25 per hour with …