Apache Spark

AT A GLANCE Along with phenomena such as the container technology Docker, Apache Spark emerged in 2014 as a new darling of the open source world, with widespread take-up by data teams and developers, backed by a highly active community and the startup Databricks. Started in 2009 as a UC Berkeley research …

Apache Mahout

AT A GLANCE Apache Mahout is a top-level project that produces free implementations of distributed or otherwise scalable machine learning algorithms, focused primarily on collaborative filtering, clustering and classification. Many of the implementations use the Apache Hadoop platform. Mahout also provides Java libraries for common math operations, …

Apache Whirr

AT A GLANCE The Apache Whirr project provides a Java API and a set of shell scripts for installing and running various services on cloud providers such as Amazon EC2 and Rackspace. Whirr allows you to define the layout of a cluster in terms of the number of nodes as well …

Apache S4

AT A GLANCE The Apache-incubated S4 project was said to fill the gap between complex proprietary systems and batch-oriented open source computing platforms, aiming to “develop a high performance computing platform that hides the complexity inherent in parallel processing system from the application programmer.” It purported a general-purpose, distributed, scalable, …
