Facebook open sources its SQL-on-Hadoop engine, and the web rejoices

SUMMARY: Facebook has open sourced Presto, a SQL engine it says is on average 10 times faster than Hive for running queries across large data sets stored in Hadoop and elsewhere. Facebook has open sourced Presto, the interactive SQL-on-Hadoop engine the company first discussed in June. Presto is Facebook’s take on Cloudera’s Impala or Google’s Dremel, and ...

How to Get Started in Data Science

A lot of people ask me: how do I become a data scientist? I think the short answer is: as with any technical role, it isn’t necessarily easy or quick, but if you’re smart, committed and willing to invest in learning and experimentation, then of course you can do it. In a previous post, I described my ...

Big Data Believers: NetApp, Cisco Take FlexPod To Next Level With Hadoop

Cisco (NSDQ:CSCO) and NetApp Tuesday rolled out a new version of their jointly developed FlexPod converged infrastructure aimed specifically at big data workloads, the first in a series of solutions targeting data-intensive applications. Cisco and NetApp also unveiled a new naming convention for the FlexPod solutions to make it easier to differentiate among the various configurations. The ...

The hot new technology in Big Data is decades old: SQL

The decades-old database technology is staging a comeback. A slide from a presentation at Hadoop Summit describes HortonWorks’ Stinger initiative, an effort to make SQL work better with Hadoop through Apache Hive. Jason Levitt Despite the growth of “NoSQL” databases over the past few years, SQL isn’t going anywhere. In fact, it seems Structured ...

Evolving Hadoop ecosystem presents new ways to program big data apps

The Hadoop ecosystem is a body in motion. Just a few years ago, you might quickly but fairly describe Hadoop as “HDFS, MapReduce and some glue” — referring to the Hadoop Distributed File System, its associated software programming model and an emerging collection of APIs and utilities, which together were becoming synonymous with big data systems. What ...

How HBase converted MySpace’s MySQL champion and is driving Hadoop mainstream

SUMMARY: Gravity CTO Jim Benedetto knows his way around MySQL after managing a 600-instance cluster at MySpace, but he has found HBase religion as his real-time content-recommendation platform grew. And he’s not alone. How’s this for an understatement: Operational databases are important for many, if not the majority, of web applications. And if ...