BURLINGAME, CA -- (Marketwire) -- 03/16/09 -- Cloudera, the commercial Hadoop(TM) company, today announced the general availability of the Cloudera Distribution for Hadoop, an open source product used to store and process big data: petabytes of information, often distributed across thousands of servers. Hadoop is in production use at most of the world's largest Web companies, including Facebook, Google, and Yahoo!. Cloudera, with the financial backing of Accel Partners, is the first company to develop technology to bring Hadoop into enterprise data centers.
"After working with large Hadoop deployments at companies like Facebook, Google and Yahoo!, we came to realize that people needed Hadoop installation, configuration, and management to be much easier," said Christophe Bisciglia, Cloudera founder and former manager of Google's Hadoop cluster. "Cloudera is advancing Hadoop technology to make it easier for everyone to store and process the same types of big data that large Web companies are successfully using in their businesses."
The Cloudera Distribution for Hadoop is freely available for download and immediate use. The product is distributed as a pre-packaged RPM bundle for Red Hat Linux systems or as an Amazon EC2 image. To make Hadoop easy to install and use, Cloudera is launching a new portal, http://my.cloudera.com, where people can use a Web-based configuration tool to create custom packages that are optimized for their specific needs. Settings for the cluster can also be saved on the portal to enable automatic updates. There is no charge to use http://my.cloudera.com. The RPM packages and EC2 images are freely distributed under the Apache 2 software license.
"Since we use Hadoop to help run our business, we are excited that Cloudera is offering commercial support for Hadoop and is making the technology more accessible to businesses," said David C Peterson, SVP Technology at ContextWeb, Inc., a leading contextual advertising company and operator of the ADSDAQ Exchange. "Businesses need to feel confident that there is a company like Cloudera to stand behind Hadoop in order for this great open source technology to become widely used by companies."
Cloudera is also making a pre-configured VMware image freely available for evaluation and for use with its free online training (http://www.cloudera.com/hadoop-training). People who want to test the Cloudera Distribution for Hadoop or learn more about Hadoop and Cloudera's online training can download the image and run it on a Linux, Mac or Windows desktop. The image ships with example code and all the components needed to use the Cloudera Distribution for Hadoop, including a master server and a single node.
The Cloudera Distribution for Hadoop is a complete system to handle the processing and storage of big data. Major components include:
-- HDFS - Hadoop Distributed File System, a distributed and fault-tolerant file system designed to run on commodity hardware. HDFS assumes that hardware failure is normal and provides quick detection and automatic recovery. HDFS can support tens of millions of files in a single instance;
-- MapReduce implementation to divide applications into many small blocks of work for automatic parallelization and execution on large clusters. Cloudera's implementation of MapReduce takes care of partitioning of input data, scheduling program execution across distributed machines, and the handling of machine failure;
-- Hive - a data warehousing infrastructure built on top of Hadoop that provides tools for easy data summary generation, ad hoc querying, and analysis. Hive comes with Hive QL, a simple query language based on SQL;
-- Pig - a platform for analyzing large data sets in Hadoop using a high-level language for expressing data analysis programs, PigLatin.
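For readers unfamiliar with the MapReduce model named in the component list, the idea of dividing a job into small blocks of work can be illustrated with a minimal, single-process Python sketch of a word count. This is an illustration only and is not Cloudera's or Hadoop's implementation: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real cluster would run the map and reduce steps in parallel across many machines and handle partitioning, scheduling and machine failure automatically.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key into a count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over in-memory input.

    Each line is an independent unit of work, which is what would let a
    cluster process the lines on different machines.
    """
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["big data big clusters", "data on commodity hardware"]
    print(word_count(sample))
```

Because each input line is mapped independently and each key is reduced independently, both steps can in principle be spread across many servers, which is the property the MapReduce component relies on.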
Additional information about the Cloudera Distribution for Hadoop:
-- www.cloudera.com/hadoop with free access to a web configuration system, downloadable software, VMware image and documentation;
-- The Story of the Cloudera Distribution for Hadoop - video featuring CEO and founder (http://www.youtube.com/watch?v=Y3eL6DfNkTw);
-- Screencast on configuring the Cloudera Distribution for Hadoop: http://www.cloudera.com/hadoop-config-screencast
Cloudera (www.cloudera.com), the commercial Hadoop company, provides commercial services and support for Hadoop, the open source software that powers the data processing engines of the world's largest and most popular web sites. Founded by leading experts on big data from Facebook, Google, Oracle and Yahoo!, Cloudera's mission is to bring the power of Hadoop, MapReduce, and distributed storage to companies of all sizes. Headquartered in Silicon Valley, Cloudera has financial backing from Accel Partners and angel investors who include Diane Greene (former CEO of VMware), Marten Mickos (former CEO of MySQL) and Gideon Yu (CFO of Facebook). Cloudera's advisors include the founders of the Hadoop project, Doug Cutting and Mike Cafarella.
Cloudera is a registered trademark of Cloudera, Inc. Hadoop is a registered trademark of the Apache Software Foundation. All other company and product names are the property of their respective owners.