Reference Architecture for Cloudera Enterprise

The configurations are based on Cloudera’s Distribution including Apache Hadoop (CDH), specifically CDH4.3, Cloudera Manager 4.6 and the HP ProLiant DL Gen8 server platform. The configurations reflected in this document have been jointly designed and developed by HP and Cloudera to provide optimum computational performance for Hadoop and are also compatible with other CDH4.x releases.

Big Data Reference Architecture for Cloudera

This document describes the reference architecture for the Big Data Solution from Cloudera. It provides a predefined and optimized hardware infrastructure for the Cloudera Distribution for Hadoop (CDH) version 5.4, a distribution of Hadoop with value-added capabilities from Cloudera. This reference architecture provides the planning, design considerations, and best practices for implementing CDH with…

Configurations for a small to medium-sized Hadoop cluster

Configurations for a small to medium-sized Hadoop cluster:

| Node type | Node component | Recommended specification |
|---|---|---|
| Master node | CPU | 2 x quad-core, 2.0 GHz |
| | RAM (main memory) | 16 GB |
| | Hard drive | 2 x 1 TB SATA II 7200 RPM HDD or SSD |
| | Network card | 1 Gbps Ethernet |
| Slave node | CPU | 2… |
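As a rough illustration, the master-node figures listed above can be captured in a small data structure and compared against a candidate machine. This is only a sketch: the `NodeSpec` class, its field names, and the `shortfalls` helper are hypothetical names introduced here, and the slave-node values are left out because the excerpt is cut off before they are given.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical container for the hardware figures of one node."""
    cpu_sockets: int
    cores_per_socket: int
    clock_ghz: float
    ram_gb: int
    disk_count: int
    disk_tb: float
    nic_gbps: float


# Master-node recommendation taken from the listing above:
# 2 quad-core CPUs at 2.0 GHz, 16 GB RAM, 2 x 1 TB disks, 1 Gbps Ethernet.
MASTER_RECOMMENDED = NodeSpec(
    cpu_sockets=2,
    cores_per_socket=4,
    clock_ghz=2.0,
    ram_gb=16,
    disk_count=2,
    disk_tb=1.0,
    nic_gbps=1.0,
)


def shortfalls(actual: NodeSpec, recommended: NodeSpec) -> List[str]:
    """Return the field names where `actual` falls below `recommended`."""
    return [
        f.name
        for f in fields(recommended)
        if getattr(actual, f.name) < getattr(recommended, f.name)
    ]


# Example: a candidate machine with only 8 GB of RAM.
candidate = NodeSpec(
    cpu_sockets=2,
    cores_per_socket=4,
    clock_ghz=2.4,
    ram_gb=8,
    disk_count=2,
    disk_tb=1.0,
    nic_gbps=1.0,
)
print(shortfalls(candidate, MASTER_RECOMMENDED))  # ['ram_gb']
```

A field-by-field "at least as large" comparison is a deliberate simplification; it ignores things such as disk type (HDD versus SSD) that the listing mentions but that do not reduce to a single number.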

Benefits of MapR cluster for table storage

Benefits of a MapR cluster for table storage compared to Apache HBase on Hadoop:
– MapR cluster with High Availability features
– Unified namespace for tables & files
– Volume mirrors & snapshots provide flexible, reliable read-only access
– Table storage & MapReduce jobs co-exist on the same node without degrading cluster performance