Practical Hadoop Migration: How to Integrate Your RDBMS with the Hadoop Ecosystem and Re-Architect Relational Applications to NoSQL by Bhushan Lakhe

By Bhushan Lakhe

Re-architect relational applications to NoSQL, integrate relational database management systems with the Hadoop ecosystem, and transform and migrate relational data to and from Hadoop components. This book covers best-practice design approaches to re-architecting your relational applications and transforming your relational data to optimize concurrency, security, denormalization, and performance.

Winner of IBM’s 2012 Gerstner Award for his implementation of big data and data warehouse projects and author of Practical Hadoop Security, Bhushan Lakhe walks you through the entire transition process. First, he lays out the criteria for deciding what mix of re-architecting, migration, and integration between RDBMS and HDFS best meets your transition objectives. Then he demonstrates how to design your transition model.

Lakhe proceeds to cover the selection criteria for ETL tools, the implementation steps for migration with Sqoop- and Flume-based data transfers, and transition optimization techniques for tuning partitions, scheduling aggregations, and redesigning ETL. Finally, he assesses the pros and cons of data lakes and Lambda architecture as integrative solutions and illustrates their implementation with real-world case studies.

Hadoop/NoSQL solutions do not provide by default certain relational technology features such as role-based access control, locking for concurrent updates, and various tools for measuring and enhancing performance. Practical Hadoop Migration shows how to use open-source tools to emulate such relational functionalities in Hadoop ecosystem components.

What You Will Learn

  • The requirements and design methodologies of relational data and NoSQL models
  • How to decide whether you should migrate your relational applications to big data technologies or integrate them
  • How to transition your relational applications to Hadoop/NoSQL platforms in terms of logical design and physical implementation
  • RDBMS-to-HDFS integration, data transformation, and optimization techniques
  • The situations in which Lambda architecture and data lake solutions should be considered
  • How to select and implement Hadoop-based components and applications to speed transition, optimize integrated performance, and emulate relational functionalities

Who This Book Is For
The primary readership for Practical Hadoop Migration is database developers, database administrators, enterprise architects, Hadoop/NoSQL developers, and IT leaders. Its secondary readership is project and program managers and advanced students of database and management information systems.



Best data modeling & design books

Modular Ontologies: Concepts, Theories and Techniques for Knowledge Modularization

This book constitutes a collection of research achievements mature enough to provide a firm and reliable basis on modular ontologies. It gives the reader a detailed analysis of the state of the art of the research area and discusses recent concepts, theories, and techniques for knowledge modularization.

Advances in Object-Oriented Data Modeling

Until recently, information systems have been designed around different business functions, such as accounts payable and inventory control. Object-oriented modeling, in contrast, structures systems around the data (the objects) that make up the various business functions. Because information about a particular function is limited to one place, the object, the system is shielded from the effects of change.

Introduction To Database Management System

Designed specifically for a single-semester first course on database systems, this book has four aspects that differentiate it from the rest. Simplicity: in general, the technology of database systems can be very difficult to understand. There are

Additional info for Practical Hadoop Migration: How to Integrate Your RDBMS with the Hadoop Ecosystem and Re-Architect Relational Applications to NoSQL

Example text

Due to the large volume and unstructured nature of such data, Hadoop/NoSQL are ideally suited to process it. You should certainly re-architect/transition these applications to NoSQL.

  • Log analysis applications: Any mid-size or large corporation uses a large number of applications, and these applications generate a large number of log files. In case of troubleshooting or security issues, it is almost impossible to analyze these log files. Other important information can be derived from log files, like average processing time for batch processing tasks, number of failures and their details, user access and resource details (accessed by the users), and so on.
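As a rough illustration of the kind of information the excerpt says can be derived from log files, the following Python sketch computes average processing time per batch job, failure counts, and per-user access counts. The log line layout, the field names, and the sample records are all assumptions made up for this example; they are not taken from the book, and real application logs would need their own parsing rules.

```python
import re
from collections import Counter, defaultdict

# Hypothetical log line layout (an assumption, not a format from the book):
#   <timestamp> job=<name> status=<OK|FAILED> duration_s=<seconds> user=<id>
LINE_RE = re.compile(
    r"^(?P<ts>\S+) job=(?P<job>\S+) status=(?P<status>OK|FAILED) "
    r"duration_s=(?P<duration>\d+(?:\.\d+)?) user=(?P<user>\S+)$"
)


def summarize(lines):
    """Return (average duration per job, failures per job, accesses per user)."""
    durations = defaultdict(list)
    failures = Counter()
    users = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        job = match["job"]
        durations[job].append(float(match["duration"]))
        if match["status"] == "FAILED":
            failures[job] += 1
        users[match["user"]] += 1
    averages = {job: sum(vals) / len(vals) for job, vals in durations.items()}
    return averages, dict(failures), dict(users)


# Made-up sample records, including one malformed line that is ignored.
SAMPLE = [
    "2016-03-01T02:00:00 job=nightly_load status=OK duration_s=120 user=etl_svc",
    "2016-03-02T02:00:00 job=nightly_load status=FAILED duration_s=30 user=etl_svc",
    "2016-03-02T03:00:00 job=report_build status=OK duration_s=60 user=analyst1",
    "this line does not match the layout",
]

avg_duration, failure_counts, user_access = summarize(SAMPLE)
print(avg_duration)
print(failure_counts)
print(user_access)
```

A single-process script like this only shows the shape of the computation. At the log volumes the excerpt describes, the same grouping and counting would typically be distributed across a cluster rather than run on one machine.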

For now, Figure 1-8 summarizes the steps for transition to Hadoop/NoSQL.

[Figure 1-8. Transition to Hadoop/NoSQL. Labels in the figure: RDBMS based application; Decide on NoSQL solution; ETL; Transition data model or staging database; ETL; NoSQL destination]

Summary

Everyone wants to utilize the power of Big Data in their environment. Unfortunately, decision makers often don’t consider all the parameters before concluding that Big Data is right for their organization.

There is also additional work involved in separating the fact data from dimensional data as the need may be. If, however, you want to use Hadoop for analyzing the browsing habits of thousands of your potential customers and determine what percentage of that converted to actual sales, then the work involved may be minimal—because you probably have all the required data available in separate NoSQL tables—albeit it may be in unstructured or semi-structured format (which NoSQL has no problems processing).
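To make the browsing-to-sales example concrete, here is a minimal Python sketch of the percentage calculation. It assumes semi-structured records represented as dictionaries, with a hypothetical `visitor_id` key linking browsing events to sales; the key name and the sample data are illustrative assumptions, not details from the book.

```python
def conversion_rate(browse_events, sales):
    """Percentage of distinct browsing visitors who also appear in sales.

    Records are plain dicts; any record without the (hypothetical)
    'visitor_id' key is ignored, as semi-structured data may be incomplete.
    """
    visitors = {e["visitor_id"] for e in browse_events if "visitor_id" in e}
    buyers = {s["visitor_id"] for s in sales if "visitor_id" in s}
    if not visitors:
        return 0.0
    return 100.0 * len(visitors & buyers) / len(visitors)


# Made-up sample data: four distinct visitors, two of whom made a purchase.
browse_events = [
    {"visitor_id": "v1", "page": "/home"},
    {"visitor_id": "v1", "page": "/product/42"},
    {"visitor_id": "v2", "page": "/product/42"},
    {"visitor_id": "v3", "page": "/home", "referrer": "search"},
    {"visitor_id": "v4", "page": "/cart"},
    {"page": "/home"},  # incomplete record without a visitor id
]
sales = [
    {"visitor_id": "v2", "amount": 19.99},
    {"visitor_id": "v4", "amount": 5.00},
    {"visitor_id": "v9", "amount": 7.50},  # buyer with no recorded browsing
]

rate = conversion_rate(browse_events, sales)
print(f"{rate:.1f}% of browsing visitors converted to sales")
```

Counting distinct visitors rather than raw events is a design choice made here so that one person browsing many pages is not counted several times; a different definition of conversion would change the result.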

