Apache Flume

Distributed Log Collection for Hadoop

Book Details:

Publisher: Packt Publishing
Series: Packt
Author: Steve Hoffman
Edition: 1
ISBN-10: 1782167919
ISBN-13: 9781782167914
Pages: 108
Published: Jul 16 2013
Posted: Nov 19 2014
Language: English
Book format: PDF
Book size: 3.69 MB

Book Description:

If your role includes moving datasets into Hadoop, this book will help you do it more efficiently using Apache Flume. From installation to customization, it's a complete step-by-step guide on making the service work for you.

Overview:

Integrate Flume with your data sources
Transcode your data en route in Flume
Route and separate your data using regular expression matching
Configure failover paths and load balancing to remove single points of failure
Utilize Gzip compression for files written to HDFS

In Detail:

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with many failover and recovery mechanisms.

Apache Flume: Distributed Log Collection for Hadoop covers problems with HDFS and streaming data/logs, and how Flume can resolve these problems. This book explains the generalized architecture of Flume, which includes moving data to and from databases and NoSQL-ish data stores, as well as optimizing performance. This book includes real-world scenarios on Flume implementation.

Apache Flume: Distributed Log Collection for Hadoop starts with an architectural overview of Flume and then discusses each component in detail. It guides you through the complete installation process and compilation of Flume. It will give you a heads-up on how to use channels and channel selectors. For each architectural component (Sources, Channels, Sinks, Channel Processors, Sink Groups, and so on) the various implementations will be covered in detail along with configuration options. You can use it to customize Flume to your specific needs. Pointers on writing custom implementations are also given to help you learn and build them.

By the end, you should be able to construct a series of Flume agents to transport your streaming data and logs from your systems into Hadoop in near real time.

What you will learn from this book:

Understand the Flume architecture
Download and install open source Flume from Apache
Discover when to use a memory or file-backed channel
Understand and configure the Hadoop File System (HDFS) sink
Learn how to use sink groups to create redundant data flows
Configure and use various sources for ingesting data
Inspect data records and route to different or multiple destinations based on payload content
Transform data en route to Hadoop
Monitor your data flows

Approach:

A starter guide that covers Apache Flume in detail.

Who this book is written for:

Apache Flume: Distributed Log Collection for Hadoop is intended for people who are responsible for moving datasets into Hadoop in a timely and reliable manner, such as software engineers, database administrators, and data warehouse administrators.

Related Books:

Apache Flume

Distributed Log Collection for Hadoop
2nd Edition
Design and implement a series of Flume agents to send streamed data into Hadoop.
About This Book: Construct a series of Flume agents using the Apache Flume service to efficiently collect, aggregate, and move large amounts of event data. Configure failover paths and load balancing to remove single points of failure. Use this step-by-step guide to stream logs from application servers to Hadoop's HDFS.
Who This Book Is For: If you are a Hadoop programmer who wants to learn about Flume to be able to move datasets into Hadoop in a timely and replicable manner, then this book is ideal for you. No prior knowledge about Apache Flume is necessary, but a basic knowledge of...

Apache Cookbook

Solutions and Examples for Apache Administrators
2nd Edition
There's plenty of documentation on installing and configuring the Apache web server, but where do you find help for the day-to-day stuff, like adding common modules or fine-tuning your activity logging? That's easy. The new edition of the Apache Cookbook offers you updated solutions to the problems you're likely to encounter with the new versions of Apache. Written by members of the Apache Software Foundation, and thoroughly revised for Apache versions 2.0 and 2.2, recipes in this book range from simple tasks, such as installing the server on Red Hat Linux or Windows, to more complex tasks, such as setting up name-based virtual hosts or securing and manag...

The Apache Modules Book

Application Development with Apache
"Do you learn best by example and experimentation? This book is ideal. Have your favorite editor and compiler ready: you'll encounter example code you'll want to try right away. You've picked the right book; this is sure to become the de facto standard guide to writing Apache modules." Rich Bowen, coauthor, Apache Administrator's Handbook, Apache Cookbook, and The Definitive Guide to Apache mod_rewrite "A first-rate guide to getting the most out of Apache as a modular application platform, sure to become a must-read for any Apache programmer, from beginner to experienced professional. It builds up carefully and meticulously from the absolute basics, while including chapters on everything from the popular Apache DBD Framework to best practi...



2007 - 2021 © eBooks-IT.org