HBase at Hulu: Usage and Practice
Zhang Qianxi (张虔熙) @ Hulu, qianxi.zhang@hulu.com
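About Hulu
About me: Zhang Qianxi (张虔熙)
- Software engineer at Hulu, Big Data Platform team
- Focused on distributed computing and storage technologies
- Enthusiastic about contributing code to open-source communities
- qianxi.zhang@hulu.com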
Agenda
- Overview
- Audience Platform (user profile system)
  - Auto Balance InputFormat and Snapshot
- Online Bill Storage (order information storage system)
  - Replication, RPC Queue and Replica
Overview
- HBase version: 1.2.0
- Hadoop nodes: 1000+
- HBase nodes: 200+
- HBase tables: 200+
- HBase data size: 700 TB
- Clusters: 4
Scenarios
- Audience Platform (user profile system)
- Log Storage (log storage system)
- Online Bill Storage (order information storage system)
- OpenTSDB
Audience Platform (user profile system)
User profile: a labeled model of a user, abstracted from the user's behavior
Data:
- profile (basic attributes)
- user behavior
- third-party data
- labels
Audience Platform (user profile system)
Data characteristics:
- sparse (10^6 qualifiers)
- multi-version (user behavior)
Purpose:
- marketing decisions
- personalized recommendation
- advertising
Audience Platform (user profile system)
[Architecture diagram; labels in extraction order: Kafka, Spark Streaming, Spark, Service, HDFS, HBase, Cache, DB, Bulk Load, MapReduce, HDFS]
Audience Platform (user profile system)
Key technologies:
- auto balance InputFormat
- snapshot
Region Size Distribution
Application Performance
Problem:
- Task execution time in MapReduce and Spark is positively correlated with Region size
- Task execution time varies wildly
Resolution:
- Enable TableInputFormat auto balance (hbase.mapreduce.input.autobalance)
- Split large Regions and merge small Regions for InputFormat
Bug:
- HBASE-15357 (Wrong split/middle key)
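The split-large, merge-small idea above can be sketched in a few lines of Python. This is a simplified illustration, not the HBase TableInputFormat implementation: the `Split` type, the integer row keys, the `skew_ratio` parameter and the exact merge rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Split:
    start: int  # inclusive start key (integers stand in for real row keys)
    end: int    # exclusive end key
    size: int   # estimated data size, e.g. in MB


def rebalance(regions: List[Split], skew_ratio: float = 3.0) -> List[Split]:
    """Return input splits of more even size (hypothetical policy).

    Regions at least skew_ratio times the average size are cut in half at the
    middle key; runs of adjacent small regions are merged while the merged
    size stays at or below the average.
    """
    if not regions:
        return []
    average = sum(r.size for r in regions) / len(regions)
    threshold = average * skew_ratio
    out: List[Split] = []
    i = 0
    while i < len(regions):
        r = regions[i]
        if r.size >= threshold and r.end - r.start >= 2:
            # Large region: split at the middle key, assume data is spread evenly.
            mid = (r.start + r.end) // 2
            half = r.size // 2
            out.append(Split(r.start, mid, half))
            out.append(Split(mid, r.end, r.size - half))
            i += 1
        elif r.size >= average:
            # Medium region: keep one split per region.
            out.append(r)
            i += 1
        else:
            # Small region: absorb the following neighbours while the total stays small.
            start, end, total = r.start, r.end, r.size
            i += 1
            while i < len(regions) and total + regions[i].size <= average:
                end = regions[i].end
                total += regions[i].size
                i += 1
            out.append(Split(start, end, total))
    return out
```

With three 10 MB regions followed by one 170 MB region, the three small regions become one split and the large region becomes two, so task durations end up closer to each other. Note that splitting at the middle key only balances work if the key is computed correctly, which is what HBASE-15357 concerns.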
Snapshot
A snapshot consists of:
- table metadata
- HFile links
Why snapshot?
- performance
- a view of the data at a specific time
Snapshot
Problem:
- Create one snapshot per application?
- How to share snapshots between applications?
Snapshot Service:
- manages the snapshot lifecycle
- assigns a suitable snapshot to each application
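The slide does not describe how the Snapshot Service is built, so the following is only one possible policy, written as a sketch: applications that tolerate the same staleness share the newest existing snapshot, and snapshots are expired once they are old and unreferenced. The class name, the `take_snapshot` hook, `max_age` and `ttl` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Snapshot:
    name: str
    created_at: float  # seconds, on whatever clock the caller passes in
    ref_count: int = 0


class SnapshotService:
    """Hypothetical lifecycle manager; not the service described in the talk."""

    def __init__(self, take_snapshot: Callable[[str], None], ttl: float) -> None:
        self._take = take_snapshot  # assumed hook that really creates the snapshot
        self._ttl = ttl
        self._snapshots: List[Snapshot] = []

    def acquire(self, now: float, max_age: float) -> Snapshot:
        # Reuse the newest snapshot that is fresh enough for this application.
        fresh = [s for s in self._snapshots if now - s.created_at <= max_age]
        if fresh:
            snap = max(fresh, key=lambda s: s.created_at)
        else:
            snap = Snapshot(name=f"snap-{int(now)}", created_at=now)
            self._take(snap.name)
            self._snapshots.append(snap)
        snap.ref_count += 1
        return snap

    def release(self, snap: Snapshot) -> None:
        snap.ref_count -= 1

    def expire(self, now: float) -> List[str]:
        # Drop snapshots that are past the TTL and no longer used by anyone.
        keep: List[Snapshot] = []
        dropped: List[str] = []
        for s in self._snapshots:
            if s.ref_count == 0 and now - s.created_at > self._ttl:
                dropped.append(s.name)
            else:
                keep.append(s)
        self._snapshots = keep
        return dropped
```

Passing the current time explicitly keeps the sketch deterministic; a real service would also have to delete the expired snapshots in HBase.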
Online Bill Storage (order information storage system)
Characteristics:
- bill information
- online service
- more writes than reads
- read latency < 1 s
Online Bill Storage (order information storage system)
Key technologies:
- replication
- RPC queue
- replica
Replication
Two datacenters, master-master replication
[Diagram: Cluster A and Cluster B each serve reads and writes, with replication in both directions]
Replication
Problem:
- The replication table and CF configuration is wrong if the table name includes a namespace
- The previous design did not consider namespaces
- ":" is used as the separator when parsing tables and families, such as usertable:family1
- But ":" is also the separator between namespace and table, such as namespace1:usertable:family1
Resolution:
- HBASE-11386, HBASE-11393 (use Protobuf instead of string)
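The separator ambiguity can be reproduced with a few lines of Python. This is an illustration only: `naive_parse` is a hypothetical stand-in for a parser that splits on the first ":", not the actual HBase code, and `TableCFs` is a plain dataclass standing in for a structured (Protobuf-style) message.

```python
from dataclasses import dataclass
from typing import List, Tuple


def naive_parse(entry: str) -> Tuple[str, List[str]]:
    """Split 'table:cf1,cf2' on the first ':' (hypothetical pre-namespace logic)."""
    table, _, families = entry.partition(":")
    return table, [f for f in families.split(",") if f]


@dataclass(frozen=True)
class TableCFs:
    """Structured form: each field is stored separately, so no separator is parsed."""
    namespace: str
    table: str
    families: Tuple[str, ...] = ()

    def qualified_table(self) -> str:
        return f"{self.namespace}:{self.table}"


# Works for a table without a namespace.
print(naive_parse("usertable:family1"))
# Misreads the namespace as the table name and "usertable:family1" as a family.
print(naive_parse("namespace1:usertable:family1"))
# The structured form keeps the three parts apart.
print(TableCFs("namespace1", "usertable", ("family1",)).qualified_table())
```

Storing namespace, table and families as separate fields removes the need to guess which ":" means what, which is the motivation for moving away from a delimited string.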
Replication
Problem:
- Some data couldn't be replicated
- The peerClusterZnode under rs of a removed peer may never be deleted
- If a RegionServer crashes, the other RegionServers can't take over its remaining replication work because the method copyQueuesFromRSUsingMulti fails
Resolution:
- HBASE-16135, HBASE-14476
RPC Queue
Improve performance:
- Multiple RPC queues (HBASE-11355)
  - Write queue
  - Get queue
  - Scan queue
More:
- Controlled queue delay (CoDel), HBASE-15136
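A toy model of the write/get/scan separation, to show why it helps: a burst of long scans fills only the scan queue, so short gets and writes are not stuck behind it. The `Call` and `MultiQueueScheduler` names and the routing table are assumptions for this sketch; the real RegionServer scheduler is configured rather than coded this way, and CoDel is not modeled here.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional


@dataclass
class Call:
    kind: str     # "put", "delete", "get" or "scan"
    payload: str


class MultiQueueScheduler:
    """Routes each call to a queue chosen by its type (illustrative only)."""

    _ROUTES = {"put": "write", "delete": "write", "get": "get", "scan": "scan"}

    def __init__(self) -> None:
        self.queues: Dict[str, Deque[Call]] = {
            "write": deque(),
            "get": deque(),
            "scan": deque(),
        }

    def dispatch(self, call: Call) -> str:
        # Pick the queue by call type and append the call to it.
        queue_name = self._ROUTES[call.kind]
        self.queues[queue_name].append(call)
        return queue_name

    def take(self, queue_name: str) -> Optional[Call]:
        # Handlers bound to one queue only ever see calls of that type.
        queue = self.queues[queue_name]
        return queue.popleft() if queue else None
```

In this model, a handler serving the "get" queue can pick up a get immediately even when many scans are waiting, because the scans sit in a different queue.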
Replica
Problem:
- When a RegionServer crashes, the regions on it are unavailable for a period
Resolution:
- Region replicas
  - A region can have more than one replica
  - The primary replica accepts both write and read operations
  - Secondary replicas accept only read operations
  - HBASE-10070
Replica
[Diagram: the client reads from and writes to the primary region on one RegionServer and has read-only access to the secondary region on another RegionServer; the WAL and the HFiles (HFile-1, HFile-2) are stored on HDFS]
Replica
Client strategy:
- Query the primary region first
- If no result arrives within 10 ms, send the query to the secondary replicas as well
- Take the first answer and cancel the others
Problem:
- The data in a secondary replica may be stale
More:
- HBASE-11568 (Async WAL to secondary replica)
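The client strategy is a form of hedged request, and can be sketched with Python threads. This is not the HBase client: `timeline_read`, the callables passed to it and the returned label are assumptions for the example; only the 10 ms default comes from the slide.

```python
import concurrent.futures as cf
from typing import Callable, Sequence, Tuple


def timeline_read(
    primary: Callable[[], str],
    secondaries: Sequence[Callable[[], str]],
    primary_timeout: float = 0.010,  # 10 ms, as on the slide
) -> Tuple[str, str]:
    """Return (source, value) from whichever replica answers first."""
    pool = cf.ThreadPoolExecutor(max_workers=1 + len(secondaries))
    futures = {pool.submit(primary): "primary"}
    try:
        # Step 1: give the primary a head start.
        done, _ = cf.wait(futures, timeout=primary_timeout)
        if not done:
            # Step 2: the primary is slow, so ask the secondaries too.
            for idx, read in enumerate(secondaries):
                futures[pool.submit(read)] = f"secondary-{idx}"
            done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        # Step 3: take the first answer, preferring the primary on a tie.
        winner = min(done, key=lambda f: futures[f] != "primary")
        return futures[winner], winner.result()
    finally:
        # Best-effort cancellation: calls that already started keep running.
        for f in futures:
            f.cancel()
        pool.shutdown(wait=False)
```

The returned source label matters to the caller: an answer from a secondary may be stale, so an application that needs the latest value has to either retry against the primary or accept the weaker consistency.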
Future
- Multi-tenancy (HBASE-10994)
- Strong schema
- High availability
References
- https://issues.apache.org/jira/browse/hbase-15357
- https://issues.apache.org/jira/browse/hbase-11386
- https://issues.apache.org/jira/browse/hbase-11393
- https://issues.apache.org/jira/browse/hbase-16135
- https://issues.apache.org/jira/browse/hbase-14476
- https://issues.apache.org/jira/browse/hbase-15136
- https://issues.apache.org/jira/browse/hbase-10070
- https://issues.apache.org/jira/browse/hbase-11568
- https://issues.apache.org/jira/browse/hbase-10994
Thank you qianxi.zhang@hulu.com