Tamil Selvan K

28 Followers

Oct 31, 2020

Apache Tez - Overview

You may have used Tez extensively if you run Hive on the HDP distribution. If you are new to HDP/CDP, or have only used Hive on MR, this article gives a quick overview of what Apache Tez is and how it uses the existing Yarn…

Tez

4 min read

Oct 29, 2020

Hive Datanucleus ConnectionPool

ConnectionPool: A “connection pool” is a cache of database connection objects. Connection pools promote the reuse of connection objects and reduce the number of times connection objects are created. …

Cloudera

3 min read
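
The connection pool idea summarized in the entry above (cache connection objects, reuse them, create fewer of them) can be illustrated with a minimal Python sketch. This is not the DataNucleus or Hive implementation; `FakeConnection` and `ConnectionPool` are hypothetical names invented for the example, and the pool only caps the number of idle connections it keeps, not the total number opened.

```python
import queue


class FakeConnection:
    """Hypothetical stand-in for a real database connection."""

    created = 0  # counts how many connection objects were ever built

    def __init__(self):
        FakeConnection.created += 1
        self.id = FakeConnection.created


class ConnectionPool:
    """Keeps up to max_size idle connections around for reuse."""

    def __init__(self, max_size):
        self._idle = queue.LifoQueue(maxsize=max_size)

    def acquire(self):
        # Reuse an idle connection when one is cached; otherwise build one.
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            return FakeConnection()

    def release(self, conn):
        # Return the connection to the cache; drop it if the cache is full.
        try:
            self._idle.put_nowait(conn)
        except queue.Full:
            pass


pool = ConnectionPool(max_size=2)
first = pool.acquire()
pool.release(first)
second = pool.acquire()  # served from the cache, no new object is created
```

Because `second` comes back from the cache, only one connection object is ever constructed here, which is the saving a pool is meant to provide.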

Oct 26, 2020

Locks in Hive

Databases have locks, and Hive is no different. A lock in a database can be either a Read Lock or a Write Lock. Locks are used when concurrent applications try to access the same table. They prevent data from being corrupted or invalidated when multiple users try to read while others write to the database. …

Hive

1 min read
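
The read/write distinction in the Locks in Hive summary above follows the usual reader-writer rule: many readers may hold a lock together, while a writer needs the table to itself. The sketch below shows only that general rule, not Hive's lock manager; the names `READ`, `WRITE`, `compatible`, and `can_grant` are assumptions made for illustration.

```python
READ = "READ"
WRITE = "WRITE"


def compatible(held, requested):
    """Two locks can coexist only when both are read locks."""
    return held == READ and requested == READ


def can_grant(held_locks, requested):
    """A request is granted when it is compatible with every held lock."""
    return all(compatible(held, requested) for held in held_locks)


# Two readers already hold the table: a third reader may join them,
# but a writer has to wait until they release their locks.
readers = [READ, READ]
third_reader_ok = can_grant(readers, READ)
writer_ok = can_grant(readers, WRITE)
```

With no locks held, any request is granted, since `all` over an empty list is true.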

Oct 26, 2020

Hive LLAP Caching

The cache is used mostly for BI queries rather than ETL queries. hive.llap.io.threadpool.size is set at the node level and defines the number of low-level I/O threads. Basically, the daemon offloads I/O and the transformation from compressed formats to these I/O threads. Then, the data will be passed on to…

Llap

1 min read

Oct 26, 2020

Hive 3 (Architecture)

Hive 3 has seen a lot of architectural changes, such as making ACID the default table type, deprecating the Hive CLI (thick JDBC client), and supporting only the thin JDBC client (Beeline). Below is the high-level architecture (I tried to make some changes to the existing Hive architecture diagram which…

Hive

2 min read

Oct 25, 2020

Hadoop Yarn ATS REST APIs

For a Yarn application, we can fetch the application’s data using the REST APIs on ATS. Below are some reference links:

https://community.hortonworks.com/content/supportkb/221899/how-to-export-information-from-tez-view.html

https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/TimelineServer.html#Timeline_Server_REST_API_V1

We can also use the RM REST APIs to get some application-related data. Some examples are below:

Yarn

1 min read
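
As a rough illustration of the two kinds of endpoint mentioned in the ATS entry above, the sketch below builds request URLs for the Timeline Server v1 API (`/ws/v1/timeline/{entity-type}`) and the ResourceManager applications API (`/ws/v1/cluster/apps`). The host names and ports are placeholders, the helper function names are hypothetical, and the entity type `TEZ_DAG_ID` assumes Tez is publishing DAG entities to ATS. The code only builds URLs and does not contact a cluster.

```python
from urllib.parse import urlencode


def _with_query(url, params):
    """Append query parameters to a URL when any are given."""
    return f"{url}?{urlencode(params)}" if params else url


def timeline_entities_url(base, entity_type, **params):
    """URL listing Timeline Server v1 entities of one type."""
    url = f"{base.rstrip('/')}/ws/v1/timeline/{entity_type}"
    return _with_query(url, params)


def rm_apps_url(base, **params):
    """URL listing applications known to the ResourceManager."""
    url = f"{base.rstrip('/')}/ws/v1/cluster/apps"
    return _with_query(url, params)


# Placeholder addresses: substitute the real ATS and RM web addresses.
ats_url = timeline_entities_url("http://ats.example.com:8188", "TEZ_DAG_ID", limit=5)
rm_url = rm_apps_url("http://rm.example.com:8088", states="RUNNING")
```

The resulting strings can be passed to `curl` or any HTTP client; on a secured cluster the request would also need the appropriate authentication, which is left out here.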

Oct 25, 2020

How Do ORC Split Strategies Work? (Hive)

The split strategy determines how ORC creates splits for execution. The available options are “BI”, “ETL” and “HYBRID”. The HYBRID mode reads the footers for all files if there are fewer files than the expected mapper count, switching over to generating 1 split per file if the average file sizes are…

Hive

1 min read

Support Engineer @DataRobot
