Tamil Selvan K

Databases have locks, and Hive is no different. A lock in a database can be either a read lock or a write lock. Locks are used when concurrent applications try to access the same table. They prevent data from being corrupted or invalidated when some users read from the database while others write to it.
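To make the read/write distinction concrete, here is a minimal sketch of the usual compatibility rule: any number of readers can share a table, while a writer needs it to itself. This is a generic model for illustration only, not Hive's lock manager; the names `LockMode` and `can_grant` are hypothetical.

```python
from enum import Enum


class LockMode(Enum):
    SHARED = "read"       # taken by readers
    EXCLUSIVE = "write"   # taken by writers


def can_grant(requested, held):
    """Return True if `requested` can be granted given the locks already held.

    Assumed rule (generic, not taken from Hive's source):
    - with no locks held, any request is granted;
    - a shared request is granted only if every held lock is shared;
    - an exclusive request is never granted while any lock is held.
    """
    if not held:
        return True
    return requested is LockMode.SHARED and all(
        mode is LockMode.SHARED for mode in held
    )


if __name__ == "__main__":
    # Two readers can coexist; a writer has to wait for them.
    print(can_grant(LockMode.SHARED, [LockMode.SHARED]))
    print(can_grant(LockMode.EXCLUSIVE, [LockMode.SHARED]))
```

In this model a second reader is admitted while a writer is refused, which is the behaviour that keeps a reader from seeing half-written data.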

The cache is used mostly for BI queries rather than ETL queries.
hive.llap.io.threadpool.size is set at the node level and defines the number of low-level I/O threads. Basically, the daemon offloads I/O and the transformation from compressed formats to these I/O threads. Then, the data will be passed on to…
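The idea of handing reads and decompression to a fixed pool of I/O threads can be sketched as follows. This is only an analogy in plain Python, not LLAP code: `IO_THREADPOOL_SIZE`, `read_and_decode` and `load_chunks` are hypothetical names, and zlib stands in for whatever compressed format the data actually uses.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a node-level pool size setting.
IO_THREADPOOL_SIZE = 4


def read_and_decode(chunk: bytes) -> bytes:
    """Decompress one chunk; in this sketch this is the work done on an I/O thread."""
    return zlib.decompress(chunk)


def load_chunks(compressed_chunks, pool_size=IO_THREADPOOL_SIZE):
    """Offload decoding to a fixed-size thread pool and return results in input order."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(read_and_decode, compressed_chunks))


if __name__ == "__main__":
    chunks = [zlib.compress(b"row-%d" % i) for i in range(8)]
    decoded = load_chunks(chunks)
    print(len(decoded), "chunks decoded")
```

The pool size caps how many chunks are decoded at once, which is the role a thread-pool setting plays in this kind of design.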

Hive 3 has seen a lot of architectural changes, such as making ACID the default table type, deprecating the Hive CLI (thick client) and supporting only the thin JDBC client (Beeline).

Below is the high-level architecture (I tried to make some changes to the existing Hive architecture diagram which…

This setting controls which strategy ORC uses to create splits for execution. The available options are “BI”, “ETL” and “HYBRID”.

The HYBRID mode reads the footers for all files if there are fewer files than the expected mapper count, switching over to generating 1 split per file if the average file sizes are…
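The HYBRID decision can be sketched roughly as below. This is a simplified model, not Hive's implementation. Reading footers corresponds to the ETL strategy and one split per file to the BI strategy. Because the excerpt is cut off before it states the file-size condition, the sketch assumes "average size below a caller-supplied threshold" purely for illustration; `choose_split_strategy` and `small_file_threshold` are hypothetical names.

```python
def choose_split_strategy(file_sizes, expected_mappers, small_file_threshold):
    """Pick 'ETL' (read footers) or 'BI' (one split per file) for a set of files.

    file_sizes           -- sizes of the ORC files, in bytes
    expected_mappers     -- expected mapper count
    small_file_threshold -- ASSUMPTION: cutoff below which files count as small;
                            the source text is truncated before naming it
    """
    # Few files: reading every footer is cheap, so plan splits from the footers.
    if len(file_sizes) < expected_mappers:
        return "ETL"

    average_size = sum(file_sizes) / len(file_sizes)

    # Many small files: skip the footers and emit one split per file.
    if average_size < small_file_threshold:
        return "BI"

    # Many large files: footers are worth reading to split inside files.
    return "ETL"


if __name__ == "__main__":
    print(choose_split_strategy([10] * 2, 4, 100))     # few files
    print(choose_split_strategy([10] * 10, 4, 100))    # many small files
    print(choose_split_strategy([1000] * 10, 4, 100))  # many large files
```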
