
Hadoop vs HBase - difference

Hadoop has many parts and pieces, and it is easy to misunderstand what a particular piece does. One distinction I found difficult is the difference between Hadoop and HBase. It is important to understand that difference well in order to work with the Hadoop ecosystem.

Hadoop
The core of Hadoop is its filesystem. Hadoop provides a distributed filesystem, HDFS, and a processing framework, MapReduce. HDFS stores each file in pieces called blocks (sometimes referred to as chunks). Each block is usually replicated three times across inexpensive commodity servers. Another feature of HDFS is its size. Since it is usually used for big data projects, the filesystem is huge, and so are its blocks. A filesystem like NTFS typically allocates space in units of a few kilobytes (4 KB by default on most volumes), whereas the default HDFS block size is 64 MB or 128 MB, depending on the Hadoop version. You can choose a bigger block size in order to tune it for your needs. You can process those blocks natively using a MapReduce program. MapReduce programs are usually written in Java.
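To make the block and replication numbers above concrete, here is a minimal Python sketch of the arithmetic. The function names and the 1000 MB file are assumptions made up for illustration; nothing here talks to a real cluster.

```python
MB = 1024 * 1024

def hdfs_block_count(file_size_bytes, block_size_bytes=128 * MB):
    """Number of blocks a file is split into (ceiling division)."""
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes

def raw_storage_bytes(file_size_bytes, replication=3):
    """Total bytes stored across the cluster once every block is replicated."""
    return file_size_bytes * replication

file_size = 1000 * MB  # hypothetical 1000 MB file

print(hdfs_block_count(file_size))           # 8 blocks at 128 MB
print(hdfs_block_count(file_size, 64 * MB))  # 16 blocks at 64 MB
print(raw_storage_bytes(file_size) // MB)    # 3000 MB with 3 replicas
```

A bigger block size means fewer blocks to keep track of for the same file, which is one reason to raise it for very large files.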

HBase
Customers using Hadoop usually don't want to query data natively. Working with Hadoop natively is effective, but it is not accessible to most people in the organization. In a normal work environment we have a small number of developers and a larger number of analysts. One solution is to use HBase, a database that runs on top of HDFS. It stores data in a wide-column layout. An easy way to think of it is as a spreadsheet with two columns, where one column holds an id (the row key) and the other column holds the data. In plain Hadoop, viewing the data means writing a MapReduce program, which is not feasible for all users.
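To show why writing a map reduce program is a hurdle, here is a rough imitation of its three phases in plain Python. This is only a sketch: it does not use any Hadoop API (real jobs are usually Java classes), and the records and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical input: one "id,city" record per line, as it might sit in a file.
records = [
    "1,Chennai",
    "2,Boston",
    "3,Chennai",
    "4,Austin",
]

def map_phase(line):
    """Emit a (city, 1) pair for one input record."""
    _id, city = line.split(",")
    yield city, 1

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)

def count_by_city(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

print(count_by_city(records))  # {'Chennai': 2, 'Boston': 1, 'Austin': 1}
```

Even a simple count needs a mapper, a grouping step and a reducer, which is a lot to ask of someone who only wants to look at the data.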

You might be wondering how the table is created in the first place. We use a create table statement (the create command in the HBase shell), which sets up the table as an abstraction over the Hadoop filesystem.
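As a rough picture of what creating a table gives you, here is a toy in-memory model in Python. It is an assumption-laden sketch, not the HBase API: ToyTable, the "users" table and the "info" column family are all hypothetical names. It mimics two things HBase does: column families are declared when the table is created, and each row maps a row key to a set of family:qualifier columns.

```python
class ToyTable:
    """In-memory stand-in for a wide-column table; NOT the HBase API."""

    def __init__(self, name, column_families):
        self.name = name
        self.column_families = set(column_families)  # fixed at creation
        self.rows = {}  # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        """Store one cell; the column family must have been declared."""
        family, _, _qualifier = column.partition(":")
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        """Return all cells of a row, or an empty dict if the row is absent."""
        return self.rows.get(row_key, {})

# "Create table" step: declare the table and its column families.
users = ToyTable("users", ["info"])
users.put("row1", "info:name", "Asha")
users.put("row1", "info:city", "Chennai")
users.put("row2", "info:name", "Ben")  # rows need not share the same columns

print(users.get("row1"))  # {'info:name': 'Asha', 'info:city': 'Chennai'}
```

In the real HBase shell the creation step would look something like create 'users', 'info', and the data would be stored in files on HDFS instead of a Python dict.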
