Configuring Hadoop for Failover

Introduction

As of 0.20, Hadoop does not support automatic recovery in the case of a NameNode failure. This is a well-known and recognized single point of failure in Hadoop.

Experience at Yahoo! shows that NameNodes are more likely to fail due to misconfiguration, network issues, and bad behavior among clients than due to actual hardware problems. Out of fifteen grids over a three-year period, only three NameNode failures were related to hardware problems.

 

Configuring Hadoop for Failover

There are some preliminary steps that must be in place before a NameNode recovery can be performed. The most important is the dfs.name.dir property. This setting configures the NameNode to write its metadata to more than one directory. A typical configuration might look something like this:

<property>
  <name>dfs.name.dir</name>
  <value>/export/hadoop/namedir,/remote/export/hadoop/namedir</value>
</property>

The first directory is local and the second is an NFS-mounted directory. The NameNode writes to both locations, keeping the HDFS metadata in sync. Storing the metadata off-machine means there is something left to recover from. During startup, the NameNode picks the most recent version of the two directories and then syncs both of them to the same data.

Once the NameNode has been configured to write to two or more directories, we have a working backup of the metadata. In the more common failure scenarios, this data is enough to bring the dead NameNode back from the grave.

When a Failure Occurs

Now the recovery steps:

  1. Just to be safe, make a copy of the data on the remote NFS mount for safekeeping.
  2. Pick a target machine on the same network.
  3. Change the IP address of that machine to match the NameNode's IP address. Using an interface alias to provide this address movement works as well. If this is not an option, be prepared to restart the entire grid to avoid hitting https://issues.apache.org/jira/browse/HADOOP-3988 .
  4. Install Hadoop on this machine, configured the same way as the original NameNode.
  5. Do not format this node!
  6. Mount the remote NFS directory in the same location.
  7. Start up the NameNode (steps 6 and 7 are sketched as commands after this list).
  8. The NameNode should start replaying the edits file, updating the image, block reports should come in, and so on.
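
For steps 6 and 7, the commands look roughly like the sketch below. The NFS server name and its export path are placeholders for illustration; the mount point matches the second dfs.name.dir entry configured above.

# Assumption: "nfshost" and its export path are placeholders; the mount point must
# match the second dfs.name.dir entry from the configuration above.
mount -t nfs nfshost:/export/hadoop /remote/export/hadoop

# Start only the NameNode daemon. Do NOT run "hadoop namenode -format" here,
# or the metadata you are trying to recover will be destroyed.
bin/hadoop-daemon.sh start namenode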

At this point, your NameNode should be up.

Other Ideas

There are some other ideas to help with NameNode recovery:

  1. Keep in mind that the SecondaryNameNode and/or the CheckpointNode also holds an older copy of the NameNode metadata. If you haven't done the preliminary work above, you may still be able to recover using the data on those systems. Just note that it will only be as fresh as the last checkpoint, so you will likely experience some data loss (a sketch of this approach follows this list).
  2. Instead of using NFS on Linux, it may be worthwhile to look into DRBD. A few sites are using this with great success.
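
A minimal sketch of the checkpoint-based recovery, assuming the SecondaryNameNode's checkpoint directory has been copied onto the replacement machine and fs.checkpoint.dir points at it (the exact paths will differ per site):

# Assumption: the checkpoint copied from the SecondaryNameNode now lives in the
# directory named by fs.checkpoint.dir on this machine.
bin/hadoop namenode -importCheckpoint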

Measuring Apache bandwidth from a bash script

The reason: double-checking Sawmill and AWStats.

(Sawmill won.)

echo "$(awk '{print $10}' prd.logs | grep -v "-" | paste -sd + - | bc)/1024/1024/1024" | bc

result is in GB.
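
The same sum can be done in awk alone; a sketch, assuming prd.logs is a standard Apache access log where field 10 is the response size in bytes ("-" when no body was sent):

awk '$10 != "-" {sum += $10} END {printf "%.2f GB\n", sum/1024/1024/1024}' prd.logs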

nJoy;

 

Rebooting a CentOS server, ignoring a broken / hung CIFS or other mount

We had a problem on a server: df would hang, and we knew that a CIFS / Samba share had died and recovered. Trying to unmount it with any parameters (-l, -f) failed.

Experience and Google told us we needed to reboot. This is a remote server, and we knew shutdown would hang on the way down, so another way to restart was needed.

We needed magic!!

Wouldn’t it be nice if there was a way to ask the kernel to reboot without needing to access the failing drive? Well, there is a way, and it is remarkably simple.

The “magic SysRq key” provides a way to send commands directly to the kernel through the /proc filesystem. It is enabled via a kernel compile-time option, CONFIG_MAGIC_SYSRQ, which seems to be standard on most distributions. First you must activate the magic SysRq option:

echo 1 > /proc/sys/kernel/sysrq

When you are ready to reboot the machine simply run the following:

echo b > /proc/sysrq-trigger

This does not attempt to unmount or sync filesystems, so it should only be used when absolutely necessary, but if your drive is already failing then that may not be a concern.

In addition to rebooting the system the sysrq trick can be used to dump memory information to the console, sync all filesystems, remount all filesystems in read-only mode, send SIGTERM or SIGKILL to all processes except init, or power off the machine entirely, among other things.
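
The corresponding keys are single letters echoed to the same trigger file; for example:

echo m > /proc/sysrq-trigger   # dump memory information to the console
echo s > /proc/sysrq-trigger   # sync all filesystems
echo u > /proc/sysrq-trigger   # remount all filesystems read-only
echo e > /proc/sysrq-trigger   # send SIGTERM to all processes except init
echo i > /proc/sysrq-trigger   # send SIGKILL to all processes except init
echo o > /proc/sysrq-trigger   # power the machine off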

Also, instead of echoing into /proc/sys/kernel/sysrq each time you can activate the magic SysRq key at system boot time using sysctl, where supported:

echo "kernel.sysrq = 1" >> /etc/sysctl.conf

If you would like to learn more about magic SysRq you can read the sysrq.txt file in the kernel documentation.

nJoy 😉

Listing SIDs in Windows

In Windows (again, I know...) you sometimes get references to removed users (and sometimes not-so-removed ones) in MMC and secpol, looking like this:

S-1-5-21-123456789-3881959548-123456789-500

To check the SIDs of users, for example when chasing some obscure bug like I had today, use:

wmic useraccount get name,sid

That's it.

(Of course you can filter it, for example:)

wmic useraccount get name,sid | find /i "Administrator"
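
Going the other way, from a SID back to a name, should also work with wmic's where clause (using the example SID from above):

wmic useraccount where sid="S-1-5-21-123456789-3881959548-123456789-500" get name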

nJoy 😉

 

 

Check if Windows Drive exists

In Windows (yes, I know) you can check whether a drive exists by using:

IF EXIST E:\NUL echo yep it does ..
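
In a batch file you would typically branch on that check; a small sketch:

IF EXIST E:\NUL (echo Drive E: exists) ELSE (echo No drive E:)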

That's it. nJoy 😉

 

Querying network card information and status

A great tool for checking Ethernet NIC information in Linux:

ethtool

In Ubuntu you might need to install it like so:

sudo apt-get install ethtool

The tool is easy to use:

 

Use

ifconfig -a

to list NICs, then:

root@wo1:~# ethtool p4p1
Settings for p4p1:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes
root@wo1:~#
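
ethtool can also show driver details and (driver-dependent) statistics for the same NIC:

ethtool -i p4p1   # driver name, version and firmware
ethtool -S p4p1   # NIC statistics, where the driver supports them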

 

nJoy 😉

VMware Client: automating connections

This is a hint from Andrew, thanks!

 

 

C:\Program Files\VMware\Infrastructure\Virtual Infrastructure Client\Launcher\VpxClient.exe -i yes -s 10.21.68.8 -u root -p password

 

Works!

In a batch:

start "VMLauncher" /D"C:\Program Files\VMware\Infrastructure\Virtual Infrastructure Client\Launcher\" VpxClient.exe -i yes -s 10.21.68.8 -u root -p password
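
To avoid hard-coding the server and credentials, the batch file can take them as arguments instead; a sketch (the file name connect-vi.cmd is just an example):

@echo off
REM usage: connect-vi.cmd SERVER USER PASSWORD
start "VMLauncher" /D"C:\Program Files\VMware\Infrastructure\Virtual Infrastructure Client\Launcher\" VpxClient.exe -i yes -s %1 -u %2 -p %3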

 

Thanks nJoy 😉