HADOOP: Difference between revisions
		
		
		
		
		
		Jump to navigation
		Jump to search
		
				
		
		
	
| Line 112: | Line 112: | ||
| * login hdfs@namenode | * login hdfs@namenode | ||
| *  | * hdfs balancer | ||
| * watch the grass grow | * watch the grass grow | ||
Revision as of 01:10, 30 June 2012
HADOOP
Links
- https://ccp.cloudera.com/display/DOC/Documentation
- http://hadoop.apache.org/common/docs/current/
- http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
- https://ladd00.triumf.ca/hadoop/
Management
Base install
cd /triumfcs/trshare/olchansk/linux/hadoop/ rpm --import RPM-GPG-KEY-cloudera rpm -vh --install cloudera-cdh-4-0.noarch.rpm yum install hadoop-hdfs hadoop-hdfs-fuse ln -s /home/olchansk/sysadm/hadoop/conf.daq_test /etc/hadoop/ alternatives --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.daq_test 50 alternatives --display hadoop-conf
Create a name node
- yum install hadoop-hdfs-namenode
- chkconfig hadoop-hdfs-namenode off
- TBW
- add to /etc/rc.local: firewall rules and service startup
iptables -F hadoop-namenode iptables -N hadoop-namenode iptables -A hadoop-namenode -p tcp -s 142.90.0.0/16 --dport 8020 -j ACCEPT iptables -A hadoop-namenode -p tcp -s 142.90.0.0/16 --dport 50070 -j ACCEPT iptables -A hadoop-namenode -p tcp --dport 8020 -j REJECT iptables -A hadoop-namenode -p tcp --dport 50070 -j REJECT iptables -A INPUT -j hadoop-namenode iptables -L -v service hadoop-hdfs-namenode restart
Create a data node
(only SL6 is supported. SL5 has problems with dead /hdfs mounts)
- install data node software
yum install hadoop-hdfs-datanode chkconfig hadoop-hdfs-datanode off
- FIXME: adjust hadoop UID/GID somehow - it is different on every machine! wrong, wrong, wrong!
- configure data node
mkdir /data8/hdfs_data chown -R hdfs.hdfs /data8/hdfs_data ln -s /data8/hdfs_data /datah
- add ladd08 to the hadoop cluster:
- cd /etc/hadoop/conf
- ### in hdfs-site.xml: add "/data8/hdfs_data" to dfs.data.dir
- ### in hdfs-site.xml: increase dfs.datanode.failed.volumes.tolerated by one (should be "number of dfs.data.dir entries" minus 1).
- add new node to hosts
- ### add ladd08 IP address to topology.data
- add new node to topology.perl
- ### restart the name server
- login to hdfs@nameserver, run: hadoop dfsadmin -refreshNodes
 
- observe data node is listed as dead on the nameserver web page
- add to /etc/rc.local: firewall rules, service startup and hdfs mount
iptables -F hadoop iptables -N hadoop iptables -A hadoop -p tcp -s 142.90.0.0/16 --dport 50010 -j ACCEPT iptables -A hadoop -p tcp -s 142.90.0.0/16 --dport 50020 -j ACCEPT iptables -A hadoop -p tcp -s 142.90.0.0/16 --dport 50075 -j ACCEPT iptables -A hadoop -p tcp --dport 50010 -j REJECT iptables -A hadoop -p tcp --dport 50020 -j REJECT iptables -A hadoop -p tcp --dport 50075 -j REJECT iptables -A INPUT -j hadoop iptables -L -v service hadoop-hdfs-datanode restart mkdir /hdfs hadoop-fuse-dfs dfs://ladd12:8020 /hdfs
- start the data node: run /etc/rc.local
- observe data node is now listed as live on the nameserver web page
decommission a data node
- add node name to "hosts.exclude", should be typed exactly as it is entered in "hosts"
- login hdfs@namenode
- hdfs dfsadmin -refreshNodes
- go to the hdfs web interface "Live Nodes" page, observe the data node status is "decommission in progress"
- go to the hdfs web interface "Decommissioning Nodes", observe node is listed, observe "Under Replicated Blocks" count down to zero
- observe the data is being rebalanced (???)
- after the node no longer shows in "live nodes" (???), it can be shut down.
create user directory
- login hdfs@namenode
- hadoop fs -mkdir /users/trinat
- hadoop fs -chown trinat /users/trinat
rebalance the data
- login hdfs@namenode
- hdfs balancer
- watch the grass grow
debug functions
- start the data node: service hadoop-hdfs-datanode restart
- look at the log file: tail -100 /usr/lib/hadoop/logs/hadoop-hadoop-datanode-ladd08.triumf.ca.log
- mount hdfs: unset LD_LIBRARY_PATH; hadoop-fuse-dfs dfs://ladd12:8020 /mnt/xxx
- watch name node: http://ladd12.triumf.ca:50070
- watch data node: http://ladd08.triumf.ca:50075
- report status and configuration: hdfs dfsadmin -report
- report disk usage: hdfs dfs -count -q /users/"*"
- report location of nodes: hdfs dfsadmin -printTopology
- report everything about every file: hdfs fsck / -files -blocks -locations -racks
Performance
- cluster with 3 data nodes: ladd12 (quad-core i7-860), ladd08 (dual opteron 2GHz), ladd05 (dual-core Athlon 2.6 GHz)
- hdfs mounted on ladd12
- write benchmark:
ladd12:olchansk$ /usr/bin/time dd if=/dev/zero of=xxx bs=1024k count=1000 1000+0 records in 1000+0 records out 1048576000 bytes (1.0 GB) copied, 31.3406 s, 33.5 MB/s 0.00user 0.18system 0:31.43elapsed 0%CPU (0avgtext+0avgdata 7104maxresident)k 0inputs+0outputs (0major+483minor)pagefaults 0swaps ladd12:olchansk$ /usr/bin/time dd if=/dev/zero of=xxx2 bs=1024k count=10000 10000+0 records in 10000+0 records out 10485760000 bytes (10 GB) copied, 436.875 s, 24.0 MB/s 0.00user 1.72system 7:16.97elapsed 0%CPU (0avgtext+0avgdata 7104maxresident)k 0inputs+0outputs (0major+483minor)pagefaults 0swaps
ganglia reports peak network use 30 Mbytes/sec, ladd12: load 1, cpu use 10%, disk use 15%+15%busy; ladd05: load 2, cpu use 30%sys, 30%wait, disk use 15%+20%busy; ladd08: load 6, cpu use 30%sys, 70%wait, disk use 40%+40%busy.
- read benchmark:
ladd12:olchansk$ ls -l total 11391305 -rw-r--r-- 1 olchansk nobody 130360111 Dec 28 18:36 run21600sub00000.mid.gz -rw-r--r-- 1 olchansk nobody 1048576000 Jan 2 10:35 xxx -rw-r--r-- 1 olchansk nobody 10485760000 Jan 2 10:37 xxx2 ladd12:olchansk$ /usr/bin/time dd if=run21600sub00000.mid.gz of=/dev/null 254609+1 records in 254609+1 records out 130360111 bytes (130 MB) copied, 1.55497 s, 83.8 MB/s 0.02user 0.08system 0:01.69elapsed 6%CPU (0avgtext+0avgdata 3024maxresident)k 254632inputs+0outputs (1major+227minor)pagefaults 0swaps ladd12:olchansk$ ladd12:olchansk$ /usr/bin/time dd if=xxx of=/dev/null 2048000+0 records in 2048000+0 records out 1048576000 bytes (1.0 GB) copied, 11.4278 s, 91.8 MB/s 0.18user 0.69system 0:11.43elapsed 7%CPU (0avgtext+0avgdata 3008maxresident)k 2048000inputs+0outputs (0major+227minor)pagefaults 0swaps ladd12:olchansk$ ladd12:olchansk$ /usr/bin/time dd if=xxx2 of=/dev/null 20480000+0 records in 20480000+0 records out 10485760000 bytes (10 GB) copied, 130.188 s, 80.5 MB/s 1.68user 7.10system 2:10.19elapsed 6%CPU (0avgtext+0avgdata 3024maxresident)k 20480000inputs+0outputs (0major+228minor)pagefaults 0swaps
no network use, cpu use ladd12 is 10%+30%wait, disk use is 0% because all data was cached in memory: 10GB file, 16GB RAM.
Network ports
- data node ports:
[root@ladd08 ~]# diff xxx1 xxx2 | grep LISTEN > tcp 0 0 :::50020 :::* LISTEN > tcp 0 0 :::47860 :::* LISTEN > tcp 0 0 :::50010 :::* LISTEN > tcp 0 0 :::50075 :::* LISTEN
port 47860 is the java rmi server port, it is different on different machines and sometimes changes after restart. how to pin it to a fixed port number so I can firewall it?!?
- name node:
[root@ladd12 ~]# netstat -an | grep LISTEN | grep tcp > xxx2 [root@ladd12 ~]# diff xxx1 xxx2 < tcp 0 0 ::ffff:142.90.119.126:8020 :::* LISTEN < tcp 0 0 :::50070 :::* LISTEN < tcp 0 0 :::52745 :::* LISTEN
port 52745 is probably also the java rmi server port...