HADOOP: Difference between revisions
| m (79 revisions imported) | |||
| (19 intermediate revisions by the same user not shown) | |||
| Line 9: | Line 9: | ||
| == Management == | == Management == | ||
| === Complete removal === | |||
| <pre> | |||
|   638  xemacs -nw /etc/rc.local --- comment out all hdfs stuff | |||
|   639  umount /hdfs | |||
|   642  service hadoop-hdfs-datanode stop | |||
|   643  chkconfig --list | grep hdfs | |||
|   644  chkconfig --list | grep had | |||
|   650  rm /datah | |||
|   652  cd /iris_data1/ --- this is different on every data node | |||
|   656  rm -rf hdfs_data/ | |||
|   659  rpm -qa | grep -i hadoop | |||
|   660  yum erase hadoop | |||
|   666  rm -rf /etc/hadoop/ | |||
|   667  rm /etc/alternatives/hadoop-conf  | |||
|   668  rm -rf /etc/hadoop"*" /etc/alternatives/hadoop"*" | |||
|   669  rm /etc/yum.repos.d/cloudera-cdh4.repo | |||
|   670  rm -rf /var/log/hadoop"*" | |||
|   671  rm -rf /var/lib/hadoop"*" | |||
|   672  userdel hdfs | |||
|   673  grep hdfs /etc/passwd | |||
|   674  sync | |||
| </pre> | |||
| === Base install === | === Base install === | ||
| Line 53: | Line 77: | ||
| * FIXME: adjust hadoop UID/GID somehow - it is different on every machine! wrong, wrong, wrong! | * FIXME: adjust hadoop UID/GID somehow - it is different on every machine! wrong, wrong, wrong! | ||
| * configure data node | * configure data node (dataNNN is the local data disk) | ||
| <pre> | <pre> | ||
| mkdir / | mkdir /dataNNN/hdfs_data | ||
| chown -R hdfs.hdfs / | chown -R hdfs.hdfs /dataNNN/hdfs_data | ||
| ln -s / | ln -s /dataNNN/hdfs_data /datah | ||
| </pre> | </pre> | ||
| *  | * login as root@nameserver (ladd12) | ||
| ** cd /etc/hadoop/conf | ** cd /etc/hadoop/conf | ||
| ** add new node to hosts | ** add new node to hosts | ||
| ** add new node to topology.perl | ** add new node to topology.perl | ||
| **  | ** su - hdfs, run: hadoop dfsadmin -refreshNodes | ||
| * login as root@ladd00 | |||
| ** add new node to /etc/httpd/conf.d/ssl.conf around "hadoop-50075" | |||
| ** service httpd restart | |||
| * observe data node is listed as dead on the nameserver web page | * observe data node is listed as dead on the nameserver web page | ||
| Line 96: | Line 120: | ||
| * add node name to "hosts.exclude", should be typed exactly as it is entered in "hosts" | * add node name to "hosts.exclude", should be typed exactly as it is entered in "hosts" | ||
| *  | * login hdfs@namenode | ||
| * go to the  | * hdfs dfsadmin -refreshNodes | ||
| * go to the hdfs web interface "Live Nodes" page, observe the data node status is "decommission in progress" | |||
| * observe  | * go to the hdfs web interface "Decommissioning Nodes", observe node is listed, observe "Under Replicated Blocks" count down to zero | ||
| * after the node no longer  | * after the node no longer listed in "Decommissioning Nodes" and is listed as "Decommissioned" in "Live Nodes", it can be shut down. | ||
| === create user directory === | === create user directory === | ||
| Line 109: | Line 133: | ||
| === rebalance the data === | === rebalance the data === | ||
| (does not work with) | |||
| * login hdfs@namenode | * login hdfs@namenode | ||
| *  | * hdfs balancer | ||
| * watch the grass grow | * watch the grass grow | ||
| === Setup quotas === | |||
| * ssh hdfs@namenode | |||
| * hdfs dfsadmin -setSpaceQuota 600g /users/trinat | |||
| note: the quota applies to the size of files and their replicas. If default replication count is 3, the user can create 600g/3 = 200g worth of files. | |||
| === Report quotas === | |||
| * (as any user) | |||
| * hdfs dfs -count -q /users/"*" | |||
| The reported numbers are not obvious, for example: | |||
| <pre> | |||
| ladd12:fs$ hdfs dfs -count -q /users/"*" | |||
|         none             inf            none             inf            1            5            7809498 /users/lindner | |||
|         none             inf            none             inf            1            1          130360111 /users/olchansk | |||
|         none             inf    600000000000    208986905727            2          146       130337698091 /users/trinat | |||
| </pre> | |||
| Here: | |||
| * 600000000000 is the value set by dfsadmin -setSpaceQuota | |||
| * 130337698091 is the size of all files not counting replicas (same value as reported by "du -ks /hdfs/users/trinat") | |||
| * 208986905727 = 600000000000 - 3*130337698091 is the remaining space couning replicas | |||
| * actual remaining space is 130337698091/3 = 69662301909 (about 69 Gbytes) assuming replication count of 3. | |||
| === debug functions === | === debug functions === | ||
| * start the data node: service hadoop-hdfs-datanode restart | * start the data node: service hadoop-hdfs-datanode restart | ||
| * look at the log file: tail -100 / | * look at the log file: tail -100 /var/log/hadoop-hdfs/hadoop-hdfs-datanode-ladd12.triumf.ca.log | ||
| * mount hdfs: unset LD_LIBRARY_PATH; hadoop-fuse-dfs dfs://ladd12:8020 /mnt/xxx | * mount hdfs: unset LD_LIBRARY_PATH; hadoop-fuse-dfs dfs://ladd12:8020 /mnt/xxx | ||
| * watch name node: http://ladd12.triumf.ca:50070 | * watch name node: http://ladd12.triumf.ca:50070 | ||
| Line 124: | Line 177: | ||
| * report disk usage: hdfs dfs -count -q /users/"*" | * report disk usage: hdfs dfs -count -q /users/"*" | ||
| * report location of nodes: hdfs dfsadmin -printTopology | * report location of nodes: hdfs dfsadmin -printTopology | ||
| * report everything about every file: hdfs fsck / -files -blocks -locations -racks | * report everything about every file: hdfs fsck / -openforwrite -files -blocks -locations -racks | ||
| == Performance == | == Performance == | ||
| Line 201: | Line 254: | ||
| port 52745 is probably also the java rmi server port... | port 52745 is probably also the java rmi server port... | ||
| * name node: | |||
| <pre> | |||
| > lsof -P | grep LISTEN | grep java | |||
| java      18096      hdfs  144u     IPv4         1464177661      0t0        TCP *:44898 (LISTEN) | |||
| java      18096      hdfs  155u     IPv4         1464179424      0t0        TCP ladd12.triumf.ca:8020 (LISTEN) | |||
| java      18096      hdfs  165u     IPv4         1464180124      0t0        TCP *:50070 (LISTEN) | |||
| java      18332      hdfs  144u     IPv4         1464181247      0t0        TCP *:59445 (LISTEN) | |||
| java      18332      hdfs  151u     IPv4         1464182326      0t0        TCP *:50010 (LISTEN) | |||
| java      18332      hdfs  152u     IPv4         1464182357      0t0        TCP *:50075 (LISTEN) | |||
| java      18332      hdfs  157u     IPv4         1464182653      0t0        TCP *:50020 (LISTEN) | |||
| </pre> | |||
| port 44898 and 59445 are not documented. probably java rpm server ports | |||
Latest revision as of 17:19, 2 February 2022
HADOOP
Links
- https://ccp.cloudera.com/display/DOC/Documentation
- http://hadoop.apache.org/common/docs/current/
- http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
- https://ladd00.triumf.ca/hadoop/
Management
Complete removal
638 xemacs -nw /etc/rc.local --- comment out all hdfs stuff 639 umount /hdfs 642 service hadoop-hdfs-datanode stop 643 chkconfig --list | grep hdfs 644 chkconfig --list | grep had 650 rm /datah 652 cd /iris_data1/ --- this is different on every data node 656 rm -rf hdfs_data/ 659 rpm -qa | grep -i hadoop 660 yum erase hadoop 666 rm -rf /etc/hadoop/ 667 rm /etc/alternatives/hadoop-conf 668 rm -rf /etc/hadoop"*" /etc/alternatives/hadoop"*" 669 rm /etc/yum.repos.d/cloudera-cdh4.repo 670 rm -rf /var/log/hadoop"*" 671 rm -rf /var/lib/hadoop"*" 672 userdel hdfs 673 grep hdfs /etc/passwd 674 sync
Base install
cd /triumfcs/trshare/olchansk/linux/hadoop/ rpm --import RPM-GPG-KEY-cloudera rpm -vh --install cloudera-cdh-4-0.noarch.rpm yum install hadoop-hdfs hadoop-hdfs-fuse ln -s /home/olchansk/sysadm/hadoop/conf.daq_test /etc/hadoop/ alternatives --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.daq_test 50 alternatives --display hadoop-conf
Create a name node
- yum install hadoop-hdfs-namenode
- chkconfig hadoop-hdfs-namenode off
- TBW
- add to /etc/rc.local: firewall rules and service startup
iptables -F hadoop-namenode iptables -N hadoop-namenode iptables -A hadoop-namenode -p tcp -s 142.90.0.0/16 --dport 8020 -j ACCEPT iptables -A hadoop-namenode -p tcp -s 142.90.0.0/16 --dport 50070 -j ACCEPT iptables -A hadoop-namenode -p tcp --dport 8020 -j REJECT iptables -A hadoop-namenode -p tcp --dport 50070 -j REJECT iptables -A INPUT -j hadoop-namenode iptables -L -v service hadoop-hdfs-namenode restart
Create a data node
(only SL6 is supported. SL5 has problems with dead /hdfs mounts)
- install data node software
yum install hadoop-hdfs-datanode chkconfig hadoop-hdfs-datanode off
- FIXME: adjust hadoop UID/GID somehow - it is different on every machine! wrong, wrong, wrong!
- configure data node (dataNNN is the local data disk)
mkdir /dataNNN/hdfs_data chown -R hdfs.hdfs /dataNNN/hdfs_data ln -s /dataNNN/hdfs_data /datah
- login as root@nameserver (ladd12)
- cd /etc/hadoop/conf
- add new node to hosts
- add new node to topology.perl
- su - hdfs, run: hadoop dfsadmin -refreshNodes
 
- login as root@ladd00
- add new node to /etc/httpd/conf.d/ssl.conf around "hadoop-50075"
- service httpd restart
 
- observe data node is listed as dead on the nameserver web page
- add to /etc/rc.local: firewall rules, service startup and hdfs mount
iptables -F hadoop iptables -N hadoop iptables -A hadoop -p tcp -s 142.90.0.0/16 --dport 50010 -j ACCEPT iptables -A hadoop -p tcp -s 142.90.0.0/16 --dport 50020 -j ACCEPT iptables -A hadoop -p tcp -s 142.90.0.0/16 --dport 50075 -j ACCEPT iptables -A hadoop -p tcp --dport 50010 -j REJECT iptables -A hadoop -p tcp --dport 50020 -j REJECT iptables -A hadoop -p tcp --dport 50075 -j REJECT iptables -A INPUT -j hadoop iptables -L -v service hadoop-hdfs-datanode restart mkdir /hdfs hadoop-fuse-dfs dfs://ladd12:8020 /hdfs
- start the data node: run /etc/rc.local
- observe data node is now listed as live on the nameserver web page
decommission a data node
- add node name to "hosts.exclude", should be typed exactly as it is entered in "hosts"
- login hdfs@namenode
- hdfs dfsadmin -refreshNodes
- go to the hdfs web interface "Live Nodes" page, observe the data node status is "decommission in progress"
- go to the hdfs web interface "Decommissioning Nodes", observe node is listed, observe "Under Replicated Blocks" count down to zero
- after the node no longer listed in "Decommissioning Nodes" and is listed as "Decommissioned" in "Live Nodes", it can be shut down.
create user directory
- login hdfs@namenode
- hadoop fs -mkdir /users/trinat
- hadoop fs -chown trinat /users/trinat
rebalance the data
(does not work with)
- login hdfs@namenode
- hdfs balancer
- watch the grass grow
Setup quotas
- ssh hdfs@namenode
- hdfs dfsadmin -setSpaceQuota 600g /users/trinat
note: the quota applies to the size of files and their replicas. If default replication count is 3, the user can create 600g/3 = 200g worth of files.
Report quotas
- (as any user)
- hdfs dfs -count -q /users/"*"
The reported numbers are not obvious, for example:
ladd12:fs$ hdfs dfs -count -q /users/"*"
        none             inf            none             inf            1            5            7809498 /users/lindner
        none             inf            none             inf            1            1          130360111 /users/olchansk
        none             inf    600000000000    208986905727            2          146       130337698091 /users/trinat
Here:
- 600000000000 is the value set by dfsadmin -setSpaceQuota
- 130337698091 is the size of all files not counting replicas (same value as reported by "du -ks /hdfs/users/trinat")
- 208986905727 = 600000000000 - 3*130337698091 is the remaining space couning replicas
- actual remaining space is 130337698091/3 = 69662301909 (about 69 Gbytes) assuming replication count of 3.
debug functions
- start the data node: service hadoop-hdfs-datanode restart
- look at the log file: tail -100 /var/log/hadoop-hdfs/hadoop-hdfs-datanode-ladd12.triumf.ca.log
- mount hdfs: unset LD_LIBRARY_PATH; hadoop-fuse-dfs dfs://ladd12:8020 /mnt/xxx
- watch name node: http://ladd12.triumf.ca:50070
- watch data node: http://ladd08.triumf.ca:50075
- report status and configuration: hdfs dfsadmin -report
- report disk usage: hdfs dfs -count -q /users/"*"
- report location of nodes: hdfs dfsadmin -printTopology
- report everything about every file: hdfs fsck / -openforwrite -files -blocks -locations -racks
Performance
- cluster with 3 data nodes: ladd12 (quad-core i7-860), ladd08 (dual opteron 2GHz), ladd05 (dual-core Athlon 2.6 GHz)
- hdfs mounted on ladd12
- write benchmark:
ladd12:olchansk$ /usr/bin/time dd if=/dev/zero of=xxx bs=1024k count=1000 1000+0 records in 1000+0 records out 1048576000 bytes (1.0 GB) copied, 31.3406 s, 33.5 MB/s 0.00user 0.18system 0:31.43elapsed 0%CPU (0avgtext+0avgdata 7104maxresident)k 0inputs+0outputs (0major+483minor)pagefaults 0swaps ladd12:olchansk$ /usr/bin/time dd if=/dev/zero of=xxx2 bs=1024k count=10000 10000+0 records in 10000+0 records out 10485760000 bytes (10 GB) copied, 436.875 s, 24.0 MB/s 0.00user 1.72system 7:16.97elapsed 0%CPU (0avgtext+0avgdata 7104maxresident)k 0inputs+0outputs (0major+483minor)pagefaults 0swaps
ganglia reports peak network use 30 Mbytes/sec, ladd12: load 1, cpu use 10%, disk use 15%+15%busy; ladd05: load 2, cpu use 30%sys, 30%wait, disk use 15%+20%busy; ladd08: load 6, cpu use 30%sys, 70%wait, disk use 40%+40%busy.
- read benchmark:
ladd12:olchansk$ ls -l total 11391305 -rw-r--r-- 1 olchansk nobody 130360111 Dec 28 18:36 run21600sub00000.mid.gz -rw-r--r-- 1 olchansk nobody 1048576000 Jan 2 10:35 xxx -rw-r--r-- 1 olchansk nobody 10485760000 Jan 2 10:37 xxx2 ladd12:olchansk$ /usr/bin/time dd if=run21600sub00000.mid.gz of=/dev/null 254609+1 records in 254609+1 records out 130360111 bytes (130 MB) copied, 1.55497 s, 83.8 MB/s 0.02user 0.08system 0:01.69elapsed 6%CPU (0avgtext+0avgdata 3024maxresident)k 254632inputs+0outputs (1major+227minor)pagefaults 0swaps ladd12:olchansk$ ladd12:olchansk$ /usr/bin/time dd if=xxx of=/dev/null 2048000+0 records in 2048000+0 records out 1048576000 bytes (1.0 GB) copied, 11.4278 s, 91.8 MB/s 0.18user 0.69system 0:11.43elapsed 7%CPU (0avgtext+0avgdata 3008maxresident)k 2048000inputs+0outputs (0major+227minor)pagefaults 0swaps ladd12:olchansk$ ladd12:olchansk$ /usr/bin/time dd if=xxx2 of=/dev/null 20480000+0 records in 20480000+0 records out 10485760000 bytes (10 GB) copied, 130.188 s, 80.5 MB/s 1.68user 7.10system 2:10.19elapsed 6%CPU (0avgtext+0avgdata 3024maxresident)k 20480000inputs+0outputs (0major+228minor)pagefaults 0swaps
no network use, cpu use ladd12 is 10%+30%wait, disk use is 0% because all data was cached in memory: 10GB file, 16GB RAM.
Network ports
- data node ports:
[root@ladd08 ~]# diff xxx1 xxx2 | grep LISTEN > tcp 0 0 :::50020 :::* LISTEN > tcp 0 0 :::47860 :::* LISTEN > tcp 0 0 :::50010 :::* LISTEN > tcp 0 0 :::50075 :::* LISTEN
port 47860 is the java rmi server port, it is different on different machines and sometimes changes after restart. how to pin it to a fixed port number so I can firewall it?!?
- name node:
[root@ladd12 ~]# netstat -an | grep LISTEN | grep tcp > xxx2 [root@ladd12 ~]# diff xxx1 xxx2 < tcp 0 0 ::ffff:142.90.119.126:8020 :::* LISTEN < tcp 0 0 :::50070 :::* LISTEN < tcp 0 0 :::52745 :::* LISTEN
port 52745 is probably also the java rmi server port...
- name node:
> lsof -P | grep LISTEN | grep java java 18096 hdfs 144u IPv4 1464177661 0t0 TCP *:44898 (LISTEN) java 18096 hdfs 155u IPv4 1464179424 0t0 TCP ladd12.triumf.ca:8020 (LISTEN) java 18096 hdfs 165u IPv4 1464180124 0t0 TCP *:50070 (LISTEN) java 18332 hdfs 144u IPv4 1464181247 0t0 TCP *:59445 (LISTEN) java 18332 hdfs 151u IPv4 1464182326 0t0 TCP *:50010 (LISTEN) java 18332 hdfs 152u IPv4 1464182357 0t0 TCP *:50075 (LISTEN) java 18332 hdfs 157u IPv4 1464182653 0t0 TCP *:50020 (LISTEN)
port 44898 and 59445 are not documented. probably java rpm server ports