HADOOP: Difference between revisions
m (79 revisions imported) |
|||
(20 intermediate revisions by the same user not shown) | |||
Line 9: | Line 9: | ||
== Management == | == Management == | ||
=== Complete removal === | |||
<pre> | |||
638 xemacs -nw /etc/rc.local --- comment out all hdfs stuff | |||
639 umount /hdfs | |||
642 service hadoop-hdfs-datanode stop | |||
643 chkconfig --list | grep hdfs | |||
644 chkconfig --list | grep had | |||
650 rm /datah | |||
652 cd /iris_data1/ --- this is different on every data node | |||
656 rm -rf hdfs_data/ | |||
659 rpm -qa | grep -i hadoop | |||
660 yum erase hadoop | |||
666 rm -rf /etc/hadoop/ | |||
667 rm /etc/alternatives/hadoop-conf | |||
668 rm -rf /etc/hadoop"*" /etc/alternatives/hadoop"*" | |||
669 rm /etc/yum.repos.d/cloudera-cdh4.repo | |||
670 rm -rf /var/log/hadoop"*" | |||
671 rm -rf /var/lib/hadoop"*" | |||
672 userdel hdfs | |||
673 grep hdfs /etc/passwd | |||
674 sync | |||
</pre> | |||
=== Base install === | === Base install === | ||
Line 53: | Line 77: | ||
* FIXME: adjust hadoop UID/GID somehow - it is different on every machine! wrong, wrong, wrong! | * FIXME: adjust hadoop UID/GID somehow - it is different on every machine! wrong, wrong, wrong! | ||
* configure data node | * configure data node (dataNNN is the local data disk) | ||
<pre> | <pre> | ||
mkdir / | mkdir /dataNNN/hdfs_data | ||
chown -R hdfs.hdfs / | chown -R hdfs.hdfs /dataNNN/hdfs_data | ||
ln -s / | ln -s /dataNNN/hdfs_data /datah | ||
</pre> | </pre> | ||
* | * login as root@nameserver (ladd12) | ||
** cd /etc/hadoop/conf | ** cd /etc/hadoop/conf | ||
** add new node to hosts | ** add new node to hosts | ||
** add new node to topology.perl | ** add new node to topology.perl | ||
** | ** su - hdfs, run: hadoop dfsadmin -refreshNodes | ||
* login as root@ladd00 | |||
** add new node to /etc/httpd/conf.d/ssl.conf around "hadoop-50075" | |||
** service httpd restart | |||
* observe data node is listed as dead on the nameserver web page | * observe data node is listed as dead on the nameserver web page | ||
Line 96: | Line 120: | ||
* add node name to "hosts.exclude", should be typed exactly as it is entered in "hosts" | * add node name to "hosts.exclude", should be typed exactly as it is entered in "hosts" | ||
* | * login hdfs@namenode | ||
* go to the | * hdfs dfsadmin -refreshNodes | ||
* go to the hdfs web interface "Live Nodes" page, observe the data node status is "decommission in progress" | |||
* observe | * go to the hdfs web interface "Decommissioning Nodes", observe node is listed, observe "Under Replicated Blocks" count down to zero | ||
* after the node no longer | * after the node no longer listed in "Decommissioning Nodes" and is listed as "Decommissioned" in "Live Nodes", it can be shut down. | ||
=== create user directory === | === create user directory === | ||
Line 109: | Line 133: | ||
=== rebalance the data === | === rebalance the data === | ||
(does not work with) | |||
* login hdfs@namenode | * login hdfs@namenode | ||
* | * hdfs balancer | ||
* watch the grass grow | * watch the grass grow | ||
=== Setup quotas === | |||
* ssh hdfs@namenode | |||
* hdfs dfsadmin -setSpaceQuota 600g /users/trinat | |||
note: the quota applies to the size of files and their replicas. If default replication count is 3, the user can create 600g/3 = 200g worth of files. | |||
=== Report quotas === | |||
* (as any user) | |||
* hdfs dfs -count -q /users/"*" | |||
The reported numbers are not obvious, for example: | |||
<pre> | |||
ladd12:fs$ hdfs dfs -count -q /users/"*" | |||
none inf none inf 1 5 7809498 /users/lindner | |||
none inf none inf 1 1 130360111 /users/olchansk | |||
none inf 600000000000 208986905727 2 146 130337698091 /users/trinat | |||
</pre> | |||
Here: | |||
* 600000000000 is the value set by dfsadmin -setSpaceQuota | |||
* 130337698091 is the size of all files not counting replicas (same value as reported by "du -ks /hdfs/users/trinat") | |||
* 208986905727 = 600000000000 - 3*130337698091 is the remaining space couning replicas | |||
* actual remaining space is 130337698091/3 = 69662301909 (about 69 Gbytes) assuming replication count of 3. | |||
=== debug functions === | === debug functions === | ||
* start the data node: service hadoop-hdfs-datanode restart | * start the data node: service hadoop-hdfs-datanode restart | ||
* look at the log file: tail -100 / | * look at the log file: tail -100 /var/log/hadoop-hdfs/hadoop-hdfs-datanode-ladd12.triumf.ca.log | ||
* mount hdfs: unset LD_LIBRARY_PATH; hadoop-fuse-dfs dfs://ladd12:8020 /mnt/xxx | * mount hdfs: unset LD_LIBRARY_PATH; hadoop-fuse-dfs dfs://ladd12:8020 /mnt/xxx | ||
* watch name node: http://ladd12.triumf.ca:50070 | * watch name node: http://ladd12.triumf.ca:50070 | ||
Line 124: | Line 177: | ||
* report disk usage: hdfs dfs -count -q /users/"*" | * report disk usage: hdfs dfs -count -q /users/"*" | ||
* report location of nodes: hdfs dfsadmin -printTopology | * report location of nodes: hdfs dfsadmin -printTopology | ||
* report everything about every file: hdfs fsck / -openforwrite -files -blocks -locations -racks | |||
== Performance == | == Performance == | ||
Line 200: | Line 254: | ||
port 52745 is probably also the java rmi server port... | port 52745 is probably also the java rmi server port... | ||
* name node: | |||
<pre> | |||
> lsof -P | grep LISTEN | grep java | |||
java 18096 hdfs 144u IPv4 1464177661 0t0 TCP *:44898 (LISTEN) | |||
java 18096 hdfs 155u IPv4 1464179424 0t0 TCP ladd12.triumf.ca:8020 (LISTEN) | |||
java 18096 hdfs 165u IPv4 1464180124 0t0 TCP *:50070 (LISTEN) | |||
java 18332 hdfs 144u IPv4 1464181247 0t0 TCP *:59445 (LISTEN) | |||
java 18332 hdfs 151u IPv4 1464182326 0t0 TCP *:50010 (LISTEN) | |||
java 18332 hdfs 152u IPv4 1464182357 0t0 TCP *:50075 (LISTEN) | |||
java 18332 hdfs 157u IPv4 1464182653 0t0 TCP *:50020 (LISTEN) | |||
</pre> | |||
port 44898 and 59445 are not documented. probably java rpm server ports |
Latest revision as of 17:19, 2 February 2022
HADOOP
Links
- https://ccp.cloudera.com/display/DOC/Documentation
- http://hadoop.apache.org/common/docs/current/
- http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
- https://ladd00.triumf.ca/hadoop/
Management
Complete removal
638 xemacs -nw /etc/rc.local --- comment out all hdfs stuff 639 umount /hdfs 642 service hadoop-hdfs-datanode stop 643 chkconfig --list | grep hdfs 644 chkconfig --list | grep had 650 rm /datah 652 cd /iris_data1/ --- this is different on every data node 656 rm -rf hdfs_data/ 659 rpm -qa | grep -i hadoop 660 yum erase hadoop 666 rm -rf /etc/hadoop/ 667 rm /etc/alternatives/hadoop-conf 668 rm -rf /etc/hadoop"*" /etc/alternatives/hadoop"*" 669 rm /etc/yum.repos.d/cloudera-cdh4.repo 670 rm -rf /var/log/hadoop"*" 671 rm -rf /var/lib/hadoop"*" 672 userdel hdfs 673 grep hdfs /etc/passwd 674 sync
Base install
cd /triumfcs/trshare/olchansk/linux/hadoop/ rpm --import RPM-GPG-KEY-cloudera rpm -vh --install cloudera-cdh-4-0.noarch.rpm yum install hadoop-hdfs hadoop-hdfs-fuse ln -s /home/olchansk/sysadm/hadoop/conf.daq_test /etc/hadoop/ alternatives --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.daq_test 50 alternatives --display hadoop-conf
Create a name node
- yum install hadoop-hdfs-namenode
- chkconfig hadoop-hdfs-namenode off
- TBW
- add to /etc/rc.local: firewall rules and service startup
iptables -F hadoop-namenode iptables -N hadoop-namenode iptables -A hadoop-namenode -p tcp -s 142.90.0.0/16 --dport 8020 -j ACCEPT iptables -A hadoop-namenode -p tcp -s 142.90.0.0/16 --dport 50070 -j ACCEPT iptables -A hadoop-namenode -p tcp --dport 8020 -j REJECT iptables -A hadoop-namenode -p tcp --dport 50070 -j REJECT iptables -A INPUT -j hadoop-namenode iptables -L -v service hadoop-hdfs-namenode restart
Create a data node
(only SL6 is supported. SL5 has problems with dead /hdfs mounts)
- install data node software
yum install hadoop-hdfs-datanode chkconfig hadoop-hdfs-datanode off
- FIXME: adjust hadoop UID/GID somehow - it is different on every machine! wrong, wrong, wrong!
- configure data node (dataNNN is the local data disk)
mkdir /dataNNN/hdfs_data chown -R hdfs.hdfs /dataNNN/hdfs_data ln -s /dataNNN/hdfs_data /datah
- login as root@nameserver (ladd12)
- cd /etc/hadoop/conf
- add new node to hosts
- add new node to topology.perl
- su - hdfs, run: hadoop dfsadmin -refreshNodes
- login as root@ladd00
- add new node to /etc/httpd/conf.d/ssl.conf around "hadoop-50075"
- service httpd restart
- observe data node is listed as dead on the nameserver web page
- add to /etc/rc.local: firewall rules, service startup and hdfs mount
iptables -F hadoop iptables -N hadoop iptables -A hadoop -p tcp -s 142.90.0.0/16 --dport 50010 -j ACCEPT iptables -A hadoop -p tcp -s 142.90.0.0/16 --dport 50020 -j ACCEPT iptables -A hadoop -p tcp -s 142.90.0.0/16 --dport 50075 -j ACCEPT iptables -A hadoop -p tcp --dport 50010 -j REJECT iptables -A hadoop -p tcp --dport 50020 -j REJECT iptables -A hadoop -p tcp --dport 50075 -j REJECT iptables -A INPUT -j hadoop iptables -L -v service hadoop-hdfs-datanode restart mkdir /hdfs hadoop-fuse-dfs dfs://ladd12:8020 /hdfs
- start the data node: run /etc/rc.local
- observe data node is now listed as live on the nameserver web page
decommission a data node
- add node name to "hosts.exclude", should be typed exactly as it is entered in "hosts"
- login hdfs@namenode
- hdfs dfsadmin -refreshNodes
- go to the hdfs web interface "Live Nodes" page, observe the data node status is "decommission in progress"
- go to the hdfs web interface "Decommissioning Nodes", observe node is listed, observe "Under Replicated Blocks" count down to zero
- after the node no longer listed in "Decommissioning Nodes" and is listed as "Decommissioned" in "Live Nodes", it can be shut down.
create user directory
- login hdfs@namenode
- hadoop fs -mkdir /users/trinat
- hadoop fs -chown trinat /users/trinat
rebalance the data
(does not work with)
- login hdfs@namenode
- hdfs balancer
- watch the grass grow
Setup quotas
- ssh hdfs@namenode
- hdfs dfsadmin -setSpaceQuota 600g /users/trinat
note: the quota applies to the size of files and their replicas. If default replication count is 3, the user can create 600g/3 = 200g worth of files.
Report quotas
- (as any user)
- hdfs dfs -count -q /users/"*"
The reported numbers are not obvious, for example:
ladd12:fs$ hdfs dfs -count -q /users/"*" none inf none inf 1 5 7809498 /users/lindner none inf none inf 1 1 130360111 /users/olchansk none inf 600000000000 208986905727 2 146 130337698091 /users/trinat
Here:
- 600000000000 is the value set by dfsadmin -setSpaceQuota
- 130337698091 is the size of all files not counting replicas (same value as reported by "du -ks /hdfs/users/trinat")
- 208986905727 = 600000000000 - 3*130337698091 is the remaining space couning replicas
- actual remaining space is 130337698091/3 = 69662301909 (about 69 Gbytes) assuming replication count of 3.
debug functions
- start the data node: service hadoop-hdfs-datanode restart
- look at the log file: tail -100 /var/log/hadoop-hdfs/hadoop-hdfs-datanode-ladd12.triumf.ca.log
- mount hdfs: unset LD_LIBRARY_PATH; hadoop-fuse-dfs dfs://ladd12:8020 /mnt/xxx
- watch name node: http://ladd12.triumf.ca:50070
- watch data node: http://ladd08.triumf.ca:50075
- report status and configuration: hdfs dfsadmin -report
- report disk usage: hdfs dfs -count -q /users/"*"
- report location of nodes: hdfs dfsadmin -printTopology
- report everything about every file: hdfs fsck / -openforwrite -files -blocks -locations -racks
Performance
- cluster with 3 data nodes: ladd12 (quad-core i7-860), ladd08 (dual opteron 2GHz), ladd05 (dual-core Athlon 2.6 GHz)
- hdfs mounted on ladd12
- write benchmark:
ladd12:olchansk$ /usr/bin/time dd if=/dev/zero of=xxx bs=1024k count=1000 1000+0 records in 1000+0 records out 1048576000 bytes (1.0 GB) copied, 31.3406 s, 33.5 MB/s 0.00user 0.18system 0:31.43elapsed 0%CPU (0avgtext+0avgdata 7104maxresident)k 0inputs+0outputs (0major+483minor)pagefaults 0swaps ladd12:olchansk$ /usr/bin/time dd if=/dev/zero of=xxx2 bs=1024k count=10000 10000+0 records in 10000+0 records out 10485760000 bytes (10 GB) copied, 436.875 s, 24.0 MB/s 0.00user 1.72system 7:16.97elapsed 0%CPU (0avgtext+0avgdata 7104maxresident)k 0inputs+0outputs (0major+483minor)pagefaults 0swaps
ganglia reports peak network use 30 Mbytes/sec, ladd12: load 1, cpu use 10%, disk use 15%+15%busy; ladd05: load 2, cpu use 30%sys, 30%wait, disk use 15%+20%busy; ladd08: load 6, cpu use 30%sys, 70%wait, disk use 40%+40%busy.
- read benchmark:
ladd12:olchansk$ ls -l total 11391305 -rw-r--r-- 1 olchansk nobody 130360111 Dec 28 18:36 run21600sub00000.mid.gz -rw-r--r-- 1 olchansk nobody 1048576000 Jan 2 10:35 xxx -rw-r--r-- 1 olchansk nobody 10485760000 Jan 2 10:37 xxx2 ladd12:olchansk$ /usr/bin/time dd if=run21600sub00000.mid.gz of=/dev/null 254609+1 records in 254609+1 records out 130360111 bytes (130 MB) copied, 1.55497 s, 83.8 MB/s 0.02user 0.08system 0:01.69elapsed 6%CPU (0avgtext+0avgdata 3024maxresident)k 254632inputs+0outputs (1major+227minor)pagefaults 0swaps ladd12:olchansk$ ladd12:olchansk$ /usr/bin/time dd if=xxx of=/dev/null 2048000+0 records in 2048000+0 records out 1048576000 bytes (1.0 GB) copied, 11.4278 s, 91.8 MB/s 0.18user 0.69system 0:11.43elapsed 7%CPU (0avgtext+0avgdata 3008maxresident)k 2048000inputs+0outputs (0major+227minor)pagefaults 0swaps ladd12:olchansk$ ladd12:olchansk$ /usr/bin/time dd if=xxx2 of=/dev/null 20480000+0 records in 20480000+0 records out 10485760000 bytes (10 GB) copied, 130.188 s, 80.5 MB/s 1.68user 7.10system 2:10.19elapsed 6%CPU (0avgtext+0avgdata 3024maxresident)k 20480000inputs+0outputs (0major+228minor)pagefaults 0swaps
no network use, cpu use ladd12 is 10%+30%wait, disk use is 0% because all data was cached in memory: 10GB file, 16GB RAM.
Network ports
- data node ports:
[root@ladd08 ~]# diff xxx1 xxx2 | grep LISTEN > tcp 0 0 :::50020 :::* LISTEN > tcp 0 0 :::47860 :::* LISTEN > tcp 0 0 :::50010 :::* LISTEN > tcp 0 0 :::50075 :::* LISTEN
port 47860 is the java rmi server port, it is different on different machines and sometimes changes after restart. how to pin it to a fixed port number so I can firewall it?!?
- name node:
[root@ladd12 ~]# netstat -an | grep LISTEN | grep tcp > xxx2 [root@ladd12 ~]# diff xxx1 xxx2 < tcp 0 0 ::ffff:142.90.119.126:8020 :::* LISTEN < tcp 0 0 :::50070 :::* LISTEN < tcp 0 0 :::52745 :::* LISTEN
port 52745 is probably also the java rmi server port...
- name node:
> lsof -P | grep LISTEN | grep java java 18096 hdfs 144u IPv4 1464177661 0t0 TCP *:44898 (LISTEN) java 18096 hdfs 155u IPv4 1464179424 0t0 TCP ladd12.triumf.ca:8020 (LISTEN) java 18096 hdfs 165u IPv4 1464180124 0t0 TCP *:50070 (LISTEN) java 18332 hdfs 144u IPv4 1464181247 0t0 TCP *:59445 (LISTEN) java 18332 hdfs 151u IPv4 1464182326 0t0 TCP *:50010 (LISTEN) java 18332 hdfs 152u IPv4 1464182357 0t0 TCP *:50075 (LISTEN) java 18332 hdfs 157u IPv4 1464182653 0t0 TCP *:50020 (LISTEN)
port 44898 and 59445 are not documented. probably java rpm server ports