Period 1
Let's install Hadoop using a CentOS image.
(Some people didn't have it, so the instructor put it in the shared folder: the file CentOS-7-x86_64-DVD-1611.iso.
I already have it on a USB stick,
but since everyone else is installing fresh, I'll install fresh too...)
★ Hadoop Distributed File System (HDFS)
- Hadoop is a Java-based open-source framework for distributed processing of large data sets.
- To work with Hadoop, Java is a must.
- Google tried to load its ever-growing big data (web pages, log-type data such as IoT logs)
into an RDBMS (like Oracle) to store and process it, but there was too much data and it failed,
so Google developed its own big-data storage technology (the Google File System) and published it as a paper.
Doug Cutting, then at Yahoo! (the person who created Hadoop), read that paper and implemented it in Java.
★ Oracle vs Hadoop
RDBMS (Oracle) | Hadoop |
real-time data processing | batch processing |
expensive | free, distributed processing |
※ Distributed processing: multiple nodes (servers) are tied together so they look like a single server,
and data is processed with the resources of all those nodes, so processing is very fast.
If processing 1 TB (terabyte) of data takes about two and a half hours on a single server,
several servers working in parallel under Hadoop can read the data in under 2 minutes.
Example: in 2008 the New York Times converted 130 years of articles - 11 million pages -
to PDF in a single day using Hadoop, at a cost of about 2 million won.
(Estimated time for the same job on an ordinary, single server: 14 years.)
★ Strengths and weaknesses of Hadoop
Strengths | Weaknesses |
cheap to build | being free, maintenance support is hard to come by |
fast data processing for the cost | the locations of the actual data are tracked by a separate server called the NameNode, and if the NameNode breaks there is no high availability (which is why, from Hadoop 2 onward, the NameNode is run as a redundant pair) |
Period 2
The installation proper.
New > Name: centos > choose a location > Type: Linux / Version: Red Hat (64-bit) > Next
Start the VM you just made.
Select the top line, highlighted in white (Install CentOS 7) > Enter
Select English > set the date/time to Korea.
Be sure to press the save button at the top.
Save.
Host name: centos > Done
Begin Installation
Root password: 1234
Create a user > password 1234
You can watch the installation in progress.
When it's done, press Reboot.
The screen churns away on its own for quite a while, then:
Agree to the license.
Click Network & Host Name and
check that the Ethernet adapters (enp0s3 and enp0s8) are both switched ON.
Then tick the two boxes under General,
check that enp0s8's address is 192.168.56.10 and that the host name is centos,
then Done > Finish Configuration.
When it asks you to log in > click Not listed?
Log in as root > 1234.
When the Welcome screens appear: Next > Next > Skip > the light-blue Start-something button > x
Devices > Insert Guest Additions CD image > Run > a script window opens and the install runs.
When "Press Return to close..." or the like appears, close the window with x.
Right-click the desktop > Open Terminal > ifconfig > it should show 192.168.56.10.
Connect with PuTTY.
When login as: appears, enter root > password 1234.
[root@centos ~]# vi /etc/hosts
Open the file and
paste 192.168.56.10 centos on the very last line, then save.
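The tail of the file should then look roughly like this (the two localhost lines are the CentOS defaults, left alone):
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.56.10 centos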
Clearing the firewall rules
[root@centos ~]# iptables -F
[root@centos ~]# iptables -L
Run both (-F flushes the rules, -L lists them).
Then the (now empty) rule list is shown:
[root@centos ~]# iptables -F
[root@centos ~]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD_IN_ZONES (0 references)
target prot opt source destination
Chain FORWARD_IN_ZONES_SOURCE (0 references)
target prot opt source destination
Chain FORWARD_OUT_ZONES (0 references)
target prot opt source destination
Chain FORWARD_OUT_ZONES_SOURCE (0 references)
target prot opt source destination
Chain FORWARD_direct (0 references)
target prot opt source destination
Chain FWDI_public (0 references)
target prot opt source destination
Chain FWDI_public_allow (0 references)
target prot opt source destination
Chain FWDI_public_deny (0 references)
target prot opt source destination
Chain FWDI_public_log (0 references)
target prot opt source destination
Chain FWDO_public (0 references)
target prot opt source destination
Chain FWDO_public_allow (0 references)
target prot opt source destination
Chain FWDO_public_deny (0 references)
target prot opt source destination
Chain FWDO_public_log (0 references)
target prot opt source destination
Chain INPUT_ZONES (0 references)
target prot opt source destination
Chain INPUT_ZONES_SOURCE (0 references)
target prot opt source destination
Chain INPUT_direct (0 references)
target prot opt source destination
Chain IN_public (0 references)
target prot opt source destination
Chain IN_public_allow (0 references)
target prot opt source destination
Chain IN_public_deny (0 references)
target prot opt source destination
Chain IN_public_log (0 references)
target prot opt source destination
Chain OUTPUT_direct (0 references)
target prot opt source destination
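(Note that iptables -F only clears the rules until the next reboot. On CentOS 7 the rules are managed by firewalld, so to keep the firewall off for good - fine for a practice VM, never for production - you can stop and disable the service:)
[root@centos ~]# systemctl stop firewalld
[root@centos ~]# systemctl disable firewalld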
(Download a file from the shared folder - an archive named jdk-8u131-linux-x64.tar.gz.)
Or in an FTP client: 192.168.56.10 / root / 1234 / port 22 > Quickconnect > find the jdk file you just downloaded
and send it over to /root on the remote site.
(If you really can't find the file, just open the Downloads folder yourself
and drag it into the right-hand pane, the destination.)
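(If the FTP client is being difficult, scp from the host PC's terminal does the same job, assuming an OpenSSH client is installed and the VM answers at 192.168.56.10:)
scp jdk-8u131-linux-x64.tar.gz root@192.168.56.10:/root/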
# Setting up the JDK
[root@centos ~]#
First, check ↑ that the host name in the prompt reads centos like this.
(Mine is fine.)
If it hasn't changed, though, change it like this:
# Change the host name
[root@centos ~]# hostnamectl set-hostname centos
# Check the host name
[root@centos ~]# hostnamectl
# JDK setup - make a directory (we'll move the jdk archive there)
[root@centos ~]# mkdir -p /usr/java
[root@centos ~]# ls
[root@centos ~]# mv jdk-8u131-linux-x64.tar.gz /usr/java
Move to the directory it went to, then extract the archive:
[root@centos ~]# cd /usr/java
[root@centos java]# ls
jdk-8u131-linux-x64.tar.gz
Extract:
[root@centos java]# tar vxfz jdk-8u131-linux-x64.tar.gz
A massively long list scrolls past.
[root@centos java]# ls
jdk1.8.0_131 jdk-8u131-linux-x64.tar.gz
Move into the jdk1.8.0_131 directory:
[root@centos java]# cd jdk1.8.0_131
[root@centos jdk1.8.0_131]# pwd
/usr/java/jdk1.8.0_131
Then plain cd to go back to the home directory.
[root@centos~]# vi /etc/profile
Open the file > paste these lines at the very bottom and save:
export JAVA_HOME=/usr/java/jdk1.8.0_131
export PATH=$PATH:$JAVA_HOME/bin
export CLASS_PATH="."
Apply it:
[root@centos~]# source /etc/profile
[root@centos~]# echo $JAVA_HOME
/usr/java/jdk1.8.0_131
Check the Java version:
[root@centos~]# java -version
openjdk version "1.8.0_102"
OpenJDK Runtime Environment (build 1.8.0_102-b14)
OpenJDK 64-Bit Server VM (build 25.102-b14, mixed mode)
# Change the default JDK (java still points at the preinstalled OpenJDK 1.8.0_102, so switch it to ours)
[root@centos~]# which java
/usr/bin/java
[root@centos~]# update-alternatives --install "/usr/bin/java" "java" "/usr/java/jdk1.8.0_131/bin/java" 1
[root@centos~]# update-alternatives --config java
Enter the number of the newly added entry (the one whose command is
/usr/java/jdk1.8.0_131/bin/java) and press Enter.
If you registered a wrong path:
[root@centos ~]# update-alternatives --remove java /usr/java/jdk1.8.0_131/bin/java
removes the bad entry (the path has to match the one given to --install, including /bin/java).
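(To skip the interactive menu, --set picks the alternative directly - again, the path must be exactly the one registered with --install:)
[root@centos ~]# update-alternatives --set java /usr/java/jdk1.8.0_131/bin/java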
Period 3
# Create the hadoop group
[root@centos ~]# groupadd hadoop
[root@centos ~]# tail /etc/group
# Create the hadoop user
[root@centos ~]# useradd -g hadoop hadoop
[root@centos ~]# tail /etc/passwd
# Set the hadoop user's password
[root@centos ~]# passwd hadoop
(password: hadoop)
# Log in as the hadoop user
[root@centos ~]# su - hadoop
# Download the archive (hadoop-3.2.4.tar.gz) from the shared folder and send it to /home/hadoop
[hadoop@centos ~]$ ls
hadoop-3.2.4.tar.gz
# Extract the archive we just sent over
[hadoop@centos ~]$ tar xvzf hadoop-3.2.4.tar.gz
It churns away on its own again for a while.
[hadoop@centos ~]$ ls
hadoop-3.2.4 hadoop-3.2.4.tar.gz
[hadoop@centos ~]$ cd hadoop-3.2.4
[hadoop@centos hadoop-3.2.4]$ pwd
/home/hadoop/hadoop-3.2.4
# Back in hadoop's home directory
[hadoop@centos ~]$ vi .bashrc
Open the file > paste these 4 lines at the very bottom and save:
export JAVA_HOME=/usr/java/jdk1.8.0_131
export HADOOP_HOME=/home/hadoop/hadoop-3.2.4
export HADOOP_CONFIG_HOME=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[hadoop@centos ~]$ source .bashrc
[hadoop@centos ~]$ echo $HADOOP_HOME
/home/hadoop/hadoop-3.2.4
Check the Java version:
[hadoop@centos ~]$ java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
Check the Hadoop version:
[hadoop@centos ~]$ hadoop version
Hadoop 3.2.4
Source code repository Unknown -r 7e5d9983b388e372fe640f21f048f2f2ae6e9eba
Compiled by ubuntu on 2022-07-12T11:58Z
Compiled with protoc 2.5.0
From source with checksum ee031c16fe785bbb35252c749418712
This command was run using /home/hadoop/hadoop-3.2.4/share/hadoop/common/hadoop-common-3.2.4.jar
# Generate a public key so we don't have to enter an id/password
every time we log in to each server
( # If an old public key happened to exist, delete it first)
[hadoop@centos ~]$ rm -rf .ssh
# Generate the key pair
[hadoop@centos ~]$ ssh-keygen
Press Enter at every prompt.
[hadoop@centos ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@192.168.56.10
yes > when it asks for the password: hadoop
[hadoop@centos ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@192.168.56.10
The authenticity of host '192.168.56.10 (192.168.56.10)' can't be established.
ECDSA key fingerprint is f0:f5:64:03:7c:c1:ad:70:8e:25:5e:79:f4:31:ca:06.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@192.168.56.10's password: (enter the password: hadoop)
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop@192.168.56.10'"
and check to make sure that only the key(s) you wanted were added.
Now let's check whether connecting still asks for a password.
[hadoop@centos ~]$ ssh hadoop@192.168.56.10
(It lets you straight in without asking for a password - because the public key is in place.)
To get back out, just run exit:
[hadoop@centos ~]$ exit
logout
Period 4
Download the file from the shared folder.
[hadoop@centos ~]$ cd $HADOOP_HOME/etc/hadoop
[hadoop@centos hadoop]$ pwd
/home/hadoop/hadoop-3.2.4/etc/hadoop
[hadoop@centos hadoop]$ vi hadoop-env.sh
Paste these two lines at the very bottom and save:
export JAVA_HOME=/usr/java/jdk1.8.0_131
export HADOOP_HOME=/home/hadoop/hadoop-3.2.4
Create a new file called masters
and put the IP address 192.168.56.10 in it:
[hadoop@centos hadoop]$ vi masters
[hadoop@centos hadoop]$ vi workers
Delete the part that says localhost > write 192.168.56.10 and save.
[hadoop@centos hadoop]$ vi core-site.xml
Delete the empty <configuration> block (2 lines)
and paste in the part below:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://centos:9010</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-3.2.4/tmp</value>
</property>
</configuration>
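(fs.defaultFS is the URI every HDFS client uses to reach the NameNode. The 9010 port is just what this class picked - Hadoop's usual default is 8020 - and the same hdfs://centos:9010 address appears again later when loading data into Hive, so it has to match.)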
[hadoop@centos hadoop]$ vi hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/data/dfs/namenode</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/home/hadoop/data/dfs/namesecondary</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/data/dfs/datanode</value>
</property>
<property>
<name>dfs.http.address</name>
<value>centos:50070</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>centos:50090</value>
</property>
</configuration>
[hadoop@centos hadoop]$ vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
</configuration>
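(The three HADOOP_MAPRED_HOME properties matter on Hadoop 3: without them, MapReduce jobs submitted to YARN typically fail with an error like "Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster".)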
[hadoop@centos hadoop]$ vi yarn-site.xml
Delete the whole empty configuration block, as before, and paste all of this in:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/home/hadoop/data/yarn/nm-local-dir</value>
</property>
<property>
<name>yarn.resourcemanager.fs.state-store.uri</name>
<value>/home/hadoop/data/yarn/system/rmstore</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>centos</value>
</property>
<property>
<name>yarn.web-proxy.address</name>
<value>0.0.0.0:8089</value>
</property>
</configuration>
[hadoop@centos hadoop]$ vi yarn-env.sh
Open the file and add these two lines at the very bottom:
JAVA=$JAVA_HOME/bin/java
JAVA_HEAP_MAX=-Xmx1000m
Environment configuration done!
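(Before formatting, it doesn't hurt to confirm that core-site.xml is actually being picked up; hdfs getconf should echo back the value we configured:)
[hadoop@centos hadoop]$ hdfs getconf -confKey fs.defaultFS
hdfs://centos:9010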
# Format (creating the Hadoop storage)
Go to the home directory (doing it from the current directory is fine too).
[hadoop@centos hadoop]$ cd
# Format the Hadoop file system
[hadoop@centos ~]$ hdfs namenode -format
[hadoop@centos ~]$ hdfs namenode -format
WARNING: /home/hadoop/hadoop-3.2.4/logs does not exist. Creating.
2024-03-11 14:05:49,106 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = centos/192.168.56.10
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 3.2.4
STARTUP_MSG: classpath = /home/hadoop/hadoop-3.2.4/etc/hadoop:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerb-client-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/hadoop-annotations-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-configuration2-2.1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerb-common-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerb-crypto-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/hadoop-auth-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jersey-core-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jetty-xml-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jcip-annotations-1.0-1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/woodstox-core-5.3.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/gson-2.9.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-collections-3.2.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/netty-3.10.6.Final.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/zookeeper-3.4.14.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jersey-server-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/slf4j-reload4j-1.7.35.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/paranamer-2.3.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/javax.servlet-api-3.1.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/javax.activation-api-1.2.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/guava-27.0-jre.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/asm-5.0.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/j2objc-annotations-1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/curator-framework-2.13.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/accessors-smart-2.4.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jetty-server-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/re2j-1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerby-config-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jetty-servlet-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jackson-databind-2.10.5.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-beanutils-1.9.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/failureaccess-1.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-cli-1.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jersey-json-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/dnsjava-2.1.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jsp-api-2.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jaxb-api-2.2.11.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/reload4j-1.2.18.3.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerby-util-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerb-core-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/httpclient-4.5.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/nimbus-jose-jwt-9.8.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/com
mon/lib/curator-client-2.13.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-text-1.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jackson-annotations-2.10.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jetty-util-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jersey-servlet-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jetty-http-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerb-identity-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/spotbugs-annotations-3.1.9.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/checker-qual-2.5.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jetty-webapp-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/metrics-core-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/token-provider-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/snappy-java-1.0.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-net-3.6.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerb-server-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerby-pkix-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-compress-1.21.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/httpcore-4.4.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jul-to-slf4j-1.7.35.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jsr305-3.0.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-math3-3.1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerb-admin-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/curator-recipes-2.13.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jetty-util-ajax-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-codec-1.11.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-io-2.8.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/avro-1.7.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerby-asn1-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-lang3-3.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/json-smart-2.4.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerb-util-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jsr311-api-1.1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerb-simplekdc-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jackson-core-2.10.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/kerby-xdr-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jsch-0.1.55.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jettison-1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/commons-logging-1.1.3.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/animal-sniffer-annotations-1.17.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/error_prone_annotations-2.2.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/slf4j-api-1.7.35.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/audience-annotations-0.5.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jetty-security-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/jetty-io-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/lib/stax2-api-4.2.1.jar:/home/hadoop/hadoop
-3.2.4/share/hadoop/common/hadoop-common-3.2.4-tests.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/hadoop-common-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/hadoop-kms-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/common/hadoop-nfs-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerb-client-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/hadoop-annotations-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-configuration2-2.1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerb-common-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerb-crypto-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/hadoop-auth-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jersey-core-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jetty-xml-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jcip-annotations-1.0-1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/woodstox-core-5.3.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/gson-2.9.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/htrace-core4-4.1.0-incubating.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-collections-3.2.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/netty-3.10.6.Final.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/zookeeper-3.4.14.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jersey-server-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/paranamer-2.3.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/javax.servlet-api-3.1.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/json-simple-1.1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/javax.activation-api-1.2.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/guava-27.0-jre.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/asm-5.0.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/j2objc-annotations-1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/curator-framework-2.13.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/accessors-smart-2.4.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jetty-server-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/re2j-1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerby-config-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jetty-servlet-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jackson-databind-2.10.5.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-beanutils-1.9.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/failureaccess-1.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jersey-json-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/dnsjava-2.1.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jackson-xc-1.9.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jaxb-api-2.2.11.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/reload4j-1.2.18.3.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerby-util-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerb-core-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/
hadoop/hdfs/lib/httpclient-4.5.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/nimbus-jose-jwt-9.8.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/curator-client-2.13.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-text-1.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/netty-all-4.1.68.Final.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jackson-annotations-2.10.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jetty-util-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jersey-servlet-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jetty-http-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerb-identity-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/spotbugs-annotations-3.1.9.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/checker-qual-2.5.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jetty-webapp-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/token-provider-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/snappy-java-1.0.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-net-3.6.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerb-server-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerby-pkix-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-compress-1.21.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/httpcore-4.4.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jsr305-3.0.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-math3-3.1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerb-admin-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/curator-recipes-2.13.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jetty-util-ajax-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-codec-1.11.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-io-2.8.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/avro-1.7.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerby-asn1-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-lang3-3.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/json-smart-2.4.7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerb-util-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jsr311-api-1.1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerb-simplekdc-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jackson-jaxrs-1.9.13.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/okio-1.6.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jackson-core-2.10.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/kerby-xdr-1.0.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jsch-0.1.55.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jettison-1.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/animal-sniffer-annotations-1.17.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/error_prone_annotations-2.2.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/audience-annotations-0.5.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/jetty-security-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/okhttp-2.7.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdf
s/lib/jetty-io-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/lib/stax2-api-4.2.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-3.2.4-tests.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-client-3.2.4-tests.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-client-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-rbf-3.2.4-tests.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-native-client-3.2.4-tests.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-native-client-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-httpfs-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-rbf-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/hdfs/hadoop-hdfs-nfs-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/lib/junit-4.13.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-core-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-app-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-uploader-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-common-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-nativetask-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.2.4-tests.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/jersey-guice-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/swagger-annotations-1.5.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/json-io-2.5.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/HikariCP-java7-2.4.12.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/ehcache-3.3.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/bcpkix-jdk15on-1.60.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/jackson-jaxrs-json-provider-2.10.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/javax.inject-1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/aopalliance-1.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/jakarta.xml.bind-api-2.3.2.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/metrics-core-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/objenesis-1.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/snakeyaml-1.26.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/bcprov-jdk15on-1.60.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/jackson-module-jaxb-annotations-2.10.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/geronimo-jcache_1.0_spec-1.0-alpha-1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/jackson-jaxrs-base-2.10.5.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/fst-2.50.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/mssql-jdbc-6.2.1.jre7.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/guice-4.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/java-util-1.
9.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/jersey-client-1.19.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/guice-servlet-4.0.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/lib/jakarta.activation-api-1.2.1.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-server-tests-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-services-core-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-services-api-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-api-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-server-nodemanager-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-server-router-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-client-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-submarine-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-common-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-registry-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-server-common-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.2.4.jar:/home/hadoop/hadoop-3.2.4/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.4.jar
STARTUP_MSG: build = Unknown -r 7e5d9983b388e372fe640f21f048f2f2ae6e9eba; compiled by 'ubuntu' on 2022-07-12T11:58Z
STARTUP_MSG: java = 1.8.0_131
************************************************************/
2024-03-11 14:05:49,150 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2024-03-11 14:05:49,717 INFO namenode.NameNode: createNameNode [-format]
2024-03-11 14:05:50,738 INFO common.Util: Assuming 'file' scheme for path /home/hadoop/data/dfs/namenode in configuration.
2024-03-11 14:05:50,738 INFO common.Util: Assuming 'file' scheme for path /home/hadoop/data/dfs/namenode in configuration.
Formatting using clusterid: CID-8bac28ff-ddc0-45e6-a529-053cd4114fc7
2024-03-11 14:05:50,768 INFO namenode.FSEditLog: Edit logging is async:true
2024-03-11 14:05:50,806 INFO namenode.FSNamesystem: KeyProvider: null
2024-03-11 14:05:50,807 INFO namenode.FSNamesystem: fsLock is fair: true
2024-03-11 14:05:50,810 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2024-03-11 14:05:50,829 INFO namenode.FSNamesystem: fsOwner = hadoop (auth:SIMPLE)
2024-03-11 14:05:50,830 INFO namenode.FSNamesystem: supergroup = supergroup
2024-03-11 14:05:50,830 INFO namenode.FSNamesystem: isPermissionEnabled = true
2024-03-11 14:05:50,830 INFO namenode.FSNamesystem: HA Enabled: false
2024-03-11 14:05:50,896 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2024-03-11 14:05:50,903 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2024-03-11 14:05:50,903 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2024-03-11 14:05:50,906 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2024-03-11 14:05:50,906 INFO blockmanagement.BlockManager: The block deletion will start around 2024 Mar 11 14:05:50
2024-03-11 14:05:50,907 INFO util.GSet: Computing capacity for map BlocksMap
2024-03-11 14:05:50,907 INFO util.GSet: VM type = 64-bit
2024-03-11 14:05:50,921 INFO util.GSet: 2.0% max memory 409 MB = 8.2 MB
2024-03-11 14:05:50,921 INFO util.GSet: capacity = 2^20 = 1048576 entries
2024-03-11 14:05:50,937 INFO blockmanagement.BlockManager: Storage policy satisfier is disabled
2024-03-11 14:05:50,937 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2024-03-11 14:05:50,940 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2024-03-11 14:05:50,940 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2024-03-11 14:05:50,940 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2024-03-11 14:05:50,941 INFO blockmanagement.BlockManager: defaultReplication = 1
2024-03-11 14:05:50,941 INFO blockmanagement.BlockManager: maxReplication = 512
2024-03-11 14:05:50,941 INFO blockmanagement.BlockManager: minReplication = 1
2024-03-11 14:05:50,941 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
2024-03-11 14:05:50,941 INFO blockmanagement.BlockManager: redundancyRecheckInterval = 3000ms
2024-03-11 14:05:50,941 INFO blockmanagement.BlockManager: encryptDataTransfer = false
2024-03-11 14:05:50,941 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
2024-03-11 14:05:50,990 INFO namenode.FSDirectory: GLOBAL serial map: bits=29 maxEntries=536870911
2024-03-11 14:05:50,990 INFO namenode.FSDirectory: USER serial map: bits=24 maxEntries=16777215
2024-03-11 14:05:50,991 INFO namenode.FSDirectory: GROUP serial map: bits=24 maxEntries=16777215
2024-03-11 14:05:50,991 INFO namenode.FSDirectory: XATTR serial map: bits=24 maxEntries=16777215
2024-03-11 14:05:50,999 INFO util.GSet: Computing capacity for map INodeMap
2024-03-11 14:05:50,999 INFO util.GSet: VM type = 64-bit
2024-03-11 14:05:50,999 INFO util.GSet: 1.0% max memory 409 MB = 4.1 MB
2024-03-11 14:05:50,999 INFO util.GSet: capacity = 2^19 = 524288 entries
2024-03-11 14:05:50,999 INFO namenode.FSDirectory: ACLs enabled? false
2024-03-11 14:05:50,999 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2024-03-11 14:05:50,999 INFO namenode.FSDirectory: XAttrs enabled? true
2024-03-11 14:05:50,999 INFO namenode.NameNode: Caching file names occurring more than 10 times
2024-03-11 14:05:51,002 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
2024-03-11 14:05:51,003 INFO snapshot.SnapshotManager: SkipList is disabled
2024-03-11 14:05:51,007 INFO util.GSet: Computing capacity for map cachedBlocks
2024-03-11 14:05:51,007 INFO util.GSet: VM type = 64-bit
2024-03-11 14:05:51,007 INFO util.GSet: 0.25% max memory 409 MB = 1.0 MB
2024-03-11 14:05:51,007 INFO util.GSet: capacity = 2^17 = 131072 entries
2024-03-11 14:05:51,011 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2024-03-11 14:05:51,012 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2024-03-11 14:05:51,012 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2024-03-11 14:05:51,015 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2024-03-11 14:05:51,015 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2024-03-11 14:05:51,016 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2024-03-11 14:05:51,016 INFO util.GSet: VM type = 64-bit
2024-03-11 14:05:51,016 INFO util.GSet: 0.029999999329447746% max memory 409 MB = 125.6 KB
2024-03-11 14:05:51,016 INFO util.GSet: capacity = 2^14 = 16384 entries
2024-03-11 14:05:51,044 INFO namenode.FSImage: Allocated new BlockPoolId: BP-2095424250-192.168.56.10-1710133551027
2024-03-11 14:05:51,181 INFO common.Storage: Storage directory /home/hadoop/data/dfs/namenode has been successfully formatted.
2024-03-11 14:05:51,200 INFO namenode.FSImageFormatProtobuf: Saving image file /home/hadoop/data/dfs/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
2024-03-11 14:05:51,246 INFO namenode.FSImageFormatProtobuf: Image file /home/hadoop/data/dfs/namenode/current/fsimage.ckpt_0000000000000000000 of size 401 bytes saved in 0 seconds .
2024-03-11 14:05:51,258 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2024-03-11 14:05:51,268 INFO namenode.FSNamesystem: Stopping services started for active state
2024-03-11 14:05:51,269 INFO namenode.FSNamesystem: Stopping services started for standby state
2024-03-11 14:05:51,274 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid=0 when meet shutdown.
2024-03-11 14:05:51,274 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at centos/192.168.56.10
************************************************************/
# Start the Hadoop daemons
[hadoop@centos ~]$ start-all.sh
[hadoop@centos ~]$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [centos]
centos: Warning: Permanently added 'centos' (ECDSA) to the list of known hosts.
Starting datanodes
Starting secondary namenodes [centos]
Starting resourcemanager
Starting nodemanagers
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
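(A side note, not something we ran in class: start-all.sh is just a convenience wrapper, and the WARNING above is Hadoop telling you so. You can start the two halves separately, HDFS first and then YARN:)
[hadoop@centos ~]$ start-dfs.sh
[hadoop@centos ~]$ start-yarn.sh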
# Check the daemons that started
[hadoop@centos ~]$ jps
[hadoop@centos ~]$ jps
5106 WebAppProxyServer
5138 Jps
4579 ResourceManager
4036 NameNode
4152 DataNode
4699 NodeManager
4350 SecondaryNameNode
# Stopping the Hadoop daemons
[hadoop@centos ~]$ stop-all.sh
(DON'T!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!)
Run this and you keep getting connection refused errors afterwards -_-
[hadoop@centos ~]$ stop-all.sh
WARNING: Stopping all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: Use CTRL-C to abort.
Stopping namenodes on [centos]
Stopping datanodes
Stopping secondary namenodes [centos]
Stopping nodemanagers
Stopping resourcemanager
Stopping proxy server [0.0.0.0]
# Check the Hadoop files
hdfs dfs -ls /
[hadoop@centos ~]$ hdfs dfs -ls /
[hadoop@centos ~]$
# Over on the Hadoop file system side, make a directory
[hadoop@centos ~]$ hdfs dfs -mkdir /user
[hadoop@centos ~]$ hdfs dfs -mkdir /user
[hadoop@centos ~]$
[hadoop@centos ~]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2024-03-11 14:36 /user
Send the winter.txt file over to the hadoop user.
# Copy a local-file-system file into HDFS
[hadoop@centos ~]$ hdfs dfs -put /home/hadoop/winter.txt /user
[hadoop@centos ~]$ hdfs dfs -put /home/hadoop/winter.txt /user
[hadoop@centos ~]$
[hadoop@centos ~]$ hdfs dfs -ls /user
[hadoop@centos ~]$ cd $HADOOP_HOME/share/hadoop/mapreduce
[hadoop@centos mapreduce]$ pwd
/home/hadoop/hadoop-3.2.4/share/hadoop/mapreduce
[hadoop@centos mapreduce]$ ls
Of these, use the file hadoop-mapreduce-examples-3.2.4.jar.
[hadoop@centos mapreduce]$ yarn jar hadoop-mapreduce-examples-3.2.4.jar wordcount /user/winter.txt output
The result looks odd... a date should be coming out, but it isn't ㅠㅠ
[hadoop@centos mapreduce]$ yarn jar hadoop-mapreduce-examples-3.2.4.jar wordcount /user/winter.txt
Usage: wordcount <in> [<in>...] <out>
[hadoop@centos mapreduce]$ yarn jar hadoop-mapreduce-examples-3.2.4.jar wordcount /user/winter.txt output
2024-03-11 14:41:39,352 INFO client.RMProxy: Connecting to ResourceManager at centos/192.168.56.10:8032
2024-03-11 14:41:43,000 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1710135305685_0001
2024-03-11 14:41:44,187 INFO input.FileInputFormat: Total input files to process : 1
2024-03-11 14:41:44,620 INFO mapreduce.JobSubmitter: number of splits:1
2024-03-11 14:41:48,539 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1710135305685_0001
2024-03-11 14:41:48,562 INFO mapreduce.JobSubmitter: Executing with tokens: []
2024-03-11 14:41:50,582 INFO conf.Configuration: resource-types.xml not found
2024-03-11 14:41:50,582 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2024-03-11 14:41:52,411 INFO impl.YarnClientImpl: Submitted application application_1710135305685_0001
2024-03-11 14:41:57,300 INFO mapreduce.Job: The url to track the job: http://0.0.0.0:8089/proxy/application_1710135305685_0001/
2024-03-11 14:41:57,300 INFO mapreduce.Job: Running job: job_1710135305685_0001
2024-03-11 14:43:12,080 INFO mapreduce.Job: Job job_1710135305685_0001 running in uber mode : false
2024-03-11 14:43:12,161 INFO mapreduce.Job: map 0% reduce 0%
2024-03-11 14:44:01,649 INFO mapreduce.Job: map 100% reduce 0%
2024-03-11 14:44:35,328 INFO mapreduce.Job: map 100% reduce 100%
2024-03-11 14:44:41,436 INFO mapreduce.Job: Job job_1710135305685_0001 completed successfully
2024-03-11 14:44:41,642 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=68275
FILE: Number of bytes written=613389
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=114644
HDFS: Number of bytes written=48033
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=45709
Total time spent by all reduces in occupied slots (ms)=24624
Total time spent by all map tasks (ms)=45709
Total time spent by all reduce tasks (ms)=24624
Total vcore-milliseconds taken by all map tasks=45709
Total vcore-milliseconds taken by all reduce tasks=24624
Total megabyte-milliseconds taken by all map tasks=46806016
Total megabyte-milliseconds taken by all reduce tasks=25214976
Map-Reduce Framework
Map input records=4239
Map output records=19909
Map output bytes=193427
Map output materialized bytes=68275
Input split bytes=99
Combine input records=19909
Combine output records=5140
Reduce input groups=5140
Reduce shuffle bytes=68275
Reduce input records=5140
Reduce output records=5140
Spilled Records=10280
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=830
CPU time spent (ms)=1500
Physical memory (bytes) snapshot=374861824
Virtual memory (bytes) snapshot=5567590400
Total committed heap usage (bytes)=227540992
Peak Map Physical memory (bytes)=214331392
Peak Map Virtual memory (bytes)=2781409280
Peak Reduce Physical memory (bytes)=160530432
Peak Reduce Virtual memory (bytes)=2786181120
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=114545
File Output Format Counters
Bytes Written=48033
[hadoop@centos mapreduce]$ hdfs dfs -ls output
[hadoop@centos mapreduce]$ hdfs dfs -ls output
Found 2 items
-rw-r--r-- 1 hadoop supergroup 0 2024-03-11 14:44 output/_SUCCESS
-rw-r--r-- 1 hadoop supergroup 48033 2024-03-11 14:44 output/part-r-00000
[hadoop@centos mapreduce]$ cd
[hadoop@centos ~]$ hdfs dfs -cat output/part-r-00000
[hadoop@centos ~]$ hdfs dfs -cat output/part-r-00000 | tail -10
# HDFS에 저장된 파일을 로컬 파일 시스템으로 복사하기
[hadoop@centos ~]$ hdfs dfs -get output/part-r-00000 /home/hadoop/wc_output
[hadoop@centos ~]$ ls
data hadoop-3.2.4 hadoop-3.2.4.tar.gz wc_output winter.txt
[hadoop@centos ~]$ vi wc_output
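(wc_output is plain word<TAB>count lines, so the usual Linux tools work on it; for example, the 10 most frequent words:)
[hadoop@centos ~]$ sort -k2,2nr wc_output | head -10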
Period 5
# View all directories
[hadoop@centos ~]$ hdfs dfs -ls -R /
# View everything down to the subdirectories
[hadoop@centos ~]$ hdfs dfs -ls -R /user/hadoop
View directory disk usage
[hadoop@centos ~]$ hdfs dfs -du
48033 48033 output
[hadoop@centos ~]$ hdfs dfs -du output
0 0 output/_SUCCESS
48033 48033 output/part-r-00000
# View the total for everything: the -s option
[hadoop@centos ~]$ hdfs dfs -du -s
48033 48033 .
# Delete a file
[hadoop@centos ~]$ hdfs dfs -rm /user/hadoop/output/_SUCCESS
Deleted /user/hadoop/output/_SUCCESS
# Delete a directory along with everything inside it
[hadoop@centos ~]$ hdfs dfs -rm -r /user/hadoop/output/
Deleted /user/hadoop/output
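(Here the delete is immediate because HDFS trash is off by default; if fs.trash.interval were enabled, -rm would only move things into a .Trash directory, and you would add -skipTrash to really delete:)
[hadoop@centos ~]$ hdfs dfs -rm -r -skipTrash /user/hadoop/output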
Installing Hive
# Send the hive archive (apache-hive-3.1.3-bin.tar.gz) to the hadoop user
Check that it arrived.
Extract it:
[hadoop@centos ~]$ tar xvzf apache-hive-3.1.3-bin.tar.gz
[hadoop@centos ~]$ ls
[hadoop@centos ~]$ cd apache-hive-3.1.3-bin/
[hadoop@centos apache-hive-3.1.3-bin]$ pwd
/home/hadoop/apache-hive-3.1.3-bin
[hadoop@centos apache-hive-3.1.3-bin]$ cd
[hadoop@centos ~]$ vi .bashrc
The file still holds ↑ the four export lines from before;
change it so HIVE_HOME is set and $HIVE_HOME/bin is added to the PATH:
export HIVE_HOME=/home/hadoop/apache-hive-3.1.3-bin
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
[hadoop@centos ~]$ source .bashrc
[hadoop@centos ~]$ echo $HIVE_HOME
/home/hadoop/apache-hive-3.1.3-bin
# Hive environment configuration
[hadoop@centos ~]$ cd $HIVE_HOME/conf
# Create a new file and paste in the contents below
[hadoop@centos conf]$ vi hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
</configuration>
[hadoop@centos conf]$ vi $HIVE_HOME/bin/hive-config.sh
export HADOOP_HOME=/home/hadoop/hadoop-3.2.4
[hadoop@centos conf]$ cp $HADOOP_HOME/share/hadoop/common/lib/guava-27.0-jre.jar $HIVE_HOME/lib/guava-27.0-jre.jar
[hadoop@centos conf]$ mv $HIVE_HOME/lib/guava-19.0.jar $HIVE_HOME/lib/guava-19.0.jar.bak
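(Why the guava swap above: Hive 3.1.3 ships guava-19.0 while Hadoop 3.2.4 uses guava-27.0, and with the old jar on the classpath hive/schematool tends to die immediately with a guava-related NoSuchMethodError.)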
# Create the directory
/user/hive/warehouse
[hadoop@centos conf]$ hdfs dfs -mkdir -p /user/hive/warehouse
# Check the subdirectories inside the hive directory
[hadoop@centos conf]$ hdfs dfs -ls -R /user/hive
drwxr-xr-x - hadoop supergroup 0 2024-03-11 15:13 /user/hive/warehouse
# Give write permission to users in the same group
[hadoop@centos conf]$ hdfs dfs -chmod g+w /user/hive/warehouse
[hadoop@centos conf]$ hdfs dfs -ls -R /user/hive
drwxrwxr-x - hadoop supergroup 0 2024-03-11 15:13 /user/hive/warehouse
[hadoop@centos conf]$ schematool -dbType derby -initSchema
(A screenful of black output is normal; you only need the last two lines about Initialization. If they appear, it's done:)
Initialization script completed
schemaTool completed --- like this.
Now all that's left is for the hive prompt to appear,
like this: hive>
If it does, there's no problem at all.
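(One thing worth knowing about the embedded Derby metastore: schematool creates a metastore_db directory under the current working directory, so start hive from this same directory - run from somewhere else, it would create a fresh, empty metastore.)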
[hadoop@centos conf]$ hive
hive> show databases;
OK
database_name
default
Time taken: 1.191 seconds, Fetched: 1 row(s)
hive> show tables;
Move the two files above (emp.csv and dept.csv) over to the hadoop user.
# In a new session, check that the files you just moved are there.
Leave the hive (hive>) window open as it is
and keep working in the new session.
[hadoop@centos ~]$ hdfs dfs -put /home/hadoop/emp.csv /user
[hadoop@centos ~]$ hdfs dfs -put /home/hadoop/dept.csv /user
(This takes a little while...)
[hadoop@centos ~]$ hdfs dfs -ls /user
Period 6
Check the files:
[hadoop@centos ~]$ hdfs dfs -cat /user/emp.csv | head -2
100,Steven,King,SKING,515.123.4567,2003-06-17,AD_PRES,24000,,,90
101,Neena,Kochhar,NKOCHHAR,515.123.4568,2005-09-21,AD_VP,17000,,100,90
[hadoop@centos ~]$ hdfs dfs -cat /user/dept.csv | head -2
10,Administration,200,1700
20,Marketing,201,1800
Come back over to the hive side and create a table.
(If OK appears, the table was created.)
create table emp
(empno int,
fname string,
lname string,
mail string,
phone string,
hiredate string,
job string,
sal int,
comm int,
mgr int,
deptno int)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile;
hive> show tables;
You can see that a table named emp has been created.
hive> desc emp;
hive> select * from emp;
# Load a local file into Hive
(overwrite means: if any data happened to be there already, overwrite it)
hive> load data local inpath '/home/hadoop/emp.csv' overwrite into table emp;
hive> select * from emp;
Then 107 rows of data are returned.
hive> select * from emp;
OK
emp.empno emp.fname emp.lname emp.mail emp.phone emp.hiredate emp.job emp.sal emp.comm emp.mgr emp.deptno
100 Steven King SKING 515.123.4567 2003-06-17 NULL 24000 NULL NULL
101 Neena Kochhar NKOCHHAR 515.123.4568 2005-09-21 NULL 17000 NULL 100
102 Lex De Haan LDEHAAN 515.123.4569 2001-01-13 NULL 17000 NULL 100
103 Alexander Hunold AHUNOLD 590.423.4567 2006-01-03 NULL 9000 NULL 102
104 Bruce Ernst BERNST 590.423.4568 2007-05-21 NULL 6000 NULL 103
105 David Austin DAUSTIN 590.423.4569 2005-06-25 NULL 4800 NULL 103
106 Valli Pataballa VPATABAL 590.423.4560 2006-02-05NULL 4800 NULL 103
107 Diana Lorentz DLORENTZ 590.423.5567 2007-02-07 NULL 4200 NULL 103
108 Nancy Greenberg NGREENBE 515.124.4569 2002-08-17NULL 12008 NULL 101
109 Daniel Faviet DFAVIET 515.124.4169 2002-08-16 NULL 9000 NULL 108
110 John Chen JCHEN 515.124.4269 2005-09-28 NULL 8200 NULL 108
111 Ismael Sciarra ISCIARRA 515.124.4369 2005-09-30 NULL 7700 NULL 108
112 Jose Manuel Urman JMURMAN 515.124.4469 2006-03-07 NULL 7800 NULL 108
113 Luis Popp LPOPP 515.124.4567 2007-12-07 NULL 6900 NULL 108
114 Den Raphaely DRAPHEAL 515.127.4561 2002-12-07NULL 11000 NULL 100
115 Alexander Khoo AKHOO 515.127.4562 2003-05-18 NULL 3100 NULL 114
116 Shelli Baida SBAIDA 515.127.4563 2005-12-24 NULL 2900 NULL 114
117 Sigal Tobias STOBIAS 515.127.4564 2005-07-24 NULL 2800 NULL 114
118 Guy Himuro GHIMURO 515.127.4565 2006-11-15 NULL 2600 NULL 114
119 Karen Colmenares KCOLMENA 515.127.4566 2007-08-10NULL 2500 NULL 114
120 Matthew Weiss MWEISS 650.123.1234 2004-07-18 NULL 8000 NULL 100
121 Adam Fripp AFRIPP 650.123.2234 2005-04-10 NULL 8200 NULL 100
122 Payam Kaufling PKAUFLIN 650.123.3234 2003-05-01NULL 7900 NULL 100
123 Shanta Vollman SVOLLMAN 650.123.4234 2005-10-10 NULL 6500 NULL 100
124 Kevin Mourgos KMOURGOS 650.123.5234 2007-11-16 NULL 5800 NULL 100
125 Julia Nayer JNAYER 650.124.1214 2005-07-16 NULL 3200 NULL 120
126 Irene Mikkilineni IMIKKILI 650.124.1224 2006-09-28NULL 2700 NULL 120
127 James Landry JLANDRY 650.124.1334 2007-01-14 NULL 2400 NULL 120
128 Steven Markle SMARKLE 650.124.1434 2008-03-08 NULL 2200 NULL 120
129 Laura Bissot LBISSOT 650.124.5234 2005-08-20 NULL 3300 NULL 121
130 Mozhe Atkinson MATKINSO 650.124.6234 2005-10-30NULL 2800 NULL 121
131 James Marlow JAMRLOW 650.124.7234 2005-02-16 NULL 2500 NULL 121
132 TJ Olson TJOLSON 650.124.8234 2007-04-10 NULL 2100 NULL 121
133 Jason Mallin JMALLIN 650.127.1934 2004-06-14 NULL 3300 NULL 122
134 Michael Rogers MROGERS 650.127.1834 2006-08-26 NULL 2900 NULL 122
135 Ki Gee KGEE 650.127.1734 2007-12-12 NULL 2400 NULL 122
136 Hazel Philtanker HPHILTAN 650.127.1634 2008-02-06NULL 2200 NULL 122
137 Renske Ladwig RLADWIG 650.121.1234 2003-07-14 NULL 3600 NULL 123
138 Stephen Stiles SSTILES 650.121.2034 2005-10-26 NULL 3200 NULL 123
139 John Seo JSEO 650.121.2019 2006-02-12 NULL 2700 NULL 123
140 Joshua Patel JPATEL 650.121.1834 2006-04-06 NULL 2500 NULL 123
141 Trenna Rajs TRAJS 650.121.8009 2003-10-17 NULL 3500 NULL 124
142 Curtis Davies CDAVIES 650.121.2994 2005-01-29 NULL 3100 NULL 124
143 Randall Matos RMATOS 650.121.2874 2006-03-15 NULL 2600 NULL 124
144 Peter Vargas PVARGAS 650.121.2004 2006-07-09 NULL 2500 NULL 124
145 John Russell JRUSSEL 011.44.1344.429268 2004-10-01 NULL 14000 0 100
146 Karen Partners KPARTNER 011.44.1344.467268 2005-01-05 NULL 13500 0 100
147 Alberto Errazuriz AERRAZUR 011.44.1344.429278 2005-03-10 NULL 12000 0 100
148 Gerald Cambrault GCAMBRAU 011.44.1344.619268 2007-10-15 NULL 11000 0 100
149 Eleni Zlotkey EZLOTKEY 011.44.1344.429018 2008-01-29NULL 10500 0 100
150 Peter Tucker PTUCKER 011.44.1344.129268 2005-01-30 NULL 10000 0 145
151 David Bernstein DBERNSTE 011.44.1344.345268 2005-03-24 NULL 9500 0 145
152 Peter Hall PHALL 011.44.1344.478968 2005-08-20 NULL 9000 0 145
153 Christopher Olsen COLSEN 011.44.1344.498718 2006-03-30NULL 8000 0 145
154 Nanette Cambrault NCAMBRAU 011.44.1344.987668 2006-12-09 NULL 7500 0 145
155 Oliver Tuvault OTUVAULT 011.44.1344.486508 2007-11-23NULL 7000 0 145
156 Janette King JKING 011.44.1345.429268 2004-01-30 NULL 10000 0 146
157 Patrick Sully PSULLY 011.44.1345.929268 2004-03-04 NULL 9500 0 146
158 Allan McEwen AMCEWEN 011.44.1345.829268 2004-08-01 NULL 9000 0 146
159 Lindsey Smith LSMITH 011.44.1345.729268 2005-03-10 NULL 8000 0 146
160 Louise Doran LDORAN 011.44.1345.629268 2005-12-15 NULL 7500 0 146
161 Sarath Sewall SSEWALL 011.44.1345.529268 2006-11-03 NULL 7000 0 146
162 Clara Vishney CVISHNEY 011.44.1346.129268 2005-11-11NULL 10500 0 147
163 Danielle Greene DGREENE 011.44.1346.229268 2007-03-19NULL 9500 0 147
164 Mattea Marvins MMARVINS 011.44.1346.329268 2008-01-24NULL 7200 0 147
165 David Lee DLEE 011.44.1346.529268 2008-02-23 NULL 6800 0 147
166 Sundar Ande SANDE 011.44.1346.629268 2008-03-24 NULL 6400 0 147
167 Amit Banda ABANDA 011.44.1346.729268 2008-04-21 NULL 6200 0 147
168 Lisa Ozer LOZER 011.44.1343.929268 2005-03-11 NULL 11500 0 148
169 Harrison Bloom HBLOOM 011.44.1343.829268 2006-03-23NULL 10000 0 148
170 Tayler Fox TFOX 011.44.1343.729268 2006-01-24 NULL 9600 0 148
171 William Smith WSMITH 011.44.1343.629268 2007-02-23 NULL 7400 0 148
172 Elizabeth Bates EBATES 011.44.1343.529268 2007-03-24NULL 7300 0 148
173 Sundita Kumar SKUMAR 011.44.1343.329268 2008-04-21 NULL 6100 0 148
174 Ellen Abel EABEL 011.44.1644.429267 2004-05-11 NULL 11000 0 149
175 Alyssa Hutton AHUTTON 011.44.1644.429266 2005-03-19 NULL 8800 0 149
176 Jonathon Taylor JTAYLOR 011.44.1644.429265 2006-03-24NULL 8600 0 149
177 Jack Livingston JLIVINGS 011.44.1644.429264 2006-04-23 NULL 8400 0 149
178 Kimberely Grant KGRANT 011.44.1644.429263 2007-05-24NULL 7000 0 149
179 Charles Johnson CJOHNSON 011.44.1644.429262 2008-01-04NULL 6200 0 149
180 Winston Taylor WTAYLOR 650.507.9876 2006-01-24 NULL 3200 NULL 120
181 Jean Fleaur JFLEAUR 650.507.9877 2006-02-23 NULL 3100 NULL 120
182 Martha Sullivan MSULLIVA 650.507.9878 2007-06-21NULL 2500 NULL 120
183 Girard Geoni GGEONI 650.507.9879 2008-02-03 NULL 2800 NULL 120
184 Nandita Sarchand NSARCHAN 650.509.1876 2004-01-27NULL 4200 NULL 121
185 Alexis Bull ABULL 650.509.2876 2005-02-20 NULL 4100 NULL 121
186 Julia Dellinger JDELLING 650.509.3876 2006-06-24NULL 3400 NULL 121
187 Anthony Cabrio ACABRIO 650.509.4876 2007-02-07 NULL 3000 NULL 121
188 Kelly Chung KCHUNG 650.505.1876 2005-06-14 NULL 3800 NULL 122
189 Jennifer Dilly JDILLY 650.505.2876 2005-08-13 NULL 3600 NULL 122
190 Timothy Gates TGATES 650.505.3876 2006-07-11 NULL 2900 NULL 122
191 Randall Perkins RPERKINS 650.505.4876 2007-12-19 NULL 2500 NULL 122
192 Sarah Bell SBELL 650.501.1876 2004-02-04 NULL 4000 NULL 123
193 Britney Everett BEVERETT 650.501.2876 2005-03-03 NULL 3900 NULL 123
194 Samuel McCain SMCCAIN 650.501.3876 2006-07-01 NULL 3200 NULL 123
195 Vance Jones VJONES 650.501.4876 2007-03-17 NULL 2800 NULL 123
196 Alana Walsh AWALSH 650.507.9811 2006-04-24 NULL 3100 NULL 124
197 Kevin Feeney KFEENEY 650.507.9822 2006-05-23 NULL 3000 NULL 124
198 Donald OConnell DOCONNEL 650.507.9833 2007-06-21NULL 2600 NULL 124
199 Douglas Grant DGRANT 650.507.9844 2008-01-13 NULL 2600 NULL 124
200 Jennifer Whalen JWHALEN 515.123.4444 2003-09-17 NULL 4400 NULL 101
201 Michael Hartstein MHARTSTE 515.123.5555 2004-02-17NULL 13000 NULL 100
202 Pat Fay PFAY 603.123.6666 2005-08-17 NULL 6000 NULL 201
203 Susan Mavris SMAVRIS 515.123.7777 2002-06-07 NULL 6500 NULL 101
204 Hermann Baer HBAER 515.123.8888 2002-06-07 NULL 10000 NULL 101
205 Shelley Higgins SHIGGINS 515.123.8080 2002-06-07 NULL 12008 NULL 101
206 William Gietz WGIETZ 515.123.8181 2002-06-07 NULL 8300 NULL 205
Time taken: 0.142 seconds, Fetched: 107 row(s)
# Load the local file into hive again, but append this time (added after the existing data)
hive> load data local inpath '/home/hadoop/emp.csv' into table emp;
hive> select * from emp;
Time taken: 0.18 seconds, Fetched: 214 row(s)
It now says there are 214 rows of data.
# Drop the table
hive> drop table emp;
# Create the table again (run the same create table statement as before)
hive> show tables;
# Load from Hadoop (HDFS) into Hive
load data inpath 'hdfs://centos:9010/user/emp.csv' overwrite into table emp;
hive> select * from emp;
...
Time taken: 0.183 seconds, Fetched: 107 row(s)
# Drop the table
hive> drop table emp;
<<In yet another new session>>
# Create a directory
[hadoop@centos ~]$ hdfs dfs -mkdir /user/hive/warehouse/emp
[hadoop@centos ~]$ hdfs dfs -ls -R /user/hive
[hadoop@centos ~]$ hdfs dfs -put /home/hadoop/emp.csv /user/hive/warehouse/emp
[hadoop@centos ~]$ hdfs dfs -ls -R /user/hive
<<Back in hive>>
# Read the Hadoop file's data through a Hive table that holds only the skeleton (the schema):
creating an external table
hive>
create external table if not exists emp
(empno int,
fname string,
lname string,
mail string,
phone string,
hiredate string,
job string,
sal int,
comm int,
mgr int,
deptno int)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile
location '/user/hive/warehouse/emp';
hive> show tables;
hive> select * from emp;
Time taken: 0.285 seconds, Fetched: 107 row(s)
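(Unlike the managed emp table we dropped earlier, this one is EXTERNAL: Hive keeps only the schema, and the data lives wherever location points. Dropping an external table removes just the metadata - emp.csv would stay put in /user/hive/warehouse/emp.)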
# Check the table's metadata
hive> describe formatted emp;
hive> describe formatted emp;
OK
col_name data_type comment
# col_name data_type comment
empno int
fname string
lname string
phone string
hiredate string
job string
sal int
comm int
mgr int
deptno int
# Detailed Table Information
Database: default
OwnerType: USER
Owner: hadoop
CreateTime: Mon Mar 11 16:11:16 KST 2024
LastAccessTime: UNKNOWN
Retention: 0
Location: hdfs://centos:9010/user/hive/warehouse/emp
Table Type: EXTERNAL_TABLE
Table Parameters:
EXTERNAL TRUE
bucketing_version 2
numFiles 1
totalSize 8017
transient_lastDdlTime 1710141076
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim ,
line.delim \n
serialization.format ,
Time taken: 0.17 seconds, Fetched: 40 row(s)
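# Because Table Type is EXTERNAL_TABLE, drop table only removes the metastore entry;
# emp.csv stays at hdfs://centos:9010/user/hive/warehouse/emp. That is why the table can be
# dropped and recreated over the same location later (Period 7 below) and still return 107 rows.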
hive> select * from emp where empno = 100;
hive> select * from emp where empno = 100 or empno = 200;
select * from emp where empno in (100, 200);
select * from emp where sal >= 10000;
select * from emp where sal >= 10000 and sal <= 11000;
select * from emp where sal between 10000 and 11000;
select * from emp where lname like 'K%';
select * from emp where lname like '%g';
select * from emp where lname like '_i%';
select * from emp where lname like 'K___';
select * from emp where lname like '%in%' or lname like '%un%';
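# In LIKE patterns, % matches any run of characters and _ matches exactly one character,
# so 'K___' matches four-letter lname values starting with K, as the output below shows.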
hive> select * from emp where lname like 'K___';
OK
emp.empno emp.fname emp.lname emp.phone emp.hiredate emp.job emp.sal emp.comm emp.mgr emp.deptno
100 Steven King SKING 515.123.4567 2003-06-17 NULL 24000 NULL NULL
115 Alexander Khoo AKHOO 515.127.4562 2003-05-18 NULL 3100 NULL 114
156 Janette King JKING 011.44.1345.429268 2004-01-30 NULL 10000 0 146
Time taken: 0.168 seconds, Fetched: 3 row(s)
hive> select * from emp where lname like '%in%' or lname like '%un%';
OK
emp.empno emp.fname emp.lname emp.phone emp.hiredate emp.job emp.sal emp.comm emp.mgr emp.deptno
100 Steven King SKING 515.123.4567 2003-06-17 NULL 24000 NULL NULL
103 Alexander Hunold AHUNOLD 590.423.4567 2006-01-03 NULL 9000 NULL 102
105 David Austin DAUSTIN 590.423.4569 2005-06-25 NULL 4800 NULL 103
122 Payam Kaufling PKAUFLIN 650.123.3234 2003-05-01 NULL 7900 NULL 100
126 Irene Mikkilineni IMIKKILI 650.124.1224 2006-09-28 NULL 2700 NULL 120
130 Mozhe Atkinson MATKINSO 650.124.6234 2005-10-30 NULL 2800 NULL 121
133 Jason Mallin JMALLIN 650.127.1934 2004-06-14 NULL 3300 NULL 122
151 David Bernstein DBERNSTE 011.44.1344.345268 2005-03-24 NULL 9500 0 145
156 Janette King JKING 011.44.1345.429268 2004-01-30 NULL 10000 0 146
164 Mattea Marvins MMARVINS 011.44.1346.329268 2008-01-24 NULL 7200 0 147
177 Jack Livingston JLIVINGS 011.44.1644.429264 2006-04-23 NULL 8400 0 149
186 Julia Dellinger JDELLING 650.509.3876 2006-06-24 NULL 3400 NULL 121
188 Kelly Chung KCHUNG 650.505.1876 2005-06-14 NULL 3800 NULL 122
191 Randall Perkins RPERKINS 650.505.4876 2007-12-19 NULL 2500 NULL 122
194 Samuel McCain SMCCAIN 650.501.3876 2006-07-01 NULL 3200 NULL 123
201 Michael Hartstein MHARTSTE 515.123.5555 2004-02-17 NULL 13000 NULL 100
205 Shelley Higgins SHIGGINS 515.123.8080 2002-06-07 NULL 12008 NULL 101
Time taken: 0.073 seconds, Fetched: 17 row(s)
# lname must contain either 'in' or 'un'. (The pattern parses as (.*in)|(un.*) because | binds
# loosest, but RLIKE matches substrings anyway, so the result is the same as the LIKE version.)
select * from emp where lname rlike '.*in|un.*';
hive> select * from emp where lname rlike '.*in|un.*';
OK
emp.empno emp.fname emp.lname emp.phone emp.hiredate emp.job emp.sal emp.comm emp.mgr emp.deptno
100 Steven King SKING 515.123.4567 2003-06-17 NULL 24000 NULL NULL
103 Alexander Hunold AHUNOLD 590.423.4567 2006-01-03 NULL 9000 NULL 102
105 David Austin DAUSTIN 590.423.4569 2005-06-25 NULL 4800 NULL 103
122 Payam Kaufling PKAUFLIN 650.123.3234 2003-05-01 NULL 7900 NULL 100
126 Irene Mikkilineni IMIKKILI 650.124.1224 2006-09-28 NULL 2700 NULL 120
130 Mozhe Atkinson MATKINSO 650.124.6234 2005-10-30 NULL 2800 NULL 121
133 Jason Mallin JMALLIN 650.127.1934 2004-06-14 NULL 3300 NULL 122
151 David Bernstein DBERNSTE 011.44.1344.345268 2005-03-24 NULL 9500 0 145
156 Janette King JKING 011.44.1345.429268 2004-01-30 NULL 10000 0 146
164 Mattea Marvins MMARVINS 011.44.1346.329268 2008-01-24 NULL 7200 0 147
177 Jack Livingston JLIVINGS 011.44.1644.429264 2006-04-23 NULL 8400 0 149
186 Julia Dellinger JDELLING 650.509.3876 2006-06-24 NULL 3400 NULL 121
188 Kelly Chung KCHUNG 650.505.1876 2005-06-14 NULL 3800 NULL 122
191 Randall Perkins RPERKINS 650.505.4876 2007-12-19 NULL 2500 NULL 122
194 Samuel McCain SMCCAIN 650.501.3876 2006-07-01 NULL 3200 NULL 123
201 Michael Hartstein MHARTSTE 515.123.5555 2004-02-17 NULL 13000 NULL 100
205 Shelley Higgins SHIGGINS 515.123.8080 2002-06-07 NULL 12008 NULL 101
Time taken: 0.18 seconds, Fetched: 17 row(s)
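# Since RLIKE matches when the pattern is found anywhere in the string, the leading and
# trailing .* are unnecessary; a shorter equivalent (untested sketch) would be:
select * from emp where lname rlike 'in|un';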
select * from emp where deptno is null;
select * from emp where deptno is not null;
Time taken: 0.095 seconds, Fetched: 106 row(s)
select * from emp where deptno not in (30, 40, 50, 60);
Time taken: 0.074 seconds, Fetched: 106 row(s)
select * from emp where job not like '%CLERK%';
Time taken: 0.246 seconds, Fetched: 107 row(s)
select * from emp where sal not between 10000 and 20000;
select 1+2, 4-2, 4*2, 4/2, 7%2;
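# Expected: 3, 2, 8, 2.0, 1 — division with / always returns a double in Hive;
# for integer division use div, e.g. select 4 div 2; returns 2.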
select empno, sal, comm, sal * 12 + sal * 12 * nvl(comm, 0) from emp;
Time taken: 0.199 seconds, Fetched: 107 row(s)
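# nvl(comm, 0) matters because arithmetic with NULL yields NULL; without it the annual-pay
# expression would come back NULL for every row whose comm is NULL:
select empno, sal * 12 + sal * 12 * comm from emp; -- NULL wherever comm is NULL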
select concat(empno, fname) from emp;
Time taken: 0.159 seconds, Fetched: 107 row(s)
select substr(fname, 1, 2), substr(fname, -2, 2) from emp;
Time taken: 0.183 seconds, Fetched: 107 row(s)
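# A negative start position counts from the end of the string:
# substr(fname, -2, 2) returns the last two characters of fname.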
select * from emp where lname = 'king';
select * from emp where lower(lname) = 'king';
select * from emp where upper(lname) = 'KING';
select * from emp where lcase(lname) = 'king';
select * from emp where ucase(lname) = 'KING';
# lcase/ucase are synonyms for lower/upper; comparison is case-sensitive, so lcase(lname)
# must be compared against 'king' — lcase(lname) = 'KING' would never match.
Period 7
<< Continuing in the hive session >>
hive>
drop table emp;
hive>
create external table if not exists emp
(empno int,
fname string,
lname string,
mail string,
phone string,
hiredate string,
job string,
sal int,
comm int,
mgr int,
deptno int)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile
location '/user/hive/warehouse/emp';
select phone, translate(phone, '.','-') from emp where deptno = 20;
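# translate() replaces character-for-character — every '.' in phone becomes '-',
# e.g. 515.123.5555 -> 515-123-5555.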
select distinct deptno from emp;
select length(lname) from emp;
Time taken: 6.407 seconds, Fetched: 107 row(s)
select round(45.926, 2), round(45.926), round(45.926, -1), round(55.926, -2);
select trunc(45.926, 2), trunc(45.926), trunc(45.926, -1), trunc(55.926, -2);
select ceil(10.0001), floor(10.0001);
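# Expected results, as a quick check (numeric trunc needs a reasonably recent Hive):
# round(45.926, 2)=45.93  round(45.926)=46  round(45.926, -1)=50  round(55.926, -2)=100
# trunc cuts off without rounding: trunc(45.926, 2)=45.92  trunc(45.926)=45  trunc(45.926, -1)=40  trunc(55.926, -2)=0
# ceil(10.0001)=11  floor(10.0001)=10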
select * from emp where hiredate like '2003%';
select * from emp where hiredate between to_date('2003-01-01') and to_date('2003-12-31');
select hiredate, date_format(hiredate, 'MM-dd-yyyy') from emp;
Time taken: 0.365 seconds, Fetched: 107 row(s)
select current_date;
select current_timestamp;
select current_date, month(current_date);
select current_date, day(current_date);
select hiredate, year(hiredate) from emp;
Time taken: 0.546 seconds, Fetched: 107 row(s)
select current_timestamp, hour(current_timestamp);
select current_timestamp, minute(current_timestamp);
select current_timestamp, second(current_timestamp);
select current_timestamp, weekofyear(current_timestamp);
select datediff(current_date, '2023-10-05');
select date_add(current_date, 100);
select date_sub(current_date, 100);
select add_months(current_date, 12);
select add_months(current_date, 12, 'yyyyMMdd');
select last_day(current_date);
select next_day(current_date, 'FRIDAY');
select next_day(current_date, 'FRI');
select months_between('2023-10-05', current_date);
select extract(hour from current_timestamp), extract(minute from current_timestamp), extract(second from current_timestamp);
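# Taking the session date above (2024-03-11, a Monday) as current_date, these return:
# datediff('2024-03-11', '2023-10-05') = 158
# date_add('2024-03-11', 100) = 2024-06-19, date_sub('2024-03-11', 100) = 2023-12-02
# add_months('2024-03-11', 12) = 2025-03-11, last_day('2024-03-11') = 2024-03-31
# next_day('2024-03-11', 'FRI') = 2024-03-15
# months_between comes out negative when the first date is earlier than the second.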
select '100'+100;
select cast('100' as int) + 100;
select cast('100' as double) + 100;
select cast(100 as string);
select cast(100.01 as float);
select cast('true' as boolean);
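# '100'+100 returns 200.0 — Hive promotes the string to double for arithmetic — while
# cast('100' as int) + 100 returns 200. A cast that cannot be parsed, e.g. cast('abc' as int),
# returns NULL rather than raising an error.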
select count(*), count(deptno) from emp;
select sum(sal), avg(sal), max(sal), min(sal), stddev_pop(sal), variance(sal) from emp;
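# count(*) counts every row (107), while count(deptno) skips NULLs (106, matching the
# "deptno is not null" result above). stddev_pop and variance are population statistics.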
select deptno, sum(sal)
from emp
group by deptno
having sum(sal) > 10000;
select deptno, sum(sal) sum_sal
from emp
group by deptno
having sum(sal) > 10000
order by sum_sal desc;
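# where filters rows before grouping, having filters the groups after aggregation;
# a combined sketch (the where clause is just an illustrative filter):
select deptno, sum(sal) sum_sal
from emp
where comm is null
group by deptno
having sum(sal) > 10000
order by sum_sal desc;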
Write JOINs in ANSI-standard syntax only.
Joins and subqueries otherwise work just the same as before.
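# A minimal ANSI-join sketch — this assumes a dept table (deptno, dname) that was not created above:
select e.empno, e.fname, d.dname
from emp e join dept d on e.deptno = d.deptno;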