前言摘要:hive,hql,sql,大数据,学习
针对有sql基础,刚熟悉linux界面,在此进行hive的入门学习分享,文章结合网络资料并加以细化步骤。
本文步骤参照:
https://www.cnblogs.com/qingyunzong/p/8708057.html
有关语句的讲解命令参考:
https://www.cnblogs.com/sqdmydxf/p/7771538.html
基本使用:
现有student.txt的表,表结构如下:
id,name,sex,age,department (该行为表头)
95002,刘晨,女,19,IS
95017,王风娟,女,18,IS
95018,王一,女,19,IS
95013,冯伟,男,21,CS
95014,王小丽,女,19,CS
95019,邢小丽,女,19,IS
95020,赵钱,男,21,IS
95003,王敏,女,22,MA
95004,张立,男,19,IS
95012,孙花,女,20,CS
95010,孔小涛,男,19,CS
95005,刘刚,男,18,MA
95006,孙庆,男,23,CS
95007,易思玲,女,19,MA
95008,李娜,女,18,CS
95021,周二,男,17,MA
95022,郑明,男,20,MA
95001,李勇,男,20,CS
95011,包小柏,男,18,MA
95009,梦圆圆,女,18,MA
95015,王君,男,18,MA
1、在home目录中导入student.txt
#cd /home
#mkdir jiafeng
cd /home/jiafeng #进入Home中jiafeng文件
vi student.txt
#创建student.txt文件,并复制数据,在编辑模式按a进入当前光标后输入,复制粘贴数据后,进入末行模式:wq保存退出
2、创建一个数据库myhive
#首先输入hive进入
hive> create database my_hjf_hive;
OK
Time taken: 7.847 seconds
hive>
3、使用新的数据库
hive> use my_hjf_hive;
OK
Time taken: 0.047 seconds
hive>
4、查看现有数据库
hive>show databases;
OK
my_hjf_hive
Time taken: 2.838 seconds, Fetched: 1 row(s)
5、查看当前正在使用的数据库
hive> select current_database();
OK
my_hjf_hive
Time taken: 0.728 seconds, Fetched: 1 row(s)
hive>
6、在数据库myhive创建一张student表
hive> create table student(id int, name string, sex string, age int, department string) row format delimited fields terminated by ",";
OK
Time taken: 0.718 seconds
hive>
7、往表中加载数据
hive> load data local inpath "/home/jiafeng/student.txt" into table student;
Loading data to table my_hjf_hive.student
OK
Time taken: 1.854 seconds
hive>
8、查询数据
hive> select * from student;
OK
student.id student.name student.sex student.age student.department
95017 王风娟 女 18 IS
95018 王一 女 19 IS
95013 冯伟 男 21 CS
95014 王小丽 女 19 CS
95019 邢小丽 女 19 IS
95020 赵钱 男 21 IS
95003 王敏 女 22 MA
95004 张立 男 19 IS
95012 孙花 女 20 CS
95010 孔小涛 男 19 CS
95005 刘刚 男 18 MA
95006 孙庆 男 23 CS
95007 易思玲 女 19 MA
95008 李娜 女 18 CS
95021 周二 男 17 MA
95022 郑明 男 20 MA
95001 李勇 男 20 CS
95011 包小柏 男 18 MA
95009 梦圆圆 女 18 MA
95015 王君 男 18 MA
Time taken: 3.341 seconds, Fetched: 20 row(s)
9、查看表结构
hive> desc student;
OK
id int
name string
sex string
age int
department string
Time taken: 0.102 seconds, Fetched: 5 row(s)
hive>
hive> desc extended student;
OK
col_name data_type comment
id int
name string
sex string
age int
department string
Detailed Table Information Table(tableName:student, dbName:my_hjf_hive, owner:root, createTime:1573527088, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null), FieldSchema(name:name, type:string, comment:null), FieldSchema(name:sex, type:string, comment:null), FieldSchema(name:age, type:int, comment:null), FieldSchema(name:department, type:string, comment:null)], location:hdfs://localhost:9000/user/hive/warehouse/my_hjf_hive.db/student, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=,, field.delim=,}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{transient_lastDdlTime=1573527261, totalSize=481, numRows=0, rawDataSize=0, numFiles=1}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, rewriteEnabled:false)
Time taken: 0.132 seconds, Fetched: 7 row(s)
hive> desc formatted student;
OK
col_name data_type comment
# col_name data_type comment
id int
name string
sex string
age int
department string
# Detailed Table Information
Database: my_hjf_hive
Owner: root
CreateTime: Tue Nov 12 10:51:28 CST 2019
LastAccessTime: UNKNOWN
Retention: 0
Location: hdfs://localhost:9000/user/hive/warehouse/my_hjf_hive.db/student
Table Type: MANAGED_TABLE
Table Parameters:
numFiles 1
numRows 0
rawDataSize 0
totalSize 481
transient_lastDdlTime 1573527261
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim ,
serialization.format ,
Time taken: 0.23 seconds, Fetched: 34 row(s)
回到hive专题:
https://www.top8488.top/category/hive