[Hadoop reading notes] Chapter 15: a small Sqoop 1.4.6 experiment - importing MySQL data into Hive
- grant all on hive_metastore.* to 'root'@'%' IDENTIFIED BY 'weidong' with grant option;
- flush privileges;
- cp hive-log4j2.properties.template hive-log4j2.properties
- cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
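The two cp commands above can be generalised into a small helper. This is a minimal sketch, not part of the original notes: the function name activate_log4j2_templates is hypothetical, and it assumes the templates sit in a single directory (typically Hive's conf directory). It also picks up any other file in that directory whose name ends in log4j2.properties.template.

```shell
# Hypothetical helper (not from the original notes): copy each
# *log4j2.properties.template file to the same name without ".template".
# Existing target files are left untouched so local edits survive.
activate_log4j2_templates() {
  local conf_dir="$1"
  local tpl target
  for tpl in "$conf_dir"/*log4j2.properties.template; do
    [ -e "$tpl" ] || continue   # the glob matched nothing
    target="${tpl%.template}"
    if [ -e "$target" ]; then
      echo "skip: $target already exists"
    else
      cp "$tpl" "$target"
      echo "created: $target"
    fi
  done
}
```

It would be called as activate_log4j2_templates "$HIVE_HOME/conf", assuming HIVE_HOME points at the Hive installation. Skipping existing files means the helper can be re-run without overwriting a customised logging configuration.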
8. Add the MySQL connector JAR
- hive --service metastore &
Debug-mode command: hive -hiveconf hive.root.logger=DEBUG,console
- sqoop import --connect jdbc:mysql:// --table widgets_copy -m 1 --hive-import --username root -P
- [hadoop@hadoop-allinone-- conf]$ sqoop import --connect jdbc:mysql:// --table widgets_copy -m 1 --hive-import --username root -P
- // :: INFO sqoop.Sqoop: Running Sqoop version: 1.4.
- Enter password:
- // :: INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
- // :: INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
- // :: INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- // :: INFO tool.CodeGenTool: Beginning code generation
- // :: INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `widgets_copy` AS t LIMIT
- // :: INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `widgets_copy` AS t LIMIT
- // :: INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /wdcloud/app/hadoop-2.7.
- Note: /tmp/sqoop-hadoop/compile/4a89a67225918969c1c0f4c7c13168e9/widgets_copy.java uses or overrides a deprecated API.
- Note: Recompile with -Xlint:deprecation for details.
- // :: INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/4a89a67225918969c1c0f4c7c13168e9/widgets_copy.jar
- // :: WARN manager.MySQLManager: It looks like you are importing from mysql.
- // :: WARN manager.MySQLManager: This transfer can be faster! Use the --direct
- // :: WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
- // :: INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
- // :: INFO mapreduce.ImportJobBase: Beginning import of widgets_copy
- // :: INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/wdcloud/app/hadoop-2.7./share/hadoop/common/lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/wdcloud/app/hbase-1.1./lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
- // :: INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
- // :: INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
- // :: INFO client.RMProxy: Connecting to ResourceManager at hadoop-allinone-200-123.wdcloud.locl/
- // :: INFO db.DBInputFormat: Using read commited transaction isolation
- // :: INFO mapreduce.JobSubmitter: number of splits:
- // :: INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1485230213604_0011
- // :: INFO impl.YarnClientImpl: Submitted application application_1485230213604_0011
- // :: INFO mapreduce.Job: The url to track the job: http://hadoop-allinone-200-123.wdcloud.locl:8088/proxy/application_1485230213604_0011/
- // :: INFO mapreduce.Job: Running job: job_1485230213604_0011
- // :: INFO mapreduce.Job: Job job_1485230213604_0011 running in uber mode : false
- // :: INFO mapreduce.Job: map % reduce %
- // :: INFO mapreduce.Job: map % reduce %
- // :: INFO mapreduce.Job: Job job_1485230213604_0011 completed successfully
- // :: INFO mapreduce.Job: Counters:
- File System Counters
- FILE: Number of bytes read=
- FILE: Number of bytes written=
- FILE: Number of read operations=
- FILE: Number of large read operations=
- FILE: Number of write operations=
- HDFS: Number of bytes read=
- HDFS: Number of bytes written=
- HDFS: Number of read operations=
- HDFS: Number of large read operations=
- HDFS: Number of write operations=
- Job Counters
- Launched map tasks=
- Other local map tasks=
- Total time spent by all maps in occupied slots (ms)=
- Total time spent by all reduces in occupied slots (ms)=
- Total time spent by all map tasks (ms)=
- Total vcore-milliseconds taken by all map tasks=
- Total megabyte-milliseconds taken by all map tasks=
- Map-Reduce Framework
- Map input records=
- Map output records=
- Input split bytes=
- Spilled Records=
- Failed Shuffles=
- Merged Map outputs=
- GC time elapsed (ms)=
- CPU time spent (ms)=
- Physical memory (bytes) snapshot=
- Virtual memory (bytes) snapshot=
- Total committed heap usage (bytes)=
- File Input Format Counters
- Bytes Read=
- File Output Format Counters
- Bytes Written=
- // :: INFO mapreduce.ImportJobBase: Transferred 169 bytes in 31.7543 seconds (5.3221 bytes/sec)
- // :: INFO mapreduce.ImportJobBase: Retrieved 4 records.
- // :: INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `widgets_copy` AS t LIMIT
- // :: WARN hive.TableDefWriter: Column price had to be cast to a less precise type in Hive
- // :: WARN hive.TableDefWriter: Column design_date had to be cast to a less precise type in Hive
- // :: INFO hive.HiveImport: Loading uploaded data into Hive (the data generated on HDFS is loaded into Hive)
- // :: INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
- // :: INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/wdcloud/app/hive-2.1./lib/log4j-slf4j-impl-2.4..jar!/org/slf4j/impl/StaticLoggerBinder.class]
- // :: INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/wdcloud/app/hbase-1.1./lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
- // :: INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/wdcloud/app/hadoop-2.7./share/hadoop/common/lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
- // :: INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- // :: INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
- // :: INFO hive.HiveImport:
- // :: INFO hive.HiveImport: Logging initialized using configuration in file:/wdcloud/app/hive-2.1./conf/hive-log4j2.properties Async: true
- // :: INFO hive.HiveImport: OK
- // :: INFO hive.HiveImport: Time taken: 3.687 seconds
- // :: INFO hive.HiveImport: Loading data to table default.widgets_copy
- // :: INFO hive.HiveImport: OK
- // :: INFO hive.HiveImport: Time taken: 1.92 seconds
- // :: INFO hive.HiveImport: Hive import complete.
- // :: INFO hive.HiveImport: Export directory is contains the _SUCCESS file only, removing the directory. (once the load into Hive succeeds, the intermediate data on HDFS is deleted)
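When the import runs from a script rather than interactively, the record count reported in the log above ("Retrieved 4 records.") is a convenient value to check. A minimal sketch, assuming the Sqoop console output has been saved to a file; the function name retrieved_records and the log file are hypothetical:

```shell
# Hypothetical helper: print the number from the last
# "Retrieved N records." line in a saved Sqoop console log.
# Prints nothing if the log contains no such line.
retrieved_records() {
  local log_file="$1"
  sed -n 's/.*Retrieved \([0-9][0-9]*\) records\..*/\1/p' "$log_file" | tail -n 1
}
```

The printed number can then be compared with the row count of the source table in MySQL to confirm that nothing was dropped.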
- ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory xxx already exists
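If this error appears, run hadoop fs -rmr xxx to delete the existing output directory (on Hadoop 2.x, -rmr is deprecated and hadoop fs -rm -r xxx is the equivalent form).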
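The cleanup can be made conditional so that it is safe to run before every import. A minimal sketch, assuming the leftover directory is the xxx path from the error message; the function name clean_staging_dir and the HADOOP_BIN override are hypothetical, while hadoop fs -test -d and hadoop fs -rm -r are standard HDFS shell commands:

```shell
# Hypothetical helper: remove a leftover Sqoop output directory on HDFS,
# but only if it actually exists.
# HADOOP_BIN (an assumption of this sketch) lets the hadoop command be
# swapped out, for example to try the function without a cluster.
clean_staging_dir() {
  local dir="$1"
  local hadoop="${HADOOP_BIN:-hadoop}"
  if "$hadoop" fs -test -d "$dir"; then
    echo "removing existing HDFS directory: $dir"
    "$hadoop" fs -rm -r "$dir"
  else
    echo "nothing to clean: $dir"
  fi
}
```

Calling clean_staging_dir with the path named in the FileAlreadyExistsException message before sqoop import avoids the error on a re-run.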
--hive-overwrite : Overwrite existing data in the Hive table
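To re-run the import and replace the rows already in the Hive table, --hive-overwrite is added to the earlier command. The sketch below only prints the command line rather than running it; the function name print_sqoop_import, the OVERWRITE switch, and the example JDBC URL are hypothetical (the original URL is not shown in these notes):

```shell
# Hypothetical helper: print (do not run) the sqoop import command used in
# these notes. Set OVERWRITE=1 to append --hive-overwrite.
print_sqoop_import() {
  local jdbc_url="$1" table="$2" user="$3"
  local cmd
  cmd="sqoop import --connect '$jdbc_url' --table $table -m 1 --hive-import --username $user -P"
  if [ "${OVERWRITE:-0}" = "1" ]; then
    cmd="$cmd --hive-overwrite"
  fi
  printf '%s\n' "$cmd"
}

# Example with a placeholder URL (not the one from the original experiment):
OVERWRITE=1 print_sqoop_import 'jdbc:mysql://db.example.com/testdb' widgets_copy root
```

Printing the command first makes it easy to review the flags before pasting it into a terminal, where -P still prompts for the password interactively.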
Further reading: A detailed guide to using import and export in Sqoop 1.4.4
The corresponding hive-site.xml settings:
- <configuration>
- <property>
- <name>javax.jdo.option.ConnectionURL</name>
- <value>jdbc:mysql://</value>
- </property>
- <property>
- <name>javax.jdo.option.ConnectionDriverName</name>
- <value>com.mysql.jdbc.Driver</value>
- </property>
- <property>
- <name>javax.jdo.option.ConnectionUserName</name>
- <value>root</value>
- </property>
- <property>
- <name>javax.jdo.option.ConnectionPassword</name>
- <value>weidong</value>
- </property>
- <property>
- <name>datanucleus.schema.autoCreateTables</name>
- <value>true</value>
- </property>
- <property>
- <name>hive.metastore.warehouse.dir</name>
- <value>/hive/warehouse</value>
- </property>
- <property>
- <name>hive.exec.scratchdir</name>
- <value>/hive/warehouse</value>
- </property>
- <property>
- <name>hive.querylog.location</name>
- <value>/wdcloud/app/hive-2.1./logs</value>
- </property>
- <property>
- <name>hive.aux.jars.path</name>
- <value>/wdcloud/app/hbase-1.1./lib</value>
- </property>
- <property>
- <name>hive.metastore.uris</name>
- <value>thrift://</value>
- </property>
- <property>
- <name>hive.metastore.schema.verification</name>
- <value>false</value>
- </property>
- </configuration>
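To confirm which value a given property actually has in the file, the XML can be queried from the shell. A minimal sketch, assuming each name and value element sits on its own line as in the listing above; the function name get_hive_property is hypothetical, and this is a line-based match rather than a real XML parser:

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop-style XML configuration file.
# Assumes <name> and <value> are each on their own line.
get_hive_property() {
  local file="$1" name="$2"
  awk -v n="$name" '
    index($0, "<name>" n "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")      # drop everything up to the opening tag
      sub(/<\/value>.*/, "")    # drop the closing tag and anything after
      print
      exit
    }
  ' "$file"
}
```

With the settings listed above, get_hive_property "$HIVE_HOME/conf/hive-site.xml" hive.metastore.warehouse.dir should print /hive/warehouse, assuming HIVE_HOME points at the Hive installation. An unknown property name prints nothing.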
