Hadoop基础-Apache Avro串行化的与反串行化
Hadoop基础-Apache Avro串行化的与反串行化
作者:尹正杰
版权声明:原创作品,谢绝转载!否则将追究法律责任。
一.Apache Avro简介
1>.Apache Avro的来源
Apache Avro 是一个中立性语言,它是有Hadoop之父Doug Cutting开发而来。因为hadoop的Writerable的串行化只支持Java语言,即非跨语言。所以Doug Cutting开发了Avro ,它是一个语言独立的数据结构,也就是说它是跨语言的。
2>.Avro特点
Apache Avro™是一个数据序列化系统。它有以下特点:
第一:丰富的数据结构。
第二:紧凑,快速的二进制数据格式。
第三:一个容器文件,用于存储持久性数据。
第四:远程过程调用(RPC)。
第五:与动态语言的简单集成。读取或写入数据文件不需要代码生成,也不需要使用或实现RPC协议。代码生成是一种可选的优化,只有静态类型语言才值得实现。
想要了解更多Avro,可以参考Apache官网(http://avro.apache.org/docs/1.8.2/),我就懒得搬运了,直接上干货,本篇博客介绍了两种Avro序列化与反序列化的方式。
3>.安装Avro
下载Avro:http://mirror.bit.edu.cn/apache/avro/avro-1.8.2/
二.Avro环境准备
其实部署Avro的流程大致可以用下图来表示,配置起来相对来说还是蛮简单的,具体操作如下:
1>.创建emp.avsc文件,内容如下
{
"namespace": "tutorialspoint.com",
"type": "record",
"name": "Emp",
"fields": [
{"name": "name", "type": "string"},
{"name": "id", "type": "int"},
{"name": "salary", "type": "int"},
{"name": "age", "type": "int"},
{"name": "address", "type": "string"}
]
}
2>.将下载的avro-1.8.2.jar和avro-tools-1.8.2.jar文件放在emp.avsc同级目录
3>.编译schema文件
4>.查看文件内容
/**
* Autogenerated by Avro
*
* DO NOT EDIT DIRECTLY
*/
package tutorialspoint.com; import org.apache.avro.specific.SpecificData;
import org.apache.avro.message.BinaryMessageEncoder;
import org.apache.avro.message.BinaryMessageDecoder;
import org.apache.avro.message.SchemaStore; @SuppressWarnings("all")
@org.apache.avro.specific.AvroGenerated
public class Emp extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
private static final long serialVersionUID = 6405205887550658768L;
public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"Emp\",\"namespace\":\"tutorialspoint.com\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"id\",\"type\":\"int\"},{\"name\":\"salary\",\"type\":\"int\"},{\"name\":\"age\",\"type\":\"int\"},{\"name\":\"address\",\"type\":\"string\"}]}");
public static org.apache.avro.Schema getClassSchema() { return SCHEMA$; } private static SpecificData MODEL$ = new SpecificData(); private static final BinaryMessageEncoder<Emp> ENCODER =
new BinaryMessageEncoder<Emp>(MODEL$, SCHEMA$); private static final BinaryMessageDecoder<Emp> DECODER =
new BinaryMessageDecoder<Emp>(MODEL$, SCHEMA$); /**
* Return the BinaryMessageDecoder instance used by this class.
*/
public static BinaryMessageDecoder<Emp> getDecoder() {
return DECODER;
} /**
* Create a new BinaryMessageDecoder instance for this class that uses the specified {@link SchemaStore}.
* @param resolver a {@link SchemaStore} used to find schemas by fingerprint
*/
public static BinaryMessageDecoder<Emp> createDecoder(SchemaStore resolver) {
return new BinaryMessageDecoder<Emp>(MODEL$, SCHEMA$, resolver);
} /** Serializes this Emp to a ByteBuffer. */
public java.nio.ByteBuffer toByteBuffer() throws java.io.IOException {
return ENCODER.encode(this);
} /** Deserializes a Emp from a ByteBuffer. */
public static Emp fromByteBuffer(
java.nio.ByteBuffer b) throws java.io.IOException {
return DECODER.decode(b);
} @Deprecated public java.lang.CharSequence name;
@Deprecated public int id;
@Deprecated public int salary;
@Deprecated public int age;
@Deprecated public java.lang.CharSequence address; /**
* Default constructor. Note that this does not initialize fields
* to their default values from the schema. If that is desired then
* one should use <code>newBuilder()</code>.
*/
public Emp() {} /**
* All-args constructor.
* @param name The new value for name
* @param id The new value for id
* @param salary The new value for salary
* @param age The new value for age
* @param address The new value for address
*/
public Emp(java.lang.CharSequence name, java.lang.Integer id, java.lang.Integer salary, java.lang.Integer age, java.lang.CharSequence address) {
this.name = name;
this.id = id;
this.salary = salary;
this.age = age;
this.address = address;
} public org.apache.avro.Schema getSchema() { return SCHEMA$; }
// Used by DatumWriter. Applications should not call.
public java.lang.Object get(int field$) {
switch (field$) {
case 0: return name;
case 1: return id;
case 2: return salary;
case 3: return age;
case 4: return address;
default: throw new org.apache.avro.AvroRuntimeException("Bad index");
}
} // Used by DatumReader. Applications should not call.
@SuppressWarnings(value="unchecked")
public void put(int field$, java.lang.Object value$) {
switch (field$) {
case 0: name = (java.lang.CharSequence)value$; break;
case 1: id = (java.lang.Integer)value$; break;
case 2: salary = (java.lang.Integer)value$; break;
case 3: age = (java.lang.Integer)value$; break;
case 4: address = (java.lang.CharSequence)value$; break;
default: throw new org.apache.avro.AvroRuntimeException("Bad index");
}
} /**
* Gets the value of the 'name' field.
* @return The value of the 'name' field.
*/
public java.lang.CharSequence getName() {
return name;
} /**
* Sets the value of the 'name' field.
* @param value the value to set.
*/
public void setName(java.lang.CharSequence value) {
this.name = value;
} /**
* Gets the value of the 'id' field.
* @return The value of the 'id' field.
*/
public java.lang.Integer getId() {
return id;
} /**
* Sets the value of the 'id' field.
* @param value the value to set.
*/
public void setId(java.lang.Integer value) {
this.id = value;
} /**
* Gets the value of the 'salary' field.
* @return The value of the 'salary' field.
*/
public java.lang.Integer getSalary() {
return salary;
} /**
* Sets the value of the 'salary' field.
* @param value the value to set.
*/
public void setSalary(java.lang.Integer value) {
this.salary = value;
} /**
* Gets the value of the 'age' field.
* @return The value of the 'age' field.
*/
public java.lang.Integer getAge() {
return age;
} /**
* Sets the value of the 'age' field.
* @param value the value to set.
*/
public void setAge(java.lang.Integer value) {
this.age = value;
} /**
* Gets the value of the 'address' field.
* @return The value of the 'address' field.
*/
public java.lang.CharSequence getAddress() {
return address;
} /**
* Sets the value of the 'address' field.
* @param value the value to set.
*/
public void setAddress(java.lang.CharSequence value) {
this.address = value;
} /**
* Creates a new Emp RecordBuilder.
* @return A new Emp RecordBuilder
*/
public static tutorialspoint.com.Emp.Builder newBuilder() {
return new tutorialspoint.com.Emp.Builder();
} /**
* Creates a new Emp RecordBuilder by copying an existing Builder.
* @param other The existing builder to copy.
* @return A new Emp RecordBuilder
*/
public static tutorialspoint.com.Emp.Builder newBuilder(tutorialspoint.com.Emp.Builder other) {
return new tutorialspoint.com.Emp.Builder(other);
} /**
* Creates a new Emp RecordBuilder by copying an existing Emp instance.
* @param other The existing instance to copy.
* @return A new Emp RecordBuilder
*/
public static tutorialspoint.com.Emp.Builder newBuilder(tutorialspoint.com.Emp other) {
return new tutorialspoint.com.Emp.Builder(other);
} /**
* RecordBuilder for Emp instances.
*/
public static class Builder extends org.apache.avro.specific.SpecificRecordBuilderBase<Emp>
implements org.apache.avro.data.RecordBuilder<Emp> { private java.lang.CharSequence name;
private int id;
private int salary;
private int age;
private java.lang.CharSequence address; /** Creates a new Builder */
private Builder() {
super(SCHEMA$);
} /**
* Creates a Builder by copying an existing Builder.
* @param other The existing Builder to copy.
*/
private Builder(tutorialspoint.com.Emp.Builder other) {
super(other);
if (isValidValue(fields()[0], other.name)) {
this.name = data().deepCopy(fields()[0].schema(), other.name);
fieldSetFlags()[0] = true;
}
if (isValidValue(fields()[1], other.id)) {
this.id = data().deepCopy(fields()[1].schema(), other.id);
fieldSetFlags()[1] = true;
}
if (isValidValue(fields()[2], other.salary)) {
this.salary = data().deepCopy(fields()[2].schema(), other.salary);
fieldSetFlags()[2] = true;
}
if (isValidValue(fields()[3], other.age)) {
this.age = data().deepCopy(fields()[3].schema(), other.age);
fieldSetFlags()[3] = true;
}
if (isValidValue(fields()[4], other.address)) {
this.address = data().deepCopy(fields()[4].schema(), other.address);
fieldSetFlags()[4] = true;
}
} /**
* Creates a Builder by copying an existing Emp instance
* @param other The existing instance to copy.
*/
private Builder(tutorialspoint.com.Emp other) {
super(SCHEMA$);
if (isValidValue(fields()[0], other.name)) {
this.name = data().deepCopy(fields()[0].schema(), other.name);
fieldSetFlags()[0] = true;
}
if (isValidValue(fields()[1], other.id)) {
this.id = data().deepCopy(fields()[1].schema(), other.id);
fieldSetFlags()[1] = true;
}
if (isValidValue(fields()[2], other.salary)) {
this.salary = data().deepCopy(fields()[2].schema(), other.salary);
fieldSetFlags()[2] = true;
}
if (isValidValue(fields()[3], other.age)) {
this.age = data().deepCopy(fields()[3].schema(), other.age);
fieldSetFlags()[3] = true;
}
if (isValidValue(fields()[4], other.address)) {
this.address = data().deepCopy(fields()[4].schema(), other.address);
fieldSetFlags()[4] = true;
}
} /**
* Gets the value of the 'name' field.
* @return The value.
*/
public java.lang.CharSequence getName() {
return name;
} /**
* Sets the value of the 'name' field.
* @param value The value of 'name'.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder setName(java.lang.CharSequence value) {
validate(fields()[0], value);
this.name = value;
fieldSetFlags()[0] = true;
return this;
} /**
* Checks whether the 'name' field has been set.
* @return True if the 'name' field has been set, false otherwise.
*/
public boolean hasName() {
return fieldSetFlags()[0];
} /**
* Clears the value of the 'name' field.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder clearName() {
name = null;
fieldSetFlags()[0] = false;
return this;
} /**
* Gets the value of the 'id' field.
* @return The value.
*/
public java.lang.Integer getId() {
return id;
} /**
* Sets the value of the 'id' field.
* @param value The value of 'id'.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder setId(int value) {
validate(fields()[1], value);
this.id = value;
fieldSetFlags()[1] = true;
return this;
} /**
* Checks whether the 'id' field has been set.
* @return True if the 'id' field has been set, false otherwise.
*/
public boolean hasId() {
return fieldSetFlags()[1];
} /**
* Clears the value of the 'id' field.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder clearId() {
fieldSetFlags()[1] = false;
return this;
} /**
* Gets the value of the 'salary' field.
* @return The value.
*/
public java.lang.Integer getSalary() {
return salary;
} /**
* Sets the value of the 'salary' field.
* @param value The value of 'salary'.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder setSalary(int value) {
validate(fields()[2], value);
this.salary = value;
fieldSetFlags()[2] = true;
return this;
} /**
* Checks whether the 'salary' field has been set.
* @return True if the 'salary' field has been set, false otherwise.
*/
public boolean hasSalary() {
return fieldSetFlags()[2];
} /**
* Clears the value of the 'salary' field.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder clearSalary() {
fieldSetFlags()[2] = false;
return this;
} /**
* Gets the value of the 'age' field.
* @return The value.
*/
public java.lang.Integer getAge() {
return age;
} /**
* Sets the value of the 'age' field.
* @param value The value of 'age'.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder setAge(int value) {
validate(fields()[3], value);
this.age = value;
fieldSetFlags()[3] = true;
return this;
} /**
* Checks whether the 'age' field has been set.
* @return True if the 'age' field has been set, false otherwise.
*/
public boolean hasAge() {
return fieldSetFlags()[3];
} /**
* Clears the value of the 'age' field.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder clearAge() {
fieldSetFlags()[3] = false;
return this;
} /**
* Gets the value of the 'address' field.
* @return The value.
*/
public java.lang.CharSequence getAddress() {
return address;
} /**
* Sets the value of the 'address' field.
* @param value The value of 'address'.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder setAddress(java.lang.CharSequence value) {
validate(fields()[4], value);
this.address = value;
fieldSetFlags()[4] = true;
return this;
} /**
* Checks whether the 'address' field has been set.
* @return True if the 'address' field has been set, false otherwise.
*/
public boolean hasAddress() {
return fieldSetFlags()[4];
} /**
* Clears the value of the 'address' field.
* @return This builder.
*/
public tutorialspoint.com.Emp.Builder clearAddress() {
address = null;
fieldSetFlags()[4] = false;
return this;
} @Override
@SuppressWarnings("unchecked")
public Emp build() {
try {
Emp record = new Emp();
record.name = fieldSetFlags()[0] ? this.name : (java.lang.CharSequence) defaultValue(fields()[0]);
record.id = fieldSetFlags()[1] ? this.id : (java.lang.Integer) defaultValue(fields()[1]);
record.salary = fieldSetFlags()[2] ? this.salary : (java.lang.Integer) defaultValue(fields()[2]);
record.age = fieldSetFlags()[3] ? this.age : (java.lang.Integer) defaultValue(fields()[3]);
record.address = fieldSetFlags()[4] ? this.address : (java.lang.CharSequence) defaultValue(fields()[4]);
return record;
} catch (java.lang.Exception e) {
throw new org.apache.avro.AvroRuntimeException(e);
}
}
} @SuppressWarnings("unchecked")
private static final org.apache.avro.io.DatumWriter<Emp>
WRITER$ = (org.apache.avro.io.DatumWriter<Emp>)MODEL$.createDatumWriter(SCHEMA$); @Override public void writeExternal(java.io.ObjectOutput out)
throws java.io.IOException {
WRITER$.write(this, SpecificData.getEncoder(out));
} @SuppressWarnings("unchecked")
private static final org.apache.avro.io.DatumReader<Emp>
READER$ = (org.apache.avro.io.DatumReader<Emp>)MODEL$.createDatumReader(SCHEMA$); @Override public void readExternal(java.io.ObjectInput in)
throws java.io.IOException {
READER$.read(this, SpecificData.getDecoder(in));
} }
Emp.java 文件内容
5>.将此文件加载到idea
第一步:编辑pom.xml配置文件
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion> <groupId>com.yinzhengjie</groupId>
<artifactId>myAvro-pb</artifactId>
<version>1.0-SNAPSHOT</version> <dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.8.2</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-tools</artifactId>
<version>1.8.2</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
</dependency>
</dependencies>
</project>
pom.xml 文件内容
第二步:新建包,名称tutorialspoint.com,并将Emp.java文件复制到包内,如有错误不建议直接修改。
/**
* Autogenerated by Avro
*
* DO NOT EDIT DIRECTLY
*/
package tutorialspoint.com; import org.apache.avro.specific.SpecificData;
import org.apache.avro.message.BinaryMessageEncoder;
import org.apache.avro.message.BinaryMessageDecoder;
import org.apache.avro.message.SchemaStore; @SuppressWarnings("all")
@org.apache.avro.specific.AvroGenerated
public class Emp extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
private static final long serialVersionUID = 6405205887550658768L;
public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"Emp\",\"namespace\":\"tutorialspoint.com\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"id\",\"type\":\"int\"},{\"name\":\"salary\",\"type\":\"int\"},{\"name\":\"age\",\"type\":\"int\"},{\"name\":\"address\",\"type\":\"string\"}]}");
public static org.apache.avro.Schema getClassSchema() { return SCHEMA$; } private static SpecificData MODEL$ = new SpecificData(); private static final BinaryMessageEncoder<Emp> ENCODER =
new BinaryMessageEncoder<Emp>(MODEL$, SCHEMA$); private static final BinaryMessageDecoder<Emp> DECODER =
new BinaryMessageDecoder<Emp>(MODEL$, SCHEMA$); /**
* Return the BinaryMessageDecoder instance used by this class.
*/
public static BinaryMessageDecoder<Emp> getDecoder() {
return DECODER;
} /**
* Create a new BinaryMessageDecoder instance for this class that uses the specified {@link SchemaStore}.
* @param resolver a {@link SchemaStore} used to find schemas by fingerprint
*/
public static BinaryMessageDecoder<Emp> createDecoder(SchemaStore resolver) {
return new BinaryMessageDecoder<Emp>(MODEL$, SCHEMA$, resolver);
} /** Serializes this Emp to a ByteBuffer. */
public java.nio.ByteBuffer toByteBuffer() throws java.io.IOException {
return ENCODER.encode(this);
} /** Deserializes a Emp from a ByteBuffer. */
public static Emp fromByteBuffer(
java.nio.ByteBuffer b) throws java.io.IOException {
return DECODER.decode(b);
} @Deprecated public CharSequence name;
@Deprecated public int id;
@Deprecated public int salary;
@Deprecated public int age;
@Deprecated public CharSequence address; /**
* Default constructor. Note that this does not initialize fields
* to their default values from the schema. If that is desired then
* one should use <code>newBuilder()</code>.
*/
public Emp() {} /**
* All-args constructor.
* @param name The new value for name
* @param id The new value for id
* @param salary The new value for salary
* @param age The new value for age
* @param address The new value for address
*/
public Emp(CharSequence name, Integer id, Integer salary, Integer age, CharSequence address) {
this.name = name;
this.id = id;
this.salary = salary;
this.age = age;
this.address = address;
} public org.apache.avro.Schema getSchema() { return SCHEMA$; }
// Used by DatumWriter. Applications should not call.
public Object get(int field$) {
switch (field$) {
case 0: return name;
case 1: return id;
case 2: return salary;
case 3: return age;
case 4: return address;
default: throw new org.apache.avro.AvroRuntimeException("Bad index");
}
} // Used by DatumReader. Applications should not call.
@SuppressWarnings(value="unchecked")
public void put(int field$, Object value$) {
switch (field$) {
case 0: name = (CharSequence)value$; break;
case 1: id = (Integer)value$; break;
case 2: salary = (Integer)value$; break;
case 3: age = (Integer)value$; break;
case 4: address = (CharSequence)value$; break;
default: throw new org.apache.avro.AvroRuntimeException("Bad index");
}
} /**
* Gets the value of the 'name' field.
* @return The value of the 'name' field.
*/
public CharSequence getName() {
return name;
} /**
* Sets the value of the 'name' field.
* @param value the value to set.
*/
public void setName(CharSequence value) {
this.name = value;
} /**
* Gets the value of the 'id' field.
* @return The value of the 'id' field.
*/
public Integer getId() {
return id;
} /**
* Sets the value of the 'id' field.
* @param value the value to set.
*/
public void setId(Integer value) {
this.id = value;
} /**
* Gets the value of the 'salary' field.
* @return The value of the 'salary' field.
*/
public Integer getSalary() {
return salary;
} /**
* Sets the value of the 'salary' field.
* @param value the value to set.
*/
public void setSalary(Integer value) {
this.salary = value;
} /**
* Gets the value of the 'age' field.
* @return The value of the 'age' field.
*/
public Integer getAge() {
return age;
} /**
* Sets the value of the 'age' field.
* @param value the value to set.
*/
public void setAge(Integer value) {
this.age = value;
} /**
* Gets the value of the 'address' field.
* @return The value of the 'address' field.
*/
public CharSequence getAddress() {
return address;
} /**
* Sets the value of the 'address' field.
* @param value the value to set.
*/
public void setAddress(CharSequence value) {
this.address = value;
} /**
* Creates a new Emp RecordBuilder.
* @return A new Emp RecordBuilder
*/
public static Builder newBuilder() {
return new Builder();
} /**
* Creates a new Emp RecordBuilder by copying an existing Builder.
* @param other The existing builder to copy.
* @return A new Emp RecordBuilder
*/
public static Builder newBuilder(Builder other) {
return new Builder(other);
} /**
* Creates a new Emp RecordBuilder by copying an existing Emp instance.
* @param other The existing instance to copy.
* @return A new Emp RecordBuilder
*/
public static Builder newBuilder(Emp other) {
return new Builder(other);
} /**
* RecordBuilder for Emp instances.
*/
public static class Builder extends org.apache.avro.specific.SpecificRecordBuilderBase<Emp>
implements org.apache.avro.data.RecordBuilder<Emp> { private CharSequence name;
private int id;
private int salary;
private int age;
private CharSequence address; /** Creates a new Builder */
private Builder() {
super(SCHEMA$);
} /**
* Creates a Builder by copying an existing Builder.
* @param other The existing Builder to copy.
*/
private Builder(Builder other) {
super(other);
if (isValidValue(fields()[0], other.name)) {
this.name = data().deepCopy(fields()[0].schema(), other.name);
fieldSetFlags()[0] = true;
}
if (isValidValue(fields()[1], other.id)) {
this.id = data().deepCopy(fields()[1].schema(), other.id);
fieldSetFlags()[1] = true;
}
if (isValidValue(fields()[2], other.salary)) {
this.salary = data().deepCopy(fields()[2].schema(), other.salary);
fieldSetFlags()[2] = true;
}
if (isValidValue(fields()[3], other.age)) {
this.age = data().deepCopy(fields()[3].schema(), other.age);
fieldSetFlags()[3] = true;
}
if (isValidValue(fields()[4], other.address)) {
this.address = data().deepCopy(fields()[4].schema(), other.address);
fieldSetFlags()[4] = true;
}
} /**
* Creates a Builder by copying an existing Emp instance
* @param other The existing instance to copy.
*/
private Builder(Emp other) {
super(SCHEMA$);
if (isValidValue(fields()[0], other.name)) {
this.name = data().deepCopy(fields()[0].schema(), other.name);
fieldSetFlags()[0] = true;
}
if (isValidValue(fields()[1], other.id)) {
this.id = data().deepCopy(fields()[1].schema(), other.id);
fieldSetFlags()[1] = true;
}
if (isValidValue(fields()[2], other.salary)) {
this.salary = data().deepCopy(fields()[2].schema(), other.salary);
fieldSetFlags()[2] = true;
}
if (isValidValue(fields()[3], other.age)) {
this.age = data().deepCopy(fields()[3].schema(), other.age);
fieldSetFlags()[3] = true;
}
if (isValidValue(fields()[4], other.address)) {
this.address = data().deepCopy(fields()[4].schema(), other.address);
fieldSetFlags()[4] = true;
}
} /**
* Gets the value of the 'name' field.
* @return The value.
*/
public CharSequence getName() {
return name;
} /**
* Sets the value of the 'name' field.
* @param value The value of 'name'.
* @return This builder.
*/
public Builder setName(CharSequence value) {
validate(fields()[0], value);
this.name = value;
fieldSetFlags()[0] = true;
return this;
} /**
* Checks whether the 'name' field has been set.
* @return True if the 'name' field has been set, false otherwise.
*/
public boolean hasName() {
return fieldSetFlags()[0];
} /**
* Clears the value of the 'name' field.
* @return This builder.
*/
public Builder clearName() {
name = null;
fieldSetFlags()[0] = false;
return this;
} /**
* Gets the value of the 'id' field.
* @return The value.
*/
public Integer getId() {
return id;
} /**
* Sets the value of the 'id' field.
* @param value The value of 'id'.
* @return This builder.
*/
public Builder setId(int value) {
validate(fields()[1], value);
this.id = value;
fieldSetFlags()[1] = true;
return this;
} /**
* Checks whether the 'id' field has been set.
* @return True if the 'id' field has been set, false otherwise.
*/
public boolean hasId() {
return fieldSetFlags()[1];
} /**
* Clears the value of the 'id' field.
* @return This builder.
*/
public Builder clearId() {
fieldSetFlags()[1] = false;
return this;
} /**
* Gets the value of the 'salary' field.
* @return The value.
*/
public Integer getSalary() {
return salary;
} /**
* Sets the value of the 'salary' field.
* @param value The value of 'salary'.
* @return This builder.
*/
public Builder setSalary(int value) {
validate(fields()[2], value);
this.salary = value;
fieldSetFlags()[2] = true;
return this;
} /**
* Checks whether the 'salary' field has been set.
* @return True if the 'salary' field has been set, false otherwise.
*/
public boolean hasSalary() {
return fieldSetFlags()[2];
} /**
* Clears the value of the 'salary' field.
* @return This builder.
*/
public Builder clearSalary() {
fieldSetFlags()[2] = false;
return this;
} /**
* Gets the value of the 'age' field.
* @return The value.
*/
public Integer getAge() {
return age;
} /**
* Sets the value of the 'age' field.
* @param value The value of 'age'.
* @return This builder.
*/
public Builder setAge(int value) {
validate(fields()[3], value);
this.age = value;
fieldSetFlags()[3] = true;
return this;
} /**
* Checks whether the 'age' field has been set.
* @return True if the 'age' field has been set, false otherwise.
*/
public boolean hasAge() {
return fieldSetFlags()[3];
} /**
* Clears the value of the 'age' field.
* @return This builder.
*/
public Builder clearAge() {
fieldSetFlags()[3] = false;
return this;
} /**
* Gets the value of the 'address' field.
* @return The value.
*/
public CharSequence getAddress() {
return address;
} /**
* Sets the value of the 'address' field.
* @param value The value of 'address'.
* @return This builder.
*/
public Builder setAddress(CharSequence value) {
validate(fields()[4], value);
this.address = value;
fieldSetFlags()[4] = true;
return this;
} /**
* Checks whether the 'address' field has been set.
* @return True if the 'address' field has been set, false otherwise.
*/
public boolean hasAddress() {
return fieldSetFlags()[4];
} /**
* Clears the value of the 'address' field.
* @return This builder.
*/
public Builder clearAddress() {
address = null;
fieldSetFlags()[4] = false;
return this;
} @SuppressWarnings("unchecked")
public Emp build() {
try {
Emp record = new Emp();
record.name = fieldSetFlags()[0] ? this.name : (CharSequence) defaultValue(fields()[0]);
record.id = fieldSetFlags()[1] ? this.id : (Integer) defaultValue(fields()[1]);
record.salary = fieldSetFlags()[2] ? this.salary : (Integer) defaultValue(fields()[2]);
record.age = fieldSetFlags()[3] ? this.age : (Integer) defaultValue(fields()[3]);
record.address = fieldSetFlags()[4] ? this.address : (CharSequence) defaultValue(fields()[4]);
return record;
} catch (Exception e) {
throw new org.apache.avro.AvroRuntimeException(e);
}
}
} @SuppressWarnings("unchecked")
private static final org.apache.avro.io.DatumWriter<Emp>
WRITER$ = (org.apache.avro.io.DatumWriter<Emp>)MODEL$.createDatumWriter(SCHEMA$); @Override public void writeExternal(java.io.ObjectOutput out)
throws java.io.IOException {
WRITER$.write(this, SpecificData.getEncoder(out));
} @SuppressWarnings("unchecked")
private static final org.apache.avro.io.DatumReader<Emp>
READER$ = (org.apache.avro.io.DatumReader<Emp>)MODEL$.createDatumReader(SCHEMA$); @Override public void readExternal(java.io.ObjectInput in)
throws java.io.IOException {
READER$.read(this, SpecificData.getDecoder(in));
} }
Emp.java(删除第480行的“@Override”)
三.进行Avro编程
1>.开始编写串行化代码
/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:y1053419035@qq.com
*/
package cn.org.yinzhengjie.avro; import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumWriter;
import org.junit.Test;
import tutorialspoint.com.Emp; import java.io.File;
import java.io.IOException; public class TestAvro { /**
* 测试Avro序列化
*/
@Test
public void testAvroSerial() throws IOException {
//定义对象,并给对象赋初值
Emp yinzhengjie = new Emp();
yinzhengjie.setId(1);
yinzhengjie.setName("尹正杰");
yinzhengjie.setAge(18);
yinzhengjie.setSalary(80000);
yinzhengjie.setAddress("北京"); //初始化writer对象,我习惯称它为写入器
DatumWriter<Emp> dw = new SpecificDatumWriter<Emp>(Emp.class);
//初始化文件写入器
DataFileWriter<Emp> dfw = new DataFileWriter<Emp>(dw);
//开始序列化文件,这个 Emp.SCHEMA$ 其实是一个常量,是咱们编译时自动生成的一个json格式的字符串。
dfw.create(Emp.SCHEMA$,new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\emp.avro"));
//在序列化文件中追加对象
dfw.append(yinzhengjie);
//释放资源
dfw.close();
System.out.println("序列化成功!");
}
}
2>.测试java、hadoop、avro对10000000个对象串行化速度比对,大小比对
/**
* Autogenerated by Avro
*
* DO NOT EDIT DIRECTLY
*/
package tutorialspoint.com; import org.apache.avro.specific.SpecificData;
import org.apache.avro.message.BinaryMessageEncoder;
import org.apache.avro.message.BinaryMessageDecoder;
import org.apache.avro.message.SchemaStore; @SuppressWarnings("all")
@org.apache.avro.specific.AvroGenerated
public class Emp extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
private static final long serialVersionUID = 6405205887550658768L;
public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"Emp\",\"namespace\":\"tutorialspoint.com\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"id\",\"type\":\"int\"},{\"name\":\"salary\",\"type\":\"int\"},{\"name\":\"age\",\"type\":\"int\"},{\"name\":\"address\",\"type\":\"string\"}]}");
public static org.apache.avro.Schema getClassSchema() { return SCHEMA$; } private static SpecificData MODEL$ = new SpecificData(); private static final BinaryMessageEncoder<Emp> ENCODER =
new BinaryMessageEncoder<Emp>(MODEL$, SCHEMA$); private static final BinaryMessageDecoder<Emp> DECODER =
new BinaryMessageDecoder<Emp>(MODEL$, SCHEMA$); /**
* Return the BinaryMessageDecoder instance used by this class.
*/
public static BinaryMessageDecoder<Emp> getDecoder() {
return DECODER;
} /**
* Create a new BinaryMessageDecoder instance for this class that uses the specified {@link SchemaStore}.
* @param resolver a {@link SchemaStore} used to find schemas by fingerprint
*/
public static BinaryMessageDecoder<Emp> createDecoder(SchemaStore resolver) {
return new BinaryMessageDecoder<Emp>(MODEL$, SCHEMA$, resolver);
} /** Serializes this Emp to a ByteBuffer. */
public java.nio.ByteBuffer toByteBuffer() throws java.io.IOException {
return ENCODER.encode(this);
} /** Deserializes a Emp from a ByteBuffer. */
public static Emp fromByteBuffer(
java.nio.ByteBuffer b) throws java.io.IOException {
return DECODER.decode(b);
} @Deprecated public CharSequence name;
@Deprecated public int id;
@Deprecated public int salary;
@Deprecated public int age;
@Deprecated public CharSequence address; /**
* Default constructor. Note that this does not initialize fields
* to their default values from the schema. If that is desired then
* one should use <code>newBuilder()</code>.
*/
public Emp() {} /**
* All-args constructor.
* @param name The new value for name
* @param id The new value for id
* @param salary The new value for salary
* @param age The new value for age
* @param address The new value for address
*/
public Emp(CharSequence name, Integer id, Integer salary, Integer age, CharSequence address) {
this.name = name;
this.id = id;
this.salary = salary;
this.age = age;
this.address = address;
} public org.apache.avro.Schema getSchema() { return SCHEMA$; }
// Used by DatumWriter. Applications should not call.
public Object get(int field$) {
switch (field$) {
case 0: return name;
case 1: return id;
case 2: return salary;
case 3: return age;
case 4: return address;
default: throw new org.apache.avro.AvroRuntimeException("Bad index");
}
} // Used by DatumReader. Applications should not call.
@SuppressWarnings(value="unchecked")
public void put(int field$, Object value$) {
switch (field$) {
case 0: name = (CharSequence)value$; break;
case 1: id = (Integer)value$; break;
case 2: salary = (Integer)value$; break;
case 3: age = (Integer)value$; break;
case 4: address = (CharSequence)value$; break;
default: throw new org.apache.avro.AvroRuntimeException("Bad index");
}
} /**
* Gets the value of the 'name' field.
* @return The value of the 'name' field.
*/
public CharSequence getName() {
return name;
} /**
* Sets the value of the 'name' field.
* @param value the value to set.
*/
public void setName(CharSequence value) {
this.name = value;
} /**
* Gets the value of the 'id' field.
* @return The value of the 'id' field.
*/
public Integer getId() {
return id;
} /**
* Sets the value of the 'id' field.
* @param value the value to set.
*/
public void setId(Integer value) {
this.id = value;
} /**
* Gets the value of the 'salary' field.
* @return The value of the 'salary' field.
*/
public Integer getSalary() {
return salary;
} /**
* Sets the value of the 'salary' field.
* @param value the value to set.
*/
public void setSalary(Integer value) {
this.salary = value;
} /**
* Gets the value of the 'age' field.
* @return The value of the 'age' field.
*/
public Integer getAge() {
return age;
} /**
* Sets the value of the 'age' field.
* @param value the value to set.
*/
public void setAge(Integer value) {
this.age = value;
} /**
* Gets the value of the 'address' field.
* @return The value of the 'address' field.
*/
public CharSequence getAddress() {
return address;
} /**
* Sets the value of the 'address' field.
* @param value the value to set.
*/
public void setAddress(CharSequence value) {
this.address = value;
} /**
* Creates a new Emp RecordBuilder.
* @return A new Emp RecordBuilder
*/
public static Builder newBuilder() {
return new Builder();
} /**
* Creates a new Emp RecordBuilder by copying an existing Builder.
* @param other The existing builder to copy.
* @return A new Emp RecordBuilder
*/
public static Builder newBuilder(Builder other) {
return new Builder(other);
} /**
* Creates a new Emp RecordBuilder by copying an existing Emp instance.
* @param other The existing instance to copy.
* @return A new Emp RecordBuilder
*/
public static Builder newBuilder(Emp other) {
return new Builder(other);
} /**
* RecordBuilder for Emp instances.
*/
public static class Builder extends org.apache.avro.specific.SpecificRecordBuilderBase<Emp>
implements org.apache.avro.data.RecordBuilder<Emp> { private CharSequence name;
private int id;
private int salary;
private int age;
private CharSequence address; /** Creates a new Builder */
private Builder() {
super(SCHEMA$);
} /**
* Creates a Builder by copying an existing Builder.
* @param other The existing Builder to copy.
*/
private Builder(Builder other) {
super(other);
if (isValidValue(fields()[0], other.name)) {
this.name = data().deepCopy(fields()[0].schema(), other.name);
fieldSetFlags()[0] = true;
}
if (isValidValue(fields()[1], other.id)) {
this.id = data().deepCopy(fields()[1].schema(), other.id);
fieldSetFlags()[1] = true;
}
if (isValidValue(fields()[2], other.salary)) {
this.salary = data().deepCopy(fields()[2].schema(), other.salary);
fieldSetFlags()[2] = true;
}
if (isValidValue(fields()[3], other.age)) {
this.age = data().deepCopy(fields()[3].schema(), other.age);
fieldSetFlags()[3] = true;
}
if (isValidValue(fields()[4], other.address)) {
this.address = data().deepCopy(fields()[4].schema(), other.address);
fieldSetFlags()[4] = true;
}
} /**
* Creates a Builder by copying an existing Emp instance
* @param other The existing instance to copy.
*/
private Builder(Emp other) {
super(SCHEMA$);
if (isValidValue(fields()[0], other.name)) {
this.name = data().deepCopy(fields()[0].schema(), other.name);
fieldSetFlags()[0] = true;
}
if (isValidValue(fields()[1], other.id)) {
this.id = data().deepCopy(fields()[1].schema(), other.id);
fieldSetFlags()[1] = true;
}
if (isValidValue(fields()[2], other.salary)) {
this.salary = data().deepCopy(fields()[2].schema(), other.salary);
fieldSetFlags()[2] = true;
}
if (isValidValue(fields()[3], other.age)) {
this.age = data().deepCopy(fields()[3].schema(), other.age);
fieldSetFlags()[3] = true;
}
if (isValidValue(fields()[4], other.address)) {
this.address = data().deepCopy(fields()[4].schema(), other.address);
fieldSetFlags()[4] = true;
}
} /**
* Gets the value of the 'name' field.
* @return The value.
*/
public CharSequence getName() {
return name;
} /**
* Sets the value of the 'name' field.
* @param value The value of 'name'.
* @return This builder.
*/
public Builder setName(CharSequence value) {
validate(fields()[0], value);
this.name = value;
fieldSetFlags()[0] = true;
return this;
} /**
* Checks whether the 'name' field has been set.
* @return True if the 'name' field has been set, false otherwise.
*/
public boolean hasName() {
return fieldSetFlags()[0];
} /**
* Clears the value of the 'name' field.
* @return This builder.
*/
public Builder clearName() {
name = null;
fieldSetFlags()[0] = false;
return this;
} /**
* Gets the value of the 'id' field.
* @return The value.
*/
public Integer getId() {
return id;
} /**
* Sets the value of the 'id' field.
* @param value The value of 'id'.
* @return This builder.
*/
public Builder setId(int value) {
validate(fields()[1], value);
this.id = value;
fieldSetFlags()[1] = true;
return this;
} /**
* Checks whether the 'id' field has been set.
* @return True if the 'id' field has been set, false otherwise.
*/
public boolean hasId() {
return fieldSetFlags()[1];
} /**
* Clears the value of the 'id' field.
* @return This builder.
*/
public Builder clearId() {
fieldSetFlags()[1] = false;
return this;
} /**
* Gets the value of the 'salary' field.
* @return The value.
*/
public Integer getSalary() {
return salary;
} /**
* Sets the value of the 'salary' field.
* @param value The value of 'salary'.
* @return This builder.
*/
public Builder setSalary(int value) {
validate(fields()[2], value);
this.salary = value;
fieldSetFlags()[2] = true;
return this;
} /**
* Checks whether the 'salary' field has been set.
* @return True if the 'salary' field has been set, false otherwise.
*/
public boolean hasSalary() {
return fieldSetFlags()[2];
} /**
* Clears the value of the 'salary' field.
* @return This builder.
*/
public Builder clearSalary() {
fieldSetFlags()[2] = false;
return this;
} /**
* Gets the value of the 'age' field.
* @return The value.
*/
public Integer getAge() {
return age;
} /**
* Sets the value of the 'age' field.
* @param value The value of 'age'.
* @return This builder.
*/
public Builder setAge(int value) {
validate(fields()[3], value);
this.age = value;
fieldSetFlags()[3] = true;
return this;
} /**
* Checks whether the 'age' field has been set.
* @return True if the 'age' field has been set, false otherwise.
*/
public boolean hasAge() {
return fieldSetFlags()[3];
} /**
* Clears the value of the 'age' field.
* @return This builder.
*/
public Builder clearAge() {
fieldSetFlags()[3] = false;
return this;
} /**
* Gets the value of the 'address' field.
* @return The value.
*/
public CharSequence getAddress() {
return address;
} /**
* Sets the value of the 'address' field.
* @param value The value of 'address'.
* @return This builder.
*/
public Builder setAddress(CharSequence value) {
validate(fields()[4], value);
this.address = value;
fieldSetFlags()[4] = true;
return this;
} /**
* Checks whether the 'address' field has been set.
* @return True if the 'address' field has been set, false otherwise.
*/
public boolean hasAddress() {
return fieldSetFlags()[4];
} /**
* Clears the value of the 'address' field.
* @return This builder.
*/
public Builder clearAddress() {
address = null;
fieldSetFlags()[4] = false;
return this;
} @SuppressWarnings("unchecked")
public Emp build() {
try {
Emp record = new Emp();
record.name = fieldSetFlags()[0] ? this.name : (CharSequence) defaultValue(fields()[0]);
record.id = fieldSetFlags()[1] ? this.id : (Integer) defaultValue(fields()[1]);
record.salary = fieldSetFlags()[2] ? this.salary : (Integer) defaultValue(fields()[2]);
record.age = fieldSetFlags()[3] ? this.age : (Integer) defaultValue(fields()[3]);
record.address = fieldSetFlags()[4] ? this.address : (CharSequence) defaultValue(fields()[4]);
return record;
} catch (Exception e) {
throw new org.apache.avro.AvroRuntimeException(e);
}
}
} @SuppressWarnings("unchecked")
private static final org.apache.avro.io.DatumWriter<Emp>
WRITER$ = (org.apache.avro.io.DatumWriter<Emp>)MODEL$.createDatumWriter(SCHEMA$); @Override public void writeExternal(java.io.ObjectOutput out)
throws java.io.IOException {
WRITER$.write(this, SpecificData.getEncoder(out));
} @SuppressWarnings("unchecked")
private static final org.apache.avro.io.DatumReader<Emp>
READER$ = (org.apache.avro.io.DatumReader<Emp>)MODEL$.createDatumReader(SCHEMA$); @Override public void readExternal(java.io.ObjectInput in)
throws java.io.IOException {
READER$.read(this, SpecificData.getDecoder(in));
} }
Emp.java 文件内容
/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:y1053419035@qq.com
*/
package cn.org.yinzhengjie.avro; import java.io.Serializable; public class Emp2 implements Serializable {
private int id;
private String name;
private int age;
private String address;
private int salary; public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public int getAge() {
return age;
}
public void setAge(int age) {
this.age = age;
}
public String getAddress() {
return address;
}
public void setAddress(String address) {
this.address = address;
}
public int getSalary() {
return salary;
}
public void setSalary(int salary) {
this.salary = salary;
}
}
Emp2.java 文件内容
/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:y1053419035@qq.com
*/
package cn.org.yinzhengjie.avro; import org.apache.hadoop.io.Writable;
import tutorialspoint.com.Emp; import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException; public class EmpWritable implements Writable {
private Emp emp; public Emp getEmp() {
return emp;
} public void setEmp(Emp emp) {
this.emp = emp; } public EmpWritable() {
this.emp = emp;
} //串行化
public void write(DataOutput dataOutput) throws IOException {
dataOutput.writeUTF((String)emp.getName());
dataOutput.writeInt(emp.getId());
dataOutput.writeInt(emp.getSalary());
dataOutput.writeInt(emp.getAge());
dataOutput.writeUTF((String)emp.getAddress());
} //发串行化,我们测试的是串行化,咱们不需要写反串行化的代码,空实现即可。
public void readFields(DataInput dataInput) {
}
}
EmpWritable.java 文件内容
/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:y1053419035@qq.com
*/
package cn.org.yinzhengjie.avro; import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumWriter;
import tutorialspoint.com.Emp; import java.io.*; public class TestAvro {
private static final File avro = new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\emp.avroTest");
private static final File java = new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\emp.javaTest");
private static final File hadoop = new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\emp.hadoopTest"); public static void main(String[] args) throws IOException {
testAvroSerial();
testJavaSerial();
testHadoopSerial();
} //测试hadoop的序列化方式
public static void testHadoopSerial() throws IOException {
long start = System.currentTimeMillis();
//定义对象,并给对象赋初值
Emp p = new Emp();
p.setId(3);
p.setName("jay");
p.setAge(30);
p.setSalary(15000);
p.setAddress("昌平");
EmpWritable ew = new EmpWritable();
ew.setEmp(p);
FileOutputStream fos = new FileOutputStream(hadoop);
DataOutputStream dos = new DataOutputStream(fos);
for (int i = 0;i <= 10000000;i++){
ew.write(dos);
}
fos.close();
dos.close();
System.out.printf("这是Hadoop序列化方式: 生成文件大小:[%d],用时:[%d]\n",hadoop.length(),System.currentTimeMillis() - start);
} //测试Avro序列化
public static void testAvroSerial() throws IOException {
long start = System.currentTimeMillis();
//定义对象,并给对象赋初值
Emp yinzhengjie = new Emp();
yinzhengjie.setId(1);
yinzhengjie.setName("尹正杰");
yinzhengjie.setAge(18);
yinzhengjie.setSalary(80000);
yinzhengjie.setAddress("北京"); //初始化writer对象,我习惯称它为写入器
DatumWriter<Emp> dw = new SpecificDatumWriter<Emp>(Emp.class);
//初始化文件写入器
DataFileWriter<Emp> dfw = new DataFileWriter<Emp>(dw);
//开始序列化文件,这个 Emp2.SCHEMA$ 其实是一个常量,是咱们编译时自动生成的一个json格式的字符串。
dfw.create(Emp.SCHEMA$,avro);
//在序列化文件中追加对象
for (int i = 0;i <= 10000000;i++){
dfw.append(yinzhengjie);
}
//释放资源
dfw.close();
System.out.printf("这是Avro序列化方式: 生成文件大小:[%d],用时:[%d]\n",avro.length(),System.currentTimeMillis() - start);
} //测试Java序列化
public static void testJavaSerial() throws IOException {
long start = System.currentTimeMillis();
Emp2 p = new Emp2();
p.setId(2);
p.setName("tom");
p.setAge(20);
p.setSalary(13000);
p.setAddress("亦庄");
FileOutputStream fos = new FileOutputStream(java);
ObjectOutputStream oos = new ObjectOutputStream(fos);
for (int i = 0;i <= 10000000;i++){
oos.writeObject(p);
}
fos.close();
oos.close();
System.out.printf("这是Java序列化方式: 生成文件大小:[%d],用时:[%d]\n",java.length(),(System.currentTimeMillis()-start));
}
} /*
以上代码执行结果如下:
这是Avro序列化方式: 生成文件大小:[220072462],用时:[2570]
这是Java序列化方式: 生成文件大小:[50000139],用时:[9876]
这是Hadoop序列化方式: 生成文件大小:[250000025],用时:[126290]
*/
经过测试对一个,序列化通一个对象时,用时最短的是Avro,用时最长的是Hadoop,生成文件最小的是Java,生成文件最大是Hadoop。
3>.编写反串行化代码
/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:y1053419035@qq.com
*/
package cn.org.yinzhengjie.avro; import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificDatumWriter;
import tutorialspoint.com.Emp; import java.io.*; public class TestAvro {
private static final File avro = new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\emp.avroTest"); public static void main(String[] args) throws IOException {
AvroSerial();
AvroDeserial();
} //测试Avro序列化
public static void AvroSerial() throws IOException {
long start = System.currentTimeMillis();
//定义对象,并给对象赋初值
Emp yinzhengjie = new Emp();
yinzhengjie.setId(1);
yinzhengjie.setName("尹正杰");
yinzhengjie.setAge(18);
yinzhengjie.setSalary(80000);
yinzhengjie.setAddress("北京");
//初始化writer对象,我习惯称它为写入器
DatumWriter<Emp> dw = new SpecificDatumWriter<Emp>(Emp.class);
//初始化文件写入器
DataFileWriter<Emp> dfw = new DataFileWriter<Emp>(dw);
//开始序列化文件,这个 Emp2.SCHEMA$ 其实是一个常量,是咱们编译时自动生成的一个json格式的字符串。
dfw.create(Emp.SCHEMA$,avro);
//在序列化文件中追加对象
for (int i = 0;i <= 10000000;i++){
dfw.append(yinzhengjie);
}
//释放资源
dfw.close();
System.out.printf("这是Avro序列化方式: 生成文件大小:[%d],用时:[%d]\n",avro.length(),System.currentTimeMillis() - start);
} //测试Avro反序列化
public static void AvroDeserial() throws IOException {
long start = System.currentTimeMillis();
//初始化reader对象
DatumReader<Emp> dr = new SpecificDatumReader<Emp>(Emp.class);
//初始化文件阅读器
DataFileReader<Emp> dfr = new DataFileReader<Emp>(avro,dr);
//遍历阅读器里面的内容
while (dfr.hasNext()){
Emp emp = dfr.next();
Integer id = emp.getId();
CharSequence name = emp.getName();
Integer age = emp.getAge();
Integer salary = emp.getSalary();
CharSequence address = emp.getAddress();
// System.out.println(id + "|" + name + "|" + age + "|" + salary + "|" + address); 不建议打印,因为1000万条数据估计得30~40秒左右才能输出完成。
}
System.out.printf("这是Avro反序列化方式: 生成文件大小:[%d],用时:[%d]\n",avro.length(),System.currentTimeMillis() - start);
} } /*
以上代码执行结果如下:
这是Avro序列化方式: 生成文件大小:[220072462],用时:[2640]
这是Avro反序列化方式: 生成文件大小:[220072462],用时:[2866]
*/
4>.测试java、hadoop、avro对10000000个对象反串行化速度比对
/**
* Autogenerated by Avro
*
* DO NOT EDIT DIRECTLY
*/
package tutorialspoint.com; import org.apache.avro.specific.SpecificData;
import org.apache.avro.message.BinaryMessageEncoder;
import org.apache.avro.message.BinaryMessageDecoder;
import org.apache.avro.message.SchemaStore; @SuppressWarnings("all")
@org.apache.avro.specific.AvroGenerated
public class Emp extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
private static final long serialVersionUID = 6405205887550658768L;
public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"Emp\",\"namespace\":\"tutorialspoint.com\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"id\",\"type\":\"int\"},{\"name\":\"salary\",\"type\":\"int\"},{\"name\":\"age\",\"type\":\"int\"},{\"name\":\"address\",\"type\":\"string\"}]}");
public static org.apache.avro.Schema getClassSchema() { return SCHEMA$; } private static SpecificData MODEL$ = new SpecificData(); private static final BinaryMessageEncoder<Emp> ENCODER =
new BinaryMessageEncoder<Emp>(MODEL$, SCHEMA$); private static final BinaryMessageDecoder<Emp> DECODER =
new BinaryMessageDecoder<Emp>(MODEL$, SCHEMA$); /**
* Return the BinaryMessageDecoder instance used by this class.
*/
public static BinaryMessageDecoder<Emp> getDecoder() {
return DECODER;
} /**
* Create a new BinaryMessageDecoder instance for this class that uses the specified {@link SchemaStore}.
* @param resolver a {@link SchemaStore} used to find schemas by fingerprint
*/
public static BinaryMessageDecoder<Emp> createDecoder(SchemaStore resolver) {
return new BinaryMessageDecoder<Emp>(MODEL$, SCHEMA$, resolver);
} /** Serializes this Emp to a ByteBuffer. */
public java.nio.ByteBuffer toByteBuffer() throws java.io.IOException {
return ENCODER.encode(this);
} /** Deserializes a Emp from a ByteBuffer. */
public static Emp fromByteBuffer(
java.nio.ByteBuffer b) throws java.io.IOException {
return DECODER.decode(b);
} @Deprecated public CharSequence name;
@Deprecated public int id;
@Deprecated public int salary;
@Deprecated public int age;
@Deprecated public CharSequence address; /**
* Default constructor. Note that this does not initialize fields
* to their default values from the schema. If that is desired then
* one should use <code>newBuilder()</code>.
*/
public Emp() {} /**
* All-args constructor.
* @param name The new value for name
* @param id The new value for id
* @param salary The new value for salary
* @param age The new value for age
* @param address The new value for address
*/
public Emp(CharSequence name, Integer id, Integer salary, Integer age, CharSequence address) {
this.name = name;
this.id = id;
this.salary = salary;
this.age = age;
this.address = address;
} public org.apache.avro.Schema getSchema() { return SCHEMA$; }
// Used by DatumWriter. Applications should not call.
public Object get(int field$) {
switch (field$) {
case 0: return name;
case 1: return id;
case 2: return salary;
case 3: return age;
case 4: return address;
default: throw new org.apache.avro.AvroRuntimeException("Bad index");
}
} // Used by DatumReader. Applications should not call.
@SuppressWarnings(value="unchecked")
public void put(int field$, Object value$) {
switch (field$) {
case 0: name = (CharSequence)value$; break;
case 1: id = (Integer)value$; break;
case 2: salary = (Integer)value$; break;
case 3: age = (Integer)value$; break;
case 4: address = (CharSequence)value$; break;
default: throw new org.apache.avro.AvroRuntimeException("Bad index");
}
} /**
* Gets the value of the 'name' field.
* @return The value of the 'name' field.
*/
public CharSequence getName() {
return name;
} /**
* Sets the value of the 'name' field.
* @param value the value to set.
*/
public void setName(CharSequence value) {
this.name = value;
} /**
* Gets the value of the 'id' field.
* @return The value of the 'id' field.
*/
public Integer getId() {
return id;
} /**
* Sets the value of the 'id' field.
* @param value the value to set.
*/
public void setId(Integer value) {
this.id = value;
} /**
* Gets the value of the 'salary' field.
* @return The value of the 'salary' field.
*/
public Integer getSalary() {
return salary;
} /**
* Sets the value of the 'salary' field.
* @param value the value to set.
*/
public void setSalary(Integer value) {
this.salary = value;
} /**
* Gets the value of the 'age' field.
* @return The value of the 'age' field.
*/
public Integer getAge() {
return age;
} /**
* Sets the value of the 'age' field.
* @param value the value to set.
*/
public void setAge(Integer value) {
this.age = value;
} /**
* Gets the value of the 'address' field.
* @return The value of the 'address' field.
*/
public CharSequence getAddress() {
return address;
} /**
* Sets the value of the 'address' field.
* @param value the value to set.
*/
public void setAddress(CharSequence value) {
this.address = value;
} /**
* Creates a new Emp RecordBuilder.
* @return A new Emp RecordBuilder
*/
public static Builder newBuilder() {
return new Builder();
} /**
* Creates a new Emp RecordBuilder by copying an existing Builder.
* @param other The existing builder to copy.
* @return A new Emp RecordBuilder
*/
public static Builder newBuilder(Builder other) {
return new Builder(other);
} /**
* Creates a new Emp RecordBuilder by copying an existing Emp instance.
* @param other The existing instance to copy.
* @return A new Emp RecordBuilder
*/
public static Builder newBuilder(Emp other) {
return new Builder(other);
} /**
* RecordBuilder for Emp instances.
*/
public static class Builder extends org.apache.avro.specific.SpecificRecordBuilderBase<Emp>
implements org.apache.avro.data.RecordBuilder<Emp> { private CharSequence name;
private int id;
private int salary;
private int age;
private CharSequence address; /** Creates a new Builder */
private Builder() {
super(SCHEMA$);
} /**
* Creates a Builder by copying an existing Builder.
* @param other The existing Builder to copy.
*/
private Builder(Builder other) {
super(other);
if (isValidValue(fields()[0], other.name)) {
this.name = data().deepCopy(fields()[0].schema(), other.name);
fieldSetFlags()[0] = true;
}
if (isValidValue(fields()[1], other.id)) {
this.id = data().deepCopy(fields()[1].schema(), other.id);
fieldSetFlags()[1] = true;
}
if (isValidValue(fields()[2], other.salary)) {
this.salary = data().deepCopy(fields()[2].schema(), other.salary);
fieldSetFlags()[2] = true;
}
if (isValidValue(fields()[3], other.age)) {
this.age = data().deepCopy(fields()[3].schema(), other.age);
fieldSetFlags()[3] = true;
}
if (isValidValue(fields()[4], other.address)) {
this.address = data().deepCopy(fields()[4].schema(), other.address);
fieldSetFlags()[4] = true;
}
} /**
* Creates a Builder by copying an existing Emp instance
* @param other The existing instance to copy.
*/
private Builder(Emp other) {
super(SCHEMA$);
if (isValidValue(fields()[0], other.name)) {
this.name = data().deepCopy(fields()[0].schema(), other.name);
fieldSetFlags()[0] = true;
}
if (isValidValue(fields()[1], other.id)) {
this.id = data().deepCopy(fields()[1].schema(), other.id);
fieldSetFlags()[1] = true;
}
if (isValidValue(fields()[2], other.salary)) {
this.salary = data().deepCopy(fields()[2].schema(), other.salary);
fieldSetFlags()[2] = true;
}
if (isValidValue(fields()[3], other.age)) {
this.age = data().deepCopy(fields()[3].schema(), other.age);
fieldSetFlags()[3] = true;
}
if (isValidValue(fields()[4], other.address)) {
this.address = data().deepCopy(fields()[4].schema(), other.address);
fieldSetFlags()[4] = true;
}
} /**
* Gets the value of the 'name' field.
* @return The value.
*/
public CharSequence getName() {
return name;
} /**
* Sets the value of the 'name' field.
* @param value The value of 'name'.
* @return This builder.
*/
public Builder setName(CharSequence value) {
validate(fields()[0], value);
this.name = value;
fieldSetFlags()[0] = true;
return this;
} /**
* Checks whether the 'name' field has been set.
* @return True if the 'name' field has been set, false otherwise.
*/
public boolean hasName() {
return fieldSetFlags()[0];
} /**
* Clears the value of the 'name' field.
* @return This builder.
*/
public Builder clearName() {
name = null;
fieldSetFlags()[0] = false;
return this;
} /**
* Gets the value of the 'id' field.
* @return The value.
*/
public Integer getId() {
return id;
} /**
* Sets the value of the 'id' field.
* @param value The value of 'id'.
* @return This builder.
*/
public Builder setId(int value) {
validate(fields()[1], value);
this.id = value;
fieldSetFlags()[1] = true;
return this;
} /**
* Checks whether the 'id' field has been set.
* @return True if the 'id' field has been set, false otherwise.
*/
public boolean hasId() {
return fieldSetFlags()[1];
} /**
* Clears the value of the 'id' field.
* @return This builder.
*/
public Builder clearId() {
fieldSetFlags()[1] = false;
return this;
} /**
* Gets the value of the 'salary' field.
* @return The value.
*/
public Integer getSalary() {
return salary;
} /**
* Sets the value of the 'salary' field.
* @param value The value of 'salary'.
* @return This builder.
*/
public Builder setSalary(int value) {
validate(fields()[2], value);
this.salary = value;
fieldSetFlags()[2] = true;
return this;
} /**
* Checks whether the 'salary' field has been set.
* @return True if the 'salary' field has been set, false otherwise.
*/
public boolean hasSalary() {
return fieldSetFlags()[2];
} /**
* Clears the value of the 'salary' field.
* @return This builder.
*/
public Builder clearSalary() {
fieldSetFlags()[2] = false;
return this;
} /**
* Gets the value of the 'age' field.
* @return The value.
*/
public Integer getAge() {
return age;
} /**
* Sets the value of the 'age' field.
* @param value The value of 'age'.
* @return This builder.
*/
public Builder setAge(int value) {
validate(fields()[3], value);
this.age = value;
fieldSetFlags()[3] = true;
return this;
} /**
* Checks whether the 'age' field has been set.
* @return True if the 'age' field has been set, false otherwise.
*/
public boolean hasAge() {
return fieldSetFlags()[3];
} /**
* Clears the value of the 'age' field.
* @return This builder.
*/
public Builder clearAge() {
fieldSetFlags()[3] = false;
return this;
} /**
* Gets the value of the 'address' field.
* @return The value.
*/
public CharSequence getAddress() {
return address;
} /**
* Sets the value of the 'address' field.
* @param value The value of 'address'.
* @return This builder.
*/
public Builder setAddress(CharSequence value) {
validate(fields()[4], value);
this.address = value;
fieldSetFlags()[4] = true;
return this;
} /**
* Checks whether the 'address' field has been set.
* @return True if the 'address' field has been set, false otherwise.
*/
public boolean hasAddress() {
return fieldSetFlags()[4];
} /**
* Clears the value of the 'address' field.
* @return This builder.
*/
public Builder clearAddress() {
address = null;
fieldSetFlags()[4] = false;
return this;
} @SuppressWarnings("unchecked")
public Emp build() {
try {
Emp record = new Emp();
record.name = fieldSetFlags()[0] ? this.name : (CharSequence) defaultValue(fields()[0]);
record.id = fieldSetFlags()[1] ? this.id : (Integer) defaultValue(fields()[1]);
record.salary = fieldSetFlags()[2] ? this.salary : (Integer) defaultValue(fields()[2]);
record.age = fieldSetFlags()[3] ? this.age : (Integer) defaultValue(fields()[3]);
record.address = fieldSetFlags()[4] ? this.address : (CharSequence) defaultValue(fields()[4]);
return record;
} catch (Exception e) {
throw new org.apache.avro.AvroRuntimeException(e);
}
}
} @SuppressWarnings("unchecked")
private static final org.apache.avro.io.DatumWriter<Emp>
WRITER$ = (org.apache.avro.io.DatumWriter<Emp>)MODEL$.createDatumWriter(SCHEMA$); @Override public void writeExternal(java.io.ObjectOutput out)
throws java.io.IOException {
WRITER$.write(this, SpecificData.getEncoder(out));
} @SuppressWarnings("unchecked")
private static final org.apache.avro.io.DatumReader<Emp>
READER$ = (org.apache.avro.io.DatumReader<Emp>)MODEL$.createDatumReader(SCHEMA$); @Override public void readExternal(java.io.ObjectInput in)
throws java.io.IOException {
READER$.read(this, SpecificData.getDecoder(in));
} }
Emp.java 文件内容
/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:y1053419035@qq.com
*/
package cn.org.yinzhengjie.avro; import java.io.Serializable; public class Emp2 implements Serializable {
private int id;
private String name;
private int age;
private String address;
private int salary; public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public int getAge() {
return age;
}
public void setAge(int age) {
this.age = age;
}
public String getAddress() {
return address;
}
public void setAddress(String address) {
this.address = address;
}
public int getSalary() {
return salary;
}
public void setSalary(int salary) {
this.salary = salary;
}
}
Emp2.java 文件内容
/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:y1053419035@qq.com
*/
package cn.org.yinzhengjie.avro; import org.apache.hadoop.io.Writable;
import tutorialspoint.com.Emp; import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException; public class EmpWritable implements Writable {
private Emp emp = new Emp(); public Emp getEmp() {
return emp;
} public void setEmp(Emp emp) {
this.emp = emp; } public EmpWritable() {
} //串行化
public void write(DataOutput dataOutput) throws IOException {
dataOutput.writeUTF((String)emp.getName());
dataOutput.writeInt(emp.getId());
dataOutput.writeInt(emp.getSalary());
dataOutput.writeInt(emp.getAge());
dataOutput.writeUTF((String)emp.getAddress());
} //发串行化
public void readFields(DataInput dataInput) throws IOException {
emp.setName((CharSequence)dataInput.readUTF());
emp.setId(dataInput.readInt());
emp.setSalary(dataInput.readInt());
emp.setAge(dataInput.readInt());
emp.setAddress((CharSequence)dataInput.readUTF());
}
}
EmpWritable.java 文件内容
/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:y1053419035@qq.com
*/
package cn.org.yinzhengjie.avro; import org.apache.avro.file.DataFileReader;
import org.apache.avro.io.DatumReader;
import org.apache.avro.specific.SpecificDatumReader;
import tutorialspoint.com.Emp; import java.io.*; public class test {
private static final File avro = new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\emp.avroTest");
private static final File java = new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\emp.javaTest");
private static final File hadoop = new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\emp.hadoopTest"); public static void main(String[] args) throws Exception {
AvroDeserial();
JavaDeserial();
HadoopDeserial();
} //测试hadoop的反序列化方式
public static void HadoopDeserial() throws IOException {
long start = System.currentTimeMillis();
EmpWritable ew = new EmpWritable();
FileInputStream fis = new FileInputStream(hadoop);
DataInputStream dis = new DataInputStream(fis); for (int i = 0;i <= 10000000;i++){
ew.readFields(dis);
// Emp emp = ew.getEmp();
// System.out.println(emp);
}
fis.close();
dis.close();
System.out.printf("这是Hadoop反序列化方式: 反序列化文件大小:[%d],用时:[%d]\n",hadoop.length(),System.currentTimeMillis() - start);
} //测试Avro反序列化
public static void AvroDeserial() throws IOException {
long start = System.currentTimeMillis();
//初始化reader对象
DatumReader<Emp> dr = new SpecificDatumReader<Emp>(Emp.class);
//初始化文件阅读器
DataFileReader<Emp> dfr = new DataFileReader<Emp>(avro,dr);
//遍历阅读器里面的内容
while (dfr.hasNext()){
Emp emp = dfr.next();
Integer id = emp.getId();
CharSequence name = emp.getName();
Integer age = emp.getAge();
Integer salary = emp.getSalary();
CharSequence address = emp.getAddress();
// System.out.println(id + "|" + name + "|" + age + "|" + salary + "|" + address); 不建议打印,因为1000万条数据估计得30~40秒左右才能输出完成。
}
System.out.printf("这是Avro反序列化方式: 反序列化文件大小:[%d],用时:[%d]\n",avro.length(),System.currentTimeMillis() - start);
} //测试Java反序列化
public static void JavaDeserial() throws Exception {
long start = System.currentTimeMillis(); FileInputStream fis = new FileInputStream(java);
ObjectInputStream ois = new ObjectInputStream(fis);
for (int i = 0;i <= 10000000;i++){
Emp2 emp = (Emp2) ois.readObject();
}
fis.close();
ois.close();
System.out.printf("这是Java反序列化方式: 反序列化文件大小:[%d],用时:[%d]\n",java.length(),(System.currentTimeMillis()-start));
}
} /*
以上代码执行结果如下:
这是Avro反序列化方式: 反序列化文件大小:[220072462],用时:[3479]
这是Java反序列化方式: 反序列化文件大小:[50000139],用时:[15677]
这是Hadoop反序列化方式: 反序列化文件大小:[250000025],用时:[134628]
*/
经测试,将一个对象进行1000万次序列化到一个文件后,在反序列化回来,用时最短的依然是Avro,Java用时次之,Hadoop依然用时最长。
四.avro通过不生成代码,直接使用schema的方式对对象进行串行化
通过上面的编程发现如果想要通过Avro进行序列化很繁琐,在序列化之前需要准备环境,需要通过编译工具生成代码,然后在导入IDE中,进行编程,那有么有不生成代码就可以进行序列化操作的呢?答案是肯定的,大致流程如下图:
实现代码如下:
/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Hadoop%E8%BF%9B%E9%98%B6%E4%B9%8B%E8%B7%AF/
EMAIL:y1053419035@qq.com
*/
package cn.org.yinzhengjie.avro; import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificDatumWriter; import java.io.File;
import java.io.IOException; public class avroSchema {
private static final File avscFile = new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\emp.avsc");
private static final File avro = new File("D:\\10.Java\\IDE\\yhinzhengjieData\\Avro\\schema.avro");
public static void main(String[] args) throws IOException {
AvroSerial();
AvroDeserial();
} //通过Schema的方式序列化
private static void AvroSerial() throws IOException {
long start = System.currentTimeMillis();
//通过文件解析schema,生成schema对象,因此需要传入emp.avsc的路径
Schema schema = new Schema.Parser().parse(avscFile);
//将schema变成了类似与map的对象
GenericRecord yinzhengjie = new GenericData.Record(schema);
yinzhengjie.put("id",1);
yinzhengjie.put("name","尹正杰");
yinzhengjie.put("age",18);
yinzhengjie.put("salary",80000);
yinzhengjie.put("address","北京");
//初始化writer对象,注意,泛型应写为:GenericRecord
DatumWriter<GenericRecord> dw = new SpecificDatumWriter<GenericRecord>(schema);
//初始化文件写入器,注意,泛型应写为:GenericRecord
DataFileWriter<GenericRecord> dfw = new DataFileWriter<GenericRecord>(dw);
//定义写入的路径
dfw.create(schema,avro);
for (int i = 0;i <= 10000000;i++){
dfw.append(yinzhengjie);
}
System.out.printf("这是Avro序列化方式: 生成文件大小:[%d],用时:[%d]\n",avro.length(),System.currentTimeMillis() - start);
} //通过Schema的方式序列化
private static void AvroDeserial() throws IOException {
long start = System.currentTimeMillis();
//通过文件解析schema,生成schema对象,因此需要传入emp.avsc的路径
Schema schema = new Schema.Parser().parse(avscFile);
//将schema变成了类似与map的对象
GenericRecord yinzhengjie = new GenericData.Record(schema); //初始化Reader对象,注意,泛型应写为:GenericRecord
DatumReader<GenericRecord> dr = new SpecificDatumReader<GenericRecord>(schema);
//初始化文件阅读器,注意,泛型应写为:GenericRecord
DataFileReader<GenericRecord> dfr = new DataFileReader<GenericRecord>(avro,dr);
//遍历
while (dfr.hasNext()){
GenericRecord record = dfr.next();
// System.out.println(record);
}
System.out.printf("这是Avro反序列化方式: 反序列化文件大小:[%d],用时:[%d]\n",avro.length(),System.currentTimeMillis() - start);
}
} /*
以上代码执行结果如下:
这是Avro序列化方式: 生成文件大小:[220045139],用时:[2408]
这是Avro反序列化方式: 反序列化文件大小:[220045139],用时:[2614]
*/
Hadoop基础-Apache Avro串行化的与反串行化的更多相关文章
- Hadoop基础-Protocol Buffers串行化与反串行化
Hadoop基础-Protocol Buffers串行化与反串行化 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 我们之前学习过很多种序列化文件格式,比如python中的pickl ...
- Hadoop基础-MapReduce的常用文件格式介绍
Hadoop基础-MapReduce的常用文件格式介绍 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 一.MR文件格式-SequenceFile 1>.生成SequenceF ...
- C#基础知识回顾--串行化与反串行化
串行化是指存储和获取磁盘文件.内存或其他地方中的对象.在串行化时,所有的实例数据都保存到存储介质上, 在取消串行化时,对象会被还原,且不能与其原实例区别开来.只需给类添加Serializable属性, ...
- PHP中的抽象类与抽象方法/静态属性和静态方法/PHP中的单利模式(单态模式)/串行化与反串行化(序列化与反序列化)/约束类型/魔术方法小结
前 言 OOP 学习了好久的PHP,今天来总结一下PHP中的抽象类与抽象方法/静态属性和静态方法/PHP中的单利模式(单态模式)/串行化与反串行化(序列化与反序列化). 1 PHP中的抽象 ...
- C#--串行化与反串行化
串行化是指存储和获取磁盘文件.内存或其他地方中的对象.在串行化时,所有的实例数据都保存到存储介质上,在取消串行化时,对象会被还原,且不能与其原实例区别开来.只需给类添加Serializable属性,就 ...
- Hadoop基础-序列化与反序列化(实现Writable接口)
Hadoop基础-序列化与反序列化(实现Writable接口) 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 一.序列化简介 1>.什么是序列化 序列化也称串行化,是将结构化 ...
- Hadoop基础原理
Hadoop基础原理 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 业内有这么一句话说:云计算可能改变了整个传统IT产业的基础架构,而大数据处理,尤其像Hadoop组件这样的技术出 ...
- Hadoop基础-MapReduce入门篇之编写简单的Wordcount测试代码
Hadoop基础-MapReduce入门篇之编写简单的Wordcount测试代码 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 本文主要是记录一写我在学习MapReduce时的一些 ...
- Hadoop基础--统计商家id的标签数案例分析
Hadoop基础--统计商家id的标签数案例分析 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 一.项目需求 将“temptags.txt”中的数据进行分析,统计出商家id的评论标 ...
随机推荐
- tomcat启动问题排查
遇到tomcat错误时不一定是tomcat的配置问题,还有可能是项目的配置问题.检查下xml的servlet配置是不是出了问题. tomcat8.0使用注解的方式帮我注册了servlet了,这时候已经 ...
- Linux上多次restore Tensorflow模型报错
环境:python3,tensotflow 在恢复了预先训练好的模型进行预测时,第一次是能够成功执行的,但我多次restore模型时,出现了以下问题: 1.ValueError: Variable c ...
- Head First Java & 构造函数
java继承中对构造函数是不继承的,只是调用(隐式或显式). ----------------------------------------------------------------- ...
- c# Parallel 并行运算 异步处理
var list = new List<string> { "https://www.baidu.com","https://associates.amazo ...
- 解决Shiro+SpringBoot自定义Filter不生效问题
在SpringBoot+Shiro实现安全框架的时候,自定义扩展了一些Filter,并注册到ShiroFilter,但是运行的时候发现总是在ShiroFilter之前就进入了自定义Filter,结果当 ...
- 配置docker的私有仓库
1:安装docker-registry包 yum install -y docker-distribution 2:启动docker-distribution,默认监听于TCP/5000端口 sy ...
- D3.js 入门学习(二) V4的改动
//d3.scan /* 新的d3.scan方法对数组进行线性扫描,并根据指定的比较函数返回至少一个元素的索引. 这个方法有点类似于d3.min和d3.max. 而d3.scan可以得到极值的索引而不 ...
- webservice(二)简单实例
1.建立WSDL文件 建立WSDL的工具很多,eclipse.zendstudio.vs都可以,我个人建议自己写,熟悉结构,另外自动工具对xml schame类型支持在类型中可能会报错. 下 ...
- 【转载】HttpWebRequest的GetResponse或GetRequestStream偶尔超时 + 总结各种超时死掉的可能和相应的解决办法
[问题] 用C#模拟网页登陆,其中去请求几个页面,会发起对应的http的请求request,其中keepAlive设置为true,提交请求后,然后会有对应的response: resp = (Http ...
- Mysql innodb 间隙锁 (转)
MySQL InnoDB支持三种行锁定方式: 行锁(Record Lock):锁直接加在索引记录上面. 间隙锁(Gap Lock):锁加在不存在的空闲空间,可以是两个索引记录之间,也可能是第一个索引记 ...