SDP(8):文本式数据库-MongoDB-Scala基本操作
MongoDB是一种文本式数据库。与传统的关系式数据库最大不同是MongoDB没有标准的格式要求,即没有schema,合适高效处理当今由互联网+商业产生的多元多态数据。MongoDB也是一种分布式数据库,充分具备大数据处理能力和高可用性。MongoDB提供了scala终端驱动mongo-scala-driver,我们就介绍一下MongoDB数据库和通过scala来进行数据操作编程。
与关系数据库相似,MongoDB结构为Database->Collection->Document。Collection对应Table,Document对应Row。因为MongoDB没有schema,所以Collection中的Document可以是不同形状格式的。在用scala使用MongoDB之前必须先建立连接,scala-driver提供了多种连接方式:
val client1 = MongoClient()
val client2 = MongoClient("mongodb://localhost:27017") val clusterSettings = ClusterSettings.builder()
.hosts(List(new ServerAddress("localhost:27017")).asJava).build()
val clientSettings = MongoClientSettings.builder().clusterSettings(clusterSettings).build()
val client = MongoClient(clientSettings)
下面是一些对应的MongoClient构建函数:
/**
* Create a default MongoClient at localhost:27017
*
* @return MongoClient
*/
def apply(): MongoClient = apply("mongodb://localhost:27017") /**
* Create a MongoClient instance from a connection string uri
*
* @param uri the connection string
* @return MongoClient
*/
def apply(uri: String): MongoClient = MongoClient(uri, None) /**
* Create a MongoClient instance from a connection string uri
*
* @param uri the connection string
* @param mongoDriverInformation any driver information to associate with the MongoClient
* @return MongoClient
* @note the `mongoDriverInformation` is intended for driver and library authors to associate extra driver metadata with the connections.
*/
def apply(uri: String, mongoDriverInformation: Option[MongoDriverInformation]): MongoClient = {...}
/**
* Create a MongoClient instance from the MongoClientSettings
*
* @param clientSettings MongoClientSettings to use for the MongoClient
* @return MongoClient
*/
def apply(clientSettings: MongoClientSettings): MongoClient = MongoClient(clientSettings, None) /**
* Create a MongoClient instance from the MongoClientSettings
*
* @param clientSettings MongoClientSettings to use for the MongoClient
* @param mongoDriverInformation any driver information to associate with the MongoClient
* @return MongoClient
* @note the `mongoDriverInformation` is intended for driver and library authors to associate extra driver metadata with the connections.
*/
def apply(clientSettings: MongoClientSettings, mongoDriverInformation: Option[MongoDriverInformation]): MongoClient = {
与MongoDB建立连接后可用选定Database:
val db = client.getDatabase("testdb")
由于没有格式限制,所以testdb不需要预先构建,像文件系统的directory一样,不存在时可以自动创建。同样,db内的collection也是可以自动创建的,因为不需要预先设定字段格式(no-schema):
val db: MongoDatabase = client.getDatabase("testdb")
val userCollection: MongoCollection[Document] = db.getCollection("users")
collection中Document类的构建函数如下:
/**
* Create a new document from the elems
* @param elems the key/value pairs that make up the Document. This can be any valid `(String, BsonValue)` pair that can be
* transformed into a [[BsonElement]] via [[BsonMagnets.CanBeBsonElement]] implicits and any [[BsonTransformer]]s that
* are in scope.
* @return a new Document consisting key/value pairs given by `elems`.
*/
def apply(elems: CanBeBsonElement*): Document = {
val underlying = new BsonDocument()
elems.foreach(elem => underlying.put(elem.key, elem.value))
new Document(underlying)
}
Document可以通过CanbeBsonElement构建。CanbeBsonElement是一种key/value结构:
/**
* Represents a single [[BsonElement]]
*
* This is essentially a `(String, BsonValue)` key value pair. Any pair of `(String, T)` where type `T` has a [[BsonTransformer]] in
* scope into a [[BsonValue]] is also a valid pair.
*/
sealed trait CanBeBsonElement {
val bsonElement: BsonElement /**
* The key of the [[BsonElement]]
* @return the key
*/
def key: String = bsonElement.getName /**
* The value of the [[BsonElement]]
* @return the BsonValue
*/
def value: BsonValue = bsonElement.getValue
} /**
* Implicitly converts key/value tuple of type (String, T) into a `CanBeBsonElement`
*
* @param kv the key value pair
* @param transformer the implicit [[BsonTransformer]] for the value
* @tparam T the type of the value
* @return a CanBeBsonElement representing the key/value pair
*/
implicit def tupleToCanBeBsonElement[T](kv: (String, T))(implicit transformer: BsonTransformer[T]): CanBeBsonElement = {
new CanBeBsonElement {
override val bsonElement: BsonElement = BsonElement(kv._1, transformer(kv._2))
}
}
有了上面这个tupleToCanBeBsonElement隐式转换函数就可以用下面的方式构建Document了:
val doc: Document = Document("_id" -> , "name" -> "MongoDB", "type" -> "database",
"count" -> , "info" -> Document("x" -> , "y" -> ))
这种key/value关系对应了一般数据库表中的字段名称/字段值。下面我们尝试建两个不同格式的Document并把它们加入到同一个collection里:
val alice = Document("_id" -> , "name" -> "alice wong", "age" -> )
val tiger = Document("first" -> "tiger", "last" -> "chan", "name" -> "tiger chan", "age" -> "unavailable") val addAlice: Observable[Completed] = userCollection.insertOne(alice)
val addTiger: Observable[Completed] = userCollection.insertOne(tiger)
上面这个例子证明了MongoDB的no-schema特性。用insert方法加入数据返回结果是个Obervable类型。这个类型与Future很像:只是一种运算的描述,必须通过subscribe方法来实际运算获取结果:
addAlice.subscribe(new Observer[Completed] {
override def onComplete(): Unit = println("insert alice completed.")
override def onNext(result: Completed): Unit = println("insert alice sucessful.")
override def onError(e: Throwable): Unit = println(s"insert error: ${e.getMessage}")
})
又或者转成Future后用Future方法如Await来运算:
def headResult(observable: Observable[Completed]) = Await.result(observable.head(), seconds)
val r1 = headResult(addTiger)
Mongo-Scala提供了Observable到Future的转换函数:
/**
* Collects the [[Observable]] results and converts to a [[scala.concurrent.Future]].
*
* Automatically subscribes to the `Observable` and uses the [[collect]] method to aggregate the results.
*
* @note If the Observable is large then this will consume lots of memory!
* If the underlying Observable is infinite this Observable will never complete.
* @return a future representation of the whole Observable
*/
def toFuture(): Future[Seq[T]] = {
val promise = Promise[Seq[T]]()
collect().subscribe((l: Seq[T]) => promise.success(l), (t: Throwable) => promise.failure(t))
promise.future
} /**
* Returns the head of the [[Observable]] in a [[scala.concurrent.Future]].
*
* @return the head result of the [[Observable]].
*/
def head(): Future[T] = {
import scala.concurrent.ExecutionContext.Implicits.global
headOption().map {
case Some(result) => result
case None => null.asInstanceOf[T] // scalastyle:ignore null
}
}
也可以用insertMany来成批加入:
val peter = Document("_id" -> , "first" -> "peter", "age" -> "old")
val chan = Document("last" -> "chan", "family" -> "chan's")
val addMany = userCollection.insertMany(List(peter,chan))
val r2 = headResult(addMany)
现在我们可以用count得出usersCollection中Document数量和用find把所有Document都印出来:
userCollection.count.head.onComplete {
case Success(c) => println(s"$c documents in users collection")
case Failure(e) => println(s"count() error: ${e.getMessage}")
}
userCollection.find().toFuture().onComplete {
case Success(users) => users.foreach(println)
case Failure(e) => println(s"find error: ${e.getMessage}")
}
scala.io.StdIn.readLine()
显示结果:
insert alice sucessful.
insert alice completed.
documents in users collection
Document((_id,BsonInt32{value=}), (name,BsonString{value='alice wong'}), (age,BsonInt32{value=}))
Document((_id,BsonObjectId{value=5a96641aa83f2923ab437602}), (first,BsonString{value='tiger'}), (last,BsonString{value='chan'}), (name,BsonString{value='tiger chan'}), (age,BsonString{value='unavailable'}))
Document((_id,BsonInt32{value=}), (first,BsonString{value='peter'}), (age,BsonString{value='old'}))
Document((_id,BsonObjectId{value=5a96641aa83f2923ab437603}), (last,BsonString{value='chan'}), (family,BsonString{value='chan's'}))
这个BsonString很碍眼,用隐式转换来把它转成String:
object Helpers { implicit class DocumentObservable[C](val observable: Observable[Document]) extends ImplicitObservable[Document] {
override val converter: (Document) => String = (doc) => doc.toJson
} implicit class GenericObservable[C](val observable: Observable[C]) extends ImplicitObservable[C] {
override val converter: (C) => String = (doc) => doc.toString
} trait ImplicitObservable[C] {
val observable: Observable[C]
val converter: (C) => String def results(): Seq[C] = Await.result(observable.toFuture(), seconds)
def headResult() = Await.result(observable.head(), seconds)
def printResults(initial: String = ""): Unit = {
if (initial.length > ) print(initial)
results().foreach(res => println(converter(res)))
}
def printHeadResult(initial: String = ""): Unit = println(s"${initial}${converter(headResult())}")
} }
现在再列印:
userCollection.find().printResults("all documents:") all documents:{ "_id" : , "name" : "alice wong", "age" : }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd2" }, "first" : "tiger", "last" : "chan", "name" : "tiger chan", "age" : "unavailable" }
{ "_id" : , "first" : "peter", "age" : "old" }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd3" }, "last" : "chan", "family" : "chan's" }
现在可读性强多了。find()无条件选出所有Document。MongoDB-Scala通过Filters对象提供了完整的查询条件构建函数如equal:
/**
* Creates a filter that matches all documents where the value of the field name equals the specified value. Note that this does
* actually generate a `\$eq` operator, as the query language doesn't require it.
*
* A friendly alias for the `eq` method.
*
* @param fieldName the field name
* @param value the value
* @tparam TItem the value type
* @return the filter
* @see [[http://docs.mongodb.org/manual/reference/operator/query/eq \$eq]]
*/
def equal[TItem](fieldName: String, value: TItem): Bson = eq(fieldName, value)
equal返回Bson,我们也可以把多个Bson组合起来形成一个更复杂的查询条件:
userCollection.find(and(gte("age",),exists("name",true)))
好了,现在我们可以测试各种查询条件了:
userCollection.find(notEqual("_id",)).printResults("id != 3:")
userCollection.find(equal("last", "chan")).printResults("last = chan:")
userCollection.find(and(gte("age",),exists("name",true))).printResults("age >= 24")
userCollection.find(or(gte("age",),equal("first","tiger"))).printResults("first = tiger")
显示结果:
id != :{ "_id" : , "name" : "alice wong", "age" : }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd2" }, "first" : "tiger", "last" : "chan", "name" : "tiger chan", "age" : "unavailable" }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd3" }, "last" : "chan", "family" : "chan's" }
last = chan:{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd2" }, "first" : "tiger", "last" : "chan", "name" : "tiger chan", "age" : "unavailable" }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd3" }, "last" : "chan", "family" : "chan's" }
age >= { "_id" : , "name" : "alice wong", "age" : }
first = tiger{ "_id" : , "name" : "alice wong", "age" : }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd2" }, "first" : "tiger", "last" : "chan", "name" : "tiger chan", "age" : "unavailable" }
下面是本次示范的源代码:
build.sbt
name := "learn-mongo" version := "0.1" scalaVersion := "2.12.4" libraryDependencies := Seq(
"org.mongodb.scala" %% "mongo-scala-driver" % "2.2.1",
"com.lightbend.akka" %% "akka-stream-alpakka-mongodb" % "0.17"
)
MongoScala101.scala
import org.mongodb.scala._
import scala.collection.JavaConverters._
import org.mongodb.scala.connection.ClusterSettings
import scala.concurrent._
import scala.concurrent.duration._
import scala.util._
import org.mongodb.scala.model.Filters._
object MongoScala101 extends App {
import scala.concurrent.ExecutionContext.Implicits.global
// val client1 = MongoClient()
// val client2 = MongoClient("mongodb://localhost:27017") val clusterSettings = ClusterSettings.builder()
.hosts(List(new ServerAddress("localhost:27017")).asJava).build()
val clientSettings = MongoClientSettings.builder().clusterSettings(clusterSettings).build()
val client = MongoClient(clientSettings) val db: MongoDatabase = client.getDatabase("testdb")
val userCollection: MongoCollection[Document] = db.getCollection("users")
val deleteAll = userCollection.deleteMany(notEqual("_id", ))
deleteAll.head.onComplete {
case Success(c) => println(s"delete sucessful $c")
case Failure(e) => println(s"delete error: ${e.getMessage}")
} scala.io.StdIn.readLine()
val delete3 = userCollection.deleteMany(equal("_id", ))
delete3.head.onComplete {
case Success(c) => println(s"delete sucessful $c")
case Failure(e) => println(s"delete error: ${e.getMessage}")
}
scala.io.StdIn.readLine() val doc: Document = Document("_id" -> , "name" -> "MongoDB", "type" -> "database",
"count" -> , "info" -> Document("x" -> , "y" -> )) val alice = Document("_id" -> , "name" -> "alice wong", "age" -> )
val tiger = Document("first" -> "tiger", "last" -> "chan", "name" -> "tiger chan", "age" -> "unavailable") val addAlice: Observable[Completed] = userCollection.insertOne(alice)
val addTiger: Observable[Completed] = userCollection.insertOne(tiger) addAlice.subscribe(new Observer[Completed] {
override def onComplete(): Unit = println("insert alice completed.")
override def onNext(result: Completed): Unit = println("insert alice sucessful.")
override def onError(e: Throwable): Unit = println(s"insert error: ${e.getMessage}")
}) def headResult(observable: Observable[Completed]) = Await.result(observable.head(), seconds)
val r1 = headResult(addTiger) val peter = Document("_id" -> , "first" -> "peter", "age" -> "old")
val chan = Document("last" -> "chan", "family" -> "chan's")
val addMany = userCollection.insertMany(List(peter,chan))
val r2 = headResult(addMany) import Helpers._
userCollection.count.head.onComplete {
case Success(c) => println(s"$c documents in users collection")
case Failure(e) => println(s"count() error: ${e.getMessage}")
}
userCollection.find().toFuture().onComplete {
case Success(users) => users.foreach(println)
case Failure(e) => println(s"find error: ${e.getMessage}")
}
scala.io.StdIn.readLine() userCollection.find().printResults("all documents:")
userCollection.find(notEqual("_id",)).printResults("id != 3:")
userCollection.find(equal("last", "chan")).printResults("last = chan:")
userCollection.find(and(gte("age",),exists("name",true))).printResults("age >= 24")
userCollection.find(or(gte("age",),equal("first","tiger"))).printResults("first = tiger") client.close() println("end!!!") } object Helpers { implicit class DocumentObservable[C](val observable: Observable[Document]) extends ImplicitObservable[Document] {
override val converter: (Document) => String = (doc) => doc.toJson
} implicit class GenericObservable[C](val observable: Observable[C]) extends ImplicitObservable[C] {
override val converter: (C) => String = (doc) => doc.toString
} trait ImplicitObservable[C] {
val observable: Observable[C]
val converter: (C) => String def results(): Seq[C] = Await.result(observable.toFuture(), seconds)
def headResult() = Await.result(observable.head(), seconds)
def printResults(initial: String = ""): Unit = {
if (initial.length > ) print(initial)
results().foreach(res => println(converter(res)))
}
def printHeadResult(initial: String = ""): Unit = println(s"${initial}${converter(headResult())}")
} }
SDP(8):文本式数据库-MongoDB-Scala基本操作的更多相关文章
- 【网络爬虫入门05】分布式文件存储数据库MongoDB的基本操作与爬虫应用
[网络爬虫入门05]分布式文件存储数据库MongoDB的基本操作与爬虫应用 广东职业技术学院 欧浩源 1.引言 网络爬虫往往需要将大量的数据存储到数据库中,常用的有MySQL.MongoDB和Red ...
- SDP(10):文本式大数据运算环境-MongoDB-Engine功能设计
为了让前面规划的互联网+数据平台能有效对电子商务数据进行管理及实现大数据统计功能,必须在平台上再增加一个MongDB-Engine:数据平台用户通过传入一种Context来指示MongoDB-Engi ...
- mariadb_1 数据库介绍及基本操作
数据库介绍 1.什么是数据库? 简单的说,数据库就是一个存放数据的仓库,这个仓库是按照一定的数据结构(数据结构是指数据的组织形式或数据之间的联系)来组织,存储的,我们可以通过数据库提供的多种方法来管理 ...
- 数据库MongoDB
一.MongoDB简介 MongoDB是由c++语言编写的,是一个基于分布式文件存储的开源数据库系统,在高负载的情况下,添加更多的节点,可以保证服务器性能.MongoDB旨在为web应用提供扩展的高性 ...
- 孤荷凌寒自学python第六十六天学习mongoDB的基本操作并进行简单封装5
孤荷凌寒自学python第六十六天学习mongoDB的基本操作并进行简单封装5并学习权限设置 (完整学习过程屏幕记录视频地址在文末) 今天是学习mongoDB数据库的第十二天. 今天继续学习mongo ...
- 孤荷凌寒自学python第六十五天学习mongoDB的基本操作并进行简单封装4
孤荷凌寒自学python第六十五天学习mongoDB的基本操作并进行简单封装4 (完整学习过程屏幕记录视频地址在文末) 今天是学习mongoDB数据库的第十一天. 今天继续学习mongoDB的简单操作 ...
- 孤荷凌寒自学python第六十四天学习mongoDB的基本操作并进行简单封装3
孤荷凌寒自学python第六十四天学习mongoDB的基本操作并进行简单封装3 (完整学习过程屏幕记录视频地址在文末) 今天是学习mongoDB数据库的第十天. 今天继续学习mongoDB的简单操作, ...
- 孤荷凌寒自学python第六十三天学习mongoDB的基本操作并进行简单封装2
孤荷凌寒自学python第六十三天学习mongoDB的基本操作并进行简单封装2 (完整学习过程屏幕记录视频地址在文末) 今天是学习mongoDB数据库的第九天. 今天继续学习mongoDB的简单操作, ...
- 孤荷凌寒自学python第六十二天学习mongoDB的基本操作并进行简单封装1
孤荷凌寒自学python第六十二天学习mongoDB的基本操作并进行简单封装1 (完整学习过程屏幕记录视频地址在文末) 今天是学习mongoDB数据库的第八天. 今天开始学习mongoDB的简单操作, ...
随机推荐
- Hyperledger Fabric Transaction Flow——事务处理流程
Transaction Flow 本文概述了在标准资产交换过程中发生的事务机制.这个场景包括两个客户,A和B,他们在购买和销售萝卜(产品).他们每个人在网络上都有一个peer,通过这个网络,他们发送自 ...
- winform webbrowser如何强制使用ie11内核?
webkit.net ,cefsharp,openwebkit.net等这些基于谷歌或者基于firfox内核的浏览器有个共同点,就是必须指定winform为x86的才能使用, 而且使用过程中也是各种坑 ...
- 2017-07-09(tar who last)
tar gzip ,bzip2对于文件目录压缩支持有限,所以出现了tar命令 tar [选项] 打包文件名 源文件 -c 打包 -v 显示过程 -f 指定打包文件名 -x 解包 -z 压缩成.ta ...
- javascript对象的标签
[[proto]]标签 [[class]]标签 [[class]] 标签,代表这对象是哪个类型的.在js中不能直接访问到.可以通过Object.prototype.toString.call(obj) ...
- intellij springmvc的配置文件报错
报错: Checks references injected by IntelliLang plugin. Cannot resolve bean 解决: File--Settings[或直接CTR ...
- Linux实践篇--crontab定时任务
原文出处:http://www.cnblogs.com/tracy/archive/2011/12/27/2303788.html.感谢作者的无私分享 一. Crontab 介绍 ...
- C语言学习之选择排序
上一篇文章中讲C语言排序中的比较常见的(交换)冒泡排序,那么这篇文章也将以新手个人的经历来讲同样比较常见而实用的数组排序之选择排序. 选择排序,从字面上看是通过选择来进行排序.其实它的用法就是通过选择 ...
- GDB 的使用
gdb使用: 1.编译时必须加-g选项,生成调试需要的信息.如 g++ xxx.cpp -o xxx -g 2.调试最好结合core文件 3.调试命令:gdb xxx x ...
- Visionpro学习笔记 :QuickBuild-Based Application Run-Once Button
1) Creating a Run-Once Button 通过JobManager调用VisionPro文件.所有的过程放到一个Try/Catch块中. Private Sub RunOnceBut ...
- border-image用法详解
图像边框 border-image使用方法:border-image:url('图像路径') 边距(不能带单位)/宽度 上下方式 左右方式:(四个边距,上右下左,相同时可缩写为一个)repeat平铺 ...