SDP (8): Document Database: Basic MongoDB Operations in Scala

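MongoDB is a document-oriented database. Its biggest difference from a traditional relational database is that it imposes no fixed format, i.e. it has no schema, which makes it well suited to handling the diverse, polymorphic data produced by today's Internet-plus businesses. MongoDB is also a distributed database, with big-data processing capability and high availability. MongoDB provides a Scala driver, mongo-scala-driver; this post introduces MongoDB and shows how to program data operations against it from Scala.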

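Much like a relational database, MongoDB is organized as Database -> Collection -> Document. A Collection corresponds to a Table and a Document to a Row. Because MongoDB has no schema, the Documents in one Collection can have different shapes. Before using MongoDB from Scala a connection must be established; the scala-driver offers several ways to connect: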

  val client1 = MongoClient()
  val client2 = MongoClient("mongodb://localhost:27017")

  val clusterSettings = ClusterSettings.builder()
    .hosts(List(new ServerAddress("localhost:27017")).asJava).build()
  val clientSettings = MongoClientSettings.builder().clusterSettings(clusterSettings).build()
  val client = MongoClient(clientSettings)

Here are the corresponding MongoClient factory methods:

  /**
   * Create a default MongoClient at localhost:27017
   *
   * @return MongoClient
   */
  def apply(): MongoClient = apply("mongodb://localhost:27017")

  /**
   * Create a MongoClient instance from a connection string uri
   *
   * @param uri the connection string
   * @return MongoClient
   */
  def apply(uri: String): MongoClient = MongoClient(uri, None)

  /**
   * Create a MongoClient instance from a connection string uri
   *
   * @param uri the connection string
   * @param mongoDriverInformation any driver information to associate with the MongoClient
   * @return MongoClient
   * @note the `mongoDriverInformation` is intended for driver and library authors to associate extra driver metadata with the connections.
   */
  def apply(uri: String, mongoDriverInformation: Option[MongoDriverInformation]): MongoClient = {...}

  /**
   * Create a MongoClient instance from the MongoClientSettings
   *
   * @param clientSettings MongoClientSettings to use for the MongoClient
   * @return MongoClient
   */
  def apply(clientSettings: MongoClientSettings): MongoClient = MongoClient(clientSettings, None)

  /**
   * Create a MongoClient instance from the MongoClientSettings
   *
   * @param clientSettings MongoClientSettings to use for the MongoClient
   * @param mongoDriverInformation any driver information to associate with the MongoClient
   * @return MongoClient
   * @note the `mongoDriverInformation` is intended for driver and library authors to associate extra driver metadata with the connections.
   */
  def apply(clientSettings: MongoClientSettings, mongoDriverInformation: Option[MongoDriverInformation]): MongoClient = {...}

Once connected to MongoDB, select a Database:

 val db = client.getDatabase("testdb")

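Because there is no format constraint, testdb does not have to be created in advance: like a directory in a file system, it is created automatically if it does not exist. Likewise, the collections inside db are created automatically, since no field layout has to be declared up front (no schema):

  val db: MongoDatabase = client.getDatabase("testdb")
  val userCollection: MongoCollection[Document] = db.getCollection("users")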

The factory method for the Document class stored in a collection looks like this:

  /**
   * Create a new document from the elems
   * @param elems   the key/value pairs that make up the Document. This can be any valid `(String, BsonValue)` pair that can be
   *                transformed into a [[BsonElement]] via [[BsonMagnets.CanBeBsonElement]] implicits and any [[BsonTransformer]]s that
   *                are in scope.
   * @return        a new Document consisting key/value pairs given by `elems`.
   */
  def apply(elems: CanBeBsonElement*): Document = {
    val underlying = new BsonDocument()
    elems.foreach(elem => underlying.put(elem.key, elem.value))
    new Document(underlying)
  }

A Document can be built from CanBeBsonElement values. CanBeBsonElement is a key/value structure:

  /**
   * Represents a single [[BsonElement]]
   *
   * This is essentially a `(String, BsonValue)` key value pair. Any pair of `(String, T)` where type `T` has a [[BsonTransformer]] in
   * scope into a [[BsonValue]] is also a valid pair.
   */
  sealed trait CanBeBsonElement {
    val bsonElement: BsonElement

    /**
     * The key of the [[BsonElement]]
     * @return the key
     */
    def key: String = bsonElement.getName

    /**
     * The value of the [[BsonElement]]
     * @return the BsonValue
     */
    def value: BsonValue = bsonElement.getValue
  }

  /**
   * Implicitly converts key/value tuple of type (String, T) into a `CanBeBsonElement`
   *
   * @param kv the key value pair
   * @param transformer the implicit [[BsonTransformer]] for the value
   * @tparam T the type of the value
   * @return a CanBeBsonElement representing the key/value pair
   */
  implicit def tupleToCanBeBsonElement[T](kv: (String, T))(implicit transformer: BsonTransformer[T]): CanBeBsonElement = {
    new CanBeBsonElement {
      override val bsonElement: BsonElement = BsonElement(kv._1, transformer(kv._2))
    }
  }
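The mechanism above (a varargs parameter of a "magnet" type, plus an implicit conversion that requires a type class instance) can be tried out without the driver. The following is only a minimal sketch using the Scala standard library: `Value`, `Transformer`, `Element` and `makeDoc` are hypothetical stand-ins for `BsonValue`, `BsonTransformer`, `CanBeBsonElement` and `Document.apply`, not the driver's own types.

```scala
import scala.language.implicitConversions

object MagnetSketch {
  // Hypothetical stand-in for BsonValue: a small closed set of value types.
  sealed trait Value
  final case class IntValue(i: Int) extends Value
  final case class StrValue(s: String) extends Value

  // Hypothetical stand-in for BsonTransformer[T]: turns a T into a Value.
  trait Transformer[T] { def apply(t: T): Value }
  implicit val intTransformer: Transformer[Int] =
    new Transformer[Int] { def apply(t: Int): Value = IntValue(t) }
  implicit val strTransformer: Transformer[String] =
    new Transformer[String] { def apply(t: String): Value = StrValue(t) }

  // Hypothetical stand-in for CanBeBsonElement: the type every argument is converted into.
  final case class Element(key: String, value: Value)

  // Any (String, T) tuple becomes an Element, provided a Transformer[T] is in scope.
  implicit def tupleToElement[T](kv: (String, T))(implicit tr: Transformer[T]): Element =
    Element(kv._1, tr(kv._2))

  // Varargs of the magnet type, mirroring Document.apply(elems: CanBeBsonElement*).
  def makeDoc(elems: Element*): Map[String, Value] =
    elems.map(e => e.key -> e.value).toMap

  def main(args: Array[String]): Unit = {
    // Each tuple is converted implicitly; a tuple whose value type has no
    // Transformer (for example a Double here) would be rejected at compile time.
    val doc = makeDoc("name" -> "alice wong", "age" -> 24)
    println(doc)
  }
}
```

The point of the design is that the set of accepted value types is open: supporting a new type only takes one more implicit `Transformer` instance, while unsupported types fail at compile time rather than at run time.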

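With the tupleToCanBeBsonElement implicit conversion above, a Document can be built like this:

  // The numeric literals were lost from the source; the values below are illustrative.
  val doc: Document = Document("_id" -> 0, "name" -> "MongoDB", "type" -> "database",
    "count" -> 1, "info" -> Document("x" -> 203, "y" -> 102))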

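These key/value pairs correspond to the field name/field value pairs of an ordinary database table. Next we build two Documents of different shapes and add them to the same collection:

  // alice's _id and age literals were lost from the source; 1 and 24 are illustrative.
  val alice = Document("_id" -> 1, "name" -> "alice wong", "age" -> 24)
  val tiger = Document("first" -> "tiger", "last" -> "chan", "name" -> "tiger chan", "age" -> "unavailable")

  val addAlice: Observable[Completed] = userCollection.insertOne(alice)
  val addTiger: Observable[Completed] = userCollection.insertOne(tiger)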

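This example demonstrates MongoDB's schemaless nature. The insert methods return an Observable. Like a Future it stands for an asynchronous result, but it is lazy: it is only a description of the operation, and nothing runs until subscribe is called to execute it and receive the result: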

  addAlice.subscribe(new Observer[Completed] {
    override def onComplete(): Unit = println("insert alice completed.")
    override def onNext(result: Completed): Unit = println("insert alice successful.")
    override def onError(e: Throwable): Unit = println(s"insert error: ${e.getMessage}")
  })

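Alternatively, convert it to a Future and block on the result with Await:

  // The timeout literal was lost from the source; 2 seconds is illustrative.
  def headResult(observable: Observable[Completed]) = Await.result(observable.head(), 2.seconds)

  val r1 = headResult(addTiger)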

Mongo-Scala provides conversion functions from Observable to Future:

    /**
     * Collects the [[Observable]] results and converts to a [[scala.concurrent.Future]].
     *
     * Automatically subscribes to the `Observable` and uses the [[collect]] method to aggregate the results.
     *
     * @note If the Observable is large then this will consume lots of memory!
     *       If the underlying Observable is infinite this Observable will never complete.
     * @return a future representation of the whole Observable
     */
    def toFuture(): Future[Seq[T]] = {
      val promise = Promise[Seq[T]]()
      collect().subscribe((l: Seq[T]) => promise.success(l), (t: Throwable) => promise.failure(t))
      promise.future
    }

    /**
     * Returns the head of the [[Observable]] in a [[scala.concurrent.Future]].
     *
     * @return the head result of the [[Observable]].
     */
    def head(): Future[T] = {
      import scala.concurrent.ExecutionContext.Implicits.global
      headOption().map {
        case Some(result) => result
        case None         => null.asInstanceOf[T] // scalastyle:ignore null
      }
    }
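The difference between a lazy Observable and a Future, and the Promise-based bridge used by toFuture, can be illustrated with the standard library alone. This is a minimal sketch under simplifying assumptions: `ColdObservable` is a hypothetical, synchronous, single-value stand-in and not the driver's `Observable`.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.util.control.NonFatal

object LazyObservableSketch {
  // Hypothetical "cold" observable: it only stores a description of the work
  // and runs that work each time subscribe is called.
  final class ColdObservable[T](work: () => T) {
    def subscribe(onNext: T => Unit, onError: Throwable => Unit): Unit =
      try onNext(work())
      catch { case NonFatal(e) => onError(e) }

    // Same idea as the driver's toFuture: subscribe and complete a Promise.
    def toFuture(): Future[T] = {
      val promise = Promise[T]()
      subscribe(v => promise.success(v), e => promise.failure(e))
      promise.future
    }
  }

  def main(args: Array[String]): Unit = {
    var runs = 0
    // Building the observable does not execute the work.
    val insert = new ColdObservable(() => { runs += 1; "Completed" })
    println(s"runs before subscribing: $runs")

    // Converting to a Future subscribes, which is what triggers the work.
    val result = Await.result(insert.toFuture(), 2.seconds)
    println(s"result: $result, runs after subscribing: $runs")
  }
}
```

Note that in this sketch every call to subscribe or toFuture runs the work again, which is why each insert Observable in the article is subscribed exactly once.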

insertMany adds documents in a batch:

  val peter = Document("_id" -> 3, "first" -> "peter", "age" -> "old")
  val chan = Document("last" -> "chan", "family" -> "chan's")

  val addMany = userCollection.insertMany(List(peter, chan))
  val r2 = headResult(addMany)

Now we can use count to get the number of Documents in userCollection and find to print them all:

  userCollection.count.head.onComplete {
    case Success(c) => println(s"$c documents in users collection")
    case Failure(e) => println(s"count() error: ${e.getMessage}")
  }

  userCollection.find().toFuture().onComplete {
    case Success(users) => users.foreach(println)
    case Failure(e) => println(s"find error: ${e.getMessage}")
  }

  // Keep the program alive until the asynchronous callbacks have printed.
  scala.io.StdIn.readLine()

Output:

insert alice successful.
insert alice completed.
4 documents in users collection
Document((_id,BsonInt32{value=}), (name,BsonString{value='alice wong'}), (age,BsonInt32{value=}))
Document((_id,BsonObjectId{value=5a96641aa83f2923ab437602}), (first,BsonString{value='tiger'}), (last,BsonString{value='chan'}), (name,BsonString{value='tiger chan'}), (age,BsonString{value='unavailable'}))
Document((_id,BsonInt32{value=3}), (first,BsonString{value='peter'}), (age,BsonString{value='old'}))
Document((_id,BsonObjectId{value=5a96641aa83f2923ab437603}), (last,BsonString{value='chan'}), (family,BsonString{value='chan's'}))

The BsonString wrappers are distracting, so let's use implicit classes to render the results as plain Strings:

object Helpers {

  implicit class DocumentObservable[C](val observable: Observable[Document]) extends ImplicitObservable[Document] {
    override val converter: (Document) => String = (doc) => doc.toJson
  }

  implicit class GenericObservable[C](val observable: Observable[C]) extends ImplicitObservable[C] {
    override val converter: (C) => String = (doc) => doc.toString
  }

  trait ImplicitObservable[C] {
    val observable: Observable[C]
    val converter: (C) => String

    // The timeout literals were lost from the source; 2 seconds is illustrative.
    def results(): Seq[C] = Await.result(observable.toFuture(), 2.seconds)
    def headResult() = Await.result(observable.head(), 2.seconds)

    def printResults(initial: String = ""): Unit = {
      if (initial.length > 0) print(initial)
      results().foreach(res => println(converter(res)))
    }

    def printHeadResult(initial: String = ""): Unit = println(s"${initial}${converter(headResult())}")
  }
}

Printing again:

  userCollection.find().printResults("all documents:")

all documents:{ "_id" : , "name" : "alice wong", "age" :  }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd2" }, "first" : "tiger", "last" : "chan", "name" : "tiger chan", "age" : "unavailable" }
{ "_id" : 3, "first" : "peter", "age" : "old" }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd3" }, "last" : "chan", "family" : "chan's" }

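That is much more readable. find() with no arguments selects every Document. Through its Filters object, MongoDB-Scala provides a full set of query-condition builders, such as equal: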

  /**
   * Creates a filter that matches all documents where the value of the field name equals the specified value. Note that this does
   * not actually generate a `\$eq` operator, as the query language doesn't require it.
   *
   * A friendly alias for the `eq` method.
   *
   * @param fieldName the field name
   * @param value     the value
   * @tparam TItem  the value type
   * @return the filter
   * @see [[http://docs.mongodb.org/manual/reference/operator/query/eq \$eq]]
   */
  def equal[TItem](fieldName: String, value: TItem): Bson = eq(fieldName, value)

equal returns a Bson, and several Bson values can be combined into a more complex query condition:

  userCollection.find(and(gte("age", 24), exists("name", true)))
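Since a filter is just a Bson value, the same condition can also be written by hand as a query document. The sketch below assumes the mongo-scala-driver dependency from this post is on the classpath; it only builds and prints the document, so no running server is needed. `RawFilterSketch` and `ageAndName` are names made up for this illustration, and the hand-written document is expected to match the same documents as the `and(...)` filter, although the driver may render that filter in a different shape.

```scala
import org.mongodb.scala._

object RawFilterSketch {
  // Hand-written counterpart of and(gte("age", 24), exists("name", true)).
  // Conditions on different fields of one query document are ANDed together.
  val ageAndName: Document = Document(
    "age"  -> Document("$gte" -> 24),
    "name" -> Document("$exists" -> true)
  )

  def main(args: Array[String]): Unit = {
    // Print the query document as JSON for inspection.
    println(ageAndName.toJson())
  }
}
```

Because Document is itself a Bson, `ageAndName` could be passed to `userCollection.find(...)` in place of the Filters expression. The Filters builders are usually the better choice, since they avoid typos in operator names such as `$gte`.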

Now we can try out various query conditions:

  userCollection.find(notEqual("_id", 3)).printResults("id != 3:")
  userCollection.find(equal("last", "chan")).printResults("last = chan:")
  userCollection.find(and(gte("age", 24), exists("name", true))).printResults("age >= 24")
  // The age threshold here was lost from the source; 24 is assumed, as in the previous query.
  userCollection.find(or(gte("age", 24), equal("first", "tiger"))).printResults("first = tiger")

Output:

id != 3:{ "_id" : , "name" : "alice wong", "age" :  }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd2" }, "first" : "tiger", "last" : "chan", "name" : "tiger chan", "age" : "unavailable" }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd3" }, "last" : "chan", "family" : "chan's" }
last = chan:{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd2" }, "first" : "tiger", "last" : "chan", "name" : "tiger chan", "age" : "unavailable" }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd3" }, "last" : "chan", "family" : "chan's" }
age >= 24{ "_id" : , "name" : "alice wong", "age" :  }
first = tiger{ "_id" : , "name" : "alice wong", "age" :  }
{ "_id" : { "$oid" : "5a9665cea83f29243ccacbd2" }, "first" : "tiger", "last" : "chan", "name" : "tiger chan", "age" : "unavailable" }

Here is the complete source code for this demonstration:

build.sbt

name := "learn-mongo"

version := "0.1"

scalaVersion := "2.12.4"

libraryDependencies ++= Seq(
  "org.mongodb.scala" %% "mongo-scala-driver" % "2.2.1",
  "com.lightbend.akka" %% "akka-stream-alpakka-mongodb" % "0.17"
)

MongoScala101.scala

import org.mongodb.scala._
import scala.collection.JavaConverters._
import org.mongodb.scala.connection.ClusterSettings
import scala.concurrent._
import scala.concurrent.duration._
import scala.util._
import org.mongodb.scala.model.Filters._

// Numeric literals were lost from the source. Values that the printed labels and
// output do not determine (alice's _id and age, doc's numbers, timeouts) are illustrative.
object MongoScala101 extends App {
  import scala.concurrent.ExecutionContext.Implicits.global

  // Alternative ways to connect:
  //  val client1 = MongoClient()
  //  val client2 = MongoClient("mongodb://localhost:27017")

  val clusterSettings = ClusterSettings.builder()
    .hosts(List(new ServerAddress("localhost:27017")).asJava).build()
  val clientSettings = MongoClientSettings.builder().clusterSettings(clusterSettings).build()
  val client = MongoClient(clientSettings)

  val db: MongoDatabase = client.getDatabase("testdb")
  val userCollection: MongoCollection[Document] = db.getCollection("users")

  // Clear out documents left over from a previous run.
  val deleteAll = userCollection.deleteMany(notEqual("_id", 3))
  deleteAll.head.onComplete {
    case Success(c) => println(s"delete successful $c")
    case Failure(e) => println(s"delete error: ${e.getMessage}")
  }
  scala.io.StdIn.readLine()

  val delete3 = userCollection.deleteMany(equal("_id", 3))
  delete3.head.onComplete {
    case Success(c) => println(s"delete successful $c")
    case Failure(e) => println(s"delete error: ${e.getMessage}")
  }
  scala.io.StdIn.readLine()

  val doc: Document = Document("_id" -> 0, "name" -> "MongoDB", "type" -> "database",
    "count" -> 1, "info" -> Document("x" -> 203, "y" -> 102))

  val alice = Document("_id" -> 1, "name" -> "alice wong", "age" -> 24)
  val tiger = Document("first" -> "tiger", "last" -> "chan", "name" -> "tiger chan", "age" -> "unavailable")

  val addAlice: Observable[Completed] = userCollection.insertOne(alice)
  val addTiger: Observable[Completed] = userCollection.insertOne(tiger)

  // Nothing is inserted until the Observable is subscribed to.
  addAlice.subscribe(new Observer[Completed] {
    override def onComplete(): Unit = println("insert alice completed.")
    override def onNext(result: Completed): Unit = println("insert alice successful.")
    override def onError(e: Throwable): Unit = println(s"insert error: ${e.getMessage}")
  })

  // Block until the first result of the Observable arrives.
  def headResult(observable: Observable[Completed]) = Await.result(observable.head(), 2.seconds)
  val r1 = headResult(addTiger)

  val peter = Document("_id" -> 3, "first" -> "peter", "age" -> "old")
  val chan = Document("last" -> "chan", "family" -> "chan's")

  val addMany = userCollection.insertMany(List(peter, chan))
  val r2 = headResult(addMany)

  import Helpers._

  userCollection.count.head.onComplete {
    case Success(c) => println(s"$c documents in users collection")
    case Failure(e) => println(s"count() error: ${e.getMessage}")
  }

  userCollection.find().toFuture().onComplete {
    case Success(users) => users.foreach(println)
    case Failure(e) => println(s"find error: ${e.getMessage}")
  }
  // Keep the program alive until the asynchronous callbacks have printed.
  scala.io.StdIn.readLine()

  userCollection.find().printResults("all documents:")
  userCollection.find(notEqual("_id", 3)).printResults("id != 3:")
  userCollection.find(equal("last", "chan")).printResults("last = chan:")
  userCollection.find(and(gte("age", 24), exists("name", true))).printResults("age >= 24")
  userCollection.find(or(gte("age", 24), equal("first", "tiger"))).printResults("first = tiger")

  client.close()
  println("end!!!")
}

object Helpers {

  implicit class DocumentObservable[C](val observable: Observable[Document]) extends ImplicitObservable[Document] {
    override val converter: (Document) => String = (doc) => doc.toJson
  }

  implicit class GenericObservable[C](val observable: Observable[C]) extends ImplicitObservable[C] {
    override val converter: (C) => String = (doc) => doc.toString
  }

  trait ImplicitObservable[C] {
    val observable: Observable[C]
    val converter: (C) => String

    def results(): Seq[C] = Await.result(observable.toFuture(), 2.seconds)
    def headResult() = Await.result(observable.head(), 2.seconds)

    def printResults(initial: String = ""): Unit = {
      if (initial.length > 0) print(initial)
      results().foreach(res => println(converter(res)))
    }

    def printHeadResult(initial: String = ""): Unit = println(s"${initial}${converter(headResult())}")
  }
}
