This post analyzes the source code of the Log object itself.

The Log class is the basic class for a topic partition: all of the fundamental management operations on a topic partition are carried out in this object. The class lives in Log.scala, under the log directory of the source tree.

Log is essentially a collection of LogSegments plus the management wrapper around them. Let's start with the initialization code.
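
  As a quick orientation, here is a minimal sketch (not taken from the source) of how a Log instance could be constructed, mirroring what LogManager does at startup; the directory path, scheduler settings and use of default LogConfig values are purely illustrative assumptions.

  import java.io.File
  import kafka.log.{Log, LogConfig}
  import kafka.utils.{KafkaScheduler, SystemTime}

  // Hypothetical standalone construction; in the broker this is done by LogManager.
  val scheduler = new KafkaScheduler(threads = 1)
  scheduler.startup()
  val log = new Log(dir = new File("/tmp/kafka-logs/my-topic-0"), // illustrative partition directory
                    config = LogConfig(),       // per-topic config, defaults assumed here
                    recoveryPoint = 0L,         // no flushed data recorded yet
                    scheduler = scheduler,
                    time = SystemTime)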

class Log(val dir: File,                    // Log instantiation was covered in the LogManager analysis; cross-reference that post
          @volatile var config: LogConfig,
          @volatile var recoveryPoint: Long = 0L,
          scheduler: Scheduler,
          time: Time = SystemTime) extends Logging with KafkaMetricsGroup {

  import kafka.log.Log._  // pulls in the companion object defined at the end of this file

  /* A lock that guards all modifications to the log */
  private val lock = new Object  // the lock object

  /* last time it was flushed */
  private val lastflushedTime = new AtomicLong(time.milliseconds)  // last time the log was flushed to disk; used throughout the management logic

  /* the actual segments of the log */
  // The collection of all segments of this topic partition. It lives for the whole life of the Log;
  // every operation that follows depends on it.
  private val segments: ConcurrentNavigableMap[java.lang.Long, LogSegment] = new ConcurrentSkipListMap[java.lang.Long, LogSegment]
  loadSegments()  // loads every segment of the partition into the segments map and sanity-checks the segment files

  /* Calculate the offset of the next message */
  // activeSegment is the last segment: segments are keyed by base offset, so the largest key is the newest,
  // i.e. the active segment. The next LogOffsetMetadata is built from it here.
  @volatile var nextOffsetMetadata = new LogOffsetMetadata(activeSegment.nextOffset(), activeSegment.baseOffset, activeSegment.size.toInt)

  val topicAndPartition: TopicAndPartition = Log.parseTopicPartitionName(name)  // parse the topic name and partition number from the directory name

  info("Completed load of log %s with log end offset %d".format(name, logEndOffset))

  val tags = Map("topic" -> topicAndPartition.topic, "partition" -> topicAndPartition.partition.toString)  // tag map for the metric gauges below

  // Everything below registers monitoring gauges via the metrics library.
  newGauge("NumLogSegments",
    new Gauge[Int] {
      def value = numberOfSegments
    },
    tags)

  newGauge("LogStartOffset",
    new Gauge[Long] {
      def value = logStartOffset
    },
    tags)

  newGauge("LogEndOffset",
    new Gauge[Long] {
      def value = logEndOffset
    },
    tags)

  newGauge("Size",
    new Gauge[Long] {
      def value = size
    },
    tags)

  /** The name of this log */
  def name = dir.getName()

  The code above is the initialization part of the Log class. Its most important job is to declare the handful of objects used throughout the class and to load the segment files into in-memory objects.
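
  The choice of a ConcurrentSkipListMap keyed by base offset matters: it gives thread-safe access plus ordered lookups such as floorEntry, which read() uses later to find the segment containing a given offset. A tiny standalone sketch with illustrative values:

  import java.util.concurrent.ConcurrentSkipListMap

  val m = new ConcurrentSkipListMap[java.lang.Long, String]()
  m.put(0L, "segment-0")         // base offsets of three hypothetical segments
  m.put(100L, "segment-100")
  m.put(250L, "segment-250")
  m.floorEntry(180L).getValue    // -> "segment-100": the segment that contains offset 180
  m.lastEntry().getValue         // -> "segment-250": the active segment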

  Now let's look at the main loading function, loadSegments.

  private def loadSegments() {
    // create the log directory if it doesn't exist
    dir.mkdirs()  // creates the topic-partition directory, as the source comment says

    // first do a pass through the files in the log directory and remove any temporary files
    // and complete any interrupted swap operations
    for(file <- dir.listFiles if file.isFile) {  // first pass: check for segment files that need to be cleaned up or deleted
      if(!file.canRead)
        throw new IOException("Could not read file " + file)
      val filename = file.getName
      if(filename.endsWith(DeletedFileSuffix) || filename.endsWith(CleanedFileSuffix)) {
        // if the file ends in .deleted or .cleaned, delete it
        file.delete()
      } else if(filename.endsWith(SwapFileSuffix)) {  // a .swap file is left over from an interrupted swap; delete or rename it depending on its type
        // we crashed in the middle of a swap operation, to recover:
        // if a log, swap it in and delete the .index file
        // if an index just delete it, it will be rebuilt
        val baseName = new File(Utils.replaceSuffix(file.getPath, SwapFileSuffix, ""))
        if(baseName.getPath.endsWith(IndexFileSuffix)) {
          file.delete()
        } else if(baseName.getPath.endsWith(LogFileSuffix)){
          // delete the index
          val index = new File(Utils.replaceSuffix(baseName.getPath, LogFileSuffix, IndexFileSuffix))
          index.delete()
          // complete the swap operation
          val renamed = file.renameTo(baseName)
          if(renamed)
            info("Found log file %s from interrupted swap operation, repairing.".format(file.getPath))
          else
            throw new KafkaException("Failed to rename file %s.".format(file.getPath))
        }
      }
    }

    // now do a second pass and load all the .log and .index files
    for(file <- dir.listFiles if file.isFile) {  // second pass: load the segments and verify that log and index files match up
      val filename = file.getName
      if(filename.endsWith(IndexFileSuffix)) {
        // if it is an index file, make sure it has a corresponding .log file
        val logFile = new File(file.getAbsolutePath.replace(IndexFileSuffix, LogFileSuffix))
        if(!logFile.exists) {  // an index file with no corresponding log file is an orphan and gets deleted
          warn("Found an orphaned index file, %s, with no corresponding log file.".format(file.getAbsolutePath))
          file.delete()
        }
      } else if(filename.endsWith(LogFileSuffix)) {  // this is where the LogSegment objects are created
        // if its a log file, load the corresponding log segment
        val start = filename.substring(0, filename.length - LogFileSuffix.length).toLong
        val hasIndex = Log.indexFilename(dir, start).exists  // check whether the corresponding index file exists
        val segment = new LogSegment(dir = dir,
                                     startOffset = start,
                                     indexIntervalBytes = config.indexInterval,
                                     maxIndexSize = config.maxIndexSize,
                                     rollJitterMs = config.randomSegmentJitter,
                                     time = time)
        if(!hasIndex) {
          error("Could not find index file corresponding to log file %s, rebuilding index...".format(segment.log.file.getAbsolutePath))
          // If the index for this log file is missing, recover it. This is exactly what happens when an
          // administrator deletes a corrupt index file and lets Kafka rebuild it.
          segment.recover(config.maxMessageSize)
        }
        segments.put(start, segment)  // add the segment to the overall segments map
      }
    }

    if(logSegments.size == 0) {  // a brand-new topic partition has no segment files yet, so create an empty segment starting at offset 0
      // no existing segments, create a new mutable segment beginning at offset 0
      segments.put(0L, new LogSegment(dir = dir,
                                      startOffset = 0,
                                      indexIntervalBytes = config.indexInterval,
                                      maxIndexSize = config.maxIndexSize,
                                      rollJitterMs = config.randomSegmentJitter,
                                      time = time))
    } else {
      recoverLog()  // if segments do exist, run recovery and set a new recovery point
      // reset the index size of the currently active log segment to allow more entries
      activeSegment.index.resize(config.maxIndexSize)
    }

    // sanity check the index file of every segment to ensure we don't proceed with a corrupt segment
    for (s <- logSegments)
      s.index.sanityCheck()  // index file check
  }

  Let's see what recoverLog actually does.

  private def recoverLog() {
    // if we have the clean shutdown marker, skip recovery
    if(hasCleanShutdownFile) {  // hasCleanShutdownFile simply checks whether the clean-shutdown marker file exists
      this.recoveryPoint = activeSegment.nextOffset  // if it does, set the recovery point straight to the next offset of the last segment
      return
    }

    // okay we need to actually recovery this log
    // Get all segments from the recovery point up to Long.MaxValue, i.e. check whether the recovery point
    // already sits in the last segment or whether there are unflushed segments after it.
    val unflushed = logSegments(this.recoveryPoint, Long.MaxValue).iterator
    while(unflushed.hasNext) {  // recover each unflushed segment; if corruption is found, truncate it and drop everything after it
      val curr = unflushed.next
      info("Recovering unflushed segment %d in log %s.".format(curr.baseOffset, name))
      val truncatedBytes =
        try {
          curr.recover(config.maxMessageSize)
        } catch {
          case e: InvalidOffsetException =>
            val startOffset = curr.baseOffset
            warn("Found invalid offset during recovery for log " + dir.getName +". Deleting the corrupt segment and " +
                 "creating an empty one with starting offset " + startOffset)
            curr.truncateTo(startOffset)
        }
      if(truncatedBytes > 0) {
        // we had an invalid message, delete all remaining log
        warn("Corruption found in segment %d of log %s, truncating to offset %d.".format(curr.baseOffset, name, curr.nextOffset))
        unflushed.foreach(deleteSegment)
      }
    }
  }
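
  For reference, hasCleanShutdownFile is a one-liner; it should look roughly like the following (reproduced from memory, so treat it as a sketch rather than a verbatim quote):

  // The clean-shutdown marker lives in the parent data directory, not in the partition directory itself.
  private def hasCleanShutdownFile() = new File(dir.getParentFile, CleanShutdownFile).exists()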

   What this function really does is delegate to the same-named recovery method on LogSegment; LogSegment itself will be analyzed in a later post. Next let's look at the other methods of Log, namely the two functions that LogManager wraps: deleteOldSegments and flush.

  First, the deleteOldSegments function.

  // This is the function LogManager uses to clean up old log segments; its argument is a predicate function.
  def deleteOldSegments(predicate: LogSegment => Boolean): Int = {
    // find any segments that match the user-supplied predicate UNLESS it is the final segment
    // and it is empty (since we would just end up re-creating it
    val lastSegment = activeSegment  // as noted above, activeSegment is the last segment
    // In LogManager the predicate checks the size and time retention limits. Starting from the oldest segment,
    // take every segment that matches the predicate and is either not the active segment or not empty.
    val deletable = logSegments.takeWhile(s => predicate(s) && (s.baseOffset != lastSegment.baseOffset || s.size > 0))
    val numToDelete = deletable.size  // number of segments to delete
    if(numToDelete > 0) {  // only do work if there is something to delete
      lock synchronized {  // synchronized block
        // we must always have at least one segment, so if we are going to delete all the segments, create a new one first
        if(segments.size == numToDelete)  // if every segment of the partition is to be deleted, roll a new one first
          roll()
        // remove the segments for lookups
        deletable.foreach(deleteSegment(_))  // delete each deletable segment
      }
    }
    numToDelete  // return the number of deleted segments
  }
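
  As a concrete illustration, a time-based retention predicate of the kind LogManager passes in might look like this (a hedged sketch assuming LogSegment exposes lastModified; the real logic lives in LogManager and also covers the size limit):

  // Delete every segment whose last modification is older than the retention window.
  val retentionMs = 7 * 24 * 60 * 60 * 1000L   // illustrative 7-day retention
  val startMs = time.milliseconds
  val deleted = log.deleteOldSegments(seg => startMs - seg.lastModified > retentionMs)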

  The function above shows clearly what deleting segments involves. Below are the helper functions it relies on, with notes.

  The helpers are activeSegment, logSegments and deleteSegment.

  def activeSegment = segments.lastEntry.getValue  // the frequently used activeSegment is simply the last segment in the map

  /**
   * All the log segments in this log ordered from oldest to newest
   */
  def logSegments: Iterable[LogSegment] = {  // called by deleteOldSegments above: just the value view of the segments map
    import JavaConversions._
    segments.values
  }

  /**
   * Get all segments beginning with the segment that includes "from" and ending with the segment
   * that includes up to "to-1" or the end of the log (if to > logEndOffset)
   */
  // Called by recoverLog to find the segments between the recovery point and the end of the log.
  def logSegments(from: Long, to: Long): Iterable[LogSegment] = {
    import JavaConversions._
    lock synchronized {
      val floor = segments.floorKey(from)  // the greatest base offset <= from, i.e. the segment that contains "from"
      if(floor eq null)
        segments.headMap(to).values
      else
        segments.subMap(floor, true, to, false).values
    }
  }

  Next, the deleteSegment function. This is the main deletion routine, and it in turn calls another function to remove the segment's files.

  private def deleteSegment(segment: LogSegment) {
    info("Scheduling log segment %d for log %s for deletion.".format(segment.baseOffset, name))
    lock synchronized {  // synchronize
      segments.remove(segment.baseOffset)  // remove the segment object from the map
      asyncDeleteSegment(segment)  // asynchronously delete the segment's files
    }
  }

  /**
   * Perform an asynchronous delete on the given file if it exists (otherwise do nothing)
   * @throws KafkaStorageException if the file can't be renamed and still exists
   */
  private def asyncDeleteSegment(segment: LogSegment) {
    segment.changeFileSuffixes("", Log.DeletedFileSuffix)
    def deleteSeg() {
      info("Deleting segment %d from log %s.".format(segment.baseOffset, name))
      segment.delete()
    }
    // The scheduler passed into the constructor is the thread pool created during KafkaServer startup;
    // the wrapped LogSegment.delete call is submitted to it and runs after a configurable delay.
    scheduler.schedule("delete-file", deleteSeg, delay = config.fileDeleteDelayMs)
  }

  That covers the functions that LogManager.cleanupLogs delegates to. Next, how does Log.flush, used by LogManager.flushDirtyLogs, work?

  def flush(): Unit = flush(this.logEndOffset)  // this is the variant called from LogManager

  /**
   * Flush log segments for all offsets up to offset-1
   * @param offset The offset to flush up to (non-inclusive); the new recovery point
   */
  def flush(offset: Long) : Unit = {  // this is the variant that does the real work
    if (offset <= this.recoveryPoint)  // if the requested offset is already covered by the recovery point, there is nothing to do
      return
    debug("Flushing log '" + name + " up to offset " + offset + ", last flushed: " + lastFlushTime + " current time: " +
          time.milliseconds + " unflushed = " + unflushedMessages)
    // Find all segments between the recovery point and the requested offset and call LogSegment.flush on each;
    // the actual flush-to-disk work is covered in the LogSegment post.
    for(segment <- logSegments(this.recoveryPoint, offset))
      segment.flush()
    lock synchronized {  // synchronized block
      if(offset > this.recoveryPoint) {
        this.recoveryPoint = offset  // after flushing, advance the recovery point
        lastflushedTime.set(time.milliseconds)  // and record the flush time
      }
    }
  }
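
  In other words, the recovery point is a watermark: everything below it is known to be on disk. A hedged sketch of the kind of decision LogManager.flushDirtyLogs makes before calling it (assuming the flushMs setting on LogConfig and the lastFlushTime accessor; values illustrative):

  // Flush a log whose data has been sitting unflushed longer than its flush.ms setting.
  if (time.milliseconds - log.lastFlushTime >= log.config.flushMs)
    log.flush()   // flushes up to logEndOffset and advances the recovery point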

  That concludes the core management and loading functions. Now for the read and write paths: read and append.

  def append(messages: ByteBufferMessageSet, assignOffsets: Boolean = true): LogAppendInfo = {  // appends a message set to the end of the log
    val appendInfo = analyzeAndValidateMessageSet(messages)  // validates the messages and builds a LogAppendInfo

    // if we have any valid messages, append them to the log
    if(appendInfo.shallowCount == 0)  // nothing to append, return the info as-is
      return appendInfo

    // trim any invalid bytes or partial messages before appending it to the on-disk log
    var validMessages = trimInvalidBytes(messages, appendInfo)  // cut off any trailing invalid bytes

    try {
      // they are valid, insert them in the log
      lock synchronized {
        appendInfo.firstOffset = nextOffsetMetadata.messageOffset  // offset assignment starts at the current log end offset

        if(assignOffsets) {
          // assign offsets to the message set
          val offset = new AtomicLong(nextOffsetMetadata.messageOffset)  // counter for the new offsets
          try {
            validMessages = validMessages.assignOffsets(offset, appendInfo.codec)  // let ByteBufferMessageSet stamp the offsets onto the messages
          } catch {
            case e: IOException => throw new KafkaException("Error in validating messages while appending to log '%s'".format(name), e)
          }
          appendInfo.lastOffset = offset.get - 1  // assignOffsets leaves the counter one past the last assigned offset, so subtract 1
        } else {
          // we are taking the offsets we are given
          if(!appendInfo.offsetsMonotonic || appendInfo.firstOffset < nextOffsetMetadata.messageOffset)
            throw new IllegalArgumentException("Out of order offsets found in " + messages)
        }

        // re-validate message sizes since after re-compression some may exceed the limit
        for(messageAndOffset <- validMessages.shallowIterator) {
          if(MessageSet.entrySize(messageAndOffset.message) > config.maxMessageSize) {  // reject any single message larger than the configured maximum
            // we record the original message set size instead of trimmed size
            // to be consistent with pre-compression bytesRejectedRate recording
            BrokerTopicStats.getBrokerTopicStats(topicAndPartition.topic).bytesRejectedRate.mark(messages.sizeInBytes)
            BrokerTopicStats.getBrokerAllTopicsStats.bytesRejectedRate.mark(messages.sizeInBytes)
            throw new MessageSizeTooLargeException("Message size is %d bytes which exceeds the maximum configured message size of %d."
              .format(MessageSet.entrySize(messageAndOffset.message), config.maxMessageSize))
          }
        }

        // check messages set size may be exceed config.segmentSize
        if(validMessages.sizeInBytes > config.segmentSize) {  // the whole message set must also fit into a single segment
          throw new MessageSetSizeTooLargeException("Message set size is %d bytes which exceeds the maximum configured segment size of %d."
            .format(validMessages.sizeInBytes, config.segmentSize))
        }

        // maybe roll the log if this segment is full
        val segment = maybeRoll(validMessages.sizeInBytes)  // decide whether the active segment needs to be rolled

        // now append to the log
        segment.append(appendInfo.firstOffset, validMessages)  // the actual write is done by the LogSegment

        // increment the log end offset
        updateLogEndOffset(appendInfo.lastOffset + 1)  // advance the log end offset

        trace("Appended message set to log %s with first offset: %d, next offset: %d, and messages: %s"
          .format(this.name, appendInfo.firstOffset, nextOffsetMetadata.messageOffset, validMessages))

        if(unflushedMessages >= config.flush)  // flush to disk if the number of unflushed messages exceeds the configured threshold
          flush()

        appendInfo
      }
    } catch {
      case e: IOException => throw new KafkaStorageException("I/O exception in append to log '%s'".format(name), e)
    }
  }
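
  A hedged sketch of what a call into this path looks like; on a real broker the ByteBufferMessageSet comes out of a produce request, here one is built by hand:

  import kafka.message.{ByteBufferMessageSet, Message, NoCompressionCodec}

  // Build a two-message set and let the log assign offsets to it.
  val messages = new ByteBufferMessageSet(NoCompressionCodec,
                                          new Message("hello".getBytes),
                                          new Message("world".getBytes))
  val info = log.append(messages, assignOffsets = true)
  println("appended offsets %d..%d".format(info.firstOffset, info.lastOffset))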

  Since this path involves message validation, many of its operations only make full sense once you have seen the message implementation under the message directory, which will be analyzed in a later post. For now, here is a brief look at the two pre-processing functions, analyzeAndValidateMessageSet and trimInvalidBytes.

  private def analyzeAndValidateMessageSet(messages: ByteBufferMessageSet): LogAppendInfo = {
    var shallowMessageCount = 0
    var validBytesCount = 0
    var firstOffset, lastOffset = -1L
    var codec: CompressionCodec = NoCompressionCodec
    var monotonic = true
    for(messageAndOffset <- messages.shallowIterator) {  // iterate over all (shallow) messages in the set
      // update the first offset if on the first message
      if(firstOffset < 0)
        firstOffset = messageAndOffset.offset  // record the first offset
      // check that offsets are monotonically increasing
      if(lastOffset >= messageAndOffset.offset)  // if the offsets are not strictly increasing, mark the set as non-monotonic
        monotonic = false
      // update the last offset seen
      lastOffset = messageAndOffset.offset  // record the last offset seen

      val m = messageAndOffset.message  // the actual message

      // Check if the message sizes are valid.
      val messageSize = MessageSet.entrySize(m)
      if(messageSize > config.maxMessageSize) {  // reject messages larger than the configured maximum
        BrokerTopicStats.getBrokerTopicStats(topicAndPartition.topic).bytesRejectedRate.mark(messages.sizeInBytes)
        BrokerTopicStats.getBrokerAllTopicsStats.bytesRejectedRate.mark(messages.sizeInBytes)
        throw new MessageSizeTooLargeException("Message size is %d bytes which exceeds the maximum configured message size of %d."
          .format(messageSize, config.maxMessageSize))
      }

      // check the validity of the message by checking CRC
      m.ensureValid()  // CRC check

      shallowMessageCount += 1  // count of validated messages
      validBytesCount += messageSize  // total validated bytes

      val messageCodec = m.compressionCodec  // remember whether compression is in use
      if(messageCodec != NoCompressionCodec)
        codec = messageCodec
    }
    // return a LogAppendInfo describing the message set
    LogAppendInfo(firstOffset, lastOffset, codec, shallowMessageCount, validBytesCount, monotonic)
  }

  /**
   * Trim any invalid bytes from the end of this message set (if there are any)
   * @param messages The message set to trim
   * @param info The general information of the message set
   * @return A trimmed message set. This may be the same as what was passed in or it may not.
   */
  private def trimInvalidBytes(messages: ByteBufferMessageSet, info: LogAppendInfo): ByteBufferMessageSet = {
    val messageSetValidBytes = info.validBytes
    if(messageSetValidBytes < 0)  // a negative valid-byte count means the set is corrupt
      throw new InvalidMessageSizeException("Illegal length of message set " + messageSetValidBytes + " Message set cannot be appended to log. Possible causes are corrupted produce requests")
    if(messageSetValidBytes == messages.sizeInBytes) {  // the whole set is valid, return it unchanged
      messages
    } else {
      // trim invalid bytes
      val validByteBuffer = messages.buffer.duplicate()
      validByteBuffer.limit(messageSetValidBytes)  // otherwise limit the buffer to the validated size, dropping the trailing bytes
      new ByteBufferMessageSet(validByteBuffer)  // and return a new message set over the trimmed buffer
    }
  }

  These two functions are the pre-processing steps called from append; they touch the message internals, which will be covered in the message post.

  The append path also performs another piece of log management: rolling the log segment. maybeRoll compares the active segment against the configuration to decide whether a new segment is needed.

  private def maybeRoll(messagesSize: Int): LogSegment = {
    val segment = activeSegment
    // Decide whether the log needs to roll over to a new segment.
    if (segment.size > config.segmentSize - messagesSize ||
        segment.size > 0 && time.milliseconds - segment.created > config.segmentMs - segment.rollJitterMs ||
        segment.index.isFull) {
      debug("Rolling new log segment in %s (log_size = %d/%d, index_size = %d/%d, age_ms = %d/%d)."
            .format(name,
                    segment.size,
                    config.segmentSize,
                    segment.index.entries,
                    segment.index.maxEntries,
                    time.milliseconds - segment.created,
                    config.segmentMs - segment.rollJitterMs))
      roll()  // delegate the actual rolling to roll()
    } else {
      segment  // no roll needed, keep appending to the current segment
    }
  }

  /**
   * Roll the log over to a new active segment starting with the current logEndOffset.
   * This will trim the index to the exact size of the number of entries it currently contains.
   * @return The newly rolled segment
   */
  def roll(): LogSegment = {
    val start = time.nanoseconds
    lock synchronized {
      val newOffset = logEndOffset  // the log end offset becomes the base offset (and file name) of the new segment
      val logFile = logFilename(dir, newOffset)  // new log file name
      val indexFile = indexFilename(dir, newOffset)  // new index file name
      for(file <- List(logFile, indexFile); if file.exists) {  // if either file already exists, delete it first
        warn("Newly rolled segment file " + file.getName + " already exists; deleting it first")
        file.delete()
      }

      segments.lastEntry() match {
        case null =>
        case entry => entry.getValue.index.trimToValidSize()
      }
      val segment = new LogSegment(dir,  // create the new segment
                                   startOffset = newOffset,
                                   indexIntervalBytes = config.indexInterval,
                                   maxIndexSize = config.maxIndexSize,
                                   rollJitterMs = config.randomSegmentJitter,
                                   time = time)
      val prev = addSegment(segment)  // add the new segment to the segments map
      if(prev != null)
        throw new KafkaException("Trying to roll a new log segment for topic partition %s with start offset %d while it already exists.".format(name, newOffset))

      // schedule an asynchronous flush of the old segment
      scheduler.schedule("flush-log", () => flush(newOffset), delay = 0L)  // submit a flush task for the old segment to the scheduler

      info("Rolled new log segment for '" + name + "' in %.0f ms.".format((System.nanoTime - start) / (1000.0*1000.0)))

      segment
    }
  }
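
  Restated as a standalone predicate, the three roll triggers are size, age and index capacity (a sketch for readability, not code from the source):

  // A segment rolls when any of these holds for the incoming message set:
  def shouldRoll(seg: LogSegment, incomingBytes: Int, now: Long): Boolean =
    seg.size > config.segmentSize - incomingBytes ||                              // the append would push it past the segment size limit
    (seg.size > 0 && now - seg.created > config.segmentMs - seg.rollJitterMs) ||  // a non-empty segment is older than the roll time (minus jitter)
    seg.index.isFull                                                              // the offset index has no free slots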

  Having covered writing a batch of messages, let's look at reading one.

  def read(startOffset: Long, maxLength: Int, maxOffset: Option[Long] = None): FetchDataInfo = {
    trace("Reading %d bytes from offset %d in log %s of length %d bytes".format(maxLength, startOffset, name, size))

    // check if the offset is valid and in range
    val next = nextOffsetMetadata.messageOffset
    if(startOffset == next)  // reading exactly at the log end offset returns an empty message set
      return FetchDataInfo(nextOffsetMetadata, MessageSet.Empty)

    var entry = segments.floorEntry(startOffset)  // find the segment whose range contains the requested offset

    // attempt to read beyond the log end offset is an error
    if(startOffset > next || entry == null)
      throw new OffsetOutOfRangeException("Request for offset %d but we only have log segments in the range %d to %d.".format(startOffset, segments.firstKey, next))

    // do the read on the segment with a base offset less than the target offset
    // but if that segment doesn't contain any messages with an offset greater than that
    // continue to read from successive segments until we get some messages or we reach the end of the log
    while(entry != null) {
      val fetchInfo = entry.getValue.read(startOffset, maxOffset, maxLength)  // delegate the read to LogSegment.read
      if(fetchInfo == null) {
        entry = segments.higherEntry(entry.getKey)  // nothing there, move on to the next segment
      } else {
        return fetchInfo  // got data, return it
      }
    }

    // okay we are beyond the end of the last segment with no data fetched although the start offset is in range,
    // this can happen when all messages with offset larger than start offsets have been deleted.
    // In this case, we will return the empty set with log end offset metadata
    FetchDataInfo(nextOffsetMetadata, MessageSet.Empty)  // nothing found, return an empty message set
  }
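
  A hedged usage sketch of the fetch path, assuming FetchDataInfo exposes the fetched messageSet; the offset and size are illustrative:

  // Read up to 64 KB starting at offset 12345 and walk the returned message set.
  val fetch = log.read(startOffset = 12345L, maxLength = 64 * 1024, maxOffset = None)
  for (messageAndOffset <- fetch.messageSet)
    println("offset %d, payload %d bytes".format(messageAndOffset.offset, messageAndOffset.message.payloadSize))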

   Reading a batch of messages relies on the LogSegment implementation, which will be analyzed in a later post.

That covers the main functionality and methods of the Log class, and concludes the analysis of Log.

The constants and helper methods used throughout Log live in the companion object; the Log object is reproduced below.

object Log {
  // Constants and small helper functions used by the Log class.

  /** a log file */
  val LogFileSuffix = ".log"

  /** an index file */
  val IndexFileSuffix = ".index"

  /** a file that is scheduled to be deleted */
  val DeletedFileSuffix = ".deleted"

  /** A temporary file that is being used for log cleaning */
  val CleanedFileSuffix = ".cleaned"

  /** A temporary file used when swapping files into the log */
  val SwapFileSuffix = ".swap"

  /** Clean shutdown file that indicates the broker was cleanly shutdown in 0.8. This is required to maintain backwards compatibility
   * with 0.8 and avoid unnecessary log recovery when upgrading from 0.8 to 0.8.1 */
  /** TODO: Get rid of CleanShutdownFile in 0.8.2 */
  val CleanShutdownFile = ".kafka_cleanshutdown"

  /**
   * Make log segment file name from offset bytes. All this does is pad out the offset number with zeros
   * so that ls sorts the files numerically.
   * @param offset The offset to use in the file name
   * @return The filename
   */
  def filenamePrefixFromOffset(offset: Long): String = {
    val nf = NumberFormat.getInstance()
    nf.setMinimumIntegerDigits(20)
    nf.setMaximumFractionDigits(0)
    nf.setGroupingUsed(false)
    nf.format(offset)
  }

  /**
   * Construct a log file name in the given dir with the given base offset
   * @param dir The directory in which the log will reside
   * @param offset The base offset of the log file
   */
  def logFilename(dir: File, offset: Long) =
    new File(dir, filenamePrefixFromOffset(offset) + LogFileSuffix)

  /**
   * Construct an index file name in the given dir using the given base offset
   * @param dir The directory in which the log will reside
   * @param offset The base offset of the log file
   */
  def indexFilename(dir: File, offset: Long) =
    new File(dir, filenamePrefixFromOffset(offset) + IndexFileSuffix)

  /**
   * Parse the topic and partition out of the directory name of a log
   */
  def parseTopicPartitionName(name: String): TopicAndPartition = {
    val index = name.lastIndexOf('-')
    TopicAndPartition(name.substring(0,index), name.substring(index+1).toInt)
  }
}
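
  A few examples of what these helpers produce (paths and offsets are illustrative):

  import java.io.File

  Log.filenamePrefixFromOffset(368769L)              // "00000000000000368769"
  Log.logFilename(new File("/data/my-topic-0"), 0L)  // /data/my-topic-0/00000000000000000000.log
  Log.parseTopicPartitionName("my-topic-0")          // TopicAndPartition("my-topic", 0)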
