Self-modifying code, on-the-fly modification, dynamic compilation, just-in-time compilation, bytecode
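Self-modifying code is code that modifies its own instructions at run time. Possible uses include evading detection by antivirus software (in viruses), resisting static analysis, anti-piracy protection[1], and firmware upgrades on microcontrollers.
On computers that execute code from RAM, a program can modify the code segment held in memory. In the past this technique was often used by attackers to write viruses (see: EICAR test virus); today many operating systems and CPUs provide ways to restrict a program from modifying its code segment. The technique can also be used for software protection, because it makes static analysis harder for people trying to crack the software[2].
Java SE 6 provides the Java Compiler API. Combined with Java's reflection mechanism, it lets a Java program generate new classes at run time and use them in place of old ones.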
In computer science, self-modifying code is code that alters its own instructions while it is executing, usually to reduce the instruction path length and improve performance, or simply to reduce otherwise repetitive, near-identical code and so simplify maintenance. Self-modification is an alternative to the method of "flag setting" and conditional program branching, used primarily to reduce the number of times a condition needs to be tested. The term is usually applied only to code where the self-modification is intentional, not to situations where code accidentally modifies itself due to an error such as a buffer overflow.
The method is frequently used for conditionally invoking test/debugging code without requiring additional computational overhead for every input/output cycle.
The modifications may be performed:
- only during initialization, based on input parameters (the process is then more commonly described as software 'configuration' and is somewhat analogous, in hardware terms, to setting jumpers on printed circuit boards). Altering program entry pointers is an equivalent, indirect method of self-modification, but it requires one or more alternative instruction paths to coexist, which increases the program size.
- throughout execution ("on the fly"), based on particular program states reached during execution.
In either case, the modifications may be performed directly to the machine code instructions themselves, by overlaying new instructions over the existing ones (for example: altering a compare and branch to an unconditional branch or alternatively a 'NOP').
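As a rough high-level analogy for overlaying new instructions on existing ones, the following Python sketch replaces a function's code object at run time, so that a debug test performed on every call disappears entirely. It patches a CPython code object rather than machine code, and the names `process`, `_process_fast`, and `disable_debug` are hypothetical, chosen only for illustration.

```python
DEBUG_LOG = []


def process(value):
    """Initial version: tests the debug condition on every call."""
    if DEBUG_LOG is not None:          # the "compare and branch"
        DEBUG_LOG.append(value)
    return value * 2


def _process_fast(value):
    """Replacement body with the debug test removed."""
    return value * 2


def disable_debug():
    """Overlay process() with the fast body; the function object stays the same."""
    process.__code__ = _process_fast.__code__


alias = process            # an existing reference held by some caller
before = alias(3)          # logged and doubled
disable_debug()
after = alias(4)           # doubled only; the test is no longer executed
```

Because the function object itself is unchanged, every existing reference to it picks up the new body, which is the property that distinguishes this from simply rebinding a name to a different function.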
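A closely related technique is incremental compilation. Incremental compilers are used for POP-2, POP-11, some versions of Lisp such as Maclisp, and at least one version of ML (Poplog ML). The technique requires the language's compiler to be part of the run-time environment. Source code can then be read at any time from the terminal, from a file, or from a data structure built by the running program, and translated into a machine-code block or function (possibly replacing an earlier function of the same name) that the program can use immediately. Because interactive development and testing demand fast compilation, the resulting machine code is less heavily optimized than the output of a standard batch compiler. Even so, an incrementally compiled program usually runs faster than an ordinary interpreted version of the same program. Incremental compilation therefore offers advantages of both compiled and interpreted languages.
To improve portability, incremental compilation is usually done in two steps: first to an intermediate, platform-independent language, and then to machine code. In that case, porting requires changing only the back-end compiler. Unlike dynamic compilation, incremental compilation performs no further optimization once the program is running.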
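The idea of compiling source text while the program runs and replacing an earlier function of the same name can be sketched in Python as follows. This is only an illustration under stated assumptions: CPython's built-in `compile` produces bytecode rather than machine code, and the helper `define` and the function `area` are hypothetical names.

```python
def define(source, namespace):
    """Compile a source fragment and install its definitions into namespace."""
    code = compile(source, "<incremental>", "exec")
    exec(code, namespace)


env = {}

# First definition, read from a string built by the running program.
define("def area(r):\n    return 3 * r * r\n", env)
first = env["area"](2)

# A later definition with the same name replaces the earlier one at once.
define("def area(r):\n    return 3.14159 * r * r\n", env)
second = env["area"](2)
```

The source string could equally come from a terminal or a file; the only requirement is that the compiler is available inside the running program.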
Dynamic compilation is a process used by some programming language implementations to gain performance during program execution. Although the technique is often said to have originated in Self, the best-known language that uses it is Java. Since the machine code emitted by a dynamic compiler is constructed and optimized at program run time, dynamic compilation enables optimizations that are not available to statically compiled programs except through code duplication or metaprogramming.
Runtime environments using dynamic compilation typically run programs slowly for the first few minutes; after that, most of the compilation and recompilation is done and the programs run quickly. Because of this initial performance lag, dynamic compilation is undesirable in certain cases. In most implementations, some optimizations that could be done at initial compile time are delayed until further compilation at run time, causing additional slowdowns. Just-in-time compilation is a form of dynamic compilation.
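Just-in-time compilation (JIT; Chinese: 即时编译)[1][2], also rendered in Chinese as 及时编译[3] or 实时编译[4], is a form of dynamic compilation and a way to improve the run-time efficiency of programs. Programs are generally run in one of two ways: static compilation or dynamic interpretation. A statically compiled program is translated entirely into machine code before execution, whereas an interpreted program is translated statement by statement as it runs.
Microsoft's .NET Framework[5][6] and the vast majority of Java implementations[7] rely on just-in-time compilation for fast code execution. SpiderMonkey, the JavaScript engine used by Mozilla Firefox, also uses JIT techniques. Rubinius, a third-party implementation of Ruby, and PyPy, a third-party implementation of Python, likewise use JIT to improve performance markedly over a plain interpreter.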
In computing, just-in-time (JIT) compilation (also dynamic translation or run-time compilation)[1] is a way of executing computer code that involves compilation during execution of a program, at run time, rather than prior to execution.[2] Most often this consists of translating source code, or more commonly bytecode, to machine code, which is then executed directly. A system implementing a JIT compiler typically analyses the code being executed continuously and identifies parts of the code where the speedup gained from compilation or recompilation would outweigh the overhead of compiling that code.
JIT compilation is a combination of the two traditional approaches to translation to machine code, ahead-of-time (AOT) compilation and interpretation, and combines some advantages and drawbacks of both.[2] Roughly, JIT compilation combines the speed of compiled code with the flexibility of interpretation, at the cost of an interpreter's overhead plus the additional overhead of compiling (not just interpreting). JIT compilation is a form of dynamic compilation, and allows adaptive optimization such as dynamic recompilation and microarchitecture-specific speedups.[nb 1][3] Thus, in theory, JIT compilation can yield faster execution than static compilation, because it can use information that is available only at run time. Interpretation and JIT compilation are particularly suited for dynamic programming languages, as the runtime system can handle late-bound data types and enforce security guarantees.
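The hot-spot idea described above, interpreting code until it has run often enough that compiling it pays off, can be illustrated with a toy tiered evaluator. This is a minimal sketch, not how any production JIT is built: the expression format, the class `TieredExpr`, and the threshold of 3 are all assumptions, and the "compiled" tier is CPython bytecode produced by the built-in `compile`, not machine code.

```python
HOT_THRESHOLD = 3  # assumed value; real systems tune this empirically


def interpret(node, env):
    """Evaluate a nested-tuple expression tree directly."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return env[node[1]]
    left, right = interpret(node[1], env), interpret(node[2], env)
    return left + right if kind == "add" else left * right


def to_source(node):
    """Translate the tree into a Python expression string."""
    kind = node[0]
    if kind == "const":
        return repr(node[1])
    if kind == "var":
        return f"env[{node[1]!r}]"
    op = "+" if kind == "add" else "*"
    return f"({to_source(node[1])} {op} {to_source(node[2])})"


class TieredExpr:
    """Interprets an expression until it is hot, then switches to compiled code."""

    def __init__(self, tree):
        self.tree = tree
        self.calls = 0          # number of interpreted executions so far
        self.compiled = None    # filled in once the expression becomes hot

    def __call__(self, env):
        if self.compiled is not None:
            return self.compiled(env)
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            code = compile(f"lambda env: {to_source(self.tree)}", "<jit>", "eval")
            self.compiled = eval(code)
        return interpret(self.tree, env)


# x * 2 + 1, evaluated five times; the last two calls use the compiled tier.
expr = TieredExpr(("add", ("mul", ("var", "x"), ("const", 2)), ("const", 1)))
results = [expr({"x": i}) for i in range(5)]
```

The call counter stands in for the continuous profiling a real JIT performs; the trade-off it encodes is the same one the text describes, paying the compilation cost only for code that runs often.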
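Bytecode exists mainly so that software can run in a given software environment independently of the hardware environment. It is implemented through a compiler and a virtual machine: the compiler translates source code into bytecode, and the virtual machine on a particular platform translates the bytecode into instructions that can be executed directly. A typical example is Java bytecode.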
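The compiler-plus-virtual-machine split can be observed directly in CPython, used here only as a convenient stand-in for the Java case the text mentions. The sketch compiles an expression to a code object, lists the bytecode instruction names with the standard `dis` module, and then hands the code object to the interpreter. The exact instruction names differ between Python versions, so none are assumed here.

```python
import dis

# Phase 1: the compiler turns source text into a bytecode-bearing code object.
code = compile("a * b + 1", "<example>", "eval")
raw_bytes = code.co_code                       # the bytecode itself, as bytes
opnames = [ins.opname for ins in dis.get_instructions(code)]

# Phase 2: the virtual machine executes that bytecode.
value = eval(code, {"a": 6, "b": 7})
```

Printing `opnames` shows the instruction sequence for the running interpreter version.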
A bytecode program may be executed by parsing and directly executing the instructions, one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators, or just-in-time (JIT) compilers, translate bytecode into machine code as necessary at runtime. This makes the virtual machine hardware-specific but doesn't lose the portability of the bytecode. For example, Java and Smalltalk code is typically stored in bytecode format, which is typically then JIT compiled to translate the bytecode to machine code before execution. This introduces a delay before a program is run, when the bytecode is compiled to native machine code, but improves execution speed considerably compared to interpreting source code directly, normally by around an order of magnitude (10x).[3]
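A bytecode interpreter of the first kind described above, one that decodes and executes a single instruction at a time, can be sketched as a small stack machine. The instruction set (`PUSH`, `ADD`, `MUL`, `HALT`) and its one-byte encoding are invented for this example and do not correspond to Java, Smalltalk, or any real bytecode format.

```python
PUSH, ADD, MUL, HALT = 0, 1, 2, 3  # hypothetical opcodes


def run(bytecode):
    """Execute a byte sequence one instruction at a time on an operand stack."""
    stack = []
    pc = 0  # index of the next byte to decode
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            stack.append(bytecode[pc])   # one-byte immediate operand
            pc += 1
        elif op == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op} at offset {pc - 1}")


# Encodes 6 * 7 + 1.
program = bytes([PUSH, 6, PUSH, 7, MUL, PUSH, 1, ADD, HALT])
result = run(program)
```

The same `program` bytes would run unchanged wherever `run` is available, which is the portability property the paragraph attributes to bytecode interpreters. A JIT compiler would replace the decode loop with generated native code for the same byte sequence.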
Because of its performance advantage, today many language implementations execute a program in two phases: first compiling the source code into bytecode, then passing the bytecode to the virtual machine. There are bytecode-based virtual machines of this sort for Java, Raku, Python, PHP,[nb 1] Tcl, mawk and Forth (however, Forth is seldom compiled via bytecode in this way, and its virtual machine is more generic). The implementations of Perl and Ruby 1.8 instead work by walking an abstract syntax tree representation derived from the source code.
More recently, the authors of V8[4] and Dart[5] have challenged the notion that intermediate bytecode is needed for a fast and efficient VM implementation. When this claim was made, both implementations compiled source code directly to machine code with a JIT, with no bytecode intermediary.[6]
Java bytecode: a detailed technical explanation
