PostgreSQL: Creating Functions
One of the most powerful features of PostgreSQL is its support for user-defined functions written in various programming languages, including pure SQL, C, Perl, Python, and PHP. Perhaps the most common programming language for PostgreSQL functions, however, is PL/pgSQL (don't ask me to pronounce it), because it comes with PostgreSQL and is easy to set up.
Installing PL/pgSQL
To get started with PL/pgSQL, first make sure it's installed in your PostgreSQL database. If it was a part of the template1 database when your database was created, it will already be installed. To see whether you have it, run the following in the psql client:
SELECT true FROM pg_catalog.pg_language WHERE lanname = 'plpgsql';
If the result row has the value true, PL/pgSQL is already installed in your database. If not, quit psql and execute the command:
$ createlang plpgsql database_name
To add a language, you must have superuser access to the database. If you've just installed PostgreSQL, then you can likely use the default postgres user by passing -U postgres to createlang. From this point, you should be able to follow along by pasting the sample functions into psql.
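On newer PostgreSQL releases the createlang program may not be available at all. A minimal sketch of doing the equivalent from inside psql, assuming PostgreSQL 9.1 or later (where procedural languages are packaged as extensions) and a role with sufficient privileges:

```sql
-- Assumes PostgreSQL 9.1 or later. If PL/pgSQL is already installed,
-- this only raises a notice and changes nothing.
CREATE EXTENSION IF NOT EXISTS plpgsql;

-- Confirm that the language is registered in the current database.
SELECT lanname
  FROM pg_catalog.pg_language
 WHERE lanname = 'plpgsql';
```

If the second statement returns one row, the sample functions in this article should load without further setup.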
A First Function
To write your first PL/pgSQL function, start with something simple: a function to return the Fibonacci number for a position in the Fibonacci sequence. I know, I know; everyone uses a Fibonacci calculator to demonstrate code. Why can't I be original? Because a couple iterations of such a function will show off some of the more useful features of PL/pgSQL. It's purely pedagogical. A simple implementation is:
 1 CREATE OR REPLACE FUNCTION fib (
 2     fib_for integer
 3 ) RETURNS integer AS $$
 4 BEGIN
 5     IF fib_for < 2 THEN
 6         RETURN fib_for;
 7     END IF;
 8     RETURN fib(fib_for - 2) + fib(fib_for - 1);
 9 END;
10 $$ LANGUAGE plpgsql;
Using the function is easy:
try=% select fib(8);
fib
-----
21
(1 row)
The first line uses PostgreSQL's CREATE OR REPLACE FUNCTION statement to create the function. The name of the function is fib. The CREATE OR REPLACE FUNCTION statement is more useful in practice than the simple CREATE FUNCTION statement, because it replaces an existing function with the same name and argument signature instead of failing with an error. This is very convenient while you're developing and testing a new function.
The second line declares the integer variable fib_for as the sole argument to the function, and thus constitutes its entire argument signature. The argument signature must come after the name of the function, inside parentheses. In this respect, it's not much different from function or method declarations in most programming languages. Arguments can be of any type supported by PostgreSQL, including user-created types and domains, as well as composite data types such as table row types. This article's examples will use only simple data types, but see the PL/pgSQL Declarations documentation for details.
Note that named arguments were added to PL/pgSQL in PostgreSQL 8.0. In earlier versions of PostgreSQL, you must either use the default, numbered variable names for the arguments, or declare aliases in a DECLARE block:
CREATE OR REPLACE FUNCTION fib ( integer ) RETURNS integer AS $$
DECLARE
fib_for ALIAS FOR $1;
BEGIN
-- ...
Unless you have an older version of PostgreSQL, use named arguments. They're more convenient.
The third line closes the argument signature and specifies the function return value (integer). As with arguments, the return value of a function can be any PostgreSQL data type, including a composite type or even a cursor. The end of line three has the odd string $$. This is PostgreSQL dollar-quoting. When you use it in place of the usual single-quotation mark quoting ('), you don't have to escape single quotation marks within the body of the function. This makes function bodies much easier to read.
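To see the difference, the sketch below defines the same trivial function twice, once with single-quote quoting and once with dollar quoting. Both function names are hypothetical and exist only for this comparison. Dollar quoting removes the extra layer of escaping for the function body, but an apostrophe inside a string literal still has to be doubled once, as in any SQL string.

```sql
-- Hypothetical function, single-quoted body: every quotation mark inside
-- the body must be doubled, so the apostrophe in the string is written
-- four times.
CREATE OR REPLACE FUNCTION quote_demo_single() RETURNS text AS '
BEGIN
    RETURN ''It''''s a quoted body'';
END;
' LANGUAGE plpgsql;

-- Hypothetical function, dollar-quoted body: the string literal is
-- written exactly as it would be in ordinary SQL.
CREATE OR REPLACE FUNCTION quote_demo_dollar() RETURNS text AS $$
BEGIN
    RETURN 'It''s a quoted body';
END;
$$ LANGUAGE plpgsql;
```

Both functions return the same text; only the readability of the source differs.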
Line four's BEGIN statement marks the start of the function body, while lines 5-8 are the function body, implementing the standard recursive algorithm for determining a Fibonacci number. Lines 5-7 use the PL/pgSQL IF-THEN conditional statement to return the sequence number itself if it is less than two. As with all blocks in PL/pgSQL, the IF-THEN conditional ends with a final END statement. Conditional expressions in PL/pgSQL can be any SQL expression that you might use in the WHERE clause of a typical SELECT statement. The nice thing here, however, is that you can use a variable (in this case, fib_for) in the expression.
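As a small illustration of conditionals built from ordinary SQL expressions, here is a sketch that extends the IF-THEN form with ELSIF and ELSE branches. The function fib_position_kind() is a hypothetical helper, not part of the article's examples.

```sql
-- Hypothetical helper: classifies a position the way fib() treats it.
CREATE OR REPLACE FUNCTION fib_position_kind(fib_for integer) RETURNS text AS $$
BEGIN
    -- Any expression that is valid in a WHERE clause works as a condition.
    IF fib_for IS NULL OR fib_for < 0 THEN
        RETURN 'invalid';
    ELSIF fib_for < 2 THEN
        RETURN 'base case';
    ELSE
        RETURN 'recursive case';
    END IF;
END;
$$ LANGUAGE plpgsql;
```

Calling fib_position_kind(1) returns 'base case', while fib_position_kind(8) returns 'recursive case'.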
Line eight of fib() demonstrates the ability of PL/pgSQL to not only execute other PL/pgSQL functions, but to do so recursively. In this case, the fib() function calls itself twice in order to properly determine and return the Fibonacci number. Note that you can use the PL/pgSQL RETURN keyword anywhere in a PL/pgSQL function to terminate the execution of the function and return a value.
Line nine's END statement signals the end of the function body, while line ten closes the dollar quoting and identifies the function implementation language.
A Note on Statement Termination
At first glance, the placement of semicolons to terminate statements in the example function might appear to be somewhat ad hoc. I assure you that it is not. In PL/pgSQL, all blocks must terminate in a semicolon, as must all statements within that block. The expression that initiates the block, however, such as BEGIN on line four or IF fib_for < 2 THEN on line five, does not end with a semicolon. Line six, as a complete statement within the IF ... THEN block, ends with a semicolon, as does the statement on line eight.
Perhaps the simplest way to remember this rule is to think of statements as requiring semicolons, and block initiation expressions as not being complete statements. That is, blocks only become complete statements when they END.
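The rule is easier to see when each line is annotated. The following sketch uses a hypothetical function, semicolon_demo(), whose only purpose is to carry the comments.

```sql
-- Hypothetical function used only to annotate where semicolons belong.
CREATE OR REPLACE FUNCTION semicolon_demo(n integer) RETURNS integer AS $$
DECLARE                    -- opens the declaration section: no semicolon
    doubled integer;       -- a declaration is a complete statement: semicolon
BEGIN                      -- opens the block: no semicolon
    IF n < 0 THEN          -- opens the IF block: no semicolon
        RETURN 0;          -- complete statement: semicolon
    END IF;                -- closes the IF block: semicolon
    doubled := n * 2;      -- complete statement: semicolon
    RETURN doubled;        -- complete statement: semicolon
END;                       -- closes the outer block: semicolon
$$ LANGUAGE plpgsql;       -- ends the enclosing CREATE FUNCTION statement
```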
Accessing Data
As a recursive function, the fib() example is none too fast. Of course it short-circuits when the argument is less than two, but otherwise its recursion can be quite slow. Why? Because each time you call it, it must calculate and return many of the same numbers. On my PowerBook, for example, it takes nearly 3.8 seconds to find the 26th Fibonacci number:
try=% explain analyze select fib(26);
QUERY PLAN
------------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0) (actual time=3772.062..3772.063 rows=1 loops=1)
Total runtime: 3772.315 ms
(2 rows)
Why does it take so long? Like any recursive Fibonacci function, it has to make 392,835 recursive calls to itself to calculate the 26th Fibonacci number. Those recursive calls add up! Because the function always returns the same values for the same arguments, it would really improve the performance of the function to memoize it. Memoization caches the results of a function call for a given set of arguments so that when the function is called again with the same arguments, it can simply return the value from the cache without recalculating it--in this case, minimizing the need for recursion.
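The call count quoted above can be checked without running the slow function. A naive recursive fib() needs one call for any position below two, and otherwise one call for itself plus all the calls made for the two positions before it. The sketch below evaluates that recurrence with a loop. The function fib_call_count() is a hypothetical helper, and it counts the initial call as well as the recursive ones.

```sql
-- Hypothetical helper: the number of fib() invocations needed to
-- evaluate the naive recursive fib(fib_for), including the initial call.
CREATE OR REPLACE FUNCTION fib_call_count(fib_for integer) RETURNS bigint AS $$
DECLARE
    prev bigint := 1;  -- calls needed for position 0
    curr bigint := 1;  -- calls needed for position 1
    tmp  bigint;
BEGIN
    -- For positions below 2 the loop body never runs and 1 is returned.
    FOR i IN 2..fib_for LOOP
        tmp  := curr;
        curr := 1 + curr + prev;  -- this call plus both recursive subtrees
        prev := tmp;
    END LOOP;
    RETURN curr;
END;
$$ LANGUAGE plpgsql;

SELECT fib_call_count(26);  -- returns 392835
```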
PL/pgSQL itself has no ability to store data outside of a function, but this is database programming--take advantage of it! The trick is to create a table to function as the cache, then access it from the function. The new function, fib_cached(), does exactly that:
 1 CREATE TABLE fib_cache (
 2     num integer PRIMARY KEY,
 3     fib integer NOT NULL
 4 );
 5
 6 CREATE OR REPLACE FUNCTION fib_cached(
 7     fib_for integer
 8 ) RETURNS integer AS $$
 9 DECLARE
10     ret integer;
11 BEGIN
12     IF fib_for < 2 THEN
13         RETURN fib_for;
14     END IF;
15
16     SELECT INTO ret fib
17       FROM fib_cache
18      WHERE num = fib_for;
19
20     IF ret IS NULL THEN
21         ret := fib_cached(fib_for - 2) + fib_cached(fib_for - 1);
22         INSERT INTO fib_cache (num, fib)
23         VALUES (fib_for, ret);
24     END IF;
25     RETURN ret;
26
27 END;
28 $$ LANGUAGE plpgsql;
Lines 1-4 create the table for caching the Fibonacci sequence. The num column represents the sequence number for which the corresponding Fibonacci number is stored in the fib column. The num column is a primary key because it should be unique.
The fib_cached() function defined from lines 6-28 introduces quite a bit more syntax. The first line with something new is line nine's DECLARE statement. As you may have ascertained from the previous discussion of argument aliases, this statement introduces a block for declaring variables for use in the function body. All variables used in the function but not declared in the argument signature must be declared here. You can give them a preliminary assignment using the PL/pgSQL assignment operator, := (so named to avoid conflicts with the SQL = comparison operator). You can use any PostgreSQL data type for your variables, but this example again keeps things quite simple. There is a single integer variable, ret, which keeps track of a value for the function to return.
The BEGIN statement on line 11 ends the variable declaration block and starts the function body. Line 12 contains the familiar IF-THEN block that once again short-circuits the function if the argument to the function (stored in fib_for) is less than two. Then things get more interesting.
As shown in the DECLARE block, you can assign a value to a PL/pgSQL variable using :=, but what if you want to assign a value from a SELECT statement to a variable? Lines 16-18 demonstrate the approach. A variation of the standard SELECT INTO statement allows you to select values into one or more PL/pgSQL variables rather than into a table. A comma-delimited list of variables follows the INTO keyword, while the rest of the statement remains a normal SELECT statement. There are a couple of caveats to SELECT INTO assignment, however: the SELECT statement must return no more than one row, and the selected columns must match the number and types of the variables.
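A sketch of the multi-variable form may help. It selects two columns into two variables and uses PL/pgSQL's special FOUND variable to tell whether a row came back. The function fib_cache_lookup() is a hypothetical helper, and the table definition repeats the article's fib_cache table so that the example runs on its own.

```sql
-- Same definition as the article's cache table; skipped if it already exists.
CREATE TABLE IF NOT EXISTS fib_cache (
    num integer PRIMARY KEY,
    fib integer NOT NULL
);

-- Hypothetical helper: reports what the cache holds for one position.
CREATE OR REPLACE FUNCTION fib_cache_lookup(fib_for integer) RETURNS text AS $$
DECLARE
    cached_num integer;
    cached_fib integer;
BEGIN
    -- Two selected columns, two target variables, in the same order.
    SELECT num, fib INTO cached_num, cached_fib
      FROM fib_cache
     WHERE num = fib_for;

    -- FOUND is false when the SELECT returned no row.
    IF NOT FOUND THEN
        RETURN 'miss';
    END IF;
    RETURN cached_num::text || ' -> ' || cached_fib::text;
END;
$$ LANGUAGE plpgsql;
```

Because num is the primary key, the query can never match more than one row, which satisfies the first caveat.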
In fib_cached(), it's relatively straightforward. The code looks in its cache (the fib_cache table) to see if it has already calculated and cached the Fibonacci number for the sequence number fib_for. The SELECT statement selects the fib column from fib_cache where the number is fib_for and stores the result in the ret variable.
Now, I mentioned that the SELECT INTO statement can return no more than one row, which also means that it can return zero rows. If this is the case, then the value of ret will be NULL in this function. Accordingly, line 20 begins an IF-THEN block by checking to see if the value of ret is indeed NULL. If it is, the function needs to calculate that Fibonacci number. Line 21 thus recursively calls fib_cached() just as fib() used recursion. Instead of returning the value, the code uses the := assignment operator to store the value in ret.
With the new value calculated, the code needs to insert it into the cache so that it never has to be calculated again. Lines 22-23 do just that with a standard SQL INSERT statement. The variables fib_for and ret can be used right in the INSERT statement, just as fib_for was used in the SELECT statement at lines 16-18. One of the great features of PL/pgSQL is that it precompiles SQL statements embedded in it, using variables as if they were passed to a prepared SQL statement as arguments for placeholders. In other words, the INSERT statement magically becomes akin to:
PREPARE some_insert(integer, integer) AS
INSERT INTO fib_cache (num, fib)
VALUES ($1, $2);
EXECUTE some_insert(fib_for, ret);
The great thing about this feature of PL/pgSQL is that it makes embedded SQL statements extremely fast. The database can reuse the precompiled query plan for each call to the function (on a per-connection basis) without recompiling and planning.
At any rate, line 25 returns the value of ret, regardless of where it came from. Adding the caching support makes the function far faster:
try=% explain analyze select fib_cached(26);
QUERY PLAN
--------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0) (actual time=50.837..50.838 rows=1 loops=1)
Total runtime: 50.889 ms
(2 rows)
try=% explain analyze select fib_cached(26);
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0) (actual time=2.197..2.198 rows=1 loops=1)
Total runtime: 2.249 ms
(2 rows)
The first call to fib_cached() finds no cached values, and so it must create them all as it goes along. This simply means that it needs to calculate the values for each number up to 26 once, practically eliminating that recursion (indeed, fib_cached() computes a value only once for each of the 25 numbers from 2 through 26). The result is a much faster query: only 0.05 seconds, as opposed to the nearly 3.8 seconds for fib(). Of course the second call to fib_cached() needs only to look up and return the 26th Fibonacci number, because the first call has already cached it. That cuts the time down to about 0.002 seconds. Not bad, eh?
Loop Constructs
Of course, memoization is not necessarily the best way to speed up a function. Some languages provide memoization support natively or via easily added libraries, but PL/pgSQL offers no such facility. This means adding a fair bit of code to memoize the function. You have something fast, but also something slightly more difficult to maintain.
There's another approach to determining the Fibonacci number for a particular position in the sequence that involves neither recursion nor memoization. Execute a loop fib_for number of times and keep track of the calculation each time through. How does that look?
 1 CREATE OR REPLACE FUNCTION fib_fast(
 2     fib_for integer
 3 ) RETURNS integer AS $$
 4 DECLARE
 5     ret integer := 0;
 6     nxt integer := 1;
 7     tmp integer;
 8 BEGIN
 9     FOR num IN 1..fib_for LOOP
10         tmp := ret;
11         ret := nxt;
12         nxt := tmp + nxt;
13     END LOOP;
14
15     RETURN ret;
16 END;
17 $$ LANGUAGE plpgsql;
Everything should look familiar up through line eight, but notice the declaration of multiple variables in the DECLARE block and the initial values assigned to two of them. The new variables, nxt and tmp, will track the Fibonacci numbers through each iteration of the loop.
Speaking of the loop, it begins on line nine. All loops in PL/pgSQL use the LOOP keyword and end with END LOOP. A loop can begin with nothing more than LOOP, in which case it will be an infinite loop (break out of it with the EXIT keyword). Otherwise, there are a couple of different approaches to looping in PL/pgSQL, including WHILE (such as WHILE foo IS NULL LOOP) and FOR.
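For comparison, here is a sketch of the same calculation written with the other two loop forms. The functions fib_while() and fib_loop() are hypothetical variants, not part of the original article, and both need an explicitly declared counter.

```sql
-- Hypothetical variant using WHILE: the condition is tested before
-- each iteration.
CREATE OR REPLACE FUNCTION fib_while(fib_for integer) RETURNS integer AS $$
DECLARE
    ret integer := 0;
    nxt integer := 1;
    tmp integer;
    i   integer := 0;
BEGIN
    WHILE i < fib_for LOOP
        tmp := ret;
        ret := nxt;
        nxt := tmp + nxt;
        i   := i + 1;
    END LOOP;
    RETURN ret;
END;
$$ LANGUAGE plpgsql;

-- Hypothetical variant using a bare LOOP: it runs until EXIT is reached.
CREATE OR REPLACE FUNCTION fib_loop(fib_for integer) RETURNS integer AS $$
DECLARE
    ret integer := 0;
    nxt integer := 1;
    tmp integer;
    i   integer := 0;
BEGIN
    LOOP
        EXIT WHEN i >= fib_for;  -- without this the loop would never end
        tmp := ret;
        ret := nxt;
        nxt := tmp + nxt;
        i   := i + 1;
    END LOOP;
    RETURN ret;
END;
$$ LANGUAGE plpgsql;

SELECT fib_while(8), fib_loop(8);  -- both return 21
```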
A FOR loop is the only context in PL/pgSQL where you can use a variable without predeclaring it in the DECLARE block. The form used here (there are other forms--for iterating over rows in a SELECT query, for example), uses the variable num, which is automatically created as a read-only integer variable that exists only in the scope of the loop, to loop fib_for times. This example doesn't actually use num in the loop, but I thought you should know that it could.
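Here is a sketch of both FOR forms with the loop variable actually in use. The functions sum_to() and sum_even_to() are hypothetical examples. Note that the query form does need a variable declared in the DECLARE block (a record here); only the integer form creates its loop variable automatically.

```sql
-- Hypothetical example of the integer form: num is created by the loop
-- and can be read, but not assigned to, inside it.
CREATE OR REPLACE FUNCTION sum_to(upto integer) RETURNS integer AS $$
DECLARE
    total integer := 0;
BEGIN
    FOR num IN 1..upto LOOP
        total := total + num;
    END LOOP;
    RETURN total;
END;
$$ LANGUAGE plpgsql;

-- Hypothetical example of the query form: rec receives each result row.
CREATE OR REPLACE FUNCTION sum_even_to(upto integer) RETURNS integer AS $$
DECLARE
    total integer := 0;
    rec   record;  -- must be declared for a query loop
BEGIN
    FOR rec IN SELECT g.n FROM generate_series(1, upto) AS g(n)
                WHERE g.n % 2 = 0
    LOOP
        total := total + rec.n;
    END LOOP;
    RETURN total;
END;
$$ LANGUAGE plpgsql;

SELECT sum_to(10), sum_even_to(10);  -- returns 55 and 30
```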
The only thing that happens inside the fib_fast() loop is assignment. The ret variable once again keeps track of the return value, while tmp and nxt track the previous and next values in the sequence. That's it. Once the loop has run through all of its iterations, the function returns the last value stored in ret.
How does the use of the loop affect performance?
try=% explain analyze select fib_fast(26);
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.433..0.433 rows=1 loops=1)
Total runtime: 0.485 ms
(2 rows)
It's over four times faster than the cached version, mainly because there are no longer any queries to an external table. It'll be faster with lower numbers and slower with higher numbers because the fib_for argument determines the number of iterations required. (Any number over 45 won't work at all because the values the loop computes will be too big for PostgreSQL integers. If you need them that big, use bigints instead.)
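A minimal sketch of that bigint variant follows. The name fib_fast_big() is hypothetical, and the only change from fib_fast() is the type of the return value and of the three working variables. It still overflows eventually, just much later.

```sql
-- Hypothetical bigint variant of fib_fast(); the argument stays an integer
-- because it is only a position, not a Fibonacci number.
CREATE OR REPLACE FUNCTION fib_fast_big(fib_for integer) RETURNS bigint AS $$
DECLARE
    ret bigint := 0;
    nxt bigint := 1;
    tmp bigint;
BEGIN
    FOR num IN 1..fib_for LOOP
        tmp := ret;
        ret := nxt;
        nxt := tmp + nxt;
    END LOOP;
    RETURN ret;
END;
$$ LANGUAGE plpgsql;

SELECT fib_fast_big(50);  -- returns 12586269025, too large for integer
```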
Practical PL/pgSQL
Of course these functions are not very practical (except as teaching examples), unless for some reason you need to calculate Fibonacci numbers in your database. The real advantages of PL/pgSQL become apparent when you use it to simplify situations where client code must typically make numerous database calls to satisfy a data pattern. In the interests of illustrating such practical PL/pgSQL, my next article will demonstrate how to write PL/pgSQL functions to simplify the management of ordered many-to-many relationships.
Acknowledgments
My thanks to David Fetter for suggesting the memoized version of fib() as an illustrative example for this article, and to Mark Jason Dominus and his terrific Higher Order Perl for an excellent discussion and examples of the Fibonacci sequence functions. I'd also like to thank AndrewSN for providing feedback on a draft of this article.
David Wheeler is a developer at Portland, Oregon-based Values of n, where he writes the code that makes Stikkit's little yellow notes think.