GeSHi Documentation
Version 1.0.8.11
- Authors:
- © 2004 - 2007 Nigel McNie
- © 2007 - 2012 Benny Baumann
- © 2008 - 2009 Milian Wolff
- GeSHi Website:
- http://qbnz.com/highlighter
This is the documentation for GeSHi - Generic Syntax Highlighter.
The most recent version of this document is available on the web - go to http://qbnz.com/highlighter/documentation.php to view it.
Any comments, questions, confusing points? Please get in contact with the developers! We need all the information we can get to make the use of GeSHi and everything related to it (including this documentation) a breeze.
Contents
- 1 Introduction
- 2 The Basics
- 3 Advanced Features
- 3.1 The Code Container
- 3.2 Line Numbers
- 3.3 Using CSS Classes
- 3.4 Changing Styles
- 3.5 Case Sensitivity and Auto Casing
- 3.6 Changing the Source, Language, Config Options
- 3.7 Error Handling
- 3.8 Disabling styling of some Lexics
- 3.9 Setting the Tab Width
- 3.10 Using Strict Mode
- 3.11 Adding/Removing Keywords
- 3.12 Headers and Footers for Your Code
- 3.13 Keyword URLs
- 3.14 Using Contextual Importance
- 3.15 Highlighting Special Lines “Extra”
- 3.16 Adding IDs to Each Line
- 3.17 Getting the Time of Styling
- 4 Language Files
- 4.1 An Example Language File
- 4.2 Language File Conventions
- 4.3 Language File Sections
- 4.3.1 The Header
- 4.3.2 The First Indices
- 4.3.3 Keywords
- 4.3.4 Symbols and Case Sensitivity
- 4.3.5 Styles for your Language File
- 4.3.6 URLs for Functions
- 4.3.7 Number Highlighting Support
- 4.3.8 Object Orientation Support
- 4.3.9 Using Regular Expressions
- 4.3.10 Contextual Highlighting and Strict Mode
- 4.3.11 Special Parser Settings (Experimental)
- 4.3.12 Tidying Up
- 4.4 Validating your language file
- 5 Method/Constant Reference
1 Introduction
GeSHi is exactly what the acronym stands for: a Generic Syntax Highlighter. As long as you have a language file for almost any computer language - whether it be a scripting language, object orientated, markup or anything in between - GeSHi can highlight it! GeSHi is extremely customisable - the same source can be highlighted multiple times in multiple ways - the same source even with a different language. GeSHi outputs XHTML strict compliant code, and can make use of CSS to save on the amount of output. And what is the cost for all of this? You need PHP. That’s all!
1.1 Features
Here are some of the standout features of GeSHi:
- Programmed in PHP:
- GeSHi is coded entirely in PHP. This means that wherever you have PHP, you can have GeSHi! Almost any free webhost supports PHP, and GeSHi works fine with PHP > 4.3.0.
- Support for many languages:
- GeSHi comes with more than 100 languages, including PHP, HTML, CSS, Java, C, Lisp, XML, Perl, Python, ASM and many more!
- XHTML compliant output:
- GeSHi produces XHTML compliant output, using stylesheets, so you need not worry about GeSHi ruining your claims to perfection in the standards department ;)
- Highly customisable:
- GeSHi allows you to change the style of the output on the fly, use CSS classes or not, use an external stylesheet or not, use line numbering, change the case of output keywords… the list goes on and on!
- Flexible:
- Unfortunately, GeSHi is quite load/time intensive for large blocks of code. However, you want speed? Turn off any features you don’t like, pre-make a stylesheet and use CSS classes to reduce the amount of output and more - it’s easy to strike a balance that suits you.
This is just a taste of what you get with GeSHi - the best syntax highlighter for the web in the world!
1.2 About GeSHi
GeSHi started as a mod for the phpBB forum system, to enable highlighting of more languages than were available (which can be roughly estimated to exactly 0 ;)). However, it quickly spawned into an entire project on its own. Now that it has been released, work continues on a mod for phpBB3 - and hopefully for many forum systems, blogs and other web-based systems.
Several systems are using GeSHi now, including:
- Dokuwiki - An advanced wiki engine
- gtk.php.net - Their manual uses GeSHi for syntax highlighting
- WordPress - A powerful blogging system
- PHP-Fusion - A constantly evolving CMS
- SQL Manager - A Postgres DBAL
- Mambo - A popular open source CMS
- MediaWiki - A leader in Wikis[^plugin-only]
- TikiWiki - A megapowerful Wiki/CMS
- TikiPro - Another powerful Wiki based on TikiWiki
- WikkaWiki - A flexible and lightweight Wiki engine
- RWeb - A site-building tool
GeSHi is the original work of Nigel McNie. The project was later handed over to Benny Baumann. Others have also helped with aspects of GeSHi; they’re mentioned in the THANKS file.
1.3 Credits
Many people have helped out with GeSHi, whether by creating language files, submitting bug reports, suggesting new ideas or simply pointing out something I’d missed. All of these people have helped to build a better GeSHi; you can see them in the THANKS file.
Do you want your name on this list? Why not make a language file, or submit a valid bug? Or perhaps help me with an added feature I can’t get my head around, or suggest a new feature, or even port GeSHi to another language? There’s lots you can do to help out, and I need it all :)
1.4 Feedback
I need your feedback! ANYthing you have to say is fine, whether it be a query, congratulations, a bug report or complaint, I don’t care! I want to make this software the best it can be, and I need your help! You can contact me in the following ways:
- E-mail: Nigel McNie, Benny Baumann or better yet: use the geshi-users mailinglist
- Forums: Sourceforge.net Forums
- IRC: #geshi on Freenode
Remember, I am grateful for any help :)
2 The Basics
In this section, you’ll learn a bit about GeSHi, how it works and what it uses, how to install it and how to use it to perform basic highlighting.
2.1 Getting GeSHi to work
If you’re reading this and don’t have GeSHi, that’s a problem ;). So, how do you get your hands on it?
2.1.1 Requirements
GeSHi requires the following to be installable:
- PHP. It’s untested with anything below 4.4.X. I hope to extend this range soon. I see no reason why it won’t work with any version of PHP above 4.3.0.
- Approximately 2 megabytes of space. The actual script is small - around 150K - but most of the size comes from the large number of language files (over 100!). If you’re pushed for space, make sure you don’t upload the docs/ or contrib/ directory to your server, and you may want to leave out any language files that don’t take your fancy either.
As you can see, the requirements are very small. If GeSHi does NOT work for you in a particular version of PHP, let me know why and I’ll fix it.
2.1.2 Downloading GeSHi
There are several ways to get a copy of GeSHi. The first and easiest way of all is visiting http://qbnz.com/highlighter/downloads.php to obtain the latest version. This is especially suitable when you plan on using GeSHi on a production website or otherwise need a stable copy for flawless operation.
If you are somewhat more sophisticated or need a feature just recently implemented you might consider getting GeSHi by downloading via SVN. There are multiple ways for doing so and each one has its own advantages and disadvantages. Let’s cover the various locations in the SVN you might download from:
- https://geshi.svn.sourceforge.net/svnroot/geshi/tags/:
  This directory holds all previous releases of GeSHi, each as a subdirectory. By downloading from here you can test your code with various old versions in case something has been broken recently.
- https://geshi.svn.sourceforge.net/svnroot/geshi/branches/RELEASE_1_0_X_STABLE/geshi-1.0.X/src/:
  This directory is the right place for you if you want to have reasonably current versions of GeSHi but need something that is stable. This directory is updated once in a while between releases, whenever there’s something new that is already reasonably stable. This branch is used to form the actual release once the work is done.
- https://geshi.svn.sourceforge.net/svnroot/geshi/trunk/geshi-1.0.X/src/:
  This directory is the working directory where every new feature, patch or improvement is committed to. This directory is updated regularly, but is not guaranteed to be tested and stable at all times. With this version you’ll always get the latest version of GeSHi out there, but beware of bugs! There will be loads of them here! So this is absolutely not recommended for production use!
Once you have chosen the right SVN directory for you, do a quick svn co $SVNPATH geshi where $SVNPATH is one of the above paths, and your desired version of GeSHi will be downloaded into a subdirectory called “geshi”. Once you have a copy of GeSHi you can go on to install it as shown below.
2.1.3 Extracting GeSHi
Packages come in .zip, .tar.gz and .tar.bz2 format, so there’s no complaining about whether it’s available for you. *nix users probably want .tar.gz or .tar.bz2 and Windows users probably want .zip. And those lucky enough to download it directly from SVN don’t even need to bother extracting GeSHi.
To extract GeSHi in Linux (.tar.gz):
- Open a shell
- cd to the directory where the archive lies
- Type tar -xzvf [filename] where [filename] is the name of the archive (typically GeSHi-1.X.X.tar.gz)
- GeSHi will be extracted to its own directory
To extract GeSHi in Windows (.zip):
- Open Explorer
- Navigate to the directory where the archive lies
- Extract the archive. The method you use will depend on your configuration. Some people can right-click upon the archive and select “Extract” from there, others may have to drag the archive and drop it upon an extraction program.
To extract from .zip you’ll need an unzipping program - unzip in Linux, or 7-Zip, WinZip, WinRAR or similar for Windows.
2.1.4 Installing GeSHi
Installing GeSHi is a snap, even for those most new to PHP. There are no tricks involved. Honest!
GeSHi is nothing more than a PHP class with related language support files. Those of you familiar with PHP can then guess how easy the installation will be: simply copy it into your include path somewhere. You can put it wherever you like in this include path. I recommend that you put the language files in a subdirectory of your include path too - perhaps the same subdirectory that geshi.php is in. Remember this path for later.
If you don’t know what an include path is, don’t worry. Simply copy GeSHi to your webserver. So for example, say your site is at http://mysite.com/myfolder, you can copy GeSHi to your site so the directory structure is like this:
http://mysite.com/myfolder/geshi/[language files]
http://mysite.com/myfolder/geshi.php
Or you can put it in any subdirectory you like:
http://mysite.com/myfolder/includes/geshi/[language files]
http://mysite.com/myfolder/includes/geshi.php
When using GeSHi on a live site, the only directory required is the geshi/ subdirectory. Both contrib/ and docs/ are worthless there, and furthermore, as some people discovered, one of the files in contrib had a security hole (fixed as of 1.0.7.3). I suggest you delete these directories from any live site they are on.
2.2 Basic Usage
Use of GeSHi is very easy. A simple example has only three really important lines:
include_once('geshi.php');
This line includes the GeSHi class for use
$geshi = new GeSHi($source, $language);
This line creates a new GeSHi object, holding the source and the language you want to use for highlighting.
echo $geshi->parse_code();
This line spits out the result :)
So as you can see, simple usage of GeSHi is really easy. Just create a new GeSHi object and get the code!
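Put together, the three lines described above look roughly like the following sketch. It assumes geshi.php is reachable through your include path; the sample $source string and the choice of 'php' as the language are placeholders picked for illustration only.

```php
<?php
// Include the GeSHi class (assumes geshi.php is on the include path).
include_once 'geshi.php';

// Illustrative input: any source string and any installed language name will do.
$source   = '<?php echo "Hello, world!"; ?>';
$language = 'php';

// Create the highlighter object for this source and language.
$geshi = new GeSHi($source, $language);

// parse_code() returns the highlighted markup as a string.
$html = $geshi->parse_code();
echo $html;
```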
Since version 1.0.2, there is a function included with GeSHi called geshi_highlight(). This behaves exactly as the PHP function highlight_string() behaves - all you do is pass it the language you want to use to highlight and the path to the language files as well as the source. Here are some examples:
// Simply echo the highlighted code
geshi_highlight($source, 'php', $path);
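If you would rather keep the markup for later than echo it straight away, a sketch along these lines may help. It relies on an optional fourth boolean argument to geshi_highlight() that, in the 1.0.x sources I know of, makes the function return the markup; treat that argument as an assumption and check it against your copy of geshi.php. The Perl snippet is just sample input.

```php
<?php
include_once 'geshi.php';

// Illustrative source; null for the path means "use the default language directory".
$source = 'print "Hello";';

// Fourth argument true (assumed, see above): return the markup instead of echoing it.
$code = geshi_highlight($source, 'perl', null, true);

// The markup can now be placed wherever the page needs it.
echo '<p>Highlighted later:</p>' . $code;
```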
However, these are really simple examples and don’t even begin to cover all the advanced features of GeSHi. If you want to learn more, continue on to section 3: Advanced Features.
3 Advanced Features
This section documents the advanced features of GeSHi - strict mode, using CSS classes, changing styles on the fly, disabling highlighting of some things and more.
In this section there are many code snippets. For all of these, you should assume that the GeSHi library has been included, and a GeSHi object has been created and is referenced by the variable $geshi. Normally, the source, language and path used are arbitrary.
3.1 The Code Container
The Code Container has a fundamental effect on the layout of your code before you even begin to style. What is the Code Container? It’s the bit of markup that goes around your code to contain it. By default your code is surrounded by a <pre>, but you can also specify a <div>.
The <pre> header is the default. If you’re familiar with HTML you’ll know that whitespace is rendered “as is” by a <pre> element. The advantage for you is that if you use <pre> the whitespace you use will appear pretty much exactly how it is in the source, and what’s more GeSHi won’t have to add a whole lot of <br />’s and non-breaking spaces (&nbsp;) to your code to indent it. This saves you source code (and saves your valuable visitors waiting time and you bandwidth).
But if you don’t like <pre> or it looks stupid in your browser no matter what styles you try to apply to it or something similar, you might want to use a <div> instead. A <div> will result in more source - GeSHi will have to insert whitespace markup - but in return you can wrap long lines of code that would otherwise have your browser’s horizontal scrollbar appear. Of course with <div> you can also choose not to wrap lines if you please. The highlighter demo at the GeSHi home page uses the <div> approach for this reason.
At this stage there isn’t an option to wrap the code in <code> tags (unless you use the function geshi_highlight()), partly because of the inconsistent and unexpected ways stuff in <code> tags is highlighted. Besides, <code> is an inline element. But this may become an option in future versions.
As of GeSHi 1.0.7.2 there is a new header type, that specifies that the code should not be wrapped in anything at all.
Another requested addition was made in GeSHi 1.0.7.20: GeSHi can be forced to create a block around the highlighted source even when this isn’t strictly necessary, so that styles applied to the output of GeSHi affect only the code, even if headers and footers are present.
To change/set the header to use, you call the set_header_type() method. It has one required argument which defines the container type. Available are:
- $geshi->set_header_type(GESHI_HEADER_DIV);
- Puts a <div> around both code and line numbers. Whitespace is converted to " &nbsp;" sequences (i.e. one whitespace and the HTML entity of a non-breaking whitespace) to keep your indentation level intact. Tabs are converted as well and you can manually define the tab width. Lines are automatically wrapped. Line numbers are created using an ordered list.
- $geshi->set_header_type(GESHI_HEADER_PRE);
- Wraps code and line numbers in a <pre> container. This way whitespace is kept as-is and thus this header produces less overhead than the GESHI_HEADER_DIV header type. Since line numbers are still created using an ordered list this header type produces invalid HTML.
- $geshi->set_header_type(GESHI_HEADER_PRE_VALID);
- Available since 1.0.8
- When line numbers are disabled, this behaves just like GESHI_HEADER_PRE. In the other case though, a <div> is used to wrap the code and line numbers and the <pre> is put inside the list items (<li>). This means slightly larger HTML output compared to GESHI_HEADER_PRE, but the output is valid HTML.
- $geshi->set_header_type(GESHI_HEADER_PRE_TABLE);
- Available since 1.0.8
- Once again a <div> tag wraps the output. This time though no ordered list is used to create the line numbers; instead we use a table with two cells in a single row. The left cell contains a <pre> tag which holds all line numbers. The second cell holds the highlighted code, also wrapped in a <pre> tag, just like with GESHI_HEADER_PRE.
- This produces valid HTML and works around the nasty selection behaviour of Firefox and other Gecko based browsers, see SF#1651996 for more information.
- $geshi->set_header_type(GESHI_HEADER_NONE);
- Available since 1.0.7.2
- No wrapper is added.
Those are the only arguments you should pass to set_header_type(). Passing anything else may cause inconsistencies in what is used as the Code Container (although it should simply use a <pre>). Better not to risk it.
GESHI_HEADER_DIV, GESHI_HEADER_PRE, etc. are constants, so don’t put them in strings!
The default styles for the <pre> and <div> will be different, especially if you use line numbers!
I have found that a <pre> results in code that is rendered smaller than in a <div>. You should rectify this difference by using set_overall_style() if you need to. But be aware of this difference if you are changing the header type!
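To compare containers side by side, something like the sketch below can be used. It assumes geshi.php is on the include path; the C snippet and the $output array are made up for the example. Note that the header types are passed as bare constants, never as quoted strings.

```php
<?php
include_once 'geshi.php';

// Illustrative source with a tab, so the whitespace handling is visible.
$source = "if (x) {\n\ty();\n}";

$output = array();
foreach (array('div' => GESHI_HEADER_DIV, 'pre' => GESHI_HEADER_PRE) as $name => $type) {
    // A fresh object per container keeps the two runs independent.
    $geshi = new GeSHi($source, 'c');
    $geshi->set_header_type($type);   // constant, not the string 'GESHI_HEADER_DIV'
    $output[$name] = $geshi->parse_code();
}

echo $output['div'], $output['pre'];
```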
3.2 Line Numbers
GeSHi has the ability to add line numbers to your code (see the demo available at http://qbnz.com/highlighter/demo.php to see what can be achieved). Line numbers are a great way to make your code look professional, especially if you use the fancy line numbers feature.
There are multiple methods for highlighting line numbers, but none of them is perfect. GeSHi itself implements two different approaches, and the way it generates the code also allows you to do the line numbers yourself if necessary - but more on this case later.
The easiest approach is using the <ol> tag for generating the line numbers, but even though this is the easiest one it has a big drawback in Gecko-based browsers like Firefox, and in Konqueror. In these browsers this approach will select the line numbers along with the code or will include extra markup in the selection.
The other approach has been implemented in the 1.0.8 release of GeSHi with the GESHI_HEADER_PRE_TABLE header type. When using this header type the line numbers are rendered apart from the source in a table cell while the actual source is formatted as if the GESHI_HEADER_PRE header had been used. This approach works with Firefox and other Gecko-based browsers so far, although extreme care has to be taken when applying styles to your source, as Windows has some fonts where bold text is of different height than normal or italic text of the same font face.
3.2.1 Enabling Line Numbers
To highlight a source with line numbers, you call the enable_line_numbers() method:
$geshi->enable_line_numbers($flag);
Where $flag is one of the following:
- GESHI_NORMAL_LINE_NUMBERS - Use normal line numbering
- GESHI_FANCY_LINE_NUMBERS - Use fancy line numbering
- GESHI_NO_LINE_NUMBERS - Disable line numbers (default)
Normal line numbers means you specify a style for them, and that style gets applied to all of them. Fancy line numbers means that you can specify a different style for each nth line number. You can change the value of n (default 5) by passing it as a second argument:
$geshi->enable_line_numbers(GESHI_FANCY_LINE_NUMBERS, 37);
The second parameter is not used in any other mode. Setting it to 0 is the same as simply using normal line numbers. Setting it to 1 applies the fancy style to every line number.
The values above are CONSTANTS - so don’t put them in strings!
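As a small illustration of the flags above, the following sketch turns on fancy line numbers for every third line. It assumes geshi.php is on the include path; the twelve-line source is generated only so there is something to number.

```php
<?php
include_once 'geshi.php';

// Twelve illustrative lines of PHP.
$source = implode("\n", array_fill(0, 12, 'echo "line";'));
$geshi  = new GeSHi($source, 'php');

// Constants, not strings: every 3rd line number gets the fancy style.
$geshi->enable_line_numbers(GESHI_FANCY_LINE_NUMBERS, 3);

$html = $geshi->parse_code();
echo $html;
```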
3.2.2 Styling Line Numbers
As of GeSHi 1.0.2, line numbers are added by the use of ordered lists. This solves the old issues of line number styles inheriting from styles meant for the code. Also, this solves an important issue about selecting code. For example, line numbers look nice, but when you go to select the code in your browser to copy it? You got the line numbers too! Not such a good thing, but thankfully this issue is now solved. What is the price? Unfortunately the whole way that styles are inherited/used has changed for those of you who were familiar with 1.0.1, and there is quite a bit more HTML involved. So think carefully about these things before you enable line numbers.
Now, onto how to style line numbers:
Styles are set for line numbers using the set_line_style() method:
$geshi->set_line_style('background: #fcfcfc;');
If you’re using Fancy Line Numbers mode, you pass a second string for the style of the nth line number:
$geshi->set_line_style('background: #fcfcfc;', 'background: #f0f0f0;');
The second style will have no effect if you’re not using Fancy Line Numbers mode.
By default, the styles you pass overwrite the current styles. Add a boolean “true” after the styles you specify to combine them with the current styles:
$geshi->set_line_style('background: red;', true);
Due to a bug with Firefox the issue that should have been fixed with 1.0.2 has reappeared in another form, as Firefox includes extra text/markup in plaintext versions of webpage copies. This can sometimes be useful (actually it’s used to get the plaintext version of this documentation), but more often is quite annoying. Best practice so far is to either not use line numbers, or offer the visitor of your page a plaintext version of your source. To learn more have a look at the SF.net BugTracker Issue #1651996. This will hopefully be fixed in GeSHi version 1.2 or as soon as Firefox provides web developers with adequate ways to control this feature - whichever comes first!
When you set line number styles, the code will inherit those styles! This is the main issue to come out of the 1.0.2 release. If you want your code to be styled in a predictable manner, you’ll have to call the set_code_style() method to rectify this problem.
Note also that you cannot apply background colours to line numbers unless you use set_overall_style(). Here’s how you’d style:
- Use set_overall_style() to style the overall code block. For example, you can set the border style/colour, any margins and padding etc. using this method. In addition: set the background colour for all the line numbers using this method.
- Use set_line_style() to style the foreground of the line numbers. For example, you can set the colour, weight, font, padding etc. of the line numbers using this method.
- Use set_code_style() to explicitly override the styles you set for line numbers using set_line_style(). For example, if you’d set the line numbers to be bold (or even if you’d only set the fancy line number style to be bold), and you didn’t actually want your code to be bold, you’d make sure that font-weight: normal; was in the stylesheet rule you passed to set_code_style().
This is the one major change from GeSHi 1.0.1 - make sure you become familiar with this, and make sure that you check any code you have already styled with 1.0.1 when you upgrade to make sure nothing bad happens to it.
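The three steps listed above might be combined as in this sketch. It assumes geshi.php is on the include path; the colours and the two-line source are arbitrary choices for the example.

```php
<?php
include_once 'geshi.php';

// Illustrative two-line source.
$geshi = new GeSHi("\$a = 1;\n\$b = 2;", 'php');
$geshi->enable_line_numbers(GESHI_NORMAL_LINE_NUMBERS);

// 1. Overall block: border plus the background that shows behind the line numbers.
$geshi->set_overall_style('border: 1px solid #ccc; background: #fcfcfc;');

// 2. Foreground of the line numbers themselves.
$geshi->set_line_style('color: #666; font-weight: bold;');

// 3. Undo what the code would otherwise inherit from the line number style.
$geshi->set_code_style('color: #000; font-weight: normal;');

$html = $geshi->parse_code();
echo $html;
```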
3.2.3 Choosing a Start Number
As of GeSHi 1.0.2, you can now make the line numbers start at any number, rather than just 1. This feature is useful if you’re highlighting code from a file from around a certain line number in that file, as an additional guide to those who will view the code. You set the line numbers by calling the start_line_numbers_at() method:
$geshi->start_line_numbers_at($number);
$number must be a positive integer (or zero). If it is not, GeSHi will convert it anyway.
If you have not enabled line numbers, this will have no effect.
Although I’d like GeSHi to have XHTML strict compliance, this feature will break compliance (however transitional compliance remains). This is because the only widely supported way to change the start value for line numbers is by using the start="number" attribute of the <ol> tag. Although CSS does provide a mechanism for doing this, it is only supported in Opera versions 7.5 and above (not even Firefox supports this).
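For instance, an excerpt taken from the middle of a file could be numbered as in the sketch below. It assumes geshi.php is on the include path; the excerpt and the starting number 120 are invented for illustration.

```php
<?php
include_once 'geshi.php';

// Illustrative: pretend these lines start at line 120 of a larger file.
$excerpt = "function demo() {\n    return 42;\n}";

$geshi = new GeSHi($excerpt, 'php');

// Line numbers must be enabled, otherwise the start number has no effect.
$geshi->enable_line_numbers(GESHI_NORMAL_LINE_NUMBERS);
$geshi->start_line_numbers_at(120);

$html = $geshi->parse_code();
echo $html;
```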
3.3 Using CSS Classes
Using CSS to highlight your code instead of in-lining the styles is a definite bonus. Not only is it more compliant (the W3C is deprecating the style attribute in XHTML 2.0) but it results in far less outputted code - up to a whopping 90% saving - which makes a *huge* difference to those of us unlucky enough to be on modems!
3.3.1 Enabling CSS Classes
By default, GeSHi doesn’t use the classes, so it’s easy just to whack out some highlighted code if you need without worrying about stylesheets. However, if you’re a bit more organised about it, you should use the classes ;). To turn the use of classes on, you call the enable_classes() method:
$geshi->enable_classes();
If you want to turn classes OFF for some reason later:
$geshi->enable_classes(false);
If classes are enabled when parse_code() is called, then the resultant source will use CSS classes in the output, otherwise it will in-line the styles. The advantages of using classes are great - the reduction in source will be very noticeable, and what’s more you can use one stylesheet for several different highlights on the same page. In fact, you can even use an external stylesheet and link to that, saving even more time and source (because stylesheets are cached by browsers).
There have been problems with inline styles and the Symbol Highlighting added in 1.0.7.21. If you can, you should therefore turn CSS classes ON to avoid those issues, although the latest reworks in 1.0.8 should fix most of them.
This should be the very first method you call after creating a new GeSHi object! That way, various other methods can act upon your choice to use classes correctly. In theory, you could call this method just before parsing the code, but this may result in unexpected behaviour.
3.3.2 Setting the CSS class and ID
You can set an overall CSS class and id for the code. This is a good feature that allows you to use the same stylesheet for many different snippets of code. You call set_overall_class() and set_overall_id() to accomplish this:
$geshi->set_overall_class('mycode');
The default classname is the name of the language being used. This means you can use just the one stylesheet for all sources that use the same language, and incidentally means that you probably won’t have to call these methods too often.
CSS IDs are supposed to be unique, and you should use them as such. Basically, you can specify an ID for your code and then use that ID to highlight that code in a unique way. You’d do this for a block of code that you expressly wanted to be highlighted in a different way (see the section below on getting the stylesheet for your code for an example).
As of GeSHi 1.0.8 the class name will always include the language name used for highlighting.
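Both calls can be used together, as in the following sketch. It assumes geshi.php is on the include path; the class name 'mycode' comes from the text above, while the ID 'snippet-1' is a made-up value.

```php
<?php
include_once 'geshi.php';

$geshi = new GeSHi('echo 1;', 'php');   // illustrative source
$geshi->enable_classes();

// A shared class for the look, plus a unique (hypothetical) ID for this one block.
$geshi->set_overall_class('mycode');
$geshi->set_overall_id('snippet-1');

$html = $geshi->parse_code();
echo $html;
```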
3.3.3 Getting the stylesheet for your code
The other half of using CSS classes is getting the stylesheet for use with the classes. GeSHi makes it very easy to get a stylesheet for your code, with one easy method call:
$geshi->enable_classes();
The get_stylesheet() method gets the stylesheet for your code in one easy call. All you need to do is output it in the correct place. You don’t even have to enable class usage to get the necessary stylesheet - however, not enabling classes but using the stylesheet may result in problems later.
By default, get_stylesheet() tries to return the least amount of code possible. Although currently it doesn’t check to see if a certain lexic is even in the source, you can expect this feature in the future. At least for the present however, if you explicitly disable the highlighting of a certain lexic, or disable line numbers, the related CSS will not be outputted. This may be a bad thing for you: perhaps you’re going to use the stylesheet for many blocks of code, some with line numbers, others with some lexic enabled where this source has it disabled. Or perhaps you’re building an external stylesheet and want all lexics included. So to get around this problem, you do this:
$geshi->get_stylesheet(false);
This turns economy mode off, and all of the stylesheet will be outputted regardless.
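The difference between the two modes can be seen with a sketch like this one. It assumes geshi.php is on the include path; the source string and the variable names $economy and $full are only illustrative.

```php
<?php
include_once 'geshi.php';

$geshi = new GeSHi('echo "hi";', 'php');   // illustrative source
$geshi->enable_classes();

// Economy mode (the default) leaves out rules for features that are switched off.
$economy = $geshi->get_stylesheet();

// Passing false returns every rule, e.g. for a stylesheet shared by many snippets.
$full = $geshi->get_stylesheet(false);

// Embed the full stylesheet, then the class-based markup that uses it.
echo '<style type="text/css">', $full, '</style>';
echo $geshi->parse_code();
```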
Now let’s say you have several snippets of code, using the same language. In most of them you don’t mind if they’re highlighted the same way (in fact, that’s exactly what you want) but in one of them you’d like the source to be highlighted differently. Here’s how you can do that:
// assume path is the default "geshi/" relative to the current directory
Before version 1.0.2, you needed to set the class of the code you wanted to be unique to the empty string. This limitation has been removed in version 1.0.2 - if you set the ID of a block of code, all styling will be done based on that ID alone.
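One way this could look is sketched below. It assumes geshi.php is on the include path; the three snippets, the ID 'different' and the red keyword colour are all invented for the example, and set_keyword_group_style() is the keyword styling method covered later under Changing Styles.

```php
<?php
include_once 'geshi.php';

$lang    = 'php';
$sources = array('echo 1;', 'echo 2;', 'echo 3;');   // illustrative snippets

// Shared look: one stylesheet keyed on the default class (the language name).
$common = new GeSHi($sources[0], $lang);
$common->enable_classes();
$css = $common->get_stylesheet();

// One snippet gets its own (made-up) ID and its own keyword colour.
$special = new GeSHi($sources[2], $lang);
$special->enable_classes();
$special->set_overall_id('different');
$special->set_keyword_group_style(1, 'color: red;');
$css .= $special->get_stylesheet();

echo '<style type="text/css">', $css, '</style>';

// The first two snippets reuse the common object and therefore the common rules.
foreach (array($sources[0], $sources[1]) as $src) {
    $common->set_source($src);
    echo $common->parse_code();
}
echo $special->parse_code();
```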
3.3.4 Using an External Stylesheet
An external stylesheet can reduce even more the amount of code needed to highlight some source. However there are some drawbacks with this. To use an external stylesheet, it’s up to you to link it in to your document, normally with the following HTML:
<html>
In your external stylesheet you put CSS declarations for your code. Then just make sure you’re using the correct class (use set_overall_class() to ensure this) and this should work fine.
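If you would rather generate the external stylesheet than write it by hand, a sketch like the following may be a starting point. It assumes geshi.php is on the include path; the file location $cssFile and the URL $cssUrl are hypothetical and would need to point at the same file on a real site.

```php
<?php
include_once 'geshi.php';

// Hypothetical location on disk and matching public URL for the stylesheet.
$cssFile = sys_get_temp_dir() . '/geshi-php.css';
$cssUrl  = '/css/geshi-php.css';

$geshi = new GeSHi('echo 1;', 'php');   // illustrative source
$geshi->enable_classes();

// Economy mode off: write every rule so the file suits any PHP snippet.
file_put_contents($cssFile, $geshi->get_stylesheet(false));

// The page then links to the file instead of embedding the rules.
echo '<link rel="stylesheet" type="text/css" href="' . htmlspecialchars($cssUrl) . '" />';
echo $geshi->parse_code();
```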
This method is great if you don’t mind the source always being highlighted the same (in particular, if you’re making a plugin for a forum/wiki/other system, using an external stylesheet is a good idea!). It saves a small amount of code and your bandwidth, and it’s relatively easy to just change the stylesheet should you need to. However, using this will render the methods that change the styles of the code useless, because of course the stylesheet is no longer being dynamically generated. You can still disable highlighting of certain lexics dynamically, however.
As of version 1.0.2, GeSHi comes with a contrib/ directory, which contains a “wizard” script for creating a stylesheet. Although this script is by no means a complete solution, it will create the necessary rules for the basic lexics - comments and strings, for example. Things not included in the wizard include regular expressions for any language that uses them (PHP and XML are two languages that use them), and keyword-link styles. However, this script should take some of the tedium out of the job of making an external stylesheet. Expect a much better version of this script in version 1.2!
3.4 Changing Styles
One of the more powerful features of GeSHi is the ability to change the style of the output dynamically. Why be chained to the boring styles the language authors make up? You can change almost every single aspect of highlighted code - and can even say whether something is to be highlighted at all.
If you’re confused about “styles”, you probably want to have a quick tutorial in them so you know what you can do with them. Check out the homepage of CSS at http://www.w3.org/Style/CSS.
3.4.1 The Overall Styles
The code outputted by GeSHi is either in a <div> or a <pre> (see the section entitled “The Code Container”), and this can be styled.
$geshi->set_overall_style('... styles ...');
Where styles is a string containing valid CSS declarations. By default, these styles overwrite the current styles, but you can change this by adding a second parameter:
$geshi->set_overall_style('color: blue;', true);
The default styles “shine through” wherever anything isn’t highlighted. Also, you can apply more advanced styles, like position: (fixed|relative) etc, because a <div>/<pre> is a block level element.
Remember that a <div> will by default have a larger font size than a <pre>, as discussed in the section “The Code Container”.
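The effect of the second parameter can be tried with a sketch like this. It assumes geshi.php is on the include path; the border and colour values are arbitrary.

```php
<?php
include_once 'geshi.php';

$geshi = new GeSHi('echo 1;', 'php');   // illustrative source

// First call: replaces whatever overall style was set before.
$geshi->set_overall_style('border: 1px solid #999; font-size: 1em;');

// Second call passes true, so the declaration is added to the existing ones.
$geshi->set_overall_style('color: blue;', true);

$html = $geshi->parse_code();
echo $html;
```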
3.4.2 Line Number Styles
You may wish to refer to the section Styling Line Numbers before reading this section.
As of version 1.0.2, the way line numbers are generated is different, so therefore the way that they are styled is different. In particular, now you cannot set the background style of the fancy line numbers to be different from that of the normal line numbers.
Line number styles are set by using the method set_line_style():
$geshi->set_line_style($style1, $style2);
$style1
is the style of the line numbers by default, and $style2
is the style of the fancy line numbers.
Things have changed since 1.0.1! This note is very important - please make sure you check this twice before complaining about line numbers!
Because of the way that ordered lists are done in HTML, there normally isn’t a way to style the actual numbers in the list. I’ve cheated somewhat with GeSHi - I’ve made it possible to use CSS to style the foreground of the line numbers. You can therefore change the color, font size and type, and padding on them. If you want to have a pretty background, you must use set_overall_style() to do this, and use set_code_style() to style the actual code! This is explained in the section above: Styling Line Numbers.
In addition, the styles for fancy line numbers are now the difference between the normal styles and the styles you want to achieve. For example, in GeSHi prior to 1.0.2 you may have done this to style line numbers:
$geshi->set_line_style('color: red; font-weight: bold;', 'color: green; font-weight: bold');
Now you instead can do this:
$geshi->set_line_style('color: red; font-weight: bold;', 'color: green;');
The font-weight: bold; will automatically carry through to the fancy styles. This is actually a small saving in code - but the difference may be confusing at first for anyone coming from 1.0.1.
3.4.3 Setting Keyword Styles
Perhaps the most frequent change you will make will be to the styles of a keyword set. In order to change the styles for a particular set, you’ll first have to know what the set is called. Sets are numbered from 1 up. Typically, set 1 contains keywords like if, while, do, for, switch etc, set 2 contains null, false, true etc, set 3 contains functions built into the language (echo, htmlspecialchars etc. in PHP) and set 4 contains data types and similar variable modifiers: int, double, real, static etc. However these things are not fixed, and you should check the language file to see what key you want. Having a familiarity with a language file is definitely a plus for using it.
To change the styles for a keyword set, call the set_keyword_group_style() method:
$geshi->set_keyword_group_style($group, $styles);
Where $group is the group to change the styles for and $styles is a string containing the styles to apply to that group.
By default, the styles you pass overwrite the current styles. Add a boolean true after the styles you specify to combine them with the current styles:
$geshi->set_keyword_group_style(3, 'color: white;', true);
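The following sketch puts the overwrite and merge forms side by side. The helper name emphasise_keywords is hypothetical, and the group numbers follow the “typical” layout described above, so check them against the language file you are actually using.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
function emphasise_keywords($geshi)
{
    // Group 1 (control structures in many language files):
    // replace the current style outright.
    $geshi->set_keyword_group_style(1, 'color: #b1b100; font-weight: bold;');

    // Group 3 (built-in functions in the PHP language file):
    // keep the existing style and only add an underline to it.
    $geshi->set_keyword_group_style(3, 'text-decoration: underline;', true);
}
```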
3.4.4 Setting Comment Styles
To change the styles for a comment group, call the set_comments_style() method:
$geshi->set_comments_style($group, $styles);
Where $group is either a number corresponding to a single-line comment, or the string 'MULTI' to specify multiline comments:
$geshi->set_comments_style(1, 'font-style: italic;');
By default, the styles you pass overwrite the current styles. Add a boolean true after the styles you specify to combine them with the current styles:
$geshi->set_comments_style(1, 'font-weight: 100;', true);
In 1.0.7.22 a new kind of comment called “COMMENT_REGEXP” was added. These are styled by setting single-line comment styles.
3.4.5 Setting Other Styles
GeSHi can highlight many other aspects of your source other than just keywords and comments. Strings, Numbers, Methods and Brackets among other things can all also be highlighted. Here are the related methods:
$geshi->set_escape_characters_style($styles[, $preserve_defaults]);
$styles is a string containing valid stylesheet declarations, while $preserve_defaults should be set to true if you want your styles to be merged with the previous styles. In the case of set_methods_style(), you should also select a group to set the styles of; check the language files for the number used for each “object splitter”.
As with set_methods_style(), set_symbols_style() has gained a parameter that lets you select the group of symbols whose style you’d like to change:
$geshi->set_symbols_style($styles[, $preserve_defaults[, $group]]);
If the third parameter is not given, group 0 is assumed. Note also that any changes to group 0 are reflected in the bracket style, i.e. a pass-through call to set_brackets_style() is made.
Since GeSHi 1.0.8 multiple styles for strings and numbers are supported, though the API doesn’t provide full access yet.
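Only one of the setters survives in the listing above, so here is a hedged sketch using the ones assumed to be available in GeSHi 1.0.8: set_strings_style(), set_numbers_style(), set_escape_characters_style(), set_methods_style() and set_symbols_style(). Check the Method/Constant Reference before relying on these names. The helper name restyle_lexics, the colours and the group numbers are illustrative assumptions.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
function restyle_lexics($geshi)
{
    // Strings: overwrite the current style.
    $geshi->set_strings_style('color: #008000;');

    // Numbers: merge with the current style (second argument true).
    $geshi->set_numbers_style('font-weight: bold;', true);

    // Escape characters inside strings.
    $geshi->set_escape_characters_style('color: #000099; font-weight: bold;');

    // Methods: the first argument selects the "object splitter" group
    // (1 is an assumption; check the language file).
    $geshi->set_methods_style(1, 'color: #006600;');

    // Symbols: explicit group 0, which per the text above is also
    // passed through to the bracket style.
    $geshi->set_symbols_style('color: #66cc66;', false, 0);
}
```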
3.5 Case Sensitivity and Auto Casing
Controlling the case of the outputted source is an easy job with GeSHi. You can control which keywords are converted in case, and also control whether keywords are checked in a case sensitive manner.
3.5.1 Auto-Caps/NoCaps
Auto-Caps/NoCaps is a nifty little feature that automatically capitalises or lowercases certain lexics when they are styled. I dabble in QuickBASIC, a dialect of BASIC which is well known for its capitalisation, and SQL is another language well known for using caps for readability.
To change what case lexics are rendered in, you call the set_case_keywords() method:
$geshi->set_case_keywords($caps_modifier);
The valid values to pass to this method are:
- GESHI_CAPS_NO_CHANGE: Don’t change the case of any lexics, leave as they are found
- GESHI_CAPS_UPPER: Uppercase all lexics found
- GESHI_CAPS_LOWER: Lowercase all lexics found
When I say “lexic”, I mean “keywords”. Any keyword in any keyword array will be modified using this option! This is one small area of inflexibility I hope to fix in 1.2.X.
I suspect this will only be used to specify GESHI_CAPS_NO_CHANGE to turn off autocaps for languages like SQL and BASIC variants, like so:
$geshi = new GeSHi($source, 'sql');
All the same, it can be used for some interesting effects:
$geshi = new GeSHi($source, 'java');
3.5.2 Setting Case Sensitivity
Some languages, like PHP, don’t mind what case function names and keywords are in, while others, like Java, depend on such pickiness to maintain their bad reputations ;). In any event, you can use the set_case_sensitivity() method to change the case sensitivity of a particular keyword group from the default:
$geshi->set_case_sensitivity($key, $sensitivity);
Where $key is the key of the group for which you wish to change case sensitivity (see the language file for that language), and $sensitivity is a boolean value - true if the keyword group is case sensitive, and false if not.
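A short sketch combining the two case-related calls from this section. The helper name configure_case and the choice of group 1 are assumptions for illustration; $geshi is an existing GeSHi object with GeSHi’s constants loaded.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
function configure_case($geshi)
{
    // Leave keyword case exactly as found in the source
    // (turns auto-caps off for languages such as SQL).
    $geshi->set_case_keywords(GESHI_CAPS_NO_CHANGE);

    // Match keyword group 1 regardless of case. The group number is
    // an assumption; check the language file for the right key.
    $geshi->set_case_sensitivity(1, false);
}
```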
3.6 Changing the Source, Language, Config Options
What happens if you want to change the source to be highlighted on the fly, or the language? Or if you want to specify any of those basic fields after you’ve created a GeSHi object? Well, that’s where these methods come in.
3.6.1 Changing the Source Code
To change the source code, you call the set_source() method:
$geshi->set_source($newsource);
Example:
$geshi = new GeSHi($source1, 'php');
3.6.2 Changing the Language
What happens if you want to change the language used for highlighting? Just call set_language():
$geshi->set_language('newlanguage');
Example:
$geshi = new GeSHi($source, 'php');
As of GeSHi 1.0.5, you can use the method load_from_file() to load the source code and language from a file. Simply pass this method a file name and it will attempt to load the source and set the language.
$geshi->load_from_file($file_name, $lookup);
$file_name is the file name to use, and $lookup is an optional parameter that contains a lookup array to use for deciding which language to choose. You can use this to override GeSHi’s default lookup array, which may not contain the extension of the file you’re after, or perhaps does have your extension but under a different language. The lookup array is of the form:
array(
Also, you can use the method get_language_name_from_extension() if you need to convert a file extension to a valid language name. This method will return the empty string if it could not find a match in the lookup, and like load_from_file() it accepts an optional second parameter that contains a lookup array.
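A sketch of passing a custom lookup array. It assumes the lookup maps each language name to a list of file extensions (without the dot); the entries and the helper name load_template_file are invented for illustration, and $geshi is an existing GeSHi object.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
function load_template_file($geshi, $file_name)
{
    // Assumed shape: 'language name' => array of extensions.
    $lookup = array(
        'php'         => array('php', 'inc'),
        'html4strict' => array('html', 'htm', 'tpl'),
    );

    // Ask GeSHi which language it would pick for this extension.
    $extension = strtolower(pathinfo($file_name, PATHINFO_EXTENSION));
    $language  = $geshi->get_language_name_from_extension($extension, $lookup);

    if ($language === '') {
        // No match in the lookup, so don't attempt to load the file.
        return false;
    }

    // Load the source and set the language using the same lookup.
    $geshi->load_from_file($file_name, $lookup);
    return $language;
}
```

The function returns the language name that was chosen, or false when the extension is not in the lookup, so the caller can fall back to plain output.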
Language names are case-insensitive; they will be converted to lower case to match a language file. So if you’re making a language file, remember it should have a name in lower case.
What you pass to set_language() is the name of a language file, minus the .php extension. If you’re writing a plugin for a particular application, it’s up to you to somehow convert user input into a valid language name.
Since GeSHi 1.0.8 set_language() does not reset language settings for an already loaded language. If you want to highlight code in the same language with different settings, add the optional $force_reset parameter:
$geshi->set_language('language', true);
GeSHi include()s the language file, so be careful to make sure that users can’t pass some weird language name to include any old script! GeSHi tries to strip non-valid characters out of a language name, but you should always do this yourself anyway. In particular, language files are always lower-case, with either alphanumeric characters, dashes or underscores in their name.
At the very least, strip “/” characters out of a language name.
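One way to do that check yourself is a whitelist on the naming rule given above (lower case, alphanumerics, dashes, underscores). This is a sketch under that assumption, not GeSHi’s own filtering; the function name sanitise_language_name is made up.

```php
<?php
// Hypothetical helper: returns a safe language name, or false if the
// input contains anything other than a-z, 0-9, "-" and "_".
function sanitise_language_name($name)
{
    $name = strtolower((string) $name);

    // \z anchors at the very end of the string, so a trailing newline
    // is rejected as well (a plain $ would let it through).
    if (!preg_match('/^[a-z0-9_-]+\z/', $name)) {
        return false;
    }
    return $name;
}
```

Rejecting bad input outright, rather than stripping characters, means a name like ../../etc/passwd never turns into something that happens to match a real file.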
3.6.3 Changing the Language Path
What happens if all of a sudden you want to use language files from a different directory from the current language file location? You call the set_language_path() method:
$geshi->set_language_path($newpath);
It doesn’t matter whether the path has a trailing slash after it or not - only that it points to a valid folder. If it doesn’t, that’s your tough luck ;)
3.6.4 Changing the Character Set
Although GeSHi itself does not need to know the exact charset of your source, you will need to set this option when processing sources where multi-byte characters can occur. As of GeSHi 1.0.7.18 a rewrite of htmlspecialchars is used internally, due to a security flaw in that function that is unpatched in even the most recent PHP4 versions and in PHP5 < 5.2. Although this rewrite no longer explicitly requires the charset, the charset is required again as of GeSHi 1.0.8 to properly handle multi-byte characters (e.g. after an escape char).
As of GeSHi 1.0.8 the default charset has been changed to UTF-8.
As of version 1.0.3, you can use the method set_encoding() to specify the character set that your source is in. Valid names are those names that are valid for the PHP mbstring library:
$geshi->set_encoding($encoding);
There is a table of valid strings for $encoding in the php.net manual for the mbstring library. If you do not specify an encoding, or specify an invalid encoding, the default character set is used (UTF-8 as of GeSHi 1.0.8, as noted above; ISO-8859-1 in earlier versions).
3.7 Error Handling
What happens if you try to highlight using a language that doesn’t exist? Or if GeSHi can’t read a required file? The results you get may be confusing. You may check your code over and over, and never find anything wrong. GeSHi provides ways of finding out if GeSHi itself found anything wrong with what you tried to do. After highlighting, you can call the error() method:
$geshi = new GeSHi('hi', 'thisLangIsNotSupported');
echo $geshi->error(); // echoes error message
The error message you will get will look like this:
GeSHi Error: GeSHi could not find the language thisLangIsNotSupported (using path geshi/) (code 2)
The error returned is the last error GeSHi came across, just like how mysql_error() works.
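In practice you may want to branch on the result rather than echo it. The sketch below assumes error() returns a false-like value when nothing went wrong; the helper name highlight_or_report is hypothetical and $geshi is an existing GeSHi object.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
// Returns the highlighted HTML, or the error message if GeSHi
// reported a problem.
function highlight_or_report($geshi)
{
    $html  = $geshi->parse_code();

    // Assumption: error() is false-like when there was no error.
    $error = $geshi->error();
    if ($error) {
        return $error;
    }
    return $html;
}
```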
3.8 Disabling styling of some Lexics
One disadvantage of GeSHi is that for large source files using complex languages, it can be quite slow with every option turned on. Although future releases will concentrate on the speed/resource side of highlighting, you can gain speed by disabling some of the highlighting options. This is done by using a series of set_*_highlighting methods:
- set_keyword_group_highlighting($group, $flag): Sets whether a particular $group of keywords is to be highlighted or not. Consult the necessary language file(s) to see what $group should be for each group (typically a positive integer). $flag is false if you want to disable highlighting of this group, and true if you want to re-enable highlighting of this group. If you disable a keyword group, no URL will be generated for its keywords even if the group has a related URL.
- set_comments_highlighting($group, $flag): Sets whether a particular $group of comments is to be highlighted or not. Consult the necessary language file(s) to see what $group should be for each group (typically a positive integer, or the string 'MULTI' for multiline comments). $flag is false if you want to disable highlighting of this group, and true if you want to re-enable highlighting of this group.
- set_regexps_highlighting($regexp, $flag): Sets whether a particular $regexp is to be highlighted or not. Consult the necessary language file(s) to see what $regexp should be for each regexp (typically an integer). $flag is false if you want to disable highlighting of this regexp, and true if you want to re-enable it.
The following methods:
set_escape_characters_highlighting($flag)
set_symbols_highlighting($flag)
set_strings_highlighting($flag)
set_numbers_highlighting($flag)
set_methods_highlighting($flag)
work on their respective lexics (e.g. set_methods_highlighting() will disable/enable highlighting of methods). For each method, if $flag is false then the related lexics will not be highlighted at all (this means no HTML will surround the lexic like usual, saving on time and bandwidth).
In case all highlighting should be disabled or re-enabled, GeSHi provides two methods called disable_highlighting() and enable_highlighting($flag). The optional parameter $flag was added in 1.0.7.21 and specifies the desired state, i.e. true (default) to turn all highlighting on, or false to turn all highlighting off. Since 1.0.7.21 the method disable_highlighting() has been deprecated.
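As an illustration of trading detail for speed, the sketch below switches off a few lexics that often matter least. Which ones to drop is a judgment call; the helper name trim_highlighting_for_speed and the choice of keyword group 3 are assumptions.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
function trim_highlighting_for_speed($geshi)
{
    // Numbers and symbols are frequent, so skipping them saves
    // a lot of surrounding HTML.
    $geshi->set_numbers_highlighting(false);
    $geshi->set_symbols_highlighting(false);

    // Drop keyword group 3 (often the large built-in function list;
    // check the language file). Its keyword URLs go away too.
    $geshi->set_keyword_group_highlighting(3, false);

    // Keep multiline comments highlighted.
    $geshi->set_comments_highlighting('MULTI', true);
}
```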
3.9 Setting the Tab Width
If you’re using the <pre> header, tabs are handled automatically by your browser, and in general you can count on good results. However, if you’re using the <div> header, you may want to specify a tab width explicitly.
Note that tabs created in this fashion won’t be like normal tabs - there won’t be “tab-stops” as such, instead tabs will be replaced with the specified number of spaces - just like most editors do.
To change the tab width, you call the set_tab_width() method:
$geshi->set_tab_width($width);
Where $width is the width in spaces that you’d like tabs to be.
3.10 Using Strict Mode
Some languages like to get tricky, and jump in and out of the file that they’re in. For example, the vast majority of you reading this will have used a PHP file. And you know that PHP code is only executed if it’s within delimiters like <?php and ?> (there are others of course…). So what happens if you do the following in a PHP file?
<img src="<?php echo rand(1, 100) ?>" />
When using GeSHi without strict mode, or using a bad highlighter, you’ll end up with scrambled crap; especially if you’re being slack about where you’re putting your quotes, you could end up with the rest of your file as bright blue. Fortunately, you can tell GeSHi to be “strict” about just when it highlights and when it does not, using the enable_strict_mode() method:
$geshi->enable_strict_mode($mode);
Where $mode is true or not specified to enable strict mode, or false to disable strict mode if you’ve already turned it on and don’t want it now.
As of GeSHi 1.0.8 there is a new way to tell GeSHi when to use strict mode, which is somewhat more intelligent than in previous releases. GeSHi now also allows GESHI_MAYBE, GESHI_NEVER and GESHI_ALWAYS instead of true and false. Basically GESHI_ALWAYS (true) always enables strict mode, whereas GESHI_NEVER (false) completely disables strict mode. The new thing is GESHI_MAYBE, which enables strict mode if it finds any sequences of code that look like strict block delimiters.
By the way: that’s why this section had to be changed, as the new documentation tool we now use applies this feature and thus auto-detects when strict mode has to be used…
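For source that may or may not contain embedded blocks (an HTML page with the occasional <?php ... ?>, say), GESHI_MAYBE is probably the setting you want. A minimal sketch, assuming GeSHi 1.0.8 or later; the helper name highlight_mixed_source is hypothetical and $geshi is an existing GeSHi object.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object
// from GeSHi 1.0.8 or later, so that GESHI_MAYBE is defined.
function highlight_mixed_source($geshi)
{
    // Let GeSHi turn strict mode on only if the source contains
    // something that looks like strict block delimiters.
    $geshi->enable_strict_mode(GESHI_MAYBE);

    return $geshi->parse_code();
}
```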
3.11 Adding/Removing Keywords
Let’s say that you’re working on a large project, with many files, many classes and many functions. Perhaps also you have the source code on the web and highlighted by GeSHi, perhaps as a front end to CVS, as a learning tool, something to refer to, whatever. Well, why not highlight the names of the functions and classes your project uses, as well as the standard functions and classes? Or perhaps you’re not interested in highlighting certain functions, and would like to remove them? Or maybe you don’t mind if an entire function group goes west in the interest of speed? GeSHi can handle all of this!
3.11.1 Adding a Keyword
If you want to add a keyword to an existing keyword group, you use the add_keyword() method:
$geshi->add_keyword($key, $word);
Where $key is the index of the group of keywords you want to add this keyword to, and $word is the word to add.
This implies knowledge of the language file to know the correct index.
3.11.2 Removing a Keyword
Perhaps you want to remove a keyword from an existing group. Maybe you don’t use it and want to save yourself some time. Whatever the reason, you can remove it using the remove_keyword() method:
$geshi->remove_keyword($key, $word);
Where $key is the index of the group of keywords that you want to remove this keyword from, and $word is the word to remove.
This implies knowledge of the language file to know the correct index - most of the time the keywords you’ll want to remove will be in group 3, but this is not guaranteed and you should check the language file first.
This function is silent - if the keyword is not in the group you specified, nothing awful will happen ;)
3.11.3 Adding a Keyword Group
Let’s say for your big project you have several main functions and classes that you’d like highlighted. Why not add them as their own group instead of having them highlighted the same way as other keywords? Then you can make them stand out, and people can instantly see which functions and classes are user defined and which are inbuilt. Furthermore, you could set the URL for this group to point at the API documentation of your project.
You add a keyword group by using the add_keyword_group() method:
$geshi->add_keyword_group($key, $styles, $case_sensitive, $words);
Where $key is the key that you want to use to refer to this group, $styles is the styles that you want to use to style this group, $case_sensitive is true or false depending on whether you want this group of keywords to be case sensitive or not and $words is an array of words (or a string) to add to this group. For example:
$geshi->add_keyword_group(10, 'color: #600000;', false, array('myfunc_1', 'myfunc_2', 'myfunc_3'));
This adds a keyword group referenced by index 10, in which all keywords will be dark red, each keyword can be in any case, and which contains the keywords “myfunc_1”, “myfunc_2” and “myfunc_3”.
After creating such a keyword group, you may call other GeSHi methods on it, just as you would for any other keyword group.
If you specify a $key for which there is already a keyword group, the old keyword group will be overwritten! Most language files don’t use numbers larger than 5, so I recommend you play it safe and use a number like 10 or 42.
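Putting the pieces of this section together, a project-specific group might be registered like this. The helper name register_project_api, the function names and the example.com URL are placeholders; $geshi is an existing GeSHi object.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
function register_project_api($geshi)
{
    // Group 10 is chosen to stay clear of the groups that
    // language files normally define.
    $geshi->add_keyword_group(
        10,
        'color: #600000;',
        false, // not case sensitive
        array('myfunc_1', 'myfunc_2')
    );

    // Keywords can still be added to the new group afterwards.
    $geshi->add_keyword(10, 'myfunc_3');

    // Link each keyword to the project's own API docs
    // (placeholder URL; {FNAME} is replaced by the keyword).
    $geshi->set_url_for_keyword_group(10, 'http://example.com/api/{FNAME}.html');
}
```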
3.11.4 Removing a Keyword Group
Perhaps you really need speed? Why not just remove an entire keyword group? GeSHi won’t have to loop through each keyword checking for its existence, saving much time. You remove a keyword group by using the remove_keyword_group() method:
$geshi->remove_keyword_group($key);
Where $key is the key of the group you wish to remove. This implies knowledge of the language file.
3.12 Headers and Footers for Your Code
So you want to add some special information to the highlighted source? GeSHi can do that too! You can specify headers and footers for your code, style them, and insert information from the highlighted source into your header or footer.
3.12.1 Keyword Substitution
In your header and footer, you can put special keywords that will be replaced with actual configuration values for this GeSHi object. The keywords you can use are:
- <TIME> or {TIME}: Is replaced by the time the parse_code() method took - i.e., how long it took for your code to be highlighted. The time is returned to three decimal places.
- <LANGUAGE> or {LANGUAGE}: Is replaced by a nice, friendly version of the language name used to highlight this code.
- <SPEED> or {SPEED}: Is replaced by the speed at which your source has been processed.
- <VERSION> or {VERSION}: Is replaced by the GeSHi version used to highlight the code.
3.12.2 Setting Header Content
The header for your code is a <div>, which is inside the containing block. Therefore, it is affected by the method set_overall_style(), and should contain the sort of HTML that belongs in a <div>. You may use any HTML you like, and format it as an HTML document. You should use valid HTML - convert to entities any quotemarks or angle brackets you want displayed. You set the header content using the method set_header_content():
$geshi->set_header_content($content);
Where $content is the HTML you want to use for the header.
3.12.3 Setting Footer Content
The footer for your code is a <div>, which is inside the containing block. Therefore, it is affected by the method set_overall_style(), and should contain the sort of HTML that belongs in a <div>. You may use any HTML you like, and format it as an HTML document. You should use valid HTML - convert to entities any quotemarks or angle brackets you want displayed. You set the footer content using the method set_footer_content():
$geshi->set_footer_content($content);
Where $content is the HTML you want to use for the footer.
3.12.4 Styling Header Content
You can apply styles to the header content you have set with the set_header_content_style() method:
$geshi->set_header_content_style($styles);
Where $styles is the stylesheet declarations you want to use to style the header content.
3.12.5 Styling Footer Content
You can apply styles to the footer content you have set with the set_footer_content_style() method:
$geshi->set_footer_content_style($styles);
Where $styles is the stylesheet declarations you want to use to style the footer content.
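A sketch that combines the four calls with the substitution keywords from section 3.12.1. The wording of the header and footer and the helper name add_header_and_footer are illustrative; $geshi is an existing GeSHi object.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
function add_header_and_footer($geshi)
{
    // {LANGUAGE} is replaced with the friendly language name.
    $geshi->set_header_content('Language: {LANGUAGE}');
    $geshi->set_header_content_style('font-weight: bold; border-bottom: 1px solid #ccc;');

    // {TIME} and {VERSION} are filled in when the code is parsed.
    $geshi->set_footer_content('Parsed in {TIME} seconds with GeSHi {VERSION}');
    $geshi->set_footer_content_style('font-size: 80%; color: #666;');
}
```

The curly-brace forms are used here rather than <TIME> and friends so the content cannot be mistaken for HTML tags.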
3.13 Keyword URLs
As of version 1.0.2, GeSHi allows you to specify a URL for keyword groups. This URL is used by GeSHi to convert the keywords in that group into links to appropriate documentation. And using add_keyword_group() you can add functions and classes from your own projects and use the URL functionality to provide a link to your own API documentation.
3.13.1 Setting a URL for a Keyword Group
To set the URL to be used for a keyword group, you use the set_url_for_keyword_group() method:
$geshi->set_url_for_keyword_group($group, $url);
Where $group is the keyword group you want to assign the URL for, and $url is the URL for this group of keywords.
You may be wondering how to make each keyword in the group point to the correct URL. You do this by putting {FNAME} in the URL at the correct place. For example, PHP makes it easy by linking www.php.net/function-name to the documentation for that function, so the URL used is http://www.php.net/{FNAME}.
Of course, when you get to a language like Java, that puts its class documentation in related folders, it gets a little trickier to work out an appropriate URL (see the Java language file!). I hope to provide some kind of redirection service at the GeSHi website in the future.
As of version 1.0.7.21, two more placeholders are available for linking to functions. {FNAMEL} will generate the lowercase version of the keyword, {FNAMEU} will generate the uppercase version. {FNAME} will provide the keyword as specified in the language file. Use one of these more specific placeholders if possible, as they result in less overhead while linking for case insensitive languages.
3.13.2 Disabling a URL for a Keyword Group
It’s easy to disable a URL for a keyword group: simply use the method set_url_for_keyword_group() to pass an empty string as the URL:
$geshi->set_url_for_keyword_group($group, '');
3.13.3 Disabling all URLs for Keywords
As of GeSHi 1.0.7.18, you can disable all URL linking for keywords:
$geshi->enable_keyword_links(false);
3.13.4 Styling Links
You can also style the function links. You can style their default status, hovered, active and visited status. All of this is controlled by one method, set_link_styles():
$geshi->set_link_styles($mode, $styles);
Where $mode is one of four values:
- GESHI_LINK: The default style of the links.
- GESHI_HOVER: The style of the links when they have focus (the mouse is hovering over them).
- GESHI_ACTIVE: The style of the links when they are being clicked.
- GESHI_VISITED: The style of links that the user has already visited.
And $styles is the stylesheet declarations to apply to the links.
The names GESHI_LINK, GESHI_HOVER… are constants. Don’t put them in quotes!
3.13.5 Setting the Link Target
Perhaps you want to set the target attribute of the links, so the manual pages open in a new window? Use the set_link_target() method:
$geshi->set_link_target($target);
Where $target is any valid (X)HTML target value - _blank or _top for example.
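The link-related calls from this section can be combined as in the sketch below. It assumes keyword group 3 holds the PHP built-in functions (check the language file); the helper name configure_keyword_links and the styles are made up, and $geshi is an existing GeSHi object.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
function configure_keyword_links($geshi)
{
    // {FNAMEL} inserts the lowercase form of the keyword,
    // which is the more specific placeholder recommended above.
    $geshi->set_url_for_keyword_group(3, 'http://www.php.net/{FNAMEL}');

    // Constants, not strings: no quotes around GESHI_LINK / GESHI_HOVER.
    $geshi->set_link_styles(GESHI_LINK, 'text-decoration: none;');
    $geshi->set_link_styles(GESHI_HOVER, 'text-decoration: underline;');

    // Open the manual pages in a new window.
    $geshi->set_link_target('_blank');
}
```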
3.14 Using Contextual Importance
This functionality is not only buggy, but is proving very hard to implement in 1.1.X. Therefore, this functionality may well be removed in 1.2.0. You are hereby warned!
This feature allows you to mark a part of your source as important. However, given the problems with its implementation, its use is deprecated and you should consider using the “Highlight Lines Extra” feature described below.
3.15 Highlighting Special Lines “Extra”
An alternative (and more stable) method of highlighting code that is important is to use extra highlighting by line. Although you may not know what line numbers contain the important lines, if you do, this method is a much more flexible way of making important lines stand out.
3.15.1 Specifying the Lines to Highlight Extra
To specify which lines to highlight extra, you pass an array containing the line numbers to highlight_lines_extra():
$geshi->highlight_lines_extra($array);
The array could be in the form array(2, 3, 4, 7, 12, 344, 4242), made from a DB query, generated from looking through the source for certain important things and working out what line those things are… However you get the line numbers, the array should simply be an array of integers.
Here’s an example, using the same source as before:
//
Which produces:
public int[][] product ( n, m )
What’s more, as you can see the code on a highlighted line is still actually highlighted itself.
3.15.2 Styles for the Highlighted Lines
Again as with contextual importance, you’re not chained to the yellow theme that is the default. You can use the set_highlight_lines_extra_style() method:
$geshi->set_highlight_lines_extra_style($styles);
Where $styles is the stylesheet declarations that you want to apply to highlighted lines.
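One way to get the line numbers is to search the source yourself. The helper below is a sketch of that idea (the name lines_containing is hypothetical, and $needle is assumed to be a non-empty string); it returns 1-based line numbers suitable for highlight_lines_extra().

```php
<?php
// Hypothetical helper: returns the 1-based numbers of all lines in
// $source that contain $needle (assumed to be a non-empty string).
function lines_containing($source, $needle)
{
    $numbers = array();

    // Split on Windows, old Mac and Unix line endings alike.
    $lines = preg_split('/\r\n|\r|\n/', $source);

    foreach ($lines as $index => $line) {
        if (strpos($line, $needle) !== false) {
            $numbers[] = $index + 1; // array index 0 is line 1
        }
    }
    return $numbers;
}
```

With a GeSHi object in hand this could be used as $geshi->highlight_lines_extra(lines_containing($source, 'TODO')); followed by a call to set_highlight_lines_extra_style() if the default colour does not suit.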
3.16 Adding IDs to Each Line
Perhaps you’re a JavaScript junkie? GeSHi provides a way to give each line an ID so you can access that line with JavaScript, or perhaps just by plain CSS (though if you want to access lines by CSS you should use the method in the previous section). To enable IDs you call the enable_ids() method:
$geshi->enable_ids($flag);
Where $flag is true or not present to enable IDs, and false to disable them again if you need.
The ID generated is in the form {overall-css-id}-{line-number}. So for example, if you set the overall CSS id to be “mycode”, then the IDs for each line would be “mycode-1”, “mycode-2” etc. If there is no CSS ID set, then one is made up in the form geshi-[4 random characters], but this is not so useful if you want to do JavaScript manipulation.
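To get predictable IDs, set the overall CSS ID before enabling line IDs. The sketch below assumes set_overall_id() is the method that sets it; the helper names enable_line_ids and line_id are hypothetical, and line_id() simply reproduces the ID format described above so that other code can refer to a given line.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
function enable_line_ids($geshi, $css_id)
{
    // Assumption: set_overall_id() sets the overall CSS ID that
    // prefixes every line ID.
    $geshi->set_overall_id($css_id);
    $geshi->enable_ids(true);
}

// Builds the ID of one line in the documented
// {overall-css-id}-{line-number} form, e.g. for use in a link anchor.
function line_id($css_id, $line_number)
{
    return $css_id . '-' . $line_number;
}
```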
3.17 Getting the Time of Styling
Once you’ve called parse_code(), you can get the time it took to run the highlighting by calling the get_time() method:
$geshi = new GeSHi($source, $language, $path);
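Since the order of the calls matters, a small wrapper can keep them together. This is a sketch: the helper name highlight_with_timing is hypothetical and $geshi is an existing GeSHi object such as the one constructed above.

```php
<?php
// Hypothetical helper; $geshi is assumed to be an existing GeSHi object.
// Returns array(highlighted HTML, time taken).
function highlight_with_timing($geshi)
{
    // parse_code() has to run first; get_time() reports on that run.
    $html = $geshi->parse_code();
    $time = $geshi->get_time();

    return array($html, $time);
}
```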
4 Language Files
So now you know what features GeSHi offers, and perhaps you’ve even meddled with the source. Or perhaps you’d like a language file for language X but it doesn’t seem to be supported? Rubbish! GeSHi will highlight anything, what do you think I coded this for? ^_^ You’ll just have to learn how to make a language file yourself. And I promise it’s not too hard - and if you’re here you’re in the right place!
4.1 An Example Language File
Let’s begin by looking at an example language file - the language file for the first language ever supported, PHP:
<?php
If you’re remotely familiar with PHP (or even if you’re not), you can see that all that a language file consists of is a glorified variable assignment. Easy! All a language file does is assign a variable $language_data. Though still, there are a lot of indices to that array… but this section is here to break each index down and explain it to you.
4.2 Language File Conventions
There are several conventions that are used in language files. For ease of use and readability, your language files should obey the following rules:
- Indentation is 4 spaces, not tabs: Use spaces! Editors continuously screw up tabs, so there should be no tabs in your documents; otherwise they would look different on every computer.
- Strings are in single quotes: Every string in a language file should be in single quotes (‘), unless you are specifying a single quote as a quotemark or escape character, in which case they can be in double quotes for readability; or if you are specifying a REGEXP (see below). This ensures that the language file can be loaded as fast as possible by PHP as unnecessary parsing can be avoided.
- Large arrays are multi-lined: An array with more than three or four values should be broken into multiple lines. In any case, lines should not be wider than a full-screen window (about 100 chars per line max). Don’t break the keywords arrays after every keyword.
- Ending brackets for multi-lined arrays on a new line: Also with a comma after them, unless the array is the last one in a parent array. See the PHP language file for examples of where to use commas.
- Use GeSHi’s constants: For capitalisation, regular expressions etc. use the GeSHi constants, not their actual values.
- Verbatim header format: Copy the file header verbatim from other language files and modify the values afterwards. Don’t try to invent your own header formats, as your language file will otherwise fail validation!
There are more notes on each convention where it may appear in the language file sections below.
4.3 Language File Sections
This section will look at all the sections of a language file, and how they relate to the final highlighting result.
4.3.1 The Header
The header of a language file is the first lines with the big comment and the start of the variable $language_data
:
PHP code | |
1 |
<?php |
The parts in angle brackets are the parts that you change for your language file. Everything else must remain the same!
Here are the parts you should change:
- <name-of-language-file.php>: This should become the name of your language file. Language file names are in lower case and contain only alphanumeric characters, dashes and underscores. Language files end with .php (which you should include in the name you put here, e.g. language.php).
- <name>: Your name, or alias.
- <e-mail address>: Your e-mail address. If you want your language file included with GeSHi you must include an e-mail address that refers to an inbox controlled by you.
- <website>: A URL of a website of yours (perhaps a page that deals with your contribution to GeSHi, or your home page/blog).
- <date-started>: The date you started working on the language file. If you can’t remember, estimate.
- <name-of-language>: The name of the language you made this language file for (probably similar to the language file name).
- <any-comments>: Any comments you have to make about this language file, perhaps on where you got the keywords from, what dialect of the language this language file is for etc. If you don’t have any comments, remove the space for them.
- <date-of-release>: The date you released the language file to the public. If you simply send it to me for inclusion in a new GeSHi and don’t release it, leave this blank, and I’ll replace it with the date of the GeSHi release that it is first added to.
- <GeSHi release>: This is the version of the release that will contain the changes you made. So if you have version 1.0.8 of GeSHi running, this will be the next version to be released, e.g. 1.0.8.1.
Everything else should remain the same.
Also: I’m not sure about the copyright on a new language file. I’m not a lawyer; could someone contact me about whether the copyright for a new language file should be exclusively the author’s, or joint with me (if included in a GeSHi release)?
4.3.2 The First Indices
Here is an example from the php language file of the first indices:
PHP code:
'LANG_NAME' => 'PHP',
The first indices are the first few lines of a language file before the KEYWORDS index. These indices specify:
- ‘LANG_NAME’: The name of the language. This name should be a human-readable version of the name (e.g. HTML 4 (transitional) instead of html4trans)
- ‘COMMENT_SINGLE’: An array of single-line comments in your language, indexed by integers starting from 1. A single line comment is a comment that starts at the marker and goes until the end of the line. These comment markers may be any length > 0, and since they can be styled individually, they can be used for things other than comments (for example the Java language file defines “import” as a single line comment). If you are making a language that uses a ’ (apostrophe) as a comment (or in the comment marker somewhere), use double quotes, e.g. "'"
- ‘COMMENT_MULTI’: Used to specify multiline comments, an array in the form ‘OPEN’ => ‘CLOSE’. Unfortunately, all of these comments you add here will be styled the same way (an area of improvement for GeSHi 1.2.X). These comment markers may be any length > 0.
- ‘CASE_KEYWORDS’: Used to set whether the case of keywords should be changed automatically as they are found. For example, in an SQL or BASIC dialect you may want all keywords to be upper case. The accepted values for this are:
  - GESHI_CAPS_UPPER: Convert the case of all keywords to upper case.
  - GESHI_CAPS_LOWER: Convert the case of all keywords to lower case.
  - GESHI_CAPS_NO_CHANGE: Don’t change the case of any keyword.
- ‘QUOTEMARKS’: Specifies the characters that mark the beginning and end of a string. This is another example where if your language includes the ’ string delimiter you should use double quotes around it.
- ‘ESCAPE_CHAR’: Specifies the escape character used in all strings. If your language does not have an escape character then make this the empty string (''). This is not an array! If found, any character after an escape character and the escape character itself will be highlighted differently, and the character after the escape character cannot end a string.
In some language files you might see here other indices too, but those are dealt with later on.
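The indices described above can be put together roughly as follows. This is a minimal sketch for a hypothetical BASIC-like language, not an excerpt from any shipped language file; the language name and the comment markers are made up, and the define() at the top is only a stand-in (with a placeholder value) so the snippet runs on its own, since a real language file relies on geshi.php to provide the constant.

```php
<?php
// Stand-in for a constant that geshi.php normally provides.
// The value is a placeholder for illustration only.
if (!defined('GESHI_CAPS_NO_CHANGE')) {
    define('GESHI_CAPS_NO_CHANGE', 0);
}

// First indices for a hypothetical language (all values are assumptions).
$language_data = array(
    // Human-readable name, not the file name.
    'LANG_NAME' => 'Example BASIC',
    // Indexed by integers starting from 1; the apostrophe marker
    // is written in double quotes.
    'COMMENT_SINGLE' => array(1 => '//', 2 => "'"),
    // 'OPEN' => 'CLOSE' pairs.
    'COMMENT_MULTI' => array('/*' => '*/'),
    // Use the constant, never its raw value.
    'CASE_KEYWORDS' => GESHI_CAPS_NO_CHANGE,
    'QUOTEMARKS' => array('"'),
    // A plain string, not an array.
    'ESCAPE_CHAR' => '\\',
);
```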
4.3.3 Keywords
Keywords will make up the bulk of a language file. In this part you add keywords for your language, including inbuilt functions, data types, predefined constants etc.
Here’s a (shortened) example from the php language file:
PHP code:
'KEYWORDS' => array(
You can see that the index ‘KEYWORDS’ refers to an array of arrays, indexed by positive integers. In each array, there are some keywords (in the actual php language file there are in fact many more keywords in the array indexed by 3). Here are some points to note about these keywords:
- Indexed by positive integers: Use nothing else! I may change this in 1.2.X, but for the 1.0.X series, use positive integers only. Using strings here results in unnecessary overhead degrading performance when highlighting code with your language file!
- Keywords sorted ascending: Keywords should be sorted in ascending order. This is mainly for readability. An issue with versions before 1.0.8 has been solved, so the reverse sorting order is no longer required and should thus be avoided. GeSHi itself sorts the keywords internally when building some of its caches, so the order doesn’t matter, but makes things easier to read and maintain.
- Keywords are case sensitive (sometimes): If your language is case-sensitive, the correct casing of the keywords is defined as the case of the keywords in these keyword arrays. If you check the java language file you will see that everything is in exact casing. So if any of these keyword arrays are case sensitive, put the keywords in as their correct case! (Note that which groups are case sensitive and which are not is configurable, see later on.) If a keyword group is case insensitive, put the lowercase version of the keyword here, unless the documentation links require a special casing (other than all lowercase or all uppercase), in which case use the casing the links require.
- Keywords must be in htmlentities() form: All keywords should be written as if they had been run through the PHP function htmlentities(). E.g. the keyword is &lt;foo&gt;, not <foo>.
- Don’t use keywords to highlight symbols: Just don’t!!! It doesn’t work, and there has been separate support for symbols since GeSHi 1.0.7.21.
- Markup Languages are special cases: Check the html4strict language file for an example. You need to tweak the parser control settings there to tell GeSHi what may surround tag names. In case of doubt, feel free to ask.
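To make the points above concrete, here is a small sketch of keyword groups for an imaginary language. The keywords and the group layout are assumptions chosen for illustration; the loop at the end is just a convenience check for the "sorted ascending" convention and is not something GeSHi requires you to ship.

```php
<?php
// Hypothetical keyword groups, indexed by positive integers only.
$keywords = array(
    1 => array('else', 'for', 'if', 'while'),
    2 => array('false', 'null', 'true'),
    // A markup-style keyword, stored in htmlentities() form.
    3 => array(htmlentities('<foo>')),
);

// Report any group that is not sorted in ascending order.
$unsorted = array();
foreach ($keywords as $index => $group) {
    $sorted = $group;
    sort($sorted);
    if ($sorted !== $group) {
        $unsorted[] = $index;
    }
}

echo $keywords[3][0], "\n"; // prints &lt;foo&gt;
```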
4.3.4 Symbols and Case Sensitivity
So you’ve put all the keywords for your language in? Now for a breather before we style them :). Symbols define what symbols your language uses. These are things like colons, brackets/braces, and other such general punctuation. No alphanumeric stuff belongs here, just as no symbols belong in the keywords section.
As of GeSHi version 1.0.7.21 the symbols section can be used in two ways:
- Flat usage:
- This mode is the suggested way for existing language files and languages that only need few symbols where no further differentiation is needed or desired. You simply put all the characters in an array under symbols as shown in the first example below. All symbols in flat usage belong to symbol style group 0.
- Group usage:
- This is a slightly more enhanced way to provide GeSHi with symbol information. To use groups, you create several subarrays, each containing only a subset of the symbols to highlight. Every array needs a unique index so that you can assign the appropriate styles later.
Here’s an example for flat symbol usage
PHP code:
'SYMBOLS' => array(
which is not too different from the newly introduced group usage shown below:
PHP code:
'SYMBOLS' => array(
Please note that versions before 1.0.7.21 will silently ignore this setting.
Also note that GeSHi 1.0.7.21 itself had some bugs in Symbol highlighting that could cause heavily scrambled code output.
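The difference between the two usages might look like this. Both arrays are sketches with made-up symbol sets and group indices, not excerpts from a shipped language file.

```php
<?php
// Flat usage: a plain list of symbols; all of them belong to
// symbol style group 0.
$flat = array(
    'SYMBOLS' => array('(', ')', '[', ']', '{', '}', '+', '-', '*', '/', '='),
);

// Group usage: each subarray has its own unique index, which is
// later used to assign a style to that group. The grouping here
// is illustrative only.
$grouped = array(
    'SYMBOLS' => array(
        0 => array('(', ')', '[', ']', '{', '}'),
        1 => array('+', '-', '*', '/'),
        2 => array('='),
    ),
);
```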
The following case sensitivity group relates to the keywords section: here you can set which keyword groups are case sensitive.
In the ‘CASE_SENSITIVE’ group there’s a special key, GESHI_COMMENTS, which is used to set whether comments are case sensitive or not. For example, BASIC has the REM statement, which while not being case sensitive is still alphanumeric; and as in the example given before about the Java language file using “import” as a single line comment, this can be useful sometimes. Set it to true if comments are case sensitive, false otherwise. All of the other indices correspond to indices in the 'KEYWORDS' section (see above).
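A sketch of such a group, assuming two keyword groups numbered 1 and 2. The define() is only a stand-in with a placeholder value so the snippet is self-contained; in a real language file the constant comes from geshi.php.

```php
<?php
// Stand-in for the constant that geshi.php normally provides.
if (!defined('GESHI_COMMENTS')) {
    define('GESHI_COMMENTS', 0);
}

$case_sensitive = array(
    // Comment markers match in any casing (think REM, rem, Rem).
    GESHI_COMMENTS => false,
    // Keyword group 1 must match the exact casing listed in 'KEYWORDS'.
    1 => true,
    // Keyword group 2 matches regardless of casing.
    2 => false,
);
```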
4.3.5 Styles for your Language File
This is the fun part! Here you get to choose the colours, fonts, backgrounds and anything else you’d like for your language file.
Here’s an example:
PHP code:
'STYLES' => array(
Note that all style rules should end with a semi-colon! This is important: GeSHi may add extra rules to the rules you specify (and will do so if a user tries to change your styles on the fly), so the last semi-colon in any stylesheet rule is important!
All strings here should contain valid stylesheet declarations (it’s also fine to have the empty string).
- ‘KEYWORDS’: This is an array, from keyword index to style. The index you use is the index you used in the keywords section to specify the keywords belonging to that group.
- ‘COMMENTS’: This is an array, from single-line comment index to style for that index. The index ‘MULTI’ is used for multiline comments (and cannot be an array). COMMENT_REGEXP entries use the style given for their key, as if they were single-line comments.
- ‘ESCAPE_CHAR’, ‘BRACKETS’ and ‘METHODS’: These are arrays with only one index: 0. You cannot add other indices to these arrays.
- ‘STRINGS’: This defines the various styles for the Quotemarks you defined earlier. If you don’t use multiple styles for strings there’s only one index: 0. Please also add this index in case no strings are present.
- ‘NUMBERS’: This sets the styles used to highlight numbers. The format used here depends on the format used to select the number formats to highlight. If you just used an integer (bitmask) for numbers, you have to specify one key per respective constant and/or include a key 0 as a default style. If you used an array for the number markup, copy the keys used there and assign the styles accordingly.
- ‘SYMBOLS’: This provides one key for each symbol group you defined above. If you used the flat usage make sure you include a key for symbols group 0.
- ‘REGEXPS’: This is an array with a style for each matching regex. Also, since 1.0.7.21, you can specify the name of a function to be called, that will be given the text matched by the regex, each time a match is found. Note that my testing found that create_function would not work with this due to a PHP bug, so you have to put the function definition at the top of the language file. Be sure to prefix the function name with geshi_[languagename]_ so as not to conflict with other functions!
- ‘SCRIPT’: For languages that use script delimiters, this is where you can style each block of script. For example, HTML and XML have blocks that begin with < and end with > (i.e. tags) and blocks that begin with & and end with ; (i.e. character entities), and you can set a style to apply to each whole block. You specify the delimiters for the blocks below. Note that many languages will not need this feature.
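As a rough illustration of how these indices fit together, the following sketch builds a styles array for a hypothetical language with two keyword groups and one single-line comment marker. The colours are arbitrary choices, and the loop at the end is merely a convenience check that every non-empty rule ends with a semi-colon.

```php
<?php
// Illustrative styles; every rule ends with a semi-colon.
$styles = array(
    'KEYWORDS' => array(
        1 => 'color: #b1b100;',
        2 => 'color: #000000; font-weight: bold;',
    ),
    'COMMENTS' => array(
        1 => 'color: #808080; font-style: italic;',
        'MULTI' => 'color: #808080; font-style: italic;',
    ),
    // These three only ever have index 0.
    'ESCAPE_CHAR' => array(0 => 'color: #000099; font-weight: bold;'),
    'BRACKETS' => array(0 => 'color: #66cc66;'),
    'METHODS' => array(0 => 'color: #006600;'),
    'STRINGS' => array(0 => 'color: #ff0000;'),
    'NUMBERS' => array(0 => 'color: #cc66cc;'),
    // Flat symbol usage, so only group 0 is styled.
    'SYMBOLS' => array(0 => 'color: #66cc66;'),
    'REGEXPS' => array(),
    'SCRIPT' => array(),
);

// Collect every non-empty rule that lacks the trailing semi-colon.
$missing = array();
foreach ($styles as $section => $rules) {
    foreach ($rules as $key => $rule) {
        if ($rule !== '' && substr(rtrim($rule), -1) !== ';') {
            $missing[] = "$section/$key";
        }
    }
}
```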
4.3.6 URLs for Functions
This section lets you specify a url to visit for each keyword group. Useful for pointing functions at their online manual entries.
Here is an example:
PHP code:
'URLS' => array(
The indices of this array correspond to the keyword groups you specified in the keywords section. The string {FNAME} marks where the name of the function is substituted in. So for the example above, if the keyword being highlighted is “echo”, then the keyword will be a URL pointing to http://www.php.net/echo. Because some languages (Java!) don’t keep a uniform URL for functions/classes, you may have trouble creating a URL for that language (though look in the java language file for a novel solution to its problem).
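The substitution can be pictured with a few lines of plain PHP. The helper example_keyword_url() below is hypothetical and only imitates the idea; GeSHi performs the replacement internally and its real logic may differ. Treating an empty string as "no link for this group" is also an assumption of this sketch.

```php
<?php
// One URL template per keyword group (group numbers are made up).
$urls = array(
    1 => '',
    2 => '',
    3 => 'http://www.php.net/{FNAME}',
);

// Hypothetical helper: returns the link for a keyword, or null
// when the group has no URL template.
function example_keyword_url(array $urls, $group, $keyword)
{
    if (empty($urls[$group])) {
        return null;
    }
    return str_replace('{FNAME}', $keyword, $urls[$group]);
}

echo example_keyword_url($urls, 3, 'echo'), "\n"; // prints http://www.php.net/echo
```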
4.3.7 Number Highlighting Support
If your language supports different formats of numbers (e.g. integers and float representations) and you want GeSHi to handle them differently you can select from a set of predefined formats.
PHP code:
'NUMBERS' =>
All the formats you want GeSHi to recognize are selected via a bitmask that is built by bitwise OR-ing the format constants. When styling, you use these constants to assign the proper styles. A style not assigned will automatically fall back to style group 0.
For a complete list of formats supported by GeSHi have a look into the sources of geshi.php.
If you want to define your own formats for numbers or when you want to group the style for two or more formats you can use the array syntax.
PHP code:
'NUMBERS' => array(
This creates 5 style groups 1..5 that will highlight each of the formats specified for each group. Styling of these groups doesn’t use the constants but uses the indices you just defined.
Instead of using those predefined constants you also can assign a PCRE that matches a number when using this advanced format.
The extended format hasn’t been exhaustively tested, so beware of bugs there.
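Both forms can be sketched as below. The constant names are the ones I believe geshi.php uses, but please verify them against the sources as suggested above; the values assigned here are placeholders so the snippet runs without GeSHi, and the style colours and group numbers are arbitrary.

```php
<?php
// Stand-ins for format constants from geshi.php. Names should be
// checked against the sources; values are placeholders.
$stand_ins = array(
    'GESHI_NUMBER_INT_BASIC'  => 1,
    'GESHI_NUMBER_HEX_PREFIX' => 4096,
    'GESHI_NUMBER_FLT_NONSCI' => 65536,
);
foreach ($stand_ins as $name => $value) {
    if (!defined($name)) {
        define($name, $value);
    }
}

// Bitmask form: OR the formats together.
$numbers_mask = GESHI_NUMBER_INT_BASIC
    | GESHI_NUMBER_HEX_PREFIX
    | GESHI_NUMBER_FLT_NONSCI;

// Styles for the bitmask form are keyed by constant, with 0 as fallback.
$number_styles = array(
    0 => 'color: #cc66cc;',
    GESHI_NUMBER_HEX_PREFIX => 'color: #208080;',
);

// Array form: each index is a style group; group 2 combines two formats.
$numbers_groups = array(
    1 => GESHI_NUMBER_INT_BASIC,
    2 => GESHI_NUMBER_HEX_PREFIX | GESHI_NUMBER_FLT_NONSCI,
);
```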
4.3.8 Object Orientation Support
Now we’re reaching the least-used section of a language file, which includes such goodies as object orientation support and context support. GeSHi can highlight methods and data fields of objects easily; all you need to do is tell it to do so and what the “splitter” is between object/method etc.
Here’s an example:
PHP code:
'OOLANG' => true,
If your language has object orientation, the value of 'OOLANG' is true, otherwise it is false. If it is object orientated, in the 'OBJECT_SPLITTER' value you put the htmlentities() version of the “splitter” between objects and methods/fields. If it is not, then make this the empty string.
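For a language whose splitter is "->", the two values might be written like this. It is a minimal sketch; calling htmlentities() here is only to show what the stored form looks like, and in a language file you would normally write the encoded string directly.

```php
<?php
$oo = array(
    'OOLANG' => true,
    // The htmlentities() form of the "->" splitter.
    'OBJECT_SPLITTER' => htmlentities('->'),
);

echo $oo['OBJECT_SPLITTER'], "\n"; // prints -&gt;
```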
4.3.9 Using Regular Expressions
Regular expressions are a good way to catch any other lexic that fits certain rules but can’t be listed as a keyword. A good example is variables in PHP: variables always start with either one or two “$” signs, then alphanumeric characters (a simplification). This is easy to catch with regular expressions.
And new to version 1.0.2, there is an advanced way of using regular expressions to catch certain things but highlight only part of those things. This is particularly useful for languages like XML.
Regular expressions use the PCRE syntax (perl-style), not the ereg() style!
Here is an example (this time the PHP file merged with the XML file):
PHP code:
0 => array(
As you can see there are two formats. One is the “simple” format used in GeSHi < 1.0.2, and the other is a more advanced syntax. Firstly, the simple syntax:
- May be in double quotes: To make it easier for those who always place their regular expressions in double quotes, you may place any regular expression here in double quotes if you wish.
- Don’t use brackets where possible: If you want to use brackets (()) then by all means give it a try, but I’m not sure whether under some circumstances GeSHi may throw a wobbly. You have been warned! If you want to use brackets, it would be better to use the advanced syntax.
- Don’t use the “everything” regex: (That’s the .*? regex). Use advanced syntax instead.
And now for advanced syntax, which gives you much more control over exactly what is highlighted:
- GESHI_SEARCH: This element specifies the regular expression to search for. If you plan to capture the output, use brackets (()). See how in the first example above, most of the regular expression is in one set of brackets (with the equals sign in other brackets). You should make sure that the part of the regular expression that is supposed to match what is highlighted is in brackets.
- GESHI_REPLACE: This is what the stuff matched by the regular expression will be replaced with. If you’ve grouped the stuff you want highlighted into brackets in the GESHI_SEARCH element, then you can use \\number to match that group, where number is a number corresponding to how many open brackets are between the open bracket of the group you want highlighted and the start of the GESHI_SEARCH string + 1. This may sound confusing, and it probably is, but if you’re familiar with how PHP’s regular expressions work you should understand. In the example above, the opening bracket for the stuff we want highlighted is the very first bracket in the string, so the number of brackets between that bracket and the start of the string is 0. So we add 1 and get our replacement string of \\1 (whew!).
  If you didn’t understand a word of that, make sure that there are brackets around the string in GESHI_SEARCH and use \\1 for GESHI_REPLACE ;)
- GESHI_MODIFIERS: Specify modifiers for your regular expression. If your regular expression includes the everything matcher (.*?), then your modifiers should include “s” and “i” (e.g. use ‘si’ for this).
- GESHI_BEFORE: Specifies a bracket group that should appear before the highlighted match (this bracketed group will not be highlighted). Use this if you had to match what you wanted by matching part of your regexp string to something before what you wanted to highlight, and you don’t want that part to disappear in the highlighted result.
- GESHI_AFTER: Specifies a bracket group that should appear after the highlighted match (this bracketed group will not be highlighted). Use this if you had to match what you wanted by matching part of your regexp string to something after what you wanted to highlight, and you don’t want that part to disappear in the highlighted result.
Is that totally confusing?
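A small worked sketch may help with the group numbering. The entry below is hypothetical (a simplified attribute matcher, not the one from the shipped language files), the define() calls are stand-ins with placeholder values, and the preg_replace() call only imitates the effect of the entry by wrapping the highlighted group in square brackets; GeSHi’s own replacement logic is more involved.

```php
<?php
// Stand-ins for the index constants that geshi.php normally provides.
$names = array('GESHI_SEARCH', 'GESHI_REPLACE', 'GESHI_MODIFIERS',
    'GESHI_BEFORE', 'GESHI_AFTER');
foreach ($names as $i => $name) {
    if (!defined($name)) {
        define($name, $i);
    }
}

// Hypothetical advanced-syntax entry: highlight an attribute name
// (group 1) and leave the equals sign (group 2) unhighlighted after it.
$regexp = array(
    GESHI_SEARCH => '([a-z_-]+)(=)',
    GESHI_REPLACE => '\\1',
    GESHI_MODIFIERS => 'i',
    GESHI_BEFORE => '',
    GESHI_AFTER => '\\2',
);

// Imitate the result: before + [highlighted part] + after.
$pattern = '/' . $regexp[GESHI_SEARCH] . '/' . $regexp[GESHI_MODIFIERS];
$replacement = $regexp[GESHI_BEFORE]
    . '[' . $regexp[GESHI_REPLACE] . ']'
    . $regexp[GESHI_AFTER];
$result = preg_replace($pattern, $replacement, 'width=100');

echo $result, "\n"; // prints [width]=100
```

Group 1 opens with the very first bracket of the search string, hence \\1; the equals sign sits in the second bracket group, hence \\2 for the part that follows the highlight.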
Api服务代码一: /// <summary> /// 服务器接收接口 /// </summary> [HttpPost] [Route("ReceiveFile&q ...