Code Markup in WordPress

In various pages of this site you’ll see sample code marked up for display. At first I manually marked up a couple samples, and still do for small snippets, but it didn’t take long to realize this would be a tedious task. I put together a shortcode and free-standing class object to more easily add code snippets to CreateSource.com.

There are a lot of code markup systems out there. Many of them rely on heavy Javascript libraries, and the results are quite impressive. I didn’t need anything elaborate or beautiful, and my first priority in building a site is optimization through lightweight resources. I built the markup class object as a wrapper for PHP’s highlight_string() and highlight_file() functions. Out of the box, the native functions work well enough, but my goals were to easily add small or large code bits directly in posts, and to have free-standing functionality that outputs larger chunks of code in a window overlay. I also wanted to style the highlighted code with site CSS so it can be changed at any time.

INI Values

The first problem to solve is that both highlighting functions output inline color styles like the ones below, which a style sheet can’t cleanly override.

    
     <span style="color:#779fe8">echo </span><span style="color:#ffffff">&quot;hello world&quot;</span><span style="color:#779fe8">;</span>
     
  
 

These are controlled by the settings in PHP’s INI. While you could modify the color values in the PHP configuration or override them at run time with ini_set(), it still outputs as an inline style and can’t be controlled by CSS.

    
       highlight.string  = #DD0000
       highlight.comment = #FF9900
       highlight.keyword = #007700
       highlight.default = #0000BB
       highlight.html    = #000000
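
As a quick check of the point above, here is a minimal sketch that overrides one color at run time and inspects the result. The color value is just the one from the sample output; nothing here is part of the CodeMarkup class.

```php
<?php
// Override one highlight color for this request only; php.ini is untouched.
ini_set('highlight.keyword', '#779fe8');

// Passing true as the second argument returns the markup instead of printing it.
$html = highlight_string('<?php echo "hello world";', true);

// The new color is used, but it still arrives as an inline style attribute.
$stillInline = strpos($html, 'style="color: #779fe8"') !== false;
var_dump($stillInline);
```

The color changes, but a style sheet still has no class name to hook into.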
  
 

My first task was to modify the INI values at run time and set them to a value we can replace with a class selector that can be applied by my CSS style sheet. I do this at run time so no other applications are affected by the INI change. Before any code is output or processed, I make a call to the markup object’s method setMarkupIniValues() and "swap out" the static INI values for the classes I wish to use.


<?php
$markup = new CodeMarkup();
$markup->setMarkupIniValues('highlight');

The single param is optional; 'highlight' is the default and is passed here for example only. It is used as a prefix for each of the INI setting names: string, comment, keyword, default, and html. The end result is

    
       highlight.string  = highlight-string
       highlight.comment = highlight-comment
       highlight.keyword = highlight-keyword
       highlight.default = highlight-default
       highlight.html    = highlight-html
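
The body of setMarkupIniValues() is not shown in this post, so the following is only a sketch of the idea as a standalone function. The function name is hypothetical, and the identifier-dash-name format is an assumption based on how getHighlightIdentifier() later splits the INI value on the dash.

```php
<?php
/**
 * Hypothetical standalone take on the idea behind setMarkupIniValues():
 * point every highlight.* setting at "<identifier>-<name>" for this request.
 *
 * @param string $identifier
 * @return array the values now in effect, keyed by setting name
 */
function setHighlightPlaceholders(string $identifier = 'highlight'): array
{
    $names = ['string', 'comment', 'keyword', 'default', 'html'];
    $applied = [];

    foreach ($names as $name) {
        // ini_set() only affects the current request, not php.ini.
        ini_set("highlight.$name", "$identifier-$name");
        $applied[$name] = ini_get("highlight.$name");
    }

    return $applied;
}

print_r(setHighlightPlaceholders('highlight'));
```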
  
 

Now when highlight_string() and highlight_file() read the INI, they output

    
     <span style="highlight-html">
      <span style="highlight-keyword">echo </span><span style="highlight-string">&quot;hello world&quot;</span><span style="highlight-keyword">;</span>
      </span>
     
  
 

At this point that’s of course not valid CSS, but we now have predictable static values we can swap out. Next we call highlight_file() to get code from a file, or pass a code string into highlight_string()*, to get marked up content with the inline styles. That content is the single param for a function that swaps the inline styles for legitimate CSS classes.


<?php
    /**
     * See setMarkupIniValues(), if not run before anything is processed this
     * will output default inline span styles. Swap out the ini value for a
     * class assignment. Example:  style="'highlight-comment';" should be
     * swapped for 'class="highlight-comment".
     *
     * @param string $content
     * @return string
     */
    protected function substituteIniValues($content)
    {
        $identifier = $this->getHighlightIdentifier();

        foreach ($this->highlight_functions as $value) {
            $content = preg_replace(
                "/style=\"([^\"]+)$value\"/m",
                "class=\"{$identifier}-{$value}\"",
                $content
            );
        }

        return $content;
    }

    /**
     * We set the ini values before code is run, but in the context of the shortcode
     * or standalone code, the identifier value is not available. To avoid passing
     * a bunch of params around, get the identifier we set from the actual ini
     * setting.
     *
     * @return string
     */
    protected function getHighlightIdentifier()
    {
        $first = $this->highlight_functions[0];
        list ($identifier, $func) = explode('-', ini_get("highlight.$first"));
        return $identifier;
    }

While I’d like to take credit for this concept, it is a simplified version of a comment left on the PHP INI Runtime Configuration page. Thank you Eric, wherever you are! Now we have code that looks like this:

    
     <span class="highlight-html">
      <span class="highlight-keyword">echo </span><span class="highlight-string">&quot;hello world&quot;</span><span class="highlight-keyword">;</span>
      </span>
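
For anyone who wants to try the round trip outside the class, here is a minimal standalone sketch. The function name is hypothetical and this is not the CodeMarkup class itself; it only assumes the placeholder values follow the identifier-dash-name pattern shown above.

```php
<?php
/**
 * Hypothetical standalone round trip: placeholder INI values in,
 * class-based markup out.
 *
 * @param string $code
 * @param string $identifier
 * @return string
 */
function highlightWithClasses(string $code, string $identifier = 'highlight'): string
{
    $names = ['string', 'comment', 'keyword', 'default', 'html'];

    foreach ($names as $name) {
        ini_set("highlight.$name", "$identifier-$name");
    }

    // highlight_string() now emits style="color: highlight-keyword" and so on.
    $html = highlight_string($code, true);

    foreach ($names as $name) {
        // Replace the whole style attribute with a class attribute.
        $html = preg_replace(
            '/style="[^"]*' . preg_quote("$identifier-$name", '/') . '"/',
            'class="' . $identifier . '-' . $name . '"',
            $html
        );
    }

    return $html;
}

$html = highlightWithClasses('<?php echo "hello world";');
echo $html, PHP_EOL;
```

The wrapper element differs between PHP versions, but the swap works the same way on both because it only looks at the style attribute.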
     
  
 

From here, I have control over my code snippets directly in the CSS stylesheet, and variations can be applied for different contexts through CSS alone. Next is to solve how I get the code to the class object.

Shortcode for Markup

The shortcode accepts $atts like any other shortcode, including an optional code param for short inline snippets.* It also takes an inline param that determines whether to display the code inline in the page or output a link that opens a new overlay window. This link is a sample of the window overlay functionality from the blog post PSR’s and Legibility, and below are the full shortcode params available to the page/post content. The shortcode requires a callback on the WordPress init hook to run setMarkupIniValues() as mentioned above, before any code is output or processed.

    
    /*
     * [code_sample
     *   file = string, quoted, invalid if code used below.
     *   text = string, quoted, optional, text on the link if the code opens in a new window, displays file name if absent.
     *   code = string, quoted, any code chunk for inline display
     *   inline = integer 1|0 or string 'true' or 'false', optional. Display inline or create a link to open an overlay window.
     *   pre_class = string, quoted, the class or classes to assign to the <pre> tag in code output
     *   anchor_class = string, quoted, anchor class when a link is output. Useful to attach event handlers when using Javascript.
     * ]
     */
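
The shortcode handler itself is not shown here, so the following is only a rough sketch of how the params above might be normalized. The function name and the default values are assumptions; add_shortcode() is the standard WordPress registration call, and it is guarded so the sketch also runs outside WordPress.

```php
<?php
/**
 * Hypothetical attribute normalizer for the [code_sample] shortcode.
 * The defaults below are assumptions, not the site's real values.
 *
 * @param array $atts
 * @return array
 */
function normalizeCodeSampleAtts(array $atts): array
{
    $defaults = [
        'file'         => '',
        'text'         => '',
        'code'         => '',
        'inline'       => '1',
        'pre_class'    => 'code-block',
        'anchor_class' => 'code-sample-link',
    ];

    // Keep only known keys, letting supplied values win over the defaults.
    $atts = array_merge($defaults, array_intersect_key($atts, $defaults));

    // Accept 1|0 as well as 'true'|'false' for the inline flag.
    $atts['inline'] = filter_var($atts['inline'], FILTER_VALIDATE_BOOLEAN);

    // file is invalid when code is used, so an inline code chunk wins.
    if ($atts['code'] !== '') {
        $atts['file'] = '';
    }

    // The link text falls back to the file name when absent.
    if ($atts['text'] === '') {
        $atts['text'] = $atts['file'];
    }

    return $atts;
}

// Register only when WordPress is loaded, so this file also runs standalone.
if (function_exists('add_shortcode')) {
    add_shortcode('code_sample', function ($atts) {
        $atts = normalizeCodeSampleAtts((array) $atts);
        // Rendering is left out of this sketch.
        return '';
    });
}

var_dump(normalizeCodeSampleAtts(['file' => 'sample.php', 'inline' => 'false']));
```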
 

Standalone Code Markup

When the anchor is output, the file name is set as the value for a data-file attribute. An AJAX post sends this value (only) to a standalone endpoint that only accepts the file post (see security below.) The markup object responds in the same way except that it returns the full code chunk for rendering in the overlay window.


<?php
/**
 * Outputs code ONLY in the specified directory, filtered with
 * highlight_file(). Code MUST exist in code sample directory
 * or it fails. Configure post shortcode accordingly.
 *
 * @param string CSS highlight class identifier (e.g $class-comment)
 * @param string CSS class to apply to the <pre> tag. Optional, defaults to shown
 * @param string directory of the code sample files
 * @return string
 */

require_once($_SERVER['DOCUMENT_ROOT'] . '/full/path/to/MarkupObject.php');
$markup = new CodeMarkup();
echo $markup
    ->setMarkupIniValues('highlight')
    ->fullPageMarkup('code-block full', '/path/to/sample-code-directory');

The window uses the same Javascript used to open image enlargement windows. The Javascript detects if this is an image enlargement or a content enlargement and renders it accordingly.

Security

Reading files in any server-side language is scary, and should be. Steps have been taken to ensure (hopefully!) a nefarious visitor cannot read files they shouldn’t, and no sensitive data is stored in the samples directory.

  • None of the sample code posts data or writes anything, anywhere. The samples all just output simple code that pretty much does nothing.
  • The code sample directory is hard coded in both the shortcode call and the param passed into the markup object. It is not part of any query or post so it cannot be modified. While it’s an easy task to find out what the code sample directory is, there’s not a lot anyone can do once they get there.
  • No code can be read if the file does not exist in the sample directory and the requested file cannot be index.php (which really doesn’t matter, the contents are above with mock values.)
  • Nothing in the code markup object or anything calling it uses file_get_contents() or any other file functions, only highlight_file().
  • The markup object accepts only post, not get or request, and the input is first run through standard PHP input filters before the filtering described below.
  • The file post param is a particular point of interest. It is cleansed to only the file name. Every character other than letters, dot, and dash is stripped. It cannot be empty and cannot be abused to append some other directory name to it. For example, posting /some-system-directory/file.php to the standalone script will result in file.php, and you’ll get an error.
  • The markup object can only be queried for .php, .css, .js, .pl, and .html files.
  • As mentioned, nothing in the code sample directory does anything that could potentially wreak havoc. None of it posts, writes, or reads anything but what is in the sample directory.
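
The file name rules in the list above could be sketched roughly like this. Both function names are hypothetical and this is not the site's actual filter; the extension list is taken from the bullet above, and the realpath() check is my own assumption about one way to confirm a file sits in the sample directory.

```php
<?php
/**
 * Hypothetical filter following the rules listed above. Returns the
 * cleaned file name, or null when the request should be rejected.
 */
function cleanSampleFileName(string $posted, array $allowed = ['php', 'css', 'js', 'pl', 'html']): ?string
{
    // Drop any directory part, whichever slash style was posted.
    $name = basename(str_replace('\\', '/', $posted));

    // Strip everything except letters, dot, and dash (digits go too).
    $name = preg_replace('/[^A-Za-z.\-]/', '', $name);

    if ($name === '' || strtolower($name) === 'index.php') {
        return null;
    }

    $ext = strtolower(pathinfo($name, PATHINFO_EXTENSION));

    return in_array($ext, $allowed, true) ? $name : null;
}

/**
 * Resolve the cleaned name inside the hard-coded sample directory and
 * confirm the file really exists there.
 */
function resolveSampleFile(string $posted, string $sampleDir): ?string
{
    $name = cleanSampleFileName($posted);
    if ($name === null) {
        return null;
    }

    $dir  = realpath($sampleDir);
    $path = $dir === false ? false : realpath($dir . DIRECTORY_SEPARATOR . $name);

    // realpath() returns false for missing files.
    if ($path === false || dirname($path) !== $dir) {
        return null;
    }

    return $path;
}

var_dump(cleanSampleFileName('/some-system-directory/file.php'));
var_dump(cleanSampleFileName('../../etc/passwd'));
```

The resolved path would then be the only thing handed to highlight_file().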

Other File Types (AKA Tricking highlight_file())

highlight_file() works great for PHP files, but other file types (HTML, CSS, and Javascript) come out as plain, single-color markup by default. I discovered this was because there was no opening PHP tag in the target file. To trick the highlighter into properly highlighting these files, each non-PHP sample file starts with an opening <?php tag, and a method in the markup object records the file extension. If it’s not a PHP extension (see above, only php, css, js, and html are allowed), the <?php tag is removed from the output.


<?php
    /**
     * Check that the cleansed input file is string with an allowed extension.
     * See method stripPhpTagForNonPhp(). If we know the extension, we can
     * get highlight_file() to markup the code, then we strip the PHP tag
     * later.
     *
     * @param string $file
     * @return bool
     */
    protected function isAllowedExtension($file = '')
    {
        $parts = explode('.', $file);
        if (! is_array($parts)) {
            // Return a bool here to match the documented return type.
            return false;
        }

        if ($this->setFileExtension($parts)) {
            return true;
        }

        return false;
    }

    /**
     * Examine the array of parts from input file, if it's in the allowed
     * extensions, set it.
     *
     * @param array $parts
     * @return bool
     */
    protected function setFileExtension($parts)
    {
        $ext = array_pop($parts);

        if (in_array($ext, $this->allowed_extensions)) {
            $this->file_extension = $ext;
            return true;
        }

        return false;
    }

    /**
     * This little trick allows us to mark up Javascript, CSS, and HTML
     * files, which normally output as a single color. The markup
     * engine in highlight_file() normally only works with an
     * opening PHP tag. In our sample files, put one in
     * there and remove it if the extension is not php.
     *
     * @param string $content
     * @return string
     */
    protected function stripPhpTagForNonPhp($content)
    {
        if (strtolower($this->file_extension) === 'php') {
            return $content;
        }

        return preg_replace('/&lt;\?php(<br\s*\/*>)?/i', '', $content);
    }
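
To see the trick in isolation, here is a small sketch that highlights the same CSS with and without a leading PHP tag. It uses highlight_string() instead of highlight_file() only to stay self-contained, and the countColors() helper is hypothetical, added just to make the difference measurable.

```php
<?php
// Plain CSS: no opening PHP tag, so the highlighter treats it all as HTML.
$css = "body {\n    color: red;\n}\n";

// The same CSS with an opening tag in front, as the sample files would have it.
$tagged = "<?php\n" . $css;

$plain   = highlight_string($css, true);
$tricked = highlight_string($tagged, true);

// Strip the escaped tag (and a trailing <br /> where PHP emits one).
$clean = preg_replace('/&lt;\?php(<br\s*\/*>)?/i', '', $tricked);

// Count distinct colors as a rough measure of how much got highlighted.
function countColors(string $html): int
{
    preg_match_all('/style="color: ([^"]+)"/', $html, $m);
    return count(array_unique($m[1]));
}

printf(
    "plain: %d color(s), tricked: %d color(s)\n",
    countColors($plain),
    countColors($clean)
);
```

The highlighter is still tokenizing the CSS as if it were PHP, so the coloring is approximate rather than language-aware.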

For example, here is a Javascript sample from the Google Analytics/Visualization project:


/**
 * Attach event handlers and load up any included JS files. If we don't use
 * this approach, some scripts won't load properly and throws errors.
 *
 * @return void
 */
function initializeCharts()
{
    $.when(
        $.getScript(analytics_js_path + 'ga-chart-config.min.js'),
        $.getScript(analytics_js_path + 'ga-data-formatting.min.js'),
        $.getScript(analytics_js_path + 'ga-vis-chart-options.min.js'),
        $.getScript(analytics_js_path + 'ga-chart-gui.min.js'),
        $.getScript(analytics_js_path + 'ga-load-analytics.js'),
        $.getScript(analytics_js_path + 'ga-render-charts.min.js'),
        $.getScript(analytics_js_path + 'ga-multi-query-render.min.js'),
        $.Deferred(function (deferred)
        {
            $(deferred.resolve);
        })
    ).fail(function ()
    {
        console.log('Failed to load required scripts, check the paths');
    }).done(function ()
    {
        GaChartGui
            .setSelectObjects()
            .setDatePickers();

        GaChartConfig.setChartConfigs();
        // Once the scripts are fully loaded, send the entire config so all charts are queried.
        GaLoadAnalytics.loadAnalyticsData(ga_configs);
    });
}

This code markup implementation is not as pretty as other code markup systems, but it doesn’t need to be. It follows one of my basic premises, and does exactly what I need it to and nothing more.

*There are still struggles with the inline code param. The way WordPress processes content often hoses up the code, and highlight_string() acts a bit differently than highlight_file(). When cornered in these cases, I just use highlight_file().
