How to modify the Gutenberg blocks from post content with PHP

The WordPress block editor, Gutenberg, introduced a new format for post content. It saves all the text, images, and other objects as blocks. To make this format parseable, so that the editor can read the blocks back, each content block is delimited by HTML comments that start with <!-- wp: .
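For illustration, here is roughly what stored block content looks like. The two blocks and the example.com URL are made up for this sketch. The snippet only counts opening delimiters and does not depend on WordPress.

```php
<?php
// Hypothetical post content with two blocks, written the way the editor stores it.
$exampleContent = <<<HTML
<!-- wp:paragraph -->
<p>Hello World</p>
<!-- /wp:paragraph -->

<!-- wp:image {"id":12,"sizeSlug":"large"} -->
<figure class="wp-block-image size-large"><img src="https://example.com/a.jpg" alt=""/></figure>
<!-- /wp:image -->
HTML;

// Count opening delimiters only; closing delimiters start with "<!-- /wp:".
$openingDelimiters = preg_match_all('/<!-- wp:[a-z]/', $exampleContent);
```

Block settings, when present, travel as a JSON object inside the opening comment, as in the image block above.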

There are several functions in PHP and JavaScript that parse this format into an array of objects (or arrays, in the case of PHP).

In JS there are several ways to parse that content, since the editor has a very rich JS API under the hood. This post focuses on PHP instead, because we may want to modify the content when a post is updated, just before the page renders the HTML, or when some other event happens.

The use cases are endless, but one thing is clear: we want to modify the content in PHP.

To do this, we need a mechanism that would convert the post content, which is a string, into an array.

We already have a function that does this: parse_blocks

And another function that converts back from array to the specific string format: serialize_blocks
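As a rough sketch of what these two functions work with, the array below is written by hand to mirror what parse_blocks would return for a single paragraph block. The round trip at the end only runs inside WordPress, which is why it is guarded with function_exists.

```php
<?php
// Hand-written array mirroring the parse_blocks() result for one paragraph block.
$blocks = [
    [
        'blockName'    => 'core/paragraph',
        'attrs'        => [],
        'innerBlocks'  => [],
        'innerHTML'    => "\n<p>Hello World</p>\n",
        'innerContent' => ["\n<p>Hello World</p>\n"],
    ],
];

// Inside WordPress, the string -> array -> string round trip would look like this.
if (function_exists('parse_blocks') && function_exists('serialize_blocks')) {
    $parsed = parse_blocks("<!-- wp:paragraph -->\n<p>Hello World</p>\n<!-- /wp:paragraph -->");
    echo serialize_blocks($parsed);
}
```

Every parsed block carries the same five keys, which is what the callbacks later in this post read and modify.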

So ideally we could use these two functions and get the job done. But wait…

We may have any number of blocks on a single page. Each of these blocks may contain nested blocks, and those nested blocks may contain other nested blocks… You get the idea.

We must take care of each block and also make sure that we don’t destroy this relationship between parent and nested blocks.

To achieve this I created the following class.

<?php

namespace ZeroWP;

class PostBlocks
{
    /**
     * @param \WP_Post|int $post
     * @param callable     $callback
     *
     * @return string
     */
    public static function getNewContent($post, $callback)
    {
        $post = get_post($post);

        // get_post() returns null when the post does not exist.
        if ( ! $post instanceof \WP_Post) {
            return '';
        }

        $content = $post->post_content;

        if (has_blocks($content)) {
            $blocks       = parse_blocks($content);
            $parsedBlocks = self::parseBlocks($blocks, $callback);

            $content = serialize_blocks($parsedBlocks);
        }

        return $content;
    }

    /**
     * @param array[]  $blocks   Blocks as returned by parse_blocks()
     * @param callable $callback
     *
     * @return array[]
     */
    protected static function parseBlocks($blocks, $callback): array
    {
        $allBlocks = [];

        foreach ($blocks as $block) {
            // Go into inner blocks and run this method recursively
            if ( ! empty($block['innerBlocks'])) {
                $block['innerBlocks'] = self::parseBlocks($block['innerBlocks'], $callback);
            }

            // Only named blocks go through the callback. Freeform content
            // between blocks is parsed as a block whose blockName is NULL.
            if ( ! empty($block['blockName'])) {
                $allBlocks[] = $callback($block); // the magic is here...
                continue;
            }

            // Keep unnamed blocks unchanged so the original order is preserved.
            $allBlocks[] = $block;
        }

        return $allBlocks;
    }
}

The class is deliberately incomplete. Because I can’t know what you want to modify in each block, you must provide a callback function.

As you may have noticed, the first method takes a second parameter named $callback. This is where you, the developer, pass in your own function.

The callback receives every named block in your post, one at a time, with nested blocks processed before their parent. It must return the block array.

Note: It is your duty to verify the block name. Otherwise, you’ll end up modifying all blocks.

Here are a few callback examples.

Replace a string in a block’s innerHTML

$callback = function($block)
{
    if ( ! empty($block['innerHTML'])) {
        $block['innerHTML'] = str_replace('Hello World', 'Hello Mars!', $block['innerHTML']);
    }

    return $block;
};

Replace a string in a block’s innerContent

$callback = function($block)
{
    if ( ! empty($block['innerContent']) && is_array($block['innerContent'])) {
        $block['innerContent'] = array_map(function ($item) {
            // Null items mark where inner blocks go, so leave them untouched.
            return is_string($item) ? str_replace('Hello World', 'Hello Mars!', $item) : $item;
        }, $block['innerContent']);
    }

    return $block;
};

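You can combine these two into one callback, and you will usually need to. The two properties, innerHTML and innerContent, contain similar data in different formats: innerHTML is a single string, while innerContent is an array of strings with null placeholders where nested blocks belong. Keep in mind that serialize_blocks rebuilds the markup from innerContent, so a change made only to innerHTML does not reach the returned content.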
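To make the difference between the two keys concrete, here is a hand-written sketch of a parsed group block that wraps one paragraph. The assembleInnerContent function is a hypothetical, simplified stand-in for the way serialization walks innerContent. It is not WordPress code, and it leaves out the delimiter comments.

```php
<?php
// Hand-written sketch of a parsed core/group block that wraps one paragraph.
$paragraph = [
    'blockName'    => 'core/paragraph',
    'attrs'        => [],
    'innerBlocks'  => [],
    'innerHTML'    => '<p>Hello World</p>',
    'innerContent' => ['<p>Hello World</p>'],
];

$group = [
    'blockName'    => 'core/group',
    'attrs'        => [],
    'innerBlocks'  => [$paragraph],
    'innerHTML'    => '<div class="wp-block-group"></div>',
    // The null entry marks the position of the first inner block.
    'innerContent' => ['<div class="wp-block-group">', null, '</div>'],
];

/**
 * Hypothetical helper: rebuilds a block's HTML from innerContent only,
 * filling each non-string slot with the next inner block.
 */
function assembleInnerContent(array $block): string
{
    $html  = '';
    $index = 0;

    foreach ($block['innerContent'] as $chunk) {
        $html .= is_string($chunk)
            ? $chunk
            : assembleInnerContent($block['innerBlocks'][$index++]);
    }

    return $html;
}

$html = assembleInnerContent($group);
```

Note that innerHTML is never read here. This is also why turning a null placeholder into an empty string would silently drop the nested block from the output.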

Modify block attributes

If a block contains some settings, then it may have some attributes. They are usually JSON objects, but in PHP they end up as associative arrays.

Block attributes live under the attrs key. Here is a simple example that replaces the url attribute with another URL. Keep in mind that this is just an example and the url attribute may not exist.

$callback = function($block)
{
    if (isset($block['attrs']['url'])){
        $block['attrs']['url'] = 'http://my-new-url.com';
    }

    return $block;
};

Make sure that you’re modifying the right block

The previous examples did not check the block name, so they would modify every block. Checking the name lets you apply different modifications to different block types.

Here is a quick snippet that would match the paragraph and heading blocks.

$callback = function($block)
{
    $neededBlocks = ['core/paragraph', 'core/heading'];

    // Strict comparison avoids loose matches against non-string values.
    if (isset($block['blockName']) && in_array($block['blockName'], $neededBlocks, true)){
        // We have the correct blocks and now we can modify them.
    }

    return $block;
};
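If different block types need different changes, one option is a small lookup table keyed by block name. The sketch below is only one way to organize this. The $handlers array, the acme/banner block, and the example.com URL are made up for illustration.

```php
<?php
// Hypothetical handlers keyed by block name; 'acme/banner' is a made-up custom block.
$handlers = [
    'core/paragraph' => function (array $block): array {
        $block['innerContent'] = array_map(function ($item) {
            // Null items are placeholders for nested blocks; keep them as they are.
            return is_string($item) ? str_replace('Hello World', 'Hello Mars!', $item) : $item;
        }, $block['innerContent']);

        return $block;
    },
    'acme/banner' => function (array $block): array {
        $block['attrs']['url'] = 'https://example.com/new';

        return $block;
    },
];

$callback = function ($block) use ($handlers) {
    $name = $block['blockName'] ?? null;

    // Blocks without a matching handler pass through unchanged.
    if (is_string($name) && isset($handlers[$name])) {
        return $handlers[$name]($block);
    }

    return $block;
};
```

Adding support for another block type then means adding one entry to the table, without touching the callback itself.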

Initiate the block modification process

$newContent = \ZeroWP\PostBlocks::getNewContent($postIdOrObject, $callback);

$newContent now holds the modified content. Nothing has been saved yet, so use it where you need it.

Example: Modify one or more blocks when the post is saved.

Let’s try it with a real-world example: how would we do it when a post is saved?

In the following example, we’ll alter the content of a post when it’s saved to DB, but only if it has blocks. If it does not, then we’ll do nothing.


add_action('wp_after_insert_post', function ($postId, $postAfter) {
    if (has_blocks($postAfter)) {
        $newContent = \ZeroWP\PostBlocks::getNewContent($postAfter, function ($block) {
            $neededBlocks = ['core/paragraph', 'core/heading'];

            if (isset($block['blockName']) && in_array($block['blockName'], $neededBlocks)) {
                if ( ! empty($block['innerHTML'])) {
                    $block['innerHTML'] = str_replace(['vulgar', 'dirty'], ['v****r', 'd***y'], $block['innerHTML']);
                }
                if ( ! empty($block['innerContent']) && is_array($block['innerContent'])) {
                    $block['innerContent'] = array_map(function ($item) {
                        // Each item is a string or null (inner block placeholder), never an array.
                        return is_string($item) ? str_replace(['vulgar', 'dirty'], ['v****r', 'd***y'], $item) : $item;
                    }, $block['innerContent']);
                }
            }

            return $block;
        }
        );

        wp_update_post(wp_slash([
            'ID'           => $postId,
            'post_content' => $newContent,
        ]), false, false);
    }
}, 99, 2);

This replaces the forbidden words with replacements of your choice and saves the result back to the DB.

Imagine you want to prevent bad words in your articles. You can do this automatically on save, without reviewing the content manually.
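One caveat with plain str_replace is that it also matches inside longer words. As a possible refinement, the sketch below masks whole words only, ignoring case. The maskWords function is a hypothetical helper, not part of WordPress, and it assumes single-byte (ASCII) words.

```php
<?php
/**
 * Hypothetical helper: masks whole-word, case-insensitive matches,
 * keeping the first and last letter (for example "dirty" becomes "d***y").
 */
function maskWords(string $text, array $words): string
{
    if (empty($words)) {
        return $text;
    }

    // Escape each word so it is matched literally inside the pattern.
    $alternatives = implode('|', array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words));

    return preg_replace_callback('/\b(' . $alternatives . ')\b/i', function ($match) {
        $word   = $match[1];
        $length = strlen($word);

        // Words too short to keep both ends are masked completely.
        if ($length < 3) {
            return str_repeat('*', $length);
        }

        return $word[0] . str_repeat('*', $length - 2) . $word[$length - 1];
    }, $text);
}
```

Inside the callback, this could be applied to each string item of innerContent in place of str_replace. Since those items are HTML, a listed word that appears inside a tag or attribute would be masked as well.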

Conclusion

As you can see, we can modify the saved blocks with PHP without breaking the relationship between parent and nested blocks. How and when to do it is up to you. You now have a good starting point, and with the help of WordPress hooks you’ll be able to do whatever you want.


Comments

  • Valentin 5 months ago

    Great post!

    Is there a difference in how innerContent is rendered when the block uses a PHP file instead of static HTML from the JS save method?

  • Andrei 5 months ago

    Yes.

    The innerContent will have some data when you actually implement the save method in JS.

    If you don’t implement this method in JS and instead render it using PHP, it will usually be empty.

    In the end, all you need in PHP are the attrs and innerBlocks.

    innerBlocks may be there if you implement the save method in JS with InnerBlocks.Content. Example:

    save: () => {
      const blockProps = useBlockProps.save();
     
      return (
          <div { ...blockProps }>
              <InnerBlocks.Content />
          </div>
       );
    }
    

    However, that’s not the content of the block that you register, but another nested level made up of individual blocks.
