Timo Abele Pascal Euhus Johannes Thorn Bence Hornák apparatchick Adelrich König (koa) Andreas Klemp (EXT) Rosi Martin Ralf D. Müller Jan-Niklas Vierheller Jody Winter Denny Israel Jeremie Bresson Nikolay Orozov Ralf D. Müller Vladi Joseph Staub Peter Stange oliver drewes Nils Mahlstaedt sabatmonk Heiko Stehli Alexander Herrmann Tim Riemer Luis Muniz Dierk Höppner bit-jkraushaar Alexander Schwartz Constantin Alfon Thomas Grabietz Lars Francke David M. Carr

10 minutes to read

About This Task

This task takes a generated HTML file, splits it by headline, and pushes it to your instance of Confluence. This lets you use the docs-as-code approach even if your organisation insists on using Confluence as its main document repository.


From the 01.01.2024 on, Atlassian turns off API V1 for Confluence Cloud, if there is a V2 equivalent. docToolchain versions from 3.1 on support API V2. If you are using an older version of docToolchain, you’ll need to upgrade to a newer version. To enable API V2, set useV1Api to false in the Confluence section of the docToolchain configuration file.


Currently, docToolchain only has full support for the old Confluence editor. The new editor is not fully supported yet. You can use the new editor, but you may experience some unexpected layout issues/ changes. To make use of the new editor you need to set enforceNewEditor to true in the Confluence section of the docToolchain configuration file.

Special Features

Easy Code Block Conversion

[source]-blocks are converted to code-macro blocks in Confluence.

Confluence supports a very limited list of languages supported for code block syntax highlighting. When specifying an unknown language, it would even display an error. Therefore, some transformation is applied.

  • If no language is given in the source block, it is explicitly set to plain text (because the default would be Java that might not always apply).

  • Some known and common AsciiDoc source languages are mapped to Confluence code block languages.

    source target note



    produces an acceptable highlighting



    only a specific shell is supported



    different name of language

  • If the language of the source block is not supported by Confluence, it is set to plain text as fallback to avoid the error.

Get a list of valid languages (and learn how to add others) here.

Minimal Impact on Non-Techie Confluence Users

Only pages and images that changed between task runs are published, and only those changes are notified to page watchers, cutting down on 'spam'.

Keywords Automatically Attached as Labels

:keywords: are attached as labels to every Confluence page generated using the publishToConfluence task. See Atlassian’s own guidelines on labels. Several keywords are allowed, and they must be separated by commas. For example: :keywords: label_1, label-2, label3, …​. Labels (keywords) must not contain a space character. Use either '_' or '-'.


You configure the publishToConfluence task in the file docToolchainConfig.groovy. It is located in the root of your project folder. We try to make the configuration self-explanatory, but below is some more information about each config option.


is an array of files to upload to Confluence with the ability to configure a different parent page for each file.


  • file: absolute or relative path to the asciidoc generated html file to be exported

  • url: absolute URL to an asciidoc generated html file to be exported

  • ancestorName (optional): the name of the parent page in Confluence as string; this attribute has priority over ancestorId, but if page with given name doesn’t exist, ancestorId will be used as a fallback

  • ancestorId (optional): the id of the parent page in Confluence as string; leave this empty if a new parent shall be created in the space

The following four keys can also be used in the global section below

  • spaceKey: page specific variable for the key of the confluence space to write to case sensitive! If the case is not correct, it can be that new page will be created but can’t be updated in the next run.

  • subpagesForSections (optional): The number of nested sub-pages to create. Default is '1'. '0' means creating all on one page. The following migration for removed configuration can be used.

    • allInOnePage = true is the same as subpagesForSections = 0

    • allInOnePage = false && createSubpages = false is the same as subpagesForSections = 1

    • allInOnePage = false && createSubpages = true is the same as subpagesForSections = 2

  • pagePrefix (optional): page specific variable, the pagePrefix will be a prefix for the page title and it’s sub-pages use this if you only have access to one confluence space but need to store several pages with the same title - a different pagePrefix will make them unique

  • pageSuffix (optional): same usage as prefix but appended to the title and it’s subpages

only 'file' or 'url' is allowed. If both are given, 'url' is ignored


The page ID of the parent page where you want your docs to be published. Go to this page, click Edit and the required ID will show up in the URL. Specify the ID as a string within the config file.


Endpoint of the confluenceAPI (REST) to be used and looks like https://[yourServer]/[context], while [context] is optional. If you use Confluence Cloud, you can omit the context. If you use Confluence Server, you may need to set a context, depending on your Confluence configuration.

rateLimit (since 3.2.0), The rate limit for Confluence requests. Default is 10 requests per second.


This feature is available for docToolchain >= 3.1 only

If you set this to false, ensure the api config is set to https://[yourCloudDomain]. (Mind no context given here)

If you are using Confluence Cloud, you can set this to false to use the new API V2. If you are using Confluence Server, you can set this to true to use the old API V1. If you are using Confluence Cloud and set this to false, you will get an error message, once Atlassian turns off API V1 (starting 01.01.2024).


Atlassian is currently rolling out a new editor for Confluence. If you want to use the new editor, you can set this to true. If you are using the old editor, you can set this to false. If you are using the new editor, you may experience some unexpected layout issues/ changes, since the new editor has yet no feature parity and therefore may be incompatible.


This boolean configuration determines whether the table of contents (ToC) is disabled on the page once uploaded to Confluence. false by default, so the ToC is active.


Confluence can’t handle two pages with the same name - even with different casing (lowercase, UPPERCASE, a mix). This script matches pages regardless of case and refuses to replace a page whose name differs from an existing page only by casing. Ideally, you should create a new Confluence space for each piece of larger documentation. If you are restricted and can’t create new spaces, you can use pagePrefix/pageSuffix to define a prefix/suffix for the doc so that it doesn’t conflict with other page names.


Set an optional comment for the new page version in Confluence.


For security reasons it is highly recommended to store your credentials in a separate file outside the Git repository, such as in your Home folder.

To authenticate with username and API token, use: credentials = "user:${new File("/users/me/apitoken").text}" or credentials = "user:${new File("/users/me/apitoken").text}"`.bytes.encodeBase64().toString()` to …​…​.. You can create an API-token in your profile.

To authenticate with username and password, use: credentials = …​…​

You can also set your username, password of apitoken as an environment variable. You then do the following: 1. Open the file that contains the environment variables: a. On a Mac, go to your Home folder and open the file .zpfrofile. 2. …​.

If you wish to simplify the injection of credentials from external sources, do the following: 1. In docToolchainConfig.groovy, do not enter the credentials. Make sure the credentials are escaped. 2. Create a file in the project or home directory. See the gradle user guide. 3. Open the file, and put the variables in it: - confluenceUser=myusername, and on a new line - confluencePass=myuserpassword


In situations where you have to use full user authorisation because of internal Confluence permission handling, you’ll need to add the API-token in addition to the credentials. The API-token cannot be added to the credentials because it’s used for user and password exchange. Therefore, the API-token can be added as parameter apikey, which makes the addition of the token a separate header field with key: keyId and value of apikey. An example (including storing of the real value outside this configuration) is: apikey = "${new File("/home/me/apitoken").text}".


You can pass a Confluence Personal Access Token as the bearerToken. It is an alternative to credentials. Do not confuse it with apiKey.


If you need to prefix your pages with a warning stating that 'this is generated content', this is where you do it.


If value is set to true, any links to local file references will be uploaded as attachments. The current implementation only supports a single folder, the name of which will be used as a prefix to validate whether your file should be uploaded. If you enable this feature, and use a folder which starts with 'attachment', an adaption of this prefix is required.


Limits the number of pages retrieved from the server to check if a page with this name already exists.

jiraServerId Stores the Jira server ID that your Confluence instance is connected to. If a value is set, all anchors pointing to a Jira ticket will be replaced by the Confluence Jira macro. To function properly, jiraRoot must be configured (see exportJiraIssues). Here’s an example:

All files to attach will need to be linked inside the document: link:attachment/myfolder/myfile.json[My API definition]


Stores the expected foldername of your output directory. Default is attachment.


If you need to provide proxy to access Confluence, you can set a map with the keys host (e.g. ''), port (e.g. '1234') and schema (e.g. 'http') of your proxy.


If this option is present and equal to confluence-open-api or swagger-open-api then any source block marked with class openapi will be wrapped in the Elitesoft Swagger Editor macro (see Elitesoft Swagger Editor). The key depends on the version of the macro.

For backward compatibility, if this option is present and equal to true, then again the Elitesoft Swagger Editor macro will be used.

If this option is present and equal to "open-api" then any source block marked with class openapi will be wrapped in Open API Documentation for Confluence macro: (see Open API Documentation for Confluence). A download source (yaml) button is shown by default.
Using the plugin can be handled on different ways.

  • copy/paste the content of the YAML file to the plugin without linking to the origin source by using the url to the YAML file

  • copy/paste the content of the YAML file to the plugin without linking to the origin source by using a YAML file in your project structure:

  • create a link between the plugin and the YAML file without copying the content into the plugin. The advantage following this way is that even in case the API specification is changed without re-generating the documentation, the new version of the configuration is used in Confluence.

//Configureation for publishToConfluence

confluence = [:]

// 'input' is an array of files to upload to Confluence with the ability
//          to configure a different parent page for each file.
// Attributes
// - 'file': absolute or relative path to the asciidoc generated html file to be exported
// - 'url': absolute URL to an asciidoc generated html file to be exported
// - 'ancestorName' (optional): the name of the parent page in Confluence as string;
//                             this attribute has priority over ancestorId, but if page with given name doesn't exist,
//                             ancestorId will be used as a fallback
// - 'ancestorId' (optional): the id of the parent page in Confluence as string; leave this empty
//                            if a new parent shall be created in the space
//                            Set it for every file so the page scanning is done only for the given ancestor page trees.
// The following four keys can also be used in the global section below
// - 'spaceKey' (optional): page specific variable for the key of the confluence space to write to
// - 'subpagesForSections' (optional): The number of nested sub-pages to create. Default is '1'.
//                                     '0' means creating all on one page.
//                                     The following migration for removed configuration can be used.
//                                     'allInOnePage = true' is the same as 'subpagesForSections = 0'
//                                     'allInOnePage = false && createSubpages = false' is the same as 'subpagesForSections = 1'
//                                     'allInOnePage = false && createSubpages = true' is the same as 'subpagesForSections = 2'
// - 'pagePrefix' (optional): page specific variable, the pagePrefix will be a prefix for the page title and it's sub-pages
//                            use this if you only have access to one confluence space but need to store several
//                            pages with the same title - a different pagePrefix will make them unique
// - 'pageSuffix' (optional): same usage as prefix but appended to the title and it's subpages
// only 'file' or 'url' is allowed. If both are given, 'url' is ignored
confluence.with {
    input = [
            [ file: "build/docs/html5/arc42-template-de.html" ],

    // endpoint of the confluenceAPI (REST) to be used
    // https://[yourServer]
    api = 'https://[yourServer]'

    // requests per second for confluence API calls
    rateLimit = 10

//    Additionally, spaceKey, subpagesForSections, pagePrefix and pageSuffix can be globally defined here. The assignment in the input array has precedence

    // the key of the confluence space to write to
    spaceKey = 'asciidoc'

    // if true, all pages will be created using the new editor v2
    // enforceNewEditor = false

    // variable to determine how many layers of sub pages should be created
    subpagesForSections = 1

    // the pagePrefix will be a prefix for each page title
    // use this if you only have access to one confluence space but need to store several
    // pages with the same title - a different pagePrefix will make them unique
    pagePrefix = ''

    pageSuffix = ''

    WARNING: It is strongly recommended to store credentials securely instead of commiting plain text values to your git repository!!!

    Tool expects credentials that belong to an account which has the right permissions to to create and edit confluence pages in the given space.
    Credentials can be used in a form of:
     - passed parameters when calling script (-PconfluenceUser=myUsername -PconfluencePass=myPassword) which can be fetched as a secrets on CI/CD or
     - gradle variables set through gradle properties (uses the 'confluenceUser' and 'confluencePass' keys)
    Often, same credentials are used for Jira & Confluence, in which case it is recommended to pass CLI parameters for both entities as
    -Pusername=myUser -Ppassword=myPassword

    //optional API-token to be added in case the credentials are needed for user and password exchange.
    //apikey = "[API-token]"

    // HTML Content that will be included with every page published
    // directly after the TOC. If left empty no additional content will be
    // added
    // extraPageContent = '<ac:structured-macro ac:name="warning"><ac:parameter ac:name="title" /><ac:rich-text-body>This is a generated page, do not edit!</ac:rich-text-body></ac:structured-macro>
    extraPageContent = ''

    // enable or disable attachment uploads for local file references
    enableAttachments = false

    // default attachmentPrefix = attachment - All files to attach will require to be linked inside the document.
    // attachmentPrefix = "attachment"

    // Optional proxy configuration, only used to access Confluence
    // schema supports http and https
    // proxy = [host: '', port: 1234, schema: 'http']

    // Optional: specify which Confluence OpenAPI Macro should be used to render OpenAPI definitions
    // possible values: ["confluence-open-api", "open-api", "swagger-open-api", true]. true is the same as "confluence-open-api" for backward compatibility
    // useOpenapiMacro = "confluence-open-api"

CSS Styling

Some AsciiDoctor features depend on specific CSS style definitions. Unless these styles are defined, some formatting that is present in the HTML version will not be represented when published to Confluence. To configure Confluence to include additional style definitions:

  1. Log in to Confluence as a space admin.

  2. Go to the desired space.

  3. Select Space tools > Look and Feel > Stylesheet.

  4. Click Edit then enter the desired style definitions.

  5. Click Save.

The default style definitions can be found in the AsciiDoc project as asciidoctor-default.css. You will most likely NOT want to include the entire thing, as some of the definitions are likely to disrupt Confluence’s layout.

The following style definitions are Confluence-compatible, and will enable the use of the built-in roles (big/small, underline/overline/line-through, COLOR/COLOR-background for the sixteen HTML color names):



Show source code of scripts/publishToConfluence.gradle or go directly to GitHub · docToolchain/scripts/publishToConfluence.gradle.
task publishToConfluence(
        description: 'publishes the HTML rendered output to confluence',
        group: 'docToolchain'
) {
    doLast {"docToolchain> docDir: "+docDir)
        config.confluence.api = findProperty("confluence.api")?:config.confluence.api
        //TODO default should be false, if the V1 has been removed in cloud
        config.confluence.useV1Api = findProperty("confluence.useV1Api") != null ?
            findProperty("confluence.useV1Api") : config.confluence.useV1Api != [:] ?
            config.confluence.useV1Api :true
        Asciidoc2ConfluenceTask.From(config, docDir).execute()
Show source code of core/src/main/groovy/org/docToolchain/scripts/asciidoc2confluence.groovy or go directly to GitHub · docToolchain/core/src/main/groovy/org/docToolchain/scripts/asciidoc2confluence.groovy.
package org.docToolchain.scripts

import org.docToolchain.atlassian.transformer.HtmlTransformer

 * Created by Ralf D. Mueller and Alexander Heusingfeld
 * this script expects an HTML document created with AsciiDoctor
 * in the following style (default AsciiDoctor output)
 * <div class="sect1">
 *     <h2>Page Title</h2>
 *     <div class="sectionbody">
 *         <div class="sect2">
 *            <h3>Sub-Page Title</h3>
 *         </div>
 *         <div class="sect2">
 *            <h3>Sub-Page Title</h3>
 *         </div>
 *     </div>
 * </div>
 * <div class="sect1">
 *     <h2>Page Title</h2>
 *     ...
 * </div>

    Additions for issue #342 marked as #342-dierk42

// some dependencies

import org.jsoup.nodes.Document
import org.jsoup.nodes.Element
import org.jsoup.nodes.Entities
import org.jsoup.nodes.TextNode

import groovy.transform.Field

import java.nio.charset.Charset
import java.nio.file.Path
import static

import org.docToolchain.atlassian.confluence.clients.ConfluenceClientV1
import org.docToolchain.atlassian.confluence.clients.ConfluenceClientV2
import org.docToolchain.configuration.ConfigService
import org.docToolchain.atlassian.confluence.ConfluenceService

ConfigService configService = new ConfigService(config)

ConfluenceService confluenceService = new ConfluenceService(configService)

def confluenceClient = configService.getConfigProperty("confluence.useV1Api") ?
        new ConfluenceClientV1(configService) :
        new ConfluenceClientV2(configService)

def CDATA_PLACEHOLDER_START = '<cdata-placeholder>'

def CDATA_PLACEHOLDER_END = '</cdata-placeholder>'

def baseUrl
def allPages
// #938-mksiva: global variable to hold input spaceKey passed in the Config.groovy
def spaceKeyInput
// configuration

def confluenceSpaceKey
def confluenceSubpagesForSections
def confluencePagePrefix
def confluencePageSuffix
//def baseApiPath = new URI(config.confluence.api).path
// helper functions

def MD5(String s) {

def parseAdmonitionBlock(block, String type) {
    content =".content").first()
    titleElement =".title")
    titleText = ''
    if(titleElement != null) {
        titleText = "<ac:parameter ac:name=\"title\">${titleElement.text()}</ac:parameter>"
    block.after("<ac:structured-macro ac:name=\"${type}\">${titleText}<ac:rich-text-body>${content}</ac:rich-text-body></ac:structured-macro>")

/*  #342-dierk42

    add labels to a Confluence page. Labels are taken from :keywords: which
    are converted as meta tags in HTML. Building the array: see below

    Confluence allows adding labels only after creation of a page.
    Therefore we need extra API calls.

    Currently the labels are added one by one. Suggestion for improvement:
    Build a label structure of all labels an place them with one call.

    Replaces exisiting labels. No harm
    Does not check for deleted labels when keywords are deleted from source
def addLabels = { def pageId, def labelsArray ->
    // Attach each label in a API call of its own. The only prefix possible
    // in our own Confluence is 'global'
    labelsArray.each { label ->
        label_data = [
            prefix : 'global',
            name : label
        confluenceClient.addLabel(pageId, label_data)
        println "added label " + label + " to page ID " + pageId

def uploadAttachment = { def pageId, String url, String fileName, String note ->
    def is
    def localHash
    if (url.startsWith('http')) {
        is = new URL(url).openStream()
        //build a hash of the attachment
        localHash = MD5(new URL(url).openStream().text)
    } else {
        is = new File(url).newDataInputStream()
        //build a hash of the attachment
        localHash = MD5(new File(url).newDataInputStream().text)

    def attachment = confluenceClient.getAttachment(pageId, fileName)
    if (attachment?.results) {
        // attachment exists. need an update?
        if (confluenceClient.attachmentHasChanged(attachment, localHash)) {
            //hash is different -> attachment needs to be updated
            confluenceClient.updateAttachment(pageId, attachment.results[0].id, is, fileName, note, localHash)
            println "    updated attachment"
    } else {
        confluenceClient.createAttachment(pageId, is, fileName, note, localHash)

def realTitle(pageTitle){
    confluencePagePrefix + pageTitle + confluencePageSuffix

def rewriteMarks (body) {
    // Confluence strips out mark elements.  Replace them with default formatting.'mark').wrap('<span style="background:#ff0;color:#000"></style>').unwrap()

// #352-LuisMuniz: Helper methods
// Fetch all pages of the defined config ancestorsIds. Only keep relevant info in the pages Map
// The map is indexed by lower-case title
def retrieveAllPages = { String spaceKey ->
    // #938-mksiva: added a condition spaceKeyInput is null, if it is null, it means that, space key is different, so re fetch all pages.
    if (allPages != null && spaceKeyInput == null) {
        println "allPages already retrieved"
    } else {
        def pageIds = []
        def checkSpace = false
        int pageLimit = config.confluence.pageLimit ? config.confluence.pageLimit : 100
        config.confluence.input.each { input ->
            if (!input.ancestorId) {
                // if one ancestorId is missing we should scan the whole space
                checkSpace = true;
        println (".")

        if(checkSpace) {
            allPages = confluenceClient.fetchPagesBySpaceKey(spaceKey, pageLimit)
        } else {
            allPages = confluenceClient.fetchPagesByAncestorId(pageIds, pageLimit)

// Retrieve a page by id with contents and version
def retrieveFullPage = { String id ->
    println("retrieving page with id " + id)

//if a parent has been specified, check whether a page has the same parent.
boolean hasRequestedParent(Map existingPage, String requestedParentId) {
    if (requestedParentId) {
        existingPage.parentId == requestedParentId
    } else {

def rewriteDescriptionLists(body) {
    def TAGS = [ dt: 'th', dd: 'td' ]'dl').each { dl ->
        // WHATWG allows wrapping dt/dd in divs, simply unwrap them'div').each { it.unwrap() }

        // group dts and dds that belong together, usually it will be a 1:1 relation
        // but HTML allows for different constellations
        def rows = []
        def current = [dt: [], dd: []]
        rows << current'dt, dd').each { child ->
            def tagName = child.tagName()
            if (tagName == 'dt' && current.dd.size() > 0) {
                // dt follows dd, start a new group
                current = [dt: [], dd: []]
                rows << current
            current[tagName] << child.tagName(TAGS[tagName])

        rows.each { row ->
            def sizes = [dt: row.dt.size(), dd: row.dd.size()]
            def rowspanIdx = [dt: -1, dd: sizes.dd - 1]
            def rowspan = Math.abs(sizes.dt - sizes.dd) + 1
            def max = sizes.dt
            if (sizes.dt < sizes.dd) {
                max = sizes.dd
                rowspanIdx = [dt: sizes.dt - 1, dd: -1]
            (0..<max).each { idx ->
                def tr = dl.appendElement('tr')
                ['dt', 'dd'].each { type ->
                    if (sizes[type] > idx) {
                        if (idx == rowspanIdx[type] && rowspan > 1) {
                            row[type][idx].attr('rowspan', "${rowspan}")
                    } else if (idx == 0) {
                        tr.appendElement(TAGS[type]).attr('rowspan', "${rowspan}")


def rewriteInternalLinks (body, anchors, pageAnchors) {
    // find internal cross-references and replace them with link macros'a[href]').each { a ->
        def href = a.attr('href')
        if (href.startsWith('#')) {
            def anchor = href.substring(1)
            def pageTitle = anchors[anchor] ?: pageAnchors[anchor]
            if (pageTitle && a.text()) {
                // as Confluence insists on link texts to be contained
                // inside CDATA, we have to strip all HTML and
                // potentially loose styling that way.
                a.wrap("<ac:link${anchors.containsKey(anchor) ? ' ac:anchor="' + anchor + '"' : ''}></ac:link>")
                   .before("<ri:page ri:content-title=\"${realTitle pageTitle}\"/>")

def rewriteJiraLinks = { body ->
    // find links to jira tickets and replace them with jira macros'a[href]').each { a ->
        def href = a.attr('href')
        if (href.startsWith(config.jira.api + "/browse/")) {
                def ticketId = a.text()
                a.before("""<ac:structured-macro ac:name=\"jira\" ac:schema-version=\"1\">
                     <ac:parameter ac:name=\"key\">${ticketId}</ac:parameter>
                     <ac:parameter ac:name=\"serverId\">${config.confluence.jiraServerId}</ac:parameter>

def rewriteOpenAPI (org.jsoup.nodes.Element body) {
    if (config.confluence.useOpenapiMacro == true || config.confluence.useOpenapiMacro == 'confluence-open-api') {'div.openapi  pre > code').each { code ->
            def parent=code.parent()
            def rawYaml=code.wholeText()
                    .wrap('<ac:structured-macro ac:name="confluence-open-api" ac:schema-version="1" ac:macro-id="1dfde21b-6111-4535-928a-470fa8ae3e7d"></ac:structured-macro>')
                    .replaceWith(new TextNode(rawYaml))
    } else if (config.confluence.useOpenapiMacro == 'swagger-open-api') {'div.openapi  pre > code').each { code ->
            def parent=code.parent()
            def rawYaml=code.wholeText()
                    .wrap('<ac:structured-macro ac:name="swagger-open-api" ac:schema-version="1" ac:macro-id="f9deda8a-1375-4488-8ca5-3e10e2e4ee70"></ac:structured-macro>')
                    .replaceWith(new TextNode(rawYaml))
    } else if (config.confluence.useOpenapiMacro == 'open-api') {
        def includeURL=null

        for (Element e :'div .listingblock.openapi')) {
            for (String s : e.className().split(" ")) {
                if (s.startsWith("url")) {
                    //include the link to the URL for the macro
                    includeURL = s.replace('url:', '')
        }'div.openapi  pre > code').each { code ->
            def parent=code.parent()
            def rawYaml=code.wholeText()

                .wrap('<ac:structured-macro ac:name="open-api" ac:schema-version="1" data-layout="default" ac:macro-id="4302c9d8-fca4-4f14-99a9-9885128870fa"></ac:structured-macro>')

            if (includeURL!=null)
                code.before('<ac:parameter ac:name="url">'+includeURL+'</ac:parameter>')
            else {
                //default: show download button
                code.before('<ac:parameter ac:name="showDownloadButton">true</ac:parameter>')
                    .replaceWith(new TextNode(rawYaml))

def getEmbeddedImageData(src){
    def imageData = src.split("[;:,]")
    def fileExtension = imageData[1].split("/")[1]
    // treat svg+xml as svg to be able to create a file from the embedded image
    // more MIME types:
    if(fileExtension == "svg+xml"){
        fileExtension = "svg"
    return Map.of(
        "fileExtension", fileExtension,
        "encoding", imageData[2],
        "encodedContent", imageData[3]

def handleEmbeddedImage(basePath, fileName, fileExtension, encodedContent) {
    def imageDir = "images/"
    if(config.imageDirs.size() > 0){
        def dir = config.imageDirs.find { it ->
            def configureImagesDir = it.replace('./', '/')
            Path.of(basePath, configureImagesDir, fileName).toFile().exists()
        if(dir != null){
            imageDir = dir.replace('./', '/')

    if(!Path.of(basePath, imageDir, fileName).toFile().exists()){
        println "Could not find embedded image at a known location"
        def embeddedImagesLocation = "/confluence/images/"
        new File(basePath + embeddedImagesLocation).mkdirs()
        def imageHash = MD5(encodedContent)
        println "Embedded Image Hash " + imageHash
        def image = new File(basePath + embeddedImagesLocation + imageHash + ".${fileExtension}")
            println "Creating image at " + basePath + embeddedImagesLocation
            image.withOutputStream {output ->
        fileName = imageHash + ".${fileExtension}"
        return Map.of(
            "filePath", image.canonicalPath,
            "fileName", fileName
    } else {
        return Map.of(
            "filePath", basePath + imageDir + fileName,
            "fileName", fileName

//modify local page in order to match the internal confluence storage representation a bit better
//definition lists are not displayed by confluence, so turn them into tables
//body can be of type Element or Elements

def parseBody(body, anchors, pageAnchors) {
    def uploads = []
    rewriteOpenAPI body'div.paragraph').unwrap()'div.ulist').unwrap()
    [   'note':'info',
        'tip':'tip'            ].each { adType, cType ->'.admonitionblock.'+adType).each { block ->
            parseAdmonitionBlock(block, cType)
    //special for the arc42-template'div.arc42help').select('.content')
            .wrap('<ac:structured-macro ac:name="expand"></ac:structured-macro>')
            .wrap('<ac:structured-macro ac:name="info"></ac:structured-macro>')
            .before('<ac:parameter ac:name="title">arc42</ac:parameter>')
            .wrap('<ac:rich-text-body><p></p></ac:rich-text-body>')'div.arc42help').unwrap()'div.title').wrap("<strong></strong>").before("<br />").wrap("<div></div>")'div.listingblock').wrap("<p></p>").unwrap()
    // see if we can find referenced images and fetch them
    new File("tmp/images/.").mkdirs()
    // find images, extract their URLs for later uploading (after we know the pageId) and replace them with this macro:
    // <ac:image ac:align="center" ac:width="500">
    // <ri:attachment ri:filename="deployment-context.png"/>
    // </ac:image>'img').each { img ->
        def src = img.attr('src')
        def imgWidth = img.attr('width')?:500
        def imgAlign = img.attr('align')?:"center"

        //it is not an online image, so upload it to confluence and use the ri:attachment tag
        if(!src.startsWith("http")) {
            def sanitizedBaseUrl = baseUrl.toString().replaceAll('\\\\','/').replaceAll('/[^/]*$','/')
            def newUrl
            def fileName
            //it is an embedded image
                def imageData = getEmbeddedImageData(src)
                def fileExtension = imageData.get("fileExtension")
                def encodedContent = imageData.get("encodedContent")
                fileName = img.attr('alt').replaceAll(/\s+/,"_").concat(".${fileExtension}")
                def embeddedImage = handleEmbeddedImage(sanitizedBaseUrl, fileName, fileExtension, encodedContent)
                newUrl = embeddedImage.get("filePath")
                fileName = embeddedImage.get("fileName")
            }else {
                newUrl = sanitizedBaseUrl + src
                fileName ='/')[-1]),"UTF-8")
            newUrl =,"UTF-8")
            println "    image: "+newUrl
            uploads <<  [0,newUrl,fileName,"automatically uploaded"]
            img.after("<ac:image ac:align=\"${imgAlign}\" ac:width=\"${imgWidth}\"><ri:attachment ri:filename=\"${fileName}\"/></ac:image>")
        // it is an online image, so we have to use the ri:url tag
        else {
            img.after("<ac:image ac:align=\"imgAlign\" ac:width=\"${imgWidth}\"><ri:url ri:value=\"${src}\"/></ac:image>")

        attachmentPrefix = config.confluence.attachmentPrefix ? config.confluence.attachmentPrefix : 'attachment''a').each { link ->

            def src = link.attr('href')
            println "    attachment src: "+src

            //upload it to confluence and use the ri:attachment tag
            if(src.startsWith(attachmentPrefix)) {
                def newUrl = baseUrl.toString().replaceAll('\\\\','/').replaceAll('/[^/]*$','/')+src
                def fileName ='/')[-1]),"UTF-8")
                newUrl =,"UTF-8")

                uploads <<  [0,newUrl,fileName,"automatically uploaded non-image attachment by docToolchain"]
                def uriArray=fileName.split("/")
                def pureFilename = uriArray[uriArray.length-1]
                def innerhtml = link.html()
                link.after("<ac:structured-macro ac:name=\"view-file\" ac:schema-version=\"1\"><ac:parameter ac:name=\"name\"><ri:attachment ri:filename=\"${pureFilename}\"/></ac:parameter></ac:structured-macro>")
                link.after("<ac:link><ri:attachment ri:filename=\"${pureFilename}\"/><ac:plain-text-link-body> <![CDATA[\"${innerhtml}\"]]></ac:plain-text-link-body></ac:link>")


        rewriteJiraLinks body

    rewriteMarks body
    rewriteDescriptionLists body
    rewriteInternalLinks body, anchors, pageAnchors
    //not really sure if must check here the type
    String bodyString = body
    if(body instanceof Element){
        bodyString = body.html()
    Element saneHtml = new Document("")
        .outputSettings(new Document.OutputSettings().syntax(Document.OutputSettings.Syntax.xml).prettyPrint(false))
    def pageString = new HtmlTransformer().transformToConfluenceFormat(saneHtml)

    return Map.of(
        "page", pageString,
        "uploads", uploads

def generateAndAttachToC(localPage) {
    def content
        def prefix = (config.confluence.extraPageContent?:'')
        content  = prefix+localPage
        def default_toc = '<p><ac:structured-macro ac:name="toc"/></p>'
        def prefix = (config.confluence.tableOfContents?:default_toc)+(config.confluence.extraPageContent?:'')
        content  = prefix+localPage
        def default_children = '<p><ac:structured-macro ac:name="children"><ac:parameter ac:name="sort">creation</ac:parameter></ac:structured-macro></p>'
        content += (config.confluence.tableOfChildren?:default_children)
    def localHash = MD5(localPage)
    content += '<ac:placeholder>hash: #'+localHash+'#</ac:placeholder>'
    return content

// the create-or-update functionality for confluence pages
// #342-dierk42: added parameter 'keywords'
def pushToConfluence = { pageTitle, pageBody, parentId, anchors, pageAnchors, keywords ->
    parentId = parentId?.toString()

    def deferredUpload = []
    String realTitleLC = realTitle(pageTitle).toLowerCase()
    String realTitle = realTitle(pageTitle)

    //try to get an existing page
    def parsedBody = parseBody(pageBody, anchors, pageAnchors)
    localPage = parsedBody.get("page")
    def localHash = MD5(localPage)
    localPage = generateAndAttachToC(localPage)

    // #938-mksiva: Changed the 3rd parameter from 'config.confluence.spaceKey' to 'confluenceSpaceKey' as it was always taking the default spaceKey
    // instead of the one passed in the input for each row.
    def pages = retrieveAllPages(confluenceSpaceKey)
    println("pages retrieved")
    // println "Suche nach vorhandener Seite: " + pageTitle
    Map existingPage = pages[realTitleLC]
    def page

    if (existingPage) {
        if (hasRequestedParent(existingPage, parentId)) {
            page = retrieveFullPage( as String)
        } else {
            page = null
    } else {
        page = null
    // println "Gefunden: " + + " Titel: " + page.title

    if (page) {
        println "found existing page: " + +" version "+page.version.number

        //extract hash from remote page to see if it is different from local one
        def remotePage =

        def remoteHash = remotePage =~ /(?ms)hash: #([^#]+)#/
        remoteHash = remoteHash.size()==0?"":remoteHash[0][1]

        // println "remoteHash: " + remoteHash
        // println "localHash:  " + localHash

        if (remoteHash == localHash) {
            println "page hasn't changed!"
            deferredUpload.each {
                uploadAttachment(page?.id, it[1], it[2], it[3])
            deferredUpload = []
            // #324-dierk42: Add keywords as labels to page.
            if (keywords) {
                addLabels(, keywords)
        } else {
            def newPageVersion = (page.version.number as Integer) + 1

                    config.confluence.pageVersionComment ?: '',
            println "> updated page "
            deferredUpload.each {
                uploadAttachment(, it[1], it[2], it[3])
            deferredUpload = []
            // #324-dierk42: Add keywords as labels to page.
            if (keywords) {
                addLabels(, keywords)
    } else {
        //#352-LuisMuniz if the existing page's parent does not match the requested parentId, fail
        if (existingPage && !hasRequestedParent(existingPage, parentId)) {
            throw new IllegalArgumentException("Cannot create page, page with the same "
                    + "title=${existingPage.title} "
                    + "with id=${} already exists in the space. "
                    + "A Confluence page title must be unique within a space, consider specifying a 'confluencePagePrefix' in ConfluenceConfig.groovy")
        //create a page
        page = confluenceClient.createPage(
                config.confluence.pageVersionComment ?: '',
        println "> created page "+page?.id
        deferredUpload.each {
            uploadAttachment(page?.id, it[1], it[2], it[3])
        deferredUpload = []
        // #324-dierk42: Add keywords as labels to page.
        if (keywords) {
            addLabels(page?.id, keywords)
        return page?.id

def parseAnchors(page) {
    def anchors = [:]'[id]').each { anchor ->
        def name = anchor.attr('id')
        anchors[name] = page.title
        anchor.before("<ac:structured-macro ac:name=\"anchor\"><ac:parameter ac:name=\"\">${name}</ac:parameter></ac:structured-macro>")

def pushPages
pushPages = { pages, anchors, pageAnchors, labels ->
    pages.each { page ->
        page.title = page.title.trim()
        println page.title
        def id = pushToConfluence page.title, page.body, page.parent, anchors, pageAnchors, labels
        page.children*.parent = id
        // println "Push children von id " + id
        pushPages page.children, anchors, pageAnchors, labels
        // println "Ende Push children von id " + id

def recordPageAnchor(head) {
    def a = [:]
    if (head.attr('id')) {
        a[head.attr('id')] = head.text()

def promoteHeaders(tree, start, offset) {
    (start..7).each { i ->"h${i}").tagName("h${i-offset}").before('<br />')

def retrievePageIdByName = { String name ->
    def data = confluenceClient.retrievePageIdByName(name, confluenceSpaceKey)
    return data?.results?.get(0)?.id

def getPagesRecursive(Element element, String parentId, Map anchors, Map pageAnchors, int level, int maxLevel) {
    def pages = []"div.sect${level}").each { sect ->
        def title ="h${level + 1}").text()
        pageAnchors.putAll(recordPageAnchor("h${level + 1}")))
        Elements pageBody
        if (level == 1) {
            pageBody ='div.sectionbody')
        } else {
            pageBody = new Elements(sect)
  "h${level + 1}").remove()
        def currentPage = [
            title: title,
            body: pageBody,
            children: [],
            parent: parentId

        if (maxLevel > level) {
            currentPage.children.addAll(getPagesRecursive(sect, null, anchors, pageAnchors, level + 1, maxLevel))
  "div.sect${level + 1}").remove()
        } else {
  "div.sect${level + 1}").unwrap()
        promoteHeaders sect, level + 2, level + 1
        pages << currentPage
    return pages

def getPages(Document dom, String parentId, int maxLevel) {
    def anchors = [:]
    def pageAnchors = [:]
    def sections = pages = []
    def title ='h1').text()
    if (maxLevel <= 0) {'div#content').each { pageBody ->
            promoteHeaders pageBody, 2, 1
            def page = [title   : title,
                        body    : pageBody,
                        children: [],
                        parent  : parentId]
            pages << page
            sections = page.children
            parentId = null
    } else {
        // let's try to select the "first page" and push it to confluence'div#preamble div.sectionbody').each { pageBody ->
            def preamble = [
                title: title,
                body: pageBody,
                children: [],
                parent: parentId
            pages << preamble
            sections = preamble.children
            parentId = null
        sections.addAll(getPagesRecursive(dom, parentId, anchors, pageAnchors, 1, maxLevel))
    return [pages, anchors, pageAnchors]

if(config.confluence.inputHtmlFolder) {
    htmlFolder = "${docDir}/${config.confluence.inputHtmlFolder}"
    println "Starting processing files in folder: " + config.confluence.inputHtmlFolder
    def dir = new File(htmlFolder)

    dir.eachFileRecurse (FILES) { fileName ->
        if (fileName.isFile()){
            def map = [file: config.confluence.inputHtmlFolder+fileName.getName()]

config.confluence.input.each { input ->
    // TODO check why this is necessary
    if(input.file) {
        input.file = confluenceService.checkAndBuildCanonicalFileName(input.file)
        //  assignend, but never used in pushToConfluence(...) (fixed here)
        // #938-mksiva: assign spaceKey passed for each file in the input
        spaceKeyInput = input.spaceKey
        confluenceSpaceKey = input.spaceKey ?: config.confluence.spaceKey
        confluenceCreateSubpages = (input.createSubpages != null) ? input.createSubpages : config.confluence.createSubpages
        confluenceAllInOnePage = (input.allInOnePage != null) ? input.allInOnePage : config.confluence.allInOnePage
        if (!(confluenceCreateSubpages instanceof ConfigObject && confluenceAllInOnePage instanceof ConfigObject)) {
            println "ERROR:"
            println "Deprecated configuration, migrate as follows:"
            println "allInOnePage = true -> subpagesForSections = 0"
            println "allInOnePage = false && createSubpages = false -> subpagesForSections = 1"
            println "allInOnePage = false && createSubpages = true -> subpagesForSections = 2"
            throw new RuntimeException("config problem")
        confluenceSubpagesForSections = (input.subpagesForSections != null) ? input.subpagesForSections : config.confluence.subpagesForSections
        if (confluenceSubpagesForSections instanceof ConfigObject) {
            confluenceSubpagesForSections = 1
    //  hard to read in case of using :sectnums: -> so we add a suffix
        confluencePagePrefix = input.pagePrefix ?: config.confluence.pagePrefix
    //  added
        confluencePageSuffix = input.pageSuffix ?: config.confluence.pageSuffix
        confluencePreambleTitle = input.preambleTitle ?: config.confluence.preambleTitle
        if (!(confluencePreambleTitle instanceof ConfigObject)) {
            println "ERROR:"
            println "Deprecated configuration, use first level heading in document instead of preambleTitle configuration"
            throw new RuntimeException("config problem")
        File htmlFile = new File(input.file)
        baseUrl = htmlFile
        Document dom = confluenceService.parseFile(htmlFile)

        // if ancestorName is defined try to find machingAncestorId in confluence
        def retrievedAncestorId
        if (input.ancestorName) {
            // Retrieve a page id by name
            retrievedAncestorId = retrievePageIdByName(input.ancestorName)
            println("Retrieved pageId for given ancestorName '${input.ancestorName}' is ${retrievedAncestorId}")
        // if input does not contain an ancestorName, check if there is ancestorId, otherwise check if there is a global one
        def parentId = retrievedAncestorId ?: input.ancestorId ?: config.confluence.ancestorId

        // if parentId is still not set, create a new parent page (parentId = null)
        parentId = parentId ?: null
        //println("ancestorName: '${input.ancestorName}', ancestorId: ${input.ancestorId} ---> final parentId: ${parentId}")

        // #342-dierk42: get the keywords from the meta tags
        def keywords = confluenceService.getKeywords(dom)

        def (pages, anchors, pageAnchors) = getPages(dom, parentId, confluenceSubpagesForSections)
        pushPages pages, anchors, pageAnchors, keywords
        if (parentId) {
            println "published to ${config.confluence.api - "rest/api/"}/spaces/${confluenceSpaceKey}/pages/${parentId}"
        } else {
            println "published to ${config.confluence.api - "rest/api/"}/spaces/${confluenceSpaceKey}"