How to gather Confluence page data with PowerShell

Nate Blevins|June 26, 2019

Have you ever wanted to gather data from a Confluence page automagically? In this post, I'll show you how to do just that with PowerShell. I use Confluence pages internally to track lists of items I'm working on and to view current progress. For instance, I have a goal to finish a list of items by the end of this quarter, but I don't want to track my progress manually. I want to automate it!

[Screenshot: Nate's Really Important Life Changing Things To Do]

Prerequisites

There are several different ways to gather data from webpages. I’m going to focus on one way using PowerShell and the ConfluencePS module.

Helpful cmdlets along the way: Set-ConfluenceInfo, Get-ConfluencePage, and ConvertFrom-String.

First, you'll need to get a token from Atlassian so you have access to the data you want. To do this, go to Atlassian's homepage, log in to your Atlassian account, and select the 'Create API Token' button.

Second, you’ll need to install the ConfluencePS module using the following command.

Install-Module ConfluencePS
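If you're not sure whether the module is already on the machine, you can check before installing. This is only a sketch: Test-ModuleInstalled is a helper name I made up for illustration, and the -Scope CurrentUser suggestion assumes you'd rather not install for all users.

```powershell
# Hypothetical helper (not part of ConfluencePS): report whether a module is installed.
function Test-ModuleInstalled {
    param (
        [Parameter(Mandatory = $true)]
        [string]
        $Name
    )
    # Get-Module -ListAvailable returns nothing when no matching module is found
    [bool](Get-Module -ListAvailable -Name $Name)
}

if (Test-ModuleInstalled -Name 'ConfluencePS') {
    'ConfluencePS is already installed.'
}
else {
    'ConfluencePS is missing. Run: Install-Module ConfluencePS -Scope CurrentUser'
}
```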

Third, I like to add parameters so that I can easily re-use the script without having to edit it. As you can see, I'm adding the parameters PageID and Space. PageID is mandatory, and Space has a default value, since most of the time I will be using this script within the same space.

param (
    [Parameter(Mandatory = $true, ParameterSetName = 'PageID')]
    [int]
    $PageID,
    [string]
    $Space = "TS"
)
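To see how the mandatory parameter and the default value behave, here is a small sketch that wraps the same kind of parameter block in a function. Get-ConfluencePageUrl, the BaseUri default, and the 'HR' space key are all made up for illustration; only PageID and Space come from the script.

```powershell
# Hypothetical function, used only to illustrate the parameter block
function Get-ConfluencePageUrl {
    param (
        [Parameter(Mandatory = $true)]
        [int]
        $PageID,
        [string]
        $Space = 'TS',
        [string]
        $BaseUri = 'https://MyCompany.atlassian.net/wiki/'
    )
    # ${BaseUri} marks where the variable name ends inside the string
    "${BaseUri}spaces/$Space/pages/$PageID"
}

# Space falls back to its default when it is not supplied
$DefaultSpaceUrl = Get-ConfluencePageUrl -PageID 292782089

# Passing -Space overrides the default
$OtherSpaceUrl = Get-ConfluencePageUrl -PageID 292782089 -Space 'HR'

$DefaultSpaceUrl
$OtherSpaceUrl
```

Leaving out -PageID in an interactive session makes PowerShell prompt for it, which is what Mandatory buys you.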

Finally, I highly suggest that you encrypt your credentials. I'm using a key file so that others can use my script without needing their own login info. If you need more information, check out the great article Kris wrote, Secure Password with PowerShell: Encrypting Credentials.

# Login Info
$Username = "MyUsername"
$PasswordFile = "\\Server\Share\AtlassianToken.txt"
$AESKey = "\\Server\Share\AES.key"
$Password = Get-Content $PasswordFile | ConvertTo-SecureString `
    -Key (Get-Content $AESKey)
$Credential = New-Object System.Management.Automation.PsCredential("$Username",$Password)
$BaseUri = "https://MyCompany.atlassian.net/wiki/"
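The key file and the encrypted token file have to exist before the login block can read them. Here is a minimal sketch of that one-time setup, assuming a random 256-bit AES key stored one byte per line. The temp-folder paths and the token value are placeholders of mine, not the author's real setup.

```powershell
# One-time setup sketch. Paths and the token value are placeholders.
$KeyFile    = Join-Path ([System.IO.Path]::GetTempPath()) 'AES.key'
$TokenFile  = Join-Path ([System.IO.Path]::GetTempPath()) 'AtlassianToken.txt'
$PlainToken = 'example-api-token'

# Generate a random 256-bit AES key and store it one byte per line
$Key = New-Object byte[] 32
$Rng = [System.Security.Cryptography.RandomNumberGenerator]::Create()
$Rng.GetBytes($Key)
$Rng.Dispose()
$Key | Set-Content -Path $KeyFile

# Encrypt the token with that key and save the encrypted text
ConvertTo-SecureString -String $PlainToken -AsPlainText -Force |
    ConvertFrom-SecureString -Key $Key |
    Set-Content -Path $TokenFile

# Read both files back the same way the script does
$Password = Get-Content -Path $TokenFile |
    ConvertTo-SecureString -Key ([byte[]](Get-Content -Path $KeyFile))
$Credential = New-Object System.Management.Automation.PSCredential('MyUsername', $Password)
```

Keep in mind that anyone who can read both files can decrypt the token, so the share permissions matter as much as the encryption.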

Getting started

Use 'Set-ConfluenceInfo' to set defaults for common parameters in your current session so you don't have to pass those parameters over and over again.

# Predefining these parameters so we don't have to add them per command
Set-ConfluenceInfo -BaseUri $BaseUri -Credential $Credential

PageID is a unique ID within each Confluence page link. ‘Get-ConfluencePage’ finds the specific page I want using the PageID, and from there can get other information for that page such as its title and body.

# Page info
$GetPageInfo = Get-ConfluencePage -PageID $PageID
$Title = ($GetPageInfo).Title
$PageBody = ($GetPageInfo).Body

If you output the '$PageBody' variable, you'll notice that the information you're looking for comes back surrounded by tags. We are going to use the 'ConvertFrom-String' cmdlet to parse that data for us by creating a template. There are other ways to do this, but I personally found this one really easy to work with.

In this case, we can see the TaskID, Checkbox Status, and Body for each task.

<ac:task-id>112</ac:task-id> <ac:task-status>incomplete </ac:task-status><ac:task-body><span class="placeholder-inline-tasks">Sneak like a ninja.</span><ac:task-list>

ConvertFrom-String template

All we need to do is take a few of these examples, add them to a template, mark the areas that contain the information we want by surrounding them in curly braces {}, and add a property name. The more examples you use, the better. I've found that three work really well for me.

$Template = @'
<ac:task-id>{TaskID*:139}</ac:task-id>
<ac:task-status>{Checkbox:incomplete}</ac:task-status>
<ac:task-body>{Body:Learn to yodel}</ac:task-body>
<ac:task-id>{TaskID*:189}</ac:task-id>
<ac:task-status>{Checkbox:complete}</ac:task-status>
<ac:task-body>{Body:Sneak like a ninja}</ac:task-body>
<ac:task-id>{TaskID*:196}</ac:task-id>
<ac:task-status>{Checkbox:complete}</ac:task-status>
<ac:task-body>{Body:With your weight on that foot slowly advance back to your right foot and repeat the process}</ac:task-body>
'@

For example, if I had a list of names, phone numbers and ages I could convert them into properties and then add them to a variable. Once I have that data then I can use it how I want.

$List = @'
Nate, 801-123-4567, 32
Mike, 801-123-4568, 45
Ben, 801-123-4569, 24
Dan, 801-123-4570, 56
'@

$Template = @'
{Name*:Nate}, {Phone:801-123-4567}, {Age:32}
{Name*:Ben}, {Phone:801-123-4569}, {Age:24}
'@

$Data = (ConvertFrom-String -InputObject $List -TemplateContent $Template)
[Screenshot: ConvertFrom-String Template]
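As far as I know, ConvertFrom-String ships with Windows PowerShell 5.x but not with PowerShell 6 and later. For simple delimited text like this list, a plain -split gets to the same kind of objects without it. The sketch below swaps in that technique (it is not template parsing) and then shows one way to use the resulting data; the age filter is just an example I picked.

```powershell
$List = @'
Nate, 801-123-4567, 32
Mike, 801-123-4568, 45
Ben, 801-123-4569, 24
Dan, 801-123-4570, 56
'@

# Split the text into lines, then each line into trimmed fields
$Data = foreach ($Line in ($List -split '\r?\n')) {
    $Name, $Phone, $Age = ($Line -split ',').Trim()
    [pscustomobject]@{
        Name  = $Name
        Phone = $Phone
        Age   = [int]$Age
    }
}

# Once the data is in objects, filter or sort it like any other output
$Over30 = @($Data | Where-Object { $_.Age -gt 30 })
$Over30.Name
```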

Now that I have a valid template, I'm going to gather all the data I need so that I can finally calculate how much progress I've made by counting the complete and incomplete checkboxes.

# Gather page data into variables for output
(ConvertFrom-String -InputObject $PageBody -TemplateContent $Template -OutVariable results) | Out-Null
$CheckboxCount = $Results.Checkbox.Count
$CheckboxCompleteCount = ($Results.Checkbox | Where-Object { $_ -eq "complete" }).Count
$CheckboxIncompleteCount = ($Results.Checkbox | Where-Object { $_ -eq "incomplete" }).Count
$CompletionPercentage = ([math]::round($CheckboxCompleteCount / $CheckboxCount * 100,2))
$IncompletePercentage = ([math]::round($CheckboxIncompleteCount / $CheckboxCount * 100,2))
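If ConvertFrom-String isn't an option, the same counts can be pulled with a regular expression. This is a minimal sketch, assuming the task markup looks like the sample shown earlier. The $PageBody content is made up for illustration, and the pattern is my own, not something ConfluencePS provides.

```powershell
# Made-up page body with four tasks; one status has trailing whitespace
$PageBody = @'
<ac:task-id>139</ac:task-id><ac:task-status>incomplete </ac:task-status><ac:task-body>Learn to yodel</ac:task-body>
<ac:task-id>189</ac:task-id><ac:task-status>complete</ac:task-status><ac:task-body>Sneak like a ninja</ac:task-body>
<ac:task-id>196</ac:task-id><ac:task-status>complete</ac:task-status><ac:task-body>Repeat the process</ac:task-body>
<ac:task-id>201</ac:task-id><ac:task-status>complete</ac:task-status><ac:task-body>Pet a large dog</ac:task-body>
'@

# Capture the status text, tolerating whitespace around it
$Pattern = '<ac:task-status>\s*(complete|incomplete)\s*</ac:task-status>'
$Statuses = @([regex]::Matches($PageBody, $Pattern) | ForEach-Object { $_.Groups[1].Value })

$CheckboxCount = $Statuses.Count
$CheckboxCompleteCount = @($Statuses | Where-Object { $_ -eq 'complete' }).Count
$CheckboxIncompleteCount = @($Statuses | Where-Object { $_ -eq 'incomplete' }).Count

# Guard against dividing by zero when the page has no checkboxes
if ($CheckboxCount -gt 0) {
    $CompletionPercentage = [math]::Round($CheckboxCompleteCount / $CheckboxCount * 100, 2)
    $IncompletePercentage = [math]::Round($CheckboxIncompleteCount / $CheckboxCount * 100, 2)
}
else {
    $CompletionPercentage = 0
    $IncompletePercentage = 0
}
```

Wrapping the filtered results in @() keeps .Count reliable even when only one or zero items match.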

Output

Lastly, I'm formatting the output so it's easy to read but still contains all the info I want.

# Output
"
      Page: $Title
       URL: $BaseUri`spaces/$Space/pages/$PageID
  Complete: $CompletionPercentage% checkboxes complete.
Incomplete: $IncompletePercentage% checkboxes incomplete.
     Total: $CheckboxCount total checkboxes.
"
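A formatted string is great for reading on screen. If you ever want to pipe the results into something else (Export-Csv, Where-Object, and so on), one alternative is to emit an object instead. This is a sketch of that design choice, not what the script does; the property names and the sample values are placeholders of mine.

```powershell
# Sample values standing in for the variables computed earlier
$Title = "Nate's Really Important Life Changing Things To Do"
$BaseUri = 'https://MyCompany.atlassian.net/wiki/'
$Space = 'TS'
$PageID = 292782089
$CheckboxCount = 4
$CompletionPercentage = 75
$IncompletePercentage = 25

# One object per page, with one property per value
$Report = [pscustomobject]@{
    Page              = $Title
    Url               = "${BaseUri}spaces/$Space/pages/$PageID"
    CompletePercent   = $CompletionPercentage
    IncompletePercent = $IncompletePercentage
    TotalCheckboxes   = $CheckboxCount
}

# Format-List gives a readable view; the object itself stays usable downstream
$Report | Format-List
```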

Here's how to call the script with parameters:

powershell.exe -file "C:\Scripts\ConfluenceChecklist.ps1" -PageID 292782089 -Space TS

Here is an example of all of our hard work.

[Screenshot: How to call the script with parameters - PowerShell]

The final product

The first time I ran the script, it worked great. However, I later realized that if someone were to run my script against a page that doesn't contain checkboxes, they would receive an unrelated error. So I added an if condition to account for that. My script is now done, hurray!

param (
    [Parameter(Mandatory = $true, ParameterSetName = 'PageID')]
    [int]
    $PageID,
    [string]
    $Space = "TS"
)

# Login Info
$Username = "MyUsername"
$PasswordFile = "\\Server\Share\AtlassianToken.txt"
$AESKey = "\\Server\Share\AES.key"
$Password = Get-Content $PasswordFile | ConvertTo-SecureString -Key (Get-Content $AESKey)
$Credential = New-Object System.Management.Automation.PsCredential("$Username",$Password)
$BaseUri = "https://MyCompany.atlassian.net/wiki/"

# Predefining these parameters so we don't have to add them per command
Set-ConfluenceInfo -BaseUri $BaseUri -Credential $Credential

# Template
$Template = @'
<ac:task-id>{TaskID*:139}</ac:task-id>
<ac:task-status>{Checkbox:incomplete}</ac:task-status>
<ac:task-body>{Body:Learn to yodel}</ac:task-body>
<ac:task-id>{TaskID*:189}</ac:task-id>
<ac:task-status>{Checkbox:complete}</ac:task-status>
<ac:task-body>{Body:Sneak like a ninja}</ac:task-body>
<ac:task-id>{TaskID*:196}</ac:task-id>
<ac:task-status>{Checkbox:complete}</ac:task-status>
<ac:task-body>{Body:With your weight on that foot slowly advance back to your right foot and repeat the process}</ac:task-body>
'@

# Page info
$GetPageInfo = Get-ConfluencePage -PageID $PageID
$Title = ($GetPageInfo).Title
$PageBody = ($GetPageInfo).Body

# Throw error if there are no checkboxes within the page
if ($PageBody -notmatch "<ac:task-status>") {
    throw "The specified page does not contain any checkboxes"
}

# Gather page data into variables for output
(ConvertFrom-String -InputObject $PageBody -TemplateContent $Template -OutVariable results) | Out-Null
$CheckboxCount = $Results.Checkbox.Count
$CheckboxCompleteCount = ($Results.Checkbox | Where-Object { $_ -eq "complete" }).Count
$CheckboxIncompleteCount = ($Results.Checkbox | Where-Object { $_ -eq "incomplete" }).Count
$CompletionPercentage = ([math]::round($CheckboxCompleteCount / $CheckboxCount * 100,2))
$IncompletePercentage = ([math]::round($CheckboxIncompleteCount / $CheckboxCount * 100,2))

# Output
"
      Page:  $Title
       URL:  $BaseUri`spaces/$Space/pages/$PageID
  Complete:  $CompletionPercentage% checkboxes complete.
Incomplete:  $IncompletePercentage% checkboxes incomplete.
     Total:  $CheckboxCount total checkboxes.
"

Wrapping up

Now I can be confident that I can finish my quarterly goal of sneaking like a ninja by easily keeping track of my progress and making it my priority when needed over useless things like company sales and listening to my boss. Thanks for reading and hopefully this post helps you automate gathering Confluence page data with PowerShell more effectively.

Nate Blevins

Nate has a passion for automating. If he's repeated the same boring task too often, he will find a way to automate it! Nate's been a sysadmin for 15 years, loves to dabble in PowerShell, trolling on social media, and petting large dogs.
