Monitoring Confluence to SharePoint Migration Progress

This article explains how to monitor the progress of a migration from Confluence to SharePoint.

Viewing progress via the Site Pages library

The Site Pages library of every target space contains migrated pages and metadata.

Use the library view Recent Pages (WikiTraccs) to gain insights into the migration success for every single page. The SharePoint page’s metadata also includes the space key and content ID.

You can correlate the contents of the Site Pages library with the content of the source Confluence spaces, but this can be tedious.

What if pages are missing? To quickly determine which ones, look at the progress log files.

Using progress log files to get insights

As of release v1.1.0, WikiTraccs writes information about the migration progress to several progress-related log files.

[Figure: Log files containing information about the Confluence to SharePoint migration progress.]

The information you can gather from these files is:

  • How many pages are scheduled to be migrated?
  • Which pages have already been migrated to SharePoint?
  • Which pages are yet to be migrated?
  • Which pages have been migrated but need an update?

See below for a quick rundown of the log files and their content.

The values in those files are separated by tab characters, so the format is essentially CSV with tabs instead of commas. The file name contains a timestamp, the Confluence site ID as specified in the configuration, the space key, and the last part of the target SharePoint site URL.

Progress log file documentation

xxx__10-not-yet-migrated-pages.txt

This file contains information about Confluence pages that are yet to be migrated.

Sample content:

CASIG	78022359	CA2SIG - Meeting November 29	/display/CASIG/CA2SIG+-+Meeting+November+29
CASIG	78022377	2022-11-22 Standard WG	/display/CASIG/2022-11-22+Standard+WG
CASIG	80773916	2022 12 06 Standards WG	/display/CASIG/2022+12+06+Standards+WG

The tab-separated columns in this file are:

  • Confluence space key
  • Confluence page ID
  • Confluence page title
  • Confluence page URL
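
Because the rows are plain tab-separated text, they can be read with standard tooling. The following is a minimal sketch in Python, not part of WikiTraccs; the names PendingPage and read_pending_pages are made up for illustration, and the file encoding mentioned in the comment is an assumption.

```python
import csv
import io
from typing import NamedTuple


class PendingPage(NamedTuple):
    space_key: str
    page_id: str
    title: str
    url: str


def read_pending_pages(lines):
    """Parse tab-separated rows: space key, page ID, title, URL."""
    # QUOTE_NONE keeps quote characters in page titles as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [PendingPage(*row[:4]) for row in reader if len(row) >= 4]


# Inline sample taken from the rows above. A real file would be opened with
# open(path, encoding="utf-8", newline="") instead (the encoding is an assumption).
sample = (
    "CASIG\t78022359\tCA2SIG - Meeting November 29\t/display/CASIG/CA2SIG+-+Meeting+November+29\n"
    "CASIG\t78022377\t2022-11-22 Standard WG\t/display/CASIG/2022-11-22+Standard+WG\n"
)
pending = read_pending_pages(io.StringIO(sample))
for page in pending:
    print(page.page_id, page.title)
```

The length of the resulting list is the number of pages still waiting to be migrated for that space.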

xxx__20-migrated-pages

This file contains information about migrated pages, that is, existing Confluence pages for which a corresponding SharePoint page exists.

Sample content:

CASIG	24781062	Climate Action and Accounting SIG Home	/display/CASIG/Climate+Action+and+Accounting+SIG+Home	97652
CASIG	24781122	Meetings	/display/CASIG/Meetings	97653
CASIG	24781124	Member Directory	/display/CASIG/Member+Directory	97654

The tab-separated columns in this file are:

  • Confluence space key
  • Confluence page ID
  • Confluence page title
  • Confluence page URL
  • SharePoint page ID (in the Site Pages library)
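
One use for this file is looking up the SharePoint page that belongs to a given Confluence page. The sketch below builds such a lookup in Python; read_migrated_pages is a hypothetical helper name and not part of WikiTraccs.

```python
import csv
import io


def read_migrated_pages(lines):
    """Return a dict mapping Confluence page ID to SharePoint page ID."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    # Columns: space key, Confluence page ID, title, URL, SharePoint page ID.
    return {row[1]: row[4] for row in reader if len(row) >= 5}


# Inline sample taken from the rows above.
sample = (
    "CASIG\t24781122\tMeetings\t/display/CASIG/Meetings\t97653\n"
    "CASIG\t24781124\tMember Directory\t/display/CASIG/Member+Directory\t97654\n"
)
sharepoint_ids = read_migrated_pages(io.StringIO(sample))
print(sharepoint_ids.get("24781122"))  # prints 97653
```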

xxx__25-update-state-of-migrated-pages

This file contains information about the freshness of migrated pages.

Sample content:

+++
SourceTenantId = "https://confluence.contoso.com/"
PageSelectorType = "ConfluenceSpace"
PageSelector = "CASIG"
CreationDateUtc = 2023-01-21T15:30:07.1691711
SchemaVersion = 2
+++
CASIG	Page	78022313	2022-11-21 Peer Programming Call	/display/CASIG/2022-11-21+Peer+Programming+Call	2022-11-30T22:17:28	2022-11-29T23:17:28 2022-11-29T23:17:28	needsupdate
CASIG	Page	78022335	CA2 SIG - Meeting November 15	/display/CASIG/CA2+SIG+-+Meeting+November+15	2022-11-24T06:52:40	2022-11-24T06:52:40 2022-11-29T23:17:28	uptodate
CASIG	Page	78022347	CA2 SIG - Meeting December 13	/display/CASIG/CA2+SIG+-+Meeting+December+13	2022-12-14T01:46:57	2022-12-14T01:46:57 2022-11-29T23:17:28	uptodate

The file starts with a header that is enclosed in +++ lines.

The list of page update states follows the header.

The tab-separated columns are:

  • Confluence space key
  • Confluence content type: Page or Blogpost
  • Confluence page ID
  • Confluence page title
  • Confluence page URL
  • Modification date of the Confluence page
  • Stored modification date of the migrated page in SharePoint (this is the Confluence page modification date at the time of migration)
  • Modification date of the SharePoint page (note: added in release v1.12.24)
  • State of the SharePoint page
    • uptodate: this page is up to date
    • needsupdate: this page has been updated in Confluence since its migration
    • cannotdetermine: metadata in SharePoint is missing, cannot determine if update is needed

Note: WikiTraccs will output additional state values in a verification run.
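
To list the pages that need another migration run, the header has to be skipped and the state column evaluated. The following Python sketch shows one way to do that under the format described above; read_update_states is a hypothetical helper, and the header is treated as simple Key = value lines, which is an assumption based on the sample.

```python
import io


def read_update_states(lines):
    """Split the file into its +++ header (as a dict) and its page rows."""
    header, rows = {}, []
    fences_seen = 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.strip() == "+++":
            fences_seen += 1
            continue
        if fences_seen == 1:
            # Inside the header: lines look like Key = value.
            key, sep, value = line.partition("=")
            if sep:
                header[key.strip()] = value.strip().strip('"')
            continue
        fields = line.split("\t")
        if len(fields) >= 6:
            # The state is read as the last column, so the row parses whether
            # or not the SharePoint modification date column is present.
            rows.append({"page_id": fields[2], "title": fields[3], "state": fields[-1]})
    return header, rows


sample = "\n".join([
    "+++",
    'PageSelector = "CASIG"',
    "SchemaVersion = 2",
    "+++",
    "\t".join(["CASIG", "Page", "78022313", "2022-11-21 Peer Programming Call",
               "/display/CASIG/2022-11-21+Peer+Programming+Call",
               "2022-11-30T22:17:28", "2022-11-29T23:17:28",
               "2022-11-29T23:17:28", "needsupdate"]),
    "\t".join(["CASIG", "Page", "78022335", "CA2 SIG - Meeting November 15",
               "/display/CASIG/CA2+SIG+-+Meeting+November+15",
               "2022-11-24T06:52:40", "2022-11-24T06:52:40",
               "2022-11-29T23:17:28", "uptodate"]),
])
header, rows = read_update_states(io.StringIO(sample))
needs_update = [row["page_id"] for row in rows if row["state"] == "needsupdate"]
print(header["PageSelector"], needs_update)
```

Filtering on cannotdetermine instead of needsupdate would list the pages whose SharePoint metadata is missing.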

xxx__30-aggregated-info

This file contains information that could be gathered from the other files, but already aggregated.

Sample content:

Source Confluence Site: https://wiki.hyperledger.org
Target SharePoint Site: https://contoso.sharepoint.com/sites/migration-target
Space Key: CASIG
Blog posts included in migration and calculation: no
Confluence page count for space space CASIG: 292
Migrated SharePoint pages that correspond to found Confluence pages in space CASIG: 259
Migrated SharePoint pages overall for space CASIG: 259
Pages yet to be migrated for space CASIG: 33

If the value of "Migrated SharePoint pages overall for space" is larger than the value of "Migrated SharePoint pages that correspond to found Confluence pages in space", then pages have become inaccessible in Confluence (for example, because they were deleted or permission was denied) while the once-migrated pages still exist in SharePoint.
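
The comparison described above can be done by hand or with a few lines of code. The Python sketch below parses the "Label: value" lines from the sample and computes the difference; the helper names parse_aggregated_info and count_for are hypothetical, and labels are matched by prefix because they end with the space key.

```python
def parse_aggregated_info(text):
    """Turn 'Label: value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        label, sep, value = line.partition(": ")
        if sep:
            info[label.strip()] = value.strip()
    return info


def count_for(info, label_prefix):
    """Return the integer value of the first label starting with label_prefix."""
    for label, value in info.items():
        if label.startswith(label_prefix):
            return int(value)
    raise KeyError(label_prefix)


# Lines copied from the sample content above.
sample = """\
Space Key: CASIG
Confluence page count for space space CASIG: 292
Migrated SharePoint pages that correspond to found Confluence pages in space CASIG: 259
Migrated SharePoint pages overall for space CASIG: 259
Pages yet to be migrated for space CASIG: 33
"""
info = parse_aggregated_info(sample)
overall = count_for(info, "Migrated SharePoint pages overall")
matched = count_for(info, "Migrated SharePoint pages that correspond")
no_longer_found = overall - matched
print(no_longer_found)  # prints 0 for the sample
```

In the sample both values are 259, so no migrated page has lost its Confluence counterpart. The sample numbers are also consistent with each other: 292 Confluence pages minus 259 matched pages leaves the 33 pages reported as yet to be migrated.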

Progress log file cadence

WikiTraccs creates progress log files at specific times:

  • when starting a migration - log files for each scheduled space will be created
  • when stopping a running migration by pressing Ctrl+C in the console window of WikiTraccs.Console - log files for each space handled so far will be created
  • when the migration is done - log files for each space handled so far will be created

This means that multiple progress log files per space will be created. Look at the ones with the most recent timestamp to see the latest progress information.
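
Picking the newest file can also be scripted. The Python sketch below selects the most recently modified file whose name contains given parts, such as the space key and the file type. It relies on the file system modification time as a stand-in for the timestamp in the file name, which is an assumption; latest_log_file is a hypothetical helper, and the file names in the demo are placeholders, not real WikiTraccs file names.

```python
import os
import tempfile
from pathlib import Path


def latest_log_file(directory, *name_parts):
    """Return the newest file whose name contains all name_parts, or None."""
    candidates = [
        path for path in Path(directory).iterdir()
        if path.is_file() and all(part in path.name for part in name_parts)
    ]
    return max(candidates, key=lambda path: path.stat().st_mtime, default=None)


# Demo with placeholder file names; real WikiTraccs file names differ.
with tempfile.TemporaryDirectory() as log_dir:
    for name, mtime in [("older__CASIG__30-aggregated-info.txt", 1_700_000_000),
                        ("newer__CASIG__30-aggregated-info.txt", 1_700_000_600)]:
        path = Path(log_dir) / name
        path.write_text("", encoding="utf-8")
        os.utime(path, (mtime, mtime))  # set explicit modification times
    newest = latest_log_file(log_dir, "CASIG", "30-aggregated-info")
    newest_name = newest.name if newest else None

print(newest_name)
```

If log files were copied or archived in a way that changed their modification times, sorting by the timestamp in the file name is the safer choice.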

You can delete or archive old progress log files. Every new migration run will create new files.

Last modified February 21, 2024