Strategy Shift to Text Files for Knowledge Repositories
This topic is part of WikiTraccs for Markdown and is a work in progress.
Why Go Back to Text Files?
Corporations currently face several challenges in internal documentation and knowledge management:
- How to feed information into third-party AI services like Copilot and Atlassian Intelligence?
- How to feed information into privacy-focused in-house AI solutions?
- How to tackle rising costs of third-party tools and services?
- How to prevent vendor lock-in in a changing environment?
One way to tackle all of the above issues is to take a step back, cut ties with any particular tool, and go back to… text files.
Several of my clients who used WikiTraccs to migrate content from Confluence to SharePoint are already implementing this, or at least considering it, as part of a broader strategy.
Their strategy entails:
- Reducing the number of tools and services to reduce operational complexity and cut costs
- Ultimately moving to a simpler, slightly more technical, yet proven approach: maintaining information the way source code is maintained in a repository
Usually, there are additional side benefits, like easier collaboration with a translation agency for multilingual pages.
One common implementation of a text file-based documentation approach is the use of Markdown.
The benefits of Markdown are clear:
- AI services and tools understand Markdown very well - so well that Microsoft created MarkItDown to convert many file formats into Markdown for use with AI
- Text files can be stored anywhere; a good choice is GitHub, GitLab, or any other repository that supports collaboration and versioning
- Markdown is supported by a wide range of (free) tools and services, so there is no vendor lock-in
- Markdown is a format suited for data exchange across enterprises
- Markdown can be published on demand to third-party services like Confluence and SharePoint, without getting locked in to one specific vendor
It’s the last point - publishing to third parties - where WikiTraccs for Markdown comes into play, as it can publish Markdown files to SharePoint Online.
Preparing for the Change
Switching from an enterprise wiki like Confluence to text files containing Markdown is a big change.
We look at the feature set that tools like Confluence and SharePoint provide, to then map those to a Markdown-based repository.
Some features will have parity, some will be lost, some might be added again by different third-party tools and solutions.
Users will have the expectation that certain features are available in a knowledge repository - like formatting text, attaching files, restricting access, and so on.
In this article, we assume that third-party tools have features built in because there is demand for them. We look at those features and - in a second step - at what a Markdown-based knowledge repository can do to meet users’ demands.
What’s a “Markdown-Based Knowledge Repository”?
This kind of knowledge repository is - at least in the context of WikiTraccs for Markdown - a collection of Markdown files, organized in folders.
The Markdown-based knowledge repository is
- a single point of truth for enterprise documentation
- written in a well-documented and well-established markup language (Markdown), with broad tool support
- yet independent of any vendor or third-party service, as information is stored in text files
- easily consumable both by humans and machines
- a starting point for publishing content to third-party services like SharePoint Online or Atlassian Confluence
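To make "a collection of Markdown files, organized in folders" concrete, here is a minimal Python sketch that walks such a repository and groups pages by their top-level folder. The folder and file names are made up for illustration, and treating the top-level folder as the "space" is an assumption of this sketch, not something WikiTraccs for Markdown prescribes.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def list_pages_by_space(repo_root: Path) -> Dict[str, List[str]]:
    """Group Markdown files by their top-level folder (the "space")."""
    pages: Dict[str, List[str]] = {}
    for md_file in sorted(repo_root.rglob("*.md")):
        relative = md_file.relative_to(repo_root)
        # Files directly in the root belong to no space; this sketch skips them.
        if len(relative.parts) < 2:
            continue
        space = relative.parts[0]
        pages.setdefault(space, []).append(relative.as_posix())
    return pages


def build_demo_repo(root: Path) -> None:
    """Create a tiny example repository (all names are hypothetical)."""
    (root / "engineering" / "onboarding").mkdir(parents=True)
    (root / "hr").mkdir()
    (root / "engineering" / "index.md").write_text("# Engineering\n", encoding="utf-8")
    (root / "engineering" / "onboarding" / "first-day.md").write_text(
        "# First day\n", encoding="utf-8"
    )
    (root / "hr" / "vacation.md").write_text("# Vacation\n", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    build_demo_repo(demo_root)
    demo_pages = list_pages_by_space(demo_root)
    for space, files in demo_pages.items():
        print(space, files)
```

Because the repository is just files and folders, the same few lines work on a local checkout, a network share, or a cloned Git repository.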
Before jumping right into creating such a repository, we take a step back to look at the typical content types and structure these repositories are usually built of.
Content Types in a Knowledge Repository
Which types of content are there in a knowledge repository?
This list is mainly based on the following third-party services, as they are often used for documentation and knowledge management:
- Atlassian Confluence Enterprise Wiki
- SharePoint Online
Each row in the table below names a content type, describes what that type means, and then maps this abstract content type to specific third-party services.
The table also includes the file-based Markdown Repository we aim to build.
Content Type | Meaning | Confluence | SharePoint Online | Markdown Repository |
---|---|---|---|---|
Space | A logical container for pages. | space | site | folder |
Page | A wiki page, a page in a PDF document, something textual, might have columns or rows, contains tables, text, images, etc. | page, blog post, whiteboard | modern page | text file (Markdown) |
List | A tabular representation of data, something like a sheet in an Excel file. | database (Confluence Cloud) | list, document library | text file (Markdown) |
Attachment | A file that is associated with a page. | attachments of a page | kind of has page attachments, but often just links to files stored in document libraries | files next to a page’s text file |
Comment | A comment by a user | footer comments, inline comments, attachment comments; all support rich content | page comments; only very basic nesting and formatting support | ? |
User | A user resource that can be linked to | @-mentions in pages, used in metadata | used in metadata | ? |
All content types have metadata attached, to different extents. At least the author and a timestamp are nearly always available.
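The mapping in the table can also be written down as data, which makes the open questions easy to query. The following Python sketch transcribes the table in shortened form; the dictionary layout and the function name are illustrative choices, and `None` stands for the cells the table leaves open ("?").

```python
from typing import Dict, List, Optional

# Content type -> representation per system, transcribed (shortened) from the table.
# None marks cells that the table leaves open ("?").
CONTENT_TYPES: Dict[str, Dict[str, Optional[str]]] = {
    "Space": {"confluence": "space", "sharepoint": "site", "markdown": "folder"},
    "Page": {
        "confluence": "page, blog post, whiteboard",
        "sharepoint": "modern page",
        "markdown": "text file (Markdown)",
    },
    "List": {
        "confluence": "database (Confluence Cloud)",
        "sharepoint": "list, document library",
        "markdown": "text file (Markdown)",
    },
    "Attachment": {
        "confluence": "attachments of a page",
        "sharepoint": "links to files in document libraries",
        "markdown": "files next to a page's text file",
    },
    "Comment": {
        "confluence": "footer, inline, and attachment comments",
        "sharepoint": "page comments",
        "markdown": None,
    },
    "User": {
        "confluence": "@-mentions, metadata",
        "sharepoint": "metadata",
        "markdown": None,
    },
}


def unmapped_types(system: str) -> List[str]:
    """Return content types that have no representation yet in the given system."""
    return [name for name, mapping in CONTENT_TYPES.items() if mapping[system] is None]


print(unmapped_types("markdown"))
```

For the Markdown repository, the query returns the two content types that still need a design decision: comments and users.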
Structures of Knowledge Repositories
Let’s look at some approaches to organizing files and folders.
Atlassian Confluence
The structure used by Confluence:
Confluence
├── Space
│   ├── Page
│   ├── Page
│   │   ├── Child Page
│   │   └── Comments
│   └── Page
│       └── Attachment
│           └── Comments
└── Space
    └── ...
SharePoint Online
The structure used by SharePoint Online:
SharePoint Online
├── Site
│   ├── Document Library
│   │   ├── Page
│   │   └── Page
│   │       └── Comments
│   ├── Document Library
│   │   ├── Page Attachment Folder
│   │   │   └── Attachments
│   │   └── Page Attachment Folder
│   │       └── Attachments
│   └── Document Library
│       └── Files
└── Site
    └── ...
Note: there are no child pages in SharePoint.
Hugo
Hugo (a Markdown-based static web site generator that powers this very website) uses this structure:
Local File System
├── Section Folder
│   ├── Page
│   ├── Page Folder
│   │   ├── Page
│   │   ├── Attachments
│   │   └── Page Folder
│   │       ├── Page
│   │       └── Attachments
│   └── Page
└── Section Folder
    └── ...
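In a folder layout like this, the link between a page and its attachments is implicit: attachments are the files that sit next to the page's text file. A minimal Python sketch of that rule follows. It assumes that every non-Markdown file in a page's folder counts as an attachment, and the file names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def attachments_for(page: str, all_files: List[str]) -> List[str]:
    """Return the non-Markdown files that sit in the same folder as the page."""
    page_folder = PurePosixPath(page).parent
    return [
        candidate
        for candidate in all_files
        if PurePosixPath(candidate).parent == page_folder
        and PurePosixPath(candidate).suffix.lower() != ".md"
    ]


# Hypothetical repository listing, loosely following the tree above.
repo_files = [
    "docs/setup/index.md",
    "docs/setup/diagram.png",
    "docs/setup/advanced/index.md",
    "docs/setup/advanced/config.yaml",
    "docs/overview.md",
]

print(attachments_for("docs/setup/index.md", repo_files))
print(attachments_for("docs/overview.md", repo_files))
```

Files in nested page folders are not picked up, so each page folder keeps its own attachments.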
Glossary
In addition to the content types, the following words should be defined:
- content = pages, attachments, comments, etc. - nearly everything, except permissions or processes
- files = files in general: an attachment of the current page, an attachment of another page, or an external file that is being linked to
Access Control in Knowledge Repositories
We again look at the different third-party services.
Atlassian Confluence
Confluence uses the following approach to permission management:
Confluence (access is granted to users and groups)
└── Space (access is granted to users and groups)
    └── Page (access can be narrowed down to a subset of users and groups)
        └── Child Page (access can be further narrowed down)
Confluence also has the option to grant Anonymous access.
Being granted access to a page grants access to this page’s attachments and comments as well.
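The "narrowing down" model can be sketched as set intersection along the path from a page up to its space. The following Python example is a deliberately simplified illustration: the users, page names, and data layout are made up, and real Confluence permissions have more facets (groups, separate view and edit restrictions) that are ignored here.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical data: who may access the space at all.
SPACE_ACCESS: Set[str] = {"alice", "bob", "carol"}

# Page -> (parent page or None, restriction set or None if the page is unrestricted).
PAGES: Dict[str, Tuple[Optional[str], Optional[Set[str]]]] = {
    "Handbook": (None, None),
    "Salaries": ("Handbook", {"alice", "bob"}),
    "Salaries 2024": ("Salaries", {"alice", "carol"}),
}


def confluence_style_access(page: str) -> Set[str]:
    """Intersect the space's access with every restriction on the way up."""
    allowed = set(SPACE_ACCESS)
    current: Optional[str] = page
    while current is not None:
        parent, restriction = PAGES[current]
        if restriction is not None:
            # A restriction can only remove users, never add new ones.
            allowed &= restriction
        current = parent
    return allowed


print(confluence_style_access("Salaries 2024"))
```

Note that carol is listed on "Salaries 2024" but still has no access, because the parent page "Salaries" already excludes her.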
SharePoint Online
SharePoint Online
└── Site (access is granted to users and groups)
    ├── Document Library (access can be changed to a completely different set of users and groups)
    │   └── Page (access can be changed to a completely different set of users and groups)
    └── Document Library (access can be changed to a completely different set of users and groups)
        └── Page Attachment Folder (access can be changed to a completely different set of users and groups)
SharePoint Online allows sharing pages and attachments with anonymous users via a sharing link. However, there is no true anonymous access like Confluence offers.
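In contrast to narrowing, "changed to a completely different set" means the nearest level with its own permissions wins, regardless of what the levels above allow. A minimal Python sketch of that lookup follows. The node names and users are hypothetical, and sharing links and groups are left out of this simplified model.

```python
from typing import Dict, Optional, Set, Tuple

# Node -> (parent or None, own permission set or None when permissions are inherited).
NODES: Dict[str, Tuple[Optional[str], Optional[Set[str]]]] = {
    "Site": (None, {"alice", "bob"}),
    "Site Pages": ("Site", None),
    "Page A": ("Site Pages", {"carol"}),
    "Documents": ("Site", {"dave"}),
    "Attachments of Page A": ("Documents", None),
}


def sharepoint_style_access(node: str) -> Set[str]:
    """Return the permission set of the nearest level that defines its own."""
    current: Optional[str] = node
    while current is not None:
        parent, own = NODES[current]
        if own is not None:
            # Own permissions replace, rather than narrow, the inherited ones.
            return set(own)
        current = parent
    return set()


print(sharepoint_style_access("Page A"))
print(sharepoint_style_access("Attachments of Page A"))
```

In this example, a page and its attachment folder end up with unrelated audiences (carol versus dave). That is worth keeping in mind when mapping permissions to a file-based repository.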
Actions
Users can perform actions on content and structure, based on their permissions.
Possible actions are:
- modify content (create, read, update, delete)
  - one user at a time / multiple users at the same time
  - using a visual editor / without a visual editor
- restructure content (e.g. move page to a new parent, create a new space)
- duplicate content
- set metadata on content (e.g. label)
- link to existing content
- link to non-existing content (“red links”)
- use extensions to enrich content with non-text elements (first-party or third-party)
- export content (e.g. to PDF or Word)
- change permissions on content
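Some of these actions translate directly to a file-based repository. As one example, "red links" become Markdown links whose target file does not exist. The following Python sketch finds them. It is a rough illustration that only handles simple inline links of the form `[text](target)`; reference-style links and links with titles are not covered, and all file names are hypothetical.

```python
import posixpath
import re
from typing import List, Set

# Matches simple inline links like [text](target); image links (![...]) are excluded.
LINK_PATTERN = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)\)")


def find_red_links(page_path: str, markdown: str, known_files: Set[str]) -> List[str]:
    """Return link targets that point to files missing from the repository."""
    page_folder = posixpath.dirname(page_path)
    red_links = []
    for target in LINK_PATTERN.findall(markdown):
        # Skip external URLs, mail links, and in-page anchors.
        if "://" in target or target.startswith(("#", "mailto:")):
            continue
        file_part = target.split("#", 1)[0]
        resolved = posixpath.normpath(posixpath.join(page_folder, file_part))
        if resolved not in known_files:
            red_links.append(target)
    return red_links


known = {"docs/index.md", "docs/setup.md", "docs/guides/install.md"}
text = (
    "See [setup](setup.md), [install](guides/install.md#steps), "
    "[roadmap](roadmap.md) and [vendor](https://example.com)."
)
print(find_red_links("docs/index.md", text, known))
```

A check like this could run whenever files change, so that broken or not-yet-written pages are reported instead of silently producing dead links.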
Processes
tbd
Next reading
Now that we know what’s in a repository, we can apply that knowledge and build our Markdown-based knowledge repository: Markdown Repository Specifics.