Migrate away from Facebook app to Disqus comments

I was helping out on an old university project, and as part of some user experience and SEO tweaks we decided to move from Facebook app comments to Disqus comments. The main reason was that comments and replies sometimes did not appear on the website. On top of that, we wanted to boost SEO by having comments that search engines could index.

Analysis and planning

The website was built with Joomla and used a third-party plugin to display Facebook app comments. Each comment thread was identified by the page's URL, and the website itself used a Facebook App ID. I had access to the app, so the plan looked quite straightforward:

  1. Make a list of page URLs from the website’s sitemap.
  2. Export comments for each URL using Facebook’s OpenGraph API.
  3. Convert the exported comments from Facebook's JSON format into the Disqus comments XML format.
  4. Import generated XML files into Disqus.
  5. Test the newly imported comments on the website.

List of page URLs

The website's XML sitemap proved to be very useful for this :) Since the number of pages was small, just a few dozen, I manually reviewed the links and deleted the ones that didn't have Facebook comments. Done!
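For a larger site, the list of URLs could be pulled out of the sitemap programmatically before the manual review. The snippet below is only a minimal sketch: it assumes a standard sitemaps.org file, and the SitemapReader class and its method name are made up for illustration and are not part of the migration tool.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SitemapReader
{
    // XML namespace defined by the sitemaps.org protocol.
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Returns the trimmed text of every <loc> element in the sitemap.
    public static IReadOnlyList<string> ExtractUrls(string sitemapXml)
    {
        return XDocument.Parse(sitemapXml)
            .Descendants(Ns + "loc")
            .Select(loc => loc.Value.Trim())
            .Where(url => url.Length > 0)
            .ToList();
    }
}
```

The resulting list still needs the same manual pass to drop pages that never had Facebook comments.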

Export comments from Facebook

First of all, I ran a few OpenGraph queries to authenticate (get an access token) and to get the list of comments for a page URL. The client ID and client secret come from the Facebook App's page.

The following API call returns an access token if the app ID and secret are correct.

https://graph.facebook.com/oauth/access_token?client_id={appID}&client_secret={appSecret}&grant_type=client_credentials

Next, I needed to get a page object that would contain the actual page ID.

https://graph.facebook.com/{pageUrl}?access_token={accessToken}

However, the page object reported its ID as the page URL itself, and using the page URL in place of a page ID in further calls returned an error. Moreover, for whatever reason the page object was sometimes not found by page URL, or the comment count was zero or slightly off, even though the actual comments rendered fine. A rendering preview can be checked with Facebook's Comments Code Generation Tool, which is quite handy for manual testing.

Even though the comments rendered correctly, some were still missing, and only the Moderation Tool displayed them all… And finally, the penny dropped! The correct and reliable page ID can be found in the URL of the page's comments Moderation Tool!

Since the number of pages was small, I quickly went through all the page URLs, visited Moderation Tool, and acquired the page IDs for each URL.

With the correct page IDs in hand, it was quick and easy to make OpenGraph calls to get the comments in JSON format:

https://graph.facebook.com/comments?id={pageID}&access_token={accessToken}

This allowed me to test the data further. I found that the OpenGraph API would sometimes return fewer comments than were displayed on the website, and that the website sometimes displayed fewer comments than the Moderation Tool. This meant a manual check against the Moderation Tool would be needed later to make sure all the comments were included in the export.
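For reference, the access token and comments URLs used in this section can be assembled with a small helper that escapes each query value. This is only a sketch: the FacebookUrls class and its method names are hypothetical, and only the URL shapes come from the calls shown above.

```csharp
using System;

public static class FacebookUrls
{
    private const string BaseUrl = "https://graph.facebook.com";

    // URL that exchanges the app ID and app secret for an access token.
    public static string AccessToken(string appId, string appSecret) =>
        $"{BaseUrl}/oauth/access_token?client_id={Uri.EscapeDataString(appId)}" +
        $"&client_secret={Uri.EscapeDataString(appSecret)}&grant_type=client_credentials";

    // URL that returns the comments for a page ID (or the replies for a comment ID).
    public static string Comments(string id, string accessToken) =>
        $"{BaseUrl}/comments?id={Uri.EscapeDataString(id)}" +
        $"&access_token={Uri.EscapeDataString(accessToken)}";
}
```

Escaping matters here because a token or ID may contain characters such as "|" that are not safe to place in a query string as-is.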

The part of the application I wrote for retrieving Facebook comments can be found on my GitHub. I wrote two methods:

  • GetAccessTokenAsync – to retrieve the access token, which has to be included in all other calls to the OpenGraph API.
  • GetPageCommentsAsync – to retrieve the comments for a Facebook page. This is a recursive method, so it retrieves all replies (child comments) as well.

Once the comments are received, they are stored in the following object.


using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class FacebookComment
{
    [JsonProperty("id")]
    public string Id { get; set; }

    [JsonProperty("created_time")]
    public DateTime CreatedTime { get; set; }

    [JsonProperty("from")]
    public FacebookCommentUser From { get; set; }

    [JsonProperty("message")]
    public string Message { get; set; }

    // Not part of the API response; filled in by a separate call per comment.
    [JsonIgnore]
    public IList<FacebookComment> Children { get; set; }
}

The Children property is ignored during deserialization, since the OpenGraph API doesn't return comments as a tree. Instead, an additional API call is needed for each comment, in which the comment's ID is used as the parent ID of its replies.
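The recursion can be sketched roughly as follows. This is not the code from the repository: CommentNode, CommentTreeLoader, and the fetchDirectComments delegate are illustrative stand-ins, with the delegate playing the role of one API call that returns the comments posted directly under a given ID.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Simplified stand-in for the FacebookComment class.
public class CommentNode
{
    public string Id { get; set; }
    public string Message { get; set; }
    public IList<CommentNode> Children { get; set; } = new List<CommentNode>();
}

public static class CommentTreeLoader
{
    // Loads the comments under parentId, then recursively loads the replies
    // to each of them, building a tree out of flat API responses.
    public static async Task<IList<CommentNode>> LoadAsync(
        string parentId,
        Func<string, Task<IList<CommentNode>>> fetchDirectComments)
    {
        var comments = await fetchDirectComments(parentId);
        foreach (var comment in comments)
        {
            // The comment's own ID becomes the parent ID for its replies.
            comment.Children = await LoadAsync(comment.Id, fetchDirectComments);
        }
        return comments;
    }
}
```

Passing the fetch step in as a delegate keeps the tree-building logic testable without any network access.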

Change comments format into Disqus XML

Formatting the Disqus comments XML looked straightforward, since I came across a Disqus help page that explains custom imports in detail. I just needed to write a small helper to convert the Facebook JSON format into the Disqus XML format. Several key points:

  • I didn’t have Single Sign-On (SSO), so I removed “<dsq:remote>”.
  • I didn’t have user emails, URLs, or IPs, so these fields were left empty. That means only the user name is transferred to Disqus, no photo.
  • Comments are set to approved, and threads are left open for further replies.
  • Replies must specify the parent comment ID. The default is 0.
  • Comment text has to be wrapped in CDATA.
  • The target page ID, title, and URL need to be included for each batch of comments.

After several hours I had a working comment formatting helper. The finalized code can be found on my GitHub. For XML manipulation, the XDocument class proved to be very useful :) A single formatted Disqus comment looks like this.

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dsq="http://www.disqus.com/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:wp="http://wordpress.org/export/1.0/">
<channel>
<item>
<title>Title</title>
<link>URL</link>
<content:encoded></content:encoded>
<dsq:thread_identifier>111</dsq:thread_identifier>
<wp:post_date_gmt></wp:post_date_gmt>
<wp:comment_status>open</wp:comment_status>
<wp:comment>
<wp:comment_id>Comment ID</wp:comment_id>
<wp:comment_author>Full Name</wp:comment_author>
<wp:comment_author_email></wp:comment_author_email>
<wp:comment_author_url></wp:comment_author_url>
<wp:comment_author_IP></wp:comment_author_IP>
<wp:comment_date_gmt>2016-11-04 16:11:53</wp:comment_date_gmt>
<wp:comment_content><![CDATA[message]]></wp:comment_content>
<wp:comment_approved>1</wp:comment_approved>
<wp:comment_parent>0</wp:comment_parent>
</wp:comment>
</item>
</channel>
</rss>
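As an illustration of how the wp:comment element above could be produced with the System.Xml.Linq types, here is a minimal sketch. The DisqusXml class and the BuildComment signature are assumptions made for this example; the actual helper in the repository may be structured differently.

```csharp
using System;
using System.Globalization;
using System.Xml.Linq;

public static class DisqusXml
{
    // Same namespace URI as the xmlns:wp declaration in the sample above.
    private static readonly XNamespace Wp = "http://wordpress.org/export/1.0/";

    // Builds a single <wp:comment> element. Email, URL, and IP stay empty,
    // and parentId defaults to "0" for top-level comments.
    public static XElement BuildComment(
        string id, string author, DateTime createdUtc, string message, string parentId = "0")
    {
        return new XElement(Wp + "comment",
            new XElement(Wp + "comment_id", id),
            new XElement(Wp + "comment_author", author),
            new XElement(Wp + "comment_author_email", string.Empty),
            new XElement(Wp + "comment_author_url", string.Empty),
            new XElement(Wp + "comment_author_IP", string.Empty),
            new XElement(Wp + "comment_date_gmt",
                createdUtc.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture)),
            // XCData wraps the text in a CDATA section when serialized.
            new XElement(Wp + "comment_content", new XCData(message)),
            new XElement(Wp + "comment_approved", "1"),
            new XElement(Wp + "comment_parent", parentId));
    }
}
```

When the element is added under a root that declares xmlns:wp, the serializer reuses that wp prefix for the output.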

To support retrieving comments for multiple pages in a single run, I made the application read its input from a text file of tab-separated values containing the comments page ID and other pertinent data. The input file path can be specified in the app settings file, and each line should have the following format:

Comments_PageId<TAB>Target_Page_Title<TAB>Target_Page_URL<TAB>Target_Page_ID
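Parsing one line of that file might look like the sketch below. The PageInput and PageInputParser names and most property names are hypothetical, chosen to mirror the four columns; the field order is the one given above.

```csharp
using System;

public class PageInput
{
    public string FacebookPageId { get; set; }
    public string TargetPageTitle { get; set; }
    public string TargetPageUrl { get; set; }
    public string TargetPageId { get; set; }
}

public static class PageInputParser
{
    // Splits one tab-separated line into its four fields and fails loudly
    // when the line does not have exactly four columns.
    public static PageInput ParseLine(string line)
    {
        var fields = line.Split('\t');
        if (fields.Length != 4)
        {
            throw new FormatException(
                $"Expected 4 tab-separated fields but found {fields.Length}.");
        }

        return new PageInput
        {
            FacebookPageId = fields[0].Trim(),
            TargetPageTitle = fields[1].Trim(),
            TargetPageUrl = fields[2].Trim(),
            TargetPageId = fields[3].Trim()
        };
    }
}
```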

Finally, I had a bunch of XML files containing my Facebook comments formatted in Disqus comments format.

Migration to Disqus comments process

Firstly, it is useful to close the Facebook comments thread in the Moderation Tool, just to make sure nobody posts any further comments.

And if you are super cautious, you can re-download all Facebook comments from that thread to make sure you have the very latest and greatest comments.

Secondly, create a test forum (site) for comments in Disqus at https://disqus.com/admin/create/.

This is important since:

  • If something goes wrong during testing, you won’t be able to undo the import, and might end up manually marking comments as deleted.
  • You can do a test import and point your site to this Disqus forum (site) to see how the comments actually look on your website.

Thirdly, there is a Disqus help page on how to import comments into Disqus, and while the suggested way works just fine, you don’t get any feedback about either success or failure. BUT! There is actually a whole dedicated comments import website (https://import.disqus.com/) with detailed feedback and import progress. Highly recommended!

Choose an XML file to import, leave the WordPress (WXR) type selected, and click Upload. Reloading the page updates the import progress; for me, an import of 20-30 comments took less than a minute.

Once you are happy with your test import, just switch to the main forum (site) on the Disqus import website and upload all your XML files there. And voila!

Once the migration was completed, I pointed the project’s website to the live comments forum (site) in Disqus, and all the comments became live again, this time served by Disqus. After some quick testing I confirmed the migration was successful!

Tips and Tricks

I am still not sure why, but some Facebook comments were not returned by the OpenGraph API even though I could see them in the Moderation Tool. I therefore had to copy the data from the Moderation Tool manually and append the missing comments to the list in the application. For example, the following could be added to the RunAsync method.

if (page.FacebookPageId == "someFacebookPageId")
{
    comments.Add(new FacebookComment
    {
        Id = "missing Id",
        CreatedTime = DateTime.Parse("2017-03-26 20:00"),
        Message = "missing message",
        From = new FacebookCommentUser
        {
            Name = "missing full name"
        }
    });
}

Also, if you are not sure about something, it is better to create another test forum/site in Disqus and run another import before doing the live migration.

Application architecture

Technologies used:

  • Visual Studio 2017 with C# 7.0.
  • .NET Core 1.1 Console App.
  • Microsoft.Extensions.DependencyInjection for DI.
  • MSTest, NSubstitute, and Fluent Assertions for unit testing.
  • JSON config file.
  • Newtonsoft.Json for Facebook JSON responses parsing.
  • XDocument for formatting Disqus comments XML.
  • RichardSzalay.MockHttp for unit testing HttpClient calls.

Since a console app's Main method did not support async by itself in C# 7.0, and I wanted a proper app settings JSON file with the Options pattern, the main application logic was moved out to a Startup class. This also allowed the main logic to be async.

public static void Main(string[] args)
{
    // ...
    var app = serviceProvider.GetService<Startup>();

    // Entry point, async
    var returnCode = ReturnCodes.Success;
    Task.Run(async () => { returnCode = await app.RunAsync(); }).Wait();

    // ...
}
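A possible variation on the blocking call above, shown here only as a sketch with made-up names (SyncOverAsync, RunBlocking): Task.Wait() wraps any failure in an AggregateException, whereas GetAwaiter().GetResult() rethrows the original exception, which can make error handling in Main a little simpler.

```csharp
using System;
using System.Threading.Tasks;

public static class SyncOverAsync
{
    // Runs an async entry point on the thread pool and blocks until it
    // completes. A failure surfaces as the original exception type rather
    // than an AggregateException.
    public static int RunBlocking(Func<Task<int>> runAsync)
    {
        return Task.Run(runAsync).GetAwaiter().GetResult();
    }
}
```

From C# 7.1 onwards, Main itself can be declared as static async Task<int> Main, which removes the need for this kind of wrapper.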

Dependencies / high-level structure.

  • The Program class calls RunAsync from the Startup class.
  • The Startup class receives the settings, the Facebook API wrapper to retrieve comments, the Disqus formatter to convert Facebook comments into the Disqus comments XML format, and file utils to save the XML to a file.
  • The Facebook API wrapper depends on a response parser to parse the JSON response, and on an HTTP client factory that lazily provides an instance of HttpClient. This also makes it possible to unit test the HttpClient calls.
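The lazy HTTP client factory idea can be sketched as below. The interface and class names are invented for this example and are not taken from the repository; the point is only that the handler is injected, so a test can pass in a fake handler instead of hitting the network.

```csharp
using System;
using System.Net.Http;

// Hypothetical abstraction that hands out a shared HttpClient.
public interface IHttpClientProvider
{
    HttpClient GetClient();
}

public sealed class LazyHttpClientProvider : IHttpClientProvider, IDisposable
{
    private readonly Lazy<HttpClient> _client;

    // handlerFactory is invoked only once, on the first GetClient call.
    // Tests can supply a factory that returns a mock message handler.
    public LazyHttpClientProvider(Func<HttpMessageHandler> handlerFactory)
    {
        _client = new Lazy<HttpClient>(() => new HttpClient(handlerFactory()));
    }

    public HttpClient GetClient() => _client.Value;

    public void Dispose()
    {
        // Dispose the client only if it was ever created.
        if (_client.IsValueCreated)
        {
            _client.Value.Dispose();
        }
    }
}
```

In production code the factory could simply be () => new HttpClientHandler(), while a unit test might return a handler from a library such as RichardSzalay.MockHttp.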

Conclusion

Migrating comments from the Facebook app to Disqus was an interesting project. On the one hand, it took longer than expected to work out how to get comments from the OpenGraph API due to inconsistent responses. On the other hand, it was really pleasant to work with Disqus once I found the dedicated comments import website.

Even though the migration required some manual work to find missing comments in the Moderation Tool, every comment was successfully migrated to Disqus with dates and user names preserved, which makes the whole project a success.

And last but not least, this was a chance to try out Visual Studio 2017 with .NET Core to build a comments retrieval and formatting tool. Source code is here.