Trouble converting Discusware 4.10.1 to SMF 2.0.9

Started by Ned Ludd, January 04, 2015, 03:08:06 PM

Randem

Not sure what those redirects are; that is where the post counts are supposed to be. Have you run the SMF utilities on the database to recount the posts etc.?

Ned Ludd

This is the error message thrown by the 'Fix' step of 'Find and repair any errors' using the unmodified Converter data.  I'm no expert but I guess this is Composite Key duplication in MySQL.
Duplicate entry '0-31' for key 2
File: /home/smf-test/Sources/RepairBoards.php
Line: 1486


It's a pity it doesn't tell me which table this is in but my focus is on getting a result so I didn't spend too much time searching for that duplicate.  All I know is that the error went away after I hard-coded all the ids.  I've no problem with this fairly easy fix and it gives me greater control over the data anyway.
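The '0-31' in that message would be consistent with a two-column key holding the values 0 and 31, since MySQL joins the columns of a multi-column key with a hyphen when it reports a duplicate. Here is a minimal sketch of hunting for such a pair in an import file before loading it. The function name and the column positions are illustrative assumptions; the thread does not say which table or columns are involved.

```python
from collections import Counter

def find_duplicate_keys(rows, key_columns):
    """Return the composite key tuples that occur more than once, with counts.

    rows: list of records, each a list of field values.
    key_columns: zero-based positions of the fields forming the key
                 (hypothetical; depends on the table being checked).
    """
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}
```

Running this against each candidate table's key columns in turn would at least narrow down which file carries the clashing pair.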

The 'Redirect' items appear in the main forum page.  I attach a screenshot in case you can give me some insight into what it might mean.  I'm more curious than concerned so don't fret if you can't explain it.

I converted all the <img src...> URLs in the source HTML to [img] tags and the Converted output now includes the original URLs as desired. I'm content with this solution. If you're at all interested, here's the rough and ready command I used to convert the image tags.
find . -name "*.html" -exec sed -i 's#<img src="\(http://.[^<]*/messages/.[^<]*\)" alt="\([^<]*\)">#[img alt=\2]\1[/img]#g' {} \;
(Obviously this would need to be enhanced if I want to relocate the actual image URLs.)
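For anyone not comfortable with sed, roughly the same substitution can be sketched in Python. It makes the same assumptions as the command above (absolute http:// URLs containing /messages/, with the alt attribute directly after src) and additionally stops at a double quote, which is slightly stricter than the sed pattern. The name img_to_bbcode is just illustrative.

```python
import re

# Matches <img src="http://.../messages/..." alt="..."> as targeted by the sed command.
IMG_TAG = re.compile(r'<img src="(http://[^<"]+/messages/[^<"]+)" alt="([^<"]*)">')

def img_to_bbcode(html: str) -> str:
    """Rewrite matching <img> tags as [img alt=...]url[/img] BB code."""
    return IMG_TAG.sub(r'[img alt=\2]\1[/img]', html)
```

Tags that do not fit the pattern (relative URLs, extra attributes) are left untouched, so it is worth searching the output for any remaining "<img" afterwards.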

Randem

Not sure what you mean by the second column saying "<n> Redirects". Second column of what? This board was a direct conversion of a Discus board and inline images are present and working. Take a look at the URL for the attachments and inline images (what does it say?); it will give you a clue as to why there is a 404 error. All attachment names are taken from the database then used to locate the file in the attachments folder; it is very straightforward. The names are MD5 encoded.

As far as ids being duplicated, MySQL will not generate a duplicate on an id field that is marked as autonumber or identity, so I am at a loss as to how you have duplicates in that field. I can only guess at most of your issues since I have not seen any of the source information...

Ned Ludd

After numerous attempts from scratch I've managed to import the Converter output to my SMF database and have everything consistent.

I only managed this by hard-coding most of the 'id' field values in the import files rather than having MySQL generate them.  When auto-generated I had multiple fencepost errors with ids, as well as duplicate records - the worst being the disabling of the original SMF 'admin' member, thus rendering the forum unmanageable!  To avoid any duplicates of pre-existing records I simply added 1000 to every relevant id (but 999 for Topics!)  Only after I'd done all of this would the forum error repairs complete.
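The offsetting itself can be sketched as below. This assumes each import file has already been read into rows of string fields and that the positions of the id columns are known; neither the file layout nor the column positions are given in this thread, so treat them as placeholders.

```python
def offset_ids(rows, id_columns, offset=1000):
    """Return copies of rows with `offset` added to each listed id column.

    rows: list of records, each a list of strings.
    id_columns: zero-based positions of the id fields (hypothetical layout).
    """
    shifted = []
    for row in rows:
        new_row = list(row)  # copy so the input rows are left unchanged
        for col in id_columns:
            new_row[col] = str(int(new_row[col]) + offset)
        shifted.append(new_row)
    return shifted
```

The same offset has to be applied to every column that refers to a shifted id (for example a topic's board id), and any column where a value such as 0 carries a special meaning would presumably need to be left alone.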

One major disappointment (and something of a showstopper) is the complete loss of inline images.  The converter strips all HTML tags from the body of a message, which includes <img> tags.  I guess I'll have to munge the Discusware source to convert these into [img] tags before running the Converter, since it leaves BB code intact.

One file I ignored was the SMF_TopicsLog.txt file, because every entry - one for each id_topic - had the same id_member and id_message value.  It didn't make sense and its omission doesn't seem to have caused any problem.

The attachments have been copied over but so far clicking on any attachment link returns a 404 error.  I'll investigate that when I've solved the above problems.

Another puzzler: why might the second column of every Board say "<n> Redirects"?  I don't understand what that signifies.

Randem

The InternalLinks.txt file is created by the conversion to be used to correct links in the output message.txt, then InternalLinks.txt is deleted. This is the process that searches for 'url=' tags to gather the data then map it to SMF's Topic URL format.

Randem

Discus may have changed some of their directory structures somewhere between 3 and 4.10.1 but your html files should reflect where they are expected to be. When the conversion happens it just changes the http://www.yourdiscus.com to the local root folder, everything after that it uses as part of the folder structure.

I checked to see what the internallinks.txt was used for (it was a long time ago). This file was used to correct links that were in the Discus file to the new location in the SMF database. Is there a messages.txt that has been created? The Internallinks.txt was used along with that file to correct the page links so that they would work in SMF. I cannot seem to find the original conversion that I did to check fully. I will look more...

Ned Ludd

As shown in the file list in a previous post, InternalLinks.txt is an empty file created by the Conversion process.  I can't tell what it's for or what the error message implies.

You said: replace the discus website address (http://www.yourdicusaddr.com) with the Discus Root Address (c:\Discusware\) then look for the attachment in that folder location.

Again, I thought I'd fallen foul of an assumption that the forum is not in a subdirectory.  So I munged all HTML files to eliminate the subdirectory and again had no change in the result.  Time for another rethink.

So I took your explanation completely literally and copied the entire /messages/ and /secure/ directories up from /discus/ and /discus_admin_nnnnnnnnnn/ to the F:\Discusware directory.  After running the converter again I now get attachments!  These are, as you said, labeled with a number and the hash of the file.  There were 370 attachments created, which is awfully close to the 373 instances of "<!--attachment" I can find in the HTML source.  (Identifying the hashed files for the purposes of cross-verification has been interesting.)
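A count like the 373 above can be reproduced with a short script. This is only a sketch: it assumes the source pages end in .html and can be read as Latin-1, and the function name is made up for illustration.

```python
from pathlib import Path

def count_attachment_markers(root, marker="<!--attachment"):
    """Count occurrences of `marker` across all .html files under `root`."""
    total = 0
    for path in Path(root).rglob("*.html"):
        # Latin-1 is an assumption; it decodes any byte sequence without error.
        text = path.read_text(encoding="latin-1")
        total += text.count(marker)
    return total
```

The closing marker starts with "<!--/attachment", so it is not counted by the default search string.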

My working directory tree now looks like this:

F:\Discusware
F:\Discusware\discus
F:\Discusware\discus\messages
F:\Discusware\discus_admin_nnnnnnnnnn
F:\Discusware\discus_admin_nnnnnnnnnn\secure
F:\Discusware\messages  (includes copies of all /secure subdirectories)
F:\Discusware\redirect
F:\Discusware\redirect\discus
F:\Discusware\redirect\discus\messages
F:\Discusware\SMF
F:\Discusware\SMF\Output
F:\Discusware\SMF\Output\attachments

Wondering where the inline images are I note that they appear as 2789 [ url=http://... ]  URLs in the SMF_Messages.txt file.  Presumably I'll have to alter all of them to reflect the new location of the forum, or leave the old forum tree in place just for the images.  Luckily that's a simple task.

Overall I think I now have enough to proceed with a test of importing the data to SMF.  My only concern is the continued error messages during the conversion, which leave me in the dark as to what's not being done and hence what might be missing from the converted data.

Randem

What exactly is in this file - InternalLinks.txt?

Attachments are enclosed in the following tags - "<!--attachment:" and "<!--/attachment-->"
If found it will replace the discus website address (http://www.yourdicusaddr.com) with the Discus Root Address (c:\Discusware\) then look for the attachment in that folder location.

You would have had to download the attachments from your Discus site into that folder already.
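As an illustration of that lookup (not the converter's actual code, which is not shown in this thread), the mapping from an attachment URL to a local file could be sketched like this. The function name and the example addresses are hypothetical.

```python
from pathlib import PureWindowsPath

def attachment_local_path(url, site_addr, discus_root):
    """Map an attachment URL to the path expected under the Discus root folder.

    Returns None when the URL does not start with the configured site address.
    """
    if not url.startswith(site_addr):
        return None
    # Everything after the site address becomes the folder structure.
    relative = url[len(site_addr):].lstrip("/")
    return PureWindowsPath(discus_root, *relative.split("/"))
```

With a site address of http://www.example.com and a root of c:\Discusware\, the URL http://www.example.com/messages/12/34.pdf maps to c:\Discusware\messages\12\34.pdf, so the file has to exist at exactly that location for the attachment to be picked up.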

Ned Ludd

From what you say I will need to modify the Redirect files to suit my situation.  I have no problem with that and can easily do it after the fact.

I do have 'Create Attachments' and 'Import Hidden Topics' ticked. Here's a quick count of the most common types of file attachments in my forum:
   pdf 354
   jpg 2601
   png 74
   doc 17
   xls 3

Each attachment lives in its respective topic directory and is directly linked from its message file.
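A tally like the one above can be produced with a few lines of Python. This is a sketch that simply counts every file under a directory by extension, so it assumes the attachments can be told apart from other files by extension alone; the function name is illustrative.

```python
from collections import Counter
from pathlib import Path

def count_extensions(root):
    """Count files under `root` by lower-cased extension (without the dot)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix:
            counts[path.suffix.lower().lstrip(".")] += 1
    return counts
```

Calling count_extensions on the messages directory and printing counts.most_common() lists the extensions in descending order of frequency.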

With the latest version I'm still getting the 62 - GetFileData() - F:\Discusware\20150112\SMF\Output\InternalLinks.txt - Input past end of file error.

Randem

Here is a file which eliminates randem.com from any output files, but there is nowhere that places randemsystems.com in any file. The file you posted is a source file itself and should either not be used or be changed to reflect your website. The information in the file you posted is what I wanted placed into every HTML page to log the page for reporting purposes. You can change or eliminate the contents of that source file.

Other notes: Do you have actual attachments in your forum? I will double check on what conditions no attachment would be created.

Make sure you have "Create Attachments" Checked...

Ned Ludd

After a weekend of tidying up the forum I retried the conversion.  The last error seen on Friday recurred:

         62 - GetFileData() - f:\Discusware\20150112\SMF\Output\InternalLinks.txt

This happens after the last post in the last thread in the last topic is converted.  After clicking 'OK' there is about a ten second pause until the last 'OK' dialog box appears.

The SMF/Output directory contains what appears to be a complete conversion of the messages but the empty output files might suggest where the conversion is failing:

Byte counts:
       0 Jan 12 13:01 attachments (empty directory)
       0 Jan 12 13:01 InternalLinks.txt
  351002 Jan 12 13:03 Log.txt
       0 Jan 12 13:01 SMF_Attachments.txt
    4102 Jan 12 13:03 SMF_Boards.txt
     205 Jan 12 13:03 SMF_Categories.txt
     368 Jan 12 13:04 SMF_MemberGroups.txt
  215761 Jan 12 13:04 SMF_Members.txt
20565175 Jan 12 13:04 SMF_Messages.txt
     311 Jan 12 13:03 SMF_NotifyLog.txt
  142492 Jan 12 13:03 SMF_Topics.txt
   42209 Jan 12 13:03 SMF_TopicsLog.txt


Line counts:
       0 SMF/Output/InternalLinks.txt
    6792 SMF/Output/Log.txt
       0 SMF/Output/SMF_Attachments.txt
      34 SMF/Output/SMF_Boards.txt
       6 SMF/Output/SMF_Categories.txt
       7 SMF/Output/SMF_MemberGroups.txt
    1662 SMF/Output/SMF_Members.txt
  187317 SMF/Output/SMF_Messages.txt
      32 SMF/Output/SMF_NotifyLog.txt
    3332 SMF/Output/SMF_Topics.txt
    3332 SMF/Output/SMF_TopicsLog.txt


There are 3392 HTML files in the Redirect tree, which matches the number of thread HTML files in the source (i.e. excluding the topic and index files).  Unexpectedly, the URLs in these files all point to https://randemsystems.com, despite my having entered the appropriate 'Discus Website' and 'SMF Forum Addr' values in the converter dialog.  A typical example:
</head>
<body>
<SCRIPT language="JavaScript" type="text/javascript" src="https://randemsystems.com/discus/logpagessc.js"> </SCRIPT>
<SCRIPT language="JavaScript" type="text/javascript" src="https://randemsystems.com/discus/logpagesnsc.js"> </SCRIPT>
<script>
window.location = "https://randemsystems.support//index.php?topic=3297.0";
</script>
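If the redirect pages have to be corrected after the fact, the window.location line can be repointed with a small script. This is a sketch only: it assumes every redirect page uses the index.php?topic=... form shown in the example, and the forum address passed in is whatever your own SMF install uses. The function name is made up.

```python
import re

# Matches: window.location = "<any base>/index.php?topic=<id>"
LOCATION = re.compile(r'window\.location\s*=\s*"[^"]*?/index\.php\?(topic=[^"]*)"')

def retarget_redirect(html, forum_base):
    """Point the window.location redirect at forum_base, keeping the topic query."""
    base = forum_base.rstrip("/")
    return LOCATION.sub(
        lambda m: f'window.location = "{base}/index.php?{m.group(1)}"', html
    )
```

The two logging SCRIPT lines are left as they are by this function; they would need to be edited or removed separately.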


Also, if I re-run the converter without restarting the application I get a '9 - ProcessPostsOnPage() - Subscript out of range' error.

Randem

Not sure what happened; it runs here. I will re-attach the whole install.

Attachments would not be obvious. The file names would be MD5 encoded in the attachments folder.

Ned Ludd

After dropping that EXE in place of the previous version and running it, I get Run-time error '429': ActiveX component can't create object and then it crashes.  Not being a Windows person I've little idea what to do about this error.

I'm also getting no obvious attachments included in the conversion output.  I guess that will be something to do with the EOF error not letting the process continue after converting the messages.

Randem

Progress... This will tell you which file it is having an issue with...

Ned Ludd

"In the secure folder it is looking for the board-topics.html file. So that would be in your folder F:\Discusware\discus_admin_nnnnnnnnnn\secure"

BINGO.

In my install that file is in the messages folder.  After copying it across, the Conversion runs.  Thanks!

After the last topic is reported to have been converted I get an error popup: 62 - GetFileData() - Input past end of file
Is this expected or might something else be missing?

Also, (and I think I read about this somewhere) it hasn't converted subtopics below the top level.  I'll have to rearrange things on the existing forum for the move but that's not going to be a major issue.  I'm planning to bring them up to the top on the new forum anyway.

I'll probably clone the Discusware site on another web server and use the Move Topic function on the clone, to avoid risking catastrophe on the still-live forum.  (The affected subtopics are quite large.)

My next step after that: importing it to the SMF database.  Watch this space.