These are the steps needed to import the MediaWiki help pages from MediaWiki.org.
Who this is for
You’ll need to know a bit of MySQL, have a working Java installation, and have a working MediaWiki installation. These instructions are for those who already have content in their MediaWiki, and wish to preserve it. The instructions on MediaWiki (currently) zap your content.
Java JRE or JDK – Currently requires Sun JDK (see Errors below)
Populated MySQL MediaWiki DB
MediaWiki 1.4.X (This will be updated when I move to 1.5.X)
mwdumper – converts XML to SQL to import into a MySQL database
importDump.php (included with MediaWiki 1.5+) – imports database dumps directly
- Download the page dump (pages_current.xml.bz2) from MetaWiki. Download mwdumper.jar, also from MetaWiki.
- Back up your current MediaWiki DB. REALLY! DO NOT SKIP THIS STEP. At shell:
mysqldump --all-databases > <date>-full-backup.sql
- Create a new empty database in MySQL. At shell:
mysqladmin -u root -p create <tmp_databasename>
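- Convert the Help: and Template: namespaces of the dump to SQL and load them into the temporary database. At shell:
java -server -jar mwdumper.jar --output=stdout --format=sql:1.4 --filter=namespace:NS_HELP,NS_TEMPLATE pages_current.xml.bz2 | mysql -u <username> -p <tmp_databasename>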
- Delete cur_id from temp database. At shell:
mysql -u <username> -p <tmp_databasename> -e "ALTER TABLE cur DROP COLUMN cur_id;"
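- Dump the temporary database as data only (complete INSERT statements, no CREATE or DROP statements). At shell: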
mysqldump -u <username> -p -c -n -t --skip-add-drop-table <tmp_databasename> > dropped-wikidb.sql
- (optional*) Open dropped-wikidb.sql in vim. Enter each of the following in turn, after entering a “:”
%s/TABLE `/TABLE `<specdb>/g
%s/TABLES `/TABLES `<specdb>/g
%s/INTO `/INTO `<specdb>/g
where <specdb> is any MediaWiki instance name you’ve created. Note the backticks – they are NOT single quotes.
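- Import the dump into your wiki’s database (the dump does not name a database itself, so give it on the command line). At shell:
mysql -p -u <username> <wiki_databasename> < dropped-wikidb.sql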
- Check your wiki! Oh, it is b0rked? You DID back up your database, right?
* Only needed if you’ve installed MediaWiki with localized table names
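If you would rather not edit the dump by hand, the three vim substitutions in the optional step can be approximated with sed. This is only a minimal sketch, not something run against a real dump: prefix_tables is a hypothetical helper name, and it assumes the prefix contains no characters that are special to sed (such as / or &). Like the vim version, it rewrites every match, including any that happen to occur inside page text.

```shell
#!/bin/sh
# Sketch: add a table-name prefix to a data-only mysqldump file.
# prefix_tables is a hypothetical helper, not part of MediaWiki or MySQL.
# $1 is the prefix (what the vim step calls <specdb>); SQL is read on stdin.
prefix_tables() {
    # Same three substitutions as the vim commands, applied to every line.
    sed -e "s/TABLE \`/TABLE \`$1/g" \
        -e "s/TABLES \`/TABLES \`$1/g" \
        -e "s/INTO \`/INTO \`$1/g"
}
```

Usage would look something like prefix_tables wiki_ < dropped-wikidb.sql > prefixed-wikidb.sql (wiki_ being a made-up prefix), after which you would import prefixed-wikidb.sql instead of dropped-wikidb.sql.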
What is missing:
- Links to help text outside the Help: namespace
- Some weirdness in imported info. I had some odd effects where a Help: namespace link appears orphaned but, upon clicking it, I am presented with an edit box whose content box is already filled with the appropriate text.
- Cannot “upgrade” Help/Templates. Since we remove the cur_id as created at the source site (“Meta”), I’m not certain that upgrades are possible. I don’t know whether importing a new object with a newer cur_id, but the same name/meta-information as an older object, would upset the database. But this could just reflect more of my ignorance of MediaWiki’s internal structures…
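One way to get a feel for the “upgrade” question is to list the titles that exist in both the temporary database and your live wiki before importing anything. The following is a rough sketch under some assumptions: unprefixed table names, both databases on the same MySQL server, and the default namespace numbers (10 for Template:, 12 for Help:). collision_query is a hypothetical helper that only prints the SQL, and the database names you pass it are placeholders.

```shell
#!/bin/sh
# Sketch: print a query listing Help:/Template: pages present in BOTH databases.
# collision_query is a hypothetical helper; it runs nothing by itself.
# $1: temporary database name, $2: live wiki database name.
collision_query() {
    tmp_db=$1
    wiki_db=$2
    cat <<EOF
SELECT t.cur_namespace, t.cur_title
FROM $tmp_db.cur AS t
JOIN $wiki_db.cur AS w
  ON w.cur_namespace = t.cur_namespace
 AND w.cur_title = t.cur_title
WHERE t.cur_namespace IN (10, 12);
EOF
}
```

You could pipe the result into the client, for example collision_query <tmp_databasename> <wiki_databasename> | mysql -u <username> -p. Any rows returned are pages that already exist in your wiki under the same name.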
Solutions to try:
- Find Images! Figure out where MW expects them to be.
Conversation on #mediawiki on Freenode, November 4, 2005 11:30 PM to 12:14 AM
<NightMonkey> Howdy. Is there an easy way to populate a fresh MediaWiki install's Help: with the MediaWiki Handbook?
<brion> at this time we don't have an installable, redistributable set of help pages
<TimStarling> no reason we couldn't have one though
<TimStarling> is there?
<brion> time and labor
<brion> there've been several attempts to reorg the various doc pages
<brion> afaik none has produced an actual downloadable package to this date
<TimStarling> we could just do an XML dump of meta's Help namespace
<TimStarling> maybe import it to mediawiki.org first and delete any unnecessary pages
<NightMonkey> I'd love it. I'm a SysAdmin, so I could deal with a database dump of some sort, if that is what is necessary. I'm not worried about any non-English pages, or even Wikipedia-specific pages - I can edit any I find.
<brion> NightMonkey: well if you're brave...
<brion> fetch the special/meta page dump from download.wikimedia.org
<brion> use the mwdumper too to extract pages with --filter=namespaces:NS_HELP
<brion> and then import that into your wiki with importDump.php
<brion> it *might* work :D
<NightMonkey> brion: Cool! I'll give it a try. Is that the whole Wikimedia namespace? I'll edit it to just include the Help: namespace, if that's the case.
<brion> the dump is the entire meta.wikimedia.org site
<brion> but mwdumper can extract subsets based on namespace or a list of page titles
<NightMonkey> brion: Thank you. I don't need to use importDump.php if I use mvdumper, correct? (I have a 1.4.11 MediaWiki)
<brion> mwdumper's database import only works on a clean (empty) database, as it includes the page id numbers
<brion> however you could dump into an empty cur table, then copy those entries to your own skippiing the cur_id
<NightMonkey> brion: Ah, I see.
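For MediaWiki 1.5+, the route brion describes (filter with mwdumper, then feed the result to importDump.php) might look roughly like the following. This is an untested sketch, since the steps on this page were worked out on 1.4.11. It assumes mwdumper.jar sits in the current directory, that importDump.php lives under maintenance/ in the wiki directory, and that it reads the XML on standard input. import_help_pages and its two arguments are hypothetical names.

```shell
#!/bin/sh
# Sketch of the mwdumper -> importDump.php route, for MediaWiki 1.5+ only.
# import_help_pages is a hypothetical helper.
# $1: path to the Meta dump, $2: MediaWiki install directory (assumed layout).
import_help_pages() {
    dump=$1
    wiki_dir=$2
    # Filter the dump down to Help: and Template:, keep it as XML,
    # and hand it to MediaWiki's own importer instead of writing SQL.
    java -jar mwdumper.jar --format=xml \
        --filter=namespace:NS_HELP,NS_TEMPLATE "$dump" |
        php "$wiki_dir/maintenance/importDump.php"
}
```

A call would be something like import_help_pages pages_current.xml.bz2 /path/to/wiki. Back up first, exactly as for the SQL route.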
mwdumper.jar Java exception:
Exception in thread "main" java.io.IOException: Parser has reached the entity expansion limit "64,000" set by the Application.
at org.mediawiki.importer.XmlDumpReader.readDump(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
Adding -DentityExpansionLimit=10000000 or a similarly high number to the java command only lets it get further through the XML; it still dies with a different exception.
This error occurs while using “Blackdown JDK 1.4.2.02”. Using Sun JDK 1.5.0.05 fixes this problem. I haven’t tested other JDKs/JREs, sorry.
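Since the failure depends on which JVM the java command resolves to, it may be worth checking that before starting a long import. A small sketch, assuming the Blackdown build identifies itself with the word “Blackdown” somewhere in its java -version banner; is_blackdown and check_jvm are hypothetical helper names, and other JVMs are simply not judged either way:

```shell
#!/bin/sh
# Sketch: warn if the java on the PATH looks like a Blackdown JVM.
# Both helpers are hypothetical; the banner text is an assumption.

# Reads a java -version banner on stdin; succeeds if it mentions Blackdown.
is_blackdown() {
    grep -qi 'blackdown'
}

# Returns 1 (with a warning on stderr) if the current java looks like Blackdown.
check_jvm() {
    # java -version prints its banner on stderr, so merge it into stdout.
    if java -version 2>&1 | is_blackdown; then
        echo "Blackdown JVM detected; mwdumper may hit the entity expansion limit" >&2
        return 1
    fi
    return 0
}
```

Running check_jvm && java -server -jar mwdumper.jar ... would then skip the conversion on a JVM that is known to fail.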