As with most computing tasks, editing Wikitravel can benefit from using automated scripts -- programs that modify Wikitravel pages and images with content, or at a pace, not directly controlled by humans. This page describes Wikitravel's policy towards scripts: what we like, what we don't like, and how to make a script that works on Wikitravel.
Pros and cons
Scripts are nice for the following reasons:
- Repetitive work. Using a script can make repetitive tasks a lot easier to do. The script program does the work, instead of someone -- or a lot of someones -- having to do it by hand.
- Accurate work. Well-written scripts don't make typos or spelling mistakes. A script will do the same task the same way, over and over.
But scripts have the following problems:
- Unintended consequences. If you have a bug in your script, you can muck up a whole lot of pages, and then we either have to correct those problems by hand or create another script to clean up after it.
- Unilateral. The writer of a script can make a whole bunch of pages that work exactly the way they want them to. If someone else wants them to work another way, they have to write another script.
- Suck up resources. Scripts take up bandwidth and other resources on the server that could be used by people hand-editing a page.
- Unnecessary. Many of the tasks that people want to write scripts for don't need one. The time and energy the person would have to put into creating, debugging, and running the script could be spread across a number of people doing the same task by hand. This would be less unilateral, and may actually be faster.
- No new knowledge. In general, scripts just re-adjust the formatting or presentation of knowledge on Wikitravel. Unless they're importing information from another database, there's nothing new added. We could better spend that time and energy adding actual travel information to the guide.
For these reasons, we have the following rules for running scripts against Wikitravel:
- Scripts have to be approved by the Wikitravel administrators. To create a script that runs against Wikitravel, post the name and reason for the script on Script nominations. Explain why we need the script, why it can't be done by hand, and what the script will do. If 2 administrators voice their support for the script within 7 days, and none oppose it in that period, the script can be run.
- Scripts should follow our policies, guidelines, and manual of style. For example, a script that reformats all restaurant listings on the site should make them look like our preferred form rather than something else. If you think we should use that something else, get the policy or style guideline changed first.
- Each script has to run with its own user name. All edits made to the site need to be marked with this user name.
- The user page for the script's user name should describe what the script does.
- Scripts have to check two pages, [[Wikitravel:Script policy/Run]] and [[User:name of script/Run]], before making each edit. Both these pages have to contain exactly the word "yes" before the edit should be saved. This allows any Wikitraveller to turn off all scripts, or individual scripts, just by changing the contents of one or the other page.
- Scripts should make fewer than one change per minute. This keeps them from hogging the Web server's resources.
- If possible, scripts should be run against a test version of Wikitravel before running against the live site. (We may set up a test server, with a mirror of the live site, for this purpose.)
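The run-page check and the edit rate limit from the rules above could be sketched roughly as follows in Python. This is only an illustration, not an official client: `fetch_page_text` and `save_edit` are hypothetical callables that you would supply to read and write wiki pages, `ExampleBot` is a made-up script user name, the 61-second interval is just one way to stay under one change per minute, and ignoring surrounding whitespace when comparing against "yes" is an assumption.

```python
import time
from typing import Callable, Iterable

GLOBAL_RUN_PAGE = "Wikitravel:Script policy/Run"


def may_edit(fetch_page_text: Callable[[str], str], script_user: str) -> bool:
    """Return True only if both run pages contain exactly the word "yes".

    fetch_page_text is a hypothetical callable that returns the raw text
    of a page title. Leading and trailing whitespace is ignored, which
    is an assumption about how "exactly" should be read.
    """
    titles = (GLOBAL_RUN_PAGE, f"User:{script_user}/Run")
    return all(fetch_page_text(title).strip() == "yes" for title in titles)


class EditThrottle:
    """Space out edits so that fewer than one is made per minute."""

    def __init__(self, min_interval: float = 61.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous edit, if any

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        if self._last is not None:
            remaining = self._min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


def run_edits(edits: Iterable, fetch_page_text: Callable[[str], str],
              save_edit: Callable, script_user: str,
              throttle: EditThrottle) -> int:
    """Save edits one at a time; stop as soon as either run page says no."""
    saved = 0
    for edit in edits:
        throttle.wait()
        # Check the run pages right before saving, so a switch-off that
        # happened while the script was waiting is still honoured.
        if not may_edit(fetch_page_text, script_user):
            break
        save_edit(edit)
        saved += 1
    return saved
```

A script would call something like `run_edits(planned_edits, fetch_page_text, save_edit, "ExampleBot", EditThrottle())`. Because both pages are re-read before every save, changing either one to anything other than "yes" stops the run at the next edit.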
Programs that only read pages and images, and don't modify the versions on the server, don't require an approval process. However, there are some guidelines to follow for these programs, too.
- Read-only scripts must read the robots.txt file for Wikitravel and follow its suggestions. Most programs (like wget) automatically know about robots.txt, as do major scripting languages' HTTP client libraries. But if you're writing your own program, check the Standard for Robot Exclusion for more info.
- Read-only scripts should recognize the non-standard Crawl-Delay field in robots.txt. Scripts that don't recognize it must not fetch pages or images more often than once every 30 seconds.
- Read-only scripts must have a User-Agent header set. Scripts should provide a contact email or URL in the header. For example: ExampleBot/0.1 (http://www.example.com/bot.html), MyBot/2.3 ([email protected]).
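For a read-only script written in Python, the three guidelines above might look something like the sketch below, which uses the standard library's `urllib.robotparser`. Treat it as one possible approach under a few assumptions: the robots.txt text is passed in as a string (how you download it is up to you), the `wiki.example.org` URLs in the usage note are placeholders, the User-Agent value is the example from this page, and falling back to 30 seconds when robots.txt has no Crawl-Delay entry is a cautious guess rather than something the policy spells out.

```python
import urllib.request
import urllib.robotparser
from typing import Optional

# Example User-Agent taken from this page; replace it with your own
# script name and contact URL or email address.
USER_AGENT = "ExampleBot/0.1 (http://www.example.com/bot.html)"

# Assumed fallback when robots.txt gives no Crawl-Delay for this agent.
DEFAULT_DELAY = 30.0


def load_rules(robots_txt: str) -> urllib.robotparser.RobotFileParser:
    """Parse the text of a robots.txt file into a rule set."""
    rules = urllib.robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def fetch_delay(rules: urllib.robotparser.RobotFileParser,
                user_agent: str = USER_AGENT) -> float:
    """Seconds to wait between fetches: Crawl-Delay if given, else 30."""
    delay = rules.crawl_delay(user_agent)
    return float(delay) if delay is not None else DEFAULT_DELAY


def build_request(url: str, rules: urllib.robotparser.RobotFileParser,
                  user_agent: str = USER_AGENT) -> Optional[urllib.request.Request]:
    """Return a request with the User-Agent set, or None if disallowed."""
    if not rules.can_fetch(user_agent, url):
        return None
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

Typical use would be to call `load_rules` once with the downloaded robots.txt text, then, for each page such as `http://wiki.example.org/wiki/Main_Page`, call `build_request`, skip the page if the result is `None`, pass the request to `urllib.request.urlopen` otherwise, and sleep for `fetch_delay(rules)` seconds before the next fetch.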
Read-only scripts that ignore these simple requirements can and will be blocked from accessing Wikitravel. If you provide a contact address or URL, someone will try to contact you after the block.
Scripts that don't comply with these requirements will be blocked from reading or editing Wikitravel pages -- even if they're not doing any actual harm.